Downloaded 09/24/21 to 18.10.248.49 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms
SIAM Spotlights is a new book series that comprises brief and enlightening books on timely
topics in applied and computational mathematics and scientific computing. The books, spanning
125 pages or less, will be produced on an accelerated schedule and will be attractively priced.
Editorial Board
Peter Benner, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg
Timothy Chartier, Davidson College
Felipe Cucker, City University of Hong Kong
Donald Estep, Colorado State University
Chen Greif, University of British Columbia
Per Christian Hansen, Technical University of Denmark
Nicholas Higham, The University of Manchester
Jeffrey Humpherys, Brigham Young University
C. T. Kelley, North Carolina State University
Michael J. Miksis, Northwestern University
Padma Raghavan, Vanderbilt University
Charles Van Loan, Cornell University
Margaret Wright, New York University
Josef Málek and Zdeněk Strakoš, Preconditioning and the Conjugate Gradient Method in the Context of
Solving PDEs
Paul G. Constantine, Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies
Dominique Orban and Mario Arioli, Iterative Solution of Symmetric Quasi-Definite Linear Systems
Lin Lin and Jianfeng Lu, A Mathematical Introduction to Electronic Structure Theory
A Mathematical
Introduction
to Electronic
Structure Theory
Lin Lin
University of California
Berkeley, California
Jianfeng Lu
Duke University
Durham, North Carolina
10 9 8 7 6 5 4 3 2 1
All rights reserved. Printed in the United States of America. No part of this book may be
reproduced, stored, or transmitted in any manner without the written permission of the
publisher. For information, write to the Society for Industrial and Applied Mathematics,
3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.
Trademarked names may be used in this book without the inclusion of a trademark symbol.
These names are used in an editorial context only; no infringement of trademark is intended.
is a registered trademark.
and
To our families,
Dongxu and Xiaolu
Contents
Preface
Selected references for further reading
Bibliography
Index
Preface
especially those who are new to this field, will benefit from these choices. We thank
Weinan E for encouraging us to engage in the summer school courses and to write the
book in this format. We are also grateful for discussions with Volker Blum, Roberto Car,
Alexandre Chorin, Yingzhou Li, James Sethian, Lin-Wang Wang, Chao Yang, Lexing
Ying, and Weitao Yang during the writing of this book.
Lin Lin
University of California, Berkeley
Jianfeng Lu
Duke University
Chapter 1
Basic theory of quantum mechanics
The Stern–Gerlach (SG) experiment (1922) was one of the earliest experiments for
which the result could not be explained using classical physics by any means. Hence it
demonstrated unambiguously the importance of quantum effects. The SG experiment
can be explained using the quantum theory for the spin-$\frac{1}{2}$ particle, which is a quantum
system that can be represented simply by 2 × 2 matrices. We will use the spin-$\frac{1}{2}$ particle
to introduce some of the basic concepts of quantum mechanics, such as state space,
operators, measurement, the uncertainty principle, and the evolution equation. Furthermore,
the theory for the spin-$\frac{1}{2}$ particle can be readily generalized to finite dimensional
quantum systems, i.e., systems that can be represented by finite dimensional matrices.
Throughout the text we use atomic units $m_e = e = \hbar = \frac{1}{4\pi\epsilon_0} = k_B = 1$, where $m_e$
is the mass of an electron, $e$ is the unit charge, $\hbar$ is the reduced Planck constant, $\frac{1}{4\pi\epsilon_0}$
is the Coulomb constant, and $k_B$ is the Boltzmann constant. In atomic units, the unit of
length is called Bohr, and the unit of energy is called Hartree.
along the y-direction. They then pass through an inhomogeneous magnetic field, which
is produced using a homogeneous magnetic field pointing along the z-direction plus a
small perturbation. The final position of each atom along the z-coordinate is recorded by
the detecting screen on the right. In the oven the silver atom loses a valence electron, and
hence carries a magnetic moment called the spin, denoted by a vector $\mu \in \mathbb{R}^3$. Whether
the trajectory of the silver atom bends upwards or downwards through the magnetic field
depends on the direction of µ. While we leave the detailed setup of the experiment as
well as the derivation of the classical physics prediction to standard physics textbooks
for interested readers, the prediction from classical physics for the distribution of the
z-coordinate on the screen can be qualitatively described by Figure 1.2(a). Since every
silver atom was prepared in a thermal state, the initial magnetic moment µ can point
towards any direction, and hence classical physics will always predict a continuous
distribution on the screen (the particular shape of the distribution is not important to our
discussion). The result from the SG experiment was shockingly different: the detecting
screen always shows a discrete, symmetric bimodal distribution as in Figure 1.2(b).
From now on let us represent the entire experimental apparatus in Figure 1.1 by a
box $\mathrm{SG}_z$ as in Figure 1.3(a). We define the two output states from the $\mathrm{SG}_z$ apparatus
as $|{+z}\rangle$ and $|{-z}\rangle$, respectively. Following the Dirac notation, $|\cdot\rangle$ is called a ket and is
used to denote a given quantum state. We may block one channel of the output, say,
$|{-z}\rangle$. This produces a filtering apparatus, which removes the $|{-z}\rangle$ contribution from
any initial mixed state.
In order to understand the nature of the experimental result, Stern and Gerlach continued
with a few other experiments. First, if we pass a $|{+z}\rangle$ state produced by the
$\mathrm{SG}_z$ filtering apparatus to another $\mathrm{SG}_z$ apparatus, there is only one output state $|{+z}\rangle$
(Figure 1.4(a)). This shows that $|{+z}\rangle$ is intrinsically related to the $\mathrm{SG}_z$ apparatus. By
symmetry, the same result holds for the $|{-z}\rangle$ state. Note that there is nothing special
about the z-direction. We may rotate the magnets so that the silver atoms can bend
upwards and downwards along the x-direction. The resulting apparatus is denoted by
$\mathrm{SG}_x$, and its two output states are denoted by $|{+x}\rangle$ and $|{-x}\rangle$, respectively. Similarly,
we can define the $\mathrm{SG}_y$ apparatus with output states $|{+y}\rangle$ and $|{-y}\rangle$. By symmetry (and
experimental validation), the result in Figure 1.4(a) holds as well when z is replaced by
x or y. Second, if we pass the $|{+x}\rangle$ state produced by the $\mathrm{SG}_x$ filtering apparatus to
an $\mathrm{SG}_z$ apparatus, we observe the same bimodal symmetric pattern as in Figure 1.2(b).
This is illustrated in Figure 1.4(b), and hence the $|{+x}\rangle$ state can be “converted” to $|{+z}\rangle$
and $|{-z}\rangle$ states. By symmetry the result holds for any input state and SG apparatus
associated with different directions. Finally, we may combine Figures 1.4(a) and (b) and
arrive at Figure 1.4(c). Note that although passing $|{+z}\rangle$ through $\mathrm{SG}_z$ only produces the
$|{+z}\rangle$ state, the combined $\mathrm{SG}_x$ filtering apparatus and $\mathrm{SG}_z$ apparatus can generate both
$|{+z}\rangle$ and $|{-z}\rangle$ states! How shall we explain these results?
State space
Below we demonstrate that the mysterious experimental results from the SG experiments
can be consistently explained using linear algebra for 2 × 2 matrices. Quantum
mechanics postulates that the state of a spin-$\frac{1}{2}$ particle is a vector in a two-dimensional
vector space $\mathcal{H}$ isomorphic to $\mathbb{C}^2$, which is called a state vector space (or simply the
state space). An element $|\psi\rangle \in \mathcal{H}$ is called a state vector or a ket vector. The states
$|{+z}\rangle, |{-z}\rangle$ form a basis of $\mathcal{H}$, and hence any general state vector $|\psi\rangle$ can be written as
the linear combination of these two basis vectors as
\[ |\psi\rangle = a\,|{+z}\rangle + b\,|{-z}\rangle, \qquad a, b \in \mathbb{C}. \tag{1.1.1} \]
In particular, the states $|{\pm x}\rangle, |{\pm y}\rangle$ are also states in $\mathcal{H}$ and can be expanded as the linear
combination of $|{\pm z}\rangle$.
Quantum mechanics postulates that $\mathcal{H}$ is equipped with an inner product $(\cdot, \cdot)$ and
hence $\mathcal{H}$ is a Hilbert space (in the infinite dimensional case, it is also postulated that the
space is complete with respect to the inner product). The inner product between two ket
vectors is often written in Dirac notation as $\langle\varphi|\psi\rangle$. In particular, the notation $\langle\varphi|$ can
be used separately as a bra vector, which is a vector in the dual space of $\mathcal{H}$. The states
$|{\pm z}\rangle$ are orthonormal under this inner product, i.e.,
\[ \langle{+z}|{+z}\rangle = \langle{-z}|{-z}\rangle = 1, \qquad \langle{+z}|{-z}\rangle = \langle{-z}|{+z}\rangle = 0. \tag{1.1.2} \]
Since the choice of the z-direction is arbitrary, the orthonormality condition in (1.1.2)
should hold when z is replaced by x or y. One immediate consequence is that the states
$|{+x}\rangle$ and $|{-x}\rangle$ are linearly independent and hence also form a basis for $\mathcal{H}$, and the states
$|{\pm z}\rangle$ can be expanded as the linear combinations of $|{\pm x}\rangle$ as well. The same result holds
when x is replaced by y.
The discussion above can be generalized to any finite dimensional quantum system,
represented by a finite dimensional Hilbert space $\mathcal{H}$. Given an orthonormal basis set of
$\mathcal{H}$ denoted by $\{|\varphi_i\rangle\}_{i=1}^{n}$, any state vector $|\psi\rangle$ can be written as the linear combination of
these basis vectors as
\[ |\psi\rangle = \sum_{i=1}^{n} c_i |\varphi_i\rangle, \qquad c_i \in \mathbb{C}. \tag{1.1.3} \]
Quantum mechanics also postulates that the physical meanings of $|\psi\rangle$ and $c|\psi\rangle$ are the
same for any $c \in \mathbb{C}$ with $c \neq 0$. If $c = 0$, then $0|\psi\rangle = |0\rangle$ is called the zero vector or the
null state. Hence without loss of generality we may normalize any nonzero vector $|\psi\rangle$
so that $\langle\psi|\psi\rangle = 1$. We may readily verify
\[ 1 = \langle\psi|\psi\rangle = \sum_{i,j=1}^{n} c_i^* c_j \langle\varphi_i|\varphi_j\rangle = \sum_{i,j=1}^{n} c_i^* c_j \delta_{ij} = \sum_{i=1}^{n} |c_i|^2. \tag{1.1.4} \]
Here $\delta_{ij}$ is the Kronecker δ-symbol. Note that $|c_i|^2$ can be interpreted as a probability
distribution over the state vectors $\{|\varphi_i\rangle\}$. Indeed, quantum mechanics postulates that
such a probability distribution is precisely the distribution for the outcome of a measurement
process of a physical observable, as will be explained below.
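The normalization in (1.1.4) can be illustrated numerically. The following is a minimal sketch; the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical expansion coefficients c_i of |psi> in an orthonormal basis {|phi_i>}.
c = np.array([1.0 + 1.0j, 2.0, -1.0j])

# |psi> and c|psi> describe the same physical state, so we may rescale
# the coefficient vector such that <psi|psi> = 1.
c_normalized = c / np.linalg.norm(c)

# After normalization, |c_i|^2 is a probability distribution, cf. (1.1.4).
probabilities = np.abs(c_normalized) ** 2
print(probabilities, probabilities.sum())
```

Here the unnormalized squared norm is 2 + 4 + 1 = 7, so the probabilities are 2/7, 4/7, and 1/7.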
Quantum operator
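We first introduce some notation. Take a linear operator $\hat{A}$ acting on a finite dimensional
Hilbert space $\mathcal{H}$. The adjoint of $\hat{A}$, denoted by $\hat{A}^*$, is defined such that
\[ \langle\varphi|\hat{A}\psi\rangle = \langle\hat{A}^*\varphi|\psi\rangle \quad \text{for all } |\varphi\rangle, |\psi\rangle \in \mathcal{H}. \]
The operator $\hat{A}$ is called self-adjoint (or Hermitian) if $\hat{A}^* = \hat{A}$.
If $\hat{A}$ is self-adjoint, all eigenvalues are real, and the eigenvectors can be chosen to form an
orthonormal basis set of $\mathcal{H}$, i.e.,
\[ \hat{A}|\varphi_i\rangle = a_i |\varphi_i\rangle, \qquad \langle\varphi_i|\varphi_j\rangle = \delta_{ij}. \]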
Quantum mechanics postulates that spin, and in general all physical observables,
can be represented using a self-adjoint operator on H. The procedure of mapping an
observable in classical mechanics to a linear operator in quantum mechanics is called
quantization. At first sight, this may (and should) seem strange: physical observables
such as length, weight, density, position, and velocity are always numbers, or at most
vectors. In what sense can a physical observable be represented by a self-adjoint oper-
ator?
To answer this question, we first recall that in classical physics, the state of a sys-
tem can be measured without interfering with the system. This means that we may
point out that a classical particle is at position r with velocity v, and the measurement
process needed to obtain such information only interacts with the classical state to a
negligible extent. On the other hand, quantum mechanics postulates that all measure-
ment processes must interact with the quantum state. More specifically, as Dirac stated,
“a measurement always causes the system to jump into an eigenstate of the dynamical
variable that is being measured, the eigenvalue this eigenstate belongs to being equal to
the result of the measurement.” In other words, assume the initial state is given as the
linear combination of the eigenstates of  as
\[ |\psi\rangle = \sum_{i} c_i |\varphi_i\rangle, \qquad c_i \in \mathbb{C}. \tag{1.1.8} \]
If we would like to measure the value of the physical observable corresponding to $\hat{A}$ for
the state $|\psi\rangle$, then the output state must be one of the eigenvectors of $\hat{A}$, regardless of how the
measurement process is designed. This interpretation of the measurement process provides
the physical meaning of the eigenvalues: For finite dimensional quantum systems,
the value of any physical observable only takes discrete values, given by the eigenvalues
of the corresponding self-adjoint operator.
As indicated in (1.1.4), the coefficients $|c_i|^2$ can be interpreted as a probability
distribution over the eigenstates. Quantum mechanics postulates that in a measurement
process, the state $|\psi\rangle$ randomly collapses into an eigenstate $|\varphi_i\rangle$, and the outcome
value of the measurement is the associated eigenvalue $a_i$, with probability $|c_i|^2$.
Hence the result of any single measurement in quantum physics is almost never
predetermined. The one important exception is when $|\psi\rangle$ is already an eigenstate,
say, $|\varphi_1\rangle$. In this case, $|c_i|^2$ is 1 if $i = 1$ and 0 otherwise. Hence the result of the
measurement is deterministically $a_1$, with the state being $|\varphi_1\rangle$ after measurement.
In the context of the SG experiment, the $\mathrm{SG}_z$ apparatus performs a measurement
of the value of the spin along the z-direction. The corresponding linear operator is
denoted by $\hat{S}_z$, and its eigenstates are $|{\pm z}\rangle$. Similarly, the spin operators along the
x, y directions are denoted by $\hat{S}_x, \hat{S}_y$ with eigenstates $|{\pm x}\rangle, |{\pm y}\rangle$, respectively. This
explanation is consistent with the result of the SG experiment in Figure 1.4(a), where
the output state is the same as the input state due to the fact that $|{+z}\rangle$ is already an
eigenstate of $\hat{S}_z$.
Although one cannot deterministically predict the value of the outcome of a single
measurement associated with a linear operator $\hat{A}$, the expectation value denoted by $\langle\hat{A}\rangle$
can be predicted deterministically. This is because
\[ \langle\hat{A}\rangle = \langle\psi|\hat{A}|\psi\rangle = \sum_{i,j=1}^{n} c_i^* c_j a_i \delta_{ij} = \sum_{i=1}^{n} a_i |c_i|^2. \tag{1.1.9} \]
Here the right-hand side of the equation is precisely the expectation value of the physical
observable, and the expectation value can be experimentally obtained if one can prepare
a large number of copies of the same state $|\psi\rangle$ and repeat the measurements.
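The identity (1.1.9) and its experimental reading can be checked with a small simulation. This is a sketch only: the 3 × 3 observable `A`, the state `psi`, and the sample size are hypothetical choices, and repeated measurement is modeled by sampling eigenvalues with probabilities $|c_i|^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical observable: a random 3x3 Hermitian (self-adjoint) matrix.
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (M + M.conj().T) / 2

# Real eigenvalues a_i and orthonormal eigenvectors |phi_i> (columns of phi).
a, phi = np.linalg.eigh(A)

# A hypothetical normalized state |psi>.
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)

# Coefficients c_i = <phi_i|psi> and outcome probabilities |c_i|^2.
c = phi.conj().T @ psi
p = np.abs(c) ** 2

# Two evaluations of <A>: directly as <psi|A|psi>, and as sum_i a_i |c_i|^2.
expect_direct = np.vdot(psi, A @ psi).real
expect_spectral = np.sum(a * p)

# Many simulated measurements on identical copies of |psi>:
# each one returns an eigenvalue a_i with probability |c_i|^2.
samples = rng.choice(a, size=200_000, p=p)
print(expect_direct, expect_spectral, samples.mean())
```

The first two numbers agree to rounding error, while the sample mean agrees only up to statistical fluctuations that shrink as the number of copies grows.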
We now determine the expansion coefficients relating the states $|{\pm x}\rangle$, $|{\pm y}\rangle$, and
$|{\pm z}\rangle$. The experiment in Figure 1.4(b) yields a bimodal symmetric distribution. The
probabilistic explanation above implies the following relation:
\[ |{+x}\rangle = \frac{1}{\sqrt{2}} \left( |{+z}\rangle + e^{i\alpha} |{-z}\rangle \right), \qquad \alpha \in \mathbb{R}. \]
Here $\frac{1}{\sqrt{2}}$ is a normalization factor, $e^{i\alpha}$ is an arbitrary phase factor, and we choose the
coefficient of $|{+z}\rangle$ to be exactly $\frac{1}{\sqrt{2}}$ since the physical meaning of a state is not changed
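We are now ready to construct the linear operators for the spin operators $\hat{S}_x, \hat{S}_y, \hat{S}_z$.
It is also common to combine the three operators in vector form as $\hat{\mathbf{S}} = (\hat{S}_x, \hat{S}_y, \hat{S}_z)^{\top}$.
Furthermore, one can define the spin operator along an arbitrary unit vector $\mathbf{n} \in \mathbb{R}^3$ as
\[ \hat{S}_{\mathbf{n}} = \hat{\mathbf{S}} \cdot \mathbf{n} = \hat{S}_x n_x + \hat{S}_y n_y + \hat{S}_z n_z. \tag{1.1.12} \]
By the convention in quantum mechanics, the $|{\pm z}\rangle$ are eigenstates of $\hat{S}_z$ with eigenvalues
$\pm\frac{1}{2}$. This means that
\[ \hat{S}_z = \frac{1}{2} \left( |{+z}\rangle\langle{+z}| - |{-z}\rangle\langle{-z}| \right). \tag{1.1.13} \]
Using the coordinates in (1.1.11), we have the matrix representation of $\hat{S}_z$ as
\[ \hat{S}_z = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.1.14} \]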
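Similarly,
\[ \hat{S}_x = \frac{1}{2} \left( |{+x}\rangle\langle{+x}| - |{-x}\rangle\langle{-x}| \right), \qquad \hat{S}_y = \frac{1}{2} \left( |{+y}\rangle\langle{+y}| - |{-y}\rangle\langle{-y}| \right), \tag{1.1.15} \]
with the corresponding matrix representation
\[ \hat{S}_x = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \hat{S}_y = \frac{1}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{1.1.16} \]
The matrix representation can be concisely written using the Pauli matrices as
$\hat{\mathbf{S}} = \frac{1}{2}\boldsymbol{\sigma} = \frac{1}{2}(\sigma_x, \sigma_y, \sigma_z)^{\top}$, where
\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.1.17} \]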
The Pauli matrices are Hermitian and unitary. Together with the 2 × 2 identity matrix,
they form a basis (over the real numbers) for the space of self-adjoint linear operators on $\mathbb{C}^2$.
From now on, we take $\{|{\pm z}\rangle\}$ as the standard basis set for the spin-$\frac{1}{2}$ particle. $|{+z}\rangle$
is often denoted by $|{\uparrow}\rangle$ and is called the spin-up state. Similarly, $|{-z}\rangle$ is denoted by $|{\downarrow}\rangle$
and is called the spin-down state.
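The three SG experiments of Figure 1.4 can be replayed with these 2 × 2 matrices. The sketch below is illustrative rather than part of the original exposition: the helper `projector` and the variable names are assumptions, and the filtering apparatus is modeled as a projection onto the kept state followed by renormalization.

```python
import numpy as np

# Spin operators in the {|+z>, |-z>} basis, following (1.1.14) and (1.1.16).
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

plus_z = np.array([1, 0], dtype=complex)
minus_z = np.array([0, 1], dtype=complex)

def projector(v):
    """Return |v><v| for a normalized vector v (hypothetical helper)."""
    return np.outer(v, v.conj())

# Pauli matrices sigma = 2 S are Hermitian and unitary.
paulis_ok = all(
    np.allclose(2 * S, (2 * S).conj().T) and np.allclose((2 * S) @ (2 * S).conj().T, np.eye(2))
    for S in (Sx, Sy, Sz)
)

# eigh sorts eigenvalues in ascending order, so column 1 is |+x> (eigenvalue +1/2).
vals, vecs = np.linalg.eigh(Sx)
plus_x = vecs[:, 1]

# Figure 1.4(a): |+z> sent into SGz yields |+z> with probability 1.
p_a = abs(np.vdot(plus_z, plus_z)) ** 2

# Figure 1.4(b): |+x> sent into SGz yields |+z> or |-z> with probability 1/2 each.
p_b = [abs(np.vdot(plus_z, plus_x)) ** 2, abs(np.vdot(minus_z, plus_x)) ** 2]

# Figure 1.4(c): |+z> -> SGx filter keeping |+x> -> SGz; both outputs reappear.
filtered = projector(plus_x) @ plus_z
filtered /= np.linalg.norm(filtered)
p_c = [abs(np.vdot(plus_z, filtered)) ** 2, abs(np.vdot(minus_z, filtered)) ** 2]

print(paulis_ok, vals, p_a, p_b, p_c)
```

The eigenvector returned by `eigh` is fixed only up to a global phase, which does not affect any of the probabilities computed here.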
We remark that up to this point, it is impossible to tell whether the explanation has
fundamental value or is simply phenomenological. On the other hand, today we know
that spin is indeed an intrinsic degree of freedom of quantum particles and strangely has
no analogue in classical physics. Numerous experimental results have demonstrated
that the linear algebra interpretation fits consistently into the larger picture of physics
to the extent of the present day’s knowledge, and can be practically regarded as the
fundamental theory for spin-$\frac{1}{2}$ particles. It is useful to keep in mind that such a
pragmatic approach was taken during the development of many theoretical concepts known
in modern physics, rather than following some rigorous axiomatic approach. We refer
readers to the reading materials at the end of the book for a more detailed discussion on
the foundation of quantum mechanics with a more axiomatic approach, or alternative
formulations of quantum mechanics such as path integrals. These topics are beyond the
scope of this book.
Uncertainty principle
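One immediate consequence of the linear operator interpretation for physical observables
in quantum physics is that the physical observables usually do not commute. For
example,
\[ \hat{S}_x \hat{S}_z = \frac{1}{4} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \]
\[ \hat{S}_z \hat{S}_x = \frac{1}{4} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
Two operators $\hat{A}, \hat{B}$ are called compatible if their commutator $[\hat{A}, \hat{B}] := \hat{A}\hat{B} - \hat{B}\hat{A}$ vanishes,
and incompatible otherwise. Hence $\hat{S}_x, \hat{S}_y, \hat{S}_z$ are mutually incompatible. One can verify in the exercise that the
square of the magnitude of the spin operator, $\hat{\mathbf{S}}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2$, is compatible with
all the spin operators along any individual direction.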
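These compatibility statements are easy to confirm numerically. The following sketch uses a hypothetical helper `commutator` and a randomly drawn unit vector `n`; neither appears in the text above.

```python
import numpy as np

# Spin-1/2 operators in the {|+z>, |-z>} basis, cf. (1.1.14) and (1.1.16).
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

def commutator(A, B):
    """Return [A, B] = AB - BA (hypothetical helper)."""
    return A @ B - B @ A

# Sx and Sz do not commute: the two products differ by a sign.
print(Sx @ Sz)
print(Sz @ Sx)

# Squared magnitude of the spin operator.
S2 = Sx @ Sx + Sy @ Sy + Sz @ Sz

# Spin along an arbitrary unit vector n, cf. (1.1.12); n is drawn at random here.
rng = np.random.default_rng(1)
n = rng.normal(size=3)
n /= np.linalg.norm(n)
Sn = n[0] * Sx + n[1] * Sy + n[2] * Sz

print(np.linalg.norm(commutator(Sx, Sz)))  # nonzero: incompatible
print(np.linalg.norm(commutator(S2, Sn)))  # zero up to rounding: compatible
```

For the spin-$\frac{1}{2}$ particle, `S2` turns out to be $\frac{3}{4}$ times the identity, which is why it commutes with every $\hat{S}_{\mathbf{n}}$.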
The compatibility condition has a direct physical consequence. From linear algebra
we know that for compatible Hermitian matrices $\hat{A}$ and $\hat{B}$, we can always find the
eigenvectors $|\varphi_i\rangle$ so that the two operators can be simultaneously diagonalized:
\[ \hat{A}|\varphi_i\rangle = a_i |\varphi_i\rangle, \qquad \hat{B}|\varphi_i\rangle = b_i |\varphi_i\rangle. \tag{1.1.19} \]
Recall the postulate of quantum mechanics that the state after any measurement
is an eigenstate of the operator corresponding to the measured physical observable. Then if
two operators can be simultaneously diagonalized using the same set of eigenstates, it
means that one can simultaneously measure $\hat{A}$ and $\hat{B}$. The compatibility condition is
both sufficient and necessary. In other words, if the two operators are incompatible, then one
cannot always simultaneously measure the values of the two physical observables.
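The statement above can be quantified in terms of the uncertainty principle, which
can be formulated in terms of an inequality for the fluctuation of the measurements for
$\hat{A}$ and $\hat{B}$. For a given operator $\hat{A}$ and state $|\psi\rangle$, define an operator
\[ \Delta\hat{A} = \hat{A} - \langle\hat{A}\rangle I := \hat{A} - \langle\psi|\hat{A}|\psi\rangle I. \]
Thus $\Delta\hat{A}$ is an operator with zero expectation value:
\[ \langle\Delta\hat{A}\rangle = \langle\psi|\Delta\hat{A}|\psi\rangle = 0. \]
The variance can be defined using $\Delta\hat{A}$ as
\[ \langle\Delta\hat{A}^2\rangle = \langle\psi|(\hat{A} - \langle\hat{A}\rangle I)^2|\psi\rangle = \langle\hat{A}^2\rangle - \langle\hat{A}\rangle^2. \tag{1.1.20} \]
If the operators $\hat{A}, \hat{B}$ are compatible, and $|\psi\rangle$ is one of their common eigenvectors,
then
\[ \langle\Delta\hat{A}^2\rangle = \langle\Delta\hat{B}^2\rangle = 0. \]
This means that there is no uncertainty in measuring the values of both $\hat{A}$ and $\hat{B}$.
Unfortunately in general, $\langle\Delta\hat{A}^2\rangle$ and $\langle\Delta\hat{B}^2\rangle$ cannot both be arbitrarily small.
For any two Hermitian operators $\hat{A}, \hat{B}$ on $\mathcal{H}$, recall the Cauchy–Schwarz inequality
\[ \langle\Delta\hat{A}^2\rangle \langle\Delta\hat{B}^2\rangle \ge |\langle\Delta\hat{A}\Delta\hat{B}\rangle|^2. \]
Observe that
\[ \Delta\hat{A}\Delta\hat{B} = \frac{1}{2}(\Delta\hat{A}\Delta\hat{B} + \Delta\hat{B}\Delta\hat{A}) + \frac{1}{2}(\Delta\hat{A}\Delta\hat{B} - \Delta\hat{B}\Delta\hat{A}) = \frac{1}{2}\{\Delta\hat{A}, \Delta\hat{B}\} + \frac{1}{2}[\Delta\hat{A}, \Delta\hat{B}], \]
where $\{\hat{A}, \hat{B}\} = \hat{A}\hat{B} + \hat{B}\hat{A}$ is called the anti-commutator.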
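Notice that $\{\Delta\hat{A}, \Delta\hat{B}\}$ is Hermitian, so $\langle\{\Delta\hat{A}, \Delta\hat{B}\}\rangle$ is real. Similarly, we find
that $\langle[\Delta\hat{A}, \Delta\hat{B}]\rangle$ is purely imaginary by noticing that $i[\Delta\hat{A}, \Delta\hat{B}]$ is Hermitian. The squared
modulus of $\langle\Delta\hat{A}\Delta\hat{B}\rangle$ is therefore the sum of the squares of its real and imaginary parts, and
\[ |\langle\Delta\hat{A}\Delta\hat{B}\rangle|^2 = \frac{1}{4}\langle\{\Delta\hat{A}, \Delta\hat{B}\}\rangle^2 + \frac{1}{4}|\langle[\Delta\hat{A}, \Delta\hat{B}]\rangle|^2 \ge \frac{1}{4}|\langle[\Delta\hat{A}, \Delta\hat{B}]\rangle|^2 = \frac{1}{4}|\langle[\hat{A}, \hat{B}]\rangle|^2. \]
Therefore
\[ \langle\Delta\hat{A}^2\rangle \langle\Delta\hat{B}^2\rangle \ge \frac{1}{4}|\langle[\hat{A}, \hat{B}]\rangle|^2. \tag{1.1.21} \]
Equation (1.1.21) is called the uncertainty principle in the context of finite dimensional
quantum systems. It states that the product of the variances of two operators,
$\langle\Delta\hat{A}^2\rangle \langle\Delta\hat{B}^2\rangle$, is bounded from below by one quarter of the squared modulus of the
expectation value of the commutator. Due
to the uncertainty principle, one cannot obtain simultaneously precise measurements of,
e.g., $\hat{S}_x$ and $\hat{S}_z$.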
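The inequality (1.1.21) can be probed numerically for $\hat{S}_x$ and $\hat{S}_z$. The sketch below draws random normalized states; the helper names `expectation` and `variance` and the number of trials are illustrative assumptions.

```python
import numpy as np

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

def expectation(A, psi):
    """Return <psi|A|psi> for a normalized state psi."""
    return np.vdot(psi, A @ psi)

def variance(A, psi):
    """Return <A^2> - <A>^2 for Hermitian A, cf. (1.1.20)."""
    mean = expectation(A, psi).real
    return expectation(A @ A, psi).real - mean ** 2

rng = np.random.default_rng(2)
gaps = []
for _ in range(1000):
    # Random normalized state in C^2.
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)
    lhs = variance(Sx, psi) * variance(Sz, psi)
    rhs = 0.25 * abs(expectation(Sx @ Sz - Sz @ Sx, psi)) ** 2
    gaps.append(lhs - rhs)

# The smallest value of lhs - rhs should be nonnegative up to rounding.
min_gap = min(gaps)
print(min_gap)
```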
Schrödinger equation
In order to understand how one quantum state $|\psi(t_0)\rangle$ at time $t_0$ evolves into another
state $|\psi(t)\rangle$ at time $t > t_0$, quantum mechanics postulates that there is a linear operator
$\hat{U}(t, t_0)$, called the propagator, which is independent of the initial state $|\psi(t_0)\rangle$ and
satisfies $|\psi(t)\rangle = \hat{U}(t, t_0)|\psi(t_0)\rangle$. The normalization convention implies that
\[ \langle\psi(t_0)|\hat{U}^*(t, t_0)\hat{U}(t, t_0)|\psi(t_0)\rangle = \langle\psi(t)|\psi(t)\rangle = 1 = \langle\psi(t_0)|\psi(t_0)\rangle \]
for any $t \ge t_0$ and initial state $|\psi(t_0)\rangle$. Therefore, $\hat{U}^*(t, t_0)\hat{U}(t, t_0) = I$. Since $\hat{U}(t, t_0)$
is a finite dimensional matrix, the operator $\hat{U}(t, t_0)$ is unitary for any $t \ge t_0$.
Another natural property that the evolution operator $\hat{U}$ should satisfy is that for times
$t_0 < t_1 < t_2$,
\[ \hat{U}(t_2, t_0) = \hat{U}(t_2, t_1)\hat{U}(t_1, t_0). \]
Assuming the evolution of $|\psi(t)\rangle$ is continuous, i.e., $\lim_{\Delta t \to 0^+} |\psi(t + \Delta t)\rangle = |\psi(t)\rangle$,
we have
\[ \hat{U}(t, t) = \lim_{\Delta t \to 0^+} \hat{U}(t + \Delta t, t) = I. \]
Hence
\[ \hat{U}(t + \Delta t, t) = I - i\hat{\Omega}(t)\Delta t + O(\Delta t^2). \]
In order to satisfy the unitary condition to the first order with respect to $\Delta t$, we have
$\hat{\Omega}^*(t) = \hat{\Omega}(t)$. Therefore $\hat{\Omega}(t)$ is a self-adjoint operator and can be associated with a
physical observable.
Quantum mechanics postulates that $\hat{\Omega}$ is given by the Hamiltonian operator $\hat{H}$,
which is obtained from the quantization process of the Hamiltonian in classical physics,
and is related to the total energy of the system. Then
\[ \hat{H}|\varphi_i\rangle = E_i |\varphi_i\rangle, \]
The ground state energy is $E_0 = -\frac{B}{2}$ with eigenstate $|\psi_0\rangle = |{\downarrow}\rangle$, and the first excited
state energy is $E_1 = \frac{B}{2}$ with eigenstate $|\psi_1\rangle = |{\uparrow}\rangle$.
\[ |\psi(t)\rangle = \frac{e^{-iBt/2}}{\sqrt{2}} |{\uparrow}\rangle + \frac{e^{+iBt/2}}{\sqrt{2}} |{\downarrow}\rangle. \]
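The evolution of the expectation value of the spin operator $\hat{S}_x$ satisfies the following
equation:
\[ i\frac{d}{dt}\langle\hat{S}_x\rangle(t) = \langle[\hat{S}_x, \hat{H}]\rangle(t) = -iB\langle\hat{S}_y\rangle(t). \tag{1.1.25} \]
Similarly,
\[ i\frac{d}{dt}\langle\hat{S}_y\rangle(t) = \langle[\hat{S}_y, \hat{H}]\rangle(t) = iB\langle\hat{S}_x\rangle(t). \]
Hence
\[ \frac{d^2}{dt^2}\langle\hat{S}_x\rangle(t) = -B^2\langle\hat{S}_x\rangle(t). \tag{1.1.26} \]
From the initial state (1.1.24) we may readily compute $\langle\hat{S}_x\rangle(0) = \frac{1}{2}$, $\langle\hat{S}_y\rangle(0) = 0$.
Together with (1.1.25), we find that (1.1.26) is a second-order ODE with initial conditions
\[ \langle\hat{S}_x\rangle(0) = \frac{1}{2}, \qquad \frac{d}{dt}\langle\hat{S}_x\rangle(0) = 0. \]
Solving this equation, we have
\[ \langle\hat{S}_x\rangle(t) = \frac{1}{2}\cos Bt. \]
Similarly,
\[ \langle\hat{S}_y\rangle(t) = \frac{1}{2}\sin Bt, \qquad \langle\hat{S}_z\rangle(t) = 0. \]
Therefore the expectation of the spin operator is
\[ \langle\hat{\mathbf{S}}\rangle(t) = \frac{1}{2}(\cos Bt, \sin Bt, 0)^{\top}, \]
which rotates about the axis pointing along the magnetic field $B$. Since the spin is itself
an intrinsic rotating degree of freedom of a quantum particle, this motion is called the
spin precession.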
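The precession can be reproduced by propagating the state numerically. This sketch rests on two assumptions inferred from the formulas above rather than quoted from them: the Hamiltonian is taken as $\hat{H} = B\hat{S}_z$ (which gives the stated energies $\mp\frac{B}{2}$ for $|{\downarrow}\rangle$, $|{\uparrow}\rangle$), and the initial state is $(|{\uparrow}\rangle + |{\downarrow}\rangle)/\sqrt{2}$. The value of `B` is arbitrary.

```python
import numpy as np

B = 1.3  # hypothetical field strength in atomic units

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

# Assumed Hamiltonian, consistent with E0 = -B/2 (|down>) and E1 = +B/2 (|up>).
H = B * Sz
E, V = np.linalg.eigh(H)

# Assumed initial state (|up> + |down>) / sqrt(2).
psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)

def propagate(t):
    """Apply U(t, 0) = exp(-i H t) to psi0 via the eigendecomposition of H."""
    return V @ (np.exp(-1j * E * t) * (V.conj().T @ psi0))

def expectation(A, psi):
    return np.vdot(psi, A @ psi).real

times = np.linspace(0.0, 10.0, 51)
spin = np.array([[expectation(S, propagate(t)) for S in (Sx, Sy, Sz)] for t in times])

# Closed-form prediction (1/2)(cos Bt, sin Bt, 0).
exact = 0.5 * np.stack([np.cos(B * times), np.sin(B * times), np.zeros_like(times)], axis=1)
max_error = np.max(np.abs(spin - exact))
print(max_error)
```

Because $\hat{H}$ is time independent, the propagator is simply $e^{-i\hat{H}t}$, and diagonalizing $\hat{H}$ once is enough to evaluate it at all times.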
For any two state vectors $|\psi\rangle, |\varphi\rangle \in \mathcal{H}$, the inner product is defined as
\[ \langle\varphi|\psi\rangle = \int_{\mathbb{R}} \varphi^*(x)\psi(x)\,dx. \]
Note the subtle difference in the notation here: $\hat{x}$ is an operator while $x$ is a real number,
and the right-hand side is understood as the evaluation of the function $x\psi$ at $x$. This
equation can also be written in Dirac notation as
then we must have $\psi(x) = 0$ if $x \neq x_0$. Since $\psi(x)$ vanishes almost everywhere on the
real line, $|\psi\rangle$ is equivalent to the null state. This contradicts the assumption that $\psi(x)$ is
a normalized eigenfunction. In fact, the operator $\hat{x}$ does not have any square integrable
eigenstate. The eigen-decomposition of the position operator needs to be represented
using the Dirac δ-function loosely thought of as
\[ \delta(x - x_0) = \begin{cases} \infty, & \text{if } x = x_0; \\ 0, & \text{otherwise,} \end{cases} \tag{1.2.5} \]
and $\int \delta(x - x_0)\,dx = 1$. According to distribution theory, the Dirac δ-function
should be considered as a linear functional acting on smooth functions $f$ as
\[ \delta(\cdot - x_0) : f \mapsto f(x_0). \]
We will not discuss further details of this more rigorous perspective here.
The Dirac δ-notation allows us to formally define a state $|x_0\rangle$ using $\delta(x - x_0)$. Then
the formal eigen-decomposition of $\hat{x}$ reads
Using such notation, the relation between a quantum state $|\psi\rangle$ and its function $\psi(x)$ is
Furthermore,
\[ \langle x_1|x_2\rangle = \delta(x_1 - x_2), \tag{1.2.8} \]
appropriately viewed as the “generalized coordinate” of the state vector $|\psi\rangle$, and the basis
is provided by the “generalized eigenfunctions” of the operator $\hat{x}$. Hence the function
$\psi(x)$ is also called the real space representation of $|\psi\rangle$, and the square integrable condition
for $\psi(x)$ means that $|\psi(x)|^2$ can be interpreted as the probability density of finding
the particle at $x$. In the discussion below, we may use the notation $|\psi\rangle$ and its associated
function $\psi(x)$ interchangeably.
If the Hilbert space $\mathcal{H}$ is finite dimensional and $\hat{A}$ is a linear operator on $\mathcal{H}$, then for
any state vector $|\psi\rangle \in \mathcal{H}$ we still have $\hat{A}|\psi\rangle \in \mathcal{H}$. The same statement does not hold for
infinite dimensional spaces. For example, consider the wavefunction $\psi(x) \propto \frac{1}{1 + |x|^{2/3}}$.
Then one can verify that $\psi \in \mathcal{H}$ but $x\psi(x) \notin \mathcal{H}$. This means that the position operator
$\hat{x}$ cannot be defined for all possible wavefunctions $\psi \in \mathcal{H}$. The domain of the position
operator is
\[ \operatorname{dom} \hat{x} = \left\{ \psi \in L^2(\mathbb{R}) \;\middle|\; \int_{\mathbb{R}} x^2 |\psi(x)|^2\,dx < \infty \right\}, \tag{1.2.9} \]
which is a subset of $\mathcal{H} = L^2(\mathbb{R})$. It is a dense subset as it contains all compactly
supported continuous functions.
Momentum operator
The definition of the momentum operator is arguably another point of mystery for
first-time readers of quantum mechanics. Here we simply state that quantum mechanics
postulates that the momentum operator should be a differential operator
\[ \hat{p} = -i\frac{d}{dx}. \]
In other words, for $\psi \in \mathcal{H}$,
\[ (\hat{p}\psi)(x) = -i\psi'(x). \]
The eigenfunction of $\hat{p}$ can be discussed in parallel to that of the position operator.
We formally denote by $|p_0\rangle$ the eigenfunction of $\hat{p}$ with eigenvalue $p_0$, i.e.,
\[ \hat{p}|p_0\rangle = p_0 |p_0\rangle, \qquad p_0 \in \mathbb{R}. \]
which is a generalized orthonormality condition. Note that $|p_0\rangle \notin L^2(\mathbb{R})$, and therefore
$|p_0\rangle$ is also a generalized eigenfunction.
Similar to the position operator, $\hat{p}$ cannot be applied to all functions $\psi(x) \in L^2(\mathbb{R})$.
The domain of the momentum operator is
Uncertainty principle
We can compute the commutator of the position operator $\hat{x}$ and the momentum operator
$\hat{p}$ in the following way:
\[ [\hat{x}, \hat{p}]\psi(x) = (\hat{x}\hat{p} - \hat{p}\hat{x})\psi(x) = x(-i\psi'(x)) + i\psi(x) + ix\psi'(x) = i\psi(x). \tag{1.2.15} \]
This is true for an arbitrary function $\psi(x)$ in the set $\{\psi \in H^1(\mathbb{R}) \mid \int_{\mathbb{R}} x^2 |\psi'(x)|^2\,dx < \infty\}$, so
that all operations above are well defined. This is a dense subset of $\mathcal{H} = L^2(\mathbb{R})$. Hence
we have
\[ [\hat{x}, \hat{p}] = i. \tag{1.2.16} \]
Equation (1.2.16) is a fundamental relation in quantum mechanics and is called the
canonical commutation relation.
From the canonical commutation relation, we find that the position and momentum
operators are not compatible, and it is not possible to simultaneously determine the
position and momentum of a quantum particle. A more quantitative version of this
statement is given by the uncertainty principle in (1.1.21) as
\[ \langle\psi|\Delta\hat{x}^2|\psi\rangle \langle\psi|\Delta\hat{p}^2|\psi\rangle \ge \frac{1}{4}|\langle\psi|[\hat{x}, \hat{p}]|\psi\rangle|^2 = \frac{1}{4}, \]
or simply
\[ \sqrt{\langle\Delta\hat{x}^2\rangle}\,\sqrt{\langle\Delta\hat{p}^2\rangle} \ge \frac{1}{2}. \tag{1.2.17} \]
The relation (1.2.17) is called the Heisenberg uncertainty principle.
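Both (1.2.15) and (1.2.17) can be checked on a grid. The sketch below is an illustration under stated assumptions: the test function is a Gaussian $\psi(x) = \pi^{-1/4}e^{-x^2/2}$ (not taken from the text), the derivative is approximated by central finite differences, and integrals are approximated by a simple Riemann sum.

```python
import numpy as np

# Uniform grid; the Gaussian is negligible at the boundaries.
x = np.linspace(-12.0, 12.0, 4001)
dx = x[1] - x[0]

# Hypothetical test state: normalized Gaussian wavefunction.
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)

def p_op(f):
    """Momentum operator -i d/dx, approximated by finite differences."""
    return -1j * np.gradient(f, dx)

def inner(f, g):
    """Approximate the L^2 inner product <f|g> by a Riemann sum."""
    return np.sum(np.conj(f) * g) * dx

# Canonical commutation relation applied to psi: (x p - p x) psi should equal i psi.
commutator_psi = x * p_op(psi) - p_op(x * psi)
comm_error = np.max(np.abs(commutator_psi - 1j * psi))

# Variances of position and momentum; <p^2> is evaluated as ||p psi||^2.
mean_x = inner(psi, x * psi).real
var_x = inner(psi, x ** 2 * psi).real - mean_x ** 2
mean_p = inner(psi, p_op(psi)).real
var_p = inner(p_op(psi), p_op(psi)).real - mean_p ** 2

product = np.sqrt(var_x * var_p)
print(comm_error, product)
```

For this Gaussian the product of standard deviations comes out close to $\frac{1}{2}$, i.e., the bound in (1.2.17) is attained up to discretization error.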
\[ |\langle p|x_0\rangle|^2 = \frac{1}{2\pi} \]
holds for any $p \in \mathbb{R}$, we have formally $\langle\Delta\hat{p}^2\rangle = \infty$. Similar calculation shows that if
$|\psi\rangle = |p_0\rangle$, then $\langle\Delta\hat{x}^2\rangle = \infty$.
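\[ \mathbf{L} = \mathbf{r} \times \mathbf{p}. \tag{1.2.18} \]
Then the quantization rule defines the quantum angular momentum operator as
\[ \hat{\mathbf{L}} = \hat{\mathbf{r}} \times \hat{\mathbf{p}} = \hat{\mathbf{r}} \times (-i\nabla_{\mathbf{r}}), \tag{1.2.19} \]
\[ \begin{aligned} \hat{L}_x &= -i\hat{y}\frac{\partial}{\partial z} + i\hat{z}\frac{\partial}{\partial y}, \\ \hat{L}_y &= -i\hat{z}\frac{\partial}{\partial x} + i\hat{x}\frac{\partial}{\partial z}, \\ \hat{L}_z &= -i\hat{x}\frac{\partial}{\partial y} + i\hat{y}\frac{\partial}{\partial x}. \end{aligned} \tag{1.2.20} \]
Similarly to the spin-$\frac{1}{2}$ particle, we define the square of the magnitude of the angular
momentum operator as
\[ \hat{\mathbf{L}}^2 := \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2. \tag{1.2.21} \]
The different components of the angular momentum satisfy the cyclic relation
\[ [\hat{L}_x, \hat{L}_y] = i\hat{L}_z, \qquad [\hat{L}_y, \hat{L}_z] = i\hat{L}_x, \qquad [\hat{L}_z, \hat{L}_x] = i\hat{L}_y. \tag{1.2.22} \]
Furthermore,
\[ [\hat{\mathbf{L}}^2, \hat{L}_\alpha] = 0, \qquad \alpha = x, y, z. \tag{1.2.23} \]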
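The nature of the angular momentum operator can be better revealed in the spherical
coordinate system. By a change of variable of the Euclidean coordinate
\[ x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta, \tag{1.2.24} \]
we find that the operator $\hat{\mathbf{L}}^2$ in the spherical coordinates reads
\[ \hat{\mathbf{L}}^2 = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}. \tag{1.2.25} \]
In particular, $\hat{\mathbf{L}}^2$ depends only on the angular directions $\theta, \varphi$ and is independent of the
radial direction.
Following the derivation in Appendix A.2, the eigenvalues and eigenfunctions of
$\hat{\mathbf{L}}^2$ can be directly evaluated using separation of variables. We have
\[ \hat{\mathbf{L}}^2 Y_{lm}(\theta, \varphi) = l(l + 1)\,Y_{lm}(\theta, \varphi). \tag{1.2.26} \]
Here $l \in \mathbb{N}$ (including $l = 0$) and $m$ takes the values $-l, -l + 1, \ldots, l$. The eigenfunctions $Y_{lm}$ depend
only on the $\theta, \varphi$ variables and are called the spherical harmonics. Spherical harmonics
are one of the most important classes of special functions used in quantum physics, and
they provide both solutions to, and chemical intuition about, the Schrödinger
equation in electronic structure theory. The formula for the first few spherical harmonics
is given in Table 1.1.

Table 1.1. The first few spherical harmonics $Y_{lm}(\theta, \varphi)$.

l    m     Y_lm(θ, φ)                                                  name
0    0     $\sqrt{\frac{1}{4\pi}}$                                     s orbital
1   −1     $\sqrt{\frac{3}{8\pi}}\,\sin\theta\,e^{-i\varphi}$          p orbital
1    0     $\sqrt{\frac{3}{4\pi}}\,\cos\theta$                         p orbital
1    1     $-\sqrt{\frac{3}{8\pi}}\,\sin\theta\,e^{i\varphi}$          p orbital
2   −2     $\sqrt{\frac{15}{32\pi}}\,\sin^2\theta\,e^{-2i\varphi}$     d orbital
2   −1     $\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\,e^{-i\varphi}$   d orbital
2    0     $\sqrt{\frac{5}{16\pi}}\,(3\cos^2\theta - 1)$               d orbital
2    1     $-\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\,e^{i\varphi}$   d orbital
2    2     $\sqrt{\frac{15}{32\pi}}\,\sin^2\theta\,e^{2i\varphi}$      d orbital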
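As a consistency check on the normalization constants, one can verify numerically that the spherical harmonics for $l \le 2$ are orthonormal on the unit sphere. The sketch below hard-codes the standard (Condon–Shortley convention) expressions, which is an assumption about the convention intended here, and uses Gauss–Legendre quadrature in $\cos\theta$ with equally spaced points in $\varphi$; the grid sizes are arbitrary but sufficient for exact integration of these low-order functions.

```python
import numpy as np

# Quadrature on the unit sphere: Gauss-Legendre nodes in u = cos(theta),
# equally spaced nodes in phi.
u, w_u = np.polynomial.legendre.leggauss(16)
theta = np.arccos(u)
n_phi = 32
phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
T, P = np.meshgrid(theta, phi, indexing="ij")
W = w_u[:, None] * (2.0 * np.pi / n_phi)  # combined quadrature weights

s, c = np.sin(T), np.cos(T)

# Standard spherical harmonics Y_lm for l = 0, 1, 2, keyed by (l, m).
Y = {
    (0, 0): np.sqrt(1 / (4 * np.pi)) * np.ones_like(T, dtype=complex),
    (1, -1): np.sqrt(3 / (8 * np.pi)) * s * np.exp(-1j * P),
    (1, 0): np.sqrt(3 / (4 * np.pi)) * c + 0j,
    (1, 1): -np.sqrt(3 / (8 * np.pi)) * s * np.exp(1j * P),
    (2, -2): np.sqrt(15 / (32 * np.pi)) * s ** 2 * np.exp(-2j * P),
    (2, -1): np.sqrt(15 / (8 * np.pi)) * s * c * np.exp(-1j * P),
    (2, 0): np.sqrt(5 / (16 * np.pi)) * (3 * c ** 2 - 1) + 0j,
    (2, 1): -np.sqrt(15 / (8 * np.pi)) * s * c * np.exp(1j * P),
    (2, 2): np.sqrt(15 / (32 * np.pi)) * s ** 2 * np.exp(2j * P),
}

# Gram matrix of inner products over the sphere; it should be the identity.
keys = list(Y)
gram = np.array([[np.sum(np.conj(Y[a]) * Y[b] * W) for b in keys] for a in keys])
print(np.max(np.abs(gram - np.eye(len(keys)))))
```

A wrong prefactor in any single entry would show up as a diagonal element of `gram` different from one.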
In the general theory of angular momentum, any operator $\hat{\mathbf{L}}$ satisfying the cyclic
relation (1.2.22) is called an angular momentum operator. From this perspective, the
spin operator $\hat{\mathbf{S}}$ satisfies the cyclic relation and hence is an angular momentum operator.
Since
\[ \hat{\mathbf{S}}^2 = \frac{3}{4} I = \frac{1}{2}\left(\frac{1}{2} + 1\right) I, \tag{1.2.27} \]
we identify that $\hat{\mathbf{S}}^2$ is the square of the magnitude of the angular momentum operator
with $l = \frac{1}{2}$, compared with (1.2.26). This justifies the name “spin-$\frac{1}{2}$” particle in
section 1.1. The general theory of angular momentum further states that the value of $l$
appearing in the eigenvalues of any angular momentum operator can only be integers
Hamiltonian operator
In the absence of a magnetic field, the total energy in classical mechanics for a particle
with unit mass in a potential field is $E(x, p) = \frac{p^2}{2} + V(x)$. After the quantization
procedure, we obtain the Hamiltonian for a particle on the real line,
\[
\hat{H} = \frac{\hat{p}^2}{2} + V(\hat{x}),
\]
where $V(\hat{x})$ is interpreted as a multiplicative operator defined as
\[
(V(\hat{x})\psi)(x) = V(x)\psi(x).
\]
For particles in three dimensions with a potential field V, the Hamiltonian operator is
defined as
\[
\hat{H} = \frac{1}{2}\left(\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2\right) + V(\hat{\mathbf{r}}) = -\frac{1}{2}\Delta_{\mathbf{r}} + V(\hat{\mathbf{r}}). \tag{1.2.28}
\]
Here $\Delta_{\mathbf{r}} = \partial_x^2 + \partial_y^2 + \partial_z^2$ is the Laplacian operator.
From now on, we will drop the $\hat{\cdot}$ notation for the position, momentum, angular
momentum, and Hamiltonian operators for the rest of the discussion to make the notation
concise. So r, p, L, S, H will be interpreted as operators when necessary. We also
write $\mathbf{r} = (x, y, z)^\top \equiv (r_1, r_2, r_3)^\top$ for the convenience of summations over spatial
coordinate components. Similarly, $\mathbf{p} = (p_x, p_y, p_z)^\top \equiv (p_1, p_2, p_3)^\top$. Following this
notation, we may write the Hamiltonian as
\[
H = -\frac{1}{2}\Delta_{\mathbf{r}} + V(\mathbf{r}). \tag{1.2.29}
\]
For the Hamiltonian in (1.2.29), the time-dependent Schrödinger equation reads
\[
i\partial_t \psi(\mathbf{r}, t) = H\psi(\mathbf{r}, t) = -\frac{1}{2}\Delta_{\mathbf{r}}\psi(\mathbf{r}, t) + V(\mathbf{r})\psi(\mathbf{r}, t). \tag{1.2.30}
\]
Since the Hamiltonian H does not depend on the time variable, one can find the stationary
state solution of the Schrödinger equation by solving the eigenvalue problem
\[
\left(-\frac{1}{2}\Delta_{\mathbf{r}} + V(\mathbf{r})\right)\psi(\mathbf{r}) = E\psi(\mathbf{r}). \tag{1.2.31}
\]
mechanics. The Hamiltonian for the hydrogen atom takes the form (1.2.29). To simplify
the discussion, we will take the Born–Oppenheimer approximation and regard the
nucleus as fixed at the origin. We remark that it is also possible to explicitly solve the
full quantum mechanical description of the hydrogen atom. The potential is centrally
symmetric and depends only on the radial distance (recall that $r = |\mathbf{r}|$):
\[
V(r) = -\frac{1}{r}. \tag{1.3.1}
\]
Hence we can find the eigenstates of the hydrogen atom by solving the eigenvalue problem
\[
\left(-\frac{1}{2}\Delta_{\mathbf{r}} - \frac{1}{r}\right)\psi(\mathbf{r}) = E\psi(\mathbf{r}). \tag{1.3.2}
\]
In quantum mechanics, such an eigenfunction ψ(r) is also referred to as an “orbital.”
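Using the spherical coordinates (1.2.24), the Laplacian operator takes the form
\[
\Delta_{\mathbf{r}} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{1.3.3}
\]
Comparing the relation above and (1.2.25), we find that $\Delta_{\mathbf{r}}$ is related to the $L^2$ operator
as
\[
\Delta_{\mathbf{r}} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \frac{1}{r^2}L^2. \tag{1.3.4}
\]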
Since all eigenfunctions for $L^2$ can be explicitly identified via (1.2.26), we may readily
use separation of variables to solve (1.3.2). Assume the wavefunction takes the form
\[
\psi(r, \theta, \phi) = R(r)\,Y_{lm}(\theta, \phi). \tag{1.3.5}
\]
Then the radial part of the wavefunction R(r) should satisfy the eigenvalue problem
\[
-\frac{1}{2r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial R}{\partial r}\right) + \frac{l(l+1)}{2r^2}R(r) - \frac{1}{r}R(r) = ER(r), \quad r \in (0, \infty), \tag{1.3.6}
\]
subject to the condition that R(0) is finite and the function decays at infinity. Direct
calculation shows that the following relation holds in the operator sense:
\[
\frac{1}{r^2}\frac{\partial}{\partial r}\, r^2\, \frac{\partial}{\partial r} = \frac{1}{r}\frac{\partial^2}{\partial r^2}\, r. \tag{1.3.7}
\]
By a change of variable
\[
u(r) = rR(r), \tag{1.3.8}
\]
we have an equivalent eigenvalue problem
\[
-\frac{1}{2}\frac{\partial^2 u}{\partial r^2}(r) + V_e(r)u(r) = Eu(r), \tag{1.3.9}
\]
where
\[
V_e(r) = \frac{l(l+1)}{2r^2} - \frac{1}{r} \tag{1.3.10}
\]
is an effective potential along the radial direction. Compared to V(r), the extra term is
due to the angular momentum operator. Asymptotically, as r → ∞, (1.3.9) becomes
approximately
\[
-\frac{1}{2}\frac{\partial^2 u}{\partial r^2} = Eu. \tag{1.3.11}
\]
If the eigenvalue E > 0, then $u(r) \sim c_1 e^{i\sqrt{2E}\,r} + c_2 e^{-i\sqrt{2E}\,r}$. This planewave-like
solution cannot be square integrable, and hence any value E > 0 cannot be an isolated
eigenvalue. In fact, the hydrogen atom has a continuous spectrum on [0, ∞). Hence all
eigenvalues with square integrable eigenfunctions must satisfy E < 0.
If l = 0 in (1.3.10), one can easily find one analytic solution $u(r) = re^{-r}$ for
(1.3.9), i.e.,
\[
\Psi(\mathbf{r}) = \sqrt{\frac{1}{4\pi}}\,R(r), \quad R(r) = \exp(-r),
\]
which corresponds to the eigenvalue $E_0 = -\frac{1}{2}$. Here the prefactor $\sqrt{1/(4\pi)}$ is the
normalized spherical harmonic $Y_{00}$; an additional constant factor 2 in R(r) normalizes Ψ
itself. It turns out that this is the ground state energy. Further calculation
shows that all other eigenvalues are given by a simple relation
\[
E_{kl} = -\frac{1}{2(k+l)^2},
\]
where $E_{kl}$ is the kth eigenvalue of (1.3.9) corresponding to the angular momentum l. If
we define a number n = k + l, then all eigenvalues can be labeled using a single number
n as
\[
E_n = -\frac{1}{2n^2}, \quad n = 1, 2, \ldots.
\]
For each eigenvalue En , the corresponding degenerate eigenfunctions can be labeled
as ψnlm with n, l, m being integer parameters. Here n is called the principal quantum
number (n ≥ 1), l is called the azimuthal quantum number (0 ≤ l ≤ n − 1), and m is
called the magnetic quantum number (−l ≤ m ≤ l).
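The spectrum $E_n = -1/(2n^2)$ can be reproduced approximately by discretizing the radial problem (1.3.9) with the effective potential (1.3.10). The following is a minimal sketch, not the book's method: it uses a second-order finite-difference grid with u = 0 imposed at r = 0 and at a cutoff radius, and the function name `radial_levels` and the parameters `r_max` and `n_grid` are choices made here for illustration.

```python
import numpy as np

def radial_levels(l, n_levels=2, r_max=40.0, n_grid=1200):
    """Lowest eigenvalues of -(1/2) u'' + V_e(r) u = E u on (0, r_max),
    with u(0) = u(r_max) = 0, by central finite differences."""
    h = r_max / (n_grid + 1)
    r = h * np.arange(1, n_grid + 1)                 # interior grid points
    v_eff = l * (l + 1) / (2.0 * r**2) - 1.0 / r     # effective potential (1.3.10)
    main = 1.0 / h**2 + v_eff                        # diagonal of -(1/2) d^2/dr^2 + V_e
    off = -0.5 / h**2 * np.ones(n_grid - 1)          # off-diagonals of -(1/2) d^2/dr^2
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:n_levels]

E_s = radial_levels(l=0)   # expected near -1/2 (n = 1) and -1/8 (n = 2)
E_p = radial_levels(l=1)   # expected near -1/8 (n = 2) and -1/18 (n = 3)
print(E_s, E_p)
```

The second l = 0 level and the first l = 1 level should nearly coincide, which illustrates the degeneracy in l for a fixed n = k + l. Accuracy is limited by the grid spacing and the cutoff radius.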
In spectroscopic terminology, the spherical harmonics with l = 0, 1, 2, 3 are named
s, p, d, f orbitals, respectively.² These terms are often used in electronic structure theory
as well. Hence $\psi_{100}$ is called the 1s orbital (with the only possible choice l = 0 and
m = 0). The normalized 1s orbital takes the form
\[
\psi_{100}(\mathbf{r}) = \sqrt{\frac{1}{\pi}}\exp(-r).
\]
² Here the symbols s, p, d, and f stand for “sharp,” “principal,” “diffuse,” and “fundamental” (or “fine”), respectively.
Example: $\mathrm{H}_2^+$
The $\mathrm{H}_2^+$ molecule consists of two hydrogen atoms with one electron removed. The
system thus contains only one electron and the total charge of the system is +1. Without
loss of generality we assume the positions of the two nuclei are fixed at 0 and $\mathbf{R} =
(R, 0, 0)^\top$, respectively, and the Hamiltonian for the electron is
\[
H = -\frac{1}{2}\Delta_{\mathbf{r}} - \frac{1}{|\mathbf{r}|} - \frac{1}{|\mathbf{r} - \mathbf{R}|}. \tag{1.3.12}
\]
As a crude approximation, we assume the ground state wavefunction is given by a
linear combination of two 1s orbitals centered at 0 and R, respectively, i.e.,
\[
\psi(\mathbf{r}) \approx c_1 \psi_{100}(\mathbf{r}) + c_2 \psi_{100}(\mathbf{r} - \mathbf{R}). \tag{1.3.13}
\]
Since the vector space $\mathrm{span}\{\psi_{100}(\mathbf{r}), \psi_{100}(\mathbf{r} - \mathbf{R})\}$ is isomorphic to $\mathbb{C}^2$, the Hamiltonian
operator of the $\mathrm{H}_2^+$ molecule restricted to this subspace can be represented
by a 2 × 2 matrix. More specifically, we can solve this system by the Galerkin projection
principle. By projecting the eigenvalue problem
\[
H\psi = E\psi
\]
to the space $\mathrm{span}\{\psi_{100}(\mathbf{r}), \psi_{100}(\mathbf{r} - \mathbf{R})\}$, we have a generalized eigenvalue problem
\[
\begin{pmatrix} \varepsilon & -t \\ -t & \varepsilon \end{pmatrix} c = E \begin{pmatrix} 1 & s \\ s & 1 \end{pmatrix} c. \tag{1.3.14}
\]
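The generalized eigenvector is $c = (c_1, c_2)^\top$ and the matrix elements are
\[
\varepsilon = \int \psi_{100}(\mathbf{r})\,(H\psi_{100})(\mathbf{r})\, d\mathbf{r}, \tag{1.3.15}
\]
\[
-t = \int \psi_{100}(\mathbf{r})\,(H\psi_{100})(\mathbf{r} - \mathbf{R})\, d\mathbf{r}, \tag{1.3.16}
\]
\[
s = \int \psi_{100}(\mathbf{r})\,\psi_{100}(\mathbf{r} - \mathbf{R})\, d\mathbf{r}. \tag{1.3.17}
\]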
Given that t, s > 0, the ground state eigenvalue and eigenvector are
\[
E_g = \frac{\varepsilon - t}{1 + s}, \quad c_g = \frac{1}{\sqrt{2(1 + s)}}(1, 1)^\top. \tag{1.3.18}
\]
Similarly, the eigenvalue and eigenvector for the first excited state are
\[
E_e = \frac{\varepsilon + t}{1 - s}, \quad c_e = \frac{1}{\sqrt{2(1 - s)}}(1, -1)^\top. \tag{1.3.19}
\]
It can be verified that the eigenfunctions in the real space using the ansatz (1.3.13)
satisfy
\[
\langle\psi_g|\psi_g\rangle = \langle\psi_e|\psi_e\rangle = 1, \quad \langle\psi_g|\psi_e\rangle = 0. \tag{1.3.20}
\]
In particular, the wavefunction of the ground state is symmetric on the two sites, and $\psi_g(\mathbf{r})$
is nonzero between the two atoms. This is called a “bonding state” and it is the prototypical
model for covalent bonds in chemistry. On the other hand, $\psi_e(\mathbf{r})$ is exactly zero at
$\mathbf{r} = (R/2, 0, 0)^\top$. This is called an “anti-bonding state.”
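The closed-form solutions (1.3.18) and (1.3.19) can be compared against a direct numerical solution of the 2 × 2 generalized eigenvalue problem (1.3.14). In the sketch below the values of `eps`, `t`, and `s` are hypothetical placeholders chosen for illustration; they are not computed from the integrals (1.3.15)–(1.3.17).

```python
import numpy as np

# Hypothetical matrix elements (illustrative numbers only, with t, s > 0).
eps, t, s = -1.0, 0.4, 0.3

A = np.array([[eps, -t], [-t, eps]])   # projected Hamiltonian
B = np.array([[1.0, s], [s, 1.0]])     # overlap matrix

# Solve A c = E B c by reducing to the standard problem (B^{-1} A) c = E c.
evals, evecs = np.linalg.eig(np.linalg.solve(B, A))
order = np.argsort(evals.real)
E_g, E_e = evals.real[order]

# Closed-form eigenvectors from (1.3.18) and (1.3.19).
c_g = np.array([1.0, 1.0]) / np.sqrt(2 * (1 + s))
c_e = np.array([1.0, -1.0]) / np.sqrt(2 * (1 - s))

print(E_g, (eps - t) / (1 + s))
print(E_e, (eps + t) / (1 - s))
print(c_g @ B @ c_g, c_e @ B @ c_e, c_g @ B @ c_e)   # B-orthonormality, cf. (1.3.20)
```

The last line reflects that the inner products in (1.3.20) become weighted by the overlap matrix once the wavefunctions are expressed through the coefficient vectors.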
1.4. Periodic systems
The main characteristic of isolated systems is that the potential V(r) decays to zero when
|r| → ∞. In addition to isolated systems, another important class of systems commonly
investigated in electronic structure theory is condensed matter systems, such as liquids
and solids. In such a case, the size of the support of V(r) is on the macroscopic scale and
can therefore be considered infinite from the microscopic perspective of electronic
structure theory. Although condensed matter systems contain a macroscopic number of
electrons, let us simplify the discussion for the moment and consider a single electron
in the condensed matter system. From this perspective, the condensed matter system
only provides a background potential V(r) and the Hilbert space for the electron is still
given by $L^2(\mathbb{R}^3)$. Many-electron systems will be discussed later in the book.
A simple example of a condensed matter system is a crystalline solid system, or
simply a crystal. The atomic positions form a Bravais lattice L, defined as the set
L = {R | R = n1 a1 + n2 a2 + n3 a3 , n1 , n2 , n3 ∈ Z} . (1.4.1)
The periodic property of V implies that [TR , H] = 0, and hence TR and H are simul-
taneously diagonalizable. Formally, take any eigenvector ψ such that
Hψ = Eψ, TR ψ = CR ψ. (1.4.4)
is a monochromatic planewave
Hence ψ may not be periodic with respect to R, but satisfies the twisted boundary
condition (also called the Bloch boundary condition) as
then
Therefore u(r) = u(r + R), i.e., u is periodic with respect to R. Since both V and u
are periodic, one can solve the equation for u(r) only in the unit cell Ω.
L∗ = {G | G = n1 b1 + n2 b2 + n3 b3 , n1 , n2 , n3 ∈ Z} . (1.4.11)
Here b1 , b2 , and b3 are the reciprocal lattice vectors, defined via the relation
aα · bβ = 2πδα,β , α, β = 1, 2, 3. (1.4.12)
Note the convention in the range of c1 , c2 , c3 and that the unit cell of the reciprocal
lattice centers at (0, 0, 0). In physics literature, Ω∗ is referred to as the (first) Brillouin
zone. Note that (1.4.8) implies that any k and k + G (G ∈ L∗ ) are equivalent. Hence
we can reduce the range of k to the first Brillouin zone Ω∗ .
More specifically, for a given k ∈ R3 , we would like to find the eigen-decomposition
of the form
Hψn,k (r) = En,k ψn,k (r), (1.4.14)
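where n = 0, 1, 2, . . . is the index for eigenvalues for each k. Using the change of
variable from ψ to the periodic part u as
\[
\psi_{n,\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\, u_{n,\mathbf{k}}(\mathbf{r}), \tag{1.4.15}
\]
we find
\[
\left(-\frac{1}{2}\Delta + V(\mathbf{r})\right) e^{i\mathbf{k}\cdot\mathbf{r}} u_{n,\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\left(-\frac{1}{2}(\nabla + i\mathbf{k})^2 + V(\mathbf{r})\right) u_{n,\mathbf{k}}(\mathbf{r}) = E_{n,\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{r}} u_{n,\mathbf{k}}(\mathbf{r}), \tag{1.4.16}
\]
so that $u_{n,\mathbf{k}}$ is an eigenfunction of the operator $H_{\mathbf{k}} = -\frac{1}{2}(\nabla + i\mathbf{k})^2 + V(\mathbf{r})$.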
As un,k extends periodically into R3 , un,k (and also ψn,k ) cannot be square integrable
in R3 . However, on Ω, un,k is a proper eigenfunction of Hk . For a fixed n, the function
En,k viewed as a function of k is a continuous function. This is called a Bloch band or
energy band. The collection of all eigenvalues {En,k } is called the band structure.
As will be discussed in section 2.8, the band structure plays a fundamental role in
understanding electronic properties in solids.
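To make the notion of a Bloch band concrete, the following sketch computes a few bands of a one-dimensional model. Everything about the model is an assumption made here for illustration: a lattice constant `a`, a potential V(x) = 2 v0 cos(2πx/a), and a planewave basis $e^{iGx}$ with G in the reciprocal lattice, in which the operator $-\frac{1}{2}(\partial_x + ik)^2 + V$ has matrix elements $\frac{1}{2}(k+G)^2\delta_{G,G'} + \hat{V}(G - G')$.

```python
import numpy as np

def bands_1d(kpt, v0=0.2, a=1.0, n_pw=10, n_bands=3):
    """Lowest eigenvalues of H_k for V(x) = 2*v0*cos(2*pi*x/a) (hypothetical model),
    using planewaves exp(i*G*x), G = 2*pi*m/a, m = -n_pw..n_pw."""
    b = 2 * np.pi / a                       # reciprocal lattice vector
    G = b * np.arange(-n_pw, n_pw + 1)
    n = len(G)
    H = np.diag(0.5 * (kpt + G) ** 2)       # kinetic part: (1/2)(k + G)^2
    # The cosine potential couples planewaves whose G differ by +-b.
    H = H + v0 * (np.eye(n, k=1) + np.eye(n, k=-1))
    return np.linalg.eigvalsh(H)[:n_bands]

b = 2 * np.pi
ks = np.linspace(-b / 2, b / 2, 41)         # sample the first Brillouin zone
band_structure = np.array([bands_1d(kpt) for kpt in ks])

gap_at_zone_boundary = band_structure[-1, 1] - band_structure[-1, 0]
print(gap_at_zone_boundary)                 # close to 2*v0 for a weak potential
```

For a weak potential, the gap between the two lowest bands at the zone boundary is approximately 2·v0, and replacing k by k + G leaves the eigenvalues unchanged up to the planewave truncation error, in line with the equivalence of k and k + G noted in the text.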
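The method for describing quantum systems containing more than one particle is the
tensor product of Hilbert spaces. For simplicity, we only consider finite dimensional
Hilbert spaces. Let $\mathcal{H}_A$, $\mathcal{H}_B$ be two Hilbert spaces of dimension $N_A$, $N_B$, respectively,
and let $\{|\varphi_i^A\rangle\}_{i=1}^{N_A}$, $\{|\varphi_j^B\rangle\}_{j=1}^{N_B}$ be the corresponding basis sets. Then the tensor product
space is defined as
\[
\mathcal{H}_A \otimes \mathcal{H}_B = \mathrm{span}\left\{ |\varphi_i^A \varphi_j^B\rangle \;\middle|\; i = 1, \ldots, N_A,\ j = 1, \ldots, N_B \right\}. \tag{1.5.1}
\]
Here $\{|\varphi_i^A \varphi_j^B\rangle\}$ form a new basis set, which is orthonormal under the inner product
\[
\langle\varphi_i^A \varphi_j^B|\varphi_{i'}^A \varphi_{j'}^B\rangle = \langle\varphi_i^A|\varphi_{i'}^A\rangle\langle\varphi_j^B|\varphi_{j'}^B\rangle = \delta_{i,i'}\delta_{j,j'}.
\]
The tensor product of an operator $\hat{A}$ on $\mathcal{H}_A$ and an operator $\hat{B}$ on $\mathcal{H}_B$ acts on the basis as
\[
(\hat{A} \otimes \hat{B})|\varphi_i^A \varphi_j^B\rangle = |(\hat{A}\varphi_i^A)(\hat{B}\varphi_j^B)\rangle. \tag{1.5.2}
\]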
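For example, consider two spin-1/2 particles. The Hilbert space for each particle,
denoted by $\mathcal{H} = \mathrm{span}\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$, is isomorphic to $\mathbb{C}^2$, and the product space is therefore
isomorphic to $\mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^4$. The basis set for $\mathcal{H} \otimes \mathcal{H}$ is given by
\[
\{|{\uparrow}\rangle \otimes |{\uparrow}\rangle,\ |{\uparrow}\rangle \otimes |{\downarrow}\rangle,\ |{\downarrow}\rangle \otimes |{\uparrow}\rangle,\ |{\downarrow}\rangle \otimes |{\downarrow}\rangle\},
\]
or in simplified notation {|↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩}. The spin operator along the z-direction
on the product space can be defined as
\[
S_z^{\mathrm{tot}} = S_z \otimes I + I \otimes S_z.
\]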
Similarly, we can define $S_x^{\mathrm{tot}}$, $S_y^{\mathrm{tot}}$. The square of the magnitude of the total spin
operator is
\[
(S^{\mathrm{tot}})^2 = (S_x^{\mathrm{tot}})^2 + (S_y^{\mathrm{tot}})^2 + (S_z^{\mathrm{tot}})^2.
\]
Below we demonstrate that the tensor product space for two spin-1/2 particles can be
explicitly categorized using the common eigenfunctions of $S_z^{\mathrm{tot}}$ and $(S^{\mathrm{tot}})^2$.
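To start, we compute $S_z^{\mathrm{tot}}$ acting on the four basis functions as
\[
\begin{aligned}
S_z^{\mathrm{tot}}|{\uparrow\uparrow}\rangle &= S_z \otimes I|{\uparrow\uparrow}\rangle + I \otimes S_z|{\uparrow\uparrow}\rangle = \tfrac{1}{2}|{\uparrow\uparrow}\rangle + \tfrac{1}{2}|{\uparrow\uparrow}\rangle = |{\uparrow\uparrow}\rangle, \\
S_z^{\mathrm{tot}}|{\downarrow\downarrow}\rangle &= -|{\downarrow\downarrow}\rangle, \\
S_z^{\mathrm{tot}}|{\downarrow\uparrow}\rangle &= S_z \otimes I|{\downarrow\uparrow}\rangle + I \otimes S_z|{\downarrow\uparrow}\rangle = -\tfrac{1}{2}|{\downarrow\uparrow}\rangle + \tfrac{1}{2}|{\downarrow\uparrow}\rangle = 0, \\
S_z^{\mathrm{tot}}|{\uparrow\downarrow}\rangle &= 0.
\end{aligned}
\]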
Hence all basis vectors of the tensor product space are eigenstates of $S_z^{\mathrm{tot}}$. Similarly,
one can evaluate $S_x^{\mathrm{tot}}$ and $S_y^{\mathrm{tot}}$ acting on the basis vectors. The square of the magnitude
of the total spin operator $(S^{\mathrm{tot}})^2$ requires further computation. Since
Therefore $(S^{\mathrm{tot}})^2$ and $S_z^{\mathrm{tot}}$ can be simultaneously diagonalized. More specifically,
\[
(S^{\mathrm{tot}})^2|{\uparrow\uparrow}\rangle = 2|{\uparrow\uparrow}\rangle, \quad (S^{\mathrm{tot}})^2|{\downarrow\downarrow}\rangle = 2|{\downarrow\downarrow}\rangle. \tag{1.5.6}
\]
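On the other hand, |↑↓⟩ and |↓↑⟩ are not eigenstates of $(S^{\mathrm{tot}})^2$. Instead,
\[
\begin{aligned}
(S^{\mathrm{tot}})^2\left(\tfrac{1}{\sqrt{2}}|{\uparrow\downarrow}\rangle + \tfrac{1}{\sqrt{2}}|{\downarrow\uparrow}\rangle\right) &= 2\left(\tfrac{1}{\sqrt{2}}|{\uparrow\downarrow}\rangle + \tfrac{1}{\sqrt{2}}|{\downarrow\uparrow}\rangle\right), \\
(S^{\mathrm{tot}})^2\left(\tfrac{1}{\sqrt{2}}|{\uparrow\downarrow}\rangle - \tfrac{1}{\sqrt{2}}|{\downarrow\uparrow}\rangle\right) &= 0.
\end{aligned} \tag{1.5.7}
\]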
Thus the operator $(S^{\mathrm{tot}})^2$ can be used to distinguish the states in the eigenspace of $S_z^{\mathrm{tot}}$
corresponding to the eigenvalue 0, which is spanned by the degenerate eigenstates |↑↓⟩ and |↓↑⟩.
In summary, the operator $(S^{\mathrm{tot}})^2$ has a nondegenerate eigenvalue 0, whose eigenstate is called the
singlet state, and a three-fold degenerate eigenvalue 2, whose eigenstates are called the triplet states.
The triplet states can be further distinguished using the operator $S_z^{\mathrm{tot}}$. The eigenvalues
and eigenvectors of $(S^{\mathrm{tot}})^2$ and $S_z^{\mathrm{tot}}$ are summarized in Table 1.2.
State                      Type       (S^tot)^2   S_z^tot
(1/√2)(|↑↓⟩ − |↓↑⟩)        singlet    0            0
|↑↑⟩                       triplet    2            1
(1/√2)(|↑↓⟩ + |↓↑⟩)        triplet    2            0
|↓↓⟩                       triplet    2           −1
Table 1.2. Eigenvalues and eigenvectors of (S^tot)^2 and S_z^tot for
two spin-1/2 particles.
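The entries of Table 1.2 can be reproduced with Kronecker products. The sketch below assumes the single-particle spin operators are one half of the Pauli matrices and that |↑⟩ = (1, 0) and |↓⟩ = (0, 1), so that `np.kron` orders the product basis as |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩; the helper name `total` is introduced here for illustration.

```python
import numpy as np

sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def total(s):
    """Total operator S (x) I + I (x) S on the two-particle space."""
    return np.kron(s, I2) + np.kron(I2, s)

Sx_tot, Sy_tot, Sz_tot = total(sx), total(sy), total(sz)
S2_tot = Sx_tot @ Sx_tot + Sy_tot @ Sy_tot + Sz_tot @ Sz_tot

# Basis order: |uu>, |ud>, |du>, |dd>.
up_dn = np.array([0, 1, 0, 0], dtype=complex)
dn_up = np.array([0, 0, 1, 0], dtype=complex)
singlet = (up_dn - dn_up) / np.sqrt(2)
triplet0 = (up_dn + dn_up) / np.sqrt(2)

print(np.round(np.linalg.eigvalsh(S2_tot), 12))   # eigenvalues of (S^tot)^2
print(np.real(np.diag(Sz_tot)))                   # S_z^tot is diagonal in this basis
```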
degrees of freedom together. For a spin-dependent quantum particle in the real space,
the state space is
\[
\mathcal{H} = L^2(\mathbb{R}^3; \mathbb{C}^2) := \left\{ \psi(\mathbf{r}, \sigma) \;\middle|\; \sum_{\sigma \in \{\uparrow, \downarrow\}} \int_{\mathbb{R}^3} |\psi(\mathbf{r}, \sigma)|^2 \, d\mathbf{r} < \infty \right\}. \tag{1.6.1}
\]
We often use x = (r, σ) to denote collectively the spatial and spin variables. We also
introduce the notation
\[
\int d\mathbf{x} := \sum_{\sigma \in \{\uparrow, \downarrow\}} \int_{\mathbb{R}^3} d\mathbf{r}.
\]
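Let us consider a system with two quantum particles. Each state vector |Ψ⟩ is in the
tensor product space $\mathcal{H} \otimes \mathcal{H}$, where $\mathcal{H} = L^2(\mathbb{R}^3; \mathbb{C}^2)$. The wavefunction is $\Psi(\mathbf{x}_1, \mathbf{x}_2) =
\langle \mathbf{x}_1, \mathbf{x}_2|\Psi\rangle$, where $\mathbf{x}_i = (\mathbf{r}_i, \sigma_i)$ represents the collective spatial and spin variables of
the ith particle. Again with some abuse of notation, we may not distinguish the state
vector |Ψ⟩ and its associated wavefunction $\Psi(\mathbf{x}_1, \mathbf{x}_2)$. If we interchange the indices for
the two particles, then the wavefunction becomes $\Psi(\mathbf{x}_2, \mathbf{x}_1)$. This can be represented
using a permutation operator $P_{12}$, defined as
\[
(P_{12}\Psi)(\mathbf{x}_1, \mathbf{x}_2) = \Psi(\mathbf{x}_2, \mathbf{x}_1).
\]
Applying the permutation twice recovers the original wavefunction, i.e., $P_{12}^2 = I$.
Hence the eigenvalues of the permutation operator $P_{12}$ must be ±1. Furthermore, if the
Hamiltonian H commutes with $P_{12}$, then one can find |Ψ⟩ so that it is simultaneously
an eigenstate of H and $P_{12}$, i.e.,
\[
H\Psi = E\Psi, \quad P_{12}\Psi = \pm\Psi.
\]
If the sign is +, then |Ψ⟩ is a symmetric function
\[
\Psi(\mathbf{x}_1, \mathbf{x}_2) = \Psi(\mathbf{x}_2, \mathbf{x}_1).
\]
This is called a bosonic state. If the sign is −, then |Ψ⟩ is an anti-symmetric function
\[
\Psi(\mathbf{x}_1, \mathbf{x}_2) = -\Psi(\mathbf{x}_2, \mathbf{x}_1).
\]
This is called a fermionic state.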
Although the Hamiltonian does not explicitly involve spin operators, the wavefunction
involves both spatial and spin degrees of freedom as
\[
\Psi(\mathbf{x}_1, \mathbf{x}_2) \equiv \Psi((\mathbf{r}_1, \sigma_1), (\mathbf{r}_2, \sigma_2)).
\]
The wavefunction |Ψ⟩ for electrons is always a fermionic state and is in the space
$\mathcal{A}_2 = \bigwedge^2 L^2(\mathbb{R}^3; \mathbb{C}^2)$, which consists of all anti-symmetric functions in the tensor product
space $L^2(\mathbb{R}^3; \mathbb{C}^2) \otimes L^2(\mathbb{R}^3; \mathbb{C}^2)$.
Since the Hamiltonian does not explicitly involve the spin degrees of freedom, we
have
\[
\left[(S^{\mathrm{tot}})^2, H\right] = 0, \quad \left[S_z^{\mathrm{tot}}, H\right] = 0. \tag{1.6.4}
\]
Then |Ψ⟩ can be chosen to be simultaneously an eigenstate of $(S^{\mathrm{tot}})^2$, $S_z^{\mathrm{tot}}$, and H, and we can
find the ground state wavefunction Ψ by separating the spatial and spin degrees of freedom as
\[
\Psi(\mathbf{x}_1, \mathbf{x}_2) = \varphi(\mathbf{r}_1, \mathbf{r}_2)\chi(\sigma_1, \sigma_2). \tag{1.6.5}
\]
According to the discussion in section 1.5, χ must be either the anti-symmetric function
(spin singlet) or the symmetric function (spin triplet). Since the overall wavefunction
must be anti-symmetric, if χ is a spin singlet, then the spatial wavefunction ϕ must be
symmetric (i.e., of the bosonic form) and vice versa.
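More specifically, if χ is a spin singlet, i.e.,
\[
\chi(\sigma_1, \sigma_2) = \chi_S(\sigma_1, \sigma_2) := \frac{1}{\sqrt{2}}\left(\langle\sigma_1\sigma_2|{\uparrow\downarrow}\rangle - \langle\sigma_1\sigma_2|{\downarrow\uparrow}\rangle\right),
\]
then the simplest symmetric wavefunction for the spatial degrees of freedom takes the
factorized form
\[
\varphi(\mathbf{r}_1, \mathbf{r}_2) = \phi(\mathbf{r}_1)\phi(\mathbf{r}_2). \tag{1.6.6}
\]
The normalization condition for $\Psi(\mathbf{x}_1, \mathbf{x}_2)$ requires φ to be normalized as
\[
\int |\phi(\mathbf{r})|^2 \, d\mathbf{r} = 1.
\]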
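If χ is a spin triplet, e.g., $\chi(\sigma_1, \sigma_2) = \langle\sigma_1\sigma_2|{\uparrow\uparrow}\rangle$, then the spatial part $\varphi(\mathbf{r}_1, \mathbf{r}_2)$
should be anti-symmetric. Note that anti-symmetrizing the function of the form (1.6.6)
would simply give a zero function. Hence the simplest anti-symmetric function requires
two orthonormal functions $\phi_1(\mathbf{r})$, $\phi_2(\mathbf{r})$ and
\[
\varphi(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}}\left(\phi_1(\mathbf{r}_1)\phi_2(\mathbf{r}_2) - \phi_1(\mathbf{r}_2)\phi_2(\mathbf{r}_1)\right) = \frac{1}{\sqrt{2}}\begin{vmatrix} \phi_1(\mathbf{r}_1) & \phi_1(\mathbf{r}_2) \\ \phi_2(\mathbf{r}_1) & \phi_2(\mathbf{r}_2) \end{vmatrix}. \tag{1.6.7}
\]
It can be readily verified that $\varphi(\mathbf{r}_1, \mathbf{r}_2)$ is a normalized, anti-symmetric function. The
determinant on the right-hand side of (1.6.7) is called a Slater determinant.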
The above representation explicitly separates the spatial degrees of freedom and the
spin degrees of freedom, and the single particle functions $\phi, \phi_1, \phi_2$ are called (single
particle) spatial orbitals. It is also useful to consider these two sets of degrees of freedom
together. For example, define the functions
\[
\psi_1(\mathbf{x}) = \phi(\mathbf{r})\langle\sigma|{\uparrow}\rangle, \quad \psi_2(\mathbf{x}) = \phi(\mathbf{r})\langle\sigma|{\downarrow}\rangle,
\]
which are called (single particle) spin orbitals. The orthonormality condition
\[
\int \psi_i^*(\mathbf{x})\psi_j(\mathbf{x}) \, d\mathbf{x} = \delta_{ij}, \quad i, j = 1, 2,
\]
is naturally satisfied. Then one can readily verify that the spin singlet state can be
written as a Slater determinant
\[
\Psi(\mathbf{x}_1, \mathbf{x}_2) = \frac{1}{\sqrt{2}}\begin{vmatrix} \psi_1(\mathbf{x}_1) & \psi_1(\mathbf{x}_2) \\ \psi_2(\mathbf{x}_1) & \psi_2(\mathbf{x}_2) \end{vmatrix}. \tag{1.6.8}
\]
then the spin triplet wavefunction can again be written in the form (1.6.8). Hence the
Slater determinant provides a unified representation using spin orbitals.
The discussion above is a special case of the Hartree–Fock ansatz, which will be
discussed in detail in the next chapter. We emphasize that the Hartree–Fock ansatz
is only an approximate theory for solving systems with more than one electron, since
the many-body wavefunction can be the linear combination of more than one Slater
determinant.
In order to find the approximate ground state wavefunction, we need to compare
the energy ⟨Ψ|H|Ψ⟩ for |Ψ⟩ being the spin singlet and the triplet state, respectively.
Borrowing from the intuition of the $\mathrm{H}_2^+$ molecule, we expect the spin singlet state with
a symmetric spatial wavefunction profile to have a lower energy. Numerical results
obtained from the Hartree–Fock calculation indicate that this is indeed the case. Hence
even when the Hamiltonian does not explicitly involve the spin degree of freedom, spin
still plays an important role in quantum chemistry by influencing the symmetry of the
spatial component of the wavefunction, which leads to measurable effects.
Furthermore, for a given type of elementary particle, the choice of the + or − sign is
independent of i, j. In particular, if the sign is +, the particle is called a boson. The
bosonic wavefunctions are (totally) symmetric:
\[
\mathcal{S}_N := \mathrm{Sym}^N L^2(\mathbb{R}^3; \mathbb{C}^2) \subset \bigotimes^N L^2(\mathbb{R}^3; \mathbb{C}^2). \tag{1.6.12}
\]
If the sign is −, the particle is called a fermion. The fermionic wavefunctions are
(totally) anti-symmetric:
Note that in classical physics, one can always assign different labels to different
particles, for example by using their positions. Once the labels are assigned, one can
“name” a particle and distinguish one from the other. In quantum physics, this is not
possible. The symmetric property for the square of the magnitude $|\langle \mathbf{x}_1, \ldots, \mathbf{x}_N|P_{ij}|\Psi\rangle|^2$
implies that the probability of finding any permutation of particle positions in a configuration
is exactly the same. In this sense, elementary quantum particles are identical. We
stress that the fact that quantum particles are identical is yet another intrinsic property
of quantum particles which distinguishes them from their classical counterparts.
Whether a particle is a boson or a fermion is determined by its spin. If the spin is a
half integer (e.g., electron and proton), the particle is a fermion. If the spin is an integer
(e.g., photon), the particle is a boson. This statement is beyond quantum mechanics in
the Schrödinger picture, and must be explained using the theory of relativity.
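The method of generating bosonic and fermionic wavefunctions can be generalized
to systems with N particles. For instance, starting from N normalized (not necessarily
orthogonal) single particle spin orbitals $\{\psi_i\}_{i=1}^N$, the simplest wavefunction in the
absence of any symmetry constraint takes the factorized form
\[
\Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N) = \psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2)\cdots\psi_N(\mathbf{x}_N).
\]
In order to satisfy the symmetry constraint, one can construct the bosonic wavefunction
by symmetrizing over all possible permutations from the permutation group Sym(N):
\[
\Psi_B(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N) = C_B \sum_{\pi \in \mathrm{Sym}(N)} \psi_{\pi(1)}(\mathbf{x}_1)\psi_{\pi(2)}(\mathbf{x}_2)\cdots\psi_{\pi(N)}(\mathbf{x}_N), \tag{1.6.16}
\]
where $C_B$ is a normalization constant. Similarly, the fermionic wavefunction is obtained by
anti-symmetrizing over all permutations,
\[
\Psi_F(\mathbf{x}_1, \ldots, \mathbf{x}_N) = C_F \sum_{\pi \in \mathrm{Sym}(N)} (-1)^{\pi}\, \psi_{\pi(1)}(\mathbf{x}_1)\cdots\psi_{\pi(N)}(\mathbf{x}_N), \tag{1.6.17}
\]
which can be written as a Slater determinant:
\[
\Psi_F(\mathbf{x}_1, \ldots, \mathbf{x}_N) = C_F \begin{vmatrix} \psi_1(\mathbf{x}_1) & \cdots & \psi_1(\mathbf{x}_N) \\ \vdots & \ddots & \vdots \\ \psi_N(\mathbf{x}_1) & \cdots & \psi_N(\mathbf{x}_N) \end{vmatrix}. \tag{1.6.18}
\]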
One immediate consequence of the form of a determinant is that if two spin orbitals
$\psi_i(\mathbf{x})$ and $\psi_j(\mathbf{x})$ are the same, then the determinant vanishes. Hence, without loss of
generality, we assume the spin orbitals are orthonormal, i.e.,
\[
\int \psi_i^*(\mathbf{x})\psi_j(\mathbf{x}) \, d\mathbf{x} = \delta_{ij}, \quad i, j = 1, \ldots, N.
\]
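The properties of the determinant form can be explored numerically. The sketch below is an illustration under simplifying assumptions: the "orbitals" are hypothetical real functions of a single scalar variable (no spin, and not orthonormal), so the prefactor 1/√(N!) used here does not normalize the wavefunction; only the anti-symmetry, the vanishing at coincident coordinates, and the agreement between the determinant and the signed sum over permutations are checked.

```python
import itertools
import math
import numpy as np

def orbitals(x, n):
    """Hypothetical orbitals psi_i(x) = x**i * exp(-x**2 / 2), i = 0..n-1."""
    return np.array([x**i * np.exp(-x**2 / 2) for i in range(n)])

def orbital_matrix(xs):
    """Matrix M[i, j] = psi_i(x_j)."""
    n = len(xs)
    return np.array([orbitals(x, n) for x in xs]).T

def slater(xs):
    """Determinant form of the fermionic wavefunction."""
    return np.linalg.det(orbital_matrix(xs)) / math.sqrt(math.factorial(len(xs)))

def perm_sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign

def antisymmetrized_sum(xs):
    """Signed sum over all permutations of products psi_{pi(k)}(x_k)."""
    n = len(xs)
    M = orbital_matrix(xs)
    total = 0.0
    for p in itertools.permutations(range(n)):
        total += perm_sign(p) * np.prod([M[p[k], k] for k in range(n)])
    return total / math.sqrt(math.factorial(n))

xs = [0.3, -1.1, 0.8]
print(slater(xs), antisymmetrized_sum(xs))          # the two forms agree
print(slater([-1.1, 0.3, 0.8]))                     # swapping two particles flips the sign
print(slater([0.3, 0.3, 0.8]))                      # coincident coordinates give zero
```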
Exercises
1. Prove that the spin-1/2 operator satisfies the identity
\[
\hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2 = \frac{3}{4} I_2. \tag{1.6.19}
\]
Show that $[\hat{S}^2, \hat{S}_\alpha] = 0$ for α = x, y, z.
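2. Given an operator $\hat{A}(t)$ depending explicitly on time and a time-dependent state
vector |ψ(t)⟩, define the expectation value as $\langle\hat{A}(t)\rangle := \langle\psi(t)|\hat{A}(t)|\psi(t)\rangle$. Prove
that the evolution of the expectation value satisfies
\[
i\frac{d}{dt}\langle\hat{A}(t)\rangle = i\langle\partial_t \hat{A}(t)\rangle + \langle[\hat{A}, \hat{H}](t)\rangle.
\]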
3. From the example of the spin precession, solve the Schrödinger equation directly
to evaluate the expectation values $\langle\hat{S}_x(t)\rangle$, $\langle\hat{S}_y(t)\rangle$, $\langle\hat{S}_z(t)\rangle$.
4. From the example of the spin precession, let us add an external potential $\hat{V}(t) =
\gamma\cos\omega t\,(|{\uparrow}\rangle\langle{\downarrow}| + |{\downarrow}\rangle\langle{\uparrow}|)$. The corresponding Hamiltonian operator in the matrix
form is
\[
\hat{H}(t) = \hat{\mathbf{S}} \cdot \mathbf{B} + \hat{V}(t) = \begin{pmatrix} \frac{B}{2} & \gamma\cos\omega t \\ \gamma\cos\omega t & -\frac{B}{2} \end{pmatrix}.
\]
(a) Write down the Schrödinger equation for a state vector |ψ(t)⟩.
(b) Starting from |ψ(0)⟩ = |↓⟩, write a computer program using the fourth-order
Runge–Kutta method to propagate the Schrödinger equation with γ =
B = 1. Define ∆ = B − ω and perform the simulation at ∆ = 0; the resulting
oscillation is called the Rabi oscillation. Also try other values, e.g.,
∆ = 0.5, −0.5, −1.0, −2.0.
5. Let n be a unit vector in $\mathbb{R}^3$ (i.e., ‖n‖ = 1) and let θ ∈ ℝ. Prove the following matrix
identity:
6. Verify that the position operator $\hat{x}$ is symmetric with respect to the inner product
on $L^2(\mathbb{R})$: for ϕ(x), ψ(x) satisfying (1.2.9),
\[
\langle\phi|\hat{x}\psi\rangle = \langle\hat{x}\phi|\psi\rangle.
\]
7. Verify that the momentum operator $\hat{p}$ is symmetric with respect to the inner product
on $L^2(\mathbb{R})$: for $\phi(x), \psi(x) \in L^2(\mathbb{R})$ satisfying $\phi'(x), \psi'(x) \in L^2(\mathbb{R})$, use
integration by parts and prove that
\[
\langle\phi|\hat{p}\psi\rangle = \langle\hat{p}\phi|\psi\rangle.
\]
8. Derive the function in L2 (R) that achieves minimal uncertainty according to the
Heisenberg uncertainty principle.
9. Verify the relation (1.2.22) and (1.2.23) for the angular momentum operator.
10. Using the definition (1.2.20), check the formula (1.2.25) in the spherical coordi-
nate.
11. Verify that Hk in (1.4.17) is a self-adjoint operator on the unit cell Ω.
12. Verify (1.5.5), (1.5.6), and (1.5.7).
13. Verify that the wavefunction described by the Slater determinant (1.6.18) is in-
deed the same as that in (1.6.17).
Chapter 2
Density functional theory: Formulation and algorithms
In a quantum many-body system, the ground state is often the most important state.
This is because the energy gap $E_1 - E_0$ for many quantum systems is on the order of
electron volts (eV), or $10^4$ Kelvin measured in terms of $k_B T$, where $k_B$ is the Boltzmann
constant. This is much higher than room temperature (300 Kelvin). According to
the Boltzmann distribution, the probability for the state $E_i$ to be occupied is $e^{-\beta E_i}/Z$,
where β is the inverse temperature and Z is a normalization factor. Hence the ground
state is often the dominating state. Even for metallic systems where $E_1 - E_0$ is small
or zero, the ground state can still be very important.
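The orders of magnitude quoted above are easy to reproduce. The short sketch below converts an energy gap in eV to a temperature and evaluates the Boltzmann ratio $e^{-\beta(E_1 - E_0)}$ between the first excited state and the ground state; the function names are illustrative, and the 1 eV gap is just a representative value.

```python
import math

K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV per Kelvin

def gap_in_kelvin(gap_ev):
    """Temperature T at which k_B * T equals the given energy gap."""
    return gap_ev / K_B_EV_PER_K

def excited_to_ground_ratio(gap_ev, temperature_k):
    """Ratio of Boltzmann weights exp(-beta * (E1 - E0))."""
    return math.exp(-gap_ev / (K_B_EV_PER_K * temperature_k))

temperature = gap_in_kelvin(1.0)                 # about 1.16e4 K for a 1 eV gap
ratio = excited_to_ground_ratio(1.0, 300.0)      # of order 1e-17 at room temperature
print(temperature, ratio)
```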
For a system containing both nuclei and electrons, in principle all particles are quan-
tum particles and should be characterized by quantum mechanics. Since the mass of the
lightest element in the periodic table (hydrogen) is around 2000 times larger than that of
the electron, the commonly used Born–Oppenheimer approximation assumes that the
nuclei can be described by classical mechanics. This is often a very good approxima-
tion.
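The many-body Hamiltonian with M nuclei and N electrons is given by
\[
H = \sum_{i=1}^{N} -\frac{1}{2}\Delta_{\mathbf{r}_i} + \sum_{i=1}^{N} V_{\mathrm{ext}}(\mathbf{r}_i; \{\mathbf{R}_I\}) + \sum_{i<j}^{N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{I<J}^{M} \frac{Z_I Z_J}{|\mathbf{R}_I - \mathbf{R}_J|}, \tag{2.0.1}
\]
where the first three terms of the Hamiltonian are kinetic, electron-ion, and electron-electron
interactions, respectively:
\[
T = \sum_{i=1}^{N} -\frac{1}{2}\Delta_{\mathbf{r}_i}, \quad V_{en} = \sum_{i=1}^{N} V_{\mathrm{ext}}(\mathbf{r}_i; \{\mathbf{R}_I\}), \quad V_{ee} = \sum_{i<j}^{N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|}.
\]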
In principle the nuclei are also quantum particles and can be described by the Schrödinger
equation as well. However, since the mass of a nucleus is much larger than
that of an electron, the nuclei are often treated as classical particles. This is called the
Born–Oppenheimer (BO) approximation. Under the BO approximation, the nuclei have
a set of fixed positions $\{\mathbf{R}_I\}_{I=1}^{M}$, which is called the atomic configuration. The ion-ion
interaction energy is
\[
E_{II} = \sum_{I<J}^{M} \frac{Z_I Z_J}{|\mathbf{R}_I - \mathbf{R}_J|}.
\]
The many-body ground state wavefunction Ψ(x1 , . . . , xN ) is associated with the small-
est eigenvalue E of the linear eigenvalue problem:
HΨ = EΨ. (2.0.2)
Since the Hartree–Fock theory restricts the variational problem to a smaller class of
functions, it is immediate that
\[
E^{\mathrm{HF}} \geq E. \tag{2.1.3}
\]
The error of the Hartree–Fock approximation is called the correlation energy:
\[
E_c = E - E^{\mathrm{HF}}, \tag{2.1.4}
\]
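\[
\Big\langle \Psi \Big| \sum_{i=1}^{N}\Big(-\frac{1}{2}\Delta_{\mathbf{r}_i} + V_{\mathrm{ext}}(\mathbf{r}_i)\Big) \Big| \Psi \Big\rangle = N \Big\langle \Psi \Big| -\frac{1}{2}\Delta_{\mathbf{r}_1} + V_{\mathrm{ext}}(\mathbf{r}_1) \Big| \Psi \Big\rangle, \tag{2.1.5}
\]
\[
\Big\langle \Psi \Big| \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \Big| \Psi \Big\rangle = \binom{N}{2} \Big\langle \Psi \Big| \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \Big| \Psi \Big\rangle. \tag{2.1.6}
\]
Let us calculate first the one-body term for Ψ given as a Slater determinant. Expanding
the determinant using the permutation operation π, we have
\[
\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N) = \frac{1}{\sqrt{N!}} \sum_{\pi \in \mathrm{Sym}(N)} (-1)^{\pi}\, \psi_{\pi(1)}(\mathbf{x}_1) \cdots \psi_{\pi(N)}(\mathbf{x}_N). \tag{2.1.7}
\]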
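Then
\[
\begin{aligned}
\Big\langle \Psi \Big| -\frac{1}{2}\Delta_{\mathbf{r}_1} + V_{\mathrm{ext}}(\mathbf{r}_1) \Big| \Psi \Big\rangle
&= \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi}(-1)^{\pi'} \Big\langle \psi_{\pi(1)}(\mathbf{x}_1) \Big| -\frac{1}{2}\Delta_{\mathbf{r}_1} + V_{\mathrm{ext}}(\mathbf{r}_1) \Big| \psi_{\pi'(1)}(\mathbf{x}_1) \Big\rangle_{\mathbf{x}_1} \\
&\qquad \times \prod_{k=2}^{N} \big\langle \psi_{\pi(k)}(\mathbf{x}_k) \big| \psi_{\pi'(k)}(\mathbf{x}_k) \big\rangle_{\mathbf{x}_k} \\
&= \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi}(-1)^{\pi'} \Big\langle \psi_{\pi(1)}(\mathbf{x}_1) \Big| -\frac{1}{2}\Delta_{\mathbf{r}_1} + V_{\mathrm{ext}}(\mathbf{r}_1) \Big| \psi_{\pi'(1)}(\mathbf{x}_1) \Big\rangle_{\mathbf{x}_1} \prod_{k=2}^{N} \delta_{\pi(k)\pi'(k)},
\end{aligned}
\]
where the last equality uses the orthonormality of the orbitals and we add the subscript
$\mathbf{x}_1$, $\mathbf{x}_k$ to the inner product to emphasize the variable to be integrated. The contribution
is zero unless π(k) = π′(k) for k = 2, . . . , N; thus the two permutations coincide.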
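Therefore,
\[
\begin{aligned}
\Big\langle \Psi \Big| -\frac{1}{2}\Delta_{\mathbf{r}_1} + V_{\mathrm{ext}}(\mathbf{r}_1) \Big| \Psi \Big\rangle
&= \frac{1}{N!} \sum_{\pi} \Big\langle \psi_{\pi(1)} \Big| -\frac{1}{2}\Delta_{\mathbf{r}} + V_{\mathrm{ext}}(\mathbf{r}) \Big| \psi_{\pi(1)} \Big\rangle \\
&= \frac{1}{N!} \sum_{i=1}^{N} \sum_{\pi} \Big\langle \psi_i \Big| -\frac{1}{2}\Delta_{\mathbf{r}} + V_{\mathrm{ext}}(\mathbf{r}) \Big| \psi_i \Big\rangle \delta_{i\pi(1)} \\
&= \frac{1}{N} \sum_{i=1}^{N} \Big\langle \psi_i \Big| -\frac{1}{2}\Delta + V_{\mathrm{ext}} \Big| \psi_i \Big\rangle,
\end{aligned}
\]
where the second equality introduces the dummy summation index i for π(1), and the third
uses that exactly (N − 1)! permutations satisfy π(1) = i. Combined with (2.1.5), we have
\[
\begin{aligned}
\Big\langle \Psi \Big| \sum_{i=1}^{N}\Big(-\frac{1}{2}\Delta_{\mathbf{r}_i} + V_{\mathrm{ext}}(\mathbf{r}_i)\Big) \Big| \Psi \Big\rangle
&= \sum_{i=1}^{N} \Big\langle \psi_i \Big| -\frac{1}{2}\Delta + V_{\mathrm{ext}} \Big| \psi_i \Big\rangle \\
&= \sum_{i=1}^{N} \int \Big( \frac{1}{2}|\nabla_{\mathbf{r}} \psi_i(\mathbf{x})|^2 + V_{\mathrm{ext}}(\mathbf{r})|\psi_i(\mathbf{x})|^2 \Big) \, d\mathbf{x}.
\end{aligned} \tag{2.1.8}
\]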
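\[
\begin{aligned}
\Big\langle \Psi \Big| \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \Big| \Psi \Big\rangle
&= \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi}(-1)^{\pi'} \Big\langle \psi_{\pi(1)}(\mathbf{x}_1)\psi_{\pi(2)}(\mathbf{x}_2) \Big| \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \Big| \psi_{\pi'(1)}(\mathbf{x}_1)\psi_{\pi'(2)}(\mathbf{x}_2) \Big\rangle_{\mathbf{x}_1, \mathbf{x}_2} \\
&\qquad \times \prod_{k=3}^{N} \big\langle \psi_{\pi(k)}(\mathbf{x}_k) \big| \psi_{\pi'(k)}(\mathbf{x}_k) \big\rangle_{\mathbf{x}_k} \\
&= \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi}(-1)^{\pi'} \Big\langle \psi_{\pi(1)}(\mathbf{x})\psi_{\pi(2)}(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_{\pi'(1)}(\mathbf{x})\psi_{\pi'(2)}(\mathbf{x}') \Big\rangle_{\mathbf{x}, \mathbf{x}'} \\
&\qquad \times \prod_{k=3}^{N} \delta_{\pi(k)\pi'(k)}.
\end{aligned} \tag{2.1.9}
\]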
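Therefore, the two permutations are the same except potentially for the first two indices,
and we have two possibilities: either
\[
\pi(1) = \pi'(1) = i, \quad \pi(2) = \pi'(2) = j
\]
or
\[
\pi(1) = \pi'(2) = i, \quad \pi(2) = \pi'(1) = j,
\]
where i, j ∈ {1, . . . , N} are two different indices. In the former case, the two permutations
are the same, and hence $(-1)^{\pi}(-1)^{\pi'} = 1$, whereas in the latter case we have
$(-1)^{\pi}(-1)^{\pi'} = -1$. Thus, we get
\[
\begin{aligned}
\Big\langle \Psi \Big| \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \Big| \Psi \Big\rangle = \frac{1}{N(N-1)} \sum_{i \neq j}^{N} \bigg( &\Big\langle \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big\rangle \\
&- \Big\langle \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_j(\mathbf{x})\psi_i(\mathbf{x}') \Big\rangle \bigg).
\end{aligned} \tag{2.1.10}
\]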
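Therefore,
\[
\begin{aligned}
\Big\langle \Psi \Big| \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \Big| \Psi \Big\rangle
&= \frac{1}{2} \sum_{i \neq j}^{N} \bigg( \Big\langle \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big\rangle - \Big\langle \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_j(\mathbf{x})\psi_i(\mathbf{x}') \Big\rangle \bigg) \\
&= \frac{1}{2} \sum_{i, j}^{N} \bigg( \Big\langle \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big\rangle - \Big\langle \psi_i(\mathbf{x})\psi_j(\mathbf{x}') \Big| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big| \psi_j(\mathbf{x})\psi_i(\mathbf{x}') \Big\rangle \bigg).
\end{aligned} \tag{2.1.11}
\]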
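For the second identity, we use the observation that the two terms in the parentheses are
the same, and hence cancel, if i = j. Writing the integral more explicitly, we get
\[
\Big\langle \Psi \Big| \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \Big| \Psi \Big\rangle
= \frac{1}{2} \sum_{i, j} \bigg( \iint \frac{|\psi_i(\mathbf{x})|^2 |\psi_j(\mathbf{x}')|^2}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{x} \, d\mathbf{x}' - \iint \frac{\psi_i^*(\mathbf{x})\psi_j^*(\mathbf{x}')\psi_j(\mathbf{x})\psi_i(\mathbf{x}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{x} \, d\mathbf{x}' \bigg). \tag{2.1.12}
\]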
In summary, the Hartree–Fock energy functional for a given set of orthonormal spin
orbitals $\{\psi_i\}_{i=1}^{N}$ is given by
\[
\begin{aligned}
E^{\mathrm{HF}}(\{\psi_i\}_{i=1}^{N}) = {} & \sum_{i=1}^{N} \int \Big( \frac{1}{2}|\nabla_{\mathbf{r}} \psi_i|^2 + V_{\mathrm{ext}}|\psi_i|^2 \Big) \, d\mathbf{x} + \frac{1}{2} \sum_{i, j} \iint \frac{|\psi_i(\mathbf{x})|^2 |\psi_j(\mathbf{x}')|^2}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{x} \, d\mathbf{x}' \\
& - \frac{1}{2} \sum_{i, j} \iint \frac{\psi_i^*(\mathbf{x})\psi_j^*(\mathbf{x}')\psi_j(\mathbf{x})\psi_i(\mathbf{x}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{x} \, d\mathbf{x}' + E_{II}.
\end{aligned} \tag{2.1.13}
\]
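To simplify the expression, we define the spin-dependent single particle electron density
for the ground state Ψ as
\[
\varrho(\mathbf{x}) = N \int |\Psi(\mathbf{x}, \mathbf{x}_2, \ldots, \mathbf{x}_N)|^2 \, d\mathbf{x}_2 \cdots d\mathbf{x}_N \tag{2.1.14}
\]
and the (total) single particle electron density as
\[
\rho(\mathbf{r}) = \sum_{\sigma \in \{\uparrow, \downarrow\}} \varrho(\mathbf{r}, \sigma). \tag{2.1.15}
\]
When the context is clear, the (total) single particle electron density is also referred to
as the electron density or simply the density. The electron density is only a quantity in
$L^1(\mathbb{R}^3)$, and hence carries significantly less information than Ψ. For $\Psi \in \mathcal{A}_N^0$, we find
that the spin-dependent electron density is
\[
\varrho(\mathbf{x}) = \sum_{i=1}^{N} |\psi_i(\mathbf{x})|^2. \tag{2.1.16}
\]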
which is a projection operator since the {ψi }’s are orthonormal. In particular, the spin-
dependent electron density is simply the diagonal elements of the spin-dependent den-
sity matrix, i.e., %(x) = P (x, x). We will discuss the electron density and the density
matrix further in a later part of this chapter, as they are essential for electronic structure
theory and calculations.
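Using (2.1.16) and (2.1.18), the Hartree–Fock energy functional can be written as
\[
\begin{aligned}
E^{\mathrm{HF}}(\{\psi_i\}_{i=1}^{N}) = {} & \sum_{i=1}^{N} \int \Big( \frac{1}{2}|\nabla_{\mathbf{r}} \psi_i|^2 + V_{\mathrm{ext}}|\psi_i|^2 \Big) \, d\mathbf{x} + \frac{1}{2} \iint \frac{\rho(\mathbf{r})\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r} \, d\mathbf{r}' \\
& - \frac{1}{2} \iint \frac{|P(\mathbf{x}, \mathbf{x}')|^2}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{x} \, d\mathbf{x}' + E_{II}.
\end{aligned} \tag{2.1.19}
\]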
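The structure of the one-body, Hartree, and exchange terms can be checked on a small discrete toy model. The sketch below is not a Hartree–Fock calculation: the continuous variable x is replaced by d grid points, the operators `h` and `v` are random symmetric matrices standing in for $-\frac{1}{2}\Delta + V_{\mathrm{ext}}$ and $1/|\mathbf{r} - \mathbf{r}'|$, the orbitals are real, and N = 2. It compares the energy evaluated directly from the two-particle Slater determinant with the orbital-based expressions, using the density matrix $P(x, x') = \sum_i \psi_i(x)\psi_i(x')$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6                                        # number of discrete "positions" (toy model)

# Two real orthonormal orbitals: psi[x, i] = psi_i(x).
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
psi = Q[:, :2]

h = rng.standard_normal((d, d)); h = 0.5 * (h + h.T)   # stand-in one-body operator
v = rng.random((d, d)); v = 0.5 * (v + v.T)            # stand-in symmetric pair interaction

# Two-particle Slater determinant Psi[x1, x2].
Psi = (np.outer(psi[:, 0], psi[:, 1]) - np.outer(psi[:, 1], psi[:, 0])) / np.sqrt(2)

# Direct evaluation of <Psi| h(1) + h(2) |Psi> and <Psi| v(1,2) |Psi>.
one_body_direct = (np.einsum("ab,ac,cb->", Psi, h, Psi)
                   + np.einsum("ab,bc,ac->", Psi, h, Psi))
two_body_direct = np.sum(Psi**2 * v)

# Orbital-based expressions, cf. (2.1.8) and the Hartree/exchange terms of (2.1.19).
one_body_orb = np.einsum("xi,xy,yi->", psi, h, psi)
P = psi @ psi.T                              # density matrix
rho = np.diag(P)                             # density = diagonal of P
hartree = 0.5 * rho @ v @ rho
exchange = 0.5 * np.sum(P**2 * v)
two_body_orb = hartree - exchange

print(one_body_direct, one_body_orb)
print(two_body_direct, two_body_orb)
print(np.allclose(P @ P, P), np.trace(P))    # P is a projector with trace N = 2
```

The self-interaction contributions (the i = j terms, and the diagonal entries of `v`) appear in both the Hartree and exchange terms and cancel, which is why the unrestricted double sum over i, j gives the same result as the direct evaluation.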
E HF = E HF ({ψi }N
The formulation with the functional (2.1.19) is called the general Hartree–Fock theory
(GHF) and the associated energy E HF is commonly written as E GHF .
where χi is preassigned to be either the spin-up state |↑i or the spin-down state |↓i.
For systems with an even number of electrons (N = 2Nocc ), the restricted Hartree–
Fock theory (RHF) further assumes that each spatial component contributes to two spin
orbitals associated with the spin-up state and the spin-down state as
thanks to the orthogonality of |↑i and |↓i. In particular, the exchange term includes only
orbitals with the same spin.
In comparison with (2.1.23), the density matrix $P((\mathbf{r}, \sigma), (\mathbf{r}', \sigma))$ no longer depends
explicitly on σ and hence can be simplified as
\[
P(\mathbf{r}, \mathbf{r}') = \sum_{i=1}^{N_{\mathrm{occ}}} \varphi_i(\mathbf{r})\varphi_i^*(\mathbf{r}'). \tag{2.1.28}
\]
Note that the number of spatial orbitals is reduced from N to N/2 = Nocc due to the
symmetry restriction (Nocc stands for the number of occupied orbitals). Correspond-
ingly, a factor of 2 from spin needs to be properly treated in the electron density and the
exchange energy term.
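The factor of 2 can be made concrete on a discretized grid. The following is a minimal sketch, assuming orthonormal spatial orbitals stored as the columns of a NumPy array with unit grid weights; the function name `rhf_density` and the random test orbitals are illustrative only.

```python
import numpy as np

def rhf_density(phi):
    """phi: (n_grid, n_occ) array whose columns are orthonormal spatial orbitals."""
    # Spatial density matrix P(r, r') = sum_i phi_i(r) phi_i^*(r'), cf. (2.1.28).
    P = phi @ phi.conj().T
    # Each spatial orbital carries one spin-up and one spin-down electron,
    # hence the factor of 2 in the electron density.
    rho = 2.0 * np.real(np.diag(P))
    return P, rho

# Hypothetical test orbitals: orthonormalize random vectors with a QR factorization.
rng = np.random.default_rng(0)
n_grid, n_occ = 12, 3
phi, _ = np.linalg.qr(rng.standard_normal((n_grid, n_occ)))
P, rho = rhf_density(phi)
```

With unit grid weights, `rho.sum()` equals $N = 2N_{\mathrm{occ}}$ and `P` is idempotent, which mirrors the projector property of the density matrix.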
Here, in the second equality, we split the single minimization with respect to
$\Psi$ into two nested minimizations. The outer minimization is with respect to
the density $\rho$, and the inner one is with respect to all $\Psi$ that give rise to the given
density $\rho$, a relation denoted by $\Psi \mapsto \rho$. One natural question is whether, for a given
density $\rho$, there is at least one $\Psi \in \mathcal{A}_N$ that gives rise to this density such that the
kinetic energy of $\Psi$ is finite, i.e.,
$$\langle \Psi | T | \Psi \rangle = \frac{1}{2} \sum_{i=1}^{N} \int |\nabla_{r_i} \Psi(x_1, \ldots, x_N)|^2 \, dx_1 \cdots dx_N < \infty.$$
It turns out that it is sufficient to consider the density $\rho$ in the following set:
$$\mathcal{J}_N = \left\{ \rho \ge 0 \;\middle|\; \int \rho(r) \, dr = N, \ \nabla \sqrt{\rho} \in L^2(\mathbb{R}^3) \right\}. \tag{2.2.2}$$
The main nontrivial condition in the definition of $\mathcal{J}_N$ is $\nabla \sqrt{\rho} \in L^2(\mathbb{R}^3)$. This can
be understood from the case of $N = 1$, where the ground state wavefunction $\Psi(x)$
can be expressed without loss of generality as $\Psi(x) = \psi(r) \langle \sigma | {\uparrow} \rangle$ and $\psi$ is a nodeless
function, i.e., $\psi(r) \ge 0$ [52]. Then $\rho(r) = \psi(r)^2$ and
$$\nabla \sqrt{\rho(r)} = \nabla \psi(r).$$
Therefore the condition $\nabla \sqrt{\rho} \in L^2(\mathbb{R}^3)$ simply means that the kinetic energy is finite.
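In general, note that
$$\begin{aligned}
|\nabla \rho(r)|^2 &\le 4N^2 \left| \sum_{\sigma} \int \nabla_r \Psi^*((r, \sigma), x_2, \ldots, x_N) \, \Psi((r, \sigma), x_2, \ldots, x_N) \, dx_2 \cdots dx_N \right|^2 \\
&\le 4N \rho(r) \left( \sum_{\sigma} \int |\nabla_r \Psi((r, \sigma), x_2, \ldots, x_N)|^2 \, dx_2 \cdots dx_N \right),
\end{aligned} \tag{2.2.3}$$
where we have used the Cauchy–Schwarz inequality. Together with $\nabla \sqrt{\rho} = \nabla \rho / (2 \sqrt{\rho})$,
we have
$$\begin{aligned}
\frac{1}{2} \int |\nabla \sqrt{\rho(r)}|^2 \, dr &\le \frac{N}{2} \sum_{\sigma} \int |\nabla_r \Psi((r, \sigma), x_2, \ldots, x_N)|^2 \, dr \, dx_2 \cdots dx_N \\
&= \langle \Psi | T | \Psi \rangle.
\end{aligned} \tag{2.2.4}$$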
As a result, $\nabla \sqrt{\rho} \in L^2(\mathbb{R}^3)$ is a necessary condition for the kinetic energy of $\Psi$ to be
finite. It can also be proved that $\rho \in \mathcal{J}_N$ implies that there exists at least one $\Psi \in \mathcal{A}_N$
with finite kinetic energy such that $\Psi \mapsto \rho$ [51].
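The $N = 1$ argument can be checked numerically. The sketch below is only an illustration under simplifying assumptions: it works in one spatial dimension, ignores spin, and uses a normalized Gaussian as a hypothetical nodeless orbital, for which $\frac{1}{2}\int |\psi'|^2 \, dx = \frac{1}{4}$.

```python
import numpy as np

# One-dimensional grid (an assumption; the text works in R^3).
x = np.linspace(-8.0, 8.0, 2001)
dx = x[1] - x[0]

# Nodeless, normalized orbital psi >= 0 and its density rho = psi^2.
psi = np.pi ** -0.25 * np.exp(-0.5 * x**2)
rho = psi**2

# Kinetic energy from the orbital: (1/2) * integral of |psi'|^2.
kinetic_from_psi = 0.5 * np.sum(np.gradient(psi, x) ** 2) * dx

# The same quantity from the density alone: (1/2) * integral of |(sqrt(rho))'|^2.
kinetic_from_rho = 0.5 * np.sum(np.gradient(np.sqrt(rho), x) ** 2) * dx
```

Both numbers agree with each other and with the analytic value $1/4$ up to discretization error, which illustrates why finiteness of $\int |\nabla \sqrt{\rho}|^2$ is the natural condition on the density.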
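Therefore the constrained minimization procedure is well defined and we have
$$E = \inf_{\rho \in \mathcal{J}_N} \left\{ \inf_{\Psi \in \mathcal{A}_N, \ \Psi \mapsto \rho} \langle \Psi | (T + V_{ee}) | \Psi \rangle + \int \rho(r) V_{\mathrm{ext}}(r) \, dr \right\} + E_{II} \tag{2.2.5}$$
$$\phantom{E} = \inf_{\rho \in \mathcal{J}_N} \left\{ F_{LL}[\rho] + \int \rho(r) V_{\mathrm{ext}}(r) \, dr \right\} + E_{II}. \tag{2.2.6}$$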
The functional $F_{LL}[\rho]$, called the Levy–Lieb energy functional, depends only on the
kinetic energy and the electron-electron repulsion and not on the external potential $V_{\mathrm{ext}}$. Recall that
the external potential $V_{\mathrm{ext}}$ is given by the atomic types and positions, and hence encodes
all the external inputs of the system. In this sense, the functional $F_{LL}[\rho]$ is universal.
Another important consequence of DFT is that if $\rho^*$ is the minimizer of (2.2.6), and the
minimizer $\Psi^*$ that attains $F_{LL}[\rho^*]$ is assumed to be unique, then the many-body
ground state wavefunction $\Psi^*$ is also determined by the electron density $\rho^*$. In this
sense, if the ground state is nondegenerate, then there is a one-to-one mapping between
the ground state electron density and the many-body ground state wavefunction.
Since the very early days of quantum mechanics, physicists have sought
approximations to the energy in terms of the electron density alone, an effort pioneered by
Thomas and Fermi. Until the 1960s, this effort had been mainly restricted to the uniform
electron gas, where many calculations can be done analytically. Despite significant
progress in the past few decades, modeling $F_{LL}[\rho]$ remains a very difficult task. To
appreciate the difficulty, just recall the atomic shell structure from the eigenfunctions
of the Hamiltonian of the hydrogen atom. It is already highly nontrivial to find such a
mapping for a single atom. Furthermore, in chemistry and materials science, the absolute
value of the energy is usually not the most important quantity. What determines
whether a chemical process will happen or not is its relative energy landscape. This
often requires the ground state energy to be calculated with an accuracy of 99.9% or
higher.
The breakthrough of DFT is generally attributed to Kohn and Sham in 1965, who
proposed combining DFT with the orbital structure. Using the constrained minimization
over Slater determinants, the Kohn–Sham proposal can be interpreted as
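$$F_{LL}[\rho] = \inf_{\Psi \in \mathcal{A}_N^0, \ \Psi \mapsto \rho} \langle \Psi | T | \Psi \rangle + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' + E_{xc}[\rho]. \tag{2.2.8}$$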
It turns out that, for any $\rho \in \mathcal{J}_N$, there exists at least one $\Psi \in \mathcal{A}_N^0$ that gives the
density $\rho$, and the constrained minimization of the kinetic energy term is well defined.
Similarly to the calculation in the Hartree–Fock theory, the first term in (2.2.8) is
the kinetic energy of the $N$ single particle orbitals that make up the Slater
determinant. The second term is the Hartree energy, which characterizes the
electron-electron repulsion energy at the mean-field level. The last term, $E_{xc}[\rho]$, is called the
exchange-correlation energy functional, which at first glance simply defines whatever
we do not know about $F_{LL}[\rho]$. This is partially true. However, the insight from Kohn
and Sham is that the kinetic and Hartree terms often encode more than 95% of the total
energy. Therefore approximating $E_{xc}[\rho]$, while still very difficult, is considerably
easier than approximating $F_{LL}[\rho]$ directly. As a result, the Kohn–Sham variational
problem is
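$$E^{\mathrm{KS}} = \inf_{\Psi \in \mathcal{A}_N^0} \langle \Psi | T | \Psi \rangle + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' + E_{xc}[\rho] + \int \rho(r) V_{\mathrm{ext}}(r) \, dr + E_{II} \tag{2.2.9}$$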
with ρ given by the Slater determinant Ψ. Writing the kinetic energy of the Slater
determinant more explicitly, we have
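$$E^{\mathrm{KS}}(\{\psi_i\}_{i=1}^{N}) = \frac{1}{2} \sum_{i=1}^{N} \int |\nabla_r \psi_i(x)|^2 \, dx + \int \rho(r) V_{\mathrm{ext}}(r) \, dr + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' + E_{xc}[\rho]. \tag{2.2.11}$$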
Although the form of the Kohn–Sham energy formally resembles that of the Hartree–
Fock energy (2.1.2), we should keep in mind that the Hartree–Fock theory is only the
(conceptually) simplest approximation to the many-body ground state wavefunction.
On the other hand, Kohn–Sham DFT is in principle an exact theory if we have access
to the exact exchange-correlation functional Exc [ρ]. The exchange-correlation func-
tional is also universal, i.e., it is independent of the external potential Vext and hence
the atomic configuration. Compared to the Hartree–Fock theory in (2.1.26), we find
that Kohn–Sham DFT does not involve the single particle density matrix $P(x, x')$ and
absorbs the exchange interaction into the exchange-correlation energy $E_{xc}$ that depends
only on the electron density $\rho$.
In order to use Kohn–Sham DFT in practice, the exchange-correlation functional
$E_{xc}$ must be approximated. Since the local density approximation was proposed by
Kohn and Sham, a "zoo" of exchange-correlation functionals has been proposed (see an
incomplete list in Figure 2.1). According to Perdew and Schmidt [72], these
exchange-correlation functionals can be organized according to the "Jacob's ladder" of
exchange-correlation functionals (Figure 2.2). When no exchange-correlation functional is used,
Kohn–Sham DFT is essentially a Hartree approximation (with the Pauli exclusion
principle) and can thus be significantly less accurate than the Hartree–Fock theory. This is
referred to as "Hartree's hell." As the ladder moves up, the accuracy of the DFT
calculation generally improves towards the "heaven of chemical accuracy" of 1 kcal/mol
(or about 1.6 milli-Hartree per atom) when compared to experimental results. Correspondingly, the
form of the functional becomes increasingly complex and the computational cost
also increases as a result.
At the first level of the ladder, we have the local density approximation (LDA),
where Exc is modeled locally by the electron density
$$E_{xc}[\rho] = \int \tilde{\epsilon}_{xc}(\rho(r)) \rho(r) \, dr. \tag{2.2.12}$$
Here $\tilde{\epsilon}_{xc}(\rho(r))$ is the intensity of the exchange-correlation energy, which is to be
multiplied with the electron density $\rho(r)$ to obtain the exchange-correlation energy density
at $r$. For convenience in later discussion, we introduce
$$\epsilon_{xc}(\rho) = \tilde{\epsilon}_{xc}(\rho) \rho$$
due to local rotational symmetry. The GGA functionals are currently the most widely
used functionals.
The third level of the ladder is the meta-GGA approximation [80, 81], where the
second-order derivative information is added, in particular the Hessian $\nabla^2 \rho(r)$ and the
Since τ is the local kinetic energy of the orbitals, it may seem that the meta-GGA func-
tional is no longer a density functional. It turns out that the Euler–Lagrange equation
associated with the meta-GGA energy functional is still of the same form as that of the
LDA and GGA energy functionals, as we will see below in section 2.3.
Further up the ladder, the energy functionals no longer depend only on the den-
sity, but also intrinsically on the orbitals. Hence, rigorously speaking, these exchange-
correlation functionals are no longer “pure” density functionals, and hence we use a
dashed line to separate the functionals on the fourth rung of the ladder and above as “or-
bital dependent functionals.” For instance, the fourth rung functional is typically called
the “hybrid functional” of the following form [7, 35]:
Exc [{ψi }] = (1 − α)Ex [ρ] + αEXX [{ψi }] + Ec [ρ]. (2.2.16)
Here Ex and Ec are the exchange and correlation parts from lower rung XC functionals
such as the GGA functionals. EXX [{ψi }] is the Hartree–Fock exchange energy of the
orbitals {ψi }, often referred to as the “exact exchange” contribution.
On the fifth rung of the ladder, we have functionals that depend not only on the
density matrix, but also on other quantities such as the linear response operators (to
be discussed in Chapter 3). Examples of such functionals include the double hybrid
functional and the random phase approximation (RPA) functional. These functionals
are closely related to Green’s function theory in many-body physics. The derivation
of the RPA correlation energy will be given at the end of this book. There are also
functionals that depend fully nonlocally on the electron density, such as those that take
into account the van der Waals interaction. These topics are beyond the scope of this
book.
Readers at this point might notice that none of the exchange-correlation functionals above
explicitly involves the spin degrees of freedom. This is indeed correct. According to the
constrained minimization procedure, in the absence of an external magnetic field the total
energy should depend solely on the electron density rather than each of its spin
components. Then, when the number of electrons is even, it is often justified to further restrict
the space $\mathcal{A}_N^0$ to the space spanned by orbitals of the form (2.1.22), as in the restricted
Hartree–Fock ansatz. The energy functional (2.2.11) thus becomes
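$$E^{\mathrm{KS}}(\{\varphi_i\}_{i=1}^{N_{\mathrm{occ}}}) = \sum_{i=1}^{N_{\mathrm{occ}}} \int |\nabla_r \varphi_i(r)|^2 \, dr + \int \rho(r) V_{\mathrm{ext}}(r) \, dr + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' + E_{xc}[\rho]. \tag{2.2.17}$$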
For spin-polarized systems and open-shell systems, it is more advantageous to
distinguish the spin channels explicitly in the formulation, similarly to the treatment of the
unrestricted Hartree–Fock ansatz. In the simplest setting, this leads to the local spin
density approximation [73]. We will not discuss the details of spin-dependent functionals
here.
Hartree–Fock equation
For simplicity of notation, we restrict our discussion here to the restricted Hartree–Fock
theory (2.1.26), while leaving the generalizations to UHF and GHF as exercises for the
reader. Recall that the energy functional can be written as
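$$E^{\mathrm{RHF}}(\{\varphi_i\}_{i=1}^{N_{\mathrm{occ}}}) = \sum_{i=1}^{N_{\mathrm{occ}}} \int |\nabla \varphi_i(r)|^2 \, dr + \int V_{\mathrm{ext}}(r) \rho(r) \, dr + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' - \iint \frac{|P(r, r')|^2}{|r - r'|} \, dr \, dr'. \tag{2.3.1}$$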
In order to minimize the Hartree–Fock energy, let us find the stationary point of the
Lagrangian. Note that
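$$\begin{aligned}
\frac{1}{2} \frac{\delta E^{\mathrm{RHF}}}{\delta \varphi_i^*(r)} &= -\frac{1}{2} \Delta \varphi_i(r) + V_{\mathrm{ext}}(r) \varphi_i(r) + \left( \int \frac{\rho(r')}{|r - r'|} \, dr' \right) \varphi_i(r) - \int \frac{P(r, r')}{|r - r'|} \varphi_i(r') \, dr' \\
&=: H^{\mathrm{RHF}}[\Phi] \varphi_i(r).
\end{aligned} \tag{2.3.2}$$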
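The last line gives the definition of the Fock operator $H^{\mathrm{RHF}}[\Phi]$, which depends on the
orbitals $\Phi = \{\varphi_i\}_{i=1}^{N_{\mathrm{occ}}}$. Taking into account the orthonormal constraints of $\varphi_i$, we get
$$\begin{aligned}
H^{\mathrm{RHF}}[\Phi] \varphi_i(r) &= -\frac{1}{2} \Delta \varphi_i(r) + V_{\mathrm{ext}}(r) \varphi_i(r) + \left( \int \frac{\rho(r')}{|r - r'|} \, dr' \right) \varphi_i(r) - \int \frac{P(r, r')}{|r - r'|} \varphi_i(r') \, dr' \\
&= \sum_{j} \varphi_j(r) \lambda_{ji},
\end{aligned} \tag{2.3.3}$$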
$$\psi_i(r) = \sum_{j=1}^{N_{\mathrm{occ}}} \varphi_j(r) U_{ji}, \quad i = 1, \ldots, N_{\mathrm{occ}}. \tag{2.3.6}$$
Thus $\{\psi_i\}$ satisfies a set of nonlinear eigenvalue problems, as the operator $H^{\mathrm{RHF}}$ also
depends on the eigenfunctions. The nonlinear eigenvalue problem (2.3.10) is known
as the Hartree–Fock equation. The orbitals in the set $\{\psi_i\}_{i=1}^{N_{\mathrm{occ}}}$ are called occupied
orbitals. The rest of the orbitals $\{\psi_i\}_{i=N_{\mathrm{occ}}+1}^{\infty}$ are called the unoccupied orbitals or virtual
orbitals.
Kohn–Sham equations
Next, in order to minimize the Kohn–Sham energy functional (2.2.11), let us find the
stationary point of the Lagrangian. Similarly to the discussion above, we will also use
the spin-restricted ansatz (2.1.22) for the Kohn–Sham wavefunctions. We differentiate
(2.2.17) as
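$$\begin{aligned}
\frac{1}{2} \frac{\delta E^{\mathrm{KS}}(\{\varphi_i\})}{\delta \varphi_i^*(r)} &= \left( -\frac{1}{2} \Delta_r + V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r - r'|} \, dr' + V_{xc}[\rho](r) \right) \varphi_i(r) \\
&=: H^{\mathrm{KS}}[\rho] \varphi_i(r),
\end{aligned} \tag{2.3.11}$$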
where $H^{\mathrm{KS}}[\rho]$ can be viewed as a self-adjoint operator acting on the orbitals which
depends on $\rho$, given by the orbitals
$$\rho(r) = 2 \sum_{i=1}^{N_{\mathrm{occ}}} |\varphi_i(r)|^2,$$
and
$$V_{xc}[\rho] = \frac{\delta E_{xc}[\rho]}{\delta \rho}, \tag{2.3.12}$$
which for now can be thought of as a $\rho$-dependent potential acting on orbitals; $V_{xc}$ will
be further discussed and made more explicit below.
Taking into account the orthonormality condition $\int \varphi_i^*(r) \varphi_j(r) \, dr = \delta_{ij}$, we get
the Euler–Lagrange equations as
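$$\begin{aligned}
H^{\mathrm{KS}}[\rho] \varphi_i(r) &= \left( -\frac{1}{2} \Delta + V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r - r'|} \, dr' + V_{xc}[\rho](r) \right) \varphi_i(r) \\
&= \sum_{j} \varphi_j(r) \lambda_{ij},
\end{aligned} \tag{2.3.13}$$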
where the λij ’s are Lagrange multipliers. Following a similar procedure of rotating the
orbitals, it suffices to consider the Euler–Lagrange equation of the form
$$\left( -\frac{1}{2} \Delta + V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r - r'|} \, dr' + V_{xc}[\rho](r) \right) \psi_i(r) = \varepsilon_i \psi_i(r), \quad i = 1, \ldots, N_{\mathrm{occ}}. \tag{2.3.14}$$
Since the operator H KS depends on the orbitals {ψi } through the electron density ρ,
this is a set of nonlinear eigenvalue problems known as the Kohn–Sham equations.
The Kohn–Sham equations must be solved self-consistently with respect to the electron
density $\rho$. For a given electron density $\rho$, the Hamiltonian $H^{\mathrm{KS}}[\rho] = -\frac{1}{2} \Delta + V_{\mathrm{eff}}(r)$
is a self-adjoint linear operator, where the effective potential induced by $\rho$ is
Here $V_{Hxc}[\rho]$ includes the Hartree and exchange-correlation contributions and only
depends on the electron density as
$$v_C[\rho](r) = \int \frac{\rho(r')}{|r - r'|} \, dr', \tag{2.3.17}$$
which gives the Hartree potential.
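In practice the Hartree potential is rarely evaluated as a direct integral. A common route, sketched here under assumptions that go beyond the text, is to solve the Poisson equation $-\Delta v_C = 4\pi\rho$ with fast Fourier transforms. The sketch assumes a cubic periodic box on a uniform grid and sets the zero-wavevector component to zero (a neutralizing background); the function name is illustrative.

```python
import numpy as np

def hartree_potential_periodic(rho, box_length):
    """Solve -Laplace(v) = 4*pi*rho on a periodic cubic grid via FFT.

    rho: (n, n, n) real array; the k = 0 mode of v is set to zero (assumption).
    """
    n = rho.shape[0]
    # Angular wavenumbers along one axis.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2

    rho_hat = np.fft.fftn(rho)
    v_hat = np.zeros_like(rho_hat)
    nonzero = k2 > 0
    # In Fourier space the Coulomb kernel is 4*pi / |k|^2.
    v_hat[nonzero] = 4.0 * np.pi * rho_hat[nonzero] / k2[nonzero]
    return np.real(np.fft.ifftn(v_hat))

# Usage: a single Fourier mode rho = cos(x) has the exact solution 4*pi*cos(x).
n, L = 16, 2.0 * np.pi
grid = np.arange(n) * L / n
X, _, _ = np.meshgrid(grid, grid, grid, indexing="ij")
v = hartree_potential_periodic(np.cos(X), L)
```

For isolated (non-periodic) systems the free-space kernel $1/|r - r'|$ requires a different treatment of the boundary conditions, which this sketch does not attempt.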
The Kohn–Sham orbitals {ψi } are thus eigenfunctions of H KS [ρ]. It should be
noted that a priori there is no guarantee that $\{\psi_i\}_{i=1}^{N}$ will correspond to the lowest
few eigenvalues of H KS [ρ] to achieve the global minimum of the Kohn–Sham energy
functional (2.2.9), as in the case when the electrons are noninteracting except for the
Pauli exclusion principle. In practice, this is often assumed in solving the Kohn–Sham
equations according to the aufbau principle.
Let us now come back to the exchange-correlation potential, given by the functional
derivative of $E_{xc}$ with respect to the density. For the LDA, the variation
of $E_{xc}$ is given by
$$\delta E_{xc} = \int \frac{\partial \epsilon_{xc}}{\partial \rho}(\rho(r)) \, \delta\rho(r) \, dr.$$
Hence
$$V_{xc}[\rho](r) = \frac{\delta E_{xc}}{\delta \rho(r)} = \frac{\partial \epsilon_{xc}}{\partial \rho}(\rho(r)).$$
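The LDA formula can be verified by a finite-difference check. The sketch below is a toy illustration, not a production functional: it assumes the Dirac (Slater) exchange form $\epsilon_{xc}(\rho) = -C_x \rho^{4/3}$ with $C_x = \frac{3}{4}(3/\pi)^{1/3}$ as the local energy density, a one-dimensional grid, and a hypothetical Gaussian density and perturbation.

```python
import numpy as np

C_X = 0.75 * (3.0 / np.pi) ** (1.0 / 3.0)  # Dirac exchange constant (assumed model)

def eps_xc(rho):
    """Exchange-only energy density eps_xc(rho) = -C_x * rho^(4/3)."""
    return -C_X * rho ** (4.0 / 3.0)

def v_xc(rho):
    """V_xc = d(eps_xc)/d(rho) = -(4/3) * C_x * rho^(1/3)."""
    return -(4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)

def e_xc(rho, h):
    """E_xc[rho] as a Riemann sum with grid spacing h."""
    return h * np.sum(eps_xc(rho))

x = np.linspace(-5.0, 5.0, 401)
h = x[1] - x[0]
rho = np.exp(-x**2)                          # hypothetical density
delta_rho = 1e-6 * np.exp(-(x - 0.5) ** 2)   # small hypothetical perturbation

# First-order change of E_xc versus the prediction  integral V_xc * delta_rho.
lhs = e_xc(rho + delta_rho, h) - e_xc(rho, h)
rhs = h * np.sum(v_xc(rho) * delta_rho)
```

The two numbers agree up to terms of second order in the perturbation, which is what the definition of the functional derivative requires.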
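For GGA, the variation with respect to the density gives
$$\delta E_{xc} = \int \left( \frac{\partial \epsilon_{xc}}{\partial \rho} \delta\rho + \frac{\partial \epsilon_{xc}}{\partial \sigma} \delta\sigma \right) dr.$$
Using the chain rule $\delta\sigma = 2 \nabla\rho \cdot \nabla\delta\rho$ and integrating by parts,
we have
$$\delta E_{xc} = \int \left( \frac{\partial \epsilon_{xc}}{\partial \rho} \delta\rho + 2 \frac{\partial \epsilon_{xc}}{\partial \sigma} (\nabla\rho \cdot \nabla\delta\rho) \right) dr = \int \left( \frac{\partial \epsilon_{xc}}{\partial \rho} - 2 \nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \sigma} \nabla\rho \right) \right) \delta\rho \, dr.$$
Thus
$$V_{xc}[\rho] = \frac{\partial \epsilon_{xc}}{\partial \rho} - 2 \nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \sigma} \nabla\rho \right).$$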
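The derivation of the exchange-correlation potential is slightly different for meta-GGA,
since the kinetic energy density $\tau(r)$ explicitly involves the Kohn–Sham orbitals $\{\psi_i\}$.
First,
$$\delta E_{xc} = \int \left( \frac{\partial \epsilon_{xc}}{\partial \rho} \delta\rho + \frac{\partial \epsilon_{xc}}{\partial \sigma} \delta\sigma + \frac{\partial \epsilon_{xc}}{\partial \tau} \delta\tau \right) dr.$$
In the spin-restricted case, since
$$\delta\tau(r) = \sum_{i=1}^{N_{\mathrm{occ}}} \nabla\psi_i^*(r) \cdot \nabla\delta\psi_i(r) + \nabla\delta\psi_i^*(r) \cdot \nabla\psi_i(r),$$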
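Recall that the Kohn–Sham equation is obtained by variation with respect to $\delta\psi_i^*$; thus
the exchange-correlation "potential" $V_{xc}[\rho]$ applied to the occupied orbital $\psi_i$ is (note
the extra 1/2 factor due to the spin degeneracy)
$$V_{xc}[\rho] \psi_i = \left( \frac{\partial \epsilon_{xc}}{\partial \rho} - 2 \nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \sigma} \nabla\rho \right) \right) \psi_i - \frac{1}{2} \nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \tau} \nabla \psi_i \right). \tag{2.3.19}$$
Therefore, the exchange-correlation functional $V_{xc}[\rho]$ is still independent of the orbitals
and can be defined only using the electron density as
$$V_{xc}[\rho] = \frac{\partial \epsilon_{xc}}{\partial \rho} - 2 \nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \sigma} \nabla\rho \right) - \frac{1}{2} \nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \tau} \nabla \right). \tag{2.3.20}$$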
Strictly speaking, in meta-GGA, $V_{xc}[\rho]$ is no longer a potential. The dependence of $E_{xc}$
on the kinetic energy density introduces a differential operator $\nabla \cdot \left( \frac{\partial \epsilon_{xc}}{\partial \tau} \nabla \right)$ acting on the
orbitals. On the other hand, all our previous discussions on the Kohn–Sham equations
still apply without change, as can be verified easily.
Further up the Jacob's ladder for the exchange-correlation functionals, $V_{xc}$ can no
longer be determined only from the electron density. In particular, for hybrid functionals
involving the Fock exchange term, the Kohn–Sham equations will be similar to the
Hartree–Fock equation and involve the single particle density matrix $P(r, r')$ or the
occupied orbitals. Functionals beyond the hybrid functional, such as the RPA correlation
energy functional, further involve the polarizability operator $\chi(r, r')$, which must be
defined using not only the occupied orbitals but the unoccupied orbitals as well. The
increased complexity of the density functional brings in more fidelity but also significantly
increases the computational cost. Therefore the computational cost of DFTs with
high-level exchange-correlation functionals may be close to that of some approaches
based on many-body theories for accurate electronic structure calculations.
In the discussion below, we assume that the LDA/GGA functionals are used, i.e., the
exchange-correlation potential can always be expressed as a local potential Vxc [ρ](r).
Much of the discussion can be generalized to more complicated exchange-correlation
functionals as well as spin-dependent functionals.
For the rest of the book, for simplicity we choose to neglect the spin index in the
presentation and consider spin-less quantum systems unless otherwise noted. The treat-
ment of spin-less particles is similar to that of the restricted Hartree–Fock/Kohn–Sham
calculations above, but we ignore the factor of 2 due to the spin degeneracy. In other
words, we assume Nocc = N and use the spatial variable r most of the time instead
of x. Readers may find this odd at first sight, since we started by introducing quantum
mechanics for a system that involves only the spin degrees of freedom! However, from
a mathematical perspective, the spin degrees of freedom mainly introduce another layer
of indices to keep track of for all quantities under consideration and do not (usually)
add extra complexity at the conceptual level of Kohn–Sham DFT.
The electron density ρ can be evaluated from the Kohn–Sham map by solving a linear
eigenvalue problem or by using density matrix techniques, to be discussed in section 2.7.
Hence ρ and Veff should be iteratively determined by each other until convergence. This
is called the self-consistent field (SCF) iteration.
We begin with a certain initial electron density denoted by ρ0 , and we denote by
ρk , Vk the electron density and the effective potential Veff at the kth SCF iteration,
respectively. Then the flow of the SCF iteration becomes
Depending on the starting point, the relation (2.4.2) can be viewed as a mapping from
$\rho_k$ to $\rho_{k+1}$, or from $V_k$ to $V_{k+1}$. The former is known as density mixing and the latter
as potential mixing. There is no qualitative difference between the two types of mixing
schemes. However, density mixing has the extra constraint that the density must be
nonnegative everywhere and must be normalized to have the correct number of electrons.
In practice, this constraint can be easily satisfied by setting all the negative entries of
$\rho$ (usually these entries have very small magnitude) to 0 or a small positive number,
followed by a normalization step. On the other hand, potential mixing, which we will
consider below, is formally free of such a constraint. We remark that both density
mixing and potential mixing strategies are widely used in electronic structure software
packages, and that the algorithm below for potential mixing can be used for density
mixing as well.
When self-consistency is reached, the converged effective potential is denoted by
$V^*$ and satisfies the nonlinear equation
However, as we shall analyze later, the fixed point iteration generally cannot be expected
to converge, even if the initial potential $V_0$ is already very close to $V^*$.
In order to achieve a self-consistent solution, the simplest practically usable scheme
is the simple mixing method, which introduces a slight modification of the fixed point
iteration as
Vk+1 = αVeff [FKS [Vk ]] + (1 − α) Vk . (2.4.5)
If we introduce the residual error of the Kohn–Sham potential as
When α = 1, the simple mixing is the same as the fixed point iteration. As will be
analyzed in section 3.4, if we neglect the contribution from the exchange-correlation
potential, the simple mixing always converges when α is a small enough positive num-
ber.
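The effect of the mixing parameter can be seen on a scalar toy problem. In the sketch below, `toy_map` is a hypothetical stand-in for the composition $V_{\mathrm{eff}}[F_{KS}[\cdot]]$, and the residual is taken to be $r_k = V_k - V_{\mathrm{eff}}[F_{KS}[V_k]]$, an assumption chosen to be consistent with (2.4.5) and with the finite difference formula (2.4.10).

```python
def simple_mixing(G, V0, alpha, n_iter=50, tol=1e-12):
    """Iterate V_{k+1} = alpha * G(V_k) + (1 - alpha) * V_k, cf. (2.4.5)."""
    V = V0
    for _ in range(n_iter):
        r = V - G(V)          # assumed residual r_k = V_k - G(V_k)
        if abs(r) < tol:
            break
        V = V - alpha * r     # algebraically the same as (2.4.5)
    return V

def toy_map(V):
    """Hypothetical scalar stand-in for V_eff[F_KS[V]]; its fixed point is V = 1."""
    return 3.0 - 2.0 * V

# alpha = 1 is the plain fixed point iteration; the error doubles at every step.
V_fixed_point = simple_mixing(toy_map, V0=0.0, alpha=1.0, n_iter=20)

# A smaller alpha damps the update; here the error shrinks by a factor 0.1 per step.
V_mixed = simple_mixing(toy_map, V0=0.0, alpha=0.3)
```

For this map the error is multiplied by $1 - 3\alpha$ at each step, so the iteration converges only for $0 < \alpha < 2/3$, a small-scale picture of why a sufficiently small mixing parameter is needed.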
Newton’s method
The convergence of the simple mixing method often requires a rather small mixing
constant α. Hence the SCF procedure may take many iterations to converge. One
possible acceleration can be achieved by using Newton’s method, which can be written
as
$$V_{k+1} = V_k - J_k^{-1} r_k. \tag{2.4.8}$$
Here Jk is the Jacobian matrix for the residual map
Hence the simple mixing method can also be interpreted as approximating the inverse of
the Jacobian matrix $J_k^{-1}$ simply by $\alpha I$. For a system with $N$ electrons, the evaluation
of the Jacobian matrix for the composition map $V_{\mathrm{eff}} \circ F_{KS}$ requires in principle $O(N)$
evaluations of the Kohn–Sham map, which is prohibitively expensive.
The Jacobian-free Krylov–Newton method replaces the need for explicit evaluation
of the Jacobian matrix by solving a linear equation
Jk δVk = −rk (2.4.9)
to obtain the Newton update δVk . This can be done using iterative methods for solving
linear equations, such as the generalized minimal residual method (GMRES) [78]. In
order to compute the matrix-vector multiplication related to the Jacobian matrix, one
can use the finite difference formula
Jk δVk ≈ δVk − (Veff [FKS [Vk + δVk ]] − Veff [FKS [Vk ]]) . (2.4.10)
The finite difference calculation requires at least one additional function evaluation of
FKS (Vk + δV ) per iteration step. Therefore, even though Newton’s method may exhibit
local quadratic convergence, each Newton iteration may take many inner iterations to
solve the linear equation (2.4.9).
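The structure of a Jacobian-free Newton iteration can be sketched in a few lines. This is an illustration under stated assumptions rather than a production solver: `G` is a hypothetical stand-in for $V_{\mathrm{eff}}[F_{KS}[\cdot]]$, the residual is taken as $r(V) = V - G(V)$, the finite difference uses an explicit small step `fd_step` (a scaled variant of (2.4.10)), and `gmres_matfree` is a minimal unrestarted GMRES written here for self-containment.

```python
import numpy as np

def gmres_matfree(matvec, b, m=20, tol=1e-10):
    """Minimal unrestarted GMRES with zero initial guess; only needs matvec."""
    n = b.size
    m = min(m, n)
    beta = np.linalg.norm(b)
    if beta == 0.0:
        return np.zeros(n)
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = b / beta
    for j in range(m):
        w = matvec(Q[:, j])
        for i in range(j + 1):            # modified Gram-Schmidt (Arnoldi step)
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        e1 = np.zeros(j + 2)
        e1[0] = beta
        # Small least squares problem that minimizes the residual norm.
        y, *_ = np.linalg.lstsq(H[: j + 2, : j + 1], e1, rcond=None)
        res = np.linalg.norm(H[: j + 2, : j + 1] @ y - e1)
        if res < tol * beta or H[j + 1, j] < 1e-14:
            break
        Q[:, j + 1] = w / H[j + 1, j]
    return Q[:, : j + 1] @ y

def newton_krylov(G, V0, n_newton=20, fd_step=1e-6, tol=1e-10):
    V = V0.copy()
    for _ in range(n_newton):
        GV = G(V)
        r = V - GV                        # assumed residual r(V) = V - G(V)
        if np.linalg.norm(r) < tol:
            break
        def jvp(d):
            # J d is approximated by d - (G(V + h d) - G(V)) / h; one extra G call.
            return d - (G(V + fd_step * d) - GV) / fd_step
        V = V + gmres_matfree(jvp, -r)    # Newton update from J dV = -r
    return V

# Hypothetical toy map on R^5; its plain fixed point iteration is not contractive.
K = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
b = np.linspace(0.1, 0.5, 5)
G = lambda V: b - 2.0 * np.tanh(V) + 0.1 * (K @ V)
V_star = newton_krylov(G, np.zeros(5))
```

Each inner GMRES iteration costs one additional evaluation of `G`, which is the trade-off described above: fast outer convergence, but possibly many inner iterations per Newton step.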
Broyden’s method
A widely used alternative to Newton's method is the quasi-Newton method, which
replaces $J_k^{-1}$ by an approximate matrix $C_k$ that is easy to compute and apply. Then the
updating strategy becomes
$$V_{k+1} = V_k - C_k r_k. \tag{2.4.11}$$
Using Broyden's techniques [39], one can systematically approximate $J_k$ or $J_k^{-1}$.
In Broyden's second method, $C_k$ is obtained by performing a sequence of low-rank
modifications to some initial approximation $C_0$ of the Jacobian inverse using a recursive
formula [26, 57]. At each step, $C_k$ is obtained by solving the following constrained
optimization problem:
$$\min_{C} \ \frac{1}{2} \| C - C_{k-1} \|_F^2 \quad \text{s.t.} \quad S_k = C Y_k, \tag{2.4.12}$$
where $C_{k-1}$ is the approximation to the inverse of the Jacobian constructed in the $(k-1)$th Broyden
iteration. The matrices $S_k$ and $Y_k$ above are defined as
$$S_k = (s_k, s_{k-1}, \ldots, s_{k-\ell}), \quad Y_k = (y_k, y_{k-1}, \ldots, y_{k-\ell}), \tag{2.4.13}$$
where $s_j$ and $y_j$ are defined as $s_j = V_j - V_{j-1}$ and $y_j = r_j - r_{j-1}$, respectively. The
number $\ell$ is the length of the history used in Broyden's method.
Equation (2.4.12) is a constrained optimization problem which can be solved using
the method of Lagrange multipliers. The solution is (exercise)
$$C_k = C_{k-1} + (S_k - C_{k-1} Y_k) Y_k^{\dagger}. \tag{2.4.14}$$
Here $Y_k^{\dagger}$ denotes the Moore–Penrose pseudoinverse of $Y_k$, i.e., $Y_k^{\dagger} = (Y_k^{\top} Y_k)^{-1} Y_k^{\top}$.
We remark that in practice $Y_k^{\dagger}$ is not constructed explicitly, since we only need to apply
$Y_k^{\dagger}$ to a residual vector $r_k$. This operation can be carried out by solving a linear least
squares problem with appropriate regularization (e.g., through a truncated singular value
decomposition).
tronic structure software packages. When solving Eq. (2.4.12), Anderson’s method
fixes Ck−1 to be the initial approximation C0 . It follows from (2.4.11) that Anderson’s
method updates the potential as
where C0 is an initial approximation to the inverse of the Jacobian (at the solution).
Substituting Vopt = Vk − Sk Yk† rk into (2.4.17), and combining with the linearity as-
sumption of Veff [FKS (V )], we arrive at exactly Anderson’s updating formula (2.4.15).
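Fixing $C_{k-1} = C_0$ in (2.4.14) and inserting the result into (2.4.11) gives the update $V_{k+1} = V_k - C_0 (I - Y_k Y_k^{\dagger}) r_k - S_k Y_k^{\dagger} r_k$. The following is a minimal sketch of this update, assuming $C_0 = \alpha I$, the residual $r(V) = V - G(V)$, and a hypothetical linear map `G` standing in for $V_{\mathrm{eff}}[F_{KS}[\cdot]]$; the product $Y_k^{\dagger} r_k$ is obtained from a least squares solve rather than an explicit pseudoinverse.

```python
import numpy as np

def anderson_mixing(G, V0, alpha=0.3, history=5, n_iter=50, tol=1e-10):
    """Anderson-type update with C_0 = alpha * I (an assumption of this sketch)."""
    V = V0.copy()
    r = V - G(V)                      # assumed residual r(V) = V - G(V)
    S, Y = [], []                     # histories of s_j = V_j - V_{j-1}, y_j = r_j - r_{j-1}
    for _ in range(n_iter):
        if np.linalg.norm(r) < tol:
            break
        if Y:
            Ymat = np.column_stack(Y)
            Smat = np.column_stack(S)
            # gamma = Y^dagger r via SVD-based least squares.
            gamma, *_ = np.linalg.lstsq(Ymat, r, rcond=None)
            V_new = V - alpha * (r - Ymat @ gamma) - Smat @ gamma
        else:
            V_new = V - alpha * r     # first step: plain simple mixing
        r_new = V_new - G(V_new)
        S.append(V_new - V)
        Y.append(r_new - r)
        S, Y = S[-history:], Y[-history:]   # keep only the most recent history
        V, r = V_new, r_new
    return V

# Hypothetical linear toy map on R^5.
K = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
b = np.linspace(0.1, 0.5, 5)
G = lambda V: b - 2.0 * V + 0.1 * (K @ V)
V_star = anderson_mixing(G, np.zeros(5))
```

For a linear map, once the history spans the whole space the update coincides with an exact Newton step, which is why convergence on this toy problem takes only a handful of iterations.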
The DIIS method can be further generalized by taking other definitions of the resid-
ual vector r. For instance, the commutator-DIIS (C-DIIS) method [75] defines the resid-
ual as the commutator of the Hamiltonian operator and the density matrix. This is the
most widely used method in quantum chemistry software packages for achieving self-
consistency.
We remark that $C_0$ plays the role of a preconditioner, and hence a better $C_0$ can be
chosen to accelerate the convergence of Anderson's method in practical electronic structure
calculations. For instance, for the uniform electron gas, as well as simple metallic systems
such as bulk sodium and aluminum, the most widely used preconditioner is the
Kerker preconditioner [43]. It assigns a smaller weight to the long-wavelength Fourier
modes in order to reduce the effect of "charge sloshing" that commonly occurs in electronic
structure calculations for metallic systems.
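The damping of long-wavelength modes can be illustrated with a short sketch. The weight $\alpha q^2 / (q^2 + q_0^2)$ used below is the commonly quoted form of the Kerker preconditioner and is an assumption here, since the text above does not state the formula; the one-dimensional periodic grid and the parameter values are likewise illustrative.

```python
import numpy as np

def kerker_precondition(residual, box_length, alpha=0.8, q0=1.0):
    """Apply the Fourier-space weight alpha * q^2 / (q^2 + q0^2) to a residual.

    Assumed form of the Kerker weight; 1D periodic grid for brevity.
    """
    n = residual.size
    q = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    weight = alpha * q**2 / (q**2 + q0**2)   # -> 0 as q -> 0, -> alpha as q -> infinity
    return np.real(np.fft.ifft(weight * np.fft.fft(residual)))

# Usage: a residual with a constant, a long-wavelength and a short-wavelength part.
n, L = 64, 2.0 * np.pi
x = np.arange(n) * L / n
residual = 1.0 + np.cos(x) + np.cos(8.0 * x)
damped = kerker_precondition(residual, L)
expected = 0.8 * (0.5 * np.cos(x) + (64.0 / 65.0) * np.cos(8.0 * x))
```

The constant mode is removed entirely, the $q = 1$ mode is halved, and the $q = 8$ mode is almost untouched (up to the overall factor $\alpha$), so the update applied to long-wavelength components of the potential is strongly reduced.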
Hence P is self-adjoint and idempotent, and is the projection operator onto the occupied
space. From an operator perspective, the density matrix can be viewed as an integral
operator
$$(P f)(r) = \int P(r, r') f(r') \, dr' = \int \sum_{i=1}^{N} \psi_i(r) \psi_i^*(r') f(r') \, dr' \tag{2.5.4}$$
In particular, we observe that the diagonal part of the kernel is just the electron density
in the real space representation
$$P(r, r) = \sum_{i=1}^{N} \psi_i(r) \psi_i^*(r) = \sum_{i=1}^{N} |\psi_i(r)|^2 = \rho(r). \tag{2.5.6}$$
Moreover,
$$\operatorname{Tr} P = \int P(r, r) \, dr = N, \tag{2.5.7}$$
which is the normalization condition due to the constraint of the number of electrons.
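In fact, $P$ can be represented independently of the orbitals. Since $H$ is a Hermitian
operator, a matrix function $f(H)$ can be defined using its spectral decomposition as
$$f(H) = \sum_{i} f(\varepsilon_i) |\psi_i\rangle \langle\psi_i| \tag{2.5.8}$$
for any Borel measurable function $f$ on the real line. Recall that for simplicity we have
assumed that the spectrum of $H$ is discrete. Thus, assuming that $\varepsilon_N < \varepsilon_{N+1}$ and the
parameter $\mu \in \mathbb{R}$ satisfies $\varepsilon_N \le \mu < \varepsilon_{N+1}$, we have
$$\mathbf{1}_{(-\infty, \mu]}(H) = \sum_{i} \mathbf{1}_{(-\infty, \mu]}(\varepsilon_i) |\psi_i\rangle \langle\psi_i| = \sum_{i : \varepsilon_i \le \mu} |\psi_i\rangle \langle\psi_i| = \sum_{i=1}^{N} |\psi_i\rangle \langle\psi_i| = P. \tag{2.5.9}$$
We conclude with
$$P = \mathbf{1}_{(-\infty, \mu]}(H), \tag{2.5.10}$$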
where the right-hand side is the spectral projection onto the interval (−∞, µ]. Here µ is
known as the Fermi level or the chemical potential.
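For a finite matrix the identity (2.5.10) is easy to verify. The sketch below assumes a small random real symmetric matrix as a stand-in for a discretized Hamiltonian; the function names are illustrative.

```python
import numpy as np

def density_matrix(H, n_occ):
    """P = sum over the n_occ lowest eigenpairs of |psi_i><psi_i|."""
    eps, psi = np.linalg.eigh(H)          # eigenvalues in ascending order
    occ = psi[:, :n_occ]
    return occ @ occ.conj().T, eps

def spectral_projector(H, mu):
    """Indicator function of (-inf, mu] applied to H through (2.5.8)."""
    eps, psi = np.linalg.eigh(H)
    f = (eps <= mu).astype(float)         # f(eps_i) is 1 if eps_i <= mu, else 0
    return (psi * f) @ psi.conj().T       # sum_i f(eps_i) |psi_i><psi_i|

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
H = 0.5 * (A + A.T)                       # hypothetical Hermitian "Hamiltonian"
n_occ = 3

P, eps = density_matrix(H, n_occ)
mu = 0.5 * (eps[n_occ - 1] + eps[n_occ])  # any mu in [eps_N, eps_{N+1}) works
P_from_mu = spectral_projector(H, mu)
```

The two constructions coincide, and `P` is idempotent with trace `n_occ`, so the projector is fixed by $H$ and $\mu$ alone without reference to a particular choice of orbitals.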
The density matrix can also be equivalently defined using contour integrals from
complex analysis. Recall the Cauchy integral formula (Appendix A.3)
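$$\frac{1}{2\pi i} \oint_{\partial D} \frac{1}{\lambda - \varepsilon} \, d\lambda = \begin{cases} 0, & \varepsilon \notin D, \\ 1, & \varepsilon \in D, \end{cases} \tag{2.5.11}$$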
We obtain
$$P = \frac{1}{2\pi i} \oint_{\mathcal{C}} (\lambda - H)^{-1} \, d\lambda. \tag{2.5.13}$$
The integrand in the above formula
$$G_{\lambda} = (\lambda - H)^{-1} \tag{2.5.14}$$
is known as Green’s function (in spectral theory it is usually called the resolvent, but
we will stick to the terminology that is more common to physics and PDEs here). The
relation between the density matrix and Green’s function using contour integrals is a
very useful tool both theoretically and numerically. After a certain choice of quadrature
rule, the discretized contour integral takes the form
$$P \approx \sum_{l=1}^{m} \omega_l (z_l - H)^{-1}, \tag{2.5.15}$$
where $\{z_l\}$, $\{\omega_l\}$ are quadrature nodes and weights, respectively, and $G_l = (z_l - H)^{-1}$
is a Green's function evaluated at $\lambda = z_l$. This is called the pole expansion of the
density matrix.
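One simple choice of quadrature, used here only as an illustration, is the trapezoidal rule on a circle that encloses the occupied eigenvalues. With $z_l = c + R e^{i\theta_l}$ the weights become $\omega_l = R e^{i\theta_l} / m$. The circular contour, the test matrix with a prescribed spectrum, and the function name below are all assumptions of this sketch.

```python
import numpy as np

def contour_density_matrix(H, center, radius, n_poles=64):
    """Approximate P = (1 / 2 pi i) * contour integral of (lambda - H)^{-1}
    by the trapezoidal rule on a circle (assumed contour)."""
    n = H.shape[0]
    theta = 2.0 * np.pi * (np.arange(n_poles) + 0.5) / n_poles
    z = center + radius * np.exp(1j * theta)      # quadrature nodes z_l
    w = radius * np.exp(1j * theta) / n_poles     # quadrature weights omega_l
    P = np.zeros((n, n), dtype=complex)
    identity = np.eye(n)
    for z_l, w_l in zip(z, w):
        P += w_l * np.linalg.inv(z_l * identity - H)   # Green's function at z_l
    return P

# Hypothetical Hamiltonian with known spectrum: three occupied states below a gap.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
eigenvalues = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])
H = Q @ np.diag(eigenvalues) @ Q.T
P_exact = Q[:, :3] @ Q[:, :3].T

# The circle |lambda + 1.5| = 1.5 encloses the three negative eigenvalues only.
P_contour = contour_density_matrix(H, center=-1.5, radius=1.5)
```

The trapezoidal rule converges geometrically in the number of poles when the contour stays away from the spectrum; contours and quadratures used in practice are chosen more carefully to minimize the number of Green's function evaluations.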
$$F[P^{(N)}] = \operatorname{Tr}\left[ P^{(N)} \left( H + \beta^{-1} \ln P^{(N)} \right) \right], \tag{2.6.1}$$
$$0 \le f_i \le 1 \quad \text{and} \quad \sum_{i} f_i = 1. \tag{2.6.3}$$
Note that for the special case that $P^{(N)}$ is a projection operator associated with the
wavefunction $|\Psi\rangle$, i.e., $P^{(N)} = |\Psi\rangle\langle\Psi|$, we have $\operatorname{Tr} P^{(N)} H = \langle\Psi|H|\Psi\rangle$ and
$\operatorname{Tr} P^{(N)} \ln P^{(N)} = 0$, and thus the free energy reduces to the energy of the
wavefunction $|\Psi\rangle$. However, the entropy term prefers $f_i$ strictly between 0 and 1 so as to
make $f_i \ln f_i$ negative, and thus the minimizing density matrix at finite temperature is
in general not a projector (for a projector, every $f_i$ is either 0 or 1).
The Euler–Lagrange equation of the minimization of (2.6.1) is
$$H + \beta^{-1} (\ln P^{(N)} + I) - \lambda I = 0, \tag{2.6.6}$$
where $\mathcal{D}_N$ denotes the set of $N$-body density matrices. Note that for any $\rho \in \mathcal{J}_N$, there
exists at least one $P^{(N)}$ that gives the density since we can take $P^{(N)} = |\Psi\rangle\langle\Psi|$ with
$\Psi \in \mathcal{A}_N$ giving the density. Thus, the constrained minimization is well defined and we
can further write
$$F = \inf_{\rho \in \mathcal{J}_N} \left( F_{\beta}[\rho] + \int \rho V_{\mathrm{ext}} \, dr \right), \tag{2.6.10}$$
where $F_{\beta}[\rho]$ is the universal functional that depends only on kinetic energy, electron-electron
repulsion, and entropy (hence temperature):
$$F_{\beta}[\rho] = \inf_{P^{(N)} \in \mathcal{D}_N, \ P^{(N)} \mapsto \rho} \left( \operatorname{Tr}\left[ P^{(N)} (T + V_{ee}) \right] + \beta^{-1} \operatorname{Tr}\left( P^{(N)} \ln P^{(N)} \right) \right). \tag{2.6.11}$$
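$$F_{\beta}[\rho] = \inf_{P \in \mathcal{D}, \ P \mapsto \rho} \left( \frac{1}{2} \operatorname{Tr}\left[ (-\Delta) P \right] + \beta^{-1} \operatorname{Tr}\left[ P \ln P + (I - P) \ln(I - P) \right] + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' + E_{xc,\beta}[\rho] \right), \tag{2.6.12}$$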
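where $\mathcal{D}$ is the set of all one-particle density matrices for an $N$-electron system:
$$\mathcal{D} = \left\{ P \in \mathcal{B}(L^2(\mathbb{R}^3, \mathbb{C}^2)) \;\middle|\; P = P^*, \ 0 \preceq P \preceq I, \ \operatorname{Tr} P = N \right\}, \tag{2.6.13}$$
where, for two self-adjoint operators $A$ and $B$, the notation $A \preceq B$ means that $\langle v|A|v\rangle \le
\langle v|B|v\rangle$ for any $v$. The constraint $0 \preceq P \preceq I$ comes from the Pauli exclusion principle
[18]: Let the eigen-decomposition of $P$ be
$$P = \sum_{i} f_i |\psi_i\rangle\langle\psi_i|, \tag{2.6.14}$$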
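with
$$F_{\beta}^{\mathrm{KS}}\left( \{f_i\}, \{\psi_i\} \right) = \frac{1}{2} \sum_{i} f_i \int |\nabla\psi_i|^2 \, dx + \beta^{-1} \sum_{i} \left[ f_i \ln f_i + (1 - f_i) \ln(1 - f_i) \right] + \int \rho(r) V_{\mathrm{ext}}(r) \, dr + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|} \, dr \, dr' + E_{xc,\beta}[\rho], \tag{2.6.17}$$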
In principle, the finite temperature exchange-correlation functional $E_{xc,\beta}[\rho]$
depends on $\beta$ and also has a ladder of approximation schemes like the zero temperature
case. However, almost all finite temperature DFT calculations in practice still use the
temperature-independent exchange-correlation functional. For the purpose of our
discussions, since the zero and finite temperature functionals share the same mathematical
structure, we will not explicitly distinguish them in what follows.
We now consider minimizing the finite temperature functional (2.6.12). As the func-
tional involves minimization with respect to the density matrix P , it is more convenient
to consider the combined minimization of the density and the associated density matrix:
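\[ \inf_{P \in \mathcal{D}} \left( \frac{1}{2} \mathrm{Tr}\left[ (-\Delta) P \right] + \beta^{-1} \mathrm{Tr}\left[ P \ln P + (I - P) \ln (I - P) \right] + \frac{1}{2} \iint \frac{\rho(r) \rho(r')}{|r - r'|} \, dr \, dr' + E_{\mathrm{xc},\beta}[\rho] \right), \tag{2.6.19} \]
where $\rho$ is given by the diagonal of the density matrix $P$. Taking the variation with
respect to $P$, we obtain the Euler–Lagrange equation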
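where $\mu$ is the Lagrange multiplier associated with the constraint $\mathrm{Tr}\, P = N$ and the
effective Hamiltonian is given similarly as in the zero temperature case (cf. (2.3.11))
\[ H_\beta^{\mathrm{KS}}[\rho] = -\frac{1}{2} \Delta + V_{\mathrm{ext}} + \int \frac{\rho(r')}{|r - r'|} \, dr' + V_{\mathrm{xc},\beta}[\rho]. \tag{2.6.21} \]
Solving (2.6.20) for $P$, we get
\[ P = \left[ I + \exp\left( \beta ( H_\beta^{\mathrm{KS}}[\rho] - \mu ) \right) \right]^{-1}. \tag{2.6.22} \]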
where $\varepsilon_i$ and $\psi_i$ are the eigenvalue and associated eigenfunction of $H_\beta^{\mathrm{KS}}[\rho]$, respectively.
We see that the occupation number is given by
\[ f_i = f_\beta(\varepsilon_i - \mu) = \frac{1}{1 + \exp(\beta(\varepsilon_i - \mu))}. \tag{2.6.25} \]
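Thus $f_i \in (0, 1)$, so all eigenstates are occupied with some fraction. Also notice that as
$\beta \to \infty$ (zero temperature limit), $f_\beta$ converges to the function $f_\infty$:
\[ f_\infty(\varepsilon) = \begin{cases} 1, & \varepsilon < 0, \\ \frac{1}{2}, & \varepsilon = 0, \\ 0, & \varepsilon > 0. \end{cases} \tag{2.6.26} \]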
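As a quick numerical illustration of (2.6.25) and (2.6.26), the following sketch evaluates the occupation function for a few inverse temperatures and compares it with the zero temperature limit. The sample values of $\varepsilon - \mu$ and the function names are illustrative assumptions; the tanh form is an algebraically equivalent rewriting of (2.6.25), chosen here only to avoid overflow for large $\beta$.

```python
import numpy as np

def fermi_dirac(x, beta):
    """Fermi-Dirac function f_beta(x) = 1 / (1 + exp(beta * x)).

    Uses the identity 1 / (1 + e^y) = (1 - tanh(y / 2)) / 2, which does not
    overflow when |beta * x| is large.
    """
    return 0.5 * (1.0 - np.tanh(0.5 * beta * np.asarray(x, dtype=float)))

def f_infinity(x):
    """Zero temperature limit (2.6.26): 1 for x < 0, 1/2 at x = 0, 0 for x > 0."""
    return 0.5 * (1.0 - np.sign(np.asarray(x, dtype=float)))

# Illustrative values of eps - mu (an assumption, not taken from the text).
x = np.array([-1.0, -0.1, 0.0, 0.1, 1.0])
for beta in (1.0, 10.0, 1000.0):
    print(f"beta = {beta:7.1f}  f_beta =", fermi_dirac(x, beta))
print("zero temperature limit  =", f_infinity(x))
```

For moderate $\beta$ every occupation lies strictly between 0 and 1, while for large $\beta$ the printed values approach the step function with the value 1/2 at $\varepsilon = \mu$.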
\[ \lim_{\beta \to \infty} \mu_\beta = \frac{1}{2} (\varepsilon_N + \varepsilon_{N+1}) =: \mu_\infty. \tag{2.6.27} \]
Therefore, the density matrix of finite temperature is consistent with the definition of
the chemical potential at zero temperature.
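The limit (2.6.27) can be checked numerically. The sketch below fixes a small set of eigenvalues (made up for illustration), determines $\mu_\beta$ by bisection from the electron count $\sum_i f_\beta(\varepsilon_i - \mu) = N$, and prints how $\mu_\beta$ moves as $\beta$ grows. The bisection solver and its name are assumptions of this example, not a method prescribed by the text.

```python
import numpy as np

def fermi_dirac(x, beta):
    # Overflow-free form of 1 / (1 + exp(beta * x)).
    return 0.5 * (1.0 - np.tanh(0.5 * beta * np.asarray(x, dtype=float)))

def chemical_potential(eigs, n_elec, beta, tol=1e-12):
    """Bisection on mu so that sum_i f_beta(eps_i - mu) = n_elec.

    Assumes 0 < n_elec < len(eigs); the electron count is increasing in mu.
    """
    eigs = np.asarray(eigs, dtype=float)

    def count(mu):
        return fermi_dirac(eigs - mu, beta).sum()

    lo, hi = eigs.min() - 1.0, eigs.max() + 1.0
    while count(lo) > n_elec:      # widen the bracket at high temperature
        lo -= hi - lo
    while count(hi) < n_elec:
        hi += hi - lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count(mid) < n_elec:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eigs = np.array([-1.0, -0.5, 0.3, 1.2])   # illustrative spectrum
n_elec = 2                                # eps_N = -0.5, eps_{N+1} = 0.3
midpoint = 0.5 * (eigs[n_elec - 1] + eigs[n_elec])

# beta is kept moderate: for very large beta the electron count becomes flat
# inside the gap in floating point arithmetic and mu is no longer resolved.
for beta in (2.0, 10.0, 40.0):
    mu = chemical_potential(eigs, n_elec, beta)
    print(f"beta = {beta:5.1f}  mu = {mu:+.8f}  midpoint = {midpoint:+.8f}")
```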
Similarly to the zero temperature case, we may also represent the density matrix at
finite temperature using Green’s functions. Let C be a contour close to the real line
enclosing the entire spectrum of H. By the Cauchy integral formula, we obtain
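\[ \frac{1}{2\pi i} \oint_C f_\beta(\lambda - \mu) (\lambda - H)^{-1} \, d\lambda = \frac{1}{2\pi i} \oint_C \sum_i f_\beta(\lambda - \mu) (\lambda - \varepsilon_i)^{-1} |\psi_i\rangle\langle\psi_i| \, d\lambda = \sum_i f_\beta(\varepsilon_i - \mu) |\psi_i\rangle\langle\psi_i| = P, \tag{2.6.28} \]
where we have used the property that the Fermi–Dirac function $f_\beta(z)$ (extended to the
complex plane) is a meromorphic function. It only has simple poles at $z = (2k+1) i \pi / \beta$
with $k \in \mathbb{Z}$ and is analytic everywhere else. These poles are called the Matsubara
frequencies.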
The contour integral formalism also provides a viable way of obtaining a numerical
approximation of the density matrix. The discretized contour integral gives the pole
expansion in the finite temperature case and takes the same form as in (2.5.15), but with
different choices of quadrature nodes and weights (Figure 2.4). We note that in the finite
temperature case the contour integral formulation remains well defined in the gapless
case, i.e., $\varepsilon_N = \varepsilon_{N+1}$. The pole expansion can be obtained efficiently semi-analytically
using, e.g., complex analysis techniques [54]. The computation of Green’s functions will
be further discussed in section 2.7.
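To make the discretized contour integral concrete, here is a minimal sketch that applies the trapezoidal rule to (2.6.28) on a circle. The circular contour is an assumption made for simplicity; it is not the contour of [54] or of Figure 2.4, and it only works when the temperature is high enough that the nearest poles $\mu \pm i\pi/\beta$ lie outside the circle. The test matrix, the parameter values, and the function names are likewise illustrative.

```python
import numpy as np

def fermi_dirac_complex(z, beta):
    """Fermi-Dirac function extended to complex arguments."""
    return 1.0 / (1.0 + np.exp(beta * z))

def density_matrix_contour(H, mu, beta, center, radius, n_nodes):
    """Trapezoidal rule for (2.6.28) on the circle |lambda - center| = radius.

    With lambda = center + w and w = radius * exp(i * theta), we have
    d(lambda) = i * w * d(theta), so the integral becomes an average of
    f_beta(lambda - mu) * w * (lambda - H)^{-1} over the nodes.
    """
    n = H.shape[0]
    identity = np.eye(n)
    P = np.zeros((n, n), dtype=complex)
    for theta in 2.0 * np.pi * np.arange(n_nodes) / n_nodes:
        w = radius * np.exp(1j * theta)
        lam = center + w
        G = np.linalg.inv(lam * identity - H)   # Green's function at this node
        P += fermi_dirac_complex(lam - mu, beta) * w * G
    return (P / n_nodes).real                   # imaginary parts cancel in pairs

# Symmetric test matrix with known spectrum in [-1, 1] (illustrative).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
eigs = np.linspace(-1.0, 1.0, 8)
H = Q @ np.diag(eigs) @ Q.T

mu, beta = 0.2, 1.0   # nearest poles at mu +/- i*pi/beta, outside radius 2
P_contour = density_matrix_contour(H, mu, beta, center=0.0, radius=2.0, n_nodes=64)
P_exact = Q @ np.diag(1.0 / (1.0 + np.exp(beta * (eigs - mu)))) @ Q.T
error = np.abs(P_contour - P_exact).max()
print("max entrywise error:", error)
```

Each quadrature node contributes one Green's function $(\lambda_l - H)^{-1}$, which is the structure shared with the pole expansion; practical schemes use far fewer, carefully placed nodes at low temperature.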
on the infinite dimensional space $L^2(\mathbb{R}^3)$, it should be first discretized. Many discretization
schemes can be characterized by a finite dimensional basis set $\{\phi_j(r)\}_{j=1}^{N_b}$ so that
$\mathrm{span}\{\phi_j(r)\}_{j=1}^{N_b} \subset L^2(\mathbb{R}^3)$. This basis set can also be written in a matrix form as
or compactly written in the linear algebra form as $H = \Phi^* \hat{H} \Phi$. Here we have used
the notation $\hat{H}$ again to distinguish the Hamiltonian as an operator and $H$ as a finite
dimensional matrix. The overlap matrix is defined as
\[ S_{ij} = \int \phi_i^*(r) \phi_j(r) \, dr \tag{2.7.2} \]
or $S = \Phi^* \Phi$.
In the discussion below, one particularly convenient discretization scheme is the
real space discretization, where each basis function $\phi_i(r)$ can be associated with a
discretized spatial point in $\{r_i\}_{i=1}^{N_b}$, and physical quantities such as the electron density
can be represented on the same set of grid points. Furthermore, we require that the
basis functions are orthonormal, i.e., $S = I$ is an identity matrix. In particular, in
the literature of electronic structure theory, the finite difference discretization of the
Laplacian operator (and hence the Hamiltonian) is often referred to loosely as a real
space discretization scheme. One particular advantage of real space discretization is
that the electron density is given directly by the diagonal elements of the density matrix,
which simplifies the introduction of the numerical methods below.
In an orthonormal basis set, the discretized Kohn–Sham equation is
\[ \sum_j H_{ij} C_{j,k} = C_{i,k} \varepsilon_k, \tag{2.7.3} \]
where $C_k = (C_{1,k}, C_{2,k}, \ldots, C_{N_b,k})^\top$ is the coefficient vector for the $k$th eigenfunction.
Hence (2.7.3) can be solved using an eigensolver. In order to solve Kohn–Sham
DFT, we do not need all the eigenvectors, but only those corresponding to the $N$ smallest
eigenvalues. When $N_b$ is small (such as when Gaussian-type orbitals or numerical
atomic orbitals are used), one can treat $H$ as a dense matrix and find all the eigenvalues
and eigenvectors using software packages such as LAPACK or ScaLAPACK.
When $N_b$ is large (such as when planewaves or finite elements are used), $H$ must be
treated as a sparse matrix and iterative methods should be used to compute only the
eigenpairs needed to construct the density matrix. Examples of iterative eigensolvers
include conjugate gradient-type methods [3, 44] and the Davidson method [22]. Using
the eigenvectors of (2.7.3), the Kohn–Sham orbitals can be reconstructed as
\[ \psi_k(r) = \sum_j \phi_j(r) C_{j,k} \tag{2.7.4} \]
and the electron density is given accordingly. The partial diagonalization procedure in
(2.7.3) is the most straightforward and is also the most widely used method for evaluating
the Kohn–Sham map $\rho = F_{\mathrm{KS}}[V_{\mathrm{eff}}]$. Since the matrix $C$ must consist of orthonormal
columns, the orthogonalization step will cost at least $O(N_b N^2)$, regardless of the
eigensolver used. Since the number of basis functions $N_b$ should scale linearly with respect
to the number of electrons $N$, the complexity is cubic with respect to $N$. The cubic scaling
is manageable for small systems, but becomes prohibitively expensive when solving
Kohn–Sham DFT for large systems.
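The diagonalization route to the Kohn–Sham map can be sketched in a few lines for a one-dimensional real space discretization. Everything specific here is an assumption made for illustration: the second-order finite difference Laplacian, the harmonic effective potential, the grid, and the function name. The grid weight is absorbed into the coefficient vectors, so the density is simply the diagonal of the density matrix and sums to $N$.

```python
import numpy as np

def kohn_sham_map(v_eff, n_occ, h):
    """Map an effective potential on a uniform 1D grid to the density.

    Builds H = -1/2 * Laplacian + V_eff with a finite difference Laplacian
    (grid spacing h), diagonalizes it, and occupies the n_occ lowest states.
    """
    n = v_eff.size
    lap = (np.diag(np.full(n, -2.0))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / h**2
    H = -0.5 * lap + np.diag(v_eff)
    eps, C = np.linalg.eigh(H)      # dense solver: computes all eigenpairs
    C_occ = C[:, :n_occ]            # only the N lowest are actually needed
    P = C_occ @ C_occ.T             # density matrix (real orbitals)
    rho = np.diag(P).copy()         # density on the grid
    return rho, P, eps[:n_occ]

x = np.linspace(-8.0, 8.0, 201)     # illustrative grid
h = x[1] - x[0]
n_occ = 3
rho, P, eps_occ = kohn_sham_map(0.5 * x**2, n_occ, h)
print("occupied eigenvalues:", eps_occ)
print("sum of density:", rho.sum())
```

For large grids one would replace the dense solver by an iterative eigensolver that returns only the lowest $N$ eigenpairs; the construction of $P$ and $\rho$ is unchanged.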
Since the diagonalization-type methods are widely used in practically all electronic
structure software packages, it is often assumed that solving the eigenvalue problem of
type (2.7.3) is a necessary step for solving Kohn–Sham DFT. However, according to
the discussion in section 2.4, what is really needed is the evaluation of the Kohn–Sham
map, which maps the effective potential Veff to an output electron density ρ. Hence the
diagonalization method is only one possibility for evaluating the Kohn–Sham map.
In this section, we focus on algorithms for solving Kohn–Sham DFT that directly
evaluate the Kohn–Sham map without obtaining any eigenvalues or eigenfunctions as
intermediate quantities. These algorithms often involve the direct approximation of
the density matrix, and hence we refer to these algorithms as density matrix algorithms.
While the computational complexity of diagonalization-type algorithms is always
$O(N^3)$ with respect to the number of electrons $N$, density matrix algorithms offer
the possibility of significantly reducing the complexity. We illustrate density matrix
algorithms using two representative examples: the density matrix purification method and
the Fermi operator expansion method.
Let us see why McWeeny’s purification works. Note that the iterate stays self-adjoint
during the iteration and the eigenfunctions remain the same, so it suffices to
keep track of the eigenvalues. Considering a specific eigenvalue $\lambda_0$ of $P_0$, we have
whose three roots are given by 0, $\frac{1}{2}$, and 1. We calculate the derivative of $f_{\mathrm{McW}}$ and get
\[ f_{\mathrm{McW}}'(x) = 6x - 6x^2 = \begin{cases} 0, & x = 0, \\ \frac{3}{2}, & x = \frac{1}{2}, \\ 0, & x = 1. \end{cases} \tag{2.7.9} \]
Therefore, 0 and 1 are the stable fixed points while $\frac{1}{2}$ is unstable. We also see that
Hence, the iteration converges to 0 if the initial condition lies in $[0, \frac{1}{2})$, while it converges
to 1 if the initial condition lies in $(\frac{1}{2}, 1]$; see Figure 2.5. In fact, the basins of attraction
extend slightly beyond $[0, 1]$: since $3x^2 - 2x^3 = \frac{1}{2}$ also holds at $x = \frac{1 \pm \sqrt{3}}{2}$, the iteration
converges to 0 starting from $(\frac{1 - \sqrt{3}}{2}, \frac{1}{2})$ and to 1 starting from $(\frac{1}{2}, \frac{1 + \sqrt{3}}{2})$.
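The fixed point behaviour is easy to observe on scalars. The sketch below iterates the map $x \mapsto 3x^2 - 2x^3$ (the scalar form of the purification step, consistent with the derivative $6x - 6x^2$ above) from several starting values; the starting values and the iteration count are arbitrary choices for illustration.

```python
def f_mcw(x):
    """Scalar McWeeny map, x -> 3 x^2 - 2 x^3."""
    return 3.0 * x**2 - 2.0 * x**3

def iterate(x0, n_iter=60):
    """Apply the McWeeny map n_iter times starting from x0."""
    x = x0
    for _ in range(n_iter):
        x = f_mcw(x)
    return x

# Starting values on both sides of the unstable fixed point 1/2, plus two
# values slightly outside [0, 1].
for x0 in (-0.2, 0.0, 0.2, 0.49, 0.5, 0.51, 0.8, 1.0, 1.2):
    print(f"x0 = {x0:+.2f}  ->  {iterate(x0):.12f}")
```

Starting values close to 1/2 first move away from it at the linear rate 3/2 and then converge quadratically once they are near 0 or 1, which is why a small spectral gap translates into more purification steps.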
Given this, let us now consider how to use the purification to get the density matrix.
Note that $P$ shares the same eigenfunctions as $H$; thus this fits into the framework of
purification. We want to make all the eigenvalues of $H$ below the chemical potential $\mu$
converge to 1 and all the eigenvalues of $H$ above it converge to 0. Therefore, we would
like to start with an initial guess that suitably rescales the matrix $\mu - H$. Assuming
$\mathrm{spec}(H) \subset [\varepsilon_{\min}, \varepsilon_{\max}]$, we take
\[ P_0 = \frac{\alpha}{2} (\mu - H) + \frac{1}{2} I, \tag{2.7.11} \]
where
\[ \alpha = \min\left\{ (\varepsilon_{\max} - \mu)^{-1}, (\mu - \varepsilon_{\min})^{-1} \right\}. \tag{2.7.12} \]
The density matrix is then approximated by
\[ \mathbf{1}_{(-\infty, 0]}(x) = \frac{1}{2} (1 - \mathrm{sgn}(x)). \tag{2.7.15} \]
For a Hermitian matrix $A$ with a spectrum in $(-1, 1)$, the matrix sign function $\mathrm{sgn}(A)$
can be computed using Newton’s iteration, applied to finding the root of
\[ g(X) = X^2 - I \tag{2.7.16} \]
\[ h(Y) = Y^{-1} - X. \tag{2.7.18} \]
Combining one step of Schulz iteration (2.7.19) with (2.7.17), we arrive at the
Newton–Schulz algorithm for the matrix sign function:
\[ X_{k+1} = \frac{1}{2} \left( X_k + 2 X_k - X_k X_k^2 \right) = \frac{1}{2} X_k \left( 3I - X_k^2 \right). \tag{2.7.20} \]
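A minimal sketch of the Newton–Schulz iteration (2.7.20) follows. The test matrix is built with a prescribed spectrum inside $(-1, 1)$ so that the exact sign function is available for comparison; the matrix, the fixed iteration count, and the function name are assumptions of this example.

```python
import numpy as np

def sign_newton_schulz(A, n_iter=60):
    """Matrix sign function via X_{k+1} = X_k (3 I - X_k^2) / 2, X_0 = A.

    Uses only matrix-matrix products; assumes A is Hermitian with spectrum
    in (-1, 1) and no eigenvalue equal to zero.
    """
    identity = np.eye(A.shape[0])
    X = A.copy()
    for _ in range(n_iter):
        X = 0.5 * X @ (3.0 * identity - X @ X)
    return X

# Symmetric test matrix with prescribed eigenvalues (illustrative).
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
eigs = np.array([-0.9, -0.5, -0.1, 0.2, 0.6, 0.95])
A = Q @ np.diag(eigs) @ Q.T

S = sign_newton_schulz(A)
S_exact = Q @ np.diag(np.sign(eigs)) @ Q.T
sign_error = np.abs(S - S_exact).max()

# Relation (2.7.15): the spectral projector onto the negative eigenvalues.
projector = 0.5 * (np.eye(6) - S)
print("error in sgn(A):", sign_error)
print("trace of projector:", np.trace(projector))
```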
Then, using the relation (2.7.15), the initial guess for McWeeny’s purification becomes
\[ P_0 = \frac{1}{2} I - \frac{\alpha}{2} (H - \mu) \]
with the iteration
\[ I - 2 P_{k+1} = \frac{1}{2} (I - 2 P_k) \left( 3I - (I - 2 P_k)^2 \right), \tag{2.7.21} \]
or equivalently
\[ P_{k+1} = 3 P_k^2 - 2 P_k^3. \tag{2.7.22} \]
This is exactly McWeeny’s purification method.
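Putting (2.7.11), (2.7.12), and (2.7.22) together gives the following sketch of McWeeny's purification. For simplicity the spectral bounds $\varepsilon_{\min}$, $\varepsilon_{\max}$ are taken from a full diagonalization, which defeats the purpose in practice (one would use cheap estimates instead); the test matrix, the chemical potential placed inside the gap, and the fixed iteration count are illustrative assumptions.

```python
import numpy as np

def mcweeny_density_matrix(H, mu, n_iter=60):
    """Zero temperature density matrix by McWeeny purification.

    Assumes mu lies in a spectral gap of H, strictly between eps_min and
    eps_max.
    """
    # Spectral bounds; obtained by diagonalization here only for brevity.
    eps = np.linalg.eigvalsh(H)
    alpha = min(1.0 / (eps[-1] - mu), 1.0 / (mu - eps[0]))      # (2.7.12)
    identity = np.eye(H.shape[0])
    P = 0.5 * alpha * (mu * identity - H) + 0.5 * identity      # (2.7.11)
    for _ in range(n_iter):
        P2 = P @ P
        P = 3.0 * P2 - 2.0 * P2 @ P                             # (2.7.22)
    return P

# Symmetric test matrix with a gap between -0.3 and 0.4 (illustrative).
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
eigs = np.array([-1.0, -0.6, -0.3, 0.4, 0.8, 1.5])
H = Q @ np.diag(eigs) @ Q.T
mu = 0.05

P = mcweeny_density_matrix(H, mu)
P_exact = Q @ np.diag((eigs < mu).astype(float)) @ Q.T
purification_error = np.abs(P - P_exact).max()
print("max entrywise error:", purification_error)
print("number of electrons:", np.trace(P))
```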
Density matrix-based algorithms provide a possible means of reducing the compu-
tational complexity for the evaluation of the Kohn–Sham map. Note that in McWeeny’s
purification method, we only need to perform matrix-matrix multiplication operations.
Hence if each matrix can be approximated by a sparse matrix, the computational cost
may be significantly reduced. This is indeed the case for systems with a finite gap,
where the magnitude of the elements of the density matrix can decay rapidly along the
off-diagonal direction. This is often referred to as the near-sightedness principle in
physics literature [45]. Mathematically, the near-sightedness principle can be explained
using decay properties of Green’s functions, which we shall discuss in section 3.5.
For systems obeying the near-sightedness principle, the number of nonzero elements
of the density matrix only increases linearly with respect to $N$. Therefore, with
proper implementation, all matrix-matrix multiplication operations can be carried out
with $O(N)$ cost, which leads to a linear scaling method. There is a rich body of litera-
ture on linear scaling algorithms that has been developed in the past two decades and is
still being actively developed today. We refer readers to the review papers [30, 10] on
this topic.
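The off-diagonal decay behind linear scaling methods can be observed on a toy model. The sketch below uses a one-dimensional tight-binding chain with nearest-neighbor hopping and alternating on-site energies $\pm\delta$, which has a gap of at least $2\delta$ around zero; this model, its parameters, and the threshold $10^{-6}$ are assumptions chosen for illustration and do not come from the text.

```python
import numpy as np

n = 200          # number of sites (illustrative)
delta = 1.0      # staggered on-site energy; the gap is at least 2 * delta

# Tight-binding Hamiltonian: hopping -1 between neighbors, on-site +/- delta.
H = (np.diag(delta * (-1.0) ** np.arange(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1))

eps, C = np.linalg.eigh(H)
n_occ = n // 2                                  # half filling
P = C[:, :n_occ] @ C[:, :n_occ].T               # zero temperature density matrix
gap = eps[n_occ] - eps[n_occ - 1]

# Largest entry of P at a given distance |i - j| from the diagonal.
i, j = np.indices(P.shape)
dist = np.abs(i - j)
for d in (0, 5, 10, 20, 40):
    print(f"|i - j| = {d:3d}   max |P_ij| = {np.abs(P[dist == d]).max():.3e}")

far_max = np.abs(P[dist >= 50]).max()
kept = np.count_nonzero(np.abs(P) > 1e-6)
print("gap:", gap)
print("entries above 1e-6:", kept, "of", n * n)
```

Because the number of entries above a fixed threshold per row stays bounded as the chain grows, truncating $P$ to a banded sparse matrix is what allows matrix-matrix products at $O(N)$ cost.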
\[ P = f_\beta(H - \mu). \tag{2.7.23} \]
The right-hand side is a matrix function with respect to the Hamiltonian matrix $H$.
Instead of diagonalizing $H$ and evaluating the matrix function using the eigen-decomposition,
the basic idea of FOE is to expand the Fermi–Dirac function $f_\beta(\cdot)$ into an
$m$-term expansion as
\[ f_\beta(\varepsilon) \approx f_{\beta,m}(\varepsilon) = \sum_{n=1}^{m} g_n(\varepsilon). \tag{2.7.24} \]
\[ f_\beta(H - \mu) \approx f_{\beta,m}(H - \mu) = \sum_{n=1}^{m} g_n(H - \mu). \tag{2.7.25} \]
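\[ \left( f_\beta(H - \mu) - f_{\beta,m}(H - \mu) \right) |v\rangle = \sum_i \left( f_\beta(\varepsilon_i - \mu) - f_{\beta,m}(\varepsilon_i - \mu) \right) |\psi_i\rangle \langle \psi_i | v \rangle. \tag{2.7.27} \]
Since the eigenfunctions $\{\psi_i\}$ are orthonormal, taking the squared 2-norm gives
\[ \begin{aligned} \left\| \left( f_\beta(H - \mu) - f_{\beta,m}(H - \mu) \right) |v\rangle \right\|_2^2 &= \sum_i \left| f_\beta(\varepsilon_i - \mu) - f_{\beta,m}(\varepsilon_i - \mu) \right|^2 \left| \langle \psi_i | v \rangle \right|^2 \\ &\le \sup_i \left| f_\beta(\varepsilon_i - \mu) - f_{\beta,m}(\varepsilon_i - \mu) \right|^2 \sum_i \left| \langle \psi_i | v \rangle \right|^2 \\ &\le \sup_{\varepsilon \in \mathrm{spec}(H - \mu)} \left| f_\beta(\varepsilon) - f_{\beta,m}(\varepsilon) \right|^2 \| v \|_2^2. \end{aligned} \tag{2.7.28} \]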
The equation above can be rewritten as
\[ \left\| f_\beta(H - \mu) - f_{\beta,m}(H - \mu) \right\|_2 \le \left\| f_\beta(\cdot) - f_{\beta,m}(\cdot) \right\|_\infty, \tag{2.7.29} \]
where the left-hand side is the operator norm for matrices, and the right-hand side is
the $L^\infty$ norm for scalar functions. Thus the error of the FOE (2.7.25) will be small as
long as the error of the corresponding expansion (2.7.24) for scalar functions is small
in the $L^\infty$ norm.
One example of FOE is the expansion of the Fermi–Dirac function into polynomials:
\[ f_\beta(\varepsilon) \approx \sum_{n=1}^{m} c_n \varepsilon^{n-1}. \tag{2.7.30} \]
Note that each term of (2.7.31) is simply a matrix power $(H - \mu)^n$, which can be
evaluated using only matrix-matrix multiplication recursively, without diagonalizing
the matrix $H$. In order to implement the FOE (2.7.31) for a high-order polynomial, it is
more efficient and stable to expand $f_\beta$ using Chebyshev polynomials.
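The following sketch illustrates a Chebyshev-based FOE together with the error bound (2.7.29). The Chebyshev coefficients are obtained by interpolation with NumPy's `Chebyshev.interpolate`, and the matrix polynomial is evaluated with the three-term recurrence using only matrix-matrix products. The test matrix, $\beta$, $\mu$, the degree, and the use of exact spectral bounds are illustrative assumptions; practical codes estimate the bounds and choose the degree from $\beta$ and the spectral width.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def fermi_dirac(x, beta):
    # Overflow-free form of 1 / (1 + exp(beta * x)).
    return 0.5 * (1.0 - np.tanh(0.5 * beta * np.asarray(x, dtype=float)))

def foe_chebyshev(H, mu, beta, degree, e_min, e_max):
    """Approximate f_beta(H - mu) by a Chebyshev expansion of given degree >= 1."""
    a, b = e_min - mu, e_max - mu                     # interval containing spec(H - mu)
    cheb = Chebyshev.interpolate(fermi_dirac, degree, domain=[a, b], args=(beta,))
    identity = np.eye(H.shape[0])
    X = (2.0 * (H - mu * identity) - (a + b) * identity) / (b - a)   # spectrum in [-1, 1]
    T_prev, T_curr = identity, X
    P = cheb.coef[0] * T_prev + cheb.coef[1] * T_curr
    for c in cheb.coef[2:]:
        T_prev, T_curr = T_curr, 2.0 * X @ T_curr - T_prev           # T_{k+1} = 2 X T_k - T_{k-1}
        P = P + c * T_curr
    return P, cheb

# Symmetric test matrix with known spectrum (illustrative).
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((10, 10)))
eigs = np.linspace(-1.0, 1.0, 10)
H = Q @ np.diag(eigs) @ Q.T
mu, beta, degree = 0.1, 10.0, 60

P_foe, cheb = foe_chebyshev(H, mu, beta, degree, eigs[0], eigs[-1])
P_exact = Q @ np.diag(fermi_dirac(eigs - mu, beta)) @ Q.T

# Left-hand side of (2.7.29): operator norm of the matrix error.
op_err = np.linalg.norm(P_foe - P_exact, 2)
# Right-hand side: sup norm of the scalar error, sampled on a fine grid
# that also contains the shifted eigenvalues.
grid = np.concatenate([np.linspace(eigs[0] - mu, eigs[-1] - mu, 2001), eigs - mu])
sup_err = np.abs(fermi_dirac(grid, beta) - cheb(grid)).max()
print("operator norm error:", op_err)
print("scalar sup error   :", sup_err)
```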
Aside from polynomial expansion, another option is to approximate the Fermi–Dirac
function using rational functions. Rational functions can consist of terms with
simple poles of the form $(\varepsilon - z)^{-1}$, as well as higher-order poles of the form $(\varepsilon - z)^{-n}$
with $n > 1$. It turns out that the simple pole expansion, or just the pole expansion as in
(2.5.15), achieves the best balance between efficiency and accuracy for approximating
meromorphic matrix functions such as the Fermi–Dirac function [54].
Each term in the pole expansion corresponds to a matrix inverse, or Green’s function
$(z_l - H)^{-1}$, which can be evaluated directly without diagonalizing the matrix $H$.
Equation (2.5.15) converts the problem of computing $P$ to the problem of computing
$m$ Green’s functions. In order to find the Kohn–Sham map, we do not need the entire
density matrix $P$ but only the electron density, which corresponds to the diagonal of $P$
(again for simplicity we assume that the real space discretization is used). This amounts
to the question of finding the diagonal of a Green’s function. Note that even if $H$ is a
sparse matrix, the matrix inverse $G_l = (z_l - H)^{-1}$ can be a fully dense matrix. One direct
method is to first evaluate each Green’s function and extract its diagonal elements.
However, when $H$ is a sparse matrix, the computation of diagonal entries, and, more
generally, the entries of $G_l$ corresponding to the sparsity pattern of $H$, can be evaluated
much more efficiently by means of the selected inversion method [24, 53, 56, 38].
For simplicity, let $A$ be a sparse, symmetric (real or complex), and nonsingular
matrix. The standard approach for computing $A^{-1}$ is to first decompose $A$ as
\[ A = L D L^\top, \tag{2.7.32} \]
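\[ A = \begin{pmatrix} \alpha & b^\top \\ b & \tilde{A} \end{pmatrix}. \tag{2.7.34} \]
\[ A = \begin{pmatrix} 1 & \\ \ell & I \end{pmatrix} \begin{pmatrix} \alpha & \\ & \tilde{A} - b b^\top / \alpha \end{pmatrix} \begin{pmatrix} 1 & \ell^\top \\ & I \end{pmatrix}, \tag{2.7.35} \]
where $\ell^{(1)} = \ell = b/\alpha$, yields the final $L$ factor. At this last step the matrix in the
middle, which is the $D$ matrix, becomes diagonal.
From (2.7.35), $A^{-1}$ can be expressed by
\[ A^{-1} = \begin{pmatrix} \alpha^{-1} + \ell^\top S^{-1} \ell & -\ell^\top S^{-1} \\ -S^{-1} \ell & S^{-1} \end{pmatrix}. \tag{2.7.36} \]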
This expression suggests that once $\alpha$ and $\ell$ are known, the task of computing $A^{-1}$ can
be reduced to that of computing $S^{-1}$. Because a sequence of Schur complements is
produced recursively in the $LDL^\top$ factorization of $A$, the computation of $A^{-1}$ can be
organized in a recursive fashion as well. Clearly, the reciprocal of the last entry of $D$
is the $(n, n)$th entry of $A^{-1}$. Starting from this entry, which is also the inverse of the $1 \times 1$
Schur complement produced in the $(n-1)$th step of the $LDL^\top$ factorization procedure, we
can construct the inverse of the $2 \times 2$ Schur complement produced at the $(n-2)$th step
of the factorization procedure using the recipe given by (2.7.36). This $2 \times 2$ matrix is
the trailing $2 \times 2$ block of $A^{-1}$. As we proceed from the lower right corner of $L$ and
$D$ towards their upper left corner, more and more elements of $A^{-1}$ are recovered. The
pole expansion and selected inversion (PEXSI) method [54, 56, 38] combines the pole
expansion and the selected inversion and evaluates the Kohn–Sham map without computing
any eigenvalues or eigenfunctions. The selected inversion method is an exact method
if exact arithmetic is used, i.e., the only error in the selected inversion method is due to
round-off errors. Hence the accuracy of the PEXSI method is determined by the pole
expansion, which can be systematically improved by increasing the number of poles $m$.
The computational scaling of PEXSI depends only on the number of nonzero elements
in the Cholesky factor $L$. More specifically, the complexity is $O(N)$ for quasi-one-dimensional
systems (such as nanotubes), $O(N^{1.5})$ for quasi-two-dimensional systems
(such as surfaces), and $O(N^2)$ for three-dimensional bulk systems.
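The recursion based on (2.7.36) can be written down compactly for dense matrices. The sketch below first computes an $LDL^\top$ factorization without pivoting and then sweeps from the lower right corner to the upper left, recovering $A^{-1}$ one row and column at a time. This is a dense illustration only: it computes every entry of the inverse, whereas an actual selected inversion code restricts the same recursion to the sparsity pattern of $L$. The function names and the test matrix are assumptions of this example.

```python
import numpy as np

def ldl_factor(A):
    """LDL^T factorization without pivoting (assumes nonzero pivots)."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    S = np.array(A, dtype=float)      # trailing block holds the Schur complement
    for k in range(n):
        d[k] = S[k, k]                # alpha
        b = S[k + 1:, k].copy()
        L[k + 1:, k] = b / d[k]       # ell = b / alpha
        S[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], b)   # A~ - b b^T / alpha
    return L, d

def inverse_from_ldl(L, d):
    """Recover A^{-1} from the factors by the recursion (2.7.36)."""
    n = d.size
    A_inv = np.zeros((n, n))
    A_inv[n - 1, n - 1] = 1.0 / d[n - 1]
    for k in range(n - 2, -1, -1):
        ell = L[k + 1:, k]
        S_inv = A_inv[k + 1:, k + 1:]          # inverse of the Schur complement
        col = -S_inv @ ell                     # -S^{-1} ell
        A_inv[k + 1:, k] = col
        A_inv[k, k + 1:] = col                 # symmetric counterpart
        A_inv[k, k] = 1.0 / d[k] - ell @ col   # 1/alpha + ell^T S^{-1} ell
    return A_inv

# Symmetric positive definite tridiagonal test matrix (illustrative).
n = 6
A = 4.0 * np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

L, d = ldl_factor(A)
A_inv = inverse_from_ldl(L, d)
print("diagonal of the inverse:", np.diag(A_inv))
```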
\[ |\Omega^*| = \frac{(2\pi)^3}{|\Omega|}. \]
Supercell formulation
Although the size of a periodic system is infinite, it is in fact very helpful, at least
heuristically, to think that a periodic system is nothing other than a “giant molecule.”
In other words, we may approximate the periodic system by a finite-sized system with
$N_1^\ell, N_2^\ell, N_3^\ell$ unit cells along the Bravais lattice vectors $a_1, a_2, a_3$, respectively. This
fictitious system is often called a supercell, denoted by $\Omega^\ell$. The total number of unit
cells in the supercell is thus $N^\ell = N_1^\ell N_2^\ell N_3^\ell$.
Assume the number of electrons per unit cell is $N$, so that the total number of electrons
in the supercell is $N N^\ell$. Then Kohn–Sham DFT can be expressed using $N N^\ell$ single
particle orbitals $\{\psi_i\}_{i=1}^{N N^\ell}$ satisfying the orthonormality condition. Each orbital should
satisfy a periodic boundary condition on the supercell, i.e.,
\[ \psi_i(r + a_\alpha N_\alpha^\ell) = \psi_i(r) \quad \forall\, r \in \Omega^\ell,\ \alpha = 1, 2, 3. \]
This particular periodic boundary condition is called the Born–von Karman boundary
condition. Correspondingly the electron density $\rho(r)$ can also be periodically extended
to $\mathbb{R}^3$. Hence the Kohn–Sham energy functional over the supercell can be written as
before:
\[ E^\ell[\{\psi_i\}] = \frac{1}{2} \int_{\Omega^\ell} \sum_i |\nabla \psi_i(r)|^2 \, dr + \int_{\Omega^\ell} V_{\mathrm{ext}}(r) \rho(r) \, dr + \frac{1}{2} \int_{\Omega^\ell} \int_{\mathbb{R}^3} \frac{\rho(r) \rho(r')}{|r - r'|} \, dr \, dr' + E_{\mathrm{xc}}^\ell[\rho] + E_{\mathrm{II}}^\ell[\{R_I\}]. \tag{2.8.1} \]
Here the exchange-correlation energy functional and the nuclei repulsion energy also
carry the superscript $\ell$ to reflect the fact that they are defined with respect to the supercell.
The ground state energy in the supercell can be obtained variationally as
\[ \begin{aligned} \min_{\{\psi_i\}_{i=1}^{N N^\ell}} \quad & E^\ell[\{\psi_i\}] \\ \text{s.t.} \quad & \int_{\Omega^\ell} \psi_i^*(r) \psi_j(r) \, dr = \delta_{i,j}, \\ & \rho(r) = \sum_i |\psi_i(r)|^2 \end{aligned} \tag{2.8.2} \]
which is the periodic boundary condition in the unit cell $\Omega$. The normalization factor
$1/\sqrt{N^\ell}$ is introduced for convenience of later discussion. For now we assume each $k$
point has the same number of orbitals, i.e., $n = 1, \ldots, N$. As will be seen later, this
corresponds to the setup of insulating systems at zero temperature.
For a fixed unit cell, the Born–von Karman boundary condition imposes extra conditions
on the phase factor as
\[ e^{i k \cdot (r + a_\alpha N_\alpha^\ell)} = e^{i k \cdot r}, \quad \forall\, r \in \Omega^\ell,\ \alpha = 1, 2, 3, \]
or equivalently,
\[ e^{i N_\alpha^\ell k \cdot a_\alpha} = 1, \quad \alpha = 1, 2, 3. \]
Therefore $k$ cannot be an arbitrary point in $\Omega^*$. Without loss of generality we assume
that $N_1^\ell$, $N_2^\ell$, and $N_3^\ell$ are all even numbers and define the set
\[ \mathcal{K}^\ell = \left\{ \sum_{\alpha=1}^{3} \frac{m_\alpha}{N_\alpha^\ell} b_\alpha \;\middle|\; m_\alpha = -\frac{N_\alpha^\ell}{2} + 1, \ldots, \frac{N_\alpha^\ell}{2},\ \alpha = 1, 2, 3 \right\} \subset \Omega^*. \]
Then the Born–von Karman boundary condition requires $k \in \mathcal{K}^\ell$. Note that the cardinality
$|\mathcal{K}^\ell| = N^\ell$, so the total number of admissible $k$ points is equal to the number of
unit cells in the supercell $\Omega^\ell$.
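The admissible $k$ points can be generated directly from the definition of the set above. In the sketch below the reciprocal lattice vectors $b_\alpha$ are computed from the Bravais lattice vectors through the standard relation $a_\alpha \cdot b_\beta = 2\pi \delta_{\alpha\beta}$; the particular lattice, the supercell sizes, and the function name are illustrative assumptions.

```python
import itertools
import numpy as np

def bvk_k_points(a, n_cells):
    """k points compatible with the Born-von Karman boundary condition.

    a       : 3x3 array whose rows are the Bravais lattice vectors a_alpha
    n_cells : (N1, N2, N3), the (even) numbers of unit cells in the supercell
    """
    # Rows of b are the reciprocal vectors: a_alpha . b_beta = 2 pi delta.
    b = 2.0 * np.pi * np.linalg.inv(a).T
    ranges = [range(-n // 2 + 1, n // 2 + 1) for n in n_cells]
    k_points = [
        sum(m[alpha] / n_cells[alpha] * b[alpha] for alpha in range(3))
        for m in itertools.product(*ranges)
    ]
    return np.array(k_points), b

# Illustrative lattice (face-centered cubic with unit lattice constant).
a = 0.5 * np.array([[0.0, 1.0, 1.0],
                    [1.0, 0.0, 1.0],
                    [1.0, 1.0, 0.0]])
n_cells = (2, 4, 2)

k_points, b = bvk_k_points(a, n_cells)
print("number of k points:", len(k_points))

# Check exp(i N_alpha k . a_alpha) = 1 for every k point and direction.
phases = np.array([np.exp(1j * n_cells[al] * (k_points @ a[al])) for al in range(3)])
print("max deviation of the phase from 1:", np.abs(phases - 1.0).max())
```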
Here we have used the periodicity of $u_{n,k}$ and the fact that $k$ and $k'$ differ by less
than one lattice vector along any direction in the reciprocal space. Hence the
orthonormality condition for $\psi_{n,k}$ in the supercell can be equivalently written as that
for $u_{n,k}$ in the unit cell, i.e.,
\[ \int_\Omega u_{n',k}^*(r) u_{n,k}(r) \, dr = \delta_{n',n}. \tag{2.8.6} \]
We stress that $u_{n,k}$ and $u_{n',k'}$ are in general not orthogonal to each other when $k \ne k'$.
Now let us rewrite the energy per unit cell in (2.8.1) in terms of $\{u_{n,k}\}$. First, the
kinetic energy part per unit cell is (note the normalization factor on both sides of the
equation)
\[ \frac{1}{N^\ell} \int_{\Omega^\ell} \sum_{n,k} \frac{1}{2} |\nabla \psi_{n,k}(r)|^2 \, dr = \frac{1}{N^\ell} \sum_k \int_\Omega \sum_n \frac{1}{2} |(\nabla_r + i k) u_{n,k}(r)|^2 \, dr. \]
Since the set $\mathcal{K}^\ell$ is a set of uniform grid points discretizing the Brillouin zone, the
summation over $k$ can be seen as a Riemann sum of the integration in $\Omega^*$, i.e., there
exists the limit
\[ \frac{|\Omega^*|}{N^\ell} \sum_k f(k) \xrightarrow{N^\ell \to \infty} \int_{\Omega^*} f(k) \, dk \tag{2.8.7} \]
for any continuous function $f$. Hence in the thermodynamic limit, the kinetic energy
per unit cell is
\[ \frac{1}{|\Omega^*|} \int_{\Omega^*} \int_\Omega \sum_n \frac{1}{2} |(\nabla_r + i k) u_{n,k}(r)|^2 \, dr \, dk. \tag{2.8.8} \]
which is a periodic function in the unit cell $\Omega$. Since all other terms in the energy
functional depend only on the electron density, the energy functional per unit cell can
\[ \qquad + \frac{1}{2} \int_\Omega \int_{\mathbb{R}^3} \frac{\rho(r) \rho(r')}{|r - r'|} \, dr \, dr' + E_{\mathrm{xc}}[\rho] + E_{\mathrm{II}}[\{R_I\}]. \tag{2.8.10} \]
Note that all integration ranges in the real space are restricted to the unit cell $\Omega$, except
for the long-range Coulomb interaction, where the $r'$ variable includes the contribution
to the electrostatic interaction from the electron density within the unit cell $\Omega$ as well
as all its periodic images in $\mathbb{R}^3$. Since $\rho$ is a periodic function, the integration over $\mathbb{R}^3$
leads to a divergent Hartree potential. However, such divergence will be canceled by
the electron-nuclei interaction in $V_{\mathrm{ext}}$ and the nuclei-nuclei repulsion in $E_{\mathrm{II}}[\{R_I\}]$. The
total electrostatic energy from the nuclei and the electron density is well defined. This is
true at least for charge neutral systems [50]. The exchange-correlation functional $E_{\mathrm{xc}}[\rho]$
and the nuclei repulsion energy $E_{\mathrm{II}}$ should be understood as the energies per unit cell
as well. Then Kohn–Sham DFT for periodic systems in the thermodynamic limit can
be formulated variationally as
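\[ \begin{aligned} \min_{\{u_{n,k}\}} \quad & E[\{u_{n,k}\}] \\ \text{s.t.} \quad & \int_\Omega u_{n',k}^*(r) u_{n,k}(r) \, dr = \delta_{n',n}, \\ & \rho(r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_n |u_{n,k}(r)|^2 \, dk. \end{aligned} \tag{2.8.11} \]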
subject to the orthonormality condition and the self-consistency condition for the electron
density. The effective potential is defined as
\[ V_{\mathrm{eff}}[\rho](r) = V_{\mathrm{ext}}(r) + \int_{\mathbb{R}^3} \frac{\rho(r')}{|r - r'|} \, dr' + V_{\mathrm{xc}}[\rho](r). \tag{2.8.13} \]
In particular, if $V_{\mathrm{ext}}$ and $\rho$ are periodic functions with respect to the Bravais lattice,
then so is $V_{\mathrm{eff}}[\rho](r)$. This justifies the Bloch decomposition in the previous discussion.
Compared to the supercell formulation, the formulation given in (2.8.11) and (2.8.12)
has major computational advantages. After discretizing the Brillouin zone into $N^\ell$
points, each $k$ point corresponds to a decoupled eigenvalue problem defined only on the
unit cell. The computational cost with respect to $N^\ell$ is thus reduced from cubic scaling
to linear scaling. This enables simulation with a large number of $k$ points even with
a modest amount of computational resources. The collection of all eigenvalues $\{\varepsilon_{n,k}\}$
forms the band structure modeled by Kohn–Sham DFT.
Now we return to the orbitals $\{\psi_{n,k}\}$ before the Bloch decomposition. Although
each $\psi_{n,k}$ can be normalized within any finite-sized supercell, such normalized $\psi_{n,k}$
would converge weakly to zero in the thermodynamic limit. This is because $\psi_{n,k}$ cannot
belong to $L^2(\mathbb{R}^3)$ and is only a generalized eigenfunction of $H$ in $\mathbb{R}^3$. This is similar
to the fact that planewaves can be seen as generalized eigenfunctions of the momentum
operator in section 1.2. For this purpose, with some abuse of notation, we may define
the generalized eigenfunctions using the Bloch decomposition, still denoted by $\{\psi_{n,k}\}$
as (for all $r \in \mathbb{R}^3$)
\[ \psi_{n,k}(r) = e^{i k \cdot r} u_{n,k}(r), \quad k \in \Omega^*. \tag{2.8.14} \]
Compared to (2.8.3), we have dropped the normalization factor. The resulting $\{\psi_{n,k}\}$
satisfy the orthonormality condition in the distribution sense:
\[ \int_{\mathbb{R}^3} \psi_{n',k'}^*(r) \psi_{n,k}(r) \, dr = |\Omega^*| \, \delta_{n',n} \, \delta(k' - k). \tag{2.8.15} \]
This implies the normalization condition when integrated over the Brillouin zone:
\[ \frac{1}{|\Omega^*|} \int_{\Omega^*} \int_{\mathbb{R}^3} \psi_{n',k'}^*(r) \psi_{n,k}(r) \, dr \, dk = \delta_{n',n}. \tag{2.8.16} \]
Another benefit of the choice of (2.8.14) is that the electron density can be written using
$\{\psi_{n,k}\}$ as well:
\[ \rho(r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_n |\psi_{n,k}(r)|^2 \, dk. \]
Kohn–Sham DFT requires that orbitals with low energies must be occupied before
those with higher energies. So far we have assumed that the number of occupied orbitals
$N$ holds uniformly for all $k$. This is only true when the following band isolation
condition is satisfied:
\[ \varepsilon_{N,k} < \varepsilon_{N+1,k'} \quad \forall\, k, k' \in \Omega^*. \tag{2.8.17} \]
A system satisfying the isolation condition is called an insulating system, which has a
positive band gap defined as
\[ \varepsilon_{\mathrm{gap}} = \inf_{k \in \Omega^*} \varepsilon_{N+1,k} - \sup_{k \in \Omega^*} \varepsilon_{N,k}. \tag{2.8.18} \]
On the other hand, when the isolation condition is violated, the system is called a metallic
system and the band gap $\varepsilon_{\mathrm{gap}}$ is defined to be 0. In materials science, another commonly
used term is semiconducting system, which is an insulating system but with a relatively
small band gap (typically < 1 eV). Hence this terminology only holds in a quantitative
sense.
Figure 2.6 shows the band structures of a bulk silicon system and a bulk aluminum
system, respectively, and the calculations are performed using the Quantum ESPRESSO
[29] software package. Bulk silicon has a small but finite band gap (around 0.6 eV), so
it is an insulating system (more specifically, a semiconductor). Bulk aluminum has zero
band gap and the Fermi energy passes through the band structure, so it is a metallic
system.
For metallic systems, the orbitals with lower energies must be occupied first, and
the number of occupied orbitals per $k$ point will be inhomogeneous across the Brillouin
zone. The treatment of insulating and metallic systems can be unified by introducing
the chemical potential $\mu$, which defines a set of occupation numbers
\[ f_{n,k} := f_\infty(\varepsilon_{n,k} - \mu) = \begin{cases} 1, & \varepsilon_{n,k} \le \mu, \\ 0, & \varepsilon_{n,k} > \mu. \end{cases} \]
Then, after solving the Kohn–Sham equations (2.8.12), the electron density is defined
using the occupation number as
\[ \rho(r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_n f_{n,k} |u_{n,k}(r)|^2 \, dk. \tag{2.8.19} \]
The chemical potential $\mu$ should be adjusted self-consistently to fulfill the condition that
each unit cell has $N$ electrons, i.e.,
\[ \int_\Omega \rho(r) \, dr = N. \tag{2.8.20} \]
Hence again the chemical potential $\mu$ is the Lagrange multiplier associated with $N$.
From this perspective, given the band isolation condition, the choice of the occupation
numbers for insulating systems becomes
\[ f_{n,k} = \begin{cases} 1, & 1 \le n \le N, \\ 0, & n > N. \end{cases} \]
Finally, the energy functional for a periodic system can also be generalized to the finite
temperature case by introducing an entropy term for the occupation number $f_{n,k}$. We
will omit the details here.
The case when $N_\alpha$ is an odd number can be defined in a similar way. We have introduced
a shift vector $s = (s_1, s_2, s_3)^\top$. A common choice of the shift vector is
$s = (0, 0, 0)^\top$, which reproduces $\mathcal{K}^\ell$. Another one is $(s_1, s_2, s_3)^\top = \frac{1}{2} (1, 1, 1)^\top$.
This choice allows the discretization points to be kept away from the high symmetry
points in the Brillouin zone, which is observed to have numerical benefits in practice,
especially when a small number of $k$ points is used. We can see that this choice of the
shift corresponds to $\psi_{n,k}$ satisfying the Born–von Karman boundary condition over a
supercell that has $2N_1^\ell, 2N_2^\ell, 2N_3^\ell$ unit cells along each direction, respectively. More generally, if
one component of the shift corresponds to an irrational number, the discretization will
not correspond to any supercell with the Born–von Karman boundary condition. Hence
the discretization of the Brillouin zone offers a more general perspective for treating
periodic systems than the supercell approach even at the level of numerics. The grid
in (2.8.21) is called the Monkhorst–Pack grid [65] and is the most widely used scheme
for discretizing the Brillouin zone.
The Monkhorst–Pack grid is a set of uniform grid points and the corresponding
quadrature is the trapezoidal rule. For instance, the electron density can be computed as
\[ \rho(r) = \frac{1}{N^\ell} \sum_{k \in \mathcal{K}_s^\ell} \sum_n f_{n,k} |u_{n,k}(r)|^2. \tag{2.8.22} \]
For insulating systems, the integrand in (2.8.19) is smooth with respect to $k$ and is periodic
over the Brillouin zone. The convergence rate with respect to the refinement of the
discretization is at least superalgebraic. Hence for insulating systems the Monkhorst–Pack
grid converges very quickly, and very few $k$ points are needed to converge physical
quantities such as electron density and energy. However, for metallic systems,
$f_\infty(\varepsilon_{n,k} - \mu)$ effectively truncates the support of the integrand, which is no longer a
smooth function over the Brillouin zone. Therefore if the Monkhorst–Pack grid is used,
many more $k$ points are needed to converge.
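The contrast between the two convergence regimes can be seen in a one-dimensional toy computation. The sketch below averages two model integrands over the interval $[-\pi, \pi)$, standing in for a one-dimensional Brillouin zone, with a uniform grid: a smooth periodic function, and the same kind of function truncated by a step as for a metal. Both integrands and their exact averages are chosen for illustration and do not correspond to any physical band structure.

```python
import numpy as np

def bz_average(f, n_k):
    """Uniform-grid (trapezoidal) average of a periodic function over [-pi, pi)."""
    k = -np.pi + 2.0 * np.pi * np.arange(n_k) / n_k
    return f(k).mean()

def smooth(k):
    # Smooth and periodic, mimicking the integrand of an insulator.
    return 1.0 / (2.0 + np.cos(k))

def truncated(k):
    # Step-like occupation: the "band" cos(k) is occupied where cos(k) < 0.3.
    return (np.cos(k) < 0.3).astype(float)

exact_smooth = 1.0 / np.sqrt(3.0)                 # average of 1 / (2 + cos k)
exact_truncated = 1.0 - np.arccos(0.3) / np.pi    # occupied fraction of the zone

for n_k in (4, 8, 16, 32, 64):
    err_s = abs(bz_average(smooth, n_k) - exact_smooth)
    err_t = abs(bz_average(truncated, n_k) - exact_truncated)
    print(f"n_k = {n_k:3d}   smooth error = {err_s:.2e}   truncated error = {err_t:.2e}")
```

The error for the smooth integrand drops geometrically with the number of grid points, whereas the truncated integrand converges only slowly and erratically, which is the motivation for smearing or finite temperature occupations in calculations on metals.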
2.9 Localization
Boys localization and Wannier localization
The Kohn–Sham orbitals $\{\psi_i\}_{i=1}^{N}$ are eigenfunctions of a Hamiltonian matrix and are
generally delocalized across the entire system, i.e., with significant magnitude in large
portions of the computational domain. Nonetheless, if we apply a unitary rotation
$U \in \mathbb{C}^{N \times N}$ to the Kohn–Sham orbitals
\[ w_i = \sum_{j=1}^{N} \psi_j U_{ji}, \tag{2.9.1} \]
we have seen that the density matrix and hence the electron density is invariant. Hence
Kohn–Sham DFT (up to the fourth rung of exchange-correlation functionals) and the
Hartree–Fock theory are invariant to such gauge transformation as well. We may wonder
whether we can find the optimal unitary transformation so that the resulting set of
orthonormal functions $\{w_i\}_{i=1}^{N}$ has significant magnitude on only a small portion of the
computational domain. This is called localization. Localized representations of electronic
wavefunctions have a wide range of applications in quantum physics, chemistry,
and materials science. They require significantly less memory to be stored and are the
foundation of the so-called linear scaling methods [45, 30, 10] for solving quantum
problems. They can also be used to analyze chemical bonding in complex materials,
interpolate the band structure of crystals, accelerate ground and excited state electronic
structure calculations, and form reduced-order models for strongly correlated many-body
systems [61].
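For isolated systems, the localized representation can be identified through the Boys
localization procedure [28], which minimizes the following spread functional:
\[ \begin{aligned} \inf_U \quad & \Omega[\{w_i\}_{i=1}^{N}] = \sum_{i=1}^{N} \langle w_i | r^2 | w_i \rangle - \left( \langle w_i | r | w_i \rangle \right)^2 \\ \text{s.t.} \quad & w_i = \sum_{j=1}^{N} \psi_j U_{ji}, \quad U^* U = I_N. \end{aligned} \tag{2.9.2} \]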
where $\langle r \rangle_i := \langle w_i | r | w_i \rangle$. Hence the spread functional can be written compactly as
\[ \Omega[\{w_i\}_{i=1}^{N}] = \sum_{i=1}^{N} \langle w_i | (r - \langle r \rangle_i)^2 | w_i \rangle. \]
The functional $\Omega$ characterizes the total spatial spread of the rotated orbitals $\{w_i\}$ in
terms of the second moment around each center $\langle r \rangle_i$. A smaller spread value indicates
a more localized representation of the Kohn–Sham occupied subspace. Numerically,
the localization problem (2.9.2) can be solved as a constrained minimization problem.
Since isolated systems are surrounded by an infinite-sized vacuum, the occupied
orbitals always decay exponentially fast towards zero as |r| → ∞. Hence, qualitatively,
both the Kohn–Sham orbitals ψi and the localized orbitals wi decay exponentially as
|r| → ∞, and the localized representation only makes a quantitative difference. On the
other hand, for crystals the Kohn–Sham orbitals satisfy the Bloch boundary condition
on each unit cell, and hence have support on the macroscopic scale. One can show that
through a gauge transformation we can still obtain localized orbitals. For insulating
systems with a finite energy gap, we can even obtain exponentially localized orbitals.
Therefore, localized representation makes a qualitative difference. These functions are
called Wannier functions.
For periodic systems, if we rotate the set of generalized functions $\{\psi_{n,k}\}$ by an arbitrary
$k$-dependent unitary matrix $U(k) \in \mathbb{C}^{N\times N}$, we can define a new set of functions
\[
\tilde\psi_{n,k}(r) = \sum_{n'=1}^{N} \psi_{n',k}(r)\, U_{n',n}(k), \quad k \in \Omega^{*}. \tag{2.9.3}
\]
Here the set of matrices $\{U(k)\}_{k\in\Omega^{*}}$ is referred to as the gauge. For each $k$ point, the density
matrix $P(k)$ is gauge invariant,
\[
P(k) = \sum_{n=1}^{N} |\psi_{n,k}\rangle\langle\psi_{n,k}| = \sum_{n=1}^{N} |\tilde\psi_{n,k}\rangle\langle\tilde\psi_{n,k}|. \tag{2.9.4}
\]
For any choice of gauge, the Wannier functions for crystals are defined as [84]
\[
w_{n,R}(r) = \frac{1}{|\Omega^{*}|} \int_{\Omega^{*}} \tilde\psi_{n,k}(r)\, e^{-i k\cdot R}\, dk, \quad r \in \mathbb{R}^3,\ R \in \mathbb{L}. \tag{2.9.5}
\]
The Wannier functions $\{w_{n,R}\}$ are orthogonal to each other in $L^2(\mathbb{R}^3)$ (exercise) and
span the same space as the range of the total density matrix $P$. They are also translation
invariant, and $w_{n,R}(r) = w_{n,0}(r - R)$.
Due to the translational invariance property, the Wannier localization problem for
crystals is thus reduced to the problem of finding a gauge $\{U(k)\}$ such that $w_{n,0}$ is
localized, or equivalently, $\tilde\psi_{n,k}$ is smooth with respect to $k$. This can be done by
minimizing the spread functional similar to the Boys localization [62], i.e.,
\[
\Omega[\{w_{n,0}\}_{n=1}^{N}] = \sum_{n=1}^{N} \Big( \langle w_{n,0} | r^2 | w_{n,0} \rangle - \big(\langle w_{n,0} | r | w_{n,0} \rangle\big)^2 \Big). \tag{2.9.6}
\]
Here $w_{n,0}$ depends on the unitary gauge $U(k)$ through $\tilde\psi_{n,k}$ as in (2.9.5). Figure 2.7
shows one Wannier function for a silicon crystal calculated on an $8 \times 8 \times 8$ $k$-point grid.
The Wannier function is clearly localized in the real space.
The main idea of the SCDM (selected columns of the density matrix) method is to obtain
localized orbitals directly from columns of the density matrix $P = \Psi\Psi^{*}$.
When the real space discretization is used and a set of localized orbitals does exist,
an immediate consequence is that each column (and hence row) of the density matrix
P decays exponentially fast along the off-diagonal direction. In this sense, the density
matrix P is said to be localized, and selecting any linearly independent subset of N
of them will yield a localized basis for the span of Ψ. However, picking N random
columns of P may result in a poorly conditioned basis if, for example, there is too much
overlap between the selected columns. Therefore, we would like a means for choosing
a well-conditioned set of columns in the real space, denoted C = {c1 , c2 , . . . , cN }.
Intuitively we would expect such a basis to minimize the overlaps between columns
whenever possible.
This is algorithmically accomplished with a QR factorization with column pivoting
(QRCP) procedure [31]. Simply speaking, for a given matrix $A$, the QRCP procedure
seeks to compute a permutation matrix $\Pi$ such that the leading submatrices $(A\Pi)_{1,\ldots,k}$
for any applicable $k$ are as well conditioned as possible. In our setting, this means
we would ideally compute a QRCP factorization of the matrix $P$ to identify $N$ well-
conditioned columns from which we may construct a localized basis. However, this
would be very costly if the size of the matrix $P$ were much larger than its rank $N$.
Fortunately, we may equivalently compute the set $\mathcal{C}$ by employing a QRCP factorization
of $\Psi^{*}$. More specifically, we compute
\[
\Psi^{*} \Pi = Q \begin{bmatrix} R_1 & R_2 \end{bmatrix}, \tag{2.9.7}
\]
and the first $N$ columns of the permutation matrix $\Pi$ encode $\mathcal{C}$.
One simple implementation of the SCDM procedure for constructing an orthonor-
mal set of localized orbitals {wi } is as follows.
1. Perform QRCP for Ψ∗ using (2.9.7).
2. Evaluate W = ΨQ.
This algorithm does not require an initial guess, and the direct usage of Q as the gauge
matrix corresponds to a Cholesky–QR orthonormalization procedure [21], where the
Cholesky factorization is only performed implicitly. Numerical results indicate that
the resulting spread can be very similar to that of the optimized solution in the Boys
localization procedure. Although the formulation above is stated for isolated systems,
it can be generalized to crystals as well.
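The two steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the one-dimensional four-well Hamiltonian, the grid, and the helper names `scdm` and `spread` are hypothetical and not taken from the text, and the column pivoting is written out as a simple greedy loop rather than a library QRCP call.

```python
import numpy as np

def scdm(Psi):
    """Return (W, cols): orbitals W = Psi @ Q and the selected grid indices."""
    A = Psi.conj().T                      # N x n_grid matrix Psi^*
    N = A.shape[0]
    R = A.astype(complex)                 # working copy used only for pivoting
    cols = []
    for _ in range(N):
        # column pivoting: pick the column with the largest remaining norm
        j = int(np.argmax(np.sum(np.abs(R) ** 2, axis=0)))
        cols.append(j)
        q = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(q, q.conj() @ R)  # remove that direction from every column
    # the first N columns of Psi^* Pi equal Q R_1, so Q is their QR factor
    Q, _ = np.linalg.qr(A[:, cols])
    return Psi @ Q, cols

# Hypothetical toy model: four Gaussian wells on a 1D grid (Dirichlet boundary).
n, N = 200, 4
x = np.linspace(-10.0, 10.0, n)
h = x[1] - x[0]
V = -3.0 * sum(np.exp(-(x - c) ** 2) for c in (-4.5, -1.5, 1.5, 4.5))
H = (np.eye(n) / h**2
     - 0.5 / h**2 * (np.eye(n, k=1) + np.eye(n, k=-1))
     + np.diag(V))
_, vecs = np.linalg.eigh(H)
Psi = vecs[:, :N]                          # occupied orbitals as columns
W, cols = scdm(Psi)

def spread(F):
    """Discrete analogue of the spread functional (2.9.2) in one dimension."""
    m1 = np.sum(x[:, None] * np.abs(F) ** 2, axis=0)
    m2 = np.sum(x[:, None] ** 2 * np.abs(F) ** 2, axis=0)
    return float(np.sum(m2 - m1 ** 2))

print("spread of eigenfunctions:", spread(Psi))
print("spread of SCDM orbitals: ", spread(W))
```

Because $Q$ is unitary, `W` spans the same subspace as `Psi` and stays orthonormal; only the spread changes.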
then a necessary condition for identifying a minimizer is that the atomic force F I should
vanish for all atoms.
For a closed system with $M$ nuclei, the total energy for the nuclear degrees of freedom is
\[
E_{\mathrm{tot}} = \sum_{I=1}^{M} \frac{M_I \dot{R}_I^2}{2} + E(\{R_I\}_{I=1}^{M}), \tag{2.10.2}
\]
where $M_I$ is the mass of the $I$th nucleus. The dynamical properties of the nuclei can be
studied using Newton's equation
\[
M_I \ddot{R}_I(t) = -\frac{\partial E}{\partial R_I}(\{R_I(t)\}) = F_I(\{R_I(t)\}). \tag{2.10.3}
\]
This is called ab initio molecular dynamics (AIMD). Geometry optimization and molec-
ular dynamics remove the need for the construction of the empirical interatomic poten-
tial and have enabled a vast range of applications for the study of static and dynamic
properties in chemistry and materials science.
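As an illustration of how (2.10.3) is typically integrated in time, the sketch below uses the velocity Verlet scheme on a toy problem. Everything specific here is an assumption made for the example: the two "nuclei" move in one dimension, and the harmonic model energy in `potential` stands in for the Kohn–Sham energy $E(\{R_I\})$, whose forces would in practice come from an electronic structure calculation.

```python
import numpy as np

K_SPRING, D_EQ = 1.0, 1.5   # hypothetical model parameters (not from the text)

def potential(R):
    """Model energy E({R_I}) of two particles joined by a harmonic bond."""
    return 0.5 * K_SPRING * (R[1] - R[0] - D_EQ) ** 2

def force(R):
    """F_I = -dE/dR_I for the model energy above."""
    s = K_SPRING * (R[1] - R[0] - D_EQ)
    return np.array([s, -s])

def velocity_verlet(R, V, M, dt, n_steps):
    """Integrate M_I R_I'' = F_I with the velocity Verlet scheme."""
    F = force(R)
    for _ in range(n_steps):
        V = V + 0.5 * dt * F / M     # half kick
        R = R + dt * V               # drift
        F = force(R)                 # one force evaluation per step
        V = V + 0.5 * dt * F / M     # second half kick
    return R, V

def total_energy(R, V, M):
    """Discrete analogue of (2.10.2)."""
    return float(np.sum(0.5 * M * V ** 2) + potential(R))

M = np.array([1.0, 2.0])
R0 = np.array([0.0, 2.0])
V0 = np.zeros(2)
E_start = total_energy(R0, V0, M)
R1, V1 = velocity_verlet(R0, V0, M, dt=0.01, n_steps=2000)
E_end = total_energy(R1, V1, M)
print("energy drift:", E_end - E_start)
```

The scheme needs one force evaluation per time step, which is why the cost of evaluating the atomic force dominates in practice.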
In both geometry optimization and AIMD, the key ingredient is to evaluate the
atomic force. In three-dimensional space, such derivative information formally requires
3M independent electronic structure calculations, which would be prohibitively expen-
sive. Fortunately, the Hellmann–Feynman theorem allows us to evaluate the atomic
force with little added cost compared to a single electronic structure calculation.
\[
\left.\frac{\delta E[\psi, \lambda]}{\delta \psi}\right|_{\psi=\psi(\lambda)} = 0. \tag{2.10.10}
\]
Now, for the total energy in Kohn–Sham DFT, only the external potential $V_{\mathrm{ext}}$ and
the ion-ion interaction $E_{II}$ depend explicitly on the atomic configuration $\{R_I\}$.
Furthermore, the external potential can be generally decomposed into contributions from
each atom as
\[
V_{\mathrm{ext}}(r; \{R_I\}) = \sum_{I} V_{\mathrm{ext}}^{I}(r - R_I). \tag{2.10.11}
\]
Here µ is called the fictitious electron mass. E({ψi }; {RI }) is the energy functional in
Kohn–Sham DFT, while {ψi } may or may not be the minimizing Kohn–Sham orbitals
for the atomic configuration {RI }.
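The equation of motion induced by the Lagrangian (2.10.13) gives the CPMD equations
\[
\begin{aligned}
M_I \ddot{R}_I(t) &= -\int \rho(r,t)\, \frac{\partial V_{\mathrm{ext}}^{I}(r - R_I(t))}{\partial R_I}\, dr - \frac{\partial E_{II}}{\partial R_I}(\{R_I(t)\}), \quad I = 1,\ldots,M, \\
\mu \ddot{\psi}_i(t) &= -H[\rho(t); \{R_I(t)\}]\, \psi_i(t) + \sum_{j} \psi_j(t) \Lambda_{ji}(t), \quad i = 1,\ldots,N, \\
\rho(r,t) &= \sum_{i=1}^{N} |\psi_i(r,t)|^2.
\end{aligned}
\tag{2.10.14}
\]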
Here the Λ’s are the Lagrange multipliers determined so that {ψi (t)} is an orthonormal
set of functions for any time. Compared to BOMD, CPMD uses a fictitious dynamics
to guide the motion of the electrons without the need for a convergent SCF iteration.
The dynamics of the electronic orbitals can be loosely regarded as a special way of per-
forming the SCF iteration at each molecular dynamics step. Thanks to the Hamiltonian
structure, the numerical simulation of CPMD is stable and the energy is conserved over a
much longer time period than that of BOMD without a tight SCF convergence
criterion. When the system has a spectral gap, the accuracy of CPMD is controlled by
a single parameter, the fictitious electron mass µ. The result of CPMD approaches that
of BOMD as µ goes to zero [69, 9]. However, it has also been shown that CPMD does
not work as well for systems with a vanishing gap, for example metallic systems [69].
where the explicit time-dependence comes from the external potential $V_{\mathrm{ext}}$. The statement
of Runge and Gross is that the many-body wavefunction $\Psi(t)$ at any time $t \ge t_0$
is uniquely determined by the initial state $\Psi_0$ and the history of the electron density
$\{\rho(r,s)\}_{t_0 \le s \le t}$. Furthermore, according to the Hohenberg–Kohn theorem, if the system
starts from the many-body ground state, then $\Psi_0$ is uniquely determined by $\rho(r,t_0)$
as well. Hence the evolution of the many-body system is determined entirely from the
evolution of the density $\{\rho(r,s)\}_{t_0 \le s \le t}$.
Furthermore, similarly to the construction of Kohn–Sham DFT, the Runge–Gross
TDDFT can also be formally given by the following coupled time-dependent
Schrödinger equations in $\mathbb{R}^3$:
\[
\begin{aligned}
i\partial_t \psi_j(x,t) &= \left( -\frac{1}{2}\Delta + V_{\mathrm{ext}}(r,t) + \int \frac{\rho(r',t)}{|r-r'|}\, dr' + V_{\mathrm{xc}}[\{\rho\}_{t_0\le s\le t}](r) \right) \psi_j(x,t), \\
\rho(r,t) &= \sum_{\sigma} \sum_{j=1}^{N} |\psi_j(x,t)|^2.
\end{aligned}
\tag{2.11.4}
\]
Compared to the ground state Kohn–Sham DFT, the important difference is that the
exchange-correlation potential $V_{\mathrm{xc}}$ is in principle nonlocal not only in the spatial
variable $r$ but also in the temporal variable $t$. The memory effect can
span the entire history $\{\rho(r,s)\}_{t_0\le s\le t}$. Given the already formidable difficulty of finding
the exact exchange-correlation functional in ground state Kohn–Sham DFT, it becomes
much more difficult to further identify the memory effect in TDDFT. In practical
TDDFT calculations, almost all exchange-correlation functionals assume
\[
V_{\mathrm{xc}}[\{\rho\}_{t_0\le s\le t}](r) \approx V_{\mathrm{xc}}^{\mathrm{KS}}[\rho(t)](r), \tag{2.11.5}
\]
where $V_{\mathrm{xc}}^{\mathrm{KS}}$ is the exchange-correlation functional for ground state calculations.
Equation (2.11.5) is also called the adiabatic approximation in TDDFT. While such an
approximation works well in the computation of a wide range of electronic and optical
properties, it can also fail qualitatively, even when the exact ground state exchange-
correlation functional is available.
From a practical perspective, for a given exchange-correlation functional, one primary
challenge of TDDFT calculation is the small time step, which is usually on the
order of attoseconds (1 attosecond = $10^{-18}$ sec). Compared to the typical time scale of
the atomic movement, which is on the order of femtoseconds (1 femtosecond = $10^{-15}$
sec), the time scale of TDDFT is much shorter, and therefore it can be challenging to
study nonadiabatic dynamics where the degrees of freedom of electrons and nuclei are
propagated simultaneously.
Finally, let us rewrite (2.11.4) in terms of the time-dependent density matrix:
\[
P(t) = \sum_{j=1}^{N} |\psi_j(t)\rangle\langle\psi_j(t)|.
\]
This is the (self-consistent) quantum Liouville equation (also called the von Neumann
equation), which can be seen as an intrinsic representation of TDDFT.
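As a short worked step (a sketch, writing $H(t)$ for the Hermitian operator in parentheses in (2.11.4), a shorthand introduced here), the equation of motion for $P(t)$ follows from the product rule together with $i\partial_t |\psi_j\rangle = H|\psi_j\rangle$ and its adjoint $i\partial_t \langle\psi_j| = -\langle\psi_j| H$:

```latex
\begin{aligned}
i\partial_t P(t)
 &= \sum_{j=1}^{N} \Big( \big(i\partial_t |\psi_j(t)\rangle\big)\langle\psi_j(t)|
    + |\psi_j(t)\rangle \big(i\partial_t \langle\psi_j(t)|\big) \Big) \\
 &= \sum_{j=1}^{N} \Big( H(t)|\psi_j(t)\rangle\langle\psi_j(t)|
    - |\psi_j(t)\rangle\langle\psi_j(t)| H(t) \Big)
  = [H(t), P(t)].
\end{aligned}
```

This agrees with the form $\partial_t P(t) = -i[H(t), P(t)]$ used in (3.6.1).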
Exercises
1. Verify the relations (2.1.16) and (2.1.18).
3. Write down the total energy of the $\mathrm{H}_2^{+}$ molecule in the unrestricted Hartree–Fock
approximation and the corresponding Euler–Lagrange equation. Explain why this
gives the exact solution to the problem. Then write down the total energy of the
$\mathrm{H}_2^{+}$ molecule in Kohn–Sham DFT and the corresponding Euler–Lagrange equation.
If you neglect the exchange-correlation functional, note that the electron
interacts with itself through the Hartree potential, even though there is only one
electron!
5. The ground state energy $E$ in Kohn–Sham DFT is closely related to the band
energy, i.e., the sum of the eigenvalues of occupied states. Prove that
\[
E = \sum_{i=1}^{N} \varepsilon_i - \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r-r'|}\, dr\, dr' - \int \rho(r) V_{\mathrm{xc}}[\rho](r)\, dr + E_{\mathrm{xc}}[\rho] + E_{II}.
\]
The difference between the total energy and the band energy is called the double
counting term, which is due to the nonlinearity of the Hartree energy and the
exchange-correlation functional.
6. Write down a one-dimensional model for the helium atom. Write a computer
program to solve this system at the restricted Hartree–Fock level. Can you show
that the spin singlet state has a lower energy than the spin triplet state?
7. Solve the constrained minimization problem (2.4.12) for $\ell = 1$ and find that the
solution is given by Broyden's update (2.4.14).
8. Verify the Hellmann–Feynman force formula (2.10.12).
9. Verify that the Wannier functions defined in (2.9.5) are orthonormal, i.e.,
\[
\int_{\mathbb{R}^3} w_{n,R}^{*}(r)\, w_{n',R'}(r)\, dr = \delta_{n,n'}\, \delta(R - R'), \quad R, R' \in \mathbb{L}.
\]
10. Write a computer program to find the localized Boys orbitals for a chain involving
four hydrogen atoms in one dimension.
Chapter 3
Many properties of chemical or materials systems are characterized by how they re-
spond to external perturbations. When the external perturbation is sufficiently small
that only the leading-order perturbation is important, this is known as the linear response
regime. It is in some sense analogous to the linear stability analysis of a dynamical sys-
tem or that in fluid mechanics. This chapter will give an in-depth discussion of the
linear response theory for both static and dynamic perturbations and their applications.
The analysis will be from the perspective of the perturbation theory of Green’s func-
tions. Towards the end of the chapter, the perturbation theory of a many-body quantum
system and its connection to DFT will also be briefly addressed.
Thus, the image of the Green’s function (as an operator) solves an elliptic equation.
Using spectral decomposition (assuming again the discrete spectrum of $H$):
\[
G_\lambda f = (\lambda - H)^{-1} f = \sum_{i} (\lambda - \varepsilon_i)^{-1} |\psi_i\rangle\langle\psi_i | f\rangle. \tag{3.1.2}
\]
The kernel of the Green's function, viewed as an integral operator, is thus given by
\[
G_\lambda(r, r') = \sum_{i} (\lambda - \varepsilon_i)^{-1} \psi_i(r) \psi_i^{*}(r'). \tag{3.1.3}
\]
By direct computation, we can verify the two useful identities for Green's functions,
known as the resolvent identities:
\[
(\lambda_1 - H)^{-1} - (\lambda_2 - H)^{-1} = (\lambda_1 - H)^{-1} (\lambda_2 - \lambda_1) (\lambda_2 - H)^{-1}, \tag{3.1.4}
\]
\[
(\lambda - H_1)^{-1} - (\lambda - H_2)^{-1} = (\lambda - H_1)^{-1} (H_1 - H_2) (\lambda - H_2)^{-1}. \tag{3.1.5}
\]
Indeed, we have
(λ1 − H)−1 − (λ2 − H)−1 = (λ1 − H)−1 (λ2 − H)(λ2 − H)−1
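The resolvent identities (3.1.4) and (3.1.5) can also be checked numerically on small matrices. The sketch below is illustrative only: the random Hermitian matrices, the complex shifts, and the helper names `rand_herm` and `G` are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

def rand_herm(n):
    """Random Hermitian matrix standing in for a Hamiltonian."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

H1, H2 = rand_herm(n), rand_herm(n)
I = np.eye(n)

def G(lam, H):
    """Green's function (resolvent) (lam - H)^{-1}."""
    return np.linalg.inv(lam * I - H)

# Complex shifts are off the real axis, hence in the resolvent set.
lam1, lam2 = 0.3 + 1.0j, -0.7 + 0.5j

# (3.1.4): same Hamiltonian, two different shifts
lhs1 = G(lam1, H1) - G(lam2, H1)
rhs1 = G(lam1, H1) @ ((lam2 - lam1) * G(lam2, H1))

# (3.1.5): same shift, two different Hamiltonians
lhs2 = G(lam1, H1) - G(lam1, H2)
rhs2 = G(lam1, H1) @ (H1 - H2) @ G(lam1, H2)

print(np.max(np.abs(lhs1 - rhs1)), np.max(np.abs(lhs2 - rhs2)))
```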
or equivalently,
\[
(\lambda - H_\varepsilon)^{-1} = (\lambda - H)^{-1} \sum_{m=0}^{\infty} \varepsilon^{m} \big( W (\lambda - H)^{-1} \big)^{m}, \tag{3.1.17}
\]
where the last step uses the Taylor series of $(1 - x)^{-1}$ at $x = 0$, i.e., the Neumann
series.
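There is yet another way to view this series expansion using the resolvent identity:
\[
(\lambda - H_\varepsilon)^{-1} - (\lambda - H)^{-1} = (\lambda - H)^{-1} (H_\varepsilon - H) (\lambda - H_\varepsilon)^{-1} = \varepsilon (\lambda - H)^{-1} W (\lambda - H_\varepsilon)^{-1}. \tag{3.1.20}
\]
In other words,
\[
G_{\lambda,\varepsilon} = G_\lambda + G_\lambda (\varepsilon W) G_{\lambda,\varepsilon}. \tag{3.1.21}
\]
This is known as the Dyson equation. Let us now solve the equation by an iteration
scheme
\[
G_{\lambda,\varepsilon}^{(n)} = G_\lambda + G_\lambda (\varepsilon W) G_{\lambda,\varepsilon}^{(n-1)} \tag{3.1.22}
\]
with the initial guess $G_{\lambda,\varepsilon}^{(0)} = G_\lambda$. Using the contraction mapping theorem, we can
prove that the iteration converges and that the converged solution is given by (3.1.17).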
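The iteration (3.1.22) is easy to try on small matrices. In the sketch below the matrices $H$ and $W$, the shift $\lambda = 3i$, and $\varepsilon = 0.5$ are arbitrary illustrative choices, picked so that $\|\varepsilon G_\lambda W\| \le 1/6 < 1$ and the contraction argument applies.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n))
H = (A + A.T) / 2                      # toy Hermitian "Hamiltonian"
B = rng.standard_normal((n, n))
W = (B + B.T) / 2
W /= np.linalg.norm(W, 2)              # normalize so that ||W|| = 1

lam, eps = 3.0j, 0.5
I = np.eye(n)
G = np.linalg.inv(lam * I - H)         # ||G|| <= 1/3 because Im(lam) = 3
G_exact = np.linalg.inv(lam * I - H - eps * W)

# Dyson iteration (3.1.22), starting from the unperturbed Green's function
G_n = G.copy()
for _ in range(60):
    G_n = G + G @ (eps * W) @ G_n

print("error after 60 iterations:", np.max(np.abs(G_n - G_exact)))
```

Each sweep adds one more term of the Neumann series, so the error shrinks geometrically with ratio at most $\|\varepsilon G_\lambda W\|$.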
The above result tells us that if $\lambda$ is not in the spectrum of $H$ and the operator
$W(\lambda - H)^{-1}$ is bounded, then for sufficiently small $\varepsilon$, $\lambda - H_\varepsilon$ is also invertible. Thus
$\lambda$ is not in the spectrum of $H_\varepsilon$. Note that by a similar argument we can also show that
if $\mu$ is close to $\lambda$ so that $|\mu - \lambda| \, \|(\lambda - H)^{-1}\| < 1$, then $\mu$ is also in the resolvent set.
Thus the resolvent set is an open set.
where we have assumed a gap between the highest occupied state energy and the lowest
unoccupied state energy, and C encloses the occupied spectrum of the unperturbed
Hamiltonian H.
Using the perturbation theory of Green's functions, we can then study how $P$ changes
when the Hamiltonian is perturbed to $H_\varepsilon = H + \varepsilon W$, where $W$ is a Hermitian operator.
We will assume that $W$ is $H$-bounded (in other words, there exists $\lambda \in \mathbb{C}$ so that
$\|W(\lambda - H)^{-1}\|$ is bounded). As $\mathcal{C}$ lies in the resolvent set of $H$, for sufficiently small
$\varepsilon$, $\mathcal{C}$ remains in the resolvent set of $H_\varepsilon$, thanks to the result of section 3.1. Thus, the
contour integral
\[
P_\varepsilon = \frac{1}{2\pi i} \oint_{\mathcal{C}} (\lambda - H_\varepsilon)^{-1}\, d\lambda \tag{3.2.2}
\]
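\[
\begin{aligned}
P_\varepsilon - P &= \frac{1}{2\pi i} \oint_{\mathcal{C}} \Big( (\lambda - H_\varepsilon)^{-1} - (\lambda - H)^{-1} \Big)\, d\lambda \\
&= \frac{1}{2\pi i} \oint_{\mathcal{C}} (\lambda - H_\varepsilon)^{-1} (\varepsilon W) (\lambda - H)^{-1}\, d\lambda \\
&\overset{(3.1.17)}{=} \frac{1}{2\pi i} \oint_{\mathcal{C}} \sum_{n=1}^{\infty} \varepsilon^{n} (\lambda - H)^{-1} \big( W (\lambda - H)^{-1} \big)^{n}\, d\lambda \\
&= \frac{\varepsilon}{2\pi i} \oint_{\mathcal{C}} (\lambda - H)^{-1} W (\lambda - H)^{-1}\, d\lambda + O(\varepsilon^2).
\end{aligned}
\tag{3.2.3}
\]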
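If we define an operator $X_0$ as
\[
X_0 W := \frac{1}{2\pi i} \oint_{\mathcal{C}} (\lambda - H)^{-1} W (\lambda - H)^{-1}\, d\lambda, \tag{3.2.4}
\]
we arrive at
\[
P(H + \varepsilon W) - P(H) = P_\varepsilon - P = \varepsilon X_0 W + O(\varepsilon^2), \tag{3.2.5}
\]
where the notation $P(H)$ stands for the density matrix corresponding to $H$. In other
words,
\[
\left.\frac{dP(H + \varepsilon W)}{d\varepsilon}\right|_{\varepsilon=0} = X_0 W. \tag{3.2.6}
\]
More precisely, this means that the Gâteaux derivative of $P$ in the $W$ direction is given
by $X_0 W$:
\[
\frac{\delta P}{\delta V}(W) = \frac{1}{2\pi i} \oint_{\mathcal{C}} \frac{\delta (\lambda - H)^{-1}}{\delta V}(W)\, d\lambda = X_0 W. \tag{3.2.7}
\]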
In physical terms, the linear response of the density matrix when the potential changes
is given by the operator X0 .
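Using the spectral decomposition, we have
\[
\begin{aligned}
X_0 W &= \frac{1}{2\pi i} \oint_{\mathcal{C}} \sum_{p,q} \frac{|\psi_p\rangle\langle\psi_p| W |\psi_q\rangle\langle\psi_q|}{(\lambda - \varepsilon_p)(\lambda - \varepsilon_q)}\, d\lambda \\
&= \frac{1}{2\pi i} \sum_{p,q} \left[ \oint_{\mathcal{C}} \Big( (\lambda - \varepsilon_p)^{-1} - (\lambda - \varepsilon_q)^{-1} \Big)\, d\lambda \right] \frac{|\psi_p\rangle\langle\psi_p| W |\psi_q\rangle\langle\psi_q|}{\varepsilon_p - \varepsilon_q}.
\end{aligned}
\tag{3.2.8}
\]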
\[
\frac{\delta \rho}{\delta V}(W) = \operatorname{diag}(X_0 W) =: \chi_0 W, \tag{3.2.10}
\]
Here c.c. represents the complex conjugate of the first term. The operator χ0 , which
describes the linear response of the density with respect to the change of the potential, is
known as the polarizability operator (or in the current case of independent particles, the
independent particle polarizability operator or the irreducible polarizability operator).
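Observe that
\[
\begin{aligned}
\int W(r) (\chi_0 W)(r)\, dr &= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle\psi_i| W |\psi_a\rangle\langle\psi_a| W |\psi_i\rangle}{\varepsilon_i - \varepsilon_a} + \mathrm{c.c.} \\
&= 2 \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{|\langle\psi_i| W |\psi_a\rangle|^2}{\varepsilon_i - \varepsilon_a} \le 0.
\end{aligned}
\tag{3.2.12}
\]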
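\[
\begin{aligned}
-\int W(r) (\chi_0 W)(r)\, dr &= 2 \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{|\langle\psi_i| W |\psi_a\rangle|^2}{\varepsilon_a - \varepsilon_i} \le \frac{2}{\varepsilon_{\mathrm{gap}}} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} |\langle\psi_i| W |\psi_a\rangle|^2 \\
&\le \frac{1}{\varepsilon_{\mathrm{gap}}} \sum_{p,q} |\langle\psi_p| W |\psi_q\rangle|^2 = \frac{1}{\varepsilon_{\mathrm{gap}}} \|W\|_2^2,
\end{aligned}
\tag{3.2.13}
\]
where $\varepsilon_{\mathrm{gap}}$ is the spectral gap between the occupied and unoccupied spectra. Here $\|W\|_2$
should be understood as the operator norm for $W$. This implies that, as an operator, $\chi_0$
satisfies
\[
-\frac{1}{\varepsilon_{\mathrm{gap}}} I \preceq \chi_0 \preceq 0. \tag{3.2.14}
\]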
In other words, for systems with a spectral gap, the response of the density with respect
to a perturbation of the potential cannot be arbitrarily large in the linear regime.
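These statements can be checked on a small discrete model. The sketch below is illustrative only: the tight-binding chain with a staggered on-site potential (chosen because its gap at half filling is at least 2), the grid size, and all variable names are assumptions made for this example, and real orbitals without spin degeneracy are used. It builds $\chi_0$ from a sum over occupied and unoccupied states, compares $\chi_0 W$ with a finite-difference derivative of the density, and checks the bounds in (3.2.14).

```python
import numpy as np

n, n_occ, delta = 20, 10, 1.0
# Toy Hamiltonian: nearest-neighbor hopping plus a staggered potential.
H = -(np.eye(n, k=1) + np.eye(n, k=-1)) + delta * np.diag((-1.0) ** np.arange(n))
eps, psi = np.linalg.eigh(H)
gap = eps[n_occ] - eps[n_occ - 1]

# Sum-over-states form of chi_0 for real orbitals
chi0 = np.zeros((n, n))
for i in range(n_occ):
    for a in range(n_occ, n):
        v = psi[:, i] * psi[:, a]
        chi0 += 2.0 * np.outer(v, v) / (eps[i] - eps[a])

def density(Hmat):
    """Diagonal of the density matrix built from the lowest n_occ states."""
    _, vec = np.linalg.eigh(Hmat)
    return np.sum(vec[:, :n_occ] ** 2, axis=1)

rng = np.random.default_rng(0)
W = rng.standard_normal(n)            # perturbing potential on the grid
t = 1e-5
# central finite difference of the density with respect to the potential
drho_fd = (density(H + t * np.diag(W)) - density(H - t * np.diag(W))) / (2 * t)

chi_eigs = np.linalg.eigvalsh(chi0)
print("max |chi0 W - finite difference|:", np.max(np.abs(chi0 @ W - drho_fd)))
print("eigenvalue range of chi0:", chi_eigs.min(), chi_eigs.max(), " -1/gap =", -1.0 / gap)
```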
While it appears from the definition (3.2.11) that the evaluation of χ0 W would
involve the summation over unoccupied orbitals, in fact χ0 W can be obtained by using
only occupied orbitals with the help of Green’s function as follows. For simplicity,
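\[
\begin{aligned}
\chi_0(r, r') &= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_i(r)\psi_a(r)\psi_a(r')\psi_i(r')}{\varepsilon_i - \varepsilon_a} + \mathrm{c.c.} \\
&= 2 \sum_{i}^{\mathrm{occ}} \psi_i(r) \left( \sum_{a}^{\mathrm{unocc}} \frac{\psi_a(r)\psi_a(r')}{\varepsilon_i - \varepsilon_a} \right) \psi_i(r') \\
&= 2 \sum_{i}^{\mathrm{occ}} \psi_i(r) \big[ Q (\varepsilon_i - H)^{-1} Q \big](r, r')\, \psi_i(r'),
\end{aligned}
\]
where $Q = I - \sum_{j}^{\mathrm{occ}} |\psi_j\rangle\langle\psi_j|$ is the projection operator to the unoccupied space. The
advantage of the above representation is that it only involves the occupied orbitals. Then
\[
\begin{aligned}
\int \chi_0(r, r') W(r')\, dr' &= 2 \sum_{i}^{\mathrm{occ}} \psi_i(r) \int \big[ Q (\varepsilon_i - H)^{-1} Q \big](r, r')\, \psi_i(r') W(r')\, dr' \\
&:= 2 \sum_{i}^{\mathrm{occ}} \psi_i(r) \xi_i(r),
\end{aligned}
\]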
Although the operator on the left-hand side of (3.2.15) is singular, the equation is well
posed since the right-hand side has a vanishing component in the kernel space of Q(εi −
H)Q due to the projection Q. Furthermore, Q(εi − H)Q is invertible when restricted
to the range of Q, and the solution of (3.2.15) is unique.
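A small numerical sketch of this occupied-orbital approach is given below. It assumes that the equation for $\xi_i$ takes the form $Q(\varepsilon_i - H)Q\,\xi_i = Q(W\psi_i)$, as suggested by the definition of $\xi_i$ above; the toy Hamiltonian and the device of adding the projector $P$ to remove the singularity are choices made for this example, not the text's prescription.

```python
import numpy as np

n, n_occ, delta = 20, 10, 1.0
# Same kind of gapped toy chain as a stand-in for H (hypothetical model).
H = -(np.eye(n, k=1) + np.eye(n, k=-1)) + delta * np.diag((-1.0) ** np.arange(n))
eps, psi = np.linalg.eigh(H)
occ = psi[:, :n_occ]
P = occ @ occ.T                 # projector onto the occupied space
Q = np.eye(n) - P               # projector onto the unoccupied space

rng = np.random.default_rng(0)
W = rng.standard_normal(n)      # perturbing potential on the grid

# Occupied orbitals only: solve one linear system per occupied state.
chi0W = np.zeros(n)
for i in range(n_occ):
    rhs = Q @ (W * occ[:, i])
    # Q (eps_i - H) Q is singular on range(P); adding P makes the matrix
    # invertible without changing the solution, which lies in range(Q).
    A = Q @ (eps[i] * np.eye(n) - H) @ Q + P
    xi = np.linalg.solve(A, rhs)
    chi0W += 2.0 * occ[:, i] * xi

# Reference: explicit sum over unoccupied states.
ref = np.zeros(n)
for i in range(n_occ):
    for a in range(n_occ, n):
        coupling = psi[:, a] @ (W * psi[:, i])
        ref += 2.0 * psi[:, i] * psi[:, a] * coupling / (eps[i] - eps[a])

print("max difference:", np.max(np.abs(chi0W - ref)))
```

For large systems the dense solve would be replaced by an iterative solver, which is where avoiding the unoccupied orbitals pays off.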
Perturbation of eigenvalues
As an application, the perturbation theory of the density matrix can be used to study the
perturbation of eigenvalues of a Hamiltonian. Let ε0 be an isolated eigenvalue of H.
Hence we can take a disk D = B(ε0 , r) ⊂ C in the complex plane centered at ε0 with
a sufficiently small radius r so that spec(H) ∩ D = {ε0 }.
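Let $P$ be the projector onto the eigenspace corresponding to $\varepsilon_0$. Recall the contour
integral formula
\[
P = \frac{1}{2\pi i} \oint_{\partial D} (\lambda - H)^{-1}\, d\lambda. \tag{3.2.16}
\]
After perturbation, the projector becomes
\[
P_\varepsilon = \frac{1}{2\pi i} \oint_{\partial D} (\lambda - H_\varepsilon)^{-1}\, d\lambda. \tag{3.2.17}
\]
For $\varepsilon$ sufficiently small, $\partial D$ will lie in the resolvent set of the perturbed Hamiltonian
$H_\varepsilon$ and hence $P_\varepsilon$ is well defined and is a projection operator.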
We next compare the two projection operators as before:
\[
\begin{aligned}
P_\varepsilon - P &= \frac{1}{2\pi i} \oint_{\partial D} \Big( (\lambda - H_\varepsilon)^{-1} - (\lambda - H)^{-1} \Big)\, d\lambda \\
&= \frac{1}{2\pi i} \oint_{\partial D} (\lambda - H_\varepsilon)^{-1} (H_\varepsilon - H) (\lambda - H)^{-1}\, d\lambda.
\end{aligned}
\tag{3.2.18}
\]
Thus, the norm of the difference is on the order of $\varepsilon$ as long as $W$ is $H$-bounded (note
that the contour is compact and thus $|\lambda|$ is bounded). In particular, we have $\|P_\varepsilon - P\| < 1$
for $\varepsilon$ sufficiently small. As both are projection operators, this implies that [76]
This means that $H_\varepsilon$ has the same number of eigenvalues as $H$ in $D$ for $\varepsilon$ sufficiently
small.
Now we study the eigenvalues and eigenfunctions under perturbation. Recall that
where the last identity follows from a similar calculation in deriving the Sternheimer
equation (3.2.15).
If $\varepsilon_0$ is nondegenerate, (3.2.21) gives the perturbation of the eigenfunctions. We can
also get the correction of the eigenvalue by applying $H_\varepsilon$ on $P_\varepsilon \psi_i$, which we will leave
as an exercise.
For the degenerate case, we want to make $P_\varepsilon \psi_i$ eigenfunctions of $H_\varepsilon$ by appropriately
choosing $\psi_i$ (note that since the eigenvalue is degenerate, the choice of eigenfunctions
is not unique). We thus calculate
for some $\varepsilon_i^{(1)}$. Therefore, $\psi_i$ should be chosen such that $\langle\psi_j | W | \psi_i\rangle$ is diagonal, or
equivalently, $\psi_i$ should diagonalize the operator $W$ restricted to the range of $P$. The
eigenvalue gives us the first-order correction to the eigenvalue as $\varepsilon_0 + \varepsilon\, \varepsilon_i^{(1)}$. This can
be extended to higher orders, which we will not discuss further here.
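The degenerate case can be illustrated with a small matrix example. In this sketch the diagonal matrix `H` (with a threefold degenerate eigenvalue $\varepsilon_0 = 0$), the random symmetric `W`, and the step `t` (playing the role of $\varepsilon$) are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
# H has a threefold degenerate eigenvalue 0, isolated from 1.5 and 3.0.
H = np.diag([0.0, 0.0, 0.0, 1.5, 3.0])
B = rng.standard_normal((5, 5))
W = (B + B.T) / 2                  # Hermitian perturbation
t = 1e-4                           # perturbation strength (epsilon in the text)

# Exact perturbed eigenvalues that emerge from the degenerate eigenvalue 0
exact = np.linalg.eigvalsh(H + t * W)[:3]

# First order: diagonalize W restricted to the range of P (first three
# coordinates here), then shift the unperturbed eigenvalue.
first_order = 0.0 + t * np.linalg.eigvalsh(W[:3, :3])

print("exact:      ", exact)
print("first order:", first_order)
```

The remaining discrepancy is of second order in `t`.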
Denote the effective Hamiltonian as $H[\rho] = -\frac{1}{2}\Delta + V_{\mathrm{ext}} + V_{\mathrm{Hxc}}[\rho]$, where $V_{\mathrm{ext}}$
is the external potential and $V_{\mathrm{Hxc}}[\rho]$ is the Hartree and exchange-correlation potential
induced by electron density ρ. Kohn–Sham DFT can be solved self-consistently via the
following nonlinear equation:
Note that the perturbation to the potential comes from the direct perturbation to the
external potential and the induced change of the effective potential due to the change of
the density. Thus, from the discussion in section 3.2, the corresponding perturbation of
the density satisfies the equation
The last equality defines χ, called the reducible polarizability operator, which is the
linear response of the density with respect to perturbation of the external potential in
DFT. If the exact exchange-correlation functional is used, the reducible polarizability
operator χ defined here should agree with χexact , the exact linear response of the many-
body electron system, which is to be further discussed in section 3.7.
From a physical point of view, if I − χ0 fHxc is not invertible, it means that it is
possible that a small perturbation of the potential generates a large perturbation to the
density. This is to say that the electronic structure of the system is not stable with respect
to external perturbations. Thus, the invertibility of the operator is known as the stability
condition of electronic structure in the context of Kohn–Sham DFT [23].
In many applications, we do not need to have access to the full reducible polarizability
operator $\chi$, but only the application of $\chi$ on some vector, say $g$:
\[
u(r) := (\chi g)(r) = \int \chi(r, r') g(r')\, dr'.
\]
Since
\[
u = \chi g = (I - \chi_0 f_{\mathrm{Hxc}})^{-1} \chi_0 g,
\]
we have
\[
u - \chi_0 f_{\mathrm{Hxc}} u = \chi_0 g,
\]
or equivalently,
\[
u = \chi_0 g + \chi_0 f_{\mathrm{Hxc}} u. \tag{3.3.9}
\]
Equation (3.3.9) can be solved iteratively as a fixed point problem for u. A simple
iteration scheme can be constructed by recursively substituting u into the right-hand
side. This leads to the Neumann series
The iterative solution requires the application of χ0 to a vector, which can be obtained
efficiently by solving the Sternheimer equations (3.2.15).
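The fixed point iteration for (3.3.9) can be sketched with small dense matrices. The matrices below are random stand-ins, scaled so that the spectral norm of $\chi_0 f_{\mathrm{Hxc}}$ is at most 0.5 and the iteration converges; in an actual calculation $\chi_0$ would be applied through Sternheimer solves rather than stored, and convergence is not guaranteed in general.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 12
# Hypothetical stand-ins: chi0 negative semidefinite, f_hxc positive semidefinite.
A = rng.standard_normal((n, n))
chi0 = -(A @ A.T)
chi0 *= 0.5 / np.linalg.norm(chi0, 2)      # ||chi0|| = 0.5
B = rng.standard_normal((n, n))
f_hxc = B @ B.T
f_hxc /= np.linalg.norm(f_hxc, 2)          # ||f_hxc|| = 1
g = rng.standard_normal(n)

# Fixed point iteration u <- chi0 g + chi0 f_hxc u, i.e. partial Neumann sums
u = np.zeros(n)
for _ in range(80):
    u = chi0 @ g + chi0 @ (f_hxc @ u)

# Direct solve of (I - chi0 f_hxc) u = chi0 g for comparison
u_direct = np.linalg.solve(np.eye(n) - chi0 @ f_hxc, chi0 @ g)
print("max difference:", np.max(np.abs(u - u_direct)))
```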
However, since VHxc depends on ρ, the change of the electron density ρ further induces
the change of the effective potential. Therefore the change of the electron density δρ
must satisfy the following self-consistent equation:
Taking the random phase approximation (RPA) for the exchange-correlation functional,
i.e., neglecting the exchange-correlation kernel fxc and setting fHxc ≈ vC , we get
Here $w_C = \varepsilon_d^{-1} v_C$ is the screened Coulomb interaction operator. The inverse of the
dielectric operator $\varepsilon_d^{-1}$ characterizes the screening effect of the electron system with
respect to the perturbation of the external charge, and is directly related to the dielec-
tric constant in macroscopic electrostatic theory [1, 85]. In contrast, sometimes vC is
referred to as the bare Coulomb interaction for which the electron screening is not con-
sidered. The screened Coulomb operator wC also plays an important role in many-body
perturbation theory, including the GW theory [34].
Experimentally, one cannot measure the full detail of $\delta\rho$ but only its macroscopic average.
One measurable quantity is the induced dipole moments along the $\{r_\alpha\}_{\alpha=1}^{3}$
directions, defined respectively as
\[
d_\alpha = \int r_\alpha\, \delta\rho(r)\, dr = -\iint r_\alpha\, \chi(r, r')\, r_1'\, dr\, dr'. \tag{3.4.9}
\]
In general, in the linear response regime the induced dipole moment is a linear function
of the external electric field $E$,
\[
d_\alpha = \sum_{\beta} A_{\alpha\beta} E_\beta, \tag{3.4.10}
\]
SCF iteration, and simple mixing sometimes resolves this problem. Since the exchange-
correlation functional is neither convex nor concave with respect to the electron density,
the rigorous study of the global convergence properties of SCF schemes in Kohn–Sham
DFT calculations is very difficult. Hence we consider the linear response regime, where
we assume that the initial effective potential $V_0$ is already close to $V_\star$, the converged
Kohn–Sham effective potential. Let
\[
e_k := V_k - V_\star
\]
be the error of the potential at the $k$th iteration. Recall the fixed point iteration for the
SCF iteration
\[
V_{k+1} = V_{\mathrm{eff}}[F_{\mathrm{KS}}[V_k]]. \tag{3.4.12}
\]
In order to study the propagation of the error in the fixed point iteration, we apply the
chain rule and we have
\[
e_{k+1} = f_{\mathrm{Hxc}} \chi_0 e_k + O(\|e_k\|^2), \tag{3.4.13}
\]
since $f_{\mathrm{Hxc}} = \frac{\delta V_{\mathrm{eff}}}{\delta \rho}$ and $\chi_0 = \frac{\delta F_{\mathrm{KS}}}{\delta V}$. In the linear response regime, we assume that the
$O(\|e_k\|^2)$ term is small and is omitted in the following discussion. Then, after $k$ steps,
\[
e_k \approx (f_{\mathrm{Hxc}} \chi_0)^{k} e_0. \tag{3.4.14}
\]
Hence, in the linear response regime, the convergence of the fixed point iteration re-
quires that the spectral radius of the operator, denoted by rσ (fHxc χ0 ), is smaller than
1. Unfortunately, such a spectral radius is generally much larger than 1, and the error
in the fixed point iteration will therefore diverge, even if the initial potential is already
very close to the self-consistent potential.
In the simple mixing method, the error propagation follows as
\[
e_{k+1} = (I - \alpha \varepsilon_d)\, e_k + O(\|e_k\|^2), \tag{3.4.15}
\]
is satisfied, the simple mixing will converge. In practical calculations, the spectral radius
$r_\sigma(\varepsilon_d)$ can be large, and hence $\alpha$ needs to be chosen rather small and the simple mixing
method converges at a very slow rate. Thus the simple mixing method is rarely used
directly in practical electronic structure calculations.
It remains to decide what is the optimal choice of α satisfying the constraint (3.4.17).
This requires the solution of the following minimax problem:
or
\[
\alpha = \frac{2}{\lambda_{\min} + \lambda_{\max}}. \tag{3.4.20}
\]
Substituting this choice of α into (3.4.15), we find that the optimal convergence rate of
simple mixing is
Here κ(εd ) = λmax /λmin is the condition number of the dielectric operator.
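The effect of the choice (3.4.20) can be seen in a small numerical sketch. It assumes, for illustration only, that $\varepsilon_d$ is diagonalizable with real positive eigenvalues in $[\lambda_{\min}, \lambda_{\max}]$; the eigenvalues below are made up. With this choice the contraction factor $\max_\lambda |1 - \alpha\lambda|$ equals $(\kappa - 1)/(\kappa + 1)$, which follows directly from substituting (3.4.20).

```python
import numpy as np

# Hypothetical eigenvalues of the dielectric operator eps_d (real, positive).
lam = np.linspace(1.0, 50.0, 30)
lam_min, lam_max = float(lam.min()), float(lam.max())

def rate(alpha):
    """Contraction factor max |1 - alpha * lambda| of the linearized simple mixing."""
    return float(np.max(np.abs(1.0 - alpha * lam)))

alpha_opt = 2.0 / (lam_min + lam_max)       # the choice (3.4.20)
kappa = lam_max / lam_min                   # condition number of eps_d

print("alpha_opt        :", alpha_opt)
print("rate(alpha_opt)  :", rate(alpha_opt))
print("(kappa-1)/(kappa+1):", (kappa - 1.0) / (kappa + 1.0))
print("rate at 0.9*alpha_opt, 1.1*alpha_opt:", rate(0.9 * alpha_opt), rate(1.1 * alpha_opt))
```

For a large condition number the rate is close to 1, which is the slow convergence described above.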
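\[
F_I(\{R_I^\star\}) = -\left.\frac{\partial E(\{R_I\})}{\partial R_I}\right|_{\{R_I\}=\{R_I^\star\}} = 0, \quad \forall I = 1,\ldots,M. \tag{3.4.22}
\]
Then, in the presence of a small perturbation away from $R_I^\star$, the dynamics of the nuclei
is given by Newton's law,
\[
M_I \frac{d^2}{dt^2} R_I = F_I(\{R_I\}) \approx -\sum_{J} \left.\frac{\partial^2 E(\{R_I\})}{\partial R_I \partial R_J}\right|_{\{R_I\}=\{R_I^\star\}} (R_J - R_J^\star), \tag{3.4.23}
\]
where we have used Taylor's expansion around $R_I^\star$ and omitted higher-order terms. If
we define the displacement vector $\delta R_I = R_I - R_I^\star$, (3.4.23) can be written as
\[
M_I \frac{d^2}{dt^2} \delta R_I \approx -\sum_{J} \left.\frac{\partial^2 E(\{R_I\})}{\partial R_I \partial R_J}\right|_{\{R_I\}=\{R_I^\star\}} \delta R_J. \tag{3.4.24}
\]
The linear equation (3.4.24) can be exactly solved. Define the dynamical matrix, which
is the scaled Hessian matrix, as
\[
D_{IJ} = \frac{1}{\sqrt{M_I M_J}} \left.\frac{\partial^2 E(\{R_I\})}{\partial R_I \partial R_J}\right|_{\{R_I\}=\{R_I^\star\}}. \tag{3.4.25}
\]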
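As a minimal illustration of (3.4.25), the sketch below forms the dynamical matrix for a toy model that is not taken from the text: two particles of masses `M[0]` and `M[1]` in one dimension joined by a harmonic spring of stiffness `k`. Its eigenvalues are the squared vibration frequencies, here $0$ (rigid translation) and $k(1/M_1 + 1/M_2)$.

```python
import numpy as np

k = 2.0                              # hypothetical spring constant
M = np.array([1.0, 3.0])             # hypothetical masses

# Hessian of E = (k/2) (R_2 - R_1 - d)^2 with respect to (R_1, R_2)
hessian = k * np.array([[1.0, -1.0],
                        [-1.0, 1.0]])

# Dynamical matrix D_IJ = Hessian_IJ / sqrt(M_I M_J), cf. (3.4.25)
D = hessian / np.sqrt(np.outer(M, M))

# Eigenvalues are squared frequencies; eigenvectors are mass-weighted modes.
omega2, modes = np.linalg.eigh(D)
print("squared frequencies:", omega2)
print("expected nonzero value:", k * (1.0 / M[0] + 1.0 / M[1]))
```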
When the system is at a local minimum, D is a positive semi-definite matrix and can be
diagonalized as
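and we find that the second-order derivative with respect to the atomic position takes
the form
\[
\begin{aligned}
\frac{\partial^2 E(\{R_I\})}{\partial R_I \partial R_J} ={}& \int \frac{\partial \rho(r)}{\partial R_J} \frac{\partial V_{\mathrm{ext}}^{I}(r - R_I)}{\partial R_I}\, dr \\
&+ \int \rho(r) \frac{\partial^2 V_{\mathrm{ext}}^{I}(r - R_I)}{\partial R_I \partial R_I}\, dr\, \delta_{I,J} + \frac{\partial^2 E_{II}(\{R_I\})}{\partial R_I \partial R_J}.
\end{aligned}
\tag{3.4.28}
\]
Here we have used the form of $V_{\mathrm{ext}}^{I}$ in (2.10.11). The second and third terms on the
right-hand side of (3.4.28) can be readily evaluated. The first term involves the self-
consistent response of the electron density with respect to the perturbation of the atomic
position as
\[
\int \frac{\partial \rho(r)}{\partial R_J} \frac{\partial V_{\mathrm{ext}}^{I}(r - R_I)}{\partial R_I}\, dr = \iint \frac{\partial V_{\mathrm{ext}}^{I}(r - R_I)}{\partial R_I}\, \chi(r, r')\, \frac{\partial V_{\mathrm{ext}}^{J}(r' - R_J)}{\partial R_J}\, dr\, dr' \tag{3.4.29}
\]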
with a compact contour C , the decay of Gλ implies that of the density matrix as long
as the system has a gap, so that the contour integral representation is well defined.
Similarly, one can also study the exponential decay of the density matrix for general
systems at finite temperature, the details of which we will leave to the reader.
It is helpful to consider first a concrete example for the Green's function. Let us
take $H = -\Delta$ (in $\mathbb{R}^3$), for which we know that the spectrum is $[0, \infty)$. Let $\lambda < 0$ and
consider the corresponding Green's function $G_\lambda = (\lambda + \Delta)^{-1}$. We have
\[
(\lambda + \Delta)(G_\lambda f) = f. \tag{3.5.2}
\]
\[
(\lambda - |k|^2)\, \hat{g}(k) = \hat{f}(k) \tag{3.5.3}
\]
and hence
\[
\hat{g}(k) = -\frac{\hat{f}(k)}{|\lambda| + |k|^2}. \tag{3.5.4}
\]
Thus, by the convolution theorem, we have
\[
g(r) = \int K_\lambda(r - r') f(r')\, dr', \tag{3.5.5}
\]
where $\hat{K}_\lambda(k) = -\frac{1}{(2\pi)^{3/2}} \frac{1}{|\lambda| + |k|^2}$, so that
\[
G_\lambda(r, r') = K_\lambda(r - r') = -\frac{1}{(2\pi)^3} \int \frac{1}{|\lambda| + |k|^2}\, e^{i k\cdot(r - r')}\, dk = -\frac{e^{-|\lambda|^{1/2} |r - r'|}}{4\pi |r - r'|}. \tag{3.5.6}
\]
We observe that Gλ (r, r 0 ) decays exponentially as |r − r 0 | becomes large.
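As a quick consistency check of (3.5.6), sketched here for a radial function with $s = |r - r'| > 0$ and $\kappa = |\lambda|^{1/2}$ (notation introduced only for this check), one can use the radial form of the Laplacian, $\Delta g = \frac{1}{s}(s g)''$:

```latex
\begin{aligned}
g(s) &= -\frac{e^{-\kappa s}}{4\pi s}, \qquad s\, g(s) = -\frac{e^{-\kappa s}}{4\pi}, \\
\Delta g &= \frac{1}{s}\,(s g)'' = \frac{\kappa^2}{s}\,(s g) = \kappa^2 g = |\lambda|\, g = -\lambda\, g
\quad\Longrightarrow\quad (\lambda + \Delta)\, g = 0 \quad \text{for } s > 0.
\end{aligned}
```

Near $s = 0$ the kernel behaves like $-1/(4\pi s)$, whose Laplacian is the Dirac delta, which accounts for the right-hand side of (3.5.2).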
Note that the decay rate is given by $|\lambda|^{1/2}$, which depends on the distance of $\lambda$ to
the spectrum of $-\Delta$. If $\lambda$ lies in the spectrum, we may still write down the expression
of the Green's function (which is then not bounded!), but the kernel no longer decays
exponentially. For example, if $\lambda = 0$, as is well known, the Green's function of $\Delta$ is
given by the Poisson kernel
\[
G_0(r, r') = -\frac{1}{4\pi |r - r'|}. \tag{3.5.7}
\]
For $\lambda$ positive, we get the Green's function for the Helmholtz equations
\[
G_\lambda(r, r') = -\frac{e^{\pm i \lambda^{1/2} |r - r'|}}{4\pi |r - r'|} \quad \text{for } \lambda > 0. \tag{3.5.8}
\]
\[
T_{r_0,\gamma} = E_{r_0,\gamma}^{-1} (\lambda - H)^{-1} E_{r_0,\gamma}. \tag{3.5.11}
\]
Note that $T_{r_0,0}$ is just the Green's function $(\lambda - H)^{-1}$. Observe that
We will now show that, for $\gamma$ small, $\big(\Delta - E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma}\big)(\lambda - H)^{-1}$ is a small
perturbation to the identity operator in (3.5.12). Note that
\[
(E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma} f)(r) = e_{r_0,\gamma}(r)^{-1}\, \Delta\big( e_{r_0,\gamma}(r) f(r) \big)
\]
\[
\int_{|r - r_0| \le 1} \int_{|r' - r_0'| \le 1} G_\lambda(r, r')\, e^{\gamma (|r_0 - r_0'|^2 + 1)^{1/2}}\, dr\, dr'. \tag{3.5.23}
\]
Thus the boundedness of Tr0 ,γ implies that the kernel of the Green’s function decays
exponentially at least in an averaged sense, given that, e.g., V is in L∞ . We remark that
if V has better regularity, we can get stronger conclusions on the decay estimate (such
as a pointwise estimate) [23], but we will not go into the details here.
The decay property of the Green’s function implies that the density matrix decays
exponentially along the off-diagonal direction for systems with a positive band gap.
This is often referred to as the near-sightedness property of insulating systems. Note that
this implies that each selected column obtained from the SCDM algorithm in section 2.9
also decays exponentially. In order to obtain exponentially localized Wannier functions
as discussed in section 2.9, the selected columns should be orthogonalized. It turns out
that the band gap condition is insufficient to guarantee that the resulting orthogonalized
functions will also decay exponentially due to potential topological obstructions. There
is a rich body of literature along this direction in mathematics and physics from the past
few decades. We refer readers to [11, 67, 19] for more details.
\[
\frac{\partial}{\partial t} P(t) = -i[H(t), P(t)]. \tag{3.6.1}
\]
To solve this equation, let us consider first the solution to a general linear equation
\[
\frac{\partial}{\partial t} A(t) = -i H(t) A(t). \tag{3.6.2}
\]
Note that two possible outcomes are in general different, since A at different times
might not commute. Therefore, we have
\[
\int_0^t \cdots \int_0^t \mathcal{T}[H(s_1) \cdots H(s_n)]\, ds_1 \cdots ds_n
= n! \int_0^t \int_0^{s_n} \int_0^{s_{n-1}} \cdots \int_0^{s_2} H(s_n) \cdots H(s_1)\, ds_1 \cdots ds_n. \tag{3.6.6}
\]
Using this, we can check by differentiating that (3.6.3) is indeed the solution to (3.6.2).
Thus, the time-ordered exponential is the propagator for (3.6.2):
\[
U(t_2, t_1) = \mathcal{T}\Big[ e^{-i \int_{t_1}^{t_2} H(s)\, ds} \Big], \quad t_1 \le t_2. \tag{3.6.7}
\]
For convenience, we also define U(t2 , t1 ) = U(t1 , t2 )∗ if t1 > t2 . As we will see later,
the propagator plays the role of the Green’s function for time-dependent problems.
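To make the role of time ordering concrete, the following Python sketch approximates the propagator (3.6.7) by an ordered product of short-time exponentials and compares it with the naive exponential of the integrated Hamiltonian. It is a minimal illustration under our own assumptions: the two-level Hamiltonian H(s) = σ_z + s σ_x, the midpoint rule, and the helper names are hypothetical choices, not taken from the text.

```python
import numpy as np

def expm_hermitian(H, dt):
    """exp(-i dt H) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * dt * w)) @ V.conj().T

def propagator(H_of_t, t, n_steps=2000):
    """Approximate U(t, 0) by an ordered product of short-time exponentials.

    Factors for later times are multiplied from the left, which is the
    ordering encoded by the time-ordering operator.
    """
    dt = t / n_steps
    U = np.eye(H_of_t(0.0).shape[0], dtype=complex)
    for k in range(n_steps):
        U = expm_hermitian(H_of_t((k + 0.5) * dt), dt) @ U
    return U

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H_of_t = lambda s: sz + s * sx      # assumed example; H at different times does not commute

t = 1.0
U = propagator(H_of_t, t)
# Naive guess that ignores time ordering: exponential of int_0^t H(s) ds = t sz + (t^2 / 2) sx.
U_naive = expm_hermitian(t * sz + 0.5 * t**2 * sx, 1.0)
print(np.linalg.norm(U - U_naive))
```

The printed difference is of order 0.1 for this example, while the ordered product itself stays unitary up to rounding.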
Using the propagator (or the time-ordered matrix exponential), the solution to (3.6.1)
can be written as (the validation is left as an exercise)
\[
P(t) = \mathcal{T}\Big[ e^{-i \int_0^t H(s)\, ds} \Big]\, P(0)\, \Big( \mathcal{T}\Big[ e^{-i \int_0^t H(s)\, ds} \Big] \Big)^{*} = U(t, 0) P(0) U(0, t). \tag{3.6.8}
\]
In order to derive the first-order perturbation to the density matrix, it suffices to
understand the perturbation to the propagator. Let $U_\epsilon$ be the propagator to the equation
\[
\frac{\partial}{\partial t} A_\epsilon(t) = -i H_\epsilon(t) A_\epsilon(t) = -i H(t) A_\epsilon(t) - i \epsilon W(t) A_\epsilon(t). \tag{3.6.10}
\]
Thus, viewing $-i \epsilon W(t) A_\epsilon(t)$ as an inhomogeneous source term, we have by Duhamel’s
principle
\[
A_\epsilon(t) = U(t, 0) A_\epsilon(0) - i \epsilon \int_0^t U(t, s) W(s) A_\epsilon(s)\, ds. \tag{3.6.11}
\]
Thus, in terms of the propagators, we get
\[
U_\epsilon(t, 0) = U(t, 0) - i \epsilon \int_0^t U(t, s) W(s) U_\epsilon(s, 0)\, ds. \tag{3.6.12}
\]
This is the time-dependent version of Dyson’s equation (recall (3.1.21) for the perturbed
Green’s function in the time-independent case).
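The first-order truncation of this Dyson equation can be checked numerically. The sketch below is an illustration under our own assumptions: a time-independent random symmetric H and W of size 4, a perturbation strength written `eps`, and Gauss-Legendre quadrature in s. It compares the exactly perturbed propagator with the first-order expression and confirms that the error scales quadratically in the perturbation strength.

```python
import numpy as np

def expm_hermitian(H, t):
    """exp(-i t H) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * t * w)) @ V.conj().T

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
H = (A + A.T) / 2          # assumed unperturbed Hamiltonian (time-independent)
W = (B + B.T) / 2          # assumed perturbation (time-independent)
t = 1.0

# Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, t].
x, wq = np.polynomial.legendre.leggauss(40)
s_nodes, s_weights = 0.5 * t * (x + 1.0), 0.5 * t * wq

U0 = expm_hermitian(H, t)
# int_0^t U(t, s) W U(s, 0) ds with U(t, s) = exp(-i (t - s) H)
first_order = sum(w * expm_hermitian(H, t - s) @ W @ expm_hermitian(H, s)
                  for s, w in zip(s_nodes, s_weights))

def dyson_error(eps):
    """Distance between the perturbed propagator and its first-order expansion."""
    U_eps = expm_hermitian(H + eps * W, t)
    return np.linalg.norm(U_eps - (U0 - 1j * eps * first_order))

errors = [dyson_error(eps) for eps in (1e-2, 5e-3)]
ratio = errors[0] / errors[1]      # close to 4 if the remainder is second order
print(errors, ratio)
```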
We will focus on the linear response regime and hence keep the terms up to $O(\epsilon)$.
Analogous to time-independent DFPT, we define the operator
\[
(X_0 W)(t) = \left. \frac{\partial P_\epsilon(t)}{\partial \epsilon} \right|_{\epsilon = 0}. \tag{3.6.13}
\]
The propagator satisfies
\[
U_\epsilon(t, 0) = U(t, 0) - i \epsilon \int_0^t U(t, s) W(s) U(s, 0)\, ds + O(\epsilon^2), \tag{3.6.14}
\]
and hence
\[
\begin{aligned}
P_\epsilon(t) &= U_\epsilon(t, 0) P(0) U_\epsilon(0, t) \\
&= P(t) - i \epsilon \int_0^t U(t, s) W(s) U(s, 0) P(0) U(0, t)\, ds \\
&\quad + i \epsilon\, U(t, 0) P(0) \int_0^t U(0, s) W(s) U(s, t)\, ds + O(\epsilon^2),
\end{aligned}
\tag{3.6.15}
\]
and
\[
(X_0 W)(t) = -i \int_0^t U(t, s) W(s) U(s, 0) P(0) U(0, t)\, ds + \text{h.c.} \tag{3.6.16}
\]
Let us consider a system that lies in the ground state initially without perturbation,
i.e., the unperturbed Hamiltonian is time-independent H(t) ≡ H and P (0) ≡ P0 is
the corresponding density matrix at zero temperature P0 = f∞ (H − µ). Here µ is
the chemical potential and we assume that there is an energy gap at t = 0. Since the
Hamiltonian without perturbation is time-independent, the propagator is just given by
where we have used the fact that P0 = f∞ (H − µ) commutes with the propagator
U(0, t) = eitH . The starting time 0 in the above formula is somewhat arbitrary, as
we may relabel the starting time to any point. To remove this arbitrary choice, let us
imagine that we start the perturbation at some time after t = −∞, and hence
\[
(X_0 W)(t) = -i \int_{-\infty}^{t} e^{-i(t - s)H}\, [W(s), P_0]\, e^{i(t - s)H}\, ds. \tag{3.6.19}
\]
for some real frequency ω0. Let Θ(t) be the Heaviside function, i.e., Θ(t) = 1 for t > 0,
Θ(t) = 0 for t < 0, and Θ(0) = 1/2. We can rewrite (3.6.22) as
\[
\int_{-\infty}^{t} e^{-i(t - s)\omega_0}\, W(s)\, ds = \int_{-\infty}^{\infty} e^{-i(t - s)\omega_0}\, W(s)\, \Theta(t - s)\, ds, \tag{3.6.23}
\]
where p. v. stands for the principal value. Furthermore, using the Sokhotski–Plemelj
formula in Appendix A.3, we conclude that
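\[
\mathcal{F}\left[ \int_{-\infty}^{t} e^{-i(t - s)\omega_0}\, W(s)\, ds \right](\omega) = i \lim_{\eta \to 0+} \frac{\widehat{W}(\omega)}{\omega - \omega_0 + i\eta}. \tag{3.6.25}
\]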
Note that (ω − ω0 + iη)−1 only has a pole in the lower half plane.
Substituting the above equation into (3.6.20), we obtain
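\[
\begin{aligned}
\mathcal{F}(X_0 W)(\omega) &= \lim_{\eta \to 0+} \sum_{p,q} \frac{|\psi_p\rangle \langle \psi_p| [\widehat{W}(\omega), P_0] |\psi_q\rangle \langle \psi_q|}{\omega - (\varepsilon_p - \varepsilon_q) + i\eta} \\
&= \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{|\psi_a\rangle \langle \psi_a| \widehat{W}(\omega) |\psi_i\rangle \langle \psi_i|}{\omega - (\varepsilon_a - \varepsilon_i) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{|\psi_i\rangle \langle \psi_i| \widehat{W}(\omega) |\psi_a\rangle \langle \psi_a|}{\omega - (\varepsilon_i - \varepsilon_a) + i\eta}.
\end{aligned}
\tag{3.6.26}
\]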
Here we have used the short-hand forms hψi |δr |ψa i = ψi∗ (r)ψa (r).
For time-dependent density functional theory (TDDFT), we also need to take into
account the change of the potential induced by the perturbation of the density. This is
described by the reducible dynamic polarizability operator denoted by χ(ω). Following
the same derivation as in the time-independent case, we write the formula relating χ and
χ0 as
Note that in TDDFT, the exchange-correlation kernel $f_{\mathrm{Hxc}}$ should also be frequency-dependent
in general, which originates from the memory effect of the exchange-correlation potential.
However, almost all practical TDDFT calculations use the adiabatic approximation, which
applies the ground state exchange-correlation functional
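\[
\begin{aligned}
(\chi_0 \hat g)(r, \omega) &= \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_a(r) \left( \int \psi_a^*(r')\, \hat g(r', \omega)\, \psi_i(r')\, dr' \right) \psi_i^*(r)}{\omega - (\varepsilon_a - \varepsilon_i) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_i(r) \left( \int \psi_i^*(r')\, \hat g(r', \omega)\, \psi_a(r')\, dr' \right) \psi_a^*(r)}{\omega - (\varepsilon_i - \varepsilon_a) + i\eta}.
\end{aligned}
\tag{3.6.29}
\]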
Similarly to the discussion on the time-independent case, we have (the spatial dependence
is omitted in the notation for simplicity)
\[
(\chi_0 \hat g)(\omega) = \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \Big( \xi_{i,+}(\omega)\, \psi_i^* - \xi_{i,-}^*(\omega)\, \psi_i \Big), \tag{3.6.30}
\]
\[
P^{(N)} = |\Psi_0^{(N)}\rangle \langle \Psi_0^{(N)}|. \tag{3.7.1}
\]
Note that the many-body Hamiltonian system is linear. Hence there is no distinction
between the irreducible and reducible polarizability operators in the many-body context.
In order to connect with the (effective) one-body picture of DFT, we consider the
special class of perturbations
\[
W^{(N)} = \sum_{i=1}^{N} W(r_i) = \int W(r) \sum_{i=1}^{N} \delta(r - r_i)\, dr =: \int W(r)\, \hat\rho(r)\, dr, \tag{3.7.5}
\]
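\[
\sum_{k \ne 0} \frac{\langle \Psi_k^{(N)} | U^{(N)} | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | W^{(N)} | \Psi_k^{(N)} \rangle}{E_0^{(N)} - E_k^{(N)}}
+ \sum_{k \ne 0} \frac{\langle \Psi_0^{(N)} | U^{(N)} | \Psi_k^{(N)} \rangle \langle \Psi_k^{(N)} | W^{(N)} | \Psi_0^{(N)} \rangle}{E_0^{(N)} - E_k^{(N)}}. \tag{3.7.7}
\]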
Within the class of perturbations for $W^{(N)}$ as in (3.7.5), as well as the test function $U^{(N)}$
in the same class, the N-body polarizability operator induces a one-body polarizability
operator $\chi^{\mathrm{exact}}$ (we use the superscript “exact” to indicate that this comes from the
many-body theory) such that
\[
\begin{aligned}
\mathrm{Tr}(U \chi^{\mathrm{exact}} W) &= \int U(r) W(r') \sum_{k \ne 0} \frac{\langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle}{E_0^{(N)} - E_k^{(N)}}\, dr\, dr' \\
&\quad + \int U(r) W(r') \sum_{k \ne 0} \frac{\langle \Psi_0^{(N)} | \hat\rho(r) | \Psi_k^{(N)} \rangle \langle \Psi_k^{(N)} | \hat\rho(r') | \Psi_0^{(N)} \rangle}{E_0^{(N)} - E_k^{(N)}}\, dr\, dr'.
\end{aligned}
\tag{3.7.8}
\]
Thus χexact is an integral operator on R3 with kernel
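\[
\begin{aligned}
\chi^{\mathrm{exact}}(r, r') &= \sum_{k \ne 0} \frac{\langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle}{E_0^{(N)} - E_k^{(N)}} \\
&\quad + \sum_{k \ne 0} \frac{\langle \Psi_0^{(N)} | \hat\rho(r) | \Psi_k^{(N)} \rangle \langle \Psi_k^{(N)} | \hat\rho(r') | \Psi_0^{(N)} \rangle}{E_0^{(N)} - E_k^{(N)}}.
\end{aligned}
\tag{3.7.9}
\]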
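\[
\begin{aligned}
\chi_0(r, r') &= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_a^*(r) \psi_i(r) \psi_i^*(r') \psi_a(r')}{\varepsilon_i - \varepsilon_a} + \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_i^*(r) \psi_a(r) \psi_a^*(r') \psi_i(r')}{\varepsilon_i - \varepsilon_a} \\
&= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_a | \delta_r | \psi_i \rangle \langle \psi_i | \delta_{r'} | \psi_a \rangle}{\varepsilon_i - \varepsilon_a} + \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle}{\varepsilon_i - \varepsilon_a},
\end{aligned}
\tag{3.7.10}
\]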
we observe that the two formulae are quite similar since δr is the one-body version
of the density operator ρb(r). Despite the formal similarity, we remark that χexact is
better approximated by the reducible polarizability χ in the context of DFPT, which
represents the response of the electron density to the external potential while taking into
account the electron interactions. On the other hand, the irreducible polarizability χ0
only represents the response of the electron density to the effective potential, which is
not physically measurable.
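The interpretation of χ0 as the response of the density to the effective potential can be illustrated on a small discrete model. The sketch below is our own toy example, not taken from the text: a one-dimensional tight-binding chain with 8 sites, 3 occupied real orbitals and a hypothetical on-site potential `V`. It builds the static χ0 from the sum-over-states formula (3.7.10) and compares it with a finite-difference derivative of the density with respect to the potential.

```python
import numpy as np

def density_and_chi0(V, n_occ):
    """Density and static chi0 of a tight-binding chain with on-site potential V."""
    n = len(V)
    H = np.diag(V) - np.eye(n, k=1) - np.eye(n, k=-1)
    eps, psi = np.linalg.eigh(H)                      # real orbitals, ascending energies
    occ, unocc = psi[:, :n_occ], psi[:, n_occ:]
    rho = np.sum(occ**2, axis=1)
    pair = occ[:, :, None] * unocc[:, None, :]        # psi_i(r) psi_a(r)
    denom = eps[:n_occ, None] - eps[None, n_occ:]     # eps_i - eps_a < 0
    # For real orbitals the two sums in (3.7.10) coincide, hence the factor 2.
    chi0 = 2.0 * np.einsum('ria,sia->rs', pair / denom, pair)
    return rho, chi0

V = 0.3 * np.cos(np.arange(8.0))      # assumed on-site potential
n_occ = 3
rho, chi0 = density_and_chi0(V, n_occ)

# Finite-difference response d rho(r) / d V(r'), one column per perturbed site.
h = 1e-5
chi0_fd = np.empty_like(chi0)
for j in range(len(V)):
    dV = np.zeros_like(V)
    dV[j] = h
    rho_plus, _ = density_and_chi0(V + dV, n_occ)
    rho_minus, _ = density_and_chi0(V - dV, n_occ)
    chi0_fd[:, j] = (rho_plus - rho_minus) / (2.0 * h)

print(np.max(np.abs(chi0 - chi0_fd)))
```

In this model χ0 is symmetric and negative semidefinite, and its columns sum to zero, which reflects conservation of the total number of electrons under a change of potential.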
Similarly to the static case, we can also apply the time-dependent perturbation to
the many-body Hamiltonian, and we get in the frequency domain
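\[
\begin{aligned}
\chi^{\mathrm{exact}}(r, r'; \omega) &= \lim_{\eta \to 0+} \sum_{k \ne 0} \frac{\langle \Psi_0^{(N)} | \hat\rho(r) | \Psi_k^{(N)} \rangle \langle \Psi_k^{(N)} | \hat\rho(r') | \Psi_0^{(N)} \rangle}{\omega - (E_k^{(N)} - E_0^{(N)}) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{k \ne 0} \frac{\langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle}{\omega - (E_0^{(N)} - E_k^{(N)}) + i\eta}.
\end{aligned}
\tag{3.7.11}
\]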
In particular, (3.7.11) indicates that the poles of $\chi^{\mathrm{exact}}(\omega)$ take the form $E_k^{(N)} - E_0^{(N)}$,
which is the difference between the kth neutrally excited state energy and the ground state
energy (note that the electron number is kept the same). Such energy differences
are called (neutral) excitation energies. Again in the context of time-dependent DFPT,
$\chi^{\mathrm{exact}}(\omega)$ can be approximated by the reducible polarizability operator χ(ω), and hence
the excitation energies can be approximated through the poles of χ(ω).
for some function f . In order to solve the above equation, let us first truncate the
set of unoccupied orbitals up to some fixed energy level Ec . Then consider functions
Ψai = ψa ψi∗ and the complex conjugate Ψia := Ψ∗ai = ψi ψa∗ , where i and a are indices
for an occupied orbital and an unoccupied orbital within the energy cutoff, respectively.
Denote ωai = εa − εi , and expand f using this set of functions as
\[
f = \sum_{ai} \big( f_{ai} \Psi_{ai} + f_{ia} \Psi_{ia} \big).
\]
or equivalently,
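\[
\sum_{j}^{\mathrm{occ}} \sum_{b}^{\mathrm{unocc}} \Big( \langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle f_{bj} + \langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle f_{jb} \Big) = (\omega - \omega_{ai}) f_{ai}, \tag{3.8.5}
\]
\[
\sum_{j}^{\mathrm{occ}} \sum_{b}^{\mathrm{unocc}} \Big( \langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle f_{bj} + \langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle f_{jb} \Big) = -(\omega + \omega_{ai}) f_{ia}. \tag{3.8.6}
\]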
This can be viewed as a (non-Hermitian) eigenvalue equation for ω, known as the Casida
equation [15].
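The block structure of (3.8.5) and (3.8.6) can be made explicit on toy data. The following Python sketch is purely illustrative: the orbitals, orbital energies and the positive semidefinite kernel matrix `f_hxc` are random or hand-picked stand-ins, and all variable names are hypothetical. It assembles the non-Hermitian Casida matrix for real orbitals and compares its positive eigenvalues with those of the symmetric reduced problem D^{1/2}(D + 2K)D^{1/2}, a standard reformulation for real orbitals that is not derived in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_grid, n_occ, n_unocc = 12, 2, 3

# Toy data (all assumed): real orthonormal orbitals, orbital energies with a gap,
# and a symmetric positive semidefinite stand-in for the kernel f_Hxc.
orbitals, _ = np.linalg.qr(rng.standard_normal((n_grid, n_occ + n_unocc)))
eps_occ = np.array([-1.0, -0.8])
eps_unocc = np.array([0.2, 0.5, 0.9])
G = rng.standard_normal((n_grid, n_grid))
f_hxc = G @ G.T / n_grid

occ, unocc = orbitals[:, :n_occ], orbitals[:, n_occ:]
# Pair functions Psi_ai(r) = psi_a(r) psi_i(r); for real orbitals Psi_ia = Psi_ai.
pairs = (occ[:, :, None] * unocc[:, None, :]).reshape(n_grid, -1)
omega_ai = (eps_unocc[None, :] - eps_occ[:, None]).ravel()

K = pairs.T @ f_hxc @ pairs            # matrix elements <Psi_ai| f_Hxc |Psi_bj>
D = np.diag(omega_ai)

# (3.8.5)-(3.8.6) rearranged as an eigenvalue problem for omega acting on (f_ai, f_ia).
casida = np.block([[D + K, K], [-K, -(D + K)]])
omega = np.linalg.eigvals(casida)
omega_pos = np.sort(omega.real[omega.real > 0])

# Symmetric reduced problem whose eigenvalues are omega^2.
sqrt_d = np.sqrt(omega_ai)
reduced = sqrt_d[:, None] * (D + 2.0 * K) * sqrt_d[None, :]
omega_ref = np.sqrt(np.linalg.eigvalsh(reduced))
print(omega_pos, omega_ref)
```

The eigenvalues appear in pairs ±ω, so only the positive half carries new information; with K = 0 they reduce to the bare differences ω_ai.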
One important application of TDDFT is the computation of the absorption spectrum,
which is directly related to the excitation energies. Following the discussion in the time-
independent case, the frequency-dependent macroscopic polarizability tensor is defined
as (cf. (3.4.11))
\[
A_{\alpha\beta}(\omega) = -\int r_\alpha\, \chi(r, r'; \omega)\, r'_\beta\, dr\, dr'. \tag{3.8.7}
\]
Here χ(ω) can be interpreted as the exact polarizability operator $\chi^{\mathrm{exact}}(\omega)$ or its
approximation obtained from TDDFT. The absorption spectrum cross section, denoted by σ(ω),
is then given by
\[
\sigma(\omega) = \frac{4\pi\omega}{c}\, \mathrm{Im}\, \mathrm{Tr}[A(\omega)], \tag{3.8.8}
\]
where c is the speed of light, which is approximately 137 in atomic units.
Since only the poles of χ(ω) contribute to Im Tr[A(ω)], we may readily use the
Casida formalism to evaluate the absorption spectrum. However, the Casida formalism
requires the diagonalization of a matrix of size Nocc Nunocc, where Nocc is
the number of occupied states and Nunocc is the number of unoccupied states within
the energy cutoff. For a large system the matrix size becomes O(N²), and the
diagonalization is thus very expensive. A more efficient numerical method is to use a
Lanczos approach to avoid the explicit diagonalization of this matrix.
Furthermore, note that the evaluation of A(ω) only requires applying χ(ω) to a
specific vector, i.e., the vector representing the uniform electric field. We may solve
the time-dependent Sternheimer equation directly to obtain the value of the absorption
spectrum at any specific point of interest. The advantage of this approach is that we
may completely remove the error due to the truncation of the unoccupied states at some
fixed energy level.
multiplier for the constraint that the wavefunction gives the density ρ) such that $\Psi_\lambda$
becomes the ground state of $H_\lambda^{(N)}$ [68]. We denote $E(\lambda) = \langle \Psi_\lambda | H_\lambda^{(N)} | \Psi_\lambda \rangle$, so that
in particular E(1) is the ground state energy of the physical system we are interested in.
To obtain E(1), the idea is to use the fundamental theorem of calculus and write
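\[
\begin{aligned}
E(1) - E(0) &= \int_0^1 \frac{\partial E(\lambda)}{\partial \lambda}\, d\lambda \\
&= \int_0^1 \Big\langle \Psi_\lambda \Big| \frac{\partial H_\lambda^{(N)}}{\partial \lambda} \Big| \Psi_\lambda \Big\rangle\, d\lambda \\
&= \int_0^1 \Big\langle \Psi_\lambda \Big| V_{ee} + \frac{\partial V_\lambda}{\partial \lambda} \Big| \Psi_\lambda \Big\rangle\, d\lambda,
\end{aligned}
\tag{3.9.2}
\]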
where the second equality uses the Hellmann–Feynman theorem. Recall that, given an
N -body wavefunction Ψ, using the symmetry we can write the energy as
\[
\Big\langle \Psi \Big| \sum_{i<j} \frac{1}{|r_i - r_j|} \Big| \Psi \Big\rangle = \frac{1}{2} \iint \frac{\rho^{(2)}(r, r')}{|r - r'|}\, dr\, dr'. \tag{3.9.3}
\]
\[
\rho^{(2)}(r_1, r_2) = N(N - 1) \int |\Psi(r_1, r_2, r_3, \ldots, r_N)|^2\, dr_3 \cdots dr_N. \tag{3.9.4}
\]
Since Vλ is a single-body term, we have
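\[
\int_0^1 \Big\langle \Psi_\lambda \Big| \frac{\partial V_\lambda}{\partial \lambda} \Big| \Psi_\lambda \Big\rangle\, d\lambda = \int_0^1 \int \rho_\lambda \frac{\partial V_\lambda}{\partial \lambda}\, dr\, d\lambda = \int \rho\, (V_1 - V_0)\, dr = -\int \rho V_0\, dr, \tag{3.9.5}
\]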
where we have used the assumption that ρλ does not depend on λ and Vλ=1 = 0.
Recalling that E(0) is just the energy of a noninteracting system with effective potential
V0 , we get
\[
E(1) = E(0) - \int \rho V_0\, dr + \frac{1}{2} \int_0^1 \iint \frac{\rho_\lambda^{(2)}(r, r')}{|r - r'|}\, dr\, dr'\, d\lambda, \tag{3.9.6}
\]
where $\rho_\lambda^{(2)}$ is the two-body electron density for the λ-system. Therefore, to obtain an
“exact” functional, we just need to know $\rho_\lambda^{(2)}$.
Let us recall the definitions of χ0 and χ in (3.6.27) and (3.7.11) and consider an
analytic extension of them to imaginary frequencies (assuming for simplicity that all
eigenfunctions are real):
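\[
\begin{aligned}
\chi_0(r, r'; i\omega) &= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle}{i\omega - (\varepsilon_a - \varepsilon_i)} - \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_a | \delta_r | \psi_i \rangle \langle \psi_i | \delta_{r'} | \psi_a \rangle}{i\omega - (\varepsilon_i - \varepsilon_a)} \\
&= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle}{i\omega - (\varepsilon_a - \varepsilon_i)} + \text{c.c.} \\
&= -2 \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\varepsilon_a - \varepsilon_i}{\omega^2 + (\varepsilon_a - \varepsilon_i)^2} \langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle
\end{aligned}
\tag{3.9.7}
\]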
and similarly
\[
\begin{aligned}
\chi(r, r'; i\omega) &= \sum_{k \ne 0} \frac{\langle \Psi_0^{(N)} | \hat\rho(r) | \Psi_k^{(N)} \rangle \langle \Psi_k^{(N)} | \hat\rho(r') | \Psi_0^{(N)} \rangle}{i\omega - (E_k^{(N)} - E_0^{(N)})} \\
&\quad - \sum_{k \ne 0} \frac{\langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle}{i\omega - (E_0^{(N)} - E_k^{(N)})} \\
&= -2 \sum_{k \ne 0} \frac{E_k^{(N)} - E_0^{(N)}}{\omega^2 + (E_k^{(N)} - E_0^{(N)})^2} \langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle.
\end{aligned}
\tag{3.9.8}
\]
Integrating along the imaginary axis and noting that $\int_0^\infty \frac{a}{a^2 + \omega^2}\, d\omega = \frac{\pi}{2}$ for a > 0, we
get
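\[
\begin{aligned}
\frac{1}{2\pi} \int_{-\infty}^{\infty} \chi_0(r, r'; i\omega)\, d\omega &= -\frac{2}{\pi} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle \int_0^\infty \frac{\varepsilon_a - \varepsilon_i}{\omega^2 + (\varepsilon_a - \varepsilon_i)^2}\, d\omega \\
&= -\sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \langle \psi_i | \delta_{r'} | \psi_a \rangle \langle \psi_a | \delta_r | \psi_i \rangle \\
&= \sum_{i}^{\mathrm{occ}} \sum_{j}^{\mathrm{occ}} \langle \psi_i | \delta_{r'} | \psi_j \rangle \langle \psi_j | \delta_r | \psi_i \rangle - \sum_{i}^{\mathrm{occ}} \langle \psi_i | \delta_r | \psi_i \rangle\, \delta(r - r') \\
&= |P_s(r, r')|^2 - \rho_s(r)\, \delta(r - r')
\end{aligned}
\tag{3.9.9}
\]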
and similarly
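\[
\begin{aligned}
\frac{1}{2\pi} \int_{-\infty}^{\infty} \chi(r, r'; i\omega)\, d\omega &= -\frac{2}{\pi} \sum_{k \ne 0} \int_0^\infty \frac{E_k^{(N)} - E_0^{(N)}}{\omega^2 + (E_k^{(N)} - E_0^{(N)})^2}\, d\omega \\
&\qquad \times \langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle \\
&= -\sum_{k \ne 0} \langle \Psi_k^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_k^{(N)} \rangle \\
&= \langle \Psi_0^{(N)} | \hat\rho(r) | \Psi_0^{(N)} \rangle \langle \Psi_0^{(N)} | \hat\rho(r') | \Psi_0^{(N)} \rangle - \langle \Psi_0^{(N)} | \hat\rho(r) \hat\rho(r') | \Psi_0^{(N)} \rangle \\
&= \rho(r) \rho(r') - \rho^{(2)}(r, r') - \rho(r)\, \delta(r - r'),
\end{aligned}
\tag{3.9.10}
\]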
where we have used ρs and ρ for the one-particle electron density for noninteracting
and interacting systems, respectively. In the above calculation, we have used that
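\[
\begin{aligned}
\hat\rho(r) \hat\rho(r') &= \sum_{i,j} \delta(r - r_i)\, \delta(r' - r_j) \\
&= \sum_{i} \delta(r - r_i)\, \delta(r - r') + \sum_{i \ne j} \delta(r - r_i)\, \delta(r' - r_j)
\end{aligned}
\tag{3.9.11}
\]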
This gives
The right-hand side is exactly the Hartree–Fock approximation of the electronic Cou-
lomb repulsion! This motivates us to look for a better approximation of χ so that we
can hopefully get a better approximation of the many-body correlation energy (which is
completely missing if we replace χ by χ0 ).
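The frequency integral (3.9.9) behind this statement can be checked numerically on a discretized model. The sketch below is our own illustration under stated assumptions: an 8-site tight-binding chain with 3 occupied real orbitals and a hypothetical on-site potential, with δ(r − r') replaced by the Kronecker delta on the grid. It integrates χ0(iω) over the imaginary axis with Gauss-Legendre quadrature after the substitution ω = tan θ and compares the result with |P_s|² − ρ_s δ.

```python
import numpy as np

n, n_occ = 8, 3
V = 0.3 * np.cos(np.arange(float(n)))                 # assumed on-site potential
H = np.diag(V) - np.eye(n, k=1) - np.eye(n, k=-1)
eps, psi = np.linalg.eigh(H)                          # real orbitals
occ, unocc = psi[:, :n_occ], psi[:, n_occ:]
pair = occ[:, :, None] * unocc[:, None, :]            # psi_i(r) psi_a(r)
gap = eps[None, n_occ:] - eps[:n_occ, None]           # eps_a - eps_i > 0

def chi0_imag(omega):
    """chi0(r, r'; i omega) from the last line of (3.9.7), for real orbitals."""
    weight = -2.0 * gap / (omega**2 + gap**2)
    return np.einsum('ria,ia,sia->rs', pair, weight, pair)

# (1 / 2 pi) int_{-inf}^{inf} = (1 / pi) int_0^inf, with omega = tan(theta), theta in (0, pi/2).
x, w = np.polynomial.legendre.leggauss(200)
theta = 0.25 * np.pi * (x + 1.0)
integral = np.zeros((n, n))
for th, wk in zip(theta, 0.25 * np.pi * w):
    integral += wk * chi0_imag(np.tan(th)) / np.cos(th)**2
integral /= np.pi

P = occ @ occ.T                                       # one-body density matrix P_s
expected = P**2 - np.diag(np.diag(P))                 # |P_s|^2 - rho_s delta
print(np.max(np.abs(integral - expected)))
```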
Our previous discussion suggests rewriting $\rho_\lambda^{(2)}$ using $\chi_\lambda$, the dynamic polarizability
operator along the imaginary frequency axis for the λ-system, and then approximating
$\chi_\lambda$. Since we already know that $\chi_0$ gives the Hartree and exchange terms, the additional
contribution gives the correlation energy
\[
E_c = -\frac{1}{2\pi} \frac{1}{2} \int_{-\infty}^{\infty} \int_0^1 \iint \frac{1}{|r - r'|} \Big( \chi_\lambda(r, r'; i\omega) - \chi_0(r, r'; i\omega) \Big)\, dr\, dr'\, d\lambda\, d\omega. \tag{3.9.14}
\]
Recall that $\chi_\lambda$ is the one-body polarizability of a many-body Hamiltonian $H_\lambda^{(N)}$.
According to our discussion of DFPT, in principle the response is captured if we know
the exact exchange-correlation kernel corresponding to the λ-system, $f_{xc}^\lambda$, at the imaginary
frequency (recall (3.6.28)):
\[
\chi_\lambda^{-1} = \chi_0^{-1} - \lambda v_C - f_{xc}^\lambda. \tag{3.9.15}
\]
Notice that the Coulomb kernel is changed to λvC due to the adiabatic connection. If we
know the exchange-correlation functional exactly, we can represent χλ and get the
correlation energy Ec (which is of course circular). But the advantage of expressing the
correlation energy in terms of χλ is that this will hopefully yield practical strategies
for approximating it.
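The RPA proposes a simple approximation of $\chi_\lambda$. It essentially neglects $f_{xc}^\lambda$ in
(3.9.15) and approximates
\[
\chi_\lambda^{-1} \approx \chi_0^{-1} - \lambda v_C, \tag{3.9.16}
\]
or equivalently,
\[
\chi_\lambda \approx \chi_0 + \lambda \chi_0 v_C \chi_\lambda. \tag{3.9.17}
\]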
More precisely, the approximate χλ solves the Dyson-like equation
Thus
Z 1 Z 1
Here the matrix logarithm can be computed by diagonalizing the matrix 1 − χ0 (iω)vC
or by the Cauchy contour integral formula. As a word of caution, there are actually
many flavors of RPA. The one we have just discussed is often known as the direct RPA
or the DFT-flavored RPA. There are various other formulations under the umbrella term
of RPA (e.g., RPA with exchange (RPAX), the Hartree–Fock-flavored RPA, and the
particle-particle RPA) that are beyond the scope of this book.
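The way the coupling-constant integration produces a matrix logarithm can be illustrated with small matrices. In the sketch below, which is only an illustration under our own assumptions, `chi0` is a random negative semidefinite stand-in for χ0(iω) at one fixed frequency and `v` is a random positive definite stand-in for the Coulomb matrix. The code solves the RPA Dyson equation (3.9.17) for each λ, integrates Tr[vC(χλ − χ0)] over λ by quadrature, and compares with the closed form −Tr[ln(1 − χ0 vC) + χ0 vC], evaluated by diagonalization as suggested above. Inserting this into (3.9.14) gives the commonly quoted direct RPA expression (1/2π) ∫₀^∞ Tr[ln(1 − χ0(iω)vC) + χ0(iω)vC] dω.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
B = rng.standard_normal((n, 3))
chi0 = -0.2 * B @ B.T                    # toy negative semidefinite chi0 at one frequency
C = rng.standard_normal((n, n))
v = C @ C.T / n + np.eye(n)              # toy positive definite Coulomb matrix

def chi_lambda(lam):
    """Solve chi = chi0 + lam chi0 v chi, i.e. chi = (I - lam chi0 v)^{-1} chi0."""
    return np.linalg.solve(np.eye(n) - lam * chi0 @ v, chi0)

# Gauss-Legendre quadrature over the coupling constant lam in [0, 1].
x, w = np.polynomial.legendre.leggauss(60)
lams, weights = 0.5 * (x + 1.0), 0.5 * w
coupling_integral = sum(wk * np.trace(v @ (chi_lambda(lk) - chi0))
                        for lk, wk in zip(lams, weights))

# Closed form through the eigenvalues of chi0 v, which coincide with those of
# the symmetric matrix L^T chi0 L, where v = L L^T is a Cholesky factorization.
L = np.linalg.cholesky(v)
mu = np.linalg.eigvalsh(L.T @ chi0 @ L)          # all mu <= 0
closed_form = -np.sum(np.log(1.0 - mu) + mu)
print(coupling_integral, closed_form)
```

Since all eigenvalues of χ0 vC are nonpositive here, the logarithm is well defined, and the resulting contribution to the correlation energy is negative.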
Exercises
1. Following section 3.2, derive the linear response of the density matrix for finite
temperature. Show that
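\[
\frac{\delta P}{\delta V}(W) = \sum_{p,q} \frac{f_p - f_q}{\varepsilon_p - \varepsilon_q}\, |\psi_p\rangle \langle \psi_p | W | \psi_q \rangle \langle \psi_q |, \tag{3.9.22}
\]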
3. Verify the second-order derivative of the total energy with respect to the atomic
position as in (3.4.28) and (3.4.29).
4. Use the decay of Green’s function as in section 3.5 to prove that at finite temper-
ature the density matrix decays exponentially even for systems without a spectral
gap. [Hint: represent the density matrix using Green’s function as (2.6.28).]
5. Prove the claim (3.6.8) using the definition of the time-ordered matrix exponen-
tial.
6. Expand the perturbation series for the time-dependent density matrix (3.6.15) to
the next order.
Appendix A
Notations and
preliminaries
A.1 Notation
General conventions
i imaginary unit
z∗ complex conjugate of the complex number z
N number of electrons
M number of nuclei
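β inverse temperature
$\oint_{\mathcal{C}} d\lambda$ contour integral
$\int$ integrals, for example $\int f\, dr$, $\int_{\mathbb{R}^3} f(r)\, dr$, or
$\int f(x)\, dx := \sum_{\sigma \in \{\uparrow, \downarrow\}} \int_{\mathbb{R}^3} f(r, \sigma)\, dr$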
hψ|, |ψi, hψ|ϕi bra vector, ket vector, and braket in Dirac notation
↑, ↓ spin-up and spin-down components
Coordinates
r, rα single electron spatial coordinate and its Cartesian com-
ponents, α = x, y, z or 1, 2, 3
p, pα single electron momentum coordinate and its Cartesian
components
xi = (r i , σi ) space-spin coordinates of the ith electron
rij = |r i − r j | Euclidean distance between electrons i and j
ZI charge of the Ith nucleus
RI spatial coordinate of the Ith nucleus
Wavefunctions and densities
Ψ or |Ψi N -electron wavefunction
$\Psi_k^{(N)}(x_1, \ldots, x_N)$ space-spin coordinates of the kth excited state of an N-electron
wavefunction
P (N ) N -electron density matrix
P single particle density matrix
Function spaces
H general Hilbert space
Lp (Rd ), Lp (Rd ; C) spaces of real/complex valued Lp functions
H s (Rd ), H s (Rd ; C) spaces of real/complex valued H s functions
L2 (R3 ; C2 ) single electron state space in the real space
$\mathcal{A}_N \equiv \bigwedge^N L^2(\mathbb{R}^3; \mathbb{C}^2)$ N-electron state space in the real space
Notations for matrix representation
$A^\top$ transpose of A
A∗ or A† Hermitian transpose/adjoint of A
Ng number of grid points/degrees of freedom
Nb number of basis functions
Nocc number of occupied orbitals, especially in the spin-
restricted case
Ψ = [ψ1 , . . . , ψN ] a matrix collecting N single particle orbitals
Φ = [φ1 , . . . , φNb ] a matrix collecting Nb basis functions, usually of size
Ng × Nb
C coefficients of the single particle orbitals with respect to
a basis set
H, S discretized Hamiltonian and overlap matrices
G discretized Green’s function
I identity matrix
Other quantities
E electric field
B magnetic field
⊗ tensor product
1(−∞,0) indicator function
fβ finite temperature Fermi–Dirac function
f∞ zero temperature Fermi–Dirac function, the same as an
indicator function
F, F −1 Fourier transform and its inverse
η a small positive quantity approaching 0+
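\[
\left[ -\frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) - \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\varphi^2} \right] \Theta(\theta)\, \Phi(\varphi) = E\, \Theta(\theta)\, \Phi(\varphi). \tag{A.2.2}
\]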
Multiplying both sides of the equation by $\sin^2\theta$, we first separate out the ϕ variable
as
\[
-\frac{\partial^2 \Phi}{\partial \varphi^2} = m^2 \Phi, \tag{A.2.3}
\]
where m2 is an eigenvalue. The solution takes the following form:
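\[
-\frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial \Theta}{\partial\theta} \right) + \frac{m^2}{\sin^2\theta}\, \Theta = k\, \Theta, \tag{A.2.5}
\]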
where k is an eigenvalue. Perform the change of variables
then (A.2.5) is reduced to the associated Legendre equation on the interval [−1, 1]:
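\[
\frac{d}{d\zeta} \left[ (1 - \zeta^2) \frac{d\xi}{d\zeta} \right] + \left( k - \frac{m^2}{1 - \zeta^2} \right) \xi = 0, \quad \xi(-1),\ \xi(1) \text{ are finite}. \tag{A.2.7}
\]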
k = l (l + 1) , l ∈ N. (A.2.8)
\[
P_l^{-m}(\zeta) = (-1)^m \frac{(l - m)!}{(l + m)!}\, P_l^m(\zeta), \quad m = 0, \ldots, l. \tag{A.2.10}
\]
Combining the solution for Θ and Φ, we find that all eigenfunctions Y (θ, ϕ) can be
written in the form
which corresponds to the eigenvalue $E_l = l(l + 1)$. In other words, the eigenvalue $E_l$
has multiplicity 2l + 1. $C_{lm}$ is a normalization factor and is chosen so that
\[
\int Y_{lm}^*(\theta, \varphi)\, Y_{l'm'}(\theta, \varphi)\, \sin\theta\, d\theta\, d\varphi = \delta_{ll'}\, \delta_{mm'}. \tag{A.2.13}
\]
This fact can be generalized to the matrix case: If all eigenvalues of A ∈ CN ×N are
enclosed by the closed contour C, then
\[
f(A) = \frac{1}{2\pi i} \oint_{\mathcal{C}} f(\lambda)\, (\lambda I - A)^{-1}\, d\lambda. \tag{A.3.2}
\]
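The matrix contour formula (A.3.2) lends itself to a direct numerical check. The sketch below is an illustration under our own assumptions: a random symmetric matrix scaled to unit spectral norm, a circular contour of radius 2 centered at the origin, and the trapezoidal rule, with `matrix_function_contour` a hypothetical helper name. It evaluates exp(A) through the contour integral and compares with the eigendecomposition.

```python
import numpy as np

def matrix_function_contour(f, A, center, radius, n_points=64):
    """Approximate (1 / 2 pi i) * contour integral of f(z) (z I - A)^{-1} dz
    over a circle, using the trapezoidal rule in the angle."""
    n = A.shape[0]
    result = np.zeros((n, n), dtype=complex)
    for k in range(n_points):
        phase = np.exp(2j * np.pi * k / n_points)
        z = center + radius * phase
        # dz = i * radius * phase * dtheta, and dtheta / (2 pi) = 1 / n_points
        result += f(z) * radius * phase * np.linalg.inv(z * np.eye(n) - A)
    return result / n_points

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
A /= np.linalg.norm(A, 2)                 # all eigenvalues now lie in [-1, 1]

f_contour = matrix_function_contour(np.exp, A, center=0.0, radius=2.0)

# Reference value from the eigendecomposition of the symmetric matrix A.
w, Q = np.linalg.eigh(A)
f_exact = (Q * np.exp(w)) @ Q.T
print(np.max(np.abs(f_contour - f_exact)))
```

For an analytic f and a contour that keeps a fixed distance from the spectrum, the trapezoidal rule converges geometrically in the number of quadrature points, which is why a modest number of resolvent evaluations suffices here.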
For the Heaviside function
\[
\Theta(t) =
\begin{cases}
1, & t > 0, \\
\tfrac{1}{2}, & t = 0, \\
0, & t < 0,
\end{cases}
\tag{A.3.3}
\]
where in the last equality we have used the fact that the integral of the sinc function is π.
The Sokhotski–Plemelj formula is the following:
\[
\lim_{\eta \to 0+} \int_{\mathbb{R}} \frac{f(x)}{x \pm i\eta}\, dx = \mp i\pi f(0) + \mathrm{p.v.} \int_{\mathbb{R}} \frac{f(x)}{x}\, dx. \tag{A.3.6}
\]
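A small numerical experiment illustrates the formula (A.3.6). The sketch below is our own example, with f a Gaussian and η = 0.01 as assumed values and `regularized_integral` a hypothetical helper. It evaluates the regularized integral with the upper sign after the substitution x = η tan θ. Since f is even, the principal value term vanishes, and the imaginary part should be close to −π f(0) = −π. For this particular f the regularized integral is known in closed form as −iπ e^{η²} erfc(η), which the code uses as a reference.

```python
import math
import numpy as np

def regularized_integral(f, eta, n_points=20000):
    """Midpoint-rule value of the integral of f(x) / (x + i eta) over the real
    line, using the substitution x = eta * tan(theta)."""
    h = np.pi / n_points
    theta = -0.5 * np.pi + (np.arange(n_points) + 0.5) * h
    x = eta * np.tan(theta)
    # With this substitution, dx / (x + i eta) = (tan(theta) - i) dtheta.
    return h * np.sum(f(x) * (np.tan(theta) - 1j))

gaussian = lambda x: np.exp(-x**2)       # assumed test function, f(0) = 1
eta = 1e-2
value = regularized_integral(gaussian, eta)
reference = -math.pi * math.exp(eta**2) * math.erfc(eta)
print(value, reference, -math.pi)
```

Decreasing η moves the imaginary part toward −π at a rate proportional to η.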
Appendix B
Selected references for further reading
Chapter 1
R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York, 1965.
S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics, Springer, Berlin, 2011.
E. Kaxiras, Atomic and Electronic Structure of Solids, Cambridge University Press, Cambridge, 2003.
L. Landau and E. Lifshitz, Quantum Mechanics: Non-Relativistic Theory, Butterworth-Heinemann, Oxford, 1991.
M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, Academic Press, New York, 1978.
J. J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, Reading, MA, 1994.
Chapter 2
D. R. Bowler and T. Miyazaki, O(N) methods in electronic structure calculations, Rep. Progr. Phys., 75 (2012), 036503.
E. Cancès, M. Defranceschi, W. Kutzelnigg, C. Le Bris, and Y. Maday, Computational quantum chemistry: A primer, Handb. Numer. Anal., 10 (2003), pp. 3–270.
H. Eschrig, The Fundamentals of Density Functional Theory, B. G. Teubner, Stuttgart, 1996.
S. Goedecker, Linear scaling electronic structure methods, Rev. Mod. Phys., 71 (1999), pp. 1085–1123.
R. O. Jones, Density functional theory: Its origins, rise to prominence, and future, Rev. Mod. Phys., 87 (2015), pp. 897–923.
R. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, New York, 2004.
D. Marx and J. Hutter, Ab Initio Molecular Dynamics: Basic Theory and Advanced Methods, Cambridge University Press, New York, 2009.
N. Marzari, A. A. Mostofi, J. R. Yates, I. Souza, and D. Vanderbilt, Maximally localized Wannier functions: Theory and applications, Rev. Mod. Phys., 84 (2012), pp. 1419–1475.
R. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
Chapter 3
S. Baroni, S. de Gironcoli, A. Dal Corso, and P. Giannozzi, Phonons and related crystal properties from density-functional perturbation theory, Rev. Mod. Phys., 73 (2001), pp. 515–562.
M. E. Casida and M. Huix-Rotllant, Progress in time-dependent density-functional theory, Annu. Rev. Phys. Chem., 63 (2012), pp. 287–323.
T. Kato, Perturbation Theory for Linear Operators, Springer, Berlin, 1966.
M. Marques and E. K. U. Gross, Time-dependent density functional theory, Annu. Rev. Phys. Chem., 55 (2004), pp. 427–455.
G. Onida, L. Reining, and A. Rubio, Electronic excitations: Density-functional versus many-body Green’s-function approaches, Rev. Mod. Phys., 74 (2002), p. 601.
X. Ren, P. Rinke, C. Joas, and M. Scheffler, Random-phase approximation and its applications in computational chemistry and materials science, J. Mater. Sci., 47 (2012), pp. 7447–7471.
Bibliography
[1] S. L. A DLER, Quantum theory of the dielectric constant in real solids, Phys. Rev., 126
(1962), pp. 413–420. (Cited on p. 92)
[2] D. G. A NDERSON, Iterative procedures for nonlinear integral equations, J. Assoc. Comput.
Mach., 12 (1965), pp. 547–560. (Cited on p. 52)
[4] N. A SHCROFT AND N. M ERMIN, Solid State Physics, Thomson Learning, Toronto, 1976.
(Cited on p. 21)
[5] S. BARONI , S. DE G IRONCOLI , A. DAL C ORSO , AND P. G IANNOZZI, Phonons and re-
lated crystal properties from density-functional perturbation theory, Rev. Mod. Phys., 73
(2001), pp. 515–562. (Cited on pp. 92, 95)
[7] A. D. B ECKE, Density functional thermochemistry. III. The role of exact exchange, J. Chem.
Phys., 98 (1993), p. 5648. (Cited on p. 44)
[8] D. B OHM AND D. P INES, A collective description of electron interactions: III. Coulomb
interactions in a degenerate electron gas, Phys. Rev., 92 (1953), p. 609. (Cited on p. 106)
[12] K. B URKE, Perspective on density functional theory, J. Chem. Phys., 136 (2012), 150901.
(Cited on p. 42)
[14] R. C AR AND M. PARRINELLO, Unified approach for molecular dynamics and density-
functional theory, Phys. Rev. Lett., 55 (1985), pp. 2471–2474. (Cited on p. 78)
119
120 Bibliography
[15] M. E. C ASIDA, Time-dependent density functional response theory for molecules, in Recent
Advances in Density Functional Methods:(Part I), 1 (1995), p. 155. (Cited on p. 105)
Downloaded 09/24/21 to 18.10.248.49 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms
[17] D. M. C EPERLEY AND B. J. A LDER, Ground state of the electron gas by a stochastic
method, Phys. Rev. Lett., 45 (1980), pp. 566–569. (Cited on p. 43)
[18] A. J. C OLEMAN, Structure of Fermion density matrices, Rev. Mod. Phys., 35 (1963),
pp. 668–687. (Cited on p. 57)
[20] A. DAMLE , A. L EVITT, AND L. L IN, Variational formulation for Wannier functions with
entangled band structure, Multiscale Model. Simul., 17 (2019), pp. 167–191. (Cited on
p. 75)
[22] E. DAVIDSON, The iterative calculation of a few of the lowest eigenvalues and correspond-
ing eigenvectors of large real-symmetric matrices, J. Comput. Phys., 17 (1975), pp. 87–94.
(Cited on p. 60)
[23] W. E AND J. L U, The Kohn-Sham Equation for Deformed Crystals, Memoirs of the Amer-
ican Mathematical Society, 221, American Mathematical Society, Providence, RI, 2013.
(Cited on pp. 91, 98)
[24] A. E RISMAN AND W. T INNEY, On computing certain elements of the inverse of a sparse
matrix, Commun. ACM, 18 (1975), pp. 177–179. (Cited on p. 65)
[26] H.-R. FANG AND Y. S AAD, Two classes of multisecant methods for nonlinear acceleration,
Numer. Linear Algebra Appl., 16 (2009), pp. 197–221. (Cited on p. 51)
[27] R. F EYNMAN AND A. H IBBS, Quantum Mechanics and Path Integrals, McGraw-Hill, New
York, 1965. (Not cited)
[30] S. G OEDECKER, Linear scaling electronic structure methods, Rev. Mod. Phys., 71 (1999),
pp. 1085–1123. (Cited on pp. 64, 73)
Bibliography 121
[31] G. H. G OLUB AND C. F. VAN L OAN, Matrix Computations, 4th ed. Johns Hopkins Uni-
versity Press, Baltimore, MD, 2013. (Cited on p. 76)
Downloaded 09/24/21 to 18.10.248.49 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms
[32] X. G ONZE AND C. L EE, Dynamical matrices, Born effective charges, dielectric permittivity
tensors, and interatomic force constants from density-functional perturbation theory, Phys.
Rev. B, 55 (1997), p. 10355. (Cited on p. 95)
[34] L. H EDIN, New method for calculating the one-particle Green’s function with application
to the electron-gas problem, Phys. Rev., 139 (1965), p. A796. (Cited on p. 92)
[36] N. J. H IGHAM, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, PA,
2008. (Cited on p. 63)
[37] P. H OHENBERG AND W. KOHN, Inhomogeneous electron gas, Phys. Rev., 136 (1964),
pp. B864–B871. (Cited on p. 39)
[40] R. O. J ONES, Density functional theory: Its origins, rise to prominence, and future, Rev.
Mod. Phys., 87 (2015), pp. 897–923. (Not cited)
[41] T. K ATO, Perturbation Theory for Linear Operators, Springer, Berlin, 1966. (Not cited)
[42] E. K AXIRAS, Atomic and Electronic Structure of Solids, Cambridge University Press, Cam-
bridge, 2003. (Not cited)
[44] A. V. K NYAZEV, Toward the optimal preconditioned eigensolver: Locally optimal block
preconditioned conjugate gradient method, SIAM J. Sci. Comput., 23 (2001), pp. 517–541.
(Cited on p. 60)
[45] W. KOHN, Density functional and density matrix method scaling linearly with the number
of atoms, Phys. Rev. Lett., 76 (1996), pp. 3168–3171. (Cited on pp. 64, 73)
[46] W. KOHN AND L. S HAM, Self-consistent equations including exchange and correlation
effects, Phys. Rev., 140 (1965), pp. A1133–A1138. (Cited on p. 39)
[49] M. L EVY, Universal variational functionals of electron densities, first-order density ma-
trices, and natural spin-orbitals and solution of the v-representability problem, Proc. Natl.
Acad. Sci. USA, 76 (1979), pp. 6062–6065. (Cited on p. 39)
122 Bibliography
[50] E. H. L IEB, Thomas-Fermi and related theories of atoms and molecules, Rev. Mod. Phys.,
53 (1981), pp. 603–641. (Cited on p. 70)
[51] E. H. Lieb, Density functionals for Coulomb systems, Int. J. Quantum Chem., 24 (1983),
pp. 243–277. (Cited on pp. 39, 40)
[52] E. H. Lieb and M. Loss, Analysis, 2nd ed., Graduate Studies in Mathematics 14,
American Mathematical Society, Providence, RI, 2001. (Cited on p. 40)
[55] L. Lin and C. Yang, Elliptic preconditioner for accelerating the self-consistent field
iteration in Kohn–Sham density functional theory, SIAM J. Sci. Comput., 35 (2013),
pp. S277–S298. (Cited on p. 93)
[57] L. D. Marks and D. R. Luke, Robust mixing for ab initio quantum mechanical
calculations, Phys. Rev. B, 78 (2008), 075114. (Cited on p. 51)
[59] R. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge
University Press, New York, 2004. (Not cited)
[60] D. Marx and J. Hutter, Ab Initio Molecular Dynamics: Basic Theory and Advanced
Methods, Cambridge University Press, New York, 2009. (Not cited)
[63] R. McWeeny, Some recent advances in density matrix theory, Rev. Mod. Phys., 32 (1960),
pp. 335–369. (Cited on p. 61)
[64] N. Mermin, Thermal properties of the inhomogeneous electron gas, Phys. Rev., 137
(1965), p. A1441. (Cited on p. 39)
[65] H. J. Monkhorst and J. D. Pack, Special points for Brillouin-zone integrations, Phys.
Rev. B, 13 (1976), p. 5188. (Cited on p. 73)
[67] G. Panati and A. Pisante, Bloch bundles, Marzari-Vanderbilt functional and maximally
localized Wannier functions, Comm. Math. Phys., 322 (2013), pp. 835–875. (Cited on p. 98)
[68] R. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford
University Press, New York, 1989. (Cited on pp. 57, 107)
[72] J. P. Perdew and K. Schmidt, Jacob’s ladder of density functional approximations for
the exchange-correlation energy, AIP Conf. Proc., 577 (2001), pp. 1–20. (Cited on p. 42)
[74] P. Pulay, Convergence acceleration of iterative sequences: The case of SCF iteration,
Chem. Phys. Lett., 73 (1980), pp. 393–398. (Cited on p. 52)
[75] P. Pulay, Improved SCF convergence acceleration, J. Comput. Chem., 3 (1982), pp. 54–
69. (Cited on p. 52)
[76] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional
Analysis, Academic Press, New York, 1978. (Cited on p. 89)
[78] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for
solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856–869.
(Cited on pp. 51, 102)
[79] J. J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, Reading, MA, 1994. (Cited
on p. 13)
[82] M. Teter, M. Payne, and D. Allan, Solution of Schrödinger’s equation for large
systems, Phys. Rev. B, 40 (1989), p. 12255. (Cited on p. 102)
[84] G. H. Wannier, The structure of electronic excitation levels in insulating crystals, Phys.
Rev., 52 (1937), p. 191. (Cited on p. 75)
[85] N. Wiser, Dielectric constant with local field effects included, Phys. Rev., 129 (1963),
pp. 62–69. (Cited on p. 92)
Index
fermion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26, 29
finite temperature . . . . . . . . . . . . . . . . . . . . . . . . 55
M
matrix function . . . . . . . . . . . . . . . . . . . . . . . . . . 64
P
Pauli exclusion principle . . . . . . . . . . . . . . 30, 34
S
Schrödinger equation . . . . . . . . . . . . . . . . . . . . . 10
electronic structure theory and prepare them to conduct research in this area.
A Mathematical Introduction to Electronic Structure Theory begins with an elementary introduction
of quantum mechanics, including the uncertainty principle and the Hartree–Fock theory, which
is considered the starting point of modern electronic structure theory. The authors then provide
an in-depth discussion of two carefully selected topics that are directly related to several aspects
of modern electronic structure calculations: density matrix based algorithms and linear response
theory. Chapter 2 introduces the Kohn–Sham density functional theory with a focus on the density
matrix based numerical algorithms, and Chapter 3 introduces linear response theory, which provides
a unified viewpoint of several important phenomena in physics and numerics. An understanding of
these topics will prepare readers for more advanced topics in this field. The book concludes with the
random phase approximation to the correlation energy.
The book is written for advanced undergraduate and beginning graduate students, specifically those
with mathematical backgrounds but without a priori knowledge of quantum mechanics, and can
be used for self-study by researchers, instructors, and other scientists. The book can also serve as a
starting point to learn about many-body perturbation theory, a topic at the frontier of the study of
interacting electrons.
SL04
ISBN 978-1-611975-79-6