Downloaded 09/24/21 to 18.10.248.49 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms

SIAM Spotlights is a new book series that comprises brief and enlightening books on timely
topics in applied and computational mathematics and scientific computing. The books, spanning
125 pages or less, will be produced on an accelerated schedule and will be attractively priced.

Editorial Board
Peter Benner, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg
Timothy Chartier, Davidson College
Felipe Cucker, City University of Hong Kong
Donald Estep, Colorado State University
Chen Greif, University of British Columbia
Per Christian Hansen, Technical University of Denmark
Nicholas Higham, The University of Manchester
Jeffrey Humpherys, Brigham Young University
C. T. Kelley, North Carolina State University
Michael J. Miksis, Northwestern University
Padma Raghavan, Vanderbilt University
Charles Van Loan, Cornell University
Margaret Wright, New York University

Josef Málek and Zdeněk Strakoš, Preconditioning and the Conjugate Gradient Method in the Context of
Solving PDEs
Paul G. Constantine, Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies
Dominique Orban and Mario Arioli, Iterative Solution of Symmetric Quasi-Definite Linear Systems
Lin Lin and Jianfeng Lu, A Mathematical Introduction to Electronic Structure Theory


A Mathematical
Introduction
to Electronic
Structure Theory

Lin Lin
University of California
Berkeley, California

Jianfeng Lu
Duke University
Durham, North Carolina

Society for Industrial and Applied Mathematics


Philadelphia


Copyright © 2019 by the Society for Industrial and Applied Mathematics

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be
reproduced, stored, or transmitted in any manner without the written permission of the
publisher. For information, write to the Society for Industrial and Applied Mathematics,
3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.

Trademarked names may be used in this book without the inclusion of a trademark symbol.
These names are used in an editorial context only; no infringement of trademark is intended.

Publications Director Kivmars H. Bowling


Acquisitions Editor Paula Callaghan
Developmental Editor Gina Rinelli Harris
Managing Editor Kelly Thomas
Production Editor Ann Manning Allen
Copy Editor Nicola Howcroft
Production Manager Donna Witzleben
Production Coordinator Cally A. Shrader
Compositor Cheryl Hufnagle
Graphic Designer Doug Smock

Library of Congress Cataloging-in-Publication Data


Names: Lin, Lin, 1985- author. | Lu, Jianfeng, 1983- author.
Title: A mathematical introduction to electronic structure theory
Lin Lin (University of California, Berkeley, California), Jianfeng Lu (Duke University,
Durham, North Carolina).
Description: Philadelphia : Society for Industrial and Applied Mathematics,
[2019] | Series: SIAM spotlights ; 4 | Includes bibliographical
references and index.
Identifiers: LCCN 2019005297 (print) | LCCN 2019010547 (ebook) | ISBN
9781611975802 | ISBN 9781611975796 (print : alk. paper)
Subjects: LCSH: Electronic structure.
Classification: LCC QC176.8.E4 (ebook) | LCC QC176.8.E4 L55 2019 (print) |
DDC 530.4/110151--dc23
LC record available at https://lccn.loc.gov/2019005297

SIAM is a registered trademark.


To our families,
Dongxu and Xiaolu
and
Jichun, Hantian, and Hanwen

Contents

Preface

1 Basic theory of quantum mechanics
1.1 Finite dimensional quantum systems
1.2 Schrödinger equation in the real space
1.3 Hydrogen atom
1.4 Periodic systems
1.5 Tensor product spaces: Two spin-1/2 particles
1.6 Identical particles

2 Density functional theory: Formulation and algorithms
2.1 Hartree–Fock theory
2.2 Kohn–Sham density functional theory
2.3 Nonlinear eigenvalue problem
2.4 Self-consistent field iteration
2.5 Density matrix formulation
2.6 Extension to finite temperature
2.7 Density matrix algorithms
2.8 Brillouin zone sampling for periodic systems
2.9 Localization
2.10 Geometry optimization and ab initio molecular dynamics
2.11 Time-dependent density functional theory

3 Linear response theory
3.1 Perturbation of Green’s function
3.2 Perturbation of the density matrix
3.3 Density functional perturbation theory
3.4 Applications of density functional perturbation theory
3.5 Exponential decay of the Green’s function
3.6 Time-dependent density functional perturbation theory
3.7 Perturbation of the many-body Hamiltonian
3.8 Casida formalism
3.9 Random phase approximation

A Notations and preliminaries
A.1 Notation
A.2 Spherical harmonics
A.3 Equalities in complex analysis

Selected references for further reading
Bibliography
Index

Preface

Electronic structure theories, particularly those represented by Kohn–Sham density
functional theory, have been developed into workhorse tools with a wide range of
scientific applications in physics, chemistry, materials science, and related fields. This
book is based on a series of summer school courses for advanced undergraduate and
beginning graduate students in electronic structure theory taught at Beijing and Berkeley
from 2012 to 2016. The goals of the book are as follows.
1. To present a fundamental level of background knowledge on quantum mechanics
towards the understanding of electronic structure theory.
2. To introduce some basic ideas of electronic structure theory and its associated
algorithms and analysis in a mathematically attractive manner.
3. To arouse the interest of students in this challenging and useful subject.
Within this short introductory textbook, it is beyond our ability to provide an
exhaustive account of electronic structure theory, to provide a quantitative assessment of
the scientific value of various approximation schemes, or to introduce electronic
structure theory in a fully mathematically rigorous manner. The book is specifically written
for readers with mathematical backgrounds without a priori knowledge of quantum
mechanics. However, readers with a background in physics, chemistry, or materials science
might also find certain discussions in Chapter 2 and Chapter 3 of this book refreshing.
The book is divided into three chapters. Chapter 1 contains an elementary
introduction to quantum mechanics, starting from a finite quantum system that can be completely
characterized by matrices of size 2. Nonetheless, some important aspects of quantum
mechanics, such as the uncertainty principle, can be readily observed. Chapter 2
introduces the Hartree–Fock theory followed by Kohn–Sham density functional theory.
One particular area of focus is density matrix-based numerical algorithms. From our
perspective this is an intrinsic viewpoint of Kohn–Sham density functional theory, and
is also more suitable for large-scale computation. Chapter 3 introduces linear response
theory, which provides a unified viewpoint on a number of important phenomena in
physics and numerics, such as screening, phonons, self-consistent field iteration, and
localization. The book finishes with the random phase approximation to the correlation
energy, which can be obtained from linear response theory at the many-body level. This
also serves as a starting point for readers to learn about many-body perturbation theory,
which is a topic at the current frontier of the study of interacting electrons.
We acknowledge that the format, and even the title of this book, are largely inspired
by A Mathematical Introduction to Fluid Mechanics, a classical textbook by Alexandre
Chorin and Jerrold Marsden. While we do not even attempt to repeat the success of this
classical text for mathematical aspects of computational fluid mechanics, we observe
some similarity between the current status of electronic structure theory and that of
computational fluid mechanics in the 1970s. We also find that the format forces us
to carefully select the topics and to concisely state the results. We hope that readers,
especially those who are new to this field, will benefit from these choices. We thank
Weinan E for encouraging us to engage in the summer school courses and to write the
book in this format. We are also grateful for discussions with Volker Blum, Roberto Car,
Alexandre Chorin, Yingzhou Li, James Sethian, Lin-Wang Wang, Chao Yang, Lexing
Ying, and Weitao Yang during the writing of this book.

Lin Lin
University of California, Berkeley

Jianfeng Lu
Duke University

Chapter 1

Basic theory of quantum mechanics

The Stern–Gerlach (SG) experiment (1922) was one of the earliest experiments whose
result could not be explained by classical physics by any means. Hence it
demonstrated unambiguously the importance of quantum effects. The SG experiment
can be explained using the quantum theory for the spin-1/2 particle, which is a quantum
system that can be represented simply by $2 \times 2$ matrices. We will use the spin-1/2
particle to introduce some of the basic concepts of quantum mechanics, such as state space,
operators, measurement, the uncertainty principle, and the evolution equation.
Furthermore, the theory for the spin-1/2 particle can be readily generalized to finite dimensional
quantum systems, i.e., systems that can be represented by finite dimensional matrices.
Throughout the text we use atomic units $m_e = e = \hbar = \frac{1}{4\pi\epsilon_0} = k_B = 1$,
where $m_e$ is the mass of an electron, $e$ is the unit charge, $\hbar$ is the reduced Planck
constant, $\frac{1}{4\pi\epsilon_0}$ is the Coulomb constant, and $k_B$ is the Boltzmann constant.
In atomic units, the unit of length is called Bohr, and the unit of energy is called Hartree.

1.1 Finite dimensional quantum systems


Stern–Gerlach experiment
We briefly introduce the SG experiment below; it is the only experiment that
will be directly discussed in this book. The conceptual setup of the SG experiment is
given in Figure 1.1. Hot silver atoms are produced from an oven with initial velocity
along the $y$-direction. They then pass through an inhomogeneous magnetic field, which
is produced using a homogeneous magnetic field pointing along the $z$-direction plus a
small perturbation. The final position of each atom along the $z$-coordinate is recorded by
the detecting screen on the right. In the oven the silver atom loses a valence electron, and
hence carries a magnetic moment called the spin, denoted by a vector
$\boldsymbol{\mu} \in \mathbb{R}^3$. Whether the trajectory of the silver atom bends upwards or
downwards through the magnetic field depends on the direction of $\boldsymbol{\mu}$. While we
leave the detailed setup of the experiment as well as the derivation of the classical physics
prediction to standard physics textbooks for interested readers, the prediction from classical
physics for the distribution of the $z$-coordinate on the screen can be qualitatively described
by Figure 1.2(a). Since every silver atom was prepared in a thermal state, the initial magnetic
moment $\boldsymbol{\mu}$ can point in any direction, and hence classical physics will always
predict a continuous distribution on the screen (the particular shape of the distribution is not
important to our discussion). The result from the SG experiment was shockingly different: the
detecting screen always shows a discrete, symmetric bimodal distribution as in Figure 1.2(b).

Figure 1.1. Setup of the Stern–Gerlach experiment.

Figure 1.2. (a) The prediction of the result of the Stern–Gerlach experiment from classical theory and (b) the experimental result.

From now on let us represent the entire experimental apparatus in Figure 1.1 by a
box $\mathrm{SG}_z$ as in Figure 1.3(a). We define the two output states from the $\mathrm{SG}_z$
apparatus as $|+_z\rangle$ and $|-_z\rangle$, respectively. Following the Dirac notation,
$|\cdot\rangle$ is called a ket and is used to denote a given quantum state. We may block one
channel of the output, say, $|-_z\rangle$. This produces a filtering apparatus, which removes
the $|-_z\rangle$ contribution from any initial mixed state.

Figure 1.3. (a) The Stern–Gerlach apparatus along the z-direction and (b) a filtering apparatus.

In order to understand the nature of the experimental result, Stern and Gerlach
continued with a few other experiments. First, if we pass a $|+_z\rangle$ state produced by the
$\mathrm{SG}_z$ filtering apparatus to another $\mathrm{SG}_z$ apparatus, there is only one output
state $|+_z\rangle$ (Figure 1.4(a)). This shows that $|+_z\rangle$ is intrinsically related to the
$\mathrm{SG}_z$ apparatus. By symmetry, the same result holds for the $|-_z\rangle$ state. Note
that there is nothing special about the $z$-direction. We may rotate the magnets so that the silver
atoms can bend upwards and downwards along the $x$-direction. The resulting apparatus is denoted by
$\mathrm{SG}_x$, and its two output states are denoted by $|+_x\rangle$ and $|-_x\rangle$,
respectively. Similarly, we can define the $\mathrm{SG}_y$ apparatus with output states
$|+_y\rangle$ and $|-_y\rangle$. By symmetry (and experimental validation), the result in
Figure 1.4(a) holds as well when $z$ is replaced by $x$ or $y$. Second, if we pass the
$|+_x\rangle$ state produced by the $\mathrm{SG}_x$ filtering apparatus to an $\mathrm{SG}_z$
apparatus, we observe the same bimodal symmetric pattern as in Figure 1.2(b). This is illustrated
in Figure 1.4(b), and hence the $|+_x\rangle$ state can be “converted” to $|+_z\rangle$ and
$|-_z\rangle$ states. By symmetry the result holds for any input state and SG apparatus
associated with different directions. Finally, we may combine Figures 1.4(a) and (b) and
arrive at Figure 1.4(c). Note that although passing $|+_z\rangle$ through $\mathrm{SG}_z$ only
produces the $|+_z\rangle$ state, the combined $\mathrm{SG}_x$ filtering apparatus and
$\mathrm{SG}_z$ apparatus can generate both $|+_z\rangle$ and $|-_z\rangle$ states! How shall we
explain these results?

Figure 1.4. Additional Stern–Gerlach experiments.

State space
Below we demonstrate that the mysterious experimental results from the SG
experiments can be consistently explained using linear algebra for $2 \times 2$ matrices. Quantum
mechanics postulates that the state of a spin-1/2 particle is a vector in a two-dimensional
vector space $\mathcal{H}$ isomorphic to $\mathbb{C}^2$, which is called a state vector space (or
simply the state space). An element $|\psi\rangle \in \mathcal{H}$ is called a state vector or a ket
vector. The states $|+_z\rangle$, $|-_z\rangle$ form a basis of $\mathcal{H}$, and hence any general
state vector $|\psi\rangle$ can be written as a linear combination of these two basis vectors as

$$ |\psi\rangle = c_1 |+_z\rangle + c_2 |-_z\rangle, \quad c_1, c_2 \in \mathbb{C}. \tag{1.1.1} $$

In particular, the states $|\pm_x\rangle$, $|\pm_y\rangle$ are also states in $\mathcal{H}$ and
can be expanded as linear combinations of $|\pm_z\rangle$.
Quantum mechanics postulates that $\mathcal{H}$ is equipped with an inner product $(\cdot, \cdot)$
and hence $\mathcal{H}$ is a Hilbert space (in the infinite dimensional case, it is also postulated
that the space is complete with respect to the inner product). The inner product between two ket
vectors is often written in Dirac notation as $\langle\varphi|\psi\rangle$. In particular, the
notation $\langle\varphi|$ can be used separately as a bra vector, which is a vector in the dual
space of $\mathcal{H}$. The states $|\pm_z\rangle$ are orthonormal under this inner product, i.e.,

$$ \langle +_z|+_z\rangle = \langle -_z|-_z\rangle = 1, \quad \langle +_z|-_z\rangle = 0. \tag{1.1.2} $$


Since the choice of the $z$-direction is arbitrary, the orthonormality condition in (1.1.2)
should hold when $z$ is replaced by $x$ or $y$. One immediate consequence is that the states
$|+_x\rangle$ and $|-_x\rangle$ are linearly independent and hence also form a basis for
$\mathcal{H}$, and the states $|\pm_z\rangle$ can be expanded as linear combinations of
$|\pm_x\rangle$ as well. The same result holds when $x$ is replaced by $y$.
The discussion above can be generalized to any finite dimensional quantum system,
represented by a finite dimensional Hilbert space $\mathcal{H}$. Given an orthonormal basis set of
$\mathcal{H}$ denoted by $\{|\varphi_i\rangle\}_{i=1}^{n}$, any state vector $|\psi\rangle$ can be
written as a linear combination of these basis vectors as

$$ |\psi\rangle = \sum_{i=1}^{n} c_i |\varphi_i\rangle, \quad c_i \in \mathbb{C}. \tag{1.1.3} $$

Quantum mechanics also postulates that the physical meanings of $|\psi\rangle$ and $c|\psi\rangle$
are the same for any $c \in \mathbb{C}$ with $c \neq 0$. If $c = 0$, then $0|\psi\rangle = |0\rangle$
is called the zero vector or the null state. Hence without loss of generality we may normalize any
nonzero vector $|\psi\rangle$ so that $\langle\psi|\psi\rangle = 1$. We may readily verify

$$ 1 = \langle\psi|\psi\rangle = \sum_{i,j=1}^{n} c_i^* c_j \langle\varphi_i|\varphi_j\rangle = \sum_{i,j=1}^{n} c_i^* c_j \delta_{ij} = \sum_{i=1}^{n} |c_i|^2. \tag{1.1.4} $$

Here $\delta_{ij}$ is the Kronecker $\delta$-symbol. Note that $\{|c_i|^2\}$ can be interpreted as a
probability distribution over the state vectors $\{|\varphi_i\rangle\}$. Indeed, quantum mechanics
postulates that such a probability distribution is precisely the distribution for the outcome of a
measurement process of a physical observable, as will be explained below.
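The normalization in (1.1.4) can be illustrated with a short numerical sketch. The three-component coefficient vector below is a hypothetical example chosen for illustration, and the helper names `normalize` and `probabilities` are not from the text.

```python
import math

def normalize(coeffs):
    """Rescale expansion coefficients c_i so that sum_i |c_i|^2 = 1."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    if norm == 0.0:
        raise ValueError("the null state cannot be normalized")
    return [c / norm for c in coeffs]

def probabilities(coeffs):
    """Return the list |c_i|^2 for a coefficient vector."""
    return [abs(c) ** 2 for c in coeffs]

# Hypothetical unnormalized state 1*|phi_1> + (1+1j)*|phi_2> + 2j*|phi_3>.
# The squared moduli are 1, 2, 4, so the norm squared is 7.
c = normalize([1.0, 1.0 + 1.0j, 2.0j])
p = probabilities(c)   # approximately [1/7, 2/7, 4/7]
```

Multiplying the input by any nonzero complex number leaves `p` unchanged, which is the numerical counterpart of the statement that $|\psi\rangle$ and $c|\psi\rangle$ carry the same physical meaning.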

Quantum operator
We first introduce some notation. Take a linear operator $\hat{A}$ acting on a finite dimensional
Hilbert space $\mathcal{H}$. The adjoint of $\hat{A}$, denoted by $\hat{A}^*$, is defined such that

$$ \langle\varphi|\hat{A}\psi\rangle = \langle\hat{A}^*\varphi|\psi\rangle \quad \forall\, |\varphi\rangle, |\psi\rangle \in \mathcal{H}. \tag{1.1.5} $$

If the adjoint of $\hat{A}$ is the same as $\hat{A}$ itself, then $\hat{A}$ is called a self-adjoint
operator. In Dirac notation, the inner product $\langle\varphi|\hat{A}\psi\rangle$ can also be
written in a more symmetric form as $\langle\varphi|\hat{A}|\psi\rangle$ when $\hat{A}$ is
self-adjoint. All self-adjoint operators on a finite dimensional space can be diagonalized, i.e.,
there exist eigenvalues $a_i$ and corresponding eigenvectors (eigenstates) $|\varphi_i\rangle$,
$i = 1, \ldots, n$, such that, for each $i$,

$$ \hat{A}|\varphi_i\rangle = a_i |\varphi_i\rangle. \tag{1.1.6} $$

Since $\hat{A}$ is self-adjoint, all eigenvalues are real, and the eigenvectors can be chosen to
form an orthonormal basis set of $\mathcal{H}$, i.e.,

$$ \langle\varphi_i|\varphi_j\rangle = \delta_{ij}, \quad i, j = 1, \ldots, n. \tag{1.1.7} $$

Quantum mechanics postulates that spin, and in general all physical observables,
can be represented using a self-adjoint operator on $\mathcal{H}$. The procedure of mapping an
observable in classical mechanics to a linear operator in quantum mechanics is called
quantization. At first sight, this may (and should) seem strange: physical observables
such as length, weight, density, position, and velocity are always numbers, or at most
vectors. In what sense can a physical observable be represented by a self-adjoint
operator?
To answer this question, we first recall that in classical physics, the state of a
system can be measured without interfering with the system. This means that we may
point out that a classical particle is at position $\mathbf{r}$ with velocity $\mathbf{v}$, and the
measurement process needed to obtain such information only interacts with the classical state to a
negligible extent. On the other hand, quantum mechanics postulates that all
measurement processes must interact with the quantum state. More specifically, as Dirac stated,
“a measurement always causes the system to jump into an eigenstate of the dynamical
variable that is being measured, the eigenvalue this eigenstate belongs to being equal to
the result of the measurement.” In other words, assume the initial state is given as the
linear combination of the eigenstates of $\hat{A}$ as

$$ |\psi\rangle = \sum_{i} c_i |\varphi_i\rangle, \quad c_i \in \mathbb{C}. \tag{1.1.8} $$

If we would like to measure the value of the physical observable corresponding to $\hat{A}$ for
the state $|\psi\rangle$, then the state after the measurement must be one of the eigenvectors of
$\hat{A}$, regardless of how the measurement process is designed. This interpretation of the
measurement process provides the physical meaning of the eigenvalues: For finite dimensional quantum
systems, the value of any physical observable only takes discrete values, given by the eigenvalues
of the corresponding self-adjoint operator.
As indicated in (1.1.4), the coefficients $|c_i|^2$ can be interpreted as a probability
distribution over the eigenstates. Quantum mechanics postulates that in a measurement
process, the state $|\psi\rangle$ randomly collapses into an eigenstate $|\varphi_i\rangle$, and the
outcome value of the measurement is the associated eigenvalue $a_i$, with probability $|c_i|^2$.
Hence the result of any single measurement in quantum physics is almost never
predetermined. The one important exception is when $|\psi\rangle$ is already an eigenstate,
say, $|\varphi_1\rangle$. In this case, $|c_i|^2$ is 1 if $i = 1$ and 0 otherwise. Hence the result
of the measurement is deterministically $a_1$, with the state being $|\varphi_1\rangle$ after
measurement.
In the context of the SG experiment, the $\mathrm{SG}_z$ apparatus performs a measurement
of the value of the spin along the $z$-direction. The corresponding linear operator is
denoted by $\hat{S}_z$, and its eigenstates are $|\pm_z\rangle$. Similarly, the spin operators
along the $x$, $y$ directions are denoted by $\hat{S}_x$, $\hat{S}_y$ with eigenstates
$|\pm_x\rangle$, $|\pm_y\rangle$, respectively. This explanation is consistent with the result of
the SG experiment in Figure 1.4(a), where the output state is the same as the input state due to the
fact that $|+_z\rangle$ is already an eigenstate of $\hat{S}_z$.
Although one cannot deterministically predict the value of the outcome of a single
measurement associated with a linear operator $\hat{A}$, the expectation value denoted by
$\langle\hat{A}\rangle$ can be predicted deterministically. This is because

$$ \langle\hat{A}\rangle = \langle\psi|\hat{A}|\psi\rangle = \sum_{i,j=1}^{n} c_i^* c_j a_i \delta_{ij} = \sum_{i=1}^{n} a_i |c_i|^2. \tag{1.1.9} $$

Here the right-hand side of the equation is precisely the expectation value of the physical
observable, and the expectation value can be experimentally obtained if one can prepare
a large number of copies of the same state $|\psi\rangle$ and repeat the measurements.
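As a rough illustration of (1.1.9), the sketch below simulates repeated measurements on many copies of one state and compares the sample mean with $\sum_i a_i |c_i|^2$. The state, the number of samples, and the seed are arbitrary choices made for this example, and the function names are hypothetical.

```python
import random

# Eigenvalues of S_z in the basis {|+z>, |-z>} (atomic units).
eigenvalues = [0.5, -0.5]

def expectation(coeffs, eigenvalues):
    """<A> = sum_i a_i |c_i|^2 for a normalized state, as in (1.1.9)."""
    return sum(a * abs(c) ** 2 for a, c in zip(eigenvalues, coeffs))

def measure(coeffs, eigenvalues, rng):
    """Simulate one measurement: return a_i with probability |c_i|^2."""
    weights = [abs(c) ** 2 for c in coeffs]
    return rng.choices(eigenvalues, weights=weights, k=1)[0]

# Illustrative state with |c_1|^2 = 0.8 and |c_2|^2 = 0.2.
psi = [0.8 ** 0.5, 1j * 0.2 ** 0.5]
exact = expectation(psi, eigenvalues)   # 0.5 * 0.8 - 0.5 * 0.2

rng = random.Random(0)                  # fixed seed, for repeatability only
samples = [measure(psi, eigenvalues, rng) for _ in range(20000)]
empirical = sum(samples) / len(samples)
```

Each individual sample is either $+\tfrac12$ or $-\tfrac12$; only the average over many copies approaches the deterministic value `exact`.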
We now determine the expansion coefficients relating the states $|\pm_x\rangle$, $|\pm_y\rangle$,
and $|\pm_z\rangle$. The experiment in Figure 1.4(b) yields a bimodal symmetric distribution. The
probabilistic explanation above implies the following relation:

$$ |+_x\rangle = \frac{1}{\sqrt{2}} \left( |+_z\rangle + e^{i\alpha} |-_z\rangle \right), \quad \alpha \in \mathbb{R}. $$

Here $\frac{1}{\sqrt{2}}$ is a normalization factor, $e^{i\alpha}$ is an arbitrary phase factor, and
we choose the coefficient of $|+_z\rangle$ to be exactly $\frac{1}{\sqrt{2}}$ since the physical
meaning of a state is not changed when multiplied by any nonzero complex number. We also have

$$ |-_x\rangle = \frac{1}{\sqrt{2}} \left( |+_z\rangle - e^{i\alpha} |-_z\rangle \right), $$

where the minus sign comes from the orthogonality condition $\langle +_x|-_x\rangle = 0$. Similarly,

$$ |+_y\rangle = \frac{1}{\sqrt{2}} \left( |+_z\rangle + e^{i\beta} |-_z\rangle \right), \quad |-_y\rangle = \frac{1}{\sqrt{2}} \left( |+_z\rangle - e^{i\beta} |-_z\rangle \right). $$

In order to find the relation between $\alpha$ and $\beta$, let us repeat the experiment in
Figure 1.4(b) by passing $|\pm_x\rangle$ through $\mathrm{SG}_y$; we should still observe a bimodal
symmetric distribution. This indicates that

$$ |\langle \pm_y|\pm_x\rangle|^2 = \frac{1}{2}. $$

Combining the relations above, we have

$$ \alpha - \beta = \pm\frac{\pi}{2} + 2\pi n, \quad n \in \mathbb{Z}. $$
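The step from the overlap condition to the phase relation can be spelled out as follows. This is a worked sketch that uses only the expansions of $|+_x\rangle$ and $|+_y\rangle$ in terms of $|\pm_z\rangle$ and the orthonormality relation (1.1.2).

```latex
\begin{aligned}
\langle +_y | +_x \rangle
  &= \tfrac{1}{2}\bigl(\langle +_z| + e^{-i\beta}\langle -_z|\bigr)
     \bigl(|+_z\rangle + e^{i\alpha}|-_z\rangle\bigr)
   = \tfrac{1}{2}\bigl(1 + e^{i(\alpha-\beta)}\bigr), \\
|\langle +_y | +_x \rangle|^2
  &= \tfrac{1}{4}\bigl(2 + 2\cos(\alpha-\beta)\bigr)
   = \tfrac{1}{2}\bigl(1 + \cos(\alpha-\beta)\bigr).
\end{aligned}
```

Setting the last expression equal to $\tfrac12$ forces $\cos(\alpha-\beta) = 0$, which is the stated relation. The remaining sign combinations give $\tfrac12\bigl(1 \pm \cos(\alpha-\beta)\bigr)$ and lead to the same condition.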
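The SG experiment cannot provide any more information, and we need to impose some
conventions. Let $\alpha = 0$, $\beta = \pi/2$, and we have

$$ |\pm_x\rangle = \frac{1}{\sqrt{2}} \left[ |+_z\rangle \pm |-_z\rangle \right], \quad |\pm_y\rangle = \frac{1}{\sqrt{2}} \left[ |+_z\rangle \pm i |-_z\rangle \right]. \tag{1.1.10} $$

It is also convenient to write down the coordinates of the states $|\pm_x\rangle$, $|\pm_y\rangle$,
$|\pm_z\rangle$ with respect to the basis set $\{|\pm_z\rangle\}$. Since $\mathcal{H}$ is isomorphic
to $\mathbb{C}^2$, we can identify these vectors with their coordinates as

$$
\begin{aligned}
|+_z\rangle &= \begin{pmatrix} 1 \\ 0 \end{pmatrix}, &
|-_z\rangle &= \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \\
|+_x\rangle &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, &
|-_x\rangle &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\
|+_y\rangle &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, &
|-_y\rangle &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix}.
\end{aligned}
\tag{1.1.11}
$$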
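The coordinates in (1.1.11) can be checked numerically. The following sketch, with hypothetical helper names `inner` and `prob`, verifies that states along the same axis are orthonormal and that states along different axes overlap with probability 1/2, which is the bimodal symmetric pattern described above.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Coordinates from (1.1.11) with respect to the basis {|+z>, |-z>}.
kets = {
    "+z": [1.0, 0.0],   "-z": [0.0, 1.0],
    "+x": [S, S],       "-x": [S, -S],
    "+y": [S, 1j * S],  "-y": [S, -1j * S],
}

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def prob(a, b):
    """Transition probability |<a|b>|^2 between two labelled states."""
    return abs(inner(kets[a], kets[b])) ** 2

# Same axis: orthonormal. Different axes: probability 1/2.
same_axis = prob("+x", "-x"), prob("+y", "-y"), prob("+z", "-z")
cross_axis = prob("+y", "+x"), prob("-y", "+x"), prob("+z", "+x"), prob("-z", "+y")
```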

We are now ready to construct the linear operators for the spin operators $\hat{S}_x$, $\hat{S}_y$,
$\hat{S}_z$. It is also common to combine the three operators in vector form as
$\hat{\mathbf{S}} = (\hat{S}_x, \hat{S}_y, \hat{S}_z)^\top$. Furthermore, one can define the spin
operator along an arbitrary unit vector $\mathbf{n} \in \mathbb{R}^3$ as

$$ \hat{S}_{\mathbf{n}} = \hat{\mathbf{S}} \cdot \mathbf{n} = \hat{S}_x n_x + \hat{S}_y n_y + \hat{S}_z n_z. \tag{1.1.12} $$

By the convention in quantum mechanics, the $|\pm_z\rangle$ are eigenstates of $\hat{S}_z$ with
eigenvalues $\pm\frac{1}{2}$. This means that

$$ \hat{S}_z = \frac{1}{2} \left( |+_z\rangle\langle +_z| - |-_z\rangle\langle -_z| \right). \tag{1.1.13} $$

Using the coordinates in (1.1.11), we have the matrix representation of $\hat{S}_z$ as

$$ \hat{S}_z = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.1.14} $$

Similarly, we can define $\hat{S}_x$, $\hat{S}_y$ as

$$ \hat{S}_x = \frac{1}{2} \left( |+_x\rangle\langle +_x| - |-_x\rangle\langle -_x| \right), \quad \hat{S}_y = \frac{1}{2} \left( |+_y\rangle\langle +_y| - |-_y\rangle\langle -_y| \right), \tag{1.1.15} $$

with the corresponding matrix representation

$$ \hat{S}_x = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \hat{S}_y = \frac{1}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{1.1.16} $$

The matrix representation can be concisely written using the Pauli matrices as
$\hat{\mathbf{S}} = \frac{1}{2}\boldsymbol{\sigma} = \frac{1}{2}(\sigma_x, \sigma_y, \sigma_z)^\top$, where

$$ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.1.17} $$

The Pauli matrices are Hermitian and unitary. Together with the $2 \times 2$ identity matrix,
they form a basis (with real coefficients) for all self-adjoint linear operators on $\mathbb{C}^2$.
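The construction in (1.1.13) and (1.1.15) can be reproduced in a few lines. This is a minimal sketch using plain nested lists for $2 \times 2$ matrices. The helper names are hypothetical, and the eigenstate coordinates are those of (1.1.11).

```python
import math

S = 1.0 / math.sqrt(2.0)

def outer(ket):
    """Projector |ket><ket| as a 2x2 nested list."""
    return [[a * complex(b).conjugate() for b in ket] for a in ket]

def spin_from_eigenstates(plus, minus):
    """0.5 * (|+><+| - |-><-|), following (1.1.13) and (1.1.15)."""
    p, m = outer(plus), outer(minus)
    return [[0.5 * (p[i][j] - m[i][j]) for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

Sx = spin_from_eigenstates([S, S], [S, -S])
Sy = spin_from_eigenstates([S, 1j * S], [S, -1j * S])
Sz = spin_from_eigenstates([1.0, 0.0], [0.0, 1.0])

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]
identity = [[1, 0], [0, 1]]
```

Comparing `Sx`, `Sy`, `Sz` with one half of the Pauli matrices via `close` recovers (1.1.14) and (1.1.16), and `close(matmul(dagger(s), s), identity)` checks unitarity for each Pauli matrix `s`.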
From now on, we take $\{|\pm_z\rangle\}$ as the standard basis set for the spin-1/2 particle.
$|+_z\rangle$ is often denoted by $|{\uparrow}\rangle$ and is called the spin-up state. Similarly,
$|-_z\rangle$ is denoted by $|{\downarrow}\rangle$ and is called the spin-down state.
We remark that up to this point, it is impossible to tell whether the explanation has
fundamental value or is simply phenomenological. On the other hand, today we know
that spin is indeed an intrinsic degree of freedom of quantum particles and strangely has
no analogue in classical physics. Numerous experimental results have demonstrated
that the linear algebra interpretation fits consistently into the larger picture of physics
to the extent of the present day’s knowledge, and can be practically regarded as the
fundamental theory for spin-1/2 particles. It is useful to keep in mind that such a
pragmatic approach was taken during the development of many theoretical concepts known
in modern physics, rather than following some rigorous axiomatic approach. We refer
readers to the reading materials at the end of the book for a more detailed discussion on
the foundation of quantum mechanics with a more axiomatic approach, or alternative
formulations of quantum mechanics such as path integrals. These topics are beyond the
scope of this book.

Uncertainty principle
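One immediate consequence of the linear operator interpretation for physical
observables in quantum physics is that the operators representing physical observables generally
do not commute. For example,

$$
\begin{aligned}
\hat{S}_x \hat{S}_z &= \frac{1}{4} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \\
\hat{S}_z \hat{S}_x &= \frac{1}{4} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\end{aligned}
$$

Hence $\hat{S}_x \hat{S}_z \neq \hat{S}_z \hat{S}_x$.

The extent to which the two operators do not commute is an important quantity in
quantum mechanics. Let $\hat{A}$ and $\hat{B}$ be two operators on $\mathcal{H}$. Then the
commutator of $\hat{A}$ and $\hat{B}$ is defined as

$$ [\hat{A}, \hat{B}] := \hat{A}\hat{B} - \hat{B}\hat{A}. \tag{1.1.18} $$

$\hat{A}$, $\hat{B}$ are called compatible if $[\hat{A}, \hat{B}] = 0$, and otherwise they are
incompatible.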
Using the notation of the commutator, we have

$$ [\hat{S}_z, \hat{S}_x] = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = i\hat{S}_y. $$

Similarly, we obtain a cyclic relation

$$ [\hat{S}_x, \hat{S}_y] = i\hat{S}_z, \quad [\hat{S}_y, \hat{S}_z] = i\hat{S}_x. $$

Hence $\hat{S}_x$, $\hat{S}_y$, $\hat{S}_z$ are mutually incompatible. One can verify as an exercise
that the square of the magnitude of the spin operator,
$\hat{\mathbf{S}}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2$, is compatible with
all the spin operators along any individual direction.
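The commutation relations above, and the exercise on $\hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2$, can be checked numerically. The sketch below uses the matrix representations (1.1.14) and (1.1.16). The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple c * A."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    """[A, B] = AB - BA, as in (1.1.18)."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]
Sz = [[0.5, 0], [0, -0.5]]

# Square of the magnitude of the spin operator.
S2 = add(add(matmul(Sx, Sx), matmul(Sy, Sy)), matmul(Sz, Sz))
zero = [[0, 0], [0, 0]]

cyclic_ok = (close(commutator(Sx, Sy), scale(1j, Sz))
             and close(commutator(Sy, Sz), scale(1j, Sx))
             and close(commutator(Sz, Sx), scale(1j, Sy)))
S2_compatible = all(close(commutator(S2, s), zero) for s in (Sx, Sy, Sz))
```

In this representation `S2` comes out as a multiple of the identity matrix, which is why it commutes with every component.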
The compatibility condition has a direct physical consequence. From linear algebra
we know that for compatible Hermitian matrices $\hat{A}$ and $\hat{B}$, we can always find
eigenvectors $|\varphi_i\rangle$ so that the two operators can be simultaneously diagonalized:

$$ \hat{A}|\varphi_i\rangle = a_i |\varphi_i\rangle, \quad \hat{B}|\varphi_i\rangle = b_i |\varphi_i\rangle. \tag{1.1.19} $$

Recall the postulate that any measurement leaves the system in an eigenstate of the operator
corresponding to the measured physical observable. Then if two operators can be simultaneously
diagonalized using the same set of eigenstates, it means that one can simultaneously measure
$\hat{A}$ and $\hat{B}$. The compatibility condition is sufficient and necessary. In other words, if
the two operators are incompatible, then one cannot always simultaneously measure the values of the
two physical observables.
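The statement above can be quantified in terms of the uncertainty principle, which
can be formulated in terms of an inequality for the fluctuation of the measurements for
$\hat{A}$ and $\hat{B}$. For a given operator $\hat{A}$ and state $|\psi\rangle$, define an operator

$$ \Delta\hat{A} = \hat{A} - \langle\hat{A}\rangle I := \hat{A} - \langle\psi|\hat{A}|\psi\rangle I. $$

Thus $\Delta\hat{A}$ is an operator with zero expectation value:

$$ \langle\Delta\hat{A}\rangle = \langle\psi|\Delta\hat{A}|\psi\rangle = 0. $$

The variance can be defined using $\Delta\hat{A}$ as

$$ \langle\Delta\hat{A}^2\rangle = \langle\psi|(\hat{A} - \langle\hat{A}\rangle I)^2|\psi\rangle = \langle\hat{A}^2\rangle - \langle\hat{A}\rangle^2. \tag{1.1.20} $$

If the operators $\hat{A}$, $\hat{B}$ are compatible, and $|\psi\rangle$ is one of their common
eigenvectors, then

$$ \langle\Delta\hat{A}^2\rangle = \langle\Delta\hat{B}^2\rangle = 0. $$

This means that there is no uncertainty in measuring the values of both $\hat{A}$ and $\hat{B}$.
In general, however, $\langle\Delta\hat{A}^2\rangle$ and $\langle\Delta\hat{B}^2\rangle$ cannot both
be made arbitrarily small.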
For any two Hermitian operators Â, B̂ on H, recall the Cauchy–Schwarz inequality
h∆Â2 ih∆B̂ 2 i ≥ |h∆Â∆B̂i|2 .
Observe that
1 1
∆Â∆B̂ = (∆Â∆B̂ + ∆B̂∆Â) + (∆Â∆B̂ − ∆B̂∆Â)
2 2
1 1
= {∆Â, ∆B̂} + [∆Â, ∆B̂],
2 2
where {Â, B̂} = ÂB̂ + B̂ Â is called the anti-commutator.
Notice that {∆Â, ∆B̂} is Hermitian, so ⟨{∆Â, ∆B̂}⟩ is real. Similarly, we find
that ⟨[∆Â, ∆B̂]⟩ is purely imaginary by noticing that i[∆Â, ∆B̂] is Hermitian. Hence
|⟨∆Â∆B̂⟩|² is the sum of the squared moduli of these two contributions. Moreover,
[∆Â, ∆B̂] = [Â, B̂], since the identity commutes with every operator. Then
$$|\langle\Delta\hat{A}\,\Delta\hat{B}\rangle|^2 \ge \frac{1}{4}|\langle[\Delta\hat{A}, \Delta\hat{B}]\rangle|^2 = \frac{1}{4}|\langle[\hat{A}, \hat{B}]\rangle|^2.$$
Therefore
$$\langle\Delta\hat{A}^2\rangle\langle\Delta\hat{B}^2\rangle \ge \frac{1}{4}|\langle[\hat{A}, \hat{B}]\rangle|^2. \tag{1.1.21}$$
Equation (1.1.21) is called the uncertainty principle in the context of finite dimensional
quantum systems. It states that the product of the variances ⟨∆Â²⟩⟨∆B̂²⟩ of two
operators is bounded from below by a quantity determined by the expectation value of
the commutator. Due to the uncertainty principle, one cannot in general obtain
simultaneously precise measurements of, e.g., Ŝx and Ŝz.
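The inequality (1.1.21) can be tested on randomly drawn states. The sketch below assumes the Pauli-matrix representation of Ŝx and Ŝz; the helper names `expval` and `variance` and the sampling procedure are illustrative choices, not part of the text.

```python
import numpy as np

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

def expval(A, psi):
    """Expectation value <psi|A|psi> for a normalized state psi."""
    return np.vdot(psi, A @ psi)

def variance(A, psi):
    """Variance <A^2> - <A>^2, cf. (1.1.20)."""
    return (expval(A @ A, psi) - expval(A, psi) ** 2).real

rng = np.random.default_rng(0)
gaps = []
for _ in range(1000):
    # Random normalized state in C^2.
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)
    lhs = variance(Sx, psi) * variance(Sz, psi)
    rhs = 0.25 * abs(expval(Sx @ Sz - Sz @ Sx, psi)) ** 2
    gaps.append(lhs - rhs)

# The smallest gap should be nonnegative up to rounding error.
min_gap = min(gaps)
print(min_gap)
```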

Schrödinger equation
In order to understand how a quantum state |ψ(t0)⟩ at time t0 evolves into another
state |ψ(t)⟩ at time t > t0, quantum mechanics postulates that there is a linear operator
Û(t, t0), called the propagator, which is independent of the initial state |ψ(t0)⟩ and
satisfies |ψ(t)⟩ = Û(t, t0)|ψ(t0)⟩. The normalization convention implies that
$$\langle\psi(t)|\psi(t)\rangle = \langle\psi(t_0)|\hat{U}^*(t, t_0)\hat{U}(t, t_0)|\psi(t_0)\rangle = 1$$
for any t ≥ t0 and any initial state |ψ(t0)⟩. Therefore, Û*(t, t0)Û(t, t0) = I. Since Û(t, t0)
is a finite dimensional matrix, the operator Û(t, t0) is unitary for any t ≥ t0.
Another natural property that the evolution operator Û should satisfy is that for times
t0 < t1 < t2,
$$|\psi(t_2)\rangle = \hat{U}(t_2, t_0)|\psi(t_0)\rangle = \hat{U}(t_2, t_1)\hat{U}(t_1, t_0)|\psi(t_0)\rangle.$$
This is called the semi-group property. If we take t2 = t + ∆t, t1 = t, then
$$\begin{aligned}
|\psi(t + \Delta t)\rangle &= \hat{U}(t + \Delta t, t_0)|\psi(t_0)\rangle \\
&= \hat{U}(t + \Delta t, t)\hat{U}(t, t_0)|\psi(t_0)\rangle \\
&= \hat{U}(t + \Delta t, t)|\psi(t)\rangle.
\end{aligned}$$
Assuming the evolution of |ψ(t)⟩ is continuous, i.e., lim_{∆t→0+} |ψ(t + ∆t)⟩ = |ψ(t)⟩,
we have
$$\hat{U}(t, t) = \lim_{\Delta t \to 0^+} \hat{U}(t + \Delta t, t) = I.$$
Then we can formally write down the Taylor expansion
$$\hat{U}(t + \Delta t, t) = I - i\hat{\Omega}(t)\Delta t + O(\Delta t^2).$$
Hence
$$\begin{aligned}
\hat{U}^*(t + \Delta t, t)\hat{U}(t + \Delta t, t) &= (I + i\hat{\Omega}^*(t)\Delta t + O(\Delta t^2))(I - i\hat{\Omega}(t)\Delta t + O(\Delta t^2)) \\
&= I + i(\hat{\Omega}^*(t) - \hat{\Omega}(t))\Delta t + O(\Delta t^2).
\end{aligned}$$
In order to satisfy the unitarity condition to first order with respect to ∆t, we must have
Ω̂*(t) = Ω̂(t). Therefore Ω̂(t) is a self-adjoint operator and can be associated with a
physical observable.
Quantum mechanics postulates that Ω̂ is given by the Hamiltonian operator Ĥ,
which is obtained from the quantization process of the Hamiltonian in classical physics,
and is related to the total energy of the system. Then
$$|\psi(t + \Delta t)\rangle = |\psi(t)\rangle - i\hat{H}(t)\Delta t\,|\psi(t)\rangle + O(\Delta t^2).$$
Taking the limit ∆t → 0, we arrive at the differential equation
$$i\partial_t|\psi(t)\rangle = \hat{H}(t)|\psi(t)\rangle, \tag{1.1.22}$$
known as the Schrödinger equation.


Let us consider the solution to the Schrödinger equation when the Hamiltonian is
time-independent, i.e., Ĥ(t) ≡ Ĥ. Then Ĥ can be diagonalized as
$$\hat{H}|\varphi_i\rangle = E_i|\varphi_i\rangle,$$
with eigenvalues E_0 ≤ E_1 ≤ ··· ≤ E_{n−1}. If all eigenvalues are distinct, then we may
distinguish |ϕ_0⟩ as the ground state, |ϕ_1⟩ as the first excited state, |ϕ_2⟩ as the second
excited state, etc.
If the initial state is an eigenvector of Ĥ, |ψ(t0)⟩ = |ϕ_i⟩, the solution to the
Schrödinger equation is
$$|\psi(t)\rangle = e^{-iE_i(t - t_0)}|\psi(t_0)\rangle.$$
Hence if we start from an eigenvector, the state remains unchanged apart from a
rotating phase factor. Since the eigenvectors of any self-adjoint operator acting on H form
a basis of H, any initial state can be expanded as |ψ(t0)⟩ = Σ_i c_i|ϕ_i⟩, and the solution
to the Schrödinger equation is
$$|\psi(t)\rangle = \sum_{i=0}^{n-1} c_i e^{-iE_i(t - t_0)}|\varphi_i\rangle.$$
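The eigen-expansion above translates directly into a numerical propagation scheme. The following sketch, with an arbitrary random Hermitian matrix standing in for Ĥ and a hypothetical helper `evolve`, checks that the norm is preserved and that the result satisfies (1.1.22) up to finite-difference error.

```python
import numpy as np

def evolve(H, psi0, t, t0=0.0):
    """Propagate psi0 from t0 to t for a time-independent Hermitian H."""
    E, V = np.linalg.eigh(H)           # H V = V diag(E), columns of V orthonormal
    c = V.conj().T @ psi0              # expansion coefficients c_i = <phi_i|psi0>
    return V @ (np.exp(-1j * E * (t - t0)) * c)

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2               # illustrative Hermitian "Hamiltonian"
psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

t = 0.7
psi_t = evolve(H, psi0, t)
norm_t = np.linalg.norm(psi_t)

# Check i d/dt psi = H psi with a central finite difference in time.
dt = 1e-5
lhs = 1j * (evolve(H, psi0, t + dt) - evolve(H, psi0, t - dt)) / (2 * dt)
residual = np.linalg.norm(lhs - H @ psi_t)
print(norm_t, residual)
```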
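As an application of the Schrödinger equation, we study the evolution of the
expectation value. Let the operator Â and |ψ(t0)⟩ be given, and introduce
the following notation:
$$\langle\hat{A}\rangle(t) := \langle\psi(t)|\hat{A}|\psi(t)\rangle.$$
Then ⟨Â⟩(t) satisfies the equation
$$\begin{aligned}
i\frac{d}{dt}\langle\hat{A}\rangle(t) &= i\langle\partial_t\psi(t)|\hat{A}|\psi(t)\rangle + i\langle\psi(t)|\hat{A}|\partial_t\psi(t)\rangle \\
&= -\langle\psi(t)|\hat{H}(t)\hat{A}|\psi(t)\rangle + \langle\psi(t)|\hat{A}\hat{H}(t)|\psi(t)\rangle,
\end{aligned}$$
which gives
$$i\frac{d}{dt}\langle\hat{A}\rangle(t) = \langle[\hat{A}, \hat{H}]\rangle(t). \tag{1.1.23}$$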

Example: Spin precession


As an example, we consider a spin-1/2 system in a constant magnetic field $\mathbf{B} = (0, 0, B)^{\top}$.
The quantum mechanical Hamiltonian is
$$\hat{H} = \hat{\mathbf{S}} \cdot \mathbf{B} = B\hat{S}_z = \frac{B}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
The ground state energy is E_0 = −B/2 with eigenstate |ψ_0⟩ = |↓⟩, and the first excited
state energy is E_1 = B/2 with eigenstate |ψ_1⟩ = |↑⟩.
Starting from an initial state
$$|\psi(0)\rangle = |{+x}\rangle = \frac{1}{\sqrt{2}}(|{\uparrow}\rangle + |{\downarrow}\rangle), \tag{1.1.24}$$
the solution of the Schrödinger equation is
$$|\psi(t)\rangle = \frac{e^{-iBt/2}}{\sqrt{2}}|{\uparrow}\rangle + \frac{e^{+iBt/2}}{\sqrt{2}}|{\downarrow}\rangle.$$

The evolution of the expectation value of the spin operator Ŝx satisfies the following
equation:
$$i\frac{d}{dt}\langle\hat{S}_x\rangle(t) = \langle[\hat{S}_x, \hat{H}]\rangle(t) = -iB\langle\hat{S}_y\rangle(t). \tag{1.1.25}$$
Similarly,
$$i\frac{d}{dt}\langle\hat{S}_y\rangle(t) = \langle[\hat{S}_y, \hat{H}]\rangle(t) = iB\langle\hat{S}_x\rangle(t).$$
Hence
$$\frac{d^2}{dt^2}\langle\hat{S}_x\rangle(t) = -B^2\langle\hat{S}_x\rangle(t). \tag{1.1.26}$$
From the initial state (1.1.24) we may readily compute ⟨Ŝx⟩(0) = 1/2, ⟨Ŝy⟩(0) = 0.
Together with (1.1.25), we find that (1.1.26) is a second-order ODE with initial conditions
$$\langle\hat{S}_x\rangle(0) = \frac{1}{2}, \qquad \frac{d}{dt}\langle\hat{S}_x\rangle(0) = 0.$$
Solving this equation, we have
$$\langle\hat{S}_x\rangle(t) = \frac{1}{2}\cos Bt.$$
Similarly,
$$\langle\hat{S}_y\rangle(t) = \frac{1}{2}\sin Bt, \qquad \langle\hat{S}_z\rangle(t) = 0.$$
Therefore the expectation value of the spin operator is
$$\langle\hat{\mathbf{S}}\rangle(t) = \frac{1}{2}(\cos Bt, \sin Bt, 0)^{\top},$$
which rotates about the axis pointing along the magnetic field B. Since the spin is itself
an intrinsic rotating degree of freedom of a quantum particle, this motion is called the
spin precession.
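The precession can be reproduced numerically by propagating the initial state (1.1.24) and evaluating the three expectation values. This is a sketch under the assumption of the Pauli-matrix representation; the field strength `B = 2.0` and the helper `spin_expectation` are arbitrary illustrative choices.

```python
import numpy as np

B = 2.0  # illustrative field strength
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
H = B * Sz

# Initial state |+x> = (|up> + |down>)/sqrt(2), cf. (1.1.24).
psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)

def spin_expectation(t):
    """Return (<Sx>, <Sy>, <Sz>) at time t."""
    # H is diagonal, so the propagator is diag(exp(-i E_j t)).
    U = np.diag(np.exp(-1j * np.diag(H) * t))
    psi = U @ psi0
    return np.array([np.vdot(psi, S @ psi).real for S in (Sx, Sy, Sz)])

times = np.linspace(0.0, 5.0, 51)
numeric = np.array([spin_expectation(t) for t in times])
analytic = 0.5 * np.stack(
    [np.cos(B * times), np.sin(B * times), np.zeros_like(times)], axis=1
)
max_err = np.abs(numeric - analytic).max()
print(max_err)
```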

1.2 Schrödinger equation in the real space


Let us now generalize the formulation of quantum mechanics beyond a finite dimensional
state space. We first consider a particle on a line. Since the position of the particle can
take an infinite number of values, we need to generalize the finite dimensional quantum
system to the infinite dimensional setup. For a particle on the real line, the Hilbert space
is
$$\mathcal{H} = L^2(\mathbb{R}) := \left\{ f \;\middle|\; \int_{\mathbb{R}} |f(x)|^2 \, dx < \infty \right\}. \tag{1.2.1}$$
For any two state vectors |ψ⟩, |ϕ⟩ ∈ H, the inner product is defined as
$$\langle\varphi|\psi\rangle = \int_{\mathbb{R}} \varphi^*(x)\psi(x) \, dx.$$
The normalization condition of the wavefunction is
$$\|\psi\|^2 := \langle\psi|\psi\rangle = \int_{\mathbb{R}} |\psi(x)|^2 \, dx = 1.$$

Position operator on a real line


The position operator x̂ is a linear operator on H and acts on a state vector |ψ⟩ with
function ψ(x) according to
$$(\hat{x}\psi)(x) = x\psi(x), \quad x \in \mathbb{R}. \tag{1.2.2}$$
Note the subtle difference in the notation here: x̂ is an operator while x is a real number,
and the right-hand side is understood as the evaluation of the function xψ at x. This
equation can also be written in Dirac notation as
$$\hat{x}|\psi\rangle = |x\psi\rangle. \tag{1.2.3}$$
Formally, if we try to find an eigenstate of the position operator with eigenvalue x0,
$$\hat{x}|\psi\rangle = x_0|\psi\rangle, \quad \text{i.e.,} \quad x\psi(x) = x_0\psi(x) \quad \forall x \in \mathbb{R}, \tag{1.2.4}$$
then we must have ψ(x) = 0 if x ≠ x0. Since ψ(x) vanishes almost everywhere on the
real line, |ψ⟩ is equivalent to the null state. This contradicts the assumption that ψ(x) is
a normalized eigenfunction. In fact, the operator x̂ does not have any square integrable
eigenstate. The eigen-decomposition of the position operator needs to be represented
using the Dirac δ-function, loosely thought of as
$$\delta(x - x_0) = \begin{cases} \infty, & \text{if } x = x_0; \\ 0, & \text{otherwise}, \end{cases} \tag{1.2.5}$$
and ∫ δ(x − x0) dx = 1. According to distribution theory, the Dirac δ-function
should be regarded as a linear functional that acts on a smooth function
f as
$$\delta(\cdot - x_0) : f \mapsto f(x_0).$$
We will not discuss further details of this more rigorous perspective here.
The Dirac δ-notation allows us to formally define a state |x0⟩ using δ(x − x0). Then
the formal eigen-decomposition of x̂ reads
$$\hat{x}|x_0\rangle = x_0|x_0\rangle. \tag{1.2.6}$$
Using such notation, the relation between a quantum state |ψ⟩ and its function ψ(x) is
$$\langle x|\psi\rangle = \psi(x). \tag{1.2.7}$$
Furthermore,
$$\langle x_1|x_2\rangle = \delta(x_1 - x_2), \tag{1.2.8}$$
which is a generalized orthonormality condition. We will call |x0⟩ a generalized
eigenfunction. Compared to the finite dimensional setup, we find that ψ(x) should be more
appropriately viewed as the “generalized coordinate” of the state vector |ψ⟩, and the
basis is provided by the “generalized eigenfunctions” of the operator x̂. Hence the function
ψ(x) is also called the real space representation of |ψ⟩, and the square integrable
condition for ψ(x) means that |ψ(x)|² can be interpreted as the probability density of finding
the particle at x. In the discussion below, we may use the notation |ψ⟩ and its associated
function ψ(x) interchangeably.
If the Hilbert space H is finite dimensional and Â is a linear operator on H, then for
any state vector |ψ⟩ ∈ H we still have Â|ψ⟩ ∈ H. The same statement does not hold for
infinite dimensional spaces. For example, consider the wavefunction ψ(x) ∝ 1/(1 + |x|^{2/3}).
Then one can verify that ψ ∈ H but xψ(x) ∉ H. This means that the position operator
x̂ cannot be defined for all possible wavefunctions ψ ∈ H. The domain of the position
operator is
$$\operatorname{dom} \hat{x} = \left\{ \psi \in L^2(\mathbb{R}) \;\middle|\; \int_{\mathbb{R}} x^2 |\psi(x)|^2 \, dx < \infty \right\}, \tag{1.2.9}$$
which is a subset of H = L²(R). It is a dense subset as it contains all compactly
supported continuous functions.

Momentum operator
The definition of the momentum operator is arguably another point of mystery for
first-time readers of quantum mechanics. Here we simply state that quantum mechanics
postulates that the momentum operator should be a differential operator¹
$$\hat{p} = -i\frac{d}{dx}.$$
In other words, for ψ ∈ H,
$$(\hat{p}\psi)(x) = -i\psi'(x).$$
The eigenfunctions of p̂ can be discussed in parallel to those of the position operator.
We formally denote by |p0⟩ the eigenfunction of p̂ with eigenvalue p0, i.e.,
$$\hat{p}|p_0\rangle = p_0|p_0\rangle, \quad p_0 \in \mathbb{R}.$$
In the real space representation, the equation becomes
$$-i\frac{d}{dx}\langle x|p_0\rangle = p_0\langle x|p_0\rangle \quad \forall x \in \mathbb{R}.$$
The solution is
$$\langle x|p_0\rangle = Ce^{ip_0 x},$$
where C is a constant. In order to determine the constant C, consider an arbitrary
function ϕ(p) ∈ L²(R). Then Parseval’s identity suggests
$$\begin{aligned}
\int_{\mathbb{R}} \left| \int_{\mathbb{R}} \frac{1}{\sqrt{2\pi}} e^{-ipx}\varphi(p) \, dp \right|^2 dx
&= \int_{\mathbb{R}}\int_{\mathbb{R}} \left( \frac{1}{2\pi}\int_{\mathbb{R}} e^{i(p_1 - p_2)x} \, dx \right) \varphi^*(p_1)\varphi(p_2) \, dp_1 \, dp_2 \\
&= \int_{\mathbb{R}} |\varphi(p)|^2 \, dp.
\end{aligned} \tag{1.2.10}$$
¹ Mathematically, the momentum operator can be associated with the infinitesimal generator of the
translation operation. See [79, p. 46].

Since ϕ is arbitrary, (1.2.10) can also be written as
$$\frac{1}{2\pi}\int_{\mathbb{R}} e^{i(p_1 - p_2)x} \, dx = \delta(p_1 - p_2). \tag{1.2.11}$$
This suggests the choice of constant to be C = 1/√(2π), so that
$$\langle x|p_0\rangle = \frac{1}{\sqrt{2\pi}} e^{ip_0 x}.$$
Therefore, for ψ ∈ H,
$$\langle p|\psi\rangle = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} e^{-ipx}\psi(x) \, dx, \tag{1.2.12}$$
which is the Fourier transform of ψ(x). Also
$$\langle p_1|p_2\rangle = \int_{\mathbb{R}} \langle p_1|x\rangle\langle x|p_2\rangle \, dx = \delta(p_1 - p_2), \tag{1.2.13}$$
which is a generalized orthonormality condition. Note that |p0⟩ ∉ L²(R), and therefore
|p0⟩ is also a generalized eigenfunction.
Similar to the position operator, p̂ cannot be applied to all functions ψ(x) ∈ L²(R).
The domain of the momentum operator is
$$\operatorname{dom} \hat{p} = H^1(\mathbb{R}) := \left\{ \psi \in L^2(\mathbb{R}), \; \psi'(x) \in L^2(\mathbb{R}) \right\}, \tag{1.2.14}$$
which is also a dense subset of H = L²(R).

Uncertainty principle
We can compute the commutator of the position operator x̂ and the momentum operator
p̂ in the following way:
$$[\hat{x}, \hat{p}]\psi(x) = (\hat{x}\hat{p} - \hat{p}\hat{x})\psi(x) = x(-i\psi'(x)) + i\psi(x) + ix\psi'(x) = i\psi(x). \tag{1.2.15}$$
This is true for an arbitrary function ψ(x) in the set {ψ ∈ H¹(R) | ∫ x²|ψ′(x)|² dx < ∞}, so
that all operations above are well defined. This is a dense subset of H = L²(R). Hence
we have
$$[\hat{x}, \hat{p}] = i. \tag{1.2.16}$$
Equation (1.2.16) is a fundamental relation in quantum mechanics and is called the
canonical commutation relation.
From the canonical commutation relation, we find that the position and momentum
operators are not compatible, and it is not possible to simultaneously determine the
position and momentum of a quantum particle. A more quantitative version of this
statement is given by the uncertainty principle in (1.1.21) as
$$\langle\psi|\Delta\hat{x}^2|\psi\rangle\langle\psi|\Delta\hat{p}^2|\psi\rangle \ge \frac{1}{4}|\langle\psi|[\hat{x}, \hat{p}]|\psi\rangle|^2 = \frac{1}{4},$$
or simply
$$\sqrt{\langle\Delta\hat{x}^2\rangle}\,\sqrt{\langle\Delta\hat{p}^2\rangle} \ge \frac{1}{2}. \tag{1.2.17}$$
The relation (1.2.17) is called the Heisenberg uncertainty principle.
Let us consider the Heisenberg uncertainty principle in an extreme case. If the
position of a quantum state is fully determined as |ψ⟩ = |x0⟩, then ⟨∆x̂²⟩ = 0. This
implies that ⟨∆p̂²⟩ = ∞. Indeed, since
$$|\langle p|x_0\rangle|^2 = \frac{1}{2\pi}$$
holds for any p ∈ R, we have formally ⟨∆p̂²⟩ = ∞. A similar calculation shows that if
|ψ⟩ = |p0⟩, then ⟨∆x̂²⟩ = ∞.
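For square integrable states the bound (1.2.17) can be probed on a grid. The sketch below is an illustration rather than part of the derivation: it discretizes the real line on a finite interval, approximates ψ′ by central differences, and uses ⟨p̂²⟩ = ∫|ψ′|² dx (integration by parts, valid because the test functions decay at the grid ends). The grid, the Gaussian parameters, and the helper `uncertainty_product` are all assumptions. A Gaussian wave packet should attain the bound 1/2, while the function x e^{−x²/2} gives a larger product.

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 4001)   # illustrative grid
dx = x[1] - x[0]

def uncertainty_product(psi):
    """Return sqrt(<dx^2>) * sqrt(<dp^2>) for a wavefunction sampled on x."""
    psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)   # normalize
    dpsi = np.gradient(psi, dx)                           # psi'(x), central differences
    rho = np.abs(psi) ** 2
    mean_x = np.sum(x * rho) * dx
    var_x = np.sum(x ** 2 * rho) * dx - mean_x ** 2
    mean_p = (np.sum(psi.conj() * (-1j) * dpsi) * dx).real
    # <p^2> = integral of |psi'|^2 after integration by parts.
    var_p = np.sum(np.abs(dpsi) ** 2) * dx - mean_p ** 2
    return np.sqrt(var_x * var_p)

sigma, k0 = 1.3, 0.5                  # arbitrary width and mean momentum
gaussian = np.exp(-x ** 2 / (4 * sigma ** 2)) * np.exp(1j * k0 * x)
first_hermite = x * np.exp(-x ** 2 / 2)

prod_gauss = uncertainty_product(gaussian)
prod_herm = uncertainty_product(first_hermite)
print(prod_gauss, prod_herm)
```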

Angular momentum operator


The discussion above can be directly generalized to dimensions higher than one. For
example, in the three-dimensional case, we can define the position operator as a vector
r̂ = (x̂, ŷ, ẑ)ᵀ and the momentum operator p̂ = (p̂x, p̂y, p̂z)ᵀ. The wavefunction is
defined as ψ(r) = ψ(x, y, z) ∈ L²(R³). With the position and momentum operators
defined, we can also define the angular momentum operator in quantum mechanics. In
classical mechanics, the angular momentum is defined as
$$\mathbf{L} = \mathbf{r} \times \mathbf{p}. \tag{1.2.18}$$
Then the quantization rule defines the quantum angular momentum operator as
$$\hat{\mathbf{L}} = \hat{\mathbf{r}} \times \hat{\mathbf{p}} = \hat{\mathbf{r}} \times (-i\nabla_{\mathbf{r}}), \tag{1.2.19}$$
which can be written in the component form as
$$\begin{aligned}
\hat{L}_x &= -i\hat{y}\frac{\partial}{\partial z} + i\hat{z}\frac{\partial}{\partial y}, \\
\hat{L}_y &= -i\hat{z}\frac{\partial}{\partial x} + i\hat{x}\frac{\partial}{\partial z}, \\
\hat{L}_z &= -i\hat{x}\frac{\partial}{\partial y} + i\hat{y}\frac{\partial}{\partial x}.
\end{aligned} \tag{1.2.20}$$
Similarly to the spin-1/2 particle, we define the square of the magnitude of the angular
momentum operator as
$$\hat{\mathbf{L}}^2 := \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2. \tag{1.2.21}$$
The different components of the angular momentum satisfy the cyclic relation
$$[\hat{L}_x, \hat{L}_y] = i\hat{L}_z, \quad [\hat{L}_y, \hat{L}_z] = i\hat{L}_x, \quad [\hat{L}_z, \hat{L}_x] = i\hat{L}_y. \tag{1.2.22}$$
Furthermore,
$$[\hat{\mathbf{L}}^2, \hat{L}_\alpha] = 0, \quad \alpha = x, y, z. \tag{1.2.23}$$
The nature of the angular momentum operator can be better revealed in the spherical
coordinate system. By a change of variables from the Cartesian coordinates,
$$\begin{aligned}
x &= r\sin\theta\cos\varphi, \\
y &= r\sin\theta\sin\varphi, \\
z &= r\cos\theta,
\end{aligned} \tag{1.2.24}$$
Table 1.1. Spherical harmonics for l = 0, 1, 2. The names s, p, d will be explained in section 1.3.

l    m    Y_{l,m}(θ, ϕ)                        Orbital
0    0    √(1/(4π))                            s orbital
1   −1    √(3/(8π)) sin θ e^{−iϕ}              p orbital
1    0    √(3/(4π)) cos θ                      p orbital
1    1    −√(3/(8π)) sin θ e^{iϕ}              p orbital
2   −2    √(15/(32π)) sin²θ e^{−2iϕ}           d orbital
2   −1    √(15/(8π)) sin θ cos θ e^{−iϕ}       d orbital
2    0    √(5/(16π)) (3 cos²θ − 1)             d orbital
2    1    −√(15/(8π)) sin θ cos θ e^{iϕ}       d orbital
2    2    √(15/(32π)) sin²θ e^{2iϕ}            d orbital

we find that the operator L̂² in the spherical coordinates reads
$$\hat{\mathbf{L}}^2 = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}. \tag{1.2.25}$$
In particular, L̂² depends only on the angular directions θ, ϕ and is independent of the
radial direction.
Following the derivation in Appendix A.2, the eigenvalues and eigenfunctions of
L̂² can be directly evaluated using separation of variables. We have
$$\hat{\mathbf{L}}^2 Y_{lm}(\theta, \varphi) = l(l + 1)\, Y_{lm}(\theta, \varphi). \tag{1.2.26}$$
Here l ∈ N and m takes values in −l, −l + 1, . . . , l. The eigenfunctions Y_{lm} depend
only on the θ, ϕ variables and are called the spherical harmonics. Spherical harmonics
are one of the most important classes of special functions used in quantum physics, and
they provide both solutions to, and chemical intuition about, the Schrödinger
equation in electronic structure theory. The formulas for the first few spherical harmonics
are given in Table 1.1.
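The eigenvalue relation (1.2.26) can be verified numerically for the first few spherical harmonics. The sketch below is only an illustration: for a function of the form Θ(θ)e^{imϕ} the ϕ-derivative in (1.2.25) contributes m²/sin²θ, and the θ-derivatives are approximated by nested central differences. The helper `apply_L2_theta`, the step size, and the θ grid are assumptions, and normalization constants are dropped since they do not affect the eigenvalue.

```python
import numpy as np

def apply_L2_theta(Theta, m, theta, h=1e-3):
    """Apply L^2 of (1.2.25) to Theta(theta) e^{i m phi}; return the theta factor."""
    # flux(th) approximates sin(th) * dTheta/dth by a central difference.
    flux = lambda th: np.sin(th) * (Theta(th + h) - Theta(th - h)) / (2 * h)
    div = (flux(theta + h) - flux(theta - h)) / (2 * h)
    return -div / np.sin(theta) + m ** 2 * Theta(theta) / np.sin(theta) ** 2

# Unnormalized theta-dependence of Y_{l,m} for (l, m) with m >= 0.
cases = {
    (0, 0): lambda th: np.ones_like(th),
    (1, 0): np.cos,
    (1, 1): np.sin,
    (2, 0): lambda th: 3 * np.cos(th) ** 2 - 1,
    (2, 1): lambda th: np.sin(th) * np.cos(th),
    (2, 2): lambda th: np.sin(th) ** 2,
}

theta = np.linspace(0.3, np.pi - 0.3, 50)   # stay away from the poles
max_residual = 0.0
for (l, m), Theta in cases.items():
    res = apply_L2_theta(Theta, m, theta) - l * (l + 1) * Theta(theta)
    max_residual = max(max_residual, np.abs(res).max())
print(max_residual)
```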
In the general theory of angular momentum, any operator L̂ satisfying the cyclic
relation (1.2.22) is called an angular momentum operator. From this perspective, the
spin operator Ŝ satisfies the cyclic relation and hence is an angular momentum operator.
Since
$$\hat{\mathbf{S}}^2 = \frac{3}{4} I = \frac{1}{2}\left(\frac{1}{2} + 1\right) I, \tag{1.2.27}$$
we identify that Ŝ² is the square of the magnitude of an angular momentum
operator with l = 1/2, compared with (1.2.26). This justifies the name “spin-1/2” particle in
section 1.1. The general theory of angular momentum further states that the value of l
appearing in the eigenvalues of any angular momentum operator can only be an integer
or a half integer. When l is an integer, the corresponding angular momentum
operator L is called the orbital angular momentum operator. When l is a half integer, the
corresponding operator is called the spin angular momentum operator.

Hamiltonian operator
In the absence of a magnetic field, the total energy in classical mechanics for a particle
with unit mass in a potential field is E(x, p) = p²/2 + V(x). After the quantization
procedure, we obtain the Hamiltonian for a particle on the real line,
$$\hat{H} = \frac{\hat{p}^2}{2} + V(\hat{x}),$$
where V(x̂) is interpreted as a multiplicative operator defined as
$$\langle x|V(\hat{x})|\psi\rangle = V(x)\psi(x).$$
For particles in three dimensions with a potential field V, the Hamiltonian operator is
defined as
$$\hat{H} = \frac{1}{2}\left(\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2\right) + V(\hat{\mathbf{r}}) = -\frac{1}{2}\Delta_{\mathbf{r}} + V(\hat{\mathbf{r}}). \tag{1.2.28}$$
Here ∆_r = ∂x² + ∂y² + ∂z² is the Laplacian operator.
From now on, we will drop the hat notation for the position, momentum, angular
momentum, and Hamiltonian operators for the rest of the discussion to make the notation
concise. So r, p, L, S, H will be interpreted as operators when necessary. We also
write r = (x, y, z)ᵀ ≡ (r1, r2, r3)ᵀ for the convenience of summations over spatial
coordinate components. Similarly, p = (px, py, pz)ᵀ ≡ (p1, p2, p3)ᵀ. Following this
notation, we may write the Hamiltonian as
$$H = -\frac{1}{2}\Delta_{\mathbf{r}} + V(\mathbf{r}). \tag{1.2.29}$$
For the Hamiltonian in (1.2.29), the time-dependent Schrödinger equation reads
$$i\partial_t\psi(\mathbf{r}, t) = H\psi(\mathbf{r}, t) = -\frac{1}{2}\Delta_{\mathbf{r}}\psi(\mathbf{r}, t) + V(\mathbf{r})\psi(\mathbf{r}, t). \tag{1.2.30}$$
Since the Hamiltonian H does not depend on the time variable, one can find the
stationary state solutions of the Schrödinger equation by solving the eigenvalue problem
$$\left(-\frac{1}{2}\Delta_{\mathbf{r}} + V(\mathbf{r})\right)\psi(\mathbf{r}) = E\psi(\mathbf{r}). \tag{1.2.31}$$
Equation (1.2.31) is commonly referred to as the time-independent Schrödinger equation.

1.3 Hydrogen atom


The Schrödinger equation for the hydrogen atom has a closed form solution, and
hydrogen remains the only exactly solvable atom in the periodic table using quantum
mechanics. The Hamiltonian for the hydrogen atom takes the form (1.2.29). To
simplify the discussion, we will take the Born–Oppenheimer approximation and regard the
nucleus as fixed at the origin. We remark that it is also possible to explicitly solve the
full quantum mechanical description of the hydrogen atom. The potential is a centrally
symmetric potential and only depends on the radial direction as (recall that r = |r|)
$$V(r) = -\frac{1}{r}. \tag{1.3.1}$$
Hence we can find the eigenstates of the hydrogen atom by solving the eigenvalue
problem
$$\left(-\frac{1}{2}\Delta_{\mathbf{r}} - \frac{1}{r}\right)\psi(\mathbf{r}) = E\psi(\mathbf{r}). \tag{1.3.2}$$
In quantum mechanics, such an eigenfunction ψ(r) can also be interchangeably referred
to as an “orbital.”
Using the spherical coordinates (1.2.24), the Laplacian operator takes the form
$$\Delta_{\mathbf{r}} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}. \tag{1.3.3}$$
Comparing the relation above and (1.2.25), we find that ∆_r is related to the L² operator
as
$$\Delta_{\mathbf{r}} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \frac{1}{r^2}\mathbf{L}^2. \tag{1.3.4}$$
Since all eigenfunctions for L² can be explicitly identified via (1.2.26), we may readily
use separation of variables to solve (1.3.2). Assume the wavefunction takes the form
$$\psi(r, \theta, \varphi) = R(r)Y_{lm}(\theta, \varphi); \tag{1.3.5}$$
then the radial part of the wavefunction R(r) should satisfy the eigenvalue problem
$$-\frac{1}{2r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial R}{\partial r}\right) + \frac{l(l + 1)}{2r^2}R(r) - \frac{1}{r}R(r) = ER(r), \quad r \in (0, \infty), \tag{1.3.6}$$
subject to the condition that R(0) is finite and the function decays at infinity. Direct
calculation shows that the following relation holds in the operator sense:
$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\, r. \tag{1.3.7}$$
By a change of variable
$$u(r) = rR(r), \tag{1.3.8}$$
we have an equivalent eigenvalue problem
$$-\frac{1}{2}\frac{\partial^2 u}{\partial r^2}(r) + V_e(r)u(r) = Eu(r), \tag{1.3.9}$$
where
$$V_e(r) = \frac{l(l + 1)}{2r^2} - \frac{1}{r} \tag{1.3.10}$$
is an effective potential along the radial direction. Compared to V(r), the extra term is
due to the angular momentum operator. Asymptotically, as r → ∞, (1.3.9) becomes
approximately
$$-\frac{1}{2}\frac{\partial^2 u}{\partial r^2} = Eu. \tag{1.3.11}$$
If the eigenvalue E > 0, then u(r) ∼ c1 e^{i√(2E) r} + c2 e^{−i√(2E) r}. This planewave-like
solution cannot be square integrable, and hence any value E > 0 cannot be an isolated
eigenvalue. In fact, the hydrogen atom has a continuous spectrum on [0, ∞). Hence all
eigenvalues with square integrable eigenfunctions must satisfy E < 0.
If l = 0 in (1.3.10), one can easily find one analytic solution u(r) = r e^{−r} for
(1.3.9), i.e.,
$$\Psi(\mathbf{r}) = \sqrt{\frac{1}{4\pi}}\, R(r), \quad R(r) = \exp(-r),$$
which corresponds to the eigenvalue E_0 = −1/2. Here the prefactor √(1/(4π)) comes
from the normalization of the spherical harmonic Y_{00}. It turns out that this is the ground
state energy. Further calculation shows that all other eigenvalues are given by a simple relation
$$E_{kl} = -\frac{1}{2(k + l)^2},$$
where E_{kl} is the kth eigenvalue of (1.3.9) corresponding to the angular momentum l. If
we define a number n = k + l, then all eigenvalues can be labeled using a single number
n as
$$E_n = -\frac{1}{2n^2}, \quad n = 1, 2, \ldots.$$
For each eigenvalue E_n, the corresponding degenerate eigenfunctions can be labeled
as ψ_{nlm} with n, l, m being integer parameters. Here n is called the principal quantum
number (n ≥ 1), l is called the azimuthal quantum number (0 ≤ l ≤ n − 1), and m is
called the magnetic quantum number (−l ≤ m ≤ l).
In spectroscopic terminology, the spherical harmonics with l = 0, 1, 2, 3 are named
s, p, d, f orbitals, respectively.² These terms are often used in electronic structure
theory as well. Hence ψ_{100} is called the 1s orbital (with the only possible choice l = 0 and
m = 0). The normalized 1s orbital takes the form
$$\psi_{100}(\mathbf{r}) = \sqrt{\frac{1}{\pi}}\exp(-r).$$
Similarly ψ_{200} is called a 2s orbital, ψ_{21m} (−1 ≤ m ≤ 1) is called a 2p orbital,
ψ_{32m} (−2 ≤ m ≤ 2) is called a 3d orbital, and so on. The degeneracy between the
2s and 2p orbitals for the hydrogen atom is coincidental, in the sense that it is
not a consequence of the spherical symmetry. Hence such degeneracy is called the
accidental degeneracy. On the other hand, the degeneracy with respect to m is due to
the spherical symmetry and is called the essential degeneracy.
In summary, the spectrum of the hydrogen atom is such that there are infinitely many
eigenvalues E_n < 0, bounded from below by −1/2, and 0 is an accumulation point. The
essential spectrum of the hydrogen atom is [0, ∞) (Figure 1.5).
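The discrete eigenvalues E_n = −1/(2n²) can be approximated by discretizing the radial problem (1.3.9). The following is a rough sketch, not the method used in the text: it uses a second-order finite difference scheme on a truncated interval (0, r_max) with u = 0 at both ends. The function name `radial_levels`, the truncation radius, and the grid size are arbitrary assumptions, and the computed values carry a small discretization error.

```python
import numpy as np

def radial_levels(l, r_max=40.0, n_grid=1500, n_levels=2):
    """Lowest eigenvalues of -u''/2 + V_e(r) u = E u with u(0) = u(r_max) = 0."""
    h = r_max / (n_grid + 1)
    r = h * np.arange(1, n_grid + 1)            # interior grid points
    v_eff = l * (l + 1) / (2 * r ** 2) - 1 / r  # effective potential (1.3.10)
    # -u''/2 is approximated by (-u[i-1] + 2 u[i] - u[i+1]) / (2 h^2).
    main = 1.0 / h ** 2 + v_eff
    off = -0.5 / h ** 2 * np.ones(n_grid - 1)
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:n_levels]

E_s = radial_levels(l=0)              # approximates E_1 = -1/2 and E_2 = -1/8
E_p = radial_levels(l=1, n_levels=1)  # approximates E_2 = -1/8 (2p)
print(E_s, E_p)
```

The near coincidence of the second l = 0 level with the first l = 1 level illustrates the degeneracy between the 2s and 2p orbitals discussed above.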

Figure 1.5. Spectrum of the hydrogen atom: An infinite number of discrete eigenvalues below 0 and essential spectrum above 0.

² Here the symbols s, p, d, and f stand for “sharp,” “principal,” “diffuse,” and “fundamental” (or “fine”),
respectively. The spherical harmonics with l = 4, 5, 6, . . . are labeled alphabetically as g, h, i, . . . , respectively.

Example: H₂⁺

The H₂⁺ molecule consists of two hydrogen atoms, but with one electron removed. The
system thus contains only one electron and the total charge of the system is +1. Without
loss of generality we assume the positions of the two atoms are fixed at 0 and R =
(R, 0, 0)ᵀ, respectively, and the Hamiltonian for the electron is
$$H = -\frac{1}{2}\Delta_{\mathbf{r}} - \frac{1}{|\mathbf{r}|} - \frac{1}{|\mathbf{r} - \mathbf{R}|}. \tag{1.3.12}$$
As a crude approximation, we assume the ground state wavefunction is given by the
linear combination of two 1s orbitals centered at 0 and R, respectively, i.e.,
$$\psi(\mathbf{r}) \approx c_1\psi_{100}(\mathbf{r}) + c_2\psi_{100}(\mathbf{r} - \mathbf{R}). \tag{1.3.13}$$
Since the vector space span{ψ_{100}(r), ψ_{100}(r − R)} is isomorphic to C², the
Hamiltonian operator of the H₂⁺ molecule in the above approximation can be represented
by a 2 × 2 matrix. More specifically, we can solve this system by a Galerkin projection
principle. By projecting the eigenvalue problem
$$H\psi = E\psi$$
to the space span{ψ_{100}(r), ψ_{100}(r − R)}, we have a generalized eigenvalue problem
$$\begin{pmatrix} \varepsilon & -t \\ -t & \varepsilon \end{pmatrix} c = E\begin{pmatrix} 1 & s \\ s & 1 \end{pmatrix} c. \tag{1.3.14}$$
The generalized eigenvector is c = (c1, c2)ᵀ and the matrix elements are
$$\varepsilon = \int \psi_{100}(\mathbf{r})(H\psi_{100})(\mathbf{r}) \, d\mathbf{r}, \tag{1.3.15}$$
$$-t = \int \psi_{100}(\mathbf{r})(H\psi_{100})(\mathbf{r} - \mathbf{R}) \, d\mathbf{r}, \tag{1.3.16}$$
$$s = \int \psi_{100}(\mathbf{r})\psi_{100}(\mathbf{r} - \mathbf{R}) \, d\mathbf{r}. \tag{1.3.17}$$
Given that t, s > 0, the ground state eigenvalue and eigenvector are
$$E_g = \frac{\varepsilon - t}{1 + s}, \quad c_g = \frac{1}{\sqrt{2(1 + s)}}(1, 1)^{\top}. \tag{1.3.18}$$
Similarly, the eigenvalue and eigenvector for the first excited state are
$$E_e = \frac{\varepsilon + t}{1 - s}, \quad c_e = \frac{1}{\sqrt{2(1 - s)}}(1, -1)^{\top}. \tag{1.3.19}$$
It can be verified that the eigenfunctions in the real space using the ansatz (1.3.13)
satisfy
$$\langle\psi_g|\psi_g\rangle = \langle\psi_e|\psi_e\rangle = 1, \quad \langle\psi_g|\psi_e\rangle = 0. \tag{1.3.20}$$
In particular, the wavefunction of the ground state is symmetric on two sites, and ψ_g(r)
is nonzero between two atoms. This is called a “bonding state” and it is the prototypical
model for covalent bonds in chemistry. On the other hand, ψ_e(r) is exactly zero at
r = (R/2, 0, 0)ᵀ. This is called an “anti-bonding state.”
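The closed-form solutions (1.3.18) and (1.3.19) can be compared against a direct numerical solution of the 2 × 2 generalized eigenvalue problem (1.3.14). In the sketch below the values of ε, t, and s are hypothetical placeholders (they are not computed from the integrals (1.3.15)–(1.3.17)), and the helper `solve_generalized` reduces the problem to a standard symmetric one via a Cholesky factorization of the overlap matrix.

```python
import numpy as np

def solve_generalized(H, S):
    """Solve H c = E S c for symmetric H and symmetric positive definite S."""
    L = np.linalg.cholesky(S)          # S = L L^T
    Linv = np.linalg.inv(L)
    A = Linv @ H @ Linv.T              # equivalent standard eigenvalue problem
    E, Y = np.linalg.eigh(A)
    C = Linv.T @ Y                     # columns are S-orthonormal: C^T S C = I
    return E, C

# Hypothetical matrix elements, chosen only for illustration.
eps, t, s = -1.0, 0.4, 0.3
H = np.array([[eps, -t], [-t, eps]])
S = np.array([[1.0, s], [s, 1.0]])

E, C = solve_generalized(H, S)
E_g_formula = (eps - t) / (1 + s)      # bonding state, cf. (1.3.18)
E_e_formula = (eps + t) / (1 - s)      # anti-bonding state, cf. (1.3.19)
print(E, E_g_formula, E_e_formula)
```

The S-orthonormality of the columns of `C` is the discrete counterpart of (1.3.20).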
1.4 Periodic systems

The hydrogen atom and the H₂⁺ system are prototypical examples of isolated systems.
The main characteristic of isolated systems is that the potential V(r) decays to zero when
|r| → ∞. In addition to isolated systems, another important class of systems commonly
investigated in electronic structure theory is condensed matter systems, such as liquids
and solids. In such a case, the size of the support of V(r) is on the macroscopic scale and
can therefore be considered to be infinite from the microscopic perspective of electronic
structure theory. Although condensed matter systems contain a macroscopic number of
electrons, let us simplify the discussion for the moment and consider a single electron
in the condensed matter system. From this perspective, the condensed matter system
only provides a background potential V(r) and the Hilbert space for the electron is still
given by L²(R³). Many-electron systems will be discussed later in the book.
A simple example of a condensed matter system is a crystalline solid system, or
simply a crystal. The atomic positions form a Bravais lattice L, defined as the set
$$\mathbb{L} = \{\mathbf{R} \mid \mathbf{R} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3, \; n_1, n_2, n_3 \in \mathbb{Z}\}. \tag{1.4.1}$$
The vectors a_α (α = 1, 2, 3) are called basis vectors. If one atom is at position R0,
then the same type of atom must be present at the position R0 + R for any R ∈ L. The
periodicity of the atomic positions is directly reflected in the potential function in the
Hamiltonian (1.2.29): V(r) is a periodic function with respect to the Bravais lattice,
i.e.,
$$V(\mathbf{r}) = V(\mathbf{r} + \mathbf{R}), \quad \forall \mathbf{r} \in \mathbb{R}^3, \; \forall \mathbf{R} \in \mathbb{L}. \tag{1.4.2}$$
The region
$$\Omega = \{\mathbf{r} = c_1\mathbf{a}_1 + c_2\mathbf{a}_2 + c_3\mathbf{a}_3, \; 0 < c_1, c_2, c_3 \le 1\} \tag{1.4.3}$$
is called a unit cell. All lattices in R³ can be categorized into seven crystal systems.
However, each crystal system may contain more than one type of lattice depending on
the symmetry, which gives rise to 14 types of Bravais lattices in total [4]. Figure 1.6
depicts the classification of crystal systems in R³. In particular, the simple cubic (SC),
body-centered cubic (BCC), and face-centered cubic (FCC) lattices are among the
lattices with the highest symmetry and are commonly encountered in real materials.
As we will see in the following discussion, the Hamiltonian H with a periodic
potential cannot have any isolated eigenvalues, i.e., even if we formally write the
eigen-decomposition as
$$H|\psi\rangle = E|\psi\rangle,$$
|ψ⟩ cannot be a square integrable function. Nonetheless, the eigen-decomposition is
still very useful, and |ψ⟩ is often called a generalized eigenfunction in the physics
literature. For instance, we have seen that a planewave e^{ikx} can be viewed as a generalized
eigenfunction of the momentum operator p_x. With some abuse of terminology, we
simply refer to generalized eigenfunctions as eigenfunctions in the discussion below, unless
otherwise noted.
In order to find the eigen-decomposition of H, we first note that even if the potential
is periodic, the eigenfunctions are not necessarily periodic. To see this, simply consider
the case V(r) = 0, which is by definition periodic with respect to an arbitrary period.
However, the eigenfunctions of H = −(1/2)∆ are planewaves and are not necessarily
periodic functions with an arbitrary period.

Figure 1.6. All lattices in R³ categorized into 7 types of crystal systems and 14 types of Bravais lattices.

The basic tool for finding the eigen-decomposition of the Hamiltonian corresponding
to a crystal is the Bloch decomposition, also known as the Bloch–Floquet
decomposition. For a given R ∈ L, define the translation operator

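$$T_{\mathbf{R}} : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3), \quad (T_{\mathbf{R}} f)(\mathbf{r}) = f(\mathbf{r} + \mathbf{R}).$$
The periodicity of V implies that [T_R, H] = 0, and hence T_R and H are
simultaneously diagonalizable. Formally, take any eigenvector ψ such that
$$H\psi = E\psi, \quad T_{\mathbf{R}}\psi = C_{\mathbf{R}}\psi. \tag{1.4.4}$$
Hence ψ(r + R) = C_R ψ(r). Furthermore,
$$\psi(\mathbf{r} + \mathbf{R} + \mathbf{R}') = C_{\mathbf{R}}\psi(\mathbf{r} + \mathbf{R}') = C_{\mathbf{R}}C_{\mathbf{R}'}\psi(\mathbf{r}) = C_{\mathbf{R} + \mathbf{R}'}\psi(\mathbf{r}) \tag{1.4.5}$$
is true for any R, R′. In general, the solution to the equation
$$C_{\mathbf{R} + \mathbf{R}'} = C_{\mathbf{R}}C_{\mathbf{R}'} \quad \forall \mathbf{R}, \mathbf{R}' \in \mathbb{L} \tag{1.4.6}$$
is a monochromatic planewave
$$C_{\mathbf{R}} = e^{i\mathbf{R} \cdot \mathbf{k}} \quad \text{for some } \mathbf{k} \in \mathbb{R}^3. \tag{1.4.7}$$
Hence ψ may not be periodic with respect to R, but satisfies the twisted boundary
condition (also called the Bloch boundary condition)
$$\psi(\mathbf{r} + \mathbf{R}) = e^{i\mathbf{k} \cdot \mathbf{R}}\psi(\mathbf{r}). \tag{1.4.8}$$
If we change the variable from ψ to u as
$$\psi(\mathbf{r}) = e^{i\mathbf{k} \cdot \mathbf{r}}u(\mathbf{r}), \tag{1.4.9}$$
then
$$e^{i\mathbf{k} \cdot (\mathbf{r} + \mathbf{R})}u(\mathbf{r} + \mathbf{R}) = \psi(\mathbf{r} + \mathbf{R}) = e^{i\mathbf{k} \cdot \mathbf{R}}\psi(\mathbf{r}) = e^{i\mathbf{k} \cdot \mathbf{R}}e^{i\mathbf{k} \cdot \mathbf{r}}u(\mathbf{r}). \tag{1.4.10}$$
Therefore u(r) = u(r + R), i.e., u is periodic with respect to R. Since both V and u
are periodic, one can solve the equation for u(r) only in the unit cell Ω.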
The Bravais lattice induces a reciprocal lattice, defined as
$$\mathbb{L}^* = \{\mathbf{G} \mid \mathbf{G} = n_1\mathbf{b}_1 + n_2\mathbf{b}_2 + n_3\mathbf{b}_3, \; n_1, n_2, n_3 \in \mathbb{Z}\}. \tag{1.4.11}$$
Here b1, b2, and b3 are the reciprocal lattice vectors, defined via the relation
$$\mathbf{a}_\alpha \cdot \mathbf{b}_\beta = 2\pi\delta_{\alpha,\beta}, \quad \alpha, \beta = 1, 2, 3. \tag{1.4.12}$$
The unit cell of the reciprocal lattice, denoted by Ω*, is defined as
$$\Omega^* = \left\{\mathbf{k} = c_1\mathbf{b}_1 + c_2\mathbf{b}_2 + c_3\mathbf{b}_3, \; -\frac{1}{2} < c_1, c_2, c_3 \le \frac{1}{2}\right\}. \tag{1.4.13}$$
Note the convention in the range of c1, c2, c3 and that the unit cell of the reciprocal
lattice is centered at (0, 0, 0). In the physics literature, Ω* is referred to as the (first) Brillouin
zone. Note that (1.4.8) implies that any k and k + G (G ∈ L*) are equivalent. Hence
we can reduce the range of k to the first Brillouin zone Ω*.
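More specifically, for a given $k \in \mathbb{R}^3$, we would like to find the eigen-decomposition
of the form

\[ H \psi_{n,k}(r) = E_{n,k} \psi_{n,k}(r), \tag{1.4.14} \]

where $n = 0, 1, 2, \ldots$ is the index for eigenvalues for each $k$. Using the change of
variable from $\psi$ to the periodic part $u$ as

\[ \psi_{n,k}(r) = e^{i k \cdot r} u_{n,k}(r), \tag{1.4.15} \]

we find

\[
\begin{aligned}
\Bigl( -\tfrac{1}{2} \Delta + V(r) \Bigr) e^{i k \cdot r} u_{n,k}(r)
&= e^{i k \cdot r} \Bigl( -\tfrac{1}{2} (\nabla + i k)^2 + V(r) \Bigr) u_{n,k}(r) \\
&= E_{n,k}\, e^{i k \cdot r} u_{n,k}(r),
\end{aligned}
\tag{1.4.16}
\]

which can be written as

\[ (H_k u_{n,k})(r) := \Bigl( -\tfrac{1}{2} (\nabla + i k)^2 + V(r) \Bigr) u_{n,k}(r) = E_{n,k} u_{n,k}(r). \tag{1.4.17} \]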
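The first equality in (1.4.16) rests on the product rule. As a short worked step (written here for a generic smooth periodic function $u$, not taken from the original text), one can check:

```latex
\begin{aligned}
\nabla \bigl( e^{i k \cdot r} u \bigr) &= e^{i k \cdot r} (\nabla + i k) u, \\
\Delta \bigl( e^{i k \cdot r} u \bigr)
  &= \nabla \cdot \bigl[ e^{i k \cdot r} (\nabla + i k) u \bigr]
   = e^{i k \cdot r} (\nabla + i k) \cdot (\nabla + i k) u \\
  &= e^{i k \cdot r} \bigl( \Delta + 2 i k \cdot \nabla - |k|^2 \bigr) u .
\end{aligned}
```

Since $V(r)$ is a multiplication operator, it commutes with the phase factor $e^{i k \cdot r}$, which gives the operator $-\tfrac{1}{2}(\nabla + i k)^2 + V(r)$ acting on $u$.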

The Hamiltonian operator $H_k$ is a self-adjoint operator on $L^2(\Omega)$; hence all eigenvalues
$E_{n,k}$ are real and the eigenfunctions $u_{n,k}$ are subject to the orthonormality condition

\[ \int_\Omega u_{n,k}^*(r)\, u_{m,k}(r)\, dr = \delta_{n,m}. \tag{1.4.18} \]

As $u_{n,k}$ extends periodically into $\mathbb{R}^3$, $u_{n,k}$ (and also $\psi_{n,k}$) cannot be square integrable
in $\mathbb{R}^3$. However, on $\Omega$, $u_{n,k}$ is a proper eigenfunction of $H_k$. For a fixed $n$, the eigenvalue
$E_{n,k}$ viewed as a function of $k$ is continuous. This function is called a Bloch band or
energy band. The collection of all eigenvalues $\{E_{n,k}\}$ is called the band structure.
As will be discussed in section 2.8, the band structure plays a fundamental role in
understanding electronic properties in solids.
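To make the eigenvalue problem (1.4.17) concrete, the following is a minimal numerical sketch in one dimension. It is not from the original text: the model potential $V(x) = 2 v_0 \cos(2\pi x / a)$, the planewave basis size `n_pw`, and the function name `bloch_bands` are all illustrative assumptions. In the basis $e^{iGx}$ with $G \in \mathbb{L}^*$, the kinetic part of $H_k$ is diagonal with entries $\tfrac{1}{2}(k+G)^2$, and the assumed potential only couples neighboring planewaves.

```python
import numpy as np

def bloch_bands(k, v0=0.5, a=1.0, n_pw=21, n_bands=3):
    """Lowest eigenvalues E_{n,k} of H_k = -(1/2)(d/dx + ik)^2 + V(x) in 1D.

    Assumed model potential: V(x) = 2*v0*cos(2*pi*x/a), whose only nonzero
    Fourier coefficients are v0 at G = +b and G = -b, with b = 2*pi/a.
    """
    b = 2.0 * np.pi / a
    m = np.arange(n_pw) - n_pw // 2          # integers -n_pw//2, ..., n_pw//2
    G = b * m                                # reciprocal lattice points
    # Kinetic part is diagonal in planewaves e^{iGx}: (1/2)(k + G)^2.
    H = np.diag(0.5 * (k + G) ** 2)
    # The potential couples planewaves whose G differ by +b or -b.
    H = H + v0 * (np.eye(n_pw, k=1) + np.eye(n_pw, k=-1))
    return np.linalg.eigvalsh(H)[:n_bands]   # ascending order

b = 2.0 * np.pi                              # reciprocal lattice vector for a = 1
ks = np.linspace(-b / 2, b / 2, 41)          # sample the first Brillouin zone
bands = np.array([bloch_bands(k) for k in ks])    # shape (41, 3)
edge = bloch_bands(b / 2)
gap = edge[1] - edge[0]                      # band gap at the zone boundary
```

Each column of `bands` traces one Bloch band $E_{n,k}$ over the Brillouin zone. In this sketch the truncation to `n_pw` planewaves is the only approximation, and shifting `k` by a reciprocal lattice vector leaves the low bands unchanged, in line with the equivalence of $k$ and $k + G$.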

1.5 Tensor product spaces: Two spin-1/2 particles

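The method for describing quantum systems containing more than one particle is the
tensor product of Hilbert spaces. For simplicity, we only consider finite dimensional
Hilbert spaces. Let $\mathcal{H}_A$, $\mathcal{H}_B$ be two Hilbert spaces of dimension $N_A$, $N_B$, respectively,
and let $\{|\phi_i^A\rangle\}_{i=1}^{N_A}$, $\{|\phi_j^B\rangle\}_{j=1}^{N_B}$ be the corresponding orthonormal basis sets. Then the tensor product
space is defined as

\[ \mathcal{H}_A \otimes \mathcal{H}_B = \mathrm{span} \bigl\{ |\phi_i^A \phi_j^B\rangle \bigm| i = 1, \ldots, N_A, \ j = 1, \ldots, N_B \bigr\}. \tag{1.5.1} \]

Here $\{|\phi_i^A \phi_j^B\rangle\}$ form a new basis set, which is orthonormal under the inner product

\[ \langle \phi_i^A \phi_j^B | \phi_{i'}^A \phi_{j'}^B \rangle = \langle \phi_i^A | \phi_{i'}^A \rangle \langle \phi_j^B | \phi_{j'}^B \rangle = \delta_{i,i'} \delta_{j,j'}. \]

Hence $\mathcal{H}_A \otimes \mathcal{H}_B$ is a Hilbert space of dimension $N_A \times N_B$.

If $\hat{A}$, $\hat{B}$ are linear operators on $\mathcal{H}_A$, $\mathcal{H}_B$, respectively, then the tensor product $\hat{A} \otimes \hat{B}$
acting on any basis vector $|\phi_i^A \phi_j^B\rangle$ is defined as

\[ (\hat{A} \otimes \hat{B}) |\phi_i^A \phi_j^B\rangle = |(\hat{A} \phi_i^A)(\hat{B} \phi_j^B)\rangle. \tag{1.5.2} \]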

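For example, consider two spin-1/2 particles. The Hilbert space for each particle,
denoted by $\mathcal{H} = \mathrm{span}\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$, is isomorphic to $\mathbb{C}^2$, and the product space is therefore
isomorphic to $\mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^4$. The basis set for $\mathcal{H} \otimes \mathcal{H}$ is given by

\[ |{\uparrow}_A {\uparrow}_B\rangle, \quad |{\uparrow}_A {\downarrow}_B\rangle, \quad |{\downarrow}_A {\uparrow}_B\rangle, \quad |{\downarrow}_A {\downarrow}_B\rangle, \]

or in simplified notation $\{|{\uparrow\uparrow}\rangle, |{\uparrow\downarrow}\rangle, |{\downarrow\uparrow}\rangle, |{\downarrow\downarrow}\rangle\}$. The spin operators along the z-direction
on the product space can be defined as

\[ S_z^{(1)} = S_z \otimes I, \qquad S_z^{(2)} = I \otimes S_z, \tag{1.5.3} \]

and the total spin operator along the z-direction is

\[ S_z^{\mathrm{tot}} = S_z^{(1)} + S_z^{(2)}. \tag{1.5.4} \]

Similarly, we can define $S_x^{\mathrm{tot}}$, $S_y^{\mathrm{tot}}$. The square of the magnitude of the total spin
operator is

\[ (S^{\mathrm{tot}})^2 = (S_x^{\mathrm{tot}})^2 + (S_y^{\mathrm{tot}})^2 + (S_z^{\mathrm{tot}})^2. \]

Below we demonstrate that the tensor product space for two spin-1/2 particles can be
explicitly categorized using the common eigenfunctions of $S_z^{\mathrm{tot}}$ and $(S^{\mathrm{tot}})^2$.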
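To start, we compute $S_z^{\mathrm{tot}}$ acting on the four basis functions as

\[
\begin{aligned}
S_z^{\mathrm{tot}} |{\uparrow\uparrow}\rangle &= (S_z \otimes I) |{\uparrow\uparrow}\rangle + (I \otimes S_z) |{\uparrow\uparrow}\rangle = \tfrac{1}{2} |{\uparrow\uparrow}\rangle + \tfrac{1}{2} |{\uparrow\uparrow}\rangle = |{\uparrow\uparrow}\rangle, \\
S_z^{\mathrm{tot}} |{\downarrow\downarrow}\rangle &= -|{\downarrow\downarrow}\rangle, \\
S_z^{\mathrm{tot}} |{\downarrow\uparrow}\rangle &= (S_z \otimes I) |{\downarrow\uparrow}\rangle + (I \otimes S_z) |{\downarrow\uparrow}\rangle = -\tfrac{1}{2} |{\downarrow\uparrow}\rangle + \tfrac{1}{2} |{\downarrow\uparrow}\rangle = 0, \\
S_z^{\mathrm{tot}} |{\uparrow\downarrow}\rangle &= 0,
\end{aligned}
\]

where $0$ on the right-hand side denotes the zero vector, i.e., the eigenvalue is 0.
Hence all basis vectors of the tensor product space are eigenstates of $S_z^{\mathrm{tot}}$. Similarly,
one can evaluate $S_x^{\mathrm{tot}}$ and $S_y^{\mathrm{tot}}$ acting on the basis vectors. The square of the magnitude
of the total spin operator $(S^{\mathrm{tot}})^2$ requires further computation. Since

\[
\begin{aligned}
(S_z^{\mathrm{tot}})^2 &= (S_z \otimes I)^2 + (S_z \otimes I)(I \otimes S_z) + (I \otimes S_z)(S_z \otimes I) + (I \otimes S_z)^2 \\
&= \tfrac{1}{2} I \otimes I + 2 S_z \otimes S_z,
\end{aligned}
\]

and the same computation applies to the x- and y-components, we have

\[ (S^{\mathrm{tot}})^2 = \tfrac{3}{2} I \otimes I + 2 S_x \otimes S_x + 2 S_y \otimes S_y + 2 S_z \otimes S_z. \]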
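We can now check that (exercise)

\[ \bigl[ (S^{\mathrm{tot}})^2, S_z^{\mathrm{tot}} \bigr] = 0. \tag{1.5.5} \]

Therefore $(S^{\mathrm{tot}})^2$ and $S_z^{\mathrm{tot}}$ can be simultaneously diagonalized. More specifically,

\[ (S^{\mathrm{tot}})^2 |{\uparrow\uparrow}\rangle = 2 |{\uparrow\uparrow}\rangle, \qquad (S^{\mathrm{tot}})^2 |{\downarrow\downarrow}\rangle = 2 |{\downarrow\downarrow}\rangle. \tag{1.5.6} \]

On the other hand, $|{\uparrow\downarrow}\rangle$ and $|{\downarrow\uparrow}\rangle$ are not eigenstates of $(S^{\mathrm{tot}})^2$. Instead,

\[
\begin{aligned}
(S^{\mathrm{tot}})^2 \Bigl( \tfrac{1}{\sqrt{2}} |{\uparrow\downarrow}\rangle + \tfrac{1}{\sqrt{2}} |{\downarrow\uparrow}\rangle \Bigr) &= 2 \Bigl( \tfrac{1}{\sqrt{2}} |{\uparrow\downarrow}\rangle + \tfrac{1}{\sqrt{2}} |{\downarrow\uparrow}\rangle \Bigr), \\
(S^{\mathrm{tot}})^2 \Bigl( \tfrac{1}{\sqrt{2}} |{\uparrow\downarrow}\rangle - \tfrac{1}{\sqrt{2}} |{\downarrow\uparrow}\rangle \Bigr) &= 0.
\end{aligned}
\tag{1.5.7}
\]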
Thus the operator $(S^{\mathrm{tot}})^2$ can be used to resolve the eigenspace of $S_z^{\mathrm{tot}}$ corresponding
to the eigenvalue 0, which is spanned by the degenerate eigenstates $|{\uparrow\downarrow}\rangle$ and $|{\downarrow\uparrow}\rangle$.
In summary, the operator $(S^{\mathrm{tot}})^2$ has a nondegenerate eigenvalue 0, whose eigenstate is called the
singlet state, and a three-fold degenerate eigenvalue 2, whose eigenstates are called the triplet states.
The triplet states can be further distinguished using the operator $S_z^{\mathrm{tot}}$. The eigenvalues
and eigenvectors of $(S^{\mathrm{tot}})^2$ and $S_z^{\mathrm{tot}}$ are summarized in Table 1.2.

State                      Type      (S^tot)^2   S_z^tot
(1/√2)(|↑↓⟩ − |↓↑⟩)        singlet   0            0
|↑↑⟩                       triplet   2            1
(1/√2)(|↑↓⟩ + |↓↑⟩)        triplet   2            0
|↓↓⟩                       triplet   2           −1

Table 1.2. Eigenvalues and eigenvectors of (S^tot)^2 and S_z^tot for two spin-1/2 particles.
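The entries of Table 1.2 can be checked numerically with Kronecker products. The sketch below is an illustration only: it assumes the standard matrix representation $S_\alpha = \tfrac{1}{2}\sigma_\alpha$ in units where $\hbar = 1$ and the basis ordering $|{\uparrow\uparrow}\rangle, |{\uparrow\downarrow}\rangle, |{\downarrow\uparrow}\rangle, |{\downarrow\downarrow}\rangle$; the helper name `total` is hypothetical.

```python
import numpy as np

# Spin-1/2 operators S = sigma / 2 (assumed units with hbar = 1).
sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

def total(s):
    """Total operator S^(1) + S^(2) = s (x) I + I (x) s on C^2 (x) C^2."""
    return np.kron(s, I2) + np.kron(I2, s)

Sz_tot = total(sz)
S2_tot = sum(total(s) @ total(s) for s in (sx, sy, sz))

# Basis ordering: |uu>, |ud>, |du>, |dd>.
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
triplet0 = np.array([0, 1, 1, 0]) / np.sqrt(2)

commutator = S2_tot @ Sz_tot - Sz_tot @ S2_tot   # should vanish, cf. (1.5.5)
spectrum = np.linalg.eigvalsh(S2_tot)            # expected: one 0 and three 2's
```

Here `np.kron` plays the role of the tensor product in (1.5.2), and `spectrum` collects the eigenvalues of $(S^{\mathrm{tot}})^2$ on the four-dimensional product space.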

1.6 Identical particles


Two-particle quantum system
Our discussion so far has involved either the spatial degrees of freedom (continuous or
discretized) or the spin degrees of freedom. Now let us consider the spatial and spin
degrees of freedom together. For a spin-dependent quantum particle in the real space,
the state space is

\[ \mathcal{H} = L^2(\mathbb{R}^3; \mathbb{C}^2) := \Bigl\{ \psi(r, \sigma) \Bigm| \sum_{\sigma \in \{\uparrow, \downarrow\}} \int_{\mathbb{R}^3} |\psi(r, \sigma)|^2 \, dr < \infty \Bigr\}. \tag{1.6.1} \]

We often use $x = (r, \sigma)$ to denote collectively the spatial and spin variables. We also
introduce the notation

\[ \int dx := \sum_{\sigma \in \{\uparrow, \downarrow\}} \int_{\mathbb{R}^3} dr. \]

Then the state space $\mathcal{H}$ can be written as

\[ \mathcal{H} = \Bigl\{ \psi(x) \Bigm| \int |\psi(x)|^2 \, dx < \infty \Bigr\}. \]

Let us consider a system with two quantum particles. Each state vector $|\Psi\rangle$ is in the
tensor product space $\mathcal{H} \otimes \mathcal{H}$, where $\mathcal{H} = L^2(\mathbb{R}^3; \mathbb{C}^2)$. The wavefunction is $\Psi(x_1, x_2) =
\langle x_1, x_2 | \Psi \rangle$, where $x_i = (r_i, \sigma_i)$ represents the collective spatial and spin variables of
the $i$th particle. Again with some abuse of notation, we may not distinguish the state
vector $|\Psi\rangle$ and its associated wavefunction $\Psi(x_1, x_2)$. If we interchange the indices for
the two particles, then the wavefunction becomes $\Psi(x_2, x_1)$. This can be represented
using a permutation operator $P_{12}$, defined as

\[ \langle x_1, x_2 | P_{12} | \Psi \rangle = \Psi(x_2, x_1). \]

One can verify immediately that $P_{12}^2 = I$, i.e.,

\[ \langle x_1, x_2 | P_{12}^2 | \Psi \rangle = \Psi(x_1, x_2). \tag{1.6.2} \]

Hence the eigenvalues of the permutation operator $P_{12}$ must be $\pm 1$. Furthermore, if the
Hamiltonian $H$ commutes with $P_{12}$, then one can find $|\Psi\rangle$ so that it is simultaneously
an eigenstate of $H$ and $P_{12}$, i.e.,

\[ H |\Psi\rangle = E |\Psi\rangle, \qquad P_{12} |\Psi\rangle = \pm |\Psi\rangle. \]

If the sign is $+$, then $\Psi$ is a symmetric function

\[ \Psi(x_1, x_2) = \Psi(x_2, x_1). \]

This is called a bosonic state. If the sign is $-$, then $\Psi$ is an anti-symmetric function

\[ \Psi(x_1, x_2) = -\Psi(x_2, x_1). \]

This is called a fermionic state.
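As a small illustration of the permutation operator, one can discretize the combined variable $x$ on a finite set of values, so that a two-particle wavefunction becomes a square array and $P_{12}$ becomes the transpose. This is a sketch under that assumption; the grid size `n` and the random test function are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                    # number of discrete values of x (assumed)
Psi = rng.standard_normal((n, n))        # Psi[a, b] stands for Psi(x_a, x_b)

def P12(F):
    """Exchange the two particle coordinates: (P12 F)(x1, x2) = F(x2, x1)."""
    return F.T

# Projections onto the eigenspaces of P12 with eigenvalues +1 and -1.
Psi_sym = 0.5 * (Psi + P12(Psi))         # bosonic (symmetric) part
Psi_anti = 0.5 * (Psi - P12(Psi))        # fermionic (anti-symmetric) part
```

Any two-particle function splits uniquely as `Psi_sym + Psi_anti`, and the anti-symmetric part vanishes on the diagonal $x_1 = x_2$.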

Example: Helium atom


The Hamiltonian of a helium atom (so the system has two electrons with nucleus charge
2) is

\[ H = -\tfrac{1}{2} \Delta_{r_1} - \tfrac{1}{2} \Delta_{r_2} - \frac{2}{|r_1|} - \frac{2}{|r_2|} + \frac{1}{|r_1 - r_2|}. \tag{1.6.3} \]

Although the Hamiltonian does not explicitly involve spin operators, the wavefunction
involves both spatial and spin degrees of freedom as

\[ \Psi(x_1, x_2) \equiv \Psi\bigl( (r_1, \sigma_1), (r_2, \sigma_2) \bigr). \]

The wavefunction $|\Psi\rangle$ for electrons is always a fermionic state and is in the space $\mathcal{A}_2 =
\bigwedge^2 L^2(\mathbb{R}^3; \mathbb{C}^2)$, which consists of all anti-symmetric functions in the tensor product
space $L^2(\mathbb{R}^3; \mathbb{C}^2) \otimes L^2(\mathbb{R}^3; \mathbb{C}^2)$.
Since the Hamiltonian does not explicitly involve the spin degrees of freedom, we
have

\[ \bigl[ (S^{\mathrm{tot}})^2, H \bigr] = 0, \qquad \bigl[ S_z^{\mathrm{tot}}, H \bigr] = 0. \tag{1.6.4} \]

Then $|\Psi\rangle$ can be chosen to be simultaneously an eigenstate of $(S^{\mathrm{tot}})^2$, $S_z^{\mathrm{tot}}$, and $H$, and we can
find the ground state wavefunction $\Psi$ by separating the spatial and spin degrees of freedom as

\[ \Psi(x_1, x_2) = \phi(r_1, r_2) \chi(\sigma_1, \sigma_2). \tag{1.6.5} \]

According to the discussion in section 1.5, $\chi$ must be either the anti-symmetric function
(spin singlet) or a symmetric function (spin triplet). Since the overall wavefunction
must be anti-symmetric, if $\chi$ is a spin singlet, then the spatial wavefunction $\phi$ must be
symmetric (i.e., of the bosonic form) and vice versa.
More specifically, if $\chi$ is a spin singlet, i.e.,

\[ \chi(\sigma_1, \sigma_2) = \chi_S(\sigma_1, \sigma_2) := \tfrac{1}{\sqrt{2}} \bigl( \langle \sigma_1 \sigma_2 | {\uparrow\downarrow} \rangle - \langle \sigma_1 \sigma_2 | {\downarrow\uparrow} \rangle \bigr), \]

then the simplest symmetric wavefunction for the spatial degrees of freedom takes the
factorized form

\[ \phi(r_1, r_2) = \varphi(r_1) \varphi(r_2). \tag{1.6.6} \]

The normalization condition for $\Psi(x_1, x_2)$ requires $\varphi$ to be normalized as

\[ \int |\varphi(r)|^2 \, dr = 1. \]

If $\chi$ is a spin triplet, i.e., $\chi(\sigma_1, \sigma_2) = \langle \sigma_1 \sigma_2 | {\uparrow\uparrow} \rangle$, then the spatial part $\phi(r_1, r_2)$
should be anti-symmetric. Note that anti-symmetrizing a function of the form (1.6.6)
would simply give the zero function. Hence the simplest anti-symmetric function requires
two orthonormal functions $\varphi_1(r)$, $\varphi_2(r)$ and

\[ \phi(r_1, r_2) = \tfrac{1}{\sqrt{2}} \bigl( \varphi_1(r_1) \varphi_2(r_2) - \varphi_1(r_2) \varphi_2(r_1) \bigr) = \tfrac{1}{\sqrt{2}} \begin{vmatrix} \varphi_1(r_1) & \varphi_1(r_2) \\ \varphi_2(r_1) & \varphi_2(r_2) \end{vmatrix}. \tag{1.6.7} \]

It can be readily verified that $\phi(r_1, r_2)$ is a normalized, anti-symmetric function. The
determinant on the right-hand side of (1.6.7) is called a Slater determinant.
The above representation explicitly separates the spatial degrees of freedom and the
spin degrees of freedom, and the single particle functions $\varphi$, $\varphi_1$, $\varphi_2$ are called (single
particle) spatial orbitals. It is also useful to consider these two sets of degrees of freedom
together. For example, define the functions

\[ \psi_1(x) = \varphi(r) \langle \sigma | {\uparrow} \rangle, \qquad \psi_2(x) = \varphi(r) \langle \sigma | {\downarrow} \rangle, \]

which are called (single particle) spin orbitals. The orthonormality condition

\[ \int \psi_i^*(x) \psi_j(x) \, dx = \delta_{ij}, \quad i, j = 1, 2, \]

is naturally satisfied. Then one can readily verify that the spin singlet state can be
written as a Slater determinant as

\[ \Psi(x_1, x_2) = \tfrac{1}{\sqrt{2}} \begin{vmatrix} \psi_1(x_1) & \psi_1(x_2) \\ \psi_2(x_1) & \psi_2(x_2) \end{vmatrix}. \tag{1.6.8} \]

On the other hand, if we define the spin orbitals as

\[ \psi_1(x) = \varphi_1(r) \langle \sigma | {\uparrow} \rangle, \qquad \psi_2(x) = \varphi_2(r) \langle \sigma | {\uparrow} \rangle, \]

then the spin triplet wavefunction with $\chi(\sigma_1, \sigma_2) = \langle \sigma_1 \sigma_2 | {\uparrow\uparrow} \rangle$ and the spatial part (1.6.7)
can again be written in the form (1.6.8). Hence the
Slater determinant provides a unified representation using spin orbitals.
The discussion above is a special case of the Hartree–Fock ansatz, which will be
discussed in detail in the next chapter. We emphasize that the Hartree–Fock ansatz
is only an approximate theory for solving systems with more than one electron, since
the many-body wavefunction can be the linear combination of more than one Slater
determinant.
In order to find the approximate ground state wavefunction, we need to compare
the energy $\langle \Psi | H | \Psi \rangle$ for $|\Psi\rangle$ being the spin singlet and the triplet state, respectively.
Borrowing from the intuition of the $\mathrm{H}_2^+$ molecule, we expect the spin singlet state with
a symmetric spatial wavefunction profile to have a lower energy. Numerical results
obtained from the Hartree–Fock calculation indicate that this is indeed the case. Hence
even when the Hamiltonian does not explicitly involve the spin degree of freedom, spin
still plays an important role in quantum chemistry by influencing the symmetry of the
spatial component of the wavefunction, which leads to measurable effects.

N -particle quantum systems


The above discussion for the two-particle quantum system can be readily generalized to
systems with $N$ quantum particles. In general, consider a state for an $N$-particle system
$|\Psi\rangle \in \bigotimes^N L^2(\mathbb{R}^3; \mathbb{C}^2)$, i.e.,

\[ \Psi(x_1, \ldots, x_N) = \langle x_1, x_2, \ldots, x_N | \Psi \rangle. \]

We can introduce the permutation operator $P_{ij}$:

\[ \langle x_1, x_2, \ldots, x_i, \ldots, x_j, \ldots, x_N | P_{ij} | \Psi \rangle = \Psi(x_1, x_2, \ldots, x_j, \ldots, x_i, \ldots, x_N). \tag{1.6.9} \]

Since $P_{ij}^2 = I$ for any $i \ne j$, the eigenvalues of the permutation operator $P_{ij}$ must be
$\pm 1$.
Quantum mechanics postulates that, for a system containing $N$ elementary particles
of the same type, any quantum state $|\Psi\rangle$ must simultaneously be an eigenstate of all
permutation operators $P_{ij}$, $i, j = 1, \ldots, N$, $i \ne j$, i.e.,

\[ P_{ij} |\Psi\rangle = \pm |\Psi\rangle. \tag{1.6.10} \]

Furthermore, for a given type of elementary particle, the choice of the $+$ or $-$ sign is
independent of $i, j$. In particular, if the sign is $+$, the particle is called a boson. The
bosonic wavefunctions are (totally) symmetric:

\[ \langle x_1, \ldots, x_i, \ldots, x_j, \ldots, x_N | P_{ij} | \Psi \rangle = \Psi(x_1, \ldots, x_j, \ldots, x_i, \ldots, x_N) = \Psi(x_1, \ldots, x_i, \ldots, x_j, \ldots, x_N). \tag{1.6.11} \]

The set of all symmetric wavefunctions is denoted by

\[ \mathcal{S}_N := \mathrm{Sym}^N L^2(\mathbb{R}^3; \mathbb{C}^2) \subset \bigotimes^N L^2(\mathbb{R}^3; \mathbb{C}^2). \tag{1.6.12} \]

If the sign is $-$, the particle is called a fermion. The fermionic wavefunctions are
(totally) anti-symmetric:

\[ \langle x_1, \ldots, x_i, \ldots, x_j, \ldots, x_N | P_{ij} | \Psi \rangle = \Psi(x_1, \ldots, x_j, \ldots, x_i, \ldots, x_N) = -\Psi(x_1, \ldots, x_i, \ldots, x_j, \ldots, x_N). \tag{1.6.13} \]

The set of all anti-symmetric wavefunctions is denoted by

\[ \mathcal{A}_N := \bigwedge^N L^2(\mathbb{R}^3; \mathbb{C}^2) \subset \bigotimes^N L^2(\mathbb{R}^3; \mathbb{C}^2). \tag{1.6.14} \]

Note that in classical physics, one can always assign different labels to different
particles, for example by using their positions. Once the labels are assigned, one can
“name” a particle and distinguish one from the other. In quantum physics, this is not
possible. The symmetry of the squared magnitude $|\langle x_1, \ldots, x_N | P_{ij} | \Psi \rangle|^2 = |\langle x_1, \ldots, x_N | \Psi \rangle|^2$
implies that the probability of finding any permutation of particle positions in a configuration
is exactly the same. In this sense, elementary quantum particles are identical. We
stress that the fact that quantum particles are identical is yet another intrinsic property
of quantum particles which distinguishes them from their classical counterparts.
Whether a particle is a boson or a fermion is determined by its spin. If the spin is a
half integer (e.g., electron and proton), the particle is a fermion. If the spin is an integer
(e.g., photon), the particle is a boson. This statement (the spin-statistics theorem) is beyond
quantum mechanics in the Schrödinger picture, and must be explained using relativistic quantum theory.
The method of generating bosonic and fermionic wavefunctions can be generalized
to systems with $N$ particles. For instance, starting from $N$ normalized (not necessarily
orthogonal) single particle spin orbitals $\{\psi_i\}_{i=1}^N$, the simplest wavefunction in the
absence of any symmetry constraint takes the factorized form

\[ \Psi(x_1, x_2, \ldots, x_N) = \psi_1(x_1) \psi_2(x_2) \cdots \psi_N(x_N). \tag{1.6.15} \]

In order to satisfy the symmetry constraint, one can construct the bosonic wavefunction
by symmetrizing over all possible permutations from the permutation group $\mathrm{Sym}(N)$:

\[ \Psi_B(x_1, x_2, \ldots, x_N) = C_B \sum_{\pi \in \mathrm{Sym}(N)} \psi_{\pi(1)}(x_1) \psi_{\pi(2)}(x_2) \cdots \psi_{\pi(N)}(x_N), \tag{1.6.16} \]

where $C_B$ is a normalization factor.

For the fermionic wavefunction, one can sum over all possible permutation
operations as well, but here introducing a sign factor due to anti-symmetry:

\[ \Psi_F(x_1, x_2, \ldots, x_N) = C_F \sum_{\pi \in \mathrm{Sym}(N)} (-1)^{\pi} \psi_{\pi(1)}(x_1) \psi_{\pi(2)}(x_2) \cdots \psi_{\pi(N)}(x_N), \tag{1.6.17} \]

where $C_F$ is a normalization constant and

\[ (-1)^{\pi} = \begin{cases} 1, & \pi \text{ is an even permutation}, \\ -1, & \pi \text{ is an odd permutation}. \end{cases} \]

We find that $\Psi_F$ can be written as a Slater determinant:

\[ \Psi_F(x_1, \ldots, x_N) = C_F \begin{vmatrix} \psi_1(x_1) & \cdots & \psi_1(x_N) \\ \vdots & \ddots & \vdots \\ \psi_N(x_1) & \cdots & \psi_N(x_N) \end{vmatrix}. \tag{1.6.18} \]

One immediate consequence of the form of a determinant is that if two spin orbitals
$\psi_i(x)$ and $\psi_j(x)$ are the same, then the determinant vanishes. More generally, the determinant
vanishes whenever the spin orbitals are linearly dependent. Hence, without loss of
generality, we may assume the spin orbitals are orthonormal, i.e.,

\[ \int \psi_i^*(x) \psi_j(x) \, dx = \delta_{ij}, \quad i, j = 1, \ldots, N. \]

Then the normalization factor is

\[ C_F = \frac{1}{\sqrt{N!}}. \]
In general, an N -body fermionic wavefunction Ψ(x1 , . . . , xN ) must vanish if two
coordinates xi and xj take the same value due to anti-symmetry, even if Ψ is not a Slater
determinant. This is called the Pauli exclusion principle, which can also be stated as the
fact that any two fermions cannot occupy the same quantum state. The anti-symmetric
wavefunction of fermions leads to major difficulties in electronic structure theory.
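The equivalence of the permutation sum (1.6.17) and the determinant (1.6.18), the normalization $C_F = 1/\sqrt{N!}$, and the vanishing at coinciding coordinates can all be checked on a small discrete model. The sketch below assumes that the combined variable $x$ takes only `M` discrete values and uses random real orthonormal orbitals; the function names `perm_sign`, `slater_det`, and `slater_sum` are hypothetical.

```python
import itertools
import math
import numpy as np

def perm_sign(p):
    """Sign (-1)^pi of a permutation given as a tuple, via its inversion count."""
    inv = sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))
    return -1 if inv % 2 else 1

def slater_det(orbitals, xs):
    """Psi_F(x_1..x_N) = det[psi_i(x_j)] / sqrt(N!), cf. (1.6.18)."""
    A = orbitals[:, list(xs)]                  # A[i, j] = psi_i(x_j)
    return np.linalg.det(A) / math.sqrt(math.factorial(len(xs)))

def slater_sum(orbitals, xs):
    """Same value from the signed sum over permutations, cf. (1.6.17)."""
    N = len(xs)
    total = 0.0
    for p in itertools.permutations(range(N)):
        total += perm_sign(p) * np.prod([orbitals[p[j], xs[j]] for j in range(N)])
    return total / math.sqrt(math.factorial(N))

rng = np.random.default_rng(1)
M, N = 5, 3                                    # discrete x values, particles (assumed)
Q, _ = np.linalg.qr(rng.standard_normal((M, N)))
orbitals = Q.T                                 # rows are orthonormal orbitals psi_i

# Sum of |Psi_F|^2 over all configurations; equals 1 for orthonormal orbitals.
norm2 = sum(slater_det(orbitals, xs) ** 2
            for xs in itertools.product(range(M), repeat=N))
```

The permutation sum costs $N!$ terms while the determinant costs $O(N^3)$ operations, which is one practical reason the determinant form is preferred.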

Exercises
1. Prove that the spin-1/2 operator satisfies the identity

   \[ \hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2 = \tfrac{3}{4} I_2. \tag{1.6.19} \]

   Show that $[\hat{S}^2, \hat{S}_\alpha] = 0$ for $\alpha = x, y, z$.
2. Given an operator $\hat{A}(t)$ depending explicitly on time and a time-dependent state
   vector $|\psi(t)\rangle$, define the expectation value as $\langle \hat{A}(t) \rangle := \langle \psi(t) | \hat{A}(t) | \psi(t) \rangle$. Prove
   that the evolution of the expectation value satisfies

   \[ i \frac{d}{dt} \langle \hat{A}(t) \rangle = i \langle \partial_t \hat{A}(t) \rangle + \langle [\hat{A}, \hat{H}](t) \rangle. \]

3. From the example of the spin precession, solve the Schrödinger equation directly
   to evaluate the expectation values $\langle \hat{S}_x(t) \rangle$, $\langle \hat{S}_y(t) \rangle$, $\langle \hat{S}_z(t) \rangle$.
4. From the example of the spin precession, let us add an external potential $\hat{V}(t) =
   \gamma \cos \omega t \, (|{\uparrow}\rangle\langle{\downarrow}| + |{\downarrow}\rangle\langle{\uparrow}|)$. The corresponding Hamiltonian operator in the matrix
   form is

   \[ \hat{H}(t) = \hat{S} \cdot B + \hat{V}(t) = \begin{pmatrix} \frac{B}{2} & \gamma \cos \omega t \\ \gamma \cos \omega t & -\frac{B}{2} \end{pmatrix}. \]

   (a) Write down the Schrödinger equation for a state vector $|\psi(t)\rangle$.
   (b) Starting from $|\psi(0)\rangle = |{\downarrow}\rangle$, write a computer program using the fourth-order
       Runge–Kutta method to propagate the Schrödinger equation with $\gamma =
       B = 1$. Define $\Delta = B - \omega$ and perform the simulation at $\Delta = 0$; the resulting
       oscillation is called the Rabi oscillation. Also try other values, e.g.,
       $\Delta = 0.5, -0.5, -1.0, -2.0$.
5. Let $n$ be a unit vector in $\mathbb{R}^3$, i.e., $\|n\| = 1$, and let $\theta \in \mathbb{R}$. Prove the following matrix
   identity:

   \[ e^{i \theta n \cdot \sigma} = \cos(\theta) I + i \sin(\theta) \, n \cdot \sigma. \]

6. Verify that the position operator $\hat{x}$ is symmetric with respect to the inner product
   on $L^2(\mathbb{R})$: for $\phi(x)$, $\psi(x)$ satisfying (1.2.9),

   \[ \langle \phi | \hat{x} \psi \rangle = \langle \hat{x} \phi | \psi \rangle. \]

7. Verify that the momentum operator $\hat{p}$ is symmetric with respect to the inner product
   on $L^2(\mathbb{R})$: for $\phi(x), \psi(x) \in L^2(\mathbb{R})$ satisfying $\phi'(x), \psi'(x) \in L^2(\mathbb{R})$, use
   integration by parts and prove that

   \[ \langle \phi | \hat{p} \psi \rangle = \langle \hat{p} \phi | \psi \rangle. \]

8. Derive the function in $L^2(\mathbb{R})$ that achieves minimal uncertainty according to the
   Heisenberg uncertainty principle.
9. Verify the relations (1.2.22) and (1.2.23) for the angular momentum operator.
10. Using the definition (1.2.20), check the formula (1.2.25) in spherical coordinates.
11. Verify that Hk in (1.4.17) is a self-adjoint operator on the unit cell Ω.
12. Verify (1.5.5), (1.5.6), and (1.5.7).
13. Verify that the wavefunction described by the Slater determinant (1.6.18) is in-
deed the same as that in (1.6.17).
Chapter 2

Density functional theory: Formulation and algorithms

In a quantum many-body system, the ground state is often the most important state.
This is because the energy gap $E_1 - E_0$ for many quantum systems is on the order of
electron volts (eV), or $10^4$ Kelvin measured in terms of $k_B T$, where $k_B$ is the Boltzmann
constant. This is much higher than room temperature (about 300 Kelvin). According to
the Boltzmann distribution, the probability for the state with energy $E_i$ to be occupied is $e^{-\beta E_i}/Z$,
where $\beta = 1/(k_B T)$ is the inverse temperature and $Z$ is a normalization factor. Hence the ground
state is often the dominating state. Even for metallic systems where $E_1 - E_0$ is small
or zero, the ground state can still be very important.
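The orders of magnitude quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes a gap of exactly 1 eV and a temperature of 300 K as representative values; the function names are hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333262e-5     # Boltzmann constant in eV/K

def gap_in_kelvin(gap_ev):
    """Temperature T at which k_B * T equals the given energy gap."""
    return gap_ev / K_B_EV_PER_K

def excited_to_ground_ratio(gap_ev, temperature_k):
    """Boltzmann ratio p_1 / p_0 = exp(-(E_1 - E_0) / (k_B * T))."""
    return math.exp(-gap_ev / (K_B_EV_PER_K * temperature_k))

t_equiv = gap_in_kelvin(1.0)                   # roughly 1.16e4 K
ratio = excited_to_ground_ratio(1.0, 300.0)    # roughly 1.6e-17
```

With these assumed values, the first excited state is occupied about seventeen orders of magnitude less often than the ground state at room temperature, which is the sense in which the ground state dominates.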
For a system containing both nuclei and electrons, in principle all particles are quan-
tum particles and should be characterized by quantum mechanics. Since the mass of the
lightest element in the periodic table (hydrogen) is around 2000 times larger than that of
the electron, the commonly used Born–Oppenheimer approximation assumes that the
nuclei can be described by classical mechanics. This is often a very good approxima-
tion.
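The many-body Hamiltonian with $M$ nuclei and $N$ electrons is given by

\[
\begin{aligned}
H &= \sum_{i=1}^{N} -\tfrac{1}{2} \Delta_{r_i} + \sum_{i=1}^{N} V_{\mathrm{ext}}(r_i; \{R_I\}) + \sum_{i<j}^{N} \frac{1}{|r_i - r_j|} + \sum_{I<J}^{M} \frac{Z_I Z_J}{|R_I - R_J|} \\
&\equiv T + V_{en} + V_{ee} + E_{II},
\end{aligned}
\tag{2.0.1}
\]

where the first three terms of the Hamiltonian are kinetic, electron-ion, and electron-electron
interactions, respectively:

\[ T = \sum_{i=1}^{N} -\tfrac{1}{2} \Delta_{r_i}, \qquad V_{en} = \sum_{i=1}^{N} V_{\mathrm{ext}}(r_i; \{R_I\}), \qquad V_{ee} = \sum_{i<j}^{N} \frac{1}{|r_i - r_j|}. \]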

In principle the nuclei are also quantum particles and can be described by the Schrödinger
equation as well. However, since the mass of a nucleus is much larger than
that of an electron, the nuclei are often treated as classical particles. This is called the
Born–Oppenheimer (BO) approximation. Under the BO approximation, the nuclei have
a set of fixed positions $\{R_I\}_{I=1}^{M}$, which is called the atomic configuration. The ion-ion
interaction then simply adds a constant shift to the Hamiltonian with

\[ E_{II} = \sum_{I<J}^{M} \frac{Z_I Z_J}{|R_I - R_J|}. \]

The many-body ground state wavefunction $\Psi(x_1, \ldots, x_N)$ is associated with the smallest
eigenvalue $E$ of the linear eigenvalue problem:

\[ H \Psi = E \Psi. \tag{2.0.2} \]

Due to the Pauli exclusion principle for identical electrons, $\Psi$ is an anti-symmetric
function. The Courant–Fischer min-max theorem states that eigenvalue problem (2.0.2)
can also be viewed as an optimization problem:

\[ E = \inf_{|\Psi\rangle \in \mathcal{A}_N, \ \langle \Psi | \Psi \rangle = 1} \langle \Psi | H | \Psi \rangle, \tag{2.0.3} \]

where $\mathcal{A}_N$ defined in (1.6.14) is the set of anti-symmetric functions with $N$ electrons.
Equation (2.0.3) is the variational principle for the ground state in quantum mechanics.
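The variational principle has a direct finite-dimensional analogue that is easy to test. The sketch below replaces $H$ by a random real symmetric matrix and ignores the anti-symmetry constraint entirely; both are simplifying assumptions made only for illustration, and the helper name `rayleigh` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
H = 0.5 * (A + A.T)                       # random real symmetric stand-in for H

def rayleigh(H, v):
    """Energy <v|H|v> of the normalized vector v / ||v||."""
    v = v / np.linalg.norm(v)
    return v @ H @ v

evals, evecs = np.linalg.eigh(H)          # eigenvalues in ascending order
E0 = evals[0]                             # smallest eigenvalue ("ground state energy")

# Energies of random trial states: all lie at or above E0.
trial = np.array([rayleigh(H, rng.standard_normal(n)) for _ in range(1000)])
```

Every trial energy is an upper bound for `E0`, and the bound is attained at the eigenvector `evecs[:, 0]`. Restricting the set of trial states, as Hartree–Fock theory does, can therefore only raise the minimum.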

2.1 Hartree–Fock theory


General Hartree–Fock theory
The function space $\mathcal{A}_N$ is very large and complex. The Hartree–Fock theory is the
simplest approximation to $\mathcal{A}_N$ by assuming the many-body wavefunction $\Psi$ to be a
single Slater determinant

\[ \Psi(x_1, \ldots, x_N) = \frac{1}{\sqrt{N!}} \begin{vmatrix} \psi_1(x_1) & \psi_1(x_2) & \cdots & \psi_1(x_N) \\ \psi_2(x_1) & \psi_2(x_2) & \cdots & \psi_2(x_N) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_N(x_1) & \psi_N(x_2) & \cdots & \psi_N(x_N) \end{vmatrix} \tag{2.1.1} \]

such that the set of $N$ spin orbital functions $\{\psi_i\}_{i=1}^{N}$, $\psi_i \in L^2(\mathbb{R}^3; \mathbb{C}^2)$ are orthonormal
to each other, i.e., $\langle \psi_i | \psi_j \rangle = \delta_{ij}$. Let $\mathcal{A}_N^0$ denote the set of all Slater determinants for
$N$ electrons; then the Hartree–Fock theory replaces the variational problem (2.0.3) by

\[ E^{\mathrm{HF}} = \inf_{\Psi \in \mathcal{A}_N^0, \ \langle \Psi | \Psi \rangle = 1} \langle \Psi | H | \Psi \rangle. \tag{2.1.2} \]

Since the Hartree–Fock theory restricts the variational problem to a smaller class of
functions, it is immediate that

\[ E^{\mathrm{HF}} \ge E. \tag{2.1.3} \]

The error of the Hartree–Fock approximation is called the correlation energy:

\[ E_c = E - E^{\mathrm{HF}}, \tag{2.1.4} \]

which is always nonpositive as a result.


The energy of a Slater determinant, i.e., the right-hand side of (2.1.2), can be calculated
much more explicitly. Let us write $V_{\mathrm{ext}}(r) \equiv V_{\mathrm{ext}}(r; \{R_I\})$ for short. Due to the
anti-symmetry property of $\Psi$, we have

\[ \Bigl\langle \Psi \Bigm| \sum_{i=1}^{N} \Bigl( -\tfrac{1}{2} \Delta_{r_i} + V_{\mathrm{ext}}(r_i) \Bigr) \Bigm| \Psi \Bigr\rangle = N \Bigl\langle \Psi \Bigm| -\tfrac{1}{2} \Delta_{r_1} + V_{\mathrm{ext}}(r_1) \Bigm| \Psi \Bigr\rangle, \tag{2.1.5} \]

\[ \Bigl\langle \Psi \Bigm| \sum_{i<j} \frac{1}{|r_i - r_j|} \Bigm| \Psi \Bigr\rangle = \binom{N}{2} \Bigl\langle \Psi \Bigm| \frac{1}{|r_1 - r_2|} \Bigm| \Psi \Bigr\rangle. \tag{2.1.6} \]

Let us calculate first the one-body term for $\Psi$ given as a Slater determinant. Expanding
the determinant using the permutation operation $\pi$, we have

\[ \Psi(x_1, \ldots, x_N) = \frac{1}{\sqrt{N!}} \sum_{\pi \in \mathrm{Sym}(N)} (-1)^{\pi} \psi_{\pi(1)}(x_1) \cdots \psi_{\pi(N)}(x_N). \tag{2.1.7} \]

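Then

\[
\begin{aligned}
&\Bigl\langle \Psi \Bigm| -\tfrac{1}{2} \Delta_{r_1} + V_{\mathrm{ext}}(r_1) \Bigm| \Psi \Bigr\rangle \\
&\quad = \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi} (-1)^{\pi'} \Bigl\langle \psi_{\pi(1)}(x_1) \Bigm| -\tfrac{1}{2} \Delta_{r_1} + V_{\mathrm{ext}}(r_1) \Bigm| \psi_{\pi'(1)}(x_1) \Bigr\rangle_{x_1} \prod_{k=2}^{N} \bigl\langle \psi_{\pi(k)}(x_k) \bigm| \psi_{\pi'(k)}(x_k) \bigr\rangle_{x_k} \\
&\quad = \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi} (-1)^{\pi'} \Bigl\langle \psi_{\pi(1)}(x_1) \Bigm| -\tfrac{1}{2} \Delta_{r_1} + V_{\mathrm{ext}}(r_1) \Bigm| \psi_{\pi'(1)}(x_1) \Bigr\rangle_{x_1} \prod_{k=2}^{N} \delta_{\pi(k) \pi'(k)},
\end{aligned}
\]

where the last equality uses the orthonormality of the orbitals and we add the subscript
$x_1$, $x_k$ to the inner product to emphasize the variable to be integrated. The contribution
is zero unless $\pi(k) = \pi'(k)$ for $k = 2, \ldots, N$; since both are permutations, this forces
$\pi(1) = \pi'(1)$ as well, and thus the two permutations coincide.
Therefore,

\[
\begin{aligned}
\Bigl\langle \Psi \Bigm| -\tfrac{1}{2} \Delta_{r_1} + V_{\mathrm{ext}}(r_1) \Bigm| \Psi \Bigr\rangle
&= \frac{1}{N!} \sum_{\pi} \Bigl\langle \psi_{\pi(1)} \Bigm| -\tfrac{1}{2} \Delta_{r} + V_{\mathrm{ext}}(r) \Bigm| \psi_{\pi(1)} \Bigr\rangle \\
&= \frac{1}{N!} \sum_{i=1}^{N} \sum_{\pi} \Bigl\langle \psi_i \Bigm| -\tfrac{1}{2} \Delta_{r} + V_{\mathrm{ext}}(r) \Bigm| \psi_i \Bigr\rangle \delta_{i \pi(1)} \\
&= \frac{1}{N} \sum_{i=1}^{N} \Bigl\langle \psi_i \Bigm| -\tfrac{1}{2} \Delta + V_{\mathrm{ext}} \Bigm| \psi_i \Bigr\rangle,
\end{aligned}
\]

where the second equality introduces the dummy summation index $i$ for $\pi(1)$, and the third
uses the fact that exactly $(N-1)!$ permutations satisfy $\pi(1) = i$. Combining with (2.1.5), we have

\[
\begin{aligned}
\Bigl\langle \Psi \Bigm| \sum_{i=1}^{N} \Bigl( -\tfrac{1}{2} \Delta_{r_i} + V_{\mathrm{ext}}(r_i) \Bigr) \Bigm| \Psi \Bigr\rangle
&= \sum_{i=1}^{N} \Bigl\langle \psi_i \Bigm| -\tfrac{1}{2} \Delta + V_{\mathrm{ext}} \Bigm| \psi_i \Bigr\rangle \\
&= \sum_{i=1}^{N} \int \tfrac{1}{2} |\nabla_r \psi_i(x)|^2 + V_{\mathrm{ext}}(r) |\psi_i(x)|^2 \, dx.
\end{aligned}
\tag{2.1.8}
\]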

For the two-body term, we have

\[
\begin{aligned}
&\Bigl\langle \Psi \Bigm| \frac{1}{|r_1 - r_2|} \Bigm| \Psi \Bigr\rangle \\
&\quad = \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi} (-1)^{\pi'} \Bigl\langle \psi_{\pi(1)}(x_1) \psi_{\pi(2)}(x_2) \Bigm| \frac{1}{|r_1 - r_2|} \Bigm| \psi_{\pi'(1)}(x_1) \psi_{\pi'(2)}(x_2) \Bigr\rangle_{x_1, x_2} \prod_{k=3}^{N} \bigl\langle \psi_{\pi(k)}(x_k) \bigm| \psi_{\pi'(k)}(x_k) \bigr\rangle_{x_k} \\
&\quad = \frac{1}{N!} \sum_{\pi, \pi'} (-1)^{\pi} (-1)^{\pi'} \Bigl\langle \psi_{\pi(1)}(x) \psi_{\pi(2)}(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_{\pi'(1)}(x) \psi_{\pi'(2)}(x') \Bigr\rangle_{x, x'} \prod_{k=3}^{N} \delta_{\pi(k) \pi'(k)}.
\end{aligned}
\tag{2.1.9}
\]

Therefore, the two permutations are the same except potentially for the first two indices,
and we have two possibilities: either

\[ \pi(1) = \pi'(1) = i, \qquad \pi(2) = \pi'(2) = j \]

or

\[ \pi(1) = \pi'(2) = i, \qquad \pi(2) = \pi'(1) = j, \]

where $i, j \in \{1, \ldots, N\}$ are two different indices. In the former case, the two permutations
are the same, and hence $(-1)^{\pi} (-1)^{\pi'} = 1$, whereas in the latter case we have
$(-1)^{\pi} (-1)^{\pi'} = -1$. For each ordered pair $(i, j)$ there are $(N-2)!$ such permutations $\pi$. Thus, we get

\[
\Bigl\langle \Psi \Bigm| \frac{1}{|r_1 - r_2|} \Bigm| \Psi \Bigr\rangle
= \frac{1}{N(N-1)} \sum_{i \ne j}^{N} \biggl( \Bigl\langle \psi_i(x) \psi_j(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_i(x) \psi_j(x') \Bigr\rangle
- \Bigl\langle \psi_i(x) \psi_j(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_j(x) \psi_i(x') \Bigr\rangle \biggr).
\tag{2.1.10}
\]
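Therefore,

\[
\begin{aligned}
\Bigl\langle \Psi \Bigm| \sum_{i<j} \frac{1}{|r_i - r_j|} \Bigm| \Psi \Bigr\rangle
&= \frac{1}{2} \sum_{i \ne j}^{N} \biggl( \Bigl\langle \psi_i(x) \psi_j(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_i(x) \psi_j(x') \Bigr\rangle - \Bigl\langle \psi_i(x) \psi_j(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_j(x) \psi_i(x') \Bigr\rangle \biggr) \\
&= \frac{1}{2} \sum_{i, j}^{N} \biggl( \Bigl\langle \psi_i(x) \psi_j(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_i(x) \psi_j(x') \Bigr\rangle - \Bigl\langle \psi_i(x) \psi_j(x') \Bigm| \frac{1}{|r - r'|} \Bigm| \psi_j(x) \psi_i(x') \Bigr\rangle \biggr).
\end{aligned}
\tag{2.1.11}
\]

For the second identity, we use the observation that the two terms in the parentheses are
the same if $i = j$. Writing the integral more explicitly, we get

\[
\Bigl\langle \Psi \Bigm| \sum_{i<j} \frac{1}{|r_i - r_j|} \Bigm| \Psi \Bigr\rangle
= \frac{1}{2} \sum_{i, j} \biggl( \iint \frac{|\psi_i(x)|^2 |\psi_j(x')|^2}{|r - r'|} \, dx \, dx' - \iint \frac{\psi_i^*(x) \psi_j^*(x') \psi_j(x) \psi_i(x')}{|r - r'|} \, dx \, dx' \biggr).
\tag{2.1.12}
\]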

In summary, the Hartree–Fock energy functional for a given set of orthonormal spin
orbitals $\{\psi_i\}_{i=1}^{N}$ is given by

\[
\begin{aligned}
\mathcal{E}^{\mathrm{HF}}(\{\psi_i\}_{i=1}^{N}) &= \sum_{i=1}^{N} \int \tfrac{1}{2} |\nabla_r \psi_i|^2 + V_{\mathrm{ext}} |\psi_i|^2 \, dx + \frac{1}{2} \sum_{i, j} \iint \frac{|\psi_i(x)|^2 |\psi_j(x')|^2}{|r - r'|} \, dx \, dx' \\
&\quad - \frac{1}{2} \sum_{i, j} \iint \frac{\psi_i^*(x) \psi_j^*(x') \psi_j(x) \psi_i(x')}{|r - r'|} \, dx \, dx' + E_{II}.
\end{aligned}
\tag{2.1.13}
\]
To simplify the expression, we define the spin-dependent single particle electron density
for the ground state $\Psi$ as

\[ \varrho(x) = N \int |\Psi(x, x_2, \ldots, x_N)|^2 \, dx_2 \cdots dx_N \tag{2.1.14} \]

and the (total) single particle electron density as

\[ \rho(r) = \sum_{\sigma \in \{\uparrow, \downarrow\}} \varrho(r, \sigma). \tag{2.1.15} \]

When the context is clear, the (total) single particle electron density is also referred to
as the electron density or simply the density. The electron density is only a quantity in
$L^1(\mathbb{R}^3)$, and hence carries significantly less information than $\Psi$. For $\Psi \in \mathcal{A}_N^0$, we find
that the spin-dependent electron density is

\[ \varrho(x) = \sum_{i=1}^{N} |\psi_i(x)|^2. \tag{2.1.16} \]

We also introduce the spin-dependent (single particle) density matrix as

\[ P(x, x') = N \int \Psi(x, x_2, \ldots, x_N) \Psi^*(x', x_2, \ldots, x_N) \, dx_2 \cdots dx_N. \tag{2.1.17} \]

For $\Psi \in \mathcal{A}_N^0$, the spin-dependent density matrix can be computed as

\[ P(x, x') = \sum_{i=1}^{N} \psi_i(x) \psi_i^*(x'), \tag{2.1.18} \]

which is a projection operator since the $\{\psi_i\}$'s are orthonormal. In particular, the
spin-dependent electron density is simply the diagonal of the spin-dependent density
matrix, i.e., $\varrho(x) = P(x, x)$. We will discuss the electron density and the density
matrix further in a later part of this chapter, as they are essential for electronic structure
theory and calculations.
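The stated properties of the density matrix (2.1.18) can be checked on a discrete stand-in for the variable $x$. In the sketch below, the grid size `M`, the electron count `N`, and the random complex orbitals are all assumptions made for illustration; integrals over $x$ become plain sums.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 7, 3                                   # discrete x values, electrons (assumed)
Z = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
C, _ = np.linalg.qr(Z)                        # columns: orthonormal orbitals psi_i(x)

P = C @ C.conj().T                            # P(x, x') = sum_i psi_i(x) psi_i^*(x')
rho = np.sum(np.abs(C) ** 2, axis=1)          # density: sum_i |psi_i(x)|^2

is_projection = np.allclose(P @ P, P)         # P^2 = P follows from orthonormality
n_electrons = np.trace(P).real                # integral of the density
```

The trace of `P` counts the electrons, and its diagonal reproduces `rho`, mirroring $\varrho(x) = P(x, x)$.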
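Using (2.1.16) and (2.1.18), the Hartree–Fock energy functional can be written as

\[
\begin{aligned}
\mathcal{E}^{\mathrm{HF}}(\{\psi_i\}_{i=1}^{N}) &= \sum_{i=1}^{N} \int \tfrac{1}{2} |\nabla_r \psi_i|^2 + V_{\mathrm{ext}} |\psi_i|^2 \, dx + \frac{1}{2} \iint \frac{\rho(r) \rho(r')}{|r - r'|} \, dr \, dr' \\
&\quad - \frac{1}{2} \iint \frac{|P(x, x')|^2}{|r - r'|} \, dx \, dx' + E_{II}.
\end{aligned}
\tag{2.1.19}
\]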
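The Hartree and exchange terms in (2.1.19) can be compared against a direct evaluation of the two-body expectation value for $N = 2$. The sketch below is a discrete toy model, not the continuum problem: the variable $x$ takes `M` values, the orbitals are real, and the softened interaction `v` is a hypothetical stand-in for $1/|r - r'|$ that stays finite on the diagonal.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 6                                          # discrete values of x (assumed)
C, _ = np.linalg.qr(rng.standard_normal((M, 2)))
psi1, psi2 = C[:, 0], C[:, 1]                  # two orthonormal real orbitals

idx = np.arange(M)
v = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))   # softened model interaction

# Two-particle Slater determinant Psi(x1, x2) stored as an M x M array.
Psi = (np.outer(psi1, psi2) - np.outer(psi2, psi1)) / np.sqrt(2.0)
e_direct = np.sum(Psi ** 2 * v)                # <Psi| v(x1, x2) |Psi>

rho = psi1 ** 2 + psi2 ** 2                    # density, cf. (2.1.16)
P = np.outer(psi1, psi1) + np.outer(psi2, psi2)    # density matrix, cf. (2.1.18)
e_hartree = 0.5 * rho @ v @ rho                # (1/2) sum rho(x) rho(x') v(x, x')
e_exchange = -0.5 * np.sum(P ** 2 * v)         # -(1/2) sum |P(x, x')|^2 v(x, x')
```

In this model `e_direct` agrees with `e_hartree + e_exchange`. The exchange term is negative and removes the self-interaction that the Hartree term contains through its $i = j$ contributions.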

The Hartree–Fock variational problem is then

\[ E^{\mathrm{HF}} = \inf_{\{\psi_i\}_{i=1}^{N}, \ \langle \psi_i | \psi_j \rangle = \delta_{ij}} \mathcal{E}^{\mathrm{HF}}(\{\psi_i\}_{i=1}^{N}). \tag{2.1.20} \]

The formulation with the functional (2.1.19) is called the general Hartree–Fock theory
(GHF) and the associated energy $E^{\mathrm{HF}}$ is commonly written as $E^{\mathrm{GHF}}$.

Unrestricted and restricted Hartree–Fock theory


In GHF, we have assumed that each spin orbital $\psi_i(r, \sigma)$ is a general function in
$L^2(\mathbb{R}^3; \mathbb{C}^2)$. We may further treat the spatial and spin degrees of freedom separately,
which reduces the computational cost for solving the Hartree–Fock problem.
More specifically, in the unrestricted Hartree–Fock theory (UHF), we assume that
the orbitals take the separable form

\[ \psi_i(x) = \phi_i(r) \chi_i(\sigma), \tag{2.1.21} \]

where $\chi_i$ is preassigned to be either the spin-up state $|{\uparrow}\rangle$ or the spin-down state $|{\downarrow}\rangle$.
For systems with an even number of electrons ($N = 2 N_{\mathrm{occ}}$), the restricted Hartree–Fock
theory (RHF) further assumes that each spatial component contributes to two spin
orbitals associated with the spin-up state and the spin-down state as

\[ \psi_i(x) = \phi_i(r) \langle \sigma | {\uparrow} \rangle, \qquad \psi_{i + N_{\mathrm{occ}}}(x) = \phi_i(r) \langle \sigma | {\downarrow} \rangle. \tag{2.1.22} \]

In terms of the admissible function space in the variational problem, we have

\[ \mathrm{RHF} \subseteq \mathrm{UHF} \subseteq \mathrm{GHF} \]

and hence in general

\[ E^{\mathrm{GHF}} \le E^{\mathrm{UHF}} \le E^{\mathrm{RHF}}. \]

However, for many systems it has been found that $E^{\mathrm{GHF}} = E^{\mathrm{UHF}}$ and even $E^{\mathrm{GHF}} =
E^{\mathrm{RHF}}$. Hence RHF and UHF are more widely used than GHF in quantum chemistry
software packages.
From the energy formulation of GHF in (2.1.2), we find that the energy functional
in the UHF ansatz is

\[
\begin{aligned}
\mathcal{E}^{\mathrm{UHF}}(\{\phi_i\}_{i=1}^{N}) &= \frac{1}{2} \sum_{i=1}^{N} \int |\nabla \phi_i(r)|^2 \, dr + \int V_{\mathrm{ext}}(r) \rho(r) \, dr + \frac{1}{2} \iint \frac{\rho(r) \rho(r')}{|r - r'|} \, dr \, dr' \\
&\quad - \frac{1}{2} \sum_{\sigma} \iint \frac{|P((r, \sigma), (r', \sigma))|^2}{|r - r'|} \, dr \, dr'.
\end{aligned}
\tag{2.1.23}
\]

In the UHF case, the formulation for the electron density can be simplified as

\[ \rho(r) = \sum_{i=1}^{N} |\phi_i(r)|^2, \tag{2.1.24} \]

and we have used the fact that

\[ \sum_{\sigma, \sigma'} |P((r, \sigma), (r', \sigma'))|^2 = \sum_{\sigma} |P((r, \sigma), (r', \sigma))|^2 \tag{2.1.25} \]

thanks to the orthogonality of $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$. In particular, the exchange term includes only
orbitals with the same spin.

Similarly, the energy functional for the RHF ansatz is


Nocc Z
ρ(r)ρ(r 0 )
Z ZZ
X 1
E RHF
({ϕi }N
i=1 )
occ
= 2
|∇ϕi (r)| dr + Vext (r)ρ(r) dr + dr dr 0
i=1
2 |r − r 0 |
2
|P (r, r 0 )|
ZZ
− dr dr 0 .
|r − r 0 |
(2.1.26)
Correspondingly, the electron density becomes
\[
\rho(r) = 2\sum_{i=1}^{N_{\mathrm{occ}}} |\varphi_i(r)|^2. \tag{2.1.27}
\]

In comparison with (2.1.23), the density matrix $P((r,\sigma),(r',\sigma))$ no longer depends
explicitly on σ and hence can be simplified as
\[
P(r,r') = \sum_{i=1}^{N_{\mathrm{occ}}} \varphi_i(r)\varphi_i^*(r'). \tag{2.1.28}
\]

Note that the number of spatial orbitals is reduced from N to N/2 = Nocc due to the
symmetry restriction (Nocc stands for the number of occupied orbitals). Correspond-
ingly, a factor of 2 from spin needs to be properly treated in the electron density and the
exchange energy term.
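
As a small numerical illustration of (2.1.27) and (2.1.28), the following sketch builds the RHF density and density matrix from a set of orthonormal spatial orbitals. The 1D grid, the random orbitals, and all variable names are assumptions made for illustration only; they are not part of the theory above.

```python
import numpy as np

# Hypothetical toy setup: Nocc orthonormal spatial orbitals sampled on a uniform 1D grid.
n_grid, n_occ = 200, 3
x = np.linspace(0.0, 1.0, n_grid, endpoint=False)
h = x[1] - x[0]

# Orthonormalize random vectors so that sum_r |phi_i(r)|^2 * h = 1 (discrete L2 norm).
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((n_grid, n_occ)))
phi = q / np.sqrt(h)                      # columns are phi_i(r)

# (2.1.28): P(r, r') = sum_i phi_i(r) phi_i^*(r')
P = phi @ phi.conj().T
# (2.1.27): rho(r) = 2 sum_i |phi_i(r)|^2, the factor 2 coming from spin
rho = 2.0 * np.sum(np.abs(phi) ** 2, axis=1)

n_electrons = rho.sum() * h               # discrete integral of rho
```

On this grid, ρ(r) coincides with 2P(r, r), and the discrete integral of ρ equals N = 2Nocc.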

2.2 Kohn–Sham density functional theory


In order to solve the quantum many-body problem exactly, it appears at first sight that
one has to know the quantum many-body wavefunction. Surprisingly, this is (at least
formally) not the case if we are only interested in the ground state. First established by
Hohenberg and Kohn [37], density functional theory (DFT) shows that the electron
density is all one needs to determine the quantum many-body ground state. Mermin [64]
then extended the DFT formulation to the finite temperature setting to include thermal
effects as well. The most widely used form of DFT was proposed by Kohn and
Sham [46] and is called Kohn–Sham DFT. Below we introduce DFT from the per-
spective of constrained minimization, which was first proposed by Levy [49] and then
rigorously established by Lieb [51]. We also remark that to date there is no analogous
rigorous foundation of DFT for general excited states.
According to the variational principle (2.0.3), the ground state energy can be written
as a constrained minimization problem:
\[
E = \inf_{\Psi\in\mathcal{A}_N,\ \langle\Psi|\Psi\rangle=1} \langle\Psi|H|\Psi\rangle
  = \inf_{\rho\in\mathcal{J}_N} \left\{ \inf_{\Psi\in\mathcal{A}_N,\ \Psi\mapsto\rho} \langle\Psi|H|\Psi\rangle \right\}. \tag{2.2.1}
\]

Here, in the second equality, we split one single minimization process with respect to
Ψ into two nested minimization processes. The outer minimization is with respect to
the density ρ, and the inner one is with respect to all Ψ that give rise to the given
density ρ, which is denoted by Ψ ↦ ρ. One natural question is whether for a given
density ρ there is at least one $\Psi\in\mathcal{A}_N$ that gives rise to this density such that the
kinetic energy of Ψ is finite, i.e.,
\[
\langle\Psi|T|\Psi\rangle = \frac{1}{2}\sum_{i=1}^{N}\int |\nabla_{r_i}\Psi(x_1,\ldots,x_N)|^2\,dx_1\cdots dx_N < \infty.
\]

It turns out that it is sufficient to consider the density ρ in the following set:
\[
\mathcal{J}_N = \left\{ \rho \ge 0 \;\middle|\; \int \rho(r)\,dr = N,\ \nabla\sqrt{\rho}\in L^2(\mathbb{R}^3) \right\}. \tag{2.2.2}
\]

The main nontrivial condition in the definition of $\mathcal{J}_N$ is $\nabla\sqrt{\rho}\in L^2(\mathbb{R}^3)$. This can
be understood from the case of N = 1, where the ground state wavefunction Ψ(x)
can be expressed without loss of generality as $\Psi(x) = \psi(r)\langle\sigma|{\uparrow}\rangle$ and ψ is a nodeless
function, i.e., ψ(r) ≥ 0 [52]. Then $\rho(r) = \psi(r)^2$ and
\[
\nabla\sqrt{\rho(r)} = \nabla\psi(r).
\]
Therefore the condition $\nabla\sqrt{\rho}\in L^2(\mathbb{R}^3)$ simply means that the kinetic energy is finite.
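In general, note that
\[
\begin{aligned}
|\nabla\rho(r)|^2 &\le 4N^2 \left| \sum_{\sigma}\int \nabla_r\Psi^*((r,\sigma),x_2,\ldots,x_N)\,\Psi((r,\sigma),x_2,\ldots,x_N)\,dx_2\cdots dx_N \right|^2 \\
&\le 4N\rho(r) \left( \sum_{\sigma}\int |\nabla_r\Psi((r,\sigma),x_2,\ldots,x_N)|^2\,dx_2\cdots dx_N \right),
\end{aligned}
\tag{2.2.3}
\]
where we have used the Cauchy–Schwarz inequality. Together with $\nabla\sqrt{\rho} = \nabla\rho/(2\sqrt{\rho})$,
we have
\[
\frac{1}{2}\int |\nabla\sqrt{\rho(r)}|^2\,dr \le \frac{N}{2}\sum_{\sigma}\int |\nabla_r\Psi((r,\sigma),x_2,\ldots,x_N)|^2\,dr\,dx_2\cdots dx_N = \langle\Psi|T|\Psi\rangle. \tag{2.2.4}
\]

As a result, $\nabla\sqrt{\rho}\in L^2(\mathbb{R}^3)$ is a necessary condition for the kinetic energy of Ψ to be
finite. It can also be proved that $\rho\in\mathcal{J}_N$ implies that there exists at least one $\Psi\in\mathcal{A}_N$
with finite kinetic energy such that Ψ ↦ ρ [51].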
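Therefore the constrained minimization procedure is well defined and we have
\[
\begin{aligned}
E &= \inf_{\rho\in\mathcal{J}_N} \left\{ \inf_{\Psi\in\mathcal{A}_N,\ \Psi\mapsto\rho} \langle\Psi|(T+V_{ee})|\Psi\rangle + \int \rho(r)V_{\mathrm{ext}}(r)\,dr \right\} + E_{II} && (2.2.5) \\
  &= \inf_{\rho\in\mathcal{J}_N} \left\{ F_{\mathrm{LL}}[\rho] + \int \rho(r)V_{\mathrm{ext}}(r)\,dr \right\} + E_{II}. && (2.2.6)
\end{aligned}
\]

Here we have used
\[
\langle\Psi|V_{\mathrm{ext}}|\Psi\rangle = N\int |\Psi(x_1,\ldots,x_N)|^2\,V_{\mathrm{ext}}(r_1)\,dx_1\cdots dx_N = \int \rho(r)V_{\mathrm{ext}}(r)\,dr. \tag{2.2.7}
\]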
The functional FLL [ρ], called the Levy–Lieb energy functional, depends only on the
kinetic energy and the electron-electron repulsion and not on the external potential Vext . Recall that
the external potential Vext is given by the atomic type and position, and hence encodes
all the external inputs of the system. In this sense, the functional FLL [ρ] is universal.
Another important consequence of DFT is that if ρ∗ is the minimizer of (2.2.6) and Ψ∗ is
the minimizer that results in FLL [ρ∗ ] and is assumed to be unique, then the many-body
ground state wavefunction Ψ∗ is also determined by the electron density ρ∗ . In this
sense, if the ground state is nondegenerate, then there is a one-to-one mapping between
the ground state electron density and the many-body ground state wavefunction.
Starting from the very early days of quantum mechanics, physicists have been seek-
ing the approximation to the energy only in terms of the electron density, pioneered by
Thomas and Fermi. Until the 1960s, this effort had been mainly restricted to the uniform
electron gas, where many calculations can be done analytically. Despite significant
progress in the past few decades, modeling FLL [ρ] remains a very difficult task. To
appreciate the difficulty, just recall the atomic shell structure from the eigenfunctions
of the Hamiltonian of the hydrogen atom. It is already highly nontrivial to find such
mapping for a single atom. Furthermore, in chemistry and materials science, the absolute
value of the energy is usually not the most important quantity. What determines
whether a chemical process will happen or not is its relative energy landscape. This
often requires the ground state energy to be calculated with an accuracy of 99.9% or
higher.
The breakthrough of DFT is generally attributed to Kohn and Sham in 1965, who
proposed combining DFT with the orbital structure. Using the constrained minimization
over Slater determinants, the Kohn–Sham proposal can be interpreted as

\[
F_{\mathrm{LL}}[\rho] = \inf_{\Psi\in\mathcal{A}_N^0,\ \Psi\mapsto\rho} \langle\Psi|T|\Psi\rangle + \frac{1}{2}\iint \frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' + E_{xc}[\rho]. \tag{2.2.8}
\]

It turns out that, for any $\rho\in\mathcal{J}_N$, there exists at least one $\Psi\in\mathcal{A}_N^0$ that gives the
density ρ and the constrained minimization of the kinetic energy term is well defined.
Similarly to the calculation in the Hartree–Fock theory, the first term in (2.2.8) con-
tributes to the kinetic energy from N single particle orbitals contributing to the Slater
determinant. The second term is the Hartree energy, which characterizes the electron-
electron repulsion energy at the mean-field level. The last term, Exc [ρ], is called the
exchange-correlation energy functional, which at first glance simply defines whatever
we do not know about FLL [ρ]. This is partially true. However, the insight from Kohn
and Sham is that the kinetic and Hartree terms often encode more than 95% of the total
energy. Therefore the approximation to Exc [ρ], while still very difficult, is much
easier than approximating FLL [ρ] directly. As a result, the Kohn–Sham variational
problem is

\[
E^{\mathrm{KS}} = \inf_{\Psi\in\mathcal{A}_N^0} \left\{ \langle\Psi|T|\Psi\rangle + \frac{1}{2}\iint \frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' + E_{xc}[\rho] + \int \rho(r)V_{\mathrm{ext}}(r)\,dr \right\} + E_{II} \tag{2.2.9}
\]
with ρ given by the Slater determinant Ψ. Writing the kinetic energy of the Slater
determinant more explicitly, we have

\[
E^{\mathrm{KS}} = \inf_{\{\psi_i\}_{i=1}^{N},\ \langle\psi_i|\psi_j\rangle=\delta_{ij}} E^{\mathrm{KS}}(\{\psi_i\}) + E_{II}, \tag{2.2.10}
\]

with the Kohn–Sham energy functional given by

\[
E^{\mathrm{KS}}(\{\psi_i\}_{i=1}^{N}) = \frac{1}{2}\sum_{i=1}^{N}\int |\nabla_r\psi_i(x)|^2\,dx + \int \rho(r)V_{\mathrm{ext}}(r)\,dr + \frac{1}{2}\iint \frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' + E_{xc}[\rho]. \tag{2.2.11}
\]
Although the form of the Kohn–Sham energy formally resembles that of the Hartree–
Fock energy (2.1.2), we should keep in mind that the Hartree–Fock theory is only the
(conceptually) simplest approximation to the many-body ground state wavefunction.
On the other hand, Kohn–Sham DFT is in principle an exact theory if we have access
to the exact exchange-correlation functional Exc [ρ]. The exchange-correlation func-
tional is also universal, i.e., it is independent of the external potential Vext and hence
the atomic configuration. Compared to the Hartree–Fock theory in (2.1.26), we find
that Kohn–Sham DFT does not involve the single particle density matrix P(x, x′) and
absorbs the exchange interaction into the exchange-correlation energy Exc that depends
only on the electron density ρ.
In order to use Kohn–Sham DFT in practice, the exchange-correlation functional
Exc must be approximated. Since the local density approximation was proposed by
Kohn and Sham, a “zoo” of exchange-correlation functionals has been proposed (see an
incomplete list in Figure 2.1). According to Perdew and Schmidt [72], these exchange-
correlation functionals can be organized according to the “Jacob’s ladder” of exchange-
correlation functionals (Figure 2.2). When no exchange-correlation functional is used,
Kohn–Sham DFT is essentially a Hartree approximation (with the Pauli exclusion prin-
ciple) and can thus be significantly less accurate than the Hartree–Fock theory. This is
referred to as “Hartree’s hell.” As one moves up the ladder, the accuracy of the DFT
calculation generally improves towards the “heaven of chemical accuracy” of 1 kcal/mol
(about 1.6 mHartree per atom) when compared to experimental results. Correspondingly, the
form of the functional becomes increasingly complex and the computational cost
also increases as a result.

Figure 2.1. A “zoo” of exchange-correlation functionals in DFT [12]. Reprinted with
permission from the American Institute of Physics, Kieron Burke, and Peter Elliott.

At the first level of the ladder, we have the local density approximation (LDA),
where Exc is modeled locally by the electron density
\[
E_{xc}[\rho] = \int \tilde{\epsilon}_{xc}(\rho(r))\,\rho(r)\,dr. \tag{2.2.12}
\]

Figure 2.2. The “Jacob’s ladder” of exchange-correlation functionals.

Here $\tilde{\epsilon}_{xc}(\rho(r))$ is the intensity of the exchange-correlation energy, which is to be
multiplied with the electron density ρ(r) to obtain the exchange-correlation energy density
at r. For convenience in later discussion, we introduce

\[
\epsilon_{xc}(\rho) = \tilde{\epsilon}_{xc}(\rho)\,\rho
\]

and thus the LDA exchange-correlation functional becomes

\[
E_{xc}[\rho] = \int \epsilon_{xc}(\rho(r))\,dr. \tag{2.2.13}
\]

The most widely used LDA exchange-correlation functional is obtained by parameterizing
the result from the quantum Monte Carlo simulation obtained for the uniform
electron gas system in the 1980s [17, 73]. Although most, if not all, real chemical and
materials systems are very different from the uniform electron gas system, the Kohn–
Sham DFT calculations with such an exchange-correlation functional have already been
shown to perform surprisingly well for many systems.
At the next level of the ladder, the generalized gradient approximation (GGA) [48, 6, 71]
improves the accuracy of the exchange-correlation functional by letting it depend in
addition on the gradient of the electron density,
\[
E_{xc}[\rho] = \int \epsilon_{xc}(\rho(r), \sigma(r))\,dr, \tag{2.2.14}
\]

where the functional depends only on the norm of the gradient

\[
\sigma(r) = |\nabla_r\rho(r)|^2
\]

due to local rotational symmetry. The GGA functionals are currently the most widely
used functionals.
The third level of the ladder is the meta-GGA approximation [80, 81], where the
second-order derivative information is added, in particular the Hessian ∇²ρ(r) and the
kinetic energy density

\[
\tau(r) = \frac{1}{2}\sum_{i=1}^{N}\sum_{\sigma} |\nabla_r\psi_i(r,\sigma)|^2.
\]

Then the meta-GGA energy functional is

\[
E_{xc}[\rho] = \int \epsilon_{xc}(\rho(r), \sigma(r), \nabla^2\rho(r), \tau(r))\,dr. \tag{2.2.15}
\]

Since τ is the local kinetic energy of the orbitals, it may seem that the meta-GGA func-
tional is no longer a density functional. It turns out that the Euler–Lagrange equation
associated with the meta-GGA energy functional is still of the same form as that of the
LDA and GGA energy functionals, as we will see below in section 2.3.
Further up the ladder, the energy functionals no longer depend only on the den-
sity, but also intrinsically on the orbitals. Hence, rigorously speaking, these exchange-
correlation functionals are no longer “pure” density functionals, and hence we use a
dashed line to separate the functionals on the fourth rung of the ladder and above as “or-
bital dependent functionals.” For instance, the fourth rung functional is typically called
the “hybrid functional” of the following form [7, 35]:
Exc [{ψi }] = (1 − α)Ex [ρ] + αEXX [{ψi }] + Ec [ρ]. (2.2.16)
Here Ex and Ec are the exchange and correlation parts from lower rung XC functionals
such as the GGA functionals. EXX [{ψi }] is the Hartree–Fock exchange energy of the
orbitals {ψi }, often referred to as the “exact exchange” contribution.
On the fifth rung of the ladder, we have functionals that depend not only on the
density matrix, but also on other quantities such as the linear response operators (to
be discussed in Chapter 3). Examples of such functionals include the double hybrid
functional and the random phase approximation (RPA) functional. These functionals
are closely related to Green’s function theory in many-body physics. The derivation
of the RPA correlation energy will be given at the end of this book. There are also
functionals that depend fully nonlocally on the electron density, such as those that take
into account the van der Waals interaction. These topics are beyond the scope of this
book.
Readers at this point might notice that none of the exchange functionals involve
explicitly the spin degrees of freedom. This is indeed correct. According to the con-
strained minimization procedure, in the absence of an external magnetic field the total
energy should depend solely on the electron density rather than each of its spin
components. Then, when the number of electrons is even, it is often justified to further use
the space $\mathcal{A}_N^0$ as the space spanned by orbitals of the form (2.1.22), as in the restricted
Hartree–Fock ansatz. The energy functional (2.2.11) thus becomes
\[
E^{\mathrm{KS}}(\{\varphi_i\}_{i=1}^{N_{\mathrm{occ}}}) = \sum_{i=1}^{N_{\mathrm{occ}}}\int |\nabla_r\varphi_i(r)|^2\,dr + \int \rho(r)V_{\mathrm{ext}}(r)\,dr + \frac{1}{2}\iint \frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' + E_{xc}[\rho]. \tag{2.2.17}
\]
For spin-polarized systems and open-shell systems, it is more advantageous to dis-
tinguish the spin channels explicitly in the formulation, similarly to the treatment of the
unrestricted Hartree–Fock ansatz. In the simplest setting, this leads to the local spin
density approximation [73]. We will not discuss the details of spin-dependent function-
als here.

2.3 Nonlinear eigenvalue problem


To solve the variational problems of Hartree–Fock theory and Kohn–Sham DFT, we
consider the Euler–Lagrange equation associated with them. These are often referred to
as Hartree–Fock equations and Kohn–Sham equations.

Hartree–Fock equation
For simplicity of notation, we restrict our discussion here to the restricted Hartree–Fock
theory (2.1.26), while leaving the generalizations to UHF and GHF as exercises for the
reader. Recall that the energy functional can be written as

\[
\begin{aligned}
E^{\mathrm{RHF}}(\{\varphi_i\}_{i=1}^{N_{\mathrm{occ}}}) ={}& \sum_{i=1}^{N_{\mathrm{occ}}}\int |\nabla\varphi_i(r)|^2\,dr + \int V_{\mathrm{ext}}(r)\rho(r)\,dr + \frac{1}{2}\iint \frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' \\
&- \iint \frac{|P(r,r')|^2}{|r-r'|}\,dr\,dr'.
\end{aligned}
\tag{2.3.1}
\]
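In order to minimize the Hartree–Fock energy, let us find the stationary point of the
Lagrangian. Note that

\[
\begin{aligned}
\frac{1}{2}\frac{\delta E^{\mathrm{RHF}}}{\delta\varphi_i^*(r)} ={}& -\frac{1}{2}\Delta\varphi_i(r) + V_{\mathrm{ext}}(r)\varphi_i(r) + \left( \int \frac{\rho(r')}{|r-r'|}\,dr' \right)\varphi_i(r) \\
&- \int \frac{P(r,r')}{|r-r'|}\,\varphi_i(r')\,dr' \\
=:{}& H^{\mathrm{RHF}}[\Phi]\varphi_i(r).
\end{aligned}
\tag{2.3.2}
\]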

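The last line gives the definition of the Fock operator $H^{\mathrm{RHF}}[\Phi]$, which depends on the
orbitals $\Phi = \{\varphi_i\}_{i=1}^{N_{\mathrm{occ}}}$. Taking into account the orthonormal constraints of ϕi , we get

\[
\begin{aligned}
H^{\mathrm{RHF}}[\Phi]\varphi_i(r) ={}& -\frac{1}{2}\Delta\varphi_i(r) + V_{\mathrm{ext}}(r)\varphi_i(r) + \left( \int \frac{\rho(r')}{|r-r'|}\,dr' \right)\varphi_i(r) \\
&- \int \frac{P(r,r')}{|r-r'|}\,\varphi_i(r')\,dr' \\
={}& \sum_{j} \varphi_j(r)\lambda_{ji},
\end{aligned}
\tag{2.3.3}
\]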

where the λ’s are Lagrange multipliers.


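Let us further simplify the equation. First note that, due to the self-adjointness of
$H^{\mathrm{RHF}}$,

\[
\lambda_{ji} = \langle\varphi_j|H^{\mathrm{RHF}}[\Phi]|\varphi_i\rangle = \langle\varphi_i|H^{\mathrm{RHF}}[\Phi]|\varphi_j\rangle^* = \lambda_{ij}^*; \tag{2.3.4}
\]
thus Λ = (λij) is a Hermitian matrix with the eigen-decomposition

\[
\Lambda = U\,\mathrm{diag}(\varepsilon_1,\ldots,\varepsilon_{N_{\mathrm{occ}}})\,U^*. \tag{2.3.5}
\]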


Consider a linear combination of the orbitals

\[
\psi_i(r) = \sum_{j=1}^{N_{\mathrm{occ}}} \varphi_j(r)U_{ji}, \qquad i = 1,\ldots,N_{\mathrm{occ}}. \tag{2.3.6}
\]

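Since U is a unitary matrix, we verify that

\[
\begin{aligned}
\rho_\Psi(r) &= 2\sum_{i=1}^{N_{\mathrm{occ}}} |\psi_i(r)|^2 = 2\sum_{i=1}^{N_{\mathrm{occ}}}\sum_{j,k=1}^{N_{\mathrm{occ}}} \varphi_j^*(r)\,U_{ji}^*\,U_{ki}\,\varphi_k(r) \\
&= 2\sum_{j,k=1}^{N_{\mathrm{occ}}} \varphi_j^*(r)\,\delta_{jk}\,\varphi_k(r) = \rho_\Phi(r)
\end{aligned}
\tag{2.3.7}
\]
and similarly $P_\Psi(r,r') = P_\Phi(r,r')$. Therefore the Fock operator given by the set of
new orbitals is the same as the previous one:
\[
H^{\mathrm{RHF}}[\Psi] = H^{\mathrm{RHF}}[\Phi]. \tag{2.3.8}
\]
It also follows that the RHF energy (2.3.1) is invariant under the rotation of the orbitals:
\[
E^{\mathrm{RHF}}(\Psi) = E^{\mathrm{RHF}}(\Phi); \tag{2.3.9}
\]
thus the collections of orbitals {ϕi} and {ψi} are equivalent. The equation of ψi
simplifies to
\[
\begin{aligned}
H^{\mathrm{RHF}}[\Psi]\psi_i(r) &= H^{\mathrm{RHF}}[\Phi]\sum_{j=1}^{N_{\mathrm{occ}}} \varphi_j(r)U_{ji} = \sum_{k,j} \varphi_k(r)\lambda_{kj}U_{ji} \\
&= \sum_{k} \varphi_k(r)U_{ki}\varepsilon_i = \psi_i(r)\varepsilon_i.
\end{aligned}
\tag{2.3.10}
\]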

Thus {ψi} satisfies a set of nonlinear eigenvalue problems, as the operator $H^{\mathrm{RHF}}$ also
depends on the eigenfunctions. The nonlinear eigenvalue problem (2.3.10) is known
as the Hartree–Fock equation. The orbitals in the set $\{\psi_i\}_{i=1}^{N_{\mathrm{occ}}}$ are called occupied
orbitals. The rest of the orbitals $\{\psi_i\}_{i\ge N_{\mathrm{occ}}+1}$ are called the unoccupied orbitals or virtual
orbitals.
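
The invariance statements (2.3.7) and (2.3.8) can be checked numerically. The sketch below is only an illustration under assumptions not made in the text: the orbitals are random orthonormal vectors on a grid (grid weights omitted), and the unitary matrix U is random rather than the eigenvector matrix of Λ. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_grid, n_occ = 50, 4

def random_complex(m, n):
    return rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Columns of Phi: orthonormal complex "orbitals" phi_j sampled on a grid.
Phi, _ = np.linalg.qr(random_complex(n_grid, n_occ))
# A random unitary matrix U, obtained from a QR factorization.
U, _ = np.linalg.qr(random_complex(n_occ, n_occ))

Psi = Phi @ U                     # (2.3.6): psi_i = sum_j phi_j U_ji

def density(orbitals):
    """rho(r) = 2 sum_i |orbital_i(r)|^2."""
    return 2.0 * np.sum(np.abs(orbitals) ** 2, axis=1)

def density_matrix(orbitals):
    """P(r, r') = sum_i orbital_i(r) orbital_i^*(r')."""
    return orbitals @ orbitals.conj().T

rho_diff = np.max(np.abs(density(Psi) - density(Phi)))
P_diff = np.max(np.abs(density_matrix(Psi) - density_matrix(Phi)))
```

Both differences are at the level of rounding error, so any quantity built from ρ and P alone, such as the Fock operator, is unchanged by the rotation.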

Kohn–Sham equations
Next, in order to minimize the Kohn–Sham energy functional (2.2.11), let us find the
stationary point of the Lagrangian. Similarly to the discussion above, we will also use
the spin-restricted ansatz (2.1.22) for the Kohn–Sham wavefunctions. We differentiate
(2.2.17) as
\[
\begin{aligned}
\frac{1}{2}\frac{\delta E^{\mathrm{KS}}(\{\varphi_i\})}{\delta\varphi_i^*(r)} &= \left( -\frac{1}{2}\Delta_r + V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r-r'|}\,dr' + V_{xc}[\rho](r) \right)\varphi_i(r) \\
&=: H^{\mathrm{KS}}[\rho]\varphi_i(r),
\end{aligned}
\tag{2.3.11}
\]
where $H^{\mathrm{KS}}[\rho]$ can be viewed as a self-adjoint operator acting on the orbitals which
depends on ρ, given by the orbitals
\[
\rho(r) = 2\sum_{i=1}^{N_{\mathrm{occ}}} |\varphi_i(r)|^2,
\]
and the exchange-correlation potential is defined to be the functional derivative of Exc
as
\[
V_{xc}[\rho] = \frac{\delta E_{xc}[\rho]}{\delta\rho}, \tag{2.3.12}
\]
which for now can be thought of as a ρ-dependent potential acting on orbitals; Vxc will
be further discussed and made more explicit below.
Taking into account the orthonormality condition $\int \varphi_i^*(r)\varphi_j(r)\,dr = \delta_{ij}$, we get
the Euler–Lagrange equations as

\[
\begin{aligned}
H^{\mathrm{KS}}[\rho]\varphi_i(r) &= \left( -\frac{1}{2}\Delta + V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r-r'|}\,dr' + V_{xc}[\rho](r) \right)\varphi_i(r) \\
&= \sum_{j} \varphi_j(r)\lambda_{ij},
\end{aligned}
\tag{2.3.13}
\]

where the λij’s are Lagrange multipliers. Following a similar procedure of rotating the
orbitals, it suffices to consider the Euler–Lagrange equation of the form

\[
\left( -\frac{1}{2}\Delta + V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r-r'|}\,dr' + V_{xc}[\rho](r) \right)\psi_i(r) = \varepsilon_i\psi_i(r), \qquad i = 1,\ldots,N_{\mathrm{occ}}. \tag{2.3.14}
\]

Since the operator H KS depends on the orbitals {ψi } through the electron density ρ,
this is a set of nonlinear eigenvalue problems known as the Kohn–Sham equations.
The Kohn–Sham equations must be solved self-consistently with respect to the electron
density ρ. For a given electron density ρ, the Hamiltonian $H^{\mathrm{KS}}[\rho] = -\frac{1}{2}\Delta + V_{\mathrm{eff}}(r)$
is a self-adjoint linear operator, where the effective potential induced by ρ is

Veff (r) = Vext (r) + VHxc [ρ](r). (2.3.15)

Here VHxc [ρ] includes the Hartree and exchange-correlation contributions and only de-
pends on the electron density as

VHxc [ρ](r) = vC [ρ](r) + Vxc [ρ](r). (2.3.16)

We also define the Coulomb operator as

\[
v_C[\rho](r) = \int \frac{\rho(r')}{|r-r'|}\,dr', \tag{2.3.17}
\]
which gives the Hartree potential.
The Kohn–Sham orbitals {ψi } are thus eigenfunctions of H KS [ρ]. It should be
noted that a priori there is no guarantee that $\{\psi_i\}_{i=1}^{N}$ will correspond to the lowest
few eigenvalues of H KS [ρ] to achieve the global minimum of the Kohn–Sham energy
functional (2.2.9), as in the case when the electrons are noninteracting except for the
Pauli exclusion principle. In practice, this is often assumed in solving the Kohn–Sham
equations according to the aufbau principle.
Let us now come back to the exchange-correlation potential, given by the functional
derivative of Exc with respect to the density. For the LDA, the variation
of Exc is given by
\[
\delta E_{xc} = \int \frac{\partial\epsilon_{xc}}{\partial\rho}(\rho(r))\,\delta\rho(r)\,dr.
\]
Hence
\[
V_{xc}[\rho](r) = \frac{\delta E_{xc}}{\delta\rho(r)} = \frac{\partial\epsilon_{xc}}{\partial\rho}(\rho(r)).
\]
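
To make the LDA formula concrete, the sketch below uses the Dirac (Slater) exchange energy density of the spin-unpolarized uniform electron gas, $\epsilon_x(\rho) = -\frac{3}{4}(3/\pi)^{1/3}\rho^{4/3}$ in atomic units. This is an exchange-only stand-in chosen for illustration; it omits correlation and is not the quantum Monte Carlo parametrization referred to in the text. The function names are hypothetical.

```python
import numpy as np

C_X = 0.75 * (3.0 / np.pi) ** (1.0 / 3.0)

def eps_x(rho):
    """Exchange energy per unit volume: eps_x(rho) = -C_X * rho^(4/3)."""
    return -C_X * rho ** (4.0 / 3.0)

def v_x(rho):
    """LDA potential V_x = d eps_x / d rho = -(4/3) C_X rho^(1/3)."""
    return -(4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)

# Compare the analytic derivative with a central finite difference.
rho = np.linspace(0.1, 2.0, 20)
d = 1e-6
v_fd = (eps_x(rho + d) - eps_x(rho - d)) / (2.0 * d)
max_err = np.max(np.abs(v_fd - v_x(rho)))
```

For this choice the potential reduces to $V_x(\rho) = -(3\rho/\pi)^{1/3}$, a purely local function of the density.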
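For GGA, the variation with respect to the density gives
\[
\delta E_{xc} = \int \left( \frac{\partial\epsilon_{xc}}{\partial\rho}\,\delta\rho + \frac{\partial\epsilon_{xc}}{\partial\sigma}\,\delta\sigma \right) dr.
\]
Using the chain rule

\[
\delta\sigma = |\nabla(\rho+\delta\rho)|^2 - |\nabla\rho|^2 = 2\nabla\rho\cdot\nabla\delta\rho + O(|\delta\rho|^2),
\]

we have
\[
\delta E_{xc} = \int \left( \frac{\partial\epsilon_{xc}}{\partial\rho}\,\delta\rho + 2\frac{\partial\epsilon_{xc}}{\partial\sigma}\,(\nabla\rho\cdot\nabla\delta\rho) \right) dr
= \int \left( \frac{\partial\epsilon_{xc}}{\partial\rho} - 2\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\sigma}\nabla\rho \right) \right)\delta\rho\,dr.
\]
Thus
\[
V_{xc}[\rho] = \frac{\partial\epsilon_{xc}}{\partial\rho} - 2\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\sigma}\nabla\rho \right).
\]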
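
The GGA formula for Vxc can be verified on a periodic 1D grid by comparing it with a finite-difference functional derivative of the discretized energy. Everything specific in this sketch is an assumption made for illustration: the model integrand ε(ρ, σ) = ρ^{4/3} + cσ/ρ^{4/3} is not a functional from the literature, and the central-difference gradient is just one possible discretization.

```python
import numpy as np

n = 64
h = 2.0 * np.pi / n
x = np.arange(n) * h

def D(f):
    """Periodic central difference; as a matrix it is antisymmetric."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * h)

c = 0.1  # hypothetical strength of the gradient term

def eps(rho, sigma):
    return rho ** (4.0 / 3.0) + c * sigma / rho ** (4.0 / 3.0)

def deps_drho(rho, sigma):
    return (4.0 / 3.0) * rho ** (1.0 / 3.0) - (4.0 / 3.0) * c * sigma / rho ** (7.0 / 3.0)

def deps_dsigma(rho, sigma):
    return c / rho ** (4.0 / 3.0)

def energy(rho):
    """Discretized E_xc = int eps(rho, |grad rho|^2) dr."""
    return h * np.sum(eps(rho, D(rho) ** 2))

def v_xc(rho):
    """V_xc = d eps/d rho - 2 div( d eps/d sigma * grad rho )."""
    g = D(rho)
    sigma = g ** 2
    return deps_drho(rho, sigma) - 2.0 * D(deps_dsigma(rho, sigma) * g)

rho = 1.0 + 0.3 * np.cos(x)

# Finite-difference functional derivative: (1/h) dE/d rho_j at each grid point.
d = 1e-6
v_fd = np.empty(n)
for j in range(n):
    e = np.zeros(n)
    e[j] = d
    v_fd[j] = (energy(rho + e) - energy(rho - e)) / (2.0 * d * h)

max_err = np.max(np.abs(v_fd - v_xc(rho)))
```

Because the discrete gradient is antisymmetric, the discrete integration by parts is exact, and the two potentials agree up to finite-difference error.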
The derivation of the exchange-correlation potential is slightly different for meta-GGA,
since the kinetic energy density τ (r) explicitly involves the Kohn–Sham orbitals {ψi }.
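First,
\[
\delta E_{xc} = \int \left( \frac{\partial\epsilon_{xc}}{\partial\rho}\,\delta\rho + \frac{\partial\epsilon_{xc}}{\partial\sigma}\,\delta\sigma + \frac{\partial\epsilon_{xc}}{\partial\tau}\,\delta\tau \right) dr.
\]
In the spin-restricted case, since
\[
\delta\tau(r) = \sum_{i=1}^{N_{\mathrm{occ}}} \left( \nabla\psi_i^*(r)\cdot\nabla\delta\psi_i(r) + \nabla\delta\psi_i^*(r)\cdot\nabla\psi_i(r) \right),
\]

we can use integration by parts and obtain

\[
\int \frac{\partial\epsilon_{xc}}{\partial\tau}\,\delta\tau\,dr = -\sum_{i=1}^{N_{\mathrm{occ}}}\int \delta\psi_i^*\,\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\tau}\nabla\psi_i \right) dr
- \sum_{i=1}^{N_{\mathrm{occ}}}\int \psi_i^*\,\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\tau}\nabla\delta\psi_i \right) dr. \tag{2.3.18}
\]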

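Recall that the Kohn–Sham equation is obtained by variation with respect to $\delta\psi_i^*$; thus
the exchange-correlation “potential” Vxc [ρ] applied to the occupied orbital ψi is (note
the extra 1/2 factor due to the spin degeneracy)
\[
V_{xc}[\rho]\psi_i = \left( \frac{\partial\epsilon_{xc}}{\partial\rho} - 2\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\sigma}\nabla\rho \right) \right)\psi_i - \frac{1}{2}\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\tau}\nabla\psi_i \right). \tag{2.3.19}
\]
Therefore, the exchange-correlation functional Vxc [ρ] is still independent of the orbitals
and can be defined only using the electron density as
\[
V_{xc}[\rho] = \frac{\partial\epsilon_{xc}}{\partial\rho} - 2\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\sigma}\nabla\rho \right) - \frac{1}{2}\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\tau}\nabla \right). \tag{2.3.20}
\]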
Strictly speaking, in meta-GGA, Vxc [ρ] is no longer a potential. The dependence of Exc
on the kinetic energy density introduces a differential operator
$\nabla\cdot\left( \frac{\partial\epsilon_{xc}}{\partial\tau}\nabla \right)$ acting on the
orbitals. On the other hand, all our previous discussions on the Kohn–Sham equations
still apply without change as can be verified easily.
Further up the Jacob’s ladder for the exchange-correlation functionals, Vxc can no
longer be determined only from the electron density. In particular, for hybrid function-
als involving the Fock exchange term, the Kohn–Sham equations will be similar to the
Hartree–Fock equation and involve the single particle density matrix P(r, r′) or the occupied
orbitals. Functionals beyond the hybrid functional, such as the RPA correlation
energy functional, further involve the polarizability operator χ(r, r′), which must be
defined using not only the occupied orbitals but the unoccupied orbitals as well. The
increased complexity of the density functional brings in more fidelity but also signifi-
cantly increases the computational cost. Therefore the computational cost of DFTs with
high-level exchange-correlation functionals may be close to that of some approaches
based on many-body theories for accurate electronic structure calculations.
In the discussion below, we assume that the LDA/GGA functionals are used, i.e., the
exchange-correlation potential can always be expressed as a local potential Vxc [ρ](r).
Much of the discussion can be generalized to more complicated exchange-correlation
functionals as well as spin-dependent functionals.
For the rest of the book, for simplicity we choose to neglect the spin index in the
presentation and consider spin-less quantum systems unless otherwise noted. The treat-
ment of spin-less particles is similar to that of the restricted Hartree–Fock/Kohn–Sham
calculations above, but we ignore the factor of 2 due to the spin degeneracy. In other
words, we assume Nocc = N and use the spatial variable r most of the time instead
of x. Readers may find this odd at first sight, since we started by introducing quantum
mechanics for a system that involves only the spin degrees of freedom! However, from
a mathematical perspective, the spin degrees of freedom mainly introduce another layer
of indices to keep track of for all quantities under consideration and do not (usually)
add extra complexity at the conceptual level of Kohn–Sham DFT.

2.4 Self-consistent field iteration


For Kohn–Sham DFT with exchange-correlation functionals given as pure density functionals,
the effective Kohn–Sham potential Veff [ρ], and hence the Hamiltonian matrix,
depends only on the density ρ. Conversely, the density is determined by the effective
potential. We call the mapping from Veff to ρ the Kohn–Sham map, denoted by

ρ = FKS [Veff ]. (2.4.1)

The electron density ρ can be evaluated from the Kohn–Sham map by solving a linear
eigenvalue problem or by using density matrix techniques, to be discussed in section 2.7.
Hence ρ and Veff should be iteratively determined by each other until convergence. This
is called the self-consistent field (SCF) iteration.
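
The Kohn–Sham map can be made concrete on a small model. The sketch below is a toy under stated assumptions: one spatial dimension, spinless particles, a second-order finite difference Laplacian with Dirichlet boundary conditions, and aufbau filling of the lowest states. The function name `kohn_sham_map` and the harmonic test potential are hypothetical choices, not taken from the text.

```python
import numpy as np

def kohn_sham_map(v_eff, n_occ, h):
    """Toy spinless Kohn-Sham map rho = F_KS[V_eff] on a uniform 1D grid.

    Diagonalizes H = -1/2 Laplacian + V_eff and fills the lowest n_occ states.
    """
    n = v_eff.size
    off = np.full(n - 1, 1.0)
    lap = (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / h ** 2
    H = -0.5 * lap + np.diag(v_eff)
    eigvals, eigvecs = np.linalg.eigh(H)        # eigenvalues in ascending order
    psi = eigvecs[:, :n_occ] / np.sqrt(h)       # normalize: sum |psi|^2 * h = 1
    rho = np.sum(psi ** 2, axis=1)
    return rho, eigvals[:n_occ]

n, n_occ = 200, 2
x = np.linspace(-5.0, 5.0, n)
h = x[1] - x[0]
rho, eps_occ = kohn_sham_map(0.5 * x ** 2, n_occ, h)   # harmonic well as a test potential
```

For the harmonic test potential the two occupied eigenvalues are close to 1/2 and 3/2, and the returned density integrates to the number of occupied orbitals.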
We begin with a certain initial electron density denoted by ρ0 , and we denote by
ρk , Vk the electron density and the effective potential Veff at the kth SCF iteration,
respectively. Then the flow of the SCF iteration becomes

· · · → ρk → Vk = Veff [ρk ] → ρk+1 → Vk+1 = Veff [ρk+1 ] → · · · . (2.4.2)

Depending on the starting point, the relation (2.4.2) can be viewed as a mapping from
ρk to ρk+1 , or from Vk to Vk+1 . The former is known as density mixing and the latter
as potential mixing. There is no qualitative difference between the two types of mixing
schemes. However, density mixing has the extra constraint that the density must be non-
negative everywhere and must be normalized to have the correct number of electrons.
In practice, this constraint can be easily satisfied by setting all the negative entries of
ρ (usually these entries have very small magnitude) to 0 or a small positive number,
followed by a normalization step. On the other hand, potential mixing, which we will
consider below, is formally free of such a constraint. We remark that both density
mixing and potential mixing strategies are widely used in electronic structure software
packages, and that the algorithm below for potential mixing can be used for density
mixing as well.
When self-consistency is reached, the converged effective potential is denoted by
V⋆ and satisfies the nonlinear equation

\[
V_\star = V_{\mathrm{eff}}[F_{\mathrm{KS}}[V_\star]]. \tag{2.4.3}
\]

Simple mixing method


The simplest version of the SCF iteration is the fixed point iteration, where the potential
at the (k + 1)th step is directly given by the output potential at the kth step:

Vk+1 = Veff [FKS [Vk ]]. (2.4.4)

However, as we shall analyze later, the fixed point iteration generally cannot be expected
to converge, even if the initial potential V0 is already very close to V⋆.
In order to achieve a self-consistent solution, the simplest practically usable scheme
is the simple mixing method, which introduces a slight modification of the fixed point
iteration as
Vk+1 = αVeff [FKS [Vk ]] + (1 − α) Vk . (2.4.5)
If we introduce the residual error of the Kohn–Sham potential as

rk = Vk − Veff [FKS [Vk ]] , (2.4.6)

then the simple mixing scheme can also be written as

Vk+1 = Vk − αrk . (2.4.7)

When α = 1, the simple mixing is the same as the fixed point iteration. As will be
analyzed in section 3.4, if we neglect the contribution from the exchange-correlation
potential, the simple mixing always converges when α is a small enough positive num-
ber.
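
The difference between the fixed point iteration and simple mixing can be seen on a toy problem. In the sketch below, the composition Veff[FKS[V]] is replaced by a hypothetical affine map G(V) = b − AV with A symmetric positive definite; this is a stand-in chosen so that the behavior can be predicted, not a Kohn–Sham calculation. Since some eigenvalues of A exceed 1, the fixed point iteration (α = 1) diverges, while α = 0.3 converges.

```python
import numpy as np

# Hypothetical affine stand-in for V -> V_eff[F_KS[V]].
rng = np.random.default_rng(0)
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.2, 3.0, n)) @ Q.T   # eigenvalues between 0.2 and 3
b = np.ones(n)

def G(V):
    return b - A @ V

def simple_mixing(V0, alpha, n_iter):
    V = V0.copy()
    for _ in range(n_iter):
        r = V - G(V)            # residual (2.4.6)
        V = V - alpha * r       # update (2.4.7)
    return V, np.linalg.norm(V - G(V))

V0 = np.zeros(n)
_, res_fixed_point = simple_mixing(V0, alpha=1.0, n_iter=50)   # fixed point iteration (2.4.4)
_, res_mixed = simple_mixing(V0, alpha=0.3, n_iter=50)
```

For this model the error is multiplied by I − α(I + A) at every step, so the iteration converges exactly when α(1 + a) < 2 for every eigenvalue a of A, which here means α < 0.5.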

Newton’s method
The convergence of the simple mixing method often requires a rather small mixing
constant α. Hence the SCF procedure may take many iterations to converge. One
possible acceleration can be achieved by using Newton’s method, which can be written
as
\[
V_{k+1} = V_k - J_k^{-1} r_k. \tag{2.4.8}
\]
Here $J_k$ is the Jacobian matrix for the residual map

\[
\delta V \mapsto \delta V - V_{\mathrm{eff}}[F_{\mathrm{KS}}[V_k + \delta V]].
\]


Hence the simple mixing method can also be interpreted as approximating the inverse of
the Jacobian matrix $J_k^{-1}$ simply by αI. For a system with N electrons, the evaluation
of the Jacobian matrix for the composition map Veff ◦ FKS requires in principle O(N)
evaluations of the Kohn–Sham map, which is prohibitively expensive.
The Jacobian-free Krylov–Newton method replaces the need for explicit evaluation
of the Jacobian matrix by solving a linear equation
Jk δVk = −rk (2.4.9)
to obtain the Newton update δVk . This can be done using iterative methods for solving
linear equations, such as the generalized minimal residual method (GMRES) [78]. In
order to compute the matrix-vector multiplication related to the Jacobian matrix, one
can use the finite difference formula
Jk δVk ≈ δVk − (Veff [FKS [Vk + δVk ]] − Veff [FKS [Vk ]]) . (2.4.10)
The finite difference calculation requires at least one additional function evaluation of
FKS (Vk + δV ) per iteration step. Therefore, even though Newton’s method may exhibit
local quadratic convergence, each Newton iteration may take many inner iterations to
solve the linear equation (2.4.9).
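
The finite difference formula (2.4.10) can be tested on a toy residual whose Jacobian is known in closed form. The map G below is a hypothetical smooth stand-in for V ↦ Veff[FKS[V]], and the scaling of the direction by a small step `eps` is a common practical variant that is assumed here rather than taken from the text. The sketch only forms Jacobian-vector products; a Krylov solver such as GMRES would call `jac_vec_fd` repeatedly and is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = 0.5 * rng.standard_normal((n, n))
b = rng.standard_normal(n)

def G(V):
    """Hypothetical smooth stand-in for V -> V_eff[F_KS[V]]."""
    return np.tanh(A @ V) + b

def jac_vec_fd(V, u, eps=1e-6):
    """Approximate J u as in (2.4.10), with the direction u scaled by a small eps."""
    return u - (G(V + eps * u) - G(V)) / eps

def jac_exact(V):
    """Exact Jacobian of r(V) = V - G(V): I - diag(1 - tanh^2(A V)) A."""
    return np.eye(n) - (1.0 - np.tanh(A @ V) ** 2)[:, None] * A

V = rng.standard_normal(n)
u = rng.standard_normal(n)
Ju_fd = jac_vec_fd(V, u)        # costs one extra evaluation of G
Ju_exact = jac_exact(V) @ u
```

Each product costs one additional evaluation of the map, which in the Kohn–Sham setting means one additional evaluation of FKS.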

Broyden’s method
A widely used alternative to Newton’s method is the quasi-Newton method, which replaces
$J_k^{-1}$ by an approximate matrix $C_k$ that is easy to compute and apply. Then the
updating strategy becomes
\[
V_{k+1} = V_k - C_k r_k. \tag{2.4.11}
\]
Using Broyden’s techniques [39], one can systematically approximate $J_k$ or $J_k^{-1}$.
In Broyden’s second method, $C_k$ is obtained by performing a sequence of low-rank
modifications to some initial approximation $C_0$ of the Jacobian inverse using a recursive
formula [26, 57]. At each step, $C_k$ is obtained by solving the following constrained
optimization problem:
\[
\min_{C}\ \frac{1}{2}\|C - C_{k-1}\|_F^2 \quad \text{s.t.} \quad S_k = C Y_k, \tag{2.4.12}
\]
where $C_{k-1}$ is the approximation to the Jacobian inverse constructed in the (k − 1)th Broyden
iteration. The matrices $S_k$ and $Y_k$ above are defined as
\[
S_k = (s_k, s_{k-1}, \ldots, s_{k-\ell}), \qquad Y_k = (y_k, y_{k-1}, \ldots, y_{k-\ell}), \tag{2.4.13}
\]
where $s_j$ and $y_j$ are defined as $s_j = V_j - V_{j-1}$ and $y_j = r_j - r_{j-1}$, respectively. The
number ℓ is the length of the history used in Broyden’s method.
Equation (2.4.12) is a constrained optimization problem which can be solved using
the method of Lagrange multipliers. The solution is (exercise)
\[
C_k = C_{k-1} + (S_k - C_{k-1}Y_k)Y_k^{\dagger}. \tag{2.4.14}
\]

Here $Y_k^{\dagger}$ denotes the Moore–Penrose pseudoinverse of $Y_k$, i.e., $Y_k^{\dagger} = (Y_k^{\top}Y_k)^{-1}Y_k^{\top}$.
We remark that in practice $Y_k^{\dagger}$ is not constructed explicitly, since we only need to apply
$Y_k^{\dagger}$ to a residual vector $r_k$. This operation can be carried out by solving a linear least
squares problem with appropriate regularization (e.g., through a truncated singular value
decomposition).
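
The update (2.4.14) and the remark about applying the pseudoinverse through a least squares solve can be sketched as follows. The matrices S and Y are random placeholders for the stored differences, the choice C0 = αI is an assumption, and the function names are hypothetical.

```python
import numpy as np

def broyden_update(C_prev, S, Y):
    """Explicit form of (2.4.14): C_k = C_{k-1} + (S_k - C_{k-1} Y_k) Y_k^dagger."""
    return C_prev + (S - C_prev @ Y) @ np.linalg.pinv(Y)

def apply_pinv(Y, r):
    """Apply Y^dagger to r through a least squares solve, without forming Y^dagger."""
    coeffs, *_ = np.linalg.lstsq(Y, r, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
n, hist = 8, 3
S = rng.standard_normal((n, hist))    # placeholder for (s_k, ..., s_{k-l})
Y = rng.standard_normal((n, hist))    # placeholder for (y_k, ..., y_{k-l})
r = rng.standard_normal(n)
C0 = 0.3 * np.eye(n)                  # assumed initial guess C_0 = alpha * I

C1 = broyden_update(C0, S, Y)

# Matrix-free application of C1 to r, using only a least squares solve with Y.
g = apply_pinv(Y, r)
C1_r = C0 @ (r - Y @ g) + S @ g
```

The explicit matrix is formed here only to check the secant condition S = C1 Y; in a large calculation one would keep S and Y and use the matrix-free form.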
Anderson’s method and Pulay’s method


A variant of Broyden’s method is Anderson’s method [2], which is widely used in electronic
structure software packages. When solving Eq. (2.4.12), Anderson’s method fixes $C_{k-1}$
to be the initial approximation $C_0$. It follows from (2.4.11) that Anderson’s method
updates the potential as
\[ V_{k+1} = V_k - C_0 (I - Y_k Y_k^{\dagger}) r_k - S_k Y_k^{\dagger} r_k. \tag{2.4.15} \]
In particular, if $C_0$ is set to $\alpha I$, we obtain the form of Anderson’s method
\[ V_{k+1} = V_k - \alpha (I - Y_k Y_k^{\dagger}) r_k - S_k Y_k^{\dagger} r_k \]
commonly used in Kohn–Sham DFT solvers.
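The update with $C_0 = \alpha I$ can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: the function name `anderson_mixing`, the default parameters, and the toy map `G` (a hypothetical stand-in for $V \mapsto V_{\mathrm{eff}}[F_{\mathrm{KS}}(V)]$) are all illustrative and not taken from any electronic structure package.

```python
import numpy as np

def anderson_mixing(G, V0, alpha=0.5, ell=5, tol=1e-10, maxit=200):
    """Anderson mixing for the fixed point V = G(V), with residual r = V - G(V)
    and C_0 = alpha * I. At most `ell` difference pairs (s_j, y_j) are kept."""
    V = V0.copy()
    r = V - G(V)
    S_cols, Y_cols = [], []
    for it in range(maxit):
        if np.linalg.norm(r) < tol:
            return V, it
        if Y_cols:
            S = np.column_stack(S_cols)
            Y = np.column_stack(Y_cols)
            # gamma = Y^dagger r via regularized least squares (rcond is an assumed cutoff)
            gamma, *_ = np.linalg.lstsq(Y, r, rcond=1e-12)
            V_new = V - alpha * (r - Y @ gamma) - S @ gamma
        else:
            V_new = V - alpha * r              # no history yet: simple mixing
        r_new = V_new - G(V_new)
        S_cols.append(V_new - V)               # s_{k+1} = V_{k+1} - V_k
        Y_cols.append(r_new - r)               # y_{k+1} = r_{k+1} - r_k
        S_cols, Y_cols = S_cols[-ell:], Y_cols[-ell:]
        V, r = V_new, r_new
    return V, maxit

# Toy contraction used only for illustration.
rng = np.random.default_rng(0)
n = 20
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal matrix
b = rng.standard_normal(n)
G = lambda V: b + 0.3 * np.sin(Q @ V)

V_star, iters = anderson_mixing(G, np.zeros(n))
residual_norm = np.linalg.norm(V_star - G(V_star))
```

The order of the columns in `S` and `Y` differs from (2.4.13), which does not affect the update because the same ordering is used for both matrices.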


An alternative way to derive Broyden’s method is through a technique called direct
inversion in the iterative subspace (DIIS). The technique was originally developed by Pulay
for accelerating a Hartree–Fock calculation [74]. Hence it is often referred to as Pulay
mixing. The motivation of Pulay’s method is to minimize the residual
$V - V_{\mathrm{eff}}[F_{\mathrm{KS}}(V)]$ within the subspace spanned by
$\{V_{k-\ell-1}, \ldots, V_k\}$. In Pulay’s original work [74], the optimal approximation
to $V$ is expressed as $V_{\mathrm{opt}} = \sum_{j=k-\ell-1}^{k} \alpha_j V_j$, where $V_j$
($j = k-\ell-1, \ldots, k$) are previous approximations to $V$ and the coefficients
$\alpha_j$ are chosen to satisfy the constraint $\sum_{j=k-\ell-1}^{k} \alpha_j = 1$.
When the $V_j$’s are all sufficiently close to the fixed point solution,
$V_{\mathrm{eff}}[F_{\mathrm{KS}}(\sum_j \alpha_j V_j)] \approx \sum_j \alpha_j V_{\mathrm{eff}}[F_{\mathrm{KS}}(V_j)]$
holds approximately. Hence we may obtain $\alpha_j$ (and consequently $V_{\mathrm{opt}}$)
by solving the following quadratic programming problem:
\[ \min_{\{\alpha_j\}} \ \Bigl\| \sum_{j=k-\ell-1}^{k} \alpha_j r_j \Bigr\|_2^2 \quad \text{s.t.} \quad \sum_{j=k-\ell-1}^{k} \alpha_j = 1, \tag{2.4.16} \]
where $r_j = V_j - V_{\mathrm{eff}}[F_{\mathrm{KS}}(V_j)]$.
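A small numerical sketch may help to make (2.4.16) concrete. The constrained problem is solved below through its KKT (Lagrange multiplier) linear system, and the result is compared with an unconstrained least squares problem posed on residual differences. The helper name `diis_coefficients` and the random residuals are assumptions made for illustration only.

```python
import numpy as np

def diis_coefficients(R):
    """Columns of R are residuals r_j. Returns alpha minimizing ||R alpha||_2^2
    subject to sum(alpha) = 1, by solving the KKT system
        [ R^T R   1 ] [ alpha  ]   [ 0 ]
        [ 1^T     0 ] [ lambda ] = [ 1 ]."""
    m = R.shape[1]
    K = np.zeros((m + 1, m + 1))
    K[:m, :m] = R.T @ R
    K[:m, m] = 1.0
    K[m, :m] = 1.0
    rhs = np.zeros(m + 1)
    rhs[m] = 1.0
    return np.linalg.solve(K, rhs)[:m]

rng = np.random.default_rng(1)
R = rng.standard_normal((10, 4))            # stand-ins for r_{k-3}, ..., r_k (oldest first)
alpha = diis_coefficients(R)
r_opt_constrained = R @ alpha               # minimal residual combination

# Unconstrained form: r_k + Y b with y_j = r_j - r_{j-1}, minimized over b.
Y = R[:, 1:] - R[:, :-1]
b, *_ = np.linalg.lstsq(Y, -R[:, -1], rcond=None)
r_opt_unconstrained = R[:, -1] + Y @ b
```

Both formulations minimize the norm over the same affine set of residual combinations, so the two optimal residual vectors should agree up to rounding error.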


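Note that (2.4.16) can be reformulated as an unconstrained minimization problem if
$V_{\mathrm{opt}}$ is required to take the form
$V_{\mathrm{opt}} = V_k + \sum_{j=k-\ell}^{k} \beta_j (V_j - V_{j-1})$, where each $\beta_j$
can be any real number. Again, if we assume that $V_{\mathrm{eff}}[F_{\mathrm{KS}}(V)]$ is
approximately linear around $V_j$ and let $b = (\beta_k, \ldots, \beta_{k-\ell})^{\top}$
(ordered consistently with the columns of $S_k$ and $Y_k$ in (2.4.13)), then
$V_{\mathrm{opt}} = V_k + S_k b$ and the residual at $V_{\mathrm{opt}}$ is approximately
$r_k + Y_k b$. Minimizing $\|V_{\mathrm{opt}} - V_{\mathrm{eff}}[F_{\mathrm{KS}}(V_{\mathrm{opt}})]\|$
with respect to $\{\beta_j\}$ therefore yields $b = -Y_k^{\dagger} r_k$, where $Y_k$ is the
same as that defined in (2.4.13). Then Pulay’s method for updating $V$ is defined as
\[ V_{k+1} = V_{\mathrm{opt}} - C_0 \bigl(V_{\mathrm{opt}} - V_{\mathrm{eff}}[F_{\mathrm{KS}}(V_{\mathrm{opt}})]\bigr), \tag{2.4.17} \]
where $C_0$ is an initial approximation to the inverse of the Jacobian (at the solution).
Substituting $V_{\mathrm{opt}} = V_k - S_k Y_k^{\dagger} r_k$ into (2.4.17), and using the
linearity assumption in the form
$V_{\mathrm{opt}} - V_{\mathrm{eff}}[F_{\mathrm{KS}}(V_{\mathrm{opt}})] \approx (I - Y_k Y_k^{\dagger}) r_k$,
we arrive at exactly Anderson’s updating formula (2.4.15).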
The DIIS method can be further generalized by taking other definitions of the resid-
ual vector r. For instance, the commutator-DIIS (C-DIIS) method [75] defines the resid-
ual as the commutator of the Hamiltonian operator and the density matrix. This is the
most widely used method in quantum chemistry software packages for achieving self-
consistency.

We remark that $C_0$ plays the role of a preconditioner, and hence a better $C_0$ can be
chosen to accelerate the convergence of Anderson’s method in practical electronic
structure calculations. For instance, for the uniform electron gas, as well as simple
metallic systems such as bulk sodium and aluminum, the most widely used preconditioner is
the Kerker preconditioner [43]. It assigns a smaller weight to the long-wavelength Fourier
modes in order to reduce the effect of “charge sloshing” that commonly occurs in
electronic structure calculations for metallic systems.
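The text does not give a formula for the Kerker preconditioner. The sketch below assumes the commonly quoted Fourier-space weight $\alpha q^2/(q^2 + q_0^2)$ on a one-dimensional periodic grid; the function name `kerker_apply` and the values of `alpha` and `q0` are illustrative assumptions rather than settings from any particular code.

```python
import numpy as np

def kerker_apply(r, box_length, alpha=0.8, q0=1.0):
    """Apply an assumed Kerker-type weight alpha * q^2 / (q^2 + q0^2) to a residual r
    sampled on a uniform periodic 1D grid of total length box_length."""
    n = r.size
    q = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)   # angular wavenumbers
    weight = alpha * q**2 / (q**2 + q0**2)                   # tends to 0 as q -> 0
    return np.fft.ifft(weight * np.fft.fft(r)).real

L, n = 20.0, 128
x = np.arange(n) * L / n
long_mode = np.cos(2.0 * np.pi * x / L)          # longest nonconstant wavelength on the grid
short_mode = np.cos(2.0 * np.pi * 20 * x / L)    # a much shorter wavelength

damp_long = np.linalg.norm(kerker_apply(long_mode, L)) / np.linalg.norm(long_mode)
damp_short = np.linalg.norm(kerker_apply(short_mode, L)) / np.linalg.norm(short_mode)
```

With these parameters the long-wavelength mode is strongly damped while the short-wavelength mode is scaled by a factor close to `alpha`, which is the qualitative behavior described above.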

2.5 Density matrix formulation


Recall that the spin-dependent density matrix was defined in (2.1.18) to simplify the en-
ergy expression of the Hartree–Fock theory. Compared with the orbital representation,
the density matrix is more intrinsic, as it is invariant with respect to the unitary rotation
of the orbitals. In terms of numerical algorithms, the use of the density matrix is also
advantageous, especially for large-scale problems, as we will discuss in section 2.7.
Thus, in this section, we reformulate the Kohn–Sham equations in terms of the density
matrix.
Consider a (one-body) Hamiltonian operator $H = -\frac{1}{2}\Delta + V$. Assume that $H$
has a discrete spectrum and denote the eigenpairs of $H$ as $\{\varepsilon_i, \psi_i\}$:
\[ H\psi_i = \varepsilon_i \psi_i. \tag{2.5.1} \]
Assume that the system has N electrons that occupy the first N eigenstates according
to the Pauli exclusion principle. Here, for definiteness, we assume that εN < εN +1 so
that the ground state is not degenerate. Thus, the occupied space is given by the span of
{ψi }i=1,...,N .
Recall that the energy functionals in the Hartree–Fock and Kohn–Sham DFTs are
invariant with respect to unitary rotations of the orbitals. Such unitary rotation is of-
ten called the gauge degrees of freedom. Hence the physical quantity is the subspace
spanned by the occupied orbitals instead of the individual eigenfunctions. This sub-
space is often called the Kohn–Sham occupied subspace and can be represented by the
density matrix
\[ P = \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i|. \tag{2.5.2} \]

Since the {ψi }’s are orthonormal, we have


\[ P^2 = \sum_{i,j=1}^{N} |\psi_i\rangle\langle\psi_i|\psi_j\rangle\langle\psi_j| = \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i| = P. \tag{2.5.3} \]

Hence P is self-adjoint and idempotent, and is the projection operator onto the occupied
space. From an operator perspective, the density matrix can be viewed as an integral
operator
\[ (Pf)(r) = \int P(r, r') f(r')\, dr' = \int \sum_{i=1}^{N} \psi_i(r)\psi_i^*(r') f(r')\, dr' \tag{2.5.4} \]

with the kernel of the integral operator P given by


\[ P(r, r') = \sum_{i=1}^{N} \psi_i(r)\psi_i^*(r'). \tag{2.5.5} \]

In particular, we observe that the diagonal part of the kernel is just the electron density
in the real space representation
\[ P(r, r) = \sum_{i=1}^{N} \psi_i(r)\psi_i^*(r) = \sum_{i=1}^{N} |\psi_i(r)|^2 = \rho(r). \tag{2.5.6} \]
Moreover,
\[ \operatorname{Tr} P = \int P(r, r)\, dr = N, \tag{2.5.7} \]
which is the normalization condition due to the constraint on the number of electrons.
In fact, $P$ can be represented independently of the orbitals. Since $H$ is a Hermitian
operator, a matrix function $f(H)$ can be defined using its spectral decomposition as
\[ f(H) = \sum_i f(\varepsilon_i) |\psi_i\rangle\langle\psi_i| \tag{2.5.8} \]
for any Borel measurable function $f$ on the real line. Recall that for simplicity we have
assumed that the spectrum of $H$ is discrete. Thus, assuming that
$\varepsilon_N < \varepsilon_{N+1}$ and the parameter $\mu \in \mathbb{R}$ satisfies
$\varepsilon_N \le \mu < \varepsilon_{N+1}$, we have
\[ \mathbf{1}_{(-\infty,\mu]}(H) = \sum_i \mathbf{1}_{(-\infty,\mu]}(\varepsilon_i) |\psi_i\rangle\langle\psi_i| = \sum_{i:\, \varepsilon_i \le \mu} |\psi_i\rangle\langle\psi_i| = \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i| = P. \tag{2.5.9} \]
We conclude with
\[ P = \mathbf{1}_{(-\infty,\mu]}(H), \tag{2.5.10} \]
where the right-hand side is the spectral projection onto the interval $(-\infty, \mu]$.
Here $\mu$ is known as the Fermi level or the chemical potential.
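The properties derived in this section can be checked numerically on a small model. The sketch below assumes a one-dimensional finite-difference discretization of $-\frac{1}{2}\Delta + V$ with a harmonic potential; the grid size, spacing, and potential are illustrative choices, not taken from the text.

```python
import numpy as np

# Assumed toy model: 1D finite-difference Hamiltonian with a harmonic potential.
n, N, h = 60, 4, 0.2                       # grid points, electrons, grid spacing
x = h * (np.arange(n) - n / 2)
lap = (np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1) - 2.0 * np.eye(n)) / h**2
H = -0.5 * lap + np.diag(0.5 * x**2)

eps, psi = np.linalg.eigh(H)               # eigenpairs, eigenvalues in ascending order
occ = psi[:, :N]                           # occupied orbitals as columns
P = occ @ occ.T                            # density matrix, cf. (2.5.2)

# Gauge invariance: rotating the occupied orbitals leaves P unchanged.
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((N, N)))
P_rot = (occ @ Q) @ (occ @ Q).T

# Spectral projector form, cf. (2.5.10), with mu placed inside the gap.
mu = 0.5 * (eps[N - 1] + eps[N])
P_spec = (psi * (eps <= mu).astype(float)) @ psi.T

idempotency_error = np.linalg.norm(P @ P - P)
trace_P = np.trace(P)
density = np.diag(P)                       # discrete analogue of rho(r) = P(r, r)
```

Here `density` equals the sum of squared occupied orbital values at each grid point, and `trace_P` equals the number of electrons.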
The density matrix can also be equivalently defined using contour integrals from
complex analysis. Recall the Cauchy integral formula (Appendix A.3)
\[ \frac{1}{2\pi i} \oint_{\partial D} \frac{1}{\lambda - \varepsilon}\, d\lambda = \begin{cases} 0, & \varepsilon \notin D, \\ 1, & \varepsilon \in D, \end{cases} \tag{2.5.11} \]
where the contour $\partial D$ is chosen as the boundary of a domain $D \subset \mathbb{C}$
in the counterclockwise direction, i.e., the above integral is 1 if the contour encloses
the pole $\varepsilon$ of the integrand. Now consider a contour $\mathcal{C}$ enclosing
only the first $N$ eigenvalues of $H$ (Figure 2.3). Applying the spectral decomposition,
we calculate
\[ \frac{1}{2\pi i} \oint_{\mathcal{C}} (\lambda - H)^{-1}\, d\lambda = \frac{1}{2\pi i} \oint_{\mathcal{C}} \sum_i (\lambda - \varepsilon_i)^{-1} |\psi_i\rangle\langle\psi_i|\, d\lambda = \sum_{i:\ \varepsilon_i \text{ is enclosed by } \mathcal{C}} |\psi_i\rangle\langle\psi_i| = \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i|. \tag{2.5.12} \]
We obtain
\[ P = \frac{1}{2\pi i} \oint_{\mathcal{C}} (\lambda - H)^{-1}\, d\lambda. \tag{2.5.13} \]

Figure 2.3. Contour integral representation and pole expansion for the density matrix at zero temperature.

The integrand in the above formula,
\[ G_\lambda = (\lambda - H)^{-1}, \tag{2.5.14} \]
is known as Green’s function (in spectral theory it is usually called the resolvent, but
we will stick to the terminology that is more common to physics and PDEs here). The
relation between the density matrix and Green’s function using contour integrals is a
very useful tool both theoretically and numerically. After a certain choice of quadrature
rule, the discretized contour integral takes the form
\[ P \approx \sum_{l=1}^{m} \omega_l (z_l - H)^{-1}, \tag{2.5.15} \]
where $\{z_l\}$, $\{\omega_l\}$ are quadrature nodes and weights, respectively, and
$G_l = (z_l - H)^{-1}$ is a Green’s function evaluated at $\lambda = z_l$. This is called
the pole expansion of the density matrix.
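One simple way to realize (2.5.15) is the trapezoidal rule on a circular contour. The sketch below is an illustration under stated assumptions: the toy Hamiltonian is a one-dimensional finite-difference model, the circle is placed using exact eigenvalues purely for convenience, and the number of nodes `m` is an illustrative choice. It is not the quadrature used in the pole expansion literature cited in the text.

```python
import numpy as np

# Assumed toy model: 1D finite-difference Hamiltonian with a harmonic potential.
n, N, h = 60, 4, 0.2
x = h * (np.arange(n) - n / 2)
lap = (np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1) - 2.0 * np.eye(n)) / h**2
H = -0.5 * lap + np.diag(0.5 * x**2)

eps, psi = np.linalg.eigh(H)                 # used only to place the contour and as reference
P_exact = psi[:, :N] @ psi[:, :N].T

# Circle enclosing eps_1..eps_N and crossing the real axis in the middle of the gap.
gap = eps[N] - eps[N - 1]
left, right = eps[0] - 0.5 * gap, eps[N - 1] + 0.5 * gap
center, radius = 0.5 * (left + right), 0.5 * (right - left)

# Trapezoidal rule: lambda = center + radius * exp(i theta), d lambda = i radius exp(i theta) d theta,
# so the factor 1/(2 pi i) is absorbed into the weights w_l = radius * exp(i theta_l) / m.
m = 120
theta = 2.0 * np.pi * (np.arange(m) + 0.5) / m
z = center + radius * np.exp(1j * theta)
w = radius * np.exp(1j * theta) / m

I = np.eye(n)
P_pole = sum(w_l * np.linalg.inv(z_l * I - H) for z_l, w_l in zip(z, w))
pole_error = np.linalg.norm(P_pole.real - P_exact)
```

For this rule the error decays geometrically in `m`, at a rate set by how close the nearest eigenvalue lies to the circle relative to its radius.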

2.6 Extension to finite temperature


So far we have been focusing on zero temperature and thus the ground state energy
of a many-body system. At finite temperature, the quantum system minimizes the free
energy instead of the energy. As a result, states with higher energies also contribute,
and it becomes more convenient to describe the system using a (many-body) density matrix
$P^{(N)}$, where here and in what follows we use the superscript $(N)$ to emphasize
$N$-body quantities. The density matrix is a non-negative self-adjoint operator with unit
trace on the Hilbert space $\mathcal{A}_N$. In terms of the density matrix, the free energy
is given by
\[ F[P^{(N)}] = \operatorname{Tr} P^{(N)} \bigl(H + \beta^{-1} \ln P^{(N)}\bigr), \tag{2.6.1} \]
where $\beta$ is the inverse temperature, $\beta^{-1} \operatorname{Tr} P^{(N)} \ln P^{(N)}$
is the entropy term (the von Neumann entropy $-\operatorname{Tr} P^{(N)} \ln P^{(N)}$
multiplied by $-\beta^{-1}$), and $\operatorname{Tr} P^{(N)} H$ is the energy. Note that in
the limit of zero temperature, $\beta \to \infty$, the above free energy reduces to the
energy $\operatorname{Tr} P^{(N)} H$. The state of a finite temperature system is determined
by minimizing the free energy $F[P^{(N)}]$ subject to the constraints that $P^{(N)}$ is
self-adjoint, positive semidefinite, and its trace is 1.
Since $P^{(N)}$ is self-adjoint, we may write down its eigen-decomposition as
\[ P^{(N)} = \sum_i f_i |\Psi_i\rangle\langle\Psi_i|, \tag{2.6.2} \]
where $\{f_i\}$ are called occupation numbers. As $P^{(N)}$ is non-negative and
$\operatorname{Tr} P^{(N)} = 1$, we have
\[ 0 \le f_i \le 1 \quad \text{and} \quad \sum_i f_i = 1. \tag{2.6.3} \]
Using $(f_i, \Psi_i)$, the energy can be written as
\[ \operatorname{Tr} P^{(N)} H = \sum_i f_i \operatorname{Tr} |\Psi_i\rangle\langle\Psi_i| H = \sum_i f_i \langle\Psi_i|H|\Psi_i\rangle. \tag{2.6.4} \]
Similarly, the entropy term is given by
\[ \beta^{-1} \operatorname{Tr} P^{(N)} \ln P^{(N)} = \beta^{-1} \sum_i f_i \ln f_i. \tag{2.6.5} \]

Note that for the special case that $P^{(N)}$ is a projection operator associated with the
wavefunction $|\Psi\rangle$, i.e., $P^{(N)} = |\Psi\rangle\langle\Psi|$, we have
$\operatorname{Tr} P^{(N)} H = \langle\Psi|H|\Psi\rangle$ and
$\operatorname{Tr} P^{(N)} \ln P^{(N)} = 0$, and thus the free energy reduces to the energy
of the wavefunction $|\Psi\rangle$. However, the entropy term favors occupation numbers
$f_i$ strictly between 0 and 1, for which $f_i \ln f_i$ is negative. Thus the minimizing
density matrix at finite temperature is in general not a projector (for a projector, each
$f_i$ is either 0 or 1).
The Euler–Lagrange equation of the minimization of (2.6.1) is
\[ H + \beta^{-1} (\ln P^{(N)} + I) - \lambda I = 0, \tag{2.6.6} \]
where $\lambda$ is a Lagrange multiplier for the constraint $\operatorname{Tr} P^{(N)} = 1$.
Thus the minimizer is given by
\[ P_\beta^{(N)} = \frac{1}{Z} \exp(-\beta H), \tag{2.6.7} \]
where $Z = \operatorname{Tr} \exp(-\beta H)$ is a normalization constant chosen so that
$\operatorname{Tr} P_\beta^{(N)} = 1$; $Z$ is known as the partition function in physics.
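The minimizing property of (2.6.7) can be checked on a small matrix. In the sketch below the $6 \times 6$ symmetric matrix `H` is a random stand-in for the many-body Hamiltonian, and the helper `free_energy` is a hypothetical name; the comparison against random trial density matrices is a sanity check, not a proof.

```python
import numpy as np

def free_energy(P, H, beta):
    """F[P] = Tr(P H) + (1/beta) Tr(P ln P), evaluated through the eigenvalues of P."""
    f = np.linalg.eigvalsh(P)
    f = f[f > 1e-14]                       # convention: 0 ln 0 = 0
    return np.trace(P @ H) + np.sum(f * np.log(f)) / beta

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
H = 0.5 * (A + A.T)                        # random symmetric stand-in for H
beta = 2.0

# Gibbs state exp(-beta H) / Z, built from the eigen-decomposition of H.
eps, U = np.linalg.eigh(H)
w = np.exp(-beta * (eps - eps.min()))      # shift the exponent for numerical stability
P_gibbs = (U * (w / w.sum())) @ U.T
F_gibbs = free_energy(P_gibbs, H, beta)

# Substituting (2.6.7) into (2.6.1) gives F = -(1/beta) ln Z.
ln_Z = -beta * eps.min() + np.log(w.sum())
F_formula = -ln_Z / beta

# Random trial density matrices (positive semidefinite, unit trace).
trial_values = []
for _ in range(20):
    B = rng.standard_normal((6, 6))
    P_trial = B @ B.T
    P_trial /= np.trace(P_trial)
    trial_values.append(free_energy(P_trial, H, beta))
```

None of the trial values should fall below `F_gibbs`, and `F_gibbs` should agree with $-\beta^{-1} \ln Z$.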
Similarly to the zero temperature case, the idea behind finite temperature DFT is to
consider the free energy as a functional of the one-body electron density, which is given
by the density matrix as
\[ \rho(r) = N \sum_{\sigma} \int P^{(N)}\bigl((r, \sigma), x_2, \ldots, x_N; (r, \sigma), x_2, \ldots, x_N\bigr)\, dx_2 \cdots dx_N. \tag{2.6.8} \]
Following the constrained minimization procedure, we write
\[ \begin{aligned} F &= \inf_{P^{(N)} \in \mathcal{D}_N} \Bigl( \operatorname{Tr}(P^{(N)} H) + \beta^{-1} \operatorname{Tr}(P^{(N)} \ln P^{(N)}) \Bigr) \\ &= \inf_{\rho \in \mathcal{J}_N} \Bigl\{ \inf_{\substack{P^{(N)} \in \mathcal{D}_N \\ P^{(N)} \mapsto \rho}} \operatorname{Tr}(P^{(N)} H) + \beta^{-1} \operatorname{Tr}(P^{(N)} \ln P^{(N)}) \Bigr\}, \end{aligned} \tag{2.6.9} \]
where $\mathcal{D}_N$ denotes the set of $N$-body density matrices. Note that for any
$\rho \in \mathcal{J}_N$, there exists at least one $P^{(N)}$ that gives the density, since
we can take $P^{(N)} = |\Psi\rangle\langle\Psi|$ with $\Psi \in \mathcal{A}_N$ giving the
density. Thus, the constrained minimization is well defined and we can further write
\[ F = \inf_{\rho \in \mathcal{J}_N} \Bigl( F_\beta[\rho] + \int \rho V_{\mathrm{ext}}\, dr \Bigr), \tag{2.6.10} \]
where $F_\beta[\rho]$ is the universal functional that depends only on kinetic energy,
electron-electron repulsion, and entropy (hence temperature):
\[ F_\beta[\rho] = \inf_{\substack{P^{(N)} \in \mathcal{D}_N \\ P^{(N)} \mapsto \rho}} \Bigl( \operatorname{Tr} P^{(N)} (T + V_{ee}) + \beta^{-1} \operatorname{Tr}(P^{(N)} \ln P^{(N)}) \Bigr). \tag{2.6.11} \]
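To get practical approximations to $F_\beta[\rho]$, similarly to the Kohn–Sham proposal we
replace the kinetic and entropy terms by their noninteracting counterparts (the latter is
known as the electronic entropy, or Fermi–Dirac entropy [68]) and write
\[ F_\beta[\rho] = \inf_{\substack{P \in \mathcal{D} \\ P \mapsto \rho}} \Bigl( \frac{1}{2} \operatorname{Tr}\bigl[(-\Delta) P\bigr] + \beta^{-1} \operatorname{Tr}\bigl[P \ln P + (I - P) \ln(I - P)\bigr] + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|}\, dr\, dr' + E_{\mathrm{xc},\beta}[\rho] \Bigr), \tag{2.6.12} \]
where $\mathcal{D}$ is the set of all one-particle density matrices for an $N$-electron system:
\[ \mathcal{D} = \bigl\{ P \in \mathcal{B}(L^2(\mathbb{R}^3, \mathbb{C}^2)) \mid P = P^*,\ 0 \preceq P \preceq I,\ \operatorname{Tr} P = N \bigr\}, \tag{2.6.13} \]
where, for two self-adjoint operators $A$ and $B$, the notation $A \preceq B$ means that
$\langle v|A|v\rangle \le \langle v|B|v\rangle$ for any $v$. The constraint
$0 \preceq P \preceq I$ comes from the Pauli exclusion principle [18]. Let the
eigen-decomposition of $P$ be
\[ P = \sum_i f_i |\psi_i\rangle\langle\psi_i|. \tag{2.6.14} \]
The constraints $0 \preceq P \preceq I$ and $\operatorname{Tr} P = N$ imply that the
occupation numbers $\{f_i\}$ satisfy
\[ 0 \le f_i \le 1 \quad \text{and} \quad \sum_i f_i = N. \tag{2.6.15} \]
Thus no state is occupied by more than one electron.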


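We can write down the variational problem more explicitly in terms of the $(f_i, \psi_i)$’s
as
\[ F = \inf_{\substack{\{f_i\}, \{\psi_i\} \\ 0 \le f_i \le 1,\ \sum_i f_i = N \\ \langle\psi_i|\psi_j\rangle = \delta_{ij}}} F_\beta^{\mathrm{KS}}\bigl(\{f_i\}, \{\psi_i\}\bigr) \tag{2.6.16} \]
with
\[ \begin{aligned} F_\beta^{\mathrm{KS}}\bigl(\{f_i\}, \{\psi_i\}\bigr) = {} & \frac{1}{2} \sum_i f_i \int |\nabla \psi_i|^2\, dx + \beta^{-1} \sum_i \bigl[ f_i \ln f_i + (1 - f_i) \ln(1 - f_i) \bigr] \\ & + \int \rho(r) V_{\mathrm{ext}}(r)\, dr + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|}\, dr\, dr' + E_{\mathrm{xc},\beta}[\rho], \end{aligned} \tag{2.6.17} \]
and the density is given by
\[ \rho(r) = \sum_{\sigma} \sum_i f_i |\psi_i(x)|^2, \qquad x = (r, \sigma). \tag{2.6.18} \]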

In principle, the finite temperature exchange-correlation functional
$E_{\mathrm{xc},\beta}[\rho]$ depends on $\beta$ and also has a ladder of approximation
schemes like the zero temperature case. However, almost all finite temperature DFT
calculations in practice still use a temperature-independent exchange-correlation
functional. For the purpose of our discussion, since the zero and finite temperature
functionals share the same mathematical structure, we will not explicitly distinguish them
in what follows.
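We now consider minimizing the finite temperature free energy obtained by combining
(2.6.10) and (2.6.12). As the functional involves minimization with respect to the density
matrix $P$, it is more convenient to consider the combined minimization of the density and
the associated density matrix:
\[ \inf_{P \in \mathcal{D}} \Bigl( \frac{1}{2} \operatorname{Tr}\bigl[(-\Delta) P\bigr] + \beta^{-1} \operatorname{Tr}\bigl[P \ln P + (I - P) \ln(I - P)\bigr] + \int \rho(r) V_{\mathrm{ext}}(r)\, dr + \frac{1}{2} \iint \frac{\rho(r)\rho(r')}{|r - r'|}\, dr\, dr' + E_{\mathrm{xc},\beta}[\rho] \Bigr), \tag{2.6.19} \]
where $\rho$ is given by the diagonal of the density matrix $P$. Taking the variation with
respect to $P$, we obtain the Euler–Lagrange equation
\[ H_\beta^{\mathrm{KS}}[\rho] + \beta^{-1} \bigl( \ln P - \ln(I - P) \bigr) - \mu I = 0, \tag{2.6.20} \]
where $\mu$ is the Lagrange multiplier associated with the constraint
$\operatorname{Tr} P = N$ and the effective Hamiltonian is given similarly as in the zero
temperature case (cf. (2.3.11)):
\[ H_\beta^{\mathrm{KS}}[\rho] = -\frac{1}{2} \Delta + V_{\mathrm{ext}} + \int \frac{\rho(r')}{|r - r'|}\, dr' + V_{\mathrm{xc},\beta}[\rho]. \tag{2.6.21} \]
Solving (2.6.20) for $P$, we get
\[ P = \Bigl[ I + \exp\bigl( \beta (H_\beta^{\mathrm{KS}}[\rho] - \mu) \bigr) \Bigr]^{-1}. \tag{2.6.22} \]
Let $f_\beta$ be the Fermi–Dirac distribution
\[ f_\beta(\varepsilon) = \frac{1}{1 + \exp(\beta \varepsilon)}. \tag{2.6.23} \]
We arrive at the self-consistent equation
\[ P = f_\beta(H_\beta^{\mathrm{KS}}[\rho] - \mu) = \sum_i f_\beta(\varepsilon_i - \mu) |\psi_i\rangle\langle\psi_i|, \tag{2.6.24} \]
where $\varepsilon_i$ and $\psi_i$ are the eigenvalue and associated eigenfunction of
$H_\beta^{\mathrm{KS}}[\rho]$, respectively. We see that the occupation number is given by
\[ f_i = f_\beta(\varepsilon_i - \mu) = \frac{1}{1 + \exp(\beta(\varepsilon_i - \mu))}. \tag{2.6.25} \]
Thus $f_i \in (0, 1)$, so all eigenstates are fractionally occupied. Also notice that as
$\beta \to \infty$ (zero temperature limit), $f_\beta$ converges pointwise to the function
$f_\infty$:
\[ f_\infty(\varepsilon) = \begin{cases} 1, & \varepsilon < 0, \\ \frac{1}{2}, & \varepsilon = 0, \\ 0, & \varepsilon > 0. \end{cases} \tag{2.6.26} \]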

Figure 2.4. Contour integral representation and pole expansion for the density matrix at finite temperature.

Denoting by $\mu_\beta$ the Lagrange multiplier at inverse temperature $\beta$, in the
limit we can show that (as before, we assume $\varepsilon_N < \varepsilon_{N+1}$)
\[ \lim_{\beta \to \infty} \mu_\beta = \frac{1}{2} (\varepsilon_N + \varepsilon_{N+1}) =: \mu_\infty. \tag{2.6.27} \]
Since $\varepsilon_N \le \mu_\infty < \varepsilon_{N+1}$, the finite temperature density
matrix is consistent in this limit with the zero temperature definition (2.5.10), with
$\mu_\infty$ playing the role of the chemical potential.
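In practice $\mu$ is fixed by the constraint $\sum_i f_\beta(\varepsilon_i - \mu) = N$, whose left-hand side increases monotonically with $\mu$. The following sketch finds $\mu_\beta$ by bisection for a made-up set of eigenvalues and illustrates the limit (2.6.27). The function names, the spectrum `eps`, and the bracketing offsets are assumptions for illustration only.

```python
import numpy as np

def fermi_dirac(x, beta):
    """f_beta(x) = 1 / (1 + exp(beta x)), written with tanh to avoid overflow."""
    return 0.5 * (1.0 - np.tanh(0.5 * beta * x))

def chemical_potential(eps, N, beta, iters=200):
    """Bisection on mu for sum_i f_beta(eps_i - mu) = N (requires N < len(eps))."""
    lo = eps.min() - 40.0 / beta           # total occupation is essentially 0 here
    hi = eps.max() + 40.0 / beta           # total occupation is essentially len(eps) here
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fermi_dirac(eps - mid, beta).sum() < N:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps = np.array([0.0, 0.3, 1.0, 1.2, 3.0, 3.5, 7.0])   # made-up eigenvalues
N = 4
mu_inf = 0.5 * (eps[N - 1] + eps[N])                   # midpoint of the gap

betas = [2.0, 5.0, 10.0, 20.0]
mus = [chemical_potential(eps, N, b) for b in betas]
occ = fermi_dirac(eps - mus[-1], betas[-1])            # occupations at the largest beta
```

For much larger $\beta$ the occupation sum becomes flat in $\mu$ to within rounding error across the gap, so a bisection of this kind can no longer resolve $\mu_\beta$; this reflects the fact that at zero temperature any $\mu$ in the gap is admissible.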
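Similarly to the zero temperature case, we may also represent the density matrix at
finite temperature using Green’s functions. Let $\mathcal{C}$ be a contour close to the
real line enclosing the entire spectrum of $H$. By the Cauchy integral formula, we obtain
\[ \frac{1}{2\pi i} \oint_{\mathcal{C}} f_\beta(\lambda - \mu)(\lambda - H)^{-1}\, d\lambda = \frac{1}{2\pi i} \oint_{\mathcal{C}} \sum_i f_\beta(\lambda - \mu)(\lambda - \varepsilon_i)^{-1} |\psi_i\rangle\langle\psi_i|\, d\lambda = \sum_i f_\beta(\varepsilon_i - \mu) |\psi_i\rangle\langle\psi_i| = P, \tag{2.6.28} \]
where we have used the property that the Fermi–Dirac function $f_\beta(z)$ (extended to the
complex plane) is a meromorphic function. It only has simple poles at
$z = (2k+1) i\pi/\beta$ with $k \in \mathbb{Z}$ and is analytic everywhere else, so
(2.6.28) holds provided that $\mathcal{C}$ encloses none of the poles
$\lambda = \mu + (2k+1) i\pi/\beta$ of $f_\beta(\lambda - \mu)$. These poles are called the
Matsubara frequencies.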
The contour integral formalism also provides a viable way of obtaining a numerical
approximation of the density matrix. The discretized contour integral gives the pole
expansion in the finite temperature case and takes the same form as in (2.5.15), but with
different choices of quadrature nodes and weights (Figure 2.4). We note that in the finite
temperature case the contour integral formulation remains well defined in the gapless
case, i.e., εN = εN +1 . The pole expansion can be obtained efficiently semi-analytically
using, e.g., complex analysis techniques [54]. The computation of Green’s function will
be further discussed in section 2.7.

2.7 Density matrix algorithms


In this section we discuss how to evaluate the Kohn–Sham map
$\rho = F_{\mathrm{KS}}[V_{\mathrm{eff}}]$ for some given effective potential
$V_{\mathrm{eff}}(r)$. Since the Kohn–Sham Hamiltonian operator is defined on the infinite
dimensional space $L^2(\mathbb{R}^3)$, it should be first discretized. Many discretization
schemes can be characterized by a finite dimensional basis set
$\{\phi_j(r)\}_{j=1}^{N_b}$ so that
$\operatorname{span}\{\phi_j(r)\}_{j=1}^{N_b} \subset L^2(\mathbb{R}^3)$. This basis set can
also be written in a matrix form as
\[ \Phi(r) = [\phi_1(r), \ldots, \phi_{N_b}(r)]. \]

The discretized Hamiltonian matrix is a matrix of size $N_b \times N_b$, defined as
\[ H_{ij} = \int \phi_i^*(r) (\hat{H} \phi_j)(r)\, dr, \tag{2.7.1} \]
or compactly written in the linear algebra form as $H = \Phi^* \hat{H} \Phi$. Here we use
the notation $\hat{H}$ for the Hamiltonian as an operator to distinguish it from the finite
dimensional matrix $H$. The overlap matrix is defined as
\[ S_{ij} = \int \phi_i^*(r) \phi_j(r)\, dr, \tag{2.7.2} \]
or $S = \Phi^* \Phi$.
In the discussion below, one particularly convenient discretization scheme is the
real space discretization, where each basis function $\phi_i(r)$ can be associated with a
point of a spatial grid $\{r_i\}_{i=1}^{N_b}$, and physical quantities such as the electron
density can be represented on the same set of grid points. Furthermore, we require that the
basis functions are orthonormal, i.e., $S = I$ is the identity matrix. In particular, in
the literature of electronic structure theory, the finite difference discretization of the
Laplacian operator (and hence the Hamiltonian) is often referred to loosely as a real
space discretization scheme. One particular advantage of real space discretization is
that the electron density is given directly by the diagonal elements of the density matrix,
which simplifies the introduction of the numerical methods below.
In an orthonormal basis set, the discretized Kohn–Sham equation is
\[ \sum_j H_{ij} C_{j,k} = C_{i,k} \varepsilon_k, \tag{2.7.3} \]
where $C_k = (C_{1,k}, C_{2,k}, \ldots, C_{N_b,k})^{\top}$ is the coefficient vector for the
$k$th eigenfunction. Hence (2.7.3) can be solved using an eigensolver. In order to solve
Kohn–Sham DFT, we do not need all the eigenvectors, but only those corresponding to the $N$
smallest eigenvalues. When $N_b$ is small (such as when Gaussian-type orbitals or numerical
atomic orbitals are used), one can treat $H$ as a dense matrix and find all the eigenvalues
and eigenvectors using software packages such as LAPACK or ScaLAPACK. When $N_b$ is large
(such as when planewaves or finite elements are used), $H$ must be treated as a sparse
matrix and iterative methods should be used to compute only the eigenpairs needed to
construct the density matrix. Examples of iterative eigensolvers include conjugate
gradient-type methods [3, 44] and the Davidson method [22]. Using the eigenvectors of
(2.7.3), the Kohn–Sham orbitals can be reconstructed as
\[ \psi_k(r) = \sum_j \phi_j(r) C_{j,k}, \tag{2.7.4} \]
and the electron density is given accordingly. The partial diagonalization procedure in
(2.7.3) is the most straightforward and also the most widely used method for evaluating
the Kohn–Sham map $\rho = F_{\mathrm{KS}}[V_{\mathrm{eff}}]$. Since the matrix $C$ must
consist of orthonormal columns, the orthogonalization step will cost at least
$O(N_b N^2)$, regardless of the eigensolver used. Since the number of basis functions $N_b$
should scale linearly with respect to the number of electrons $N$, the complexity is cubic
with respect to $N$. The cubic scaling is manageable for small systems, but becomes
prohibitively expensive when solving Kohn–Sham DFT for large systems.
Since the diagonalization-type methods are widely used in practically all electronic
structure software packages, it is often assumed that solving the eigenvalue problem of
type (2.7.3) is a necessary step for solving Kohn–Sham DFT. However, according to
the discussion in section 2.4, what is really needed is the evaluation of the Kohn–Sham
map, which maps the effective potential Veff to an output electron density ρ. Hence the
diagonalization method is only one possibility for evaluating the Kohn–Sham map.
In this section, we focus on algorithms for solving Kohn–Sham DFT that directly
evaluate the Kohn–Sham map without obtaining any eigenvalues or eigenfunctions as
intermediate quantities. These algorithms often involve the direct approximation of
the density matrix, and hence we refer to these algorithms as density matrix algorithms.
While the computational complexity of diagonalization-type algorithms is always $O(N^3)$
with respect to the number of electrons $N$, density matrix algorithms offer the
possibility of significantly reducing the complexity. We illustrate density matrix
algorithms using two representative examples: the density matrix purification method and
the Fermi operator expansion method.

Density matrix purification


The density matrix purification algorithm is a method for obtaining the density matrix
in the case of zero temperature. Recall that the density matrix is the projection operator
onto the subspace spanned by the occupied orbitals; hence the density matrix is
idempotent, i.e.,
\[ P = P^2. \tag{2.7.5} \]
Let us first consider the following question: given a Hermitian matrix with eigenvalues
close to 0 and 1 (but not exactly 0 and 1), how can we make it idempotent? One strategy is
McWeeny’s purification [63], which recursively applies the function
$f_{\mathrm{McW}}(x) = 3x^2 - 2x^3$ to the matrix, starting from the initial self-adjoint
matrix $P_0$:
\[ P_{n+1} = f_{\mathrm{McW}}(P_n) = 3P_n^2 - 2P_n^3. \tag{2.7.6} \]
Let us see why McWeeny’s purification works. Note that the iterate stays self-adjoint
during the iteration and the eigenvectors remain the same, so it suffices to keep track of
the eigenvalues. Considering a specific eigenvalue $\lambda_0$ of $P_0$, we have
\[ \lambda_{n+1} = f_{\mathrm{McW}}(\lambda_n) = 3\lambda_n^2 - 2\lambda_n^3. \tag{2.7.7} \]
The fixed points of this map are given by the solutions to
\[ x = f_{\mathrm{McW}}(x) = 3x^2 - 2x^3, \tag{2.7.8} \]
whose three roots are $0$, $\frac{1}{2}$, and $1$. We calculate the derivative of
$f_{\mathrm{McW}}$ and get
\[ f_{\mathrm{McW}}'(x) = 6x - 6x^2 = \begin{cases} 0, & x = 0, \\ \frac{3}{2}, & x = \frac{1}{2}, \\ 0, & x = 1. \end{cases} \tag{2.7.9} \]

Figure 2.5. Shape of $f_{\mathrm{McW}}$ for the McWeeny purification.

Therefore, 0 and 1 are the stable fixed points while $\frac{1}{2}$ is unstable. We also see
that
\[ f_{\mathrm{McW}}(x) < x \ \text{ for } x \in (0, \tfrac{1}{2}), \qquad f_{\mathrm{McW}}(x) > x \ \text{ for } x \in (\tfrac{1}{2}, 1). \]
Hence, the iteration converges to 0 if the initial condition lies in $[0, \frac{1}{2})$,
while it converges to 1 if the initial condition lies in $(\frac{1}{2}, 1]$; see Figure 2.5.
In fact, the iteration converges to 0 starting from any point in
$(\frac{1 - \sqrt{3}}{2}, \frac{1}{2})$ and to 1 starting from any point in
$(\frac{1}{2}, \frac{1 + \sqrt{3}}{2})$, since $f_{\mathrm{McW}}$ maps these two intervals
into $[0, \frac{1}{2})$ and $(\frac{1}{2}, 1]$, respectively.
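The scalar fixed-point iteration (2.7.7) is easy to try out directly. The short sketch below is illustrative only; the function name `mcweeny_scalar`, the number of steps, and the chosen starting values are assumptions.

```python
def mcweeny_scalar(x, steps=60):
    """Iterate x <- 3 x^2 - 2 x^3 a fixed number of times."""
    for _ in range(steps):
        x = 3.0 * x * x - 2.0 * x**3
    return x

# Starting values on either side of the unstable fixed point 1/2,
# including two values slightly outside the interval [0, 1].
limits = {x0: mcweeny_scalar(x0) for x0 in (-0.3, 0.1, 0.3, 0.5, 0.7, 0.9, 1.3)}
```

The starting value `0.5` stays at the unstable fixed point, because `3 * 0.25 - 2 * 0.125` is evaluated exactly in floating point; any small perturbation of it would be amplified by the factor $\frac{3}{2}$ per step.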
Given this, let us now consider how to use the purification to get the density matrix
\[ P = \mathbf{1}_{(-\infty, 0]}(H - \mu). \tag{2.7.10} \]
Note that $P$ shares the same eigenfunctions as $H$; thus this fits into the framework of
purification. We want to make all the eigenvalues of $H$ below the chemical potential
$\mu$ converge to 1 and all the eigenvalues of $H$ above it converge to 0. Therefore, we
would like to start with an initial guess that suitably rescales the matrix $\mu - H$.
Assuming $\operatorname{spec}(H) \subset [\varepsilon_{\min}, \varepsilon_{\max}]$, we take
\[ P_0 = \frac{\alpha}{2} (\mu - H) + \frac{1}{2} I, \tag{2.7.11} \]
where
\[ \alpha = \min\bigl\{ (\varepsilon_{\max} - \mu)^{-1},\ (\mu - \varepsilon_{\min})^{-1} \bigr\}. \tag{2.7.12} \]
The density matrix is then approximated by
\[ P \approx P_m = \underbrace{f_{\mathrm{McW}} \circ f_{\mathrm{McW}} \circ \cdots \circ f_{\mathrm{McW}}}_{m}(P_0). \tag{2.7.13} \]
We realize that this is also a polynomial approximation. Unlike standard polynomial
approximation, such as Chebyshev approximation, the approximation in the purification
is recursive. As a result, the degree of the polynomial is much higher for the same cost.
For example, if $m$ iteration steps are used, the highest degree monomial is $x^{3^m}$,
while only $2m$ matrix-matrix multiplications are needed. This is useful since our goal is
the polynomial approximation of a Heaviside function (Fermi–Dirac function at zero
temperature); hence a high-order polynomial is required to reduce the effect of the Gibbs
phenomenon.
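The procedure (2.7.11)–(2.7.13) can be sketched for a small dense matrix as follows. The toy Hamiltonian is an assumed one-dimensional finite-difference model, and the exact eigenvalues are used to set $\mu$, $\varepsilon_{\min}$ and $\varepsilon_{\max}$ purely for illustration; in a practical calculation one would rely on estimated spectral bounds and sparse matrix arithmetic instead.

```python
import numpy as np

# Assumed toy model: 1D finite-difference Hamiltonian with a harmonic potential.
n, N, h = 60, 4, 0.2
x = h * (np.arange(n) - n / 2)
lap = (np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1) - 2.0 * np.eye(n)) / h**2
H = -0.5 * lap + np.diag(0.5 * x**2)

eps, psi = np.linalg.eigh(H)                 # reference only (also supplies mu and bounds)
P_exact = psi[:, :N] @ psi[:, :N].T
mu = 0.5 * (eps[N - 1] + eps[N])
eps_min, eps_max = eps[0], eps[-1]

# Initial guess (2.7.11)-(2.7.12): eigenvalues of P_0 lie in [0, 1], with mu mapped to 1/2.
alpha = min(1.0 / (eps_max - mu), 1.0 / (mu - eps_min))
I = np.eye(n)
P = 0.5 * alpha * (mu * I - H) + 0.5 * I

# McWeeny iteration (2.7.6): two matrix-matrix multiplications per step.
for _ in range(40):
    P2 = P @ P
    P = 3.0 * P2 - 2.0 * P2 @ P

purification_error = np.linalg.norm(P - P_exact)
trace_P = np.trace(P)
```

The number of steps needed grows as the gap shrinks relative to the spectral width, because eigenvalues of $P_0$ close to $\frac{1}{2}$ move away from it only by a factor of about $\frac{3}{2}$ per step.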

Now we demonstrate that McWeeny’s purification method can be understood from
the Newton–Schulz algorithm for the matrix sign function [36].
The sgn function is defined by
\[ \operatorname{sgn}(x) = \begin{cases} 1, & x > 0, \\ 0, & x = 0, \\ -1, & x < 0. \end{cases} \tag{2.7.14} \]
For $x \neq 0$, the sign function and $\mathbf{1}_{(-\infty, 0]}$ are directly related as
\[ \mathbf{1}_{(-\infty, 0]}(x) = \frac{1}{2} \bigl(1 - \operatorname{sgn}(x)\bigr). \tag{2.7.15} \]
For a Hermitian matrix $A$ with spectrum in $(-1, 1)$, the matrix sign function
$\operatorname{sgn}(A)$ can be computed using Newton’s iteration, applied to finding the
root of
\[ g(X) = X^2 - I \tag{2.7.16} \]
with the initial guess $X_0 = A$. This results in the iterative scheme
\[ X_{k+1} = X_k - g'(X_k)^{-1} g(X_k) = \frac{1}{2} (X_k + X_k^{-1}). \tag{2.7.17} \]
Equation (2.7.17) still requires computation of the matrix inverse $X_k^{-1}$, which may
not be desirable for matrices of large size. The Schulz method applies another Newton’s
iteration to compute $Y = X^{-1}$ by finding the root of
\[ h(Y) = Y^{-1} - X. \tag{2.7.18} \]
Starting from $Y_0 = X$, the Schulz iteration computes $X^{-1}$ using the iterative scheme
\[ Y_{k+1} = Y_k - h'(Y_k)^{-1} h(Y_k) = 2Y_k - X Y_k^2. \tag{2.7.19} \]
Combining one step of Schulz iteration (2.7.19) with (2.7.17), we arrive at the
Newton–Schulz algorithm for the matrix sign function:
\[ X_{k+1} = \frac{1}{2} (X_k + 2X_k - X_k X_k^2) = \frac{1}{2} X_k (3I - X_k^2). \tag{2.7.20} \]
Then, using the relation (2.7.15), i.e., setting $X_k = I - 2P_k$ with
$X_0 = \alpha (H - \mu)$, the initial guess for McWeeny’s purification becomes
\[ P_0 = \frac{1}{2} I - \frac{\alpha}{2} (H - \mu) \]
with the iteration
\[ I - 2P_{k+1} = \frac{1}{2} (I - 2P_k) \bigl( 3I - (I - 2P_k)^2 \bigr), \tag{2.7.21} \]
or equivalently
\[ P_{k+1} = 3P_k^2 - 2P_k^3. \tag{2.7.22} \]
This is exactly McWeeny’s purification method.
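The equivalence can be verified numerically. The sketch below runs the Newton–Schulz iteration (2.7.20) on a small symmetric matrix with a prescribed spectrum in $(-1, 1)$ and checks that one Newton–Schulz step, translated through $P = \frac{1}{2}(I - X)$, coincides with one McWeeny step. The test matrix and the function name `newton_schulz_sign` are illustrative assumptions.

```python
import numpy as np

def newton_schulz_sign(A, steps=40):
    """Newton-Schulz iteration X <- X (3 I - X^2) / 2, started from X_0 = A."""
    X = A.copy()
    I = np.eye(A.shape[0])
    for _ in range(steps):
        X = 0.5 * X @ (3.0 * I - X @ X)
    return X

# Symmetric test matrix with prescribed eigenvalues in (-1, 1), none equal to 0.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
d = np.array([-0.9, -0.5, -0.2, 0.1, 0.4, 0.8])
A = (Q * d) @ Q.T
sign_exact = (Q * np.sign(d)) @ Q.T
sign_ns = newton_schulz_sign(A)

# One Newton-Schulz step in the variable X = I - 2P versus one McWeeny step in P.
I6 = np.eye(6)
P = 0.5 * (I6 - A)
X_next = 0.5 * A @ (3.0 * I6 - A @ A)
P_next_from_sign = 0.5 * (I6 - X_next)
P_next_mcweeny = 3.0 * P @ P - 2.0 * P @ P @ P
```

The two updates agree up to rounding error, which is the content of (2.7.21) and (2.7.22).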
Density matrix-based algorithms provide a possible means of reducing the computational
complexity for the evaluation of the Kohn–Sham map. Note that in McWeeny’s
purification method, we only need to perform matrix-matrix multiplication operations.
Hence if each matrix can be approximated by a sparse matrix, the computational cost
may be significantly reduced. This is indeed the case for systems with a finite gap,
where the magnitude of the elements of the density matrix decays rapidly away from the
diagonal. This is often referred to as the near-sightedness principle in the physics
literature [45]. Mathematically, the near-sightedness principle can be explained
using decay properties of Green’s functions, which we shall discuss in section 3.5.
For systems obeying the near-sightedness principle, the number of nonzero ele-
ments of the density matrix only increases linearly with respect to N . Therefore, with
proper implementation, all matrix-matrix multiplication operations can be carried out
with O(N ) cost, which leads to a linear scaling method. There is a rich body of litera-
ture on linear scaling algorithms that has been developed in the past two decades and is
still being actively developed today. We refer readers to the review papers [30, 10] on
this topic.

Fermi operator expansion


Unlike the density matrix purification, Fermi operator expansion (FOE) is a method for
directly evaluating the density matrix at finite temperature,
\[ P = f_\beta(H - \mu). \tag{2.7.23} \]
The right-hand side is a matrix function of the Hamiltonian matrix $H$. Instead of
diagonalizing $H$ and evaluating the matrix function using the eigen-decomposition, the
basic idea of FOE is to expand the Fermi–Dirac function $f_\beta(\cdot)$ into an $m$-term
expansion as
\[ f_\beta(\varepsilon) \approx f_{\beta,m}(\varepsilon) = \sum_{n=1}^{m} g_n(\varepsilon). \tag{2.7.24} \]
The corresponding matrix function approximation is
\[ f_\beta(H - \mu) \approx f_{\beta,m}(H - \mu) = \sum_{n=1}^{m} g_n(H - \mu). \tag{2.7.25} \]
The above formulation is general, and we only require each term $g_n$ to be a simple
function so that the corresponding matrix function $g_n(H - \mu)$ can be evaluated directly
without diagonalizing the matrix. For instance, $g_n$ can be chosen to be a polynomial
function or a rational function.
The approximation error of the matrix form (2.7.25) is directly related to that of the
scalar form (2.7.24). Given the diagonalized form
\[ H|\psi_i\rangle = \varepsilon_i |\psi_i\rangle, \tag{2.7.26} \]
we have, for any ket vector $|v\rangle$,
\[ \bigl( f_\beta(H - \mu) - f_{\beta,m}(H - \mu) \bigr) |v\rangle = \sum_i \bigl( f_\beta(\varepsilon_i - \mu) - f_{\beta,m}(\varepsilon_i - \mu) \bigr) |\psi_i\rangle\langle\psi_i|v\rangle. \tag{2.7.27} \]
Thus, for any vector $|v\rangle$,
\[ \begin{aligned} \bigl\| \bigl( f_\beta(H - \mu) - f_{\beta,m}(H - \mu) \bigr) |v\rangle \bigr\|_2^2 &= \sum_i |f_\beta(\varepsilon_i - \mu) - f_{\beta,m}(\varepsilon_i - \mu)|^2\, |\langle\psi_i|v\rangle|^2 \\ &\le \sup_i |f_\beta(\varepsilon_i - \mu) - f_{\beta,m}(\varepsilon_i - \mu)|^2 \sum_i |\langle\psi_i|v\rangle|^2 \\ &\le \sup_{\varepsilon \in \operatorname{spec}(H - \mu)} |f_\beta(\varepsilon) - f_{\beta,m}(\varepsilon)|^2\, \|v\|_2^2. \end{aligned} \tag{2.7.28} \]
The inequality above can be rewritten as
\[ \| f_\beta(H - \mu) - f_{\beta,m}(H - \mu) \|_2 \le \| f_\beta(\cdot) - f_{\beta,m}(\cdot) \|_\infty, \tag{2.7.29} \]
where the left-hand side is the operator norm for matrices, and the right-hand side is
the $L^\infty$ norm for scalar functions, taken over a set containing the spectrum of
$H - \mu$. Thus the error of the FOE (2.7.25) will be small as long as the error of the
corresponding scalar expansion (2.7.24) is small.
One example of FOE is the expansion of the Fermi–Dirac function into polynomials:
\[ f_\beta(\varepsilon) \approx \sum_{n=1}^{m} c_n \varepsilon^{n-1}. \tag{2.7.30} \]
The corresponding matrix function version is
\[ f_\beta(H - \mu) \approx \sum_{n=1}^{m} c_n (H - \mu)^{n-1}. \tag{2.7.31} \]
Note that each term of (2.7.31) is simply a matrix power $(H - \mu)^{n-1}$, which can be
evaluated recursively using only matrix-matrix multiplications, without diagonalizing
the matrix $H$. In order to implement the FOE (2.7.31) for a high-order polynomial, it is
more efficient and stable to expand $f_\beta$ using Chebyshev polynomials.
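A Chebyshev version of the FOE can be sketched with NumPy as follows. The toy Hamiltonian (a one-dimensional nearest-neighbor hopping chain), the values of `beta`, `mu`, and the polynomial degree are illustrative assumptions; the scalar Chebyshev coefficients are obtained with `numpy.polynomial.Chebyshev.interpolate`, and the matrix polynomial is evaluated through the three-term recurrence $T_{k+1}(X) = 2X T_k(X) - T_{k-1}(X)$.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def fermi_dirac(x, beta):
    """f_beta(x) = 1 / (1 + exp(beta x)), written with tanh to avoid overflow."""
    return 0.5 * (1.0 - np.tanh(0.5 * beta * x))

# Assumed toy model: 1D hopping chain, whose spectrum lies inside (-2, 2).
n, beta, mu, deg = 40, 5.0, 0.3, 80
H = -(np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1))
a, b = -2.0, 2.0                               # spectral bounds from Gershgorin's theorem

# Scalar Chebyshev interpolant of eps -> f_beta(eps - mu) on [a, b].
cheb = Chebyshev.interpolate(lambda e: fermi_dirac(e - mu, beta), deg, domain=[a, b])
c = cheb.coef

# Matrix evaluation: map H to X with spectrum in [-1, 1], then use the recurrence.
X = (2.0 * H - (a + b) * np.eye(n)) / (b - a)
T_prev, T_curr = np.eye(n), X
P_foe = c[0] * T_prev + c[1] * T_curr
for ck in c[2:]:
    T_prev, T_curr = T_curr, 2.0 * X @ T_curr - T_prev
    P_foe += ck * T_curr

# Reference via diagonalization, and the two sides of the bound (2.7.29).
eps, psi = np.linalg.eigh(H)
P_exact = (psi * fermi_dirac(eps - mu, beta)) @ psi.T
op_err = np.linalg.norm(P_foe - P_exact, 2)
grid = np.linspace(a, b, 4001)
sup_err = np.max(np.abs(cheb(grid) - fermi_dirac(grid - mu, beta)))
```

The degree needed for a given accuracy grows with $\beta$ times the spectral width, since the poles of the Fermi–Dirac function approach the real axis as $\beta$ increases.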
Aside from polynomial expansion, another option is to approximate the Fermi–Dirac function using rational functions. Rational functions can consist of terms with simple poles of the form $(\varepsilon - z)^{-1}$, as well as higher-order poles of the form $(\varepsilon - z)^{-n}$ with $n > 1$. It turns out that the simple pole expansion, or just the pole expansion as in (2.5.15), achieves the best balance between efficiency and accuracy for approximating meromorphic matrix functions such as the Fermi–Dirac function [54].
Each term in the pole expansion corresponds to a matrix inverse, or Green's function $(z_l - H)^{-1}$, which can be evaluated directly without diagonalizing the matrix $H$. Equation (2.5.15) converts the problem of computing $P$ to the problem of computing $m$ Green's functions. In order to find the Kohn–Sham map, we do not need the entire density matrix $P$ but only the electron density, which corresponds to the diagonal of $P$ (again for simplicity we assume that the real space discretization is used). This amounts to the question of finding the diagonal of a Green's function. Note that even if $H$ is a sparse matrix, the matrix inverse $G_l = (z_l - H)^{-1}$ can be a fully dense matrix. One direct method is to first evaluate each Green's function and extract its diagonal elements. However, when $H$ is a sparse matrix, the computation of diagonal entries, and, more generally, the entries of $G_l$ corresponding to the sparsity pattern of $H$, can be evaluated much more efficiently by means of the selected inversion method [24, 53, 56, 38].
For simplicity, let $A$ be a sparse, symmetric (real or complex), and nonsingular matrix. The standard approach for computing $A^{-1}$ is to first decompose $A$ as

$$ A = LDL^\top, \tag{2.7.32} $$

where $L$ is a unit lower triangular matrix and $D$ is a diagonal or a block-diagonal matrix. Equation (2.7.32) is often known as the $LDL^\top$ factorization of $A$. Given such a factorization, one can obtain $A^{-1} = (x_1, x_2, \ldots, x_n)$ by solving a number of triangular systems

$$ Ly = e_j, \qquad Dw = y, \qquad L^\top x_j = w \tag{2.7.33} $$

for $j = 1, 2, \ldots, n$, where $e_j$ is the $j$th column of the identity matrix $I$. The computational cost of such an algorithm is generally $O(n^3)$, with $n$ being the dimension of $A$. However, when $A$ is sparse, we can exploit the sparsity structure of $L$ and $e_j$ to reduce the complexity of computing selected components of $A^{-1}$.
The selected inversion algorithm can be heuristically understood as follows. Let $A$ be partitioned into the $2 \times 2$ matrix block form

$$ A = \begin{pmatrix} \alpha & b^\top \\ b & \widetilde{A} \end{pmatrix}. \tag{2.7.34} $$

The first step of an $LDL^\top$ factorization produces a decomposition of $A$ that can be expressed by

$$ A = \begin{pmatrix} 1 & \\ \ell & I \end{pmatrix} \begin{pmatrix} \alpha & \\ & \widetilde{A} - bb^\top/\alpha \end{pmatrix} \begin{pmatrix} 1 & \ell^\top \\ & I \end{pmatrix}, \tag{2.7.35} $$

where $\alpha$ is often referred to as a pivot, $\ell = b/\alpha$, and $S = \widetilde{A} - bb^\top/\alpha$ is known as the Schur complement. The same type of decomposition can be applied recursively to the Schur complement $S$ until its dimension becomes 1. The product of the lower triangular matrices produced from the recursive procedure, which all have the form

$$ \begin{pmatrix} I & & \\ & 1 & \\ & \ell^{(i)} & I \end{pmatrix}, $$

where $\ell^{(1)} = \ell = b/\alpha$, yields the final $L$ factor. At this last step the matrix in the middle, which is the $D$ matrix, becomes diagonal.
From (2.7.35), $A^{-1}$ can be expressed by

$$ A^{-1} = \begin{pmatrix} \alpha^{-1} + \ell^\top S^{-1}\ell & -\ell^\top S^{-1} \\ -S^{-1}\ell & S^{-1} \end{pmatrix}. \tag{2.7.36} $$

This expression suggests that once $\alpha$ and $\ell$ are known, the task of computing $A^{-1}$ can be reduced to that of computing $S^{-1}$. Because a sequence of Schur complements is produced recursively in the $LDL^\top$ factorization of $A$, the computation of $A^{-1}$ can be organized in a recursive fashion as well. Clearly, the reciprocal of the last entry of $D$ is the $(n, n)$th entry of $A^{-1}$. Starting from this entry, which is also the inverse of the $1 \times 1$ Schur complement produced in the $(n-1)$th step of the $LDL^\top$ factorization procedure, we can construct the inverse of the $2 \times 2$ Schur complement produced at the $(n-2)$th step of the factorization procedure using the recipe given by (2.7.36). This $2 \times 2$ matrix is the trailing $2 \times 2$ block of $A^{-1}$. As we proceed from the lower right corner of $L$ and $D$ towards their upper left corner, more and more elements of $A^{-1}$ are recovered.

The pole expansion and selected inversion (PEXSI) method [54, 56, 38] combines the pole expansion and the selected inversion and evaluates the Kohn–Sham map without computing any eigenvalues or eigenfunctions. The selected inversion method is an exact method if exact arithmetic is used, i.e., the only error in the selected inversion method is due to round-off errors. Hence the accuracy of the PEXSI method is determined by the pole expansion, which can be systematically improved by increasing the number of poles $m$. The computational scaling of PEXSI is only related to the number of nonzero elements in the Cholesky factor $L$. More specifically, the complexity is $O(N)$ for quasi-one-dimensional systems (such as nanotubes), $O(N^{1.5})$ for quasi-two-dimensional systems (such as surfaces), and $O(N^2)$ for three-dimensional bulk systems.
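The bottom-up recursion based on (2.7.36) can be illustrated with a short dense sketch. It is not the selected inversion algorithm itself, since it keeps every entry of the inverse rather than only those on the sparsity pattern of $L$; it only shows how the inverse is rebuilt from the pivots and the vectors $\ell^{(i)}$. The function name `inverse_by_schur_recursion` and the test matrix are assumptions made for illustration, and no pivoting is performed, so all pivots are assumed nonzero.

```python
import numpy as np

def inverse_by_schur_recursion(A):
    """Dense inverse of a symmetric matrix A via the recursion (2.7.36).

    Forward sweep: unpivoted LDL^T elimination, storing each pivot alpha and ell = b / alpha.
    Backward sweep: start from the inverse of the last 1x1 Schur complement and grow it
    one row and column at a time, from the lower right corner to the upper left.
    """
    S = np.array(A, dtype=float)
    n = S.shape[0]
    pivots, ells = [], []
    for _ in range(n - 1):
        alpha = S[0, 0]
        b = S[1:, 0]
        pivots.append(alpha)
        ells.append(b / alpha)
        S = S[1:, 1:] - np.outer(b, b) / alpha   # Schur complement
    S_inv = 1.0 / S                               # S is 1x1 here: last entry of D
    for alpha, ell in zip(reversed(pivots), reversed(ells)):
        v = S_inv @ ell                           # S^{-1} ell (S^{-1} is symmetric)
        top_left = 1.0 / alpha + ell @ v
        S_inv = np.block([[np.array([[top_left]]), -v[None, :]],
                          [-v[:, None], S_inv]])
    return S_inv

# Small symmetric positive definite tridiagonal test matrix.
n = 6
A_demo = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A_inv = inverse_by_schur_recursion(A_demo)
print(np.max(np.abs(A_inv - np.linalg.inv(A_demo))))
```

In a sparse setting, the product `S_inv @ ell` only touches the rows where `ell` is nonzero, which is what allows the selected entries of the inverse to be computed without forming the full matrix.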

2.8 Brillouin zone sampling for periodic systems


Kohn–Sham DFT can also be applied to crystal systems with periodic potentials. In this section, we discuss the issues related to sampling the Brillouin zone (known as k-point sampling). Similar to the notation used so far, we use $\mathbf r \in \mathbb R^3$ as the variable in the real space and $\mathbf k \in \mathbb R^3$ as the variable in the reciprocal space. For simplicity we consider spin-less particles only in this section. Following the discussion in section 1.4, we denote by $\mathbb L$ the Bravais lattice and by $\mathbb L^*$ its dual lattice in the reciprocal space. Let $\Omega$, $\Omega^*$ be the unit cell in the Bravais lattice and the first Brillouin zone, respectively. The volumes of $\Omega$, $\Omega^*$ are denoted by $|\Omega|$, $|\Omega^*|$, which are connected according to

$$ |\Omega^*| = \frac{(2\pi)^3}{|\Omega|}. $$

Supercell formulation
Although the size of a periodic system is infinite, it is in fact very helpful, at least heuristically, to think that a periodic system is nothing other than a "giant molecule." In other words, we may approximate the periodic system by a finite-sized system with $N_1^\ell, N_2^\ell, N_3^\ell$ unit cells along the Bravais lattice vectors $\mathbf a_1, \mathbf a_2, \mathbf a_3$, respectively. This fictitious system is often called a supercell, denoted by $\Omega^\ell$. The total number of unit cells in the supercell is thus

$$ N^\ell = N_1^\ell \times N_2^\ell \times N_3^\ell. $$

Assume the number of electrons per unit cell is $N$, so that the total number of electrons in the supercell is $NN^\ell$. Then Kohn–Sham DFT can be expressed using $NN^\ell$ single particle orbitals $\{\psi_i\}_{i=1}^{NN^\ell}$ satisfying the orthonormality condition. Each orbital should satisfy a periodic boundary condition on the supercell, i.e.,

$$ \psi_i(\mathbf r + \mathbf a_\alpha N_\alpha^\ell) = \psi_i(\mathbf r) \quad \forall\, \mathbf r \in \Omega^\ell, \quad \alpha = 1, 2, 3. $$

This particular periodic boundary condition is called the Born–von Karman boundary condition. Correspondingly the electron density $\rho(\mathbf r)$ can also be periodically extended to $\mathbb R^3$. Hence the Kohn–Sham energy functional over the supercell can be written as before:

$$
\begin{aligned}
E^\ell[\{\psi_i\}] ={}& \frac12 \int_{\Omega^\ell} \sum_i |\nabla\psi_i(\mathbf r)|^2 \, d\mathbf r + \int_{\Omega^\ell} V_{\mathrm{ext}}(\mathbf r)\rho(\mathbf r) \, d\mathbf r \\
&+ \frac12 \int_{\Omega^\ell}\int_{\mathbb R^3} \frac{\rho(\mathbf r)\rho(\mathbf r')}{|\mathbf r - \mathbf r'|} \, d\mathbf r \, d\mathbf r' + E^\ell_{\mathrm{xc}}[\rho] + E^\ell_{II}[\{\mathbf R_I\}].
\end{aligned}
\tag{2.8.1}
$$

Here the exchange-correlation energy functional and the nuclei repulsion energy also carry the superscript $\ell$ to reflect the fact that they are defined with respect to the supercell. The ground state energy in the supercell can be obtained variationally as

$$
\begin{aligned}
\min_{\{\psi_i\}_{i=1}^{NN^\ell}} \quad & E^\ell[\{\psi_i\}] \\
\text{s.t.} \quad & \int_{\Omega^\ell} \psi_i^*(\mathbf r)\psi_j(\mathbf r) \, d\mathbf r = \delta_{i,j}, \\
& \rho(\mathbf r) = \sum_i |\psi_i(\mathbf r)|^2 \\
& \text{with periodic boundary conditions.}
\end{aligned}
\tag{2.8.2}
$$


The minimization problem (2.8.2) is well defined, and we can expect that as we approach the thermodynamic limit ($\Omega^\ell \to \mathbb R^3$) the total energy per unit cell, defined as $E^\ell/N^\ell$, will have a well-defined limit towards that for the periodic system. However, it is not clear from this formulation how to concisely write the thermodynamic limit. Furthermore, this formulation has two major computational drawbacks. First, the number of Kohn–Sham orbitals is $NN^\ell$, so the computational complexity for diagonalization-based solvers scales as $O\bigl((NN^\ell)^3\bigr)$. Second, the domain of integration is $\Omega^\ell$, and integrating over it also becomes very expensive as $N^\ell$ becomes large.

Unit cell formulation


Fortunately, all these problems can be solved using the Bloch–Floquet decomposition. Since the external potential is still periodic with respect to the unit cell, the Bloch decomposition in section 1.4 resolves the single orbital index $i$ into two indices $(n, \mathbf k)$ as

$$ \psi_i(\mathbf r) := \psi_{n,\mathbf k}(\mathbf r) = \frac{1}{\sqrt{N^\ell}} e^{i\mathbf k\cdot\mathbf r} u_{n,\mathbf k}(\mathbf r), \quad \mathbf k \in \Omega^*. \tag{2.8.3} $$

Here the periodic part $u_{n,\mathbf k}(\mathbf r)$ satisfies

$$ u_{n,\mathbf k}(\mathbf r + \mathbf R) = u_{n,\mathbf k}(\mathbf r), \quad \forall\, \mathbf R \in \mathbb L, \tag{2.8.4} $$

which is the periodic boundary condition in the unit cell $\Omega$. The normalization factor $1/\sqrt{N^\ell}$ is introduced for convenience of later discussion. For now we assume each $\mathbf k$ point has the same number of orbitals, i.e., $n = 1, \ldots, N$. As will be seen later, this corresponds to the setup of insulating systems at zero temperature.
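For a fixed unit cell, the Born–von Karman boundary condition imposes extra conditions on the phase factor as

$$ e^{i\mathbf k\cdot(\mathbf r + \mathbf a_\alpha N_\alpha^\ell)} = e^{i\mathbf k\cdot\mathbf r}, \quad \forall\, \mathbf r \in \Omega^\ell, \quad \alpha = 1, 2, 3, $$

or equivalently,

$$ e^{iN_\alpha^\ell \mathbf k\cdot\mathbf a_\alpha} = 1, \quad \alpha = 1, 2, 3. $$

Therefore $\mathbf k$ cannot be an arbitrary point in $\Omega^*$. Without loss of generality we assume that $N_1^\ell$, $N_2^\ell$, and $N_3^\ell$ are all even numbers and define the set

$$ \mathcal K^\ell = \left\{ \sum_{\alpha=1}^{3} \frac{m_\alpha}{N_\alpha^\ell}\,\mathbf b_\alpha \;\middle|\; m_\alpha = -\frac{N_\alpha^\ell}{2} + 1, \ldots, \frac{N_\alpha^\ell}{2}, \ \alpha = 1, 2, 3 \right\} \subset \Omega^*. $$

Then the Born–von Karman boundary condition requires $\mathbf k \in \mathcal K^\ell$. Note that the cardinality $|\mathcal K^\ell| = N^\ell$, so the total number of admissible $\mathbf k$ points is equal to the number of unit cells in the supercell $\Omega^\ell$.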
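The admissible k points can be generated and checked directly. The sketch below is an illustration under stated assumptions: the lattice vectors (an fcc primitive cell with unit lattice constant), the supercell size `(2, 4, 2)`, and the helper names `reciprocal_vectors` and `born_von_karman_kpoints` are all hypothetical choices. The reciprocal vectors are taken to satisfy $\mathbf a_\alpha \cdot \mathbf b_\beta = 2\pi\delta_{\alpha\beta}$.

```python
import itertools
import numpy as np

def reciprocal_vectors(a):
    """Rows of `a` are a_1, a_2, a_3; returns rows b_1, b_2, b_3 with a_i . b_j = 2 pi delta_ij."""
    return 2.0 * np.pi * np.linalg.inv(a).T

def born_von_karman_kpoints(a, n_cells):
    """k = sum_alpha (m_alpha / N_alpha) b_alpha with m_alpha = -N_alpha/2 + 1, ..., N_alpha/2.

    Assumes every N_alpha is even, as in the text.
    """
    b = reciprocal_vectors(a)
    index_ranges = [range(-(n // 2) + 1, n // 2 + 1) for n in n_cells]
    kpts = []
    for m in itertools.product(*index_ranges):
        frac = np.array(m) / np.array(n_cells)   # fractional coordinates m_alpha / N_alpha
        kpts.append(frac @ b)                    # Cartesian k vector
    return np.array(kpts)

a = 0.5 * np.array([[0.0, 1.0, 1.0],
                    [1.0, 0.0, 1.0],
                    [1.0, 1.0, 0.0]])            # illustrative fcc primitive vectors
n_cells = (2, 4, 2)
kpts = born_von_karman_kpoints(a, n_cells)

# Born-von Karman condition: exp(i N_alpha k . a_alpha) = 1 for every k and alpha.
phases = np.exp(1j * (kpts @ a.T) * np.array(n_cells))
print(len(kpts), np.max(np.abs(phases - 1.0)))
```

The number of generated points equals $N_1^\ell N_2^\ell N_3^\ell$, matching the cardinality statement above.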

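Since $\mathcal K^\ell$ is a discrete set, from the orthonormality condition of $\psi_{n,\mathbf k}$ we have

$$
\begin{aligned}
\delta_{n',n}\delta_{\mathbf k',\mathbf k}
&= \int_{\Omega^\ell} \psi^*_{n',\mathbf k'}(\mathbf r)\psi_{n,\mathbf k}(\mathbf r) \, d\mathbf r \\
&= \frac{1}{N^\ell} \int_{\Omega^\ell} e^{i(\mathbf k - \mathbf k')\cdot\mathbf r}\, u^*_{n',\mathbf k'}(\mathbf r) u_{n,\mathbf k}(\mathbf r) \, d\mathbf r \\
&= \frac{1}{N^\ell} \sum_{\mathbf R \in \mathbb L} \int_{\Omega} e^{i(\mathbf k - \mathbf k')\cdot(\mathbf r + \mathbf R)}\, u^*_{n',\mathbf k'}(\mathbf r + \mathbf R) u_{n,\mathbf k}(\mathbf r + \mathbf R) \, d\mathbf r \\
&= \left( \frac{1}{N^\ell} \sum_{\mathbf R \in \mathbb L} e^{i(\mathbf k - \mathbf k')\cdot\mathbf R} \right) \int_{\Omega} e^{i(\mathbf k - \mathbf k')\cdot\mathbf r}\, u^*_{n',\mathbf k'}(\mathbf r) u_{n,\mathbf k}(\mathbf r) \, d\mathbf r \\
&= \delta_{\mathbf k',\mathbf k} \int_{\Omega} u^*_{n',\mathbf k'}(\mathbf r) u_{n,\mathbf k}(\mathbf r) \, d\mathbf r.
\end{aligned}
\tag{2.8.5}
$$

Here the sums over $\mathbf R$ run over the $N^\ell$ lattice vectors whose translated unit cells $\Omega + \mathbf R$ make up the supercell $\Omega^\ell$. We have used the periodicity of $u_{n,\mathbf k}$ and the fact that, for $\mathbf k, \mathbf k' \in \mathcal K^\ell$, the difference $\mathbf k - \mathbf k'$ is smaller than one reciprocal lattice vector along every direction, so the lattice sum of phase factors vanishes unless $\mathbf k = \mathbf k'$. Hence the orthonormality condition for $\psi_{n,\mathbf k}$ in the supercell can be equivalently written as that for $u_{n,\mathbf k}$ in the unit cell, i.e.,

$$ \int_{\Omega} u^*_{n',\mathbf k}(\mathbf r) u_{n,\mathbf k}(\mathbf r) \, d\mathbf r = \delta_{n',n}. \tag{2.8.6} $$

We stress that $u_{n,\mathbf k}$ and $u_{n',\mathbf k'}$ are in general not orthogonal to each other when $\mathbf k \ne \mathbf k'$.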
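Now let us rewrite the energy per unit cell in (2.8.1) in terms of $\{u_{n,\mathbf k}\}$. First, the kinetic energy part per unit cell is (note the normalization factor on both sides of the equation)

$$ \frac{1}{N^\ell} \int_{\Omega^\ell} \sum_{n,\mathbf k} \frac12 |\nabla\psi_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf r = \frac{1}{N^\ell} \sum_{\mathbf k} \int_{\Omega} \sum_{n} \frac12 |(\nabla_{\mathbf r} + i\mathbf k) u_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf r. $$

Since the set $\mathcal K^\ell$ is a set of uniform grid points discretizing the Brillouin zone, the summation over $\mathbf k$ can be seen as a Riemann sum of the integration in $\Omega^*$, i.e., there exists the limit

$$ \frac{|\Omega^*|}{N^\ell} \sum_{\mathbf k} f(\mathbf k) \xrightarrow{\;N^\ell \to \infty\;} \int_{\Omega^*} f(\mathbf k) \, d\mathbf k \tag{2.8.7} $$

for any continuous function $f$. Hence in the thermodynamic limit, the kinetic energy per unit cell is

$$ \frac{1}{|\Omega^*|} \int_{\Omega^*} \int_{\Omega} \sum_{n} \frac12 |(\nabla_{\mathbf r} + i\mathbf k) u_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf r \, d\mathbf k. \tag{2.8.8} $$

The electron density $\rho(\mathbf r)$ can be written as

$$ \rho(\mathbf r) = \sum_{n,\mathbf k} |\psi_{n,\mathbf k}(\mathbf r)|^2 = \frac{1}{N^\ell} \sum_{\mathbf k} \sum_{n} |u_{n,\mathbf k}(\mathbf r)|^2. $$

Hence in the thermodynamic limit,

$$ \rho(\mathbf r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_{n} |u_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf k, \tag{2.8.9} $$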

which is a periodic function in the unit cell $\Omega$. Since all other terms in the energy functional depend only on the electron density, the energy functional per unit cell can be written concisely in the thermodynamic limit using $\{u_{n,\mathbf k}\}$ as follows:

$$
\begin{aligned}
E[\{u_{n,\mathbf k}\}] ={}& \frac{1}{|\Omega^*|} \int_{\Omega^*} \int_{\Omega} \sum_{n=1}^{N} \frac12 |(\nabla_{\mathbf r} + i\mathbf k) u_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf r \, d\mathbf k + \int_{\Omega} V_{\mathrm{ext}}(\mathbf r)\rho(\mathbf r) \, d\mathbf r \\
&+ \frac12 \int_{\Omega} \int_{\mathbb R^3} \frac{\rho(\mathbf r)\rho(\mathbf r')}{|\mathbf r - \mathbf r'|} \, d\mathbf r \, d\mathbf r' + E_{\mathrm{xc}}[\rho] + E_{II}[\{\mathbf R_I\}].
\end{aligned}
\tag{2.8.10}
$$

Note that all integration ranges in the real space are restricted to the unit cell $\Omega$, except for the long-range Coulomb interaction, where the $\mathbf r'$ variable includes the contribution to the electrostatic interaction from the electron density within the unit cell $\Omega$ as well as all its periodic images in $\mathbb R^3$. Since $\rho$ is a periodic function, the integration over $\mathbb R^3$ leads to a divergent Hartree potential. However, such divergence will be canceled by the electron-nuclei interaction in $V_{\mathrm{ext}}$ and the nuclei-nuclei repulsion in $E_{II}[\{\mathbf R_I\}]$. The total electrostatic energy from the nuclei and the electron density is well defined. This is true at least for charge neutral systems [50]. The exchange-correlation functional $E_{\mathrm{xc}}[\rho]$ and the nuclei repulsion energy $E_{II}$ should be understood as the energies per unit cell as well. Then Kohn–Sham DFT for periodic systems in the thermodynamic limit can be formulated variationally as

$$
\begin{aligned}
\min_{\{u_{n,\mathbf k}\}} \quad & E[\{u_{n,\mathbf k}\}] \\
\text{s.t.} \quad & \int_{\Omega} u^*_{n',\mathbf k}(\mathbf r) u_{n,\mathbf k}(\mathbf r) \, d\mathbf r = \delta_{n',n}, \\
& \rho(\mathbf r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_{n} |u_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf k.
\end{aligned}
\tag{2.8.11}
$$

The Euler–Lagrange equation for (2.8.11) is then

$$
\begin{aligned}
(H_{\mathbf k} u_{n,\mathbf k})(\mathbf r) &:= \left( -\frac12 (\nabla + i\mathbf k)^2 + V_{\mathrm{eff}}[\rho](\mathbf r) \right) u_{n,\mathbf k}(\mathbf r) \\
&= \varepsilon_{n,\mathbf k}\, u_{n,\mathbf k}(\mathbf r), \quad \mathbf r \in \Omega, \quad \mathbf k \in \Omega^*,
\end{aligned}
\tag{2.8.12}
$$

subject to the orthonormality condition and the self-consistency condition for the electron density. The effective potential is defined as

$$ V_{\mathrm{eff}}[\rho](\mathbf r) = V_{\mathrm{ext}}(\mathbf r) + \int_{\mathbb R^3} \frac{\rho(\mathbf r')}{|\mathbf r - \mathbf r'|} \, d\mathbf r' + V_{\mathrm{xc}}[\rho](\mathbf r). \tag{2.8.13} $$

In particular, if $V_{\mathrm{ext}}$ and $\rho$ are periodic functions with respect to the Bravais lattice, then so is $V_{\mathrm{eff}}[\rho](\mathbf r)$. This justifies the Bloch decomposition in the previous discussion. Compared to the supercell formulation, the formulation given in (2.8.11) and (2.8.12) has major computational advantages. After discretization of the Brillouin zone into $N^\ell$ points, each $\mathbf k$ point corresponds to a decoupled eigenvalue problem defined only on the unit cell. The computational cost with respect to $N^\ell$ is thus reduced from cubic scaling to linear scaling. This enables simulation with a large number of $\mathbf k$ points even with a modest amount of computational resources. The collection of all eigenvalues $\{\varepsilon_{n,\mathbf k}\}$ forms the band structure modeled by Kohn–Sham DFT.
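The decoupling over k points can be illustrated with a one-dimensional toy version of (2.8.12). The sketch below makes several assumptions that are not in the text: the effective potential is a fixed model $V(x) = 2V_0\cos(2\pi x/a)$ (no self-consistency), $u_{n,k}$ is expanded in planewaves $e^{iGx}$, and the names `bloch_hamiltonian` and `band_structure` are hypothetical. In this basis $-\tfrac12(\partial_x + ik)^2$ is diagonal with entries $\tfrac12(k+G)^2$, and the cosine potential couples neighboring $G$ with strength $V_0$.

```python
import numpy as np

def bloch_hamiltonian(k, v0, a=1.0, n_pw=7):
    """Planewave matrix of H_k = -(1/2)(d/dx + i k)^2 + 2 v0 cos(2 pi x / a) in 1D."""
    b = 2.0 * np.pi / a
    G = b * np.arange(-n_pw, n_pw + 1)          # reciprocal lattice vectors in the basis
    H = np.diag(0.5 * (k + G) ** 2)             # kinetic part is diagonal
    coupling = v0 * np.ones(len(G) - 1)         # V couples G with G +/- b
    H += np.diag(coupling, 1) + np.diag(coupling, -1)
    return H

def band_structure(kpts, v0, n_bands=3, **kwargs):
    """Solve one small eigenvalue problem per k point; rows index k, columns index n."""
    return np.array([np.linalg.eigvalsh(bloch_hamiltonian(k, v0, **kwargs))[:n_bands]
                     for k in kpts])

a, n_k = 1.0, 8
b = 2.0 * np.pi / a
kpts = b * np.arange(-(n_k // 2) + 1, n_k // 2 + 1) / n_k   # uniform grid, last point is b/2

bands_free = band_structure(kpts, v0=0.0)     # free electrons: lowest band is k^2 / 2
bands = band_structure(kpts, v0=0.1)
gap_at_zone_boundary = bands[-1, 1] - bands[-1, 0]
print(gap_at_zone_boundary)                   # close to 2 * v0 for a weak potential
```

Each k point costs one eigenvalue problem of fixed size, so the total cost grows linearly with the number of k points.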

Now we return to the orbitals $\{\psi_{n,\mathbf k}\}$ before the Bloch decomposition. Although each $\psi_{n,\mathbf k}$ can be normalized within any finite-sized supercell, such normalized $\psi_{n,\mathbf k}$ would converge weakly to zero in the thermodynamic limit. This is because $\psi_{n,\mathbf k}$ cannot belong to $L^2(\mathbb R^3)$ and is only a generalized eigenfunction of $H$ in $\mathbb R^3$. This is similar to the fact that planewaves can be seen as generalized eigenfunctions of the momentum operator in section 1.2. To this end, with some abuse of notation, we may define the generalized eigenfunctions using the Bloch decomposition, still denoted by $\{\psi_{n,\mathbf k}\}$, as (for all $\mathbf r \in \mathbb R^3$)

$$ \psi_{n,\mathbf k}(\mathbf r) = e^{i\mathbf k\cdot\mathbf r} u_{n,\mathbf k}(\mathbf r), \quad \mathbf k \in \Omega^*. \tag{2.8.14} $$

Compared to (2.8.3), we have dropped the normalization factor. The resulting $\{\psi_{n,\mathbf k}\}$ satisfy the orthonormality condition in the distribution sense:

$$ \int_{\mathbb R^3} \psi^*_{n',\mathbf k'}(\mathbf r)\psi_{n,\mathbf k}(\mathbf r) \, d\mathbf r = |\Omega^*|\,\delta_{n',n}\,\delta(\mathbf k' - \mathbf k). \tag{2.8.15} $$

This implies the normalization condition when integrated over the Brillouin zone:

$$ \frac{1}{|\Omega^*|} \int_{\Omega^*} \int_{\mathbb R^3} \psi^*_{n',\mathbf k'}(\mathbf r)\psi_{n,\mathbf k}(\mathbf r) \, d\mathbf r \, d\mathbf k = \delta_{n',n}. \tag{2.8.16} $$

Another benefit of the choice of (2.8.14) is that the electron density can be written using $\{\psi_{n,\mathbf k}\}$ as well:

$$ \rho(\mathbf r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_{n} |\psi_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf k. $$

Kohn–Sham DFT requires that orbitals with low energies must be occupied before those with higher energies. So far we have assumed that the number of occupied orbitals $N$ holds uniformly for all $\mathbf k$. This is only true when the following band isolation condition is satisfied:

$$ \varepsilon_{N,\mathbf k} < \varepsilon_{N+1,\mathbf k'} \quad \forall\, \mathbf k, \mathbf k' \in \Omega^*. \tag{2.8.17} $$

A system satisfying the isolation condition is called an insulating system, which has a positive band gap defined as

$$ \varepsilon_{\mathrm{gap}} = \left( \inf_{\mathbf k \in \Omega^*} \varepsilon_{N+1,\mathbf k} \right) - \left( \sup_{\mathbf k \in \Omega^*} \varepsilon_{N,\mathbf k} \right). \tag{2.8.18} $$

On the other hand, when the isolation condition is violated, the system is called a metallic system and the band gap $\varepsilon_{\mathrm{gap}}$ is defined to be 0. In materials science, another commonly used term is semiconducting system, which is an insulating system but with a relatively small band gap (typically < 1 eV). Hence the distinction between a semiconductor and an insulator is only quantitative.
Figure 2.6 shows the band structures of a bulk silicon system and a bulk aluminum system, computed using the Quantum ESPRESSO [29] software package. Bulk silicon has a small but finite band gap (around 0.6 eV), so it is an insulating system (more specifically, a semiconductor). Bulk aluminum has zero band gap and the Fermi energy passes through the band structure, so it is a metallic system.
Figure 2.6. Band structures of bulk silicon (semiconductor) and bulk aluminum (metal). The black dashed line is the Fermi energy.

For metallic systems, the orbitals with lower energies must be occupied first, and the number of occupied orbitals per $\mathbf k$ point will be inhomogeneous across the Brillouin zone. The treatment of insulating and metallic systems can be unified by introducing the chemical potential $\mu$, which defines a set of occupation numbers

$$ f_{n,\mathbf k} := f_\infty(\varepsilon_{n,\mathbf k} - \mu) = \begin{cases} 1, & \varepsilon_{n,\mathbf k} \le \mu, \\ 0, & \varepsilon_{n,\mathbf k} > \mu. \end{cases} $$

Then, after solving the Kohn–Sham equations (2.8.12), the electron density is defined using the occupation number as

$$ \rho(\mathbf r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \sum_{n} f_{n,\mathbf k}\, |u_{n,\mathbf k}(\mathbf r)|^2 \, d\mathbf k. \tag{2.8.19} $$

The chemical potential $\mu$ should be adjusted self-consistently to fulfill the condition that each unit cell has $N$ electrons, i.e.,

$$ \int_{\Omega} \rho(\mathbf r) \, d\mathbf r = N. \tag{2.8.20} $$

Hence again the chemical potential $\mu$ is the Lagrange multiplier associated with the constraint on the number of electrons $N$. From this perspective, given the band isolation condition, the choice of the occupation numbers for insulating systems becomes

$$ f_{n,\mathbf k} = \begin{cases} 1, & 1 \le n \le N, \\ 0, & n > N. \end{cases} $$

Finally, the energy functional for a periodic system can also be generalized to the finite temperature case by introducing an entropy term for the occupation number $f_{n,\mathbf k}$. We will omit the details here.
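On a uniform k grid, the zero-temperature occupations and a compatible chemical potential can be read off from the sorted band energies, since the discrete analogue of (2.8.20) requires $N$ occupied states per k point on average. The sketch below uses two made-up cosine band models (one isolated, one overlapping); the function name `zero_temperature_occupations` and the band data are assumptions for illustration, and the code assumes the highest occupied and lowest unoccupied levels are distinct.

```python
import numpy as np

def zero_temperature_occupations(bands, n_elec_per_cell):
    """bands[k, n] = eps_{n,k} on a uniform k grid.

    Returns (mu, f) with f[k, n] = 1 if eps_{n,k} <= mu else 0, where mu is placed midway
    between the highest occupied and the lowest unoccupied level, so that the occupations
    sum to n_elec_per_cell * (number of k points).
    """
    n_k = bands.shape[0]
    n_occ = n_elec_per_cell * n_k
    levels = np.sort(bands, axis=None)
    mu = 0.5 * (levels[n_occ - 1] + levels[n_occ])
    return mu, (bands <= mu).astype(float)

k = 2.0 * np.pi * np.arange(-3, 5) / 8                         # 8 uniform k points

# Model insulator: the two bands are separated in energy for all k, k'.
bands_ins = np.column_stack([-np.cos(k), 3.0 + np.cos(k)])
# Model metal: band 1 at some k lies above band 2 at other k.
bands_met = np.column_stack([-np.cos(k), 1.2 - np.cos(k)])

mu_ins, f_ins = zero_temperature_occupations(bands_ins, n_elec_per_cell=1)
mu_met, f_met = zero_temperature_occupations(bands_met, n_elec_per_cell=1)

print("insulator: occupied orbitals per k point:", f_ins.sum(axis=1))
print("metal:     occupied orbitals per k point:", f_met.sum(axis=1))
```

For the insulating model every k point has exactly $N$ occupied orbitals, whereas for the metallic model the count varies across the grid while the total stays fixed.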

Discretization of the Brillouin zone


In order to solve (2.8.12), the integration in the Brillouin zone $\Omega^*$ must be discretized. In the previous discussion, we found that there is a natural one-to-one correspondence between the solution of Kohn–Sham DFT in a large supercell $\Omega^\ell$ and the uniform discretization of the Brillouin zone according to $\mathcal K^\ell$. We can further generalize the choice of discretization points as follows:

$$ \mathcal K_s^\ell = \left\{ \sum_{\alpha=1}^{3} \frac{m_\alpha - s_\alpha}{N_\alpha^\ell}\,\mathbf b_\alpha \;\middle|\; m_\alpha = -\frac{N_\alpha^\ell}{2} + 1, \ldots, \frac{N_\alpha^\ell}{2}, \ 0 \le s_\alpha < 1, \ \alpha = 1, 2, 3 \right\}. \tag{2.8.21} $$

The case when $N_\alpha^\ell$ is an odd number can be defined in a similar way. We have introduced a shift vector $s = (s_1, s_2, s_3)^\top$. A common choice of the shift vector is $s = (0, 0, 0)^\top$, which is the same as $\mathcal K^\ell$. Another one is $(s_1, s_2, s_3)^\top = \frac12 (1, 1, 1)^\top$. This choice allows the discretization points to be kept away from the high symmetry points in the Brillouin zone, which is observed to have numerical benefits in practice, especially when a small number of $\mathbf k$ points is used. We can see that this choice of the shift corresponds to $\psi_{n,\mathbf k}$ satisfying the Born–von Karman boundary condition over a supercell that has $2N_1^\ell, 2N_2^\ell, 2N_3^\ell$ unit cells along each direction, respectively. More generally, if one component of the shift corresponds to an irrational number, the discretization will not correspond to any supercell with the Born–von Karman boundary condition. Hence the discretization of the Brillouin zone offers a more general perspective for treating periodic systems than the supercell approach even at the level of numerics. The grid in (2.8.21) is called the Monkhorst–Pack grid [65] and is the most widely used scheme for discretizing the Brillouin zone.
The Monkhorst–Pack grid is a set of uniform grid points and the corresponding quadrature is the trapezoidal rule. For instance, the electron density can be computed as

$$ \rho(\mathbf r) = \frac{1}{N^\ell} \sum_{\mathbf k \in \mathcal K_s^\ell} \sum_{n} f_{n,\mathbf k}\, |u_{n,\mathbf k}(\mathbf r)|^2. \tag{2.8.22} $$

For insulating systems, the integrand in (2.8.19) is smooth with respect to $\mathbf k$ and is periodic over the Brillouin zone. The convergence rate with respect to the refinement of the discretization is at least superalgebraic. Hence for insulating systems the Monkhorst–Pack grid converges very quickly, and very few $\mathbf k$ points are needed to converge physical quantities such as electron density and energy. However, for metallic systems, $f_\infty(\varepsilon_{n,\mathbf k} - \mu)$ effectively truncates the support of the integrand, which is no longer a smooth function over the Brillouin zone. Therefore if the Monkhorst–Pack grid is used, many more $\mathbf k$ points are needed to converge.
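The contrast between the two convergence regimes can be seen in a one-dimensional toy quadrature. In the sketch below the Brillouin zone is taken to be $(-\pi, \pi]$, the smooth periodic integrand $1/(2+\cos k)$ stands in for the insulating case, and the step function $f_\infty(-\cos k - \mu)$ with $\mu = 0.3$ stands in for the metallic case. These integrands, the value of $\mu$, and the helper names `monkhorst_pack_1d` and `bz_average` are assumptions chosen because their exact Brillouin zone averages are known in closed form.

```python
import numpy as np

def monkhorst_pack_1d(n, s=0.0, b=2.0 * np.pi):
    """1D analogue of (2.8.21): k = (m - s) / n * b, m = -n/2 + 1, ..., n/2 (n even)."""
    m = np.arange(-(n // 2) + 1, n // 2 + 1)
    return (m - s) / n * b

def bz_average(func, n, s=0.0):
    """Uniform-grid (trapezoidal) approximation of (1/|BZ|) * integral of func over the zone."""
    return np.mean(func(monkhorst_pack_1d(n, s)))

def smooth_integrand(k):
    return 1.0 / (2.0 + np.cos(k))

mu = 0.3
def step_integrand(k):
    # Occupation of a model band eps(k) = -cos(k) at chemical potential mu.
    return (-np.cos(k) <= mu).astype(float)

exact_smooth = 1.0 / np.sqrt(3.0)       # (1/2pi) * integral of 1/(2 + cos k)
exact_step = np.arccos(-mu) / np.pi     # fraction of the zone with -cos k <= mu

grid_sizes = [8, 16, 32, 64]
err_smooth = {n: abs(bz_average(smooth_integrand, n) - exact_smooth) for n in grid_sizes}
err_step = {n: abs(bz_average(step_integrand, n) - exact_step) for n in grid_sizes}
for n in grid_sizes:
    print(f"n = {n:3d}   smooth: {err_smooth[n]:.2e}   step: {err_step[n]:.2e}")

shifted = monkhorst_pack_1d(8, s=0.5)   # shifted grid avoids the point k = 0
```

The smooth integrand reaches round-off accuracy with a few dozen points, while the error of the truncated integrand decreases only slowly and irregularly with the grid size.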

2.9 Localization
Boys localization and Wannier localization
The Kohn–Sham orbitals $\{\psi_i\}_{i=1}^{N}$ are eigenfunctions of a Hamiltonian matrix and are generally delocalized across the entire system, i.e., with significant magnitude in large portions of the computational domain. Nonetheless, if we apply a unitary rotation $U \in \mathbb C^{N \times N}$ to the Kohn–Sham orbitals

$$ w_i = \sum_{j=1}^{N} \psi_j U_{ji}, \tag{2.9.1} $$

we have seen that the density matrix and hence the electron density are invariant. Hence Kohn–Sham DFT (up to the fourth rung of exchange-correlation functionals) and the Hartree–Fock theory are invariant under such gauge transformations as well. We may wonder whether we can find the optimal unitary transformation so that the resulting set of orthonormal functions $\{w_i\}_{i=1}^{N}$ has significant magnitude on only a small portion of the computational domain. This is called localization. Localized representations of electronic wavefunctions have a wide range of applications in quantum physics, chemistry, and materials science. They require significantly less memory to be stored and are the foundation of the so-called linear scaling methods [45, 30, 10] for solving quantum problems. They can also be used to analyze chemical bonding in complex materials, interpolate the band structure of crystals, accelerate ground and excited state electronic structure calculations, and form reduced-order models for strongly correlated many-body systems [61].
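For isolated systems, the localized representation can be identified through the Boys localization procedure [28], which minimizes the following spread functional:

$$
\begin{aligned}
\inf_{U} \quad & \Omega[\{w_i\}_{i=1}^{N}] = \sum_{i=1}^{N} \left[ \langle w_i|\mathbf r^2|w_i\rangle - \bigl(\langle w_i|\mathbf r|w_i\rangle\bigr)^2 \right] \\
\text{s.t.} \quad & w_i = \sum_{j=1}^{N} \psi_j U_{ji}, \quad U^* U = I_N.
\end{aligned}
\tag{2.9.2}
$$

In three-dimensional space, the formula for the functional $\Omega$ should be interpreted as

$$ \sum_{\alpha=1}^{3} \left[ \langle w_i|\mathbf r_\alpha^2|w_i\rangle - \bigl(\langle w_i|\mathbf r_\alpha|w_i\rangle\bigr)^2 \right] := \langle w_i|\,(\mathbf r - \langle\mathbf r\rangle_i)\cdot(\mathbf r - \langle\mathbf r\rangle_i)\,|w_i\rangle, $$

where $\langle\mathbf r\rangle_i := \langle w_i|\mathbf r|w_i\rangle$. Hence the spread functional can be written compactly as

$$ \Omega[\{w_i\}_{i=1}^{N}] = \sum_{i=1}^{N} \langle w_i|\,(\mathbf r - \langle\mathbf r\rangle_i)^2\,|w_i\rangle. $$

The functional $\Omega$ characterizes the total spatial spread of the rotated orbitals $\{w_i\}$ in terms of the second moment around each center $\langle\mathbf r\rangle_i$. A smaller spread value indicates a more localized representation of the Kohn–Sham occupied subspace. Numerically, the localization problem (2.9.2) can be solved as a constrained minimization problem.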
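The effect of the gauge on the spread functional can be seen in a two-orbital toy example on a one-dimensional grid. The orbitals below are not Kohn–Sham orbitals: the "delocalized" pair is simply the symmetric and antisymmetric combination of two box functions, and the rotation `U` is chosen by hand rather than found by minimizing (2.9.2). The grid, the box functions, and the helper name `spread` are assumptions for illustration.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]

def box(center, half_width=1.0):
    """Normalized indicator function of an interval (normalization uses the grid weight dx)."""
    w = (np.abs(x - center) < half_width).astype(float)
    return w / np.sqrt(np.sum(w ** 2) * dx)

def spread(orbitals):
    """Sum over columns w of <w|x^2|w> - <w|x|w>^2, i.e. the 1D version of the spread functional."""
    total = 0.0
    for w in orbitals.T:
        weight = np.abs(w) ** 2 * dx
        center = np.sum(x * weight)
        total += np.sum(x ** 2 * weight) - center ** 2
    return total

left, right = box(-2.0), box(2.0)
# Delocalized orthonormal pair: each orbital lives on both boxes.
Psi = np.column_stack([(left + right) / np.sqrt(2.0), (left - right) / np.sqrt(2.0)])
# Unitary rotation as in (2.9.1): W = Psi U recovers the two separate boxes.
U = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
W = Psi @ U

print("spread before rotation:", spread(Psi))
print("spread after rotation: ", spread(W))
print("density matrix unchanged:", np.allclose(Psi @ Psi.T, W @ W.T))
```

The density matrix is the same in both gauges, but the spread drops substantially once each orbital is confined to a single box.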
Since isolated systems are surrounded by an infinite-sized vacuum, the occupied
orbitals always decay exponentially fast towards zero as |r| → ∞. Hence, qualitatively,
both the Kohn–Sham orbitals ψi and the localized orbitals wi decay exponentially as
|r| → ∞, and the localized representation only makes a quantitative difference. On the
other hand, for crystals the Kohn–Sham orbitals satisfy the Bloch boundary condition
on each unit cell, and hence have support on the macroscopic scale. One can show that
through a gauge transformation we can still obtain localized orbitals. For insulating
systems with a finite energy gap, we can even obtain exponentially localized orbitals.
Therefore, localized representation makes a qualitative difference. These functions are
called Wannier functions.
For periodic systems, if we rotate the set of generalized functions $\{\psi_{n,\mathbf k}\}$ by an arbitrary $\mathbf k$-dependent unitary matrix $U(\mathbf k) \in \mathbb C^{N \times N}$, we can define a new set of functions

$$ \widetilde\psi_{n,\mathbf k}(\mathbf r) = \sum_{n'=1}^{N} \psi_{n',\mathbf k}(\mathbf r)\, U_{n',n}(\mathbf k), \quad \mathbf k \in \Omega^*. \tag{2.9.3} $$

Here the set of matrices $\{U(\mathbf k)\}_{\mathbf k \in \Omega^*}$ is referred to as the gauge. For each $\mathbf k$ point, the density matrix $P(\mathbf k)$ is gauge invariant,

$$ P(\mathbf k) = \sum_{n=1}^{N} |\psi_{n,\mathbf k}\rangle\langle\psi_{n,\mathbf k}| = \sum_{n=1}^{N} |\widetilde\psi_{n,\mathbf k}\rangle\langle\widetilde\psi_{n,\mathbf k}|, \tag{2.9.4} $$

and so is the total density matrix

$$ P = \frac{1}{|\Omega^*|} \int_{\Omega^*} P(\mathbf k) \, d\mathbf k. $$

Figure 2.7. One Wannier function for a silicon crystal calculated on an 8 × 8 × 8 k-point grid. Here the red and gray isosurfaces represent the positive and negative parts of the Wannier function, and the yellow spheres indicate the locations of the silicon atoms. (Image is taken from Figure 6 of [20].)

For any choice of gauge, the Wannier functions for crystals are defined as [84]

$$ w_{n,\mathbf R}(\mathbf r) = \frac{1}{|\Omega^*|} \int_{\Omega^*} \widetilde\psi_{n,\mathbf k}(\mathbf r)\, e^{-i\mathbf k\cdot\mathbf R} \, d\mathbf k, \quad \mathbf r \in \mathbb R^3, \quad \mathbf R \in \mathbb L. \tag{2.9.5} $$

The Wannier functions $\{w_{n,\mathbf R}\}$ are orthogonal to each other in $L^2(\mathbb R^3)$ (exercise) and span the same space as the range of the total density matrix $P$. They are also translation invariant, and $w_{n,\mathbf R}(\mathbf r) = w_{n,\mathbf 0}(\mathbf r - \mathbf R)$.

Due to the translational invariance property, the Wannier localization problem for crystals is thus reduced to the problem of finding a gauge $\{U(\mathbf k)\}$ such that $w_{n,\mathbf 0}$ is localized, or equivalently, $\widetilde\psi_{n,\mathbf k}$ is smooth with respect to $\mathbf k$. This can be done by minimizing the spread functional similar to the Boys localization [62], i.e.,

$$ \Omega[\{w_{n,\mathbf 0}\}_{n=1}^{N}] = \sum_{n=1}^{N} \left[ \langle w_{n,\mathbf 0}|\mathbf r^2|w_{n,\mathbf 0}\rangle - \bigl(\langle w_{n,\mathbf 0}|\mathbf r|w_{n,\mathbf 0}\rangle\bigr)^2 \right]. \tag{2.9.6} $$

Here $w_{n,\mathbf 0}$ depends on the unitary gauge $U(\mathbf k)$ through $\widetilde\psi_{n,\mathbf k}$ as in (2.9.5). Figure 2.7 shows one Wannier function for a silicon crystal calculated on an 8 × 8 × 8 k-point grid. The Wannier function is clearly localized in the real space.

Selected columns of the density matrix


One main drawback of the Boys and Wannier localization procedures is that they rely on a nonlinear optimization procedure. As opposed to the case in the SCF iteration, such nonlinear optimization procedures can frequently get stuck at local minima. An alternative procedure is the selected columns of the density matrix (SCDM) method [21]. The main idea of the SCDM method is to obtain localized orbitals directly from columns of the density matrix $P = \Psi\Psi^*$.

When the real space discretization is used and a set of localized orbitals does exist, an immediate consequence is that each column (and hence row) of the density matrix $P$ decays exponentially fast along the off-diagonal direction. In this sense, the density matrix $P$ is said to be localized, and selecting any $N$ linearly independent columns of $P$ will yield a localized basis for the span of $\Psi$. However, picking $N$ random columns of $P$ may result in a poorly conditioned basis if, for example, there is too much overlap between the selected columns. Therefore, we would like a means for choosing a well-conditioned set of columns in the real space, denoted $\mathcal C = \{c_1, c_2, \ldots, c_N\}$. Intuitively we would expect such a basis to minimize the overlaps between columns whenever possible.
This is algorithmically accomplished with a QR factorization with column pivoting (QRCP) procedure [31]. Simply speaking, for a given matrix $A$, the QRCP procedure seeks to compute a permutation matrix $\Pi$ such that the leading submatrices $(A\Pi)_{1,\ldots,k}$ for any applicable $k$ are as well conditioned as possible. In our setting, this means we would ideally compute a QRCP factorization of the matrix $P$ to identify $N$ well-conditioned columns from which we may construct a localized basis. However, this would be very costly if the size of the matrix $P$ were much larger than its rank $N$. Fortunately, we may equivalently compute the set $\mathcal C$ by employing a QRCP factorization of $\Psi^*$. More specifically, we compute

$$ \Psi^* \Pi = Q \begin{pmatrix} R_1 & R_2 \end{pmatrix}, \tag{2.9.7} $$

and the first $N$ columns of the permutation matrix $\Pi$ encode $\mathcal C$.
One simple implementation of the SCDM procedure for constructing an orthonormal set of localized orbitals $\{w_i\}$ is as follows.
1. Perform QRCP for $\Psi^*$ using (2.9.7).
2. Evaluate $W = \Psi Q$.
This algorithm does not require an initial guess, and the direct usage of $Q$ as the gauge matrix corresponds to a Cholesky–QR orthonormalization procedure [21], where the Cholesky factorization is only performed implicitly. Numerical results indicate that the resulting spread can be very similar to that of the optimized solution in the Boys localization procedure. Although the formulation above is stated for isolated systems, it can be generalized to crystals as well.
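The two steps above can be sketched with NumPy alone. Since NumPy does not provide a pivoted QR, the sketch replaces the library QRCP call by a hand-written greedy column selection (a routine such as `scipy.linalg.qr` with `pivoting=True` would normally be used instead), and then obtains $Q$ from an ordinary QR factorization of the selected columns of $\Psi^*$, which agrees with the $Q$ in (2.9.7) up to column phases. The gapped tight-binding chain used to generate $\Psi$, and the names `pivoted_column_choice`, `scdm`, and `ipr`, are assumptions for illustration.

```python
import numpy as np

def pivoted_column_choice(M, n_select):
    """Greedy column pivoting: repeatedly pick the column with the largest residual norm."""
    R = np.array(M, dtype=complex)
    chosen = []
    for _ in range(n_select):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        chosen.append(j)
        q = R[:, j] / norms[j]
        R = R - np.outer(q, q.conj() @ R)      # project out the chosen direction
    return chosen

def scdm(Psi):
    """Step 1: pivoted QR of Psi^* (via column selection). Step 2: W = Psi Q."""
    n_orb = Psi.shape[1]
    Psi_h = Psi.conj().T
    cols = pivoted_column_choice(Psi_h, n_orb)
    Q, _ = np.linalg.qr(Psi_h[:, cols])        # Q R_1 = selected columns of Psi^*
    return Psi @ Q, cols

def ipr(M):
    """Inverse participation ratio of each normalized column (larger means more localized)."""
    return np.sum(np.abs(M) ** 4, axis=0)

# Model gapped system: open chain with hopping -1 and alternating on-site energies +1, -1.
n_sites, n_occ = 40, 20
H = -(np.eye(n_sites, k=1) + np.eye(n_sites, k=-1)) + np.diag((-1.0) ** np.arange(n_sites))
_, eigvecs = np.linalg.eigh(H)
Psi = eigvecs[:, :n_occ]                       # occupied orbitals (delocalized)

W, cols = scdm(Psi)
print("selected grid points:", sorted(cols))
print("mean IPR of eigenvectors:", ipr(Psi).mean(), " of SCDM orbitals:", ipr(W).mean())
```

The rotated orbitals remain orthonormal and span the same occupied subspace, since $Q$ is unitary; only their spatial extent changes.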

2.10 Geometry optimization and ab initio molecular dynamics
In the Born–Oppenheimer approximation, the nuclei are described by classical mechanics and the electrons are slaved to the nuclei according to the ground state wavefunction associated with each atomic configuration. The total energy $E(\{\mathbf R_I\})$ then provides a first-principles description of the interatomic potential energy. The geometry optimization (or structure optimization) problem aims at finding a (global or local) minimum on the potential energy surface $E(\{\mathbf R_I\})$, and the corresponding minimizer $\{\mathbf R_I^*\}$ is a (globally or locally) stable configuration. Most naturally existing molecules and solids are at or around such minimizers. Define the atomic force for the $I$th nucleus as

$$ \mathbf F_I(\{\mathbf R_I\}) = -\frac{\partial E(\{\mathbf R_I\})}{\partial \mathbf R_I}; \tag{2.10.1} $$

then a necessary condition for identifying a minimizer is that the atomic force $\mathbf F_I$ should vanish for all atoms.

For a closed system with $M$ nuclei, the total energy for the nuclei degrees of freedom is

$$ E_{\mathrm{tot}} = \sum_{I=1}^{M} \frac{M_I \dot{\mathbf R}_I^2}{2} + E(\{\mathbf R_I\}_{I=1}^{M}), \tag{2.10.2} $$

where $M_I$ is the mass of the $I$th nucleus. The dynamical properties of the nuclei can be studied using Newton's equation

$$ M_I \ddot{\mathbf R}_I(t) = -\frac{\partial E}{\partial \mathbf R_I}(\{\mathbf R_I(t)\}) = \mathbf F_I(\{\mathbf R_I(t)\}). \tag{2.10.3} $$

This is called ab initio molecular dynamics (AIMD). Geometry optimization and molecular dynamics remove the need for the construction of the empirical interatomic potential and have enabled a vast range of applications for the study of static and dynamic properties in chemistry and materials science.
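Newton's equation (2.10.3) is typically integrated with a time-stepping scheme. The sketch below uses the velocity Verlet integrator, which is a common choice but is not prescribed by the text, and it replaces the Kohn–Sham total energy by a toy analytic potential (a harmonic bond between two nuclei) so that it can run on its own. The names `toy_energy_and_forces`, `velocity_verlet`, and `total_energy`, and all numerical parameters, are assumptions for illustration. In an AIMD code the call to `potential` would be an electronic structure calculation returning $E(\{\mathbf R_I\})$ and the forces (2.10.1).

```python
import numpy as np

def toy_energy_and_forces(R, k_spring=1.0, r0=1.0):
    """Stand-in for E({R_I}) and F_I = -dE/dR_I: harmonic bond between two nuclei."""
    d_vec = R[0] - R[1]
    d = np.linalg.norm(d_vec)
    energy = 0.5 * k_spring * (d - r0) ** 2
    f0 = -k_spring * (d - r0) * d_vec / d
    return energy, np.array([f0, -f0])

def velocity_verlet(R, V, masses, dt, n_steps, potential):
    """Integrate M_I d^2 R_I / dt^2 = F_I with the velocity Verlet scheme."""
    R, V = R.copy(), V.copy()
    _, F = potential(R)
    for _ in range(n_steps):
        V += 0.5 * dt * F / masses[:, None]   # half kick
        R += dt * V                           # drift
        _, F = potential(R)                   # new forces at updated positions
        V += 0.5 * dt * F / masses[:, None]   # second half kick
    return R, V

def total_energy(R, V, masses, potential):
    """Kinetic plus potential energy, cf. (2.10.2)."""
    kinetic = 0.5 * np.sum(masses[:, None] * V ** 2)
    return kinetic + potential(R)[0]

masses = np.array([1.0, 2.0])
R0 = np.array([[0.6, 0.0, 0.0], [-0.6, 0.0, 0.0]])   # stretched bond, at rest
V0 = np.zeros_like(R0)

E_start = total_energy(R0, V0, masses, toy_energy_and_forces)
R1, V1 = velocity_verlet(R0, V0, masses, dt=0.01, n_steps=2000, potential=toy_energy_and_forces)
E_end = total_energy(R1, V1, masses, toy_energy_and_forces)
print(E_start, E_end)
```

Monitoring the conservation of (2.10.2) along the trajectory is a standard sanity check for the time step and for the accuracy of the forces.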
In both geometry optimization and AIMD, the key ingredient is to evaluate the atomic force. In three-dimensional space, such derivative information formally requires $3M$ independent electronic structure calculations, which would be prohibitively expensive. Fortunately, the Hellmann–Feynman theorem allows us to evaluate the atomic force with little added cost compared to a single electronic structure calculation.

Evaluation of the atomic force


In order to introduce the Hellmann–Feynman theorem, let us first consider a parameter-
dependent Hamiltonian H(λ). For each given λ, we assume the ground state is nonde-
generate. Its eigenvalue ε(λ) and eigenfunction ψ(r; λ) can be computed as
H(λ)ψ(λ) = ε(λ)ψ(λ), (2.10.4)
where we have omitted the r-dependence for simplicity. Then the derivative of the
eigenvalue with respect to λ is
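\begin{align}
\frac{d\varepsilon(\lambda)}{d\lambda} &= \frac{d}{d\lambda}\langle\psi(\lambda)|H(\lambda)|\psi(\lambda)\rangle \tag{2.10.5}\\
&= \langle\psi(\lambda)|\frac{dH(\lambda)}{d\lambda}|\psi(\lambda)\rangle + \varepsilon(\lambda)\frac{d}{d\lambda}\langle\psi(\lambda)|\psi(\lambda)\rangle \tag{2.10.6}\\
&= \langle\psi(\lambda)|\frac{dH(\lambda)}{d\lambda}|\psi(\lambda)\rangle. \tag{2.10.7}
\end{align}
Here we have used the eigenvalue equation (2.10.4) and the normalization condition
$\langle\psi(\lambda)|\psi(\lambda)\rangle = 1$ for all $\lambda$, whose derivative therefore vanishes. Hence the
Hellmann–Feynman theorem indicates that the first-order derivative of the eigenvalue
with respect to the perturbation of an external parameter is simply given by the expectation
value of the derivative of the Hamiltonian with respect to $\lambda$.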
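The Hellmann–Feynman identity (2.10.7) can be checked numerically on a small matrix model. The sketch below assumes a hypothetical linear family $H(\lambda) = A + \lambda B$ with random symmetric matrices `A` and `B` (so that $dH/d\lambda = B$); these matrices and the helper `ground_state` are illustrative and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # H(0), symmetric
B = rng.standard_normal((n, n)); B = (B + B.T) / 2   # dH/dlambda, symmetric

def ground_state(lam):
    """Lowest eigenpair of the model Hamiltonian H(lam) = A + lam * B."""
    evals, evecs = np.linalg.eigh(A + lam * B)
    return evals[0], evecs[:, 0]

lam, h = 0.3, 1e-5
eps, psi = ground_state(lam)

# Hellmann-Feynman: one eigenvector, no derivative of psi needed.
hf_derivative = psi @ B @ psi

# Reference: central finite difference, which needs two extra eigenproblems.
fd_derivative = (ground_state(lam + h)[0] - ground_state(lam - h)[0]) / (2 * h)

print(hf_derivative, fd_derivative)
```

The finite difference needs additional eigenvalue solves per parameter, while the Hellmann–Feynman expression reuses the single ground state already computed; this is the saving alluded to above for the $3M$ force components.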
The Hellmann–Feynman theorem can also be understood from the variational principle
\[
\varepsilon(\lambda) = \inf_{|\psi\rangle \neq |0\rangle} E[\psi, \lambda] := \inf_{|\psi\rangle \neq |0\rangle} \frac{\langle\psi|H(\lambda)|\psi\rangle}{\langle\psi|\psi\rangle}. \tag{2.10.8}
\]
Hence
\[
\frac{d\varepsilon(\lambda)}{d\lambda} = \left.\frac{\partial E[\psi, \lambda]}{\partial\lambda}\right|_{\psi=\psi(\lambda)} + \int \left.\frac{\delta E[\psi, \lambda]}{\delta\psi(\mathbf{r})}\right|_{\psi=\psi(\lambda)} \frac{\partial\psi(\mathbf{r}; \lambda)}{\partial\lambda}\, d\mathbf{r} = \langle\psi(\lambda)|\frac{dH(\lambda)}{d\lambda}|\psi(\lambda)\rangle. \tag{2.10.9}
\]
Note that while the derivative $\partial\psi(\mathbf{r};\lambda)/\partial\lambda$ in general does not vanish, it does not contribute
to $d\varepsilon(\lambda)/d\lambda$ due to the Euler–Lagrange equation
\[
\left.\frac{\delta E[\psi, \lambda]}{\delta\psi}\right|_{\psi=\psi(\lambda)} = 0. \tag{2.10.10}
\]

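Now, for the total energy in Kohn–Sham DFT, only the external potential $V_{ext}$ and
the ion-ion interaction $E_{II}$ depend explicitly on the atomic configuration $\{\mathbf{R}_I\}$. Furthermore,
the external potential can be generally decomposed into contributions from
each atom as
\[
V_{ext}(\mathbf{r}; \{\mathbf{R}_I\}) = \sum_I V_{ext}^I(\mathbf{r} - \mathbf{R}_I). \tag{2.10.11}
\]
Then, applying the Hellmann–Feynman theorem, we find
\[
\mathbf{F}_I = -\frac{\partial E(\{\mathbf{R}_I\})}{\partial \mathbf{R}_I} = -\int \rho(\mathbf{r}) \frac{\partial V_{ext}^I(\mathbf{r} - \mathbf{R}_I)}{\partial \mathbf{R}_I}\, d\mathbf{r} - \frac{\partial E_{II}(\{\mathbf{R}_I\})}{\partial \mathbf{R}_I}. \tag{2.10.12}
\]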

Car–Parrinello molecular dynamics


In AIMD, the electronic structure problem must be solved fully self-consistently in
order to reach the ground state for each atomic configuration. Hence this type of
AIMD is also called Born–Oppenheimer molecular dynamics (BOMD). For a long time,
despite its apparent potential, BOMD was considered to be prohibitively expensive.
AIMD was made practical by the ground-breaking Car–Parrinello molecular dynam-
ics (CPMD) [14], which introduces an extended Lagrangian including the degrees of
freedom of both nuclei and electrons:

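\[
L_{CP}\left(\{\mathbf{R}_I\}, \{\dot{\mathbf{R}}_I\}, \{\psi_i\}, \{\dot\psi_i\}\right)
= \sum_I \frac{M_I \dot{\mathbf{R}}_I^2}{2} + \sum_i \frac{\mu}{2} \int \dot\psi_i^*(\mathbf{r})\dot\psi_i(\mathbf{r})\, d\mathbf{r} - E(\{\psi_i\}; \{\mathbf{R}_I\}). \tag{2.10.13}
\]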

Here µ is called the fictitious electron mass. E({ψi }; {RI }) is the energy functional in
Kohn–Sham DFT, while {ψi } may or may not be the minimizing Kohn–Sham orbitals
for the atomic configuration {RI }.
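The equations of motion induced by the Lagrangian (2.10.13) give the CPMD equations
\begin{align}
M_I \ddot{\mathbf{R}}_I(t) &= -\int \rho(\mathbf{r}, t) \frac{\partial V_{ext}^I(\mathbf{r} - \mathbf{R}_I(t))}{\partial \mathbf{R}_I}\, d\mathbf{r} - \frac{\partial E_{II}}{\partial \mathbf{R}_I}(\{\mathbf{R}_I(t)\}), \quad I = 1, \ldots, M, \notag\\
\mu \ddot\psi_i(t) &= -H[\rho(t); \{\mathbf{R}_I(t)\}]\psi_i(t) + \sum_j \psi_j(t)\Lambda_{ji}(t), \quad i = 1, \ldots, N, \tag{2.10.14}\\
\rho(\mathbf{r}, t) &= \sum_{i=1}^{N} |\psi_i(\mathbf{r}, t)|^2. \notag
\end{align}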

Here the $\Lambda$'s are the Lagrange multipliers determined so that $\{\psi_i(t)\}$ is an orthonormal
set of functions at all times. Compared to BOMD, CPMD uses a fictitious dynamics
to guide the motion of the electrons without the need for a converged SCF iteration.
The dynamics of the electronic orbitals can be loosely regarded as a special way of performing
the SCF iteration at each molecular dynamics step. Thanks to the Hamiltonian
structure, numerical simulation for CPMD is stable and the energy is conserved over a
much longer time period compared to that for BOMD without a tight SCF convergence
criterion. When the system has a spectral gap, the accuracy of CPMD is controlled by
a single parameter, the fictitious electron mass $\mu$. The result of CPMD approaches that
of BOMD as $\mu$ goes to zero [69, 9]. However, it has also been shown that CPMD does
not work as well for systems with a vanishing gap, for example metallic systems [69].

2.11 Time-dependent density functional theory


Density functional theory states that the many-body ground state energy and the many-
body ground state wavefunction are determined only by the electron density ρ. The
key tool is the variational principle, which results in the constrained minimization for-
mulation. For the time-dependent many-body Schrödinger equation, Runge and Gross
proposed time-dependent density functional theory (TDDFT) in 1984. However, the
rigorous foundation of TDDFT is much less clear compared to DFT, and therefore we
do not provide the original argument from a physical perspective and give only the
conclusion instead.
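The many-body Schrödinger equation is
\[
i\partial_t \Psi(t) = H(t)\Psi(t), \tag{2.11.1}
\]
where $\Psi(t) = \Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t)$ is the time-dependent many-body wavefunction with
initial condition $\Psi(t_0) = \Psi_0$. The time-dependent electron density is
\[
\rho(\mathbf{r}, t) = N \sum_\sigma \int |\Psi((\mathbf{r}, \sigma), \mathbf{x}_2, \ldots, \mathbf{x}_N, t)|^2\, d\mathbf{x}_2 \cdots d\mathbf{x}_N. \tag{2.11.2}
\]
The time-dependent Hamiltonian is
\[
H(t) = \sum_{i=1}^{N} \left( -\frac{1}{2}\Delta_{\mathbf{r}_i} + V_{ext}(\mathbf{r}_i, t) \right) + \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|}, \tag{2.11.3}
\]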

where the explicit time dependence comes from the external potential $V_{ext}$. The statement
of Runge and Gross is that the many-body wavefunction $\Psi(t)$ at any time $t \ge t_0$
is uniquely determined by the initial state $\Psi_0$ and the history of the electron density
$\{\rho(\mathbf{r}, s)\}_{t_0 \le s \le t}$. Furthermore, according to the Hohenberg–Kohn theorem, if the system
starts from the many-body ground state, then $\Psi_0$ is uniquely determined by $\rho(\mathbf{r}, t_0)$
as well. Hence the evolution of the many-body system is determined entirely by the
evolution of the density $\{\rho(\mathbf{r}, s)\}_{t_0 \le s \le t}$.
In analogy with the construction of Kohn–Sham DFT, the Runge–Gross
TDDFT can also be formally expressed as the following system of coupled time-dependent
Schrödinger equations in $\mathbb{R}^3$:
\begin{align}
i\partial_t \psi_j(\mathbf{x}, t) &= \left( -\frac{1}{2}\Delta + V_{ext}(\mathbf{r}, t) + \int \frac{\rho(\mathbf{r}', t)}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' + V_{xc}[\{\rho\}_{t_0 \le s \le t}](\mathbf{r}) \right) \psi_j(\mathbf{x}, t), \tag{2.11.4}\\
\rho(\mathbf{r}, t) &= \sum_\sigma \sum_{j=1}^{N} |\psi_j(\mathbf{x}, t)|^2. \notag
\end{align}
Compared to the ground state Kohn–Sham DFT, the important difference is that the
exchange-correlation potential $V_{xc}$ is in principle nonlocal not only in the spatial variable
$\mathbf{r}$ but also in the temporal variable $t$. The memory effect can
span the entire history $\{\rho(\mathbf{r}, s)\}_{t_0 \le s \le t}$. Given the already formidable difficulty of finding
the exact exchange-correlation functional in ground state Kohn–Sham DFT, it becomes
much more difficult to further identify the memory effect in TDDFT. In practical
TDDFT calculations, almost all exchange-correlation functionals assume
\[
V_{xc}[\{\rho\}_{t_0 \le s \le t}](\mathbf{r}) \approx V_{xc}^{KS}[\rho(t)](\mathbf{r}), \tag{2.11.5}
\]
where $V_{xc}^{KS}$ is the exchange-correlation functional for ground state calculations. Equation
(2.11.5) is also called the adiabatic approximation in TDDFT. While such an approximation
works well in the computation of a wide range of electronic and optical
properties, it can also fail qualitatively, even when the exact ground state exchange-correlation
functional is available.
From a practical perspective, for a given exchange-correlation functional, one primary
challenge of TDDFT calculation is the small time step, which is usually on the
order of attoseconds (1 attosecond = $10^{-18}$ sec). Compared to the typical time scale of
the atomic movement, which is on the order of femtoseconds (1 femtosecond = $10^{-15}$
sec), the time scale of TDDFT is much shorter, and therefore it can be challenging to
study nonadiabatic dynamics where the degrees of freedom of electrons and nuclei are
propagated simultaneously.
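Finally, let us rewrite (2.11.4) in terms of the time-dependent density matrix:
\[
P(t) = \sum_{j=1}^{N} |\psi_j(t)\rangle\langle\psi_j(t)|.
\]
$P(t)$ then satisfies
\[
i\frac{\partial}{\partial t} P(t) = \sum_{j=1}^{N} \Big( |H(t)\psi_j(t)\rangle\langle\psi_j(t)| - |\psi_j(t)\rangle\langle H(t)\psi_j(t)| \Big) = [H(t), P(t)]. \tag{2.11.6}
\]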

This is the (self-consistent) quantum Liouville equation (also called the von Neumann
equation), which can be seen as an intrinsic representation of TDDFT.
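The commutator form (2.11.6) can be verified numerically in a simplified setting. The sketch below assumes a fixed (time-independent, non-self-consistent) Hermitian matrix `H`, so that the orbitals evolve exactly as $\psi_j(t) = e^{-iHt}\psi_j(0)$; the matrix, the initial orbitals `psi0`, and the helper `density_matrix` are hypothetical and serve only to illustrate the identity.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_occ = 8, 3
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (H + H.conj().T) / 2                       # fixed Hermitian model Hamiltonian
evals, U = np.linalg.eigh(H)

# Orthonormal initial orbitals that are not eigenvectors of H.
psi0, _ = np.linalg.qr(rng.standard_normal((n, n_occ))
                       + 1j * rng.standard_normal((n, n_occ)))

def density_matrix(t):
    """P(t) = sum_j |psi_j(t)><psi_j(t)| with psi_j(t) = exp(-i H t) psi_j(0)."""
    propagator = U @ np.diag(np.exp(-1j * evals * t)) @ U.conj().T
    psi_t = propagator @ psi0
    return psi_t @ psi_t.conj().T

t, dt = 0.7, 1e-5
P = density_matrix(t)
lhs = 1j * (density_matrix(t + dt) - density_matrix(t - dt)) / (2 * dt)  # i dP/dt
rhs = H @ P - P @ H                                                      # [H, P]
residual = np.linalg.norm(lhs - rhs)
print(f"|| i dP/dt - [H, P] || = {residual:.2e}")
```

Note that $P(t)$ is unchanged if the orbitals are rotated by any unitary matrix, which is one sense in which the density matrix formulation is intrinsic.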

Exercises
1. Verify the relations (2.1.16) and (2.1.18).

2. Derive the Euler–Lagrange equation for the unrestricted Hartree–Fock theory.

3. Write down the total energy of the $H_2^+$ molecule in the unrestricted Hartree–Fock
approximation and the corresponding Euler–Lagrange equation. Explain why this
gives the exact solution to the problem. Then write down the total energy of the
$H_2^+$ molecule in Kohn–Sham DFT and the corresponding Euler–Lagrange equation.
If you neglect the exchange-correlation functional, note that the electron
interacts with itself through the Hartree potential, even though there is only one
electron!

4. Write a computer program to solve the one-dimensional $H_2$ molecule using the
unrestricted Hartree–Fock approximation and the restricted Hartree–Fock approximation,
respectively, by optimizing the total energy as a constrained minimization
problem. Plot the energy curve with respect to the bond length.

5. The ground state energy $E$ in Kohn–Sham DFT is closely related to the band
energy, i.e., the sum of the eigenvalues of occupied states. Prove that
\[
E = \sum_{i=1}^{N} \varepsilon_i - \frac{1}{2} \iint \frac{\rho(\mathbf{r})\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' - \int \rho(\mathbf{r}) V_{xc}[\rho](\mathbf{r})\, d\mathbf{r} + E_{xc}[\rho] + E_{II}.
\]
The difference between the total energy and the band energy is called the double
counting term, which is due to the nonlinearity of the Hartree energy and the
exchange-correlation functional.

6. Write down a one-dimensional model for the helium atom. Write a computer
program to solve this system at the restricted Hartree–Fock level. Can you show
that the spin singlet state has a lower energy than the spin triplet state?

7. Solve the constrained minimization problem (2.4.12) for $\ell = 1$ and find that the
solution is given by Broyden's update (2.4.14).

8. Verify the Hellmann–Feynman force formula (2.10.12).

9. Verify that the Wannier functions defined in (2.9.5) are orthonormal, i.e.,
\[
\int_{\mathbb{R}^3} w_{n,\mathbf{R}}(\mathbf{r})\, w_{n',\mathbf{R}'}(\mathbf{r})\, d\mathbf{r} = \delta_{n,n'}\, \delta(\mathbf{R} - \mathbf{R}'), \quad \mathbf{R}, \mathbf{R}' \in \mathbb{L}.
\]

10. Write a computer program to find the localized Boys orbitals for a chain involving
four hydrogen atoms in one dimension.
Chapter 3

Linear response theory

Many properties of chemical or materials systems are characterized by how they re-
spond to external perturbations. When the external perturbation is sufficiently small
that only the leading-order perturbation is important, this is known as the linear response
regime. It is in some sense analogous to the linear stability analysis of a dynamical sys-
tem or that in fluid mechanics. This chapter will give an in-depth discussion of the
linear response theory for both static and dynamic perturbations and their applications.
The analysis will be from the perspective of the perturbation theory of Green’s func-
tions. Towards the end of the chapter, the perturbation theory of a many-body quantum
system and its connection to DFT will also be briefly addressed.

3.1 Perturbation of Green’s function


Recall that for a Hamiltonian operator $H$ and $\lambda$ in the resolvent set of $H$ (i.e., the
complement of the spectrum), we define the Green's function $G_\lambda = (\lambda - H)^{-1}$. For a
function $f$, denote $g = G_\lambda f$; then we have
\[
(\lambda - H)g = (\lambda - H)G_\lambda f = f. \tag{3.1.1}
\]
Thus, the image of the Green's function (as an operator) solves an elliptic equation.
Using the spectral decomposition (assuming again that the spectrum of $H$ is discrete), we have
\[
G_\lambda f = (\lambda - H)^{-1} f = \sum_i (\lambda - \varepsilon_i)^{-1} |\psi_i\rangle\langle\psi_i|f\rangle. \tag{3.1.2}
\]
The kernel of the Green's function, viewed as an integral operator, is thus given by
\[
G_\lambda(\mathbf{r}, \mathbf{r}') = \sum_i (\lambda - \varepsilon_i)^{-1} \psi_i(\mathbf{r})\psi_i^*(\mathbf{r}'). \tag{3.1.3}
\]
By direct computation, we can verify the two useful identities for Green's functions,
known as the resolvent identities:
\begin{align}
(\lambda_1 - H)^{-1} - (\lambda_2 - H)^{-1} &= (\lambda_1 - H)^{-1}(\lambda_2 - \lambda_1)(\lambda_2 - H)^{-1}, \tag{3.1.4}\\
(\lambda - H_1)^{-1} - (\lambda - H_2)^{-1} &= (\lambda - H_1)^{-1}(H_1 - H_2)(\lambda - H_2)^{-1}. \tag{3.1.5}
\end{align}

Indeed, we have
\begin{align}
(\lambda_1 - H)^{-1} - (\lambda_2 - H)^{-1} &= (\lambda_1 - H)^{-1}(\lambda_2 - H)(\lambda_2 - H)^{-1} \notag\\
&\quad - (\lambda_1 - H)^{-1}(\lambda_1 - H)(\lambda_2 - H)^{-1} \tag{3.1.6}\\
&= (\lambda_1 - H)^{-1}(\lambda_2 - \lambda_1)(\lambda_2 - H)^{-1} \notag
\end{align}
and similarly for the other identity.
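Both resolvent identities are easy to confirm numerically for matrices. The following sketch uses hypothetical random symmetric matrices `H1`, `H2` and complex shifts `lam1`, `lam2` chosen off the real axis so that they lie in the resolvent set; none of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

def random_symmetric():
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

H1, H2 = random_symmetric(), random_symmetric()
I = np.eye(n)
lam1, lam2 = 0.4 + 1.0j, -1.3 + 0.5j   # complex shifts avoid the real spectrum

def green(lam, H):
    """Green's function G_lambda = (lambda - H)^{-1} as a dense matrix."""
    return np.linalg.inv(lam * I - H)

# First identity (3.1.4): same Hamiltonian, two shifts.
lhs1 = green(lam1, H1) - green(lam2, H1)
rhs1 = green(lam1, H1) @ ((lam2 - lam1) * I) @ green(lam2, H1)

# Second identity (3.1.5): same shift, two Hamiltonians.
lhs2 = green(lam1, H1) - green(lam1, H2)
rhs2 = green(lam1, H1) @ (H1 - H2) @ green(lam1, H2)

print(np.abs(lhs1 - rhs1).max(), np.abs(lhs2 - rhs2).max())
```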
We now consider the change of the Green's function as the Hamiltonian is perturbed.
Denoting the unperturbed Hamiltonian as $H$, let us consider a perturbed Hamiltonian
\[
H_\epsilon = H + \epsilon W, \tag{3.1.7}
\]
where the perturbation is given by $W$ with the parameter $\epsilon$ being the strength of the
perturbation. To approach the Green's function of the perturbed Hamiltonian $G_{\lambda,\epsilon} :=
(\lambda - H_\epsilon)^{-1}$, we investigate how $g_\epsilon = (\lambda - H_\epsilon)^{-1} f$ depends on $\epsilon$. By definition, $g_\epsilon$
solves the equation
\[
(\lambda - H_\epsilon) g_\epsilon = f. \tag{3.1.8}
\]
Let us take a formal asymptotic series $g_\epsilon = g_0 + \epsilon g_1 + \epsilon^2 g_2 + \cdots$. Substituting into
the above equation, we get
\[
(\lambda - H - \epsilon W)(g_0 + \epsilon g_1 + \epsilon^2 g_2 + \cdots) = f. \tag{3.1.9}
\]
Matching in orders of $\epsilon$, we have
\begin{align}
(\lambda - H) g_0 &= f, \tag{3.1.10}\\
(\lambda - H) g_1 &= W g_0, \tag{3.1.11}\\
(\lambda - H) g_2 &= W g_1, \tag{3.1.12}
\end{align}
and so on. In other words,
\begin{align}
g_0 &= (\lambda - H)^{-1} f, \tag{3.1.13}\\
g_1 &= (\lambda - H)^{-1}(W g_0) = (\lambda - H)^{-1} W (\lambda - H)^{-1} f, \tag{3.1.14}\\
g_2 &= (\lambda - H)^{-1}(W g_1) = (\lambda - H)^{-1} W (\lambda - H)^{-1} W (\lambda - H)^{-1} f. \tag{3.1.15}
\end{align}
Continuing this to higher orders, we get
\[
g_\epsilon = \sum_{m=0}^{\infty} \epsilon^m (\lambda - H)^{-1} \left[ W (\lambda - H)^{-1} \right]^m f, \tag{3.1.16}
\]
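or equivalently,
\[
(\lambda - H_\epsilon)^{-1} = (\lambda - H)^{-1} \sum_{m=0}^{\infty} \epsilon^m \left[ W (\lambda - H)^{-1} \right]^m. \tag{3.1.17}
\]
By the dominated convergence theorem, the above series converges if
\[
\epsilon \, \|W (\lambda - H)^{-1}\| < 1. \tag{3.1.18}
\]
In fact, we can also get (3.1.17) more directly as
\begin{align}
(\lambda - H_\epsilon)^{-1} = (\lambda - H - \epsilon W)^{-1} &= (\lambda - H)^{-1} \left[ I - \epsilon W (\lambda - H)^{-1} \right]^{-1} \notag\\
&= (\lambda - H)^{-1} \sum_{m=0}^{\infty} \left[ \epsilon W (\lambda - H)^{-1} \right]^m, \tag{3.1.19}
\end{align}
where the last step uses the Taylor series of $(1 - x)^{-1}$ at $x = 0$, i.e., the Neumann
series.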
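The convergence of the Neumann series (3.1.17) under condition (3.1.18) can be observed directly on matrices. This is a minimal sketch with hypothetical random symmetric `H` and `W`, a complex shift `lam`, and a perturbation strength `eps` deliberately chosen so that $\epsilon\|W(\lambda - H)^{-1}\| = 0.5$; these choices are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
H = rng.standard_normal((n, n)); H = (H + H.T) / 2
W = rng.standard_normal((n, n)); W = (W + W.T) / 2
I = np.eye(n)
lam = 0.2 + 2.0j                       # off the real axis, hence in the resolvent set

G = np.linalg.inv(lam * I - H)         # unperturbed Green's function
eps = 0.5 / np.linalg.norm(W @ G, 2)   # enforces eps * ||W G|| = 0.5 < 1

exact = np.linalg.inv(lam * I - H - eps * W)

partial = np.zeros_like(G)
term = G.copy()                        # m = 0 term of the series
errors = []
for m in range(40):
    partial = partial + term           # adds G (eps W G)^m
    term = term @ (eps * W @ G)        # next order
    errors.append(np.linalg.norm(partial - exact, 2))

print(errors[0], errors[10], errors[-1])
```

With the contraction factor fixed at 0.5, each additional order roughly halves the error bound, which is the geometric convergence implied by (3.1.18).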
There is yet another way to view this series expansion using the resolvent identity:
\[
(\lambda - H_\epsilon)^{-1} - (\lambda - H)^{-1} = (\lambda - H)^{-1}(H_\epsilon - H)(\lambda - H_\epsilon)^{-1} = (\lambda - H)^{-1} \epsilon W (\lambda - H_\epsilon)^{-1}. \tag{3.1.20}
\]
In other words,
\[
G_{\lambda,\epsilon} - G_\lambda = G_\lambda (\epsilon W) G_{\lambda,\epsilon}. \tag{3.1.21}
\]
This is known as the Dyson equation. Let us now solve the equation by an iteration
scheme
\[
G_{\lambda,\epsilon}^{(n)} = G_\lambda + G_\lambda (\epsilon W) G_{\lambda,\epsilon}^{(n-1)} \tag{3.1.22}
\]
with the initial guess $G_{\lambda,\epsilon}^{(0)} = G_\lambda$. Using the contraction mapping theorem, we can
prove that the iteration converges and that the converged solution is given by (3.1.17).
The above result tells us that if $\lambda$ is not in the spectrum of $H$ and the operator
$W(\lambda - H)^{-1}$ is bounded, then for sufficiently small $\epsilon$, $\lambda - H_\epsilon$ is also invertible. Thus
$\lambda$ is not in the spectrum of $H_\epsilon$. Note that by a similar argument we can also show that
if $\mu$ is close to $\lambda$ so that $|\mu - \lambda| \, \|(\lambda - H)^{-1}\| < 1$, then $\mu$ is also in the resolvent set.
Thus the resolvent set is an open set.

3.2 Perturbation of the density matrix


We now consider the perturbation of the density matrix, which is a central object in
electronic structure theory. Let us focus on the case of zero temperature. The derivation
for the finite temperature case is similar and will be left as an exercise.
First recall the contour integral representation of the density matrix at zero temperature
(cf. (2.5.13))
\[
P = \frac{1}{2\pi i} \oint_C (\lambda - H)^{-1}\, d\lambda, \tag{3.2.1}
\]
where we have assumed a gap between the highest occupied state energy and the lowest
unoccupied state energy, and $C$ encloses the occupied spectrum of the unperturbed
Hamiltonian $H$.
Using the perturbation theory of Green's functions, we can then study how $P$ changes
when the Hamiltonian is perturbed to $H_\epsilon = H + \epsilon W$, where $W$ is a Hermitian operator.
We will assume that $W$ is $H$-bounded (in other words, there exists $\lambda \in \mathbb{C}$ so that
$\|W(\lambda - H)^{-1}\|$ is bounded). As $C$ lies in the resolvent set of $H$, for sufficiently small
$\epsilon$, $C$ remains in the resolvent set of $H_\epsilon$, thanks to the result of section 3.1. Thus, the
contour integral
\[
P_\epsilon = \frac{1}{2\pi i} \oint_C (\lambda - H_\epsilon)^{-1}\, d\lambda \tag{3.2.2}
\]
is well defined. $P_\epsilon$ is also a projection operator.
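The contour integral representation (3.2.1) can be approximated by the trapezoidal rule on a circle. The sketch below is an illustration under stated assumptions: the spectrum `evals`, the random orthogonal matrix `Q`, the circle (center $-1.5$, radius $1.5$), and the helper `contour_projector` are all hypothetical, chosen so that the circle encloses exactly the three lowest eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(4)
evals = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])   # gap between -1 and 1
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
H = Q @ np.diag(evals) @ Q.T
n_occ = 3

def contour_projector(H, center, radius, n_points=200):
    """Trapezoidal rule for (1 / 2 pi i) * oint (lambda - H)^{-1} d lambda on a circle."""
    I = np.eye(H.shape[0])
    P = np.zeros_like(H, dtype=complex)
    for theta in 2 * np.pi * np.arange(n_points) / n_points:
        z = radius * np.exp(1j * theta)
        # d lambda = i z d theta, so the factors i and 2 pi cancel against 1 / (2 pi i).
        P += z * np.linalg.inv((center + z) * I - H) / n_points
    return P

P_contour = contour_projector(H, center=-1.5, radius=1.5)
P_exact = Q[:, :n_occ] @ Q[:, :n_occ].T               # projector onto occupied eigenvectors

print(np.abs(P_contour - P_exact).max())
```

For a periodic analytic integrand the trapezoidal rule converges exponentially in the number of quadrature points, so a modest `n_points` suffices when the contour stays well separated from the spectrum.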


For the perturbation of the density matrix, we calculate the difference
\begin{align}
P_\epsilon - P &= \frac{1}{2\pi i} \oint_C \left[ (\lambda - H_\epsilon)^{-1} - (\lambda - H)^{-1} \right] d\lambda \notag\\
&= \frac{1}{2\pi i} \oint_C (\lambda - H_\epsilon)^{-1} (\epsilon W) (\lambda - H)^{-1}\, d\lambda \notag\\
&\overset{(3.1.17)}{=} \frac{1}{2\pi i} \oint_C \sum_{n=1}^{\infty} (\lambda - H)^{-1} \left[ \epsilon W (\lambda - H)^{-1} \right]^n d\lambda \tag{3.2.3}\\
&= \frac{\epsilon}{2\pi i} \oint_C (\lambda - H)^{-1} W (\lambda - H)^{-1}\, d\lambda + O(\epsilon^2). \notag
\end{align}
If we define an operator $X_0$ as
\[
X_0 W := \frac{1}{2\pi i} \oint_C (\lambda - H)^{-1} W (\lambda - H)^{-1}\, d\lambda, \tag{3.2.4}
\]
we arrive at
\[
P(H + \epsilon W) - P(H) = P_\epsilon - P = \epsilon X_0 W + O(\epsilon^2), \tag{3.2.5}
\]
where the notation $P(H)$ stands for the density matrix corresponding to $H$. In other
words,
\[
\left. \frac{dP(H + \epsilon W)}{d\epsilon} \right|_{\epsilon = 0} = X_0 W. \tag{3.2.6}
\]
More precisely, this means that the Gâteaux derivative of $P$ in the $W$ direction is given
by $X_0 W$:
\[
\frac{\delta P}{\delta V}(W) = \frac{1}{2\pi i} \oint_C \frac{\delta (\lambda - H)^{-1}}{\delta V}(W)\, d\lambda = X_0 W. \tag{3.2.7}
\]
In physical terms, the linear response of the density matrix when the potential changes
is given by the operator $X_0$.
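Using the spectral decomposition, we have
\begin{align}
X_0 W &= \frac{1}{2\pi i} \oint_C \sum_{p,q} \frac{|\psi_p\rangle\langle\psi_p|W|\psi_q\rangle\langle\psi_q|}{(\lambda - \varepsilon_p)(\lambda - \varepsilon_q)}\, d\lambda \notag\\
&= \frac{1}{2\pi i} \sum_{p,q} \left( \oint_C \left[ (\lambda - \varepsilon_p)^{-1} - (\lambda - \varepsilon_q)^{-1} \right] d\lambda \right) \frac{|\psi_p\rangle\langle\psi_p|W|\psi_q\rangle\langle\psi_q|}{\varepsilon_p - \varepsilon_q}. \tag{3.2.8}
\end{align}
The integral above is nonvanishing only if one of $\varepsilon_p$ or $\varepsilon_q$ is enclosed by $C$ and the
other is not; in other words, one of $p, q$ is occupied and the other is unoccupied. Since
they are dummy variables, let us replace the occupied orbital index by $i$ and the unoccupied
orbital index by $a$ (this is a common index convention in the quantum chemistry
literature). We have
\begin{align}
X_0 W &= \sum_i^{occ} \sum_a^{unocc} \frac{|\psi_i\rangle\langle\psi_i|W|\psi_a\rangle\langle\psi_a|}{\varepsilon_i - \varepsilon_a} - \sum_a^{unocc} \sum_i^{occ} \frac{|\psi_a\rangle\langle\psi_a|W|\psi_i\rangle\langle\psi_i|}{\varepsilon_a - \varepsilon_i} \notag\\
&= \sum_i^{occ} \sum_a^{unocc} \frac{|\psi_i\rangle\langle\psi_i|W|\psi_a\rangle\langle\psi_a|}{\varepsilon_i - \varepsilon_a} + \sum_a^{unocc} \sum_i^{occ} \frac{|\psi_a\rangle\langle\psi_a|W|\psi_i\rangle\langle\psi_i|}{\varepsilon_i - \varepsilon_a} \tag{3.2.9}\\
&= \sum_i^{occ} \sum_a^{unocc} \frac{|\psi_i\rangle\langle\psi_i|W|\psi_a\rangle\langle\psi_a|}{\varepsilon_i - \varepsilon_a} + \text{h.c.}, \notag
\end{align}
where h.c. represents the Hermitian conjugate of the first term.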
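The sum-over-states formula (3.2.9) can be compared with a finite-difference derivative of the density matrix, as in (3.2.6). The sketch below restricts to real symmetric matrices; the model spectrum, the random orthogonal matrix `Qmat`, the perturbation `W`, and the helpers `projector` and `x0_apply` are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n_occ = 3
spectrum = np.array([-2.0, -1.6, -1.1, 0.9, 1.4, 2.2, 3.0])   # gap of 2.0 after 3 states
Qmat, _ = np.linalg.qr(rng.standard_normal((7, 7)))
H = Qmat @ np.diag(spectrum) @ Qmat.T
W = rng.standard_normal((7, 7)); W = (W + W.T) / 2

def projector(H, n_occ):
    """Zero-temperature density matrix: projector onto the lowest n_occ eigenvectors."""
    _, V = np.linalg.eigh(H)
    return V[:, :n_occ] @ V[:, :n_occ].T

def x0_apply(H, W, n_occ):
    """Sum-over-states formula (3.2.9) for X0 W in the real symmetric case."""
    eps, V = np.linalg.eigh(H)
    W_eig = V.T @ W @ V                               # matrix elements <psi_p|W|psi_q>
    M = np.zeros_like(W_eig)
    occ, unocc = slice(0, n_occ), slice(n_occ, None)
    M[occ, unocc] = W_eig[occ, unocc] / (eps[occ, None] - eps[None, unocc])
    return V @ (M + M.T) @ V.T                        # first term plus its conjugate

h = 1e-5
fd = (projector(H + h * W, n_occ) - projector(H - h * W, n_occ)) / (2 * h)
X0W = x0_apply(H, W, n_occ)
print(np.abs(fd - X0W).max())
```

Only occupied-unoccupied matrix elements of $W$ enter, consistent with the observation that the contour integral vanishes unless exactly one of $\varepsilon_p$, $\varepsilon_q$ is enclosed.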


Perturbation of the electron density

Of particular interest is the perturbation of the electron density in response to the change
of a local potential denoted by $W(\mathbf{r})$. With some abuse of notation, we use $W$ to
denote both the local potential and its associated multiplicative operator depending on
the context. As $\rho(\mathbf{r}) = P(\mathbf{r}, \mathbf{r})$, we have
\[
\frac{\delta\rho}{\delta V}(W) = \mathrm{diag}(X_0 W) =: \chi_0 W, \tag{3.2.10}
\]
and hence $\chi_0$ applied to $W$ is defined as
\begin{align}
(\chi_0 W)(\mathbf{r}) = (X_0 W)(\mathbf{r}, \mathbf{r}) &= \left[ \frac{1}{2\pi i} \oint_C (\lambda - H)^{-1} W (\lambda - H)^{-1}\, d\lambda \right](\mathbf{r}, \mathbf{r}) \notag\\
&= \sum_i^{occ} \sum_a^{unocc} \frac{\langle\psi_i|W|\psi_a\rangle}{\varepsilon_i - \varepsilon_a} \psi_i(\mathbf{r}) \psi_a^*(\mathbf{r}) + \text{c.c.} \tag{3.2.11}
\end{align}
Here c.c. represents the complex conjugate of the first term. The operator $\chi_0$, which
describes the linear response of the density with respect to the change of the potential, is
known as the polarizability operator (or in the current case of independent particles, the
independent particle polarizability operator or the irreducible polarizability operator).
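Observe that
\begin{align}
\int W(\mathbf{r}) (\chi_0 W)(\mathbf{r})\, d\mathbf{r} &= \sum_i^{occ} \sum_a^{unocc} \frac{\langle\psi_i|W|\psi_a\rangle\langle\psi_a|W|\psi_i\rangle}{\varepsilon_i - \varepsilon_a} + \text{c.c.} \notag\\
&= 2 \sum_i^{occ} \sum_a^{unocc} \frac{|\langle\psi_i|W|\psi_a\rangle|^2}{\varepsilon_i - \varepsilon_a} \le 0. \tag{3.2.12}
\end{align}
Hence $\chi_0$ is a nonpositive operator. We also note the bound
\begin{align}
-\int W(\mathbf{r}) (\chi_0 W)(\mathbf{r})\, d\mathbf{r} &= 2 \sum_i^{occ} \sum_a^{unocc} \frac{|\langle\psi_i|W|\psi_a\rangle|^2}{\varepsilon_a - \varepsilon_i} \le \frac{2}{\varepsilon_{gap}} \sum_i^{occ} \sum_a^{unocc} |\langle\psi_i|W|\psi_a\rangle|^2 \notag\\
&\le \frac{1}{\varepsilon_{gap}} \sum_{p,q} |\langle\psi_p|W|\psi_q\rangle|^2 = \frac{1}{\varepsilon_{gap}} \|W\|_2^2, \tag{3.2.13}
\end{align}
where $\varepsilon_{gap}$ is the spectral gap between the occupied and unoccupied spectra. Here $\|W\|_2$
should be understood as the operator norm for $W$. This implies that, as an operator, $\chi_0$
satisfies
\[
-\frac{1}{\varepsilon_{gap}} I \preceq \chi_0 \preceq 0. \tag{3.2.14}
\]
In other words, for systems with a spectral gap, the response of the density with respect
to a perturbation of the potential cannot be arbitrarily large in the linear regime.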

While it appears from the definition (3.2.11) that the evaluation of $\chi_0 W$ would
involve the summation over unoccupied orbitals, in fact $\chi_0 W$ can be obtained by using
only occupied orbitals with the help of Green's function as follows. For simplicity,
assume all eigenfunctions $\psi_i$ are real; then
\begin{align*}
\chi_0(\mathbf{r}, \mathbf{r}') &= \sum_i^{occ} \sum_a^{unocc} \frac{\langle\psi_i|\delta_{\mathbf{r}}|\psi_a\rangle\langle\psi_a|\delta_{\mathbf{r}'}|\psi_i\rangle}{\varepsilon_i - \varepsilon_a} + \text{c.c.} \\
&= 2 \sum_i^{occ} \psi_i(\mathbf{r}) \left( \sum_a^{unocc} \frac{\psi_a(\mathbf{r}) \psi_a(\mathbf{r}')}{\varepsilon_i - \varepsilon_a} \right) \psi_i(\mathbf{r}') \\
&= 2 \sum_i^{occ} \psi_i(\mathbf{r}) \left[ Q (\varepsilon_i - H)^{-1} Q \right](\mathbf{r}, \mathbf{r}')\, \psi_i(\mathbf{r}'),
\end{align*}
where $Q = I - \sum_j^{occ} |\psi_j\rangle\langle\psi_j|$ is the projection operator to the unoccupied space. The
advantage of the above representation is that it only involves the occupied orbitals. Then
\begin{align*}
\int \chi_0(\mathbf{r}, \mathbf{r}') W(\mathbf{r}')\, d\mathbf{r}' &= 2 \sum_i^{occ} \psi_i(\mathbf{r}) \int \left[ Q (\varepsilon_i - H)^{-1} Q \right](\mathbf{r}, \mathbf{r}')\, \psi_i(\mathbf{r}') W(\mathbf{r}')\, d\mathbf{r}' \\
&:= 2 \sum_i^{occ} \psi_i(\mathbf{r}) \xi_i(\mathbf{r}),
\end{align*}
where $\xi_i$ can be obtained by solving the Sternheimer equation
\[
Q (\varepsilon_i - H) Q \xi_i = Q(\psi_i W). \tag{3.2.15}
\]

Although the operator on the left-hand side of (3.2.15) is singular, the equation is well
posed since the right-hand side has a vanishing component in the kernel space of Q(εi −
H)Q due to the projection Q. Furthermore, Q(εi − H)Q is invertible when restricted
to the range of Q, and the solution of (3.2.15) is unique.
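To make the Sternheimer approach concrete, the sketch below applies $\chi_0$ to a local potential in two ways, once via (3.2.15) using only occupied orbitals and once via the explicit sum over unoccupied states in (3.2.11). The one-dimensional finite-difference Hamiltonian, the grid, the perturbation `W`, and the function names are hypothetical. One assumption about the solver should be flagged: the singular operator $Q(\varepsilon_i - H)Q$ is regularized by adding the occupied-space projector, which leaves the solution in the range of $Q$ unchanged; this is a convenient device for a dense toy problem, not necessarily what production codes do.

```python
import numpy as np

n, n_occ = 30, 4
x = np.linspace(-1.0, 1.0, n)
lap = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
H = -0.5 * lap + np.diag(2.0 * x**2)         # toy finite-difference Hamiltonian
W = np.cos(3.0 * x)                          # local perturbing potential W(r)

eps, psi = np.linalg.eigh(H)
occ = psi[:, :n_occ]
P_occ = occ @ occ.T
Q = np.eye(n) - P_occ                        # projector onto the unoccupied space

def chi0_sternheimer(W):
    """chi0 W = 2 sum_i psi_i * xi_i with xi_i from the Sternheimer equation."""
    out = np.zeros(n)
    for i in range(n_occ):
        A = Q @ (eps[i] * np.eye(n) - H) @ Q
        rhs = Q @ (psi[:, i] * W)
        # A vanishes on the occupied space; adding P_occ makes the matrix
        # invertible without changing the solution inside range(Q).
        xi = np.linalg.solve(A + P_occ, rhs)
        out += 2.0 * psi[:, i] * xi
    return out

def chi0_sum_over_states(W):
    """Reference: explicit double sum over occupied and unoccupied orbitals."""
    out = np.zeros(n)
    for i in range(n_occ):
        for a in range(n_occ, n):
            w_ia = psi[:, i] @ (W * psi[:, a])
            out += 2.0 * psi[:, i] * psi[:, a] * w_ia / (eps[i] - eps[a])
    return out

u_stern = chi0_sternheimer(W)
u_sos = chi0_sum_over_states(W)
print(np.abs(u_stern - u_sos).max())
```

In this dense example both routes cost about the same; the benefit of the Sternheimer form appears when only the occupied orbitals are available and the linear systems are solved iteratively.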

Perturbation of eigenvalues
As an application, the perturbation theory of the density matrix can be used to study the
perturbation of eigenvalues of a Hamiltonian. Let ε0 be an isolated eigenvalue of H.
Hence we can take a disk D = B(ε0 , r) ⊂ C in the complex plane centered at ε0 with
a sufficiently small radius r so that spec(H) ∩ D = {ε0 }.
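Let $P$ be the projector onto the eigenspace corresponding to $\varepsilon_0$. Recall the contour
integral formula
\[
P = \frac{1}{2\pi i} \oint_{\partial D} (\lambda - H)^{-1}\, d\lambda. \tag{3.2.16}
\]
After perturbation, the projector becomes
\[
P_\epsilon = \frac{1}{2\pi i} \oint_{\partial D} (\lambda - H_\epsilon)^{-1}\, d\lambda. \tag{3.2.17}
\]
For $\epsilon$ sufficiently small, $\partial D$ will lie in the resolvent set of the perturbed Hamiltonian
$H_\epsilon$ and hence $P_\epsilon$ is well defined and is a projection operator.
We next compare the two projection operators as before:
\begin{align}
P_\epsilon - P &= \frac{1}{2\pi i} \oint_{\partial D} \left[ (\lambda - H_\epsilon)^{-1} - (\lambda - H)^{-1} \right] d\lambda \notag\\
&= \frac{1}{2\pi i} \oint_{\partial D} (\lambda - H_\epsilon)^{-1} (H_\epsilon - H) (\lambda - H)^{-1}\, d\lambda. \tag{3.2.18}
\end{align}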
Thus, the norm of the difference is on the order of $\epsilon$ as long as $W$ is $H$-bounded (note
that the contour is compact and thus $|\lambda|$ is bounded). In particular, we have $\|P_\epsilon - P\| < 1$
for $\epsilon$ sufficiently small. As both are projection operators, this implies that [76]
\[
\mathrm{rank}\, P_\epsilon = \mathrm{rank}\, P = \dim \ker(H - \varepsilon_0). \tag{3.2.19}
\]
This means that $H_\epsilon$ has the same number of eigenvalues as $H$ in $D$ for $\epsilon$ sufficiently
small.
Now we study the eigenvalues and eigenfunctions under perturbation. Recall that
\[
P_\epsilon = P + \epsilon X_0 W + O(\epsilon^2). \tag{3.2.20}
\]
Denote the eigenfunctions of $H$ with eigenvalue $\varepsilon_0$ as $\psi_1, \ldots, \psi_k$. As $P_\epsilon$ is close to
$P$, the range of $P_\epsilon$ is spanned by $P_\epsilon \psi_i$, $i = 1, \ldots, k$ (note that these functions after
projection might not be orthonormal). We calculate
\begin{align}
P_\epsilon \psi_i &= (P + \epsilon X_0 W)\psi_i + O(\epsilon^2) = \psi_i + \epsilon (X_0 W)\psi_i + O(\epsilon^2) \notag\\
&= \psi_i + \epsilon (\varepsilon_0 - H)^{-1} (I - P) W \psi_i + O(\epsilon^2), \tag{3.2.21}
\end{align}
where the last identity follows from a similar calculation in deriving the Sternheimer
equation (3.2.15).
If $\varepsilon_0$ is nondegenerate, (3.2.21) gives the perturbation of the eigenfunctions. We can
also get the correction of the eigenvalue by applying $H_\epsilon$ to $P_\epsilon \psi_i$, which we will leave
as an exercise.
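For the degenerate case, we want to make $P_\epsilon \psi_i$ eigenfunctions of $H_\epsilon$ by appropriately
choosing $\psi_i$ (note that since the eigenvalue is degenerate, the choice of eigenfunctions
is not unique). We thus calculate
\begin{align}
H_\epsilon P_\epsilon \psi_i &= H\psi_i + \epsilon W \psi_i + \epsilon H (\varepsilon_0 - H)^{-1} (I - P) W \psi_i + O(\epsilon^2) \notag\\
&= \varepsilon_0 \psi_i + \epsilon P W \psi_i + \epsilon \varepsilon_0 (\varepsilon_0 - H)^{-1} (I - P) W \psi_i + O(\epsilon^2) \tag{3.2.22}\\
&= \varepsilon_0 P_\epsilon \psi_i + \epsilon P W \psi_i + O(\epsilon^2). \notag
\end{align}
Thus we require $P W \psi_i$ to be parallel to $P_\epsilon \psi_i$ up to $O(\epsilon)$. This is equivalent to
\[
P W \psi_i = P W P \psi_i = \sum_{j=1}^{k} \psi_j \langle\psi_j|W|\psi_i\rangle = \varepsilon_i^{(1)} \psi_i \tag{3.2.23}
\]
for some $\varepsilon_i^{(1)}$. Therefore, $\psi_i$ should be chosen such that $\langle\psi_j|W|\psi_i\rangle$ is diagonal, or
equivalently, $\psi_i$ should diagonalize the operator $W$ restricted to the range of $P$. The
eigenvalue gives us the first-order correction to the eigenvalue as $\varepsilon_0 + \epsilon \varepsilon_i^{(1)}$. This can
be extended to higher orders, which we will not discuss further here.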
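The degenerate first-order result (3.2.23) can be checked on a small matrix with a threefold degenerate eigenvalue. In the sketch below, the spectrum, the random orthogonal matrix `Q`, the perturbation `W`, and the strength `eps` are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
H = Q @ np.diag([1.0, 1.0, 1.0, 3.0, 4.0, 5.0]) @ Q.T   # eps0 = 1 is threefold degenerate
W = rng.standard_normal((6, 6)); W = (W + W.T) / 2

basis = Q[:, :3]                  # any orthonormal basis of ker(H - eps0)

# Eigenvalues of W restricted to range(P): the first-order corrections eps_i^(1).
first_order = np.linalg.eigvalsh(basis.T @ W @ basis)

eps = 1e-4
perturbed = np.linalg.eigvalsh(H + eps * W)[:3]          # the split eigenvalues
predicted = 1.0 + eps * first_order
max_error = np.max(np.abs(perturbed - predicted))
print(perturbed, predicted, max_error)
```

The remaining discrepancy is of second order in the perturbation strength, consistent with the $O(\epsilon^2)$ remainder in (3.2.22).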

3.3 Density functional perturbation theory


In this section we apply the general result of linear response theory in Section 3.2 to
Kohn–Sham DFT. The resulting theory is called density functional perturbation the-
ory (DFPT). Since the effective Hamiltonian depends self-consistently on the electron
density, when considering the perturbation of the electron density with respect to the
external potential we also need to take into account the change of the effective potential
induced by the density perturbation. This leads to a set of self-consistent equations,
similar to the self-consistent Kohn–Sham equations.
Denote the effective Hamiltonian as $H[\rho] = -\frac{1}{2}\Delta + V_{ext} + V_{Hxc}[\rho]$, where $V_{ext}$
is the external potential and $V_{Hxc}[\rho]$ is the Hartree and exchange-correlation potential
induced by electron density $\rho$. Kohn–Sham DFT can be solved self-consistently via the
following nonlinear equation:
\[
\rho(\mathbf{r}) = \mathbb{1}_{(-\infty, \mu]}(H[\rho])(\mathbf{r}, \mathbf{r}). \tag{3.3.1}
\]

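Let us denote by $H_0$ the self-consistent Hamiltonian and consider a small perturbation
to the external potential denoted by $\delta V_{ext}$. We assume that the density changes to
$\rho + \delta\rho$. Thus, up to the leading order, the effective potential changes to
\[
V_{Hxc}[\rho + \delta\rho] = V_{Hxc}[\rho] + f_{Hxc}[\delta\rho] + O((\delta\rho)^2), \tag{3.3.2}
\]
where $f_{Hxc}(\mathbf{r}, \mathbf{r}') = \frac{\delta V_{Hxc}(\mathbf{r})}{\delta\rho(\mathbf{r}')}$ is the kernel for the linearized operator associated with
$V_{Hxc}$. For LDA:
\[
V_{Hxc}[\rho](\mathbf{r}) = \int \frac{1}{|\mathbf{r} - \mathbf{r}'|} \rho(\mathbf{r}')\, d\mathbf{r}' + v_{xc}(\rho(\mathbf{r})). \tag{3.3.3}
\]
Then the linearization is given by
\[
(f_{Hxc}[g])(\mathbf{r}) = \int \frac{1}{|\mathbf{r} - \mathbf{r}'|} g(\mathbf{r}')\, d\mathbf{r}' + v_{xc}'(\rho(\mathbf{r})) g(\mathbf{r}). \tag{3.3.4}
\]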
Similar calculations can be done for more general exchange-correlation energy functionals.
Thus, up to the leading order, the Hamiltonian becomes
\[
H = H_0 + f_{Hxc}[\delta\rho] + \delta V_{ext}. \tag{3.3.5}
\]
Note that the perturbation to the potential comes from the direct perturbation to the
external potential and the induced change of the effective potential due to the change of
the density. Thus, from the discussion in section 3.2, the corresponding perturbation of
the density satisfies the equation
\[
\delta\rho = \chi_0 (f_{Hxc}[\delta\rho] + \delta V_{ext}), \tag{3.3.6}
\]
where we recall that $\chi_0$ is the irreducible polarizability operator. Equivalently, we have
\[
(I - \chi_0 f_{Hxc}) \delta\rho = \chi_0 (\delta V_{ext}). \tag{3.3.7}
\]
Assuming $I - \chi_0 f_{Hxc}$ is invertible, we then get
\[
\delta\rho = (I - \chi_0 f_{Hxc})^{-1} \chi_0 (\delta V_{ext}) =: \chi (\delta V_{ext}). \tag{3.3.8}
\]
The last equality defines $\chi$, called the reducible polarizability operator, which is the
linear response of the density with respect to perturbation of the external potential in
DFT. If the exact exchange-correlation functional is used, the reducible polarizability
operator $\chi$ defined here should agree with $\chi_{exact}$, the exact linear response of the many-body
electron system, which is to be further discussed in section 3.7.
From a physical point of view, if $I - \chi_0 f_{Hxc}$ is not invertible, it means that it is
possible that a small perturbation of the potential generates a large perturbation to the
density. This is to say that the electronic structure of the system is not stable with respect
to external perturbations. Thus, the invertibility of the operator is known as the stability
condition of electronic structure in the context of Kohn–Sham DFT [23].
In many applications, we do not need to have access to the full reducible polarizability
operator $\chi$, but only the application of $\chi$ on some vector, say $g$:
\[
u(\mathbf{r}) := (\chi g)(\mathbf{r}) = \int \chi(\mathbf{r}, \mathbf{r}') g(\mathbf{r}')\, d\mathbf{r}'.
\]
Recalling the definition of $\chi$,
\[
u = \chi g = (I - \chi_0 f_{Hxc})^{-1} \chi_0 g,
\]
we have
\[
u - \chi_0 f_{Hxc} u = \chi_0 g,
\]
or equivalently,
\[
u = \chi_0 g + \chi_0 f_{Hxc} u. \tag{3.3.9}
\]
Equation (3.3.9) can be solved iteratively as a fixed point problem for $u$. A simple
iteration scheme can be constructed by recursively substituting $u$ into the right-hand
side. This leads to the Neumann series
\[
u = \chi_0 g + \chi_0 f_{Hxc} \chi_0 g + \chi_0 f_{Hxc} \chi_0 f_{Hxc} \chi_0 g + \cdots.
\]
The iterative solution requires the application of $\chi_0$ to a vector, which can be obtained
efficiently by solving the Sternheimer equations (3.2.15).
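The fixed point iteration for (3.3.9) can be sketched on a small discretized model. Everything in the example below is an assumption made for illustration: the toy one-dimensional Hamiltonian, the soft-Coulomb-like kernel used as a stand-in for $f_{Hxc}$, and in particular the rescaling of that kernel so that $\|\chi_0\| \|f_{Hxc}\| = 0.5$, which guarantees convergence of the plain iteration. For realistic kernels the Neumann series need not converge, and $\chi_0$ is formed here as a dense matrix only for brevity.

```python
import numpy as np

n, n_occ = 20, 3
x = np.linspace(-1.0, 1.0, n)
lap = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
H = -0.5 * lap + np.diag(2.0 * x**2)            # toy Hamiltonian on a grid
eps, psi = np.linalg.eigh(H)

# Dense chi0 from the sum-over-states formula (real orbitals).
chi0 = np.zeros((n, n))
for i in range(n_occ):
    for a in range(n_occ, n):
        pair = psi[:, i] * psi[:, a]
        chi0 += 2.0 * np.outer(pair, pair) / (eps[i] - eps[a])

# Hypothetical kernel standing in for f_Hxc, rescaled so the iteration contracts.
f_hxc = 1.0 / np.sqrt((x[:, None] - x[None, :]) ** 2 + 1.0)
f_hxc *= 0.5 / (np.linalg.norm(chi0, 2) * np.linalg.norm(f_hxc, 2))

g = np.sin(2.0 * x)
u = chi0 @ g                                    # zeroth-order term of the series
for _ in range(100):
    u = chi0 @ g + chi0 @ (f_hxc @ u)           # fixed point map of (3.3.9)

u_direct = np.linalg.solve(np.eye(n) - chi0 @ f_hxc, chi0 @ g)
print(np.abs(u - u_direct).max())
```

Each sweep needs one application of $\chi_0$ and one of $f_{Hxc}$ to a vector, so the full operator $\chi$ is never formed by the iteration itself.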

3.4 Applications of density functional perturbation theory


Here we consider some applications of DFPT.

Example: Dielectric screening


Consider a system at its ground state with a self-consistent electron density ρ. Let us
apply some small external change to the charge density denoted by δρext , for instance by
adding a small number of electrons to the system. The external charge density induces
a small change to the external potential denoted by δVext via the Coulomb interaction,
i.e.,
δVext = vC δρext . (3.4.1)
Recall that if the effective potential were independent of ρ, the linear response would
be described by the irreducible polarizability operator χ0 as

δρ = χ0 δVext = χ0 vC δρext . (3.4.2)

However, since VHxc depends on ρ, the change of the electron density ρ further induces
the change of the effective potential. Therefore the change of the electron density δρ
must satisfy the following self-consistent equation:

δρ = χ0 (δVext + δVHxc ) = χ0 (δVext + fHxc δρ) . (3.4.3)

Taking the random phase approximation (RPA) for the exchange-correlation functional,
i.e., neglecting the exchange-correlation kernel $f_{xc}$ and setting $f_{Hxc} \approx v_C$, we get
\[
\delta\rho \approx (I - \chi_0 v_C)^{-1} \chi_0 v_C \delta\rho_{ext} = \chi v_C \delta\rho_{ext}. \tag{3.4.4}
\]
Therefore the total change of the effective potential is
\begin{align}
\delta V_{eff} &\approx \delta V_{ext} + v_C (I - \chi_0 v_C)^{-1} \chi_0 v_C \delta\rho_{ext} \notag\\
&= v_C \delta\rho_{ext} + v_C (I - \chi_0 v_C)^{-1} (\chi_0 v_C - I + I) \delta\rho_{ext} \notag\\
&= v_C (I - \chi_0 v_C)^{-1} \delta\rho_{ext} \tag{3.4.5}\\
&= (I - v_C \chi_0)^{-1} v_C \delta\rho_{ext}. \notag
\end{align}
Define the dielectric operator as
\[
\varepsilon_d = I - f_{Hxc} \chi_0. \tag{3.4.6}
\]
Then, with RPA, $\varepsilon_d \approx I - v_C \chi_0$ and we have
\[
\delta V_{eff} \approx \varepsilon_d^{-1} v_C \delta\rho_{ext} := w_C \delta\rho_{ext}.
\]
Here $w_C = \varepsilon_d^{-1} v_C$ is the screened Coulomb interaction operator. The inverse of the
dielectric operator $\varepsilon_d^{-1}$ characterizes the screening effect of the electron system with
respect to the perturbation of the external charge, and is directly related to the dielectric
constant in macroscopic electrostatic theory [1, 85]. In contrast, $v_C$ is sometimes
referred to as the bare Coulomb interaction, for which the electron screening is not considered.
The screened Coulomb operator $w_C$ also plays an important role in many-body
perturbation theory, including the GW theory [34].
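The chain of identities in (3.4.5) can be confirmed with model matrices. In the sketch below, `chi0` is a hypothetical negative semi-definite matrix and `v_c` a hypothetical positive definite matrix; they mimic only the definiteness properties of the physical operators and are not discretizations of them.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 8
A = rng.standard_normal((n, n))
chi0 = -A @ A.T                        # model negative semi-definite chi0
B = rng.standard_normal((n, n))
v_c = B @ B.T + np.eye(n)              # model positive definite bare interaction
I = np.eye(n)
drho_ext = rng.standard_normal(n)      # external charge perturbation

dv_ext = v_c @ drho_ext                                   # (3.4.1)
drho = np.linalg.solve(I - chi0 @ v_c, chi0 @ dv_ext)     # RPA response (3.4.4)
dv_eff = dv_ext + v_c @ drho                              # bare plus induced potential

eps_d = I - v_c @ chi0                                    # RPA dielectric operator
w_c = np.linalg.solve(eps_d, v_c)                         # screened interaction

print(np.abs(dv_eff - w_c @ drho_ext).max())
```

The agreement of `dv_eff` with `w_c @ drho_ext` reflects the operator identity $v_C (I - \chi_0 v_C)^{-1} = (I - v_C \chi_0)^{-1} v_C$ used in the last step of (3.4.5).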

Example: Macroscopic polarizability tensor


The perturbation theory also allows us to consider the response of electrons with respect
to a macroscopic perturbation, such as the electric field [5]. The change of the external
potential induced by an electric field $E = \varepsilon(1, 0, 0)^{\top}$ is (with $|\varepsilon| \ll 1$ so that we are in
the linear response regime)
\[ \delta V_{\mathrm{ext}}(r) = -E \cdot r = -\varepsilon r_1. \tag{3.4.7} \]
The self-consistent response of the electron density is given by the reducible polarizability operator as
\[ \delta\rho(r) = \int \chi(r, r')\, \delta V_{\mathrm{ext}}(r') \,dr'. \tag{3.4.8} \]

Experimentally, one cannot measure the full detail of δρ but only its macroscopic average.
One measurable quantity is the induced dipole moment along each of the directions $\{r_\alpha\}_{\alpha=1}^{3}$,
defined respectively as
\[ d_\alpha = \int r_\alpha \,\delta\rho(r) \,dr = -\varepsilon \iint r_\alpha \,\chi(r, r')\, r'_1 \,dr \,dr'. \tag{3.4.9} \]

In general, in the linear response regime the induced dipole moment is a linear function
of the external electric field E,
\[ d_\alpha = \sum_{\beta} A_{\alpha\beta} E_\beta, \tag{3.4.10} \]

where A is called the (macroscopic) polarizability tensor, defined as
\[ A_{\alpha\beta} = -\iint r_\alpha \,\chi(r, r')\, r'_\beta \,dr \,dr'. \tag{3.4.11} \]

Since χ is in general negative semi-definite (note that χ0 is always negative semi-definite),
the polarizability tensor A is in general a positive semi-definite matrix.
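
As an illustration of (3.4.11), the sketch below evaluates the polarizability tensor for a small noninteracting model, so that f_Hxc = 0 and χ = χ0. Everything specific here is an assumption made for the example: the site positions `R`, the distance-dependent hopping Hamiltonian `H`, the number of occupied orbitals, the use of real orbitals, and the omission of spin factors. The integral over r is replaced by a sum over sites.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sites, n_occ = 6, 2
R = rng.uniform(-2.0, 2.0, size=(n_sites, 3))   # hypothetical site positions

# Hypothetical real symmetric model Hamiltonian with distance-dependent hopping.
dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
H = -np.exp(-dist)
np.fill_diagonal(H, 0.0)

eps, psi = np.linalg.eigh(H)   # eigenvalues in ascending order, real orbitals

# Static chi0 as a sum over occupied (i) / unoccupied (a) pairs; for real
# orbitals the two terms of the kernel coincide, hence the factor 2.
chi0 = np.zeros((n_sites, n_sites))
for i in range(n_occ):
    for a in range(n_occ, n_sites):
        pair = psi[:, i] * psi[:, a]
        chi0 += 2.0 * np.outer(pair, pair) / (eps[i] - eps[a])

# Discrete version of (3.4.11): A_{alpha beta} = - sum_{r, r'} r_alpha chi(r, r') r'_beta.
A = -R.T @ chi0 @ R
print(np.linalg.eigvalsh(A))
```

In this sketch χ0 is negative semi-definite by construction, so the eigenvalues of A are nonnegative. A constant potential shift produces no density response, since the orbitals are orthogonal.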

Example: Analysis of self-consistent field iteration schemes


Here we use DFPT to understand why fixed point iteration often fails to converge in the
SCF iteration, and why simple mixing sometimes resolves this problem. Since the
exchange-correlation functional is neither convex nor concave with respect to the electron density,
the rigorous study of the global convergence properties of SCF schemes in Kohn–Sham
DFT calculations is very difficult. Hence we consider the linear response regime, where
we assume that the initial effective potential $V_0$ is already close to $V_\star$, the converged
Kohn–Sham effective potential. Let

\[ e_k := V_k - V_\star \]

be the error of the potential at the kth iteration. Recall the fixed point iteration for the
SCF iteration
\[ V_{k+1} = V_{\mathrm{eff}}[F_{\mathrm{KS}}[V_k]]. \tag{3.4.12} \]
In order to study the propagation of the error in the fixed point iteration, we apply the
chain rule and we have
\[ e_{k+1} = f_{\mathrm{Hxc}} \chi_0 e_k + O(\|e_k\|^2), \tag{3.4.13} \]
since $f_{\mathrm{Hxc}} = \frac{\delta V_{\mathrm{eff}}}{\delta\rho}$ and $\chi_0 = \frac{\delta F_{\mathrm{KS}}}{\delta V}$. In the linear response regime, we assume that the
$O(\|e_k\|^2)$ term is small and is omitted in the following discussion. Then, after k steps,
\[ e_k \approx (f_{\mathrm{Hxc}} \chi_0)^k e_0. \tag{3.4.14} \]

Hence, in the linear response regime, the convergence of the fixed point iteration re-
quires that the spectral radius of the operator, denoted by rσ (fHxc χ0 ), is smaller than
1. Unfortunately, such a spectral radius is generally much larger than 1, and the error
in the fixed point iteration will therefore diverge, even if the initial potential is already
very close to the self-consistent potential.
In the simple mixing method, the error propagation follows as
\[ e_{k+1} = (I - \alpha \varepsilon_d)\, e_k + O(\|e_k\|^2), \tag{3.4.15} \]

where the dielectric operator $\varepsilon_d = I - f_{\mathrm{Hxc}} \chi_0$ is as defined in (3.4.6). Equation (3.4.15)
can be iterated recursively to have
\[ e_k \approx (I - \alpha \varepsilon_d)^k e_0. \tag{3.4.16} \]

Under proper assumptions on $f_{\mathrm{Hxc}}$, $\varepsilon_d$ is similar to a positive definite matrix [55]. In
particular, $\varepsilon_d$ is diagonalizable with real positive eigenvalues. In order to achieve
convergence in the linear response regime, we need each eigenvalue of $\varepsilon_d$, denoted by λ, to
satisfy
\[ |1 - \alpha\lambda| < 1, \]
or equivalently,
\[ 0 < \alpha < \frac{2}{\lambda}. \]
Denote by $\lambda_{\min}$, $\lambda_{\max}$ the smallest and largest eigenvalues of $\varepsilon_d$, respectively; the
spectral radius $r_\sigma(\varepsilon_d)$ is then given by $\lambda_{\max}$. If
\[ 0 < \alpha < \frac{2}{r_\sigma(\varepsilon_d)} \tag{3.4.17} \]
is satisfied, the simple mixing will converge. In practical calculations, the spectral radius
$r_\sigma(\varepsilon_d)$ can be large, and hence α needs to be chosen rather small and the simple mixing
method converges with a very slow rate. Thus the simple mixing method is rarely used
directly in practical electronic structure calculations.
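It remains to decide what is the optimal choice of α satisfying the constraint (3.4.17).
This requires the solution of the following minimax problem:

\[ \min_{\alpha} \max_{\lambda} |1 - \alpha\lambda|. \tag{3.4.18} \]

The optimal choice of α satisfies

\[ 1 - \alpha\lambda_{\min} = \alpha\lambda_{\max} - 1 \tag{3.4.19} \]

or
\[ \alpha = \frac{2}{\lambda_{\min} + \lambda_{\max}}. \tag{3.4.20} \]
Substituting this choice of α into (3.4.15), we find that the optimal convergence rate of
simple mixing is

\[ \max_{\lambda} |1 - \alpha\lambda| = \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}} = \frac{\kappa(\varepsilon_d) - 1}{\kappa(\varepsilon_d) + 1}. \tag{3.4.21} \]

Here $\kappa(\varepsilon_d) = \lambda_{\max}/\lambda_{\min}$ is the condition number of the dielectric operator.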
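
The linearized error recursions (3.4.14) and (3.4.16) can be tried out on a synthetic dielectric matrix. In the sketch below, `eps_d` is a random symmetric positive definite matrix with a prescribed spectrum in [0.5, 50]. The spectrum, the matrix size, and the initial error are assumptions chosen only to make the behavior visible. Setting α = 1 recovers the plain fixed point iteration, since I − ε_d = f_Hxc χ0.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
lam = np.linspace(0.5, 50.0, n)                     # assumed eigenvalues of eps_d
Qmat, _ = np.linalg.qr(rng.standard_normal((n, n)))
eps_d = Qmat @ np.diag(lam) @ Qmat.T                # symmetric positive definite model

def error_norms(alpha, steps=50):
    """Iterate e_{k+1} = (I - alpha * eps_d) e_k and record ||e_k||."""
    e = np.ones(n) / np.sqrt(n)                     # normalized initial error
    norms = []
    for _ in range(steps):
        e = e - alpha * (eps_d @ e)
        norms.append(np.linalg.norm(e))
    return np.array(norms)

alpha_opt = 2.0 / (lam.min() + lam.max())           # (3.4.20)
rate_opt = (lam.max() - lam.min()) / (lam.max() + lam.min())   # (3.4.21)

fixed_point = error_norms(1.0)                      # alpha = 1: plain fixed point iteration
mixed = error_norms(alpha_opt)                      # simple mixing with the optimal alpha

print("fixed point, final error  :", fixed_point[-1])
print("simple mixing, final error:", mixed[-1])
print("bound rate_opt**50        :", rate_opt ** 50)
```

With this spectrum the fixed point iteration blows up because λ_max > 2, while simple mixing contracts, although only at the slow rate (κ − 1)/(κ + 1) with κ = 100.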

Example: Phonon calculation


As another application of DFPT, we consider an isolated system with M atoms. The
potential energy at the atomic configuration $\{R_I\}_{I=1}^{M}$ is denoted by $E(\{R_I\})$, which
is the total energy obtained from Kohn–Sham DFT. If the system is at an equilibrium
configuration $\{R_I^\star\}_{I=1}^{M}$, the force should vanish for all atoms, i.e.,

\[ F_I(\{R_I^\star\}) = -\left.\frac{\partial E(\{R_I\})}{\partial R_I}\right|_{\{R_I\}=\{R_I^\star\}} = 0, \quad \forall I = 1, \ldots, M. \tag{3.4.22} \]

Then, in the presence of a small perturbation away from $R_I^\star$, the dynamics of the nuclei
is given by Newton’s law,

\[ M_I \frac{d^2}{dt^2} R_I = F_I(\{R_I\}) \approx -\sum_{J} \left.\frac{\partial^2 E(\{R_I\})}{\partial R_I \,\partial R_J}\right|_{\{R_I\}=\{R_I^\star\}} (R_J - R_J^\star), \tag{3.4.23} \]

where we have used Taylor’s expansion around $R_I^\star$ and omitted higher-order terms. If
we define the displacement vector $\delta R_I = R_I - R_I^\star$, (3.4.23) can be written as

\[ M_I \frac{d^2}{dt^2} \delta R_I \approx -\sum_{J} \left.\frac{\partial^2 E(\{R_I\})}{\partial R_I \,\partial R_J}\right|_{\{R_I\}=\{R_I^\star\}} \delta R_J. \tag{3.4.24} \]

The linear equation (3.4.24) can be exactly solved. Define the dynamical matrix, which
is the scaled Hessian matrix, as

\[ D_{IJ} = \frac{1}{\sqrt{M_I M_J}} \left.\frac{\partial^2 E(\{R_I\})}{\partial R_I \,\partial R_J}\right|_{\{R_I\}=\{R_I^\star\}}. \tag{3.4.25} \]


When the system is at a local minimum, D is a positive semi-definite matrix and can be
diagonalized as

\[ D u_k = \omega_k^2 u_k. \tag{3.4.26} \]

Here $\omega_k$, the square root of the kth eigenvalue, is called the kth phonon frequency, and
the eigenvector $u_k$ is called the kth phonon mode. A large variety of mechanical and
physical properties of solids depend on such phonon calculations. A few examples include
infrared spectroscopy, inelastic neutron scattering, specific heat, heat conduction, and
electron-phonon interaction-related behaviors, such as superconductivity [32, 5].
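
To make (3.4.25) and (3.4.26) concrete, the sketch below builds the dynamical matrix for a one-dimensional diatomic molecule with a harmonic bond. The masses, spring constant, bond length, and the finite-difference Hessian helper `hessian` are all assumptions for this example. In an actual DFPT calculation the Hessian would come from the electronic structure rather than from a model energy.

```python
import numpy as np

masses = np.array([1.0, 3.0])      # hypothetical atomic masses
k_spring, d0 = 2.0, 1.5            # hypothetical spring constant and bond length

def energy(x):
    """Model potential energy surface E({R_I}) for two atoms on a line."""
    return 0.5 * k_spring * (x[1] - x[0] - d0) ** 2

def hessian(f, x, h=1e-3):
    """Central finite-difference Hessian of f at x."""
    n = len(x)
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return hess

x_eq = np.array([0.0, d0])                           # equilibrium configuration
D = hessian(energy, x_eq) / np.sqrt(np.outer(masses, masses))   # (3.4.25)
omega2, modes = np.linalg.eigh(D)                    # (3.4.26)
omega = np.sqrt(np.clip(omega2, 0.0, None))          # phonon frequencies
print(omega)
```

The zero frequency corresponds to rigid translation of the molecule. The nonzero one matches the textbook value sqrt(k (1/M_1 + 1/M_2)) for two masses joined by a spring.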
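In order to evaluate the Hessian matrix, we recall that the Hellmann–Feynman formula
for the atomic force in section 2.10 gives
\[ \frac{\partial E(\{R_I\})}{\partial R_I} = \int \rho(r) \frac{\partial V^{I}_{\mathrm{ext}}(r - R_I)}{\partial R_I} \,dr + \frac{\partial E_{II}(\{R_I\})}{\partial R_I}, \tag{3.4.27} \]

and we find that the second-order derivative with respect to the atomic position takes
the form
\[
\begin{aligned}
\frac{\partial^2 E(\{R_I\})}{\partial R_I \,\partial R_J} ={}& \int \frac{\partial \rho(r)}{\partial R_J} \frac{\partial V^{I}_{\mathrm{ext}}(r - R_I)}{\partial R_I} \,dr \\
&+ \int \rho(r) \frac{\partial^2 V^{I}_{\mathrm{ext}}(r - R_I)}{\partial R_I \,\partial R_I} \,dr \,\delta_{I,J} + \frac{\partial^2 E_{II}(\{R_I\})}{\partial R_I \,\partial R_J}.
\end{aligned}
\tag{3.4.28}
\]
Here we have used the form of $V^{I}_{\mathrm{ext}}$ in (2.10.11). The second and third terms on the
right-hand side of (3.4.28) can be readily evaluated. The first term involves the
self-consistent response of the electron density with respect to the perturbation of the atomic
position as
\[ \int \frac{\partial \rho(r)}{\partial R_J} \frac{\partial V^{I}_{\mathrm{ext}}(r - R_I)}{\partial R_I} \,dr = \iint \frac{\partial V^{I}_{\mathrm{ext}}(r - R_I)}{\partial R_I} \,\chi(r, r')\, \frac{\partial V^{J}_{\mathrm{ext}}(r' - R_J)}{\partial R_J} \,dr \,dr' \tag{3.4.29} \]

and it can be computed from DFPT (exercise).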

3.5 Exponential decay of the Green’s function


As an analytical application of the perturbation theory, we study in this section the decay
property of Green’s functions. As the density matrix (at zero temperature) is given by
\[ P = \frac{1}{2\pi i} \oint_{C} G_\lambda \,d\lambda \tag{3.5.1} \]

with a compact contour C , the decay of Gλ implies that of the density matrix as long
as the system has a gap, so that the contour integral representation is well defined.
Similarly, one can also study the exponential decay of the density matrix for general
systems at finite temperature, the details of which we will leave to the reader.
It is helpful to consider first a concrete example for the Green’s function. Let us
take H = −∆ (in R3 ), for which we know that the spectrum is [0, ∞). Let λ < 0 and
consider the corresponding Green’s function Gλ = (λ + ∆)−1 . We have

(λ + ∆)(Gλ f ) = f. (3.5.2)

Denoting $g = G_\lambda f$ and taking the Fourier transform, we get

\[ (\lambda - |k|^2)\, \hat{g}(k) = \hat{f}(k) \tag{3.5.3} \]

and hence
\[ \hat{g}(k) = -\frac{\hat{f}(k)}{|\lambda| + |k|^2}. \tag{3.5.4} \]
Thus, by the convolution theorem, we have
\[ g(r) = \int K_\lambda(r - r') f(r') \,dr', \tag{3.5.5} \]

where $\hat{K}_\lambda(k) = -\frac{1}{(2\pi)^{3/2}} \frac{1}{|\lambda| + |k|^2}$, so that

\[ G_\lambda(r, r') = K_\lambda(r - r') = -\frac{1}{(2\pi)^3} \int \frac{1}{|\lambda| + |k|^2} e^{ik\cdot(r - r')} \,dk = -\frac{e^{-|\lambda|^{1/2} |r - r'|}}{4\pi |r - r'|}. \tag{3.5.6} \]
We observe that Gλ (r, r 0 ) decays exponentially as |r − r 0 | becomes large.
Note that the decay rate is given by |λ|1/2 , which depends on the distance of λ to
the spectrum of −∆. If λ lies in the spectrum, we may still write down the expression
of the Green’s function (which is then not bounded!), but the kernel no longer decays
exponentially. For example, if λ = 0, as is well known, the Green’s function of ∆ is
given by the Poisson kernel

\[ G_0(r, r') = -\frac{1}{4\pi |r - r'|}. \tag{3.5.7} \]

For λ positive, we get the Green’s function for the Helmholtz equation
\[ G_\lambda(r, r') = -\frac{e^{\pm i \lambda^{1/2} |r - r'|}}{4\pi |r - r'|} \quad \text{for } \lambda > 0. \tag{3.5.8} \]

Both kernels above decay only algebraically as |r − r 0 | becomes large.
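
The dependence of the decay rate on the distance between λ and the spectrum can also be seen in a discrete analogue. The sketch below is an illustration under stated assumptions, not the continuum calculation above: it uses the one-dimensional second-difference matrix as a stand-in for −∆ (spectrum contained in [0, 4]) and a long but finite chain with Dirichlet ends. For the infinite chain, inserting G_j = c μ^{|j|} into (λ − H)G = δ gives μ + 1/μ = 2 − λ, which is the predicted decay factor per site.

```python
import numpy as np

n = 401
lam = -0.5                         # lambda < 0 lies outside the spectrum [0, 4]

# 1D discrete analogue of -Laplacian (second-difference matrix, Dirichlet ends).
H = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
G = np.linalg.inv(lam * np.eye(n) - H)      # Green's function (lambda - H)^{-1}

mid = n // 2
row = np.abs(G[mid, mid:mid + 20])          # |G(r, r')| moving away from the diagonal
ratios = row[1:] / row[:-1]                 # decay factor per lattice site

# Predicted factor for the infinite chain: the root of mu + 1/mu = 2 - lambda with mu < 1.
mu = 0.5 * ((2.0 - lam) - np.sqrt((2.0 - lam) ** 2 - 4.0))
print(ratios[:5], mu)
```

For λ = −0.5 the predicted factor is μ = 0.5, i.e., a decay rate of log 2 per site. Moving λ closer to the spectrum pushes μ toward 1, and the exponential decay degrades, in line with the discussion of λ = 0 above.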


The exponential decay property in fact holds for more general Green’s functions
(λ − H)−1 for H = −∆ + V and λ in the resolvent set. The only requirement is
that V (I − ∆)−1 is bounded, i.e., V is relatively bounded with respect to the Laplacian
∆. To prove this, we use the perturbation theory of Green’s functions.
Define a multiplicative operator $E_{r_0,\gamma}$ given by
\[ (E_{r_0,\gamma} f)(r) = e^{\gamma (|r - r_0|^2 + 1)^{1/2}} f(r) =: e_{r_0,\gamma}(r) f(r). \tag{3.5.9} \]

Note that, for $|r - r_0| \gg 1$, this is essentially multiplying by the exponential function
$e^{\gamma |r - r_0|}$, while the above expression removes the singularity associated with $|r - r_0|$
at $r = r_0$. The inverse $E_{r_0,\gamma}^{-1}$ can be explicitly written as
\[ (E_{r_0,\gamma}^{-1} f)(r) = e^{-\gamma (|r - r_0|^2 + 1)^{1/2}} f(r). \tag{3.5.10} \]

Assume that λ is in the resolvent set of H. Let us consider the operator

\[ T_{r_0,\gamma} = E_{r_0,\gamma}^{-1} (\lambda - H)^{-1} E_{r_0,\gamma}. \tag{3.5.11} \]

Note that $T_{r_0,0}$ is just the Green’s function $(\lambda - H)^{-1}$. Observe that
\[
\begin{aligned}
T_{r_0,\gamma} &= \bigl(E_{r_0,\gamma}^{-1} (\lambda - H) E_{r_0,\gamma}\bigr)^{-1} = \bigl(\lambda + E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma} - V\bigr)^{-1} \\
&= \bigl(\lambda - H - \Delta + E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma}\bigr)^{-1} = (\lambda - H)^{-1} \Bigl[I - \bigl(\Delta - E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma}\bigr)(\lambda - H)^{-1}\Bigr]^{-1}.
\end{aligned}
\tag{3.5.12}
\]

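We will now show that, for γ small, $\bigl(\Delta - E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma}\bigr)(\lambda - H)^{-1}$ is a small
perturbation to the identity operator in (3.5.12). Note that
\[
\begin{aligned}
(E_{r_0,\gamma}^{-1} \Delta E_{r_0,\gamma} f)(r) &= e_{r_0,\gamma}(r)^{-1} \Delta \bigl(e_{r_0,\gamma}(r) f(r)\bigr) \\
&= (\Delta f)(r) + 2 \nabla f(r) \cdot \bigl(e_{r_0,\gamma}(r)^{-1} \nabla e_{r_0,\gamma}(r)\bigr) \\
&\quad + f(r) \bigl(e_{r_0,\gamma}(r)^{-1} \Delta e_{r_0,\gamma}(r)\bigr).
\end{aligned}
\tag{3.5.13}
\]
Explicit calculation yields that, for γ sufficiently small,
\[ \bigl|e_{r_0,\gamma}(r)^{-1} \nabla e_{r_0,\gamma}(r)\bigr| \lesssim \gamma, \tag{3.5.14} \]
\[ \bigl|e_{r_0,\gamma}(r)^{-1} \Delta e_{r_0,\gamma}(r)\bigr| \lesssim \gamma \tag{3.5.15} \]
for all $r$ and $r_0$. Thus, it suffices to control the operator $\nabla (\lambda - H)^{-1}$.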
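Note that $\nabla (I - \Delta)^{-1}$ is bounded, as can be clearly seen in the Fourier representation
\[ \Bigl|\widehat{\nabla (I - \Delta)^{-1} f}(k)\Bigr| = \frac{|k|}{1 + |k|^2} |\hat{f}(k)| \le |\hat{f}(k)|. \tag{3.5.16} \]
Thus, as
\[ \nabla (\lambda - H)^{-1} = \bigl[\nabla (I - \Delta)^{-1}\bigr] \bigl[(I - \Delta)(\lambda - H)^{-1}\bigr], \tag{3.5.17} \]
it suffices to consider the boundedness of $\Delta (\lambda - H)^{-1}$ (recall that the Green’s function
$(\lambda - H)^{-1}$ is bounded).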
We now work with ∆(λ − H)−1 and notice that
∆(λ−H)−1 = (λ−H −λ+V )(λ−H)−1 = I −λ(λ−H)−1 +V (λ−H)−1 . (3.5.18)
Hence, ∆(λ − H)−1 is bounded if V (λ − H)−1 is bounded, which is guaranteed by
our assumption that V is ∆-bounded.
Therefore, we arrive at the conclusion that $T_{r_0,\gamma}$ is a bounded operator uniformly in
$r_0$ if V is bounded and γ is sufficiently small (so that the Neumann series converges).
As a result, for two functions f, g, we have
\[ |\langle f | T_{r_0,\gamma} | g \rangle| \le \|T_{r_0,\gamma}\| \|f\| \|g\| \lesssim \|f\| \|g\|. \tag{3.5.19} \]
Note that the kernel of $T_{r_0,\gamma}$ is given by
\[ T_{r_0,\gamma}(r, r') = (\lambda - H)^{-1}(r, r')\, e^{\gamma (|r' - r_0|^2 + 1)^{1/2}} e^{-\gamma (|r - r_0|^2 + 1)^{1/2}}. \tag{3.5.20} \]
Thus
\[ \langle f | T_{r_0,\gamma} | g \rangle = \iint f(r)\, G_\lambda(r, r')\, e^{\gamma (|r' - r_0|^2 + 1)^{1/2}} e^{-\gamma (|r - r_0|^2 + 1)^{1/2}} g(r') \,dr \,dr'. \tag{3.5.21} \]
Taking f and g as characteristic functions of unit balls around $r_0$ and $r_0'$,
respectively, the above integral is equal to
\[ \int_{|r - r_0| \le 1} \int_{|r' - r_0'| \le 1} G_\lambda(r, r')\, e^{\gamma (|r' - r_0|^2 + 1)^{1/2}} e^{-\gamma (|r - r_0|^2 + 1)^{1/2}} \,dr \,dr'. \tag{3.5.22} \]

The integral above is roughly (up to a multiplicative constant)

\[ \int_{|r - r_0| \le 1} \int_{|r' - r_0'| \le 1} G_\lambda(r, r')\, e^{\gamma (|r_0' - r_0|^2 + 1)^{1/2}} \,dr \,dr'. \tag{3.5.23} \]

Thus the boundedness of Tr0 ,γ implies that the kernel of the Green’s function decays
exponentially at least in an averaged sense, given that, e.g., V is in L∞ . We remark that
if V has better regularity, we can get stronger conclusions on the decay estimate (such
as a pointwise estimate) [23], but we will not go into the details here.
The decay property of the Green’s function implies that the density matrix decays
exponentially along the off-diagonal direction for systems with a positive band gap.
This is often referred to as the near-sightedness property of insulating systems. Note that
this implies that each selected column obtained from the SCDM algorithm in section 2.9
also decays exponentially. In order to obtain exponentially localized Wannier functions
as discussed in section 2.9, the selected columns should be orthogonalized. It turns out
that the band gap condition is insufficient to guarantee that the resulting orthogonalized
functions will also decay exponentially due to potential topological obstructions. There
is a rich body of literature along this direction in mathematics and physics from the past
few decades. We refer readers to [11, 67, 19] for more details.

3.6 Time-dependent density functional perturbation theory


Let us now generalize the discussion of DFPT to the time-dependent case. Similarly to
the development of the time-independent case, we first consider the linear version of the
problem, i.e., the Hamiltonian operator H(t) depends explicitly on t but is independent
of the electron density ρ. The quantum Liouville equation (2.11.6) is a linear equation

\[ \frac{\partial}{\partial t} P(t) = -i[H(t), P(t)]. \tag{3.6.1} \]
To solve this equation, let us consider first the solution to a general linear equation

\[ \frac{\partial}{\partial t} A(t) = -iH(t)A(t). \tag{3.6.2} \]

If H(t) ≡ H is independent of t, we know that the solution is given by the exponential
of the Hamiltonian: $e^{-iHt} A(0)$. In the time-dependent case, the solution is similarly
given by a time-ordered exponential
\[ A(t) = \mathcal{T}\Bigl[e^{-i \int_0^t H(s) \,ds}\Bigr] A(0). \tag{3.6.3} \]

Using power series representation, the time-ordered exponential is defined as

\[ \mathcal{T}\Bigl[e^{-i \int_0^t H(s) \,ds}\Bigr] = I - i \int_0^t H(s) \,ds + \frac{(-i)^2}{2!} \int_0^t \!\! \int_0^t \mathcal{T}[H(s_1) H(s_2)] \,ds_1 \,ds_2 + \cdots, \tag{3.6.4} \]
where the time-ordered product of two operators $\mathcal{T}[A(s_1) A(s_2)]$ is given by
\[ \mathcal{T}[A(s_1) A(s_2)] = \begin{cases} A(s_1) A(s_2), & s_1 \ge s_2, \\ A(s_2) A(s_1), & s_1 < s_2. \end{cases} \tag{3.6.5} \]

Note that the two possible outcomes are in general different, since A at different times
might not commute. Since the time-ordered product is symmetric under permutations of
$(s_1, \ldots, s_n)$, the integral over the cube $[0, t]^n$ equals n! times the integral over the
ordered region $s_1 \le s_2 \le \cdots \le s_n$. Therefore, we have

\[
\int_0^t \!\cdots\! \int_0^t \mathcal{T}[H(s_1) \cdots H(s_n)] \,ds_1 \cdots ds_n
= n! \int_0^t \!\! \int_0^{s_n} \!\! \int_0^{s_{n-1}} \!\!\cdots\! \int_0^{s_2} H(s_n) \cdots H(s_1) \,ds_1 \cdots ds_n. \tag{3.6.6}
\]

Using this, we can check by differentiating that (3.6.3) is indeed the solution to (3.6.2).
Thus, the time-ordered exponential is the propagator for (3.6.2):
\[ U(t_2, t_1) = \mathcal{T}\Bigl[e^{-i \int_{t_1}^{t_2} H(s) \,ds}\Bigr], \quad t_1 \le t_2. \tag{3.6.7} \]

For convenience, we also define U(t2 , t1 ) = U(t1 , t2 )∗ if t1 > t2 . As we will see later,
the propagator plays the role of the Green’s function for time-dependent problems.
Using the propagator (or the time-ordered matrix exponential), the solution to (3.6.1)
can be written as (the validation is left as an exercise)
\[ P(t) = \mathcal{T}\Bigl[e^{-i \int_0^t H(s) \,ds}\Bigr] P(0) \Bigl(\mathcal{T}\Bigl[e^{-i \int_0^t H(s) \,ds}\Bigr]\Bigr)^{*} = U(t, 0) P(0) U(0, t). \tag{3.6.8} \]
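
The role of time ordering in (3.6.7) can be seen on a two-level toy problem. The sketch below approximates the propagator by an ordered product of short-time exponentials, with later times multiplying from the left. The piecewise-constant Hamiltonian `H(t)` (σ_x before t = 1, σ_z afterwards), the helper names `expm_herm` and `propagator`, and the step counts are assumptions chosen so that the exact answer is known in closed form.

```python
import numpy as np

def expm_herm(Hm, dt):
    """exp(-i * dt * Hm) for a Hermitian matrix Hm, via its eigendecomposition."""
    w, V = np.linalg.eigh(Hm)
    return (V * np.exp(-1j * dt * w)) @ V.conj().T

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H(t):
    """Hypothetical Hamiltonian; its values before and after t = 1 do not commute."""
    return sx if t < 1.0 else sz

def propagator(t1, t2, steps):
    """Approximate U(t2, t1) by an ordered product of short-time exponentials."""
    dt = (t2 - t1) / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        t_mid = t1 + (k + 0.5) * dt
        U = expm_herm(H(t_mid), dt) @ U      # later times act from the left
    return U

U_full = propagator(0.0, 2.0, 200)
U_split = propagator(1.0, 2.0, 100) @ propagator(0.0, 1.0, 100)
U_naive = expm_herm(sx + sz, 1.0)            # exp(-i * integral of H), ignoring time ordering

P0 = np.diag([1.0, 0.0]).astype(complex)
P_t = U_full @ P0 @ U_full.conj().T          # (3.6.8) with U(0, t) = U(t, 0)^*
print(np.linalg.norm(U_full - U_naive))
```

The ordered product is unitary and satisfies the composition rule U(2, 0) = U(2, 1) U(1, 0), whereas the naive exponential of the time-integrated Hamiltonian gives a visibly different matrix.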

Now we consider a perturbation to the Hamiltonian $H_\varepsilon(t) = H(t) + \varepsilon W(t)$ and
assume that the perturbation W is a real potential. Taking the same initial density
matrix, the perturbed density matrix is then given by
\[ P_\varepsilon(t) = \mathcal{T}\Bigl[e^{-i \int_0^t H_\varepsilon(s) \,ds}\Bigr] P(0) \Bigl(\mathcal{T}\Bigl[e^{-i \int_0^t H_\varepsilon(s) \,ds}\Bigr]\Bigr)^{*}. \tag{3.6.9} \]

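In order to derive the first-order perturbation to the density matrix, it suffices to
understand the perturbation to the propagator. Let $U_\varepsilon$ be the propagator to the equation
\[ \frac{\partial}{\partial t} A_\varepsilon(t) = -iH_\varepsilon(t) A_\varepsilon(t) = -iH(t) A_\varepsilon(t) - i\varepsilon W(t) A_\varepsilon(t). \tag{3.6.10} \]
Thus, viewing $-i\varepsilon W(t) A_\varepsilon(t)$ as an inhomogeneous source term, we have by Duhamel’s
principle
\[ A_\varepsilon(t) = U(t, 0) A_\varepsilon(0) - i\varepsilon \int_0^t U(t, s) W(s) A_\varepsilon(s) \,ds. \tag{3.6.11} \]
Thus, in terms of the propagators, we get
\[ U_\varepsilon(t, 0) = U(t, 0) - i\varepsilon \int_0^t U(t, s) W(s) U_\varepsilon(s, 0) \,ds. \tag{3.6.12} \]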

This is the time-dependent version of Dyson’s equation (recall (3.1.21) for the perturbed
Green’s function in the time-independent case).
We will focus on the linear response regime and hence keep the terms up to $O(\varepsilon)$.
Analogous to time-independent DFPT, we define the operator
\[ (X_0 W)(t) = \left.\frac{\partial P_\varepsilon(t)}{\partial \varepsilon}\right|_{\varepsilon = 0}. \tag{3.6.13} \]
The propagator satisfies
\[ U_\varepsilon(t, 0) = U(t, 0) - i\varepsilon \int_0^t U(t, s) W(s) U(s, 0) \,ds + O(\varepsilon^2), \tag{3.6.14} \]

and hence
\[
\begin{aligned}
P_\varepsilon(t) &= U_\varepsilon(t, 0) P(0) U_\varepsilon(0, t) \\
&= P(t) - i\varepsilon \int_0^t U(t, s) W(s) U(s, 0) P(0) U(0, t) \,ds \\
&\quad + i\varepsilon\, U(t, 0) P(0) \int_0^t U(0, s) W(s) U(s, t) \,ds + O(\varepsilon^2),
\end{aligned}
\tag{3.6.15}
\]

and
\[ (X_0 W)(t) = -i \int_0^t U(t, s) W(s) U(s, 0) P(0) U(0, t) \,ds + \text{h.c.} \tag{3.6.16} \]

Let us consider a system that lies in the ground state initially without perturbation,
i.e., the unperturbed Hamiltonian is time-independent H(t) ≡ H and P (0) ≡ P0 is
the corresponding density matrix at zero temperature P0 = f∞ (H − µ). Here µ is
the chemical potential and we assume that there is an energy gap at t = 0. Since the
Hamiltonian without perturbation is time-independent, the propagator is just given by

\[ U(t, s) = e^{-i(t - s)H}. \tag{3.6.17} \]

Correspondingly, (3.6.16) simplifies to

\[
\begin{aligned}
(X_0 W)(t) &= -i \int_0^t e^{-i(t - s)H} W(s) e^{i(t - s)H} \,ds \, P_0 + \text{h.c.} \\
&= -i \int_0^t e^{-i(t - s)H} [W(s), P_0] e^{i(t - s)H} \,ds,
\end{aligned}
\tag{3.6.18}
\]

where we have used the fact that $P_0 = f_\infty(H - \mu)$ commutes with the propagator
$U(0, t) = e^{itH}$. The starting time 0 in the above formula is somewhat arbitrary, as
we may relabel the starting time to any point. To remove this arbitrary choice, let us
imagine that we start the perturbation at some time after t = −∞, and hence
\[ (X_0 W)(t) = -i \int_{-\infty}^t e^{-i(t - s)H} [W(s), P_0] e^{i(t - s)H} \,ds. \tag{3.6.19} \]

Using the spectral decomposition of H, we may write the above as

\[ (X_0 W)(t) = -i \sum_{p,q} |\psi_p\rangle \langle \psi_p| \int_{-\infty}^t e^{-i(t - s)(\varepsilon_p - \varepsilon_q)} [W(s), P_0] \,ds \, |\psi_q\rangle \langle \psi_q|. \tag{3.6.20} \]
Equation (3.6.20) represents X0 in the time domain. When comparing the solution
of time-dependent perturbation theories with experimental results, it is often more con-
venient to work in the frequency domain. This is given by the Fourier transform of
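$(X_0 W)(t)$. For a complex valued function $f(t) \in L^1(\mathbb{R}; \mathbb{C})$, the Fourier transform is
defined as
\[ \mathcal{F}(f)(\omega) \equiv \hat{f}(\omega) := \int_{-\infty}^{\infty} e^{i\omega t} f(t) \,dt. \tag{3.6.21} \]
Let us compute first the expression
\[ \int_{-\infty}^t e^{-i(t - s)\omega_0} W(s) \,ds \tag{3.6.22} \]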
for some real frequency $\omega_0$. Let Θ(t) be the Heaviside function, i.e., Θ(t) = 1 for t > 0,
Θ(t) = 0 for t < 0, and Θ(0) = 1/2. We can rewrite (3.6.22) as
\[ \int_{-\infty}^t e^{-i(t - s)\omega_0} W(s) \,ds = \int_{-\infty}^{\infty} e^{-i(t - s)\omega_0} W(s) \Theta(t - s) \,ds, \tag{3.6.23} \]

and thus the right-hand side is a convolution of W and $e^{-i\omega_0 t} \Theta(t)$.
Using the Fourier transform of Θ(t) in Appendix A.3, as well as the properties of
the Fourier transform of the convolution of two functions, we have
\[ \mathcal{F}\Bigl(\int_{-\infty}^t e^{-i(t - s)\omega_0} W(s) \,ds\Bigr)(\omega) = \pi \widehat{W}(\omega_0) \delta(\omega - \omega_0) + i \,\mathrm{p.v.} \frac{\widehat{W}(\omega)}{\omega - \omega_0}, \tag{3.6.24} \]

where p.v. stands for the principal value. Furthermore, using the Sokhotski–Plemelj
formula in Appendix A.3, we conclude that
\[ \mathcal{F}\Bigl(\int_{-\infty}^t e^{-i(t - s)\omega_0} W(s) \,ds\Bigr)(\omega) = i \lim_{\eta \to 0+} \frac{\widehat{W}(\omega)}{\omega - \omega_0 + i\eta}. \tag{3.6.25} \]

Note that (ω − ω0 + iη)−1 only has a pole in the lower half plane.
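
As a quick consistency check of (3.6.25), one may regularize the causal kernel with a damping factor $e^{-\eta t}$, $\eta > 0$ (an assumption made here only to render the integral absolutely convergent), and compute its Fourier transform directly with the convention (3.6.21):

\[
\int_{-\infty}^{\infty} e^{i\omega t}\, e^{-i\omega_0 t} e^{-\eta t}\, \Theta(t) \,dt
= \int_0^{\infty} e^{i(\omega - \omega_0 + i\eta) t} \,dt
= \frac{i}{\omega - \omega_0 + i\eta}.
\]

With the convention (3.6.21), the Fourier transform of a convolution is the product of the Fourier transforms, so multiplying by $\widehat{W}(\omega)$ and letting $\eta \to 0+$ reproduces the right-hand side of (3.6.25). The only singularity sits at $\omega = \omega_0 - i\eta$, in the lower half plane.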
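Substituting the above equation into (3.6.20), we obtain
\[
\begin{aligned}
\mathcal{F}(X_0 W)(\omega) &= \lim_{\eta \to 0+} \sum_{p,q} \frac{|\psi_p\rangle \langle \psi_p| [\widehat{W}(\omega), P_0] |\psi_q\rangle \langle \psi_q|}{\omega - (\varepsilon_p - \varepsilon_q) + i\eta} \\
&= \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{|\psi_a\rangle \langle \psi_a| \widehat{W}(\omega) |\psi_i\rangle \langle \psi_i|}{\omega - (\varepsilon_a - \varepsilon_i) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{|\psi_i\rangle \langle \psi_i| \widehat{W}(\omega) |\psi_a\rangle \langle \psi_a|}{\omega - (\varepsilon_i - \varepsilon_a) + i\eta}.
\end{aligned}
\tag{3.6.26}
\]

This gives the irreducible dynamic polarizability operator by assuming W to be a local
potential and by taking the diagonal term of $\mathcal{F}(X_0 W)(\omega)$. For a fixed ω, the dynamic
polarizability operator is an integral operator with a kernel given by
\[
\begin{aligned}
\chi_0(r, r'; \omega) &= \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle}{\omega - (\varepsilon_a - \varepsilon_i) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_a | \delta_r | \psi_i \rangle \langle \psi_i | \delta_{r'} | \psi_a \rangle}{\omega - (\varepsilon_i - \varepsilon_a) + i\eta}.
\end{aligned}
\tag{3.6.27}
\]

Here we have used the short-hand forms $\langle \psi_i | \delta_r | \psi_a \rangle = \psi_i^*(r) \psi_a(r)$.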
For time-dependent density functional theory (TDDFT), we also need to take into
account the change of the potential induced by the perturbation of the density. This is
described by the reducible dynamic polarizability operator denoted by χ(ω). Following
the same derivation as in the time-independent case, we write the formula relating χ and
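χ0 as

\[ \chi(\omega) = (I - \chi_0(\omega) f_{\mathrm{Hxc}}(\omega))^{-1} \chi_0(\omega) \quad \text{or} \quad \chi^{-1}(\omega) = \chi_0^{-1}(\omega) - f_{\mathrm{Hxc}}(\omega). \tag{3.6.28} \]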

Note that in TDDFT, the exchange-correlation kernel $f_{\mathrm{Hxc}}$ should also be
frequency-dependent in general, which originates from the memory effect of the
exchange-correlation potential. However, almost all practical TDDFT calculations use the
adiabatic approximation, which applies the ground state exchange-correlation functional
to the instantaneous electron density ρ(t), and $f_{\mathrm{Hxc}}$ is then frequency-independent.
This can cause serious modeling issues in a number of contexts, and we refer readers
to [66, 83] for more discussion on this topic.
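The dynamic polarizability operator acting on a frequency-dependent perturbation
$\hat{g}(r, \omega)$ can be evaluated without explicitly computing any unoccupied orbital. Using
(3.6.27),

\[
\begin{aligned}
(\chi_0 \hat{g})(r, \omega) &= \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_a(r) \bigl(\int \psi_a^*(r') \hat{g}(r', \omega) \psi_i(r') \,dr'\bigr) \psi_i^*(r)}{\omega - (\varepsilon_a - \varepsilon_i) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_i(r) \bigl(\int \psi_i^*(r') \hat{g}(r', \omega) \psi_a(r') \,dr'\bigr) \psi_a^*(r)}{\omega - (\varepsilon_i - \varepsilon_a) + i\eta}.
\end{aligned}
\tag{3.6.29}
\]

Similarly to the discussion on the time-independent case, we have (the spatial
dependence is omitted in the notation for simplicity)
\[ (\chi_0 \hat{g})(\omega) = \lim_{\eta \to 0+} \sum_{i}^{\mathrm{occ}} \bigl(\xi_{i,+}(\omega) \psi_i^* - \xi_{i,-}^*(\omega) \psi_i\bigr), \tag{3.6.30} \]

where $\xi_{i,\pm}(\omega)$ solves the frequency-dependent Sternheimer equation as

\[
\begin{aligned}
Q(\omega + i\eta + \varepsilon_i - H) Q \,\xi_{i,+}(\omega) &= Q(\psi_i \hat{g}(\omega)), \\
Q(\omega - i\eta - \varepsilon_i + H) Q \,\xi_{i,-}(\omega) &= Q(\psi_i \hat{g}^*(\omega)),
\end{aligned}
\tag{3.6.31}
\]

where Q is the projector onto the unoccupied space.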


Though the frequency-dependent Sternheimer equation resembles that in the
frequency-independent case, its numerical solution can be much more difficult, espe-
cially when iterative methods are used. For instance, for a gapped Hamiltonian, the
eigenvalues of the operator (H − εi )|range(Q) are all positive. Hence the frequency-
independent Sternheimer equation resembles elliptic equations. Numerical methods
based on, e.g., GMRES [78] are expected to converge quickly, and preconditioners
such as those related to the inverse of the Laplacian operator can be effective [82]. On
the other hand, the operator on the left-hand side of (3.6.31) can be indefinite even for
gapped systems, especially for the computation of ξi,+ (ω) with a positive frequency ω.
Hence the frequency-dependent Sternheimer equation resembles the Helmholtz equation.
Iterative methods can converge slowly or may fail to converge at all. The development
of effective preconditioners is still an active research direction.
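
The equivalence between the sum over states (3.6.29) and the Sternheimer route (3.6.30)–(3.6.31) can be checked on a small matrix model at a fixed η > 0. In the sketch below the random symmetric matrix `H`, the vector `g`, and the values of ω and η are assumptions for illustration. The second Sternheimer solution is assumed to enter (3.6.30) through its complex conjugate, which is what the second line of (3.6.31) implies. A dense direct solver stands in for the iterative solvers discussed above.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_occ = 12, 3
omega, eta = 0.7, 0.05

M = rng.standard_normal((n, n))
H = 0.5 * (M + M.T)                        # hypothetical real symmetric Hamiltonian
eps, psi = np.linalg.eigh(H)
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # perturbation g(r, omega)

# Reference: explicit sum over occupied/unoccupied pairs, (3.6.29) at finite eta.
ref = np.zeros(n, dtype=complex)
for i in range(n_occ):
    for a in range(n_occ, n):
        p_i, p_a = psi[:, i], psi[:, a]
        ref += p_a * p_i.conj() * (p_a.conj() @ (g * p_i)) / (omega - (eps[a] - eps[i]) + 1j * eta)
        ref -= p_i * p_a.conj() * (p_i.conj() @ (g * p_a)) / (omega - (eps[i] - eps[a]) + 1j * eta)

# Sternheimer route: only occupied orbitals and the projector Q are used.
occ = psi[:, :n_occ]
Q = np.eye(n) - occ @ occ.conj().T
out = np.zeros(n, dtype=complex)
for i in range(n_occ):
    p_i = psi[:, i]
    z_plus = omega + 1j * eta + eps[i]
    z_minus = omega - 1j * eta - eps[i]
    # Q commutes with H, so solving the full shifted system and projecting
    # gives the solution of the projected equations (3.6.31).
    xi_plus = Q @ np.linalg.solve(z_plus * np.eye(n) - H, Q @ (p_i * g))
    xi_minus = Q @ np.linalg.solve(z_minus * np.eye(n) + H, Q @ (p_i * g.conj()))
    out += xi_plus * p_i.conj() - xi_minus.conj() * p_i

print(np.max(np.abs(out - ref)))
```

The agreement holds for any η > 0, since the shifted operators are then invertible. The unoccupied orbitals appear only in the reference computation.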

3.7 Perturbation of the many-body Hamiltonian


Our discussion of perturbation theory has so far been restricted to single particle or
effective single particle theories. However, the techniques in fact can be applied to
any linear Hamiltonian. In this section, we consider the perturbation of a many-body
quantum system.
Denote the N-body Hamiltonian as $H^{(N)}$ and the eigenpairs as $\bigl(E_k^{(N)}, \Psi_k^{(N)}\bigr)$, so
that
\[ H^{(N)} \Psi_k^{(N)} = E_k^{(N)} \Psi_k^{(N)}. \]

The N-body density matrix is then given by

\[ P^{(N)} = \bigl|\Psi_0^{(N)}\bigr\rangle \bigl\langle \Psi_0^{(N)}\bigr|. \tag{3.7.1} \]

Consider a perturbation of the Hamiltonian given by $\varepsilon W^{(N)}$; to leading order the
perturbation of the density matrix is then
\[ P_\varepsilon^{(N)} - P^{(N)} = \frac{\varepsilon}{2\pi i} \oint_{C} \bigl(\lambda - H^{(N)}\bigr)^{-1} W^{(N)} \bigl(\lambda - H^{(N)}\bigr)^{-1} \,d\lambda + O(\varepsilon^2). \tag{3.7.2} \]
Here the contour C encloses only the ground state energy $E_0^{(N)}$. Using the spectral
representation, we get

\[ P_\varepsilon^{(N)} - P^{(N)} = \varepsilon \sum_{k \ne 0} \frac{\bigl|\Psi_0^{(N)}\bigr\rangle \bigl\langle \Psi_0^{(N)}\bigr| W^{(N)} \bigl|\Psi_k^{(N)}\bigr\rangle \bigl\langle \Psi_k^{(N)}\bigr|}{E_0^{(N)} - E_k^{(N)}} + \text{h.c.} + O(\varepsilon^2). \tag{3.7.3} \]

As a result we may define the N-body polarizability operator:

\[ X^{(N)} W^{(N)} = \sum_{k \ne 0} \frac{\bigl|\Psi_0^{(N)}\bigr\rangle \bigl\langle \Psi_0^{(N)}\bigr| W^{(N)} \bigl|\Psi_k^{(N)}\bigr\rangle \bigl\langle \Psi_k^{(N)}\bigr|}{E_0^{(N)} - E_k^{(N)}} + \text{h.c.} \tag{3.7.4} \]

Note that the many-body Hamiltonian system is linear. Hence there is no distinction
between the irreducible and reducible polarizability operators in the many-body context.
In order to connect with the (effective) one-body picture of DFT, we consider the
special class of perturbations
\[ W^{(N)} = \sum_{i=1}^{N} W(r_i) = \int W(r) \sum_{i=1}^{N} \delta(r - r_i) \,dr =: \int W(r) \hat{\rho}(r) \,dr, \tag{3.7.5} \]

where W is a (single-body) potential, thus defined on $\mathbb{R}^3$, and we introduce the notation

\[ \hat{\rho}(r) = \sum_{i=1}^{N} \delta(r - r_i) \tag{3.7.6} \]

as the N-body density operator. Note that

\[ \bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_0^{(N)} \bigr\rangle = \rho(r), \]

where the right-hand side is the single particle electron density.


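The definition of the N-body polarizability operator implies the following bilinear
form:
\[
\begin{aligned}
\mathrm{Tr}\bigl[U^{(N)} X^{(N)} W^{(N)}\bigr] &= \sum_{k \ne 0} \frac{\bigl\langle \Psi_k^{(N)} \bigr| U^{(N)} \bigl| \Psi_0^{(N)} \bigr\rangle \bigl\langle \Psi_0^{(N)} \bigr| W^{(N)} \bigl| \Psi_k^{(N)} \bigr\rangle}{E_0^{(N)} - E_k^{(N)}} \\
&\quad + \sum_{k \ne 0} \frac{\bigl\langle \Psi_0^{(N)} \bigr| U^{(N)} \bigl| \Psi_k^{(N)} \bigr\rangle \bigl\langle \Psi_k^{(N)} \bigr| W^{(N)} \bigl| \Psi_0^{(N)} \bigr\rangle}{E_0^{(N)} - E_k^{(N)}}.
\end{aligned}
\tag{3.7.7}
\]

Within the class of perturbations for $W^{(N)}$ as in (3.7.5), as well as the test function $U^{(N)}$
in the same class, the N-body polarizability operator induces a one-body polarizability
operator $\chi^{\mathrm{exact}}$ (we use the superscript “exact” to indicate that this comes from the
many-body theory) such that
\[
\begin{aligned}
\mathrm{Tr}\bigl(U \chi^{\mathrm{exact}} W\bigr) &= \iint U(r) W(r') \sum_{k \ne 0} \frac{\bigl\langle \Psi_k^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_0^{(N)} \bigr\rangle \bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r') \bigl| \Psi_k^{(N)} \bigr\rangle}{E_0^{(N)} - E_k^{(N)}} \,dr \,dr' \\
&\quad + \iint U(r) W(r') \sum_{k \ne 0} \frac{\bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_k^{(N)} \bigr\rangle \bigl\langle \Psi_k^{(N)} \bigr| \hat{\rho}(r') \bigl| \Psi_0^{(N)} \bigr\rangle}{E_0^{(N)} - E_k^{(N)}} \,dr \,dr'.
\end{aligned}
\tag{3.7.8}
\]
Thus $\chi^{\mathrm{exact}}$ is an integral operator on $\mathbb{R}^3$ with kernel

\[
\begin{aligned}
\chi^{\mathrm{exact}}(r, r') &= \sum_{k \ne 0} \frac{\bigl\langle \Psi_k^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_0^{(N)} \bigr\rangle \bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r') \bigl| \Psi_k^{(N)} \bigr\rangle}{E_0^{(N)} - E_k^{(N)}} \\
&\quad + \sum_{k \ne 0} \frac{\bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_k^{(N)} \bigr\rangle \bigl\langle \Psi_k^{(N)} \bigr| \hat{\rho}(r') \bigl| \Psi_0^{(N)} \bigr\rangle}{E_0^{(N)} - E_k^{(N)}}.
\end{aligned}
\tag{3.7.9}
\]
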
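Comparing this with the kernel of χ0,

\[
\begin{aligned}
\chi_0(r, r') &= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_a^*(r) \psi_i(r) \psi_i^*(r') \psi_a(r')}{\varepsilon_i - \varepsilon_a} + \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\psi_i^*(r) \psi_a(r) \psi_a^*(r') \psi_i(r')}{\varepsilon_i - \varepsilon_a} \\
&= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_a | \delta_r | \psi_i \rangle \langle \psi_i | \delta_{r'} | \psi_a \rangle}{\varepsilon_i - \varepsilon_a} + \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle \psi_i | \delta_r | \psi_a \rangle \langle \psi_a | \delta_{r'} | \psi_i \rangle}{\varepsilon_i - \varepsilon_a},
\end{aligned}
\tag{3.7.10}
\]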
we observe that the two formulae are quite similar since δr is the one-body version
of the density operator ρb(r). Despite the formal similarity, we remark that χexact is
better approximated by the reducible polarizability χ in the context of DFPT, which
represents the response of the electron density to the external potential while taking into
account the electron interactions. On the other hand, the irreducible polarizability χ0
only represents the response of the electron density to the effective potential, which is
not physically measurable.
Similarly to the static case, we can also apply the time-dependent perturbation to
the many-body Hamiltonian, and we get in the frequency domain
\[
\begin{aligned}
\chi^{\mathrm{exact}}(r, r'; \omega) &= \lim_{\eta \to 0+} \sum_{k \ne 0} \frac{\bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_k^{(N)} \bigr\rangle \bigl\langle \Psi_k^{(N)} \bigr| \hat{\rho}(r') \bigl| \Psi_0^{(N)} \bigr\rangle}{\omega - \bigl(E_k^{(N)} - E_0^{(N)}\bigr) + i\eta} \\
&\quad - \lim_{\eta \to 0+} \sum_{k \ne 0} \frac{\bigl\langle \Psi_k^{(N)} \bigr| \hat{\rho}(r) \bigl| \Psi_0^{(N)} \bigr\rangle \bigl\langle \Psi_0^{(N)} \bigr| \hat{\rho}(r') \bigl| \Psi_k^{(N)} \bigr\rangle}{\omega - \bigl(E_0^{(N)} - E_k^{(N)}\bigr) + i\eta}.
\end{aligned}
\tag{3.7.11}
\]

In particular, (3.7.11) indicates that the poles of $\chi^{\mathrm{exact}}(\omega)$ take the form $E_k^{(N)} - E_0^{(N)}$,
which is the difference between the kth neutrally excited state energy and the ground state
energy (note that the electron number is kept the same). Such energy differences
are called (neutral) excitation energies. Again in the context of time-dependent DFPT,
$\chi^{\mathrm{exact}}(\omega)$ can be approximated by the reducible polarizability operator χ(ω), and hence
the excitation energies can be approximated through the poles of χ(ω).

3.8 Casida formalism


In order to evaluate poles of χ(ω), i.e., zeros of χ−1 (ω), from (3.6.28) we need to solve

χ0 (ω)fHxc (ω)f = f (3.8.1)

for some function f . In order to solve the above equation, let us first truncate the
set of unoccupied orbitals up to some fixed energy level Ec . Then consider functions
Ψai = ψa ψi∗ and the complex conjugate Ψia := Ψ∗ai = ψi ψa∗ , where i and a are indices
for an occupied orbital and an unoccupied orbital within the energy cutoff, respectively.
Denote $\omega_{ai} = \varepsilon_a - \varepsilon_i$, and expand f using this set of functions as
\[ f = \sum_{ai} \bigl(f_{ai} \Psi_{ai} + f_{ia} \Psi_{ia}\bigr). \]

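Then (3.8.1) becomes

\[
\begin{aligned}
\lim_{\eta \to 0+} \sum_{ij}^{\mathrm{occ}} \sum_{ab}^{\mathrm{unocc}} \biggl[ & \frac{|\Psi_{ai}\rangle \langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle}{\omega - \omega_{ai} + i\eta} f_{bj} + \frac{|\Psi_{ai}\rangle \langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle}{\omega - \omega_{ai} + i\eta} f_{jb} \\
& - \frac{|\Psi_{ia}\rangle \langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle}{\omega + \omega_{ai} + i\eta} f_{bj} - \frac{|\Psi_{ia}\rangle \langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle}{\omega + \omega_{ai} + i\eta} f_{jb} \biggr] \\
= \sum_{ai} \bigl( & f_{ai} |\Psi_{ai}\rangle + f_{ia} |\Psi_{ia}\rangle \bigr).
\end{aligned}
\tag{3.8.2}
\]

Matching the coefficients for $|\Psi_{ai}\rangle$ and $|\Psi_{ia}\rangle$, respectively,

\[ \lim_{\eta \to 0+} \sum_{j}^{\mathrm{occ}} \sum_{b}^{\mathrm{unocc}} \biggl[ \frac{\langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle}{\omega - \omega_{ai} + i\eta} f_{bj} + \frac{\langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle}{\omega - \omega_{ai} + i\eta} f_{jb} \biggr] = f_{ai}, \tag{3.8.3} \]
\[ -\lim_{\eta \to 0+} \sum_{j}^{\mathrm{occ}} \sum_{b}^{\mathrm{unocc}} \biggl[ \frac{\langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle}{\omega + \omega_{ai} + i\eta} f_{bj} + \frac{\langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle}{\omega + \omega_{ai} + i\eta} f_{jb} \biggr] = f_{ia}, \tag{3.8.4} \]

or equivalently,
\[ \sum_{j}^{\mathrm{occ}} \sum_{b}^{\mathrm{unocc}} \bigl( \langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle f_{bj} + \langle \Psi_{ai} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle f_{jb} \bigr) = (\omega - \omega_{ai}) f_{ai}, \tag{3.8.5} \]
\[ \sum_{j}^{\mathrm{occ}} \sum_{b}^{\mathrm{unocc}} \bigl( \langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{bj} \rangle f_{bj} + \langle \Psi_{ia} | f_{\mathrm{Hxc}} | \Psi_{jb} \rangle f_{jb} \bigr) = -(\omega + \omega_{ai}) f_{ia}. \tag{3.8.6} \]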

This can be viewed as a (non-Hermitian) eigenvalue equation for ω, known as the Casida
equation [15].
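
The structure of (3.8.5)–(3.8.6) as a non-Hermitian eigenvalue problem can be illustrated with a small synthetic example. The sketch below assumes real orbitals, so that Ψ_ai = Ψ_ia and all four coupling blocks reduce to a single symmetric matrix K. The values of `omega_ai` and `K` are random stand-ins rather than quantities computed from orbitals and a kernel. Under these assumptions, adding and subtracting the two block rows shows that ω² is an eigenvalue of D^{1/2}(D + 2K)D^{1/2} with D = diag(ω_ai), a Hermitian problem of half the size.

```python
import numpy as np

rng = np.random.default_rng(4)
n_pairs = 5                                            # number of (a, i) pairs
omega_ai = np.sort(rng.uniform(0.5, 2.0, n_pairs))     # hypothetical eps_a - eps_i
B = rng.standard_normal((n_pairs, n_pairs))
K = 0.1 * (B @ B.T)     # hypothetical coupling <Psi_ai|f_Hxc|Psi_bj>, positive semi-definite

D = np.diag(omega_ai)
# Block form of (3.8.5)-(3.8.6) with unknowns stacked as (f_ai, f_ia):
#   (D + K) x + K y = omega x,   -K x - (D + K) y = omega y.
casida = np.block([[D + K, K], [-K, -(D + K)]])
omega_all = np.linalg.eigvals(casida)
positive = np.sort(omega_all.real[omega_all.real > 0])

# Equivalent Hermitian problem of half the size for omega^2.
sqrtD = np.diag(np.sqrt(omega_ai))
omega2 = np.linalg.eigvalsh(sqrtD @ (D + 2.0 * K) @ sqrtD)

print(positive)
print(np.sqrt(omega2))
```

In this sketch the eigenvalues come in ±ω pairs, and the positive ones agree with the square roots of the eigenvalues of the reduced Hermitian problem. With K = 0 they reduce to the bare orbital energy differences ω_ai.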
One important application of TDDFT is the computation of the absorption spectrum,
which is directly related to the excitation energies. Following the discussion in the
time-independent case, the frequency-dependent macroscopic polarizability tensor is defined
as (cf. (3.4.11))
\[ A_{\alpha\beta}(\omega) = -\iint r_\alpha \,\chi(r, r'; \omega)\, r'_\beta \,dr \,dr'. \tag{3.8.7} \]

Here χ(ω) can be interpreted as the exact polarizability operator $\chi^{\mathrm{exact}}(\omega)$ or its
approximation obtained from TDDFT. The absorption cross section, denoted by σ(ω),
is then given by
\[ \sigma(\omega) = \frac{4\pi\omega}{c} \,\mathrm{Im}\, \mathrm{Tr}[A(\omega)], \tag{3.8.8} \]
where c is the speed of light, which is approximately 137 in atomic units.
Since only the poles of χ(ω) contribute to Im Tr[A(ω)], we may readily use the
Casida formalism to evaluate the absorption spectrum. However, the Casida formalism
requires the diagonalization of a matrix that is of size $N_{\mathrm{occ}} N_{\mathrm{unocc}}$, where $N_{\mathrm{occ}}$ is
the number of occupied states and $N_{\mathrm{unocc}}$ is the number of unoccupied states within
the energy cutoff. For a large system the matrix size becomes $O(N^2)$, and the
diagonalization is thus very expensive. A more efficient numerical method is to use a
Lanczos approach to avoid the explicit diagonalization of this matrix.
Furthermore, note that the evaluation of A(ω) only requires applying χ(ω) to a
specific vector, i.e., the vector representing the uniform electric field. We may solve
the time-dependent Sternheimer equation directly to obtain the value of the absorption
spectrum at any specific point of interest. The advantage of this approach is that we
may completely remove the error due to the truncation of the unoccupied states at some
fixed energy level.
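As a minimal sketch of the Casida eigenvalue problem (3.8.5)-(3.8.6), the following Python snippet builds the block matrix for a toy model and extracts the excitation energies. It assumes real orbitals, so that all four kernel blocks coincide with a single symmetric matrix `K`; the transition energies `omega_ai`, the rank-one kernel `K`, and all function names are hypothetical and chosen only for illustration. Under this assumption the non-Hermitian problem reduces to a Hermitian one for $\omega^2$, which the sketch uses as a cross-check.

```python
import numpy as np

def casida_matrix(omega_ai, K):
    """Block matrix [[D + K, K], [-K, -(D + K)]] built from (3.8.5)-(3.8.6),
    assuming all four kernel blocks equal the same real symmetric K."""
    D = np.diag(omega_ai)
    return np.block([[D + K, K], [-K, -(D + K)]])

def excitation_energies(omega_ai, K):
    """Positive eigenvalues of the non-Hermitian Casida matrix, sorted ascending."""
    evals = np.sort(np.linalg.eigvals(casida_matrix(omega_ai, K)).real)
    return evals[evals > 0]

def excitation_energies_reduced(omega_ai, K):
    """Same energies from the Hermitian problem D^{1/2} (D + 2K) D^{1/2} z = omega^2 z."""
    sqrt_d = np.sqrt(omega_ai)
    reduced = sqrt_d[:, None] * (np.diag(omega_ai) + 2.0 * K) * sqrt_d[None, :]
    return np.sqrt(np.linalg.eigvalsh(reduced))

# Hypothetical toy data: four bare transition energies and a rank-one kernel.
omega_ai = np.array([0.8, 1.1, 1.5, 2.0])
u = np.array([1.0, 0.5, -0.5, 0.25])
K = 0.1 * np.outer(u, u)

full = excitation_energies(omega_ai, K)
reduced = excitation_energies_reduced(omega_ai, K)
print(full)
print(reduced)
```

The eigenvalues of the full matrix come in pairs $\pm\omega$; only the positive branch is reported. For realistic sizes one would avoid forming and diagonalizing the dense matrix, as the text notes.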

3.9 Random phase approximation


In this last section, we demonstrate that time-dependent perturbation theory can also
be used to improve the accuracy of the many-body ground state energy. Recall that
the correlation energy is defined to be the difference between the exact ground state
energy and the Hartree–Fock energy. The correlation energy is usually very small and
is less than 1% of the total energy. However, Kohn–Sham DFT with local and semi-
local exchange-correlation functionals may provide qualitatively incorrect descriptions
for certain electron correlation effects, for example the van der Waals interaction. Here
we discuss the random phase approximation (RPA), which can be viewed as a starting point
for many approaches based on many-body perturbation theory. In fact, historically, the
RPA theory (which was developed by Bohm and Pines [8]) played a very influential
role in shaping how physicists think about many-body systems. We also remark that
the term “RPA” is merely a legacy term and has lost its original meaning in the current
context of DFT.
For this, it is useful to introduce the concept of the adiabatic connection to connect a noninteracting particle system (such as the one in Kohn–Sham DFT) to an interacting one. We consider a family of many-body Hamiltonians, for $\lambda \in [0, 1]$,
\[
H_\lambda^{(N)} = T + V_{\mathrm{ext}} + V_\lambda + \lambda V_{ee},
\tag{3.9.1}
\]
where $V_{ee}$ is the $N$-body electron-electron interaction potential and $V_\lambda$ is an effective one-body potential such that $V_{\lambda=0}$ is given by the effective potential of a noninteracting effective one-body theory and $V_{\lambda=1} = 0$. Thus, $H_\lambda^{(N)}$ connects an effective noninteracting system at $\lambda = 0$ to the real physical system at $\lambda = 1$. Given the one-body density of the ground state $\rho$, assume that at $\lambda = 0$ we use the effective potential that corresponds to this density (from Kohn–Sham theory). For each $\lambda$, by a constrained search fixing the density, we minimize the energy for the $\lambda$-Hamiltonian $H_\lambda^{(N)}$ over all possible wavefunctions giving the density $\rho$, denote the resulting many-body wavefunction as $\Psi_\lambda$, and take $V_\lambda$ to be the effective potential (which corresponds to the Lagrange multiplier for the constraint that the wavefunction gives the density $\rho$) such that $\Psi_\lambda$ becomes the ground state of $H_\lambda^{(N)}$ [68]. We denote $E(\lambda) = \langle\Psi_\lambda|H_\lambda^{(N)}|\Psi_\lambda\rangle$ such that in particular $E(1)$ is the ground state energy of the physical system we are interested in.
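To obtain $E(1)$, the idea is to use the fundamental theorem of calculus and write
\[
\begin{aligned}
E(1) - E(0) &= \int_0^1 \frac{\partial E(\lambda)}{\partial\lambda} \, d\lambda \\
&= \int_0^1 \left\langle \Psi_\lambda \left| \frac{\partial H_\lambda^{(N)}}{\partial\lambda} \right| \Psi_\lambda \right\rangle d\lambda \\
&= \int_0^1 \left\langle \Psi_\lambda \left| V_{ee} + \frac{\partial V_\lambda}{\partial\lambda} \right| \Psi_\lambda \right\rangle d\lambda,
\end{aligned}
\tag{3.9.2}
\]
where the second equality uses the Hellmann–Feynman theorem. Recall that, given an $N$-body wavefunction $\Psi$, using the symmetry we can write the energy as
\[
\left\langle \Psi \left| \sum_{i<j} \frac{1}{|r_i - r_j|} \right| \Psi \right\rangle = \frac{1}{2} \iint \frac{\rho^{(2)}(r, r')}{|r - r'|} \, dr \, dr'.
\tag{3.9.3}
\]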

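Here $\rho^{(2)}$ is the two-particle electron density, given by
\[
\rho^{(2)}(r_1, r_2) = N(N-1) \int |\Psi(r_1, r_2, r_3, \ldots, r_N)|^2 \, dr_3 \cdots dr_N,
\tag{3.9.4}
\]
so that $\iint \rho^{(2)}(r, r') \, dr \, dr' = N(N-1)$, consistent with the factor $\frac{1}{2}$ in (3.9.3).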
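Since $V_\lambda$ is a single-body term, we have
\[
\int_0^1 \left\langle \Psi_\lambda \left| \frac{\partial V_\lambda}{\partial\lambda} \right| \Psi_\lambda \right\rangle d\lambda
= \int_0^1 \int \rho_\lambda \frac{\partial V_\lambda}{\partial\lambda} \, dr \, d\lambda
= \int \rho \, (V_1 - V_0) \, dr
= -\int \rho V_0 \, dr,
\tag{3.9.5}
\]
where we have used the assumption that $\rho_\lambda$ does not depend on $\lambda$ and $V_{\lambda=1} = 0$.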
Recalling that $E(0)$ is just the energy of a noninteracting system with effective potential $V_0$, we get
\[
E(1) = E(0) - \int \rho V_0 \, dr + \frac{1}{2} \int_0^1 \iint \frac{\rho_\lambda^{(2)}(r, r')}{|r - r'|} \, dr \, dr' \, d\lambda,
\tag{3.9.6}
\]
where $\rho_\lambda^{(2)}$ is the two-body electron density for the $\lambda$-system. Therefore, to obtain an “exact” functional, we just need to know $\rho_\lambda^{(2)}$.
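The coupling-constant integration in (3.9.2) can be illustrated on a small matrix model. The sketch below is a deliberate simplification: it uses $H(\lambda) = H_0 + \lambda V$ with a fixed perturbation `V` and does not enforce the density constraint that defines $V_\lambda$ in the text. The matrices `H0` and `V` and the function names are hypothetical. It checks that integrating the Hellmann–Feynman expectation value $\langle\psi_\lambda|V|\psi_\lambda\rangle$ over $\lambda \in [0,1]$ with Gauss–Legendre quadrature reproduces $E(1) - E(0)$.

```python
import numpy as np

def ground_state(H):
    """Lowest eigenvalue and eigenvector of a real symmetric matrix."""
    evals, evecs = np.linalg.eigh(H)
    return evals[0], evecs[:, 0]

def coupling_constant_integral(H0, V, n_quad=20):
    """Gauss-Legendre approximation of int_0^1 <psi_lam| V |psi_lam> d(lam),
    where psi_lam is the ground state of H0 + lam * V (Hellmann-Feynman integrand)."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    lam = 0.5 * (x + 1.0)      # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * w
    total = 0.0
    for l, wl in zip(lam, w):
        _, psi = ground_state(H0 + l * V)
        total += wl * (psi @ V @ psi)
    return total

# Hypothetical toy model: diagonal reference plus a uniform off-diagonal coupling.
H0 = np.diag([0.0, 1.0, 2.0, 3.0, 4.0])
V = 0.3 * (np.ones((5, 5)) - np.eye(5))

E0, _ = ground_state(H0)
E1, _ = ground_state(H0 + V)
integral = coupling_constant_integral(H0, V)
print(E1 - E0, integral)
```

The two printed numbers should agree to quadrature accuracy, because the ground state stays nondegenerate along the path in this model.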
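Let us recall the definitions of $\chi_0$ and $\chi$ in (3.6.27) and (3.7.11) and consider an analytic extension of them to imaginary frequencies (assuming for simplicity that all eigenfunctions are real):
\[
\begin{aligned}
\chi_0(r, r'; i\omega)
&= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle\psi_i|\delta_r|\psi_a\rangle\langle\psi_a|\delta_{r'}|\psi_i\rangle}{i\omega - (\varepsilon_a - \varepsilon_i)}
- \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle\psi_a|\delta_r|\psi_i\rangle\langle\psi_i|\delta_{r'}|\psi_a\rangle}{i\omega - (\varepsilon_i - \varepsilon_a)} \\
&= \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{\langle\psi_i|\delta_r|\psi_a\rangle\langle\psi_a|\delta_{r'}|\psi_i\rangle}{i\omega - (\varepsilon_a - \varepsilon_i)} + \mathrm{c.c.} \\
&= -2 \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \frac{(\varepsilon_a - \varepsilon_i)}{\omega^2 + (\varepsilon_a - \varepsilon_i)^2} \langle\psi_i|\delta_r|\psi_a\rangle\langle\psi_a|\delta_{r'}|\psi_i\rangle
\end{aligned}
\tag{3.9.7}
\]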
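and similarly
\[
\begin{aligned}
\chi(r, r'; i\omega)
&= \sum_{k\neq 0} \frac{\langle\Psi_0^{(N)}|\hat\rho(r)|\Psi_k^{(N)}\rangle\langle\Psi_k^{(N)}|\hat\rho(r')|\Psi_0^{(N)}\rangle}{i\omega - \bigl(E_k^{(N)} - E_0^{(N)}\bigr)}
- \sum_{k\neq 0} \frac{\langle\Psi_k^{(N)}|\hat\rho(r)|\Psi_0^{(N)}\rangle\langle\Psi_0^{(N)}|\hat\rho(r')|\Psi_k^{(N)}\rangle}{i\omega - \bigl(E_0^{(N)} - E_k^{(N)}\bigr)} \\
&= -2 \sum_{k\neq 0} \frac{\bigl(E_k^{(N)} - E_0^{(N)}\bigr)}{\omega^2 + \bigl(E_k^{(N)} - E_0^{(N)}\bigr)^2} \langle\Psi_k^{(N)}|\hat\rho(r)|\Psi_0^{(N)}\rangle\langle\Psi_0^{(N)}|\hat\rho(r')|\Psi_k^{(N)}\rangle.
\end{aligned}
\tag{3.9.8}
\]
Integrating along the imaginary axis and noting that $\int_0^\infty \frac{a}{a^2 + \omega^2} \, d\omega = \frac{\pi}{2}$ for $a > 0$, we get
\[
\begin{aligned}
\frac{1}{2\pi} \int_{-\infty}^{\infty} \chi_0(r, r'; i\omega) \, d\omega
&= -\frac{2}{\pi} \sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \langle\psi_i|\delta_r|\psi_a\rangle\langle\psi_a|\delta_{r'}|\psi_i\rangle \int_0^\infty \frac{(\varepsilon_a - \varepsilon_i)}{\omega^2 + (\varepsilon_a - \varepsilon_i)^2} \, d\omega \\
&= -\sum_{i}^{\mathrm{occ}} \sum_{a}^{\mathrm{unocc}} \langle\psi_i|\delta_{r'}|\psi_a\rangle\langle\psi_a|\delta_r|\psi_i\rangle \\
&= \sum_{i}^{\mathrm{occ}} \sum_{j}^{\mathrm{occ}} \langle\psi_i|\delta_{r'}|\psi_j\rangle\langle\psi_j|\delta_r|\psi_i\rangle - \sum_{i}^{\mathrm{occ}} \langle\psi_i|\delta_r|\psi_i\rangle \, \delta(r - r') \\
&= |P_s(r, r')|^2 - \rho_s(r) \, \delta(r - r')
\end{aligned}
\tag{3.9.9}
\]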
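and similarly
\[
\begin{aligned}
\frac{1}{2\pi} \int_{-\infty}^{\infty} \chi(r, r'; i\omega) \, d\omega
&= -\frac{2}{\pi} \sum_{k\neq 0} \int_0^\infty \frac{\bigl(E_k^{(N)} - E_0^{(N)}\bigr)}{\omega^2 + \bigl(E_k^{(N)} - E_0^{(N)}\bigr)^2} \, d\omega \\
&\qquad \times \langle\Psi_k^{(N)}|\hat\rho(r)|\Psi_0^{(N)}\rangle\langle\Psi_0^{(N)}|\hat\rho(r')|\Psi_k^{(N)}\rangle \\
&= -\sum_{k\neq 0} \langle\Psi_k^{(N)}|\hat\rho(r)|\Psi_0^{(N)}\rangle\langle\Psi_0^{(N)}|\hat\rho(r')|\Psi_k^{(N)}\rangle \\
&= \langle\Psi_0^{(N)}|\hat\rho(r)|\Psi_0^{(N)}\rangle\langle\Psi_0^{(N)}|\hat\rho(r')|\Psi_0^{(N)}\rangle
- \langle\Psi_0^{(N)}|\hat\rho(r)\hat\rho(r')|\Psi_0^{(N)}\rangle \\
&= \rho(r)\rho(r') - \rho^{(2)}(r, r') - \rho(r) \, \delta(r - r'),
\end{aligned}
\tag{3.9.10}
\]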
where we have used $\rho_s$ and $\rho$ for the one-particle electron density for noninteracting and interacting systems, respectively. In the above calculation, we have used that
\[
\begin{aligned}
\hat\rho(r)\hat\rho(r') &= \sum_{i,j} \delta(r - r_i) \, \delta(r' - r_j) \\
&= \sum_{i} \delta(r - r_i) \, \delta(r - r') + \sum_{i\neq j} \delta(r - r_i) \, \delta(r' - r_j) \\
&= \hat\rho(r) \, \delta(r - r') + \hat\rho^{(2)}(r, r'),
\end{aligned}
\tag{3.9.11}
\]
where $\hat\rho^{(2)}(r, r')$ is the two-particle density operator.
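The noninteracting sum rule (3.9.9) can be checked numerically on a grid. The sketch below is an illustration under stated assumptions, not a production implementation: the "Hamiltonian" is a hypothetical 8-site discrete Laplacian, orbitals are real, $\delta_r$ is replaced by a grid unit vector so that $\langle\psi_i|\delta_r|\psi_a\rangle$ becomes $\psi_i(r)\psi_a(r)$, and $\delta(r - r')$ becomes a Kronecker delta. The frequency integral uses the substitution $\omega = \tan\theta$ with Gauss–Legendre nodes, which is one convenient choice among many.

```python
import numpy as np

def chi0_imag(psi_occ, psi_unocc, eps_occ, eps_unocc, omega):
    """chi0(i*omega) on a grid for real orbitals, using the last line of (3.9.7)."""
    gap = eps_unocc[None, :] - eps_occ[:, None]      # gap[i, a] = eps_a - eps_i > 0
    weight = -2.0 * gap / (omega**2 + gap**2)
    # pair[r, i, a] = psi_i(r) * psi_a(r), the grid analogue of <psi_i|delta_r|psi_a>
    pair = psi_occ[:, :, None] * psi_unocc[:, None, :]
    return np.einsum("ria,ia,sia->rs", pair, weight, pair)

def integrated_chi0(psi_occ, psi_unocc, eps_occ, eps_unocc, n_quad=200):
    """(1/(2*pi)) * integral over the real line of chi0(i*omega), via omega = tan(theta)."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    theta = 0.25 * np.pi * (x + 1.0)                 # nodes on (0, pi/2)
    w = 0.25 * np.pi * w
    n = psi_occ.shape[0]
    half_line = np.zeros((n, n))
    for t, wt in zip(theta, w):
        jac = 1.0 / np.cos(t) ** 2                   # d(omega)/d(theta)
        half_line += wt * jac * chi0_imag(psi_occ, psi_unocc, eps_occ, eps_unocc, np.tan(t))
    return 2.0 * half_line / (2.0 * np.pi)           # the integrand is even in omega

# Hypothetical toy model: 1D discrete Laplacian with 3 occupied orbitals.
n_grid, n_occ = 8, 3
H = 2.0 * np.eye(n_grid) - np.eye(n_grid, k=1) - np.eye(n_grid, k=-1)
eps, psi = np.linalg.eigh(H)
psi_occ, psi_unocc = psi[:, :n_occ], psi[:, n_occ:]

lhs = integrated_chi0(psi_occ, psi_unocc, eps[:n_occ], eps[n_occ:])
P = psi_occ @ psi_occ.T                              # density matrix P_s
rhs = P**2 - np.diag(np.diag(P))                     # |P_s|^2 - rho_s * delta
print(np.max(np.abs(lhs - rhs)))
```

The printed deviation should be at the level of quadrature and rounding error, reflecting the completeness relation used between the second and third lines of (3.9.9).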


If we approximate $\chi$ by $\chi_0$, we get an approximation of $\rho^{(2)}$ by
\[
\rho_s^{(2)}(r, r') = \rho_s(r)\rho_s(r') - |P_s(r, r')|^2.
\tag{3.9.12}
\]
This gives
\[
\frac{1}{2} \iint \frac{\rho^{(2)}(r, r')}{|r - r'|} \, dr \, dr'
\approx \frac{1}{2} \iint \frac{\rho_s(r)\rho_s(r')}{|r - r'|} \, dr \, dr'
- \frac{1}{2} \iint \frac{|P_s(r, r')|^2}{|r - r'|} \, dr \, dr'.
\tag{3.9.13}
\]

The right-hand side is exactly the Hartree–Fock approximation of the electronic Cou-
lomb repulsion! This motivates us to look for a better approximation of χ so that we
can hopefully get a better approximation of the many-body correlation energy (which is
completely missing if we replace χ by χ0 ).
Our previous discussion suggests rewriting $\rho_\lambda^{(2)}$ using $\chi_\lambda$, the dynamic polarizability operator along the imaginary frequency axis for the $\lambda$-system, and then approximating $\chi_\lambda$. Since we already know that $\chi_0$ gives the Hartree and exchange terms, the additional contribution gives the correlation energy
\[
E_c = -\frac{1}{2\pi} \int_{-\infty}^{\infty} \int_0^1 \frac{1}{2} \iint \frac{1}{|r - r'|} \bigl( \chi_\lambda(r, r'; i\omega) - \chi_0(r, r'; i\omega) \bigr) \, dr \, dr' \, d\lambda \, d\omega.
\tag{3.9.14}
\]
Recall that $\chi_\lambda$ is the one-body polarizability of a many-body Hamiltonian $H_\lambda^{(N)}$. According to our discussion of DFPT, in principle the response is captured if we know the exact exchange-correlation kernel $f_{xc}^\lambda$ corresponding to the $\lambda$-system at the imaginary frequency (recall (3.6.28)):
\[
\chi_\lambda^{-1} = \chi_0^{-1} - \lambda v_C - f_{xc}^\lambda.
\tag{3.9.15}
\]
Notice that the Coulomb kernel is changed to $\lambda v_C$ due to the adiabatic connection. If we know the exchange-correlation functional exactly, we can represent $\chi_\lambda$ and get the correlation energy $E_c$ (which is of course circular). But the advantage of expressing $E_c$ in terms of $\chi_\lambda$ is that hopefully this will yield practical strategies for approximating the correlation energy.
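The RPA proposes a simple approximation of $\chi_\lambda$. It essentially neglects $f_{xc}^\lambda$ in (3.9.15) and approximates
\[
\chi_\lambda^{-1} \approx \chi_0^{-1} - \lambda v_C,
\tag{3.9.16}
\]
or equivalently,
\[
\chi_\lambda \approx \chi_0 + \lambda \chi_0 v_C \chi_\lambda.
\tag{3.9.17}
\]
More precisely, the approximate $\chi_\lambda$ solves the Dyson-like equation
\[
\chi_\lambda(r, r'; i\omega) = \chi_0(r, r'; i\omega) + \lambda \iint \chi_0(r, r''; i\omega) \, v_C(r'', r''') \, \chi_\lambda(r''', r'; i\omega) \, dr'' \, dr'''.
\tag{3.9.18}
\]
Solving for $\chi_\lambda$, we obtain
\[
\chi_\lambda(i\omega) = \bigl( 1 - \lambda \chi_0(i\omega) v_C \bigr)^{-1} \chi_0(i\omega).
\tag{3.9.19}
\]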
Thus
\[
\begin{aligned}
\int_0^1 \chi_\lambda(i\omega) v_C \, d\lambda - \chi_0(i\omega) v_C
&= \int_0^1 \bigl( 1 - \lambda \chi_0(i\omega) v_C \bigr)^{-1} \chi_0(i\omega) v_C \, d\lambda - \chi_0(i\omega) v_C \\
&= -\ln\bigl( 1 - \chi_0(i\omega) v_C \bigr) - \chi_0(i\omega) v_C.
\end{aligned}
\tag{3.9.20}
\]
Therefore, under RPA, the correlation energy is given by
\[
E_c^{\mathrm{RPA}} = \frac{1}{4\pi} \int_{-\infty}^{\infty} \mathrm{Tr}\Bigl[ \ln\bigl( 1 - \chi_0(i\omega) v_C \bigr) + \chi_0(i\omega) v_C \Bigr] \, d\omega.
\tag{3.9.21}
\]

Here the matrix logarithm can be computed by diagonalizing the matrix $1 - \chi_0(i\omega) v_C$ or by the Cauchy contour integral formula. As a word of caution, there are actually many flavors of RPA. The one we have just discussed is often known as the direct RPA or the DFT-flavored RPA. There are various other formulations under the umbrella term of RPA (e.g., RPA with exchange (RPAX), the Hartree–Fock-flavored RPA, and the particle-particle RPA) that are beyond the scope of this book.
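The steps from (3.9.17) to (3.9.21) can be sketched for a small discretized model. Everything specific below is an assumption made for illustration: the 8-site discrete Laplacian standing in for the Kohn–Sham Hamiltonian, the soft-Coulomb matrix `v` standing in for $v_C$, the function names, and the use of $\mathrm{Tr}\ln = \ln\det$ (via `numpy.linalg.slogdet`) in place of an explicit matrix logarithm. The frequency integral again uses $\omega = \tan\theta$ with Gauss–Legendre nodes.

```python
import numpy as np

def chi0_imag(psi, eps, n_occ, omega):
    """Noninteracting chi0(i*omega) for real orbitals, last line of (3.9.7)."""
    occ, unocc = psi[:, :n_occ], psi[:, n_occ:]
    gap = eps[None, n_occ:] - eps[:n_occ, None]          # eps_a - eps_i
    pair = occ[:, :, None] * unocc[:, None, :]           # psi_i(r) * psi_a(r)
    return np.einsum("ria,ia,sia->rs", pair, -2.0 * gap / (omega**2 + gap**2), pair)

def chi_lambda(chi0, v, lam):
    """RPA response from (3.9.19): (1 - lam * chi0 v)^{-1} chi0."""
    n = chi0.shape[0]
    return np.linalg.solve(np.eye(n) - lam * chi0 @ v, chi0)

def rpa_integrand(chi0, v):
    """Tr[ln(1 - chi0 v) + chi0 v], using Tr ln(M) = ln det(M) for det(M) > 0."""
    m = chi0 @ v
    _, logdet = np.linalg.slogdet(np.eye(m.shape[0]) - m)
    return logdet + np.trace(m)

def rpa_correlation_energy(psi, eps, n_occ, v, n_quad=120):
    """(3.9.21) with omega = tan(theta); the integrand is even in omega."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    theta = 0.25 * np.pi * (x + 1.0)
    w = 0.25 * np.pi * w
    half_line = 0.0
    for t, wt in zip(theta, w):
        chi0 = chi0_imag(psi, eps, n_occ, np.tan(t))
        half_line += wt / np.cos(t) ** 2 * rpa_integrand(chi0, v)
    return 2.0 * half_line / (4.0 * np.pi)

# Hypothetical toy model: discrete Laplacian and a soft-Coulomb interaction matrix.
n_grid, n_occ = 8, 3
H = 2.0 * np.eye(n_grid) - np.eye(n_grid, k=1) - np.eye(n_grid, k=-1)
eps, psi = np.linalg.eigh(H)
r = np.arange(n_grid, dtype=float)
v = 1.0 / np.sqrt((r[:, None] - r[None, :]) ** 2 + 1.0)

E_c = rpa_correlation_energy(psi, eps, n_occ, v)
print(E_c)
```

Since $\chi_0(i\omega)$ is negative semidefinite and the soft-Coulomb matrix is positive definite, the eigenvalues $\mu$ of $\chi_0 v$ are nonpositive, so each term $\ln(1 - \mu) + \mu$ is nonpositive and the resulting toy correlation energy is negative.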

Exercises
1. Following section 3.2, derive the linear response of the density matrix for finite temperature. Show that
\[
\frac{\delta P}{\delta V}(W) = \sum_{p,q} \frac{f_p - f_q}{\varepsilon_p - \varepsilon_q} |\psi_p\rangle\langle\psi_p|W|\psi_q\rangle\langle\psi_q|,
\tag{3.9.22}
\]
where the summation is over all orbital indices.


2. Apply the perturbation of the projection operators to show that the leading-order perturbation to the energy of a nondegenerate ground state is given by
\[
\varepsilon^{(1)} = \langle\psi_0|W|\psi_0\rangle.
\tag{3.9.23}
\]

3. Verify the second-order derivative of the total energy with respect to the atomic
position as in (3.4.28) and (3.4.29).
4. Use the decay of Green’s function as in section 3.5 to prove that at finite temper-
ature the density matrix decays exponentially even for systems without a spectral
gap. [Hint: represent the density matrix using Green’s function as (2.6.28).]

5. Prove the claim (3.6.8) using the definition of the time-ordered matrix exponen-
tial.
6. Expand the perturbation series for the time-dependent density matrix (3.6.15) to
the next order.

Appendix A

Notations and preliminaries

A.1 Notation

Table A.1. Notation used in the book.

General conventions
i imaginary unit
z∗ complex conjugate of the complex number z
N number of electrons
M number of nuclei
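β inverse temperature
$\oint_{\mathcal{C}} d\lambda$ contour integral
$\int$ integrals, for example $\int f \, dr$, $\int_{\mathbb{R}^3} f(r) \, dr$, or
$\int f(x) \, dx := \sum_{\sigma\in\{\uparrow,\downarrow\}} \int_{\mathbb{R}^3} f(r, \sigma) \, dr$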

hψ|, |ψi, hψ|ϕi bra vector, ket vector, and braket in Dirac notation
↑, ↓ spin-up and spin-down components
Coordinates
r, rα single electron spatial coordinate and its Cartesian com-
ponents, α = x, y, z or 1, 2, 3
p, pα single electron momentum coordinate and its Cartesian
components
xi = (r i , σi ) space-spin coordinates of the ith electron
rij = |r i − r j | Euclidean distance between electrons i and j
$Z_I$ charge of the Ith nucleus
$R_I$ spatial coordinate of the Ith nucleus
Wavefunctions and densities
Ψ or |Ψi N -electron wavefunction
$\Psi_k^{(N)}(x_1, \ldots, x_N)$ space-spin coordinates of the kth excited state of an N-electron wavefunction
$P^{(N)}$ N-electron density matrix
P single particle density matrix

G(z) Green’s function with complex frequency z
ρ(r) electron density
ψi (x) or ϕi (x) ith single electron spin orbital
ψi (r) or ϕi (r) ith single electron spatial orbital
Energies and potentials
E, F general energy and free energy
E general energy functional
E HF Hartree–Fock ground state energy functional
E KS Kohn–Sham total energy functional
EH , Ex , Ec , Exc , EHxc Hartree, exchange, correlation, exchange-correlation,
and Hartree exchange-correlation energies
EII nuclei interaction energy
FLL Levy–Lieb constrained search energy functional
TS kinetic energy functional for a Slater determinant
εi eigenvalue of the ith orbital
εgap eigenvalue gap
fi occupation number for the ith orbital
i, j occupied eigenvalue index
a, b unoccupied eigenvalue index
p, q general eigenvalue index
µ chemical potential
V (r) single particle potential
Vext (r) external potential
VH , Vx , Vc , Vxc , VHxc Hartree, exchange, correlation, exchange-correlation,
and Hartree exchange-correlation potentials
Veff total potential Vext + VHxc
FKS [V ] Kohn–Sham map from potential to density
Operators
H or Ĥ general Hamiltonian
S or Ŝ spin operator, with Cartesian components Ŝ α
T or T̂ kinetic energy operator
V or V̂ potential operator
∆ Laplace operator
χ reducible polarizability operator
χ0 irreducible polarizability operator
d dielectric operator
vC Coulomb operator
U (t, s) time propagator
Periodic systems
L Bravais lattice
L∗ reciprocal lattice
Ω` global supercell
Ω periodic cell
Ω∗ first Brillouin zone
k k-point
$e_G(r) = |\Omega|^{-1/2} e^{iG\cdot r}$ planewave with wavevector $G \in L^*$
$\psi_{n,k}(r) = u_{n,k}(r) e^{ik\cdot r}$ Bloch orbitals
$\varepsilon_{n,k}$ energies of Bloch orbitals
Function spaces
H general Hilbert space
$L^p(\mathbb{R}^d)$, $L^p(\mathbb{R}^d; \mathbb{C})$ spaces of real/complex valued $L^p$ functions
$H^s(\mathbb{R}^d)$, $H^s(\mathbb{R}^d; \mathbb{C})$ spaces of real/complex valued $H^s$ functions
$L^2(\mathbb{R}^3; \mathbb{C}^2)$ single electron state space in the real space
$A_N \equiv \bigwedge^N L^2(\mathbb{R}^3; \mathbb{C}^2)$ N-electron state space in the real space
Notations for matrix representation
$A^\top$ transpose of A
$A^*$ or $A^\dagger$ Hermitian transpose/adjoint of A
Ng number of grid points/degrees of freedom
Nb number of basis functions
Nocc number of occupied orbitals, especially in the spin-
restricted case
Ψ = [ψ1 , . . . , ψN ] a matrix collecting N single particle orbitals
Φ = [φ1 , . . . , φNb ] a matrix collecting Nb basis functions, usually of size
Ng × Nb
C coefficients of the single particle orbitals with respect to
a basis set
H, S discretized Hamiltonian and overlap matrices
G discretized Green’s function
I identity matrix
Other quantities
E electric field
B magnetic field
⊗ tensor product
1(−∞,0) indicator function
fβ finite temperature Fermi–Dirac function
f∞ zero temperature Fermi–Dirac function, the same as an
indicator function
F, F −1 Fourier transform and its inverse
η a small positive quantity approaching 0+

A.2 Spherical harmonics


The eigenvalue problem
\[
L^2 Y(\theta, \phi) = E \, Y(\theta, \phi)
\tag{A.2.1}
\]
can be solved using the separation of variables. Assume the eigenfunction takes the form
\[
Y(\theta, \phi) = \Theta(\theta) \, \Phi(\phi);
\]
then, according to the formulation (1.2.25), (A.2.1) can be written as
\[
\left[ -\frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) - \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right] \Theta(\theta) \, \Phi(\phi) = E \, \Theta(\theta) \, \Phi(\phi).
\tag{A.2.2}
\]

Multiplying both sides of the equation by $\sin^2\theta$, we first separate out the $\phi$ variable as
\[
-\frac{\partial^2 \Phi}{\partial\phi^2} = m^2 \Phi,
\tag{A.2.3}
\]
where m2 is an eigenvalue. The solution takes the following form:

Φ (ϕ) = A exp (imϕ) + B exp (−imϕ) . (A.2.4)

The periodic boundary condition that Φ(0) = Φ(2π) requires m to be an integer.


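Removing $\Phi(\phi)$ from (A.2.2), we have an eigenvalue problem for the $\Theta$ term as
\[
-\frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial\Theta}{\partial\theta} \right) + \frac{m^2}{\sin^2\theta} \Theta = k \Theta,
\tag{A.2.5}
\]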
where k is an eigenvalue. Perform the change of variables

ζ = cos θ, ξ (cos θ) = Θ (θ) ; (A.2.6)

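then (A.2.5) is reduced to the associated Legendre equation on the interval $[-1, 1]$:
\[
\frac{d}{d\zeta} \left[ \left( 1 - \zeta^2 \right) \frac{d\xi}{d\zeta} \right] + \left( k - \frac{m^2}{1 - \zeta^2} \right) \xi = 0, \qquad \xi(-1), \xi(1) \text{ are finite}.
\tag{A.2.7}
\]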

The eigenvalues of (A.2.7) are

k = l (l + 1) , l ∈ N. (A.2.8)

Each eigenvalue $l(l + 1)$ corresponds to $(l + 1)$ degenerate, orthogonal eigenfunctions, denoted by
\[
\xi(\zeta) = P_l^m(\zeta), \qquad m = 0, \ldots, l.
\tag{A.2.9}
\]
$P_l^m$ are called associated Legendre polynomials. It is also convenient to define the associated Legendre polynomial for negative $m$ as
\[
P_l^{-m}(\zeta) = (-1)^m \frac{(l - m)!}{(l + m)!} P_l^m(\zeta), \qquad m = 0, \ldots, l.
\tag{A.2.10}
\]

Changing the variable back to θ, the solution of the Θ degree of freedom is

Θ (θ) = Plm (cos θ) . (A.2.11)


Combining the solution for $\Theta$ and $\Phi$, we find that all eigenfunctions $Y(\theta, \phi)$ can be written in the form
\[
Y_{lm}(\theta, \phi) = C_{lm} P_l^m(\cos\theta) \, e^{im\phi}, \qquad l \in \mathbb{N}, \quad m = -l, \ldots, l,
\tag{A.2.12}
\]
which corresponds to the eigenvalue $E_l = l(l + 1)$. In other words, the eigenvalue $E_l$ has multiplicity $2l + 1$. $C_{lm}$ is a normalization factor and is chosen so that
\[
\int Y_{lm}^*(\theta, \phi) \, Y_{l'm'}(\theta, \phi) \sin\theta \, d\theta \, d\phi = \delta_{ll'} \delta_{mm'}.
\tag{A.2.13}
\]
Direct computation gives
\[
C_{lm} = (-1)^m \sqrt{\frac{(2l + 1)}{4\pi} \frac{(l - m)!}{(l + m)!}}.
\tag{A.2.14}
\]
The functions $Y_{lm}(\theta, \phi)$ are called the spherical harmonics.
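The normalization (A.2.13)-(A.2.14) can be verified numerically for $m \ge 0$. The sketch below is illustrative only: `assoc_legendre` is a hypothetical helper implementing the standard three-term recurrence, and it assumes the convention in which $P_l^m$ already contains the Condon-Shortley factor $(-1)^m$ (the check itself is insensitive to this sign choice, since it enters squared). The $\phi$ integral of $e^{-im\phi} e^{im\phi}$ contributes $2\pi$, and the $\theta$ integral is done exactly by Gauss–Legendre quadrature in $\zeta = \cos\theta$.

```python
import numpy as np
from math import factorial, pi, sqrt

def assoc_legendre(l, m, x):
    """P_l^m(x) for 0 <= m <= l via the standard recurrence
    (l - m) P_l^m = (2l - 1) x P_{l-1}^m - (l + m - 1) P_{l-2}^m."""
    pmm = np.ones_like(x)
    if m > 0:
        # P_m^m(x) = (-1)^m (2m - 1)!! (1 - x^2)^{m/2}
        double_factorial = np.prod(np.arange(1, 2 * m, 2))
        pmm = (-1) ** m * double_factorial * (1.0 - x**2) ** (m / 2.0)
    if l == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm                      # P_{m+1}^m
    for ll in range(m + 2, l + 1):
        pmm, pm1 = pm1, ((2 * ll - 1) * x * pm1 - (ll + m - 1) * pmm) / (ll - m)
    return pm1

def norm_const(l, m):
    """C_lm from (A.2.14)."""
    return (-1) ** m * sqrt((2 * l + 1) / (4 * pi) * factorial(l - m) / factorial(l + m))

def overlap(l, lp, m, n_quad=32):
    """Integral of conj(Y_lm) * Y_l'm over the sphere, for equal m."""
    x, w = np.polynomial.legendre.leggauss(n_quad)   # x = cos(theta)
    integrand = assoc_legendre(l, m, x) * assoc_legendre(lp, m, x)
    return 2.0 * pi * norm_const(l, m) * norm_const(lp, m) * np.sum(w * integrand)

print(overlap(2, 2, 1), overlap(3, 1, 1))
```

The first printed value should be close to 1 and the second close to 0. Overlaps between different $m$ vanish through the $\phi$ integral and are therefore not computed here.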

A.3 Equalities in complex analysis


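Let $f(z)$ be analytic inside a closed contour $\mathcal{C}$ that encloses a point $z_0$. Then the Cauchy integral formula states that
\[
f(z_0) = \frac{1}{2\pi i} \oint_{\mathcal{C}} \frac{f(\lambda)}{\lambda - z_0} \, d\lambda.
\tag{A.3.1}
\]
This fact can be generalized to the matrix case: If all eigenvalues of $A \in \mathbb{C}^{N\times N}$ are enclosed by the closed contour $\mathcal{C}$, then
\[
f(A) = \frac{1}{2\pi i} \oint_{\mathcal{C}} f(\lambda) (\lambda I - A)^{-1} \, d\lambda.
\tag{A.3.2}
\]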
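As a small illustration of (A.3.2), the contour integral can be discretized with the trapezoidal rule on a circle. The sketch below is hedged accordingly: the function name `matrix_function_contour`, the test matrix `A`, and the choice of a circular contour with 64 points are assumptions for illustration. The result is compared against $f(A)$ computed from the eigendecomposition of a symmetric matrix.

```python
import numpy as np

def matrix_function_contour(f, A, center, radius, n_points=64):
    """Approximate f(A) by the trapezoidal rule applied to (A.3.2) on the circle
    lambda = center + radius * exp(i * theta), which must enclose all eigenvalues of A."""
    n = A.shape[0]
    result = np.zeros((n, n), dtype=complex)
    for k in range(n_points):
        phase = np.exp(2j * np.pi * k / n_points)
        lam = center + radius * phase
        # d(lambda) = i * radius * phase * d(theta); the factor i cancels the 1/(2*pi*i)
        result += f(lam) * radius * phase * np.linalg.inv(lam * np.eye(n) - A)
    return result / n_points

# Hypothetical symmetric test matrix with spectrum well inside the circle |lambda| = 2.
A = np.array([[0.5, 0.2, 0.0],
              [0.2, -0.3, 0.1],
              [0.0, 0.1, 0.1]])

exp_contour = matrix_function_contour(np.exp, A, center=0.0, radius=2.0)
evals, evecs = np.linalg.eigh(A)
exp_reference = evecs @ np.diag(np.exp(evals)) @ evecs.T
print(np.max(np.abs(exp_contour - exp_reference)))
```

For an analytic integrand the trapezoidal rule on a circle converges geometrically in the number of points, which is why a modest `n_points` suffices here. For functions with branch cuts, such as the logarithm, the contour must additionally avoid the cut.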
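For the Heaviside function
\[
\Theta(t) =
\begin{cases}
1, & t > 0, \\
\frac{1}{2}, & t = 0, \\
0, & t < 0,
\end{cases}
\tag{A.3.3}
\]
the Fourier transform is given, in the distribution sense, as
\[
\hat\Theta(\omega) = \pi \delta(\omega) + i \, \mathrm{p.v.} \frac{1}{\omega},
\tag{A.3.4}
\]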
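where p. v. is the Cauchy principal value. To check this, we note that
\[
\frac{i}{2\pi} \, \mathrm{p.v.} \int_{-\infty}^{\infty} \frac{e^{-i\omega t}}{\omega} \, d\omega
= \frac{1}{\pi} \lim_{\varepsilon\to 0^+} \int_{\varepsilon}^{\infty} \frac{\sin(\omega t)}{\omega} \, d\omega
=
\begin{cases}
\frac{1}{2}, & t > 0, \\
0, & t = 0, \\
-\frac{1}{2}, & t < 0,
\end{cases}
\tag{A.3.5}
\]
where in the last equality we have used the fact that the integral of the sinc function is $\pi$.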
The Sokhotski–Plemelj formula is the following:
\[
\lim_{\eta\to 0^+} \int_{\mathbb{R}} \frac{f(x)}{x \pm i\eta} \, dx = \mp i\pi f(0) + \mathrm{p.v.} \int_{\mathbb{R}} \frac{f(x)}{x} \, dx.
\tag{A.3.6}
\]
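A quick numerical sanity check of (A.3.6) for the upper sign can be sketched as follows. The test function $f(x) = e^{-(x-1)^2}$, the grid parameters, and the function names are arbitrary choices made for illustration. The principal value is evaluated as $\int_0^\infty (f(x) - f(-x))/x \, dx$ on a midpoint grid, and the regularized integral is evaluated at a small but finite $\eta$, so agreement is expected only up to an error of order $\eta$.

```python
import numpy as np

def f(x):
    """Hypothetical smooth, rapidly decaying test function."""
    return np.exp(-(x - 1.0) ** 2)

def regularized_integral(eta, half_width=12.0, h=1e-4):
    """Uniform-grid approximation of the integral of f(x) / (x + i*eta) over the real line."""
    x = np.arange(-half_width, half_width + h, h)
    return h * np.sum(f(x) / (x + 1j * eta))

def principal_value(half_width=12.0, h=1e-4):
    """p.v. integral of f(x)/x, written as the integral over x > 0 of (f(x) - f(-x)) / x."""
    x = np.arange(0.5 * h, half_width, h)            # midpoint grid avoids x = 0
    return h * np.sum((f(x) - f(-x)) / x)

eta = 1e-3
lhs = regularized_integral(eta)
rhs = -1j * np.pi * f(0.0) + principal_value()
print(lhs, rhs)
```

The grid spacing is chosen about ten times smaller than $\eta$ so that the Lorentzian of width $\eta$ is resolved; decreasing $\eta$ requires refining the grid accordingly.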

Appendix B

Selected references for further reading

Chapter 1
R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York, 1965.
S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics, Springer, Berlin, 2011.
E. Kaxiras, Atomic and Electronic Structure of Solids, Cambridge University Press, Cambridge, 2003.
L. Landau and E. Lifshitz, Quantum Mechanics: Non-Relativistic Theory, Butterworth-Heinemann, Oxford, 1991.
M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, Academic Press, New York, 1978.
J. J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, Reading, MA, 1994.

Chapter 2
D. R. Bowler and T. Miyazaki, O(N) methods in electronic structure calculations, Rep. Progr. Phys., 75 (2012), 036503.
E. Cancès, M. Defranceschi, W. Kutzelnigg, C. Le Bris, and Y. Maday, Computational quantum chemistry: A primer, Handb. Numer. Anal., 10 (2003), pp. 3–270.
H. Eschrig, The Fundamentals of Density Functional Theory, B. G. Teubner, Stuttgart, 1996.
S. Goedecker, Linear scaling electronic structure methods, Rev. Mod. Phys., 71 (1999), pp. 1085–1123.
R. O. Jones, Density functional theory: Its origins, rise to prominence, and future, Rev. Mod. Phys., 87 (2015), pp. 897–923.
R. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, New York, 2004.
D. Marx and J. Hutter, Ab Initio Molecular Dynamics: Basic Theory and Advanced Methods, Cambridge University Press, New York, 2009.
N. Marzari, A. A. Mostofi, J. R. Yates, I. Souza, and D. Vanderbilt, Maximally localized Wannier functions: Theory and applications, Rev. Mod. Phys., 84 (2012), pp. 1419–1475.
R. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, and J. D. Joannopoulos, Iterative minimization techniques for ab initio total energy calculation: Molecular dynamics and conjugate gradients, Rev. Mod. Phys., 64 (1992), pp. 1045–1097.
C. A. Ullrich, Time-Dependent Density-Functional Theory: Concepts and Applications, Oxford University Press, Oxford, 2011.

Chapter 3
S. Baroni, S. de Gironcoli, A. Dal Corso, and P. Giannozzi, Phonons and related crystal properties from density-functional perturbation theory, Rev. Mod. Phys., 73 (2001), pp. 515–562.
M. E. Casida and M. Huix-Rotllant, Progress in time-dependent density-functional theory, Annu. Rev. Phys. Chem., 63 (2012), pp. 287–323.
T. Kato, Perturbation Theory for Linear Operators, Springer, Berlin, 1966.
M. Marques and E. K. U. Gross, Time-dependent density functional theory, Annu. Rev. Phys. Chem., 55 (2004), pp. 427–455.
G. Onida, L. Reining, and A. Rubio, Electronic excitations: Density-functional versus many-body Green’s-function approaches, Rev. Mod. Phys., 74 (2002), p. 601.
X. Ren, P. Rinke, C. Joas, and M. Scheffler, Random-phase approximation and its applications in computational chemistry and materials science, J. Mater. Sci., 47 (2012), pp. 7447–7471.

Bibliography

[1] S. L. Adler, Quantum theory of the dielectric constant in real solids, Phys. Rev., 126 (1962), pp. 413–420. (Cited on p. 92)

[2] D. G. Anderson, Iterative procedures for nonlinear integral equations, J. Assoc. Comput. Mach., 12 (1965), pp. 547–560. (Cited on p. 52)

[3] T. Arias, M. Payne, and J. Joannopoulos, Ab initio molecular dynamics: Analytically continued energy functionals and insights into iterative solutions, Phys. Rev. Lett., 69 (1992), pp. 1077–1080. (Cited on p. 60)

[4] N. Ashcroft and N. Mermin, Solid State Physics, Thomson Learning, Toronto, 1976. (Cited on p. 21)

[5] S. Baroni, S. de Gironcoli, A. Dal Corso, and P. Giannozzi, Phonons and related crystal properties from density-functional perturbation theory, Rev. Mod. Phys., 73 (2001), pp. 515–562. (Cited on pp. 92, 95)

[6] A. D. Becke, Density-functional exchange-energy approximation with correct asymptotic behavior, Phys. Rev. A, 38 (1988), pp. 3098–3100. (Cited on p. 43)

[7] A. D. Becke, Density functional thermochemistry. III. The role of exact exchange, J. Chem. Phys., 98 (1993), p. 5648. (Cited on p. 44)

[8] D. Bohm and D. Pines, A collective description of electron interactions: III. Coulomb interactions in a degenerate electron gas, Phys. Rev., 92 (1953), p. 609. (Cited on p. 106)

[9] F. A. Bornemann and C. Schütte, A mathematical investigation of the Car-Parrinello method, Numer. Math., 78 (1998), pp. 359–376. (Cited on p. 79)

[10] D. R. Bowler and T. Miyazaki, O(N) methods in electronic structure calculations, Rep. Progr. Phys., 75 (2012), 036503. (Cited on pp. 64, 73)

[11] C. Brouder, G. Panati, M. Calandra, C. Mourougane, and N. Marzari, Exponential localization of Wannier functions in insulators, Phys. Rev. Lett., 98 (2007), 046402. (Cited on p. 98)

[12] K. Burke, Perspective on density functional theory, J. Chem. Phys., 136 (2012), 150901. (Cited on p. 42)

[13] E. Cancès, M. Defranceschi, W. Kutzelnigg, C. Le Bris, and Y. Maday, Computational quantum chemistry: A primer, Handb. Numer. Anal., 10 (2003), pp. 3–270. (Not cited)

[14] R. Car and M. Parrinello, Unified approach for molecular dynamics and density-functional theory, Phys. Rev. Lett., 55 (1985), pp. 2471–2474. (Cited on p. 78)

[15] M. E. Casida, Time-dependent density functional response theory for molecules, in Recent Advances in Density Functional Methods: (Part I), 1 (1995), p. 155. (Cited on p. 105)

[16] M. E. Casida and M. Huix-Rotllant, Progress in time-dependent density-functional theory, Annu. Rev. Phys. Chem., 63 (2012), pp. 287–323. (Not cited)

[17] D. M. Ceperley and B. J. Alder, Ground state of the electron gas by a stochastic method, Phys. Rev. Lett., 45 (1980), pp. 566–569. (Cited on p. 43)

[18] A. J. Coleman, Structure of Fermion density matrices, Rev. Mod. Phys., 35 (1963), pp. 668–687. (Cited on p. 57)

[19] H. D. Cornean, D. Monaco, and S. Teufel, Wannier functions and Z2 invariants in time-reversal symmetric topological insulators, Rev. Math. Phys., 29 (2017), 1730001. (Cited on p. 98)

[20] A. Damle, A. Levitt, and L. Lin, Variational formulation for Wannier functions with entangled band structure, Multiscale Model. Simul., 17 (2019), pp. 167–191. (Cited on p. 75)

[21] A. Damle, L. Lin, and L. Ying, Compressed representation of Kohn–Sham orbitals via selected columns of the density matrix, J. Chem. Theory Comput., 11 (2015), pp. 1463–1469. (Cited on pp. 75, 76)

[22] E. Davidson, The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices, J. Comput. Phys., 17 (1975), pp. 87–94. (Cited on p. 60)

[23] W. E and J. Lu, The Kohn-Sham Equation for Deformed Crystals, Memoirs of the American Mathematical Society, 221, American Mathematical Society, Providence, RI, 2013. (Cited on pp. 91, 98)

[24] A. Erisman and W. Tinney, On computing certain elements of the inverse of a sparse matrix, Commun. ACM, 18 (1975), pp. 177–179. (Cited on p. 65)

[25] H. Eschrig, The Fundamentals of Density Functional Theory, B. G. Teubner, Stuttgart, 1996. (Not cited)

[26] H.-R. Fang and Y. Saad, Two classes of multisecant methods for nonlinear acceleration, Numer. Linear Algebra Appl., 16 (2009), pp. 197–221. (Cited on p. 51)

[27] R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York, 1965. (Not cited)

[28] J. M. Foster and S. F. Boys, Canonical configurational interaction procedure, Rev. Mod. Phys., 32 (1960), p. 300. (Cited on p. 74)

[29] P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli, G. L. Chiarotti, M. Cococcioni, I. Dabo, A. Dal Corso, S. de Gironcoli, S. Fabris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougoussis, A. Kokalj, M. Lazzeri, L. Martin-Samos, N. Marzari, F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello, L. Paulatto, C. Sbraccia, S. Scandolo, G. Sclauzero, A. P. Seitsonen, A. Smogunov, P. Umari, and R. M. Wentzcovitch, QUANTUM ESPRESSO: A modular and open-source software project for quantum simulations of materials, J. Phys.: Condens. Matter, 21 (2009), 395502. (Cited on p. 71)

[30] S. Goedecker, Linear scaling electronic structure methods, Rev. Mod. Phys., 71 (1999), pp. 1085–1123. (Cited on pp. 64, 73)

[31] G. H. Golub and C. F. Van Loan, Matrix Computations, 4th ed., Johns Hopkins University Press, Baltimore, MD, 2013. (Cited on p. 76)

[32] X. Gonze and C. Lee, Dynamical matrices, Born effective charges, dielectric permittivity tensors, and interatomic force constants from density-functional perturbation theory, Phys. Rev. B, 55 (1997), p. 10355. (Cited on p. 95)

[33] S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics, Springer, Berlin, 2011. (Not cited)

[34] L. Hedin, New method for calculating the one-particle Green’s function with application to the electron-gas problem, Phys. Rev., 139 (1965), p. A796. (Cited on p. 92)

[35] J. Heyd, G. E. Scuseria, and M. Ernzerhof, Hybrid functionals based on a screened Coulomb potential, J. Chem. Phys., 118 (2003), pp. 8207–8215. (Cited on p. 44)

[36] N. J. Higham, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, PA, 2008. (Cited on p. 63)

[37] P. Hohenberg and W. Kohn, Inhomogeneous electron gas, Phys. Rev., 136 (1964), pp. B864–B871. (Cited on p. 39)

[38] M. Jacquelin, L. Lin, and C. Yang, PSelInv—A distributed memory parallel algorithm for selected inversion: The symmetric case, ACM Trans. Math. Software, 43 (2016), 21. (Cited on pp. 65, 66)

[39] D. D. Johnson, Modified Broyden’s method for accelerating convergence in self-consistent calculations, Phys. Rev. B, 38 (1988), pp. 12807–12813. (Cited on p. 51)

[40] R. O. Jones, Density functional theory: Its origins, rise to prominence, and future, Rev. Mod. Phys., 87 (2015), pp. 897–923. (Not cited)

[41] T. Kato, Perturbation Theory for Linear Operators, Springer, Berlin, 1966. (Not cited)

[42] E. Kaxiras, Atomic and Electronic Structure of Solids, Cambridge University Press, Cambridge, 2003. (Not cited)

[43] G. P. Kerker, Efficient iteration scheme for self-consistent pseudopotential calculations, Phys. Rev. B, 23 (1981), pp. 3082–3084. (Cited on p. 53)

[44] A. V. Knyazev, Toward the optimal preconditioned eigensolver: Locally optimal block preconditioned conjugate gradient method, SIAM J. Sci. Comput., 23 (2001), pp. 517–541. (Cited on p. 60)

[45] W. Kohn, Density functional and density matrix method scaling linearly with the number of atoms, Phys. Rev. Lett., 76 (1996), pp. 3168–3171. (Cited on pp. 64, 73)

[46] W. Kohn and L. Sham, Self-consistent equations including exchange and correlation effects, Phys. Rev., 140 (1965), pp. A1133–A1138. (Cited on p. 39)

[47] L. Landau and E. Lifshitz, Quantum Mechanics: Non-Relativistic Theory, Butterworth-Heinemann, Oxford, 1991. (Not cited)

[48] C. Lee, W. Yang, and R. G. Parr, Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density, Phys. Rev. B, 37 (1988), pp. 785–789. (Cited on p. 43)

[49] M. Levy, Universal variational functionals of electron densities, first-order density matrices, and natural spin-orbitals and solution of the v-representability problem, Proc. Natl. Acad. Sci. USA, 76 (1979), pp. 6062–6065. (Cited on p. 39)
122 Bibliography

[50] E. H. L IEB, Thomas-Fermi and related theories of atoms and molecules, Rev. Mod. Phys.,
53 (1981), pp. 603–641. (Cited on p. 70)
Downloaded 09/24/21 to 18.10.248.49 Redistribution subject to SIAM license or copyright; see https://epubs.siam.org/page/terms

[51] E. H. Lieb, Density functionals for Coulomb systems, Int. J. Quantum Chem., 24 (1983), pp. 243–277. (Cited on pp. 39, 40)

[52] E. H. Lieb and M. Loss, Analysis, 2nd ed., Graduate Studies in Mathematics 14, American Mathematical Society, Providence, RI, 2001. (Cited on p. 40)

[53] L. Lin, J. Lu, L. Ying, R. Car, and W. E, Fast algorithm for extracting the diagonal of the inverse matrix with application to the electronic structure analysis of metallic systems, Commun. Math. Sci., 7 (2009), pp. 755–777. (Cited on p. 65)

[54] L. Lin, J. Lu, L. Ying, and W. E, Pole-based approximation of the Fermi-Dirac function, Chinese Ann. Math. Ser. B, 30 (2009), p. 729. (Cited on pp. 59, 65, 66)

[55] L. Lin and C. Yang, Elliptic preconditioner for accelerating the self-consistent field iteration in Kohn–Sham density functional theory, SIAM J. Sci. Comput., 35 (2013), pp. S277–S298. (Cited on p. 93)

[56] L. Lin, C. Yang, J. Meza, J. Lu, L. Ying, and W. E, SelInv—An algorithm for selected inversion of a sparse symmetric matrix, ACM Trans. Math. Software, 37 (2011), 40. (Cited on pp. 65, 66)

[57] L. D. Marks and D. R. Luke, Robust mixing for ab initio quantum mechanical calculations, Phys. Rev. B, 78 (2008), 075114. (Cited on p. 51)

[58] M. Marques and E. K. U. Gross, Time-dependent density functional theory, Annu. Rev. Phys. Chem., 55 (2004), pp. 427–455. (Not cited)

[59] R. Martin, Electronic Structure—Basic Theory and Practical Methods, Cambridge University Press, New York, 2004. (Not cited)

[60] D. Marx and J. Hutter, Ab Initio Molecular Dynamics: Basic Theory and Advanced Methods, Cambridge University Press, New York, 2009. (Not cited)

[61] N. Marzari, A. A. Mostofi, J. R. Yates, I. Souza, and D. Vanderbilt, Maximally localized Wannier functions: Theory and applications, Rev. Mod. Phys., 84 (2012), pp. 1419–1475. (Cited on p. 74)

[62] N. Marzari and D. Vanderbilt, Maximally localized generalized Wannier functions for composite energy bands, Phys. Rev. B, 56 (1997), p. 12847. (Cited on p. 75)

[63] R. McWeeny, Some recent advances in density matrix theory, Rev. Mod. Phys., 32 (1960), pp. 335–369. (Cited on p. 61)

[64] N. Mermin, Thermal properties of the inhomogeneous electron gas, Phys. Rev., 137 (1965), p. A1441. (Cited on p. 39)

[65] H. J. Monkhorst and J. D. Pack, Special points for Brillouin-zone integrations, Phys. Rev. B, 13 (1976), p. 5188. (Cited on p. 73)

[66] G. Onida, L. Reining, and A. Rubio, Electronic excitations: Density-functional versus many-body Green’s-function approaches, Rev. Mod. Phys., 74 (2002), p. 601. (Cited on p. 102)

[67] G. Panati and A. Pisante, Bloch bundles, Marzari-Vanderbilt functional and maximally localized Wannier functions, Comm. Math. Phys., 322 (2013), pp. 835–875. (Cited on p. 98)

[68] R. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989. (Cited on pp. 57, 107)

[69] G. Pastore, E. Smargiassi, and F. Buda, Theory of ab initio molecular-dynamics calculations, Phys. Rev. A, 44 (1991), pp. 6334–6347. (Cited on p. 79)

[70] M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, and J. D. Joannopoulos, Iterative minimization techniques for ab initio total energy calculation: Molecular dynamics and conjugate gradients, Rev. Mod. Phys., 64 (1992), pp. 1045–1097. (Not cited)

[71] J. P. Perdew, K. Burke, and M. Ernzerhof, Generalized gradient approximation made simple, Phys. Rev. Lett., 77 (1996), pp. 3865–3868. (Cited on p. 43)

[72] J. P. Perdew and K. Schmidt, Jacob’s ladder of density functional approximations for the exchange-correlation energy, AIP Conf. Proc., 577 (2001), pp. 1–20. (Cited on p. 42)

[73] J. P. Perdew and A. Zunger, Self-interaction correction to density-functional approximations for many-electron systems, Phys. Rev. B, 23 (1981), pp. 5048–5079. (Cited on pp. 43, 45)

[74] P. Pulay, Convergence acceleration of iterative sequences: The case of SCF iteration, Chem. Phys. Lett., 73 (1980), pp. 393–398. (Cited on p. 52)

[75] P. Pulay, Improved SCF convergence acceleration, J. Comput. Chem., 3 (1982), pp. 54–69. (Cited on p. 52)

[76] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, Academic Press, New York, 1978. (Cited on p. 89)

[77] X. Ren, P. Rinke, C. Joas, and M. Scheffler, Random-phase approximation and its applications in computational chemistry and materials science, J. Mater. Sci., 47 (2012), pp. 7447–7471. (Not cited)

[78] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856–869. (Cited on pp. 51, 102)

[79] J. J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, Reading, MA, 1994. (Cited on p. 13)

[80] V. N. Staroverov, G. E. Scuseria, J. Tao, and J. P. Perdew, Comparative assessment of a new nonempirical density functional: Molecules and hydrogen-bonded complexes, J. Chem. Phys., 119 (2003), pp. 12129–12137. (Cited on p. 44)

[81] J. Sun, A. Ruzsinszky, and J. P. Perdew, Strongly constrained and appropriately normed semilocal density functional, Phys. Rev. Lett., 115 (2015), 036402. (Cited on p. 44)

[82] M. Teter, M. Payne, and D. Allan, Solution of Schrödinger’s equation for large systems, Phys. Rev. B, 40 (1989), p. 12255. (Cited on p. 102)

[83] C. A. Ullrich, Time-Dependent Density-Functional Theory: Concepts and Applications, Oxford University Press, Oxford, 2011. (Cited on p. 102)

[84] G. H. Wannier, The structure of electronic excitation levels in insulating crystals, Phys. Rev., 52 (1937), p. 191. (Cited on p. 75)

[85] N. Wiser, Dielectric constant with local field effects included, Phys. Rev., 129 (1963), pp. 62–69. (Cited on p. 92)

Index

A
ab initio molecular dynamics . . . 77
absorption spectrum . . . 105
  cross section . . . 106
adiabatic approximation . . . 80, 102
adiabatic connection . . . 106
anti-commutator . . . 8
atomic force . . . 76, 77, 95
aufbau principle . . . 47

B
band gap . . . 71
band structure . . . 23, 70, 71
basis set . . . 60
Bloch–Floquet decomposition . . . 22, 68
Born–Oppenheimer approximation . . . 33, 76
Born–Oppenheimer molecular dynamics . . . 78
boson . . . 26, 29
boundary condition
  Bloch . . . 74
  Born–von Karman . . . 67, 68
  periodic . . . 21, 67
Bravais lattice . . . 21, 67
Brillouin zone . . . 23, 67, 73

C
Casida equation . . . 105
Cauchy integral formula . . . 54
chemical accuracy . . . 42
chemical potential . . . 54, 72
commutator . . . 8
constrained minimization . . . 39, 40, 56
correlation energy . . . 34, 106
Coulomb interaction, bare . . . 92
Coulomb interaction, screened . . . 92
Courant–Fischer min-max theorem . . . 34
crystal . . . 21, 67, 75

D
density . . . 37
density functional perturbation theory . . . 89, 91
density functional theory . . . 39
  time-dependent . . . 79
density matrix . . . 37, 53, 85
  contour integral representation . . . 55, 59, 85
  exponential decay . . . 98
  many body . . . 55
density matrix algorithm . . . 61
dielectric screening . . . 91
Dirac notation . . . 3
dynamical matrix . . . 94
Dyson equation . . . 85, 99, 109

E
effective Hamiltonian . . . 89
effective potential . . . 47, 89
eigenfunction . . . 12
  generalized . . . 13, 71
eigenvalue . . . 4
electron density . . . 37
  two particle . . . 107
energy functional
  exchange-correlation . . . 41
  Hartree–Fock . . . 37
  Kohn–Sham . . . 42, 46
  Levy–Lieb . . . 41
Euler–Lagrange equation . . . 44, 45, 56, 70
exchange-correlation functional . . . 41
  Jacob’s ladder . . . 42
exchange-correlation kernel . . . 91, 109
exchange-correlation potential . . . 47
excitation energy . . . 104
excited state . . . 10, 104
exponential decay . . . 96

F
Fermi energy . . . 71
Fermi level . . . 54
Fermi operator expansion . . . 64
Fermi–Dirac . . . 57, 58, 64


fermion . . . 26, 29
finite temperature . . . 55
free energy . . . 55

G
Gâteaux derivative . . . 86
gauge . . . 53, 74, 76
gauge-invariant . . . 74
generalized gradient approximation . . . 43, 48
geometric optimization . . . 76
Green’s function . . . 55, 59, 65, 83
  decay property . . . 95
  kernel . . . 83
ground state . . . 10, 33, 39

H
Hamiltonian . . . 10, 17, 23
Hamiltonian, many body . . . 33, 103
Hartree–Fock . . . 28, 34
  energy functional . . . 37
  exchange energy . . . 39
  general . . . 38
  restricted . . . 38, 45
  unrestricted . . . 38
Heisenberg uncertainty principle . . . 14
helium atom . . . 26
Hellmann–Feynman theorem . . . 77, 95
Hilbert space . . . 3
  tensor product . . . 24
hybrid functional . . . 44
hydrogen atom . . . 17
hydrogen molecular ion . . . 19

I
identical particles . . . 29
insulating system . . . 71

K
k-point sampling . . . 67
kinetic energy . . . 40, 69
kinetic energy density . . . 44
Kohn–Sham density functional theory . . . 39
Kohn–Sham equations . . . 47
Kohn–Sham map . . . 49, 60, 65

L
Levy–Lieb energy functional . . . 41
linear response . . . 83, 86, 93
linear scaling algorithm . . . 64
local density approximation . . . 42, 47, 90
localization . . . 73
  Boys . . . 74
  Wannier . . . 75

M
matrix function . . . 64
McWeeny purification . . . 61
measurement . . . 5
meta-generalized gradient approximation . . . 44, 48
metallic system . . . 71
method
  Anderson . . . 52
  Broyden . . . 51
  density matrix purification . . . 61
  direct inversion of iterative subspace . . . 52
  Fermi operator expansion . . . 64
  fixed point iteration . . . 50
  Newton . . . 50
  Newton–Schulz . . . 63
  pole expansion and selected inversion . . . 66
  Pulay . . . 52
  quasi-Newton . . . 51
  selected column of the density matrix . . . 75
  selected inversion . . . 65
  simple mixing . . . 50, 93
mixing, density . . . 49
mixing, potential . . . 49
molecular dynamics
  ab initio . . . 77
  Born–Oppenheimer . . . 78
  Car–Parrinello . . . 78

N
near-sightedness principle . . . 64
Neumann series . . . 85, 91
Newton’s equation . . . 77
null state . . . 4

O
occupied orbital . . . 46
occupied space . . . 53
occupied subspace . . . 74
operator
  angular momentum . . . 15
  Coulomb . . . 47
  dielectric . . . 92, 93
  expectation value . . . 5
  Fock . . . 45
  Hamiltonian . . . 10
  momentum . . . 13
  permutation . . . 26, 28
  polarizability . . . 49, 87
  position . . . 12
  projection . . . 37, 61, 85, 88
  self-adjoint . . . 4
overlap matrix . . . 60

P
Pauli exclusion principle . . . 30, 34
Pauli matrix . . . 7
perturbation theory . . . 83
  degenerate eigenvalues . . . 89
  density functional . . . 89, 91
  density matrix . . . 85, 99
  eigenvalue . . . 88
  electron density . . . 87
  Green’s function . . . 83
  many-body . . . 106
  many-body Hamiltonian . . . 102
  propagator . . . 99
  time-dependent . . . 98
  time-dependent density functional . . . 101
phonon . . . 95
planewave . . . 22, 71
polarizability operator . . . 49, 87, 90, 109
  dynamic . . . 101
  irreducible . . . 87
  irreducible, dynamic . . . 101
  many-body . . . 103
  one-body . . . 104
  reducible . . . 90, 104
  reducible, dynamic . . . 101, 105
polarizability tensor
  frequency dependent, macroscopic . . . 105
  macroscopic . . . 92
pole expansion . . . 59, 65
potential
  effective . . . 47, 89, 106
  exchange-correlation . . . 90
  Hartree . . . 90
potential energy surface . . . 76
propagator . . . 9, 99

Q
quantization . . . 4
quantum Liouville equation . . . 80, 98

R
random phase approximation . . . 44, 91, 106, 109
real space discretization . . . 60, 65
reciprocal lattice . . . 23
reciprocal space . . . 67
resolvent . . . 55
resolvent identity . . . 83
resolvent set . . . 85

S
Schrödinger equation . . . 10
  time-dependent . . . 17, 79
  time-independent . . . 10
selected inversion . . . 65
self-consistent field iteration . . . 49, 93
semiconducting system . . . 71
simultaneous diagonalization . . . 8
Slater determinant . . . 30, 34
Sokhotski–Plemelj formula . . . 101
spherical harmonics . . . 16, 19
spin
  1/2 particle . . . 3, 17
  operator . . . 5
  precession . . . 11
  singlet . . . 27
  triplet . . . 27
spin-less particles . . . 49
state space . . . 3
state vector . . . 3
Stern–Gerlach experiment . . . 1
Sternheimer equation . . . 88, 91
  frequency dependent . . . 102
  time-dependent . . . 106
supercell . . . 67

T
thermodynamic limit . . . 68
time-ordered exponential . . . 98
time-ordered product . . . 98

U
uncertainty principle . . . 8
unit cell . . . 21, 68
unoccupied orbital . . . 46

V
variational principle . . . 34, 39
vector
  bra . . . 3
  ket . . . 3
virtual orbital . . . 46
von Neumann entropy . . . 55
von Neumann equation . . . 80

W
Wannier function . . . 74, 98

Z
zero temperature . . . 55, 85
A Mathematical Introduction to Electronic Structure Theory
Based on first principle quantum mechanics, electronic structure theory is widely used in physics, chemistry, materials science, and related fields and has recently received increasing research attention in applied and computational mathematics. This book provides a self-contained, mathematically oriented introduction to the subject and its associated algorithms and analysis. It will help applied mathematics students and researchers with minimal background in physics understand the basics of electronic structure theory and prepare them to conduct research in this area.

A Mathematical Introduction to Electronic Structure Theory begins with an elementary introduction of quantum mechanics, including the uncertainty principle and the Hartree–Fock theory, which is considered the starting point of modern electronic structure theory. The authors then provide an in-depth discussion of two carefully selected topics that are directly related to several aspects of modern electronic structure calculations: density matrix based algorithms and linear response theory. Chapter 2 introduces the Kohn–Sham density functional theory with a focus on the density matrix based numerical algorithms, and Chapter 3 introduces linear response theory, which provides a unified viewpoint of several important phenomena in physics and numerics. An understanding of these topics will prepare readers for more advanced topics in this field. The book concludes with the random phase approximation to the correlation energy.

The book is written for advanced undergraduate and beginning graduate students, specifically those with mathematical backgrounds but without a priori knowledge of quantum mechanics, and can be used for self-study by researchers, instructors, and other scientists. The book can also serve as a starting point to learn about many-body perturbation theory, a topic at the frontier of the study of interacting electrons.

Lin Lin is an associate professor in the department of mathematics at the University of California, Berkeley, and is a faculty scientist at Lawrence Berkeley National Laboratory. He is a recipient of the Sloan Fellowship, the National Science Foundation CAREER Award, the Department of Energy Early Career Award, and the SIAM Computational Science and Engineering (CSE) Early Career Award. His research focuses on the development of efficient numerical methods for electronic structure calculations.

Jianfeng Lu is an associate professor of mathematics, physics, and chemistry at Duke University, where he works in mathematical analysis and algorithm development for problems and challenges arising from computational physics, theoretical chemistry, and materials science. His work has been recognized by a Sloan Fellowship, a National Science Foundation Career Award, and the IMA Prize in Mathematics and its Applications.

For more information about SIAM books, journals, conferences, memberships, or activities, contact:
Society for Industrial and Applied Mathematics
3600 Market Street, 6th Floor
Philadelphia, PA 19104-2688 USA
+1-215-382-9800 • Fax +1-215-386-7999
[email protected] • www.siam.org

SL04
ISBN 978-1-611975-79-6
