Notes Fys4480
Simen Kvaal
November ,
Contents
Fundamental formalism
. Many-particle systems
.. Hilbert space and Hamiltonian
.. The manybody Hamiltonian
.. Separation of variables
.. Particle statistics
.. Slater determinants
. Second quantization
.. The creation and annihilation operators
.. Anticommutator relations
.. Occupation number representation
.. Spin orbitals and orbital diagrams
.. Fock space
.. Truncated bases
. Representation of operators
.. What we will prove
.. One-body operators
.. Two-body operators
. Wick’s Theorem
.. A sort of summary and motivation
.. Vacuum expectation values
.. Normal ordering and contractions
.. Statement of Wick’s Theorem
.. Vacuum expectation values using Wick’s Theorem
.. Proof of Wick’s Theorem
.. Using Wick’s Theorem
. Particle-hole formalism
. Operators on normal-order form (Not yet lectured)
.. The number operator
.. One-body operators
.. Two-body operators
.. Normal-ordered two-body Hamiltonian
.. Full expressions for the normal-ordered Hamiltonian
.. The Generalized Wick’s Theorem
The Standard Methods of approximation
. Introduction
. The variational principle
.. The Cauchy interlace theorem and linear models
. The Configuration-interaction method (CI)
.. General description
.. Matrix elements of the CI method
.. Computer implementation of CI methods
.. Naive CI
.. Direct CI
.. Recipe for bit pattern representation
. Hartree–Fock theory (HF)
.. The Hartree–Fock equations
.. The Hartree–Fock equations in a given basis: the Roothaan–Hall equations
.. Self-consistent field iteration
.. Basis expansions in HF single-particle functions
.. Restricted Hartree–Fock for electronic systems (RHF)
.. Unrestricted Hartree–Fock for electronic systems (UHF)
.. Normal-ordered Hamiltonian in HF basis (Not yet lectured)
. Perturbation theory for the ground-state (PT)
.. Non-degenerate Rayleigh–Schrödinger perturbation theory (RSPT)
.. Low-order RSPT
.. A two-state example
.. Manybody Perturbation Theory (MBPT)
.. Møller–Plesset Perturbation Theory
.. Hartree–Fock and Post-Hartree–Fock methods
.. From hydrogenic to Gaussian orbitals
.. Gaussian basis sets
.. Gaussians are useful because they give fast integration
Chapter
Fundamental formalism
Suggested reading for this chapter: Raimes [], sections .–., and Gross/Runge/Heinonen [], section I..
Chapter of Szabo/Ostlund [] contains a nice refresher on mathematical topics, including linear algebra.
$$\Psi = \Psi(x_1, x_2, \cdots, x_N),$$
where $x_i$ is a point in the configuration space $X$, the space where each particle “lives”. The configuration space for all $N$ particles is thus $X^N$, and
$$\Psi : X^N \longrightarrow \mathbb{C}.$$
nucleons N components.
Remark, for orientation only: Mathematically, $X$ is a measure space, which means that a function $\psi : X \to \mathbb{C}$ can be integrated over subsets of $X$. For subsets of $\mathbb{R}^n$, the standard measure is Lebesgue measure, which gives an integral slightly more general than the Riemann integral encountered in introductory analysis courses. For discrete sets, the standard measure is counting measure, where the integral is simply a sum. See also the small section on finite dimensional spaces further down. For us, we simply state that we integrate over continuous degrees of freedom and sum over discrete degrees of freedom. For $X = \mathbb{R}^d \times S$ with $S = \{s_1, s_2, \cdots, s_n\}$ a discrete set, we define
$$\int_X f(x)\,dx \equiv \sum_{s \in S} \int_{\mathbb{R}^d} f(\vec r, s)\, d^d r.$$
Footnote: Since the particles are identical, the configuration space is actually the quotient space $X^N/S_N$, where $S_N$ is the permutation group of $N$ objects. This means that we identify points in $X^N$ that differ only by a permutation. Suppose $X = \mathbb{R}^3$. Then $X^N$ is a flat space. But $X^N/S_N$ is actually a curved space! For low-dimensional systems, $X = \mathbb{R}^1$ or $X = \mathbb{R}^2$, one can show that particle statistics is not confined to only bosons or fermions. See [].
The wavefunction has a probabilistic interpretation: $P(x_1, \cdots, x_N) = |\Psi(x_1, x_2, \cdots, x_N)|^2$ is the probability density for locating all particles at the point $(x_1, \cdots, x_N) \in X^N$. Therefore, $\Psi$ must be square integrable, i.e., be in the Hilbert space $L^2(X^N)$,
$$\Psi \in L^2(X^N).$$
All physics can be obtained from the state Ψ.
The governing equation in non-relativistic quantum mechanics is the time-dependent Schrödinger equation (TDSE):
$$\hat H \Psi(x_1, x_2, \cdots, x_N, t) = i\hbar \frac{\partial}{\partial t} \Psi(x_1, x_2, \cdots, x_N, t).$$
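The system Hamiltonian $\hat H$ is obtained from its classical counterpart (if such exists) by a procedure called Weyl quantization [NB: add reference]. If $\hat H$ does not explicitly depend on time, the TDSE can be “solved” by instead considering the time-independent Schrödinger equation (TISE),
$$\hat H \Psi(x_1, \cdots, x_N) = E\, \Psi(x_1, \cdots, x_N).$$
The manybody Hamiltonian has the form
$$\hat H = \sum_{i=1}^{N} \hat h(i) + \sum_{(i,j)} \hat w(i,j) = \hat H_0 + \hat W,$$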
where ĥ(i) denotes a single-particle operator acting only on the degrees of freedom of particle i, and
ŵ(i, j) = ŵ( j, i) denotes a two-body operator that acts only on the degrees of freedom of the pair (i, j),
i ≠ j.
Of course, one could consider three-body forces as well, and even higher-order ones. Such forces occur in nuclear physics. We will rarely have occasion to work with such operators in this course.
Let us take the Hamiltonian of an atom in the Born–Oppenheimer approximation as an example.
The Hamiltonian for a free electron is just its kinetic energy,
$$\hat t = \frac{1}{2m_e}\vec p^{\,2} = \frac{1}{2m_e}(-i\hbar\nabla)^2 = -\frac{\hbar^2}{2m_e}\nabla^2.$$
If it is moving in an external field, such as the Coulomb field set up by an atomic nucleus of charge $+Ze$ at the location $\vec R$, we obtain the total single-particle Hamiltonian
$$\hat h = \hat t + \hat v = -\frac{\hbar^2}{2m_e}\nabla^2 - \frac{Ze^2}{\|\vec R - \vec r\|}.$$
The Hamiltonian for a system of $N$ electrons, neglecting inter-electronic interactions, becomes
$$\hat H_0 = \sum_{i=1}^{N} \hat h(i) = \sum_{i=1}^{N} \left[ -\frac{\hbar^2}{2m_e}\nabla_i^2 - \frac{Ze^2}{|\vec r_i - \vec R|} \right].$$
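$$w(i,j) = \frac{e^2}{|\vec r_i - \vec r_j|}.$$
Thus,
$$\hat W = \frac{1}{2}\sum_{\substack{i,j=1 \\ i \neq j}}^{N} w(i,j) = \frac{1}{2}\sum_{\substack{i,j=1 \\ i \neq j}}^{N} \frac{e^2}{|\vec r_i - \vec r_j|}.$$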
The right hand side is a constant. The left hand side is a sum of functions $f_1 + f_2 + \cdots + f_N$, $f_i = f_i(x_i)$. This can only sum to a constant if each $f_i(x_i)$ is a constant, so that each factor $\psi$ of the product satisfies
$$\hat h\, \psi(x) = \epsilon\, \psi(x),$$
which is just the TISE for a single particle! Thus, for any collection of $N$ eigenvalues of the single-particle problem, we get a solution of the $N$ particle problem. We obtain that the total eigenfunction is
$$\Psi(x_1, \cdots, x_N) = \psi_{i_1}(x_1)\psi_{i_2}(x_2)\cdots\psi_{i_N}(x_N),$$
with eigenvalue
$$E = \epsilon_{i_1} + \cdots + \epsilon_{i_N}.$$
One can also show that the converse is true: any eigenfunction $\Psi$ can be taken to be of the above form.
.. Particle statistics
Our particles are identical, or indistinguishable. There is abundant evidence that all elementary particles
must be treated as such. That means that our probability density must be permutation invariant in the
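following sense: let $\sigma \in S_N$ be a permutation of $N$ indices, and let $(x_1, \cdots, x_N) \in X^N$ be a configuration of the $N$ particles. Then we must have
$$|\Psi(x_1, \cdots, x_N)|^2 = |\Psi(x_{\sigma(1)}, x_{\sigma(2)}, \cdots, x_{\sigma(N)})|^2.$$
This is equivalent to
$$\Psi(x_1, \cdots, x_N) = e^{i\alpha}\, \Psi(x_{\sigma(1)}, x_{\sigma(2)}, \cdots, x_{\sigma(N)})$$
for some real $\alpha$, that may depend on $\sigma$. (Clearly, our separation of variables eigenfunctions do not satisfy this!)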
Define a linear operator $\hat P_\sigma$ via
$$(\hat P_\sigma \Psi)(x_1, \cdots, x_N) = \Psi(x_{\sigma(1)}, x_{\sigma(2)}, \cdots, x_{\sigma(N)}),$$
that is, the operator that evaluates $\Psi$ at permuted coordinates. We have reformulated particle indistinguishability as: $\Psi$ is an eigenfunction of $\hat P_\sigma$ for every $\sigma \in S_N$, with eigenvalue possibly depending on $\sigma$.
One can show (see the exercises) that either $\hat P_\sigma \Psi = \Psi$ for every $\sigma \in S_N$, or $\hat P_\sigma \Psi = (-1)^{|\sigma|} \Psi$ for every $\sigma \in S_N$, where $|\sigma|$ is the number of transpositions in $\sigma$, and thus $(-1)^{|\sigma|}$ is the sign of the permutation. In the former case, $\Psi$ is “totally symmetric with respect to permutations”, and in the latter case, “totally anti-symmetric”.
It is a postulate that particles occurring in quantum theory (in three-dimensional space) are of one of two types: bosons or fermions. Bosons have totally symmetric wavefunctions only, and fermions have totally anti-symmetric wavefunctions only. To cite Leinaas and Myrheim [], “The physical consequences of this postulate seem to be in good agreement with experimental data.” Wolfgang Pauli proved (using relativistic considerations) that wavefunctions of particles with half-integral spin must be anti-symmetric, and wavefunctions of particles with integral spin must be symmetric, connecting the postulate with the intrinsic spin of particles. To this day, no particles with other spin values have been found.
In this course, we focus on fermions. See, e.g., [] for the general case.
Exercise .. In this exercise, we prove that if $\Psi \in L^2(X^N)$ is an eigenfunction for all $\hat P_\sigma$, then the eigenvalue is either $1$ or $(-1)^{|\sigma|}$.
We introduce transpositions: $\tau \in S_N$ is a transposition if it exchanges only a single pair $(i, j)$, $i \neq j$. Write $\hat P_{ij} \equiv \hat P_\tau$.
Assume that $\Psi \in L^2(X^N)$ is such that, for all $\sigma \in S_N$,
$$\hat P_\sigma \Psi = s_\sigma \Psi, \quad s_\sigma = e^{i\alpha(\sigma)}.$$
Show that $\hat P_{ij}^2 = 1$, and find all the possible eigenvalues of $\hat P_{ij}$.
Under the assumption on $\Psi$, show that if $s_{ij}$ is the eigenvalue of $\hat P_{ij}$,
$$\hat P_{ij} \Psi = s_{ij} \Psi,$$
then, for any other pair $(i', j')$, the eigenvalue is $s_{ij} = s_{i'j'}$. You will probably need to use the group theoretical properties of permutations.
We have established that the eigenvalue of a transposition is a characteristic of $\Psi$; let $s = s_{ij}$. Compute the eigenvalue of $\hat P_\sigma$ for arbitrary $\sigma$ in terms of $s$. △
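Exercise .. Let
$$\hat H = \sum_{i=1}^{N} \hat h(i) + \sum_{(i,j)} \hat w(i,j).$$
Show that $\hat H$ commutes with $\hat P_\sigma$ for any permutation $\sigma \in S_N$, i.e., show that for any wavefunction $\Psi \in L^2(X^N)$,
$$\hat H \hat P_\sigma \Psi = \hat P_\sigma \hat H \Psi.$$
△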
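Exercise .. In this exercise, we consider $X = \mathbb{R}^3$, i.e., no spin. Consider each of the below functions.
1. $\Psi(\vec r_1, \vec r_2) = e^{-\alpha|\vec r_1 - \vec r_2|}$.
2. $\Psi(\vec r_1, \vec r_2) = \sin(\vec e_z \cdot (\vec r_1 - \vec r_2))$, where $\vec e_z$ is the unit vector in the $z$-direction.
3. $\Psi(\vec r_1, \vec r_2, \vec r_3) = \sin[\vec r_1 \cdot (\vec r_2 \times \vec r_3)]\, e^{-|\vec r_1|} e^{-|\vec r_2|} e^{-|\vec r_3|}$.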
Answer the following questions, per function:
Is the function totally symmetric with respect to particle permutations?
Is the function totally antisymmetric with respect to particle permutations?
Is the function square integrable?
△
Exercise .. Prove that $L^2(X^N)_{AS}$ is a linear space. Additionally, if you have the mathematical background, prove that it is a closed subspace using the Hilbert space metric. △
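Here, $L^2(X)$ is the Hilbert space of a single fermion. Let us assume that we have an orthonormal basis (ONB) $\phi_1, \phi_2, \cdots$, for this space, such that we can expand any $\psi \in L^2(X)$ as
$$\psi(x) = \sum_\mu c_\mu \phi_\mu(x), \quad c_\mu = \langle \phi_\mu | \psi \rangle,$$
with
$$\langle \phi_\mu | \phi_\nu \rangle = \delta_{\mu,\nu}$$
and
$$\|\psi\|^2 = \sum_\mu |c_\mu|^2.$$
Thus, $\psi(x)$ is represented by an (infinite) vector $[c_\mu] = (c_1, c_2, \cdots)$. Because of Eq. (.), we may construct a basis for $L^2(X^N)$ by tensor products,
$$\tilde\Phi_{\mu_1,\cdots,\mu_N}(x_1, \cdots, x_N) = \phi_{\mu_1}(x_1)\phi_{\mu_2}(x_2)\cdots\phi_{\mu_N}(x_N),$$
with
$$\langle \tilde\Phi_{\mu_1,\cdots,\mu_N} | \tilde\Phi_{\nu_1,\cdots,\nu_N} \rangle = \delta_{\mu_1,\nu_1}\cdots\delta_{\mu_N,\nu_N}.$$
In the $N = 2$ case, we see that $\Psi(x_1, x_2)$ can be represented by an infinite matrix $[c_{\mu_1\mu_2}]$, and in the $N = 3$ case a 3D matrix, and so on.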
Remark: Compare this with the separation-of-variables treatment. If the set of eigenfunctions $\psi_i \in L^2(X)$ of $\hat h$ is complete, our separation of variables eigenfunctions $\Psi = \psi_1\psi_2\cdots\psi_N$ form a complete set too.
Another remark: For arbitrary $N$, the tensor product basis described can be counted. For arbitrary $N$, let us introduce a generic index, a multiindex, $k = (\mu_1, \cdots, \mu_N)$. There is a one-to-one mapping between multiindices and the natural numbers $\mathbb{N}$. Thus, writing $\xi = (x_1, \cdots, x_N)$,
$$\Psi(\xi) = \sum_k c_k \tilde\Phi_k(\xi),$$
all the various $N$ are represented with the same formula. There is nothing special about $c$ being a vector, a matrix, a 3D matrix, etc. They are all fundamentally equivalent, since the basis set can be counted.
Important message so far: a single-particle basis set $\{\phi_\mu\}$ can be used to construct a basis for $L^2(X^N)$.
What about our “actual” Hilbert space, $L^2(X^N)_{AS}$? Can we construct a basis for this using our single-particle basis? Yes, this is the role of Slater determinants.
What is the simplest totally antisymmetric wavefunction we can create, starting with some single-particle functions? If we start with $N = 2$, and consider the product $\phi_1(x_1)\phi_2(x_2)$, this is not antisymmetric. But if we consider the linear combination
$$\Phi(x_1, x_2) = \phi_1(x_1)\phi_2(x_2) - \phi_2(x_1)\phi_1(x_2),$$
this is antisymmetric if we exchange $x_1$ and $x_2$. Continuing with $N = 3$, we quickly realize that in order to obtain something antisymmetric out of $\phi_1(x_1)\phi_2(x_2)\phi_3(x_3)$, we must take the linear combination
$$\begin{aligned}\Phi(x_1, x_2, x_3) = {} & \phi_1(x_1)\phi_2(x_2)\phi_3(x_3) - \phi_2(x_1)\phi_1(x_2)\phi_3(x_3) - \phi_3(x_1)\phi_2(x_2)\phi_1(x_3) \\ & - \phi_1(x_1)\phi_3(x_2)\phi_2(x_3) + \phi_2(x_1)\phi_3(x_2)\phi_1(x_3) + \phi_3(x_1)\phi_1(x_2)\phi_2(x_3),\end{aligned}$$
each term representing a permutation of the indices $(123)$. There is nothing special about $(123)$ of course, $(\mu_1\mu_2\mu_3)$ also works. Note that if two of these indices are equal, then the whole linear combination is zero as well.
The generalization to N indices is in fact a determinant, and we make a definition:
Definition .. Let $\phi_1, \phi_2, \ldots, \phi_N$ be arbitrary single-particle functions in $L^2(X)$ (not necessarily orthonormal). The Slater determinant defined by these functions is denoted by $[\phi_1\phi_2\cdots\phi_N]$, and is defined via the formula
$$\begin{aligned}
[\phi_1, \phi_2, \cdots, \phi_N](x_1, \cdots, x_N) &= \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\phi_1(x_1) & \phi_1(x_2) & \cdots & \phi_1(x_N) \\
\phi_2(x_1) & \phi_2(x_2) & \cdots & \phi_2(x_N) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_N(x_1) & \phi_N(x_2) & \cdots & \phi_N(x_N)
\end{vmatrix} \\
&= \frac{1}{\sqrt{N!}} \sum_{\sigma \in S_N} (-1)^{|\sigma|} \prod_{i=1}^{N} \phi_{\sigma(i)}(x_i) \\
&= \frac{1}{\sqrt{N!}} \sum_{\sigma \in S_N} (-1)^{|\sigma|} \prod_{i=1}^{N} \phi_i(x_{\sigma(i)}).
\end{aligned}$$
Note: the $1/\sqrt{N!}$ is there for normalization purposes, see later. The second formula in the definition follows from the theory of matrix determinants.
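The permutation-sum form of the definition can be evaluated directly for small $N$. The following is a minimal sketch, not part of the original notes: the three one-dimensional toy orbitals and the function names are assumptions made purely for illustration. It evaluates the determinant both by permuting orbital labels and by permuting coordinates.

```python
from itertools import permutations
from math import exp, factorial, sqrt


def perm_sign(sigma):
    """Sign (-1)^|sigma| of a permutation tuple, computed from its inversion count."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1


def slater(orbitals, xs):
    """Sum over sigma of (-1)^|sigma| prod_i phi_{sigma(i)}(x_i), divided by sqrt(N!)."""
    n = len(orbitals)
    total = 0.0
    for sigma in permutations(range(n)):
        term = 1.0
        for i in range(n):
            term *= orbitals[sigma[i]](xs[i])
        total += perm_sign(sigma) * term
    return total / sqrt(factorial(n))


def slater_permuted_coordinates(orbitals, xs):
    """Same determinant, written as a sum over prod_i phi_i(x_{sigma(i)})."""
    n = len(orbitals)
    total = 0.0
    for sigma in permutations(range(n)):
        term = 1.0
        for i in range(n):
            term *= orbitals[i](xs[sigma[i]])
        total += perm_sign(sigma) * term
    return total / sqrt(factorial(n))


# Hypothetical (unnormalized) toy orbitals on the real line, for illustration only.
toy_orbitals = [
    lambda x: exp(-x * x),
    lambda x: x * exp(-x * x),
    lambda x: x * x * exp(-x * x),
]
```

Exchanging two coordinates flips the sign of `slater(toy_orbitals, xs)`, and repeating an orbital gives zero, in line with the determinant properties used in the text. The cost grows as $N!$, so this is only meant for checking small cases.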
Exercise .. Show that the two last lines in Eq. (.) are equivalent. This requires some manipulation of
permutations. △
Exercise .. Let $A$ be an $N \times N$ matrix. Let $\phi_j$, $j = 1, \cdots, N$ be given single-particle functions, and let $\psi_k$, $k = 1, \cdots, N$ be defined by
$$\psi_k = \sum_j \phi_j A_{jk}.$$
Prove that
$$[\psi_1, \psi_2, \cdots, \psi_N] = \det(A)\,[\phi_1, \phi_2, \cdots, \phi_N].$$
(Hint: use antisymmetry of Slater determinants with respect to permutations of single-particle functions, and the expression $\det(A) = \sum_{\sigma \in S_N} (-1)^{|\sigma|} A_{1\sigma(1)} A_{2\sigma(2)} \cdots A_{N\sigma(N)}$.) △
Exercise .. NB: This exercise has been updated since it was given as part of Problem set (H). The assumption that the indices were sorted was added. Suppose that $\{\phi_\mu\}$, $\mu = 1, 2, \cdots$ are orthonormal. Prove that $\Phi_{\mu_1,\cdots,\mu_N} = [\phi_{\mu_1}, \phi_{\mu_2}, \cdots, \phi_{\mu_N}]$ is normalized,
$$\langle \Phi_{\mu_1\mu_2\cdots\mu_N} | \Phi_{\mu_1\mu_2\cdots\mu_N} \rangle = 1.$$
Prove that
$$\langle \Phi_{\mu_1\mu_2\cdots\mu_N} | \Phi_{\nu_1\nu_2\cdots\nu_N} \rangle = \delta_{\mu_1\nu_1}\cdots\delta_{\mu_N\nu_N},$$
under the assumption that $\vec\mu$ and $\vec\nu$ are sorted in increasing order. What do you get for the inner product if the indices are not sorted? △
Observation: Determinant properties imply that permutation of particle indices gives a sign change. Permutation of function indices also gives a sign change:
$$[\phi_1, \cdots, \phi_i, \cdots, \phi_j, \cdots, \phi_N] = -[\phi_1, \cdots, \phi_j, \cdots, \phi_i, \cdots, \phi_N].$$
Theorem .. Let $\{\phi_\mu\}$ be an orthonormal basis for $L^2(X)$. Then, any $\Psi \in L^2(X^N)_{AS}$ can be expanded in the Slater determinants
$$[\phi_{\mu_1}, \phi_{\mu_2}, \cdots, \phi_{\mu_N}].$$
Moreover, if we choose an ordering of the indices $\mu$, the Slater determinants satisfying $\mu_1 < \mu_2 < \cdots < \mu_N$ form an orthonormal basis for $L^2(X^N)_{AS}$.
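Proof. Step 1: Expand $\Psi$ in the tensor product basis,
$$\Psi = \sum_{\mu_1,\cdots,\mu_N} c_{\mu_1,\cdots,\mu_N} \tilde\Phi_{\mu_1,\cdots,\mu_N}.$$
Step 2: Show that the coefficients $c_{\vec\mu}$ are antisymmetric under permutation. For simplicity, consider a transposition of $i$ with $j$, $i < j$:
$$\begin{aligned}
(\hat P_{ij}\Psi)(x_1, \cdots, x_N) &= \sum_{\mu_1,\cdots,\mu_N} c_{\mu_1,\cdots,\mu_N} \tilde\Phi_{\mu_1,\cdots,\mu_N}(x_1, \cdots, x_j, \cdots, x_i, \cdots, x_N) \\
&= -\sum_{\mu_1,\cdots,\mu_N} c_{\mu_1,\cdots,\mu_N} \tilde\Phi_{\mu_1,\cdots,\mu_N}(x_1, \cdots, x_N)
\end{aligned}$$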
splitting the summation over ordered multiindices and permutations of these. We now get
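$$\begin{aligned}
\Psi &= \sum_{\mu_1 < \cdots < \mu_N} \sum_\sigma (-1)^{|\sigma|} c_{\mu_1,\cdots,\mu_N} \tilde\Phi_{\mu_{\sigma(1)}, \mu_{\sigma(2)}, \cdots, \mu_{\sigma(N)}} \\
&= \sum_{\mu_1 < \cdots < \mu_N} \left(\sqrt{N!}\, c_{\mu_1,\cdots,\mu_N}\right) \frac{1}{\sqrt{N!}} \sum_\sigma (-1)^{|\sigma|} \tilde\Phi_{\mu_{\sigma(1)}, \mu_{\sigma(2)}, \cdots, \mu_{\sigma(N)}} \\
&= \sum_{\mu_1 < \cdots < \mu_N} \left(\sqrt{N!}\, c_{\mu_1,\cdots,\mu_N}\right) [\phi_{\mu_1}, \cdots, \phi_{\mu_N}].
\end{aligned}$$
This in fact proves that the Slater determinants, when we only use ordered indices, are sufficient to expand any $\Psi \in L^2(X^N)_{AS}$. Clearly, if we omit one such Slater determinant, not all $\Psi$ can be expanded. (In particular, this omitted Slater determinant cannot be expanded in the rest!) Thus, the Slater determinants with ordered indices form a basis.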
Exercise .. How many terms are there in $[\phi_1\phi_2\phi_3\phi_4](x_1, x_2, x_3, x_4)$, when expanded as a linear combination of tensor products? Write down the expansion explicitly. △
Now,
$$[\phi_1, \cdots, \phi_N] = \sqrt{N!}\, A\, \phi_1(x_1)\cdots\phi_N(x_N).$$
An operator $U$ is an orthogonal projector if and only if $U^\dagger = U$ and $U^2 = U$.
Prove that $A$ is an orthogonal projector from $L^2(X^N)$ onto $L^2(X^N)_{AS}$. △
$$L^2_N \equiv L^2(X^N)_{AS}$$
since the space $X$ is understood from context, and since we only deal with fermion spaces. We also introduce the bra/ket notation for wavefunctions.
Recall that a basis for $L^2_N$ could be formed from an orthonormal basis $\{\phi_\mu\}$ of $L^2(X)$, by computing a set of Slater determinants $\Phi_{\mu_1,\cdots,\mu_N} = [\phi_{\mu_1}, \cdots, \phi_{\mu_N}]$, where $\mu_1 < \mu_2 < \cdots < \mu_N$ were ordered. (If we permute the index set, we get the same function with a possible sign change, so it is not an additional basis function.)
So far we have emphasized that $[\phi_{\mu_1}, \cdots, \phi_{\mu_N}]$ were functions, but in quantum mechanics the bra/ket notation is useful. We therefore introduce the ket notation
$$|\psi_1, \cdots, \psi_N\rangle = [\psi_1, \cdots, \psi_N]$$
for an arbitrary Slater determinant. When $\{\phi_\mu\}$ is a single-particle basis, we may choose to suppress all the $\phi$’s everywhere, and write
$$|\mu_1\mu_2\cdots\mu_N\rangle = [\phi_{\mu_1}, \cdots, \phi_{\mu_N}]$$
for a Slater determinant. If $\mu_i = \mu_j$ for some $i \neq j$, then $|\vec\mu\rangle = 0$ is the zero vector. We recall the antisymmetry properties,
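$$\hat P_{ij} |\mu_1\cdots\mu_i\cdots\mu_j\cdots\mu_N\rangle = |\mu_1\cdots\mu_j\cdots\mu_i\cdots\mu_N\rangle = -|\mu_1\cdots\mu_i\cdots\mu_j\cdots\mu_N\rangle$$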
connecting with the earlier treatment. The ∼ means that we sum only over ordered sets of indices. As we
saw earlier, the coefficients ⟨ µ⃗∣Ψ⟩ are permutation antisymmetric.
So far, we have used Greek letters µ, ν, etc., as single-particle indices. There is nothing special about
this, of course. We will later also use p, q, r, etc.
Looking at the determinant (.), we see that by adding a row containing the index $\nu$, and a column with coordinate $x_{N+1}$, we obtain an $N + 1$ particle Slater determinant (modulo a constant factor):
$$\langle x_1\cdots x_{N+1} | \nu\mu_1\mu_2\cdots\mu_N \rangle = \frac{1}{\sqrt{(N+1)!}}
\begin{vmatrix}
\phi_\nu(x_1) & \phi_\nu(x_2) & \cdots & \phi_\nu(x_N) & \phi_\nu(x_{N+1}) \\
\phi_{\mu_1}(x_1) & \phi_{\mu_1}(x_2) & \cdots & \phi_{\mu_1}(x_N) & \phi_{\mu_1}(x_{N+1}) \\
\phi_{\mu_2}(x_1) & \phi_{\mu_2}(x_2) & \cdots & \phi_{\mu_2}(x_N) & \phi_{\mu_2}(x_{N+1}) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\phi_{\mu_N}(x_1) & \phi_{\mu_N}(x_2) & \cdots & \phi_{\mu_N}(x_N) & \phi_{\mu_N}(x_{N+1})
\end{vmatrix}$$
Similarly, we can remove a row and column, and obtain an $N - 1$ particle Slater determinant.
This inspires the creation and annihilation operators, which map wavefunctions between different particle number spaces:
$$c^\dagger_\nu : L^2_N \to L^2_{N+1}$$
$$c_\nu : L^2_N \to L^2_{N-1}$$
The operator $c^\dagger_\nu$ is called a creation operator and is, roughly, defined by inserting a row and column as described. The operator $c_\nu$ is the Hermitian adjoint of $c^\dagger_\nu$, and it will be shown that its action on a Slater determinant corresponds to the mentioned removal of a row and column.
We define the space $L^2_0$, the zero particle space, as a one-dimensional space spanned by the special ket $|-\rangle$, the vacuum state. There is nothing mysterious about this, it is just a definition that will be useful later. Note that $|-\rangle \neq 0$.
Recall that a linear operator is fully defined when we specify its action on a basis set. This is how we
define c †µ and c µ .
Definition of the creation operator: For every single-particle index $\nu$, we define the creation operator $c^\dagger_\nu$ acting on the vacuum state by
$$c^\dagger_\nu |-\rangle = |\nu\rangle.$$
Since this is a Slater determinant with a single particle, we have, of course, $\langle x|\nu\rangle = \phi_\nu(x)$. For an arbitrary Slater determinant with $N > 0$, we define the action by
$$c^\dagger_\nu |\mu_1\cdots\mu_N\rangle \equiv |\nu\mu_1\cdots\mu_N\rangle.$$
In terms of determinant coordinate expressions as in Eq. (.), $c^\dagger_\nu$ inserts a column on the far right with $x_{N+1}$ and inserts a row on the top with the index $\nu$. Finally, the whole expression is renormalized.
[Recall that the basis Slater determinants were the determinants that had ordered indices. Assume that $\vec\mu$ is ordered. Clearly, $c^\dagger_\nu|\vec\mu\rangle$ is either zero or equal to $(-1)^j |\mu_1\mu_2\cdots\mu_j\nu\mu_{j+1}\cdots\mu_N\rangle$, which is a new basis determinant. Here, $j$ is chosen such that the augmented index set is ordered.]
Let us now consider the annihilation operator. There are no particles to remove in the vacuum state, so we set
$$c_\nu |-\rangle \equiv 0.$$
Let $\vec\mu$ be a multiindex. If $\nu = \mu_j$ for some $j$, we define
$$c_\nu |\mu_1\cdots\mu_N\rangle \equiv (-1)^{j-1} |\mu_1\cdots\mu_{j-1}\mu_{j+1}\cdots\mu_N\rangle,$$
and if $\nu$ is not among the $\mu_j$, we define $c_\nu|\vec\mu\rangle \equiv 0$.
In terms of the coordinate determinant expression, this amounts to moving the $j$th row to the top with $j - 1$ transpositions, giving the sign factor, and then crossing out the far right column and the first row, now containing the index $\nu$. This moving of the $j$th row may seem like a complication compared to the creation operator, but note that for $c^\dagger_\nu$ we defined its action by inserting $\nu$ on the top. Moving $\nu$ to the $(j + 1)$th position will induce a $(-1)^j$. But $c_\nu$ removes a row at an in principle arbitrary location.
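Exercise .. Prove that $c^\dagger_\alpha$ and $c_\alpha$ are Hermitian adjoints of each other, as the notation suggests. Thus, for any $\vec\mu$ with $N$ indices, and $\vec\nu$ with $N + 1$ indices, show that
$$\langle \vec\nu | c^\dagger_\alpha | \vec\mu \rangle = \langle \vec\mu | c_\alpha | \vec\nu \rangle^*.$$
$$\{c^\dagger_{\nu_1}, c^\dagger_{\nu_2}\} = 0 \quad \text{(a)}$$
$$\{c_{\nu_1}, c_{\nu_2}\} = 0 \quad \text{(b)}$$
$$\{c_{\nu_1}, c^\dagger_{\nu_2}\} = \delta_{\nu_1,\nu_2}. \quad \text{(c)}$$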
Why? The right hand side is obtained by exchanging the two first rows of the determinant on the left hand
side.
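Since this equation holds for any basis vector, we have shown that the two creation operators anticommute,
$$\{c^\dagger_{\nu_1}, c^\dagger_{\nu_2}\} \equiv c^\dagger_{\nu_1} c^\dagger_{\nu_2} + c^\dagger_{\nu_2} c^\dagger_{\nu_1} = 0.$$
Similarly, two annihilation operators anticommute,
$$\{c_{\nu_1}, c_{\nu_2}\} \equiv c_{\nu_1} c_{\nu_2} + c_{\nu_2} c_{\nu_1} = 0.$$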
c †ν c ν ∣ µ⃗⟩ . (.)
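We also get
$$c_\nu c^\dagger_\nu |\mu_1\cdots\mu_N\rangle = c_\nu |\mu_j\mu_1\cdots\mu_j\cdots\mu_N\rangle = 0.$$
Case b: $\nu \notin \vec\mu$, i.e., $\nu$ is distinct from all the $\mu_j$. In this case, $c_\nu|\vec\mu\rangle = 0$, so
$$c^\dagger_\nu c_\nu |\mu_1\cdots\mu_N\rangle = 0.$$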
c †ν c ν ∣ µ⃗⟩ . (.)
c ν c †ν ∣ µ⃗⟩ = . (.)
or the binary number
B = µ + µ + µ = + + = = . (.)
The different bits are called occupation numbers. The vacuum has no occupied single-particle functions, and is represented by the binary number $0$ or the empty set.
We use the notation
$$|n_1 n_2 \cdots n_\mu \cdots\rangle$$
to denote the Slater determinant with occupation numbers $n_\mu \in \{0, 1\}$, and by definition we choose the one determinant out of the $N!$ possible that has $\vec\mu$ sorted: $\mu_1 < \mu_2 < \cdots < \mu_N$. We have $n_\mu = 1$ if and only if $\mu \in \vec\mu$. In the above example,
∣, , ⟩ = ∣ ⟩ . (.)
If no ambiguity can arise, we simply write
∣⟩
etc.
Again, we stress that occupation numbers only represent one of the $N!$ Slater determinants possible to construct with $\mu_1$ through $\mu_N$, namely the one where all are sorted. But they still form a basis. In the example,
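Exercise .. Let $\vec\mu = \{\mu_1, \cdots, \mu_N\}$ be a given set of occupied orbitals, with occupation number representation
$$|n_1 n_2 n_3 \cdots\rangle$$
Show that:
$$c^\dagger_\nu |n_1 n_2 n_3 \cdots\rangle = \begin{cases} 0 & \text{if } \nu \text{ is occupied} \\ (-1)^{\#\nu} |n_1 n_2 n_3 \cdots n_{\nu-1}\, 1_\nu\, n_{\nu+1} \cdots\rangle & \text{if } \nu \text{ is unoccupied} \end{cases}$$
$$c_\nu |n_1 n_2 n_3 \cdots\rangle = \begin{cases} 0 & \text{if } \nu \text{ is unoccupied} \\ (-1)^{\#\nu} |n_1 n_2 n_3 \cdots n_{\nu-1}\, 0_\nu\, n_{\nu+1} \cdots\rangle & \text{if } \nu \text{ is occupied} \end{cases}$$
△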
Exercise .. Let ϕ i , i = , , be three orthonormal single-particle functions. Consider the determinants
∣, , ⟩, ∣, , ⟩, ∣, , ⟩, ∣, , ⟩, ∣, , ⟩ and ∣, , ⟩.
a) Are there further N = Slater determinants that can be created using the single-particle orbitals ϕ i ,
i = , , only?
b) Write down a basis for the space spanned by the six determinants, i.e., a basis for all the vectors on
the form
∣Ψ⟩ = a ∣, , ⟩ + a ∣, , ⟩ + a ∣, , ⟩ + a ∣, , ⟩ + a ∣, , ⟩ + a ∣, , ⟩ .
(Here, a i are complex numbers.) △
.. Spin orbitals and orbital diagrams
Consider a system of electrons. Configuration space is $X^N$ for $N$ electrons, and for a single electron $X = \mathbb{R}^d \times \{+, -\}$, so
$$L^2(X) = L^2(\mathbb{R}^3) \otimes \mathbb{C}^2.$$
This means that each $\psi \in L^2(X)$ is a two-component function, one for spin-up and one for spin-down.
The notation for spin can vary. Here, we use $+$ for spin up and $-$ for spin down, along the $z$-axis. This is arbitrary, of course. In chemistry, one often uses $\alpha$ for spin up, and $\beta$ for spin down, as symbols. (This is the notation in Szabo and Ostlund, for instance.) Sometimes one uses arrows $\uparrow$ and $\downarrow$, or $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$.
If $\{\varphi_p(\vec r)\}$ is an orthonormal basis for $L^2(\mathbb{R}^3)$, the space part, and $\chi_+(\sigma)$ and $\chi_-(\sigma)$ are basis functions for $\mathbb{C}^2$, $\sigma \in \{+, -\}$, the spin space, we have a basis for $L^2(X)$ via tensor products:
$$\phi_\mu(\vec r, \sigma) = \varphi_p(\vec r)\chi_\alpha(\sigma), \quad \mu = (p, \alpha).$$
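$$\langle\sigma|S_x|\chi_\alpha\rangle = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
$$\langle\sigma|S_y|\chi_\alpha\rangle = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$$
$$\langle\sigma|S_z|\chi_\alpha\rangle = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \hbar\alpha\delta_{\alpha\sigma}$$
$$[\hat H, \hat S_z] = 0$$
$$\hat H = \sum_i \hat h(\vec r_i).$$
[Figure: single-particle energy levels $\epsilon_0, \epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4, \ldots$]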
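Let $\hat h$, an operator on the space $L^2(\mathbb{R}^3)$, have a complete set of eigenfunctions, $\varphi_p(\vec r)$,
$$\hat h \varphi_p(\vec r) = \epsilon_p \varphi_p(\vec r).$$
Then, as operator on $L^2(X)$, we have the complete set $\phi_\mu(\vec r, \sigma) = \varphi_p(\vec r)\chi_\alpha(\sigma)$, and we see that the single-particle functions are doubly degenerate:
$$\hat h\, \varphi_p\chi_\alpha = \epsilon_p\, \varphi_p\chi_\alpha, \quad \alpha \in \{+, -\}.$$
By definition, $\langle\vec\mu|\vec\nu\rangle = 0$ if $\vec\nu$ and $\vec\mu$ have different numbers of particles, i.e., different numbers of occupied single-particle functions.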
Thus
⟨ ⋯∣ ⋯⟩ = (.)
for example, since the number of particles differ in the two functions.
Now, $c^\dagger_\mu : F \to F$ maps entirely inside $F$, and similarly with $c_\mu$.
A basis for $F$ is the set of all $|n_1 n_2 \cdots\rangle$ with an arbitrary number of orbitals occupied.
The binary number representation is quite useful for computer programs involving Slater determinants, as one can easily imagine.
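As a minimal sketch of such a bit-pattern representation (not taken from the notes; the function names and the choice of numbering orbitals from bit 0 are assumptions made for illustration), a determinant with sorted indices can be stored as a Python integer whose bit $\nu$ is the occupation number $n_\nu$. The sign picked up by $c^\dagger_\nu$ and $c_\nu$ is then $(-1)$ raised to the number of occupied orbitals below $\nu$, which is what moving $\nu$ to or from the top row costs.

```python
def occupied_below(state: int, nu: int) -> int:
    """Number of occupied orbitals with index smaller than nu."""
    return bin(state & ((1 << nu) - 1)).count("1")


def create(state: int, nu: int):
    """Apply c^dagger_nu to a sorted determinant; returns (sign, new_state), sign 0 meaning the zero vector."""
    if (state >> nu) & 1:
        return 0, state  # orbital already occupied: result is zero
    return (-1) ** occupied_below(state, nu), state | (1 << nu)


def annihilate(state: int, nu: int):
    """Apply c_nu to a sorted determinant; returns (sign, new_state), sign 0 meaning the zero vector."""
    if not (state >> nu) & 1:
        return 0, state  # orbital not occupied: result is zero
    return (-1) ** occupied_below(state, nu), state & ~(1 << nu)


def particle_number(state: int) -> int:
    """Number of occupied orbitals, i.e. the number of set bits."""
    return bin(state).count("1")
```

For instance, `create(0b001, 1)` gives `(-1, 0b011)`: one occupied orbital lies below orbital 1, so one transposition is needed to restore sorted order. With this sign convention the anticommutation relations can be checked by brute force over all bit patterns of a small basis.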
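A special operator, the number operator: Let $\nu$ be arbitrary. We have that
$$c^\dagger_\nu c_\nu |\vec\mu\rangle = n_\nu |\vec\mu\rangle,$$
where $n_\nu$ is the occupation number of $\nu$ in $|\vec\mu\rangle$. Therefore, we define
$$\hat N \equiv \sum_\nu c^\dagger_\nu c_\nu.$$
This operator extracts the number of fermions in a state $|\Psi\rangle$ in the sense that for any $|\Psi\rangle \in F$, $\hat N|\Psi\rangle = N|\Psi\rangle$ if and only if $|\Psi\rangle \in L^2_N$.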
Thus, $\psi \in V$ means
$$\psi(x) = \sum_{\mu=1}^{L} \psi_\mu \phi_\mu(x).$$
Having selected the finite basis, we obtain for different $N$ a Slater determinant basis, spanning $V_N \subset L^2(X^N)_{AS}$.
Clearly, as we have only $L$ single-particle functions available, we cannot create more than $L$ particles from vacuum without getting at least one repeated creation operator, i.e., we must have $L \geq N$ to have nonzero dimension. The general dimension is $\dim(V_N) = \binom{L}{N}$.
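The dimension count can be checked by enumerating the ordered index sets explicitly. This is a small sketch for illustration; the function name and the labelling of orbitals as $1, \ldots, L$ are assumptions, not notation from the notes.

```python
from itertools import combinations
from math import comb


def determinant_basis(L: int, N: int):
    """All ordered index sets mu_1 < ... < mu_N drawn from orbitals 1..L."""
    return list(combinations(range(1, L + 1), N))


def fock_dimension(L: int) -> int:
    """Total number of basis determinants over all particle numbers N = 0..L."""
    return sum(len(determinant_basis(L, N)) for N in range(L + 1))
```

With `L = 4` and `N = 2` the enumeration gives the six index sets `(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)`, matching `comb(4, 2)`, and for `N > L` the list is empty. Summing over all `N` gives $2^L$, the number of bit patterns of length $L$.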
In computational settings, the truncation of the infinite basis into a finite one is almost universally done. Of course, we can only numerically diagonalize a finite matrix! But we would still like the basis to be as large as possible to achieve the greatest accuracy. At least intuitively, we expect that as we include more and more single-particle functions, the numerical results will approach the exact result. Under mild assumptions on the basis set and the Hamiltonian under consideration, this is in fact true.
Sometimes, the finite truncation is done after a detailed consideration of the physics of the system. This can give considerable physical insight, giving great explanatory power to the second quantized picture. As an example, take the physical explanation of the principles of a laser. (See for example https://en.wikipedia.org/wiki/Population_inversion.) Another example is the Hubbard model from solid-state physics, see for example https://en.wikipedia.org/wiki/Hubbard_model.
Exercise .. [Note: This exercise has been updated since it was given as a weekly exercise.]
Let ϕ µ , µ = , , ⋯, be given orthonormal single-particle functions.
a) Using the $|\mu_1, \cdots, \mu_N\rangle$ notation, write down a basis for the finite dimensional subspace of $L^2(X^N)_{AS}$ for N = , N = and N = , that you can construct using the given single-particle functions. (Make sure you include only linearly independent Slater determinants.)
b) Can you construct a Slater determinant for N = particles using the given $\phi_\mu$?
c) Using the occupation number notation ∣n n ⋯n ⟩, write down a basis for the same spaces as in exercise a).
d) What is the dimension of the subspace of Fock space you can create with the single-particle functions?
e) Assume that you have $L$ orbitals instead of just . What is the dimension of the $N$-particle spaces you can build? What is the dimension of the Fock space you can build?
Exercise . (Note: This exercise has been updated since it was given as a weekly exercise.). Consider the
following picture:
[Figure: four horizontal lines, one per single-particle function $\phi_\mu$, with a circle drawn on one of them.]
We have four horizontal lines, each representing a single-particle function $\phi_\mu$. The circle represents an occupied single-particle function, i.e., the Slater determinant ∣⟩.
a) In a similar fashion as the above picture, draw pictures of all the distinct Slater determinants you can create using the four single-particle functions. Make sure you consider all possible particle numbers. Caption each picture with the corresponding $|\mu_1\mu_2\cdots\mu_N\rangle$.
We now consider electrons. Consider spatial orbitals $\varphi_p(\vec r)$, i.e., spin-orbitals $\phi_\mu(\vec r, \sigma)$. The corresponding diagram for the Slater determinant ∣ ↑, ↓⟩ is:
[Figure: four horizontal lines, one per orbital $\varphi_p$, with the bottom line holding one spin-up (↑) and one spin-down (↓) electron.]
Each level now can hold two electrons, spin up and spin down.
b) Draw all possible -electron Slater determinants. Mark those that have total spin projection .
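c) Consider the one-body operator given by
$$\hat H_0 = \sum_p \epsilon_p \left( c^\dagger_{p\uparrow} c_{p\uparrow} + c^\dagger_{p\downarrow} c_{p\downarrow} \right).$$
$$\hat H_0 = \sum_{i=1}^{N} \hat h(i).$$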
Write down the matrix of the (single-electron) operator ĥ in the spin-orbital basis {ϕ pσ } and find its
eigenfunctions. Interpret the spin-orbital diagram in terms of your results. Find the N = ground
state of Ĥ , and draw a picture of it.
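Note that the last expression does not contain $N$ explicitly. Here, note that $|\mu\rangle$ is a single-particle function; it is the “Slater determinant” $\phi_\mu(x_1)$. The number $\langle\mu|\hat h|\nu\rangle$ is the matrix element of the single-particle operator $\hat h$ in the given one-particle basis,
$$\langle\mu|\hat h|\nu\rangle = \int_X \phi_\mu(x)^* \,[\hat h \phi_\nu](x)\, dx.$$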
Eq. (??) gives a nice image of how Ĥ acts on a basis function: each term in the sum manipulates the Slater
determinant’s occupied orbitals and weighs it with a matrix element. Simple, and not at all obvious from
the “single quantized form”.
We shall also prove the following formula for the two-body operator:
N
αβ
† †
Ŵ = ∑ ŵ(i, j) = ∑ w µν c µ c ν c β c α , (.)
(i , j) µνα β
is a matrix element using tensor product two-body functions, not Slater determinants. Using Slater deter-
minant matrix elements we in fact have a similar expansion,
† †
Ŵ = ∑ ⟨µν∣ŵ∣αβ⟩ c µ c ν c β c α , (.)
µνα β
where thus the matrix elements are antisymmetric, computed as a matrix element using two-body Slater
determinants.
A word of warning: notation for two-body matrix elements varies notoriously between sources.
αβ
Some authors use the notation ⟨ϕ α ϕ β ∣ŵ∣ϕ µ ϕ ν ⟩ for the matrix element w µν , which is not antisymmetric.
In our case, the notation clashes with the Slater determinant matrix element, but we will still sin in this
respect. Some authors write ⟨ϕ α ϕ β ∣ŵ∣ϕ µ ϕ ν ⟩AS for the anti-symmetric Slater-determinant matrix element
(and sometimes we will too), which is equal to:
αβ αβ
⟨ϕ α ϕ β ∣ŵ∣ϕ µ ϕ ν ⟩AS = ⟨αβ∣ŵ∣µν⟩ = w µν − w ν µ . (.)
This can cause some confusion, as the expansions using tensor products and Slater determinants differ by
a factor . . .
The proofs given in this section borrow heavily from [].
Lemma .. Let ∣µ µ ⋯µ N ⟩ be a Slater determinant built from orthonormal single-particle functions ϕ µ , no
particular ordering assumed. The operator c †ν c α replaces ϕ α with ϕ ν (or gives zero if α is not present in µ⃗),
with no sign change.
Similarly, c †ν c †ν c α c α replaces α with ν , and α with ν , or gives zero if one of α or α is not present in µ⃗.
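The "no sign change" claim of the lemma can be checked mechanically. The following is a minimal Python sketch under stated assumptions: a determinant is stored as a pair `(sign, occ)` with `occ` a tuple of orbital indices in arbitrary order, and a creation operator places its index at the front of the determinant. The helper names `perm_sign`, `create`, `annihilate`, `canonical` and `replace` are hypothetical and chosen only for this illustration.

```python
def perm_sign(seq):
    """Sign of the permutation that sorts seq (entries assumed distinct)."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign

def create(nu, state):
    """Apply a creation operator: put nu at the front, or give zero if occupied."""
    sign, occ = state
    if sign == 0 or nu in occ:
        return (0, ())
    return (sign, (nu,) + occ)

def annihilate(alpha, state):
    """Apply an annihilation operator: remove alpha with sign (-1)^position."""
    sign, occ = state
    if sign == 0 or alpha not in occ:
        return (0, ())
    k = occ.index(alpha)
    return (sign * (-1) ** k, occ[:k] + occ[k + 1:])

def canonical(state):
    """Bring a determinant to sorted order, tracking the permutation sign."""
    sign, occ = state
    if sign == 0:
        return (0, ())
    return (sign * perm_sign(occ), tuple(sorted(occ)))

def replace(alpha, nu, occ):
    """The determinant with alpha replaced by nu, in the same position."""
    return tuple(nu if m == alpha else m for m in occ)

occ = (5, 2, 7, 3)                      # no particular ordering assumed
lhs = canonical(create(4, annihilate(7, (1, occ))))
rhs = canonical((1, replace(7, 4, occ)))
print(lhs == rhs)                       # the two determinants agree, sign included
```

The annihilation sign (−1)^k from moving α to the front is cancelled exactly by the k transpositions needed to move ν from the front back into position k, which is the content of the lemma.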
Ĥ ∣ϕ µ , ϕ µ , ⋯, ϕ µ N ⟩ = √ (∑ ĥ(i)) ∑(−)∣σ∣ P̂σ ϕ ν (x )⋯ϕ ν N (x N )
N! i σ
(.)
= √ ∑(−)∣σ∣ P̂σ (∑ ĥ(i)) ϕ ν (x )⋯ϕ ν N (x N )
N! σ i
Here, we used that P̂σ commutes with Ĥ .
Consider the operator ĥ acting on a single-particle function ϕ µ . The result, ψ, can be expanded in the
basis:
ψ(x) = ĥϕ µ (x) = ∑ ϕ ν (x) ⟨ν∣ĥ∣µ⟩ . (.)
ν
Finally, we note that c µ ∣µ ⋯µ N ⟩ = whenever µ ∉ µ⃗, so we may extend the summation over µ j to all of µ,
resulting in:
where the matrix elements w µν νµ are given by the formula (.). There is nothing special about the indices
(, ), it may just as well be (i, j). Note also the symmetry property
w µν νµ = w µν νµ .
As for the one-body case, Ŵ commutes with P̂σ , and we get, using Eq. (.),
⎡ ⎤
⎢ ⎥
Ŵ ∣ϕ µ ⋯ϕ µ N ⟩ = √ ∑(−)∣σ∣ P̂σ ⎢⎢∑ ŵ(i, j)ϕ µ (x )⋯ϕ µ N (x N )⎥⎥
N! σ ⎢ i< j ⎥
⎣ ⎦
⎡ ⎤
⎢ ⎥
= √ ∑(−)∣σ∣ P̂σ ⎢⎢∑ ŵ(i, j)ϕ µ (x )⋯ϕ µ N (x N )⎥⎥
N! σ ⎢ i< j ⎥
⎣ ⎦
⎡ ⎤ (.)
⎢ ⎥
∣σ∣ ⎢
= √ ∑(−) P̂σ ⎢∑ ∑ w µ i µ j ϕ µ ⋯ϕ ν (x i )⋯ϕ ν (x j )⋯ϕ µ N (x N )⎥⎥
ν ν
N! σ ⎢ i< j ν ν ⎥
⎣ ⎦
ν ν
= ∑ ∑ w µ i µ j ∣ϕ µ ⋯ϕ ν ⋯ϕ ν ⋯ϕ µ N ⟩
i< j ν ν
= ∑ ∑ w µν i νµj c †ν c †ν c µ j c µ i ∣ϕ µ ⋯ϕ µ N ⟩ .
i< j ν ν
Here, we used Lemma . about the replacement behaviour of the c † c † cc product. We are currently summing
over µ i and µ j such that i < j. Including i = j gives zero contribution (why?), and including j < i gives
an equal contribution:
ν ν † † ν ν † †
∑ ∑ w µi µj c ν c ν c µ j c µ i ∣ϕ µ ⋯ϕ µ N ⟩ = − ∑ ∑ w µi µj c ν c ν c µ i c µ j ∣ϕ µ ⋯ϕ µ N ⟩
i< j ν ν i< j ν ν
= ∑ ∑ w µν j νµi c †ν c †ν c µ j c µ i ∣ϕ µ ⋯ϕ µ N ⟩ = ∑ ∑ w µν i νµj c †ν c †ν c µ j c µ i ∣ϕ µ ⋯ϕ µ N ⟩
j<i ν ν j<i ν ν
Here, we used the anticommutators and symmetry of the matrix elements. Assembling the two contribu-
tions,
Ŵ ∣ϕ µ ⋯ϕ µ N ⟩ = ∑ ∑ w µν i νµj c †ν c †ν c µ j c µ i ∣ϕ µ ⋯ϕ µ N ⟩ . (.)
i j ν ν
We note that the sum over i j is really a sum over two occupied orbitals µ i and µ j . We can therefore extend
the sum to all unoccupied orbitals as well, since c α ∣ µ⃗⟩ gives zero contributions for such orbitals. Thus,
Eq. (.) is proven.
We leave the proof of the antisymmetrized version as an exercise.
Exercise .. Prove Eq. (.). Start with showing Eq. (.). △
Exercise .. a) Let F̂ = ∑ Ni= fˆ(i) be a first-quantization operator. Write down the second-quantized
form of this operator. Let Ĝ = ∑ i< j ĝ(i, j) be a general two-body operator, where ĝ(, ) = ĝ(, ). Write
down the second-quantized form.
b) Using the fundamental anticommutator relations, compute the matrix element
⟨µ µ ∣F̂∣µ µ ⟩
⟨µ µ µ ∣F̂∣µ µ µ ⟩
d) Using the fundamental anticommutator relations, compute the matrix element
⟨µ µ ∣Ĝ∣µ µ ⟩
⟨µ µ µ ∣Ĝ∣µ µ µ ⟩
⟨µ , µ , ⋯, µ N ∣F̂∣µ , µ , ⋯, µ N ⟩
⟨µ , µ , ⋯, µ N ∣Ĝ∣µ , µ , ⋯, µ N ⟩
Exercise .. (Tedious, but very instructive.) In this exercise, we prove the so-called Slater–Condon rules:
the explicit expressions of matrix elements of one- and two-body operators in a Slater determinant basis.
We do not assume any particular ordering of the occupied single-particle functions considered.
If you solved Exercise ., you solved parts of this exercise.
a) Using the fundamental anticommutator relations, compute ⟨ µ⃗∣Ĥ ∣ µ⃗⟩ and ⟨ µ⃗∣Ŵ∣ µ⃗⟩ and prove that
N
µ
⟨ µ⃗∣Ĥ ∣ µ⃗⟩ = ∑ h µ ii , (.)
i=
N
⟨ µ⃗∣Ŵ∣ µ⃗⟩ = ∑ ⟨µ i µ j ∣ŵ∣µ i µ j ⟩AS = ∑ ⟨µ i µ j ∣ŵ∣µ i µ j ⟩AS . (.)
i< j ij
ν⟩ = c †ν j c µ j ∣ µ⃗⟩ ,
∣⃗ νj ≠ µj. (.)
Using the fundamental anticommutator relations, compute ⟨ µ⃗∣Ĥ ∣ µ⃗⟩ and ⟨ µ⃗∣Ŵ∣⃗
ν ⟩, and find
µ
⟨ µ⃗∣Ĥ ∣⃗
ν⟩ = h ν jj , (.)
⟨ µ⃗∣Ŵ∣⃗
ν⟩ = ∑ ⟨µ i µ j ∣ŵ∣µ i ν j ⟩AS . (.)
i
ν⟩ = c †ν k c †ν j c µ j c µ j ∣ µ⃗⟩ ,
∣⃗ j ≠ k. (.)
⟨ µ⃗∣Ĥ ∣⃗
ν⟩ = , (.)
⟨ µ⃗∣Ŵ∣⃗
ν⟩ = ⟨µ j µ k ∣ŵ∣ν j ν k ⟩AS . (.)
d) Explain that if ν⃗ differs from µ⃗ in more than two occupied functions, then ⟨ µ⃗∣Ŵ∣⃗
ν⟩ = .
△
Figure .: Schematic plot of the possible single-particle levels with double degeneracy. The filled circles
indicate occupied particle states. The spacing between each level p is constant in this picture. We show
some possible two-particle states.
Ĥ = Ĥ + Ĥ I ,
and that the onebody part of the Hamiltonian with single-particle operator ĥ has the property
ĥ ψ pσ = p × dψ pσ ,
where we have added a spin quantum number σ. We assume also that the only two-particle states
that can exist are those where two particles are in the same state p, as shown by the two possibilities
to the left in the figure. The two-particle matrix elements of Ĥ I all have a constant value, −g. Show
then that the Hamiltonian matrix can be written as
d − g −g
( ),
−g d − g
and find the eigenvalues and eigenvectors. What is the mixing of the state with two particles in p = into
the wave function with two particles in p = ? Discuss your results in terms of a linear combination
of Slater determinants.
c) Add the possibility that the two particles can be in the state with p = as well and find the Hamil-
tonian matrix, the eigenvalues and the eigenvectors. We still insist that we only have two-particle
states composed of two particles being in the same level p. You can diagonalize your
× matrix numerically.
This simple model catches several birds with a stone. It demonstrates how we can build linear com-
binations of Slater determinants and interpret these as different admixtures to a given state. It repre-
sents also the way we are going to interpret these contributions. The two-particle states above p =
will be interpreted as excitations from the ground state configuration, p = here. The reliability of
this ansatz for the ground state, with two particles in p = , depends on the strength of the interac-
tion g and the single-particle spacing d. Finally, this model is a simple schematic ansatz for studies
of pairing correlations and thereby superfluidity/superconductivity in fermionic systems.
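As a rough illustration of how the two-level version of this model can be explored, here is a small Python sketch. It rests on assumptions that are not recoverable from the printed matrix: the two levels are taken to be p = 1 and p = 2, so that a pair in level p contributes 2pd to the diagonal, giving the matrix [[2d − g, −g], [−g, 4d − g]]. The function name `pairing_two_level` is hypothetical, and d > 0 is assumed.

```python
import math

def pairing_two_level(d, g):
    """Eigenpairs of the ASSUMED 2x2 pairing matrix [[2d-g, -g], [-g, 4d-g]]."""
    a, c, b = 2.0 * d - g, 4.0 * d - g, -g
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    energies = (mean - radius, mean + radius)
    vectors = []
    for e in energies:
        # (a - e) v1 + b v2 = 0 is solved by v = (b, e - a) when b != 0.
        if b != 0.0:
            v = (b, e - a)
        else:
            v = (1.0, 0.0) if e == a else (0.0, 1.0)
        norm = math.hypot(v[0], v[1])
        v = (v[0] / norm, v[1] / norm)
        if v[0] < 0 or (v[0] == 0 and v[1] < 0):
            v = (-v[0], -v[1])          # fix the overall sign convention
        vectors.append(v)
    return energies, vectors

energies, vectors = pairing_two_level(d=1.0, g=1.0)
ground = vectors[0]
# Weight of the upper pair state in the ground state: the "mixing".
print(energies[0], ground[1] ** 2)
```

Under these assumptions the eigenvalues are 3d − g ± sqrt(d² + g²), so the admixture of the upper pair state grows with the ratio g/d and vanishes for g = 0.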
△
Inside the Fock space, every possible wavefunction of a system of some number of fermions exists.
Given a wavefunction ∣ΨN ⟩ ∈ F, it could be expanded in Slater determinants,
∼
∣ΨN ⟩ = ∑ ∣µ ⋯µ N ⟩ ⟨µ ⋯µ N ∣ΨN ⟩ . (.)
µ⃗
Here, the subscript N only indicates that we know the number of particles. The notation ∣µ ⋯µ N ⟩ indicates
that a certain single-particle basis {ϕ µ } has been chosen, since we only list the indices µ j . Each determinant
can be constructed from vacuum using creation operators c †µ (these, of course, depend on the basis),
∣µ ⋯µ N ⟩ = c †µ c †µ ⋯c †µ N ∣−⟩ . (.)
Finally, we found expressions for one- and two-body operators in terms of creation and annihilation
operators:
Ĥ = ∑ ⟨µ∣ĥ∣ν⟩ c †µ c ν (.)
µν
† †
Ŵ = ∑ ⟨µ µ ∣ŵ∣ν ν ⟩ c µ c µ c ν c ν . (.)
ν ν
µ µ
This expression equates two elements (functions) in Hilbert space. These functions are equal if and only if
their basis projections are equal. Thus, we expand ∣ΨN ⟩ in the basis, and similarly with the left-hand side
Ĥ ∣ΨN ⟩:
∼
∑ ⟨ν ⋯ν N ∣ (Ĥ + Ŵ) ∣µ ⋯µ N ⟩ ⟨µ ⋯µ N ∣ΨN ⟩ = E ⟨ν ⋯ν N ∣ΨN ⟩ . (.)
µ⃗
and similarly
† † † † †
Wν⃗, µ⃗ = ⟨ µ⃗∣Ŵ∣ µ⃗⟩ = ∑ ⟨α α ∣ŵ∣β β ⟩ ⟨−∣ c ν N c ν N− ⋯c ν c α c α c β c β c µ c µ ⋯c µ N ∣−⟩ (.)
α α
β β
c α c †β + c †β c α = δ α β (.)
cα cβ + cβ cα = (.)
and
c †α c †β + c †β c †α = . (.)
(Footnote: On a computer, we need to truncate the basis to obtain a finite-dimensional matrix eigenvalue problem. Only for very small
problems will one actually compute the matrix itself, because that is quite expensive. Rather, one computes the matrix-vector product
Ĥ ∣µ⃗⟩.)
So, we can “flip” two creation or annihilation operators adjacent to each other and compensate with a −
sign. We can “flip” an annihilation and creation operator by a − sign, but we have to “pay a price” in the
form of a Kronecker delta, an additional term. However, this additional term has two fewer creation and
annihilation operators.
In this way, we can systematically move the annihilation operators to the right, and the creation operators
to the left, possibly inserting Kronecker deltas and generating new terms with fewer operators. But
when the annihilation operators are to the right they give zero contribution since c α ∣−⟩ = 0.
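The procedure just described can be written as a short recursion. The Python sketch below is only an illustration: operators are encoded as pairs `("-", index)` for annihilation and `("+", index)` for creation, and the function name `vev` is a hypothetical choice. At each step the rightmost annihilation operator is flipped past the creation operator to its right, at the price of a Kronecker delta term with two fewer operators.

```python
def vev(ops):
    """Vacuum expectation value of an operator string via anticommutators.

    ops is a sequence of ("-", index) annihilators and ("+", index) creators,
    written left to right as in the operator product.
    """
    ops = tuple(ops)
    if not ops:
        return 1                      # <-|-> = 1
    if ops[-1][0] == "-":
        return 0                      # an annihilator acts on the vacuum
    if ops[0][0] == "+":
        return 0                      # a creator acts to the left on <-|
    # The rightmost annihilator sits directly to the left of a creator.
    k = max(i for i, op in enumerate(ops) if op[0] == "-")
    a, b = ops[k][1], ops[k + 1][1]
    flipped = ops[:k] + (ops[k + 1], ops[k]) + ops[k + 2:]
    value = -vev(flipped)             # flip, compensated by a minus sign
    if a == b:
        value += vev(ops[:k] + ops[k + 2:])   # the Kronecker delta term
    return value

# <-| c_1 c_2 c^+_1 c^+_2 |-> and <-| c_1 c_2 c^+_2 c^+_1 |->
print(vev([("-", 1), ("-", 2), ("+", 1), ("+", 2)]))
print(vev([("-", 1), ("-", 2), ("+", 2), ("+", 1)]))
```

The recursion terminates because every flip moves an annihilation operator one step to the right and every delta term shortens the string. The number of generated terms grows quickly with the string length, which is the practical motivation for a faster rule.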
Let us see this in practice, and first remove one pair of creation and annihilation operators:
A = ⟨−∣ c ν c ν c †α c β c †µ c †µ ∣−⟩ = ⟨−∣ c ν c ν c †α (δ β µ − c †µ c β )c †µ ∣−⟩
= δ β µ ⟨−∣ c ν c ν c †α c †µ ∣−⟩ − ⟨−∣ c ν c ν c †α c †µ c β c †µ ∣−⟩
(.)
= δ β µ ⟨−∣ c ν c ν c †α c †µ ∣−⟩ − ⟨−∣ c ν c ν c †α c †µ (δ β µ − c †µ c β ) ∣−⟩
= δ β µ ⟨−∣ c ν c ν c †α c †µ ∣−⟩ − δ β µ ⟨−∣ c ν c ν c †α c †µ ∣−⟩ .
We continue:
A = δ β µ ⟨−∣ c ν (δ ν α − c †α c ν )c †µ ∣−⟩ − δ β µ ⟨−∣ c ν (δ ν α − c †α c ν )c †µ ∣−⟩
(.)
= δ β µ δ ν α ⟨−∣ c ν c †µ ∣−⟩ − δ β µ ⟨−∣ c ν c †α c ν c †µ ∣−⟩ − (µ ↔ µ ) .
In the last equality, we have indicated that the remaining terms are generated from the previous ones by
exchanging µ and µ .
Continuing,
Only the two first terms are non-vanishing, and we note, for example, that ⟨−∣ c ν c †µ ∣−⟩ = ⟨ν ∣µ ⟩ = δ ν µ .
(We could also use the anticommutator once more.) This gives:
A = δ β µ δ ν α δ ν µ − δ β µ δ ν µ δ ν α − (µ ↔ µ ). (.)
Yes, our life was made easier by introducing second quantization. However, the matrix elements are still
quite hard to compute. This is where Wick’s theorem comes in, by giving a much quicker way of determining
the vacuum expectation values.
Observe that the vacuum expectation value is basis independent. The value only depends on the anti-
commutator relations, and these only depended on the orthonormality of {ϕ µ }. So we see that the frame-
work is quite general.
The right-hand side is a linear combination of vacuum expectation values. So we see that having a straight-
forward way to compute Eq. (.) would be of great help.
Wick’s Theorem is what we shall need.
.. Normal ordering and contractions
In this section, we denote a general string of n creation and annihilation operators by
A A ⋯A n , A i ∈ {c µ } ∪ {c †µ }. (.)
Our goal is to find a general procedure of computing the vacuum expectation value
Note that this expectation value only depends on the orthonormality of the single-particle functions, not on
the functions themselves. I.e., the vacuum expectation value can be computed solely from the
anticommutator relations (.).
The procedure we develop is based on Wick’s Theorem, to be stated and proven. Wick’s theorem is based
on two fundamental concepts, namely normal ordering and contraction. The normal-ordered product form
of an operator string A A ⋯A n is defined as
Here, σ ∈ S n denotes a permutation that brings the operator product to the desired order,
Note that the string A ⋯A n and the normal-ordered product N(A ⋯A n ) are not the same operator, since
by reordering creation and annihilation operators we neglect the extra terms arising from the Kronecker
delta in the anticommutator relation {c α , c †β } = δ α β . If all individual A i in fact anticommute, then the
string and the normal-ordered string are identical as operators, but usually this is not the case.
Remark: The permutation σ in the definition is usually not unique, but the normal-ordered product
is unique as an operator. Consider for example
There are × possible arrangements of the creation and annihilation operators that conform to the defini-
tion of the normal-ordered product. But creation and annihilation operators anticommute among them-
selves. The permutation sign (−)∣σ∣ in Eq. (.) automatically compensates for this. Thus,
N(c c † c † c ) = c † c † c c = −c † c † c c = c † c † c c = −c † c † c c . (.)
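A small Python sketch may make the sign bookkeeping concrete. It is an illustration under assumptions, not the notes' own algorithm: operators are encoded as `("+", index)` for creation and `("-", index)` for annihilation, the function name `normal_order` is hypothetical, and the particular permutation used is the one obtained by adjacent flips that keep the relative order within the creation operators and within the annihilation operators.

```python
def normal_order(ops):
    """Return (sign, ordered) with all creators moved to the left of all annihilators.

    Each adjacent flip of an annihilator past a creator contributes a factor -1;
    the Kronecker delta terms are deliberately dropped, as in the definition of N().
    """
    ops = list(ops)
    sign = 1
    swapped = True
    while swapped:
        swapped = False
        for k in range(len(ops) - 1):
            if ops[k][0] == "-" and ops[k + 1][0] == "+":
                ops[k], ops[k + 1] = ops[k + 1], ops[k]
                sign = -sign
                swapped = True
    return sign, ops

# N(c_1 c^+_2 c^+_3 c_4): two flips are needed, so the sign is +1.
print(normal_order([("-", 1), ("+", 2), ("+", 3), ("-", 4)]))
# N(c_1 c^+_2): one flip, so the sign is -1.
print(normal_order([("-", 1), ("+", 2)]))
```

Any other admissible permutation differs from this one by flips among creation operators or among annihilation operators, each of which changes both the permutation sign and the operator product by −1, so the resulting operator is the same.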
We define the normal order product of linear combinations in the obvious way:
Mathematical aside for the interested reader: N(⋅) is now defined as a linear operator on the space of
linear combinations of operator strings. The second-quantized formulas for Ĥ , Ŵ, etc., are examples of
such objects. This space of operators is an example of a C ∗ -algebra with unity. An algebra is a vector space
where multiplication is also defined (the product of two operators is an operator), and roughly speaking
the ∗ means that we can form Hermitian adjoints. The operator algebra is said to be generated by the c µ
operators and the unit operator.
A contraction between two arbitrary creation and annihilation operators X and Y is a special notation
for ⟨−∣XY∣−⟩,
XY ≡ ⟨−∣XY∣−⟩. (.)
Thus, the contraction is a number. One can easily show (see exercise .), that
XY = XY − N(XY). (.)
(This definition also allows the other operators to be permuted among themselves. This is of course per-
fectly acceptable – the sign of σ accounts for this.)
Thus, the sign factor (−)x+y+ equals the sign of the permutation σ that brings the two operators to
the front. Equivalently, we can count the number c of anticommutations needed, and the sign becomes
(−)c .
Examples:
N(c c † c c † ) = c c † N(c c † ) = −δ c † c (.a)
The definition is recursive: each pair of contracted operators is processed in turn according to Eq. (.).
This definition is independent of the order of the processing of the pairs. An example:
Exercise .. Show Eq. (.) from Eq. (.), by considering the possible cases. △
Theorem . (Wick’s Theorem). Let A ⋯A n be an operator string of creation and annihilation operators.
Then,
\[ A_1 A_2 \cdots A_n = N(A_1 A_2 \cdots A_n) + \sum_{(1)} N(A_1 \cdots A_n \text{ with one contraction}) + \sum_{(2)} N(A_1 \cdots A_n \text{ with two contractions}) + \cdots + \sum_{(\lfloor n/2 \rfloor)} N(A_1 \cdots A_n \text{ with } \lfloor n/2 \rfloor \text{ contractions}). \] (.)
The notation ∑(m) signifies that we sum over all combinations of m contractions.
When n is even, the last sum signifies that we sum over n/2 contractions, i.e., all operators are contracted.
If n is odd, there is one uncontracted operator left in each term of the last sum.
.. Vacuum expectation values using Wick’s Theorem
Before we start with the proof of Wick’s Theorem, we apply it to the evaluation of vacuum expectation
values. For any string with at least one factor,
This is so, because in the normal-order product, the annihilation operators are to the right, and the creation
operators are on the left. For odd n, therefore, Wick’s Theorem gives
For even n,
where we for brevity omit N(⋯) since there are no operators left anyway. (Note carefully, that this is abuse
of notation!)
The only non-vanishing contractions are
c α c †β = δ α β . (.)
This reduces the number of contractions we need to consider when evaluating the sum. Moreover, if
A ⋯A n contains a different number of creation and annihilation operators, at least one contraction of
the form c α c β or c †α c †β must be present in Eq. (.), in every term, giving a zero expectation value at once.
Finally, one can show that the sign of a fully contracted operator product is (−) k , where k is the number
of contraction line crossings. We will not prove this.
Clearly, Wick’s theorem provides us with an algebraic method for easy determination of the terms that
contribute to the matrix element.
We conclude with a recipe:
Theorem . (Vacuum expectation values using Wick’s Theorem). Let A ⋯A n be a string of creation and
annihilation operators.
If n is odd ⟨−∣A ⋯A n ∣−⟩ = .
Assume n is even. If A ⋯A n contains a different number of creation operators compared to annihilation
operators, ⟨−∣A ⋯A n ∣−⟩ = .
Finally,
where the sum runs over all possible combinations of n/ contractions on the form
c α c †β
The sign of each term in the sum is (−) k , where k is the number of crossings of contraction lines.
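The recipe can be turned into a few lines of Python. This is a sketch under assumptions rather than a definitive implementation: operators are encoded as `("-", index)` for annihilation and `("+", index)` for creation, and the names `pairings`, `crossings` and `wick_vev` are hypothetical. The sign rule with crossing contraction lines is taken from the statement above, not derived.

```python
def pairings(idx):
    """Yield every way of grouping the positions in idx into pairs (x, y) with x < y."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def crossings(pairs):
    """Number of pairs of contraction lines that cross each other."""
    count = 0
    for i, (a, b) in enumerate(pairs):
        for (c, d) in pairs[i + 1:]:
            if a < c < b < d or c < a < d < b:
                count += 1
    return count

def wick_vev(ops):
    """Vacuum expectation value as a signed sum over full contractions."""
    n = len(ops)
    if n % 2:
        return 0                      # odd strings vanish
    total = 0
    for pairs in pairings(list(range(n))):
        # Only annihilator-left-of-creator pairs with equal index survive.
        ok = all(ops[x][0] == "-" and ops[y][0] == "+" and ops[x][1] == ops[y][1]
                 for x, y in pairs)
        if ok:
            total += (-1) ** crossings(pairs)
    return total

print(wick_vev([("-", 1), ("-", 2), ("+", 1), ("+", 2)]))
print(wick_vev([("-", 1), ("-", 2), ("+", 2), ("+", 1)]))
```

In the first string the only surviving contraction pattern has two crossing lines, in the second the lines are nested and do not cross, which accounts for the opposite signs.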
Exercise .. (Hard.) Prove the sign rule for the fully contracted terms. (More details will be filled in for
this exercise later in the course. Stay tuned.) △
Exercise .. Write out the statement of Wick’s Theorem for the following operator strings, and simplify
where you can:
. c β c †α
. c †α c β c γ† c δ
. c γ c †µ c †ν c α c β c †δ
△
Lemma .. Let A r , r = , ⋯, n be creation and annihilation operators. Let B be a creation or annihilation
operator. Then,
n
N(A A ⋯A n )B = ∑ N(A A ⋯A r ⋯A n B) + N(A ⋯A n B). (.)
r=
Proof. Assume first that B is an annihilation operator. Then all the contractions on the right-hand side
vanish. Also, N(A ⋯A n )B = N(A ⋯A n B).
Assume next that B is a creation operator, and that all the A i are annihilation operators. In that case,
we can verify that the left- and right-hand sides are equal. The left-hand side is equal to A ⋯A n B since
A ⋯A n is already a normal-ordered product. We compute N(A ⋯A n B) = (−)n BA ⋯A n . Looking at the
left-hand side again,
This proves the case for all A i annihilation operators, and it remains to prove it when we have creation
operators in the mix.
Multiply Eq. (.) from the left by a creation operator A . We observe that normal order is preserved
on the left hand side since A is a creation operator and can stand to the far left,
A N(A ⋯A n ) = N(A ⋯A N ),
and similarly A N(A ⋯A n )B = N(A ⋯A n )B. Also,
n n n
A ∑ ∑ N(A A ⋯A r ⋯A n B) = ∑ N(A A A ⋯A r ⋯A n B),
r= r= r=
since A B = . Thus, the statement of the lemma is true also when A is a creation operator. Clearly, we
can continue, and add as many creation operators we like. Thus, the lemma is true for strings of the form
C ⋯C k A k+ ⋯A n , where C i are creation operators, and A i are annihilation operators. By permuting this
string, we gain a sign change on all terms, and the terms in the sum over r are reordered, but leaving the
sum invariant. Thus, the lemma is proved for arbitrary strings A ⋯A n .
We introduce another lemma, which generalizes Lemma . to the case where we have an arbitrary
number m contractions between the n operators inside the normal order operator.
Lemma .. Suppose A ⋯A n is a given operator string, and suppose we choose m pairs p i = {x i , y i } to
contract from this string, with x i < y i . Let S m = {, , ⋯, N} ∖ (∪ i p i ) be the remaining indices when all pairs
are removed. Let B a creation or annihilation operator. Then,
where the notation indicates that all m pairs are contracted from the A i s.
Proof. let S = {, , ⋯, N}. The pairs are distinct, which we write mathematically as p i ⊂ S ∖ (∪ i− j= p j ).
Consider the normal-ordered product of A ⋯A n with the m given contractions, see the left-hand side
of Eq. (.). We perform the pairwise “operator flips” that brings first p to the front, then p , etc. The first
pair gives a sign (−) f , for f flips. The next pair gives a sign (−) f , and so on. (Importantly, f i depend
on the order in which we do the “contraction extractions”.) We arrive at
Consider the first term inside the bracket. We can move the contractions inside again, p m passing the same
operators as when extracted, then p m− , etc, giving an overall sign change that cancels (−) f +⋯+ f m . This
reproduces the first term on the left-hand side of Eq. (.).
The same is actually true for the second term. Even if we pass a contracted A r , the “operator flips”
count towards the sign, by the definition of N() with contractions.
This completes the proof.
We now prove Wick’s Theorem. Assume now that Pn is true. Multiply Eq. (.) from the right by an
operator A n+ :
+ ∑ N (A A A A ⋯A n ) A n+ (.)
()
Each sum is a sum over m contractions, including the first where we have m = . We now use Lemma .
and write
∶= X m + I m .
(.)
Here, X m contain all possible m contractions excluding A n+ , while I m contains all possible m + contrac-
tions including A n+ . We now get
Note that I⌊n/⌋ = , since there is no operator left to contract A n+ with after ⌊n/⌋ operators have been
contracted.
Write
A ⋯A n+ = X + (I + X ) + (I + X ) + ⋯X⌊n/⌋ . (.)
and note that (I m + X m+ ) is the sum over all possible m + contractions of the string A ⋯A n+ . Thus,
Wick’s Theorem is proved.
Exercise .. (Slater–Condon rules revisited)
a) Let µ⃗ = (µ ⋯µ N ), with N ≥ . Compute the matrix elements ⟨ µ⃗∣Ĥ ∣ µ⃗⟩ and ⟨ µ⃗∣Ŵ∣ µ⃗⟩ using Wick’s
theorem applied to vacuum expectation values. Do you notice a pattern in which contractions
contribute, other than the rules listed in the main text?
b) Repeat Exercise . using Wick’s Theorem instead of the anticommutator relations to prove the
Slater–Condon rules. (Wick’s Theorem gives a much less tedious approach.)
△
b i = c †i , ba = ca (.)
{b µ , b †ν } = δ µ,ν , {b µ , b ν } = . (.)
∣Φ⟩ =
Note that
b µ ∣Φ⟩ = , ∀µ. (.)
Therefore, ∣Φ⟩ has the role of a vacuum for the new operators. It contains zero quasiparticles, since at-
tempting to remove one gives us zero.
Let us create a quasiparticle:
b †i ∣Φ⟩ =
i
b †a ∣Φ⟩ =
a
Note that b†i ∣Φ⟩ contains N − “real” particles, while b †a ∣Φ⟩ contains N + “real” particles.
The quasiparticles with µ = i ≤ N are called “holes”, while the quasiparticles with µ = a > N are called
“particles”.
Creating a particle-hole pair results in a state with N “real” particles, since b †a b†i = c †a c i preserves N
when acting on a state. Acting on the reference, we get N − occupied single-particle functions below N,
and occupied single-particle function above N, in pictures,
b†a b †i ∣Φ⟩ =
i a
Clearly, by creating another particle-hole pair with b †b b †i , we get a Slater determinant with two particles
and two holes, in total N particles. We are left with N − “real” particles below N.
Continuing, it is clear that we can generate all the original Slater determinants with N particles by
creating up to N particle-hole pairs .
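Thus, any wavefunction with N particles can be written
\[ |\Psi_N\rangle = C_0 |\Phi\rangle + \sum_{ia} C_i^a\, b_a^\dagger b_i^\dagger |\Phi\rangle + \frac{1}{2!} \sum_{ijab} C_{ij}^{ab}\, b_b^\dagger b_j^\dagger b_a^\dagger b_i^\dagger |\Phi\rangle + \cdots, \] (.)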
where the factor /! comes from the double counting of the two particle-hole states. The sum extends all
the way up to N particle hole pairs.
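The claim that particle-hole excitations of the reference reach every N-particle determinant can be checked by brute-force enumeration in a truncated basis. The Python sketch below is only illustrative: orbitals are labelled 0, …, L−1 (zero-based, unlike the text), the reference occupies the N lowest labels, and the function names `excitation_rank` and `count_by_rank` are hypothetical.

```python
from itertools import combinations
from math import comb

def excitation_rank(occ, n_ref):
    """Number of particle-hole pairs relative to the reference {0, ..., n_ref-1}."""
    return sum(1 for p in occ if p >= n_ref)

def count_by_rank(n_particles, n_orbitals):
    """Count N-particle determinants grouped by their number of particle-hole pairs."""
    counts = {}
    for occ in combinations(range(n_orbitals), n_particles):
        m = excitation_rank(occ, n_particles)
        counts[m] = counts.get(m, 0) + 1
    return counts

N, L = 2, 4
counts = count_by_rank(N, L)
print(counts)
# Each rank m should contain comb(N, m) * comb(L - N, m) determinants,
# and all ranks together should exhaust the comb(L, N) determinants.
print(all(counts[m] == comb(N, m) * comb(L - N, m) for m in counts))
print(sum(counts.values()) == comb(L, N))
```

Choosing m holes among the N occupied orbitals and m particles among the L − N unoccupied ones gives the product of binomial coefficients used in the check.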
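We define
\[ |\Phi_i^a\rangle = b_a^\dagger b_i^\dagger |\Phi\rangle = c_a^\dagger c_i |\Phi\rangle, \] (.)
and
\[ |\Phi_{ij}^{ab}\rangle = b_b^\dagger b_j^\dagger b_a^\dagger b_i^\dagger |\Phi\rangle = c_b^\dagger c_j c_a^\dagger c_i |\Phi\rangle, \] (.)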
and so on. The lower indices indicate that they are holes/below N, and the upper indices that they are
particles/above N.
In chemistry parlance, a particle-hole pair is called a singles excitation, two particle-hole pairs a doubles
excitation, etc. Thus, ∣Φ ai ⟩ is a “singly excited determinant”, ∣Φ ab i j ⟩ is doubly excited, etc.
There are many different common notations for the particle-hole vacuum: ∣vac⟩, ∣Φ⟩, ∣c⟩, etc. Similarly,
there are many ways to denote a Slater determinant with m particle-hole pairs, for example ∣ia⟩c , ∣Φ ai ⟩, ∣ ai ⟩,
and others.
We can say that b †a b †i is a (particle-hole) pair creation operator. In chemistry language, a singles excitation.
It is useful to note that
[b †a b †i , b †b b †j ] = . (.)
I.e., \( |\Phi_{ij}^{ab}\rangle = |\Phi_{ji}^{ba}\rangle \). We form double excitation operators by products of singles, and so on.
Digression: There can be only N hole-particles! In the solution of the Dirac equation, for those who have seen this, the vacuum
contains zero electrons. But every time an electron is created, an anti-electron is also created below the “Fermi sea”. There are infinitely
many hole-states in Dirac theory.
Finally, we note that Wick’s theorem applies equally well to quasiparticles! For example, to compute
⟨Φ∣c †i c j ∣Φ⟩ we note that ∣Φ⟩ is the vacuum and that c †i = b i and c j = b †j , so
Note how the quasiparticles greatly simplified the evaluation of the matrix element, see the exercises.
Exercise .. If we restrict a ≤ L, how many linearly independent two-particle-two-hole determinants can
you create? How many three-particle-three-hole? How would you phrase this in chemistry language? △
Exercise .. Compute the vacuum expectation value (.) using Wick’s theorem and the original
creation and annihilation operators and compare the method and amount of work.
Repeat for
⟨Φ∣c †α c †β c δ c γ ∣Φ⟩ (.)
and compute it also with quasiparticles. Note that you get several cases, depending on whether the Greek indices
are smaller than or larger than N. Compare with the original formulation. △
Exercise .. Use Wick’s Theorem with respect to quasiparticles and write down the following operators
as a sum of normal-ordered strings with as few terms as possible (i.e., only include nonvanishing contrac-
tions):
a) b a b†i
b) b †i b c b†a
c) b i b j b †a b†b b c b †k
d) b a b i b j b†b b†c b d b †k
△
. Operators on normal-order form (Not yet lectured)
.. The number operator
We need the Hamiltonian and other second-quantized operators on normal-order form, relative to quasi-
particle vacuum. I.e., we want the operator to be written such that all quasiparticle annihilation operators
are to the right. This is achieved using Wick’s Theorem, and results in the original operator obtaining more
terms.
This is the task in the current section.
For conformity with much of the literature, we replace Greek indices µ, ν, etc, with p, q, etc. We still
reserve i, j, etc for hole indices, and a, b, etc for particles. And let’s face it, it is easier to write up in LaTeX.
We start with the number operator N̂, as an easy warm-up. First, we rewrite the second-quantized
operator using quasiparticle operators:
\[ \hat{N} = \sum_p c_p^\dagger c_p = \sum_i c_i^\dagger c_i + \sum_a c_a^\dagger c_a = \sum_i b_i b_i^\dagger + \sum_a b_a^\dagger b_a = N - \sum_i b_i^\dagger b_i + \sum_a b_a^\dagger b_a. \] (.)
This is the normal-ordered form of N̂. Interpreting, the last equality counts N minus the number of holes
plus the number of particles.
Let us act with N̂ on the quasiparticle vacuum, and observe:
All the terms vanish except the fully contracted term since we have annihilation operators to the right.
Thus, normal-ordered operators can be very useful when we deal with quasiparticles.
Introducing the quasiparticle operators at this stage leads to four distinct contributions to the operator,
corresponding to the different pq = i j, ia, ai, and ab contributions. However, it is more convenient to
use Wick’s Theorem on the above Ĥ expression without changing the creation- and annihilation operator
notation. Thus, beware, when we normal order now, it is relative to quasiparticles.
Wick’s Theorem gives
c †p c q = N(c †p c q ) + c †p c q . (.)
Some of the contractions are nonzero, namely, when pq = ii. The reader should verify that the rest of the
possible contractions vanish identically. Thus,
p p
Ĥ = ∑ h q N(c †p c q ) + ∑ h q c †p c q
pq pq
p
= ∑ h q N(c †p c q ) + ∑ h ii (.)
pq i
(q p) (q p)
= Ĥ + Ĥ .
Note that Ĥ is separated into a one-quasiparticle part and a constant zero-quasiparticle part. Explicitly,
(q p)
Ĥ = ∑ h ii , (.)
i
and
(q p)
Ĥ = − ∑ h ij b †j b i + ∑ h ia b†a b †i + ∑ h ai b i b a + ∑ h ba b †a b b , (.)
ij ai ia ab
where we have expanded the sum over pq in order to resolve the quasiparticles.
.. Normal-ordered two-body Hamiltonian
Consider the full Hamiltonian on the form
Ĥ = Ĥ + Ŵ. (.)
The Hamiltonian is normal-ordered relative to “real” particles. In terms of quasiparticles, we saw in the
previous sections that we could split
(q p) (q p)
Ĥ = Ĥ + Ŵ (q p) + Ĥ + Ŵ (q p) + Ŵ (q p) , (.)
separating Ĥ into zero, one and two-quasiparticle contributions. These were normal-ordered relative to
quasiparticles.
It is conventional to write
Ĥ = ĤN + E = ĤN + ŴN + E , (.)
with
(q p) ij
E = Ĥ + Ŵ (q p) = ∑ h ii + ∑w , (.a)
i ij ij
(q p) p pi
Ĥ ,N = Ĥ + Ŵ (q p) = ∑(h q + ∑ w qi )N(c †p c q ), (.b)
pq i
pq
ŴN = Ŵ (q p) = † †
∑ w rs N(c p c q c s c r ). (.c)
pqrs
Thus, Ĥ ,N is the total one-quasiparticle operator part of Ĥ, and contains contributions from Ŵ as well
as Ĥ , while ŴN is the total two-quasiparticle operator part of Ĥ. The subscript N stands for “normal-
ordered”.
where
p p pj
fq = hq + ∑ wq j . (.)
j
Some terms are equal, and we rearrange the expression to read:
ab † † † † ab † † † ai † † † ai † † †
ŴN = ∑ w i j b a b b b j b i + ∑ w c i b a b b b i b c + ∑ w b j b a b j b b b i + ∑ w jk b a b k b j b i
ab i j abc i aib j ai jk
ij † † ia † † ia † † ab † †
+ ∑ w b b b i b j − ∑ w jb b a b j b i b b + ∑ w b j b a b j b i b b + ∑ w cd b a b b b d b c (.)
i jk l k l l k i a jb i ab j abcd
ij † ai † ij
+ ∑ w ak b k b i b j b a + ∑ w bc b a b i b c b b + ∑ w ab b i b j b b b a
i jak aibc i jab
Exercise .. Verify that ŴN in Eq. (.) is Hermitian, given that Ŵ is Hermitian. △
Theorem . (The Generalized Wick’s Theorem). Let A ⋯A n be an operator string of creation and annihi-
lation operators, such that
\[ A_1 \cdots A_n = N(A_{1,1} \cdots A_{1,n_1})\, N(A_{2,1} \cdots A_{2,n_2}) \cdots N(A_{k,1} \cdots A_{k,n_k}). \] (.)
Then,
′ ′
A A ⋯A n = N(A , ⋯A k,n k ) + ∑ N (A , ⋯ ⋯ ⋯A k,n k ) + ∑ N (A , ⋯ ⋯ ⋯A k,n n )
() ()
⎛ ⎞ (.)
′
⎜ ⎟
+ ⋯ + ∑ N ⎜A , ⋯ ⋯ ⋯ ⋯ ⋯ ⋯A k,n k ⎟
⎜ ⎟
(⌊ n ⌋)
⎝´¹¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹⌊n/⌋
¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¸¹¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¶⎠
contractions
The notation ∑′(m) signifies that we sum over all combinations of m contractions that each involve operators
from different substrings, i.e., all contractions are between operators A i,j and A i′,j′ with i′ ≠ i. Contractions
between A i,j and A i,j′ are omitted.
When n is even, the last sum signifies that we sum over n/2 contractions, i.e., all operators are contracted.
The restriction to inter-string contractions implies that the maximum number of contractions in a term
usually is smaller than ⌊n/⌋.
Here is an example:
b) c c † c † c c c †
c) c c c † c c c † c c
△
Chapter
. Introduction
Having dealt with the basic formalism of many-fermion theory, how do we solve the Schrödinger equation
approximately? In this section, we discuss the variational principle, perhaps the most important tool for
devising approximate schemes.
We then develop the configuration-interaction method, and then Hartree–Fock theory, and then we
combine the two methods.
This is an eigenvalue problem for a Hermitian operator Ĥ over a Hilbert space. The mathematical analysis
of this problem is complex. However, if the Hilbert space L N has finite dimension D, then Ĥ can be viewed
as a Hermitian matrix, and we can find a complete set of orthonormal eigenfunctions ∣Ψk ⟩, k = , , , ⋯
with corresponding eigenvalues E k , such that
D
Ĥ = ∑ E k ∣Ψk ⟩ ⟨Ψk ∣ . (.)
k=
Of course, Hilbert space is usually infinite dimensional, complicating the mathematical analysis of the
problem. It may happen that Ĥ does not even have a ground state, or not even a single eigenvector. How-
ever, it turns out, that in most interesting cases the differences are small enough to warrant the assumption
that we are dealing with a finite-dimensional problem, or at least that Eq. (.) holds with possibly an
infinite dimension.
Theorem . (Variational principle). Consider the expectation value functional defined by
\[ E(|\Psi\rangle) \equiv \frac{\langle\Psi|\hat{H}|\Psi\rangle}{\langle\Psi|\Psi\rangle}. \] (.)
Let ∣Ψ∗ ⟩ be given. Then E∗ = E(∣Ψ∗ ⟩) is a stationary value of E with respect to all infinitesimal variations
∣Ψ∗ ⟩ + є ∣η⟩ (with є a small number and ⟨η∣η⟩ = ) if and only if
Ĥ ∣Ψ∗ ⟩ = E∗ ∣Ψ∗ ⟩ . (.)
Proof. Let є be a small real number, and let ∣Ψ⟩, ∣η⟩ ∈ L N be arbitrary vectors, with ∣η⟩ normalized. Let f(є) be defined
as
\[ f(\epsilon) = E(|\Psi\rangle + \epsilon |\eta\rangle). \] (.)
The stationary point condition can be formulated as
\[ f'(0) = 0. \] (.)
This condition must hold for all ∣η⟩. Thus, є∣η⟩ is an arbitrary infinitesimal variation. Mathematically,
f′(0) is the directional derivative of E at ∣Ψ⟩ in the direction ∣η⟩. Then
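\[ f(\epsilon) = \frac{\langle\Psi|\hat{H}|\Psi\rangle + \epsilon \langle\eta|\hat{H}|\Psi\rangle + \epsilon \langle\Psi|\hat{H}|\eta\rangle + \epsilon^2 \langle\eta|\hat{H}|\eta\rangle}{\langle\Psi|\Psi\rangle + \epsilon \langle\eta|\Psi\rangle + \epsilon \langle\Psi|\eta\rangle + \epsilon^2 \langle\eta|\eta\rangle} . \] (.)
Define E = ⟨Ψ∣Ĥ∣Ψ⟩ and N = ⟨Ψ∣Ψ⟩. Define A = ⟨η∣Ĥ∣Ψ⟩ + ⟨Ψ∣Ĥ∣η⟩ and a = ⟨η∣Ψ⟩ + ⟨Ψ∣η⟩. Then
\[ E(|\Psi\rangle + \epsilon |\eta\rangle) = \frac{E + \epsilon A + O(\epsilon^2)}{N + \epsilon a + O(\epsilon^2)} . \] (.)
Using 1/(1 + x) = 1 − x + O(x²), we expand the denominator to first order in є:
\[ \frac{1}{N + \epsilon a + O(\epsilon^2)} = \frac{1}{N} \left[ 1 - \epsilon \frac{a}{N} + O(\epsilon^2) \right] . \] (.)
We expand f(є) to first order in є:
\[ N f(\epsilon) = \left( E + \epsilon A + O(\epsilon^2) \right) \left[ 1 - \epsilon \frac{a}{N} + O(\epsilon^2) \right] = E + \epsilon \left[ A - \frac{aE}{N} \right] + O(\epsilon^2). \] (.)
Recall that
\[ f(\epsilon) = f(0) + \epsilon f'(0) + O(\epsilon^2). \] (.)
We see that f′(0) = 0 if and only if
\[ A = \frac{aE}{N} , \] (.)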
that is,
⟨η∣Ĥ∣Ψ⟩ + ⟨Ψ∣Ĥ∣η⟩ = (⟨η∣Ψ⟩ + ⟨Ψ∣η⟩)E(∣Ψ⟩), (.)
which must hold for all ∣η⟩. In particular, if it holds for ∣η⟩ = ∣u⟩ it must also hold for ∣η⟩ = i ∣u⟩. Plugging
these in gives
⟨u∣Ĥ∣Ψ⟩ + ⟨Ψ∣Ĥ∣u⟩ = (⟨u∣Ψ⟩ + ⟨Ψ∣u⟩)E(∣Ψ⟩), (.)
−i ⟨u∣Ĥ∣Ψ⟩ + i ⟨Ψ∣Ĥ∣u⟩ = (−i ⟨u∣Ψ⟩ + i ⟨Ψ∣u⟩)E(∣Ψ⟩), (.)
Multiplying the second equation by i, adding the two equations, and dividing by 2 gives
\[ \langle u|\hat{H}|\Psi\rangle = E(|\Psi\rangle) \, \langle u|\Psi\rangle . \] (.)
Since ∣u⟩ was arbitrary, we must have
Ĥ ∣Ψ⟩ = E(∣Ψ⟩) ∣Ψ⟩ . (.)
The proof is complete.
The variational principle in its simplest form states that the ground-state energy E_0 is the minimum of
the expectation value of the Hamiltonian:
Theorem . (Variational Principle, Rayleigh–Ritz). If Ĥ has a ground state, then the ground-state energy
is given by the minimum of the expectation value of Ĥ, viz,
\[ E_0 = \min \left\{ \frac{\langle\Psi|\hat{H}|\Psi\rangle}{\langle\Psi|\Psi\rangle} \;\middle|\; 0 \neq |\Psi\rangle \in L_N , \ |\langle\Psi|\hat{H}|\Psi\rangle| < +\infty \right\} . \] (.)
Theorem . holds even if Eq. (.) does not hold. It is sufficient that Ĥ has a lowest eigenvalue. In the
infinite-dimensional case, we must require that ∣⟨Ψ∣Ĥ∣Ψ⟩∣ < +∞, since for most Hamiltonians of interest,
there are in fact ∣Ψ⟩ that have an infinite expectation value. In finite dimensions, this of course cannot happen.
We will not prove Theorem . in its full generality, but we see immediately that it follows from Theorem .:
E_0 is a stationary value, and clearly E cannot take values lower than E_0. Thus, E_0 must be the
minimum.
We now consider the variational procedure, a useful method of generating approximate ground-state
energies. Suppose we have a subset of Hilbert space M ⊂ L N , and compute
\[ E_0[M] \equiv \inf \left\{ \frac{\langle\Psi|\hat{H}|\Psi\rangle}{\langle\Psi|\Psi\rangle} \;\middle|\; 0 \neq |\Psi\rangle \in M , \ |\langle\Psi|\hat{H}|\Psi\rangle| < +\infty \right\} . \] (.)
Clearly,
\[ E_0 \leq E_0[M], \] (.)
since we minimize over a smaller set than the full Hilbert space. This upper bound property of the variational
procedure is very useful, because if we enlarge M, the estimate for E_0 can only improve or stay the same.
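Suppose that our variational procedure yields a minimum value in Eq. (.) for the function ∣Ψ̃⟩ ∈ M:
\[ E_0[M] = E(|\tilde{\Psi}\rangle) = \frac{\langle\tilde{\Psi}|\hat{H}|\tilde{\Psi}\rangle}{\langle\tilde{\Psi}|\tilde{\Psi}\rangle} . \] (.)
Suppose also that ∣Ψ̃⟩ is fairly close to ∣Ψ_0⟩, i.e., ∣Ψ̃⟩ = ∣Ψ_0⟩ + є∣η⟩ with є small and ∣η⟩ normalized. Since E is stationary at ∣Ψ_0⟩, the first-order term in є vanishes, so that E_0[M] − E_0 = O(є²),
i.e., the error in the eigenvalue is quadratic in the error in the eigenfunction! Thus, the error E_0[M] −
E_0 is insensitive to errors in the wavefunction. This explains why the variational procedure is so useful.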
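The quadratic dependence of the energy error on the wavefunction error can be checked numerically in a finite-dimensional setting. The following is a minimal sketch, assuming a small random real symmetric matrix as a stand-in for Ĥ; the names `rayleigh_quotient` and `energy_error` are illustrative and not part of the notes.

```python
import numpy as np

def rayleigh_quotient(H, psi):
    """Expectation value <psi|H|psi> / <psi|psi>."""
    return np.vdot(psi, H @ psi).real / np.vdot(psi, psi).real

rng = np.random.default_rng(0)
D = 6
M = rng.standard_normal((D, D))
H = (M + M.T) / 2                      # hypothetical real symmetric "Hamiltonian"
evals, evecs = np.linalg.eigh(H)       # eigenvalues in ascending order
E0, psi0 = evals[0], evecs[:, 0]

# Normalized error direction eta, chosen orthogonal to the ground state
eta = rng.standard_normal(D)
eta -= psi0 * (psi0 @ eta)
eta /= np.linalg.norm(eta)

def energy_error(eps):
    """Error in the energy when the wavefunction error is eps * eta."""
    return rayleigh_quotient(H, psi0 + eps * eta) - E0

for eps in (1e-1, 1e-2, 1e-3):
    print(f"eps = {eps:.0e}   E - E0 = {energy_error(eps):.3e}")
```

Reducing є by a factor of ten should reduce the energy error by roughly a factor of one hundred, and the error is never negative, in line with the upper bound property.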
Example: The hydrogen atom with Hamiltonian (in atomic units)
\[ \hat{h} = -\frac{1}{2} \nabla^2 - \frac{1}{r} . \] (.)
The exact ground-state wavefunction is well-known,
\[ \psi_0(\vec{r}) = C e^{-r} , \] (.)
with eigenvalue E_0 = −1/2. Here, C is a normalization constant. Let us imagine we did not know ψ_0, and
try a parameterized wavefunction of the form
\[ \psi_\alpha(\vec{r}) = (\alpha/\pi)^{3/4} e^{-\alpha r^2/2} . \] (.)
Figure .: Plot of approximate and exact ground-state wavefunction for the Hydrogen example (horizontal axis: r).
Thus, M = {∣ψ_α⟩ ∣ α > 0} is the set of approximate wavefunctions, which all satisfy ⟨ψ_α∣ψ_α⟩ = 1. We can
compute the expectation value,
\[ E(|\psi_\alpha\rangle) = \langle\psi_\alpha|\hat{h}|\psi_\alpha\rangle = \frac{3}{4} \alpha - 2 \left( \frac{\alpha}{\pi} \right)^{1/2} \] (.)
and minimize with respect to α,
\[ E_0[M] = \inf_\alpha E(|\psi_\alpha\rangle) = E(|\psi_{16/(9\pi)}\rangle) = -\frac{4}{3\pi} \approx -0.424. \] (.)
This is actually a minimum, obtained at α = 16/(9π). Comparing with the exact result, we see that the
energies are rather close for such a simple parameterization. The wavefunctions are not that close, see
Fig. .! Note that the exact ground-state is not smooth at ⃗r = 0.
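The minimization over α can also be carried out numerically. The sketch below assumes the closed-form expectation value E(α) = (3/4)α − 2(α/π)^(1/2) for the Gaussian trial function in atomic units, and uses a simple golden-section search; the helper `golden_section_min` is a hypothetical name introduced here for illustration.

```python
import math

def energy(alpha):
    """E(alpha) = (3/4) alpha - 2 sqrt(alpha/pi) for the Gaussian trial function."""
    return 0.75 * alpha - 2.0 * math.sqrt(alpha / math.pi)

def golden_section_min(f, a, b, tol=1e-10):
    """Minimize a unimodal function f on the interval [a, b]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # The minimum lies in [a, d]
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            # The minimum lies in [c, b]
            a, c = c, d
            d = a + invphi * (b - a)
    return 0.5 * (a + b)

alpha_opt = golden_section_min(energy, 1e-6, 5.0)
print("numerical alpha :", alpha_opt, "  analytic 16/(9 pi):", 16 / (9 * math.pi))
print("numerical energy:", energy(alpha_opt), "  analytic -4/(3 pi):", -4 / (3 * math.pi))
print("exact ground-state energy: -0.5")
```

The variational energy lies above the exact value −1/2, as the upper bound property requires.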
Usually, the set M contains wavefunction ansätze that are parameterized in some way. In the example,
we had a simple Gaussian wavefunction parameterized by the width.
HI J = ⟨Φ I ∣Ĥ∣Φ J ⟩ . (.)
For the finite-dimensional case, the Cauchy interlace theorem states that for a linear model as here
described, all the eigenvalues of H actually approximate eigenvalues of the full Hamiltonian H from above.
For a general nonlinear model M, we cannot say this. In general only the ground-state energy is approxi-
mated.
The theorem implies that truncating a single-particle basis or truncating a Slater determinant basis makes
sense.
We will not prove the theorem.
Theorem . (Cauchy Interlace Theorem). Let V_1 and V_2 be linear spaces, of dimension D_1 and D_2, respectively. Let V_1 ⊂ V_2 be a subspace.
Let {∣Φ_I⟩}, I = 1, ⋯, D_2, be an orthonormal basis for V_2, such that {∣Φ_I⟩}, I = 1, ⋯, D_1, is a basis for V_1.
Let Ĥ ∶ V_2 → V_2 be a Hermitian operator with matrix H_2 ∈ C^{D_2 × D_2}, (H_2)_{IJ} = ⟨Φ_I∣Ĥ∣Φ_J⟩.
Let H_1 be the projection of Ĥ onto V_1, i.e., the matrix H_1 of this operator is equal to the upper left D_1 × D_1
block of the D_2 × D_2 matrix H_2.
Let E_k^{(i)} be the D_i eigenvalues of H_i, arranged such that
\[ E_k^{(i)} \leq E_{k+1}^{(i)} \quad \forall k. \] (.)
Then,
\[ E_k^{(2)} \leq E_k^{(1)} \leq E_{k+\delta}^{(2)} , \quad \delta = D_2 - D_1 . \] (.)
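The interlacing inequalities are easy to verify numerically. The following sketch assumes a random Hermitian matrix as H_2 and takes H_1 to be its upper left block; the dimensions and the random seed are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
D2, D1 = 8, 5
M = rng.standard_normal((D2, D2)) + 1j * rng.standard_normal((D2, D2))
H2 = (M + M.conj().T) / 2          # Hermitian matrix on the larger space V_2
H1 = H2[:D1, :D1]                  # upper left block: projection onto V_1

E2 = np.linalg.eigvalsh(H2)        # eigenvalues in ascending order
E1 = np.linalg.eigvalsh(H1)
delta = D2 - D1

# Zero-based k here; the statement is the same as in the theorem.
for k in range(D1):
    print(f"{E2[k]:+.4f} <= {E1[k]:+.4f} <= {E2[k + delta]:+.4f}")

interlaced = all(E2[k] <= E1[k] <= E2[k + delta] for k in range(D1))
print("interlacing holds:", interlaced)
```

In particular, each eigenvalue of the truncated matrix lies above the corresponding eigenvalue of the larger matrix.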
The set S may of course be chosen in many different ways. One typical choice is the set of all possible
Slater determinants generated by the first L single-particle functions ϕ_0 through ϕ_{L−1}. This gives a space
whose dimension is the binomial coefficient (L over N), and is called the full configuration-interaction space (FCI space).
Another typical approach is to have a reference determinant ∣Φ⟩ and consider particle-hole states on
top of that, or excitations in chemistry language.
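For example, the one-particle-one-hole space (CI singles, CIS) is spanned by ∣Φ⟩ and the singly excited determinants ∣Φ_i^a⟩. Adding the doubly excited determinants gives the CISD space,
\[ V_{\mathrm{CISD}} = \operatorname{span}\{ |\Phi\rangle , |\Phi_i^a\rangle , |\Phi_{ij}^{ab}\rangle \mid i, j = 1, \cdots, N , \ a, b = N+1, \cdots, L \}. \] (.)
For amplitudes A_{ij}^{ab} that are antisymmetric in i, j and in a, b, the restricted and unrestricted sums over doubles are related by
\[ \sum_{i<j} \sum_{a<b} A_{ij}^{ab} |\Phi_{ij}^{ab}\rangle = \frac{1}{4} \sum_{ij} \sum_{ab} A_{ij}^{ab} |\Phi_{ij}^{ab}\rangle . \] (.)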
Clearly, indexing the Slater determinants using the vector p⃗ directly can be cumbersome. Using a
different notation, we let I ∈ I be an index that enumerates the basis determinants, and write
For example, I = 1, 2, ⋯, D is a possibility, with some way of choosing an I for every p⃗ we are interested in.
Or I = (a, i), I = (ab, i j), etc, enumerates the CIS, CISD, etc, hierarchy of spaces.
How do we choose the single-particle functions and the reference state in CI theory? The most common
choice in chemistry is to employ a basis of Hartree–Fock spin-orbitals. This is the topic of Section .. A
more general picture is as follows: if Ĥ = Ĥ_0 + Ŵ, it is also possible to consider Ŵ a perturbation of Ĥ_0,
assuming that the eigenstates and eigenvalues of Ĥ_0 are good approximations to those of the full Ĥ. (This
is also true for the Hartree–Fock paradigm to be considered later.)
Let therefore {ϕ_p} be a complete set of eigenfunctions for the single-particle operator ĥ with eigenvalues
є_p arranged in increasing order. Then, the Slater determinants ∣p⃗⟩ are eigenstates of the one-body
Hamiltonian Ĥ_0 = ∑_{i=1}^{N} ĥ(i). Clearly, the determinant
Figure .: Fermi level є_F and quasiparticles. To the left, we have the vacuum state. To the right, we have a
doubly excited state, or a two-particle-two-hole-state. Notice how we draw the “Fermi line” between two
levels for clarity. In this simple picture, we have assumed that the levels are non-degenerate. If we had spin
present, we could fit two particles per level, and so on.
Note, that if the eigenvalues of ĥ are degenerate, then this wavefunction may or may not be unique.
In this picture, the truncated CI scheme as outlined above is a natural approach, since it is reasonable
to assume that singles, doubles, etc, will systematically improve upon the “zero-order” wavefunction ∣Φ⟩.
In the context of a reference function ∣Φ⟩ defined in terms of a zero-order Hamiltonian, such as Ĥ_0,
it is common to define the Fermi level є_F as the energy of the occupied orbital with the highest energy,
assuming that all degenerate levels are included. With this terminology,
\[ |\Phi\rangle = \Big( \prod_{\epsilon_p \leq \epsilon_F} c_p^\dagger \Big) |-\rangle , \] (.)
for example. Moreover, we say that a hole is “below the Fermi level” and a particle is “above the Fermi level”.
An excitation excites a fermion from below the Fermi level to above the Fermi level. Thus, the index N is
replaced by the one-body energy of that level, є_F. See Fig. .
Notice that the truncated CI scheme favors the description of the ground-state wavefunction.
If we look at the CISD case, the matrix then obtains a block form:
.. Naive CI
The simplest approach, which we here call “naive CI”, is to
1. Write down a list of all the Slater determinants in the desired basis,
I ↦ ∣Φ_I⟩.
2. Compute all the matrix elements H_IJ and store them in computer memory as a big D × D matrix.
This can be done using, say, the Slater–Condon rules (see Exercise ??) that are basically formulae for
the matrix elements given in terms of the occupied single-particle functions in ∣Φ_I⟩ and ∣Φ_J⟩.
3. Use a diagonalization algorithm to find, say, the ground-state energy or other eigenvalues of the
matrix.
The biggest problem with this approach is that the dimension D of the CI space grows very quickly.
The matrix is, in principle, a table with D² elements. For FCI, D grows like the binomial coefficient (L over N), which very quickly is
prohibitive. For CIS, it grows only like N(L − N), but CIS is not that fancy. For CISD, the dimension
grows like N²(L − N)². We see that the spaces in any case become huge for moderate particle numbers
and numbers L of single-particle functions.
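To get a feeling for these growth rates, one can simply count determinants. The sketch below assumes that the CIS and CISD spaces include the reference determinant and that doubles are counted once per pair i < j, a < b; the function name `ci_dimensions` is illustrative only.

```python
from math import comb

def ci_dimensions(N, L):
    """Number of determinants in the FCI, CIS and CISD spaces
    for N fermions distributed over L single-particle functions."""
    virt = L - N                              # number of unoccupied functions
    fci = comb(L, N)
    cis = 1 + N * virt                        # reference + singles
    cisd = cis + comb(N, 2) * comb(virt, 2)   # ... + doubles
    return fci, cis, cisd

for N, L in [(2, 10), (6, 20), (10, 40), (20, 80)]:
    fci, cis, cisd = ci_dimensions(N, L)
    # A dense D x D matrix of 8-byte floats needs 8 * D**2 bytes
    print(f"N={N:3d} L={L:3d}  FCI={fci:.3e}  CIS={cis}  CISD={cisd}  "
          f"dense CISD matrix ~ {8 * cisd**2 / 1e9:.2f} GB")
```

Already for modest N and L, a dense CISD matrix does not fit in memory, and the FCI dimension is far beyond reach.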
.. Direct CI
More common than “naive CI” is direct CI. For systems of interest, the matrix size grows so quickly that
storing the matrix H in memory is out of the question. Moreover, diagonalization of dense matrices scales as
D³, quickly becoming too expensive for practical calculations.
Luckily, we have iterative algorithms such as the Lanczos algorithm. These rely only on the matrix-vector
product. Nowhere is the actual value of H_IJ needed, only the action on a vector A_I, i.e., the algorithm needs
to compute
A⃗′ = H A⃗ (.)
for some input vector A⃗. I.e., we must have an algorithm to compute
where P = ∑I ∣Φ I ⟩ ⟨Φ I ∣ is the projection operator onto our chosen basis, i.e., we throw away the part of
Ĥ ∣Ψ⟩ which is not describable in terms of our basis.
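The following is a minimal sketch of a Lanczos iteration that accesses the Hamiltonian only through a matrix-vector product, as described above. It is not the implementation used in production CI codes: the function name `lanczos_lowest`, the use of full reorthogonalization, and the random test matrix (stored densely here only so that the result can be compared with a dense diagonalization) are assumptions made for illustration.

```python
import numpy as np

def lanczos_lowest(matvec, dim, n_iter=60, seed=0):
    """Estimate the lowest eigenvalue of a Hermitian operator,
    given only a function matvec(v) that returns H v."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    basis, alphas, betas = [v], [], []
    for _ in range(min(n_iter, dim)):
        w = matvec(basis[-1])
        alphas.append(np.vdot(basis[-1], w).real)
        # Orthogonalize against all previous Lanczos vectors
        # (full reorthogonalization, simple but not memory efficient)
        for u in basis:
            w = w - np.vdot(u, w) * u
        beta = np.linalg.norm(w)
        if beta < 1e-12:          # Krylov space exhausted
            break
        betas.append(beta)
        basis.append(w / beta)
    k = len(alphas)
    # Tridiagonal matrix representing H in the Krylov basis
    T = (np.diag(alphas)
         + np.diag(betas[:k - 1], 1)
         + np.diag(betas[:k - 1], -1))
    return np.linalg.eigvalsh(T)[0]

# Demonstration with a hypothetical random symmetric matrix
rng = np.random.default_rng(42)
M = rng.standard_normal((200, 200))
H = (M + M.T) / 2
estimate = lanczos_lowest(lambda v: H @ v, 200, n_iter=60)
print("Lanczos estimate:", estimate)
print("dense result    :", np.linalg.eigvalsh(H)[0])
```

In a direct CI code, the `matvec` argument would be replaced by a routine that computes the action of the projected Hamiltonian on the coefficient vector without ever forming H.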
It is useful to represent ∣Φ I ⟩ in terms of its occupation number vector, a bit string B = B[I]. These are
integers, and we need a table of these in computer memory. Since our ∣Φ I ⟩ must be linearly independent,
there is a one-to-one correspondence between the B[I]’s and the I’s, i.e., we can invert the table to obtain
I = I[B], given B. We write ∣B⟩ = ∣Φ I[B] ⟩ for brevity, and we stress that now B is an integer written on
binary form.
The central observation is now that, for any string of creation and annihilation operators C_1, ⋯, C_n,
\[ C_1 C_2 \cdots C_n |B\rangle = \begin{cases} 0 , & \text{or} \\ (-1)^s |B'\rangle . \end{cases} \] (.)
The result can be found by manipulating the bits of B and keeping track of the resulting sign. When B′ has
been found, the corresponding index I′ can be found by searching the bit pattern table. Thus, let us write:
⎛ p pq † † ⎞
= ∑ A B ∑ ∣B′ ⟩ ⟨B′ ∣ ∑ h q c †p c q +
∑ w rs c p c q c s c r ∣B⟩
B B ′ ⎝ pq
pqrs ⎠ (.)
⎛ p pq ⎞ ′
= ∣Ψ′ ⟩ = ∑ A B ∑ h q ⟨B′ ∣c †p c q ∣B⟩ +
′ † †
∑ w rs ⟨B ∣c p c q c s c r ∣B⟩ ∣B ⟩
B ⎝ pq pqrs ⎠
This gives us the following algorithm for computing the Ĥ_0 contribution to ∣Ψ′⟩ (the Ŵ part is similar):
1. Initialize A′_{I′} = 0 for all I′.
2. Loop over I:
Of course, this algorithm is just a sketch. There are many ways to improve it.
How does one search for the index I′ in step 2/b/ii? One way is to ensure that the table of bit patterns
(integers) is sorted, and then use binary search. This requires O(log D) operations per lookup, and
since we need to do this at least O(D) times, the total cost is at least O(D log D), which slows down our program considerably. One can also use an associative
container (e.g., the C++ STL class std::map<int,int> can be used). Since std::map is an ordered tree with O(log D) lookup, this is no faster.
A much faster approach can be taken using graphical methods. It is actually possible to find a formula
for the inverse map. This formula is O(1), dramatically reducing the computer work for direct CI. For
more information on this technique, see Helgaker/Jørgensen/Olsen [], Section ..
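The two table-based lookup strategies can be sketched as follows. This is an illustration only: the functions `bit_patterns` and `index_by_bisect` are hypothetical names, and a Python `dict` is used as a stand-in for a hash table with average O(1) lookup. The graphical O(1) formula mentioned above is not implemented here.

```python
from bisect import bisect_left
from itertools import combinations

def bit_patterns(L, N):
    """All integers with exactly N of the lowest L bits set, sorted increasingly."""
    patterns = [sum(1 << p for p in occ) for occ in combinations(range(L), N)]
    return sorted(patterns)

def index_by_bisect(table, B):
    """Binary search in the sorted table, O(log D) per lookup.
    Returns None if B is not in the table."""
    I = bisect_left(table, B)
    if I < len(table) and table[I] == B:
        return I
    return None

table = bit_patterns(L=6, N=3)                    # the map I -> B[I]
inverse = {B: I for I, B in enumerate(table)}     # the map B -> I[B], hash table

B = 0b010110
print("index via binary search:", index_by_bisect(table, B))
print("index via hash table   :", inverse[B])
```

A pattern with the wrong number of set bits is simply not found, which is one way of implementing the projection onto the chosen basis.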
(But who is thinking in terms of base-10 numbers these days anyway?) All integers between 0 and 2^L − 1
encode all possible Fock space basis functions. A basis for N-fermion space is composed of all the integers
whose bit patterns have precisely N bits set in total.
Annihilation operator: c_p∣B⟩ is either 0 or (−1)^k∣B′⟩ for some k and B′. We have the following algorithm:
Creation operator: c†_p∣B⟩ is either 0 or (−1)^k∣B′⟩ for some k and B′. We have the following algorithm:
1. If bit p is set, return the zero result.
2. Else, compute k as the number of bits set before p.
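The steps for the creation operator, and the analogous ones for the annihilation operator, might be sketched as below. The sketch assumes that orbital p corresponds to bit p counted from the least significant bit, and that ∣B⟩ is built by applying the creation operators in order of increasing p, so that "bits set before p" means set bits with index lower than p. The function names are hypothetical.

```python
def count_bits_before(p, B):
    """Number of occupied orbitals with index lower than p."""
    return bin(B & ((1 << p) - 1)).count("1")

def create(p, B):
    """Apply the creation operator for orbital p to |B>.
    Returns (sign, B'), or (0, None) for the zero result."""
    if B & (1 << p):                    # orbital already occupied
        return 0, None
    sign = -1 if count_bits_before(p, B) % 2 else 1
    return sign, B | (1 << p)           # set bit p

def annihilate(p, B):
    """Apply the annihilation operator for orbital p to |B>.
    Returns (sign, B'), or (0, None) for the zero result."""
    if not B & (1 << p):                # orbital is empty
        return 0, None
    sign = -1 if count_bits_before(p, B) % 2 else 1
    return sign, B & ~(1 << p)          # clear bit p

# Example: move a particle from orbital 0 to orbital 3 in |B> = 0b0011
sign1, B1 = annihilate(0, 0b0011)
sign2, B2 = create(3, B1)
print(sign1 * sign2, bin(B2))
```

A string of operators is handled by applying them from right to left, multiplying the signs and stopping as soon as a zero result occurs.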
Exercise .. We are given L = orbitals, numbered p = , , ⋯, L − , and thus an occupation number
representation of length bits, e.g.,
Write down the result of the following expressions, on occupation number form. Remember the sign
factor:
a) c † ∣⟩
b) c ∣⟩
c) c † c † ∣⟩
d) c † ∣⟩
e) c ∣⟩
f) c † ∣⟩
g) c † c † c † ∣⟩
h) c † c ∣⟩
△
Exercise .. Write a program that generates all possible bit patterns of length L with N bits set and writes
them to screen.
Check that you have the correct number of patterns, the binomial coefficient (L over N). △
Exercise .. (continues exercise ..) Write a program that correctly creates/annihilates particles from a
bit pattern representation ∣B⟩ of a Slater determinant, returning the proper sign. △
Exercise .. (continues exercises . and ..) Write a program that, given h^p_q and w^{pq}_{rs} (antisymmetrized
or otherwise) as input arrays, computes Ĥ∣B⟩ using direct CI. △
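\[ |\Phi\rangle = |\phi_1 \phi_2 \cdots \phi_N\rangle , \quad \langle\phi_i|\phi_j\rangle = \delta_{ij} . \] (.)
Note carefully, that a single-particle basis is not given – it is to be found! The expectation value of the
Hamiltonian Ĥ = Ĥ_0 + Ŵ now reads (recalling that ⟨Φ∣Φ⟩ = 1)
\[ \langle\Phi|\hat{H}|\Phi\rangle = \sum_i \langle\phi_i|\hat{h}|\phi_i\rangle + \frac{1}{2} \sum_{ij} \langle\phi_i \phi_j|\hat{w}|\phi_i \phi_j - \phi_j \phi_i\rangle \] (.)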
as obtained via the Slater–Condon rules, see for example Exercise .. Here,
which satisfies ⟨ϕ p ϕ q ∣ŵ∣ϕ r ϕ s ⟩ = ⟨ϕ q ϕ p ∣ŵ∣ϕ s ϕ r ⟩. The task is now to minimize this energy ⟨Φ∣Ĥ∣Φ⟩ subject
to the constraint that the ϕ i are orthonormalized,
⟨ϕ i ∣ϕ j ⟩ = δ i j . (.)
When a minimum is found, we denote the solution by ∣ΦHF ⟩, the Hartree–Fock state.
The constraints constitute a complication that we want to get rid of. We therefore introduce Lagrange multipliers,
one for each constraint, giving a Lagrangian functional
Recall, that computing an extremum for the constrained problem is equivalent to an unconstrained ex-
tremalization of L with respect to the ϕ i and the Lagrange multipliers (see any text on vector calculus).
Due to symmetry of the constraints, the Lagrange multiplier matrix λ can be assumed to be Hermitian .
A word on a special notation. We define a single-particle function
⟨ ⋅ ϕ ∣ŵ∣ϕ ϕ ⟩ ∈ L (.)
as the function obtained by integrating only over the second particle in the inner product, viz,
i.e., the full two-particle integral. Thus, the dot represents an “unused slot” in the two-particle matrix
element.
We can expand the function in any orthonormal single-particle basis {χ p } ⊂ L ,
i.e., a linear combination of two-particle matrix elements. This notation will be useful when we now state
and prove our result:
Theorem . (Hartree–Fock equations). The single-particle functions of the Hartree–Fock state ∣Φ_HF⟩ satisfy
the nonlinear eigenvalue problem
\[ \hat{f}(\phi_1, \cdots, \phi_N) |\phi_i\rangle = \epsilon_i |\phi_i\rangle , \] (.)
where
\[ \hat{f}(\phi_1, \cdots, \phi_N) \equiv \hat{h} + \hat{v}_{\mathrm{direct}} - \hat{v}_{\mathrm{exchange}} , \] (.)
with
\[ \hat{v}_{\mathrm{direct}} |\psi\rangle \equiv \sum_j \langle \, \cdot \, \phi_j|\hat{w}|\psi \phi_j\rangle \] (.)
and
\[ \hat{v}_{\mathrm{exchange}} |\psi\rangle \equiv \sum_j \langle \, \cdot \, \phi_j|\hat{w}|\phi_j \psi\rangle . \] (.)
The equations (.) are referred to as “the Hartree–Fock equations”. The operator f̂ in Eq. (.) is “the
Fock operator”, and v̂_direct and v̂_exchange are the direct and exchange potentials, respectively.
To see this, assume that a_ji is a matrix which is not assumed to be Hermitian. Note that the expression g_ij = ⟨ϕ_i∣ϕ_j⟩ − δ_ij
satisfies g∗_ij = g_ji. Thus, ∑_ij a_ji g_ij = ∑_ij a_ji g∗_ji = ∑_ij a_ij g∗_ij = (∑_ij a∗_ij g_ij)∗. Since the Lagrangian must be real, this gives ∑_ij a_ji g_ij = ½ ∑_ij (a_ji + a∗_ij) g_ij. Take
λ_ji = ½ (a_ji + a∗_ij).
Proof. In the language of Sec. A., we need to show that the directional derivative of the Lagrangian van-
ishes.
We first note that λ ji can be treated separately: ∂L/∂λ ji = ⟨ϕ i ∣ϕ j ⟩− δ i j , the constraint. These equations
are ensured fulfilled in the end by finding solutions ϕ i that are in fact orthonormal. We thus only compute
the directional derivatives with respect to variations of the ϕ i .
Choose a k ∈ {1, ⋯, N}. We are going to compute the directional derivative with respect to changes in
the function ϕ_k only, leaving the others fixed. This turns out to be sufficient to find all the equations. Thus,
let є be a small real number, and let η be a normalized single-particle function. We write
δϕ_k = єη.
To first order in є,
\[ f(\epsilon) = f(0) + \epsilon f'(0) + O(\epsilon^2), \] (.)
and for an extremal point of L, we must have that f′(0) = 0 for any η. In the language of Sec. A., the
directional derivative of L at {ϕ_i}, i = 1, ⋯, N, in the direction η (for ϕ_k, the others are fixed) vanishes.
We compute the Taylor expansion of f (є) by direct computation of the perturbed Lagrangian:
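\[
\begin{aligned}
f(\epsilon) ={}& \sum_i \langle\phi_i + \delta_{ki} \epsilon\eta|\hat{h}|\phi_i + \delta_{ki} \epsilon\eta\rangle \\
&+ \frac{1}{2} \sum_{ij} \langle(\phi_i + \delta_{ki} \epsilon\eta)(\phi_j + \delta_{kj} \epsilon\eta)|\hat{w}|(\phi_i + \delta_{ki} \epsilon\eta)(\phi_j + \delta_{kj} \epsilon\eta)\rangle \\
&- \frac{1}{2} \sum_{ij} \langle(\phi_i + \delta_{ki} \epsilon\eta)(\phi_j + \delta_{kj} \epsilon\eta)|\hat{w}|(\phi_j + \delta_{kj} \epsilon\eta)(\phi_i + \delta_{ki} \epsilon\eta)\rangle \\
&- \sum_{ij} \lambda_{ji} \left( \langle\phi_i + \delta_{ik} \epsilon\eta|\phi_j + \delta_{jk} \epsilon\eta\rangle - \delta_{ij} \right)
\end{aligned}
\] (.)
We now write out the matrix elements, but keep only terms up to first order in є. This gives
\[
\begin{aligned}
f(\epsilon) ={}& \sum_i \langle\phi_i|\hat{h}|\phi_i\rangle + \frac{1}{2} \sum_{ij} \langle\phi_i \phi_j|\hat{w}|\phi_i \phi_j - \phi_j \phi_i\rangle + \epsilon \langle\eta|\hat{h}|\phi_k\rangle + \epsilon \langle\phi_k|\hat{h}|\eta\rangle \\
&+ \frac{\epsilon}{2} \sum_j \langle\eta \phi_j|\hat{w}|\phi_k \phi_j\rangle + \frac{\epsilon}{2} \sum_i \langle\phi_i \eta|\hat{w}|\phi_i \phi_k\rangle - \frac{\epsilon}{2} \sum_j \langle\eta \phi_j|\hat{w}|\phi_j \phi_k\rangle - \frac{\epsilon}{2} \sum_i \langle\phi_i \eta|\hat{w}|\phi_k \phi_i\rangle \\
&+ \frac{\epsilon}{2} \sum_j \langle\phi_k \phi_j|\hat{w}|\eta \phi_j\rangle + \frac{\epsilon}{2} \sum_i \langle\phi_i \phi_k|\hat{w}|\phi_i \eta\rangle - \frac{\epsilon}{2} \sum_i \langle\phi_i \phi_k|\hat{w}|\eta \phi_i\rangle - \frac{\epsilon}{2} \sum_j \langle\phi_k \phi_j|\hat{w}|\phi_j \eta\rangle \\
&- \epsilon \sum_j \lambda_{jk} \langle\eta|\phi_j\rangle - \epsilon \sum_i \lambda_{ki} \langle\phi_i|\eta\rangle - \sum_{ij} \lambda_{ji} \left( \langle\phi_i|\phi_j\rangle - \delta_{ij} \right) + O(\epsilon^2)
\end{aligned}
\] (.)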
We now use the symmetry property of the matrix elements of ŵ. This gives, for example,
∑ ⟨ϕ i η∣ŵ∣ϕ i ϕ k ⟩ = ∑ ⟨ηϕ i ∣ŵ∣ϕ k ϕ i ⟩ = ∑ ⟨ηϕ j ∣ŵ∣ϕ k ϕ j ⟩ . (.)
i i j
This gives a simplification of f (є), and we regroup:
f (є) = ∑ ⟨ϕ i ∣ ĥ∣ϕ i ⟩ + ∑ ⟨ϕ i ϕ j ∣ŵ∣ϕ i ϕ j − ϕ j ϕ i ⟩ − ∑ λ ji (⟨ϕ i ∣ϕ i ⟩ − δ i j )
i ij ij
We recognize that the zeroth order term is just f(0) = L(ϕ_1, ⋯, ϕ_N, λ). We read off f′(0), and obtain the
directional derivative, and hence the equation
which must be valid for all choices of the function η. In particular we can also insert iη, giving, after
dividing the result by i,
Let {χ p } be a complete orthonormal basis for the single-particle space L . Inserting η = χ p in Eq. (.),
we obtain
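\[
\begin{aligned}
0 &= \sum_p |\chi_p\rangle \Big\{ \langle\chi_p|\hat{h}|\phi_k\rangle + \sum_j \langle\chi_p \phi_j|\hat{w}|\phi_k \phi_j\rangle - \sum_j \langle\chi_p \phi_j|\hat{w}|\phi_j \phi_k\rangle - \sum_j \lambda_{jk} \langle\chi_p|\phi_j\rangle \Big\} \\
&= \hat{h} |\phi_k\rangle + \sum_j \sum_p |\chi_p\rangle \langle\chi_p \phi_j|\hat{w}|\phi_k \phi_j\rangle - \sum_j \sum_p |\chi_p\rangle \langle\chi_p \phi_j|\hat{w}|\phi_j \phi_k\rangle - \sum_j \lambda_{jk} |\phi_j\rangle .
\end{aligned}
\] (.)
Here, we used
\[ 1 = \sum_p |\chi_p\rangle \langle\chi_p| . \] (.)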
We now get rid of λ, replacing it with a diagonal matrix with diagonal elements є k (not to be confused with
the small parameter є above, which we now are done with.)
The determinant ∣Φ⟩ is invariant (up to an irrelevant phase) under a unitary mixing of the single-
particle functions, i.e, if we let
ϕ̃ k = ∑ ϕ j U jk (.)
j
with U a unitary matrix, then ∣Φ̃⟩ = det(U) ∣Φ⟩, i.e., the same state, and clearly the energy must be the
same too.
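The relation ∣Φ̃⟩ = det(U)∣Φ⟩ can be illustrated numerically by evaluating a Slater determinant at a set of sample points. In the sketch below, the matrix `Phi` holds hypothetical random values ϕ_j(x_i) (rows are points, columns are orbitals), which stand in for actual single-particle functions; the normalization factor of the determinant is omitted since it is the same on both sides.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4

# Hypothetical orbital values phi_j(x_i) at N sample points
Phi = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

# A random unitary matrix from the QR decomposition of a complex Gaussian matrix
U, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))

# Mixed orbitals: phi_tilde_k = sum_j phi_j U_jk
Phi_tilde = Phi @ U

lhs = np.linalg.det(Phi_tilde)                 # determinant built from the mixed orbitals
rhs = np.linalg.det(U) * np.linalg.det(Phi)    # det(U) times the original determinant

print("|det U|            :", abs(np.linalg.det(U)))
print("difference lhs-rhs :", abs(lhs - rhs))
```

Since det(U) has modulus one, the two determinants differ only by a phase.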
As argued, λ i j = λ∗ji can be assumed Hermitian. Select therefore U such that λ = U EU H , with E jk =
δ jk є k the elements of a diagonal matrix (the eigenvalues of λ):
λ ji = ∑ U jℓ є ℓ U i∗ℓ . (.)
ℓ
\[ \sum_k |r_k\rangle U_{ki} = 0 . \] (.)
Since U is unitary, Eq. (.) is satisfied for all k if and only if Eq. (.) is satisfied for all i. Computing the
sum in Eq. (.) (see Exercise .) we obtain
The operator v̂ direct is a one-body operator. When acting on a one-body function ∣ψ⟩ it produces a new
one-body function, which at x takes the value
The operator v̂ exchange is still linear when acting on ∣ψ⟩, it is just not interpretable as a local potential.
If we introduce the reduced one-particle density matrix γ(x , x ) as
we can express
\[ \langle x_1|\hat{v}_{\mathrm{direct}}|\psi\rangle = \psi(x_1) \int \gamma(x_2, x_2) \, w(x_1, x_2) \, dx_2 . \] (.)
The equation
fˆ(ϕ , ⋯, ϕ N ) ∣ϕ p ⟩ = є p ∣ϕ p ⟩ . (.)
with the Fock operator
fˆ(ϕ , ⋯, ϕ N ) = ĥ + v̂ direct − v̂ exchange , (.)
is referred to as the canonical Hartree–Fock equations, and the solutions are called the canonical single-
particle functions.
The first N HF single-particle functions ϕ i are often called occupied, while the rest, ϕ a , are often called
virtual single-particle functions.
We now show an interesting relation for the Hartree–Fock energy. It is tempting to assume that E_HF =
∑_i є_i. However, this is not the case.
Theorem . (Energy expression for Hartree–Fock). Assume that a solution (ϕ_i, є_i), i = 1, ⋯, N, to the
canonical Hartree–Fock equations has been found. Then, the Hartree–Fock energy is given by
\[ E_{\mathrm{HF}} = \sum_i \epsilon_i - \frac{1}{2} \sum_{ij} \langle\phi_i \phi_j|\hat{w}|\phi_i \phi_j - \phi_j \phi_i\rangle . \] (.)
It happens that fˆ has a continuous spectrum, so our statement must really be limited to finite-dimensional one-particle spaces
Proof. Multiply the HF equation from the left by ⟨ϕ i ∣ and sum over i to obtain
∑ є i = ∑ ⟨ϕ i ∣ĥ∣ϕ i ⟩ + ∑ ⟨ϕ i ϕ j ∣ŵ∣ϕ i ϕ j − ϕ j ϕ i ⟩ . (.)
i i ij
We see that the interaction is double counted compared to Eq. (.), and we are finished.
Exercise .. Suppose the HF single-particle functions have been found, so that the Fock operator fˆ is a
fixed operator. Prove that it is Hermitian, i.e., for any two single-particle functions ψ(x) and ψ ′ (x),
⟨ψ∣ fˆ∣ψ ′ ⟩ = [⟨ψ ′ ∣ fˆ∣ψ⟩]∗ .
△
Exercise .. We show that the reduced one-particle density matrix is the same for canonical and non-
canonical orbitals: Let U be a unitary matrix and define
ϕ̃ i = ∑ ϕ j U ji . (.)
j
Show that
γ(x, x ′ ) = ∑ ϕ̃ j (x)ϕ̃ j (x ′ )∗ . (.)
j
What can you conclude about v̂ direct and v̂ exchange , which are functions of γ? △
Exercise .. In this exercise, we fill in the details between Eq. (.) and Eq. (.) in the proof of Theo-
rem ..
a) Verify that
∑ U k i ĥ ∣ϕ k ⟩ = ĥ ∣ϕ̃ i ⟩ . (.)
k
d) Show that
∑ U k i ∑ ⟨ ⋅ ϕ j ∣ŵ∣ϕ k ϕ j ⟩ = ∑ ⟨ ⋅ ϕ̃ j ∣ŵ∣ϕ̃ i ϕ̃ j ⟩ . (.)
k j j
You may do the transformations of the various ϕ ℓ into ϕ̃ ℓ using c), or use Exercise ..
e) Show that
∑ U k i ∑ ⟨ ⋅ ϕ j ∣ŵ∣ϕ j ϕ k ⟩ = ∑ ⟨ ⋅ ϕ̃ j ∣ŵ∣ϕ̃ j ϕ̃ i ⟩ . (.)
k j j
f) Gather the results of a), b), d), and e), to show that Eq. (.) becomes Eq. (.).
△
.. The Hartree–Fock equations in a given basis: the Roothaan–Hall equations
How do we solve the HF equations (.)? In this section, we reformulate the HF equations relative to a
fixed basis, {χ_p}, p = 1, ⋯, L. For practical reasons, of course, the basis must have a finite size L. However, we do
not assume that it is orthonormal. Thus, we have a possibly non-diagonal overlap matrix S of size L × L,
\[ S_{pq} \equiv \langle\chi_p|\chi_q\rangle , \] (.)
and we must have that S⁻¹ exists since the χ_p form a basis.
Such basis functions are common in quantum chemistry, where a non-orthogonal basis of Gaussian
functions centered on the atoms is typically employed. See for example Szabo/Ostlund or Helgaker/Jørgensen/Olsen
for details. For now, we just keep this remark as a motivation for not assuming orthogonality. In nuclear
physics or solid state physics, orthogonal functions χ p are more typical.
We expand our HF functions as
∣ϕ p ⟩ = ∑ ∣χ q ⟩ U q p , (.)
q
where U is in general not a unitary matrix, since the basis is not orthogonal. (However, we have U^H S U = I,
the identity matrix, see Exercise ..) We notice that the columns of U are the basis expansions of each ϕ_p.
We write u_p for column number p, ∣ϕ_p⟩ = ∑_q ∣χ_q⟩ (u_p)_q.
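The reduced density matrix becomes
\[
\begin{aligned}
\gamma(x, x') &= \sum_i \langle x|\phi_i\rangle \langle\phi_i|x'\rangle = \sum_i \sum_{pq} U_{qi} \langle x|\chi_q\rangle \langle\chi_p|x'\rangle U_{pi}^* \\
&= \sum_{pq} \Big( \sum_i U_{qi} U_{pi}^* \Big) \langle x|\chi_q\rangle \langle\chi_p|x'\rangle = \sum_{pq} (U_{:N} U_{:N}^H)_{qp} \langle x|\chi_q\rangle \langle\chi_p|x'\rangle ,
\end{aligned}
\] (.)
and it makes sense to define
\[ D = U_{:N} U_{:N}^H = \sum_i u_i u_i^H , \] (.)
which we interpret as the reduced density matrix relative to the given basis {χ_p}, depending on the N first
columns of U only.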
We now demonstrate how the canonical HF equations (.) can be written
\[ F(D) U = S U \epsilon , \] (.)
where
\[ F_{pq} = \langle\chi_p|\hat{f}(\phi_1, \cdots, \phi_N)|\chi_q\rangle \] (.)
are the matrix elements of the Fock operator in the fixed basis, and where є = diag(є_1, ⋯, є_L) is a diagonal
matrix. Equation (.) is a nonlinear generalized eigenvalue problem.
Let us look at the matrix elements of f ,
Correspondingly,
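We obtain
\[ F_{qp} = \langle\chi_q|\hat{h}|\chi_p\rangle + \sum_{p'q'} D_{p'q'} \left( \langle\chi_q \chi_{q'}|\hat{w}|\chi_p \chi_{p'}\rangle - \langle\chi_q \chi_{q'}|\hat{w}|\chi_{p'} \chi_p\rangle \right) . \] (.)
Gathering, we find
\[ F(D) U = S U \epsilon , \] (.)
and we are finished. This equation is called the Roothaan–Hall equation.
In terms of each column, i.e., each ϕ_p,
\[ F(D) u_p = \epsilon_p S u_p . \] (.)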
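Since F depends on D, and D on U, the equation is usually attacked by iteration. The following is a minimal sketch of such a fixed-point (self-consistent field) iteration, assuming the integrals are available as arrays `h[q, p]`, `S[q, p]` and `w[q, q', p, p']` = ⟨χ_q χ_q′∣ŵ∣χ_p χ_p′⟩. All function names are hypothetical, the generalized eigenproblem is solved by symmetric orthogonalization with S^(−1/2), which is one common choice among several, and plain iteration is not guaranteed to converge in general.

```python
import numpy as np

def density_matrix(U, N):
    """D = U[:, :N] U[:, :N]^H, built from the N occupied columns of U."""
    Uocc = U[:, :N]
    return Uocc @ Uocc.conj().T

def fock_matrix(h, w, D):
    """F_qp = h_qp + sum_{p'q'} D_{p'q'} (w[q,q',p,p'] - w[q,q',p',p])."""
    direct = np.einsum("ab,qbpa->qp", D, w)      # a = p', b = q'
    exchange = np.einsum("ab,qbap->qp", D, w)
    return h + direct - exchange

def solve_generalized(F, S):
    """Solve F U = S U eps via X = S^(-1/2); returns (eps, U) with U^H S U = I."""
    s, V = np.linalg.eigh(S)
    X = V @ np.diag(s ** -0.5) @ V.conj().T
    eps, C = np.linalg.eigh(X.conj().T @ F @ X)
    return eps, X @ C

def scf(h, w, S, N, n_iter=100, tol=1e-10):
    """Plain fixed-point iteration on the Roothaan-Hall equation."""
    eps, U = solve_generalized(h, S)             # initial guess: ignore the interaction
    for _ in range(n_iter):
        F = fock_matrix(h, w, density_matrix(U, N))
        eps_new, U = solve_generalized(F, S)
        converged = np.max(np.abs(eps_new - eps)) < tol
        eps = eps_new
        if converged:
            break
    return eps, U

# Toy example with hypothetical random integrals (L = 4 basis functions, N = 2)
rng = np.random.default_rng(5)
L, N = 4, 2
A = rng.standard_normal((L, L))
S = A @ A.T + L * np.eye(L)                      # positive definite overlap matrix
B = rng.standard_normal((L, L))
h = (B + B.T) / 2
R = 0.05 * rng.standard_normal((L, L, L, L))
w = R + R.transpose(2, 3, 0, 1)                  # Hermitian symmetry
w = w + w.transpose(1, 0, 3, 2)                  # particle-exchange symmetry
eps, U = scf(h, w, S, N)
print("orbital energies:", eps)
```

Production codes typically add convergence acceleration, such as damping or DIIS, on top of this basic loop.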
∣ϕ p ⟩ ∑ S q p ∣χ q ⟩ , S q p = ⟨χ q ∣χ p ⟩ . (.)
q
.. Basis expansions in HF single-particle functions
We have now established the canonical Hartree–Fock single-particle functions, which can be used as a
basis just like any other orthonormal basis. Each canonical ϕ p is associated with a creation operator c †p ,
and in terms of the original basis {χ p } we have for a two-body operator
and
\[ \langle p|\hat{h}|q\rangle = h^p_q = \langle\phi_p|\hat{h}|\phi_q\rangle = \sum_{p'q'} U_{p'p}^* U_{q'q} \langle\chi_{p'}|\hat{h}|\chi_{q'}\rangle , \] (.)
and similarly for any one-body operator. In a situation where HF single-particle functions are used in,
say, a CI program, the matrix elements ⟨χ p′ χ q′ ∣ŵ∣χ r ′ χ s ′ ⟩(AS) and ⟨χ p′ ∣ĥ∣χ q′ ⟩ will be produced by external
codes. This is especially true in chemistry, where the computation of matrix elements is a business on its
own.
In quantum chemistry, it is standard to start with the HF single-particle functions and perform cor-
rections on top of that, such as CISD, giving rise to the term “post-Hartree–Fock methods”.
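It is convenient to write the Hamiltonian in the following form
\[ \hat{H} = \hat{H}_0 + \hat{W} = \hat{F} + \hat{U} , \] (.)
Here,
\[ \hat{V}_{\mathrm{direct}} = \sum_i \hat{v}_{\mathrm{direct}}(i) , \quad \hat{V}_{\mathrm{exchange}} = \sum_i \hat{v}_{\mathrm{exchange}}(i) . \] (.)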
The fluctuation potential is so named, because if one considers the HF solution as a reference ∣Φ⟩ (and
now we drop the “HF” subscript), “most” of the interactions between the particles in ∣Φ⟩ are described by
the Fock operator, and Û should be “small”: after all, we have chosen the HF state such that it contains as
much of the interaction energy as possible, by minimizing the energy over all possible determinants. Thus,
the exact wavefunction ∣Ψ⟩ = ∣Φ⟩ + δ∣Ψ⟩ consists of “small fluctuations” on top of ∣Φ⟩ caused by Û.
An expression for the direct potential operator matrix element is
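This results in (using antisymmetrized matrix elements (.))
\[ \hat{F} = \hat{H}_0 + \sum_{pq} \sum_i \langle qi|\hat{w}|pi\rangle_{\mathrm{AS}} \, c_q^\dagger c_p \] (.)
\[ \hat{U} = \hat{W} - \sum_{pq} \sum_i \langle qi|\hat{w}|pi\rangle_{\mathrm{AS}} \, c_q^\dagger c_p . \] (.)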
Having dealt with the second-quantized form of the Hartree–Fock partitioned Hamiltonian, let us turn
to the Slater determinants. Since the c†_p are creation operators for the canonical HF single-particle functions,
a basis of Slater determinants can be taken to be the ∣p_1 ⋯ p_N⟩, with p_1 < p_2 < ⋯ < p_N. Alternatively,
we can use the quasiparticle picture, and let the HF function be the reference,
etc, where we have introduced the quasiparticle creation- and annihilation operators.
All the determinants ∣p_1, ⋯, p_N⟩ are eigenfunctions of F̂,
\[ \hat{F} |p_1, \cdots, p_N\rangle = \Big( \sum_i \epsilon_{p_i} \Big) |p_1, \cdots, p_N\rangle , \] (.)
Proof. Assume that the HF equations are satisfied. Since fˆ is Hermitian, the single-particle basis functions
are orthonormal. The Fock matrix becomes diagonal,
\[ f^p_q = h^p_q + \sum_j \langle pj|\hat{w}|qj\rangle = \delta_{pq} \epsilon_q . \] (.)
In particular,
\[ f^a_i = h^a_i + \sum_j \langle aj|\hat{w}|ij\rangle = 0 . \] (.)
But this is precisely (see the Slater–Condon rules from Exercise .) the expression for ⟨Φ_i^a∣Ĥ∣Φ⟩, which
therefore must vanish for all i, a.
The converse of Brillouin’s theorem is also true, in the sense that f^a_i = 0 is equivalent to the non-canonical
HF equations. Recall that the HF state is the same for the non-canonical and canonical single-particle
functions.
Theorem . (Converse of Brillouin’s Theorem). Let a single-particle basis be given. This basis satisfies f^a_i = 0 for all i, a
if and only if the non-canonical HF equations are satisfied for the occupied ϕ_i, i = 1, ⋯, N.
Proof. Since f^i_a = (f^a_i)∗ = 0,
\[ \hat{f} |\phi_i\rangle = \sum_j \lambda_{ji} |\phi_j\rangle . \] (.)
Forming the inner product with ϕ_j, we obtain f^j_i = λ_ji, and Eq. (.) is satisfied.
Because of Brillouin’s Theorem, a configuration-interaction treatment with only singles (CIS) yields no
correction over the HF treatment alone, and we have to go to doubles.
in suitable units. The electrons are thus described by a one-body Hamiltonian Ĥ_0 = ∑_i ĥ(i),
\[ \hat{h}(\vec{r}) = -\frac{1}{2} \nabla^2 + v(\vec{r}) , \] (.)
where v(⃗r ) is an external electrostatic potential, such as the one set up by an atomic nucleus. The operator
ĥ does not couple to electron spin, so that the single-particle eigenfunctions of ĥ separate,
where α = ±1/2 is the value of the projection of the electron spin along the z-axis. Also, σ = ±1/2, and
⟨χ_α∣χ_β⟩ = δ_αβ. The eigenvalue problem of ĥ(⃗r) becomes
where the eigenvalue e_p is seen to be doubly degenerate due to spin. The N-electron ground-state of Ĥ_0 is
now given by the Slater determinant with the N first eigensolutions ϕ_(p,σ) occupied. Assuming N even, we
get
\[ |\Phi\rangle = |\phi_{1,\frac{1}{2}} \, \phi_{1,-\frac{1}{2}} \cdots \phi_{\frac{N}{2},\frac{1}{2}} \, \phi_{\frac{N}{2},-\frac{1}{2}}\rangle . \] (.)
(If N is odd, the ground-state is doubly degenerate, with an electron occupying ϕ_{⌊N/2⌋+1,α}, for α = +1/2 or
α = −1/2.) A common notation is
where we have introduced a special notation for the spatial matrix element. Similarly,
Proof. (Optional reading.) Optimization of RHF energy, and RHF equations: Introducing Lagrange mul-
tipliers for the orthonormality constraints, we obtain a Lagrangian
A unitary transformation similar to the one for the HF equations allows us to replace λ by a diagonal matrix,
finally obtaining
\[ [\hat{h} + \hat{v}_{\mathrm{Coulomb}} - \hat{v}_{\mathrm{exchange}}] \varphi_i(\vec{r}) = \epsilon_i \varphi_i(\vec{r}) , \quad i = 1, \cdots, N/2 , \] (.)
with
v̂ Coulomb (⃗r ) = ∫ γ(⃗r ′ , ⃗r ′ ) d⃗r ′ (.)
∣⃗r − ⃗r ′ ∣
being a local potential, and where
[v̂ exchange ψ](⃗r ) = ′
∫ γ(⃗r , ⃗r ) ψ(⃗r ′ ) d⃗r ′ (.)
∣⃗r − ⃗r ′ ∣
The proof of Eq. (.) is obtained by taking the inner product of Eq. (.) with φ i and summing over i,
then multiplying with .
Exercise .. In this exercise, we prove Theorem . (To be filled in.) △
Thus, the orbital carries a spin-index as well as a space index, compare with the RHF model. The UHF
state can be written
/ −/ / −/ / −/
∣ΦUHF ⟩ = ∣φ φ̄ φ φ̄ ⋯φ N/ φ̄ N/ ⟩ , (.)
compare with Eq. (??) Notice that the spin-orbitals are still orthogonal for different spins. Notice also that
the general HF model is more general than UHF: there, each spin-orbital was not required to separate into
a product of space and spin functions.
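The UHF energy expectation value is (see Exercise .)
\[ E_{\mathrm{UHF}} = \sum_\alpha \sum_{i=1}^{N/2} (\varphi_i^\alpha|\hat{h}|\varphi_i^\alpha) + \frac{1}{2} \sum_{\alpha\beta} \sum_{ij} (\varphi_i^\alpha \varphi_j^\beta|\hat{w}|\varphi_i^\alpha \varphi_j^\beta) - \frac{1}{2} \sum_\alpha \sum_{ij} (\varphi_i^\alpha \varphi_j^\alpha|\hat{w}|\varphi_j^\alpha \varphi_i^\alpha) , \] (.)
where we note that each spin-orbital is not doubly degenerate anymore. We introduce the UHF Coulomb
potential,
\[ v_{\mathrm{Coulomb}}(\vec{r}) = \int \sum_{j\beta} \frac{|\varphi_j^\beta(\vec{r}')|^2}{|\vec{r} - \vec{r}'|} \, d\vec{r}' , \] (.)
and the UHF exchange potential operator
\[ [\hat{v}_{\alpha,\mathrm{exchange}} \psi](\vec{r}) = \int \sum_j \frac{\varphi_j^\alpha(\vec{r}) \, \varphi_j^\alpha(\vec{r}')^* \, \psi(\vec{r}')}{|\vec{r} - \vec{r}'|} \, d\vec{r}' , \] (.)
to obtain
\[ [\hat{h} + \hat{v}_{\mathrm{Coulomb}} - \hat{v}_{\alpha,\mathrm{exchange}}] \varphi_i^\alpha(\vec{r}) = \epsilon_i^\alpha \varphi_i^\alpha(\vec{r}) . \] (.)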
Each of the operators with subscript “N” is thus normal-ordered with respect to quasiparticle vacuum,
thereby simplifying many formulas and manipulations.
Suppose now our single-particle basis is the HF basis. Looking carefully at the above equations, and
recalling the operator N( ⋅ ) is defined for linear combinations of strings, we recognize that
Thus, using HF orbitals, the normal-ordered Hamiltonian takes on a particularly simple form:
where we recall that the normal-ordering operator is relative to the quasiparticle vacuum. Here, the quasiparticle
reference is the HF state ∣Ψ_HF⟩ = ∣Φ⟩. Recall that the normal-ordering operator is defined for linear
combinations of strings,
\[ N(\hat{F}) = N\Big( \sum_{pq} f^p_q \, c_p^\dagger c_q \Big) = \sum_{pq} f^p_q \, N(c_p^\dagger c_q) . \] (.)
But beware! In general, N(Ĥ_0) ≠ Ĥ_{0,N}! The operator Ĥ_{0,N} depends on the whole Hamiltonian, i.e., also
the two-body interaction. It is just that in the particular case of the HF partitioning of the Hamiltonian,
N(F̂) = F̂_N.
We now also use the fact that F̂ is diagonal in the HF basis,
\[ \hat{F} = \sum_p \epsilon_p \, c_p^\dagger c_p . \] (.)
Exercise .. Set up the CISD formalism using L Hartree–Fock orbitals. Use the normal-ordered Hamiltonian.
Compute the matrix elements $\langle\Phi|\hat H|\Phi\rangle$, $\langle\Phi_i^a|\hat H|\Phi\rangle$, $\langle\Phi_{ij}^{ab}|\hat H|\Phi\rangle$, $\langle\Phi_i^a|\hat H|\Phi_k^c\rangle$, $\langle\Phi_i^a|\hat H|\Phi_{kl}^{cd}\rangle$, and
$\langle\Phi_{ij}^{ab}|\hat H|\Phi_{kl}^{cd}\rangle$. Use Wick’s Theorem for quasiparticle operators to achieve this. (One could also use the
Slater–Condon rules, but this exercise is about quasiparticles and normal-ordered operators.) △
The idea is to partition the Hamiltonian into a part $\hat H_0$ that we can “solve” and a perturbation $\hat V$,
\[ \hat H = \hat H_0 + \hat V. \]
The operator $\hat H_0$ is “solved” in the sense that we assume knowledge of all its eigenfunctions and eigenvalues,
\[ \hat H_0 |\Phi_k\rangle = \epsilon_k |\Phi_k\rangle. \]
The set $\{|\Phi_k\rangle\}$ is assumed to be an orthonormal basis for Hilbert space (this is true for all finite-dimensional
cases, and for many infinite-dimensional ones). We should, in principle, be able to express the exact eigenvectors
(and therefore the eigenvalues) in terms of this basis and $\hat V$.
In perturbation theory, we seek such an expression in terms of power series in the perturbation $\hat V$. We
introduce an order parameter $\lambda$ and write
\[ \hat H_\lambda = \hat H_0 + \lambda\hat V, \]
i.e., $\hat H_1 = \hat H$ is the full Hamiltonian. It is not unreasonable to assume that the eigenvalues and eigenvectors
of $\hat H_\lambda$ become smooth functions of $\lambda$, at least for $\lambda$ sufficiently small and/or sufficiently weak perturbations
$\hat V$.
The Schrödinger equation for $\hat H_\lambda$ reads
\[ \hat H_\lambda |\Psi_k(\lambda)\rangle = E_k(\lambda)\, |\Psi_k(\lambda)\rangle. \]
We now assume that we can expand the eigenvectors and eigenvalues in power series around $\lambda = 0$,
\[ |\Psi_k(\lambda)\rangle = \sum_{n=0}^{\infty} |\Psi_k^{(n)}\rangle\, \lambda^n, \]
\[ E_k(\lambda) = \sum_{n=0}^{\infty} E_k^{(n)}\, \lambda^n. \]
The unperturbed problem is obtained at $\lambda = 0$: $|\Psi_k(0)\rangle = |\Phi_k\rangle = |\Psi_k^{(0)}\rangle$ and $E_k(0) = \epsilon_k$, and the full
problem at $\lambda = 1$: $|\Psi_k(1)\rangle = |\Psi_k\rangle$, and $E_k(1) = E_k$.
The assumption that $|\Psi_k(\lambda)\rangle$ is differentiable at $\lambda = 0$ is assured by requiring $\epsilon_k$ to be non-degenerate.
We now derive formulas for the perturbation corrections $E_k^{(n)}$ and $|\Psi_k^{(n)}\rangle$. This is done by plugging
Eqs. (.) into the Schrödinger equation. This gives
\[ (\hat H_0 + \lambda\hat V)\sum_{n=0}^{\infty} |\Psi_k^{(n)}\rangle\, \lambda^n = \Big(\sum_{n=0}^{\infty} E_k^{(n)} \lambda^n\Big)\sum_{m=0}^{\infty} |\Psi_k^{(m)}\rangle\, \lambda^m. \]
For this equation to hold for all $\lambda$, it must hold order-by-order. The $\lambda^0$-part of the equation is simply
Eq. (.). The $n$'th order equation is
\[ \hat H_0 |\Psi_k^{(n)}\rangle + \hat V |\Psi_k^{(n-1)}\rangle = \sum_{j=0}^{n} E_k^{(j)}\, |\Psi_k^{(n-j)}\rangle, \qquad n > 0. \]
The solution $|\Psi_k(\lambda)\rangle$ to the Schrödinger equation is not unique: scaling it gives a new solution.
Thus, in order to write $|\Psi_k(\lambda)\rangle$ as a smooth function of $\lambda$, we need to select one particular normalization
for each $\lambda$. We obtain particularly simple expressions using intermediate normalization:
\[ \langle\Phi_k|\Psi_k(\lambda)\rangle = 1. \]
Since this expression is to hold for all $\lambda$, it must hold order-by-order, which gives
\[ \langle\Phi_k|\Psi_k^{(n)}\rangle = 0, \qquad \forall n \geq 1, \]
i.e., all the higher-order corrections are orthogonal to the unperturbed vector $|\Phi_k\rangle$.
We now use Eq. (.) and project Eq. (.) onto $|\Phi_k\rangle$ to obtain
\[ \langle\Phi_k|\hat V|\Psi_k^{(n-1)}\rangle = E_k^{(n)}, \]
which is an expression for the $n$-th order energy perturbation in terms of the $(n-1)$-th order correction in
the wavefunction. In particular,
\[ E_k^{(1)} = \langle\Phi_k|\hat V|\Phi_k\rangle. \]
If we can find an expression for $|\Psi_k^{(n)}\rangle$ in terms of $|\Psi_k^{(j)}\rangle$, $j < n$, then we have a recursive procedure for
determining all the perturbation corrections.
To this end, rearrange Eq. (.) as
\[ (\epsilon_k - \hat H_0)\, |\Psi_k^{(n)}\rangle = \hat V |\Psi_k^{(n-1)}\rangle - \sum_{j=0}^{n-1} E_k^{(n-j)}\, |\Psi_k^{(j)}\rangle. \]
On the right-hand side we only have wavefunction corrections of order less than $n$. We also know that
$E_k^{(n)}$, which occurs on the right-hand side, is a function of $|\Psi_k^{(n-1)}\rangle$, so if we can somehow invert $\epsilon_k - \hat H_0$
then we have an expression for $|\Psi_k^{(n)}\rangle$ in terms of lower-order corrections only.
Let $\hat P = |\Phi_k\rangle\langle\Phi_k|$, the projection operator onto the unperturbed eigenvector. Let $\hat Q = 1 - \hat P$, which is
then the projector onto the subspace spanned by all the other $|\Phi_j\rangle$, $j \neq k$:
\[ \hat Q = \sum_{j \neq k} |\Phi_j\rangle\langle\Phi_j|. \]
Moreover,
\[ [\hat H_0, \hat Q] = 0, \]
since the $|\Phi_j\rangle$ are eigenfunctions of $\hat H_0$. Acting on Eq. (.) with $\hat Q$ we then obtain
\[ (\epsilon_k - \hat H_0)\,\hat Q\, |\Psi_k^{(n)}\rangle = \hat Q \hat V |\Psi_k^{(n-1)}\rangle - \sum_{j=1}^{n-1} E_k^{(n-j)}\, \hat Q\, |\Psi_k^{(j)}\rangle, \]
where we remark that the $j = 0$ term from the sum on the right-hand side is eliminated. We have
i.e., the operator acts only within the space orthogonal to $|\Phi_k\rangle$. (Here, we use the non-degeneracy assumption.)
Define the operator
\[ \hat R = \hat Q \hat R \hat Q = \sum_{j,\, j \neq k} \frac{|\Phi_j\rangle\langle\Phi_j|}{\epsilon_k - \epsilon_j}. \]
It is important to note that we must here assume that the unperturbed eigenvalue $\epsilon_k$ is non-degenerate.
Otherwise there are infinite terms in the sum. Now, for every $|u\rangle = \hat Q|u\rangle$ (such as $|\Psi_k^{(n)}\rangle$) we have
$\hat R(\epsilon_k - \hat H_0)|u\rangle = |u\rangle$, which motivates the notation
\[ \hat R = \frac{\hat Q}{\epsilon_k - \hat H_0}, \]
even though the fraction notation for matrices and operators is something to be careful with. $\hat R$ is also
called the resolvent of $\hat H_0$. Acting with $\hat R$ on Eq. (.), we obtain
\[ |\Psi_k^{(n)}\rangle = \frac{\hat Q}{\epsilon_k - \hat H_0}\Big[\hat V |\Psi_k^{(n-1)}\rangle - \sum_{j=1}^{n-1} E_k^{(n-j)}\, |\Psi_k^{(j)}\rangle\Big]. \]
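We summarize as a theorem:
Theorem . (Non-degenerate Rayleigh–Schrödinger Perturbation Theory). Let $\hat H_\lambda = \hat H_0 + \lambda\hat V$ be given,
and assume that
\[ \hat H_0 |\Phi_k\rangle = \epsilon_k |\Phi_k\rangle, \]
where the eigenvectors form a complete basis. Let
\[ (\hat H_0 + \lambda\hat V)\, |\Psi_k(\lambda)\rangle = E_k(\lambda)\, |\Psi_k(\lambda)\rangle, \qquad \langle\Phi_k|\Psi_k(\lambda)\rangle = 1, \]
for a given $k$, and assume that $\epsilon_k$ is a non-degenerate eigenvalue of $\hat H_0$. Assume furthermore that $E_k(\lambda)$
and $|\Psi_k(\lambda)\rangle$ are analytic in a neighborhood of $\lambda = 0$,
\[ E_k(\lambda) = \sum_{n=0}^{\infty} E_k^{(n)} \lambda^n, \]
\[ |\Psi_k(\lambda)\rangle = \sum_{n=0}^{\infty} |\Psi_k^{(n)}\rangle\, \lambda^n. \]
Then the $n$-th order corrections are given recursively in terms of the corrections of order $j < n$ via the formulae
\[ E_k^{(n)} = \langle\Phi_k|\hat V|\Psi_k^{(n-1)}\rangle, \]
\[ |\Psi_k^{(n)}\rangle = \frac{\hat Q}{\epsilon_k - \hat H_0}\Big[\hat V |\Psi_k^{(n-1)}\rangle - \sum_{j=1}^{n-1} E_k^{(n-j)}\, |\Psi_k^{(j)}\rangle\Big]. \]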
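The recursion of the theorem can be sketched numerically for a small matrix problem. The following is a minimal illustration, assuming that $\hat H_0$ is already diagonal in the chosen basis and that $\hat V$ is a real symmetric matrix; the function name `rspt_corrections` and its arguments are hypothetical and not part of these notes.

```python
import numpy as np

def rspt_corrections(eps, V, k, max_order):
    """Return ([E^(1), ..., E^(max_order)], [Psi^(0), ..., Psi^(max_order)]) for state k.

    eps : unperturbed eigenvalues (H0 is assumed diagonal in its eigenbasis)
    V   : real symmetric perturbation matrix in the same basis
    """
    eps = np.asarray(eps, dtype=float)
    V = np.asarray(V, dtype=float)
    dim = eps.size
    # Resolvent R = sum_{j != k} |j><j| / (eps_k - eps_j); diagonal in this basis.
    r_diag = np.zeros(dim)
    mask = np.arange(dim) != k
    r_diag[mask] = 1.0 / (eps[k] - eps[mask])

    psi = [np.eye(dim)[k]]   # Psi^(0) = Phi_k (intermediate normalization)
    energies = []            # energies[n - 1] holds E^(n)
    for n in range(1, max_order + 1):
        # E^(n) = <Phi_k| V |Psi^(n-1)>
        energies.append(V[k] @ psi[n - 1])
        # Psi^(n) = R [ V Psi^(n-1) - sum_{j=1}^{n-1} E^(n-j) Psi^(j) ]
        rhs = V @ psi[n - 1]
        for j in range(1, n):
            rhs = rhs - energies[n - j - 1] * psi[j]
        psi.append(r_diag * rhs)
    return energies, psi

# Toy usage: a 2x2 problem with unperturbed levels -1 and +1 and coupling 0.3.
coupling = 0.3
E, _ = rspt_corrections([-1.0, 1.0], [[0.0, coupling], [coupling, 0.0]], k=0, max_order=4)
print(-1.0 + sum(E), -np.sqrt(1.0 + coupling**2))  # truncated series vs. exact eigenvalue
```

Because the resolvent is diagonal in the eigenbasis of $\hat H_0$, applying it reduces to an elementwise product, which keeps the sketch short. A non-degenerate $\epsilon_k$ is required, exactly as in the theorem.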
For $E^{(2)}$ we need the first-order wavefunction correction,
\[ |\Psi^{(1)}\rangle = \hat R \hat V |\Phi\rangle, \]
which then gives
\[ E^{(2)} = \langle\Phi|\hat V \hat R \hat V|\Phi\rangle = \sum_{j,\, j \neq k} \frac{|\langle\Phi_k|\hat V|\Phi_j\rangle|^2}{\epsilon_k - \epsilon_j}, \]
which is a familiar expression for second-order perturbation theory. For $E^{(3)}$ we need the second-order
wavefunction correction,
\[ |\Psi^{(2)}\rangle = \hat R\,[\hat V - \langle\Phi|\hat V|\Phi\rangle]\,\hat R \hat V |\Phi\rangle. \]
This gives
\[ E^{(3)} = \langle\Phi|\hat V \hat R\,[\hat V - \langle\Phi|\hat V|\Phi\rangle]\,\hat R \hat V|\Phi\rangle = \langle\Phi|\hat V \hat R \hat V \hat R \hat V|\Phi\rangle - \langle\Phi|\hat V|\Phi\rangle\, \langle\Phi|\hat V \hat R \hat R \hat V|\Phi\rangle. \]
Comparing $E^{(1)}$ and $E^{(2)}$, we notice a pattern, but for $E^{(3)}$ the pattern becomes more
complicated: there is a leading term of the form
\[ E^{(n)}_{\text{leading}} = \langle\Phi|\underbrace{\hat V \hat R \hat V \hat R \cdots \hat V}_{n \text{ factors } \hat V,\ n-1 \text{ factors } \hat R}|\Phi\rangle, \]
Exercise .. Derive the fourth and fifth order perturbation theory corrections to the energy from Theorem ..
Next, verify that the bracketing technique gives the correct answer. △
.. A two-state example
It is instructive to consider a two-state example, since we can diagonalize it exactly and obtain closed-form
expressions. The behavior of the perturbation series can then be considered.
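Let
\[ H_0 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad V = \begin{pmatrix} 0 & \epsilon \\ \epsilon & 0 \end{pmatrix}, \]
so that
\[ H(\lambda) = \begin{pmatrix} -1 & \lambda\epsilon \\ \lambda\epsilon & 1 \end{pmatrix}. \]
The exact eigenvalues of $H(\lambda)$ are given by the roots of the polynomial
\[ e^2 - 1 - (\lambda\epsilon)^2 = 0, \qquad \text{i.e.,} \qquad e_\pm(\lambda) = \pm\sqrt{1 + (\lambda\epsilon)^2}. \]
We obtain the Taylor series (with a few extra terms obtained by computer algebra)
\[ e_-(\lambda) = -1 - \tfrac{1}{2}(\lambda\epsilon)^2 + \tfrac{1}{8}(\lambda\epsilon)^4 - \tfrac{1}{16}(\lambda\epsilon)^6 + \tfrac{5}{128}(\lambda\epsilon)^8 + O(\lambda^{10}). \]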
A natural question arises: does the series converge? Does it converge for our desired parameter value
$\lambda = 1$? Well, our function has a branch-point singularity since it is a square-root function. The branch-point
singularity arises when the two roots coincide in the complex plane, at $\lambda = \pm i/\epsilon$. At these points the
eigenvalue functions are no longer analytic. The Taylor series only converges in a disc around $\lambda = 0$ that
does not contain the singularity. Thus, the Taylor series will only converge within the circle $|\lambda| < 1/|\epsilon|$, i.e.,
it will converge for $\lambda = 1$ only if $|\epsilon| < 1$.
Thus, we see directly that the strength of the perturbation may affect the convergence properties of the
perturbation series.
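The convergence radius can be illustrated with a few lines of code. The sketch below evaluates partial sums of the binomial series for $-\sqrt{1+x^2}$ with $x = \lambda\epsilon$; the function name `taylor_partial_sums` is a hypothetical helper, and the two sample values of $x$ are arbitrary choices on either side of $|x| = 1$.

```python
from math import sqrt

def taylor_partial_sums(x, n_terms):
    """Partial sums of -sqrt(1 + x^2) = -sum_m binom(1/2, m) x^(2m)."""
    coeff, total, sums = 1.0, 0.0, []
    for m in range(n_terms):
        total -= coeff * x ** (2 * m)
        sums.append(total)
        coeff *= (0.5 - m) / (m + 1)  # binom(1/2, m+1) obtained from binom(1/2, m)
    return sums

# x = lambda * epsilon evaluated at lambda = 1: one value inside, one outside the disc.
for x in (0.5, 2.0):
    sums = taylor_partial_sums(x, 30)
    print(x, -sqrt(1.0 + x * x), sums[4], sums[-1])
```

For $|x| < 1$ the partial sums settle on the exact eigenvalue, while for $|x| > 1$ they grow without bound even though the exact eigenvalue is perfectly finite.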
The points $\lambda = \pm i/\epsilon$ are called avoided crossings since, if the parameter $\lambda$ is real, it “narrowly misses” the
branch-point and hence an exact crossing. Often, in eigenvalue plots, one can see the function behaviour
$\pm[a + (\epsilon\lambda)^2]^{1/2}$, indicating an avoided crossing and hence a singularity located approximately at this $\lambda$-value.
For an $n$-state problem, each of the $n$ eigenvalues may collide with $n - 1$ eigenvalues (again for complex $\lambda$
in general), giving quite a lot of possible branch points, and thus many singularities. Determining whether
the RSPT series converges is thus a virtually impossible task for many-body calculations. Even so, a few terms
may give a good approximation.
Figure .: The eigenvalues $e_\pm$ of a two-state problem, as functions of the perturbation parameter $\lambda$.
We now consider the perturbation series of the two-state problem explicitly. We write $|0\rangle$ for the
unperturbed ground state, and $|1\rangle$ for the unperturbed excited state. We obtain
We also have
\[ \hat R = -\tfrac{1}{2}\, |1\rangle\langle 1|. \]
Perturbation terms for the energy:
\[ E^{(1)} = \langle 0|\hat V|0\rangle = 0. \]
\[ E^{(2)} = \langle 0|\hat V \hat R \hat V|0\rangle = -\tfrac{1}{2}\langle 0|\hat V|1\rangle\langle 1|\hat V|0\rangle = -\tfrac{1}{2}\epsilon^2. \]
For the higher order terms, we note that $\hat R \hat V \hat R = 0$. The third-order energy:
Exercise .. Prove that Use R̂ V̂ R̂ = . Compute E (n) , n = , , , for the two-state model, continuing the
above calculations. Use the bracketing technique to derive the terms. Verify that your calculations match
the terms in the Taylor expansion .. △
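\[ \hat H = \hat K + \hat L, \]
\[ \hat K = \sum_{i=1}^{N} \hat k(i), \]
\[ \hat L = \sum_{i=1}^{N} \hat\ell^{(1)}(i) + \sum_{i<j}^{N} \hat\ell^{(2)}(i, j). \]
We take $\hat K$ to be the zero-order Hamiltonian: we assume that $\hat k$ has been diagonalized, giving a complete
orthonormal set of single-particle functions,
\[ \hat K = \sum_p \kappa_p\, c_p^\dagger c_p, \]
\[ \hat L = \sum_{pq} \langle\phi_p|\hat\ell^{(1)}|\phi_q\rangle\, c_p^\dagger c_q + \frac{1}{4}\sum_{pqrs} \langle\phi_p\phi_q|\hat\ell^{(2)}|\phi_r\phi_s\rangle_{\text{AS}}\, c_p^\dagger c_q^\dagger c_s c_r. \]
We are considering RSPT for the ground-state wavefunction $|\Psi\rangle$. The corresponding unperturbed ground
state is $|\Phi\rangle = c_1^\dagger \cdots c_N^\dagger |-\rangle$, the ground state of $\hat K$. The unperturbed energy is
\[ \epsilon_0 = \langle\Phi|\hat K|\Phi\rangle = \sum_{i=1}^{N} \kappa_i. \]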
Let us introduce quasiparticle operators, and write $|\Phi_X\rangle$ for an arbitrary excited Slater determinant.
Thus $|\Phi_X\rangle$ is a Slater determinant with one particle-hole pair, two particle-hole pairs, etc. We write $\#X$ for the
number of particle-hole pairs in $|\Phi_X\rangle$. Thus, the complete Slater determinant basis can be constructed:
\[ |\Phi_X\rangle = b_{a_1}^\dagger b_{i_1}^\dagger \cdots b_{a_{\#X}}^\dagger b_{i_{\#X}}^\dagger\, |\Phi\rangle. \]
For the projector $\hat Q$, we get
\[ \hat Q = \sum_X |\Phi_X\rangle\langle\Phi_X|. \]
Note that the reference $|\Phi\rangle$ is excluded from this sum. The unperturbed energies are
\[ \epsilon_X = \langle\Phi_X|\hat K|\Phi_X\rangle = \sum_{i=1}^{N} \kappa_i + \sum_{j=1}^{\#X} (\kappa_{a_j} - \kappa_{i_j}), \]
We here remark that if $\hat k$ has degenerate eigenvalues, we will end up with a zero denominator for some
$X$. Hence the assumption of non-degeneracy. In fact, it is not enough to require $\hat k$ to have non-degenerate
eigenvalues: we must require that $\Delta\epsilon_X \neq 0$ for all $X$. This is a stronger requirement.
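Let us consider RSPT up to second order. Recall first the Slater–Condon rules:
\[ \langle\Phi_{ij}^{ab}|\hat L^{(2)}|\Phi\rangle = \langle\phi_a\phi_b|\hat\ell^{(2)}|\phi_i\phi_j\rangle_{\text{AS}}. \]
All other matrix elements involving $\hat L$ and $|\Phi\rangle$ vanish. We now obtain
\[ E^{(1)} = \langle\Phi|\hat L|\Phi\rangle = \sum_i \langle\phi_i|\hat\ell^{(1)}|\phi_i\rangle + \frac{1}{2}\sum_{ij} \langle\phi_i\phi_j|\hat\ell^{(2)}|\phi_i\phi_j\rangle_{\text{AS}}. \]
\[ E^{(2)} = \langle\Phi|\hat L \hat R \hat L|\Phi\rangle = \sum_X \frac{|\langle\Phi_X|\hat L|\Phi\rangle|^2}{\Delta\epsilon_X}. \]
Since $\hat L$ is at most a two-body operator, this series truncates at $\#X = 2$,
\[ E^{(2)} = \sum_{ia} \frac{|\langle\Phi_i^a|\hat L|\Phi\rangle|^2}{\kappa_i - \kappa_a} + \frac{1}{4}\sum_{ijab} \frac{|\langle\Phi_{ij}^{ab}|\hat L^{(2)}|\Phi\rangle|^2}{\kappa_i + \kappa_j - \kappa_a - \kappa_b}. \]
The prefactor $1/4$ comes from over-counting the double excitations. We note that only the two-body part of $\hat L$
contributes in the second sum due to the Slater–Condon rules.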
Higher-order corrections quickly become more complicated. We shall see that using the Hamiltonian
in normal-ordered form simplifies matters considerably. Moreover, in Møller–Plesset perturbation theory, where
we use $\hat K = \hat F$, the Fock operator, we get even more simplifications due to Brillouin’s Theorem.
Exercise .. Write out E () in the same fashion as Eq. (.). △
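\[ \hat H = \hat F + \hat U, \]
where
\[ \hat F = \hat H_0 + \hat V_{\text{HF}} = \sum_{pq}\Big(h^p_q + \sum_i w^{pi}_{qi}\Big)\, c_p^\dagger c_q, \]
where $h^p_q = \langle\phi_p|\hat h|\phi_q\rangle$ and
\[ w^{pq}_{rs} \equiv \langle\phi_p\phi_q|\hat w|\phi_r\phi_s\rangle_{\text{AS}} \]
for brevity. Moreover,
\[ \hat U = \hat W - \hat V_{\text{HF}} = \frac{1}{4}\sum_{pqrs} w^{pq}_{rs}\, c_p^\dagger c_q^\dagger c_s c_r - \sum_{pq}\Big(\sum_i w^{pi}_{qi}\Big)\, c_p^\dagger c_q. \]
We have defined
\[ \hat V_{\text{HF}} = \hat V_{\text{direct}} - \hat V_{\text{exchange}} = \sum_{pq}\Big(\sum_i w^{pi}_{qi}\Big)\, c_p^\dagger c_q, \]
where
\[ X = \begin{pmatrix} a_1 & a_2 & \cdots & a_{\#X} \\ i_1 & i_2 & \cdots & i_{\#X} \end{pmatrix} \]
is the index of an excitation of order $\#X$.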
We obtain for the resolvent operator
\[ \hat R = \sum_X \frac{|\Phi_X\rangle\langle\Phi_X|}{\Delta\epsilon_X}, \qquad \Delta\epsilon_X = \sum_{j=1}^{\#X} (\epsilon_{i_j} - \epsilon_{a_j}). \]
The first-order energy is
where only the doubles-excitations $X = \begin{pmatrix} a & b \\ i & j \end{pmatrix}$ will contribute. To see this, note that
due to Brillouin’s Theorem and the HF equations. (It can also be shown directly from Eq. (.), see
Exercise ..) For $X$ being a higher than doubles-excitation, the Slater–Condon rules zero out any matrix
element. Thus,
\[ E^{(2)} = \frac{1}{4}\sum_{ij}\sum_{ab} \frac{|w^{ab}_{ij}|^2}{\epsilon_i + \epsilon_j - \epsilon_a - \epsilon_b}, \]
where we used
\[ \langle\Phi_{ij}^{ab}|\hat U|\Phi\rangle = \langle\Phi_{ij}^{ab}|\hat W|\Phi\rangle = w^{ab}_{ij}, \]
via the Slater–Condon rules, and the factor $1/4$ comes from double-counting the indices $ij$ and $ab$.
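The role of the factor $1/4$ can be checked on toy data. The sketch below assumes antisymmetrized integrals stored as an array `w[a, b, i, j]` and orbital energies split into occupied and virtual sets; the random numbers are purely illustrative and do not describe a physical system, and the name `mp2_energy` is hypothetical.

```python
import numpy as np

def mp2_energy(w_abij, eps_occ, eps_vir):
    """E^(2) = 1/4 sum_{ijab} |w^{ab}_{ij}|^2 / (e_i + e_j - e_a - e_b)."""
    denom = (eps_occ[None, None, :, None] + eps_occ[None, None, None, :]
             - eps_vir[:, None, None, None] - eps_vir[None, :, None, None])
    return 0.25 * np.sum(np.abs(w_abij) ** 2 / denom)

# Toy data (illustrative only): 4 occupied and 6 virtual spin-orbitals.
rng = np.random.default_rng(0)
n_occ, n_vir = 4, 6
eps_occ = np.sort(rng.uniform(-2.0, -1.0, n_occ))
eps_vir = np.sort(rng.uniform(1.0, 2.0, n_vir))
g = rng.normal(size=(n_vir, n_vir, n_occ, n_occ))
# Antisymmetrize in (a, b) and in (i, j), mimicking <ab|w|ij>_AS.
w = g - g.transpose(1, 0, 2, 3) - g.transpose(0, 1, 3, 2) + g.transpose(1, 0, 3, 2)

e2_full = mp2_energy(w, eps_occ, eps_vir)
# Same quantity summed over unique pairs i < j and a < b, without the factor 1/4.
e2_unique = sum(
    w[a, b, i, j] ** 2 / (eps_occ[i] + eps_occ[j] - eps_vir[a] - eps_vir[b])
    for a in range(n_vir) for b in range(a + 1, n_vir)
    for i in range(n_occ) for j in range(i + 1, n_occ)
)
print(e2_full, e2_unique)
```

The two numbers agree because each unique double excitation appears four times in the unrestricted sum, once for every ordering of $(i, j)$ and $(a, b)$.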
The third-order energy $E^{(3)}$ can be computed in a similar fashion, but this is much more tedious, so we
omit it. See Szabo and Ostlund for the full expression.
Exercise .. Show that $\langle\Phi_i^a|\hat U|\Phi\rangle = 0$ using the Slater–Condon rules and Eq. (.).
A remark to think about: The result does not depend on the HF equations being satisfied. △
Chapter
Thus, the total probability of finding any electron at the point $\vec r$ is
\[ \rho(\vec r) = \sum_{i=1}^{N} p_i(\vec r), \]
so that we obtain
ρ(⃗r ) = ⟨Ψ∣ρ̂(⃗r )∣Ψ⟩ . (.)
Suppose a complete set of spin orbitals $\varphi_p(\vec r)\chi_\sigma(\alpha)$ is given, with associated creation operators $c_{p\sigma}^\dagger$. We
note that
⟨ϕ pσ ∣δ(⃗r − ⃗r )∣ϕ qτ ⟩ = δ σ τ φ p (⃗r )∗ φ q (⃗r ). (.)
The second-quantized form of ρ̂ is therefore
which we interpret as an operator that creates a particle at the space-spin point $(\vec r, \sigma)$. Similarly
and
\[ \{\psi_\sigma(\vec r), \psi_\tau(\vec r\,')\} = 0, \qquad \{\psi_\sigma^\dagger(\vec r), \psi_\tau^\dagger(\vec r\,')\} = 0. \]
We note that for any one-body operator â(), the second-quantized operator can be written
This can be shown by inserting the definitions of the field operators. The notation implies that the operator
â is to be multiplied with the field annihilation operator.
For a local potential v(⃗r ) this simplifies to
Also,
⟨Ψ∣V̂ ∣Ψ⟩ = ∫ v(⃗r )n(⃗r ) d⃗r , (.)
which is identical to the potential energy of a classical system with density n(⃗r ).
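Similarly, for any two-body operator $\hat b$, the second-quantized operator can be written
\[ \hat B = \frac{1}{2}\sum_{\sigma\tau} \iint \psi_\sigma^\dagger(\vec r)\,\psi_\tau^\dagger(\vec s)\,[\hat b(\vec r, \sigma, \vec s, \tau)\,\psi_\tau(\vec s)\,\psi_\sigma(\vec r)]\, d\vec r\, d\vec s, \]
where it is to be understood that the operator $\hat b$ is to be multiplied with the field operators. For a local
potential $\hat b = w(\vec r, \vec s)$ this simplifies to
\[ \hat B = \frac{1}{2}\iint w(\vec r, \vec s) \sum_{\sigma\tau} \psi_\sigma^\dagger(\vec r)\,\psi_\tau^\dagger(\vec s)\,\psi_\tau(\vec s)\,\psi_\sigma(\vec r)\, d\vec r\, d\vec s. \]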
.. Plane-wave basis
In this section we discuss the jellium model: an infinite system of interacting electrons and a uniform
background charge, so that the system, on average, is neutral. This is also called the electron gas, and is a
first-approximation to, among other things, a metal. A surprising amount of insight can be obtained from
the jellium model, and we will here only scratch the surface.
The electron gas is an important theoretical model. It is useful as a model of metals, semiconductor
heterostructures, etc., and it is the theoretical foundation of density-functional theory (DFT), a very popular
computational technique in chemistry and solid-state physics.
From a many-body theoretical perspective, the electron gas is particularly interesting because it is an
example of a system where the HF equations can be solved analytically. It also displays divergent terms in
the perturbation series, another interesting phenomenon.
An infinite system is hard to treat mathematically. It is natural to start with a finite system and then
take a limit afterwards. Throw $N$ electrons in a box of sides $L$ and volume $\Omega = L^3$. The average density
is $\bar\rho = N/\Omega$, and we add a background charge $eN$ to balance the electron charge $-eN$. Smearing the
background charge uniformly gives a charge density $e\bar\rho$.
After having obtained the results for this box-truncated jellium, one then considers the thermodynamic
limit, sending $L \to +\infty$ and $N \to +\infty$ together, such that $\bar\rho$ is kept constant. Thus, $\bar\rho$ measures the “number”
of electrons in the thermodynamic limit.
(It is an exercise to show that the Fourier modes exp(i ⃗k ⋅ ⃗r ) are then periodic functions.) We use the
notation KL = {⃗k} for the set of wavenumbers.
The Fourier coefficients are given by
\[ \tilde f(\vec k) = \int_B f(\vec r)\, e^{-i\vec k\cdot\vec r}\, d\vec r, \]
\[ \hat t = -\frac{\hbar^2}{2m}\nabla^2, \]
whose eigenfunctions are, precisely, the Fourier modes. Define
\[ \varphi_{\vec k}(\vec r) = \frac{1}{\Omega^{1/2}}\, e^{i\vec k\cdot\vec r}, \]
and observe that these are orthonormal,
These plane-wave basis functions are very useful, since the kinetic energy is diagonal in this basis. Indeed,
the momentum operator of a single electron is
Exercise .. Prove that the plane-wave basis functions $\varphi_{\vec k}(\vec r)$ are orthonormal if and only if Eq. (.)
holds. △
Exercise .. Let $\vec k_i$, $i = 1, \dots, N$ be momentum vectors in $K_L$. Let $|\Phi\rangle = |(\vec k_1\alpha_1)\cdots(\vec k_N\alpha_N)\rangle$, a Slater
determinant.
Explain why $|\Phi\rangle$ is an eigenfunction of the total momentum operator $\hat{\vec P} = \sum_i \hat{\vec p}(i)$ and compute its
eigenvalue. Repeat for the total kinetic energy operator $\hat T = \sum_i \hat t(i)$. △
\[ \hat t\,\varphi_{\vec k}(\vec r) = \frac{\hbar^2 k^2}{2m}\,\varphi_{\vec k}(\vec r), \]
which gives the second-quantized kinetic energy
\[ \hat T = \sum_{\vec k \in K} \frac{\hbar^2 k^2}{2m} \sum_\alpha c_{\vec k,\alpha}^\dagger c_{\vec k,\alpha}. \]
The ground state of this Hamiltonian is the Slater determinant $|\Phi\rangle$ where the first $N/2$ lowest-energy
orbitals are doubly occupied. The energy becomes
\[ E = 2\sum_{\vec k \in U_{\text{occ}}} \frac{\hbar^2 k^2}{2m}, \]
where the summation extends over the occupied orbitals, denoted by the set $U_{\text{occ}} \subset K$.
The kinetic energy depends only on $k = |\vec k|$, and it is increasing in $k$. Thus, $U_{\text{occ}}$ is, for large $N$, approximately
a sphere with radius $k_F$.
Suppose we choose $N$ such that $U_{\text{occ}}$ consists of all the points $\vec k$ inside a sphere $U$ with radius $k_F$, i.e.,
we fill up with electrons having kinetic energy no larger than the Fermi energy
\[ \epsilon_F = \frac{\hbar^2 k_F^2}{2m}. \]
This gives for the number of electrons
\[ N = 2\sum_{\vec k \in U} 1 = 2\Big(\frac{L}{2\pi}\Big)^3 \sum_{2\pi\vec\kappa/L \in U} \Big(\frac{2\pi}{L}\Big)^3 \approx 2\Big(\frac{L}{2\pi}\Big)^3 \int_U d^3k, \]
where we have used the definition of the Riemann integral in the last equality: $\vec\kappa$ is a vector of integers,
and for large $L$ we see that $\vec k = 2\pi\vec\kappa/L$ lies on a grid with grid spacing $2\pi/L$ in each spatial direction. The
volume of each “cell” in $\vec k$-space is $(2\pi/L)^3$. The error in the integral approximation decays as an inverse power of $L$.
The integral computes the volume of the sphere $U$. This gives
\[ N \approx 2\Big(\frac{L}{2\pi}\Big)^3 \frac{4\pi}{3} k_F^3. \]
We observe that $N$ becomes proportional to $\Omega = L^3$. Dividing out,
\[ \bar\rho = \frac{1}{3\pi^2} k_F^3. \]
The error in this last equality vanishes as $L \to \infty$, so we safely ignore it. We note that $k_F$ can be used as a variable
to describe the non-interacting gas, equivalent to $\bar\rho$ via Eq. (.),
\[ \bar\rho = \frac{(2m)^{3/2}}{3\pi^2\hbar^3}\, \epsilon_F^{3/2}. \]
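The quality of the integral approximation can be checked by brute-force counting of grid points. This is a minimal sketch under the assumptions of the text (cubic box of side $L$, wavenumbers $2\pi\vec\kappa/L$, two electrons per occupied $\vec k$); the function name `count_electrons` and the sample box sizes are hypothetical.

```python
import numpy as np

def count_electrons(L, k_F):
    """N = 2 * #{k = 2*pi*kappa/L : |k| < k_F}, with kappa a vector of integers."""
    radius = k_F * L / (2 * np.pi)   # Fermi radius in units of the grid spacing
    n_max = int(np.floor(radius))
    kappa = np.arange(-n_max, n_max + 1)
    kx, ky, kz = np.meshgrid(kappa, kappa, kappa, indexing="ij")
    inside = kx**2 + ky**2 + kz**2 < radius**2
    return 2 * int(np.count_nonzero(inside))

k_F = 1.0
for L in (20.0, 60.0, 120.0):
    N = count_electrons(L, k_F)
    # Counted density N / L^3 versus the continuum value k_F^3 / (3 pi^2).
    print(L, N / L**3, k_F**3 / (3 * np.pi**2))
```

For small boxes the counted density fluctuates noticeably around the continuum value, which is the shell-filling effect hidden by the phrase "for large $N$" above.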
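We observe that the integral approximation argument is valid in more general terms: suppose we are
given a subset $V \subset K$ and want to compute
\[ S = \sum_{\vec k \in V} f(\vec k). \]
Dividing by the volume,
\[ \frac{S}{L^3} = L^{-3}\Big(\frac{L}{2\pi}\Big)^3 \sum_{\vec k \in V} \Big(\frac{2\pi}{L}\Big)^3 f(\vec k) \approx (2\pi)^{-3} \int_V f(\vec k)\, d^3k. \]
For the kinetic energy this gives
\[ \frac{E}{L^3} = \frac{2}{L^3}\sum_{|\vec k| < k_F} \frac{\hbar^2 k^2}{2m} \approx 2(2\pi)^{-3} \int_{|\vec k| < k_F} \frac{\hbar^2 k^2}{2m}\, d^3k = \frac{2\cdot 4\pi}{(2\pi)^3}\frac{\hbar^2}{2m} \int_0^{k_F} k^4\, dk = \frac{4\pi\hbar^2}{5m(2\pi)^3}\, k_F^5. \]
The left-hand side is actually the energy density. Using Eq. (.) we obtain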
This is just a constant number, which is very large but finite, for finite $L$. However, we note that in the
thermodynamic limit, $\hat V_{\text{b-b}}$ grows without bound and represents a singularity of the model in this limit.
Next, consider $\hat V_{\text{e-b}}$,
\[ \hat V_{\text{e-b}} = -\bar\rho \sum_{i=1}^{N} \int_B w(\vec r - \vec r_i)\, d\vec r. \]
In a similar fashion as for $\hat V_{\text{b-b}}$, the integral on the right-hand side is finite, but very large and negative. In the
limit of a large box, the integral blows up.
The term $\hat W$ will also blow up in the thermodynamic limit, since then we have both a very large box
and a very large number of electrons.
The problem with the infinities is that the Coulomb interaction has infinite range (in the sense of scattering
theory). We therefore introduce the Yukawa potential as a regularization,
\[ w(\vec r; \mu) = \frac{1}{r}\, e^{-\mu r}, \qquad \mu > 0, \]
which has finite range $\sim \mu^{-1}$ and gives the Coulomb potential in the limit $\mu \to 0$. The idea is to use the
Yukawa potential for finite $N$ and $L$, and see that the nasty infinities cancel each other out. Then we can
take the thermodynamic limit and observe that our results have a well-defined limit as $\mu \to 0$.
The Fourier transform of the Yukawa potential is
so that
\[ w(\vec r; \mu) = \frac{1}{\Omega} \sum_{\vec q \in K} \tilde w(\vec q; \mu)\, e^{i\vec q\cdot\vec r}, \qquad \vec r \in B. \]
If we assume that the box is large enough, this integral becomes, to an exponentially good approximation,
\[ \tilde w(\vec q; \mu) \approx \frac{4\pi}{\mu^2 + q^2}. \]
This is an exercise.
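The large-box result can be checked numerically before attempting the analytical calculation. The sketch below assumes the infinite-volume limit, where the angular integration reduces the three-dimensional transform to the radial form $\tilde w(q) = (4\pi/q)\int_0^\infty e^{-\mu r}\sin(qr)\,dr$; the function names, the cutoff `r_max_factor` and the grid size are hypothetical choices.

```python
import numpy as np

def yukawa_ft_numeric(q, mu, r_max_factor=40.0, n=400001):
    """Radial form of the 3D transform: (4*pi/q) * int_0^inf exp(-mu r) sin(q r) dr."""
    r = np.linspace(0.0, r_max_factor / mu, n)   # the integrand is negligible beyond the cutoff
    f = np.exp(-mu * r) * np.sin(q * r)
    dr = r[1] - r[0]
    integral = dr * (f.sum() - 0.5 * (f[0] + f[-1]))   # trapezoidal rule
    return 4 * np.pi * integral / q

def yukawa_ft_exact(q, mu):
    """Closed form quoted in the text."""
    return 4 * np.pi / (mu**2 + q**2)

for q, mu in ((2.0, 1.0), (0.5, 0.2)):
    print(q, mu, yukawa_ft_numeric(q, mu), yukawa_ft_exact(q, mu))
```

The cutoff at a fixed multiple of the range $\mu^{-1}$ mirrors the remark that the finite box only introduces an exponentially small error.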
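We rewrite the Hamiltonian in terms of the Fourier transform (.). First, we define
\[ \hat n_{\vec q} \equiv \sum_{i=1}^{N} e^{-i\vec q\cdot\vec r_i}, \]
This gives
\[ \hat V_{\text{b-b}} = \frac{1}{2}\bar\rho^2\, \Omega\, \tilde w(\vec 0, \mu). \]
Similarly,
\[ \hat V_{\text{e-b}} = -\bar\rho N\, \tilde w(\vec 0, \mu). \]
We note that $\hat V_{\text{b-b}}$ and $\hat V_{\text{e-b}}$ diverge for large boxes and large $N$. We therefore consider the $\vec q = \vec 0$ term of
$\hat W$,
\[ \hat H = \hat T + \hat W + \frac{1}{2\Omega}\tilde w(\vec 0, \mu)(N^2 - N) - \bar\rho N\, \tilde w(\vec 0, \mu) + \frac{1}{2}\bar\rho^2\, \Omega\, \tilde w(\vec 0, \mu). \]
We note that the divergences are canceled, since
\[ \frac{1}{2\Omega}(N^2 - N) - \bar\rho N + \frac{1}{2}\bar\rho^2\Omega = \frac{1}{2\Omega}(N^2 - N) - \frac{1}{\Omega}N^2 + \frac{1}{2\Omega}N^2 = -\frac{1}{2}\bar\rho. \]
Thus, the Hamiltonian becomes
\[ \hat H = \hat T + \hat W, \qquad \hat W = \frac{1}{2\Omega}\sum_{\vec q \neq \vec 0} \tilde w(\vec q, \mu)\,(\hat n_{-\vec q}\,\hat n_{\vec q} - \hat N), \]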
.. Hamiltonian in second quantization
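We now express the Hamiltonian in second quantization using the plane-wave basis. To this end, we first
consider the matrix elements of $\hat n_{\vec q} = \sum_i e^{-i\vec q\cdot\vec r_i}$, which is instructive:
\[ (\varphi_{\vec k}|e^{-i\vec q\cdot\vec r}|\varphi_{\vec\ell}) = \frac{1}{\Omega}\int e^{-i\vec k\cdot\vec r}\, e^{-i\vec q\cdot\vec r}\, e^{i\vec\ell\cdot\vec r}\, d\vec r = \frac{1}{\Omega}\int e^{i(-\vec k - \vec q + \vec\ell)\cdot\vec r}\, d\vec r = \delta_{-\vec k - \vec q + \vec\ell,\, \vec 0}, \]
by the orthonormality of the $\varphi_{\vec k}$, for $\vec k \in K$.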
Summing up the operator, we obtain
where we used that n̂ q⃗ does not depend on spin. Thus, n̂ q⃗ is a shift operator, that annihilates a particle with
wavenumber ℓ⃗ and inserts one with wavenumber ℓ⃗ − q⃗.
We note that
\[ \hat n_{\vec q}^\dagger = \hat n_{-\vec q}. \]
Exercise .. Using the fundamental anticommutator and Eq. (.), show that
Explain that $\hat n_{-\vec q}\,\hat n_{\vec q}$ conserves total momentum $\hbar\vec K = \hbar\sum_{i=1}^{N} \vec k_i$ for any Slater determinant $|(\vec k_1\alpha_1)\cdots(\vec k_N\alpha_N)\rangle$.
△
A useful observation is that the operator T̂ + Ŵ conserves the total wavenumber ∑ i ⃗k i . The same is of
course true for the kinetic energy operator, which is diagonal in the plane-wave basis.
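We now compute the matrix elements of $\hat W$ in the plane-wave basis. We start with the space integrals
of $\hat n_{-\vec q}\,\hat n_{\vec q}$:
\[ \begin{aligned}
(\varphi_{\vec k_1}\varphi_{\vec k_2}|\hat n_{-\vec q}\,\hat n_{\vec q}|\varphi_{\vec\ell_1}\varphi_{\vec\ell_2})
&= \frac{1}{\Omega^2}\iint e^{-i\vec k_1\cdot\vec r_1} e^{-i\vec k_2\cdot\vec r_2}\,(e^{i\vec q\cdot\vec r_1} + e^{i\vec q\cdot\vec r_2})(e^{-i\vec q\cdot\vec r_1} + e^{-i\vec q\cdot\vec r_2})\, e^{i\vec\ell_1\cdot\vec r_1} e^{i\vec\ell_2\cdot\vec r_2}\, d\vec r_1\, d\vec r_2 \\
&= \frac{1}{\Omega^2}\iint e^{i(\vec\ell_1 - \vec k_1)\cdot\vec r_1} e^{i(\vec\ell_2 - \vec k_2)\cdot\vec r_2}\,(2 + e^{i\vec q\cdot\vec r_1 - i\vec q\cdot\vec r_2} + e^{-i\vec q\cdot\vec r_1 + i\vec q\cdot\vec r_2})\, d\vec r_1\, d\vec r_2 \\
&= 2\delta_{\vec\ell_1,\vec k_1}\delta_{\vec\ell_2,\vec k_2} + \frac{1}{\Omega^2}\int e^{i(\vec\ell_1 - \vec k_1 + \vec q)\cdot\vec r_1}\, d\vec r_1 \int e^{i(\vec\ell_2 - \vec k_2 - \vec q)\cdot\vec r_2}\, d\vec r_2 \\
&\quad + \frac{1}{\Omega^2}\int e^{i(\vec\ell_1 - \vec k_1 - \vec q)\cdot\vec r_1}\, d\vec r_1 \int e^{i(\vec\ell_2 - \vec k_2 + \vec q)\cdot\vec r_2}\, d\vec r_2 \\
&= 2\delta_{\vec\ell_1,\vec k_1}\delta_{\vec\ell_2,\vec k_2} + \delta_{\vec\ell_1,\vec k_1 - \vec q}\,\delta_{\vec\ell_2,\vec k_2 + \vec q} + \delta_{\vec\ell_1,\vec k_1 + \vec q}\,\delta_{\vec\ell_2,\vec k_2 - \vec q}.
\end{aligned} \]
\[ \langle\phi_{\vec k_1\alpha_1}\phi_{\vec k_2\alpha_2}|\hat G_{\vec q}|\phi_{\vec\ell_1\beta_1}\phi_{\vec\ell_2\beta_2}\rangle = \delta_{\alpha_1\beta_1}\delta_{\alpha_2\beta_2}\,[\delta_{\vec\ell_1,\vec k_1 - \vec q}\,\delta_{\vec\ell_2,\vec k_2 + \vec q} + \delta_{\vec\ell_1,\vec k_1 + \vec q}\,\delta_{\vec\ell_2,\vec k_2 - \vec q}] \]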
Let us consider the matrix elements of Ŵ , which we can write
⟨ϕ⃗k α ϕ⃗k α ∣ŵ ∣ϕ ℓ⃗ β ϕ ℓ⃗ β ⟩ = q; µ)δ α β δ α β [δ ℓ⃗ , ⃗k −⃗q δ ℓ⃗ , ⃗k +⃗q + δ ℓ⃗ , ⃗k +⃗q δ ℓ⃗ , ⃗k −⃗q ]
∑ w̃(⃗ (.)
Ω q⃗≠
The sum over $\vec q$ can be collapsed using the Kronecker deltas. However, if a Kronecker delta implies that
$\vec q = \vec 0$, it is not allowed. Therefore, we must explicitly include $\vec q = \vec 0$ in the sum by defining
w̃ (⃗
q; µ) = w̃(⃗
q; µ)( − δ q⃗,⃗ ). (.)
where we used Eq. (.), where the matrix elements are not antisymmetric, just as in the present case. An
alternative form is obtained by introducing a new summation variable K⃗ = ⃗k + ⃗k , the total momentum:
Ŵ = ∑ ∑ w̃ (⃗k − ℓ)
⃗ ∑ c† c†
⃗ ⃗ ⃗ c ⃗ ⃗ c⃗ .
k,β K− ℓ,β ℓα
(.)
Ω K⃗ ⃗k, ℓ⃗ αβ
k α K−
To see the last equality, note that since the bra and the ket state are identical, we are left with terms from
(.) containing w̃ ( ⃗i − ⃗i) = w̃ (⃗k − ⃗k) = w̃ (⃗) only, and these are identically zero.
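We now turn to the exchange potential:
The exchange operator does not vanish. We obtain for the Fock operator
\[ \hat F = \hat T - \hat V_{\text{exchange}} = \sum_{\vec k}\Big[\frac{\hbar^2 k^2}{2m} - \sum_{\vec i} \frac{1}{\Omega}\,\tilde w(\vec k - \vec i; \mu)(1 - \delta_{\vec k,\vec i})\Big] \sum_\alpha c_{\vec k\alpha}^\dagger c_{\vec k\alpha} \equiv \sum_{\vec k} \epsilon_{\vec k} \sum_\alpha c_{\vec k\alpha}^\dagger c_{\vec k\alpha}. \]
The Fock operator is manifestly diagonal in the plane-wave basis, and the diagonal elements $\epsilon_{\vec k}$ are therefore
the canonical HF energies, which are doubly degenerate due to spin, and the spin-orbitals $\phi_{\vec k\alpha}$ are the
canonical HF functions.
The HF energy depends only on $k = |\vec k|$, and we have:
\[ \epsilon_{\vec k} \equiv \epsilon_k = \frac{\hbar^2 k^2}{2m} - \frac{1}{\Omega}\sum_{\vec i} \tilde w(\vec k - \vec i; \mu)(1 - \delta_{\vec k,\vec i}). \]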
The HF energy is
EHF = ⟨Φ∣T̂ + Ŵ ∣Φ⟩ = ∑ t ⃗i − ∑ ⟨ϕ ⃗ ϕ ⃗ ∣Ŵ ∣ϕ ⃗i α ϕ ⃗jα ⟩ ≡ EK − Eexchange (.)
⃗
iα
⃗i ⃗jα jα i α
where we used that the direct potential is identically zero, and only the exchange parts of the interaction
matrix elements contribute.
Note that we have not specified which of the indices $\vec k$ are the occupied $\vec i$s! We can choose any set
of $N/2$ indices. It is natural to expect (but not at all trivial) that the minimum HF energy is obtained
by choosing those indices corresponding to the lowest energy. When studying the thermodynamic limit,
we take this approach. Thus, exactly as for the noninteracting electron gas, we let $L$ and $k_F$ be given, and
compute $N$ such that all $\varphi_{\vec k}$ with $|\vec k| < k_F$ are doubly occupied. In the thermodynamic limit, the number
of electrons is then expressed in terms of the average density $\bar\rho$ and the Fermi wavenumber $k_F$. Note,
however, that the Fermi energy in the HF model is not the kinetic energy of the electrons with $|\vec k| = k_F$, but
rather the HF eigenvalue, $\epsilon_F = \epsilon_{k_F}$.
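so that $\epsilon_k = \hbar^2 k^2/2m - S_k$. We evaluate the sum as an integral, and set $\mu = 0$ since the integral converges
also in this limit:
\[ S_k = \frac{4\pi e^2}{(2\pi)^3} \int_{|\vec s| < k_F} \frac{d\vec s}{|\vec s - \vec k|^2}. \]
To evaluate the integral, we choose the $z$-axis in $\vec s$-space along $\vec k$, introduce spherical coordinates, and get
\[ S_k = \frac{4\pi e^2}{(2\pi)^3}\, 2\pi \int_0^{\pi} \sin(\theta)\, d\theta \int_0^{k_F} s^2\, ds\, [s^2 + k^2 - 2ks\cos(\theta)]^{-1}. \]
Introducing $x = \cos(\theta)$ as integration variable, we can complete the calculation and obtain
\[ S_k = \frac{e^2}{\pi}\Big[k_F + \frac{k_F^2 - k^2}{2k} \ln\Big|\frac{k_F + k}{k_F - k}\Big|\Big]. \]
Thus, we obtain
\[ \epsilon_k = \frac{\hbar^2 k^2}{2m} - \frac{e^2}{\pi}\Big[k_F + \frac{k_F^2 - k^2}{2k} \ln\Big|\frac{k_F + k}{k_F - k}\Big|\Big]. \]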
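The closed form for $S_k$ can be cross-checked numerically. After the $x = \cos(\theta)$ integration, the expression reduces to $S_k = \frac{e^2}{\pi k}\int_0^{k_F} s \ln\left|\frac{s+k}{s-k}\right| ds$, which is what the sketch below integrates with a midpoint rule. The function names and the default `e2 = 1.0` (a unit choice) are assumptions for illustration, and the check is restricted to $k > k_F$, where the integrand has no logarithmic singularity.

```python
import numpy as np

def S_closed(k, k_F, e2=1.0):
    """Closed-form exchange term S_k from the text (e2 plays the role of e^2)."""
    log_term = np.log(abs((k_F + k) / (k_F - k)))
    return (e2 / np.pi) * (k_F + (k_F**2 - k**2) / (2 * k) * log_term)

def S_numeric(k, k_F, e2=1.0, n=200000):
    """(e2 / (pi k)) * int_0^{k_F} s ln|(s + k)/(s - k)| ds by the midpoint rule (use k > k_F)."""
    ds = k_F / n
    s = (np.arange(n) + 0.5) * ds
    return (e2 / (np.pi * k)) * np.sum(s * np.log(np.abs((s + k) / (s - k)))) * ds

print(S_closed(2.0, 1.0), S_numeric(2.0, 1.0))
# For small k the closed form approaches 2 e^2 k_F / pi.
print(S_closed(1e-4, 1.0), 2.0 / np.pi)
```

At $k = k_F$ the closed form is finite but its derivative with respect to $k$ diverges logarithmically, so a direct numerical evaluation near the Fermi surface needs more care than this sketch takes.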
We now turn to the calculation of $E_{\text{HF}}$. First, we note that the kinetic energy $E_K$ was calculated in the
section about the noninteracting gas. There, $k_F$ was expressed in terms of the density $\bar\rho$ and vice versa,
\[ \bar\rho = \frac{1}{3\pi^2} k_F^3. \]
The density is the same in the HF model, since the state $|\Phi\rangle$ is the same. The kinetic energy in terms of $k_F$
becomes
\[ E_K = \Omega\, \frac{\hbar^2}{m}\, \frac{3^{5/3}\pi^{4/3}}{10}\, \Big[\frac{k_F^3}{3\pi^2}\Big]^{5/3} = \Omega\, \frac{\hbar^2}{10 m \pi^2}\, k_F^5. \]
Now to the exchange energy.
Eexchange = ∑ ⟨ϕ ⃗ ϕ ⃗ ∣Ŵ ∣ϕ ⃗jα ϕ ⃗i α ⟩ = ∑ w̃ ( ⃗i − ⃗j) = ∑ S j
⃗i ⃗jα i α jα Ω ⃗i ⃗j ⃗
j
(.)
Ω Ω kF
≈ ∫ ⃗ Sj =
π ∫ k S k dk.
(π) ∣ j∣<kF (π)
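\[ E_{\text{exchange}} = \Omega\, \frac{2e^2}{(2\pi)^3}\, k_F^4. \]
In total, therefore, we get the energy density
\[ \frac{E_{\text{HF}}}{\Omega} = \frac{\hbar^2}{10 m \pi^2}\, k_F^5 - \frac{2e^2}{(2\pi)^3}\, k_F^4, \]
valid in the limit $\Omega \to +\infty$.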
Exercise .. Fill in the details: Compute the integral in Eq. (.) to obtain Eq. (.) △
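Exercise .. a) Show that the HF energy density (.) as a function of the average electron density
$\bar\rho$ can be written
\[ \frac{E_{\text{HF}}}{\Omega} = \frac{\hbar^2}{m}\, \frac{3^{5/3}\pi^{4/3}}{10}\, \bar\rho^{5/3} - e^2\, \frac{3^{4/3}}{4\pi^{1/3}}\, \bar\rho^{4/3}. \]
b) Show that the HF energy per particle can be written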
where i j and ab enumerate the occupied and virtual spin-orbitals, respectively, and where t p is the kinetic
energy of spin-orbital ϕ p . Thus,
\[ t_{\vec k} = \frac{\hbar^2}{2m}\, |\vec k|^2. \]
We have
⟨ϕ i ϕ j ∣ŵ ∣ϕ a ϕ b ⟩AS = ⟨ϕ ⃗i ,α ϕ ⃗j,α ′ ∣ŵ ∣ϕ a⃗,β ϕ⃗b,β ′ ⟩ − ⟨ϕ ⃗i ,α ϕ ⃗j,α ′ ∣ŵ ∣ϕ⃗b,β ′ ϕ a⃗,β ⟩ , (.)
where
⟨ϕ⃗k ,α ϕ⃗k ,α ∣ŵ ∣ϕ ℓ⃗ ,β ϕ ℓ⃗ ,β ⟩ = δ α β δ α β δ⃗ ⃗ ⃗ ⃗ w̃ (⃗k − ℓ⃗ ; µ). (.)
Ω k +k , ℓ +ℓ
Thus,
⃗ µ)]
⟨ϕ i ϕ j ∣ŵ ∣ϕ a ϕ b ⟩AS = δ ⃗ ⃗ ⃗ [δ α β δ α ′ β ′ w̃ ( ⃗i − a⃗; µ) − δ α β ′ δ βα ′ w̃ ( ⃗i − b;
Ω i+ j, a⃗+b
(.)
πe ⃗ )− ]
= δ ⃗ ⃗ ⃗ [δ α β δ α ′ β ′ (µ + ∣ ⃗i − a⃗∣ )− − δ α β ′ δ βα ′ (µ + ∣ ⃗i − b∣
Ω i+ j, a⃗+b
We are going to sum over $a$, $b$, $i$, and $j$, and we note that the total spin projection must be conserved, $\alpha + \alpha' =
\beta + \beta'$. For a simpler integral analysis later on, we split the contributions into the cases $\alpha = -\alpha'$ (anti-parallel
spins) and $\alpha = \alpha'$ (parallel spins),
\[ E^{(2)}_{\uparrow\downarrow} + E^{(2)}_{\downarrow\downarrow}. \]
For anti-parallel spins, we obtain
() πe m ⃗ )− [(µ + ∣ ⃗i − a⃗∣ )− ] δ ⃗ ⃗ ⃗
E↑↓ = ( ) ∑ (∣ ⃗i∣ − ∣⃗
a ∣ + ∣ ⃗j∣ − ∣b∣ i+ j, a⃗+b (.)
Ω Ω Ω ħ ⃗i ⃗j a⃗⃗b
the factor comes from identification of several identical contributions. We are interested in showing that
this energy diverges. The proof for the parallel spin case is similar, and will not cancel the divergence in
()
E↑↓ . (The eager student can study the parallel spin case in Raimes.)
To get rid of the Kronecker delta, which expresses momentum conservation, we introduce the momen-
tum vector q⃗ ≡ a⃗ − ⃗i. We then obtain from momentum conservation
The summation over $\vec a$ and $\vec b$ is replaced by a single summation over $\vec q$. We introduce integrals, obtaining
\[ \frac{1}{\Omega} E^{(2)}_{\uparrow\downarrow} = C \int_{|\vec i| < k_F} d^3i \int_{|\vec j| < k_F} d^3j \int d^3q\; \theta(|\vec j - \vec q| - k_F)\,\theta(|\vec i + \vec q| - k_F)\, \big(|\vec i|^2 + |\vec j|^2 - |\vec i + \vec q|^2 - |\vec j - \vec q|^2\big)^{-1} (\mu^2 + |\vec q|^2)^{-2}, \]
where $C$ is a constant independent of $\Omega$. The theta function is defined by $\theta(x) = 0$ if $x < 0$, and $\theta(x) = 1$
if $x > 0$. Notice that all powers of $\Omega$ have been cancelled, leaving an integral as a function of $\mu$ only.
Let us study the integrand, and let us assume that $\mu$ is very small. The integrand then gets its main
contribution from small $q = |\vec q|$. To see this, note that
becomes closest to $0$ when $\vec q$ is small. Similarly, $(q^2 + \mu^2)^2$ is smallest when $q$ is small. Thus the integrand
has its largest values for small $q$.
When $q$ is small, $i \lesssim k_F$: otherwise it is not possible that $|\vec a| = |\vec i + \vec q| > k_F$.
where
\[ x \equiv \frac{\vec i\cdot\vec q}{iq} \equiv \cos(\theta_i). \]
For future reference we also define
\[ y \equiv -\frac{\vec j\cdot\vec q}{jq} \equiv -\cos(\theta_j). \]
Here, $\theta_i$ ($\theta_j$) is the angle between $\vec q$ and $\vec i$ ($\vec j$). Now, for small $q$, we can do a first-order consideration,
neglect terms of order $q^2$ and higher, and a geometrical consideration shows that $i \approx k_F(1 - cq) + O(q^2)$
for some small number $c > 0$. Equation (.) becomes, to first order in $q$,
Rearranging, we obtain
\[ qx > (k_F - i), \]
and thus the function $\theta(|\vec i + \vec q| - k_F)$ is, for small $q$, equivalent to the integration limits
\[ F(\epsilon, \mu) := \int_{q < \epsilon} \frac{1}{(\mu^2 + q^2)^2}\Big[\int_{A(\vec q)} d^3i \int_{B(\vec q)} d^3j\, \frac{1}{q(xi + yj)}\Big]\, d^3q. \]
Here, $A(\vec q)$ and $B(\vec q)$ denote the integration limits in Eqs. (.) and (.).
We now introduce spherical coordinates in $\vec i$-space, letting the $z$-axis point in the direction of $\vec q$. Thus,
the elevation angle is $\theta = \theta_i$, while the azimuthal angle is $\varphi \in [0, 2\pi]$. The integration over $A(\vec q)$ can be
written
\[ \int_{A(\vec q)} d^3i = \int_0^{2\pi} d\varphi \int_0^{\pi/2} \sin(\theta)\, d\theta \int_{k_F - q\cos(\theta)}^{k_F} i^2\, di, \]
where $\cos(\theta) = \cos(\theta_i) \geq 0$ is enforced by integrating over $[0, \pi/2]$. Reintroduce $x = \cos(\theta)$, to obtain
\[ \int_{A(\vec q)} d^3i = 2\pi \int_0^1 dx \int_{k_F - qx}^{k_F} i^2\, di, \]
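We now note that in the integration region, $j \approx k_F$, $i \approx k_F$, and thus
\[ \frac{i^2 j^2}{xi + yj} \approx \frac{k_F^3}{x + y}, \]
the error being small and causing no problems. Thus, to within an error of order $\epsilon$,
\[ \begin{aligned}
F(\epsilon, \mu) &= (2\pi)^2 \int_{|\vec q| < \epsilon} \frac{d^3q}{(\mu^2 + q^2)^2\, q} \int_0^1 dx \int_0^1 dy \int_{k_F - qx}^{k_F} di \int_{k_F - qy}^{k_F} dj\, \frac{k_F^3}{x + y} \\
&= k_F^3\, (2\pi)^2 \int_{|\vec q| < \epsilon} \frac{d^3q\; q^2}{(\mu^2 + q^2)^2\, q} \int_0^1 dx \int_0^1 dy\, \frac{xy}{x + y}.
\end{aligned} \]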
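The integral over $x$ and $y$ yields a constant independent of $\vec q$. Thus, noting that the remaining $\vec q$-integrand
depends only on the magnitude $q$,
\[ F(\epsilon, \mu) \propto \int_0^{\epsilon} \frac{q^3}{(\mu^2 + q^2)^2}\, dq. \]
The antiderivative is
\[ \int \frac{q^3}{(\mu^2 + q^2)^2}\, dq = \frac{1}{2}\Big[\frac{\mu^2}{\mu^2 + q^2} + \log(\mu^2 + q^2)\Big]. \]
For $\mu > 0$ we see that $F(\epsilon, \mu)$ is finite, but that the limit $\mu \to 0$ is infinite.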
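The logarithmic growth can be made concrete with a small numerical experiment. The sketch below evaluates the radial integral $\int_0^\epsilon q^3/(\mu^2+q^2)^2\,dq$ both from the antiderivative and by a midpoint rule; the function names and the sample values of $\mu$ are hypothetical choices, and the overall proportionality constant of $F$ is ignored.

```python
import numpy as np

def F_radial_closed(eps, mu):
    """int_0^eps q^3 / (mu^2 + q^2)^2 dq, evaluated from the antiderivative."""
    def G(q):
        return 0.5 * (mu**2 / (mu**2 + q**2) + np.log(mu**2 + q**2))
    return G(eps) - G(0.0)

def F_radial_numeric(eps, mu, n=400000):
    """The same integral by the midpoint rule, as an independent check."""
    dq = eps / n
    q = (np.arange(n) + 0.5) * dq
    return np.sum(q**3 / (mu**2 + q**2) ** 2) * dq

for mu in (1e-1, 1e-2, 1e-3, 1e-4):
    # Each tenfold reduction of mu adds roughly log(10) to the integral.
    print(mu, F_radial_closed(1.0, mu), F_radial_numeric(1.0, mu))
```

The integral is finite for every $\mu > 0$ but behaves like $\log(1/\mu)$ for small $\mu$, which is the divergence referred to in the text.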
In total, we see that there is an infinite contribution to $E^{(2)}$ in the physical limit $\mu \to 0$.
It can be shown that all $E^{(n)}/\Omega$ for $n \geq 2$ diverge in a similar manner in the thermodynamic limit.
Thus, RSPT fails badly for the electron gas. This is not to say that the ground-state energy is not well-defined!
One can prove that the energy per particle must be finite in the thermodynamic limit. What
fails here are the conditions for RSPT to converge. A necessary condition for convergence is that when we
introduce a coupling constant $\lambda$, the ground-state energy of $\hat H(\lambda) = \hat T + \lambda\hat W$ is analytic at $\lambda = 0$. The
infinite perturbation series terms contradict this assumption.
Chapter
In large ab initio calculations for realistic systems, one needs a single-particle basis set in which to expand
the wavefunction. In this chapter, we give a brief overview of the common choices in quantum
chemistry, solid-state physics, quantum dot studies, and nuclear physics.
Using L single-particle functions, the dimension of the N-particle Hilbert space scales as $\binom{L}{N}$. Thus, our basis needs to
1. Yield a good approximation to the exact wavefunction, i.e., capture “the physics”,
2. Allow efficient calculation of two-body (or higher!) matrix elements.
These criteria are not always compatible.
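To get a feeling for how quickly the binomial dimension grows, the following small sketch tabulates $\binom{L}{N}$ for a fixed particle number. The function name `hilbert_dim` and the values of N and L are illustrative assumptions, not taken from the notes.

```python
from math import comb

def hilbert_dim(L, N):
    """Number of N-fermion Slater determinants built from L single-particle functions."""
    return comb(L, N)

if __name__ == "__main__":
    N = 10  # illustrative particle number
    for L in (20, 40, 80, 160):
        print(f"L={L:4d}  dim={hilbert_dim(L, N):.3e}")
```

Doubling L at fixed N multiplies the dimension by roughly $2^N$ once L is much larger than N, which is why a compact basis matters.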
where W is a residual part, and where $\hat h_{\mathrm{HO}}$ is the harmonic oscillator (HO),
\[
\hat h_{\mathrm{HO}} = -\frac{\hbar^2}{2m}\nabla^2 + \frac{1}{2} m\omega^2 r^2 = \sum_{j=1}^{d} h(r_j).
\]
Here d is the spatial dimension of the single-particle space, where the fermions “live”. Typically, d = 1, 2,
or 3. The HO can be exactly solved, giving a convenient basis of single-particle orbitals for manybody
treatments.
Harmonic oscillator functions are useful in quantum dot models and very common in the nuclear
manybody problem as well.
Consider first a harmonic oscillator in one space dimension (no spin). This simple problem has the
Hamiltonian
\[
\hat h = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + \frac{1}{2} m\omega^2 x^2.
\]
The solutions to the eigenvalue problem $\hat h f_n = e_n f_n$ are well known, and are of the form
\[
f_n(x) = \sqrt{\frac{\alpha}{2^n n!}}\; \pi^{-1/4} H_n(\alpha x)\, e^{-\alpha^2 x^2/2},
\]
[Figure: plot of $f_n(x)$ against $\alpha x$.]
Figure .: The first few Hermite functions $f_n(x)$, n = , , and n = . They are shifted vertically with
their energy eigenvalue. The first eigenvalues are also shown as dashed lines.
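where
\[
\alpha = \left(\frac{m\omega}{\hbar}\right)^{1/2}.
\]
The eigenvalues are
\[
e_n = \hbar\omega\left(n + \tfrac{1}{2}\right).
\]
The functions $H_n(x)$ are the Hermite polynomials, which have the compact expression
\[
H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2}.
\]
The Wikipedia page on Hermite polynomials has a wealth of further information.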
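The Hermite functions are easy to evaluate numerically. The sketch below assumes the standard three-term recurrence $H_{k+1}(x) = 2xH_k(x) - 2kH_{k-1}(x)$ for the physicists' Hermite polynomials (a well-known identity, not stated in these notes) and checks orthonormality of the $f_n$ with a simple trapezoidal rule. The function names, the integration window and the step count are illustrative choices.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def f(n, x, alpha=1.0):
    """Normalized 1D harmonic-oscillator eigenfunction f_n(x)."""
    norm = math.sqrt(alpha / (2.0**n * math.factorial(n))) * math.pi ** (-0.25)
    return norm * hermite(n, alpha * x) * math.exp(-0.5 * (alpha * x) ** 2)

def overlap(m, n, alpha=1.0, xmax=12.0, steps=4000):
    """Trapezoidal estimate of the integral of f_m(x) f_n(x) over [-xmax, xmax]."""
    h = 2.0 * xmax / steps
    total = 0.0
    for k in range(steps + 1):
        x = -xmax + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(m, x, alpha) * f(n, x, alpha)
    return total * h

if __name__ == "__main__":
    for m in range(4):
        print([round(overlap(m, n), 8) for n in range(4)])
```

The printed matrix should be close to the identity, up to rounding, if the normalization constant is right.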
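For d = 3, we obtain
\[
\varphi_{n_1,n_2,n_3}(\vec r) = \alpha^{3/2}\pi^{-3/4} \left[ 2^{n_1+n_2+n_3}\, n_1!\, n_2!\, n_3! \right]^{-1/2}
H_{n_1}(\alpha x_1) H_{n_2}(\alpha x_2) H_{n_3}(\alpha x_3)\, e^{-\alpha^2 r^2/2},
\qquad r^2 = x_1^2 + x_2^2 + x_3^2.
\]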
The energy levels group into shells of equal energy. Define the shell number $N(\vec n) = \sum_j n_j$, such that
\[
e_{\vec n} = \hbar\omega\left( N(\vec n) + d/2 \right).
\]
The degeneracy g(N, d) of the energy $e = \hbar\omega(N + d/2)$ depends on the dimension d. Let us look at an
example. Suppose that N = 2 and d = 3. Then we have the possibilities
\[
(n_1, n_2, n_3) \in \{(2,0,0), (0,2,0), (0,0,2), (1,1,0), (1,0,1), (0,1,1)\},
\]
giving g(2, 3) = 6.
In general, g(N, 1) = 1, and one can show (exercise!) that g(N, 2) = N + 1, while
$g(N, 3) = \tfrac{1}{2}(N + 1)(N + 2)$.
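The degeneracies can be checked by brute force: count all tuples of d non-negative integers that sum to N. The sketch below does this and compares with the general stars-and-bars count $\binom{N+d-1}{d-1}$, a standard combinatorial identity that is an addition here rather than something stated in the notes. The function names are illustrative.

```python
from itertools import product
from math import comb

def degeneracy(N, d):
    """Count tuples (n_1, ..., n_d) of non-negative integers with sum N."""
    return sum(1 for n in product(range(N + 1), repeat=d) if sum(n) == N)

def degeneracy_formula(N, d):
    """Stars-and-bars count; gives 1, N+1 and (N+1)(N+2)/2 for d = 1, 2, 3."""
    return comb(N + d - 1, d - 1)

if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, [degeneracy(N, d) for N in range(6)])
```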
In d = , one can also find the eigenfunctions of the HO using polar coordinates. The main observation
is that ĥHO is rotationally invariant, meaning that it commutes with the generators for the group of space
rotations: the angular momentum operators.
Exercise .. For the harmonic oscillator, compute the degeneracy of the eigenvalue $\hbar\omega(N + d/2)$ for
d = 1, d = 2, and d = 3. △
Exercise .. Using your method of choice, plot the cartesian coordinate eigenfunctions for d = for
N ≤ . (This constitutes plots.) Set α = . △
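\[
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r\cos\phi \\ r\sin\phi \end{pmatrix}.
\]
\[
\hat L_z = -i\hbar \frac{\partial}{\partial\phi}.
\]
\[
\nabla^2 = \frac{1}{r}\frac{\partial}{\partial r}\, r\, \frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2}.
\]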
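The HO Hamiltonian for d = 2 becomes
\[
\hat h_{\mathrm{HO}} = -\frac{\hbar^2}{2m}\left[ \frac{1}{r}\frac{\partial}{\partial r}\, r\, \frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2} \right] + \frac{1}{2} m\omega^2 r^2.
\]
Attempting an eigenfunction of the product form $R(r)u(\phi)$,
we obtain after some simple algebra the solution $u(\phi) = e^{i l_z \phi}$ with $l_z \in \mathbb{Z}$, and the radial equation
\[
\left[ -\frac{\hbar^2}{2m}\frac{1}{r}\frac{\partial}{\partial r}\, r\, \frac{\partial}{\partial r} + \frac{\hbar^2 l_z^2}{2m r^2} + \frac{1}{2} m\omega^2 r^2 \right] R(r) = e R(r).
\]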
Exercise .. Using your method of choice, plot the polar-coordinate eigenfunctions for d = 2 for N =
2n + ∣l z ∣ ≤ . (This constitutes plots.) Set α = . △
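\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r\sin\theta\cos\phi \\ r\sin\theta\sin\phi \\ r\cos\theta \end{pmatrix},
\qquad r \in [0, +\infty),\ \theta \in [0, \pi],\ \phi \in [0, 2\pi].
\]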
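where $\hat L^2 = \hat L_x^2 + \hat L_y^2 + \hat L_z^2$ is, in polar coordinates,
\[
\hbar^{-2}\hat L^2 = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}.
\]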
As for the d = 2 case, we attempt an eigenfunction of $\hat h_{\mathrm{HO}}$ of the form $\varphi(r, \theta, \phi) = R(r)Y(\theta, \phi)$.
Straightforward algebra leads to Y being an eigenfunction of $\hat L^2$. The spherical harmonics $Y_{l l_z}(\theta, \phi)$ form a
complete set of eigenfunctions of $\hat L^2$ (and $\hat L_z$),
\[
Y_{l l_z}(\theta, \phi) = \left[ \frac{2l+1}{4\pi}\, \frac{(l - l_z)!}{(l + l_z)!} \right]^{1/2} P_l^{l_z}(\cos\theta)\, e^{i l_z \phi},
\qquad |l_z| \leq l \in \mathbb{N}.
\]
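\[
\left[ -\frac{\hbar^2}{2m}\frac{1}{r^2}\frac{\partial}{\partial r}\, r^2 \frac{\partial}{\partial r} + \frac{\hbar^2 l(l+1)}{2m r^2} + \frac{1}{2} m\omega^2 r^2 \right] R(r) = e R(r).
\]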
Note that the radial equation depends on l, but not on $l_z$. The radial equation can be solved, giving (see
Moshinsky’s book [8]) solutions $R_{nl}(r)$ for n = 0, 1, 2, ⋯,
\[
R_{nl}(r) = \alpha^{3/2} \left[ \frac{2\,(n!)}{\Gamma(n + l + 3/2)} \right]^{1/2} (\alpha r)^l\, L_n^{l+1/2}(\alpha^2 r^2)\, e^{-\alpha^2 r^2/2}.
\]
(Remark: in some texts, a different convention for n is used. Note carefully that l is not restricted with
respect to n.)
The HO energy is
\[
e_{nl} = \hbar\omega(2n + l + 3/2),
\]
which is independent of $l_z$ since the radial equation was independent of $l_z$.
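As a consistency check between the spherical and cartesian labellings in d = 3, one can count the states $(n, l, l_z)$ in a shell, assuming the shell number in the spherical basis is $N = 2n + l$ with $|l_z| \leq l$. The count should equal $\tfrac{1}{2}(N+1)(N+2)$. The function name `spherical_states` is an illustrative choice.

```python
def spherical_states(N):
    """All labels (n, l, l_z) with 2n + l = N and |l_z| <= l."""
    states = []
    for n in range(N // 2 + 1):
        l = N - 2 * n
        for lz in range(-l, l + 1):
            states.append((n, l, lz))
    return states

if __name__ == "__main__":
    for N in range(5):
        print(N, len(spherical_states(N)), (N + 1) * (N + 2) // 2)
```

For N = 2, for instance, the shell contains one state with (n, l) = (1, 0) and five states with (n, l) = (0, 2).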
Exercise .. Using your method of choice, plot the polar-coordinate eigenfunctions for d = 3 for N =
2n + l ≤ . Set α = . Use only l z ≥ . △
−(1/3)e. The proton consists of two up quarks and one down quark. Thus, in total the neutron has no charge,
while the proton has charge +e.
Experimental evidence demonstrates that p and n behave almost identically in the nucleus, with almost
equal mass and spin ħ/2, and that they do not decompose into their constituent quarks at low energy. Their
interactions in a nucleus are also almost identical, except that two protons repel each other via the Coulomb
force. Therefore, Werner Heisenberg postulated a non-relativistic description where p and n are different
states of one kind of particle, a nucleon, with an additional spin-1/2 degree of freedom called isospin: a
nucleon with isospin +1/2 is a proton, and a nucleon with isospin −1/2 is a neutron. (It was Eugene Wigner
who coined the term “isospin” in .)
Thus, the Hamiltonian of an A-particle nucleus is
\[
\hat H = \hat T + \hat U = \sum_{i=1}^{A} \hat t(i) + \frac{1}{2}\sum_{i \neq j}^{A} \hat u(i, j),
\]
where $\hat u(i, j)$ is the interaction potential between nucleons i and j. Note well that the interaction potential
depends on the isospin of nucleons i and j.
The interaction potential is not known a priori, in contrast to electronic systems. One needs to fit semi-
empirical models to experimental data, or, as is the current trend, derive potential approximations from
QCD.
Suppose we are given a spatial orbital basis $\varphi_p(\vec r)$, say, the HO functions with $p = (n, l, l_z)$. Since we
have both spin and isospin in our single-particle space, we obtain single-particle functions of the form
where $\chi_{\pm 1/2}$ are the orthonormal spin-1/2 basis functions. It can be considered standard to use the HO
basis functions for nonrelativistic treatments of the nucleus.
A Slater determinant with N neutrons and Z protons, in total A = N + Z nucleons, can then be written
\[
|(p_1\alpha_1, +1/2) \cdots (p_Z\alpha_Z, +1/2)(p_{Z+1}\alpha_{Z+1}, -1/2)(p_{Z+2}\alpha_{Z+2}, -1/2) \cdots (p_A\alpha_A, -1/2)\rangle.
\]
Suppose we have in total L values for (pα), i.e., L spin-orbitals. Since the isospin values for neutrons and
protons are different, neutrons and protons can occupy spin-orbitals independently of each other, meaning
that the dimension of the Hilbert space becomes
\[
D = \binom{L}{N} \times \binom{L}{Z}.
\]
.. The self-bound property of the nucleus, and removal of centre-of-mass degree
of freedom
The nucleus is self-bound. There is no external potential in the one-body part of the Hamiltonian to bind the
nucleons in space. The Hamiltonian is translationally invariant, i.e., it commutes with the total momentum
operator $\hat P = \sum_i \hat p(i)$. The spectrum of $\hat H$ becomes purely continuous; there are no isolated eigenvalues.
This complicates matters for most common manybody techniques. It is therefore a common technique to add
a weak fictitious harmonic oscillator potential term $\tfrac{1}{2}\mu\omega^2 r^2$ to the Hamiltonian to weakly bind the nucleus,
producing a discrete spectrum, and then after the calculations remove the ω dependence.
NB: To be added, material not lectured: centre-of-mass transformation.
. Molecular systems and Gaussian basis sets
.. The Born–Oppenheimer molecular Hamiltonian
Classically, a molecule is a collection of nuclei with masses $M_\alpha$, charges $eZ_\alpha$, and positions $\vec R_\alpha$,
$\alpha = 1, 2, \cdots, N_{\mathrm{at}}$, and a collection of N electrons with charge −e and positions $\vec r_i$, $i = 1, 2, \cdots, N$.
Quantum mechanically, both the nuclei and the electrons obtain spin, and a wavefunction depending on all the $N + N_{\mathrm{at}}$
space-spin coordinates. In all but the simplest cases, this is an intractable problem.
The way out is the Born–Oppenheimer (BO) approximation: Roughly speaking, the nuclei are so heavy
compared to the electrons that their movement occurs on a time-scale much larger than the motion of
the electrons. In the BO approximation we therefore treat the nuclei as classical particles, setting up an
external classical electrostatic potential $v(\vec r)$ felt by an electron,
\[
v(\vec r) = \sum_{\alpha=1}^{N_{\mathrm{at}}} \frac{-e^2 Z_\alpha}{|\vec r - \vec R_\alpha|}.
\]
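The molecular Hamiltonian is therefore an N-electron Hamiltonian with a parametric dependence on the
nuclear geometry,
\[
\hat H = \hat H(\vec R_1, \vec R_2, \cdots, \vec R_{N_{\mathrm{at}}}).
\]
\[
\hat H_{\text{n-n}} = \frac{1}{2}\sum_{\alpha \neq \beta = 1}^{N_{\mathrm{at}}} \frac{e^2 Z_\alpha Z_\beta}{|\vec R_\alpha - \vec R_\beta|}
\]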
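is a constant term depending on the nuclear geometry. Let us recall the expressions for $\hat T$ and $\hat W$,
\[
\hat T = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m}\nabla_i^2 \right),
\]
\[
\hat W = \frac{1}{2}\sum_{i \neq j = 1}^{N} \frac{e^2}{|\vec r_i - \vec r_j|}.
\]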
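For a fixed nuclear geometry, both the nuclear repulsion constant and the external potential felt by an electron are simple sums over the nuclei. The sketch below evaluates them in atomic units (e = 1), summing the repulsion over unordered pairs, which is the same as half the sum over all ordered pairs with α ≠ β. The function names and the two-proton geometry with separation 1.4 bohr are illustrative assumptions.

```python
import math

def nuclear_repulsion(charges, positions):
    """Sum over unordered pairs of Z_a Z_b / |R_a - R_b| (atomic units)."""
    energy = 0.0
    for a in range(len(charges)):
        for b in range(a + 1, len(charges)):
            energy += charges[a] * charges[b] / math.dist(positions[a], positions[b])
    return energy

def external_potential(r, charges, positions):
    """v(r) = -sum_a Z_a / |r - R_a| felt by a single electron (atomic units)."""
    return -sum(Z / math.dist(r, R) for Z, R in zip(charges, positions))

if __name__ == "__main__":
    # Illustrative geometry: two protons separated by 1.4 bohr along z.
    charges = [1, 1]
    positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)]
    print("nuclear repulsion:", nuclear_repulsion(charges, positions))
    print("v at bond midpoint:", external_potential((0.0, 0.0, 0.7), charges, positions))
```

Scanning the separation and adding the electronic ground-state energy at each geometry would trace out a one-dimensional potential energy curve; the electronic part is of course the hard problem.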
The eigenvalues $E_k$ and eigenfunctions $|\Psi_k\rangle$ of $\hat H$ obtain a parametric dependence on the nuclear
geometry. Of particular usefulness in chemistry is the potential energy surface: the ground-state energy
$E_0(\vec R_1, \cdots, \vec R_{N_{\mathrm{at}}})$ as a function of the nuclear coordinates.
The equilibrium geometry is the configuration of the nuclei that minimizes $E_0$. This usually corresponds
to the configuration observed in nature.
We will not have more to say on the topic. The interested student should consult for example the book
by Szabo and Ostlund [2], which is a great read. The book [5] is the definitive guide to modern electronic-structure
theory.
The BO approximation has some subtleties, but these are beyond the scope of this course.
[Figure: sketch of the energy $E(\{\vec R_\alpha\})$ plotted against the nuclear configuration $\{\vec R_\alpha\}$.]
and the matrix elements (φ p φ q ∣ŵ∣φ r φ s ) are thus given as linear combinations of the matrix elements
(χ p χ q ∣ŵ∣χ r χ s ).
. Use the molecular orbitals φ p in a many-body treatment such as MPPT, CI, or coupled-cluster the-
ory.
.. From hydrogenic to Gaussian orbitals
How do we choose a single-particle basis (atomic orbitals) for the BO Hamiltonian? Clearly, the basis
must depend on the nuclear arrangement: we would like our results to be independent of translations of
the whole molecule, as this is a fundamental symmetry in the problem. The intuition behind the BO
approximation also indicates that each individual atom in the molecule roughly retains its independence
as an entity in itself: the electron cloud of a molecule is a perturbation of the electron cloud obtained by
treating each atom by itself, eliminating inter-atom interactions.
We therefore consider first an individual atom for guidance, located for convenience at $\vec R = 0$, with
nuclear charge eZ. We assume that the atom has N electrons, and we first consider the non-interacting
problem. We thus need to solve for the eigenvalues and eigenstates of a hydrogen-like atom with a single
electron. The Hamiltonian reads
\[
\hat h = -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{r}.
\]
The diagonalization of this problem is textbook material, see for instance [] or []. One finds a sequence
of eigenvalues
\[
e_n = -\frac{m (Ze^2)^2}{2\hbar^2 n^2}, \qquad n = 1, 2, \cdots,
\]
degenerate in the angular momentum quantum numbers $l < n$ and $l_z$, $|l_z| \leq l$. The eigenfunctions are
given by
\[
\psi_{n l l_z}(\vec r) = R_{nl}(r) Y_{l l_z}(\theta, \phi), \qquad
R_{nl}(r) = \sqrt{\left(\frac{2}{na}\right)^3 \frac{(n - l - 1)!}{2n\,(n + l)!}}\; e^{-\rho/2} \rho^l L_{n-l-1}^{2l+1}(\rho)
\]
with
\[
\rho = \frac{2Ze^2 m}{n\hbar^2}\, r, \qquad a = \frac{\hbar^2}{mZe^2}.
\]
Intuitively, the functions $\psi_{n l l_z}$ should be a good single-particle basis for the interacting N-electron
atom, obtained by throwing N − 1 more electrons into this atom and turning on interactions. However,
this basis set has major deficiencies:
• They are incomplete (and thus not an actual $L^2(X)$-basis!), as the hydrogen atom also has a
continuous spectrum for energies e > 0. Thus, we cannot expect convergence to the exact ground-state
energy of the N-electron atom as we include more and more $\psi_{n l l_z}$.
• Computing the matrix elements of Ŵ becomes complicated.
• The functions become very diffuse with higher n, allowing few details to be resolved around the
nucleus for moderate basis sizes.
On the other hand, the basis set displays other very useful features in its asymptotic behavior:
• A nuclear cusp at the origin, stemming from the singular nature of the Coulomb potential. This cusp
is always present in an atom, and gives a large contribution to the total electronic energy.
• Exponential fall-off of the radial part. This is responsible for physics of the N-electron atoms and
molecules, such as an R − -dependence of the inter-atomic forces in a molecule, where R is the dis-
tance between two atoms.
A partial remedy to the problems is the use of Laguerre radial functions, see []. These have nuclear
cusps and exponential fall-off, while forming a complete set. These functions do not solve the problems of
the complicated Ŵ matrix elements, however. We will not study these functions in detail here.
.. Gaussian basis sets
Selecting a single-particle basis for molecular systems is an art, due to the conflicting constraints of
efficiency, compactness and accuracy. Moreover, different manybody methods put different requirements on
the basis set. There are probably hundreds of different basis sets, with acronyms like “STO-kG”, “cc-pVXZ”,
etc. They are all tailored to have specific behaviour, and to be useful under different conditions. They all
have one thing in common, however: they are linear combinations of Gaussians.
Thus, the almost universally used approach in quantum chemistry today is a pragmatic one: One uses
Gaussian functions to approximate single-particle basis functions. A Gaussian is a function of the form
\[
g_{ijk}(\vec r; \zeta) = N x_1^i x_2^j x_3^k\, e^{-\zeta r^2},
\]
where the exponent ζ > 0 is a parameter, and where N is a normalization constant. These are closely related
to the harmonic oscillator eigenfunctions. In fact, the HO eigenfunctions are finite linear combinations of
such $g_{ijk}$, since $H_n(x)$ is a polynomial. Conversely, the Gaussians can be expanded in a finite number of
HO functions.
The Gaussian $g_{ijk}(\cdot; \zeta)$ is referred to as a cartesian Gaussian since it is a tensor product of one-dimensional
Gaussians $g_i(x; \zeta) = N x^i e^{-\zeta x^2}$.
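A short numerical sketch of the cartesian Gaussians follows. It assumes the one-dimensional normalization $N = [(2\zeta)^{i+1/2}/\Gamma(i+1/2)]^{1/2}$, which follows from the standard integral $\int x^{2i} e^{-2\zeta x^2} dx = \Gamma(i+1/2)/(2\zeta)^{i+1/2}$ and is not spelled out in the notes. The function names, exponents and integration window are illustrative.

```python
import math

def norm_1d(i, zeta):
    """Normalization constant of the 1D Gaussian x^i exp(-zeta x^2)."""
    return math.sqrt((2.0 * zeta) ** (i + 0.5) / math.gamma(i + 0.5))

def g_1d(i, x, zeta):
    """Normalized one-dimensional cartesian Gaussian g_i(x; zeta)."""
    return norm_1d(i, zeta) * x**i * math.exp(-zeta * x * x)

def g_cartesian(ijk, r, zeta):
    """Cartesian Gaussian g_ijk(r; zeta) as a product of three 1D Gaussians."""
    return math.prod(g_1d(p, x, zeta) for p, x in zip(ijk, r))

def norm_check(i, zeta, xmax=10.0, steps=4000):
    """Trapezoidal estimate of the integral of g_i(x)^2 over [-xmax, xmax]."""
    h = 2.0 * xmax / steps
    total = 0.0
    for k in range(steps + 1):
        x = -xmax + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g_1d(i, x, zeta) ** 2
    return total * h

if __name__ == "__main__":
    for i in range(4):
        print(i, norm_check(i, zeta=0.8))
```

Because the three-dimensional function factorizes, its normalization constant is simply the product of the three one-dimensional ones.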
A more compact description is obtained using spherical Gaussians of the form
\[
g^{\mathrm{sph}}_{n l l_z}(\vec r; \zeta) = N r^l e^{-\zeta r^2}\, Y_{l l_z}(\theta, \phi).
\]
These give a more compact description since they are eigenfunctions of $\hat L^2$ and $\hat L_z$, unlike the cartesian
counterparts. But they are equivalent: the cartesian and spherical Gaussians can be expanded in terms of
each other, using a finite number of coefficients.
A general basis of atomic orbitals is then of the form
\[
\chi_p = \sum_\mu D_{p\mu}\, g_\mu(\vec r - \vec R_p; \zeta_{p\mu}),
\]
where $g_\mu$ is either a spherical or a cartesian Gaussian function, and where $\mu = (ijk)$ or $\mu = (n l l_z)$.
Each $\chi_p$ needs to be located on some atom $\vec R_\alpha$. The vector $\vec R_p$ therefore shifts the Gaussian accordingly.
The exponents depend on both p and µ, giving maximum flexibility in the description. The matrix D is
typically sparse.
Chapter
Recommended reading: Crawford and Schaefer [9] is a very nice and pedagogical text. Shavitt and Bartlett
[10] is also recommended. We mostly follow Crawford and Schaefer here.
\[
|\Phi\rangle = |\phi_1 \phi_2 \cdots \phi_N\rangle,
\]
where we for simplicity fill the N first single-particle functions. The fermions in this wavefunction are
uncorrelated, except for the Pauli principle. It is the simplest manybody ansatz we can make.
Assume that $|\Phi\rangle$ is a reasonable ansatz for the exact wavefunction $|\Psi\rangle$, i.e., that at least $\langle\Phi|\Psi\rangle \neq 0$. By
scaling $|\Psi\rangle$ by a number, we can write
\[
|\Psi\rangle = |\Phi\rangle + |\Delta\Psi\rangle.
\]
How can we improve on $|\Phi\rangle$ in a systematic manner towards $|\Psi\rangle$? The Slater determinant is an
antisymmetrized tensor product,
\[
\Phi(1, 2, \cdots, N) = \sqrt{N!}\, \mathcal{A}\, \phi_1(1)\phi_2(2)\cdots\phi_N(N).
\]
Intuitively, if we add to the product $\phi_1(1)\phi_2(2)$ a general function $g_{12}(1, 2)$, we would obtain a
wavefunction where “2 of the fermions are correlated”, i.e., described with a general wavefunction, while the
rest are still independent,
\[
\begin{aligned}
\Psi_{\mathrm{better}}(1, 2, \cdots, N) &= \sqrt{N!}\, \mathcal{A}\, [\phi_1(1)\phi_2(2) + g_{12}(1, 2)]\, \phi_3(3)\cdots\phi_N(N) \\
&\equiv \Phi(1, 2, \cdots, N) + \langle 1 \cdots N | g_{12}\phi_3\cdots\phi_N\rangle.
\end{aligned}
\]
The antisymmetrization operator $\mathcal{A}$ ensures that the final wavefunction is fully antisymmetrized. The latter
equation defines $|g_{12}\phi_3\cdots\phi_N\rangle$ via the antisymmetrization operation on a product. Thus, in ket notation,
\[
|\Psi_{\mathrm{better}}\rangle = |\Phi\rangle + |g_{12}\phi_3\cdots\phi_N\rangle.
\]
The function $g_{12}(x, y)$ is called a cluster function, since when applied to $|\Phi\rangle$ in the above manner it
describes the wavefunction of a system where all fermions are independent/uncorrelated (think far away
from each other), except for a single cluster of particles consisting of 2 fermions that are described in a
general manner (close to each other).
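Suppose we instead introduce a correction on the occupied SPFs $\phi_i$ and $\phi_j$, i < j:
\[
\begin{aligned}
\Psi'_{\mathrm{better}}(1, 2, \cdots, N) &= \sqrt{N!}\, \mathcal{A}\, [\phi_i(i)\phi_j(j) + g_{ij}(i, j)]\, \phi_1(1)\cdots\cancel{\phi_i(i)}\cdots\cancel{\phi_j(j)}\cdots\phi_N(N) \\
&= \Phi(1, 2, \cdots, N) + (-1)^{i-j+1} \langle 1 \cdots N | g_{ij}\phi_1\cdots\cancel{\phi_i}\cdots\cancel{\phi_j}\cdots\phi_N\rangle.
\end{aligned}
\]
Here, struck-out factors are omitted from the product, and we used antisymmetry of Slater determinants to find
an expression for the correlated part, since i < j are not necessarily next to each other. (However, note that
$|\phi_1\cdots g_{i,i+1}\cdots\phi_N\rangle$ is well-defined.)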
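An even better approach would be to introduce cluster functions $g_{ij}$ for all pairs of occupied SPFs, and
in all possible ways correlate pairs of SPFs. For example, for N = 4 for simplicity,
\[
\begin{aligned}
|\Psi_{\mathrm{CCD}}\rangle = {} & |\phi_1\phi_2\phi_3\phi_4\rangle + |g_{12}\phi_3\phi_4\rangle + |\phi_1 g_{23}\phi_4\rangle + |\phi_1\phi_2 g_{34}\rangle
- |\phi_1\phi_3 g_{24}\rangle - |g_{13}\phi_2\phi_4\rangle + |g_{14}\phi_2\phi_3\rangle \\
& + |g_{12} g_{34}\rangle - |g_{13} g_{24}\rangle + |g_{14} g_{23}\rangle.
\end{aligned}
\]
This is the coupled-cluster doubles (CCD) wavefunction, for N = 4. The terms with two cluster functions
are defined in a similar way as the terms with only one cluster function.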
The function $g_{ij}(x, y)$ of two one-particle coordinates, x, y ∈ X, can be expanded in the SPFs,
\[
g_{ij}(x, y) = \sum_{p<q} t_{ij}^{pq}\, \phi_p(x)\phi_q(y).
\]
We sum only over p < q, because we will see that in the end the other coefficients are not independent, by
antisymmetry properties of the wavefunction. Inserting this expansion leads to, for ij = 12,
\[
\langle 1 \cdots N | g_{12}\phi_3\cdots\phi_N\rangle = \sum_{p<q} t_{12}^{pq}\, \sqrt{N!}\, \mathcal{A}\, \phi_p(1)\phi_q(2)\phi_3(3)\cdots\phi_N(N).
\]
We now observe that the right-hand side is a linear combination of Slater determinants,
\[
\begin{aligned}
|g_{12}\phi_3\cdots\phi_N\rangle &= \sum_{p<q} t_{12}^{pq}\, |\phi_p\phi_q\phi_3\cdots\phi_N\rangle \\
&= \sum_{a<b} t_{12}^{ab}\, c_a^\dagger c_1 c_b^\dagger c_2 |\Phi\rangle.
\end{aligned}
\]
In the last equality, we used the fact that only if pq = ab (virtual SPFs) can we get contributions, due to
antisymmetry of Slater determinants. We now note that including a > b in the summation does not lead
to independent terms, justifying the restriction a < b in the summation.
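Similarly, for the correction term in $|\Psi'_{\mathrm{better}}\rangle$,
\[
\begin{aligned}
(-1)^{i-j+1} |g_{ij}\phi_1\cdots\cancel{\phi_i}\cdots\cancel{\phi_j}\cdots\phi_N\rangle
&= \sum_{p<q} t_{ij}^{pq} (-1)^{i-j+1} |\phi_p\phi_q\phi_1\cdots\cancel{\phi_i}\cdots\cancel{\phi_j}\cdots\phi_N\rangle \\
&= \sum_{p<q} t_{ij}^{pq}\, |\phi_1\cdots\phi_p\cdots\phi_q\cdots\phi_N\rangle \\
&= \sum_{a<b} t_{ij}^{ab}\, c_a^\dagger c_i c_b^\dagger c_j |\Phi\rangle.
\end{aligned}
\]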
since, in a sense, all N fermions are involved in such statements. Instead, one speaks of clusters of n fermions. This term is then subtly
different from a subset of the fermions.
and we observe that
\[
|\Psi_{\mathrm{better}}\rangle = |\Phi\rangle + \hat t_{12} |\Phi\rangle,
\]
and similarly,
\[
|\Psi'_{\mathrm{better}}\rangle = |\Phi\rangle + \hat t_{ij} |\Phi\rangle.
\]
We now notice something curious and important: all operators $\hat t_{ij}$ commute among themselves. Why?
They are linear combinations of products of excitation operators $c_a^\dagger c_i$, and these commute:
\[
[c_a^\dagger c_i,\, c_{a'}^\dagger c_{i'}] = 0,
\]
since the creation operators always refer to virtual SPFs and the annihilation operators to occupied SPFs.
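The commutation of excitation operators can be checked in a tiny occupation-number model. The sketch below is an illustration only: a state is a dictionary from determinants (sorted tuples of occupied orbital indices, 0-based) to coefficients, the phase convention $(-1)^k$ with k the number of occupied orbitals preceding the affected one is the standard one, and the helper names as well as the choice of orbitals 0 to 3 as occupied and 4 to 7 as virtual are hypothetical.

```python
def apply_op(kind, p, state):
    """Apply a creation ("create") or annihilation ("annihilate") operator for orbital p."""
    new = {}
    for det, coeff in state.items():
        if kind == "create":
            if p in det:
                continue  # Pauli principle: orbital already occupied
            k = sum(1 for q in det if q < p)
            new_det = det[:k] + (p,) + det[k:]
        else:
            if p not in det:
                continue  # nothing to annihilate
            k = det.index(p)
            new_det = det[:k] + det[k + 1:]
        new[new_det] = new.get(new_det, 0.0) + (-1) ** k * coeff
    return new

def excite(a, i, state):
    """Apply the excitation operator c_a^dagger c_i."""
    return apply_op("create", a, apply_op("annihilate", i, state))

def commutator_on(state, a, i, b, j):
    """Return [c_a^dagger c_i, c_b^dagger c_j] applied to the state, dropping zeros."""
    ab = excite(a, i, excite(b, j, state))
    ba = excite(b, j, excite(a, i, state))
    diff = {d: ab.get(d, 0.0) - ba.get(d, 0.0) for d in set(ab) | set(ba)}
    return {d: c for d, c in diff.items() if abs(c) > 1e-12}

if __name__ == "__main__":
    reference = {(0, 1, 2, 3): 1.0}  # N = 4 reference determinant
    print(excite(4, 0, excite(5, 1, reference)))
    print(commutator_on(reference, 4, 0, 5, 1))  # expected to be empty
```

Swapping roles, for example using an occupied index for the creation operator, generally breaks the commutation, which is why the occupied/virtual split matters.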
The reader should check that this final equation actually reproduces Eq. (.). The factor 1/2 in the last
term stems from double-counting of the cluster operators.
Exercise .. Prove that Eq. (.) becomes Eq. (.) when using the definition of t̂ i j . △
\[
|\Psi_{\mathrm{CCD}}\rangle = \left( 1 + \hat T_2 + \tfrac{1}{2}\hat T_2^2 \right) |\Phi\rangle.
\]
We now observe that in the function $\hat T_2^2 |\Phi\rangle$, no SPFs with indices i ≤ N are left, since N = 4. Thus,
$\hat T_2^3 |\Phi\rangle = 0$, and we have in fact
\[
|\Psi_{\mathrm{CCD}}\rangle = e^{\hat T_2} |\Phi\rangle.
\]
The choice N = 4 is not special: for any N, the wavefunction $|\Psi_{\mathrm{CCD}}\rangle = e^{\hat T_2} |\Phi\rangle$ is identical to the
wavefunction where we replace pairs of occupied SPFs by pair cluster functions $g_{ij}$ in all possible
ways in the reference Slater determinant $|\Phi\rangle$.
Furthermore, there is nothing special about pair clusters. We may introduce a singles cluster operator
\[
\hat T_1 = \sum_i \hat t_i = \sum_{ia} t_i^a\, c_a^\dagger c_i,
\]
corresponding to adding to the various $\phi_i$ the SPF $g_i = \sum_a t_i^a \phi_a$. We may also introduce a triples cluster
operator
\[
\hat T_3 = \sum_{i<j<k} \hat t_{ijk} = \frac{1}{(3!)^2} \sum_{ijk} \sum_{abc} t_{ijk}^{abc}\, c_a^\dagger c_i c_b^\dagger c_j c_c^\dagger c_k,
\]
correlating a cluster of three particles, by adding to $\phi_i\phi_j\phi_k$ a function $g_{ijk}(x, y, z)$.
We define a general cluster operator
Thus, the cluster expansion represents a systematic way to improve upon the reference wavefunction ∣Φ⟩. The
parameters of this expansion are the cluster amplitudes t ia , t iabj , etc, and they occur in a nonlinear fashion.
What is so good about this particular systematic expansion of $|\Psi\rangle$? The answer is size-consistency.
We will have more to say about this later. However, here is a handwaving argument: Consider the CCD
wavefunction for N = 4, as above. Suppose that $\phi_1$ and $\phi_2$ have very small overlap with $\phi_3$ and $\phi_4$. Since
$|\Phi\rangle$ is supposed to be a reasonable guess for $|\Psi\rangle$, this means that the fermions form 2-fermion clusters
that are “far apart”. It is therefore reasonable that $g_{12}$ and $g_{34}$ are the only contributing cluster functions to
$|\Psi_{\mathrm{CCD}}\rangle$: all the other $g_{ij}$ couple clusters that are very far apart and are approximately zero. We obtain
\[
|\Psi_{\mathrm{CCD}}\rangle \approx |\phi_1\phi_2\phi_3\phi_4\rangle + |g_{12}\phi_3\phi_4\rangle + |\phi_1\phi_2 g_{34}\rangle + |g_{12} g_{34}\rangle.
\]
The last term comes from $\hat T_2^2$, which acts as a quadruples cluster operator. Compare this with the CI doubles
(CID) wavefunction, which can be written
\[
|\Psi_{\mathrm{CID}}\rangle \approx |\phi_1\phi_2\phi_3\phi_4\rangle + |g_{12}\phi_3\phi_4\rangle + |\phi_1\phi_2 g_{34}\rangle.
\]
The CID function contains all doubles excitations, but nothing more, while the CCD function adds those
quadruples excitations that are doubles excitations on each cluster independently. It turns out that this
gives the exponential parameterization a great advantage.
Chapter
Appendix A
Mathematical supplement
A. Calculus of variations
A.. Functionals
In the calculus of variations, we compute the extrema of a possibly nonlinear function of a function. Such
objects are often called functionals. Thus, a functional F[u] takes some function u and produces a number.
One can think of F as depending on an infinitude of function values u(x). In the case of the energy expectation
value, the N-body wavefunction $|\Psi\rangle$ is mapped to the number
\[
|\Psi\rangle = \sum_I A_I |\Phi_I\rangle.
\]
Then, E becomes a function of the vector $\vec A$, a possibly infinite set of coefficients. This may be an easier
way to think of a functional: a function that depends on K variables, where K may be infinite.
A functional can also depend on more than one function. In Hartree–Fock theory, the energy
functional depends on N single-particle functions $\phi_i$, $i = 1, \cdots, N$. Moreover, the Hartree–Fock Lagrangian
function that we actually optimize is a functional that also depends on a matrix $\lambda = [\lambda_{ij}]$ of Lagrange
multipliers, $L = L[\phi_1, \cdots, \phi_N, \lambda]$. Given expansions of the $\phi_i$ as $\phi_i(x) = \sum_p \chi_p(x) U_{ip}$, we see that L becomes a
function of the matrix U and the matrix λ. Thus, functionals are not too different from ordinary functions
of vectors.
How do we go about computing the extrema of a functional? A function of a single real variable has
an intuitive notion of a local extremum, and most readers probably have an intuitive notion of extrema
of two-variable functions as well. But if we go to higher dimensions (or infinite dimensions!) it becomes
more complicated.
We will therefore introduce the concept of a directional derivative in a rather informal way. This is
very handy, and allows us to read off the condition for an extremum in a straight-forward manner. This
framework is called the calculus of variations, since we are computing the “variation in F[u]” with respect
to arbitrary “variations δu of the function u”.
[Figure: two panels, each plotting F(x) against x with the point $x_0$ marked.]
Figure A.: Simple functions of one real variable with a local minimum ($F''(x_0) > 0$) (left) and a saddle
point ($F''(x_0) = 0$) (right).
is a crossing of lines of equal elevation.) We now observe that if you move in a direction $\eta = (\delta x, \delta y) \neq 0$
from the local minimum, you will always walk uphill, that is, the function
\[
f(\epsilon) = F(x_0 + \epsilon\,\delta x,\; y_0 + \epsilon\,\delta y)
\]
has a local minimum at $\epsilon = 0$, irrespective of η. If you were standing on a mountaintop (a local maximum)
you would always walk downhill, and $f(\epsilon)$ would always have a local maximum at $\epsilon = 0$.
Finally, if you are standing between two mountaintops to the east and west, and looking down at valleys
to the south and north, you are standing on a saddle point. You are walking downhill if you go north or
south, but uphill if you go east or west: $f(\epsilon)$ has a local minimum for some η, and a maximum for other η.
We see that, at least intuitively, we can determine whether F has a local extremum at $(x_0, y_0)$ by studying
the behaviour of $f(\epsilon)$, for all possible choices of η. We now prove this claim:
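Let us compute the Taylor expansion of $f(\epsilon)$:
\[
f(\epsilon) = F(x_0, y_0) + \epsilon\, \nabla F(x_0, y_0)^T \eta + \frac{1}{2}\epsilon^2\, \eta^T H(x_0, y_0)\, \eta + O(\epsilon^3).
\]
We used the chain rule, and introduced the gradient and the Hessian matrix H, given by
\[
\nabla F(x_0, y_0) = \begin{pmatrix} \dfrac{\partial F(x_0, y_0)}{\partial x} \\[2mm] \dfrac{\partial F(x_0, y_0)}{\partial y} \end{pmatrix}
\]
and
\[
H(x_0, y_0) = \begin{pmatrix}
\dfrac{\partial^2 F(x_0, y_0)}{\partial x^2} & \dfrac{\partial^2 F(x_0, y_0)}{\partial x\, \partial y} \\[2mm]
\dfrac{\partial^2 F(x_0, y_0)}{\partial y\, \partial x} & \dfrac{\partial^2 F(x_0, y_0)}{\partial y^2}
\end{pmatrix}.
\]
Now, F has an extremum at $(x_0, y_0)$ if and only if $\nabla F(x_0, y_0) = 0$, while $f(\epsilon)$ has an extremum at $\epsilon = 0$
if and only if the second term in Eq. (A.) vanishes. But if $\nabla F(x_0, y_0)^T \eta = 0$ for all $\eta \neq 0$, then clearly
$\nabla F(x_0, y_0) = 0$ and vice versa. QED.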
[Figure: sketch in the plane (vertical axis y) with the point $(x_0, y_0)$ marked.]
Figure A.: The condition for a local minimum $(x_0, y_0)$ for a function F(x, y): in all directions $\eta \neq 0$ you
walk uphill from $(x_0, y_0)$.
This condition is equivalent to $\nabla F(x_0)^T = 0$.
Turning to a functional F[u] for some function u, or set of functions, the directional derivative in the
direction of the function η is in principle straightforward:
\[
F'[u; \eta] = \left. \frac{d}{d\epsilon} F[u + \epsilon\eta] \right|_{\epsilon = 0}.
\]
Computing F[u + єη] as a series in є is usually straightforward, allowing an expression for F ′ [u; η] to be
read off. Typically, this leads to a differential equation: the variational principle gave us the Schrödinger
equation, while extremalization of the Hartree–Fock energy gave us the Hartree–Fock equations.
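A minimal numerical sketch of the directional derivative of a functional is given below. It uses the simple example $F[u] = \int_0^1 u(x)^2\, dx$ (chosen here for illustration, not taken from the notes), for which expanding $F[u + \epsilon\eta]$ in ε gives $F'[u; \eta] = 2\int_0^1 u(x)\eta(x)\, dx$. The grid, the functions u and η, and all names are illustrative assumptions.

```python
import math

def trapezoid(values, h):
    """Trapezoidal rule for samples on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def F(u, h):
    """Discretized functional F[u] = integral of u(x)^2 over the grid."""
    return trapezoid([ui * ui for ui in u], h)

def directional_derivative(functional, u, eta, h, eps=1e-6):
    """Central difference in eps of functional[u + eps * eta] at eps = 0."""
    up = [ui + eps * ei for ui, ei in zip(u, eta)]
    um = [ui - eps * ei for ui, ei in zip(u, eta)]
    return (functional(up, h) - functional(um, h)) / (2.0 * eps)

n = 200
h = 1.0 / n
x = [k * h for k in range(n + 1)]
u = [math.sin(math.pi * xk) for xk in x]      # illustrative function u
eta = [xk * (1.0 - xk) for xk in x]           # illustrative direction eta

numeric = directional_derivative(F, u, eta, h)
analytic = 2.0 * trapezoid([ui * ei for ui, ei in zip(u, eta)], h)
print("finite difference:", numeric)
print("2 * integral(u * eta):", analytic)
```

The two numbers should agree closely, since F is quadratic in u and the central difference is then exact up to rounding.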
The term “calculus of variations” is historical, and comes from the idea that we are “computing infinites-
imal variations δF[u] in the functional under infinitesimal variations δu of the function” in all possible
ways, i.e., a different way of saying that we are computing directional derivatives.
Bibliography
[1] E.K.U. Gross, E. Runge, and O. Heinonen. Many-Particle Theory. Adam Hilger, Bristol, Philadelphia
and New York, .
[2] A. Szabo and N.S. Ostlund. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure
Theory. Dover, New York, USA, .
[3] J.M. Leinaas and J. Myrheim. On the theory of identical particles. Il Nuovo Cimento, :, .
[4] F.E. Harris, H.J. Monkhorst, and D.L. Freeman. Algebraic and Diagrammatic Methods in Many-Fermion
Theory. Oxford, New York, .
[5] T. Helgaker, P. Jørgensen, and J. Olsen. Molecular Electronic-Structure Theory. Wiley, Chichester, UK,
.
[6] J. Paldus and J. Čížek. Time-independent diagrammatic approach to perturbation theory of fermion
systems. Adv. Quant. Chem., :, .
[7] G.F. Giuliani and G. Vignale. Quantum Theory of the Electron Liquid. Cambridge, Cambridge, UK,
.
[8] M. Moshinsky and Y.F. Smirnov. The Harmonic Oscillator in Modern Physics. Harwood, Amsterdam,
Netherlands, .
[9] T.D. Crawford and H.F. Schaefer III. An introduction to coupled cluster theory for computational
chemists. Rev. Comp. Chem., :, .
[10] I. Shavitt and R.J. Bartlett. Many-body methods in chemistry and physics: MBPT and Coupled-Cluster
Theory. Cambridge, .