Lecture Notes in Quantum Chemistry II 1994 PDF


Lecture Notes in Chemistry 64

Edited by:
Prof. Dr. Gaston Berthier
Université de Paris
Prof. Dr. Michael J. S. Dewar
The University of Texas
Prof. Dr. Hanns Fischer
Universität Zürich
Prof. Dr. Kenichi Fukui
Kyoto University
Prof. Dr. George G. Hall
University of Nottingham
Prof. Dr. Jürgen Hinze
Universität Bielefeld
Prof. Dr. Joshua Jortner
Tel-Aviv University
Prof. Dr. Werner Kutzelnigg
Universität Bochum
Prof. Dr. Klaus Ruedenberg
Iowa State University
Prof. Dr. Jacopo Tomasi
Università di Pisa
B. O. Roos (Ed.)

Lecture Notes
in Quantum Chemistry II
European Summer School
in Quantum Chemistry

Springer-Verlag Berlin Heidelberg GmbH


Editor
Björn O. Roos
University of Lund
Department of Theoretical Chemistry
Chemical Centre
P. O. Box 124, S-22100 Lund

ISBN 978-3-540-58620-3 ISBN 978-3-642-57890-8 (eBook)


DOI 10.1007/978-3-642-57890-8

CIP data applied for


This work is subject to copyright. All rights are reserved, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, re-use
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other
way, and storage in data banks. Duplication of this publication or parts thereof is
permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from
Springer-Verlag. Violations are liable for prosecution under the German Copyright
Law.
© Springer-Verlag Berlin Heidelberg 1994
Originally published by Springer-Verlag Berlin Heidelberg New York in 1994

Typesetting: Camera ready by author


SPIN: 10473124 51/3140 - 5432 - Printed on acid-free paper

Contents

Introduction
Björn O. Roos, Editor

Notes on Hartree-Fock Theory and Related Topics
Jan Almlöf, University of Minnesota

Density Functional Theory
Nicholas C. Handy, University of Cambridge

Coupled-cluster Methods in Quantum Chemistry
Peter R. Taylor, San Diego Supercomputer Center

Methods of Relativistic Quantum Chemistry
Andrzej J. Sadlej, University of Lund

Exercises with Solutions
Roland Lindh and Per-Åke Malmqvist, University of Lund

Introduction
The first volume of Lecture Notes in Quantum Chemistry (Lecture Notes
in Chemistry 58, Springer-Verlag, Berlin 1992) contained a compilation
of selected lectures given at the first two European Summer Schools in
Quantum Chemistry (ESQC), held in southern Sweden in August 1989 and
1991, respectively. The notes were written by the teachers at the school
and covered a large range of topics in ab initio quantum chemistry.
After the third summer school (held in 1993) it was decided to put together
a second volume with additional material. Important lecture material that was
excluded from the first volume has now been added. The added topics
are: integrals and integral derivatives, SCF theory, coupled-cluster theory,
relativity in quantum chemistry, and density functional theory. One chapter
in the present volume contains the exercise material used at the summer
school, together with solutions to all the exercises.
It is the hope of the authors that the two volumes will find good use in the
scientific community as textbooks for students who are interested in
learning more about modern methodology in molecular quantum chemistry. The
books will be used as teaching material in the European Summer Schools
in Quantum Chemistry that are presently planned.
Lund in July 1994
Björn Roos
NOTES ON HARTREE-FOCK THEORY AND RELATED TOPICS

Jan Almlöf
Department of Chemistry
University of Minnesota
Minneapolis, MN 55455, USA

Contents:

1. Introduction.
2. The Born-Oppenheimer Approximation.
3. Determinant Wavefunctions and the Pauli Principle.
4. Expectation Values With a Determinant Wavefunction.
5. The Hartree-Fock Equations.
6. Spin- and Space Orbitals. Unrestricted Hartree-Fock Theory.
7. Closed-shell Hartree-Fock Theory.
8. Restricted Open-Shell Hartree-Fock Theory.
9. The LCAO Expansion.
10. The Roothaan-Hall Equations.
11. The Self-Consistent Field Procedure.
12. Solution of the Roothaan-Hall Equations.
13. The Supermatrix Formalism.
14. Direct SCF Techniques.
15. Basis sets.
16. Integral Evaluation.
17. Prescreening of Integrals.
18. The Gaussian Product Basis.
19. Approximate Three-Center Expansions.
20. The Semi-Classical Limit.
21. Fermi- and Coulomb Correlation.
22. The Fock Matrix in an Orbital Basis. Koopmans' Theorem.
23. Matrix Elements With Slater Determinants. Brillouin's Theorem.
24. Charge Density and Population Analysis.
25. Closing Remarks.
Appendix A. Notations for integrals.
Appendix B. Parallel Implementations of Hartree-Fock Methods.

1. Introduction.

The numerical challenges encountered when addressing electronic structure
problems from first principles in computational quantum chemistry are humbling.
Solving the Schrödinger equation for a large molecular system amounts to handling sets
of second-order differential or integro-differential equations, often with thousands of
variables. Indeed, the large number of particles that have to be treated in a
quantum-mechanical description of a chemical system is certainly one of the greatest obstacles to
quantum chemistry. The equations may have millions of singularities, and an accuracy
of a few parts per billion is usually required. While a lot of sophisticated method
development has been devoted to this problem with spectacular progress in the last
couple of decades, the current state of the art nevertheless leaves both room and need
for improvement. Many problems where a theoretical-computational approach could
have an immense potential would require a quantitative description of extended
molecular systems, i.e., molecules in the range $10^2$ - $10^4$ atoms.
Conformational effects on the activity of enzymes and other biomolecules, the
structure of liquids, solvation of ions and surface reactions such as corrosion and
heterogeneous catalysis are only a few of the subjects which involve the chemistry of
large, non-periodic systems. Addressing this class of problems with ab initio methods
would certainly be an ambitious undertaking, but also an extremely rewarding one if
successful. In the past, however, this goal would have been utterly unrealistic. The
application of accurate quantum-chemical methods to these and similar problems has
been severely hampered by the insufficient ability to treat the large chemical systems
that are inevitably encountered in these areas of chemistry. However, the rapid
progress in computer hardware of the last decades, combined with the even faster
refinement of computational methods during the same period, has made this a very
realistic objective which could well be reached before the end of this century.
Except in very simple systems, there is no way the accuracy requirements of
modern chemistry can be met with a brute-force application of numerical techniques
alone. Even the most sophisticated numerical methods for solving differential
equations would fall short of the stated goal. Instead, methods and approximations
must be chosen that take advantage of our chemical knowledge about the system under
consideration, but without biasing the results to meet preconceived expectations.
The Hartree-Fock method which we will now discuss is an example of one
such approximation, which has had a profound and continuing impact on chemistry. It
introduces in a natural way the concept of molecular orbitals, which has become an
invaluable conceptual tool for qualitatively describing the electronic structure of
complex systems, at the same time as it provides a computational method that often leads
to results of highly useful quantitative accuracy.

2. The Born-Oppenheimer Approximation.

Separation of variables is a common method for simplifying complicated
Schrödinger equations for systems with many variables, e.g., in separating the time
variable from the spatial coordinates in a stationary system, or in the multi-dimensional
harmonic oscillator. In these cases the separation enables us to find exact solutions. In
many other cases, however, the separation is not exact, and should be regarded as an
approximation introduced for the purpose of simplifying the numerical procedure.
One such separation step, which is universally assumed in electronic structure
theory, is a separation of electronic and nuclear motion, the Born-Oppenheimer
approximation. This approximation is based on the difference in mass between
electrons and nuclei (a factor of $10^3$ - $10^5$), and assumes that the electrons move on a
different "time"-scale than the nuclei, such that the electrons follow the nuclei
"instantaneously" during the motion of the latter. Without pretending to give a complete
derivation or proof, a simplified discussion of the Born-Oppenheimer approximation
would go as follows:
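The total, non-relativistic Hamiltonian for a system of charged particles
(electrons and nuclei) can be written in atomic units as¹

$$\mathcal{H} = -\sum_a \frac{1}{2 m_a} \nabla_a^2 + \sum_{a<b} \frac{Q_a Q_b}{|r_a - r_b|} \tag{2.1a}$$

where $\nabla_a$ is the gradient operator for particle a;

$$\nabla_a = \left( \frac{\partial}{\partial x_a},\ \frac{\partial}{\partial y_a},\ \frac{\partial}{\partial z_a} \right) \tag{2.1b}$$

and

$$\nabla_a^2 = \frac{\partial^2}{\partial x_a^2} + \frac{\partial^2}{\partial y_a^2} + \frac{\partial^2}{\partial z_a^2}. \tag{2.1c}$$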

In (2.1), the indices a, b, ... label all the particles of the system regardless of their
nature, $m_a$ are the masses of these particles, $r_a$ their positions and $Q_a$ their charges.
Distinguishing between nuclear coordinates {R}, with indices μ, ν, .., and electronic
coordinates {r}, denoted i, j, .., we can now rewrite the expression for the
Hamiltonian (2.1) as²

$$\mathcal{H} = \mathcal{T}_{nuc} + \mathcal{H}_{el}, \tag{2.2}$$

with

$$\mathcal{T}_{nuc} = -\sum_{\mu} \frac{1}{2 m_\mu} \nabla_\mu^2 \tag{2.3}$$

¹ We use a script font for quantities that symbolize general many-electron operators.
² To simplify the notation, atomic units have been used here and throughout the presentation.

and

$$\mathcal{H}_{el} = -\frac{1}{2}\sum_i \nabla_i^2 - \sum_{\mu,i} \frac{Q_\mu}{|r_\mu - r_i|} + \sum_{i<j} \frac{1}{|r_i - r_j|} + \sum_{\mu<\nu} \frac{Q_\mu Q_\nu}{|r_\mu - r_\nu|} \tag{2.4}$$

Notice here that the separation is not symmetric in electrons and nuclei! The
electronic Hamiltonian depends parametrically on the nuclear positions - the nuclear
coordinates appear in the electronic Hamiltonian, but derivatives with respect to these
coordinates do not - and the electronic problem can therefore be solved for nuclei
which are momentarily clamped to fixed positions in space:

$$\mathcal{H}_{el}\, \psi_{el}(r;R) = E_{el}(R)\, \psi_{el}(r;R) \tag{2.5}$$

Note that the electronic energy and wavefunction are still functions of the nuclear
coordinates R. We now approximate the total wavefunction as a product,

$$\Psi_{BO} = \psi_{nuc}(R)\, \psi_{el}(r;R) \tag{2.6}$$

where the nuclear wavefunction $\psi_{nuc}(R)$ is a solution to the equation

$$\left[ \mathcal{T}_{nuc} + E_{el}(R) \right] \psi_{nuc}(R) = E\, \psi_{nuc}(R) \tag{2.7}$$

How good is the Born-Oppenheimer approximation? If we were to apply the
total Hamiltonian to the B-O wavefunction (2.6), we would get³

$$\mathcal{H}\, \Psi_{BO} = [\mathcal{H}_{el} + \mathcal{T}_{nuc}]\, \psi_{nuc}(R)\psi_{el}(r;R) = \psi_{nuc}(R)\, \mathcal{H}_{el}\, \psi_{el}(r;R) + \mathcal{T}_{nuc}\, [\psi_{nuc}(R)\psi_{el}(r;R)] \tag{2.8}$$

$$= \psi_{nuc}(R)\, E_{el}(R)\, \psi_{el}(r;R) + \psi_{el}(r;R)\, \mathcal{T}_{nuc}\, \psi_{nuc}(R) - \sum_\mu \frac{1}{2 m_\mu} \left\{ 2\, [\nabla_\mu \psi_{el}(r;R)] \cdot [\nabla_\mu \psi_{nuc}(R)] + \psi_{nuc}(R)\, \nabla_\mu^2 \psi_{el}(r;R) \right\} \tag{2.9}$$

where $\nabla_\mu$ is the gradient operator for the coordinates of nucleus μ. The important thing
to notice here is that the electronic wavefunction still depends on the nuclear
coordinates, and that the nuclear kinetic energy operator thus affects it. However,
although this dependence is formally present, it is usually insignificant due to the large
difference in mass between nuclei and electrons, which makes the former move much
more slowly than the latter. In other words, the electronic wavefunction normally
varies little upon small changes in the nuclear positions, and the terms containing
$\nabla_\mu \psi_{el}(r;R)$ and $\nabla_\mu^2 \psi_{el}(r;R)$ can therefore be neglected, leading to:

$$\mathcal{H}\, \Psi_{BO} \approx \psi_{el}(r;R) \left[ \mathcal{T}_{nuc} + E_{el}(R) \right] \psi_{nuc}(R) \tag{2.10}$$

³ Since $\mathcal{H}_{el}$ does not differentiate with respect to the nuclear coordinates, it commutes with $\psi_{nuc}(R)$.

The errors introduced by the Born-Oppenheimer approximation are minute, and
we still think of solutions to the electronic part (2.5) as "exact". In electronic structure
problems, we normally focus on (2.5) and pay no further attention to this issue. We
should notice, however, that a number of the basic concepts in chemistry, such as the
concept of potential energy surfaces, the notion of molecular geometry, etc., are
implicit results of the Born-Oppenheimer approximation.
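
To get a feeling for the mass difference on which the approximation rests, the ratios can be tabulated with a few lines of Python. This is only an illustrative sketch: the conversion factor (one atomic mass unit is about 1822.888 electron masses) and the approximate atomic masses are values supplied here rather than taken from the text, and the small electron contribution to the atomic masses is ignored.

```python
# Nucleus-to-electron mass ratios in atomic units (electron mass = 1).
# Assumed constant: 1 atomic mass unit is roughly 1822.888 electron masses.
AMU_IN_ELECTRON_MASSES = 1822.888

# Approximate atomic masses in atomic mass units (illustrative values only).
atomic_masses_amu = {
    "H-1": 1.008,
    "C-12": 12.000,
    "U-238": 238.051,
}


def mass_ratio(mass_amu: float) -> float:
    """Return m_nucleus / m_electron for a mass given in atomic mass units."""
    return mass_amu * AMU_IN_ELECTRON_MASSES


ratios = {name: mass_ratio(m) for name, m in atomic_masses_amu.items()}

for name, ratio in ratios.items():
    # The nuclear kinetic-energy prefactor 1/(2 m_mu) scales as 1/ratio.
    print(f"{name:6s}  m_nuc/m_e = {ratio:10.0f}   1/(2 m_mu) = {0.5 / ratio:.2e}")
```

The lightest and heaviest entries bracket the range of mass ratios quoted above.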
The separation of nuclear and electronic coordinates is an essential first step in
simplifying a molecular Schrödinger equation to the point where actual computation can
take place. However, for systems of any chemical interest the electronic part of the
problem is still much too complicated to be treated exactly. A quick glance at the
electronic Hamiltonian (2.4) reveals a differential equation in (3n) coordinates,⁴ with
numerous singularities and other unpleasant features. Add to this that the eigenvalues of
this Hamiltonian are equal to total electronic energies, which are of the order $10^5$ to $10^8$
kcal/mol, whereas their relevance to chemical problems requires them to be determined
with a precision of a few kcal/mol or better. It is clear that the numerical challenge is
immense, and that we cannot rely only on standard techniques of applied mathematics
to solve the problem. Even the most sophisticated and efficient numerical methods for
solving differential equations would fall short of the stated objective for accuracy.
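
The precision requirement can be put in numbers. The sketch below assumes a target of 1 kcal/mol (an illustrative choice within the "few kcal/mol or better" quoted above) and divides it by the two limits given for the total energy; the function name is hypothetical.

```python
# Relative precision needed to resolve a chemically relevant energy
# difference on top of a large total electronic energy.
TARGET_KCAL_PER_MOL = 1.0  # assumed target accuracy, chosen for illustration


def relative_precision(total_energy_kcal_per_mol: float) -> float:
    """Return the target accuracy divided by the magnitude of the total energy."""
    return TARGET_KCAL_PER_MOL / abs(total_energy_kcal_per_mol)


for total in (1.0e5, 1.0e8):  # range of total energies quoted in the text
    print(f"E_total = {total:.0e} kcal/mol -> relative precision {relative_precision(total):.0e}")
```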

3. Determinant Wavefunctions and the Pauli Principle.

From the above discussion, we realize the need for a 'quick and dirty' method
that can provide qualitatively correct, approximate solutions to the many-electron
Schrödinger equation. The Hartree-Fock method provides such a solution, at the same
time as it also satisfies our need for a manageable model of the electronic structure of
many-electron systems.
The separation of variables which are not strongly interdependent worked well
in the Born-Oppenheimer approximation discussed above. (Incidentally, it is also
essential in solving the nuclear motion problem defined by (2.7), which is the key
equation in theoretical vibrational spectroscopy.) An analogous trial wavefunction for a
many-electron system would be a product of one-electron wavefunctions, a Hartree
product:

$$\Psi(1,2,\ldots n) = \varphi_1(1)\,\varphi_2(2)\,\varphi_3(3) \cdots \varphi_n(n) \tag{3.1}$$

In (3.1) the functions $\varphi_i$ are wavefunctions describing a single electron: orbitals. This
model was used in the early days of quantum mechanics to carry out crude calculations
of the electronic structure of atoms, but it has some obvious deficiencies.

⁴ Here, and throughout the presentation, we use 'n' to represent the number of electrons in a
many-electron system.

For a start, we note that electrons are identical particles, a fact which ought to be
reflected by the wavefunction. If two identical particles "i" and "j" were to be
interchanged, there should be no detectable change in any of the observable properties
of the system. In particular, the probability density, as defined by the square amplitude
of the wavefunction, cannot be allowed to change upon such a manipulation. A very
basic requirement on any reasonable trial wavefunction should therefore be to satisfy
the condition $\mathcal{P}_{ij} \Psi = \pm\Psi$ for any two identical particles i and j, where $\mathcal{P}_{ij}$ is the
permutation operator interchanging the coordinates of these two particles. At this point,
we cannot decide from first principles which sign would be appropriate, or even
whether the sign matters. We accept as a postulate (supported by experiment) that

$$\mathcal{P}_{ij} \Psi = -\Psi \tag{3.2}$$

for electrons, as it is for all particles with half-integer spin (fermions).
This is the important Pauli Exclusion Principle. In plain words, it states that

"a many-electron wavefunction must be antisymmetric with respect to interchange of
the coordinates of any two electrons."

To proceed, we note that any two-electron function $\Psi(1,2)$ can be written as

$$\Psi(1,2) \equiv \tfrac{1}{2}\left[\Psi(1,2) + \Psi(2,1)\right] + \tfrac{1}{2}\left[\Psi(1,2) - \Psi(2,1)\right] \tag{3.3a}$$

i.e.,

$$\Psi = \Psi_{symm} + \Psi_{antisymm} \tag{3.3b}$$

whether it satisfies the Pauli Principle or not. Given an arbitrary approximate
wavefunction for two electrons, $\Psi(1,2)$, one can thus always form

$$\Psi(1,2)_{antisymm} = \tfrac{1}{2}\left[\Psi(1,2) - \Psi(2,1)\right] = \tfrac{1}{2}\left[1 - \mathcal{P}_{12}\right] \Psi \tag{3.4}$$

which is an antisymmetric function regardless of $\Psi(1,2)$ (unless $\Psi(1,2)$ happens to be
exactly symmetric, in which case $\Psi(1,2)_{antisymm} = 0$). Similarly, for three electrons
we obtain:

$$\Psi(1,2,3)_{antisymm} = \tfrac{1}{6}\left[\Psi(1,2,3) - \Psi(2,1,3) - \Psi(1,3,2) - \Psi(3,2,1) + \Psi(3,1,2) + \Psi(2,3,1)\right]$$

$$= \tfrac{1}{6}\left[1 - \mathcal{P}_{12} - \mathcal{P}_{23} - \mathcal{P}_{13} + \mathcal{P}_{231} + \mathcal{P}_{312}\right] \Psi(1,2,3) \tag{3.5}$$

(3.4) and (3.5) can be viewed as projections, which project out the totally
antisymmetric part of any function $\Psi$. In general:

$$\Psi(1,2 \ldots n)_{antisymm} = \sum_{P} (-1)^p\, \mathcal{P}\, \Psi(1,2 \ldots n) \tag{3.6}$$

The sum in (3.6) is over all possible permutations P of electrons 1,2,... n, and p is the
parity of the permutation P. Application of the antisymmetrizer

$$\mathcal{A} = \sum_{P} (-1)^p\, \mathcal{P} \tag{3.7}$$

to the Hartree product (3.1) thus leads to a sum of terms with alternating signs,
containing a total of n! different products of orbitals. We notice that the definition (3.6)
is identical to that of a determinant:

$$\Psi = C \begin{vmatrix} \varphi_1(1) & \varphi_2(1) & \cdots & \varphi_n(1) \\ \varphi_1(2) & \varphi_2(2) & \cdots & \varphi_n(2) \\ \vdots & \vdots & & \vdots \\ \varphi_1(n) & \varphi_2(n) & \cdots & \varphi_n(n) \end{vmatrix} \tag{3.8}$$

where $\{\varphi_1, \varphi_2, \varphi_3, \ldots, \varphi_n\} = \boldsymbol{\varphi}$ is a set of one-electron orbitals,⁵ and C is a
normalization constant to be discussed later. The many-electron functions (3.8) are
usually referred to as Slater determinants. The equivalence between (3.8) and (3.4) or
(3.5) can be verified explicitly for the two- and three-electron cases above.
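
Such a check is also easy to carry out numerically. The following sketch, using only the Python standard library, applies the antisymmetrizer (3.7) to a three-electron Hartree product and compares the result with the determinant (3.8) for C = 1. The three "orbitals" and the coordinate values are hypothetical choices made for illustration, and the helper names are not taken from the text.

```python
from itertools import permutations
import math


def parity(perm):
    """Return 0 for an even permutation and 1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def antisymmetrized_product(orbitals, coords):
    """Sum over P of (-1)^p P{phi_1(1) phi_2(2) ... phi_n(n)}, as in (3.6)-(3.7)."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        sign = -1.0 if parity(perm) else 1.0
        # P permutes the electron labels: orbital k is evaluated at electron perm[k].
        term = math.prod(orbitals[k](coords[perm[k]]) for k in range(n))
        total += sign * term
    return total


def determinant(matrix):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    det = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        det += (-1) ** col * matrix[0][col] * determinant(minor)
    return det


# Hypothetical one-dimensional "orbitals", chosen only for illustration.
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x * x]
coords = [0.3, -1.2, 2.0]

lhs = antisymmetrized_product(orbitals, coords)
# Rows label electrons, columns label orbitals, as in (3.8) with C = 1.
slater_matrix = [[phi(x) for phi in orbitals] for x in coords]
rhs = determinant(slater_matrix)

# Interchanging the coordinates of electrons 1 and 2 should flip the sign.
swapped = antisymmetrized_product(orbitals, [coords[1], coords[0], coords[2]])
print(lhs, rhs, swapped)
```

The permutation sum and the determinant agree, and the function changes sign when two electron coordinates are interchanged, as required by (3.2).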
Obviously, it becomes impractical to write out (3.8) in full every time we have a
determinant wavefunction in mind, and we therefore introduce the abbreviation

(3.9)
for a Slater determinant. The essence of the Hartree-Fock method is thus that the
wavefunction is written as a determinant of one-electron orbitals. For simplicity we
assume that the orbitals are orthonormal, i.e.

$$\langle \boldsymbol{\varphi} | \boldsymbol{\varphi} \rangle = \mathbf{1} \tag{3.10a}$$

(a unit matrix), or

$$\langle \varphi_i | \varphi_j \rangle = \delta_{ij} \tag{3.10b}$$

We will show later (5.10 - 5.16) that this assumption does not restrict the generality of
the approach.
We now investigate some properties of the determinant wavefunction. From
the above, we know that we can expand the determinant as a sum of products:

⁵ We will be using this "shadowed" (bold) font whenever we refer to a matrix or vector in n dimensions.

$$\Psi = C \sum_{P}^{n!} (-1)^p\, \mathcal{P} \left\{ \varphi_1(1)\varphi_2(2)\varphi_3(3) \cdots \varphi_n(n) \right\} \tag{3.11}$$

The sum in (3.11) contains all possible permutations of all the electrons. However, we
could also view the ordering of the electrons as fixed, and instead permute the orbitals
in all possible ways. As we sum over all possible permutations in both cases, the
results would be the same, but the latter approach is sometimes a more useful way of
viewing the Slater determinant. A characteristic term in the expansion is then:

$$\varphi_a(1)\varphi_b(2) \cdots \varphi_k(n) \tag{3.12}$$

where {a, b, ... k} is a permutation of the numbers {1, 2, ..., n}. We can now write
the normalization or "self-overlap" integral

$$\langle \Psi | \Psi \rangle = \int D(1,2,\ldots n)^{*}\, D(1,2,\ldots n)\, dr_1\, dr_2 \cdots dr_n \tag{3.13a}$$

as:

$$\langle \Psi | \Psi \rangle = C^2 \sum_{P}^{n!} \sum_{P'}^{n!} (-1)^{p+p'} \left\langle \mathcal{P} \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \mathcal{P}' \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle \tag{3.13b}$$

Here, it is useful to keep in mind that the electronic coordinates are "dummy" variables
- since we integrate over them, we can call them anything we want, as long as we are
naming them in a consistent manner. Furthermore, we note that applying the same
permutation to the electrons and the orbitals would result in a Hartree product identical
to the original one, and the effect of a permutation $\mathcal{P}$ on the electrons is therefore equal
to that of the inverse permutation $\mathcal{P}^{-1}$ on the orbitals. One term in the sum (3.13b) can
be written as:

$$\left\langle \mathcal{P} \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \mathcal{P}' \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle = \left\langle \varphi_a(1)\varphi_b(2) \cdots \varphi_k(n) \,\middle|\, \varphi_{a'}(1)\varphi_{b'}(2) \cdots \varphi_{k'}(n) \right\rangle \tag{3.14a}$$

which, upon re-labeling the electrons, becomes

(3.14b)

The left-hand side is the original Hartree product - we just have to reorder the orbitals.
The right-hand side is the result of applying the permutation $\mathcal{P}$ to the electrons and $\mathcal{P}'$
to the orbitals. The former operation is equivalent to applying $\mathcal{P}^{-1}$ to the orbitals, and
we can therefore write (3.14b) as:

$$\left\langle \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \mathcal{P}^{-1}\mathcal{P}' \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle \tag{3.14c}$$

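We thus obtain for the entire integral over determinants:

$$\langle \Psi | \Psi \rangle = C^2 \sum_{P}^{n!} \sum_{P'}^{n!} (-1)^{p+p'} \left\langle \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \mathcal{P}^{-1}\mathcal{P}' \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle \tag{3.15a}$$

The product of two permutations $\mathcal{Q} = \mathcal{P}^{-1}\mathcal{P}'$ is just another permutation with
parity q = p+p'. Furthermore, as each of $\mathcal{P}'$ and $\mathcal{P}$ goes through all possible
permutations, $\mathcal{Q}$ will pass through each permutation exactly n! times. Thus,

$$\langle \Psi | \Psi \rangle = n!\, C^2 \sum_{Q}^{n!} (-1)^q \left\langle \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \mathcal{Q} \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle$$

$$= n!\, C^2 \sum_{Q}^{n!} (-1)^q\, \langle \varphi_1 | \varphi_{a_1} \rangle_1 \langle \varphi_2 | \varphi_{a_2} \rangle_2 \langle \varphi_3 | \varphi_{a_3} \rangle_3 \cdots \langle \varphi_n | \varphi_{a_n} \rangle_n \tag{3.15b}$$

where $\{a_1, a_2, \ldots a_n\}$ is a permutation of the indices {1, 2, ... n}, and where the
bracket notation $\langle \varphi_p | \varphi_q \rangle_i$ refers to an overlap integral over the orbitals (see Appendix A
for a definition). The outer subscripts 'i' for the integrals $\langle \varphi_i | \varphi_{a_i} \rangle_i$ in (3.15) indicate
the electron whose coordinates we are integrating over. With orthogonality of the
orbitals, the integrals $\langle \varphi_i | \varphi_{a_i} \rangle$ are zero unless $a_i = i$, according to (3.10), so a term
survives only if $a_i = i$ for all 'i'. The only $\mathcal{Q}$ in the sum to give a non-zero contribution
must therefore be the identity operation! Thus, we finally get
$\langle \Psi | \Psi \rangle = C^2\, n!\; 1 \cdot 1 \cdot 1 \cdots$, from which
we conclude that the normalized Slater determinant wavefunction is:

$$\Psi = (n!)^{-1/2} \begin{vmatrix} \varphi_1(1) & \varphi_2(1) & \cdots & \varphi_n(1) \\ \varphi_1(2) & \varphi_2(2) & \cdots & \varphi_n(2) \\ \vdots & \vdots & & \vdots \\ \varphi_1(n) & \varphi_2(n) & \cdots & \varphi_n(n) \end{vmatrix} \tag{3.16}$$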
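
The normalization constant can be checked on a small discrete model, in which each electron coordinate takes only four values and integrals become sums. The sketch below is an illustration under these assumptions: the three orthonormal "orbitals" are hypothetical vectors on a four-point grid, not functions from the text.

```python
from itertools import product
import math


def determinant(matrix):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    det = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        det += (-1) ** col * matrix[0][col] * determinant(minor)
    return det


# Three orthonormal "orbitals" on a hypothetical four-point grid.
orbitals = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
]
n = len(orbitals)
grid_points = range(len(orbitals[0]))


def self_overlap_unnormalized():
    """Sum |det|^2 over all electron positions: discrete analogue of (3.13a), C = 1."""
    total = 0.0
    for positions in product(grid_points, repeat=n):
        # Rows label electrons, columns label orbitals.
        slater = [[phi[x] for phi in orbitals] for x in positions]
        total += determinant(slater) ** 2
    return total


raw_norm = self_overlap_unnormalized()       # expected to equal n!
C = 1.0 / math.sqrt(math.factorial(n))       # normalization constant of (3.16)
normalized_norm = C ** 2 * raw_norm
print(raw_norm, normalized_norm)
```

With C = 1 the self-overlap comes out as n!, so the factor $(n!)^{-1/2}$ normalizes the determinant.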

We have shown above that it is indeed possible to evaluate integrals over
n-electron determinant wavefunctions, even though each determinant is a sum of n!
terms. In electronic structure theory we frequently need to evaluate integrals involving
the many-electron Hamiltonian operator with these wavefunctions. This may at first
appear to be an impossible undertaking, given the n! terms in the wavefunction.
However, we have seen how the determinant properties of the wavefunction are useful
in simplifying this task for the normalization integrals, and similar techniques will be
used next to evaluate integrals whose appearance may be even more forbidding.

4. Expectation Values With a Determinant Wavefunction.

The electronic Hamiltonian for a many-electron system can be written in terms
of zero-, one-, and two-electron terms as:

$$\mathcal{H} = h_0 + \sum_{i}^{n} h_i + \sum_{i<j}^{n} g_{ij}, \tag{4.1}$$

where

$$h_0 = \sum_{\mu<\nu} \frac{Q_\mu Q_\nu}{|r_\mu - r_\nu|} \tag{4.2a}$$

is a trivial additive constant for fixed nuclear positions, which we will usually leave
outside the discussion in the rest of this Chapter;

$$h_i = -\frac{1}{2} \nabla_i^2 - \sum_{\mu} \frac{Q_\mu}{|r_\mu - r_i|} \tag{4.2b}$$

depends on the coordinates of a single electron; and the two-electron operator

$$g_{ij} = \frac{1}{r_{ij}} \tag{4.2c}$$

describes the interaction between two particles with unit charge.⁶ Here again, we recall
that the indices "i" and "j" label the electrons, "μ" and "ν" the nuclei. In evaluating the
energy as an expectation value of the Hamiltonian, $E = \langle \Psi | \mathcal{H} | \Psi \rangle$, we will need the
integrals $\langle \Psi | \sum_i h_i | \Psi \rangle$ and $\langle \Psi | \sum_{i<j} g_{ij} | \Psi \rangle$, in addition to the overlap integral encountered
in Section 3. For the determinant wavefunction, the one-electron integral is given by:

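$$\left\langle \Psi \middle| \sum_{i}^{n} h_i \middle| \Psi \right\rangle = \frac{1}{n!} \sum_{P}^{n!} \sum_{P'}^{n!} (-1)^{p+p'} \left\langle \mathcal{P} \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \Big[\sum_{i}^{n} h_i\Big] \,\middle|\, \mathcal{P}' \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle \tag{4.3}$$

Just like in the case of the normalization integral (3.15), we apply the inverse
permutation $\mathcal{P}^{-1}$ and re-label the integration variables:

$$\left\langle \Psi \middle| \sum_{i}^{n} h_i \middle| \Psi \right\rangle = \frac{1}{n!} \sum_{P}^{n!} \sum_{P'}^{n!} (-1)^{p+p'} \left\langle \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \Big[\mathcal{P}^{-1} \sum_{i}^{n} h_i\Big] \,\middle|\, \mathcal{P}^{-1}\mathcal{P}' \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle \tag{4.4}$$

⁶ Note that there is no direct physical interaction between elementary particles involving more than
two particles. This is in contrast to the interaction between more complex objects such as atoms or
molecules, where many-body interaction effects are prevalent.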

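Individual terms in the sums of operators in (4.1) may be affected by a
permutation operator. However, the total sum remains unchanged, and we can thus
use the same simplifications of the expression as for the normalization integral, i.e.:

$$\left\langle \Psi \middle| \sum_{i}^{n} h_i \middle| \Psi \right\rangle = \sum_{Q}^{n!} (-1)^q \left\langle \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \Big[\sum_{i}^{n} h_i\Big] \,\middle|\, \mathcal{Q} \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle$$

$$= \sum_{i}^{n} \sum_{Q}^{n!} (-1)^q\, \langle \varphi_1 | \varphi_{a_1} \rangle_1 \langle \varphi_2 | \varphi_{a_2} \rangle_2 \langle \varphi_3 | \varphi_{a_3} \rangle_3 \cdots \langle \varphi_i | h_i | \varphi_{a_i} \rangle_i \cdots \langle \varphi_n | \varphi_{a_n} \rangle_n \tag{4.5}$$

where $\langle \varphi_i | h_i | \varphi_{a_i} \rangle_i$ denotes integrals over the one-electron operator.⁷
All the one-electron overlap integrals $\langle \varphi_k | \varphi_{a_k} \rangle$ vanish for $a_k \neq k$, whereas the
integral $\langle \varphi_i | h_i | \varphi_{a_i} \rangle$ can be non-zero even if $a_i \neq i$. In order to get a non-zero result, all
orbitals must thus match: $a_k = k$, except possibly $a_i$ and i. But, since $a_1, a_2, \ldots$ is
just a permutation of the original indices 1,2, ... n, it is not possible to have a mismatch
in only one place! Thus, if $a_k = k$ for all the (n-1) overlap integrals, we must also have
$a_i = i$. We thus conclude that the expression for an integral over the one-electron part of
the Hamiltonian with Slater determinants is:

$$\left\langle \Psi \middle| \sum_{i}^{n} h_i \middle| \Psi \right\rangle = \sum_{i}^{n} \langle \varphi_i | h_i | \varphi_i \rangle_i = \sum_{i}^{n} \langle \varphi_i | h | \varphi_i \rangle \tag{4.6}$$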
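
The result (4.6) can be checked by brute force for two electrons on a small discrete grid, where the one-electron operator becomes a symmetric matrix and integrals become sums. The sketch below is only an illustration: the grid, the two orthonormal "orbitals" and the matrix `h` are hypothetical values, not data from the text.

```python
import math

# Two orthonormal "orbitals" on a hypothetical four-point grid.
phi1 = [0.5, 0.5, 0.5, 0.5]
phi2 = [0.5, -0.5, 0.5, -0.5]
orbitals = [phi1, phi2]
m = len(phi1)

# Hypothetical symmetric matrix standing in for the one-electron operator h.
h = [
    [-2.0, 0.3, 0.0, 0.1],
    [0.3, -1.5, 0.2, 0.0],
    [0.0, 0.2, -1.0, 0.4],
    [0.1, 0.0, 0.4, -0.5],
]


def psi(x1, x2):
    """Normalized two-electron Slater determinant, (3.16) with n = 2."""
    return (phi1[x1] * phi2[x2] - phi2[x1] * phi1[x2]) / math.sqrt(2.0)


def brute_force_expectation():
    """<Psi| h(1) + h(2) |Psi> summed explicitly over both electron coordinates."""
    total = 0.0
    for x1 in range(m):
        for x2 in range(m):
            # h(1) acts on the first coordinate, h(2) on the second.
            h_psi = sum(h[x1][y] * psi(y, x2) for y in range(m))
            h_psi += sum(h[x2][y] * psi(x1, y) for y in range(m))
            total += psi(x1, x2) * h_psi
    return total


def orbital_sum():
    """Right-hand side of (4.6): sum over occupied orbitals of <phi_i|h|phi_i>."""
    return sum(
        phi[x] * h[x][y] * phi[y]
        for phi in orbitals
        for x in range(m)
        for y in range(m)
    )


many_electron_value = brute_force_expectation()
one_electron_value = orbital_sum()
print(many_electron_value, one_electron_value)
```

The explicit sum over both electron coordinates and the simple sum of orbital integrals give the same number.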

For the two-electron operator $\sum g_{ij}$, the situation is only slightly more
complicated. Recall that in Eq. (4.6) the simplification arises from the fact that we
require $a_k = k$ for all $k \neq i$, which also automatically leads to $a_i = i$. By analogy, in the
two-electron operator we should require $a_k = k$ for all k except for k=i or k=j. But this
leaves us with only two possibilities: either $a_i = i$, $a_j = j$, or $a_i = j$, $a_j = i$. In the first
case, $\mathcal{Q}$ is the identity operator and $(-1)^q = +1$; in the second case $\mathcal{Q}$ permutes i and j,
and the sign factor is negative. With arguments that parallel those used in the
one-electron case, the expectation value for the two-electron operator becomes:

$$\left\langle \Psi \middle| \sum_{i<j}^{n} g_{ij} \middle| \Psi \right\rangle = \sum_{Q}^{n!} (-1)^q \left\langle \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \,\middle|\, \Big[\sum_{i<j}^{n} g_{ij}\Big] \,\middle|\, \mathcal{Q} \{\varphi_1(1)\varphi_2(2) \cdots \varphi_n(n)\} \right\rangle$$

$$= \sum_{i<j}^{n} \sum_{Q}^{n!} (-1)^q\, \langle \varphi_1 | \varphi_{a_1} \rangle_1 \cdots \langle \varphi_i \varphi_j | g_{ij} | \varphi_{a_i} \varphi_{a_j} \rangle_{i,j} \cdots \langle \varphi_n | \varphi_{a_n} \rangle_n$$

$$= \sum_{i<j}^{n} \left\{ \langle \varphi_i \varphi_j | g_{ij} | \varphi_i \varphi_j \rangle_{i,j} - \langle \varphi_i \varphi_j | g_{ij} | \varphi_j \varphi_i \rangle_{i,j} \right\} \tag{4.7}$$

⁷ The notation used for this integral and many others is discussed more fully in Appendix A.

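A term with i = j in (4.7) would make no physical sense, since it would essentially
pretend to describe a "self-interaction" of the electron. However, it is easily seen that
such a term would vanish, and it is thus convenient to rewrite (4.7) as

$$\left\langle \Psi \middle| \sum_{i<j}^{n} g_{ij} \middle| \Psi \right\rangle = \frac{1}{2} \sum_{i,j}^{n} \left\{ \langle \varphi_i \varphi_j | g | \varphi_i \varphi_j \rangle - \langle \varphi_i \varphi_j | g | \varphi_j \varphi_i \rangle \right\} \tag{4.8}$$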

(4.9)

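We can rewrite this expression further as

$$\left\langle \Psi \middle| \sum_{i<j}^{n} g_{ij} \middle| \Psi \right\rangle = \frac{1}{2} \sum_{i,j}^{n} \langle \varphi_i | \{ J_j - K_j \} | \varphi_i \rangle = \frac{1}{2} \sum_{i}^{n} \langle \varphi_i | \{ J - K \} | \varphi_i \rangle \tag{4.10}$$

where we have introduced the Coulomb and exchange operators $J_j$ and $K_j$, defined
through their action on an arbitrary function $\theta(1)$ such that

$$J_j\, \theta(1) = \left[ \int \varphi_j^{*}(2)\, \frac{1}{r_{12}}\, \varphi_j(2)\, dr_2 \right] \theta(1) \tag{4.11}$$

$$K_j\, \theta(1) = \left[ \int \varphi_j^{*}(2)\, \frac{1}{r_{12}}\, \theta(2)\, dr_2 \right] \varphi_j(1) \tag{4.12}$$

and also the total Coulomb and exchange operators

$$J = \sum_{i}^{n} J_i \tag{4.13a}$$

$$K = \sum_{i}^{n} K_i \tag{4.13b}$$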

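The expectation value of the Hamiltonian with the determinant wavefunction is thus:

$$E(\Psi) = \langle \Psi | \mathcal{H} | \Psi \rangle = \sum_{i}^{n} \langle \varphi_i | h | \varphi_i \rangle + \frac{1}{2} \sum_{i}^{n} \langle \varphi_i | (J - K) | \varphi_i \rangle$$

$$= \sum_{i}^{n} \langle \varphi_i | h | \varphi_i \rangle + \frac{1}{2} \sum_{i,j}^{n} \left\{ \langle \varphi_i \varphi_j | g | \varphi_i \varphi_j \rangle - \langle \varphi_i \varphi_j | g | \varphi_j \varphi_i \rangle \right\} \tag{4.14}$$

which is the desired result - an expression for the total (electronic) energy for a Slater
determinant wavefunction, evaluated as a proper expectation value of the Hamiltonian.
Notice that within the Born-Oppenheimer model the Hamiltonian used here is
exact: the only approximation introduced is that of a single-determinant wavefunction.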
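
The step from the restricted sum (4.7) to the unrestricted sum (4.8), and the energy expression (4.14) itself, can be illustrated on a small discrete model. The sketch below is an assumption-laden illustration rather than a working Hartree-Fock code: the grid, the orbitals, the matrix `h` and the softened interaction `g` are all hypothetical.

```python
import math

# Three orthonormal "orbitals" on a hypothetical four-point grid.
orbitals = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
]
n = len(orbitals)
m = len(orbitals[0])

# Hypothetical symmetric one-electron matrix standing in for h.
h = [
    [-2.0, 0.3, 0.0, 0.1],
    [0.3, -1.5, 0.2, 0.0],
    [0.0, 0.2, -1.0, 0.4],
    [0.1, 0.0, 0.4, -0.5],
]


def g(x, y):
    """Hypothetical softened repulsion between grid points x and y."""
    return 1.0 / (1.0 + abs(x - y))


def one_electron(i):
    """<phi_i| h |phi_i> on the grid."""
    phi = orbitals[i]
    return sum(phi[x] * h[x][y] * phi[y] for x in range(m) for y in range(m))


def two_electron(i, j, k, l):
    """<phi_i phi_j| g |phi_k phi_l>: electron 1 in phi_i, phi_k; electron 2 in phi_j, phi_l."""
    return sum(
        orbitals[i][x] * orbitals[j][y] * g(x, y) * orbitals[k][x] * orbitals[l][y]
        for x in range(m)
        for y in range(m)
    )


# Restricted sum over pairs i < j, as in (4.7).
pair_sum = sum(
    two_electron(i, j, i, j) - two_electron(i, j, j, i)
    for i in range(n)
    for j in range(i + 1, n)
)

# Unrestricted double sum with a factor 1/2, as in (4.8).
double_sum = 0.5 * sum(
    two_electron(i, j, i, j) - two_electron(i, j, j, i)
    for i in range(n)
    for j in range(n)
)

# For i = j the Coulomb and exchange integrals coincide and cancel.
self_terms = [two_electron(i, i, i, i) - two_electron(i, i, i, i) for i in range(n)]

# Total electronic energy according to (4.14), without the constant h_0.
energy = sum(one_electron(i) for i in range(n)) + double_sum
print(pair_sum, double_sum, energy)
```

Because the "self-interaction" terms cancel exactly, the two ways of summing agree, which is what makes the unrestricted form convenient.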

5. The Hartree-Fock Equations.

In section 3 we introduced the Slater determinant, which appears to be a
reasonable approximation for the many-electron wavefunction. However, the orbitals
that define this wavefunction still have to be determined. We therefore apply the
variation principle to the energy expression obtained with such a wavefunction, with
the tacit assumption that the orbitals leading to the lowest energy are the "best" in a
general sense. As usual when extremum values are sought, we may suspect that
differential calculus will lead to the desired goal, and we therefore change the orbitals
by a small (infinitesimal) amount, $\varphi_i \rightarrow \varphi_i + \delta\varphi_i$. This gives rise to a corresponding
change in the total wavefunction, $\Psi \rightarrow \Psi + \delta\Psi$, as well as in the energy expression.
Replacing $\varphi_i$ by $\varphi_i + \delta\varphi_i$ in (4.14), we obtain

$$E(\Psi) \rightarrow E(\Psi) + \delta E(\Psi) = \sum_{i}^{n} \langle (\varphi_i + \delta\varphi_i) | h | (\varphi_i + \delta\varphi_i) \rangle$$

$$+ \frac{1}{2} \sum_{i,j}^{n} \langle (\varphi_i + \delta\varphi_i)(\varphi_j + \delta\varphi_j) | g | (\varphi_i + \delta\varphi_i)(\varphi_j + \delta\varphi_j) \rangle$$

$$- \frac{1}{2} \sum_{i,j}^{n} \langle (\varphi_i + \delta\varphi_i)(\varphi_j + \delta\varphi_j) | g | (\varphi_j + \delta\varphi_j)(\varphi_i + \delta\varphi_i) \rangle \tag{5.1}$$

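Terms in (5.1) of higher order than linear in $\delta\varphi$ can safely be neglected, and we get
for the variation of the energy:

$$\delta E(\Psi) = \sum_{i}^{n} \langle \delta\varphi_i | h | \varphi_i \rangle + \sum_{i,j}^{n} \left\{ \langle \delta\varphi_i \varphi_j | g | \varphi_i \varphi_j \rangle - \langle \delta\varphi_i \varphi_j | g | \varphi_j \varphi_i \rangle \right\} + \text{c.c.}$$

$$= \sum_{i}^{n} \langle \delta\varphi_i | h | \varphi_i \rangle + \sum_{i}^{n} \langle \delta\varphi_i | (J - K) | \varphi_i \rangle + \text{c.c.} = \sum_{i}^{n} \langle \delta\varphi_i | F | \varphi_i \rangle + \text{c.c.} \tag{5.2}$$

where c.c. denotes the complex conjugate.⁸ In (5.2) we have introduced the Fock
operator

$$F = h + J - K \tag{5.3}$$

We are now in a position to minimize $E(\Psi)$. However, in order to maintain
normalization of the total wavefunction we shall require orthonormality (orthogonality
and normalization) of the orbitals, and the constraints

$$\langle \varphi_i | \varphi_j \rangle = \delta_{ij} \tag{5.4a}$$

$$\langle \boldsymbol{\varphi} | \boldsymbol{\varphi} \rangle = \mathbf{1} \tag{5.4b}$$

⁸ Remember that we have to take the complex conjugate of the wavefunction to the left when we
evaluate expectation values of the form $\langle \Psi_1 | OP | \Psi_2 \rangle$.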
14

thus need to be applied throughout the optimization. This can be done in several ways.
One of the more common approaches uses the technique of Lagrangian multipliers,
which we will assume known to the reader. Accordingly, we instead minimize the new
quantity

$$\tilde{E} = E(\Psi) - \sum_{i,j}^{n} \lambda_{ji} \left( \langle \varphi_i | \varphi_j \rangle - \delta_{ij} \right) \tag{5.5}$$

with respect to the wavefunction parameters as well as the Lagrangian multipliers $\lambda_{ji}$:

$$\delta\tilde{E} = \sum_{i}^{n} \langle \delta\varphi_i | F | \varphi_i \rangle - \sum_{i,j}^{n} \lambda_{ji} \langle \delta\varphi_i | \varphi_j \rangle + \text{c.c.} \tag{5.6}$$

$$= \sum_{i}^{n} \Big\langle \delta\varphi_i \Big| \Big\{ F \varphi_i - \sum_{j}^{n} \lambda_{ji} \varphi_j \Big\} \Big\rangle + \text{c.c.} \tag{5.7}$$

If the orbitals $\varphi_i$ are the "best" possible in the sense of the variation principle, then
$\delta\tilde{E} = 0$ for all possible variations $\delta\varphi_i$. This can only be the case if all the terms in
brackets vanish:

$$F \varphi_i - \sum_{j}^{n} \lambda_{ji} \varphi_j = 0 \tag{5.8}$$

or, in matrix notation:

$$F \boldsymbol{\varphi} = \boldsymbol{\varphi} \boldsymbol{\lambda} \tag{5.9}$$

The equations (5.8 - 5.9) are the legendary Hartree-Fock equations. They
establish a criterion for the orbitals giving the lowest energy for a system described by a
determinant wavefunction. However, they do not constitute the simplest possible form
for such a criterion. To see this, we consider a linear transformation of the orbitals:

$$\varphi'_i = \sum_{j}^{n} W_{ji} \varphi_j \tag{5.10}$$

$$\boldsymbol{\varphi}' = \boldsymbol{\varphi} \mathbf{W} \tag{5.11}$$

To determine what effect the transformation (5.11) will have on the Slater
determinant, we define the n x n matrix $\mathbf{A}$ such that

$$A_{ki} = \varphi_i(k) \tag{5.12}$$

We can now write the Slater determinant as

$$\Psi(1,2,\ldots n) = (n!)^{-1/2} \det \mathbf{A} \tag{5.13}$$

After the transformation (5.11) of the orbitals in (5.12) we get



$$A'_{ki} = \varphi'_i(k) = \sum_{j}^{n} W_{ji} \varphi_j(k) = \sum_{j}^{n} W_{ji} A_{kj}, \tag{5.14}$$

$$\mathbf{A}' = \mathbf{A} \mathbf{W} \tag{5.15}$$

The wavefunction after this transformation would be:

$$\Psi'(1,2,\ldots n) = \det \mathbf{A}' = \det \mathbf{A}\mathbf{W} = \det \mathbf{A}\, \det \mathbf{W} \tag{5.16}$$

where we have again been using some elementary linear algebra. But $\det \mathbf{W}$ is merely
a constant (non-zero as long as the transformation is linearly independent), and thus the
new wavefunction $\Psi'$ is essentially identical to $\Psi$ apart from a trivial renormalization!
This is an important result, with implications far beyond this course:
The Hartree-Fock wavefunction is invariant (except for a renormalization) to linear
transformations among the orbitals!
Furthermore, we note that if we require the orbitals to remain orthonormal under the
transformation, then $\mathbf{W}$ has to be a unitary matrix, $|\det \mathbf{W}| = 1$, and we don't even have
to worry about renormalization.
We assumed in (3.10) that the orthonormality requirement would not be a
significant constraint on the wavefunction. We see now that this assumption was
justified - transforming the orbitals among themselves leaves the total wavefunction
and energy invariant, and it is thus only a matter of convenience.
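
The determinant rule used in (5.16) is easily checked numerically. In the sketch below the matrix `A` (orbital values at three electron positions), the general transformation `W` and the rotation `R` are hypothetical numbers chosen for illustration; only the Python standard library is used.

```python
import math


def matmul(a, b):
    """Plain-Python product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    det = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        det += (-1) ** col * matrix[0][col] * determinant(minor)
    return det


# A[k][i] = phi_i(k): hypothetical orbital values at three electron positions, cf. (5.12).
A = [
    [1.0, 0.3, -0.2],
    [0.4, 1.1, 0.5],
    [-0.6, 0.2, 0.9],
]

# A general, non-unitary but nonsingular orbital transformation, cf. (5.10).
W = [
    [2.0, 1.0, 0.0],
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 1.0],
]

# A rotation mixing the first two orbitals: an orthogonal matrix with det = 1.
theta = 0.7
R = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]

det_A = determinant(A)
det_W = determinant(W)
det_AW = determinant(matmul(A, W))   # changes only by the constant factor det W
det_AR = determinant(matmul(A, R))   # unchanged by the orthogonal transformation
print(det_A, det_W, det_AW, det_AR)
```

The general transformation rescales the determinant by the constant det W, while the orthogonal rotation leaves it unchanged.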
It is now time to recall the Hartree-Fock equations (5.8) and (5.9). The
purpose of these equations was to define the orbitals leading to the lowest (and, thus,
the "best") energy for the Hartree-Fock wavefunction $\Psi$. However, since we just
showed that the wavefunction is invariant to linear transformations among the orbitals,
the same must be true for the energy $E(\Psi) = (\Psi\,H\,\Psi)$. Accordingly, we can allow
ourselves any transformation of the orbitals, if that can simplify our working equations.
For that purpose we let the Lagrangian multipliers $\lambda_{ki}$ define the matrices
$\mathbf{V}$ and $\boldsymbol{\varepsilon}$ such that:
$$\sum_i^n \lambda_{ki}V_{ij} = \varepsilon_j V_{kj} \tag{5.17}$$
or
$$\boldsymbol{\lambda}\mathbf{V} = \mathbf{V}\boldsymbol{\varepsilon} \tag{5.18}$$

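where $\boldsymbol{\varepsilon}$ is diagonal. In conventional linear algebra language, the columns of
$\mathbf{V}$ are eigenvectors of the matrix $\boldsymbol{\lambda}$ and the diagonal elements of
$\boldsymbol{\varepsilon}$ are the corresponding eigenvalues. We now define a new set of orbitals:
$$\varphi'_j = \sum_i^n \varphi_i V_{ij} \tag{5.19}$$
In matrix notation, (5.19) would read:
$$\boldsymbol{\varphi}' = \boldsymbol{\varphi}\mathbf{V} \tag{5.20}$$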

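Inserting this transformation into the Hartree-Fock equations (5.9), we get

$$F\boldsymbol{\varphi}' = F\boldsymbol{\varphi}\mathbf{V} = \boldsymbol{\varphi}\boldsymbol{\lambda}\mathbf{V} = \boldsymbol{\varphi}\mathbf{V}\boldsymbol{\varepsilon} = \boldsymbol{\varphi}'\boldsymbol{\varepsilon} \tag{5.21}$$

In other words, if the primes are dropped:

$$F\varphi_j = \varphi_j\,\varepsilon_j \tag{5.22a}$$

or, in matrix form:

$$F\boldsymbol{\varphi} = \boldsymbol{\varphi}\boldsymbol{\varepsilon} \tag{5.22b}$$

clearly a simplification compared to (5.9). Eq. (5.22b) is illustrated schematically in
the graph below - note that the operator F acts in turn on all the elements of the vector
of eigenfunctions $\boldsymbol{\varphi}$:

[Schematic diagram of $F\boldsymbol{\varphi} = \boldsymbol{\varphi}\boldsymbol{\varepsilon}$] (5.22c)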

Equation (5.22a) is reminiscent of the Schrödinger equation, and can therefore be
interpreted as a set of effective one-electron Schrödinger equations for the orbitals.
They are often referred to as the canonical Hartree-Fock equations. The corresponding
orbitals are the canonical Hartree-Fock orbitals, and the eigenvalues $\varepsilon_j$ are referred to as
orbital energies.
We make a few observations here before proceeding:
1: The particularly simple form (5.22) can be obtained because all orbitals are treated as
being equivalent, and the energy therefore remains invariant to transformations among
the orbitals.
2: We should also note that the orbital energies are given by

$$\varepsilon_i = (\varphi_i\,F\,\varphi_i) \tag{5.23}$$

This relation can be obtained by multiplying (5.22a) from the left with $\varphi_i$ and
integrating. Finally, we note that $E \neq \sum_i^n \varepsilon_i$. Actually,

$$E = \sum_i^n \varepsilon_i - \frac{1}{2}\sum_i^n (\varphi_i\,(J - K)\,\varphi_i) \tag{5.24}$$

In semi-empirical theories it is often assumed that the total electronic energy equals the
sum of one-electron energies, but clearly that is not the case here.
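The bookkeeping behind (5.24) can be illustrated with a few made-up numbers. The sketch below assumes the spin-orbital expressions $\varepsilon_i = h_{ii} + \sum_j (J_{ij} - K_{ij})$ and $E = \sum_i h_{ii} + \frac{1}{2}\sum_{i,j}(J_{ij} - K_{ij})$, which follow from $F = h + J - K$ and the energy expression (4.14); the arrays `h`, `J` and `K` are random placeholders, not real integrals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3                                    # number of occupied spin-orbitals (illustrative)
h = rng.normal(size=n)                   # made-up one-electron terms (phi_i h phi_i)
J = rng.random((n, n))
J = J + J.T                              # made-up symmetric Coulomb integrals J_ij
K = rng.random((n, n))
K = K + K.T                              # made-up symmetric exchange integrals K_ij
np.fill_diagonal(K, np.diag(J))          # K_ii = J_ii, so the self-interaction cancels

eps = h + (J - K).sum(axis=1)            # orbital energies, (5.23)
E_total = h.sum() + 0.5 * (J - K).sum()  # Hartree-Fock energy
E_from_eps = eps.sum() - 0.5 * (J - K).sum()   # right-hand side of (5.24)

print(E_total, E_from_eps, eps.sum())
```

The sum of orbital energies counts every electron-electron interaction twice, once for each partner; the second term in (5.24) removes the double counting.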

6. Spin- and Space Orbitals. Unrestricted Hartree-Fock Theory.

So far, we have considered the molecular orbitals to depend on the coordinates
of the electron. We have not explicitly stated what these coordinates are, but the reader
may have assumed that they are the three Cartesian coordinates of the electron's
position in space $\{x_i, y_i, z_i\}$. At the same time, however, one could have suspected
that there may be more variables determining the characteristics of an electronic
wavefunction than just the spatial coordinates. For instance, we introduced the
antisymmetry of the wavefunction in an ad hoc manner, related to the spin properties of
the electron. A wealth of experiments suggest that elementary particles are
characterized not only by their spatial coordinates but also by their spin s (a vector
quantity).
The electron is known to have a spin with a magnitude $s = \frac{1}{2}$, the z-component
of which is quantized to take one of two possible values, $m_s = \pm\frac{1}{2}$. The spin of a
single electron is thus characterized by the behavior of the wavefunction when acted
upon by the two operators $s^2$ and $s_z$. The eigenvalues of these operators are
$s(s+1) = \frac{3}{4}$ and $m_s = \pm\frac{1}{2}$; the eigenfunctions are labeled $\alpha$ and $\beta$.⁹


In order for electron spin to have any meaning in MO theory, we must let the
orbitals depend on it. Since there are only two possible cases it is sufficient to classify
the spin-orbitals $\varphi$ as $\varphi^\alpha$ or $\varphi^\beta$, depending on the spin:

$$\varphi_k^\alpha = \psi_k^\alpha(\mathbf{r})\,\alpha \tag{6.1a}$$
$$\varphi_l^\beta = \psi_l^\beta(\mathbf{r})\,\beta \tag{6.1b}$$

where $\psi$ are space-orbitals, i.e. functions that depend only on the spatial coordinates
$\mathbf{r} = \{x, y, z\}$, and where we have recognized the fact that the spatial part of a
spin-orbital may depend on whether the spin part is $\alpha$ or $\beta$.
As long as the Hamiltonian does not explicitly refer to spin it will commute with
these two operators, and the spin quantum numbers are therefore "constants of
motion", i.e., any of its eigenfunctions should also be an eigenfunction of the spin
operators. In the same way, a many-electron wavefunction which is an eigenfunction
of a spin-free Hamiltonian is also an eigenfunction for the total spin operators $S^2$ and
$S_z$. In other words, the exact wavefunction for a many-electron system must satisfy
the relations

$$S^2\,\Psi = S(S+1)\,\Psi \tag{6.2}$$
$$S_z\,\Psi = M_S\,\Psi \tag{6.3}$$

9 Readers already familiar with this subject from other presentations may notice that the factor ħ is
missing: that is because we consistently use the atomic units system in which ħ = 1.

To apply the n-electron spin operators (6.2) and (6.3) to our determinants, they
must first be expressed in terms of the one-electron operators. The total spin
vector is just the vector sum of the contributions from each electron,
$\mathbf{S} = \sum_i \mathbf{s}_i$, and thus
$$S_z = \sum_i^n s_{zi} \tag{6.4}$$

It can easily be shown that

$$S^2 = S_+S_- - S_z + S_z^2 \tag{6.5}$$

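Applying the $S_z$ operator to a Hartree-Fock wavefunction, we obtain

$$S_z\Psi = (n!)^{-1/2}\sum_P^{n!}(-1)^P\,S_z\,P\{\varphi_1(1)\varphi_2(2)\varphi_3(3)\cdots\varphi_n(n)\}$$
$$= (n!)^{-1/2}\sum_i^n\sum_P^{n!}(-1)^P\{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots[s_{zi}\,\varphi_{a_i}(i)]\cdots\varphi_{a_n}(n)\} \tag{6.6}$$

But a spin-orbital $\varphi_{a_i}(i)$ is always an eigenfunction of $s_{zi}$ with eigenvalue
$\sigma_{a_i} = \pm\frac{1}{2}$, and thus

$$S_z\Psi = (n!)^{-1/2}\sum_i^n\sum_P^{n!}(-1)^P\{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots[\sigma_{a_i}\varphi_{a_i}(i)]\cdots\varphi_{a_n}(n)\}$$
$$= (n!)^{-1/2}\sum_P^{n!}(-1)^P\Big[\sum_i^n\sigma_{a_i}\Big]\{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\}$$
$$= (n!)^{-1/2}\Big[\sum_i^n\sigma_i\Big]\sum_P^{n!}(-1)^P\{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\} \tag{6.7}$$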

A Slater determinant made up from $n_\alpha$ orbitals of $\alpha$-spin and $n_\beta$ orbitals of
$\beta$-spin will always be an eigenfunction of the $S_z$ operator (since all the contributing
Hartree products are) with eigenvalue $M_S = \sum_i \sigma_i = \frac{1}{2}(n_\alpha - n_\beta)$, where
$n_\alpha$ and $n_\beta$ count the $\alpha$- and $\beta$-orbitals. The requirement (6.3) is therefore
fulfilled whenever we use a determinant as our approximate n-electron wavefunction. However, the
determinant is not necessarily an eigenfunction of $S^2$, and (6.2) may not be fulfilled.

The Hartree-Fock equations that were derived in Section 5 can be used with
spin orbitals, as long as we remember to integrate over the spin coordinates.
Fortunately, the rules for spin integration are very simple:

$$\langle\alpha|\alpha\rangle = \langle\beta|\beta\rangle = 1; \tag{6.8a}$$
$$\langle\alpha|\beta\rangle = 0. \tag{6.8b}$$

We may now introduce the substitutions (6.1) into the expressions for the
energy and Fock operator. With the expressions for one- and two-electron integrals
introduced in Appendix A, the energy in Eq (4.14) becomes:

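$$E(\Psi) = \sum_i^n (\psi_i\,h\,\psi_i) + \frac{1}{2}\sum_{i(\alpha),j(\alpha)}^n \{(\psi_i\psi_j|g|\psi_i\psi_j) - (\psi_i\psi_j|g|\psi_j\psi_i)\}$$
$$+ \frac{1}{2}\sum_{i(\beta),j(\beta)}^n \{(\psi_i\psi_j|g|\psi_i\psi_j) - (\psi_i\psi_j|g|\psi_j\psi_i)\} + \sum_{i(\alpha),j(\beta)}^n (\psi_i\psi_j|g|\psi_i\psi_j)$$
$$= \sum_i^n \langle i|h|i\rangle + \frac{1}{2}\sum_{i(\alpha),j(\alpha)}^n \langle ij\|ij\rangle + \frac{1}{2}\sum_{i(\beta),j(\beta)}^n \langle ij\|ij\rangle + \sum_{i(\alpha),j(\beta)}^n \langle ij|ij\rangle \tag{6.9}$$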

The Coulomb interaction always survives the spin integration, whereas exchange only
occurs between electrons having the same spin (thus the 'single-bar' expression for the
last integral in (6.9)). For the Fock operator (5.3), we must make a distinction
depending on whether it operates on an $\alpha$- or a $\beta$-orbital:
$$F^{(\alpha)} = h + J - K^{(\alpha)} \tag{6.10a}$$
$$F^{(\beta)} = h + J - K^{(\beta)} \tag{6.10b}$$

where the one-electron and Coulomb operators are the same as before, while the new
exchange operators $K^{(\alpha)}$ and $K^{(\beta)}$ are defined as:
$$K^{(\alpha)} = \sum_{i(\alpha)}^n K_i \tag{6.11a}$$
$$K^{(\beta)} = \sum_{i(\beta)}^n K_i \tag{6.11b}$$

since, as we discussed above, the exchange interaction only occurs between electrons
having the same spin. This straightforward implementation of the above equations is
known as Unrestricted Hartree-Fock theory,¹⁰ since there is no attempt to impose the
constraint (6.2) on our wavefunction.

10 J.A. Pople and R.K. Nesbet, J. Chem. Phys. 22, 571 (1954); G. Berthier, J. Chim. Phys. 51,
363 (1954).

7. Closed-Shell Hartree-Fock Theory.

While not an absolute prerequisite, it makes good sense to require the spin
properties (6.2) and (6.3) to be fulfilled also with approximate solutions to the
Schrödinger equation, in the same spirit as we constrained the wavefunction to be an
eigenfunction of the permutation operators $P_{ij}$ with eigenvalues of (-1) in Eq. (3.2). As
mentioned above, the Slater determinant does not in general satisfy (6.2). In certain
cases, however, it is quite straightforward to ensure the correct spin behavior of the
wavefunction. One important such situation occurs when the electronic state under
consideration is a spin singlet ($S = M_S = 0$, spin multiplicity = 1), which necessarily
requires an even number of electrons. By requiring the spin-orbitals to occur in pairs
having the same spatial function, thus differing only in the spin part - "perfect
spin-pairing" - important simplifications are possible:
$$\varphi_k^\alpha = \psi_k(\mathbf{r})\,\alpha(\sigma), \tag{7.1a}$$
$$\varphi_k^\beta = \psi_k(\mathbf{r})\,\beta(\sigma), \tag{7.1b}$$

Since such a determinant is an eigenfunction of the operators $S_z$, $S_-$, and $S_+$, with
eigenvalues equal to zero in all cases, it follows from (6.5) that it must be an eigenfunction for
$S^2$ as well. The ansatz (7.1) therefore ensures correct spin properties for a singlet
wavefunction. The energy expression with this wavefunction can be written as:

$$E = \sum_i^n \langle i|h|i\rangle + \frac{1}{2}\sum_{i,j}^n \langle ij\|ij\rangle \tag{7.2a}$$
$$= 2\sum_i^{n/2} (i\,h\,i) + \sum_{i,j}^{n/2} \{2(ii|jj) - (ij|ji)\} \tag{7.2b}$$

where the summations in (7.2b) are over the n/2 doubly occupied orbitals. Thus, the
interaction between two doubly occupied spatial orbitals "i" and "j" is given by
4(ii|jj) - 2(ij|ji), since Coulomb interaction occurs between all electrons but
exchange only between those having the same spin. Within one doubly occupied
orbital, the energy contribution is 2(i h i) + (ii|ii), since the two electrons have
different spin and thus give rise to only a Coulomb term.
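The counting argument above can be sketched in a few lines. The functions below are illustrative only (their names are not from the text) and take made-up integrals over doubly occupied spatial orbitals: `h[i]` for (i h i), `J[i, j]` for (ii|jj) and `K[i, j]` for (ij|ji). The first evaluates (7.2b) directly; the second sums the same energy spin block by spin block in the manner of (6.9).

```python
import numpy as np

def closed_shell_energy(h, J, K):
    """Energy (7.2b) for doubly occupied spatial orbitals."""
    h, J, K = map(np.asarray, (h, J, K))
    return 2.0 * h.sum() + (2.0 * J - K).sum()

def spin_block_energy(h, J, K):
    """The same energy assembled from same-spin and opposite-spin pairs, as in (6.9)."""
    h, J, K = map(np.asarray, (h, J, K))
    one_electron = 2.0 * h.sum()           # each spatial orbital occurs with alpha and beta spin
    same_spin = 2 * 0.5 * (J - K).sum()    # alpha-alpha plus beta-beta: Coulomb minus exchange
    opposite_spin = J.sum()                # alpha-beta: Coulomb only
    return one_electron + same_spin + opposite_spin

# Made-up integrals for two doubly occupied orbitals (illustrative values only)
h = [-1.0, -0.5]
J = [[0.60, 0.40], [0.40, 0.50]]
K = [[0.60, 0.10], [0.10, 0.50]]

E = closed_shell_energy(h, J, K)
E_check = spin_block_energy(h, J, K)
E_single = closed_shell_energy([-1.0], [[0.5]], [[0.5]])   # one orbital: 2(i h i) + (ii|ii)
print(E, E_check, E_single)
```

Note that the diagonal of `K` must equal the diagonal of `J`, since (ii|ii) plays both roles; this is what reduces the intra-orbital term to a single Coulomb integral.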
Closed-shell Hartree-Fock is by far the most commonly used variety of the
Hartree-Fock method. This is due to the fact that a vast majority of all molecules have
an even number of electrons and a singlet ground state. If an unrestricted Hartree-Fock
scheme were to be applied to such a system, the solutions would usually still satisfy
(6.2) and (7.1), but applying these constraints from the beginning makes the
computations much more efficient. Since the $\alpha$- and $\beta$-orbitals have identical spatial
parts, the two Fock operators (6.10) must also be the same in our closed-shell theory:

$$F^{(cs)} = h + J^{(cs)} - K^{(cs)} \tag{7.3}$$

where
$$J^{(cs)} = 2\sum_i^{n/2} J_i \tag{7.4a}$$
$$K^{(cs)} = \sum_i^{n/2} K_i \tag{7.4b}$$

and by only constructing one Fock operator the work and the memory requirement are
reduced by about 50%.

8. Restricted Open-Shell Hartree-Fock Theory.

Spin constraints such as those applied in (6.2) can often be applied even if the
spin orbitals are not all perfectly paired with identical spatial parts as in (7.1). For a
simple example, consider a case where $n_c$ spatial orbitals are doubly occupied with
perfect spin pairing - the "closed shells", and $n_o$ are singly occupied, all with $\alpha$-spin -
"open shells". Such a determinant would be an eigenfunction of $S_z$ with eigenvalue
$M_S = n_o/2$, and of $S_+$ with eigenvalue zero.¹¹ Operating with $S_-$ will be possible only for the
open shell orbitals; subsequent operation with $S_+$ as in (6.5) will bring the function
back to the original determinant, and the $S_+S_-$ product in (6.5) will thus simply count
the number of open-shell orbitals with $\alpha$-spin. We therefore conclude that

$$S^2\,\Psi = \left(\frac{n_o}{2} + \frac{n_o^2}{4}\right)\Psi = M_S(M_S+1)\,\Psi \tag{8.1}$$

for this wavefunction.
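As a quick arithmetic check of (8.1), the sketch below tabulates $M_S$ and the $S^2$ eigenvalue for a few numbers of open shells, using exact fractions. The function name is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def high_spin_quantum_numbers(n_open):
    """Return (M_S, S^2 eigenvalue) for n_open singly occupied alpha orbitals."""
    m_s = Fraction(n_open, 2)                                    # eigenvalue of S_z
    s_squared = Fraction(n_open, 2) + Fraction(n_open ** 2, 4)   # left-hand side of (8.1)
    return m_s, s_squared

for n_open in range(5):
    m_s, s_squared = high_spin_quantum_numbers(n_open)
    multiplicity = 2 * m_s + 1
    print(n_open, m_s, s_squared, multiplicity)
```

The closed shells do not enter at all: only the number of unpaired $\alpha$ electrons determines the spin state of this kind of determinant.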


It is convenient to have a consistent notation for the different types of orbitals
that will be encountered, and we have chosen here to use the indices i,j,k,... for the
doubly occupied orbitals, s,t,u,... for the singly occupied, and a,b,c,... whenever we
wish to refer to orbitals regardless of their occupation. The energy expression (6.9)
can now be simplified to

$$E = 2\sum_i (i\,h\,i) + \sum_s (s\,h\,s) + \sum_{i,j}\{2(ii|jj) - (ij|ji)\} + \sum_{i,s}\{2(ii|ss) - (is|si)\} + \frac{1}{2}\sum_{s,t}\{(ss|tt) - (st|ts)\} \tag{8.2}$$

11 The $\alpha$-orbitals already have maximum $m_s$ value and raising it is thus impossible; raising it for any
$\beta$ orbital will create a determinant with two identical orbitals.

Furthermore, if we use the occupation number $m_a$ (= 1 or 2) for the number of
electrons occupying any spatial orbital we obtain after some manipulation:

$$E = \sum_a m_a\,(a\,h\,a) + \frac{1}{2}\sum_{a,b} m_a m_b\{(aa|bb) - \frac{1}{2}(ab|ba)\} - \frac{1}{4}\sum_{s,t}(st|ts) \tag{8.3}$$

For the Fock operators, we get for the closed and open shells:

$$F^{(cs)} = h + J^{(cs)} - K^{(cs)} \tag{8.4a}$$
$$F^{(os)} = h + J^{(cs)} - K^{(os)} \tag{8.4b}$$
with
$$J^{(cs)} = \sum_a m_a J_a \tag{8.5a}$$
$$K^{(cs)} = \frac{1}{2}\sum_a m_a K_a \tag{8.5b}$$
$$K^{(os)} = \sum_a K_a = \frac{1}{2}\sum_a m_a K_a + \frac{1}{2}\sum_s K_s \tag{8.5c}$$

Apart from the last terms in (8.3) and (8.5c), the expressions are all written in a
general form for all orbitals, whether singly or doubly occupied, using the occupation
numbers $m_a$ and the interaction terms (2J - K). Compared with the closed-shell
formalism, the only contributions that bring in new complexity are those describing the
exchange interaction among the open shells. We can therefore conclude that the
formalism would remain basically unchanged even if we were to assign other spins to
some of the open shells.
However, in order to satisfy (6.2), a wavefunction may have to be formed as a
sum of several determinants. These determinants typically have the same spatial
orbitals occupied, but differ in the spin of the singly occupied orbitals. We may accept,
without a thorough derivation of all possible cases, that the energy for any such
wavefunction can be written as in (8.2) or (8.3) apart from the open-open shell
interaction for which we introduce a more general form:

$$E = 2\sum_i (i\,h\,i) + \sum_s (s\,h\,s) + \sum_{i,j}\{2(ii|jj) - (ij|ji)\} + \sum_{i,s}\{2(ii|ss) - (is|si)\} + \frac{1}{2}\sum_{s,t}\{a_{st}(ss|tt) - \frac{1}{2}b_{st}(st|ts)\} \tag{8.6a}$$

or
$$E = \sum_a m_a\,(a\,h\,a) + \frac{1}{2}\sum_{a,b} m_a m_b\{(aa|bb) - \frac{1}{2}(ab|ba)\} + \frac{1}{2}\sum_{s,t}\{\alpha_{st}(ss|tt) - \frac{1}{2}\beta_{st}(st|ts)\} \tag{8.6b}$$

where $a_{st}$ and $b_{st}$, or $\alpha_{st}$ and $\beta_{st}$, are referred to as the open-shell vector coupling
coefficients. The ($a_{st}$, $b_{st}$) and ($\alpha_{st}$, $\beta_{st}$) sets serve the same purposes, but different
notations are preferred by different schools. The reader should be warned that there are
many different official and unofficial definitions of vector coupling coefficients, and
should take ample precautions before attempting to use a computer program that requires
these coefficients as input.
The coupling coefficients enable us to specify one out of several possible states
for an open-shell orbital configuration. As a simple example, let us consider the first
excited configuration of the Helium atom, He $(1s)^1(2s)^1$. The reader is presumably
aware of the simple form of these wavefunctions from elementary quantum theory.
Thus we have for the triplet state:
$$^3\Psi = \left(|1s\alpha\,2s\beta\rangle + |1s\beta\,2s\alpha\rangle\right)2^{-1/2} = (1s2s - 2s1s)(\alpha\beta + \beta\alpha)/2 \tag{8.7}$$
with the energy:

$$^3E = h_{1s,1s} + h_{2s,2s} + (1s1s|2s2s) - (1s2s|2s1s) \tag{8.8}$$

which equals (8.6a) if

$$a_{1s2s} = 1,\quad b_{1s2s} = 2,\quad a_{1s1s} = a_{2s2s} = b_{1s1s} = b_{2s2s} = 0 \tag{8.9}$$

With the convention used in (8.6b) we need to have

$$\alpha_{1s2s} = 0,\quad \beta_{1s2s} = 1,\quad \alpha_{1s1s} = \alpha_{2s2s} = \beta_{1s1s} = \beta_{2s2s} = -1 \tag{8.10}$$

For the singlet state, the wavefunction is:

$$^1\Psi = \left(|1s\alpha\,2s\beta\rangle - |1s\beta\,2s\alpha\rangle\right)2^{-1/2} = (1s2s + 2s1s)(\alpha\beta - \beta\alpha)/2 \tag{8.11}$$

with the energy:

$$^1E = h_{1s,1s} + h_{2s,2s} + (1s1s|2s2s) + (1s2s|2s1s) \tag{8.12}$$

For the singlet, the vector coupling coefficients have to be given as

$$a_{1s2s} = 1,\quad b_{1s2s} = -2,\quad a_{1s1s} = a_{2s2s} = b_{1s1s} = b_{2s2s} = 0 \tag{8.13}$$
$$\alpha_{1s2s} = 0,\quad \beta_{1s2s} = -3,\quad \alpha_{1s1s} = \alpha_{2s2s} = \beta_{1s1s} = \beta_{2s2s} = -1 \tag{8.14}$$

Spin-restricted, open shell Hartree-Fock theory was developed by Roothaan
about ten years after the closed-shell method.¹² The above discussion only contains a
brief introduction to this complex field. For additional reading we recommend the
exposition by Veillard,¹³ which still contains a lot of useful material even though it is
now several decades old.
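Because conventions for vector coupling coefficients differ, it is worth checking a coefficient set against a case with a known answer. The sketch below does this for the He (1s)(2s) example, assuming the open-shell terms have the forms $\frac{1}{2}\sum_{s,t}\{a_{st}(ss|tt) - \frac{1}{2}b_{st}(st|ts)\}$ for the (a, b) convention and $\frac{1}{2}\sum_{s,t}\{\alpha_{st}(ss|tt) - \frac{1}{2}\beta_{st}(st|ts)\}$, added to the occupation-number expression, for the ($\alpha$, $\beta$) convention. The integrals are made-up numbers and the function names are hypothetical.

```python
import numpy as np

def energy_ab(h, J, K, a, b):
    """Open shells only, (a, b) convention: sum_s h_s + 1/2 sum_st {a J - 1/2 b K}."""
    return h.sum() + 0.5 * np.sum(a * J - 0.5 * b * K)

def energy_alpha_beta(h, J, K, alpha, beta):
    """Open shells only, (alpha, beta) convention, all occupation numbers equal to 1."""
    general = h.sum() + 0.5 * np.sum(J - 0.5 * K)
    correction = 0.5 * np.sum(alpha * J - 0.5 * beta * K)
    return general + correction

# Made-up integrals for the 1s and 2s orbitals (illustrative numbers only)
h = np.array([-2.0, -0.5])
J = np.array([[1.25, 0.42], [0.42, 0.30]])   # J[s, t] = (ss|tt)
K = np.array([[1.25, 0.04], [0.04, 0.30]])   # K[s, t] = (st|ts), with K[s, s] = J[s, s]

off = np.array([[0.0, 1.0], [1.0, 0.0]])     # selects the 1s-2s pair
eye = np.eye(2)

triplet_ab = energy_ab(h, J, K, a=off, b=2 * off)                         # (8.9)
triplet_gr = energy_alpha_beta(h, J, K, alpha=-eye, beta=off - eye)       # (8.10)
singlet_ab = energy_ab(h, J, K, a=off, b=-2 * off)                        # (8.13)
singlet_gr = energy_alpha_beta(h, J, K, alpha=-eye, beta=-3 * off - eye)  # (8.14)

print(triplet_ab, triplet_gr, singlet_ab, singlet_gr)
```

Both conventions should reproduce $h_{1s,1s} + h_{2s,2s} + J_{1s2s} \mp K_{1s2s}$ for the triplet and singlet; a mismatch in a test of this kind is a sign that a program expects a different definition of the coefficients.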

12 C.C.J. Roothaan, Rev. Mod. Phys. 32, 179 (1960).
13 A. Veillard, in: "Computational Techniques in Quantum Chemistry and Molecular Physics",
NATO ASI Series C, Vol. 15 (Eds. G.H.F. Diercksen, B.T. Sutcliffe and A. Veillard), Reidel,
Dordrecht (1975), pp. 201.

9. The LCAO Expansion.

Even though the Hartree-Fock approximation represents an immense
simplification compared to the original Schrödinger equation, the resulting equations
are still too complicated to be solved exactly for most systems of chemical interest.
Brute-force numerical methods are not likely to change that situation in the near future.
Instead, methods must be chosen that take advantage of our chemical knowledge about
the system under consideration, without biasing the results to meet preconceived
expectations. These requirements are fulfilled with the method of LCAO expansion,
i.e. the technique of expanding a (spatial) molecular orbital $\psi_i(\mathbf{r})$ as a Linear
Combination of (approximate) Atomic Orbitals $\{\chi_p(\mathbf{r})\}$,

$$\psi_i(\mathbf{r}) = \sum_p^N C_{pi}\,\chi_p(\mathbf{r}) \tag{9.1}$$
$$\boldsymbol{\psi} = \boldsymbol{\chi}\mathbf{C} \tag{9.2}$$

an approach which has become an invaluable tool in electronic structure theory.¹⁴


A basis set expansion approach as in (9.1) is by no means unique to
computational chemistry. Expanding unknown functions in a basis set of known
functions is a powerful and commonly used approach in many areas of applied
mathematics, thereby transforming a problem involving unpleasant calculus - in this
case one of coupled integro-differential equations - to the language of standard linear
algebra. The LCAO approach is also very appealing from the point of view of
common-sense chemistry: It is high-school knowledge that molecules are made from
atoms, and intuitively it makes a lot of sense to construct molecular orbitals in electronic
structure theory from their atomic counterparts. However, this simple and appealing
physical picture doesn't come without a price. It would seem reasonable that the
computational task of describing a classical system of particles having only one- and
two-body interaction terms should not grow more rapidly than the square of the size of
the system. Yet, when orbitals are expanded with the LCAO method, the single particle
probability distribution for an electron in a single orbital $\psi_i$ would be given by

$$|\psi_i(\mathbf{r})|^2 = \sum_{p,q}^N C_{pi}C_{qi}\,\chi_p(\mathbf{r})\chi_q(\mathbf{r}) \tag{9.3}$$

which is already a function with $N^2$ complexity. The interaction between two electrons
in different orbitals will therefore grow as $N^4$, a quite disadvantageous increase in
computational requirement (recall that N is roughly proportional to the size of the
system studied).

14 We continue to use lower-case "n" to denote the number of electrons in our system, upper-case "N"
for the number of basis functions, and a "shadowed" font for matrices or vectors in either n or N
dimensions.
Specifically, even the simplest non-empirical LCAO methods require the
evaluation and processing of two-electron integrals over the electron-repulsion
operator, involving four basis functions:

$$(pq|rs) = \int\!\!\int \chi_p(\mathbf{r}_1)\chi_q(\mathbf{r}_1)\,\frac{1}{r_{12}}\,\chi_r(\mathbf{r}_2)\chi_s(\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2 \tag{9.4}$$

Since the indices p,q,r,s can be permuted in several different ways without
essentially changing the value of the integral, the number of non-redundant such
integrals is typically of the order $N^4/8$. This steep increase in the computational
requirement is a major concern in applications of the theory to realistic problems, since
basis sets of 100-1000 functions are often needed for a satisfactory description of
medium-sized molecules. In most schemes for performing ab initio SCF calculations,
the limitations to accuracy as well as to the size of the system that can be studied are set
by the two-electron integrals, their evaluation and their storage.
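The $N^4/8$ estimate can be made concrete with a short count. The sketch below assumes real basis functions, so that each integral has the usual eight-fold permutational symmetry, and 8 bytes of storage per integral; the function name is introduced here for illustration only.

```python
def n_unique_integrals(n_basis: int) -> int:
    """Number of symmetry-unique two-electron integrals for real basis functions."""
    n_pairs = n_basis * (n_basis + 1) // 2    # unique index pairs pq with p >= q
    return n_pairs * (n_pairs + 1) // 2       # unique pairs of pairs

for n_basis in (10, 100, 1000):
    exact = n_unique_integrals(n_basis)
    estimate = n_basis ** 4 / 8
    gigabytes = exact * 8 / 1e9               # 8 bytes per double-precision value
    print(f"N={n_basis:5d}  unique={exact:15d}  N^4/8={estimate:.3e}  storage={gigabytes:.3g} GB")
```

The exact count approaches $N^4/8$ from above as N grows, and the storage column shows why the integrals dominate the resource requirements for basis sets in the 100-1000 range.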

10. The Roothaan-Hall Equations.

With an LCAO expansion such as the one in (9.1), the general energy expression in
spin orbital formalism (4.14) takes the form

$$E(\Psi) = \sum_i^n\sum_{p,q}^N C_{pi}C_{qi}\,h_{pq} + \frac{1}{2}\sum_{i,j}^n\sum_{pq,rs}^N C_{pi}C_{qj}C_{ri}C_{sj}\{\langle pq|rs\rangle - \langle pr|qs\rangle\} \tag{10.1a}$$
$$= \sum_{p,q}^N D_{pq}h_{pq} + \frac{1}{2}\sum_{pq,rs}^N D_{pr}D_{qs}\{\langle pq|rs\rangle - \langle pr|qs\rangle\} \tag{10.1b}$$

where we have introduced the density matrix:

$$D_{pq} = \sum_i^n C_{pi}C_{qi} \tag{10.2}$$
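A small numerical sketch may help to fix the meaning of the density matrix (10.2). It builds D from a set of occupied coefficients that are orthonormal with respect to a made-up overlap matrix S, and checks two standard consequences of that orthonormality which are not stated in the text: the trace of DS counts the occupied orbitals, and DSD = D. The Cholesky-based construction of the coefficients is an assumption of this sketch, chosen only because it is short.

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 5, 2                                   # basis functions, occupied orbitals (illustrative)

# A made-up symmetric positive definite overlap matrix
B = rng.normal(size=(N, N))
S = B @ B.T + N * np.eye(N)

# Coefficients with C^T S C = 1: take orthonormal columns Q and apply L^(-T), where S = L L^T
L = np.linalg.cholesky(S)
Q, _ = np.linalg.qr(rng.normal(size=(N, N)))
C = np.linalg.solve(L.T, Q)[:, :n]            # occupied coefficients, shape (N, n)

D = C @ C.T                                   # (10.2): D_pq = sum_i C_pi C_qi

n_from_trace = np.trace(D @ S)                # should equal the number of occupied orbitals
idempotency_error = np.max(np.abs(D @ S @ D - D))
print(n_from_trace, idempotency_error)
```

Because D depends only on the space spanned by the occupied orbitals, it is unchanged by the unitary mixing of occupied orbitals discussed in Section 5.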

Using the LCAO expansion technique, we can insert (9.2) into the canonical
Hartree-Fock equations (5.22b) to obtain:

$$F\boldsymbol{\chi}\mathbf{C} = \boldsymbol{\chi}\mathbf{C}\boldsymbol{\varepsilon} \tag{10.3}$$

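If we multiply (10.3) from the left by $\boldsymbol{\chi}^\dagger$ and integrate, we obtain:

$$(\boldsymbol{\chi}^\dagger F\boldsymbol{\chi})\,\mathbf{C} = (\boldsymbol{\chi}^\dagger\boldsymbol{\chi})\,\mathbf{C}\boldsymbol{\varepsilon} \tag{10.4}$$

Introducing the Fock and overlap matrices $\mathbf{F}$ and $\mathbf{S}$, such that

$$F_{pq} = (\chi_p\,F\,\chi_q) \tag{10.5}$$

$$S_{pq} = (\chi_p\,\chi_q) \tag{10.6}$$

the Hartree-Fock equations in matrix form become:

$$\mathbf{F}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\varepsilon} \tag{10.7}$$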

This set of secular equations is often referred to as the Roothaan-Hall equations,
after the scientists who first derived workable equations that could be translated into
computational recipes for electronic structure calculations on polyatomic systems.¹⁵
To solve the Roothaan-Hall equations, we need to evaluate the $\mathbf{F}$ and $\mathbf{S}$
matrices, as well as find a way to solve (10.7). The overlap matrix $\mathbf{S}$ is trivial,
since its elements can be calculated directly as in (10.6). For the Fock matrix, (10.5)
suggests an approach, but we need to express the Fock operator in terms of known
quantities as follows:

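$$F_{pq} = (\chi_p F\chi_q) = (\chi_p(h + J - K)\chi_q) = (\chi_p h\chi_q) + \sum_i^n(\chi_p(J_i - K_i)\chi_q)$$
$$= h_{pq} + \sum_i^n(\chi_p(J_i - K_i)\chi_q) = h_{pq} + \sum_i^n\sum_{r,s}^N C_{ri}C_{si}\{\langle pr|qs\rangle - \langle pq|rs\rangle\}$$
$$= h_{pq} + \sum_{r,s}^N D_{rs}\langle pr\|qs\rangle \tag{10.8}$$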

In (10.8) we have used the notations for integrals defined in Appendix A. Notice that
in the Hartree-Fock equations (10.7) the matrices $\mathbf{F}$ and $\mathbf{S}$ are given,
$\boldsymbol{\varepsilon}$ is required to be diagonal, while $\mathbf{C}$ is unknown. However,
$\mathbf{C}$ cannot be varied completely without restrictions: We required the orbitals to be
orthonormal according to (5.4), which can be written in matrix form as in (10.9):

$$\mathbf{C}^\dagger\mathbf{S}\mathbf{C} = \mathbf{1} \tag{10.9}$$

15 C.C.J. Roothaan, Rev. Mod. Phys. 23, 69 (1951); G.G. Hall, Proc. Roy. Soc. A 208, 328
(1951).

The Roothaan-Hall equation (10.7) is a generalized matrix-eigenvalue equation.
In order to solve it, it is convenient to bring it to a conventional matrix-eigenvalue
form, i.e., without the matrix $\mathbf{S}$. This can be achieved if we express the orbitals in an
orthonormal basis, and it is thus important to find a set of orthonormal functions within
the space of our original basis set that can be used for such an expansion. Assume that
we could find a set of functions $\{\varphi_i\}$, expanded in our original basis set $\{\chi_p\}$ such that

$$\varphi_i = \sum_p^N \chi_p A_{pi} \tag{10.10}$$
or
$$\boldsymbol{\varphi} = \boldsymbol{\chi}\mathbf{A} \tag{10.11}$$

with the properties $(\varphi_i\,\varphi_j) = \delta_{ij}$, or

$$\mathbf{A}^\dagger\mathbf{S}\mathbf{A} = \mathbf{1} \tag{10.12}$$

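As long as the transformation $\mathbf{A}$ is non-singular, the original basis set $\{\chi_p\}$ can be
expressed in the new set $\{\varphi_i\}$, which can thus also be used as an expansion basis for
the orbitals in our Hartree-Fock problem. This leads to

$$\boldsymbol{\psi} = \boldsymbol{\chi}\mathbf{C} = \boldsymbol{\varphi}\mathbf{A}^{-1}\mathbf{C} = \boldsymbol{\varphi}\mathbf{C}' \tag{10.13}$$

where

$$\mathbf{C} = \mathbf{A}\mathbf{C}' \tag{10.14}$$

Inserting (10.14) into (10.7), we obtain

$$\mathbf{F}(\mathbf{A}\mathbf{C}') = \mathbf{S}(\mathbf{A}\mathbf{C}')\boldsymbol{\varepsilon} \tag{10.15}$$

and multiplying from the left with $\mathbf{A}^\dagger$ this leads to

$$\mathbf{A}^\dagger\mathbf{F}\mathbf{A}\mathbf{C}' = \mathbf{A}^\dagger\mathbf{S}\mathbf{A}\mathbf{C}'\boldsymbol{\varepsilon} \tag{10.16}$$

or, using (10.12) and the definition

$$\mathbf{F}' = \mathbf{A}^\dagger\mathbf{F}\mathbf{A}, \tag{10.17}$$

we obtain the secular equation

$$\mathbf{F}'\mathbf{C}' = \mathbf{C}'\boldsymbol{\varepsilon} \tag{10.18a}$$

or (dropping the 'primes')

$$\mathbf{F}\mathbf{C} = \mathbf{C}\boldsymbol{\varepsilon} \tag{10.18b}$$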

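(10.18) is a "normal" matrix eigenvalue problem where the columns of $\mathbf{C}$ are the
eigenvectors and the diagonal elements of $\boldsymbol{\varepsilon}$ the corresponding eigenvalues,
as illustrated in the diagram (10.18c):

[Diagram of $\mathbf{F}\mathbf{C} = \mathbf{C}\boldsymbol{\varepsilon}$; label: the columns of C are eigenvectors] (10.18c)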
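The sequence (10.14)-(10.18) can be followed step by step in a short numerical sketch. The matrices `F` and `S` below are random placeholders rather than real Fock and overlap matrices, and the transformation `A` is taken from a Cholesky factorization of S, which is simply one convenient choice satisfying (10.12) and not a method discussed in the text.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
B = rng.normal(size=(N, N))
S = B @ B.T + N * np.eye(N)            # made-up overlap matrix (symmetric, positive definite)
M = rng.normal(size=(N, N))
F = 0.5 * (M + M.T)                    # made-up symmetric "Fock" matrix

# Any A with A^T S A = 1 will do; here A = L^(-T) from the Cholesky factor S = L L^T
L = np.linalg.cholesky(S)
A = np.linalg.inv(L).T

F_prime = A.T @ F @ A                  # (10.17)
eps, C_prime = np.linalg.eigh(F_prime) # (10.18a): F' C' = C' eps
C = A @ C_prime                        # (10.14): back to the original basis

residual = np.max(np.abs(F @ C - S @ C @ np.diag(eps)))    # (10.7)
orthonormality = np.max(np.abs(C.T @ S @ C - np.eye(N)))   # (10.9)
print(residual, orthonormality)
```

The back-transformed coefficients satisfy the generalized equation (10.7) and the orthonormality condition (10.9) at the same time, which is the point of the transformation.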

In actual calculations we face the practical problem of determining the
orthonormal basis set $\{\varphi_i\}$ in (10.13), or more precisely its expansion coefficients
$\mathbf{A}$. It should be clear from the above that any orthonormal set will serve the purpose;
however, certain choices have specific advantages.
One such choice is to simply use the orbitals of the previous SCF iteration.
This has the advantage that the Fock matrix in this new, orthogonal basis is already
almost diagonal: more and more so as the SCF procedure approaches convergence, and
this can make the diagonalization much more efficient.
In the first SCF iteration this strategy obviously cannot be used. Here, another
approach is found to have certain advantages: We start by diagonalizing the overlap
matrix $\mathbf{S}$, i.e. we solve the matrix eigenvalue problem

$$\mathbf{S}\mathbf{U} = \mathbf{U}\boldsymbol{\sigma}, \tag{10.19}$$

where $\mathbf{U}$ is unitary and $\boldsymbol{\sigma}$ is diagonal. We can then form the matrix
$\mathbf{X} = \mathbf{U}\boldsymbol{\sigma}^{-1/2}$, where $\boldsymbol{\sigma}^{-1/2}$ is the matrix containing the
elements $(\sigma_{ii})^{-1/2}$ on the diagonal. (We have assumed here that the matrix $\mathbf{S}$ is
positive definite, which holds as long as there is no linear dependence in the basis set.)
We can easily verify that

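$$\mathbf{X}^\dagger\mathbf{S}\mathbf{X} = (\mathbf{U}\boldsymbol{\sigma}^{-1/2})^\dagger\mathbf{S}\,\mathbf{U}\boldsymbol{\sigma}^{-1/2} = \boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger\mathbf{S}\mathbf{U}\boldsymbol{\sigma}^{-1/2} = \boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger\mathbf{U}\boldsymbol{\sigma}\boldsymbol{\sigma}^{-1/2} = \boldsymbol{\sigma}^{-1/2}\boldsymbol{\sigma}\boldsymbol{\sigma}^{-1/2} = \mathbf{1} \tag{10.20}$$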

and $\mathbf{X}$ can therefore be used in place of $\mathbf{A}$ for the transformations in (10.14) and
(10.17). This Canonical Orthogonalization scheme has some obvious advantages: In
the case of near or exact linear dependency in the basis set, the eigenvalues of the
overlap matrix $\mathbf{S}$ will be near or equal to zero. This would obviously constitute a
problem if we attempted to form $\boldsymbol{\sigma}^{-1/2}$ and $\mathbf{X}$ in (10.20) as the elements in column 'i'
would grow beyond any limit, but the calculation would in fact misbehave with any
choice of transformation since it would be essentially impossible to form a full set of
orthonormal molecular orbitals in such a basis.
The matrix $\mathbf{U}$ is unitary; hence its elements have an order of magnitude around
one, and $\mathbf{X} = \mathbf{U}\boldsymbol{\sigma}^{-1/2}$ thus has elements of the order $(\sigma_{ii})^{-1/2}$.
The matrix $\mathbf{C}'$ is also unitary, and $\mathbf{C}$, the coefficients of the LCAO expansion, are
of order $(\sigma_{ii})^{-1/2}$. The two-electron energy in (10.1a) contains products of four different
such expansion coefficients, and is therefore of order $(\sigma_{ii})^{-2}$. Any error in the integrals
(pq|rs) could therefore be multiplied by $(\sigma_{ii})^{-2}$ when forming the expression for the total
energy. Clearly, as $\sigma_{ii}$ approaches zero this will lead to large numerical problems. Using a
canonical orthogonalization is actually an advantage in these cases, since the problem is
readily detected as it manifests itself in the magnitudes of $\sigma_{ii}$.
However, the above procedure does not only identify the problem, it also
leads towards a procedure for correcting it. Instead of attempting to calculate
$(\sigma_{ii})^{-1/2}$ for very small eigenvalues, one may simply eliminate the corresponding
eigenvector from the basis. Mathematically, this means that we have projected out from
the basis the particular component causing the linear dependency.

[Diagram illustrating the elimination of an eigenvector from $\mathbf{X} = \mathbf{U}\boldsymbol{\sigma}^{-1/2}$]

This procedure gives a rectangular matrix $\mathbf{X}'$ to be used in place of $\mathbf{A}$ in (10.14) and
(10.17), and the transformation of the Fock matrix therefore leads to a matrix of smaller
dimension (though still square).
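The elimination of near-linear dependencies can be sketched as follows. The overlap matrix is a made-up example in which the third basis function is almost the normalized sum of the first two; the threshold of 1e-6 and the function name are assumptions of this illustration, not recommendations from the text.

```python
import numpy as np

def canonical_orthogonalizer(S, threshold=1e-6):
    """Return X' = U sigma^(-1/2) built only from eigenvectors with sigma_ii > threshold."""
    sigma, U = np.linalg.eigh(S)                  # (10.19): S U = U sigma
    keep = sigma > threshold
    X = U[:, keep] / np.sqrt(sigma[keep])         # scale each retained column by sigma_ii^(-1/2)
    return X, sigma

# Made-up overlap matrix: the third function is almost the normalized sum of the first two
s12, delta = 0.5, 1e-9
G = np.array([[1.0, s12], [s12, 1.0]])
u = np.array([1.0, 1.0]) / np.sqrt(2.0 + 2.0 * s12)
S = np.eye(3)
S[:2, :2] = G
S[:2, 2] = S[2, :2] = (1.0 - delta) * (G @ u)

X, sigma = canonical_orthogonalizer(S)

# A made-up symmetric Fock matrix, transformed as in (10.17) with X' in place of A
F = np.array([[-1.0, -0.4, -0.9],
              [-0.4, -0.8, -0.8],
              [-0.9, -0.8, -1.1]])
F_reduced = X.T @ F @ X

print("eigenvalues of S:", sigma)
print("shape of X':", X.shape, " shape of transformed F:", F_reduced.shape)
```

One eigenvalue of S is close to zero, so the corresponding column is dropped: X' is rectangular (3 x 2), the retained functions are still orthonormal, and the transformed Fock matrix is 2 x 2.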

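We could also have used a symmetric orthogonalization scheme, which is slightly
different in that we now define

$$\mathbf{X} = \mathbf{U}\boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger \tag{10.21}$$

Again, it is easily verified that

$$\mathbf{X}^\dagger\mathbf{S}\mathbf{X} = \mathbf{U}\boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger\mathbf{S}\,\mathbf{U}\boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger = \mathbf{U}\boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger\mathbf{U}\boldsymbol{\sigma}\boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger = \mathbf{U}\boldsymbol{\sigma}^{-1/2}\boldsymbol{\sigma}\boldsymbol{\sigma}^{-1/2}\mathbf{U}^\dagger = \mathbf{U}\mathbf{U}^\dagger = \mathbf{1} \tag{10.22}$$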

The advantage of a symmetric orthogonalization is that the new basis set
$\boldsymbol{\varphi} = \boldsymbol{\chi}\mathbf{X}$ obtained with this transformation is as close to the original,
$\boldsymbol{\chi}$, as possible. The possibility to eliminate near linear dependencies is still present
with the symmetric orthogonalization approach. The symmetric orthogonalization also has
applications in the analysis of the wavefunction, as we will discuss later in Section 24.
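The "as close as possible" property can be checked numerically. The sketch below compares the symmetric and canonical transformations for a made-up overlap matrix, measuring closeness by $\sum_i \|\varphi_i - \chi_i\|^2 = \mathrm{tr}[(\mathbf{X} - \mathbf{1})^\dagger\mathbf{S}(\mathbf{X} - \mathbf{1})]$; this particular distance measure and the function names are assumptions of the illustration.

```python
import numpy as np

def symmetric_orthogonalizer(S):
    """X = U sigma^(-1/2) U^T as in (10.21), for a real symmetric positive definite S."""
    sigma, U = np.linalg.eigh(S)
    return (U / np.sqrt(sigma)) @ U.T

def canonical_orthogonalizer(S):
    """X = U sigma^(-1/2), the canonical choice, for comparison."""
    sigma, U = np.linalg.eigh(S)
    return U / np.sqrt(sigma)

def distance(X, S):
    """Sum over i of ||phi_i - chi_i||^2 for the new basis phi = chi X."""
    d = X - np.eye(len(S))
    return np.trace(d.T @ S @ d)

S = np.array([[1.0, 0.4, 0.1],
              [0.4, 1.0, 0.3],
              [0.1, 0.3, 1.0]])      # made-up overlap matrix

X_sym = symmetric_orthogonalizer(S)
X_can = canonical_orthogonalizer(S)
print(distance(X_sym, S), distance(X_can, S))
```

The symmetric transformation is itself a symmetric matrix (it is $\mathbf{S}^{-1/2}$), so each new function remains dominated by the original basis function with the same index, whereas the canonical functions are in general delocalized mixtures.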
Our purpose with the Hartree-Fock scheme is to determine the orbitals from
which to construct a molecular wavefunction, and we thus need as many orbitals as
there are electrons in the system. However, the number of solutions usually doesn't
equal the number of electrons n, but rather the number of basis functions N used in the
expansion (9.1). In most cases N>n, so we're getting more than we bargained for.
This is actually a potential practical problem - we are facing the task of selecting a
subset of solutions to the Hartree-Fock equations, and we need to establish the criteria
for such a selection. In semi-empirical theories the total energy equals the sum of the
orbital energies, and one can therefore safely assume that the lowest energy for the
molecule is obtained by using the n orbitals with lowest orbital energy and ignoring the
rest. This simple and obvious recipe goes under the name of the Aufbau Principle, and
is extremely useful in deciding which eigenfunctions of the Fock operator to use as
orbitals. These functions are termed the occupied orbitals; the remaining, unused
eigenfunctions are referred to as virtual orbitals. Virtual orbitals are sometimes used to
construct wavefunctions for excited states.
We showed in (5.24) that this simple relation between orbital and total energy
doesn't exactly hold for the Hartree-Fock method. Still, the Aufbau principle works in
almost all cases, even though there is no formal guarantee that using the lowest
set of orbital energies leads to the wavefunction with the lowest total energy. We
should note that the Aufbau Principle suggests a set of orbitals to be used in describing
the ground electronic state of the system. In many cases we are interested in excited
states - obviously the Aufbau Principle shouldn't be used to select those orbitals,
although it is generally useful also in these cases to know what a ground-state electronic
configuration might look like.

11. The Self-Consistent Field Procedure.

The attentive reader may already have noticed a significant flaw in our reasoning
when the Hartree-Fock method was presented above. It is certainly plausible that the
Hartree-Fock equations could somehow be solved if we were able to construct the Fock
operator, and subsequently the Fock matrix (10.5), which could be diagonalized to get the
orbitals. It is also clear that the Fock operator is defined from these orbitals, according
to (4.11) - (4.13) and (5.3). It thus appears that we would need to know the solutions to
the Hartree-Fock equations, before we could define the operators needed to construct these
equations! The solution to the above paradox is as simple as it is pragmatic. As long as
the equations are satisfied in the end, it doesn't matter how we arrive at these solutions.
We can therefore allow ourselves to guess a set of orbitals without any particular
justification, in order to get the process started. With these orbitals we can now
construct an approximate Fock operator, which can then be diagonalized to obtain a new set
of orbitals. These orbitals then replace the old ones in constructing a new Fock operator,
and so on. The procedure is repeated, and after a certain number of iterations it is
usually found that the orbitals do not change from one iteration to the next. At this
point, the orbitals satisfy (5.22), and we conclude that we now have the solution to our
problem.

[Flow chart: Guess the initial molecular orbitals -> Construct the Fock operator -> Solve the
eigenvalue problem $F\varphi_i = \varphi_i\varepsilon_i$ -> Replace old orbitals with new -> back to
Construct the Fock operator]
The above approach is referred to as the Self-Consistent Field method (or SCF
for short), since the Coulomb and exchange fields of the orbitals define a Fock operator
having these very same orbitals as eigenfunctions.
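The iterative cycle described above can be sketched for the closed-shell case. Everything in the example is an assumption made for illustration: the function names, the zero-density starting guess, the convergence test on the density matrix, and the two-function integral values, which resemble a minimal-basis H2 calculation but are not presented as reference data. The closed-shell Fock matrix is taken as h + J - K/2 with the density matrix $\mathbf{D} = 2\mathbf{C}_{occ}\mathbf{C}_{occ}^\dagger$, and the integrals are stored in the (pq|rs) charge-cloud order.

```python
import numpy as np

def fill_eri(n, unique):
    """Expand symmetry-unique (pq|rs) values into a full tensor (real basis functions)."""
    eri = np.zeros((n, n, n, n))
    for (p, q, r, s), value in unique.items():
        for a, b, c, d in ((p, q, r, s), (q, p, r, s), (p, q, s, r), (q, p, s, r),
                           (r, s, p, q), (s, r, p, q), (r, s, q, p), (s, r, q, p)):
            eri[a, b, c, d] = value
    return eri

def fock_matrix(h, eri, D):
    """Closed-shell Fock matrix for D = 2 C_occ C_occ^T: h + J - K/2."""
    J = np.einsum("pqrs,rs->pq", eri, D)     # sum_rs D_rs (pq|rs)
    K = np.einsum("prqs,rs->pq", eri, D)     # sum_rs D_rs (pr|qs)
    return h + J - 0.5 * K

def scf(h, S, eri, n_occ, max_iter=50, tol=1e-10):
    sigma, U = np.linalg.eigh(S)
    X = U / np.sqrt(sigma)                   # orthonormal basis via canonical orthogonalization
    D = np.zeros_like(h)                     # crude initial guess: no electron repulsion
    for iteration in range(1, max_iter + 1):
        F = fock_matrix(h, eri, D)           # construct the Fock matrix from the current density
        eps, C_prime = np.linalg.eigh(X.T @ F @ X)    # solve the eigenvalue problem
        C = X @ C_prime
        D_new = 2.0 * C[:, :n_occ] @ C[:, :n_occ].T   # Aufbau: occupy the lowest orbitals
        change = np.max(np.abs(D_new - D))
        D = D_new                            # replace the old orbitals with the new ones
        if change < tol:
            break
    F = fock_matrix(h, eri, D)
    energy = 0.5 * np.sum(D * (h + F))       # electronic energy, without nuclear repulsion
    return energy, eps, C, D, iteration

# Illustrative integrals for a symmetric two-function model
S = np.array([[1.0, 0.6593], [0.6593, 1.0]])
h = np.array([[-1.1204, -0.9584], [-0.9584, -1.1204]])
eri = fill_eri(2, {(0, 0, 0, 0): 0.7746, (1, 1, 1, 1): 0.7746, (0, 0, 1, 1): 0.5697,
                   (1, 0, 0, 0): 0.4441, (1, 0, 1, 1): 0.4441, (1, 0, 1, 0): 0.2970})

energy, eps, C, D, n_iter = scf(h, S, eri, n_occ=1)
print(f"E = {energy:.4f} after {n_iter} iterations")
```

In this symmetric model the occupied orbital is fixed by symmetry, so the density stops changing almost immediately; in realistic cases many iterations, and often some form of damping or extrapolation, are needed.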

12. Solution of the Roothaan-Hall equations.

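We are now in a position to devise a computational approach for solving the
Roothaan-Hall equations. The most obvious scheme is to expand the spatial orbitals
$\psi_k^\alpha(\mathbf{r})$ and $\psi_l^\beta(\mathbf{r})$ in Eq. (6.1) directly, using an LCAO basis set (9.1):

$$\psi_k^\alpha(\mathbf{r}) = \sum_p^N C^\alpha_{pk}\,\chi_p(\mathbf{r}) \tag{12.1a}$$
$$\psi_l^\beta(\mathbf{r}) = \sum_p^N C^\beta_{pl}\,\chi_p(\mathbf{r}) \tag{12.1b}$$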

As discussed in Section 11, some initial approximate guess for the expansion
coefficients $\mathbf{C}$ has to be made. A number of different procedures can be used, details
of which will be discussed below. In all cases, however, an initial set of MO
coefficients $\mathbf{C}$ (or equivalently, a density matrix $\mathbf{D}$) will be available at the beginning
of the iterative procedure. With these orbital coefficients, one density matrix can be
defined as in (10.2), except that we now construct one matrix explicitly for each of the
spins.
$$D^\alpha_{rs} = \sum_{i(\alpha)}^n C^\alpha_{ri}C^\alpha_{si} \tag{12.2a}$$
$$D^\beta_{rs} = \sum_{i(\beta)}^n C^\beta_{ri}C^\beta_{si} \tag{12.2b}$$

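where the summations are only over $\alpha$- or $\beta$-orbitals, as indicated. With these density
matrices (and the one- and two-electron integrals) the total energy can be expressed as:

$$E = \sum_{p,q}^N (D^\alpha_{pq} + D^\beta_{pq})\,h_{pq} + \frac{1}{2}\sum_{pq,rs}^N \left[(D^\alpha_{pq} + D^\beta_{pq})(D^\alpha_{rs} + D^\beta_{rs})(pq|rs) - (D^\alpha_{pq}D^\alpha_{rs} + D^\beta_{pq}D^\beta_{rs})(pr|qs)\right] \tag{12.3}$$

We also obtain two Fock matrices, one for each of the Fock operators in (6.10):
$$F^\alpha_{pq} = h_{pq} + \sum_{r,s}^N \left[(D^\alpha_{rs} + D^\beta_{rs})(pq|rs) - D^\alpha_{rs}(pr|qs)\right] \tag{12.4a}$$
$$F^\beta_{pq} = h_{pq} + \sum_{r,s}^N \left[(D^\alpha_{rs} + D^\beta_{rs})(pq|rs) - D^\beta_{rs}(pr|qs)\right] \tag{12.4b}$$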

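Making the following substitutions in (12.4),

$$\mathbf{D}^{tot} = \mathbf{D}^\alpha + \mathbf{D}^\beta \tag{12.5a}$$

$$\mathbf{D}^{spin} = \mathbf{D}^\alpha - \mathbf{D}^\beta \tag{12.5b}$$

$$\mathbf{D}^\alpha = \frac{1}{2}\{\mathbf{D}^{tot} + \mathbf{D}^{spin}\} \tag{12.5c}$$

$$\mathbf{D}^\beta = \frac{1}{2}\{\mathbf{D}^{tot} - \mathbf{D}^{spin}\} \tag{12.5d}$$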

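we arrive at expressions that often lend themselves better to computation:

$$E = \sum_{pq}^{N} D_{pq}^{\mathrm{tot}} h_{pq}
  + \tfrac{1}{4} \sum_{pq,rs}^{N} D_{pq}^{\mathrm{tot}} D_{rs}^{\mathrm{tot}} \left\{ 2(pq|rs) - (pr|qs) \right\}
  - \tfrac{1}{4} \sum_{pq,rs}^{N} D_{pq}^{\mathrm{spin}} D_{rs}^{\mathrm{spin}}\,(pr|qs) \tag{12.6}$$

$$F_{pq}^{\alpha} = h_{pq} + \sum_{rs}^{N} \left[ D_{rs}^{\mathrm{tot}} \left\{ (pq|rs) - \tfrac{1}{2}(pr|qs) \right\} - \tfrac{1}{2} D_{rs}^{\mathrm{spin}}\,(pr|qs) \right] \tag{12.7a}$$

$$F_{pq}^{\beta} = h_{pq} + \sum_{rs}^{N} \left[ D_{rs}^{\mathrm{tot}} \left\{ (pq|rs) - \tfrac{1}{2}(pr|qs) \right\} + \tfrac{1}{2} D_{rs}^{\mathrm{spin}}\,(pr|qs) \right] \tag{12.7b}$$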
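
The equivalence of the total/spin-density expressions (12.6)-(12.7) with the α/β expressions can be checked numerically. The sketch below is one way to do so; the function names and the dense array `eri[p, q, r, s]` for (pq|rs) are illustrative assumptions, and the one-electron term h is included in the Fock matrices as in (13.2).

```python
import numpy as np

def fock_from_tot_spin(h, eri, D_tot, D_spin):
    """Alpha and beta Fock matrices from total and spin densities, cf. (12.7)."""
    J = np.einsum("pqrs,rs->pq", eri, D_tot)        # sum_rs D_tot_rs (pq|rs)
    K_tot = np.einsum("prqs,rs->pq", eri, D_tot)    # sum_rs D_tot_rs (pr|qs)
    K_spin = np.einsum("prqs,rs->pq", eri, D_spin)  # sum_rs D_spin_rs (pr|qs)
    F_a = h + J - 0.5 * K_tot - 0.5 * K_spin
    F_b = h + J - 0.5 * K_tot + 0.5 * K_spin
    return F_a, F_b

def energy_from_tot_spin(h, eri, D_tot, D_spin):
    """Electronic energy from total and spin densities, cf. (12.6)."""
    J = np.einsum("pqrs,rs->pq", eri, D_tot)
    K_tot = np.einsum("prqs,rs->pq", eri, D_tot)
    K_spin = np.einsum("prqs,rs->pq", eri, D_spin)
    return (np.sum(D_tot * h)
            + 0.25 * np.sum(D_tot * (2.0 * J - K_tot))
            - 0.25 * np.sum(D_spin * K_spin))
```

For a closed-shell system the spin density vanishes, and both Fock matrices collapse to the single closed-shell matrix used in Section 13.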

We now obtain the two Roothaan-Hall equations,

(12.8a)

(12.8b)

which can be solved with the methods discussed in Section 11. Note that in general the
orbitals and the orbital energies will be different for the two equations (12.8a-b).
No special attention needs to be given to the issue of orthogonality among the orbitals.
Orbitals with different spin are automatically orthogonal due to the spin integration,
whereas those with the same spin are orthogonal since they are solutions to the same set
of Roothaan-Hall equations.
The orbitals obtained in this scheme can now be used in accordance with the
Aufbau Principle to form the orbitals for the next iteration, or to construct the total
wave function if the SCF procedure has converged. There is no requirement to have the
same number of α- and β-orbitals, and the Aufbau principle can be applied separately
for each spin.
It may appear confusing that two different sets of Hartree-Fock equations were
obtained, since only one occurs in the original derivation. We can view the two
equations above as a special case of one large 2N x 2N equation, in a basis of 2N
basis functions (N spatial functions combined with two different spin parts):

Since the off-diagonal blocks are exactly zero due to the spin orthogonality, these
matrix equations can be separated into two blocks of dimension N x N.
The practical procedure for solving the Hartree-Fock (or Roothaan-Hall)
equations in an LCAO basis set would then go as follows:

1: Make an initial guess of molecular orbitals, i.e., find a starting set of MO
coefficients, and construct the trial density matrix. Alternatively, one could
try to guess the density matrix directly.

2: Determine an orthonormal set of functions within the basis set given.

3: Compute one- and two-electron integrals over the LCAO basis functions and
construct the Fock matrix (or matrices).

4: Transform the Fock matrix to the orthonormal basis and diagonalize it.

5: If the new orbitals are equal to the old, STOP!

6: Else: construct a new density from the orbital coefficients and go to 3.
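
The six steps above can be sketched as a short NumPy routine for the spin-unrestricted case. This is an illustration under stated assumptions, not a production algorithm: the integrals are passed in as precomputed dense arrays (`h`, `S`, and `eri[p, q, r, s]` for (pq|rs)), the starting guess is a zero density, symmetric orthogonalization is used for step 2, and convergence in step 5 is tested on the density matrices rather than on the orbitals themselves, since orbitals are only defined up to a phase. All function and parameter names are hypothetical, and no convergence acceleration is attempted.

```python
import numpy as np

def symmetric_orthogonalizer(S):
    """Step 2: X = S^(-1/2), so that X.T @ S @ X is the unit matrix."""
    w, U = np.linalg.eigh(S)
    return U @ np.diag(w ** -0.5) @ U.T

def fock_matrices(h, eri, D_a, D_b):
    """Step 3: alpha and beta Fock matrices from the current densities."""
    J = np.einsum("pqrs,rs->pq", eri, D_a + D_b)
    F_a = h + J - np.einsum("prqs,rs->pq", eri, D_a)
    F_b = h + J - np.einsum("prqs,rs->pq", eri, D_b)
    return F_a, F_b

def density_from_fock(F, X, n_occ):
    """Steps 4 and 6: diagonalize F in the orthonormal basis, fill lowest orbitals."""
    _, C_prime = np.linalg.eigh(X.T @ F @ X)
    C_occ = (X @ C_prime)[:, :n_occ]
    return C_occ @ C_occ.T

def scf(h, S, eri, n_alpha, n_beta, tol=1e-8, max_iter=200):
    """Plain fixed-point SCF iteration; returns energy, densities, iteration count."""
    X = symmetric_orthogonalizer(S)
    D_a = np.zeros_like(h)  # step 1: zero-density starting guess
    D_b = np.zeros_like(h)
    for iteration in range(1, max_iter + 1):
        F_a, F_b = fock_matrices(h, eri, D_a, D_b)
        D_a_new = density_from_fock(F_a, X, n_alpha)
        D_b_new = density_from_fock(F_b, X, n_beta)
        change = max(np.abs(D_a_new - D_a).max(), np.abs(D_b_new - D_b).max())
        D_a, D_b = D_a_new, D_b_new
        if change < tol:  # step 5: densities no longer change
            break
    else:
        raise RuntimeError("SCF did not converge")
    F_a, F_b = fock_matrices(h, eri, D_a, D_b)
    energy = 0.5 * (np.sum(D_a * (h + F_a)) + np.sum(D_b * (h + F_b)))
    return energy, D_a, D_b, iteration
```

Plain iteration of this kind can oscillate or diverge for difficult cases; the sketch only mirrors the logical flow of the steps.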

We have introduced the determinant wavefunction as a physically intuitive
approximation. We could also have arrived at the same result from a more formal point
of view. For our discussion to be mathematically solid, we need to accept a couple of
theorems without complete proof:

1: The Hartree-Fock equations - but not the matrix representations of these
equations - have solutions that form a complete set of (square-integrable) one-electron
functions.

2: The determinants formed with a complete set of one-electron functions form a
complete set of antisymmetric N-electron functions.

It can now be rigorously shown that, as long as certain formal criteria are met,
any function of several variables can be written as a sum of expansion functions which
are simple products of one-variable functions. In electronic structure theory, it is
obviously more useful to think in terms of n- and one-electron functions, i.e. total
wavefunctions and orbitals. Therefore, any function of the coordinates of n electrons
can be expanded in Hartree products, and any antisymmetric function can be similarly
expanded in Slater determinants!
This implies that we could write the exact electronic wavefunction for any
system as a linear combination of Slater determinants,

$$\Psi(1,2,\ldots,n) = \sum_{i} c_i D_i(1,2,\ldots,n) \tag{12.9}$$

which is the basic idea behind the Configuration Interaction (CI) method, to be
discussed elsewhere. In practice, that expansion would be infinite and the sum above
must be truncated. The simplest wavefunction of that form is a single determinant,
which is another way of viewing the Hartree-Fock approximation.
The Hartree-Fock wavefunction is clearly an approximate solution to the
"exact" Hamiltonian. An interesting question regarding the properties of Slater
determinants, which is closely related to the above discussion, is the following: Is there
instead an approximate Hamiltonian for which the Hartree-Fock wavefunction is an
exact solution? To see that such an operator indeed exists, we consider the sum of the
Fock operator acting in turn on all electrons in the system:

$$H_0 = \sum_{i}^{n} F(i) \tag{12.10}$$

Pay attention here: F is a one-electron operator, i.e., it acts on one electron
only, although its eigenfunctions may well depend on the coordinates of other electrons
as well. $H_0$ in contrast is an n-electron operator affecting all electrons in the system.
Note also that all the F(i) must have identical forms due to the fact that electrons are
indistinguishable; the only difference between the terms in (12.10) is that they operate
on different electrons. We now let $H_0$ operate on a Hartree-Fock wavefunction, i.e. a
determinant as in (3.16). As discussed above, $\Psi_0$ is a sum of n! Hartree products as in
(3.1). One such product can be denoted

$$\Theta_\nu = \varphi_{a_1}(1)\,\varphi_{a_2}(2)\cdots\varphi_{a_n}(n) \tag{12.11}$$

where $\{a_1, a_2, \ldots, a_n\}$ is a permutation of the indices $\{1, 2, \ldots, n\}$. A single term F(i)
in $H_0$ operating on a single Hartree product $\Theta_\nu$ in the wavefunction will then result in
the following expressions:

$$F(i)\,\Theta_\nu = F(i)\left[\varphi_{a_1}(1)\,\varphi_{a_2}(2)\cdots\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\right]
  = \left[\varphi_{a_1}(1)\,\varphi_{a_2}(2)\cdots\varepsilon_{a_i}\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\right]
  = \varepsilon_{a_i}\Theta_\nu \tag{12.12}$$

Therefore, we conclude that the Hartree product is an eigenfunction of the zero-order
Hamiltonian:

$$H_0\Theta_\nu = \sum_{i}^{n} F(i)\,\Theta_\nu = \Big[\sum_{i}^{n} \varepsilon_{a_i}\Big]\Theta_\nu \tag{12.13}$$

Since $\{a_1, a_2, \ldots, a_n\}$ is just a permutation of $\{1, 2, \ldots, n\}$, the sum in (12.13) simply
contains the sum over all orbital energies. Thus,

$$H_0\Theta_\nu = E_0\Theta_\nu \tag{12.14}$$

where

$$E_0 = \sum_{i}^{n} \varepsilon_i \tag{12.15}$$

for all the Hartree products $\Theta_\nu$, and we thus also have

$$H_0\Psi_0 = E_0\Psi_0 \tag{12.16}$$

with the above definitions of $H_0$, $\Psi_0$ and $E_0$.

Guess of initial orbitals.

In order to get the SCF procedure to converge swiftly, it is important to have an
accurate guess of the orbitals for the first iteration. Actually, since the orbitals would
be used to form the density matrix, the purpose is equally well served by a trial density
matrix, which is sometimes easier to obtain. The very simplest guess, but one that
actually works well in many cases, is to set the density matrix equal to zero in the first
iteration. This corresponds to a total neglect of all electron repulsion, and the electron
density will therefore be densely concentrated around the nuclei. A more sophisticated
guess would be to start with a density constructed as a superposition of atomic
densities. These can be stored permanently along with the basis set information, or
obtained through inexpensive atomic calculations during the processing of the integrals.
Such a trial density would not in general correspond to any approximate Hartree-Fock
wavefunction for the molecule. This flaw can be corrected, usually with a general
improvement of the guess as a result, by diagonalizing the trial density and using the
resulting eigenvectors with integer occupation numbers as trial orbitals.
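
Both guesses described above can be sketched in a few lines of NumPy. The function names are hypothetical, and the handling of the non-orthogonal basis (diagonalizing S^(1/2) D S^(1/2) and back-transforming with S^(-1/2)) is an assumption made for this illustration; the text does not specify how the trial density is diagonalized.

```python
import numpy as np

def _matrix_power(S, power):
    """Fractional power of a symmetric positive definite matrix."""
    w, U = np.linalg.eigh(S)
    return U @ np.diag(w ** power) @ U.T

def core_guess_density(h, S, n_occ):
    """Zero-density start: with D = 0 the Fock matrix reduces to h."""
    X = _matrix_power(S, -0.5)
    _, C_prime = np.linalg.eigh(X.T @ h @ X)
    C_occ = X @ C_prime[:, :n_occ]
    return C_occ @ C_occ.T

def orbitals_from_trial_density(D_trial, S, n_occ):
    """Diagonalize a trial density and keep the n_occ most occupied eigenvectors.

    Returns the trial orbital coefficients and all occupation numbers in
    descending order; the retained orbitals are then given occupation one.
    """
    S_half = _matrix_power(S, 0.5)
    S_inv_half = _matrix_power(S, -0.5)
    occ, V = np.linalg.eigh(S_half @ D_trial @ S_half)  # ascending order
    C = S_inv_half @ V[:, ::-1][:, :n_occ]
    return C, occ[::-1]
```

The orbitals returned by the second function are orthonormal with respect to S, so the density C C^T built from them corresponds to a proper single determinant even when the trial density does not.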

13. The Supermatrix Formalism.

In an LCAO representation, the expression for the closed-shell Hartree-Fock
energy (7.2) becomes:

$$E = \sum_{pq}^{N} D_{pq} h_{pq} + \tfrac{1}{2} \sum_{pq,rs}^{N} D_{pq} D_{rs} \left\{ (pq|rs) - \tfrac{1}{4}(pr|qs) - \tfrac{1}{4}(ps|qr) \right\} \tag{13.1}$$

The α- and the β-Fock matrices would be equal, and given by:

$$F_{pq} = h_{pq} + \sum_{rs}^{N} D_{rs} \left\{ (pq|rs) - \tfrac{1}{4}(pr|qs) - \tfrac{1}{4}(ps|qr) \right\} \tag{13.2}$$

In (13.1) and (13.2) we have written the exchange contribution in a symmetric form,
which we may do as long as the summation is over all r,s. Exploiting this symmetry
we may now restrict the summation range. If we redefine the density matrix as

(13.3)

we arrive at expressions which are more efficient to evaluate than the previous ones:

$$E = \sum_{p \le q}^{N} d_{pq} h_{pq} + \tfrac{1}{2} \sum_{p \le q,\; r \le s}^{N} d_{pq} d_{rs} P_{pq,rs} \tag{13.4}$$

$$F_{pq} = h_{pq} + \sum_{r \le s}^{N} d_{rs} P_{pq,rs} \tag{13.5}$$

In (13.4) and (13.5) we have introduced the supermatrix notation

$$P_{pq,rs} = (pq|rs) - \tfrac{1}{4}(pr|qs) - \tfrac{1}{4}(ps|qr) \tag{13.6}$$

The supermatrix formalism is often very time-saving. It is convenient to treat
the Fock and density matrices as one-dimensional quantities f and d, "supervectors",
of dimension N(N+1)/2, and P as a two-dimensional "supermatrix" with the index
pairs pq and rs as compound indices. The expressions for the energy and Fock
matrix/vector can then be conveniently written in matrix notation:
(13.7)
(13.8)

Similar techniques can be used to construct supermatrices in the unrestricted
Hartree-Fock method. Notice that supermatrices are symmetric upon interchange of indices
p ↔ q and r ↔ s, as well as of the index pairs pq ↔ rs. Thus, if we should choose
to compute and save these quantities, the space requirement for each supermatrix would
be of the order $\tfrac{1}{8}N^4$.
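
A small NumPy sketch may make the compound-index bookkeeping concrete. It assumes that the redefined density is d_pq = (2 - delta_pq) D_pq, which is the choice that makes the restricted sums in (13.4) and (13.5) reproduce the full sums in (13.1) and (13.2) for a symmetric D. The function names and the dense array `eri[p, q, r, s]` for (pq|rs) are likewise illustrative assumptions.

```python
import numpy as np

def pair_index(N):
    """The N(N+1)/2 compound indices (p, q) with p <= q."""
    return [(p, q) for p in range(N) for q in range(p, N)]

def supervectors(h, D):
    """One-electron supervector and the redefined density supervector d."""
    pairs = pair_index(h.shape[0])
    h_vec = np.array([h[p, q] for p, q in pairs])
    # Off-diagonal elements count twice because only p <= q is kept.
    d_vec = np.array([(1.0 if p == q else 2.0) * D[p, q] for p, q in pairs])
    return h_vec, d_vec

def supermatrix(eri):
    """P_{pq,rs} = (pq|rs) - (pr|qs)/4 - (ps|qr)/4 over compound indices, cf. (13.6)."""
    pairs = pair_index(eri.shape[0])
    P = np.empty((len(pairs), len(pairs)))
    for i, (p, q) in enumerate(pairs):
        for j, (r, s) in enumerate(pairs):
            P[i, j] = (eri[p, q, r, s]
                       - 0.25 * eri[p, r, q, s]
                       - 0.25 * eri[p, s, q, r])
    return P

def energy_and_fock_vector(h_vec, d_vec, P):
    """Energy and Fock supervector as one matrix-vector product each."""
    f_vec = h_vec + P @ d_vec
    energy = d_vec @ h_vec + 0.5 * d_vec @ P @ d_vec
    return energy, f_vec
```

Since P is symmetric, only about half of its [N(N+1)/2]^2 elements are distinct, which is the origin of the N^4/8 storage estimate.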

14. Direct SCF Techniques.

Since the first formulation of the LCAO finite basis scheme for molecular
Hartree-Fock calculations, computer applications of the method have traditionally been
implemented as a two-step process. In the first of these steps the two-electron integrals
are calculated and stored externally. The second step then consists of the iterative
solution of the Roothaan-Hall equations, where the integrals from the first step are read
at least once for every iteration.¹⁶
The division of the computational process into these two steps was motivated
by the high cost of central processor (CPU) performance versus input and output (I/O).
Whereas the second step involves extensive retrieval of data from mass-storage,
integral calculation is dominated by computation of rather complicated analytical
expressions. In early applications of LCAO calculations to molecules, the integral part
of the calculation with its high CPU demands represented the bottleneck, and the major
effort of developing more efficient algorithms and faster computer programs for
molecular Hartree-Fock calculations has therefore been directed at that problem.
During the last several decades there have been continuous rapid advances in
computer technology, as well as in integral algorithm and code development. Common
to virtually all types of computer equipment, from desktop PCs to supercomputers, is
the fact that the progress in CPU technology has been much faster than the development
of I/O facilities. Thus, with the traditional approach one now faces the dilemma of
being able to compute large numbers of integrals very rapidly, but spending a relatively
larger amount of time and effort on their storage and retrieval. Indeed, for SCF
calculations carried out in that fashion today, the size of the systems that can be handled
is almost always limited by the disk storage and I/O capacity needed for the integrals,
rather than by the CPU power required to compute them.
While enough storage capacity might be available on large mainframe systems
to carry out calculations in the conventional spirit up to some 500 or even 1000 basis
functions, such an endeavor would certainly place a heavy load on the I/O channels, fill
a large portion of the available disk space, and reduce the overall throughput of the
system considerably.
The direct SCF scheme offers a solution to this problem by eliminating all
storage of integrals. This can, however, only be done at the expense of integral
recalculation in every iteration. While this would have been very inefficient in the early
days of molecular Hartree-Fock calculations, the present hardware and software
situation makes it a very viable approach. Many years of development have made the
evaluation of these integrals less of a burden than it used to be. Today, it is often easier
to calculate the integrals than to store them; in other words, the evaluation of the
integrals has become less of a bottleneck than their storage. There is a lot of evidence
that this trend will continue, and one should therefore try to circumvent the bottlenecks
set by the external storage and I/O capacity, at the expense of extra CPU work if
necessary. This simple idea is the quintessence of the "direct" approaches in electronic
structure methodology.¹⁷ The two approaches are schematically illustrated below:

[Figure: schematic flow charts of the conventional SCF approach and the direct SCF approach.]

16 With the large memories available on modern hardware, integrals can often be held in memory
during the entire calculation for small and medium-size systems. We will not discuss such "in-core"
solutions in any detail here since our main focus is on large applications.

The first scheme for performing LCAO-MO Hartree-Fock calculations without
integral storage bottlenecks was introduced more than a decade ago.¹⁸ Although based
on well-known principles, it represented a break with the traditional approach to such
calculations. The approach is almost completely trivial: instead of being stored and
recycled, integrals are re-evaluated whenever needed. Since integral storage is not
required, the I/O bottleneck has obviously been completely removed. The drawbacks
are as obvious as the advantages: more time will now be spent on integral evaluation.
Deliberately increasing the amount of CPU work was contrary to all
conventional wisdom of the scientific community when these methods were first
introduced, and they were therefore met with considerable skepticism. However, direct
methods have now been universally accepted, and are incorporated in many of the large
program packages for electronic structure calculations.¹⁹ In particular, most of the
development of parallel codes is based on direct algorithms, where the approach is
expected to be particularly beneficial due to the extreme imbalance between I/O and
number-crunching capacity on a typical parallel architecture.

17 J. Almlöf and P. R. Taylor, in: "Advanced Theories and Computational Approaches to the
Electronic Structure of Molecules", NATO ASI Ser. C 133 (Ed. C. Dykstra), Reidel, Dordrecht
(1984), pp. 107-125.
18 J. Almlöf, K. Faegri, Jr. and K. Korsell, J. Comput. Chem. 3, 385 (1982).
In the most naive implementation, writing a computer code for a direct scheme
amounts to little more than replacing the reading of one- and two-electron integrals in
the SCF algorithm by their repeated recalculation. However, it is evident that such a
change calls for a revision of algorithms and procedures used at a lower level as well,
as we will demonstrate in the following sections. In particular, direct approaches
largely eliminate the need to compute, process and store integrals in any particular
order, and they therefore open possibilities to restructure the algorithms for better
efficiency.²⁰ For best performance a direct algorithm should be integral driven; i.e.
integral evaluation concerns should dictate the order of events in the calculation. When
an integral has been calculated it should be used to the maximum extent possible, as
long as no external storage is invoked.
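
The integral-driven idea can be illustrated with a small closed-shell Fock build in which every symmetry-unique integral is evaluated once, used for all the Fock-matrix elements it contributes to, and then discarded. This is only a sketch: `integral(p, q, r, s)` is a hypothetical callable standing in for a real on-the-fly integral routine, `D` is taken to be the total closed-shell density, and no screening or batching over shells is attempted.

```python
import numpy as np

def direct_fock(h, D, integral):
    """Closed-shell Fock matrix from integrals that are computed but never stored.

    integral(p, q, r, s) must return (pq|rs) and obey the usual eightfold
    permutational symmetry; D is the total (alpha + beta) density matrix.
    """
    N = h.shape[0]
    G = np.zeros((N, N))
    for p in range(N):
        for q in range(p + 1):
            pq = p * (p + 1) // 2 + q
            for r in range(p + 1):
                for s in range(r + 1):
                    if r * (r + 1) // 2 + s > pq:
                        continue  # keep only the unique quadruples pq >= rs
                    value = integral(p, q, r, s)  # evaluated once, then reused
                    # All distinct index permutations equivalent to (pq|rs).
                    images = {(p, q, r, s), (q, p, r, s), (p, q, s, r), (q, p, s, r),
                              (r, s, p, q), (s, r, p, q), (r, s, q, p), (s, r, q, p)}
                    for a, b, c, d in images:
                        G[a, b] += D[c, d] * value        # Coulomb contribution
                        G[a, c] -= 0.5 * D[b, d] * value  # exchange contribution
    return h + G
```

Collecting the permutations in a set avoids double counting when some of the indices coincide. The loop order here follows the integral indices rather than the Fock-matrix elements, which is what "integral driven" refers to.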
The direct approach clearly appears to be the method of choice for medium-size
and large SCF calculations. In addition to being the only realistic way to do large
calculations on workstations and minicomputers, a direct SCF calculation is ideally
suited as a background job for large mainframe and supercomputer systems, where it
can reside in a small portion of memory and execute with low priority using spare CPU
cycles. Its advantages are also accentuated whenever we try to use unusual and exotic
types of hardware, such as modern "massively parallel" equipment,²¹ where the I/O
and the access to externally stored data is often one of the major headaches.
To summarize, what distinguishes the Direct SCF method is that the storage of
integrals in an AO basis (which is $\sim N^4/8$ in conventional methods) is avoided by
constructing the Fock matrix on the fly from integrals, which are reevaluated in each
iteration as they are needed.
It is easy to conceive a compromise between these two extremes: Rather than
recalculating all integrals in every iteration, one can try to identify the most expensive
ones, evaluate them in the beginning of the calculation and keep them in the central
memory of the computer (or even on a fast external I/O device). The cheap ones can
preferably be reevaluated in every iteration, in the spirit of the direct scheme.

19 See e.g. M.J. Frisch, G.W. Trucks, M. Head-Gordon, P.M.W. Gill, M.W. Wong, J.B. Foresman,
B.G. Johnson, H.B. Schlegel, M.A. Robb, E.S. Replogle, R. Gomperts, J.L. Andres, K.
Raghavachari, J.S. Binkley, C. Gonzalez, R.L. Martin, D.J. Fox, D.J. Defrees, J. Baker, J.J.P.
Stewart, and J.A. Pople, Gaussian 92, Revision A.
20 J. Almlöf and K. Faegri, Jr., in: "Self-Consistent Field - Theory and Applications" (Eds.
R. Carbó and M. Klobukowski), Elsevier, 1990, pp. 195.
21 See, e.g.: H.P. Lüthi and J. Almlöf, Theoret. Chim. Acta 84, 443 (1993); L.G.M. Pettersson
and T. Faxén, Theoret. Chim. Acta 85, 345 (1993).

15. Basis Sets.

While we introduced the notion of an LCAO basis set already in (9.1), we have
said nothing so far about the form of these functions. Clearly, the exact atomic orbitals
are almost as inaccessible as their molecular counterparts - and even if we knew them it
would only be in the form of numerical tabulations, which would not lend themselves
to efficient evaluation of the one- and two-electron integrals that we encountered above.
All that was said about basis set expansions in Section 9 would be valid with
almost any basis set, and insisting on atomic orbitals for that expansion is really only
based on our expectation that AO's would be the most suitable set of functions for
expanding the MOs. After all, the notion of atoms in molecules - which is the rationale
for the MO-LCAO approach - is only approximate, and the pursuit of accuracy through
the use of very precise AO's for the expansion is therefore futile.
As a general consequence of atomic symmetry, atomic orbitals are always of the
form

$$\psi(\mathbf{r}) = R(r)\,Y_{lm}(\theta,\phi). \tag{15.1}$$

The equations determining the form of the radial functions R(r) can be solved
exactly only for one-electron atoms, but some general conclusions about the form of the
solutions can still be drawn. Due to the singularity of the potential at a point nucleus
with a charge of +Z, the wave function must have a 'cusp' at the nucleus; more
specifically, it is required that

$$\left.\frac{dR}{dr}\right|_{r=0} = -Z \tag{15.2}$$

At the other end of the range, an electron far away from any molecule would see the
remainder of the molecule as a positive charge without any particular structure. As in
any one-electron atom, the wavefunction would therefore decay exponentially. It
would thus seem reasonable to use exponential functions as basis functions, especially
since they are known to be the exact solutions for the one-electron systems.
Historically, basis functions with exponential asymptotic behavior - Slater-type
orbitals (STO's) - were the first to be used.²² They are characterized by an
exponential factor in the radial part:

(15.3)

where P(r) in (15.3) is a polynomial in the radial coordinate that can take on several
different forms.

22 The Slater-type basis functions are often referred to as ETO's (exponential type orbitals). For a
comprehensive review, see: C.A. Weatherford and H.W. Jones, "ETO Multicenter Molecular Integrals"
(Reidel, Dordrecht, 1982).
Gaussian basis functions were originally introduced to remedy the difficulties
associated with evaluating multi-center integrals with STO's.²³ They can be written in
a rather similar form,

(15.4)

though usually with a different radial polynomial, but much of their usefulness stems
from the fact that they are not confined to a local, polar coordinate system, and they are
therefore commonly expressed in terms of their Cartesian components:

(15.5)

where each Cartesian component has the form:

(15.6)

The present success of GTO's as the basis set of choice in virtually all calculations was
far from obvious in the beginning. For instance, it is clear from quite elementary
considerations that a Gaussian has the qualitatively wrong behavior both at the nuclei
and in the asymptotic (long distance) limit, for a Hamiltonian with point-charge nuclei
and Coulomb interaction.

[Figure: comparison of the shapes of an STO and a GTO.]

Furthermore, early practical experience with Gaussians was quite discouraging and it
has therefore been a commonly held belief that STO's would be the preferred basis if
only the integral evaluation problem could be solved. However, recent experience
indicates that this is not necessarily the case. The 'cusp' behavior represents an
idealized point nucleus, and for more realistic nuclei of finite extension the Gaussian
shape is actually more realistic. If accurate solutions for a point-charge model
Hamiltonian are desired, they can be obtained to any desired accuracy in practice by
expanding the core basis functions in a sufficiently large number of Gaussians to
ensure their correct behavior. Furthermore, properties related to the behavior of the
wave function at or near nuclei can often be predicted correctly, even without an
accurately "cusped" wavefunction.²⁴

23 S.F. Boys, Proc. Roy. Soc. A 200, 542 (1950); S.F. Boys, G.B. Cook, C.M. Reeves and I.
Shavitt, Nature 178, 1207 (1956); H. Preuss, Z. Naturforsch. A 11, 323 (1956).
In most applications the asymptotic behavior of the density far from the nuclei is
considered much more important than the nuclear cusp. As mentioned above, the
wavefunction for a bound state must fall off exponentially with distance, whenever the
Hamiltonian contains Coulomb electrostatic interaction between particles. However,
even though an STO basis would in principle be capable of providing such a correct
exponential decay, this occurs in practice only when the smallest orbital exponent in the
basis set is

$$\zeta_{\min} = (2 I_{\min})^{1/2}, \tag{15.7}$$

$I_{\min}$ being the first ionization potential. $I_{\min}$ is hardly ever known in practice when the
basis set is designed, but even if it were, all attempts to get a correct asymptotic
behavior even with an STO basis would still be futile. One must keep in mind that for
stable molecules $I_{\min}$ is usually > 5 eV, and thus $\zeta_{\min}$ > 0.6 a.u. While this restriction
on the exponent range might be acceptable in SCF calculations on atoms, much lower
values are required for accurate work on any molecule, especially at the correlated level.
Violating the requirement (15.7) with a too diffuse STO basis will have much more
damaging effects on the long-range behavior than any Gaussian basis: if the smallest
exponent in a molecular calculation is chosen to be e.g. 0.4 rather than 0.6 as in the
above example, the density at a distance of 10 Å from a molecule would exceed the
correct one by about three orders of magnitude! With a typical Gaussian basis, in
comparison, the density is essentially zero at that distance, and the consequences of this
error are far less severe for any normal molecular property calculated with these basis
sets.
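
The two numbers quoted above are easy to reproduce. The sketch below assumes that the density of the most diffuse function decays asymptotically as exp(-2 zeta r), ignoring polynomial prefactors and normalization; the function names and the rounded conversion factors are choices made for this illustration.

```python
import math

HARTREE_IN_EV = 27.2114       # 1 hartree expressed in eV (rounded)
BOHR_IN_ANGSTROM = 0.529177   # 1 bohr expressed in angstrom (rounded)

def zeta_min(ionization_potential_ev):
    """Smallest exponent compatible with (15.7), in atomic units."""
    return math.sqrt(2.0 * ionization_potential_ev / HARTREE_IN_EV)

def sto_density_ratio(zeta_used, zeta_correct, r_angstrom):
    """Ratio of exp(-2 zeta r) densities for the exponent used and the correct one."""
    r_bohr = r_angstrom / BOHR_IN_ANGSTROM
    return math.exp(-2.0 * (zeta_used - zeta_correct) * r_bohr)

def gto_density_factor(alpha, r_angstrom):
    """Decay factor exp(-2 alpha r^2) of a Gaussian density at distance r."""
    r_bohr = r_angstrom / BOHR_IN_ANGSTROM
    return math.exp(-2.0 * alpha * r_bohr ** 2)
```

With these definitions, `zeta_min(5.0)` is close to 0.6 a.u. and `sto_density_ratio(0.4, 0.6, 10.0)` comes out at roughly two thousand, i.e. about three orders of magnitude. For comparison, `gto_density_factor(0.1, 10.0)`, with a purely hypothetical diffuse exponent of 0.1, is vanishingly small.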

Basis Set Contraction

In virtually all ab initio calculations carried out today, a basis set of contracted
Gaussians is used. In conventional methods where integrals are stored, considerable
thought goes into the issues of basis set compactness, i.e., the ability to describe the
orbitals as accurately as possible with the minimum number of basis functions. The
discussion above clearly shows that Gaussians do not resemble atomic orbitals very
closely, and they are not always used directly as basis functions in the expansion (9.1).
However, they have other properties that still make them very attractive as basis
functions from a computational point of view. In order to get functions which retain
that advantage but perform better in the LCAO expansion, linear combinations of
Gaussians can be used; the original basis set of simple Gaussians is contracted²⁵ (the
CGTO basis set). An atomic orbital, whose shape is suitable for physical/chemical
reasons, is thus expanded in a set of Gaussians, whose mathematical properties are
attractive from a computational point of view:

$$\chi_p = \sum_{a} C_{ap}\,\chi_a \tag{15.8}$$

24 M. Challacombe and J. Cioslowski, J. Chem. Phys. 100, 464 (1994).

The first contracted basis sets were designed based on atomic Hartree-Fock
wavefunctions, and with the purpose of facilitating molecular Hartree-Fock
calculations. Later, contraction schemes such as the well known STO-nG sets have
been designed to mimic the shape of STO functions.²⁶ If one assumes that a "true"
STO basis set would be superior to Gaussians - we have raised some doubt about that
assumption above - it would make a lot of sense to try to approximate the STO with an
expansion in a set of GTO functions. Such a fit can be done rather accurately as shown
below, and the main limitation to the usefulness of that approach appears to be that the
STO itself is not a perfect basis function.

[Figure: four panels showing an STO approximated by 1, 2, 3, and 6 GTO's.]

The STO-nG basis sets have gained considerable popularity in Hartree-Fock
calculations on large systems, since they are rather inexpensive to use. Other
contraction schemes have been developed specifically for the purpose of correlated
calculations.²⁷

25 M. Krauss, J. Chem. Phys. 38, 564 (1963); C.D. Ritchie and H.F. King, J. Chem. Phys. 47,
564 (1967); E. Clementi and D.R. Davis, J. Comput. Phys. 2, 223 (1967).
26 W.J. Hehre, R.F. Stewart, and J.A. Pople, J. Chem. Phys. 51, 2657 (1969).
27 J. Almlöf and P.R. Taylor, Adv. Quantum Chem. 22, 301 (1991).

Two quite different philosophies are advocated with regard to the contraction
scheme. When transforming from the larger, primitive GTO set to a smaller, contracted
CGTO set, the algorithms are clearly much simpler if the transformation is restricted in
such a way that each GTO contributes to exactly one CGTO. In that case, the
transformation is reduced to a series of small, independent summations within mutually
exclusive sets. This is called the segmented contraction scheme. In contrast, the
general contraction scheme makes no such assumptions, and allows each GTO within a
set to contribute to several CGTO's. The distinction is best illustrated graphically. We
may assume that integrals have been evaluated for a set of GTOs. Usually, both the
nuclear center and the angular form are the same for all GTOs in the group; they differ
only in their orbital exponents α. In the first, segmented scheme, the transformation
matrix from the GTO to the CGTO representation is sparse: each column contains
exactly one non-zero element. The general scheme, in contrast, has a non-sparse
transformation.²⁸

[Figure: GTO-to-CGTO transformation matrices for the segmented contraction, the general
contraction, and the modified general contraction.]

Chemically, the contraction of a basis represents an attempt to get basis functions
closer in spirit to the LCAO concept. Mathematically, contraction constitutes a
projection of the one-electron basis $\{\chi_a\}$ onto the smaller, but physically more
reasonable basis $\{\chi_p\}$. Ideally, such contraction can be done without any major
deterioration of the wavefunction quality.
One considerable advantage of the general contraction scheme is that the
CGTOs reproduce exactly the desired combinations of primitive functions. For
example, if an atomic SCF calculation is used to define the contraction coefficients in a
general contraction, the resulting minimal basis will reproduce the SCF energy obtained
in the primitive basis. This is not the case with segmented contractions.

28 R. C. Raffenetti, J. Chem. Phys. 58, 4452 (1973).

There are other advantages with a general contraction: for example, it is possible
to contract inner-shell orbitals to single functions with no error in the atomic energy,
making calculations on heavy elements much easier. Another advantage is a conceptual
one, much exploited by Ruedenberg and co-workers.²⁹ Using a general contraction, it
is possible to perform calculations in which the one-particle space is a set of atomic
orbitals, a true LCAO scheme, rather than being a segmented grouping of a somewhat
arbitrary expansion basis. The MOs can then be analyzed very simply, just as for the
original qualitative LCAO MO approach, but in terms of "exact AOs" rather than crude
approximations.
Clearly, contraction reduces the number of basis functions quite significantly.
With an STO-3G basis, as an example, the reduction in size from the primitive basis is a
factor of 3, corresponding to a reduction factor of 81 in the number of two-electron
integrals - a significant reduction indeed, if one uses an algorithm that requires storage
and extensive handling of these integrals after their evaluation.
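
The contraction (15.8) and the integral count can be illustrated with a few lines of NumPy. The sketch assumes normalized s-type primitives on a common center, for which the overlap is (2 sqrt(ab)/(a+b))^(3/2), and stores the contraction coefficients as `C[a, p]` with one row per primitive, so that a segmented contraction has exactly one non-zero entry per row. The function names are hypothetical, and any exponents or coefficients fed to them are the user's own illustrative values rather than a published basis set.

```python
import numpy as np

def primitive_s_overlap(exponents):
    """Overlap matrix of normalized s-type Gaussians sharing one center."""
    a = np.asarray(exponents, dtype=float)
    return (2.0 * np.sqrt(np.outer(a, a)) / np.add.outer(a, a)) ** 1.5

def contract(M_primitive, C):
    """Transform a one-electron matrix from the GTO to the CGTO basis, cf. (15.8)."""
    return C.T @ M_primitive @ C

def is_segmented(C):
    """True if every primitive (row of C) contributes to exactly one CGTO."""
    return bool(np.all(np.count_nonzero(C, axis=1) == 1))

def n_unique_two_electron_integrals(N):
    """Number of symmetry-unique integrals (pq|rs) for N basis functions."""
    M = N * (N + 1) // 2
    return M * (M + 1) // 2
```

Because the number of unique integrals grows essentially as N^4/8, reducing N by a factor of 3 reduces the count by a factor approaching 3^4 = 81; for example, the ratio between N = 300 and N = 100 is already close to 80.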
In a Hartree-Fock calculation in the direct SCF spirit no such storage of
integrals is necessary, and one of the original arguments for contraction becomes
obsolete. The question is then: is it better to work in the original basis of primitive
GTOs, or should one still contract to reduce the size of the basis set? (Remember that
we need to calculate integrals involving all the primitive GTO basis functions whether
or not we choose to contract them.) One can readily compute the number of arithmetic
operations needed for the two scenarios, and it is quite clear that contraction reduces the
number of operations in the build-up of the Fock matrix, while a few extra operations
are required for the contraction itself. The comparisons are complicated by the fact that
an uncontracted scheme leads to a much clearer and simpler structure of the algorithms,
for which the computer implementation is expected to run more efficiently. The current
conventional wisdom appears to be that contraction pays off on a conventional vector
supercomputer, whereas the totally uncontracted GTO basis set has an advantage on
workstations and massively parallel hardware. One must keep in mind, however, that
these considerations depend on the latest news in a chaotically evolving hardware
market, and any definitive conclusion of this type is likely to be outdated soon.

Computational chemistry is generally jargon-ridden, and this is particularly
obvious when we discuss basis sets. A plethora of different naming conventions have
been introduced, attempting to describe all the different aspects and special features of
a basis set with a single acronym. We are not going to discuss all different naming
conventions here, but we will mention some of the more common.

29 M. W. Schmidt and K. Ruedenberg, J. Chem. Phys. 71, 3951 (1979).



• A minimal basis set is one that has a single basis function corresponding to each
of the atomic orbitals that are occupied in the atom. It is the smallest set that one can
reasonably use in any calculation, and one should not expect any quantitative accuracy
with such a basis.

• The double-zeta basis set consists of two basis functions per atomic orbital, and
is thus twice as large as the minimal. The name stems from the tradition of STO-type
basis functions, where the symbol ζ ("zeta") is traditionally used for the exponential
factor. In the same spirit, basis sets of triple-zeta, quadruple-zeta, etc. quality can
be constructed.

• The split-valence basis is of double-zeta quality for the valence atomic orbitals, and
a minimal basis in all the other atomic orbitals.

The basis sets constructed to parallel the occupied atomic orbitals may constitute
a good start, but for accurate calculations they are generally insufficient. The "atoms-
in-molecules" notion underlying the LCAD approach is only approximate, and in a
realistic situation the atoms are significantly modified as they come together to form the
molecule. To account for this phenomenon within the framework of LCAD sets, we
must introduce the notion of polarization. It is easiest to understand the concept of
polarization functions if we consider a Gaussian basis set:
It is plausible that, as the atom in a molecule experiences the field from the
surrounding atoms, the electrons on that atom may show a tendency to shift away
slightly from the center of the nucleus. In other words, the optimum center around
which the basis functions are expanded may not coincide exactly with the position of
the nucleus. As long as that deviation is small, we can express it with the help of

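$$\chi(\mathbf{r} + \Delta\mathbf{r}) = \chi(\mathbf{r}) + \nabla\chi(\mathbf{r}) \cdot \Delta\mathbf{r} \tag{15.9}$$

or, in terms of the individual components:

$$\Delta\chi = \Delta x\,\frac{\partial\chi}{\partial x} + \Delta y\,\frac{\partial\chi}{\partial y} + \Delta z\,\frac{\partial\chi}{\partial z} \tag{15.10}$$

With Gaussian basis functions, we can easily show that

(15.11)

and, with $\chi^{GTO}$ from (15.4),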



$$\frac{d}{dx}\chi_x^{GTO} = \frac{d}{dx}\left[(x - x_a)^k\, e^{-\alpha (x - x_a)^2}\right] \tag{15.12}$$

The conclusion from this little mathematical exercise is simple and obvious: If
we want to describe the displacement of the electron density from a situation where it is
centered symmetrically around the nucleus (as in the atom), we need to supply basis
functions with higher and lower L-quantum numbers than the original ones, but with the
same orbital exponents. The lower L-values are normally present in the basis anyway,
and it is sufficient to supply basis functions with higher quantum numbers.
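Carrying out the differentiation of a Cartesian Gaussian factor explicitly shows where the higher and lower quantum numbers come from. The step below uses only the product rule, in the notation of this section (power $k$, exponent $\alpha$, center $x_a$); it is a worked illustration rather than a quotation of the original equations.

```latex
\frac{d}{dx}\left[(x-x_a)^{k}\, e^{-\alpha (x-x_a)^2}\right]
  = k\,(x-x_a)^{k-1}\, e^{-\alpha (x-x_a)^2}
  - 2\alpha\,(x-x_a)^{k+1}\, e^{-\alpha (x-x_a)^2}
```

The first term is a Gaussian with the power lowered by one, the second a Gaussian with the power raised by one, and both carry the original exponent $\alpha$.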

We sometimes study molecules for which we expect the charge distribution to
be considerably more diffuse than in the neutral atom. This is especially true for
negatively charged species, or polar systems where a part of the molecule can be
expected to carry an excess negative charge. In this case, it is often advantageous to
augment the basis set with diffuse functions, i.e. functions that have smaller orbital
exponents than those normally used. Diffuse functions are also helpful in calculations
where an accurate account of the outer region of the charge density cloud is essential,
such as in the calculation of high order moments and polarizabilities.
Some conventions in describing standard primitive basis sets are common,
especially for the first- and second-row atoms - though far from universally accepted.
There are a number of options with regard to how the orbital exponents and contraction
coefficients are determined:
The contraction coefficients may be determined in accordance with the variation
principle, such that they minimize the total energy for the atom; or they may be
determined to get the best fit to a Slater-type function. The latter approach could make
some sense for minimal basis sets, and the popular STO-nG sets fall in that category.
In each case one might choose to optimize also the orbital exponents, or to
select them as an even-tempered sequence of n exponents:

$$\alpha_i = \alpha_{min}^{(n-i)/(n-1)}\, \alpha_{max}^{(i-1)/(n-1)} \quad \text{for } i = 1, 2, \ldots, n \tag{15.13}$$

where the minimum and maximum values $\alpha_{min}$ and $\alpha_{max}$ can be optimized. There is
also a choice with regard to restricting the exponents to be the same for each shell of
basis functions ($\alpha_s = \alpha_p = \alpha_d$), or optimizing them freely without those restrictions.
The latter is clearly better from the perspective of the variation principle, whereas the
former, constrained approach offers significant computational simplifications.
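An even-tempered sequence is simply a geometric progression between the smallest and the largest exponent. The sketch below generates such a sequence from Eq. (15.13); the function name `even_tempered` and the numerical values are illustrative assumptions, not optimized exponents.

```python
def even_tempered(alpha_min, alpha_max, n):
    """alpha_i = alpha_min**((n-i)/(n-1)) * alpha_max**((i-1)/(n-1)), i = 1..n."""
    if n < 2:
        raise ValueError("need at least two exponents")
    return [alpha_min ** ((n - i) / (n - 1)) * alpha_max ** ((i - 1) / (n - 1))
            for i in range(1, n + 1)]

exponents = even_tempered(0.1, 1000.0, 5)
ratios = [b / a for a, b in zip(exponents, exponents[1:])]
print(exponents)  # runs from alpha_min to alpha_max
print(ratios)     # constant ratio (alpha_max / alpha_min) ** (1 / (n - 1))
```

Only two parameters have to be optimized regardless of the number of exponents, which is the practical attraction of the scheme.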

Accurate primitive sp basis sets for first-row atoms have been available for some
time in the compilation of van Duijneveldt.30 His large (13s 8p) sets reproduce
numerical Hartree-Fock atomic energies to within 0.5 mEh in the worst case (Ne).
These primitive sets are suitable for the generation of ANOs. For even higher
accuracy, Partridge has generated sets of size up to (18s 13p) for the first-row atoms.
For heavier elements, basis sets of accuracy similar to (13s 8p) for the first row have
only recently become available.31 Of course, before contraction it is necessary to
supplement these sp sets with polarization functions.
The conventional nomenclature with regard to basis sets is quite precise. A
notation such as 6-31G denotes a basis set where six primitive Gaussians have been
used to describe each of the core orbitals, whereas the valence orbitals are represented
by two basis functions - the inner one expanded in three Gaussians, the outer one
uncontracted. It is usual to leave the most diffuse basis functions uncontracted - the
outer part of the valence is so strongly distorted from the atomic picture that flexibility
is more important than atomic resemblance. If we want to indicate that polarization
functions have been added to the basis we augment it with an asterisk. As a practical
matter, the hydrogen atoms are often treated differently from the other atoms in a
molecule with regard to the choice of basis set, and polarization functions are not
always added to the hydrogen atoms. Thus, for a set with polarization on all atoms we
add two asterisks, 6-31G**. The diffuse functions are handled the same way; a '+'
denotes that diffuse functions have been added, '++' ensures diffuse functions on all
atoms. A symbol such as 6-311G**+ would thus be interpreted as follows:

a: Each atomic core orbital is represented by one basis function, expanded in six
primitive Gaussians.

b: Each atomic valence orbital is represented by three basis functions, the tightest
expanded in three Gaussians, the other two uncontracted.

c: A set of uncontracted polarization functions has been added on each atom (p-orbitals
on hydrogen, d-orbitals on all other atoms).

d: A set of diffuse functions (with the same l-values as those occurring in the valence
orbitals) has been added on all non-hydrogen atoms.
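The reading of such a symbol is mechanical enough to be written down as a small parser. The sketch below is purely illustrative: `parse_pople` is a hypothetical helper, it accepts the '+' either before the G or after the asterisks, and it covers only the simple patterns discussed above.

```python
import re

def parse_pople(label):
    """Split a label such as '6-31G**' into its parts (illustrative parser only)."""
    m = re.fullmatch(r"(\d)-(\d+)(\+{0,2})G(\*{0,2})(\+{0,2})", label)
    if m is None:
        raise ValueError(f"unrecognized label: {label}")
    core, valence, plus_pre, stars, plus_post = m.groups()
    plus = plus_pre or plus_post
    return {
        "core_primitives": int(core),
        # one entry per valence basis function: number of primitives in each
        "valence_primitives": [int(ch) for ch in valence],
        "polarization_on_heavy_atoms": len(stars) >= 1,
        "polarization_on_hydrogen": len(stars) == 2,
        "diffuse_on_heavy_atoms": len(plus) >= 1,
        "diffuse_on_hydrogen": len(plus) == 2,
    }

print(parse_pople("6-31G"))
print(parse_pople("6-311G**+"))
```

For 6-311G**+ the parser reports six core primitives, valence functions built from 3, 1 and 1 primitives, polarization on all atoms, and diffuse functions on the non-hydrogen atoms only, in line with points a to d.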

30 F.B. van Duijneveldt, IBM Research Report RJ 945 (IBM, San Jose, 1971).
31 H. Partridge, J. Chem. Phys. 87, 6643 (1987); ibid. 90, 1043 (1989).

16. Integral Evaluation.

One of the most attractive features of Gaussian basis functions is the
separability into Cartesian components as illustrated in (15.5), making them especially
useful for work on polyatomic systems. It provides a mathematically elegant and
computationally efficient transition from the spherical symmetry of the atom, naturally
represented in a polar coordinate system, to the more general Cartesian representation
which is useful for describing molecular geometries. Another, equally important
reason for the efficacy of a Gaussian basis set is the fact that a two-center product of
Gaussians can be expressed as a short expansion of one-center Gaussians - the
Gaussian Product Theorem (GPT):

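$$\chi_{ax}\, \chi_{bx} = \sum_{i=0}^{l_a+l_b} C_i^{l_a+l_b}\, \varphi_{pi}(x - x_p) \tag{16.1}$$

with

(16.2)

$$x_p = \frac{\alpha_a x_a + \alpha_b x_b}{\alpha_a + \alpha_b} \tag{16.3}$$

$$\varphi_{pi}(x) = x^i\, e^{-\alpha_p (x - x_p)^2} \tag{16.4}$$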

In a geometrical interpretation, the GPT states that the product of two Gaussian
functions (with arbitrary polynomial factors) can be expressed as a finite sum of new
Gaussians, all centered in one point P on the line connecting A and B.
Virtually all schemes for evaluating integrals over
Gaussian basis functions rely on the GPT. Its most important consequence is that all
two-electron integrals can now be expressed in terms of two-center quantities (although
one must keep in mind that the work is still of the order $N^4$). We thus get, for a general
four-center, two-electron integral over the electron repulsion operator:

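$$\begin{aligned}
(ab|cd) = {} & \sum_{i_1=0}^{l_{a1}+l_{b1}} \sum_{j_1=0}^{m_{a1}+m_{b1}} \sum_{k_1=0}^{n_{a1}+n_{b1}}
  C_{i_1,x}^{l_{a1}+l_{b1}}\, C_{j_1,y}^{m_{a1}+m_{b1}}\, C_{k_1,z}^{n_{a1}+n_{b1}} \\
 & \times \sum_{i_2=0}^{l_{a2}+l_{b2}} \sum_{j_2=0}^{m_{a2}+m_{b2}} \sum_{k_2=0}^{n_{a2}+n_{b2}}
  C_{i_2,x}^{l_{a2}+l_{b2}}\, C_{j_2,y}^{m_{a2}+m_{b2}}\, C_{k_2,z}^{n_{a2}+n_{b2}}\,
  (\varphi_{i_1,j_1,k_1} | \varphi_{i_2,j_2,k_2})
\end{aligned} \tag{16.5}$$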
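The simplest instance of the GPT, two s-type Gaussians in one dimension, is easy to check numerically. The sketch below compares the product of two Gaussians on a grid with a single Gaussian centered at the weighted mean of the two centers; the function names and the numerical values are illustrative assumptions.

```python
import math

def gaussian(x, alpha, center):
    return math.exp(-alpha * (x - center) ** 2)

def product_center(alpha_a, xa, alpha_b, xb):
    """Exponent, center and prefactor of the single Gaussian equal to the product."""
    alpha_p = alpha_a + alpha_b
    xp = (alpha_a * xa + alpha_b * xb) / alpha_p
    prefactor = math.exp(-alpha_a * alpha_b / alpha_p * (xa - xb) ** 2)
    return alpha_p, xp, prefactor

alpha_a, xa, alpha_b, xb = 0.8, -0.5, 1.7, 1.2
alpha_p, xp, k_ab = product_center(alpha_a, xa, alpha_b, xb)

# Largest deviation between the two-center product and the one-center Gaussian.
max_error = max(
    abs(gaussian(x, alpha_a, xa) * gaussian(x, alpha_b, xb)
        - k_ab * gaussian(x, alpha_p, xp))
    for x in [i * 0.1 - 5.0 for i in range(101)]
)
print(xp, max_error)  # the error is at rounding level
```

The new center lies between the two original ones, closer to the function with the larger exponent, and the prefactor decays as a Gaussian in the distance between the centers.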

In one of the most common and efficient approaches currently used for the
evaluation of integrals over Gaussian basis functions,32 Hermite Gaussian functions
(HGFs) are used instead of the usual Cartesian Gaussians for the re-expansion (16.1).
A Hermite Gaussian is defined as

$$\Lambda_i(\xi) = H_i(\xi)\, \exp(-\alpha_p (x - x_p)^2) = (-1)^i \frac{d^i}{d\xi^i} \exp(-\xi^2) \tag{16.6}$$

where $H_i(\xi)$ is a Hermite polynomial of order $i$, and

$$\xi = \alpha_p^{1/2} (x - x_p) \tag{16.7}$$

The set of HGFs $\{\Lambda_i\}$ spans the same space as the expansion functions $\{\varphi_p\}$ in (16.2),
and as a consequence they can be used for expanding the basis function products:

$$\chi_{ax}(x)\, \chi_{bx}(x) = \sum_{i=0}^{l_a+l_b} C_i^{l_a+l_b}\, \Lambda_i(\xi) \tag{16.8}$$

where the expansion coefficients now need to be redefined. Because of the natural
relations between Hermite polynomials and Gaussians,33 the two-center integrals in
(16.5) can be evaluated with unique efficiency. Here, we will not go into the
technicalities of how these integrals are evaluated in detail, but the expression for the
simplest two-electron integral involving four s-type Gaussians may serve as an
illustration of the complexity of the problem. With an s-type Gaussian defined as

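$$G_a(\alpha_a, \mathbf{A}) = e^{-\alpha_a (\mathbf{r} - \mathbf{A})^2} \tag{16.9}$$

we get the following expression for the two-electron integral:

$$\begin{aligned}
(ab|cd) = (G_a G_b | G_c G_d)
 &= \int\!\!\int \exp\left[-\alpha_a(\mathbf{r}_1-\mathbf{A})^2 - \alpha_b(\mathbf{r}_1-\mathbf{B})^2
    - \alpha_c(\mathbf{r}_2-\mathbf{C})^2 - \alpha_d(\mathbf{r}_2-\mathbf{D})^2\right]
    \frac{1}{r_{12}}\, d\mathbf{r}_1\, d\mathbf{r}_2 \\
 &= S_{ab}\, S_{cd}\, 2\sqrt{\frac{W}{\pi}}\, F_0(T)
\end{aligned} \tag{16.10}$$

where

$$S_{ab} = \left(\frac{\pi}{\alpha_p}\right)^{3/2} \exp\left[-\frac{\alpha_a \alpha_b}{\alpha_p} (\mathbf{A} - \mathbf{B})^2\right] \tag{16.11}$$

$$T = W (\mathbf{P} - \mathbf{Q})^2 \tag{16.12}$$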

32 L.E. McMurchie and E.R. Davidson, J. Comput. Phys. 26, 218 (1978); see also: L.E.
McMurchie, Ph.D. Thesis (University of Seattle) (1977).
33 For an excellent discussion, see V.R. Saunders, in: "Methods in Computational Molecular
Physics", NATO ASI Ser. C, Vol. 113 (Eds. G.H.F. Diercksen and S. Wilson), Reidel, Dordrecht
(1983), p. 1.

(16.13)

$$\alpha_p = \alpha_a + \alpha_b \tag{16.14}$$

and

$$F_0(T) = \int_0^1 e^{-u^2 T}\, du = \frac{1}{2}\sqrt{\frac{\pi}{T}}\; \mathrm{erf}(T^{1/2}) \tag{16.15}$$

Given the complexity of the integrand in (16.10), one should perhaps be surprised that
the integral can be solved analytically at all.
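Once the closed form is known, the integral over four s-type Gaussians takes only a few lines of code. The sketch below follows Eqs. (16.10), (16.11), (16.14) and (16.15) for unnormalized Gaussians. Two points are assumptions made here because they are not spelled out above: $W = \alpha_p \alpha_q / (\alpha_p + \alpha_q)$, and P and Q are the exponent-weighted centers of the two products. The function names are illustrative.

```python
import math

def boys_f0(t):
    """F0(T) = integral from 0 to 1 of exp(-T u^2) du, Eq. (16.15)."""
    if t < 1e-8:
        return 1.0 - t / 3.0  # leading terms of the Taylor series around T = 0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))

def pair(alpha_a, A, alpha_b, B):
    """Combined exponent, center and S_ab of Eq. (16.11) for two s-type Gaussians."""
    alpha_p = alpha_a + alpha_b
    P = tuple((alpha_a * a + alpha_b * b) / alpha_p for a, b in zip(A, B))
    r2 = sum((a - b) ** 2 for a, b in zip(A, B))
    s_ab = (math.pi / alpha_p) ** 1.5 * math.exp(-alpha_a * alpha_b / alpha_p * r2)
    return alpha_p, P, s_ab

def eri_ssss(a, A, b, B, c, C, d, D):
    """(ab|cd) over unnormalized s-type Gaussians, Eq. (16.10)."""
    alpha_p, P, s_ab = pair(a, A, b, B)
    alpha_q, Q, s_cd = pair(c, C, d, D)
    w = alpha_p * alpha_q / (alpha_p + alpha_q)   # assumed definition of W
    t = w * sum((p - q) ** 2 for p, q in zip(P, Q))
    return s_ab * s_cd * 2.0 * math.sqrt(w / math.pi) * boys_f0(t)

origin = (0.0, 0.0, 0.0)
print(eri_ssss(1.0, origin, 1.0, origin, 1.0, origin, 1.0, origin))  # pi**2.5 / 4
print(eri_ssss(1.0, origin, 1.0, origin, 1.0, (0.0, 0.0, 3.0), 1.0, (0.0, 0.0, 3.0)))
```

With all four functions on the same center and unit exponents the formula reduces to $\pi^{5/2}/4$, which is a convenient check of the prefactors.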
It can be seen from (15.12) that basis functions with higher quantum numbers
can be generated through repeated differentiation of an s-type Gaussian. Consequently,
expressions for integrals involving any Cartesian Gaussian can be obtained by
differentiating (16.10), according to Leibnitz' theorem.
An important characteristic of the GPT is that when basis functions (shells)
with different L-values share the same orbital exponent and center ("family" basis sets)
the expansion functions $\{\varphi_p\}$ in (16.2) can be used for all members of the "family" with
lower L-values. We may consider as an example the products obtained by the
functions in two different p-shells: Nine products $\chi_a \chi_b$ can be formed, which requires a
total of ten new functions $\{\varphi_p\}$ for the re-expansion. These new functions $\{\varphi_p\}$ form
an s-, a p-, and a d-shell, all centered at $\mathbf{r} = \mathbf{r}_p$. However, the s-s, s-p and p-s
combinations of basis functions with the same orbital exponents can also be expanded
in the same set of expansion functions $\{\varphi_p\}$. In general, the full Cartesian set $\{\chi\}$ with
angular quantum number L contains $\frac{1}{2}(L+1)(L+2)$ functions. With all lower family
members included, the number would be $\frac{1}{6}(L+1)(L+2)(L+3)$. The re-expansion set
$\{\varphi_p\}$ is always a "family" set, with $\frac{1}{6}(L_1+L_2+1)(L_1+L_2+2)(L_1+L_2+3)$ members. It
follows that the gain in the number of quantities upon re-expanding the
products is insignificant for normal Cartesian shells, though there would still be a large
computational gain due to the reduced complexity of the expressions when going from
a four-center to a two-center formalism. A Cartesian f-shell has ten functions, thus 100
products $\chi_a \chi_b$ can be formed with two different f-shells, while the expansion set $\{\varphi_p\}$
would have 84 members. However, with all lower family members included the
number of functions is 20, leading to 400 distinctly different products, which can be
expanded in those same 84 functions.
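The counting in this paragraph can be reproduced directly from the two formulas. The helper names below are illustrative; the numbers are those quoted in the text.

```python
def cartesian_shell_size(L):
    """Number of Cartesian functions with angular quantum number L."""
    return (L + 1) * (L + 2) // 2

def family_size(L):
    """Number of functions in a shell L together with all lower family members."""
    return (L + 1) * (L + 2) * (L + 3) // 6

# Two p-shells: nine products, ten expansion functions.
print(cartesian_shell_size(1) ** 2, family_size(1 + 1))  # 9 10

# Two f-shells, without and with the lower family members.
L = 3
print(cartesian_shell_size(L) ** 2)  # 100 products from two plain f-shells
print(family_size(L) ** 2)           # 400 products from two f-families
print(family_size(L + L))            # 84 expansion functions in either case
```

The size of the expansion set depends only on $L_1 + L_2$, so including the lower family members quadruples the number of products covered at no extra cost in expansion functions.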
The use of a family basis thus appears worthwhile, but the transformation back
to the four-center representation becomes a bottleneck. In Section 18 we will review
several different methods to exploit the advantages of the family basis while avoiding
those bottlenecks.

17. Prescreening of Integrals.

As discussed above, the evaluation of the electron repulsion integrals (16.5) is
(at least nominally) a task whose complexity grows with the fourth power of the size of
the system under consideration. Even with the rapid development of computer
hardware and software, this dependence severely limits the progress of the field.
Assuming a doubling of hardware performance every 18 months - a commonly
accepted rule of thumb in the industry - it would take six years to double the maximum
size of the system one can treat with today's state-of-the-art equipment. One of the
more common complaints from users of quantum-chemistry methods to those scientists
who develop these methods is the limited ability to treat large systems, and this state of
affairs is therefore deeply unfortunate. The situation is somewhat improved by the
impressive progress in algorithmic development during the last decades. In terms of
allowing larger systems to be studied, the impact of these new methods has been as
important as the hardware development.
However, it is not realistic to expect that improved efficiency in the evaluation
of integrals will bring the field forward at a satisfactory pace. Conservatively, it would
be desirable to perform routine calculations on systems with at least 100-200 atoms,
involving 2,000 basis functions or more. Such a calculation requires about $10^{12}$ to $10^{13}$
two-electron integrals to be evaluated, and it is clear that no hardware or software
development alone can make this a routine calculation within the foreseeable future.
The solution to the problem is therefore not only to generate the integrals more
efficiently, but to search for alternative algorithms that can avoid their evaluation
altogether whenever possible. Several such modifications of the direct SCF procedure
will be discussed below.
While it is quite obvious that the direct approach completely avoids
computational bottlenecks due to storage space limitations in the conventional approach,
it would appear to be much more demanding on CPU time at first sight. However, this
does not necessarily have to be the case in all situations. The integrals are not
computed until they are needed for contraction with the density matrix to form the Fock
matrix, and as a consequence their evaluation can be avoided if one can safely assume
that their contribution to the Fock matrix elements and the total energy are going to be
insignificant.34 That this is indeed going to have a significant effect in a large
calculation is demonstrated in the following graphs, where the contributions to the Fock
matrix from individual integrals are plotted for one small and one very large molecule.

34 J. Almlöf and P.R. Taylor, in: "Advanced Theories and Computational Approaches to the
Electronic Structure of Molecules", NATO ASI Ser. C, Vol. 133 (Ed. C. Dykstra), Reidel, Dordrecht
(1984), pp. 107-125.

[Figure: two panels, "Small Molecule, ⟨R⟩ = 3 Å" and "Large Molecule, ⟨R⟩ = 30 Å", each showing the
relative abundance of integrals as a function of their magnitude.]
Graphs showing the relative abundance of the two-electron integrals versus their magnitude for one
small and one large system.

The dotted lines in the graphs above indicate a suggested cutoff for practical
calculations - contributions smaller than a certain threshold do not make any difference
to the final result and can therefore be safely neglected. One should note that the
threshold must be set tighter in a large calculation, since the number of marginally
significant contributions grows faster than the total energy. Nevertheless, the message
is quite clear: For large molecules, the majority of the integrals can be neglected!
However, while it is certainly important to take advantage of this situation, it
cannot be exploited to its fullest without some further considerations. Consider a
calculation with some 2000 primitive basis functions - a calculation many would
consider routine today. Without any further simplifications, this calculation requires
the evaluation of about $2 \cdot 10^{12}$ integrals. Even if the magnitude of an integral could be
estimated in a couple of machine cycles (say, 10 ns), it would still take many hours to
carry out the tests, whatever the conclusion of those tests might be. Accordingly, the
tests must be carried out with some care, and the different techniques to eliminate the
evaluation of integrals as accurately and cheaply as possible constitute one of the major
challenges in contemporary method development. The art of evaluating integrals faster
and faster developed very rapidly in the 1970s and 1980s, but, for the reasons
mentioned, the focus of the development efforts has shifted to the various tricks required
to eliminate the calculation of integrals altogether.
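The estimate of "many hours" follows from simple arithmetic. In the sketch below the number of distinct integrals is taken as $N^4/8$, using the eightfold permutational symmetry of integrals over real basis functions, and the 10 ns per test is the figure assumed in the text.

```python
n_basis = 2000
n_integrals = n_basis ** 4 / 8        # eightfold permutational symmetry
test_time = 10e-9                     # seconds per magnitude estimate (figure from the text)
hours = n_integrals * test_time / 3600.0
print(f"{n_integrals:.1e} integrals, {hours:.1f} hours of testing")
# 2.0e+12 integrals, 5.6 hours of testing
```

Even a test that costs almost nothing per integral is therefore too expensive if it has to be applied to every integral individually, which is why the tests are applied to pairs and shells rather than to single integrals.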
To realize how the above ideas can be incorporated into a scheme for ab initio electronic
structure calculations, it is useful to first consider the prescreening of integrals normally
done in conventional large-scale LCAO work. As discussed in Section 16, the
expression for an integral over primitive Gaussians can be formally written as

$$(ab|cd) = S_{ab}\, S_{cd}\, T_{abcd} \tag{17.1}$$

where $S_{ab}$ is a radial overlap between orbitals $\chi_a$ and $\chi_b$, and $T_{abcd}$ is a slowly varying
angular factor. In many situations the product $S_{ab} S_{cd}$ thus constitutes a good estimate
of the magnitude of the integral, and it may seem attractive to use that product as an
estimate in screening out small integrals. However, since the product does not provide
a strict upper bound, a few integrals are sometimes eliminated from further

consideration by this screening even if their magnitude is well above the screening
threshold, and this can have very detrimental effects in a variational calculation. It is
preferable, therefore, to work with a strict upper bound for the magnitude of the
integrals. Such a bound can be obtained from the Schwarz inequality:

$$|(ab|cd)| \le \kappa_{ab}\, \kappa_{cd} \tag{17.2a}$$

where

$$\kappa_{ab} = \sqrt{(ab|ab)} \tag{17.2b}$$

In most cases the estimate (17.2a) is quite accurate, and requires only the
two-index quantities $\kappa_{ab}$, which can easily be precomputed and stored. The
quantity $\kappa_{ab} \kappa_{cd}$ can then be compared to a given threshold $\tau$, and the integral is
only evaluated if

$$\kappa_{ab}\, \kappa_{cd} \ge \tau \tag{17.3}$$

[Figure: log-log plot of integral time against the number of basis functions n.]
The figure shows the dependence of integral time on the number of basis functions for
model systems consisting of linear polyacetylenic chains.
Such prescreening drastically reduces the number of integrals that actually need to be
evaluated, and in calculations on large systems an N-dependence not much higher than
quadratic can be observed.35
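The effect of the test (17.3) can be illustrated with a toy model: a linear chain of identical, unnormalized s-type Gaussians. The sketch below is not a production algorithm. The spacing, exponent and threshold are arbitrary illustrative values, the closed form for the integrals is the one from Section 16 with $W = \alpha_p \alpha_q/(\alpha_p + \alpha_q)$ assumed, and the function names are hypothetical.

```python
import math

def boys_f0(t):
    if t < 1e-8:
        return 1.0 - t / 3.0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))

def eri_ssss_1d(alpha, xa, xb, xc, xd):
    """(ab|cd) for s-type Gaussians of equal exponent centered on the x axis."""
    p = 2.0 * alpha
    s_ab = (math.pi / p) ** 1.5 * math.exp(-0.5 * alpha * (xa - xb) ** 2)
    s_cd = (math.pi / p) ** 1.5 * math.exp(-0.5 * alpha * (xc - xd) ** 2)
    w = p * p / (p + p)
    t = w * (0.5 * (xa + xb) - 0.5 * (xc + xd)) ** 2
    return s_ab * s_cd * 2.0 * math.sqrt(w / math.pi) * boys_f0(t)

def screen(n_centers, spacing=2.5, alpha=0.5, tau=1e-8):
    """Count the unique integrals and those that survive the Schwarz test (17.3)."""
    xs = [i * spacing for i in range(n_centers)]
    pairs = [(a, b) for a in range(n_centers) for b in range(a + 1)]
    kappa = {(a, b): math.sqrt(eri_ssss_1d(alpha, xs[a], xs[b], xs[a], xs[b]))
             for a, b in pairs}
    total = kept = 0
    for i, ab in enumerate(pairs):
        for cd in pairs[: i + 1]:
            total += 1
            if kappa[ab] * kappa[cd] >= tau:
                kept += 1
    return total, kept

for n in (5, 10, 20, 40):
    total, kept = screen(n)
    print(n, total, kept)
```

In this model the total number of unique integrals grows roughly sixteenfold each time the chain is doubled, whereas the number that passes the test grows only about fourfold once the chain is long compared with the range of the overlaps.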
The departure from the formal N4 dependence is due to the smallness of the
integrals, as estimated by the radial overlaps (17.1) or two-center exchange integrals
(17.2). Clearly, these quantities fall off rapidly with distance, and the reduction is
therefore dependent on the overall shape of the molecule. Elongated molecules have
larger average distances for a given number of atoms, and the screening should be
expected to be most efficient for such systems. This is demonstrated in the following
graph, where we compare the efficiency of screening small contributions for different
types of systems:

[Figure: curves for linear molecules, planar systems and 3-D clusters; the vertical axis is a
derivative with respect to log N.]

The figure shows the deviation from the formal $N^4$ dependence for three different types of systems.
Note that with 1000 basis functions, even the trend for the 3-D clusters - which may appear modest on
this scale - amounts to saving a factor of nearly 1000 on the time.

35 M. Häser and R. Ahlrichs, J. Comput. Chem. 10, 104 (1989).



For small molecules a value for the threshold $\tau$ of $10^{-7}$ to $10^{-8}$ is usually
reasonable, but it must be tightened as larger systems are considered, due to the
inevitable accumulation of errors. Significant deviation from the $N^4$ dependence is
therefore only seen for extended systems, and a calculation on a molecule of chemically
interesting size may thus easily require $10^7$ to $10^9$ integrals to be evaluated even for
rather modest basis sets. In the early days of computational quantum chemistry, the
computation of the two-electron integrals was a major bottleneck, and consequently
they were stored for re-use whenever possible.
A screening based on the ideas discussed so far can be implemented in any
scheme relying on an explicit evaluation of two-electron integrals in an AO basis. In
addition, the direct approach to electronic structure offers a more powerful screening
criterion by considering how these integrals are used in each SCF iteration to evaluate
the energy and to build the Fock matrix. With a closed-shell Hartree-Fock scheme as a
prototype example, the Fock matrix elements are obtained from the density matrix and
the integrals as:

$$F_{ab} = h_{ab} + \sum_{cd} D_{cd} \left[ 2 (ab|cd) - (ac|bd) \right] \tag{17.4}$$

Due to the permutational symmetry of the integral expression the following integrals are
equal:

$$\begin{aligned}
(ab|cd) &= (cd|ab) = (dc|ba)^* = (ba|dc)^* \\
(ba|cd) &= (cd|ba) = (ab|dc)^* = (dc|ab)^*
\end{aligned} \tag{17.5}$$

and with real basis functions, which are almost always used in calculations on
polyatomic systems, the eight integrals in (17.5) are all identical. It is then only
necessary to calculate one integral in this redundant set, and the processing of a general
two-electron integral (ab|cd) therefore requires the operations outlined in (17.6) for a
closed-shell case:

$$F_{ab} := F_{ab} + 4 D_{cd}\, (ab|cd) \tag{17.6a}$$

$$F_{cd} := F_{cd} + 4 D_{ab}\, (ab|cd) \tag{17.6b}$$

$$F_{ac} := F_{ac} - D_{bd}\, (ab|cd) \tag{17.6c}$$

$$F_{ad} := F_{ad} - D_{bc}\, (ab|cd) \tag{17.6d}$$

$$F_{bc} := F_{bc} - D_{ad}\, (ab|cd) \tag{17.6e}$$

$$F_{bd} := F_{bd} - D_{ac}\, (ab|cd) \tag{17.6f}$$



It is evident from (17.6) that there will be a significant contribution to the Fock matrix
only if both the integral and at least one of the six density matrix elements in these
expressions are significantly different from zero.
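That one integral per redundant set is enough can be verified with a small numerical experiment. The sketch below is one possible bookkeeping, not necessarily the one used in any production code: for each unique integral it applies the Coulomb and exchange terms of (17.4) for every distinct index ordering in (17.5), which for a fully non-degenerate quartet amounts to the updates (17.6) together with their symmetric partners. Random numbers stand in for the integrals and the density, and all names are illustrative.

```python
import random

def permutations8(a, b, c, d):
    """The distinct index orderings related by the symmetry (17.5), real functions."""
    return {(a, b, c, d), (b, a, c, d), (a, b, d, c), (b, a, d, c),
            (c, d, a, b), (d, c, a, b), (c, d, b, a), (d, c, b, a)}

def add_integral(F, D, a, b, c, d, value):
    """Add every contribution of one unique integral (ab|cd) to the two-electron part of F."""
    for x, y, z, w in permutations8(a, b, c, d):
        F[x][y] += 2.0 * D[z][w] * value   # Coulomb term of (17.4)
        F[x][z] -= D[y][w] * value         # exchange term of (17.4)

def unique_quartets(n):
    """Canonical index quartets: a >= b, c >= d, pair (a,b) not below pair (c,d)."""
    pairs = [(a, b) for a in range(n) for b in range(a + 1)]
    for i, (a, b) in enumerate(pairs):
        for c, d in pairs[: i + 1]:
            yield a, b, c, d

random.seed(1)
n = 4
D = [[0.0] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1):
        D[a][b] = D[b][a] = random.uniform(-1, 1)

# Random "integrals" with the full permutational symmetry.
eri = {}
for q in unique_quartets(n):
    value = random.uniform(-1, 1)
    for perm in permutations8(*q):
        eri[perm] = value

# Build from the unique integrals only.
F_unique = [[0.0] * n for _ in range(n)]
for q in unique_quartets(n):
    add_integral(F_unique, D, *q, eri[q])

# Reference: the full double sum in (17.4), without the one-electron part.
F_full = [[sum(D[c][d] * (2.0 * eri[a, b, c, d] - eri[a, c, b, d])
               for c in range(n) for d in range(n))
           for b in range(n)] for a in range(n)]

max_diff = max(abs(F_unique[a][b] - F_full[a][b]) for a in range(n) for b in range(n))
print(max_diff)  # rounding-error level
```

Using a set of index orderings takes care of the degenerate cases (for example a = b) automatically, at the price of some overhead; efficient codes instead scale the integral by the appropriate degeneracy factor.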
Since the density matrix elements are known by the time the integral is to be
evaluated, they can be incorporated in the prescreening tests. The evaluation of an
integral is only necessary when the maximum contribution to the Fock matrix exceeds a
given threshold $\tau$:

$$(ab|cd)\, D_{max} \ge \tau, \tag{17.7a}$$

where

$$D_{max} = \max(4|D_{ab}|,\, 4|D_{cd}|,\, |D_{ac}|,\, |D_{bd}|,\, |D_{bc}|,\, |D_{ad}|) \tag{17.7b}$$

The screening criterion (17.7a) can then be replaced by

$$\kappa_{ab}\, \kappa_{cd}\, D_{max} \ge \tau \tag{17.8}$$

The test (17.8) places a rigorous bound on the maximum error allowed in the
calculated Fock matrix element. It should be noted that it is not practical to screen on
the contributions to the total energy:

$$\kappa_{ab}\, \kappa_{cd}\, \max(4|D_{ab} D_{cd}|,\, |D_{ac} D_{bd}|,\, |D_{bc} D_{ad}|) \ge \tau \tag{17.9}$$

While this would be a more powerful screening criterion in the sense that many more
integrals would be eliminated, it leaves an unmonitored error in the Fock matrix. As a
result the SCF calculation is not guaranteed to converge with the screening in (17.9),
and indeed does not do so except in trivial cases. However, Eq. (17.9) is still useful
when evaluating energy-related properties, e.g. the nuclear forces in calculations of
equilibrium geometries.
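The two density-weighted tests differ only in the weight that multiplies the Schwarz estimate. The sketch below puts (17.7b), (17.8) and (17.9) side by side; the function names and the small matrices are illustrative, and the matrices are assumed to be stored as nested lists.

```python
def d_max(D, a, b, c, d):
    """Largest density factor multiplying (ab|cd) in the updates (17.6), Eq. (17.7b)."""
    return max(4.0 * abs(D[a][b]), 4.0 * abs(D[c][d]),
               abs(D[a][c]), abs(D[b][d]), abs(D[b][c]), abs(D[a][d]))

def fock_test(kappa, D, a, b, c, d, tau):
    """True if the integral may change a Fock matrix element by tau or more, Eq. (17.8)."""
    return kappa[a][b] * kappa[c][d] * d_max(D, a, b, c, d) >= tau

def energy_test(kappa, D, a, b, c, d, tau):
    """Energy-based test of Eq. (17.9); not safe for driving the SCF iterations."""
    weight = max(4.0 * abs(D[a][b] * D[c][d]),
                 abs(D[a][c] * D[b][d]), abs(D[b][c] * D[a][d]))
    return kappa[a][b] * kappa[c][d] * weight >= tau

kappa = [[1.0, 1e-3], [1e-3, 1.0]]
D = [[1.0, 1e-4], [1e-4, 1e-6]]
tau = 1e-8
print(fock_test(kappa, D, 0, 1, 1, 1, tau))    # True: needed for the Fock matrix
print(energy_test(kappa, D, 0, 1, 1, 1, tau))  # False: negligible for the energy
```

The example integral is kept by the Fock-matrix test and dropped by the energy test, which is the situation the text warns about: the energy may be accurate while individual Fock matrix elements carry an uncontrolled error.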
Even though the integral prescreening may appear similar in a direct and a
conventional SCF scheme, the logistics of the two procedures are entirely different. In
the traditional approach certain integrals are eliminated once and for all. That procedure
modifies the functional dependence of the energy on the MO coefficients, and great care
must be taken to assure that a variational collapse is avoided. With the direct SCF any
tendency towards such a collapse leads to an increase of the pertinent density matrix
elements, automatically ensuring the evaluation of the critical integrals in the subsequent
iteration.
Usually, it is desirable to evaluate integrals in batches, corresponding to shells
of basis functions having the same centers, L-values, and orbital exponents. However,
of the different screening criteria suggested above, only the radial overlap products
pertain to full batches of integrals. With processing of integrals by shells, the test

based on Eq. (17.8) must rely on the maxima of the density matrices and exchange
integrals evaluated for each shell:

$$D_{AB} = \max_{a \in A,\, b \in B} |D_{ab}| \tag{17.10}$$

$$\kappa_{AB} = \max_{a \in A,\, b \in B} \sqrt{(ab|ab)} \tag{17.11}$$

where the indices A and B refer to shells, and a and b run over all functions in a shell.
These compressed matrices are of rather modest dimensions, and may conveniently be
kept in memory during integral evaluation even for very large basis sets.
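The compression to shell level is a simple maximum over blocks. In the sketch below `shell_maxima` is an illustrative helper and the shell layout (one s-shell followed by one p-shell) is a hypothetical example; the same function can be applied to a matrix of $\sqrt{(ab|ab)}$ values to obtain $\kappa_{AB}$.

```python
def shell_maxima(M, shells):
    """Compress a function-level matrix to shell level by taking the largest |M_ab|."""
    return [[max(abs(M[a][b]) for a in A for b in B) for B in shells]
            for A in shells]

# Hypothetical layout: one s-shell (function 0) and one p-shell (functions 1-3).
shells = [[0], [1, 2, 3]]
D = [[0.9, 0.1, -0.3, 0.0],
     [0.1, 0.4, 0.0, 0.0],
     [-0.3, 0.0, 0.5, 0.0],
     [0.0, 0.0, 0.0, 0.2]]
D_shell = shell_maxima(D, shells)
print(D_shell)  # [[0.9, 0.3], [0.3, 0.5]]
```

Because the maximum over a block can only overestimate the individual elements, a test based on the compressed matrices never discards a batch that contains a significant integral.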
Further, important savings in the number of calculated integrals may be
obtained by considering the relationship between Fock matrices of two consecutive
iterations. Again considering the closed-shell Hartree-Fock example, these matrices in
iterations (m) and (m-1) are:

$$F_{ab}^{(m)} = h_{ab} + \sum_{cd} D_{cd}^{(m)} \left[ 2(ab|cd) - (ac|bd) \right] \tag{17.12a}$$

$$F_{ab}^{(m-1)} = h_{ab} + \sum_{cd} D_{cd}^{(m-1)} \left[ 2(ab|cd) - (ac|bd) \right] \tag{17.12b}$$

which yields the recurrence relation (since the integrals are the same in every iteration)

$$F_{ab}^{(m)} = F_{ab}^{(m-1)} + \sum_{cd} \left[ D_{cd}^{(m)} - D_{cd}^{(m-1)} \right] \left[ 2(ab|cd) - (ac|bd) \right] \tag{17.13}$$

In supermatrix notation (see Section 13) Eq. (17.13) is expressed as:

$$\mathbf{F}^{(m)} = \mathbf{F}^{(m-1)} + \mathbf{P}\, \boldsymbol{\Delta}^{(m)} \tag{17.14}$$

where

$$\Delta_{cd}^{(m)} = D_{cd}^{(m)} - D_{cd}^{(m-1)} \tag{17.15}$$

This illustrates the fact that only those electron repulsion integrals are required
which are related to significant changes in the density matrix from one iteration to the
next. A screening criterion similar to those previously suggested could still be used,
substituting $\Delta$ for D in (17.9). Especially close to convergence, this criterion is very
efficient, reducing the number of required integrals to zero in the limit of full
convergence.
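The recurrence (17.13) rests only on the linearity of the two-electron part of the Fock matrix in the density. The sketch below checks this numerically; random numbers stand in for the integrals and densities, the one-electron part is omitted because it is the same in both iterations, and the names are illustrative.

```python
import random

def two_electron_part(D, eri, n):
    """G_ab = sum_cd D_cd [2(ab|cd) - (ac|bd)], the density-dependent part of (17.4)."""
    return [[sum(D[c][d] * (2.0 * eri[a][b][c][d] - eri[a][c][b][d])
                 for c in range(n) for d in range(n))
             for b in range(n)] for a in range(n)]

random.seed(0)
n = 3
# Random numbers stand in for the integrals; only linearity in D is used here.
eri = [[[[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        for _ in range(n)] for _ in range(n)]
D_old = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
# A nearly converged iteration: the new density differs only slightly from the old one.
D_new = [[D_old[a][b] + 1e-3 * random.uniform(-1, 1) for b in range(n)]
         for a in range(n)]
delta = [[D_new[a][b] - D_old[a][b] for b in range(n)] for a in range(n)]

G_old = two_electron_part(D_old, eri, n)
G_direct = two_electron_part(D_new, eri, n)      # full rebuild, Eq. (17.12a)
G_update = two_electron_part(delta, eri, n)      # contribution of the change only
G_incremental = [[G_old[a][b] + G_update[a][b] for b in range(n)] for a in range(n)]

max_diff = max(abs(G_direct[a][b] - G_incremental[a][b])
               for a in range(n) for b in range(n))
largest_update = max(abs(G_update[a][b]) for a in range(n) for b in range(n))
print(max_diff, largest_update)
```

The update term is of the size of the density change, so a screening based on the difference density discards more and more integrals as the iterations converge.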
The tests described above are instrumental in making the direct SCF a viable
approach for very large systems. One can actually proceed one step further, and only
calculate accurately those elements of the Fock matrix that would make a significant
contribution to the density in the next iteration. This approach is especially useful in

Dirac-Fock calculations (i.e. relativistic calculations in a four-component formalism), in
which many large and expensive contributions to the small-component part of the Fock
matrices contribute little to the final result.
Although quite useful, the incorporation of the density in the integral
prescreening as indicated in (17.7b) is not as powerful a tool as it might first seem. The
reason is that the two-index exchange integrals $\kappa_{ab}$ show a pattern very similar to that
of the density matrix elements $D_{ab}$, and their variation in size is largely parallel.
Therefore, integrals which cannot be eliminated simply due to their size by (17.3) will
usually have to be evaluated even with the criterion (17.8), since $D_{ab}$ and $D_{cd}$ will seldom
both be small in cases where $\kappa_{ab}$ and $\kappa_{cd}$ are both large. It is worth noticing that this is a
situation which pertains only to the Coulomb type contributions. For the exchange
terms, the occurrence of small elements in the D and $\kappa$ matrices will in general provide
independent elimination of integrals. In a first step to exploit that difference, the
criterion (17.8) above can be divided in two:

(17.16a)

$$\kappa_{AB}\, \kappa_{CD}\, \max(|D_{AC}|,\, |D_{BD}|,\, |D_{BC}|,\, |D_{AD}|) \ge \tau \tag{17.16b}$$

where the integral (ab|cd) must obviously be evaluated if either of the criteria in (17.16)
is fulfilled. As a consequence of the particular structure of D and $\kappa$ discussed above,
(17.16a) does not eliminate a large number of integrals in addition to those removed by
(17.3) only, whereas (17.16b) used alone would have eliminated a large number of
such integrals. As an example, a calculation on a 148 atom diamond-like carbon cluster
revealed that the number of remaining contributions required by criterion (17.16a)
exceeded the ones required by (17.16b) by more than one order of magnitude, a
difference that would also be reflected in the computer time requirement if the two types
of contributions were to be evaluated independently.
This observation opens possibilities for several interesting modifications of the
current computational schemes. The structure of the calculations can often be simplified
if Coulomb- and exchange-type contributions are evaluated separately. This would
only lead to an insignificant increase of effort for large molecules, since the Coulomb
part will dominate the calculation according to the above reasoning. Furthermore, since
Coulomb repulsion is a relatively simple (i.e. classical) form of interaction one should
be able to exploit simpler schemes for its evaluation than the conventional one based on
four-center integrals. We will discuss several such simplifications in the following
sections.
Incidentally, we note that this difference in the screening behavior is even more
pronounced if we consider a screening based on the contributions to the total energy in
Eq. (17.9) as discussed previously. Even though this screening cannot be used in the

SCF calculation for reasons discussed above, it is very useful when evaluating
properties such as nuclear forces ("gradients") and force fields.36 For such
applications, the experience with the test case discussed above indicates a difference of
nearly three orders of magnitude, suggesting that the exchange type contribution is
totally insignificant and that all efforts should be concentrated on finding simpler ways
to evaluate the Coulomb part. The evaluation of SCF gradients is naturally a "direct"
scheme since the integral derivatives are only needed once, and their contribution can be
processed on the fly without intermediate storage. The separation of Coulomb and
exchange contributions is not common, but it has been found to lead to
dramatic savings in computer time for medium- and large-size systems.

Finally, we should comment on a different type of screening that can
often serve a dual purpose: In addition to saving computer time through the elimination
of unnecessary integrals, it can actually help to improve the convergence of the SCF
procedure.
Trying to converge a calculation with a large basis set is often a frustrating
experience. In the Direct scheme where each iteration requires a substantial amount of
work, the encounter with the typically slow SCF convergence behaviour can be
especially disheartening. The convergence with large basis sets is often erratic, and a
certain amount of trial (and, alas, error) is common. It is then often an advantage to
start a calculation in a smaller basis. The converged results from such a calculation can
be used as a starting point for the larger one. To make this procedure as simple as
possible, one should contract the basis such that a subset of the basis constitutes a
reasonable minimal basis set.
This is most easily accomplished if the basis functions are assigned labels
according to increasing importance. We would start the calculations with only the most
essential ones (minimal basis). A screening parameter assigned to each shell of basis
functions would be compared to the current measure of convergence to decide whether
any of the integrals are evaluated in this iteration. As the calculation converges, more
and more basis functions are successively included, with the polarization and diffuse
functions being used only in the last few iterations.
To prevent erratic behavior during the SCF procedure, it is recommended to set
the one-electron integrals equal to zero when this procedure is used. This will be
interpreted as a linear dependency in the basis set when the procedures described in
Section 10 are used, and these basis functions are therefore completely eliminated from
the calculation.

36 See e.g. T. Helgaker and P. Jørgensen, in "Methods in Computational Molecular Physics", NATO
ASI Ser. B, Vol. 293 (Eds. S. Wilson and G.H.F. Diercksen), Reidel, Dordrecht (1992).

18. The Gaussian Product Basis

The philosophy of the direct SCF approach was based on the observation that
our integral processing ability had outgrown our capacity for storing and moving these
integrals, and as a consequence we reduced the storage requirement at the expense of
more computation. It is evident, though, that once the bottlenecks due to input/output
and disk space have been removed by a direct approach, the time needed for integral
evaluation will again be a bottleneck in electronic structure calculations on large
molecules. Much work is therefore needed to improve the efficiency with which these
integrals are evaluated.
The distribution of electron density occurring in a typical four-center, two-
electron integral is typically given as a product of basis functions. The GPT reduces
the electronic repulsion to two-center terms, and also reduces the number of integrals
significantly.
Once these relatively few quantities have been evaluated, they are expanded to a
much larger number of four-center electron-repulsion integrals. In a direct approach
these are immediately reduced again to a relatively small set, such as a Fock matrix or a
gradient vector. By far the most time-consuming part of this procedure is the
transformation and handling of the large intermediate set of four-center electron-
repulsion integrals. It would appear very desirable to circumvent this step, and directly
build the Fock matrix from the two-center integrals over Hermite functions in (16.5), or
even from the incomplete Gamma functions (16.15), without ever handling a single
four-center integral over primitive or contracted basis functions. It is evident that this
approach could lead to very significant reductions in the operation count. To some
extent, similar ideas have been implemented in current integral algorithms, where the
transformation from a primitive to a contracted basis is partly carried out before the
primitive integrals are fully evaluated.
A simple analysis of operation counts shows that it is more efficient to
construct the Coulomb part of the Fock matrix in a basis of these Hermite functions
(strictly speaking, this is a vector-, not a matrix representation of the Coulomb
potential) and transform to the conventional basis at the end of each iteration. Such a
transformation only requires $N^2$ work and can be done at essentially insignificant cost
as follows:
In accordance with the Gaussian Product Theorem (Eq. (16.1)), a two-center
product of primitive Gaussians can be expanded in the basis of one-center Gaussians
(Hermite or Cartesian):

$$\chi_a \chi_b = \sum_p C_p^{(a,b)}\, \xi_p \tag{18.1}$$

The Coulomb part of the Fock matrix can then be obtained as:

$$J_{ab} = \sum_{c,d} D_{cd}\,(ab|cd) = \sum_{c,d} D_{cd} \sum_p C_p^{(a,b)} \sum_q C_q^{(c,d)}\,(p|q) \tag{18.2}$$

In (18.2), (ab|cd) are the two-electron, four-center integrals over primitive Gaussians,
and the (p|q) are two-electron, two-center integrals over Hermite Gaussians according
to (16.5),

$$(p|q) = \iint \xi_p(\mathbf{r}_1)\, r_{12}^{-1}\, \xi_q(\mathbf{r}_2)\, d\mathbf{r}_1\, d\mathbf{r}_2 \tag{18.3}$$

The evaluation and processing of integrals for the Coulomb contributions to the Fock
matrix would then be carried out as

$$J_p = \sum_q D_q\,(p|q) \tag{18.4}$$

where

$$D_q = \sum_{c,d} C_q^{(c,d)}\, D_{cd} \tag{18.5}$$

is the representation of the density in the Hermite basis, which can be evaluated before
the beginning of an SCF iteration, outside the calculation of two-electron integrals.
Similarly, the back-transformation to the ordinary pair representation,

$$J_{ab} = \sum_p C_p^{(a,b)}\, J_p \tag{18.6}$$

is carried out at the end of the direct SCF iteration at insignificant cost.
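
The sequence (18.5), (18.4), (18.6) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the array sizes, the random coefficients `C`, the random density `D` and the stand-in matrix `V` for the integrals (p|q) are all hypothetical, and each pair ab or cd is treated as a single compound index. The sketch shows that contracting the density in the Hermite basis gives the same Coulomb vector as first forming every four-center integral.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_herm = 6, 10                  # hypothetical sizes

C = rng.normal(size=(n_pairs, n_herm))   # C[ab, p]: coefficients of Eq. (18.1)
A = rng.normal(size=(n_herm, n_herm))
V = A @ A.T                              # symmetric stand-in for (p|q)
D = rng.normal(size=n_pairs)             # density D_cd, one entry per pair cd

# Conventional route: expand to all four-center integrals, then contract (18.2).
eri = C @ V @ C.T                        # (ab|cd) for every pair of pairs
J_conventional = eri @ D

# Hermite route: never forms a four-center integral.
D_q = C.T @ D                            # Eq. (18.5), done before the integral loop
J_p = V @ D_q                            # Eq. (18.4)
J_hermite = C @ J_p                      # Eq. (18.6), done at the end of the iteration

print(np.allclose(J_conventional, J_hermite))
```

The two routes differ only in the order of the contractions, which is why the Hermite route can skip the large intermediate set of four-center integrals.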
The approach is akin to the idea of an 'auxiliary' product basis in the
"multiplicative" approximation, and can be used along with that procedure, which is also
a very time-saving device in large calculations. An additional benefit of this approach is
that all the low-L components of a shell of basis functions can be obtained at zero
additional cost. The case of four shells on different centers, all with L=2 (d-orbitals),
may serve as an illustration. Each pair would contain 36 products, and the full set of
four shells therefore generates 36 x 36 = 1296 integrals with a conventional approach.
We can instead use a set of 35 Hermite Gaussians to represent each pair. This would give
rise to only slightly fewer integrals (35 x 35 = 1225), but the most expensive part of the
integral calculation would be avoided through the use of the product functions $\{\xi_p\}$
as the basis of representation. In addition, these 35 functions would now also represent
the entire "family" basis, i.e. the p- and s-type basis functions on these centers with the
same exponent. Therefore, a total of ten functions on each center, or a total of 10,000
integrals, are accounted for with this method.
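
The counts quoted above can be reproduced with a short calculation. The sketch assumes Cartesian Gaussian shells (six d-components) and Hermite functions of all orders up to the sum of the two angular momenta; the helper names are hypothetical.

```python
def n_cartesian(L):
    """Number of Cartesian components in a shell of angular momentum L."""
    return (L + 1) * (L + 2) // 2

def n_hermite_up_to(L):
    """Number of Hermite Gaussians with total order 0..L on one center."""
    return (L + 1) * (L + 2) * (L + 3) // 6

d_shell = n_cartesian(2)                      # 6 Cartesian d-functions
pair_products = d_shell ** 2                  # products in one dd pair
conventional = pair_products ** 2             # four-center integrals for (dd|dd)
hermite_per_pair = n_hermite_up_to(2 + 2)     # Hermite functions for one dd pair
hermite_integrals = hermite_per_pair ** 2     # two-center integrals (p|q)
family = sum(n_cartesian(L) for L in range(3))  # s + p + d on one center
family_integrals = family ** 4                # integrals covered by the family basis

print(pair_products, conventional, hermite_per_pair,
      hermite_integrals, family, family_integrals)
```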

An approach based on a numerical quadrature scheme.

In a different attempt to avoid the evaluation of electron-repulsion integrals
altogether, we found in Section 16 that such an integral over Gaussian basis functions
can be expressed as a sum of reduced, incomplete gamma functions of the general form

$$f_n(t) = \int_0^1 u^{2n}\, e^{-t u^2}\, du \tag{18.7}$$
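
As an illustration of how few quadrature points the integral (18.7) needs, the following sketch evaluates it with an ordinary Gauss-Legendre rule. This is an assumption made for simplicity and is not the Rys quadrature itself. The result is compared with the closed form $f_0(t) = \tfrac{1}{2}\sqrt{\pi/t}\,\mathrm{erf}(\sqrt{t})$ and with the recursion $f_1(t) = [f_0(t) - e^{-t}]/(2t)$. Function names are hypothetical.

```python
import math
import numpy as np

def boys_quadrature(n, t, npts=12):
    """f_n(t) of Eq. (18.7) from an npts-point Gauss-Legendre rule on [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(npts)   # nodes and weights on [-1, 1]
    u = 0.5 * (x + 1.0)                            # map the nodes to [0, 1]
    return float(0.5 * np.sum(w * u ** (2 * n) * np.exp(-t * u ** 2)))

def boys0_closed_form(t):
    """Closed form of f_0(t) for t > 0."""
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))

t = 2.0
f0_ref = boys0_closed_form(t)
f1_ref = (f0_ref - math.exp(-t)) / (2.0 * t)
print(boys_quadrature(0, t) - f0_ref, boys_quadrature(1, t) - f1_ref)
```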

We further observe that the 'Rys polynomial' approach 37 can be viewed as a
postponement of this integration until some of the summations are carried out, at
which point the integral is evaluated through a Gaussian quadrature scheme. This
quadrature can be postponed even much further in a direct scheme, resulting in
important additional savings. The extreme consequence of this observation, but indeed
a fully realistic one, would be to build the entire Fock matrix as a loop over discrete
quadrature points, and carry out the quadrature at the very end of an SCF iteration:

$$\mathbf{F} = \int_0^1 \mathbf{F}(u)\, du \tag{18.8}$$

One can also envision schemes in which the integral is evaluated at an
intermediate stage of the Fock matrix construction. To see how this is possible in
practice, we consider two very commonly used relations in the evaluation of two-
electron integrals: 1) the transform of the electron-repulsion operator:

$$\frac{1}{r_{12}} = \frac{2}{\sqrt{\pi}} \int_0^\infty \exp(-r_{12}^2 u^2)\, du \tag{18.9}$$

which is obviously very useful when Gaussian basis functions are involved; and 2) the
separability of the Gaussian basis function itself:

37 M.J. Rys and H.F. King, J. Chem. Phys. 65, 111 (1976).

(18.10)

The expression for the two-electron integral thus becomes


(18.11)

where

(18.12)

If the integration over the auxiliary coordinate u is temporarily delayed, the separation
of the summations in the buildup of the Coulomb matrix enables a very efficient
evaluation of the sum:

$$J_{ab} = J_{a_x,a_y,a_z,b_x,b_y,b_z} = \frac{2}{\sqrt{\pi}} \int_0^\infty \Big\{ \sum_{\substack{c_x,c_y,c_z \\ d_x,d_y,d_z}} I^{(x)}(u)\, I^{(y)}(u)\, I^{(z)}(u) \Big\}\, du = \frac{2}{\sqrt{\pi}} \int_0^\infty j_{ab}(u)\, du \tag{18.13}$$

One can now postpone the integration over u until the very end of an SCF
iteration. Each matrix j(u) can then be built up with very few arithmetic operations due
to the effective separation between the x-, y- and z-parts of the integral. j(u) can be
constructed for 8-12 different quadrature points $u_k$ (in parallel if desired), and the final
matrix can be obtained through a simple numerical quadrature scheme:

$$\mathbf{J} = \frac{2}{\sqrt{\pi}} \int_0^\infty \mathbf{j}(u)\, du \tag{18.14}$$

carried out separately for each of the (non-zero) matrix elements. Again, while the
scheme has been described here in detail only for the Coulomb part of the Fock matrix,
the exchange can be treated in the same spirit and with similar gains in efficiency. The
approach allows a much more efficient factorization of the work done at the innermost
loop level, which more than compensates for the extra computations involving the
additional quadrature.
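
A minimal numerical sketch of this idea, restricted to two spherical (s-type) Gaussian charge distributions exp(-p|r-P|^2) and exp(-q|r-Q|^2), is given below. The one-dimensional factor used in the code, pi/sqrt(pq + (p+q)u^2) times exp(-pq u^2 X^2/(pq + (p+q)u^2)), is worked out here for the illustration from the transform (18.9) and is an assumption, not a formula quoted from the text. The substitution u = t/(1-t) and the Gauss-Legendre rule are likewise choices made only for this sketch. The result is compared with the standard closed form in terms of $F_0$.

```python
import math
import numpy as np

def coulomb_quadrature(p, P, q, Q, npts=80):
    """Repulsion of exp(-p|r-P|^2) and exp(-q|r-Q|^2) by quadrature over u."""
    x, w = np.polynomial.legendre.leggauss(npts)
    t = 0.5 * (x + 1.0)                 # nodes on (0, 1)
    u = t / (1.0 - t)                   # map (0, 1) to (0, infinity)
    du = 0.5 * w / (1.0 - t) ** 2       # weights including the Jacobian
    denom = p * q + (p + q) * u ** 2
    integrand = np.ones_like(u)
    for Pk, Qk in zip(P, Q):            # separate x, y and z factors
        integrand *= (math.pi / np.sqrt(denom)
                      * np.exp(-p * q * u ** 2 * (Pk - Qk) ** 2 / denom))
    return float(2.0 / math.sqrt(math.pi) * np.sum(du * integrand))

def coulomb_closed_form(p, P, q, Q):
    """Same integral from the closed form with the function F_0."""
    R2 = sum((a - b) ** 2 for a, b in zip(P, Q))
    T = p * q / (p + q) * R2
    F0 = 1.0 if T == 0.0 else 0.5 * math.sqrt(math.pi / T) * math.erf(math.sqrt(T))
    return 2.0 * math.pi ** 2.5 / (p * q * math.sqrt(p + q)) * F0

args = (1.2, (0.0, 0.0, 0.0), 0.7, (0.5, -0.3, 1.0))
print(coulomb_quadrature(*args), coulomb_closed_form(*args))
```

In a Fock-matrix build the sum over the density would be carried out inside the loop over quadrature points, before the final weighted sum over u.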

19. Approximate Three-Center Expansions.

The evaluation and processing of four-index, two-electron integrals constitutes
a significant bottleneck in many types of ab initio electronic structure calculations.
Several approaches based on approximate re-expansions have been suggested to
overcome this obstacle.38 The essence of all these methods is that they attempt to
(approximately) expand a product of basis functions in a new, auxiliary basis set $\{\xi_u\}$:

$$\chi_a \chi_b \approx \sum_p c_p^{(a,b)}\, \xi_p \tag{19.1}$$

Similar ideas are frequently used in density functional technology.39 Even when a
basis set is used in these calculations, the density is usually re-expanded in other basis
sets.
The form of (19.1) may look similar to Eq. (16.1), the Gaussian Product
Theorem. In (16.1), however, the expansion is exact and the expansion basis set is
specific for each product $\chi_a\chi_b$. The size of the expansion basis for that expansion is
therefore of the order $N^2$, N being the number of LCAO basis functions. In contrast,
the size of the set used in (19.1) is only of the order N, but the expansion is global, i.e.
the sum runs over all expansion functions $\xi_p$.
There are several ways to determine the expansion coefficients $c^{(a,b)}$ in (19.1).
If we define a residual function

$$R_{ab} = \chi_a \chi_b - \sum_p c_p^{(a,b)}\, \xi_p \tag{19.2}$$

we can apply several different criteria by which $R_{ab}$ is to be minimized. One of the
most obvious would be the minimization of the norm:

$$(R_{ab}\, R_{ab}) = \int R_{ab}(\mathbf{r})^2\, d\mathbf{r} \tag{19.3}$$

This leads to

$$c_p^{(a,b)} = \sum_q (ab\, q)\, (S^{-1})_{pq} \tag{19.4a}$$

or

$$\mathbf{c}^{(a,b)} = \mathbf{S}^{-1} \mathbf{a}^{(a,b)} \tag{19.4b}$$

where $a_q^{(a,b)} = (ab\, q)$ is a three-center overlap, and $S_{uv} = (u\, v)$ is an overlap in the
auxiliary basis. With the approximation (19.1), a two-electron integral would be
approximated either as

38 C. Van Alsenoy, J. Comp. Chem. 9, 620 (1988); A. Fortunelli and O. Salvetti, J. Comp. Chem.
12, 36 (1991); A. Fortunelli and O. Salvetti, Chem. Phys. Lett. 186, 372 (1991).
39 B.I. Dunlap, J.W.D. Connolly and J.R. Sabin, J. Chem. Phys. 71, 3396 (1979).

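$$(ij|kl) \approx \sum_p c_p^{(i,j)}\, b_p^{(k,l)} = \mathbf{c}^{(i,j)\dagger} \mathbf{b}^{(k,l)} = \mathbf{a}^{(i,j)\dagger} \mathbf{S}^{-1} \mathbf{b}^{(k,l)} \tag{19.5a}$$

or as

$$(ij|kl) \approx \sum_p b_p^{(i,j)}\, c_p^{(k,l)} = \mathbf{b}^{(i,j)\dagger} \mathbf{c}^{(k,l)} = \mathbf{b}^{(i,j)\dagger} \mathbf{S}^{-1} \mathbf{a}^{(k,l)} \tag{19.5b}$$

where $b_p^{(k,l)} = (kl|p)$ is a three-center, two-electron integral, and where we get
(19.5a) or (19.5b) depending on whether the first or second electron distribution is
expanded according to (19.1). If both electron distributions {ij} and {kl} are expanded,
we obtain

$$(ij|kl) \approx \sum_{p,q} c_p^{(i,j)}\, V_{pq}\, c_q^{(k,l)} = \mathbf{c}^{(i,j)\dagger} \mathbf{V} \mathbf{c}^{(k,l)} = \mathbf{a}^{(i,j)\dagger} \mathbf{S}^{-1} \mathbf{V} \mathbf{S}^{-1} \mathbf{a}^{(k,l)} \tag{19.6}$$

where we have introduced the two-center, two-electron integrals $V_{pq} = (p|q)$.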


These expressions could also have been obtained by using a conventional resolution of
the identity operator

$$1 = \sum_{p,q}^{\infty} |p\rangle\, (S^{-1})_{pq}\, \langle q| \approx \sum_{p,q}^{N} |p\rangle\, (S^{-1})_{pq}\, \langle q| \tag{19.7}$$

for one or both of the electrons in

$$(ij|kl) = (i(1)j(1)\,|\,k(2)l(2)), \qquad 1 \approx \sum_{p,q} |p(1)\rangle\, (S^{-1})_{pq}\, \langle q(1)|$$

where the approximate equality in (19.7) reflects the incompleteness of the finite
expansion basis.
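The minimization of the norm (19.3) is just one out of many reasonable criteria
for determining $c^{(a,b)}$. Minimizing the self-repulsion of the residual, i.e. minimizing

$$(R_{ab}|R_{ab}) = \iint R_{ab}(\mathbf{r}_1)\, r_{12}^{-1}\, R_{ab}(\mathbf{r}_2)\, d\mathbf{r}_1\, d\mathbf{r}_2 \tag{19.8}$$

would instead give:

$$c_p^{(a,b)} = \sum_q (ab|q)\, (V^{-1})_{pq} \tag{19.9a}$$

or

$$\mathbf{c}^{(a,b)} = \mathbf{V}^{-1} \mathbf{b}^{(a,b)} \tag{19.9b}$$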

When inserted in the integral expression, (19.9b) gives:

$$(ij|kl) \approx \mathbf{b}^{(i,j)\dagger} \mathbf{V}^{-1} \mathbf{b}^{(k,l)} \tag{19.10}$$

whether we have expanded electron distribution {ij}, {kl}, or both. The same
expression would be obtained by minimizing the error in the electrostatic potential
generated from the charge distribution $\chi_a\chi_b$.
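More generally, we may define a residual overlap $r_j = (B_j|R_{ab})$, where $\{B_j\}$ is
an arbitrary set of functions. Minimizing $Z = \sum_j r_j^2$ leads to

$$\mathbf{c}^{(a,b)} = (\mathbf{A}^\dagger \mathbf{A})^{-1} \mathbf{A}^\dagger \mathbf{d} \tag{19.11}$$

where $d_j = (B_j|ab)$ and $A_{jk} = (B_j|\xi_k)$. In the special case of $\{B_j\} = \{\xi_u\}$, we have
$\mathbf{A} = \mathbf{A}^\dagger = \mathbf{V}$ and we obtain Eq. (19.9) again.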
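To summarize, this suggests at least four different approximations of a four-
center, two-electron integral (ab|cd) based on three-center quantities:

$$(ab|cd) \approx \mathbf{a}^{ab\dagger} \mathbf{S}^{-1} \mathbf{V} \mathbf{S}^{-1} \mathbf{a}^{cd} = \sum_{tuvw} (ab\, t)\,(S^{-1})_{tu}\,(u|v)\,(S^{-1})_{vw}\,(w\, cd) \tag{19.12a}$$

$$(ab|cd) \approx \mathbf{a}^{ab\dagger} \mathbf{S}^{-1} \mathbf{b}^{cd} = \sum_{tu} (ab\, t)\,(S^{-1})_{tu}\,(cd|u) \tag{19.12b}$$

$$(ab|cd) \approx \mathbf{b}^{ab\dagger} \mathbf{S}^{-1} \mathbf{a}^{cd} = \sum_{tu} (ab|t)\,(S^{-1})_{tu}\,(cd\, u) \tag{19.12c}$$

$$(ab|cd) \approx \mathbf{b}^{ab\dagger} \mathbf{V}^{-1} \mathbf{b}^{cd} = \sum_{tu} (ab|t)\,(V^{-1})_{tu}\,(cd|u) \tag{19.12d}$$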
tu

For the approximation of two-electron integrals in Hartree-Fock calculations,
the approximation (19.12d) has been found to be superior to the other two.40 The
success of this expansion technique for diatomic systems suggests that it is sufficient to
use a local expansion in (19.1), i.e. to use a sparse (blocked) set of expansion
coefficients $\{c_p^{(a,b)}\}$, with coefficients only for the expansion functions $\xi_u$ on the two
centers where $\chi_a$ and $\chi_b$ are centered.


The choice of an expansion basis is certainly a topic that requires extensive and
lengthy experimentation. However, in order to reproduce SCF total energies with good
accuracy, it appears necessary to reproduce the behavior of the exact expansion near the
atomic nuclei. Therefore, the expansion basis set should include one-center products of
the original basis set, which account for by far the largest contributions to the total
electron density. This is most easily achieved if the basis set is uncontracted, in which
case the expansion basis set should have the sums of orbital exponents (and L-values)
from the original basis. In practice, it is sufficient to use an even-tempered expansion
ranging between the highest and lowest exponent given by this recipe.
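
The four three-center approximations of this section can be compared in a small linear-algebra model. In the sketch below, functions are represented as vectors in a finite space, the overlap is the ordinary dot product, and a random symmetric positive-definite matrix `W` stands in for the Coulomb operator. These are all assumptions made for illustration, and no real integrals are computed. The one property the model does reproduce exactly is that the Coulomb-metric fit leaves a residual that is Coulomb-orthogonal to the auxiliary basis, so its error in a self-repulsion (ab|ab) equals the non-negative self-repulsion of the residual.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n_aux = 40, 12                      # hypothetical model-space and auxiliary sizes

M = rng.normal(size=(m, m))
W = M @ M.T + m * np.eye(m)            # positive-definite stand-in for 1/r12
X = rng.normal(size=(m, n_aux))        # columns play the role of the xi_p
ab = rng.normal(size=m)                # a product chi_a chi_b
cd = rng.normal(size=m)                # a product chi_c chi_d

S = X.T @ X                            # auxiliary overlap (p q)
V = X.T @ W @ X                        # auxiliary repulsion (p|q)
a_ab, a_cd = X.T @ ab, X.T @ cd                # three-center overlaps
b_ab, b_cd = X.T @ W @ ab, X.T @ W @ cd        # three-center repulsion integrals

exact = ab @ W @ cd
approx = {
    "overlap fit, both sides": a_ab @ np.linalg.solve(S, V @ np.linalg.solve(S, a_cd)),
    "overlap fit, first pair": a_ab @ np.linalg.solve(S, b_cd),
    "overlap fit, second pair": b_ab @ np.linalg.solve(S, a_cd),
    "Coulomb fit": b_ab @ np.linalg.solve(V, b_cd),
}
for name, value in approx.items():
    print(f"{name:26s} error = {value - exact: .3e}")

# Coulomb-metric fit of ab: coefficients, residual and its self-repulsion.
c_V = np.linalg.solve(V, b_ab)
residual = ab - X @ c_V
self_error = ab @ W @ ab - b_ab @ c_V
print(self_error, residual @ W @ residual)
```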

40 O. Vahtras, J. Almlöf and M.W. Feyereisen, Chem. Phys. Lett. 213, 514 (1993).

20. The Semi-Classical Limit.

In localized systems, the exchange interactions fall off exponentially with
distance, whereas the classical (Coulomb) electrostatic interactions extend over a long
range. This difference in the two types of interaction lends additional support for the
suggestions to treat the Coulomb and exchange interactions differently in large systems,
as we discussed in Section 18. Some quite unorthodox recipes for ab initio calculations
on large systems can be conceived using such a combination of classical and quantum-
mechanical ideas. One can take advantage of the fact that the classical model is correct
in the macroscopic limit, introducing dramatic simplifications in calculations where the
interactions extend over more than a few Ångström in space.
In this spirit the long-range Coulomb interactions should be treated classically,
while long-range exchange disappears and short range interactions (Coulomb and
exchange) can be evaluated by more conventional means since their number does not
grow rapidly with the size of the system. The basic mode of description for the system
will still be in quantum-mechanical language, i.e., involving wavefunctions,
Hamiltonians, etc., but new expressions will be used when evaluating the interactions
entering our (effective) one-particle Hamiltonians.
In many cases the only significant non-classical interactions occur between
nearest and next-nearest neighbors, and the computational effort required for those
interactions would only increase linearly with the size of the molecule in the asymptotic
limit of a macroscopic system. One thus expects long-range Coulomb interaction to
ultimately dominate the construction of the Fock matrix. This is a very fortunate
situation, since these interactions can be evaluated with classical, and therefore very
inexpensive methods,41 in sharp contrast to the steep power-dependence seen for
traditional methods.
To evaluate the long-range Coulomb part of the Fock matrix efficiently, the
charge density of the system must be described in terms of larger units than the
individual basis functions and products thereof. A large molecular system can be
divided into suitable fragments by some well-defined but arbitrary recipe. The
electrostatic (Coulomb-like) potential generated by such a fragment at some distance
away from it can be expressed by a generalized multipole expansion relative to a single
point in space, conveniently taken as the center of the charge distribution for that
fragment. In addition to being arbitrary, the concept of a fragment can be dynamic,
such that in an iterative procedure the definition of the fragment would differ from one
iteration to the next. The screening techniques described in Section 17 provide a
powerful tool for avoiding the evaluation of insignificant contributions to the various

41 I. Panas and J. Almlöf, Intern. J. Quantum Chem. 40, 797 (1991); I. Panas, J. Almlöf and M.W.
Feyereisen, Intern. J. Quantum Chem. 42, 1073 (1992).

quantities in a calculation, with a minimum of explicit testing and administrative
overhead.
It is important to realize that the results obtained with this method would be
essentially identical to those obtained with standard techniques. While the
approximations and shortcuts introduced have a clear physical origin, they can always
be justified on strict numerical grounds. Introducing these classical approximations at
cutoff thresholds that give numerically different results would be a further level of
approximation. This could indeed be a very stable approximation with physically
meaningful and encouraging results, but it seems reasonable to first discuss the
approach with the thresholds set to exactly reproduce the full quantum-mechanical ab
initio results.
The potential from the charge distribution of a basis function product $\chi_a\chi_b$ can
be expressed by a multipole expansion around a center which is common to the
fragment (P) to which the product belongs:

$$V_{ab}(\mathbf{r}) = \sum_{l,m} C_{l,m}^{(a,b,P)}\, F_{l,m}(\mathbf{r} - \mathbf{r}_P) \tag{20.1}$$

where $C_{l,m}^{(a,b,P)}$ are the multipole moments of $\chi_a\chi_b$ evaluated around the center (P),
and $F_{l,m}(\mathbf{r} - \mathbf{r}_P)$ is the expansion of the electrostatic field at the point r due to that
multipole moment. For the interaction between non-penetrating charge distributions ab
and cd, this gives an approximation for a two-electron integral (ab|cd):

$$(ab|cd) \approx \sum_{l,m} C_{l,m}^{(a,b,P)} \sum_{l',m'} C_{l',m'}^{(c,d,Q)}\, I_{lm,l'm'}(\mathbf{R}_{PQ}) \tag{20.2}$$

where $I_{lm,l'm'}(\mathbf{R}_{PQ})$ is the interaction between the multipoles. While this expression
does not provide a very useful approximation for individual integrals, it has the
advantage that electrons 1 and 2 are now effectively decoupled. Various contractions,
transformations, and summation over densities can therefore be carried out at the one-
electron level. Obviously, the method has many similarities to the three-center
expansion discussed in Section 19, and the two can actually be combined with very
impressive results.
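
The quality of a truncated multipole description at long range can be illustrated with classical point charges. The sketch below is only an analogy under stated assumptions: a "fragment" is a set of eight random point charges in a unit cube (hypothetical data), the expansion uses Cartesian moments up to the quadrupole, and the exact potential is the direct Coulomb sum.

```python
import numpy as np

rng = np.random.default_rng(2)
pos = rng.uniform(-0.5, 0.5, size=(8, 3))   # charges of one hypothetical fragment
q = rng.uniform(-1.0, 1.0, size=8)
center = np.zeros(3)                        # expansion center of the fragment

def exact_potential(r):
    """Direct Coulomb sum over all point charges."""
    return float(np.sum(q / np.linalg.norm(r - pos, axis=1)))

def multipole_potential(r):
    """Monopole + dipole + traceless quadrupole expansion about `center`."""
    d = pos - center
    R = r - center
    Rn = np.linalg.norm(R)
    charge = q.sum()
    dipole = q @ d
    quad = (3.0 * np.einsum("i,ij,ik->jk", q, d, d)
            - np.eye(3) * np.sum(q * np.sum(d * d, axis=1)))
    return float(charge / Rn + dipole @ R / Rn ** 3 + 0.5 * R @ quad @ R / Rn ** 5)

for dist in (10.0, 20.0):
    r = np.array([dist, 0.0, 0.0])
    print(dist, exact_potential(r) - multipole_potential(r))
```

The neglected terms start at the octupole, so the error falls off as the fourth inverse power of the distance.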
The first step in a calculation along these lines would be to define a partitioning
of the system into fragments. This partitioning need not have any chemical
significance, since the approximations used are numerically monitored, but there are
likely to be computational advantages if it does. Based on a screening of radial
overlaps, the electron repulsion terms can be divided into short-, medium- and long-
range interactions. Short-range interactions must always be dealt with in terms of
explicit two-electron integrals over basis functions, but, as discussed above, their
number only increases linearly with size in an extended system. The long-range
Coulomb contribution to the Fock matrix can be obtained as follows: In the usual
expansion in terms of conventional two-electron integrals

$$J_{ab} = \sum_{c,d} D_{cd}\,(ab|cd) \tag{20.3}$$

the pairs 'ab' and 'cd' can now both be expressed in terms of multipole expansions (as
long as they do not penetrate into regions where both charge distributions are non-zero)

$$J_{ab} = \sum_{l,m} C_{l,m}^{(a,b,P)} \sum_{l',m'} \sum_{c,d} D_{cd}\, C_{l',m'}^{(c,d,Q)}\, I_{lm,l'm'}(\mathbf{R}_{PQ}) \tag{20.4}$$

The summation over the basis function pair cd can now be carried out before the actual
loop over electron-repulsion contributions:

$$\mu_{l',m'}^{(Q)} = \sum_{cd \in Q} D_{cd}\, C_{l',m'}^{(c,d,Q)} \tag{20.5}$$

giving quantities $\mu_{l',m'}^{(Q)}$ that can be interpreted as the multipole moments of the
fragments (Q) with respect to the centers $\mathbf{R}_Q$. These multipole moments can be
precomputed and stored before each SCF iteration, and their contribution to the long-
range Fock matrix can then be easily evaluated. The expression for the Coulomb
matrix elements is obtained without any explicit summation over basis functions:

$$J_{ab} = \sum_{l,m} C_{l,m}^{(a,b,P)} \sum_{l',m'} \sum_{Q} \mu_{l',m'}^{(Q)}\, I_{lm,l'm'}(\mathbf{R}_{PQ}) \tag{20.6}$$

The summation over l', m', and Q simply gives the electrostatic potential, the field,
field gradient and the higher moments of the potential at the point $\mathbf{R}_P$ due to all the
multipole moments in the system:

$$F_{l,m}^{(P)} = \sum_{l',m'} \sum_{Q} \mu_{l',m'}^{(Q)}\, I_{lm,l'm'}(\mathbf{R}_{PQ}) \tag{20.7}$$

With these definitions, the Coulomb matrix elements are obtained as

$$J_{ab} = \sum_{l,m} C_{l,m}^{(a,b,P)}\, F_{l,m}^{(P)} \tag{20.8}$$
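
The reordering of summations that leads from (20.4) to (20.8) can be sketched with random placeholder arrays. Everything in the block is hypothetical: the sizes, the moment array `C`, the density `D` and the interaction tensor `I_pq` are random numbers, and the (l,m) pair is flattened into one index. The point is only that precomputing the fragment moments and the field at each center reproduces the direct quadruple sum.

```python
import numpy as np

rng = np.random.default_rng(3)
n_frag, n_pair, n_mom = 4, 5, 9        # fragments, pairs per fragment, (l,m) components

C = rng.normal(size=(n_frag, n_pair, n_mom))           # C[P, ab, lm]
D = rng.normal(size=(n_frag, n_pair))                  # D[Q, cd]
I_pq = rng.normal(size=(n_frag, n_frag, n_mom, n_mom)) # I[P, Q, lm, l'm']
for P in range(n_frag):
    I_pq[P, P] = 0.0                   # long-range part only: no P = Q term

# Direct evaluation in the spirit of Eq. (20.4): sum over cd inside the loop.
J_direct = np.einsum("Pax,Qc,Qcy,PQxy->Pa", C, D, C, I_pq)

mu = np.einsum("Qc,Qcy->Qy", D, C)          # Eq. (20.5): fragment multipole moments
F = np.einsum("PQxy,Qy->Px", I_pq, mu)      # Eq. (20.7): field at each center P
J_fast = np.einsum("Pax,Px->Pa", C, F)      # Eq. (20.8)

print(np.allclose(J_direct, J_fast))
```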
We should note an interesting analogy between this multipole-based electrostatic
approach and the traditional evaluation of electron-electron repulsion in terms of four-
center two-electron integrals. The (ss|ss) integral discussed briefly in (16.10)-(16.15)
describes the interaction between un-normalized spherical charge distributions. $S_{ab}$ and
$S_{cd}$ are just two overlap factors describing the magnitudes of these charge distributions.
Furthermore, we should note that the error function in (16.15) can be approximated as

$$\mathrm{erf}(x) \approx 1 - \frac{1}{\sqrt{\pi}\, x}\, e^{-x^2} \tag{20.9}$$

for large values of x. Inserting this in (16.15), we get

$$F_0(T) \approx \frac{1}{2} \sqrt{\frac{\pi}{T}} \tag{20.10}$$

and

$$(ab|cd) \approx S_{ab}\, S_{cd}\, R_{PQ}^{-1} \tag{20.11}$$
In other words, the four-center integral just reflects the classical electrostatic repulsion
between point charges for large distances between the charges. The correction term in
(20.9) is due to the penetration of the two charge distributions, and is seen to decay
rapidly with distance. For integrals over basis functions with higher L-values, a similar
analysis reveals that the integral represents an interaction between higher multipoles.
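
How quickly the penetration correction dies out can be checked numerically. The sketch below uses only the Python standard library. It compares $F_0(T) = \tfrac{1}{2}\sqrt{\pi/T}\,\mathrm{erf}(\sqrt{T})$ with its classical limit $\tfrac{1}{2}\sqrt{\pi/T}$, and the complementary error function with the leading asymptotic term $e^{-x^2}/(\sqrt{\pi}\,x)$. Function names are hypothetical.

```python
import math

def boys0(T):
    """F_0(T) for T > 0 in closed form."""
    return 0.5 * math.sqrt(math.pi / T) * math.erf(math.sqrt(T))

def boys0_classical(T):
    """Large-T limit of F_0(T): the point-charge (1/R) behaviour."""
    return 0.5 * math.sqrt(math.pi / T)

def erfc_leading(x):
    """Leading term of the large-x expansion of 1 - erf(x)."""
    return math.exp(-x * x) / (x * math.sqrt(math.pi))

for T in (1.0, 4.0, 9.0, 25.0):
    print(T, 1.0 - boys0(T) / boys0_classical(T))

print(math.erfc(5.0) / erfc_leading(5.0))
```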
The fragment concept used here to represent long-range interaction has many
similarities with recent ideas of using an auxiliary basis set to expand the density
distributions, rather than the wavefunction. In both cases, the treatment of Coulomb
interaction is dramatically simplified since an external summation for the entire external
charge density is possible, while the savings for the exchange part (which can be
viewed merely as an efficient integral approximation) are less dramatic, though still
very significant. The multipoles used in the present approach could actually be viewed
as a particular choice of auxiliary basis functions with infinite orbital exponents,
centered in special positions in space. A combination of the present fragment approach
and a method involving auxiliary basis sets will be a very effective compromise for
many extended systems. With the fragments chosen to be atom-centered expansions,
the essential steps in the procedure would thus be as follows:

1: Expand the density of the molecule in an atom-centered auxiliary basis. This
basis set can be chosen to be quite large, if necessary, in order to get an accurate
expansion. The techniques and considerations discussed in Section 19 would naturally
apply to such an expansion. Each fragment of the density would be uniquely identified
with an atom-centered charge distribution.

2: Evaluate the multipole moment of each such atomic expansion. This operation
is completely trivial. Furthermore, the expansion is finite since the order of the multipole
moment corresponds exactly to the L-value of the auxiliary basis function if both are
expanded at the same center. The contribution from the nuclear charge could be
incorporated into this multipole, allowing all Coulomb interaction to be treated on an
equal footing. This would also have the effect of partly canceling the monopole terms
which are the largest contributions and the ones most slowly decaying with distance.

3: Evaluate the potential and its derivatives (field, field gradient, etc.) at the site of
each atom from all the other atoms.

4: Evaluate the effect of the total electrostatic potential on an auxiliary basis
function (essentially $J_p$ in Eq. 18.4).
In the above scheme, step 3 is quadratic in the number of atoms, whereas the
others are linear. This would already be a very significant improvement compared to
methods in current use. However, even better performance is possible using an
approach known as the Fast Multipole Method.42 The quadratic step 3 can be replaced
by the following procedure:

3a: Combine the multipoles to successively larger units in a hierarchical fashion.
(The expansion coefficients need to be kept.) For simplicity, we can assume that the
combination is done pair-wise, although this is not necessary in practice, and in an
actual application the proximity in space would be a better guideline.
Denoting the original moments $\{\mu_i^{(0)}, i=1,n\}$, centered at $\{Q_i^{(0)}\}$, the next steps
in the hierarchy would be $\{\mu_i^{(1)}, i=1,n/2\}$, $\{\mu_i^{(2)}, i=1,n/4\}$, ... centered at $\{Q_i^{(1)}\}$,
$\{Q_i^{(2)}\}$, etc. The number of levels in this hierarchy would then be K, where K =
log2(n).

3b: At the highest level K, the potential $\{F_i^{(K)}\}$ at all centers $\{Q_i^{(K)}\}$ resulting from
all other $\mu_j^{(K)}$ is evaluated if such a multipole expansion is justified. Proximity in space
and convergence of the multipole expansion would dictate whether the expansion is
allowed.

3c: Using the expansion coefficients saved in 3a, the potentials at the (K-1) level
$\{F_i^{(K-1)}\}$ are calculated from $\{F_i^{(K)}\}$. For those interactions that have not already been
accounted for at the higher level, a new attempt is made to calculate further
contributions to $\{F_i^{(K-1)}\}$. This evaluation of successively smaller contributions
continues until the first level is reached.

While the scheme is more complex than the original one and involves more
overhead, there is nothing in it that scales quadratically in the number of atoms. An exact
analysis of the work involved is difficult without additional information about the
magnitudes of the multipoles and their distribution in space, but a scaling of n·log(n)
seems plausible.
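
Step 3a can be sketched for the two lowest moments. The block below is a simplified illustration under stated assumptions: the fragments are hypothetical point charges, only the charge and the dipole are carried along, fragments are paired in index order rather than by spatial proximity, and each new center is taken as the midpoint of the two old ones. It shows the log2(n) levels of the hierarchy and that the moments at the top level agree with direct sums over the original charges.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 16                                       # number of original fragments (power of two)
centers = rng.uniform(0.0, 10.0, size=(n, 3))
charges = rng.uniform(-1.0, 1.0, size=n)
dipoles = np.zeros((n, 3))                   # level-0 fragments are point charges

levels = [(centers, charges, dipoles)]
while len(levels[-1][1]) > 1:
    c, q, d = levels[-1]
    c1, c2 = c[0::2], c[1::2]                # pair-wise combination
    new_c = 0.5 * (c1 + c2)
    new_q = q[0::2] + q[1::2]
    # Translating a dipole to a new origin adds charge * displacement.
    new_d = (d[0::2] + q[0::2, None] * (c1 - new_c)
             + d[1::2] + q[1::2, None] * (c2 - new_c))
    levels.append((new_c, new_q, new_d))

K = len(levels) - 1                          # number of levels above the original one
top_c, top_q, top_d = levels[-1]
direct_dipole = (charges[:, None] * (centers - top_c[0])).sum(axis=0)
print(K, top_q[0] - charges.sum(), top_d[0] - direct_dipole)
```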

42 J. Ambrosiano, L. Greengard and V. Rokhlin, Comput. Phys. Commun. 48, 117 (1988); L.
Greengard and V. Rokhlin, Chem. Scr. 29A, 139 (1989).

21. Fermi- and Coulomb Correlation.

In the elementary interpretation of a wavefunction, the probability distribution
function for an n-electron system is given as

$$P(1,2,\ldots,n) = |\Psi(1,2,\ldots,n)|^2 \tag{21.1}$$

From general elementary statistics we know that the probability distribution for
two variables P(x,y) can be written as

$$P(x,y) = P_x(x)\, P_y(y) \tag{21.2}$$

if and only if x and y are independent or uncorrelated variables, where the one-particle
probability distributions are defined as

$$P_x(x) = \int P(x,y)\, dy \tag{21.3}$$

Eq. (21.2) is usually taken as the definition of uncorrelated variables, and
deviations from this behavior are described as correlation, which can be quantified in
several different ways. In the Hartree-product (3.1) the coordinates of different
electrons are clearly independent:

$$P(1,2) = [\varphi_a(1)]^2\, [\varphi_b(2)]^2 = P(1)\, P(2) \tag{21.4}$$

It is easily seen that the electrons are not uncorrelated in a Hartree-Fock
wavefunction. From the form of the Slater determinant (3.16), we see directly that two
rows of the determinant will become identical when $\mathbf{r}_i = \mathbf{r}_j$. Such a determinant is
identically zero, and therefore the probability for such an event vanishes. This is not
unexpected by itself; after all, we would not expect particles with the same charges to be
found in the same point in space. However, the charge of the electrons was not an
issue when we arrived at the determinant form of the wavefunction! The correlation
seen here is due to the antisymmetry, rather than to Coulomb repulsion. This justifies a
distinction between Fermi- and Coulomb correlation. The former is a result of the
antisymmetry of the wavefunction, the latter is due to the Coulomb interaction between
electrons. Only Fermi correlation is accounted for by the Hartree-Fock wavefunction.
Despite this fact, it is common to define correlation energy as the difference between the
exact energy and the exact solution of the Hartree-Fock problem:

$$E_{\mathrm{corr}} = E_{\mathrm{exact}} - E_{\mathrm{Hartree\text{-}Fock}} \tag{21.5}$$

Since we expect interacting particles (electrostatic repulsion in this case) to move in a
correlated way, regardless of, and in addition to, effects brought about by general
symmetry considerations, we may predict that the Hartree-Fock model will fail to
describe situations where such interaction is strong.
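
Fermi correlation can be seen in the smallest possible example. The sketch below uses two hypothetical, un-normalized one-dimensional orbitals for two electrons of the same spin. These model functions are assumptions chosen only for illustration. The Hartree product gives a finite probability density for both electrons at the same point, whereas the determinant vanishes there and changes sign when the electrons are interchanged.

```python
import numpy as np

def phi_a(x):
    """Hypothetical even model orbital (not normalized)."""
    return np.exp(-x ** 2)

def phi_b(x):
    """Hypothetical odd model orbital (not normalized)."""
    return x * np.exp(-x ** 2)

def hartree_product(x1, x2):
    return phi_a(x1) * phi_b(x2)

def slater_determinant(x1, x2):
    """2x2 determinant of the two orbitals, with the usual 1/sqrt(2) factor."""
    return (phi_a(x1) * phi_b(x2) - phi_b(x1) * phi_a(x2)) / np.sqrt(2.0)

x = 0.7
print(hartree_product(x, x) ** 2)        # finite: no correlation at all
print(slater_determinant(x, x) ** 2)     # zero: Fermi hole
print(slater_determinant(0.3, 0.9), slater_determinant(0.9, 0.3))
```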

22. The Fock Matrix in an Orbital Basis. Koopmans' Theorem.

We have shown how the Fock matrix can be obtained from integrals in the
LCAO representation. We will refer to this as the matrix representation of the Fock
operator in the AO basis set. Sometimes we will also need the representation of the
Fock operator in the molecular orbital basis. For this, we construct the matrix

$$F_{ij} = (\varphi_i|F|\varphi_j) = (\varphi_i|(h + J - K)|\varphi_j) = (\varphi_i|h|\varphi_j) + \sum_k^n \{ (\varphi_i|J_k|\varphi_j) - (\varphi_i|K_k|\varphi_j) \} \tag{22.1}$$

When the SCF procedure has converged, this matrix is diagonal. We thus have an
expression for the orbital energies:

$$\varepsilon_i = (\varphi_i|h|\varphi_i) + \sum_k^n \{ (\varphi_i\varphi_i|g|\varphi_k\varphi_k) - (\varphi_i\varphi_k|g|\varphi_i\varphi_k) \} \tag{22.2}$$

and we also know that

$$F_{ij} = (\varphi_i|h|\varphi_j) + \sum_k^n \{ (\varphi_i\varphi_j|g|\varphi_k\varphi_k) - (\varphi_i\varphi_k|g|\varphi_j\varphi_k) \} \tag{22.3}$$

which clearly will be zero for $i \neq j$ when the orbitals satisfy the Hartree-Fock equations.
To describe an (n-1)-electron system, we can simply remove one orbital, $\varphi_k$,
from the original set, assuming all the other orbitals to remain unchanged. With the
energy given as in Eq. (4.14) for the original n-electron system:

$$E_n = \sum_i^n (\varphi_i|h|\varphi_i) + \frac{1}{2} \sum_{i,j}^n \{ (\varphi_i\varphi_j|g|\varphi_i\varphi_j) - (\varphi_i\varphi_j|g|\varphi_j\varphi_i) \} \tag{22.4}$$

which, for the present purpose, we may view as summing all elements of a vector b and
a matrix A, constructed as:

$$b_i = (\varphi_i|h|\varphi_i) \tag{22.5}$$

$$A_{ij} = (\varphi_i\varphi_j|g|\varphi_i\varphi_j) - (\varphi_i\varphi_j|g|\varphi_j\varphi_i) \tag{22.6}$$

we have

$$E_n = \sum_i^n b_i + \frac{1}{2} \sum_{i,j}^n A_{ij} \tag{22.7}$$

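Note that the diagonal elements $A_{ii}$ are identically zero by construction. For the ionized
system we get:

$$E_{n-1} = \sum_{i \neq k}^n b_i + \frac{1}{2} \sum_{i \neq k,\, j \neq k}^n A_{ij} = E_n - b_k - \frac{1}{2} \sum_i^n A_{ik} - \frac{1}{2} \sum_j^n A_{kj} + \frac{1}{2} A_{kk} = E_n - \varepsilon_k \tag{22.8}$$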

which is most easily realized by consulting the illustration below.

Removing the row and the column 'k' will remove $A_{kk}$ twice, but since the diagonal
elements are zero, we get:

$$E_{n-1} - E_n = -\varepsilon_k \tag{22.9}$$

(22.9) is the famous Koopmans' Theorem, which states that

the orbital energies approximate ionization potentials (with reversed signs).

There are several approximations involved in Koopmans' theorem: We have
assumed that all the other orbitals remain unchanged during the ionization process,
which certainly is not true. We also rely on the errors due to the neglect of correlation
effects (21.5) in the Hartree-Fock model to remain the same for the ion and the neutral
species. Both these assumptions are usually in error by several eV. However, luckily
enough the errors have opposite signs and are often of similar magnitude. It is this
cancellation of errors that makes Koopmans' theorem a useful tool for predicting and
interpreting experimental ionization potentials.
We could have used the same ideas to determine the electron affinity (EA) of a
molecule, i.e., the energy difference $E_n - E_{n+1}$ associated with the attachment of an
electron to an n-electron system. Formally, similar expressions would result.

However, in this case the errors discussed above have the same sign, and the predicted
EA is therefore usually off by several eV. Since typical EA's are in the range 0-4 eV,
orbital energies are in practice useless for their prediction.
Notice that this reasoning only holds for singly ionized states. If we were to
remove two electrons, say, k and l, the change in energy would be, with the same type
of pictorial representation:

$$E_{n-2} - E_n = -\varepsilon_k - \varepsilon_l + \frac{1}{2} A_{kl} + \frac{1}{2} A_{lk} \tag{22.10}$$

but, since the off-diagonal elements are in general non-zero, this is not just the sum of
two orbital energies. This reasoning also provides some insight into why the total
electronic energy of a system - which obviously would equal the energy needed to
remove all electrons - is different from the sum of orbital energies as discussed in
(5.24).
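
The bookkeeping with the vector b and the matrix A is easy to check numerically. The sketch below uses random numbers in place of the integrals, which is purely an assumption for illustration: `A` is made symmetric with a zero diagonal, the first five orbitals are taken as occupied, and the orbital energy of any orbital is its b-element plus its interaction with all occupied orbitals. It confirms the frozen-orbital results for removing one electron, removing two electrons, and moving one electron to a virtual orbital.

```python
import numpy as np

rng = np.random.default_rng(5)
n_occ, n_orb = 5, 8                       # hypothetical numbers of occupied / all orbitals
b = rng.normal(size=n_orb)                # stand-in for the one-electron elements b_i
A = rng.normal(size=(n_orb, n_orb))
A = A + A.T                               # symmetric, like Coulomb minus exchange
np.fill_diagonal(A, 0.0)                  # diagonal elements vanish by construction

def energy(occupied):
    """Sum of b_i plus half the sum of A_ij over the occupied orbitals."""
    occ = list(occupied)
    return b[occ].sum() + 0.5 * A[np.ix_(occ, occ)].sum()

occ = list(range(n_occ))
eps = b + A[:, occ].sum(axis=1)           # orbital energies, occupied and virtual
E_n = energy(occ)

k, l, v = 1, 3, 6                         # two occupied orbitals and one virtual
dE_single = energy([i for i in occ if i != k]) - E_n
dE_double = energy([i for i in occ if i not in (k, l)]) - E_n
dE_excite = energy([i for i in occ if i != k] + [v]) - E_n

print(dE_single, -eps[k])
print(dE_double, -eps[k] - eps[l] + A[k, l])
print(dE_excite, eps[v] - eps[k] - A[k, v])
```

In this model the double ionization and the excitation both differ from the simple sum or difference of orbital energies by one off-diagonal element of A.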
In the same way as above, one can also show that the excitation energy, defined
as the energy required to remove one electron from an occupied orbital and add it to a
virtual, is not the difference between the two orbital energies involved.

$$E' - E = -\varepsilon_k + \varepsilon_{n+1} - \frac{1}{2} A_{k,n+1} - \frac{1}{2} A_{n+1,k} \tag{22.11}$$

23. Matrix Elements With Slater Determinants. Brillouin's Theorem.

The Hartree-Fock equations (5.22) have an infinite number of solutions. Even
after a finite basis set has been introduced, the Roothaan-Hall equations (10.7) or
(10.18) have more solutions than we need to describe the electronic ground state. The
remaining (virtual) orbitals can be used to describe higher electronic states. This feature
is of course useful if excited states are the focus of our interest, but more often the
virtual orbitals are actually used in improving the description of the ground state, using
the CI expansion method outlined in Eq. (22.1), and discussed elsewhere in this
volume. If we use the notation from Sections 3 and 4 to describe the ground-state
electronic wavefunction,

$$\Psi_0 = |\varphi_1 \varphi_2 \cdots \varphi_i \varphi_j \varphi_k \cdots \varphi_n| \tag{23.1}$$

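we can write singly, doubly and triply excited states as:

$$\Psi_i^a = |\varphi_1 \varphi_2 \cdots \varphi_a \varphi_j \varphi_k \cdots \varphi_n| \tag{23.2}$$

$$\Psi_{ij}^{ab} = |\varphi_1 \varphi_2 \cdots \varphi_a \varphi_b \varphi_k \cdots \varphi_n| \tag{23.3}$$

$$\Psi_{ijk}^{abc} = |\varphi_1 \varphi_2 \cdots \varphi_a \varphi_b \varphi_c \cdots \varphi_n| \tag{23.4}$$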

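We can now use a technique similar to the one used in Section 4 to find
integrals between two different wavefunctions. To begin with, the overlap integral is
obtained as

$$\langle \Psi_0 | \Psi_i^a \rangle = \frac{1}{n!} \sum_P^{n!} \sum_{P'}^{n!} (-1)^{P+P'} \langle P\{\varphi_1\varphi_2 \cdots \varphi_i \cdots \varphi_n\} | P'\{\varphi_1\varphi_2 \cdots \varphi_a \cdots \varphi_n\} \rangle$$

$$= \sum_P^{n!} (-1)^P \langle \{\varphi_1\varphi_2 \cdots \varphi_i \cdots \varphi_n\} | P\{\varphi_1\varphi_2 \cdots \varphi_a \cdots \varphi_n\} \rangle \tag{23.5}$$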

But, in analogy with Eq. (3.15), all terms in the sum must contain an integral of the type
$\langle k|a \rangle$, where $\varphi_k$ is in the original set of orbitals whereas $\varphi_a$ is not. These integrals all
equal zero, and the entire integral (23.5) therefore vanishes; in other words,

$$\langle \Psi_0 | \Psi_i^a \rangle = 0 \tag{23.6}$$

For the purpose of discussing more complex situations, we rephrase this result: There
is a "mismatch" between the orbital sets in '1'0 and in '¥~ , and nothing we do to
permute the orbitals can eliminate this mismatch.
Integrals over other operators can be systematically evaluated with a similar
approach. For the one-electron operators we have, analogous to (4.3)-(4.6):

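\[ \langle\Psi_0|\sum_k^n h_k|\Psi_i^a\rangle = \sum_k^n \sum_P^{n!} (-1)^P \langle \{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\} | h_k P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\rangle \tag{23.7} \]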

In (23.7), the "mismatch" between orbitals $\varphi_i$ and $\varphi_a$ will make the entire integral zero,
unless $\varphi_i$ and $\varphi_a$ are both associated with electron "k". In other words, the only
non-zero contribution occurs for i=k:

\[ \langle\Psi_0|\sum_k^n h_k|\Psi_i^a\rangle = \langle i|h|a\rangle \tag{23.8} \]

In the same way, we obtain matrix elements over the two-electron operator. Notice,
first of all, that

\[ \sum_{i<j}^n g_{ij} = \frac{1}{2} \sum_{i \neq j}^n g_{ij} \tag{23.9} \]

We thus get for the integral over Slater determinants:

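\[ \langle\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_i^a\rangle = \sum_{k<l}^n \sum_P^{n!} (-1)^P \langle \{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\} | g_{kl} P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\rangle \tag{23.10} \]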

The mismatch between Hartree-products on the left and the right side is eliminated here
if either k or l equals i. The expression then equals:

\[ \sum_k^n \sum_P^{n!} (-1)^P \langle \{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\} | g_{ki} P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\rangle \tag{23.11} \]

However, only two permutations in the sum over P will give non-zero contributions:
the one that leaves all orbitals in place, and the one that only interchanges i and k. We
thus get the final result:

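\[ \langle\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_i^a\rangle = \sum_k^n \{ \langle\varphi_i\varphi_k|g_{ki}|\varphi_a\varphi_k\rangle - \langle\varphi_i\varphi_k|g_{ki}|\varphi_k\varphi_a\rangle \} = \sum_k^n \langle ik\|ak\rangle \tag{23.12} \]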
with the notation for integrals introduced in Appendix A.
We can make an interesting and useful observation here: The integral of the
total Hamiltonian between the Hartree-Fock wavefunction and a singly excited state is:
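\[ \langle\Psi_0|H|\Psi_i^a\rangle = \langle\Psi_0|\Big(h_0 + \sum_i^n h_i + \sum_{i<j}^n g_{ij}\Big)|\Psi_i^a\rangle = \langle i|h|a\rangle + \sum_k^n \langle ik\|ak\rangle \tag{23.13} \]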

If we compare (23.13) with the expression (22.1) for the Fock matrix (which is
of course diagonal in the representation of converged molecular orbitals), we arrive at
the following conclusion, also known as Brillouin's Theorem:

The Hartree-Fock wavefunction does not interact directly with singly excited states
through the Hamiltonian.

In other words, $\langle\Psi_0|H|\Psi_i^a\rangle = 0$. Brillouin's Theorem is frequently used elsewhere,
among other things for developing the theory and methods of electron correlation.43
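
The step from (23.13) to the theorem can be spelled out in one line. The following is only a sketch: it assumes spin orbitals, and it assumes that the Fock matrix element between an occupied orbital i and a virtual orbital a has the form written below in the notation of Appendix A (the symbols $f_{ia}$ and $\varepsilon_i$ are introduced here for illustration):

\[ \langle\Psi_0|H|\Psi_i^a\rangle = \langle i|h|a\rangle + \sum_k^n \langle ik\|ak\rangle = f_{ia} = \varepsilon_i \delta_{ia} = 0 \qquad (i \le n < a) \]

The last two equalities use only the fact that the Fock matrix is diagonal for converged orbitals; for non-converged orbitals $f_{ia}$ is in general non-zero and the singly excited states do couple to $\Psi_0$.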
If we proceed with the integrals between Slater determinants, we may now also
consider the doubly excited states $\Psi_{ij}^{ab}$. From the previous discussion, it should be
obvious that

\[ \langle\Psi_0|\Psi_{ij}^{ab}\rangle = 0 \tag{23.14} \]

We can also conclude that


\[ \langle\Psi_0|\sum_k^n h_k|\Psi_{ij}^{ab}\rangle = 0 \tag{23.15} \]

since there will always be at least two orbital mismatches between the Hartree products
to the left and to the right, and any single term in the sum over k can make up for at
most one of them.
For the two-electron terms, we may expect to get some non-zero interaction:

43 Remember that Brillouin's Theorem holds only if the orbitals are Hartree-Fock orbitals (though not
necessarily canonical).

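\[ \langle\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_{ij}^{ab}\rangle = \frac{1}{2} \sum_{k \neq l}^n \sum_P^{n!} (-1)^P \langle \{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_j\cdots\varphi_n\} | g_{kl} P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_b\cdots\varphi_n\}\rangle \tag{23.16} \]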

By now we should be used to this type of reasoning, and we can conclude without
further ado that these integrals are non-zero only for i=k, j=l and i=l, j=k, and the entire
integral (23.16) therefore becomes:

\[ \frac{1}{2} \sum_P^{n!} (-1)^P \langle \{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_j\cdots\varphi_n\} | (g_{ij} + g_{ji}) P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_b\cdots\varphi_n\}\rangle \tag{23.17} \]

Just like in the single-excitation case there are only two permutations that can give a
non-zero result: the one that leaves everything unchanged, and the one that only
interchanges electrons i and j. The final result is therefore:

\[ \langle\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_{ij}^{ab}\rangle = \langle ij|ab\rangle - \langle ij|ba\rangle = \langle ij\|ab\rangle \tag{23.18} \]

We finally conclude that for determinants differing in three or more orbitals, there is no
way one can obtain non-zero integrals over operators that contain only one- and
two-electron interactions.
To summarize, we have the following rules which are important enough to be
worth memorizing (since deriving them every time we need them is a bit tedious):

\[ \langle\Psi_0|H|\Psi_0\rangle = h_0 + \sum_i^n \langle i|h|i\rangle + \frac{1}{2} \sum_{i,j}^n \langle ij\|ij\rangle \tag{23.19} \]
\[ \langle\Psi_0|H|\Psi_i^a\rangle = \langle i|h|a\rangle + \sum_k^n \langle ik\|ak\rangle \tag{23.20} \]
\[ \langle\Psi_0|H|\Psi_{ij}^{ab}\rangle = \langle ij\|ab\rangle \tag{23.21} \]
\[ \langle\Psi_0|H|\Psi_{ijk\ldots}^{abc\ldots}\rangle = 0 \tag{23.22} \]

Notice that (23.20) equals zero for the case that the orbitals are the optimized
Hartree-Fock orbitals for $\Psi_0$. The general form of (23.19)-(23.21) holds for any orthonormal
orbital set, however.

24. Charge Density and Population Analysis.

The charge density of a many-electron system is an intuitively obvious concept,
in many ways more so than the wavefunction itself. Some modern theories of electronic
structure are indeed very strongly focused on the charge density. The density can be
determined experimentally, and several methods actually do so with an accuracy that
rivals some of the best calculations. Actually the densities of α- and β-electrons are
separately accessible - or more precisely their sum, the total electron density, and their
difference, the net spin density.
We begin this exercise by evaluating the probability of finding the electron "i"
inside the volume element dV at the position r_i, irrespective of where all the other
electrons are. That probability is given by the integral

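\[ P_i(r_i)\,dV = dV \int |\Psi(r_1, r_2 \ldots r_n)|^2 \, dr_1 \ldots dr_{i-1}\, dr_{i+1} \ldots dr_n \tag{24.1} \]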

The expression can be written in a more convenient form if we recall the definition of
the Dirac δ-function:

\[ f(r) \equiv \int f(r')\, \delta(r - r')\, dr' \tag{24.2} \]

with which we can write Pi as a conventional expectation value with integration over all
coordinates:

\[ P_i(r) = \int |\Psi(r_1, r_2 \ldots r_n)|^2 \, \delta(r - r_i)\, dr_1 \ldots dr_n \tag{24.3} \]

or, in a more convenient notation:

\[ P_i(r) = \langle\Psi|\delta(r - r_i)|\Psi\rangle \tag{24.4} \]

However, electrons are indistinguishable, and the observed density is the sum of
contributions from all the electrons. This gives the following form for the total electron
density of a molecule:

\[ \rho(r) = \int |\Psi(r_1, r_2 \ldots r_n)|^2 \sum_i^n \delta(r - r_i)\, dr_1 \ldots dr_n \tag{24.5} \]

In other words, the operator corresponding to the total density is

\[ R(r) = \sum_i^n \delta(r - r_i) \tag{24.6} \]
Notice that we made no assumption about the form of the wavefunction: the derivation
is general. In order to obtain the specific expression for the density in the case of a
determinant wavefunction, we note that the density operator (24.6) is a one-electron
operator similar to the term $\sum_i^n h_i$ in the Hamiltonian, and the expectation value can
then be evaluated along the same lines as (4.6):
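\[ \rho(r) = \langle\Psi|\sum_i^n \delta(r - r_i)|\Psi\rangle = \sum_i^n \langle\varphi_i(r')|\delta(r - r')|\varphi_i(r')\rangle = \sum_i^n |\varphi_i(r)|^2 \tag{24.7a} \]
\[ \rho(r) = \varphi(r)\, \varphi^\dagger(r) \tag{24.7b} \]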

In other words, the total electron density of a molecule is equal to the sum of densities
for each of the molecular orbitals - not an entirely surprising result!
The density can also be expressed in terms of the basis set - we just insert the
LCAO expansion (9.1) or (9.2) into (24.7), to obtain

\[ \rho(r) = \sum_i^n |\varphi_i(r)|^2 = \sum_i^n \sum_{p,q}^N C_{pi} C_{qi}\, \chi_p(r)\chi_q(r) = \sum_{p,q}^N D_{pq}\, \chi_p(r)\chi_q(r) \tag{24.8a} \]

\[ \rho(r) = \chi(r)\, D\, \chi^\dagger(r) \tag{24.8b} \]

where D is the same density matrix as in (10.2), which we introduced when discussing
the Roothaan-Hall equations.
The expressions above have a significance beyond any interest in the electron
density itself. Many molecular properties are represented by multiplicative, one-
electron operators. The dipole moment operator is one such example: The operator for
the dipole moment with respect to a point R is44
\[ d = \sum_i^n (r_i - R) \tag{24.9} \]

Removing the reference to origin, the expectation value of this operator is, with the
same methods as above,
\[ \mu = \langle\Psi|d|\Psi\rangle = \sum_i^n \langle\varphi_i(r)|r|\varphi_i(r)\rangle = \sum_{p,q}^N D_{pq} \langle\chi_p(r)|r|\chi_q(r)\rangle \tag{24.10} \]

A large class of properties can be evaluated in this convenient way, using the density
matrix and matrix elements of the operator in the LCAO basis set. Eq. (24.10) is of
course just another way of expressing the relation between density and dipole moment:

\[ \mu = \int \rho(r)\, r\, dr \tag{24.11} \]

44 The dipole moment is a vector quantity, independent of the choice of origin if the molecule has no
net charge.

Given the expression (24.8) for the electron density, we can carry out an
interesting analysis. We note that the total electron count of the molecule is given by:

\[ n = \int \rho(r)\, dr = \int \sum_{p,q}^N D_{pq}\, \chi_p(r)\chi_q(r)\, dr = \sum_{p,q}^N D_{pq} S_{pq} = \sum_p^N \{D S\}_{pp} = \mathrm{Tr}\{D S\} \tag{24.12} \]

This provides a breakdown of the total number of electrons in the molecule into
components that can be assigned to individual atoms:

\[ n = \sum_{P,Q} q_{PQ} \tag{24.13} \]

where we have summed the contributions for each pair of atoms {P,Q}:

\[ q_{PQ} = \sum_{p \in P,\, q \in Q} D_{pq} S_{pq} \tag{24.14} \]

Notice that (24.13) has both one- and two-center contributions! We can thus assign the
electron density to pairs of atoms (= bonds!) as well as to the atoms themselves.
This analysis is referred to as the Mulliken population analysis. The quantity
$2q_{PQ}$ for P≠Q is usually called the overlap population between atoms P and Q (notice
the factor of 2, since the sum in (24.13) is quadratic and contains each pair twice). We
also define the net charge of an atom P as

\[ q_P = Q_P - q_{PP} \tag{24.15} \]

where $Q_P$ is the nuclear charge of the atom.
If we want to assign the total charge of the molecule to the atoms it is also common to
introduce the gross atomic charge as

\[ Z_P = Q_P - \sum_Q^{N_{atom}} q_{PQ} \tag{24.16} \]

Obviously, the sum over all gross atomic charges equals the total net charge of the
molecule, so this analysis does indeed provide a breakdown of the total charge into
atomic components.
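
The bookkeeping in (24.12)-(24.16) can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration: the function name `mulliken`, the toy H2 model with one basis function per atom and overlap 0.6, and the convention that D includes the factor 2 for double occupation (the definition (10.2) is not reproduced here).

```python
import numpy as np

def mulliken(D, S, atom_of_basis, nuclear_charges):
    """Pair populations q_PQ (24.14), net charges (24.15), gross charges (24.16)."""
    D = np.asarray(D, dtype=float)
    S = np.asarray(S, dtype=float)
    atom_of_basis = np.asarray(atom_of_basis)
    charges = np.asarray(nuclear_charges, dtype=float)
    n_atoms = len(charges)
    DS = D * S                      # element-wise products D_pq * S_pq
    q = np.zeros((n_atoms, n_atoms))
    for P in range(n_atoms):
        for Q in range(n_atoms):
            # select basis functions p on atom P and q on atom Q
            mask = np.outer(atom_of_basis == P, atom_of_basis == Q)
            q[P, Q] = DS[mask].sum()
    net = charges - np.diag(q)      # q_P = Q_P - q_PP
    gross = charges - q.sum(axis=1) # Z_P = Q_P - sum_Q q_PQ
    return q, net, gross

# Toy model (assumed): H2 with one normalized basis function per atom.
s = 0.6
S = np.array([[1.0, s], [s, 1.0]])
c = np.array([1.0, 1.0]) / np.sqrt(2.0 * (1.0 + s))   # normalized bonding orbital
D = 2.0 * np.outer(c, c)                               # doubly occupied
q, net, gross = mulliken(D, S, [0, 1], [1.0, 1.0])
n_electrons = np.trace(D @ S)                          # Tr{D S}, Eq. (24.12)
overlap_population = 2.0 * q[0, 1]
```

In this symmetric example the one-center terms do not add up to the full electron count; the remainder sits in the overlap population, and the gross charges share it equally between the two atoms.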
There are other ways to divide the total electronic density. Starting from the
definition

\[ W = U \varepsilon^{1/2} U^\dagger \tag{24.17} \]

in close analogy to (10.21), we note that

\[ W^2 = U \varepsilon^{1/2} U^\dagger U \varepsilon^{1/2} U^\dagger = U \varepsilon U^\dagger = S. \tag{24.18} \]

Furthermore, since Tr{A B} = Tr{B A} for any two matrices A, B, we can write
(24.12) as

\[ n = \mathrm{Tr}\{W D W\} = \sum_p^N \{W D W\}_{pp} \tag{24.19} \]
which we may view as a population analysis in terms of a symmetrically orthogonalized
basis set. Using the transformation matrix $X = U \varepsilon^{-1/2} U^\dagger$ from Eq. (10.21), the
density becomes:

\[ \rho(r) = \sum_{p,q}^N D_{pq}\, \chi_p(r)\chi_q(r) = \sum_{r,s}^N D'_{rs}\, \Phi_r(r)\Phi_s(r) \tag{24.20} \]

or, in matrix notation:

\[ \rho(r) = \chi(r)\, D\, \chi^\dagger(r) = \chi(r)\, X D' X\, \chi^\dagger(r) \tag{24.21} \]

where D' is the density matrix in the new basis set (note that X is a symmetric
matrix). From (24.21) we also have

\[ D = X D' X \tag{24.22} \]

However, since W X = X W = 1, we also have

\[ D' = W D W \tag{24.23} \]

and we can thus write (24.19) as

\[ n = \mathrm{Tr}\{D'\} = \sum_p^N \{D'\}_{pp} \tag{24.19} \]

In other words, the symmetric orthogonalization, while giving a new orthogonal basis
set as close to the original as possible, also provides a population analysis where the
entire charge is associated with the atoms (there can be no overlap density with an
orthogonal basis set). We often refer to (24.19) as a Löwdin population analysis.
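
As a numerical illustration of (24.17)-(24.23), the sketch below builds W = S^{1/2} from the eigenvectors of S and reads the populations off the diagonal of D' = W D W. The function name `lowdin_populations`, the two-function model with overlap 0.6, and the orbital coefficients are assumptions chosen only to make the example concrete; D is again taken to include the factor 2 for double occupation.

```python
import numpy as np

def lowdin_populations(D, S):
    """Diagonal of D' = W D W with W = S^(1/2), cf. Eqs. (24.17) and (24.23)."""
    eigvals, U = np.linalg.eigh(S)               # S = U eps U^T
    W = U @ np.diag(np.sqrt(eigvals)) @ U.T      # W = U eps^(1/2) U^T
    D_prime = W @ D @ W
    return np.diag(D_prime), W

# Assumed toy model: two basis functions with overlap 0.6 and one doubly
# occupied, polarized orbital.
S = np.array([[1.0, 0.6], [0.6, 1.0]])
c0 = np.array([0.8, 0.4])
c = c0 / np.sqrt(c0 @ S @ c0)                    # normalize: c^T S c = 1
D = 2.0 * np.outer(c, c)

populations, W = lowdin_populations(D, S)
n_from_mulliken = np.trace(D @ S)                # Eq. (24.12)
n_from_lowdin = populations.sum()                # Tr{D'}
```

Both traces give the same electron count, but the Löwdin populations carry no two-center part, so the split between the two centers generally differs from the Mulliken one.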

25. Closing Remarks.

There is no way a presentation of this form can be comprehensive. For
additional information on topics discussed in this Chapter, the reader is referred to
special literature. For instance, several of the proceedings from NATO Summer
schools in the past provide valuable reading.45 The textbook by Szabo and Ostlund46
also contains a thorough review of Hartree-Fock theory. In general, however, the
research frontier on modern electronic structure is moving too rapidly for any textbook
on the topic to be up-to-date for more than a short time, and the reader is therefore
strongly encouraged to keep up with the current scientific literature for the latest
developments.

Appendix A: Notations for integrals.

Unfortunately, there exists a plethora of different notations for denoting
integrals over one- and two-electron operators involving one-electron orbitals and/or
basis functions. In previous Sections, we have already encountered two ways of
writing the two-electron integrals. Without pretending to provide an exhaustive list of
all the prevailing conventions, we will try to mention some of the more common ones.
For a start, we recall the bra-ket notation for an integral over wavefunctions and
operators in general:

\[ \langle\Psi|A|\Phi\rangle \equiv \int \Psi(X)^* A\, \Phi(X)\, dX \tag{A1} \]

where X is a generalized notation for all the coordinates to be integrated over (usually
obvious from the context). Reminiscent of this convention are the notations

\[ \langle p|q\rangle \equiv \langle\varphi_p|\varphi_q\rangle \equiv \int \varphi_p(x)^* \varphi_q(x)\, dx \tag{A2} \]

\[ \langle p|A|q\rangle \equiv \langle\varphi_p|A|\varphi_q\rangle \equiv \int \varphi_p(x)^* A\, \varphi_q(x)\, dx \tag{A3} \]

\[ \langle pr|A|qs\rangle \equiv \langle\varphi_p\varphi_r|A|\varphi_q\varphi_s\rangle \equiv \int \varphi_p(x_1)^* \varphi_r(x_2)^* A\, \varphi_q(x_1)\varphi_s(x_2)\, dx_1 dx_2 \tag{A4} \]

where we have used 'x' to denote the combination of space- and spin variable for the
electron. The electron repulsion integral is the most common two-electron quantity,
and in a notation already used above it is written as:

45 "Computational Techniques in Quantum Chemistry and Molecular Physics", NATO ASI Series C,
Vol. 15, (Eds. G.H.F. Diercksen, B.T. Sutcliffe and A. Veillard) Reidel, Dordrecht (1975); "Methods in
Computational Molecular Physics", NATO ASI Ser. C, Vol. 113, (Eds. G.H.F. Diercksen and S.
Wilson) Reidel, Dordrecht (1983); "Methods in Computational Molecular Physics", NATO ASI Ser.
B, Vol. 293, (Eds. S. Wilson and G.H.F. Diercksen) Reidel, Dordrecht (1992).
46 "Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory", A. Szabo
and N. Ostlund, 1st Edition (revised) McGraw Hill, 1989.

\[ \langle pr|\frac{1}{r_{12}}|qs\rangle = \langle pr|qs\rangle \tag{A5} \]

Another convention also in use employs square brackets:

\[ [i|h|j] = \langle i|h|j\rangle \tag{A6} \]

\[ [ik|jl] = \langle ij|kl\rangle \tag{A7} \]

Note the different ordering of the indices for the two-electron integrals! The justification
for this notation is that the indices corresponding to the same electron are written
together on the same side of the symbol for the integral.
The orbitals entering the expressions (A2)-(A5) can be understood as either spin
orbitals or as functions over space only. Both conventions are common, and we clearly
need to make sure that there is complete agreement on which one is used in order to
interpret expressions where these notations occur.
We noticed in the discussion of the Hartree-Fock equations that two-electron
integrals very often occur in pairs, due to the antisymmetry of the wavefunction. This
motivates the notation

\[ \langle pq\|rs\rangle = \langle pq|rs\rangle - \langle pq|sr\rangle \tag{A8} \]

Using this notation, and assuming spin-orbitals, the Hartree-Fock energy becomes:

\[ E = h_0 + \sum_i^n \langle i|h|i\rangle + \frac{1}{2} \sum_{i,j}^n \langle ij\|ij\rangle \tag{A9} \]

Often, the spin-integration involves some trivial algebra and can be carried out even if
the spatial part of the orbitals is unknown. For integrals over space orbitals, we have
the following notations:

\[ (p|h|q) \equiv \int \psi_p(r)^* h\, \psi_q(r)\, dr \tag{A10} \]

\[ (pq|rs) \equiv (\psi_p\psi_q|\psi_r\psi_s) \equiv \int \psi_p(r_1)^* \psi_r(r_2)^* \frac{1}{r_{12}} \psi_q(r_1)\psi_s(r_2)\, dr_1 dr_2 \tag{A11} \]

On comparing the different formulations, we note that e.g., $\langle pq|rs\rangle = (pr|qs)$ if the
spin functions satisfy the relations $\sigma_p = \sigma_r$ and $\sigma_q = \sigma_s$, and zero in all other cases.
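
Because index-order mistakes between these conventions are easy to make, it can help to see the conversions as array operations. The sketch below is illustrative only: it assumes real orbitals, treats the indices as spin orbitals so that no spin factors appear, and uses a random tensor with the proper permutational symmetry in place of real integrals. The array names `chem`, `phys` and `anti` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Stand-in for integrals stored in round-bracket order:
# chem[p, q, r, s] = (pq|rs), with p, q on electron 1 and r, s on electron 2.
A = rng.standard_normal((n, n, n, n))
chem = A + A.transpose(1, 0, 2, 3)          # (pq|rs) = (qp|rs)
chem = chem + chem.transpose(0, 1, 3, 2)    # (pq|rs) = (pq|sr)
chem = chem + chem.transpose(2, 3, 0, 1)    # (pq|rs) = (rs|pq)

# Angle-bracket order: <pq|rs> = (pr|qs)
phys = chem.transpose(0, 2, 1, 3)

# Antisymmetrized integrals, Eq. (A8): <pq||rs> = <pq|rs> - <pq|sr>
anti = phys - phys.transpose(0, 1, 3, 2)
```

With the integrals in this form, the two-electron part of (A9) is a single contraction over occupied indices, for example `0.5 * np.einsum("ijij->", anti[:n_occ, :n_occ, :n_occ, :n_occ])` for a hypothetical number of occupied spin orbitals `n_occ`.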

Appendix B. Parallel Implementations of Hartree-Fock Methods.

The success of all method development must be measured by the practical
usefulness of these methods in realistic applications. Today, this requirement also
involves the adaptability of methods and algorithms to powerful and cost-effective
computer architectures. Thus, method development in quantum chemistry is closely
coupled with the designing of efficient codes for modern computers, which today is
almost synonymous with parallel architectures. The restructuring of application software
to fully utilize the awesome powers of current computer hardware is indeed one of the
most significant challenges facing contemporary computational chemistry. Considering
the wide variety of available computer architectures, and the ephemeral nature of the
cutting edge technology upon which they are based, this is not a one-time task, but
rather an ongoing development project.
Many parallel architectures allow for a very high nominal performance,
measured in MFLOPS. However, when parallel approaches have been attempted in
electronic structure calculations, the real bottlenecks have been encountered in
communication and data handling. The calculations generate unmanageable amounts of
intermediary data, such as two-electron integrals, and data reduction at a high level is
therefore necessary. Conventional approaches rely on storage of these integrals on
disk, and they are therefore always limited by the available disk space. This is
especially unfortunate as the storage requirement grows with the fourth power of the
size of the system studied, and the 'direct' approaches discussed in Section 14 seem to
offer the only known remedy to this bottleneck.
The evaluation and further processing of two-electron interactions is by far the
most time-consuming step in all electronic structure calculations. It is therefore
important to design algorithms for parallel execution of this step within the restrictions
on memory per node which currently apply to many parallel systems.
These integrals can be computed totally independently, and their evaluation can
thus be split into individual computational tasks down to the "batch" level (a batch
would correspond to the set of integrals derived from four distinct Gaussian shells). In
the direct SCF method, parallelism can be achieved by splitting the computation and
processing of two-electron integrals into a number of tasks that can be executed
independently and in random order. The direct SCF scheme is ideally suited for
processing in parallel, and can easily be implemented on a variety of architectures -
massively parallel, shared-memory, heterogeneous-distributed, etc., with little change
in the structure of the kernel code. Tasks can be distributed across several computers,
each one executing a copy of the integral program. Once all tasks have been issued,
the 'master' machine collects all partial Fock matrices from the servers (the 'slaves'),
combines them into one and concludes the current SCF iteration. To communicate data
and messages between master and slaves, a medium such as a communication network,
a network file system (NFS) or a shared memory device has to be available. Such
techniques can be used to distribute a computation over several machines with a large
variety of different hardware designs. The only model which would require more
consideration is the SIMD architecture, where integral evaluation would have to be
grouped according to type (quantum number) and large numbers of those would be
executed in parallel, essentially precluding most integral prescreening schemes in
current use.
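
The task structure described above can be sketched in a few lines. This is a serial simulation under stated assumptions, not the scheme of any particular program: the function names `fock_task` and `parallel_fock` are hypothetical, the "integrals" are random numbers with no permutational symmetry exploited, the workers are simulated by round-robin assignment, and the closed-shell combination G = J - K/2 is assumed for the two-electron part of the Fock matrix.

```python
import numpy as np

def fock_task(D, eri, a, b):
    """Contribution of the batch (ab|cd), all c and d, to G = J - K/2."""
    n = D.shape[0]
    G = np.zeros((n, n))
    block = eri[a, b]                        # block[c, d] = (ab|cd)
    G[a, b] += np.sum(D * block)             # Coulomb: sum_cd D_cd (ab|cd)
    G[a, :] -= 0.5 * (block @ D[b, :])       # exchange: G_ac -= 1/2 sum_d D_bd (ab|cd)
    return G

def parallel_fock(D, eri, n_workers):
    """Replicated-data scheme: each worker keeps D and its own partial G."""
    n = D.shape[0]
    tasks = [(a, b) for a in range(n) for b in range(n)]
    partial = [np.zeros((n, n)) for _ in range(n_workers)]
    for t, (a, b) in enumerate(tasks):
        partial[t % n_workers] += fock_task(D, eri, a, b)   # round-robin issue
    return sum(partial)                      # the master adds the partial matrices

rng = np.random.default_rng(1)
n = 5
eri = rng.standard_normal((n, n, n, n))      # stand-in for two-electron integrals
D = rng.standard_normal((n, n))
D = D + D.T                                  # symmetric stand-in density matrix

G_parallel = parallel_fock(D, eri, n_workers=3)
J = np.einsum("abcd,cd->ab", eri, D)
K = np.einsum("abcd,bd->ac", eri, D)
G_serial = J - 0.5 * K
```

Since each task only adds into a private partial matrix, the order in which tasks are executed does not matter, which is what makes the scheme tolerant of uneven machine speeds.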
A coarse-grain type of parallelism can be used to exploit parallelism efficiently
in a system with relatively few, powerful processors. In this scheme the evaluation and
processing of integrals is split into unrelated tasks, and at the end of the computation
the work done by all the tasks is collected by the main process. The obvious advantage
with this scheme is that it can be used for a shared memory computer as well as for a
distributed memory system, or any combination of the two.47 A net of workstations
can provide a significant source of CPU power for very large calculations, and it is
fully realistic to run machines at different locations in concert for the same job.
A simplistic parallel implementation of the direct SCF method requires complete
copies of both the Fock matrix (nuclear gradient), and the density matrix to be held in
memory on each node. However, with a physical memory of 16 megabytes on each
node one is limited to calculations on systems with up to about 1,000 basis functions
with such a "replicated-data" approach. By partitioning Fock and density matrices
across several nodes, it would be possible to investigate much larger systems. It is
possible with some modification of the algorithm to leave the AO-driven regime and
compute integrals in a Fock matrix driven manner.48 A potential drawback of this
approach is the somewhat increased communication required to send matrix elements
between nodes, but this communication will increase more slowly than the computation as
one goes to larger systems, and thus it should not constitute a bottleneck.

47 H.P. Lüthi, John E. Mertz, M.W. Feyereisen and J. Almlöf, J. Comput. Chem. 13, 160 (1992).
48 i.e., completing the calculation of all the integrals relating one sub-block of the density matrix to
another sub-block of the Fock matrix before moving on to the next pair of sub-blocks.

Computationally intensive sections of existing programs can be modified to take
advantage of the processing power available on parallel machines in the regime of a few
hundred processors. However, while Amdahl's law oversimplifies the speedup for the
construction of the Fock matrix, it emphasizes the problem of efficiently using large
numbers of processors. To gain overall efficiency for a large number of processors, it
is not sufficient to parallelize only the evaluation and processing of two-electron
integrals. Large portions of the existing serial code, which are computationally
insignificant on a single-processor architecture, will have to be modified. Such an
adaptation of an entire electronic structure code to parallel hardware could become an
overwhelming task.
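
To make the point about Amdahl's law concrete, the textbook form of the law can be evaluated for a few processor counts. The function name and the 99% parallel fraction below are illustrative assumptions, not figures from the text.

```python
def amdahl_speedup(parallel_fraction, n_procs):
    """Ideal speedup when only a fraction of the work can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)

# Assumed example: 99% of the single-processor time is integral work.
speedups = {n: amdahl_speedup(0.99, n) for n in (8, 64, 512)}
upper_bound = 1.0 / (1.0 - 0.99)     # limit for infinitely many processors
```

Even with only 1% serial work the speedup on 512 processors stays well below the processor count, which is why the "insignificant" serial portions eventually have to be addressed.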
An alternative solution to this problem would be the use of a dual architecture
(heterogeneous computing environment). Many of the tasks that are difficult and
tedious to parallelize are perfectly suited for a conventional supercomputer architecture
where they require little time. An implementation has recently been described where a
massively parallel architecture was used for the evaluation and processing of
two-electron integrals, while all the other parts of the SCF calculation were carried out on a
conventional Cray supercomputer.49 Large production calculations on such a dual
architecture can run orders of magnitude faster than on any one of the machines alone.
The algorithms can be organized to avoid reliance on high-speed communication, and
similar concepts can therefore be implemented with other combinations of hardware.
The processing of two-electron integrals in parallel SCF calculations is actually
quite simple. After evaluation, the huge amount of information in the two-electron
integrals can be immediately reduced to two-index quantities (the Fock matrix). A
relatively simple situation occurs if full copies of the Fock and density matrix can be
accommodated by each processor. In a shared-memory environment it is reasonable to
keep only one single copy of the density matrix. However, to avoid access conflicts
due to accidental attempts by more than one process to update the same Fock matrix
element, it is usually worthwhile to use this 'replicated data' model for the Fock matrix,
keeping separate copies in memory and adding them only at the end of the integral
processing.
The communication on a distributed-memory system would be reduced to the
transfer of the total density matrix to all processors at the beginning of an iteration, and
the return of the partial Fock matrices at the end of the integral processing. The former
can often be done with very efficient 'broadcast' type operations, whereas the latter can
be combined with a 'cascade' type of addition which would reduce the serial part of the
algorithm and produce the final Fock matrix in log2 Nproc steps.
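
The 'cascade' addition can be pictured as a pairwise tree reduction. The sketch below simulates it serially; the function name `cascade_sum` is hypothetical, and in a real distributed run each round of pairwise additions would be carried out concurrently on different nodes.

```python
import numpy as np

def cascade_sum(partials):
    """Pairwise tree reduction; returns the total and the number of rounds."""
    mats = list(partials)
    rounds = 0
    while len(mats) > 1:
        # In one round, neighbours are added in pairs (concurrently in practice).
        merged = [mats[i] + mats[i + 1] for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2 == 1:
            merged.append(mats[-1])          # odd one out waits for the next round
        mats = merged
        rounds += 1
    return mats[0], rounds

rng = np.random.default_rng(2)
partial_focks = [rng.standard_normal((3, 3)) for _ in range(8)]
total, rounds = cascade_sum(partial_focks)
```

For eight partial matrices the reduction finishes in three rounds, compared with seven sequential additions if a single master did all the work.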
For cases where the total matrices cannot be kept in memory on the nodes, the
situation is more complex. If the Coulomb and exchange contributions are treated
separately as discussed in previous Sections, it is possible to split the calculation into
N² tasks, where each task requires the transfer of one row of the density matrix, the
evaluation of N²/4 integrals, and the return of one row of the Fock matrix. If
communication and computation can be at least partly overlapped, this splitting of the
communication into smaller units may have advantages for large N.
To see how such a procedure can be implemented, it is convenient to re-label
the Fock- and density matrices as

49 J. Almlöf, A. Sargent, and M.W. Feyereisen, SIAM News 26(1), 14 (1993).

(B.la)
(B.lb)

where $f^{(b)}$ and $d^{(d)}$ are columns in these matrices which are now treated as vectors.
A minimum-memory scheme would amount to looping over indices a and b, and
distributing one pair of vectors $\{d^{(a)}, d^{(b)}\}$ to each processor. At the processor level,
the integrals (ab|cd) (for fixed a,b; all c,d) would now be evaluated, and used to
calculate all exchange contributions to $f^{(a)}$ and $f^{(b)}$ as in Eqs. (17.6 c-f)

(B.2a)

(B.2b)

(B.2c)

(B.2d)

We also have to generate the Coulomb interactions, by looping over basis
functions a,c, distributing pairs of vectors $\{d^{(a)}, d^{(c)}\}$ to each processor, and again
evaluating the integrals (ab|cd) (but now for fixed a,c; all b,d). Again, the Coulomb
contributions (B.2) could now be evaluated according to:

\[ f^{(a)}_b = f^{(a)}_b + d^{(c)}_d\, (ab|cd) \tag{B.3a} \]

\[ f^{(c)}_d = f^{(c)}_d + d^{(a)}_b\, (ab|cd) \tag{B.3b} \]

If the number of tasks generated in this way is not sufficient to get an efficient
load balance, but the calculation is still too large to fit the full matrices in memory, it
would also be possible to loop through these tasks one (or a few) at a time, and spread
the work in the inner double-loop across the processors.
Density Functional Theory

Nicholas C. Handy

University Chemical Laboratory


University of Cambridge
Lensfield Road
Cambridge CB2 1EW
Great Britain

1 Introduction

The subject of quantum chemistry may have reached an impasse. Keeping the discussion
to ab initio quantum chemistry we now know how to do very large SCF calculations,
thanks to the introduction of the Direct methodology by Almlöf[1]. We can also manage
to work with good basis sets for such calculations, although I consider that 6-31G* are
not good enough, and probably something nearer to TZ2P is required for definitive SCF
calculations.

Beyond SCF there are major difficulties, all associated with trying to more accurately
represent the electron-electron cusp. We know from the work of Kutzelnigg[2] that the
convergence of this problem is very slow, something like $(l + 1)^{-4}$; this means that very
large basis sets are required for correlated calculations. We know that it is more important
to include d and f basis functions than to improve the methodology. 6-31G* bases are
not appropriate for correlated studies. We also know that the raw costs of the correlated
methods MP2, MP3, MP4, CISD, CCSD, CCSD(T) increase with powers of the size of the
problem as 5, 6, 7, 6, 6, 7. It is simply not possible to contemplate CCSD(T) calculations
with 1000 basis functions, nor will it be sensibly possible to contemplate such calculations
this century. I suggest that we have been misled by the rapid progress that computer
companies have made in the development of their hardware: more memory, vectorisation
compilers, parallel machines, all of which we have taken advantage of, but these advances
will not cure our outstanding basis set problems.
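
The practical meaning of these exponents is easiest to see as cost ratios. The small sketch below only restates the powers quoted above; the dictionary and function names are illustrative, and prefactors are ignored.

```python
# Formal scaling exponents as quoted in the text.
scaling_exponent = {"MP2": 5, "MP3": 6, "MP4": 7, "CISD": 6, "CCSD": 6, "CCSD(T)": 7}

def cost_ratio(method, size_factor):
    """Factor by which the raw cost grows when the problem size grows by size_factor."""
    return size_factor ** scaling_exponent[method]

doubling = {m: cost_ratio(m, 2) for m in scaling_exponent}
ccsdt_100_to_1000 = cost_ratio("CCSD(T)", 10)
```

On this estimate, going from 100 to 1000 basis functions multiplies the CCSD(T) cost by ten million, which no plausible hardware improvement alone will absorb.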

The physicists have been trying to persuade us for years that we ought to study Density
Functional Theory. We ought to have listened more carefully, especially since it was one
of our own, J. C. Slater, who pushed them in that direction with his 1951 contribution[3]
in which he suggested the replacement of the difficult exchange term in the Hartree-Fock
method by the Dirac[4] $\rho^{1/3}$ potential, which he argued at that time included both exchange
and correlation effects. We were largely discouraged by the fact that this original DFT
made molecules substantially overbound. We do not think today that we can forget about
the problems of core electrons and use plane wave representations as the physicists do;
the jury is still out on this problem, as it is on whether we should continue to use
gaussian basis sets or whether they will be a thing of the past. We ignored the DFT work of
people like Jones[5] who was the first to get the bond length of Be2 correct, although it
must be said that the reason the regular quantum chemists got it wrong was that too
small basis sets were being used in those days. So although I think we have all worked
hard and made great progress I rather wonder if we are up against it and the physicists
were right after all. This is why we are looking at DFT.

2 E. Bright Wilson's observation and the Kohn-Hohenberg Theorem
When two of the key theorems of modern Density Functional Theory were introduced in
1965, it is said that the eminent theoretical spectroscopist E. Bright Wilson stood up at
the meeting and said that he understood the basic principles of the theory. He said that
if one knew the exact electron density ρ(r), then the cusps of ρ(r) would occur at the
positions of the nuclei. Furthermore he argued that a knowledge of |∇ρ(r)| at the nuclei
would give their nuclear charges. Thus he argued that the full Schrodinger Hamiltonian
was known because it is completely defined once the position and charge of the nuclei are
given. Hence, in principle, the wavefunction and energy are known, and thus everything
is known. In conclusion, Wilson said he understood that a knowledge of the density was
all that was necessary for a complete determination of all molecular properties. It is this
simple argument which is behind most of the aspirations of modern Density Functional
Theory.
If N is the number of electrons then ρ(r) is defined by

\[ \rho(r_1) = N \int |\Psi(x_1 x_2 \ldots x_N)|^2 \, ds_1\, dx_2 \ldots dx_N \tag{1} \]

where $\Psi(x_1 x_2 \ldots x_N)$ is the electronic wavefunction for the molecule. We observe that

\[ \int \rho(r)\, dr = N \tag{2} \]

Furthermore it may be shown that the nuclear cusp condition gives [6]

\[ \frac{\partial}{\partial r_A} \bar{\rho}(r_A) \Big|_{r_A = 0} = -2 Z_A\, \bar{\rho}(0) \tag{3} \]

where $\bar{\rho}(r_A)$ is the spherical average of ρ(r). Another exact result which is of value for
the electron density is that asymptotically

\[ \rho \sim \exp[-2 (2 I_{min})^{1/2}\, r] \tag{4} \]

where $I_{min}$ is the exact first ionisation potential[7]. It will become clear that in DFT
exact results are of the greatest value.
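
Both exact conditions can be checked on the one system where the density is known in closed form. The sketch below assumes the cusp condition in its usual form, slope at the nucleus equal to -2Z times the density at the nucleus, and uses the hydrogen-like 1s density in atomic units with ionisation potential Z²/2; the variable names are illustrative.

```python
import numpy as np

Z = 1.0                                   # nuclear charge (hydrogen atom)

def rho(r):
    """Hydrogen-like 1s density in atomic units."""
    return Z**3 / np.pi * np.exp(-2.0 * Z * r)

# Nuclear cusp: one-sided finite-difference slope at the nucleus.
h = 1.0e-6
slope_at_nucleus = (rho(h) - rho(0.0)) / h
cusp_value = -2.0 * Z * rho(0.0)

# Asymptotic decay: logarithmic slope far from the nucleus.
I_min = Z**2 / 2.0                        # exact ionisation potential of the 1s state
r1, r2 = 5.0, 6.0
log_slope = np.log(rho(r2) / rho(r1)) / (r2 - r1)
expected_log_slope = -2.0 * np.sqrt(2.0 * I_min)
```

For an approximate density from a Gaussian basis neither test is satisfied exactly, since Gaussians have zero slope at the nucleus and decay too quickly at long range.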
Hohenberg-Kohn[8] theorem 1. The electron density ρ(r) determines the external
potential.
Proof. Let there be two external potentials $v_1(r)$, $v_2(r)$ arising from the same ρ(r). Thus
there will be two hamiltonians $H_1$ and $H_2$ with the same (ground state) density, but
different wavefunctions $\Psi_1$ and $\Psi_2$. Now use the variational principle:

\[ E_1^0 < \langle\Psi_2|H_1|\Psi_2\rangle = \langle\Psi_2|H_2|\Psi_2\rangle + \langle\Psi_2|H_1 - H_2|\Psi_2\rangle \tag{5} \]

\[ = E_2^0 + \int \rho(r)[v_1(r) - v_2(r)]\, dr \tag{6} \]

where $E_1^0$ and $E_2^0$ are the ground state energies for $H_1$ and $H_2$, respectively. Note that it
is here in the theorem that the restriction to the ground state has entered. In Eqn 5 the
subscripts 1 and 2 may be interchanged to give a second inequality. The two inequalities
may then be added to give $E_1^0 + E_2^0 < E_2^0 + E_1^0$, which is a contradiction. Hence the
external potential is uniquely determined by ρ(r).
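
Written out, the final step of the proof reads as follows (a worked restatement in the notation used above, with $E_1^0$, $E_2^0$ the two ground state energies and $v_1$, $v_2$ the two external potentials):

\[ E_1^0 < E_2^0 + \int \rho(r)[v_1(r) - v_2(r)]\, dr \]
\[ E_2^0 < E_1^0 + \int \rho(r)[v_2(r) - v_1(r)]\, dr \]
\[ \Rightarrow \quad E_1^0 + E_2^0 < E_1^0 + E_2^0 \]

The two integrals cancel on addition precisely because both hamiltonians were assumed to share the same density ρ(r).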
We may therefore represent the energy as a functional of the density as follows

\[ E[\rho] = V_{ne}[\rho] + T[\rho] + V_{ee}[\rho] \tag{7} \]

\[ = \int \rho(r) v(r)\, dr + T[\rho] + V_{ee}[\rho] \tag{8} \]

where T[ρ] is the kinetic energy and $V_{ee}[\rho]$ is the electron-electron interaction energy which
contains the coulomb interaction J[ρ] which is given by:

\[ J[\rho] = \frac{1}{2} \int\!\!\int \frac{\rho(r_1)\rho(r_2)}{r_{12}}\, dr_1 dr_2 \tag{9} \]

The second Hohenberg-Kohn theorem allows us to introduce the variational principle,
although it restricts the theory to ground states only. Any approximate density $\tilde{\rho}$, by the
first theorem, determines its own hamiltonian and wavefunction $\tilde{\Psi}$. Using this
wavefunction in the usual variational principle we obtain

\[ \langle\tilde{\Psi}|H|\tilde{\Psi}\rangle = \int \tilde{\rho}(r) v(r)\, dr + T[\tilde{\rho}] + V_{ee}[\tilde{\rho}] = E[\tilde{\rho}] \ge E[\rho] \tag{10} \]

This variational principle allows us to write down the condition that the energy, Eqn 7,
is stationary with respect to changes in the density, subject to the constraint that Eqn 2
holds:

\[ \delta E[\rho] - \mu\, \delta\Big[\int \rho(r)\, dr - N\Big] = 0 \tag{11} \]

for which the Euler-Lagrange equation is, in terms of functional derivatives

\[ \mu = v(r) + \frac{\delta T[\rho]}{\delta\rho(r)} + \frac{\delta V_{ee}[\rho]}{\delta\rho(r)} \tag{12} \]

This last equation is an exact equation for ρ(r), if only we knew the functional forms of
T[ρ] and $V_{ee}[\rho]$. We now go on to convert this equation into a set of working equations.
Kohn and Sham[9] introduced the idea of considering the determinantal wavefunction for
N noninteracting electrons in N orbitals $\phi_i$. For such a system the kinetic energy and the
electron density are exactly given by

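\[ T_s[\rho] = \sum_i^N \langle\phi_i| -\tfrac{1}{2}\nabla^2 |\phi_i\rangle \tag{13} \]

\[ \rho(r) = \sum_i^N |\phi_i(r)|^2 \tag{14} \]

The orbitals obey an equation of the form

\[ [-\tfrac{1}{2}\nabla^2 + v_s(r)]\, \phi_i(r) = \epsilon_i \phi_i(r) \tag{15} \]

and the energy of this system is given by

\[ E[\rho] = T_s[\rho] + \int v_s(r)\rho(r)\, dr \tag{16} \]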

Equations (15) are the Euler equations obtained when $E[\rho]$ is minimised with respect
to variations in the orbitals which constitute the density as given by (14), subject to the
constraint that they remain normalised.
Now we return to our problem with interacting electrons and we write the energy in
different ways:

$$E[\rho] = \int v(r)\rho(r)\,dr + T[\rho] + V_{ee}[\rho] \qquad (17)$$

$$= \int v(r)\rho(r)\,dr + T_s[\rho] + J[\rho] + (T[\rho] - T_s[\rho] + V_{ee}[\rho] - J[\rho]) \qquad (18)$$

$$= \int v(r)\rho(r)\,dr + T_s[\rho] + J[\rho] + E_{xc}[\rho] \qquad (19)$$

The first line is from Eqn 7, the second line inserts and removes the noninteracting kinetic
energy and the coulomb energy, and the third line introduces the exchange-correlation energy
functional, the functional derivative of which is the exchange-correlation potential $v_{xc}$:

$$E_{xc}[\rho] = T[\rho] - T_s[\rho] + V_{ee}[\rho] - J[\rho] \qquad (20)$$

$$v_{xc}(r) = \frac{\delta E_{xc}[\rho]}{\delta\rho(r)} \qquad (21)$$

On comparing eqns (15), (19) and (21) we deduce that the problem has been recast into
one involving noninteracting electrons in N orbitals which obey the equation

$$\Big[-\tfrac{1}{2}\nabla^2 + v(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' + v_{xc}(r)\Big]\phi_i(r) = \epsilon_i\phi_i(r) \qquad (22)$$

These are the Kohn-Sham equations for the Kohn-Sham orbitals $\phi_i$. Note that the key
property of them is that they give the exact density through eqn (14), once the exact
exchange-correlation functional $E_{xc}[\rho]$ has been determined.
Finally I make an observation. To be useful to chemistry, DFT must be applicable to
ground and excited states. The theory outlined above involving the work of Hohenberg,
Sham and Kohn is thought by some to be rigorous for ground states. The Bright Wilson
argument is not constrained to ground states, but it is not considered rigorous. However
it is plausible. In so far as I believe that DFT can only be a semi-empirical theory, because
it will probably never be possible to find the exact exchange-correlation functional,
plausibility arguments are important. I believe that DFT is the best semi-empirical
method, because such parameters as are introduced into the functionals are not molecule
specific.

3 The Kohn-Sham equations


We must first make the obvious observation that the Kohn-Sham equations look like the
SCF equations. So let us move to a basis set representation in which the orbitals are
expanded in terms of a basis set

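$$\phi_i = \sum_{\alpha} c_{\alpha i}\,\eta_\alpha \qquad (23)$$

The Kohn-Sham equations then take the form

$$\sum_{\beta}\Big\langle\eta_\alpha\Big| -\tfrac{1}{2}\nabla^2 + v(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' + v_{xc}(r) - \epsilon_i \Big|\eta_\beta\Big\rangle c_{\beta i} = 0 \qquad (24)$$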

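This is to be compared with the corresponding SCF equations

$$\sum_{\beta}\Big\langle\eta_\alpha\Big| -\tfrac{1}{2}\nabla^2 + v(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' \qquad (25)$$

$$\qquad - \int \sum_{j} \frac{\phi_j(r)\phi_j(r')}{|r - r'|}\,dr'\,P_{rr'} - \epsilon_i \Big|\eta_\beta\Big\rangle c_{\beta i} = 0 \qquad (26)$$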

We see that the usual exchange term has been replaced by the $v_{xc}(r)$ potential. Thus in
principle it is easy to modify an SCF code to make it into a DFT code: merely delete
the exchange contribution and replace it by $v_{xc}(r)$. Similarly the exchange contribution to
the total energy is deleted and replaced by the $E_{xc}[\rho]$ term. In our code we have done
precisely this, keeping gaussian basis sets and using our gaussian integral codes to evaluate
the kinetic, nuclear attraction and coulomb contributions to the Kohn-Sham (KS) matrix
and the total energy. We discuss the evaluation of the specific DFT contribution in section
7. All aspects of convergence are dealt with in the same way as in standard SCF codes.
At this stage we observe that such a procedure may be followed for any SCF method which
holds for a single determinant wavefunction. In other words Kohn-Sham methodology is
immediately available for (i) closed shell molecules (ii) the unrestricted representation (iii)
the high spin open shell molecule. All of these may be applied for the lowest state of a
given symmetry.

4 One and Two Particle Density Matrices


We shall need some standard definitions of density matrices. If $\Psi(r_1, r_2, r_3, \ldots)$ is a
normalised N electron wavefunction, then we shall discuss the two particle density matrix
defined by

$$\rho_2(r_1', r_2'; r_1, r_2) = \frac{N(N-1)}{2}\int\!\cdots\!\int \Psi^*(r_1', r_2', r_3, r_4, \ldots)\,\Psi(r_1, r_2, r_3, r_4, \ldots)\,dr_3\,dr_4 \ldots ds_1\,ds_2\,ds_3\,ds_4 \ldots \qquad (27)$$

and its diagonal form

$$\rho_2(r_1, r_2) = \rho_2(r_1, r_2; r_1, r_2) \qquad (28)$$

The factor $N(N-1)/2$ in front is the number of equivalent pairs. Similarly the one particle
density matrices are

$$\rho_1(r_1', r_1) = N\int\!\cdots\!\int \Psi^*(r_1', r_2, r_3, \ldots)\,\Psi(r_1, r_2, r_3, \ldots)\,dr_2\,dr_3 \ldots ds_1\,ds_2\,ds_3 \ldots \qquad (29)$$

$$\rho_1(r_1) = N\int\!\cdots\!\int |\Psi(r_1, r_2, r_3, \ldots)|^2\,dr_2\,dr_3 \ldots ds_1\,ds_2\,ds_3 \ldots \qquad (30)$$

The last equation gives the usual electron density $\rho(r) = \rho_1(r)$.
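The energy E, given as the expectation value of the hamiltonian, is expressed in terms of
these matrices,

$$E = \int [-\tfrac{1}{2}\nabla_r^2\rho_1(r', r)]_{r'=r}\,dr + \int v(r)\rho(r)\,dr + \int\!\!\int \frac{\rho_2(r_1, r_2)}{r_{12}}\,dr_1\,dr_2 \qquad (31)$$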

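In Hartree-Fock or Kohn-Sham theory the one particle density matrices are represented
in terms of orbitals:

$$\rho_1(r_1', r_1) = 2\sum_i \phi_i(r_1')\phi_i(r_1) \qquad (32)$$

$$\rho_1(r) = 2\sum_i |\phi_i(r)|^2 \qquad (33)$$

In Hartree-Fock theory, the exchange energy is

$$K = -\int\!\!\int \sum_{ij} \frac{\phi_i(r_1)\phi_j(r_2)\phi_i(r_2)\phi_j(r_1)}{|r_1 - r_2|}\,dr_1\,dr_2 \qquad (34)$$

$$= -\frac{1}{4}\int\!\!\int \frac{|\rho_1(r_1, r_2)|^2}{|r_1 - r_2|}\,dr_1\,dr_2 \qquad (35)$$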

5 The Uniform Electron Gas - The Local Density Approximation

There is no straightforward way in which the exchange correlation functional $E_{xc}[\rho]$ can be
systematically improved, unlike regular quantum chemistry. In quantum chemistry SCF
can be improved in principle to unlimited accuracy through configuration interaction or
perturbation theory. The way forward in DFT is to start from a model for which there is
an exact solution. This model is the uniform electron gas.
We follow Parr and Yang[10]. It is defined as a large number N of electrons in a cube
of volume $V = l^3$, throughout which there is uniformly spread out a positive charge
sufficient to make the system neutral. The uniform electron gas corresponds to the limit
$N \to \infty$, $V \to \infty$, with the density $\rho = N/V$ remaining finite. The ground state energy
is

$$E[\rho] = T_s[\rho] + \int \rho(r)v(r)\,dr + J[\rho] + E_{xc}[\rho] + E_b \qquad (36)$$

with $E_b$ being the electrostatic energy of the positive background, which is equal to the
coulomb energy because the positive charge density $n(r)$ is simply the negative of $\rho(r)$.
Because the external potential is defined by

$$v(r) = -\int \frac{n(r')}{|r - r'|}\,dr' \qquad (37)$$

it follows that the second, third and fifth terms of Eqn 36 add to zero, and therefore

$$E[\rho] = T_s[\rho] + E_{xc}[\rho] \qquad (38)$$

$$= T_s[\rho] + E_x[\rho] + E_c[\rho] \qquad (39)$$

where we have now split up the exchange-correlation term into an exchange term plus a
correlation term.
We will now briefly indicate how the expressions for $T_s[\rho]$ and $E_x[\rho]$ are obtained. The
KS equations are satisfied by plane waves

$$\phi_k(r) = \frac{1}{V^{1/2}}\,e^{ik\cdot r} \qquad (40)$$

where periodic boundary conditions require

$$k_x = \frac{2\pi}{l}n_x, \quad k_y = \frac{2\pi}{l}n_y, \quad k_z = \frac{2\pi}{l}n_z, \qquad n_x, n_y, n_z = 0, \pm 1, \pm 2, \ldots \qquad (41)$$


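The one particle density matrix is given by

$$\rho_1(r_1, r_2) = \frac{2}{V}\sum_{k}^{occ} e^{ik\cdot(r_1 - r_2)} \qquad (42)$$

which, on replacing the sum by an integral over $dn$, and using $dn = (l/2\pi)^3\,dk = (V/8\pi^3)\,dk$,
yields the one particle density matrix

$$\rho_1(r_1, r_2) = \frac{1}{4\pi^3}\int e^{ik\cdot(r_1 - r_2)}\,dk \qquad (43)$$

$$= \frac{1}{4\pi^3}\int_0^{k_F} k^2\,dk \int\!\!\int e^{ik\cdot r_{12}}\sin\theta\,d\theta\,d\phi \qquad (44)$$
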

The Fermi level is defined by evaluating the density $\rho(r) = \rho_1(r, r)$:

$$\rho(r) = \frac{k_F^3}{3\pi^2} \qquad (45)$$

$$k_F = [3\pi^2\rho(r)]^{1/3} \qquad (46)$$

To evaluate $\rho_1(r_1, r_2)$ we introduce the coordinates

$$r = (r_1 + r_2)/2, \quad s = r_1 - r_2 \qquad (47)$$

and choose s to lie along the $k_z$ axis. The integral may then be evaluated to give the
exact first order spinless density matrix

$$\rho_1(r_1, r_2) = 3\rho(r)\,\frac{\sin t - t\cos t}{t^3}, \quad t = k_F(r)\,s \qquad (48)$$

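The kinetic energy and the exchange energy are evaluated through

$$T_s[\rho] = -\frac{1}{2}\int [\nabla^2_{r_1}\rho_1(r_1, r_2)]_{r_2 = r_1}\,dr_1 \qquad (49)$$

$$E_x[\rho] = -\frac{1}{4}\int\!\!\int \frac{[\rho_1(r, s)]^2}{s}\,dr\,ds \qquad (50)$$

These integrals are evaluated with the help of the following identities

$$\int_0^\infty \frac{(\sin t - t\cos t)^2}{t^5}\,dt = \frac{1}{4} \qquad (51)$$

$$\nabla^2_{r_1}\rho_1(r_1, r_2)|_{r_1 = r_2} = [(\tfrac{1}{4}\nabla_r^2 + \nabla_s^2 + \nabla_r\cdot\nabla_s)\rho_1(r, s)]_{s=0} \qquad (52)$$

$$\int \nabla^2\rho(r)\,dr = 0 \qquad (53)$$

$$\nabla_s^2 = \frac{1}{s^2}\frac{d}{ds}\,s^2\frac{d}{ds} \qquad (54)$$

The result is that

$$T_s[\rho] = C_F\int \rho(r)^{5/3}\,dr \qquad (55)$$

$$E_x[\rho] = -C_x\int \rho(r)^{4/3}\,dr \qquad (56)$$

$$C_F = \frac{3}{10}(3\pi^2)^{2/3} = 2.8712 \qquad (57)$$

$$C_x = \frac{3}{4}\Big(\frac{3}{\pi}\Big)^{1/3} = 0.7386 \qquad (58)$$
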

It is usual to introduce the exchange energy per particle $e_x$ as a function of $r_s$, the radius
of a sphere whose volume is the effective volume of an electron

$$\frac{4}{3}\pi r_s^3 = \frac{1}{\rho} \qquad (59)$$

$$E_x[\rho] = \int \rho(r)\,e_x(r_s)\,dr \qquad (60)$$

$$e_x(r_s) = -\frac{0.4582}{r_s} \qquad (61)$$

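In the case when the alpha spin density is not equal to the beta spin density, the kinetic
and exchange energies are

$$T_s[\rho^\alpha, \rho^\beta] = 2^{2/3}C_F\int [(\rho^\alpha)^{5/3} + (\rho^\beta)^{5/3}]\,dr \qquad (62)$$

$$E_x[\rho^\alpha, \rho^\beta] = \int \rho\,e_x(\rho, \zeta)\,dr \qquad (63)$$

$$e_x(\rho, \zeta) = e_x^0(\rho) + [e_x^1(\rho) - e_x^0(\rho)]\,f(\zeta) \qquad (64)$$

$$f(\zeta) = 0.5\,(2^{1/3} - 1)^{-1}\,[(1 + \zeta)^{4/3} + (1 - \zeta)^{4/3} - 2] \qquad (65)$$

where the spin polarisation parameter $\zeta$ is given by $\zeta = (\rho^\alpha - \rho^\beta)/\rho$.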
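As a numerical check on Eqns 57 to 65, the following minimal sketch evaluates the constants and the spin-interpolated exchange energy per particle. The function names are illustrative only, and the relation $e_x^1 = 2^{1/3}e_x^0$ for the fully polarised gas is the standard spin-scaling result, assumed here rather than quoted from the text.

```python
import math

# Constants of Eqns 57 and 58 (atomic units).
C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def density_from_rs(r_s):
    """Invert Eqn 59: (4/3) pi r_s^3 = 1 / rho."""
    return 3.0 / (4.0 * math.pi * r_s ** 3)


def f_zeta(zeta):
    """Spin interpolation function of Eqn 65."""
    bracket = (1.0 + zeta) ** (4.0 / 3.0) + (1.0 - zeta) ** (4.0 / 3.0) - 2.0
    return 0.5 * bracket / (2.0 ** (1.0 / 3.0) - 1.0)


def exchange_per_particle(rho, zeta=0.0):
    """Eqn 64, with e_x^0 = -C_x rho^(1/3).

    Assumption: e_x^1 = 2^(1/3) e_x^0 (standard spin scaling of exchange).
    """
    e0 = -C_X * rho ** (1.0 / 3.0)
    e1 = 2.0 ** (1.0 / 3.0) * e0
    return e0 + (e1 - e0) * f_zeta(zeta)
```

For example, `exchange_per_particle(density_from_rs(1.0))` should reproduce the coefficient of Eqn 61 with its sign.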


For the correlation functional of the electron gas, reliance is placed on the numerical
simulations of the uniform electron gas by Ceperley and Alder[11] using the quantum
Monte-Carlo method for several different values of $r_s$. The correlation energy was obtained
by subtracting the kinetic and exchange energies from Eqn 38. Also using analytic
information for the high and the low density limit, Vosko, Wilk and Nusair[12] give the
following accepted form for $e_c(r_s)$ which covers both the spin polarised and spin compensated
cases

$$e_c(r_s) = \frac{A}{2}\Big\{\ln\frac{x^2}{X(x)} + \frac{2b}{Q}\arctan\frac{Q}{2x + b} - \frac{bx_0}{X(x_0)}\Big[\ln\frac{(x - x_0)^2}{X(x)} + \frac{2(b + 2x_0)}{Q}\arctan\frac{Q}{2x + b}\Big]\Big\} \qquad (66)$$

where $x = r_s^{1/2}$, $X(x) = x^2 + bx + c$, $Q = (4c - b^2)^{1/2}$, and $A = 0.0621814, 0.0310907$,
$x_0 = -0.409286, -0.743294$, $b = 13.0720, 20.1231$ and
$c = 42.7198, 101.578$ for $e_c^0(r_s)$, $e_c^1(r_s)$ respectively (note that there is an error in Parr
and Yang's formula (65)).
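Eqn 66 is easy to mistype, so a direct transcription is a useful check. The sketch below codes the formula as reconstructed above with the parameters quoted in the text; the names `VWN_PARAMETERS` and `vwn_correlation` are illustrative, and the result is assumed to be in atomic units.

```python
import math

# (A, x0, b, c) as quoted in the text for the spin compensated and
# fully spin polarised cases; A enters Eqn 66 as A/2.
VWN_PARAMETERS = {
    "paramagnetic": (0.0621814, -0.409286, 13.0720, 42.7198),
    "ferromagnetic": (0.0310907, -0.743294, 20.1231, 101.578),
}


def vwn_correlation(r_s, kind="paramagnetic"):
    """Correlation energy per particle from Eqn 66."""
    a, x0, b, c = VWN_PARAMETERS[kind]
    x = math.sqrt(r_s)
    q = math.sqrt(4.0 * c - b * b)

    def big_x(y):
        return y * y + b * y + c

    arctan_term = math.atan(q / (2.0 * x + b))
    first = math.log(x * x / big_x(x)) + (2.0 * b / q) * arctan_term
    second = (math.log((x - x0) ** 2 / big_x(x))
              + (2.0 * (b + 2.0 * x0) / q) * arctan_term)
    return 0.5 * a * (first - (b * x0 / big_x(x0)) * second)
```

Evaluating `vwn_correlation(1.0)` gives a value close to -0.079, and the magnitude falls as $r_s$ increases, which is the expected qualitative behaviour.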
This completes the description of the uniform electron gas and its functionals. Use of
these functionals is often referred to as the Local Density Approximation (LDA) or the
Local Spin Density approximation.
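Finally we comment that the form of the LDA kinetic and exchange functionals can be
obtained by scaling arguments. First of all, if $\Psi(r_1, r_2, \ldots, r_N)$ is normalised, then one
may scale the coordinates to obtain a new normalised function

$$\Psi_\lambda(r_1, r_2, \ldots, r_N) = \lambda^{3N/2}\,\Psi(\lambda r_1, \lambda r_2, \ldots, \lambda r_N) \qquad (67)$$

Correspondingly the density $\rho_\lambda$ of the scaled wavefunction is given by

$$\rho_\lambda(r) = \lambda^3\rho(\lambda r) \qquad (68)$$

Thus if we write the kinetic and exchange terms as

$$T[\rho] = \int t(\rho)\,dr \qquad (69)$$

$$E_x[\rho] = \int k(\rho)\,dr \qquad (70)$$

then using (68) we obtain

$$T[\rho_\lambda] = \int t(\lambda^3\rho(\lambda r))\,dr = \lambda^{-3}\int t(\lambda^3\rho(r))\,dr \qquad (71)$$

$$E_x[\rho_\lambda] = \int k(\lambda^3\rho(\lambda r))\,dr = \lambda^{-3}\int k(\lambda^3\rho(r))\,dr \qquad (72)$$

But we may assume that if we evaluate these from the wavefunction, then the key operators
are $\nabla^2$ and $r^{-1}$. Thus we also have

$$T[\rho_\lambda] = \lambda^2\int t(\rho(r))\,dr \qquad (73)$$

$$E_x[\rho_\lambda] = \lambda\int k(\rho(r))\,dr \qquad (74)$$

We therefore deduce

$$t(\lambda^3\rho(r)) = \lambda^5\,t(\rho(r)) \qquad (75)$$

$$k(\lambda^3\rho(r)) = \lambda^4\,k(\rho(r)) \qquad (76)$$

or

$$t(\lambda\rho) = \lambda^{5/3}\,t(\rho) \qquad (77)$$

$$k(\lambda\rho) = \lambda^{4/3}\,k(\rho) \qquad (78)$$

as required.
A simpler argument may be given by observing that the dimensions of $\rho$, T and $E_x$ are
$L^{-3}$, $L^{-2}$, $L^{-1}$ respectively, so that the integrands t and k, which carry a further factor
$L^{-3}$, must vary as $\rho^{5/3}$ and $\rho^{4/3}$.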
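The scaling relations can be checked numerically. The sketch below applies the LDA kinetic and exchange functionals to a scaled hydrogen 1s density, $\rho_\lambda(r) = \lambda^3\rho(\lambda r)$, and confirms that they scale as $\lambda^2$ and $\lambda$. The radial mapping $r = q^2/(1-q)^2$ and all function names are assumptions made for this illustration.

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def scaled_1s_density(r, lam=1.0):
    """rho_lambda(r) = lambda^3 rho(lambda r) for rho(r) = exp(-2r)/pi."""
    return lam ** 3 * math.exp(-2.0 * lam * r) / math.pi


def radial_integral(func, n=400):
    """Integrate func(r) * 4 pi r^2 over [0, inf) with equally spaced q.

    Assumed mapping: r = q^2 / (1 - q)^2; the integrand vanishes at q = 0, 1.
    """
    total = 0.0
    for i in range(1, n):
        q = i / n
        r = q * q / (1.0 - q) ** 2
        drdq = 2.0 * q / (1.0 - q) ** 3
        total += func(r) * 4.0 * math.pi * r * r * drdq
    return total / n


def lda_kinetic(lam):
    return radial_integral(lambda r: C_F * scaled_1s_density(r, lam) ** (5.0 / 3.0))


def lda_exchange(lam):
    return radial_integral(lambda r: -C_X * scaled_1s_density(r, lam) ** (4.0 / 3.0))
```

With $\lambda = 2$ the ratios `lda_kinetic(2.0) / lda_kinetic(1.0)` and `lda_exchange(2.0) / lda_exchange(1.0)` come out as 4 and 2 to quadrature accuracy.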

6 Beyond the LDA approximation

A summary of calculations which is presented later makes it clear that the LDA is not
adequate for useful predictions for computational chemistry. Indeed LDA is no better
than SCF as a rough approximation and so there is no point in doing it on its own. We
therefore discuss some important improvements which have been made in recent years.

6.1 The Becke Exchange Correction

One of the most important deficiencies of the LDA exchange is that it does not have the
correct asymptotic behaviour. Becke[13] starts from an exact representation of $V_{ee}[\rho]$ in
terms of the diagonal two particle density matrix $\rho_2(r_1, r_2)$

$$V_{ee}[\rho] = \int\!\!\int \frac{\rho_2(r_1, r_2)}{r_{12}}\,dr_1\,dr_2 \qquad (79)$$

$$\rho_2(r_1, r_2) = \frac{1}{2}\rho(r_1)\rho(r_2)[1 + h(r_1, r_2)] \qquad (80)$$

where $h(r_1, r_2)$ is the pair correlation function. The exchange-correlation hole is defined
by

$$\rho_{xc}(r_1, r_2) = \rho(r_2)\,h(r_1, r_2) \qquad (81)$$

Integration of $\rho_2(r_1, r_2)$ with respect to $r_2$, using the definitions (27) and (29), yields

$$\frac{N-1}{2}\rho(r_1) = \frac{N}{2}\rho(r_1) + \frac{1}{2}\rho(r_1)\int \rho_{xc}(r_1, r_2)\,dr_2 \qquad (82)$$

from which it follows that

$$\int \rho_{xc}(r_1, r_2)\,dr_2 = -1 \qquad (83)$$

If we confine ourselves to the exchange part, then we may write the exchange energy in
terms of an exchange potential $f_x(r_1)$:

$$E_x = \frac{1}{2}\int \rho(r_1)f_x(r_1)\,dr_1 \qquad (84)$$

$$f_x(r_1) = \int \frac{\rho_x(r_1, r_2)}{r_{12}}\,dr_2 \qquad (85)$$

Now let $r_1 \to \infty$, and use Eqn 82, to obtain

$$\lim_{r_1\to\infty} f_x(r_1) = -\frac{1}{r_1} \qquad (86)$$

This puts a constraint on the exchange potential, which is not obeyed by the LDA form,
because the well known asymptotic exponential dependence of $\rho$, $\exp(-ar)$, means that
the LDA $e_x$ has the asymptotic form $\exp(-ar/3)$. Thus a new term must be added. In
order to obtain this inverse distance dependence Becke recognised that it was necessary to
introduce both a logarithm and a term which involved the gradient of the density. After
investigations Becke's additional term has the form

$$e_x^B = -\beta\rho^{1/3}\,\frac{x^2}{1 + 6\beta x\sinh^{-1}x} \qquad (87)$$

$$x = \frac{|\nabla\rho|}{\rho^{4/3}} \qquad (88)$$


It has one adjustable parameter $\beta$ which was chosen so that the sum of the LDA and
Becke exchange terms accurately reproduces the exchange energies of six noble gas atoms,
$\beta = 0.0042$. We notice that this Becke exchange correction involves the gradient of the
density. This is natural, because the LDA approximation assumes that the molecular
density is homogeneous, which it is not, and formally a density gradient expansion is
required to introduce inhomogeneity.
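The asymptotic argument can be illustrated numerically. The sketch below evaluates the Becke term for an exponential density $\rho = \exp(-ar)/\pi$ and compares it with $-1/(2r)$, which is the per-particle equivalent of $f_x \to -1/r$ given the factor 1/2 in Eqn 84. It assumes Eqns 87 and 88 in the form $e_x^B = -\beta\rho^{1/3}x^2/(1 + 6\beta x\sinh^{-1}x)$ with $x = |\nabla\rho|/\rho^{4/3}$; the function names and the model density are illustrative.

```python
import math

BETA = 0.0042
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def becke_term(rho, grad_rho):
    """Gradient correction per particle, Eqns 87-88 as assumed in the lead-in."""
    # x is formed as (grad/rho) / rho^(1/3) to avoid underflow in rho^(4/3).
    x = (grad_rho / rho) / rho ** (1.0 / 3.0)
    return -BETA * rho ** (1.0 / 3.0) * x * x / (1.0 + 6.0 * BETA * x * math.asinh(x))


def lda_term(rho):
    """LDA exchange energy per particle, -C_x rho^(1/3)."""
    return -C_X * rho ** (1.0 / 3.0)


def asymptotic_ratio(r, a=2.0):
    """Ratio of the Becke term to -1/(2r) for rho = exp(-a r)/pi."""
    rho = math.exp(-a * r) / math.pi
    return becke_term(rho, a * rho) / (-1.0 / (2.0 * r))
```

The ratio approaches unity only logarithmically slowly as r grows, while the LDA term is already negligible at moderate distances.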

6.2 The Lee-Yang-Parr Correlation Potential


Colle and Salvetti[14] have presented an approximate correlation energy formula for helium
in terms of the second order Hartree-Fock density matrix. Lee, Yang and Parr[15]
turned this formula into an explicit functional of $\rho$ involving the gradient and the laplacian.
Miehlich, Savin, Stoll and Preuss[16] eliminated the laplacian terms by integration
by parts, and for a closed shell system the result is

$$E_c = -a\int \frac{\rho}{1 + d\rho^{-1/3}}\,dr - ab\int \omega\Big\{\rho^2\Big[C_F\rho^{8/3} + |\nabla\rho|^2\Big(\frac{5}{12} - \frac{7\delta}{72}\Big)\Big] - \frac{11}{24}\rho^2|\nabla\rho|^2\Big\}\,dr \qquad (89)$$

where

$$\omega = \frac{\exp(-c\rho^{-1/3})}{1 + d\rho^{-1/3}}\,\rho^{-11/3} \qquad (90)$$

$$\delta = c\rho^{-1/3} + \frac{d\rho^{-1/3}}{1 + d\rho^{-1/3}} \qquad (91)$$

(92)
and a = 0.04918, b = 0.132, c = 0.2533, d = 0.349, which are the Colle-Salvetti parameters
from their fit to the helium atom. The great advantage of this functional is that it
was derived from an actual correlated wavefunction for a two electron system, and has no
relation to the uniform electron gas. It also contains gradient terms.
A note on the terminology for functionals. We use the notation S-VWN for LDA, recognising
Slater's primary contribution and use of Dirac's exchange term, together with Vosko,
Wilk and Nusair. We use the notation B-LYP for Dirac exchange + Becke correction +
Lee, Yang, Parr for the correlation functional. At this stage it is appropriate to note that
the introduction of gradient terms introduces mathematical complexities. In particular if

$$E_{xc} = \int F(\rho, \nabla\rho)\,dr \qquad (93)$$

then the exchange correlation potential is the functional derivative with respect to $\rho$, i.e.

$$v_{xc} = \frac{\delta E_{xc}}{\delta\rho} = \frac{\partial F}{\partial\rho} - \frac{d}{dr}\cdot\frac{\partial F}{\partial\nabla\rho} \qquad (94)$$

Thus one can immediately see that an evaluation of $v_{xc}$ demands an evaluation of the
second derivatives of basis functions.

6.3 Latest Developments

Clearly it is a great challenge to develop improved functionals for molecular studies. One
really proceeds in an ad hoc fashion, the key object being to obtain better agreement with
experiment for predicted properties. As an example we followed the work of Wigner[17]
who devised a correlation energy functional of the form

$$E_c = \int \frac{c_1\rho^{4/3}}{1 + c_2\rho^{1/3}}\,dr \qquad (95)$$

It does not take much imagination therefore to propose a functional for which the Dirac
term is replaced by

(96)

where we have recognised that the separation into exchange and correlation parts is not
very rigorous. We[18] have used this new form in conjunction with the Becke exchange
correction and the LYP correlation functional, optimising the P(= 0.02) and P(= 0.008)
parameters. We have called the functional CAM(B)-LYP and the results (in the tables)
are encouraging.

7 Numerical Quadrature
This is one of the difficulties of DFT. It is quite clear that the new integrals which arise
in the KS equations may not be evaluated by analytic means because of the fractional
powers of the density which arise. There are various ways forward, and they all involve
a grid of points in molecular 3 dimensional space. Some people favour a least squares
fit of $v_{xc}$ to an auxiliary gaussian basis set, some people favour a completely numerical
approach. Here we shall simply evaluate the required integrals using quadrature, and we
now describe a quadrature scheme which we have found satisfactory.

7.1 Voronoi Polyhedra

First we use the Becke[19] scheme for the decomposition of the integrand (which we write
here simply as $F(r)$) into single centre components through the introduction of weight
functions $w_A(r)$ which have a value near unity near nucleus A and which vanish in a
well-behaved manner near any other nucleus. The relevant equations are

(97)

$$F(r) = \sum_A F_A(r) \qquad (98)$$

$$F_A(r) = w_A(r)F(r) \qquad (99)$$

$$I = \sum_A I_A \qquad (100)$$

$$I_A = \int F_A(r)\,dr \qquad (101)$$

To determine the weight functions Becke introduced confocal elliptic coordinates $(\lambda, \mu, \phi)$
for two centres A and B. The key variable is $\mu$ defined by

$$\mu_{AB} = \frac{r_A - r_B}{R_{AB}} \qquad (102)$$

where $r_A$ and $r_B$ are the distances of the point from nuclei A and B, and $R_{AB}$ is the
internuclear distance. $\mu_{AB}$ takes the value -1 on A and also along the axis beyond A, and it takes the value +1
on B and along the axis beyond B. Becke defined a function $s(\mu_{AB})$ which was +1 for
$-1 \le \mu_{AB} \le 0$ and 0 for $0 < \mu_{AB} \le +1$. The weight function is then defined by

$$P_A(r) = \prod_{B \ne A} s(\mu_{AB}) \qquad (103)$$

$$w_A(r) = P_A(r)\Big/\sum_B P_B(r) \qquad (104)$$

By this scheme the full space has been divided into Voronoi polyhedra surrounding each
nucleus A. Becke then smoothed out the discontinuities at $\mu_{AB} = 0$ into continuous
mutually overlapping regions. He constructed functions $s(\mu)$ such that $s(-1) = 1$, $s(+1) = 0$,
$ds/d\mu = 0$ at $\mu = \pm 1$, to ensure that '$s(\mu)$ does not have nuclear cusps'. He found it
desirable to work with a function for which many derivatives were zero at the ends.
We[20] use the following form for s:

$$\frac{ds}{d\mu} = A_{m_\mu}(1 - \mu^2)^{m_\mu} \qquad (105)$$

with $A_{m_\mu}$ being chosen to ensure that $s(\mu)$ has the value 1, 0 at $\mu = \mp 1$. We use the value
$m_\mu = 10$.
Finally Becke recognised that it is important to have different sizes of regions around
each atom, the scheme given so far sharing the space equally between two atoms because
$\mu = 0$ corresponds to the mid point between them. Therefore Becke introduced a change
of variable

$$\nu_{AB} = \mu_{AB} + a_{AB}(1 - \mu_{AB}^2) \qquad (106)$$

and worked with the function $s(\nu)$ instead of $s(\mu)$, with the boundary at $\nu = 0$. He argued
that a good value for $a_{AB}$ is given by

$$a_{AB} = \frac{u_{AB}}{u_{AB}^2 - 1} \qquad (107)$$

$$u_{AB} = \frac{\chi - 1}{\chi + 1} \qquad (108)$$

$$\chi = \frac{R_A}{R_B} \qquad (109)$$

where $R_A$, $R_B$ are the respective Bragg-Slater radii. We have followed this aspect of
Becke's scheme exactly.
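The weight construction can be sketched compactly. The code below assumes $\mu_{AB} = (r_A - r_B)/R_{AB}$, the cutoff of Eqn 105 with $m_\mu = 10$ integrated by a binomial expansion, and the size adjustment $\nu = \mu + a(1 - \mu^2)$. All function names are illustrative, and this is not the CADPAC implementation.

```python
import math

M_MU = 10  # exponent quoted in the text


def _antiderivative(t, m=M_MU):
    """Antiderivative of (1 - t^2)^m obtained from the binomial expansion."""
    return sum(math.comb(m, k) * (-1) ** k * t ** (2 * k + 1) / (2 * k + 1)
               for k in range(m + 1))


_NORM = _antiderivative(1.0) - _antiderivative(-1.0)


def cutoff(mu):
    """s(mu) with ds/dmu proportional to (1 - mu^2)^m, s(-1) = 1, s(+1) = 0."""
    return 1.0 - (_antiderivative(mu) - _antiderivative(-1.0)) / _NORM


def size_adjusted(mu, radius_a, radius_b):
    """Change of variable of Eqns 106-109 for atoms of different size."""
    chi = radius_a / radius_b
    u = (chi - 1.0) / (chi + 1.0)
    a = u / (u * u - 1.0)
    return mu + a * (1.0 - mu * mu)


def becke_weights(point, nuclei, radii=None):
    """Normalised weights w_A(r) of Eqns 103-104 at one grid point."""
    dists = [math.dist(point, nucleus) for nucleus in nuclei]
    cells = []
    for a, nucleus_a in enumerate(nuclei):
        p_a = 1.0
        for b, nucleus_b in enumerate(nuclei):
            if a == b:
                continue
            mu = (dists[a] - dists[b]) / math.dist(nucleus_a, nucleus_b)
            if radii is not None:
                mu = size_adjusted(mu, radii[a], radii[b])
            p_a *= cutoff(mu)
        cells.append(p_a)
    total = sum(cells)
    return [p / total for p in cells]
```

At the mid point of a homonuclear pair each weight is 0.5; supplying unequal radii moves the boundary towards the smaller atom.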

Te Velde and Baerends (tVB)[21] also divide space into atomic polyhedra. They then
proceed differently putting an atom centered sphere in the polyhedra and surround it by
a number of pyramids. The integration over the sphere uses spherical polar coordinates.
Special more complicated devices are used to integrate over the pyramids. Certainly the
Becke smoothing scheme is much easier to use than the tVB scheme.
Now we describe how the points are generated around each atom; we use spherical polar
coordinates.

7.2 The Euler-Maclaurin formula


The Euler-Maclaurin formula, in the context that we wish to use it, concerns the evaluation
of an integral in one dimension using equally spaced points and equal weights.
We shall consider [0,1] divided into n equal parts, with quadrature points at i/n (i =
0,1,2,...,n). We also consider a change of variable from x to q and so the integral we
are considering is

$$\int_0^1 F(x)\,dx = \int_0^1 F(x(q))\,\frac{dx}{dq}\,dq = \int_0^1 G(q)\,dq \qquad (110)$$

The Euler-Maclaurin formula [22] for the evaluation of the integral (110) is

$$\int_0^1 G(q)\,dq = \frac{1}{n}\Big(\sum_{i=1}^{n-1} G\Big(\frac{i}{n}\Big) + \frac{1}{2}(G(0) + G(1))\Big) - \sum_{k=1}^{m}\frac{B_{2k}}{(2k)!}\Big(\frac{1}{n}\Big)^{2k}(G^{(2k-1)}(1) - G^{(2k-1)}(0)) - \frac{B_{2m+2}}{n^{2m+2}(2m+2)!}\,G^{(2m+2)}(\xi) \qquad (111)$$

where $0 < \xi < 1$. $B_{2k}$ are Bernoulli numbers, which are 1, 1/6,
-1/30, 1/42, -1/30, 5/66, -691/2730, 7/6, -3617/510,
43867/798, -174611/330 for k = 0,...,10. The Bernoulli numbers grow very rapidly
and for large k, $B_{2k} \sim 2(-1)^{k-1}(2k)!\,(2\pi)^{-2k}$. Thus the series in eqn (111) is only
useful for low values of m. The Euler-Maclaurin formula becomes

$$\int_0^1 G(q)\,dq = \frac{1}{n}\Big(\sum_{i=1}^{n-1} G\Big(\frac{i}{n}\Big) + \frac{1}{2}(G(0) + G(1))\Big) - \frac{1}{12n^2}(G^{(1)}(1) - G^{(1)}(0)) + \frac{1}{720n^4}(G^{(3)}(1) - G^{(3)}(0)) - \frac{1}{30240n^6}(G^{(5)}(1) - G^{(5)}(0)) + \frac{1}{1209600n^8}(G^{(7)}(1) - G^{(7)}(0)) + \text{remainder term} \qquad (112)$$

The idea becomes clear from this representation of the formula: if it is possible to arrange
matters such that the values of G(q) and its low derivatives are either equal and/or zero
at the end points then the convergence of the sum over the quadrature points to the exact
integral value should be more rapid than if this was not the case. Of course in some cases
this is true for all derivatives and then all the error will be in the remainder term and
nothing is achieved, but in most cases G(q) and its derivatives will not be equal at the end
points. Handy and Boys[23] showed how to make the derivatives zero at the end points
and specifically considered a jacobian of the form

$$\frac{dx}{dq} = A_m q^{m-1}(1 - q)^{m-1} \qquad (113)$$

where $A_m$ is chosen such that the range of q is [0,1]. In their study they considered
the first three values of m that removed new odd derivatives and also a transformation
which removed all derivatives, $m = \infty$. For the function $\sin(\pi x)$ the fractional errors with
n = 10, 20 were $823.8 \times 10^{-5}$, $205.7 \times 10^{-5}$ for m = 1; $14.7 \times 10^{-5}$, $0.93 \times 10^{-5}$ for m = 2;
and $1.12 \times 10^{-5}$, $0.02 \times 10^{-5}$ for m = 3. These results are typical of the various functions
considered in the paper. It is easy to understand why m must not be made too large,
because as m is increased the integrand becomes increasingly flat at the ends and more
delta-function like in the middle. Thus the $m = \infty$ results were no improvement on m = 3.
We now discuss the specific use of this approach to radial quadrature remembering that
the jacobian $r^2\sin\theta$ is present.
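The quoted fractional errors for $\sin(\pi x)$ can be reproduced with a few lines. The sketch below applies the equal-weight rule of Eqn 112 after the change of variable of Eqn 113, taking $A_m = (2m-1)!/((m-1)!)^2$ so that x runs from 0 to 1; the function names are illustrative.

```python
import math


def transformed_trapezoid(func, n, m):
    """Equal-weight rule in q for the integral of func(x) over [0, 1],
    with dx/dq = A_m q^(m-1) (1-q)^(m-1)."""
    a_m = math.factorial(2 * m - 1) / math.factorial(m - 1) ** 2

    def x_of_q(q):
        # Term-by-term integral of q^(m-1) (1-q)^(m-1) from 0 to q.
        return a_m * sum(math.comb(m - 1, k) * (-1) ** k * q ** (m + k) / (m + k)
                         for k in range(m))

    def g(q):
        return func(x_of_q(q)) * a_m * q ** (m - 1) * (1.0 - q) ** (m - 1)

    interior = sum(g(i / n) for i in range(1, n))
    return (interior + 0.5 * (g(0.0) + g(1.0))) / n


def fractional_error(n, m):
    """Fractional error for the integral of sin(pi x), whose exact value is 2/pi."""
    exact = 2.0 / math.pi
    approx = transformed_trapezoid(lambda x: math.sin(math.pi * x), n, m)
    return abs(approx - exact) / exact
```

For m = 1 the rule reduces to the ordinary trapezium rule, and `fractional_error(10, 1)` returns about 8.24e-3, in line with the first figure quoted above.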

7.2.1 Radial quadrature

The nature of quantum mechanics means that all derivatives of our integrands vanish at
$r = \infty$, and the jacobian factor, $r^2$, means the integrand and its first derivative vanish at
r = 0. We combine the above transformation (113) with a transformation from $[0, \infty]$ to
[0,1), thus
(114)
(115)

where $\alpha$ is a scaling parameter depending upon an effective atomic radius. The use of
(114) means that the integrand and its derivatives up to the $(3m_r - 1)$th vanish at $q_r = 0$,
and all its derivatives vanish at $q_r = 1$. Our investigations show that the optimum value
of $m_r$ is 2.

7.3 Gauss Quadrature Schemes


It is hardly necessary to introduce Gauss quadrature [22]. It is probably sufficient to say
that it is a quadrature scheme designed to integrate polynomials with specified weight
functions exactly. The scheme of relevance to this discussion is Gauss-Legendre for which

$$\int_0^\pi \sin\theta\,P_l(\cos\theta)\,d\theta \qquad (116)$$

are integrated exactly for all polynomials $P_l$ of degree up to 2n - 1, where n is the number of
quadrature points.

7.3.1 Phi quadrature

The integrand and all its derivatives will have the same value at $\phi = 0$ and $\phi = 2\pi$. We
recommend equally spaced points because it may easily be shown that n such points will
exactly integrate $\cos m\phi$ and $\sin m\phi$ for m = 0, 1, ..., n - 1. In this case then the transformation is

(117)
These quadratures are defined by referring to $n_r$, $n_\theta$, $n_\phi$ as the number of grid points in
each dimension.
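The exactness claim for equally spaced phi points is simple to verify. The sketch below uses n equal-weight points $\phi_i = 2\pi i/n$, a point placement assumed here for illustration, and integrates $\cos m\phi$ and $\sin m\phi$; the function name is illustrative.

```python
import math


def phi_quadrature(func, n):
    """Equal-weight rule on [0, 2 pi) with n equally spaced points."""
    step = 2.0 * math.pi / n
    return step * sum(func(i * step) for i in range(n))
```

With n = 8 the rule returns zero (to rounding) for m = 1, ..., 7 and $2\pi$ for m = 0, but it fails for $\cos 8\phi$, which aliases onto the constant function and also returns $2\pi$.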

7.4 Two Dimensional Angular Quadrature Schemes


There is a considerable literature on the numerical evaluation of the two dimensional
integral over a sphere

$$\int_0^{2\pi}\!\!\int_0^{\pi} f(\theta, \phi)\sin\theta\,d\theta\,d\phi \qquad (118)$$

Indeed Lebedev[24] has devised Gauss like quadrature schemes which exactly integrate
the spherical harmonics $Y_{lm}(\theta, \phi)$ for all $-l \le m \le l$, $0 \le l \le L$, for some L. There are
$(L + 1)^2$ such functions. Lebedev describes 194 and 302 point quadrature schemes for which
L = 23, 29 and which are based on the symmetry of the octahedron (there are also smaller
schemes). Although straightforward to implement, the derivation of these schemes is far
from trivial. They are argued to be efficient, with their efficiency (defined by $(L + 1)^2/3N$)
near unity.
As an alternative, consider the Gauss-Legendre scheme for theta and the simple Gauss
scheme for phi, regarded as a product quadrature scheme. To integrate all spherical
harmonics of degree $\le L$, [L + 1]/2 theta points are required and L + 1 phi points are
required, that is $N = (L + 1)^2/2$. Thus if L = 14, N = 120 and if L = 23, N = 288. In
other words a factor of 3/2 more points are required in this product scheme to integrate
exactly the same spherical harmonics. But this product scheme is exceedingly easy to
program, the number of points being trivially increased by the change of one parameter.
Symmetry applies trivially for the product scheme, whereas for the Lebedev procedures
it is rather difficult, being tied to the symmetry of the octahedron.
For all these arguments we have favoured the product scheme, with $n_\phi = 2n_\theta$. We finally
note that the product scheme integrates a much wider class of function exactly compared
to the Lebedev scheme.
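The point counts in this comparison follow from simple arithmetic, sketched below. The rounding up of (L + 1)/2 to a whole number of theta points is assumed, which is what reproduces N = 120 for L = 14; the function names are illustrative.

```python
import math


def product_grid_size(l_max):
    """Angular points in the Gauss-Legendre (theta) x equally spaced (phi) scheme."""
    n_theta = math.ceil((l_max + 1) / 2)
    n_phi = l_max + 1
    return n_theta * n_phi


def efficiency(l_max, n_points):
    """Efficiency (L + 1)^2 / (3 N) as defined in the text."""
    return (l_max + 1) ** 2 / (3.0 * n_points)
```

For L = 23 the product scheme needs 288 points against 194 for Lebedev, a ratio close to 3/2, and the corresponding efficiencies are about 0.67 and 0.99.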

7.5 The Accuracy of the Quadrature

We are fortunate that a good test on the accuracy of a quadrature scheme may be judged
from Eqn 2, the integral of $\rho$. We have finalised our quadrature investigations, and have
a High, Medium and Low accuracy quadrature.

(a) High. For this $n_r = 64$, $n_\theta = 24$, which gives an accuracy in the energy and density of
8 decimal places. This means that we are using 75000 points on each atom. Alternatively
we may use the best Lebedev scheme with 302 points. This scheme may be viewed as
overkill.
(b) Medium. For this $n_r = 54$, $n_\theta = 16$, which gives an accuracy in the energy of 5 decimal
places. 27500 points per atom. Alternative Lebedev with 302 points. This scheme is
recommended for geometry optimisations.
(c) Low. For this $n_r = 25$, $n_\theta = 8$, which gives an accuracy in the energy of 2 decimal
places. 3200 points per atom. Alternative Lebedev with 86 points (L = 15).
We have substantially reduced the cost of the quadrature by (i) using symmetry, (ii) re-
ducing the number of angular points near nuclei, (iii) carrying out KS iterations using low
cost quadrature, and obtaining the energy by higher accuracy quadrature, (iv) inserting
several tests to eliminate the evaluation of basis functions which are far from grid points,
(v) evaluating the Becke weights at the start and storing them.
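The grid sizes quoted for the three quadratures follow from $n_r \times n_\theta \times n_\phi$ with $n_\phi = 2n_\theta$, and the density-integral test can be illustrated on a single atom. In the sketch below the radial mapping $r = \alpha q^2/(1-q)^2$ is an assumption chosen to be consistent with $m_r = 2$, not a formula quoted from the text, and the hydrogen 1s density stands in for a molecular density.

```python
import math


def points_per_atom(n_r, n_theta):
    """Product grid size with n_phi = 2 * n_theta."""
    return n_r * n_theta * (2 * n_theta)


def integrated_density(rho, n_r, alpha=1.0):
    """Radial quadrature of 4 pi r^2 rho(r) with equally spaced q in (0, 1).

    Assumed mapping: r = alpha q^2 / (1 - q)^2; the end points contribute zero.
    """
    total = 0.0
    for i in range(1, n_r):
        q = i / n_r
        r = alpha * q * q / (1.0 - q) ** 2
        drdq = 2.0 * alpha * q / (1.0 - q) ** 3
        total += 4.0 * math.pi * r * r * rho(r) * drdq
    return total / n_r


def hydrogen_1s(r):
    """Spherical test density that integrates to one electron."""
    return math.exp(-2.0 * r) / math.pi
```

The three settings give 73728, 27648 and 3200 points per atom, matching the rounded figures in the text. For this smooth spherical density even 25 radial points recover the electron count closely; the lower accuracy quoted for molecules reflects the angular grid and the multicentre partitioning as well.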

8 The Implementation of the Kohn-Sham equations


We have implemented the KS scheme as part of the CADPAC program. It is not relevant
to give the details here, but for typical molecular calculations the cost is approximately
twice that of an SCF calculation. Essentially the structure for the extra part of the KS
matrix is
DO 1 Atoms ... this determines a centre and Becke weight
  DO 2 Atomic r, theta, phi points ... this determines grid points
    DO 3 basis functions
      evaluate eta_a and grad eta_a
    END 3
    DO 4 basis functions
      DO 5 basis functions
        form rho etc
    END 5,4
    Form v_xc
    DO 6 basis functions
      DO 7 basis functions
        construct Kohn-Sham matrix
    END 7,6
END 2,1
From this outline it is easy to see that the basic cost is $N^3$, where N is a measure of the
size of the molecule.
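The loop structure above can be made concrete for the simplest possible case. The sketch below is an illustration under stated assumptions, not the CADPAC code: one atom, normalised s-type gaussians, a radial-only grid with the assumed mapping $r = \alpha q^2/(1-q)^2$, and LDA exchange only, with $v_x = -(4/3)C_x\rho^{1/3}$ obtained by differentiating Eqn 56. All names are hypothetical.

```python
import math

C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def radial_grid(n_r, alpha=1.0):
    """(r, weight) pairs; weights include 4 pi r^2 (assumed radial mapping)."""
    grid = []
    for i in range(1, n_r):
        q = i / n_r
        r = alpha * q * q / (1.0 - q) ** 2
        drdq = 2.0 * alpha * q / (1.0 - q) ** 3
        grid.append((r, 4.0 * math.pi * r * r * drdq / n_r))
    return grid


def s_gaussian(zeta, r):
    """Normalised s-type gaussian centred on the nucleus."""
    return (2.0 * zeta / math.pi) ** 0.75 * math.exp(-zeta * r * r)


def closed_shell_density_matrix(exponents, coeffs):
    """P = 2 c c^T for one doubly occupied orbital, normalised analytically."""
    overlap = [[(2.0 * math.sqrt(a * b) / (a + b)) ** 1.5 for b in exponents]
               for a in exponents]
    norm = sum(ca * overlap[i][j] * cb
               for i, ca in enumerate(coeffs) for j, cb in enumerate(coeffs))
    return [[2.0 * ca * cb / norm for cb in coeffs] for ca in coeffs]


def lda_exchange_matrix(exponents, density_matrix, grid):
    """Quadrature for <eta_a| v_x |eta_b>, together with the integral of rho."""
    n_basis = len(exponents)
    v_matrix = [[0.0] * n_basis for _ in range(n_basis)]
    n_electrons = 0.0
    for r, weight in grid:                                  # loop 2: grid points
        eta = [s_gaussian(z, r) for z in exponents]         # loop 3: basis values
        rho = sum(density_matrix[a][b] * eta[a] * eta[b]    # loops 4, 5: density
                  for a in range(n_basis) for b in range(n_basis))
        rho = max(rho, 0.0)                                 # guard against rounding
        v_x = -(4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)       # form the potential
        n_electrons += weight * rho
        for a in range(n_basis):                            # loops 6, 7: KS matrix
            for b in range(n_basis):
                v_matrix[a][b] += weight * eta[a] * v_x * eta[b]
    return v_matrix, n_electrons
```

The integrated density returned alongside the matrix provides the accuracy check described in section 7.5: for a doubly occupied orbital it should equal 2.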
We observe that there are an increasing number of DFT codes in the literature under names
such as DGAUSS[26], DeMon[27], NUMOL[28], AMOL[29], DMOL[30], each of which
treats the evaluation of the KS matrix in a slightly different way.
Finally in this section we stress that for a given functional, it is possible to obtain the
exact KS solution, provided a complete basis and exact quadrature are used. Our experience
is that this is in practice achievable provided a TZ2P(+f) type basis is used,
that is a good SCF basis. We can see every reason for using as reliable a quadrature as
possible. The great advantage of DFT is that no configuration interaction calculation is
performed, and therefore we are not trying to describe the electron-electron cusp in the
wavefunction, which is well established to be the reason why correlated calculations are
so slowly convergent with respect to basis set. This is the overriding advantage of modern
density functional theory.

9 Derivative Theory for the Kohn-Sham method


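It is well established that the derivative of the closed shell SCF energy with respect to an
external parameter $\lambda$ such as a nuclear coordinate is

$$\frac{dE}{d\lambda} = E^\lambda - 2\sum_i \epsilon_i S_{ii}^\lambda \qquad (119)$$

where $E^\lambda$ is the derivative of the energy with respect to $\lambda$, through its explicit dependence
on $\lambda$, such as the basis function dependence on the nuclear coordinates. Thus

$$E^\lambda = 2h_{ii}^\lambda + 2(ii|jj)^\lambda - (ij|ij)^\lambda \qquad (120)$$

$$S_{ii}^\lambda = \sum_{\alpha\beta} c_{\alpha i}c_{\beta i}\Big(\Big\langle\frac{\partial\eta_\alpha}{\partial\lambda}\Big|\eta_\beta\Big\rangle + \Big\langle\eta_\alpha\Big|\frac{\partial\eta_\beta}{\partial\lambda}\Big\rangle\Big) \qquad (121)$$

Because we have recognised the similarity between SCF and KS methodology, eqn (119)
is also the expression for the derivative of the DFT energy. Instead of the quantum
mechanical exchange term in eqn (120), we have

$$E_{xc}^\lambda = \frac{\partial}{\partial\lambda}\int F_{xc}\,dr \qquad (122)$$

$$= \int \frac{\partial F_{xc}}{\partial\rho}\frac{\partial\rho}{\partial\lambda}\,dr + \int \frac{\partial F_{xc}}{\partial\nabla\rho}\cdot\frac{\partial\nabla\rho}{\partial\lambda}\,dr \qquad (123)$$

$$= \int \frac{\partial F_{xc}}{\partial\rho}\frac{\partial\rho}{\partial\lambda}\,dr - \int \frac{d}{dr}\cdot\frac{\partial F_{xc}}{\partial\nabla\rho}\,\frac{\partial\rho}{\partial\lambda}\,dr \qquad (124)$$

$$= \int v_{xc}\,\frac{\partial\rho}{\partial\lambda}\,dr \qquad (125)$$

$$= \int v_{xc}\,\rho^\lambda\,dr \qquad (126)$$
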

where we have used integration by parts. $\rho^\lambda$ is trivially evaluated:

$$\rho^\lambda = 4\sum_i\sum_{\alpha\beta}\eta_\alpha^\lambda\eta_\beta\,c_{\alpha i}c_{\beta i} \qquad (127)$$

The evaluation of the KS energy gradient is therefore a straightforward matter, just as it
is for SCF theory. Therefore geometries may be optimised in the usual way.
The theory also exists, and it has been programmed, for the determination of second
derivatives of the KS energy. For the simple case of LDA when $F_{xc}$ is only a function of
$\rho$, the extra terms are given as follows:
If the usual expression for the second derivative of the closed shell SCF energy is written

$$\frac{d^2E}{d\lambda\,d\mu} = E^{\lambda\mu} - 2S_{ii}^{\lambda\mu}\epsilon_i + 4(F_{ai}^\lambda - \epsilon_i S_{ai}^\lambda)U_{ai}^\mu - 2(F_{ij}^\lambda - \epsilon_i S_{ij}^\lambda)S_{ij}^\mu \qquad (128)$$

then the exchange contributions to $F_{ai}^\lambda$, $F_{ij}^\lambda$ and $E^{\lambda\mu}$ are replaced by

(129)

(130)

(131)

respectively. The algebra becomes much more messy if $F_{xc}$ is a function of $\nabla\rho$ as well.
The above formulae rely on an accurate quadrature. If the quadrature is not sufficiently
reliable, then the minimum of the KS energy may not occur where the gradient is zero.
To overcome this difficulty it is necessary to differentiate the weights in the quadrature
formula, which also depend upon the position of the nuclei. This is not a difficult matter,
but slightly tedious.

10 Overview of DFT Calculations on Molecules

In the tables are calculations for 33 molecules for which the experimental data for equilibrium
geometry and dissociation energy are reasonably certain. The calculations were
performed with high quadrature and large basis sets. Results are presented for LDA,
B-LYP and our new functional CAM(B)-LYP (as well as one other, about which see [18]).
This detail is included because it is only by understanding the specific failures of functionals
that advances can be made. From table 6, we see that LDA bond lengths are (too
long) by 0.017 Å and they are marginally improved using B-LYP. Substantial improvement
is obtained using CAM(B)-LYP, the parameters for which were optimised for the
subset of 9 molecules of tables 1 and 3. It is encouraging that the improvement for the
9 molecules carried over to the full set. For bond angles, we see that there is a slight
improvement in proceeding from LDA to B-LYP. However it is for dissociation energies
that there is the greatest improvement using gradient corrected functionals. LDA
predicts an unacceptable 1.89 eV error which is reduced to 0.41 or 0.28 eV using B-LYP
or CAM(B)-LYP. It was the known poor results for LDA which held back the use of DFT
as a predictive tool for computational chemistry.
These results are supported by those presented in table 7, which are taken from Johnson et
al.[31]. Note that these were obtained using only a 6-31G* basis. Improvement of the basis
set will approximately halve the error of the bond lengths for all methods except SCF. It
is clear that B-LYP gives tremendously improved dissociation energies. Furthermore it is
generally found that frequencies of vibration are better than MP2, although usually too
low.
Finally we observe that this field is rapidly growing and it has only been possible to touch
a small number of aspects. Further references may be found in the bibliography [31-46]
to both other important work on functional development as well as further calculations.

Table 1: Equilibrium Geometries calculated using different composite forms of the
exchange-correlation functional for the nine-molecule subset of our modified G2 data
set; the bondlengths are given in Å and the angles in degrees.

Molecule   Quantity     LDA      B-LYP    CAM(B)-LYP   Experiment

H2         r(H-H)       0.763    0.743    0.728        0.742
Li2        r(Li-Li)     2.750    2.724    2.664        2.673
HF         r(H-F)       0.932    0.933    0.925        0.917
CO         r(C-O)       1.128    1.137    1.129        1.128
N2         r(N-N)       1.096    1.104    1.095        1.098
F2         r(F-F)       1.392    1.440    1.446        1.412
P2         r(P-P)       1.894    1.916    1.909        1.893
CH4        r(C-H)       1.097    1.094    1.082        1.086
CH2O       r(C-O)       1.199    1.212    1.206        1.203
           r(C-H)       1.121    1.114    1.099        1.099
           ∠(H-C-O)     122.0    122.1    122.1        121.8



Table 2: Equilibrium Geometries calculated using different composite forms of the
exchange-correlation functional for the remaining molecules in our modified G2 data set;
the bondlengths are given in Å and the angles in degrees.

Molecule   Quantity         LDA      B-LYP    CAM(B)-LYP   Experiment

CH         r(C-H)           1.142    1.133    1.116        1.120
NH         r(N-H)           1.055    1.051    1.039        1.036
OH         r(O-H)           0.986    0.985    0.976        0.970
LiH        r(Li-H)          1.620    1.609    1.581        1.595
BeH        r(Be-H)          1.388    1.372    1.348        1.343
H2O        r(O-H)           0.970    0.971    0.961        0.957
           ∠(H-O-H)         104.9    104.3    104.6        104.5
NH3        r(N-H)           1.021    1.021    1.010        1.012
           ∠(H-N-H)         107.0    106.4    106.7        106.7
C2H4       r(C-C)           1.323    1.335    1.326        1.334
           r(C-H)           1.093    1.088    1.075        1.081
           ∠(H-C-H)         116.7    116.5    116.4        117.4
O3         r(O-O)           1.257    1.293    1.297        1.272
           ∠(O-O-O)         118.1    118.0    118.0        116.8
CN         r(C-N)           1.175    1.166    1.166        1.172
NO         r(N-O)           1.149    1.164    1.159        1.151
O2         r(O-O)           1.208    1.233    1.233        1.207
HCN        r(C-H)           1.079    1.072    1.058        1.065
           r(C-N)           1.151    1.158    1.148        1.153
CO2        r(C-O)           1.163    1.174    1.166        1.160
CH2 1A1    r(C-H)           1.125    1.119    1.104        1.107
           ∠(H-C-H)         100.9    101.1    101.57       102.4
CH2 3B1    r(C-H)           1.089    1.083    1.070        1.075
           ∠(H-C-H)         137.2    135.7    135.4        133.9
CH3        r(C-H)           1.093    1.090    1.079        1.079
C2H2       r(C-C)           1.201    1.206    1.195        1.203
           r(C-H)           1.074    1.063    1.053        1.062
C2H6       r(C-C)           1.510    1.540    1.539        1.526
           r(C-H)           1.100    1.097    1.084        1.088
           ∠(H-C-H)         107.2    107.5    107.7        107.4
N2H4       r(N-N)           1.408    1.463    1.471        1.447
           r(N-Hb)          1.024    1.023    1.013        1.008
           r(N-Ha)          1.020    1.019    1.009        1.008
           ∠(N-N-Hb)        113.4    111.3    110.7        109.2
           ∠(N-N-Ha)        108.82   106.35   105.96       109.2
           ∠(Ha-N-Hb)       108.7    107.0    107.1        113.3
           ∠d(Ha-N-N-Hb)    90.5     90.5     90.9         88.9
H2O2       r(O-O)           1.437    1.496    1.508        1.475
           r(O-H)           0.977    0.977    0.968        0.950
           ∠(O-O-H)         100.6    99.3     98.9         94.8
           ∠d(H-O-O-H)      111.3    114.0    116.4        120.0
HCO        r(C-H)           1.138    1.132    1.117        1.122
           r(C-O)           1.174    1.186    1.179        1.175
           ∠(H-C-O)         123.6    123.6    123.7        124.6
H3COH      r(C-O)           1.405    1.443    1.446        1.421
           r(C-Ha)          1.099    1.094    1.080        1.093
           r(C-Hb)          1.106    1.100    1.086        1.093
           r(O-H)           0.970    0.970    0.963        0.963
           ∠(O-C-Ha)        107.0    106.4    106.1        107.0
           ∠(C-O-H)         108.8    108.0    108.2        108.0
           ∠(Hb-C-Hb)       107.7    108.5    108.9        108.5
NH2        r(N-H)           1.038    1.037    1.025        1.024
           ∠(H-N-H)         102.8    102.5    102.8        103.4


Table 3: Atomisation energies ΣDe calculated using different composite forms of the
exchange-correlation functional for the nine molecule subset in our modified G2 data set;
all energies are given in eV.

Molecule   LDA      B-LYP    CAM(B)-LYP   Experiment

H2         5.02     4.84     4.91         4.75
Li2        1.00     0.88     0.92         1.01
HF         7.04     6.12     6.00         6.13
CO         12.91    11.32    11.08        11.23
N2         11.49    10.33    10.24        9.91
F2         3.37     2.15     1.87         1.66
P2         6.13     5.19     5.02         5.08
CH4        19.27    18.32    18.14        18.25
CH2O       18.88    16.46    16.12        16.22



Table 4: Atomisation energies ΣDe calculated using different composite forms of the
exchange-correlation functional for the remaining molecules in our modified G2 data set;
all energies are given in eV.

Molecule   LDA      B-LYP    CAM(B)-LYP   Experiment

CH         4.48     4.07     4.07         3.64
NH         4.17     3.91     3.96         3.67
OH         5.93     5.18     5.09         4.62
LiH        2.62     2.51     2.64         2.52
BeH        2.23     2.09     2.17         2.16
NH3        14.74    13.19    13.08        12.93
H2O        11.61    10.14    9.96         10.09
C2H4       27.64    24.52    24.14        24.45
O3         10.38    7.33     6.57         6.35
CN         9.45     8.24     8.06         7.89
NO         8.52     7.15     6.90         6.61
O2         7.52     5.85     5.42         5.21
HCN        15.64    13.89    13.71        13.48
CO2        20.41    17.28    16.70        16.56
CH2 1A1    8.71     7.89     7.87         7.40
CH2 3B1    9.35     8.60     8.25         7.79
CH3        14.29    12.84    12.60        13.33
C2H2       20.00    17.63    17.37        16.86
C2H6       34.78    30.87    30.41        28.89
N2H4       22.51    19.54    19.23        18.90
H2O2       14.60    12.05    11.65        11.63
HCO        14.44    12.47    12.15        12.06
H3COH      25.61    22.33    21.92        22.20
NH2        9.08     8.28     8.29         7.87



Table 5: The mean deviations, mean absolute deviations and mean percentage errors
in the bondlengths and atomisation energies for the nine molecules in the subset of our
modified set of G2 molecules for different composite exchange-correlation functionals. The
deviations in the bondlengths and atomisation energies are expressed in units of Å and
eV respectively.

                             LDA      B-LYP    CAM(B)-LYP

BONDLENGTHS
Mean deviation               0.012    0.017    0.003
Mean absolute deviation      0.017    0.017    0.009
Mean percentage error        1.2      1.1      0.7

DISSOCIATION ENERGIES ΣDe
Mean deviation               1.21     0.15     0.01
Mean absolute deviation      1.21     0.18     0.15
Mean percentage error        22.0     5.9      3.8
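
The dissociation-energy rows of Table 5 can be checked directly against the entries of Table 3. The short Python sketch below does this arithmetic; the variable and function names are illustrative only, and the mean percentage error is assumed here to be the mean of |calculated - experiment| / experiment, a definition the notes do not state explicitly.

```python
# Atomisation energies in eV, transcribed from Table 3 (nine-molecule subset).
# Tuple order: LDA, B-LYP, CAM(B)-LYP, experiment.
TABLE3 = {
    "H2":   (5.02, 4.84, 4.91, 4.75),
    "Li2":  (1.00, 0.88, 0.92, 1.01),
    "HF":   (7.04, 6.12, 6.00, 6.13),
    "CO":   (12.91, 11.32, 11.08, 11.23),
    "N2":   (11.49, 10.33, 10.24, 9.91),
    "F2":   (3.37, 2.15, 1.87, 1.66),
    "P2":   (6.13, 5.19, 5.02, 5.08),
    "CH4":  (19.27, 18.32, 18.14, 18.25),
    "CH2O": (18.88, 16.46, 16.12, 16.22),
}
METHODS = ("LDA", "B-LYP", "CAM(B)-LYP")


def statistics(table):
    """Return {method: (mean deviation, mean absolute deviation, mean % error)}."""
    results = {}
    rows = list(table.values())
    n = len(rows)
    for col, method in enumerate(METHODS):
        # Signed deviation of the calculated value from experiment.
        devs = [row[col] - row[3] for row in rows]
        # Assumed definition: unsigned deviation relative to experiment, in percent.
        pct = [abs(d) / row[3] * 100.0 for d, row in zip(devs, rows)]
        results[method] = (
            sum(devs) / n,
            sum(abs(d) for d in devs) / n,
            sum(pct) / n,
        )
    return results


stats = statistics(TABLE3)
for method, (md, mad, mpe) in stats.items():
    print(f"{method:12s} mean dev {md:6.2f}  mean abs dev {mad:6.2f}  mean % err {mpe:5.1f}")
```

With this definition the computed values agree, to the precision quoted, with the dissociation-energy entries of Table 5.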

Table 6: The mean deviations, mean absolute deviations and mean percentage errors in
the bondlengths, bond angles and atomisation energies for the thirty-three molecules in
our modified G2 data set for different composite exchange-correlation functionals. The
deviations in the bondlengths, bond angles and atomisation energies are expressed in units
of Å, degrees and eV respectively.

                             LDA      B-LYP    CAM(B)-LYP

BOND LENGTHS
Mean deviation               0.090    0.013    0.003
Mean absolute deviation      0.017    0.013    0.009
Mean percentage error        1.4      1.1      0.7

BOND ANGLES
Mean deviation               0.039    -0.463   -0.307
Mean absolute deviation      1.91     1.68     1.51
Mean percentage error        1.7      1.5      1.4

DISSOCIATION ENERGIES ΣDe
Mean deviation               1.89     0.37     0.16
Mean absolute deviation      1.89     0.41     0.28
Mean percentage error        20.8     5.7      3.8

Table 7: Comparison of mean absolute errors in bondlengths (/Å), bond angles (/°), dipole
moments (/D), harmonic frequencies (/cm^-1) and atomisation energies (/kcal mol^-1) for 32
molecules, using the 6-31G* basis set and 9700 quadrature points per atom.

                      SCF      MP2      QCISD    LDA      BLYP

Bond lengths          0.02     0.014    0.013    0.021    0.020
Bond angles           1.99     1.78     1.79     1.93     2.33
Dipole moment         0.289    0.277    0.233    0.252    0.251
Harmonic frequency    168      99       42       75       73
Atomisation energy    85.9     22.4     28.8     35.7     5.6

11 Appendix: Functional Derivatives


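Consider the following functional $E[\rho]$ of the density $\rho(\mathbf{r})$,

$$E[\rho] = \int E(\rho(\mathbf{r}), \nabla\rho(\mathbf{r}))\, d\mathbf{r} \tag{132}$$

It is a functional because the numerical value of $E$ depends upon the function $\rho(\mathbf{r})$. Let
us derive the condition that $E[\rho]$ is stationary with respect to variations in the function
$\rho(\mathbf{r})$. We consider the variation

$$\rho(\mathbf{r}) \rightarrow \rho(\mathbf{r}) + \epsilon f(\mathbf{r}) \tag{133}$$

Here $\epsilon$ is a small parameter which we shall let tend to zero and $f(\mathbf{r})$ is an arbitrary function
which obeys all boundary conditions. Substituting into the functional we obtain,

$$E[\rho + \epsilon f] = \int E(\rho + \epsilon f, \nabla\rho + \epsilon\nabla f)\, d\mathbf{r} \tag{134}$$

Now use a Taylor expansion of the integrand through first order and use (131) to obtain,

$$E[\rho + \epsilon f] - E[\rho] = \epsilon \int \left( f(\mathbf{r})\,\frac{\partial E}{\partial \rho} + \nabla f(\mathbf{r}) \cdot \frac{\partial E}{\partial \nabla\rho} \right) d\mathbf{r} \tag{135}$$

Divide by $\epsilon$, and integrate the second term on the right hand side by parts (the boundary
term vanishes because $f$ obeys the boundary conditions) to obtain

$$\frac{E[\rho + \epsilon f] - E[\rho]}{\epsilon} = \int f(\mathbf{r}) \left( \frac{\partial E}{\partial \rho} - \nabla \cdot \frac{\partial E}{\partial \nabla\rho} \right) d\mathbf{r} \tag{136}$$

In the limit of $\epsilon \rightarrow 0$, the left hand side is the derivative $dE/d\epsilon$ at $\epsilon = 0$, which is zero if
$E[\rho]$ is stationary. Furthermore it must be stationary for all possible functions $f(\mathbf{r})$, and
this can only be the case if the factor multiplying $f(\mathbf{r})$ in the integrand on the right hand
side is zero. We call this factor the functional derivative $\delta E / \delta\rho$, and thus we have
derived the condition that $E[\rho]$ is stationary to be

$$\frac{\delta E}{\delta \rho} = \frac{\partial E}{\partial \rho} - \nabla \cdot \frac{\partial E}{\partial \nabla\rho} = 0 \tag{137}$$

The above is the famous Euler-Lagrange equation which we have used in chapter 2.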
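
The derivation can be checked numerically on a grid. The sketch below is an illustration only: it assumes a one-dimensional toy integrand E(ρ, ρ') = ρ² + (ρ')², which is not a physical exchange-correlation functional, and Gaussian test functions chosen so that everything vanishes at the grid boundaries. It compares the finite-difference quotient on the left hand side of the integrated-by-parts relation with the integral of f times the Euler-Lagrange expression 2ρ - 2ρ''.

```python
import numpy as np

# One-dimensional grid; all functions used below are negligible at the end points.
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]


def energy(rho):
    """Toy functional E[rho] = integral of rho^2 + (d rho/dx)^2 (illustrative choice)."""
    grad = np.gradient(rho, dx)
    return np.sum(rho**2 + grad**2) * dx


rho = np.exp(-x**2)                 # trial "density"
f = np.exp(-(x - 0.5) ** 2)         # arbitrary variation obeying the boundary conditions
eps = 1.0e-6                        # small parameter of the variation

# Left hand side: (E[rho + eps f] - E[rho]) / eps
lhs = (energy(rho + eps * f) - energy(rho)) / eps

# Right hand side: integral of f * (dE/drho - d/dx dE/drho'), with
# dE/drho = 2 rho and dE/drho' = 2 rho', so the bracket is 2 rho - 2 rho''.
rho_second = (4.0 * x**2 - 2.0) * np.exp(-x**2)   # analytic second derivative
functional_derivative = 2.0 * rho - 2.0 * rho_second
rhs = np.sum(f * functional_derivative) * dx

print(f"finite difference: {lhs:.6f}")
print(f"Euler-Lagrange:    {rhs:.6f}")
```

The two numbers differ only by the discretisation error of the grid and the first-order error in ε.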

References

[1] J. Almlof, K. Faegri and K. Korsell. J. Comput. Chem. 3 385 (1982)
[2] W. Klopper and W. Kutzelnigg. J. Chem. Phys. 96 2020 (1991)
[3] J. C. Slater. Phys. Rev. 81 385 (1951)
[4] P. A. M. Dirac. Proc. Cambridge Philos. Soc. 26 376 (1930)
[5] R. O. Jones. J. Chem. Phys. 71 1300 (1979)
[6] E. R. Davidson. 'Reduced Density Matrices in Quantum Chemistry' (Acad. Pr., New
York) (1976)
[7] M. M. Morrell, R. G. Parr and M. Levy. J. Chem. Phys. 62 549 (1975)
[8] P. Hohenberg and W. Kohn. Phys. Rev. 136 864 (1964)
[9] W. Kohn and L. J. Sham. Phys. Rev. A140 1133 (1965)
[10] R. G. Parr and W. Yang. 'Density-Functional Theory of Atoms and Molecules'
(Oxford) (1989)
[11] D. M. Ceperley and B. J. Alder. Phys. Rev. Lett. 45 566 (1980)
[12] S. H. Vosko, L. Wilk and M. Nusair. Can. J. Phys. 58 1200 (1980)
[13] A. D. Becke. J. Chem. Phys. 88 2547 (1988)
[14] R. Colle and O. Salvetti. Theor. Chim. Acta 37 329 (1975)
[15] C. Lee, W. Yang and R. G. Parr. Phys. Rev. B37 785 (1988)
[16] B. Miehlich, A. Savin, H. Stoll and H. Preuss. Chem. Phys. Lett. 157 200 (1989)
[17] E. P. Wigner. Phys. Rev. 46 1002 (1934)
[18] V. Termath, G. Laming and N. C. Handy. J. Chem. Phys. (submitted)
[19] A. D. Becke. Phys. Rev. A38 3098 (1988)
[20] C. W. Murray, N. C. Handy and G. J. Laming. Molec. Phys. 78 997 (1993)
[21] G. Te Velde and E. J. Baerends. J. Comput. Phys. 99 84 (1992)
[22] A. H. Stroud. 'Approximate Calculation of Multiple Integrals' (Prentice-Hall) (1971)
[23] N. C. Handy and S. F. Boys. Theor. Chim. Acta 31 195 (1973)
[24] V. I. Lebedev. Zh. Vychisl. Mat. Mat. Fiz. 15 48 (1975); Zh. Vychisl. Mat. Mat. Fiz.
16 293 (1976); Sibirsk. Mat. Zh. 18 132 (1977)
[25] CADPAC5: The Cambridge Analytic Derivatives Package Issue 5, Cambridge 1992.
A suite of quantum chemistry programs developed by R. D. Amos with contributions
from I. L. Alberts, J. S. Andrews, S. M. Colwell, N. C. Handy, D. Jayatilaka, P. J.
Knowles, R. Kobayashi, N. Koga, K. E. Laidig, P. E. Maslen, C. W. Murray, J. E.
Rice, J. Sanz, E. D. Simandiras, A. J. Stone and M-D. Su.
[26] J. Andzelm and E. Wimmer. J. Chem. Phys. 96 1280 (1992)
[27] A. St. Amant and D. R. Salahub. Chem. Phys. Lett. 169 387 (1990)
[28] A. D. Becke. Int. J. Quantum Chem. S23 599 (1989); A. D. Becke and R. M. Dickson.
J. Chem. Phys. 92 3610 (1990)
[29] E. J. Baerends, D. E. Ellis and P. Ros. Chem. Phys. 2 41 (1973); T. Ziegler, J. G.
Snijders and E. J. Baerends. J. Chem. Phys. 74 1271 (1981)
[30] B. Delley. J. Chem. Phys. 92 508 (1990)
[31] B. Johnson, P. M. W. Gill and J. A. Pople. J. Chem. Phys. 98 5612 (1993)
[32] R. Fournier, J. Andzelm and D. R. Salahub. J. Chem. Phys. 90 6371 (1989)
[33] R. Car and M. Parrinello. Phys. Rev. Lett. 55 2471 (1985)
[34] L. Versluis and T. Ziegler. J. Chem. Phys. 88 322 (1988)
[35] B. I. Dunlap, J. Andzelm and J. W. Mintmire. Phys. Rev. A42 6354 (1990)
[36] A. Komornicki and G. Fitzgerald. J. Chem. Phys. (1993)
[37] J. A. Pople, P. M. W. Gill and B. G. Johnson. Chem. Phys. Lett. 199 557 (1992)
[38] B. G. Johnson, P. M. W. Gill and J. A. Pople. J. Chem. Phys. 97 7846 (1992)
[39] A. D. Becke. J. Chem. Phys. 96 2155 (1992)
[40] C. W. Murray, G. J. Laming, N. C. Handy and R. D. Amos. Chem. Phys. Lett. 199
551 (1992)
[41] A. D. Becke. J. Chem. Phys. 97 9173 (1992)
[42] J. P. Perdew and Y. Wang. Phys. Rev. B45 13244 (1992)
[43] J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh
and C. Fiolhais. Phys. Rev. B46 6671 (1992)
[44] N. C. Handy, C. W. Murray and R. D. Amos. J. Phys. Chem. 97 4392 (1993)
[45] C. W. Murray, N. C. Handy and R. D. Amos. J. Chem. Phys. 98 7145 (1993)
[46] R. D. Amos, C. W. Murray and N. C. Handy. Chem. Phys. Lett. 202 489 (1993)
[47] C. W. Murray, G. J. Laming, N. C. Handy and R. D. Amos. J. Phys. Chem. 97 1868
(1993)

Coupled-cluster Methods

in

Quantum Chemistry

Peter R. Taylor
San Diego Supercomputer Center
P. O. Box 85608
San Diego, CA 92186-9784
USA

© Peter R. Taylor (General Atomics), 1993


Contents

Preface

1 Size-extensivity
  1.1 Introduction
  1.2 Separated electron pairs
  1.3 Interacting electron pairs
  1.4 General remarks

2 The exponential ansatz
  2.1 Linear ansatz: CI wave functions
  2.2 The exponential ansatz
  2.3 The coupled-cluster method
  2.4 Alternative formulations
  2.5 Perturbation theory
  2.6 Expectation value-based methods
  2.7 Summary

3 The CCSD model
  3.1 The CCSD equations
  3.2 Closed-shell systems
  3.3 Solution of the CCSD equations
  3.4 Reliability of closed-shell CCSD: a diagnostic
  3.5 Historical perspective

4 Higher excitations
  4.1 Deficiencies in the CCSD model
  4.2 The CCSDT model
  4.3 Perturbational approaches to triple excitations
  4.4 Quadruple and higher excitations?

5 Open-shell coupled-cluster methods
  5.1 General remarks
  5.2 UHF-based methods
  5.3 RHF/UHF-based methods
  5.4 RHF-based methods
  5.5 Higher excitations in open-shell methods
  5.6 Multireference coupled-cluster methods

6 Other treatments of size-extensivity
  6.1 General remarks
  6.2 Quadratic configuration interaction
  6.3 Brueckner orbitals
  6.4 Approximate CC treatments
  6.5 Coupled-pair functional methods
  6.6 Davidson's correction and extensions

Afterword

Bibliography

Preface

Only connect!
E. M. Forster

Only connected!
J. Cizek.

The purpose of this course is to review extensively the methods and the mo-
tivations behind coupled-cluster approaches to molecular electronic structure. These
methods had their origins - or, at least, were first used - in nuclear many-body the-
ory. They were introduced into quantum chemistry in the 1960's, but were relatively
little used until the late 1970's, perhaps in part because the original formulations
used techniques, like second quantization and diagrammatic methods, that were
unfamiliar to quantum chemists. As time passed, however, these methods were recast
in more palatable mathematical forms; more importantly, efficient computational
implementations appeared and demonstrated great robustness and high accuracy. In
the last ten years coupled-cluster methods, or approximations to them, have become
widely used when the aim is to obtain very accurate results for molecules that are
well-described qualitatively at the Hartree-Fock level.
We shall concentrate here entirely on the methodology. Our presentation will
include the close connections between coupled-cluster methods and perturbation the-
ory, although perturbation theory itself is not treated in any detail. Differences
between coupled-cluster methods and more traditional (variational) treatments of
electron correlation will also be discussed. No use is made of diagrammatic tech-
niques in our derivations: diagrams are undoubtedly a useful tool for enumerating
and classifying terms, but are not necessary for an understanding of coupled-cluster
methods. Numerical comparisons of coupled-cluster results with those of other
methods are presented in detail in other courses at this school.
Helpful discussions over many years with individuals too numerous to list here
have influenced my thinking about coupled-cluster methods. I would, however, like
to acknowledge helpful discussions with and comments from several colleagues that relate
specifically to these lecture notes: Kim Baldridge, Les Barnes, Rod Bartlett, Tim
Lee, and Jan Martin.
Chapter 1

Size-extensivity

1.1 Introduction

Let us imagine that we wish to estimate the binding energy of the water dimer. That
is, we wish to determine the energy of the reaction

$$2\,\mathrm{H_2O} \rightarrow (\mathrm{H_2O})_2. \tag{1.1}$$

In other courses methods for obtaining a realistic and accurate value of this binding
energy will be discussed in detail, but for the moment we are concerned only with
a fairly crude estimate, since our purpose here is not to predict the binding energy
accurately but to understand the behaviour of different computational methods. We
therefore begin with the simplest correlation treatment: second-order perturbation
theory. This gives -76.24602 Eh as the energy of H2O and -152.49975 Eh as the energy
of the dimer. Hence the binding energy is 4.8 kcal/mol according to second-order
perturbation theory. Let us now use a more sophisticated - or at any rate a more
complicated - correlation treatment: configuration interaction (CI) with single and
double excitations (CISD). The CISD energy for H2O is -76.24572 Eh, and for the
dimer -152.47869 Eh, giving a CISD binding energy of -8.0 kcal/mol! This is not only
a very different result from the perturbation theory value, it does not even seem to
be physically plausible. However, a further numerical experiment reveals an anomaly.
If we compute the energy of (H2O)2 with an arbitrarily large separation between the
monomers, the perturbation theory energy is -152.49204 Eh, which is twice the monomer
energy. But the CISD energy is -152.47186 Eh, which falls short of twice the monomer
energy by about 12 kcal/mol. This is in fact a large part of the original difference in
binding energies. If we compute the CISD binding energies as the difference between
the dimer energy and the dimer at infinite separation, we obtain a binding energy
of 4.3 kcal/mol, which is closer to the perturbation theory estimate and is more
physically plausible.
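
The arithmetic behind these binding energies is easy to reproduce. The following Python sketch uses only the total energies quoted above; the conversion factor of 627.5095 kcal/mol per hartree and all variable names are assumptions introduced for the illustration.

```python
HARTREE_TO_KCAL = 627.5095  # assumed conversion factor, kcal/mol per Eh

# Total energies in Eh as quoted in the text.
mp2 = {"monomer": -76.24602, "dimer": -152.49975, "separated": -152.49204}
cisd = {"monomer": -76.24572, "dimer": -152.47869, "separated": -152.47186}


def binding_energy(reference, dimer):
    """Binding energy in kcal/mol; positive means the dimer is bound."""
    return (reference - dimer) * HARTREE_TO_KCAL


def size_consistency_error(energies):
    """Energy of the separated dimer minus twice the monomer energy, in kcal/mol."""
    return (energies["separated"] - 2.0 * energies["monomer"]) * HARTREE_TO_KCAL


# Binding energies relative to twice the monomer energy.
mp2_binding = binding_energy(2.0 * mp2["monomer"], mp2["dimer"])
cisd_binding = binding_energy(2.0 * cisd["monomer"], cisd["dimer"])

# CISD binding energy relative to the dimer at (effectively) infinite separation.
cisd_binding_separated = binding_energy(cisd["separated"], cisd["dimer"])

print(f"MP2  binding (vs 2 x monomer):  {mp2_binding:6.1f} kcal/mol")
print(f"CISD binding (vs 2 x monomer):  {cisd_binding:6.1f} kcal/mol")
print(f"CISD binding (vs separated):    {cisd_binding_separated:6.1f} kcal/mol")
print(f"MP2  size-consistency error:    {size_consistency_error(mp2):6.1f} kcal/mol")
print(f"CISD size-consistency error:    {size_consistency_error(cisd):6.1f} kcal/mol")
```

The CISD size-consistency error of about 12 kcal/mol accounts for the difference between the unphysical value of -8.0 kcal/mol and the value of 4.3 kcal/mol obtained relative to the separated dimer.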
This behaviour of the CISD method is referred to as a lack of size-consistency. Pople
and co-workers [1] defined a size-consistent method as one for which

$$E(A \cdots B) = E(A) + E(B) \quad \text{for noninteracting fragments } A \text{ and } B. \tag{1.2}$$

Evidently, second-order perturbation theory is size-consistent for water dimer, while
CISD is not. Size-consistency is an important property of theoretical methods, but it
is a pragmatic criterion rather than a formal one. We can take a somewhat different
perspective and obtain a more formal criterion. We can see, for example, by general-
izing mentally our water dimer example, that while the energy of N noninteracting
systems should scale linearly with N, this cannot be the case for CISD since it does
not even work for the case N = 2. Conversely, second-order perturbation theory in-
deed scales linearly for such a case, as the mathematical methods we will later develop
will show. A method that scales correctly - that is, scales as the exact energy does
- with the number of particles in the systems is termed size-extensive [2].
Size-extensivity is a more general concept than size-consistency. Since it is a
scaling property of the energy, it applies in any circumstances, even to atomic calcu-
lations. We may be interested, for example, in the effects of correlating core electrons
on atomic properties. Clearly, if we wish to reliably compare the results of, say, an
8-electron treatment of argon atom (i.e., valence shell only) with an 18-electron
treatment (all electrons), the computational methods used should scale correctly with the
number of electrons. Size-consistency is more narrowly defined in terms of a
fragmentation process, and one can envisage constructing different fragmentation processes
for the same system that may or may not be size-consistent. For the molecule N2, for
example, the weakly bound $^7\Sigma_u^+$ state dissociates to ground-state atoms

$$\mathrm{N_2}(^7\Sigma_u^+) \rightarrow 2\,\mathrm{N}(^4S), \tag{1.3}$$

a process which is qualitatively well-described at the Hartree-Fock level. However,
the molecular ground state dissociation

$$\mathrm{N_2}(^1\Sigma_g^+) \rightarrow 2\,\mathrm{N}(^4S) \tag{1.4}$$

is not adequately described by Hartree-Fock theory, except at the unrestricted (UHF)
level. Hence for the $^7\Sigma_u^+$ state dissociation, restricted Hartree-Fock (RHF) and UHF
theory are size-consistent. But for the ground $^1\Sigma_g^+$ state, RHF is not size-consistent,
while UHF is. Yet both UHF and RHF are always size-extensive. Size-consistency
thus may require more than size-extensivity of the energy - it requires that the wave
function be able to describe a specific fragmentation process, or its limits, correctly.
This does not mean that size-consistency is in some way more general than size-
extensivity - quite the reverse. It comes about because of a requirement (correct
fragmentation) that involves more than just the electron correlation treatment.

1.2 Separated electron pairs

We turn now to an elementary analysis that illustrates the formal requirements for
size-extensivity. We employ as a model a simple beyond-Hartree-Fock treatment
introduced more than forty years ago: the method of separated electron pairs [3]. The
basic assumption is that only correlation effects involving specific disjoint pairs of
electrons are important. The total wave function is thus represented as an
antisymmetrized product of two-electron geminals

$$\Psi = \mathcal{A}\left[\Omega_1(1,2)\,\Omega_2(3,4)\cdots\Omega_N(2N-1,2N)\right] \tag{1.5}$$

for a 2N-electron system; the geminals are strongly orthogonal in the sense that

$$\int \Omega_\mu(1,2)\,\Omega_\nu(1,3)\,d\tau_1 = \delta_{\mu\nu}. \tag{1.6}$$

This strong orthogonality simply means that the geminals can be expanded in disjoint
subsets of an orthonormal set of orbitals:

$$\Omega_\mu(1,2) = \sum_{a \ge b}^{\{\mu\}} c^{\mu}_{ab}\left\{\phi^{\mu}_a(1)\phi^{\mu}_b(2) + (1-\delta_{ab})\,\phi^{\mu}_b(1)\phi^{\mu}_a(2)\right\}[\alpha\beta - \beta\alpha], \tag{1.7}$$

where we have included the singlet spin function, and the notation $\{\mu\}$ is used to
indicate that the set of orbitals used for each pair $\mu$ is specific to that pair. We note
that the closed-shell Hartree-Fock function is a special case of this form, in which

$$\Omega_\mu(1,2) = \phi^{\mu}_0(1)\phi^{\mu}_0(2)[\alpha\beta - \beta\alpha]. \tag{1.8}$$

Beginning with the Hartree-Fock configuration, denoted $\Psi_0$, then, we can conveniently
write the unnormalized separated-pair wave function of Eq. 1.5 as

$$\Psi = \Psi_0 + \sum_{\mu} X_\mu + \sum_{\mu>\nu} X_{\mu\nu} + \cdots, \tag{1.9}$$

where

$$X_\mu = {\sum_{ab}^{\{\mu\}}}' c^{\mu}_{ab} D^{\mu}_{ab}. \tag{1.10}$$

Here the prime indicates that the term with $a = b = 0$ is excluded; $D^{\mu}_{ab}$ is an excited
configuration given by

$$D^{\mu}_{ab} = \phi^{1}_0(1)\phi^{1}_0(2)\cdots\phi^{\mu}_a(2\mu-1)\phi^{\mu}_b(2\mu)\cdots\phi^{N}_0(2N-1)\phi^{N}_0(2N). \tag{1.11}$$

Similarly,

$$X_{\mu\nu} = {\sum_{ab}^{\{\mu\}}}'{\sum_{cd}^{\{\nu\}}}' c^{\mu}_{ab}c^{\nu}_{cd} D^{\mu\nu}_{abcd} \tag{1.12}$$

involves configurations in which there are excitations out of the Hartree-Fock orbitals
in both pairs $\mu$ and $\nu$. This would thus create up to four-fold excitations. But the
coefficient multiplying each such excitation is a product of lower-order excitation
coefficients.
Now, for example, if we have a lattice of N noninteracting identical two-
electron systems the separated-pair model will be exact. Hence the energy obtained,
say, by substituting the wave function into the variation principle and minimizing
the resulting functional, must be size-extensive. In other words, the total energy is
exactly N times the energy of one subsystem. Let us compare this with the situation
obtained using other approximations. We may, for instance, retain only single and
double excitations in the wave function. In the context of separated electron pairs,
the resulting wave function is given by

$$\Psi_{\mathrm{CISD}} = \Psi_0 + \sum_{\mu} X_\mu, \tag{1.13}$$

since, as we have seen, all other terms in the wave function Eq. 1.5 involve higher than
double excitations. Now, we already know that the CISD energy is not size-extensive.
This must therefore be related to the absence of the product-form higher excitations
in Eq. 1.9, since this is the only difference. How do these higher excitations affect the
size-extensivity behaviour? One way to see the anomalous behaviour [4] is to calculate
the probability $\mathcal{P}_\mu$ of excitation from a given pair $\mu$. This probability is related to
the norm of wave function terms involving $\mu$, suitably normalized: specifically

$$\mathcal{P}_\mu = \frac{\sum_{ab}(c^{\mu}_{ab})^2}{1 + \sum_{\nu}\sum_{ab}(c^{\nu}_{ab})^2} \tag{1.14}$$

for the case of only single and double excitations, for example. (We have omitted
the summation label that identifies the disjoint subsets of correlating orbitals for
simplicity.) For the full separated-pair wave function, on the other hand, the same
probability is

$$\mathcal{P}_\mu = \frac{\sum_{ab}(c^{\mu}_{ab})^2 + \sum_{\nu\neq\mu}\sum_{ab}\sum_{cd}(c^{\mu}_{ab})^2(c^{\nu}_{cd})^2 + \cdots}{1 + \sum_{\nu}\sum_{ab}(c^{\nu}_{ab})^2 + \sum_{\nu>\lambda}\sum_{ab}\sum_{cd}(c^{\nu}_{ab})^2(c^{\lambda}_{cd})^2 + \cdots}. \tag{1.15}$$

The denominator in Eq. 1.15 can easily be seen to be the expansion of the product

$$\prod_{\nu}\Big[1 + \sum_{ab}(c^{\nu}_{ab})^2\Big], \tag{1.16}$$

whereas the numerator is

$$\sum_{ab}(c^{\mu}_{ab})^2\prod_{\nu\neq\mu}\Big[1 + \sum_{cd}(c^{\nu}_{cd})^2\Big]. \tag{1.17}$$

With these substitutions Eq. 1.15 reduces to the simple expression

$$\mathcal{P}_\mu = \frac{\sum_{ab}(c^{\mu}_{ab})^2}{1 + \sum_{ab}(c^{\mu}_{ab})^2}. \tag{1.18}$$

We note that this is not equal to the CISD value of Eq. 1.14, except in the limiting
case of only one electron pair. Assume once again that we are considering a lattice
of noninteracting two-electron systems. It is clear that in the separated-pair wave
function, the probability of excitation from the pair $\mu$ is exactly independent of
excitations from any other pairs, since Eq. 1.18 involves only the coefficients of the pair
of interest. But Eq. 1.14, while superficially similar, involves a sum over all pairs
in the denominator. Obviously, the more pairs we have, the larger the denominator
will become, and the smaller will be the probability of excitation from a given
pair. This is completely unphysical. If the two-electron subsystems are
noninteracting, how can excitations on one affect the probability of excitations on another?
It is this spurious "interaction" between physically noninteracting subsystems that
destroys size-extensivity of the energy, since the correlation energy of each subsystem
decreases as the probability of excitations from that pair is artefactually reduced.
The CISD energy scales in fact as $\sqrt{N}$ rather than N, for example. We should
also note that it is only the full separated-pair wave function that displays true
size-extensivity. Extending CISD to some higher, but fixed, truncation level will not yield
a size-extensive energy: including up to 2N-fold excitations will give a treatment that
scales correctly up to N pairs, but which then "falls off" for higher N. CI with up
to twelve-fold excitations will work for six noninteracting pairs, but as the number of
pairs increases the same problems as seen in the CISD case will arise.
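
The loss of size-extensivity can be made concrete with a small model calculation. The Python sketch below is an illustration under stated assumptions, not a calculation from these notes: each noninteracting pair is reduced to two configurations (the Hartree-Fock one and a single double excitation), with a hypothetical coupling K and excitation energy Δ, and the Hartree-Fock energy is set to zero. The CI matrix restricted to at most one excited pair is then diagonalized and compared with the exact result, which is N times the one-pair energy.

```python
import numpy as np


def truncated_ci(n_pairs, coupling, gap):
    """Diagonalize the CI matrix with at most one excited pair (CID-like truncation).

    Returns the lowest energy and the weight of the excitation on one given pair.
    Basis: [reference, pair 1 excited, ..., pair n excited]; excitations on different
    pairs do not couple to each other because the pairs are noninteracting.
    """
    dim = n_pairs + 1
    h = np.zeros((dim, dim))
    h[0, 1:] = coupling
    h[1:, 0] = coupling
    idx = np.arange(1, dim)
    h[idx, idx] = gap
    energies, vectors = np.linalg.eigh(h)
    ground = vectors[:, 0]          # normalized ground-state vector
    return energies[0], ground[1] ** 2


def exact_energy(n_pairs, coupling, gap):
    """Exact correlation energy: n times the lowest root of the 2x2 one-pair problem."""
    one_pair = gap / 2.0 - np.sqrt(gap**2 / 4.0 + coupling**2)
    return n_pairs * one_pair


K, DELTA = 0.1, 1.0   # hypothetical model parameters (arbitrary energy units)
for n in (1, 4, 25, 100, 400):
    e_ci, p_ci = truncated_ci(n, K, DELTA)
    e_exact = exact_energy(n, K, DELTA)
    print(f"N = {n:3d}  E(CI) = {e_ci:9.5f}  E(exact) = {e_exact:9.5f}  "
          f"fraction recovered = {e_ci / e_exact:5.3f}  P(pair) = {p_ci:7.5f}")
```

In this model the truncated CI energy is Δ/2 - (Δ²/4 + N K²)^(1/2), which grows only as the square root of N for large N, and the excitation probability on a given pair falls as more pairs are added, even though the pairs do not interact.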
It is thus the presence of the higher excitations to all orders in the separated-
pair wave function that confers size-extensivity on the energy. The inclusion of higher
excitations with coefficients that are products of lower-order excitations is the key to
the factorization of the denominator in Eq. 1.15, and thus ultimately to the desired
independence of the excitation probability in the noninteracting subsystems. In this
vein, instead of the term size-extensivity, Hurley [4] writes of the "kinematic
independence of the pairs" when referring to the correct behaviour of the separated-pair
wave function with excitation probabilities that are rigorously independent of the
size of the system. Yet another viewpoint [5] is that of "multiplicative separability
of the wave function" - a more formal argument that we return to at the end of
this chapter. Of course, we have so far discussed size-extensivity only in the case
of the separated-pair approximation. We now turn to a qualitative analysis of the
more realistic case in which no single unique electron pairing scheme provides a good
description.

1.3 Interacting electron pairs

A considerable effort to analyze and characterize treatments more general than
separated electron pairs was undertaken by Sinanoglu [6,7]. Drawing on terminology and
arguments from statistical theories of nonideal gases, he introduced the notion of a
cluster expansion of the wave function, restricting himself initially to only pair
clusters: functions to describe correlation between electrons I and J. To make discussion
of this approach easier, we will ignore single excitations for the moment and assume
that the correlation effects between a given electron pair involve only double
excitations. Denoting the pair cluster functions $u_{IJ}$, we can rewrite the CI with double
excitations (CID) wave function as

$$\Psi_{\mathrm{CID}} = \Psi_0 + \sum_{I>J} \mathcal{A}\left(u_{IJ}\Psi_0^{IJ}\right), \tag{1.19}$$

where $\Psi_0^{IJ}$ is the (N - 2)-electron wave function obtained by deleting occupied
spin-orbitals I and J from $\Psi_0$, and $\mathcal{A}$ is an antisymmetrizer that ensures the overall
N-electron configuration is antisymmetric to exchange of electrons. The CID pair
cluster functions would then be given by

$$u_{IJ} = \sum_{A>B} c_{IJ}^{AB}\,\phi_A\phi_B, \tag{1.20}$$

where we hereafter use A, B ... to denote spin-orbitals not occupied in the
Hartree-Fock configuration. It will be convenient to use the symbol $\Psi_{IJ}^{AB}$ for the
antisymmetrized N-electron functions obtained by replacing occupied spin-orbitals I and J
with A and B.
Of course, the CID energy is not size-extensive, and Sinanoglu did not
recommend using this wave function. Instead, he introduced [6] the cluster expansion

$$\Psi = \Psi_0 + \sum_{I>J} \mathcal{A}\left(u_{IJ}\Psi_0^{IJ}\right) + \sum_{I>J}\sum_{K>L} \mathcal{A}\left(u_{IJ}u_{KL}\Psi_0^{IJKL}\right) + \cdots, \tag{1.21}$$

with the goal of treating electron correlation on the same general footing as CID, but
obtaining a size-extensive energy. The terms involving products of clusters serve this
function: Sinanoglu termed these products unlinked clusters, and this terminology
was commonly used until fairly recently. However, as we shall see, this can cause
confusion, and the term disconnected clusters is now preferred. The terms which
cannot be so written (the coefficients $c_{IJ}^{AB}$ in this case) are connected clusters. Formally,
then, thanks to the disconnected clusters, the aim of size-extensivity is achieved,
although it is not obvious how to use such a wave function in practical calculations. The
simplifications that allow the use of the variation principle to optimize the
separated-pair wave function (disjoint pairs only, strong orthogonality between pair functions)
are not imposed in Eq. 1.21, creating considerable complications in practice. (The
interested reader may verify, however, that if only a single pairing scheme is
considered and the pair functions are constructed to be strongly orthogonal - say, by
dividing the unoccupied spin-orbitals into disjoint subsets - Sinanoglu's suggested
wave function reduces to Eq. 1.5.) When Sinanoglu originally introduced Eq. 1.21, it
was always used in practice in drastically approximated form. We shall discuss this
further in subsequent chapters.
What do the individual terms in Eq. 1.21 look like? We can explicitly expand
the products of pair cluster functions to obtain

$$\Psi = \Psi_0 + \sum_{I>J}\sum_{A>B} c_{IJ}^{AB}\Psi_{IJ}^{AB} + \sum_{I>J}\sum_{K>L}\sum_{A>B}\sum_{C>D} u_{IJKL}^{ABCD}\,\Psi_{IJKL}^{ABCD} + \cdots, \tag{1.22}$$

in which the disconnected quadruples coefficients are

$$\begin{aligned}
u_{IJKL}^{ABCD} = {} & c_{IJ}^{AB}c_{KL}^{CD} - c_{IJ}^{AC}c_{KL}^{BD} + c_{IJ}^{AD}c_{KL}^{BC} + c_{IJ}^{BC}c_{KL}^{AD} - c_{IJ}^{BD}c_{KL}^{AC} + c_{IJ}^{CD}c_{KL}^{AB} \\
& - c_{IK}^{AB}c_{JL}^{CD} + c_{IK}^{AC}c_{JL}^{BD} - c_{IK}^{AD}c_{JL}^{BC} - c_{IK}^{BC}c_{JL}^{AD} + c_{IK}^{BD}c_{JL}^{AC} - c_{IK}^{CD}c_{JL}^{AB} \\
& + c_{IL}^{AB}c_{JK}^{CD} - c_{IL}^{AC}c_{JK}^{BD} + c_{IL}^{AD}c_{JK}^{BC} + c_{IL}^{BC}c_{JK}^{AD} - c_{IL}^{BD}c_{JK}^{AC} + c_{IL}^{CD}c_{JK}^{AB}.
\end{aligned} \tag{1.23}$$

We may note that the number of disconnected cluster coefficients grows factorially
with the excitation level [4]. There are more than 1300 triple products of doubles
coefficients in the disconnected hextuples. It is obvious that no method based on
explicit enumeration and consideration of such terms will be computationally
feasible. The crucial step in the development of practical size-extensive methods was the
introduction of methods for determining the wave function coefficients and the energy
that could be implemented computationally. We shall discuss this step in detail in the
next chapter. Before we conclude this chapter, however, it is desirable to recapitulate
some of our discussion and to make some general remarks.
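
The factorial growth quoted above can be verified by brute-force enumeration. The Python sketch below (function names are illustrative, not from the notes) counts the distinct products of doubles coefficients that contribute to a disconnected 2n-fold excitation: the 2n occupied labels are split into n pairs, the 2n unoccupied labels are split into n pairs, and each unoccupied pair is assigned to an occupied pair.

```python
from itertools import permutations
from math import factorial


def pairings(items):
    """Yield every way of splitting a list of labels into unordered pairs."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def count_disconnected_products(n_pairs):
    """Count distinct n-fold products of doubles coefficients for a 2n-fold excitation."""
    occupied = [f"I{k}" for k in range(2 * n_pairs)]
    virtual = [f"A{k}" for k in range(2 * n_pairs)]
    terms = set()
    for occ_pairs in pairings(occupied):
        for virt_pairs in pairings(virtual):
            # Each assignment of unoccupied pairs to occupied pairs is one product.
            for assignment in permutations(virt_pairs):
                terms.add(frozenset(zip(occ_pairs, assignment)))
    return len(terms)


def closed_form(n_pairs):
    """Same count in closed form: ((2n)!)^2 / (4^n n!)."""
    return factorial(2 * n_pairs) ** 2 // (4**n_pairs * factorial(n_pairs))


for n in (2, 3):
    print(n, count_disconnected_products(n), closed_form(n))
```

For n = 2 the count is 18, the number of terms in Eq. 1.23, and for n = 3 it is 1350, consistent with the statement that the disconnected hextuples contain more than 1300 triple products.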

1.4 General remarks

There are many different perspectives from which one can view the cluster expansion
of the wave function, and the issue of size-extensivity. Thirty years ago, for example,
when there were essentially no computational implementations, much of the formal
effort was devoted to Lie-algebraic analyses, wave operators, and the logarithm of
the wave function. In simple terms, size-extensivity is obtained by having an addi-
tively separable energy, which in turn derives from a multiplicatively separable wave
function [5]. We can see this for our noninteracting two-electron systems: the total
wave function can be a simple product of the subsystem wave functions, leading to a
total energy which is the sum of subsystem energies [4]. The simple product suffices
because the wave functions of the subsystems do not overlap.
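
The link between a multiplicatively separable wave function and an additively separable energy can be illustrated with a small matrix model. In the Python sketch below, two noninteracting subsystems are represented by arbitrary random symmetric matrices (purely hypothetical "Hamiltonians"; antisymmetry between the subsystems is ignored, in line with the non-overlapping case described above). The total Hamiltonian is H_A ⊗ 1 + 1 ⊗ H_B, and the product of the subsystem ground states is checked to be an eigenvector whose eigenvalue is the sum of the subsystem energies.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_symmetric(n):
    """Random real symmetric matrix standing in for a subsystem Hamiltonian."""
    a = rng.normal(size=(n, n))
    return (a + a.T) / 2.0


h_a = random_symmetric(3)
h_b = random_symmetric(4)

# Total Hamiltonian of the two noninteracting subsystems.
h_ab = np.kron(h_a, np.eye(4)) + np.kron(np.eye(3), h_b)

e_a, v_a = np.linalg.eigh(h_a)
e_b, v_b = np.linalg.eigh(h_b)
e_ab = np.linalg.eigvalsh(h_ab)

# Product (multiplicatively separable) wave function built from the ground states.
psi_product = np.kron(v_a[:, 0], v_b[:, 0])

print("E_A + E_B        :", e_a[0] + e_b[0])
print("lowest E of A...B:", e_ab[0])
print("product state is an eigenvector:",
      np.allclose(h_ab @ psi_product, (e_a[0] + e_b[0]) * psi_product))
```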
On the other hand, those who have entered the field from perturbation theory
usually use diagrammatic analyses, and express this perspective on size-extensivity by
stating that a size-extensive energy contains no "unlinked diagrams" (see, e.g., Ref. 2).
Feynman diagrams (also called Brandow diagrams in this context) or Hugenholtz
diagrams provide a bookkeeping strategy [8] for energy contributions, which for the
usual quantum-chemical single-reference perturbation theories are products (or sums
of products) of two-electron integrals divided by energy denominators. If a given
energy contribution can be factorized into a simple product of two other contributions,
the corresponding diagram will consist of two closed disjoint parts. Hence there are no
"links" between the two subdiagrams. The absence of any such terms is a necessary
and sufficient condition for size-extensivity. They thus allow us to determine whether
a treatment will be size-extensive, even when there is no explicit wave function, as
in some perturbation treatments.† Since we will not use diagrams in this course at
all, we will mostly be concerned with wave function terms and hence connected and
disconnected clusters.
Sinanoglu [6] originally introduced the cluster expansion drawing heavily on
physical arguments. We know that electron correlation is primarily a two-body ef-
fect, mediated by the cusp behaviour in the wave function as the distance between
the two electrons approaches zero. Further, correlation effects between electrons of
opposite spin will be stronger than between electrons of parallel spin, because Fermi
statistics (the Pauli principle) acts to keep the latter apart. Indeed, parallel-spin
electrons are sometimes said to be "Fermi correlated" even at the Hartree-Fock level.
This Fermi correlation means that true three- and more-body effects will be small,
because in any such situation at least two electrons must have parallel spins. But the
possibility of two-body correlation involving one pair of electrons and independent
two-body correlation of another pair of electrons is not affected by these arguments,
so there might be a sizeable "four-body" correlation effect arising from two two-body
correlations. Similarly, correlation of three pairs independently will give a six-body
term, etc. And, of course, these pair-product correlations correspond exactly to the
disconnected cluster terms in the wave function, since they are determined completely
by the two-body correlation effects.
Finally, let us review one or two of the main points of this chapter. We have
introduced the concept of size-extensivity, or correct scaling of the energy. Size-
extensivity or its absence is a formal property of the energy. The related concept
of size-consistency is defined by a requirement on dissociation or fragmentation limit
energies. While it is usually of considerable practical significance, it is not a formal
property, and is not relevant to some situations, like changing the number of electrons
in a correlation treatment, where size-extensivity is meaningful (and desirable). Size-
extensivity is related to disconnected cluster terms in the wave function: the number
of these terms is discouragingly large from a computational point of view. We shall
turn now to a way of developing a size-extensive energy from a wave function ansatz
that explicitly includes only connected clusters.

†This is the primary reason for preferring the term "disconnected cluster" to "unlinked cluster" in
discussing wave function terms. Then the word "unlinked" is used only with diagrams. Of course, we
should also note that in the early literature the term "unlinked cluster" sometimes means "unlinked
diagram", not "disconnected cluster"!
Chapter 2

The exponential ansatz

2.1 Linear ansatz: CI wave functions

We begin with a simple formulation of the CI expansion. Once again, we let in-
dices I, J ... denote occupied Hartree-Fock spin-orbitals and A, B ... unoccupied
spin-orbitals. We can use second quantization to obtain our excited configurations,

$$\Psi_I^A = X_A^\dagger X_I \Psi_0, \tag{2.1}$$

for example, where $\Psi_0$ is the Hartree-Fock determinant. The CI expansion in
intermediate normalization can then be written

$$\Psi_{\mathrm{CI}} = \Bigl(1 + \sum_A \sum_I c_I^A X_A^\dagger X_I + \sum_{A>B} \sum_{I>J} c_{IJ}^{AB} X_B^\dagger X_J X_A^\dagger X_I + \ldots\Bigr) \Psi_0, \tag{2.2}$$

which we can conveniently summarize as

$$\Psi_{\mathrm{CI}} = (1 + C_1 + C_2 + \ldots)\Psi_0, \tag{2.3}$$

where, for example, $C_1$ is given by

$$C_1 = \sum_A \sum_I c_I^A X_A^\dagger X_I, \tag{2.4}$$

thus generating all single excitations from $\Psi_0$, etc. We may simplify Eq. 2.3 to
$(1 + C)\Psi_0$, where all the excitation operators are denoted by $C$. The CI eigenvalue
equations can be obtained, for example, by projecting the "Schrödinger equation"
$H(1 + C)\Psi_0 = E(1 + C)\Psi_0$ onto the basis of many-electron states obtained by
applying all the excitation operators in $1 + C$ to $\Psi_0$. These states thus comprise all single
excitations $\Psi_I^A$, double excitations $\Psi_{IJ}^{AB}$, etc. The resulting equations are

$$\langle \Psi_0 | H | (1 + C)\Psi_0 \rangle = E, \tag{2.5}$$

$$\langle \Psi_I^A | H | (1 + C)\Psi_0 \rangle = E c_I^A, \tag{2.6}$$

$$\langle \Psi_{IJ}^{AB} | H | (1 + C)\Psi_0 \rangle = E c_{IJ}^{AB}, \text{ etc.} \tag{2.7}$$
We have utilized the orthonormality of the many-electron states to simplify the
right-hand side (RHS) of these equations, which clearly correspond to an eigen-
value/eigenvector problem.
The limitation in this CI approach is what we discussed in Chapter 1: no
version of these equations in which there is truncation of 1 + C by excitation level
can yield a size-extensive result. We can illustrate this by applying simple scaling
arguments to the case of CI with only double excitations (CID). Let us simplify
the equations somewhat by eliminating the Hartree-Fock energy: we will replace the
Hamiltonian H by the operator W defined by

$$W = H - E_0 + \sum_I \epsilon_I X_I^\dagger X_I - \sum_A \epsilon_A X_A^\dagger X_A, \tag{2.8}$$

where $E_0$ is the HF energy, and $\epsilon_A$ ($\epsilon_I$) are virtual (occupied) orbital energies. This
form is obviously appropriate only for canonical Hartree-Fock orbitals. If other
Hartree-Fock orbitals are used (e.g., localized orbitals), the form of Eq. 2.8 can be
modified straightforwardly to take account of a nondiagonal Fock operator. We shall
assume henceforth that the Hamiltonians $H$ and $W$ are given in normal-ordered form
(all creation operators stand to the left of annihilation operators). The CI equations
for the correlation energy $\varepsilon$ are then

$$\langle \Psi_0 | W | C_2 \Psi_0 \rangle = \varepsilon \tag{2.9}$$

and

$$\langle \Psi_{IJ}^{AB} | W | (1 + C_2)\Psi_0 \rangle = \varepsilon\, c_{IJ}^{AB}. \tag{2.10}$$
Suppose we consider once again the case of $M$ noninteracting two-electron systems.
The correlation energy

$$\varepsilon = \langle \Psi_0 | W | C_2 \Psi_0 \rangle \equiv \sum_{I>J} \sum_{A>B} \langle \Psi_0 | W | \Psi_{IJ}^{AB} \rangle c_{IJ}^{AB}, \tag{2.11}$$

should scale linearly with $M$ for size-extensive behaviour, but we begin only by
assuming that it scales as $M^\beta$, for some power $\beta$. Eq. 2.11 then shows that the CI
coefficient $c_{IJ}^{AB}$ must scale as $M^{\beta-1}$. This is so because the matrix elements of $W$ are
just integrals that are independent of $M$. Since there are $M$ subsystems, and the
left-hand side (LHS) of Eq. 2.9 overall must scale as $M^\beta$ (as assumed for the RHS),
the coefficient must go as $M^{\beta-1}$.
Let us now consider Eq. 2.10. Evidently, the RHS scales as $M^\beta M^{\beta-1}$, or $M^{2\beta-1}$.
We can take the LHS term by term. The first term is $\langle \Psi_{IJ}^{AB} | W | \Psi_0 \rangle$, which is just a
two-electron integral and is independent of $M$. The second term is $\langle \Psi_{IJ}^{AB} | W | C_2 \Psi_0 \rangle$.
This gives us products of a CI coefficient and a matrix element. The only matrix
elements that are nonvanishing are those in which all MOs are on the same two-
electron system, hence the matrix elements are independent of $M$. The coefficient
gives us an $M^{\beta-1}$ dependence. So the LHS has a term that is $M^0$, and one that
is $M^{\beta-1}$. The RHS goes as $M^{2\beta-1}$. We are thus in a quandary. There is no value
of $\beta$ that will have all the terms in the equation scaling correctly, at least for $M > 1$.
In fact, since the LHS has a constant term already, we would need $\beta = 1$ to make
the LHS all constant. That is also what we want for size-extensivity, of course, but
since the RHS would then go as $M$, we cannot make the two sides of the equation
"dimensionally" equivalent. This simple argument, analogous to checking units in
numerical calculations, shows that CID cannot be size-extensive.
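
The scaling argument can be made concrete with a small model that is not taken from the notes: M identical, noninteracting two-electron systems, each with a single doubly excited configuration. The parameter names are assumptions for illustration: `coupling` stands for the matrix element between the reference and one double excitation, and `2 * half_gap` for the excitation energy of that configuration. In this model the CID equations reduce to a quadratic for the correlation energy, and for M = 1 CID is exact.

```python
import math


def cid_correlation_energy(n_systems, coupling, half_gap):
    """Lowest CID correlation energy for the hypothetical two-level model.

    Energy equation:     eps = n_systems * coupling * c
    Amplitude equation:  coupling + 2 * half_gap * c = eps * c
    Eliminating c gives  eps**2 - 2*half_gap*eps - n_systems*coupling**2 = 0.
    """
    return half_gap - math.sqrt(half_gap**2 + n_systems * coupling**2)


def cid_residual(n_systems, coupling, half_gap):
    """Residual of the amplitude equation at the solution (should vanish)."""
    eps = cid_correlation_energy(n_systems, coupling, half_gap)
    c = eps / (n_systems * coupling)
    return coupling + 2.0 * half_gap * c - eps * c


coupling, half_gap = 0.1, 0.5  # illustrative values only
per_system = {
    m: cid_correlation_energy(m, coupling, half_gap) / m
    for m in (1, 2, 10, 100)
}
```

If the CID energy were size-extensive, every entry of `per_system` would equal the M = 1 value. In this model the magnitude instead falls as M grows, because the unknown energy multiplies the coefficient on the RHS of the amplitude equation.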
It would presumably be possible to establish that no truncated CI is size-
extensive by similar arguments, but the enumeration of the various matrix elements
would become extremely tedious. The result can be proved both algebraically and by
diagrammatic techniques in an elegant way, but this is not necessary for our present
purposes. We simply accept here that the result that holds for CID will hold for any
other truncated CI: as we saw in Chapter 1 the problem is that our wave function
contains no disconnected cluster contributions, or, as some would say, our energy
expression includes unlinked diagrams.

2.2 The exponential ansatz

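Let us take another tack. Consider excitation operators labelled by $T$, where

$$T = T_1 + T_2 + T_3 + \ldots \tag{2.12}$$

and

$$T_1 = \sum_A \sum_I t_I^A X_A^\dagger X_I, \text{ etc.}, \tag{2.13}$$

in complete analogy with the CI case above. We introduce different
labels because we now propose to write the wave function not as the linear expansion
of Eq. 2.3, but as the exponential [9,10]

$$\Psi_{\mathrm{EXP}} = \exp(T)\Psi_0. \tag{2.14}$$

By expanding the exponential in a power series we obtain

$$\Psi_{\mathrm{EXP}} = \Bigl(1 + T + \frac{1}{2!}T^2 + \frac{1}{3!}T^3 + \ldots\Bigr)\Psi_0. \tag{2.15}$$

Since

$$T_2 = \sum_{A>B} \sum_{I>J} t_{IJ}^{AB} X_B^\dagger X_J X_A^\dagger X_I, \tag{2.16}$$

it follows that

$$T_2^2 = \sum_{I>J} \sum_{K>L} \sum_{A>B} \sum_{C>D} t_{IJ}^{AB} t_{KL}^{CD} X_B^\dagger X_J X_A^\dagger X_I X_D^\dagger X_L X_C^\dagger X_K. \tag{2.17}$$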

Thus the exponential ansatz automatically generates all the disconnected clusters.
The only parameters that appear independently in the wave function are the
connected cluster coefficients. This is clearly a very convenient way to build
size-extensivity into the method. We can also relate the coefficients of excitation levels in
Eqs. 2.14 and 2.3. For example,

$$c_I^A = t_I^A \tag{2.18}$$

and

$$c_{IJ}^{AB} = t_{IJ}^{AB} + t_I^A t_J^B - t_I^B t_J^A \tag{2.19}$$

for the single and double excitations. We can thus see the emergence of disconnected
terms even in the double excitations. Further, the reader may care to verify by explicit
expansion of the terms derived from $T_2^2$ that the same disconnected quadruples are
obtained as were listed in Eq. 1.23. (The $T_2^2$ expansion gives 36 terms that are
permutationally equivalent to the 18 independent terms in Eq. 1.23; the factor of
one-half in Eq. 2.15 then accounts for the redundancy.)
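
As a small numerical illustration of Eq. 2.19, the sketch below builds the CI doubles coefficients from singles and doubles cluster amplitudes and checks that the disconnected part preserves the antisymmetry of the coefficients. The orbital labels and the random amplitudes are made up for the example and carry no physical meaning.

```python
import itertools
import random


def ci_doubles_from_clusters(t1, t2, occupied, virtual):
    """Eq. 2.19: c_IJ^AB = t_IJ^AB + t_I^A t_J^B - t_I^B t_J^A."""
    c2 = {}
    for i, j in itertools.permutations(occupied, 2):
        for a, b in itertools.permutations(virtual, 2):
            c2[i, j, a, b] = (t2[i, j, a, b]
                              + t1[i, a] * t1[j, b]
                              - t1[i, b] * t1[j, a])
    return c2


random.seed(0)
occupied, virtual = ["I", "J", "K"], ["A", "B", "C"]
t1 = {(i, a): random.uniform(-0.1, 0.1) for i in occupied for a in virtual}

# Doubles amplitudes: choose values for one ordering of each index pair and
# extend to the other orderings by antisymmetry.
t2 = {}
for i, j in itertools.combinations(occupied, 2):
    for a, b in itertools.combinations(virtual, 2):
        value = random.uniform(-0.1, 0.1)
        t2[i, j, a, b] = t2[j, i, b, a] = value
        t2[j, i, a, b] = t2[i, j, b, a] = -value

c2 = ci_doubles_from_clusters(t1, t2, occupied, virtual)

# Largest deviation from antisymmetry under I<->J or A<->B.
max_violation = max(
    max(abs(c2[i, j, a, b] + c2[j, i, a, b]),
        abs(c2[i, j, a, b] + c2[i, j, b, a]))
    for (i, j, a, b) in c2
)
# Size of the disconnected contribution t_I^A t_J^B - t_I^B t_J^A.
largest_disconnected = max(abs(c2[key] - t2[key]) for key in c2)
```

The antisymmetrized product of singles amplitudes changes sign under exchange of two occupied or two virtual labels, exactly as the connected amplitude does, so the sum in Eq. 2.19 is a legitimate CI coefficient.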
The wave function $\Psi_{\mathrm{EXP}}$ is implicitly equivalent to a full CI, since we have so
far done nothing about truncating the expansion of T. Indeed, from a full CI per-
spective all we have done is complicate matters, since now we have a highly nonlinear
representation of the wave function, compared to the linear CI expansion. However,
it is obvious that even if we truncate T at some fixed excitation level, we will retain
all disconnected clusters arising from the truncated set of connected terms. For in-
stance, we can approximate T by T2 only. The exponential then generates all orders
of disconnected clusters of pair excitations, just like Sinanoglu's pair-cluster expan-
sion (Eq 1.21). But a crucial problem remains - how do we optimize the connected
cluster coefficients? That is, how do we devise a computational method that can
exploit this type of wave function? We have already determined that a variational
approach will not do, because the number of terms (and the nonlinearity here) would
become impossible.

2.3 The coupled-cluster method

We begin by writing the unknown wave function $\Psi$ as $\exp(T)\Psi_0$. At this stage, we
make no assumptions about truncation of $T$. In a variational treatment we would
multiply $H \exp(T)\Psi_0 = E \exp(T)\Psi_0$ on the left by $\exp(T)^\dagger$, and then obtain an
equation system by setting the change in energy with respect to any of the $t$ coefficients
to zero. We have already accepted that this is not feasible. How to proceed next
is developed in many different ways by different authors. One approach [11] is to
multiply on the left by $\exp(-T)$, thus obtaining

$$\exp(-T) H \exp(T)\Psi_0 = E\Psi_0. \tag{2.20}$$

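As discussed elsewhere at this summer school, the operator on the left-hand side can
be rewritten using a Hausdorff commutator expansion,

$$\exp(A) B \exp(-A) = B + [A,B] + \frac{1}{2!}[A,[A,B]] + \ldots. \tag{2.21}$$

In the present case, however, we receive a special bonus when this expansion is
developed. We note first that the excitation operator $X_A^\dagger X_I$ commutes with the operator $T$.
The general creation/annihilation operator product $X_P^\dagger X_Q$ does not, so $H$ does not
commute with $T$. But we observe that

$$\begin{aligned}
X_P^\dagger X_Q X_A^\dagger X_I &= \delta_{AQ} X_P^\dagger X_I - X_P^\dagger X_A^\dagger X_Q X_I \\
&= \delta_{AQ} X_P^\dagger X_I - X_A^\dagger X_P^\dagger X_I X_Q \\
&= \delta_{AQ} X_P^\dagger X_I - \delta_{PI} X_A^\dagger X_Q + X_A^\dagger X_I X_P^\dagger X_Q,
\end{aligned} \tag{2.22}$$

so that

$$[X_P^\dagger X_Q, X_A^\dagger X_I] = \delta_{AQ} X_P^\dagger X_I - \delta_{PI} X_A^\dagger X_Q. \tag{2.23}$$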
Thus the effect of the commutator is to eliminate one of the general creation or
annihilation operators. When we evaluate the commutators that involve $H$, whose
second-quantized form is

$$H = \sum_{PQ} (P|h|Q) X_P^\dagger X_Q + \frac{1}{2} \sum_{PQRS} (PR|QS) X_P^\dagger X_Q^\dagger X_S X_R, \tag{2.24}$$

we will successively eliminate the general $P$, $Q$, $R$, $S$ creation and annihilation
operators. After four such eliminations, the only operators that remain are
excitation operators, like $X_A^\dagger X_I$, that commute with $T$. Hence the Hausdorff expansion
of $\exp(-T) H \exp(T)$ is not an infinite series, but is given exactly by the five-term
expansion

$$\exp(-T) H \exp(T) = H + [H,T] + \frac{1}{2}[[H,T],T] + \frac{1}{3!}[[[H,T],T],T] + \frac{1}{4!}[[[[H,T],T],T],T]. \tag{2.25}$$
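
The commutator identity that drives this termination can be checked numerically. The sketch below is an illustration only: it represents fermion creation and annihilation operators on occupation-number bitstrings, with a sign convention chosen for the example, in a hypothetical space of four spin-orbitals. It then tests the relation $[X_P^\dagger X_Q, X_A^\dagger X_I] = \delta_{AQ} X_P^\dagger X_I - \delta_{PI} X_A^\dagger X_Q$ on every basis state for every choice of indices.

```python
from itertools import product

N_ORBITALS = 4  # hypothetical small spin-orbital space


def apply_op(kind, p, state):
    """Apply a creation ('c') or annihilation ('a') operator for orbital p.

    A state is a dict {occupation bitmask: integer coefficient}.
    """
    result = {}
    for mask, coeff in state.items():
        occupied = (mask >> p) & 1
        if (kind == "c") == bool(occupied):
            continue  # create on a filled orbital or annihilate an empty one
        # Fermionic phase: parity of occupied orbitals with index below p.
        sign = -1 if bin(mask & ((1 << p) - 1)).count("1") % 2 else 1
        new_mask = mask ^ (1 << p)
        result[new_mask] = result.get(new_mask, 0) + sign * coeff
    return result


def apply_string(ops, state):
    """Apply a product of operators; the rightmost operator acts first."""
    for kind, p in reversed(ops):
        state = apply_op(kind, p, state)
    return state


def combine(x, y, factor):
    """Return x + factor * y, dropping zero coefficients."""
    out = dict(x)
    for mask, coeff in y.items():
        out[mask] = out.get(mask, 0) + factor * coeff
    return {mask: coeff for mask, coeff in out.items() if coeff != 0}


def excitation(p, q):
    return [("c", p), ("a", q)]  # the operator X_p^dagger X_q


def commutator_identity_holds(p, q, a, i):
    for mask in range(1 << N_ORBITALS):
        ket = {mask: 1}
        lhs = combine(apply_string(excitation(p, q) + excitation(a, i), ket),
                      apply_string(excitation(a, i) + excitation(p, q), ket),
                      -1)
        rhs = {}
        if a == q:
            rhs = combine(rhs, apply_string(excitation(p, i), ket), 1)
        if p == i:
            rhs = combine(rhs, apply_string(excitation(a, q), ket), -1)
        if lhs != rhs:
            return False
    return True


all_hold = all(commutator_identity_holds(p, q, a, i)
               for p, q, a, i in product(range(N_ORBITALS), repeat=4))
```

Each commutator with an excitation operator replaces a product of two general operator pairs by single pairs, which is why the two-electron part of $H$, with four general operators, is exhausted after four commutators.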

We can now use the Hausdorff expansion to develop an explicit form for
optimizing the wave function. Just as we proceeded above, we project onto a basis of
states adequate to define all the independent coefficients in the wave function. Such
a set is again provided by the Hartree-Fock configuration plus the excited
configurations generated by all the excitation operators in $T$. Let $\Psi_{IJK\ldots}^{ABC\ldots}$ denote an arbitrary
excited configuration from this set. We thus obtain the equations

$$\langle \Psi_0 | H + [H,T] + \frac{1}{2}[[H,T],T] + \frac{1}{3!}[[[H,T],T],T] + \frac{1}{4!}[[[[H,T],T],T],T] | \Psi_0 \rangle = E \tag{2.26}$$

and

$$\langle \Psi_{IJK\ldots}^{ABC\ldots} | H + [H,T] + \frac{1}{2}[[H,T],T] + \frac{1}{3!}[[[H,T],T],T] + \frac{1}{4!}[[[[H,T],T],T],T] | \Psi_0 \rangle = 0 \quad \forall\ IJK\ldots \text{ and } ABC\ldots. \tag{2.27}$$

Eq. 2.26 and the set of equations 2.27 define the coupled-cluster method. Two crucial
points can be inferred from these coupled-cluster equations. The first is that the
finite Hausdorff expansion leads to (at worst) quartic equations Eq. 2.27. This is
true no matter what level of excitation is generated by $\exp(T)$. Second, the unknown
energy does not appear anywhere in the equations that determine the various cluster
amplitudes $t_{IJK\ldots}^{ABC\ldots}$. This decoupling of Eq. 2.26 from the system 2.27, so that we
solve the latter first and obtain the energy from the former, produces an additively
separable energy and a size-extensive result. This is trivially true for the full CI case
(all possible excitations in $T$), but it is also true for any truncation of $T$ by excitation
level. We will now explicitly demonstrate that size-extensive results are obtained for
our model noninteracting two-electron systems, by considering only double excitations
in $T$ - the CCD model.
Using the operator $W$ (Eq. 2.8) instead of $H$, the CCD equations can be
written as

$$\langle \Psi_0 | [W,T_2] + \frac{1}{2}[[W,T_2],T_2] | \Psi_0 \rangle = \varepsilon, \tag{2.28}$$

and

$$\langle \Psi_{IJ}^{AB} | W + [W,T_2] + \frac{1}{2}[[W,T_2],T_2] | \Psi_0 \rangle = 0. \tag{2.29}$$

We note that the Hausdorff expansion terminates exactly after only three terms in
the CCD case. We expand the commutators, and use the fact that bra and ket may
differ by no more than a double excitation for a non-zero matrix element. Eq. 2.28
then simplifies to

$$\langle \Psi_0 | W T_2 | \Psi_0 \rangle = \varepsilon. \tag{2.30}$$

We again assume that the correlation energy $\varepsilon$ scales as $M^\beta$ for $M$ noninteracting
two-electron systems, so that the amplitudes will again scale as $M^{\beta-1}$. Similarly,
Eq. 2.29 can be rewritten as

$$\langle \Psi_{IJ}^{AB} | W + (W T_2 - T_2 W) + \tfrac{1}{2}(W T_2^2 - 2 T_2 W T_2 + T_2^2 W) | \Psi_0 \rangle = 0, \tag{2.31}$$

where the parentheses simply let us group the LHS into three terms. The first of
these terms involves no amplitudes and reduces to a two-electron integral that is
independent of $M$. The second term (the matrix element of $(WT_2 - T_2W)$) effectively
involves matrix elements between double excitations: these are zero when the
excitations are on different systems or from other than the pair $IJ$. Where the matrix
element is nonzero, it comprises an integral or integrals multiplying an amplitude:
the integral is independent of $M$ so the overall scaling of this term is $M^{\beta-1}$. The
third term involves disconnected quadruple excitations. After a certain amount of
manipulation it may be reduced to the form

$$\sum_{K>L} \sum_{C>D} \langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle \bar{u}_{IJKL}^{ABCD}, \tag{2.32}$$

where

$$\begin{aligned}
\bar{u}_{IJKL}^{ABCD} ={}& - t_{IJ}^{AC} t_{KL}^{BD} + t_{IJ}^{AD} t_{KL}^{BC} + t_{IJ}^{BC} t_{KL}^{AD} \\
& - t_{IJ}^{BD} t_{KL}^{AC} + t_{IJ}^{CD} t_{KL}^{AB} - t_{IK}^{AB} t_{JL}^{CD} + t_{IK}^{AC} t_{JL}^{BD} \\
& - t_{IK}^{AD} t_{JL}^{BC} - t_{IK}^{BC} t_{JL}^{AD} + t_{IK}^{BD} t_{JL}^{AC} - t_{IK}^{CD} t_{JL}^{AB} \\
& + t_{IL}^{AB} t_{JK}^{CD} - t_{IL}^{AC} t_{JK}^{BD} + t_{IL}^{AD} t_{JK}^{BC} + t_{IL}^{BC} t_{JK}^{AD} \\
& - t_{IL}^{BD} t_{JK}^{AC} + t_{IL}^{CD} t_{JK}^{AB}.
\end{aligned} \tag{2.33}$$

Note that this disconnected quadruples coefficient is not the 18-term form of Eq. 1.23:
the leading term $t_{IJ}^{AB} t_{KL}^{CD}$ from Eq. 1.23 does not occur in the expansion of the LHS of
Eq. 2.31. This has profound consequences, and is discussed in the next section as well
as here. The matrix element between double and quadruple excitations in Eq. 2.32
can be simplified, since

$$\langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle = \langle \Psi_0 | W | \Psi_{KL}^{CD} \rangle, \tag{2.34}$$

and this is actually a combination of two-electron integrals. These integrals are
independent of $M$, of course. What of the dependence of $\bar{u}$? For our noninteracting
systems, any term corresponding to simultaneous single excitations on two
subsystems (all terms involving subscripts like $IK$ or superscripts like $AC$) is zero. Further,
excitations from $IJ$ into correlating orbitals from a different pair also have zero
coefficients. The only terms that survive come from the case $\{KL\} = \{IJ\}$, which since
restricted summations are used in Eq. 2.32 implies $K = I$ and $J = L$. Thus for the
noninteracting two-electron systems $\bar{u}_{IJKL}^{ABCD}$ reduces to

$$\begin{aligned}
& - t_{IJ}^{AC} t_{IJ}^{BD} + t_{IJ}^{AD} t_{IJ}^{BC} + t_{IJ}^{BC} t_{IJ}^{AD} - t_{IJ}^{BD} t_{IJ}^{AC} + t_{IJ}^{CD} t_{IJ}^{AB} \\
& + t_{IJ}^{AB} t_{JI}^{CD} - t_{IJ}^{AC} t_{JI}^{BD} + t_{IJ}^{AD} t_{JI}^{BC} + t_{IJ}^{BC} t_{JI}^{AD} - t_{IJ}^{BD} t_{JI}^{AC} + t_{IJ}^{CD} t_{JI}^{AB}.
\end{aligned} \tag{2.35}$$

From the original definition of the amplitudes $t_{IJ}^{AB}$ in Eq. 2.16 we can see that we
require that $t_{JI}^{AB} = -t_{IJ}^{AB}$, whereupon most of the terms in Eq. 2.35 will cancel with
one another. For noninteracting two-electron systems we then finally obtain

$$\sum_{K>L} \sum_{C>D} \langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle \bar{u}_{IJKL}^{ABCD} = - \sum_{C>D} \langle \Psi_0 | W | \Psi_{IJ}^{CD} \rangle t_{IJ}^{AB} t_{IJ}^{CD}. \tag{2.36}$$

Note that the result has no sum over occupied orbitals. Hence this term in fact has
only the $M$ dependence of the coefficients, or $M^{2\beta-2}$ overall. Let us return (finally!)
to the analysis of Eq. 2.31. The LHS comprises three terms, which scale respectively
as $M^0$, $M^{\beta-1}$, and $M^{2\beta-2}$. And we thus see that all three terms scale the same way
for $\beta = 1$, which is precisely what is required for size-extensive behaviour.
In this way, we see that the generation of disconnected quadruples by the
exponential operator $\exp(T)$ provides exactly what is needed for a size-extensive result.
We may note that the unknown correlation energy nowhere appears in the equations
defining the amplitudes, unlike the CI case. We may also note that the amplitude
product on the RHS of Eq. 2.36 corresponds, in effect, to the "quadruple excitation"
$\Psi_{IJIJ}^{ABCD}$, which cannot occur since the Pauli principle forbids annihilating the same
occupied spin-orbital twice. Such a contribution is often referred to as an exclusion
principle violating (EPV) term. We shall meet EPV terms again in this course.
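
The size-extensivity of CCD can be seen in a minimal sketch based on a model that is not part of the notes: M identical, noninteracting two-electron systems, each with one doubly excited configuration. The names are assumptions for illustration: `coupling` is the matrix element between the reference and the double excitation, and `2 * half_gap` is its excitation energy. In this model the amplitude equation for each subsystem is `coupling + 2*half_gap*t - coupling*t*t = 0`, where the quadratic term plays the role of the EPV contribution of Eq. 2.36, and M does not appear in it.

```python
import math


def ccd_amplitude(coupling, half_gap, tol=1e-14, max_iter=200):
    """Solve coupling + 2*half_gap*t - coupling*t**2 = 0 by fixed-point iteration."""
    t = 0.0
    for _ in range(max_iter):
        t_new = (coupling - coupling * t * t) / (-2.0 * half_gap)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    raise RuntimeError("CCD iteration did not converge")


def ccd_correlation_energy(n_systems, coupling, half_gap):
    """Energy equation: one identical contribution coupling * t per subsystem."""
    t = ccd_amplitude(coupling, half_gap)  # independent of n_systems
    return n_systems * coupling * t


coupling, half_gap = 0.1, 0.5  # illustrative values only
exact_single = half_gap - math.sqrt(half_gap**2 + coupling**2)
ccd_per_system = {
    m: ccd_correlation_energy(m, coupling, half_gap) / m
    for m in (1, 2, 10, 100)
}
```

Because the energy is absent from the amplitude equation, the energy per subsystem in this model is the same for every M and equals the exact single-system value.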

2.4 Alternative formulations

There are many formulations of the coupled-cluster equations, some differing from
the previous section, at least superficially. We shall briefly review some aspects that
the reader may encounter in the literature.
It is common not to use the Hausdorff expansion when setting up the CC
equations [4,12]. We can instead project the "Schrödinger equation"
$H \exp(T)\Psi_0 = E \exp(T)\Psi_0$ onto a basis of many-electron states. Using again the operator
$W = H - E_0$ we obtain

$$\langle \Psi_0 | W | \exp(T)\Psi_0 \rangle = \varepsilon, \tag{2.37}$$

$$\langle \Psi_I^A | W | \exp(T)\Psi_0 \rangle = \varepsilon\, t_I^A, \tag{2.38}$$

$$\langle \Psi_{IJ}^{AB} | W | \exp(T)\Psi_0 \rangle = \varepsilon\,(t_{IJ}^{AB} + t_I^A t_J^B - t_I^B t_J^A), \text{ etc.} \tag{2.39}$$

In this approach, the correlation energy appears explicitly in the equations defining
the amplitudes. However, at each level of excitation, terms will arise on the LHS,
involving disconnected clusters, that will cancel the term on the RHS. We can see
this explicitly for the CCD case again, for which the equations are

$$\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \varepsilon, \tag{2.40}$$

$$\langle \Psi_{IJ}^{AB} | W | (1 + T_2 + \tfrac{1}{2}T_2^2)\Psi_0 \rangle = \varepsilon\, t_{IJ}^{AB}. \tag{2.41}$$

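The first equation just defines the correlation energy, given the CCD amplitudes. The
remaining equations include the term

$$\langle \Psi_{IJ}^{AB} | W | \tfrac{1}{2} T_2^2 \Psi_0 \rangle, \tag{2.42}$$

which is similar to the doubles/quadruples matrix element considered above.
However, Eq. 2.42 does not expand to the form of Eq. 2.32, but includes all 18 terms in
the coefficient product [4]:

$$\sum_{K>L} \sum_{C>D} \langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle u_{IJKL}^{ABCD}, \tag{2.43}$$

where

$$\begin{aligned}
u_{IJKL}^{ABCD} ={}& t_{IJ}^{AB} t_{KL}^{CD} - t_{IJ}^{AC} t_{KL}^{BD} + t_{IJ}^{AD} t_{KL}^{BC} + t_{IJ}^{BC} t_{KL}^{AD} \\
& - t_{IJ}^{BD} t_{KL}^{AC} + t_{IJ}^{CD} t_{KL}^{AB} - t_{IK}^{AB} t_{JL}^{CD} + t_{IK}^{AC} t_{JL}^{BD} \\
& - t_{IK}^{AD} t_{JL}^{BC} - t_{IK}^{BC} t_{JL}^{AD} + t_{IK}^{BD} t_{JL}^{AC} - t_{IK}^{CD} t_{JL}^{AB} \\
& + t_{IL}^{AB} t_{JK}^{CD} - t_{IL}^{AC} t_{JK}^{BD} + t_{IL}^{AD} t_{JK}^{BC} + t_{IL}^{BC} t_{JK}^{AD} \\
& - t_{IL}^{BD} t_{JK}^{AC} + t_{IL}^{CD} t_{JK}^{AB}.
\end{aligned} \tag{2.44}$$

The first of these terms is

$$\sum_{K>L} \sum_{C>D} \langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle t_{IJ}^{AB} t_{KL}^{CD}. \tag{2.45}$$

We note that $\langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle = \langle \Psi_0 | W | \Psi_{KL}^{CD} \rangle$, and recall that the correlation energy
is given by

$$\varepsilon = \sum_{K>L} \sum_{C>D} \langle \Psi_0 | W | \Psi_{KL}^{CD} \rangle t_{KL}^{CD}. \tag{2.46}$$

Hence

$$\sum_{K>L} \sum_{C>D} \langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle t_{IJ}^{AB} t_{KL}^{CD} \equiv \sum_{K>L} \sum_{C>D} \langle \Psi_0 | W | \Psi_{KL}^{CD} \rangle t_{IJ}^{AB} t_{KL}^{CD} = \varepsilon\, t_{IJ}^{AB}, \tag{2.47}$$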
which cancels the term on the RHS of Eq. 2.41. The correlation energy is thus
removed from the equations determining the amplitudes. Note that the treatment is
size-extensive whether or not this is done explicitly, but it is usually more convenient
computationally to do the cancellation.
The energy-independent equations for the cluster amplitudes are sometimes
written down directly in the form obtained by using the Hausdorff expansion, but
based instead [10] on projecting $[(H - E)\exp(T)]_C \Psi_0 = 0$. Here the subscript $C$
indicates the use of the "connected cluster theorem" of Čížek [13]. This is in fact
exactly equivalent to the use of $\exp(-T)(H - E)\exp(T)$. Note also that although
Čížek is commonly cited for this it is not obvious what, in these references, constitutes
the theorem. From a diagrammatic point of view, the connected cluster expansion
of $[(H - E)\exp(T)]_C$ comprises only "connected diagrams". These form a proper
subset of the linked diagrams, hence the energy will be size-extensive (no unlinked
diagrams).†

†An unlinked diagram factorizes into at least one closed diagram. A diagram that factorizes into
open diagrams is disconnected, but is not unlinked.

2.5 Perturbation theory

Bartlett and co-workers (see, e.g., Ref. 14) have developed another formulation of
the CC equations - one that is useful for displaying the connections between CC
methods and perturbation theory. It is an "operator form", given symbolically as

$$D_2 T_2 = W + W T_2 + \frac{1}{2} W T_2^2 \tag{2.48}$$

for the CCD case, for example. Here we have an operator relation that becomes
an equation when we operate on the right on the HF determinant $\Psi_0$, and we then
project onto the appropriate space of excitations. In the above case, for example, we
would project on the double excitations. The result is

$$\langle \Psi_{IJ}^{AB} | D_2 T_2 | \Psi_0 \rangle = \langle \Psi_{IJ}^{AB} | W + W T_2 + \frac{1}{2} W T_2^2 | \Psi_0 \rangle. \tag{2.49}$$

The LHS here can be expanded:

$$\langle \Psi_{IJ}^{AB} | D_2 T_2 | \Psi_0 \rangle \equiv (\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B)\, t_{IJ}^{AB}, \tag{2.50}$$

where we have assumed canonical Hartree-Fock orbitals and thus the denominator $D_2$
involves only orbital energies $\epsilon_P$. Thus Eq. 2.48 is a shorthand notation for the
equation system

$$(\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B)\, t_{IJ}^{AB} = \langle \Psi_{IJ}^{AB} | W + W T_2 + \frac{1}{2} W T_2^2 | \Psi_0 \rangle. \tag{2.51}$$

From our earlier discussion we realize that only connected terms should be retained
here, something that we henceforth regard as implicit in the notation. That is, the
simple product notation $WT_2$ really represents the commutator $[W, T_2]$, etc. Thus
with these assumptions: connected terms only, canonical orbitals, and the
normal-ordered Hamiltonian $W$ of Eq. 2.8, we have a convenient and compact notation for
the CC equations. The general form of the equations can be written as

$$D_n T_n = [W \exp(T)]_C. \tag{2.52}$$

This form also lets us readily identify connections between coupled-cluster
theory and many-body perturbation theory (MBPT) with the Møller-Plesset
partitioning of the Hamiltonian. (MBPT refers to a Rayleigh-Schrödinger perturbation
theory in which only linked diagrams are included.) For instance, we can envisage
a perturbational approach in which we assume initially that the vector of doubles
amplitudes $t_2 = 0$ on the RHS of Eq. 2.48. We can then solve for a first-order
estimate $t_2^{(1)}$ from

$$D_2 T_2^{(1)} = W. \tag{2.53}$$

Again, this is an operator relation: each side should operate on $\Psi_0$, and then matrix
elements should be taken with all $\Psi_{IJ}^{AB}$. This yields

$$(\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B)\, t_{IJ}^{AB(1)} = \langle \Psi_{IJ}^{AB} | W | \Psi_0 \rangle = \langle AB \| IJ \rangle. \tag{2.54}$$

Hence

$$t_{IJ}^{AB(1)} = \frac{\langle AB \| IJ \rangle}{\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B}, \tag{2.55}$$

the first-order perturbation theory wave function is

$$|1\rangle = \sum_{I>J} \sum_{A>B} \Psi_{IJ}^{AB}\, t_{IJ}^{AB(1)}, \tag{2.56}$$

and the second-order estimate of the correlation energy is

$$E^{(2)} = \sum_{I>J} \sum_{A>B} \frac{\langle IJ \| AB \rangle \langle AB \| IJ \rangle}{\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B}. \tag{2.57}$$

The second-order (MP2) correlation energy can thus be obtained as a by-product of
solving the CC problem.
With the first-order wave function available, we can evaluate a third-order
energy, which appropriate algebraic manipulations show is exactly the MP3 energy.
Finally, if we "iterate" again, by inserting $t_2^{(1)}$ on the RHS of Eq. 2.48 and obtaining
the second-order wave function coefficients $t_2^{(2)}$, the resulting energy corresponds to
that part of the MP4 energy that involves only connected double and disconnected
quadruple excitations. The full MP4 energy also involves a contribution from single
and connected triple excitations. The former can be obtained as a by-product of
applying this iterative procedure to the CCSD, rather than the CCD, equations. The
triple excitation term will only appear if we consider the CCSDT equations.
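
The iterative scheme described in this section can be sketched for a one-pair model that is not part of the notes. The assumptions are that there is a single amplitude, that the linear $WT_2$ term vanishes, and that the connected quadratic term contributes minus the integral times the amplitude squared. The values of `integral` (standing for the antisymmetrized integral) and `denominator` (standing for the orbital energy difference) are invented for the example.

```python
import math


def doubles_iterations(integral, denominator, n_iterations):
    """Iterate  denominator * t = integral - integral * t**2  from t = 0.

    Returns the correlation energy estimate integral * t after each pass.
    """
    t = 0.0
    energies = []
    for _ in range(n_iterations):
        t = (integral - integral * t * t) / denominator
        energies.append(integral * t)
    return energies


integral = 0.1      # hypothetical <AB||IJ>
denominator = -1.0  # hypothetical eps_I + eps_J - eps_A - eps_B
energies = doubles_iterations(integral, denominator, 20)

# Second-order (MP2-like) energy of Eq. 2.57 for this single pair.
mp2_energy = integral * integral / denominator
# Closed-form solution of the model, used only as a reference value.
half_gap = -denominator / 2.0
converged_reference = half_gap - math.sqrt(half_gap**2 + integral**2)
```

The first pass reproduces the second-order energy, the second pass adds the contribution of the disconnected quadruples, and further passes converge to the full CCD result of the model.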

2.6 Expectation value-based methods

We have already stated that attempts to use the exponential ansatz in a variational
formulation seem doomed. One way to see this is to consider the variational functional

$$\frac{\langle \Psi_0 | \exp(T)^\dagger H \exp(T) | \Psi_0 \rangle}{\langle \Psi_0 | \exp(T)^\dagger \exp(T) | \Psi_0 \rangle}. \tag{2.58}$$

The exponential $\exp(T)^\dagger$ contains "deexcitation" operators like $X_I^\dagger X_A$. Since

$$[X_I^\dagger X_A, X_A^\dagger X_I] = X_I^\dagger X_I - X_A^\dagger X_A, \tag{2.59}$$

these deexcitation operators do not commute with $T$, and consequently a commutator
expansion of $\exp(T)^\dagger H \exp(T)$ does not terminate. Bartlett and Noga [15] have
chosen to avoid the infinite expansion by truncating it not after some fixed number
of terms, but by identifying which terms contribute in which order of perturbation
theory. Then all terms beyond a certain order are neglected. They term this the
expectation-value coupled-cluster (XCC) approach. The order of perturbation theory
used is indicated by a notation like XCC(4), implying all terms that would contribute
through fourth-order perturbation theory. This would include terms in single, double
and triple connected excitations, although certain terms involving these excitations
that contribute to the CC equations in higher orders will be neglected. The main
advantage of proceeding via an energy functional is asserted to be the simpler expres-
sions that arise for energy derivatives and molecular properties. We note that XCC(2)
is equivalent to second-order perturbation theory, and that XCC(3) is equivalent to
a linearized CCD treatment (see Sec. 6.4).
It is also possible to define a similar functional-based method using the anti-Hermitian
operator $T - T^\dagger$ (whose exponential is unitary) instead of $T$ [16]. The commutator expansion does not terminate
in this case, either, and again a truncation based on orders of perturbation theory is
used. The resulting unitary coupled-cluster (UCC) methods are equivalent to the XCC
methods in second and in third order. However, UCC(4) is not equivalent to XCC(4),
although they are closely related, and UCC(4) is argued to be preferable to XCC(4),
since several terms are treated more "symmetrically" in the former. Similarly, UCC(5)
and XCC(5) are not equivalent; the former is recommended over the latter. These
approaches have been little used outside Bartlett's group, but they can provide a
convenient tool for analyzing different contributions to the CC equations. They have
been used [17] to derive fairly economical methods for including higher excitations,
for example.

2.7 Summary

We have seen that the use of the exponential ansatz provides us with a means of
performing calculations that are rigorously size-extensive. In practice, of course, some
truncation of the excitation operator expansion will be required, and we shall proceed
to discuss next the restriction to single and double excitations only. We may note,
here, however, that size-extensive results will be obtained not only if we truncate T
by excitation level, but also if we eliminate individual terms from the coupled-cluster
equations. We shall see later how simpler treatments can be developed using this
tactic.
Chapter 3

The CCSD model

3.1 The CCSD equations

Given that it will be necessary to truncate $T$ to obtain a practical coupled-cluster
approach, what truncation is appropriate? Obviously, truncation at $T_1$ is not helpful,
since this would not include any dynamical correlation effects at all, and for Hartree-
Fock orbitals the single excitations would not interact with the reference function
anyway. The simplest truncation is to $T_2$ [10], which certainly accounts for most of
the dynamical correlation. However, property values are often influenced significantly
by single excitations, and experience with CI calculations suggests that it is preferable
to include single excitations. The CCSD model is given by the truncation $T \approx T_1 + T_2$.
We shall first present the equations for this method in operator form, and then discuss
some of the computational aspects. Note that we will not give the explicit form of the
CCD or CCSD equations, broken down all the way to integrals and amplitudes, in
this work. These equations have been given several times in the literature, and since
we devote no effort here to discussing each individual contribution it seems pointless
to take up space by listing them. Appropriate references are given in the text.
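Using the compact notation introduced in Sec. 2.5, we have

$$D_1 T_1 = W T_1 + W T_2 + W T_1 T_2 + \frac{1}{2} W T_1^2 + \frac{1}{3!} W T_1^3 \tag{3.1}$$

and

$$\begin{aligned}
D_2 T_2 ={}& W + W T_1 + W T_2 + \frac{1}{2} W T_2^2 + W T_1 T_2 \\
& + \frac{1}{2} W T_1^2 T_2 + \frac{1}{2} W T_1^2 + \frac{1}{3!} W T_1^3 + \frac{1}{4!} W T_1^4,
\end{aligned} \tag{3.2}$$

together with the CCSD correlation energy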

$$\varepsilon = \langle \Psi_0 | W T_2 + \frac{1}{2} W T_1^2 | \Psi_0 \rangle. \tag{3.3}$$

Recall that here and in what follows we restrict ourselves to connected contributions
only in the expansion of $W \exp(T_1 + T_2)$. The coupled quartic equations 3.1 and 3.2

are solved for the singles and doubles amplitudes, and the correlation energy can then
be evaluated. We shall discuss the solution of the nonlinear equations below. For the
moment, we will expand the equations somewhat, explicitly performing the operation
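on $\Psi_0$ on the right and the projection on the singly and doubly excited states on the
left. This gives

$$\begin{aligned}
(\epsilon_I - \epsilon_A)\, t_I^A ={}& \langle \Psi_I^A | W T_1 | \Psi_0 \rangle + \langle \Psi_I^A | W T_2 | \Psi_0 \rangle + \langle \Psi_I^A | W T_1 T_2 | \Psi_0 \rangle \\
& + \langle \Psi_I^A | \tfrac{1}{2} W T_1^2 | \Psi_0 \rangle + \langle \Psi_I^A | \tfrac{1}{3!} W T_1^3 | \Psi_0 \rangle,
\end{aligned} \tag{3.4}$$

and

$$\begin{aligned}
(\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B)\, t_{IJ}^{AB} ={}& \langle \Psi_{IJ}^{AB} | W | \Psi_0 \rangle + \langle \Psi_{IJ}^{AB} | W T_1 | \Psi_0 \rangle + \langle \Psi_{IJ}^{AB} | W T_2 | \Psi_0 \rangle \\
& + \langle \Psi_{IJ}^{AB} | \tfrac{1}{2} W T_2^2 | \Psi_0 \rangle + \langle \Psi_{IJ}^{AB} | W T_1 T_2 | \Psi_0 \rangle \\
& + \langle \Psi_{IJ}^{AB} | \tfrac{1}{2} W T_1^2 T_2 | \Psi_0 \rangle + \langle \Psi_{IJ}^{AB} | \tfrac{1}{2} W T_1^2 | \Psi_0 \rangle \\
& + \langle \Psi_{IJ}^{AB} | \tfrac{1}{3!} W T_1^3 | \Psi_0 \rangle + \langle \Psi_{IJ}^{AB} | \tfrac{1}{4!} W T_1^4 | \Psi_0 \rangle.
\end{aligned} \tag{3.5}$$

In order to arrive at a set of equations that can be programmed, we would need to
expand all the matrix elements that arise. For example, we can expand the third
term on the RHS of Eq. 3.5 as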

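$$\langle \Psi_{IJ}^{AB} | W T_2 | \Psi_0 \rangle = \sum_{K>L} \sum_{C>D} \langle \Psi_{IJ}^{AB} | W | \Psi_{KL}^{CD} \rangle t_{KL}^{CD}. \tag{3.6}$$

From the Slater-Condon rules (or from explicit consideration of the second
quantized form of $W$) we can see that the indices $K, L, C, D$ cannot all be different
from $I, J, A, B$. At least two of the former must coincide with the latter, giving

$$\begin{aligned}
\langle \Psi_{IJ}^{AB} | W T_2 | \Psi_0 \rangle ={}& \sum_{C>D} \langle AB \| CD \rangle t_{IJ}^{CD} + \sum_{K>L} \langle IJ \| KL \rangle t_{KL}^{AB} \\
& + \sum_K \sum_C \bigl\{ \langle KA \| JC \rangle t_{IK}^{BC} - \langle KB \| JC \rangle t_{IK}^{AC} \\
& \qquad - \langle KA \| IC \rangle t_{JK}^{BC} + \langle KB \| IC \rangle t_{JK}^{AC} \bigr\}.
\end{aligned} \tag{3.7}$$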

(Recall here once again that we are assuming canonical Hartree-Fock orbitals.) The
other terms in Eqs. 3.4 and 3.5 can be expanded in like manner, giving us a complete
set of equations for determining the resulting amplitudes. The full equations can be
found, for example, in Purvis and Bartlett [18]. Instead of enumerating all the terms
here, we will turn to an important issue that we have largely ignored so far: the spin
symmetry of the problem.

3.2 Closed-shell systems

Suppose that we are concerned with a system whose Hartree-Fock wave function is
a closed shell (all orbitals maximally occupied). If we wish to perform, for example,
a CI calculation, we can use simple spin-orbital excitations just as we have done so
far. However, if the final wave function is to be a totally symmetric singlet state,
the coefficients of the excited determinants will not be linearly independent. The
simplest case would be the single excitations: denoting beta spin by a bar, and using
lower-case letters to denote orbitals, we can see that for $\Psi_i^a$ and $\Psi_{\bar{i}}^{\bar{a}}$ we must
have $c_i^a = c_{\bar{i}}^{\bar{a}}$. Obviously, we can reduce the number of singles coefficients by a factor
of two by defining configurations that are spin eigenfunctions,

\[
\Phi_i^a = \frac{1}{\sqrt{2}}\left(\Psi_i^a + \Psi_{\bar{i}}^{\bar{a}}\right), \tag{3.8}
\]

which would appear with a coefficient $d_i^a$, say.


Similar relationships hold at the level of higher excitations. For example, a
double excitation from a closed-shell determinant can produce at most four singly
occupied orbitals. These give rise to at most two spin couplings that yield a singlet
overall. Instead of the six doubly-excited determinants

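\[
\Psi_{ij}^{ab},\quad \Psi_{\bar{i}\bar{j}}^{\bar{a}\bar{b}},\quad \Psi_{i\bar{j}}^{a\bar{b}},\quad \Psi_{i\bar{j}}^{\bar{a}b},\quad \Psi_{\bar{i}j}^{a\bar{b}},\quad \Psi_{\bar{i}j}^{\bar{a}b}, \tag{3.9}
\]

we can form, for example, the two singlet spin eigenfunctions

\[
{}^1\Phi_{ij}^{ab} = \frac{1}{2}\left[\Psi_{i\bar{j}}^{a\bar{b}} - \Psi_{i\bar{j}}^{\bar{a}b} - \Psi_{\bar{i}j}^{a\bar{b}} + \Psi_{\bar{i}j}^{\bar{a}b}\right] \tag{3.10}
\]

and

\[
{}^3\Phi_{ij}^{ab} = \frac{1}{\sqrt{12}}\left[2\Psi_{ij}^{ab} + \Psi_{i\bar{j}}^{a\bar{b}} + \Psi_{i\bar{j}}^{\bar{a}b} + \Psi_{\bar{i}j}^{a\bar{b}} + \Psi_{\bar{i}j}^{\bar{a}b} + 2\Psi_{\bar{i}\bar{j}}^{\bar{a}\bar{b}}\right]. \tag{3.11}
\]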

The spin eigenfunctions of Eqs. 3.10 and 3.11 are often referred to informally as
singlet- and triplet-coupled double excitations, respectively. More correctly, Eq. 3.10
involves two occupied orbitals coupled as a singlet, and two virtual orbitals coupled
similarly. In Eq. 3.11 the holes are coupled as a triplet, the virtuals are coupled as
a triplet, and the overall coupling is of course as a singlet. Obviously, these spin
couplings are not appropriate for cases of index coincidence. For i = j or a = b the
linearly dependent terms are eliminated and the configuration is renormalized, as for
the configuration $\Phi_{ii}^{aa} = \Psi_{i\bar{i}}^{a\bar{a}}$. The triplet-coupled function Eq. 3.11 vanishes for any
coincidences among indices.
Using these spin-adapted configurations, we will substantially reduce the length
of the correlating expansion. The number of double excitations will go down by a factor
of about three, for example, compared to the use of spin-orbital excitations. This
also leads to some extension of the definition of pairs, which hitherto have referred to
spin-orbital pairs. For instance, if i and j are different spatial orbitals, we can consider
terms like those of Eq. 3.10, for all a ≥ b, as constituting a correlating expansion for
the "singlet interorbital pair" ij. That is, the correlation of a singlet-coupled pair of
electrons in orbitals i and j is described by the expansion
(3.12)

We can similarly have triplet interorbital pairs, and intraorbital pairs ii that are
necessarily singlet-coupled. Provided the correlation treatment we are using involves
summations over both occupied and virtual manifolds, the result is rigorously inde-
pendent of whether spin-adapted configurations are used or not. The energy from a
CISD or CCSD calculation is thus unaffected by the spin coupling, but the compu-
tational effort will be significantly reduced.
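
The factor of about three quoted above can be checked by direct counting. The following Python sketch is an illustration only. It assumes that "spin-orbital excitations" refers to the doubly excited determinants with M_S = 0 generated from a closed-shell reference with n_occ doubly occupied and n_virt virtual spatial orbitals; the function names are ours and do not come from any program cited in the text.

```python
from math import comb


def count_ms0_double_determinants(n_occ, n_virt):
    """Doubly excited M_S = 0 determinants from a closed-shell reference."""
    # Both electrons alpha; the same number arises again for both beta.
    same_spin = comb(n_occ, 2) * comb(n_virt, 2)
    # One alpha and one beta excitation, chosen independently.
    opposite_spin = (n_occ * n_virt) ** 2
    return 2 * same_spin + opposite_spin


def count_singlet_double_csfs(n_occ, n_virt):
    """Singlet spin-adapted double excitations (the two couplings of Sec. 3.2)."""
    # Singlet-coupled pairs: i >= j and a >= b (index coincidences allowed).
    singlet_coupled = comb(n_occ + 1, 2) * comb(n_virt + 1, 2)
    # Triplet-coupled pairs: i > j and a > b (vanish for any coincidence).
    triplet_coupled = comb(n_occ, 2) * comb(n_virt, 2)
    return singlet_coupled + triplet_coupled


n_det = count_ms0_double_determinants(5, 20)
n_csf = count_singlet_double_csfs(5, 20)
print(n_det, n_csf, n_det / n_csf)
```

For this assumed example of 5 occupied and 20 virtual orbitals the counts are 13800 determinants against 5050 spin-adapted configurations, a ratio of about 2.7, and the ratio approaches three as the orbital spaces grow.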
We should note that the spin-couplings given in Eqs. 3.10 and 3.11 are to a
considerable extent arbitrary. Any unitary transformation that mixes the two couplings
will give a pair of orthonormal singlet spin eigenfunctions that are formally
acceptable. However, there may be computational advantages to using a particular
spin-adaptation scheme. The one described is that which is obtained by the earliest
attempts to exploit spin symmetry in CCD or CCSD calculations [4,13,19,20],
whether arrived at diagrammatically or algebraically. This spin-coupling was considered
most appropriate for early direct CI calculations [21] because it provided
maximum reduction in work. Nevertheless, recent developments have concentrated
on different, simpler schemes, as we now discuss.
The singlet and triplet pair functions provided considerable reduction in com-
putational work when they were first used in various electron-pair-based models (see
Ref. 4 and references therein). These early calculations predated vector supercom-
puters, and the main emphasis in obtaining computational efficiency was placed on
reducing the number of operations performed. But while spin-coupled pairs provide
the fewest independent double excitation amplitudes, they generate a number of side-
effects. First, as we have noted, configurations in which orbital indices coincide are
treated as special cases, although this can be avoided in part by renormalizing the
amplitudes of configurations with index coincidences. Second, the sets of amplitudes
are defined by restricted summations over a ≥ b or a > b for singlets and triplets,
respectively. That is, the independent amplitudes form triangular matrices for each
pair. These issues of index coincidences and triangular matrices are precisely those
that bedevil efficient vector implementation of algorithms, since they engender either
conditional constructs in loops, or squaring of triangular arrays with a concomitant
increase in the number of operations. We can finesse these issues by thinking again
about the spin coupling.
We define orbital excitation operators as

\[
E_{ai} = X_a^{\dagger} X_i + X_{\bar{a}}^{\dagger} X_{\bar{i}}, \tag{3.13}
\]

where we understand the excitation operator subscripts to refer to spatial orbitals,
while the labels on the creation and annihilation operators refer to orbitals multiplied
by either alpha or beta spin-functions, depending again on the presence or absence of a
bar over the indices. We then have, for example, for the case of no index coincidences,

\[
{}^1\Phi_{ij}^{ab} = \frac{1}{2}\left[E_{ai}E_{bj} + E_{bi}E_{aj}\right]\Psi_0 \tag{3.14}
\]

and

\[
{}^3\Phi_{ij}^{ab} = \frac{1}{\sqrt{12}}\left[E_{ai}E_{bj} - E_{bi}E_{aj}\right]\Psi_0, \tag{3.15}
\]
where the numerical factors yield normalized configurations. Evidently, an alternative
choice [22] would be to abandon the restriction a > b and to employ the simple
combinations

\[
\Xi_{ij}^{ab} = E_{ai}E_{bj}\Psi_0, \tag{3.16}
\]

together with an unrestricted range of all virtual pairs ab. These new configurations
are obviously related to the original singlet and triplet spin-coupled pairs. The
case a = b gives rise to a combination of determinants that are not linearly indepen-
dent among themselves, but this can be accommodated by renormalizing the associ-
ated amplitudes. We now have square matrices of nonredundant amplitudes to work
with, greatly improving the prospects of vectorization. We may also exploit another
simplification in choosing the basis onto which we project exp( -T)W exp(T). Namely,
instead of projecting on configurations (that is, combinations of determinants), we
can project on individual determinants. There can only be as many linearly indepen-
dent projections as we have independent amplitudes, that is, two possibilities. We
could use the determinants $\Psi_{i\bar{j}}^{a\bar{b}}$ and $\Psi_{i\bar{j}}^{\bar{a}b}$, for instance, instead of the basis $\Xi_{ij}^{ab}$, and
computational methods using both forms have been implemented [23-25].
All of these different tactics are associated with the aim of increasing the
computational efficiency of the calculation, especially for vector computers. This
also requires casting as many computational steps as possible in terms of matrix
multiplication, since the latter usually results in optimum performance on vector and
"superscalar" machines. A very clear and detailed discussion is given by Lee and
Rice [24]. These authors also analyze the computational dependence in terms of the
various "$N^6$" steps, that is, work that behaves as $N_o^3N_v^3$ or $N_o^2N_v^4$, where there are $N_o$
occupied orbitals involved in the correlation treatment, and $N_v$ virtual orbitals. (In
addition to work with an "$N^5$" dependence or less, there is another $N^6$ term, of the
form $N_o^4N_v^2$. The factor in front of this term is such that it contributes little to the
time of real calculations.) A good matrix-formulated CCSD program like that of
Ref. 24 can achieve more than 250 MFLOPS overall on a CRAY Y-MP computer,
and the work done in each iteration is no more than twice what would be required
for a CISD calculation [25].
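
To make the remark about matrix multiplication concrete, the sketch below evaluates one representative contraction, the sum over virtual pairs cd of an integral array v(ab,cd) with amplitudes t(ij,cd), first with explicit loops and then as a single matrix product over unrestricted (square) pair indices. It is a toy illustration under our own assumptions: the arrays hold random numbers rather than integrals or amplitudes, the index layout is chosen for convenience, and the hand-written matmul stands in for the optimized library routine a real program would call.

```python
import random


def random_tensor(dims, rng):
    """Nested lists of the given shape filled with toy values in [-1, 1]."""
    if len(dims) == 1:
        return [rng.uniform(-1.0, 1.0) for _ in range(dims[0])]
    return [random_tensor(dims[1:], rng) for _ in range(dims[0])]


def ladder_by_loops(v, t, n_occ, n_virt):
    """Z(i,j,a,b) = sum_cd v[a][b][c][d] * t[i][j][c][d], with explicit loops."""
    z = {}
    for i in range(n_occ):
        for j in range(n_occ):
            for a in range(n_virt):
                for b in range(n_virt):
                    z[i, j, a, b] = sum(
                        v[a][b][c][d] * t[i][j][c][d]
                        for c in range(n_virt)
                        for d in range(n_virt)
                    )
    return z


def matmul(x, y):
    """Plain matrix product; a stand-in for an optimized library routine."""
    n_inner, n_cols = len(y), len(y[0])
    return [
        [sum(row[k] * y[k][col] for k in range(n_inner)) for col in range(n_cols)]
        for row in x
    ]


def ladder_by_matmul(v, t, n_occ, n_virt):
    """Same contraction as one (ij) x (cd) times (cd) x (ab) matrix product."""
    occ_pairs = [(i, j) for i in range(n_occ) for j in range(n_occ)]
    virt_pairs = [(a, b) for a in range(n_virt) for b in range(n_virt)]
    t_mat = [[t[i][j][c][d] for (c, d) in virt_pairs] for (i, j) in occ_pairs]
    v_mat = [[v[a][b][c][d] for (a, b) in virt_pairs] for (c, d) in virt_pairs]
    z_mat = matmul(t_mat, v_mat)
    return {
        (i, j, a, b): z_mat[row][col]
        for row, (i, j) in enumerate(occ_pairs)
        for col, (a, b) in enumerate(virt_pairs)
    }


rng = random.Random(0)
n_occ, n_virt = 2, 3
v = random_tensor((n_virt,) * 4, rng)
t = random_tensor((n_occ, n_occ, n_virt, n_virt), rng)
z_loops = ladder_by_loops(v, t, n_occ, n_virt)
z_mm = ladder_by_matmul(v, t, n_occ, n_virt)
print(max(abs(z_loops[key] - z_mm[key]) for key in z_loops))
```

Both routes perform the same number of multiplications, which grows as the square of the number of occupied orbitals times the fourth power of the number of virtual orbitals. The gain from the second form comes only from handing square, conditional-free arrays to a fast matrix-multiplication kernel.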
We should make one final observation here about the CCSD equations pre-
sented in the literature. It is very common for it to be assumed that canonical
Hartree-Fock orbitals are used; the Fock matrix is assumed to be diagonal. The
CCSD method itself is independent of whether canonical orbitals or some other choice
of Hartree-Fock orbitals, like localized orbitals, is used. However, if another choice
is used, the expressions programmed must include the necessary nondiagonal ele-
ments of the Fock matrix. Programs based on expressions that contain only diagonal
elements of the Fock matrix are correct only for canonical Hartree-Fock orbitals.

3.3 Solution of the CCSD equations

The efficient CCSD implementations we have alluded to allow us to evaluate the LHS
of Eq. 3.5, for a given estimate of the amplitudes, as rapidly as possible. Of course,
for any given estimate of the amplitudes, this expression will not evaluate to zero,
and we would then like to use the available information to improve the estimates of
the amplitudes, ultimately converging on the CCSD solution. One way to proceed is
suggested immediately by the perturbation theory analysis of Sec. 2.5. We assume
we have an estimate of the amplitudes t and wish to determine a correction $\delta t$ to
them. Perturbation theory [2] would yield

\[
(e_i - e_a)\,\delta t_i^a = G(i,a) \tag{3.17}
\]

and

\[
(e_i + e_j - e_a - e_b)\,\delta t_{ij}^{ab} = G(i,j,a,b) \tag{3.18}
\]

in lowest order. Here the arrays G represent the RHS of the amplitude equations
evaluated with the current estimate of the amplitudes t. The amplitude corrections
are obtained by dividing through by the energy denominators. We can proceed
iteratively, defining $t^{[n+1]} = t^{[n]} + \delta t^{[n]}$, where we have used n to index the iterations
and

\[
\delta t_{ij}^{ab\,[n]} = \frac{G^{[n]}(i,j,a,b)}{e_i + e_j - e_a - e_b} \tag{3.19}
\]

for the doubles equation, for example. Eventually, assuming the procedure converges,
the arrays $G^{[n]}$ will tend to zero. However, while this strategy may be useful for
obtaining low-order perturbation energies (through fourth order, say), the conver-
gence of such an approach is likely to be unacceptably slow. An analogy with the
direct CI method is appropriate - this was originally formulated using perturbation
theory to extract the lowest eigenvalue of the Hamiltonian, but for practical use a
variation-perturbation scheme (or a Davidson-type iterative scheme) is required [21].
For eigenvalue problems and for linear equations a combination of iterative methods,
to generate trial vectors, and a direct solution of the problem projected onto the
space of these vectors, has proved very successful. Two related schemes have been
used with success for solving the CCD and CCSD equations. The basic idea is that
rather than simply iterating, once a certain number of iterated amplitude vectors are
available they are used to form a new guess at the solution. For instance, assume we
have m sets of amplitude vectors $t^{[n]}$ and "residual vectors" $G^{[n]}$, where n runs from
one to m. We wish to represent the solution to the CC equations as well as possible as
the linear combination $\sum_n t^{[n]} c_n$. Using Pulay's DIIS (direct inversion in the iterative
subspace) approach to nonlinear optimization, we can regard the optimum coefficients
as those that minimize the residuals (in a mean-squared sense). That is, we determine
those $c_n$ that minimize the squared norm $|\sum_n c_n G^{[n]}|^2$, subject to the condition
that $\sum_n c_n = 1$. This requires solution of a system of m + 1 equations, which requires
trivial computational effort compared to constructing the residual vectors themselves.
Given the optimum coefficient values, a new amplitude vector can be constructed and
(assuming the residuals are not zero to within the desired threshold) more iterations
can be performed.
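
The DIIS step just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any program cited here: amplitude and residual vectors are plain lists, the constraint is imposed with a single Lagrange multiplier (which is what gives the m + 1 equations), and the small linear system is solved by a hand-written Gaussian elimination that assumes the augmented matrix is nonsingular. All function names are ours.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def diis_coefficients(residuals):
    """Coefficients c_n minimizing |sum_n c_n G_n|^2 subject to sum_n c_n = 1."""
    m = len(residuals)
    # Overlap matrix of the residual vectors, B[p][q] = G_p . G_q.
    b = [[sum(x * y for x, y in zip(gp, gq)) for gq in residuals] for gp in residuals]
    # Border B with ones: the extra row and column carry the Lagrange multiplier.
    matrix = [b[p] + [1.0] for p in range(m)] + [[1.0] * m + [0.0]]
    rhs = [0.0] * m + [1.0]
    return solve_linear(matrix, rhs)[:m]


def diis_extrapolate(amplitudes, residuals):
    """New amplitude guess: the same linear combination applied to the t vectors."""
    coeffs = diis_coefficients(residuals)
    length = len(amplitudes[0])
    return [sum(c * t[k] for c, t in zip(coeffs, amplitudes)) for k in range(length)]


# Toy one-amplitude problem with residual G(t) = 2 - t, whose solution is t = 2.
amplitudes = [[0.0], [1.0]]
residuals = [[2.0], [1.0]]
coeffs = diis_coefficients(residuals)
t_new = diis_extrapolate(amplitudes, residuals)
print(coeffs, t_new)
```

Because the toy residual is linear in the amplitude, two stored vectors are enough for the extrapolation to land on the root t = 2. For the nonlinear CC equations the extrapolated vector is only an improved guess, and iterations continue from it.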
Purvis and Bartlett [26] devised a somewhat different approach to solving the
CC equations, but again it involves performing a number of simple iterations and
then solving a small-dimension problem. The latter is a linearized problem in their
approach, which they term a combination of "Jacobi iterations" (the simple iteration
steps) plus a "reduced linear equation" (RLE) step (the small dimension problem).
Typically, these authors use the RLE step every five Jacobi iterations, although it
might be used more frequently in difficult cases.
The author's experience is that DIIS-based methods show superior convergence
to the RLE/Jacobi scheme, although it should be said that this has generally involved
calculations in which the convergence criteria (e.g., changes in amplitudes from iter-
ation to iteration, or norms of residual vectors) are very demanding. Thresholds that
effectively mean convergence of the energy to $10^{-10}\,E_h$ require about 25 evaluations of
residual vectors using the DIIS approach, and possibly 30 or more with RLE. Conver-
gence of the energy with DIIS tends to be monotonic from above (although of course
there is no variation principle to guarantee this), while with the RLE nonmonotonic
behaviour can be observed. For comparison, a CISD calculation would probably re-
quire about 16 to 20 Davidson-type iterations to converge the total energy to a similar
threshold. If less precision is required, fewer iterations are required, and the difference
between the convergence rates of DIIS and RLE seems to decrease.

3.4 Reliability of closed-shell CCSD: a diagnostic

As we have demonstrated, the CCSD method provides an exact solution in the case
of noninteracting two-electron systems, but it is obviously not exact for real, many-
electron molecules. Nevertheless, we may expect that where electron correlation is
dominated by pair effects, CCSD should yield good results. This is quantified
elsewhere at this school; here we mention only that the CCSD model should recover
90-95% of the exact correlation energy (i.e., the full CI limit in a given basis) in
situations where the closed-shell Hartree-Fock configuration dominates the wave function.
In other words, where nondynamical correlation, like near-degeneracy effects or other
failures of the Hartree-Fock approximation, is absent, CCSD should perform well.
The remaining correlation effects, due to higher connected excitations, should be
small and can be estimated very accurately using some of the approximate methods
described in the next chapter. However, it would be well to have a means to monitor
our calculations to determine when nondynamical correlation may become important.
In this way we will not be led astray by calculations whose results look plausible but
which are not well-founded.
Let us consider first some of the ways in which nondynamical correlation can
manifest itself. For example, breaking the bond in the molecule N2 is easily shown
to require up to six-fold excitations from the Hartree-Fock configuration, and these
multiple excitations are not well approximated as disconnected products of doubles.
Hurley [4] gives a detailed discussion of an analogous case: breaking the C-C bond in
C2H4. Ultimately, such bond-breaking situations lead to a number of doubly excited
configurations having similar weight in the wave function to the Hartree-Fock config-
uration, and certainly one useful measure of the onset of nondynamical correlation is
the appearance of large amplitudes for excited configurations. Examining the largest
of the final amplitudes can thus provide some insight. Unfortunately, small values for
the double excitation amplitudes are only a necessary, not a sufficient, condition for
the absence of strong near-degeneracy effects. Another complication arises from the
use of Hartree-Fock orbitals. The orbitals are optimized for a single configuration -
in cases where other configurations would be important if they were admitted, the
orbitals may describe the one configuration that was used much better than the other
configurations that should have been considered. This "orbital bias" will raise the
energy of the excited configurations, and thus will reduce the contribution they make
to the correlated wave function. In fact, this effect is much more pronounced in CISD
calculations than in CCSD calculations, but it illustrates that small amplitudes do
not in themselves always mean that nondynamical correlation is small.
Fortunately, we can exploit another consequence of orbital bias to probe non-
dynamical correlation. If the orbitals are very unsuitable for describing the excited
configurations in the wave function expansion, we can expect that orbital relaxation
effects will assume a new importance, since by relaxing the orbitals we can reduce the
orbital bias. Orbital relaxation is governed, in first order, by single excitations. If we
perceive a large contribution from single excitations, we can infer that the Hartree-
Fock orbitals are not well-suited for describing correlation in the system, and that
nondynamical correlation is becoming important. Following earlier work by Lee and
co-workers [27], Lee and Taylor [28] suggested the use of the following quantity,

\[
\mathcal{T}_1 = \frac{\|t_1\|}{\sqrt{N}}, \tag{3.20}
\]

the $\mathcal{T}_1$ diagnostic, as an indicator of nondynamical correlation. Here $t_1$ is the vector
of single excitation amplitudes and N is the number of electrons. On the basis of
empirical comparison, for a variety of systems, Lee and Taylor suggested that $\mathcal{T}_1$ values
larger than 0.02 indicated an increasing importance of nondynamical correlation and
an increasing unreliability of CCSD.
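
As a small worked illustration of Eq. 3.20, the Python sketch below evaluates the diagnostic for a made-up amplitude vector. The amplitude values are invented for the example, the function and constant names are ours, and the 0.02 threshold is simply the empirical closed-shell value quoted above.

```python
from math import sqrt

# Empirical closed-shell threshold suggested by Lee and Taylor (see text).
T1_THRESHOLD = 0.02


def t1_diagnostic(t1_amplitudes, n_electrons):
    """Euclidean norm of the singles amplitudes divided by sqrt(N), Eq. 3.20."""
    norm = sqrt(sum(t * t for t in t1_amplitudes))
    return norm / sqrt(n_electrons)


# Hypothetical singles amplitudes for a four-electron problem.
t1 = [0.03, -0.04]
value = t1_diagnostic(t1, 4)
print(value, value > T1_THRESHOLD)

# Three noninteracting copies of the same system: three times the amplitudes
# and three times the electrons leave the diagnostic unchanged.
value_three_copies = t1_diagnostic(t1 * 3, 12)
print(value_three_copies)
```

The second call shows why the division by the square root of N matters: the unscaled norm grows with the number of copies, while the scaled quantity stays at 0.025.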
It is important to note several points in connection with the $\mathcal{T}_1$ diagnostic.
First, it was not suggested as an alternative to examining individual amplitudes,
but as a complement to it, an aim that has been perversely misunderstood and
misreported in the literature. Second, it is crucial to "normalize" such a quantity
so that the final value is independent of the number of electrons. Examining unscaled
norms of excitation amplitudes, or norms of perturbed wave functions, is not useful in
this sense. It should be obvious for the case of noninteracting two-electron systems,
for example, that as the number of systems increases the norms of the individual
perturbed wave functions increase, since the number of contributions depends on the
number of electrons. Yet CCSD remains exact for this case, and (assuming each two-
electron system is well described at the Hartree-Fock level) there are no nondynamical
correlation effects. Thus we must remove any dependence on the number of electrons
before comparing norm-derived quantities.
What should one do when $\mathcal{T}_1$ is larger than 0.02? Within the CC framework,
it will be necessary to include higher excitations, procedures for which are discussed
in Chapter 4. Even the simplest (reliable) corrections for connected triple excitations
increase the radius of convergence of the method substantially. No threshold value has
been derived, but the author has seen good agreement between such treatments and
MCSCF/MRCI methods when $\mathcal{T}_1$ has been as large as 0.04. The alternative is to turn
to methods more suited to open-shell systems and to the treatment of nondynamical
correlation: such methods are discussed in Chapter 5. It is appropriate to point out
here that the threshold of 0.02 was derived for closed-shell CCSD. In particular, when
a UHF reference function is used, an analogous formula for $\mathcal{T}_1$ routinely yields much
larger values than are seen in the closed-shell case. Jayatilaka and Lee [29] suggest a
modified diagnostic for UHF-based CCSD.

3.5 Historical perspective

The development of the CC equations, and then practical formulation of the CCD and
CCSD equations for realistic calculations, has involved a variety of research groups.
It may be of interest to the reader to summarize some of the history. We concentrate
here on full CC methods - approximate treatments and their genealogy are described
in Chapter 6.
The foundations of the entire CC approach, at least in the context of quantum
chemistry, were laid by Čížek [10], although his original diagrammatic presentation
of the CCD equations demands a mathematical sophistication of the reader that
few practicing quantum chemists possess and few can devote the effort to acquiring.
His later review article [13] expands on several aspects of CC theory, and also
shows how diagrammatic methods can be used to generate a spin-adapted form of the
equations. Later, Čížek and Paldus [19] rederived the "coupled-pair many-electron
theory" (CPMET) equations, as they referred to what we now term CCD, in terms
of determinants and algebraic expressions, although there is not much evidence that
their reformulation was found more palatable. In collaboration with Shavitt, Paldus
and Čížek [30] had also performed some all-electron calculations on the model system
BH3 (their earlier calculations had used relatively crude models like the π-electron
Hamiltonian and PPP approximations). For BH3 they could compare with full CI
calculations; in addition to CPMET they also included the effects of single excitations
and, in an approximate way, connected triple excitations. The agreement with full
CI was very good, but since a minimal basis set was used only a very small fraction
of the correlation energy was recovered.
In the mid-1970s there was a substantial increase in interest in CC models.
Hurley [4] derived the spin-orbital CCD equations using purely determinantal methods
in a form which seems much simpler than that of Paldus and Čížek. He also gave
the spin-orbital equations in terms of the nonorthogonal "pair-natural orbitals" that
were then very popular as a way of substantially reducing the length of CI expansions.
Taylor and co-workers [20] gave expressions for the spin-adapted CCD equations in
terms of spin-coupled pairs and pair-natural orbitals. Harris [11] gave a rather general
derivation of CC-based methods for estimating excitation energies; Monkhorst [31]
presented an elaborate response theory for CC molecular properties. Finally, Paldus
and co-workers rederived spin-adapted CCD equations [32], and gave expressions for
excitation and ionization energies from CC ground-state wave functions [33].
At the same time, large-scale (or, at least, fairly realistic) CC calculations
were being performed with a variety of different computer implementations. Taylor
and co-workers used their spin-adapted pair-natural orbital formulation [34,35], Pople
and co-workers [36] implemented the spin-orbital CCD equations exactly as given by
Hurley [4], while Bartlett and Purvis [2] proceeded from a diagrammatic approach and
MBPT to spin-orbital CCD equations (apparently [18] with some reduction in effort
for closed-shell systems, compared to the UHF-based case, but not the full economy
of a spin-adapted method). Saunders and co-workers [37] developed a spin-adapted
CCD code based on Meyer's "self-consistent electron pairs" [38] formulation of CID.
After this burst of effort, the field became rather quiet in the early 1980s. The
main achievement in the computational arena was the derivation and implementation
of a practical CCSD method by Purvis and Bartlett [18]; Chiles and Dykstra [39] also
developed a CCD code based on self-consistent electron pairs. Bartlett and co-workers
also began to look at aspects of including higher than double excitations in CC treat-
ments [40]. In the latter part of the 1980s, however, there was an explosion of interest
in CC methods. Schaefer's group in Berkeley developed a rather efficient formulation
of the CCSD equations [23], and in subsequent work Lee and Rice [24], and Scuseria,
Janssen and Schaefer [25] described particularly efficient computer codes for CCSD
calculations. Contemporaneously, Bartlett and co-workers and Raghavachari and co-
workers had begun to look at methods for including the effects of higher than double
connected excitations, as described in the next chapter. As a result of the availability
of efficient CC codes like TITAN [41] and ACES II [42], and the high accuracy achievable
with coupled-cluster methods (described fully elsewhere at this school), CC methods
have moved from fringe to mainstream quantum chemistry in less than ten years.
Chapter 4

Higher excitations

4.1 Deficiencies in the CCSD model

We shall see elsewhere at this school that the CCSD approach provides results of
semiquantitative accuracy for a variety of molecules, at least in the absence of serious
nondynamical correlation effects. Indeed, there are good formal reasons to support
the view that CCSD is the most complete treatment of electron correlation within the
domain of methods that include at most connected double excitations. Nevertheless,
the CCSD method is not complete for many-electron systems (except for the case of
noninteracting two-electron systems), and it is reasonable to ask how the exponential
ansatz converges with respect to excitation level, and what methods can be used to
include higher connected excitations.
From the point of view of introducing the least additional complication, the ob-
vious step beyond CCSD is the inclusion in some form of connected triple excitations.
More support for this view can be obtained from perturbation theory: as we have
seen, for Hartree-Fock-based perturbation theory only connected doubles contribute
to the first-order wave function, and thus to the second- and third-order energies.
The fourth-order energy includes contributions also from single excitations and dis-
connected quadruples, contributions that are included in the CCSD approach. In ad-
dition, however, the fourth-order energy includes contributions from connected triple
excitations: the full Møller-Plesset fourth-order treatment is denoted MP4(SDTQ).
In a perturbational sense, therefore, CCSD is in error in fourth order, although of
course many contributions are included to infinite order. We thus anticipate that any
attempt to improve on the CCSD approach should concentrate first on connected
triple excitations.

4.2 The CCSDT model

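Probably the most obvious strategy for improving the CCSD approach is to truncate T
in the exponential ansatz at a later stage. By including all terms in $T_1$, $T_2$, and $T_3$, for
example, we would have a CCSDT method [40,43]. In terms of the operator formulation
of the CC equations, the defining equations for the $T_1$, $T_2$, and $T_3$ amplitudes
would respectively be

\[
D_1T_1 = WT_1 + WT_2 + WT_1T_2 + \tfrac{1}{2}WT_1^2 + \tfrac{1}{3!}WT_1^3 + WT_3, \tag{4.1}
\]

\[
\begin{aligned}
D_2T_2 ={}& W + WT_1 + WT_2 + \tfrac{1}{2}WT_2^2 + WT_1T_2 + \tfrac{1}{2}WT_1^2T_2 \\
&+ \tfrac{1}{2}WT_1^2 + \tfrac{1}{3!}WT_1^3 + \tfrac{1}{4!}WT_1^4 + WT_3 + WT_1T_3,
\end{aligned}
\tag{4.2}
\]

and

\[
\begin{aligned}
D_3T_3 ={}& WT_2 + WT_3 + \tfrac{1}{2}WT_2^2 + WT_1T_2 + WT_1T_3 \\
&+ WT_2T_3 + \tfrac{1}{2}WT_1^2T_2 + \tfrac{1}{2}WT_1T_2^2 + \tfrac{1}{2}WT_1^2T_3 + \tfrac{1}{3!}WT_1^3T_2.
\end{aligned}
\tag{4.3}
\]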

We may observe that this is a significantly more difficult problem than CCSD [14,44].
The dimension of the CC equations is much larger, since the number of (connected)
triples scales as $N_o^3N_v^3$. Solving the CC equations requires an overall computational
effort proportional to $N_o^3N_v^5$ and $N_o^4N_v^4$, which is commonly referred to as an "$N^8$"
dependence. And since iterative methods must be used, this effort is required in each
iteration. (We may note in passing, however, that the nonlinearity in the CCSDT
equations is no different from the CCSD equations: all CC equation systems are at
worst quartic in the unknown amplitudes.) Thus the CCSDT method is generally
too expensive for use in production calculations, although it has been used very
successfully to calibrate other methods for treating the triples contribution [14,45,46].
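
A rough feel for these numbers can be had from a few lines of Python. The sketch below is only a scaling model under our own assumptions: it takes the triples count and the two leading work terms as products of powers of the occupied and virtual orbital counts (cubes for the amplitudes, and third-by-fifth and fourth-by-fourth powers for the work), and it ignores all prefactors, symmetry savings, and lower-order terms. The orbital counts in the example are invented.

```python
def ccsdt_cost_model(n_occ, n_virt):
    """Leading-order size estimates for CCSDT, prefactors omitted."""
    return {
        "triples_amplitudes": n_occ**3 * n_virt**3,
        "work_o3v5": n_occ**3 * n_virt**5,
        "work_o4v4": n_occ**4 * n_virt**4,
    }


small = ccsdt_cost_model(5, 50)
large = ccsdt_cost_model(10, 100)  # both orbital spaces doubled

for key in small:
    print(key, small[key], large[key], large[key] // small[key])
```

Doubling both orbital spaces multiplies the amplitude count by 64 and each work term by 256, which is the eighth-power dependence referred to above, and that cost is paid again in every iteration.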
Within the general spirit of the CCSDT equations (that is, iteratively solving
for the amplitudes of single, double and triple excitations), Bartlett and co-workers
have derived a hierarchy of methods by successively approximating terms (see, e.g.,
Ref. 14 and references therein). These methods are denoted CCSDT-n, where higher
values of n indicate fewer approximations to the CCSDT equations. Specifically, the
CCSDT-4 method results from dropping nonlinear terms involving T3 on the RHS of
Eq. 4.3. Dropping all terms in T3 from the RHS of this equation yields the CCSDT-3
method. The CCSDT-2 method is obtained by dropping all terms in $T_1$ and $T_3$ from
the RHS. If the nonlinear $T_2^2$ terms are also neglected, we have the CCSDT-1b method.
Note that in all approximations so far, the RHS of Eqs. 4.1 and 4.2 have been left
unaltered. The simplest approximate CCSDT method, denoted CCSDT-1a, results
from CCSDT-1b by dropping the nonlinear term $T_1T_3$ from the RHS of Eq. 4.2. All of
these methods are iterative, and, except for CCSDT-4, which remains an $N^8$ method,
all behave as $N^7$. They are thus all rather expensive in practice, and have not been
used very much. In fact, comparisons show that they gain little relative to the simpler
treatments we are about to examine, but require more computational effort.

4.3 Perturbational approaches to triple excitations

An alternative strategy for handling the triple excitations is provided by the close
connections between CC and perturbation theory that we have already discussed.
For example, a very simple approach would be based on the view that to make CC
"correct through fourth order" (in the sense of perturbation theory), we could simply
compute the fourth-order Møller-Plesset perturbation theory contribution from
connected triples, and add it to the CCSD result [47]. Such a procedure would be
denoted CCSD+T(4), with the obvious indication of the fourth-order origin of the
triples contribution. The T(4) contribution can be computed with an effort that behaves
as $N^7$, and involves no iteration, of course. Let us examine this fourth-order
contribution more closely. In terms of spin-orbitals, it is given by

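\[
T(4) = \sum_{I>J>K}\ \sum_{A>B>C} \left(D_{IJK}^{ABC}\right)^{-1} \left|W_{IJK}^{ABC}\right|^2, \tag{4.4}
\]

where

\[
D_{IJK}^{ABC} = e_I + e_J + e_K - e_A - e_B - e_C \tag{4.5}
\]

and

\[
W_{IJK}^{ABC} = \tfrac{1}{4}\,\mathcal{P}_{IJK}^{ABC}\Big\{\sum_{D} \langle BC\|DK\rangle\, t_{IJ}^{AD(1)} - \sum_{L} \langle LC\|JK\rangle\, t_{IL}^{AB(1)}\Big\}, \tag{4.6}
\]

with

\[
\mathcal{P}_{IJK}^{ABC} = \mathcal{P}_{IJK}\,\mathcal{P}^{ABC}, \tag{4.7}
\]

where, for example,

\[
\mathcal{P}_{IJK}\,IJK = IJK - IKJ + JKI - JIK + KIJ - KJI. \tag{4.8}
\]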

In Eq. 4.6 we have explicitly used the first-order estimates of the doubles amplitudes
$t_{IJ}^{AB(1)}$, which are given by Eq. 2.55, rather than writing the energy contribution
down as a product of integrals divided by orbital energies. The advantage of this
form is that it immediately suggests an improvement to the correction T(4): instead
of using the first-order amplitudes, why not use the converged CC doubles amplitudes
[48]? These should better reflect the exact values of the doubles amplitudes,
since CC includes terms to infinite order. Thus a correction T(CCSD) can be defined
by using Eqs. 4.4 and 4.6, but using the converged CCSD amplitudes $t_{IJ}^{AB}$. The
approach so obtained is termed CCSD+T(CCSD). It has received some use, but an-
other improvement is still possible, without significantly increasing the computational
effort.
The final improvement to these perturbational triples estimates is the observation
that there is a term in fifth order, involving singles amplitudes, of the form [49]

\[
\sum_{I>J>K}\ \sum_{A>B>C} \left(D_{IJK}^{ABC}\right)^{-1} V_{IJK}^{ABC}\, W_{IJK}^{ABC}, \tag{4.9}
\]

where
(4.10)
The use of the combined corrections of Eqs. 4.4 and 4.9 gives the method denoted
CCSD(T). This is probably the most commonly used triples correction, and is em-
pirically observed to be the best behaved. The justification for its use is largely one
of experience - one could legitimately argue that there are many fifth-order energy
contributions that are not being included, so why include just Eq. 4.9? The practical
answer is related to the behaviour of the CCSD method itself, and, in particular, to
the importance of single excitation amplitudes (cf. our discussion of the $\mathcal{T}_1$ diagnostic
in Sec. 3.4) as indicators of when nondynamical correlation effects become large. In
such situations (i.e., $\mathcal{T}_1$ large), the fifth-order contribution of Eq. 4.9 will be large.
Experience, again, shows that this term is usually positive (although there is no formal
reason why this must be so), and hence acts to "damp" the effects of T(CCSD).
Since the latter can become a serious overestimate (as T(4) itself would be) in such
cases, the damping effect is very helpful. As we shall see, it is probably fair to re-
gard CCSD(T) as the best single-reference correlation treatment that is inexpensive
enough to be widely applicable. We shall have more to say about CCSD(T) elsewhere
at this school; as we noted above, several extensive comparisons of different
approximate methods for including triple excitations may be found in the literature.
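
The relationship between the noniterative corrections of this section can be summarized in a short Python sketch. It is schematic and rests on several assumptions of ours: the quantities W, V and D for each unique triple excitation are taken as already computed and stored in dictionaries keyed by a hypothetical index tuple (I, J, K, A, B, C), the W values are real, and the denominators are occupied minus virtual orbital energies and hence negative. The numbers are invented purely to show the bookkeeping.

```python
def triples_correction(w, d, v=None):
    """Sum of (W + V) * W / D over the stored unique triple excitations.

    With v=None this is the form of Eq. 4.4; supplying v adds the singles
    term of Eq. 4.9 in the same pass.
    """
    total = 0.0
    for key, w_val in w.items():
        v_val = v[key] if v is not None else 0.0
        total += (w_val + v_val) * w_val / d[key]
    return total


# Two hypothetical triples, labelled (I, J, K, A, B, C); values are made up.
k1 = (2, 1, 0, 5, 4, 3)
k2 = (3, 1, 0, 5, 4, 3)
w = {k1: 0.02, k2: -0.01}
d = {k1: -2.0, k2: -1.0}
v = {k1: -0.005, k2: 0.002}

e_w_only = triples_correction(w, d)     # form of T(4) or T(CCSD)
e_with_v = triples_correction(w, d, v)  # form of the (T) correction
print(e_w_only, e_with_v)
```

Whether the first number corresponds to T(4) or to T(CCSD) depends only on which doubles amplitudes were used to build W. In this toy data the V term is positive and reduces the magnitude of the correction, mimicking the damping behaviour described above.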

4.4 Quadruple and higher excitations?

So far we have devoted all of our attention in going beyond the CCSD model to the
inclusion of connected triple excitations. Our original justification for this was the
observation that CCSD is in error in fourth order of perturbation theory, and this
entire error arose from connected triples. But what of higher excitations? Connected
quadruples, for example, contribute in the fifth order of perturbation theory [43]. For
completeness, we give here the full CCSDTQ equations in operator form [17]:

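$$= W T_1 + W T_2 + W T_1 T_2 + \frac{1}{2} W T_1^2 + \frac{1}{3!} W T_1^3 + W T_3, \tag{4.11}$$

$$= W + W T_1 + W T_2 + \frac{1}{2} W T_2^2 + W T_1 T_2 + \frac{1}{2} W T_1^2 T_2 + \frac{1}{2} W T_1^2 + \frac{1}{3!} W T_1^3 + \frac{1}{4!} W T_1^4 + W T_3 + W T_4 + W T_1 T_3, \tag{4.12}$$

$$= W T_2 + W T_3 + \frac{1}{2} W T_2^2 + W T_1 T_2 + W T_1 T_3 + W T_2 T_3 + \frac{1}{2} W T_1^2 T_2 + \frac{1}{2} W T_1 T_2^2 + \frac{1}{2} W T_1^2 T_3 + \frac{1}{3!} W T_1^3 T_2 + W T_4 + W T_1 T_4, \tag{4.13}$$

$$W T_3 + W T_4 + \frac{1}{2} W T_2^2 + W T_1 T_3 + W T_2 T_3 + \cdots$$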


It should be clear that if the CCSDT model, with its $N^8$ computational dependence, is
too expensive for general use, the CCSDTQ model, which would have an $N^{10}$ dependence,
will be even less feasible. And the development of noniterative approximations
along the lines of the CCSD+T or CCSD(T) methods only reduces the dependence
to $N^9$, at least initially. Several authors, however, have pointed out that a number
of the fifth-order terms, at least, can be computed with much less effort [17,50]. We
shall not discuss the details here, but the overall approach is to substitute the RHS
of the operator equation that defines the amplitudes in T4 into the equations for, say,
double excitations. For example, through fifth order we have the modified doubles
equation
$$W + W T_1 + W T_2 + W T_3 + \frac{1}{2} W T_2^2 + W T_1 T_2 + T_2^{\dagger} \left[ W T_3 + \frac{1}{2} W T_2^2 \right]_C, \tag{4.15}$$

where we have explicitly indicated that the term in square brackets, like the other
terms, is restricted to connected contributions. It is this last term that arises from
the fifth-order energy contribution of connected quadruples. The reader will note
that several terms are absent from the usual CCSD or CCSDT doubles equation, as
a result of truncating at fifth order.
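
The term lists in the operator equations above can be cross-checked with a simple line-counting argument, sketched below. The rule is an assumption of this sketch rather than something stated in the notes: a two-body W has four lines; in a connected term every cluster operator in the product is contracted with W at least once; and a product of total excitation rank R with c contracted lines reaches the (R + 2 - c)-fold excited projection. The prefactor is the usual 1/k! for each operator appearing k times. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement
from math import factorial

W_LINES = 4  # a two-body operator offers four lines for contraction

def connected_terms(m, max_rank=4):
    """Map each product of cluster-operator ranks to its prefactor, keeping only
    products that can contribute to the projection onto m-fold excitations."""
    terms = {}
    for k in range(W_LINES + 1):  # number of cluster operators in the product
        for ranks in combinations_with_replacement(range(1, max_rank + 1), k):
            total = sum(ranks)
            contractions = total + 2 - m  # W lines that must be contracted
            # each operator needs at least one contraction; W has only four lines
            if k <= contractions <= min(W_LINES, 2 * total):
                prefactor = Fraction(1)
                for multiplicity in Counter(ranks).values():
                    prefactor /= factorial(multiplicity)
                terms[ranks] = prefactor
    return terms

def label(ranks, prefactor):
    """Render a term such as (1, 1, 2) with prefactor 1/2 as '1/2 W T1^2 T2'."""
    powers = Counter(ranks)
    ops = " ".join(f"T{r}" + (f"^{p}" if p > 1 else "") for r, p in sorted(powers.items()))
    coeff = "" if prefactor == 1 else f"{prefactor} "
    return f"{coeff}W {ops}".strip()

if __name__ == "__main__":
    for m, name in [(1, "singles"), (2, "doubles"), (3, "triples")]:
        print(name, [label(r, c) for r, c in connected_terms(m).items()])
```

With `max_rank=4` the rule gives six terms for the singles projection and twelve each for doubles and triples, the same operator products that appear in Eqs. 4.11 to 4.13.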
It is essential to note that, despite the product forms like $T_2^{\dagger} W T_3$ that appear
in Eq. 4.15, this is not a disconnected cluster contribution. Rather, it is analogous
to making a perturbation theory estimate of the connected quadruples amplitudes
from the doubles equation and using this to correct the original energy. This leads
to methods denoted CCSD(TQ) [17] - a quadruples correction to CCSDT could
be formulated as CCSDT(Q), but its applicability would be limited. The cost of
these "perturbational" quadruples corrections is at worst N: N:, and many terms are
only $N^6$ overall.
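
To make the scaling argument concrete, the following sketch compares how the work grows when the system size N is doubled, using only the exponents quoted in these notes. The dictionary labels and the function name are illustrative, and prefactors, which matter in practice, are ignored.

```python
def cost_ratio(exponent, size_factor=2.0):
    """Factor by which the work grows when N is multiplied by size_factor."""
    return size_factor ** exponent

# Exponents as quoted in the text; the labels are informal.
SCALING = {
    "CCSD (iterative)": 6,
    "CCSDT (iterative)": 8,
    "noniterative quadruples, initial estimate": 9,
    "CCSDTQ (iterative)": 10,
}

if __name__ == "__main__":
    for method, power in SCALING.items():
        print(f"{method}: N^{power}, doubling N multiplies the work by {cost_ratio(power):.0f}")
```

The ratio between two methods grows as N to the difference of their exponents, so the gap between CCSDT and CCSDTQ widens quadratically with system size.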
My own, very limited, experience with these corrections has not been especially
encouraging. For systems in which Hartree-Fock is a good approximation, the
CCSD(T) results are usually already very good, and additional corrections are as
likely to make things worse as to make them better. On the other hand, if Hartree-
Fock is not a good approximation, it is not obvious that low-order estimates of con-
nected higher excitations will be useful anyway. Indeed, we have already noted that
CCSD+T(CCSD), or, worse, CCSD+T(4), is not reliable under these circumstances.
CCSD(T), which behaves better, includes not only the infinite-order effects in the
singles and doubles space, but also fifth-order effects involving singles and triples.
One might suspect, then, that lowest-order perturbation theory treatments of con-
nected higher excitations will not work well when nondynamical correlation becomes
important. It is probably preferable to treat the inadequacies in the Hartree-Fock
model more directly, say by using multireference methods.
Chapter 5

Open-shell coupled-cluster methods

5.1 General remarks

We shall concentrate in this chapter on methods that explicitly handle open-shell sys-
tems. However, there have been many suggestions that treat such systems implicitly.
For example, equations-of-motion (EOM) or propagator methods can be formulated
to operate on ground-state wave functions to directly calculate ionization potentials,
electron affinities, or excitation energies [31,33,51,52]. If the ground state is well
described by Hartree-Fock plus CCSD, for instance, these EOM methods can yield
good results with little computational effort beyond the ground-state wave function
itself. The interested reader can find many examples in the literature. Of course,
such methods do not address all of the systems that would otherwise require open-
shell or even multireference methods. We shall now focus on methods that set out
to treat open-shell systems by explicit calculation. There have been a number of
efforts to generalize CC methods to spin-adapted open-shell treatments, and even
to multireference treatments. Much of what has been published is mathematical in
nature and not directly related to numerical computation. We will concentrate in this
chapter on methods that have been devised with the aim of efficient computational
implementation in mind. Our survey is necessarily qualitative, since the open-shell
and multireference methods lead to very complicated equations and exploring the
mathematics is beyond the scope of this course.

5.2 UHF-based methods

A spin-orbital formulation of CCSD (or any other CC method) based on a UHF
reference function can be used for open-shell systems [18] and, at least in principle,
for closed-shell systems when nondynamical correlation becomes important. CCSD
and CCSD(T) have been very widely used based on UHF wave functions. Of course,
there are significant limitations to this approach. Only the lowest state of each sym-
metry can be studied. More importantly, while almost any correlation treatment goes
some way towards rectifying spin contamination problems in the UHF reference, spin
contamination can still become an issue, especially where nondynamical correlation
is important. The major advantage of the UHF-based methods is their great sim-
plicity: they are very easily programmed compared to some of the more elaborate
schemes we will discuss in this chapter. The disadvantage is the computational effort
required in the calculations. As we saw in Chapter 3, a closed-shell system treated
using the UHF-based formalism requires about three times more computer time than
it would with a spin-adapted closed-shell treatment. We should reiterate that this is
true even when the equivalence between the alpha and beta spin-orbital spaces in the
closed-shell case is used to reduce the number of integrals that must be processed,
since as we saw in Chapter 3 there are up to six possible determinants from a double
excitation of spatial orbitals, but only two spin eigenfunctions. Hence the number of
independent parameters in the closed-shell case is one-third that of the UHF case.
A factor of three in computational effort is a considerable increase, and it certainly
seems desirable to explore ways of reducing it.
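
The factor of three can be reproduced by direct counting. A double excitation from two distinct doubly occupied spatial orbitals into two distinct virtual orbitals leaves four singly occupied orbitals; the sketch below counts the determinants with M_s = 0 and the number of independent singlet functions they span. The helper names are hypothetical, and the singlet count uses the standard argument that every multiplet with S of at least 1 contributes one function to both the M_s = 0 and M_s = 1 spaces.

```python
from itertools import combinations
from math import comb

def ms0_determinants(n_open):
    """Spin assignments with M_s = 0: choose which open shells carry alpha spin."""
    return list(combinations(range(n_open), n_open // 2))

def singlet_count(n_open):
    """Independent S = 0 functions for an even number of open shells:
    dimension of the M_s = 0 space minus dimension of the M_s = 1 space."""
    half = n_open // 2
    return comb(n_open, half) - comb(n_open, half + 1)

if __name__ == "__main__":
    orbitals = ("i", "j", "a", "b")  # the four singly occupied spatial orbitals
    for alpha_set in ms0_determinants(len(orbitals)):
        spins = ["alpha" if k in alpha_set else "beta" for k in range(len(orbitals))]
        print(dict(zip(orbitals, spins)))
    print("determinants:", len(ms0_determinants(4)), "singlets:", singlet_count(4))
```

Six determinants against two spin eigenfunctions is the origin of the one-third ratio of independent parameters quoted above.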

5.3 RHF/UHF-based methods

What is the problem with simply writing down a set of CC equations based on a
restricted open-shell Hartree-Fock (RHF) wave function as the reference? Recall that
in Sec. 2.3, we employed a Hausdorff expansion of the operator exp(-T)H exp(T).
This Hausdorff expansion terminated after four commutators (that is, after five terms)
because of the commutation relation of Eq. 2.23. That relation depends on a division
of the MO space into orbitals that are occupied in the reference determinant and
those that are empty. For an open-shell case this simple division is not possible when
we use only spatial orbital indices, since the open-shell orbitals are only partially
occupied. (In the spin-orbital case, of course, all indices can be uniquely identified
as occupied or virtual.) Excitations both to and from these orbitals, such as $X_t^{\dagger} X_i$
and $X_a^{\dagger} X_t$, where t denotes an open-shell orbital, are possible. And this is what
creates the problem, since it means that the orbital excitation operators of Eq. 3.13
do not commute when they involve open-shell indices. For example,

(5.1 )
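
The loss of commutativity is easy to verify numerically. The sketch below is an illustration under stated assumptions, not the notation of Eq. 3.13: it builds spin-summed orbital excitation operators E_pq from Jordan-Wigner matrices for three spatial orbitals (closed-shell i, open-shell t, virtual a) and checks that an excitation into the open shell and one out of it do not commute, whereas two excitations that share only the closed-shell orbital do. All function and variable names are hypothetical.

```python
from functools import reduce
import numpy as np

def annihilators(n_spin_orbitals):
    """Jordan-Wigner matrices for the annihilation operators a_0 ... a_{n-1}."""
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])               # sign string enforcing anticommutation
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # maps occupied |1> to empty |0>
    return [
        reduce(np.kron, [parity] * p + [lower] + [identity] * (n_spin_orbitals - p - 1))
        for p in range(n_spin_orbitals)
    ]

I_CLOSED, T_OPEN, A_VIRT = 0, 1, 2  # spatial orbital labels i, t, a
a = annihilators(6)                 # three spatial orbitals, two spins each

def E(p, q):
    """Spin-summed excitation operator: sum over spin of a+_{p,spin} a_{q,spin}."""
    return sum(a[2 * p + s].T @ a[2 * q + s] for s in (0, 1))

def commutator(x, y):
    return x @ y - y @ x

to_open = E(T_OPEN, I_CLOSED)         # i -> t, excitation into the open shell
from_open = E(A_VIRT, T_OPEN)         # t -> a, excitation out of the open shell
closed_to_virt = E(A_VIRT, I_CLOSED)  # i -> a

if __name__ == "__main__":
    print("[E_at, E_ti] equals E_ai:",
          np.allclose(commutator(from_open, to_open), closed_to_virt))
    print("[E_ai, E_ti] vanishes:",
          np.allclose(commutator(closed_to_virt, to_open), 0.0))
```

The nonzero commutator arises only because t can act both as a target and as a source of excitations, which is precisely the ambiguity a partially occupied orbital introduces.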

In Sec. 2.3 we used the fact that the spin-orbital excitation operators (there expressed
explicitly in terms of creation and annihilation operators) commute to demonstrate
that the Hausdorff expansion terminated after five terms (four commutators). This
is also true for the closed-shell case as well as the UHF case, because there is no
ambiguity defining occupied and virtual spaces. The existence of a partially occupied
space allows both creation and annihilation involving this space. The excitation
operators do not commute, and hence the termination of the Hausdorff expansion
after five terms is lost. In fact, for T truncated to only single and double excitations,
the Hausdorff expansion terminates only after eight commutators. Thus the spin-
adapted RHF-based CCSD equations [53] are considerably more complicated than
their closed-shell brethren, and, incidentally, are much more nonlinear, containing up
to Tf and n.
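
For contrast, the termination of the expansion in the spin-orbital case can be checked numerically on a toy problem. The sketch below is a minimal illustration under its own assumptions: four spin-orbitals (two occupied, two virtual) in a Jordan-Wigner representation, a random one- plus two-body operator standing in for H, and a T containing only virtual-creator, occupied-annihilator products. It evaluates the nested commutators [[H, T], T], ... and compares the norms of the fourth and fifth. All names are hypothetical.

```python
from functools import reduce
from itertools import product
import numpy as np

def annihilators(n_spin_orbitals):
    """Jordan-Wigner matrices for the annihilation operators a_0 ... a_{n-1}."""
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    return [
        reduce(np.kron, [parity] * p + [lower] + [identity] * (n_spin_orbitals - p - 1))
        for p in range(n_spin_orbitals)
    ]

rng = np.random.default_rng(0)
n = 4
occupied, virtual = (0, 1), (2, 3)
a = annihilators(n)
c = [m.T for m in a]  # creation operators (the matrices are real)

# Random one- plus two-body operator standing in for the Hamiltonian.
H = sum(rng.normal() * c[p] @ a[q] for p, q in product(range(n), repeat=2))
H = H + sum(rng.normal() * c[p] @ c[q] @ a[s] @ a[r]
            for p, q, r, s in product(range(n), repeat=4))

# T = T1 + T2 built only from virtual creators and occupied annihilators.
T = sum(rng.normal() * c[v] @ a[o] for v in virtual for o in occupied)
T = T + rng.normal() * c[2] @ c[3] @ a[1] @ a[0]

def commutator(x, y):
    return x @ y - y @ x

nested = [H]
for _ in range(5):
    nested.append(commutator(nested[-1], T))
norms = [float(np.linalg.norm(m)) for m in nested]

if __name__ == "__main__":
    for depth, value in enumerate(norms):
        print(f"{depth}-fold commutator norm: {value:.3e}")
```

Each commutation with T removes at least one of the (at most four) operators in H that can be contracted, so the fifth nested commutator vanishes to rounding error while the fourth generally does not.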
This leads rather naturally to a suggestion that represents the most obvious
first step in developing a method based on an RHF reference, namely, to employ a
UHF-based program, but to supply an RHF determinant and RHF MOs to it [54].
Since the UHF equations are determined using spin-orbital excitations, there is no
ambiguity in the definition of the occupied and virtual spaces. The first difficulty
with this approach is that if exp(T) is simply taken from the UHF spin-orbital
expressions (presumably with T truncated at some excitation level), the CC wave
function $\exp(T)\Psi_0$ will not necessarily be a spin eigenfunction, even if $\Psi_0$ itself is an
open-shell spin eigenfunction. However, Rittby and Bartlett [54] pointed out that this is not
a problem for the CC energy. For instance, if we assume that $\exp(T)\Psi_0 = \Psi_c + \Psi_c'$,
where $\Psi_c$ is a spin eigenfunction with the desired spin, and $\Psi_c'$ comprises all the
spin-contaminant terms, the CC correlation energy is

(5.2)
since the matrix element $\langle \Psi_0 | W | \Psi_c' \rangle$ must be zero. This is so because the Hamiltonian
commutes with all spin-dependent operators, and thus cannot couple functions of
different spin. Hence even if the CC wave function is not a spin eigenfunction, the
CC correlation energy is free of any spin contamination effects. Thus by using an
RHF determinant and MOs in a UHF-based CC code, we obtain an energy that is
free of spin contamination, although the CC wave function we would obtain is not
necessarily a spin eigenfunction.
The second difficulty with this approach is that it does not exploit any of the
spin symmetry properties of the open-shell reference. Individual determinants with
independent amplitudes are generated by the UHF-based operator T. Thus we still
have the factor of three larger number of terms in the closed-shell case compared to
the spin-adapted equations, for instance [54,55]. Obviously, using an RHF reference
function does not reduce the work done, since the program cannot utilize the spin
symmetry. Hence we must accept that open-shell calculations with a UHF-based
program will take much longer than closed-shell calculations with about the same
number of electrons correlated. Further, we may certainly harbour suspicions that
such a program would take longer than a putative open-shell code in which the spin
symmetry is used throughout, since it should be obvious that at the very least, that
part of the code that deals exclusively with the closed-shell orbitals performs more
work than is actually necessary.

5.4 RHF-based methods

It is, perhaps, a measure of how much more difficult are restricted open-shell correla-
tion treatments than closed-shell methods, that it is only very recently that successful
open-shell perturbation theories have become available [56-59]. From our earlier dis-
cussions it should be clear that, if the perturbation theory has only recently proved
tractable, CC methodology will be barely emerging. Recent progress in this area has
been encouraging, however, and prospects undoubtedly look brighter than they did
even two years ago.
Formal work in this area (we are excluding EOM-type implicit methods) in-
cludes a full derivation of open-shell CCSD in terms of spin-adapted excitation op-
erators by Janssen and Schaefer [53]. The expressions were so complicated that
symbolic algebra programs were employed in their generation. To someone accus-
tomed to CI methodology, it may seem undesirable to insist on developing explicit
formulas for all the quantities needed. Multireference CI programs invariably include
the rules for evaluating matrix elements between prototype configurations, and no
explicit formulas for matrix elements need be programmed. This viewpoint is not
really appropriate. CC theory is not a method that is "driven" by Hamiltonian ma-
trix elements between configurations. It is true that spin-orbital CCD can be derived
in this way, but that serves only to mislead the unwary. It could be argued that
the only configurations in the CC equations are those onto which we project to ob-
tain the nonlinear equations - the matrix elements required are those of the (much
messier) operator exp(-T)H exp(T) between these configurations and Hartree-Fock.
This is certainly closer to what is done, for example, in perturbation-theoretic deriva-
tions of CC theory, than is an analysis involving matrix elements between double and
quadruple excitations. In a general open-shell or multireference CC theory we are
not constructing a list of configurations and generating matrix elements over this list,
and it is not profitable to try to apply arguments appropriate to CI (which is such a
procedure) to the CC case.
All this being said, however, it is clearly not very convenient to have an open-
shell spin-adapted theory that involves thousands of different terms, which is what
was found by Janssen and Schaefer [53]! Presumably, a symbolic manipulator that
can generate the expressions could also generate FORTRAN code for them, but this
is not likely to be very efficient. And if the goal is to improve on the performance
of the RHF/UHF-based implementations, it may well not be met. More promising
is a completely different approach, based on some redefinitions of the reference wave
function.
Jayatilaka and Lee [60] suggest the use of quite different spin-orbitals in the
reference wave function from the usual alpha and beta spin-functions. Specifically,
they suggest that rather than use the conventional $m_s = \frac{1}{2}$ and $m_s = -\frac{1}{2}$ spin
functions for open-shell orbitals, the average of these functions should be used. For
a 2N + 2 electron triplet state, for instance, we would have
(5.3)
where the presence or absence of a bar denotes alpha or beta spin, as usual, whereas
the tilde indicates the new spin functions

(5.4)

and the sign used in the combination is specified as a superscript. The determinant in
Eq. 5.3 is a mixture of the traditional $M_s = 1, 0, -1$ determinants ($S_z$ eigenfunctions)
with coefficients of $\frac{1}{2}$, $\frac{1}{\sqrt{2}}$, $\frac{1}{2}$, respectively. This result can be obtained by explicitly
expanding $\tilde{\phi}^{+}$ using Eq. 5.4; a general formula is given by Jayatilaka and Lee [60], who
call these "symmetric spin-orbitals". The wave functions are actually eigenfunctions
of $S^2$ and $S_x$ using these spin orbitals, as the reader can verify. Since the Hamiltonian
is independent of spin, all Ms components of the same configuration are degenerate
and noninteracting, so the averaging we have performed has no consequence for the
energy.
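
The expansion can be checked in a few lines for the spin part of the two open-shell electrons. The sketch below assumes that the "+" symmetric spin function is the normalized combination (alpha + beta)/sqrt(2); the spatial part of the determinant is not represented. It projects the product of two such functions onto the three triplet components and the singlet, and tests whether the product is an eigenvector of the total S_x. Variable names are hypothetical.

```python
import numpy as np

alpha = np.array([1.0, 0.0])
beta = np.array([0.0, 1.0])
plus = (alpha + beta) / np.sqrt(2.0)  # assumed "+" symmetric spin function

# Spin part of the two open-shell electrons, both in the "+" function.
pair = np.kron(plus, plus)

# Normalized triplet components labelled by M_s, and the singlet.
triplet = {
    +1: np.kron(alpha, alpha),
    0: (np.kron(alpha, beta) + np.kron(beta, alpha)) / np.sqrt(2.0),
    -1: np.kron(beta, beta),
}
singlet = (np.kron(alpha, beta) - np.kron(beta, alpha)) / np.sqrt(2.0)

coefficients = {ms: float(vec @ pair) for ms, vec in triplet.items()}
singlet_overlap = float(singlet @ pair)

# Total S_x for two spin-1/2 particles (hbar = 1).
sx = 0.5 * np.array([[0.0, 1.0], [1.0, 0.0]])
total_sx = np.kron(sx, np.eye(2)) + np.kron(np.eye(2), sx)

if __name__ == "__main__":
    print("triplet coefficients:", coefficients)
    print("singlet overlap:", singlet_overlap)
    print("eigenvector of S_x:", np.allclose(total_sx @ pair, pair))
```

The vanishing singlet overlap confirms that the averaged function is a pure triplet, so the averaging over M_s components cannot change the energy.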
Consider now the CCSD equations. The problem for conventional open-shell
methods is the inequivalence between the spin-orbitals $\phi_t$ and $\bar{\phi}_t$, where t denotes
an open-shell orbital, since the former is in the occupied spin-orbital space while the
latter is in the virtual spin-orbital space. Using the symmetric spin-orbitals this is no
longer a problem. Jayatilaka and Lee [29] present an open-shell CCSD formulation
in which the number of independent amplitudes is significantly less than spin-orbital
formulations. For example, there are only half as many amplitudes of the general
type $t_{ij}^{ab}$ (that is, a double excitation from closed-shell orbitals to virtual orbitals) as
would be obtained in the RHF/UHF-based treatments of Rittby and Bartlett [54] or
Scuseria [55]. The final equations for the symmetric spin-orbital open-shell CCSD
model are fairly complicated, but not remotely as elaborate as those in the spin-
adapted open-shell CCSD model of Janssen and Schaefer.
We have mentioned that there are half as many $t_{ij}^{ab}$ amplitudes in the method
of Jayatilaka and Lee as in the RHF/UHF-based methods. The reader will recall
that in the closed-shell CCSD method, there are three times as many amplitudes in
the spin-orbital formulation as in a spin-adapted form. Hence, in effect, there are
1½ times as many $t_{ij}^{ab}$ amplitudes in the closed-shell part of the open-shell problem as
there would be for the closed-shell problem alone. One perspective on this is provided
by considering spatial single and double excitations that involve triple or higher spin-
orbital excitations. Thus (using conventional spin functions) the quadruple spin-
orbital excitation $\bar{i}\bar{j}tu \to \bar{t}\bar{u}ab$ would be classified as a double excitation of spatial
orbitals, since the occupation number of the spatial orbitals t and u does not change in
this excitation. In terms of symmetric spin-orbitals, such terms appear as "spin-flip"
excitations [60] like $\bar{i}\bar{j} \to ab$. One can either treat the amplitudes of these terms as
independent amplitudes, or choose other linearly independent terms, as discussed in
Ref. 60. In any event, there are half as many of these new terms as there were of the
original $t_{ij}^{ab}$, making a total of 1½ times the number of closed-shell CCSD amplitudes.
Given CCSD equations based on symmetric spin-orbitals, it should be straight-
forward to develop an open-shell perturbation theory. This has been discussed by Lee
and Jayatilaka [59], who also compare their approach with other recent open-shell
perturbation theories. In their open-shell CCSD paper [29] they also present a $\mathcal{T}_1$
diagnostic appropriate to the open-shell case. They show explicitly that a naive
generalization from the closed-shell $\mathcal{T}_1$ to the RHF/UHF-based methods is inappropriate
and includes some higher spin-orbital excitations. They give a consistent definition
for the open-shell case that should be more comparable with the closed-shell formula.

5.5 Higher excitations in open-shell methods

There is very little to be said under this heading. Of course, UHF-based methods
can be readily generalized to include higher excitations (although the computational
cost may be prohibitive), as we have already discussed in Chapter 4. It is therefore
possible to generalize the RHF/UHF-based approaches, as has been done by Scuseria [55]
and by Bartlett and co-workers [42], although there are pitfalls associated with
perturbational treatments of higher excitations. This is because there is no longer a
simple expression for matrix elements of $H_0$, that is, the perturbation energy
denominators, in terms of orbital energies. Particular choices of open-shell orbitals have
been recommended; Bartlett and co-workers also suggest a slightly different form [17]
of the triples correction, for example, that contains more fifth-order terms than the
usual (T) correction.
Including higher than double excitations into spin-adapted open-shell meth-
ods appears to be a rather difficult task, or will at least involve very complicated
expressions.

5.6 Multireference coupled-cluster methods

If we were to restrict our discussion here to those methods that have been programmed
explicitly, this section would be no longer than the last. We shall therefore cast a
somewhat wider net here, but we make no pretence of complete coverage of this field.
The aim is mainly to provide an introduction to this area for a reader who wishes to
explore it. An appropriate starting point is the establishment of a general taxonomy
for multireference coupled-cluster methods. We use the notation of Jeziorski and
Paldus [61]. Much of the terminology of the field is built on concepts like model
spaces and wave operators. A model space is a set of wave functions that provide an
approximate description of the system. An MCSCF calculation, for example, involves
a set of configurations that could span a particular model space: the eigenvectors of
the Hamiltonian matrix over these configurations provide approximate descriptions
(of varying quality) of different electronic states of the system. Lowdin [62] introduced
wave operators into quantum chemistry: typically, a wave operator $\mathcal{W}$ transforms an
approximate wave function $\Psi_0$ into the exact wave function $\Psi = \mathcal{W}\Psi_0$. For example,
with no truncation of T the CC wave function

$$\Psi = \exp(T)\Psi_0, \tag{5.5}$$


where $\Psi_0$ is the Hartree-Fock function, gives the exact (full CI) wave function. The
wave operator here would be exp(T). Indeed, T is sometimes referred to as the
logarithm of the wave operator.
Among the earliest multireference (and open-shell single reference) CC theories
(MRCC) are the Fock space approaches, in which a hierarchy of model spaces
is used. Each model space $\mathcal{M}_0^{(n)}$ provides a zeroth-order description of an n-electron
system, and these are constructed for all valence occupations $0 \le n \le N$ for an
N valence-electron molecule. Then an exponential ansatz is used to develop a "wave
operator" $\mathcal{W}_{\mathrm{Fock}}$ such that simultaneously

(5.6)
where $\mathcal{M}^{(n)}$ is the exact space. Such an approach is also called valence universal.
Mukherjee and co-workers have extensively investigated these methods [63], as has
Lindgren [64], and others. The main attraction of Fock space approaches is their obvi-
ous suitability for determining ionization potentials and electron affinities, since they
readily yield information about states with different numbers of electrons. However,
it can be imagined that considerable information about many molecular ion states is
required to define a single wave operator that satisfies Eq. 5.6.
A somewhat less demanding requirement is placed on the wave operator in
Hilbert space MRCC approaches. Here a single model space $\mathcal{M}_0$ is used to represent
states of the N-electron molecule. Then the wave operator transforms these
N-electron approximate states into the exact ones:

(5.7)

A linear expansion of the wave operator here would lead to procedures very similar to
multireference CI (MRCI), although with any truncation by excitation level the result
would not be size-extensive in general. Use of an exponential ansatz for the wave op-
erator in Eq. 5.7 was first suggested by Jeziorski and Monkhorst [65]. Bartlett and co-
workers have also pursued this approach on several occasions, suggesting a linearized
approximation and several noniterative size-extensivity corrections to MRCI [66], as
well as exploring more elaborate schemes [67,68]. Sometimes the term state universal
is used, instead of Hilbert space, in referring to Eq. 5.7.
On the surface, the simplest of the MRCC approaches are the one state meth-
ods, in which the model space is reduced to a single element, corresponding to a
particular electronic state. The wave operator then formally takes this approximate
representation of a given state into the exact state:

(5.8)

Again, with a linear expansion of the wave operator this approach resembles inter-
nally contracted MRCI methods, which are more economical than MRCI procedures
based on a multidimensional model space. With an exponential ansatz for $\mathcal{W}_{\mathrm{one}}$,
we have a type of MRCC method that has been explored by various authors, per-
haps most extensively by Simons and co-workers [69-71] and by Nakatsuji, Hirao,
and co-workers (the "symmetry-adapted cluster" methods, see Refs. 72 and 73, and
references therein). The term state-selective is also used for these one state methods.
It must be said at the outset that none of these three general approaches is
without severe problems, either formal or practical. All three lead to equations that
vary from merely complicated to almost incomprehensible. The computer implemen-
tations that exist are usually restricted in some way or other, sometimes severely.
Quite apart from these issues, which it could be argued could be overcome by in-
vesting the programming effort and by supplementing this author's mathematical
education, there are more fundamental problems. In the Fock space and Hilbert
space methods, model spaces must be defined on which the wave operator will act.
However, within a finite model orbital space - which is typically composed of only
the valence orbitals for small molecules and will be more restricted for larger systems
- even a full CI calculation is unlikely to provide good approximations to more than
the first few exact wave functions. That is, the spectrum of the model space does
not resemble the exact spectrum. For higher excited states of the exact problem, it is
almost inevitable that the wave functions will contain significant contributions from
configurations that represent excitations outside the model space. For example, if we
consider the simple case of H2 with a two-electron, two-orbital model space, the first
excited $^1\Sigma_g^+$ state can best be represented as

(5.9)

where $|c_u| \gg |c_g|$. However, it seems highly implausible that a real bound excited $^1\Sigma_g^+$
state will resemble such a wave function: a wave function dominated by

$$\frac{1}{\sqrt{2}} \left( |1\sigma_g 2\bar{\sigma}_g| + |2\sigma_g 1\bar{\sigma}_g| \right) \tag{5.10}$$

seems more likely, and this is an excitation external to our model space. Hence as we
attempt to transform the model space into the exact space, such external excitations
will "intrude" [74], appearing with lower energies than the model space approxima-
tions and disrupting the model space spectrum. Intruder state problems of this type
plague all multireference methods (including varieties of multireference perturbation
theory that are beyond the scope of this work) that rely on well-defined correspon-
dences between model space and exact wave functions. They are very difficult to
resolve without resorting to impractically large model spaces.
It might appear from the intruder state problem that one state MRCC methods
should be preferred. A quite different problem arises here, however. No exponential
ansatz for the wave operator for such an approach has been devised that can be shown
to be complete (that is, that yields the full CI result when up to N-electron excitations
are included), without including amplitudes that cannot be determined from the single
state approach [61]. A unitary formulation appears to sidestep this difficulty [71], but
the Hausdorff expansion for such a formulation is infinite (for any order of excitation
in T) and must therefore be truncated in practical implementations.
Another problem with essentially all MRCC approaches is the issue of "incom-
plete" model spaces (that is, not full CI spaces), or, for the one state methods, the use
of a multiconfigurational reference function that is not a CASSCF wave function but
involves some selection of configurations. Jeziorski and Monkhorst [65] showed that
some disconnected terms must appear when incomplete model spaces are used. It is
far from straightforward to demonstrate that a given approach will be size-extensive
for an incomplete model space, and indeed some approaches will not be. Complete
model spaces exacerbate intruder state problems and lead to very long expansions.
We should note that correlation of "inactive" orbitals (that is, including excitations
from orbitals that are doubly occupied in all reference configurations in the definition
of T) can also lead to problems similar to those of incomplete model spaces.
The Hilbert space formulation of Jeziorski and Monkhorst has provided the
basis for most of the effort in MRCC methods. The author is not aware of any
general-purpose computer implementation of this method, although by employing
the usual restriction $T = T_1 + T_2$, eliminating all nonlinear terms, and neglecting matrix
elements of exp(T) that couple different model space configurations, Laidig and
Bartlett [66] developed the multireference linearized coupled-cluster method (MRL-
CCM). This can be implemented with relatively minor modifications to an MRCI pro-
gram and needs the same computational effort. However, MRLCCM should rightly
be regarded as an approximate CC method, and we shall discuss it further in the
context of other approximate methods, in Chapter 6.
Chapter 6

Other treatments of size-extensivity

6.1 General remarks

In this chapter we shall discuss a variety of methods designed to address the problem
of size-extensivity. Some of the methods are exactly size-extensive and some are
only approximately so; some methods are rather closely related to CC theory, others
are not. Where possible, we shall stress the relationships with CC methods. This
means that certain treatments, like "quadratic configuration interaction", are viewed
as approximations to CC methods. This viewpoint has caused controversy in the
past, but it is hard to disagree with the philosophy expressed by Paldus and co-
workers [75]: "When theory B may be obtained by dropping certain terms ... from
equations characterizing theory A, one normally considers B as a special case of (or
approximation to) A.".

6.2 Quadratic configuration interaction

The quadratic configuration interaction (QCI) approach was introduced by Pople and
co-workers in 1987 [76]. It was originally derived by adding only those terms to the
CISD equations that were required to ensure size-extensivity. (Pople and co-workers
use the term "size-consistency" throughout their discussion, but in this course the
appropriate term is size-extensivity.) The resulting equations are

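$$\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \epsilon, \tag{6.1}$$
$$\langle \Psi_i^a | W | (1 + T_1 + T_2 + T_1 T_2) \Psi_0 \rangle = \epsilon\, t_i^a, \tag{6.2}$$
$$\langle \Psi_{ij}^{ab} | W | (1 + T_1 + T_2 + \frac{1}{2} T_2^2) \Psi_0 \rangle = \epsilon\, t_{ij}^{ab}, \tag{6.3}$$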
where for convenient comparison with the work of Pople and co-workers we have used
their form in which the correlation energy appears explicitly in the wave function
equations. In the same form, the CCSD equations would be

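$$\langle \Psi_0 | W | (T_2 + \frac{1}{2} T_1^2) \Psi_0 \rangle = \epsilon, \tag{6.4}$$

$$\langle \Psi_i^a | W | (1 + T_1 + T_2 + T_1 T_2 + \frac{1}{2} T_1^2 + \frac{1}{3!} T_1^3) \Psi_0 \rangle = \epsilon\, t_i^a, \tag{6.5}$$

$$\langle \Psi_{ij}^{ab} | W | (1 + T_1 + T_2 + \frac{1}{2} T_2^2 + T_1 T_2 + \frac{1}{2} T_1^2 T_2 + \frac{1}{2} T_1^2 + \frac{1}{3!} T_1^3 + \frac{1}{4!} T_1^4) \Psi_0 \rangle = \epsilon\, (t_{ij}^{ab} + t_i^a t_j^b - t_i^b t_j^a), \tag{6.6}$$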

The following differences between CCSD and QCISD are then apparent. (Hartree-
Fock orbitals are assumed.)

(i) The CCSD energy (i.e., the projection on $\langle \Psi_0 |$) includes a contribution from the
disconnected term $T_1^2$, which is absent from QCISD.

(ii) The CCSD singles equations (the projection on $\langle \Psi_i^a |$) contain terms in $T_1^2$ and $T_1^3$
that are absent from QCISD.

(iii) The CCSD doubles equations contain a number of terms on both sides involving
powers of $T_1$ or products of $T_1$ and $T_2$ that are absent from QCISD.

We can draw some qualitative conclusions from these differences, all of which,
we note, involve single excitation amplitudes. First, the cubic and quartic terms
like $T_1^2T_2$, $T_1^3$, or $T_1^4$ that appear in the CCSD equations are absent from QCISD. The
only nonlinear terms are quadratic, hence the name QCI. Second, a putative QCID
method would correspond exactly to CCD. Third, since all differences involve $T_1$,
we can expect the difference between QCISD and CCSD results to be least when
single excitations are relatively unimportant. Alternatively, using the diagnostic
introduced previously, if $\mathcal{T}_1$ is large, we may expect significant differences between
CCSD and QCISD (note that it is of course possible to derive an analogous
diagnostic, which we have denoted $Q_1$, for QCI [77]). This is indeed what was found
in a detailed comparison of the two methods. In general, for a given system $Q_1$
tends to be larger than $\mathcal{T}_1$, and the difference becomes larger with increasing
magnitude of the diagnostics (that is, with respect to increasing nondynamical
correlation). However, the inclusion of connected triple excitations - the CCSD(T) and
QCISD(T) methods - substantially improves the results, and the triples-corrected results
agree much better than do the QCISD and CCSD results. Indeed, credit should be
given to Raghavachari and co-workers for recommending the (T) correction in the
first place [49], deriving the CCSD(T) method from the triples correction they had
implemented for QCISD [76]. (The actual terms appearing in the correction had
already been presented by Kucharski and Bartlett in their analysis [43] of fifth-order
perturbation theory.) The inclusion of the fifth-order terms is even more important
for QCI than for CC [50]. The QCISD+T(QCISD) or QCISD+T(4) methods would
not work well, especially when nondynamical correlation becomes important.
Computationally, the terms omitted from the CCSD equations to obtain the
QCISD equations are relatively straightforward, and do not consume much time in
each CC iteration. In fact, by appropriate formulation of the equations the work that
scales as $N^6$ should be the same in QCISD and CCSD. However, it might be expected,
given the always uncertain convergence of systems of nonlinear equations, that the
presence of cubic and quartic terms might cause convergence difficulties, or might
slow convergence, compared to only quadratic terms. In the author's experience this
has not been a real problem. There may be some compensation here - the fact that
QCISD "overshoots" CCSD as nondynamical correlation becomes more important
indicates that the QCI amplitudes are larger than their CCSD counterparts, which
may offset any advantage from less nonlinearity.
In the original QCI reference [76], Pople and co-workers gave a rather general
prescription for converting any excitation level CI into a QCI procedure. However,
there has been considerable discussion about whether, for example, the QCISDT
treatment obtained with their prescription is unambiguously defined, or whether such
treatments are truly size-extensive [75,78,79]. This is largely academic, since there
is little incentive (given the computational cost) to implement such methods. It is
possible to implement perturbational corrections for higher than triple excitations,
by analogy with the CC case, giving a method like QCISD(TQ). Benefits from such
methods are not obvious, as discussed in Chapter 4.
Freely admitting to a bias in favour of the "theoretically more complete"
method, the author finds little reason to perform QCISD or QCISD(T) calculations
if the means is at hand for their CC analogues. The CC methods undoubtedly
outperform the QCI methods when nondynamical correlation is important [77], and one
may not always know whether this is a problem beforehand, so CCSD or CCSD(T)
would be the better safety play. In systems strongly dominated by Hartree-Fock,
however, there is little to choose between QCISD(T) and CCSD(T).

6.3 Brueckner orbitals

Consider a configuration expansion of the exact wave function. Brueckner orbitals
were originally defined as a set of MOs for which the single excitations vanish. A more
useful practical strategy is to define Brueckner orbitals for a particular correlation
treatment. For example, we may construct a CISD wave function based on Hartree-
Fock orbitals, and then rotate the MOs so that the coefficients $c_I^A$ vanish. The final
wave function would then be of CID form. Alternatively, greater computational
efficiency can be obtained by iteratively updating the orbitals based on a CID wave
function from the outset. This strategy was followed by Meyer [38] in the method of
self-consistent electron pairs, an early matrix-formulated direct CI method in which,
at least initially, it seemed as if the explicit inclusion of single excitations would be
very expensive. Thus an iterative method to include their effect seemed preferable,
even though multiple integral transformations are required because the MOs change
in each iteration. Similarly, Purvis and Bartlett [18] suggested rotating the MOs,
using $\exp(T_1)$, as a method for avoiding a full CCSD calculation.
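Let us consider Brueckner orbitals in CCSD calculations [80]. In effect, we
have the following equations

\[
\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \epsilon, \tag{6.7}
\]
\[
\langle \Psi_I^A | W | (1 + T_2)\Psi_0 \rangle = 0, \tag{6.8}
\]
\[
\langle \Psi_{IJ}^{AB} | W | (1 + T_2 + \tfrac{1}{2}T_2^2)\Psi_0 \rangle = \epsilon\, t_{IJ}^{AB}, \tag{6.9}
\]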

since we require that $T_1 = 0$. Note that since the single excitation amplitudes are
absent, Eqs. 6.7 and 6.9 are identical to the CCD equations. Also, since QCID is
equivalent to CCD, we see that Brueckner orbital QCID and Brueckner orbital CCD
are equivalent. The orbital rotations required to eliminate $T_1$ are defined by Eq. 6.8.
In practice, one could proceed by assuming that the orbital rotations are small, and
Eq. 6.8 is then approximated as linear in the unknown rotations [80]. Thus the
computational procedure is to solve for the amplitudes, and then determine an orthogonal
transformation of the orbitals to satisfy the linearized form of Eq. 6.8. Obviously,
such a procedure needs to be rapidly convergent if it is to be useful, as otherwise
the repeated integral transformations will simply cost too much. In fact, this simple
procedure does not perform very well. A more elaborate Newton-Raphson approach
apparently does not work much better. Werner and co-workers have thus suggested a
rather different approach, in which orbital rotation is performed in each iteration of a
CCSD (or QCISD) calculation. That is, at the start of each CCSD iteration, the MOs
are rotated using the most recent estimate of the $t_1$ amplitudes. Initially, it might
seem as if this approach would be prohibitively expensive, because an integral trans-
formation would be required in every iteration. However, Werner and co-workers [81]
sidestep the full transformation, just as they have done previously in direct CISD
calculations [82]. The partial transformations they require are relatively cheap. This
benefit is not entirely cost-free, since each CCSD iteration takes somewhat longer
without the full integral transformation (with or without orbital rotation), but it is
probably cost-effective for calculations aimed at Brueckner orbitals. Other factors,
such as disk space and input/output performance, will also play a role in practice.
We note finally that a perturbation theory analysis [50] shows that Brueckner-orbital
CCD includes more fifth-order terms than CCSD (or QCISD). Hence the Brueckner
orbital methods can be viewed as more complete in this sense, at least. The numerical
consequences appear slight, however.
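The orbital update step described above can be illustrated with a small sketch. This is not the algorithm of Werner and co-workers or of any particular program: it assumes, purely for illustration, that each occupied orbital is mixed with the virtuals using the current singles amplitudes and that the result is then re-orthonormalized by Gram-Schmidt. The function names, the toy orbitals, and the sign convention of the mixing are all hypothetical, and only the occupied space is updated here.

```python
import math

def dot(u, v):
    """Euclidean inner product of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))

def rotate_occupied(occ, virt, t1):
    """Mix virtual orbitals into the occupied ones and re-orthonormalize.

    occ, virt: lists of coefficient vectors in an orthonormal basis.
    t1[i][a]:  singles amplitude coupling occupied i to virtual a
               (illustrative convention, an assumption of this sketch).
    """
    # First-order mixing: phi_i <- phi_i + sum_a t1[i][a] * phi_a
    mixed = []
    for i, phi_i in enumerate(occ):
        new = list(phi_i)
        for a, phi_a in enumerate(virt):
            new = [x + t1[i][a] * y for x, y in zip(new, phi_a)]
        mixed.append(new)
    # Gram-Schmidt restores orthonormality of the updated occupied set
    ortho = []
    for v in mixed:
        for u in ortho:
            overlap = dot(u, v)
            v = [x - overlap * y for x, y in zip(v, u)]
        norm = math.sqrt(dot(v, v))
        ortho.append([x / norm for x in v])
    return ortho

# Toy data: two occupied and two virtual orbitals in a four-function basis
occ = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
virt = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
t1 = [[0.10, -0.05], [0.02, 0.08]]
new_occ = rotate_occupied(occ, virt, t1)
```

In a real Brueckner calculation this update would be repeated, with the amplitudes recomputed in the new orbitals, until the singles amplitudes vanish; the virtual space would also have to be kept orthogonal to the updated occupied space.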
It has become common practice to use the term "Brueckner doubles" (BD) as
a convenient shorthand for the treatment described here, that is, a CCD expansion
using Brueckner orbitals. The terminology is somewhat unfortunate, since the term
could also mean a CID expansion with Brueckner orbitals. Like most terminology in
use in the GAUSSIAN series of programs, BD or extensions like BD(T) will probably
become common in the literature, independent of their unsuitability or ambiguity.
But terminology like "Brueckner coupled-cluster doubles" or "CCD with Brueckner
orbitals" is less ambiguous and certainly preferable.
We have already described the use of the $\mathcal{T}_1$ diagnostic as a probe of the
importance of nondynamical correlation. For Brueckner orbitals, of course, $\mathcal{T}_1 = 0$, by
definition. We can thus regard the transformation to Brueckner orbitals as an attempt
to let the orbitals respond to nondynamical correlation; it might be hoped that the
use of Brueckner orbitals would thereby be beneficial in cases where nondynamical
correlation was important, and this point has been strongly emphasized in recent dis-
cussions of Brueckner orbitals. However, explicit comparisons with full CI benchmark
results, at least, do not show any particular advantage of Brueckner orbitals in this
respect [80]. Lee and co-workers [83] have compared QCISD, CCSD, and Brueckner-
CCD with one another, and also compared these methods when augmented with the
(T) triples correction. They found little to choose between the methods, especially
when the triples correction was included.

6.4 Approximate CC treatments

One of the virtues of the energy-independent formulation of the CC equations has been
emphasized by Bartlett [12]: the results are size-extensive no matter what truncation
of T is employed, and no matter which terms are dropped from the resulting equations.
We have already seen this, to some extent, with QCISD. But we can eliminate terms
more drastically than this and obtain size-extensive results. For example, consider
again the CCD equations,

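\[
\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \epsilon, \tag{6.10}
\]
\[
\langle \Psi_{IJ}^{AB} | W + (WT_2 - T_2W) + (\tfrac{1}{2}WT_2^2 - T_2WT_2) | \Psi_0 \rangle = 0, \tag{6.11}
\]

and eliminate all terms nonlinear in $T_2$. The resulting equations are

\[
\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \epsilon, \tag{6.12}
\]
\[
\langle \Psi_{IJ}^{AB} | W + (WT_2 - T_2W) | \Psi_0 \rangle = 0. \tag{6.13}
\]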

It is vital to observe that this does not reduce to the CID equations: instead of the
CID eigenvalue problem we have a set of linear equations for the unknown amplitudes.
Our results will be size-extensive (although since the result for an isolated two-electron
system is not correct, we have correct scaling behaviour, but not the right answer,
in the case of noninteracting two-electron systems). This approximation to CCD (or
sometimes the analogous approximation applied to CCSD) is termed "Čížek's linear
approximation" [10]. The method may be denoted LCCD, LCPMET, CEPA-0, or
CPA$_0$. LCCD calculations are clearly very simple to carry out, but have not been
widely used. This is partly because the method tends to overestimate the effects
of size-extensivity. That is, if we compare, say, CID, CCD, and LCCD results, and
identify the difference between CID and CCD as the correction due to size-extensivity,
the difference between CID and LCCD can be considerably larger (overshooting by
50% or more). Where nondynamical correlation becomes important this overshoot
can become uselessly large. As we have noted, LCCD does not give the correct
answer for a two-electron system, so one might also have doubts about applying it to
systems with rather few electrons correlated. These shortcomings are probably the
main reasons why the method has not been used much.
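The scaling behaviour described above can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not a calculation from the text: it takes n identical noninteracting two-electron systems, each with one double excitation that couples to the reference through a hypothetical matrix element K and lies a hypothetical energy delta above it. The function names and parameter values are invented for the example.

```python
import math

def pair_exact(K, delta):
    """Exact correlation energy of one pair: lowest root of [[0, K], [K, delta]]."""
    return 0.5 * delta - math.sqrt(0.25 * delta**2 + K**2)

def cid_energy(n, K, delta):
    """CID for n noninteracting pairs: the reference couples to n equivalent
    doubles, which is equivalent to a 2x2 problem with coupling sqrt(n)*K."""
    return 0.5 * delta - math.sqrt(0.25 * delta**2 + n * K**2)

def lccd_energy(n, K, delta):
    """LCCD: the linear amplitude equation K + delta*t = 0 for every pair."""
    t = -K / delta
    return n * K * t

def ccd_energy(n, K, delta):
    """CCD: K + delta*t - K*t**2 = 0 for every pair (root that vanishes with K)."""
    t = (delta - math.sqrt(delta**2 + 4.0 * K**2)) / (2.0 * K)
    return n * K * t

K, delta = 0.1, 1.0  # illustrative values only
for n in (1, 2, 10):
    print(n, cid_energy(n, K, delta), lccd_energy(n, K, delta),
          ccd_energy(n, K, delta), n * pair_exact(K, delta))
```

In this model LCCD and CCD both scale linearly with n, but only CCD reproduces n times the exact pair energy; LCCD lies slightly below it even for a single pair, and CID falls progressively short as n grows.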
The general idea of dropping some terms from the CCSD equations to generate
approximate methods has produced many schemes besides LCCD. One approach has
been to classify various contributions diagrammatically and to eliminate some dia-
grams, or to eliminate terms with an eye to their computational cost. This gives rise
to a family of methods with names like ACCD [39] and ACPQ [84], which have rarely
been used in production calculations. More common, perhaps, are the various types of
coupled electron-pair approximation (CEPA). These have a complicated history, early
versions being derived heuristically by Kelly [85] in attempts to improve low-order
perturbation theory treatments of electron correlation. Meyer [86] and Kutzelnigg and
co-workers [87] analyzed this in terms of Sinanoglu's cluster expansion of the wave
function and analogies drawn with independent-pair approximations [7]. Finally, the
connections with CC theory were established clearly in independent work by Hur-
ley [4] and by Kutzelnigg [88]. To understand this approach to the CEPA methods
we return to the CCD amplitude equations in the form that explicitly includes the
correlation energy:
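\[
\langle \Psi_{IJ}^{AB} | W | (1 + T_2 + \tfrac{1}{2}T_2^2)\Psi_0 \rangle = \epsilon\, t_{IJ}^{AB}. \tag{6.14}
\]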
We established in Sec. 2.4 that the elimination of the energy from Eq. 6.14 is
accomplished by explicitly expanding the matrix element between doubles and quadruples;
of the eighteen terms in the disconnected quadruples the first could then be written
as
\[
\sum_{K>L} \sum_{C>D} \langle \Psi_0 | W | \Psi_{KL}^{CD} \rangle\, t_{IJ}^{AB}\, t_{KL}^{CD} = \epsilon\, t_{IJ}^{AB}, \tag{6.15}
\]

cancelling the unknown energy. (Dropping the remaining 17 terms would yield
LCCD.) However, in order to accomplish this cancellation we used the matrix el-
ement identity
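\[
\langle \Psi_{IJ}^{AB} | W | \Psi_{IJKL}^{ABCD} \rangle = \langle \Psi_0 | W | \Psi_{KL}^{CD} \rangle \tag{6.16}
\]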
and a summation range over $K > L$ that implicitly included terms that would arise
from "exclusion principle violating" (EPV) quadruple excitations, like $\Psi_{IJIJ}^{ABCD}$. Such
terms cannot arise in a fermion wave function, of course. For example, the correlation
energy for the pair $IJ$ is

\[
\epsilon_{IJ} = \sum_{A>B} \langle \Psi_0 | W | \Psi_{IJ}^{AB} \rangle\, t_{IJ}^{AB}, \tag{6.17}
\]

which would be given by a doubles/quadruples matrix element like the LHS of Eq. 6.16
when $KL = IJ$. One might argue that in equations for amplitudes associated with
the pair $IJ$ it is an error to have included such EPV terms, and that they should be
removed. In this way we would obtain an approximation like [85,86]

\[
\langle \Psi_{IJ}^{AB} | W | \tfrac{1}{2}T_2^2 \Psi_0 \rangle \approx (\epsilon - \epsilon_{IJ})\, t_{IJ}^{AB}, \tag{6.18}
\]

whereas in LCCD we have

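\[
\langle \Psi_{IJ}^{AB} | W | \tfrac{1}{2}T_2^2 \Psi_0 \rangle \approx \epsilon\, t_{IJ}^{AB}. \tag{6.19}
\]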

The approximation of Eq. 6.18 is termed CEPA-2, while we can see from Eq. 6.19
why LCCD is sometimes termed CEPA-0. We can see that in both cases the unknown
correlation energy that would appear on the RHS of Eq. 6.14 will be cancelled out,
although in the case of CEPA-2 the (unknown) pair correlation energy $\epsilon_{IJ}$ remains
on the LHS. For the noninteracting two-electron systems this does not interfere with
size-extensivity, as can be seen by repeating the scaling arguments of Chapter 2.
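The size-extensivity statement can be checked on a small model. What follows is a sketch under stated assumptions rather than part of the original derivation: take n identical, noninteracting two-electron systems described with localized orbitals, each with a single double excitation of amplitude t, a coupling K to the reference, and a diagonal excitation energy \Delta. The symbols n, K, \Delta, t and the pair energy \epsilon_p are introduced here for illustration only.

```latex
\begin{align*}
\text{CCD:} \quad & K + \Delta t + (n-1)K t^2 = \epsilon t, \quad \epsilon = nKt
  \;\Rightarrow\; K + \Delta t - K t^2 = 0, \\
\text{CEPA-2:} \quad & K + \Delta t + (\epsilon - \epsilon_p)\,t = \epsilon t, \quad \epsilon_p = Kt
  \;\Rightarrow\; K + \Delta t - K t^2 = 0, \\
\text{CEPA-0:} \quad & K + \Delta t + \epsilon t = \epsilon t
  \;\Rightarrow\; K + \Delta t = 0.
\end{align*}
```

In each case the final amplitude equation is independent of n, so \epsilon = nKt scales linearly with the number of systems. Under these assumptions CEPA-2 reproduces the CCD amplitude, which for this model is exact, while CEPA-0 scales correctly but gives t = -K/\Delta, which is not the exact two-electron result.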
Perhaps the first question the reader might ask is "what about CEPA-1?" We
should first point out that CEPA-2 corresponds to Kelly's original suggestion [85] and
was certainly known and used first. The name CEPA was introduced by Meyer [86]
(who originally called it CCI - cluster-corrected CI) and the numerical suffixes were
used to distinguish between different approximations. Unfortunately, there was
confusion about Meyer's numbering, and consequently the first CEPA method is denoted
CEPA-2. The origins of CEPA-1, which we will now discuss, reflect an interesting
aspect of theoretical methods that we have not discussed up to this point: their invariance,
or lack of it, to unitary transformations on the molecular orbitals. Full CI results (or,
equivalently, full CC results) are invariant to any unitary transformation on the MOs
involved in the CI. Where, say, Hartree-Fock theory is used to define an occupied
and virtual space, the results of a truncated CI or coupled-cluster calculation will
not be invariant to an arbitrary transformation, since the Hartree-Fock solution it-
self is not invariant. However, a unitary transformation that mixes occupied orbitals
among themselves, and/or mixes virtual orbitals among themselves does not affect
the Hartree-Fock wave function (although it obviously changes individual orbitals).
Most of the methods we have encountered so far are rigorously invariant to such
transformations: this includes CC (or QCI) and CI at any level of excitation, plus
linearized CC methods like LCCD/CEPA-0. The $\mathcal{T}_1$ diagnostic is also invariant to
such mixing, although individual cluster amplitudes or CI coefficients clearly are not.
Such invariance is of interest if we wish to transform between, say, canonical and
localized Hartree-Fock orbitals, or if we have degenerate MOs that we may wish to
fix in some way that does not affect the energy. Invariance of perturbation theory
to orbital rotations is a much more complicated area, since Møller-Plesset
perturbation theory, for example, is predicated on expressing the resolvent using sums of
orbital energies in the denominator. Obviously, a different choice of orbitals may
lead to a nondiagonal Fock operator. More details are given elsewhere at this school,
but we mention it here because it also affects the perturbational estimate of higher
excitations, like the (T) treatment of triples. Returning to the CEPA methods, we
have already noted that CEPA-0 is invariant to unitary transformations on the MOs
that do not mix the occupied with the virtual space. However, this is not the case for
CEPA-2, since the individual pair-correlation energies are not invariant to such
transformations. For example, CCSD or CEPA-0 are size-extensive independent of such
transformations. But for the case of noninteracting two-electron systems, CEPA-2 is
size-extensive only if localized orbitals are used. This has always been the basis of
our analysis, but we now know it to be irrelevant for CCSD, etc. For CEPA-2,
however, a transformation to delocalized orbitals will not retain size-extensivity. Meyer
found this to be unsatisfactory, and modified the method [86] to obtain size-extensive
results for localized and delocalized orbitals in the case of noninteracting two-electron
systems. This method, CEPA-1, employs the approximation

\[
\langle \Psi_{IJ}^{AB} | W | \tfrac{1}{2}T_2^2 \Psi_0 \rangle \approx
  \Big(\epsilon - \epsilon_{IJ} - \tfrac{1}{2}{\sum_K}'' [\epsilon_{IK} + \epsilon_{JK}]\Big)\, t_{IJ}^{AB}, \tag{6.20}
\]

where the doubly primed summation omits both $K = I$ and $K = J$. The expression
now removes not only the pair-correlation energy $\epsilon_{IJ}$, but also all the "semi-joint"
pair-correlation energies in which one MO index is either $I$ or $J$. These additional
pair-correlation energies come from quadruples with excitations from, say, $IJIK$,
which are sometimes termed type 2 EPV terms, to distinguish them from the type 1
terms arising from $IJIJ$. Thus CEPA-2 removes the EPV type 1 terms, while CEPA-1
removes the EPV terms of types 1 and 2. (Remember, the notation is all Meyer's
fault!)
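The three approximations differ only in how much pair energy is subtracted from the total correlation energy in Eqs. 6.18 to 6.20. A minimal sketch of that bookkeeping follows; the function names and the dictionary layout for the pair energies are assumptions made for this example, and the numerical pair energies are invented.

```python
def pair_energy(pair_energies, i, j):
    """Look up epsilon_IJ for spin orbitals i != j; the key is the sorted index pair."""
    return pair_energies.get((min(i, j), max(i, j)), 0.0)

def cepa_shift(pair_energies, i, j, orbitals, variant):
    """Energy subtracted from the total correlation energy for pair (i, j).

    variant 0: nothing removed             (CEPA-0 / LCCD)
    variant 2: epsilon_IJ removed          (type 1 EPV terms)
    variant 1: epsilon_IJ plus half the semi-joint pair energies
               (type 1 and type 2 EPV terms)
    """
    e_ij = pair_energy(pair_energies, i, j)
    if variant == 0:
        return 0.0
    if variant == 2:
        return e_ij
    if variant == 1:
        # Doubly primed sum: K runs over occupied orbitals other than I and J
        semi_joint = sum(pair_energy(pair_energies, i, k) +
                         pair_energy(pair_energies, j, k)
                         for k in orbitals if k not in (i, j))
        return e_ij + 0.5 * semi_joint
    raise ValueError("variant must be 0, 1, or 2")

# Invented pair energies for three occupied spin orbitals
pairs = {(0, 1): -0.03, (0, 2): -0.01, (1, 2): -0.02}
total = sum(pairs.values())
for variant in (0, 2, 1):
    factor = total - cepa_shift(pairs, 0, 1, [0, 1, 2], variant)
    print("CEPA-%d factor multiplying t for pair (0,1): %.3f" % (variant, factor))
```

For these numbers the factor multiplying the amplitude shrinks in magnitude in the order CEPA-0, CEPA-2, CEPA-1, which is the ordering used in the qualitative argument that follows.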
We can again draw some qualitative conclusions from the form of the
various CEPA approximations. The magnitude of the term approximated in CEPA-0,
CEPA-2, and CEPA-1 decreases in that order, and we know that CID would be
obtained if this term were set to zero (recall that we have not yet cancelled the energy
from the RHS here). Hence we might expect that the computed correlation energy
will increase from CID in the order CEPA-1, CEPA-2, CEPA-0. This is what is
commonly found in practice. The CCD result is usually found between CEPA-1 and
CEPA-2, so the latter appears to be something of an overestimate and the former
an underestimate, at least for systems with little nondynamical correlation. Where
there is strong nondynamical correlation none of the CEPA methods is very reliable.
For real many-electron systems (as opposed to the noninteracting two-electron
systems case) CEPA-1 is approximately invariant to unitary transformations that do not
affect CCD or CID, while CEPA-2 can show significant differences.
The CEPA methods have been perhaps the most widely used approximate
treatments that can be viewed directly as approximations to the CCD or CCSD
equations. They were also the earliest widely applicable size-extensive treatments
that were accurate through at least the third order of perturbation theory. Although
the independent-pair approximations to Sinanoglu's cluster expansion [7] were
size-extensive by construction, they were in error in the third order of perturbation theory
because of the neglect of matrix elements coupling the pairs. Initially the CEPA
methods were viewed with a mixture of disdain and mistrust, sometimes with a dash
of outright hostility. This was partly because the desirability of having a size-extensive
treatment was not well understood, and was largely ignored. Indeed, Davidson's
correction, which we shall treat in Sec. 6.6 below, was the first treatment of size-
extensivity that won any sort of general acceptance, perhaps because it was trivial
to compute and thus did not require understanding any new physics in order to
implement it. Another factor that undoubtedly told against the CEPA methods,
however, was that there were simply too many varieties. People began to suspect
that a new CEPA-n method was devised each time the existing n - 1 methods failed.
This was unfounded, but it merely reflects human nature. CEPA-3 was an average
of CEPA-1 and CEPA-2, founded largely on the notion that CCD results, when they
became available for comparison, fell between CEPA-1 and CEPA-2. CEPA-4 and
CEPA-5 were introduced by Koch and Kutzelnigg [89] after examination of various
terms in the CCD equations. In retrospect, recommending one CEPA method, say,
CEPA-1, would have been a better strategy, but these days the issue has become
irrelevant.
In parallel with the development of the CEPA methods by Meyer, Kutzelnigg,
and co-workers, Hurley [4] derived a number of similar approximations. In Hurley's
notation, LCCD is CPA$_0$, while CEPA-2 is CPA' and CEPA-1 is CPA". Actually,
Hurley's work was originally formulated for spin-orbital pairs, that is, true electron
pairs. The CEPA methods were introduced for spin-coupled pairs. As Taylor and co-
workers showed [20], there are subtleties and ambiguities associated with transforming
their CPA' and CPA" approximations to spin-coupled pairs. This is inevitable - for
spin-orbitals $I$ and $J$ an excitation from $IJIJ$ would always be EPV. But for spatial
orbitals $i$ and $j$, a quadruple excitation from $ijij$ need not be EPV, as in an excitation
like $i\bar{i}j\bar{j}$. This determinant is a legal quadruple excitation. Hence the EPV terms
become mixed with non-EPV terms.
Finally, several efforts have been made to develop a multireference CEPA.
Siegbahn [90], Hoffman and Simons [71], and Fulde and Stoll [91], for example, have
implemented such methods, although they have received little use. Some aspects of
multireference CEPA methods are discussed in the next section.

6.5 Coupled-pair functional methods

Another approach to modifying a method like CISD to obtain size-extensive results is
based on considering the variational energy functional for CISD and trying to modify
it. It is convenient to write the CI wave function, using the spin-coupled pairs of
Sec. 3.2, as
\[
\Psi_{\mathrm{CI}} = \Psi_0 + \sum_P \Psi_P, \tag{6.21}
\]
where the index P denotes either one-particle

\[
\Psi_P = \sum_a \Psi_i^a c_i^a \tag{6.22}
\]

or two-particle
\[
\Psi_P = \sum_{a \geq b} \Psi_{ij}^{ab} c_{ij}^{ab} \tag{6.23}
\]

correlation functions. (The labels P, etc., are taken to indicate the spin-coupling of
the double excitations here, as well as the orbitals from which electrons are excited.)
The CI correlation energy functional is then

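\[
\begin{aligned}
\epsilon_{\mathrm{CI}} &= \Big\langle \big(\Psi_0 + \sum_P \Psi_P\big) \Big| H - E_0 \Big| \big(\Psi_0 + \sum_P \Psi_P\big) \Big\rangle
   \Big[1 + \sum_P \langle \Psi_P | \Psi_P \rangle\Big]^{-1} \\
 &= \Big[2\sum_P \langle \Psi_0 | H - E_0 | \Psi_P \rangle\Big] \Big[1 + \sum_R \langle \Psi_R | \Psi_R \rangle\Big]^{-1} \\
 &\quad + \Big[\sum_{P,Q} \langle \Psi_P | H - E_0 | \Psi_Q \rangle\Big] \Big[1 + \sum_R \langle \Psi_R | \Psi_R \rangle\Big]^{-1}
\end{aligned}
\tag{6.24}
\]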

The strategy for obtaining size-extensivity is related to the observation that cancel-
lation of terms in the numerator by part of the normalization denominator produces
size-extensive results in the separated-pair case. Ahlrichs and co-workers [92] sug-
gested a modified functional - the coupled-pair functional (CPF) - given by

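\[
\epsilon_{\mathrm{CPF}} = 2\sum_P \langle \Psi_0 | H - E_0 | \Psi_P \rangle\, N_P^{-1}
  + \sum_{P,Q} \langle \Psi_P | H - E_0 | \Psi_Q \rangle\, [N_P N_Q]^{-1/2}, \tag{6.25}
\]

where
\[
N_P = 1 + \sum_R T_{PR} \langle \Psi_R | \Psi_R \rangle, \tag{6.26}
\]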

and T is a symmetric matrix. The only change from the CI energy functional is the
incorporation of the weighting factors $T_{PR}$ ("topological factors" in the terminology
of Ahlrichs and co-workers [92]) into the normalization denominators. How do we
specify values for these factors? We note that the choice $T_{PQ} = 1$ for all $PQ$ recovers
the CISD functional, while inspection will show that the choice $T_{PQ} = 0$ everywhere
in fact yields LCCSD/CEPA-0. From the previous section an intermediate choice
should give the best results. For the case that $P$ represents excitations from spatial
orbitals $i$ and $j$ and $Q$ from $k$ and $l$, Ahlrichs and co-workers chose
\[
T_{PQ} = \frac{\delta_{ik} + \delta_{il}}{2n_i} + \frac{\delta_{jk} + \delta_{jl}}{2n_j}. \tag{6.27}
\]

Here $n_i$ is the occupation of space orbital $i$ in $\Psi_0$. (Single excitations are accounted
for using the formula for $P = ii$.) These occupation numbers allow the use of the CPF
method with open-shell SCF reference wave functions, although in this case care is
required to differentiate between those single excitations that are single spin-orbital
excitations from $\Psi_0$, and which therefore have vanishing matrix elements with $\Psi_0$
by Brillouin's theorem, and those that are space-orbital single excitations but spin-
orbital double excitations, which are treated as double excitations [92].
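Eq. 6.27 is easy to evaluate directly, and a few values show how the factors interpolate between the CISD and CEPA-0 limits. The sketch below is illustrative only: the function name, the tuple representation of a pair, and the dictionary of occupation numbers are assumptions of this example, not part of any CPF program.

```python
def topological_factor(pair_p, pair_q, occupation):
    """Evaluate T_PQ of Eq. 6.27 for pairs P = (i, j) and Q = (k, l).

    occupation maps a spatial orbital label to its occupation number
    in the reference (2 for a doubly occupied orbital).
    """
    i, j = pair_p
    k, l = pair_q

    def kron(a, b):
        # Kronecker delta
        return 1.0 if a == b else 0.0

    return ((kron(i, k) + kron(i, l)) / (2.0 * occupation[i]) +
            (kron(j, k) + kron(j, l)) / (2.0 * occupation[j]))

occ = {0: 2, 1: 2, 2: 2}  # three doubly occupied spatial orbitals
print(topological_factor((0, 0), (0, 0), occ))  # same two-electron pair
print(topological_factor((0, 0), (1, 1), occ))  # disjoint pairs
print(topological_factor((0, 1), (1, 2), occ))  # pairs sharing one orbital
```

For a closed-shell reference the factor is 1 when both excitations come from the same doubly occupied orbital and 0 when the pairs share no orbital, so separated two-electron systems each see only their own normalization.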
The weighting factor definitions in Eq. 6.27 were chosen for two reasons: they
provide size-extensive results in the sense of noninteracting two-electron systems,
and for this case the results are independent of a unitary transformation on the
occupied or virtual spaces. However, the results are not generally invariant to such a
transformation, although in most investigations the lack of invariance is very small.
These weighting factors also yield size-extensive results for noninteracting systems of
any size, provided orbitals localized on each system are used. The weighting factors
are not necessarily uniquely defined even with the specified requirements, as discussed
by Ahlrichs and co-workers, but Eq. 6.27 seems to be the simplest choice.
The fact that CPF is size-extensive for the noninteracting two-electron systems
independent of mixing among occupied or among virtual orbitals is reminiscent of the
CEPA-1 method described in the previous section. Indeed, CPF restricted to double
excitations only is equivalent to CEPA-1 with doubles. Differences arise if singles are
included because of different definitions.
One advantage of the CPF method is that it is very straightforward to program
if a CISD code is available, as discussed by Ahlrichs and co-workers. It can readily be
incorporated into direct CISD codes with little effort. The only consequence of any
significance is that convergence of the iterative process may be slower, especially where
size-extensivity effects are large, or where nondynamical correlation is important. The
CPF functional is bounded from below, but not necessarily by the true energy [92].
That is, the CPF "energy" cannot go to $-\infty$; it must remain finite, but it can fall
below the exact (full CI) energy. Such "nonvariational" results are only rarely seen,
however. In any event, the method is to be preferred to CEPA-0 in this respect, since
the latter method is not bounded and is occasionally observed to undergo complete
collapse.
As we have stated, the CPF method is not restricted to closed-shell reference
functions, but can be used with restricted open-shell reference functions. (I am un-
aware of any UHF-based CPF program, although it would be a relatively simple mat-
ter to develop one.) Unlike spin-adapted open-shell CC implementations, open-shell
CPF requires essentially the same computational effort as the corresponding CISD.
Hence the method has been fairly popular for use in open-shell systems. For complete
generality, of course, we would like to be able to handle multireference cases. It is
not at all obvious how to do this with CPF as originally developed, because as with
many other pair-based methods it is not clear how to define "pairs" for a multiconfig-
urational reference function. Gdanitz and Ahlrichs [93] realized that one solution to
this would be to develop weighting factors that are not dependent on the individual
electron pairs - that is, to require all $T_{PQ}$ to be identical. By analogy again with the
noninteracting two-electron systems they suggested a normalization denominator of

\[
N_P = 1 + \frac{2}{N} \sum_R \langle \Psi_R | \Psi_R \rangle, \tag{6.28}
\]

where $N$ is the number of electrons correlated. This would correspond to an
average $T_{PQ}$ value of 1/2. One bonus of this averaged coupled-pair functional (ACPF)
method is that since the weighting factors have no dependence on the pairs of MOs,
the method is rigorously invariant to mixing among occupied or among virtual or-
bitals. The other bonus is that with no reference to individual pairs, the method is
readily generalized to multireference calculations. Full details are given by Gdanitz
and Ahlrichs, but we include a brief description here.
The multireference wave function is written as

\[
\Psi = \Psi_0 + \Psi_a + \Psi_e, \tag{6.29}
\]

where $\Psi_0 = \sum_R \Psi_R c_R$ is the reference function, intermediately normalized. $\Psi_e$
comprises all single and double excitations out of the set of $\Psi_R$, and $\Psi_a$ are configurations
with no external orbitals that are orthogonal to $\Psi_0$. By analogy with CPF the ACPF
energy functional is

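\[
\epsilon_{\mathrm{ACPF}} = \frac{\langle \Psi_0 + \Psi_a + \Psi_e | H - E_0 | \Psi_0 + \Psi_a + \Psi_e \rangle}
  {1 + g_a \langle \Psi_a | \Psi_a \rangle + g_e \langle \Psi_e | \Psi_e \rangle}. \tag{6.30}
\]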
The case $g_a = g_e = 1$ corresponds to MRCI, while Gdanitz and Ahlrichs termed the
method with $g_a = g_e = 0$ MRCEPA-0. The linearized multireference CC method of
Laidig and Bartlett [66] is obtained by dropping $\Psi_a$ from the functional completely,
and setting $g_e = 0$. Finally, MRACPF is obtained by using the averaged factor
of $2/N$ obtained above for $g_e$, and $g_a = 1$. This choice (in fact, probably any choice) for $g_a$
is somewhat arbitrary. Gdanitz and Ahlrichs reasoned that for a CASSCF reference
space there is no need for a size-extensivity correction for the internal configurations,
since they comprise a full CI, and that most multireference calculations would be
close to CASSCF reference anyway (this does not seem to be altogether realistic).
Setting $g_a = 1$ should be safe, since it is likely to result in an underestimate rather
than an overestimate.
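The effect of the factor in the normalization denominator can be seen in a single-reference toy model. The sketch below is an illustration under stated assumptions, not the multireference method itself: it takes a number of identical noninteracting two-electron systems, one double excitation per system with a common coefficient c, a hypothetical coupling K and excitation energy delta, no internal configurations, and a single factor g in the denominator. All names and parameter values are invented for the example.

```python
import math

def functional(c, n_pairs, K, delta, g):
    """Energy functional for n_pairs equivalent doubles with common coefficient c."""
    numerator = n_pairs * (2.0 * K * c + delta * c * c)
    denominator = 1.0 + g * n_pairs * c * c
    return numerator / denominator

def stationary_coefficient(n_pairs, K, delta, g):
    """Solve K + delta*c - g*n_pairs*K*c**2 = 0, the stationarity condition
    of the functional above, for the root that vanishes as K -> 0."""
    a = g * n_pairs * K
    if a == 0.0:
        return -K / delta
    return (delta - math.sqrt(delta**2 + 4.0 * a * K)) / (2.0 * a)

def stationary_energy(n_pairs, K, delta, g):
    c = stationary_coefficient(n_pairs, K, delta, g)
    return functional(c, n_pairs, K, delta, g)

K, delta, n_pairs = 0.1, 1.0, 10
n_electrons = 2 * n_pairs
exact = n_pairs * (0.5 * delta - math.sqrt(0.25 * delta**2 + K**2))
print("g = 1   (CI-like):     ", stationary_energy(n_pairs, K, delta, 1.0))
print("g = 2/N (ACPF-like):   ", stationary_energy(n_pairs, K, delta, 2.0 / n_electrons))
print("g = 0   (CEPA-0-like): ", stationary_energy(n_pairs, K, delta, 0.0))
print("exact:                 ", exact)
```

In this model g = 1 reproduces the CID energy, g = 0 the linearized result, and g = 2/N the exact energy for any number of systems, which is the noninteracting two-electron analogy invoked for the averaged factor.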

As can be seen from the analysis above, the ACPF method has more "damping"
in the normalization denominator than linearized multireference CC methods, and
seems to behave better. The reader interested in this topic should note that ACPF
is considerably more sensitive to the choice of reference configurations than is MRCI,
and this can become a major issue in ensuring satisfactory convergence of the ACPF
iterations. More details are given in other courses at this school. We should also note
that there are several treatments developed as multireference perturbation theory that
are very close indeed to ACPF [94-96]. The method of Cave and Davidson [94,95],
for example, corresponds to setting $g_a = 1$ and $g_e = 0$.

In 1986 Chong and Langhoff [97] pointed out that in systems with
significant nondynamical correlation the CPF method tends to overestimate the effect of
higher excitations. In particular, for several small transition-metal diatomics the
size-extensivity correction was overestimated so much that the CPF results were not
necessarily better than the CISD results. They suggested a modification of the
definition of Ahlrichs and co-workers' weight factors $T_{PQ}$, replacing the geometric mean
in the denominator of the last term of Eq. 6.25 with a constant plus the arithmetic
mean. This naturally leads to a larger denominator, and hence a reduction in the
effect of higher excitations on the energy. There are some other subtleties that the
interested reader can follow up in the original reference to this modified coupled-pair
functional (MCPF) method. Implementation of the MCPF equations in a direct
CI-like form is not straightforward: Chong and Langhoff in fact modified a conven-
tional Hamiltonian matrix-driven code to obtain their first results, and a direct CI
implementation was first accomplished by Blomberg and Siegbahn [98].

The MCPF method undoubtedly out-performs CPF where nondynamical
correlation is important, at least as far as can be judged by empirical comparisons.
However, it has one significant disadvantage: its lack of invariance to rotations that
mix occupied orbitals (or virtual orbitals) among themselves. Unlike CPF, which is
formally invariant by construction for certain special cases, and which is empirically
observed to be close to invariant in practice, the MCPF method can give considerably
different results depending on the choice of orbitals. This is a particular problem if
properties like vibrational frequencies are to be calculated, since the energy may
appear to change discontinuously as geometric parameters are varied and orbitals localize.
MCPF can be very useful, but users must be very careful.

6.6 Davidson's correction and extensions

We have left the simplest approximate treatment until last. This is a formula
published by Langhoff and Davidson [99], in which a CISD correlation energy was
modified by adding $\Delta E$, where
\[
\Delta E = \epsilon_{\mathrm{CISD}} (1 - c_0^2). \tag{6.31}
\]
Here $c_0$ is the coefficient of the reference configuration in the normalized CISD
expansion. The inclusion of the correction is commonly denoted by a suffix +Q, so that
the CISD+Q correlation energy, for example, would be

\[
\epsilon_{\mathrm{CISD+Q}} = \epsilon_{\mathrm{CISD}} + \Delta E = \epsilon_{\mathrm{CISD}} (2 - c_0^2). \tag{6.32}
\]


This "Davidson correction" to the correlation energy was derived by a perturbation
theoretic argument. An analysis of the CI energy using perturbation theory shows
that in fourth order the CI energy includes a term $-E_2\langle 1|1\rangle$ that involves the
second-order energy $E_2$ and the first-order wave function $|1\rangle$. This "renormalization term"
would be cancelled by terms arising from the disconnected quadruples, were these to
be included in the calculation. It could therefore be argued that the CI energy could be
improved by dropping this renormalization term. If we approximate $\langle 1|1\rangle$ by $1 - c_0^2$,
we then obtain the Davidson correction Eq. 6.31. We note that the perturbation
wave function should actually be given in intermediate normalized form, so that even
if we identify the CI coefficients with the first-order perturbation wave function, the
correction of Eq. 6.31 is not correct. This was pointed out by Siegbahn [100], in a very
elegant and pedagogical study of the Davidson correction. Taking the intermediate
normalization into account we get
$$\Delta E = E_{\mathrm{CISD}}\,\frac{1 - c_0^2}{c_0^2}. \tag{6.33}$$

Obviously, this correction will be larger in magnitude than the original Davidson
correction. Martin and co-workers [101] have suggested the term "renormalized Davidson
correction" for Eq. 6.33.
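As a minimal numerical sketch of Eqs. 6.31 to 6.33, the two corrections can be evaluated from a CISD correlation energy and the reference coefficient. The function names and the input values below are hypothetical and chosen only for illustration; they are not taken from any calculation in these notes.

```python
def davidson_correction(e_corr_cisd, c0):
    """Original Davidson correction, Eq. 6.31: E_CISD * (1 - c0^2)."""
    return e_corr_cisd * (1.0 - c0 ** 2)


def renormalized_davidson_correction(e_corr_cisd, c0):
    """Renormalized Davidson correction, Eq. 6.33: E_CISD * (1 - c0^2) / c0^2."""
    return e_corr_cisd * (1.0 - c0 ** 2) / c0 ** 2


# Illustrative (made-up) inputs: CISD correlation energy in hartree and the
# coefficient of the reference configuration in the normalized CISD expansion.
e_cisd = -0.200
c0 = 0.97

dq = davidson_correction(e_cisd, c0)
dq_renorm = renormalized_davidson_correction(e_cisd, c0)

print(f"Davidson correction:              {dq:.6f}")
print(f"Renormalized Davidson correction: {dq_renorm:.6f}")
print(f"CISD+Q correlation energy:        {e_cisd + dq:.6f}")
```

Since $c_0^2 < 1$, dividing by $c_0^2$ always makes the renormalized correction the larger of the two in magnitude, as stated in the text.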
Ahlrichs [102] has given an alternative derivation of the original Davidson
correction that provides additional insight. We write the CISD correlation energy in
the usual expectation value form as
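$$E_{\mathrm{CISD}} = \frac{\langle\Psi_{\mathrm{CISD}}|H - E_0|\Psi_{\mathrm{CISD}}\rangle}{\langle\Psi_{\mathrm{CISD}}|\Psi_{\mathrm{CISD}}\rangle}. \tag{6.34}$$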
Then the original Davidson correction is

(6.35)
and by inserting the CISD correlation energy in this expression we find that

(6.36)

That is, the Davidson correction cancels the normalization denominator from the
correlation energy. In effect, we have the value of an energy functional
$\langle\Psi|H - E_0|\Psi\rangle$ evaluated with the CISD wave function. This is in fact no more than the
energy functional for the LCCSD method. Thus if we solve iteratively for the LCCSD
amplitudes, beginning with CISD as the initial guess, the first iteration will be the
traditional Davidson correction.
Two important observations should be made about the Davidson and renor-
malized Davidson corrections. First, we can see that we will obtain a size-extensivity
correction for a two-electron system, even though there should be none. Thus these
corrections behave inappropriately in the limit of very small numbers of electrons.
Obviously, one can avoid making any correction for the two-electron case, so there
is no real difficulty there. But what of four electrons, for example? Presumably the
correction will be too large, but by how much? It seems likely that the renormalized
correction, which must be larger in magnitude, will behave worse than the original
correction. Second, the corrections do not yield a completely size-extensive result,
since they do not account for all of the non size-extensive terms in a CISD calculation.
Hence they will become increasingly inaccurate as the number of electrons increases.
Siegbahn [100] derived higher-order supplements to the original corrections, but these
have been relatively little used.
Pople and co-workers [103] have derived another correction, stressing the need
for a vanishing size-extensivity correction in the limit of two electrons. Davidson
and Silver [104] introduced yet another correction. Neither of these corrections has
found widespread use. In general, the most commonly used correction appears to
be the original Davidson correction, followed a long way behind by the renormalized
Davidson correction. The interested reader may care to consult Ref. 101 for a detailed
comparison of different Davidson-type corrections.
One of the virtues of the Davidson correction is that it can be extended heuristically
to the case of a multireference CI. The two requirements are to define the
correlation energy, and the reference weight $c_0^2$. Let us assume we have a set of reference
configurations, indexed by $R$, and a reference energy

$$E_{\mathrm{REF}} = \langle\Psi_{\mathrm{REF}}|H|\Psi_{\mathrm{REF}}\rangle, \tag{6.37}$$

where the normalized reference wave function $\Psi_{\mathrm{REF}}$ is given by

$$\Psi_{\mathrm{REF}} = \sum_R |R\rangle\, c_R^{\mathrm{REF}}. \tag{6.38}$$

The "correlation energy" can be defined, more or less unambiguously, as the difference
$E_{\mathrm{MRCI}} - E_{\mathrm{REF}}$, where $E_{\mathrm{MRCI}}$ is the MRCI energy. Now, when we compute the
MRCI energy, we will (usually) vary the coefficients of the reference configurations as
well as those of the correlating configurations. Hence the coefficients of the reference
configurations in the MRCI wave function, $c_R^{\mathrm{MRCI}}$, will usually differ by more than just
a renormalization from the $c_R^{\mathrm{REF}}$, which makes the definition of the reference weight
somewhat problematic. The first definition in common use is simply to use the $c_R^{\mathrm{MRCI}}$
to define the reference weight [105], so that

$$\Delta E = (E_{\mathrm{MRCI}} - E_{\mathrm{REF}})\Bigl\{1 - \sum_R (c_R^{\mathrm{MRCI}})^2\Bigr\}. \tag{6.39}$$

The MRCI+Q energy would be obtained by adding this correction to the MRCI
energy. This is probably the more common multireference Davidson correction. But
alternatives are possible. One approach is to obtain the reference weight by projecting
the MRCI wave function onto the original reference wave function. This would give

$$\Delta E = (E_{\mathrm{MRCI}} - E_{\mathrm{REF}})\Bigl\{1 - \sum_R (c_R^{\mathrm{MRCI}} c_R^{\mathrm{REF}})\Bigr\}. \tag{6.40}$$

This is not the end of the story, either, since if the reference configurations were
selected from a CASSCF calculation, there is some scope for asserting that $c_R^{\mathrm{CASSCF}}$
(with an obvious definition) should be used in place of $c_R^{\mathrm{REF}}$ here. Indeed, for a
CASSCF reference space (in which case the distinction between $c_R^{\mathrm{CASSCF}}$ and $c_R^{\mathrm{REF}}$
is irrelevant, of course), we can view Eq. 6.40 as a "first iteration" of the linearized
multireference CC method of Laidig and Bartlett [66], discussed in Chapter 5 and
in Sec. 6.5 above. This is analogous to the relationship between the single-reference
Davidson correction and the linearized CCSD method. In this sense the correc-
tion Eq. 6.40 has a formal basis in theory, whereas the correction Eq. 6.39 is purely
heuristic. Nevertheless, full CI comparisons, for example, suggest that Eq. 6.39 is a
more reliable correction. We may expect it to yield a correction that is smaller in
magnitude than Eq. 6.40: both corrections tend to overshoot somewhat (especially for
CASSCF reference spaces) and so the latter overshoots more. More detailed discus-
sion of multireference size-extensivity corrections, including the MRACPF method,
is given elsewhere at this school.
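The more common multireference correction, Eq. 6.39, needs only the two energies and the coefficients of the reference configurations in the MRCI wave function. The following is a minimal sketch; the function name and all numbers are hypothetical placeholders rather than results of a real calculation.

```python
import numpy as np


def mr_davidson_correction(e_mrci, e_ref, c_ref_in_mrci):
    """Multireference Davidson correction in the form of Eq. 6.39.

    c_ref_in_mrci holds the coefficients of the reference configurations
    in the normalized MRCI wave function.
    """
    reference_weight = float(np.sum(np.asarray(c_ref_in_mrci) ** 2))
    return (e_mrci - e_ref) * (1.0 - reference_weight)


# Illustrative (made-up) values, energies in hartree.
e_ref = -100.000
e_mrci = -100.250
c_ref_in_mrci = [0.90, -0.30, 0.20]   # reference weight = 0.94

delta_e = mr_davidson_correction(e_mrci, e_ref, c_ref_in_mrci)
print(f"MRCI+Q energy: {e_mrci + delta_e:.6f}")
```

With a single reference configuration the same function reduces to the single-reference formula of Eq. 6.31.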
Afterword

The aim of these notes was to provide a solid foundation for the understanding of
modern coupled-cluster methods and their relatives. They are certainly not com-
plete: apart from the decision to exclude diagrammatic methods from the formalism,
we have not discussed coupled-cluster energy derivatives and properties, nor have we
considered propagator or response methods based on CC treatments. These areas
represent important and valuable applications of coupled-cluster theory. The inter-
ested reader can find much on these subjects in the literature. Nor have we included
much about numerical applications of the methods. However, this area is covered in
some detail in other courses at this school.
Bibliography

[1] J. A. Pople, J. S. Binkley, and R. Seeger, Int. J. Quantum Chem. Symp. 10, 1 (1976).

[2] R. J. Bartlett and G. D. Purvis, Int. J. Quantum Chem. 14, 561 (1978).

[3] A. C. Hurley, J. E. Lennard-Jones, and J. A. Pople, Proc. Roy. Soc. A220, 100 (1953).

[4] A. C. Hurley, Electron correlation in small molecules, Academic Press, London, 1976.

[5] H. Primas, in Modern Quantum Chemistry, edited by O. Sinanoğlu, Academic Press, New York, 1965.

[6] O. Sinanoğlu, J. Chem. Phys. 36, 706 (1962).

[7] O. Sinanoğlu, J. Chem. Phys. 36, 3198 (1962).

[8] P. Jørgensen and J. Simons, Second Quantization-based Methods in Quantum Chemistry, Academic Press, New York, 1981.

[9] F. Coester and H. Kümmel, Nucl. Phys. 17, 477 (1960).

[10] J. Čížek, J. Chem. Phys. 45, 4256 (1966).

[11] F. E. Harris, Int. J. Quantum Chem. Symp. 11, 403 (1977).

[12] R. J. Bartlett, J. Phys. Chem. 93, 1697 (1989).

[13] J. Čížek, Adv. Chem. Phys. 14, 35 (1969).

[14] J. Noga and R. J. Bartlett, J. Chem. Phys. 86, 7041 (1987).

[15] R. J. Bartlett and J. Noga, Chem. Phys. Lett. 150, 29 (1988).

[16] R. J. Bartlett, S. A. Kucharski, and J. Noga, Chem. Phys. Lett. 155, 133 (1989).

[17] R. J. Bartlett, J. D. Watts, S. A. Kucharski, and J. Noga, Chem. Phys. Lett. 165, 513 (1990).

[18] G. D. Purvis and R. J. Bartlett, J. Chem. Phys. 76, 1910 (1982).

[19] J. Čížek, Int. J. Quantum Chem. 5, 359 (1971).

[20] P. R. Taylor, G. B. Bacskay, N. S. Hush, and A. C. Hurley, Chem. Phys. Lett. 41, 444 (1976).

[21] B. Roos and P. Siegbahn, in Modern Theoretical Chemistry, edited by H. F. Schaefer, Plenum Press, New York, London, 1977.

[22] P. Pulay, S. Saebø, and W. Meyer, J. Chem. Phys. 81, 1904 (1984).

[23] G. E. Scuseria, A. C. Scheiner, T. J. Lee, J. E. Rice, and H. F. Schaefer, J. Chem. Phys. 86, 2881 (1987).

[24] T. J. Lee and J. E. Rice, Chem. Phys. Lett. 150, 406 (1988).

[25] G. E. Scuseria, C. L. Janssen, and H. F. Schaefer, J. Chem. Phys. 89, 7382 (1988).

[26] G. D. Purvis and R. J. Bartlett, J. Chem. Phys. 75, 1284 (1981).

[27] T. J. Lee, J. E. Rice, G. E. Scuseria, and H. F. Schaefer, Theoret. Chim. Acta 75, 81 (1989).

[28] T. J. Lee and P. R. Taylor, Int. J. Quantum Chem. Symp. 23, 199 (1989).

[29] D. Jayatilaka and T. J. Lee, J. Chem. Phys. 98, 9734 (1993).

[30] J. Paldus, J. Čížek, and I. Shavitt, Phys. Rev. A 5, 50 (1972).

[31] H. J. Monkhorst, Int. J. Quantum Chem. Symp. 11, 421 (1977).

[32] J. Paldus, J. Chem. Phys. 61, 303 (1977).

[33] J. Paldus, J. Čížek, M. Saute, and A. Laforgue, Phys. Rev. A 17, 805 (1978).

[34] P. R. Taylor, G. B. Bacskay, N. S. Hush, and A. C. Hurley, J. Chem. Phys. 69, 1971 (1978).

[35] P. R. Taylor, G. B. Bacskay, N. S. Hush, and A. C. Hurley, J. Chem. Phys. 70, 4481 (1979).

[36] J. A. Pople, R. Krishnan, H. B. Schlegel, and J. S. Binkley, Int. J. Quantum Chem. 14, 545 (1978).

[37] V. R. Saunders, unpublished work.

[38] W. Meyer, J. Chem. Phys. 64, 2901 (1976).

[39] R. A. Chiles and C. E. Dykstra, Chem. Phys. Lett. 80, 69 (1981).

[40] Y. S. Lee, S. A. Kucharski, and R. J. Bartlett, J. Chem. Phys. 81, 5906 (1984).

[41] TITAN is a set of electronic structure programs written by T. J. Lee, A. P. Rendell, and J. E. Rice.

[42] ACES II is a coupled-cluster and MBPT electronic structure program. Written mainly by J. F. Stanton, J. Gauss, J. D. Watts, W. J. Lauderdale, and R. J. Bartlett, it also includes contributions from J. Almlöf, T. Helgaker, H. J. Aa. Jensen, P. Jørgensen, and P. R. Taylor.

[43] S. A. Kucharski and R. J. Bartlett, Adv. Quantum Chem. 18, 281 (1986).

[44] G. E. Scuseria and H. F. Schaefer, Chem. Phys. Lett. 152, 382 (1988).

[45] G. E. Scuseria, T. P. Hamilton, and H. F. Schaefer, J. Chem. Phys. 92, 568 (1990).

[46] G. E. Scuseria and T. J. Lee, J. Chem. Phys. 93, 5851 (1990).

[47] R. J. Bartlett, H. Sekino, and G. D. Purvis, Chem. Phys. Lett. 98, 66 (1983).

[48] M. Urban, J. Noga, S. J. Cole, and R. J. Bartlett, J. Chem. Phys. 83, 4041 (1985).

[49] K. Raghavachari, G. W. Trucks, J. A. Pople, and M. Head-Gordon, Chem. Phys. Lett. 157, 479 (1989).

[50] K. Raghavachari, J. A. Pople, E. S. Replogle, and M. Head-Gordon, J. Phys. Chem. 94, 5579 (1990).

[51] H. Sekino and R. J. Bartlett, Int. J. Quantum Chem. Symp. 18, 255 (1984).

[52] H. Koch, H. J. A. Jensen, P. Jørgensen, and T. Helgaker, J. Chem. Phys. 93, 3345 (1990).

[53] C. L. Janssen and H. F. Schaefer, Theor. Chim. Acta 79, 1 (1991).

[54] M. Rittby and R. J. Bartlett, J. Phys. Chem. 92, 3033 (1988).

[55] G. E. Scuseria, Chem. Phys. Lett. 176, 27 (1991).

[56] K. Wolinski and P. Pulay, J. Chem. Phys. 90, 3647 (1989).

[57] R. D. Amos, J. S. Andrews, N. C. Handy, and P. J. Knowles, Chem. Phys. Lett. 185, 256 (1991).

[58] P. J. Knowles, J. S. Andrews, R. D. Amos, N. C. Handy, and J. A. Pople, Chem. Phys. Lett. 186, 130 (1991).

[59] T. J. Lee and D. Jayatilaka, Chem. Phys. Lett. 201, 1 (1993).

[60] D. Jayatilaka and T. J. Lee, Chem. Phys. Lett. 199, 211 (1992).

[61] B. Jeziorski and J. Paldus, J. Chem. Phys. 90, 2714 (1989).

[62] P.-O. Löwdin, J. Chem. Phys. 19, 1396 (1951).

[63] D. Mukherjee, R. K. Moitra, and A. Mukhopadhyay, Mol. Phys. 33, 955 (1977).

[64] I. Lindgren, Int. J. Quantum Chem. Symp. 12, 33 (1978).

[65] B. Jeziorski and H. J. Monkhorst, Phys. Rev. A 24, 1668 (1981).

[66] W. D. Laidig and R. J. Bartlett, Chem. Phys. Lett. 104, 424 (1984).

[67] L. Meissner, S. A. Kucharski, and R. J. Bartlett, J. Chem. Phys. 91, 6187 (1989).

[68] L. Meissner and R. J. Bartlett, J. Chem. Phys. 92, 561 (1990).

[69] A. Banerjee and J. Simons, Int. J. Quantum Chem. 19, 207 (1981).

[70] A. Banerjee and J. Simons, J. Chem. Phys. 76, 4548 (1982).

[71] M. R. Hoffmann and J. Simons, J. Chem. Phys. 88, 993 (1988).

[72] H. Nakatsuji, J. Chem. Phys. 83, 713 (1985).

[73] K. Hirao, J. Chem. Phys. 95, 3589 (1991).

[74] B. H. Brandow, Adv. Quantum Chem. 10, 187 (1977).

[75] J. Paldus, J. Čížek, and B. Jeziorski, J. Chem. Phys. 93, 1485 (1990).

[76] J. A. Pople, M. Head-Gordon, and K. Raghavachari, J. Chem. Phys. 87, 5968 (1987).

[77] T. J. Lee, A. P. Rendell, and P. R. Taylor, J. Phys. Chem. 94, 5463 (1990).

[78] J. Paldus, J. Čížek, and B. Jeziorski, J. Chem. Phys. 90, 4356 (1989).

[79] J. A. Pople, M. Head-Gordon, and K. Raghavachari, J. Chem. Phys. 90, 4635 (1989).

[80] N. C. Handy, J. A. Pople, M. Head-Gordon, K. Raghavachari, and G. W. Trucks, Chem. Phys. Lett. 184, 185 (1989).

[81] C. Hampel, K. A. Peterson, and H.-J. Werner, Chem. Phys. Lett. 190, 1 (1992).

[82] H.-J. Werner and P. J. Knowles, J. Chem. Phys. 89, 5803 (1988).

[83] T. J. Lee, R. Kobayashi, N. C. Handy, and R. D. Amos, J. Chem. Phys. 96, 8931 (1992).

[84] J. Paldus, J. Čížek, and M. Takahashi, Phys. Rev. A 30, 2193 (1984).

[85] H. P. Kelly, Adv. Chem. Phys. 14, 129 (1969).

[86] W. Meyer, J. Chem. Phys. 58, 1017 (1973).

[87] R. Ahlrichs, H. Lischka, V. Staemmler, and W. Kutzelnigg, J. Chem. Phys. 62, 1225 (1975).

[88] W. Kutzelnigg, in Modern Theoretical Chemistry, edited by H. F. Schaefer, Plenum Press, New York, London, 1977.

[89] S. Koch and W. Kutzelnigg, Theoret. Chim. Acta 59, 387 (1981).

[90] P. Siegbahn, unpublished work.

[91] P. Fulde and H. Stoll, J. Chem. Phys. 97, 4185 (1992).

[92] R. Ahlrichs, P. Scharf, and C. Ehrhardt, J. Chem. Phys. 82, 890 (1985).

[93] R. J. Gdanitz and R. Ahlrichs, Chem. Phys. Lett. 143, 413 (1988).

[94] R. J. Cave and E. R. Davidson, J. Chem. Phys. 88, 5770 (1988).

[95] R. J. Cave and E. R. Davidson, J. Chem. Phys. 89, 6798 (1988).

[96] K. Tanaka, T. Sakai, and H. Terashima, Theoret. Chim. Acta 76, 213 (1989).

[97] D. P. Chong and S. R. Langhoff, J. Chem. Phys. 84, 5606 (1986).

[98] M. Blomberg and P. Siegbahn, unpublished work.

[99] S. R. Langhoff and E. R. Davidson, Int. J. Quantum Chem. 8, 61 (1974).

[100] P. E. M. Siegbahn, Chem. Phys. Lett. 55, 386 (1978).

[101] J. M. L. Martin, J. P. François, and R. Gijbels, Chem. Phys. Lett. 172, 346 (1990).

[102] R. Ahlrichs, Comput. Phys. Commun. 17, 31 (1979).

[103] J. A. Pople, R. Seeger, and R. Krishnan, Int. J. Quantum Chem. Symp. 11, 149 (1977).

[104] E. R. Davidson and D. W. Silver, Chem. Phys. Lett. 52, 403 (1977).

[105] M. R. A. Blomberg and P. E. M. Siegbahn, J. Chem. Phys. 78, 5682 (1983).


Methods of Relativistic Quantum Chemistry

Andrzej J. Sadlej
Department of Theoretical Chemistry
University of Lund, Sweden

May 26, 1994

1. Introduction
The consideration of the electronic structure of atoms and molecules at the level of
relativistic quantum mechanics is a rather new area of quantum chemistry. The growing interest in
relativistic methods for electronic structure calculations is strongly linked to the developments
in the chemistry of heavy atom compounds and their use in industry. Moreover, there are a number
of chemical observations which show that for heavy atom compounds the interpretation of their
electronic structure and properties cannot be achieved without a relativistic treatment.
The relativistic theory of atoms and molecules appears to be far more challenging than the
well established non-relativistic methods based on the Schrödinger equation. There are a number
of very basic unsolved problems which make working with relativistic theories interesting
and rewarding. The corresponding computational methods are still far from being as well
developed as those based on the non-relativistic theory.
In both classical and quantum mechanics the relativistic approach originates from the
observation that the transfer of information between different points in space requires a finite
time. If the interaction between particles is of electromagnetic nature, the speed of its transfer
is given by the velocity of light, $c$. The importance of this limit increases with the velocity $v$ of
the motion of particles and can be recognized by considering the ratio:

$$\beta = \frac{v}{c}, \tag{1.1}$$

which varies between 0 (particle at rest) and 1 (light quanta). In atomic units, which are used
throughout this text, the speed of light is approximately equal to 137.036 and it is worthwhile
to consider the range of velocities of electrons in typical systems.
It follows from the virial theorem that the average total energy $E = \langle E\rangle$ is equal to the
negative of the average kinetic energy $\langle T\rangle$. For the ground state of a hydrogen-like ion of
nuclear charge $Z$ one has (in atomic units):

$$E = -\tfrac{1}{2}Z^2 = -\langle T\rangle \tag{1.2}$$

and

(1.3)

so that

(1.4)

Thus, the non-relativistic results for the energy of the 1s electron in the hydrogen-like ion give
$\beta$ equal to $Z/c$, which is indeed small for light atoms. This estimate shows that the importance of
relativistic effects will increase in parallel with the nuclear charge of the atoms involved
in the given system.
A useful estimate of the magnitude of relativistic effects on energies follows from the relativistic
formula for the energy of a particle of rest mass m and velocity v:

(1.5)

where

(1.6)

On expanding $\gamma$ into a power series in $\beta$ one obtains from Eq. (1.5):

$$E = mc^2 + \frac{1}{2}mv^2 + \frac{3}{8}mv^2\frac{v^2}{c^2} + \cdots = mc^2 + \frac{1}{2}mv^2\Bigl(1 + \frac{3}{4}\beta^2 + \cdots\Bigr) \tag{1.7}$$

and, apart from the additive rest mass energy $mc^2$, the leading relativistic contribution to the energy
(1.7) is of the order of $\beta^2$. Already for $Z = 40$ this leading term will change the particle energy
by about 7 per cent.
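The size of this estimate is easy to tabulate. The sketch below assumes the hydrogen-like 1s value $\beta = Z/c$ obtained above and evaluates the leading relative correction $\frac{3}{4}\beta^2$ from Eq. (1.7); the function names are illustrative only.

```python
C_AU = 137.036  # speed of light in atomic units, as quoted in the text


def beta_1s(Z):
    """Ratio v/c for the 1s electron of a hydrogen-like ion, beta = Z/c."""
    return Z / C_AU


def leading_relativistic_fraction(Z):
    """Leading relative correction to the kinetic energy, (3/4) * beta^2, Eq. (1.7)."""
    return 0.75 * beta_1s(Z) ** 2


for Z in (1, 10, 40, 80):
    print(f"Z = {Z:3d}   beta = {beta_1s(Z):.4f}   "
          f"(3/4) beta^2 = {100 * leading_relativistic_fraction(Z):.2f} %")
```

For $Z = 40$ this simple formula gives a correction of several per cent, of the same size as the figure quoted in the text, and the correction grows quadratically with the nuclear charge.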
This estimate gives some feeling for the magnitude and importance of relativistic contributions
to energies. In a similar way one can also consider the relativistic effect on average distances,
which are measured in units of the Bohr radius ($a_0$). According to its definition $a_0$ is proportional
to the inverse of the electron rest mass $m_e$. The mass $m$ of an electron moving with
velocity $v$ is:

(1.8)

and distances are therefore expected to be changed by a factor of $\sqrt{1 - \beta^2} < 1$. This is known as the relativistic
contraction effect. However, the relativistic change in the shape of wave functions must not
violate the mutual orthogonality conditions. As a consequence some atomic shells may
contract while others expand. One should also remark that the electron spin
is a consequence of relativity. Thus, all magnetic interactions between electron spins and other
magnetic fields will arise in a natural way from relativistic considerations.
In principle the relativistic framework is the only one which provides the right description of
physics. One learns in the theory of relativity that valid equations of physics must have the
same form in all inertial coordinate systems. In other words all valid equations of physics must
be covariant with respect to the space-time transformations linking different inertial coordinate
systems and preserving the space-time interval:

(1.9)

where $r$ and $t$ are the space and time separations between two points in the 4-dimensional
space-time coordinate system. The requirement (1.9) is satisfied by what is known as the Lorentz
transformation and the principle of validity of equations used to describe physical phenomena
can be formulated in terms of their covariance with respect to this transformation. This feature
is obviously missing in the case of the Schrodinger equation which is approximately valid as long
as the velocities of particles are small compared to c.
The relativistic quantum theory is built on the basis of the relativistic classical mechanics
and commutation rules for the coordinate and momentum operators. Traditionally, the theory
is first developed for one electron (particle) and leads to what is known as the Dirac equation
which is a relativistic substitute for the Schrödinger equation. The consequences of the Dirac
equation are analysed, leading to the particle-hole interpretation, and provide a basis for quantum
electrodynamics, i.e., the relativistic theory of many-electron systems. In the present series of
lectures only the most important elements of general theory will be given. The main attention is
focused on the ways in which relativistic and approximately relativistic theories are being used
in quantum chemistry. Most details are being skipped and can be found in monographs and
review articles.

2. Relativistic theory of one-particle systems

2.1 Dirac equation for a free particle


The relativistic energy of a free particle of rest mass $m$ can be written in terms of its
momentum $p$:

(2.1 )

and can be considered as a classical relativistic Hamiltonian. The quantization proceeds in the
usual way by replacing the classical variables by operators. By applying this method directly to
Eq. (2.1) one would obtain a relativistic Hamiltonian operator involving a square root of operators,
which is undefined. Dirac assumed that the relativistic equation of motion should have the
same form as the Schrodinger equation, i.e.,

(2.2)

with the relativistic Hamiltonian H deduced from Eq. (2.1). To avoid undefined operators one
applies the so-called linearization procedure to the initial energy operator,

(2.3)

which follows from Eq. (2.1). The relativistic Hamiltonian operator is assumed to be linear in
momentum operators and written as:

$$H = c\,\boldsymbol{\alpha}\mathbf{p} + \beta mc^2. \tag{2.4}$$

Here $\boldsymbol{\alpha} = (\alpha_x, \alpha_y, \alpha_z)$ is a 3-dimensional vector of constants and $\beta$ is another constant, both to
be determined. The operator $\mathbf{p}$ is the usual 3-component momentum operator vector,
$\mathbf{p} = (-i\nabla_x, -i\nabla_y, -i\nabla_z)$, and the product $\boldsymbol{\alpha}\mathbf{p}$ is understood as the scalar product in 3-dimensional
vector space:

(2.5)

By requiring that the square of the operator (2.4) is equal to the square of the initial operator
(2.3) one concludes that both $\boldsymbol{\alpha}$ and $\beta$ are 4x4 matrices, i.e., operators acting in a 4-dimensional
space of the wave function components. Hence, the solutions $\Psi$ of the Dirac equation for a free
particle

(2.6)

must be 4-component vectors,

(2.7)

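The wave function of the form (2.7) is referred to as a (4-component) spinor while in the so-called
standard representation

$$\alpha_x = \begin{pmatrix} 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix},\quad
\alpha_y = \begin{pmatrix} 0 & 0 & 0 & -i\\ 0 & 0 & i & 0\\ 0 & -i & 0 & 0\\ i & 0 & 0 & 0 \end{pmatrix},\quad
\alpha_z = \begin{pmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1\\ 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0 \end{pmatrix} \tag{2.8}$$

and

$$\beta = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{2.9}$$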

are hermitian matrix operators acting in the space of 4-component spinors. They satisfy the
following relations:
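$$\beta^2 = 1, \quad \alpha_k\beta + \beta\alpha_k = 0, \quad \alpha_k\alpha_l + \alpha_l\alpha_k = 2\delta_{kl}, \tag{2.10}$$

for $k, l = x, y, z$, with 1 and 0 being 4x4 unit and zero matrices, respectively. The Kronecker
symbol $\delta_{kl}$ has the meaning of either the unit ($k = l$) or the zero ($k \neq l$) 4x4 matrix. The matrices
$\boldsymbol{\alpha}$ and $\beta$ can be written in terms of auxiliary 2x2 matrices $0$, $I$:

$$0 = \begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} \tag{2.11}$$

(2.12)

Thus,

$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma}\\ \boldsymbol{\sigma} & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} I & 0\\ 0 & -I \end{pmatrix}. \tag{2.13}$$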
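The algebraic relations (2.10) can be checked numerically. The sketch below assumes that the 2x2 blocks entering the standard representation are the usual Pauli spin matrices, builds the 4x4 matrices in block form, and tests hermiticity and the anticommutation relations; all variable names are illustrative.

```python
import numpy as np

# Pauli spin matrices (assumed here to be the 2x2 blocks of the standard representation).
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)


def alpha_from(sigma):
    """Off-diagonal block matrix with sigma in the upper-right and lower-left blocks."""
    return np.block([[Z2, sigma], [sigma, Z2]])


alphas = [alpha_from(s) for s in (sigma_x, sigma_y, sigma_z)]
beta = np.block([[I2, Z2], [Z2, -I2]])
I4 = np.eye(4)


def anticommutator(a, b):
    return a @ b + b @ a


# beta^2 = 1, {alpha_k, beta} = 0, {alpha_k, alpha_l} = 2 delta_kl
relations_hold = bool(np.allclose(beta @ beta, I4))
for k, a_k in enumerate(alphas):
    relations_hold = relations_hold and bool(np.allclose(anticommutator(a_k, beta), 0))
    for l, a_l in enumerate(alphas):
        expected = 2 * I4 if k == l else np.zeros((4, 4))
        relations_hold = relations_hold and bool(np.allclose(anticommutator(a_k, a_l), expected))

hermitian = all(np.allclose(m, m.conj().T) for m in alphas + [beta])

print("anticommutation relations hold:", relations_hold)
print("all four matrices hermitian:   ", hermitian)
```

These relations are exactly what makes the square of the linear Hamiltonian (2.4) reduce to a multiple of the unit matrix, with no cross terms between different momentum components or between the momentum and mass terms.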

The Dirac equation (2.6) can be understood as a set of four linear first-order differential
equations for components of (2.7). As long as the Dirac Hamiltonian is time-independent the
time-dependence of the wave function can be factorized out leading to the time-independent
Dirac equation. For a free particle this equation assumes the following form:

$$H\Phi = E\Phi, \tag{2.14}$$

where $H$ is given by (2.4), $\Phi$ is a time-independent 4-component spinor:

(2.15)

and $E$ denotes the particle energy. By solving Eq. (2.14) one learns that its eigenvalue spectrum
consists of two continua: $-\infty < E < -mc^2$ and $+mc^2 < E < +\infty$, separated by a gap of $2mc^2$.
This form of the spectrum brings certain problems with respect to the interpretation of states of
the relativistic free particle and led Dirac to propose the existence of the positron. To avoid
problems arising from the negative energy continuum Dirac assumed that in what is referred to
as the vacuum all those states are occupied by electrons. Then, the observed free electron will
have to occupy one of the positive energy eigenstates, will have positive energy, positive mass,
and will carry a negative charge. A hole in the negative continuum corresponds to a particle
with positive energy and mass which carries a positive charge. In this way the one-particle
relativistic theory becomes a many-particle theory; the negative energy continuum can contribute
to energies of positive energy particles via the so-called virtual excitations (polarization of
the vacuum). The relativistic many-particle theory which handles this problem is known as quantum
electrodynamics and in principle enables a full relativistic treatment of many-particle systems.
Starting from the Dirac equation (2.6) and its hermitian conjugate one can derive the
continuity equation which brings about definitions of the charge density,

$$\rho = e\Psi^\dagger\Psi = e\sum_{i=1}^{4}\psi_i^*\psi_i, \tag{2.16}$$

and Cartesian components $j_k$, $k = x, y, z$, of the current density vector $\mathbf{j}$:

$$j_k = ec\,\Psi^\dagger\alpha_k\Psi, \tag{2.17}$$

where $\Psi^\dagger$ is a one-row vector of the form:

(2.18)

Most of the physical interpretation of the behavior of relativistic particles can be carried out in terms
of the charge and current densities defined by Eqs. (2.16) and (2.17).
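A link between non-relativistic and relativistic theories can be established by considering
the following block form of the Dirac equation, i.e.,

$$\begin{pmatrix} [mc^2 - E]I & c(\boldsymbol{\sigma}\mathbf{p})\\ c(\boldsymbol{\sigma}\mathbf{p}) & -[mc^2 + E]I \end{pmatrix}\begin{pmatrix} u^L\\ u^S \end{pmatrix} = 0, \tag{2.19}$$

where

$$u^L = \begin{pmatrix} u_1\\ u_2 \end{pmatrix}, \quad u^S = \begin{pmatrix} u_3\\ u_4 \end{pmatrix} \tag{2.20}$$

are referred to as the large and small components of the Dirac spinor, respectively. Eq. (2.19)
is simply a set of two (2x2)-dimensional matrix equations:

$$\begin{aligned}
(mc^2 - E)\,u^L + c(\boldsymbol{\sigma}\mathbf{p})\,u^S &= 0\\
c(\boldsymbol{\sigma}\mathbf{p})\,u^L - (mc^2 + E)\,u^S &= 0
\end{aligned} \tag{2.21}$$

of which the second one can be solved with respect to $u^S$:

$$u^S = \frac{c}{mc^2 + E}(\boldsymbol{\sigma}\mathbf{p})\,u^L. \tag{2.22}$$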
On substituting this result into the first equation one obtains:

(2.23)

For a particle obeying non-relativistic mechanics its non-relativistic energy $E$ is positive and
small compared to $mc^2$. Thus, in such a case,

(2.24)

and the non-relativistic limit of (2.23) becomes

$$\Bigl[-E + \frac{1}{2m}(\boldsymbol{\sigma}\mathbf{p})(\boldsymbol{\sigma}\mathbf{p})\Bigr]u^L = 0. \tag{2.25}$$

With the aid of the identity

$$(\boldsymbol{\sigma}\mathbf{p})(\boldsymbol{\sigma}\mathbf{p}) = p^2 \tag{2.26}$$

one finds a two component equation for $u^L$,

$$\Bigl(\frac{1}{2m}p^2 - E\Bigr)u^L = 0, \tag{2.27}$$

which closely resembles the Schrödinger equation for a free particle. Thus, the large component
$u^L$ can be regarded as a 2-dimensional non-relativistic limit ($c \to \infty$) of the 4-component
solution. Additionally, on combining Eqs. (2.22) and (2.24) one obtains:

$$u^S \approx \frac{1}{2mc}(\boldsymbol{\sigma}\mathbf{p})\,u^L, \tag{2.28}$$

i.e., the values of $u^S$ are expected to be smaller than those of $u^L$ by a factor of about $p/2mc$. This result
underlies the terminology introduced in Eqs. (2.19) and (2.20). Eq. (2.27) obtained as the
non-relativistic limit of the Dirac equation (2.19) has solutions of the form:

(2.29)

where $u$ solves the usual free-particle Schrödinger equation and $c_1$, $c_2$ are arbitrary constants.
Through the analysis of solutions for a particle moving in a magnetic field one can associate the
two components of (2.29) with two possible directions of the magnetic moment of the particle.
Thus, Dirac's equation can describe particles whose magnetic moment has two energetically
different orientations in an external magnetic field, i.e., it can describe particles with spin quantum
number of $\frac{1}{2}$. On combining this fact with experimental data one concludes that Dirac's equation
is appropriate for describing electrons and positrons.
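The free-particle results of this section can be illustrated numerically for a plane wave with a fixed momentum vector, for which the momentum operator acts as a set of three numbers. The sketch below assumes Pauli matrices for $\boldsymbol{\sigma}$, uses atomic units with $m = 1$, and picks an arbitrary illustrative momentum. It checks the identity (2.26), the two-branch spectrum $\pm\sqrt{c^2p^2 + m^2c^4}$, and the small-to-large component ratio suggested by Eq. (2.28).

```python
import numpy as np

c = 137.036                      # speed of light in atomic units
m = 1.0                          # electron rest mass in atomic units
p = np.array([0.3, -0.2, 0.5])   # arbitrary illustrative momentum (atomic units)

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)

# Scalar product (sigma p) as a 2x2 matrix.
sigma_p = sum(p_k * s_k for p_k, s_k in zip(p, sigma))

# Identity (2.26): (sigma p)(sigma p) = p^2 times the 2x2 unit matrix.
identity_ok = bool(np.allclose(sigma_p @ sigma_p, np.dot(p, p) * I2))

# Free-particle Dirac Hamiltonian in block form, acting on (u^L, u^S).
H = np.block([[m * c**2 * I2, c * sigma_p],
              [c * sigma_p, -m * c**2 * I2]])
energies, vectors = np.linalg.eigh(H)   # eigenvalues in ascending order
e_exact = np.sqrt(c**2 * np.dot(p, p) + m**2 * c**4)

# Small/large ratio for a positive-energy solution, compared with p/(2mc).
u = vectors[:, -1]
ratio = np.linalg.norm(u[2:]) / np.linalg.norm(u[:2])
estimate = np.linalg.norm(p) / (2 * m * c)

print("identity (2.26) holds:", identity_ok)
print("eigenvalues:", energies)
print("small/large ratio:", ratio, " estimate p/(2mc):", estimate)
```

Each energy appears twice, reflecting the two spin orientations, and the positive and negative branches are separated by at least $2mc^2$.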
Dirac's equation for a free particle brings about most of the interpretation of relativistic
quantum mechanics and the basic terminology. It also provides a starting point for the relativistic
theory of many-electron (many-particle) systems. Although such a theory can be developed, its
use is limited by a number of mathematical problems. Thus, the so-called relativistic methods
of quantum chemistry are in most cases based on Dirac's equation for one-particle systems.
The development of relativistic quantum chemistry parallels that of the non-relativistic
methods and the theory is built in the framework of the one-electron (relativistic) approximation.
Solutions of the Dirac equation for one-electron hydrogen-like systems play the key role in
devising relativistic methods in quantum chemistry of many-electron atoms and molecules in
exactly the same way as the corresponding solutions of the Schrödinger equation do in the
non-relativistic case.

2.2 Relativistic particle in external fields


In this section we shall generalize our previous considerations to the case of a particle moving
in time-independent electric ($\mathbf{E}$) and magnetic ($\mathbf{H}$) fields defined through the scalar ($\phi$) and
vector ($\mathbf{A}$) potentials:

$$\mathbf{E} = -\mathrm{grad}\,\phi, \tag{2.30}$$

$$\mathbf{H} = \mathrm{rot}\,\mathbf{A}. \tag{2.31}$$

The classical relativistic energy expression for a particle of mass m and charge e (both in atomic
units) moving in the field given by Eqs. (2.30) and (2.31) is:

(2.32)

Following the method used in Section (2.1) we obtain the Dirac Hamiltonian

$$H = c\,\boldsymbol{\alpha}\boldsymbol{\pi} + e\phi I + \beta mc^2 \tag{2.33}$$

for a particle moving in external electric and magnetic fields, where

$$\boldsymbol{\pi} = \mathbf{p} - \frac{e}{c}\mathbf{A}, \tag{2.34}$$

and $I$ is a 4x4 unit matrix. The general analysis of solutions of the corresponding time-dependent
and time-independent Dirac equations follows that presented in Section (2.1) for a free particle.
There are two particular cases which deserve a more detailed analysis. The first one follows
from assuming that

$$\mathbf{H} = 0 \quad\text{and}\quad \phi = \frac{Z}{r}, \tag{2.35}$$

where $Z$ is the charge (in atomic units) carried by the source of electrostatic potential at the
distance $r$ from the particle under consideration. This case corresponds to the relativistic theory of the
hydrogen-like ion with a point-like nucleus of charge $Z$. The second case of particular interest
is the non-relativistic limit of the Dirac equation for a particle in electric and magnetic fields.
The analysis of terms arising from the separation of the large component of the Dirac spinor
brings about the notion of spin and shows how non-relativistic approaches can be approximately
corrected for relativistic terms.

2.2.1 The hydrogen-like ion


Let us first briefly consider the Dirac equation for a hydrogen-like ion with point-like nucleus
of charge $Z$ in the absence of magnetic fields. In terms of $\sigma$ matrices the corresponding Dirac
Hamiltonian ($m = 1$, $e = -1$) reads:

(2.36)

where the matrix $I$ is defined in Eq. (2.11). The spectrum of the corresponding Dirac equation
consists of three regions: (i) a continuum of negative energy states extending from $-\infty$ to $-c^2$, (ii)
a continuum of positive energy states extending from $c^2$ to $\infty$, and (iii) a discrete spectrum of
states embedded in the gap between the two continua just below the bottom of the positive energy
continuum.
The problem of the negative energy continuum is resolved in the same way as for a free
electron. The negative energy continuum is assumed to be completely filled with electrons,
forming thus a reference vacuum. The positive energy continuum corresponds to energy levels
above the ionization potential of the hydrogen-like ion, i.e., it represents a free electron moving
in the field of the point-like positive charge. The discrete spectrum refers to discrete energy
levels of the hydrogen-like ion.
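To give a feeling for where the discrete levels lie, the sketch below uses the well-known closed-form ground-state energy of the point-nucleus Dirac problem, $E_{1s} = c^2\sqrt{1 - Z^2/c^2}$ in atomic units. This formula is quoted here as a standard result and is not derived in the text above. The binding energy measured from the bottom of the positive continuum, $E_{1s} - c^2$, is compared with the non-relativistic value $-Z^2/2$.

```python
import math

C_AU = 137.036  # speed of light in atomic units


def dirac_1s_binding_energy(Z, c=C_AU):
    """Dirac 1s energy of a point-nucleus hydrogen-like ion, relative to c^2.

    Uses the standard closed form E = c^2 * sqrt(1 - (Z/c)^2), quoted rather
    than derived here; valid for Z < c.
    """
    return c**2 * (math.sqrt(1.0 - (Z / c) ** 2) - 1.0)


def schrodinger_1s_energy(Z):
    """Non-relativistic 1s energy, -Z^2/2."""
    return -0.5 * Z**2


for Z in (1, 20, 40, 80):
    e_rel = dirac_1s_binding_energy(Z)
    e_nr = schrodinger_1s_energy(Z)
    print(f"Z = {Z:3d}   Dirac: {e_rel:12.5f}   Schrodinger: {e_nr:12.5f}   "
          f"difference: {e_rel - e_nr:10.5f}")
```

The level stays inside the gap, slightly below the positive continuum, and the relativistic lowering relative to $-Z^2/2$ grows rapidly with the nuclear charge.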
To gain some idea about solutions of the hydrogen-like Dirac problem let us note that the
usual angular momentum operator:

$$\mathbf{L} = (\mathbf{r}\times\mathbf{p})\,I, \tag{2.37}$$

where I is a 4x4 unit matrix, does not commute with the Hamiltonian (2.36). However, the
operator:

(2.38)

where

$$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\sigma} & 0\\ 0 & \boldsymbol{\sigma} \end{pmatrix} \tag{2.39}$$

does commute with (2.36) and its components satisfy all commutation rules for the angular
momentum operators:

$$[J_i, J_j] = iJ_k, \tag{2.40}$$

with $(i, j, k)$ corresponding to a cyclic permutation of labels $(x, y, z)$. Moreover,

$$[H, J_i] = 0, \tag{2.41}$$

for $i = x, y, z$. Thus, the operator (2.38) can be interpreted as the total angular momentum
operator for a relativistic particle moving in the central field. The eigenequation for $J^2$ is:

$$J^2 f(\theta, \varphi) = j(j + 1) f(\theta, \varphi), \tag{2.42}$$

with $j = l \pm \frac{1}{2}$, where $l = 0, 1, 2, \ldots$ is the usual angular momentum quantum number as known
from the non-relativistic theory. The eigenfunction $f(\theta, \varphi)$ denotes a 4-component spinor which
depends on spherical angles $\theta$ and $\varphi$.
To solve the time-independent Dirac equation for the hydrogen-like problem let us introduce
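radial components $\alpha_r$ and $p_r$ for the $\boldsymbol{\alpha}$ and $\mathbf{p}$ operators:

$\alpha_r = \frac{(\boldsymbol{\alpha}\mathbf{r})}{r}$, (2.43)

$p_r = -i\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)$. (2.44)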

With the aid of these operators the Dirac Hamiltonian H for the hydrogen-like problem can be
written as:

$H = c\,\alpha_r p_r + \frac{ic}{r}\,\alpha_r \beta K + \beta c^2 - \frac{Z}{r}\, I$, (2.45)

where the matrix operator

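$K = \beta\,[(\boldsymbol{\Sigma}\mathbf{L}) + 1]$ (2.46)

commutes with $\beta$, $\alpha_r$, and H of Eq. (2.36) and can be used to classify the spectrum of H. Let
us note that:

$K^2 = \mathbf{J}^2 + \tfrac{1}{4}$, (2.47)

and its eigenvalues will be equal to:

$j(j+1) + \tfrac{1}{4} = (j + \tfrac{1}{2})^2$. (2.48)

Thus,

$K\Omega = \kappa\Omega$, (2.49)

where

$\kappa = \pm(j + \tfrac{1}{2}) = \pm 1, \pm 2, \pm 3, \ldots$, (2.50)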

and the radial part of the Dirac Hamiltonian (2.45) can be written in the following form:

$H = c\,\alpha_r p_r + \frac{ic}{r}\,\alpha_r \beta \kappa + \beta c^2 - \frac{Z}{r}\, I$. (2.51)

The solution of this equation results finally in the following form of components of the Dirac
spinor (2.15):

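for $j = l + \tfrac{1}{2}$

$u_1 = N_1^+\, g(r)\, Y_{l,\,m_j-\frac{1}{2}}(\theta,\varphi)$
$u_2 = -N_2^+\, g(r)\, Y_{l,\,m_j+\frac{1}{2}}(\theta,\varphi)$
$u_3 = -iN_3^+\, f(r)\, Y_{l+1,\,m_j-\frac{1}{2}}(\theta,\varphi)$
$u_4 = -iN_4^+\, f(r)\, Y_{l+1,\,m_j+\frac{1}{2}}(\theta,\varphi)$ (2.52)

for $j = l - \tfrac{1}{2}$

$u_1 = N_1^-\, g(r)\, Y_{l,\,m_j-\frac{1}{2}}(\theta,\varphi)$
$u_2 = N_2^-\, g(r)\, Y_{l,\,m_j+\frac{1}{2}}(\theta,\varphi)$
$u_3 = -iN_3^-\, f(r)\, Y_{l-1,\,m_j-\frac{1}{2}}(\theta,\varphi)$
$u_4 = -iN_4^-\, f(r)\, Y_{l-1,\,m_j+\frac{1}{2}}(\theta,\varphi)$ (2.53)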

where $N_i^+$ and $N_i^-$, i = 1, 2, 3, 4, denote numerical normalization factors, and f(r) and g(r) solve
the radial Dirac equation for the given value of the quantum number $\kappa$ of Eq. (2.50).
The radial equation with the Hamiltonian (2.51) can be solved exactly for functions g(r)
and f(r) in terms of confluent hypergeometric functions. These solutions depend on quantum
numbers n and $j = l \pm \tfrac{1}{2}$ and can be expressed as a product of a decaying exponential function
of r multiplied by a terminating power series in r. Although only three quantum numbers are
needed to fully determine the state of the electron in the hydrogen-like ion, the eigenfunctions of
the corresponding Dirac equation are usually characterized by the following set of four quantum
numbers:
(i) The principal quantum number, $n = N + |\kappa| = 1, 2, \ldots$.
(ii) The azimuthal quantum number, $l = 0, 1, 2, \ldots, n - 1$, whose value is usually identified by
alphabetic symbols s, p, d, ....
(iii) The total angular momentum quantum number, $j = l \pm \tfrac{1}{2}$, $j > 0$, whose value is given as
a subscript to the alphabetic state symbol.
(iv) The magnetic quantum number, $m_j = -j, -j + 1, \ldots, j - 1, j$.
The existence of normalizable radial solutions leads to the following expression for the energy of
discrete states:

$E = mc^2 \left[ 1 + \left( \frac{Z/c}{n - |\kappa| + \sqrt{\kappa^2 - Z^2/c^2}} \right)^2 \right]^{-1/2}$, (2.54)

where $N = n - |\kappa| = 0, 1, 2, \ldots$ plays the role of the radial quantum number and $\kappa$ is
given by Eq. (2.50). On expanding Eq. (2.54) into a series of inverse powers of c the following
result is obtained:

$E = mc^2 - \frac{Z^2}{2n^2} \left[ 1 + \frac{Z^2}{n c^2} \left( \frac{1}{|\kappa|} - \frac{3}{4n} \right) + \ldots \right]$, (2.55)

where $n = N + |\kappa| = 1, 2, \ldots$ is the non-relativistic principal quantum number.
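As a quick numerical check of Eqs. (2.54) and (2.55), the following Python sketch evaluates both expressions in atomic units. The function names and the value used for c (approximately 137.036) are assumptions made for illustration; they do not come from the text.

```python
import math

C = 137.035999  # assumed value of the speed of light in atomic units


def dirac_energy(n, kappa, Z, c=C):
    """Energy of Eq. (2.54) in atomic units (m = 1), rest energy c**2 included."""
    N = n - abs(kappa)  # radial quantum number N = 0, 1, 2, ...
    if kappa == 0 or N < 0:
        raise ValueError("need kappa != 0 and |kappa| <= n")
    gamma = math.sqrt(kappa**2 - (Z / c) ** 2)
    return c**2 / math.sqrt(1.0 + ((Z / c) / (N + gamma)) ** 2)


def expanded_energy(n, kappa, Z, c=C):
    """Expansion (2.55), truncated after the term of order 1/c**2."""
    k = abs(kappa)
    return c**2 - Z**2 / (2 * n**2) * (1 + Z**2 / (n * c**2) * (1 / k - 3 / (4 * n)))


if __name__ == "__main__":
    # Print binding energies (rest energy subtracted) for a few hydrogen levels.
    for label, n, kappa in [("1s1/2", 1, -1), ("2p1/2", 2, 1), ("2p3/2", 2, -2)]:
        exact = dirac_energy(n, kappa, Z=1) - C**2
        approx = expanded_energy(n, kappa, Z=1) - C**2
        print(f"{label}: exact {exact:.10f}  expanded {approx:.10f}")
```

For Z = 1 the two expressions agree closely, and the $2p_{1/2}$ level lies below $2p_{3/2}$ by roughly $Z^4/(32c^2)$, which is the j-dependence carried by the $1/|\kappa|$ term of (2.55).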


The dependence of energy (2.54) on the value $j = l \pm \tfrac{1}{2}$ through the quantum number $\kappa$
introduces for l > 0 a splitting of levels for the given l. Thus, for instance, the relativistic np
levels are not fully degenerate as they were in the non-relativistic case but split into $np_{1/2}$ and
$np_{3/2}$; the lower value of j corresponding to the lower energy. This can be understood in terms
of the coupling between orbital angular momentum and spin angular momentum.
The theory of the Dirac equation for hydrogen-like ions brings most of the conceptual back-
ground and terminology used in relativistic theories of many-electron systems. Solutions of the
Dirac equation for the Coulomb field problem will be referred to as relativistic orbitals and can
be written in terms of 2-component (Pauli) spinors used in Eq. (2.29):

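$\psi_{n\kappa m} = \begin{pmatrix} G_{n\kappa}(r)\,\Omega_{+\kappa,m}(\theta,\varphi) \\ F_{n\kappa}(r)\,\Omega_{-\kappa,m}(\theta,\varphi) \end{pmatrix}$, (2.56)

where n is the principal quantum number, the value of $\kappa$ is defined in terms of l,

$l = \begin{cases} -\kappa - 1 = j - \tfrac{1}{2} & \text{if } \kappa < 0 \\ \kappa = j + \tfrac{1}{2} & \text{if } \kappa > 0 \end{cases}$, (2.57)

and related to j through: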

(2.58)

Finally, $m = m_j$ is one of the possible magnetic quantum numbers for the given value of j. The
radial functions $G_{n\kappa}(r)$ and $F_{n\kappa}(r)$ are the same for both components of the given 2-component
spinor $\Omega$. The 2-component spinors $\Omega$ are defined through the coupling between orbital and
spin momenta:

$\Omega_{\kappa m}(\theta,\varphi) = \sum_{m_s=-\frac{1}{2}}^{+\frac{1}{2}} (l\; m - m_s\; \tfrac{1}{2}\; m_s \,|\, l\; \tfrac{1}{2}\; j\; m)\; Y_{l,\,m-m_s}(\theta,\varphi)\; \lambda_{m_s}$, (2.59)

where the products under the summation sign consist of the Clebsch-Gordan coefficients
$(l\; m - m_s\; \tfrac{1}{2}\; m_s \,|\, l\; \tfrac{1}{2}\; j\; m)$, the usual spherical harmonics $Y_{l,\,m-m_s}(\theta,\varphi)$, and 2-component spinors:

(2.60)
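The bookkeeping behind Eqs. (2.57) and (2.59) can be sketched in a few lines of Python. The helper names are hypothetical, and the closed-form coefficients assume the usual Condon-Shortley phase convention for coupling l with spin 1/2, which is not stated explicitly in the text.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def l_and_j(kappa):
    """Return (l, j) for a given kappa, following Eqs. (2.50) and (2.57)."""
    if kappa == 0:
        raise ValueError("kappa must be a non-zero integer")
    j = abs(kappa) - HALF                    # |kappa| = j + 1/2
    l = kappa if kappa > 0 else -kappa - 1   # Eq. (2.57)
    return l, j


def cg_spin_half(l, j, m):
    """Clebsch-Gordan coefficients coupling l and spin 1/2 to (j, m).

    Returns the pair multiplying Y_{l, m-1/2} (spin up) and Y_{l, m+1/2}
    (spin down); Condon-Shortley phases are assumed.
    """
    if abs(m) > j:
        raise ValueError("need |m| <= j")
    a = math.sqrt(float((l + m + HALF) / (2 * l + 1)))
    b = math.sqrt(float((l - m + HALF) / (2 * l + 1)))
    if j == l + HALF:
        return a, b
    if j == l - HALF:
        return -b, a
    raise ValueError("j must equal l + 1/2 or l - 1/2")


if __name__ == "__main__":
    for kappa in (-1, 1, -2, 2, -3):
        l, j = l_and_j(kappa)
        print(f"kappa = {kappa:+d}: l = {l}, j = {j}")
```

The squared coefficients for a given (j, m) add up to one, and the pairs for $j = l + \tfrac{1}{2}$ and $j = l - \tfrac{1}{2}$ are mutually orthogonal, as required of spin-angular functions with the same l and m.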

The 4-component spinors of the form (2.56) are used to build basis sets of one-electron func-
tions in relativistic calculations for atoms and molecules. In this context one should note
(see Eqs. (2.21) and (2.22)) that the small ($u^S = F_{n\kappa}(r)\,\Omega_{-\kappa,m}(\theta,\varphi)$) and the large
($u^L = G_{n\kappa}(r)\,\Omega_{+\kappa,m}(\theta,\varphi)$) components of the Dirac spinor (2.56) are mutually related:

$u^S = c\,\frac{1}{c^2 + \phi + E}\,(\boldsymbol{\sigma}\mathbf{p})\,u^L$. (2.61)

This relation should be satisfied also in the case of the approximate form of the large and small
component which is obtained, e.g., by the truncated basis set expansion. It is only then that
the usual Schrödinger equation can be derived in the non-relativistic limit. To obtain this limit
one has to use the identity (2.26), which brings the non-relativistic kinetic energy term. This
requirement is usually referred to as the kinetic energy balance condition and will be satisfied in
approximate calculations if the small component basis set is appropriately related to that used
for the large component.
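The non-relativistic limit referred to above can be traced in a few lines. The following is a sketch under two assumptions that are not spelled out here: that E in Eq. (2.61) is the total energy including the rest energy $c^2$, and that the identity (2.26) is $(\boldsymbol{\sigma}\mathbf{p})(\boldsymbol{\sigma}\mathbf{p}) = p^2$.

```latex
% Sketch only: E' = E - c^2 denotes the energy relative to the rest energy.
\begin{align*}
u^S &= \frac{c}{2c^2 + \phi + E'}\,(\boldsymbol{\sigma}\mathbf{p})\,u^L
     \;\approx\; \frac{1}{2c}\,(\boldsymbol{\sigma}\mathbf{p})\,u^L
     \qquad (|\phi + E'| \ll 2c^2), \\
0 &= (-\phi - E')\,u^L + c\,(\boldsymbol{\sigma}\mathbf{p})\,u^S
   \;\approx\; \left[\tfrac{1}{2}(\boldsymbol{\sigma}\mathbf{p})(\boldsymbol{\sigma}\mathbf{p}) - \phi - E'\right] u^L , \\
&\Rightarrow\quad \left[\tfrac{1}{2}p^2 - \phi\right] u^L = E'\,u^L .
\end{align*}
```

The second line uses the large-component equation of the Dirac problem without magnetic field. If the small-component basis cannot represent $(\boldsymbol{\sigma}\mathbf{p})u^L$, the first line fails and the kinetic energy term $\tfrac{1}{2}p^2$ is not recovered.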
A detailed treatment of the Dirac equation for hydrogen-like ions can be found in several
quantum mechanics textbooks and monographs (see References). A useful qualitative presenta-
tion of the relativistic theory for hydrogen-like ions, including graphs of relativistic functions,
has been given by Powell (see References).

2.2.2 One-Electron Atom in External Fields


Let the electron move in external electric and magnetic fields as defined by Eqs. (2.30) and (2.31).
Then its motion is governed by the Dirac equation with the Hamiltonian (2.33). Following the
method used to analyse the non-relativistic limit of the free particle Dirac equation (see Sect.
2.1) one can write:

$(c^2 - \phi - E)\,u^L + c\,(\boldsymbol{\sigma}\boldsymbol{\pi})\,u^S = 0$
$c\,(\boldsymbol{\sigma}\boldsymbol{\pi})\,u^L - (c^2 + \phi + E)\,u^S = 0$ (2.62)

where we used atomic units (m = 1, e = -1) for the mass and charge of the electron. From the
second of these equations one finds

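$u^S = c\,\frac{1}{c^2 + \phi + E}\,(\boldsymbol{\sigma}\boldsymbol{\pi})\,u^L$ (2.63)

and consequently the large component will be determined by:

$\left[ c^2 - \phi - E + c^2\,(\boldsymbol{\sigma}\boldsymbol{\pi})\,\frac{1}{c^2 + \phi + E}\,(\boldsymbol{\sigma}\boldsymbol{\pi}) \right] u^L = 0$ (2.64)

or

$\left[ -\phi - E + c^2\,(\boldsymbol{\sigma}\boldsymbol{\pi})\,\frac{1}{2c^2 + \phi + E}\,(\boldsymbol{\sigma}\boldsymbol{\pi}) \right] u^L = 0$, (2.65)

where the relative energy value E of Eq. (2.24) is used. After expanding the last term in square
brackets into a power series with respect to $1/c^2$ and using:

$\boldsymbol{\pi} = \mathbf{p} + \frac{1}{c}\,\mathbf{A}$, (2.66)

one obtains the non-relativistic limit through terms of the order of $1/c^2$: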

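$H = \tfrac{1}{2}p^2 - \phi$ (non-relativistic)

$-\frac{1}{8c^2}\,p^4$ (mass-velocity)

$+\frac{1}{2c}\,(\mathbf{p}\mathbf{A} + \mathbf{A}\mathbf{p})$ (external magnetic)

$+\frac{1}{2c^2}\,A^2$ (external magnetic) (2.67)

$+\frac{1}{2c}\,\boldsymbol{\sigma}\mathbf{H}$ (spin-external magnetic)

$+\frac{1}{4c^2}\,\boldsymbol{\sigma}(\mathbf{E} \times \mathbf{p})$ (spin-orbit)

$-\frac{1}{8c^2}\,\nabla^2\phi$ (Darwin)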
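The way the individual terms of (2.67) arise can be sketched as follows. The two ingredients below are standard results and are assumptions as far as this text is concerned: the geometric series for the energy denominator of Eq. (2.65), and the identity for $(\boldsymbol{\sigma}\boldsymbol{\pi})^2$ with $\mathbf{H} = \nabla \times \mathbf{A}$.

```latex
% Sketch only: leading steps of the 1/c^2 expansion of Eq. (2.65).
\begin{align*}
\frac{c^2}{2c^2 + \phi + E} &= \frac{1}{2}\left[1 - \frac{\phi + E}{2c^2} + \ldots\right], \\
(\boldsymbol{\sigma}\boldsymbol{\pi})(\boldsymbol{\sigma}\boldsymbol{\pi})
  &= \pi^2 + \frac{1}{c}\,\boldsymbol{\sigma}\mathbf{H}
   = p^2 + \frac{1}{c}\,(\mathbf{p}\mathbf{A} + \mathbf{A}\mathbf{p})
     + \frac{1}{c^2}\,A^2 + \frac{1}{c}\,\boldsymbol{\sigma}\mathbf{H} .
\end{align*}
```

The leading term $\tfrac{1}{2}(\boldsymbol{\sigma}\boldsymbol{\pi})(\boldsymbol{\sigma}\boldsymbol{\pi})$ reproduces the kinetic energy, both 'external magnetic' terms and the 'spin-external magnetic' term. The mass-velocity, spin-orbit and Darwin terms come from the $(\phi + E)/2c^2$ correction, where the momentum operators no longer commute with $\phi$.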

The term referred to in (2.67) as the 'spin-external magnetic' interaction contribution has the
form of the interaction $(-\boldsymbol{\mu}\mathbf{B})$ between external magnetic field $\mathbf{B}$ and magnetic moment $\boldsymbol{\mu}$,

$\boldsymbol{\mu} = -\frac{1}{2c}\,\boldsymbol{\sigma}$, (2.68)

and brings about the interpretation of the electron spin in terms of the magnetic moment of the
electron.
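For the potential $\phi$ arising from a point-like nucleus of the charge Z:

$\nabla^2 \phi = -4\pi Z\,\delta(\mathbf{r})$ (2.69)

and

$\mathbf{E} = \frac{Z\,\mathbf{r}}{r^3}$, (2.70)

where $\delta(\mathbf{r})$ is the Dirac delta function. Thus, the spin-orbit and Darwin terms become:

$H_{so} = \frac{Z}{4c^2 r^3}\,(\boldsymbol{\sigma}\mathbf{l})$ (2.71)

and

$H_D = \frac{\pi Z}{2c^2}\,\delta(\mathbf{r})$, (2.72)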

where $\mathbf{l}$ is the angular momentum operator with respect to the origin at the nucleus. The spin-
orbit term $H_{so}$ gives no contribution to energy for s states (l = 0). Additionally, if $\mathbf{H} = 0$, the
only remaining relativistic corrections to the non-relativistic Hamiltonian (through the order of
$1/c^2$) are the mass-velocity $H_{mv}$ and Darwin $H_D$ terms. Numerically, they constitute the largest
corrections to non-relativistic energies. However, they are close in magnitude and of opposite
sign.
The recognition of the relatively large magnitude of the correction terms $H_{mv}$ and $H_D$ has led to
defining an approximate quasi-relativistic energy operator $H_{mvD}$:
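The near cancellation of the two terms can be illustrated for the 1s state of a hydrogen-like ion. The sketch below uses first-order expectation values over the non-relativistic 1s orbital, with the standard results $\langle p^4 \rangle = 5Z^4$ and $|\psi_{1s}(0)|^2 = Z^3/\pi$. These values, the function names and the value of c are assumptions supplied for illustration.

```python
import math

C = 137.035999  # assumed value of the speed of light in atomic units


def mass_velocity_1s(Z, c=C):
    """<1s| -p^4 / (8 c^2) |1s>, using <p^4> = 5 Z^4 for a hydrogen-like 1s orbital."""
    return -5.0 * Z**4 / (8.0 * c**2)


def darwin_1s(Z, c=C):
    """<1s| (pi Z / 2 c^2) delta(r) |1s>, using |psi_1s(0)|^2 = Z^3 / pi."""
    return (math.pi * Z / (2.0 * c**2)) * (Z**3 / math.pi)


if __name__ == "__main__":
    for Z in (1, 30, 80):
        mv, d = mass_velocity_1s(Z), darwin_1s(Z)
        print(f"Z = {Z}: mass-velocity {mv:.6f}  Darwin {d:.6f}  sum {mv + d:.6f}")
```

The sum equals $-Z^4/(8c^2)$, which is the $1/c^2$ term obtained from the expansion (2.55) for n = 1 and $|\kappa| = 1$.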

(2.73)

This operator involves two major relativistic terms while retaining the usual non-relativistic
1-component form of solutions of the corresponding eigenvalue problem. On adding to (2.73)
the $H_{so}$ operator one obtains a 2-dimensional mvD+SO Hamiltonian $H_{mvD+SO}$,

(2.74)

which provides a 2-component (Pauli) approximation to the complete 4-component treatment of
relativistic effects. The two Hamiltonians (2.73) and (2.74) are commonly used in approximate
(quasi-relativistic) methods of atomic and molecular physics and quantum chemistry.

3. Relativistic theory of many-particle systems

3.1 The Hamiltonian


Almost all considerations in the relativistic theory of many-electron atoms and molecules
assume that the nuclei provide only a source of external potential for relativistic electrons. Thus,
the nuclear framework is commonly treated within the Born-Oppenheimer approximation; the
relativistic features of nuclei being at best referred to in a phenomenological way (i.e., nuclear
spin). Moreover, it is usually assumed that nuclei are point-like masses of certain charge Z.
This is a set of standard assumptions which will be used in these lectures, and thus, we are
left with relativistic electrons moving in some external Coulomb field. Removing the point-like
approximation is a rather easy task. Going beyond the non-relativistic treatment of the nuclear
framework would be a considerable undertaking and does not seem to be needed in the area of
quantum chemistry.
One of the major fundamental differences between non-relativistic and relativistic many-
electron problems is that while in the former case the Hamiltonian is explicitly known from the
very beginning, the many-electron relativistic Hamiltonian has only an implicit form given by
quantum electrodynamics. Even for two electrons the corresponding relativistic Hamiltonian is
written only in approximate forms which are not fully covariant. Under those circumstances it is
tempting to build approximate relativistic theories of many-electron systems on the basis of more
or less sophisticated assumptions. The simplest relativistic 'model' Hamiltonian is considered to
be given by a sum of relativistic (Dirac) one-electron Hamiltonians $H_D$ and the usual Coulomb
interaction term:

$H(1, 2, \ldots, n) = \sum_{i=1}^{n} H_D(i) + \sum_{i<j} \frac{1}{r_{ij}}$, (3.1)

where
(3.2)

$\phi_i = -\sum_{A=1}^{N} \frac{Z_A}{r_{iA}}$, (3.3)

$r_{iA}$ is the distance between the i-th electron and nucleus A of the charge $Z_A$, and $r_{ij}$ is the
distance between electrons i and j. The number of nuclei in the system is assumed to be N.
It is worthwhile to note that both the $\boldsymbol{\alpha}$ and $\beta$ matrix operators are labelled by the electron
reference number i. Although they have the same form, they will act on different 4-component
spinors.
The operator (3.1) is referred to as the Dirac-Coulomb ($H_{DC}$) Hamiltonian and represents the
lowest order approximation to electron-electron interactions. In spite of that, the Dirac-Coulomb
Hamiltonian underlies the majority of 4-component relativistic calculations in quantum chem-
istry. However, the neglect of relativistic contributions to the electron-electron interactions
brings about some fundamental problems. The 2-electron Dirac-Coulomb equation:

(3.4)

can be shown to have no bound states. Thus, there is no protection against the variational
collapse into negative energy states. The ill-conditioned form of the Dirac-Coulomb Hamiltonian
was first recognized by Brown and Ravenhall and is usually termed the 'Brown-Ravenhall
disease'. This follows from the fact that a bound state of two non-interacting Dirac electrons,
i.e., $\Psi(1)\Psi(2)$, is degenerate with a continuum of non-normalizable states having one electron
in the positive energy state and another one in the negative energy state. When the Coulomb
interaction is included, the initial wave function of non-interacting electrons gains contributions
from all those continuum states and becomes 'dissolved in continuum'. Until recently not too
much attention has been paid to this problem. The recent interest in avoiding the Brown-
Ravenhall disease has been pioneered by Sucher and followed by numerical studies of Hess
et al. However, one should remark that, in spite of its rather obscure physical meaning and
mathematical features, the Dirac-Coulomb Hamiltonian underlies the majority of relati-
vistic techniques in quantum chemistry.
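The degeneracy behind the continuum dissolution can be made concrete with a small Python sketch for two non-interacting electrons in the 1s state of a hydrogen-like ion. The function names and the value of c are assumptions made for illustration; the 1s energy is the N = 0, $\kappa = -1$ case of the hydrogen-like energy formula.

```python
import math

C = 137.035999  # assumed value of the speed of light in atomic units


def bound_1s_energy(Z, c=C):
    """Total Dirac energy (rest energy included) of a 1s electron, hydrogen-like ion."""
    return c**2 * math.sqrt(1.0 - (Z / c) ** 2)


def negative_partner(Z, e_positive, c=C):
    """Energy the second electron needs so that the pair matches 2 * E(1s).

    Returns the energy if it lies in the negative continuum (<= -c**2),
    otherwise None.
    """
    e_negative = 2.0 * bound_1s_energy(Z, c) - e_positive
    return e_negative if e_negative <= -c**2 else None


if __name__ == "__main__":
    Z = 2
    for e_pos in (C**2 + 1.0, 3.0 * C**2 + 10.0, 5.0 * C**2):
        print(f"E+ = {e_pos:.3f}  ->  E- = {negative_partner(Z, e_pos)}")
```

Every sufficiently energetic positive-continuum state finds a negative-continuum partner with the same total energy as the bound pair, so the bound product state is embedded in a continuum of such configurations.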
The relativistic form of the two-electron interaction operator was first discussed by Breit,
leading to an expression which is relativistically correct through order $1/c^2$.
The corresponding interaction
operator can be approximately derived from quantum electrodynamics and is known as the Breit
interaction operator $V_B(i,j)$:

$V_B(i,j) = \frac{1}{r_{ij}} - \frac{1}{2}\left[ \frac{\boldsymbol{\alpha}_i \boldsymbol{\alpha}_j}{r_{ij}} + \frac{(\boldsymbol{\alpha}_i \mathbf{r}_{ij})(\boldsymbol{\alpha}_j \mathbf{r}_{ij})}{r_{ij}^3} \right]$. (3.5)

By substituting in Eq. (3.1) the Coulomb interaction operator by its relativistic extension (3.5)
one obtains the so-called Dirac-Breit many-electron Hamiltonian $H_{DB}$:

$H_{DB} = \sum_{i=1}^{n} H_D(i) + \sum_{i<j} V_B(i,j)$. (3.6)

This replacement, however, does not remove the 'Brown-Ravenhall disease' problem. More-
over, the Dirac-Breit Hamiltonian is derived perturbationally and there may be some objections
against its use in variational calculations. Thus, it is frequently suggested that the Breit cor-
rection to the Coulomb interaction should be considered in the perturbation framework and
evaluated as the first-order contribution to the energy which follows from $H_{DC}$. In the context
of the Breit operator (3.5) one should also mention its approximate form known as the Gaunt
interaction $V_G$:

$V_G(i,j) = \frac{1}{r_{ij}} - \frac{\boldsymbol{\alpha}_i \boldsymbol{\alpha}_j}{r_{ij}}$, (3.7)

which leads to the Dirac-Gaunt ($H_{DG}$) many-electron Hamiltonian:

$H_{DG} = \sum_{i=1}^{n} H_D(i) + \sum_{i<j} V_G(i,j)$. (3.8)

In recent years considerable attention has been given to modifications of the Dirac-Coulomb
Hamiltonian which remove the 'Brown-Ravenhall disease' problem. The continuum dissolution
can be avoided by projecting out the relevant part of the Coulomb interaction operator. This
leads to the eigenvalue equation of the form:

$H_+(1, 2, \ldots, n)\,\Psi(1, 2, \ldots, n) = E\,\Psi(1, 2, \ldots, n)$, (3.9)



where

(3.10)

The projection operator $L_+(1, 2, \ldots, n)$:

(3.11)

with

(3.12)
n

projects onto the space of positive energy solutions for $H_D$. The derivation of this so-called
'no-pair Hamiltonian' neglects all effects related to the creation of virtual electron-positron pairs.
Also the effects of virtual photons are neglected. Moreover, Eq. (3.10) can be reduced to a single
component form, leading to what is called the 'spin-free no-pair' approximation.
Most of the problems arising from the choice of approximate many-electron Hamiltonians
can be resolved by using quantum electrodynamics in the so-called Furry's bound interaction
picture. It is quite pleasing to note that several equations used in relativistic quantum chem-
istry can be derived as legitimate approximations to the proper field-theoretic treatment of the
many-electron problem. Among others this applies to the relativistic equivalent of the Hartree-
Fock scheme, i.e., the Dirac-Hartree-Fock method. Once the many-electron Hamiltonian is
chosen, the relativistic methods of quantum chemistry parallel those developed for solving the
Schrödinger equation.

*
Before closing this chapter let us briefly consider the extension of the non-relativistic limit
formulae of Section 2.2.2. In the Dirac-Breit approximation the additional terms arising through
the order $1/c^2$ are:

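$-\sum_{i<j} \frac{1}{2c^2}\,\mathbf{p}_i \left( \frac{\mathbf{r}_{ij}\mathbf{r}_{ij}}{r_{ij}^3} + \frac{1}{r_{ij}} \right) \mathbf{p}_j$ (orbit-orbit)

$-\sum_{i \neq j} \frac{1}{4c^2 r_{ij}^3}\,\boldsymbol{\sigma}_i \left[ \mathbf{r}_{ij} \times (2\mathbf{p}_j - \mathbf{p}_i) \right]$ (spin-other orbit)

$-\sum_{i<j} \frac{1}{4c^2}\,\boldsymbol{\sigma}_i \left( \frac{3\,\mathbf{r}_{ij}\mathbf{r}_{ij}}{r_{ij}^5} - \frac{1}{r_{ij}^3} \right) \boldsymbol{\sigma}_j$ (dipole spin-spin) (3.13)

(contact spin-spin)

(two-electron Darwin)


where

$\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$. (3.14)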

The first four operators correspond to interactions between orbital and/or spin magnetic mo-
ments of electrons. Together with the SO operator of Eq. (2.71) they are responsible, e.g., for
the so-called fine structure in atomic spectra. The corresponding effects on energies and wave
functions are usually evaluated by means of the perturbation treatment based on solutions of
the non-relativistic many-electron problem.

3.2 Dirac-Hartree-Fock Approximation


Given the n-electron Hamiltonian the overwhelming majority of methods starts with the
one-electron model. For closed-shell systems the n-electron wave function is built as a Slater
determinant from one-electron spinors. For general systems this ansatz is replaced by a more
general concept of the configuration state function (CSF), i.e., a wave function which corresponds
to the given electronic configuration and satisfies certain symmetry conditions with respect to,
e.g., point group of symmetry, parity, total angular momentum.
Let the k-th 4-component spinor for the i-th electron, $\Psi_k(i)$, be:

$\Psi_k(i) = \begin{pmatrix} u_k^L(i) \\ i\,u_k^S(i) \end{pmatrix}$, (3.15)

where $u_k^L$ and $u_k^S$ are the corresponding large and small components, respectively. In the present
case both of them are real. In order to determine the optimal set of 4-component spinors one
follows the same route as in the case of the non-relativistic Hartree-Fock method. For each of
the n-body Hamiltonians of Section 3.1 one can define the energy functional:

$E = \frac{\langle \Psi(1,2,\ldots,n) \,|\, H(1,2,\ldots,n) \,|\, \Psi(1,2,\ldots,n) \rangle}{\langle \Psi(1,2,\ldots,n) \,|\, \Psi(1,2,\ldots,n) \rangle}$, (3.16)

where in the simplest case $\Psi(1,2,\ldots,n)$ is a single Slater determinant,

(3.17)

built of orthonormal one-electron spinors,

(3.18)

Upon variation of (3.16) one obtains a set of equations:

(3.19)

where $\epsilon_k$ is the orbital energy associated with the k-th spinor. For the two-body Coulomb
potential one has:

(3.20)

(3.21)

are the Coulomb and exchange operators, respectively. They are defined over 4-component
spinors and the integration comprises both the usual integration over space coordinates of the
electron and the summation over products of components, e.g.,

(3.22)

where $u_{i,k}$ is the i-th component of the $\Psi_k$ spinor of the form given by Eq. (2.15). For other
two-body potentials, e.g., those which follow from the Breit or Gaunt interaction Hamiltonians,
the $1/r_{12}$ operator should be replaced by its appropriate counterpart.
The derivation of Dirac-Hartree-Fock equations follows that known from the non-relativistic
theory. The same applies to their derivation for open-shell and multiconfiguration cases. How-
ever, the use of the variation approach to determine one-particle spinors is a little problematic
since the Dirac Hamiltonian is not bounded from below. Hence, the variation method should
rather be used as a tool to determine the stationary point of the energy functional. The solution
of equations which result from the variation of E is therefore usually restricted to the
positive energy region and the lowest positive energy eigenvalues are assumed to correspond to
occupied levels. In the iterative approach to the solution of one-electron equations (3.19) the n
lowest (positive) energy spinors are used to build the Coulomb and exchange operators for the
subsequent iteration. In order to avoid spurious solutions one should take care that the kinetic
energy balance condition (see Eq. (2.61)) is satisfied.
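The selection step of the iterative procedure described above can be sketched as follows. The function name and the energy convention (eigenvalues include the rest energy $c^2$, so that negative-energy solutions are negative numbers) are assumptions made for illustration.

```python
def select_occupied(energies, n_electrons):
    """Return indices of the n_electrons lowest positive-energy solutions.

    energies: one-electron eigenvalues including the rest energy c**2
    (assumed convention), so solutions of the negative branch are negative.
    """
    # Keep only the positive-energy branch, then order it by energy.
    positive = sorted((e, i) for i, e in enumerate(energies) if e > 0.0)
    if len(positive) < n_electrons:
        raise ValueError("not enough positive-energy solutions")
    return [i for _, i in positive[:n_electrons]]


if __name__ == "__main__":
    c2 = 137.035999**2
    # Hypothetical spectrum of a small matrix problem, in arbitrary order.
    spectrum = [-2.0 * c2, c2 + 3.0, c2 - 0.5, -c2 - 1.0, c2 - 0.125]
    print(select_occupied(spectrum, 2))
```

Sorting the full spectrum and taking the n lowest entries would instead pick negative-energy solutions, which is the failure the restriction to the positive energy region is meant to avoid.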

3.3 Beyond the Hartree-Fock Approximation


Solving the Dirac-Hartree-Fock problem is only the initial step in relativistic calculations
for many-electron systems. In most cases this relativistic one-electron (single-particle) approx-
imation is insufficient even for qualitatively correct considerations in quantum chemistry. Some
improvement over the one-electron approximation can be gained by using multiconfiguration
Dirac-Hartree-Fock methods. However, the standard methods to remedy the deficiencies of
the one-electron approximation are the configuration interaction (CI) approach and many-body
perturbation theory (MBPT) techniques. In general, the methods are essentially the same as in
the case of non-relativistic techniques for calculating the electron correlation energy. The major
difference is that in all relevant expressions the non-relativistic one-component single-particle
functions are substituted by their 4-component spinor counterparts. In order to avoid prob-
lems arising from the 'Brown-Ravenhall disease' the use of Sucher's projective many-electron
Hamiltonian can be recommended in either its full 4-component form or in simplified 2- or
1-component approximations. The 'spin-free no-pair' 1-component variant of this method has
been extensively used by Hess and appears to give a reasonable account of relativistic effects in
atoms and molecules as long as the spin-orbit coupling effects are insignificant.

3.4 The Density Functional Approach


The methods considered so far can be in a more or less rigorous way derived from either
quantum electrodynamics or approximate relativistic many-electron Hamiltonians discussed in
Section 3.1. In recent years a great deal of attention has been given to another class of methods
which are based on what is known as the density functional theory. This theory is based on a
theorem by Hohenberg and Kohn which says that the ground state energy of a many-electron
system is uniquely determined by the 1-particle density. Kohn and Sham have derived a non-
relativistic one-electron equation

$\left( -\tfrac{1}{2}\nabla^2 + \phi_{eff} \right) \psi_k = \epsilon_k \psi_k$ (3.23)

with the effective potential $\phi_{eff}$

$\phi_{eff}(1) = \phi_{nucl}(1) + \int d^3r_2\, \frac{\rho(\mathbf{r}_2)}{r_{12}} + \phi_{xc}(1)$, (3.24)


where $\phi_{nucl}$ is given by Eq. (3.3) and $\phi_{xc}$ is the so-called exchange-correlation potential. The 1-
particle density $\rho$ is calculated from occupied single-particle solutions of (3.23). The usefulness
and success of the density functional method depends on the choice of the unknown exchange-
correlation potential. The most common choice is based on the exchange-correlation energy
formula for a homogeneous electron gas and gives the so-called local density approximation
to the density functional theory. There is a variety of different forms of $\phi_{xc}$ used in density
functional methods. The best known is the one proposed by Slater:

(3.25)

and usually referred to as the X$\alpha$-potential; the constant $\alpha$ being an adjustable parameter. The
local density functional approximation with the exchange-correlation potential (3.25) is known
as the Hartree-Fock-Slater method.
A relativistic form of the density functional theory is rather obvious. The one-electron oper-
ator in Eq. (3.23) is to be replaced by the Dirac energy operator with the effective potential
which follows from the 1-particle density as given by Eq. (2.16), with 4-component spinor orbitals
obtained from the relativistic counterpart of Eq. (3.23). Computational techniques based on this
formalism are usually referred to as the Dirac-Slater or Dirac-Hartree-Fock-Slater methods.
Although both non-relativistic and relativistic density functional theories can be given a
sound formal background, the way they are used makes them into approximate techniques. The
relativistic density functional methods evidently provide computational tools for handling the
relativistic effects in many-electron systems. However, their success depends on the choice of
the exchange-correlation potential. Thus, in practice all density functional methods rely on a
posteriori validation of their results.
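A minimal Python sketch of a local X$\alpha$ exchange potential is given below. It assumes the standard spin-unpolarized form in atomic units, $-\tfrac{3}{2}\alpha\,(3\rho/\pi)^{1/3}$, entering the effective potential as a potential energy. The function name and the default value of $\alpha$ are illustrative choices, not taken from the text.

```python
import math


def x_alpha_potential(rho, alpha=0.7):
    """Local X-alpha exchange potential in hartree (assumed standard form).

    rho:   electron density in electrons per cubic bohr (non-negative)
    alpha: adjustable parameter; 1 gives Slater's original exchange and
           2/3 the homogeneous electron gas (Kohn-Sham) exchange value
    """
    if rho < 0.0:
        raise ValueError("density must be non-negative")
    return -1.5 * alpha * (3.0 * rho / math.pi) ** (1.0 / 3.0)


if __name__ == "__main__":
    for rho in (0.001, 0.1, 10.0):
        print(f"rho = {rho:8.3f}  V_xalpha = {x_alpha_potential(rho):.5f}")
```

Because the potential depends only on the local value of the density, it can be tabulated on the same grid as the density in each iteration, which is what makes such schemes inexpensive compared with the non-local exchange operator.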

4. Computational aspects

4.1 Numerical Integration Methods


To integrate directly one-electron equations of the relativistic theory of many-electron systems
is the most promising technique, though essentially limited to systems of high enough symmetry.
For this reason purely numerical techniques have been mostly used in relativistic atomic calcula-
tions and the achievements in this area are summarized by the tabulation of Dirac-Hartree-Fock
wave functions and related properties for atoms with Z=1 through Z=120 (Desclaux, 1973). The
state of the art of relativistic atomic calculations is presented in several recent review articles.
General packages for performing numerical atomic calculations are also available. More recent
extensions include multiconfiguration Dirac-Hartree-Fock methods and quantum electrodynamic
corrections.
The area of relativistic calculations for many-electron atoms is rather specific for it heavily
relies on the angular momentum coupling theory. The separation of angular coordinates finally
makes the integration problem one-dimensional. This can also be achieved for highly symmetric
molecules MHn with the heavy atom (M) at the centre. By using the single centre expansion
for all operators the integration problem becomes essentially the same as in the case of atoms.
For linear molecules the integration of Dirac-Hartree-Fock equations can be reduced to two
dimensions and is numerically feasible. This approach has been pioneered by Pyykkö and
Sundholm and then extended by Laaksonen and Grant.
One should also mention that the numerical integration technique is commonly used for solv-
ing Dirac-Slater and related equations in methods based on the density functional theory. The
discrete variational method has been developed by Rosen and Ellis and applied to a number of
molecules in the framework of the Dirac-Slater approximation. Closely related are the integra-
tion techniques employed in the so-called multiple scattering method for solving Dirac-Slater
and similar equations.
Although the use of purely numerical integration methods in relativistic theories of many-
electron systems is limited mostly to atoms and linear diatomics, the corresponding results
provide benchmarks for other developments rooted in traditional non-relativistic quantum chem-
istry. The level of accuracy which can be achieved in single- and multiconfiguration rela-
tivistic calculations for atoms is already so high that it permits discussion of subtleties related to
the validity of different many-electron Hamiltonians and evaluation of quantum electrodynamic
corrections with some degree of confidence.

4.2 Basis Set Expansion Methods


The development of relativistic computational methods which are potentially useful for chem-
ical applications follows traditional approaches used in quantum chemistry. It is a common
practice in non-relativistic methods to expand the unknown wave function into some truncated
set of known and relatively simple functions. In this way the problem of solving differential or
integro-differential equations is converted into a much simpler algebraic problem of the determi-
nation of eigenvalues of hermitian matrices. There is a great deal of activity in this area and
several interesting results have already been obtained.

The so-called basis set expansion methods are primarily used to determine approximate single-
particle solutions for equations discussed in Section 3. Once these are known the electron
correlation effects can be evaluated by using either CI or MBPT-type methods in essentially the
same way as in the non-relativistic case. In the case of atomic structure calculations the basis
set expansion methods do not seem to be competitive, at least at the level of the Dirac-Hartree-
Fock approximation, to numerical integration techniques. One of the main purposes of a variety
of atomic programs based on the analytic expansion methods appears to be the development of
relativistic atomic basis sets to be used in molecular calculations.
The early truncated basis set (algebraic) calculations for molecules have proven to be a failure
because of the so-called variational collapse problem. This has been remedied later on by
recognizing the importance of the kinetic energy balance which forces a fixed relation between
basis functions used for the large and small components. With the problem occurring in numerical
integration methods for polyatomic molecules the use of the algebraic approximation appears to
be at present the only way to extend the relativistic calculations beyond atoms and diatomics.
There is an increasing number of available Dirac-Hartree-Fock molecular programs which use
the basis set expansion methods. Moreover, in recent years the algebraic methods have been
extended beyond the single-particle level of approximation by devising the relativistic CI and
MBPT techniques.
The algebraic approach requires that certain integrals involving basis functions and operators
of relativistic Hamiltonians are calculated fast enough. This imposes some additional conditions
on the choice of basis sets and expansions in terms of Gaussian functions are rather routine
in molecular calculations. They have been also successfully used in relativistic treatment of
atoms. Without taking into account any symmetries the 4-component spinor solving the given
one-particle equation, e.g., the Dirac-Hartree-Fock equations, is expanded into a finite set $\{\chi_\mu\}$
of Gaussians $\chi_\mu$ which are usually centred at atomic nuclei. In principle the same basis set can
be used to expand both the large (Eq. 3.15) and the (real) small components, i.e.,

(4.1)

and

(4.2)

respectively. However, from Eq. (2.61), which in the present case assumes the following form:

(4.3)

where $\phi$ stands for the appropriate one-electron potential, one learns that the expansions (4.1)
and (4.2) are not independent. The relation (4.3) will be obviously satisfied in the limit of a
complete set of expansion functions $\chi_\mu$. However, its violation for finite basis sets may have
serious consequences.
Let us note that (4.3) is needed to obtain a proper non-relativistic limit of the considered
single-particle relativistic model. In particular, one finds (see e.g. Eqs. (2.25) and (2.26)) that

$\langle u^L \,|\, (\boldsymbol{\sigma}\mathbf{p}) \,|\, u^S \rangle \langle u^S \,|\, (\boldsymbol{\sigma}\mathbf{p}) \,|\, u^L \rangle$, (4.4)



must be equal to

(4.5)

This will be satisfied only if the basis set $\{\chi_\mu\}$ is large and rich enough that it contains functions
generated from $\chi_\mu$ by the $(\boldsymbol{\sigma}\mathbf{p})$ operator. In principle such a basis set must be complete.
The equivalence between (4.4) and (4.5) is termed the kinetic balance condition and can
be satisfied by choosing different basis sets for the representation of the large and small com-
ponents. From the analysis of the non-relativistic limit of the Dirac equation we know that the
large component will approach the solution of the non-relativistic counterpart of the consid-
ered relativistic model. This gives a plausible recipe for choosing a basis set, say $\{\chi_\mu^L\}$, for the
expansion of the large component. The small component basis set can be then generated by:

(4.6)

There are obvious practical limitations to this procedure and usually Eq. (4.6) is only a guiding
principle for the choice of the small component basis set. Then, after selecting the large and
small component sets one combines them into a single set of functions to be used for expanding
the 4-component spinor. The choice of the basis set for representing the small component can
also be guided by atomic relativistic calculations. For the use in molecular calculations one can
devise a set of functions which comprises the usual non-relativistic atomic basis sets and atomic
small component sets. This leads to what is known as atomic balanced sets.
The basis sets to be used in relativistic calculations are obviously much larger than those
employed in non-relativistic cases. The reduction of primitive sets can be achieved by using
contraction methods in either segmented or generalized forms. Once the kinetic balance con-
dition is satisfied, then the variational collapse problem usually disappears. The use of the
point-like nuclei leads sometimes to convergence problems in iterative solutions of algebraic
Dirac-Hartree-Fock equations. This can be remedied by using a finite-size nucleus approxima-
tion. The final results are essentially independent of the assumed (small) nuclear radius.
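The generation of a small component set from a large component set can be sketched for primitive Gaussians of the form $r^l e^{-ar^2}$. Differentiation maps such a function onto functions with the same exponent and angular momentum l - 1 and l + 1, which is the only property the sketch below uses. The function name and the (l, exponent) representation of a basis function are assumptions made for illustration.

```python
def kinetically_balanced_small_set(large_set):
    """Return the (l, exponent) pairs needed for the small component.

    large_set: iterable of (l, exponent) pairs describing primitive Gaussians
    r**l * exp(-exponent * r**2) used for the large component.  Acting with a
    derivative operator produces l - 1 (if l > 0) and l + 1 functions with the
    same exponent, so both are added to the small component set.
    """
    small = set()
    for l, exponent in large_set:
        small.add((l + 1, exponent))
        if l > 0:
            small.add((l - 1, exponent))
    return sorted(small)


if __name__ == "__main__":
    large = [(0, 1.0), (0, 0.3), (1, 0.5)]  # hypothetical s and p primitives
    for l, exponent in kinetically_balanced_small_set(large):
        print(f"l = {l}, exponent = {exponent}")
```

Even this toy example shows why the small component set tends to be larger than the large component set, and why it extends to higher angular momentum.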
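One possible finite-size nucleus model is a Gaussian charge distribution, sketched below in Python. The choice of a Gaussian model, the exponent parameter xi and the function names are all assumptions made for illustration; the text does not specify which nuclear model is meant.

```python
import math


def point_nucleus_potential(r, Z):
    """Potential energy -Z/r of an electron in the field of a point charge Z."""
    return -Z / r


def gaussian_nucleus_potential(r, Z, xi):
    """Potential energy for a Gaussian nuclear charge distribution.

    Assumed model: rho(r) = Z * (xi / pi)**1.5 * exp(-xi * r**2), for which
    the potential energy is -Z * erf(sqrt(xi) * r) / r, finite at r = 0.
    """
    if r == 0.0:
        return -2.0 * Z * math.sqrt(xi / math.pi)
    return -Z * math.erf(math.sqrt(xi) * r) / r


if __name__ == "__main__":
    Z, xi = 80, 1.0e8  # hypothetical heavy nucleus and exponent (atomic units)
    for r in (0.0, 1.0e-5, 1.0e-3, 1.0):
        print(f"r = {r:8.1e}  V = {gaussian_nucleus_potential(r, Z, xi):.4f}")
```

The two potentials differ only within a distance of order $1/\sqrt{\xi}$ from the nucleus, while the singularity of the point-charge potential at the origin is removed.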
The algebraic methods for solving one-particle equations in relativistic theories are of
particular importance for computational techniques which go beyond the one-electron model. In
addition to the spinors which are occupied by electrons in the given electronic configuration, they
provide a set of spurious solutions (virtual spinors). These can be used to build other
configurations in CI-like schemes or can provide an approximation to the complete set of one-particle
eigenstates to be used in many-body techniques.
In present molecular applications of the algebraic form of the Dirac-Hartree-Fock method
most calculations are carried out with Gaussian basis sets. The major convenience of such
basis sets is the relatively easy calculation of molecular relativistic integrals. In this context one
should remark that relativistic calculations, in spite of certain symmetries between integrals, are
far more size-demanding than their non-relativistic counterparts. The size of relativistic basis
sets makes the storage problem quite serious. This may explain several attempts to simplify
fully relativistic approaches either by using quasirelativistic approximations or by reducing the
number of explicitly considered electrons in the framework of the pseudopotential techniques.

4.3 Quasirelativistic Approximations


The title of this section covers a wide variety of different methods which can be considered as
approximations to the most complete relativistic treatment. To have some working terminology
let us agree that any method which departs from the 4-component spinor representation of
one-electron wave functions is to be called quasirelativistic. In this sense all methods based
on the 2-component Pauli formalism or 1-component non-relativistic wave functions are to be
termed quasirelativistic approaches. Though quasirelativistic, these methods may recover a
variety of relativistic effects on energies and related properties of many-electron systems.
Most quasirelativistic approaches can be derived directly from relativistic schemes by
using either rigorous or approximate perturbation treatment and retaining only certain, supposedly
largest, contributions. In all derivations the first step consists of separating the large and small
components of the wave function. This can be achieved rigorously by applying a series of
unitary transformations to the Dirac Hamiltonian. The corresponding method is known as the
Foldy-Wouthuysen transformation technique. The transformed Hamiltonian truncated at the
$c^{-2}$ order is equivalent to the Pauli Hamiltonian for one-electron systems and includes the
mass-velocity, Darwin, and spin-orbit terms (see Section 2.2.2). In higher orders with respect to $c^{-2}$
the transformed Foldy-Wouthuysen Hamiltonian becomes strongly singular and cannot be used
in numerical calculations.
The 2-component Pauli approximation is a typical example of a quasirelativistic approach.
In the case of many-electron systems it usually leads to the so-called Pauli Hamiltonian, which
reads:

(4.7)

where $H_S$ is the usual non-relativistic (Schrödinger) Hamiltonian for the given many-electron
system and

$H_{mv} = -\frac{1}{8c^{2}} \sum_{i} p_{i}^{4}$   (4.8)

(4.9)

and

(4.10)

are the many-electron generalizations of the operators given in Section 2.2.2. The one-electron
angular momentum operator $l_{iA}$ of the i-th electron is defined with respect to the origin at
the nucleus A, and $r_{iA}$ is the length of $\mathbf{r}_{iA} = \mathbf{r}_i - \mathbf{A}$. The many-electron Pauli operator can
be either completed with two-electron terms which result from the Breit operator (through the
order of $c^{-2}$, Eq. (3.13)) or further reduced to the 1-component approximation by removing the SO
term.
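The prefactor of the mass-velocity operator (4.8) can be traced to the expansion of the relativistic kinetic energy of a free particle in powers of $c^{-2}$. The short worked step below is an added illustration, assuming atomic units with electron mass $m = 1$:

```latex
\begin{align*}
T &= \sqrt{c^{2}p^{2} + c^{4}} - c^{2}
   = c^{2}\left(\sqrt{1 + \frac{p^{2}}{c^{2}}} - 1\right) \\
  &= \frac{p^{2}}{2} - \frac{p^{4}}{8c^{2}} + \frac{p^{6}}{16c^{4}} - \dots
\end{align*}
```

The first term is the non-relativistic kinetic energy; the second, summed over electrons, gives the mass-velocity term, and the remaining terms are dropped at order $c^{-2}$.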
The Pauli Hamiltonian completed with two-electron terms is used in quasirelativistic atomic
calculations for the evaluation of the effect of different magnetic interactions. In most cases the
corresponding calculations are carried out perturbationally by using non-relativistic zeroth-order
wave functions in the LS coupling scheme. The perturbation evaluation of the effect of $H_{SO}$ on
spin-forbidden transitions is a typical example of such applications.
By reducing the Pauli Hamiltonian to the 1-component form one retains the two largest
corrections to non-relativistic energies. As long as the magnetic interactions are of little importance
(e.g. closed shell systems) the 1-component quasirelativistic approximation works unexpectedly
well. This approach was first used by Cowan and Griffin in calculations of atomic energy
levels. More recently the 1-component quasirelativistic scheme has been applied to the
evaluation of relativistic corrections to a variety of atomic and molecular properties. Taking into account
several fundamental objections concerning the most advanced many-electron relativistic
Hamiltonians, the numerical problems occurring in relativistic molecular calculations, and the effort
and costs involved, the simplest 1-component approximation for relativistic effects is certainly
worth pursuing.
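The statement that the mass-velocity and Darwin terms carry the two largest corrections can be checked on the one case where everything is known in closed form, the 1s state of a hydrogen-like ion. The sketch below is an added illustration, not a calculation from these lectures. It assumes atomic units, a point nucleus, the textbook first-order expectation values $-5Z^4/(8c^2)$ (mass-velocity) and $Z^4/(2c^2)$ (Darwin) over the non-relativistic 1s function, and the exact Dirac 1s energy; all function names are hypothetical.

```python
import math

C = 137.035999  # speed of light in atomic units (inverse fine-structure constant)

def nonrel_1s(z):
    """Non-relativistic 1s energy of a hydrogen-like ion, in hartree."""
    return -0.5 * z**2

def pauli_1s(z):
    """1-component estimate: non-relativistic energy plus the first-order
    mass-velocity and Darwin corrections for the 1s function."""
    e_mv = -5.0 * z**4 / (8.0 * C**2)   # mass-velocity term
    e_d = z**4 / (2.0 * C**2)           # one-electron Darwin term
    return nonrel_1s(z) + e_mv + e_d

def dirac_1s(z):
    """Exact Dirac 1s energy for a point nucleus, rest energy subtracted."""
    return C**2 * (math.sqrt(1.0 - (z / C) ** 2) - 1.0)

if __name__ == "__main__":
    for z in (1, 10, 30, 80):
        print(z, nonrel_1s(z), pauli_1s(z), dirac_1s(z))
```

For small Z the first-order estimate is practically indistinguishable from the Dirac value. For heavy nuclei the neglected terms of order $c^{-4}$ and higher become visible, so the first-order numbers recover most, but not all, of the relativistic shift.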
On the basis of recent numerical results, the quasirelativistic 'spin-free no-pair' approximation
based on Sucher's projected Hamiltonian (3.10) appears to be the most promising. Within the
1-component framework this method includes relativistic effects on the electron-electron
interaction and provides a tool for studying the interplay between relativity and correlation effects.
In the context of quasirelativistic methods one should also mention recent progress in the
explicit perturbation evaluation of relativistic effects. By a certain modification of the metric for
4-component spinors one can devise a perturbation expansion which is based on non-relativistic
solutions and avoids the singularities of the Foldy-Wouthuysen transformation method. In the
first order with respect to $c^{-2}$ this method gives essentially the first-order result of the Pauli
approximation. However, numerical studies of Rutkowski and Kutzelnigg indicate that it is much
less sensitive to the approximate character of the non-relativistic reference function.
Finally, we shall also place in this Section the methods based on ad hoc one-electron potentials
which are supposed to simulate the effect of true relativistic contributions. Such methods are
used mainly in the framework of the spin-free 1-component approximation and are usually
restricted to the area of atomic calculations.

4.4 Pseudopotential Methods


The problem of handling systems with heavy atoms and a large number of electrons is already
quite serious in the non-relativistic approximation and has led to the development of a
variety of the so-called effective core potential (or pseudopotential) methods. The main idea of
these methods is to replace relatively inert atomic cores by some effective potential and
concentrate attention on the outer (valence) electrons. If this scheme is applied to relativistic equations
the resulting relativistic pseudopotentials have in general a 2-component form. Usually some
averaging procedure is applied and gives scalar 1-component relativistic pseudopotentials with
the magnetic effects to be evaluated a posteriori by using perturbation theory methods. The
determination of pseudopotentials requires certain reference data to evaluate numerical
parameters. These can be taken either from Dirac-Hartree-Fock calculations for atoms or obtained by
matching different experimental data for atoms.
The use of parametrized pseudopotentials makes the quality of the valence-only relativistic
or quasirelativistic calculations a little uncertain. In the first step one must resolve the question
of which electron shells should be considered as the core and which ones should be explicitly treated.
For heavy atoms this problem has by no means a simple answer, and different partitions of atomic
shells result in different sets of pseudopotentials. On the other hand, the pseudopotential schemes
have a chemical appeal of a kind; they support the belief that most of chemistry can be derived
from valence shells.
From the computational point of view the relativistic pseudopotential methods are not too
much different from all-electron techniques. They can be used in the one-electron approximation
as well as in any of the correlated-level methods. The performance of methods utilizing the
effective core approximation ranges from very poor to excitingly good depending on the system and,
to some extent, personal attitudes. However, if one takes into account the number of electrons in
chemically interesting systems, both the non-relativistic and relativistic pseudopotential
methods are worth developing. The corresponding relativistic methods are indispensable
in the study of heavy-metal compounds and heavy-metal clusters.

4.5 Remarks on Applications


Relativistic quantum chemistry is a rather new area; it is neither explored well enough
in the sense of numerical calculations nor as well developed as the non-relativistic theories.
In this respect relativistic atomic calculations represent an exception and have been carried out
by atomic physicists for a long time. There are a number of atomic relativistic programs which can be
used almost routinely. With molecules the situation is rather different. Codes for performing
even the simplest Dirac-Hartree-Fock calculations for molecules are not that common. There
are also only a few highly accurate relativistic results available. Most of the other data suffer from
deficiencies either in basic theory or in numerical accuracy.
The purpose of this series of lectures was primarily to give some background in relativistic
quantum mechanics which should be sufficient for understanding the kind of problems occurring
in relativistic quantum chemistry. Different computational aspects were scanned in this Section
with the intention to provide some overview of the methods which are currently in use in rel-
ativistic calculations of chemical interest. More details can be found in references listed and
commented upon in Section 5.
Since these lectures are concerned more with the methods than with calculations, no
numerical illustration of the performance of different computational schemes is given. This, as a
matter of fact, would be considerably hindered by the incompatibility of different approaches. Accurate
molecular relativistic calculations for non-trivial systems have appeared only recently. To reach
the level of reliability of non-relativistic methods, far more basic research and benchmark
calculations need to be performed using different relativistic and quasirelativistic approaches. With
the importance of heavy metals and their compounds in chemistry and chemical industry one
can expect that relativistic quantum chemistry will be one of the major areas of research in
the years to come.

5. References

The fundamentals of relativistic quantum mechanics can be found in a number of textbooks
and monographs [1 - 3]. Some interpretational problems of relativistic quantum mechanics are
qualitatively discussed by Powell [4] who also gives plots of spinor orbitals for hydrogen-like
ions. The problems of the relativistic theory of many-electron systems are well surveyed in
review articles of Sucher [5] and Grant [6,7] where references to other papers of interest are
given. Both formal and computational aspects of the atomic relativistic theory are covered by
other reviews of Grant [8]. Desclaux [9] has published a compilation of hydrogenic relativistic
wave functions and related properties for Z from 1 through 120. The very promising projection
method of Sucher is described in his articles [5,13]. Implementations of this method can be
found in Refs. [14] and [15].
Several technical aspects of relativistic calculations with truncated basis sets are described
in articles by Clementi and co-workers [10 - 12]. The many-body techniques in relativistic
calculations have been reviewed by Wilson [18] and Quiney [19]. One should also mention papers
by Malli and his co-workers on the relativistic CI method [20]. Relativistic pseudopotential
methods are comprehensively described in review articles by Balasubramanian and Pitzer [21]
and by Gropen [22].
The perturbation technique for the solution of the Dirac equation has been proposed by
Rutkowski [23]. Different aspects of this approach are discussed by Kutzelnigg [24]. Some
illustration of the usefulness of the 1-component quasirelativistic scheme can be found in recent
papers by myself and my co-workers [25].
Of particular value are two review articles by Pyykkö [16,17] which give a broad historical
account of different relativistic methods and provide excellent illustration for chemical aspects of
relativity. Pyykkö has also compiled [26] nearly all 'relativistic' papers published in the period
1916-1985.

1. A. Messiah, Quantum Mechanics, vol. 2, North-Holland, Amsterdam 1963.
2. R. E. Moss, Advanced Quantum Mechanics: An Introduction to Relativistic Quantum
Mechanics and the Quantum Theory of Radiation, Chapman and Hall, London 1973.
3. H. A. Bethe and E. E. Salpeter, Quantum Mechanics of One- and Two-Electron Atoms,
Academic Press, New York 1957.
4. R. E. Powell, J. Chem. Educ. 45, 558 (1968).
5. J. Sucher, in: Relativistic Effects in Atoms, Molecules, and Solids, ed. G. L. Malli, NATO
ASI Series, Plenum Press, New York 1983, p. 1.
6. I. Grant, in: Relativistic Effects in Atoms, Molecules, and Solids, ed. G. L. Malli, NATO
ASI Series, Plenum Press, New York 1983, p. 73.
7. I. Grant, in: Methods in Computational Chemistry, vol. 2, ed. S. Wilson, Plenum Press,
New York 1988, p. 1.
8. I. Grant, in: Relativistic Effects in Atoms, Molecules, and Solids, ed. G. L. Malli, NATO
ASI Series, Plenum Press, New York 1983, pp. 55, 89, 101.
9. J. P. Desclaux, At. Data Nucl. Data Tables 12, 311 (1973).
10. A. K. Mohanty, F. A. Parpia, and E. Clementi, in: Modern Techniques in Computational
Chemistry: MOTECC-91, ed. E. Clementi, ESCOM, Leiden 1991, p. 167.
11. F. A. Parpia and A. K. Mohanty, in: Modern Techniques in Computational Chemistry:
MOTECC-91, ed. E. Clementi, ESCOM, Leiden 1991, p. 211.
12. A. K. Mohanty, S. Panigrahy, and E. Clementi, in: Modern Techniques in Computational
Chemistry: MOTECC-91, ed. E. Clementi, ESCOM, Leiden 1991, p. 647.
13. J. Sucher, Phys. Rev. A 22, 348 (1980); J. Sucher, Int. J. Quantum Chem. 25, 3 (1984).
14. B. A. Hess, Phys. Rev. A 32, 756 (1985); B. A. Hess, Phys. Rev. A 33, 3742 (1986);
B. A. Hess and P. Chandra, Phys. Scr. 36, 412 (1987).
15. G. Jansen and B. A. Hess, Z. Phys. D 13, 363 (1989); G. Jansen and B. A. Hess, Chem.
Phys. Lett. 160, 507 (1989).
16. P. Pyykkö, Adv. Quantum Chem. 11, 353 (1978).
17. P. Pyykkö, Chem. Rev. 88, 563 (1988).
18. S. Wilson, in: Methods in Computational Chemistry, vol. 2, ed. S. Wilson, Plenum Press,
New York 1988, p. 73.
19. H. M. Quiney, in: Methods in Computational Chemistry, vol. 2, ed. S. Wilson, Plenum
Press, New York 1988, p. 227.
20. G. L. Malli and N. C. Pyper, Proc. Roy. Soc. London A 407, 377 (1986); A. F. Ramos,
N. C. Pyper, and G. L. Malli, Phys. Rev. A 38, 2729 (1988).
21. K. Balasubramanian and K. S. Pitzer, Adv. Chem. Phys. 67, 287 (1987).
22. O. Gropen, in: Methods in Computational Chemistry, vol. 2, ed. S. Wilson, Plenum Press,
New York 1988, p. 109.
23. A. Rutkowski, J. Phys. B: At. Mol. Phys. 19, 149, 3431, 3443 (1986); A. Rutkowski and
W. H. E. Schwarz, Theor. Chim. Acta 76, 391 (1990); A. Rutkowski, D. Rutkowska, and
W. H. E. Schwarz, Theor. Chim. Acta 84, 105 (1992).
24. W. Kutzelnigg, Z. Phys. D 11, 15 (1989); R. Franke and W. Kutzelnigg, Chem. Phys.
Lett. 199, 561 (1992).
25. V. Kellö and A. J. Sadlej, J. Chem. Phys. 93, 8122 (1990); A. J. Sadlej, J. Chem. Phys.
95, 2614 (1991); V. Kellö and A. J. Sadlej, J. Chem. Phys. 95, 8248 (1991); V. Kellö,
A. J. Sadlej, and K. Faegri, Jr., Phys. Rev. A 47, 1715 (1993).
26. P. Pyykkö, Relativistic Theory of Atoms and Molecules, Springer-Verlag, Berlin 1986.
Exercises with solutions

Compiled by
Roland Lindh
Per-Ake Malmqvist

Department of Theoretical Chemistry


Chemical Center, University of Lund
P.O.B. 124, S-22100 LUND, Sweden
The exercises in this book have been selected from those used at the European Summer School
in Quantum Chemistry, with a few additions. The provenance of these exercises
is sometimes unclear; however, we gratefully acknowledge contributions, work and
communications with Lars Pettersson, Jeppe Olsen, Trygve Helgaker, Peter Taylor, Bjorn
Roos, and Nicholas Handy. To you, and to any of the inevitably forgotten problem-text
authors, thank you very much, and forgive us the liberties taken.
The following sections are organized as follows. The exercise texts are kept separate from
the solutions. They are divided by topic using the following headers:

• Basis Sets
• Hartree-Fock
• Second Quantization

• Spin
• Geometrical Derivatives ...
• Density Functional Theory
• Coupled Cluster Theory
• Truncated CI ...
• Accurate Calculations
• MCSCF Theory

The three topics Basis Sets, Spin, and Truncated CI and Size Consistency are free-standing.
The others are exercises on the topics presented in other chapters of this book.
The MCSCF Theory exercises are not reprinted here, however; they appear in the MCSCF
Theory chapter of the first ESQC book ("Lecture Notes in Chemistry 58, Lecture Notes in
Quantum Chemistry, European Summer School in Quantum Chemistry", Springer-Verlag,
Heidelberg, 1992), and only the solutions are given here.
Exercises

1 Basis Sets
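A primitive Cartesian Gaussian basis function $G(\alpha, \mathbf{A}, l, m, n)$ centered at $\mathbf{A}$, with
exponent $\alpha$ and quantum numbers $l, m, n$, is given by

$G(\alpha, \mathbf{A}, l, m, n) = N(\alpha, \mathbf{A}, l, m, n)\,(x - A_x)^{l} (y - A_y)^{m} (z - A_z)^{n} \exp(-\alpha |\mathbf{r} - \mathbf{A}|^{2})$

where $N(\alpha, \mathbf{A}, l, m, n)$ is a normalization factor. Useful integrals in the following are: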

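$\int_{-\infty}^{\infty} e^{-\alpha x^{2}}\, dx = \sqrt{\frac{\pi}{\alpha}}$

$\int_{0}^{\infty} x^{2n+1} e^{-\alpha x^{2}}\, dx = \frac{n!}{2\alpha^{n+1}}$   ($n \in Z$, $n \ge 0$)

$\int_{-\infty}^{\infty} x^{2n} e^{-\alpha x^{2}}\, dx = \frac{(2n-1)!!}{(2\alpha)^{n}} \sqrt{\frac{\pi}{\alpha}}$   ($n \in Z$, $n \ge 1$)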

Note that $\alpha$, which is not an exponent, is always called 'the exponent' of a Gaussian
basis function, and that $l$, $m$, and $n$, which are exponents, should never be called so.
A complete set of functions with a common center and exponent, and the same degree
$N = l + m + n$, is called a shell. For N = 0, we have a single s function, N = 1 gives three
p functions, N = 2 gives six cartesian d functions, etc. This notation is borrowed from the
one used for spherical harmonic functions, and originates in spectroscopy (s, p, d, f stand
for sharp, principal, diffuse, and fundamental). This is improper for cartesian d functions
and higher, but the terminology is widespread, and the least confusing is probably to
accept it.
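The shell sizes quoted above (1, 3, 6, ...) follow from counting the exponent triples with a fixed degree. A minimal sketch of that counting, added here as an illustration; the function names are hypothetical and the ordering of the components within a shell is an arbitrary choice, since programs differ on this point:

```python
def cartesian_shell(degree):
    """Return all exponent triples (l, m, n) with l + m + n = degree."""
    return [(l, m, degree - l - m)
            for l in range(degree, -1, -1)
            for m in range(degree - l, -1, -1)]

def n_cartesian(degree):
    """Closed-form number of cartesian components in a shell."""
    return (degree + 1) * (degree + 2) // 2

if __name__ == "__main__":
    for name, degree in zip("spdf", range(4)):
        shell = cartesian_shell(degree)
        # The enumeration and the closed formula must agree.
        assert len(shell) == n_cartesian(degree)
        print(name, len(shell), shell)
```

Running it lists, for example, the six d components as the triples for xx, xy, xz, yy, yz, and zz.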
The basis functions are linear combinations, called contractions, of primitive gaussians.
When forming such contractions, one can also make sure to form linear combinations
which are true angular eigenfunctions. From one shell of six cartesian d functions, one
obtains one s component, which is not used, and five true d functions.
Another name (again, not quite proper) for gaussian basis functions is GTO's (Gaussian
Type Orbitals). One can also talk of primitive GTO's and Contracted GTO's. The latter
may be abbreviated CGTO's. However, this could also be taken to mean Cartesian GTO's
as distinct from Spherical Harmonic GTO's.

Exercise 1
Compute the overlap integral of two normalized s-type functions with exponents $\alpha$ and
$\beta$, and with a common center.

Exercise 2
The overlap of s and p functions on the same center must of course be zero - find a simple
and direct argument. What about the overlap of s functions with the six cartesian d
functions?

Exercise 3
Consider the six cartesian d functions centered at the origin. The angular momentum
operators are $i\hat{L}_x = y\,\partial/\partial z - z\,\partial/\partial y$ etc. Apply the operator $\hat{L}^2$ to the six components and
determine five real linear combinations $\phi_i$ such that $\hat{L}^2\phi_i = l(l+1)\phi_i$, with $l = 2$. These
are then a set of proper d functions, the Spherical Harmonic GTO's. Also find the sixth
linear combination, orthogonal to the others, with $l = 0$. This is called the s contaminant,
and is seldom used.
Since the complete cartesian d shell is invariant to rotations, if it is decomposed into
spherical harmonic components, there must be complete sets of these as well. Thus,
they come in sets of 1, 3, 5, ... functions of type s, p, d, .... What does this imply for the
transformation of the ten cartesian f functions to spherical harmonic functions?

Exercise 4

Find the radial maximum of GTO's of s, p, d and f type, i.e. $r = r_{max}(\alpha)$ such that the
radial density function $r^2\phi^2$ takes its maximum. Then find p, d and f exponents which
give the same maximum as a given s function. This, and similar, procedures are often
used to extend a basis set with functions for polarisation or correlation.

Exercise 5
Compute the overlap integral between two s-type functions centered at A and B, and
with exponents $\alpha$ and $\beta$, respectively.

Exercise 6
Same as Exercise 5, for p functions.

The integrals obtained from the integral evaluation are of two types: matrices for
one-electron operators, and for two-electron operators. The former have two basis function
indices p, q, while the latter have four indices. In non-relativistic work, for zero magnetic
field, the integrals are real numbers. Formally, with N basis functions, we need $N^2$ and
$N^4$ integrals. However, there is permutation symmetry in the indices:
(p|A|q) = (q|A|p)
(pq,rs) = (pq,sr) = (qp,rs) = (qp,sr) = (rs,pq) = (sr,pq) = (rs,qp) = (sr,qp)
(real basis functions, hermitian A, charge-cloud notation). For one-electron integrals, we
thus need only $\frac{1}{2}N(N+1)$ values, for instance, the lower triangle of the matrix, row by
row, which implies linear storage using a single triangular index $[pq] = \frac{1}{2}p(p-1) + q$, $p \ge q$.
Similarly, the two-electron integrals can be thought of as placed in a larger matrix with
indices [pq], [rs], and since this large matrix is itself symmetric, we use a single combined
index [[pq][rs]]. The number of two-electron integrals is then
$\frac{1}{2}\left(\frac{1}{2}N(N+1)\right)\left(\frac{1}{2}N(N+1) + 1\right)$, or roughly $\frac{1}{8}N^4$.
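The triangular and combined indices described above can be sketched in a few lines. This is an added illustration with hypothetical function names, assuming 1-based basis function indices as in the formula $[pq] = \frac{1}{2}p(p-1) + q$:

```python
def tri_index(p, q):
    """Triangular index [pq] = p(p-1)/2 + q for 1-based indices, any order."""
    if p < q:
        p, q = q, p  # enforce p >= q
    return p * (p - 1) // 2 + q

def eri_index(p, q, r, s):
    """Combined index [[pq][rs]] of the two-electron integral (pq,rs)."""
    return tri_index(tri_index(p, q), tri_index(r, s))

def n_one_electron(n):
    """Number of unique one-electron integrals for n basis functions."""
    return n * (n + 1) // 2

def n_two_electron(n):
    """Number of unique two-electron integrals for n basis functions."""
    m = n_one_electron(n)
    return m * (m + 1) // 2

if __name__ == "__main__":
    for n in (10, 100):
        print(n, n_one_electron(n), n_two_electron(n), n**4 / 8)
```

All eight permutationally equivalent index quadruples map to the same storage position, which is the point of the combined index. The printed comparison with $N^4/8$ shows how quickly the rough estimate becomes adequate.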
When a molecule has some symmetry, the number of integrals to compute, store and
use can be further reduced. There are two schemes: Symmetry-adapted basis functions,
and Petite List. The following description applies to the D2h group and its subgroups.
All the integrals $(G\phi_p\, G\phi_q, G\phi_r\, G\phi_s)$ have the same value for every operator G in the
point group. If we use a simple atom-centered basis set, these integrals involve different
basis functions, and only one of them needs to be stored. This is the basis for the Petite
List approach. We may instead form symmetry combinations of the basis functions -
Symmetry Adapted basis functions - and collect the symmetry combinations into blocks.
Each block is assigned a symmetry label $\lambda$, and all functions in a block transform the
same way, namely $G\phi_p = \chi_G(\lambda)\phi_p$, where the $\chi_G(\lambda)$, each equal to 1 or -1, are the symmetry
characters of the irreducible representation $\lambda$ of the group. Usually, the so-called Schönflies
notation is used for the point groups and their irreducible representations, and you are
assumed to have some familiarity with this notation.
For symmetry-adapted basis functions, the rule is now very simple: For each quadruple
of symmetry labels $\lambda_1 \ge \lambda_2$, $\lambda_3 \ge \lambda_4$, $[\lambda_1\lambda_2] \ge [\lambda_3\lambda_4]$, all the integrals $(pq,rs) = 0$ unless
$\chi_G(\lambda_1)\chi_G(\lambda_2)\chi_G(\lambda_3)\chi_G(\lambda_4) = 1$ for all operators G. In other words, the product in the
integral

$(pq,rs) = \int\!\!\int \phi_p(1)\phi_q(1)\, r_{12}^{-1}\, \phi_r(2)\phi_s(2)\, d\tau_1 d\tau_2$

must transform as a totally symmetric function to be non-zero. A simple estimate of
the symmetry reduction is obtained by simply dividing by the size g of the group. The
number of two-electron integrals is then roughly $N^4/8g$.
Point groups which contain other than two-fold symmetry elements require special
handling. For SCF calculations there is a simple way to use Petite List integrals for higher
point groups. Standard programs are not equipped to deal with such symmetries for
correlated wave functions.
In $C_s$ symmetry, which has only a mirror plane, there are two irreducible representations:
A' is even, and A'' is odd with respect to reflection. We use capital letters in general,
and lower case for orbitals. Also common: $\sigma$ and $\pi$ for the orbitals, but this is not quite
proper. In $C_{2v}$ there are four symmetry operations, and four irreducible representations,
called A1, B1, B2, and A2. The character table is

Label   E    C2   σv   σ'v
A1      1    1    1    1     z, z^2, x^2, y^2
B1      1   -1    1   -1     x, xz              R_y
B2      1   -1   -1    1     y, yz              R_x
A2      1    1   -1   -1     xy                 R_z
From this, we can also deduce a multiplication table for the functions:

        A1   B1   B2   A2
A1      A1   B1   B2   A2
B1      B1   A1   A2   B2
B2      B2   A2   A1   B1
A2      A2   B2   B1   A1

Example: Electrostatic integrals in the symmetry block (a2 b2, b1 a1) may be non-zero, since
the product is totally symmetric. The product a2 b2 is B1 to the left of the comma, b1 a1 to
the right is also B1, and of course B1 B1 is A1, which is the label of the totally symmetric
function. Remember: capital letters in general, but lower case for basis functions and
orbitals (one-electron functions).
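The rule for symmetry-allowed integrals can be turned into a small counting routine. The sketch below is an added illustration, not the procedure of any particular program: it multiplies the $C_{2v}$ characters from the table above to obtain products of irreducible representations, and counts the unique two-electron integrals that survive for given symmetry block sizes. The function names and the block sizes in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Characters of the C2v irreducible representations under E, C2, sigma_v, sigma_v'
C2V = {
    "A1": (1, 1, 1, 1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
    "A2": (1, 1, -1, -1),
}

def irrep_product(a, b):
    """Label of the product of two irreps, found by multiplying characters."""
    chars = tuple(x * y for x, y in zip(C2V[a], C2V[b]))
    for label, row in C2V.items():
        if row == chars:
            return label
    raise ValueError("product is not an irrep of the table")

def allowed(a, b, c, d):
    """True if integrals (pq,rs) with labels (a b, c d) may be non-zero."""
    return irrep_product(irrep_product(a, b), irrep_product(c, d)) == "A1"

def count_two_electron(block_sizes):
    """Unique symmetry-allowed integrals for the given block sizes."""
    labels = [lab for lab, n in block_sizes.items() for _ in range(n)]
    # Number of index pairs p >= q for each symmetry of the product phi_p phi_q
    pairs = Counter(irrep_product(a, b)
                    for a, b in combinations_with_replacement(labels, 2))
    # (pq,rs) survives only if both pairs carry the same product symmetry
    return sum(m * (m + 1) // 2 for m in pairs.values())

if __name__ == "__main__":
    print(allowed("A2", "B2", "B1", "A1"))
    print(count_two_electron({"A1": 2, "B1": 1, "B2": 1, "A2": 0}))
```

The first call reproduces the example in the text. The count relies on the fact that, for these groups, an integral is allowed exactly when the two charge distributions have the same product symmetry.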

Exercise 7
a) Assume that, in a calculation on C3N, we use an ANO basis set with 14 primitive
s functions, 9 shells of p functions etc. The notation ANO(14s9p4d3f)/[5s4p2d1f], for
example, means that we use 5 s-type contracted functions, etc.
For the three contractions [4s3p2d], [5s4p2d1f], and [5s4p3d2f], calculate the number
of symmetry adapted basis functions of a' and a'' type, and the number of two-electron
integrals over symmetry-adapted basis functions. Compare to the simple estimate, $N^4/8g$.
Also, compute the number of integrals over primitive basis functions.

Exercise 8
For a pyramidal Cu5 cluster, compute the maximum number of one- and two-electron
integrals that have to be stored on disk. A cartesian (12s8p5d)/[5s4p1d] basis was used,
and the symmetry was C2v, which gives 35 a1, 28 b1, 28 b2, and 24 a2 basis functions (These
numbers do not quite add up: the explanation is that five basis functions were deleted
from the calculation). The number of integrals actually stored in this calculation was
4 238 521; the program stored only those larger than $10^{-14}$.

Exercise 9
In a calculation on CuF, at a distance close to equilibrium, 4609521 integrals were
generated. At a large distance ($100\,a_0$) only 1272495 integrals were produced. What is the
origin of this difference?

Exercise 10

In a calculation on N2, using D2h symmetry and a (13s8p6d4f)/[5s4p3d2f] ANO basis,
2409047 integrals were produced close to equilibrium, and 2215567 at $100\,a_0$. Compare
the number of integrals with a simple estimate. Why was the reduction so small in this
case? The integral calculations required 2100 s and 1500 s, respectively, on a single CPU
of an Alliant FX/8. Why was the reduction in time so substantial, when the number of
integrals produced was so nearly the same?

2 Hartree-Fock
See Almlöf's chapter on Hartree-Fock, Appendix A, for integral notation. Expectation
values over determinant functions of orthonormal spin-orbitals are found in the same
chapter, Eqs 4.6 and 4.10-14. Matrix elements over different determinants are found in
Eqs 23.19-22. For the F, J, and K operators as they appear in the UHF and Closed-shell
cases, see also Chapters 6 and 7.

Exercise 1

Write down the Fock operator for the $1s^2 2s^2$ state of Be.

Exercise 2

Show Koopmans' theorem for a closed-shell molecule, i.e. that the ionization potential
in the frozen approximation equals $-\epsilon_k$, the negative of the orbital energy. The frozen
approximation means that the ion state is obtained using the canonical orbitals of the
neutral state, simply by forming a determinant with one of the orbitals removed.

Exercise 3

By a single excitation from a closed shell determinant, we can obtain a singlet and a
triplet state. Compute the energy difference between these states. Which is the lower?

Exercise 4

The HF orbital energies are given by

$\epsilon_i = h_{ii} + \sum_{j=1}^{m} \left( \langle ij|ij\rangle - \langle ij|ji\rangle \right)$

where the sum is over the m occupied shells. The exchange terms are much smaller than
the Coulomb terms, except of course for the case i = j. Compare the two cases where
i is an occupied and where i is a virtual orbital. Also show Koopmans' theorem for the
electron affinity in the frozen approximation. This is similar to Exercise 2, except that
we add a virtual orbital instead of removing an occupied orbital.

Exercise 5

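The Fock equation in a non-orthogonal basis $\{\chi_i\}$ has the form

$\mathbf{F}\mathbf{c}_k = \epsilon_k \mathbf{S}\mathbf{c}_k$

where $S_{ij} = \langle\chi_i|\chi_j\rangle$ and $F_{ij} = \langle\chi_i|F|\chi_j\rangle$ are the overlap matrix and Fock matrix, and $\mathbf{c}_k$
is a vector of MO coefficients, which express the orbital $\phi_k$ in the basis: $\phi_k = \sum_i c_{ik}\chi_i$.
a) Derive the formula by which this equation is brought to the form

$\mathbf{F}'\mathbf{c}'_k = \epsilon_k \mathbf{c}'_k$

where $\mathbf{c}_k = \mathbf{T}\mathbf{c}'_k$ for some transformation matrix $\mathbf{T}$, such that $\mathbf{F}'$ is symmetric (assume
real matrices).
b) The matrix $\mathbf{T}$ is not orthogonal. Derive a simple expression for $\mathbf{T}^{-1}$ in terms of $\mathbf{T}^{T}$
and $\mathbf{S}$.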

Exercise 6

Consider a real closed-shell Hartree-Fock wave function. Assume that the occupied
canonical HF orbitals can be expressed in some finite, non-orthogonal real basis set $\{\chi_p\}$,
so that $\psi_i = \sum_p C_{pi}\chi_p$. The density matrix in this basis is $D_{pq} = 2\sum_{i=1}^{n} C_{pi}C_{qi}$ (using n
occupied orbitals).
a) Show that the orthonormality of the HF orbitals implies the condition

DSD = 2D

where S is the overlap matrix, and that D is invariant to orthogonal transformations
among the occupied orbitals.
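b) Show that the Fock matrix in this basis, $F_{pq} = \langle\chi_p|F|\chi_q\rangle$, can be expressed as

$F_{pq} = h_{pq} + \sum_{rs} D_{rs} \left( \langle pr|qs\rangle - \frac{1}{2}\langle pr|sq\rangle \right)$

which means that F is invariant to ON transformations among the occupied orbitals.
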

Exercise 7
The determinant is linear in all its columns, so for a first-order variation in the orbitals,

The energy expectation value is $E = \langle\Psi_0|\hat{H}|\Psi_0\rangle$, so the first-order energy variation is

Show that the variations of an arbitrary occupied orbital $\psi_i$ of the two forms $\delta\psi_i = \delta x\,\psi_a$
and $\delta\psi_i = i\,\delta x\,\psi_a$, where $\psi_a$ is an arbitrary virtual orbital and $\delta x$ a real number, preserve
orthonormality to first order in $\delta x$. What does this imply for the matrix element $\langle\Psi_0|\hat{H}|\Psi_i\rangle$
for the Hartree-Fock state?

Exercise 8

Use the exact H 1s orbitals in a minimal LCAO basis for the H2 $M_S = 0$ states. The
closed-shell "ground state" determinant is $|g\bar{g}\rangle$, and there are also two singly-excited states
(singlet and triplet) and a "doubly excited" determinant $|u\bar{u}\rangle$. Use the known hydrogen
atom energy $E_h = -\frac{1}{2}$ a.u., and $(1s1s|1s1s) = \frac{5}{8}$ a.u. to calculate the energy of these
determinants at the dissociation limit.

3 Second Quantization

Exercise 1

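Let $|n\rangle$ be an arbitrary occupation number vector. Show that

$a_i^{\dagger} a_j |n\rangle = (1 - n_i + \delta_{ij})\, n_j\, \epsilon_{ij}\, \Gamma_i(n)\, \Gamma_j(n)\, |k\rangle$

where

$k_l = n_l$ if $l \ne i$, $l \ne j$,
$k_i = 1$,
$k_j = \delta_{ij}$,

and $\epsilon_{ij} = 1$ if $i \le j$, $\epsilon_{ij} = -1$ if $i > j$.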
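A numerical check cannot replace the proof asked for, but it helps to catch sign errors. The sketch below is an added illustration: it applies the annihilation and creation operators one after the other to every occupation number vector of a small orbital space and compares the outcome with the closed expression. It assumes the usual phase convention $\Gamma_i(n) = \prod_{l<i} (-1)^{n_l}$, which is not restated in the exercise; orbitals are numbered from 0 and all function names are hypothetical.

```python
from itertools import product

def gamma(n, i):
    """Phase factor: product of (-1)**n_l over all l < i (assumed convention)."""
    return -1 if sum(n[:i]) % 2 else 1

def annihilate(n, j):
    """Apply a_j to |n>; returns (factor, vector), factor 0 if the result vanishes."""
    if n[j] == 0:
        return 0, n
    k = list(n)
    k[j] = 0
    return gamma(n, j), tuple(k)

def create(n, i):
    """Apply a_i^dagger to |n>; returns (factor, vector), factor 0 if it vanishes."""
    if n[i] == 1:
        return 0, n
    k = list(n)
    k[i] = 1
    return gamma(n, i), tuple(k)

def excite(n, i, j):
    """Apply a_i^dagger a_j to |n> step by step."""
    f1, k = annihilate(n, j)
    if f1 == 0:
        return 0, n
    f2, k = create(k, i)
    if f2 == 0:
        return 0, n
    return f1 * f2, k

def closed_form(n, i, j):
    """Evaluate the closed expression of the exercise."""
    delta = 1 if i == j else 0
    eps = 1 if i <= j else -1
    factor = (1 - n[i] + delta) * n[j] * eps * gamma(n, i) * gamma(n, j)
    if factor == 0:
        return 0, n
    k = list(n)
    k[j] = delta
    k[i] = 1
    return factor, tuple(k)

def check(n_orbitals=4):
    """Compare both evaluations for all vectors and all index pairs."""
    for n in product((0, 1), repeat=n_orbitals):
        for i in range(n_orbitals):
            for j in range(n_orbitals):
                if excite(n, i, j) != closed_form(n, i, j):
                    return False
    return True

if __name__ == "__main__":
    print(check(4))
```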

Exercise 2
The following equalities are constantly used to evaluate products of elementary
operators ('field operators'), and in particular their matrix elements. Prove the following six
formulae:

$[A, B_1 B_2] = [A, B_1] B_2 + B_1 [A, B_2]$

$[A, B_1 B_2 \cdots B_n] = \sum_{k=0}^{n-1} B_1 \cdots B_k [A, B_{k+1}] B_{k+2} \cdots B_n$

$[A, B_1 B_2] = [A, B_1]_+ B_2 - B_1 [A, B_2]_+$

$[A, B_1 B_2 \cdots B_n] = \sum_{k=0}^{n-1} (-1)^k B_1 \cdots B_k [A, B_{k+1}]_+ B_{k+2} \cdots B_n$   (n even)

$[A, B_1 B_2]_+ = [A, B_1] B_2 + B_1 [A, B_2]_+ = [A, B_1]_+ B_2 - B_1 [A, B_2]$

$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$

(A product such as $B_1 \cdots B_k$ is ignored when it is empty, in this case for k = 0.)

Exercise 3
Let $\hat{\kappa}$ and $\hat{f}$ be one-electron operators and $\hat{g}$ a two-electron operator. Show that the
commutators $[\hat{\kappa}, \hat{f}]$ and $[\hat{\kappa}, \hat{g}]$ are one-electron and two-electron operators, respectively.

Exercise 4
Let $I_n$ and $I_m$ be products of n and m elementary operators (creation or annihilation
operators), respectively. Show that, for n and m both even, the commutator $[I_n, I_m]$ can
be reduced to a sum of terms, where each term is the product of at most n + m - 2
elementary operators.

Exercise 5
Prove the following relations for operators and square matrices. You may assume that
the Taylor expansions are absolutely convergent, so any reordering of terms or regular
summation is allowed. This is true for matrices and for bounded operators.

$\exp(A)^{\dagger} = \exp(A^{\dagger})$
$B \exp(A) B^{-1} = \exp(B A B^{-1})$
$\exp(A + B) = \exp(A)\exp(B)$ if $[A, B] = 0$
$\exp(A)\exp(-A) = 1$
$\frac{d}{d\lambda} \exp(\lambda A) = A \exp(\lambda A) = \exp(\lambda A) A$
$\exp(A) B \exp(-A) = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \dots + \frac{1}{n!}[A, [A, \cdots, [A, B] \cdots ]] + \dots$

Hint: To prove the last equation, define a function $F(\lambda) = \exp(\lambda A) B \exp(-\lambda A)$.

Exercise 6
Let D be a diagonal matrix with elements $d_i$, i = 1 ... n in the diagonal. Show that exp(D)
is also a diagonal matrix, with diagonal elements $\exp(d_i)$, i = 1 ... n. Assume that A is
any general, square matrix that can be diagonalized by a similarity transformation:
$A = X^{-1} D X$
Show that for any such matrix,
det(exp(A)) = exp(Tr(A))

Exercise 7
Two of the operators occurring in relativistic quantum mechanics are the two-electron
Darwin term and the spin-spin contact term, respectively:

$$H_{2D} = -\frac{\pi}{c^2} \sum_{i>j} \delta(\mathbf{r}_i - \mathbf{r}_j) \quad \text{and} \quad H_{SSC} = -\frac{8\pi}{3c^2} \sum_{i>j} \mathbf{S}_i \cdot \mathbf{S}_j\, \delta(\mathbf{r}_i - \mathbf{r}_j)$$

where $\delta$ is the Dirac delta function.
a) Determine the second-quantization representation of $H_{2D}$ and $H_{SSC}$.
b) Exploit the permutation symmetry of the integrals to prove that
$$H_{2D} = -\frac{1}{2} H_{SSC}$$

Exercise 8
One of the operators describing the interaction between a nuclear magnetic dipole and
electrons is the Fermi contact term
$$H_{fc} = \sum_A \sum_i \gamma_A\, \mathbf{S}_i \cdot \mathbf{I}_A\, \delta(\mathbf{r}_{iA})$$
Determine the second quantization representation of $H_{fc}$.

Exercise 9
Prove the commutation relations
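$$[E_{mn}, a^\dagger_{i\sigma}] = \delta_{in}\, a^\dagger_{m\sigma}$$
$$[E_{mn}, a_{i\sigma}] = -\delta_{mi}\, a_{n\sigma}$$
$$[E_{mn}, E_{ij}] = \delta_{in} E_{mj} - \delta_{mj} E_{in}$$
$$[E_{mn}, e_{ijkl}] = \delta_{in} e_{mjkl} - \delta_{mj} e_{inkl} + \delta_{kn} e_{ijml} - \delta_{ml} e_{ijkn}$$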

Exercise 10

a) Prove that these are components of a triplet tensor operator:

$\hat{Q}_{ij}(1,1)$

$\hat{Q}_{ij}(1,0)$

$\hat{Q}_{ij}(1,-1)$

b) Prove that this is a singlet tensor operator:

$$\hat{Q}_{ij}(0,0) = \frac{1}{\sqrt{2}}\left(a^\dagger_{i,\frac12} a^\dagger_{j,-\frac12} - a^\dagger_{i,-\frac12} a^\dagger_{j,\frac12}\right)$$

c) Use the relations in Eq. (5.22) to carry out substitutions in Eqs. (10.1) and (10.2) to
show that

$\hat{T}_{ij}(1,1)$

$\hat{T}_{ij}(1,0)$

$\hat{T}_{ij}(1,-1)$

are components of a triplet tensor operator and

$$\hat{S}_{ij}(0,0) = \frac{1}{\sqrt{2}}\left(a^\dagger_{i,\frac12} a_{j,\frac12} + a^\dagger_{i,-\frac12} a_{j,-\frac12}\right)$$

is a singlet tensor operator.

Exercise 11

Let $\hat{S}$ be the total spin operator and $\hat{N}_i = a^\dagger_{i\alpha} a_{i\alpha} + a^\dagger_{i\beta} a_{i\beta}$ the orbital occupation number
operator. Prove that $\hat{S}^2$ does not change the orbital occupations, i.e., $[\hat{S}^2, \hat{N}_i] = 0$.

Exercise 12

Calculate the matrix exp( K) where K is


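a) A 2 x 2 real antisymmetric matrix,

b) A 2 x 2 complex antihermitian matrix with zero diagonal,

c) A general 2 x 2 antihermitian matrix,

$$K = \begin{pmatrix} i\delta_1 & \delta \\ -\delta^* & i\delta_2 \end{pmatrix}$$
with $\delta = \lambda + i\epsilon$, and $\delta_1, \delta_2, \lambda, \epsilon$ are real.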

Hint: ( i0 1
-0"

Exercise 13

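a) Show that the exponential operator $\hat{U} = \exp(i\pi \hat{S}_x)$ is a spin-flip operator, i.e.,

$$\hat{U} a^\dagger_{i\alpha} \hat{U}^\dagger \propto a^\dagger_{i\beta}$$

$$\hat{U} a^\dagger_{i\beta} \hat{U}^\dagger \propto a^\dagger_{i\alpha}$$

b) Let $|0,0\rangle$ be a singlet spin state. The numbers denote the total spin $S$ and its z
component $M_S$ for the state, respectively. Show that

c) Let $|1,0\rangle$ be a triplet state with spin projection $M_S = 0$. Show that

$$\exp(i\pi \hat{S}_x)|1,0\rangle = -|1,0\rangle$$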

Exercise 14

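Let $|0\rangle, |0_k\rangle$ be an orthonormal basis of a vector space and introduce the exponential

$$\exp(i\hat{R}) = \exp\Big(i \sum_{k \neq 0} \big(R_k |0_k\rangle\langle 0| + R_k^* |0\rangle\langle 0_k|\big)\Big)$$

a) Show that
$$\exp(i\hat{R})|0\rangle = \cos d\, |0\rangle + \frac{i \sin d}{d} \sum_k R_k |0_k\rangle,$$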

Exercise 15
Let $a_i^\dagger$ be a set of creators for orthonormal orbitals. A set of non-orthogonal orbitals can
be defined with creators of the form

$$\tilde{b}_i^\dagger = \exp(i\hat{\kappa})\, a_i^\dagger \exp(-i\hat{\kappa})$$

where $\hat{\kappa}$ is general, not necessarily hermitian.
Find a similar expression for annihilation operators $\hat{b}_i$ such that
$$[\hat{b}_i, \tilde{b}_j^\dagger]_+ = \delta_{ij}$$

Note that, in general, $\hat{b}_i^\dagger \neq \tilde{b}_i^\dagger$. The 'tilde' operators are then creators and annihilators for
another set of orbitals than the 'hatted' operators. The two orbital sets are said to be
biorthonormal.

4 Spin
To remind you of the basic nomenclature and definitions, here is a list of some pertinent
facts about electron spin. The spin operator $\hat{\mathbf{S}}$ is a vector operator. In any given reference
frame, it can be represented by three components, $\hat{\mathbf{S}} = (\hat{S}_x, \hat{S}_y, \hat{S}_z)$, and on rotations of the
reference frame, these components mix like components of an ordinary vector. The three
components must obey the cyclic commutation relations $[\hat{S}_x, \hat{S}_y] = i\hat{S}_z$, $[\hat{S}_y, \hat{S}_z] = i\hat{S}_x$,
and $[\hat{S}_z, \hat{S}_x] = i\hat{S}_y$, since they are components of an angular momentum operator (we
use a system of units where $\hbar = 1$). The operator $\hat{S}^2$ is defined as the scalar operator
$\hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2$. The eigenfunctions of $\hat{S}^2$ have eigenvalues $S(S + 1)$, where $S$ can be
$0, \frac12, 1, \frac32, \ldots$, i.e. only non-negative integer or half-integer values are possible. The
functions are called singlet, doublet, triplet, ... wave functions, depending on $S$. The
eigenfunctions of $\hat{S}_z$ have eigenvalues $M_S$, which also must be integer or half-integer, but
can have either sign. The one-electron wave functions always have $S = \frac12$. These functions
must, strictly speaking, have one additional argument, namely the direction of the electron
spin, apart from the position variable $\mathbf{r}$. This additional variable $m$ is an internal degree of
freedom, and the internal structure of the electron is not part of quantum chemistry but is
assumed to be completely specified for any eigenfunction of $\hat{S}_z$. One possible choice of a
complete set of observables for the electron is thus $(\mathbf{r}, m)$, where $m$ can only be $\frac12$ or $-\frac12$,
and variables in this composite space are, if needed, denoted $x$. In formulas, one often
integrates over this space, meaning a sum of the two integrals with $m = \frac12$ and $m = -\frac12$.
Any arbitrary one-electron wave function (or spin-orbital) can then be described by a
decomposition $\Psi(x) = \phi_1(\mathbf{r})\chi_1(m) + \phi_2(\mathbf{r})\chi_2(m)$ if the two functions $\chi_{1,2}$ are independent
functions of $m$. Customarily, the two functions $\alpha$ and $\beta$ are used: $\alpha(\frac12) = 1$, $\alpha(-\frac12) = 0$,
$\beta(\frac12) = 0$, and $\beta(-\frac12) = 1$.

Exercise 1
Show that the commutator $[\hat{S}^2, \hat{S}_z] = 0$.

Exercise 2
Show that

Exercise 3
Define the operators $\hat{S}_+ = \hat{S}_x + i\hat{S}_y$ and $\hat{S}_- = \hat{S}_x - i\hat{S}_y$.
Let $|S, M_S\rangle$ denote some arbitrary
eigenfunction of $\hat{S}^2$ and $\hat{S}_z$. Use the commutation rules and definitions above to show that
the new functions $\hat{S}_+|S, M_S\rangle$ and $\hat{S}_-|S, M_S\rangle$ are also spin eigenfunctions, if non-zero. The
operators $\hat{S}_+$ and $\hat{S}_-$ are called spin-up and spin-down operators. Operators of this type
are generally called ladder operators or step operators.

Exercise 4
Find a 2 x 2 matrix representation for all the spin operators encountered so far, using
the basis $\alpha$ and $\beta$. Begin by noting that the representations for $\hat{S}_+$ and $\hat{S}_-$ are almost
obvious in this basis: these matrices contain a single non-zero complex number, with
known magnitude but unknown phase. The phase cannot be obtained from definitions
and commutation relations (Proof: Those are still fulfilled if any or both of the basis
functions are scaled by phase factors of unit magnitude). Choose the phase so that the
$\hat{S}_+$ matrix elements are real and non-negative. The matrices you obtain are called the
'standard representation'. Twice the $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$ matrices are called the Pauli spin
matrices, also denoted $\sigma_x$, $\sigma_y$, and $\sigma_z$, often regarded as components of a vector $\boldsymbol{\sigma}$.
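The phase convention described in this exercise can be checked numerically. The following is a minimal sketch, assuming the basis order (alpha, beta) and NumPy; the variable names are illustrative and not part of the notes. It builds the ladder matrices with a real, non-negative element and verifies the commutation relations quoted in the introduction to this section.

```python
import numpy as np

# Basis order (alpha, beta). S+ turns beta into alpha; its single non-zero
# element is chosen real and non-negative (the 'standard' phase choice).
s_plus = np.array([[0, 1], [0, 0]], dtype=complex)
s_minus = s_plus.conj().T

# Invert S+- = Sx +- i Sy to obtain the Cartesian components.
s_x = (s_plus + s_minus) / 2
s_y = (s_plus - s_minus) / (2j)
s_z = np.array([[0.5, 0], [0, -0.5]], dtype=complex)
s2 = s_x @ s_x + s_y @ s_y + s_z @ s_z


def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a


# Cyclic commutation relations (hbar = 1) and S^2 = S(S+1) with S = 1/2.
assert np.allclose(comm(s_x, s_y), 1j * s_z)
assert np.allclose(comm(s_y, s_z), 1j * s_x)
assert np.allclose(comm(s_z, s_x), 1j * s_y)
assert np.allclose(s2, 0.75 * np.eye(2))

print("sigma_y =\n", 2 * s_y)
```

With a different phase for the element of S+, the matrices for S_x and S_y would be rotated into each other, but all the assertions above would still hold.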
It is convenient to introduce spin operators for each individual electron, denoted e.g. as
$\hat{s}_x(i)$, etc., for electron number $i$, and thus $\hat{S}_x = \sum_i \hat{s}_x(i)$. A product of two operators
is then obtained as e.g.

$$\hat{S}_+\hat{S}_- = \sum_{ij} \hat{s}_+(i)\hat{s}_-(j) = \sum_i \hat{s}_+(i)\hat{s}_-(i) + \sum_{i \neq j} \hat{s}_+(i)\hat{s}_-(j)$$

which is a sum of a one-electron and a two-electron operator. For non-relativistic
applications, one often handles determinants written as e.g. $|\phi_1 \bar{\phi}_1 \phi_2|$, where the overline
denotes a beta spin, and the indices refer to spatial dependence. Then the spin orbitals
have the special form

$$\phi_i(x) = \phi_i(\mathbf{r})\,\alpha(m)$$
$$\bar{\phi}_i(x) = \phi_i(\mathbf{r})\,\beta(m)$$

Exercise 5
Show that the general determinant $D = |\phi_1 \cdots \phi_N|$ is an eigenfunction of $\hat{S}_z$ with
eigenvalue $M_S = \frac12(n_\alpha - n_\beta)$, where $n_\alpha$ and $n_\beta$ are the number of alpha and beta spin
orbitals in the determinant.

Exercise 6

Show that
$$\hat{S}^2 D = \Big\{ \sum_{p<q} \hat{X}_{pq} + \frac{1}{4}\big[(n_\alpha - n_\beta)^2 + 2(n_\alpha + n_\beta)\big] \Big\} D$$
where we have defined a spin interchange operator $\hat{X}_{pq}$: This operator gives a non-zero
result when spin-orbitals in the positions $p$ and $q$ have different spin, and it then moves
the overline from the one to the other. This formula is very handy for hand calculations.
In application, it should be applied to the open shells only, and $n_\alpha$, $n_\beta$ should then only
count the open-shell spins. Try to show why this works.
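The formula of this exercise is easy to mechanize for open shells. The sketch below is only an illustration under stated assumptions: a determinant is represented by a string of spin labels for its open shells ('a' for alpha, 'b' for beta), a linear combination is a dictionary from such strings to coefficients, and the function name `apply_s2` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def apply_s2(vector):
    """Apply S^2 to a linear combination of open-shell determinants.

    `vector` maps spin strings such as 'aab' to coefficients. Each string
    lists the spins of the open shells only, in a fixed orbital order.
    """
    result = defaultdict(Fraction)
    for det, coeff in vector.items():
        n_a, n_b = det.count("a"), det.count("b")
        # Diagonal part: (1/4) [(n_a - n_b)^2 + 2 (n_a + n_b)]
        result[det] += Fraction((n_a - n_b) ** 2 + 2 * (n_a + n_b), 4) * coeff
        # Spin interchange X_pq for every pair p < q with different spins.
        for p in range(len(det)):
            for q in range(p + 1, len(det)):
                if det[p] != det[q]:
                    swapped = list(det)
                    swapped[p], swapped[q] = swapped[q], swapped[p]
                    result["".join(swapped)] += coeff
    # Drop terms that cancelled exactly.
    return {d: c for d, c in result.items() if c != 0}


print(apply_s2({"ab": 1}))            # one determinant: not an eigenfunction
print(apply_s2({"ab": 1, "ba": 1}))   # symmetric combination
print(apply_s2({"ab": 1, "ba": -1}))  # antisymmetric combination
```

Exact fractions are used so that cancellations, such as the vanishing result for the antisymmetric combination, are detected without rounding issues.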

Exercise 7
It is easily seen that only open shells need to be considered in the above construction. A
simplified notation is then used for hand calculations: just list the spin labels of the open
shells. For the set of determinants with two electrons in two open shells, $D_1 = |\alpha\alpha|$, $D_2 =
|\alpha\beta|$, $D_3 = |\beta\alpha|$, and $D_4 = |\beta\beta|$, compute the effect of $\hat{S}^2$ on each of these determinants.
In general, a single determinant is not an $\hat{S}^2$ eigenfunction, even if it is an eigenfunction
of $\hat{S}_z$. However, a high-spin determinant, which may have a number of paired (closed, or
doubly-occupied) orbitals, but in which all unpaired (open) orbitals have the same spin, is
always an eigenfunction of $\hat{S}^2$, and has $S = |M_S|$. The number of open orbitals is thus $2S$ for
a high-spin determinant. To construct eigenfunctions with some specified $S$ when there
are more open shells than $2S$, we need in general a linear combination of the different
possible determinants. A very general (but often clumsy) method to obtain such linear
combinations is to use a projection operator:
$$\hat{P}(S) = \prod_{K \neq S} \hat{O}_K, \qquad \hat{O}_K = \frac{\hat{S}^2 - K(K + 1)}{S(S + 1) - K(K + 1)}$$
The projection operator is applied to a suitable determinant, and is said to project out the
desired spin eigenfunction. The result is, in practice, obtained by sequentially applying
the operators $\hat{O}_K$ with different values of $K$. From Exercise 6, this results in forming
various sums of determinants where the spins of open shells have been switched by the spin
interchange operator. If the original determinant is considered as a sum of terms with
different spin $S'$, it is easily seen that $\hat{O}_K$ produces new sums where each term has been
scaled except the one with spin $S$, and that the term with spin $S' = K$ is removed.
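The sequential application of the factors in the projection operator can be sketched numerically. The example below is an illustration only, assuming NumPy, determinants labelled by open-shell spin strings ('a', 'b'), and the spin-interchange form of S^2 for open shells; the helper names `s2_matrix` and `project` are hypothetical.

```python
import itertools
import numpy as np


def s2_matrix(n_open, n_alpha):
    """Return (dets, S^2 matrix) for n_open open shells with n_alpha alpha spins."""
    dets = ["".join("a" if i in pos else "b" for i in range(n_open))
            for pos in itertools.combinations(range(n_open), n_alpha)]
    index = {d: k for k, d in enumerate(dets)}
    n_beta = n_open - n_alpha
    mat = np.zeros((len(dets), len(dets)))
    for d, k in index.items():
        mat[k, k] = ((n_alpha - n_beta) ** 2 + 2 * n_open) / 4
        for p, q in itertools.combinations(range(n_open), 2):
            if d[p] != d[q]:  # spin interchange between positions p and q
                swapped = list(d)
                swapped[p], swapped[q] = swapped[q], swapped[p]
                mat[index["".join(swapped)], k] += 1
    return dets, mat


def project(vec, s2, spin, n_open, n_alpha):
    """Apply O_K for every K != spin allowed by the given M_S (result unnormalized)."""
    out = np.array(vec, dtype=float)
    k = abs(2 * n_alpha - n_open) / 2          # smallest possible K is |M_S|
    while k <= n_open / 2 + 1e-9:
        if abs(k - spin) > 1e-9:
            out = (s2 @ out - k * (k + 1) * out) / (spin * (spin + 1) - k * (k + 1))
        k += 1
    return out


# Three open shells, M_S = 1/2: project the determinant 'aab' onto S = 1/2.
dets, s2 = s2_matrix(3, 2)
start = np.zeros(len(dets))
start[dets.index("aab")] = 1.0
doublet = project(start, s2, 0.5, 3, 2)
print(dict(zip(dets, doublet)))
```

The projected vector is not normalized, and with three open shells a different starting determinant gives a different doublet function, since the S = 1/2 space is two-dimensional.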

Exercise 8

In the basis $D_1 \ldots D_4$ defined in Exercise 7, construct eigenfunctions of $\hat{S}^2$ in three
different ways:
a) Diagonalize the matrix with elements $\langle D_i|\hat{S}^2|D_j\rangle$.
b) Start with a high-spin determinant, apply $\hat{S}_-$, and use normalization and orthogonality.
c) Use the projection operator technique.

Exercise 9

As in Exercises 8a and 8b, but with three open shells instead. Also, use the projection
operator technique to get an $S = \frac12$ wave function. Is it identical to any of the ones in
parts a and b? If not, why?

Exercise 10

In the chapter on the Configuration Interaction Method of the Lecture Notes, Eqs (4.7)
and (4.8) on page 266, two spin eigenfunctions with four open shells are written down.
Verify that they are eigenfunctions of $\hat{S}^2$.

Exercise 11

Consider a closed-shell determinant $|0\rangle$, let $i, j$ be two different occupied orbitals, and $a, b$
two unoccupied ones. A double excitation $i, j \to a, b$ gives a wave function with four open
shells. Write down the linear combination of determinants obtained as $E_{ai}E_{bj}|0\rangle$. Verify
that this is a singlet wave function.

Exercise 12

Same, but use instead $E_{aj}E_{bi}|0\rangle$. Show that this function is linearly independent of the
one in Exercise 11, but not orthogonal to it.

5 Geometrical Derivatives . ..

We consider an electronic energy $E(x, \lambda)$ which is a function of some external parameters $x$
(for example the molecular geometry or the field strength) and a set of independent
wave function parameters $\lambda$ (for example CI state transfer parameters). The optimized
electronic energy at $x$ is denoted $\varepsilon(x)$ and

The variational condition

$$\frac{\partial E(x, \lambda)}{\partial \lambda} = 0 \quad \forall x$$

is assumed.

Exercise 1
Show by differentiation that the molecular Hessian may be written in the form
$$\varepsilon^{(2)} = E^{(20)} + 2E^{(11)}\lambda^{(1)} + E^{(02)}\lambda^{(1)}\lambda^{(1)} + E^{(01)}\lambda^{(2)}$$

Exercise 2
Use the variational condition to show that the above expression for the Hessian may be
written in the following two ways:
$$\varepsilon^{(2)} = E^{(20)} + 2E^{(11)}\lambda^{(1)} + E^{(02)}\lambda^{(1)}\lambda^{(1)}$$
$$\varepsilon^{(2)} = E^{(20)} + E^{(11)}\lambda^{(1)}$$
in accordance with the 2n + 1 rule. Show that the first expression is stable with respect
to errors in $\lambda^{(1)}$ while the second is unstable.
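The stability statement can be illustrated on a model problem. The sketch below assumes a purely quadratic model energy E(x, lambda) = a x^2/2 + b x lambda + c lambda^2/2 with made-up coefficients, so that E(20) = a, E(11) = b, and E(02) = c; differentiating the variational condition with respect to x then gives lambda(1) = -b/c. The function names are hypothetical.

```python
# Model second derivatives (assumed values, not taken from the notes).
a, b, c = 2.0, 0.7, 1.5          # E(20), E(11), E(02)

lam1 = -b / c                    # exact first-order response for this model
exact = a - b * b / c            # exact Hessian of the optimized energy


def hessian_symmetric(lam):
    """E(20) + 2 E(11) lam + E(02) lam^2 (the first expression)."""
    return a + 2 * b * lam + c * lam * lam


def hessian_short(lam):
    """E(20) + E(11) lam (the second expression)."""
    return a + b * lam


# Perturb lambda(1) by an error delta and compare the resulting errors.
for delta in (1e-1, 1e-2, 1e-3):
    err_sym = hessian_symmetric(lam1 + delta) - exact
    err_short = hessian_short(lam1 + delta) - exact
    print(f"delta={delta:.0e}  symmetric: {err_sym:.3e}  short: {err_short:.3e}")
```

In this model the error of the first expression is c times delta squared, while the error of the second is b times delta, which is the sense in which the first form is stable and the second is not.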

Exercise 3
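Show that the Hessian may also be written in the form
$$\varepsilon^{(2)} = E^{(20)} - E^{(11)}(E^{(02)})^{-1}E^{(11)}$$
For a complete CI wave function this expression is the same as the usual sum-over-states
expression of second-order perturbation theory:

$$\varepsilon^{(2)} = \langle 0|\hat{H}^{(2)}|0\rangle - 2\sum_{n \neq 0} \frac{\langle 0|\hat{H}^{(1)}|n\rangle\langle n|\hat{H}^{(1)}|0\rangle}{E_n - E_0}$$

The factor of two is due to the following parametrization of the energy and Hamiltonian:

$$\varepsilon(x) = \varepsilon^{(0)} + \varepsilon^{(1)}x + \frac{1}{2}\varepsilon^{(2)}x^2 + \cdots$$
$$\hat{H}(x) = \hat{H}^{(0)} + \hat{H}^{(1)}x + \frac{1}{2}\hat{H}^{(2)}x^2 + \cdots$$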

Exercise 4
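Use the following parametrization of the CI wave function:

$$|\psi\rangle = \exp(-\hat{P})|0\rangle$$

$$\hat{P} = \sum_{n \neq 0} \lambda_n \big(|n\rangle\langle 0| - |0\rangle\langle n|\big)$$

where
$$\hat{H}^{(0)}|k\rangle = E_k|k\rangle$$
to show that the derivatives of the energy with respect to the state transfer parameters
$\lambda_n$ become (assuming real parameters and matrix elements):
$$\frac{\partial E}{\partial \lambda_n} = -2\langle n|\hat{H}|0\rangle$$
$$\frac{\partial^2 E}{\partial \lambda_m \partial \lambda_n} = 2\big(\langle m|\hat{H}|n\rangle - \delta_{mn}\langle 0|\hat{H}|0\rangle\big)$$

By straightforward differentiation one obtains the following expression for the third
derivative of the electronic energy:
$$\varepsilon^{(3)} = E^{(30)} + 3E^{(21)}\lambda^{(1)} + 3E^{(12)}\lambda^{(1)}\lambda^{(1)} + E^{(03)}\lambda^{(1)}\lambda^{(1)}\lambda^{(1)} + 3E^{(11)}\lambda^{(2)} + 3E^{(02)}\lambda^{(1)}\lambda^{(2)} + E^{(01)}\lambda^{(3)}$$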

Exercise 5

Using the variational condition, show that this expression may be written in the simplified
way
$$\varepsilon^{(3)} = E^{(30)} + 3E^{(21)}\lambda^{(1)} + 3E^{(12)}\lambda^{(1)}\lambda^{(1)} + E^{(03)}\lambda^{(1)}\lambda^{(1)}\lambda^{(1)}$$
in accordance with the 2n + 1 rule.
London orbitals are often used in calculations involving an external magnetic field.
Consider an atomic orbital $\psi_{lm}$ positioned at $\mathbf{O}$ which in the field-free case satisfies the
Schrödinger equation

$$\hat{H}^{(0)}\psi_{lm} = E^{(0)}\psi_{lm}$$

In atomic units,
$$\hat{H}^{(0)} = -\frac{1}{2}\nabla^2 + V$$
for some effective, spherically symmetric potential $V$. The orbital is an eigenfunction of
the z component of the angular momentum about $\mathbf{O}$:

$$\hat{L}_z^O\psi_{lm} = m\psi_{lm}$$
$$\hat{\mathbf{L}}^O = (\mathbf{r} - \mathbf{O}) \times \hat{\mathbf{p}}$$

We now apply a uniform external magnetic induction $\mathbf{B}$ along the z axis. To construct
the Hamiltonian, we need a vector potential. First choose it such that it disappears at
the center $\mathbf{O}$:
$$\mathbf{A}_O(\mathbf{r}) = \frac{1}{2}\mathbf{B} \times (\mathbf{r} - \mathbf{O})$$
This potential gives the correct induction, as may be confirmed by calculating

$$\mathbf{B} = \nabla \times \mathbf{A}_O(\mathbf{r})$$

With this potential, the Hamiltonian becomes

$$\hat{H}_O = \frac{1}{2}\hat{\boldsymbol{\pi}}_O^2 + V$$
where the kinetic momentum is given by

$$\hat{\boldsymbol{\pi}}_O = -i\nabla + \mathbf{A}_O(\mathbf{r})$$
and we have added the subscript $O$ to the Hamiltonian to remind us that it is constructed
from a vector potential vanishing at $\mathbf{O}$.

Exercise 6
Show that to first order in B the Hamiltonian may be written

and that to the same order in $B$ the Schrödinger equation is

$$\hat{H}_O\psi_{lm} = \big(E^{(0)} + \tfrac{1}{2}Bm\big)\psi_{lm} \qquad \text{(gauge origin } \mathbf{O})$$

Hence with this choice of vector potential, the unperturbed atomic orbital is correct to
first order in the perturbation.
Now make a different choice of gauge origin:

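$$\mathbf{A}_G(\mathbf{r}) = \frac{1}{2}\mathbf{B} \times (\mathbf{r} - \mathbf{G})$$
This is also a valid vector potential, since it describes the same induction as before:

$$\nabla \times \mathbf{A}_G(\mathbf{r}) = \nabla \times \mathbf{A}_O(\mathbf{r})$$

The Hamiltonian and kinetic momentum operators are now represented as

$$\hat{H}_G = \frac{1}{2}\hat{\boldsymbol{\pi}}_G^2 + V$$

$$\hat{\boldsymbol{\pi}}_G = -i\nabla + \mathbf{A}_G(\mathbf{r})$$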

Exercise 7

Show that with this (equally valid) choice of gauge origin the unperturbed wave function
$\psi_{lm}$ is no longer correct to first order in the perturbation.

We conclude that the atomic orbitals describe the perturbed wave function better for
some gauge choices than for others. In general, magnetic properties calculated using LCAO
orbitals are gauge dependent.
We now introduce a complex phase factor in the orbital as suggested by London:

$$\omega_{lm} = \exp(-i\mathbf{A}(\mathbf{O}) \cdot \mathbf{r})\,\psi_{lm}$$

where $\mathbf{A}(\mathbf{O})$ is the vector potential at the position of the atomic orbital. For example,
with the vector potential $\mathbf{A}_G(\mathbf{r})$ the London atomic orbital becomes

Exercise 8

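Show that the following relationship holds:

$$\hat{\boldsymbol{\pi}}_G \exp(-i\mathbf{A}_G(\mathbf{O}) \cdot \mathbf{r}) = \exp(-i\mathbf{A}_G(\mathbf{O}) \cdot \mathbf{r})\,\hat{\boldsymbol{\pi}}_O$$

and therefore
$$\hat{H}_G \exp(-i\mathbf{A}_G(\mathbf{O}) \cdot \mathbf{r}) = \exp(-i\mathbf{A}_G(\mathbf{O}) \cdot \mathbf{r})\,\hat{H}_O$$
Show that to first order,

$$\hat{H}_G\,\omega_{lm} = \big(E^{(0)} + \tfrac{1}{2}Bm\big)\,\omega_{lm} \qquad \text{(any gauge origin)}$$

We have seen that the London orbitals are well suited for the description of magnetic
perturbations involving an external magnetic field. A further benefit is that the use of
London orbitals makes LCAO calculations strictly gauge-origin independent. To prove
this, it suffices to show that all integrals over London orbitals are independent of origin.
Assume two atomic orbitals $\psi_\mu$ and $\psi_\nu$ are centered at $\mathbf{M}$ and $\mathbf{N}$ respectively, and form
the London orbitals
$$\omega_\mu = \exp(-i\mathbf{A}_G(\mathbf{M}) \cdot \mathbf{r})\,\psi_\mu$$

$$\omega_\nu = \exp(-i\mathbf{A}_G(\mathbf{N}) \cdot \mathbf{r})\,\psi_\nu$$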

Exercise 9

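Show that the kinetic energy integral

$$T_{\mu\nu} = \langle \omega_\mu | \tfrac{1}{2}\hat{\boldsymbol{\pi}}_G^2 | \omega_\nu \rangle$$
may be written as
$$T_{\mu\nu} = \langle \psi_\mu | \exp\big(\tfrac{i}{2}\mathbf{B} \cdot (\mathbf{M} - \mathbf{N}) \times \mathbf{r}\big)\, \tfrac{1}{2}\hat{\boldsymbol{\pi}}_N^2 | \psi_\nu \rangle$$
which is manifestly independent of the gauge origin $\mathbf{G}$.
The London orbitals depend on the coordinate system since $\mathbf{r}$ appears in the phase factor.
The same is true of integrals over London orbitals.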

Exercise 10

Does this make our calculations dependent on the coordinate system?

Exercise 11

Show that quadratic convergence implies superlinear convergence, and that superlinear
convergence implies linear convergence.
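An iteration in Newton's method can be written as

$$\mathbf{x}_+ = \mathbf{x}_c - \mathbf{G}_c^{-1}\mathbf{g}_c$$

where $\mathbf{x}_c$ is the current point, $\mathbf{G}_c$ and $\mathbf{g}_c$ are the Hessian and gradient at that point, and
$\mathbf{x}_+$ is the next point. The optimizer is $\mathbf{x}_*$, and thus the current error is $\mathbf{e}_c = \mathbf{x}_c - \mathbf{x}_*$ and
the error in the next iteration will be $\mathbf{e}_+ = \mathbf{x}_+ - \mathbf{x}_*$.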

Exercise 12

Write the inverse Hessian and the gradient as Taylor series in $(\mathbf{x} - \mathbf{x}_*)$, using derivatives
at $\mathbf{x}_*$. Use this to express the next error, $\mathbf{e}_+$, in terms of $\mathbf{e}_c$, to prove that the method is
quadratically convergent sufficiently close to the optimizer.
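Quadratic convergence can be observed numerically before it is proved. The following is a minimal sketch on a one-dimensional model function f(x) = exp(x) - 2x, chosen here only for illustration (it is not taken from the notes); its minimizer is x* = ln 2, its gradient is exp(x) - 2, and its Hessian is exp(x).

```python
import math

x_star = math.log(2.0)           # exact minimizer of f(x) = exp(x) - 2x


def newton_step(x):
    """One Newton iteration: x_+ = x_c - g_c / G_c."""
    gradient = math.exp(x) - 2.0
    hessian = math.exp(x)
    return x - gradient / hessian


xs = [1.0]                       # starting point (assumed)
for _ in range(4):
    xs.append(newton_step(xs[-1]))

errors = [x - x_star for x in xs]
# For quadratic convergence, e_+ / e_c^2 tends to a constant.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(3)]
for k, (e, r) in enumerate(zip(errors, ratios)):
    print(f"iteration {k}: error = {e:.3e}, e_+/e_c^2 = {r:.4f}")
```

For this function the ratio approaches f'''/(2 f'') = 1/2 at the minimizer, so the number of correct digits roughly doubles in each iteration once the error is small.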
A step to the boundary of the restricted second-order model

$$m_{RSO}(\mathbf{s}) = f(\mathbf{x}_c) + \mathbf{g}_c^T\mathbf{s} + \frac{1}{2}\mathbf{s}^T\mathbf{G}_c\mathbf{s}, \qquad \mathbf{s}^T\mathbf{s} \le h^2$$

may be written as

Exercise 13

Show that the second-order change in the function is given by

The quasi-Newton condition requires that the updated Hessian fulfills the equation

where

Exercise 14

Show that the PSB update,

where

and the BFGS update,

both fulfil the quasi-Newton condition.
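The two updates can be checked numerically. The sketch below uses the standard textbook forms of the PSB and BFGS updates and the quasi-Newton condition in the form B_+ s = y, with s the step and y the gradient difference; these forms and all variable names are assumptions, since the notation of the original equations may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

m = rng.normal(size=(n, n))
b = m @ m.T + n * np.eye(n)      # current symmetric Hessian approximation
s = rng.normal(size=n)           # step, x_+ - x_c
y = rng.normal(size=n)           # gradient difference, g_+ - g_c
if y @ s < 0:                    # keep y.s > 0 so the BFGS update is well behaved
    y = -y

# Powell-symmetric-Broyden (PSB) update, with residual r = y - B s.
r = y - b @ s
b_psb = (b
         + (np.outer(r, s) + np.outer(s, r)) / (s @ s)
         - (r @ s) * np.outer(s, s) / (s @ s) ** 2)

# Broyden-Fletcher-Goldfarb-Shanno (BFGS) update.
bs = b @ s
b_bfgs = b + np.outer(y, y) / (y @ s) - np.outer(bs, bs) / (s @ bs)

# Quasi-Newton condition: the updated Hessian maps the step onto y.
print("PSB  residual:", np.linalg.norm(b_psb @ s - y))
print("BFGS residual:", np.linalg.norm(b_bfgs @ s - y))
```

Both residuals vanish to rounding error for any choice of s and y with non-zero denominators, which is what the exercise asks to show algebraically.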


The set of points $\mathbf{x}$ where some function $f$ has a particular value $f(\mathbf{x}) = k$ is an iso-f
contour. At some points on such a contour, the norm of the gradient has its maximum or
minimum value on that contour. A so-called gradient extremal is a curve which cuts each
iso-f contour at such an extremum point. Gradient extremals of potential functions are
used to define reaction paths.

Exercise 15

Use Lagrange's method of undetermined multipliers to show that, along a gradient ex-
tremal,
G(x)g(x) = p(x)g(x)
for some function p .

6 Density Functional Theory

Exercise 1

Evaluate $r_s$ at the nucleus of a hydrogen-like atom of charge $Z$, and at the Bohr radius.

Exercise 2
Evaluate the functional derivative of the last term in the LYP correlation functional:

Jwp41Vpl2dr
Exercise 3
Show that by substitution of eqn(93) into the KS equations, and integrating by parts, it
is possible to remove the need for the evaluation of basis function second derivatives.

Exercise 4

Show that x, eqn(87), is dimensionless.

Exercise 5

Analyse the data in the tables; e.g. which bonds are poorly obtained by DFT and which
are well obtained by DFT. How do you think DFT compares with SCF and MP2?

Exercise 6
Carry through the evaluation of the kinetic and exchange energies for the uniform electron
gas.

7 Coupled Cluster Theory

Exercise 1
Let $P$ denote a single two-electron system whose wave function is

$$\Psi_P = \Psi_0 + \chi_P$$

where $\Psi_0$ is the Hartree-Fock determinant and $\chi_P$ is a correlation function orthogonal to
$\Psi_0$. We denote the correlation energy obtained with this wave function by $\epsilon_P$, and the
overlap $\langle \chi_P | \chi_P \rangle$ by $S$.
(a) What is the exact correlation energy for $N$ identical non-interacting systems like $P$?
(b) Write down the energy expression for $\Psi_P$, and then the energy expression for a
many-electron wave function for $N$ non-interacting systems constructed as
$$\Psi(N) = \Psi_0 + \sum_{P=1}^{N} \chi_P$$

Express the result in terms of $N$, $S$, and $\epsilon_P$. How does this correlation energy behave as
$N \to \infty$?

(c) Note that this result corresponds to a truncation of the exact wave function that is
not variationally optimum, since the correlation function $\chi_P$ is taken from the exact wave
function and not reoptimized. (The analogy would be the difference between a CISD
wave function with the coefficients taken from a full CI, and with the optimized CISD
coefficients). Optimize the energy of the truncated expansion. (Hint: the only degree
of variational freedom is an overall scaling of the correlation functions). How does the
optimized energy behave as $N \to \infty$?

Exercise 2

Considering the form of the CCSDT triples equation,

$$W T_2 + W T_3 + \tfrac{1}{2} W T_2^2 + W T_1 T_2 + W T_1 T_3 + W T_2 T_3 + \tfrac{1}{2} W T_1^2 T_2 + \tfrac{1}{2} W T_1 T_2^2 + \tfrac{1}{2} W T_1^2 T_3 + \tfrac{1}{3!} W T_1^3 T_2$$

a reader points out that there must be at least two typographic errors. Since the
Hamiltonian can couple single excitations to triple excitations, there appears to be a linear term
$W T_1$ missing on the RHS. The five-fold excitation term $\tfrac{1}{5!} W T_1^5$ is also absent.
(a) Why does the term in $W T_1$ not appear? Is there a mistake in the equation?
(b) Why does the term $W T_1^5$ not appear either? How could one describe this term?
(c) Why is it obvious that the reader is thinking from the perspective of CI calculations?
What other terms might a naive reader expect in this equation that are also missing?
(d) Which linear terms appear in the CCSDTQ equations (that is, up to connected
quadruples in the expansion)? Hence write down the linearized coupled-cluster equations for
arbitrary levels of excitation. How are these related to CI secular equations?

Exercise 3
Suppose we wish to use the process of solving the CC equations to estimate Møller-Plesset
perturbation theory energies.
(a) Using the operator form of the CCSDTQ equations (these are given in Eqs. 4.11 -
4.14 of the notes), iterate the amplitudes order-by-order and obtain contributions to the
perturbation energies through fifth order. You need not expand the matrix elements of
$W$, etc. (unless you are very keen).
(b) Analyze the perturbation energy contributions in terms of excitation levels.
(c) Why is the use of the CC energy equation
$$\epsilon_{CC} = \langle \Psi_0 | W (\tfrac{1}{2} T_1^2 + T_2) | \Psi_0 \rangle$$

not especially well-suited for our purpose? What might be a better approach? (Hint:
What perturbation energies can be determined once the perturbed wave function of order
$n$ is known?)

8 Truncated CI ...

Exercise 1
Consider minimal basis H$_2$ where the two available orbitals are $\sigma_g$ and $\sigma_u$. Set up the
SDCI wave function for this two-electron system and compute the correlation energy

$$E_c = E - E_0$$
where Eo is the Hartree-Fock energy.

Exercise 2
Add another H$_2$ molecule at infinite distance such that there is no interaction between
the molecules, implying that the total wave function may be written as the product of the
wave functions of the two subunits.
a) Set up the full CI wave function for this system. Very few excitations will result in
non-vanishing contributions to the wave function. Give arguments for excluding terms!
Is (local) symmetry necessary to exclude the single excitations in this case?
b) Construct the full CI matrix for this system and compute the correlation energy for
the given basis. How does that compare with that of a single H$_2$ molecule?
c) Compute the CI coefficients and compare the coefficients for the quadruple excitations
with that of the doubles. How could this be used to simplify the system of equations?
(Note this relation only holds in the case of non-interacting systems, but it forms the
basis of the Coupled Cluster approximation). Can you see from the full CI wave functions
for the two non-interacting H$_2$ molecules that this particular relation between $C_D$ and $C_Q$
should hold?

Exercise 3

Restrict the wave function to SDCI, i.e. only up to double excitations are included. Set
up the SDCI matrix and obtain the correlation energy. Compare with (1) and (2) above!

Exercise 4

Consider the case of an SDCI wave function for $N$ non-interacting H$_2$ molecules. Set up
the SDCI matrix and compute the correlation energy. What happens with the correlation
energy per molecule when $N \to \infty$? What happens with the coefficient of the
Hartree-Fock reference state when $N$ gets large?
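The behaviour asked about here can be explored numerically. The sketch below is an illustration under stated assumptions: only the N intramolecular double excitations (sigma_g squared to sigma_u squared) are kept, each coupling to the reference through the same matrix element K and lying 2*Delta above it, with all energies measured relative to the Hartree-Fock energy. The symbols K and Delta, their numerical values, and the function name are assumptions, not taken from the notes.

```python
import numpy as np


def sdci_noninteracting(n_mol, k, delta):
    """Lowest eigenpair of the model SDCI matrix for n_mol non-interacting H2.

    Basis: the reference determinant plus one local double excitation per
    molecule. Returns (correlation energy, |reference coefficient|).
    """
    dim = n_mol + 1
    h = np.zeros((dim, dim))
    h[0, 1:] = h[1:, 0] = k                 # reference-double coupling
    h[1:, 1:] = 2.0 * delta * np.eye(n_mol)  # excitation energy of each double
    values, vectors = np.linalg.eigh(h)
    return values[0], abs(vectors[0, 0])


K, DELTA = 0.18, 0.60                        # assumed model parameters (hartree)
e_single, _ = sdci_noninteracting(1, K, DELTA)
for n_mol in (1, 2, 10, 100, 1000):
    e_corr, c0 = sdci_noninteracting(n_mol, K, DELTA)
    print(f"N={n_mol:5d}  E_c/N = {e_corr / n_mol:+.5f}  "
          f"(single molecule {e_single:+.5f})  c0 = {c0:.4f}")
```

In this model the lowest eigenvalue has the closed form Delta - sqrt(Delta^2 + N K^2), so the correlation energy grows only like the square root of N for large N while the reference coefficient decreases.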

Exercise 5

One common method of making the SDCI approach more nearly size-consistent is through
the Davidson correction $E_{Dav}$. Consider the correlation energy functional

$$E_c = \frac{\langle \psi_0 + \psi_c | \hat{H} - E_0 | \psi_0 + \psi_c \rangle}{\langle \psi_0 | \psi_0 \rangle + g \langle \psi_c | \psi_c \rangle}$$
which for $g = 1$ gives the SDCI correlation energy.
a) For $N$ non-interacting H$_2$ molecules compare the SDCI correlation energy obtained
using this functional with that of a single H$_2$ molecule.
b) Set $g = 0$ (CEPA-0) and show that this gives a size-consistent result for the correlation
energy.
c) Obtain the Davidson correlation energy between the CEPA-0 and the SDCI results for
the correlation energy.
d) Show that the choice $g = \frac{1}{N}$, where $N = \frac{n}{2}$ and $n$ is the number of electrons, leads to
a size-consistent result (ACPF).
e) Add $N$ non-interacting He atoms and investigate the size-consistency of the CEPA-0
and the ACPF functionals.

9 Accurate Calculations ...


Many of these exercises have no single correct answer: they are intended to illustrate the
considerations that go into designing accurate calculations. The solutions list many of
these considerations, but are not necessarily complete.

Exercise 1
Is restricted Hartree-Fock size-extensive? Size-consistent? What about UHF? What about
different correlation treatments? Does your classification here agree with your neighbours?
Discuss any differences and attempt to resolve them.

Exercise 2
The 3s contaminant function from a Cartesian d shell is of the form $r^2 \exp(-br^2)$. Find
the overlap between a normalized 3s function and a normalized 1s gaussian function,
$\exp(-ar^2)$. For what ratio of exponents $a/b$ is the overlap a maximum? What does this
say about choosing polarization exponents for Cartesian sets? (If you prefer, solve the
more general problem of overlap of $r^n \exp(-ar^2)$ with $r^{n+2} \exp(-br^2)$, thereby obtaining
a formula that could also be used for f functions). An essential radial integral:

$$\int_0^\infty \exp(-\alpha r^2)\, r^{2n+2}\, dr = \frac{(2n+1)!!}{2^{n+2}\alpha^{n+1}} \left(\frac{\pi}{\alpha}\right)^{1/2}$$
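The radial integral can be sanity-checked by numerical quadrature. The sketch below reads the formula as the integral of exp(-alpha r^2) r^(2n+2) from 0 to infinity, equal to (2n+1)!! / (2^(n+2) alpha^(n+1)) times sqrt(pi/alpha), and compares it with a composite Simpson rule on a finite interval; the function names, the cutoff radius, and the step count are assumptions made for illustration.

```python
import math


def double_factorial(m):
    """m!! for odd positive m, with (-1)!! = 1."""
    return 1 if m <= 0 else m * double_factorial(m - 2)


def radial_integral(n, alpha):
    """Closed form of int_0^inf exp(-alpha r^2) r^(2n+2) dr."""
    return (double_factorial(2 * n + 1) / (2 ** (n + 2) * alpha ** (n + 1))
            * math.sqrt(math.pi / alpha))


def radial_integral_numeric(n, alpha, r_max=12.0, steps=20000):
    """Composite Simpson rule on [0, r_max]; the tail beyond r_max is negligible."""
    h = r_max / steps

    def f(r):
        return math.exp(-alpha * r * r) * r ** (2 * n + 2)

    total = f(0.0) + f(r_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


for n in range(4):
    print(n, radial_integral(n, 0.8), radial_integral_numeric(n, 0.8))
```

The cutoff of 12 bohr is adequate only for exponents that are not too small; for very diffuse functions it would have to be increased.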

Exercise 3
(a) The response of a spherical atom to an applied electric field $F$ is defined to be

$$\Delta E = -\frac{1}{2}\alpha F^2 - \frac{1}{24}\gamma F^4$$

Assume that two different field strengths are applied, and obtain finite difference formulae
for the polarizability and hyperpolarizability.
(b) You are asked to calculate polarizabilities and Raman intensities (proportional to the
square of the derivative of the polarizability with respect to the bond length) for N$_2$.
Consider how to carry out a (feasible) correlation treatment, basis set, and vibrational
study. How would you calibrate your proposed approach? How would it affect your plans
if you were asked to consider Raman transitions between highly excited vibrational levels?

Exercise 4
An experimentalist claims to be observing phenomena involving excited states of H$_2$O up
to 10 eV above the ground state. What sort of calculations would you contemplate doing
to help analyze the experimental results? What major potential difficulty would warrant
special attention?

Exercise 5

How would you set up practical calculations to determine an accurate binding energy for a
complex between Ar and NH$_3$? How would you provide some calibration of your results?
Solutions

1 Basis Sets

Answer 1

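A nice property of integrals over gaussian functions is that a product of gaussians is again
a gaussian, and also that so many integrals factorize. In cases like this, we use
$$e^{-\alpha r^2} e^{-\beta r^2} = e^{-(\alpha+\beta) r^2}$$
to convert to a single gaussian, and combine with the formula

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-\alpha r^2}\, dx\, dy\, dz = \int_{-\infty}^{\infty} e^{-\alpha x^2} dx \int_{-\infty}^{\infty} e^{-\alpha y^2} dy \int_{-\infty}^{\infty} e^{-\alpha z^2} dz = \left(\frac{\pi}{\alpha}\right)^{3/2}$$

Including real normalization constants $N_1$ and $N_2$, we get

$$\langle \phi_1 | \phi_1 \rangle = N_1^2 \left(\frac{\pi}{2\alpha}\right)^{3/2} = 1$$
$$\langle \phi_2 | \phi_1 \rangle = N_1 N_2 \left(\frac{\pi}{\alpha + \beta}\right)^{3/2}$$
$$\langle \phi_2 | \phi_2 \rangle = N_2^2 \left(\frac{\pi}{2\beta}\right)^{3/2} = 1$$
Eliminating, we find

$$\langle \phi_2 | \phi_1 \rangle = 2^{3/2} \frac{(\alpha\beta)^{3/4}}{(\alpha + \beta)^{3/2}} = \left(\frac{\sqrt{\alpha\beta}}{(\alpha + \beta)/2}\right)^{3/2}$$

The fraction in parentheses in the last form is the ratio of the geometric average to the
arithmetic average of the positive numbers $\alpha$ and $\beta$. This is less than 1, except when the
numbers are equal, and then it is 1.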

Answer 2

The product to be integrated can be regarded as the product of three factors, which only
depend on $x$, $y$, and $z$, respectively. If any of these integrands is an odd function, the
integral is zero. This happens for the overlap between s and any of the p functions, of
course. For the six d functions, those three with factors $x^2$, $y^2$ and $z^2$ will have non-zero
overlap with an s function, but those of type $xy$, $xz$, $yz$ will have zero overlap with s.

Answer 3
Let us start with the simplest approach. The s contaminant can of course be seen by
inspection: it is a symmetric combination of the type $x^2 + y^2 + z^2$, since this is $= r^2$
independent of coordinate system, hence rotationally invariant. Furthermore, the three
functions of type $xy$, $xz$, $yz$ are seen (as in Exercise 2) to be orthonormal to each other,
and to the s contaminant. We then need only two more orthonormal functions from the
set spanned by $x^2$, $y^2$ and $z^2$. All non-zero overlaps in this set are either of the type
$S_1 = \langle x^2 | x^2 \rangle$ or of the type $S_2 = \langle x^2 | y^2 \rangle$. The values $S_1$ and $S_2$ are unknown, but do not
matter: orthogonalization produces e.g. the two combinations $|x^2 - y^2\rangle$ and $|x^2 + y^2 - 2z^2\rangle$.
The more systematic approach is to apply the angular momentum operators:
$$i\hat{L}_x\, x^k y^l z^m \exp(-ar^2) = m\, x^k y^{l+1} z^{m-1} \exp(-ar^2) - l\, x^k y^{l-1} z^{m+1} \exp(-ar^2)$$

and so on. Applying $i\hat{L}_x$ twice gives $-\hat{L}_x^2$, etc:

Function   L_x^2            L_y^2            L_z^2            L^2
x^2        0                2(x^2 - z^2)     2(x^2 - y^2)     4x^2 - 2y^2 - 2z^2
xy         xy               xy               4xy              6xy
xz         xz               4xz              xz               6xz
y^2        2(y^2 - z^2)     0                2(y^2 - x^2)     -2x^2 + 4y^2 - 2z^2
yz         4yz              yz               yz               6yz
z^2        2(z^2 - y^2)     2(z^2 - x^2)     0                -2x^2 - 2y^2 + 4z^2

The ansatz $\phi = c_1 x^2 + c_2 y^2 + c_3 z^2$ and the equation $\hat{L}^2\phi = 6\phi$ give the equation system
$$4c_1 - 2c_2 - 2c_3 = 6c_1$$
$$-2c_1 + 4c_2 - 2c_3 = 6c_2$$
$$-2c_1 - 2c_2 + 4c_3 = 6c_3$$
This equation system is equivalent to $c_1 + c_2 + c_3 = 0$, so it only tells us that any function
we pick, as long as it is orthogonal to the s contaminant, will be an eigenfunction of $\hat{L}^2$
with angular momentum 2. In practice, one uses eigenfunctions of $\hat{L}_z$ with eigenvalues
$\pm m$, combined to form real combinations, to define the real spherical harmonics. The d
combinations are fairly standard:

have $|M_L| = 2$ and $= 1$, respectively.


Similar to the reasoning in Exercise 2, since s, d, g ... functions are even functions of $\mathbf{r}$,
but p, f ... are odd, a cartesian f basis can have p contaminants, but not s or d. Since
there are 3 spherical harmonic p functions, and 7 spherical harmonic f, the ten cartesian
f functions must be decomposed into exactly one set of true f functions, and one set of p
contaminants. In fact, this rule is general: The set of $\frac12(n + 1)(n + 2)$ cartesian functions
with total degree $n$ can be transformed into exactly one set each of spherical harmonic
functions with $l = n$, $l = n - 2$, etc.

Answer 4

Since we only care about the radial dependence, it does not matter if we are dealing with
cartesian or spherical harmonic gaussians, nor which component of a shell we choose. The
radial distribution function in any direction will have the form $P_l \propto r^{2l+2} \exp(-2\alpha r^2)$. It
has a single maximum for $r > 0$:

$$\left(\frac{dP_l}{dr}\right)_{r_{max}} = 0 \;\Rightarrow\; (2l + 2)\, r_{max}^{2l+1} - 4\alpha\, r_{max}^{2l+3} = 0 \;\Rightarrow\; r_{max} = \sqrt{\frac{l + 1}{2\alpha}}$$
If $r_{max}$ is to be the same for two gaussian shells with quantum numbers $l'$ and $l''$, we get
$$\alpha'/\alpha'' = (l' + 1)/(l'' + 1)$$
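The position of the maximum is easy to confirm on a grid. The sketch below assumes the radial distribution P_l(r) proportional to r^(2l+2) exp(-2 alpha r^2) and compares the closed form sqrt((l+1)/(2 alpha)) with a brute-force search; the function names, grid parameters, and the exponent value are illustrative assumptions.

```python
import math


def r_max(l, alpha):
    """Closed-form maximum of r^(2l+2) exp(-2 alpha r^2)."""
    return math.sqrt((l + 1) / (2 * alpha))


def r_max_numeric(l, alpha, r_end=10.0, steps=200000):
    """Brute-force search for the maximum on a uniform radial grid."""
    best_r, best_p = 0.0, 0.0
    for i in range(1, steps + 1):
        r = r_end * i / steps
        p = r ** (2 * l + 2) * math.exp(-2 * alpha * r * r)
        if p > best_p:
            best_r, best_p = r, p
    return best_r


alpha_p = 0.9                                # assumed p exponent
print(r_max(1, alpha_p), r_max_numeric(1, alpha_p))

# Matching the maxima of a p shell and a d shell: alpha_d / alpha_p = 3/2.
alpha_d = alpha_p * (2 + 1) / (1 + 1)
print(r_max(2, alpha_d))
```

The last line reproduces the p-shell maximum, in line with the exponent ratio derived above.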

Answer 5

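A beautiful fact is that the product of two gaussians is a new gaussian, even if they are
not on the same center:
$$\alpha(x - A)^2 + \beta(x - B)^2 = (\alpha + \beta)x^2 - 2(\alpha A + \beta B)x + \alpha A^2 + \beta B^2 = (\alpha + \beta)(x - C)^2 + D$$

is easily solved to give $C = \frac{\alpha A + \beta B}{\alpha + \beta}$ and $D = \frac{\alpha\beta}{\alpha + \beta}(A - B)^2$. This implies

$$N_1 \exp(-\alpha(\mathbf{r} - \mathbf{A})^2)\, N_2 \exp(-\beta(\mathbf{r} - \mathbf{B})^2) = N_1 N_2 \exp\Big(-\frac{\alpha\beta}{\alpha + \beta}(\mathbf{A} - \mathbf{B})^2\Big) \exp(-\gamma(\mathbf{r} - \mathbf{C})^2)$$

with $\gamma = \alpha + \beta$. For s functions, the overlap integral is thus

$$S = N_1 N_2 \exp\Big(-\frac{\alpha\beta}{\alpha + \beta}(\mathbf{A} - \mathbf{B})^2\Big) \left(\frac{\pi}{\gamma}\right)^{3/2}$$

With the known normalization factors, we obtain finally

$$S = \left(\frac{\sqrt{\alpha\beta}}{(\alpha + \beta)/2}\right)^{3/2} \exp\Big(-\frac{\alpha\beta}{\alpha + \beta}(\mathbf{A} - \mathbf{B})^2\Big)$$

which is also obvious from the solution of Exercise 1. Note that the new center $\mathbf{C} = \frac{\alpha\mathbf{A} + \beta\mathbf{B}}{\alpha + \beta}$
is simply the weighted mean of the original centers.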
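The closed-form overlap of two normalized s gaussians on different centers can be compared with direct numerical integration. The sketch below exploits the factorization into three one-dimensional integrals and uses a composite Simpson rule; the function names, exponents, centers, and grid parameters are assumptions chosen for illustration.

```python
import math


def s_overlap(alpha, beta, a, b):
    """Closed-form overlap of normalized s gaussians centred at a and b."""
    dist2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    prefactor = (math.sqrt(alpha * beta) / ((alpha + beta) / 2)) ** 1.5
    return prefactor * math.exp(-alpha * beta / (alpha + beta) * dist2)


def s_overlap_numeric(alpha, beta, a, b, half_width=12.0, steps=4000):
    """Product of three 1D Simpson integrals times the normalization constants."""
    result = (2 * alpha / math.pi) ** 0.75 * (2 * beta / math.pi) ** 0.75
    h = 2 * half_width / steps
    for ai, bi in zip(a, b):
        total = 0.0
        for i in range(steps + 1):
            x = -half_width + i * h
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            total += weight * math.exp(-alpha * (x - ai) ** 2 - beta * (x - bi) ** 2)
        result *= total * h / 3
    return result


center_a = (0.0, 0.0, 0.0)
center_b = (0.5, -0.3, 1.1)
print(s_overlap(0.8, 1.7, center_a, center_b))
print(s_overlap_numeric(0.8, 1.7, center_a, center_b))
```

Setting both centers equal reduces the expression to the one-center result of Answer 1.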

Answer 6

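From the solution of Exercise 5, it is obvious that we can now generalize:

$$\phi_1 = N_1P_1(\mathbf{r}-\mathbf{A})\exp(-\alpha(\mathbf{r}-\mathbf{A})^2)$$
$$\phi_2 = N_2P_2(\mathbf{r}-\mathbf{B})\exp(-\beta(\mathbf{r}-\mathbf{B})^2)$$
$$\Rightarrow\quad \phi_1\phi_2 = N_1N_2\,F\,P(\mathbf{r}-\mathbf{C})\exp(-\gamma(\mathbf{r}-\mathbf{C})^2)$$

where $P_1$ is a polynomial of total degree $n_1$, expressing the angular dependence around
center A, $P_2$ has total degree $n_2$ around center B, and

$$\mathbf{C} = \frac{\alpha\mathbf{A}+\beta\mathbf{B}}{\alpha+\beta},\qquad F = \exp\left(-\frac{\alpha\beta}{\alpha+\beta}(\mathbf{A}-\mathbf{B})^2\right),\qquad \gamma = \alpha+\beta$$

In particular, for two p shells, we obtain $x - A_x = (x - C_x) + (C_x - A_x)$ etc., so

$$(x-A_x)(x-B_x) = (x-C_x)^2 + (X_{CA}+X_{CB})(x-C_x) + X_{CA}X_{CB}$$

and so on, where $X_{CA}$ is short for $C_x - A_x$ etc. The linear term integrates to zero by symmetry. The following integrals are standard:

$$I_0 = \int\exp(-\gamma r^2)\,dv = \left(\frac{\pi}{\gamma}\right)^{3/2},\qquad I_2 = \int x^2\exp(-\gamma r^2)\,dv = \frac{1}{2\gamma}\left(\frac{\pi}{\gamma}\right)^{3/2}$$

The overlap integrals are thus

$$S = N_1N_2F\times\begin{pmatrix} I_2 + X_{CA}X_{CB}I_0 & X_{CA}Y_{CB}I_0 & X_{CA}Z_{CB}I_0 \\ Y_{CA}X_{CB}I_0 & I_2 + Y_{CA}Y_{CB}I_0 & Y_{CA}Z_{CB}I_0 \\ Z_{CA}X_{CB}I_0 & Z_{CA}Y_{CB}I_0 & I_2 + Z_{CA}Z_{CB}I_0 \end{pmatrix}$$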

This is an important general point: All the integrals involving the various components
of complete shells of cartesian (or spherical harmonic) gaussians are obtained from a few
values in common for all the integrals, combined with simple expressions involving the
relative coordinates of the centra. Example: With four f shells, all the 2401 four-center
two-electron integrals can be computed by simple arithmetic from two transcendental
function evaluations (one exponential function, and one incomplete gamma function).

Answer 7

Use a coordinate system with z axis perpendicular to the molecule. Assume spherical
harmonic basis functions. We obtain the following number of basis functions:

Each shell          Each atom, primitive   Each atom [4s3p2d]   Each atom [5s4p2d1f]

1s = 1a'            x14 = 14a'             x4 = 4a'             x5 = 5a'
3p = 2a' + 1a''     x9 = 18a' + 9a''       x3 = 6a' + 3a''      x4 = 8a' + 4a''
5d = 3a' + 2a''     x4 = 12a' + 8a''       x2 = 6a' + 4a''      x2 = 6a' + 4a''
7f = 4a' + 3a''     x3 = 12a' + 9a''                            x1 = 4a' + 3a''

Total               56a' + 26a''           16a' + 7a''          23a' + 11a''



The number of integrals, for the [4s3p2d] contraction, can be computed as follows. There
are four atoms, and so 64a' + 28a'' or in total 92 basis functions. There will be
$\frac{1}{2}\cdot 64\cdot 65 + \frac{1}{2}\cdot 28\cdot 29$ or 2486 products of basis function pairs with combined symmetry A'.
(This number, by the way, is also the number of one-electron integrals.) There are also
64·28 = 1792 such products with A'' symmetry. The number of two-electron integrals is
then $\frac{1}{2}\cdot 2486\cdot 2487 + \frac{1}{2}\cdot 1792\cdot 1793 = 4697869$. The rule-of-thumb gives $92^4/(8\cdot 2) = 4477456$
(N = 92, g = 2).
Similarly, for the [5s4p2d1f] calculation, we get N = 136 basis functions, subdivided as
92a' + 44a'', giving 5268A' + 4048A'' density products, and 22 073 722 integrals. The simple
formula gives 21.3 million. For [5s4p2d2f] (164 basis functions, 108a' + 56a''), the figures are 46286079 integrals, while the
estimate gives 45.2 million.
Note: 1) The number of integrals rises steeply with the size of the basis set. 2) The approximate formula
is quite good: You never really need to hand-compute the exact number of integrals.
3) The integral calculation times may be roughly the same, if they are determined by the
number of primitive integrals, in this case about 0.7 billion. In practice, a modern integral
program is able to skip some of the smallest integrals, and is also able to move some of
the operations that compute the individual integrals to outer loops, where they handle
already-contracted quantities. But as a rough guide, computation time is determined by
the primitives, while disk space is determined by the contracted basis functions.

Answer 8
The numbers of density products of the four possible symmetry types are:

$$N(A_1) = \tfrac{1}{2}n(a_1)(n(a_1)+1) + \cdots = \tfrac{1}{2}\cdot 35\cdot 36 + \cdots + \tfrac{1}{2}\cdot 24\cdot 25 = 1742$$
$$N(B_1) = n(a_1)n(b_1) + n(b_2)n(a_2) = 35\cdot 28 + 28\cdot 24 = 1652$$
$$N(B_2) = n(a_1)n(b_2) + n(b_1)n(a_2) = 35\cdot 28 + 28\cdot 24 = 1652$$
$$N(A_2) = n(a_1)n(a_2) + n(b_1)n(b_2) = 35\cdot 24 + 28\cdot 28 = 1624$$

The number of two-electron integrals is finally obtained as

$$N_{int} = \tfrac{1}{2}N(A_1)(N(A_1)+1) + \cdots = \tfrac{1}{2}\cdot 1742\cdot 1743 + \cdots + \tfrac{1}{2}\cdot 1624\cdot 1625 = 5568409$$
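The counting used in Answers 7 and 8 is easy to automate. The sketch below is only an illustration: it assumes an abelian point group whose irreps are labelled by integers so that the direct product is a bitwise XOR (label 0 is the totally symmetric irrep), and the function names are hypothetical.

```python
def count_integrals(n_irrep):
    """n_irrep[g] = number of basis functions in irrep g.
    Returns (number of pair products per symmetry, number of two-electron integrals)."""
    k = len(n_irrep)                     # must be a power of two for XOR labelling
    pairs = [0] * k
    for i in range(k):
        # products within one irrep are totally symmetric; count i <= j only
        pairs[0] += n_irrep[i] * (n_irrep[i] + 1) // 2
        for j in range(i + 1, k):
            pairs[i ^ j] += n_irrep[i] * n_irrep[j]
    # an integral is nonzero by symmetry only if both pair products match
    n_int = sum(p * (p + 1) // 2 for p in pairs)
    return pairs, n_int

def rule_of_thumb(n, g):
    """Approximate count N^4 / (8 g) for N basis functions and group order g."""
    return n ** 4 / (8 * g)
```

With the labelling a' = 0, a'' = 1, the call `count_integrals([64, 28])` corresponds to the [4s3p2d] case; with a1, b1, b2, a2 = 0, 1, 2, 3, the call `count_integrals([35, 28, 28, 24])` corresponds to Answer 8.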

Answer 9
Let the numbers 1 and 2 stand for arbitrary orbitals on Cu and on F, respectively. Then
the integrals of type (21,11), (21,21), (21,22), and (22,21) will be very small and are not
computed, since the basis function product of type 21 is almost zero.

Answer 10

This calculation used symmetry-adapted basis functions. The reduction shown in Exer-
cise 9 is not relevant. Even at infinite separation, all orbital products are non-zero. On
the other hand, there is a reduction in the number of primitive integrals, and thus also in
CPU time.

2 Hartree-Fock

Answer 1

The Coulomb and exchange operators for an orbital $\phi_j$ are defined by their action on any
arbitrary one-electron function $\chi$:

$$[\hat{J}_j\chi](\mathbf{r}) = \int\frac{1}{|\mathbf{r}-\mathbf{r}'|}\phi_j(\mathbf{r}')\phi_j(\mathbf{r}')\,dv'\;\chi(\mathbf{r})$$
$$[\hat{K}_j\chi](\mathbf{r}) = \int\frac{1}{|\mathbf{r}-\mathbf{r}'|}\phi_j(\mathbf{r}')\chi(\mathbf{r}')\,dv'\;\phi_j(\mathbf{r})$$

Thus, the Coulomb operator simply multiplies the test function by the electrostatic potential
function arising from the orbital $\phi_j$, so it is just an ordinary local multiplicative
operator. By contrast, the exchange operator is non-local, and produces a function which
is the product of $\phi_j$ with an electrostatic potential. This potential is determined by the
differential overlap of the test function with the orbital $\phi_j$. The Fock operator for closed
shells is in general $\hat{F} = \hat{h} + \sum_{j=1}^{m}(2\hat{J}_j - \hat{K}_j)$, where 2m is the number of electrons, and $\hat{h}$
is the one-electron hamiltonian. In the Be case, we get

$$\hat{F} = \hat{h} + 2\hat{J}_{1s} - \hat{K}_{1s} + 2\hat{J}_{2s} - \hat{K}_{2s}$$


A more simple-minded approach would be to use a different effective hamiltonian for each
orbital, use only the Coulomb potential, but then excluding the self-interaction. That
approach is called the Hartree model, and it gives non-orthogonal orbitals.

Answer 2

It is simplest to use the spin-orbital formulation. We can always order the orbitals so that
the removed spin-orbital is the last. The energies of the N-electron and (N-1)-electron
systems are

$$E_N = \sum_{i=1}^{N}\langle i|h|i\rangle + \sum_{i=2}^{N}\sum_{j=1}^{i-1}\left(\langle ij|ij\rangle - \langle ij|ji\rangle\right)$$
$$E_{N-1} = \sum_{i=1}^{N-1}\langle i|h|i\rangle + \sum_{i=2}^{N-1}\sum_{j=1}^{i-1}\left(\langle ij|ij\rangle - \langle ij|ji\rangle\right)$$

The difference consists simply in the terms with i = N:

$$E_{N-1} - E_N = -\langle N|h|N\rangle - \sum_{j=1}^{N-1}\left(\langle Nj|Nj\rangle - \langle Nj|jN\rangle\right)$$

Comparing to the expression for the orbital energies,

$$\epsilon_i = \langle i|h|i\rangle + \sum_{j=1}^{N}\left(\langle ij|ij\rangle - \langle ij|ji\rangle\right)$$

shows that $E_{N-1} - E_N = -\epsilon_N$, since the j = N term is zero. Since the order of the
orbitals is immaterial, excluding any orbital k thus requires an energy $-\epsilon_k$. These are the
frozen-orbital ionization potentials, which are in general too high since the ion states are
not described by optimal 'relaxed' orbitals.
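The bookkeeping behind this result can be verified with arbitrary numbers, since it only uses the structure of the energy expression. The sketch below assumes NumPy and uses random stand-ins for the integrals (they are not real molecular integrals); `energy` and `orbital_energy` are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                        # number of occupied spin orbitals
h = rng.normal(size=n)                       # stand-ins for <i|h|i>
J = rng.normal(size=(n, n)); J = J + J.T     # stand-ins for <ij|ij>, symmetric
K = rng.normal(size=(n, n)); K = K + K.T     # stand-ins for <ij|ji>, symmetric
np.fill_diagonal(K, np.diag(J))              # <ii|ii> is the same in both, so j = i cancels
A = J - K

def energy(occ):
    """E = sum_i h_i + sum_{i>j} (<ij|ij> - <ij|ji>) over the occupied spin orbitals."""
    occ = list(occ)
    return h[occ].sum() + sum(A[i, j] for a, i in enumerate(occ) for j in occ[:a])

def orbital_energy(k, occ):
    """eps_k = h_k + sum_j (<kj|kj> - <kj|jk>) over the occupied spin orbitals."""
    return h[k] + sum(A[k, j] for j in occ)
```

Removing any spin orbital k from the occupied list and keeping all others frozen changes `energy` by exactly `-orbital_energy(k, occ)`.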

Answer 3

Assume 2N electrons, enumerating the ground-state orbitals 1, ..., N, and consider the
excitations $N \to a$ where a is some virtual orbital. We have four determinants:

$$D_1 = |1\bar{1}\ldots(N-1)\overline{(N-1)}\,N a|$$
$$D_2 = |1\bar{1}\ldots(N-1)\overline{(N-1)}\,N\bar{a}|$$
$$D_3 = |1\bar{1}\ldots(N-1)\overline{(N-1)}\,\bar{N}a|$$
$$D_4 = |1\bar{1}\ldots(N-1)\overline{(N-1)}\,\bar{N}\bar{a}|$$

The easiest solution is to look at their expectation values: There are a number of terms
appearing in all the energies, which we denote by C and which are immaterial for the
question at hand.

$$E(D_1) = C + (NN|aa) - (Na|Na)$$
$$E(D_2) = C + (NN|aa)$$
$$E(D_3) = C + (NN|aa)$$
$$E(D_4) = C + (NN|aa) - (Na|Na)$$

$$C = \sum_{i=1}^{N-1}\left(2h_{ii} + \sum_{j=1}^{N-1}\left(2(ii|jj) - (ij|ij)\right)\right) + h_{NN} + \sum_{i=1}^{N-1}\left(2(ii|NN) - (iN|iN)\right) + h_{aa} + \sum_{i=1}^{N-1}\left(2(ii|aa) - (ia|ia)\right)$$

$D_1$ and $D_4$ are pure triplet states, and give immediately the triplet energy. $D_2$ and $D_3$
together span a space containing precisely one singlet and one triplet state. The sum of
diagonal elements of the submatrix for these two states equals the sum of eigenvalues.
Thus we get

$$E(T) = C + (NN|aa) - (Na|Na)$$
$$E(S) + E(T) = 2C + 2(NN|aa)$$

i.e. $E(S) = C + (NN|aa) + (Na|Na)$ and $E(S) - E(T) = 2(Na|Na)$. Another method
is to directly write $|S\rangle = \sqrt{\tfrac{1}{2}}(D_2 - D_3)$, so that

$$E(S) = \tfrac{1}{2}\langle D_2 - D_3|\hat{H}|D_2 - D_3\rangle = \tfrac{1}{2}\left(E(D_2) - \langle D_2|\hat{H}|D_3\rangle - \langle D_3|\hat{H}|D_2\rangle + E(D_3)\right)$$

Since $D_2$ and $D_3$ differ in two orbitals, we get

$$\langle D_2|\hat{H}|D_3\rangle = \langle N\bar{a}\|\bar{N}a\rangle = -(Na|Na)$$

leading to the same answer.

Answer 4
With the approximation $(ij|ji) = \delta_{ij}(ij|ij)$, we get

$$\epsilon_i = h_i + \sum_j(ij|ij) - (ii|ii)$$
$$\epsilon_a = h_a + \sum_j(aj|aj)$$

This means that a self-energy term is subtracted for occupied orbital energies, but not
for virtuals.
The energy with an added electron can be written as

$$E^a_{N+1} = E_N + h_a + \sum_j\left((aj|aj) - (aj|ja)\right)$$

directly from the standard formula, by recognizing all terms not involving a as precisely
the sum $E_N$. The additional terms are precisely $\epsilon_a$. This is Koopmans' theorem for
electron affinities: If the orbitals do not change, then the E.A.'s on the Hartree-Fock level
of approximation are given by the virtual orbital energies.
The difference in the physical interpretation of occupied vs. virtual orbital energies explains
the so-called HOMO-LUMO energy gap. It is particularly obvious in UHF orbital
energies, in cases where the HOMO and LUMO are spatially equivalent, e.g. H2 near
dissociation.

Answer 5
Substituting $\mathbf{c}_k = T\mathbf{c}'_k$ gives the left-hand side $(FT - \epsilon_kST)\mathbf{c}'_k$, which in general cannot
be of the required form: we would have to use $T = S^{-1}$, and since this is a symmetric
matrix but does not in general commute with F, the product FT would be asymmetric.
The symmetry is enforced by writing

$$T^T(F - \epsilon_kS)T\mathbf{c}'_k = 0$$

which is also natural since it expresses a basis change. This has the required form provided
that $T^TST = 1$, which is the usual matrix form of finding a transformation to an
orthonormal basis. This relation immediately gives $T^{-1} = T^TS$. It is convenient to use
this formula rather than matrix inversion when the inverse transformation is needed.
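A small numerical illustration of this basis change follows. It is a sketch only: the matrices are random stand-ins for an overlap and a Fock matrix, and the symmetric choice $T = S^{-1/2}$ is one assumption among many transformations that satisfy $T^TST = 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.normal(size=(n, n))
S = M @ M.T + n * np.eye(n)                  # symmetric positive definite "overlap"
F = rng.normal(size=(n, n)); F = F + F.T     # symmetric "Fock" matrix

# Symmetric orthonormalization T = S^(-1/2), built from the eigenvectors of S
s, U = np.linalg.eigh(S)
T = U @ np.diag(s ** -0.5) @ U.T

eps, c_prime = np.linalg.eigh(T.T @ F @ T)   # ordinary symmetric eigenproblem
c = T @ c_prime                              # back-transform: c_k = T c'_k
```

The columns of `c` then solve $Fc_k = \epsilon_kSc_k$, and the inverse transformation is available as `T.T @ S` without any matrix inversion.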

Answer 6
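a) The p, q matrix element of the left-hand side is

$$\sum_{ijrs}2c_{pi}c_{ri}\,S_{rs}\,2c_{sj}c_{qj} = \sum_{ij}4c_{pi}\delta_{ij}c_{qj} = 2D_{pq}$$

where the central factors summed to a Kronecker delta because of the orthonormality of
the HF orbitals. A simple and concise way of expressing this is by matrix notation:

ON condition: $c^TSc = 1$
RHS: $2D = 4cc^T$
LHS: $DSD = 2cc^TS\,2cc^T = 4c\,1\,c^T$

The D matrix is invariant to ON transformation of c: with $c \to cU$ and $UU^T = 1$,

$$D \to 2cUU^Tc^T = 2cc^T = D$$

b) It is enough to show what happens with the Coulomb part:

$$\langle\chi_p|\sum_i2\hat{J}_i|\chi_q\rangle = \int\chi_p(\mathbf{r})\left(2\sum_i\int\frac{\phi_i(\mathbf{r}')\phi_i(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}dv'\right)\chi_q(\mathbf{r})\,dv$$
$$= \int\chi_p(\mathbf{r})\left(2\sum_i\sum_{rs}c_{ri}c_{si}\int\frac{\chi_r(\mathbf{r}')\chi_s(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}dv'\right)\chi_q(\mathbf{r})\,dv$$
$$= \sum_{rs}\iint\frac{\chi_p(\mathbf{r})\chi_q(\mathbf{r})\chi_r(\mathbf{r}')\chi_s(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,dv\,dv'\;D_{rs}$$

and similar for the exchange term.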

Answer 7

For any occupied orbital tPj, we obtain the overlap variations

for the real variation, and similar for the imaginary one, since the virtual orbitals are
orthogonal to all the occupied ones. For the real and the imaginary cases, respectively,
we obtain the energy variations

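$$\delta E_1 = \delta x\langle\Psi_0|\hat{H}|\Psi_i\rangle + \delta x\langle\Psi_i|\hat{H}|\Psi_0\rangle = 2\delta x\,\mathrm{Re}\langle\Psi_0|\hat{H}|\Psi_i\rangle$$

$$\delta E_2 = i\delta x\langle\Psi_0|\hat{H}|\Psi_i\rangle - i\delta x\langle\Psi_i|\hat{H}|\Psi_0\rangle = -2\delta x\,\mathrm{Im}\langle\Psi_0|\hat{H}|\Psi_i\rangle$$

If the determinant happens to be a Hartree-Fock state, we know that any orbital variation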
that preserves orthonormality to first order must have a first order energy variation which
is zero. For such a state, we have now proved Brillouin's theorem: the HF state does not
interact through H with any singly-excited determinant.

Answer 8
At the dissociation limit, we know the orbitals:

$$|g\rangle = |\sigma_g\rangle = \sqrt{\tfrac{1}{2}}\,|A+B\rangle$$
$$|u\rangle = |\sigma_u\rangle = \sqrt{\tfrac{1}{2}}\,|A-B\rangle$$

with obvious notation, where A and B are the hydrogen 1s functions. The one-electron
energies of A and B are both $-\tfrac{1}{2}$. The singlet and triplet energies differ by an exchange
integral which is zero. The energies are thus

$$E(\sigma_g^2) = -1 + (gg|gg)$$
$$E(\sigma_g\sigma_u) = -1$$
$$E(\sigma_u^2) = -1 + (uu|uu)$$

The electrostatic integrals are easily computed:

$$(gg|gg) = \tfrac{1}{4}\left((A+B)(A+B)|(A+B)(A+B)\right) = \tfrac{1}{4}\left((AA|AA) + (BB|BB)\right) = \tfrac{5}{16}$$

since any integral with both A and B factors must be zero. Thus the dissociation limit for
the open-shell states is correct, but the closed-shell $1\sigma_g^2$ state is (with this basis) 8.5 eV
too high! The explanation is that there is a matrix element of this size which couples
the two closed-shell determinants. Taking this into account (Full CI) produces two linear
combinations of the closed-shell states, one with the correct dissociation energy, and one
which dissociates into H+ and H-. This is of course a common phenomenon: closed-shell
HF is unable to break or form bonds.
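The two-determinant CI mentioned above is a 2 x 2 eigenvalue problem and can be written out directly. This is a minimal sketch in atomic units; it assumes the one-center repulsion integral (AA|AA) = 5/8 hartree for a hydrogen 1s function, a coupling element equal to (gu|gu) = 5/16 at dissociation, and the conversion 1 hartree = 27.2114 eV.

```python
import numpy as np

J_AA = 5.0 / 8.0                 # (AA|AA) for a hydrogen 1s function, hartree
e_closed = -1.0 + J_AA / 2.0     # E(sigma_g^2) = E(sigma_u^2) = -1 + 5/16
coupling = J_AA / 2.0            # <sigma_g^2|H|sigma_u^2> = (gu|gu) = 5/16

H = np.array([[e_closed, coupling],
              [coupling, e_closed]])
e_low, e_high = np.linalg.eigvalsh(H)   # eigenvalues in ascending order

hartree_to_ev = 27.2114
hf_error_ev = (e_closed - e_low) * hartree_to_ev
```

The lower root is the energy of two separated hydrogen atoms, the upper root is the ionic H+ + H- combination, and `hf_error_ev` reproduces the 8.5 eV quoted in the text.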

3 Second Quantization

Answer 1

Using

with
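$$k_l = n_l - \delta_{lj}$$
$$\Gamma(k)_l = \epsilon_{lj}\Gamma(n)_l$$

we obtain

$$a_i^\dagger a_j|n\rangle = n_j\Gamma(n)_j\,a_i^\dagger|k\rangle$$
$$= n_j(1-k_i)\Gamma(n)_j\Gamma(k)_i|p\rangle$$
$$= n_j(1-n_i+\delta_{ij})\,\epsilon_{ij}\Gamma(n)_j\Gamma(n)_i|p\rangle$$

where $p_l = n_l$ for $l \ne i, j$, $p_i = 1$ and $p_j = \delta_{ij}$.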

Answer 2
1.

$$[A, B_1B_2] = AB_1B_2 - B_1B_2A = AB_1B_2 - B_1AB_2 + B_1AB_2 - B_1B_2A = [A,B_1]B_2 + B_1[A,B_2] \qquad (1)$$

2. Assume that the equation is fulfilled for n = m:

$$[A, B_1\cdots B_m] = \sum_{k=0}^{m-1}B_1\cdots B_k[A,B_{k+1}]B_{k+2}\cdots B_m$$

We then have from 1

$$[A, B_1B_2\cdots B_mB_{m+1}] = [A, B_1B_2\cdots B_m]B_{m+1} + B_1B_2\cdots B_m[A,B_{m+1}]$$
$$= \sum_{k=0}^{m-1}\left(B_1\cdots B_k[A,B_{k+1}]B_{k+2}\cdots B_m\right)B_{m+1} + B_1B_2\cdots B_m[A,B_{m+1}]$$
$$= \sum_{k=0}^{m}B_1\cdots B_k[A,B_{k+1}]B_{k+2}\cdots B_{m+1}$$

Since the equation holds for n = 2, the induction is complete.

3.

$$[A, B_1B_2] = AB_1B_2 - B_1B_2A = AB_1B_2 + B_1AB_2 - B_1AB_2 - B_1B_2A = [A,B_1]_+B_2 - B_1[A,B_2]_+ \qquad (2)$$

4. Assume that the relation holds for n = m (m even):

$$[A, B_1\cdots B_m] = \sum_{k=0}^{m-1}(-1)^kB_1\cdots B_k[A,B_{k+1}]_+B_{k+2}\cdots B_m \qquad (3)$$

Using Eq. 1 we obtain

$$[A, B_1\cdots B_mB_{m+1}B_{m+2}] = [A, B_1\cdots B_m]B_{m+1}B_{m+2} + B_1\cdots B_m[A, B_{m+1}B_{m+2}]$$

Using 2 and 3 gives

$$[A, B_1\cdots B_{m+2}] = \sum_{k=0}^{m-1}(-1)^kB_1\cdots B_k[A,B_{k+1}]_+B_{k+2}\cdots B_mB_{m+1}B_{m+2}$$
$$\qquad + B_1\cdots B_m[A,B_{m+1}]_+B_{m+2} - B_1\cdots B_{m+1}[A,B_{m+2}]_+$$
$$= \sum_{k=0}^{m+1}(-1)^kB_1\cdots B_k[A,B_{k+1}]_+B_{k+2}\cdots B_{m+2}$$

Since the equation holds for n = 2 (Eq. 2), the induction is complete.

5.

$$[A, B_1B_2]_+ = AB_1B_2 + B_1B_2A = AB_1B_2 - B_1AB_2 + B_1AB_2 + B_1B_2A = [A,B_1]B_2 + B_1[A,B_2]_+$$

6.

$$[A,[B,C]] + [C,[A,B]] + [B,[C,A]] = ABC - ACB - BCA + CBA + CAB - CBA - ABC + BAC + BCA - BAC - CAB + ACB = 0$$
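These identities hold for any associative algebra, so they can be spot-checked with random matrices standing in for the operators. The sketch below assumes NumPy; `comm`, `anti`, `prod` and `expand` are hypothetical helper names.

```python
import numpy as np
from functools import reduce

def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a

def anti(a, b):
    """Anticommutator [a, b]_+."""
    return a @ b + b @ a

def prod(mats, n):
    """Ordered product of a list of n x n matrices (identity if the list is empty)."""
    return reduce(np.matmul, mats, np.eye(n))

def expand(a, bs, bracket, sign):
    """sum_k sign^k B_1..B_k bracket(A, B_{k+1}) B_{k+2}..B_m, with k = 0..m-1."""
    n = a.shape[0]
    return sum(sign ** k * prod(bs[:k], n) @ bracket(a, bs[k]) @ prod(bs[k + 1:], n)
               for k in range(len(bs)))
```

With `bracket=comm, sign=1` this is the expansion of item 2; with `bracket=anti, sign=-1` and an even number of factors it is the expansion of item 4.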

Answer 3

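Let

$$\hat{K} = \sum_{ij}K_{ij}a_i^\dagger a_j$$

and

$$\hat{f} = \sum_{kl}f_{kl}a_k^\dagger a_l$$

so

$$[\hat{K},\hat{f}] = \sum_{ijkl}K_{ij}f_{kl}[a_i^\dagger a_j, a_k^\dagger a_l] = \sum_{ijkl}K_{ij}f_{kl}\left(\delta_{jk}a_i^\dagger a_l - \delta_{il}a_k^\dagger a_j\right) = \sum_{ij}f'_{ij}a_i^\dagger a_j$$

with (writing $f'$ for the matrix of the commutator)

$$f'_{ij} = \sum_k\left(K_{ik}f_{kj} - f_{ik}K_{kj}\right)$$

A two-electron operator can be written as

$$\hat{g} = \tfrac{1}{2}\sum_{klmn}g_{klmn}a_k^\dagger a_m^\dagger a_na_l$$

so

$$[\hat{K},\hat{g}] = \tfrac{1}{2}\sum_{ijklmn}K_{ij}g_{klmn}[a_i^\dagger a_j, a_k^\dagger a_m^\dagger a_na_l]$$
$$= \tfrac{1}{2}\sum_{ijklmn}K_{ij}g_{klmn}\left(\delta_{jk}a_i^\dagger a_m^\dagger a_na_l + \delta_{jm}a_k^\dagger a_i^\dagger a_na_l - \delta_{in}a_k^\dagger a_m^\dagger a_ja_l - \delta_{il}a_k^\dagger a_m^\dagger a_na_j\right)$$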

with

Answer 4

If n = 1, i.e. $I_n$ is one single elementary operator, but m is even, then the fourth formula
of Exercise 2 shows, since then $[A, B_{k+1}]_+$ is a number, that the terms of $[I_1, I_m]$ are
products of at most m - 1 elementary operators. Since $[I_n, I_m] = -[I_m, I_n]$, the roles of
$I_n$ and $I_m$ can be interchanged, so for even n, $[I_n, I_1]$ contains products of at most n - 1
elementary operators. Now let $I_m = B_1\cdots B_m$ and use the second formula of Exc. 2:

$$[I_n, I_m] = \sum_{k=0}^{m-1}B_1\cdots B_k[I_n, B_{k+1}]B_{k+2}\cdots B_m$$

We now know that the commutator in the right-hand side contains products of at most
n - 1 elementary operators, which proves the statement.

Answer 5
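1.

$$\exp(A)^\dagger = \left[\sum_{m=0}^{\infty}\frac{A^m}{m!}\right]^\dagger = \sum_{m=0}^{\infty}\frac{(A^\dagger)^m}{m!} = \exp(A^\dagger)$$

2. Since $BA^mB^{-1} = (BAB^{-1})^m$,

$$B\exp(A)B^{-1} = B\left[\sum_{m=0}^{\infty}\frac{A^m}{m!}\right]B^{-1} = \sum_{m=0}^{\infty}\frac{(BAB^{-1})^m}{m!} = \exp(BAB^{-1})$$

A similar statement is obviously true for any Taylor expansion!

3.

$$\exp(A)\exp(B) = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\frac{A^jB^k}{j!\,k!} = \sum_{n=0}^{\infty}\frac{1}{n!}\sum_{m=0}^{n}\frac{n!}{m!(n-m)!}A^mB^{n-m} = \sum_{n=0}^{\infty}\frac{1}{n!}(A+B)^n = \exp(A+B),$$

where we used the binomial expansion for commuting quantities

$$(A+B)^n = \sum_{m=0}^{n}\frac{n!}{m!(n-m)!}A^mB^{n-m}$$

4.

$$\exp(A)\exp(-A) = \exp(A-A) = 1$$

5.

$$\frac{d}{d\lambda}\exp(\lambda A) = \sum_{n=1}^{\infty}\frac{\lambda^{n-1}A^n}{(n-1)!} = A\sum_{n=0}^{\infty}\frac{\lambda^nA^n}{n!} = A\exp(\lambda A) = \exp(\lambda A)A \qquad (4)$$

6. Determine a Taylor expansion of $f(\lambda)$. Using 4 we obtain

$$\frac{d}{d\lambda}\left(\exp(\lambda A)B\exp(-\lambda A)\right) = \left(\frac{d}{d\lambda}\exp(\lambda A)\right)B\exp(-\lambda A) + \exp(\lambda A)B\left(\frac{d}{d\lambda}\exp(-\lambda A)\right)$$
$$= \exp(\lambda A)(AB + B(-A))\exp(-\lambda A) = \exp(\lambda A)[A,B]\exp(-\lambda A)$$

Differentiating several times, we obtain in general

$$\frac{d^n}{d\lambda^n}\exp(\lambda A)B\exp(-\lambda A) = \exp(\lambda A)[A,[A,\ldots[A,B]\ldots]]\exp(-\lambda A)$$

The Taylor expansion of $f(\lambda)$ for $\lambda = 1$ around $\lambda = 0$ proves the relation.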

Answer 6
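1.

$$\exp(D) = 1 + D + \frac{1}{2}D^2 + \frac{1}{3!}D^3 + \cdots + \frac{1}{n!}D^n + \cdots$$

Since the n'th power of a diagonal matrix D is a diagonal matrix with elements
$(d_{ii})^n$ we obtain

$$\exp(D)_{ij} = \delta_{ij}\exp(d_{ii})$$

2. From exercise 5.2

$$\exp(A) = \exp(UDU^{-1}) = U\exp(D)U^{-1}$$

so

$$\det(\exp A) = \det(U\exp(D)U^{-1}) = \det(U)\det(\exp(D))\det(U^{-1}) = \det(\exp(D)) = \prod_i\exp(d_{ii})$$
$$= \exp\left(\sum_id_{ii}\right) = \exp(\mathrm{Tr}\,D) = \exp(\mathrm{Tr}(DU^{-1}U)) = \exp(\mathrm{Tr}(UDU^{-1})) = \exp(\mathrm{Tr}\,A)$$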
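The exponential identities of Answers 5 and 6 can be spot-checked with small random matrices. The sketch below assumes NumPy only and sums the Taylor series directly, which is adequate for matrices of small norm but is not a robust general-purpose matrix exponential; the function names are illustrative.

```python
import numpy as np

def expm_series(a, terms=60):
    """Matrix exponential by plain Taylor summation (fine for small norms)."""
    out = np.eye(a.shape[0], dtype=complex)
    term = np.eye(a.shape[0], dtype=complex)
    for m in range(1, terms):
        term = term @ a / m          # term is now a^m / m!
        out = out + term
    return out

def nested_commutator_series(a, b, terms=60):
    """B + [A,B] + [A,[A,B]]/2! + ... with n nested commutators in the n-th term."""
    out = b.astype(complex)
    nested = b.astype(complex)
    for n in range(1, terms):
        nested = (a @ nested - nested @ a) / n
        out = out + nested
    return out
```

Comparing `expm_series(a) @ b @ expm_series(-a)` with `nested_commutator_series(a, b)` tests item 6 of Answer 5, and comparing `np.linalg.det(expm_series(a))` with `np.exp(np.trace(a))` tests Answer 6.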

Answer 7
Since $H_{2D}$ and $H_{SSC}$ contain summations over 2 electrons, they are two-electron operators.
$H_{2D}$ is spin-free, so its second quantization representation is

with

(f/J;lPjlP,lPl) = -; f drdr'IP;(r)lP;(r)c5(r - r')lPk(r')IPI(r')


= -~ f drlPi'{r)lP;(r)IPHr)lPl(r)
The operator $H_{SSC}$ is spin dependent. It is convenient to write $H_{SSC}$ as

~>" 28 +S~ + 25'- s~ + s; s~ c5(ri -


811" ,,1" " ( 1·" "")
H~se = - 3c2 I J
rj)

The following integrals over spin orbitals are non-vanishing;

(Hsse )iojolcolo = (Hsse )iPjPlcPIP


(Hsse )iojo'PIP = (Hsse )iPjP'olo

So the second quantization representation of H~se becomes

The integrals (ijk/) have the following permutation symmetry

(ijk/) = (kji/) = (ilkj) = (k/ij)


so

H"2D = -21 "(.


L.J '1·k/) [t t
a;oa'oaloajo t t
+ aipa/opalpajp
ij/ol

+ a!oa!p4 Ipajo + a!pa!aalaajp]


= -21 E (ijkl) [a!aaloa,aaja + alpalpa'Pajp]
i>/o,jl

+-21 L (ijkl) [otooL.a,oaier + 0!p01polPoiP]


i<lc,jl

+-21 L(ijkl) [ot,olpolPojO +1p oloolooiP] (5)


iikl
= 1
-2 :E (ijkl) [010 0100100jO + otpo!poIPOjp]
i>Ic,j1

+-21 L (kjil) [010010010 a jO + olpo;pa,paiP]


i<lc,j/

+ L(ijkl)a10 0!polpaio
ijkl
L< ij kl)otoalpalPojO
iikl
In the same way we obtain
Hssc = -2 L a!oalpolpaja (6)
ijkl
Comparing Eqs. 5 and 6 gives directly
$$H_{SSC} = -2H_{2D}$$

Answer 8

H,c = L 1'A ~ L
A IJ tritfJ
Jdr dm. ~i<r) O"i(m.)
X {~(S+i + S-d IAz + ~i (S+i - S-i) lAy + S.dA.}
x6(r-rA) (/Ij(r) O"i(m.) a!...;aj"j

= ~ LA 1'A ~ ~i(rA) ~j(rA)


'J

x {IA%a!oajp + Iba;pojO + iIAlla1 ajp


0

- iIAyo!pojo + IA.otoajo - IA.o;pojp}

= ~ LA 1'A ~~i(rA) ~j(rA)


'J

x {(/A% - iIAy )(T;j(l, -1) - Tij(l, 1» + Y2IA.T;j(l,O)}

Answer 9
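We have

$$[\hat{E}_{mn}, a^\dagger_{i\sigma}] = \sum_{\sigma'}[a^\dagger_{m\sigma'}a_{n\sigma'}, a^\dagger_{i\sigma}] = \delta_{ni}a^\dagger_{m\sigma}$$

and

$$[\hat{E}_{mn}, \hat{e}_{ijkl}] = [\hat{E}_{mn}, \hat{E}_{ij}\hat{E}_{kl} - \delta_{jk}\hat{E}_{il}]$$
$$= \hat{E}_{ij}[\hat{E}_{mn},\hat{E}_{kl}] + [\hat{E}_{mn},\hat{E}_{ij}]\hat{E}_{kl} - \delta_{jk}[\hat{E}_{mn},\hat{E}_{il}]$$
$$= \delta_{nk}\hat{E}_{ij}\hat{E}_{ml} - \delta_{ml}\hat{E}_{ij}\hat{E}_{kn} + \delta_{ni}\hat{E}_{mj}\hat{E}_{kl} - \delta_{mj}\hat{E}_{in}\hat{E}_{kl} - \delta_{jk}\delta_{ni}\hat{E}_{ml} + \delta_{jk}\delta_{ml}\hat{E}_{in}$$
$$= \delta_{nk}\hat{e}_{ijml} + \delta_{nk}\delta_{jm}\hat{E}_{il} - \delta_{ml}\hat{e}_{ijkn} - \delta_{ml}\delta_{jk}\hat{E}_{in} + \delta_{ni}\hat{e}_{mjkl} + \delta_{ni}\delta_{jk}\hat{E}_{ml} - \delta_{mj}\hat{e}_{inkl} - \delta_{mj}\delta_{nk}\hat{E}_{il} - \delta_{jk}\delta_{ni}\hat{E}_{ml} + \delta_{jk}\delta_{ml}\hat{E}_{in}$$
$$= \delta_{nk}\hat{e}_{ijml} - \delta_{ml}\hat{e}_{ijkn} + \delta_{ni}\hat{e}_{mjkl} - \delta_{mj}\hat{e}_{inkl}$$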

Answer 10

Jl(l + 1) - 0(0 + 1)Qij(1, 1)



. lIn( a.,a.
[5:, t t ,+a.t ,a.,
v2 " )-, '-,),
t) 1

2. Q;AO,O):


[5+,
v2
(t
1M a.,a.t ,-a.t ,a·l 1
" )-, '-,),
t)

[5.,
v2
(t"
• 1M a.,a.t ,-a.t ,a.,
)-,
t)l
'-,).

3. Substitution of Eq. (5.22) in the three components in Eq. (5.24) gives

fI (at, (-a)d + at
( at,E2' a)·_!,
2 V2" 1.2' 2
,a)._!)
1-'2 2
, at1-2',(-a)._d)
2

Except for an overall sign in all components we have the triplet operator in Eq.
(5.26). Substitution of Eq. (5.22) in Eq. (5.25) gives

ff.(a!,(-a.d-a
V'2 'i" 12
f la'_l) =- fI(a!la'l+a! la'_ l)
V'2 '2'
1- 2 J 2 J2 1-2" J 2

which is identical to Eq. (5.27) apart from the minus sign.

Answer 11

From the second quantization expressions for the components of S, Eq. 5.14, it is imme-
diately seen that

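One of the formulae for the commutator of an operator product gives

$$[\hat{S}_+\hat{S}_-, \hat{N}_i] = \hat{S}_+[\hat{S}_-,\hat{N}_i] + [\hat{S}_+,\hat{N}_i]\hat{S}_- = 0$$

and similarly for any other product of S components. Since $\hat{S}^2 = \tfrac{1}{2}(\hat{S}_+\hat{S}_- + \hat{S}_-\hat{S}_+) + \hat{S}_z^2$,
we obtain immediately $[\hat{S}^2, \hat{N}_i] = 0$.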

Answer 12

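A useful idea is to use the Taylor expansion of the exponential and add the even and odd
powers separately.
a) Use $\kappa^2 = -\lambda^2\mathbf{1}$, where $\mathbf{1}$ is the unit matrix. Then $\kappa^{2n} = (-\lambda^2)^n\mathbf{1}$, while $\kappa^{2n+1} = (-\lambda^2)^n\kappa$:

$$\exp(\kappa) = \sum_{n=0}^{\infty}\frac{\kappa^{2n}}{(2n)!} + \sum_{n=0}^{\infty}\frac{\kappa^{2n+1}}{(2n+1)!} = \sum_{n=0}^{\infty}\frac{(-1)^n\lambda^{2n}}{(2n)!}\mathbf{1} + \sum_{n=0}^{\infty}\frac{(-1)^n\lambda^{2n}}{(2n+1)!}\kappa$$
$$= \cos\lambda\,\mathbf{1} + \left(\frac{\sin\lambda}{\lambda}\right)\kappa = \begin{pmatrix}\cos\lambda & \sin\lambda\\ -\sin\lambda & \cos\lambda\end{pmatrix}$$

b) Use $\kappa^2 = -|\alpha|^2\mathbf{1}$ in the same way:

$$\exp(\kappa) = \begin{pmatrix}\cos|\alpha| & \frac{\alpha}{|\alpha|}\sin|\alpha|\\ -\frac{\alpha^*}{|\alpha|}\sin|\alpha| & \cos|\alpha|\end{pmatrix}$$

c) First, split up $\kappa = \kappa_1 + \kappa_2$, where $\kappa_1 = iA\mathbf{1}$ with $A = \tfrac{1}{2}(\delta_1 + \delta_2)$. Since $\kappa_1$ is a multiple
of the unit matrix, it commutes with $\kappa_2$ so that $\exp(\kappa) = \exp(\kappa_1)\exp(\kappa_2)$. Use

$$\kappa_2^2 = \begin{pmatrix}iB & \alpha\\ -\alpha^* & -iB\end{pmatrix}^2 = -C^2\mathbf{1}$$

where $B = \tfrac{1}{2}(\delta_1 - \delta_2)$ and $C = \sqrt{B^2 + |\alpha|^2}$. As above, we obtain

$$\exp(\kappa) = \exp(iA)\begin{pmatrix}\cos C + \frac{iB}{C}\sin C & \frac{\alpha}{C}\sin C\\ -\frac{\alpha^*}{C}\sin C & \cos C - \frac{iB}{C}\sin C\end{pmatrix}$$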
Answer 13

Either use the standard representation for the spin operators to express the exponential,
in this case

This transformation matrix must then be applied individually to each orbital. One can
proceed similarly to obtain transformation matrices for other spins than S = ~.
As an alternative, we are sure to get the many-electron operator correct if we express it
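directly in field operators. In part (a), we obtain the transformed creators:

$$\exp(i\pi\hat{S}_x)\,a^\dagger_{i\alpha}\exp(-i\pi\hat{S}_x) = \sum_{n=0}^{\infty}\frac{(i\pi)^n}{n!}[\hat{S}_x,[\hat{S}_x,\ldots,[\hat{S}_x,a^\dagger_{i\alpha}]\ldots]]$$

where the n-th order term has n commutators. However, from the second-quantization
form of the S components, we get immediately

$$[\hat{S}_x, a^\dagger_{i\alpha}] = \tfrac{1}{2}a^\dagger_{i\beta},\qquad [\hat{S}_x, a^\dagger_{i\beta}] = \tfrac{1}{2}a^\dagger_{i\alpha}$$

By recursion, we obtain the multiple commutators

$$[\hat{S}_x,[\hat{S}_x,\ldots,[\hat{S}_x,a^\dagger_{i\alpha}]\ldots]] = \left(\tfrac{1}{2}\right)^na^\dagger_{i\sigma}$$

where $\sigma$ stands for $\beta$, if n is odd, else $\alpha$. Summing odd and even terms separately, as in
so many other examples, gives

$$\exp(i\pi\hat{S}_x)\,a^\dagger_{i\alpha}\exp(-i\pi\hat{S}_x) = \cos\left(\tfrac{\pi}{2}\right)a^\dagger_{i\alpha} + i\sin\left(\tfrac{\pi}{2}\right)a^\dagger_{i\beta} = ia^\dagger_{i\beta}$$

and obviously this derivation works the same if $\alpha$ and $\beta$ are interchanged in the formulae.
b) Since $\hat{S}_x^n|0,0\rangle = 0$ for all integer n > 0, the simplest is to use the Taylor expansion
directly. Only the first (0-th order) term is non-zero, so $\exp(\lambda\hat{S}_x)|0,0\rangle = |0,0\rangle$ for any $\lambda$.

c) Repeated use of $\hat{S}_x = \tfrac{1}{2}(\hat{S}_+ + \hat{S}_-)$ shows that $\hat{S}_x|1,0\rangle = \sqrt{\tfrac{1}{2}}(|1,1\rangle + |1,-1\rangle)$, and
$\hat{S}_x^2|1,0\rangle = |1,0\rangle$, so that

$$\hat{S}_x^{2n+1}|1,0\rangle = \sqrt{\tfrac{1}{2}}(|1,1\rangle + |1,-1\rangle)$$
$$\hat{S}_x^{2n}|1,0\rangle = |1,0\rangle$$

As usual, collecting terms of odd and even order separately in the Taylor expansion gives

$$\exp(i\pi\hat{S}_x)|1,0\rangle = \cos(\pi)|1,0\rangle + i\sin(\pi)\sqrt{\tfrac{1}{2}}(|1,1\rangle + |1,-1\rangle) = -|1,0\rangle$$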

Answer 14

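Use $\hat{X} = i\sum_{k\ne 0}R_k\left(|0\rangle\langle k| + |k\rangle\langle 0|\right)$.

a) Operating repeatedly on $|0\rangle$ gives

$$\hat{X}|0\rangle = i\sum_{k\ne 0}R_k|k\rangle$$
$$\hat{X}^2|0\rangle = -d^2|0\rangle$$
$$\hat{X}^{2n}|0\rangle = (-1)^nd^{2n}|0\rangle$$
$$\hat{X}^{2n+1}|0\rangle = (-1)^nd^{2n}\hat{X}|0\rangle$$

where $d = \sqrt{\sum_{k\ne 0}R_k^2}$. As before, then,

$$\exp(\hat{X})|0\rangle = \cos(d)|0\rangle + i\left(\frac{\sin d}{d}\right)\sum_{k\ne 0}R_k|k\rangle$$

b) Similarly,

$$\hat{X}|l\rangle = iR_l|0\rangle$$
$$\hat{X}^{2n+1}|l\rangle = iR_l(-1)^nd^{2n}|0\rangle$$
$$\hat{X}^{2n+2}|l\rangle = iR_l(-1)^nd^{2n}\hat{X}|0\rangle$$

The last expression, used in the even-term Taylor expansion, would give $-iR_l\left(\frac{\cos d}{d^2}\right)\hat{X}|0\rangle$.
However, the 0-th order term will be missing, and must be subtracted. Also, insert the
expression $\hat{X}|0\rangle = i\sum_{k\ne 0}R_k|k\rangle$ to obtain

$$\exp(\hat{X})|l\rangle = |l\rangle + iR_l\left(\frac{\sin d}{d}\right)|0\rangle + R_l\left(\frac{\cos d - 1}{d^2}\right)\sum_{k\ne 0}R_k|k\rangle$$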

Answer 15

Choose
b; = exp(-ilt)a;exp(ilt)
It is immE'diately obvious that

[bi,bjJ = [exp(-ilt)ajexp(ilt),exp(-ilt)a;exp(ilt)J = [a;,a;J = Djj


Note that hi is not equal to hi:

since It is in general not hermitian! The nomenclature is a bit vague: The most logical
choice is to call bt and bi the creator and annihilator, respectively, for orbital h; with
respect to the orbital system {b; }~1. The advantage is that the anticommutation relations
always work, while a disadvantage is that the annihilator is then not defined solely by
the orbital but depends on the choice of the entire orbital set. More common is to define
the 'annihilator' to be the conjugate of the creator, and accept that the anticommutation
relations will have an overlap matrix element instead of a Kronecker delta on the right-
hand side. The most drastic stratagem is to use different notation for the 'hat' and 'tilde'
operators: the orbital index is put in either a superscript or subscript position, and all
expressions are handled by tensor rules.

4 Spin

Answer 1

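$$[\hat{S}_x^2, \hat{S}_z] = \hat{S}_x[\hat{S}_x,\hat{S}_z] + [\hat{S}_x,\hat{S}_z]\hat{S}_x = -i(\hat{S}_x\hat{S}_y + \hat{S}_y\hat{S}_x)$$
$$[\hat{S}_y^2, \hat{S}_z] = +i(\hat{S}_x\hat{S}_y + \hat{S}_y\hat{S}_x)\quad\text{(similarly)}$$
$$[\hat{S}_z^2, \hat{S}_z] = 0$$

The sum is simply $[\hat{S}^2, \hat{S}_z] = 0$, QED. Since all used relations are invariant to cyclic
permutation of labels x, y, z, so is the result, so it is also true that $[\hat{S}^2,\hat{S}_x] = [\hat{S}^2,\hat{S}_y] = 0$.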

Answer 2

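The proof is mechanical:

$$(\hat{S}_x - i\hat{S}_y)(\hat{S}_x + i\hat{S}_y) = \hat{S}_x\hat{S}_x - i\hat{S}_y\hat{S}_x + i\hat{S}_x\hat{S}_y - i^2\hat{S}_y\hat{S}_y$$
$$= \hat{S}_x^2 + \hat{S}_y^2 + i[\hat{S}_x,\hat{S}_y] = \hat{S}^2 - \hat{S}_z^2 + i(i\hat{S}_z) = \hat{S}^2 - \hat{S}_z(\hat{S}_z + 1)$$

and similarly for $\hat{S}_+\hat{S}_- = \hat{S}^2 - \hat{S}_z(\hat{S}_z - 1)$. We get the relations

$$\hat{S}^2 = \hat{S}_-\hat{S}_+ + \hat{S}_z(\hat{S}_z + 1)$$
$$\hat{S}^2 = \hat{S}_+\hat{S}_- + \hat{S}_z(\hat{S}_z - 1)$$
$$\hat{S}^2 = \tfrac{1}{2}(\hat{S}_-\hat{S}_+ + \hat{S}_+\hat{S}_-) + \hat{S}_z^2$$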

Answer 3

Use $[\hat{S}_z, \hat{S}_+] = [\hat{S}_z, \hat{S}_x + i\hat{S}_y] = \hat{S}_+$:

$$\hat{S}_z(\hat{S}_+|S,M_S\rangle) = (\hat{S}_+\hat{S}_z + [\hat{S}_z,\hat{S}_+])|S,M_S\rangle = (\hat{S}_+\hat{S}_z + \hat{S}_+)|S,M_S\rangle$$
$$= \hat{S}_+(M_S + 1)|S,M_S\rangle = (M_S + 1)(\hat{S}_+|S,M_S\rangle)$$

so $\hat{S}_+|S,M_S\rangle$ is a new (unnormalized) spin eigenfunction $\propto |S,M_S+1\rangle$, unless it is
zero. Directly from Exercise 1, we get $[\hat{S}^2,\hat{S}_+] = 0$, so by the same reasoning it is also an
eigenfunction of $\hat{S}^2$ with eigenvalue S(S + 1) unchanged.
To check if the result is nonzero, we compute

$$\|\hat{S}_+|S,M_S\rangle\|^2 = \langle S,M_S|\hat{S}_-\hat{S}_+|S,M_S\rangle = \langle S,M_S|\hat{S}^2 - \hat{S}_z(\hat{S}_z+1)|S,M_S\rangle = S(S+1) - M_S(M_S+1)$$

so, if $M_S \ne S$ and $M_S \ne -S-1$, then $\hat{S}_+|S,M_S\rangle$ is non-zero. Similar rules obtain for $\hat{S}_-$.
One can also conclude that the only possible $M_S$ values are spaced one unit apart in the
interval [-S, S], which shows that only integer or half-integer S values are possible, since else the
sequence of results with repeated application of $\hat{S}_+$ or $\hat{S}_-$ would generate functions with
negative norm, which is impossible.
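The ladder relations translate directly into matrices for any spin S. The following sketch assumes NumPy and the conventional phase choice in which the matrix elements of $\hat{S}_+$ are real and positive; `spin_matrices` is an illustrative name.

```python
import numpy as np

def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s,s>, |s,s-1>, ..., |s,-s>."""
    m = s - np.arange(int(round(2 * s)) + 1)      # M_S values in descending order
    sz = np.diag(m).astype(complex)
    # <M+1|S+|M> = sqrt(s(s+1) - M(M+1)) sits on the first superdiagonal
    sp = np.diag(np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    sm = sp.conj().T                              # S- is the hermitian conjugate of S+
    return 0.5 * (sp + sm), -0.5j * (sp - sm), sz
```

For s = 1/2 this reproduces the standard two-by-two representation, and for any s the matrices satisfy the commutation rules and the operator identities of Answers 1 to 3.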

Answer 4
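By definition, $\alpha$ and $\beta$ are spin eigenfunctions with $S = \tfrac{1}{2}$, $M_S = \pm\tfrac{1}{2}$, respectively. From
Exercise 3, $\hat{S}_+\alpha$ is a function with squared norm $S(S+1) - M_S(M_S+1) = 0$. Similarly,
$\hat{S}_-\beta = 0$. On the other hand, $\hat{S}_+\beta$ has squared norm $\tfrac{1}{2}(\tfrac{1}{2}+1) - (-\tfrac{1}{2})(\tfrac{1}{2}) = 1$, and so
$\hat{S}_+\beta = c\alpha$, and similarly $\hat{S}_-\alpha = d\beta$, where $|c| = |d| = cd = 1$. The last relation comes from
the requirement $\langle\beta|\hat{S}_-\hat{S}_+\beta\rangle = 1$ as in Exercise 3, but is also obvious for another reason:
Since $\hat{S}_+$ and $\hat{S}_-$ are hermitian conjugates, we must have that $\langle\alpha|\hat{S}_+\beta\rangle = \langle\beta|\hat{S}_-\alpha\rangle^*$. The
natural choice is thus the standard representation:

$$S_+ = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\qquad S_- = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix}$$

Directly from the definitions, we obtain then

$$S_x = \frac{1}{2}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\qquad S_y = \frac{1}{2}\begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}$$

The representation matrix of $\hat{S}^2$ is obviously $S(S+1) = \tfrac{3}{4}$ times a unit matrix. This, as well as all
commutation rules, can be verified by simple matrix multiplication.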

Answer 5

From e.g. Slater-Condon rules, or from the Second Quantization lectures, or simply
from noting that the determinant is a multilinear form of its rows (or its columns), the
application of an operator in the 'independent particle' form $\hat{A} = \sum_i\hat{a}(i)$ to a determinant
$D = |\psi_1\psi_2\psi_3\ldots\psi_n|$ is obtained as

$$\sum_i\hat{a}(i)|\psi_1\psi_2\psi_3\ldots\psi_n| = |(\hat{a}\psi_1)\psi_2\psi_3\ldots\psi_n| + |\psi_1(\hat{a}\psi_2)\psi_3\ldots\psi_n| + \cdots + |\psi_1\psi_2\psi_3\ldots(\hat{a}\psi_n)|$$

With the overline notation, we note that $\hat{s}_z\phi = \tfrac{1}{2}\phi$ while $\hat{s}_z\bar{\phi} = -\tfrac{1}{2}\bar{\phi}$. In that case, then,
all terms are equal to the original determinant except for a factor of $\tfrac{1}{2}$ or $-\tfrac{1}{2}$. Thus we
get $\hat{S}_zD = \tfrac{1}{2}(n_\alpha - n_\beta)D$, where $n_\alpha$ and $n_\beta$ count the number of orbitals without and with
an overbar. Obviously this applies generally whenever all the orbitals are eigenfunctions
of the one-electron operator $\hat{a}$: The determinant is simply multiplied by the sum of the
eigenvalues.

Answer 6

Use the result of Exercise 3: $\hat{S}^2 = \hat{S}_+\hat{S}_- + \hat{S}_z(\hat{S}_z - 1)$. The rest of the derivation can
be made rather simple by using the second-quantization rules rather than the 'individual
particle' formulation, but here is the more complicated solution. The one-electron operator
$\sum_i\hat{s}_+(i)\hat{s}_-(i)$ is handled just as in Exercise 5. Since $\hat{s}_+\hat{s}_-\phi = \phi$, while $\hat{s}_+\hat{s}_-\bar{\phi} = 0$,
we obtain $\sum_i\hat{s}_+(i)\hat{s}_-(i)D = n_\alpha D$. The two-electron operator $\sum_{i\ne j}\hat{s}_+(i)\hat{s}_-(j)$ works as
follows:
Consider a single operator in this sum, with some specific values i, j. In the expansion of
the determinant as a sum of products P, each term has one of four forms, which give
the following results:

$$\hat{s}_+(i)\hat{s}_-(j)\cdots\phi_p(i)\cdots\phi_q(j)\cdots = 0$$
$$\hat{s}_+(i)\hat{s}_-(j)\cdots\bar{\phi}_p(i)\cdots\phi_q(j)\cdots = \cdots\phi_p(i)\cdots\bar{\phi}_q(j)\cdots$$
$$\hat{s}_+(i)\hat{s}_-(j)\cdots\phi_p(i)\cdots\bar{\phi}_q(j)\cdots = 0$$
$$\hat{s}_+(i)\hat{s}_-(j)\cdots\bar{\phi}_p(i)\cdots\bar{\phi}_q(j)\cdots = 0$$

where p and q are some indices which are in general different for different products P.
Operating on the product gives zero result, unless an overline can be moved over from the
$\phi_p(i)$ to the $\phi_q(j)$ factor. If we now sum over all $i \ne j$, we generate a sum of new products,
where each is the result of switching the position of an overline, whenever possible. We
recognize this as the effect of the operator $\sum_{p<q}\hat{X}_{pq}$ defined in Exercise 6. We have thus
obtained

$$\sum_{i\ne j}\hat{s}_+(i)\hat{s}_-(j)\,P = \sum_{p<q}\hat{X}_{pq}\,P$$

It is difficult to directly apply the LHS expression to each of the n! product terms P in
D, but the RHS is expressed in terms of orbitals instead of particles and applies equally
to each of the terms, and thus also to their sum D:

$$\sum_{i\ne j}\hat{s}_+(i)\hat{s}_-(j)\,D = \sum_{p<q}\hat{X}_{pq}\,D$$

and finally

$$\hat{S}^2D = \left(\hat{S}_+\hat{S}_- + \hat{S}_z(\hat{S}_z - 1)\right)D$$
$$= \left(n_\alpha + \sum_{p<q}\hat{X}_{pq} + \tfrac{1}{2}(n_\alpha - n_\beta)\left(\tfrac{1}{2}(n_\alpha - n_\beta) - 1\right)\right)D$$
$$= \left(\sum_{p<q}\hat{X}_{pq} + N\right)D$$

where the number N is

$$N = \tfrac{1}{4}\left((n_\alpha - n_\beta)^2 + 2(n_\alpha + n_\beta)\right) = M_S^2 + n/2$$

This expression appears to involve closed shells, but this is easily fixed. Assume we have
$n_c$ closed shells, i.e. $2n_c$ closed-shell electrons, and thus $n_o = n - 2n_c$ open shells. The spin
interchange operator, if applied to the pair of orbitals in a closed shell, will interchange
the orbitals and produce a term -D. If one of the orbitals belongs to a closed shell, it
will produce zero by antisymmetry. Altogether, the sum of all the terms involving closed
shells will simply contribute $-n_cD$. This contribution can be subtracted from the n/2
term: $n/2 - n_c = n_o/2$. The result is thus unchanged if we apply it to open shells only.

Answer 7

Using Exercise 6, with the simplified notation, we obtain

$$\hat{S}^2|\alpha\alpha| = (M_S^2 + n/2)|\alpha\alpha| = 2|\alpha\alpha|$$
$$\hat{S}^2|\alpha\beta| = |\beta\alpha| + (M_S^2 + n/2)|\alpha\beta| = |\beta\alpha| + |\alpha\beta|$$
$$\hat{S}^2|\beta\alpha| = |\alpha\beta| + (M_S^2 + n/2)|\beta\alpha| = |\alpha\beta| + |\beta\alpha|$$
$$\hat{S}^2|\beta\beta| = (M_S^2 + n/2)|\beta\beta| = 2|\beta\beta|$$

Note that, in this notation, the closed shells have been disregarded, there are no orbital
labels, and the position of each $\alpha$ or $\beta$ indicates each of the open shells: there is no
antisymmetry rule involved. $|\beta\alpha| + |\alpha\beta|$ does not simplify to zero, but is the sum of two
quite different determinants.
Answer 8
a) From Exercise 7, we can immediately write down the matrix:

$$S^2 = \begin{pmatrix}2 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 2\end{pmatrix}$$

It is almost diagonal already, we just need to remove the mixing of the $D_2$ and $D_3$ functions.
The central 2 x 2 submatrix has normalised eigenvectors $\sqrt{\tfrac{1}{2}}(1,1)^T$ and $\sqrt{\tfrac{1}{2}}(1,-1)^T$,
with eigenvalues 2 and 0, respectively. Now remember that the eigenvalues should be
S(S + 1), so the spins are S = 1 or S = 0. We thus obtain the spin eigenfunctions
$|S, M_S\rangle$:

$$|1,1\rangle = D_1$$
$$|1,0\rangle = \sqrt{\tfrac{1}{2}}(D_2 + D_3)$$
$$|0,0\rangle = \sqrt{\tfrac{1}{2}}(D_2 - D_3)$$
$$|1,-1\rangle = D_4$$

b) Since $D_1$ is a high-spin determinant, we already know it is a spin eigenfunction with
$S = |M_S|$. The same is true for $D_4$. What we need to know is:

$$\hat{S}_-|S,M_S\rangle = \sqrt{S(S+1) - (M_S-1)M_S}\;|S,M_S-1\rangle$$
$$\hat{S}_-|\psi_1\psi_2\psi_3\ldots\psi_n| = |(\hat{s}_-\psi_1)\psi_2\psi_3\ldots| + |\psi_1(\hat{s}_-\psi_2)\psi_3\ldots| + \cdots$$

Starting with $D_1$ and spinning down, we obtain:

$$|1,1\rangle = D_1 = |\alpha\alpha|$$
$$\hat{S}_-|1,1\rangle = |\beta\alpha| + |\alpha\beta|$$
$$\sqrt{1\cdot 2 - 0\cdot 1}\;|1,0\rangle = |\beta\alpha| + |\alpha\beta|$$
$$|1,0\rangle = \sqrt{\tfrac{1}{2}}(|\beta\alpha| + |\alpha\beta|) = \sqrt{\tfrac{1}{2}}(D_2 + D_3)$$

In this case, we are already finished (almost), but you can verify that an additional step
produces indeed the $|1,-1\rangle$ already known.
Having obtained the triplet states, the only way to continue is by noting that the $M_S = 0$
space is spanned by $D_2$, $D_3$, and that the singlet is orthogonal to the triplet. Orthonormalization
or direct inspection gives $|0,0\rangle = \pm\sqrt{\tfrac{1}{2}}(D_2 - D_3)$.
c) Again, we know $D_1$ and $D_4$ are pure states. However, $D_2$ and $D_3$ are not pure, but are
each a mixture of singlet and triplet. Suppose we wish to get the pure singlet component
out from $D_2$: The product form of the projector now has a single factor (S = 0, K = 1):

$$\hat{O}_K = \frac{\hat{S}^2 - K(K+1)}{S(S+1) - K(K+1)} = \frac{\hat{S}^2 - 2}{0 - 2}$$

Applying this operator to $D_2$ gives

$$\tfrac{1}{2}(2 - \hat{S}^2)D_2 = \tfrac{1}{2}(2D_2 - (D_2 + D_3)) = \tfrac{1}{2}(D_2 - D_3)$$

The result is obviously not normalized. Furthermore, if we project instead from $D_3$, we
get

$$\tfrac{1}{2}(2 - \hat{S}^2)D_3 = \tfrac{1}{2}(2D_3 - (D_2 + D_3)) = \tfrac{1}{2}(D_3 - D_2)$$

If we wanted the triplet instead, we must use S = 1, K = 0:

$$\hat{O}_K = \frac{\hat{S}^2 - K(K+1)}{S(S+1) - K(K+1)} = \frac{\hat{S}^2}{2 - 0}$$

Applying this operator to $D_2$ gives

$$\tfrac{1}{2}\hat{S}^2D_2 = \tfrac{1}{2}(D_2 + D_3)$$

Answer 9

a) As in Exercise 7, we obtain:
15
82 10'00'1 = '410001
7
82 10'0.81 = 1.8001 + 10',80'1 + 4"100,81 etc ...
7
82 10'.8.81 = 1.80,B1 + 1,8,80'1 + 4"10.8,81 etc...

82 1.8.8.81 = 154 1,8,8.81


(Obviously, permuting each determinant in any of these formulae will produce other valid
formulae!) It follows that I~,~) = 10'001. Within the subspace spanned by the three
Ms = ~ functions, the 82 matrix will be

S2 =! ( 47 4 4)
7 4
4 447

This matrix has unnormalised eigenvectors (1,I,l)T, (1,-I,Of, and (1,1,-2f, with
eigenvalues ¥, ~, and ~, respectively. These values of S(S + 1) correspond to spin
S = ~,~, and ~, respectively. Within this subspace, we thus obtain the three normalized
spin eigenfunctions

31
12'2) = {a000.81 + 10,Bol + l,BoO'I)
1 1
12'2) = !f000.81 - 10,801)
1 1
12'2) = 1f(loO'.81 + 10.801- 21.800'1)
289

Obviously, since the last two are degenerate, we could just as well (as you probably did!)
obtain any orthonormal linear combinations of them instead. The matrix problem for the
=
three Ms -~ components is identical.
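b) Starting with $|\tfrac32,\tfrac32\rangle = |\alpha\alpha\alpha|$ since it is high spin, we next obtain

\[ \hat S_-|\tfrac32,\tfrac32\rangle = |\alpha\alpha\beta| + |\alpha\beta\alpha| + |\beta\alpha\alpha| \]

and thus

\[ \sqrt{\tfrac{15}{4} - \tfrac34}\;|\tfrac32,\tfrac12\rangle = |\alpha\alpha\beta| + |\alpha\beta\alpha| + |\beta\alpha\alpha| \]

or $|\tfrac32,\tfrac12\rangle = \sqrt{\tfrac13}\,(|\alpha\alpha\beta| + |\alpha\beta\alpha| + |\beta\alpha\alpha|)$. The next step is to obtain

\[ \hat S_-|\tfrac32,\tfrac12\rangle = 2\,|\tfrac32,-\tfrac12\rangle = \tfrac{2}{\sqrt3}\,\bigl(|\alpha\beta\beta| + |\beta\alpha\beta| + |\beta\beta\alpha|\bigr) \]

or $|\tfrac32,-\tfrac12\rangle = \sqrt{\tfrac13}\,(|\alpha\beta\beta| + |\beta\alpha\beta| + |\beta\beta\alpha|)$, and finally $|\tfrac32,-\tfrac32\rangle = |\beta\beta\beta|$.
To be able to continue, we must take the orthogonal complement to $|\tfrac32,\tfrac12\rangle$ in the $M_S = \tfrac12$ space. We can arbitrarily pick, for instance, the result of a Gram-Schmidt orthonormalization starting with, say, $|\alpha\alpha\beta|$. Its overlap with $|\tfrac32,\tfrac12\rangle$ is $\sqrt{\tfrac13}$. Subtracting $\sqrt{\tfrac13}\,|\tfrac32,\tfrac12\rangle$ leaves $\tfrac13(2|\alpha\alpha\beta| - |\alpha\beta\alpha| - |\beta\alpha\alpha|)$, which is then normalized to give

\[ |\tfrac12,\tfrac12\rangle = \sqrt{\tfrac16}\,\bigl(2|\alpha\alpha\beta| - |\alpha\beta\alpha| - |\beta\alpha\alpha|\bigr) \]

and finally orthonormalizing, say, $|\alpha\beta\alpha|$ against this function, and against $|\tfrac32,\tfrac12\rangle$, gives another $S = \tfrac12$ function:

\[ |\tfrac12,\tfrac12\rangle = \sqrt{\tfrac12}\,\bigl(|\alpha\beta\alpha| - |\beta\alpha\alpha|\bigr) \]

Whatever the choice of functions here, we next proceed by spinning them down to obtain $|\tfrac12,-\tfrac12\rangle$ functions in an obvious way.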

Obviously, we can make many different choices for the two $|\tfrac12,\tfrac12\rangle$ functions. The results from the diagonalization method are the most arbitrary ones. The results from spinning down have at least the advantage that the chain of states connected by $\hat S_+$ and $\hat S_-$ will always adhere to the standard formulae. Uniquely defined functions are obtained by so-called spin-coupling schemes, of which the genealogical scheme would be simplest and best for the purposes of this exercise. Several other schemes are also used.

Answer 10

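Since the functions have $M_S = 0$, the simplest test is to see if $\hat S_+$ (or, if you prefer, $\hat S_-$) gives zero:

\[
\begin{aligned}
\hat S_+\bigl(&|\alpha\beta\alpha\beta| - |\alpha\beta\beta\alpha| - |\beta\alpha\alpha\beta| + |\beta\alpha\beta\alpha|\bigr) \\
={}& |\alpha\alpha\alpha\beta| + |\alpha\beta\alpha\alpha| - |\alpha\alpha\beta\alpha| - |\alpha\beta\alpha\alpha| \\
&- |\alpha\alpha\alpha\beta| - |\beta\alpha\alpha\alpha| + |\alpha\alpha\beta\alpha| + |\beta\alpha\alpha\alpha| = 0
\end{aligned}
\]

\[
\begin{aligned}
\hat S_+\bigl(&2|\alpha\alpha\beta\beta| - |\alpha\beta\alpha\beta| - |\alpha\beta\beta\alpha| - |\beta\alpha\alpha\beta| - |\beta\alpha\beta\alpha| + 2|\beta\beta\alpha\alpha|\bigr) \\
={}& 2|\alpha\alpha\alpha\beta| + 2|\alpha\alpha\beta\alpha| - |\alpha\alpha\alpha\beta| - |\alpha\beta\alpha\alpha| - |\alpha\alpha\beta\alpha| - |\alpha\beta\alpha\alpha| \\
&- |\alpha\alpha\alpha\beta| - |\beta\alpha\alpha\alpha| - |\alpha\alpha\beta\alpha| - |\beta\alpha\alpha\alpha| + 2|\alpha\beta\alpha\alpha| + 2|\beta\alpha\alpha\alpha| = 0
\end{aligned}
\]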
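
The bookkeeping above is mechanical and can be automated. The sketch below assumes that each determinant has four distinct spatial orbitals in a fixed order, so a determinant is fully labelled by its spin string and no sign changes arise; the string encoding (`"a"` for $\alpha$, `"b"` for $\beta$) is an illustrative choice.

```python
from collections import Counter


def s_plus(state):
    """Apply S+ to a linear combination given as {spin_string: coefficient}.

    S+ is a sum of one-electron raising operators, so each beta spin is
    flipped to alpha in turn and the resulting strings are accumulated.
    """
    out = Counter()
    for det, coeff in state.items():
        for pos, spin in enumerate(det):
            if spin == "b":
                out[det[:pos] + "a" + det[pos + 1:]] += coeff
    # Drop terms whose coefficients cancelled exactly.
    return {det: c for det, c in out.items() if c != 0}


f1 = {"abab": 1, "abba": -1, "baab": -1, "baba": 1}
f2 = {"aabb": 2, "abab": -1, "abba": -1, "baab": -1, "baba": -1, "bbaa": 2}
```

An empty dictionary returned by `s_plus` means the function is annihilated by $\hat S_+$, which for $M_S = 0$ identifies it as a singlet.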

Answer 11

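In overline notation, the closed-shell determinant is $|0\rangle = |i\bar i j\bar j|$. Remember that $\hat E_{pq} = \sum_\sigma a^\dagger_{p\sigma}a_{q\sigma}$, where the operator $a^\dagger_{p\sigma}a_{q\sigma}$ acts on a determinant by giving a non-zero result only if $\psi_{q\sigma}$ is present but $\psi_{p\sigma}$ is not, and in that case replacing $\psi_{q\sigma}$ by $\psi_{p\sigma}$ in place. We immediately obtain

\[ \hat E_{bj}|i\bar i j\bar j| = |i\bar i b\bar j| + |i\bar i j\bar b| = -|i\bar i\bar j b| + |i\bar i j\bar b| \]
\[ \hat E_{ai}|i\bar i\bar j b| = |a\bar i\bar j b| + |i\bar a\bar j b| = |\bar i\bar j ab| - |i\bar j\bar a b| \]
\[ \hat E_{ai}|i\bar i j\bar b| = |a\bar i j\bar b| + |i\bar a j\bar b| = |\bar i j a\bar b| - |i j\bar a\bar b| \]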
which can be combined to

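Obviously, $M_S = 0$. Operating with $\hat S_+$ gives

\[
\begin{aligned}
\hat S_+\bigl(&-|\bar i\bar j ab| + |i\bar j\bar a b| + |\bar i j a\bar b| - |i j\bar a\bar b|\bigr) \\
={}& -|i\bar j ab| - |\bar i j ab| + |i j\bar a b| + |i\bar j ab| + |i j a\bar b| + |\bar i j ab| - |i j a\bar b| - |i j\bar a b| = 0
\end{aligned}
\]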

Answer 12

We can do the calculation as in Answer 11. It is simpler to reuse that result, just interchanging the letters $i$ and $j$:

After restoring the order of determinant labels, we obtain

Obviously, interchange of letter symbols does not alter the fact that the function is a
singlet.
In the two expressions, the two middle determinants differ, so they are independent.
However, the first and last terms give a non-zero overlap, so they are not orthonormal.

5 Geometrical Derivatives . ..

Answer 1

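$\epsilon(x)$ is defined as
\[ \epsilon(x) = E(x,\lambda(x)) \]
The chain rule of differentiation gives

\[ \frac{d\epsilon(x)}{dx} = \frac{dE(x,\lambda(x))}{dx} = \frac{\partial E}{\partial x}(x,\lambda(x)) + \frac{d\lambda(x)}{dx}\,\frac{\partial E}{\partial\lambda}(x,\lambda(x)) \]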

Differentiating again gives

The first and last terms are evaluated by the chain rule:

In toto, with some abbreviation,

For higher derivatives, most people find it simpler to write up the Taylor expansion of
E(x, A) and then replace the variable A with the expansion of the function A(X).

Answer 2
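The variational condition is to be valid for all $x$, so it must be fulfilled in each order of $x$ separately. The Taylor expansion of $\partial E/\partial\lambda$ is
\[
\begin{aligned}
\frac{\partial E(x,\lambda)}{\partial\lambda} ={}& E^{(01)} + xE^{(11)} + \tfrac12x^2E^{(21)} + \tfrac16x^3E^{(31)} + \dots \\
&+ \lambda\bigl(E^{(02)} + xE^{(12)} + \tfrac12x^2E^{(22)} + \dots\bigr) \\
&+ \tfrac12\lambda^2\bigl(E^{(03)} + xE^{(13)} + \dots\bigr) + \tfrac16\lambda^3\bigl(E^{(04)} + \dots\bigr) + \dots
\end{aligned}
\]
Replace the powers of the variable $\lambda$ by the expressions
\[ \lambda(x) = x\lambda^{(1)} + \tfrac12x^2\lambda^{(2)} + \tfrac16x^3\lambda^{(3)} + \dots \]
\[ \lambda(x)^2 = x^2(\lambda^{(1)})^2 + x^3\lambda^{(1)}\lambda^{(2)} + \dots \]
\[ \lambda(x)^3 = x^3(\lambda^{(1)})^3 + \dots \]
Collecting equal powers of $x$ into subexpressions, and setting each to zero, gives:
\[ E^{(01)} = 0 \]
\[ E^{(11)} + \lambda^{(1)}E^{(02)} = 0 \]
\[ E^{(21)} + \lambda^{(2)}E^{(02)} + 2\lambda^{(1)}E^{(12)} + (\lambda^{(1)})^2E^{(03)} = 0 \]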

These expressions can be used to eliminate terms in other expressions. They are the response equations of order zero, one, etc. In the general expression for the second derivative,
\[ \epsilon^{(2)} = E^{(20)} + 2\lambda^{(1)}E^{(11)} + \lambda^{(2)}E^{(01)} + (\lambda^{(1)})^2E^{(02)} \]
use of the zero-order response equation eliminates the $\lambda^{(2)}$ term:
\[ \epsilon^{(2)} = E^{(20)} + 2\lambda^{(1)}E^{(11)} + (\lambda^{(1)})^2E^{(02)} \qquad (7) \]
Subtraction of $\lambda^{(1)}$ times the first-order response equation gives
\[ \epsilon^{(2)} = E^{(20)} + \lambda^{(1)}E^{(11)} \qquad (8) \]

Answer 3

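It is assumed that the energy partial derivatives are exact, but the response parameters have small errors. If we use equation 7, but replace $\lambda^{(1)}$ with $\lambda^{(1)} + \delta$, the result will change from the exact $\epsilon^{(2)}$ to $\epsilon^{(2)} + \Delta$:
\[
\begin{aligned}
\epsilon^{(2)} + \Delta &= E^{(20)} + 2(\lambda^{(1)} + \delta)E^{(11)} + (\lambda^{(1)} + \delta)^2E^{(02)} \\
&= E^{(20)} + 2\lambda^{(1)}E^{(11)} + (\lambda^{(1)})^2E^{(02)} + 2\bigl(E^{(11)} + \lambda^{(1)}E^{(02)}\bigr)\delta + E^{(02)}\delta^2 \\
&= \epsilon^{(2)} + 0\cdot\delta + E^{(02)}\delta^2
\end{aligned}
\]
i.e., $\Delta = E^{(02)}\delta^2$, and the error is of second order.

Similarly, for equation 8, we obtain a first-order error $\Delta = E^{(11)}\delta$ instead.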
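
The difference between the two error orders is easy to see numerically. This is a small sketch for a one-parameter model; the numerical values of $E^{(20)}$, $E^{(11)}$ and $E^{(02)}$ are arbitrary illustrative choices, not taken from the notes.

```python
def second_derivative_eq7(E20, E11, E02, lam1):
    """Equation 7: variationally stable form of the second derivative."""
    return E20 + 2.0 * lam1 * E11 + lam1 ** 2 * E02


def second_derivative_eq8(E20, E11, lam1):
    """Equation 8: shorter form, obtained with the first-order response equation."""
    return E20 + lam1 * E11


# Hypothetical model derivatives (assumed values, for illustration only).
E20, E11, E02 = 1.0, 0.5, 2.0
lam1 = -E11 / E02                      # exact first-order response
exact = second_derivative_eq8(E20, E11, lam1)

errors = {}
for delta in (1e-2, 1e-3):
    err7 = second_derivative_eq7(E20, E11, E02, lam1 + delta) - exact
    err8 = second_derivative_eq8(E20, E11, lam1 + delta) - exact
    errors[delta] = (err7, err8)
```

Reducing `delta` by a factor of ten reduces the equation 7 error by a factor of one hundred, but the equation 8 error only by a factor of ten.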

Answer 4
The first-order response equation gives immediately
\[ \lambda^{(1)} = -\bigl(E^{(02)}\bigr)^{-1}E^{(11)} \]

and insertion into equation 8 gives

\[ \epsilon^{(2)} = E^{(20)} - E^{(11)}\bigl(E^{(02)}\bigr)^{-1}E^{(11)} \]

It is written this way, rather than as a fraction, to allow generalization to many dimensions. $E^{(02)}$ is then a matrix, not a number.

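Series expansion gives

\[ \Psi = \exp\Bigl(-\sum_{n\neq0}\lambda_n\bigl(|n\rangle\langle0| - |0\rangle\langle n|\bigr)\Bigr)|0\rangle = |0\rangle - \sum_{n\neq0}\lambda_n|n\rangle - \tfrac12\sum_{n\neq0}\lambda_n^2|0\rangle + \dots \]

The energy derivatives with respect to $\lambda$ are

\[ \frac{\partial E}{\partial\lambda_n} = \langle0|H|\partial\Psi/\partial\lambda_n\rangle + \langle\partial\Psi/\partial\lambda_n|H|0\rangle = -2\langle0|H|n\rangle \quad\text{(if real)} \]

and similarly
\[ \frac{\partial^2E}{\partial\lambda_n\partial\lambda_m} = \dots = 2\bigl(\langle m|H|n\rangle - \langle0|H|0\rangle\delta_{mn}\bigr) \]
With $H = H^{(0)} + xH^{(1)} + \dots$ we obtain
\[ E^{(00)} = \langle0|H^{(0)}|0\rangle, \qquad E^{(10)} = \langle0|H^{(1)}|0\rangle \]
\[ E^{(01)}_n = -2\langle0|H^{(0)}|n\rangle, \qquad E^{(11)}_n = -2\langle0|H^{(1)}|n\rangle \]
\[ E^{(02)}_{mn} = 2\bigl(\langle m|H^{(0)}|n\rangle - E_0\delta_{mn}\bigr) \]
Since we assume eigenstates of $\hat H^{(0)}$, $E^{(02)}_{mn} = 2(E_n - E_0)\delta_{mn}$. Then
\[
\begin{aligned}
\epsilon^{(2)} &= E^{(20)} - E^{(11)}\bigl(E^{(02)}\bigr)^{-1}E^{(11)} \\
&= \langle0|\hat H^{(2)}|0\rangle - \sum_{n\neq0}\bigl(-2\langle0|H^{(1)}|n\rangle\bigr)\frac{1}{2(E_n - E_0)}\bigl(-2\langle n|\hat H^{(1)}|0\rangle\bigr)
\end{aligned}
\]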

Answer 5
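Subtract $\lambda^{(3)}$ times the zero-order, and $3\lambda^{(2)}$ times the first-order response equation:
\[
\begin{aligned}
\epsilon^{(3)} ={}& E^{(30)} + \lambda^{(3)}E^{(01)} + 3\lambda^{(2)}E^{(11)} + 3\lambda^{(1)}E^{(21)} + 3\lambda^{(1)}\lambda^{(2)}E^{(02)} + 3(\lambda^{(1)})^2E^{(12)} + (\lambda^{(1)})^3E^{(03)} \\
&- \lambda^{(3)}E^{(01)} \\
&- 3\lambda^{(2)}E^{(11)} - 3\lambda^{(1)}\lambda^{(2)}E^{(02)} \\
={}& E^{(30)} + 3\lambda^{(1)}E^{(21)} + 3(\lambda^{(1)})^2E^{(12)} + (\lambda^{(1)})^3E^{(03)}
\end{aligned}
\]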

Answer 6

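\[ \boldsymbol\pi_O = -i\nabla + \mathbf A_O(\mathbf r) \]

Squaring:

\[
\begin{aligned}
\pi_O^2 &= \bigl(-i\nabla + \mathbf A_O(\mathbf r)\bigr)\cdot\bigl(-i\nabla + \mathbf A_O(\mathbf r)\bigr) \\
&= -\nabla^2 - 2i\mathbf A_O\cdot\nabla + A_O^2 - i\,\mathrm{div}\,\mathbf A_O
\end{aligned}
\]

The last term is zero. The term $-2i\mathbf A_O\cdot\nabla$ can be written as

\[ \bigl(\mathbf B\times(\mathbf r - \mathbf O)\bigr)\cdot\mathbf p = \bigl((\mathbf r - \mathbf O)\times\mathbf p\bigr)\cdot\mathbf B = \mathbf L_O\cdot\mathbf B \]
where we use the canonical momentum $\mathbf p = -i\nabla$. Taking $\mathbf B$ to define the quantization axis, we get
\[
\begin{aligned}
\hat H_O &= \tfrac12\pi_O^2 + V = \tfrac12\hat p^2 + V + \tfrac12B\hat L_z + \tfrac12A_O^2 \\
&= \hat h^{(0)} + \tfrac12B\hat L_z + O(B^2)
\end{aligned}
\]

where $O()$ is the Landau order function ("Big O").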

Answer 7

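We now get
\[ \bigl(\mathbf B\times(\mathbf r - \mathbf G)\bigr)\cdot\mathbf p = \bigl((\mathbf r - \mathbf O)\times\mathbf p\bigr)\cdot\mathbf B + \bigl(\mathbf B\times(\mathbf O - \mathbf G)\bigr)\cdot\mathbf p \]
so that
\[ \hat H_G = \hat h^{(0)} + \tfrac12B\hat L_z + B\,\mathbf d\cdot\mathbf p + O(B^2) \]
where $\mathbf d$ is a constant vector. This is not acceptable, since the $\psi_{lm}$ functions are not eigenfunctions of this extra term. It can be noted that this extra term is perfectly analogous to an extra term that appears when doing a Galilean transformation: the "velocity term" in describing e.g. atom/atom collisions.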

Answer 8

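In the usual position representation, the result follows immediately from differentiation: for functions $F$ and $f$,

\[ -i\nabla\bigl(F(\mathbf r)f(\mathbf r)\bigr) = [-i\nabla F](\mathbf r)\,f(\mathbf r) - F(\mathbf r)\,i\nabla f(\mathbf r) \]

so if $F(\mathbf r)$ is regarded as a multiplicative operator, then

\[ \mathbf pF = [-i\nabla F](\mathbf r) + F(\mathbf r)\,\mathbf p \]

With $F(\mathbf r) = \exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)$, we get

\[
\begin{aligned}
\boldsymbol\pi_G\exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r) &= \exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\,\boldsymbol\pi_G - \bigl[\nabla(\mathbf A_G(\mathbf O)\cdot\mathbf r)\bigr]\exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r) \\
&= \exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\bigl(\boldsymbol\pi_G - \mathbf A_G(\mathbf O)\bigr) \\
&= \exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\,\boldsymbol\pi_O
\end{aligned}
\]
We obtain then

\[
\begin{aligned}
\hat H_G\omega_{lm} &= \hat H_G\exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\psi_{lm} = \exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\hat H_O\psi_{lm} \\
&= \exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\bigl(E^{(0)} + \tfrac12Bm\bigr)\psi_{lm} \\
&= \bigl(E^{(0)} + \tfrac12Bm\bigr)\exp(-i\mathbf A_G(\mathbf O)\cdot\mathbf r)\psi_{lm} \\
&= \bigl(E^{(0)} + \tfrac12Bm\bigr)\omega_{lm}
\end{aligned}
\]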

Answer 9

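Using the first result of the last solution twice, we get

\[
\begin{aligned}
\pi_G^2\exp(-i\mathbf A_G(\mathbf N)\cdot\mathbf r) &= \boldsymbol\pi_G\exp(-i\mathbf A_G(\mathbf N)\cdot\mathbf r)\,\boldsymbol\pi_N \\
&= \exp(-i\mathbf A_G(\mathbf N)\cdot\mathbf r)\,\pi_N^2
\end{aligned}
\]
We obtain then the kinetic energy integral in the form

\[
\begin{aligned}
T_{\mu\nu} &= \langle\omega_\mu|\tfrac12\pi_G^2|\omega_\nu\rangle \\
&= \langle\psi_\mu|\exp(i\mathbf A_G(\mathbf M)\cdot\mathbf r)\,\tfrac12\pi_G^2\exp(-i\mathbf A_G(\mathbf N)\cdot\mathbf r)|\psi_\nu\rangle \\
&= \langle\psi_\mu|\exp\bigl(i(\mathbf A_G(\mathbf M) - \mathbf A_G(\mathbf N))\cdot\mathbf r\bigr)\,\tfrac12\pi_N^2|\psi_\nu\rangle
\end{aligned}
\]

and
\[ \mathbf A_G(\mathbf M) - \mathbf A_G(\mathbf N) = \tfrac12\mathbf B\times(\mathbf M - \mathbf N) \]
and finally, to bring it to the required form,
\[ \tfrac12\bigl(\mathbf B\times(\mathbf M - \mathbf N)\bigr)\cdot\mathbf r = \tfrac12\mathbf B\cdot\bigl((\mathbf M - \mathbf N)\times\mathbf r\bigr) \]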

Answer 10

No. It merely introduces a phase shift, which depends on the center position, for each
set of basis functions on a common center. All commonly used methods are insensitive to
phase shifts for the individual basis functions.

Answer 11

Assume a convergent iterative method, with an error $e_k$ in iteration $k$ which decreases towards zero. Define the convergence order $\gamma$ to mean that for sufficiently large $k$, $|e_{k+1}|/|e_k|^{\gamma}$ is bounded. For any $\gamma' < \gamma$, we then obtain
\[ |e_{k+1}|/|e_k|^{\gamma'} = |e_k|^{\gamma-\gamma'}\,|e_{k+1}|/|e_k|^{\gamma} \]
which is also bounded, since the first factor goes to zero. In general then: Convergence order $\gamma$ implies convergence order $\gamma'$, if $\gamma' < \gamma$.

Answer 12

The equation $y(x) = 0$ is to be solved, where $y$ is usually the gradient of some function to be optimized. The Newton-Raphson method is defined by

\[ x_{k+1} = x_k - G(x_k)^{-1}y(x_k) \]

In general, $G$ is a linear mapping defined by

\[ |y(x+\delta) - y(x) - G(x)\delta|/|\delta| = O(|\delta|) \]

In one dimension, it is simply the second derivative. In finite dimensions, it is the Jacobian matrix of the gradient function, i.e. the Hessian matrix. Assume convergence to some solution $x_\infty$, and let $x_k = x_\infty + e_k$. The N-R equations give

Multiplying by $G_k$ gives $G_ke_{k+1} = G_ke_k - y_k$, and the definition of $G$ then gives

Assuming that the inverse of G is bounded (which is true in finite dimensions since the
inverse is already assumed to exist) this implies

i.e. quadratic convergence.
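
Quadratic convergence can be observed directly in one dimension. The sketch below applies Newton-Raphson to the gradient $y(x) = x^3 - 1$ of the illustrative function $f(x) = x^4/4 - x$ (an assumed test case, with solution $x_\infty = 1$) and monitors the ratio $|e_{k+1}|/|e_k|^2$.

```python
def newton_raphson(y, G, x0, steps):
    """Iterate x_{k+1} = x_k - y(x_k) / G(x_k) and return all iterates."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - y(x) / G(x))
    return xs


def gradient(x):
    return x ** 3 - 1.0          # y(x), gradient of x**4/4 - x


def hessian(x):
    return 3.0 * x ** 2          # G(x), derivative of the gradient


xs = newton_raphson(gradient, hessian, x0=2.0, steps=6)
errors = [abs(x - 1.0) for x in xs]

# Ratios |e_{k+1}| / |e_k|**2; skipped once e_k reaches rounding level.
ratios = [errors[k + 1] / errors[k] ** 2
          for k in range(len(errors) - 1) if errors[k] > 1e-7]
```

The ratios stay bounded while the error itself collapses from 1 to rounding level in six steps, which is the behaviour the definition of convergence order 2 asks for.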

Answer 13

The minimum is stated to lie on the boundary $s^Ts = h^2$, so it coincides with the solution of the constrained problem for which Lagrange's method gives the equation

\[ Gs + g = \mu s \;\Rightarrow\; s = -(G - \mu)^{-1}g \]

with some unknown multiplier $\mu$. This gives immediately

\[ m = f - g^T(G - \mu)^{-1}g + \tfrac12g^T(G - \mu)^{-1}G(G - \mu)^{-1}g \]

or
\[
\begin{aligned}
\Delta m &= g^T\bigl(-(G - \mu)^{-1} + \tfrac12(G - \mu)^{-1}G(G - \mu)^{-1}\bigr)g \\
&= g^T\bigl(-(G - \mu)^{-1}(G - \mu)(G - \mu)^{-1} + \tfrac12(G - \mu)^{-1}G(G - \mu)^{-1}\bigr)g \\
&= g^T(G - \mu)^{-1}\bigl(-(G - \mu) + \tfrac12G\bigr)(G - \mu)^{-1}g \\
&= g^T(G - \mu)^{-1}\bigl(\mu - \tfrac12G\bigr)(G - \mu)^{-1}g
\end{aligned}
\]

Answer 14
a) PSB update formula, directly inserted in the quasi-Newton condition, gives
\[ B_+s_c = B_cs_c + \frac{(s_c^Ts_c)\,t_c\,(s_c^Ts_c) + (s_c^Ts_c)\,s_c\,(t_c^Ts_c) - (t_c^Ts_c)\,s_c\,(s_c^Ts_c)}{(s_c^Ts_c)^2} \]
This looks bad, but is simple: the scalar $s_c^Ts_c$ can be divided away. The second and third term in the numerator cancel. It remains
\[ B_+s_c = B_cs_c + t_c = y_c \]
which follows directly from the definition of $t_c$.
b) BFGS update formula, directly inserted in the quasi-Newton condition, gives
\[ B_+s_c = B_cs_c + \frac{y_cy_c^Ts_c}{y_c^Ts_c} - \frac{B_cs_cs_c^TB_cs_c}{s_c^TB_cs_c} \]
Here, the scalars $y_c^Ts_c$ and $s_c^TB_cs_c$ can be divided away. The first and last terms on the right-hand side then cancel.
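
Both cancellations can be confirmed numerically. The sketch below writes out the standard textbook forms of the PSB and BFGS updates (an assumption, since the update formulae themselves are in the exercise text rather than in this answer) and checks the quasi-Newton condition $B_+s_c = y_c$ for an arbitrary small example.

```python
def matvec(B, v):
    return [sum(B[i][j] * v[j] for j in range(len(v))) for i in range(len(B))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def psb_update(B, s, y):
    """Powell-symmetric-Broyden update with t = y - B s."""
    t = [yi - bi for yi, bi in zip(y, matvec(B, s))]
    ss, ts = dot(s, s), dot(t, s)
    n = len(s)
    return [[B[i][j] + (t[i] * s[j] + s[i] * t[j]) / ss
             - ts * s[i] * s[j] / ss ** 2
             for j in range(n)] for i in range(n)]


def bfgs_update(B, s, y):
    """BFGS update of a symmetric Hessian approximation B."""
    Bs = matvec(B, s)
    ys, sBs = dot(y, s), dot(s, Bs)
    n = len(s)
    return [[B[i][j] + y[i] * y[j] / ys - Bs[i] * Bs[j] / sBs
             for j in range(n)] for i in range(n)]


# Arbitrary illustrative data: current Hessian, step and gradient difference.
B_c = [[2.0, 0.5], [0.5, 1.0]]
s_c = [1.0, 2.0]
y_c = [3.0, 1.0]

psb_residual = [a - b for a, b in zip(matvec(psb_update(B_c, s_c, y_c), s_c), y_c)]
bfgs_residual = [a - b for a, b in zip(matvec(bfgs_update(B_c, s_c, y_c), s_c), y_c)]
```

Both residual vectors vanish to rounding accuracy, whatever symmetric `B_c` is used, as long as the denominators $s_c^Ts_c$, $y_c^Ts_c$ and $s_c^TB_cs_c$ are non-zero.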

Answer 15
We want to find a stationary value of $(\nabla f(x))^2$ restricted to the set of points $x$ for which $f(x) = \text{const}$. Lagrange's method gives
\[ \nabla(\nabla f(x))^2 = \lambda\nabla f(x) \]
For simplicity, use $g = \nabla f(x)$, and cartesian coordinates:
\[ \frac{\partial}{\partial x}\bigl(g_x^2 + g_y^2 + g_z^2\bigr) = 2g_xG_{xx} + 2g_yG_{xy} + 2g_zG_{xz} = \lambda g_x \]
where $G$ was introduced for the second derivative matrix (Hessian) of $f(x)$; the $y$ and $z$ components have similar equations. The required equation is obtained by identifying $\mu = \tfrac12\lambda$ and using vector notation:
\[ Gg = \mu g \]

6 Density Functional Theory

Answer 1
The wave function for a hydrogen atom in atomic units is
\[ \psi(r) = Ne^{-r} \]
where $N$ is some normalizing constant to be determined. The density and its integral over all space are then
\[ \rho(r) = N^2e^{-2r} \]
\[ \int\rho(r)\,dv = N^2\int_0^\infty e^{-2r}\,4\pi r^2\,dr = N^2\pi \]
Normalization gives $N^2\pi = 1$, i.e. $\rho(r) = e^{-2r}/\pi$. Numerically then, $\rho(0) \approx 0.318$ a.u. $\approx 2.148$ Å$^{-3}$. The mean distance parameter is
\[ r_s = \left(\frac{3}{4\pi\rho}\right)^{1/3} = \left(\frac34\right)^{1/3}e^{2r/3} \]
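
The numbers quoted in Answer 1 can be reproduced in a few lines. This is a minimal sketch; the bohr-to-ångström factor 0.529177 is an assumed conversion constant, and the midpoint-rule integration is only one of many ways to check the normalization.

```python
import math

BOHR_IN_ANGSTROM = 0.529177   # assumed conversion factor


def density(r):
    """Hydrogen 1s density in atomic units, rho(r) = exp(-2r)/pi."""
    return math.exp(-2.0 * r) / math.pi


def mean_distance(r):
    """Mean distance parameter r_s = (3 / (4 pi rho))**(1/3)."""
    return (3.0 / (4.0 * math.pi * density(r))) ** (1.0 / 3.0)


def integrated_density(r_max=30.0, steps=60000):
    """Midpoint-rule integral of rho over all space (radial shells)."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += density(r) * 4.0 * math.pi * r * r * h
    return total


rho0_au = density(0.0)
rho0_per_cubic_angstrom = rho0_au / BOHR_IN_ANGSTROM ** 3
```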
Answer 2
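By definition, the functional derivative is a (possibly generalized) function, call it $D = \delta J/\delta\rho$, such that, to first order in $\delta\rho$,
\[ J[\rho + \delta\rho] - J[\rho] = \int D(\mathbf r)\,\delta\rho(\mathbf r)\,dv \]
When $J$ is defined as an integral involving $\rho$, the functional derivative can be identified directly by differentiation. Any derivatives of $\delta\rho$ are eliminated by integration by parts:
\[ \delta J = \delta\int w\rho^4|\nabla\rho|^2\,dv = \int w\bigl(4\rho^3\,\delta\rho\,|\nabla\rho|^2 + 2\rho^4\,\nabla\rho\cdot\nabla\delta\rho\bigr)\,dv \]

In the second term, $\nabla\delta\rho$ must be eliminated:

\[
\begin{aligned}
\int\rho^4\,\nabla\rho\cdot\nabla\delta\rho\,dv &= -\int\bigl(\nabla\cdot(\rho^4\nabla\rho)\bigr)\,\delta\rho\,dv \\
&= -\int\bigl(4\rho^3\,\nabla\rho\cdot\nabla\rho + \rho^4\nabla^2\rho\bigr)\,\delta\rho\,dv
\end{aligned}
\]
In toto,
\[ \delta J = -w\int\bigl(4\rho^3|\nabla\rho|^2 + 2\rho^4\nabla^2\rho\bigr)\,\delta\rho\,dv \]
so the functional derivative is
\[ \frac{\delta J}{\delta\rho} = -2w\rho^3\bigl(2|\nabla\rho|^2 + \rho\nabla^2\rho\bigr) \]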

Answer 3

Equations (92) and (93) may perhaps be better understood if we note that $F$ is an ordinary function in 4 scalar variables: $\rho$, and the three components of its gradient, $\mathbf g = \nabla\rho$. The last term of (93) in component form is then
\[ -\frac{d}{dx}\frac{\partial F}{\partial g_x} - \frac{d}{dy}\frac{\partial F}{\partial g_y} - \frac{d}{dz}\frac{\partial F}{\partial g_z} \]

For any concrete, non-linear $F$ we know the expression for the partial derivatives of $F$, and we will then need to evaluate

\[ -\frac{d}{dx}\bigl(\text{some expression involving } \rho, \mathbf g\bigr) \]

which seems to require derivatives of the gradient of $\rho$, i.e. second derivatives of $\rho$ and therefore of basis functions.
However, we do not need $v_{xc}(\mathbf r)$ as such, but only matrix elements of the type

\[ (\eta_\alpha|\hat v_{xc}|\eta_\beta) = \int\eta_\alpha(\mathbf r)\,v_{xc}(\mathbf r)\,\eta_\beta(\mathbf r)\,dv \]
Integration by parts (now using vector analysis notation) of the troublesome part of the integral gives

Answer 4

$\rho$ has dimension $L^{-3}$ ($L$ = length). $\nabla$ has dimension $L^{-1}$. In toto, then, $\nabla\rho/\rho^{4/3}$ has dimension $L^{-1}L^{-3}/L^{-4} = L^0$, i.e. it is dimensionless.

Answer 6

Answer: Consider a closed-shell Hartree-Fock wave function for $\rho V$ electrons in a box of volume $V$ with periodic boundary conditions, with orbitals $\phi_{\mathbf k}(\mathbf r) = V^{-1/2}e^{i\mathbf k\cdot\mathbf r}$. The box is assumed so large that summation over the discrete set of allowed $\mathbf k$ values can be replaced by integration. Elementary considerations show that

\[ \sum_{\mathbf k}\dots \;\rightarrow\; \frac{V}{8\pi^3}\iiint\dots\,d^3k \]

The occupied orbitals are those with $|\mathbf k| < k_F$, where $k_F$ is to be determined. The density "matrix" in position representation is then

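(Remember to sum over the two possible spins!) where $\mathbf s = \mathbf r_1 - \mathbf r_2$ and $\mathbf k\cdot\mathbf s = ks\cos\theta$. For $s = 0$, we get immediately the ordinary density and thus the relation between $\rho$ and $k_F$:
\[ \rho = \rho(\mathbf r,\mathbf r) = \frac{1}{4\pi^3}\int_0^{k_F}4\pi k^2\,dk = \frac{k_F^3}{3\pi^2} \]
The density matrix is obtained by substituting $u = \cos\theta$ and $du = -\sin\theta\,d\theta$:

\[
\begin{aligned}
\rho(\mathbf r_1,\mathbf r_2) &= \frac{1}{2\pi^2}\int_0^{k_F}\frac{e^{iks} - e^{-iks}}{iks}\,k^2\,dk \\
&= \frac{1}{\pi^2}\,\frac1s\left[\frac{\sin k_Fs}{s^2} - \frac{k_F\cos k_Fs}{s}\right] \\
&= 3\rho\,\frac{\sin t - t\cos t}{t^3}, \qquad t = k_Fs
\end{aligned}
\]
The kinetic energy density can be computed the same way as the density, by just inserting the orbital kinetic energy $\tfrac12k^2$ in the integral:

\[ T/V = \frac{1}{4\pi^3}\int_0^{k_F}\bigl(\tfrac12k^2\bigr)4\pi k^2\,dk = \frac{k_F^5}{2\cdot5\pi^2} = \frac{3}{10}(3\pi^2)^{2/3}\rho^{5/3} \]
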
The Coulomb repulsion energy is cancelled by the uniform background charge and does
not enter. The exchange energy is

and is infinite of course. However, its average per unit volume is finite, and can be
regarded as an exchange potential:

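\[ -\frac14\iint\frac{\rho(\mathbf r_1,\mathbf r_2)^2}{r_{12}}\,dv_1\,dv_2 = \int\rho\,\epsilon_x\,dv \]

\[
\begin{aligned}
\epsilon_x &= -\frac14\int\frac{\rho(\mathbf r_1,\mathbf r_2)^2}{\rho\,r_{12}}\,dv_2 = -\frac14\int_0^\infty\rho\,\frac{9(\sin t - t\cos t)^2}{t^6}\,\frac1s\,4\pi s^2\,ds \\
&= -\frac94\,\frac{\rho}{k_F^2}\,4\pi\int_0^\infty\frac{(\sin t - t\cos t)^2}{t^5}\,dt = -\frac{3}{4\pi}(3\pi^2\rho)^{1/3}
\end{aligned}
\]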
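
The closed-form results for the uniform electron gas are conveniently collected as small functions. The sketch below assumes atomic units and uses the relations $k_F = (3\pi^2\rho)^{1/3}$, $T/V = k_F^5/(10\pi^2)$ and $\epsilon_x = -(3/4\pi)k_F$; the function names are illustrative.

```python
import math


def fermi_wavevector(rho):
    """k_F from rho = k_F**3 / (3 pi**2)."""
    return (3.0 * math.pi ** 2 * rho) ** (1.0 / 3.0)


def kinetic_energy_density(rho):
    """T/V = k_F**5 / (10 pi**2)."""
    return fermi_wavevector(rho) ** 5 / (10.0 * math.pi ** 2)


def exchange_energy_per_electron(rho):
    """eps_x = -(3 / (4 pi)) * (3 pi**2 rho)**(1/3)."""
    return -3.0 / (4.0 * math.pi) * fermi_wavevector(rho)


def density_matrix(rho, s):
    """rho(r1, r2) = 3 rho (sin t - t cos t) / t**3 with t = k_F * s, s > 0."""
    t = fermi_wavevector(rho) * s
    return 3.0 * rho * (math.sin(t) - t * math.cos(t)) / t ** 3


rho = 0.5                                   # arbitrary illustrative density
thomas_fermi = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0) * rho ** (5.0 / 3.0)
```

For small separations `density_matrix(rho, s)` approaches `rho`, which is a quick sanity check on the prefactor.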

7 Coupled Cluster Theory

Answer 1
(a) The exact correlation energy is $N\epsilon_p$.

(b) The correlation energy for $\Psi_p$ is

\[ \epsilon_p = \frac{2\langle\Psi_0|H|\chi_p\rangle + \langle\chi_p|H|\chi_p\rangle}{1 + S} \]

Note that since $\Psi_p$ is the exact solution for our two-electron system, we also have

Consider now the case of $N$ non-interacting two-electron systems and the truncated wave function $\Psi(N) = \Psi_0 + \sum_p\chi_p$. We have

\[ \epsilon(\text{trunc}) = N\,\frac{2\langle\Psi_0|H|\chi_p\rangle + \langle\chi_p|H|\chi_p\rangle}{1 + NS} \]

From the two relations for the correlation energy above we can determine that

\[ \langle\chi_p|H|\chi_p\rangle = S\epsilon_p - \epsilon_p \]

We can therefore simplify the case of $N$ systems to

\[ \epsilon(\text{trunc}) = N\,\frac{(1 + S)\epsilon_p}{1 + NS} \]
As $N \rightarrow \infty$ we have
\[ \frac{N(1 + S)\epsilon_p}{1 + NS} \rightarrow \frac{1 + S}{S}\,\epsilon_p \]
Thus the correlation energy with this truncated wave function expansion goes to a constant value as $N$ increases, as tabulated below.
=
(c) The wave function $\Psi(N) = \Psi_0 + \sum_p\chi_p$ is not variationally optimum for this problem in general. We note that the only freedom needed is to scale the individual $\chi_p$ by a factor, since the correlation part of the wave function must continue to solve the individual two-electron problems. The trial wave function is therefore $\Psi(N) = \Psi_0 + k\sum_p\chi_p$, where $k$ is to be determined by the variation principle. The correlation energy functional is

which we can simplify to

\[ \epsilon(\text{opt}) = \frac{N(2k - k^2 + k^2S)\epsilon_p}{1 + Nk^2S} \]
We can proceed two ways here. The first is simply a qualitative argument: since it is the factor of $N$ in the normalization denominator that causes the scaling difficulties, the best use of $k$ is to cancel this factor. Thus we expect $k^2 \sim N^{-1}$, or $k \sim N^{-1/2}$. The leading term in $\epsilon(\text{opt})$ then behaves as $N^{1/2}\epsilon_p$ overall. Thus the optimized energy with the truncated wave function behaves as $N^{1/2}$. This is the case for the CISD wave function, for instance. A more complete mathematical solution requires determining the optimum value of $k$. Differentiation of the energy expression and equating the result to zero gives a quadratic equation for $k$, namely $NSk^2 - (S - 1)k - 1 = 0$. Inspection shows that the appropriate root is

\[ k_{\text{opt}} = \frac{S - 1 + \bigl[(S - 1)^2 + 4NS\bigr]^{1/2}}{2NS} \]

which reduces to $k_{\text{opt}} = 1$ for $N = 1$, as it must.

This clearly displays an $N^{-1/2}$ dependence for large $N$. Substitution of $k_{\text{opt}}$ into the correlation energy gives the results listed below for the approaches we have considered. (We have used $S = 0.01814$ and $\epsilon_p = -0.0410\,E_h$, which are appropriate to H$_2$.) The limiting behaviour of both the truncated and the optimized truncated expansions is very clear in these results.

        N      ε(exact)    ε(trunc)      ε(opt)
        1       -0.0410     -0.0410     -0.0410
        2       -0.0820     -0.0806     -0.0806
        5       -0.2050     -0.1914     -0.1922
       10       -0.4100     -0.3534     -0.3595
       20       -0.8200     -0.6134     -0.6469
       50       -2.0500     -1.0956     -1.3129
      100       -4.1000     -1.4855     -2.1320
      200       -8.2000     -1.8070     -3.3389
      500      -20.5000     -2.0767     -5.7925
     1000      -41.0000     -2.1855     -8.5889
    10000     -410.0000     -2.2935    -29.3832
   100000    -4100.0000     -2.3049    -95.2649
  1000000   -41000.0000     -2.3061   -303.6405
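
The three columns can be regenerated from the closed-form expressions of parts (a) to (c). The sketch below uses the rounded parameters quoted in the text and the stationarity root $k_{\text{opt}} = \bigl(S - 1 + [(S-1)^2 + 4NS]^{1/2}\bigr)/(2NS)$; because $S$ and $\epsilon_p$ are rounded, the output should be expected to agree with the printed table only to two or three significant figures.

```python
import math

S = 0.01814        # overlap-type parameter quoted in the text
EPS_P = -0.0410    # pair correlation energy in hartree, as quoted


def eps_exact(n):
    return n * EPS_P


def eps_trunc(n):
    """Truncated expansion Psi0 + sum_p chi_p."""
    return n * (1.0 + S) * EPS_P / (1.0 + n * S)


def k_opt(n):
    """Root of N*S*k**2 - (S - 1)*k - 1 = 0 that equals 1 for N = 1."""
    return (S - 1.0 + math.sqrt((S - 1.0) ** 2 + 4.0 * n * S)) / (2.0 * n * S)


def eps_opt(n):
    """Variationally scaled expansion Psi0 + k * sum_p chi_p."""
    k = k_opt(n)
    return n * (2.0 * k - k * k + k * k * S) * EPS_P / (1.0 + n * k * k * S)


rows = [(n, eps_exact(n), eps_trunc(n), eps_opt(n))
        for n in (1, 2, 5, 10, 20, 50, 100, 1000, 10 ** 6)]
```

Printing `rows` shows the same qualitative picture as the table: the truncated energy saturates near $(1+S)\epsilon_p/S$, while the optimized one keeps growing, but only like $N^{1/2}$.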

Answer 2

(a) A term involving $T_1^5$ cannot occur in the (closed-shell or spin-orbital) CC equations at any level of excitation. Recall that the CC equations result from projecting $\exp(-T)W\exp(T)|\Psi_0\rangle$ onto a fixed basis of excited states. The Hausdorff expansion of $\exp(-T)W\exp(T)$ terminates exactly after five terms, which involve at most four commutators. Hence the CC equations can contain at most quartic terms.
(b) Consider again the Hausdorff expansion of $\exp(-T)W\exp(T)$. The only way a term linear in $T_1$ can arise is from the single commutator $[W, T_1]$, as higher commutators will give non-linear dependency on $T$. The single commutator involves (from the two-electron part of the Hamiltonian)

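You should explicitly verify, by expanding the commutator, that

\[
\begin{aligned}
[X_P^\dagger X_Q^\dagger X_SX_R,\; X_A^\dagger X_I] ={}& \delta_{AS}\delta_{IQ}X_P^\dagger X_R - \delta_{AR}\delta_{IQ}X_P^\dagger X_S \\
&+ \delta_{AR}\delta_{IP}X_Q^\dagger X_S - \delta_{AS}\delta_{IP}X_Q^\dagger X_R \\
&+ \delta_{AS}X_IX_P^\dagger X_Q^\dagger X_R - \delta_{AR}X_IX_P^\dagger X_Q^\dagger X_S \\
&+ \delta_{IQ}X_A^\dagger X_P^\dagger X_SX_R - \delta_{IP}X_A^\dagger X_Q^\dagger X_SX_R
\end{aligned}
\]

Obviously, the RHS can generate at most double excitations from $\Psi_0$. Hence a matrix element like $\langle\Psi_{IJK}^{ABC}|[W, T_1]|\Psi_0\rangle$ must be zero. We can say that there is no connected contribution to the $T_3$ equation from $WT_1$.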

(c) It is clear that the reader is mentally using the Slater-Condon rules for matrix elements to infer that the triples equation, having triple excitations in the bra, must have singles through quintuples (five-fold excitations) in the ket. This is true for CI, since the operator whose matrix elements we need is simply $W$ (or $H$, if you like). But this is not the case for the CC equations! The "matrix elements" here are over the operator $\exp(-T)W\exp(T)$, and are between triples, in this case, and $\Psi_0$. We must explicitly look at the Hausdorff expansion to see what will contribute. For instance, there is no connected contribution to the triples equation from $T_1^4$, although the naive expectation might be that such a "quadruple excitation" term would appear.
(d) The linear terms appearing in the CCSDTQ equations are as follows:
$T_1$ equation: $WT_1$, $WT_2$, $WT_3$
$T_2$ equation: $WT_1$, $WT_2$, $WT_3$, $WT_4$
$T_3$ equation: $WT_2$, $WT_3$, $WT_4$
$T_4$ equation: $WT_3$, $WT_4$

We may conclude from this that in the most general CC expansion, we will have linear terms $WT_n$, $WT_{n+1}$, $WT_{n+2}$, and $WT_{n-1}$. We do not obtain a term $WT_{n-2}$. On the other hand, CI equations at the same level would include the latter term. So the linearized CC equations are actually simpler, in terms of the matrix elements required, than the CI equations.

Answer 3
Suppose we wish to use the process of solving the CC equations to estimate Møller-Plesset perturbation theory energies.
(a) The perturbation energy of order $n+1$ can be obtained from the CC correlation energy formula as

\[ \epsilon_{n+1} = \langle\Psi_0|W\Bigl(T_1(n) + T_2(n) + \tfrac12\sum_{m=1}^{n-1}T_1(m)T_1(n-m)\Bigr)|\Psi_0\rangle \]

where, for example, $T_2(n)$ is the $n$-th order perturbation theory term of the doubles amplitudes. Let us concentrate first on the perturbed wave functions. Order by order, we have first
\[ D_2T_2(1) = W, \]
giving us the first-order amplitudes as

\[ t_{IJ}^{AB}(1) = \frac{\langle\Psi_{IJ}^{AB}|W|\Psi_0\rangle}{\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B}, \]

for example. In second order

\[
\begin{aligned}
D_1T_1(2) &= WT_2(1) \\
D_2T_2(2) &= WT_2(1) \\
D_3T_3(2) &= WT_2(1)
\end{aligned}
\]

In third order

\[
\begin{aligned}
D_1T_1(3) &= WT_1(2) + WT_2(2) + WT_3(2) \\
D_2T_2(3) &= WT_1(2) + WT_2(2) + WT_3(2) + \tfrac12WT_2(1)^2 \\
D_3T_3(3) &= WT_2(2) + WT_3(2) + \tfrac12WT_2(1)^2 \\
D_4T_4(3) &= WT_3(2) + \tfrac12WT_2(1)^2
\end{aligned}
\]

Finally, in fourth order

\[
\begin{aligned}
D_1T_1(4) ={}& WT_1(3) + WT_2(3) + WT_3(3) + WT_1(2)T_2(1) \\
D_2T_2(4) ={}& WT_1(3) + WT_2(3) + WT_3(3) + WT_4(3) + WT_1(2)T_2(1) + WT_2(1)T_2(2) \\
D_3T_3(4) ={}& WT_2(3) + WT_3(3) + WT_4(3) + WT_1(2)T_2(1) + WT_2(1)T_2(2) + WT_3(2)T_2(1) \\
D_4T_4(4) ={}& WT_3(3) + WT_4(3) + WT_2(1)T_2(2) + WT_3(2)T_2(1) + \tfrac{1}{3!}WT_2(1)^3
\end{aligned}
\]

Hence we obtain the second-order energy as

Substituting the expanded form of the $t_{IJ}^{AB}(1)$ amplitudes given above gives the usual MP2 energy formula. Similarly, we have

By recursively substituting the perturbation theory amplitudes, order by order, these expressions can ultimately be decomposed to products of integrals over products of orbital energy differences in the usual way.
(b) From the form of the perturbed wave function equations, we can see that $\epsilon_2$ and $\epsilon_3$ are determined by (connected) double excitations alone. $\epsilon_4$ includes contributions from singles, connected doubles, connected triples, and disconnected quadruples. In addition to the terms that contribute in fourth order, $\epsilon_5$ includes contributions from disconnected triples and connected quadruples (which contribute to $T_2(4)$), and also from disconnected doubles, since the term $\tfrac12T_1^2$ in the CC energy expression can contribute for the first time in fifth order.

(c) These expressions are not the most efficient, since they require the perturbed wave function of order $n$ to determine the energy at order $n + 1$. Wigner's formulae

\[ \epsilon_{2n} = \langle n-1|W|n\rangle, \qquad \epsilon_{2n+1} = \langle n|W|n\rangle \]
provide a much more efficient strategy. For example, the fourth-order energy would be $\langle1|W|2\rangle$. Consideration of the above results then shows that $\epsilon_4$ comprises contributions from matrix elements between double excitations and singles, connected doubles, connected triples, and disconnected quadruples. Things are less clear for the fifth-order energy obtained this way. The second-order wave function does not include connected quadruples, so it is not obvious how these terms arise in the fifth-order energy. The connected quadruples, in lowest order (third), are given in terms of connected triples and disconnected quadruples. Some of the terms in the fifth-order expression $\langle2|W|2\rangle$ have exactly the form of the equations that define the connected quadruples, as discussed in Chapter 4.

8 Truncated CI ...

Answer 1

Minimal basis SDCI on the hydrogen molecule requires two configurations

\[ \Phi_0 = |\sigma_g^2\rangle, \qquad \Phi_1 = |\sigma_u^2\rangle. \]

The $(\sigma_g\sigma_u)$ configuration has the wrong symmetry to be included. We get a two-by-two matrix

where

and
\[ H_{12} = (gu|gu) \]
(From now on, however, no explicit expressions in terms of integrals will be needed.)
Diagonalize:

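\[ \begin{vmatrix} \epsilon_0 - \epsilon & H_{12} \\ H_{12} & \epsilon_1 - \epsilon \end{vmatrix} = 0 \]

\[ \epsilon = \frac{\epsilon_0 + \epsilon_1}{2} - \sqrt{\left(\frac{\epsilon_0 + \epsilon_1}{2}\right)^2 - \epsilon_0\epsilon_1 + H_{12}^2} \]

Let $\epsilon = \epsilon_0 + \epsilon_c$:
\[ \epsilon_c = -\frac{\epsilon_0 - \epsilon_1}{2} - \sqrt{\left(\frac{\epsilon_0 - \epsilon_1}{2}\right)^2 + H_{12}^2} \le 0 \qquad\text{(correlation energy)} \]
Use
\[ \Delta = \frac{\epsilon_1 - \epsilon_0}{2}, \qquad x = \frac{H_{12}}{\Delta} = \frac{2H_{12}}{\epsilon_1 - \epsilon_0} \]

\[ \epsilon_c = \Delta\bigl(1 - \sqrt{1 + x^2}\bigr) \]

Also: let the eigenfunction be $\Phi = d_0\Phi_0 + d_1\Phi_1$, where then the coefficients are obtained as an eigenvector of the Hamiltonian matrix.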

where $N$ is some arbitrary normalization factor.

Answer 2
Use two isolated hydrogen molecules. Antisymmetrization can be disregarded. Basis
functions of the CI are

so the CI wave function is

On the other hand, we know that the eigenfunction is just the product

\[ \Psi = \Phi_A\Phi_B \quad\bigl(\text{i.e. } \Psi(r_1,r_2,r_3,r_4) = \Phi_A(r_1,r_2)\,\Phi_B(r_3,r_4)\bigr) \]

where

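and similar for $\Phi_B$. So

\[
\begin{aligned}
\Psi = \Phi_A\Phi_B &= \bigl(d_0\Phi_0^A + d_1\Phi_1^A\bigr)\bigl(d_0\Phi_0^B + d_1\Phi_1^B\bigr) \\
&= d_0^2\,\Phi_0^A\Phi_0^B + d_0d_1\,\Phi_0^A\Phi_1^B + d_1d_0\,\Phi_1^A\Phi_0^B + d_1^2\,\Phi_1^A\Phi_1^B
\end{aligned}
\]
and we can identify

\[ c_0 = d_0^2; \qquad c_1 = c_2 = d_0d_1; \qquad c_3 = d_1^2 \]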
The matrix elements of iI = iIA + iIB are for example
('ltoliIllIIl) = (<p~<pgliIA + iIBI¢Jg<p~)
= (¢J~lilAltP~)(tP:ltP~) + (¢JgltP~)(¢J:lilBltP~)
= H12 x 0 + 1 X HI2 = HI2
In full
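\[ \mathbf H = \begin{pmatrix} 2\epsilon_0 & H_{12} & H_{12} & 0 \\ H_{12} & \epsilon_0 + \epsilon_1 & 0 & H_{12} \\ H_{12} & 0 & \epsilon_0 + \epsilon_1 & H_{12} \\ 0 & H_{12} & H_{12} & 2\epsilon_1 \end{pmatrix} \]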
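Now use the already-solved eigenproblem for the 2-electron case,

\[ \begin{pmatrix} \epsilon_0 & H_{12} \\ H_{12} & \epsilon_1 \end{pmatrix}\begin{pmatrix} d_0 \\ d_1 \end{pmatrix} = \epsilon\begin{pmatrix} d_0 \\ d_1 \end{pmatrix} \]

i.e.,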

Also use $c_0 = d_0^2$, etc. The first row of $\mathbf{Hc}$ will be

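the 2nd row is

\[
\begin{aligned}
H_{12}d_0^2 + (\epsilon_0 + \epsilon_1)d_0d_1 + H_{12}d_1^2 &= (H_{12}d_0 + \epsilon_1d_1)d_0 + (\epsilon_0d_0 + H_{12}d_1)d_1 \\
&= \epsilon d_1d_0 + \epsilon d_0d_1 = 2\epsilon d_0d_1
\end{aligned}
\]
and so on. In toto,

\[ \mathbf{Hc} = 2\epsilon\,\mathbf c \]

which means that full CI (FCI) is size consistent.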


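The above can be just as easily expressed using the correlation energy directly. Just subtract $2\epsilon_0$ from each diagonal element:

\[ \begin{pmatrix} 0 & H_{12} & H_{12} & 0 \\ H_{12} & \epsilon_1 - \epsilon_0 & 0 & H_{12} \\ H_{12} & 0 & \epsilon_1 - \epsilon_0 & H_{12} \\ 0 & H_{12} & H_{12} & 2(\epsilon_1 - \epsilon_0) \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix} = \epsilon_c\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix} \]
(where $\epsilon_c$ is the correlation energy of this specific problem. It is twice as large as the one calculated in exercise 1).
We now introduce a more efficient notation: Let
\[ \Psi = c_0\Psi_0 + c_D\Psi_D + c_Q\Psi_Q \]
\[ \Psi_D = \sqrt{\tfrac12}\,\bigl(\Phi_1^A\Phi_0^B + \Phi_0^A\Phi_1^B\bigr) \]
\[ \Psi_Q = \Phi_1^A\Phi_1^B \]

so in this new basis, we get

\[ \begin{pmatrix} 0 & \sqrt2H_{12} & 0 \\ \sqrt2H_{12} & \epsilon_1 - \epsilon_0 & \sqrt2H_{12} \\ 0 & \sqrt2H_{12} & 2(\epsilon_1 - \epsilon_0) \end{pmatrix}\begin{pmatrix} c_0 \\ c_D \\ c_Q \end{pmatrix} = \epsilon_c\begin{pmatrix} c_0 \\ c_D \\ c_Q \end{pmatrix} \]
of course with the same solution as before:
\[ c_Q = d_1^2 = \tfrac12c_D^2/c_0 \]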

Answer 3
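Now we do an SDCI instead, by removing the quadruple excitation. The new eigenproblem is

\[ \begin{pmatrix} 0 & \sqrt2H_{12} \\ \sqrt2H_{12} & \epsilon_1 - \epsilon_0 \end{pmatrix}\begin{pmatrix} c_0 \\ c_D \end{pmatrix} = \epsilon_c^{SDCI}(2\mathrm H_2)\begin{pmatrix} c_0 \\ c_D \end{pmatrix} \]
Such an equation was solved already in problem 1. If we keep the same value of $x$, we now get
\[ \epsilon_c^{SDCI}(2\mathrm H_2) = \Delta\bigl(1 - \sqrt{1 + 2x^2}\bigr) \]
To the lowest non-vanishing order, we get

\[ \epsilon_c^{SDCI}(2\mathrm H_2) = -\Delta x^2 + \dots, \qquad \epsilon_c(\mathrm H_2) = -\tfrac12\Delta x^2 + \dots \]
but as we see,
\[ \epsilon_c^{SDCI}(2\mathrm H_2) \neq 2\times\epsilon_c(\mathrm H_2) \]
so SDCI is not size consistent.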

Answer 4
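Similar to exercise 2, let us put
\[
\begin{aligned}
\Psi &= \Phi_A\Phi_B\Phi_C\cdots \\
&= \bigl(d_0\Phi_0^A + d_1\Phi_1^A\bigr)\bigl(d_0\Phi_0^B + d_1\Phi_1^B\bigr)\cdots \\
&= d_0^N\,\Phi_0^A\Phi_0^B\Phi_0^C\cdots + d_0^{N-1}d_1\bigl(\Phi_1^A\Phi_0^B\Phi_0^C\cdots + \Phi_0^A\Phi_1^B\Phi_0^C\cdots + \dots\bigr) + \dots \\
&= c_0\Psi_0 + c_D\Psi_D + c_Q\Psi_Q + \dots
\end{aligned}
\]
where

\[ \Psi_D = \sqrt{\tfrac1N}\,\bigl(\Phi_1^A\Phi_0^B\Phi_0^C\cdots + \Phi_0^A\Phi_1^B\Phi_0^C\cdots + \Phi_0^A\Phi_0^B\Phi_1^C\cdots + \dots\bigr) \]
\[ \Psi_Q = \sqrt{\tfrac{2}{N(N-1)}}\,\bigl(\Phi_1^A\Phi_1^B\Phi_0^C\cdots + \Phi_1^A\Phi_0^B\Phi_1^C\cdots + \Phi_0^A\Phi_1^B\Phi_1^C\cdots + \dots\bigr) \]
etc.,
and
\[ c_0 = d_0^N, \qquad c_D = \sqrt N\,d_0^{N-1}d_1, \qquad c_Q = \sqrt{\tfrac{N(N-1)}{2}}\,d_0^{N-2}d_1^2 \]
etc.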
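We get the matrix elements of the (shifted) Hamiltonian
\[
\begin{aligned}
\langle\Psi_0|H - \epsilon_0|\Psi_0\rangle &= 0 \\
\langle\Psi_D|H - \epsilon_0|\Psi_0\rangle &= \sqrt{\tfrac1N}\,(H_{12} + H_{12} + \dots) = \sqrt N\,H_{12} \\
\langle\Psi_D|H - \epsilon_0|\Psi_D\rangle &= \tfrac1N\bigl((\epsilon_1 - \epsilon_0) + (\epsilon_1 - \epsilon_0) + \dots\bigr) = \epsilon_1 - \epsilon_0 \\
\langle\Psi_Q|H - \epsilon_0|\Psi_D\rangle &= \dots = \sqrt{2(N-1)}\,H_{12} \\
\langle\Psi_Q|H - \epsilon_0|\Psi_Q\rangle &= \dots = 2(\epsilon_1 - \epsilon_0)
\end{aligned}
\]
etc.,
and a tridiagonal FCI matrix:
\[ \begin{pmatrix} 0 & \sqrt N\,H_{12} & 0 & \cdots \\ \sqrt N\,H_{12} & \epsilon_1 - \epsilon_0 & \sqrt{2(N-1)}\,H_{12} & \cdots \\ 0 & \sqrt{2(N-1)}\,H_{12} & 2(\epsilon_1 - \epsilon_0) & \cdots \\ 0 & 0 & \sqrt{3(N-2)}\,H_{12} & \cdots \\ \vdots & & & \ddots \end{pmatrix} \]
Now, the SDCI approximation means we truncate to the upper-left two-by-two submatrix and are left with the problem:

\[ \begin{pmatrix} 0 & \sqrt N\,H_{12} \\ \sqrt N\,H_{12} & \epsilon_1 - \epsilon_0 \end{pmatrix}\begin{pmatrix} c_0 \\ c_D \end{pmatrix} = \epsilon_c\begin{pmatrix} c_0 \\ c_D \end{pmatrix} \]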

Again, we can use the solution of exercise 1 to obtain

\[ \epsilon_c^{SDCI}(N\mathrm H_2) = \Delta\bigl(1 - \sqrt{1 + Nx^2}\bigr) \]
Also, since the problem becomes asymptotically similar to

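when $N \rightarrow \infty$, we can conclude that

\[ \begin{pmatrix} c_0 \\ c_D \end{pmatrix} \rightarrow \begin{pmatrix} \sqrt{1/2} \\ -\sqrt{1/2} \end{pmatrix}, \qquad\text{and}\qquad \epsilon_c = -\sqrt N\,H_{12} \]
when $N \rightarrow \infty$.

Example: If $x = 0.1$, then

            N = 1     N = 2     N = 100    N → ∞
  c_0       0.999     0.998     0.924      √(1/2)
  c_D      -0.050    -0.070    -0.383     -√(1/2)
  ε_c/Δ    -0.005    -0.010    -0.414      ≈ -√N × 0.1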
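
The entries of this table follow from the two-by-two SDCI eigenproblem in closed form. The sketch below works in units of $\Delta$, so that $H_{12} = x$ and $\epsilon_1 - \epsilon_0 = 2$; the function names are illustrative.

```python
import math


def sdci_solution(n, x):
    """Return (c0, cD, eps_c / Delta) for N non-interacting minimal-basis H2.

    The lowest eigenvalue of [[0, sqrt(N) x], [sqrt(N) x, 2]] (units of Delta)
    is 1 - sqrt(1 + N x**2); the first row then fixes cD / c0.
    """
    eps = 1.0 - math.sqrt(1.0 + n * x * x)
    ratio = eps / (math.sqrt(n) * x)          # cD / c0
    c0 = 1.0 / math.sqrt(1.0 + ratio ** 2)    # normalization c0**2 + cD**2 = 1
    return c0, ratio * c0, eps


def exact_correlation(n, x):
    """Size-consistent reference: N times the single-molecule value."""
    return n * (1.0 - math.sqrt(1.0 + x * x))


table = {n: sdci_solution(n, 0.1) for n in (1, 2, 100)}
recovered_fraction = {n: table[n][2] / exact_correlation(n, 0.1) for n in table}
```

For $N = 100$ the SDCI energy recovers only a little over 80 % of the exact correlation energy of the model, which makes the size-consistency error concrete.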

Answer 5

In our case, the energy functional takes the form

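a) if this is the SDCI case, then we know that

\[ \langle\Psi_0|H - \epsilon_0|c_0\Psi_0 + c_D\Psi_D\rangle = \epsilon_c^{SDCI}\langle\Psi_0|c_0\Psi_0 + c_D\Psi_D\rangle \]
\[ \langle\Psi_D|H - \epsilon_0|c_0\Psi_0 + c_D\Psi_D\rangle = \epsilon_c^{SDCI}\langle\Psi_D|c_0\Psi_0 + c_D\Psi_D\rangle \]

Multiply with $c_0$ and with $c_D$, and add:

\[ \langle c_0\Psi_0 + c_D\Psi_D|H - \epsilon_0|c_0\Psi_0 + c_D\Psi_D\rangle = \epsilon_c^{SDCI}\langle c_0\Psi_0 + c_D\Psi_D|c_0\Psi_0 + c_D\Psi_D\rangle = \epsilon_c^{SDCI}\bigl(c_0^2 + c_D^2\bigr) \]

\[ \Rightarrow\quad \epsilon_c(\text{functional}) = \epsilon_c^{SDCI}\,\frac{c_0^2 + c_D^2}{c_0^2 + gc_D^2} = \epsilon_c^{SDCI} \quad\text{iff } g = 1. \]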

b) in general we get

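where $t = c_D/c_0$. In particular, for our $N\times$ 2-electron case, we get

\[ \epsilon_c = 2\Delta\,\frac{t(\sqrt N\,x + t)}{1 + gt^2} \]
If $g = 0$, and if we want the minimum, we get

\[ t = -\tfrac12\sqrt N\,x, \qquad \epsilon_c = -\tfrac12N\Delta x^2 \]

which is linear in $N$, so we conclude that CEPA-0 (Coupled Electron Pair Approximation) is size consistent.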


c)

(The approximation lies in the use of the SDCI wave function in the CEPA-O functional.)
d) ACPF (Average Coupled Pair Functional) means that $g = 1/N$. We want to minimize

\[ \epsilon_c^{ACPF} = 2\Delta\,\frac{t(\sqrt N\,x + t)}{1 + t^2/N} \]
Substitute $t = \sqrt N\,u$, to "normalize" the denominator:
\[ \epsilon_c^{ACPF} = 2\Delta\,\frac{\sqrt N\,u(\sqrt N\,x + \sqrt N\,u)}{1 + u^2} = N\,2\Delta\,\frac{u(x + u)}{1 + u^2} = N\,\epsilon_c^{ACPF}(\mathrm H_2) \]

so ACPF is also size consistent.
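
The three choices of $g$ can be compared side by side. The sketch below minimizes $\epsilon_c(t) = 2\Delta\,t(\sqrt N x + t)/(1 + gt^2)$ analytically: setting the derivative to zero gives $\sqrt N x + 2t - g\sqrt N x\,t^2 = 0$, whose negative root is used. This stationarity condition is derived here for illustration and is not quoted from the notes.

```python
import math


def stationary_energy(n, x, g, delta=1.0):
    """Minimum of 2*delta*t*(a + t)/(1 + g*t**2) with a = sqrt(n)*x."""
    a = math.sqrt(n) * x
    if g == 0:
        t = -a / 2.0                                   # CEPA-0: plain quadratic
    else:
        t = (1.0 - math.sqrt(1.0 + g * a * a)) / (g * a)
    return 2.0 * delta * t * (a + t) / (1.0 + g * t * t)


x, n = 0.1, 50
per_molecule = {
    "SDCI": stationary_energy(n, x, 1.0) / n,          # g = 1
    "CEPA-0": stationary_energy(n, x, 0.0) / n,        # g = 0
    "ACPF": stationary_energy(n, x, 1.0 / n) / n,      # g = 1/N
}
single = {
    "SDCI": stationary_energy(1, x, 1.0),
    "CEPA-0": stationary_energy(1, x, 0.0),
    "ACPF": stationary_energy(1, x, 1.0),
}
```

For CEPA-0 and ACPF the energy per molecule is independent of `n`; for SDCI it shrinks in magnitude as `n` grows.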



e) A system of non-interacting hydrogen molecules and helium atoms. Note that the
number of molecules and atoms is irrelevant. We consider two subsystems, denoted prime
and bis.

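\[ \Psi = \Psi'\Psi'' \ \text{exact}, \qquad H = H' + H'', \quad\text{or}\quad H - \epsilon_0 = (H' - \epsilon_0') + (H'' - \epsilon_0'') \]

$\Psi'$ is the H$_2$ subsystem,
$\Psi''$ is the He subsystem.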

Now approximate:

First, we write the numerator of the correlation energy functional:

(WIH - eolW) 2C{)cv(WvIH' - e~IW~} + 2eocv(WvIH" - e~IW~}


+ cZ(WvlH' - e~IWD} + c'g(WvIH" - e~IWv}
(W'IH' - e~IW') + (W"IH" - e~W')
and then the denominator:

(W'IH' - e~W) + (W"IH" - e~IW")


... ec = ~ + 9 (cZ + dB)
(w'IH' - e~IW') (WI/Iff" - e~IW") , "
¥ ~ + gcZ + ~ + gdfJ = ec + eo

in general, except of course if $g = 0$. We conclude that ACPF is not size consistent for this mixed system, but CEPA-0 is.

9 Accurate Calculations . ..

Answer 1

RHF is size-extensive, but not generally size-consistent. UHF is both size-extensive and
size-consistent. Coupled-cluster and MP perturbation theory are size-extensive, as is full
CI, but truncated CI methods are not. Size-consistency depends on the reference function:
In general, RHF-based coupled-cluster and perturbation theory treatments will not be
size-consistent. MCSCF methods are often constructed to be size-consistent, although this
depends on the configuration space chosen. Such methods are size-extensive in principle,
although this may be hard to achieve in practice.

Answer 2
Consider the normalization for the radial functions $r^n\exp(-\alpha r^2)$ and $r^{n+2}\exp(-\beta r^2)$. From the given integral formula, the normalization factors are found to be

\[ (2\alpha)^{\frac{2n+3}{4}}\left(\frac{2^{n+2}}{(2n+1)!!\,\sqrt\pi}\right)^{1/2} \]
and
\[ (2\beta)^{\frac{2n+7}{4}}\left(\frac{2^{n+4}}{(2n+5)!!\,\sqrt\pi}\right)^{1/2} \]
respectively. The unnormalized overlap is
\[ \frac{(2n+3)!!}{2^{n+3}(\alpha + \beta)^{n+2}}\left(\frac{\pi}{\alpha + \beta}\right)^{1/2} \]
Multiplying all three together, setting $\beta = k\alpha$, differentiating with respect to $k$ and equating the derivative to zero we obtain
\[ k = \frac{2n+7}{2n+3} \]
(Note that since we are only interested in the exponent dependence of the normalized overlap, we can ignore all the other factors in finding the maximum.) Hence the overlap is a maximum when $d$ and $s$ exponents have the ratio 7/3, or for the $f/p$ case, when the exponent ratio is 9/5.
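
The optimum ratio is easy to confirm numerically. With $\beta = k\alpha$, the $k$-dependent part of the normalized overlap is $k^{(2n+7)/4}/(1+k)^{(2n+5)/2}$ (all $\alpha$ dependence cancels); the sketch below locates its maximum by a simple ternary search. The search interval and iteration count are arbitrary illustrative choices.

```python
def overlap_shape(k, n):
    """k-dependent factor of the normalized overlap for beta = k * alpha."""
    return k ** ((2 * n + 7) / 4.0) / (1.0 + k) ** ((2 * n + 5) / 2.0)


def best_ratio(n, lo=0.1, hi=20.0, iterations=200):
    """Ternary search for the maximum of the (unimodal) overlap factor."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if overlap_shape(m1, n) < overlap_shape(m2, n):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


ratio_d_over_s = best_ratio(0)   # n = 0: s function with a d-type r**2 partner
ratio_f_over_p = best_ratio(1)   # n = 1: p function with an f-type partner
```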

Answer 3

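Calculation of properties.
(a) The expressions are

\[ \alpha = 2\,\frac{\Delta E_2F_1^4 - \Delta E_1F_2^4}{F_1^2F_2^4 - F_1^4F_2^2} \]
\[ \gamma = -24\,\frac{\Delta E_2F_1^2 - \Delta E_1F_2^2}{F_1^2F_2^4 - F_1^4F_2^2} \]
where $\Delta E_1$ is the energy change obtained by applying a field strength $F_1$, etc.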
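
A quick way to validate two-point finite-field formulae is to feed them energies from a model with known coefficients. The sketch below assumes the common convention $\Delta E(F) = -\tfrac12\alpha F^2 - \tfrac1{24}\gamma F^4$ (no permanent dipole along the field, higher orders neglected); the model values of $\alpha$, $\gamma$ and the field strengths are purely illustrative.

```python
def finite_field_alpha_gamma(dE1, F1, dE2, F2):
    """Solve dE_i = -alpha*F_i**2/2 - gamma*F_i**4/24 for alpha and gamma."""
    denom = F1 ** 2 * F2 ** 4 - F1 ** 4 * F2 ** 2
    alpha = 2.0 * (dE2 * F1 ** 4 - dE1 * F2 ** 4) / denom
    gamma = -24.0 * (dE2 * F1 ** 2 - dE1 * F2 ** 2) / denom
    return alpha, gamma


def model_energy_change(F, alpha, gamma):
    """Hypothetical model energy change in the assumed convention."""
    return -0.5 * alpha * F ** 2 - gamma * F ** 4 / 24.0


ALPHA_MODEL, GAMMA_MODEL = 11.7, 900.0     # arbitrary illustrative values
F1, F2 = 0.005, 0.010                      # field strengths in atomic units
alpha, gamma = finite_field_alpha_gamma(
    model_energy_change(F1, ALPHA_MODEL, GAMMA_MODEL), F1,
    model_energy_change(F2, ALPHA_MODEL, GAMMA_MODEL), F2,
)
```

With real computed energies, the recovered $\gamma$ is far more sensitive to numerical noise than $\alpha$, because it is obtained from the small difference left after the quadratic terms cancel.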
(b) Dynamical correlation has a substantial effect on polarizabilities, so an accurate treatment of dynamical correlation is required. The best approach for the lower vibrational levels would probably be to use the CCSD(T) method, although if higher vibrational levels were required a multireference treatment would probably be needed, at a considerable increase in expense.
Selection of the basis set would probably be best handled by starting with a good spdf set (perhaps TZ2Pf, or a [4s3p2d1f] ANO or correlation-consistent basis), and then augmenting it with diffuse functions (up to at least d type), monitoring the change in polarizability components and polarizability gradient at $r_e$. The latter can be estimated fairly crudely here by a three-point fit. Once a result converged with respect to basis set is obtained, a larger range of points can be calculated, with a numerical technique used to compute vibrational wave functions and (say) a spline fit to the polarizability components as a function of bond length.
Since the polarizability tensor of N$_2$ is rather well established experimentally it could be used for calibration, but the best experimental results are frequency-dependent, so this would not be completely reliable.

Answer 4

Answering "I would propose to steer clear of this project and not perform any calculations
at all" would show good judgment, but if you do get involved, it will be crucial to obtain
a balanced description of valence-Rydberg mixing in the excited states. Since this can
change drastically with geometry, a number of exploratory calculations would be required,
with diffuse functions in the basis and probably a large state-averaged MCSCF followed
by MRCI, or perhaps by doing state-averaged RASSCF calculations. Since the number
of states to be averaged over would be unknown at the beginning and might require
adjustment during the course of the calculations, considerable trial-and-error effort will
be required. You might be better off with new experimental friends!

Answer 5
This is another very difficult problem, but this one is more amenable to theoretical solu-
tion. First, it is vital to obtain a good description of properties, particularly the dipole
moment of NH3 and the polarizability of Ar. Since these are relatively small systems,
an extensive series of basis set and correlation treatments could be examined, in order to
ensure convergence of the desired properties. Second, this is an obvious case for attention
to basis set superposition error, and a proper examination of BSSE using counterpoise
calculations would be mandatory for determining the binding energy, and the optimum
geometry of the complex.
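The counterpoise bookkeeping can be sketched as follows. This is a minimal illustration of the standard Boys-Bernardi scheme, in which each monomer is recomputed in the full dimer basis; the function names and all energies are made-up placeholders, not computed values for NH3-Ar.

```python
def counterpoise_interaction_energy(e_dimer, e_a_dimer_basis, e_b_dimer_basis):
    """Interaction energy with both monomers evaluated in the full dimer basis."""
    return e_dimer - e_a_dimer_basis - e_b_dimer_basis


def bsse_estimate(e_a_own, e_a_dimer_basis, e_b_own, e_b_dimer_basis):
    """Spurious stabilization the monomers gain from the partner's basis functions."""
    return (e_a_own - e_a_dimer_basis) + (e_b_own - e_b_dimer_basis)


# Hypothetical total energies in hartree, chosen only to exercise the arithmetic.
e_dimer = -583.2500
e_a_own, e_a_dimer_basis = -56.4800, -56.4802
e_b_own, e_b_dimer_basis = -526.7690, -526.7691

uncorrected = e_dimer - e_a_own - e_b_own
corrected = counterpoise_interaction_energy(e_dimer, e_a_dimer_basis, e_b_dimer_basis)
bsse = bsse_estimate(e_a_own, e_a_dimer_basis, e_b_own, e_b_dimer_basis)
print(uncorrected, corrected, bsse)
```

The corrected interaction energy equals the uncorrected one plus the BSSE estimate, so the correction always makes the complex less bound. For a geometry optimization the same correction has to be re-evaluated at every geometry.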

10 MCSCF Theory

This chapter presents solutions to the problems in "The Multiconfigurational (MC) Self-
Consistent Field (SCF) Theory" chapter by B. O. Roos in "Lecture Notes in Chemistry 58,
Lecture notes in Quantum Chemistry, European Summer School in Quantum Chemistry"
(Springer-Verlag Heidelberg, 1992), pp. 177-254. The numbering of the solutions follows
that used by Roos in his lecture notes.

Figure 1: The hydrogen molecule.

Answer 1.1

The two-particle density is (eq. 1:3):

Introduce the simplified notation:

then

$\phi_1(-r) = a$ and $\phi_2(-r) = -b$.

Insert in Eq. 9 to obtain:

(10)

and

(11)

$\therefore\ P_2(r, -r) > P_2(r, r)$

An example is the H2 molecule in Fig. 1, with $\phi_1 = 1\sigma_g$ and $\phi_2 = 1\sigma_u$. If electron 1 is close
to atom $H_A$, the probability of finding electron 2 at $H_B$ is larger. In this case $P_2(r, r) \to 0$ at
dissociation ($a^2 = b^2$ and $\eta_1 = \eta_2 = 1$).

Answer 2.1

From Eqs. 2:6 and 2:8 we have



and

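where

\[
\Phi_{ion} = N_{ion}\{1s_A(1)1s_A(2) + 1s_B(1)1s_B(2)\}\,\Theta_{2,0}
\]
\[
\Phi_{cov} = N_{cov}\{1s_A(1)1s_B(2) + 1s_B(1)1s_A(2)\}\,\Theta_{2,0}
\]
\[
N_{ion}^{-2} = N_{cov}^{-2} = (2 + 2S^2),
\]
for convenience we define

Thus we have

\[
\Phi_1 = N_1^2\left\{\frac{\Phi_{ion}}{N_{ion}} + \frac{\Phi_{cov}}{N_{cov}}\right\}
\]
\[
\Phi_2 = N_2^2\left\{\frac{\Phi_{ion}}{N_{ion}} - \frac{\Phi_{cov}}{N_{cov}}\right\}
\]
\[
C_1\Phi_1 + C_2\Phi_2 = \{C_1N_1^2 + C_2N_2^2\}N\,\Phi_{ion} + \{C_1N_1^2 - C_2N_2^2\}N\,\Phi_{cov}
\]
Thus $\Psi_{VB} = \Psi_{MC}$ when

When the internuclear separation goes to infinity we obtain (S = 0):

\[
C_{ion} = \frac{1}{\sqrt{2}}(C_1 + C_2)
\]
\[
C_{cov} = \frac{1}{\sqrt{2}}(C_1 - C_2)
\]

but for two hydrogen atoms: $C_{ion} = 0$ and $C_{cov} = 1$. Thus $C_1 = -C_2 = \frac{1}{\sqrt{2}}$.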
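The mapping between the MC coefficients and the ionic and covalent weights can be evaluated numerically. The sketch below assumes the usual orbital normalizations $N_1^2 = 1/(2 + 2S)$ and $N_2^2 = 1/(2 - 2S)$ for $1\sigma_g$ and $1\sigma_u$, and takes $N = 1/N_{ion} = 1/N_{cov}$; these are assumptions made for the illustration, and the function name is hypothetical.

```python
from math import sqrt


def ionic_covalent_coefficients(c1, c2, overlap):
    """Map MC coefficients (C1, C2) to (C_ion, C_cov) for overlap S between 1s_A and 1s_B."""
    n1_sq = 1.0 / (2.0 + 2.0 * overlap)   # assumed normalization of sigma_g, squared
    n2_sq = 1.0 / (2.0 - 2.0 * overlap)   # assumed normalization of sigma_u, squared
    n = sqrt(2.0 + 2.0 * overlap**2)      # assumed N = 1/N_ion = 1/N_cov
    c_ion = (c1 * n1_sq + c2 * n2_sq) * n
    c_cov = (c1 * n1_sq - c2 * n2_sq) * n
    return c_ion, c_cov


# Dissociation limit (S = 0) with C1 = -C2 = 1/sqrt(2): purely covalent.
c_ion_diss, c_cov_diss = ionic_covalent_coefficients(1 / sqrt(2), -1 / sqrt(2), 0.0)

# Single closed-shell configuration (C1 = 1, C2 = 0) at S = 0: equal ionic and covalent parts.
c_ion_hf, c_cov_hf = ionic_covalent_coefficients(1.0, 0.0, 0.0)
print(c_ion_diss, c_cov_diss, c_ion_hf, c_cov_hf)
```

The second case shows why a single closed-shell configuration fails at dissociation: it carries a 50 % ionic contribution that the second configuration has to cancel.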

Answer 2.2

The CH2 radical has a triplet ground state, $^3B_1$, with one electron in a σ and one in a π
lone-pair orbital (see Fig. 2).
Two such radicals combine to form the ethene double bond (see Fig. 3).

Figure 2: The CH2 radical.

Figure 3: The ethene molecule built up from two CH2 radical fragments.

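Denote the triplet state on radical A as $|1, M\rangle_A$, where S = 1 and M = 1, 0, -1, with the
corresponding notation for radical B. Start by constructing the overall quintet state (S = 2,
M = 2) for the combined system AB:
\[
|2,2\rangle_{AB} = |1,1\rangle_A\,|1,1\rangle_B \qquad (12)
\]
Anti-symmetrization is implicitly understood in this formal equation. In order to obtain
the singlet state, $|0,0\rangle_{AB}$, we use the step-down operator:

\[
\hat{S}_-|S,M\rangle = \sqrt{(S + M)(S - M + 1)}\;|S,M-1\rangle
\]

to obtain $|2,1\rangle_{AB}$. Orthogonality gives $|1,1\rangle_{AB}$. Further use of $\hat{S}_-$ gives $|2,0\rangle_{AB}$ and
$|1,0\rangle_{AB}$. $|0,0\rangle_{AB}$ is finally obtained using the orthogonality requirement:
\[
|0,0\rangle_{AB} = \frac{1}{\sqrt{3}}\left\{|1,1\rangle_A|1,-1\rangle_B - |1,0\rangle_A|1,0\rangle_B + |1,-1\rangle_A|1,1\rangle_B\right\}
\]

This is the VB function. We can transform it to the common determinant language by
using:

\[
|1,1\rangle_A = |\sigma_A \pi_A|
\]
\[
|1,-1\rangle_A = |\bar{\sigma}_A \bar{\pi}_A|
\]
\[
|1,0\rangle_A = \frac{1}{\sqrt{2}}\left\{|\sigma_A \bar{\pi}_A| + |\bar{\sigma}_A \pi_A|\right\}
\]
with similar relations for B. The vertical bars (|...|) here denote a normalized determinant.
Inserting into the VB function yields:
\[
|0,0\rangle_{AB} = \frac{1}{\sqrt{3}}\Big\{ |\sigma_A \pi_A \bar{\sigma}_B \bar{\pi}_B|
 - \tfrac{1}{2}|\sigma_A \bar{\pi}_A \sigma_B \bar{\pi}_B|
 - \tfrac{1}{2}|\sigma_A \bar{\pi}_A \bar{\sigma}_B \pi_B|
 - \tfrac{1}{2}|\bar{\sigma}_A \pi_A \sigma_B \bar{\pi}_B|
 - \tfrac{1}{2}|\bar{\sigma}_A \pi_A \bar{\sigma}_B \pi_B|
 + |\bar{\sigma}_A \bar{\pi}_A \sigma_B \pi_B| \Big\}
\]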

Figure 4: The equilibrium structure of water.

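It remains to transform to symmetry adapted orbitals:

\[
\sigma_g = \frac{1}{\sqrt{2}}(\sigma_A + \sigma_B), \qquad \pi_u = \frac{1}{\sqrt{2}}(\pi_A + \pi_B)
\]
\[
\sigma_u = \frac{1}{\sqrt{2}}(\sigma_A - \sigma_B), \qquad \pi_g = \frac{1}{\sqrt{2}}(\pi_A - \pi_B)
\]
(note: these relations are valid at large CC distances where the overlap is zero).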
The final result is :
10,0)AS =~ { 2Iugu,.7i'.,7i'gl + 210'gO'1/1I',.1I'gl lugO',.1I',.7i'gl + IO'gu,.1I'u7i'gl
+ IUgO'u7i'u1l'gl + 100guu7i'u7rgl 3lugO'g1l'u7i'1/1 + 3luuO'u1l'g7i'gl
+ 3lugO'g1l'g7i'gl 3 luuO',.7ru7i'u1 }

Thus all configurations that can be generated by distributing 4 electrons in the 4 MO's
and coupling them to a singlet state are included (a CASSCF wave function with 4
electrons in 4 orbitals and S = 0). Note: the closed-shell HF configuration has the weight
3/16 = 18.75 % in $|0,0\rangle_{AB}$.
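The weight of the closed-shell configuration can be verified by brute force. The sketch below expands the six-determinant VB singlet in determinants over the symmetry orbitals, assuming zero overlap between the fragments so that $\sigma_{A,B} = (\sigma_g \pm \sigma_u)/\sqrt{2}$ and $\pi_{A,B} = (\pi_u \pm \pi_g)/\sqrt{2}$. The orbital labels (`sA`, `pA`, ...) and helper names are ad hoc choices for this illustration.

```python
from itertools import product
from math import sqrt

R = 1.0 / sqrt(2.0)
# Localized orbital -> symmetry orbitals (valid only for zero A-B overlap).
LOCAL_TO_MO = {
    "sA": {"sg": R, "su": R},
    "sB": {"sg": R, "su": -R},
    "pA": {"pu": R, "pg": R},
    "pB": {"pu": R, "pg": -R},
}

# The six determinants of the VB singlet; an overall factor 1/sqrt(3) is applied below.
VB_TERMS = [
    (1.0, [("sA", "a"), ("pA", "a"), ("sB", "b"), ("pB", "b")]),
    (-0.5, [("sA", "a"), ("pA", "b"), ("sB", "a"), ("pB", "b")]),
    (-0.5, [("sA", "a"), ("pA", "b"), ("sB", "b"), ("pB", "a")]),
    (-0.5, [("sA", "b"), ("pA", "a"), ("sB", "a"), ("pB", "b")]),
    (-0.5, [("sA", "b"), ("pA", "a"), ("sB", "b"), ("pB", "a")]),
    (1.0, [("sA", "b"), ("pA", "b"), ("sB", "a"), ("pB", "a")]),
]


def sort_with_sign(spin_orbitals):
    """Bring a determinant to canonical order; the sign is 0 if a spin orbital repeats."""
    items = list(spin_orbitals)
    if len(set(items)) < len(items):
        return 0, ()
    sign = 1
    for i in range(len(items)):
        for j in range(len(items) - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                sign = -sign  # every adjacent transposition flips the sign
    return sign, tuple(items)


def expand_vb_in_mo_determinants():
    coeffs = {}
    for c, det in VB_TERMS:
        choices = [list(LOCAL_TO_MO[name].items()) for name, _ in det]
        for combo in product(*choices):
            amp = c / sqrt(3.0)
            spin_orbs = []
            for (mo, weight), (_, spin) in zip(combo, det):
                amp *= weight
                spin_orbs.append((mo, spin))
            sign, key = sort_with_sign(spin_orbs)
            if sign:
                coeffs[key] = coeffs.get(key, 0.0) + sign * amp
    return {k: v for k, v in coeffs.items() if abs(v) > 1e-12}


coeffs = expand_vb_in_mo_determinants()
_, hf_key = sort_with_sign([("sg", "a"), ("sg", "b"), ("pu", "a"), ("pu", "b")])
hf_weight = coeffs[hf_key] ** 2
norm = sum(v * v for v in coeffs.values())
print(hf_weight, norm)
```

The overall sign of each coefficient depends on the canonical ordering chosen for the spin orbitals, so only the squared coefficients (the weights) are compared with the text.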

Answer 2.3

The water molecule has 10 electrons. The equilibrium structure has $C_{2v}$ symmetry (see
Fig. 4).
The electronic structure is, in terms of localized orbitals:

where $1s_O$ is the oxygen 1s orbital, OH are the two oxygen-hydrogen bonds, $n\sigma$ is the
in-plane oxygen lone-pair, and $n\pi$ is the out-of-plane oxygen lone-pair. Transform to
symmetry orbitals:

Figure 5: The carbon dioxide molecule.

\[
\begin{aligned}
OH_1 + OH_2 &\to 2a_1 \\
OH_1 - OH_2 &\to 1b_2 \\
n\sigma &\to 3a_1 \\
n\pi &\to 1b_1
\end{aligned}
\]

Thus:

Note: The derivation is heuristic and based on chemical knowledge (the structure). More
complicated structures may be less straightforward.

Carbon dioxide is a linear molecule with $D_{\infty h}$ symmetry (see Fig. 5), and has 22 electrons.
The localized orbitals are:

\[
(1s_{O_1})^2(1s_{O_2})^2(1s_C)^2(CO_1)^2(CO_2)^2(n\sigma_1)^2(n\sigma_2)^2
(n\pi_{1x})^2(n\pi_{2y})^2(\pi_{CO_1 y})^2(\pi_{CO_2 x})^2
\]

Note: the two π bonds are perpendicular to each other, as are the oxygen π lone-pairs.
Now transform to symmetry orbitals:

\[
\begin{aligned}
1s_{O_1} + 1s_{O_2} &\to 1\sigma_g \\
1s_{O_1} - 1s_{O_2} &\to 1\sigma_u \\
1s_C &\to 2\sigma_g \\
CO_1 + CO_2 &\to 3\sigma_g \quad \text{(CO σ bonds)} \\
CO_1 - CO_2 &\to 2\sigma_u \quad \text{(CO σ bonds)} \\
n\sigma_1 + n\sigma_2 &\to 4\sigma_g \quad \text{(oxygen σ lone-pair)} \\
n\sigma_1 - n\sigma_2 &\to 3\sigma_u \quad \text{(oxygen σ lone-pair)}
\end{aligned}
\]

Figure 6: The formaldehyde molecule.

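The π-orbitals are most easily transformed by noting that they form the energy diagram

\[
\begin{array}{lll}
2\pi_u & & (\pi_{O_1} + \pi_{O_2} - 2\pi_C) \\
1\pi_g & \uparrow\downarrow\ \uparrow\downarrow & (\pi_{O_1} - \pi_{O_2}) \\
1\pi_u & \uparrow\downarrow\ \uparrow\downarrow & (\pi_{O_1} + \pi_{O_2} + \pi_C)
\end{array}
\]
With the resulting electron configuration:

\[
\Phi = (1\sigma_g)^2(1\sigma_u)^2(2\sigma_g)^2(3\sigma_g)^2(2\sigma_u)^2(4\sigma_g)^2(3\sigma_u)^2(1\pi_u)^4(1\pi_g)^4
\]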

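The formaldehyde molecule has 16 electrons and has $C_{2v}$ symmetry at equilibrium (see
Fig. 6).
In localized orbitals:

Transform to symmetry orbitals,

\[
\begin{aligned}
1s_O &\to 1a_1 \\
1s_C &\to 2a_1 \\
\sigma_{CO} &\to 3a_1 \\
\sigma_{CH_1} + \sigma_{CH_2} &\to 4a_1 \\
\sigma_{CH_1} - \sigma_{CH_2} &\to 1b_2 \\
n_O\sigma &\to 5a_1 \\
n_O\pi &\to 2b_2 \\
\pi_{CO} &\to 1b_1
\end{aligned}
\]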

Which gives the electronic configuration



Figure 7: The ethene molecule.


The ethene molecule has D2h symmetry (see Fig. 7), and 16 electrons.
In localized orbitals we obtain from the structure formula:

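or in terms of symmetry adapted MO's

\[
\begin{aligned}
1s_{C_1} + 1s_{C_2} &\to 1a_g \\
1s_{C_1} - 1s_{C_2} &\to 1b_{3u} \\
\sigma_{CC} &\to 2a_g \\
\sigma_{C_1H_1} + \sigma_{C_1H_2} + \sigma_{C_2H_3} + \sigma_{C_2H_4} &\to 3a_g \\
\sigma_{C_1H_1} + \sigma_{C_1H_2} - \sigma_{C_2H_3} - \sigma_{C_2H_4} &\to 2b_{3u} \\
\sigma_{C_1H_1} - \sigma_{C_1H_2} + \sigma_{C_2H_3} - \sigma_{C_2H_4} &\to 1b_{2u} \\
\sigma_{C_1H_1} - \sigma_{C_1H_2} - \sigma_{C_2H_3} + \sigma_{C_2H_4} &\to 1b_{1g} \\
\pi_{CC} &\to 1b_{1u}
\end{aligned}
\]
With the resulting electronic configuration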

The benzene molecule has $D_{6h}$ symmetry and 42 electrons. We divide the electrons up
into the following groups,

• Carbon 1s electrons (6 doubly occupied orbitals),

• CC σ-bonds (6 doubly occupied orbitals),

• CH σ-bonds (6 doubly occupied orbitals),

• the π-electrons (3 doubly occupied orbitals).

Figure 8: Position of core orbitals and CH bonds in benzene.

Figure 9: Position of CC π and σ bonds in benzene.

The first and third group will give rise to equivalent symmetry adapted MO's. We therefore
treat only the first group (see Fig. 8).
We use the projection operator technique to find out that the six orbitals transform as
$a_{1g}$, $b_{1u}$, $e_{1u}$ and $e_{2g}$ (see section 3.4 in "Molecular Symmetry and Quantum Chemistry"
by P. R. Taylor in Lecture Notes in Quantum Chemistry, Ed. B. O. Roos, pp. 111). The
second group is slightly different since the bonds are located between the atoms rather
than on them (compare Figs. 8 and 9).
The only effect is to change $b_{1u}$ into $b_{2u}$. We can now write down the σ part of the
electronic configuration:

\[
\begin{array}{ll}
(1a_{1g})^2(1b_{1u})^2(1e_{1u})^4(1e_{2g})^4 & \text{1s electrons} \\
(2a_{1g})^2(1b_{2u})^2(2e_{1u})^4(2e_{2g})^4 & \text{CC σ bonds} \\
(3a_{1g})^2(2b_{1u})^2(3e_{1u})^4(3e_{2g})^4 & \text{CH bonds}
\end{array}
\]

The π-electron part is less straightforward, since we also have to decide which orbitals
are to be filled. The six MO's are: $a_{2u}$, $e_{2u}$, $e_{1g}$, and $b_{2g}$. The ordering of the orbitals by
energy can be obtained from some simple calculations (e.g. Hückel) or by just counting
the nodes. The result is:

with the electronic configuration


Figure 10: The tri-radical resonance structure of the NO2 molecule.

Figure 11: The two radical resonance structures of the NO2 molecule.

Hint: the projection to symmetry orbitals is most easily performed by considering the
subgroup $C_{6v}$ first, realizing that all σ orbitals are symmetric with respect to the reflection
in the molecular plane of the benzene ring, and that the π orbitals are anti-symmetric
with respect to the same symmetry operation.
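The ordering of the benzene π orbitals mentioned above can be reproduced with a minimal Hückel sketch. It uses the closed-form solutions for a ring of n equivalent centres, $E_k = \alpha + 2\beta\cos(2\pi k/n)$, verifies them against the nearest-neighbour connectivity, and counts the sign changes of each orbital around the ring. The helper names are assumptions made for this illustration, and the symmetry labels in the comments are the standard $D_{6h}$ assignments.

```python
from math import cos, pi


def ring_huckel_levels(n):
    """x_k = 2*cos(2*pi*k/n) for k = 0..n-1; the orbital energy is alpha + x_k*beta."""
    return [2.0 * cos(2.0 * pi * k / n) for k in range(n)]


def apply_ring_adjacency(vec):
    """Multiply the nearest-neighbour (Hückel) connectivity matrix of a ring into vec."""
    n = len(vec)
    return [vec[(j - 1) % n] + vec[(j + 1) % n] for j in range(n)]


def count_sign_changes(vec, eps=1e-9):
    """Sign changes going once around the ring (two per nodal plane)."""
    values = [v for v in vec if abs(v) > eps]
    return sum(1 for a, b in zip(values, values[1:] + values[:1]) if a * b < 0)


n = 6
# beta is negative, so the largest x is the lowest orbital energy.
levels = sorted(ring_huckel_levels(n), reverse=True)

sign_changes = []
max_residual = 0.0
for k in range(n // 2 + 1):          # k = 0, 1, 2, 3 -> a2u, e1g, e2u, b2g
    vec = [cos(2.0 * pi * k * j / n) for j in range(n)]
    x = 2.0 * cos(2.0 * pi * k / n)
    residual = max(abs(a - x * v) for a, v in zip(apply_ring_adjacency(vec), vec))
    max_residual = max(max_residual, residual)
    sign_changes.append(count_sign_changes(vec))

# Six pi electrons fill the three lowest orbitals (energy in units of beta, relative to 6*alpha).
pi_energy = sum(2.0 * x for x in levels[:3])
print([round(x, 8) for x in levels], sign_changes, pi_energy)
```

The number of nodal planes grows from zero to three as the energy rises, which is the node-counting argument in numerical form.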

The NO2 molecule is a radical and has $C_{2v}$ symmetry and 23 electrons. Here we have a
problem, since there is no obvious structure formula for this molecule that defines the
bonding situation. One possibility is the tri-radical configuration (see Fig. 10), but that
is very unlikely. Then we have the two radical resonance structures (see Fig. 11).
The electronic configuration for (1) is
\[
(1s_{O_1})^2(1s_{O_2})^2(1s_N)^2(2s_{O_1})^2(2s_{O_2})^2(\sigma_{NO_1})^2(\sigma_{NO_2})^2
(n\sigma_N)^2(n\sigma_{O_1})^1(n\sigma_{O_2})^2(\pi_{NO_2})^2(n_{O_1}\pi)^2
\]

where $n\sigma_N$ is the nitrogen lone-pair, $n\sigma_O$ the in-plane 2p oxygen lone-pair, and $n_O\pi$
the π lone-pair. For simplicity we have assumed the oxygen 2s not to be involved in
the bonding. It is only the last nine electrons that cause any problem. The first seven
orbitals in the row are easily symmetrized to:

The same result is obtained for (2). Consider now the orbitals $n\sigma_{O_1}$ and $n\sigma_{O_2}$. They can
be replaced by the symmetrized orbitals

\[
\begin{aligned}
n\sigma_{O_1} + n\sigma_{O_2} &\to 6a_1 \\
n\sigma_{O_1} - n\sigma_{O_2} &\to 4b_2
\end{aligned}
\]
$6a_1$ is "bonding" and we may guess the contribution ($5a_1$ is the $n\sigma_N$ orbital),
\[
(5a_1)^2(6a_1)^2(4b_2)^1
\]

Figure 12: The two ionic resonance structures of the NO2 molecule.

Figure 13: Labelling of the ozone molecule.

This is wrong! The ground state of NO2 is $^2A_1$ and not $^2B_2$. The reason is the appearance
of ionic structures described in Fig. 12.
The σ-lone-pair part of the wave function is now

The π-orbitals in NO2 are $1b_1$, $1a_2$, and $2b_1$. With four electrons we obtain the
configuration:

The difficulty of writing down a consistent electronic structure for NO2 reflects the fact
that it is not well described by a single configuration. Another example of this situation
is ozone, which will be discussed in the next example.

Answer 2.4

In this example we treat only the 4 π electrons. For simplicity it is assumed that the
three π orbitals $\pi_A$, $\pi_B$, and $\pi_C$ are orthonormal. Formally we can write the three valence
structures as (see Fig. 13 for notation):

I    $(\pi_A)^2(\pi_B\pi_C)_s$
II   $(\pi_B)^2(\pi_A\pi_C)_s$
III  $(\pi_C)^2(\pi_A\pi_B)_s$

where $(\pi_B\pi_C)_s$ means a singlet coupled electron pair:


Now introduce the MO's $\pi_1$, $\pi_2$, and $\pi_3$ according to the text. For simplicity we also use
\[
\pi_+ = \frac{1}{\sqrt{2}}(\pi_1 + \pi_3), \qquad \pi_- = \frac{1}{\sqrt{2}}(\pi_1 - \pi_3)
\]
In this notation we have:

\[
\pi_A = \pi_+, \qquad \pi_B = \frac{1}{\sqrt{2}}(\pi_2 + \pi_-), \qquad \pi_C = \frac{1}{\sqrt{2}}(\pi_2 - \pi_-)
\]

The diradical structure I can now be transformed to the MO picture:

(lI"BlI"C). = ~{(11"_)2 - (lI"2)2}


(11"04)2 = (11"+)2

Thus

Obviously we have:

Further we find

Thus:

.1 = ~(lI"l)2(1I"3)2 +
v2
lrn(7I"1)2(7I"2)2
2v2
- 2~(71"2)2(71"3)2 - ~(7I"11I"3).(1I"2)2
Note: The second term is the HF configuration. It has the weight 1/8 = 12.5 %. For
symmetry reasons we only need

[Figure: the orbitals $\pi_1$ to $\pi_4$ with their symmetry labels. $C_2$: A S A S; $C_s$: S A S A.]

Figure 14: The four sets of π orbitals in cis-butadiene.

Compute the overlap


1 1
= 0.89( - 2v'2) - 0.45'2 ~ 0.6329
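\[
\frac{1}{\sqrt{2}}\langle\Psi|\Psi_{II} + \Psi_{III}\rangle = 0.89\,\frac{1}{\sqrt{2}} = 0.6293
\]
\[
\therefore\ \Psi = 0.6329\,\Psi_I + 0.6293\,\frac{1}{\sqrt{2}}(\Psi_{II} + \Psi_{III}) + \ldots
\]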

Finally we find:

Weight of $\Psi_I$ in $\Psi$: 40.0%
Weight of $\Psi_{II}$ in $\Psi$: 19.8%
Weight of $\Psi_{III}$ in $\Psi$: 19.8%
Summed weights: 79.6%

Answer 2.5

The π-orbitals in cis-butadiene (the σ-orbitals remain essentially unchanged during the
reaction and need not be considered) have the form shown in Fig. 14,
where we have also indicated whether they are symmetric (S) or anti-symmetric (A)
under the $C_2$ or $C_s$ point groups, respectively. The orbitals $\pi_1$ and $\pi_2$ are occupied. The
corresponding orbitals in cyclobutene are described in Fig. 15.
The con-rotatory reaction path leads to the correlation diagram of Fig. 16.
This reaction path is allowed! For the dis-rotatory reaction path we obtain the
correlation diagram according to Fig. 17.
The dis-rotatory reaction path leads to a change of the electronic configuration and is
then expected to have a considerable barrier (a Woodward-Hoffmann forbidden reaction).
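The correlation argument can be written as a few lines of code. The sketch below pairs the k-th lowest orbital of each symmetry on the reactant side with the k-th lowest orbital of the same symmetry on the product side (the non-crossing rule) and checks whether the occupied reactant orbitals end up in occupied product orbitals. The symmetry strings are taken from Figs. 14 and 15 (orbitals listed in order of increasing energy); the function names are assumptions made for the illustration.

```python
def correlate(reactant, product):
    """Map reactant orbital index -> product orbital index, symmetry by symmetry."""
    mapping = {}
    for label in set(reactant):
        r_idx = [i for i, s in enumerate(reactant) if s == label]
        p_idx = [i for i, s in enumerate(product) if s == label]
        mapping.update(zip(r_idx, p_idx))
    return mapping


def is_allowed(reactant, product, n_occupied):
    """Allowed if every occupied reactant orbital correlates with an occupied product orbital."""
    mapping = correlate(reactant, product)
    return all(mapping[i] < n_occupied for i in range(n_occupied))


# Orbitals in order of increasing energy: pi1..pi4 and sigma, pi, pi*, sigma*.
conrotatory = is_allowed(list("ASAS"), list("SASA"), 2)   # C2 axis conserved
disrotatory = is_allowed(list("SASA"), list("SSAA"), 2)   # mirror plane conserved
print(conrotatory, disrotatory)
```

In the dis-rotatory case the occupied $\pi_2$ (A) correlates with the empty $\pi^*$ (A) of cyclobutene, which is the change of electronic configuration referred to in the text.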

[Figure: the orbitals σ, π, π*, σ* with their symmetry labels. $C_2$: S A S A; $C_s$: S S A A.]

Figure 15: The four sets of orbitals in cyclobutene produced by ring closure of cis-butadiene.

[Figure: correlation diagram under $C_2$. Butadiene levels: $\pi_1$(A), $\pi_2$(S), $\pi_3$(A), $\pi_4$(S). Cyclobutene levels: σ(S), π(A), π*(S), σ*(A).]

Figure 16: The correlation diagram of the con-rotatory reaction path.

[Figure: correlation diagram under $C_s$. Butadiene levels: $\pi_1$(S), $\pi_2$(A), $\pi_3$(S), $\pi_4$(A). Cyclobutene levels: σ(S), π(S), π*(A), σ*(A).]

Figure 17: The correlation diagram of the dis-rotatory reaction path.


Answer 2.6

"i-j' = ~ {I ... i 1 i J ···1 + I··· i' zi J ... I}


.....
Vj_i' 1 {I ... Z....
v'2 Z J ."
Z ••• I + I... z
....
Z Z,""
J •.• I}

The energies are (use the Slater rules):

\[
E(i \to j') = E_0 - \epsilon_i + \epsilon_{j'} - J_{j'i} + 2K_{j'i}
\]
\[
E(j \to i') = E_0 - \epsilon_j + \epsilon_{i'} - J_{ji'} + 2K_{ji'}
\]

where $E_0$ is the ground state energy, $\epsilon_i$ the orbital energies and
\[
J_{ab} = (aa|bb), \qquad K_{ab} = (ab|ab)
\]

To compute these Coulomb and exchange integrals we use the zero-differential overlap
(ZDO) approximation:

For the Coulomb integrals we obtain using the MO's given in the problem (note that
c;" = c;,,,)
(j'i'l = r:C1,,,(ppl + ~:C1,,,(ppl = WI
I'l.) ,,0
(iii = ~:C1,,(ppl + Lc1,,(ppl = (i'il
p(.) I'l)

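Thus $J_{j'i} = J_{ji'}$. In the same way $K_{j'i} = K_{ji'}$. Since also $-\epsilon_i + \epsilon_{j'} = \epsilon_{i'} - \epsilon_j$ we obtain

\[
E(i \to j') = E(j \to i') = E_0 - \epsilon_i - \epsilon_j - J_{j'i} + 2K_{j'i}
\]

The interaction term is

\[
\langle\Psi_{i\to j'}|H|\Psi_{j\to i'}\rangle = 2(j'i|ji') - (j'i'|ji) = 2K_{j'i} - K_{ji}
\]

since $(j'i| = (ji'|$ and $(j'i'| = (ji|$. Solving the 2x2 secular problem we obtain
\[
E^{+} = E_0 - \epsilon_i - \epsilon_j - J_{j'i} + 4K_{j'i} - K_{ji}
\]
\[
E^{-} = E_0 - \epsilon_i - \epsilon_j - J_{j'i} + K_{ji}
\]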
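The 2x2 secular problem with equal diagonal elements d and coupling c has the eigenvalues d ± c, which is all that is needed here. A minimal sketch follows; the parameter names and the numerical values are illustrative assumptions (orbital energies measured relative to the Coulomb parameter, arbitrary energy units), not results from a calculation.

```python
def split_states(e0, eps_i, eps_j, j_coulomb, k_cross, k_ij):
    """Energies of the plus and minus combinations of the two degenerate excitations."""
    diagonal = e0 - eps_i - eps_j - j_coulomb + 2.0 * k_cross
    coupling = 2.0 * k_cross - k_ij
    return diagonal + coupling, diagonal - coupling


# Made-up parameters, chosen only to exercise the formulas.
e_plus, e_minus = split_states(e0=0.0, eps_i=-0.3, eps_j=-0.4,
                               j_coulomb=0.2, k_cross=0.05, k_ij=0.03)
print(e_plus, e_minus)
```

The splitting between the two states is twice the coupling element, $2(2K_{j'i} - K_{ji})$, so the plus state lies higher whenever $2K_{j'i} > K_{ji}$.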
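The transition dipole moments are

\[
\langle\Phi_0|\sum_{p,q}\mu_{pq}E_{pq}|\Psi_{i\to j'}\rangle = \sqrt{2}\,\mu_{ij'}
\]

Figure 18: The expansion of the dipole moment operator.

and

\[
\langle\Phi_0|\sum_{p,q}\mu_{pq}E_{pq}|\Psi_{j\to i'}\rangle = \sqrt{2}\,\mu_{ji'}
\]

where $\mu_{pq} = \langle\phi_p|r|\phi_q\rangle$. The transition dipole moments can now be expressed as

\[
\mu_{ij'} = \sum_{p,q} C_{ip}C_{j'q}\langle\pi_p|r|\pi_q\rangle
\]
\[
\mu_{ji'} = \sum_{p,q} C_{i'p}C_{jq}\langle\pi_p|r|\pi_q\rangle
\]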

The dipole moments $\langle\pi_p|r|\pi_q\rangle$ can be expanded as

The second term in this expression is zero. Why? According to Fig. 18 R goes to the
midpoint of pq.
With assumed orthogonality we have

To conclude the derivation

L C;pCj'p14 - L c;p cj'P14


p(.) pO
\[
\therefore\ \mu_{ji'} = \mu_{ij'}
\]

Thus all intensity goes into the upper state $\Psi^{+}$, while in this approximation the
transition to $\Psi^{-}$ is forbidden. The separation of the π excited states in conjugated
hydrocarbons remains approximately valid also in more accurate treatments (see for example,
Matos & Roos in Theoret. Chim. Acta 74, 363 (1988): The singlet-singlet and
triplet-triplet spectroscopy of naphtalene).

Answer 3.1

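Assume i < j

\[
\hat{a}_j\hat{a}_i|\ldots m_i \ldots m_j \ldots\rangle = \hat{a}_j\, m_i(-1)^{p_i}|\ldots 0_i \ldots m_j \ldots\rangle = m_j m_i(-1)^{p_i+p_j-1}|\ldots 0_i \ldots 0_j \ldots\rangle
\]
\[
\hat{a}_i\hat{a}_j|\ldots m_i \ldots m_j \ldots\rangle = \hat{a}_i\, m_j(-1)^{p_j}|\ldots m_i \ldots 0_j \ldots\rangle = m_i m_j(-1)^{p_i+p_j}|\ldots 0_i \ldots 0_j \ldots\rangle
\]

The other relations are proven in a similar way.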
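The phase bookkeeping can be checked with a small occupation-number sketch. It assumes the convention that $p_i$ is the number of occupied spin orbitals preceding orbital i in the vector; the representation of a state as a `(coefficient, occupations)` tuple and the function names are choices made for this illustration.

```python
def annihilate(i, state):
    """Apply a_i to (coefficient, occupation tuple) with phase (-1)**p_i."""
    coef, occ = state
    if coef == 0 or occ[i] == 0:
        return (0, occ)
    phase = (-1) ** sum(occ[:i])          # p_i = occupied spin orbitals before i
    return (coef * phase, occ[:i] + (0,) + occ[i + 1:])


def annihilators_anticommute(occ):
    """Check a_i a_j + a_j a_i = 0 on one occupation-number vector for all i < j."""
    for i in range(len(occ)):
        for j in range(i + 1, len(occ)):
            ij = annihilate(i, annihilate(j, (1, occ)))
            ji = annihilate(j, annihilate(i, (1, occ)))
            if ij[0] + ji[0] != 0:
                return False
            if ij[0] != 0 and ij[1] != ji[1]:
                return False
    return True


example = (1, 0, 1, 1)
first_j_then_i = annihilate(0, annihilate(2, (1, example)))   # a_0 a_2
first_i_then_j = annihilate(2, annihilate(0, (1, example)))   # a_2 a_0
print(first_j_then_i, first_i_then_j, annihilators_anticommute((1, 0, 1, 1, 0, 1)))
```

Removing the electron in orbital i first lowers the count of occupied orbitals in front of j by one, which is the origin of the extra factor of -1 between the two orderings.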

Answer 3.2

Use the following operator relation:

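\[
[\hat{A}\hat{B}, \hat{C}\hat{D}] = -[\hat{A},\hat{C}]_+\hat{B}\hat{D} + \hat{A}[\hat{B},\hat{C}]_+\hat{D} - \hat{C}[\hat{A},\hat{D}]_+\hat{B} + \hat{C}\hat{A}[\hat{B},\hat{D}]_+
\]
where $[\hat{A},\hat{C}]_+ = \hat{A}\hat{C} + \hat{C}\hat{A}$, etc.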
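An operator identity of this kind holds for any associative algebra, so it can be spot-checked with ordinary matrices. The sketch below tests the form $[AB, CD] = -[A,C]_+BD + A[B,C]_+D - C[A,D]_+B + CA[B,D]_+$ with small integer matrices, for which the arithmetic is exact; the matrices and helper names are arbitrary choices for the illustration.

```python
from functools import reduce


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def chain(*mats):
    """Product of several matrices, left to right."""
    return reduce(matmul, mats)


def combine(terms):
    """Sum of (sign, matrix) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(s * m[i][j] for s, m in terms) for j in range(cols)] for i in range(rows)]


def anti(a, b):
    """Anticommutator [a, b]_+ = ab + ba."""
    return combine([(1, matmul(a, b)), (1, matmul(b, a))])


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
D = [[1, -1], [0, 2]]

lhs = combine([(1, chain(A, B, C, D)), (-1, chain(C, D, A, B))])
rhs = combine([
    (-1, chain(anti(A, C), B, D)),
    (1, chain(A, anti(B, C), D)),
    (-1, chain(C, anti(A, D), B)),
    (1, chain(C, A, anti(B, D))),
])
print(lhs, lhs == rhs)
```

For elementary creation and annihilation operators the anticommutators are numbers (Kronecker deltas or zero), which is what makes this decomposition convenient for commutators of excitation operators.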

Answer 3.3

Use relation 3.6

Answer 3.4

Use (3:8), which can be written as:

\[
E_{ij}E_{kl} - \delta_{jk}E_{il} = E_{kl}E_{ij} - \delta_{il}E_{kj}
\]

thus $P_{ijkl} = P_{klij}$. For real functions we have

thus $P_{ijkl} = P_{lkji} = P_{jilk}$ Q. E. D. It follows that $P_{ijkl} = P_{klij} = P_{lkji} = P_{jilk}$ for real wave
functions.

Answer 3.5

Assume W is the unitary matrix that diagonalizes U to D,

\[
W^\dagger U W = D
\]

with the elements $D_{ij} = \delta_{ij}\exp(i\theta_i)$. Thus $D = \exp(i\theta)$ and

\[
U = W D W^\dagger = \exp(iW\theta W^\dagger) = \exp(T)
\]

with $T = iW\theta W^\dagger$ (T is anti-Hermitian: $T^\dagger = -iW\theta W^\dagger = -T$).

Answer 3.6

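Taylor expand the exponential expression of U and separate the summation into a sine-like
and a cosine-like contribution,

\[
U = \exp(T) = \sum_{k=0}^{\infty}\frac{1}{k!}T^k = \sum_{k=0}^{\infty}\frac{1}{(2k+1)!}T^{2k+1} + \sum_{k=0}^{\infty}\frac{1}{(2k)!}T^{2k}
\]
Manipulate the arguments further by noting that
\[
T^2 = -\theta^2\,\mathbf{1}
\]

Use this identity to reexpress the arguments in the Taylor expansion, i.e.

Insert this in the Taylor expansion and identify the corresponding Taylor expansions for
a sine and a cosine function (in the latter case we extract the scalar θ from the
anti-symmetric matrix T and put it inside the summation),

\[
U = T\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k+1)!}\theta^{2k} + \mathbf{1}\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k)!}\theta^{2k}
  = \begin{pmatrix} 0 & \sin\theta \\ -\sin\theta & 0 \end{pmatrix}
  + \begin{pmatrix} \cos\theta & 0 \\ 0 & \cos\theta \end{pmatrix}
  = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\]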
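The closed form can be confirmed by summing the Taylor series directly. The sketch below assumes the 2x2 anti-symmetric matrix T with elements $T_{12} = \theta$ and $T_{21} = -\theta$ and compares the truncated series with the rotation matrix; the value of θ and the number of terms are arbitrary choices.

```python
from math import cos, sin


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def exp_taylor(t, n_terms=40):
    """Sum of T**k / k! for k = 0 .. n_terms - 1."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, n_terms):
        term = [[x / k for x in row] for row in matmul2(term, t)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


theta = 0.7
T = [[0.0, theta], [-theta, 0.0]]
U = exp_taylor(T)
rotation = [[cos(theta), sin(theta)], [-sin(theta), cos(theta)]]
max_dev = max(abs(U[i][j] - rotation[i][j]) for i in range(2) for j in range(2))
print(U, max_dev)
```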

Answer 3.7

Assume

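$\epsilon_i < 0$ since the eigenvalues of an anti-Hermitian matrix (T) are purely imaginary.

\[
\exp(T) = \sum_{k=0}^{\infty}\frac{1}{(2k+1)!}T^{2k}T + \sum_{k=0}^{\infty}\frac{1}{(2k)!}T^{2k}
        = \sum_{k=0}^{\infty}\frac{1}{(2k+1)!}V\epsilon^{k}V^\dagger T + \sum_{k=0}^{\infty}\frac{1}{(2k)!}V\epsilon^{k}V^\dagger
\]

Since $\epsilon_i = -\theta_i^2$ we have

where we, after multiplying the first sum on the right hand side of the equation with $\theta_i/\theta_i$,
have $a_{ij} = \delta_{ij}\sin\theta_i/\theta_i$ and $b_{ij} = \delta_{ij}\cos\theta_i$.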

Answer 3.8

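Equation (3:41) defines the operator $\hat{S}$,

\[
\hat{S} = \sum_{K\neq 0} S_{K0}\left\{|K\rangle\langle 0| - |0\rangle\langle K|\right\}
\]
Since $\langle 0|K\rangle = 0$ for $K \neq 0$ we find
\[
\hat{S}|0\rangle = \sum_{K\neq 0} S_{K0}|K\rangle
\]
and since $\langle L|K\rangle = \delta_{LK}$ we also have that
\[
\hat{S}^2|0\rangle = \sum_{K\neq 0} S_{K0}\hat{S}|K\rangle
 = -\sum_{K\neq 0} S_{K0}\sum_{L\neq 0} S_{L0}\langle L|K\rangle|0\rangle
 = -\sum_{K\neq 0} S_{K0}^2|0\rangle = -\theta^2|0\rangle
\]
From this we obtain

Now expand the operator $\exp\hat{S}$ in even and odd powers of $\hat{S}$:

\[
\exp(\hat{S})|0\rangle
 = \left\{\sum_{k=0}^{\infty}\frac{1}{(2k)!}\hat{S}^{2k} + \sum_{k=0}^{\infty}\frac{1}{(2k+1)!}\hat{S}^{2k+1}\right\}|0\rangle
 = \left\{\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k)!}\theta^{2k} + \sum_{k=0}^{\infty}\frac{(-1)^k}{(2k+1)!}\theta^{2k+1}\frac{1}{\theta}\hat{S}\right\}|0\rangle
 = \left\{\cos\theta + \frac{\sin\theta}{\theta}\hat{S}\right\}|0\rangle
\]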

Answer 3.9

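The operator for the z component of the spin is expressed as

\[
\hat{S}_z = \sum_{i,j} s_{ij}\,\hat{a}_i^\dagger\hat{a}_j
\]

where the sum runs over all spin orbital pairs. To compute $s_{ij}$ we calculate the matrix
elements $\langle\phi_i\zeta_i|s_z|\phi_j\zeta_j\rangle$ where $\phi_i$ is an MO and $\zeta_i$ is the spin function (α or β). Keeping
in mind that

we will have the following matrix elements

\[
\langle\phi_i\alpha|s_z|\phi_j\alpha\rangle = \tfrac{1}{2}\delta_{ij}
\]
\[
\langle\phi_i\alpha|s_z|\phi_j\beta\rangle = 0
\]
\[
\langle\phi_i\beta|s_z|\phi_j\beta\rangle = -\tfrac{1}{2}\delta_{ij}
\]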
Thus, only diagonal elements in the sum over spin orbital pairs will survive

(13)

where the sum now runs over MO's. For $\hat{S}^2$ we use the relation

(14)

So we need the representations of $\hat{S}_+$ and $\hat{S}_-$ in Fock space. For one electron we have:

\[
s_+\phi_i\alpha = 0, \qquad s_-\phi_i\alpha = \phi_i\beta
\]
\[
s_+\phi_i\beta = \phi_i\alpha, \qquad s_-\phi_i\beta = 0
\]
From this follows immediately the representation in Fock space:

(15)

Inserting eqs. 13 and 15 into eq. 14 finally yields $\hat{S}^2$, for example:

52 = ~ ~ Eii Ejj + 4~ Eii + ~ at (ajaaiP + a;aajp) alp


I,,] ",]

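but since $\sum_i E_{ii} = N$ (N = number of electrons) we can rewrite the expression as

Now investigate whether these spin operators commute with the excitation operator $E_{ij}$. First
explore this for $\hat{S}_z$, where we have

consider the first term of the spin operator

the same holds for the second term. Hence, $[E_{ij}, \hat{S}_z] = 0$. It immediately follows from
$[A, BC] = B[A, C] + [A, B]C$ that $[E_{ij}, \hat{S}_z^2] = 0$. We also note that

\[
[E_{ij}, \hat{S}_+] = \sum_k [E_{ij}, \hat{a}_{k\alpha}^\dagger\hat{a}_{k\beta}]
 = \sum_k\left\{\hat{a}_{k\alpha}^\dagger[E_{ij}, \hat{a}_{k\beta}] + [E_{ij}, \hat{a}_{k\alpha}^\dagger]\hat{a}_{k\beta}\right\}
 = \sum_k\left\{-\delta_{ki}\hat{a}_{k\alpha}^\dagger\hat{a}_{j\beta} + \delta_{kj}\hat{a}_{i\alpha}^\dagger\hat{a}_{k\beta}\right\} = 0
\]

and also $[E_{ij}, \hat{S}_-] = 0$. It follows then from the arguments mentioned above that
$[E_{ij}, \hat{S}^2] = 0$.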

Answer 4.1

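We start by considering eq. (4:9)

\[
g_{ij} = \langle 0|[H, E_{ij} - E_{ji}]|0\rangle
 = \langle 0|HE_{ij}|0\rangle - \langle 0|HE_{ji}|0\rangle - \langle 0|E_{ij}H|0\rangle + \langle 0|E_{ji}H|0\rangle
 = 2\langle 0|H(E_{ij} - E_{ji})|0\rangle
\]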
This expression is only non-zero (when 10) is the CS HF determinant) if j is occupied
and i is virtual (or vice versa). For this case we obtain (i occupied orbital; a is a virtual
orbital),

(OI[H, E,,;]lO) = 0
To compute the commutator: we start from

\[
H = \sum_{p,q} h_{pq}E_{pq} + \frac{1}{2}\sum_{p,q,r,s}(pq|rs)\left\{E_{pq}E_{rs} - \delta_{qr}E_{ps}\right\}
\]

Start with the one-electron term and use:

[EI'9' E,,;] = OaqEpi -.oi"Eaq


since EpiIO) = 20",10) and (OIEaq = 0 we obtain:
L h"q(OI[EI'9' Eai]10) = 2hia
".q
For the two-electron part we use the commutator relation

[E"qEr., E,,;] = EI'9[Er., Ea;] + [EI'9' Ea;]Er$



For the matrix elements we obtain using the rules above:

c5ri(O IEl'q E... 10) + 2c5C1qc5pi(0IEr.10)


= 4c5... c5ric5~ c5ri {OI[EI'9' E".] 10) + 4c5..qSpiS~.
= 4c5...c5ric5~q 2c5ric5..qc5~. + 4S..qSpiS~.
where the symbol 6~q means a Kronecker delta if p and q are both occupied, but zero
if they are not. The second term in the two-electron part of the Hamiltonian gives the
contribution

The total two-electron contribution is now

4 ~)kklia) - 2 2:(kalik) + 4 2:(ialkk) - 22:(iklka)


Ie Ie Ie Ie

Adding the one-electron contribution we obtain

which proves eq. (4:46). Plugging the result into eq. (4:9) also proves eq. (4:41)

the second term here is trivially identical to zero. Hence, the Brillouin condition, which
states that the gradient with respect to orbital changes is zero, can be expressed as $F_{ai} = 0$.
Suppose that the Fock matrix in AO basis is $F^{AO}$. The Brillouin condition then gives:

\[
c_a^\dagger F^{AO} c_i = 0
\]

where $c_p$ is the coefficient vector of the MO $\phi_p$. Consider the general transformation of
an occupied orbital with the coefficient vector $c_i$,

where S is the AO overlap matrix and the sum runs over all MO's. Multiply with $c_p^\dagger$ and
use the orthogonality condition for the MO's ($c_p^\dagger S c_q = \delta_{pq}$) to obtain:
\[
F^{AO}c_i = \sum_j S c_j\,\epsilon_{ji} \quad\text{with}\quad \epsilon^\dagger = \epsilon
\]

By transforming the occupied MO's among themselves we can obtain ε in diagonal form

\[
F^{AO}c_i' = S c_i'\,\epsilon_i \qquad \text{(the HF equations)}
\]


Figure 19: Graphical representation of the bracketing theorem.

Answer 4.2
Consider the equation (4:31) with a diagonal Hessian (the primes are deleted here):

\[
\sum_i g_i p_i = E \qquad \text{(first row in (4:31))}
\]
\[
g_i + \epsilon_i p_i = E p_i \qquad \text{(row } i+1 \text{ in (4:31))}
\]

with the solution for the energy:

\[
E = \sum_i \frac{g_i^2}{E - \epsilon_i} = f(E)
\]

The betweenness condition (see Fig. 19) is immediately clear from the graphical
representation.
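The bracketing can also be seen numerically. The sketch below solves $E = f(E)$ by bisection in each interval between consecutive poles of f, assuming distinct Hessian eigenvalues sorted in increasing order and non-zero gradient components; the gradient and eigenvalue data are made up for the illustration, and the outer bounds use the fact that the roots are the eigenvalues of the bordered matrix with 0 and the $\epsilon_i$ on the diagonal.

```python
from math import sqrt


def h(energy, g, eps):
    """h(E) = E - f(E), with f(E) = sum_i g_i**2 / (E - eps_i)."""
    return energy - sum(gi * gi / (energy - ei) for gi, ei in zip(g, eps))


def bisect(func, lo, hi, tol=1e-12):
    """Root of an increasing function with func(lo) < 0 < func(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def augmented_hessian_roots(g, eps, delta=1e-9):
    """One root below the lowest pole, one between each pair of poles, one above."""
    bound = sqrt(sum(gi * gi for gi in g)) + 1.0
    edges = [min(0.0, eps[0]) - bound] + list(eps) + [max(0.0, eps[-1]) + bound]
    roots = []
    for k in range(len(edges) - 1):
        lo = edges[k] + (delta if k > 0 else 0.0)                  # step off the pole
        hi = edges[k + 1] - (delta if k < len(edges) - 2 else 0.0)
        roots.append(bisect(lambda e: h(e, g, eps), lo, hi))
    return roots


g = [0.1, 0.2]        # made-up gradient components
eps = [1.0, 2.0]      # made-up (sorted, distinct) Hessian eigenvalues
roots = augmented_hessian_roots(g, eps)
print(roots)
```

The lowest root lies below both zero and the lowest Hessian eigenvalue, and the remaining roots are separated by the $\epsilon_i$, as in the graphical construction.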

Answer 4.3
The closed shell HF Hessian is most easily obtained from (4:53) by assuming that there
are no inactive electrons and that all active orbitals are doubly occupied.

Now since we have a closed shell HF



FIb = hab
Ftu = htu + L {2(tulvv) - (tvluv)}

Inserting in eq. (4:53) yields

H!;::l = 20tuFab - 20abFtu + 4(atlbu) - (abltu) - (aulbt)

Answer 4.4

The overlap matrix is:

where $D_{pq} = \langle 0|E_{pq}|0\rangle$ is the first order reduced density matrix (the 1-matrix) and
$P_{pqrs} = \frac{1}{2}\langle 0|E_{pq}E_{rs} - \delta_{qr}E_{ps}|0\rangle$ is the corresponding second order reduced density
matrix (the 2-matrix). Using eq. (3:22) we can simplify this to:

(OIE;E~IO) = 2 {oprDq. - 2Ppgr. - ogrDp. - cp.Dgr + 2Ppqsr + oq.Dpr}


which can be simplified further for a CASSCF wave function, where the only pq and rs
combinations are: it, ia, or ta (i: inactive, t: active, and a: secondary).

Answer 4.5

Prove first these relations:

and from them:

The Hamiltonian is:

\[
\hat{H} = \sum_{k,l} h_{kl}E_{kl} + \frac{1}{2}\sum_{k,l,m,n}(kl|mn)\left\{E_{kl}E_{mn} - \delta_{lm}E_{kn}\right\}
\]

The one-electron contribution to $F_{ij\sigma}$ is


where k and l have been interchanged in the last summation ($h_{kl} = h_{lk}$), and upon
interchanging the order of the first term we get

:E hl<j {Oil< - cLaw + aLaI<,; }


I<
The last two terms cancel in $\langle 0|F_{ij\sigma}|0\rangle$ and the total one-electron contribution is $h_{ij}$.
The two-electron contribution is:

~ E (kllmn){ a...aL.EI<,6"i +cwaLEm"c5,;


k.',m,"

"t" E"m,,°l<;
+awa,.. ..

The third and the seventh term cancel. Hence, we are left with six terms. Permute the
indices and identify that the first and second terms, the fourth and fifth, and the sixth and
eighth terms are identical. Hence, we have

:E (kllmn) {cwal.,EI<IO,,; + aLa_Eldom; - aLa1,;om;C",,}


kJ,m.,n

:E (kllmn) {OimEI<,O"i - Emi.. EI<IC"i + Ei.... EI<,Cmi - EkivCmlO,,;}


k,l,m.,n

Next, average over spin, $\frac{1}{2}\sum_\sigma E_{pq\sigma} = \frac{1}{2}E_{pq}$, and we get

Using the identity



on the third term and permute indices again to see that the second and the last term
cancel out and we are left with

E (kllmn) {6imE",6n; - ~Ekn6li6mj}


k,l.m,n.

Form the matrix element $\langle 0|\cdots|0\rangle$ of this expression, and do some additional index
manipulation, and the two-electron contribution is expressed as

\[
\sum_{k,l} D_{kl}\left\{(kl|ij) - \tfrac{1}{2}(ki|jl)\right\}
\]

The total matrix element is then

\[
f_{ij} = h_{ij} + \sum_{k,l} D_{kl}\left\{(kl|ij) - \tfrac{1}{2}(ki|jl)\right\}
\]

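Now calculate the expectation value of $F_{ii\sigma}$!

Case 1: The spin-orbital $\phi_{i\sigma}$ is occupied, that is $\hat{a}_{i\sigma}^\dagger|0\rangle = 0$, $\langle 0|\hat{a}_{i\sigma} = 0$, $\hat{a}_{i\sigma}^\dagger\hat{a}_{i\sigma}|0\rangle = |0\rangle$ and
$\langle 0| = \langle 0|\hat{a}_{i\sigma}^\dagger\hat{a}_{i\sigma}$. Thus only the last term in $F_{ii\sigma}$ contributes:
\[
\langle 0|F_{ii\sigma}|0\rangle = -\langle 0|\hat{a}_{i\sigma}^\dagger[\hat{H}, \hat{a}_{i\sigma}]|0\rangle
 = -\langle 0|\hat{a}_{i\sigma}^\dagger\hat{H}\hat{a}_{i\sigma}|0\rangle + \langle 0|\hat{a}_{i\sigma}^\dagger\hat{a}_{i\sigma}\hat{H}|0\rangle
 = -\langle\Psi_{i\sigma}^{+}|\hat{H}|\Psi_{i\sigma}^{+}\rangle + \langle 0|\hat{H}|0\rangle
 = -IP_{i\sigma}
\]
where $|\Psi_{i\sigma}^{+}\rangle$ is the wave function for the positive ion state, where an electron has been
removed from spin-orbital $\phi_{i\sigma}$.
Case 2: The spin-orbital $\phi_{i\sigma}$ is empty, that is $\hat{a}_{i\sigma}|0\rangle = 0$, $\langle 0|\hat{a}_{i\sigma}^\dagger = 0$, and $\hat{a}_{i\sigma}\hat{a}_{i\sigma}^\dagger|0\rangle = |0\rangle$.
Only the first term in $F_{ii\sigma}$ contributes:

\[
\langle 0|F_{ii\sigma}|0\rangle = \langle 0|\hat{a}_{i\sigma}\hat{H}\hat{a}_{i\sigma}^\dagger|0\rangle - \langle 0|\hat{H}|0\rangle
 = \langle\Psi_{i\sigma}^{-}|\hat{H}|\Psi_{i\sigma}^{-}\rangle - \langle 0|\hat{H}|0\rangle
 = EA_{i\sigma}
\]
where $|\Psi_{i\sigma}^{-}\rangle$ is the wave function for the negative ion state with one electron added to
spin-orbital $\phi_{i\sigma}$.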

Answer 5.1

but since both i and j are active orbitals we obtain for a CAS wave function

\[
E_{ij}|0\rangle = \sum_K C_K|K\rangle
\]

where $|K\rangle$ are the eigenstates of the CAS CI secular problem. Hence we get (for real
functions)

\[
\sum_K C_K\left\{\langle 0|H|K\rangle - \langle K|H|0\rangle\right\}
\]

Exercise 5.2

No solution to this problem is provided.

Exercise 5.3

No solution to this problem is provided.

Exercise 5.4

No solution to this problem is provided.

