Lecture Notes in Quantum Chemistry II 1994 PDF
Edited by:
Prof. Dr. Gaston Berthier
Université de Paris
Prof. Dr. Michael J. S. Dewar
The University of Texas
Prof. Dr. Hanns Fischer
Universität Zürich
Prof. Dr. Kenichi Fukui
Kyoto University
Prof. Dr. George G. Hall
University of Nottingham
Prof. Dr. Jürgen Hinze
Universität Bielefeld
Prof. Dr. Joshua Jortner
Tel-Aviv University
Prof. Dr. Werner Kutzelnigg
Universität Bochum
Prof. Dr. Klaus Ruedenberg
Iowa State University
Prof. Dr. Jacopo Tomasi
Università di Pisa
B. O. Roos (Ed.)
Lecture Notes
in Quantum Chemistry II
European Summer School
in Quantum Chemistry
Introduction
Björn O. Roos, Editor
Jan Almlöf
Department of Chemistry
University of Minnesota
Minneapolis, MN 55455, USA
Contents:
1. Introduction.
2. The Born-Oppenheimer Approximation.
3. Determinant Wavefunctions and the Pauli Principle.
4. Expectation Values With a Determinant Wavefunction.
5. The Hartree-Fock Equations.
6. Spin- and Space Orbitals. Unrestricted Hartree-Fock Theory.
7. Closed-shell Hartree-Fock Theory.
8. Restricted Open-Shell Hartree-Fock Theory.
9. The LCAO Expansion.
10. The Roothaan-Hall Equations.
11. The Self-Consistent Field Procedure.
12. Solution of the Roothaan-Hall Equations.
13. The Supermatrix Formalism.
14. Direct SCF Techniques.
15. Basis sets.
16. Integral Evaluation.
17. Prescreening of Integrals.
18. The Gaussian Product Basis.
19. Approximate Three-Center Expansions.
20. The Semi-Classical Limit.
21. Fermi- and Coulomb Correlation.
22. The Fock Matrix in an Orbital Basis. Koopmans' Theorem.
23. Matrix Elements With Slater Determinants. Brillouin's Theorem.
24. Charge Density and Population Analysis.
25. Closing Remarks.
Appendix A. Notations for integrals.
Appendix B. Parallel Implementations of Hartree-Fock Methods.
1. Introduction.
(2.1a)
(2.lb)
and
$$ \nabla_a^2 = \frac{\partial^2}{\partial x_a^2} + \frac{\partial^2}{\partial y_a^2} + \frac{\partial^2}{\partial z_a^2}. \qquad (2.1c) $$
In (2.1), the indices a, b, ... label all the particles of the system regardless of their
nature, $m_a$ are the masses of these particles, $\mathbf{r}_a$ their positions and $Q_a$ their charges.
Distinguishing between nuclear coordinates {R}, with indices μ, ν, ..., and electronic
coordinates {r}, denoted i, j, ..., we can now rewrite the expression for the
Hamiltonian (2.1) as ²
$$ H = T_{nuc} + H_{el}, \qquad (2.2) $$
with
(2.3)
1 We use a script font for quantities that symbolize general many-electron operators.
2 To simplify the notation, atomic units have been used here and throughout the presentation.
and
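$$ H_{el} = -\frac{1}{2}\sum_i \nabla_i^2 - \sum_{\mu,i} \frac{Q_\mu}{|\mathbf{r}_\mu - \mathbf{r}_i|} + \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\mu<\nu} \frac{Q_\mu Q_\nu}{|\mathbf{r}_\mu - \mathbf{r}_\nu|} \qquad (2.4) $$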
Notice here that the separation is not symmetric in electrons and nuclei! The
electronic Hamiltonian depends parametrically on the nuclear positions - the nuclear
coordinates appear in the electronic Hamiltonian, but derivatives with respect to these
coordinates do not - and the electronic problem can therefore be solved for nuclei
which are momentarily clamped to fixed positions in space:
(2.5)
Note that the electronic energy and wavefunction are still functions of the nuclear
coordinates R. We now approximate the total wavefunction as a product.
(2.6)
(2.7)
where $\nabla_\mu$ is the gradient operator for the coordinates of nucleus μ. The important thing
to notice here is that the electronic wavefunction still depends on the nuclear
coordinates, and that the nuclear kinetic energy operator thus affects it. However,
although this dependence is formally present, it is usually insignificant due to the large
difference in mass between nuclei and electrons, which makes the former move much
more slowly than the latter. In other words, the electronic wavefunction normally
varies little upon small changes in the nuclear positions, and the terms containing
$\nabla_\mu \Psi_{el}(r,R)$ and $\nabla_\mu^2 \Psi_{el}(r,R)$ can therefore be neglected, leading to:
(2.10)
3 Since $H_{el}$ does not differentiate with respect to the nuclear coordinates, it commutes with $\Psi_{nuc}(R)$.
From the above discussion, we realize the need for a 'quick and dirty' method
that can provide qualitatively correct, approximate solutions to the many-electron
Schrödinger equation. The Hartree-Fock method provides such a solution, at the same
time as it also satisfies our need for a manageable model of the electronic structure of
many-electron systems.
The separation of variables which are not strongly interdependent worked well
in the Born-Oppenheimer approximation discussed above. (Incidentally, it is also
essential in solving the nuclear motion problem defined by (2.7), which is the key
equation in theoretical vibrational spectroscopy.) An analogous trial wavefunction for a
many-electron system would be a product of one-electron wavefunctions, a Hartree
product:
(3.1 )
In (3.1) the functions $\varphi_i$ are wavefunctions describing a single electron.⁴ This
model was used in the early days of quantum mechanics to carry out crude calculations
of the electronic structure of atoms, but it has some obvious deficiencies.
4 Here, and throughout the presentation, we use 'n' to represent the number of electrons in a
many-electron system.
For a start, we note that electrons are identical particles, a fact which ought to be
reflected by the wavefunction. If two identical particles "i" and "j" were to be
interchanged, there should be no detectable change in any of the observable properties
of the system. In particular, the probability density, as defined by the square amplitude
of the wavefunction, cannot be allowed to change upon such a manipulation. A very
basic requirement on any reasonable trial wavefunction should therefore be to satisfy
the condition $P_{ij}\Psi = \pm\Psi$ for any two identical particles i and j, where $P_{ij}$ is the
permutation operator interchanging the coordinates of these two particles. At this point,
we cannot decide from first principles which sign would be appropriate, or even
whether the sign matters. We accept as a postulate (supported by experiment) that
(3.2)
To proceed, we note that any two-electron function $\Psi(1,2)$ can be written as
i.e.,
$$ \Psi = \Psi_{symm} + \Psi_{antisymm} \qquad (3.3b) $$
(3.4) and (3.5) can be viewed as projections, which project out the totally
antisymmetric part of any function $\Psi$. In general:
The sum in (3.6) is over all possible permutations P of electrons 1, 2, ... n, and p is the
parity of the permutation P. Application of the antisymmetrizer
$$ A = \sum_P (-1)^p P \qquad (3.7) $$
to the Hartree product (3.1) thus leads to a sum of terms with alternating signs,
containing a total of n! different products of orbitals. We notice that the definition (3.6)
is identical to that of a determinant:
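$$ D(1,2,\ldots n) = \begin{vmatrix} \varphi_1(1) & \varphi_2(1) & \cdots & \varphi_n(1) \\ \varphi_1(2) & \varphi_2(2) & \cdots & \varphi_n(2) \\ \vdots & \vdots & & \vdots \\ \varphi_1(n) & \varphi_2(n) & \cdots & \varphi_n(n) \end{vmatrix} \qquad (3.8) $$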
(3.9)
for a Slater determinant. The essence of the Hartree-Fock method is thus that the
wavefunction is written as a determinant of one-electron orbitals. For simplicity we
assume that the orbitals are orthonormal, i.e.
(3.10a)
(a unit matrix), or
(3.10b)
We will show later (5.10 - 5.16) that this assumption does not restrict the generality of
the approach.
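The equivalence between the antisymmetrized Hartree product and a determinant can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the three one-dimensional "orbitals" and the sample coordinates are hypothetical choices made purely for illustration. It expands the sum over all n! permutations with alternating signs, and then checks that interchanging two electrons flips the sign of the wavefunction and that using the same orbital twice makes it vanish.

```python
from itertools import permutations
from math import exp, factorial, sqrt


def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def slater(orbitals, coords):
    """Antisymmetrized Hartree product with the (n!)**-0.5 normalization factor.

    orbitals: list of n one-electron functions
    coords:   list of n electron coordinates
    """
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = 1.0
        # Orbital k is evaluated at the coordinate of electron perm[k].
        for orbital_index, electron in enumerate(perm):
            term *= orbitals[orbital_index](coords[electron])
        total += parity(perm) * term
    return total / sqrt(factorial(n))


# Hypothetical one-dimensional "orbitals", chosen only for illustration.
phi = [
    lambda x: exp(-x * x),
    lambda x: x * exp(-x * x),
    lambda x: (2.0 * x * x - 1.0) * exp(-x * x),
]

original = slater(phi, [0.3, -0.7, 1.1])
swapped = slater(phi, [-0.7, 0.3, 1.1])   # electrons 1 and 2 interchanged
pauli = slater([phi[0], phi[0], phi[2]], [0.3, -0.7, 1.1])  # one orbital used twice

print(original, swapped, pauli)
```

The brute-force sum over permutations scales as n!, so it is only useful for very small n; it is meant to mirror the definition, not to be an efficient way of evaluating a determinant.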
We now investigate some properties of the determinant wavefunction. From
the above. we know that we can expand the determinant as a sum of products:
5 We will be using this "shadowed" font whenever we refer to a matrix or vector in n dimensions.
$$ \Psi = C \sum_P^{n!} (-1)^p P \{\varphi_1(1)\varphi_2(2)\varphi_3(3)\cdots\varphi_n(n)\} \qquad (3.11) $$
The sum in (3.11) contains all possible permutations of all the electrons. However, we
could also view the ordering of the electrons as fixed, and instead permute the orbitals
in all possible ways. As we sum over all possible permutations in both cases, the
results would be the same, but the latter approach is sometimes a more useful way of
viewing the Slater determinant. A characteristic term in the expansion is then:
(3.12)
where {a, b, ... k} is a permutation of the numbers {1, 2, ... n}. We can now write
the normalization or "self-overlap" integral
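$$ \langle\Psi|\Psi\rangle = \int D(1,2,\ldots n)^* \, D(1,2,\ldots n) \, d\mathbf{r}_1 \, d\mathbf{r}_2 \cdots d\mathbf{r}_n \qquad (3.13a) $$
as:
$$ \langle\Psi|\Psi\rangle = C^2 \sum_P^{n!} \sum_{P'}^{n!} (-1)^{p+p'} \langle P\{\varphi_1(1)\varphi_2(2)\cdots\varphi_n(n)\} \,|\, P'\{\varphi_1(1)\varphi_2(2)\cdots\varphi_n(n)\}\rangle \qquad (3.13b) $$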
Here, it is useful to keep in mind that the electronic coordinates are "dummy" variables
- since we integrate over them, we can call them anything we want, as long as we are
naming them in a consistent manner. Furthermore, we note that applying the same
permutation to the electrons and the orbitals would result in a Hartree product identical
to the original one, and the effect of a permutation P on the electrons is therefore equal
to that of the inverse permutation $P^{-1}$ on the orbitals. One term in the sum (3.13b) can
be written as:
(3.14b)
The left-hand side is the original Hartree product - we just have to reorder the orbitals.
The right-hand side is the result of applying the permutation P to the electrons and P'
to the orbitals. The former operation is equivalent to applying $P^{-1}$ to the orbitals, and
we can therefore write (3.14b) as:
(3.14c)
(3.15a)
<PP) <Pz(2)
<PP) (3.16)
where
(4.2a)
is a trivial additive constant for fixed nuclear positions, which we will usually leave
outside the discussion in the rest of this Chapter;
describes the interaction between two particles with unit charge.⁶ Here again, we recall
that the indices "i" and "j" label the electrons, "μ" and "ν" the nuclei. In evaluating the
energy as an expectation value of the Hamiltonian, $E = \langle\Psi|H|\Psi\rangle$, we will need the
integrals $\langle\Psi|\sum h_i|\Psi\rangle$ and $\langle\Psi|\sum g_{ij}|\Psi\rangle$, in addition to the overlap integral encountered
in Section 3. For the determinant wavefunction, the one-electron integral is given by:
n
[L hi]P' {<i>I<I)<i>2(2)···<i>n(n)}} (4.3)
i
Just like in the case of the normalization integral (3.15), we apply the inverse
permutation $P^{-1}$ and re-label the integration variables:
n
[p-IL hi]P -lp'{<i>l(1)<i>2(2)··.<i>n(n)}} (4.4)
i
6 Note that there is no direct physical interaction between elementary particles involving more than
two particles. This is in contrast to the interaction between more complex objects such as atoms or
molecules, where many-body interaction effects are prevalent.
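$$ \langle\Psi|\sum_i^n h_i|\Psi\rangle = \sum_Q^{n!} (-1)^q \langle\{\varphi_1(1)\varphi_2(2)\cdots\varphi_n(n)\}|\,[\sum_i^n h_i]\,Q\{\varphi_1(1)\varphi_2(2)\cdots\varphi_n(n)\}\rangle $$
$$ = \sum_i^n \sum_Q^{n!} (-1)^q \langle\varphi_1|\varphi_{a_1}\rangle_1 \langle\varphi_2|\varphi_{a_2}\rangle_2 \langle\varphi_3|\varphi_{a_3}\rangle_3 \cdots \langle\varphi_i|h_i|\varphi_{a_i}\rangle_i \cdots \langle\varphi_n|\varphi_{a_n}\rangle_n \qquad (4.5) $$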
n n n
('PI. hi'll) = ~ (<Pj hj <Paj)i =
I I I
r,
(<Pi h <P ai ) (4.6)
7 The notation used for this integral and many others is discussed more fully in Appendix A.
A term with i = j in (4.7) would make no physical sense, since it would essentially
pretend to describe a "self-interaction" of the electron. However, it is easily seen that
such a term would vanish, and it is thus convenient to rewrite (4.7) as
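$$ \langle\Psi|\sum_{i<j} g_{ij}|\Psi\rangle = \frac{1}{2}\sum_{i,j}^n \{\langle\varphi_i\varphi_j|g|\varphi_i\varphi_j\rangle - \langle\varphi_i\varphi_j|g|\varphi_j\varphi_i\rangle\} \qquad (4.8) $$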
(4.9)
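$$ \langle\Psi|\sum_{i<j} g_{ij}|\Psi\rangle = \frac{1}{2}\sum_i^n\sum_j^n \langle\varphi_i|\{J_j - K_j\}|\varphi_i\rangle = \frac{1}{2}\sum_i^n \langle\varphi_i|\{J - K\}|\varphi_i\rangle \qquad (4.10) $$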
where we have introduced the Coulomb and exchange operators $J_j$ and $K_j$, defined
through their action on an arbitrary function 9( I) such that
(4.11)
(4.12)
$$ K = \sum_i^n K_i \qquad (4.13b) $$
The expectation value of the Hamiltonian with the determinant wavefunction is thus:
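$$ E(\Psi) = \langle\Psi|H|\Psi\rangle = \sum_i^n \langle\varphi_i|h|\varphi_i\rangle + \frac{1}{2}\sum_i^n \langle\varphi_i|(J - K)|\varphi_i\rangle $$
$$ = \sum_i^n \langle\varphi_i|h|\varphi_i\rangle + \frac{1}{2}\sum_{i,j}^n \{\langle\varphi_i\varphi_j|g|\varphi_i\varphi_j\rangle - \langle\varphi_i\varphi_j|g|\varphi_j\varphi_i\rangle\} \qquad (4.14) $$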
which is the desired result - an expression for the total (electronic) energy for a Slater
determinant wavefunction, evaluated as a proper expectation value of the Hamiltonian.
Notice that within the Born-Oppenheimer model the Hamiltonian used here is
exact: the only approximation introduced is that of a single-determinant wavefunction.
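The energy expression (4.14), and the cancellation of the i = j "self-interaction" terms that allows the unrestricted double sum, can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the one-electron values in `H_DIAG` and the factorized two-electron model in `g` are hypothetical numbers standing in for real integrals, and the function names are not taken from the notes.

```python
# Hypothetical integrals for n = 3 orthonormal spin-orbitals (illustration only).
H_DIAG = [-1.0, -0.8, -0.5]                      # <phi_i|h|phi_i>
U = [[0.6, 0.1, 0.05], [0.1, 0.5, 0.08], [0.05, 0.08, 0.4]]


def g(i, j, k, l):
    """Toy model for <phi_i phi_j|g|phi_k phi_l>: electron 1 in (i, k), electron 2 in (j, l)."""
    return U[i][k] * U[j][l]


def energy_full_sum(n):
    """Eq. (4.14): unrestricted double sum with the factor 1/2."""
    one = sum(H_DIAG[i] for i in range(n))
    two = 0.5 * sum(
        g(i, j, i, j) - g(i, j, j, i)    # Coulomb minus exchange
        for i in range(n)
        for j in range(n)
    )
    return one + two


def energy_pair_sum(n):
    """Same energy written as a sum over distinct pairs i < j."""
    one = sum(H_DIAG[i] for i in range(n))
    two = sum(
        g(i, j, i, j) - g(i, j, j, i)
        for i in range(n)
        for j in range(i + 1, n)
    )
    return one + two


# The i = j Coulomb and exchange terms are identical, so they cancel exactly.
self_terms = [g(i, i, i, i) - g(i, i, i, i) for i in range(3)]
print(energy_full_sum(3), energy_pair_sum(3), self_terms)
```

Both functions return the same number, which is why the restriction i < j can be replaced by a factor 1/2 without introducing a spurious self-interaction.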
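$$ E(\Psi) \rightarrow E(\Psi) + \delta E(\Psi) = \sum_i^n \langle(\varphi_i+\delta\varphi_i)|h|(\varphi_i+\delta\varphi_i)\rangle $$
$$ + \frac{1}{2}\sum_{i,j}^n \langle(\varphi_i+\delta\varphi_i)(\varphi_j+\delta\varphi_j)|g|(\varphi_i+\delta\varphi_i)(\varphi_j+\delta\varphi_j)\rangle $$
$$ - \frac{1}{2}\sum_{i,j}^n \langle(\varphi_i+\delta\varphi_i)(\varphi_j+\delta\varphi_j)|g|(\varphi_j+\delta\varphi_j)(\varphi_i+\delta\varphi_i)\rangle \qquad (5.1) $$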
Terms in (5.1) of higher order than linear in $\delta\varphi$ can safely be neglected, and we get
for the variation of the energy:
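$$ \delta E(\Psi) = \sum_i^n \langle\delta\varphi_i|h|\varphi_i\rangle + \sum_{i,j}^n \{\langle\delta\varphi_i\varphi_j|g|\varphi_i\varphi_j\rangle - \langle\delta\varphi_i\varphi_j|g|\varphi_j\varphi_i\rangle\} + c.c. $$
$$ = \sum_i^n \langle\delta\varphi_i|h|\varphi_i\rangle + \sum_i^n \langle\delta\varphi_i|(J - K)|\varphi_i\rangle + c.c. = \sum_i^n \langle\delta\varphi_i|F|\varphi_i\rangle + c.c. \qquad (5.2) $$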
where c.c. denotes the complex conjugate.⁸ In (5.2) we have introduced the Fock
operator
$$ F = h + J - K \qquad (5.3) $$
8 Remember that we have to take the complex conjugate of the wavefunction to the left when we
evaluate expectation values of the form $\langle\Psi_1|OP|\Psi_2\rangle$.
thus need to be applied throughout the optimization. This can be done in several ways.
One of the more common approaches uses the technique of Lagrangian multipliers,
which we will assume known to the reader. Accordingly, we instead minimize the new
quantity
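$$ E = E(\Psi) - \sum_{i,j}^n \lambda_{ji} (\langle\varphi_i|\varphi_j\rangle - \delta_{ij}) \qquad (5.5) $$
with respect to the wavefunction parameters as well as the Lagrangian multipliers $\lambda_{ji}$:
$$ \delta E = \sum_i^n \langle\delta\varphi_i|F|\varphi_i\rangle - \sum_{i,j}^n \lambda_{ji} \langle\delta\varphi_i|\varphi_j\rangle + c.c. \qquad (5.6) $$
$$ = \sum_i^n \langle\delta\varphi_i|\{F\varphi_i - \sum_j^n \lambda_{ji}\varphi_j\}\rangle + c.c. \qquad (5.7) $$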
If the orbitals $\varphi_i$ are the "best" possible in the sense of the variation principle, then
$\delta E = 0$ for all possible variations $\delta\varphi_i$. This can only be the case if all the terms in
brackets vanish:
$$ F\varphi_i - \sum_j^n \lambda_{ji}\varphi_j = 0 \qquad (5.8) $$
or, in matrix notation:
(5.9)
The equations (5.8 - 5.9) are the legendary Hartree-Fock equations. They
establish a criterion for the orbitals giving the lowest energy for a system described by a
determinant wavefunction. However, they do not constitute the simplest possible form
for such a criterion. To see this, we consider a linear transformation of the orbitals:
$$ \varphi'_i = \sum_j^n W_{ji}\varphi_j \qquad (5.10) $$
$$ \varphi' = \varphi W \qquad (5.11) $$
To determine what effect the transformation (5.11) will have on the Slater
determinant, we define the n x n matrix A such that
(5.12)
We can now write the Slater determinant as
$$ \Psi(1,2,\ldots n) = (n!)^{-1/2} \det A \qquad (5.13) $$
$$ A'_{ki} = \varphi'_i(k) = \sum_j^n W_{ji}\varphi_j(k) = \sum_j^n W_{ji}A_{kj}, \qquad (5.14) $$
$$ A' = AW \qquad (5.15) $$
The wavefunction after this transformation would be:
where we have again been using some elementary linear algebra. But det W is merely
a constant (non-zero as long as the transformation is non-singular), and thus the
new wavefunction $\Psi'$ is essentially identical to $\Psi$ apart from a trivial renormalization!
This is an important result, with implications far beyond this course:
The Hartree-Fock wavefunction is invariant (except for a renormalization) to linear
transformations among the orbitals!
Furthermore, we note that if we require the orbitals to remain orthonormal under the
transformation, then W has to be a unitary matrix, |det W| = 1, and we don't even have
to worry about renormalization.
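The invariance argument rests on the product rule for determinants, det(AW) = det A det W. The short Python sketch below checks it numerically for a 3 x 3 case. The entries of `A` (standing in for orbital values $\varphi_i(k)$ at three electron positions) and of `W` are hypothetical numbers, and the rotation `R` is just one convenient example of a unitary (here real orthogonal) transformation.

```python
from itertools import permutations
from math import cos, sin


def parity(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant from the sum over permutations, as in the definition (3.6)."""
    total = 0.0
    for perm in permutations(range(len(m))):
        term = float(parity(perm))
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Hypothetical orbital values A[k][i] = phi_i(k) and a general transformation W.
A = [[0.9, 0.2, -0.4], [0.5, -0.7, 0.1], [0.3, 0.6, 0.8]]
W = [[2.0, 0.5, 0.0], [0.0, 1.0, -1.0], [1.0, 0.0, 3.0]]

# A rotation mixing the first two orbitals: an orthogonal matrix with det R = 1.
theta = 0.4
R = [[cos(theta), -sin(theta), 0.0], [sin(theta), cos(theta), 0.0], [0.0, 0.0, 1.0]]

general = det(matmul(A, W))   # wavefunction rescaled by det W
rotated = det(matmul(A, R))   # wavefunction unchanged
print(general, det(A) * det(W), rotated, det(A))
```

A general non-singular W only rescales the determinant by the constant det W, while the orthogonal rotation leaves it unchanged, in line with the statement above.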
We assumed in (3.10) that the orthonormality requirement would not be a
significant constraint on the wavefunction. We see now that this assumption was
justified - transforming the orbitals among themselves leaves the total wavefunction
and energy invariant, and it is thus only a matter of convenience.
It is now time to recall the Hartree-Fock equations (5.8) and (5.9). The
purpose of these equations was to define the orbitals leading to the lowest (and, thus,
the "best") energy for the Hartree-Fock wavefunction $\Psi$. However, since we just
showed that the wavefunction is invariant to linear transformations among the orbitals,
the same must be true for the energy $E(\Psi) = \langle\Psi|H|\Psi\rangle$. Accordingly, we can allow
ourselves any transformation of the orbitals, if that can simplify our working equations.
For that purpose we let the Lagrangian multipliers define the matrices V and $\varepsilon$ such
that:
$$ \sum_i^n \lambda_{ki}V_{ij} = \varepsilon_j V_{kj} \qquad (5.17) $$
or
(5.18)
$$ \varphi' = \varphi V $$
In matrix notation, (5.19) would read:
(5.20)
(5.22b)
(5.22c)
(5.23)
This relation can be obtained by multiplying (5.22a) from the left with $\varphi_i$ and
integrating. Finally, we note that $E \neq \sum_i^n \varepsilon_i$. Actually,
$$ E = \sum_i^n \varepsilon_i - \frac{1}{2}\sum_i^n \langle\varphi_i|(J - K)|\varphi_i\rangle \qquad (5.24) $$
In semi-empirical theories it is often assumed that the total electronic energy equals the
sum of one-electron energies, but clearly that is not the case here.
single electron is thus characterized by the behavior of the wavefunction when acted
upon by the two operators s2 and Sz. The eigenvalues of these operators are = ~ and
~ =",rCr ) ~. (6.lb)
where $\psi$ are space-orbitals, i.e. functions that depend only on the spatial coordinates
r = {x, y, z}, and where we have recognized the fact that the spatial part of a spin-
orbital may depend on whether the spin part is α or β.
As long as the Hamiltonian does not explicitly refer to spin it will commute with
these two operators, and the spin quantum numbers are therefore "constants of
motion", i.e., any of its eigenfunctions should also be an eigenfunction of the spin
operators. In the same way, a many-electron wavefunction which is an eigenfunction
of a spin-free Hamiltonian is also an eigenfunction for the total spin operators $S^2$ and
$S_z$. In other words, the exact wavefunction for a many-electron system must satisfy
the relations
9 Readers already familiar with this subject from other presentations may notice that the factor ħ is
missing: that is because we consistently use the atomic units system in which ħ = 1.
To apply the n-electron spin operators (6.2) and (6.3) to our determinants, they
must first be expressed in terms of the one-electron operators. The total spin
vector is just the vector sum of the contributions from each electron, $\mathbf{S} = \sum_i \mathbf{s}_i$, and thus
$$ S_z = \sum_i^n s_{zi} \qquad (6.4) $$
(6.5)
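$$ S_z\Psi = (n!)^{-1/2} \sum_P^{n!} (-1)^p S_z P\{\varphi_1(1)\varphi_2(2)\varphi_3(3)\cdots\varphi_n(n)\} $$
$$ = (n!)^{-1/2} \sum_i^n \sum_P^{n!} (-1)^p \{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots[s_{zi}\varphi_{a_i}(i)]\cdots\varphi_{a_n}(n)\} \qquad (6.6) $$
$$ S_z\Psi = (n!)^{-1/2} \sum_i^n \sum_P^{n!} (-1)^p \{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots\sigma_{a_i}\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\} $$
$$ = (n!)^{-1/2} \sum_P^{n!} (-1)^p [\sum_i^n \sigma_{a_i}] \{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\} $$
$$ = (n!)^{-1/2} [\sum_i^n \sigma_i] \sum_P^{n!} (-1)^p \{\varphi_{a_1}(1)\varphi_{a_2}(2)\cdots\varphi_{a_i}(i)\cdots\varphi_{a_n}(n)\} \qquad (6.7) $$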
Hartree products are) with eigenvalue $M_S = \sum_i \sigma_i = \frac{1}{2}(n_\alpha - n_\beta)$, where $n_\alpha$ and $n_\beta$ count
the α- and β-orbitals. The requirement (6.3) is therefore fulfilled whenever we use a
determinant as our approximate n-electron wavefunction. However, the determinant is
not necessarily an eigenfunction of $S^2$, and (6.2) may not be fulfilled.
The Hartree-Fock equations that were derived in Section 5 can be used with
spin orbitals, as long as we remember to integrate over the spin coordinates.
Fortunately, the rules for spin integration are very simple:
$$ \langle\alpha|\alpha\rangle = \langle\beta|\beta\rangle = 1; \qquad (6.8a) $$
$$ \langle\alpha|\beta\rangle = 0. \qquad (6.8b) $$
We may now introduce the substitutions (6.1) into the expressions for the
energy and Fock operator. With the expressions for one- and two-electron integrals
introduced in Appendix A, the energy in Eq (4.14) becomes:
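$$ E(\Psi) = \sum_i^n \langle\psi_i|h|\psi_i\rangle + \frac{1}{2}\sum_{i(\alpha)j(\alpha)}^n \{\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|g|\psi_j\psi_i\rangle\} $$
$$ + \frac{1}{2}\sum_{i(\beta)j(\beta)}^n \{\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|g|\psi_j\psi_i\rangle\} + \sum_{i(\alpha)j(\beta)}^n \langle\psi_i\psi_j|g|\psi_i\psi_j\rangle $$
$$ = \sum_i^n \langle i|h|i\rangle + \frac{1}{2}\sum_{i(\alpha)j(\alpha)}^n \langle ij||ij\rangle + \frac{1}{2}\sum_{i(\beta)j(\beta)}^n \langle ij||ij\rangle + \sum_{i(\alpha)j(\beta)}^n \langle ij|ij\rangle \qquad (6.9) $$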
The Coulomb interaction always survives the spin integration, whereas exchange only
occurs between electrons having the same spin (thus the 'single-bar' expression for the
last integral in (6.9)). For the Fock operator (5.3), we must make a distinction
depending on whether it operates on an α- or a β-orbital:
$$ F^{(\alpha)} = h + J - K^{(\alpha)} \qquad (6.10a) $$
$$ F^{(\beta)} = h + J - K^{(\beta)} \qquad (6.10b) $$
where the one-electron and Coulomb operators are the same as before, while the new
exchange operators $K^{(\alpha)}$ and $K^{(\beta)}$ are defined as:
$$ K^{(\alpha)} = \sum_{i(\alpha)}^n K_i \qquad (6.11a) $$
(6.11b)
since, as we discussed above, the exchange interaction only occurs between electrons
having the same spin. This straightforward implementation of the above equations is
known as Unrestricted Hartree-Fock theory,¹⁰ since there is no attempt to impose the
constraint (6.2) on our wavefunction.
10 J.A. Pople and R.K. Nesbet, J. Chem. Phys. 22, 571 (1954); G. Berthier, J. Chim. Phys. 51,
363 (1954).
While not an absolute prerequisite, it makes good sense to require the spin
properties (6.2) and (6.3) to be fulfilled also with approximate solutions to the
Schrödinger equation, in the same spirit as we constrained the wavefunction to be an
eigenfunction of the permutation operators $P_{ij}$ with eigenvalues of (-1) in Eq. (3.2). As
mentioned above, the Slater determinant does not in general satisfy (6.2). In certain
cases, however, it is quite straightforward to ensure the correct spin behavior of the
wavefunction. One important such situation occurs when the electronic state under
consideration is a spin singlet ($S = M_S = 0$, spin multiplicity = 1), which necessarily
requires an even number of electrons. By requiring the spin-orbitals to occur in pairs
having the same spatial function, thus differing only in the spin part - "perfect spin-
pairing" - important simplifications are possible:
$$ \varphi_k^\alpha = \psi_k(r)\,\alpha(\sigma), \qquad (7.1a) $$
$$ \varphi_k^\beta = \psi_k(r)\,\beta(\sigma), \qquad (7.1b) $$
wavefunction. The energy expression with this wavefunction can be written as:
$$ E = \sum_i^n \langle i|h|i\rangle + \frac{1}{2}\sum_{i,j}^n \langle ij||ij\rangle \qquad (7.2a) $$
$$ = 2\sum_i^{n/2} (i|h|i) + \sum_{i,j}^{n/2} \{2(ii|jj) - (ij|ji)\} \qquad (7.2b) $$
where the summations in (7.2b) are over the n/2 doubly occupied orbitals. Thus, the
interaction between two doubly occupied spatial orbitals "i" and "j" is given by
4(ii|jj) - 2(ij|ji), since Coulomb interaction occurs between all electrons but
exchange only between those having the same spin. Within one doubly occupied
orbital, the energy contribution is 2(i|h|i) + (ii|ii), since the two electrons have
different spin and thus give rise to only a Coulomb term.
Closed-shell Hartree-Fock is by far the most commonly used variety of the
Hartree-Fock method. This is due to the fact that a vast majority of all molecules have
an even number of electrons and a singlet ground state. If an unrestricted Hartree-Fock
scheme were to be applied to such a system, the solutions would usually still satisfy
(6.2) and (7.1), but applying these constraints from the beginning makes the
computations much more efficient. Since the α- and β-orbitals have identical spatial
parts, the two Fock operators (6.10) must also be the same in our closed-shell theory,
and by only constructing one Fock operator the work and the memory requirement are
reduced by about 50%.
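The step from the spin-orbital expression to the closed-shell formula (7.2b) is pure bookkeeping over spin labels, which a small Python sketch can make explicit. The numbers in `H` and `V`, and the factorized model `eri` for the spatial integrals (pq|rs), are hypothetical stand-ins chosen only so that the example runs; the function names are likewise illustrative.

```python
# Hypothetical integrals over two real spatial orbitals (illustration only).
H = [[-1.0, 0.0], [0.0, -0.6]]
V = [[0.5, 0.05], [0.05, 0.4]]


def eri(p, q, r, s):
    """Toy model for the spatial integral (pq|rs): electron 1 in p, q and electron 2 in r, s."""
    return V[p][q] * V[r][s]


def spin_orbital_energy(n_pairs):
    """Eq. (4.14) summed over 2 * n_pairs spin-orbitals labelled (spatial index, spin)."""
    spin_orbitals = [(k, s) for k in range(n_pairs) for s in ("alpha", "beta")]
    energy = sum(H[k][k] for k, _ in spin_orbitals)
    for i, spin_i in spin_orbitals:
        for j, spin_j in spin_orbitals:
            coulomb = eri(i, i, j, j)                              # survives for any spins
            exchange = eri(i, j, j, i) if spin_i == spin_j else 0.0  # parallel spins only
            energy += 0.5 * (coulomb - exchange)
    return energy


def closed_shell_energy(n_pairs):
    """Eq. (7.2b): sums run over the doubly occupied spatial orbitals only."""
    energy = 2.0 * sum(H[i][i] for i in range(n_pairs))
    for i in range(n_pairs):
        for j in range(n_pairs):
            energy += 2.0 * eri(i, i, j, j) - eri(i, j, j, i)
    return energy


print(spin_orbital_energy(2), closed_shell_energy(2))
```

For a single doubly occupied orbital, `closed_shell_energy(1)` reduces to 2(i|h|i) + (ii|ii), the one-orbital contribution quoted in the text.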
Spin constraints such as those applied in (6.2) can often be applied even if the
spin orbitals are not all perfectly paired with identical spatial parts as in (7.1). For a
simple example, consider a case where $n_c$ spatial orbitals are doubly occupied with
perfect spin pairing - the "closed shells", and $n_o$ are singly occupied, all with α-spin -
"open shells". Such a determinant would be an eigenfunction of $S_z$ with eigenvalue
$M_S = n_o/2$, and of $S_+$ with eigenvalue zero.¹¹ Operating with $S_-$ will be possible only for the
open shell orbitals; subsequent operation with $S_+$ as in (6.5) will bring the function
back to the original determinant, and the $S_+S_-$ product in (6.5) will thus simply count
the number of open-shell orbitals with α-spin. We therefore conclude that
$$ S^2\Psi = \left(\frac{n_o}{2} + \frac{n_o^2}{4}\right)\Psi = M_S(M_S+1)\Psi \qquad (8.1) $$
11 The α-orbitals already have the maximum $m_s$ value, and raising it is thus impossible; raising it for any
β-orbital will create a determinant with two identical orbitals.
Furthermore, if we use the occupation number $m_a$ (= 1 or 2) for the number of
electrons occupying any spatial orbital we obtain after some manipulation:
·~L(stlts) (S.3)
s,t
For the Fock operators, we get for the closed and open shells:
a
Ka = 2 ~ma Ka + 2 ~ Ks
a s
(S.Sc)
Apart from the last terms in (8.3) and (8.5c), the expressions are all written in a
general form for all orbitals, whether singly or doubly occupied, using the occupation
numbers $m_a$ and the interaction terms (2J - K). Compared with the closed-shell
formalism, the only contributions that bring in new complexity are those describing the
exchange interaction among the open shells. We can therefore conclude that the
formalism would remain basically unchanged even if we were to assign other spins to
some of the open shells.
However, in order to satisfy (6.2), a wavefunction may have to be formed as a
sum of several determinants. These determinants typically have the same spatial
orbitals occupied, but differ in the spin of the singly occupied orbitals. We may accept,
without a thorough derivation of all possible cases, that the energy for any such
wavefunction can be written as in (8.2) or (8.3) apart from the open-open shell
interaction for which we introduce a more general form:
or
$$ E = \sum_a m_a (a|h|a) + \frac{1}{2}\sum_{a,b} m_a m_b \{(aa|bb) - \frac{1}{2}(ab|ba)\} + \frac{1}{2}\sum_{s,t} \{\alpha_{st}(ss|tt) - \frac{1}{2}\beta_{st}(st|ts)\} \qquad (8.6b) $$
where $a_{st}$ and $b_{st}$, or $\alpha_{st}$ and $\beta_{st}$, are referred to as the open-shell vector coupling
coefficients. The ($a_{st}$, $b_{st}$) and ($\alpha_{st}$, $\beta_{st}$) sets serve the same purposes, but different
notations are preferred by different schools. The reader should be warned that there are
many different official and unofficial definitions of vector coupling coefficients, and
should take ample precautions before attempting to use a computer program that requires these
coefficients as input.
The coupling coefficients enable us to specify one out of several possible states
for an open-shell orbital configuration. As a simple example, let us consider the first
excited configuration of the helium atom, He (1s)¹(2s)¹. The reader is presumably
aware of the simple form of these wavefunctions from elementary quantum theory.
Thus we have for the triplet state:
$$ {}^3\Psi = ( |1s\alpha\,2s\beta\rangle + |1s\beta\,2s\alpha\rangle )\, 2^{-1/2} = (1s2s - 2s1s)(\alpha\beta + \beta\alpha)/2 \qquad (8.7) $$
with the energy:
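$$ a_{1s2s} = 1, \quad b_{1s2s} = -2, \quad a_{1s1s} = a_{2s2s} = b_{1s1s} = b_{2s2s} = 0 \qquad (8.13) $$
$$ \alpha_{1s2s} = 0, \quad \beta_{1s2s} = -3, \quad \alpha_{1s1s} = \alpha_{2s2s} = \beta_{1s1s} = \beta_{2s2s} = -1 \qquad (8.14) $$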
$$ \psi_i(r) = \sum_p^N C_{pi}\chi_p(r) \qquad (9.1) $$
$$ \psi = \chi C \qquad (9.2) $$
(9.3)
which is already a function with $N^2$ complexity. The interaction between two electrons
in different orbitals will therefore grow as $N^4$, a quite disadvantageous increase in
14 We continue to use lower-case "n" to denote the number of electrons in our system, upper-case "N"
for the number of basis functions, and a "shadowed" font for matrices or vectors in either n or N
dimensions.
(9.4)
Since the indices p,q,r,s can be permuted in several different ways without
essentially changing the value of the integral, the number of non-redundant such
integrals is typically of the order $N^4/8$. This steep increase in the computational
requirement is a major concern in applications of the theory to realistic problems, since
basis sets of 100-1000 functions are often needed for a satisfactory description of
medium-sized molecules. In most schemes for performing ab initio SCF calculations,
the limitations to accuracy as well as to the size of the system that can be studied are set
by the two-electron integrals, their evaluation and their storage.
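The estimate of roughly $N^4/8$ non-redundant integrals can be made concrete with a small counting sketch in Python. It assumes real basis functions, so that an integral (pq|rs) is unchanged when p and q are interchanged, when r and s are interchanged, and when the pair pq is interchanged with the pair rs; the function names are illustrative and not taken from the notes.

```python
def count_closed_form(n_basis):
    """Unique (pq|rs): unordered pairs pq, then unordered pairs of such pairs."""
    pairs = n_basis * (n_basis + 1) // 2
    return pairs * (pairs + 1) // 2


def count_by_enumeration(n_basis):
    """Brute-force check: map every index quadruple to a canonical representative."""
    seen = set()
    indices = range(n_basis)
    for p in indices:
        for q in indices:
            for r in indices:
                for s in indices:
                    pq = (max(p, q), min(p, q))
                    rs = (max(r, s), min(r, s))
                    seen.add((max(pq, rs), min(pq, rs)))
    return len(seen)


for n in (2, 5, 100, 1000):
    unique = count_closed_form(n)
    print(n, unique, unique / (n ** 4 / 8))
```

For N = 100 the closed form gives 12,753,775 unique integrals, about 2% above $N^4/8$ = 12,500,000, and the ratio approaches 1 as N grows.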
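With an LCAO expansion such as the one in (9.1), the general energy expression in
spin orbital formalism (4.14) takes the form
$$ E(\Psi) = \sum_i^n \sum_{p,q}^N C_{pi}C_{qi}h_{pq} + \frac{1}{2}\sum_{i,j}^n \sum_{pq,rs}^N C_{pi}C_{qj}C_{ri}C_{sj} \{\langle pq|rs\rangle - \langle pr|qs\rangle\} \qquad (10.1a) $$
$$ = \sum_{p,q}^N D_{pq}h_{pq} + \frac{1}{2}\sum_{pq,rs}^N D_{pr}D_{qs} \{\langle pq|rs\rangle - \langle pr|qs\rangle\} \qquad (10.1b) $$
$$ D_{pq} = \sum_i^n C_{pi}C_{qi} \qquad (10.2) $$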
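The role of the density matrix (10.2) is to absorb the sums over occupied orbitals, so that the energy can be evaluated from D alone. The Python sketch below builds D from a set of LCAO coefficients and checks that the orbital form (10.1a) and the density form (10.1b) agree. All numbers (`C`, `H`, `V`) are hypothetical, real-valued, and chosen only for illustration, and `two_electron` is a toy stand-in for the integral ⟨pq|rs⟩, not a real integral routine.

```python
N_BASIS = 3
# Hypothetical coefficients: one inner list per occupied spin-orbital i, entries C_pi.
C = [[0.8, 0.3, -0.1], [0.2, -0.6, 0.7]]
H = [[-1.0, -0.2, 0.0], [-0.2, -0.7, -0.1], [0.0, -0.1, -0.4]]
V = [[0.6, 0.1, 0.05], [0.1, 0.5, 0.08], [0.05, 0.08, 0.4]]
IDX = range(N_BASIS)


def two_electron(p, q, r, s):
    """Toy model for <pq|rs>: electron 1 in (p, r), electron 2 in (q, s)."""
    return V[p][r] * V[q][s]


def density_matrix(coefficients):
    """Eq. (10.2): D_pq = sum over occupied orbitals of C_pi * C_qi."""
    return [[sum(c[p] * c[q] for c in coefficients) for q in IDX] for p in IDX]


def energy_from_orbitals(coefficients):
    """Eq. (10.1a): explicit sums over occupied orbitals i, j."""
    one = sum(c[p] * c[q] * H[p][q] for c in coefficients for p in IDX for q in IDX)
    two = 0.0
    for ci in coefficients:
        for cj in coefficients:
            for p in IDX:
                for q in IDX:
                    for r in IDX:
                        for s in IDX:
                            two += ci[p] * cj[q] * ci[r] * cj[s] * (
                                two_electron(p, q, r, s) - two_electron(p, r, q, s)
                            )
    return one + 0.5 * two


def energy_from_density(d):
    """Eq. (10.1b): the orbital sums are hidden inside the density matrix."""
    one = sum(d[p][q] * H[p][q] for p in IDX for q in IDX)
    two = sum(
        d[p][r] * d[q][s] * (two_electron(p, q, r, s) - two_electron(p, r, q, s))
        for p in IDX
        for q in IDX
        for r in IDX
        for s in IDX
    )
    return one + 0.5 * two


D = density_matrix(C)
print(energy_from_orbitals(C), energy_from_density(D))
```

The density form does the same work independently of the number of occupied orbitals, which is one reason practical programs carry D rather than the individual orbital coefficients through the energy and Fock matrix evaluation.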
Using the LCAO expansion technique, we can insert (9.2) into the canonical Hartree-
Fock equations (5.22b) to obtain:
(10.3)
(10.5)
(10.6)
(10.7)
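$$ F_{pq} = \langle\chi_p|F|\chi_q\rangle = \langle\chi_p|(h + J - K)|\chi_q\rangle = \langle\chi_p|h|\chi_q\rangle + \sum_j^n \langle\chi_p|(J_j - K_j)|\chi_q\rangle $$
$$ = h_{pq} + \sum_j^n \langle\chi_p|(J_j - K_j)|\chi_q\rangle = h_{pq} + \sum_i^n \sum_{r,s}^N C_{ri}C_{si} \{\langle pr|qs\rangle - \langle pq|rs\rangle\} $$
$$ = h_{pq} + \sum_{r,s}^N D_{rs} \langle pr||qs\rangle \qquad (10.8) $$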
In (10.8) we have used the notations for integrals defined in Appendix A. Notice that
in the Hartree-Fock equations (10.7) the matrices F and S are given, $\varepsilon$ is required to be
diagonal, while C is unknown. However, C cannot be varied completely without
restrictions: we required the orbitals to be orthonormal according to (5.4), which can
be written in matrix form as in (10.9):
15 C.C.J. Roothaan, Rev. Mod. Phys. 23, 69 (1951); G.G. Hall, Proc. Roy. Soc. A 208, 328
(1951).
(10.9)
$$ \varphi_i = \sum_p^N \chi_p A_{pi} \qquad (10.10) $$
or
$$ \varphi = \chi A \qquad (10.11) $$
(10.12)
As long as the transformation A is non-singular, the original basis set {$\chi_p$} can be
expressed in the new set {$\varphi_i$}, which can thus also be used as an expansion basis for
the orbitals in our Hartree-Fock problem. This leads to
$$ \psi = \chi C = \varphi A^{-1} C = \varphi C' \qquad (10.13) $$
where
$$ C = AC' \qquad (10.14) $$
(1O.ISb)
(10.18) is a "normal" matrix eigenvalue problem where the columns of C are the
eigenvectors and the diagonal elements of $\varepsilon$ the corresponding eigenvalues as illustrated
in the diagram (IO.lSc):
(1O.1Sc)
the columns of C
are eigenvectors
$$ SU = U\sigma, \qquad (10.19) $$
where U is unitary and $\sigma$ is diagonal. We can then form the matrix $X = U\sigma^{-1/2}$, where
$\sigma^{-1/2}$ is the matrix containing the elements $(\sigma_{ii})^{-1/2}$ on the diagonal. (We have assumed
here that the matrix S is positive definite, which holds as long as there is no linear
dependence in the basis set.) We can easily verify that
and X can therefore be used in place of A for the transformations in (10.14) and
(10.17). This Canonical Orthogonalization scheme has some obvious advantages: in
the case of near or exact linear dependency in the basis set, the eigenvalues of the
overlap matrix S will be near or equal to zero. This would obviously constitute a
problem if we attempted to form $\sigma^{-1/2}$ and X in (10.20), as the elements in column 'i'
would grow beyond any limit, but the calculation would in fact misbehave with any
unitary, and C, the coefficients of the LCAO expansion, are of order $(\sigma_{ii})^{-1/2}$. The two-
Mathematically, this means that we have projected out from the basis the
particular component causing the linear dependency.
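Canonical orthogonalization, including the removal of a near-dependent component, can be sketched for the smallest non-trivial case: two normalized basis functions with overlap s. For the overlap matrix [[1, s], [s, 1]] the eigenvalues are 1 + s and 1 - s with eigenvectors (1, 1)/√2 and (1, -1)/√2, so no numerical eigensolver is needed. The threshold value and the function names below are assumptions made for this illustration, not values prescribed by the notes.

```python
from math import sqrt


def canonical_orthogonalizer(s, threshold=1.0e-8):
    """Columns of X = U * sigma**-0.5 for the 2x2 overlap matrix [[1, s], [s, 1]].

    Columns whose overlap eigenvalue falls below the (assumed) threshold are dropped,
    which gives the rectangular matrix X' described in the text.
    """
    eigenpairs = [
        (1.0 + s, (1.0 / sqrt(2.0), 1.0 / sqrt(2.0))),
        (1.0 - s, (1.0 / sqrt(2.0), -1.0 / sqrt(2.0))),
    ]
    columns = []
    for sigma, u in eigenpairs:
        if sigma > threshold:
            columns.append(tuple(component / sqrt(sigma) for component in u))
    return columns


def transformed_overlap(s, columns):
    """X^T S X evaluated element by element; it should be a unit matrix."""
    overlap = [[1.0, s], [s, 1.0]]
    return [
        [
            sum(x[p] * overlap[p][q] * y[q] for p in range(2) for q in range(2))
            for y in columns
        ]
        for x in columns
    ]


well_behaved = canonical_orthogonalizer(0.4)
nearly_dependent = canonical_orthogonalizer(1.0 - 1.0e-12)

print(transformed_overlap(0.4, well_behaved))
print(len(nearly_dependent), transformed_overlap(1.0 - 1.0e-12, nearly_dependent))
```

In the nearly dependent case only one column survives, so the transformed Fock matrix would have dimension 1 instead of 2, while the surviving function is still normalized.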
This procedure gives a rectangular matrix X' to be used in place of A in (10.14) and
(10.17), and the transformation of the Fock matrix therefore leads to a matrix of smaller
dimension (though still square):
The attentive reader may already have noticed a significant flaw in our reasoning
when the Hartree-Fock method was presented above. It is certainly plausible that the
Hartree-Fock equations could somehow be solved if we were able to construct the Fock
operator, and subsequently the Fock matrix (10.5), which could be diagonalized to get
the orbitals. It is also clear that the Fock operator is defined from these orbitals,
according to (4.11) - (4.13) and (5.3). It thus appears that we would need to know the
solutions to the Hartree-Fock equations, before we could define the operators needed to
construct these equations! The solution to the above paradox is as simple as it is
pragmatic. As long as the equations are satisfied in the end, it doesn't matter how we
arrive at these solutions. We can therefore allow ourselves to guess a set of orbitals
without any particular justification, in order to get the process started. With these
orbitals we can now construct an approximate Fock operator, which can then be
diagonalized to obtain a new set of orbitals. These orbitals then replace the old ones in
constructing a new Fock operator, and so on. The procedure is repeated, and after a
certain number of iterations it is usually found that the orbitals do not change from one
iteration to the next. At this point, the orbitals satisfy (5.22), and we conclude that we
now have the solution to our problem.
[Figure: flowchart of the iterative cycle. Guess the initial molecular orbitals; construct the Fock operator; solve the eigenvalue problem $F\varphi_i = \varphi_i\varepsilon_i$; replace old orbitals with new; repeat.]
The above approach is referred to as the Self-Consistent Field method (or SCF
for short), since the Coulomb and exchange fields of the orbitals define a Fock operator
having these very same orbitals as eigenfunctions.
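The guess, build, diagonalize, replace cycle can be written down in a few lines for a toy closed-shell problem. The sketch below is an illustration under several assumptions that are not taken from the notes: two electrons in one doubly occupied orbital, two orthonormal basis functions (so no overlap matrix appears), hypothetical one-electron integrals `H_CORE`, and a factorized toy model `eri` for (pq|rs). The closed-shell Fock matrix is taken in the common form F_pq = h_pq + Σ_rs D_rs [(pq|rs) - (pr|qs)/2] with D_pq = 2 c_p c_q, and the iteration is a plain fixed-point loop without damping, which is not guaranteed to converge in general.

```python
from math import hypot, sqrt

H_CORE = [[-1.0, -0.2], [-0.2, -0.5]]   # hypothetical one-electron integrals
V = [[0.5, 0.05], [0.05, 0.4]]          # hypothetical factor of the toy integrals


def eri(p, q, r, s):
    """Toy model for (pq|rs); a real program would evaluate the integral."""
    return V[p][q] * V[r][s]


def fock(density):
    """Closed-shell Fock matrix built from the current density matrix."""
    return [
        [
            H_CORE[p][q]
            + sum(
                density[r][s] * (eri(p, q, r, s) - 0.5 * eri(p, r, q, s))
                for r in range(2)
                for s in range(2)
            )
            for q in range(2)
        ]
        for p in range(2)
    ]


def lowest_eigenvector(m):
    """Normalized eigenvector for the lowest eigenvalue of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    if b == 0.0:
        return (1.0, 0.0) if a <= d else (0.0, 1.0)
    lam = 0.5 * (a + d) - sqrt(0.25 * (a - d) ** 2 + b * b)
    norm = hypot(b, lam - a)
    return (b / norm, (lam - a) / norm)


def density_from(c):
    """D_pq = 2 c_p c_q for a single doubly occupied orbital."""
    return [[2.0 * c[p] * c[q] for q in range(2)] for p in range(2)]


def energy(density):
    """E = 1/2 sum_pq D_pq (h_pq + F_pq) for the closed-shell case."""
    f = fock(density)
    return 0.5 * sum(
        density[p][q] * (H_CORE[p][q] + f[p][q]) for p in range(2) for q in range(2)
    )


def scf(max_iterations=100, tolerance=1.0e-10):
    """Iterate until the density matrix stops changing; returns (D, E, iterations, converged)."""
    density = density_from(lowest_eigenvector(H_CORE))   # initial guess: core Hamiltonian
    for iteration in range(1, max_iterations + 1):
        new_density = density_from(lowest_eigenvector(fock(density)))
        change = max(
            abs(new_density[p][q] - density[p][q]) for p in range(2) for q in range(2)
        )
        density = new_density
        if change < tolerance:
            return density, energy(density), iteration, True
    return density, energy(density), max_iterations, False


final_density, final_energy, n_iterations, converged = scf()
print(converged, n_iterations, final_energy)
```

At convergence the Fock matrix built from the density has the occupied orbital as an eigenvector, so F and D commute; this is a convenient numerical check of self-consistency.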
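$$ \psi_k^\alpha(r) = \sum_p^N C_{pk}^\alpha \chi_p(r) \qquad (12.1a) $$
$$ \psi_l^\beta(r) = \sum_p^N C_{pl}^\beta \chi_p(r) \qquad (12.1b) $$
$$ D_{rs}^\beta = \sum_{i(\beta)}^n C_{ri}^\beta C_{si}^\beta \qquad (12.2b) $$
where the summations are only over α- or β-orbitals, as indicated. With these density
matrices (and the one- and two-electron integrals) the total energy can be expressed as:
(12.3)
We also obtain two Fock matrices, one for each of the Fock operators in (6.10):
$$ F_{pq}^\alpha = h_{pq} + \sum_{rs}^N [(D_{rs}^\alpha + D_{rs}^\beta)(pq|rs) - D_{rs}^\alpha (pr|qs)] \qquad (12.4a) $$
$$ F_{pq}^\beta = h_{pq} + \sum_{rs}^N [(D_{rs}^\alpha + D_{rs}^\beta)(pq|rs) - D_{rs}^\beta (pr|qs)] \qquad (12.4b) $$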
$$ D^{tot} = D^\alpha + D^\beta \qquad (12.5a) $$
$$ D^{spin} = D^\alpha - D^\beta \qquad (12.5b) $$
$$ D^\alpha = \frac{1}{2}\{D^{tot} + D^{spin}\} \qquad (12.5c) $$
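The relations between the spin-resolved density matrices and the total and spin densities are simple enough to check directly. In the Python sketch below the orbital coefficients are hypothetical (two α-orbitals and one β-orbital in a basis of three functions), and the helper names are illustrative only.

```python
N_BASIS = 3
# Hypothetical occupied-orbital coefficients, one inner list per orbital.
C_ALPHA = [[0.8, 0.3, -0.1], [0.2, -0.6, 0.7]]
C_BETA = [[0.7, 0.4, 0.1]]


def density(coefficients):
    """D_pq = sum over the given occupied orbitals of C_pi * C_qi."""
    return [
        [sum(c[p] * c[q] for c in coefficients) for q in range(N_BASIS)]
        for p in range(N_BASIS)
    ]


def combine(a, b, sign):
    """Element-wise a + sign * b for two square matrices."""
    return [[a[p][q] + sign * b[p][q] for q in range(N_BASIS)] for p in range(N_BASIS)]


d_alpha = density(C_ALPHA)
d_beta = density(C_BETA)
d_tot = combine(d_alpha, d_beta, +1.0)    # total density
d_spin = combine(d_alpha, d_beta, -1.0)   # spin density

# Recover the spin-resolved densities from the total and spin densities.
alpha_back = [[0.5 * x for x in row] for row in combine(d_tot, d_spin, +1.0)]
beta_back = [[0.5 * x for x in row] for row in combine(d_tot, d_spin, -1.0)]
print(alpha_back[0][0], d_alpha[0][0])
```

For a closed-shell system the α and β densities coincide, so the spin density vanishes and only the total density is needed.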
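$$ E = \sum_{pq}^N D_{pq}^{tot} h_{pq} + \frac{1}{4}\sum_{pq,rs}^N D_{pq}^{tot} D_{rs}^{tot} \{2(pq|rs) - (pr|qs)\} - \frac{1}{4}\sum_{pq,rs}^N D_{pq}^{spin} D_{rs}^{spin} (pr|qs) \qquad (12.6) $$
$$ F_{pq}^\alpha = h_{pq} + \sum_{rs}^N [D_{rs}^{tot} \{(pq|rs) - \frac{1}{2}(pr|qs)\} - \frac{1}{2} D_{rs}^{spin} (pr|qs)] \qquad (12.7a) $$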
(12.8a)
(12.8b)
which can be solved with the methods discussed in Section 11. Note that in general the
orbitals and the orbital energies will be different for the two equations (12.8a-b).
No special attention needs to be given to the issue of orthogonality among the orbitals.
Orbitals with different spin are automatically orthogonal due to the spin integration,
whereas those with the same spin are orthogonal since they are solutions to the same set
of Roothaan-Hall equations.
The orbitals obtained in this scheme can now be used in accordance with the
Aufbau Principle to form the orbitals for the next iteration, or to construct the total
wave function if the SCF procedure has converged. There is no requirement to have the
same number of α- and β-orbitals, and the Aufbau principle can be applied separately
for each spin.
It may appear confusing that two different sets of Hartree-Fock equations were
obtained, since only one occurs in the original derivation. We can view the two
Since the off-diagonal blocks are exactly zero due to the spin orthogonality, these
matrix equations can be separated into two blocks of dimension N x N.
The practical procedure for solving the Hartree-Fock (or Roothaan-Hall)
equations in an LCAO basis set would then go as follows:
coefficients, and construct the trial density matrix. Alternatively, one could
3: Compute one- and two-electron integrals over the LCAO basis functions and
4: Transform the Fock matrix to the orthonormal basis and diagonalize it.
It can now be rigorously shown that, as long as certain formal criteria are met,
any function of several variables can be written as a sum of expansion functions which
are simple products of one-variable functions. In electronic structure theory, it is
obviously more useful to think in terms of n- and one-electron functions, i.e. total
wavefunctions and orbitals. Therefore, any function of the coordinates of n electrons
can be expanded in Hartree products, and any antisymmetric function can be similarly
expanded in Slater determinants!
This implies that we could write the exact electronic wavefunction for any
system as a linear combination of Slater determinants.
$$H_0 = \sum_{i}^{n} F(i) \tag{12.10}$$
(12.11)
where {a_1, a_2, ... a_n} is a permutation of the indices {1, 2, ... n}. A single term F(i)
in H_0 operating on a single Hartree product Θ_ν in the wavefunction will then result in
Therefore, we conclude that the Hartree product is an eigenfunction of the zero order
Hamiltonian:
$$H_0\,\Theta_\nu = \sum_{i}^{n} F(i)\,\Theta_\nu = \Big[\sum_{i}^{n} \varepsilon_{a_i}\Big]\,\Theta_\nu \tag{12.13}$$
Since {a_1, a_2, ... a_n} is just a permutation of {1, 2, ... n}, the sum in (12.13) simply
contains the sum over all orbital energies. Thus,
(12.14)
where
$$E_0 = \sum_{i}^{n} \varepsilon_{a_i} \tag{12.15}$$
for all the Hartree products Θ_ν, and we thus also have
(12.16)
The α- and the β-Fock matrices would be equal, and given by:
$$F_{pq} = h_{pq} + \sum_{rs}^{N} D_{rs}\left\{(pq|rs) - \tfrac{1}{4}(pr|qs) - \tfrac{1}{4}(ps|qr)\right\} \tag{13.2}$$
In (13.1) and (13.2) we have written the exchange contribution in a symmetric form,
which we may do as long as the summation is over all r,s. Exploiting this symmetry
we may now restrict the summation range. If we redefine the density matrix as
(13.3)
we arrive at expressions which are more efficient to evaluate than the previous ones:
$$E = \sum_{p \le q}^{N} d_{pq}\,h_{pq} + \tfrac{1}{2}\sum_{p \le q,\; r \le s}^{N} d_{pq}\,d_{rs}\,P_{pq,rs} \tag{13.4}$$
$$F_{pq} = h_{pq} + \sum_{r \le s}^{N} d_{rs}\,P_{pq,rs} \tag{13.5}$$
In (13.4) and (13.5) we have introduced the supermatrix notation
$$P_{pq,rs} = (pq|rs) - \tfrac{1}{4}(pr|qs) - \tfrac{1}{4}(ps|qr) \tag{13.6}$$
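As a numerical check of the restricted summation, the following sketch compares a Fock matrix built from the full sum over r,s with one built from the sum over r ≤ s. Two things are assumptions made here and not taken from the text: the redefined density is taken to be d_rs = D_rs for r = s and 2 D_rs otherwise, and the "integrals" are a model expression (pq|rs) = A_pq A_rs, chosen only because it has the same permutational symmetry as real two-electron integrals.

```python
N = 3
# Symmetric model matrices (illustrative numbers, not real integrals).
A = [[1.0, 0.3, 0.1], [0.3, 0.8, 0.2], [0.1, 0.2, 0.5]]
H = [[-1.0, -0.2, 0.0], [-0.2, -0.7, -0.1], [0.0, -0.1, -0.4]]
D = [[1.2, 0.3, -0.1], [0.3, 0.6, 0.2], [-0.1, 0.2, 0.2]]


def eri(p, q, r, s):
    """Model two-electron integral (pq|rs) with full permutational symmetry."""
    return A[p][q] * A[r][s]


def supermatrix(p, q, r, s):
    """P_pq,rs = (pq|rs) - 1/4 (pr|qs) - 1/4 (ps|qr)."""
    return eri(p, q, r, s) - 0.25 * eri(p, r, q, s) - 0.25 * eri(p, s, q, r)


def fock_full():
    """Fock matrix from the unrestricted sum over all r, s."""
    return [[H[p][q] + sum(D[r][s] * supermatrix(p, q, r, s)
                           for r in range(N) for s in range(N))
             for q in range(N)] for p in range(N)]


def fock_restricted():
    """Fock matrix from the sum over r <= s with the redefined density d."""
    d = [[D[r][s] * (1.0 if r == s else 2.0) for s in range(N)] for r in range(N)]
    return [[H[p][q] + sum(d[r][s] * supermatrix(p, q, r, s)
                           for r in range(N) for s in range(r, N))
             for q in range(N)] for p in range(N)]


def max_difference(x, y):
    return max(abs(x[p][q] - y[p][q]) for p in range(N) for q in range(N))


if __name__ == "__main__":
    print("largest deviation:", max_difference(fock_full(), fock_restricted()))
```

The restricted loop touches N(N+1)/2 density elements instead of N^2, which is where the saving comes from.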
Since the first formulation of the LCAO finite basis scheme for molecular
Hartree-Fock calculations, computer applications of the method have traditionally been
implemented as a two-step process. In the first of these steps the two-electron integrals
are calculated and stored externally. The second step then consists of the iterative
solution of the Roothaan-Hall equations, where the integrals from the first step are read
at least once for every iteration. 16
The division of the computational process into these two steps was motivated
by the high cost of central processor (CPU) performance versus input and output (I/O).
Whereas the second step involves extensive retrieval of data from mass-storage,
integral calculation is dominated by computation of rather complicated analytical
expressions. In early applications of LCAO calculations to molecules, the integral part
of the calculation with its high CPU demands represented the bottleneck, and the major
effort of developing more efficient algorithms and faster computer programs for
molecular Hartree-Fock calculations has therefore been directed at that problem.
During the last several decades there have been continuous rapid advances in
computer technology, as well as in integral algorithm and code development. Common
to virtually all types of computer equipment, from desktop PCs to supercomputers, is
the fact that the progress in CPU technology has been much faster than the development
of I/O facilities. Thus, with the traditional approach one now faces the dilemma of
being able to compute large numbers of integrals very rapidly, but spending a relatively
larger amount of time and effort on their storage and retrieval. Indeed, for SCF
calculations carried out in that fashion today, the size of the systems that can be handled
is almost always limited by the disk storage and I/O capacity needed for the integrals,
rather than by the CPU power required to compute them.
While enough storage capacity might be available on large mainframe systems
to carry out calculations in the conventional spirit up to some 500 or even 1000 basis
functions, such an endeavor would certainly place a heavy load on the I/O channels, fill
a large portion of the available disk space, and reduce the overall throughput of the
system considerably.
The direct SCF scheme offers a solution to this problem by eliminating all
storage of integrals. This can, however, only be done at the expense of integral
recalculation in every iteration. While this would have been very inefficient in the early
days of molecular Hartree-Fock calculations, the present hardware and software
situation makes it a very viable approach. Many years of development have made the
evaluation of these integrals less of a burden than it used to be. Today, it is often easier
to calculate the integrals than to store them; in other words, the evaluation of the
integrals has become less of a bottleneck than their storage. There is a lot of evidence
that this trend will continue, and one should therefore try to circumvent the bottlenecks
set by the external storage and I/O capacity, at the expense of extra CPU work if
necessary. This simple idea is the quintessence of the "direct" approaches in electronic
structure methodology. 17 The two approaches are schematically illustrated below:
16 With the large memories available on modern hardware, integrals can often be held in memory
during the entire calculation for small and medium-size systems. We will not discuss such "in-core"
solutions in any detail here since our main focus is on large applications.
17 J. Almlöf and P.R. Taylor, in: "Advanced Theories and Computational Approaches to the
Electronic Structure of Molecules", NATO ASI Ser. C, 133 (Ed. C. Dykstra), Reidel, Dordrecht
(1984), pp. 107-125.
18 J. Almlöf, K. Faegri, Jr. and K. Korsell, J. Comput. Chem. 3, 385 (1982).
19 See e.g. M.J. Frisch, G.W. Trucks, M. Head-Gordon, P.M.W. Gill, M.W. Wong, J.B. Foresman,
B.G. Johnson, H.B. Schlegel, M.A. Robb, E.S. Replogle, R. Gomperts, J.L. Andres, K.
Raghavachari, J.S. Binkley, C. Gonzalez, R.L. Martin, D.J. Fox, D.J. Defrees, J. Baker, J.J.P.
Stewart, and J.A. Pople, Gaussian 92, Revision A.
20 J. Almlöf and K. Faegri, Jr., in: "Self-Consistent Field - Theory and Applications", (Eds.
R. Carbó and M. Klobukowski), Elsevier, 1990, pp. 195.
21 See, e.g.: H.P. Lüthi and J. Almlöf, Theoret. Chim. Acta 84, 443 (1993); L.G.M. Pettersson
and T. Faxén, Theoret. Chim. Acta 85, 345 (1993).
While we introduced the notion of an LCAO basis set already in (9.1), we have
said nothing so far about the form of these functions. Clearly, the exact atomic orbitals
are almost as inaccessible as their molecular counterparts - and even if we knew them it
would only be in the form of numerical tabulations, which would not lend themselves
to efficient evaluation of the one- and two-electron integrals that we encountered above.
All that was said about basis set expansions in Section 9 would be valid with
almost any basis set, and insisting on atomic orbitals for that expansion is really only
based on our expectation that AO's would be the most suitable set of functions for
expanding the MOs. After all, the notion of atoms in molecules - which is the rationale
for the MO-LCAO approach - is only approximate, and the pursuit of accuracy through
the use of very precise AO's for the expansion is therefore futile.
As a general consequence of atomic symmetry, atomic orbitals are always of the
form
The equations determining the form of the radial functions R(r) can be solved
exactly only for one-electron atoms, but some general conclusions about the form of the
solutions can still be drawn. Due to the singularity of the potential at a point nucleus
with a charge of +Z, the wave function must have a 'cusp' at the nucleus; more
specifically, it is required that
$$\left.\frac{dR}{dr}\right|_{r=0} = -Z \tag{15.2}$$
At the other end of the range, an electron far away from any molecule would see the
remainder of the molecule as a positive charge without any particular structure. As in
any one-electron atom, the wavefunction would therefore decay exponentially. It
would thus seem reasonable to use exponential functions as basis functions, especially
since they are known to be the exact solutions for the one-electron systems.
Historically, basis functions with exponential asymptotic behavior - Slater-type
orbitals (STO's) - were the first to be used. 22 They are characterized by an
exponential factor in the radial part:
(15.3)
22 The Slater-type basis functions are often referred to as ETO's (exponential type orbitals). For a
comprehensive review, see: C.A. Weatherford and H.W. Jones, "ETO Multicenter Molecular Integrals",
(Reidel, Dordrecht, 1982).
where P(r) in (15.3) is a polynomial in the radial coordinate that can take on several
different forms.
Gaussian basis functions were originally introduced to remedy the difficulties
associated with evaluating multi-center integrals with STO's. 23 They can be written in
a rather similar form,
(15.4)
though usually with a different radial polynomial, but much of their usefulness stems
from the fact that they are not confined to a local, polar coordinate system, and they are
therefore commonly expressed in terms of their Cartesian components:
(15.5)
(15.6)
The present success of GTO's as the basis set of choice in virtually all calculations was
far from obvious in the beginning. For instance, it is clear from quite elementary
considerations that a Gaussian has the qualitatively wrong behavior both at the nuclei
and in the asymptotic (long distance) limit, for a Hamiltonian with point-charge nuclei
and Coulomb interaction.
[Figure: comparison of an STO and a GTO.]
Furthermore, early practical experience with Gaussians was quite discouraging and it
has therefore been a commonly held belief that STO's would be the preferred basis if
only the integral evaluation problem could be solved. However, recent experience
indicates that this is not necessarily the case. The 'cusp' behavior represents an
idealized point nucleus, and for more realistic nuclei of finite extension the Gaussian
shape is actually more realistic. If accurate solutions for a point-charge model
Hamiltonian are desired, they can be obtained to any desired accuracy in practice by
expanding the core basis functions in a sufficiently large number of Gaussians to
ensure their correct behavior. Furthermore, properties related to the behavior of the
wave function at or near nuclei can often be predicted correctly, even without an
accurately "cusped" wavefunction. 24
23 S.F. Boys, Proc. Roy. Soc. A 200, 542 (1950); S.F. Boys, G.B. Cook, C.M. Reeves and I.
Shavitt, Nature 178, 1207 (1956); H. Preuss, Z. Naturforsch. A 11, 323 (1956).
In most applications the asymptotic behavior of the density far from the nuclei is
considered much more important than the nuclear cusp. As mentioned above, the
wavefunction for a bound state must fall off exponentially with distance, whenever the
Hamiltonian contains Coulomb electrostatic interaction between particles. However,
even though an STO basis would in principle be capable of providing such a correct
exponential decay, this occurs in practice only when the smallest orbital exponent in the
basis set is
$$\zeta_{\min} = \sqrt{2\,I_{\min}} \tag{15.7}$$
I_min being the first ionization potential. I_min is hardly ever known in practice when the
basis set is designed, but even if it were, all attempts to get a correct asymptotic
behavior even with an STO basis would still be futile. One must keep in mind that for
stable molecules I_min is usually > 5 eV, and thus ζ_min > 0.6 a.u. While this restriction
on the exponent range might be acceptable in SCF calculations on atoms, much lower
values are required for accurate work on any molecule, especially at the correlated level.
Violating the requirement (15.7) with a too diffuse STO basis will have much more
damaging effects on the long-range behavior than any Gaussian basis: if the smallest
exponent in a molecular calculation is chosen to be e.g. 0.4 rather than 0.6 as in the
above example, the density at a distance of 10 Å from a molecule would exceed the
correct one by about three orders of magnitude! With a typical Gaussian basis, in
comparison, the density is essentially zero at that distance, and the consequences of this
error are far less severe for any normal molecular property calculated with these basis
sets.
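The two numbers quoted above (an exponent of about 0.6 a.u. for a 5 eV ionization potential, and an error of about three orders of magnitude at 10 Å) can be checked with a few lines of arithmetic. The sketch assumes a purely exponential tail, density proportional to exp(-2ζr), and ignores any polynomial prefactor; the function names and the rounded conversion factors are choices made here.

```python
import math

EV_PER_HARTREE = 27.2114        # rounded conversion factor
ANGSTROM_PER_BOHR = 0.529177    # rounded conversion factor


def zeta_min(ionization_potential_ev):
    """Smallest exponent compatible with the exact decay, zeta = sqrt(2 I) in a.u."""
    return math.sqrt(2.0 * ionization_potential_ev / EV_PER_HARTREE)


def tail_density_ratio(zeta_used, zeta_correct, distance_angstrom):
    """Ratio of exp(-2*zeta*r) tails: density with zeta_used over density with zeta_correct."""
    r_bohr = distance_angstrom / ANGSTROM_PER_BOHR
    return math.exp(-2.0 * (zeta_used - zeta_correct) * r_bohr)


if __name__ == "__main__":
    print("zeta_min for I = 5 eV:", round(zeta_min(5.0), 3), "a.u.")
    print("density ratio at 10 Angstrom (0.4 vs 0.6):",
          "%.1e" % tail_density_ratio(0.4, 0.6, 10.0))
```

The ratio grows exponentially with distance, so the size of the error depends strongly on where the density is sampled.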
In virtually all ab initio calculations carried out today, a basis set of contracted
Gaussians is used. In conventional methods where integrals are stored, considerable
thought goes into the issues of basis set compactness, i.e., the ability to describe the
orbitals as accurately as possible with the minimum number of basis functions. The
discussion above clearly shows that Gaussians do not resemble atomic orbitals very
closely, and they are not always used directly as basis functions in the expansion (9.1).
However, they have other properties that still make them very attractive as basis
functions from a computational point of view. In order to get functions which retain
that advantage but perform better in the LCAO expansion, linear combinations of
Gaussians can be used; the original basis set of simple Gaussians is contracted 25 (the
CGTO basis set). An atomic orbital, whose shape is suitable for physical/chemical
reasons, is thus expanded in a set of Gaussians, whose mathematical properties are
attractive from a computational point of view:
The first contracted basis sets were designed based on atomic Hartree-Fock
wavefunctions, and with the purpose of facilitating molecular Hartree-Fock
calculations. Later, contraction schemes such as the well known STO-nG sets have
been designed to mimic the shape of STO functions. 26 If one assumes that a "true"
STO basis set would be superior to Gaussians - we have raised some doubt about that
assumption above - it would make a lot of sense to try to approximate the STO with an
expansion in a set of GTO functions. Such a fit can be done rather accurately as shown
below, and the main limitation to the usefulness of that approach appears to be that the
STO itself is not a perfect basis function.
25 M. Krauss, J. Chem. Phys. 38, 564 (1963); C.D. Ritchie and H.F. King, J. Chem. Phys. 47,
564 (1967); E. Clementi and D.R. Davis, J. Comput. Phys. 2, 223 (1967).
26 W.J. Hehre, R.F. Stewart, and J.A. Pople, J. Chem. Phys. 51, 2657 (1969).
27 J. Almlöf and P.R. Taylor, Adv. Quantum Chem. 22, 301 (1991).
Two quite different philosophies are advocated with regard to the contraction
scheme. When transforming from the larger, primitive GTO set to a smaller, contracted
CGTO set, the algorithms are clearly much simpler if the transformation is restricted in
such a way that each GTO contributes to exactly one CGTO. In that case, the
transformation is reduced to a series of small, independent summations within mutually
exclusive sets. This is called the segmented contraction scheme. In contrast, the
general contraction scheme makes no such assumptions, and allows each GTO within a
set to contribute to several CGTO's. The distinction is best illustrated graphically. We
may assume that integrals have been evaluated for a set of GTOs. Usually, both the
nuclear center and the angular form are the same for all GTOs in the group; they differ
only in their orbital exponents α. In the first, segmented scheme, the transformation
matrix from the GTO to the CGTO representation is sparse: each column contains
exactly one non-zero element. The general scheme, in contrast, has a non-sparse
transformation. 28
[Figure: structure of the GTO-to-CGTO transformation matrix for the segmented contraction,
the general contraction, and the modified general contraction.]
Chemically, the contraction of a basis represents an attempt to get basis functions
closer in spirit to the LCAO concept. Mathematically, contraction constitutes a
projection of the primitive one-electron basis {χ} onto the smaller, but physically more
reasonable, contracted basis. Ideally, such contraction can be done without any major
deterioration of the wavefunction quality.
One considerable advantage of the general contraction scheme is that the CGTOs
reproduce exactly the desired combinations of primitive functions. For example, if an
atomic SCF calculation is used to define the contraction coefficients in a general
contraction, the resulting minimal basis will reproduce the SCF energy obtained in the
primitive basis. This is not the case with segmented contractions.
28 R.C. Raffenetti, J. Chem. Phys. 58, 4452 (1973).
There are other advantages with a general contraction: for example, it is possible
to contract inner-shell orbitals to single functions with no error in the atomic energy,
making calculations on heavy elements much easier. Another advantage is a conceptual
one, much exploited by Ruedenberg and co-workers. 29 Using a general contraction, it
is possible to perform calculations in which the one-particle space is a set of atomic
orbitals, a true LCAO scheme, rather than being a segmented grouping of a somewhat
arbitrary expansion basis. The MOs can then be analyzed very simply, just as for the
original qualitative LCAO MO approach, but in terms of "exact AOs" rather than crude
approximations.
Clearly, contraction reduces the number of basis functions quite significantly.
With an STO-3G basis, as an example, the reduction in size from the primitive basis is a
factor of 3, corresponding to a reduction factor of 3^4 = 81 in the number of two-electron
integrals - a significant reduction indeed, if one uses an algorithm that requires storage
and extensive handling of these integrals after their evaluation.
In a Hartree-Fock calculation in the direct SCF spirit no such storage of
integrals is necessary, and one of the original arguments for contraction becomes
obsolete. The question is then: is it better to work in the original basis of primitive
GTOs, or should one still contract to reduce the size of the basis set? (Remember that
we need to calculate integrals involving all the primitive GTO basis functions whether
or not we choose to contract them.) One can readily compute the number of arithmetic
operations needed for the two scenarios, and it is quite clear that contraction reduces the
number of operations in the build-up of the Fock matrix, while a few extra operations
are required for the contraction itself. The comparisons are complicated by the fact that
an uncontracted scheme leads to a much clearer and simpler structure of the algorithms,
for which the computer implementation is expected to run more efficiently. The current
conventional wisdom appears to be that contraction pays off on a conventional vector
supercomputer, whereas the totally uncontracted GTO basis set has an advantage on
workstations and massively parallel hardware. One must keep in mind, however, that
these considerations depend on the latest news in a chaotically evolving hardware
market, and any definitive conclusion of this type is likely to be outdated soon.
• A minimal basis set is one that has a single basis function corresponding to each
of the atomic orbitals that are occupied in the atom. It is the smallest set that one can
reasonably use in any calculation, and one should not expect any quantitative accuracy
with such a basis.
• The double-zeta basis set consists of two basis functions per atomic orbital, and
is thus twice as large as the minimal. The name stems from the tradition of STO-type
basis functions, where the symbol ζ ("zeta") is traditionally used for the exponential
factor. In the same spirit, basis sets of triple-zeta, quadruple-zeta, etc. quality can
be constructed.
• The split-valence basis is of double-zeta quality for the valence atomic orbitals, and
a minimal basis in all the other atomic orbitals.
The basis sets constructed to parallel the occupied atomic orbitals may constitute
a good start, but for accurate calculations they are generally insufficient. The "atoms-
in-molecules" notion underlying the LCAO approach is only approximate, and in a
realistic situation the atoms are significantly modified as they come together to form the
molecule. To account for this phenomenon within the framework of LCAO sets, we
must introduce the notion of polarization. It is easiest to understand the concept of
polarization functions if we consider a Gaussian basis set:
It is plausible that, as the atom in a molecule experiences the field from the
surrounding atoms, the electrons on that atom may show a tendency to shift away
slightly from the center of the nucleus. In other words, the optimum center around
which the basis functions are expanded may not coincide exactly with the position of
the nucleus. As long as that deviation is small, we can express it with the help of
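$$d\chi = \Delta x\,\frac{d\chi}{dx} + \Delta y\,\frac{d\chi}{dy} + \Delta z\,\frac{d\chi}{dz} \tag{15.10}$$
(15.11)
$$\frac{d}{dx}\,\chi^{GTO}_{x} = \frac{d}{dx}\,(x-x_a)^{k}\,e^{-\alpha(x-x_a)^2} \tag{15.12}$$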
The conclusion from this little mathematical exercise is simple and obvious: If
we want to describe the displacement of the electron density from a situation where it is
centered symmetrically around the nucleus (as in the atom), we need to supply basis
functions with higher and lower L-quantum numbers than the original ones, but with the
same orbital exponents. The lower L-values are normally present in the basis anyway,
and it is sufficient to supply basis functions with higher quantum numbers.
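The statement about higher and lower quantum numbers can be made explicit by carrying out the differentiation of a one-dimensional Cartesian Gaussian factor. The worked step below is added for illustration; it uses only the product rule and the notation of (15.12).

```latex
\frac{d}{dx}\left[(x-x_a)^{k}\,e^{-\alpha(x-x_a)^2}\right]
  = k\,(x-x_a)^{k-1}\,e^{-\alpha(x-x_a)^2}
  \;-\; 2\alpha\,(x-x_a)^{k+1}\,e^{-\alpha(x-x_a)^2}
```

The first term lowers the power of (x - x_a) by one and the second raises it by one, both with the unchanged exponent α. For k = 0 (an s-function) only the raised term survives, giving a p-type function; for k = 1 (a p-function) the result is a combination of an s-type and a d-type function.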
where the minimum and maximum values α_min and α_max can be optimized. There is
also a choice with regard to restricting the exponents to be the same for each shell of
basis functions (α_s = α_p = α_d), or optimizing them freely without those restrictions.
The latter is clearly better from the perspective of the variation principle, whereas the
former, constrained approach offers significant computational simplifications.
Accurate primitive sp basis sets for first-row atoms have been available for some
time in the compilation of van Duijneveldt. 30 His large (13s 8p) sets reproduce
numerical Hartree-Fock atomic energies to within 0.5 mEh in the worst case (Ne).
These primitive sets are suitable for the generation of ANOs. For even higher
accuracy, Partridge has generated sets of size up to (18s 13p) for the first-row atoms.
For heavier elements, basis sets of accuracy similar to (13s 8p) for the first row have
only recently become available. 31 Of course, before contraction it is necessary to
supplement these sp sets with polarization functions.
The conventional nomenclature with regard to basis sets is quite precise. A
notation such as 6-31G denotes a basis set where six primitive Gaussians have been
used to describe each of the core orbitals, whereas the valence orbitals are represented
by two contracted functions - the inner one expanded in three Gaussians, the outer one
uncontracted. It is usual to leave the most diffuse basis functions uncontracted - the
outer part of the valence is so strongly distorted from the atomic picture that flexibility
is more important than atomic resemblance. If we want to indicate that polarization
functions have been added to the basis we augment it with an asterisk. As a practical
matter, the hydrogen atoms are often treated differently from the other atoms in a
molecule with regard to the choice of basis set, and polarization functions are not
always added to the hydrogen atoms. Thus, for a set with polarization on all atoms we
add two asterisks, 6-31G**. The diffuse functions are handled the same way; a '+'
denotes that diffuse functions have been added, '++' ensures diffuse functions on all
atoms. A symbol such as e.g. 6-311G**+ would thus be interpreted as follows:
a: Each atomic core orbital is represented by one basis function, expanded in six
primitive Gaussians.
b: Each atomic valence orbital is represented by three basis functions, the tightest
expanded in three Gaussians, the other two uncontracted.
c: A set of uncontracted polarization functions has been added on each atom (p-orbitals
on hydrogen, d-orbitals on all other atoms).
d: A set of diffuse functions (with the same l-values as those occurring in the valence
orbitals) has been added on all non-hydrogen atoms.
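The reading rules in a-d can be summarized in a small, purely illustrative parser. The function name describe_basis and the returned dictionary layout are hypothetical, and the pattern only accepts symbols written in the order used in these notes (asterisks before plus signs); other spellings found in the literature are not handled.

```python
import re

# Core digit, valence digits, optional "G", then up to two "*" and up to two "+".
_PATTERN = re.compile(r"^(\d)-(\d+)G?(\*{0,2})(\+{0,2})$")

_SCOPE = {0: "none", 1: "non-hydrogen atoms", 2: "all atoms"}


def describe_basis(symbol):
    """Translate a symbol such as '6-311G**+' into a plain description."""
    match = _PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError("unrecognized basis set symbol: %r" % symbol)
    core, valence, stars, pluses = match.groups()
    return {
        # number of primitives in the single function for each core orbital
        "core_primitives": int(core),
        # primitives per valence function, tightest first; 1 means uncontracted
        "valence_contraction": [int(digit) for digit in valence],
        "polarization": _SCOPE[len(stars)],
        "diffuse": _SCOPE[len(pluses)],
    }


if __name__ == "__main__":
    for name in ("6-31G", "6-31G**", "6-311G**+"):
        print(name, describe_basis(name))
```

For 6-311G**+ the sketch returns three valence functions with 3, 1 and 1 primitives, polarization on all atoms, and diffuse functions on the non-hydrogen atoms, in line with items a-d.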
30 F.B. van Duijneveldt, IBM Research Report RJ 945 (IBM, San Jose, 1971).
31 H. Partridge, J. Chem. Phys. 87, 6643 (1987); ibid. 90, 1043 (1989).
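$$\chi_{ax}\,\chi_{bx} = \sum_{i=0}^{l_a+l_b} C_i^{\,l_a+l_b}\,\varphi_{pi}(x-x_p) \tag{16.1}$$
with
(16.2)
$$\varphi_{pi}(x-x_p) = (x-x_p)^{i}\,e^{-\alpha_p (x-x_p)^2} \tag{16.4}$$
$$(ab|cd) = \sum_{i_1=0}^{l_{a1}+l_{b1}}\;\sum_{j_1=0}^{m_{a1}+m_{b1}}\;\sum_{k_1=0}^{n_{a1}+n_{b1}} C_{i_1,x}^{\,l_{a1}+l_{b1}}\;C_{j_1,y}^{\,m_{a1}+m_{b1}}\;C_{k_1,z}^{\,n_{a1}+n_{b1}} \cdots$$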
In one of the most common and efficient approaches currently used for the
evaluation of integrals over Gaussian basis functions, 32 Hermite Gaussian functions
(HGFs) are used instead of the usual Cartesian Gaussians for the re-expansion (16.1).
A Hermite Gaussian is defined as
$$\Lambda_i(\xi) = H_i(\xi)\,\exp\!\left(-\alpha_p (x-x_p)^2\right) = (-1)^{i}\,\frac{d^{i}}{d\xi^{i}}\,\exp(-\xi^2) \tag{16.6}$$
The set of HGFs {Λ_i} spans the same space as the expansion functions {φ_p} in (16.2),
and as a consequence they can be used for expanding the basis function products:
$$\chi_{ax}(x)\,\chi_{bx}(x) = \sum_{i=0}^{l_a+l_b} C_i^{\,l_a+l_b}\,\Lambda_i(\xi) \tag{16.8}$$
where the expansion coefficients now need to be redefined. Because of the natural
relations between Hermite polynomials and Gaussians, 33 the two-center integrals in
(16.5) can be evaluated with unique efficiency. Here, we will not go into the
technicalities of how these integrals are evaluated in detail, but the expression for the
simplest two-electron integral involving four s-type Gaussians may serve as an
illustration of the complexity of the problem. With an s-type Gaussian defined as
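$$\int\!\!\int \exp\!\left[-\alpha_a(\mathbf{r}_1-\mathbf{A})^2 - \alpha_b(\mathbf{r}_1-\mathbf{B})^2 - \alpha_c(\mathbf{r}_2-\mathbf{C})^2 - \alpha_d(\mathbf{r}_2-\mathbf{D})^2\right]\frac{1}{r_{12}}\,d\mathbf{r}_1\,d\mathbf{r}_2$$
where
$$S_{ab} = \left(\frac{\pi}{\alpha_p}\right)^{3/2}\exp\!\left[-\frac{\alpha_a\,\alpha_b}{\alpha_p}\,(\mathbf{A}-\mathbf{B})^2\right] \tag{16.11}$$
$$T = W\,(\mathbf{P}-\mathbf{Q})^2 \tag{16.12}$$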
32 L.E. McMurchie and E.R. Davidson, J. Comput. Phys. 26, 218 (1978); see also: L.E.
McMurchie, Ph.D. Thesis (University of Seattle), (1977).
33 For an excellent discussion, see V.R. Saunders, in: "Methods in Computational Molecular
Physics", NATO ASI Ser. D, Vol. 113, (Eds. G.H.F. Diercksen and S. Wilson), Reidel, Dordrecht
(1983), pp. 1.
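(16.13)
$$\alpha_p = \alpha_a + \alpha_b \tag{16.14}$$
and
$$F_0(T) = \int_0^1 e^{-u^2 T}\,du = \frac{1}{2}\sqrt{\frac{\pi}{T}}\;\mathrm{erf}\!\left(T^{1/2}\right) \tag{16.15}$$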
Given the complexity of the integrand in (16.10), one should perhaps be surprised that
the integral can be solved analytically at all.
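The closed form for F_0(T) in (16.15) is easy to check numerically. The sketch below evaluates the error-function expression and compares it with a direct Simpson quadrature of the defining integral. The function names, the small-T guard (which uses the first two terms of the Taylor series, 1 - T/3) and the number of quadrature intervals are choices made here.

```python
import math


def boys_f0(t):
    """F_0(T) from the closed form 0.5*sqrt(pi/T)*erf(sqrt(T))."""
    if t < 1e-12:
        return 1.0 - t / 3.0      # leading terms of the series, avoids 0/0 at T = 0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))


def boys_f0_quadrature(t, intervals=2000):
    """F_0(T) by composite Simpson integration of exp(-T u^2) over [0, 1]."""
    h = 1.0 / intervals
    total = 1.0 + math.exp(-t)    # end points u = 0 and u = 1
    for i in range(1, intervals):
        weight = 4.0 if i % 2 else 2.0
        u = i * h
        total += weight * math.exp(-t * u * u)
    return total * h / 3.0


if __name__ == "__main__":
    for t in (0.0, 0.5, 3.0, 25.0):
        print(t, boys_f0(t), boys_f0_quadrature(t))
```

Production integral codes usually rely on tabulated values and recursion for the higher-order functions F_n(T); the quadrature here only serves as an independent check.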
It can be seen from (15.12) that basis functions with higher quantum numbers
can be generated through repeated differentiation of an s-type Gaussian. Consequently,
expressions for integrals involving any Cartesian Gaussian can be obtained by
differentiating (16.10), according to Leibniz' theorem.
An important characteristic of the GPT is that when basis functions (shells)
with different L-values share the same orbital exponent and center ("family" basis sets)
the expansion functions {φ_p} in (16.2) can be used for all members of the "family" with
lower L-values. We may consider as an example the products obtained by the
functions in two different p-shells: Nine products χ_a χ_b can be formed, which requires a
total of ten new functions {φ_p} for the re-expansion. These new functions {φ_p} form
an s-, a p-, and a d-shell, all centered at r = r_p. However, the s-s, s-p and p-s
combinations of basis functions with the same orbital exponents can also be expanded
in the same set of expansion functions {φ_p}. In general, the full Cartesian set {χ} with
angular quantum number = L contains (1/2)(L+1)(L+2) functions. With all lower family
members included, the number would be (1/6)(L+1)(L+2)(L+3). The re-expansion set
34 J. Almlöf and P.R. Taylor, in: "Advanced Theories and Computational Approaches to the
Electronic Structure of Molecules", NATO ASI Ser. C, Vol. 133 (Ed. C. Dykstra), Reidel, Dordrecht
(1984), pp. 107-125.
[Figure: two panels, "Small Molecule, ⟨R⟩ = 3 Å" and "Large Molecule, ⟨R⟩ = 30 Å", plotting
relative abundance against integral magnitude.]
Graphs showing the relative abundance of the two-electron integrals versus their magnitude for one
small and one large system.
The dotted lines in the graphs above indicate a suggested cutoff for practical
calculations - contributions smaller than a certain threshold do not make any difference
to the final result and can therefore be safely neglected. One should note that the
threshold must be set tighter in a large calculation, since the number of marginally
significant contributions grows faster than the total energy. Nevertheless, the message
is quite clear: For large molecules, the majority of the integrals can be neglected!
However, while it is certainly important to take advantage of this situation, it
cannot be exploited to its fullest without some further considerations. Consider a
calculation with some 2000 primitive basis functions - a calculation many would
consider routine today. Without any further simplifications, this calculation requires
the evaluation of about 2·10^12 integrals. Even if the magnitude of an integral could be
estimated in a couple of machine cycles (say, 10 ns), it would still take many hours to
carry out the tests, whatever the conclusion of those tests might be. Accordingly the
tests must be carried out with some care, and the different techniques to eliminate the
evaluation of integrals as accurately and cheaply as possible constitute one of the major
challenges in contemporary method development. The art of evaluating integrals faster
and faster developed very rapidly in the 1970s and 1980s, but, for the reasons
mentioned, the focus of the development efforts has shifted to the various tricks required
to eliminate the calculation of integrals altogether.
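The figures in this estimate follow from simple counting. The sketch below assumes that "integrals" means the symmetry-unique quartets (pairs of pairs of basis functions) and that each magnitude test costs 10 ns, as in the text; the function names are illustrative.

```python
def unique_integral_count(n_basis):
    """Number of symmetry-unique two-electron integrals for n_basis real functions."""
    pairs = n_basis * (n_basis + 1) // 2
    return pairs * (pairs + 1) // 2


def screening_hours(n_basis, seconds_per_test=10e-9):
    """Wall time, in hours, needed to test every unique integral once."""
    return unique_integral_count(n_basis) * seconds_per_test / 3600.0


if __name__ == "__main__":
    n = 2000
    print("unique integrals: %.2e" % unique_integral_count(n))
    print("time to test them all: %.1f hours" % screening_hours(n))
```

Since the cost of testing alone scales as N^4, efficient schemes test at the level of pairs or shells rather than individual quartets.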
To realize how the above ideas can be incorporated into a scheme for ab initio electronic
structure calculations, it is useful to first consider the prescreening of integrals normally
done in conventional large-scale LCAO work. As discussed in Section 16, the
expression for an integral over primitive Gaussians can be formally written as
(17.1)
where S_ab is a radial overlap between orbitals χ_a and χ_b, and T_abcd is a slowly varying
angular factor. In many situations the product S_ab S_cd thus constitutes a good estimate
of the magnitude of the integral, and it may seem attractive to use that product as an
estimate in screening out small integrals. However, since the product does not provide
a strict upper bound, a few integrals are sometimes eliminated from further
consideration by this screening even if their magnitude is well above the screening
threshold, and this can have very detrimental effects in a variational calculation. It is
preferable, therefore, to work with a strict upper bound for the magnitude of the
integrals. Such a bound can be obtained from the Schwarz inequality:
$$|(ab|cd)| \le Q_{ab}\,Q_{cd} \tag{17.2a}$$
where $Q_{ab} = \sqrt{(ab|ab)}$.
[Figure: actual dependence of the computational effort on the number of basis functions for
linear molecules, planar systems and 3-D clusters.]
The figure shows the deviation from the formal N^4 dependence for three different types of systems.
Note that with 1000 basis functions, even the trend for the 3-D clusters - which may appear modest on
this scale - amounts to saving a factor of nearly 1000 on the time.
For small molecules a value for the threshold $\tau$ of $10^{-7}$ to $10^{-8}$ is usually
reasonable, but it must be tightened as larger systems are considered, due to the
inevitable accumulation of errors. Significant deviation from the $N^4$ dependence is
therefore only seen for extended systems, and a calculation on a molecule of chemically
interesting size may thus easily require $10^7$ to $10^9$ integrals to be evaluated even for
rather modest basis sets. In the early days of computational quantum chemistry, the
computation of the two-electron integrals was a major bottleneck, and consequently
they were stored for re-use whenever possible.
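The strictness of the Schwarz bound can be illustrated with a small sketch. The "integrals" here are not real two-electron integrals: they are a random positive semidefinite matrix over pair indices, which is an assumption standing in for the supermatrix $(ab|cd)$ (real Coulomb integrals share this property). The threshold value and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs = 40

# Rows of B with widely varying magnitude mimic pairs with large and small overlap.
scale = np.exp(-rng.uniform(0.0, 12.0, size=n_pairs))
B = rng.standard_normal((n_pairs, 6)) * scale[:, None]
V = B @ B.T                      # V[p, q] plays the role of (ab|cd), p = ab, q = cd

K = np.sqrt(np.diag(V))          # K_ab = sqrt((ab|ab))
bound = np.outer(K, K)           # Schwarz bound K_ab * K_cd

tau = 1e-8                       # screening threshold (illustrative value)
kept = bound > tau
missed = (~kept) & (np.abs(V) > tau)   # integrals wrongly discarded

print("kept", int(kept.sum()), "of", V.size, "; wrongly discarded:", int(missed.sum()))
```

Because the bound is rigorous, no element above the threshold is ever discarded, which is exactly the property the simple product of radial overlaps lacks.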
A screening based on the ideas discussed so far can be implemented in any
scheme relying on an explicit evaluation of two-electron integrals in an AO basis. In
addition, the direct approach to electronic structure offers a more powerful screening
criterion by considering how these integrals are used in each SCF iteration to evaluate
the energy and to build the Fock matrix. With a closed-shell Hartree-Fock scheme as a
prototype example, the Fock matrix elements are obtained from the density matrix and
the integrals as:
$$F_{ab} = h_{ab} + \sum_{cd} D_{cd}\,[\,2(ab|cd) - (ac|bd)\,] \qquad (17.4)$$
Due to the permutational symmetry of the integral expression the following integrals are
equal:
$$(ab|cd) = (cd|ab) = (dc|ba)^* = (ba|dc)^*$$
$$(ba|cd) = (cd|ba) = (ab|dc)^* = (dc|ab)^* \qquad (17.5)$$
and with real basis functions, which are almost always used in calculations on
polyatomic systems, the eight integrals in (17.5) are all identical. It is then only
necessary to calculate one integral in this redundant set, and the processing of a general
two-electron integral $(ab|cd)$ therefore requires the operations outlined in (17.6) for a
closed-shell case:
It is evident from (17.6) that there will be a significant contribution to the Fock matrix
only if both the integral and at least one of the six density matrix elements in these
expressions are significantly different from zero.
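The following sketch shows one way to process only the permutationally unique integrals when building the closed-shell Fock matrix of (17.4). It is not a transcription of (17.6): instead of hard-coded update formulas it enumerates the distinct index permutations of each unique integral, which also handles coinciding indices correctly. For an integral with four distinct indices, the updates touch $F_{ab}$, $F_{cd}$, $F_{ac}$, $F_{ad}$, $F_{bc}$ and $F_{bd}$, i.e. six density matrix elements are involved. The random integrals and all function names are assumptions made for illustration.

```python
import numpy as np

def symmetrized_integrals(n, seed=0):
    """Random real tensor with the eightfold symmetry of (17.5); a stand-in for real integrals."""
    g = np.random.default_rng(seed).standard_normal((n, n, n, n))
    g = g + g.transpose(1, 0, 2, 3)      # (ab|cd) = (ba|cd)
    g = g + g.transpose(0, 1, 3, 2)      # (ab|cd) = (ab|dc)
    g = g + g.transpose(2, 3, 0, 1)      # (ab|cd) = (cd|ab)
    return g

def fock_from_unique(h, D, g):
    """Build F of (17.4), visiting each unique integral (a>=b, c>=d, ab>=cd) once."""
    n = h.shape[0]
    F = h.copy()
    for a in range(n):
        for b in range(a + 1):
            for c in range(a + 1):
                d_max = b if c == a else c
                for d in range(d_max + 1):
                    value = g[a, b, c, d]
                    # Distinct members of the redundant set; a set removes duplicates
                    # that arise when some indices coincide.
                    images = {(a, b, c, d), (b, a, c, d), (a, b, d, c), (b, a, d, c),
                              (c, d, a, b), (d, c, a, b), (c, d, b, a), (d, c, b, a)}
                    for p, q, r, s in images:
                        F[p, q] += 2.0 * D[r, s] * value   # Coulomb-type term
                        F[p, r] -= D[q, s] * value         # exchange-type term
    return F

n = 4
rng = np.random.default_rng(1)
h = rng.standard_normal((n, n)); h = h + h.T
D = rng.standard_normal((n, n)); D = D + D.T
g = symmetrized_integrals(n)

F_unique = fock_from_unique(h, D, g)
F_full = h + 2.0 * np.einsum('abcd,cd->ab', g, D) - np.einsum('acbd,cd->ab', g, D)
print(np.allclose(F_unique, F_full))
```

A production code would replace the set construction by explicit update formulas with degeneracy factors, but the result must agree with the full summation in the same way.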
Since the density matrix elements are known by the time the integral is to be
evaluated, they can be incorporated in the prescreening tests. The evaluation of an
integral is only necessary when the maximum contribution to the Fock matrix exceeds a
given threshold $\tau$:
The test (17.8) places a rigorous bound on the maximum error allowed in the
calculated Fock matrix element. It should be noted that it is not practical to screen on
the contributions to the total energy:
While this would be a more powerful screening criterion in the sense that many more
integrals would be eliminated, it leaves an unmonitored error in the Fock matrix. As a
result, the SCF calculation is not guaranteed to converge with the screening in (17.9),
and indeed does not do so except in trivial cases. However, Eq. (17.9) is still useful
when evaluating energy-related properties, e.g. the nuclear forces in calculations of
equilibrium geometries.
Even though the integral pre-screening may appear similar in a direct and a
conventional SCF scheme, the logistics of the two procedures are entirely different. In
the traditional approach certain integrals are eliminated once and for all. That procedure
modifies the functional dependence of the energy on the MO coefficients, and great care
must be taken to ensure that a variational collapse is avoided. With the direct SCF any
tendency towards such a collapse leads to an increase of the pertinent density matrix
elements, automatically ensuring the evaluation of the critical integrals in the subsequent
iteration.
Usually, it is desirable to evaluate integrals in batches, corresponding to shells
of basis functions having the same centers, L-values, and orbital exponents. However,
of the different screening criteria suggested above, only the radial overlap products
pertain to full batches of integrals. With processing of integrals by shells, the test
based on Eq. (17.8) must rely on the maxima of the density matrices and exchange
integrals evaluated for each shell:
$$D_{AB} = \max_{a \in A,\; b \in B} |D_{ab}| \qquad (17.10)$$
$$K_{AB} = \max_{a \in A,\; b \in B} \sqrt{(ab|ab)} \qquad (17.11)$$
where the indices A and B refer to shells, and a and b run over all functions in a shell.
These compressed matrices are of rather modest dimensions, and may conveniently be
kept in memory during integral evaluation even for very large basis sets.
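A minimal sketch of the shell-compressed quantities (17.10) and (17.11) and of a batch-level test built from them is given below. The precise form of the test is an assumption: it multiplies the two Schwarz factors by the largest of the six shell-pair density maxima that can combine with the batch, and compares with the threshold. The arrays `D` and `sqrt_exchange` are made-up numbers, and the names `shell_maxima` and `batch_needed` are hypothetical.

```python
import numpy as np

def shell_maxima(M, shells):
    """Compress M to shell pairs: result[A, B] = max |M[a, b]| for a in A, b in B."""
    out = np.zeros((len(shells), len(shells)))
    for iA, fa in enumerate(shells):
        for iB, fb in enumerate(shells):
            out[iA, iB] = np.abs(M[np.ix_(fa, fb)]).max()
    return out

def batch_needed(sA, sB, sC, sD, K_sh, D_sh, tau):
    """Assumed batch test: Schwarz bound times the largest relevant density element."""
    d_max = max(D_sh[sC, sD], D_sh[sA, sB],          # Coulomb-type partners
                D_sh[sB, sD], D_sh[sB, sC],          # exchange-type partners
                D_sh[sA, sD], D_sh[sA, sC])
    return K_sh[sA, sB] * K_sh[sC, sD] * d_max > tau

# Five basis functions grouped into two shells (illustrative data only).
shells = [[0, 1], [2, 3, 4]]
v = np.array([1.0, -2.0, 0.1, 0.01, 0.0])
D = np.outer(v, v)                                   # stand-in density matrix
k = np.array([1.0, 0.5, 1e-3, 1e-4, 1e-5])
sqrt_exchange = np.outer(k, k)                       # stand-in for sqrt((ab|ab))

D_sh = shell_maxima(D, shells)                       # Eq. (17.10)
K_sh = shell_maxima(sqrt_exchange, shells)           # Eq. (17.11)
tau = 1e-10
print(batch_needed(0, 0, 0, 0, K_sh, D_sh, tau), batch_needed(1, 1, 1, 1, K_sh, D_sh, tau))
```

Only the small matrices `D_sh` and `K_sh` are consulted inside the integral loops, which is what makes the test cheap enough to apply to every batch.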
Further important savings in the number of calculated integrals may be
obtained by considering the relationship between Fock matrices of two consecutive
iterations. Again considering the closed-shell Hartree-Fock example, these matrices in
iterations (m) and (m-1) are:
$$F^{(m)}_{ab} = h_{ab} + \sum_{cd} D^{(m)}_{cd}\,[\,2(ab|cd) - (ac|bd)\,] \qquad (17.12a)$$
$$F^{(m-1)}_{ab} = h_{ab} + \sum_{cd} D^{(m-1)}_{cd}\,[\,2(ab|cd) - (ac|bd)\,] \qquad (17.12b)$$
which yields the recurrence relation (since the integrals are the same in every iteration)
$$F^{(m)}_{ab} = F^{(m-1)}_{ab} + \sum_{cd} \Delta D^{(m)}_{cd}\,[\,2(ab|cd) - (ac|bd)\,], \qquad \Delta D^{(m)} = D^{(m)} - D^{(m-1)}$$
This illustrates the fact that only those electron repulsion integrals are required
which are related to significant changes in the density matrix from one iteration to the
next. A screening criterion similar to those previously suggested could still be used,
substituting $\Delta D$ for D in (17.9). Especially close to convergence, this criterion is very
efficient, reducing the number of required integrals to zero in the limit of full
convergence.
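Because the two-electron part of the closed-shell Fock matrix is linear in the density, the Fock matrix of iteration (m) can be obtained from that of iteration (m-1) plus a contribution from the density difference alone. The sketch below checks this numerically with random stand-in data; `two_electron_part` is a name introduced here, and the integral tensor is not a real set of integrals.

```python
import numpy as np

def two_electron_part(D, g):
    """G_ab = sum_cd D_cd [2 (ab|cd) - (ac|bd)], the density-dependent part of (17.4)."""
    return 2.0 * np.einsum('abcd,cd->ab', g, D) - np.einsum('acbd,cd->ab', g, D)

rng = np.random.default_rng(5)
n = 5
g = rng.standard_normal((n, n, n, n))     # stand-in integrals (only linearity matters here)
h = rng.standard_normal((n, n))
D_prev = rng.standard_normal((n, n))

# Near convergence only a few density elements still change appreciably.
D_new = D_prev.copy()
D_new[:2, :2] += 1e-3
delta_D = D_new - D_prev

F_prev = h + two_electron_part(D_prev, g)
F_incremental = F_prev + two_electron_part(delta_D, g)
F_direct = h + two_electron_part(D_new, g)

tau = 1e-8
n_significant = int(np.count_nonzero(np.abs(delta_D) > tau))
print(np.allclose(F_incremental, F_direct), n_significant)
```

In a screened calculation only integrals that multiply the few significant elements of the density difference would have to be evaluated.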
The tests described above are instrumental in making the direct SCF a viable
approach for very large systems. One can actually proceed one step further, and only
calculate accurately those elements of the Fock matrix that would make a significant
contribution to the density in the next iteration. This approach is especially useful in
where the integral $(ab|cd)$ must obviously be evaluated if either of the criteria in (17.16)
is fulfilled. As a consequence of the particular structure of D and K discussed above,
(17.16a) does not eliminate a large number of integrals in addition to those removed by
(17.3) only, whereas (17.16b) used alone would have eliminated a large number of
such integrals. As an example, a calculation on a 148 atom diamond-like carbon cluster
revealed that the number of remaining contributions required by criterion (17.16a)
exceeded the ones required by (17.16b) by more than one order of magnitude, a
difference that would also be reflected in the computer time requirement if the two types
of contributions were to be evaluated independently.
This observation opens possibilities for several interesting modifications of the
current computational schemes. The structure of the calculations can often be simplified
if Coulomb- and exchange-type contributions are evaluated separately. This would
only lead to an insignificant increase of effort for large molecules, since the Coulomb
part will dominate the calculation according to the above reasoning. Furthermore, since
Coulomb repulsion is a relatively simple (i.e. classical) form of interaction, one should
be able to exploit simpler schemes for its evaluation than the conventional one based on
four-center integrals. We will discuss several such simplifications in the following
sections.
Incidentally, we note that this difference in the screening behavior is even more
pronounced if we consider a screening based on the contributions to the total energy in
Eq. (17.9) as discussed previously. Even though this screening cannot be used in the
SCF calculation for reasons discussed above, it is very useful when evaluating
properties such as nuclear forces ("gradients") and force fields.36 For such
applications, the experience with the test case discussed above indicates a difference of
nearly three orders of magnitude, suggesting that the exchange-type contribution is
totally insignificant and that all efforts should be concentrated on finding simpler ways
to evaluate the Coulomb part. The evaluation of SCF gradients is naturally a "direct"
scheme since the integral derivatives are only needed once, and their contribution can be
processed on the fly without intermediate storage. The separation of Coulomb and
exchange contributions is not common, however, but has been found to lead to
dramatic savings in computer time for medium- and large-size systems.
36 See e.g. T. Helgaker and P. Jørgensen, in "Methods in Computational Molecular Physics", NATO
ASI Ser. B, Vol. 293 (Eds. S. Wilson and G.H.F. Diercksen), Reidel, Dordrecht (1992).
The philosophy of the direct SCF approach was based on the observation that
our integral processing ability had outgrown our capacity for storing and moving these
integrals, and as a consequence we reduced the storage requirement at the expense of
more computation. It is evident, though, that once the bottlenecks due to input/output
and disk space have been removed by a direct approach, the time needed for integral
evaluation will again be a bottleneck in electronic structure calculations on large
molecules. Much work is therefore needed to improve the efficiency by which these
integrals are being evaluated.
The distribution of electron density occurring in a four-center, two-electron
integral is typically given as a product of basis functions. The Gaussian Product
Theorem (GPT) reduces the electronic repulsion to two-center terms, and also reduces
the number of integrals significantly.
Once these relatively few quantities have been evaluated, they are expanded to a
much larger number of four-center electron-repulsion integrals. In a direct approach
these are immediately reduced again to a relatively small set, such as a Fock matrix or a
gradient vector. By far the most time-consuming part of this procedure is the
transformation and handling of the large intermediate set of four-center electron-
repulsion integrals. It would appear very desirable to circumvent this step, and directly
build the Fock matrix from the two-center integrals over Hermite functions in (16.5), or
even from the incomplete Gamma functions (16.15), without ever handling a single
four-center integral over primitive or contracted basis functions. It is evident that this
approach could lead to very significant reductions in the operation count. To some
extent, similar ideas have been implemented in current integral algorithms, where the
transformation from a primitive to a contracted basis is partly carried out before the
primitive integrals are fully evaluated.
A simple analysis of operation counts shows that it is more efficient to
construct the Coulomb part of the Fock matrix in a basis of these Hermite functions
(strictly speaking, this is a vector-, not a matrix representation of the Coulomb
potential) and transform to the conventional basis at the end of each iteration. Such a
transformation only requires $N^2$ work and can be done at essentially insignificant cost
as follows:
In accordance with the Gaussian Product Theorem (Eq. (16.1)), a two-center
product of primitive Gaussians can be expanded in the basis of one-center Gaussians
(Hermite or Cartesian):
(18.1)
In (18.2), $(ab|cd)$ are the two-electron, four-center integrals over primitive Gaussians,
and the $(p|q)$ are two-electron, two-center integrals over Hermite Gaussians according
to (16.5).
The evaluation and processing of integrals for the Coulomb contributions to the Fock
matrix would then be carried out as
$$J_p = \sum_q D_q\,(p|q) \qquad (18.4)$$
where
$$D_q = \sum_{c,d} C_q^{(c,d)}\, D_{cd} \qquad (18.5)$$
is the representation of the density in the Hermite basis, which can be evaluated before
the beginning of an SCF iteration, outside the calculation of two-electron integrals.
Similarly, the back-transformation to the ordinary pair-representation
is carried out at the end of the direct SCF iteration at insignificant cost.
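The three steps (density into the Hermite basis, contraction with two-center integrals, back-transformation) can be sketched as follows. All numbers are random stand-ins, not real expansion coefficients or integrals, and the form of the back-transformation, $J_{ab} = \sum_p C_p^{(a,b)} J_p$, is an assumption consistent with an expansion of each pair product in the one-center functions. The check is that the result agrees with the conventional contraction over four-center integrals assembled from the same ingredients.

```python
import numpy as np

rng = np.random.default_rng(1)
n_basis, n_herm = 4, 9

# C[p, a, b]: expansion coefficients of the pair product ab in one-center functions p.
C = rng.standard_normal((n_herm, n_basis, n_basis)); C = C + C.transpose(0, 2, 1)
V = rng.standard_normal((n_herm, n_herm)); V = V + V.T      # two-center integrals (p|q)
D = rng.standard_normal((n_basis, n_basis)); D = D + D.T    # density matrix

D_herm = np.einsum('qcd,cd->q', C, D)        # Eq. (18.5): density in the Hermite basis
J_herm = V @ D_herm                          # Eq. (18.4): Coulomb vector
J_pair = np.einsum('pab,p->ab', C, J_herm)   # assumed back-transformation to pairs

# Reference: build the four-center integrals explicitly and contract with D.
g = np.einsum('pab,pq,qcd->abcd', C, V, C)
J_ref = np.einsum('abcd,cd->ab', g, D)
print(np.allclose(J_pair, J_ref))
```

The point of the reordering is that the four-index array `g` never has to be formed; it appears here only to verify the result.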
The approach is akin to the idea of an 'auxiliary' product basis in the
"multiplicative" approximation, and can be used along with that procedure, which is also
a very time-saving device in large calculations. An additional benefit of this approach is
that all the low-L components of a shell of basis functions can be obtained at zero
additional cost. The case of four shells on different centers, all with L=2 (d-orbitals),
may serve as an illustration. Each pair would contain 36 products, and the four shells
together therefore generate 1296 integrals with a conventional approach. We can instead
use a set of 35 Hermite Gaussians to represent each pair. This would give rise to only
slightly fewer integrals (1225), but the most expensive part of the integral calculation
would be avoided through the use of the one-center product functions as the basis of
representation. In addition, these 35 functions would now also represent the entire
"family" basis, i.e. p- and s-type basis functions on these centers with the same exponent.
Therefore, a total of ten functions on each center, or a total of 10,000 integrals, are
accounted for with this method.
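The counts quoted in this example follow from elementary combinatorics, assuming Cartesian Gaussians (six d components per shell) and Hermite functions of total order up to L=4 for a product of two d functions. The helper names below are introduced for illustration.

```python
def n_cartesian(L: int) -> int:
    """Number of Cartesian components of angular momentum L."""
    return (L + 1) * (L + 2) // 2

def n_hermite_up_to(L: int) -> int:
    """Number of one-center Hermite functions with total order 0..L."""
    return (L + 1) * (L + 2) * (L + 3) // 6

pair_products = n_cartesian(2) ** 2                      # d x d products per pair
conventional = pair_products ** 2                        # integrals for four d shells
hermite_per_pair = n_hermite_up_to(4)                    # orders up to L = 2 + 2
hermite_integrals = hermite_per_pair ** 2
family = sum(n_cartesian(l) for l in range(3))           # s + p + d on one center
family_integrals = family ** 4

print(pair_products, conventional, hermite_per_pair, hermite_integrals, family, family_integrals)
```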
$$F = \int_0^1 F(u)\, du \qquad (18.8)$$
$$\frac{1}{r_{12}} = \frac{2}{\sqrt{\pi}} \int_0^\infty \exp(-r_{12}^2 u^2)\, du \qquad (18.9)$$
which is obviously very useful when Gaussian basis functions are involved; and 2) the
separability of the Gaussian basis function itself:
37 M. Dupuis, J. Rys and H.F. King, J. Chem. Phys. 65, 111 (1976).
(18.10)
(18.11)
where
(18.12)
If the integration over the auxiliary coordinate u is temporarily delayed, the separation
of the summations in the buildup of the Coulomb matrix enables a very efficient
evaluation of the sum:
$$J_{ab} = J_{a_x,a_y,a_z,b_x,b_y,b_z} = \int_0^\infty \Big\{ \sum_{c_x,c_y,c_z} \sum_{d_x,d_y,d_z} I^{(x)}(u)\, I^{(y)}(u)\, I^{(z)}(u) \Big\}\, du = \int_0^\infty j_{ab}(u)\, du \qquad (18.13)$$
One can now postpone the integration over u until the very end of an SCF
iteration. Each matrix j(u) can then be built up with very few arithmetic operations due
to the effective separation between the x-, y- and z-parts of the integral. j(u) can be
constructed for 8-12 different quadrature points $u_k$ (in parallel if desired), and the final
matrix can be obtained through a simple numerical quadrature scheme with weights $w_k$:
$$J_{ab} = \int_0^\infty j_{ab}(u)\, du \approx \sum_k w_k\, j_{ab}(u_k)$$
carried out separately for each of the (non-zero) matrix elements. Again, while the
scheme has been described here in detail only for the Coulomb part of the Fock matrix,
the exchange can be treated in the same spirit and with similar gains in efficiency. The
approach allows a much more efficient factorization of the work done at the innermost
loop level, which more than compensates for the extra computations involving the
additional quadrature.
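The two ingredients of this scheme, the integral transform of $1/r$ and the separability of the Gaussian in x, y and z, can be checked numerically with a short sketch. The quadrature used here (Gauss-Legendre after mapping the infinite interval onto [0, 1)) is chosen for simplicity and is an assumption; it is not the 8-12 point scheme referred to in the text, and the function name is hypothetical.

```python
import math
import numpy as np

def inverse_distance_by_quadrature(r, n_points=200):
    """Approximate 1/r = (2/sqrt(pi)) * integral_0^inf exp(-r^2 u^2) du numerically."""
    t, w = np.polynomial.legendre.leggauss(n_points)
    t = 0.5 * (t + 1.0)                  # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * w
    u = t / (1.0 - t)                    # map [0, 1) onto [0, inf)
    jacobian = 1.0 / (1.0 - t) ** 2
    integrand = np.exp(-(r * u) ** 2)
    return 2.0 / math.sqrt(math.pi) * float(np.sum(w * jacobian * integrand))

r_vec = np.array([1.0, -0.5, 1.0])       # illustrative inter-particle vector
r = float(np.linalg.norm(r_vec))         # = 1.5
approx = inverse_distance_by_quadrature(r)

# Separability at a fixed u: the Gaussian factorizes into x, y and z parts.
u0 = 0.7
full = math.exp(-(r * u0) ** 2)
factored = math.prod(math.exp(-(x * u0) ** 2) for x in r_vec)
print(approx, 1.0 / r, full, factored)
```

At each quadrature point the three Cartesian directions decouple, which is what allows the summations to be factorized before the final integration over u.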
(19.1)
Similar ideas are frequently used in density functional technology.39 Even when a
basis set is used in these calculations, the density is usually re-expanded in other basis
sets.
The form of (19.1) may look similar to Eq. (16.1), the Gaussian Product
Theorem. In (16.1), however, the expansion is exact and the expansion basis set is
specific for each product $\chi_a\chi_b$. The size of the expansion basis for that expansion is
therefore of the order $N^2$, N being the number of LCAO basis functions. In contrast,
the size of the set used in (19.1) is only of the order N, but the expansion is global, i.e.
the sum runs over all expansion functions (labeled by the index p).
There are several ways to determine the expansion coefficients $c^{(a,b)}$ in (19.1).
If we define a residual function
(19.2)
we can apply several different criteria by which $R_{ab}$ is to be minimized. One of the
most obvious would be the minimization of the norm:
(19.3)
This leads to
$$c_p^{(a,b)} = \sum_q (ab\,q)\,(S^{-1})_{pq} \qquad (19.4a)$$
or
$$\mathbf{c}^{(a,b)} = \mathbf{S}^{-1}\,\mathbf{s}^{(a,b)} \qquad (19.4b)$$
where $\mathbf{s}^{(a,b)}$ denotes the vector of three-center overlaps $(ab\,q)$.
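The least-squares determination of the expansion coefficients can be illustrated with a one-dimensional model. Everything in this sketch is an assumption made for illustration: the basis functions are 1-D Gaussians on a grid, the overlaps are evaluated by a simple Riemann sum, and the auxiliary set is chosen arbitrarily. What it demonstrates is the structure of the solution: solving $\mathbf{S}\mathbf{c} = \mathbf{s}$ leaves a residual that is orthogonal to every expansion function.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]

def gaussian(alpha, center):
    """Unnormalized 1-D Gaussian sampled on the grid."""
    return np.exp(-alpha * (x - center) ** 2)

# The product chi_a * chi_b that is to be expanded.
product = gaussian(0.8, -0.5) * gaussian(1.3, 0.7)

# A small, arbitrary auxiliary set of expansion functions.
aux = np.array([gaussian(alpha, c) for alpha in (1.0, 3.0) for c in (-1.0, 0.0, 1.0)])

S = aux @ aux.T * dx             # overlap matrix of the expansion functions
s = aux @ product * dx           # overlaps of the product with each expansion function
c = np.linalg.solve(S, s)        # minimizes the norm of the residual

residual = product - c @ aux
residual_overlaps = aux @ residual * dx
norm_residual = float(np.sqrt(np.sum(residual ** 2) * dx))
norm_product = float(np.sqrt(np.sum(product ** 2) * dx))
print(norm_residual / norm_product, np.abs(residual_overlaps).max())
```

The remaining residual norm measures the incompleteness of the chosen auxiliary set; enlarging the set reduces it.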
38 C. Van Alsenoy, J. Comp. Chem. 9, 620 (1988); A. Fortunelli and O. Salvetti, J. Comp. Chem.
12, 36 (1991); A. Fortunelli and O. Salvetti, Chem. Phys. Lett. 186, 372 (1991).
39 B.I. Dunlap, J.W.D. Connolly, J.R. Sabin, J. Chem. Phys. 71, 3396 (1979).
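(19.5a)
or as
$$(ij|kl) = \sum_p b_p^{(i,j)}\, c_p^{(k,l)} = \mathbf{b}^{(i,j)\dagger}\,\mathbf{c}^{(k,l)} = \mathbf{b}^{(i,j)\dagger}\,\mathbf{S}^{-1}\,\mathbf{s}^{(k,l)} \qquad (19.5b)$$
$$1 = \sum_{p,q}^{\infty} |p\rangle\,(S^{-1})_{pq}\,\langle q| \approx \sum_{p,q}^{N} |p\rangle\,(S^{-1})_{pq}\,\langle q| \qquad (19.7)$$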
where the approximate equality in (19.7) reflects the incompleteness of the finite
expansion basis.
The minimization of the norm (19.3) is just one out of many reasonable criteria
for determining $c^{(a,b)}$. Minimizing the self-repulsion of the residual, i.e. minimizing
(19.8)
whether we have expanded electron distribution {ij}, {kl}, or both. The same
expression would be obtained by minimizing the error in the electrostatic potential
generated from the charge distribution $\chi_a\chi_b$.
More generally, we may define a residual overlap $r_j = (B_j|R_{ab})$, where $\{B_j\}$ is
an arbitrary set of functions. Minimizing $Z = \sum_j r_j^2$ leads to
(19.11)
where dr = (Bjlab) and Ajk = (Bjlak). In the special case of {Bj} = {~u} , we have
A. =At =1{ and we obtain Eq. (19.9) again.
To summarize, this suggests at least four different approximations of a four-
center, two-electron integral $(ab|cd)$ based on 3-center quantities:
$$(ab|cd) \approx \sum_{tu} (ab\,t)\,(S^{-1})_{tu}\,(cd|u) \qquad (19.12b)$$
40 O. Vahtras, J. Almlöf and M.W. Feyereisen, Chem. Phys. Lett. 213, 514 (1993).
41 I. Panas and J. Almlöf, Intern. J. Quantum Chem. 40, 797 (1991); I. Panas, J. Almlöf and M.W.
Feyereisen, Intern. J. Quantum Chem. 42, 1073 (1992).
(20.1)
where $C_{lm}^{(ab,P)}$ are the multipole moments of $\chi_a\chi_b$ evaluated around the center (P),
and $F_{lm}(\mathbf{r}-\mathbf{r}_P)$ is the expansion of the electrostatic field at the point r due to that
multipole moment. For the interaction between non-penetrating charge distributions ab
and cd, which would be an approximation for a two-electron integral $(ab|cd)$:
where $I_{lm,l'm'}(\mathbf{R}_{PQ})$ is the interaction between the multipoles. While this expression
does not provide a very useful approximation for individual integrals, it has the
advantage that electrons 1 and 2 are now effectively decoupled. Various contractions,
transformations, and summation over densities can therefore be carried out at the 1-
electron level. Obviously, the method has many similarities to the three-center
expansion discussed in Section 19, and the two can actually be combined with very
impressive results.
The first step in a calculation along these lines would be to define a partitioning
of the system into fragments. This partitioning need not have any chemical
significance, since the approximations used are numerically monitored, but there are
likely to be computational advantages if it does. Based on a screening of radial
overlaps, the electron repulsion terms can be divided into short-, medium- and long-
range interactions. Short-range interactions must always be dealt with in terms of
explicit two-electron integrals over basis functions but, as discussed above, their
number only increases linearly with size in an extended system. The long-range
Coulomb contribution to the Fock matrix can be obtained as follows: In the usual
expansion in terms of conventional two-electron integrals
the pairs 'ab' and 'cd' can now both be expressed in terms of multipole expansions (as
long as they do not penetrate into regions where both charge distributions are non-zero)
The summation over the basis function pair cd can now be carried out before the actual
loop over electron-repulsion contributions:
giving quantities $\mu_{l'm'}^{(Q)}$ that can be interpreted as the multipole moments of the
fragments (Q) with respect to the centers $\mathbf{R}_Q$. These multipole moments can be
precomputed and stored before each SCF iteration, and their contribution to the long-
range Fock matrix can then be easily evaluated. The expression for the Coulomb
matrix elements is obtained without any explicit summation over basis functions:
The summation over l', m', and Q simply gives the electrostatic potential, the field,
field gradient and the higher moments of the potential at the point $\mathbf{R}_P$ due to all the
multipole moments in the system.
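The idea that a distant fragment can be represented by a few precomputed multipole moments can be illustrated with classical point charges. This is a sketch under simplifying assumptions: the fragment is a set of three made-up point charges rather than a continuous density, Cartesian rather than spherical multipoles are used, and the expansion is truncated at the quadrupole.

```python
import numpy as np

# An illustrative "fragment": point charges close to the expansion center (the origin).
charges = np.array([1.0, -0.5, 0.8])
positions = np.array([[0.3, 0.0, 0.0],
                      [0.0, -0.4, 0.1],
                      [-0.2, 0.2, 0.3]])
R = np.array([6.0, 5.0, 4.0])            # distant point where the potential is needed

def exact_potential(point):
    return float(np.sum(charges / np.linalg.norm(point - positions, axis=1)))

# Moments are computed once per fragment and can be reused for every distant point.
monopole = charges.sum()
dipole = charges @ positions
r2 = np.sum(positions ** 2, axis=1)
quadrupole = (3.0 * np.einsum('i,ij,ik->jk', charges, positions, positions)
              - np.eye(3) * (charges @ r2))

def multipole_potential(point, order):
    dist = np.linalg.norm(point)
    v = monopole / dist
    if order >= 1:
        v += dipole @ point / dist ** 3
    if order >= 2:
        v += 0.5 * (point @ quadrupole @ point) / dist ** 5
    return float(v)

exact = exact_potential(R)
errors = [abs(multipole_potential(R, order) - exact) for order in (0, 1, 2)]
print(exact, errors)
```

For a well-separated point, each additional order reduces the error by roughly the ratio of fragment size to distance.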
describes the interaction between un-normalized spherical charge distributions. $S_{ab}$ and
$S_{cd}$ are just two overlap factors describing the magnitudes of these charge distributions.
Furthermore, we should note that for large arguments the error function in (16.15) can be
approximated as
$$\mathrm{erf}(x) \approx 1 - \frac{1}{\sqrt{\pi}\,x}\, e^{-x^2} \qquad (20.9)$$
$$F_0(T) \approx \frac{1}{2}\sqrt{\frac{\pi}{T}} \qquad (20.10)$$
and
In other words, the four-center integral just reflects the classical electrostatic repulsion
between point charges for large distances between the charges. The correction term in
(20.9) is due to the penetration of the two charge distributions, and is seen to decay
rapidly with distance. For integrals over basis functions with higher L-values, a similar
analysis reveals that the integral represents an interaction between higher multipoles.
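How quickly the penetration correction dies off can be seen numerically. The sketch uses the leading asymptotic term of the error function, $\mathrm{erf}(x) \approx 1 - e^{-x^2}/(\sqrt{\pi}\,x)$, and the standard closed form $F_0(T) = \tfrac{1}{2}\sqrt{\pi/T}\;\mathrm{erf}(\sqrt{T})$; the function names are introduced here for illustration.

```python
import math

def erf_asymptotic(x: float) -> float:
    """Leading large-x approximation of erf(x)."""
    return 1.0 - math.exp(-x * x) / (math.sqrt(math.pi) * x)

def f0(T: float) -> float:
    """F_0(T) in closed form, valid for T > 0."""
    return 0.5 * math.sqrt(math.pi / T) * math.erf(math.sqrt(T))

def f0_asymptotic(T: float) -> float:
    """Large-T limit of F_0(T): the point-charge (1/R) behavior."""
    return 0.5 * math.sqrt(math.pi / T)

for x in (1.0, 2.0, 3.0):
    print(x, math.erf(x), erf_asymptotic(x))
for T in (5.0, 15.0, 30.0):
    print(T, f0(T), f0_asymptotic(T))
```

Already for moderate arguments the difference between the exact and the asymptotic values is far below typical integral thresholds, which is why distant charge distributions can be treated classically.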
The fragment concept used here to represent long-range interaction has many
similarities with recent ideas of using an auxiliary basis set to expand the density
distributions, rather than the wavefunction. In both cases, the treatment of Coulomb
interaction is dramatically simplified since an external summation for the entire external
charge density is possible, while the savings for the exchange part (which can be
viewed merely as an efficient integral approximation) are less dramatic, though still
very significant. The multipoles used in the present approach could actually be viewed
as a particular choice of auxiliary basis functions with infinite orbital exponents,
centered in special positions in space. A combination of the present fragment approach
and a method involving auxiliary basis sets will be a very effective compromise for
many extended systems. With the fragments chosen to be atom-centered expansions,
the essential steps in the procedure would thus be as follows:
2: Evaluate the multipole moment of each such atomic expansion. This operation
is completely trivial; furthermore, the expansion is finite since the order of the multipole
moment corresponds exactly to the L-value of the auxiliary basis function if both are
expanded at the same center. The contribution from the nuclear charge could be
incorporated into this multipole, allowing all Coulomb interaction to be treated on an
equal footing. This would also have the effect of partly canceling the monopole terms,
which are the largest contributions and the ones most slowly decaying with distance.
3: Evaluate the potential and its derivatives (field, field gradient, etc.) at the site of
each atom from all the other atoms.
3b: At the highest level K, the potential $\{F_i^{(K)}\}$ at all centers $\{Q_i^{(K)}\}$ resulting from
all other $\mu_j^{(K)}$ is evaluated if such a multipole expansion is justified. Proximity in space
and convergence of the multipole expansion would dictate whether the expansion is
allowed.
3c: Using the expansion coefficients saved in 3a, the potentials at the (K-1) level
$\{F_i^{(K-1)}\}$ are calculated from $\{F_i^{(K)}\}$. For those interactions that have not already been
accounted for at the higher level, a new attempt is made to calculate further
contributions to $\{F_i^{(K-1)}\}$. This evaluation of successively smaller contributions
continues until the first level is reached.
While the scheme is more complex than the original one and involves more
overhead, there is nothing in it that scales quadratically in the number of atoms. An exact
analysis of the work involved is difficult without additional information about the
magnitudes of the multipoles and their distribution in space, but a scaling of $n \log(n)$
seems plausible.
42 J. Ambrosiano, L. Greengard and V. Rokhlin, Comput. Phys. Commun. 48, 117 (1988); L.
Greengard and V. Rokhlin, Chem. Scr. 29A, 139 (1989).
From general elementary statistics we know that the probability distribution for
two variables P(x,y) can be written as
$$P(x,y) = P_x(x)\, P_y(y) \qquad (21.2)$$
if and only if x and y are independent or uncorrelated variables, where the one-particle
probability distributions are defined as
We have shown how the Fock matrix can be obtained from integrals in the
LCAO representation. We will refer to this as the matrix representation of the Fock
operator in the AO basis set. Sometimes we will also need the representation of the
Fock operator in the molecular orbital basis. For this, we construct the matrix
When the SCF procedure has converged, this matrix is diagonal. We thus have an
expression for the orbital energies:
$$\varepsilon_i = (\varphi_i|h|\varphi_i) + \sum_k^n \{(\varphi_i\varphi_i|g|\varphi_k\varphi_k) - (\varphi_i\varphi_k|g|\varphi_i\varphi_k)\} \qquad (22.2)$$
and we also know that
$$F_{ij} = (\varphi_i|h|\varphi_j) + \sum_k^n \{(\varphi_i\varphi_j|g|\varphi_k\varphi_k) - (\varphi_i\varphi_k|g|\varphi_j\varphi_k)\} \qquad (22.3)$$
which clearly will be zero for $i \ne j$ when the orbitals satisfy the Hartree-Fock equations.
To describe an (n-1) electron system, we can simply remove one orbital, $\varphi_k$,
from the original set, assuming all the other orbitals to remain unchanged. With the
energy given as in Eq. (4.14) for the original n-electron system:
$$E_n = \sum_i^n (\varphi_i|h|\varphi_i) + \frac{1}{2}\sum_{i,j}^n \{(\varphi_i\varphi_i|g|\varphi_j\varphi_j) - (\varphi_i\varphi_j|g|\varphi_i\varphi_j)\} \qquad (22.4)$$
which, for the present purpose, we may view as summing all elements of a vector b and
a matrix A, constructed as:
Note that the diagonal elements $A_{jj}$ are identically zero by construction. For the ionized
system we get:
$$E_{n-1} = \sum_{i \ne k}^n b_i + \frac{1}{2}\sum_{i \ne k,\, j \ne k}^n A_{ij} = E_n - b_k - \frac{1}{2}\sum_i^n A_{ik} - \frac{1}{2}\sum_j^n A_{kj} + \frac{1}{2}A_{kk} = E_n - \varepsilon_k \qquad (22.8)$$
Removing the row and the column 'k' will remove the diagonal term $A_{kk}$ twice, but since
the diagonal elements are zero, we get:
However, in this case the errors discussed above have the same sign, and the predicted
EA is therefore usually off by several eV. Since typical EA's are in the range 0-4 eV,
orbital energies are in practice useless for their prediction.
Notice that this reasoning only holds for singly ionized states. If we were to
remove two electrons, say, k and l, the change in energy would be, with the same type
of pictorial representation:
(22.10)
but, since the off-diagonal elements are in general non-zero, this is not just the sum of
two orbital energies. This reasoning also provides some insight into why the total
electronic energy of a system - which obviously would equal the energy needed to
remove all electrons - is different from the sum of orbital energies as discussed in
(5.24).
In the same way as above, one can also show that the excitation energy, defined
as the energy required to remove one electron from an occupied orbital and add it to a
virtual, is not the difference between the two orbital energies involved.
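The bookkeeping with a vector b and a matrix A can be verified with random numbers. The sketch assumes only what the text states: the energy is the sum of the elements of b plus half the sum of the elements of a symmetric matrix A with zero diagonal, restricted to the occupied orbitals, and the orbitals are kept frozen. The expression used for the double ionization, $E_n - \varepsilon_k - \varepsilon_l + A_{kl}$, follows from that assumption and is stated here as such.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
b = rng.standard_normal(n)                 # one-electron terms
A = rng.standard_normal((n, n))
A = A + A.T                                # Coulomb-minus-exchange terms, symmetric
np.fill_diagonal(A, 0.0)                   # diagonal vanishes by construction

def total_energy(occupied):
    """Sum of b over occupied orbitals plus half the sum of A over occupied pairs."""
    idx = np.array(sorted(occupied))
    return float(b[idx].sum() + 0.5 * A[np.ix_(idx, idx)].sum())

eps = b + A.sum(axis=1)                    # orbital energies of the n-electron system
all_orbitals = set(range(n))
E_n = total_energy(all_orbitals)

k, l = 1, 3
E_minus_k = total_energy(all_orbitals - {k})
E_minus_kl = total_energy(all_orbitals - {k, l})

print(E_n - E_minus_k, eps[k])                       # single ionization: one orbital energy
print(E_n - E_minus_kl, eps[k] + eps[l] - A[k, l])   # double ionization: extra A_kl term
print(E_n, eps.sum())                                # total energy vs. sum of orbital energies
```

The last line shows the double counting of the pair terms when orbital energies are simply added: the sum exceeds the total energy by half the sum of all elements of A.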
k 1/+1
(23.1)
We can now use a technique similar to the one used in Section 4 to find
integrals between two different wavefunctions. To begin with, the overlap integral is
obtained as
$$(\Psi_0|\Psi_i^a) = \frac{1}{n!}\sum_P^{n!}\sum_{P'}^{n!} (-1)^{P+P'}\, \big(P\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\}\,\big|\,P'\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\big)$$
$$= \sum_P^{n!} (-1)^P\, \big(\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\}\,\big|\,P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\big) \qquad (23.5)$$
But, in analogy with Eq. (3.15), all terms in the sum must contain an integral of the type
$\langle i|a\rangle$, where $\varphi_i$ is in the original set of orbitals whereas $\varphi_a$ is not. These integrals all
equal zero, and the entire integral (23.5) therefore vanishes, in other words
$$(\Psi_0|\Psi_i^a) = 0 \qquad (23.6)$$
For the purpose of discussing more complex situations, we rephrase this result: There
is a "mismatch" between the orbital sets in $\Psi_0$ and in $\Psi_i^a$, and nothing we do to
permute the orbitals can eliminate this mismatch.
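The vanishing overlap can also be checked numerically. The sketch relies on a standard result not derived in the text above: the overlap between two Slater determinants equals the determinant of the matrix of overlaps between their occupied orbitals. The orbitals are random orthonormal vectors, and the orbital index lists are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n_orb = 6
# Columns of Q form an orthonormal orbital set (a stand-in for real orbitals).
Q, _ = np.linalg.qr(rng.standard_normal((n_orb, n_orb)))

def determinant_overlap(occ_left, occ_right):
    """Overlap of two determinants as det of the occupied-occupied orbital overlap matrix."""
    S = Q[:, occ_left].T @ Q[:, occ_right]
    return float(np.linalg.det(S))

ground = [0, 1, 2]          # Psi_0
single = [0, 1, 4]          # orbital 2 replaced by virtual orbital 4
double = [0, 4, 5]          # orbitals 1 and 2 replaced by virtuals 4 and 5

print(determinant_overlap(ground, ground),
      determinant_overlap(ground, single),
      determinant_overlap(ground, double))
```

A single mismatched orbital produces a zero column in the orbital overlap matrix, so its determinant vanishes regardless of how the remaining orbitals are permuted.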
Integrals over other operators can be systematically evaluated with a similar
approach. For the one-electron operators we have, analogous to (4.3)-(4.6):
$$(\Psi_0|\sum_k^n h_k|\Psi_i^a) = \sum_k^n \sum_P^{n!} (-1)^P\, \big(\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\}\,\big|\,h_k\,\big|\,P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\big) \qquad (23.7)$$
In (23.7), the "mismatch" between orbitals $\varphi_i$ and $\varphi_a$ will make the entire integral zero,
unless $\varphi_i$ and $\varphi_a$ are both associated with electron "k". In other words, the only non-
zero contribution occurs for i=k:
$$(\Psi_0|\sum_k^n h_k|\Psi_i^a) = \langle i|h|a\rangle \qquad (23.8)$$
In the same way, we obtain matrix elements over the two-electron operator. Notice,
first of all, that
$$\sum_{i<j}^n g_{ij} = \frac{1}{2}\sum_{i \ne j}^n g_{ij} \qquad (23.9)$$
$$(\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_i^a) = \sum_{k<l}^n \sum_P^{n!} (-1)^P\, \big(\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\}\,\big|\,g_{kl}\,\big|\,P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\big) \qquad (23.10)$$
The mismatch between Hartree-products on the left and the right side is eliminated here
if either k or l equals i. The expression then equals:
$$\sum_k^n \sum_P^{n!} (-1)^P\, \big(\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_n\}\,\big|\,g_{ki}\,\big|\,P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_n\}\big) \qquad (23.11)$$
However, only two permutations in the sum over P will give non-zero contributions:
the one that leaves all orbitals in place, and the one that only interchanges i and k. We
thus get the final result:
$$(\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_i^a) = \sum_k^n \{(\varphi_i\varphi_k|g_{ki}|\varphi_a\varphi_k) - (\varphi_i\varphi_k|g_{ki}|\varphi_k\varphi_a)\} = \sum_k^n \langle ik\|ak\rangle \qquad (23.12)$$
with the notation for integrals introduced in Appendix A.
We can make an interesting and useful observation here: The integral of the
total Hamiltonian between the Hartree-Fock wavefunction and a singly excited state is:
$$(\Psi_0|H|\Psi_i^a) = (\Psi_0|\,[\,h_0 + \sum_i^n h_i + \sum_{i<j}^n g_{ij}\,]\,|\Psi_i^a) = \langle i|h|a\rangle + \sum_k^n \langle ik\|ak\rangle \qquad (23.13)$$
If we compare (23.13) with the expression (22.1) for the Fock matrix (which is
of course diagonal in the representation of converged molecular orbitals), we arrive at
the following conclusion, also known as Brillouin's Theorem:
The Hartree-Fock wavefunction does not interact directly with singly excited states
through the Hamiltonian.
In other words, $(\Psi_0|H|\Psi_i^a) = 0$. Brillouin's Theorem is frequently used elsewhere,
among other things for developing the theory and methods of electron correlation.43
If we proceed with the integrals between Slater determinants, we may now also
consider the doubly excited states $\Psi_{ij}^{ab}$. From the previous discussion, it should be
obvious that
$$(\Psi_0|\Psi_{ij}^{ab}) = 0 \qquad (23.14)$$
since there will always be at least two orbital mismatches between the Hartree products
to the left and to the right, and any single term in the sum over k can make up for at
most one of them.
For the two-electron terms, we may expect to get some non-zero interaction:
43 Remember that Brillouin's Theorem holds only if the orbitals are Hartree-Fock orbitals (though not
necessarily canonical).
$$(\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_{ij}^{ab}) = \frac{1}{2}\sum_{k \ne l}^n \sum_P^{n!} (-1)^P\, \big(\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_j\cdots\varphi_n\}\,\big|\,g_{kl}\,\big|\,P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_b\cdots\varphi_n\}\big) \qquad (23.16)$$
By now we should be used to this type of reasoning, and we can conclude without
further ado that these integrals are non-zero only for i=k, j=l and i=l, j=k, and the entire
integral (23.16) therefore becomes:
$$\frac{1}{2}\sum_P^{n!} (-1)^P\, \big(\{\varphi_1\varphi_2\cdots\varphi_i\cdots\varphi_j\cdots\varphi_n\}\,\big|\,(g_{ij} + g_{ji})\,\big|\,P\{\varphi_1\varphi_2\cdots\varphi_a\cdots\varphi_b\cdots\varphi_n\}\big) \qquad (23.17)$$
Just like in the single-excitation case there are only two permutations that can give a
non-zero result: the one that leaves everything unchanged, and the one that only
interchanges electrons i and j. The final result is therefore:
$$(\Psi_0|\sum_{k<l}^n g_{kl}|\Psi_{ij}^{ab}) = \langle ij|ab\rangle - \langle ij|ba\rangle = \langle ij\|ab\rangle \qquad (23.18)$$
We finally conclude that for determinants differing in three or more orbitals, there is no
way one can obtain non-zero integrals over operators that contain only one- and two-
electron interactions.
To summarize, we have the following rules which are important enough to be
worth memorizing (since deriving them every time we need them is a bit tedious):
$$(\Psi_0|H|\Psi_0) = h_0 + \sum_i^n \langle i|h|i\rangle + \frac{1}{2}\sum_{i,j}^n \langle ij\|ij\rangle \qquad (23.19)$$
$$(\Psi_0|H|\Psi_i^a) = \langle i|h|a\rangle + \sum_k^n \langle ik\|ak\rangle \qquad (23.20)$$
$$(\Psi_0|H|\Psi_{ij}^{ab}) = \langle ij\|ab\rangle \qquad (23.21)$$
Notice that (23.20) equals zero for the case that the orbitals are the optimized Hartree-
Fock orbitals for $\Psi_0$. The general form of (23.19)-(23.21) holds for any orthonormal
orbital set, however.
(24.1)
The expression can be written in a more convenient form if we recall the definition of
the Dirac $\delta$-function;
with which we can write $\rho_i$ as a conventional expectation value with integration over all
coordinates:
(24.3)
(24.4)
However, electrons are indistinguishable, and the observed density is the sum of
contributions from all the electrons. This gives the following form for the total electron
density of a molecule:
$$\rho(\mathbf{r}) = \int |\Psi(\mathbf{r}_1, \mathbf{r}_2 \ldots \mathbf{r}_n)|^2 \sum_i^n \delta(\mathbf{r} - \mathbf{r}_i)\, d\mathbf{r}_1 \ldots d\mathbf{r}_n \qquad (24.5)$$
In other words, the operator corresponding to the total density is
$$R(\mathbf{r}) = \sum_i^n \delta(\mathbf{r} - \mathbf{r}_i) \qquad (24.6)$$
Notice that we made no assumption about the form of the wavefunction: the derivation
is general. In order to obtain the specific expression for the density in the case of a
determinant wavefunction, we note that the density operator (24.6) is a one-electron
operator similar to the term $\sum_i^n h_i$ in the Hamiltonian, and the expectation value can
then be evaluated along the same lines as (4.6):
$$\rho(\mathbf{r}) = (\Psi|\sum_i^n \delta(\mathbf{r} - \mathbf{r}_i)|\Psi) = \sum_i^n \langle\varphi_i(\mathbf{r}')|\delta(\mathbf{r} - \mathbf{r}')|\varphi_i(\mathbf{r}')\rangle = \sum_i^n |\varphi_i(\mathbf{r})|^2 \qquad (24.7a)$$
$$\rho(\mathbf{r}) = \boldsymbol{\varphi}(\mathbf{r})\, \boldsymbol{\varphi}^\dagger(\mathbf{r}) \qquad (24.7b)$$
In other words, the total electron density of a molecule is equal to the sum of densities
for each of the molecular orbitals - not an entirely surprising result!
The density can also be expressed in terms of the basis set - we just insert the
LCAO expansion (9.1) or (9.2) into (24.7), to obtain
$$\rho(r) = \sum_{i}^{n} |\varphi_i(r)|^2 = \sum_{i}^{n} \sum_{p,q}^{N} C_{pi} C_{qi}\, \chi_p(r) \chi_q(r) = \sum_{p,q}^{N} D_{pq}\, \chi_p(r) \chi_q(r) \tag{24.8a}$$
where D is the same density matrix as in (10.2), which we introduced when discussing
the Roothaan-Hall equations.
The expressions above have a significance beyond any interest in the electron
density itself. Many molecular properties are represented by multiplicative, one-
electron operators. The dipole moment operator is one such example: The operator for
the dipole moment with respect to a point R is⁴⁴
$$d = \sum_{i}^{n} (r_i - R) \tag{24.9}$$
Removing the reference to origin, the expectation value of this operator is, with the
same methods as above,
$$\mu = \langle \Psi | d | \Psi \rangle = \sum_{i}^{n} \langle \varphi_i(r) | r | \varphi_i(r) \rangle = \sum_{p,q}^{N} D_{pq} \langle \chi_p(r) | r | \chi_q(r) \rangle \tag{24.10}$$
A large class of properties can be evaluated in this convenient way, using the density
matrix and matrix elements of the operator in the LCAO basis set. Eq. (24.10) is of
course just another way of expressing the relation between density and dipole moment:
$$\mu = \int \rho(r)\, r\, dr \tag{24.11}$$
44 The dipole moment is a vector quantity, independent of the choice of origin if the molecule has no
net charge.
Given the expression (24.8) for the electron density, we can carry out an
interesting analysis. We note that the total electron count of the molecule is given by:
$$n = \int \rho(r)\, dr = \int \sum_{p,q}^{N} D_{pq}\, \chi_p(r) \chi_q(r)\, dr = \sum_{p,q}^{N} D_{pq} S_{pq} = \sum_{p}^{N} \{D S\}_{pp} = \mathrm{Tr}\{D S\} \tag{24.12}$$
This provides a breakdown of the total number of electrons in the molecule into
components that can be assigned to individual atoms:
(24.13)
where we have summed the contributions for each pair of atoms {P,Q}.
Notice that (24.13) has both one- and two-center contributions! We can thus assign the
electron density to pairs of atoms (= bonds!) as well as to the atoms themselves.
This analysis is referred to as the Mulliken population analysis. The quantity
2q_PQ for P ≠ Q is usually called the overlap population between atoms P and Q (notice
the factor of 2, since the sum in (24.13) is quadratic and contains each pair twice). We
also define the net charge of an atom P as
Obviously, the sum over all gross atomic charges equals the total net charge of the
molecule, so this analysis does indeed provide a breakdown of the total charge into
atomic components.
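As an illustration of the trace formula (24.12) and of the partitioning into atomic contributions, the following minimal sketch computes Mulliken gross populations for a hypothetical two-function, two-electron system. The overlap value s = 0.6, the atom labels, and the helper names are assumptions made for the example, and the density matrix is assumed to carry the factor of 2 for a doubly occupied orbital.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mulliken_gross_populations(D, S, atom_of_basis):
    """Sum the diagonal elements (D S)_pp over the basis functions of each atom."""
    DS = matmul(D, S)
    pops = {}
    for p, atom in enumerate(atom_of_basis):
        pops[atom] = pops.get(atom, 0.0) + DS[p][p]
    return pops

# Hypothetical H2-like example: two normalized basis functions with overlap s.
s = 0.6
S = [[1.0, s], [s, 1.0]]
c = 1.0 / math.sqrt(2.0 * (1.0 + s))   # coefficients of the normalized bonding orbital
D = [[2.0 * c * c, 2.0 * c * c],       # assumed closed shell: D_pq = 2 c_p c_q
     [2.0 * c * c, 2.0 * c * c]]

populations = mulliken_gross_populations(D, S, ["H1", "H2"])
n_electrons = sum(populations.values())        # Tr{D S}
overlap_population = 2.0 * D[0][1] * S[0][1]   # two-center part, counted twice
```

By symmetry each atom receives one electron, and the two-center (overlap) part of the count is 2 D_12 S_12.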
There are other ways to divide the total electronic density. Starting from the
definition
$$W = U \epsilon^{1/2} U^{\dagger} \tag{24.17}$$
$$W^2 = U \epsilon^{1/2} U^{\dagger} U \epsilon^{1/2} U^{\dagger} = U \epsilon U^{\dagger} = S. \tag{24.18}$$
Furthermore, since Tr{A B} = Tr{B A} for any two matrices A, B, we can write
(24.12) as
$$n = \mathrm{Tr}\{W D W\} = \sum_{p}^{N} \{W D W\}_{pp} \tag{24.19}$$
which we may view as a population analysis in terms of a symmetrically orthogonalized
basis set. Using the transformation matrix X = U ε^{-1/2} U† from Eq. (10.21), the
density becomes:
$$\rho(r) = \sum_{p,q}^{N} D_{pq}\, \chi_p(r) \chi_q(r) = \sum_{r,s}^{N} D'_{rs}\, \phi_r(r) \phi_s(r) \tag{24.20}$$
or, in matrix notation:
(24.21)
where D' is the density matrix in the new basis set (note that X is a symmetric
matrix). From (24.21) we also have
where X is a generalized notation for all the coordinates to be integrated over (usually
obvious from the context). Reminiscent of this convention are the notations
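$$\langle p | q \rangle \equiv \langle \varphi_p | \varphi_q \rangle \equiv \int \varphi_p(x)^* \varphi_q(x)\, dx \tag{A2}$$
$$\langle pr | A | qs \rangle \equiv \langle \varphi_p \varphi_r | A | \varphi_q \varphi_s \rangle \equiv \int \varphi_p(x_1)^* \varphi_r(x_2)^*\, A\, \varphi_q(x_1) \varphi_s(x_2)\, dx_1\, dx_2 \tag{A4}$$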
where we have used 'x' to denote the combination of space- and spin variable for the
electron. The electron repulsion integral is the most common two-electron quantity.
and in a notation already used above it is written as:
45 "Computational Techniques in Quantum Chemistry and Molecular Physics", NATO ASI Series C,
Vol. 15 (Eds. G.H.F. Diercksen, B.T. Sutcliffe and A. Veillard), Reidel, Dordrecht (1975); "Methods in
Computational Molecular Physics", NATO ASI Ser. C, Vol. 113 (Eds. G.H.F. Diercksen and S.
Wilson), Reidel, Dordrecht (1983); "Methods in Computational Molecular Physics", NATO ASI Ser.
B, Vol. 293 (Eds. S. Wilson and G.H.F. Diercksen), Reidel, Dordrecht (1992).
46 "Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory", A. Szabo
and N. Ostlund, 1st Edition (revised), McGraw-Hill, 1989.
$$\Big\langle pr \Big| \frac{1}{r_{12}} \Big| qs \Big\rangle = \langle pr | qs \rangle \tag{A5}$$
Note the different ordering of the indices for the two-electron integrals! The justification
for this notation is that the indices corresponding to the same electron are written
together on the same side of the symbol for the integral.
The orbitals entering the expressions (A2)-(A5) can be understood as either spin
orbitals or as functions over space only. Both conventions are common, and we clearly
need to make sure that there is complete agreement on which one is used in order to
interpret expressions where these notations occur.
We noticed in the discussion of the Hartree-Fock equations that two-electron
integrals very often occur in pairs, due to the antisymmetry of the wavefunction. This
motivates the notation
Using this notation, and assuming spin-orbitals, the Hartree-Fock energy becomes:
$$E = h_0 + \sum_{i}^{n} \langle i|h|i\rangle + \tfrac{1}{2}\sum_{i,j}^{n} \langle ij\|ij\rangle \tag{A9}$$
Often, the spin-integration involves some trivial algebra and can be carried out even if
the spatial part of the orbitals is unknown. For integrals over space orbitals, we have
the following notations:
$$(pq | rs) \equiv (\psi_p \psi_q | \psi_r \psi_s) \equiv \int \psi_p(r_1)^* \psi_r(r_2)^*\, \frac{1}{r_{12}}\, \psi_q(r_1) \psi_s(r_2)\, dr_1\, dr_2 \tag{A11}$$
On comparing the different formulations, we note that, e.g., <pq|rs> = (pr|qs) if the
spin functions satisfy the relations σ_p = σ_r and σ_q = σ_s, and zero in all other cases.
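The index reordering between the two notations is easy to get wrong, so a small sketch may help. It assumes real orbitals, all of the same spin, and a made-up table of integral values (the numbers and function names are hypothetical); the antisymmetrized combination is taken as <pq||rs> = <pq|rs> - <pq|sr>.

```python
# Hypothetical values of (pq|rs) for two real orbitals, in hartree.
CHEMIST_TABLE = {
    ((1, 1), (1, 1)): 0.77,
    ((1, 1), (2, 2)): 0.57,
    ((2, 2), (2, 2)): 0.70,
    ((1, 2), (1, 2)): 0.18,
}

def chemist(p, q, r, s):
    """(pq|rs) for real orbitals, using the index symmetries of the integral."""
    pair1 = tuple(sorted((p, q)))
    pair2 = tuple(sorted((r, s)))
    key = tuple(sorted((pair1, pair2)))
    return CHEMIST_TABLE.get(key, 0.0)

def physicist(p, q, r, s):
    """<pq|rs> = (pr|qs): indices of the same electron are grouped together."""
    return chemist(p, r, q, s)

def antisymmetrized(p, q, r, s):
    """<pq||rs> = <pq|rs> - <pq|sr> (same-spin orbitals assumed)."""
    return physicist(p, q, r, s) - physicist(p, q, s, r)

coulomb = physicist(1, 2, 1, 2)          # equals (11|22)
exchange = physicist(1, 2, 2, 1)         # equals (12|21)
j_minus_k = antisymmetrized(1, 2, 1, 2)  # Coulomb minus exchange
```

For orbitals of different spin the exchange-type term would vanish, in line with the spin conditions stated above.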
and messages between master and slaves, a medium such as a communication network,
a network file system (NFS) or a shared memory device has to be available. Such
techniques can be used to distribute a computation over several machines with a large
variety of different hardware designs. The only model which would require more
consideration is the SIMD architecture, where integral evaluation would have to be
grouped according to type (quantum number) and large numbers of those would be
executed in parallel, essentially precluding most integral prescreening schemes in
current use.
A coarse-grain type of parallelism can be used to exploit parallelism efficiently
in a system with relatively few, powerful processors. In this scheme the evaluation and
processing of integrals is split into unrelated tasks, and at the end of the computation
the work done by all the tasks is collected by the main process. The obvious advantage
with this scheme is that it can be used for a shared memory computer as well as for a
distributed memory system, or any combination of the two.47 A net of workstations
can provide a significant source of CPU power for very large calculations, and it is
fully realistic to run machines at different locations in concert for the same job.
A simplistic parallel implementation of the direct SCF method requires complete
copies of both the Fock matrix (nuclear gradient) and the density matrix to be held in
memory on each node. However, with a physical memory of 16 megabytes on each
node one is limited to calculations on systems with up to about 1,000 basis functions
with such a "replicated-data" approach. By partitioning Fock and density matrices
across several nodes, it would be possible to investigate much larger systems. It is
possible with some modification of the algorithm to leave the AO-driven regime and
compute integrals in a Fock matrix driven manner.48 A potential drawback of this
approach is the somewhat increased communication required to send matrix elements
between nodes, but this communication will increase more slowly than the computation as
one goes to larger systems, and thus it should not constitute a bottleneck.
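The figure of about 1,000 basis functions for 16 megabytes can be checked with a short calculation. The sketch below assumes full square storage, 8-byte reals, two matrices per node, and 1 megabyte = 10^6 bytes; the function names are illustrative only.

```python
def replicated_matrix_megabytes(n_basis, n_matrices=2, bytes_per_element=8):
    """Memory needed when every node holds full copies of the matrices."""
    return n_matrices * n_basis ** 2 * bytes_per_element / 1.0e6

def distributed_matrix_megabytes(n_basis, n_nodes, n_matrices=2, bytes_per_element=8):
    """Memory per node when the matrices are partitioned evenly across nodes."""
    return replicated_matrix_megabytes(n_basis, n_matrices, bytes_per_element) / n_nodes

replicated_1000 = replicated_matrix_megabytes(1000)        # Fock + density, N = 1000
distributed_4000 = distributed_matrix_megabytes(4000, 16)  # same per-node memory, larger N
```

Under these assumptions, partitioning over 16 nodes would allow roughly four times as many basis functions for the same memory per node.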
Computationally intensive sections of existing programs can be modified to take
advantage of the processing power available on parallel machines in the regime of a few
hundred processors. However, while Amdahl's law oversimplifies the speedup for the
construction of the Fock matrix, it emphasizes the problem of efficiently using large
numbers of processors. To gain overall efficiency for a large number of processors, it
is not sufficient to parallelize only the evaluation and processing of two-electron
integrals. Large portions of the existing serial code, which are computationally
insignificant on a single-processor architecture, will have to be modified. Such an
47 H.P. Lüthi, J.E. Mertz, M.W. Feyereisen and J. Almlöf, J. Comput. Chem. 13, 160 (1992).
48 i.e., completing the calculation of all the integrals relating one sub-block of the density matrix to
another sub-block of the Fock matrix before moving on to the next pair of sub-blocks.
49 J. Almlöf, A. Sargent, and M.W. Feyereisen, SIAM News 26(1), 14 (1993).
(B.la)
(B.lb)
where f^(b) and d^(d) are columns in these matrices, which are now treated as vectors.
A minimum-memory scheme would amount to looping over indices a and b, and
distributing one pair of vectors {d^(a), d^(b)} to each processor. At the processor level,
the integrals (ab|cd) (for fixed a,b; all c,d) would now be evaluated, and used to
calculate all exchange contributions to f^(a) and f^(b) as in Eqs. (17.6 c-f)
(B.2a)
(B.2b)
(B.2c)
(B.2d)
evaluating the integrals (ab I cd) (but now for fixed a,c; all b,d). Again, the Coulomb
contributions (B.2) could now be evaluated according to:
-
f~ =r~ i~ (ab I cd) (B.3a)
-
f~ =r~ ia~ (ab I cd) (B.3b)
If the number of tasks generated in this way is not sufficient to get an efficient
load balance, but the calculation is still too large to fit the full matrices in memory, it
would also be possible to loop through these tasks one (or a few) at a time, and spread
the work in the inner double-loop across the processors.
Density Functional Theory
Nicholas C. Handy
1 Introduction
The subject of quantum chemistry may have reached an impasse. Keeping the discussion
to ab initio quantum chemistry, we now know how to do very large SCF calculations,
thanks to the introduction of the Direct methodology by Almlöf[1]. We can also manage
to work with good basis sets for such calculations, although I consider that 6-31G* basis sets are
not good enough, and probably something nearer to TZ2P is required for definitive SCF
calculations.
Beyond SCF there are major difficulties, all associated with trying to represent the
electron-electron cusp more accurately. We know from the work of Kutzelnigg[2] that the
convergence of this problem is very slow, something like (l + 1)^-4; this means that very
large basis sets are required for correlated calculations. We know that it is more important
to include d and f basis functions than to improve the methodology. 6-31G* bases are
not appropriate for correlated studies. We also know that the raw costs of the correlated
methods MP2, MP3, MP4, CISD, CCSD, CCSD(T) increase with powers of the size of the
problem as 5, 6, 7, 6, 6, 7. It is simply not possible to contemplate CCSD(T) calculations
with 1000 basis functions, nor will it be sensibly possible to contemplate such calculations
this century. I suggest that we have been misled by the rapid progress that computer
companies have made in the development of their hardware: more memory, vectorising
compilers, parallel machines, all of which we have taken advantage of, but these advances
will not cure our outstanding basis set problems.
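To make the scaling argument concrete, the following sketch tabulates how the raw cost grows when the size of the problem is multiplied by a given factor, using the exponents quoted above. It ignores prefactors entirely, so it only illustrates relative growth; the names are illustrative.

```python
# Exponents of the raw cost with respect to problem size, as quoted in the text.
SCALING_EXPONENT = {"MP2": 5, "MP3": 6, "MP4": 7, "CISD": 6, "CCSD": 6, "CCSD(T)": 7}

def cost_ratio(method, size_factor):
    """Factor by which the raw cost grows when the problem size grows by size_factor."""
    return size_factor ** SCALING_EXPONENT[method]

doubling = {method: cost_ratio(method, 2) for method in SCALING_EXPONENT}
tenfold_ccsd_t = cost_ratio("CCSD(T)", 10)
```

Doubling the system size thus costs a factor of 32 for MP2 but 128 for CCSD(T), and a tenfold increase costs a factor of ten million for CCSD(T).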
The physicists have been trying to persuade us for years that we ought to study Density
Functional Theory. We ought to have listened more carefully, especially since it was one
of our own, J. C. Slater, who pushed them in that direction with his 1951 contribution[3]
in which he suggested the replacement of the difficult exchange term in the Hartree-Fock
method by the Dirac[4] ρ^{1/3} potential, which he argued at that time included both exchange
and correlation effects. We were largely discouraged by the fact that this original DFT
made molecules substantially overbound. We do not think today that we can forget about
the problems of core electrons and use plane wave representations as the physicists do;
the jury is still out on this problem, as it is on whether we should continue to use
gaussian basis sets, or will they be a thing of the past? We ignored the DFT work of
people like Jones[5], who was the first to get the bond length of Be2 correct, although it
must be said that the reason the regular quantum chemists got it wrong was that too
small basis sets were being used in those days. So although I think we have all worked
hard and made great progress, I rather wonder if we are up against it and the physicists
were right after all. This is why we are looking at DFT.
(1)
where Ψ(x_1 x_2 ... x_N) is the electronic wavefunction for the molecule. We observe that
$$\int \rho(r)\, dr = N \tag{2}$$
Furthermore it may be shown that the nuclear cusp condition gives [6]
(3)
where ρ̄(r_A) is the spherical average of ρ(r). Another exact result which is of value for
the electron density is that asymptotically
where E_1^0 and E_2^0 are the ground state energies for H_1 and H_2, respectively. Note that it
is here in the theorem that the restriction to the ground state has entered. In Eqn 5 the
subscripts 1 and 2 may be interchanged to give a second inequality. The two inequalities
may then be added to give E_1^0 + E_2^0 < E_2^0 + E_1^0, which is a contradiction. Hence the
external potential is uniquely determined by ρ(r).
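For reference, the chain of inequalities behind this contradiction can be written out as follows. This is a sketch under assumed notation: Ψ_1 and Ψ_2 denote the (non-degenerate) ground-state wavefunctions of H_1 and H_2, v_1 and v_2 are the two external potentials, and both wavefunctions are taken to give the same density ρ(r).

```latex
\begin{align}
E_1^0 = \langle \Psi_1 | H_1 | \Psi_1 \rangle
  &< \langle \Psi_2 | H_1 | \Psi_2 \rangle
   = E_2^0 + \int \rho(r)\,[v_1(r) - v_2(r)]\,dr, \\
E_2^0 = \langle \Psi_2 | H_2 | \Psi_2 \rangle
  &< \langle \Psi_1 | H_2 | \Psi_1 \rangle
   = E_1^0 + \int \rho(r)\,[v_2(r) - v_1(r)]\,dr, \\
E_1^0 + E_2^0 &< E_2^0 + E_1^0 .
\end{align}
```

The two integrals cancel on addition because they involve the same density, which is what makes the final line self-contradictory.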
We may therefore represent the energy as a functional of the density as follows
where T[ρ] is the kinetic energy and V_ee[ρ] is the electron-electron interaction energy, which
contains the coulomb interaction J[ρ], which is given by:
(9)
This variational principle allows us to write down the condition that the energy, Eqn 7,
is stationary with respect to changes in the density, subject to the constraint that Eqn 2
holds:
This last equation is an exact equation for ρ(r), if only we knew the functional forms of
T[ρ] and V_ee[ρ]. We now go on to convert this equation into a set of working equations.
Kohn and Sham[9] introduced the idea of considering the determinantal wavefunction for
N noninteracting electrons in N orbitals φ_i. For such a system the kinetic energy and the
electron density are exactly given by
T_s[ρ] (13)
$$\rho(r) = \sum_{i}^{N} |\phi_i(r)|^2 \tag{14}$$
(15)
Equations (15) are the Euler equations obtained when E[ρ] is minimised with respect
to variations in the orbitals which constitute the density as given by (14), subject to the
constraint that they remain normalised.
Now we return to our problem with interacting electrons and we write the energy in
different ways:
The first line is from Eqn 7, the second line inserts and removes the noninteracting kinetic
energy and the coulomb energy, the next line introduces the exchange-correlation energy
functional, the functional derivative of which is the exchange-correlation potential v_xc:
On comparing eqns (15), (19) and (21) we deduce that the problem has been recast into
one involving noninteracting electrons in N orbitals which obey the equation
These are the Kohn-Sham equations for the Kohn-Sham orbitals φ_i. Note that the key
property of them is that they give the exact density through eqn (14), once the exact
exchange-correlation functional E_xc[ρ] has been determined.
Finally I make an observation. To be useful to chemistry, DFT must be applicable to
ground and excited states. The theory outlined above involving the work of Hohenberg,
Sham and Kohn is thought by some to be rigorous for ground states. The Bright Wilson
argument is not constrained to ground states, but it is not considered rigorous. However
it is plausible. In so far as I believe that DFT can only be a semi-empirical theory, because
it will probably never be possible to find the exact exchange-correlation functional,
plausibility arguments are important. I believe that DFT is the best semi-empirical
method, because such parameters as are introduced into the functionals are not molecule
specific.
(23)
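$$\sum_{\beta} \Big\langle \eta_\alpha \Big| -\tfrac{1}{2}\nabla^2 + v(r) + \int \frac{\rho(r')}{|r - r'|}\, dr' + v_{xc}(r) - \epsilon_i \Big| \eta_\beta \Big\rangle c_{\beta i} = 0 \tag{24}$$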
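$$\sum_{\beta} \Big\langle \eta_\alpha \Big| -\tfrac{1}{2}\nabla^2 + v(r) + \int \frac{\rho(r')}{|r - r'|}\, dr' - \sum_{j} \int \frac{\phi_j(r)\phi_j(r')}{|r - r'|}\, dr'\, P_{rr'} - \epsilon_i \Big| \eta_\beta \Big\rangle c_{\beta i} = 0 \tag{26}$$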
We see that the usual exchange term has been replaced by the v_xc(r) potential. Thus in
principle it is easy to modify an SCF code to make it into a DFT code: merely delete
the exchange contribution and replace it by v_xc(r). Similarly the exchange contribution to
the total energy is deleted and replaced by the E_xc[ρ] term. In our code we have done
precisely this, keeping gaussian basis sets and using our gaussian integral codes to evaluate
the kinetic, nuclear attraction and coulomb contributions to the Kohn-Sham (KS) matrix
and the total energy. We discuss the evaluation of the specific DFT contribution in section
7. All aspects of convergence are dealt with in the same way as in standard SCF codes.
At this stage we observe that such a procedure may be followed for any SCF method which
holds for a single determinant wavefunction. In other words Kohn-Sham methodology is
immediately available for (i) closed shell molecules (ii) the unrestricted representation (iii)
the high spin open shell molecule. All of these may be applied for the lowest state of a
given symmetry.
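$$P_2(r_1', r_2'; r_1, r_2) = \frac{N(N-1)}{2} \int\!\cdots\!\int \Psi^*(r_1', r_2', r_3, r_4, \ldots)\, \Psi(r_1, r_2, r_3, r_4, \ldots)\, dr_3\, dr_4 \cdots ds_1\, ds_2\, ds_3 \cdots \tag{28}$$
The factor N(N-1)/2 in front is the number of equivalent pairs. Similarly the one particle
density matrices are
$$P_1(r_1', r_1) = N \int\!\cdots\!\int \Psi^*(r_1', r_2, r_3, \ldots)\, \Psi(r_1, r_2, r_3, \ldots)\, dr_2\, dr_3 \cdots ds_1\, ds_2\, ds_3 \cdots \tag{29}$$
The last equation gives the usual electron density ρ(r) = P_1(r).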
The energy E, given as the expectation value of the hamiltonian is expressed in terms of
these matrices,
(31)
In Hartree-Fock or Kohn-Sham theory the one particle density matrices are represented
in terms of orbitals:
$$P_1(r_1', r_1) = 2\sum_{i} \phi_i(r_1')\, \phi_i(r_1) \tag{32}$$
$$P_1(r) = 2\sum_{i} |\phi_i(r)|^2 \tag{33}$$
with Eo being the electrostatic energy of the positive background, which is equal to the
coulomb energy because the positive charge density n(r) is simply the negative of p(r).
Because the external potential is defined by
it follows that the second, third and fifth terms of Eqn 26 add to zero, and therefore
$$E[\rho] = T_s[\rho] + E_{xc}[\rho] \tag{38}$$
$$\phantom{E[\rho]} = T_s[\rho] + E_x[\rho] + E_c[\rho] \tag{39}$$
where we have now split up the exchange-correlation term into an exchange term plus a
correlation term.
We will now briefly indicate how the expressions for T_s[ρ] and E_x[ρ] are obtained. The
KS equations are satisfied by plane waves
$$\phi_k(r) = \frac{1}{V^{1/2}}\, e^{i k \cdot r} \tag{40}$$
$$\int \nabla^2 \rho(r)\, dr = 0 \tag{53}$$
(54)
The result is that
$$T_s[\rho] = C_F \int \rho(r)^{5/3}\, dr \tag{55}$$
It is usual to introduce the exchange energy per particle e_x as a function of r_s, the radius
of a sphere whose volume is the effective volume of an electron
$$\frac{4}{3}\pi r_s^3 = \frac{1}{\rho} \tag{59}$$
$$E_x[\rho] = \int \rho(r)\, e_x(r_s)\, dr \tag{60}$$
$$e_x(r_s) = -\frac{0.4582}{r_s} \tag{61}$$
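The numerical constant in (61) can be checked against the closed forms usually quoted for the uniform electron gas. The sketch below assumes atomic units (hartree, bohr), the spin-compensated case, and the standard expressions C_F = (3/10)(3π²)^{2/3} for the kinetic constant and e_x = -(3/4)(3/π)^{1/3} ρ^{1/3} for the Dirac exchange energy per particle; the function names are illustrative.

```python
import math

# Assumed standard uniform-gas constants (atomic units, spin-compensated case).
C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)   # kinetic energy constant
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)       # Dirac exchange constant

def wigner_seitz_radius(rho):
    """Radius r_s of the sphere whose volume is 1/rho."""
    return (3.0 / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

def exchange_energy_per_particle(rho):
    """Dirac exchange energy per particle, e_x = -C_X rho^(1/3)."""
    return -C_X * rho ** (1.0 / 3.0)

rho = 0.1                                   # arbitrary test density
r_s = wigner_seitz_radius(rho)
coefficient = -exchange_energy_per_particle(rho) * r_s   # should not depend on rho
```

The product -e_x r_s is independent of the density and reproduces the coefficient 0.4582 to the quoted precision.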
In the case when the alpha spin density is not equal to the beta spin density, the kinetic
and exchange energies are
(67)
(68)
But we may assume that if we evaluate these from the wavefunction, then the key operators
are ∇² and r⁻¹. Thus we also have
We therefore deduce
or
as required.
A simpler argument may be given by observing that the dimensions of ρ, t, k are
L⁻³, L⁻², L⁻¹ respectively.
A summary of calculations which is presented later makes it clear that the LDA is not
adequate for useful predictions for computational chemistry. Indeed LDA is no better
than SCF as a rough approximation and so there is no point in doing it on its own. We
therefore discuss some important improvements which have been made in recent years.
One of the most important deficiencies of the LDA exchange is that it does not have the
correct asymptotic behaviour. Becke[13] starts from an exact representation of V_ee[ρ] in
terms of the diagonal two particle density matrix P_2(r_1, r_2)
(79)
(80)
where h(r_1, r_2) is the pair correlation function. The exchange-correlation hole is defined
by
(81)
Integration of P_2(r_1, r_2) with respect to r_2, using the definitions (27) and (29), yields
$$\frac{N-1}{2}\,\rho(r_1) = \frac{N}{2}\,\rho(r_1) + \frac{1}{2}\int \rho_{xc}(r_1, r_2)\, dr_2 \tag{82}$$
(83)
If we confine ourselves to the exchange part, then we may write the exchange energy in
terms of an exchange potential ε_x(r_1):
(86)
This puts a constraint on the exchange potential, which is not obeyed by the LDA form,
because the well known asymptotic exponential dependence of ρ, exp(-αr), means that
the LDA ε_x has the asymptotic form exp(-αr/3). Thus a new term must be added. In
order to obtain this inverse distance dependence Becke recognised that it was necessary to
introduce both a logarithm and a term which involved the gradient of the density. After
investigations Becke's additional term has the form
It has one adjustable parameter β, which was chosen so that the sum of the LDA and
Becke exchange terms accurately reproduces the exchange energies of six noble gas atoms,
β = 0.0042. We notice that this Becke exchange correction involves the gradient of the
density. This is natural, because the LDA approximation assumes that the molecular
density is homogeneous, which it is not, and formally a density gradient expansion is
required to introduce inhomogeneity.
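As a hedged illustration of how such a term is evaluated at a grid point, the sketch below assumes the commonly quoted form of Becke's 1988 gradient correction for one spin density, -β ρ_σ^{4/3} x² / (1 + 6β x sinh⁻¹ x) with x = |∇ρ_σ| / ρ_σ^{4/3}; the inverse hyperbolic sine supplies the logarithm mentioned above. The function name and the test values are assumptions, and atomic units are implied.

```python
import math

BETA = 0.0042   # the fitted parameter quoted in the text

def becke_correction_density(rho_sigma, grad_norm_sigma, beta=BETA):
    """Integrand of the assumed gradient correction for one spin density."""
    x = grad_norm_sigma / rho_sigma ** (4.0 / 3.0)   # dimensionless reduced gradient
    return -beta * rho_sigma ** (4.0 / 3.0) * x * x / (1.0 + 6.0 * beta * x * math.asinh(x))

homogeneous = becke_correction_density(0.5, 0.0)     # no gradient, no correction
inhomogeneous = becke_correction_density(0.5, 1.0)   # lowers the exchange energy
```

The correction vanishes for a homogeneous density and is negative otherwise, so it adds to the magnitude of the LDA exchange energy.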
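$$E_c = -a \int \frac{\rho}{1 + d\rho^{-1/3}}\, dr + \ldots$$
where
$$\omega = \frac{\exp(-c\rho^{-1/3})}{1 + d\rho^{-1/3}}\, \rho^{-11/3} \tag{90}$$
$$\delta = c\rho^{-1/3} + \frac{d\rho^{-1/3}}{1 + d\rho^{-1/3}} \tag{91}$$
(92)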
and a = 0.04918, b = 0.132, c = 0.2533, d = 0.349, which are the Colle-Salvetti param-
eters from their fit to the helium atom. The great advantage of this functional is that it
was derived from an actual correlated wavefunction for a two electron system, and has no
relation to the uniform electron gas. It also contains gradient terms.
A note on the terminology for functionals. We use the notation S-VWN for LDA, recognising
Slater's primary contribution and use of Dirac's exchange term, together with Vosko,
Wilk and Nusair. We use the notation B-LYP for Dirac exchange + Becke correction +
Lee, Yang, Parr for the correlation functional. At this stage it is appropriate to note that
the introduction of gradient terms introduces mathematical complexities. In particular if
Thus one can immediately see that an evaluation of v_xc demands an evaluation of the
second derivatives of basis functions.
Clearly it is a great challenge to develop improved functionals for molecular studies. One
really proceeds in an ad hoc fashion, the key object being to obtain better agreement with
experiment for predicted properties. As an example we followed the work of Wigner[17],
who devised a correlation energy functional of the form
$$E_c = \int \frac{c_1 \rho^{4/3}}{1 + c_2 \rho^{1/3}}\, dr \tag{95}$$
It does not take much imagination therefore to propose a functional for which the Dirac
term is replaced by
(96)
where we have recognised that the separation into exchange and correlation parts is not
very rigorous. We (18) have used this new form in conjunction with the Becke exchange
correction and the LYP correlation functional, optimising the P(= 0.02) and P(= 0.008)
parameters. We have called the functional CAM(B)-LYP and the results (in the tables)
are encouraging.
7 Numerical Quadrature
This is one of the difficulties of DFT. It is quite clear that the new integrals which arise
in the KS equations may not be evaluated by analytic means because of the fractional
powers of the density which arise. There are various ways forward, and they all involve
a grid of points in molecular 3 dimensional space. Some people favour a least squares
fit of v_xc to an auxiliary gaussian basis set, some people favour a completely numerical
approach. Here we shall simply evaluate the required integrals using quadrature, and we
now describe a quadrature scheme which we have found satisfactory.
First we use the Becke[19] scheme for the decomposition of the integrand (which we write
here simply as F(r)) into single centre components through the introduction of weight
functions w_A(r) which have a value near unity near nucleus A and which vanish in a
well-behaved manner near any other nucleus. The relevant equations are
(97)
$$I = \sum_{A} I_A \tag{100}$$
$$I_A = \int F_A(r)\, dr \tag{101}$$
To determine the weight functions Becke introduced confocal elliptic coordinates (λ, μ, φ)
for two centres A and B. The key variable is μ, defined by
(102)
μ_AB takes the value -1 on A and also along the axis beyond A, and it takes the value +1
on B and along the axis beyond B. Becke defined a function s(μ_AB) which was +1 for
-1 ≤ μ_AB ≤ 0 and 0 for 0 < μ_AB ≤ +1. The weight function is then defined by
By this scheme the full space has been divided into Voronoi polyhedra surrounding each
nucleus A. Becke then smoothed out the discontinuities at μ_AB = 0 into continuous
mutually overlapping regions. He constructed functions s(μ) such that s(-1) = 1, s(+1) = 0
and ds/dμ = 0 at μ = ±1, to ensure that 's(μ) does not have nuclear cusps'. He found it
desirable to work with a function for which many derivatives were zero at the ends.
We[20] use the following form for s:
$$\frac{ds}{d\mu} = A_{m_\mu} (1 - \mu^2)^{m_\mu} \tag{105}$$
with A_{m_μ} being chosen to ensure that s(μ) has the values 1, 0 at μ = ∓1. We use the value
m_μ = 10.
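A smoothing function of this kind can be tabulated exactly, because (1 - μ²)^m is a polynomial. The following sketch integrates the binomial expansion term by term with exact rational arithmetic and fixes the normalisation from the end-point conditions s(-1) = 1 and s(+1) = 0; the function name is illustrative and m = 10 follows the value quoted above.

```python
from fractions import Fraction
from math import comb

def cell_function(mu, m=10):
    """s(mu) with ds/dmu proportional to (1 - mu^2)^m, s(-1) = 1 and s(+1) = 0."""
    mu = Fraction(mu)

    def antiderivative(t):
        # Term-by-term integral of the binomial expansion of (1 - t^2)^m.
        return sum(comb(m, k) * (-1) ** k * t ** (2 * k + 1) / (2 * k + 1)
                   for k in range(m + 1))

    total = antiderivative(Fraction(1)) - antiderivative(Fraction(-1))
    partial = antiderivative(mu) - antiderivative(Fraction(-1))
    return 1 - partial / total   # the normalisation constant is -1/total

s_left = cell_function(-1)
s_mid = cell_function(0)
s_right = cell_function(1)
s_half = cell_function(Fraction(1, 2))
```

With m = 10 the function stays very close to 1 or 0 away from the boundary, and by symmetry it equals one half at μ = 0.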
Finally Becke recognised that it is important to have different sizes of regions around
each atom, the scheme given so far sharing the space equally between two atoms because
μ = 0 corresponds to the mid point between them. Therefore Becke introduced a change
of variable
(106)
and worked with the function s(ν) instead of s(μ), with the boundary at ν = 0. He argued
that a good value for a_AB is given by
$$a_{AB} = \frac{u_{AB}}{u_{AB}^2 - 1} \tag{107}$$
$$u_{AB} = \frac{\chi - 1}{\chi + 1} \tag{108}$$
$$\chi = \frac{R_A}{R_B} \tag{109}$$
where R_A, R_B are the respective Bragg-Slater radii. We have followed this aspect of
Becke's scheme exactly.
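The effect of (107)-(109) can be seen in a small numerical sketch. The change of variable itself is assumed here to have the form ν = μ + a_AB (1 - μ²), as in Becke's published scheme; the radii used are arbitrary illustrative values, not Bragg-Slater data.

```python
def size_adjustment(radius_a, radius_b):
    """a_AB from Eqs. (107)-(109) for two atomic radii."""
    chi = radius_a / radius_b
    u = (chi - 1.0) / (chi + 1.0)
    return u / (u * u - 1.0)

def shifted_coordinate(mu, a):
    """Assumed form of the change of variable: nu = mu + a (1 - mu^2)."""
    return mu + a * (1.0 - mu * mu)

a_equal = size_adjustment(1.0, 1.0)     # equal radii: no shift
a_unequal = size_adjustment(2.0, 1.0)   # atom A twice as large as atom B
boundary_nu = shifted_coordinate(1.0 / 3.0, a_unequal)   # nu at mu = u_AB = 1/3
```

For R_A = 2 R_B the cell boundary ν = 0 moves from μ = 0 to μ = 1/3, that is towards the smaller atom B, while the end points μ = ±1 are left unchanged.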
Te Velde and Baerends (tVB)[21] also divide space into atomic polyhedra. They then
proceed differently, placing an atom-centred sphere in each polyhedron and surrounding it by
a number of pyramids. The integration over the sphere uses spherical polar coordinates.
Special, more complicated devices are used to integrate over the pyramids. Certainly the
Becke smoothing scheme is much easier to use than the tVB scheme.
Now we describe how the points are generated around each atom; we use spherical polar
coordinates.
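are considering is
$$\int F(x)\, dx = \int_0^1 F(x(q))\, \frac{dx}{dq}\, dq = \int_0^1 G(q)\, dq \tag{110}$$
The Euler-Maclaurin formula [22] for the evaluation of the integral (110) is
$$\int_0^1 G(q)\, dq = \frac{1}{n}\left(\sum_{i=1}^{n-1} G\!\left(\frac{i}{n}\right) + \frac{1}{2}\big(G(0) + G(1)\big)\right) - \sum_{k=1}^{m} \frac{B_{2k}}{(2k)!}\left(\frac{1}{n}\right)^{2k}\left(G^{(2k-1)}(1) - G^{(2k-1)}(0)\right) - \frac{B_{2m+2}}{n^{2m+2}(2m+2)!}\, G^{(2m+2)}(\xi) \tag{111}$$
where 0 < ξ < 1. B_2k are Bernoulli numbers, which are 1, 1/6,
-1/30, 1/42, -1/30, 5/66, -691/2730, 7/6, -3617/510,
43867/798, -174611/330 for k = 0, ..., 10. The Bernoulli numbers grow very rapidly
and for large k, B_2k ≈ 2(-1)^(k-1) (2k)! (2π)^(-2k). Thus the series in eqn (111) is only
useful for low values of m. The Euler-Maclaurin formula becomes
$$\int_0^1 G(q)\, dq = \frac{1}{n}\left(\sum_{i=1}^{n-1} G\!\left(\frac{i}{n}\right) + \frac{1}{2}\big(G(0) + G(1)\big)\right) - \frac{1}{12n^2}\big(G^{(1)}(1) - G^{(1)}(0)\big) + \frac{1}{720n^4}\big(G^{(3)}(1) - G^{(3)}(0)\big) - \frac{1}{30240n^6}\big(G^{(5)}(1) - G^{(5)}(0)\big) + \frac{1}{1209600n^8}\big(G^{(7)}(1) - G^{(7)}(0)\big)$$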
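The Bernoulli numbers quoted above, and the coefficients 1/12, 1/720, 1/30240 and 1/1209600 that follow from them, can be regenerated with exact rational arithmetic. This sketch uses the standard recurrence sum_{k=0}^{m} C(m+1, k) B_k = 0 for m ≥ 1 (which gives B_1 = -1/2) and also checks the large-k estimate; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] from sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1."""
    b = [Fraction(1)]
    for m in range(1, n_max + 1):
        b.append(-sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b

B = bernoulli_numbers(20)
even = [B[2 * k] for k in range(11)]                     # B_0, B_2, ..., B_20

# Coefficients B_2k / (2k)! that multiply the derivative differences.
coefficients = [even[k] / factorial(2 * k) for k in range(1, 5)]

# Large-k estimate B_2k ~ 2 (-1)^(k-1) (2k)! (2 pi)^(-2k), evaluated for k = 10.
asymptotic_b20 = 2 * (-1) ** (10 - 1) * factorial(20) / (2 * pi) ** 20
```

Already at k = 10 the estimate agrees with the exact value -174611/330 to about one part in a million.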
at the end points then the convergence of the sum over the quadrature points to the exact
integral value should be more rapid than if it was not the case. Of course in some cases
this is true for all derivatives and then all the error will be in the remainder term and
nothing is achieved, but in most cases G( q) and its derivatives will not be equal at the end
points. Handy and Boys[23] showed how to make the derivatives zero at the end points
and specifically considered a jacobian of the form
The nature of quantum mechanics means that all derivatives of our integrands vanish at
r = ∞, and the jacobian factor, r², means the integrand and its first derivative vanish at
r = 0. We combine the above transformation (113) with a transformation from [0,∞] to
[0,1], thus
(114)
(115)
where α is a scaling parameter depending upon an effective atomic radius. The use of
(114) means that the integrand and its derivatives up to the (3m - 1)th vanish at q_r = 0,
and all its derivatives vanish at q_r = 1. Our investigations show that the optimum value
is m_r = 2.
are integrated exactly for all polynomials P_l of degree 2n - 1, where n is the number of
quadrature points.
The integrand and all its derivatives will have the same value at φ = 0 and φ = 2π. We
recommend equally spaced points because it may easily be shown that n such points will
exactly integrate cos mφ and sin mφ for m = 0, 1, ..., n - 1. In this case then the
transformation is
(117)
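The exactness of equally spaced points for the low trigonometric functions is easy to verify numerically. The sketch below uses n = 8 points with equal weights 2π/n (the choice of n and the names are illustrative) and also shows where the rule first fails, at m = n.

```python
import math

def phi_quadrature(f, n):
    """Equal-weight rule on n equally spaced points in [0, 2*pi)."""
    h = 2.0 * math.pi / n
    return h * sum(f(j * h) for j in range(n))

n = 8
# Exact integrals of cos(m phi) and sin(m phi) over a full period are zero for m >= 1.
errors = [abs(phi_quadrature(lambda phi, m=m: math.cos(m * phi), n)) for m in range(1, n)]
errors += [abs(phi_quadrature(lambda phi, m=m: math.sin(m * phi), n)) for m in range(1, n)]
constant = phi_quadrature(lambda phi: 1.0, n)                 # m = 0, exact value 2*pi
aliased = phi_quadrature(lambda phi: math.cos(n * phi), n)    # m = n is no longer exact
```

For m = n every grid point samples cos(nφ) = 1, so the rule returns 2π instead of zero.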
These quadratures are defined by referring to n_r, n_θ, n_φ as the number of grid points in
each dimension.
Indeed Lebedev[24] has devised Gauss-like quadrature schemes which exactly integrate
the spherical harmonics Y_lm(θ,φ) for all -l ≤ m ≤ l, 0 ≤ l ≤ L, for some L. There are
(L + 1)² such functions. Lebedev describes 194 and 302 point quadrature schemes for which
L = 23 and 29, based on the symmetry of the octahedron (there are also smaller
schemes). Although they are straightforward to implement, the derivation of these schemes is far
from trivial. They are argued to be efficient, with their efficiency (defined by (L + 1)²/3N)
near unity.
As an alternative, consider the Gauss-Legendre scheme for theta and the simple Gauss
scheme for phi, regarded as a product quadrature scheme. To integrate all spherical
harmonics of degree ≤ L, (L + 1)/2 theta points and L + 1 phi points are
required, that is N = (L + 1)²/2. Thus if L = 14, N = 120 and if L = 23, N = 288. In
other words a factor of 3/2 more points are required in this product scheme to integrate
exactly the same spherical harmonics. But this product scheme is exceedingly easy to
program, the number of points being trivially increased by the change of one parameter.
Symmetry applies trivially for the product scheme, whereas for the Lebedev procedures
it is rather difficult, having to be tied to the symmetry of the octahedron.
For all these arguments we have favoured the product scheme, with n_φ = 2n_θ. We finally
note that the product scheme integrates a much wider class of function exactly compared
to the Lebedev scheme.
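The point counts quoted above can be reproduced with a few lines. The sketch assumes that the number of theta points is (L + 1)/2 rounded up to an integer, which is what the quoted N = 120 for L = 14 implies, and uses the efficiency measure (L + 1)²/3N; the function names are illustrative.

```python
import math

def product_scheme_points(L):
    """Angular points in the Gauss-Legendre (theta) times equally spaced (phi) scheme."""
    n_theta = math.ceil((L + 1) / 2)   # assumed rounding up for even L
    n_phi = L + 1
    return n_theta * n_phi

def efficiency(L, n_points):
    """Efficiency measure (L + 1)^2 / (3 N) used in the text."""
    return (L + 1) ** 2 / (3 * n_points)

n_14 = product_scheme_points(14)
n_23 = product_scheme_points(23)
lebedev_efficiency = efficiency(23, 194)    # 194-point Lebedev scheme
product_efficiency = efficiency(23, n_23)   # product scheme for the same L
point_ratio = n_23 / 194                    # close to the factor 3/2 quoted above
```

On this measure the 194-point Lebedev scheme has an efficiency of about 0.99, against 2/3 for the product scheme at the same L.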
We are fortunate that a good test on the accuracy of a quadrature scheme may be judged
from Eqn 2, the integral of p. We have finalised our quadrature investigations, and have
DO 6 basis functions
DO 7 basis functions
construct Kohn-Sham matrix
END 7,6
END 2,1
From this outline it is easy to see that the basic cost is N³, where N is a measure of the
size of the molecule.
We observe that there are an increasing number of DFT codes in the literature under names
such as DGAUSS[26], DeMon[27], NUMOL[28], AMOL[29], DMOL[30], each of which
treats the evaluation of the KS matrix in a slightly different way.
Finally in this section we stress that for a given functional, it is possible to obtain the
exact KS solution, provided a complete basis and exact quadrature are used. Our
experience is that this is in practice achievable provided a TZ2P(+f) type basis is used,
that is a good SCF basis. We can see every reason for using as reliable a quadrature as
possible. The great advantage of DFT is that no configuration interaction calculation is
performed, and therefore we are not trying to describe the electron-electron cusp in the
wavefunction, which is well established to be the reason why correlated calculations are
so slowly convergent with respect to basis set. This is the overriding advantage of modern
density functional theory.
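$$\frac{dE}{d\lambda} = E^{(\lambda)} - 2\sum_{i} \epsilon_i S_{ii}^{(\lambda)} \tag{119}$$
where E^(λ) is the derivative of the energy with respect to λ, through its explicit dependence
on λ, such as the basis function dependence on the nuclear coordinates. Thus
$$S_{ii}^{(\lambda)} = \sum_{\alpha\beta} c_{\alpha i} c_{\beta i} \left( \langle \eta_\alpha^{(\lambda)} | \eta_\beta \rangle + \langle \eta_\alpha | \eta_\beta^{(\lambda)} \rangle \right) \tag{121}$$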
Because we have recognised the similarity between SCF and KS methodology, eqn (118)
is also the expression for the derivative of the DFT energy. Instead of the quantum
mechanical exchange term in eqn (119), we have
then the exchange contributions to F~, FJ:l and E"" are replaced by
(129)
(130)
(131)
respectively. The algebra becomes much more messy if $F_{xc}$ is a function of $\nabla\rho$ as well.
The above formulae rely on an accurate quadrature. If the quadrature is not sufficiently
reliable, then the minimum of the KS energy may not occur where the gradient is zero.
To overcome this difficulty it is necessary to differentiate the weights in the quadrature
formula, which also depend upon the position of the nuclei. This is not a difficult matter,
but slightly tedious.
In the tables are calculations for 33 molecules for which the experimental data for
equilibrium geometry and dissociation energy are reasonably certain. The calculations were
performed with high-accuracy quadrature and large basis sets. Results are presented for LDA,
B-LYP and our new functional CAM(B)-LYP (as well as one other, about which see [18]).
This detail is included because it is only by understanding the specific failures of
functionals that advances can be made. From table 6, we see that LDA bond lengths are in
error (too long) by 0.017 Å and they are marginally improved using B-LYP. Substantial
improvement is obtained using CAM(B)-LYP, the parameters for which were optimised for the
subset of 9 molecules of tables 1 and 3. It is encouraging that the improvement for the
9 molecules carried over to the full set. For bond angles, we see that there is a slight
improvement in proceeding from LDA to B-LYP. However, it is for dissociation energies
that there is the greatest improvement using gradient-corrected functionals. LDA
predicts an unacceptable 1.89 eV error, which is reduced to 0.41 or 0.28 eV using B-LYP
or CAM(B)-LYP. It was the known poor results for LDA which held back the use of DFT
as a predictive tool for computational chemistry.
These results are supported by those presented in table 7, which are taken from Johnson et
al [31]. Note that these were obtained using only a 6-31G* basis. Improvement of the basis
set will approximately halve the error of the bond lengths for all methods except SCF. It
is clear that B-LYP gives tremendously improved dissociation energies. Furthermore it is
generally found that frequencies of vibration are better than MP2, although usually too
low.
Finally we observe that this field is rapidly growing and it has only been possible to touch
a small number of aspects. Further references may be found in the bibliography [31-46]
to both other important work on functional development as well as further calculations.
Table 3: Atomisation energies $\sum D_e$ calculated using different composite forms of the
exchange-correlation functional for the nine molecule subset in our modified G2 data set;
all energies are given in eV.
Table 4: Atomisation energies $\sum D_e$ calculated using different composite forms of the
exchange-correlation functional for the remaining molecules in our modified G2 data set;
all energies are given in eV.
Table 5: The mean deviations, mean absolute deviations and mean percentage errors
in the bondlengths and atomisation energies for the nine molecules in the subset of our
modified set of G2 molecules for different composite exchange-correlation functionals. The
deviations in the bondlengths and atomisation energies are expressed in units of Å and
eV respectively.

BONDLENGTHS
  Mean Deviation              0.012    0.017    0.003
  Mean Absolute Deviation     0.017    0.017    0.009
  Mean Percentage Error       1.2      1.1      0.7
DISSOCIATION ENERGIES $\sum D_e$
  Mean Deviation              1.21     0.15     0.01
  Mean Absolute Deviation     1.21     0.18     0.15
  Mean Percentage Error       22.0     5.9      3.8
Table 6: The mean deviations, mean absolute deviations and mean percentage errors in
the bondlengths, bond angles and atomisation energies for the thirty-three molecules in
our modified G2 data set for different composite exchange-correlation functionals. The
deviations in the bondlengths, bond angles and atomisation energies are expressed in units
of Å, degrees and eV respectively.

                              LDA      B-LYP    CAM(B)-LYP
BOND LENGTHS
  Mean Deviation              0.090    0.013    0.003
  Mean Absolute Deviation     0.017    0.013    0.009
  Mean Percentage Error       1.4      1.1      0.7
BOND ANGLES
  Mean Deviation              0.039   -0.463   -0.307
  Mean Absolute Deviation     1.91     1.68     1.51
  Mean Percentage Error       1.7      1.5      1.4
DISSOCIATION ENERGIES $\sum D_e$
  Mean Deviation              1.89     0.37     0.16
  Mean Absolute Deviation     1.89     0.41     0.28
  Mean Percentage Error       20.8     5.7      3.8
Table 7: Comparison of mean absolute errors in bondlengths (/Å), bond angles (/°), dipole
moments (/D), harmonic frequencies (/cm$^{-1}$) and atomisation energies (/kcal mol$^{-1}$) for 32
molecules, using the 6-31G* basis set and 9700 quadrature points per atom.
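It is a functional because the numerical value of E depends upon the function $\rho(\mathbf{r})$. Let
us derive the condition that $E[\rho]$ is stationary with respect to variations in the function
$\rho(\mathbf{r})$. We consider the variation
Here $\epsilon$ is a small parameter which we shall let tend to zero and $f(\mathbf{r})$ is an arbitrary function
which obeys all boundary conditions. Substituting into the functional we obtain,
Now use a Taylor expansion of the integrand through first order and use (131) to obtain,
$$E[\rho + \epsilon f] - E[\rho] = \epsilon \int \left( f(\mathbf{r}) \frac{\partial E}{\partial \rho} + \nabla f(\mathbf{r}) \cdot \frac{\partial E}{\partial \nabla\rho} \right) d\mathbf{r} \qquad (135)$$
In the limit of $\epsilon \to 0$, the left hand side is the first-order variation of E, which is zero
if $E[\rho]$ is stationary. Integrating the second term by parts, and assuming that the boundary
term vanishes because $f(\mathbf{r})$ obeys the boundary conditions, leaves $f(\mathbf{r})$ multiplying a single
integrand. Furthermore E must be stationary for all possible functions $f(\mathbf{r})$, and this can
only be the case if that integrand is zero. We call the integrand the functional
derivative $\delta E/\delta\rho$, and thus we have derived the condition that $E[\rho]$ is stationary to be
$$\frac{\delta E}{\delta \rho} = \frac{\partial E}{\partial \rho} - \nabla \cdot \frac{\partial E}{\partial \nabla\rho} = 0 \qquad (137)$$
The above is the famous Euler-Lagrange equation which we have used in chapter 2.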
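The stationarity condition can be checked numerically on a small example. The sketch below assumes a hypothetical one-dimensional functional $E[\rho] = \int_0^1 (\rho^2 + \rho'^2)\,dx$, a trial density $\rho = \sin \pi x$ and a variation $f = x(1-x)$ that vanishes at both ends; none of these choices comes from the text. It compares the finite-difference derivative of E along $\rho + \epsilon f$ with the integral of f times the Euler-Lagrange expression $2\rho - 2\rho''$.

```python
import math

def integrate(g, n=2000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))

# Hypothetical model: E[rho] = integral of F(rho, rho') with F = rho^2 + rho'^2.
rho = lambda x: math.sin(math.pi * x)
drho = lambda x: math.pi * math.cos(math.pi * x)
d2rho = lambda x: -math.pi ** 2 * math.sin(math.pi * x)

# Trial variation f(x), chosen to vanish at both boundaries.
f = lambda x: x * (1.0 - x)
df = lambda x: 1.0 - 2.0 * x

def energy(eps):
    """E[rho + eps f] evaluated by quadrature."""
    return integrate(lambda x: (rho(x) + eps * f(x)) ** 2
                     + (drho(x) + eps * df(x)) ** 2)

# Left hand side: central finite difference of E with respect to eps.
eps = 1e-3
numerical = (energy(eps) - energy(-eps)) / (2 * eps)

# Right hand side: f times (dF/drho - d/dx dF/drho') = f (2 rho - 2 rho'').
functional_derivative = lambda x: 2.0 * rho(x) - 2.0 * d2rho(x)
analytic = integrate(lambda x: f(x) * functional_derivative(x))

print(numerical, analytic)
```

The two numbers agree to within the quadrature error, which is the integration-by-parts step at work: the gradient term in the first-order variation has been moved onto the integrand.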
References
[1] J. Almlöf, K. Faegri and K. Korsell. J. Comput. Chem. 3 385 (1982)
[15] C. Lee, W. Yang and R. G. Parr. Phys. Rev. B37 785 (1988)
[16] B. Miehlich, A. Savin, H. Stoll and H. Preuss. Chem. Phys. Lett. 157 200 (1989)
[25] CADPAC5: The Cambridge Analytic Derivatives Package Issue 5, Cambridge 1992.
A suite of quantum chemistry programs developed by R. D. Amos with contributions
from I. L. Alberts, J. S. Andrews, S. M. Colwell, N. C. Handy, D. Jayatilaka, P. J.
Knowles, R. Kobayashi, N. Koga, K. E. Laidig, P. E. Maslen, C. W. Murray, J. E.
Rice, J. Sanz, E. D. Simandiras, A. J. Stone and M-D Su.
[27] A. St. Amant and D. R. Salahub. Chem. Phys. Lett. 169 387 (1990)
[28] A. D. Becke. Int. J. Quantum Chem. S23 599 (1989); A. D. Becke and R. M. Dickson.
J. Chem. Phys. 92 3610 (1990)
[37] J. A. Pople, P. M. W. Gill and B. G. Johnson. Chem. Phys. Lett. 199 557 (1992)
[40] C. W. Murray, G. J. Laming, N. C. Handy and R. D. Amos. Chem. Phys. Lett. 199
551 (1992)
in
Quantum Chemistry
Peter R. Taylor
San Diego Supercomputer Center
P. O. Box 85608
San Diego, CA 92186-9784
USA
Preface
1 Size-extensivity
1.1 Introduction
1.2 Separated electron pairs
1.3 Interacting electron pairs
1.4 General remarks
Bibliography
Preface
Only connect!
E. M. Forster
Only connected!
J. Cizek.
The purpose of this course is to review extensively the methods and the mo-
tivations behind coupled-cluster approaches to molecular electronic structure. These
methods had their origins - or, at least, were first used - in nuclear many-body the-
ory. They were introduced into quantum chemistry in the 1960's, but were relatively
little used until the late 1970's, perhaps in part because the original formulations
used techniques, like second quantization and diagrammatic methods, that were un-
familiar to quantum chemists. As time passed, however, these methods were recast
in more palatable mathematical forms; more importantly, efficient computational
implementations appeared and demonstrated great robustness and high accuracy. In
the last ten years coupled-cluster methods, or approximations to them, have become
widely used when the aim is to obtain very accurate results for molecules that are
well-described qualitatively at the Hartree-Fock level.
We shall concentrate here entirely on the methodology. Our presentation will
include the close connections between coupled-cluster methods and perturbation the-
ory, although perturbation theory itself is not treated in any detail. Differences
between coupled-cluster methods and more traditional (variational) treatments of
electron correlation will also be discussed. No use is made of diagrammatic tech-
niques in our derivations: diagrams are undoubtedly a useful tool for enumerating
and classifying terms, but are not necessary for an understanding of coupled-cluster
methods. Numerical comparisons of coupled-cluster results with those of other methods
are presented in detail in other courses at this school.
Helpful discussions over many years with individuals too numerous to list here
have influenced my thinking about coupled-cluster methods. I would, however, like
to acknowledge helpful discussions with and comments from several mates that relate
specifically to these lecture notes: Kim Baldridge, Les Barnes, Rod Bartlett, Tim
Lee, and Jan Martin.
Chapter 1
Size-extensivity
1.1 Introduction
Let us imagine that we wish to estimate the binding energy of the water dimer. That
is, we wish to determine the energy of the reaction
(1.1 )
In other courses methods for obtaining a realistic and accurate value of this binding
energy will be discussed in detail, but for the moment we are concerned only with
a fairly crude estimate, since our purpose here is not to predict the binding energy
accurately but to understand the behaviour of different computational methods. We
therefore begin with the simplest correlation treatment: second-order perturbation
theory. This gives -76.24602 $E_h$ as the energy of H$_2$O and -152.49975 as the energy
of the dimer. Hence the binding energy is 4.8 kcal/mol according to second-order
perturbation theory. Let us now use a more sophisticated - or at any rate a more
complicated - correlation treatment: configuration interaction (CI) with single and
double excitations (CISD). The CISD energy for H$_2$O is -76.24572, and for the
dimer -152.47869, giving a CISD binding energy of -8.0 kcal/mol! This is not only
a very different result from the perturbation theory value, it does not even seem to
be physically plausible. However, a further numerical experiment reveals an anomaly.
If we compute the energy of (H$_2$O)$_2$ with an arbitrarily large separation between the
monomers, the perturbation theory energy is -152.49204, which is twice the monomer
energy. But the CISD energy is -152.47186, which falls short of twice the monomer
energy by about 12 kcal/mol. This is in fact a large part of the original difference in
binding energies. If we compute the CISD binding energies as the difference between
the dimer energy and the dimer at infinite separation, we obtain a binding energy
of 4.3 kcal/mol, which is closer to the perturbation theory estimate and is more
physically plausible.
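The arithmetic behind these binding energies is easily reproduced. The short sketch below uses only the total energies quoted above; the conversion factor of 627.5095 kcal/mol per hartree and the function name `binding` are additions made for illustration.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per hartree (conversion factor, not from the text)

# Total energies in hartree, as quoted in the text.
e_mp2_monomer, e_mp2_dimer = -76.24602, -152.49975
e_cisd_monomer, e_cisd_dimer = -76.24572, -152.47869
e_cisd_separated = -152.47186   # CISD "supermolecule" at large separation

def binding(e_fragments, e_dimer):
    """Binding energy in kcal/mol; positive means the dimer is bound."""
    return (e_fragments - e_dimer) * HARTREE_TO_KCAL

# Second-order perturbation theory: twice the monomer as the reference.
mp2 = binding(2 * e_mp2_monomer, e_mp2_dimer)
# CISD with twice the monomer as the reference (the anomalous value).
cisd_naive = binding(2 * e_cisd_monomer, e_cisd_dimer)
# Amount by which the separated CISD dimer falls short of twice the monomer.
size_error = (e_cisd_separated - 2 * e_cisd_monomer) * HARTREE_TO_KCAL
# CISD with the separated dimer as the reference.
cisd_supermolecule = binding(e_cisd_separated, e_cisd_dimer)

print(f"MP2 binding energy:          {mp2:6.1f} kcal/mol")
print(f"CISD, 2 x monomer reference: {cisd_naive:6.1f} kcal/mol")
print(f"CISD size-consistency error: {size_error:6.1f} kcal/mol")
print(f"CISD, supermolecule ref.:    {cisd_supermolecule:6.1f} kcal/mol")
```

Using the separated dimer as the reference removes the error from the binding energy, but only because the same error is present in both energies being subtracted.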
This behaviour of the CISD is referred to as a lack of size-consistency. Pople
and co-workers [1] defined a size-consistent method as one for which
(1.2)
We turn now to an elementary analysis that illustrates the formal requirements for
size-extensivity. We employ as a model a simple beyond-Hartree-Fock treatment
introduced more than forty years ago: the method of separated electron pairs [3]. The
basic assumption is that only correlation effects involving specific disjoint pairs of
electrons are important. The total wave function is thus represented as an
antisymmetrized product of two-electron geminals
(1.5)
for a 2N-electron system; the geminals are strongly orthogonal in the sense that
This strong orthogonality simply means that the geminals can be expanded in disjoint
subsets of an orthonormal set of orbitals:
$$\Lambda_\mu(1,2) = \sum_{a \ge b}^{\{\mu\}} c^{\mu}_{ab} \left\{ \phi^{\mu}_a(1)\phi^{\mu}_b(2) + (1 - \delta_{ab})\,\phi^{\mu}_b(1)\phi^{\mu}_a(2) \right\} [\alpha\beta - \beta\alpha], \qquad (1.7)$$
where we have included the singlet spin function, and the notation $\{\mu\}$ is used to
indicate that the set of orbitals used for each pair $\mu$ is specific to that pair. We note
that the closed-shell Hartree-Fock function is a special case of this form, in which
Beginning with the Hartree-Fock configuration, denoted $\Psi_0$, then, we can conveniently
write the unnormalized separated-pair wave function of Eq. 1.5 as
where
$$X_\mu = {\sum_{ab}^{\{\mu\}}}' c^{\mu}_{ab} D^{\mu}_{ab}. \qquad (1.10)$$
Here the prime indicates that the term with b = 0 is excluded; $D^{\mu}_{ab}$ is an excited
configuration given by
Similarly,
$$X_{\mu\nu} = {\sum_{ab}^{\{\mu\}}}' {\sum_{cd}^{\{\nu\}}}' c^{\mu}_{ab} c^{\nu}_{cd} D^{\mu\nu}_{ab,cd} \qquad (1.12)$$
involves configurations in which there are excitations out of the Hartree-Fock orbitals
in both pairs $\mu$ and $\nu$. This would thus create up to four-fold excitations. But the
coefficient multiplying each such excitation is a product of lower-order excitation
coefficients.
Now, for example, if we have a lattice of N noninteracting identical two-electron
systems the separated-pair model will be exact. Hence the energy obtained,
say, by substituting the wave function into the variation principle and minimizing
the resulting functional, must be size-extensive. In other words, the total energy is
exactly N times the energy of one subsystem. Let us compare this with the situation
obtained using other approximations. We may, for instance, retain only single and
double excitations in the wave function. In the context of separated electron pairs,
the resulting wave function is given by
since, as we have seen, all other terms in the wave function Eq. 1.5 involve higher than
double excitations. Now, we already know that the CISD energy is not size-extensive.
This must therefore be related to the absence of the product-form higher excitations
in Eq. 1.9, since this is the only difference. How do these higher excitations affect the
size-extensivity behaviour? One way to see the anomalous behaviour [4] is to calculate
the probability $P_\mu$ of excitation from a given pair $\mu$. This probability is related to
the norm of wave function terms involving $\mu$, suitably normalized: specifically
(1.14)
for the case of only single and double excitations, for example. (We have omitted
the summation label that identifies the disjoint subsets of correlating orbitals for
simplicity.) For the full separated-pair wave function, on the other hand, the same
probability is
(1.16)
(1.17)
(1.18)
We note that this is not equal to the CISD value of Eq. 1.14, except in the limiting
case of only one electron pair. Assume once again that we are considering a lattice
A considerable effort to analyze and characterize treatments more general than sepa-
rated electron pairs was undertaken by Sinanoglu [6,7]. Drawing on terminology and
arguments from statistical theories of nonideal gases, he introduced the notion of a
cluster expansion of the wave function, restricting himself initially to only pair clus-
ters: functions to describe correlation between electrons I and J. To make discussion
of this approach easier, we will ignore single excitations for the moment and assume
that the correlation effects between a given electron pair involve only double
excitations. Denoting the pair cluster functions $u_{IJ}$, we can rewrite the CI with double
excitations (CID) wave function as
$$\Psi_{\mathrm{CID}} = \Psi_0 + \sum_{I>J} \mathcal{A}\left(u_{IJ} \Psi_0^{IJ}\right), \qquad (1.19)$$
where $\Psi_0^{IJ}$ is the (N - 2)-electron wave function obtained by deleting occupied
spin-orbitals I and J from $\Psi_0$, and $\mathcal{A}$ is an antisymmetrizer that ensures the overall
N-electron configuration is antisymmetric to exchange of electrons. The CID pair
cluster functions would then be given by
There are many different perspectives from which one can view the cluster expansion
of the wave function, and the issue of size-extensivity. Thirty years ago, for example,
when there were essentially no computational implementations, much of the formal
effort was devoted to Lie-algebraic analyses, wave operators, and the logarithm of
the wave function. In simple terms, size-extensivity is obtained by having an addi-
tively separable energy, which in turn derives from a multiplicatively separable wave
function [5]. We can see this for our noninteracting two-electron systems: the total
wave function can be a simple product of the subsystem wave functions, leading to a
total energy which is the sum of subsystem energies [4]. The simple product suffices
because the wave functions of the subsystems do not overlap.
On the other hand, those who have entered the field from perturbation theory
usually use diagrammatic analyses, and express this perspective on size-extensivity by
stating that a size-extensive energy contains no "unlinked diagrams" (see, e.g., Ref. 2).
Feynman diagrams (also called Brandow diagrams in this context) or Hugenholtz
diagrams provide a bookkeeping strategy [8] for energy contributions, which for the
usual quantum-chemical single-reference perturbation theories are products (or sums
of products) of two-electron integrals divided by energy denominators. If a given
energy contribution can be factorized into a simple product of two other contributions,
the corresponding diagram will consist of two closed disjoint parts. Hence there are no
"links" between the two subdiagrams. The absence of any such terms is a necessary
and sufficient condition for size-extensivity. They thus allow us to determine whether
†This is the primary reason for preferring the term "disconnected cluster" to "unlinked cluster" in
discussing wave function terms. Then the word "unlinked" is used only with diagrams. Of course, we
should also note that in the early literature the term "unlinked cluster" sometimes means "unlinked
diagram", not "disconnected cluster"!
Chapter 2
We begin with a simple formulation of the CI expansion. Once again, we let in-
dices I, J ... denote occupied Hartree-Fock spin-orbitals and A, B ... unoccupied
spin-orbitals. We can use second quantization to obtain our excited configurations,
(2.1)
for example, where $\Psi_0$ is the Hartree-Fock determinant. The CI expansion in
intermediate normalization can then be written
(2.3)
thus generating all single excitations from $\Psi_0$, etc. We may simplify Eq. 2.3 to
$(1 + C)\Psi_0$, where all the excitation operators are denoted by C. The CI eigenvalue
equations can be obtained, for example, by projecting the "Schrödinger equation"
$H(1 + C)\Psi_0 = E(1 + C)\Psi_0$ onto the basis of many-electron states obtained by applying
all the excitation operators in 1 + C to $\Psi_0$. These states thus comprise all single
excitations $\Psi_I^A$, double excitations $\Psi_{IJ}^{AB}$, etc. The resulting equations are
where $E_0$ is the HF energy, and $\varepsilon_A$ ($\varepsilon_I$) are virtual (occupied) orbital energies. This
form is obviously appropriate only for canonical Hartree-Fock orbitals. If other
Hartree-Fock orbitals are used (e.g., localized orbitals), the form of Eq. 2.8 can be
modified straightforwardly to take account of a nondiagonal Fock operator. We shall
assume henceforth that the Hamiltonians H and W are given in normal-ordered form
(all creation operators stand to the left of annihilation operators). The CI equations
for the correlation energy $\epsilon$ are then
(2.9)
and
(2.10)
Suppose we consider once again the case of M noninteracting two-electron systems.
The correlation energy
should scale linearly with M for size-extensive behaviour, but we begin only by
assuming that it scales as $M^\beta$, for some power $\beta$. Eq. 2.11 then shows that the CI
coefficient $c_{IJ}^{AB}$ must scale as $M^{\beta-1}$. This is so because the matrix elements of W are
just integrals that are independent of M. Since there are M subsystems, and the
left-hand side (LHS) of Eq. 2.9 overall must scale as $M^\beta$ (as assumed for the RHS),
the coefficient must go as $M^{\beta-1}$.
Let us now consider Eq. 2.10. Evidently, the RHS scales as $M^\beta M^{\beta-1}$, or $M^{2\beta-1}$.
We can take the LHS term by term. The first term is $\langle\Psi_{IJ}^{AB}|W|\Psi_0\rangle$, which is just a
two-electron integral and is independent of M. The second term is $\langle\Psi_{IJ}^{AB}|W|C_2\Psi_0\rangle$.
This gives us products of a CI coefficient and a matrix element. The only matrix
elements that are nonvanishing are those in which all MOs are on the same
two-electron system, hence the matrix elements are independent of M. The coefficient
gives us an $M^{\beta-1}$ dependence. So the LHS has a term that is $M^0$, and one that
is $M^{\beta-1}$. The RHS goes as $M^{2\beta-1}$. We are thus in a quandary. There is no value
of $\beta$ that will have all the terms in the equation scaling correctly, at least for M > 1.
In fact, since the LHS has a constant term already, we would need $\beta = 1$ to make
the LHS all constant. That is also what we want for size-extensivity, of course, but
since the RHS would then go as M, we cannot make the two sides of the equation
"dimensionally" equivalent. This simple argument, analogous to checking units in
numerical calculations, shows that CID cannot be size-extensive.
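The failure can be made concrete with a small numerical sketch. It assumes a hypothetical model that is not taken from these notes: M identical noninteracting pairs, each with one doubly excited configuration lying `delta` above the reference and coupled to it by a matrix element `k`. Within CID, double excitations on different pairs do not couple to one another, so the secular problem reduces to a quadratic equation for the energy.

```python
import math

def exact_energy(m, k, delta):
    """Exact correlation energy of m noninteracting two-level pairs (additive)."""
    return m * (delta / 2 - math.sqrt(delta ** 2 / 4 + k ** 2))

def cid_energy(m, k, delta):
    """Lowest CID eigenvalue for the model.

    The reference couples (strength k) to m equivalent double excitations at
    energy delta. With equal coefficients c on all doubles the CI equations
    read E = m*k*c and k + delta*c = E*c, i.e. E**2 - delta*E - m*k**2 = 0.
    """
    return delta / 2 - math.sqrt(delta ** 2 / 4 + m * k ** 2)

k, delta = 0.1, 1.0   # hypothetical coupling and excitation energy
for m in (1, 2, 10, 100, 10000):
    print(m, exact_energy(m, k, delta), cid_energy(m, k, delta))
```

For a single pair the two energies coincide. As M increases the CID energy recovers a steadily smaller fraction of the exact, additive result, and for very large M it grows only as the square root of M rather than linearly.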
It would presumably be possible to establish that no truncated CI is size-
extensive by similar arguments, but the enumeration of the various matrix elements
would become extremely tedious. The result can be proved both algebraically and by
diagrammatic techniques in an elegant way, but this is not necessary for our present
purposes. We simply accept here that the result that holds for CID will hold for any
other truncated CI: as we saw in Chapter 1 the problem is that our wave function
contains no disconnected cluster contributions, or, as some would say, our energy
expression includes unlinked diagrams.
(2.12)
and
$$T_1 = \sum_{A}\sum_{I} t_I^A X_A^\dagger X_I, \quad \text{etc.}, \qquad (2.13)$$
in complete analogy with the CI case above. The purpose in introducing different
labels is that we now propose to write the wave function not as the linear expansion
of Eq. 2.3, but as the exponential [9,10]
(2.15)
Since
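$$T_2 = \sum_{A>B}\sum_{I>J} t_{IJ}^{AB} X_A^\dagger X_I X_B^\dagger X_J, \qquad (2.16)$$
it follows that
$$T_2^2 = \sum_{I>J}\sum_{K>L}\sum_{A>B}\sum_{C>D} t_{IJ}^{AB} t_{KL}^{CD} X_A^\dagger X_I X_B^\dagger X_J X_C^\dagger X_K X_D^\dagger X_L \qquad (2.17)$$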
Thus the exponential ansatz automatically generates all the disconnected clusters.
The only parameters that appear independently in the wave function are the con-
nected cluster coefficients. This is clearly a very convenient way to build size-
extensivity into the method. We can also relate the coefficients of excitation levels in
Eqs. 2.14 and 2.3. For example,
$$c_I^A = t_I^A \qquad (2.18)$$
and
$$c_{IJ}^{AB} = t_{IJ}^{AB} + t_I^A t_J^B - t_I^B t_J^A \qquad (2.19)$$
for the single and double excitations. We can thus see the emergence of disconnected
terms even in the double excitations. Further, the reader may care to verify by explicit
expansion of the terms derived from $T_2^2$ that the same disconnected quadruples are
obtained as were listed in Eq. 1.23. (The $T_2^2$ expansion gives 36 terms that are
permutationally equivalent to the 18 independent terms in Eq. 1.23; the factor of
one-half in Eq. 2.15 then accounts for the redundancy.)
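How the exponential builds product coefficients can be seen in a toy model. The sketch below assumes a hypothetical set of independent subsystems, each with a single excitation operator $X_i$ that commutes with the others and satisfies $X_i^2 = 0$; the amplitudes and the names `apply_t` and `exp_t` are illustrative only. A state is stored as a map from the set of excited subsystems to its coefficient.

```python
from math import factorial

def apply_t(state, amplitudes):
    """Apply T = sum_i t_i X_i, where X_i excites subsystem i and X_i^2 = 0."""
    out = {}
    for excited, coeff in state.items():
        for i, t in enumerate(amplitudes):
            if i not in excited:               # exciting the same subsystem twice gives zero
                key = excited | frozenset([i])
                out[key] = out.get(key, 0.0) + t * coeff
    return out

def exp_t(amplitudes):
    """Return exp(T) acting on the reference as {excited subsystems: coefficient}."""
    result = {frozenset(): 1.0}
    term = {frozenset(): 1.0}                  # holds T^n acting on the reference
    for n in range(1, len(amplitudes) + 1):
        term = apply_t(term, amplitudes)
        for key, coeff in term.items():
            result[key] = result.get(key, 0.0) + coeff / factorial(n)
    return result

t = [0.1, -0.2, 0.3]   # hypothetical connected amplitudes
wavefunction = exp_t(t)
for excited in sorted(wavefunction, key=lambda s: (len(s), sorted(s))):
    print(sorted(excited), wavefunction[excited])
```

Each simultaneous excitation of two subsystems appears twice in $T^2$, once for each ordering, and the factor 1/2! removes the duplication, so the coefficient is the plain product of the two amplitudes. The same bookkeeping gives the triple product at third order.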
The wave function $\Psi_{\mathrm{EXP}}$ is implicitly equivalent to a full CI, since we have so
far done nothing about truncating the expansion of T. Indeed, from a full CI per-
spective all we have done is complicate matters, since now we have a highly nonlinear
representation of the wave function, compared to the linear CI expansion. However,
it is obvious that even if we truncate T at some fixed excitation level, we will retain
all disconnected clusters arising from the truncated set of connected terms. For in-
stance, we can approximate T by T2 only. The exponential then generates all orders
of disconnected clusters of pair excitations, just like Sinanoglu's pair-cluster expan-
sion (Eq 1.21). But a crucial problem remains - how do we optimize the connected
cluster coefficients? That is, how do we devise a computational method that can
exploit this type of wave function? We have already determined that a variational
approach will not do, because the number of terms (and the nonlinearity here) would
become impossible.
We begin by writing the unknown wave function $\Psi$ as $\exp(T)\Psi_0$. At this stage, we
make no assumptions about truncation of T. In a variational treatment we would
multiply $H\exp(T)\Psi_0 = E\exp(T)\Psi_0$ on the left by $\exp(T)^\dagger$, and then obtain an
equation system by setting the change in energy with respect to any of the t coefficients
to zero. We have already accepted that this is not feasible. How to proceed next
As discussed elsewhere at this summer school, the operator on the left-hand side can
be rewritten using a Hausdorff commutator expansion,
$$\exp(A)\,B\,\exp(-A) = B + [A,B] + \frac{1}{2!}[A,[A,B]] + \cdots \qquad (2.21)$$
In the present case, however, we receive a special bonus when this expansion is
developed. We note first that the excitation operator $X_A^\dagger X_I$ commutes with the operator T.
The general creation/annihilation operator product $X_P^\dagger X_Q$ does not, so H does not
commute with T. But we observe that
$$\exp(-T)\,H\,\exp(T) = H + [H,T] + \frac{1}{2}[[H,T],T] + \frac{1}{3!}[[[H,T],T],T] + \frac{1}{4!}[[[[H,T],T],T],T]. \qquad (2.25)$$
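A terminating Hausdorff expansion can be verified exactly on small matrices. The sketch below is a toy illustration only: the 3 x 3 matrices standing in for T and H are hypothetical, and the series terminates here because the stand-in for T is nilpotent ($T^3 = 0$), whereas in the coupled-cluster case the termination follows from H containing at most two-body operators. Exact rational arithmetic is used so that the comparison is free of rounding error.

```python
from fractions import Fraction
from math import factorial

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def scale(c, x):
    return [[c * a for a in r] for r in x]

def commutator(x, y):
    return add(matmul(x, y), scale(-1, matmul(y, x)))

def exp_nilpotent(a):
    """exp(a) = 1 + a + a^2/2, valid only because a^3 = 0 in this 3x3 example."""
    n = len(a)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    return add(add(identity, a), scale(Fraction(1, 2), matmul(a, a)))

# Hypothetical stand-ins: T strictly upper triangular (nilpotent), H arbitrary.
T = [[Fraction(v) for v in row] for row in [[0, 1, 2], [0, 0, 3], [0, 0, 0]]]
H = [[Fraction(v) for v in row] for row in [[1, 2, 0], [0, 1, 1], [4, 0, 2]]]

# Left hand side: exp(-T) H exp(T), evaluated directly.
lhs = matmul(matmul(exp_nilpotent(scale(-1, T)), H), exp_nilpotent(T))

# Right hand side: H + [H,T] + (1/2!)[[H,T],T] + ... up to four nested commutators.
rhs = H
nested = H
for n in range(1, 5):
    nested = commutator(nested, T)
    rhs = add(rhs, scale(Fraction(1, factorial(n)), nested))

# The fifth nested commutator vanishes, so the series has terminated.
fifth = commutator(nested, T)
print(lhs == rhs, all(v == 0 for row in fifth for v in row))
```

The direct product and the commutator series agree exactly, and every element of the fifth nested commutator is zero.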
We can now use the Hausdorff expansion to develop an explicit form for
optimizing the wave function. Just as we proceeded above, we project onto a basis of
states adequate to define all the independent coefficients in the wave function. Such
a set is again provided by the Hartree-Fock configuration plus the excited
configurations generated by all the excitation operators in T. Let $\Psi_{IJK\ldots}^{ABC\ldots}$ denote an arbitrary
excited configuration from this set. We thus obtain the equations
$$\langle\Psi_0|\, H + [H,T] + \frac{1}{2}[[H,T],T] + \frac{1}{3!}[[[H,T],T],T] + \frac{1}{4!}[[[[H,T],T],T],T] \,|\Psi_0\rangle = E \qquad (2.26)$$
and
$$\langle\Psi_{IJK\ldots}^{ABC\ldots}|\, H + [H,T] + \frac{1}{2}[[H,T],T] + \frac{1}{3!}[[[H,T],T],T] + \frac{1}{4!}[[[[H,T],T],T],T] \,|\Psi_0\rangle = 0 \quad \forall\; IJK\ldots \text{ and } ABC\ldots. \qquad (2.27)$$
Eq. 2.26 and the set of equations 2.27 define the coupled-cluster method. Two crucial
points can be inferred from these coupled-cluster equations. The first is that the
finite Hausdorff expansion leads to (at worst) quartic equations Eq. 2.27. This is
true no matter what level of excitation is generated by exp(T). Second, the unknown
energy does not appear anywhere in the equations that determine the various cluster
amplitudes $t_{IJK\ldots}^{ABC\ldots}$. This decoupling of Eq. 2.26 from the system 2.27, so that we
solve the latter first and obtain the energy from the former, produces an additively
separable energy and a size-extensive result. This is trivially true for the full CI case
(all possible excitations in T), but it is also true for any truncation of T by excitation
level. We will now explicitly demonstrate that size-extensive results are obtained for
our model noninteracting two-electron systems, by considering only double excitations
in T - the CCD model.
Using the operator W (Eq. 2.8) instead of H, the CCD equations can be
written as
$$\langle\Psi_0|\, [W,T_2] + \frac{1}{2}[[W,T_2],T_2] \,|\Psi_0\rangle = \epsilon, \qquad (2.28)$$
and
$$\langle\Psi_{IJ}^{AB}|\, W + [W,T_2] + \frac{1}{2}[[W,T_2],T_2] \,|\Psi_0\rangle = 0. \qquad (2.29)$$
We note that the Hausdorff expansion terminates exactly after only three terms in
the CCD case. We expand the commutators, and use the fact that bra and ket may
differ by no more than a double excitation for a non-zero matrix element. Eq. 2.28
then simplifies to
(2.30)
We again assume that the correlation energy $\epsilon$ scales as $M^\beta$ for M noninteracting
two-electron systems, so that the amplitudes will again scale as $M^{\beta-1}$. Similarly,
Eq. 2.29 can be rewritten as
(2.31)
where the parentheses simply let us group the LHS into three terms. The first of
these terms involves no amplitudes and reduces to a two-electron integral that is in-
dependent of M. The second term (the matrix element of (WT2 - T2W)) effectively
involves matrix elements between double excitations: these are zero when the exci-
tations are on different systems or from other than the pair I J. Where the matrix
element is nonzero, it comprises an integral or integrals multiplying an amplitude:
the integral is independent of M so the overall scaling of this term is $M^{\beta-1}$. The
third term involves disconnected quadruple excitations. After a certain amount of
manipulation it may be reduced to the form
$$\sum_{K>L}\sum_{C>D} \langle\Psi_{IJ}^{AB}|W|\Psi_{IJKL}^{ABCD}\rangle\, \tilde{t}_{IJKL}^{ABCD}, \qquad (2.32)$$
where
Note that this disconnected quadruples coefficient is not the 18-term form of Eq. 1.23:
the leading term $t_{IJ}^{AB} t_{KL}^{CD}$ from Eq. 1.23 does not occur in the expansion of the LHS of
Eq. 2.31. This has profound consequences, and is discussed in the next section as well
as here. The matrix element between double and quadruple excitations in Eq. 2.32
can be simplified, since
(2.34)
and this is actually a combination of two-electron integrals. These integrals are
independent of M, of course. What of the dependence of $\tilde{t}$? For our noninteracting
systems, any term corresponding to simultaneous single excitations on two subsystems
(all terms involving subscripts like IK or superscripts like AC) is zero. Further,
excitations from IJ into correlating orbitals from a different pair also have zero
coefficients. The only terms that survive come from the case {KL} = {IJ}, which since
restricted summations are used in Eq. 2.32 implies K = I and J = L. Thus for the
noninteracting two-electron systems $\tilde{t}_{IJKL}^{ABCD}$ reduces to
$$+\; t_{IJ}^{AD} t_{JI}^{BC} + t_{IJ}^{BC} t_{JI}^{AD} - t_{IJ}^{BD} t_{JI}^{AC} + t_{IJ}^{CD} t_{JI}^{AB}. \qquad (2.35)$$
From the original definition of the amplitudes $t_{IJ}^{AB}$ in Eq. 2.16 we can see that we
require that $t_{JI}^{AB} = -t_{IJ}^{AB}$, whereupon most of the terms in Eq. 2.35 will cancel with
one another. For noninteracting two-electron systems we then finally obtain
Note that the result has no sum over occupied orbitals. Hence this term in fact has
only the M dependence of the coefficients, or $M^{2\beta-2}$ overall. Let us return (finally!)
to the analysis of Eq. 2.31. The LHS comprises three terms, which scale respectively
as $M^0$, $M^{\beta-1}$, and $M^{2\beta-2}$. And we thus see that all three terms scale the same way
for $\beta = 1$, which is precisely what is required for size-extensive behaviour.
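The scaling argument can be checked on a simple model. The sketch below assumes a hypothetical system of M identical noninteracting pairs, each with one double excitation at energy `delta` coupled to the reference by `k`; this model and the function names are not taken from the notes. Under that assumption the three terms of the amplitude equation reduce, for each pair, to $k + \Delta t - k t^2 = 0$, where the quadratic term is the amplitude product carrying no sum over pairs, and the correlation energy is the sum of $k t$ over pairs.

```python
import math

def exact_pair_energy(k, delta):
    """Exact correlation energy of one two-level pair."""
    return delta / 2 - math.sqrt(delta ** 2 / 4 + k ** 2)

def ccd_amplitude(k, delta, iterations=100):
    """Solve k + delta*t - k*t**2 = 0 by the fixed-point iteration t <- -k/(delta - k*t)."""
    t = -k / delta                  # first-order (perturbation theory) starting guess
    for _ in range(iterations):
        t = -k / (delta - k * t)
    return t

def ccd_energy(m, k, delta):
    """The amplitude equation contains no m and no energy; the energy is m * k * t."""
    return m * k * ccd_amplitude(k, delta)

k, delta = 0.1, 1.0   # hypothetical coupling and excitation energy
for m in (1, 2, 10, 100):
    print(m, ccd_energy(m, k, delta), m * exact_pair_energy(k, delta))
```

Because neither M nor the energy enters the amplitude equation, the amplitude is independent of M and the energy is strictly linear in M, which is the $\beta = 1$ behaviour derived above. In this model CCD also reproduces the exact energy, since each pair has only one double excitation.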
In this way, we see that the generation of disconnected quadruples by the ex-
ponential operator exp(T) provides exactly what is needed for a size-extensive result.
We may note that the unknown correlation energy nowhere appears in the equations
defining the amplitudes, unlike the CI case. We may also note that the amplitude
product on the RHS of Eq. 2.36 corresponds, in effect, to the "quadruple excitation"
$\Psi_{IJIJ}^{ABCD}$, which cannot occur since the Pauli principle forbids annihilating the same
occupied spin-orbital twice. Such a contribution is often referred to as an exclusion
principle violating (EPV) term. We shall meet EPV terms again in this course.
There are many formulations of the coupled-cluster equations, some differing from
the previous section, at least superficially. We shall briefly review some aspects that
the reader may encounter in the literature.
It is common not to use the Hausdorff expansion when setting up the CC
equations [4,12]. We can instead project the "Schrödinger equation" $H\exp(T)\Psi_0 =
E\exp(T)\Psi_0$ onto a basis of many-electron states. Using again the operator $W =
H - E_0$ we obtain
In this approach, the correlation energy appears explicitly in the equations defining
the amplitudes. However, at each level of excitation, terms will arise on the LHS,
involving disconnected clusters, that will cancel the term on the RHS. We can see
this explicitly for the CCD case again, for which the equations are
(2.40)
(2.41)
The first equation just defines the correlation energy, given the CCD amplitudes. The
remaining equations include the term
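(2.42)
$$\sum_{K>L}\sum_{C>D} \langle\Psi_{IJ}^{AB}|W|\Psi_{IJKL}^{ABCD}\rangle\, t_{IJ}^{AB} t_{KL}^{CD}. \qquad (2.45)$$
We note that $\langle\Psi_{IJ}^{AB}|W|\Psi_{IJKL}^{ABCD}\rangle = \langle\Psi_0|W|\Psi_{KL}^{CD}\rangle$, and recall that the correlation energy
is given by
$$\epsilon = \sum_{K>L}\sum_{C>D} \langle\Psi_0|W|\Psi_{KL}^{CD}\rangle\, t_{KL}^{CD}. \qquad (2.46)$$
Hence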
the theorem. From a diagrammatic point of view, the connected cluster expansion
of [(H - E) exp(T)]c comprises only "connected diagrams". These form a proper
subset of the linked diagrams, hence the energy will be size-extensive (no unlinked
diagrams).†
Bartlett and co-workers (see, e.g., Ref. 14) have developed another formulation of
the CC equations - one that is useful for displaying the connections between CC
methods and perturbation theory. It is an "operator form", given symbolically as
(2.48)
for the CCD case, for example. Here we have an operator relation that becomes
an equation when we operate on the right on the HF determinant $\Psi_0$, and we then
project onto the appropriate space of excitations. In the above case, for example, we
would project on the double excitations. The result is
From our earlier discussion we realize that only connected terms should be retained
here, something that we henceforth regard as implicit in the notation. That is, the
simple product notation $WT_2$ really represents the commutator $[W, T_2]$, etc. Thus
with these assumptions: connected terms only, canonical orbitals, and the
normal-ordered Hamiltonian W of Eq. 2.8, we have a convenient and compact notation for
the CC equations. The general form of the equations can be written as
theory in which only linked diagrams are included.) For instance, we can envisage
a perturbational approach in which we assume initially that the vector of doubles
amplitudes $t_2 = 0$ on the RHS of Eq. 2.48. We can then solve for a first-order
estimate $t_2^{(1)}$ from
(2.53)
Again, this is an operator relation: each side should operate on $\Psi_0$, and then matrix
elements should be taken with all $\Psi_{IJ}^{AB}$. This yields
Hence
$$t_{IJ}^{AB(1)} = \frac{\langle AB \| IJ \rangle}{\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B}, \qquad (2.55)$$
the first-order perturbation theory wave function is
We have already stated that attempts to use the exponential ansatz in a variational
formulation seem doomed. One way to see this is to consider the variational functional
(2.59)
2.7 Summary
We have seen that the use of the exponential ansatz provides us with a means of
performing calculations that are rigorously size-extensive. In practice, of course, some
truncation of the excitation operator expansion will be required, and we shall proceed
to discuss next the restriction to single and double excitations only. We may note,
here, however, that size-extensive results will be obtained not only if we truncate T
by excitation level, but also if we eliminate individual terms from the coupled-cluster
equations. We shall see later how simpler treatments can be developed using this
tactic.
Chapter 3
(3.1)
and
$$= W + WT_1 + WT_2 + \tfrac{1}{2}WT_2^2 + WT_1T_2 + \tfrac{1}{2}WT_1^2T_2 + \tfrac{1}{2}WT_1^2 + \tfrac{1}{3!}WT_1^3 + \tfrac{1}{4!}WT_1^4, \qquad (3.2)$$
(3.3)
Recall that here and in what follows we restrict ourselves to connected contributions
only in the expansion of $W\exp(T_1 + T_2)$. The coupled quartic equations 3.1 and 3.2
are solved for the singles and doubles amplitudes, and the correlation energy can then
be evaluated. We shall discuss the solution of the nonlinear equations below. For the
moment, we will expand the equations somewhat, explicitly performing the operation
on $\Psi_0$ on the right and the projection on the singly and doubly excited states on the
left. This gives
and
From the Slater-Condon rules (or from explicit consideration of the second quan-
tized form of W) we can see that the indices K, L, C, D cannot all be different
from I, J, A, B. At least two of the former must coincide with the latter, giving
(Recall here once again that we are assuming canonical Hartree-Fock orbitals.) The
other terms in Eqs. 3.4 and 3.5 can be expanded in like manner, giving us a complete
set of equations for determining the resulting amplitudes. The full equations can be
found, for example, in Purvis and Bartlett [18]. Instead of enumerating all the terms
here, we will turn to an important issue that we have largely ignored so far: the spin
symmetry of the problem.
Suppose that we are concerned with a system whose Hartree-Fock wave function is
a closed shell (all orbitals maximally occupied). If we wish to perform, for example,
a CI calculation, we can use simple spin-orbital excitations just as we have done so
far. However, if the final wave function is to be a totally symmetric singlet state,
the coefficients of the excited determinants will not be linearly independent. The
simplest case would be the single excitations: denoting beta spin by a bar, and using
lower-case letters to denote orbitals, we can see that for $\Psi_i^a c_i^a$ and $\Psi_{\bar\imath}^{\bar a} c_{\bar\imath}^{\bar a}$ we must
have $c_i^a = c_{\bar\imath}^{\bar a}$. Obviously, we can reduce the number of singles coefficients by a factor
of two by defining configurations that are spin eigenfunctions,
(3.8)
(3.9)
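$$^1\Psi_{ij}^{ab} = \frac{1}{2}\left[\Psi_{i\bar\jmath}^{a\bar b} - \Psi_{i\bar\jmath}^{\bar a b} + \Psi_{\bar\imath j}^{\bar a b} - \Psi_{\bar\imath j}^{a\bar b}\right] \qquad (3.10)$$
and
$$^3\Psi_{ij}^{ab} = \frac{1}{\sqrt{12}}\left[2\Psi_{ij}^{ab} + \Psi_{i\bar\jmath}^{a\bar b} + \Psi_{i\bar\jmath}^{\bar a b} + \Psi_{\bar\imath j}^{\bar a b} + \Psi_{\bar\imath j}^{a\bar b} + 2\Psi_{\bar\imath\bar\jmath}^{\bar a\bar b}\right]. \qquad (3.11)$$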
The spin eigenfunctions of Eqs. 3.10 and 3.11 are often referred to informally as
singlet- and triplet-coupled double excitations, respectively. More correctly, Eq. 3.10
involves two occupied orbitals coupled as a singlet, and two virtual orbitals coupled
similarly. In Eq. 3.11 the holes are coupled as a triplet, the virtuals are coupled as
a triplet, and the overall coupling is of course as a singlet. Obviously, these spin
couplings are not appropriate for cases of index coincidence. For i = j or a = b the
linearly dependent terms are eliminated and the configuration is renormalized, as for
the configuration $^1\Psi_{ii}^{aa} = \Psi_{i\bar\imath}^{a\bar a}$. The triplet-coupled function Eq. 3.11 vanishes for any
coincidences among indices.
Using these spin-adapted configurations, we will substantially reduce the length
of the correlating expansion. The number of double excitations will go down by a fac-
tor of about three, for example, compared to the use of spin-orbital excitations. This
also leads to some extension of the definition of pairs, which hitherto have referred to
spin orbital pairs. For instance, if i and j are different spatial orbitals, we can consider
terms like those of Eq. 3.10, for all $a \ge b$, as constituting a correlating expansion for
the "singlet interorbital pair" ij. That is, the correlation of a singlet-coupled pair of
electrons in orbitals i and j is described by the expansion
(3.12)
We can similarly have triplet interorbital pairs, and intraorbital pairs ii that are
necessarily singlet-coupled. Provided the correlation treatment we are using involves
summations over both occupied and virtual manifolds, the result is rigorously inde-
pendent of whether spin-adapted configurations are used or not. The energy from a
CISD or CCSD calculation is thus unaffected by the spin coupling, but the compu-
tational effort will be significantly reduced.
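The factor of about three quoted above can be checked by simple counting. The following is a minimal sketch, assuming a closed-shell reference with `n_occ` doubly occupied and `n_virt` virtual spatial orbitals and counting only determinants with M_S = 0; the function names are hypothetical and are not taken from any program discussed in these notes.

```python
from math import comb

def spin_orbital_doubles(n_occ, n_virt):
    """Doubly excited determinants with M_S = 0 (alpha-alpha, beta-beta, alpha-beta)."""
    same_spin = comb(n_occ, 2) * comb(n_virt, 2)   # alpha-alpha; beta-beta is identical
    opposite_spin = n_occ**2 * n_virt**2           # one alpha and one beta electron excited
    return 2 * same_spin + opposite_spin

def spin_adapted_doubles(n_occ, n_virt):
    """Singlet-coupled (i >= j, a >= b) plus triplet-coupled (i > j, a > b) configurations."""
    singlet = (n_occ * (n_occ + 1) // 2) * (n_virt * (n_virt + 1) // 2)
    triplet = comb(n_occ, 2) * comb(n_virt, 2)
    return singlet + triplet

# The ratio approaches three from below as the orbital spaces grow.
for n_occ, n_virt in [(5, 20), (20, 200)]:
    ratio = spin_orbital_doubles(n_occ, n_virt) / spin_adapted_doubles(n_occ, n_virt)
    print(n_occ, n_virt, round(ratio, 2))
```

The index-coincidence cases (i = j or a = b) are what keep the ratio slightly below three for small orbital spaces.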
We should note that the spin-couplings given in Eqs. 3.10 and 3.11 are to a
considerable extent arbitrary. Any unitary transformation that mixes the two couplings
will give a pair of orthonormal singlet spin eigenfunctions that are formally
acceptable. However, there may be computational advantages to using a particular
spin-adaptation scheme. The one described is that which is obtained by the earli-
est attempts to exploit spin symmetry in CCD or CCSD calculations [4,13,19,20],
whether arrived at diagrammatically or algebraically. This spin-coupling was considered
most appropriate for early direct CI calculations [21] because it provided
maximum reduction in work. Nevertheless, recent developments have concentrated
on different, simpler schemes, as we now discuss.
The singlet and triplet pair functions provided considerable reduction in com-
putational work when they were first used in various electron-pair-based models (see
Ref. 4 and references therein). These early calculations predated vector supercom-
puters, and the main emphasis in obtaining computational efficiency was placed on
reducing the number of operations performed. But while spin-coupled pairs provide
the fewest independent double excitation amplitudes, they generate a number of side-
effects. First, as we have noted, configurations in which orbital indices coincide are
treated as special cases, although this can be avoided in part by renormalizing the
amplitudes of configurations with index coincidences. Second, the sets of amplitudes
are defined by restricted summations over $a \ge b$ or $a > b$ for singlets and triplets,
respectively. That is, the independent amplitudes form triangular matrices for each
pair. These issues of index coincidences and triangular matrices are precisely those
that bedevil efficient vector implementation of algorithms, since they engender either
conditional constructs in loops, or squaring of triangular arrays with a concomitant
increase in the number of operations. We can finesse these issues by thinking again
about the spin coupling.
We define orbital excitation operators as
Hartree-Fock orbitals are used; the Fock matrix is assumed to be diagonal. The
CCSD method itself is independent of whether canonical orbitals or some other choice
of Hartree-Fock orbitals, like localized orbitals, is used. However, if another choice
is used, the expressions programmed must include the necessary nondiagonal ele-
ments of the Fock matrix. Programs based on expressions that contain only diagonal
elements of the Fock matrix are correct only for canonical Hartree-Fock orbitals.
The efficient CCSD implementations we have alluded to allow us to evaluate the LHS
of Eq. 3.5, for a given estimate of the amplitudes, as rapidly as possible. Of course,
for any given estimate of the amplitudes, this expression will not evaluate to zero,
and we would then like to use the available information to improve the estimates of
the amplitudes, ultimately converging on the CCSD solution. One way to proceed is
suggested immediately by the perturbation theory analysis of Sec. 2.5. We assume
we have an estimate of the amplitudes t and wish to determine a correction $\delta t$ to
them. Perturbation theory [2] would yield
and
$$(\epsilon_i + \epsilon_j - \epsilon_a - \epsilon_b)\,\delta t_{ij}^{ab} = G(i,j,a,b) \qquad (3.18)$$
in lowest order. Here the arrays G represent the RHS of the amplitude equations
evaluated with the current estimate of the amplitudes t. The amplitude corrections
are obtained by dividing through by the energy denominators. We can proceed iteratively,
defining $t^{[n+1]} = t^{[n]} + \delta t^{[n]}$, where we have used n to index the iterations
and
(3.19)
for the doubles equation, for example. Eventually, assuming the procedure converges,
the arrays $G^{[n]}$ will tend to zero. However, while this strategy may be useful for
obtaining low-order perturbation energies (through fourth order, say), the conver-
gence of such an approach is likely to be unacceptably slow. An analogy with the
direct CI method is appropriate - this was originally formulated using perturbation
theory to extract the lowest eigenvalue of the Hamiltonian, but for practical use a
variation-perturbation scheme (or a Davidson-type iterative scheme) is required [21].
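The simple iteration just described can be sketched on a toy problem. This is only an illustration under stated assumptions: the "amplitude equations" are a small hypothetical model, d_k t_k = v_k + sum_l B_kl t_l + c t_k^2, in which d_k plays the role of the energy denominator and v_k that of the integral driving the first-order amplitude; the names `residual` and `jacobi_solve` and all numerical values are invented for the example.

```python
import math

def residual(t, v, B, d, c):
    """Toy residual G_k = v_k + sum_l B_kl t_l + c t_k^2 - d_k t_k."""
    n = len(t)
    return [v[k] + sum(B[k][l] * t[l] for l in range(n)) + c * t[k] ** 2 - d[k] * t[k]
            for k in range(n)]

def jacobi_solve(v, B, d, c, tol=1e-10, max_iter=200):
    """Iterate t <- t + G/d until the residual norm falls below tol."""
    t = [vk / dk for vk, dk in zip(v, d)]          # first-order guess, cf. Eq. 2.55
    for iteration in range(1, max_iter + 1):
        G = residual(t, v, B, d, c)
        if math.sqrt(sum(g * g for g in G)) < tol:
            return t, iteration
        # Amplitude correction: divide the residual by the "energy denominator".
        t = [tk + gk / dk for tk, gk, dk in zip(t, G, d)]
    raise RuntimeError("Jacobi iteration did not converge")

# Hypothetical model data: denominators d_k < 0, driving terms v_k, weak couplings B.
d = [-2.0, -2.2, -2.5, -3.0]
v = [0.10, 0.08, 0.06, 0.05]
B = [[0.0 if k == l else 0.2 for l in range(4)] for k in range(4)]
t, n_iter = jacobi_solve(v, B, d, c=0.1)
print(n_iter, t)
```

In this model the convergence rate is governed by the size of the couplings relative to the denominators, which is why small denominators or strong couplings make the plain iteration slow.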
For eigenvalue problems and for linear equations a combination of iterative methods,
to generate trial vectors, and a direct solution of the problem projected onto the
space of these vectors, has proved very successful. Two related schemes have been
used with success for solving the CCD and CCSD equations. The basic idea is that
rather than simply iterating, once a certain number of iterated amplitude vectors are
available they are used to form a new guess at the solution. For instance, assume we
have m sets of amplitude vectors $t^{[n]}$ and "residual vectors" $G^{[n]}$, where n runs from
one to m. We wish to represent the solution to the CC equations as well as possible as
the linear combination $\sum_n t^{[n]} c_n$. Using Pulay's DIIS (direct inversion in the iterative
subspace) approach to nonlinear optimization, we can regard the optimum coefficients
as those that minimize the residuals (in a mean-squared sense). That is, we determine
those $c_n$ that minimize the squared norm $|\sum_n c_n G^{[n]}|^2$, subject to the condition
that $\sum_n c_n = 1$. This requires solution of a system of m + 1 equations, which requires
trivial computational effort compared to constructing the residual vectors themselves.
Given the optimum coefficient values, a new amplitude vector can be constructed and
(assuming the residuals are not zero to within the desired threshold) more iterations
can be performed.
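A minimal sketch of the DIIS step follows, again on a hypothetical toy model rather than the real CC equations. Two choices here are assumptions made for the illustration: the stored trial vectors are the Jacobi-updated amplitudes t + G/d, and only the last three vectors are kept. The constrained minimization is written as the usual (m + 1)-dimensional linear system with a Lagrange multiplier, solved by a small hand-written Gaussian elimination so that the example needs no external libraries.

```python
import math

def residual(t, v, B, d, c):
    """Toy residual G_k = v_k + sum_l B_kl t_l + c t_k^2 - d_k t_k."""
    n = len(t)
    return [v[k] + sum(B[k][l] * t[l] for l in range(n)) + c * t[k] ** 2 - d[k] * t[k]
            for k in range(n)]

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def diis_coefficients(residuals):
    """Minimize |sum_n c_n G_n|^2 subject to sum_n c_n = 1."""
    m = len(residuals)
    A = [[sum(gi * gj for gi, gj in zip(residuals[i], residuals[j])) for j in range(m)]
         + [1.0] for i in range(m)]
    A.append([1.0] * m + [0.0])                    # constraint row
    return solve_linear(A, [0.0] * m + [1.0])[:m]  # drop the Lagrange multiplier

def diis_solve(v, B, d, c, max_vecs=3, tol=1e-10, max_iter=100):
    t = [vk / dk for vk, dk in zip(v, d)]
    trial, resid = [], []
    for iteration in range(1, max_iter + 1):
        G = residual(t, v, B, d, c)
        if math.sqrt(sum(g * g for g in G)) < tol:
            return t, iteration
        trial.append([tk + gk / dk for tk, gk, dk in zip(t, G, d)])  # Jacobi-updated vector
        resid.append(G)
        trial, resid = trial[-max_vecs:], resid[-max_vecs:]
        coef = diis_coefficients(resid)
        t = [sum(cn * tn[k] for cn, tn in zip(coef, trial)) for k in range(len(t))]
    raise RuntimeError("DIIS did not converge")

# Hypothetical model data, chosen only so that the plain iteration converges slowly.
d = [-2.0, -2.2, -2.5, -3.0]
v = [0.10, 0.08, 0.06, 0.05]
B = [[0.0 if k == l else 0.2 for l in range(4)] for k in range(4)]
t_diis, n_diis = diis_solve(v, B, d, c=0.1)
print(n_diis, t_diis)
```

The cost of `diis_coefficients` depends only on the number of stored vectors, which mirrors the remark above that the small linear problem is trivial next to the construction of the residuals.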
Purvis and Bartlett [26] devised a somewhat different approach to solving the
CC equations, but again it involves performing a number of simple iterations and
then solving a small-dimension problem. The latter is a linearized problem in their
approach, which they term a combination of "Jacobi iterations" (the simple iteration
steps) plus a "reduced linear equation" (RLE) step (the small dimension problem).
Typically, these authors use the RLE step every five Jacobi iterations, although it
might be used more frequently in difficult cases.
The author's experience is that DIIS-based methods show superior convergence
to the RLE/Jacobi scheme, although it should be said that this has generally involved
calculations in which the convergence criteria (e.g., changes in amplitudes from iter-
ation to iteration, or norms of residual vectors) are very demanding. Thresholds that
effectively mean convergence of the energy to $10^{-10}$ $E_h$ require about 25 evaluations of
residual vectors using the DIIS approach, and possibly 30 or more with RLE. Conver-
gence of the energy with DIIS tends to be monotonic from above (although of course
there is no variation principle to guarantee this), while with the RLE nonmonotonic
behaviour can be observed. For comparison, a CISD calculation would probably re-
quire about 16 to 20 Davidson-type iterations to converge the total energy to a similar
threshold. If less precision is required, fewer iterations are required, and the difference
between the convergence rates of DIIS and RLE seems to decrease.
As we have demonstrated, the CCSD method provides an exact solution in the case
of noninteracting two-electron systems, but it is obviously not exact for real, many-
electron molecules. Nevertheless, we may expect that where electron correlation is
dominated by pair effects, CCSD should yield good results. This is quantified elsewhere
at this school; here we mention only that the CCSD model should recover
90-95% of the exact correlation energy (i.e., the full CI limit in a given basis) in situations
where the closed-shell Hartree-Fock configuration dominates the wave function.
In other words, where nondynamical correlation, like near-degeneracy effects or other
$$\mathcal{T}_1 = \frac{\|t_1\|}{\sqrt{N}}, \qquad (3.20)$$
empirical comparison, for a variety of systems, Lee and Taylor suggested that $\mathcal{T}_1$ values
larger than 0.02 indicated an increasing importance of nondynamical correlation and
an increasing unreliability of CCSD.
It is important to note several points in connection with the $\mathcal{T}_1$ diagnostic.
First, it was not suggested as an alternative to examining individual amplitudes,
but as a complement to it, an aim that has been perversely misunderstood and
misreported in the literature. Second, it is crucial to "normalize" such a quantity
so that the final value is independent of the number of electrons. Examining unscaled
norms of excitation amplitudes, or norms of perturbed wave functions, is not useful in
this sense. It should be obvious for the case of noninteracting two-electron systems,
for example, that as the number of systems increases the norm of the individual
perturbed wave functions increase, since the number of contributions depends on the
number of electrons. Yet CCSD remains exact for this case, and (assuming each two-
electron system is well described at the Hartree-Fock level) there are no nondynamical
correlation effects. Thus we must remove any dependence on the number of electrons
before comparing norm-derived quantities.
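The need for normalization can be illustrated numerically. The sketch below assumes the diagnostic is the Euclidean norm of the singles amplitudes divided by the square root of the number of correlated electrons; the amplitude values and the function name `t1_diagnostic` are hypothetical.

```python
import math

def t1_diagnostic(t1, n_correlated_electrons):
    """Norm of the singles amplitudes scaled by sqrt(number of correlated electrons)."""
    norm = math.sqrt(sum(x * x for x in t1))
    return norm / math.sqrt(n_correlated_electrons)

# Hypothetical singles amplitudes for one two-electron system.
t1_single = [0.010, -0.020]

for copies in (1, 10, 100):
    t1_all = t1_single * copies            # noninteracting copies: amplitudes simply repeat
    raw_norm = math.sqrt(sum(x * x for x in t1_all))
    print(copies, raw_norm, t1_diagnostic(t1_all, 2 * copies))
```

The raw norm grows as the square root of the number of copies, whereas the scaled quantity stays fixed, which is the behaviour argued for in the text.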
What should one do when $\mathcal{T}_1$ is larger than 0.02? Within the CC framework,
it will be necessary to include higher excitations, procedures for which are discussed
in Chapter 4. Even the simplest (reliable) corrections for connected triple excitations
increase the radius of convergence of the method substantially. No threshold value has
been derived, but the author has seen good agreement between such treatments and
MCSCF/MRCI methods when $\mathcal{T}_1$ has been as large as 0.04. The alternative is to turn
to methods more suited to open-shell systems and to the treatment of nondynamical
correlation: such methods are discussed in Chapter 5. It is appropriate to point out
here that the threshold of 0.02 was derived for closed-shell CCSD. In particular, when
a UHF reference function is used, an analogous formula for $\mathcal{T}_1$ routinely yields much
larger values than are seen in the closed-shell case. Jayatilaka and Lee [29] suggest a
modified diagnostic for UHF-based CCSD.
The development of the CC equations, and then practical formulation of the CCD and
CCSD equations for realistic calculations, has involved a variety of research groups.
It may be of interest to the reader to summarize some of the history. We concentrate
here on full CC methods - approximate treatments and their genealogy are described
in Chapter 6.
The foundations of the entire CC approach, at least in the context of quantum
chemistry, were laid by Čížek [10], although his original diagrammatic presentation
of the CCD equations demands a mathematical sophistication of the reader that
few practicing quantum chemists possess and few can devote the effort to acquiring.
His later review article [13] expands on several aspects of CC theory, and also
shows how diagrammatic methods can be used to generate a spin-adapted form of the
equations. Later, Čížek and Paldus [19] rederived the "coupled-pair many-electron
theory" (CPMET) equations, as they referred to what we now term CCD, in terms
of determinants and algebraic expressions, although there is not much evidence that
their reformulation was found more palatable. In collaboration with Shavitt, Paldus
and Čížek [30] had also performed some all-electron calculations on the model system
BH3 (their earlier calculations had used relatively crude models like the π-electron
Hamiltonian and PPP approximations). For BH3 they could compare with full CI
calculations; in addition to CPMET they also included the effects of single excitations
and, in an approximate way, connected triple excitations. The agreement with full
CI was very good, but since a minimal basis set was used only a very small fraction
of the correlation energy was recovered.
In the mid-1970s there was a substantial increase in interest in CC models.
Hurley [4] derived the spin-orbital CCD equations using purely determinantal methods
in a form which seems much simpler than that of Paldus and Čížek. He also gave
the spin-orbital equations in terms of the nonorthogonal "pair-natural orbitals" that
were then very popular as a way of substantially reducing the length of CI expansions.
Taylor and co-workers [20] gave expressions for the spin-adapted CCD equations in
terms of spin-coupled pairs and pair-natural orbitals. Harris [11] gave a rather general
derivation of CC-based methods for estimating excitation energies; Monkhorst [31]
presented an elaborate response theory for CC molecular properties. Finally, Paldus
and co-workers rederived spin-adapted CCD equations [32], and gave expressions for
excitation and ionization energies from CC ground-state wave functions [33].
At the same time, large-scale (or, at least, fairly realistic) CC calculations
were being performed with a variety of different computer implementations. Taylor
and co-workers used their spin-adapted pair-natural orbital formulation [34,35], Pople
and co-workers [36] implemented the spin-orbital CCD equations exactly as given by
Hurley [4], while Bartlett and Purvis [2] proceeded from a diagrammatic approach and
MBPT to spin-orbital CCD equations (apparently [18] with some reduction in effort
for closed-shell systems, compared to the UHF-based case, but not the full economy
of a spin-adapted method). Saunders and co-workers [37] developed a spin-adapted
CCD code based on Meyer's "self-consistent electron pairs" [38] formulation of CID.
After this burst of effort, the field became rather quiet in the early 1980s. The
main achievement in the computational arena was the derivation and implementation
of a practical CCSD method by Purvis and Bartlett [18]; Chiles and Dykstra [39] also
developed a CCD code based on self-consistent electron pairs. Bartlett and co-workers
also began to look at aspects of including higher than double excitations in CC treat-
ments [40]. In the latter part of the 1980s, however, there was an explosion of interest
in CC methods. Schaefer's group in Berkeley developed a rather efficient formulation
of the CCSD equations [23], and in subsequent work Lee and Rice [24], and Scuseria,
Janssen and Schaefer [25] described particularly efficient computer codes for CCSD
calculations. Contemporaneously, Bartlett and co-workers and Raghavachari and co-
workers had begun to look at methods for including the effects of higher than double
connected excitations, as described in the next chapter. As a result of the availability
of efficient CC codes like TITAN [41] and ACES II [42], and the high accuracy achievable
with coupled-cluster methods (described fully elsewhere at this school), CC methods
have moved from fringe to mainstream quantum chemistry in less than ten years.
Chapter 4
Higher excitations
We shall see elsewhere at this school that the CCSD approach provides results of
semiquantitative accuracy for a variety of molecules, at least in the absence of serious
nondynamical correlation effects. Indeed, there are good formal reasons to support
the view that CCSD is the most complete treatment of electron correlation within the
domain of methods that include at most connected double excitations. Nevertheless,
the CCSD method is not complete for many-electron systems (except for the case of
noninteracting two-electron systems), and it is reasonable to ask how the exponential
ansatz converges with respect to excitation level, and what methods can be used to
include higher connected excitations.
From the point of view of introducing the least additional complication, the ob-
vious step beyond CCSD is the inclusion in some form of connected triple excitations.
More support for this view can be obtained from perturbation theory: as we have
seen, for Hartree-Fock-based perturbation theory only connected doubles contribute
to the first-order wave function, and thus to the second- and third-order energies.
The fourth-order energy includes contributions also from single excitations and dis-
connected quadruples, contributions that are included in the CCSD approach. In ad-
dition, however, the fourth-order energy includes contributions from connected triple
excitations: the full Møller-Plesset fourth-order treatment is denoted MP4(SDTQ).
In a perturbational sense, therefore, CCSD is in error in fourth order, although of
course many contributions are included to infinite order. We thus anticipate that any
attempt to improve on the CCSD approach should concentrate first on connected
triple excitations.
Probably the most obvious strategy to improving the CCSD approach is to truncate T
in the exponential ansatz at a later stage. By including all terms in $T_1$, $T_2$, and $T_3$, for
example, we would have a CCSDT method [40,43]. In terms of the operator formulation
of the CC equations, the defining equations for the $T_1$, $T_2$, and $T_3$ amplitudes
would respectively be
and
$$= WT_2 + WT_3 + \tfrac{1}{2}WT_2^2 + WT_1T_2 + WT_1T_3 + WT_2T_3 + \tfrac{1}{2}WT_1^2T_2 + \tfrac{1}{2}WT_1T_2^2 + \tfrac{1}{2}WT_1^2T_3 + \tfrac{1}{3!}WT_1^3T_2. \qquad (4.3)$$
We may observe that this is a significantly more difficult problem than CCSD [14,44].
The dimension of the CC equations is much larger, since the number of (connected)
triples scales as $N_o^3 N_v^3$ (for $N_o$ occupied and $N_v$ virtual orbitals). Solving the CC equations requires an overall computational
effort proportional to $N_o^3 N_v^5$ and $N_o^4 N_v^4$, which is commonly referred to as an "$N^8$"
dependence. And since iterative methods must be used, this effort is required in each
iteration. (We may note in passing, however, that the nonlinearity in the CCSDT
equations is no different from the CCSD equations: all CC equation systems are at
worst quartic in the unknown amplitudes.) Thus the CCSDT method is generally
too expensive for use in production calculations, although it has been used very
successfully to calibrate other methods for treating the triples contribution [14,45,46].
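The growth in cost can be made concrete with a few lines of arithmetic. This is a rough sketch only: it assumes a spin-orbital formulation, takes the leading operation counts to be the products of three occupied with five virtual dimensions and of four occupied with four virtual dimensions, and omits all prefactors; the function names are hypothetical.

```python
from math import comb

def n_doubles(n_occ, n_virt):
    """Number of unique spin-orbital doubles amplitudes (I > J, A > B)."""
    return comb(n_occ, 2) * comb(n_virt, 2)

def n_triples(n_occ, n_virt):
    """Number of unique spin-orbital triples amplitudes (I > J > K, A > B > C)."""
    return comb(n_occ, 3) * comb(n_virt, 3)

def ccsdt_step_cost(n_occ, n_virt):
    """Leading operation-count estimate per CCSDT iteration, prefactors omitted."""
    return n_occ**3 * n_virt**5 + n_occ**4 * n_virt**4

# Doubling both orbital spaces multiplies the estimated work by 2**8.
small = ccsdt_step_cost(10, 50)
large = ccsdt_step_cost(20, 100)
print(n_doubles(10, 50), n_triples(10, 50), large // small)
```

Storage of the triples amplitudes grows as the sixth power of the system size in the same way, which is a further practical obstacle beyond the operation count.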
Within the general spirit of the CCSDT equations (that is, iteratively solving
for the amplitudes of single, double and triple excitations), Bartlett and co-workers
have derived a hierarchy of methods by successively approximating terms (see, e.g.,
Ref. 14 and references therein). These methods are denoted CCSDT-n, where higher
values of n indicate fewer approximations to the CCSDT equations. Specifically, the
CCSDT-4 method results from dropping nonlinear terms involving $T_3$ on the RHS of
Eq. 4.3. Dropping all terms in $T_3$ from the RHS of this equation yields the CCSDT-3
method. The CCSDT-2 method is obtained by dropping all terms in $T_1$ and $T_3$ from
the RHS. If the nonlinear $T_2^2$ terms are also neglected, we have the CCSDT-1b method.
Note that in all approximations so far, the RHS of Eqs. 4.1 and 4.2 have been left
unaltered. The simplest approximate CCSDT method, denoted CCSDT-1a, results
from CCSDT-1b by dropping the nonlinear term $T_1T_3$ from the RHS of Eq. 4.2. All of
these methods are iterative, and, except for CCSDT-4, which remains an $N^8$ method,
all behave as $N^7$. They are thus all rather expensive in practice, and have not been
used very much. In fact, comparisons show that they gain little relative to the simpler
treatments we are about to examine, but require more computational effort.
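The hierarchy described in this paragraph is just a set of selection rules on the terms of the triples equation, which can be written down compactly. The sketch below assumes the RHS of the triples equation consists of the ten connected terms listed in `TRIPLES_RHS_TERMS`, each labelled by the powers of T1, T2 and T3 it contains; the dictionary, the rule table and the function name are constructions for this illustration. CCSDT-1a is omitted because it differs from CCSDT-1b only in the doubles equation.

```python
# Terms on the RHS of the triples equation, as powers (p1, p2, p3) of T1, T2, T3.
TRIPLES_RHS_TERMS = {
    "W T2": (0, 1, 0),
    "W T3": (0, 0, 1),
    "1/2 W T2^2": (0, 2, 0),
    "W T1 T2": (1, 1, 0),
    "W T1 T3": (1, 0, 1),
    "W T2 T3": (0, 1, 1),
    "1/2 W T1^2 T2": (2, 1, 0),
    "1/2 W T1 T2^2": (1, 2, 0),
    "1/2 W T1^2 T3": (2, 0, 1),
    "1/3! W T1^3 T2": (3, 1, 0),
}

# Selection rules following the verbal definitions of the CCSDT-n methods.
RULES = {
    "CCSDT":    lambda p1, p2, p3: True,
    "CCSDT-4":  lambda p1, p2, p3: p3 == 0 or (p1 + p2 + p3) == 1,  # no nonlinear T3 terms
    "CCSDT-3":  lambda p1, p2, p3: p3 == 0,                          # no T3 terms at all
    "CCSDT-2":  lambda p1, p2, p3: p1 == 0 and p3 == 0,              # no T1 or T3 terms
    "CCSDT-1b": lambda p1, p2, p3: p1 == 0 and p3 == 0 and p2 == 1,  # only the linear T2 term
}

def retained_terms(method):
    """List the triples-equation terms kept by the given approximation."""
    keep = RULES[method]
    return [name for name, powers in TRIPLES_RHS_TERMS.items() if keep(*powers)]

for method in RULES:
    print(method, retained_terms(method))
```

Written this way, it is easy to see that each step down the hierarchy removes terms without adding any, so the methods form a nested sequence.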
An alternative strategy for handling the triple excitations is provided by the close
connections between CC and perturbation theory that we have already discussed.
For example, a very simple approach would be based on the view that to make CC
"correct through fourth order" (in the sense of perturbation theory), we could simply
compute the fourth-order Møller-Plesset perturbation theory contribution from
connected triples, and add it to the CCSD result [47]. Such a procedure would be
denoted CCSD+T(4), with the obvious indication of the fourth-order origin of the
triples contribution. The T(4) contribution can be computed with an effort that behaves
as $N^7$, and involves no iteration, of course. Let us examine this fourth-order
contribution more closely. In terms of spin-orbitals, it is given by
$$T(4) = \sum_{I>J>K}\sum_{A>B>C} \left(D_{IJK}^{ABC}\right)^{-1} \left|W_{IJK}^{ABC}\right|^2, \qquad (4.4)$$
where
(4.5)
and
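$$W_{IJK}^{ABC} = \frac{1}{4}\,\mathcal{P}_{IJK}^{ABC}\left\{\sum_D \langle BC\|DK\rangle\, t_{IJ}^{AD(1)} - \sum_L \langle LC\|JK\rangle\, t_{IL}^{AB(1)}\right\}, \qquad (4.6)$$
with
$$\mathcal{P}_{IJK}^{ABC} = \mathcal{P}_{IJK}\,\mathcal{P}^{ABC}, \qquad (4.7)$$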
where, for example,
In Eq. 4.6 we have explicitly used the first-order estimates of the doubles amplitudes
$t_{IJ}^{AB(1)}$, which are given by Eq. 2.55, rather than writing the energy contribution
down as a product of integrals divided by orbital energies. The advantage of this
form is that it immediately suggests an improvement to the correction T(4): instead
of using the first-order amplitudes, why not use the converged CC doubles amplitudes [48]?
These should better reflect the exact values of the doubles amplitudes,
since CC includes terms to infinite order. Thus a correction T(CCSD) can be defined
by using Eqs. 4.4 and 4.6, but using the converged CCSD amplitudes $t_{IJ}^{AB}$. The
approach so obtained is termed CCSD+T(CCSD). It has received some use, but an-
other improvement is still possible, without significantly increasing the computational
effort.
The final improvement to these perturbational triples estimates is the observation
that there is a term in fifth-order, involving singles amplitudes, of the form [49]
$$\sum_{I>J>K}\sum_{A>B>C} \left(D_{IJK}^{ABC}\right)^{-1} V_{IJK}^{ABC}\, W_{IJK}^{ABC}, \qquad (4.9)$$
where
(4.10)
The use of the combined corrections of Eqs. 4.4 and 4.9 gives the method denoted
CCSD(T). This is probably the most commonly used triples correction, and is em-
pirically observed to be the best behaved. The justification for its use is largely one
of experience - one could legitimately argue that there are many fifth-order energy
contributions that are not being included, so why include just Eq. 4.9? The practical
answer is related to the behaviour of the CCSD method itself, and, in particular, to
the importance of single excitation amplitudes (cf. our discussion of the $\mathcal{T}_1$ diagnostic
in Sec. 3.4) as indicators of when nondynamical correlation effects become large. In
such situations (i.e., $\mathcal{T}_1$ large), the fifth-order contribution of Eq. 4.9 will be large.
Experience, again, shows that this term is usually positive (although there is no formal
reason why this must be so), and hence acts to "damp" the effects of T(CCSD).
Since the latter can become a serious overestimate (as T(4) itself would be) in such
cases, the damping effect is very helpful. As we shall see, it is probably fair to re-
gard CCSD(T) as the best single-reference correlation treatment that is inexpensive
enough to be widely applicable. We shall have more to say about CCSD(T) elsewhere
at this school; as we noted above, several extensive comparisons of different
approximate methods for including triple excitations may be found in the literature.
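The way the two contributions combine, and the damping effect just described, can be illustrated with a toy calculation. The sketch assumes that for each unique triple excitation we already hold a doubles-driven quantity W, a singles-driven quantity V and a denominator D, and that D is negative (occupied minus virtual orbital energies); the numbers are invented and the function name is hypothetical.

```python
def triples_corrections(W, V, D):
    """Return (doubles-driven term, singles-driven term, their sum) from per-triple lists."""
    e_t = sum(w * w / d for w, d in zip(W, D))           # form of Eq. 4.4
    e_st = sum(v * w / d for v, w, d in zip(V, W, D))    # form of Eq. 4.9
    return e_t, e_st, e_t + e_st

# Hypothetical values for three unique triples; V opposes W, as is typical in practice.
W = [0.020, -0.010, 0.015]
V = [-0.002, 0.001, -0.001]
D = [-1.5, -2.0, -2.5]

e_t, e_st, e_paren_t = triples_corrections(W, V, D)
print(e_t, e_st, e_paren_t)
```

With these inputs the first term is negative and the second is positive, so the combined correction is smaller in magnitude than the first term alone.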
So far we have devoted all of our attention in going beyond the CCSD model to the
inclusion of connected triple excitations. Our original justification for this was the
observation that CCSD is in error in fourth order of perturbation theory, and this
entire error arose from connected triples. But what of higher excitations? Connected
quadruples, for example, contribute in the fifth order of perturbation theory [43]. For
completeness, we give here the full CCSDTQ equations in operator form [17]:
It should be clear that if the CCSDT model, with its $N^8$ computational dependence, is
too expensive for general use, the CCSDTQ model, which would have an $N^{10}$ dependence,
will be even less feasible. And the development of noniterative approximations
along the lines of the CCSD+T or CCSD(T) methods only reduces the dependence
to $N^9$, at least initially. Several authors, however, have pointed out that a number
of the fifth-order terms, at least, can be computed with much less effort [17,50]. We
shall not discuss the details here, but the overall approach is to substitute the RHS
of the operator equation that defines the amplitudes in T4 into the equations for, say,
double excitations. For example, through fifth order we have the modified doubles
equation
$$W + WT_1 + WT_2 + WT_3 + \tfrac{1}{2}WT_2^2 + WT_1T_2$$
where we have explicitly indicated that the term in square brackets, like the other
terms, is restricted to connected contributions. It is this last term that arises from
the fifth-order energy contribution of connected quadruples. The reader will note
that several terms are absent from the usual CCSD or CCSDT doubles equation, as
a result of truncating at fifth order.
It is essential to note that, despite the product forms like TdWT3 that appear
in Eq. 4.15, this is not a disconnected cluster contribution. Rather, it is analogous
to making a perturbation theory estimate of the connected quadruples amplitudes
from the doubles equation and using this to correct the original energy. This leads
to methods denoted CCSD(TQ) [17] - a quadruples correction to CCSDT could
be formulated as CCSDT(Q), but its applicability would be limited. The cost of
these "perturbational" quadruples corrections is at worst N: N:, and many terms are
only $N^6$ overall.
My own, very limited, experience with these corrections has not been especially
encouraging. For systems in which Hartree-Fock is a good approximation, the
CCSD(T) results are usually already very good, and additional corrections are as
likely to make things worse as to make them better. On the other hand, if Hartree-
Fock is not a good approximation, it is not obvious that low-order estimates of con-
nected higher excitations will be useful anyway. Indeed, we have already noted that
CCSD+T(CCSD), or, worse, CCSD+T(4), is not reliable under these circumstances.
CCSD(T), which behaves better, includes not only the infinite-order effects in the
singles and doubles space, but also fifth-order effects involving singles and triples.
One might suspect, then, that lowest-order perturbation theory treatments of con-
nected higher excitations will not work well when nondynamical correlation becomes
important. It is probably preferable to treat the inadequacies in the Hartree-Fock
model more directly, say by using multireference methods.
Chapter 5
We shall concentrate in this chapter on methods that explicitly handle open-shell sys-
tems. However, there have been many suggestions that treat such systems implicitly.
For example, equations-of-motion (EOM) or propagator methods can be formulated
to operate on ground-state wave functions to directly calculate ionization potentials,
electron affinities, or excitation energies [31,33,51,52]. If the ground state is well
described by Hartree-Fock plus CCSD, for instance, these EOM methods can yield
good results with little computational effort beyond the ground-state wave function
itself. The interested reader can find many examples in the literature. Of course,
such methods do not address all of the systems that would otherwise require open-shell
or even multireference methods. We shall now focus on methods that set out
to treat open-shell systems by explicit calculation. There have been a number of
efforts to generalize CC methods to spin-adapted open-shell treatments, and even
to multireference treatments. Much of what has been published is mathematical in
nature and not directly related to numerical computation. We will concentrate in this
chapter on methods that have been devised with the aim of efficient computational
implementation in mind. Our survey is necessarily qualitative, since the open-shell
and multireference methods lead to very complicated equations and exploring the
mathematics is beyond the scope of this course.
is important. The major advantage of the UHF-based methods is their great sim-
plicity: they are very easily programmed compared to some of the more elaborate
schemes we will discuss in this chapter. The disadvantage is the computational effort
required in the calculations. As we saw in Chapter 3, a closed-shell system treated
using the UHF-based formalism requires about three times more computer time than
it would with a spin-adapted closed-shell treatment. We should reiterate that this is
true even when the equivalence between the alpha and beta spin-orbital spaces in the
closed-shell case is used to reduce the number of integrals that must be processed,
since as we saw in Chapter 3 there are up to six possible determinants from a double
excitation of spatial orbitals, but only two spin eigenfunctions. Hence the number of
independent parameters in the closed-shell case is one-third that of the UHF case.
A factor of three in computational effort is a considerable increase, and it certainly
seems desirable to explore ways of reducing it.
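The determinant and spin-eigenfunction counts quoted above can be checked by brute force. The following Python sketch assumes a double excitation that leaves four distinct spatial orbitals (i, j, a, b) singly occupied; the function names are illustrative only and do not come from any particular program.

```python
from itertools import product
from math import comb

def ms0_determinants(n_open):
    """Count spin assignments of n_open singly occupied orbitals with M_s = 0."""
    return sum(1 for spins in product((+1, -1), repeat=n_open) if sum(spins) == 0)

def spin_eigenfunctions(n_open, two_s):
    """Number of independent spin eigenfunctions with total spin S = two_s / 2
    for n_open singly occupied orbitals (standard branching-diagram formula)."""
    k = (n_open - two_s) // 2          # number of spin-down electrons at M_s = S
    return comb(n_open, k) - (comb(n_open, k - 1) if k > 0 else 0)

n_dets = ms0_determinants(4)           # orbitals i, j, a, b each singly occupied
n_singlets = spin_eigenfunctions(4, 0)
print(n_dets, n_singlets, n_dets // n_singlets)
```

The ratio of the two counts is the factor of three discussed in the text for the most general class of double excitations.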
What is the problem with simply writing down a set of CC equations based on a
restricted open-shell Hartree-Fock (RHF) wave function as the reference? Recall that
in Sec. 2.3, we employed a Hausdorff expansion of the operator exp( -T)H exp(T).
This Hausdorff expansion terminated after four commutators (that is, after five terms)
because of the commutation relation of Eq. 2.23. That relation depends on a division
of the MO space into orbitals that are occupied in the reference determinant and
those that are empty. For an open-shell case this simple division is not possible when
we use only spatial orbital indices, since the open-shell orbitals are only partially
occupied. (In the spin-orbital case, of course, all indices can be uniquely identified
as occupied or virtual.) Excitations both to and from these orbitals, such as $X_t^\dagger X_i$
and $X_a^\dagger X_t$, where t denotes an open-shell orbital, are possible. And this is what
creates the problem, since it means that the orbital excitation operators of Eq. 3.13
do not commute when they involve open-shell indices. For example,
(5.1)
In Sec. 2.3 we used the fact that the spin-orbital excitation operators (there expressed
explicitly in terms of creation and annihilation operators) commute to demonstrate
that the Hausdorff expansion terminated after five terms (four commutators). This
is also true for the closed-shell case as well as the UHF case, because there is no
ambiguity defining occupied and virtual spaces. The existence of a partially occupied
space allows both creation and annihilation involving this space. The excitation
operators do not commute, and hence the termination of the Hausdorff expansion
after five terms is lost. In fact, for T truncated to only single and double excitations,
the Hausdorff expansion terminates only after eight commutators. Thus the spin-adapted
RHF-based CCSD equations [53] are considerably more complicated than
their closed-shell brethren, and, incidentally, are much more nonlinear, containing up
to Tf and n.
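The effect of non-commuting excitation operators on the termination of the Hausdorff expansion can be illustrated numerically. The sketch below is a toy model, not the spin-adapted formalism of Ref. 53: it assumes six spin-orbitals (two labelled i, two labelled t, two labelled a), a random one- plus two-body operator standing in for the Hamiltonian, and two hypothetical cluster operators. In the first, t is treated as virtual so that all excitations commute; in the second, single excitations both into and out of t are allowed. All names and amplitudes are invented for the illustration.

```python
import itertools
import numpy as np

def annihilators(n):
    """Jordan-Wigner annihilation matrices for n spin-orbitals (Fock space, dim 2**n)."""
    eye, z = np.eye(2), np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps occupied |1> to empty |0>
    ops = []
    for p in range(n):
        mat = np.eye(1)
        for factor in [z] * p + [lower] + [eye] * (n - p - 1):
            mat = np.kron(mat, factor)
        ops.append(mat)
    return ops

def random_two_body_operator(a, rng):
    """Generic one- plus two-body operator with random real coefficients."""
    n, ad = len(a), [x.T for x in a]
    op = np.zeros_like(a[0])
    for p, q in itertools.product(range(n), repeat=2):
        op += rng.normal() * ad[p] @ a[q]
    for p, q, r, s in itertools.product(range(n), repeat=4):
        op += rng.normal() * ad[p] @ ad[q] @ a[s] @ a[r]
    return op

def nested_commutator_norms(h, t, kmax):
    """Frobenius norms of [h,t], [[h,t],t], ... up to kmax nested commutators."""
    norms, c = [], h
    for _ in range(kmax):
        c = c @ t - t @ c
        norms.append(np.linalg.norm(c))
    return norms

rng = np.random.default_rng(0)
a = annihilators(6)                  # spin-orbitals: i = (0, 1), t = (2, 3), a = (4, 5)
ad = [x.T for x in a]
h = random_two_body_operator(a, rng)

# Case 1: clean occupied/virtual split (i occupied; t and a virtual).
t_comm = sum(0.3 * rng.normal() * ad[v] @ a[o] for o in (0, 1) for v in (2, 3, 4, 5))
t_comm = t_comm + 0.3 * ad[4] @ ad[5] @ a[1] @ a[0]      # one double excitation

# Case 2: excitations both into and out of t, as for a partially occupied orbital.
t_open = sum(0.5 * ad[t] @ a[i] + 0.5 * ad[v] @ a[t]
             for i, t, v in ((0, 2, 4), (1, 3, 5)))

norms_comm = nested_commutator_norms(h, t_comm, 5)
norms_open = nested_commutator_norms(h, t_open, 9)
print("commuting T:     5th commutator / 1st =", norms_comm[4] / norms_comm[0])
print("non-commuting T: 5th commutator / 1st =", norms_open[4] / norms_open[0])
print("non-commuting T: 9th commutator / 1st =", norms_open[8] / norms_open[0])
```

In this toy model the fifth nested commutator vanishes (to rounding error) only in the commuting case. With excitations into and out of t the series survives beyond four commutators and stops only at the ninth, which in this model follows from a simple counting of how far a two-body operator can shift electrons among the three orbital classes.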
This leads rather naturally to a suggestion that represents the most obvious
first step in developing a method based on an RHF reference, namely, to employ a
UHF-based program, but to supply an RHF determinant and RHF MOs to it [54].
Since the UHF equations are determined using spin-orbital excitations, there is no
ambiguity in the definition of the occupied and virtual spaces. The first difficulty
with this approach is that if exp(T) is simply taken from the UHF spin-orbital expressions
(presumably with T truncated at some excitation level), the CC wave function
$\exp(T)\Psi_0$ will not necessarily be a spin eigenfunction, even if $\Psi_0$ itself is an open-shell
spin eigenfunction. However, Rittby and Bartlett [54] pointed out that this is not
a problem for the CC energy. For instance, if we assume that $\exp(T)\Psi_0 = \Psi_c + \Psi'_c$,
where $\Psi_c$ is a spin eigenfunction with the desired spin, and $\Psi'_c$ comprises all the
spin-contaminant terms, the CC correlation energy is
(5.2)
since the matrix element $\langle\Psi_0|W|\Psi'_c\rangle$ must be zero. This is so because the Hamiltonian
commutes with all spin-dependent operators, and thus cannot couple functions of
different spin. Hence even if the CC wave function is not a spin eigenfunction, the
CC correlation energy is free of any spin contamination effects. Thus by using an
RHF determinant and MOs in a UHF-based CC code, we obtain an energy that is
free of spin contamination, although the CC wave function we would obtain is not
necessarily a spin eigenfunction.
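The vanishing of the spin-contaminant matrix element can be made explicit. What follows is a sketch of the standard argument, not a reproduction of Eq. 5.2. It assumes that $\Psi_0$ is an eigenfunction of $S^2$ with quantum number $S$, that the contaminant part $\Psi'_c$ is expanded in $S^2$ eigenfunctions $\Psi_{S'}$ with $S' \neq S$ (the coefficients $c_{S'}$ are notation introduced here), that $W$ commutes with $S^2$, and that the correlation energy is the projection of $W\exp(T)\Psi_0$ onto $\Psi_0$.

```latex
\begin{align*}
S(S+1)\,\langle\Psi_0|W|\Psi_{S'}\rangle
  &= \langle S^2\Psi_0|W|\Psi_{S'}\rangle
   = \langle\Psi_0|S^2 W|\Psi_{S'}\rangle
   = \langle\Psi_0|W S^2|\Psi_{S'}\rangle
   = S'(S'+1)\,\langle\Psi_0|W|\Psi_{S'}\rangle , \\
\Rightarrow\quad
\langle\Psi_0|W|\Psi_{S'}\rangle &= 0 \qquad (S' \neq S), \\
\langle\Psi_0|W\exp(T)|\Psi_0\rangle
  &= \langle\Psi_0|W|\Psi_c\rangle
   + \sum_{S'\neq S} c_{S'}\,\langle\Psi_0|W|\Psi_{S'}\rangle
   = \langle\Psi_0|W|\Psi_c\rangle .
\end{align*}
```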
The second difficulty with this approach is that it does not exploit any of the
spin symmetry properties of the open-shell reference. Individual determinants with
independent amplitudes are generated by the UHF-based operator T. Thus, in the
closed-shell case for instance, we still have three times as many terms as in
the spin-adapted equations [54,55]. Obviously, using an RHF reference
function does not reduce the work done, since the program cannot utilize the spin
symmetry. Hence we must accept that open-shell calculations with a UHF-based
program will take much longer than closed-shell calculations with about the same
number of electrons correlated. Further, we may certainly harbour suspicions that
such a program would take longer than a putative open-shell code in which the spin
symmetry is used throughout, since it should be obvious that at the very least, that
part of the code that deals exclusively with the closed-shell orbitals performs more
work than is actually necessary.
It is, perhaps, a measure of how much more difficult are restricted open-shell correla-
tion treatments than closed-shell methods, that it is only very recently that successful
open-shell perturbation theories have become available [56-59]. From our earlier dis-
cussions it should be clear that, if the perturbation theory has only recently proved
tractable, CC methodology will be barely emerging. Recent progress in this area has
been encouraging, however, and prospects undoubtedly look brighter than they did
even two years ago.
Formal work in this area (we are excluding EOM-type implicit methods) in-
cludes a full derivation of open-shell CCSD in terms of spin-adapted excitation op-
erators by Janssen and Schaefer [53]. The expressions were so complicated that
symbolic algebra programs were employed in their generation. To someone accus-
tomed to CI methodology, it may seem undesirable to insist on developing explicit
formulas for all the quantities needed. Multireference CI programs invariably include
the rules for evaluating matrix elements between prototype configurations, and no
explicit formulas for matrix elements need be programmed. This viewpoint is not
really appropriate. CC theory is not a method that is "driven" by Hamiltonian ma-
trix elements between configurations. It is true that spin-orbital CCD can be derived
in this way, but that serves only to mislead the unwary. It could be argued that
the only configurations in the CC equations are those onto which we project to ob-
tain the nonlinear equations - the matrix elements required are those of the (much
messier) operator exp(-T)H exp(T) between these configurations and Hartree-Fock.
This is certainly closer to what is done, for example, in perturbation-theoretic deriva-
tions of CC theory, than is an analysis involving matrix elements between double and
quadruple excitations. In a general open-shell or multireference CC theory we are
not constructing a list of configurations and generating matrix elements over this list,
and it is not profitable to try to apply arguments appropriate to CI (which is such a
procedure) to the CC case.
All this being said, however, it is clearly not very convenient to have an open-
shell spin-adapted theory that involves thousands of different terms, which is what
was found by Janssen and Schaefer [53]! Presumably, a symbolic manipulator that
can generate the expressions could also generate FORTRAN code for them, but this
is not likely to be very efficient. And if the goal is to improve on the performance
of the RHF/UHF-based implementations, it may well not be met. More promising
is a completely different approach, based on some redefinitions of the reference wave
function.
Jayatilaka and Lee [60] suggest the use of quite different spin-orbitals in the
reference wave function from the usual alpha and beta spin-functions. Specifically,
they suggest that rather than use the conventional $m_s = \tfrac{1}{2}$ and $m_s = -\tfrac{1}{2}$ spin
functions for open-shell orbitals, the average of these functions should be used. For
a 2N + 2 electron triplet state, for instance, we would have
(5.3)
where the absence or presence of a bar denotes alpha or beta spin, respectively, as usual, whereas
(5.4)
and the sign used in the combination is specified as a superscript. The determinant in
Eq. 5.3 is a mixture of the traditional $M_s = 1, 0, -1$ determinants ($S_z$ eigenfunctions)
with coefficients of $\tfrac{1}{2}$, $\tfrac{1}{\sqrt{2}}$, $\tfrac{1}{2}$, respectively. This result can be obtained by explicitly
expanding $\phi^{+}$ using Eq. 5.4; a general formula is given by Jayatilaka and Lee [60], who
call these "symmetric spin-orbitals". The wave functions are actually eigenfunctions
of $S^2$ and $S_x$ using these spin orbitals, as the reader can verify. Since the Hamiltonian
is independent of spin, all Ms components of the same configuration are degenerate
and noninteracting, so the averaging we have performed has no consequence for the
energy.
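The verification left to the reader can also be done numerically. The sketch below considers only the spin part of the two open-shell electrons of a triplet and assumes that the "+" symmetric spin function is the normalized sum of the alpha and beta functions; the variable names are illustrative.

```python
import numpy as np

alpha, beta = np.array([1.0, 0.0]), np.array([0.0, 1.0])
sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2)

def two_electron(op):
    """Total spin operator for two electrons built from a one-electron operator."""
    return np.kron(op, one) + np.kron(one, op)

Sx, Sy, Sz = (two_electron(s) for s in (sx, sy, sz))
S2 = Sx @ Sx + Sy @ Sy + Sz @ Sz

plus = (alpha + beta) / np.sqrt(2)        # assumed "+" symmetric spin function
state = np.kron(plus, plus)               # spin part of the two open-shell electrons

# Conventional triplet components, labelled by M_s.
triplet = {
    +1: np.kron(alpha, alpha),
     0: (np.kron(alpha, beta) + np.kron(beta, alpha)) / np.sqrt(2),
    -1: np.kron(beta, beta),
}
coefficients = {ms: float(np.dot(vec, state)) for ms, vec in triplet.items()}
is_s2_eigenfunction = np.allclose(S2 @ state, 2.0 * state)   # S(S+1) = 2 for a triplet
is_sx_eigenfunction = np.allclose(Sx @ state, 1.0 * state)
print(coefficients, is_s2_eigenfunction, is_sx_eigenfunction)
```

The overlaps with the three conventional triplet components reproduce the mixing coefficients, and the state is an eigenfunction of $S^2$ and of the x component of the total spin rather than of the z component.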
Consider now the CCSD equations. The problem for conventional open-shell
methods is the inequivalence between the spin-orbitals $\phi_t$ and $\bar{\phi}_t$, where t denotes
an open-shell orbital, since the former is in the occupied spin-orbital space while the
latter is in the virtual spin-orbital space. Using the symmetric spin-orbitals this is no
longer a problem. Jayatilaka and Lee [29] present an open-shell CCSD formulation
in which the number of independent amplitudes is significantly less than spin-orbital
formulations. For example, there are only half as many amplitudes of the general
type $t_{ij}^{ab}$ (that is, a double excitation from closed-shell orbitals to virtual orbitals) as
would be obtained in the RHF/UHF-based treatments of Rittby and Bartlett [54] or
Scuseria [55]. The final equations for the symmetric spin-orbital open-shell CCSD
model are fairly complicated, but not remotely as elaborate as those in the spin-
adapted open-shell CCSD model of Janssen and Schaefer.
We have mentioned that there are half as many $t_{ij}^{ab}$ amplitudes in the method
of Jayatilaka and Lee as in the RHF/UHF-based methods. The reader will recall
that in the closed-shell CCSD method, there are three times as many amplitudes in
the spin-orbital formulation as in a spin-adapted form. Hence, in effect, there are
$1\tfrac{1}{2}$ times as many $t_{ij}^{ab}$ amplitudes in the closed-shell part of the open-shell problem as
there would be for the closed-shell problem alone. One perspective on this is provided
by considering spatial single and double excitations that involve triple or higher
spin-orbital excitations. Thus (using conventional spin functions) the quadruple
spin-orbital excitation $\bar{i}\bar{j}tu \rightarrow \bar{t}\bar{u}ab$ would be classified as a double excitation of spatial
orbitals, since the occupation number of the spatial orbitals t and u does not change in
this excitation. In terms of symmetric spin-orbitals, such terms appear as "spin-flip"
excitations [60] like $\bar{i}\bar{j} \rightarrow ab$. One can either treat the amplitudes of these terms as
independent amplitudes, or choose other linearly independent terms, as discussed in
Ref. 60. In any event, there are half as many of these new terms as there were of the
original tij, making a total ofq times the number of closed-shell CCSD amplitudes.
Given CCSD equations based on symmetric spin-orbitals, it should be straightforward
to develop an open-shell perturbation theory. This has been discussed by Lee
and Jayatilaka [59], who also compare their approach with other recent open-shell
perturbation theories. In their open-shell CCSD paper [29] they also present a $\mathcal{T}_1$
diagnostic appropriate to the open-shell case. They show explicitly that a naive generalization
from the closed-shell $\mathcal{T}_1$ to the RHF/UHF-based methods is inappropriate
and includes some higher spin-orbital excitations. They give a consistent definition
for the open-shell case that should be more comparable with the closed-shell formula.
There is very little to be said under this heading. Of course, UHF-based methods
can be readily generalized to include higher excitations (although the computational
cost may be prohibitive), as we have already discussed in Chapter 4. It is therefore
possible to generalize the RHF/UHF-based approaches, as has been done by Scuseria [55]
and by Bartlett and co-workers [42], although there are pitfalls associated with
perturbational treatments of higher excitations. This is because there is no longer a
simple expression for matrix elements of $H_0$, that is, the perturbation energy denominators,
in terms of orbital energies. Particular choices of open-shell orbitals have
been recommended; Bartlett and co-workers also suggest a slightly different form [17]
of the triples correction, for example, that contains more fifth-order terms than the
usual (T) correction.
Including higher than double excitations into spin-adapted open-shell meth-
ods appears to be a rather difficult task, or will at least involve very complicated
expressions.
If we were to restrict our discussion here to those methods that have been programmed
explicitly, this section would be no longer than the last. We shall therefore cast a
somewhat wider net here, but we make no pretence of complete coverage of this field.
The aim is mainly to provide an introduction to this area for a reader who wishes to
explore it. An appropriate starting point is the establishment of a general taxonomy
for multireference coupled-cluster methods. We use the notation of Jeziorski and
Paldus [61]. Much of the terminology of the field is built on concepts like model
spaces and wave operators. A model space is a set of wave functions that provide an
approximate description of the system. An MCSCF calculation, for example, involves
a set of configurations that could span a particular model space: the eigenvectors of
the Hamiltonian matrix over these configurations provide approximate descriptions
(of varying quality) of different electronic states of the system. Löwdin [62] introduced
wave operators into quantum chemistry: typically, a wave operator $\mathcal{W}$ transforms an
approximate wave function $\Psi_0$ into the exact wave function $\Psi = \mathcal{W}\Psi_0$. For example,
(5.6)
where n(n} is the exact space. Such an approach is also called valence universal.
Mukherjee and co-workers have extensively investigated these methods [63], as has
Lindgren [64], and others. The main attraction of Fock space approaches is their obvi-
ous suitability for determining ionization potentials and electron affinities, since they
readily yield information about states with different numbers of electrons. However,
it can be imagined that considerable information about many molecular ion states is
required to define a single wave operator that satisfies Eq. 5.6.
A somewhat less demanding requirement is placed on the wave operator in
Hilbert space MRCC approaches. Here a single model space Ao is used to repre-
sent states of the N -electron molecule. Then the wave operator transforms these
N-electron approximate states into the exact ones:
(5.7)
A linear expansion of the wave operator here would lead to procedures very similar to
multireference CI (MRCI), although with any truncation by excitation level the result
would not be size-extensive in general. Use of an exponential ansatz for the wave op-
erator in Eq. 5.7 was first suggested by Jeziorski and Monkhorst [65]. Bartlett and co-
workers have also pursued this approach on several occasions, suggesting a linearized
approximation and several noniterative size-extensivity corrections to MRCI [66], as
well as exploring more elaborate schemes [67,68]. Sometimes the term state universal
is used, instead of Hilbert space, in referring to Eq. 5.7.
On the surface, the simplest of the MRCC approaches are the one state meth-
ods, in which the model space is reduced to a single element, corresponding to a
particular electronic state. The wave operator then formally takes this approximate
representation of a given state into the exact state:
(5.8)
Again, with a linear expansion of the wave operator this approach resembles inter-
nally contracted MRCI methods, which are more economical than MRCI procedures
based on a multidimensional model space. With an exponential ansatz for $\mathcal{W}_{\mathrm{one}}$,
we have a type of MRCC method that has been explored by various authors, per-
haps most extensively by Simons and co-workers [69-71] and by Nakatsuji, Hirao,
and co-workers (the "symmetry-adapted cluster" methods, see Refs. 72 and 73, and
references therein). The term state-selective is also used for these one state methods.
It must be said at the outset that none of these three general approaches is
without severe problems, either formal or practical. All three lead to equations that
vary from merely complicated to almost incomprehensible. The computer implemen-
tations that exist are usually restricted in some way or other, sometimes severely.
Quite apart from these issues, which it could be argued could be overcome by in-
vesting the programming effort and by supplementing this author's mathematical
education, there are more fundamental problems. In the Fock space and Hilbert
space methods, model spaces must be defined on which the wave operator will act.
However, within a finite model orbital space - which is typically composed of only
the valence orbitals for small molecules and will be more restricted for larger systems
- even a full CI calculation is unlikely to provide good approximations to more than
the first few exact wave functions. That is, the spectrum of the model space does
not resemble the exact spectrum. For higher excited states of the exact problem, it is
almost inevitable that the wave functions will contain significant contributions from
configurations that represent excitations outside the model space. For example, if we
consider the simple case of H2 with a two-electron, two-orbital model space, the first
excited $^1\Sigma_g^+$ state can best be represented as
(5.9)
where $|c_u| \gg |c_g|$. However, it seems highly implausible that a real bound excited $^1\Sigma_g^+$
state will resemble such a wave function: a wave function dominated by
$\frac{1}{\sqrt{2}}\left(|1\sigma_g 2\bar{\sigma}_g| + |2\sigma_g 1\bar{\sigma}_g|\right)$ (5.10)
seems more likely, and this is an excitation external to our model space. Hence as we
attempt to transform the model space into the exact space, such external excitations
will "intrude" [74], appearing with lower energies than the model space approxima-
tions and disrupting the model space spectrum. Intruder state problems of this type
plague all multireference methods (including varieties of multireference perturbation
theory that are beyond the scope of this work) that rely on well-defined correspon-
dences between model space and exact wave functions. They are very difficult to
resolve without resorting to impractically large model spaces.
It might appear from the intruder state problem that one state MRCC methods
should be preferred. A quite different problem arises here, however. No exponential
ansatz for the wave operator for such an approach has been devised that can be shown
to be complete (that is, that yields the full CI result when up to N-electron excitations
are included), without including amplitudes that cannot be determined from the single
state approach [61]. A unitary formulation appears to sidestep this difficulty [71], but
the Hausdorff expansion for such a formulation is infinite (for any order of excitation
in T) and must therefore be truncated in practical implementations.
Another problem with essentially all MRCC approaches is the issue of "incom-
plete" model spaces (that is, not full CI spaces), or, for the one state methods, the use
of a multiconfigurational reference function that is not a CASSCF wave function but
involves some selection of configurations. Jeziorski and Monkhorst [65] showed that
some disconnected terms must appear when incomplete model spaces are used. It is
far from straightforward to demonstrate that a given approach will be size-extensive
for an incomplete model space, and indeed some approaches will not be. Complete
model spaces exacerbate intruder state problems and lead to very long expansions.
We should note that correlation of "inactive" orbitals (that is, including excitations
from orbitals that are doubly occupied in all reference configurations in the definition
of T) can also lead to problems similar to those of incomplete model spaces.
The Hilbert space formulation of Jeziorski and Monkhorst has provided the
basis for most of the effort in MRCC methods. The author is not aware of any
general-purpose computer implementation of this method, although by employing
the usual restriction $T = T_1 + T_2$, eliminating all nonlinear terms, and neglecting matrix
elements of exp(T) that couple different model space configurations, Laidig and
Bartlett [66] developed the multireference linearized coupled-cluster method (MRL-
CCM). This can be implemented with relatively minor modifications to an MRCI pro-
gram and needs the same computational effort. However, MRLCCM should rightly
be regarded as an approximate CC method, and we shall discuss it further in the
context of other approximate methods, in Chapter 6.
Chapter 6
In this chapter we shall discuss a variety of methods designed to address the problem
of size-extensivity. Some of the methods are exactly size-extensive and some are
only approximately so; some methods are rather closely related to CC theory, others
are not. Where possible, we shall stress the relationships with CC methods. This
means that certain treatments, like "quadratic configuration interaction", are viewed
as approximations to CC methods. This viewpoint has caused controversy in the
past, but it is hard to disagree with the philosophy expressed by Paldus and co-workers [75]:
"When theory B may be obtained by dropping certain terms ... from
equations characterizing theory A, one normally considers B as a special case of (or
approximation to) A."
The quadratic configuration interaction (QCI) approach was introduced by Pople and
co-workers in 1987 [76]. It was originally derived by adding only those terms to the
CISD equations that were required to ensure size-extensivity. (Pople and co-workers
use the term "size-consistency" throughout their discussion, but in this course the
appropriate term is size-extensivity.) The resulting equations are
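$\langle\Psi_0|W|T_2\Psi_0\rangle = \epsilon$, (6.1)
$\langle\Psi_i^a|W|(1 + T_1 + T_2 + T_1 T_2)\Psi_0\rangle = \epsilon\, t_i^a$, (6.2)
$\langle\Psi_{ij}^{ab}|W|(1 + T_1 + T_2 + \tfrac{1}{2}T_2^2)\Psi_0\rangle = \epsilon\, t_{ij}^{ab}$, (6.3)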
where for convenient comparison with the work of Pople and co-workers we have used
their form in which the correlation energy appears explicitly in the wave function
$\langle\Psi_0|W|(T_2 + \tfrac{1}{2}T_1^2)\Psi_0\rangle = \epsilon$, (6.4)
$\langle\Psi_i^a|W|(1 + T_1 + T_2 + T_1 T_2 + \tfrac{1}{2}T_1^2 + \tfrac{1}{3!}T_1^3)\Psi_0\rangle = \epsilon\, t_i^a$, (6.5)
The following differences between CCSD and QCISD are then apparent. (Hartree-
Fock orbitals are assumed.)
(i) The CCSD energy (i.e., the projection on $\langle\Psi_0|$) includes a contribution from the
disconnected term $T_1^2$, which is absent from QCISD.
(ii) The CCSD singles equations (the projection on $\langle\Psi_i^a|$) contain terms in $T_1^2$ and $T_1^3$
that are absent from QCISD.
(iii) The CCSD doubles equations contain a number of terms on both sides involving
powers of $T_1$ or products of $T_1$ and $T_2$ that are absent from QCISD.
We can draw some qualitative conclusions from these differences, all of which,
we note, involve single excitation amplitudes. First, the cubic and quartic terms
like $T_1^2 T_2$, $T_1^3$, or $T_1^4$ that appear in the CCSD equations are absent from QCISD. The
only nonlinear terms are quadratic, hence the name QCI. Second, a putative QCID
method would correspond exactly to CCD. Third, since all differences involve $T_1$,
we can expect the difference between QCISD and CCSD results to be least when
single excitations are relatively unimportant. Alternatively, using the diagnostic
introduced previously, if $\mathcal{T}_1$ is large, we may expect significant differences between
CCSD and QCISD (note that it is of course possible to derive an analogous diagnostic,
which we have denoted $Q_1$, for QCI [77]). This is indeed what was found
in a detailed comparison of the two methods. In general, for a given system $Q_1$
tends to be larger than $\mathcal{T}_1$, and the difference becomes larger with increasing magnitude
of the diagnostics (that is, with increasing nondynamical correlation).
However, the inclusion of connected triple excitations - the CCSD(T) and
QCISD(T) methods - substantially improves the results, and the triples-corrected results
agree much better than do the QCISD and CCSD results. Indeed, credit should be
given to Raghavachari and co-workers for recommending the (T) correction in the
first place [49], deriving the CCSD(T) method from the triples correction they had
implemented for QCISD [76]. (The actual terms appearing in the correction had al-
ready been presented by Kucharski and Bartlett in their analysis [43] of fifth-order
perturbation theory.) The inclusion of the fifth-order terms is even more important
for QCI than for CC [50]. The QCISD+T(QCISD) or QCISD+T(4) methods would
not work well, especially when nondynamical correlation becomes important.
Computationally, the terms omitted from the CCSD equations to obtain the
QCISD equations are relatively straightforward, and do not consume much time in
each CC iteration. In fact, by appropriate formulation of the equations the work that
scales as $N^6$ should be the same in QCISD and CCSD. However, it might be expected,
given the always uncertain convergence of systems of nonlinear equations, that the
presence of cubic and quartic terms might cause convergence difficulties, or might
slow convergence, compared to only quadratic terms. In the author's experience this
has not been a real problem. There may be some compensation here - the fact that
QCISD "overshoots" CCSD as nondynamical correlation becomes more important
indicates that the QCI amplitudes are larger than their CCSD counterparts, which
may offset any advantage from less nonlinearity.
In the original QCI reference [76], Pople and co-workers gave a rather general
prescription for converting any excitation level CI into a QCI procedure. However,
there has been considerable discussion about whether, for example, the QCISDT
treatment obtained with their prescription is unambiguously defined, or whether such
treatments are truly size-extensive [75,78,79]. This is largely academic, since there
is little incentive (given the computational cost) to implement such methods. It is
possible to implement perturbational corrections for higher than triple excitations,
by analogy with the CC case, giving a method like QCISD(TQ). Benefits from such
methods are not obvious, as discussed in Chapter 4.
Freely admitting to a bias in favour of the "theoretically more complete"
method, the author finds little reason to perform QCISD or QCISD(T) calculations
if the means is at hand for their CC analogues. The CC methods undoubtedly outperform
the QCI methods when nondynamical correlation is important [77], and one
may not always know whether this is a problem beforehand, so CCSD or CCSD(T)
would be the better safety play. In systems strongly dominated by Hartree-Fock,
however, there is little to choose between QCISD(T) and CCSD(T).
since we require that $T_1 = 0$. Note that since the single excitation amplitudes are
absent, Eqs. 6.7 and 6.9 are identical to the CCD equations. Also, since QCID is
equivalent to CCD, we see that Brueckner orbital QCID and Brueckner orbital CCD
are equivalent. The orbital rotations required to eliminate $T_1$ are defined by Eq. 6.8.
In practice, one could proceed by assuming that the orbital rotations are small, and
Eq. 6.8 is then approximated as linear in the unknown rotations [80]. Thus the com-
putational procedure is to solve for the amplitudes, and then determine an orthogonal
transformation of the orbitals to satisfy the linearized form of Eq. 6.8. Obviously,
such a procedure needs to be rapidly convergent if it is to be useful, as otherwise
the repeated integral transformations will simply cost too much. In fact, this simple
procedure does not perform very well. A more elaborate Newton-Raphson approach
apparently does not work much better. Werner and co-workers have thus suggested a
rather different approach, in which orbital rotation is performed in each iteration of a
CCSD (or QCISD) calculation. That is, at the start of each CCSD iteration, the MOs
are rotated using the most recent estimate of the $t_1$ amplitudes. Initially, it might
seem as if this approach would be prohibitively expensive, because an integral trans-
formation would be required in every iteration. However, Werner and co-workers [81]
sidestep the full transformation, just as they have done previously in direct CISD
calculations [82]. The partial transformations they require are relatively cheap. This
benefit is not entirely cost-free, since each CCSD iteration takes somewhat longer
without the full integral transformation (with or without orbital rotation), but it is
probably cost-effective for calculations aimed at Brueckner orbitals. Other factors,
such as disk space and input/output performance, will also play a role in practice.
We note finally that a perturbation theory analysis [50] shows that Brueckner-orbital
CCD includes more fifth-order terms than CCSD (or QCISD). Hence the Brueckner
orbital methods can be viewed as more complete in this sense, at least. The numerical
consequences appear slight, however.
It has become common practice to use the term "Brueckner doubles" (BD) as
a convenient shorthand for the treatment described here, that is, a CCD expansion
using Brueckner orbitals. The terminology is somewhat unfortunate, since the term
could also mean a CID expansion with Brueckner orbitals. Like most terminology in
use in the GAUSSIAN series of programs, BD or extensions like BD(T) will probably
become common in the literature, independent of their unsuitability or ambiguity.
But terminology like "Brueckner coupled-cluster doubles" or "CCD with Brueckner
orbitals" is less ambiguous and certainly preferable.
We have already described the use of the $\mathcal{T}_1$ diagnostic as a probe of the importance
of nondynamical correlation. For Brueckner orbitals, of course, $\mathcal{T}_1 = 0$, by
definition. We can thus regard the transformation to Brueckner orbitals as an attempt
to let the orbitals respond to nondynamical correlation; it might be hoped that the
use of Brueckner orbitals would thereby be beneficial in cases where nondynamical
correlation was important, and this point has been strongly emphasized in recent dis-
cussions of Brueckner orbitals. However, explicit comparisons with full CI benchmark
results, at least, do not show any particular advantage of Brueckner orbitals in this
respect [80]. Lee and co-workers [83] have compared QCISD, CCSD, and Brueckner-
CCD with one another, and also compared these methods when augmented with the
(T) triples correction. They found little to choose between the methods, especially
when the triples correction was included.
One of the virtues of the energy-independent formulation of the CC equations has been
emphasized by Bartlett [12]: the results are size-extensive no matter what truncation
of T is employed, and no matter which terms are dropped from the resulting equations.
We have already seen this, to some extent, with QCISD. But we can eliminate terms
more drastically than this and obtain size-extensive results. For example, consider
again the CCD equations,
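$\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \varepsilon$ (6.10)
$\langle \Psi_{IJ}^{AB} | W + (W T_2 - T_2 W) + (\tfrac{1}{2} W T_2^2 - T_2 W T_2) | \Psi_0 \rangle = 0$, (6.11)
and drop the terms that are quadratic in $T_2$, which gives
$\langle \Psi_0 | W | T_2 \Psi_0 \rangle = \varepsilon$ (6.12)
$\langle \Psi_{IJ}^{AB} | W + (W T_2 - T_2 W) | \Psi_0 \rangle = 0$, (6.13)
It is vital to observe that this does not reduce to the CID equations: instead of the
CID eigenvalue problem we have a set of linear equations for the unknown amplitudes.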
Our results will be size-extensive (although since the result for an isolated two-electron
system is not correct, we have correct scaling behaviour, but not the right answer,
cancelling the unknown energy. (Dropping the remaining 17 terms would yield
LCCD.) However, in order to accomplish this cancellation we used the matrix el-
ement identity
(6.16)
and a summation range over K > L that implicitly included terms that would arise
from "exclusion principle violating" (EPV) quadruple excitations, like $\Psi_{IJIJ}^{ABCD}$. Such
terms cannot arise in a fermion wave function, of course. For example, the correlation
energy for the pair IJ is
which would be given by a doubles/quadruples matrix element like the LHS of Eq. 6.16
when KL = IJ. One might argue that in equations for amplitudes associated with
the pair I J it is an error to have included such EPV terms, and that they should be
removed. In this way we would obtain an approximation like [85,86]
(6.19)
The approximation of Eq. 6.18 is termed CEPA-2, while we can see from Eq. 6.19
why LCCD is sometimes termed CEPA-0. We can see that in both cases the unknown
correlation energy that would appear on the RHS of Eqs. 6.14 will be cancelled out,
although in the case of CEPA-2 the (unknown) pair correlation energy $\varepsilon_{IJ}$ remains
on the LHS. For the noninteracting two-electron systems this does not interfere with
size-extensivity, as can be seen by repeating the scaling arguments of Chapter 2.
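The scaling behaviour discussed here can be checked numerically on a toy model. The sketch below is not taken from these notes: it assumes N noninteracting two-electron systems, each described by a reference determinant and one double excitation, with a hypothetical coupling matrix element `k` and excitation energy `delta`. Under these assumptions the linearized (LCCD/CEPA-0) amplitude equation for each pair reduces to k + delta * t = 0, while CID requires the lowest eigenvalue of an (N+1)-dimensional matrix.

```python
import numpy as np


def exact_energy(n_sys, k, delta):
    """Exact correlation energy: n_sys times the two-level pair energy."""
    pair = 0.5 * (delta - np.sqrt(delta**2 + 4.0 * k**2))
    return n_sys * pair


def cid_energy(n_sys, k, delta):
    """Lowest eigenvalue of the CID matrix (energies relative to the reference)."""
    dim = n_sys + 1
    h = np.zeros((dim, dim))
    h[0, 1:] = k            # reference coupled to one double per system
    h[1:, 0] = k
    idx = np.arange(1, dim)
    h[idx, idx] = delta     # doubles on different systems do not couple
    return np.linalg.eigvalsh(h)[0]


def lccd_energy(n_sys, k, delta):
    """Linearized CCD: solve k + delta * t = 0 per pair, energy = sum of k * t."""
    t = -k / delta
    return n_sys * k * t


if __name__ == "__main__":
    k, delta = 0.1, 1.0  # arbitrary illustrative values (assumptions)
    for n in (1, 2, 10, 100):
        # Print the correlation energy per system for each treatment.
        print(n,
              exact_energy(n, k, delta) / n,
              cid_energy(n, k, delta) / n,
              lccd_energy(n, k, delta) / n)
```

In this model the LCCD energy per system is independent of N (correct scaling) but differs from the exact pair energy, whereas CID is exact for N = 1 and recovers a decreasing fraction of the correlation energy as N grows.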
Perhaps the first question the reader might ask is "what about CEPA-1?"! We
should first point out that CEPA-2 corresponds to Kelly's original suggestion [85] and
was certainly known and used first. The name CEPA was introduced by Meyer [86]
(who originally called it CCI - cluster-corrected CI) and the numerical suffixes were
used to distinguish between different approximations. Unfortunately, there was con-
fusion about Meyer's numbering, and consequently the first CEPA method is denoted
CEPA-2. The origins of CEPA-1, which we will now discuss, reflect an interesting
aspect of theoretical methods that we have not discussed up to this point: their invariance,
or lack of it, to unitary transformations on the molecular orbitals. Full CI results (or,
equivalently, full CC results) are invariant to any unitary transformation on the MOs
involved in the CI. Where, say, Hartree-Fock theory is used to define an occupied
and virtual space, the results of a truncated CI or coupled-cluster calculation will
not be invariant to an arbitrary transformation, since the Hartree-Fock solution it-
self is not invariant. However, a unitary transformation that mixes occupied orbitals
among themselves, and/or mixes virtual orbitals among themselves does not affect
the Hartree-Fock wave function (although it obviously changes individual orbitals).
Most of the methods we have encountered so far are rigorously invariant to such
transformations: this includes CC (or QCI) and CI at any level of excitation, plus
linearized CC methods like LCCD/CEPA-0. The $\mathcal{T}_1$ diagnostic is also invariant to
such mixing, although individual cluster amplitudes or CI coefficients clearly are not.
Such invariance is of interest if we wish to transform between, say, canonical and
localized Hartree-Fock orbitals, or if we have degenerate MOs that we may wish to
fix in some way that does not affect the energy. Invariance of perturbation theory
to orbital rotations is a much more complicated area, since Møller-Plesset perturbation
theory, for example, is predicated on expressing the resolvent using sums of
orbital energies in the denominator. Obviously, a different choice of orbitals may
lead to a nondiagonal Fock operator. More details are given elsewhere at this school,
but we mention it here because it also affects the perturbational estimate of higher
excitations, like the (T) treatment of triples. Returning to the CEPA methods, we
have already noted that CEPA-0 is invariant to unitary transformations on the MOs
that do not mix the occupied with the virtual space. However, this is not the case for
CEPA-2, since the individual pair-correlation energies are not invariant to such
transformations. For example, CCSD or CEPA-0 are size-extensive independent of such
transformations. But for the case of noninteracting two-electron systems, CEPA-2 is
size-extensive only if localized orbitals are used. This has always been the basis of
our analysis, but we now know it to be irrelevant for CCSD, etc. For CEPA-2, how-
ever, a transformation to delocalized orbitals will not retain size-extensivity. Meyer
found this to be unsatisfactory, and modified the method [86] to obtain size-extensive
results for localized and delocalized orbitals in the case of noninteracting two-electron
systems. This method, CEPA-1, employs the approximation
where the doubly primed summation omits both K = I and K = J. The expression
now removes not only the pair-correlation energy $\varepsilon_{IJ}$, but also all the "semi-joint"
pair-correlation energies in which one MO index is either I or J. These additional
pair-correlation energies come from quadruples with excitations from, say, IJIK,
which are sometimes termed type 2 EPV terms, to distinguish them from the type 1
terms arising from IJIJ. Thus CEPA-2 removes the EPV type 1 terms, while CEPA-1
removes the EPV terms of types 1 and 2. (Remember, the notation is all Meyer's
fault!)
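The invariance of the Hartree-Fock wave function to mixing among the occupied orbitals, on which this discussion rests, can be illustrated with a small numerical sketch. It is not part of the original notes and uses random orthonormal vectors as hypothetical stand-ins for MO coefficients in an orthonormal basis: the individual orbitals change under the rotation, but the one-particle density matrix does not, and the overlap between the two determinants built from the original and the rotated orbitals has unit magnitude.

```python
import numpy as np


def occupied_rotation_demo(n_basis=6, n_occ=3, seed=0):
    """Rotate occupied orbitals among themselves and report what changes."""
    rng = np.random.default_rng(seed)
    # Orthonormal "MO coefficients" in an orthonormal basis (hypothetical data).
    c, _ = np.linalg.qr(rng.standard_normal((n_basis, n_basis)))
    c_occ = c[:, :n_occ]
    # Random orthogonal matrix mixing the occupied orbitals among themselves.
    u, _ = np.linalg.qr(rng.standard_normal((n_occ, n_occ)))
    c_rot = c_occ @ u
    # The density matrix C C^T is unchanged because u u^T = 1.
    density_change = np.max(np.abs(c_occ @ c_occ.T - c_rot @ c_rot.T))
    # The individual orbitals, however, are different.
    orbital_change = np.max(np.abs(c_occ - c_rot))
    # Overlap of the two determinants: det(C_occ^T C_rot) = det(u) = +1 or -1.
    overlap = np.linalg.det(c_occ.T @ c_rot)
    return density_change, orbital_change, overlap


if __name__ == "__main__":
    print(occupied_rotation_demo())
```

Quantities that depend on individual orbitals, such as pair-correlation energies, have no such protection, which is the origin of the orbital dependence of CEPA-2 noted above.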
We can again draw some qualitative conclusions from the form of the various
CEPA approximations. The magnitude of the term approximated in CEPA-0,
CEPA-2, and CEPA-1 decreases in that order, and we know that CID would be
obtained if this term were set to zero (recall that we have not yet cancelled the energy
from the RHS here). Hence we might expect that the computed correlation energy
will increase from CID in the order CEPA-1, CEPA-2, CEPA-0. This is what is
commonly found in practice. The CCD result is usually found between CEPA-1 and
CEPA-2, so the latter appears to be something of an overestimate and the former
an underestimate, at least for systems with little nondynamical correlation. Where
there is strong nondynamical correlation none of the CEPA methods is very reliable.
For real many-electron systems (as opposed to the noninteracting two-electron
systems case) CEPA-1 is approximately invariant to unitary transformations that do not
affect CCD or CID, while CEPA-2 can show significant differences.
The CEPA methods have been perhaps the most widely used approximate
treatments that can be viewed directly as approximations to the CCD or CCSD
equations. They were also the earliest widely applicable size-extensive treatments
that were accurate through at least the third order of perturbation theory. Although
the independent-pair approximations to Sinanoglu's cluster expansion [7] were
size-extensive by construction, they were in error in the third order of perturbation theory
because of the neglect of matrix elements coupling the pairs. Initially the CEPA
methods were viewed with a mixture of disdain and mistrust, sometimes with a dash
of outright hostility. This was partly because the desirability of having a size-extensive
treatment was not well understood, and was largely ignored. Indeed, Davidson's
correction, which we shall treat in Sec. 6.6 below, was the first treatment of size-
extensivity that won any sort of general acceptance, perhaps because it was trivial
to compute and thus did not require understanding any new physics in order to
implement it. Another factor that undoubtedly told against the CEPA methods,
however, was that there were simply too many varieties. People began to suspect
that a new CEPA-n method was devised each time the existing n - 1 methods failed.
This was unfounded, but it merely reflects human nature. CEPA-3 was an average
of CEPA-1 and CEPA-2, founded largely on the notion that CCD results, when they
became available for comparison, fell between CEPA-1 and CEPA-2. CEPA-4 and
CEPA-5 were introduced by Koch and Kutzelnigg [89] after examination of various
terms in the CCD equations. In retrospect, recommending one CEPA method, say,
CEPA-1, would have been a better strategy, but these days the issue has become
irrelevant.
In parallel with the development of the CEPA methods by Meyer, Kutzelnigg,
and co-workers, Hurley [4] derived a number of similar approximations. In Hurley's
notation, LCCD is CPA$^0$, while CEPA-2 is CPA' and CEPA-1 is CPA''. Actually,
Hurley's work was originally formulated for spin-orbital pairs, that is, true electron
pairs. The CEPA methods were introduced for spin-coupled pairs. As Taylor and co-
workers showed [20], there are subtleties and ambiguities associated with transforming
their CPA' and CPA'' approximations to spin-coupled pairs. This is inevitable - for
spin-orbitals I and J an excitation from I J I J would always be EPV. But for spatial
orbitals i and j, a quadruple excitation from ijij need not be EPV, as in an excitation
like ill'l]:/'. This determinant is a legal quadruple excitation. Hence the EPV terms
become mixed with non-EPV terms.
Finally, several efforts have been made to develop a multireference CEPA.
Siegbahn [90], Hoffman and Simons [71], and Fulde and Stoll [91], for example, have
implemented such methods, although they have received little use. Some aspects of
multireference CEPA methods are discussed in the next section.
(6.22)
or two-particle
$\Psi_P = \sum_{a \geq b} \Psi_{ij}^{ab} c_{ij}^{ab}$ (6.23)
correlation functions. (The labels P, etc., are taken to indicate the spin-coupling of
the double excitations here, as well as the orbitals from which electrons are excited.)
The CI correlation energy functional is then
The strategy for obtaining size-extensivity is related to the observation that cancel-
lation of terms in the numerator by part of the normalization denominator produces
size-extensive results in the separated-pair case. Ahlrichs and co-workers [92] sug-
gested a modified functional - the coupled-pair functional (CPF) - given by
where
$N_P = 1 + \sum_R T_{PR} \langle \Psi_R | \Psi_R \rangle$, (6.26)
and T is a symmetric matrix. The only change from the CI energy functional is the
incorporation of the weighting factors $T_{PR}$ ("topological factors" in the terminology
of Ahlrichs and co-workers [92]) into the normalization denominators. How do we
specify values for these factors? We note that the choice $T_{PQ} = 1$ for all PQ recovers
the CISD functional, while inspection will show that the choice $T_{PQ} = 0$ everywhere
in fact yields LCCSD/CEPA-0. From the previous section an intermediate choice
should give the best results. For the case that P represents excitations from spatial
orbitals i and j and Q from k and l, Ahlrichs and co-workers chose
$T_{PQ} = \frac{\delta_{ik} + \delta_{il}}{2 n_i} + \frac{\delta_{jk} + \delta_{jl}}{2 n_j}$. (6.27)
Here $n_i$ is the occupation of space orbital i in $\Psi_0$. (Single excitations are accounted
for using the formula for P = ii.) These occupation numbers allow the use of the CPF
method with open-shell SCF reference wave functions, although in this case care is
required to differentiate between those single excitations that are single spin-orbital
excitations from $\Psi_0$, and which therefore have vanishing matrix elements with $\Psi_0$
by Brillouin's theorem, and those that are space-orbital single excitations but spin-
orbital double excitations, which are treated as double excitations [92].
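As a concrete illustration of the weighting factors, the sketch below evaluates them for a few pairs, assuming the form $T_{PQ} = (\delta_{ik} + \delta_{il})/(2 n_i) + (\delta_{jk} + \delta_{jl})/(2 n_j)$ for Eq. 6.27. The function name, the representation of a pair as a tuple of orbital indices, and the occupation list are illustrative assumptions, not part of any published program.

```python
def kronecker(a, b):
    """Kronecker delta for orbital indices."""
    return 1.0 if a == b else 0.0


def topological_factor(pair_p, pair_q, occ):
    """T_PQ for P = (i, j) and Q = (k, l), with occ[i] the occupation n_i.

    Assumes every orbital appearing in a pair has a nonzero occupation.
    """
    i, j = pair_p
    k, l = pair_q
    return ((kronecker(i, k) + kronecker(i, l)) / (2.0 * occ[i])
            + (kronecker(j, k) + kronecker(j, l)) / (2.0 * occ[j]))


if __name__ == "__main__":
    occ = [2, 2, 2, 2]  # closed-shell reference: every orbital doubly occupied
    for p, q in [((0, 0), (0, 0)), ((0, 1), (0, 1)),
                 ((0, 1), (1, 2)), ((0, 1), (2, 3))]:
        print(p, q, topological_factor(p, q, occ))
```

For a closed-shell reference the factor is 1 for a pair ii with itself, so an isolated two-electron system keeps the full CI normalization, and it vanishes for pairs that share no orbital, which is the LCCSD/CEPA-0 limit for that coupling.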
The weighting factor definitions in Eq. 6.27 were chosen for two reasons: they
provide size-extensive results in the sense of noninteracting two-electron systems,
and for this case the results are independent of a unitary transformation on the
occupied or virtual spaces. However, the results are not generally invariant to such a
transformation, although in most investigations the lack of invariance is very small.
These weighting factors also yield size-extensive results for noninteracting systems of
any size, provided orbitals localized on each system are used. The weighting factors
are not necessarily uniquely defined even with the specified requirements, as discussed
by Ahlrichs and co-workers, but Eq. 6.27 seems to be the simplest choice.
The fact that CPF is size-extensive for the noninteracting two-electron systems
independent of mixing among occupied or among virtual orbitals is reminiscent of the
CEPA-1 method described in the previous section. Indeed, CPF restricted to double
excitations only is equivalent to CEPA-1 with doubles. Differences arise if singles are
included because of different definitions.
One advantage of the CPF method is that it is very straightforward to program
if a CISD code is available, as discussed by Ahlrichs and co-workers. It can readily be
incorporated into direct CISD codes with little effort. The only consequence of any
significance is that convergence of the iterative process may be slower, especially where
size-extensivity effects are large, or where nondynamical correlation is important. The
CPF functional is bounded from below, but not necessarily by the true energy [92].
That is, the CPF "energy" cannot go to $-\infty$; it must remain finite, but it can fall
below the exact (full CI) energy. Such "nonvariational" results are only rarely seen,
however. In any event, the method is to be preferred to CEPA-0 in this respect, since
the latter method is not bounded and is occasionally observed to undergo complete
collapse.
As we have stated, the CPF method is not restricted to closed-shell reference
functions, but can be used with restricted open-shell reference functions. (I am un-
aware of any UHF-based CPF program, although it would be a relatively simple mat-
ter to develop one.) Unlike spin-adapted open-shell CC implementations, open-shell
CPF requires essentially the same computational effort as the corresponding CISD.
Hence the method has been fairly popular for use in open-shell systems. For complete
generality, of course, we would like to be able to handle multireference cases. It is
not at all obvious how to do this with CPF as originally developed, because as with
many other pair-based methods it is not clear how to define "pairs" for a multiconfig-
urational reference function. Gdanitz and Ahlrichs [93] realized that one solution to
this would be to develop weighting factors that are not dependent on the individual
electron pairs - that is, to require all $T_{PQ}$ to be identical. By analogy again with the
noninteracting two-electron systems they suggested a normalization denominator of
(6.28)
(6.29)
where $\Psi_0 = \sum_R \Psi_R c_R$ is the reference function, intermediately normalized. $\Psi_e$
comprises all single and double excitations out of the set of $\Psi_R$, and $\Psi_a$ are configurations
with no external orbitals that are orthogonal to $\Psi_0$. By analogy with CPF the ACPF
energy functional is
is somewhat arbitrary. Gdanitz and Ahlrichs reasoned that for a CASSCF reference
space there is no need for a size-extensivity correction for the internal configurations,
since they comprise a full CI, and that most multireference calculations would be
close to CASSCF reference anyway (this does not seem to be altogether realistic).
Setting $g_a = 1$ should be safe, since it is likely to result in an underestimate rather
than an overestimate.
As can be seen from the analysis above, the ACPF method has more "damping"
in the normalization denominator than linearized multireference CC methods, and
seems to behave better. The reader interested in this topic should note that ACPF
is considerably more sensitive to the choice of reference configurations than is MRCI,
and this can become a major issue in ensuring satisfactory convergence of the ACPF
iterations. More details are given in other courses at this school. We should also note
that there are several treatments developed as multireference perturbation theory that
are very close indeed to ACPF [94-96]. The method of Cave and Davidson [94,95],
for example, corresponds to setting $g_a = 1$ and $g_e = 0$.
In 1986 Chong and Langhoff [97] pointed out that in systems with significant
nondynamical correlation the CPF method tends to overestimate the effect of
higher excitations. In particular, for several small transition-metal diatomics the
size-extensivity correction was overestimated so much that the CPF results were not
necessarily better than the CISD results. They suggested a modification of the definition
of Ahlrichs and co-workers' weight factors $T_{PQ}$, replacing the geometric mean
in the denominator of the last term of Eq. 6.25 with a constant plus the arithmetic
mean. This naturally leads to a larger denominator, and hence a reduction in the
effect of higher excitations on the energy. There are some other subtleties that the
interested reader can follow up in the original reference to this modified coupled-pair
functional (MCPF) method. Implementation of the MCPF equations in a direct
CI-like form is not straightforward: Chong and Langhoff in fact modified a conven-
tional Hamiltonian matrix-driven code to obtain their first results, and a direct CI
implementation was first accomplished by Blomberg and Siegbahn [98].
The MCPF method undoubtedly out-performs CPF where nondynamical correlation
is important, at least as far as can be judged by empirical comparisons.
However, it has one significant disadvantage: its lack of invariance to rotations that
mix occupied orbitals (or virtual orbitals) among themselves. Unlike CPF, which is
formally invariant by construction for certain special cases, and which is empirically
observed to be close to invariant in practice, the MCPF method can give considerably
different results depending on the choice of orbitals. This is a particular problem if
properties like vibrational frequencies are to be calculated, since the energy may appear
to change discontinuously as geometric parameters are varied and orbitals localize.
MCPF can be very useful, but users must be very careful.
We have left the simplest approximate treatment until last. This is a formula published
by Langhoff and Davidson [99], in which a CISD correlation energy was modified
by adding $\Delta E$, where
$\Delta E = \varepsilon_{\mathrm{CISD}} (1 - c_0^2)$. (6.31)
Here $c_0$ is the coefficient of the reference configuration in the normalized CISD
expansion. The inclusion of the correction is commonly denoted by a suffix +Q, so that
the CISD+Q correlation energy, for example, would be
Obviously, this correction will be larger in magnitude than the original Davidson
correction. Martin and co-workers [101] have suggested the term "renormalized Davidson
correction" for Eq. 6.33.
Ahlrichs [102] has given an alternative derivation of the original Davidson
correction that provides additional insight. We write the CISD correlation energy in
the usual expectation value form as
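$\varepsilon_{\mathrm{CISD}} = \frac{\langle \Psi_{\mathrm{CISD}} | H - E_0 | \Psi_{\mathrm{CISD}} \rangle}{\langle \Psi_{\mathrm{CISD}} | \Psi_{\mathrm{CISD}} \rangle}$. (6.34)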
Then the original Davidson correction is
(6.35)
and by inserting the CISD correlation energy in this expression we find that
(6.36)
That is, the Davidson correction cancels the normalization denominator from the
correlation energy. In effect, we have the value of an energy functional $\langle \Psi | H - E_0 | \Psi \rangle$
evaluated with the CISD wave function. This is in fact no more than the
energy functional for the LCCSD method. Thus if we solve iteratively for the LCCSD
amplitudes, beginning with CISD as the initial guess, the first iteration will be the
traditional Davidson correction.
Two important observations should be made about the Davidson and renor-
malized Davidson corrections. First, we can see that we will obtain a size-extensivity
correction for a two-electron system, even though there should be none. Thus these
corrections behave inappropriately in the limit of very small numbers of electrons.
Obviously, one can avoid making any correction for the two-electron case, so there
is no real difficulty there. But what of four electrons, for example? Presumably the
correction will be too large, but by how much? It seems likely that the renormalized
correction, which must be larger in magnitude, will behave worse than the original
correction. Second, the corrections do not yield a completely size-extensive result,
since they do not account for all of the non size-extensive terms in a CISD calculation.
Hence they will become increasingly inaccurate as the number of electrons increases.
Siegbahn [100] derived higher-order supplements to the original corrections, but these
have been relatively little used.
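Both observations can be made concrete with a toy calculation. The sketch below is not from the notes: it assumes N noninteracting two-electron systems, each with a reference determinant and one double excitation, characterized by a hypothetical coupling `k` and excitation energy `delta`. There are no singles in this model, so CISD coincides with CID. The correction is computed as $\varepsilon_{\mathrm{CISD}}(1 - c_0^2)$, following Eq. 6.31.

```python
import numpy as np


def exact_energy(n_sys, k, delta):
    """Exact correlation energy of n_sys noninteracting two-level pairs."""
    return n_sys * 0.5 * (delta - np.sqrt(delta**2 + 4.0 * k**2))


def cisd_and_davidson(n_sys, k, delta):
    """Return (CISD correlation energy, Davidson correction) for the model."""
    dim = n_sys + 1
    h = np.zeros((dim, dim))
    h[0, 1:] = k
    h[1:, 0] = k
    idx = np.arange(1, dim)
    h[idx, idx] = delta
    evals, evecs = np.linalg.eigh(h)
    e_ci = evals[0]
    c0_sq = evecs[0, 0] ** 2   # reference weight in the normalized CI vector
    return e_ci, e_ci * (1.0 - c0_sq)


if __name__ == "__main__":
    k, delta = 0.1, 1.0  # arbitrary illustrative values (assumptions)
    for n in (1, 2, 10, 100):
        e_ci, dq = cisd_and_davidson(n, k, delta)
        print(n, e_ci, e_ci + dq, exact_energy(n, k, delta))
```

For N = 1 the CI energy is already exact, yet the correction is nonzero; for N = 2 (four electrons) the corrected energy falls slightly below the exact value with these parameters; and for large N the corrected energy still recovers only part of the exact correlation energy.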
Pople and co-workers [103] have derived another correction, stressing the need
for a vanishing size-extensivity correction in the limit of two electrons. Davidson
and Silver [104] introduced yet another correction. Neither of these corrections has
found widespread use. In general, the most commonly used correction appears to
be the original Davidson correction, followed a long way behind by the renormalized
Davidson correction. The interested reader may care to consult Ref. 101 for a detailed
comparison of different Davidson-type corrections.
One of the virtues of the Davidson correction is that it can be extended heuristically
to the case of a multireference CI. The two requirements are to define the correlation
energy, and the reference weight $c_0^2$. Let us assume we have a set of reference
configurations, indexed by R, and a reference energy
(6.37)
The "correlation energy" can be defined, more or less unambiguously, as the difference
$E_{\mathrm{MRCI}} - E_{\mathrm{REF}}$, where $E_{\mathrm{MRCI}}$ is the MRCI energy. Now, when we compute the
MRCI energy, we will (usually) vary the coefficients of the reference configurations as
well as those of the correlating configurations. Hence the coefficients of the reference
configurations in the MRCI wave function, $\Psi_{\mathrm{MRCI}}$, will usually differ by more than just
a renormalization from the $c_R^{\mathrm{REF}}$, which makes the definition of the reference weight
somewhat problematic. The first definition in common use is simply to use the $c_R^{\mathrm{MRCI}}$
to define the reference weight [105], so that
The MRCI+Q energy would be obtained by adding this correction to the MRCI
energy. This is probably the more common multireference Davidson correction. But
alternatives are possible. One approach is to obtain the reference weight by projecting
the MRCI wave function onto the original reference wave function. This would give
This is not the end of the story, either, since if the reference configurations were
selected from a CASSCF calculation, there is some scope for asserting that $c_R^{\mathrm{CASSCF}}$
(with an obvious definition) should be used in place of $c_R^{\mathrm{REF}}$ here. Indeed, for a
CASSCF reference space (in which case the distinction between $c_R^{\mathrm{CASSCF}}$ and $c_R^{\mathrm{REF}}$
is irrelevant, of course), we can view Eq. 6.40 as a "first iteration" of the linearized
multireference CC method of Laidig and Bartlett [66], discussed in Chapter 5 and
in Sec. 6.5 above. This is analogous to the relationship between the single-reference
Davidson correction and the linearized CCSD method. In this sense the correc-
tion Eq. 6.40 has a formal basis in theory, whereas the correction Eq. 6.39 is purely
heuristic. Nevertheless, full CI comparisons, for example, suggest that Eq. 6.39 is a
more reliable correction. We may expect it to yield a correction that is smaller in
magnitude than Eq. 6.40: both corrections tend to overshoot somewhat (especially for
CASSCF reference spaces) and so the latter overshoots more. More detailed discus-
sion of multireference size-extensivity corrections, including the MRACPF method,
is given elsewhere at this school.
Afterword
The aim of these notes was to provide a solid foundation for the understanding of
modern coupled-cluster methods and their relatives. They are certainly not com-
plete: apart from the decision to exclude diagrammatic methods from the formalism,
we have not discussed coupled-cluster energy derivatives and properties, nor have we
considered propagator or response methods based on CC treatments. These areas
represent important and valuable applications of coupled-cluster theory. The inter-
ested reader can find much on these subjects in the literature. Nor have we included
much about numerical applications of the methods. However, this area is covered in
some detail in other courses at this school.
Bibliography
[1] J. A. Pople, J. S. Binkley, and R. Seeger, Int. J. Quantum Chem. Symp. 10, 1 (1976).
[2] R. J. Bartlett and G. D. Purvis, Int. J. Quantum Chem. 14, 561 (1978).
[3] A. C. Hurley, J. E. Lennard-Jones, and J. A. Pople, Proc. Roy. Soc. A220, 100 (1953).
[16] R. J. Bartlett, S. A. Kucharski, and J. Noga, Chem. Phys. Lett. 155, 133 (1989).
[22] P. Pulay, S. Saebø, and W. Meyer, J. Chem. Phys. 81, 1904 (1984).
[23] G. E. Scuseria, A. C. Scheiner, T. J. Lee, J. E. Rice, and H. F. Schaefer, J. Chem. Phys. 86, 2881 (1987).
[28] T. J. Lee and P. R. Taylor, Int. J. Quantum Chem. Symp. 23, 199 (1989).
[29] D. Jayatilaka and T. J. Lee, J. Chem. Phys. 98, 9734 (1993).
[30] J. Paldus, J. Cizek, and I. Shavitt, Phys. Rev. A 5, 50 (1972).
[31] H. J. Monkhorst, Int. J. Quantum Chem. Symp. 11, 421 (1977).
[32] J. Paldus, J. Chem. Phys. 61, 303 (1977).
[33] J. Paldus, J. Cizek, M. Saute, and A. Laforgue, Phys. Rev. A 17, 805 (1978).
[40] Y. S. Lee, S. A. Kucharski, and R. J. Bartlett, J. Chem. Phys. 81, 5906 (1984).
[43] S. A. Kucharski and R. J. Bartlett, Adv. Quantum Chem. 18, 281 (1986).
[44] G. E. Scuseria and H. F. Schaefer, Chem. Phys. Lett. 152, 382 (1988).
[47] R. J. Bartlett, H. Sekino, and G. D. Purvis, Chem. Phys. Lett. 98, 66 (1983).
[48] M. Urban, J. Noga, S. J. Cole, and R. J. Bartlett, J. Chem. Phys. 83, 4041 (1985).
[51] H. Sekino and R. J. Bartlett, Int. J. Quantum Chem. Symp. 18, 255 (1984).
[69] A. Banerjee and J. Simons, Int. J. Quantum Chem. 19, 207 (1981).
[75] J. Paldus, J. Cizek, and B. Jeziorski, J. Chem. Phys. 93, 1485 (1990).
[77] T. J. Lee, A. P. Rendell, and P. R. Taylor, J. Phys. Chem. 94, 5463 (1990).
[81] C. Hampel, K. A. Peterson, and H.-J. Werner, Chem. Phys. Lett. 190, 1 (1992).
[82] H.-J. Werner and P. J. Knowles, J. Chem. Phys. 89, 5803 (1988).
[92] R. Ahlrichs, P. Scharf, and C. Ehrhardt, J. Chem. Phys. 82, 890 (1985).
[93] R. J. Gdanitz and R. Ahlrichs, Chem. Phys. Lett. 143, 413 (1988).
[96] K. Tanaka, T. Sakai, and H. Terashima, Theoret. Chim. Acta 76, 213 (1989).
[103] J. A. Pople, R. Seeger, and R. Krishnan, Int. J. Quantum Chem. Symp. 11, 149 (1977).
Andrzej J. Sadlej
Department of Theoretical Chemistry
University of Lund, Sweden
May 26, 1994
1. Introduction
The consideration of the electronic structure of atoms and molecules at the level of
relativistic quantum mechanics is a rather new area of quantum chemistry. The growing interest in
relativistic methods for electronic structure calculations is strongly linked to the developments
in chemistry of heavy atom compounds and their use in industry. Moreover, there is a number
of chemical observations which show that for heavy atom compounds the interpretation of their
electronic structure and properties cannot be achieved without the relativistic treatment.
The relativistic theory of atoms and molecules appears to be far more challenging than the
well established non-relativistic methods based on the Schrödinger equation. There are a number
of very basic unsolved problems which make working with relativistic theories interesting
and rewarding. The corresponding computational methods are still far from being as well
developed as those based on the non-relativistic theory.
In both classical and quantum mechanics the relativistic approach originates from the ob-
servation that the information transfer between different points in space requires certain finite
time. If the interaction between particles is of electromagnetic nature the speed of its transfer
is given by the velocity of light, c. The importance of this limit increases with the velocity v of
the motion of particles and can be recognized by considering the ratio:
$\beta = \frac{v}{c}$, (1.1)
which varies between 0 (particle at rest) and 1 (light quanta). In atomic units, which are used
throughout this text, the speed of light is approximately equal to 137.036 and it is worthwhile
to consider the range of velocities of electrons in typical systems.
It follows from the virial theorem that the average total energy $E = \langle E \rangle$ is equal to the
negative of the average kinetic energy $\langle T \rangle$. For the ground state of a hydrogen-like ion of the
nuclear charge Z one has (in atomic units):
and
(1.3)
so that
(1.4)
Thus, the non-relativistic results for the energy of the 1s electron in the hydrogen-like ion give
$\beta$ equal to $Z/c$, which is indeed small for light atoms. This estimate shows that the importance of
relativistic effects will increase parallel to the increase of the nuclear charge of atoms involved
in the given system.
A useful estimate of the magnitude of relativistic effects on energies follows from the relativistic
formula for the energy of a particle of rest mass m and velocity v:
(1.5)
where
(1.6)
$E = mc^2 + \frac{1}{2} m v^2 + \frac{3}{8} m v^2 \frac{v^2}{c^2} + \cdots = mc^2 + \frac{1}{2} m v^2 \left(1 + \frac{3}{4}\beta^2 + \cdots\right)$ (1.7)
and to within the additive rest mass energy $mc^2$ the leading relativistic contribution to the energy
(1.7) is of the order of $\beta^2$. Already for Z = 40 this leading term will change the particle energy
by about 7 per cent.
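The size of this estimate is easy to reproduce. The sketch below is an illustration added here, not part of the lecture: it assumes $\beta = Z/c$ for the 1s electron with c = 137.036 atomic units, evaluates the leading relative correction $\frac{3}{4}\beta^2$ from Eq. (1.7), and compares it with the untruncated kinetic energy $(\gamma - 1)mc^2$, where $\gamma = 1/\sqrt{1 - \beta^2}$ is the standard relativistic factor. The function names are hypothetical.

```python
import math

C_AU = 137.036  # speed of light in atomic units, as quoted in the text


def beta_1s(z):
    """Non-relativistic estimate beta = Z / c for the 1s electron."""
    return z / C_AU


def leading_correction(z):
    """Leading relative correction to the kinetic energy, (3/4) * beta**2."""
    return 0.75 * beta_1s(z) ** 2


def full_correction(z):
    """Relative correction from (gamma - 1) m c^2, without truncating the series."""
    b2 = beta_1s(z) ** 2
    gamma = 1.0 / math.sqrt(1.0 - b2)
    return (gamma - 1.0) / (0.5 * b2) - 1.0


if __name__ == "__main__":
    for z in (1, 10, 40, 80):
        print(z, beta_1s(z), leading_correction(z), full_correction(z))
```

For Z = 40 the leading term is a little over 6 per cent and the untruncated expression a little under 7 per cent, consistent with the figure quoted above. The estimate is only meaningful while Z remains well below c.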
This estimate gives some feeling for the magnitude and importance of relativistic contributions
to energies. In a similar way one can also consider the relativistic effect on average distances
which are measured in units of the Bohr radius ($a_0$). According to its definition $a_0$ is proportional
to the inverse of the electron mass at rest $m_e$. Since the mass m of an electron moving with
velocity v is:
$m = \frac{m_e}{\sqrt{1 - \beta^2}}$, (1.8)
distances are expected to be changed by a factor of $\sqrt{1 - \beta^2} < 1$. This is known as the relativistic
contraction effect. However, the relativistic change in the shape of wave functions must not
change the mutual orthogonality conditions. As a consequence some of the atomic shells may
decrease in size while others will increase. One should also remark that the electron spin
is a consequence of relativity. Thus, all magnetic interactions between electron spins and other
magnetic fields will arise in a natural way from relativistic considerations.
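A rough feeling for the contraction can be obtained by combining the relativistic mass $m = m_e/\sqrt{1 - \beta^2}$ with the earlier estimate $\beta = Z/c$ for the 1s electron. The sketch below is an added illustration under exactly these assumptions; it is a crude scaling estimate and not a substitute for a relativistic calculation.

```python
import math

C_AU = 137.036  # speed of light in atomic units, as quoted in the text


def contraction_factor(z):
    """Estimated scaling of 1s distances, sqrt(1 - beta**2) with beta = Z / c.

    Only defined for Z < c; larger Z lies outside the range of this estimate.
    """
    beta = z / C_AU
    return math.sqrt(1.0 - beta * beta)


if __name__ == "__main__":
    for z in (1, 29, 47, 79, 80):
        print(z, contraction_factor(z))
```

For Z = 80 the factor is about 0.81, that is, a contraction of roughly one fifth, while for Z = 1 it is indistinguishable from unity on this scale.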
In principle the relativistic framework is the only one which provides the right description of
physics. One learns in the theory of relativity that valid equations of physics must have the
same form in all inertial coordinate systems. In other words all valid equations of physics must
be covariant with respect to the space-time transformations linking different inertial coordinate
systems and preserving the space-time interval:
(1.9)
where r and t are the space and time separations between two points in the 4-dimensional space-time
coordinate system. The requirement (1.9) is satisfied by what is known as the Lorentz
transformation and the principle of validity of equations used to describe physical phenomena
can be formulated in terms of their covariance with respect to this transformation. This feature
is obviously missing in the case of the Schrodinger equation which is approximately valid as long
as the velocities of particles are small compared to c.
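The invariance of the interval can be checked numerically for a boost along one axis. The sketch below assumes the interval in the form $s^2 = x^2 - c^2t^2$ (the sign convention is an assumption, since only its invariance matters here) and compares the Lorentz transformation with the Galilean one; all function names are hypothetical.

```python
import math

C = 137.035999  # speed of light in atomic units


def lorentz_boost(x, t, v, c=C):
    """Lorentz boost with velocity v along x, applied to the event (x, t)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)


def galilei_boost(x, t, v):
    """Galilean transformation with velocity v along x."""
    return x - v * t, t


def interval(x, t, c=C):
    """Space-time interval, assumed here in the form x**2 - (c*t)**2."""
    return x ** 2 - (c * t) ** 2


x, t, v = 3.0, 0.01, 0.6 * C
print("original :", interval(x, t))
print("Lorentz  :", interval(*lorentz_boost(x, t, v)))
print("Galilei  :", interval(*galilei_boost(x, t, v)))
```

The Lorentz-transformed event reproduces the original interval to rounding accuracy, whereas the Galilean transformation does not.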
The relativistic quantum theory is built on the basis of the relativistic classical mechanics
and commutation rules for the coordinate and momentum operators. Traditionally, the theory
is first developed for one electron (particle) and leads to what is known as the Dirac equation
which is a relativistic substitute for the Schrodinger equation. The consequences of the Dirac
equation are analysed leading to the particle-hole interpretation and provide a basis for quantum
electrodynamics, i.e., the relativistic theory of many-electron systems. In the present series of
lectures only the most important elements of general theory will be given. The main attention is
focused on the ways in which relativistic and approximately relativistic theories are being used
in quantum chemistry. Most details are being skipped and can be found in monographs and
review articles.
(2.1 )
and can be considered as a classical relativistic Hamiltonian. The quantization proceeds in the
usual way by replacing the classical variables by operators. By applying this method directly to
Eq. (2.1) one would obtain a relativistic Hamiltonian operator involving a square root of operators
which is undefined. Dirac has assumed that the relativistic equation of motion should have the
same form as the Schrodinger equation, i.e.,
(2.2)
with the relativistic Hamiltonian H deduced from Eq. (2.1). To avoid undefined operators one
applies the so-called linearization procedure to the initial energy operator,
(2.3)
which follows from Eq. (2.1). The relativistic Hamiltonian operator is assumed to be linear in
momentum operators and written as:
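$$H = c\,\alpha p + \beta mc^2, \qquad (2.4)$$
where $\alpha = (\alpha_x, \alpha_y, \alpha_z)$ is a 3-dimensional vector of constants and $\beta$ is another constant, both to
be determined. The operator $p$ is the usual 3-component momentum operator vector, $p =
(-i\nabla_x, -i\nabla_y, -i\nabla_z)$, and the product $\alpha p$ is understood as the scalar product in 3-dimensional
vector space:
$$\alpha p = \alpha_x p_x + \alpha_y p_y + \alpha_z p_z. \qquad (2.5)$$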
By requiring that the square of the operator (2.4) is equal to the square of the initial operator
(2.3) one concludes that both $\alpha$ and $\beta$ are 4×4 matrices, i.e., operators acting in a 4-dimensional
space of the wave function components. Hence, the solutions $\Psi$ of the Dirac equation for a free
particle
(2.6)
(2.7)
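The wave function of the form (2.7) is referred to as a (4-component) spinor, while in the so-called
standard representation
$$\alpha_x = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad
\alpha_y = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad
\alpha_z = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \qquad (2.8)$$
and
$$\beta = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad (2.9)$$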
are hermitian matrix operators acting in the space of 4-component spinors. They satisfy the
following relations:
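$$\beta^2 = 1, \quad \alpha_k\beta + \beta\alpha_k = 0, \quad \alpha_k\alpha_l + \alpha_l\alpha_k = 2\delta_{kl}, \qquad (2.10)$$
for $k, l = (x, y, z)$, with 1 and 0 being 4×4 unit and zero matrices, respectively. The Kronecker
symbol $\delta_{kl}$ has the meaning of either the unit ($k = l$) or the zero ($k \neq l$) 4×4 matrix. The matrices
$\alpha$ and $\beta$ can be written in terms of auxiliary 2×2 matrices 0, I:
$$0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (2.11)$$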
(2.12)
Thus,
The Dirac equation (2.6) can be understood as a set of four linear first-order differential
equations for components of (2.7). As long as the Dirac Hamiltonian is time-independent the
time-dependence of the wave function can be factorized out leading to the time-independent
Dirac equation. For a free particle this equation assumes the following form:
$$H\psi = E\psi, \qquad (2.14)$$
(2.15)
and E denotes the particle energy. By solving Eq. (2.14) one learns that its eigenvalue spectrum
consists of two continua: $-\infty < E < -mc^2$ and $+mc^2 < E < +\infty$, separated by a gap of $2mc^2$.
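Both the anticommutation relations (2.10) and the two-branch structure of the spectrum can be verified with a short calculation. The sketch below builds the standard-representation matrices from the Pauli matrices and assumes the usual free-particle form $H = c\,\alpha p + \beta mc^2$ with a plain number triple for the momentum; the helper names are hypothetical. Since $H^2 = (c^2p^2 + m^2c^4)\,1$, every eigenvalue obeys $E = \pm\sqrt{c^2p^2 + m^2c^4}$, so that $|E| \geq mc^2$.

```python
C = 137.035999  # speed of light in atomic units

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
MINUS_ONE2 = [[-1, 0], [0, -1]]
PAULI = [
    [[0, 1], [1, 0]],      # sigma_x
    [[0, -1j], [1j, 0]],   # sigma_y
    [[1, 0], [0, -1]],     # sigma_z
]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


ALPHAS = [block(ZERO2, s, s, ZERO2) for s in PAULI]
BETA = block(ONE2, ZERO2, ZERO2, MINUS_ONE2)
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    return [[s * x for x in row] for row in a]


def close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def relations_hold():
    """Check beta^2 = 1, {alpha_k, beta} = 0 and {alpha_k, alpha_l} = 2 delta_kl."""
    zero4 = scale(IDENTITY4, 0)
    ok = close(matmul(BETA, BETA), IDENTITY4)
    for k, ak in enumerate(ALPHAS):
        ok = ok and close(anticommutator(ak, BETA), zero4)
        for l, al in enumerate(ALPHAS):
            ok = ok and close(anticommutator(ak, al), scale(IDENTITY4, 2 if k == l else 0))
    return ok


def free_dirac_hamiltonian(p, m=1.0, c=C):
    """Assumed form H = c*(alpha . p) + beta*m*c^2 for a numerical momentum p."""
    h = scale(BETA, m * c * c)
    for ak, pk in zip(ALPHAS, p):
        h = add(h, scale(ak, c * pk))
    return h


def squared_energy_check(p, m=1.0, c=C):
    """Return (H^2 == E^2 * 1, E^2) with E^2 = c^2 p^2 + m^2 c^4."""
    h = free_dirac_hamiltonian(p, m, c)
    e2 = c * c * sum(pk * pk for pk in p) + (m * c * c) ** 2
    return close(matmul(h, h), scale(IDENTITY4, e2), tol=1e-9 * e2), e2


print("relations (2.10) hold:", relations_hold())
print("H^2 = E^2 * 1        :", squared_energy_check((0.3, -1.2, 2.0))[0])
```

No eigenvalue can fall inside the interval $(-mc^2, +mc^2)$, which is the gap of width $2mc^2$ mentioned above.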
This form of the spectrum brings certain problems with respect to the interpretation of states of
the relativistic free particle and led Dirac to propose the existence of a positron. To avoid
problems arising from the negative energy continuum Dirac assumed that in what is referred to
as the vacuum all those states must be occupied by electrons. Then, the observed free electron will
have to occupy one of the positive energy eigenstates, will have positive energy, positive mass,
and will carry a negative charge. A hole in the negative continuum corresponds to a particle
with positive energy and mass which carries a positive charge. In this way the one-particle
relativistic theory becomes a many-particle theory; the negative energy continuum can contribute
to energies of positive energy particles via the so-called virtual excitations (polarization of
vacuum). The relativistic many-particle theory which handles this problem is known as quantum
electrodynamics and in principle enables a full relativistic treatment of many-particle systems.
Starting from the Dirac equation (2.6) and its hermitian conjugate one can derive the continuity
equation which brings about definitions of the charge density,
$$\rho = e\Psi^\dagger\Psi = e\sum_{i=1}^{4}\psi_i^*\psi_i, \qquad (2.16)$$
(2.18)
Most of the physical interpretation of the behavior of relativistic particles can be carried out in terms
of the charge and current densities defined by Eqs. (2.17) and (2.18).
A link between non-relativistic and relativistic theories can be accomplished by considering
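the following block form of the Dirac equation, i.e.,
$$\begin{pmatrix} [mc^2 - E] & c(\sigma p) \\ c(\sigma p) & -[mc^2 + E] \end{pmatrix}
\begin{pmatrix} u^L \\ u^S \end{pmatrix} = 0, \qquad (2.19)$$
where
$$u^L = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \quad u^S = \begin{pmatrix} u_3 \\ u_4 \end{pmatrix} \qquad (2.20)$$
are referred to as the large and small components of the Dirac spinor, respectively. Eq. (2.19)
is simply a set of two (2×2)-dimensional matrix equations: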
(2.23)
For a particle obeying non-relativistic mechanics its non-relativistic energy E is positive and
small compared to $mc^2$. Thus, in such a case,
(2.24)
$$(\sigma p)(\sigma p) = p^2 I \qquad (2.26)$$
(2.29)
where u solves the usual free-particle Schrodinger equation and $c_1$, $c_2$ are arbitrary constants.
Through the analysis of solutions for a particle moving in the magnetic field one can associate the
two components of (2.29) with two possible directions of the magnetic moment of the particle.
Thus, Dirac's equation can describe particles whose magnetic moment has two energetically
different orientations in an external magnetic field, i.e., it can describe particles with spin quantum
number of 1/2. On combining this fact with experimental data one concludes that Dirac's equation
is appropriate for describing electrons and positrons.
Dirac's equation for a free particle brings about most of the interpretation of relativistic
quantum mechanics and the basic terminology. It also provides a starting point for the relativistic
theory of many-electron (many-particle) systems. Although such a theory can be developed its
use is limited by a number of mathematical problems. Thus, the so-called relativistic methods
of quantum chemistry are in most cases based on Dirac's equation for one-particle systems.
The development of relativistic quantum chemistry parallels that of the non-relativistic
methods and the theory is built in the framework of the one-electron (relativistic) approximation.
Solutions of the Dirac equation for one-electron hydrogen-like systems play the key role in
devising relativistic methods in quantum chemistry of many-electron atoms and molecules in
exactly the same way as the corresponding solutions of the Schrodinger equation do in the
non-relativistic case.
$$E = -\mathrm{grad}\,\phi, \qquad (2.30)$$
$$H = \mathrm{rot}\,A, \qquad (2.31)$$
The classical relativistic energy expression for a particle of mass m and charge e (both in atomic
units) moving in the field given by Eqs. (2.30) and (2.31) is:
(2.32)
Following the method used in Section (2.1) we obtain the Dirac Hamiltonian
$$\pi = p - \frac{e}{c}A, \qquad (2.34)$$
and I is a 4×4 unit matrix. The general analysis of solutions of the corresponding time-dependent
and time-independent Dirac equations follows that presented in Section (2.1) for a free particle.
There are two particular cases which deserve a more detailed analysis. The first one follows
from assuming that
(2.36)
where the matrix I is defined in Eq. (2.11). The spectrum of the corresponding Dirac equations
consists of three regions: (i) a continuum of negative energy states extending from $-\infty$ to $-c^2$, (ii)
a continuum of positive energy states extending from $c^2$ to $\infty$, and (iii) a discrete spectrum of
states embedded in the gap between the two continua just at the bottom of the positive energy
continuum.
The problem of the negative energy continuum is resolved in the same way as for a free
electron. The negative energy continuum is assumed to be completely filled with electrons,
forming thus a reference vacuum. The positive energy continuum corresponds to energy levels
above the ionization potential of the hydrogen-like ion, i.e., it represents a free electron moving
in the field of the point-like positive charge. The discrete spectrum refers to discrete energy
levels of the hydrogen-like ion.
To gain some idea about solutions of the hydrogen-like Dirac problem let us note that the
usual angular momentum operator:
$$L = r \times p\, I, \qquad (2.37)$$
where I is a 4×4 unit matrix, does not commute with the Hamiltonian (2.36). However, the
operator:
(2.38)
where
$$\Sigma = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma \end{pmatrix} \qquad (2.39)$$
does commute with (2.36) and its components satisfy all commutation rules for the angular
momentum operators:
$$[H, J_i] = 0, \qquad (2.41)$$
for $i = x, y, z$. Thus, the operator (2.38) can be interpreted as the total angular momentum
operator for a relativistic particle moving in the central field. The eigenequation for $J^2$ is:
$$J^2 f(\theta, \varphi) = j(j+1)\, f(\theta, \varphi), \qquad (2.42)$$
with $j = l \pm \tfrac{1}{2}$, where $l = 0, 1, 2, \ldots$ is the usual angular momentum quantum number as known
from the non-relativistic theory. The eigenfunction $f(\theta, \varphi)$ denotes a 4-component spinor which
depends on spherical angles $\theta$ and $\varphi$.
To solve the time-independent Dirac equation for the hydrogen-like problem let us introduce
radial components $\alpha_r$ and $p_r$ for the $\alpha$ and $p$ operators:
$$\alpha_r = \frac{(\alpha r)}{r}, \qquad (2.43)$$
$$p_r = -i\left(\frac{\partial}{\partial r} + \frac{1}{r}\right). \qquad (2.44)$$
With the aid of these operators the Dirac Hamiltonian H for the hydrogen-like problem can be
written as:
$$H = c\,\alpha_r p_r + \frac{ic}{r}\alpha_r\beta K + \beta c^2 - \frac{Z}{r} I, \qquad (2.45)$$
where the operator
$$K = \beta[(\Sigma L) + 1] \qquad (2.46)$$
commutes with $\beta$, $J$, and H of Eq. (2.36) and can be used to classify the spectrum of H. Let
us note that:
(2.47)
Thus,
where
and the radial part of the Dirac Hamiltonian (2.45) can be written in the following form:
$$H = c\,\alpha_r p_r + \frac{ic}{r}\alpha_r\beta\kappa + \beta c^2 - \frac{Z}{r} I. \qquad (2.51)$$
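The solution of this equation results finally in the following form of components of the Dirac
spinor (2.15):
for $j = l + \tfrac{1}{2}$
$$\begin{aligned}
u_1 &= N_1^{+}\, g(r)\, Y_{l,\,m_j-\frac{1}{2}}(\theta,\varphi) \\
u_2 &= -N_2^{+}\, g(r)\, Y_{l,\,m_j+\frac{1}{2}}(\theta,\varphi) \\
u_3 &= -iN_3^{+}\, f(r)\, Y_{l+1,\,m_j-\frac{1}{2}}(\theta,\varphi) \\
u_4 &= -iN_4^{+}\, f(r)\, Y_{l+1,\,m_j+\frac{1}{2}}(\theta,\varphi)
\end{aligned} \qquad (2.52)$$
for $j = l - \tfrac{1}{2}$
$$\begin{aligned}
u_1 &= N_1^{-}\, g(r)\, Y_{l,\,m_j-\frac{1}{2}}(\theta,\varphi) \\
u_2 &= N_2^{-}\, g(r)\, Y_{l,\,m_j+\frac{1}{2}}(\theta,\varphi) \\
u_3 &= -iN_3^{-}\, f(r)\, Y_{l-1,\,m_j-\frac{1}{2}}(\theta,\varphi) \\
u_4 &= -iN_4^{-}\, f(r)\, Y_{l-1,\,m_j+\frac{1}{2}}(\theta,\varphi)
\end{aligned} \qquad (2.53)$$
where $N_i^{+}$ and $N_i^{-}$, $i = 1, 2, 3, 4$, denote numerical normalization factors, and f(r) and g(r) solve
the radial Dirac equation for the given value of the quantum number $\kappa$ of Eq. (2.50).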
The radial equation with the Hamiltonian (2.51) can be solved exactly for functions g(r)
and f(r) in terms of confluent hypergeometric functions. These solutions depend on quantum
numbers n and $j = l \pm \tfrac{1}{2}$ and can be expressed as a product of a decaying exponential function
of r multiplied by a terminating power series in r. Although only three quantum numbers are
needed to fully determine the state of the electron in the hydrogen-like ion, the eigenfunctions of
the corresponding Dirac equation are usually characterized by the following set of four quantum
numbers:
(i) The principal quantum number $n = N + |\kappa| = 1, 2, \ldots$.
(ii) The azimuthal quantum number, $l = 0, 1, 2, \ldots, n-1$, whose value is usually identified by
alphabetic symbols s, p, d, ....
(iii) The total angular momentum quantum number, $j = l \pm \tfrac{1}{2}$, $j > 0$, whose value is given as
a subscript to the alphabetic state symbol.
(iv) The magnetic quantum number, $m_j = -j, -j+1, \ldots, j-1, j$.
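The labelling scheme (i)-(iv) can be made concrete by enumerating the subshells of a given principal quantum number. The sketch below uses the relation between $\kappa$, $l$ and $j$ of Eq. (2.57), i.e., $\kappa = -(l+1)$ for $j = l + \tfrac{1}{2}$ and $\kappa = l$ for $j = l - \tfrac{1}{2}$; the function names and the letter table are illustrative assumptions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
LETTERS = "spdfghik"  # spectroscopic symbols for l = 0, 1, 2, ... (valid for n <= 8)


def kappa(l, j):
    """kappa = -(l + 1) for j = l + 1/2 and kappa = l for j = l - 1/2."""
    return -(l + 1) if j == l + HALF else l


def subshells(n):
    """Return a list of (label, l, j, kappa, degeneracy) for principal number n."""
    shells = []
    for l in range(n):
        for j in (l - HALF, l + HALF):
            if j > 0:
                label = f"{n}{LETTERS[l]}{j.numerator}/{j.denominator}"
                shells.append((label, l, j, kappa(l, j), int(2 * j + 1)))
    return shells


for n in (1, 2, 3):
    for label, l, j, k, degeneracy in subshells(n):
        print(f"{label:7s} l = {l}  kappa = {k:2d}  number of m_j values = {degeneracy}")
```

Summing the numbers of $m_j$ values over all subshells of a given n gives $2n^2$, the same count of states as in the non-relativistic theory with spin.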
The existence of normalizable radial solutions leads to the following expression for the energy of
discrete states:
$$E = mc^2\left[1 + \left(\frac{Z/c}{N + \sqrt{\kappa^2 - Z^2/c^2}}\right)^2\right]^{-\frac{1}{2}} \qquad (2.54)$$
where $N = n - |\kappa| = 0, 1, 2, \ldots$, n plays the role of the non-relativistic principal quantum number, and $\kappa$ is
given by Eq. (2.50). On expanding Eq. (2.54) into a series of inverse powers of c the following
result is obtained:
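The closed formula and its expansion can be compared numerically. The sketch below evaluates the energy of a discrete state relative to $mc^2$ (with m = 1, atomic units) and compares it with the first two terms of the standard expansion, $-Z^2/2n^2 - (Z^4/2c^2n^4)(n/|\kappa| - 3/4)$. The expansion is quoted here as the well-known fine-structure result, since the corresponding equation is not legible in these notes; the function names are hypothetical.

```python
import math

C = 137.035999  # speed of light in atomic units


def dirac_energy(n, kappa, Z, c=C):
    """Energy relative to the rest energy c^2 (hartree), from the closed formula."""
    N = n - abs(kappa)
    gamma = math.sqrt(kappa ** 2 - (Z / c) ** 2)
    total = c ** 2 / math.sqrt(1.0 + (Z / c) ** 2 / (N + gamma) ** 2)
    return total - c ** 2


def expanded_energy(n, kappa, Z, c=C):
    """Non-relativistic energy plus the leading (order 1/c^2) correction."""
    e0 = -Z ** 2 / (2.0 * n ** 2)
    e2 = -Z ** 4 / (2.0 * c ** 2 * n ** 4) * (n / abs(kappa) - 0.75)
    return e0 + e2


states = {"1s1/2": (1, -1), "2s1/2": (2, -1), "2p1/2": (2, 1), "2p3/2": (2, -2)}
for Z in (1, 80):
    for label, (n, k) in states.items():
        print(f"Z = {Z:2d}  {label}:  exact = {dirac_energy(n, k, Z):.8f}"
              f"   expansion = {expanded_energy(n, k, Z):.8f}")
```

For Z = 1 the two columns agree closely, while for heavy ions the truncated expansion is visibly less accurate. Since only $|\kappa|$ enters the formula, the $2s_{1/2}$ and $2p_{1/2}$ levels coincide, and the $2p_{3/2}$ level lies above them.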
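$$l = \begin{cases} -\kappa - 1 = j - \tfrac{1}{2} & \text{if } \kappa < 0 \\ \kappa = j + \tfrac{1}{2} & \text{if } \kappa > 0 \end{cases}, \qquad (2.57)$$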
and related to j through:
(2.58)
Finally, $m = m_j$ is one of the possible magnetic quantum numbers for the given value of j. The
radial functions $G_{n\kappa}(r)$ and $F_{n\kappa}(r)$ are the same for both components of the given 2-component
spinor $\Omega$. The 2-component spinors $\Omega$ are defined through the coupling between orbital and
spin momenta:
$$\Omega_{\kappa m}(\theta,\varphi) = \sum_{m_s=-\frac{1}{2}}^{+\frac{1}{2}} \left(l\; m-m_s\; \tfrac{1}{2}\; m_s \,\middle|\, l\; \tfrac{1}{2}\; j\; m\right) Y_{l,\,m-m_s}(\theta,\varphi)\, A_{m_s}, \qquad (2.59)$$
where the products under the summation sign consist of the Clebsch-Gordan coefficients
$(l\; m-m_s\; \tfrac{1}{2}\; m_s \,|\, l\; \tfrac{1}{2}\; j\; m)$, the usual spherical harmonics $Y_{l,\,m-m_s}(\theta,\varphi)$, and 2-component spinors:
(2.60)
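For the coupling of an orbital momentum l with spin 1/2 the Clebsch-Gordan coefficients have simple closed forms, so the two terms of the sum can be written down explicitly. The sketch below assumes the usual Condon-Shortley phase convention, which may differ by a sign from the convention of a particular textbook; the function name is hypothetical.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_angular_coefficients(l, j, m):
    """Return {m_s: (l, m - m_s; 1/2, m_s | j, m)} for j = l + 1/2 or j = l - 1/2.

    Assumption: Condon-Shortley phases. The arguments j and m are Fractions.
    """
    up = (l + HALF + m) / (2 * l + 1)
    down = (l + HALF - m) / (2 * l + 1)
    if j == l + HALF:
        return {+HALF: math.sqrt(up), -HALF: math.sqrt(down)}
    if j == l - HALF:
        return {+HALF: -math.sqrt(down), -HALF: math.sqrt(up)}
    raise ValueError("j must equal l + 1/2 or l - 1/2")


l, m = 1, Fraction(1, 2)
for j in (l + HALF, l - HALF):
    coefficients = spin_angular_coefficients(l, j, m)
    norm = sum(value ** 2 for value in coefficients.values())
    print(f"j = {j}:  m_s = +1/2 -> {coefficients[+HALF]:+.4f},"
          f"  m_s = -1/2 -> {coefficients[-HALF]:+.4f},  norm = {norm:.4f}")
```

The two coefficient pairs for the same l and m are normalized and mutually orthogonal, as required for the spinors belonging to $j = l + \tfrac{1}{2}$ and $j = l - \tfrac{1}{2}$.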
The 4-component spinors of the form (2.56) are used to build basis sets of one-electron functions
in relativistic calculations for atoms and molecules. In this context one should note
(see Eqs. (2.21) and (2.22)) that the small ($u^S = F_{n\kappa}(r)\Omega_{-\kappa,m}(\theta,\varphi)$) and the large
($u^L = G_{n\kappa}(r)\Omega_{+\kappa,m}(\theta,\varphi)$) components of the Dirac spinor (2.56) are mutually related:
$$u^S = c\,\frac{1}{c^2 + \phi + E}\,(\sigma p)\,u^L. \qquad (2.61)$$
This relation should be satisfied also in the case of the approximate form of the large and small
components which is obtained, e.g., by the truncated basis set expansion. It is only then that
the usual Schrodinger equation can be derived in the non-relativistic limit. To obtain this limit
one has to use the identity (2.26), which brings the non-relativistic kinetic energy term. This
requirement is usually referred to as the kinetic energy balance condition and will be satisfied in
approximate calculations if the small component basis set is appropriately related to that used
for the large component.
A detailed treatment of the Dirac equation for hydrogen-like ions can be found in several
quantum mechanics textbooks and monographs (see References). A useful qualitative presenta-
tion of the relativistic theory for hydrogen-like ions, including graphs of relativistic functions,
has been given by Powell (see References).
$$\begin{aligned}
(c^2 - \phi - E)\,u^L + c(\sigma\pi)\,u^S &= 0 \\
c(\sigma\pi)\,u^L - (c^2 + \phi + E)\,u^S &= 0
\end{aligned} \qquad (2.62)$$
where we used atomic units (m = 1, e = -1) for the mass and charge of the electron. From the
second of these equations one finds
(2.63)
(2.64)
or
$$\left[-\phi - E + c^2(\sigma\pi)\frac{1}{2c^2 + \phi + E}(\sigma\pi)\right]u^L = 0, \qquad (2.65)$$
where the relative energy value E of Eq. (2.24) is used. After expanding the last term in square
brackets into a power series with respect to $c^{-2}$ and using:
$$\pi = p + \frac{1}{c}A, \qquad (2.66)$$
one obtains the non-relativistic limit through terms of the order of $c^{-2}$:
_~V24> (Darwin)
The term referred to in (2.67) as the 'spin-external magnetic' interaction contribution has the
form of the interaction $(-\mu B)$ between the external magnetic field B and the magnetic moment $\mu$,
and brings about the interpretation of the electron spin in terms of the magnetic moment of the
electron.
For the potential </> arising from a point-like nucleus of the charge Z:
and
r
E= - ,.3' (2.70)
where $\delta(r)$ is the Dirac delta function. Thus, the spin-orbit and Darwin terms become:
(2.71)
and
$$H_D = \frac{Z\pi}{2c^2}\,\delta(r), \qquad (2.72)$$
where l is the angular momentum operator with respect to the origin at the nucleus. The spin-orbit
term $H_{SO}$ gives no contribution to energy for s states (l = 0). Additionally, if H = 0, the
only remaining relativistic corrections to the non-relativistic Hamiltonian (through the order of
$c^{-2}$) are the mass-velocity $H_{mv}$ and Darwin $H_D$ terms. Numerically, they constitute the largest
corrections to non-relativistic energies. However, they are close in magnitude and of opposite
sign.
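The near cancellation of the two terms is easily illustrated for the 1s state of a hydrogen-like ion. The sketch below assumes the usual Pauli forms $H_{mv} = -p^4/8c^2$ and $H_D = (Z\pi/2c^2)\delta(r)$ in atomic units, together with the well-known 1s expectation values $\langle p^4 \rangle = 5Z^4$ and $|\psi_{1s}(0)|^2 = Z^3/\pi$; these inputs and the function names are assumptions of the example rather than results derived above.

```python
import math

C = 137.035999  # speed of light in atomic units


def mass_velocity_1s(Z, c=C):
    """First-order mass-velocity correction, -<p^4>/(8 c^2) with <p^4> = 5 Z^4."""
    return -5.0 * Z ** 4 / (8.0 * c ** 2)


def darwin_1s(Z, c=C):
    """First-order Darwin correction, (Z pi / 2 c^2) |psi(0)|^2 with |psi(0)|^2 = Z^3/pi."""
    return (math.pi * Z / (2.0 * c ** 2)) * (Z ** 3 / math.pi)


def dirac_shift_1s(Z, c=C):
    """Exact Dirac 1s energy (relative to c^2) minus the non-relativistic value -Z^2/2."""
    return c ** 2 * (math.sqrt(1.0 - (Z / c) ** 2) - 1.0) + Z ** 2 / 2.0


for Z in (1, 10, 40):
    e_mv, e_d = mass_velocity_1s(Z), darwin_1s(Z)
    print(f"Z = {Z:2d}  E_mv = {e_mv:.6e}  E_D = {e_d:.6e}"
          f"  sum = {e_mv + e_d:.6e}  exact shift = {dirac_shift_1s(Z):.6e}")
```

The Darwin correction amounts to four fifths of the mass-velocity correction and has the opposite sign, so that the net shift, $-Z^4/8c^2$, is much smaller than either term; for light ions it agrees well with the shift obtained from the exact Dirac energy.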
The recognition of the relatively large magnitude of the correction terms $H_{mv}$ and $H_D$ has led to
defining an approximate quasi-relativistic energy operator $H_{mvD}$:
(2.73)
This operator involves two major relativistic terms while retaining the usual non-relativistic
1-component form of solutions of the corresponding eigenvalue problem. On adding to (2.73)
the $H_{SO}$ operator one obtains a 2-dimensional mvD+SO Hamiltonian $H_{mvD+SO}$,
(2.74)
where
(3.2)
$$\phi_i = -\sum_{A=1}^{N}\frac{Z_A}{r_{iA}}, \qquad (3.3)$$
$r_{iA}$ is the distance between the i-th electron and nucleus A of the charge $Z_A$, and $r_{ij}$ is the
distance between electrons i and j. The number of nuclei in the system is assumed to be N.
It is worthwhile to note that both the $\alpha$ and $\beta$ matrix operators are labelled by the electron
reference number i. Although they have the same form, they will act on different 4-component
spinors.
The operator (3.1) is referred to as the Dirac-Coulomb ($H_{DC}$) Hamiltonian and represents the
lowest order approximation to the electron-electron interactions. In spite of that, the Dirac-Coulomb
Hamiltonian underlies the majority of 4-component relativistic calculations in quantum chemistry.
However, the neglect of relativistic contributions to the electron-electron interactions
brings about some fundamental problems. The 2-electron Dirac-Coulomb equation:
(3.4)
can be shown to have no bound states. Thus, there is no protection against the variational
collapse into negative energy states. The ill-conditioned form of the Dirac-Coulomb Hamiltonian
has been first recognized by Brown and Ravenhall and is usually termed as the 'Brown-Ravenhall
disease'. This follows from the fact that a bound state of two non-interacting Dirac electrons,
i.e., $\psi(1)\psi(2)$, is degenerate with a continuum of non-normalizable states having one electron
in the positive energy state and another one in the negative energy state. When the Coulomb
interaction is included, the initial wave function of non-interacting electrons gains contributions
from all those continuum states and becomes 'dissolved in continuum'. Until recently not too
much attention has been paid to this problem. The recent interest in avoiding the Brown-
Ravenhall disease has been pioneered by Sucher and followed by numerical studies of Hess
et al. However, one should remark that, in spite of its rather obscure physical meaning and
mathematical features, the Dirac-Coulomb Hamiltonian underlies the majority of relativistic
techniques in quantum chemistry.
The relativistic form of the two-electron interaction operator was first discussed by Breit,
leading to an expression which is relativistically correct through the order of $c^{-2}$.
The corresponding interaction
operator can be approximately derived from quantum electrodynamics and is known as the Breit
interaction operator $V_B(i,j)$:
$$V_B(i,j) = \frac{1}{r_{ij}} - \frac{1}{2}\left[\frac{\alpha_i\alpha_j}{r_{ij}} + \frac{(\alpha_i r_{ij})(\alpha_j r_{ij})}{r_{ij}^3}\right]. \qquad (3.5)$$
By substituting in Eq. (3.1) the Coulomb interaction operator by its relativistic extension (3.5)
one obtains the so-called Dirac-Breit many-electron Hamiltonian $H_{DB}$:
$$H_{DB} = \sum_{i=1}^{n} H_D(i) + \sum_{i<j} V_B(i,j). \qquad (3.6)$$
This replacement, however, does not remove the 'Brown-Ravenhall disease' problem. Moreover,
the Dirac-Breit Hamiltonian is derived perturbationally and there may be some objections
against its use in variational calculations. Thus, it is frequently suggested that the Breit
correction to the Coulomb interaction should be considered in the perturbation framework and
evaluated as the first-order contribution to the energy which follows from $H_{DC}$. In the context
of the Breit operator (3.5) one should also mention its approximate form known as the Gaunt
interaction $V_G$:
$$V_G(i,j) = \frac{1}{r_{ij}} - \frac{\alpha_i\alpha_j}{r_{ij}}, \qquad (3.7)$$
$$H_{DG} = \sum_{i=1}^{n} H_D(i) + \sum_{i<j} V_G(i,j). \qquad (3.8)$$
In recent years considerable attention has been given to modifications of the Dirac-Coulomb
Hamiltonian which remove the 'Brown-Ravenhall disease' problem. The continuum dissolution
can be avoided by projecting out the relevant part of the Coulomb interaction operator. This
leads to the eigenvalue equation of the form:
where
(3.10)
(3.11)
with
(3.12)
n
projects onto the space of positive energy solutions for $H_D$. The derivation of this so-called
'no-pair Hamiltonian' neglects all effects related to the creation of virtual electron-positron pairs.
Also the effects of virtual photons are neglected. Moreover, Eq. (3.10) can be reduced to a single
component form, leading to what is called the 'spin-free no-pair' approximation.
Most of the problems arising from the choice of approximate many-electron Hamiltonians
can be resolved by using quantum electrodynamics in the so-called Furry's bound interaction
picture. It is quite pleasing to note that several equations used in relativistic quantum chemistry
can be derived as legitimate approximations to the proper field-theoretic treatment of the
many-electron problem. Among others this applies to the relativistic equivalent of the Hartree-Fock
scheme, i.e., the Dirac-Hartree-Fock method. Once the many-electron Hamiltonian is
chosen, the relativistic methods of quantum chemistry parallel those developed for solving the
Schrödinger equation.
Before closing this chapter let us briefly consider the extension of the non-relativistic limit
formulae of Section 2.2.2. In the Dirac-Breit approximation the additional terms arising through
the order of $c^{-2}$ are:
- "L.Ji<i 2CfPi
1 (!.!l!i.L
1, -;;; 1 ) Pj ( orbit - orbit)
- "L.Ji<i 1
~O'i (r.,:" -;::rI,
r
'J
1 ) O'j (dipole spin - spin)
(3.13)
The first four operators correspond to interactions between orbital and/or spin magnetic mo-
ments of electrons. Together with the SO operator of Eq. (2.71) they are responsible, e.g., for
the so-called fine structure in atomic spectra. The corresponding effects on energies and wave
functions are usually evaluated by means of the perturbation treatment based on solutions of
the non-relativistic many-electron problem.
$$\psi_k(i) = \begin{pmatrix} u_k^L(i) \\ i\,u_k^S(i) \end{pmatrix}, \qquad (3.15)$$
where $u_k^L$ and $u_k^S$ are the corresponding large and small components, respectively. In the present
case both of them are real. In order to determine the optimal set of 4-component spinors one
follows the same route as in the case of the non-relativistic Hartree-Fock method. For each of
the n-body Hamiltonians of Section 3.1 one can define the energy functional:
$$E = \frac{\langle\Psi(1,2,\ldots,n)|H(1,2,\ldots,n)|\Psi(1,2,\ldots,n)\rangle}{\langle\Psi(1,2,\ldots,n)|\Psi(1,2,\ldots,n)\rangle}, \qquad (3.16)$$
where in the simplest case $\Psi(1,2,\ldots,n)$ is a single Slater determinant,
(3.17)
(3.18)
(3.19)
where $\varepsilon_k$ is the orbital energy associated with the k-th spinor. For the two-body Coulomb
potential one has:
(3.20)
(3.21)
are the Coulomb and exchange operators, respectively. They are defined over 4-component
spinors and the integration comprises both the usual integration over space coordinates of the
electron and the summation over products of components, e.g.,
(3.22)
where $u_{i,k}$ is the i-th component of the $\psi_k$ spinor of the form given by Eq. (2.15). For other
two-body potentials, e.g., those which follow from the Breit or Gaunt interaction Hamiltonians,
the $r_{12}^{-1}$ operator should be replaced by its appropriate counterpart.
The derivation of Dirac-Hartree-Fock equations follows that known from the non-relativistic
theory. The same applies to their derivation for open-shell and multiconfiguration cases. However,
the use of the variation approach to determine one-particle spinors is a little problematic
since the Dirac Hamiltonian is not bounded from below. Hence, the variation method should be
rather used as a tool to determine the stationary point of the energy functional. The solution
of equations which result from the variation of E is therefore being usually restricted to the
positive energy region and the lowest positive energy eigenvalues are assumed to correspond to
occupied levels. In the iterative approach to the solution of one-electron equations (3.19) the n
lowest (positive) energy spinors are used to build the Coulomb and exchange operators for the
subsequent iteration. In order to avoid spurious solutions one should take care that the kinetic
energy balance condition (see Eq. (2.61)) is satisfied.
where $\phi_{nucl}$ is given by Eq. (3.3) and $\phi_{xc}$ is the so-called exchange-correlation potential. The
1-particle density $\rho$ is calculated from occupied single-particle solutions of (3.23). The usefulness
and success of the density functional method depends on the choice of the unknown exchange-correlation
potential. The most common choice is based on the exchange-correlation energy
formula for a homogeneous electron gas and gives the so-called local density approximation
to the density functional theory. There is a variety of different forms of $\phi_{xc}$ used in density
functional methods. The best known is the one proposed by Slater:
(3.25)
and usually referred to as the X$\alpha$-potential; the constant $\alpha$ being an adjustable parameter. The
local density functional approximation with the exchange-correlation potential (3.25) is known
as the Hartree-Fock-Slater method.
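As an illustration, the local X$\alpha$ potential can be evaluated directly from the density. The sketch below uses the commonly quoted form $-\tfrac{3}{2}\alpha(3\rho/\pi)^{1/3}$ (in hartree, with $\rho$ in electrons per cubic bohr); this form is an assumption of the example, since the corresponding equation is not legible in these notes, and the default $\alpha = 0.7$ is merely a typical value of the adjustable parameter.

```python
import math


def x_alpha_potential(rho, alpha=0.7):
    """Local X-alpha potential, assumed form -(3/2)*alpha*(3*rho/pi)**(1/3).

    rho is the electron density in atomic units; alpha is the adjustable
    parameter (alpha = 1 corresponds to Slater's original averaged exchange,
    alpha = 2/3 to the exchange potential of the homogeneous electron gas).
    """
    return -1.5 * alpha * (3.0 * rho / math.pi) ** (1.0 / 3.0)


for rho in (0.001, 0.01, 0.1, 1.0):
    print(f"rho = {rho:6.3f}   V(alpha = 2/3) = {x_alpha_potential(rho, 2.0 / 3.0):.4f}"
          f"   V(alpha = 1) = {x_alpha_potential(rho, 1.0):.4f}")
```

The potential depends only on the local value of the density and scales as $\rho^{1/3}$, which is what makes this family of methods computationally inexpensive.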
A relativistic form of the density functional theory is rather obvious. The one-electron oper-
ator in Eq. (3.23) is to be replaced by the Dirac energy operator with the effective potential
which follows from the 1-particle density as given by Eq. (2.16), with 4-component spinor orbitals
obtained from the relativistic counterpart of Eq. (3.23). Computational techniques based on this
formalism are usually referred to as the Dirac-Slater or Dirac-Hartree-Fock-Slater methods.
Although both non-relativistic and relativistic density functional theories can be given a
sound formal background, the way they are used makes them into approximate techniques. The
relativistic density functional methods evidently provide computational tools for handling the
relativistic effects in many-electron systems. However, their success depends on the choice of
the exchange-correlation potential. Thus, in practice all density functional methods rely on a
posteriori validation of their results.
4. Computational aspects
The so-called basis set expansion methods are primarily used to determine approximate single-
particle solutions for equations discussed in Section 3. Once these are known the electron
correlation effects can be evaluated by using either CI or MBPT-type methods in essentially the
same way as in the non-relativistic case. In the case of atomic structure calculations the basis
set expansion methods do not seem to be competitive, at least at the level of the Dirac-Hartree-
Fock approximation, to numerical integration techniques. One of the main purposes of a variety
of atomic programs based on the analytic expansion methods appears to be the development of
relativistic atomic basis sets to be used in molecular calculations.
The early truncated basis set (algebraic) calculations for molecules have proven to be a failure
because of the so-called variational collapse problem. This has been remedied later on by
recognizing the importance of the kinetic energy balance which forces a fixed relation between
basis functions used for the large and small components. With the problems occurring in numerical
integration methods for polyatomic molecules the use of the algebraic approximation appears to
be at present the only way to extend the relativistic calculations beyond atoms and diatomics.
There is an increasing number of available Dirac-Hartree-Fock molecular programs which use
the basis set expansion methods. Moreover, in recent years the algebraic methods have been
extended beyond the single-particle level of approximation by devising the relativistic CI and
MBPT techniques.
The algebraic approach requires that certain integrals involving basis functions and operators
of relativistic Hamiltonians are calculated fast enough. This imposes some additional conditions
on the choice of basis sets and expansions in terms of Gaussian functions are rather routine
in molecular calculations. They have been also successfully used in relativistic treatment of
atoms. Without taking into account any symmetries the 4-component spinor solving the given
one-particle equation, e.g., the Dirac-Hartree-Fock equations, is expanded into a finite set $\{\chi_\mu\}$
of Gaussians $\chi_\mu$ which are usually centred at atomic nuclei. In principle the same basis set can
be used to expand both the large (Eq. 3.15) and the (real) small components, i.e.,
(4.1)
and
(4.2)
respectively. However, from Eq. (2.61), which in the present case assumes the following form:
(4.3)
where $\phi$ stands for the appropriate one-electron potential, one learns that the expansions (4.1)
and (4.2) are not independent. The relation (4.3) will be obviously satisfied in the limit of a
complete set of expansion functions $\chi_\mu$. However, its violation for finite basis sets may have
serious consequences.
Let us note that (4.3) is needed to obtain a proper non-relativistic limit of the considered
single-particle relativistic model. In particular, one finds (see e.g. Eqs. (2.25), (2.26)) that
must be equal to
(4.5)
This will be satisfied only if the basis set $\{\chi_\mu\}$ is large and rich enough that it contains functions
generated from $\chi_\mu$ by the $\sigma p$ operator. In principle such a basis set must be complete.
The equivalence between (4.4) and (4.5) is termed the kinetic balance condition and can
be satisfied by choosing different basis sets for the representation of the large and small
components. From the analysis of the non-relativistic limit of the Dirac equation we know that the
large component will approach the solution of the non-relativistic counterpart of the considered
relativistic model. This gives a plausible recipe for choosing a basis set, say $\{\chi^L_\mu\}$, for the
expansion of the large component. The small component basis set can then be generated by:
(4.6)
There are obvious practical limitations to this procedure and usually Eq. (4.6) is only a guiding
principle for the choice of the small component basis set. Then, after selecting the large and
small component sets one combines them into a single set of functions to be used for expanding
the 4-component spinor. The choice of the basis set for representing the small component can
also be guided by atomic relativistic calculations. For the use in molecular calculations one can
devise a set of functions which comprises the usual non-relativistic atomic basis sets and atomic
small component sets. This leads to what is known as atomic balanced sets.
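The way in which the $\sigma p$ operator enlarges a basis set can be seen by differentiating Cartesian Gaussian primitives $x^a y^b z^c \exp(-\zeta r^2)$. The sketch below assumes that the small component functions are generated from the large component ones by $(\sigma p)$, i.e., by the derivatives with respect to x, y and z, and simply lists every primitive that appears; the representation of a primitive as a tuple of powers and an exponent, and the function names, are hypothetical.

```python
def derivative(powers, zeta, axis):
    """Differentiate x^a y^b z^c exp(-zeta r^2) with respect to one Cartesian axis.

    Returns {powers: coefficient}; the Gaussian factor exp(-zeta r^2) is common
    to all terms and is therefore not stored.
    """
    terms = {}
    n = powers[axis]
    if n > 0:
        lower = list(powers)
        lower[axis] -= 1
        terms[tuple(lower)] = float(n)      # from differentiating the monomial
    upper = list(powers)
    upper[axis] += 1
    terms[tuple(upper)] = -2.0 * zeta       # from differentiating the Gaussian
    return terms


def small_component_primitives(large_basis):
    """Collect all primitives (powers, zeta) produced by d/dx, d/dy and d/dz."""
    small = set()
    for powers, zeta in large_basis:
        for axis in range(3):
            for new_powers in derivative(powers, zeta, axis):
                small.add((new_powers, zeta))
    return sorted(small)


large = [((0, 0, 0), 1.5), ((1, 0, 0), 0.8)]  # one s-type and one p_x-type primitive
for powers, zeta in small_component_primitives(large):
    print(f"powers = {powers}   l = {sum(powers)}   zeta = {zeta}")
```

An s-type function generates three p-type functions, while a p-type function generates both an s-type and d-type functions with the same exponent. Listing the primitives individually, as done here, gives a larger set than the fixed linear combinations produced by $(\sigma p)$ itself, which is one reason why small component basis sets grow so quickly.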
The basis sets to be used in relativistic calculations are obviously much larger than those
employed in non-relativistic cases. The reduction of primitive sets can be achieved by using
contraction methods in either segmented or generalized forms. Once the kinetic balance con-
dition is satisfied, then the variational collapse problem usually disappears. The use of the
point-like nuclei leads sometimes to convergence problems in iterative solutions of algebraic
Dirac-Hartree-Fock equations. This can be remedied by using a finite-size nucleus approximation.
The final results are essentially independent of the assumed (small) nuclear radius.
The algebraic methods for solving one-particle equations in relativistic theories are of particular
importance for computational techniques which go beyond the one-electron model. In
addition to spinors which are occupied by electrons in the given electronic configuration they
provide a set of spurious solutions (virtual spinors). These can be used to build other config-
urations in CI-like schemes or can provide an approximation to complete set of one-particle
eigenstates to be used in many-body techniques.
In present molecular applications of the algebraic form of the Dirac-Hartree-Fock method
most calculations are carried out with Gaussian basis sets. The major convenience of such
basis sets is a relatively easy calculation of molecular relativistic integrals. In this context one
should remark that relativistic calculations, in spite of certain symmetries between integrals, are
far more size-demanding than their non-relativistic counterparts. The size of relativistic basis
sets makes the storage problem quite serious. This may explain several attempts to simplify
fully relativistic approaches either by using quasirelativistic approximations or by reducing the
number of explicitly considered electrons in the framework of the pseudopotential techniques.
one-electron wave functions is to be called quasirelativistic. In this sense all methods based
on the 2-component Pauli formalism or 1-component non-relativistic wave functions are to be
termed quasirelativistic approaches. Though quasirelativistic, these methods may recover a
variety of relativistic effects on energies and related properties of many-electron systems.
Most quasirelativistic approaches can be derived directly from relativistic schemes by using either a rigorous or an approximate perturbation treatment and retaining only certain, supposedly
largest, contributions. In all derivations the first step consists of separating the large and small
components of the wave function. This can be achieved rigorously by applying a series of unitary transformations to the Dirac Hamiltonian. The corresponding method is known as the
Foldy-Wouthuysen transformation technique. The transformed Hamiltonian truncated at
order $c^{-2}$ is equivalent to the Pauli Hamiltonian for one-electron systems and includes the mass-velocity, Darwin, and spin-orbit terms (see Section 2.2.2). In higher orders with respect to $c^{-2}$
the transformed Foldy-Wouthuysen Hamiltonian becomes strongly singular and cannot be used
in numerical calculations.
The 2-component Pauli approximation is a typical example of a quasirelativistic approach.
In the case of many-electron systems it usually leads to the so-called Pauli Hamiltonian which
reads:
(4.7)
where $H^S$ is the usual non-relativistic (Schrödinger) Hamiltonian for the given many-electron
system and
(4.9)
and
(4.10)
are the many-electron generalizations of operators given in Section 2.2.2. The one-electron
angular momentum operator $\mathbf{l}_{iA}$ of the i-th electron is defined with respect to the origin at
the nucleus A and $r_{iA}$ is the length of $\mathbf{r}_{iA} = \mathbf{r}_i - \mathbf{A}$. The many-electron Pauli operator can
be either completed with two-electron terms which result from the Breit operator (through
order $c^{-2}$; see (3.13)) or further reduced to the 1-component approximation by removing the SO
term.
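The displayed equations of this passage did not survive in the present copy. As a hedged reminder only, the form in which the Pauli Hamiltonian is commonly quoted (atomic units, nuclear charges $Z_A$) is sketched below; the symbols $H_{mv}$, $H_D$ and $H_{SO}$, the prefactors, and the ordering of the terms are taken from the standard literature form and are an assumption about what the original equations contained.

```latex
% Standard textbook form, given as an assumption; not recovered from the source.
\begin{align}
H^{P}   &= H^{S} + H_{mv} + H_{D} + H_{SO}, \\
H_{mv}  &= -\frac{1}{8c^{2}} \sum_{i} p_{i}^{4}, \\
H_{D}   &= \frac{\pi}{2c^{2}} \sum_{i}\sum_{A} Z_{A}\,\delta(\mathbf{r}_{iA}), \\
H_{SO}  &= \frac{1}{2c^{2}} \sum_{i}\sum_{A} Z_{A}\,
           \frac{\mathbf{s}_{i}\cdot\mathbf{l}_{iA}}{r_{iA}^{3}} .
\end{align}
```

Dropping $H_{SO}$ leaves the mass-velocity and Darwin terms, which is the 1-component reduction mentioned in the text.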
The Pauli Hamiltonian completed with two-electron terms is used in quasirelativistic atomic
calculations for the evaluation of the effect of different magnetic interactions. In most cases the
corresponding calculations are carried out perturbationally by using non-relativistic zeroth-order
wave functions in the LS coupling scheme. The perturbation evaluation of the effect of $H_{SO}$ on
spin-forbidden transitions is a typical example of such applications.
By reducing the Pauli Hamiltonian to the 1-component form one retains the two largest corrections to non-relativistic energies. As long as the magnetic interactions are of little importance
(e.g. closed shell systems) the 1-component quasirelativistic approximation works unexpectedly
well. This approach was first used by Cowan and Griffin in calculations of atomic energy
levels. More recently the 1-component quasirelativistic scheme has been applied to the evaluation of relativistic corrections to a variety of atomic and molecular properties. Taking into account
several fundamental objections concerning the most advanced many-electron relativistic Hamiltonians, the numerical problems occurring in relativistic molecular calculations, and the effort
and costs involved, the simplest 1-component approximation for relativistic effects is certainly
worth pursuing.
On the basis of recent numerical results, the most promising appears to be the quasirelativistic
'spin-free no-pair' approximation based on Sucher's projected Hamiltonian (3.10). Within the
1-component framework this method includes relativistic effects on the electron-electron interaction and provides a tool for studying the interplay between relativity and correlation effects.
In the context of quasirelativistic methods one should also mention recent progress in explicit perturbation evaluation of relativistic effects. By a certain modification of the metric for
4-component spinors one can devise a perturbation expansion which is based on non-relativistic
solutions and avoids the singularities of the Foldy-Wouthuysen transformation method. In first
order with respect to $c^{-2}$ this method gives essentially the first-order result of the Pauli approximation. However, numerical studies of Rutkowski and Kutzelnigg indicate that it is much less
sensitive to the approximate character of the non-relativistic reference function.
Finally, we shall place in this Section also the methods based on ad hoc one-electron potentials
which are supposed to simulate the effect of true relativistic contributions. Such methods are
used mainly in the framework of the spin-less 1-component approximation and are usually
restricted to the area of atomic calculations.
ods are worth developing. The corresponding relativistic methods are indispensable
in the study of heavy-metal compounds and heavy-metal clusters.
5. References
The fundamentals of relativistic quantum mechanics can be found in a number of textbooks
and monographs [1 - 3]. Some interpretational problems of relativistic quantum mechanics are
qualitatively discussed by Powell [4] who also gives plots of spinor orbitals for hydrogen-like
ions. The problems of the relativistic theory of many-electron systems are well surveyed in
review articles of Sucher [5] and Grant [6,7] where references to other papers of interest are
given. Both formal and computational aspects of the atomic relativistic theory are covered by
other reviews of Grant [8]. Desclaux [9] has published a compilation of hydrogenic relativistic
wave functions and related properties for Z from 1 through 120. The very promising projection
method of Sucher is described in his articles [5,13]. Implementations of this method can be
found in Refs. [14] and [15].
Several technical aspects of relativistic calculations with truncated basis sets are described
in articles by Clementi and co-workers [10 - 12]. The many-body techniques in relativistic
calculations have been reviewed by Wilson [18] and Quiney [19]. One should also mention papers
by Malli and his co-workers on the relativistic CI method [20]. Relativistic pseudopotential
methods are comprehensively described in review articles by Balasubramanian and Pitzer [21]
and by Gropen [22].
The perturbation technique for the solution of the Dirac equation has been proposed by
Rutkowski [23]. Different aspects of this approach are discussed by Kutzelnigg [24]. Some
illustration of the usefulness of the 1-component quasirelativistic scheme can be found in recent
papers by myself and my co-workers [25].
Of particular value are two review articles by Pyykkö [16,17] which give a broad historical
account of different relativistic methods and provide excellent illustration for chemical aspects of
relativity. Pyykkö has also compiled [26] nearly all 'relativistic' papers published in the period
1916-1985.
15. G. Jansen and B. A. Hess, Z. Phys. D 13, 363 (1989); G. Jansen and B. A. Hess, Chem.
Phys. Lett. 160, 507 (1989).
16. P. Pyykkö, Adv. Quantum Chem. 11, 353 (1978).
20. G. L. Malli and N. C. Pyper, Proc. Roy. Soc. London A 407, 377 (1986); A. F. Ramos,
N. C. Pyper, and G. L. Malli, Phys. Rev. A 38, 2729 (1988).
21. K. Balasubramanian and K. S. Pitzer, Adv. Chem. Phys. 67, 287 (1987).
22. O. Gropen, in: Methods in Computational Chemistry, vol. 2, ed. S. Wilson, Plenum Press,
New York 1988, p. 109.
23. A. Rutkowski, J. Phys. B: At. Mol. Phys. 19, 149, 3431, 3443 (1986); A. Rutkowski and
W. H. E. Schwarz, Theor. Chim. Acta 76, 391 (1990); A. Rutkowski, D. Rutkowska, and
W. H. E. Schwarz, Theor. Chim. Acta 84, 105 (1992).
24. W. Kutzelnigg, Z. Phys. D 11, 15 (1989); R. Franke and W. Kutzelnigg, Chem. Phys.
Lett. 199, 561 (1992).
25. V. Kellö and A. J. Sadlej, J. Chem. Phys. 93, 8122 (1990); A. J. Sadlej, J. Chem. Phys.
95, 2614 (1991); V. Kellö and A. J. Sadlej, J. Chem. Phys. 95, 8248 (1991); V. Kellö, A.
J. Sadlej, and K. Faegri, Jr., Phys. Rev. A 47, 1715 (1993).
26. P. Pyykkö, Relativistic Theory of Atoms and Molecules, Springer-Verlag, Berlin, 1986.
Exercises with solutions
Compiled by
Roland Lindh
Per-Ake Malmqvist
• Basis Sets
• Hartree-Fock
• Second Quantization
• Spin
• Geometrical Derivatives ...
• Density Functional Theory
• Coupled Cluster Theory
• Truncated CI ...
• Accurate Calculations
• MCSCF Theory
The three topics Basis Sets, Spin, and Truncated CI and Size Consistency, are free-standing. The others are exercises on the topics presented in other chapters of this book.
The MCSCF Theory exercises are not reprinted here, however; they appear in the MCSCF
Theory chapter of the first ESQC book ("Lecture Notes in Chemistry 58, Lecture Notes in
Quantum Chemistry, European Summer School in Quantum Chemistry", Springer-Verlag,
Heidelberg, 1992), and only the solutions are given here.
Exercises
1 Basis Sets
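A primitive Cartesian Gaussian basis function $G(\alpha, \mathbf{A}, l, m, n)$ centered at $\mathbf{A}$, with exponent $\alpha$ and quantum numbers $l, m, n$ is given by
$$G(\alpha, \mathbf{A}, l, m, n) = N(\alpha, \mathbf{A}, l, m, n)\,(x - A_x)^l (y - A_y)^m (z - A_z)^n\, e^{-\alpha|\mathbf{r} - \mathbf{A}|^2}$$
where $N(\alpha, \mathbf{A}, l, m, n)$ is a normalization factor. Useful integrals in the following are:
($n \in \mathbb{Z}$)
Note that $\alpha$, which is not an exponent, is always called 'the exponent' of a Gaussian
basis function, and that $l$, $m$, and $n$, which are exponents, should never be called so.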
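The "useful integrals" announced above are missing from this copy. The following standard Gaussian integrals are the ones most often needed for the overlap and normalization exercises below; that these are the integrals the original listed is an assumption.

```latex
% Standard Gaussian moment integrals (assumed, not recovered from the source).
\begin{align}
\int_{-\infty}^{\infty} x^{2n}\, e^{-\alpha x^{2}}\, dx
   &= \frac{(2n-1)!!}{(2\alpha)^{n}} \sqrt{\frac{\pi}{\alpha}},
   \qquad n = 0, 1, 2, \ldots \quad \bigl((-1)!! = 1\bigr), \\
\int_{-\infty}^{\infty} x^{2n+1}\, e^{-\alpha x^{2}}\, dx &= 0, \\
\int_{0}^{\infty} r^{n}\, e^{-\alpha r^{2}}\, dr
   &= \frac{\Gamma\!\left(\tfrac{n+1}{2}\right)}{2\,\alpha^{(n+1)/2}} .
\end{align}
```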
A complete set of functions with a common center and exponent, and the same degree
$N = l + m + n$, is called a shell. For N = 0, we have a single s function, N = 1 gives three
p functions, N = 2 gives six cartesian d functions, etc. This notation is borrowed from the
one used for spherical harmonic functions, and originates in spectroscopy (s, p, d, f stand
for sharp, principal, diffuse, and fundamental). This is improper for cartesian d functions
and higher, but the terminology is widespread, and the least confusing is probably to
accept it.
The basis functions are linear combinations, called contractions, of primitive gaussians.
When forming such contractions, one can also make sure to form linear combinations
which are true angular eigenfunctions. From one shell of six cartesian d functions, one
obtains one s component, which is not used, and five true d functions.
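The counting in the two paragraphs above can be sketched in a few lines of Python. The function names are hypothetical and chosen only for this illustration; the sketch relies on the general fact that a Cartesian shell of degree N contains the angular momenta l = N, N-2, ..., each with 2l+1 components.

```python
def n_cartesian(N):
    """Number of Cartesian Gaussians x^l y^m z^n with l + m + n = N."""
    return (N + 1) * (N + 2) // 2


def spherical_components(N):
    """Angular momenta l = N, N-2, ... contained in a Cartesian shell of degree N."""
    return list(range(N, -1, -2))


def n_spherical(l):
    """Number of spherical harmonic components for angular momentum l."""
    return 2 * l + 1


for N, name in enumerate("spdfg"):
    comps = spherical_components(N)
    # The spherical components must account for every Cartesian function.
    assert sum(n_spherical(l) for l in comps) == n_cartesian(N)
    print(name, n_cartesian(N), comps)
```

For the d shell this lists the components l = 2 and l = 0, i.e. the five true d functions and the single s component mentioned in the text.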
Another name (again, not quite proper) for gaussian basis functions is GTO's (Gaussian
Type Orbitals). One can also talk of primitive GTO's and Contracted GTO's. The latter
may be abbreviated CGTO's. However, this could also be taken to mean Cartesian GTO's
as distinct from Spherical Harmonic GTO's.
Exercise 1
Compute the overlap integral of two normalized s-type functions with exponents $\alpha$ and
$\beta$, and with a common center.
Exercise 2
The overlap of s and p functions on the same center must of course be zero - find a simple
and direct argument. What about the overlap of s functions with the six cartesian d
functions?
Exercise 3
Consider the six cartesian d functions centered in the origin. The angular momentum
operators are $\hat{L}_x = y\hat{p}_z - z\hat{p}_y$ etc. Apply the operator $\hat{L}^2$ to the six components and
determine five real linear combinations $\phi_i$ such that $\hat{L}^2\phi_i = l(l + 1)\phi_i$, with $l = 2$. These
are then a set of proper d functions, the Spherical Harmonic GTO's. Also find the sixth
linear combination, orthogonal to the others, with $l = 0$. This is called the s contaminant,
and is seldom used.
Since the complete cartesian d shell is invariant to rotations, if it is decomposed into
spherical harmonic components, there must be complete sets of these as well. Thus,
they come in sets of 1, 3, 5, ... functions of type s, p, d, .... What does this imply for the
transformation of the ten cartesian f functions to spherical harmonic functions?
Exercise 4
Find the radial maximum of GTO's of s, p, d and f type, i.e. $r = r_{\max}(\alpha)$ such that the
radial density function $r^2\phi^2$ takes its maximum. Then find p, d and f exponents which
give the same maximum as a given s function. This, and similar, procedures are often
used to extend a basis set with functions for polarisation or correlation.
Exercise 5
Compute the overlap integral between two s-type functions centered at $\mathbf{A}$ and $\mathbf{B}$, and
with exponents $\alpha$ and $\beta$, respectively.
Exercise 6
Same as Exercise 5, for p functions.
The integrals obtained from the integral evaluation are of two types: matrices for one-electron operators, and for two-electron operators. The former have two basis function
indices p, q, while the latter have four indices. In non-relativistic work, for zero magnetic
field, the integrals are real numbers. Formally, with N basis functions, we need $N^2$ and
$N^4$ integrals. However, there is permutation symmetry in the indices:
$(p|\hat{A}|q) = (q|\hat{A}|p)$
$(pq,rs) = (pq,sr) = (qp,rs) = (qp,sr) = (rs,pq) = (sr,pq) = (rs,qp) = (sr,qp)$
(real basis functions, hermitian $\hat{A}$, charge-cloud notation). For one-electron integrals, we
thus need only $\tfrac{1}{2}N(N + 1)$ values, for instance, the lower triangle of the matrix, row by
row, which implies linear storage using a single triangular index $[pq] = \tfrac{1}{2}p(p-1)+q$, $p \ge q$.
Similarly, the two-electron integrals can be thought of as placed in a larger matrix with
indices $[pq]$, $[rs]$, and since this large matrix is itself symmetric, we use a single combined
index $[[pq][rs]]$. The number of two-electron integrals is then $\tfrac{1}{2}\left(\tfrac{1}{2}N(N + 1)\right)\left(\tfrac{1}{2}N(N + 1) + 1\right)$ or roughly $\tfrac{1}{8}N^4$.
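The triangular and combined indices described above can be illustrated with a short Python sketch. The function names `tri`, `eri_index` and `n_unique` are hypothetical; indices are 1-based, as in the formula $[pq] = \tfrac{1}{2}p(p-1)+q$.

```python
def tri(p, q):
    """Triangular index [pq] = p(p-1)/2 + q for 1-based indices with p >= q."""
    if p < q:
        p, q = q, p  # use the permutation symmetry to enforce p >= q
    return p * (p - 1) // 2 + q


def eri_index(p, q, r, s):
    """Combined index [[pq][rs]] of a two-electron integral (pq,rs)."""
    return tri(tri(p, q), tri(r, s))


def n_unique(N):
    """Return (number of unique one-electron, two-electron integrals)."""
    M = N * (N + 1) // 2
    return M, M * (M + 1) // 2


# All eight permutations listed in the text map to the same storage address.
p, q, r, s = 4, 2, 3, 1
perms = [(p, q, r, s), (p, q, s, r), (q, p, r, s), (q, p, s, r),
         (r, s, p, q), (s, r, p, q), (r, s, q, p), (s, r, q, p)]
assert len({eri_index(*x) for x in perms}) == 1

N = 10
print(n_unique(N), N**4 / 8)  # exact count versus the rough estimate N^4/8
```

The exact count always exceeds $N^4/8$ somewhat, since the estimate neglects the diagonal terms of the two triangular packings.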
When a molecule has some symmetry, the number of integrals to compute, store and
use can be further reduced. There are two schemes: Symmetry-adapted basis functions,
and Petite List. The following description applies to the D2h group and its subgroups.
All the integrals $(G\phi_p\, G\phi_q, G\phi_r\, G\phi_s)$ have the same value for every operator G in the
point group. If we use a simple atom-centered basis set, these integrals involve different
basis functions, and only one of them needs to be stored. This is the basis for the Petite
List approach. We may instead form symmetry combinations of the basis functions -
Symmetry Adapted basis functions - and collect the symmetry combinations into blocks.
Each block is assigned a symmetry label $\lambda$, and all functions in a block transform the
same way, namely $G\phi_p = \chi_G(\lambda)\phi_p$, where the $\chi_G(\lambda)$, each 1 or -1, are the symmetry characters of
the irreducible representation $\lambda$ of the group. Usually, so-called Schönflies notation is
used for the point groups and their irreducible representations, and you are assumed to
have some familiarity with this notation.
For symmetry-adapted basis functions, the rule is now very simple: For each quadruple
of symmetry labels $\lambda_1 \ge \lambda_2$, $\lambda_3 \ge \lambda_4$, $[\lambda_1\lambda_2] \ge [\lambda_3\lambda_4]$, all the integrals $(pq,rs) = 0$ unless
$\chi_G(\lambda_1)\chi_G(\lambda_2)\chi_G(\lambda_3)\chi_G(\lambda_4) = 1$ for all operators G. In other words, the product in the
integral
Label   E    C2   σv   σ'v   Functions
A1      1    1    1    1     z, z², x², y²
B1      1   -1    1   -1     x, xz, Ry
B2      1   -1   -1    1     y, yz, Rx
A2      1    1   -1   -1     xy, Rz
From this, we can also deduce a multiplication table for the functions:
Example: Electrostatic integrals in the symmetry block $(a_2 b_2, b_1 a_1)$ may be non-zero, since
the product is totally symmetric. The product $a_2 b_2$ is $B_1$ to the left of the comma, $b_1 a_1$ to
the right is also $B_1$, and of course $B_1 B_1$ is $A_1$, which is the label of the totally symmetric
function. Remember: capital letters in general, but lower case for basis functions and
orbitals (one-electron functions).
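The multiplication table referred to above follows directly from the characters: the characters of a product are the products of the characters. A minimal Python sketch, using the C2v characters tabulated above (the names `C2V`, `product` and `may_be_nonzero` are hypothetical):

```python
# Characters of the C2v irreducible representations for (E, C2, sigma_v, sigma_v').
C2V = {
    "A1": (1, 1, 1, 1),
    "A2": (1, 1, -1, -1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
}
LABEL = {chars: name for name, chars in C2V.items()}


def product(*labels):
    """Irreducible representation of a product of functions with the given labels."""
    chars = (1, 1, 1, 1)
    for lab in labels:
        chars = tuple(a * b for a, b in zip(chars, C2V[lab]))
    return LABEL[chars]


def may_be_nonzero(l1, l2, l3, l4):
    """An integral (pq,rs) can be non-zero only if the product is totally symmetric."""
    return product(l1, l2, l3, l4) == "A1"


# Print the multiplication table row by row.
names = ["A1", "A2", "B1", "B2"]
for a in names:
    print(a, [product(a, b) for b in names])

print(may_be_nonzero("A2", "B2", "B1", "A1"))  # the example block from the text
```

The same approach works for D2h and its other subgroups, since all their irreducible representations are one-dimensional with characters 1 or -1.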
Exercise 7
a) Assume that, in a calculation on C3 N, we use an ANO basis set with 14 primitive
s functions, 9 shells of p functions etc. The notation ANO(14s9p4d3f)/[5s4p2d1f], for
example, means that we use 5 s-type contracted functions, etc.
For the three contractions [4s3p2d], [5s4p2d1f], and [5s4p3d2f], calculate the number
of symmetry adapted basis functions of a' and a" type, and the number of two-electron
integrals over symmetry-adapted basis functions. Compare to the simple estimate, $N^4/8g$.
Also, compute the number of integrals over primitive basis functions.
Exercise 8
For a pyramidal Cu5 cluster, compute the maximum number of one- and two-electron
integrals that have to be stored on disk. A cartesian (12s8p5d)/[5s4p1d] basis was used,
and the symmetry was C2v which gives 35 $a_1$, 28 $b_1$, 28 $b_2$, and 24 $a_2$ basis functions (These
numbers do not quite add up: the explanation is that five basis functions were deleted
from the calculation). The number of integrals actually stored in this calculation was
4 238 521; the program stored only those larger than $10^{-14}$.
Exercise 9
In a calculation on CuF, at a distance close to equilibrium, 4609521 integrals were generated. At a large distance ($100\,a_0$) only 1272495 integrals were produced. What is the
origin of this difference?
Exercise 10
2 Hartree-Fock
See Almlöf's chapter on Hartree-Fock, Appendix A, for integral notation. Expectation
values over determinant functions of orthonormal spin-orbitals are found in the same
chapter, Eqs 4.6 and 4.10-14. Matrix elements over different determinants are found in
Eqs 23.19-22. For the F, J, and K operators as they appear in the UHF and Closed-shell
cases, see also Chapters 6 and 7.
Exercise 1
Write down the Fock operator for the $1s^2 2s^2$ state of Be.
Exercise 2
Show Koopmans' theorem for a closed-shell molecule, i.e. that the ionization potential
in the frozen approximation equals $-\varepsilon_k$, the negative of the orbital energy. The frozen
approximation means that the ion state is obtained using the canonical orbitals of the
neutral state, simply by forming a determinant with one of the orbitals removed.
Exercise 3
By a single excitation from a closed shell determinant, we can obtain a singlet and a
triplet state. Compute the energy difference between these states. Which is the lower?
Exercise 4
where the sum is over the m occupied shells. The exchange terms are much smaller than
the Coulomb terms, except of course for the case i = j. Compare the two cases where
i is an occupied and where i is a virtual orbital. Also show Koopmans' theorem for the
electron affinity in the frozen approximation. This is similar to Exercise 2, except that
we add a virtual orbital instead of removing an occupied orbital.
Exercise 5
where $S_{ij} = \langle\chi_i|\chi_j\rangle$ and $F_{ij} = \langle\chi_i|\hat{F}|\chi_j\rangle$ are the overlap matrix and Fock matrix, and $c_k$
is a vector of MO coefficients, which express the orbital $\phi_k$ in the basis: $\phi_k = \sum_i c_{ik}\chi_i$.
a) Derive the formula by which this equation is brought to the form
where $c_k = Tc'_k$ for some transformation matrix T, such that F' is symmetric (assume
real matrices).
b) The matrix T is not orthogonal. Derive a simple expression for $T^{-1}$ in terms of $T^T$
and S.
Exercise 6
Consider a real closed-shell Hartree-Fock wave function. Assume that the occupied canonical HF orbitals can be expressed in some finite, non-orthogonal real basis set $\{\chi_p\}$,
so that $\phi_i = \sum_p c_{pi}\chi_p$. The density matrix in this basis is $D_{pq} = 2\sum_{i=1}^{n} c_{pi}c_{qi}$ (using n
occupied orbitals).
a) Show that the orthonormality of the HF orbitals implies the condition
DSD = 2D
Exercise 7
The determinant is linear in all its columns, so for a first-order variation in the orbitals,
Show that the variations of an arbitrary occupied orbital $\psi_i$ of the two forms $\delta\psi_i = \delta x\,\psi_a$
and $\delta\psi_i = i\,\delta x\,\psi_a$, where $\psi_a$ is an arbitrary virtual orbital and $\delta x$ a real number, preserve
orthonormality to first order in $\delta x$. What does this imply for the matrix element $\langle\Psi_0|\hat{H}|\Psi_i^a\rangle$
for the Hartree-Fock state?
Exercise 8
Use the exact H 1s orbitals in a minimal LCAO basis for the $M_S = 0$ states of H$_2$. The closed-shell "ground state" determinant is $|g\bar{g}\rangle$, and there are also two singly-excited states
(singlet and triplet) and a "doubly excited" determinant $|u\bar{u}\rangle$. Use the known hydrogen
atom energy $E_h = -\tfrac{1}{2}$ a.u., and $(1s1s|1s1s) = \tfrac{5}{8}$ a.u. to calculate the energy of these
determinants at the dissociation limit.
3 Second Quantization
Exercise 1
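$a_i^\dagger a_j |n\rangle = (1 - n_i + \delta_{ij})\, n_j\, \varepsilon_{ij}\, \Gamma(n)_i\, \Gamma(n)_j\, |k\rangle$
where
$k_l = n_l$ if $l \ne i$, $l \ne j$,
$k_i = 1$,
$k_j = \delta_{ij}$,
and $\varepsilon_{ij} = 1$ if $i \le j$, $\varepsilon_{ij} = -1$ if $i > j$.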
Exercise 2
The following equalities are constantly used to evaluate products of elementary opera-
tors ('field operators'), and in particular their matrix elements. Prove the following six
formulae:
$[A, B_1B_2] = [A, B_1]B_2 + B_1[A, B_2]$
$[A, B_1B_2\cdots B_n] = \sum_{k=0}^{n-1} B_1\cdots B_k\,[A, B_{k+1}]\,B_{k+2}\cdots B_n$
Exercise 3
Let $\hat{\kappa}$ and $\hat{f}$ be one-electron operators and $\hat{g}$ a two-electron operator. Show that the
commutators $[\hat{\kappa}, \hat{f}]$ and $[\hat{\kappa}, \hat{g}]$ are one-electron and two-electron operators, respectively.
Exercise 4
Let $I_n$ and $I_m$ be products of n and m elementary operators (creation or annihilation
operators), respectively. Show that, for n and m both even, the commutator $[I_n, I_m]$ can
be reduced to a sum of terms, where each term is the product of at most n + m - 2
elementary operators.
Exercise 5
Prove the following relations for operators and square matrices. You may assume that
the Taylor expansions are absolutely convergent, so any reordering of terms or regular
summation is allowed. This is true for matrices and for bounded operators.
$\exp(A)^\dagger = \exp(A^\dagger)$
$B\exp(A)B^{-1} = \exp(BAB^{-1})$
$\exp(A + B) = \exp(A)\exp(B)$ if $[A, B] = 0$
$\exp(A)\exp(-A) = 1$
$\frac{d}{d\lambda}\exp(\lambda A) = A\exp(\lambda A) = \exp(\lambda A)A$
$\exp(A)B\exp(-A) = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \cdots + \frac{1}{n!}[A, [A, \cdots, [A, B]\cdots]] + \cdots$
Exercise 6
Let D be a diagonal matrix with elements $d_i$, $i = 1 \ldots n$ in the diagonal. Show that exp(D)
is also a diagonal matrix, with diagonal elements $\exp(d_i)$, $i = 1 \ldots n$. Assume that A is
any general, square matrix that can be diagonalized by a similarity transformation:
$A = X^{-1}DX$
Show that for any such matrix,
$\det(\exp(A)) = \exp(\mathrm{Tr}(A))$
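Before attempting the proof, the identity can be checked numerically. The sketch below is pure Python and sums the Taylor series of the exponential for an arbitrary 2 x 2 test matrix; the matrix entries, the helper names and the number of series terms are arbitrary choices made for this illustration.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(A, terms=40):
    """Matrix exponential from the Taylor series sum_k A^k / k!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds A^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[0.3, -1.2], [0.7, 0.5]]          # arbitrary non-symmetric test matrix
lhs = det2(expm(A))                    # det(exp(A))
rhs = math.exp(A[0][0] + A[1][1])      # exp(Tr(A))
print(lhs, rhs)
```

A numerical check of this kind is of course no substitute for the proof via the similarity transformation asked for in the exercise.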
Exercise 7
Two of the operators occurring in relativistic quantum mechanics are the two-electron
Darwin term and the spin-spin contact term, respectively:
Exercise 8
One of the operators describing the interaction between a nuclear magnetic dipole and
electrons is the Fermi contact term
$H_{fc} = \sum_A \sum_i \gamma_A\, \mathbf{S}_i \cdot \mathbf{I}_A\, \delta(\mathbf{r}_{iA})$
Determine the second quantization representation of $H_{fc}$.
Exercise 9
Prove the commutation relations
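$[E_{mn}, a^\dagger_{i\sigma}] = \delta_{in}\, a^\dagger_{m\sigma}$
$[E_{mn}, a_{i\sigma}] = -\delta_{mi}\, a_{n\sigma}$
$[E_{mn}, E_{ij}] = \delta_{in}E_{mj} - \delta_{mj}E_{in}$
$[E_{mn}, e_{ijkl}] = \delta_{in}e_{mjkl} - \delta_{mj}e_{inkl} + \delta_{kn}e_{ijml} - \delta_{ml}e_{ijkn}$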
Exercise 10
[Equations for $Q_{ij}(1,1)$, $Q_{ij}(1,0)$, $Q_{ij}(1,-1)$ and $Q_{ij}(0,0)$; the right-hand sides are not recoverable from this copy.]
c) Use the relations in Eq. (5.22) to carry out substitutions in Eqs. (10.1) and (10.2) to
show that
TiAI,I)
Tij(l,O)
T;j(1,-I)
SAi]' (00)
• = v2
IIn (At
a., A At
.• a].!+a.
2 .-,
A )
,a]·_!
2
Exercise 11
Let $\hat{S}$ be the total spin operator and $\hat{N}_i = a^\dagger_{i\alpha}a_{i\alpha} + a^\dagger_{i\beta}a_{i\beta}$ the orbital occupation number
operator. Prove that $\hat{S}^2$ does not change the orbital occupations, i.e., $[\hat{S}^2, \hat{N}_i] = 0$.
Exercise 12
K = (~~. i~2)
wit.h 0 = A + if, and 01,02, A, f are real.
Hint: ( i0 1
-0"
Exercise 13
iratrirt oc:
b) Let $|0,0\rangle$ be a singlet spin state. The numbers denote the total spin S and its z
component $M_S$ for the state, respectively. Show that
c) Let $|1,0\rangle$ be a triplet state with spin projection $M_S = 0$. Show that
Exercise 14
Let $|0\rangle$, $|0_k\rangle$ be an orthonormal basis of a vector space and introduce the exponential
a) Show that
$\exp(i\hat{R})|0\rangle = \cos d\,|0\rangle + \frac{i\sin d}{d}\sum_k R_k|0_k\rangle$,
Exercise 15
Let at be a set of creators for orthonormal orbitals. A set of non-orthogonal orbitals can
be defined with creators of the form
Note that, in general, 61 f: it!. The 'tilde' operators are then creators and annihilators for
another set of orbitals than the 'hatted' operators. The two orbital sets are said to be
biorthonormal.
4 Spin
To remind you of the basic nomenclature and definitions, here is a list of some pertinent
facts about electron spin. The spin operator $\hat{\mathbf{S}}$ is a vector operator. In any given reference
frame, it can be represented by three components, $\hat{\mathbf{S}} = (\hat{S}_x, \hat{S}_y, \hat{S}_z)$, and on rotations of the
reference frame, these components mix like components of an ordinary vector. The three
components must obey the cyclic commutation relations $[\hat{S}_x, \hat{S}_y] = i\hat{S}_z$, $[\hat{S}_y, \hat{S}_z] = i\hat{S}_x$,
and $[\hat{S}_z, \hat{S}_x] = i\hat{S}_y$, since they are components of an angular momentum operator (we
use a system of units where $\hbar = 1$). The operator $\hat{S}^2$ is defined as the scalar operator
$\hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2$. The eigenfunctions of $\hat{S}^2$ have eigenvalues $S(S + 1)$, where S can be
$0, \tfrac{1}{2}, 1, 1\tfrac{1}{2}, \ldots$, i.e. only non-negative half-integers are possible. The functions are called
singlet, doublet, triplet ..., wave functions, depending on S. The eigenfunctions of $\hat{S}_z$
have eigenvalues $M_S$, which also must be half-integer, but can have either sign. The
one-electron wave functions always have $S = \tfrac{1}{2}$. These functions must, strictly speaking,
have one additional argument, namely the direction of the electron spin, apart from the
position variable $\mathbf{r}$. This additional variable m is an internal degree of freedom, and the
internal structure of the electron is not part of quantum chemistry but is assumed to be
completely specified for any eigenfunction of $\hat{S}_z$. One possible choice of a complete set of
observables for the electron is thus $(\mathbf{r}, m)$, where m can only be $\tfrac{1}{2}$ or $-\tfrac{1}{2}$, and variables
in this composite space are, if needed, denoted x. In formulas, one often integrates over
this space, meaning a sum of the two integrals with $m = \tfrac{1}{2}$ and $m = -\tfrac{1}{2}$. Any arbitrary
one-electron wave function (or spin-orbital) can then be described by a decomposition
$\psi(x) = \phi_1(\mathbf{r})\chi_1(m) + \phi_2(\mathbf{r})\chi_2(m)$ if the two functions $\chi_{1,2}$ are independent functions of
m. Customarily, the two functions $\alpha$ and $\beta$ are used: $\alpha(\tfrac{1}{2}) = 1$, $\alpha(-\tfrac{1}{2}) = 0$, $\beta(\tfrac{1}{2}) = 0$,
and $\beta(-\tfrac{1}{2}) = 1$.
Exercise 1
Show that the commutator $[\hat{S}^2, \hat{S}_z] = 0$.
Exercise 2
Show that
Exercise 3
Define the operators $\hat{S}_+ = \hat{S}_x + i\hat{S}_y$ and $\hat{S}_- = \hat{S}_x - i\hat{S}_y$.
Let $|S, M_S\rangle$ denote some arbitrary
eigenfunction of $\hat{S}^2$ and $\hat{S}_z$. Use the commutation rules and definitions above to show that
the new functions $\hat{S}_+|S, M_S\rangle$ and $\hat{S}_-|S, M_S\rangle$ are also spin eigenfunctions, if non-zero. The
operators $\hat{S}_+$ and $\hat{S}_-$ are called spin-up and spin-down operators. Operators of this type
are generally called ladder operators or step operators.
Exercise 4
Find a 2 x 2 matrix representation for all the spin operators encountered so far, using
the basis $\alpha$ and $\beta$. Begin by noting that the representations for $\hat{S}_+$ and $\hat{S}_-$ are almost
obvious in this basis - these matrices contain a single non-zero complex number, with
known magnitude but unknown phase. The phase cannot be obtained from definitions
and commutation relations (Proof: Those are still fulfilled if any or both of the basis
functions are scaled by phase factors of unit magnitude). Choose the phase so that the
$\hat{S}_+$ matrix elements are real and non-negative. The matrices you obtain are called the
'standard representation'; twice the $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$ matrices are called the Pauli spin
matrices, also denoted $\sigma_x$, $\sigma_y$, and $\sigma_z$, often regarded as components of a vector $\boldsymbol{\sigma}$.
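A candidate answer to this exercise can be verified numerically. The sketch below assumes the usual standard representation (half the Pauli matrices, basis order $\alpha$, $\beta$), which is quoted from general knowledge rather than from the text, and checks the cyclic commutation relations, the value of $\hat{S}^2$, and the phase convention for $\hat{S}_+$. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lincomb(A, B, s=1):
    """Return A + s*B."""
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]


def scale(c, A):
    """Return c*A."""
    return [[c * x for x in row] for row in A]


def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return lincomb(matmul(A, B), matmul(B, A), -1)


def close(A, B, tol=1e-12):
    """Element-wise comparison of two 2 x 2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


# Assumed standard representation in the basis (alpha, beta).
Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]
Sz = [[0.5, 0], [0, -0.5]]

Splus = lincomb(Sx, scale(1j, Sy))    # S+ = Sx + i Sy
S2 = lincomb(lincomb(matmul(Sx, Sx), matmul(Sy, Sy)), matmul(Sz, Sz))

print(close(comm(Sx, Sy), scale(1j, Sz)))   # [Sx, Sy] = i Sz
print(close(Splus, [[0, 1], [0, 0]]))       # S+ is real and non-negative
print(close(S2, [[0.75, 0], [0, 0.75]]))    # S(S+1) = 3/4 for S = 1/2
```

Rescaling one basis function by a phase factor changes the $\hat{S}_x$ and $\hat{S}_y$ matrices but leaves all three checks of the commutation relations intact, which is the point made in the exercise.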
It is convenient to introduce spin operators for each individual electron, denoted e.g. as
$\hat{s}_x(i)$, etc., for electron nr. i, and thus $\hat{S}_x = \sum_i \hat{s}_x(i)$. A product of two operators is then
obtained as e.g.
$\phi_i(x) = \phi_i(\mathbf{r})\,\alpha(m)$
$\bar{\phi}_i(x) = \phi_i(\mathbf{r})\,\beta(m)$
Exercise 5
Show that the general determinant $D = |\phi_1 \cdots \phi_N|$ is an eigenfunction of $\hat{S}_z$ with eigenvalue $M_S = \tfrac{1}{2}(n_\alpha - n_\beta)$, where $n_\alpha$ and $n_\beta$ are the numbers of alpha and beta spin orbitals
in the determinant.
Exercise 6
Show that
$\hat{S}^2 D = \left\{\sum_{p<q} \hat{X}_{pq} + \tfrac{1}{4}\left[(n_\alpha - n_\beta)^2 + 2(n_\alpha + n_\beta)\right]\right\} D$
where we have defined a spin interchange operator $\hat{X}_{pq}$: This operator gives a non-zero
result when spin-orbitals in the positions p and q have different spin, and it then moves
the overline from the one to the other. This formula is very handy for hand calculations.
In application, it should be applied to the open shells only, and $n_\alpha$, $n_\beta$ should then only
count the open-shell spins. Try to show why this works.
Exercise 7
It is easily seen that only open shells need to be considered in the above construction. A
simplified notation is then used for hand calculations: just list the spin labels of the open
shells. For the set of determinants with two electrons in two open shells, $D_1 = |\alpha\alpha|$, $D_2 = |\alpha\beta|$, $D_3 = |\beta\alpha|$, and $D_4 = |\beta\beta|$, compute the effect of $\hat{S}^2$ on each of these determinants.
In general, a single determinant is not an $\hat{S}^2$ eigenfunction, even if it is an eigenfunction
of $\hat{S}_z$. However, a high-spin determinant, which may have a number of paired (closed, or
doubly-occupied) orbitals, but in which all unpaired (open) orbitals have the same spin, is always
an eigenfunction of $\hat{S}^2$, and has $S = |M_S|$. The number of open orbitals is thus 2S for
a high-spin determinant. To construct eigenfunctions with some specified S when there
are more open shells than 2S, we need in general a linear combination of the different
possible determinants. A very general (but often clumsy) method to obtain such linear
combinations is to use a projection operator:
$\hat{P}(S) = \prod_{K \ne S} \hat{O}_K, \qquad \hat{O}_K = \frac{\hat{S}^2 - K(K + 1)}{S(S + 1) - K(K + 1)}$
The projection operator is applied to a suitable determinant, and is said to project out the
desired spin eigenfunction. The result is, in practice, obtained by sequentially applying
the operators $\hat{O}_K$ with different values of K. From exercise 6, this results in forming
various sums of determinants where the spins of open shells have been switched by the spin
interchange operator. If the original determinant is considered as a sum of terms with
different spin S', it is easily seen that $\hat{O}_K$ produces new sums where each term has been
scaled except the one with spin S, and that the term with spin S' = K is removed.
Exercise 8
Exercise 9
As in exercises 8a and b, but with three open shells instead. Also, use the projection
operator technique to get an $S = \tfrac{1}{2}$ wave function. Is it identical to any of the ones in
parts a and b? If not, why?
Exercise 10
In the chapter on the Configuration Interaction Method of the Lecture Notes, Eqs (4.7)
and (4.8) on page 266, two spin eigenfunctions with four open shells are written down.
Verify that they are eigenfunctions of $\hat{S}^2$.
Exercise 11
Consider a closed-shell determinant $|0\rangle$, let i, j be two different occupied orbitals, and a, b
two unoccupied ones. A double excitation $i,j \to a,b$ gives a wave function with four open
shells. Write down the linear combination of determinants obtained as $\hat{E}_{ai}\hat{E}_{bj}|0\rangle$. Verify
that this is a singlet wave function.
Exercise 12
Same, but use instead $\hat{E}_{aj}\hat{E}_{bi}|0\rangle$. Show that this function is linearly independent of the
one in Exercise 11, but not orthogonal to it.
5 Geometrical Derivatives . ..
wave function parameters $\lambda$ (for example CI state transfer parameters). The optimized
electronic energy at x is denoted $e(x)$ and
is assumed.
Exercise 1
Show by differentiation that the molecular Hessian may be written in the form
$e^{(2)} = E^{(20)} + 2E^{(11)}\lambda^{(1)} + E^{(02)}\lambda^{(1)}\lambda^{(1)} + E^{(01)}\lambda^{(2)}$
Exercise 2
Use the variational condition to show that the above expression for the Hessian may be
written in the following two ways:
$e^{(2)} = E^{(20)} + 2E^{(11)}\lambda^{(1)} + E^{(02)}\lambda^{(1)}\lambda^{(1)}$
$e^{(2)} = E^{(20)} + E^{(11)}\lambda^{(1)}$
in accordance with the 2n + 1 rule. Show that the first expression is stable with respect
to errors in $\lambda^{(1)}$ while the second is unstable.
Exercise 3
Show that the Hessian may also be written in the form
$e^{(2)} = E^{(20)} - E^{(11)}(E^{(02)})^{-1}E^{(11)}$
For a complete CI wave function this expression is the same as the usual sum-over-states
expression of second-order perturbation theory:
The factor of two is due to the following parametrization of the energy and Hamiltonian:
$e(x) = e^{(0)} + e^{(1)}x + \tfrac{1}{2}e^{(2)}x^2 + \cdots$
$\hat{H}(x) = \hat{H}^{(0)} + \hat{H}^{(1)}x + \tfrac{1}{2}\hat{H}^{(2)}x^2 + \cdots$
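The Hessian expressions of Exercises 2 and 3 can be checked numerically on a toy model. In the sketch below the energy function `E(x, lam)` is purely hypothetical, with one geometrical parameter x and one wave function parameter; the response $\lambda^{(1)} = -(E^{(02)})^{-1}E^{(11)}$ used here follows from differentiating the variational condition. The last lines illustrate the different sensitivity of the two expressions of Exercise 2 to an error in $\lambda^{(1)}$.

```python
def E(x, lam):
    """Toy energy function (hypothetical, for illustration only)."""
    return 0.5 * x**2 + 0.8 * x * lam + 1.5 * lam**2 + 0.3 * x * lam**2 + 0.1 * lam**4


def dE_dlam(x, lam):
    return 0.8 * x + 3.0 * lam + 0.6 * x * lam + 0.4 * lam**3


def d2E_dlam2(x, lam):
    return 3.0 + 0.6 * x + 1.2 * lam**2


def lam_opt(x):
    """Solve the variational condition dE/dlam = 0 by Newton iteration."""
    lam = 0.0
    for _ in range(50):
        lam -= dE_dlam(x, lam) / d2E_dlam2(x, lam)
    return lam


def e(x):
    """Optimized energy e(x) = E(x, lam(x))."""
    return E(x, lam_opt(x))


# Partial derivatives of E at the reference point x = 0, lam = 0.
E20, E11, E02 = 1.0, 0.8, 3.0
lam1 = -E11 / E02                       # first-order response of lam

h = 1e-3
e2_numeric = (e(h) - 2 * e(0.0) + e(-h)) / h**2
e2_formula = E20 - E11 * E11 / E02
print(e2_numeric, e2_formula)

# Effect of an error delta in lam1 on the two expressions of Exercise 2.
delta = 1e-3
err_first = (E20 + 2 * E11 * (lam1 + delta) + E02 * (lam1 + delta) ** 2) - e2_formula
err_second = (E20 + E11 * (lam1 + delta)) - e2_formula
print(err_first, err_second)            # quadratic versus linear in delta
```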
Exercise 4
Use the following parametrization of the CI wave function:
where
$\hat{H}^{(0)}|k\rangle = E_k|k\rangle$
to show that the derivatives of the energy with respect to the state transfer parameters
$\lambda_n$ become (assuming real parameters and matrix elements):
$\frac{\partial E}{\partial\lambda_n} = -2\langle n|\hat{H}|0\rangle$
$\frac{\partial^2 E}{\partial\lambda_m\partial\lambda_n} = 2\left(\langle m|\hat{H}|n\rangle - \delta_{mn}\langle 0|\hat{H}|0\rangle\right)$
By straightforward differentiation one obtains the following expression for the third derivative of the electronic energy:
$e^{(3)} = E^{(30)} + 3E^{(21)}\lambda^{(1)} + 3E^{(12)}\lambda^{(1)}\lambda^{(1)} + E^{(03)}\lambda^{(1)}\lambda^{(1)}\lambda^{(1)}$
$\qquad + 3E^{(11)}\lambda^{(2)} + 3E^{(02)}\lambda^{(1)}\lambda^{(2)} + E^{(01)}\lambda^{(3)}$
Exercise 5
Using the variational condition, show that this expression may be written in the simplified
way
$e^{(3)} = E^{(30)} + 3E^{(21)}\lambda^{(1)} + 3E^{(12)}\lambda^{(1)}\lambda^{(1)} + E^{(03)}\lambda^{(1)}\lambda^{(1)}\lambda^{(1)}$
in accordance with the 2n + 1 rule.
London orbitals are often used in calculations involving an external magnetic field. Consider
an atomic orbital $\psi_{lm}$ positioned at O which in the field-free case satisfies the
Schrödinger equation
In atomic units,
$$\hat H^{(0)} = -\tfrac{1}{2}\nabla^{2} + V$$
for some effective, spherically symmetric potential V. The orbital is an eigenfunction of
the z component of the angular momentum about O:
$$\hat L_z\psi_{lm} = m\,\psi_{lm}$$
$$\hat{\mathbf L} = (\mathbf r - \mathbf O)\times\mathbf p$$
We now apply a uniform external magnetic induction B along the z axis. To construct
the Hamiltonian, we need a vector potential. First choose it such that it disappears at
the center O:
$$\mathbf A_O(\mathbf r) = \tfrac{1}{2}\,\mathbf B\times(\mathbf r - \mathbf O)$$
This potential gives the correct induction, as may be confirmed by calculating
$$\mathbf B = \nabla\times\mathbf A_O(\mathbf r)$$
$$\boldsymbol\pi_O = -i\nabla + \mathbf A_O(\mathbf r)$$
and we have added the subscript O to the Hamiltonian to remind us that it is constructed
from a vector potential vanishing at O.
Exercise 6
Show that to first order in B the Hamiltonian may be written
Hence with this choice of vector potential, the unperturbed atomic orbital is correct to
first order in the perturbation.
Now make a different choice of gauge origin:
$$\mathbf A_G(\mathbf r) = \tfrac{1}{2}\,\mathbf B\times(\mathbf r - \mathbf G)$$
This is also a valid vector potential, since it describes the same induction as before:
$$\hat H_G = \tfrac{1}{2}\pi_G^{2} + V$$
Exercise 7
Show that with this (equally valid) choice of gauge origin the unperturbed wave function
$\psi_{lm}$ is no longer correct to first order in the perturbation.
We conclude that the atomic orbitals describe the perturbed wave function better for
some gauge choices than for others. In general, magnetic properties calculated using LCAO
orbitals are gauge dependent.
We now introduce a complex phase factor in the orbital as suggested by London:
Exercise 8
$$\hat H_G\,\omega_{lm} = \left(E^{(0)} + \tfrac{1}{2}Bm\right)\omega_{lm}\qquad\text{(any gauge origin)}$$
We have seen that the London orbitals are well suited for the description of magnetic
perturbations involving an external magnetic field. A further benefit is that the use of
London orbitals makes LCAO calculations strictly gauge-origin independent. To prove
this, it suffices to show that all integrals over London orbitals are independent of origin.
Assume two atomic orbitals $\psi_\mu$ and $\psi_\nu$ are centered at M and N respectively, and form
the London orbitals
$$\omega_\mu = \exp\left(-i\,\mathbf A_G(\mathbf M)\cdot\mathbf r\right)\psi_\mu$$
Exercise 9
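$$T_{\mu\nu} = \langle\omega_\mu|\tfrac{1}{2}\pi_G^{2}|\omega_\nu\rangle$$
may be written as
$$T_{\mu\nu} = \langle\psi_\mu|\exp\left(\tfrac{i}{2}\,\mathbf B\cdot(\mathbf M-\mathbf N)\times\mathbf r\right)\tfrac{1}{2}\pi_N^{2}|\psi_\nu\rangle$$
which is manifestly independent of the gauge origin G.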
The London orbitals depend on the coordinate system since r appears in the phase factor.
The same is true of integrals over London orbitals.
Exercise 10
Exercise 11
Show that quadratic convergence implies superlinear convergence, and that superlinear
convergence implies linear convergence.
An iteration in Newton's method can be written as
where $x_c$ is the current point, $G_c$ and $g_c$ are the Hessian and gradient at that point, and
$x_+$ is the next point. The optimizer is $x_*$, and thus the current error is $e_c = x_c - x_*$ and
the error in the next iteration will be $e_+ = x_+ - x_*$.
Exercise 12
Write the inverse Hessian and the gradient as Taylor series in $(x - x_*)$, using derivatives
at $x_*$. Use this to express the next error, $e_+$, in terms of $e_c$, to prove that the method is
quadratically convergent sufficiently close to the optimizer.
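The quadratic convergence asked for here can be observed numerically before it is proved. The following is only a sketch in one dimension, assuming the standard Newton update x+ = xc - g(xc)/G(xc); the model function f(x) = exp(x) - 2x and the helper name newton_errors are illustrative choices, not taken from the notes.

```python
import math

def newton_errors(gradient, hessian, x_start, x_opt, steps):
    """Iterate x_next = x - g(x)/G(x) and return |x - x_opt| after each step."""
    x = x_start
    errors = [abs(x - x_opt)]
    for _ in range(steps):
        x = x - gradient(x) / hessian(x)  # one-dimensional Newton step
        errors.append(abs(x - x_opt))
    return errors

# Illustrative model function (an assumption): f(x) = exp(x) - 2x,
# with gradient exp(x) - 2, Hessian exp(x) and minimum at x* = ln 2.
errors = newton_errors(lambda x: math.exp(x) - 2.0,
                       lambda x: math.exp(x),
                       x_start=1.0, x_opt=math.log(2.0), steps=3)

# For quadratic convergence e_+ / e_c**2 approaches a constant,
# here f'''(x*) / (2 f''(x*)) = 1/2.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(3)]
```

Only three steps are taken because the error then approaches machine precision and the ratio is no longer meaningful.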
A step to the boundary of the restricted second-order model
may be written as
Exercise 13
The quasi-Newton condition requires that the updated Hessian fulfills the equation
where
Exercise 14
where
Exercise 15
Use Lagrange's method of undetermined multipliers to show that, along a gradient ex-
tremal,
G(x)g(x) = p(x)g(x)
for some function p .
Exercise 1
Evaluate $r_s$ at the nucleus of a hydrogen-like atom of charge Z, and at the Bohr radius.
Exercise 2
Evaluate the functional derivative of the last term in the LYP correlation functional:
Jwp41Vpl2dr
Exercise 3
Show that by substitution of eqn(93) into the KS equations, and integrating by parts, it
is possible to remove the need for the evaluation of basis function second derivatives.
Exercise 4
Exercise 5
Analyse the data in the tables; e.g. which bonds are poorly obtained by DFT and which
are well obtained by DFT. How do you think DFT compares with SCF and MP2?
Exercise 6
Carry through the evaluation of the kinetic and exchange energies for the uniform electron
gas.
Exercise 1
Let P denote a single two-electron system whose wave function is
$$\Psi_P = \Psi_0 + \chi_P$$
Express the result in terms of N, S, and $\epsilon_P$. How does this correlation energy behave as
$N \to \infty$?
(c) Note that this result corresponds to a truncation of the exact wave function that is
not variationally optimum, since the correlation function $\chi_P$ is taken from the exact wave
function and not reoptimized. (The analogy would be the difference between a CISD
wave function with the coefficients taken from a full CI, and with the optimized CISD
coefficients). Optimize the energy of the truncated expansion. (Hint: the only degree
of variational freedom is an overall scaling of the correlation functions). How does the
optimized energy behave as $N \to \infty$?
Exercise 2
$$WT_2 + WT_3 + \tfrac{1}{2}WT_2^{2} + WT_1T_2 + WT_1T_3 + WT_2T_3 + \tfrac{1}{2}WT_1^{2}T_2 + \tfrac{1}{2}WT_1T_2^{2} + \tfrac{1}{2}WT_1^{2}T_3 + \tfrac{1}{3!}WT_1^{3}T_2$$
a reader points out that there must be at least two typographic errors. Since the Hamiltonian
can couple single excitations to triple excitations, there appears to be a linear term
$WT_1$ missing on the RHS. The five-fold excitation term $\tfrac{1}{5!}WT_1^{5}$ is also absent.
(a) Why does the term in $WT_1$ not appear? Is there a mistake in the equation?
(b) Why does the term $WT_1^{5}$ not appear either? How could one describe this term?
(c) Why is it obvious that the reader is thinking from the perspective of CI calculations?
What other terms might a naive reader expect in this equation that are also missing?
(d) Which linear terms appear in the CCSDTQ equations (that is, up to connected quadru-
ples in the expansion)? Hence write down the linearized coupled-cluster equations for
arbitrary levels of excitation. How are these related to CI secular equations?
Exercise 3
Suppose we wish to use the process of solving the CC equations to estimate Møller-Plesset
perturbation theory energies.
(a) Using the operator form of the CCSDTQ equations (these are given in Eqs. 4.11 -
4.14 of the notes), iterate the amplitudes order-by-order and obtain contributions to the
perturbation energies through fifth order. You need not expand the matrix elements of
W, etc. (unless you are very keen).
(b) Analyze the perturbation energy contributions in terms of excitation levels.
(c) Why is the use of the CC energy equation
$$E_{CC} = \langle\Psi_0|W\left(\tfrac{1}{2}T_1^{2} + T_2\right)|\Psi_0\rangle$$
not especially well-suited for our purpose? What might be a better approach? (Hint:
What perturbation energies can be determined once the perturbed wave function of order
n is known?)
8 Truncated CI ...
Exercise 1
Consider minimal basis H2 where the two available orbitals are $\sigma_g$ and $\sigma_u$. Set up the
SDCI wave function for this two-electron system and compute the correlation energy
$$E_c = E - E_0$$
where Eo is the Hartree-Fock energy.
Exercise 2
Add another H2 molecule at infinite distance such that there is no interaction between
the molecules implying that the total wave function may be written as the product of the
wave functions of the two subunits.
a) Set up the full CI wave function for this system. Very few excitations will result in
non-vanishing contributions to the wave function. Give arguments for excluding terms!
Is (local) symmetry necessary to exclude the single excitations in this case?
b) Construct the full CI matrix for this system and compute the correlation energy for
the given basis. How does that compare with that of a single H2 molecule?
c) Compute the CI coefficients and compare the coefficients for the quadruple excitations
with that of the doubles. How could this be used to simplify the system of equations?
(Note this relation only holds in the case of non-interacting systems, but it forms the
basis of the Coupled Cluster approximation). Can you see from the full CI wave functions
for the two non-interacting H2 molecules that this particular relation between $C_D$ and $C_Q$
should hold?
Exercise 3
Restrict the wave function to SDCI, i.e. only up to double excitations are included. Set
up the SDCI matrix and obtain the correlation energy. Compare with (1) and (2) above!
Exercise 4
Consider the case of an SDCI wave function for N non-interacting H2 molecules. Set up
the SDCI matrix and compute the correlation energy. What happens with the correlation
energy per molecule when $N \to \infty$? What happens with the coefficient of the Hartree-Fock
reference state when N gets large?
Exercise 5
One common method of making the SDCI approach more nearly size-consistent is through
the Davidson correction $E_{Dav}$. Consider the correlation energy functional
Exercise 1
Is restricted Hartree-Fock size-extensive? Size-consistent? What about UHF? What about
different correlation treatments? Does your classification here agree with your neighbours?
Discuss any differences and attempt to resolve them.
Exercise 2
The 3s contaminant function from a Cartesian d shell is of the form $r^{2}\exp(-br^{2})$. Find
the overlap between a normalized 3s function and a normalized 1s gaussian function,
$\exp(-ar^{2})$. For what ratio of exponents a/b is the overlap a maximum? What does this
say about choosing polarization exponents for Cartesian sets? (If you prefer, solve the
more general problem of overlap of $r^{n}\exp(-ar^{2})$ with $r^{n+2}\exp(-br^{2})$, thereby obtaining
a formula that could also be used for f functions). An essential radial integral:
$$\int_0^\infty \exp(-\alpha r^{2})\,r^{2n+2}\,dr = \frac{(2n+1)!!}{2^{n+2}\alpha^{n+1}}\left(\frac{\pi}{\alpha}\right)^{1/2}$$
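The radial integral quoted above, $\int_0^\infty r^{2n+2}\exp(-\alpha r^2)\,dr = \frac{(2n+1)!!}{2^{n+2}\alpha^{n+1}}\sqrt{\pi/\alpha}$, is easy to confirm numerically. The sketch below compares the closed form with a composite Simpson quadrature; the function names, the cut-off r_max and the number of quadrature steps are illustrative assumptions.

```python
import math

def double_factorial(k):
    """k!! for odd k >= -1, with (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def radial_integral_closed(n, alpha):
    """Closed form of the integral of r^(2n+2) exp(-alpha r^2) over [0, inf)."""
    prefactor = double_factorial(2 * n + 1) / (2 ** (n + 2) * alpha ** (n + 1))
    return prefactor * math.sqrt(math.pi / alpha)

def radial_integral_numeric(n, alpha, r_max=12.0, steps=20000):
    """Composite Simpson rule on [0, r_max]; steps must be even."""
    h = r_max / steps
    f = lambda r: r ** (2 * n + 2) * math.exp(-alpha * r * r)
    total = f(0.0) + f(r_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0
```

The cut-off r_max = 12 is adequate for exponents of order one; much smaller exponents would need a larger integration range.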
Exercise 3
(a) The response of a spherical atom to an applied electric field F is defined to be
Exercise 4
An experimentalist claims to be observing phenomena involving excited states of H2O up
to 10 eV above the ground state. What sort of calculations would you contemplate doing
to help analyze the experimental results? What major potential difficulty would warrant
special attention?
Exercise 5
How would you set up practical calculations to determine an accurate binding energy for a
complex between Ar and NH3? How would you provide some calibration of your results?
Solutions
1 Basis Sets
Answer 1
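A nice property of integrals over gaussian functions is that a product of gaussians is again
a gaussian, and also that so many integrals factorize. In cases like this, we use
$$e^{-\alpha r^{2}}e^{-\beta r^{2}} = e^{-(\alpha+\beta)r^{2}}$$
to convert to a single gaussian, and combine with the formula
$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-\alpha r^{2}}\,dx\,dy\,dz = \int_{-\infty}^{\infty}e^{-\alpha x^{2}}dx\int_{-\infty}^{\infty}e^{-\alpha y^{2}}dy\int_{-\infty}^{\infty}e^{-\alpha z^{2}}dz = \left(\frac{\pi}{\alpha}\right)^{3/2}$$
$$\langle\phi_1|\phi_1\rangle = N_1^{2}\left(\frac{\pi}{2\alpha}\right)^{3/2} = 1$$
$$\langle\phi_2|\phi_1\rangle = N_1N_2\left(\frac{\pi}{\alpha+\beta}\right)^{3/2}$$
$$\langle\phi_2|\phi_2\rangle = N_2^{2}\left(\frac{\pi}{2\beta}\right)^{3/2} = 1$$
Eliminating, we find
$$\langle\phi_2|\phi_1\rangle = \left(\frac{4\alpha\beta}{(\alpha+\beta)^{2}}\right)^{3/4} = \left(\frac{\sqrt{\alpha\beta}}{(\alpha+\beta)/2}\right)^{3/2}$$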
The fraction in parentheses in the last form is the ratio of the geometric average to the
arithmetic average of the positive numbers 0: and ~. This is less than 1, except when the
numbers are equal, and then it is 1.
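As a small numerical illustration of this answer, the sketch below evaluates the overlap of two normalized s gaussians on a common center in two ways: from the normalization constants, and as the ratio of geometric to arithmetic mean raised to the power 3/2. The function names are illustrative assumptions, not notation from the notes.

```python
import math

def overlap_from_norms(alpha, beta):
    """N1 * N2 * (pi/(alpha+beta))^(3/2) with N^2 = (2*exponent/pi)^(3/2)."""
    n1 = (2.0 * alpha / math.pi) ** 0.75
    n2 = (2.0 * beta / math.pi) ** 0.75
    return n1 * n2 * (math.pi / (alpha + beta)) ** 1.5

def overlap_from_means(alpha, beta):
    """(geometric mean / arithmetic mean)^(3/2) of the two exponents."""
    geometric_mean = math.sqrt(alpha * beta)
    arithmetic_mean = 0.5 * (alpha + beta)
    return (geometric_mean / arithmetic_mean) ** 1.5
```

For equal exponents both functions return 1, and the overlap drops below 1 as soon as the exponents differ, in line with the inequality between the two means.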
Answer 2
The product to be integrated can be regarded as the product of three factors, which only
depend on x, y, and z, respectively. If any of these integrands is an odd function, the
integral is zero. This happens for the overlap between s and any of the p functions, of
course. For the six d functions, those three with factors $x^{2}$, $y^{2}$ and $z^{2}$ will have non-zero
overlap with an s function, but those of type xy, xz, yz will have zero overlap with s.
Answer 3
Let us start with the simplest approach. The s contaminant can of course be seen by
inspection: it is a symmetric combination of the type $x^{2} + y^{2} + z^{2}$, since this is $= r^{2}$
independent of coordinate system, hence rotationally invariant. Furthermore, the three
functions of type xy, xz, yz are seen (as in Exercise 2) to be orthogonal to each other,
and to the s contaminant. We then need only two more orthonormal functions from the
set spanned by $x^{2}$, $y^{2}$ and $z^{2}$. All non-zero overlaps in this set are either of the type
$S_1 = \langle x^{2}|x^{2}\rangle$ or of the type $S_2 = \langle x^{2}|y^{2}\rangle$. The values $S_1$ and $S_2$ are unknown, but do not
matter: orthogonalization produces e.g. the two combinations $|x^{2}-y^{2}\rangle$ and $|x^{2}+y^{2}-2z^{2}\rangle$.
The more systematic approach is to apply the angular momentum operators:
$$i\hat L_x\,x^{k}y^{l}z^{m}\exp(-ar^{2}) = m\,x^{k}y^{l+1}z^{m-1}\exp(-ar^{2}) - l\,x^{k}y^{l-1}z^{m+1}\exp(-ar^{2})$$
The ansatz $\phi = c_1x^{2} + c_2y^{2} + c_3z^{2}$ and the equation $\hat L^{2}\phi = 6\phi$ give the equation system
$$4c_1 - 2c_2 - 2c_3 = 6c_1$$
$$-2c_1 + 4c_2 - 2c_3 = 6c_2$$
$$-2c_1 - 2c_2 + 4c_3 = 6c_3$$
This equation system is equivalent to $c_1 + c_2 + c_3 = 0$, so it only tells us that any function
we pick, as long as it is orthogonal to the s contaminant, will be an eigenfunction of $\hat L^{2}$
with angular momentum 2. In practice, one uses eigenfunctions of $\hat L_z$ with eigenvalues
$\pm m$, combined to form real combinations, to define the real spherical harmonics. The d
combinations are fairly standard:
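Returning to the 3 x 3 equation system for (c1, c2, c3) above: a short sketch can confirm that the coefficient matrix has eigenvalue 6 = l(l+1) with l = 2 for any vector whose components sum to zero, and eigenvalue 0 for the s contaminant (1, 1, 1). The matrix is read off from the left-hand sides of the system; the helper names are illustrative assumptions.

```python
# Matrix of L^2 in the (non-orthogonal) basis x^2, y^2, z^2, read off from
# the left-hand sides of the equation system for (c1, c2, c3).
L2_MATRIX = [[4, -2, -2],
             [-2, 4, -2],
             [-2, -2, 4]]

def apply_matrix(matrix, vector):
    """Plain matrix-vector product for small integer matrices."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def is_eigenvector(matrix, vector, eigenvalue):
    """True if matrix * vector equals eigenvalue * vector exactly."""
    return apply_matrix(matrix, vector) == [eigenvalue * v for v in vector]
```

For example, is_eigenvector(L2_MATRIX, [1, 1, -2], 6) checks the combination x^2 + y^2 - 2z^2 mentioned in the first approach.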
Answer 4
Since we only care about the radial dependence, it does not matter if we are dealing with
cartesian or spherical harmonic gaussians, nor which component of a shell we choose. The
radial distribution function in any direction will have the form $P_l \propto r^{2l+2}\exp(-2\alpha r^{2})$. It
has a single maximum for r > 0:
$$\left(\frac{dP_l}{dr}\right)_{r_{max}} = 0 \;\Rightarrow\; (2l+2)\,r_{max}^{2l+1} - 4\alpha\,r_{max}^{2l+3} = 0 \;\Rightarrow\; r_{max} = \sqrt{\frac{l+1}{2\alpha}}$$
If $r_{max}$ is to be the same for two gaussian shells with quantum numbers l' and l'', we get
$$\alpha'/\alpha'' = (l'+1)/(l''+1)$$
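The result r_max = sqrt((l+1)/(2 alpha)) and the exponent ratio that follows from it can be tried out with a few lines of code. This is a sketch only; the function names are illustrative, and the radial distribution is taken as r^(2l+2) exp(-2 alpha r^2) without normalization.

```python
import math

def radial_distribution(l, alpha, r):
    """Unnormalized radial distribution r^(2l+2) exp(-2 alpha r^2)."""
    return r ** (2 * l + 2) * math.exp(-2.0 * alpha * r * r)

def r_max(l, alpha):
    """Position of the single maximum for r > 0."""
    return math.sqrt((l + 1) / (2.0 * alpha))

def exponent_for_same_rmax(l_ref, alpha_ref, l_new):
    """Exponent of a shell with l_new whose maximum coincides with the reference shell."""
    return alpha_ref * (l_new + 1) / (l_ref + 1)
```

For instance, a d shell that should peak where a p shell with exponent 0.8 peaks would get the exponent exponent_for_same_rmax(1, 0.8, 2), i.e. 1.5 times larger.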
Answer 5
A beautiful fact is that the product of two gaussians is a new gaussian, even if they are
not on the same center:
$$\alpha(x-A)^{2} + \beta(x-B)^{2} = (\alpha+\beta)x^{2} - 2(\alpha A+\beta B)x + \alpha A^{2} + \beta B^{2} = (\alpha+\beta)(x-C)^{2} + D$$
$$S = \left(\frac{\sqrt{\alpha\beta}}{(\alpha+\beta)/2}\right)^{3/2}\exp\left(-\frac{\alpha\beta}{\alpha+\beta}(A-B)^{2}\right)$$
which is also obvious from the solution of Exercise 1. Note that the new center $C = \frac{\alpha A+\beta B}{\alpha+\beta}$
is simply the weighted mean of the original centra.
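The completion of the square behind the product rule can be verified directly. The sketch below works in one dimension and returns the combined exponent, the new center C and the constant D = alpha*beta/(alpha+beta) * (A - B)^2, which is the value required by the identity; the function name is an illustrative assumption.

```python
def gaussian_product_1d(alpha, a, beta, b):
    """Return (gamma, c, d) such that
    alpha*(x-a)^2 + beta*(x-b)^2 == gamma*(x-c)^2 + d for all x."""
    gamma = alpha + beta
    c = (alpha * a + beta * b) / gamma          # weighted mean of the centers
    d = alpha * beta / gamma * (a - b) ** 2     # constant giving the prefactor exp(-d)
    return gamma, c, d
```

The exponential of -d is the distance-dependent factor that appears in the overlap formula above.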
Answer 6
where $P_1$ is a polynomial of total degree $n_1$, expressing the angular dependence around
center A, $P_2$ has total degree $n_2$ around center B, and
$$C = \frac{\alpha A + \beta B}{\alpha+\beta}$$
$$F = \exp\left(-\frac{\alpha\beta}{\alpha+\beta}(A-B)^{2}\right)$$
$$\gamma = \alpha + \beta$$
In particular, for two p shells, we obtain $x - A_x = (x - C_x) + (C_x - A_x)$ etc., so
$$(x-A_x)(x-B_x) = (x-C_x)^{2} + (X_{CA}+X_{CB})(x-C_x) + X_{CA}X_{CB}$$
and so on, where $X_{CA}$ is short for $C_x - A_x$ etc. The following integrals are standard:
This is an important general point: All the integrals involving the various components
of complete shells of cartesian (or spherical harmonic) gaussians are obtained from a few
values in common for all the integrals, combined with simple expressions involving the
relative coordinates of the centra. Example: With four f shells, all the 2401 four-center
two-electron integrals can be computed by simple arithmetic from two transcendental
function evaluations (one exponential function, and one incomplete gamma function).
Answer 7
Use a coordinate system with z axis perpendicular to the molecule. Assume spherical
harmonic basis functions. We obtain the following number of basis functions:
Each shell    Each atom, primitive    Each atom [4s3p2d]    Each atom [5s4p2d1f]
The number of integrals, for the [4s3p2d] contraction, can be computed as follows. There
are four atoms, and so 64a' + 28a'' or in total 92 basis functions. There will be
$\tfrac{1}{2}\cdot64\cdot65 + \tfrac{1}{2}\cdot28\cdot29$ or 2486 products of basis function pairs with combined symmetry A'.
(This number, by the way, is also the number of one-electron integrals). There are also
64·28 = 1792 such products with A'' symmetry. The number of two-electron integrals is
then $\tfrac{1}{2}\cdot2486\cdot2487 + \tfrac{1}{2}\cdot1792\cdot1793 = 4697869$. The rule-of-thumb gives $92^{4}/(8\cdot2) = 4477456$
(N = 92, g = 2).
Similarly, for the [5s4p2d1f] calculation, we get N = 136 basis functions, subdivided as
92a' + 44a'', giving 5268 A' + 4048 A'' density products, and 22 073 722 integrals. The simple
formula gives 21.4 million. For [5s4p2d2f], the figures are 46 286 079 integrals, while the
estimate gives 45.2 million.
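The counting above is mechanical and can be scripted. The sketch below assumes, as in this answer, a planar molecule (point group Cs) with every atom in the mirror plane and spherical harmonic functions, so that a shell with angular momentum l contributes l+1 functions to a' and l functions to a''. The function name and the dictionary format for the shell composition are illustrative assumptions.

```python
def count_integrals_cs(shells_per_atom, n_atoms, group_order=2):
    """Count basis functions and two-electron integrals for a planar molecule.

    shells_per_atom maps l -> number of contracted shells on each atom,
    e.g. {0: 4, 1: 3, 2: 2} for [4s3p2d].
    Returns (N, exact integral count, rule-of-thumb estimate N^4 / (8 g)).
    """
    n_sym = n_atoms * sum(n * (l + 1) for l, n in shells_per_atom.items())   # a'
    n_anti = n_atoms * sum(n * l for l, n in shells_per_atom.items())        # a''
    # Orbital products of total symmetry A' and A''.
    pairs_sym = n_sym * (n_sym + 1) // 2 + n_anti * (n_anti + 1) // 2
    pairs_anti = n_sym * n_anti
    # Integrals are unique pairs of products with equal symmetry.
    exact = pairs_sym * (pairs_sym + 1) // 2 + pairs_anti * (pairs_anti + 1) // 2
    n_total = n_sym + n_anti
    estimate = n_total ** 4 // (8 * group_order)
    return n_total, exact, estimate
```

Calling count_integrals_cs({0: 4, 1: 3, 2: 2}, 4) gives the values for the four-atom [4s3p2d] case discussed above.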
Note: 1) The number of integrals rises steeply with the size of the basis set. 2) The approximate
formula is quite good: You never really need to hand-compute the exact number of integrals.
3) The integral calculation times may be roughly the same, if it is determined by the
number of primitive integrals, in this case about 0.7 billion. In practice, a modern integral
program is able to skip some of the smallest integrals, and is also able to move some of
the operations that compute the individual integrals to outer loops, where they handle
already-contracted quantities. But as a rough guide, computation time is determined by
the primitives, while disk space is determined by the contracted basis functions.
Answer 8
The number of density products of the four possible symmetry types are:
$$N(A_1) = \tfrac{1}{2}n(a_1)(n(a_1)+1) + \cdots = \tfrac{1}{2}\cdot35\cdot36 + \cdots + \tfrac{1}{2}\cdot24\cdot25 = 1742$$
$$N(B_1) = n(a_1)n(b_1) + n(b_2)n(a_2) = 35\cdot28 + 28\cdot24 = 1652$$
$$N(B_2) = n(a_1)n(b_2) + n(b_1)n(a_2) = 35\cdot28 + 28\cdot24 = 1652$$
$$N(A_2) = n(a_1)n(a_2) + n(b_1)n(b_2) = 35\cdot24 + 28\cdot28 = 1624$$
Answer 9
Let the numbers 1 and 2 stand for arbitrary orbitals on Cu and on F, respectively. Then
the integrals of type (21,11), (21,21), (21,22), and (22,21) will be very small and are not
computed, since the basis function product of type 21 is almost zero.
Answer 10
This calculation used symmetry-adapted basis functions. The reduction shown in Exer-
cise 9 is not relevant. Even at infinite separation, all orbital products are non-zero. On
the other hand, there is a reduction in the number of primitive integrals, and thus also in
CPU time.
2 Hartree-Fock
Answer 1
The Coulomb and exchange operators for an orbital $\phi_j$ are defined by their action on any
arbitrary one-electron function $\chi$:
Answer 2
It is simplest to use the spin-orbital formulation. We can always order the orbitals so that
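the removed spin-orbital is the last. The energies of the N-electron and (N - 1)-electron
systems are
$$E_N = \sum_{i=1}^{N}\langle i|h|i\rangle + \sum_{i=2}^{N}\sum_{j=1}^{i-1}\left(\langle ij|ij\rangle - \langle ij|ji\rangle\right)$$
$$E_{N-1} = \sum_{i=1}^{N-1}\langle i|h|i\rangle + \sum_{i=2}^{N-1}\sum_{j=1}^{i-1}\left(\langle ij|ij\rangle - \langle ij|ji\rangle\right)$$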
Answer 3
$$C = \sum_{i=1}^{N-1}\Big(2h_{ii} + \sum_{j=1}^{N-1}\big(2(ii|jj) - (ij|ij)\big)\Big) + h_{NN} + \sum_{i=1}^{N-1}\big(2(ii|NN) - (iN|iN)\big) + h_{aa} + \sum_{i=1}^{N-1}\big(2(ii|aa) - (ia|ia)\big)$$
$D_1$ and $D_4$ are pure triplet states, and give immediately the triplet energy. $D_2$ and $D_3$
together span a space containing precisely one singlet and one triplet state. The sum of
diagonal elements of the submatrix for these two states equals the sum of eigenvalues.
Thus we get
$$E(T) = C + (NN|aa) - (Na|Na)$$
$$E(S) + E(T) = 2C + 2(NN|aa)$$
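i.e. $E(S) = C + (NN|aa) + (Na|Na)$ and $E(S) - E(T) = 2(Na|Na)$. Another method
is to directly write $|S\rangle = \sqrt{\tfrac{1}{2}}(D_2 - D_3)$, so that
$$E(S) = \tfrac{1}{2}\langle D_2 - D_3|\hat H|D_2 - D_3\rangle = \tfrac{1}{2}\left(E(D_2) - \langle D_2|\hat H|D_3\rangle - \langle D_3|\hat H|D_2\rangle + E(D_3)\right)$$
Since $D_2$ and $D_3$ differ in two orbitals, we get
$$\langle D_2|\hat H|D_3\rangle = \langle N\bar a\|\bar N a\rangle = -(Na|Na)$$
leading to the same answer.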
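The two routes to E(S) can be mirrored in a few lines. In the sketch below the 2 x 2 Hamiltonian block over D2 and D3 is assumed to have diagonal elements C + J and off-diagonal elements -K, with J = (NN|aa) and K = (Na|Na), as follows from the trace argument and the coupling element above; the numerical values used to call it are arbitrary illustrations.

```python
def expectation(vector, matrix):
    """<v|M|v> for a real, normalized two-component vector."""
    mv = [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]
    return sum(vector[i] * mv[i] for i in range(2))

def singlet_triplet_energies(c, j, k):
    """Energies of (D2 - D3)/sqrt(2) and (D2 + D3)/sqrt(2)."""
    block = [[c + j, -k],
             [-k, c + j]]
    s = 2.0 ** -0.5
    e_singlet = expectation([s, -s], block)
    e_triplet = expectation([s, s], block)
    return e_singlet, e_triplet
```

The difference of the two returned energies is 2K regardless of C and J, which is the singlet-triplet splitting found above.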
Answer 4
With the approximation $\langle ij|ji\rangle = \delta_{ij}\langle ij|ij\rangle$, we get
$$\epsilon_i = h_i + \sum_j\langle ij|ij\rangle - \langle ii|ii\rangle$$
$$\epsilon_a = h_a + \sum_j\langle aj|aj\rangle$$
This means that a self-energy term is subtracted for occupied orbital energies, but not
for virtuals.
The energy with an added electron can be written as
$$E_{N+1} = E_N + h_a + \sum_j\left(\langle aj|aj\rangle - \langle aj|ja\rangle\right)$$
directly from the standard formula, by recognizing all terms not involving a as precisely
the sum $E_N$. The additional terms are precisely $\epsilon_a$. This is Koopmans' theorem for
electron affinities: If the orbitals do not change, then the E.A.'s on the Hartree-Fock level
of approximation are given by the virtual orbital energies.
The difference in the physical interpretation of occupied vs. virtual orbital energies explains
the so-called HOMO-LUMO energy gap. It is particularly obvious in UHF orbital
energies, in cases where the HOMO and LUMO are spatially equivalent, e.g. H2 near
dissociation.
Answer 5
Substituting $c_k = Tc'_k$ gives the left-hand side $(FT - \epsilon_kST)c'_k$, which in general cannot
be of the required form: we would have to use $T = S^{-1}$, and since this is a symmetric
matrix but does not in general commute with F, the product FT would be asymmetric.
The symmetry is enforced by writing
$$T^{T}(F - \epsilon_kS)Tc'_k = 0$$
which is also natural since it expresses a basis change. This has the required form provided
that $T^{T}ST = 1$, which is the usual matrix form of finding a transformation to an
orthonormal basis. This relation immediately gives $T^{-1} = T^{T}S$. It is convenient to use
this formula rather than matrix inversion when the inverse transformation is needed.
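A minimal sketch of these relations for a two-function basis follows. It assumes the symmetric (Löwdin) choice T = S^(-1/2), which is only one of many transformations satisfying T^T S T = 1; the helper names and the example overlap value are illustrative.

```python
import math

def matmul(a, b):
    """Product of two small square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def symmetric_orthogonalizer(s):
    """T = S^(-1/2) for the 2x2 overlap matrix [[1, s], [s, 1]], |s| < 1.

    S has eigenvalues 1+s and 1-s with eigenvectors (1, 1) and (1, -1).
    """
    p = 1.0 / math.sqrt(1.0 + s)
    m = 1.0 / math.sqrt(1.0 - s)
    return [[0.5 * (p + m), 0.5 * (p - m)],
            [0.5 * (p - m), 0.5 * (p + m)]]

s_value = 0.4                                   # arbitrary example overlap
overlap = [[1.0, s_value], [s_value, 1.0]]
t = symmetric_orthogonalizer(s_value)
unit_check = matmul(matmul(transpose(t), overlap), t)   # should be the unit matrix
t_inverse = matmul(transpose(t), overlap)               # T^-1 = T^T S, no inversion needed
```

Multiplying t by t_inverse reproduces the unit matrix, which is the shortcut for the inverse transformation mentioned above.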
Answer 6
a) The p, q matrix element of the left-hand side is
Answer 7
for the real variation, and similar for the imaginary one, since the virtual orbitals are
orthogonal to all the occupied ones. For the real and the imaginary cases, respectively,
we obtain the energy variations
If the determinant happens to be a Hartree-Fock state, we know that any orbital variation
that preserves orthonormality to first order must have a first order energy variation which
is zero. For such a state, we have now proved Brillouin's theorem: the HF state does not
interact through H with any singly-excited determinant.
Answer 8
At the dissociation limit, we know the orbitals:
$$E(\sigma_g^{2}) = -1 + (gg|gg)$$
$$E(\sigma_g\sigma_u) = -1$$
$$E(\sigma_u^{2}) = -1 + (uu|uu)$$
since any integral with both A and B factors must be zero. Thus the dissociation limit for
the open-shell states is correct, but the closed-shell $1\sigma_g^{2}$ state is (with this basis) 8.5 eV
too high! The explanation is that there is a matrix element of this size which couples
the two closed-shell determinants. Taking this into account (Full CI) produces two linear
combinations of the closed-shell states, one with the correct dissociation energy, and one
which dissociates into H+ and H-. This is of course a common phenomenon: closed-shell
HF is unable to break or form bonds.
3 Second Quantization
Answer 1
Using
with
$$k_l = n_l - \delta_{lj}$$
$$\Gamma(k)_l = \epsilon_{lj}\,\Gamma(n)_l$$
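we obtain
$$a_i^{\dagger}a_j|n\rangle = n_j\,\Gamma(n)_j\,a_i^{\dagger}|k\rangle$$
$$= n_j(1 - k_i)\,\Gamma(n)_j\,\Gamma(k)_i|p\rangle$$
$$= n_j(1 - n_i + \delta_{ij})\,\epsilon_{ij}\,\Gamma(n)_j\,\Gamma(n)_i|p\rangle$$
where $p_l = n_l$ for $l \ne i, j$, $p_i = 1$ and $p_j = \delta_{ij}$.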
Answer 2
1.
Since the equation holds for n=2 (Eq. 2), the induction is complete.
5.
Answer 3
Let
$$\hat\kappa = \sum_{ij}\kappa_{ij}\,a_i^{\dagger}a_j$$
and
50
ij
with
so
[K,gJ = ~ L Kij9klmn[a!aj,ala~anatl
ij/dmn
with
Answer 4
If n = 1, i.e. $I_n$ is one single elementary operator, but m is even, then the fourth formula
of Exercise 2 shows, since then $[A, E_{k+1}]_+$ is a number, that the terms of $[I_1, I_m]$ are
products of at most m - 1 elementary operators. Since $[I_n, I_m] = -[I_m, I_n]$, the roles of
$I_n$ and $I_m$ can be interchanged, so for even n, $[I_n, I_1]$ contains products of at most n - 1
elementary operators. Now let $I_m = E_1\cdots E_m$ and use the second formula of Exercise 2:
$$[I_n, I_m] = \sum_{k=0}^{m-1}E_1\cdots E_k\,[I_n, E_{k+1}]\,E_{k+2}\cdots E_m$$
We now know that the commutator in the right-hand side contains products of at most
n - 1 elementary operators, which proves the statement.
Answer 5
1.
= exp(At)
2. Since
$$\sum_{n=0}^{\infty}\frac{1}{n!}\sum_{m=0}^{n}\frac{n!}{m!\,(n-m)!}A^{m}B^{n-m}$$
$$(A+B)^{n} = \sum_{m=0}^{n}\frac{n!}{m!\,(n-m)!}A^{m}B^{n-m}$$
4.
Answer 6
1.
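$$\exp(D) = 1 + D + \tfrac{1}{2}D^{2} + \tfrac{1}{3!}D^{3} + \cdots + \tfrac{1}{n!}D^{n} + \cdots$$
Since the n'th power of a diagonal matrix D is a diagonal matrix with elements
$(d_{ii})^{n}$ we obtain
$$\det\exp(D) = \exp\Big(\sum_i d_i\Big) = \exp(\mathrm{Tr}\,D)$$
$$= \exp(\mathrm{Tr}(DUU^{-1})) = \exp(\mathrm{Tr}(U^{-1}DU)) = \exp(\mathrm{Tr}\,A)$$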
Answer 7
Since $H_{2D}$ and $H_{SSC}$ contain summations over 2 electrons, they are two-electron operators.
$H_{2D}$ is spin-free, so its second quantization representation is
with
+ L(ijkl)a10 0!polpaio
ijkl
L< ij kl)otoalpalPojO
iikl
In the same way we obtain
Hssc = -2 L a!oalpolpaja (6)
ijkl
Comparing Eqs. 6 and 6 gives directly
Hssc = -2H2D
Answer 8
H,c = L 1'A ~ L
A IJ tritfJ
Jdr dm. ~i<r) O"i(m.)
X {~(S+i + S-d IAz + ~i (S+i - S-i) lAy + S.dA.}
x6(r-rA) (/Ij(r) O"i(m.) a!...;aj"j
Answer 9
We have
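$$[E_{mn}, a_{i\sigma}^{\dagger}] = \sum_{\sigma'}[a_{m\sigma'}^{\dagger}a_{n\sigma'}, a_{i\sigma}^{\dagger}] = \delta_{ni}\,a_{m\sigma}^{\dagger}$$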
and
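$$[E_{mn}, e_{ijkl}] = (\delta_{ni}E_{mj} - \delta_{mj}E_{in})E_{kl} + E_{ij}(\delta_{nk}E_{ml} - \delta_{ml}E_{kn}) - \delta_{jk}\delta_{ni}E_{ml} + \delta_{jk}\delta_{ml}E_{in}$$
$$= \delta_{nk}e_{ijml} + \delta_{nk}\delta_{jm}E_{il} - \delta_{ml}e_{ijkn} - \delta_{ml}\delta_{jk}E_{in} + \delta_{ni}e_{mjkl} + \delta_{ni}\delta_{jk}E_{ml} - \delta_{mj}e_{inkl} - \delta_{mj}\delta_{nk}E_{il} - \delta_{jk}\delta_{ni}E_{ml} + \delta_{jk}\delta_{ml}E_{in}$$
$$= \delta_{nk}e_{ijml} - \delta_{ml}e_{ijkn} + \delta_{ni}e_{mjkl} - \delta_{mj}e_{inkl}$$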
Answer 10
. lIn( a.,a.
[5:, t t ,+a.t ,a.,
v2 " )-, '-,),
t) 1
2. Q;AO,O):
•
[5+,
v2
(t
1M a.,a.t ,-a.t ,a·l 1
" )-, '-,),
t)
[5.,
v2
(t"
• 1M a.,a.t ,-a.t ,a.,
)-,
t)l
'-,).
fI (at, (-a)d + at
( at,E2' a)·_!,
2 V2" 1.2' 2
,a)._!)
1-'2 2
, at1-2',(-a)._d)
2
Except for an overall sign in all components we have the triplet operator in Eq.
(5.26). Substitution of Eq. (5.22) in Eq. (5.25) gives
ff.(a!,(-a.d-a
V'2 'i" 12
f la'_l) =- fI(a!la'l+a! la'_ l)
V'2 '2'
1- 2 J 2 J2 1-2" J 2
Answer 11
From the second quantization expressions for the components of S, Eq. 5.14, it is imme-
diately seen that
Answer 12
A useful idea is to use the Taylor expansion of the exponential and add the even and odd
powers separately.
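a) Use $\kappa^{2} = -\lambda^{2}\mathbf 1$, where $\mathbf 1$ is the unit matrix. Then $\kappa^{2n} = (-\lambda^{2})^{n}\mathbf 1$, while $\kappa^{2n+1} = (-\lambda^{2})^{n}\kappa$:
$$\exp(\kappa) = \sum_{n=0}^{\infty}\frac{\kappa^{2n}}{(2n)!} + \sum_{n=0}^{\infty}\frac{\kappa^{2n+1}}{(2n+1)!}$$
$$= \sum_{n=0}^{\infty}\frac{(-1)^{n}\lambda^{2n}}{(2n)!}\,\mathbf 1 + \sum_{n=0}^{\infty}\frac{(-1)^{n}\lambda^{2n}}{(2n+1)!}\,\kappa$$
$$= \cos\lambda\,\mathbf 1 + \frac{\sin\lambda}{\lambda}\,\kappa = \begin{pmatrix}\cos\lambda & \sin\lambda\\ -\sin\lambda & \cos\lambda\end{pmatrix}$$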
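The closed form in part a) can be compared with a direct summation of the Taylor series. The sketch below assumes the antisymmetric matrix kappa = [[0, lambda], [-lambda, 0]], which is the form consistent with the rotation matrix obtained above; the function names and the number of series terms are illustrative choices.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matrix_exponential(m, terms=30):
    """Sum the Taylor series 1 + m + m^2/2! + ... up to the given number of terms."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = matmul(term, m)
        term = [[x / n for x in row] for row in term]   # term is now m^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def rotation_from_kappa(lam):
    """Closed form cos(lam)*1 + (sin(lam)/lam)*kappa for kappa = [[0, lam], [-lam, 0]]."""
    return [[math.cos(lam), math.sin(lam)],
            [-math.sin(lam), math.cos(lam)]]
```

Summing even and odd powers separately, as done analytically above, is exactly what makes the series collapse to the cosine and sine terms.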
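c) First, split up $\kappa = \kappa_1 + \kappa_2$, where $\kappa_1 = i\lambda\mathbf 1$ with $\lambda = \tfrac{1}{2}(\delta_1 + \delta_2)$. Since $\kappa_1$ is a multiple
of the unit matrix, it commutes with $\kappa_2$ so that $\exp(\kappa) = \exp(\kappa_1)\exp(\kappa_2)$. Use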
2
/1".2 =(
'B
~
• 0
'B
)2 = -C I
2
-0 -~
Answer 13
Either use the standard representation for the spin operators to express the exponential,
in this case
This transformation matrix must then be applied individually to each orbital. One can
proceed similarly to obtain transformation matrices for other spins than $S = \tfrac{1}{2}$.
As an alternative, we are sure to get the many-electron operator correct if we express it
directly in field operators. In part (a), we obtain the transformed creators:
where the n-th order term has n commutators. However, from the second-quantization
form of the S components, we get immediately
1 .t
[S~,iito) = 2aiP
.
[S~, aiP)
·t
= 21aio
.t
where q stands for /3, if n is odd, else o. Summing odd and even terms separately, as in
so many other examples, gives
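c) Repeated use of $S_x = \tfrac{1}{2}(S_+ + S_-)$ shows that $S_x|1,0\rangle = \sqrt{\tfrac{1}{2}}(|1,1\rangle + |1,-1\rangle)$, and
$S_x^{2}|1,0\rangle = |1,0\rangle$, so that
$$S_x^{2n+1}|1,0\rangle = \sqrt{\tfrac{1}{2}}(|1,1\rangle + |1,-1\rangle)$$
$$S_x^{2n}|1,0\rangle = |1,0\rangle$$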
As usual, collecting terms of odd and even order separately in the Taylor expansion gives
Answer 14
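$$X|0\rangle = i\sum_{k\ne0}R_k|k\rangle$$
$$X^{2}|0\rangle = -d^{2}|0\rangle$$
$$X^{2n}|0\rangle = (-1)^{n}d^{2n}|0\rangle$$
$$X^{2n+1}|0\rangle = (-1)^{n}d^{2n}X|0\rangle$$
$$\exp(X)|0\rangle = \cos(d)|0\rangle + i\,\frac{\sin d}{d}\sum_{k\ne0}R_k|k\rangle$$
b) Similarly,
$$X|l\rangle = iR_l|0\rangle$$
$$X^{2n+1}|l\rangle = iR_l(-1)^{n}d^{2n}|0\rangle$$
$$X^{2n+2}|l\rangle = iR_l(-1)^{n}d^{2n}X|0\rangle$$
The last expression, used in the even-term Taylor expansion, would give $-iR_l\,\frac{\cos d}{d^{2}}\,X|0\rangle$.
However, the 0-th order term will be missing, and must be subtracted. Also, insert the
expression $X|0\rangle = i\sum_{k\ne0}R_k|k\rangle$ to obtain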
Answer 15
Choose
b; = exp(-ilt)a;exp(ilt)
It is immediately obvious that
since $\kappa$ is in general not hermitian! The nomenclature is a bit vague: The most logical
choice is to call bt and bi the creator and annihilator, respectively, for orbital h; with
respect to the orbital system {b; }~1. The advantage is that the anticommutation relations
always work, while a disadvantage is that the annihilator is then not defined solely by
the orbital but depends on the choice of the entire orbital set. More common is to define
the 'annihilator' to be the conjugate of the creator, and accept that the anticommutation
relations will have an overlap matrix element instead of a Kronecker delta on the right-
hand side. The most drastic stratagem is to use different notation for the 'hat' and 'tilde'
operators: the orbital index is put in either a superscript or subscript position, and all
expressions are handled by tensor rules.
4 Spin
Answer 1
The sum is simply $[S^{2}, S_z] = 0$, QED. Since all used relations are invariant to cyclic
permutation of labels x, y, z, so is the result, so it is also true that $[S^{2}, S_x] = [S^{2}, S_y] = 0$.
Answer 2
Answer 3
Answer 4
By definition, $\alpha$ and $\beta$ are spin eigenfunctions with $S = \tfrac{1}{2}$, $M_S = \pm\tfrac{1}{2}$, respectively. From
Exercise 3, $S_+\alpha$ is a function with squared norm $S(S+1) - M_S(M_S+1) = 0$. Similarly,
$S_-\beta = 0$. On the other hand, $S_+\beta$ has squared norm $\tfrac{1}{2}(\tfrac{1}{2}+1) - (-\tfrac{1}{2})(\tfrac{1}{2}) = 1$, and so
$S_+\beta = c\alpha$, and similarly $S_-\alpha = d\beta$, where $|c| = |d| = cd = 1$. The last relation comes from
the requirement $\langle\beta|S_-S_+\beta\rangle = 1$ as in Exercise 3, but is also obvious for another reason:
Since $S_+$ and $S_-$ are hermitian conjugates, we must have that $\langle\alpha|S_+\beta\rangle = \langle\beta|S_-\alpha\rangle^{*}$. The
natural choice is thus the standard representation:
$$S_+ = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}\qquad S_- = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix}$$
Directly from the definitions, we obtain then
$$S_x = \tfrac{1}{2}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\qquad S_y = \tfrac{1}{2}\begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}$$
The representation matrix of $S^{2}$ is obviously 3/4 times a unit matrix. This, as well as all
commutation rules, can be verified by simple matrix multiplication.
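The "simple matrix multiplication" can also be delegated to a few lines of code. This sketch uses the standard spin-1/2 representation with S_z = diag(1/2, -1/2) and checks the ladder operators, one commutator and the value S^2 = S(S+1) = 3/4; the helper names are illustrative assumptions.

```python
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

def matmul(a, b):
    """Product of two 2x2 (possibly complex) matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(a, b, factor):
    """Elementwise a + factor * b."""
    return [[a[i][j] + factor * b[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1)

def close(a, b, tol=1e-14):
    """True if two 2x2 matrices agree elementwise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

S_PLUS = combine(SX, SY, 1j)      # S+ = Sx + i Sy
S_MINUS = combine(SX, SY, -1j)    # S- = Sx - i Sy
S_SQUARED = combine(combine(matmul(SX, SX), matmul(SY, SY), 1), matmul(SZ, SZ), 1)
```

The remaining commutators follow by cyclic permutation of the three matrices in the call to commutator.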
Answer 5
From e.g. Slater-Condon rules, or from the Second Quantization lectures, or simply
from noting that the determinant is a multilinear form of its rows (or its columns), the
application of an operator in the 'independent particle' form $A = \sum_i a(i)$ to a determinant
$D = |\psi_1\psi_2\psi_3\ldots\psi_n|$ is obtained as
$$\sum_i a(i)\,|\psi_1\psi_2\psi_3\ldots\psi_n| = |(a\psi_1)\psi_2\psi_3\ldots\psi_n| + |\psi_1(a\psi_2)\psi_3\ldots\psi_n| + \cdots + |\psi_1\psi_2\psi_3\ldots(a\psi_n)|$$
With the overline notation, we note that $s_z\phi = \tfrac{1}{2}\phi$ while $s_z\bar\phi = -\tfrac{1}{2}\bar\phi$. In that case, then,
all terms are equal to the original determinant except for a factor of $\tfrac{1}{2}$ or $-\tfrac{1}{2}$. Thus we
get $S_zD = \tfrac{1}{2}(n_\alpha - n_\beta)D$, where $n_\alpha$ and $n_\beta$ count the number of orbitals without and with
an overbar. Obviously this applies generally whenever all the orbitals are eigenfunctions
of the one-electron operator a: The determinant is simply multiplied by the sum of the
eigenvalues.
Answer 6
Use the result of Exercise 3: $S^{2} = S_+S_- + S_z(S_z - 1)$. The rest of the derivation can
be made rather simple by using the second-quantization rules rather than the 'individual
particle' formulation, but here is the more complicated solution. The one-electron operator
$\sum_i s_+(i)s_-(i)$ is handled just as in Exercise 5. Since $s_+s_-\phi = \phi$, while $s_+s_-\bar\phi = 0$,
we obtain $\sum_i s_+(i)s_-(i)\,D = n_\alpha D$. The two-electron operator $\sum_{i\ne j}s_+(i)s_-(j)$ works as
follows:
Consider a single operator in this sum, with some specific values i, j. In the expansion of
the determinant as a sum of products P, each term has either of four forms which gives
the following results:
It is difficult to directly apply the LHS expression to each of the n! product terms P in
D, but the RHS is expressed in terms of orbitals instead of particles and applies equally
Answer 7
Answer 8
a) From Exercise 7, we can immediately write down the matrix:
$$S^{2} = \begin{pmatrix}2 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 2\end{pmatrix}$$
It is almost diagonal already, we just need to remove the mixing of the $D_2$ and $D_3$ functions.
The central 2 x 2 submatrix has normalised eigenvectors $\sqrt{\tfrac{1}{2}}(1, 1)^{T}$ and $\sqrt{\tfrac{1}{2}}(1, -1)^{T}$,
with eigenvalues 2 and 0, respectively. Now remember that the eigenvalues should be
S(S + 1), so the spins are S = 1 or S = 0. We thus obtain the spin eigenfunctions
$|S, M_S\rangle$:
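$$|1,1\rangle = D_1$$
$$|1,1\rangle = D_1 = |\alpha\alpha|$$
$$S_-|1,1\rangle = |\beta\alpha| + |\alpha\beta|$$
$$\sqrt{1\cdot2 - 1\cdot0}\;|1,0\rangle = |\beta\alpha| + |\alpha\beta|$$
$$O_K = \frac{S^{2} - K(K+1)}{S(S+1) - K(K+1)} = \frac{S^{2} - 2}{0 - 2}$$
$$\tfrac{1}{2}(S^{2})D_2 = \tfrac{1}{2}(D_2 + D_3)$$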
Answer 9
a) As in Exercise 7, we obtain:
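$$S^{2}|\alpha\alpha\alpha| = \tfrac{15}{4}|\alpha\alpha\alpha|$$
$$S^{2}|\alpha\alpha\beta| = |\beta\alpha\alpha| + |\alpha\beta\alpha| + \tfrac{7}{4}|\alpha\alpha\beta|\quad\text{etc.}$$
$$S^{2}|\alpha\beta\beta| = |\beta\alpha\beta| + |\beta\beta\alpha| + \tfrac{7}{4}|\alpha\beta\beta|\quad\text{etc.}$$
$$S^{2} = \tfrac{1}{4}\begin{pmatrix}7 & 4 & 4\\ 4 & 7 & 4\\ 4 & 4 & 7\end{pmatrix}$$
This matrix has unnormalised eigenvectors $(1,1,1)^{T}$, $(1,-1,0)^{T}$, and $(1,1,-2)^{T}$, with
eigenvalues $\tfrac{15}{4}$, $\tfrac{3}{4}$, and $\tfrac{3}{4}$, respectively. These values of S(S + 1) correspond to spin
$S = \tfrac{3}{2}$, $\tfrac{1}{2}$, and $\tfrac{1}{2}$, respectively. Within this subspace, we thus obtain the three normalized
spin eigenfunctions
$$|\tfrac{3}{2},\tfrac{1}{2}\rangle = \sqrt{\tfrac{1}{3}}\left(|\alpha\alpha\beta| + |\alpha\beta\alpha| + |\beta\alpha\alpha|\right)$$
$$|\tfrac{1}{2},\tfrac{1}{2}\rangle = \sqrt{\tfrac{1}{2}}\left(|\alpha\alpha\beta| - |\alpha\beta\alpha|\right)$$
$$|\tfrac{1}{2},\tfrac{1}{2}\rangle = \sqrt{\tfrac{1}{6}}\left(|\alpha\alpha\beta| + |\alpha\beta\alpha| - 2|\beta\alpha\alpha|\right)$$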
Obviously, since the last two are degenerate, we could just as well (as you probably did!)
obtain any orthonormal linear combinations of them instead. The matrix problem for the
three $M_S = -\tfrac{1}{2}$ components is identical.
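The diagonalization in part a) is small enough to check with exact rational arithmetic. The sketch below takes the 3 x 3 matrix of S^2 in the M_S = 1/2 space, with 7/4 on the diagonal and 1 elsewhere as read off from the action of S^2 on the determinants, and tests candidate eigenvectors; the helper names are illustrative assumptions.

```python
from fractions import Fraction

DIAGONAL = Fraction(7, 4)
S2_MATRIX = [[DIAGONAL, Fraction(1), Fraction(1)],
             [Fraction(1), DIAGONAL, Fraction(1)],
             [Fraction(1), Fraction(1), DIAGONAL]]

def apply_matrix(matrix, vector):
    """Exact matrix-vector product."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def eigenvalue_or_none(matrix, vector):
    """Return x if matrix * vector == x * vector exactly, otherwise None."""
    image = apply_matrix(matrix, vector)
    pivot = next(i for i, v in enumerate(vector) if v != 0)
    x = image[pivot] / vector[pivot]
    if all(image[i] == x * vector[i] for i in range(len(vector))):
        return x
    return None
```

Any vector orthogonal to (1, 1, 1) gives 3/4, which is the degeneracy that leaves the two S = 1/2 functions free to be mixed.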
b) Starting with I~,~) = 10001 since it is high spin, we next obtain
and thus
(153 3 1
V"4 - 412' 2) = 100,81 + 10,801 + 1,8001
or I~,~) = y1<loo,81 + 10,801 + 1,800!). The next step is to obtain
or I~, -~) = y1 (10,8,81 + l.Bo,81 + 1,8,8a/). and finally I~, -~) = 1,8,8,81
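To be able to continue, we must take the orthogonal complement to $|\frac32, \frac12\rangle$ in the $M_S = \frac12$ space. We can arbitrarily pick, for instance, the result of a Gram-Schmidt orthonormalization starting with, say, $|\alpha\alpha\beta|$. Its overlap with $|\frac32, \frac12\rangle$ is $\frac{1}{\sqrt3}$. Subtracting $\frac{1}{\sqrt3} |\frac32, \frac12\rangle$ leaves $\frac13 (2|\alpha\alpha\beta| - |\alpha\beta\alpha| - |\beta\alpha\alpha|)$, which is then normalized to give
$$|\tfrac12, \tfrac12\rangle = \tfrac{1}{\sqrt6}\left(2|\alpha\alpha\beta| - |\alpha\beta\alpha| - |\beta\alpha\alpha|\right)$$
and finally orthonormalizing, say, $|\alpha\beta\alpha|$ against this function, and against $|\frac32, \frac12\rangle$, gives another $S = \frac12$ function:
$$|\tfrac12, \tfrac12\rangle = \tfrac{1}{\sqrt2}\left(|\alpha\beta\alpha| - |\beta\alpha\alpha|\right)$$
Whatever the choice of functions here, we next proceed by spinning them down to obtain $|\frac12, -\frac12\rangle$ functions in an obvious way.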
Obviously, we can make many different choices for the two $|\frac12, \frac12\rangle$ functions. The results from the diagonalization method are the most arbitrary ones. The results from spinning down have at least the advantage that the chain of states connected by $S_+$ and $S_-$ will always adhere to the standard formulae. Uniquely defined functions are obtained by so-called spin-coupling schemes, of which the genealogical scheme would be simplest and best for the purposes of this exercise. Several other schemes are also used.
Answer 10
Since the functions have $M_S = 0$, the simplest test is to see if $S_+$ (or, if you prefer, $S_-$)
gives zero:
Answer 11
In overline notation, the closed-shell determinant is $|0\rangle = |i\bar{i}j\bar{j}|$. Remember that $E_{pq} = \sum_\sigma a^\dagger_{p\sigma} a_{q\sigma}$, where the operator $a^\dagger_{p\sigma} a_{q\sigma}$ acts on a determinant by giving a non-zero result only if $\psi_{q\sigma}$ is present but $\psi_{p\sigma}$ is not, and in that case replacing $\psi_{q\sigma}$ by $\psi_{p\sigma}$ in place. We immediately obtain
Answer 12
We can do the calculation as in Answer 11. It is simpler to reuse that result, just inter-
changing letters i,j:
Obviously, interchange of letter symbols does not alter the fact that the function is a
singlet.
In the two expressions, the two middle determinants differ, so they are independent.
However, the first and last terms give a non-zero overlap, so they are not orthonormal.
5 Geometrical Derivatives . ..
Answer 1
$e(x)$ is defined as
$$e(x) = E(x, \lambda(x))$$
The chain rule of differentiation gives
The first and last terms are evaluated by the chain rule:
For higher derivatives, most people find it simpler to write up the Taylor expansion of
$E(x, \lambda)$ and then replace the variable $\lambda$ with the expansion of the function $\lambda(x)$.
Answer 2
The variational condition is to be valid for all $x$, so it must be fulfilled in each order of $x$ separately. The Taylor expansion of $\partial E/\partial\lambda$ is
$$\frac{\partial E(x, \lambda)}{\partial\lambda} = E^{(01)} + xE^{(11)} + \tfrac12 x^2 E^{(21)} + \tfrac16 x^3 E^{(31)} + \ldots$$
$$\qquad + \lambda\left(E^{(02)} + xE^{(12)} + \tfrac12 x^2 E^{(22)} + \ldots\right)$$
$$\qquad + \tfrac12\lambda^2\left(E^{(03)} + xE^{(13)} + \ldots\right) + \tfrac16\lambda^3\left(E^{(04)} + \ldots\right) + \ldots$$
Replace the powers of the variable $\lambda$ by the expressions
$$\lambda(x) = x\lambda^{(1)} + \tfrac12 x^2\lambda^{(2)} + \tfrac16 x^3\lambda^{(3)} + \ldots$$
$$\lambda(x)^2 = x^2 (\lambda^{(1)})^2 + x^3 \lambda^{(1)}\lambda^{(2)} + \ldots$$
$$\lambda(x)^3 = x^3 (\lambda^{(1)})^3 + \ldots$$
Collecting equal powers of $x$ into subexpressions, and setting each to zero, gives:
$$E^{(01)} = 0$$
$$E^{(11)} + \lambda^{(1)} E^{(02)} = 0$$
$$E^{(21)} + \lambda^{(2)} E^{(02)} + 2\lambda^{(1)} E^{(12)} + (\lambda^{(1)})^2 E^{(03)} = 0$$
These expressions can be used to eliminate terms in other expressions. They are the response equations of order zero, one, etc. In the general expression for the second derivative,
$$e^{(2)} = E^{(20)} + 2\lambda^{(1)} E^{(11)} + \lambda^{(2)} E^{(01)} + (\lambda^{(1)})^2 E^{(02)}$$
use of the zero-order response equation eliminates the $\lambda^{(2)}$ term:
$$e^{(2)} = E^{(20)} + 2\lambda^{(1)} E^{(11)} + (\lambda^{(1)})^2 E^{(02)} \qquad (7)$$
Subtraction of $\lambda^{(1)}$ times the first-order response equation gives
$$e^{(2)} = E^{(20)} + \lambda^{(1)} E^{(11)} \qquad (8)$$
Answer 3
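It is assumed that the energy partial derivatives are exact, but the response parameters have small errors. If we use equation 7, but replace $\lambda^{(1)}$ with $\lambda^{(1)} + \delta$, the result will change from the exact $e^{(2)}$ to $e^{(2)} + \Delta$:
$$e^{(2)} + \Delta = E^{(20)} + 2(\lambda^{(1)} + \delta)E^{(11)} + (\lambda^{(1)} + \delta)^2 E^{(02)}$$
$$= E^{(20)} + 2\lambda^{(1)} E^{(11)} + (\lambda^{(1)})^2 E^{(02)} + 2\left(E^{(11)} + \lambda^{(1)} E^{(02)}\right)\delta + E^{(02)}\delta^2$$
$$= e^{(2)} + 0\cdot\delta + E^{(02)}\delta^2$$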
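The quadratic error behaviour of equation 7 can be illustrated numerically. The sketch below uses made-up values for the partial derivatives (they are not taken from the text) and compares equation 7 with equation 8; for equation 8 the same substitution gives an error $E^{(11)}\delta$, i.e. linear in $\delta$.

```python
# Hypothetical model values for the partial derivatives (illustration only).
E20, E11, E02 = 1.5, 0.4, 2.0

# First-order response equation: E11 + lam1 * E02 = 0.
lam1 = -E11 / E02

def e2_eq7(lam):
    """Second derivative from equation 7 (variational form)."""
    return E20 + 2.0 * lam * E11 + lam * lam * E02

def e2_eq8(lam):
    """Second derivative from equation 8 (non-variational form)."""
    return E20 + lam * E11

exact = e2_eq7(lam1)  # equals e2_eq8(lam1) when lam1 is exact

def errors(delta):
    """Errors of equations 7 and 8 when lam1 is perturbed by delta."""
    return e2_eq7(lam1 + delta) - exact, e2_eq8(lam1 + delta) - exact

for delta in (1e-1, 1e-2, 1e-3):
    err7, err8 = errors(delta)
    print(f"delta={delta:.0e}  eq.7 error={err7:.3e}  eq.8 error={err8:.3e}")
```

With these numbers a tenfold reduction of $\delta$ reduces the equation 7 error a hundredfold, but the equation 8 error only tenfold.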
Answer 4
The first-order response equation gives immediately
$$\lambda^{(1)} = -\left(E^{(02)}\right)^{-1} E^{(11)}$$
It is written this way, rather than as a fraction, to allow generalization to many dimensions.
E(02) is then a matrix, not a number.
and similarly
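$$\frac{\partial^2 E}{\partial\lambda_n \partial\lambda_m} = \ldots = 2\left(\langle m|H|n\rangle - \langle 0|H|0\rangle\right)$$
With $H = H^{(0)} + xH^{(1)} \ldots$ we obtain
$$E^{(00)} = \langle 0|H^{(0)}|0\rangle \qquad E^{(10)} = \langle 0|H^{(1)}|0\rangle$$
$$E_n^{(01)} = 2\langle 0|H^{(0)}|n\rangle \qquad E_n^{(11)} = 2\langle 0|H^{(1)}|n\rangle$$
$$E_{mn}^{(02)} = 2\left(\langle m|H^{(0)}|n\rangle - E_0\right)$$
Since we assume eigenstates of $H^{(0)}$, $E_{mn}^{(02)} = 2(E_n - E_0)\delta_{mn}$. Then
$$e^{(2)} = E^{(20)} - E^{(11)}\left(E^{(02)}\right)^{-1} E^{(11)}$$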
Answer 5
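Subtract $\lambda^{(3)}$ times the zero-order, and $3\lambda^{(2)}$ times the first-order response equation:
$$E^{(30)} + \lambda^{(3)} E^{(01)} + 3\lambda^{(2)} E^{(11)} + 3\lambda^{(1)} E^{(21)} + 3\lambda^{(1)}\lambda^{(2)} E^{(02)} + 3(\lambda^{(1)})^2 E^{(12)} + (\lambda^{(1)})^3 E^{(03)}$$
$$- \lambda^{(3)} E^{(01)}$$
$$- 3\lambda^{(2)} E^{(11)} - 3\lambda^{(1)}\lambda^{(2)} E^{(02)}$$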
Answer 6
$$(B \times (r - O))\cdot p = ((r - O) \times p)\cdot B = L_O\cdot B$$
where we use the canonical momentum $p = -i\nabla$. Taking $B$ to define the quantization axis, we get
$$H_O = \tfrac12 \pi_O^2 + V = \tfrac12 p^2 + V + \tfrac12 B L_z + \tfrac12 A_O^2$$
Answer 7
We now get
$$(B \times (r - G))\cdot p = ((r - O) \times p)\cdot B + (B \times (O - G))\cdot p$$
so that
$$H_G = H^{(0)} + \tfrac12 B L_z + B\, d\cdot p + O(B^2)$$
where $d$ is a constant vector. This is not O.K., since the $\psi_{lm}$ functions are not eigenfunctions of this extra term. It can be noted that this extra term is perfectly analogous to an extra term that appears when doing a Galileo transformation: the "velocity term" in describing e.g. atom/atom collisions.
Answer 8
In the usual position representation, the result follows immediately from differentiation:
for functions F and f,
Answer 9
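$$T_{\mu\nu} = \langle\omega_\mu|\tfrac12\pi_G^2|\omega_\nu\rangle$$
$$= \langle\psi_\mu|\exp(iA_G(M)\cdot r)\,\tfrac12\pi_G^2\,\exp(-iA_G(N)\cdot r)|\psi_\nu\rangle$$
$$= \langle\psi_\mu|\exp(i(A_G(M) - A_G(N))\cdot r)\,\tfrac12\pi_N^2|\psi_\nu\rangle$$
and
$$A_G(M) - A_G(N) = \tfrac12 B \times (M - N)$$
and finally, to bring it to the required form,
$$\tfrac12 (B \times (M - N))\cdot r = \tfrac12 B\cdot((M - N) \times r)$$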
Answer 10
No. It merely introduces a phase shift, which depends on the center position, for each
set of basis functions on a common center. All commonly used methods are insensitive to
phase shifts for the individual basis functions.
Answer 11
Assume a convergent iterative method, with an error $e_k$ in iteration $k$ which decreases towards zero. Define the convergence order $\gamma$ to mean that for sufficiently large $k$, $|e_{k+1}|/|e_k|^\gamma$ is bounded. For any $\gamma' < \gamma$, we then obtain
$$|e_{k+1}|/|e_k|^{\gamma'} = |e_k|^{\gamma - \gamma'}\, |e_{k+1}|/|e_k|^{\gamma}$$
which is also bounded, since the first factor goes to zero. In general then: Convergence order $\gamma$ implies convergence order $\gamma'$, if $\gamma' < \gamma$.
Answer 12
The equation y(x) = 0 is to be solved where y is usually the gradient of some function to
be optimized. The Newton-Raphson method is defined by
$$x_{k+1} = x_k - G(x_k)^{-1} y(x_k)$$
Multiplying by $G_k$ gives $G_k e_{k+1} = G_k e_k - y_k$, and the definition of $G$ gives then
Assuming that the inverse of G is bounded (which is true in finite dimensions since the
inverse is already assumed to exist) this implies
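The second-order convergence of the Newton-Raphson iteration, and the statement of Answer 11 that order 2 implies order 1, can be observed on a one-dimensional toy problem. This is a minimal sketch; the test function $f(x) = x^4/4 - x$ (gradient $y = x^3 - 1$, Hessian $G = 3x^2$, solution $x = 1$) is a hypothetical example and does not come from the text.

```python
def newton(y, G, x0, steps):
    """Newton-Raphson iterates x_{k+1} = x_k - y(x_k) / G(x_k) in one dimension."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - y(x) / G(x))
    return xs

def y(x):
    return x**3 - 1.0      # gradient of the hypothetical f(x) = x**4/4 - x

def G(x):
    return 3.0 * x**2      # second derivative of f

xs = newton(y, G, 2.0, 5)
errors = [abs(x - 1.0) for x in xs]

# Ratios |e_{k+1}| / |e_k|**gamma for gamma = 2 and gamma = 1.
ratios_order2 = [errors[k + 1] / errors[k] ** 2 for k in range(len(errors) - 1)]
ratios_order1 = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]

for k, (r2, r1) in enumerate(zip(ratios_order2, ratios_order1)):
    print(f"k={k}  e_k={errors[k]:.3e}  order-2 ratio={r2:.3f}  order-1 ratio={r1:.3e}")
```

The order-2 ratios stay bounded, while the order-1 ratios tend to zero and are therefore bounded as well.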
Answer 13
The minimum is stated to lie on the boundary $s^T s = h^2$, so it coincides with the solution
of the constrained problem for which Lagrange's method gives the equation
or
$$\epsilon_m = g^T\left(-(G - \mu)^{-1} + \tfrac12 (G - \mu)^{-1} G (G - \mu)^{-1}\right) g$$
Answer 14
a) PSB update formula, directly inserted in the quasi-Newton condition, gives
$$B_+ s_c = B_c s_c + \frac{t_c\, s_c^T s_c\, s_c^T s_c + s_c\, t_c^T s_c\, s_c^T s_c - t_c^T s_c\, s_c\, s_c^T s_c}{(s_c^T s_c)^2}$$
This looks bad, but is simple: the scalar $s_c^T s_c$ can be divided away. The second and third term in the numerator cancel. It remains
$$B_+ s_c = B_c s_c + t_c = y_c$$
which follows directly from the definition of $t_c$.
b) BFGS update formula, directly inserted in the quasi-Newton condition, gives
$$B_+ s_c = B_c s_c + \frac{y_c\, y_c^T s_c}{y_c^T s_c} - \frac{B_c s_c\, s_c^T B_c s_c}{s_c^T B_c s_c}$$
Here, the scalars $y_c^T s_c$ and $s_c^T B_c s_c$ can be divided away. The first and last terms on the right-hand side then cancel.
Answer 15
We want to find a stationary value of $(\nabla f(x))^2$ restricted to the set of points $x$ for which $f(x) = \text{const}$. Lagrange's method gives
$$\nabla(\nabla f(x))^2 = \lambda\nabla f(x)$$
For simplicity, use $g = \nabla f(x)$, and cartesian coordinates:
$$\frac{\partial}{\partial x}\left(g_x^2 + g_y^2 + g_z^2\right) = 2\left(G_{xx} g_x + G_{xy} g_y + G_{xz} g_z\right) = \lambda g_x$$
where $G$ was introduced for the second derivative matrix (Hessian) of $f(x)$; the $y$ and $z$ components have similar equations. The required equation is obtained by identifying $\mu = \frac{\lambda}{2}$ and using vector notation:
$$Gg = \mu g$$
Answer 1
The wave function for a hydrogen atom in atomic units is
$$\psi(r) = Ne^{-r}$$
where $N$ is some normalizing constant to be determined. The density and its integral over all space are then
$$\rho(r) = N^2 e^{-2r}$$
$$\int\rho(r)\, dv = N^2\int_0^\infty e^{-2r}\, 4\pi r^2\, dr = N^2\pi$$
Normalization gives $N^2\pi = 1$, i.e. $\rho(r) = e^{-2r}/\pi$. Numerically then, $\rho(0) \approx 0.318$ a.u. $\approx 2.148$ Å$^{-3}$. The mean distance parameter is
$$r_s = \left(\frac{3}{4\pi\rho}\right)^{1/3} = \left(\frac34\right)^{1/3} e^{2r/3}$$
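The numbers quoted for the hydrogen atom are easy to reproduce. The sketch below assumes the rounded conversion factor 1 bohr = 0.529177 Å; the function names and the simple grid integration are illustrative choices.

```python
import math

BOHR_IN_ANGSTROM = 0.529177  # rounded conversion factor (assumption of this sketch)

def density(r):
    """Ground-state hydrogen density in atomic units, rho(r) = exp(-2r)/pi."""
    return math.exp(-2.0 * r) / math.pi

def r_s(r):
    """Mean distance parameter r_s = (3 / (4 pi rho))**(1/3) at radius r."""
    return (3.0 / (4.0 * math.pi * density(r))) ** (1.0 / 3.0)

def norm(r_max=30.0, h=1e-3):
    """Radial integral of rho over all space on a uniform grid."""
    n = int(r_max / h)
    return sum(density(i * h) * 4.0 * math.pi * (i * h) ** 2 * h for i in range(n + 1))

rho0_au = density(0.0)
rho0_per_A3 = rho0_au / BOHR_IN_ANGSTROM**3

print(f"norm          = {norm():.6f}")
print(f"rho(0)        = {rho0_au:.3f} a.u. = {rho0_per_A3:.3f} A^-3")
print(f"r_s at r = 0  = {r_s(0.0):.4f} a.u.")
```

At the nucleus $r_s = (3/4)^{1/3}$ a.u., and it grows as $e^{2r/3}$ away from it.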
Answer 2
By definition, the functional derivative is a (possibly generalized) function, call it $D = \delta J/\delta\rho$, such that, to first order in $\delta\rho$,
$$J[\rho + \delta\rho] - J[\rho] = \int D(r)\,\delta\rho(r)\, dv$$
When $J$ is defined as an integral involving $\rho$, the functional derivative can be identified directly by differentiation. Any derivatives of $\delta\rho$ are eliminated by integration by parts:
$$\delta J = \delta\int w\rho^4|\nabla\rho|^2\, dv = \int w\left(4\rho^3\,\delta\rho\,|\nabla\rho|^2 + 2\rho^4\,\nabla\rho\cdot\nabla\delta\rho\right) dv$$
Answer 3
Equations (92) and (93) may perhaps be better understood if we note that $F$ is an ordinary function in 4 scalar variables: $\rho$, and the three components of its gradient, $g = \nabla\rho$. The last term of (93) in component form is then
$$-\frac{d}{dx}\frac{\partial F}{\partial g_x} - \frac{d}{dy}\frac{\partial F}{\partial g_y} - \frac{d}{dz}\frac{\partial F}{\partial g_z}$$
For any concrete, non-linear $F$ we know the expression for the partial derivatives of $F$, and we will then need to evaluate
$$\langle\eta_\alpha|v_{xc}|\eta_\beta\rangle = \int\eta_\alpha(r)\, v_{xc}(r)\,\eta_\beta(r)\, dv$$
Integration by parts (now using vector analysis notation) of the troublesome part of the integral gives
$\rho$ has dimension $L^{-3}$ ($L$ = length). $\nabla$ has dimension $L^{-1}$. In toto, then, $\nabla\rho/\rho^{4/3}$ has dimension $L^{-1}L^{-3}/L^{-4} = L^0$, i.e. it is dimensionless.
Answer 6
The occupied orbitals are those with $|k| < k_F$, where $k_F$ is to be determined. The density "matrix" in position representation is then
(Remember to sum over the two possible spins!) where $s = r_1 - r_2$ and $k\cdot s = ks\cos\theta$. For $s = 0$, we get immediately the ordinary density and thus the relation between $\rho$ and $k_F$:
$$\rho = \rho(r, r) = \frac{1}{4\pi^3}\int_0^{k_F} 4\pi k^2\, dk = \frac{k_F^3}{3\pi^2}$$
The density matrix is obtained by substituting $u = \cos\theta$ and $du = -\sin\theta\, d\theta$:
and is infinite of course. However, its average per unit volume is finite, and can be regarded as an exchange potential:
$$\ldots = -\frac{9\rho}{4k_F^2}\, 4\pi\int_0^\infty\frac{(\sin t - t\cos t)^2}{t^5}\, dt = -\frac{3}{4\pi}\left(3\pi^2\rho\right)^{1/3}$$
Answer 1
(a) The exact correlation energy is $N\epsilon_p$.
Note that since $\Psi_p$ is the exact solution for our two-electron system, we also have
Consider now the case of $N$ non-interacting two-electron systems and the truncated wave function $\Psi(N) = \Psi_0 + \sum_p\chi_p$. We have
$$\epsilon(\mathrm{trunc}) = N\,\frac{2\langle\Psi_0|H|\chi_p\rangle + \langle\chi_p|H|\chi_p\rangle}{1 + NS}$$
From the two relations for the correlation energy above we can determine that
$$\langle\chi_p|H|\chi_p\rangle = S\epsilon_p - \epsilon_p$$
$$\epsilon(\mathrm{trunc}) = N\,\frac{(1 + S)\epsilon_p}{1 + NS}$$
As $N \to \infty$ we have
$$\frac{N(1 + S)\epsilon_p}{1 + NS} \to \frac{1 + S}{S}\,\epsilon_p$$
Thus the correlation energy with this truncated wave function expansion goes to a constant value as $N$ increases, as tabulated below.
(c) The wave function $\Psi(N) = \Psi_0 + \sum_p\chi_p$ is not variationally optimum for this problem in general. We note that the only freedom needed is to scale the individual $\chi_p$ by a factor, since the correlation part of the wave function must continue to solve the individual two-electron problems. The trial wave function is therefore $\Psi(N) = \Psi_0 + k\sum_p\chi_p$, where $k$ is to be determined by the variation principle. The correlation energy functional is
This clearly displays an $N^{-1/2}$ dependence for large $N$. Substitution of $k_{opt}$ into the correlation energy gives the results listed below for the approaches we have considered. (We have used $S = 0.01814$ and $\epsilon_p = -0.0410\,E_h$, which are appropriate to H$_2$.) The limiting behaviour of both the truncated and the optimized truncated expansions is very clear in these results.
N ε(exact) ε(trunc) ε(opt)
1 -0.0410 -0.0410 -0.0410
2 -0.0820 -0.0806 -0.0806
5 -0.2050 -0.1914 -0.1922
10 -0.4100 -0.3534 -0.3595
20 -0.8200 -0.6134 -0.6469
50 -2.0500 -1.0956 -1.3129
100 -4.1000 -1.4855 -2.1320
200 -8.2000 -1.8070 -3.3389
500 -20.5000 -2.0767 -5.7925
1000 -41.0000 -2.1855 -8.5889
10000 -410.0000 -2.2935 -29.3832
100000 -4100.0000 -2.3049 -95.2649
1000000 -41000.0000 -2.3061 -303.6405
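The three columns of the table can be regenerated from the formulas of this answer. In the sketch below, `eps_trunc` is the expression given in the text; the scaled functional $\epsilon(k) = N\epsilon_p\,[2k + k^2(S-1)]/(1 + k^2NS)$ and its stationary point, the positive root of $NSk^2 + (1-S)k - 1 = 0$, are a reconstruction from the matrix elements quoted above, since the corresponding displayed equations are missing in this copy. With the rounded parameters $S = 0.01814$ and $\epsilon_p = -0.0410$ the tabulated values are reproduced only to within a few tenths of a percent.

```python
import math

S = 0.01814       # overlap <chi_p|chi_p> quoted in the text for H2
EPS_P = -0.0410   # pair correlation energy in E_h quoted in the text

def eps_exact(n):
    """Exact correlation energy of n non-interacting pairs."""
    return n * EPS_P

def eps_trunc(n):
    """Truncated expansion Psi_0 + sum_p chi_p (formula from the text)."""
    return n * (1.0 + S) * EPS_P / (1.0 + n * S)

def k_opt(n):
    """Positive root of n*S*k**2 + (1 - S)*k - 1 = 0 (reconstructed condition)."""
    a, b = n * S, 1.0 - S
    return (-b + math.sqrt(b * b + 4.0 * a)) / (2.0 * a)

def eps_opt(n):
    """Scaled expansion Psi_0 + k * sum_p chi_p evaluated at k_opt."""
    k = k_opt(n)
    return n * EPS_P * (2.0 * k + k * k * (S - 1.0)) / (1.0 + k * k * n * S)

for n in (1, 2, 10, 100, 10000, 1000000):
    print(f"{n:>8d} {eps_exact(n):12.4f} {eps_trunc(n):10.4f} {eps_opt(n):10.4f}")
```

For large $N$ the reconstructed $k_{opt}$ behaves as $(NS)^{-1/2}$, which is the $N^{-1/2}$ dependence mentioned in the text, and `eps_trunc` saturates at $(1+S)\epsilon_p/S$.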
Answer 2
Obviously, the RHS can generate at most double excitations from $\Psi_0$. Hence a matrix element like $\langle\Psi_{IJK}^{ABC}|[W, T_1]|\Psi_0\rangle$ must be zero. We can say that there is no connected contribution to the $T_3$ equation from $WT_1$.
(c) It is clear that the reader is mentally using the Slater-Condon rules for matrix elements to infer that the triples equation, having triple excitations in the bra, must have singles through quintuples (five-fold excitations) in the ket. This is true for CI, since the operator whose matrix elements we need is simply $W$ (or $H$, if you like). But this is not the case for the CC equations! The "matrix elements" here are over the operator $\exp(-T)W\exp(T)$, and are between triples, in this case, and $\Psi_0$. We must explicitly look at the Hausdorff expansion to see what will contribute. For instance, there is no connected contribution to the triples equation from $T_1^4$, although the naive expectation might be that such a "quadruple excitation" term would appear.
(d) The linear terms appearing in the CCSDTQ equations are as follows:
$T_1$ equation: $WT_1$, $WT_2$, $WT_3$
$T_2$ equation: $WT_1$, $WT_2$, $WT_3$, $WT_4$
$T_3$ equation: $WT_2$, $WT_3$, $WT_4$
$T_4$ equation: $WT_3$, $WT_4$
We may conclude from this that in the most general CC expansion, we will have linear terms $WT_n$, $WT_{n+1}$, $WT_{n+2}$, and $WT_{n-1}$. We do not obtain a term $WT_{n-2}$. On the other hand, CI equations at the same level would include the latter term. So the linearized CC equations are actually simpler, in terms of the matrix elements required, than the CI equations.
Answer 3
Suppose we wish to use the process of solving the CC equations to estimate Møller-Plesset perturbation theory energies.
(a) The perturbation energy of order $n$ can be obtained from the CC correlation energy formula as
where, for example, $T_2(n)$ is the $n$-th order perturbation theory term of the doubles amplitudes. Let us concentrate first on the perturbed wave functions. By order we have:
$$D_2 T_2(1) = W,$$
giving us the first-order amplitudes as
$$t_{IJ}^{AB}(1) = \frac{\langle\Psi_{IJ}^{AB}|W|\Psi_0\rangle}{\epsilon_I + \epsilon_J - \epsilon_A - \epsilon_B},$$
In third order
Substituting the expanded form of the $t_{IJ}^{AB}(1)$ amplitudes given above gives the usual MP2 energy formula. Similarly, we have
(c) These expressions are not the most efficient, since they require the perturbed wave function of order $n$ to determine the energy at order $n + 1$. Wigner's formulae
$$E_{2n} = \langle n - 1|W|n\rangle,$$
$$E_{2n+1} = \langle n|W|n\rangle$$
provide a much more efficient strategy. For example, the fourth-order energy would be $\langle 1|W|2\rangle$. Consideration of the above results then shows that $E_4$ comprises contributions from matrix elements between double excitations and singles, connected doubles, connected triples, and disconnected quadruples. Things are less clear for the fifth-order energy obtained this way. The second-order wave function does not include connected quadruples, so it is not obvious how these terms arise in the fifth-order energy. The connected quadruples, in lowest order (third), are given in terms of connected triples and disconnected quadruples. Some of the terms in the fifth-order expression $\langle 2|W|2\rangle$ have exactly the form of the equations that define the connected quadruples, as discussed in Chapter 4.
8 Truncated CI ...
Answer 1
where
and
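$$H_{12} = (gu|gu)$$
(From now on, however, no explicit expressions in terms of integrals will be needed.)
Diagonalize:
$$\begin{vmatrix} \epsilon_0 - \epsilon & H_{12} \\ H_{12} & \epsilon_1 - \epsilon \end{vmatrix} = 0$$
$$\epsilon = \frac{\epsilon_0 + \epsilon_1}{2} - \sqrt{\left(\frac{\epsilon_0 + \epsilon_1}{2}\right)^2 - \epsilon_0\epsilon_1 + H_{12}^2}$$
Let $\epsilon = \epsilon_0 + \epsilon_c$:
$$\epsilon_c = \frac{\epsilon_1 - \epsilon_0}{2} - \sqrt{\left(\frac{\epsilon_1 - \epsilon_0}{2}\right)^2 + H_{12}^2} \quad\leftarrow\ \text{Correlation energy}$$
Use
$$\Delta = \frac{\epsilon_1 - \epsilon_0}{2}, \qquad x = \frac{H_{12}}{\Delta} = \frac{2H_{12}}{\epsilon_1 - \epsilon_0}$$
Also: let the eigenfunctions be $\phi = d_0\phi_0 + d_1\phi_1$, where the coefficients are obtained as an eigenvector of the hamiltonian matrix.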
Use two isolated hydrogen molecules. Antisymmetrization can be disregarded. Basis
functions of the CI are
On the other hand, we know that the eigenfunction is just the product
where
$$c_0 = d_0^2; \quad c_1 = c_2 = d_0 d_1; \quad c_3 = d_1^2$$
The matrix elements of iI = iIA + iIB are for example
('ltoliIllIIl) = (<p~<pgliIA + iIBI¢Jg<p~)
= (¢J~lilAltP~)(tP:ltP~) + (¢JgltP~)(¢J:lilBltP~)
= H12 x 0 + 1 X HI2 = HI2
In full
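$$H = \begin{pmatrix} 2\epsilon_0 & H_{12} & H_{12} & 0 \\ H_{12} & \epsilon_0 + \epsilon_1 & 0 & H_{12} \\ H_{12} & 0 & \epsilon_0 + \epsilon_1 & H_{12} \\ 0 & H_{12} & H_{12} & 2\epsilon_1 \end{pmatrix}$$
Now use the already-solved eigenproblem for the 2-electron case,
$$\begin{pmatrix} \epsilon_0 & H_{12} \\ H_{12} & \epsilon_1 \end{pmatrix}\begin{pmatrix} d_0 \\ d_1 \end{pmatrix} = \epsilon\begin{pmatrix} d_0 \\ d_1 \end{pmatrix}$$
i.e.,
$$Hc = 2\epsilon c$$
The above can be just as easily expressed using the correlation energy directly. Just subtract $2\epsilon_0$ from each diagonal element:
$$\begin{pmatrix} 0 & H_{12} & H_{12} & 0 \\ H_{12} & \epsilon_1 - \epsilon_0 & 0 & H_{12} \\ H_{12} & 0 & \epsilon_1 - \epsilon_0 & H_{12} \\ 0 & H_{12} & H_{12} & 2(\epsilon_1 - \epsilon_0) \end{pmatrix}\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix} = \epsilon_c\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}$$
(where $\epsilon_c$ is the correlation energy of this specific problem. It is twice as large as the one calculated in exercise 1).
We now introduce a more efficient notation: Let
$$\Psi = c_0\Psi_0 + c_D\Psi_D + c_Q\Psi_Q$$
$$\Psi_D = \sqrt{\tfrac12}\left(\phi_1^A\phi_0^B + \phi_0^A\phi_1^B\right)$$
$$\Psi_Q = \phi_1^A\phi_1^B$$
$$\begin{pmatrix} 0 & \sqrt2 H_{12} & 0 \\ \sqrt2 H_{12} & \epsilon_1 - \epsilon_0 & \sqrt2 H_{12} \\ 0 & \sqrt2 H_{12} & 2(\epsilon_1 - \epsilon_0) \end{pmatrix}\begin{pmatrix} c_0 \\ c_D \\ c_Q \end{pmatrix} = \epsilon_c\begin{pmatrix} c_0 \\ c_D \\ c_Q \end{pmatrix}$$
of course with the same solution as before:
$$c_Q = d_1^2 = \tfrac12 c_D^2/c_0$$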
Answer 3
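Now we do an SDCI instead, by removing the quadruple excitation. The new eigenproblem is
$$\begin{pmatrix} 0 & \sqrt2 H_{12} \\ \sqrt2 H_{12} & \epsilon_1 - \epsilon_0 \end{pmatrix}\begin{pmatrix} c_0 \\ c_D \end{pmatrix} = \epsilon_c^{SDCI}(2\mathrm{H}_2)\begin{pmatrix} c_0 \\ c_D \end{pmatrix}$$
Such an equation was solved already in problem 1. If we keep the same value of $x$, we now get
$$\epsilon_c^{SDCI}(2\mathrm{H}_2) = \Delta\left(1 - \sqrt{1 + 2x^2}\right)$$
Expanding in powers of $x$, we get
$$\epsilon_c^{SDCI}(2\mathrm{H}_2) = -\Delta x^2 + \tfrac12\Delta x^4 + \ldots$$
$$\epsilon_c(\mathrm{H}_2) = -\tfrac12\Delta x^2 + \tfrac18\Delta x^4 + \ldots$$
so the two results agree in the lowest non-vanishing order but differ from fourth order on, and as we see,
$$\epsilon_c^{SDCI}(2\mathrm{H}_2) \neq 2 \times \epsilon_c(\mathrm{H}_2)$$
so SDCI is not size consistent.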
Answer 4
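Similar to exercise 2, let us put
$$\Psi = \phi^A\phi^B\phi^C\ldots$$
$$= (d_0\phi_0^A + d_1\phi_1^A)(d_0\phi_0^B + d_1\phi_1^B)\ldots$$
$$= d_0^N\,\phi_0^A\phi_0^B\phi_0^C\ldots + d_0^{N-1}d_1\left(\phi_1^A\phi_0^B\phi_0^C\ldots + \phi_0^A\phi_1^B\phi_0^C\ldots + \ldots\right) + \ldots$$
$$= c_0\Psi_0 + c_D\Psi_D + c_Q\Psi_Q + \ldots$$
where
$$\Psi_D = \sqrt{\tfrac1N}\left(\phi_1^A\phi_0^B\phi_0^C\ldots + \phi_0^A\phi_1^B\phi_0^C\ldots + \phi_0^A\phi_0^B\phi_1^C\ldots + \ldots\right)$$
$$\Psi_Q = \sqrt{\tfrac{2}{N(N-1)}}\left(\phi_1^A\phi_1^B\phi_0^C\ldots + \phi_1^A\phi_0^B\phi_1^C\ldots + \phi_0^A\phi_1^B\phi_1^C\ldots + \ldots\right)$$
etc.,
and
$$c_0 = d_0^N$$
$$c_D = \sqrt N\, d_0^{N-1} d_1$$
$$c_Q = \sqrt{\tfrac{N(N-1)}{2}}\, d_0^{N-2} d_1^2$$
etc.
We get the matrix elements of the (shifted) hamiltonian
$$\langle\Psi_0|H - N\epsilon_0|\Psi_0\rangle = 0$$
$$\langle\Psi_D|H - N\epsilon_0|\Psi_0\rangle = \sqrt{\tfrac1N}\,(H_{12} + H_{12} + \ldots) = \sqrt N H_{12}$$
$$\langle\Psi_D|H - N\epsilon_0|\Psi_D\rangle = \tfrac1N\left((\epsilon_1 - \epsilon_0) + (\epsilon_1 - \epsilon_0) + \ldots\right) = \epsilon_1 - \epsilon_0$$
$$\langle\Psi_Q|H - N\epsilon_0|\Psi_D\rangle = \ldots = \sqrt{2(N-1)}\, H_{12}$$
$$\langle\Psi_Q|H - N\epsilon_0|\Psi_Q\rangle = \ldots = 2(\epsilon_1 - \epsilon_0)$$
etc.,
and a tridiagonal FCI matrix:
$$\begin{pmatrix} 0 & \sqrt N H_{12} & 0 & \cdots \\ \sqrt N H_{12} & (\epsilon_1 - \epsilon_0) & \sqrt{2(N-1)}\, H_{12} & \cdots \\ 0 & \sqrt{2(N-1)}\, H_{12} & 2(\epsilon_1 - \epsilon_0) & \cdots \\ 0 & 0 & \sqrt{3(N-2)}\, H_{12} & \cdots \\ \vdots & & & \end{pmatrix}$$
Now, the SDCI approximation means we truncate to the upper-left two-by-two submatrix and are left with the problem:
$$\begin{pmatrix} 0 & \sqrt N H_{12} \\ \sqrt N H_{12} & \epsilon_1 - \epsilon_0 \end{pmatrix}\begin{pmatrix} c_0 \\ c_D \end{pmatrix} = \epsilon_c\begin{pmatrix} c_0 \\ c_D \end{pmatrix}$$
$$\epsilon_c^{SDCI}(N\mathrm{H}_2) = \Delta\left(1 - \sqrt{1 + Nx^2}\right)$$
Also, since the problem becomes asymptotically similar to one in which only the off-diagonal elements $\sqrt N H_{12}$ matter,
$$\begin{pmatrix} c_0 \\ c_D \end{pmatrix} \to \begin{pmatrix} \sqrt{1/2} \\ -\sqrt{1/2} \end{pmatrix}, \quad\text{and}\quad \epsilon_c = -\sqrt N H_{12}$$
when $N \to \infty$.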
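The size-consistency argument can be checked numerically for the model of $N$ non-interacting H$_2$ molecules. The sketch below builds the tridiagonal full-CI matrix with elements $2k\Delta$ on the diagonal and $\sqrt{(k+1)(N-k)}\,H_{12}$ off the diagonal (the general pattern read off from the first elements listed in the answer), verifies that the product-state coefficients give exactly $N$ times the single-molecule correlation energy, and compares with the SDCI formula. The numerical values of $\Delta$ and $H_{12}$ are hypothetical.

```python
import math

def two_level(delta, h12):
    """Lowest eigenpair of [[0, h12], [h12, 2*delta]] (single H2, shifted by eps0)."""
    e = delta - math.sqrt(delta**2 + h12**2)
    d0, d1 = 1.0, e / h12          # first row: h12 * d1 = e * d0
    norm = math.hypot(d0, d1)
    return e, d0 / norm, d1 / norm

def fci_matrix(n, delta, h12):
    """Tridiagonal full-CI matrix in the basis of symmetric k-fold excited states."""
    m = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        m[k][k] = 2.0 * k * delta
        if k < n:
            m[k][k + 1] = m[k + 1][k] = math.sqrt((k + 1) * (n - k)) * h12
    return m

def fci_residual(n, delta, h12):
    """max |(H c - n*e*c)_i| for the product-state coefficient vector c."""
    e, d0, d1 = two_level(delta, h12)
    c = [math.sqrt(math.comb(n, k)) * d0 ** (n - k) * d1**k for k in range(n + 1)]
    m = fci_matrix(n, delta, h12)
    return max(abs(sum(m[i][j] * c[j] for j in range(n + 1)) - n * e * c[i])
               for i in range(n + 1))

def sdci_energy(n, delta, h12):
    """SDCI correlation energy Delta * (1 - sqrt(1 + n x^2)), x = h12/Delta."""
    return delta * (1.0 - math.sqrt(1.0 + n * (h12 / delta) ** 2))

DELTA, H12 = 0.5, 0.1   # hypothetical model parameters
e1 = two_level(DELTA, H12)[0]
for n in (1, 2, 10, 100):
    print(f"N={n:4d}  FCI={n * e1:10.5f}  SDCI={sdci_energy(n, DELTA, H12):10.5f}")
```

The full-CI energy grows linearly with $N$, whereas the SDCI energy grows only as $\sqrt N$ for large $N$.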
Answer 5
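$$\langle c_0\Psi_0 + c_D\Psi_D|H - N\epsilon_0|c_0\Psi_0 + c_D\Psi_D\rangle = \epsilon_c^{SDCI}\,\langle c_0\Psi_0 + c_D\Psi_D|c_0\Psi_0 + c_D\Psi_D\rangle = \epsilon_c^{SDCI}\,(c_0^2 + c_D^2)$$
$$\Rightarrow\quad \epsilon_c(\mathrm{functional}) = \epsilon_c^{SDCI}\,\frac{c_0^2 + c_D^2}{c_0^2 + g c_D^2} = \epsilon_c^{SDCI} \quad\text{if } g = 1.$$
b) in general we get
$$\epsilon_c = 2\Delta\,\frac{t(\sqrt N x + t)}{1 + g t^2}$$
If $g = 0$, and if we want the minimum, we get
(The approximation lies in the use of the SDCI wave function in the CEPA-0 functional.)
d) ACPF (Average Coupled Pair Functional) means that $g = 1/N$. We want to minimize
$$\epsilon_c^{ACPF} = 2\Delta\,\frac{t(\sqrt N x + t)}{1 + t^2/N}$$
Substitute $t = \sqrt N u$, to "normalize" the denominator:
$$= 2\Delta\,\frac{\sqrt N u(\sqrt N x + \sqrt N u)}{1 + u^2} = N\, 2\Delta\,\frac{u(x + u)}{1 + u^2} = N\epsilon_c^{ACPF}(\mathrm{H}_2)$$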
eo 2
e) A system of non-interacting hydrogen molecules and helium atoms. Note that the
number of molecules and atoms is irrelevant. We consider two subsystems, denoted prime
and bis.
Now approximate:
in general, except of course if $g = 0$. We conclude that ACPF is not size consistent for this mixed system, but CEPA-0 is.
9 Accurate Calculations . ..
Answer 1
RHF is size-extensive, but not generally size-consistent. UHF is both size-extensive and
size-consistent. Coupled-cluster and MP perturbation theory are size-extensive, as is full
CI, but truncated CI methods are not. Size-consistency depends on the reference function:
In general, RHF-based coupled-cluster and perturbation theory treatments will not be
size-consistent. MCSCF methods are often constructed to be size-consistent, although this
depends on the configuration space chosen. Such methods are size-extensive in principle,
although this may be hard to achieve in practice.
Answer 2
Consider the normalization for the radial functions $r^n\exp(-\alpha r^2)$ and $r^{n+2}\exp(-\delta r^2)$. From the given integral formula, the normalization factors are found to be
and
$$\left(\frac{(2\delta)^{(2n+7)/2}\, 2^{n+4}}{(2n + 5)!!\,\sqrt\pi}\right)^{1/2}$$
respectively. The unnormalized overlap is
$$\frac{(2n + 3)!!}{2^{n+3}(\alpha + \delta)^{n+2}}\left(\frac{\pi}{\alpha + \delta}\right)^{1/2}$$
Multiplying all three together, setting $\delta = k\alpha$, differentiating with respect to $k$ and equating the derivative to zero we obtain
$$k = \frac{2n + 7}{2n + 3}$$
(Note that since we are only interested in the exponent dependence of the normalized overlap, we can ignore all the other factors in finding the maximum.) Hence the overlap is a maximum when d and s exponents have the ratio 7/3, or for the f/p case, when the exponent ratio is 9/5.
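The optimum exponent ratio can be confirmed by a brute-force scan. Collecting the exponent-dependent factors of the two normalization constants and of the overlap, and setting $\delta = k\alpha$, leaves a function of $k$ only, $k^{(2n+7)/4}/(1+k)^{(2n+5)/2}$; this reduced form is derived here for the sketch and is not printed in the text. The grid limits are arbitrary choices.

```python
def overlap_factor(k, n):
    """k-dependent part of the normalized overlap of r^n exp(-a r^2)
    with r^(n+2) exp(-k a r^2); all k-independent constants are dropped."""
    return k ** ((2 * n + 7) / 4.0) / (1.0 + k) ** ((2 * n + 5) / 2.0)

def analytic_ratio(n):
    """Stationary point quoted in the text, k = (2n + 7) / (2n + 3)."""
    return (2 * n + 7) / (2 * n + 3)

def scanned_ratio(n, lo=0.5, hi=6.0, steps=55000):
    """Grid search for the k that maximizes the overlap factor."""
    h = (hi - lo) / steps
    best = max(range(steps + 1), key=lambda i: overlap_factor(lo + i * h, n))
    return lo + best * h

for n, label in ((0, "d/s"), (1, "f/p")):
    print(f"{label}: scan = {scanned_ratio(n):.4f}, formula = {analytic_ratio(n):.4f}")
```

The scan locates the maximum at 7/3 for the d/s pair and at 9/5 for the f/p pair, to within the grid spacing.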
Answer 3
Calculation of properties.
(a) The expressions are
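$$\alpha = 2\,\frac{\Delta E_2 F_1^4 - \Delta E_1 F_2^4}{F_1^2 F_2^4 - F_1^4 F_2^2}$$
$$\gamma = -24\,\frac{\Delta E_2 F_1^2 - \Delta E_1 F_2^2}{F_1^2 F_2^4 - F_1^4 F_2^2}$$
where $\Delta E_1$ is the energy change obtained by applying a field strength $F_1$, etc.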
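A quick consistency check of the two-field expressions: assuming the energy expansion $\Delta E(F) = -\frac12\alpha F^2 - \frac{1}{24}\gamma F^4$ (the convention implied by the prefactors 2 and $-24$), synthetic energy changes generated from known $\alpha$ and $\gamma$ should be inverted exactly. The numerical values of $\alpha$, $\gamma$ and the field strengths below are made up for the test.

```python
def model_delta_e(field, alpha, gamma):
    """Energy change for a field F under the assumed expansion
    dE = -alpha F^2 / 2 - gamma F^4 / 24."""
    return -0.5 * alpha * field**2 - gamma * field**4 / 24.0

def alpha_gamma(de1, f1, de2, f2):
    """Polarizability and second hyperpolarizability from two finite-field points."""
    den = f1**2 * f2**4 - f1**4 * f2**2
    alpha = 2.0 * (de2 * f1**4 - de1 * f2**4) / den
    gamma = -24.0 * (de2 * f1**2 - de1 * f2**2) / den
    return alpha, gamma

ALPHA_IN, GAMMA_IN = 11.7, 900.0    # hypothetical input values (a.u.)
F1, F2 = 0.005, 0.010               # hypothetical field strengths (a.u.)

alpha_out, gamma_out = alpha_gamma(model_delta_e(F1, ALPHA_IN, GAMMA_IN), F1,
                                   model_delta_e(F2, ALPHA_IN, GAMMA_IN), F2)
print(f"alpha = {alpha_out:.6f}, gamma = {gamma_out:.4f}")
```

With real finite-field energies the higher-order terms neglected in the expansion limit the accuracy, so the field strengths have to be chosen with some care.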
(b) Dynamical correlation has a substantial effect on polarizabilities, so an accurate treatment of dynamical correlation is required. The best approach for the lower vibrational levels would probably be to use the CCSD(T) method, although if higher vibrational levels were required a multireference treatment would probably be needed, at a considerable increase in expense.
Selection of the basis set would probably be best handled by starting with a good spdf set (perhaps TZ2Pf, or a [4s3p2d1f] ANO or correlation-consistent basis), and then augmenting it with diffuse functions (up to at least d type), monitoring the change in polarizability components and polarizability gradient at $r_e$. The latter can be estimated fairly crudely here by a three-point fit. Once a result converged with respect to basis set is obtained, a larger range of points can be calculated, with a numerical technique used to compute vibrational wave functions and (say) a spline fit to the polarizability components as function of bond length.
Since the polarizability tensor of N$_2$ is rather well established experimentally it could be used for calibration, but the best experimental results are frequency-dependent, so this would not be completely reliable.
Answer 4
Answering "I would propose to steer clear of this project and not perform any calculations
at all" would show good judgment, but if you do get involved, it will be crucial to obtain
a balanced description of valence-Rydberg mixing in the excited states. Since this can
change drastically with geometry, a number of exploratory calculations would be required,
with diffuse functions in the basis and probably a large state-averaged MCSCF followed
by MRCI, or perhaps by doing state-averaged RASSCF calculations. Since the number
of states to be averaged over would be unknown at the beginning and might require
adjustment during the course of the calculations, considerable trial-and-error effort will
be required. You might be better off with new experimental friends!
Answer 5
This is another very difficult problem, but this one is more amenable to theoretical solu-
tion. First, it is vital to obtain a good description of properties, particularly the dipole
moment of NH3 and the polarizability of Ar. Since these are relatively small systems,
an extensive series of basis set and correlation treatments could be examined, in order to
ensure convergence of the desired properties. Second, this is an obvious case for attention
to basis set superposition error, and a proper examination of BSSE using counterpoise
calculations would be mandatory for determining the binding energy, and the optimum
geometry of the complex.
10 MCSCF Theory
This chapter presents solutions to the problems in "The Multiconfigurational (MC) Self-
Consistent Field (SCF) Theory" chapter by B. O. Roos in "Lecture Notes in Chemistry 58,
Lecture notes in Quantum Chemistry, European Summer School in Quantum Chemistry"
(Springer-Verlag Heidelberg, 1992), pp. 177-254. The numbering of the solutions follows
that used by Roos in his lecture notes.
Answer 1.1
then
(10)
and
(11)
An example: In Fig. 1, the H$_2$ molecule with $\phi_1 = 1\sigma_g$ and $\phi_2 = 1\sigma_u$. If electron 1 is close to atom H$_A$, the probability of finding electron 2 at H$_B$ is larger. In this case $P_2(r, r) \to 0$ at dissociation ($a^2 = b^2$ and $\eta_1 = \eta_2 = 1$).
Answer 2.1
and
where
Thus we have
$$C_{ion} = \tfrac{1}{\sqrt2}(C_1 + C_2)$$
$$C_{cov} = \tfrac{1}{\sqrt2}(C_1 - C_2)$$
but for two hydrogen atoms: $C_{ion} = 0$ and $C_{cov} = 1$. Thus $C_1 = -C_2 = \tfrac{1}{\sqrt2}$.
Answer 2.2
The CH$_2$ radical has a triplet ground state, $^3B_1$, with one electron in a $\sigma$ and one in a $\pi$ lone-pair orbital (see Fig. 2).
Two such radicals combine to form the ethene double bond (see Fig. 3).
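Denote the triplet state on radical A as $|1, M\rangle_A$ where $S = 1$ and $M = 1, 0, -1$, with the corresponding notation for radical B. Start by constructing the overall quintet state ($S = 2$, $M = 2$) for the combined system AB:
(12)
Anti-symmetrization is implicitly understood in this formal equation. In order to obtain the singlet state, $|0, 0\rangle_{AB}$, we use the step down operator:
$$|1, 1\rangle_A = |\sigma_A\pi_A|$$
$$|1, -1\rangle_A = |\bar\sigma_A\bar\pi_A|$$
$$|1, 0\rangle_A = \tfrac{1}{\sqrt2}\left\{|\sigma_A\bar\pi_A| + |\bar\sigma_A\pi_A|\right\}$$
with similar relations for B. The vertical bars ($|\ldots|$) here denote a normalized determinant. Inserting into the VB function yields:
$$|0, 0\rangle_{AB} = \tfrac{1}{\sqrt3}\Big\{ |\sigma_A, \pi_A, \bar\sigma_B, \bar\pi_B| - \tfrac12|\sigma_A, \bar\pi_A, \sigma_B, \bar\pi_B| - \tfrac12|\sigma_A, \bar\pi_A, \bar\sigma_B, \pi_B|$$
$$\qquad - \tfrac12|\bar\sigma_A, \pi_A, \sigma_B, \bar\pi_B| - \tfrac12|\bar\sigma_A, \pi_A, \bar\sigma_B, \pi_B| + |\bar\sigma_A, \bar\pi_A, \sigma_B, \pi_B| \Big\}$$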
Thus all configurations, which can be generated by distributing 4 electrons in the 4 MO's and coupling them to a singlet state, are included. (A CASSCF wave function with 4 electrons in 4 orbitals and $S = 0$.) Note: The closed shell HF configuration has the weight 3/16 = 18.75 % in $|0, 0\rangle_{AB}$.
Answer 2.3
The water molecule has 10 electrons. The equilibrium structure has $C_{2v}$ symmetry (see Fig. 4).
The electron structure is in terms of localized orbitals:
where $1s_O$ is the oxygen 1s orbital, OH are the two oxygen-hydrogen bonds, $n\sigma$ is the in-plane oxygen lone-pair, and $n\pi$ is the out-of-plane oxygen lone-pair. Transform to symmetry orbitals:
Thus:
Note: The derivation is heuristic and based on chemical knowledge (the structure). More
complicated structures may be less straightforward.
Carbon dioxide is a linear molecule with $D_{\infty h}$ symmetry (see Fig. 5), and has 22 electrons.
The localized orbitals are:
Note: the two $\pi$ bonds are perpendicular to each other, as are the oxygen $\pi$ lone-pairs.
Now transform to symmetry orbitals:
The $\pi$-orbitals are most easily transformed by noting that they form the energy diagram
$$\Phi = (1\sigma_g)^2(1\sigma_u)^2(2\sigma_g)^2(3\sigma_g)^2(2\sigma_u)^2(4\sigma_g)^2(3\sigma_u)^2(1\pi_u)^4(1\pi_g)^4$$
The formaldehyde molecule has 16 electrons and has in equilibrium $C_{2v}$ symmetry (see Fig. 6).
In localized orbitals:
$1s_O \to 1a_1$
$1s_C \to 2a_1$
$\sigma_{CO} \to 3a_1$
$\sigma_{CH_1} + \sigma_{CH_2} \to 4a_1$
$\sigma_{CH_1} - \sigma_{CH_2} \to 1b_2$
$n_O\sigma \to 5a_1$
$n_O\pi' \to 2b_2$
$\pi_{CO} \to 1b_1$
--
or in terms of symmetry adapted MO's
--
lsc, -lsc. lb:J..
(Icc 2ag
(lC,H,+ (lC.H. +(lC.H. + (lC.H,
(lC.H, + (lC.H. - (lC.H. - (lC.H,
-- 3ag
2b:J..
--
(lC,H, - (lC,H. + (lC.H. - (lC.H, 16,..
(lC,H, - (lC.H. - (lC.H. + (lC.H, Ib,.g
7fCC lilt ..
With the resulting electronic configuration
The benzene molecule has $D_{6h}$ symmetry and 42 electrons. We divide the electrons up into the following groups,
The first and third group will give rise to equivalent symmetry adapted MO's. We therefore treat only the first group (see Fig. 8).
We use the projection operator technique to find out that the six orbitals transform as $a_{1g}$, $b_{1u}$, $e_{1u}$ and $e_{2g}$ (see section 3.4 in "Molecular Symmetry and Quantum Chemistry" by P. R. Taylor in Lecture Notes in Quantum Chemistry, Ed. B. O. Roos, pp. 111). The second group is slightly different since the bonds are located between the atoms rather than on them (compare Figs. 8 and 9).
The only effect is to change $b_{1u}$ into $b_{2u}$. We can now write down the $\sigma$ part of the electronic configuration:
The $\pi$-electron part is less straightforward, since we also have to decide which orbitals are to be filled. The six MO's are: $a_{2u}$, $e_{2u}$, $e_{1g}$, and $b_{2g}$. The ordering of the orbitals by energy can be obtained from some simple calculations (e.g. Hückel) or by just counting the nodes. The result is:
Hint: the projection to symmetry orbitals is most easily performed by considering the subgroup $C_{6v}$ first, realizing that all $\sigma$ orbitals are symmetric with respect to the reflection in the molecular plane of the benzene ring, and that the $\pi$ orbitals are anti-symmetric with respect to the same symmetry operation.
The NO$_2$ molecule is a radical and has $C_{2v}$ symmetry and 23 electrons. Here we have a problem, since there is no obvious structure formula for this molecule, which defines the bonding situation. One possibility is the tri-radical configuration (see Fig. 10), but that is very unlikely. Then we have the two radical resonance structures (see Fig. 11).
The electronic configuration for (1) is
$$(1s_{O_1})^2(1s_{O_2})^2(1s_N)^2(2s_{O_1})^2(2s_{O_2})^2(\sigma_{NO_1})^2(\sigma_{NO_2})^2$$
$$(n\sigma_N)^2(n\sigma_{O_1})^1(n\sigma_{O_2})^2(\pi_{NO_2})^2(n_{O_1}\pi)^2$$
where $n\sigma_N$ is the nitrogen lone-pair, $n\sigma_O$ the in-plane 2p oxygen lone-pair, and $n_O\pi$ the $\pi$ lone-pair. For simplicity we have assumed the oxygen 2s not to be involved in the bonding. It is only the last nine electrons that cause any problem. The seven first orbitals in the row are easily symmetrized to:
The same result is obtained for (2). Consider now the orbitals $n\sigma_{O_1}$ and $n\sigma_{O_2}$. They can be replaced by the symmetrized orbitals
$n\sigma_{O_1} + n\sigma_{O_2} \to 6a_1$
$n\sigma_{O_1} - n\sigma_{O_2} \to 4b_2$
$6a_1$ is "bonding" and we may guess the contribution ($5a_1$ is the $n\sigma_N$ orbital),
$$(5a_1)^2(6a_1)^2(4b_2)^1$$
Figure 13: Labelling of the ozone molecule.
This is wrong! The ground state of NO$_2$ is $^2A_1$ and not $^2B_2$. The reason is the appearance of ionic structures described in Fig. 12.
The $\sigma$-lone-pair part of the wave function is now
The $\pi$-orbitals in NO$_2$ are $1b_1$, $1a_2$, and $2b_1$. With four electrons we obtain the configuration:
The difficulty to write down a consistent electronic structure for N0 2 reflects the fact
that it is not well described by a single configuration. Another example of this situation
is ozone, which will be discussed in the next example.
Answer 2.4
In this example we treat only the 4 $\pi$ electrons. For simplicity it is assumed that the three $\pi$ orbitals $\pi_A$, $\pi_B$, and $\pi_C$ are orthonormal. Formally we can write the three valence structures as (see Fig. 13 for notation):
I $(\pi_A)^2(\pi_B\pi_C)_s$
II $(\pi_B)^2(\pi_A\pi_C)_s$
III $(\pi_C)^2(\pi_A\pi_B)_s$
Now introduce the MO's, $\pi_1$, $\pi_2$, and $\pi_3$ according to the text. For simplicity we also use
$$\pi_+ = \tfrac{1}{\sqrt2}(\pi_1 + \pi_3)$$
$$\pi_- = \tfrac{1}{\sqrt2}(\pi_1 - \pi_3)$$
In this notation we have:
$$\pi_A = \pi_+$$
$$\pi_B = \tfrac{1}{\sqrt2}(\pi_2 + \pi_-)$$
$$\pi_C = \tfrac{1}{\sqrt2}(\pi_2 - \pi_-)$$
Obviously we have:
Further we find
Thus:
.1 = ~(lI"l)2(1I"3)2 +
v2
lrn(7I"1)2(7I"2)2
2v2
- 2~(71"2)2(71"3)2 - ~(7I"11I"3).(1I"2)2
Note: The second term is the HF configuration. It has the weight 1/8 = 12.5 %. For
symmetry reasons we only need
[Figure 14: symmetry labels of the four $\pi$ orbitals of cis-butadiene, $\pi_1$ to $\pi_4$. $C_2$: A S A S; $C_s$: S A S A.]
Answer 2.5
The $\pi$-orbitals in cis-butadiene (the $\sigma$-orbitals remain essentially unchanged during the reaction and need not be considered) have the form according to Fig. 14, where we have also indicated whether they are symmetric (S) or anti-symmetric (A) under the $C_2$ or $C_s$ point groups, respectively. The orbitals $\pi_1$ and $\pi_2$ are occupied. The corresponding orbitals in cyclobutene are described in Fig. 15.
The con-rotatory reaction path leads to the correlation diagram of Fig. 16.
This reaction path is allowed! For the dis-rotatory reaction path we obtain the correlation diagram according to Fig. 17.
The dis-rotatory reaction path leads to a change of the electronic configuration and is then expected to have a considerable barrier (a Woodward-Hoffmann forbidden reaction).
[Symmetry labels of the cyclobutene orbitals $\sigma$, $\pi$, $\pi^*$, $\sigma^*$. $C_2$: S A S A; $C_s$: S S A A.]
Figure 15: The four sets of orbitals in cyclo-butene produced by ring closure of cis-butadiene.
[Figures 16 and 17: orbital correlation diagrams between $\pi_1$ to $\pi_4$ of cis-butadiene and $\sigma$, $\pi$, $\pi^*$, $\sigma^*$ of cyclobutene for the con-rotatory ($C_2$) and dis-rotatory ($C_s$) reaction paths.]
Answer 2.6
where $E_0$ is the ground state energy, $\epsilon_i$ the orbital energies and
$$J_{ab} = (aa|bb)$$
$$K_{ab} = (ab|ab)$$
To compute these Coulomb and exchange integrals we use the zero-differential overlap
(ZDO) approximation:
For the Coulomb integrals we obtain using the MO's given in the problem (note that
$c_{ip}^2 = c_{i'p}^2$)
$$(j'j'| = \sum_{p(*)} c_{j'p}^2\,(pp| + \sum_{p(\circ)} c_{j'p}^2\,(pp| = (jj|$$
$$(ii| = \sum_{p(*)} c_{ip}^2\,(pp| + \sum_{p(\circ)} c_{ip}^2\,(pp| = (i'i'|$$
Thus $J_{i'j} = J_{ij'}$. In the same way $K_{i'j} = K_{ij'}$. Since also $-\epsilon_i + \epsilon_{j'} = \epsilon_{i'} - \epsilon_j$ we obtain
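$$\langle\Phi_0|\sum_{p,q}\mu_{pq}E_{pq}|\Phi_{i-j'}\rangle = \sqrt{2}\,\mu_{ij'}$$
and
$$\langle\Phi_0|\sum_{p,q}\mu_{pq}E_{pq}|\Phi_{j-i'}\rangle = \sqrt{2}\,\mu_{ji'}$$
$$\mu_{ij'} = \sum_{p,q} c_{ip}c_{j'q}\langle\pi_p|r|\pi_q\rangle$$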
The second term in this expression is zero. Why? According to Fig. 18 R goes to the
midpoint of pq.
With assumed orthogonality we have
Thus all intensity goes into the upper state $\Phi^{+}_{i-j}$, while in this approximation the
transition to $\Phi^{-}_{i-j}$ is forbidden. The separation of the $\pi$ excited states in conjugated
hydrocarbons remains approximately valid also in more accurate treatments (see for example,
Matos & Roos in Theoret. Chim. Acta 74, 363 (1988): The singlet-singlet and
triplet-triplet spectroscopy of naphtalene).
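The pairing relations used in this answer can be checked numerically. The sketch below is an illustration under stated assumptions, not the model of the exercise itself: it uses the analytic Hückel orbitals of a linear chain of four centres, pairs orbital $k$ with $k' = N+1-k$, and takes a made-up two-centre integral `gamma(p, q)` for $(pp|qq)$. Under ZDO the Coulomb and exchange integrals reduce to double sums over atoms. All function names are hypothetical.

```python
import math

N = 4  # pi centres of a linear chain (butadiene); an assumed test case


def coeff(k, p):
    """Hueckel coefficient of MO k on atom p (both counted 1..N)."""
    return math.sqrt(2.0 / (N + 1)) * math.sin(k * p * math.pi / (N + 1))


def energy(k):
    """Hueckel orbital energy in units of |beta|, relative to alpha (k=1 lowest)."""
    return -2.0 * math.cos(k * math.pi / (N + 1))


def paired(k):
    """Index of the orbital paired with k in an alternant system."""
    return N + 1 - k


def gamma(p, q):
    """Hypothetical two-centre integral (pp|qq); only its symmetry matters here."""
    return 1.0 / (1.0 + abs(p - q))


ATOMS = range(1, N + 1)


def coulomb(i, j):
    """J_ij = (ii|jj) in the ZDO approximation."""
    return sum(coeff(i, p) ** 2 * coeff(j, q) ** 2 * gamma(p, q)
               for p in ATOMS for q in ATOMS)


def exchange(i, j):
    """K_ij = (ij|ij) in the ZDO approximation."""
    return sum(coeff(i, p) * coeff(j, p) * coeff(i, q) * coeff(j, q) * gamma(p, q)
               for p in ATOMS for q in ATOMS)


i, j = 1, 2
ip, jp = paired(i), paired(j)
print("J:", coulomb(ip, j), coulomb(i, jp))
print("K:", exchange(ip, j), exchange(i, jp))
print("gaps:", energy(jp) - energy(i), energy(ip) - energy(j))
```

The two excitations $i \to j'$ and $j \to i'$ therefore have the same zeroth-order energy in this model, whatever symmetric choice is made for `gamma`.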
Answer 3.1
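Assume $i < j$
$$\hat{a}_j\hat{a}_i|\ldots m_i\ldots m_j\ldots\rangle = \hat{a}_j\,m_i(-1)^{p_i}|\ldots 0_i\ldots m_j\ldots\rangle = m_j m_i(-1)^{p_i+p_j-1}|\ldots 0_i\ldots 0_j\ldots\rangle$$
$$\hat{a}_i\hat{a}_j|\ldots m_i\ldots m_j\ldots\rangle = \hat{a}_i\,m_j(-1)^{p_j}|\ldots m_i\ldots 0_j\ldots\rangle = m_i m_j(-1)^{p_i+p_j}|\ldots 0_i\ldots 0_j\ldots\rangle$$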
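The sign bookkeeping above can be verified by brute force. The following sketch assumes that $p_i$ counts the occupied spin orbitals standing before position $i$ in the occupation-number vector; the helper names `annihilate` and `apply_pair` are hypothetical. It applies both operator orders to every occupation-number vector of a small system and checks that the results differ only by a sign.

```python
from itertools import product


def annihilate(i, state):
    """Apply a_i to an occupation-number vector; return (factor, new_state)."""
    if state[i] == 0:
        return 0, state
    phase = (-1) ** sum(state[:i])  # p_i: occupied orbitals before position i
    return phase, state[:i] + (0,) + state[i + 1:]


def apply_pair(first, second, state):
    """Apply a_second a_first, i.e. a_first acts first."""
    f1, s1 = annihilate(first, state)
    if f1 == 0:
        return 0, state
    f2, s2 = annihilate(second, s1)
    return f1 * f2, s2


def anticommutes(n_orbitals=4):
    """Check a_i a_j + a_j a_i = 0 on every basis state."""
    for state in product((0, 1), repeat=n_orbitals):
        for i in range(n_orbitals):
            for j in range(i + 1, n_orbitals):
                fa, sa = apply_pair(i, j, state)  # a_j a_i
                fb, sb = apply_pair(j, i, state)  # a_i a_j
                if fa + fb != 0 or (fa != 0 and sa != sb):
                    return False
    return True


print(anticommutes())
```

For the state (1, 1, 1, 0) with $i = 0$ and $j = 2$ one has $p_i = 0$ and $p_j = 2$, so the two orders give the factors $-1$ and $+1$, in line with the exponents $p_i + p_j - 1$ and $p_i + p_j$.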
Answer 3.2
$$[AB, CD] = -[A, C]_+BD + A[B, C]_+D - C[A, D]_+B + CA[B, D]_+$$
where $[A, C]_+ = AC + CA$, etc.
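An operator identity of this kind is easy to test with random matrices, since it must hold for arbitrary non-commuting A, B, C, D. The sketch below checks the expansion of $[AB, CD]$ in anticommutators with the last term written as $CA[B, D]_+$, the ordering for which the eight products cancel pairwise. Integer matrices are used so the comparison is exact. All helper names are hypothetical.

```python
import random


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def chain(*matrices):
    """Ordered matrix product of all arguments."""
    result = matrices[0]
    for m in matrices[1:]:
        result = matmul(result, m)
    return result


def combine(terms):
    """Sum of sign * matrix over a list of (sign, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(sign * m[i][j] for sign, m in terms) for j in range(n)]
            for i in range(n)]


def commutator(x, y):
    return combine([(1, matmul(x, y)), (-1, matmul(y, x))])


def anticommutator(x, y):
    return combine([(1, matmul(x, y)), (1, matmul(y, x))])


def identity_holds(n=3, seed=1):
    """Compare both sides of the identity for random integer matrices."""
    rng = random.Random(seed)
    a, b, c, d = ([[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
                  for _ in range(4))
    lhs = commutator(matmul(a, b), matmul(c, d))
    rhs = combine([
        (-1, chain(anticommutator(a, c), b, d)),
        (1, chain(a, anticommutator(b, c), d)),
        (-1, chain(c, anticommutator(a, d), b)),
        (1, chain(c, a, anticommutator(b, d))),
    ])
    return lhs == rhs


print(identity_holds())
```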
Answer 3.3
Answer 3.4
thus $P_{ijkl} = P^{*}_{lkji} = P^{*}_{jilk}$, Q.E.D. It follows that $P_{ijkl} = P_{klij} = P_{lkji} = P_{jilk}$ for real wave
functions.
Answer 3.5
$$W^{\dagger}UW = D$$
with the elements $D_{ij} = \delta_{ij}\exp(i\theta_i)$. Thus $D = \exp(i\theta)$ and
Answer 3.6
Taylor expand the exponential expression of U and separate the summation into a sine
and cosine like contribution,
$$U = \exp(T) = \sum_{k=0}^{\infty}\frac{1}{k!}T^{k} = \sum_{k=0}^{\infty}\frac{1}{(2k+1)!}T^{2k+1} + \sum_{k=0}^{\infty}\frac{1}{(2k)!}T^{2k}$$
Manipulate the arguments further by noting that
$$T^2 = -\theta^2\mathbf{1}$$
Use this identity to reexpress the arguments in the Taylor expansion, i.e.
Insert this in the Taylor expansion and identify the corresponding Taylor expansions for
a sine and a cosine function (in the latter case we extract the scalar $\theta$ from the
anti-symmetric matrix T and put it inside the summation),
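The resummation can be checked numerically for the simplest case. The sketch assumes a 2x2 anti-symmetric generator $T = \begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix}$, for which $T^2 = -\theta^2\mathbf{1}$ holds exactly, and compares a truncated Taylor series of $\exp(T)$ with the standard closed form $\cos\theta\,\mathbf{1} + (\sin\theta/\theta)\,T$. The function names and the number of Taylor terms are choices made for this illustration.

```python
import math


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def generator(theta):
    """2x2 anti-symmetric matrix T with T^2 = -theta^2 * 1."""
    return [[0.0, theta], [-theta, 0.0]]


def exp_taylor(t, n_terms=30):
    """Truncated Taylor series sum_k T^k / k!."""
    n = len(t)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, n_terms):
        term = [[v / k for v in row] for row in matmul(term, t)]  # T^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def closed_form(theta):
    """cos(theta) * 1 + (sin(theta) / theta) * T, valid for theta != 0."""
    t = generator(theta)
    c, s = math.cos(theta), math.sin(theta) / theta
    return [[c * (i == j) + s * t[i][j] for j in range(2)] for i in range(2)]


theta = 0.7
series = exp_taylor(generator(theta))
exact = closed_form(theta)
print(max(abs(series[i][j] - exact[i][j]) for i in range(2) for j in range(2)))
```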
Answer 3.7
Assume
$\epsilon_i < 0$ since the eigenvalues of an anti-Hermitian matrix (T) are purely imaginary.
where we, after multiplying the first sum on the right hand side of the equation with $\theta_i/\theta_i$,
have $a_{ij} = \delta_{ij}\sin\theta_i/\theta_i$ and $b_{ij} = \delta_{ij}\cos\theta_i$.
Answer 3.8
$$= -\sum_{K\neq 0} S_{K0}^{2}\,|0\rangle = -\theta^{2}|0\rangle$$
From this we obtain
Answer 3.9
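$$S_z = \sum_{i,j} S_{ij}\,\hat{a}_i^{\dagger}\hat{a}_j$$
where the sum runs over all spin orbital pairs. To compute $S_{ij}$ we calculate the matrix
elements $\langle\psi_i\sigma_i|s_z|\psi_j\sigma_j\rangle$ where $\psi_i$ is an MO and $\sigma_i$ is the spin function ($\alpha$ or $\beta$). Keeping
in mind that
$$\langle\psi_i\alpha|s_z|\psi_j\alpha\rangle = \frac{1}{2}\delta_{ij}$$
$$\langle\psi_i\alpha|s_z|\psi_j\beta\rangle = 0$$
$$\langle\psi_i\beta|s_z|\psi_j\beta\rangle = -\frac{1}{2}\delta_{ij}$$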
Thus, only diagonal elements in the sum over spin orbital pairs will survive
(13)
where the sum now runs over MO's. For $S^2$ we use the relation
(14)
So we need the representations of $S_+$ and $S_-$ in Fock space. For one electron we have:
(15)
Now investigate if these spin operators commute with the excitation operator $E_{ij}$. First
explore this for $S_z$ where we have
the same holds for the second term. Hence, $[E_{ij}, S_z] = 0$. It immediately follows from
$[A, BC] = B[A, C] + [A, B]C$ that $[E_{ij}, S_z^2] = 0$. We also note that
and also $[E_{ij}, S_-] = 0$. It follows then from the arguments mentioned above that $[E_{ij}, S^2] = 0$.
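The commutation relations of this answer can be confirmed by brute force in a small Fock space. The sketch below makes several assumptions for the sake of illustration: two spatial orbitals, spin orbitals ordered as (orbital, spin) with $\alpha$ before $\beta$, the usual second-quantized forms $E_{ij} = \sum_\sigma a^\dagger_{i\sigma}a_{j\sigma}$, $S_z = \frac{1}{2}\sum_k (n_{k\alpha} - n_{k\beta})$, $S_+ = \sum_k a^\dagger_{k\alpha}a_{k\beta}$, and the relation $S^2 = S_-S_+ + S_z + S_z^2$. Vectors are stored as dictionaries from occupation tuples to amplitudes. All names are hypothetical.

```python
from itertools import product

N_ORB = 2                          # spatial orbitals (small assumed example)
N_SO = 2 * N_ORB                   # spin orbitals: index = 2*orbital + spin
STATES = list(product((0, 1), repeat=N_SO))


def so(orb, spin):
    """Spin-orbital index; spin 0 is alpha, spin 1 is beta."""
    return 2 * orb + spin


def ladder(kind, idx, state):
    """Apply annihilation ('a') or creation ('c'); return (phase, state) or None."""
    occupied = state[idx] == 1
    if (kind == "a") != occupied:
        return None
    phase = (-1) ** sum(state[:idx])
    return phase, state[:idx] + ((0,) if kind == "a" else (1,)) + state[idx + 1:]


def one_body(terms):
    """Operator sum_k coef_k a+_p a_q acting on a vector {state: amplitude}."""
    def apply(vec):
        out = {}
        for state, amp in vec.items():
            for coef, p, q in terms:
                step = ladder("a", q, state)
                if step is None:
                    continue
                ph1, s1 = step
                step = ladder("c", p, s1)
                if step is None:
                    continue
                ph2, s2 = step
                out[s2] = out.get(s2, 0) + coef * ph1 * ph2 * amp
        return {s: a for s, a in out.items() if a != 0}
    return apply


def excitation(i, j):
    return one_body([(1, so(i, s), so(j, s)) for s in (0, 1)])


S_Z = one_body([(0.5 if s == 0 else -0.5, so(k, s), so(k, s))
                for k in range(N_ORB) for s in (0, 1)])
S_PLUS = one_body([(1, so(k, 0), so(k, 1)) for k in range(N_ORB)])
S_MINUS = one_body([(1, so(k, 1), so(k, 0)) for k in range(N_ORB)])


def s_squared(vec):
    """S^2 = S_- S_+ + S_z + S_z^2."""
    out = {}
    for part in (S_MINUS(S_PLUS(vec)), S_Z(vec), S_Z(S_Z(vec))):
        for s, a in part.items():
            out[s] = out.get(s, 0) + a
    return out


def commutator_vanishes(op_a, op_b):
    for state in STATES:
        ab = op_a(op_b({state: 1}))
        ba = op_b(op_a({state: 1}))
        if any(abs(ab.get(k, 0) - ba.get(k, 0)) > 1e-12 for k in set(ab) | set(ba)):
            return False
    return True


for name, op in [("S_z", S_Z), ("S_+", S_PLUS), ("S_-", S_MINUS), ("S^2", s_squared)]:
    ok = all(commutator_vanishes(excitation(i, j), op)
             for i in range(N_ORB) for j in range(N_ORB))
    print(name, ok)
```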
Answer 4.1
$$\langle 0|[H, E_{ai}]|0\rangle = 0$$
To compute the commutator: we start from
which proves eq. (4:46). Plugging the result into eq. (4:9) also proves eq. (4:41).
the second term here is trivially identical to zero. Hence, the Brillouin condition, which
states that the gradient with respect to orbital changes is zero, can be expressed as $F_{ai} = 0$.
Suppose that the Fock matrix in AO basis is $F^{AO}$. The Brillouin condition then gives:
$$c_a^{\dagger}F^{AO}c_i = 0$$
where $c_p$ is the coefficient vector of the MO $\phi_p$. Consider the general transformation of
an occupied orbital with the coefficient vector $c_i$,
where S is the AO overlap matrix and the sum runs over all MO's. Multiply with $c_p^{\dagger}$ and
use the orthogonality condition for the MO's ($c_p^{\dagger}Sc_q = \delta_{pq}$) to obtain:
$$F^{AO}c_i = \sum_j Sc_jE_{ji} \quad \text{with} \quad E^{\dagger} = E$$
By transforming the occupied MO's among themselves we can obtain E in diagonal form
E=f(E)
Answer 4.2
Consider the equation (4:31) with a diagonal Hessian (the primes are deleted here):
The betweenness condition (see Fig. 19) is immediately clear from the graphical
representation.
Answer 4.3
The closed shell HF Hessian is most easily obtained from (4:53) by assuming that there
are no inactive electrons and that all active orbitals are doubly occupied.
$$F^{I}_{ab} = h_{ab}$$
$$F_{tu} = h_{tu} + \sum_{v}\{2(tu|vv) - (tv|uv)\}$$
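As a small illustration of the closed-shell expression for $F_{tu}$, the sketch below assembles the matrix from one- and two-electron integrals in Mulliken notation $(pq|rs)$. The integrals are synthetic: `make_integrals` is a hypothetical helper that builds $(pq|rs) = \sum_L B^L_{pq}B^L_{rs}$ from random symmetric factors, which guarantees the permutational symmetry of real integrals but has no physical meaning. The sum over $v$ runs over the doubly occupied orbitals.

```python
import random


def make_integrals(n, n_aux=3, seed=0):
    """Synthetic h_pq and (pq|rs) with the symmetry of real integrals."""
    rng = random.Random(seed)

    def symmetric():
        m = [[0.0] * n for _ in range(n)]
        for p in range(n):
            for q in range(p + 1):
                m[p][q] = m[q][p] = rng.uniform(-1.0, 1.0)
        return m

    h = symmetric()
    factors = [symmetric() for _ in range(n_aux)]

    def eri(p, q, r, s):
        # (pq|rs) = sum_L B_L[p][q] * B_L[r][s]
        return sum(b[p][q] * b[r][s] for b in factors)

    return h, eri


def closed_shell_fock(h, eri, n_occ):
    """F_tu = h_tu + sum_v {2 (tu|vv) - (tv|uv)}, v over doubly occupied orbitals."""
    n = len(h)
    return [[h[t][u] + sum(2.0 * eri(t, u, v, v) - eri(t, v, u, v)
                           for v in range(n_occ))
             for u in range(n)] for t in range(n)]


h, eri = make_integrals(4)
fock = closed_shell_fock(h, eri, n_occ=2)
for row in fock:
    print(["%8.4f" % x for x in row])
```

With no occupied orbitals the sum is empty and the matrix reduces to $h$, which mirrors the statement $F^I_{ab} = h_{ab}$ when there are no inactive electrons.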
Answer 4.4
where $D_{pq} = \langle 0|E_{pq}|0\rangle$ is the first order reduced density matrix (the 1-matrix) and $P_{pqrs} =
\frac{1}{2}\langle 0|E_{pq}E_{rs} - \delta_{qr}E_{ps}|0\rangle$ is the corresponding second order reduced density matrix (the
2-matrix). Using eq. (3:22) we can simplify this to:
Answer 4.5
$$\hat{H} = \sum_{k,l} h_{kl}E_{kl} + \frac{1}{2}\sum_{k,l,m,n}(kl|mn)\{E_{kl}E_{mn} - \delta_{lm}E_{kn}\}$$
where k and l have been interchanged in the last summation ($h_{kl} = h_{lk}$), and upon
interchanging the order of the first term we get
"t" E"m,,°l<;
+awa,.. ..
The third and the seventh term cancel. Hence, we are left with six terms. Permute the
indices and identify that the first and second term, the fourth and fifth, and the sixth and
eighth terms are identical. Hence, we have
on the third term and permute indices again to see that the second and the last term
cancel out and we are left with
Form the matrix element $\langle 0|\cdots|0\rangle$ of this expression, and do some additional index
manipulation and the two-electron contribution is expressed as
Answer 5.1
but since both i and j are active orbitals we obtain for a CAS wave function
$$E_{ij}|0\rangle = \sum_{K} C_K|K\rangle$$
where $|K\rangle$ are the eigenstates of the CAS CI secular problem. Hence we get (for real
functions)
$$\sum_{K} C_K\{\langle 0|H|K\rangle - \langle K|H|0\rangle\}$$
Exercise 5.2
Exercise 5.3
Exercise 5.4
Lecture Notes in Chemistry (Springer-Verlag), Vol. 64: B. O. Roos (Ed.), Lecture Notes in Quantum Chemistry II. VII, 340 pages. 1994.