
Theory of Magnetism

Lecture notes

Alexander Tsirlin
Experimentalphysik VI
Universität Augsburg

January 16, 2020

Have corrections or comments? Please, send them to [email protected]


These lecture notes are released under the generic Creative Commons (CC-BY-SA) license.
You are free (and welcome) to disseminate and re-use the full document or any of its parts, as
long as you provide attribution as follows: "Alexander Tsirlin, University of Augsburg".
1 Introduction
1.1 Classical mechanics and Bohr-van Leeuwen theorem
Magnetism covers phenomena mediated by the magnetic field. As obscure as it seems, this
"definition" reflects the fact that magnetic phenomena are easy to perceive but notoriously
hard to understand. They largely remained enigmatic until quantum mechanics was devel-
oped, and only empirical understanding was possible prior to that. It slowly progressed into
a picture of magnetic dipoles introduced similar to electric dipoles, because magnets have
been historically fabricated into bars (like in compasses) with the two poles, North and South.
These dipoles are said to carry magnetic moments µ and interact with the external field B
that changes their energy by $E = -\boldsymbol{\mu}\cdot\mathbf{B}$. The volume integral of the magnetic moment yields
the macroscopic volume magnetization,

$$\mathbf{M} = \int \boldsymbol{\mu}\, dV \tag{1.1}$$

that enters, e.g., Maxwell’s equations. We are more interested in atomic quantities, though,
and thus scale the magnetization per atom throughout this lecture.
Attempts to break dipoles into individual magnetic charges were dismayingly unsuccessful,
because two new dipoles always ensued. This led Maxwell to postulate that individual magnetic
charges (monopoles) do not exist. It seemed natural that some elementary, microscopic charges
inside a solid should move and produce robust, unbreakable magnetic dipoles. Indeed, it has been
known since the early 1800s that a circulating electric current generates a magnetic field (Biot-Savart
law).
The magnetic moment of a current loop is defined as

$$\mathbf{m} = I S \mathbf{n}, \tag{1.2}$$

where $I$ is the current, $S$ is the loop area, and $\mathbf{n}$ is the unit vector perpendicular to the
loop. One can show that a dipole carrying such a moment interacts with any external field in
the same way as the current loop itself.
Consider now a charged particle moving around a loop of radius $r$ with the angular
frequency $\omega$. The resulting current $I = q\omega/(2\pi)$ (charge $q$ divided by the rotation period
$2\pi/\omega$) leads to the magnetic moment

$$\mathbf{m} = \frac{q r^2}{2}\,\boldsymbol{\omega},$$

where $S = \pi r^2$ and we introduced the velocity $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$ defined via the angular
frequency vector $\boldsymbol{\omega}$ directed perpendicular to the loop (this is based on $v = r\omega$ derived from
the rotation period $2\pi/\omega = 2\pi r/v$). Using the vector identity

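$$\mathbf{r}\times\mathbf{v} = \mathbf{r}\times(\boldsymbol{\omega}\times\mathbf{r}) = r^2\boldsymbol{\omega} - (\mathbf{r}\cdot\boldsymbol{\omega})\,\mathbf{r} = r^2\boldsymbol{\omega}$$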

(the second term is zero, because r and ω are orthogonal), we arrive at

$$\boldsymbol{\mu} = \frac{q}{2}\,(\mathbf{r}\times\mathbf{v}) = \frac{q\,\mathbf{l}}{2m}, \tag{1.3}$$
where l is the angular momentum, and m is the particle mass. For an electron with q = −e
and m = me this leads to
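$$\text{(SI)}\quad \boldsymbol{\mu} = -\frac{e\,\mathbf{l}}{2m_e}, \qquad \text{(CGS)}\quad \boldsymbol{\mu} = -\frac{e\,\mathbf{l}}{2m_e c}. \tag{1.4}$$
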
Two important consequences can be derived immediately. First, angular and magnetic moments
of an electron are antiparallel by virtue of electron’s negative charge. Second, as a cross-product
of two vectors, magnetic moment is a pseudo-vector. This has important implications for
its symmetry (Sec. 8).
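
The equivalence of the current-loop definition, Eq. (1.2), and the angular-momentum form, Eq. (1.3), can be checked numerically. The following is a minimal sketch; the orbit radius and angular frequency are assumed, illustrative values (roughly those of a Bohr-like orbit) and are not taken from these notes.

```python
import math

def loop_moment(q, r, omega):
    """Magnetic moment m = I*S of a charge q circling at radius r, Eq. (1.2)."""
    current = q * omega / (2 * math.pi)   # charge divided by the rotation period
    area = math.pi * r**2
    return current * area

def moment_from_angular_momentum(q, mass, r, omega):
    """Magnetic moment mu = q*l/(2*mass), Eq. (1.3), with l = mass*r*v."""
    v = omega * r                          # linear speed on the circular orbit
    l = mass * r * v                       # magnitude of the angular momentum
    return q * l / (2 * mass)

# Assumed illustrative values (SI units)
q = -1.602176634e-19      # electron charge, C
m_e = 9.1093837015e-31    # electron mass, kg
r = 0.529e-10             # orbit radius, m
omega = 4.13e16           # angular frequency, rad/s

mu_loop = loop_moment(q, r, omega)
mu_l = moment_from_angular_momentum(q, m_e, r, omega)
print(mu_loop, mu_l)      # both negative: moment antiparallel to l for an electron
```

Both routes give the same number, and its sign is negative, in line with the antiparallel alignment of the angular and magnetic moments noted above.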
Although magnetic moments can be conceived within classical mechanics, macroscopic
magnetization fails to exist. The general statement known as the Bohr-van Leeuwen theorem states
that the thermal average of the magnetization is always zero in a classical system. This theorem
can be proved by analyzing the Hamilton function of electrons in a magnetic field,
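$$H_{\mathrm{BvL}}(\mathbf{r}_1,\ldots,\mathbf{r}_N,\mathbf{p}_1,\ldots,\mathbf{p}_N) = \sum_i \frac{1}{2m_i}\bigl(\mathbf{p}_i + e\mathbf{A}(\mathbf{r}_i)\bigr)^2 + V(\mathbf{r}_1,\ldots,\mathbf{r}_N),$$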

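where $\mathbf{p}_i$ are electron momenta and $\mathbf{A}$ is the vector potential. The partition function $Z$ is
obtained by integrating over all variables of $H_{\mathrm{BvL}}$,

$$Z(T) = \int d\mathbf{r}_1\ldots d\mathbf{r}_N \int d\mathbf{p}_1\ldots d\mathbf{p}_N\; e^{-H_{\mathrm{BvL}}(\mathbf{r}_1,\ldots,\mathbf{r}_N,\mathbf{p}_1,\ldots,\mathbf{p}_N)/k_B T}.$$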

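Using $\mathbf{p}'_i = \mathbf{p}_i + e\mathbf{A}(\mathbf{r}_i)$, one converts the integration over $\mathbf{p}_i$ into an integration over $\mathbf{p}'_i$ and eliminates
any dependence of $Z$ on $\mathbf{A}$, because the integral is taken over the whole momentum space. Consequently, the Helmholtz
free energy $F = -k_B T \ln Z$ does not depend on $\mathbf{A}$ either. The thermal average of the magnetization,
$\langle\mathbf{M}\rangle = -\partial F/\partial\mathbf{B}$, equals zero, as $F$ is independent of $\mathbf{B} = \operatorname{rot}\mathbf{A}$.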
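
The key step of the proof, the shift of the integration variable, can be illustrated with a one-dimensional toy integral. This is only a sketch under simplifying assumptions: a single momentum component, units with $m = k_B T = 1$, and the product $eA$ treated as a constant shift at a fixed position.

```python
import math

def momentum_integral(shift, mass=1.0, kT=1.0, p_max=40.0, n=40001):
    """Trapezoidal estimate of the integral of exp(-(p + shift)^2 / (2 m kT)) over p."""
    h = 2 * p_max / (n - 1)
    total = 0.0
    for k in range(n):
        p = -p_max + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0   # trapezoidal end-point weights
        total += weight * math.exp(-(p + shift) ** 2 / (2 * mass * kT))
    return total * h

exact = math.sqrt(2 * math.pi)   # Gaussian integral for m = kT = 1
values = [momentum_integral(eA) for eA in (0.0, 0.5, 2.0, 5.0)]
print(values, exact)
```

Whatever shift $eA$ is chosen, the momentum integral returns the same value, so the partition function carries no information about the vector potential.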

1.2 Spin and orbital moments


The central point of the Bohr-van Leeuwen theorem is the continuous nature of $\mathbf{r}_i$ and $\mathbf{p}_i$ that
allowed us to shift the momentum freely. This would not be possible in a quantum system with
discrete values of $\mathbf{r}_i$ and $\mathbf{p}_i$. Bohr thus conjectured that magnetism is a quantum phenomenon
intertwined with the quantized nature of electron movement. Rumor has it that the above
theorem was one of the triggers for Bohr's atomic model, where he postulated that electrons move
around the nucleus in fixed (quantized) orbits acting as microscopic current loops. The magnetic
moment comes out as a consequence. With the magnetic field applied along $z$, one would be
interested in the operator $\hat{l}_z$ having the eigenvalues $l_z = n\hbar$, $n$ integer (Sec. A). This led to the
definition of the Bohr magneton as
$$\mu_B = \frac{e\hbar}{2m_e} = 9.274\cdot 10^{-24}\ \text{J/T} \tag{1.5}$$
that is indeed close to the magnetic moment of an individual electron, although later work
showed this is mere coincidence.
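The next problem was that, even if magnetic moments exist and survive thermodynamically,
they fail to interact strongly enough. The standard dipole-dipole interaction between two magnetic moments of the
size of $\mu_B$ placed at the typical interatomic distance of $r = 3$ Å from each other,

$$E_{\mathrm{dip}} = \frac{\mu_0}{4\pi r^3}\bigl(\boldsymbol{\mu}_1\cdot\boldsymbol{\mu}_2 - 3(\boldsymbol{\mu}_1\cdot\hat{\mathbf{r}})(\boldsymbol{\mu}_2\cdot\hat{\mathbf{r}})\bigr), \tag{1.6}$$

where $\hat{\mathbf{r}} = \mathbf{r}/r$ is the unit vector along the line connecting the two moments,
does not exceed 1 K. Such an interaction strength renders magnetism a low-temperature effect à
la superconductivity, leaving no room for the ferromagnetism of iron or any other instance of
room-temperature magnetism.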
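
The 1 K bound is easy to verify. The sketch below evaluates Eq. (1.6) for two moments of one Bohr magneton each at a distance of 3 Å; the two orientations (head-to-tail and side-by-side) are chosen here for illustration, and the constants are standard SI values typed in by hand.

```python
import math

MU_0 = 1.25663706212e-6    # vacuum permeability, T m / A
MU_B = 9.2740100783e-24    # Bohr magneton, J/T
K_B = 1.380649e-23         # Boltzmann constant, J/K

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dipole_energy(mu1, mu2, r_vec):
    """Dipole-dipole energy, Eq. (1.6), for moments mu1, mu2 (J/T) separated by r_vec (m)."""
    r = math.sqrt(dot(r_vec, r_vec))
    r_hat = [c / r for c in r_vec]         # unit vector along the bond
    angular = dot(mu1, mu2) - 3 * dot(mu1, r_hat) * dot(mu2, r_hat)
    return MU_0 / (4 * math.pi * r**3) * angular

r_vec = (0.0, 0.0, 3e-10)                  # 3 angstrom along z
head_to_tail = dipole_energy((0, 0, MU_B), (0, 0, MU_B), r_vec)
side_by_side = dipole_energy((MU_B, 0, 0), (MU_B, 0, 0), r_vec)
print(head_to_tail / K_B, side_by_side / K_B)   # energies in kelvin
```

Both energies come out at a few hundredths of a kelvin in magnitude, well below the 1 K bound quoted above.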
In Sec. 2, we will see that magnetic interactions are a quantum phenomenon too and can’t be
captured by classical mechanics. So quantum mechanics is central to all problems in magnetism,
but it also makes the whole picture much more involved. The main difficulty at this point is
that the angular momentum operator has only integer eigenvalues in its orbital form ($\hat{\mathbf{L}}$ and
$\hat{L}_z$), but can also appear in the general form ($\hat{\mathbf{J}}$ and $\hat{J}_z$), see App. A. Then, $\hat{\mathbf{J}}^2$ has eigenvalues
$j(j+1)$, whereas the associated eigenvalues of $\hat{J}_z$ are separated by 1 and fall in the range
$-j \le m_j \le +j$, where $j$ is integer or half-integer. While integer $j$'s lead to integer values of
$m_j$ and the magnetic moments equal to the Bohr magneton (or its multiples), half-integer $j$'s yield
half-integer $m_j$'s that have no classical analog.
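
These statements can be made concrete by constructing the angular momentum matrices explicitly. The sketch below uses NumPy and the standard ladder-operator matrix elements in units of $\hbar$; the function name `angular_momentum_matrices` is introduced here for illustration only.

```python
import numpy as np

def angular_momentum_matrices(j):
    """Return (Jx, Jy, Jz) for quantum number j in units of hbar, basis m = j, j-1, ..., -j."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)                       # m values in descending order
    jz = np.diag(m).astype(complex)
    # Raising operator: <m+1| J+ |m> = sqrt(j(j+1) - m(m+1))
    jp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    jm = jp.conj().T                             # lowering operator
    jx = (jp + jm) / 2
    jy = (jp - jm) / complex(0, 2)               # division by 2i
    return jx, jy, jz

for j in (0.5, 1, 1.5):
    jx, jy, jz = angular_momentum_matrices(j)
    j_squared = jx @ jx + jy @ jy + jz @ jz
    # J^2 is j(j+1) times the identity; Jz has eigenvalues m_j = -j ... +j in steps of 1
    assert np.allclose(j_squared, j * (j + 1) * np.eye(len(jz)))
    print(j, np.real(np.diag(jz)))
```

For half-integer $j$ the diagonal of $\hat{J}_z$ contains only half-integer values, which is the case without a classical analog.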
The standard interpretation of these integer and half-integer quantum numbers relies on the
separation of the angular momentum into spin and orbital components. The orbital component
(lz ) is associated with integer values of j and mj and related to the orbital motion of the
electron. The spin component (sz ) corresponds to half-integer values of j and mj . It is an
intrinsic property of the electron and, within non-relativistic quantum mechanics, can only be
introduced in an ad hoc manner. Magnetic moments are defined by¹

$$\mu_s = g_s \mu_B s_z, \qquad \mu_l = g_l \mu_B l_z, \tag{1.7}$$

where $s_z = \pm\tfrac{1}{2}$ and $g_s$ is the spin g-factor. Likewise, $l_z = n$ and $g_l$ is the orbital g-factor (note that
$\hbar$ is already contained within $\mu_B$). The free-electron value $g_s \simeq 2.002$ is very close to 2.0 and
yields $m_s \simeq \mu_B$ in line with Bohr's original definition. Finally, $g_l \simeq 1$ restores the classical
expression for the magnetic moment of an orbiting electron.
The interplay between spin and orbital moments follows general quantum-mechanical rules
for the summation of angular momenta. The total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ has quantum numbers $J$
between $|L - S|$ and $L + S$. The third Hund's rule further postulates that the state with $J = |L - S|$
has lower energy for less than half-filled shells, whereas the state with $J = L + S$ is favored for
more than half-filled shells (and in the case of half-filling only the state with $J = S$ is possible,
because $L = 0$). This effect originates from the spin-orbit coupling,

$$H_{\mathrm{SO}} = \lambda\,\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}, \qquad \lambda = \frac{1}{2m_e^2c^2}\,\frac{1}{r}\frac{dV}{dr}, \tag{1.8}$$

where the spin-orbit-coupling constant $\lambda$ is defined for the case of the spherical potential $V(r)$.
The $dV/dr$ term increases with increasing atomic number $Z$, and thus the spin-orbit
coupling becomes more pronounced (i.e., increasingly relevant) in heavy elements.
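
The third Hund's rule is straightforward to turn into a small routine. The sketch below additionally assumes the first two Hund's rules in their textbook form (maximum $S$, then maximum $L$), of which only the first is discussed in these notes (Sec. 2); the function name `hund_ground_state` is hypothetical.

```python
def hund_ground_state(l, n):
    """Return (S, L, J) of the ground multiplet for n electrons in a shell with orbital number l."""
    n_orb = 2 * l + 1
    if not 0 <= n <= 2 * n_orb:
        raise ValueError("electron count exceeds the shell capacity")
    ml = list(range(l, -l - 1, -1))          # orbitals ordered from m_l = +l down to -l
    n_up = min(n, n_orb)                     # first rule: maximize the total spin
    n_down = n - n_up
    S = (n_up - n_down) / 2
    L = sum(ml[:n_up]) + sum(ml[:n_down])    # second rule (assumed): maximize L for this S
    J = abs(L - S) if n < n_orb else L + S   # third rule; half filling gives L = 0, J = S
    return S, L, J

for label, l, n in [("d1", 2, 1), ("d5", 2, 5), ("d8", 2, 8), ("f3", 3, 3)]:
    print(label, hund_ground_state(l, n))
```

For example, the less than half-filled $d^1$ shell yields $J = |L - S| = 3/2$, whereas the more than half-filled $d^8$ shell yields $J = L + S = 4$.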
Several ad hoc concepts introduced in the last paragraphs are a concise quantum-mechanical
description of magnetic effects on the non-relativistic level. The physical origin of these effects
(electron spin, g-factors, spin-orbit coupling) is essentially relativistic. They can be rigorously
obtained by a canonical transformation of the Dirac equation, but this goes well beyond our
lecture and can be found (without details) in White’s book or, in a more complete form, in any
textbook on quantum electrodynamics. Luckily, most problems in cooperative magnetism have
nothing to do with the relativistic description. We only have to worry about magnetic moments
and their interactions, and in some cases the spin-orbit coupling in the form of Eq. (1.8) comes
into play.

1.3 Spin Hamiltonians


The bulk of magnetic phenomena can be understood in terms of pairwise interactions. The
Heisenberg spin Hamiltonian reads as

$$H_{\mathrm{Heis}} = \sum_{\langle ij\rangle} J_{ij}\,\hat{\mathbf{S}}_i\cdot\hat{\mathbf{S}}_j, \tag{1.9}$$

where the summation is over bonds $ij$, $\hat{\mathbf{S}}_i$ and $\hat{\mathbf{S}}_j$ are spin operators, and $J_{ij}$ is the coupling
constant or exchange integral, not to be confused with the total angular momentum $J$.

¹ A subtlety of the definition in Eq. (1.7) is that it requires the minus sign, because according to Eq. (1.4) the magnetic moment of
an electron is opposite to its angular momentum. However, this minus sign is very often left out for the sake of a
more intuitive interpretation. Without the minus sign, the $s_z = \tfrac{1}{2}$ spin is parallel to the field and the $s_z = -\tfrac{1}{2}$
spin is antiparallel to the field.

Despite its simple form, Eq. (1.9) is prone to ambiguity. First, exchange couplings can be
defined per site or per bond (or, respectively, the summation can be done over sites $i, j$ vs. bonds
$\langle ij\rangle$), leading to a factor of 2 difference in $J_{ij}$. We shall always define $J_{ij}$ per bond and use
the summation over $\langle ij\rangle$, but different conventions can be found in different books and research
papers. Second, the form we used in Eq. (1.9) implies $J_{ij} > 0$ for antiferromagnetic interactions
and $J_{ij} < 0$ for their ferromagnetic counterparts. The other choice, positive ferromagnetic and
negative antiferromagnetic exchange, is not uncommon either.
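
The sign convention can be checked on the smallest possible example, a single bond between two spins-1/2. The following sketch builds $J\,\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2$ with NumPy and diagonalizes it; the value $|J| = 1$ is an arbitrary choice.

```python
import numpy as np

# Spin-1/2 operators in units of hbar
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

def heisenberg_dimer(J):
    """H = J S1.S2 for a single bond of two spins-1/2, Eq. (1.9)."""
    return J * sum(np.kron(s, s) for s in (sx, sy, sz))

def dimer_levels(J):
    """Eigenvalues of the dimer Hamiltonian in ascending order."""
    return np.linalg.eigvalsh(heisenberg_dimer(J))

print(dimer_levels(1.0))    # J > 0: non-degenerate singlet lies lowest
print(dimer_levels(-1.0))   # J < 0: threefold-degenerate triplet lies lowest
```

With $J > 0$ the singlet at $-3J/4$ lies below the triplet at $J/4$ (antiferromagnetic coupling), and the order is reversed for $J < 0$.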
Another important remark is that Eq. (1.9) does not contain (and does not require) any
information on the nature of Ŝi and Ŝj . They are written as spin operators, but stand for
arbitrary angular momentum operators, be they pure spin moments, pure orbital moments, or
their combinations. Two most common situations are as follows:

• Ŝi is the spin moment S renormalized by a small admixture of the orbital moment (weak
spin-orbit coupling). One conveniently writes ms = gµB S, where g is an effective g-factor
comprising gs of Eq. (1.7) and the contribution of the orbital moment.

• Ŝi is an alias for the total angular momentum, the sum of S and L (strong spin-orbit
coupling). The notation Si is preserved to avoid confusion with the exchange integral Jij .

Eq. (1.9) can be generalized using the exchange matrix $\mathbf{J}_{ij}$ in the place of the scalar $J_{ij}$.
Different interactions between different spin components are thus defined. For example, the
interaction between only $z$-components leads to the Ising spin Hamiltonian

$$H_{\mathrm{Ising}} = \sum_{\langle ij\rangle} J_{ij}\,\hat{S}_i^z \hat{S}_j^z. \tag{1.10}$$

Conversely, the XY spin Hamiltonian reads as

$$H_{XY} = \sum_{\langle ij\rangle} J_{ij}\,\bigl(\hat{S}_i^x \hat{S}_j^x + \hat{S}_i^y \hat{S}_j^y\bigr), \tag{1.11}$$

and all intermediate cases are possible too.


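For the sake of completeness, we shall also write the generic bilinear term

$$H = \sum_{\langle ij\rangle} \hat{\mathbf{S}}_i\,\mathbf{J}_{ij}\,\hat{\mathbf{S}}_j, \tag{1.12}$$

which may depend on the bond direction, $\mathbf{J}_{ij} \neq \mathbf{J}_{ji}$. It is common to separate this term into
the symmetric and antisymmetric parts,

$$H = H_{\mathrm{sym}} + H_{\mathrm{asym}} = \sum_{\langle ij\rangle} \hat{\mathbf{S}}_i\,\boldsymbol{\Gamma}_{ij}\,\hat{\mathbf{S}}_j + \sum_{\langle ij\rangle} \mathbf{D}_{ij}\cdot(\hat{\mathbf{S}}_i\times\hat{\mathbf{S}}_j). \tag{1.13}$$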

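The former describes the interaction independent of the bond direction. The latter describes
the so-called Dzyaloshinsky-Moriya interaction that changes sign upon changing the bond
direction, as reflected in its cross-product form

$$H_{\mathrm{DM}} = \sum_{\langle ij\rangle}\Bigl[ D^x(\hat{S}_i^y\hat{S}_j^z - \hat{S}_i^z\hat{S}_j^y) + D^y(\hat{S}_i^z\hat{S}_j^x - \hat{S}_i^x\hat{S}_j^z) + D^z(\hat{S}_i^x\hat{S}_j^y - \hat{S}_i^y\hat{S}_j^x)\Bigr],$$

where $\mathbf{D}_{ij} = (D^x, D^y, D^z) = -\mathbf{D}_{ji}$, because the cross-product changes sign upon swapping $i$
and $j$, while the spin Hamiltonian should remain invariant under this transformation. The
general form of the exchange tensor is then

$$\mathbf{J}_{ij} = \begin{pmatrix} J^x & \Gamma^{xy} + D^z & \Gamma^{xz} - D^y \\ \Gamma^{xy} - D^z & J^y & \Gamma^{yz} + D^x \\ \Gamma^{xz} + D^y & \Gamma^{yz} - D^x & J^z \end{pmatrix}, \qquad \mathbf{J}_{ji} = \begin{pmatrix} J^x & \Gamma^{xy} - D^z & \Gamma^{xz} + D^y \\ \Gamma^{xy} + D^z & J^y & \Gamma^{yz} - D^x \\ \Gamma^{xz} - D^y & \Gamma^{yz} + D^x & J^z \end{pmatrix},$$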

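where we defined the symmetric part of the exchange as

$$\boldsymbol{\Gamma}_{ij} = \begin{pmatrix} J^x & \Gamma^{xy} & \Gamma^{xz} \\ \Gamma^{xy} & J^y & \Gamma^{yz} \\ \Gamma^{xz} & \Gamma^{yz} & J^z \end{pmatrix} = \boldsymbol{\Gamma}_{ji}.$$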
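
The decomposition of the exchange tensor can be verified with classical spin vectors. In the sketch below, all numerical values of $\boldsymbol{\Gamma}_{ij}$, $\mathbf{D}_{ij}$, and the two spin vectors are arbitrary test inputs, and the helper names are introduced for illustration only.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exchange_tensor(gamma, d):
    """Full tensor J_ij: symmetric part gamma plus the antisymmetric part built from d = (Dx, Dy, Dz)."""
    dx, dy, dz = d
    antisym = [[0.0,  dz, -dy],
               [-dz, 0.0,  dx],
               [ dy, -dx, 0.0]]
    return [[gamma[a][b] + antisym[a][b] for b in range(3)] for a in range(3)]

def bilinear(si, J, sj):
    """Classical value of S_i J S_j."""
    return sum(si[a] * J[a][b] * sj[b] for a in range(3) for b in range(3))

# Arbitrary test inputs
gamma = [[1.0, 0.1, 0.2], [0.1, 1.1, 0.3], [0.2, 0.3, 0.9]]
d_ij = (0.05, -0.02, 0.07)
s_i = (0.3, -0.5, 0.8)
s_j = (-0.6, 0.2, 0.7)

J_ij = exchange_tensor(gamma, d_ij)
J_ji = exchange_tensor(gamma, tuple(-c for c in d_ij))   # D_ji = -D_ij

lhs = bilinear(s_i, J_ij, s_j)
rhs = bilinear(s_i, gamma, s_j) + dot(d_ij, cross(s_i, s_j))
print(lhs, rhs, bilinear(s_j, J_ji, s_i))
```

All three printed numbers coincide: the tensor form reproduces the sum of the symmetric and Dzyaloshinsky-Moriya terms, and the bond energy is unchanged when $i$ and $j$ are swapped together with the sign of $\mathbf{D}$.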

The spin Hamiltonian in the presence of a magnetic field contains an additional Zeeman term

$$H_{\mathrm{field}} = H + H_{\mathrm{Zeeman}}, \qquad H_{\mathrm{Zeeman}} = \sum_i \hat{\mathbf{S}}_i\,\mathbf{g}_i\,\mathbf{H}, \tag{1.14}$$

where H is the external magnetic field, and gi stands for the g-tensor. The form of the Zeeman
term is inherited from the classical energy of a dipole interacting with the magnetic field. The
g-tensor serves to re-scale the spin moment to the actual magnetic moment and to account for
the orbital contribution.
Eq. (1.9) is said to define quantum spin Hamiltonian, because Ŝi and Ŝj entering this
expression are operators. This won’t be surprising if you have read the previous section, where
we articulated quantum nature of magnetism and praised quantization of angular momentum
as the main physical effect behind it. A more surprising and even bewildering statement is
that problems in magnetism are often solved in the classical approximation by replacing
spin operators with spin vectors to produce Si Sj instead of Ŝi Ŝj . This trick greatly simplifies
the solution and may be the only viable option when complex mesoscopic effects, such as
skyrmions, are at stake. The applicability and limitations of this approach will be further
discussed in Sec. 3.2.

1.4 Ground state and excitations


Now we have a spin Hamiltonian, and even more than one. How do we make sense of them? Like
any Hamiltonian, a spin Hamiltonian has to be solved. The solution is a sequence of eigenstates
and eigenvalues (energies) that define the energy spectrum of our problem. Depending on the
situation, different parts of this spectrum may be of interest.
The ground state (viz. the lowest-energy eigenstate) defines the spin configuration that
should form at low temperatures. It is usually a magnetic order of some sort described by
spin-spin correlations, including the on-site correlation that corresponds to the size of the
ordered moment. Experimentally, magnetic ground state is most efficiently probed by the
elastic neutron scattering that will be further discussed in Sec. 8.
Magnetic excitations are equally important, because at any non-zero temperature excited
states will be partially occupied. Their nature and energy splittings define thermodynamic
properties of the system. To calculate magnetic susceptibility or specific heat as function of
temperature, the full energy spectrum is required.
Like any excitations of a crystal, magnetic excitations are defined in reciprocal space,
where they show periodicity across the different Brillouin zones. Each point of the first Brillouin
zone corresponds to a different periodicity of excitations in real space. For example, $\Gamma$-point
excitations repeat in every unit cell of the crystal, where $\Gamma = (0, 0, 0)$. The excitations at
$X = (\tfrac{1}{2}, 0, 0)$ are such that they double the periodicity along the real-space $a$-direction, i.e.,
opposite displacements occur on two atoms separated by one lattice period along $a$. This is
easier to see for phonons, yet exactly the same principles hold for magnetic excitations. A
crystal with only one atom per unit cell features exclusively acoustic modes having zero energy
at $\Gamma$. With more than one atom per cell, optical modes appear. Those have non-zero energies
at $\Gamma$ and entail mutual displacements of different atoms within the unit cell. The $\mathbf{k}$-dependence of
excitation energies, $E(\mathbf{k})$ or $\omega(\mathbf{k})$, is known as the dispersion relation for magnetic excitations.
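
The real-space patterns at $\Gamma$ and $X$ follow from the phase factor $e^{i\mathbf{k}\cdot\mathbf{R}}$ evaluated on successive unit cells. The sketch below assumes a one-dimensional chain of cells along $a$ and a wave vector given in units of $2\pi/a$; the helper name `phase_factors` is hypothetical.

```python
import cmath
import math

def phase_factors(k_frac, n_cells):
    """exp(i k R) for cells R = 0, a, 2a, ... with k given in units of 2*pi/a."""
    return [cmath.exp(2j * math.pi * k_frac * n) for n in range(n_cells)]

gamma_pattern = [round(z.real) for z in phase_factors(0.0, 4)]
x_pattern = [round(z.real) for z in phase_factors(0.5, 4)]
print(gamma_pattern)   # identical phase in every cell
print(x_pattern)       # alternating sign: period doubled along a
```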
In this lecture course, we shall discuss different spin models, their ground states and exci-
tation spectra that, as already mentioned, go hand in hand with the experimental properties,
such as magnetic structures and thermodynamic properties of magnetic materials.

2 Exchange interaction
Exchange interaction is the main source of cooperative magnetism. Its simplest manifestation is
the first Hund’s rule postulating that electrons fill atomic orbitals to achieve maximum spin
(in other words, the lowest-energy state has the highest multiplicity). An intuitive explanation
of this rule can be given by considering the Pauli principle and electron-electron (Coulomb) re-
pulsion. The former requires that no two electrons occupy the same state. Therefore, same-spin
electrons tend to avoid each other, thus reducing the repulsion energy and stabilizing the state
with the highest spin. This serves as an example of intraatomic or Hund’s exchange. In the
following, we shall derive mathematical expressions behind it and discuss further manifestations
of this effect.

2.1 Orthogonal orbitals


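Consider a two-electron problem defined by the Hamiltonian

$$H_{12} = \hat{h}_0(\mathbf{r}_1) + \hat{h}_0(\mathbf{r}_2) + \frac{e^2}{|\mathbf{r}_2 - \mathbf{r}_1|}, \tag{2.1}$$

where $\hat{h}_0(\mathbf{r})$ determines the one-electron states, and the third term is the electron-electron
repulsion. Let $\varphi_a(\mathbf{r})$ and $\varphi_b(\mathbf{r})$ be orthogonal eigenstates of $\hat{h}_0$, such that

$$\hat{h}_0\,\varphi_{a,b}(\mathbf{r}) = \varepsilon_{a,b}\,\varphi_{a,b}(\mathbf{r}) \quad\text{and}\quad \int \varphi_a^*(\mathbf{r})\,\varphi_b(\mathbf{r})\,d\mathbf{r} = 0.$$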

In non-relativistic quantum mechanics, spatial and spin parts of the wave function are
completely independent, so we define

$$\psi_{a,b}(\mathbf{r}) = \varphi_{a,b}(\mathbf{r})\cdot\chi, \tag{2.2}$$

where the spin part of the wave function, $\chi$, takes the form

$$\chi = \begin{cases} \alpha, & \text{spin-up} \\ \beta, & \text{spin-down.} \end{cases}$$

Opposite-spin states should be orthogonal, so we have to define $\alpha\beta = 0$ as well as $\alpha^2 = \beta^2 = 1$
(the wavefunction is normalized to unity). Alternatively, we can write $\alpha$ and $\beta$ as vectors, $\alpha = (1, 0)$
and $\beta = (0, 1)$.
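The two-electron wavefunction should change sign upon permutation, so we construct it as a
Slater determinant

$$|{\uparrow\uparrow}\rangle = \frac{1}{\sqrt{2}}\begin{vmatrix} \varphi_a(\mathbf{r}_1)\,\alpha_1 & \varphi_a(\mathbf{r}_2)\,\alpha_2 \\ \varphi_b(\mathbf{r}_1)\,\alpha_1 & \varphi_b(\mathbf{r}_2)\,\alpha_2 \end{vmatrix} = \frac{1}{\sqrt{2}}\,\alpha_1\alpha_2\,[\varphi_a(\mathbf{r}_1)\varphi_b(\mathbf{r}_2) - \varphi_a(\mathbf{r}_2)\varphi_b(\mathbf{r}_1)],$$

and

$$|{\uparrow\downarrow}\rangle = \frac{1}{\sqrt{2}}\begin{vmatrix} \varphi_a(\mathbf{r}_1)\,\alpha_1 & \varphi_a(\mathbf{r}_2)\,\alpha_2 \\ \varphi_b(\mathbf{r}_1)\,\beta_1 & \varphi_b(\mathbf{r}_2)\,\beta_2 \end{vmatrix} = \frac{1}{\sqrt{2}}\,[\varphi_a(\mathbf{r}_1)\varphi_b(\mathbf{r}_2)\,\alpha_1\beta_2 - \varphi_a(\mathbf{r}_2)\varphi_b(\mathbf{r}_1)\,\alpha_2\beta_1].$$

The $|{\downarrow\downarrow}\rangle$ and $|{\downarrow\uparrow}\rangle$ states are obtained by swapping $\alpha$ and $\beta$.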


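In the absence of the two-electron term in Eq. (2.1), all four states $|{\uparrow\uparrow}\rangle$, $|{\uparrow\downarrow}\rangle$, $|{\downarrow\uparrow}\rangle$, and $|{\downarrow\downarrow}\rangle$
are eigenstates of $H_{12}$ with the same energy of $\varepsilon_a + \varepsilon_b$. The situation changes in the presence of
electron-electron repulsion that renders the matrix of $H_{12}$ in the $\{|{\uparrow\uparrow}\rangle, |{\uparrow\downarrow}\rangle, |{\downarrow\uparrow}\rangle, |{\downarrow\downarrow}\rangle\}$ basis
non-diagonal. It takes the form (see Sec. B)

$$H_{12} = (\varepsilon_a + \varepsilon_b)\cdot\mathbb{1} + \begin{pmatrix} C_{ab} - J_{ab} & 0 & 0 & 0 \\ 0 & C_{ab} & -J_{ab} & 0 \\ 0 & -J_{ab} & C_{ab} & 0 \\ 0 & 0 & 0 & C_{ab} - J_{ab} \end{pmatrix},$$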

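where $C_{ab}$ and $J_{ab}$ are, respectively, Coulomb and exchange integrals defined as

$$C_{ab} = \Bigl\langle {\uparrow\downarrow} \Bigl| \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \Bigr| {\uparrow\downarrow} \Bigr\rangle = e^2 \iint d\mathbf{r}_1\, d\mathbf{r}_2\, \frac{|\varphi_a(\mathbf{r}_1)|^2\,|\varphi_b(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{2.3}$$

$$J_{ab} = \Bigl\langle {\uparrow\downarrow} \Bigl| \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \Bigr| {\downarrow\uparrow} \Bigr\rangle = e^2 \iint d\mathbf{r}_1\, d\mathbf{r}_2\, \frac{\varphi_a^*(\mathbf{r}_1)\varphi_b(\mathbf{r}_1)\,\varphi_b^*(\mathbf{r}_2)\varphi_a(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{2.4}$$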

Let’s stop for a minute and discuss their physical meaning. The first integral, Cab , is the
Coulomb repulsion between the two charges. It is always positive and appears in every diagonal
term, thus increasing the energy simply because Coulomb repulsion is unavoidable. The second
term, Jab , describes a more delicate spin-dependent process and has no analog in classical
mechanics. It is called an exchange term, because r1 and r2 are exchanged in the integrand.
The sign of this term is not immediately obvious, but through a simple mathematical exercise
we can show (Sec. B) that Jab > 0.
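Now, we are ready to diagonalize $H_{12}$ and obtain its eigenstates,

$$\begin{aligned}
|\psi_s\rangle &= \tfrac{1}{\sqrt{2}}\,(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle), & \varepsilon &= \varepsilon_a + \varepsilon_b + C_{ab} + J_{ab} = \varepsilon_s \\
|\psi_t\rangle &= \tfrac{1}{\sqrt{2}}\,(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle), & \varepsilon &= \varepsilon_a + \varepsilon_b + C_{ab} - J_{ab} = \varepsilon_t \\
&\ \ |{\uparrow\uparrow}\rangle, & \varepsilon &= \varepsilon_a + \varepsilon_b + C_{ab} - J_{ab} = \varepsilon_t \\
&\ \ |{\downarrow\downarrow}\rangle, & \varepsilon &= \varepsilon_a + \varepsilon_b + C_{ab} - J_{ab} = \varepsilon_t
\end{aligned}$$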

The first state is a singlet, whereas the other three states are triplets (parallel spins on 1 and
2) that lie lower in energy, because Jab > 0. The singlet and triplet nature of the states is
determined by their total spin, i.e., by the eigenvalue of Ŝ2 , where Ŝ = Ŝ1 + Ŝ2 , see Sec. 3.1.
So our solution for H12 yields triplet states having lower energy than the singlet state. This
implies ferromagnetic intraatomic exchange in line with the first Hund’s rule. Exchange integral
Jab is proportional to the energy splitting between the singlet and triplet states of an atom.
Hund’s exchange JH is a sizable term reaching 1 eV in 3d compounds. In heavier elements,
JH is reduced to 0.3 − 0.5 eV because of the larger radial span and enhanced screening.
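
The diagonalization above can be reproduced numerically. In the following sketch the values of $\varepsilon_a$, $\varepsilon_b$, $C_{ab}$, and $J_{ab}$ are assumed, illustrative numbers (in eV) and are not taken from these notes; only the structure of the matrix follows the text.

```python
import numpy as np

def two_electron_levels(eps_a, eps_b, C_ab, J_ab):
    """Eigenvalues and eigenvectors of H12 in the basis (uu, ud, du, dd)."""
    H = (eps_a + eps_b) * np.eye(4) + np.array([
        [C_ab - J_ab, 0.0, 0.0, 0.0],
        [0.0, C_ab, -J_ab, 0.0],
        [0.0, -J_ab, C_ab, 0.0],
        [0.0, 0.0, 0.0, C_ab - J_ab],
    ])
    return np.linalg.eigh(H)    # eigenvalues in ascending order

# Assumed illustrative parameters (eV)
energies, vectors = two_electron_levels(eps_a=-1.0, eps_b=-0.5, C_ab=3.0, J_ab=0.8)
print(energies)          # threefold-degenerate triplet below the singlet
print(vectors[:, 3])     # highest state: antisymmetric combination of ud and du
```

The three lowest eigenvalues coincide (the triplet), and the singlet lies above them by $2J_{ab}$.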

2.2 Non-orthogonal orbitals, Heitler-London scheme


We shall now proceed to the case of non-orthogonal orbitals centered on different atoms. The
simplest problem of this kind is that of the hydrogen molecule having two nuclei at $\mathbf{R}_a$ and $\mathbf{R}_b$,

$$H_{\mathrm{H_2}} = H_{\mathrm{at}}(\mathbf{r}_1 - \mathbf{R}_a) + H_{\mathrm{at}}(\mathbf{r}_2 - \mathbf{R}_b) - \frac{e^2}{|\mathbf{r}_1 - \mathbf{R}_b|} - \frac{e^2}{|\mathbf{r}_2 - \mathbf{R}_a|} + \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}, \tag{2.5}$$

where $H_{\mathrm{at}}$ is the one-electron atomic Hamiltonian, the third and fourth terms stand for the
attraction between an electron and the "other" nucleus, and the last term is the electron-electron
repulsion. We could also include the internuclear repulsion, $e^2/|\mathbf{R}_a - \mathbf{R}_b|$, but it does
not depend on electronic variables and leads to a simple offset in energy.

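The eigenstates of $H_{\mathrm{at}}$,

$$H_{\mathrm{at}}(\mathbf{r} - \mathbf{R})\,\varphi_{\mathrm{at}}(\mathbf{r} - \mathbf{R}) = \varepsilon_{\mathrm{at}}\,\varphi_{\mathrm{at}}(\mathbf{r} - \mathbf{R}),$$

can be placed on either of the nuclei and written as $\varphi_a(\mathbf{r}) = \varphi_{\mathrm{at}}(\mathbf{r} - \mathbf{R}_a)$ and $\varphi_b(\mathbf{r}) = \varphi_{\mathrm{at}}(\mathbf{r} - \mathbf{R}_b)$
with the overlap integral

$$l = \int d\mathbf{r}\, \varphi_a(\mathbf{r})\,\varphi_b(\mathbf{r}). \tag{2.6}$$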

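By disregarding any excited states of $H_{\mathrm{at}}$, we are left with six possible electronic
configurations. Four of them form the already familiar $\{|{\uparrow\uparrow}\rangle, |{\uparrow\downarrow}\rangle, |{\downarrow\uparrow}\rangle, |{\downarrow\downarrow}\rangle\}$ manifold, whereas the two
remaining states represent the ionized H$^+$–H$^-$ configurations with both electrons occupying the
same orbital on the same atom. In the following, we shall adopt the so-called Heitler-London
scheme, where these ionized configurations are discarded, and only the first four states remain.
Backed by our previous knowledge from Sec. 2.1, we shall look for solutions in the same
form of Slater determinants or their combinations. For example,

$$|{\uparrow\uparrow}\rangle = \frac{1}{\sqrt{2(1 - l^2)}}\,\alpha_1\alpha_2\,[\varphi_a(\mathbf{r}_1)\varphi_b(\mathbf{r}_2) - \varphi_a(\mathbf{r}_2)\varphi_b(\mathbf{r}_1)],$$

where the prefactor is obtained from the normalization condition $\langle{\uparrow\uparrow}|{\uparrow\uparrow}\rangle = 1$. The $|{\downarrow\downarrow}\rangle$ state is
obtained by replacing $\alpha$ with $\beta$. As for the two other states, they should be linear combinations
of $|{\uparrow\downarrow}\rangle$ and $|{\downarrow\uparrow}\rangle$,

$$|\psi_t\rangle = \tfrac{1}{\sqrt{2}}\,(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle) = \frac{1}{2\sqrt{1 - l^2}}\,(\alpha_1\beta_2 + \beta_1\alpha_2)\,[\varphi_a(\mathbf{r}_1)\varphi_b(\mathbf{r}_2) - \varphi_a(\mathbf{r}_2)\varphi_b(\mathbf{r}_1)], \tag{2.7}$$

$$|\psi_s\rangle = \tfrac{1}{\sqrt{2}}\,(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle) = \frac{1}{2\sqrt{1 + l^2}}\,(\alpha_1\beta_2 - \beta_1\alpha_2)\,[\varphi_a(\mathbf{r}_1)\varphi_b(\mathbf{r}_2) + \varphi_a(\mathbf{r}_2)\varphi_b(\mathbf{r}_1)]. \tag{2.8}$$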
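By a lengthy but straightforward calculation one can show that $|{\uparrow\uparrow}\rangle$, $|{\downarrow\downarrow}\rangle$, $|\psi_t\rangle$, and $|\psi_s\rangle$ are
indeed eigenstates of $H_{\mathrm{H_2}}$. The first three states are again triplets. They differ in energy from
the singlet state $|\psi_s\rangle$,

$$\varepsilon_t = 2\varepsilon_{\mathrm{at}} + \frac{C_{ab} - J_{ab}}{1 - l^2}, \qquad \varepsilon_s = 2\varepsilon_{\mathrm{at}} + \frac{C_{ab} + J_{ab}}{1 + l^2}. \tag{2.9}$$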
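The Coulomb and exchange integrals now have a more intricate structure,

$$C_{ab} = \iint d\mathbf{r}_1\, d\mathbf{r}_2\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\,|\varphi_a(\mathbf{r}_1)|^2\,|\varphi_b(\mathbf{r}_2)|^2 - \int d\mathbf{r}_1\, \frac{e^2}{|\mathbf{r}_1 - \mathbf{R}_b|}\,|\varphi_a(\mathbf{r}_1)|^2 - \int d\mathbf{r}_2\, \frac{e^2}{|\mathbf{r}_2 - \mathbf{R}_a|}\,|\varphi_b(\mathbf{r}_2)|^2,$$

and

$$J_{ab} = \iint d\mathbf{r}_1\, d\mathbf{r}_2\, \varphi_a^*(\mathbf{r}_1)\varphi_b^*(\mathbf{r}_2)\,\frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\,\varphi_b(\mathbf{r}_1)\varphi_a(\mathbf{r}_2) - l \int d\mathbf{r}_1\, \varphi_a^*(\mathbf{r}_1)\,\frac{e^2}{|\mathbf{r}_1 - \mathbf{R}_b|}\,\varphi_b(\mathbf{r}_1) - l \int d\mathbf{r}_2\, \varphi_b^*(\mathbf{r}_2)\,\frac{e^2}{|\mathbf{r}_2 - \mathbf{R}_a|}\,\varphi_a(\mathbf{r}_2),$$

but their physical meaning is essentially the same as in Sec. 2.1.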
We are mostly interested in the singlet-triplet splitting,

$$\varepsilon_t - \varepsilon_s = 2\,\frac{l^2 C_{ab} - J_{ab}}{1 - l^4}, \tag{2.10}$$

that becomes $-2J_{ab}$ (ferromagnetic coupling) in the $l = 0$ limit, as expected from Sec. 2.1. On
the other hand, the $l \neq 0$ case may give rise to an antiferromagnetic coupling ($\varepsilon_t > \varepsilon_s$). In
the simplest situation, one needs $l^2 C_{ab} > J_{ab}$, but in fact $C_{ab}$ and $J_{ab}$ of the Heitler-London
scheme may be positive or negative, because each of them has several contributions of different
sign. For example, at moderate interatomic distances $C_{ab} + J_{ab} < 0$ leads to $\varepsilon_s < 2\varepsilon_{\mathrm{at}}$, i.e.,
a chemical bond is formed between the two atoms. Its physical meaning is as follows. When
the two nuclei are sufficiently close to each other, the region between these nuclei is favorable
for electrons, because they can experience strong attraction to both. The resulting energy
gain exceeds the energy loss due to repulsion between adjacent electrons, and the singlet state
becomes favorable, i.e., a chemical bond is formed. This mechanism does not apply to the
triplet case, because two same-spin electrons do not approach each other as a consequence
of the Pauli principle. On a more mathematical level, this follows from the structure of $|\psi_s\rangle$
and $|\psi_t\rangle$, where the spatial part of the former is symmetric with respect to permutation and
allows non-zero electron density between the nuclei, whereas the spatial part of the latter is
antisymmetric with respect to permutation and leads to zero density there.
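
The algebra leading from Eq. (2.9) to Eq. (2.10) and the sign change of the coupling can be checked with a few lines of code. The parameter values below are assumed for illustration (arbitrary energy units) and do not represent the hydrogen molecule.

```python
def heitler_london_levels(eps_at, C_ab, J_ab, l):
    """Triplet and singlet energies of the Heitler-London scheme, Eq. (2.9)."""
    eps_t = 2 * eps_at + (C_ab - J_ab) / (1 - l**2)
    eps_s = 2 * eps_at + (C_ab + J_ab) / (1 + l**2)
    return eps_t, eps_s

def singlet_triplet_splitting(C_ab, J_ab, l):
    """eps_t - eps_s in the closed form of Eq. (2.10)."""
    return 2 * (l**2 * C_ab - J_ab) / (1 - l**4)

# Assumed illustrative parameters
eps_t, eps_s = heitler_london_levels(eps_at=0.0, C_ab=1.0, J_ab=0.1, l=0.5)
print(eps_t - eps_s, singlet_triplet_splitting(1.0, 0.1, 0.5))   # l^2 C > J: antiferromagnetic
print(singlet_triplet_splitting(1.0, 0.1, 0.0))                  # l = 0: -2 J, ferromagnetic
```

With these numbers $l^2 C_{ab} > J_{ab}$, so the splitting is positive and the singlet lies lower, whereas for $l = 0$ the result reduces to $-2J_{ab}$.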
As one of the earliest attempts to describe chemical bonding, the Heitler-London scheme
was remarkably successful, especially when compared with the molecular orbital approach,
where one-electron functions are considered. The Heitler-London scheme utilizes two-electron
wavefunctions and better accounts for many-body effects. The limitations of the Heitler-London
approach manifest themselves at small $|\mathbf{R}_a - \mathbf{R}_b|$, where ionized configurations come into play.
Interestingly, the Heitler-London scheme also fails in the limit of $|\mathbf{R}_a - \mathbf{R}_b| \to \infty$, where $l \to 0$
stabilizes the triplet configuration, which is not the case in the exact solution. This shortcoming
is due to dynamic correlations that become increasingly important for weakly coupled atoms.

2.3 Kinetic exchange


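Let's now explore the ionized configurations that we previously discarded. They are constructed
as Slater determinants built solely from $\varphi_a$ or $\varphi_b$,

$$|\psi_a\rangle = \tfrac{1}{\sqrt{2}}\,[\alpha_1\beta_2 - \beta_1\alpha_2]\,\varphi_a(\mathbf{r}_1)\varphi_a(\mathbf{r}_2),$$

$$|\psi_b\rangle = \tfrac{1}{\sqrt{2}}\,[\alpha_1\beta_2 - \beta_1\alpha_2]\,\varphi_b(\mathbf{r}_1)\varphi_b(\mathbf{r}_2),$$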

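where antisymmetrization is achieved through the spin part, and for simplicity $l = 0$ was
assumed. We can define the on-site Coulomb repulsion

$$U = \Bigl\langle \psi_a \Bigl| \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \Bigr| \psi_a \Bigr\rangle = e^2 \iint d\mathbf{r}_1\, d\mathbf{r}_2\, \frac{|\varphi_a(\mathbf{r}_1)|^2\,|\varphi_a(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{2.11}$$

also known as the Hubbard $U$, because it is the central parameter of the Hubbard model (Sec. 6).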
Since both electrons are on the same site, the on-site repulsion is very strong compared to
the intersite repulsion Cab < U . In the following, we neglect both Cab and Jab as small terms
compared to U , which makes the problem simple enough for an analytical solution. Regarding
Eq. (2.5), this is equivalent to neglecting the integral of the two-electron term, unless both
electrons are on the same atom, and neglecting the integrals of the one-electron terms, unless
the electrons are on different atoms. In other words, we remove all earlier sources of the
magnetic coupling and look for a new one.
We shall also make use of the symmetry and classify the states with respect to their parity
defined by the inversion center in the middle of the H–H bond. States that do not change the sign
of their spatial part upon inversion are even, whereas those that change sign are odd.
This way, $|\psi_s\rangle$ is parity-even, and $|\psi_t\rangle$ is parity-odd. As for the ionized configurations $|\psi_a\rangle$ and
$|\psi_b\rangle$, they can be symmetrized to produce the states

$$|\psi_1\rangle = \tfrac{1}{\sqrt{2}}\,(|\psi_a\rangle + |\psi_b\rangle), \qquad |\psi_2\rangle = \tfrac{1}{\sqrt{2}}\,(|\psi_a\rangle - |\psi_b\rangle) \tag{2.12}$$

with even and odd parities, respectively.
Wigner's theorem states that eigenstates of the Hamiltonian can be classified according
to its symmetry. The Hamiltonian matrix takes the block-diagonal form, such that matrix
elements between states of different symmetry are zero. Therefore,

$$\langle\psi_s|H|\psi_1\rangle = -2t, \qquad \langle\psi_s|H|\psi_2\rangle = 0,$$

where we introduced the transfer integral $t$, which describes the probability of electron
hopping between the two sites. Such a hopping is possible, because $|\psi_s\rangle$ is the singlet state,
where electrons have different spins and can move between the atoms. Per our assumption of
large $U$, the condition $t \ll U$ holds.
The matrix element between $|\psi_s\rangle$ and $|\psi_2\rangle$ is zero by symmetry. As for the parity-odd state
$|\psi_t\rangle$, it does not interfere with $|\psi_2\rangle$ because of the Pauli principle, so both $|\psi_t\rangle$ and $|\psi_2\rangle$ form
diagonal parts of the Hamiltonian matrix.
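We have thus reduced our problem to the subspace built by only two states, $|\psi_s\rangle$ and $|\psi_1\rangle$.
The relevant part of the Hamiltonian matrix reads as

$$H = \begin{pmatrix} 2\varepsilon_{\mathrm{at}} & -2t \\ -2t & 2\varepsilon_{\mathrm{at}} + U \end{pmatrix}$$

(the $|\psi_1\rangle$ state has an additional energy of $U$ due to the on-site Coulomb repulsion) and can
be diagonalized using the characteristic equation

$$\begin{vmatrix} 2\varepsilon_{\mathrm{at}} - \varepsilon & -2t \\ -2t & 2\varepsilon_{\mathrm{at}} + U - \varepsilon \end{vmatrix} = 0 \quad\Rightarrow\quad \varepsilon^2 - \varepsilon(4\varepsilon_{\mathrm{at}} + U) + 2\varepsilon_{\mathrm{at}}U + 4\varepsilon_{\mathrm{at}}^2 - 4t^2 = 0$$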
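that yields

$$\varepsilon = 2\varepsilon_{\mathrm{at}} + \tfrac{1}{2}\bigl(U \pm \sqrt{U^2 + 16t^2}\bigr) = 2\varepsilon_{\mathrm{at}} + \tfrac{1}{2}\bigl(U \pm U\sqrt{1 + 16t^2/U^2}\bigr). \tag{2.13}$$

By expanding $\sqrt{1 + 16t^2/U^2}$ in powers of $t^2/U^2$, we arrive at

$$\varepsilon' = 2\varepsilon_{\mathrm{at}} - \frac{4t^2}{U} + O\!\left(\frac{t^4}{U^3}\right), \qquad \varepsilon'' = 2\varepsilon_{\mathrm{at}} + U + \frac{4t^2}{U} + O\!\left(\frac{t^4}{U^3}\right). \tag{2.14}$$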
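The resulting eigenstates are not pure $|\psi_s\rangle$ and $|\psi_1\rangle$. For example, the lowest-energy state
characterized by $\varepsilon'$ can be obtained by solving the equation

$$\begin{pmatrix} 2\varepsilon_{\mathrm{at}} & -2t \\ -2t & 2\varepsilon_{\mathrm{at}} + U \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \varepsilon' \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \quad\Rightarrow\quad \frac{v_2}{v_1} = \frac{2t}{U}$$

that defines the eigenstate $|\psi'\rangle$ (up to normalization) as

$$|\psi'\rangle = |\psi_s\rangle + \frac{2t}{U}\,|\psi_1\rangle, \tag{2.15}$$

where the contribution of the ionized configuration $|\psi_1\rangle$ is of the order of $t/U \ll 1$.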
The singlet state $|\psi'\rangle$ thus lies below the triplet state $|\psi_t\rangle$ with $\varepsilon_t = 2\varepsilon_{\mathrm{at}}$, and the
energy difference of $4t^2/U$ defines the kinetic
exchange process. Its name stems from the energy reduction of the singlet state by the electron
hopping (kinetic energy), an effect forbidden in the triplet state.
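
The quality of the $4t^2/U$ approximation is easy to assess by comparing it with the exact roots of Eq. (2.13). The values of $t$ and $U$ below are assumed for illustration (arbitrary energy units, with $t \ll U$).

```python
import math

def kinetic_exchange_levels(eps_at, t, U):
    """Exact eigenvalues of the 2x2 matrix in the (psi_s, psi_1) subspace, Eq. (2.13)."""
    root = math.sqrt(U**2 + 16 * t**2)
    return 2 * eps_at + 0.5 * (U - root), 2 * eps_at + 0.5 * (U + root)

# Assumed illustrative parameters
eps_at, t, U = 0.0, 0.1, 4.0
lower, upper = kinetic_exchange_levels(eps_at, t, U)

exact_gain = 2 * eps_at - lower            # triplet energy minus the lowest singlet energy
approx_gain = 4 * t**2 / U                 # leading term of Eq. (2.14)
mixing = (2 * eps_at - lower) / (2 * t)    # v2/v1 from the first row of the eigenvalue equation
print(exact_gain, approx_gain)
print(mixing, 2 * t / U)
```

For $t/U = 0.025$ the exact energy gain and the admixture of the ionized configuration both agree with the leading-order expressions to better than one percent.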
Remark on $t$: by a direct evaluation of $\langle\psi_s|H|\psi_1\rangle$, one can verify that within our
approximation of $C_{ab} = J_{ab} = 0$, $t$ is an integral of the one-electron part of the Hamiltonian taken with
atomic orbitals centered on different atoms,

$$-t = \int d\mathbf{r}\, \varphi_{\mathrm{at}}^*(\mathbf{r} - \mathbf{R}_a)\,\frac{e^2}{|\mathbf{r} - \mathbf{R}_b|}\,\varphi_{\mathrm{at}}(\mathbf{r} - \mathbf{R}_b). \tag{2.16}$$

As a one-electron integral, the hopping parameter is much easier to evaluate numerically than
the two-electron integrals involved in $C_{ab}$ and $J_{ab}$.
Remark on the Hubbard model: the structure of the energy levels parallels the general
solution of the Hubbard model (Sec. 6). We find the low-energy states $\varepsilon_s$ and $\varepsilon_t$ separated from
the ionized states $\varepsilon_1$ and $\varepsilon_2$ by $U$. This is reminiscent of the band gap opening between the lower
and upper Hubbard bands formed, respectively, by $\varepsilon'$, $\varepsilon_t$ and $\varepsilon''$, $\varepsilon_2$. The splitting within these
bands (i.e., the band width) is of the order of $4t^2/U$ due to magnetism.

2.4 Goodenough-Kanamori-Anderson rules


By combining the results of Secs. 2.2 and 2.3, we can write the singlet-triplet energy splitting
as

$$\varepsilon_t - \varepsilon_s = \frac{4t^2}{U} + 2\,\frac{l^2 C_{ab} - J_{ab}}{1 - l^4}, \tag{2.17}$$

where the first term is the kinetic exchange and the second term is the potential exchange,
because it arises from the electron-electron repulsion that, in turn, contributes to the potential
energy of the system. The kinetic exchange is always antiferromagnetic. As for the potential
exchange, it boils down to $-2J_{ab} < 0$ at $l \to 0$, i.e., for all practical situations the potential
exchange is ferromagnetic. The interplay between the potential and kinetic exchanges determines
magnetic interactions in the majority of (insulating) transition-metal compounds and can be
summarized within the Goodenough-Kanamori-Anderson (GKA) rules.
These rules describe exchange couplings between two magnetic ions M separated by a
ligand L. The interaction between M and L can be understood on two different levels: i) p-orbitals
of L are merged into d-orbitals of M as "tails" (Anderson); ii) virtual excitations of p-electrons
onto d-orbitals are considered (Goodenough). We shall start with the latter description.
180° superexchange: consider one half-filled orbital on each of the magnetic atoms and
the 180° M–L–M interaction geometry. Here, both d-orbitals overlap with the same p-orbital.
By choosing the spin-up electron on one atom M, we pick up the spin-down electron in the
p-orbital of L to hop onto the d-orbital of M. Then we are left with the spin-up p-electron to hop
onto the d-orbital of the second M, and such a hopping is only possible when the d-electron of
the second M is spin-down. Therefore, only the antiferromagnetic configuration gains additional
energy from these virtual excitations, and an antiferromagnetic coupling ensues. This scenario
is known as the 180° superexchange.
90◦ superexchange: consider now the 90◦ coupling geometry. This time, two d-orbitals
overlap with two different p-orbitals. When p-electrons of both orbitals hop onto the d-orbitals,
the ligand remains with two electrons having parallel spins when the d-spins are parallel, and
with two electrons having antiparallel spins when the d-spins are antiparallel. The configuration
with parallel spins has lower energy by virtue of the Hund’s coupling on the ligand site. This
is the 90◦ superexchange.
Altogether, the superexchange between two half-filled orbitals of the same symmetry is:

• Strongly antiferromagnetic for the 180◦ M–L–M geometry


• Weakly ferromagnetic for the 90◦ M–L–M geometry

The notion of ”weak” and ”strong” does not follow from Goodenough’s picture, but
becomes more transparent when Anderson’s approach and Eq. (2.17) are used. The idea here is
to mix d-orbitals with ligand p-orbitals and consider the resulting wavefunction as an effective
orbital (more precisely, Wannier function) that boils the complex M–L–M process down to the
interaction of two atoms, as considered in Sec. 2.3. This way, the 180◦ geometry involves an
overlap of the Wannier functions and a sizable t that outweighs the potential exchange term.

On the other hand, the 90◦ geometry leads to Wannier functions containing different p-orbitals,
so these Wannier functions are orthogonal, t = 0, and potential exchange prevails, but its
absolute value is less than 4t²/U in the 180◦ case.
Remark: Anderson’s approach is indispensable in ab initio calculations, where the hopping
parameters t can be obtained directly from Wannier functions, which are Fourier transforms of
Bloch functions.
Multi-orbital case: the generalization to the multi-orbital case, where half-filled,
fully filled, and empty orbitals are all present, is more conveniently done within Anderson’s
approach. In the one-orbital case, the t²/U expression is merely a result of second-order
perturbation theory, where we treat the singlet state as the ground state and the ”ionized”
state (with two electrons on one atom) as the excited state, which is higher in energy by U and
connected to the ground state by the matrix element −2t.
Now we do the same for the multi-orbital case, where an electron from the half-filled orbital
can hop onto an empty orbital (or, alternatively, an electron from the filled orbital can hop
onto the half-filled one). Such a hopping leads to a lower energy in the ferromagnetic case,
because, per first Hund’s rule, it is more favorable to have two same-spin electrons on one site.
Such a state has the energy U − JH, where JH is the Hund’s coupling, and the energy gain from
second-order perturbation theory is 4t²/(U − JH). The overall singlet-triplet splitting arising
from this process (electron hopping to the empty orbital) is then

$$ \varepsilon_t - \varepsilon_s = \frac{4t'^2}{U} - \frac{4t'^2}{U - J_H} = -\frac{4t'^2}{U}\,\frac{J_H}{U - J_H} \simeq -\frac{4t'^2}{U}\,\frac{J_H}{U}, \qquad (2.18) $$
where we used t′ to distinguish the hopping to the empty orbital from the hopping to the half-
filled orbital denoted by t. This way, we get the purely ferromagnetic kinetic exchange that, on
average, is smaller than the antiferromagnetic kinetic exchange, because JH ≪ U. On the other
hand, several empty orbitals may be available, which multiplies the ferromagnetic contribution
and makes it comparable to the antiferromagnetic one.
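To get a feeling for the relative size of the two kinetic contributions, one can evaluate Eq. (2.18) next to the antiferromagnetic term 4t²/U. This is only a sketch: the function names are hypothetical, and the values of t, t′, U, and JH are assumed for illustration.

```python
def fm_kinetic_exchange(t_prime, U, J_H):
    """Exact and approximate right-hand sides of Eq. (2.18)."""
    exact = 4 * t_prime**2 / U - 4 * t_prime**2 / (U - J_H)
    approx = -(4 * t_prime**2 / U) * (J_H / U)
    return exact, approx


def afm_kinetic_exchange(t, U):
    """Antiferromagnetic kinetic exchange 4 t^2 / U between half-filled orbitals."""
    return 4 * t**2 / U


# Illustrative values in eV (assumptions made for this sketch only)
t, t_prime, U, J_H = 0.2, 0.2, 4.0, 0.8
exact, approx = fm_kinetic_exchange(t_prime, U, J_H)
print(f"FM (exact)  {exact:+.4f} eV")
print(f"FM (approx) {approx:+.4f} eV")
print(f"AFM         {afm_kinetic_exchange(t, U):+.4f} eV")
```

For equal hoppings the ferromagnetic term is smaller than the antiferromagnetic one by a factor of about JH/U, so only several empty orbitals taken together can make the two comparable.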
One particularly relevant situation arises when one atom has a half-filled orbital, whereas the
other atom has an empty orbital of the same symmetry. This enhances t′ compared to t, because
t runs between orbitals of different symmetry. Such a scenario is typical for orbitally-ordered
systems, or for mixed-valence compounds, such as Fe²⁺–Fe³⁺. The Goodenough-Kanamori-
Anderson rules are then reversed, and the exchange is

• Ferromagnetic for the 180◦ M–L–M geometry


• Antiferromagnetic for the 90◦ M–L–M geometry

(both can be weaker or stronger depending on the compound in question).


Remarks on the terminology: it is not uncommon to juxtapose direct exchange
and superexchange. The historical definition says that direct exchange arises from the direct
overlap of transition-metal d-orbitals, whereas superexchange is an exchange interaction assisted
by the ligands (some publications also mention super-superexchange as an interaction mediated
by two or more ligand atoms). Anderson abolished these definitions by introducing the concept
of kinetic and potential exchange, because his theory takes the same form regardless of the
presence of ligands between the interacting transition-metal ions. As far as real-world materials
are concerned, it is at best difficult to distinguish between magnetic interactions arising from
direct orbital overlap, and magnetic interactions mediated by ligands.
Superexchange models/calculations are those that elucidate magnetic interactions in
terms of t’s, U , and JH in the vein of Eqs. (2.17) and (2.18).

2.5 Anisotropic magnetic interactions
Little can be said here without going into detailed calculations of the anisotropic superexchange.
The description in Secs. 2.1–2.3 was done on the non-relativistic level with the complete sep-
aration of the spin and spatial components of the wavefunction, so the resulting magnetic
interaction is strictly isotropic. Magnetic anisotropy only appears when spin-orbit coupling is
included to produce at least a small orbital moment.
It is useful to know that all anisotropic terms are related to the spin-orbit-coupling constant
λ introduced in Eq. (1.8). When L ≪ S, the Dzyaloshinsky-Moriya coupling of Eq. (1.13) is
linear in λ, |D| ∼ λ, whereas symmetric anisotropy is quadratic in λ, |Γαβ| ∼ λ². For the
complete derivation of the anisotropic superexchange, we refer the reader to the seminal paper
by T. Moriya [Phys. Rev. 120, 91 (1960)], as well as the more recent discussion by L. Shekhtman
et al. [Phys. Rev. Lett. 69, 836 (1992)] and Yildirim et al. [Phys. Rev. B 52, 10239 (1995)].

Further reading
• P.W. Anderson, Solid State Physics 14, 99 (1963)
The standard reference to Anderson’s superexchange theory includes not only the theory
itself, but also its background and a brief historical overview of research on antiferromag-
netism prior to 1963.

• P.W. Anderson, More and Different: notes from a thoughtful curmudgeon


A collection of Anderson’s popular articles and memoirs is fascinating reading. Historical
and philosophical notes on the superexchange theory are in the article ”Winning the prize
and losing the PR battle” (Chapter 2).

• D.I. Khomskii, Transition metal oxides (Cambridge University Press, 2014)


Chapter 5 gives a modern view of Goodenough-Kanamori-Anderson rules

3 Finite systems
3.1 Spin-1/2 dimer
Having introduced the exchange interaction, we return to the spin Hamiltonian approach and,
instead of looking into the origin of the interaction, start with the Heisenberg model for a spin-1/2
dimer,

$$ \mathcal{H} = J\,\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2 = J\left(\hat S_1^z \hat S_2^z + \tfrac12\, \hat S_1^+ \hat S_2^- + \tfrac12\, \hat S_1^- \hat S_2^+\right), \qquad (3.1) $$

where S1 = S2 = 1/2. Our approach to its solution will be very similar to that of Sec. 2.1. We
shall use the basis set of {|↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩} and construct the Hamiltonian matrix using
the relations from App. A and, in particular, Eq. (A.14). The result is

$$ \mathcal{H} = J \begin{pmatrix} \tfrac14 & 0 & 0 & 0 \\ 0 & -\tfrac14 & +\tfrac12 & 0 \\ 0 & +\tfrac12 & -\tfrac14 & 0 \\ 0 & 0 & 0 & \tfrac14 \end{pmatrix}. $$
The diagonalization yields the following energy spectrum,

  #   Energy    State                  S^z    S^2
  1   J/4       |↑↑⟩                    1      2
  2   J/4       (|↑↓⟩ + |↓↑⟩)/√2        0      2
  3   J/4       |↓↓⟩                   −1      2
  4   −3J/4     (|↑↓⟩ − |↓↑⟩)/√2        0      0

where we also classified the states with respect to their total spin $\hat{\mathbf S} = \hat{\mathbf S}_1 + \hat{\mathbf S}_2$. The calculation of
$S^z$ is a trivial summation of $S_1^z$ and $S_2^z$. As for $S^2$, its operator can be conveniently represented
as

$$ \hat{\mathbf S}^2 = (\hat{\mathbf S}_1 + \hat{\mathbf S}_2)\cdot(\hat{\mathbf S}_1 + \hat{\mathbf S}_2) = \hat{\mathbf S}_1^2 + \hat{\mathbf S}_2^2 + 2\,\hat{\mathbf S}_1\cdot\hat{\mathbf S}_2 = \frac32 + 2\,\hat{\mathbf S}_1\cdot\hat{\mathbf S}_2, \qquad (3.2) $$

which yields the $S^2$ values listed above (note that $S^2 = S(S+1) = 2$ for $S = 1$, and $S_1^2 = S_2^2 = \tfrac34$
for spin-1/2).
Such a spectrum not only repeats the result of Sec. 2.1, but also elucidates it. The key
difference between triplets and singlets is that the former are eigenstates of Ŝ² with
S = 1, whereas the latter has S = 0. Basically, we have now got a justification of the Heisenberg
model, because this model reproduces the energy levels of two interacting electrons. The singlet-
triplet splitting εt − εs is nothing but the exchange parameter J of the Heisenberg model.
A ferromagnetic dimer is a trivial object acting as an individual spin-1 ion. More interesting
physics can be obtained in the antiferromagnetic case (J > 0).
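The spectrum of the spin-1/2 dimer is easy to check by applying the 4×4 Hamiltonian matrix to the candidate states. The sketch below uses plain Python lists and sets J = 1 (an arbitrary choice of units); the helper names are hypothetical.

```python
from math import sqrt

J = 1.0  # exchange coupling in arbitrary units (assumption for illustration)

# Hamiltonian matrix of Eq. (3.1) in the basis |uu>, |ud>, |du>, |dd>
H = [[J / 4, 0.0, 0.0, 0.0],
     [0.0, -J / 4, J / 2, 0.0],
     [0.0, J / 2, -J / 4, 0.0],
     [0.0, 0.0, 0.0, J / 4]]


def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def energy_if_eigenstate(A, v, tol=1e-12):
    """Return <v|A|v> if the normalized vector v is an eigenvector of A, else None."""
    Av = matvec(A, v)
    e = sum(x * y for x, y in zip(v, Av))
    is_eigen = all(abs(y - e * x) < tol for x, y in zip(v, Av))
    return e if is_eigen else None


s = 1 / sqrt(2)
states = {
    "triplet, Sz=+1": [1, 0, 0, 0],
    "triplet, Sz=0": [0, s, s, 0],
    "triplet, Sz=-1": [0, 0, 0, 1],
    "singlet": [0, s, -s, 0],
}
energies = {name: energy_if_eigenstate(H, v) for name, v in states.items()}
for name, e in energies.items():
    print(f"{name:15s} E = {e:+.2f} J")
```

The same helper returns None for the product state |↑↓⟩, which is not an eigenstate of the Heisenberg dimer.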
Magnetic susceptibility: once we know the energy spectrum, magnetic susceptibility
can be calculated using the Van Vleck formula, i.e., by a summation of the susceptibilities of
individual states times Boltzmann factors $e^{-E/k_BT}$, divided by the partition function. This way,

$$ \chi(T) = \frac{3\,\chi_t\, e^{-J/k_BT}}{1 + 3\, e^{-J/k_BT}}, $$

where the singlet state (S = 0) does not contribute to the susceptibility, and $\chi_t$ is the
susceptibility of the triplet state (S = 1) given by the Curie law²

$$ \chi_t = \frac{(g\mu_B)^2}{3k_BT}\, S(S+1) $$

(note that we use the susceptibility per atom; the molar susceptibility can be obtained by
multiplying by $N_A$). Altogether, we get the susceptibility of

$$ \chi(T) = \frac{2(g\mu_B)^2}{3k_BT}\, \frac{1}{1 + \tfrac13\, e^{J/k_BT}} \qquad (3.3) $$

calculated per dimer (two atoms). Eq. (3.3) is sometimes known as the Bleaney-Bowers
equation. At high temperatures, $e^{J/k_BT} \simeq 1$, and $\chi(T) \simeq 2(g\mu_B)^2/(4k_BT)$, which is the
susceptibility of two spin-1/2 ions. At low temperatures, $\chi(T) \sim (1/T)\, e^{-J/k_BT}$ due to excitations
over the singlet-triplet gap.

² From now on, we write magnetic susceptibilities in CGS. The SI units require µ0 as a pre-factor.
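The closed form of Eq. (3.3) can be cross-checked against a direct thermal average over the four dimer levels. The sketch assumes reduced units gµB = kB = 1 and uses the fact that, for an isotropic spectrum in zero field, the susceptibility equals the thermal average of (S^z)² divided by T. This is equivalent to summing Curie terms over the multiplets; the function names are hypothetical.

```python
from math import exp


def chi_sum_over_states(levels, T):
    """chi = <(S^z)^2> / T for a list of (energy, S^z) pairs, with g*mu_B = k_B = 1."""
    Z = sum(exp(-E / T) for E, _ in levels)
    return sum(Sz**2 * exp(-E / T) for E, Sz in levels) / (T * Z)


def chi_bleaney_bowers(J, T):
    """Closed form of Eq. (3.3) in the same reduced units."""
    return 2.0 / (3.0 * T) / (1.0 + exp(J / T) / 3.0)


J = 1.0
dimer_levels = [(J / 4, 1), (J / 4, 0), (J / 4, -1), (-3 * J / 4, 0)]
for T in (0.1, 0.5, 1.0, 10.0):
    print(f"T = {T:5.1f}: sum over states {chi_sum_over_states(dimer_levels, T):.6f}, "
          f"Eq. (3.3) {chi_bleaney_bowers(J, T):.6f}")
```

The two columns agree at all temperatures, and T·χ approaches 1/2 at high temperatures, i.e., the Curie constant of two free spin-1/2 ions.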
Specific heat: this quantity is obtained using standard thermodynamic relations. The Helmholtz
free energy can be expressed via the partition function Z,

$$ F = -k_BT \ln Z, \qquad Z = 1 + 3\, e^{-J/k_BT}, $$

and

$$ C_V = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_V = 2k_BT\left(\frac{\partial \ln Z}{\partial T}\right)_V + k_BT^2\left(\frac{\partial^2 \ln Z}{\partial T^2}\right)_V \qquad (3.4) $$

that, upon a lengthy calculation, yields

$$ C_V/k_B = \frac13\left(\frac{J}{k_BT}\right)^2 \frac{e^{J/k_BT}}{\left(1 + \tfrac13\, e^{J/k_BT}\right)^2}. \qquad (3.5) $$

This is identical to the standard expression for the Schottky anomaly up to the factor of 1/3 that
arises from the three-fold degeneracy of the triplet state.
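The "lengthy calculation" leading to Eq. (3.5) can be verified numerically by differentiating the free energy twice with a finite-difference stencil. The sketch assumes kB = 1 and J = 1; the step size dT is an arbitrary choice that is adequate for the temperatures shown.

```python
from math import exp, log


def free_energy(J, T):
    """F = -T ln Z with Z = 1 + 3 exp(-J/T), in units k_B = 1."""
    return -T * log(1 + 3 * exp(-J / T))


def specific_heat_numeric(J, T, dT=1e-3):
    """C_V = -T d^2F/dT^2 from a central second difference."""
    d2F = (free_energy(J, T + dT) - 2 * free_energy(J, T) + free_energy(J, T - dT)) / dT**2
    return -T * d2F


def specific_heat_closed(J, T):
    """Closed form of Eq. (3.5)."""
    x = J / T
    return (x**2 / 3) * exp(x) / (1 + exp(x) / 3) ** 2


for T in (0.3, 0.5, 1.0, 2.0):
    print(f"T = {T:.1f}: numeric {specific_heat_numeric(1.0, T):.5f}, "
          f"Eq. (3.5) {specific_heat_closed(1.0, T):.5f}")
```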
Spin dimer in the magnetic field: the four eigenstates of Eq. (3.1) are also eigenstates of
$\hat S^z$. Therefore, a magnetic field H applied along the z-direction adds the Zeeman term to the
spin Hamiltonian,³

$$ \mathcal{H} = J\,\hat{\mathbf S}_1\cdot\hat{\mathbf S}_2 - g\mu_B H\, \hat S^z, \qquad (3.6) $$

but has no effect on the eigenstates other than changing their energies. It is straightforward to
show that

$$ \varepsilon_1 = \tfrac14 J - g\mu_B H, \quad \varepsilon_2 = \tfrac14 J, \quad \varepsilon_3 = \tfrac14 J + g\mu_B H, \quad \varepsilon_4 = -\tfrac34 J. $$

This way, the magnetic field splits the three-fold degenerate triplet state, and, in particular, state 1
goes down in energy. It reaches the singlet energy at the critical field H = J/(gµB) known as the
saturation field, because above this field an antiferromagnetic dimer becomes ferromagnetic.

3.2 Role of quantum effects


Commutation relations: the fact that the eigenstates of Eq. (3.1) are simultaneously eigen-
states of Ŝ2 and Ŝ z is not accidental and arises from the symmetry of the Heisenberg model.
The Hamiltonian of Ŝ1 Ŝ2 type commutes with Ŝ2 and Ŝ z , so their eigenstates are common.
This is trivial in the case of Ŝ2 , which directly relates to Ŝ1 Ŝ2 via Eq. (3.2). The case of Ŝ z is
equally trivial, because
$$ \hat{\mathbf S}_1\cdot\hat{\mathbf S}_2 = -\frac34 + \frac12\, \hat{\mathbf S}^2, $$
whereas Ŝ2 and Ŝ z commute per Eq. (A.4).
³ Here, we again use CGS, where B ≃ H in the limit of small M. Note also that the minus sign in front of
the Zeeman term is a direct consequence of the sign convention introduced in Eq. (1.7).

The commutation relations may not hold when anisotropic terms appear in the spin Hamil-
tonian, though. Consider, for example, the XYZ Hamiltonian,
$$ \mathcal{H} = J^x \hat S_1^x \hat S_2^x + J^y \hat S_1^y \hat S_2^y + J^z \hat S_1^z \hat S_2^z, $$
and its commutator with $\hat S^z$,
$$ [\hat S^z, \mathcal{H}] = [\hat S_1^z + \hat S_2^z,\; J^x \hat S_1^x \hat S_2^x + J^y \hat S_1^y \hat S_2^y + J^z \hat S_1^z \hat S_2^z] = iJ^x(\hat S_1^y \hat S_2^x + \hat S_1^x \hat S_2^y) - iJ^y(\hat S_1^x \hat S_2^y + \hat S_1^y \hat S_2^x). $$
This becomes zero only when $J^x = J^y$. Therefore, eigenstates of the Heisenberg, XY, and
Ising Hamiltonians can be classified with respect to $S^z$ as long as $J^x = J^y$ and no off-diagonal
exchange terms are present.
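The commutator above can be checked explicitly for two spins-1/2 by building the 4×4 matrices from Kronecker products. The sketch uses nested Python lists with complex entries. All helper names are hypothetical, and the coupling values are arbitrary.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)] for i in range(n)]


def kron(A, B):
    """Kronecker product: A acts on spin 1, B acts on spin 2."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def madd(A, B, factor=1.0):
    """Element-wise A + factor * B."""
    return [[a + factor * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


I2 = [[1.0, 0.0], [0.0, 1.0]]
SX = [[0.0, 0.5], [0.5, 0.0]]
SY = [[0.0, -0.5j], [0.5j, 0.0]]
SZ = [[0.5, 0.0], [0.0, -0.5]]


def xyz_hamiltonian(Jx, Jy, Jz):
    """Two-spin XYZ Hamiltonian in the basis |uu>, |ud>, |du>, |dd>."""
    H = [[0.0] * 4 for _ in range(4)]
    for coupling, S in ((Jx, SX), (Jy, SY), (Jz, SZ)):
        H = madd(H, kron(S, S), coupling)
    return H


def commutator_norm(Jx, Jy, Jz):
    """Largest |matrix element| of [S^z, H] with S^z = S1^z + S2^z."""
    Sz_total = madd(kron(SZ, I2), kron(I2, SZ))
    H = xyz_hamiltonian(Jx, Jy, Jz)
    C = madd(matmul(Sz_total, H), matmul(H, Sz_total), -1.0)
    return max(abs(c) for row in C for c in row)


print("Jx = Jy :", commutator_norm(1.0, 1.0, 0.3))
print("Jx != Jy:", commutator_norm(1.0, 0.5, 0.3))
```

For Jx ≠ Jy the only non-zero elements connect |↑↑⟩ and |↓↓⟩, whose S^z values differ by two, and their magnitude is |Jx − Jy|/2.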
Ferromagnetic states: now, let’s go back to the Heisenberg Hamiltonian, but consider
arbitrary spins S1 = S2 = S and a ferromagnetic coupling J < 0. Using Eq. (3.2), we can write

$$ J\,\hat{\mathbf S}_1\cdot\hat{\mathbf S}_2 = \frac{J}{2}\left[(\hat{\mathbf S}_1 + \hat{\mathbf S}_2)^2 - \hat{\mathbf S}_1^2 - \hat{\mathbf S}_2^2\right] = -\frac{|J|}{2}\,(\hat{\mathbf S}_1 + \hat{\mathbf S}_2)^2 + |J|\,S(S+1). $$

The matrix element of this expression is the energy. With the second term positive, the lowest
energy is achieved when $(\hat{\mathbf S}_1 + \hat{\mathbf S}_2)^2$ takes its highest eigenvalue, $2S(2S+1)$, which corresponds to
the largest total spin allowed by the summation of the angular momenta, i.e., 2S. This way, we get
the ground-state energy of

$$ E_{\min}^{\rm FM} = |J|\,[S(S+1) - S(2S+1)] = -|J|S^2. \qquad (3.7) $$

Let’s denote the states by $|S_1^z, S_2^z\rangle$. Then two of the lowest-energy states of the ferromagnetic
Heisenberg model are $|S, S\rangle$ and $|-S, -S\rangle$. The $\hat S_1^z \hat S_2^z$ part of the Hamiltonian of Eq. (3.1)
yields $-|J|S^2$, whereas the $\hat S^+ \hat S^-$ terms yield zero, because the values of $S_1^z$ and $S_2^z$ can’t be
increased (decreased) by the ladder operators.
The same results hold for an extended lattice with multiple atoms and Heisenberg inter-
actions between them. The | ↑↑ . . . ↑↑i and | ↓↓ . . . ↓↓i states are eigenstates with the lowest
energy.
Antiferromagnetic states: the antiferromagnetic case is qualitatively different. We shall
again make use of Eq. (3.2) and write
$$ J\,\hat{\mathbf S}_1\cdot\hat{\mathbf S}_2 = \frac{J}{2}\left[(\hat{\mathbf S}_1 + \hat{\mathbf S}_2)^2 - \hat{\mathbf S}_1^2 - \hat{\mathbf S}_2^2\right] = \frac{J}{2}\,(\hat{\mathbf S}_1 + \hat{\mathbf S}_2)^2 - JS(S+1). $$
Now the second term is negative, and the lowest energy corresponds to the lowest eigenvalue
of $(\hat{\mathbf S}_1 + \hat{\mathbf S}_2)^2$, which is zero. Therefore,
$$ E_{\min}^{\rm AFM} = -JS(S+1). \qquad (3.8) $$
No simple eigenstates exist in this case, because no up-down state is an eigenstate.
For example,
$$ \hat S_1^- \hat S_2^+\, |S, -S\rangle \longrightarrow |S-1, -S+1\rangle. $$
This way, antiferromagnetic states are always a superposition of several ”simple” states of
$|S_1^z, S_2^z\rangle$ type, and quantum effects play a major role in antiferromagnets.
Classical vs. quantum: the ferromagnetic states are classical in the sense that their
energy does not depend on whether we treat spins as operators or vectors (see Sec. 1.3). On
the other hand, the antiferromagnetic states are fundamentally quantum in nature. In the
classical spin model, both |S, −S⟩ and |−S, S⟩ are eigenstates with the energy −JS², which is
obviously higher than that in Eq. (3.8). The relative difference between −JS² and −JS(S + 1) vanishes
at S → ∞. On the physical level, the S → ∞ limit implies the absence of spin quantization
and renders spin a simple vector. On the other hand, for spin-1/2 quantum effects are most
pronounced.
In short, ferromagnetic states are classical, whereas antiferromagnetic states are quantum.
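The comparison of Eqs. (3.7) and (3.8) with the classical energies can be tabulated for any spin value using the identity behind Eq. (3.2), E(S_tot) = (J/2)[S_tot(S_tot + 1) − 2S(S + 1)]. In the sketch below the spin is passed as the integer 2S to avoid half-integer loop bounds; the function names are hypothetical.

```python
def dimer_levels(two_S, J):
    """Energies E(S_tot) = (J/2) [S_tot(S_tot+1) - 2 S(S+1)] of two spins S = two_S / 2."""
    S = two_S / 2
    return {S_tot: 0.5 * J * (S_tot * (S_tot + 1) - 2 * S * (S + 1))
            for S_tot in range(two_S + 1)}


def afm_ground_energies(two_S, J=1.0):
    """Quantum (-J S(S+1)) and classical (-J S^2) antiferromagnetic ground-state energies."""
    S = two_S / 2
    quantum = min(dimer_levels(two_S, J).values())
    classical = -J * S**2
    return quantum, classical


for two_S in (1, 2, 5, 20, 200):
    q, c = afm_ground_energies(two_S)
    print(f"S = {two_S / 2:5.1f}: quantum {q:10.2f}, classical {c:10.2f}, ratio {q / c:.3f}")
```

The ratio equals (S + 1)/S: it is 3 for spin-1/2 and approaches 1 only for large spins, in line with the statement that the classical description becomes exact at S → ∞.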

3.3 Spin-1/2 triangle
We can now try to solve a slightly more complex problem of a spin triangle,

H = J (Ŝ1 Ŝ2 + Ŝ1 Ŝ3 + Ŝ2 Ŝ3 ), (3.9)

this time using the classification of eigenstates with respect to $S^z$. Our basis set will include
|↑↑↑⟩ with $S^z = \tfrac32$ and {|↑↑↓⟩, |↑↓↑⟩, |↓↑↑⟩} with $S^z = \tfrac12$, as well as four states, where all spins
are flipped ($S^z = -\tfrac12$ and $S^z = -\tfrac32$). The |↑↑↑⟩ state is an eigenstate of $\mathcal{H}$ as the only state
with $S^z = \tfrac32$, so we are left to diagonalize the 3 × 3 matrix of the $S^z = \tfrac12$ states,

$$ \mathcal{H}_{1/2} = J \begin{pmatrix} -\tfrac14 & +\tfrac12 & +\tfrac12 \\ +\tfrac12 & -\tfrac14 & +\tfrac12 \\ +\tfrac12 & +\tfrac12 & -\tfrac14 \end{pmatrix}. $$

This leads to the following spectrum

  #   Energy    State                          S^z    S
  1   3J/4      |↑↑↑⟩                          3/2    3/2
  2   3J/4      (|↑↑↓⟩ + |↑↓↑⟩ + |↓↑↑⟩)/√3     1/2    3/2
  3   −3J/4     (|↑↑↓⟩ − |↓↑↑⟩)/√2             1/2    1/2
  4   −3J/4     (|↑↑↓⟩ − |↑↓↑⟩)/√2             1/2    1/2

with another four states obtained by flipping all spins in 1−4.


The ground state of an antiferromagnetic spin triangle spans the states 3 and 4. Therefore,
at low temperatures the triangle behaves as a spin-1/2 entity. This is also reflected in its magnetic
susceptibility (calculated per triangle),

$$ \chi(T) = \frac{\chi_{1/2} + \chi_{3/2}\, e^{-3J/2k_BT}}{1 + e^{-3J/2k_BT}} = \frac{(g\mu_B)^2}{4k_BT}\, \frac{1 + 5\, e^{-3J/2k_BT}}{1 + e^{-3J/2k_BT}}, $$

where we could put a factor of 4 in front of each term, because there are two S = 1/2 states (each
double-degenerate) and one S = 3/2 state (four-fold degenerate), but then these factors of 4
cancel out. At T → 0, the exponentials are negligible, and the susceptibility of the whole triangle
$\chi \simeq (g\mu_B)^2/(4k_BT)$ is that of a single spin-1/2 ion. At T → ∞, one finds $\chi \simeq 3(g\mu_B)^2/(4k_BT)$,
the susceptibility of three independent spin-1/2 ions.
The classical solution for a spin-1/2 triangle lacks even a remote connection to this result. The
collinear |↑↑↓⟩ state has the energy of −J/4, whereas the lowest energy is achieved by the
120◦ spin arrangement with E120◦ = −3J/8. Periodic systems are more likely to adopt the 120◦
spin configuration, as in the Heisenberg model on the triangular lattice, but other types of
triangular networks can feature more intricate and more quantum states lacking the 120◦ spin
arrangement.
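The susceptibility of the triangle can be cross-checked in the same way as for the dimer, by averaging (S^z)² over the eight states of the spectrum listed above. The sketch assumes reduced units gµB = kB = 1 and J = 1; the function names are hypothetical.

```python
from math import exp


def chi_sum_over_states(levels, T):
    """chi = <(S^z)^2> / T for a list of (energy, S^z) pairs, with g*mu_B = k_B = 1."""
    Z = sum(exp(-E / T) for E, _ in levels)
    return sum(Sz**2 * exp(-E / T) for E, Sz in levels) / (T * Z)


def chi_triangle(J, T):
    """Closed-form susceptibility per triangle in the same reduced units."""
    b = exp(-1.5 * J / T)
    return (1 + 5 * b) / (1 + b) / (4 * T)


J = 1.0
quartet = [(0.75 * J, m) for m in (1.5, 0.5, -0.5, -1.5)]   # S = 3/2
doublets = [(-0.75 * J, m) for m in (0.5, -0.5)] * 2        # two S = 1/2 doublets
triangle_levels = quartet + doublets

for T in (0.05, 0.5, 5.0):
    print(f"T = {T:4.2f}: sum over states {chi_sum_over_states(triangle_levels, T):.6f}, "
          f"closed form {chi_triangle(J, T):.6f}")
```

The product T·χ interpolates between 1/4 (one effective spin-1/2) at low temperatures and 3/4 (three free spins-1/2) at high temperatures.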

3.4 Spin-1 dimer


As another example, we solve the Hamiltonian of Eq. (3.1) for S1 = S2 = 1. We can no longer
identify the states as up or down and will use the $S_i^z$ values instead. The $S^z = 2$ subspace
comprises the trivial |1, 1⟩ state with E = J. The $S^z = 1$ subspace is built of |1, 0⟩ and |0, 1⟩
states, with the relevant part of the Hamiltonian matrix (in units of J) given by

$$ \mathcal{H}_{S^z=1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (3.10) $$

Likewise, the $S^z = 0$ subspace has the dimension of 3 due to the states |1, −1⟩, |0, 0⟩, and
|−1, 1⟩ forming the matrix

$$ \mathcal{H}_{S^z=0} = \begin{pmatrix} -1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -1 \end{pmatrix}, \qquad (3.11) $$

where we used Eq. (A.16).
The full spectrum is then

  #   Energy   State                                S^z   S
  1   J        |1, 1⟩                                2     2
  2   J        (|1, 0⟩ + |0, 1⟩)/√2                  1     2
  3   J        (|1, −1⟩ + 2|0, 0⟩ + |−1, 1⟩)/√6      0     2
  4   −J       (|1, 0⟩ − |0, 1⟩)/√2                  1     1
  5   −J       (|1, −1⟩ − |−1, 1⟩)/√2                0     1
  6   −2J      (|1, −1⟩ − |0, 0⟩ + |−1, 1⟩)/√3       0     0

where the $S^z = -1$ and $S^z = -2$ states are again obtained by flipping all spins in the $S^z = 1$
and $S^z = 2$ states, respectively.
Although formed by two interacting atoms, the spin-1 dimer features three separate energy
levels with S = 0, 1, 2. This leads to the characteristic magnetization curve with two steps.
First, the S = 1 state is reached at Hc1 = J/(gµB ), then the S = 2 state gets stable above
Hc2 = 2J/(gµB ).
Magnetic susceptibility of the spin-1 dimer is calculated using the already familiar Van
Vleck formula,

$$ \chi = \frac{3\,\chi_{S=1}\, e^{-J/k_BT} + 5\,\chi_{S=2}\, e^{-3J/k_BT}}{1 + 3\, e^{-J/k_BT} + 5\, e^{-3J/k_BT}} = \frac{2(g\mu_B)^2}{k_BT}\, \frac{1 + 5\, e^{-2J/k_BT}}{3 + e^{J/k_BT} + 5\, e^{-2J/k_BT}}. \qquad (3.12) $$
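The prefactor of the closed form can be fixed by summing over the nine states of the spin-1 dimer and by checking the high-temperature limit, which must reproduce two free spin-1 ions, χ ≃ 4(gµB)²/(3kBT). The sketch below assumes reduced units gµB = kB = 1 and J = 1; the function names are hypothetical.

```python
from math import exp


def chi_sum_over_states(levels, T):
    """chi = <(S^z)^2> / T for a list of (energy, S^z) pairs, with g*mu_B = k_B = 1."""
    Z = sum(exp(-E / T) for E, _ in levels)
    return sum(Sz**2 * exp(-E / T) for E, Sz in levels) / (T * Z)


def chi_spin1_dimer(J, T):
    """Closed-form susceptibility per spin-1 dimer in the same reduced units."""
    x = J / T
    return (2.0 / T) * (1 + 5 * exp(-2 * x)) / (3 + exp(x) + 5 * exp(-2 * x))


J = 1.0
spin1_levels = [(-2 * J, 0)]                                   # S = 0
spin1_levels += [(-J, m) for m in (-1, 0, 1)]                  # S = 1
spin1_levels += [(J, m) for m in (-2, -1, 0, 1, 2)]            # S = 2

for T in (0.2, 1.0, 5.0):
    print(f"T = {T:3.1f}: sum over states {chi_sum_over_states(spin1_levels, T):.6f}, "
          f"closed form {chi_spin1_dimer(J, T):.6f}")
```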

3.5 Diagonalization methods


The solution of spin Hamiltonians by diagonalization is straightforward, but computationally
highly demanding. For a spin-1/2 system, each spin can have two states, up and down, so the
total number of states is 2^N in a system with N atoms. The Hamiltonian matrix then contains
2^N × 2^N = 2^(2N) elements. This is still feasible with N = 16, where the matrix size is about 34 GB
(8 bytes per matrix element), but at N = 22 the matrix swells to about 10^5 GB, which is hardly
feasible even for the most powerful modern computers. The symmetry can be of great advantage,
as it renders the matrix block-diagonal and reduces the number of matrix elements by a great
margin. Presently, the Heisenberg model can be solved exactly for systems with up to N = 24 sites.
This technique is known as exact diagonalization or full diagonalization. Once the energy
spectrum is obtained, all thermodynamic properties can be calculated.
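The memory estimates are simple to reproduce. The sketch below counts 8 bytes per element of a dense matrix and, as one example of a symmetry, uses the conservation of S^z discussed in Sec. 3.2: the largest block then contains the states with N/2 up spins. The function names are hypothetical, and real codes exploit further symmetries that are not included here.

```python
from math import comb


def dense_matrix_bytes(dim, bytes_per_element=8):
    """Memory needed to store a dense dim x dim matrix."""
    return dim * dim * bytes_per_element


def full_dimension(n_sites):
    """Number of states of n_sites spins-1/2."""
    return 2 ** n_sites


def largest_sz_block(n_sites):
    """Size of the largest fixed-S^z sector: states with n_sites // 2 up spins."""
    return comb(n_sites, n_sites // 2)


for n in (16, 22):
    full = dense_matrix_bytes(full_dimension(n))
    block = dense_matrix_bytes(largest_sz_block(n))
    print(f"N = {n}: full matrix {full / 1e9:.3g} GB, largest S^z block {block / 1e9:.3g} GB")
```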
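For larger systems sparse or Lanczos diagonalization can be used. The idea here is to
pick a random state $|\Phi_0\rangle$, which is not an eigenstate of the spin Hamiltonian $\mathcal{H}$. This state
can be represented by a linear combination of the eigenstates $|\psi_i\rangle$,

$$ |\Phi_0\rangle = \sum_i c_i\, |\psi_i\rangle. $$

By acting on it n times with $-\mathcal{H}$, one gets

$$ (-\mathcal{H})^n\, |\Phi_0\rangle = \sum_i c_i\, (-E_i)^n\, |\psi_i\rangle = (-E_0)^n \sum_i c_i \left(\frac{E_i}{E_0}\right)^n |\psi_i\rangle, $$

where $E_0$ is the ground-state energy. Because $E_0$ is the lowest energy (and assuming that it is also
the largest one in absolute value, which can always be arranged by a constant energy shift), the ratios
$x_i = E_i/E_0$ obey $|x_i| < 1$ for all excited states, hence $x_i^n \to 0$, whereas $x_0^n = 1$. This way, by the
repetitive application of $\mathcal{H}$ one approaches the ground-state energy and ground-state wavefunction.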
Lanczos diagonalization can be used for systems with up to N = 48 − 50 magnetic atoms.
Note however that the working algorithm is somewhat more complicated than the one described
above, because the simplest Lanczos scheme is numerically unstable and does not lead to
convergence.
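The basic idea of repeatedly applying −H can be illustrated with plain power iteration on the S^z = 0 block of the spin-1 dimer, Eq. (3.11), whose eigenvalues are 1, −1, and −2 in units of J. This is not the Lanczos algorithm itself, only the simplified scheme described above. The starting vector is an arbitrary choice with non-zero overlap with the ground state, and the helper names are hypothetical.

```python
from math import sqrt

# S^z = 0 block of the spin-1 dimer, Eq. (3.11), in units of J
H = [[-1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, -1.0]]


def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration(A, v0, n_steps=200):
    """Apply (-A) repeatedly with normalization; return (Rayleigh quotient, vector)."""
    v = list(v0)
    for _ in range(n_steps):
        w = [-x for x in matvec(A, v)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = matvec(A, v)
    energy = sum(x * y for x, y in zip(v, Av))
    return energy, v


energy, ground_state = power_iteration(H, [1.0, 0.3, 0.2])
print(f"E0 = {energy:.10f} J")
print("ground state:", [round(x, 6) for x in ground_state])
```

The iteration converges to E0 = −2J and to the singlet (|1, −1⟩ − |0, 0⟩ + |−1, 1⟩)/√3, because the admixture of the excited states is suppressed by a factor of at least 1/2 at every step.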

Further reading
• E. Sinn, Coord. Chem. Rev. 5, 313 (1970)
Perhaps the best collection of analytical solutions for finite spin systems.

• G.H. Golub and C.F. Van Loan. Matrix Computations


Not a magnetism textbook, but it gives a good overview of the Lanczos method

4 Ising model
The Ising model introduced in Eq. (1.10) is of fundamental importance for magnetism and physics
in general. It does not involve any quantum effects, as one can readily see from the fact that
any classical state of |↑↑⟩ or |↑↓⟩ type is an eigenstate. The physics is, nevertheless, very rich,
although independent of the spin value (up to the normalization by S²). It is thus common to
use Si = ±1 when solving the Ising model.
Historically, the model was developed to reproduce the ferromagnetic transition, but it
turned out that no transition occurs in 1D. Decades later the transition was demonstrated for
the 2D case. It was the first analytical solution for a phase transition in a periodic system. No
exact solution in 3D is available to date.

4.1 1D model in zero field


Absence of magnetic order: let’s consider the nearest-neighbor ferromagnetic Ising model
on a finite chain of length N,

$$ \mathcal{H} = -J \sum_{i=1}^{N-1} S_i^z S_{i+1}^z \qquad (4.1) $$

(J > 0). Its ground state is the ferromagnetic state with all spins parallel and the energy
$E_{\rm FM} = -(N-1)J$, because the open chain contains N − 1 bonds. The free energy equals $E_{\rm FM}$,
because there is only one ferromagnetic configuration, and entropy is zero. This way,
$F_{\rm FM}(N, T) = -(N-1)J$.
The lowest-energy defect is the creation of a domain wall, which has the energy

Ewall (N ) = −(N − 2)J + J = −(N − 1)J + 2J. (4.2)

The domain wall can appear at any position between 1 and N − 1, so its entropy is Swall =
kB ln(N − 1), and the free energy,

Fwall (N, T ) = −(N − 1)J + 2J − kB T ln(N − 1). (4.3)

The free energy difference is then

∆F = Fwall (N, T ) − FFM (N, T ) = 2J − kB T ln(N − 1) (4.4)

and negative for N → ∞ at any non-zero temperature. Therefore, the nearest-neighbor ferromagnetic
Ising chain is unstable with respect to domain wall creation. This result may change if
we include long-range interactions, though.
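Eq. (4.4) also tells us how long a chain must be before a domain wall becomes favorable at a given temperature: ∆F changes sign at N − 1 = exp(2J/kBT). The sketch below assumes kB = 1 and uses hypothetical function names.

```python
from math import exp, log


def delta_F(N, J, T):
    """Free-energy cost of one domain wall, Eq. (4.4), in units k_B = 1."""
    return 2 * J - T * log(N - 1)


def critical_length(J, T):
    """Chain length above which delta_F < 0, i.e., N - 1 = exp(2J / T)."""
    return 1 + exp(2 * J / T)


J = 1.0
for T in (1.0, 0.5, 0.2, 0.1):
    print(f"T = {T:.1f} J: domain walls are favorable for N > {critical_length(J, T):.3g}")
```

The critical length grows exponentially upon cooling, so a finite chain may look ordered at low temperatures even though the infinite chain is not.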
Partition function: we shall introduce K = J/(kB T) > 0, omit the z superscripts (this
convention will be used throughout Sec. 4), and write the partition function as

$$ Z_N(K) = \sum_{\{S\}} \exp\left(K \sum_i S_i S_{i+1}\right) = \sum_{S_1} \ldots \sum_{S_N} \exp\left(K \sum_{i=1}^{N-1} S_i S_{i+1}\right), \qquad (4.5) $$

where {S} denotes the summation over all possible states. This can be re-written using bond
variables $\eta_i = S_i S_{i+1}$, which take the values of +1 and −1 depending on the spin arrangement
on the bond i (ferromagnetic or antiferromagnetic). This way,

$$ Z_N^{\rm open}(K) = \sum_{\eta_0} \sum_{\eta_1} \ldots \sum_{\eta_{N-1}} e^{K(\eta_1 + \ldots + \eta_{N-1})} = 2\,(2\cosh K)^{N-1}, \qquad (4.6) $$

where $\eta_0$ stands simply for the sign of $S_1$, and the last part follows from $e^K + e^{-K} = 2\cosh K$,
because η = ±1.

Two strategies can be used at this juncture. The partition function of Eq. (4.6) corresponds
to a finite chain with open boundary conditions. Alternatively, periodic boundary conditions
$S_{N+1} = S_1$ can be introduced (spin chain folded into a ring). The choice of the boundary
conditions will affect the partition function $Z_N$, although the ensuing free energy should not
depend on the boundary conditions in the thermodynamic limit N → ∞. We shall verify this
below.
Compared to Eq. (4.5), the $S_N S_1$ term should be added into the exponent,

$$ Z_N^{\rm periodic}(K) = \sum_{S_1} \ldots \sum_{S_N} \exp\left(K \sum_{i=1}^{N-1} S_i S_{i+1} + K S_N S_1\right). $$

This additional term can be represented as follows,

$$ S_N S_1 = S_N S_{N-1}\, S_{N-1} S_{N-2} \cdot \ldots \cdot S_2 S_1 = \eta_{N-1}\, \eta_{N-2} \cdot \ldots \cdot \eta_1, $$

where we took advantage of the fact that $S_i^2 = 1$. Such a counter-intuitive but useful
representation leads to the following,

$$ \begin{aligned} Z_N^{\rm periodic}(K) &= \sum_{\eta_0} \sum_{\eta_1} \ldots \sum_{\eta_{N-1}} e^{K(\eta_1 + \ldots + \eta_{N-1}) + K \eta_1 \eta_2 \ldots \eta_{N-1}} = \\ &= 2 \sum_{\eta_1} \ldots \sum_{\eta_{N-1}} e^{K(\eta_1 + \ldots + \eta_{N-1})} \sum_{\alpha=0}^{\infty} \frac{(K \eta_1 \ldots \eta_{N-1})^{\alpha}}{\alpha!} = 2 \sum_{\alpha=0}^{\infty} \frac{K^{\alpha}}{\alpha!} \left[ \sum_{\eta_1} \ldots \sum_{\eta_{N-1}} \eta_1^{\alpha} e^{K\eta_1} \ldots \eta_{N-1}^{\alpha} e^{K\eta_{N-1}} \right] \end{aligned} $$

using Taylor expansion for $e^{K \eta_1 \ldots \eta_{N-1}}$. The term inside the square brackets is essentially the
same summation performed N − 1 times, so

$$ \begin{aligned} Z_N^{\rm periodic}(K) &= 2 \sum_{\alpha=0}^{\infty} \frac{K^{\alpha}}{\alpha!} \left[ \sum_{\eta} \eta^{\alpha} e^{K\eta} \right]^{N-1} = 2 \sum_{\alpha=0}^{\infty} \frac{K^{\alpha}}{\alpha!} \left[ e^{K} + (-1)^{\alpha} e^{-K} \right]^{N-1} = \\ &= 2 \left[ \sum_{\alpha\ {\rm even}} \frac{K^{\alpha}}{\alpha!}\, (e^{K} + e^{-K})^{N-1} + \sum_{\alpha\ {\rm odd}} \frac{K^{\alpha}}{\alpha!}\, (e^{K} - e^{-K})^{N-1} \right] = \\ &= 2\,(2\cosh K)^{N-1} \cosh K + 2\,(2\sinh K)^{N-1} \sinh K, \end{aligned} $$

where we noted that the summations of $K^{\alpha}/\alpha!$ with even and odd powers of α are essentially
Taylor expansions for cosh K and sinh K, respectively. Altogether, we arrive at

$$ Z_N^{\rm periodic}(K) = (2\cosh K)^N + (2\sinh K)^N. \qquad (4.7) $$
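Both closed forms, Eqs. (4.6) and (4.7), can be confirmed by brute-force enumeration of all 2^N spin configurations of a short chain. The sketch below uses only the standard library; the chain lengths and the value of K are arbitrary, and the function names are hypothetical.

```python
from itertools import product
from math import cosh, sinh, exp


def z_brute_force(N, K, periodic=False):
    """Partition function from explicit summation over all 2^N Ising configurations."""
    Z = 0.0
    for spins in product((1, -1), repeat=N):
        bonds = sum(spins[i] * spins[i + 1] for i in range(N - 1))
        if periodic:
            bonds += spins[-1] * spins[0]   # extra S_N S_1 bond closing the ring
        Z += exp(K * bonds)
    return Z


def z_open(N, K):
    """Eq. (4.6)."""
    return 2 * (2 * cosh(K)) ** (N - 1)


def z_periodic(N, K):
    """Eq. (4.7)."""
    return (2 * cosh(K)) ** N + (2 * sinh(K)) ** N


K = 0.7
for N in (3, 6, 8):
    print(f"N = {N}: open {z_brute_force(N, K):.6f} vs {z_open(N, K):.6f}, "
          f"periodic {z_brute_force(N, K, periodic=True):.6f} vs {z_periodic(N, K):.6f}")
```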

Free energy: the Helmholtz free energy can be calculated using either Eq. (4.6) or (4.7)
for open and periodic boundary conditions, respectively. In the former case,

$$ F_N(K, T) = -k_BT \ln Z_N^{\rm open} = -k_BT \left[ N \ln 2 + (N-1) \ln(\cosh K) \right], $$

$$ \frac{F_N(K, T)}{N} = -k_BT \left[ \ln 2 + \frac{N-1}{N} \ln(\cosh K) \right], $$

which in the N → ∞ limit yields

$$ F(K, T) = -k_BT \left[ \ln 2 + \ln(\cosh K) \right] = -k_BT \ln(2\cosh K). \qquad (4.8) $$

Alternatively, we can start from Eq. (4.7) and write

$$ \begin{aligned} F_N(K, T) &= -k_BT \ln Z_N^{\rm periodic} = -k_BT \ln\left[ (2\cosh K)^N + (2\sinh K)^N \right] = \\ &= -k_BT \ln\left[ (2\cosh K)^N \left(1 + (\tanh K)^N\right) \right] = -k_BT \left\{ N \ln(2\cosh K) + \ln\left[1 + (\tanh K)^N\right] \right\}. \end{aligned} $$

Then,

$$ \frac{F_N(K, T)}{N} = -k_BT \ln(2\cosh K) - k_BT\, \frac{\ln[1 + (\tanh K)^N]}{N}, \qquad (4.9) $$

where the second term vanishes at N → ∞, because $|\tanh x| \le 1$. This way, we recover the result
of Eq. (4.8) and conclude that the free energy in the thermodynamic limit (N → ∞) does not
depend on the boundary conditions, as expected on physical grounds.
At low temperatures, K → ∞, and $\cosh K \simeq e^K/2$, hence F ≃ −J. At high temperatures,
K → 0, and $F \simeq -k_BT \ln 2$. In other words, only entropy matters at high temperatures,
whereas only internal energy matters at low temperatures.
Spin-spin correlations: like any other thermodynamic quantity, spin-spin correlations
can be calculated via the partition function. Consider nearest neighbors first,

$$ \langle S_j S_{j+1} \rangle = \frac{1}{Z_N} \sum_{\{S\}} S_j S_{j+1} \exp\left( K \sum_{i=1}^{N-1} S_i S_{i+1} \right). $$

The summation is performed similar to Eq. (4.6), but excluding $\eta_j$, because it has different
signs in front of the exponent depending on $\eta_j = S_j S_{j+1} = \pm 1$,

$$ \begin{aligned} \langle S_j S_{j+1} \rangle &= \frac{1}{Z_N} \sum_{\eta_0 \ldots \eta_{N-1},\ {\rm no}\ \eta_j} \left[ e^{K(\eta_1 + \ldots + \eta_{j-1} + 1 + \eta_{j+1} + \ldots + \eta_{N-1})} - e^{K(\eta_1 + \ldots + \eta_{j-1} - 1 + \eta_{j+1} + \ldots + \eta_{N-1})} \right] = \\ &= \frac{2 \cdot 2\sinh K\, (2\cosh K)^{N-2}}{Z_N} = \frac{\sinh K}{\cosh K} = \tanh K, \end{aligned} $$

where the factor of 2 in front comes from the summation over $\eta_0$, and $Z_N = 2\,(2\cosh K)^{N-1}$ per Eq. (4.6).
Correlations beyond nearest neighbors can be calculated by combining the nearest-neighbor
ones,

$$ \langle S_j S_{j+l} \rangle = \langle S_j S_{j+1}\, S_{j+1} S_{j+2} \ldots S_{j+l-1} S_{j+l} \rangle = \langle S_j S_{j+1} \rangle \cdot \ldots \cdot \langle S_{j+l-1} S_{j+l} \rangle = (\tanh K)^l, \qquad (4.10) $$

where we used the fact that $S_i S_i = 1$, so any pair of spins like $S_{j+1} S_{j+1}$ can be placed inside the
correlator. Moreover, we can split the correlator into pair-wise correlators, because interactions
run between nearest neighbors only, and quantities like $S_j S_{j+1}$ and $S_{j+1} S_{j+2}$ are independent.
Since $|\tanh x| < 1$ for any finite x and only becomes unity for x → ∞, we conclude that spin-
spin correlations decay along the chain. At low temperatures, spins are short-range-ordered,
and ”feel” each other despite the short-range nature of the interactions (nearest-neighbor).
However, the whole chain becomes ordered only at K = ∞ that corresponds to T = 0. This
way, we corroborate our previous result that no magnetic order exists in an Ising chain at
T ≠ 0.
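The exponential decay of Eq. (4.10) can be verified by explicit enumeration for a short open chain. In the sketch below the chain length, the value of K, and the reference site are arbitrary choices; sites are counted from zero, and the function name is hypothetical.

```python
from itertools import product
from math import exp, tanh


def correlation_brute_force(N, K, j, l):
    """<S_j S_{j+l}> for an open Ising chain of N spins (0-based site index j)."""
    Z = 0.0
    acc = 0.0
    for s in product((1, -1), repeat=N):
        weight = exp(K * sum(s[i] * s[i + 1] for i in range(N - 1)))
        Z += weight
        acc += s[j] * s[j + l] * weight
    return acc / Z


N, K, j = 8, 0.6, 2
for l in range(1, 5):
    print(f"l = {l}: brute force {correlation_brute_force(N, K, j, l):.6f}, "
          f"(tanh K)^l = {tanh(K) ** l:.6f}")
```

Writing (tanh K)^l = exp(−l/ξ) defines the correlation length ξ = −1/ln(tanh K), which diverges only at K → ∞, i.e., at T = 0.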
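Specific heat: we could use Eq. (3.4) again, but it is more convenient to calculate specific
heat as the derivative of the internal energy U, which, in turn, is obtained as

$$ U_N = k_BT^2 \left( \frac{\partial \ln Z_N}{\partial T} \right)_V = -(N-1)\, J \tanh\left( \frac{J}{k_BT} \right) \quad \Rightarrow \quad U(T) = \lim_{N \to \infty} \frac{U_N}{N} = -J \tanh\left( \frac{J}{k_BT} \right). $$

Then,

$$ C_V = \left( \frac{\partial U}{\partial T} \right)_V = \frac{J^2}{k_BT^2}\, \frac{1}{\cosh^2(J/k_BT)}. \qquad (4.11) $$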

It gives us a glimpse of the thermodynamic behavior. At high temperatures, $C_V \sim 1/T^2$ (as
in any cooperative magnet). At low temperatures, $C_V \sim e^{-2J/k_BT}/T^2$, and the specific heat
vanishes exponentially indicating a gap on the order of 2J in the excitation spectrum. This
result is fairly general, because excitations of an Ising magnet always involve a spin flip that
breaks two bonds and requires the energy on the order of J (this energy becomes 2J in the 1D
case).

4.2 1D model in the magnetic field


Partition function: we shall now update our Ising Hamiltonian of Eq. (4.1) with the Zeeman
term containing the field applied along the z direction (Ising model in longitudinal field),

$$ \mathcal{H} = -J \sum_{i=1}^{N} S_i^z S_{i+1}^z - g\mu_B H \sum_{i=1}^{N} S_i^z, \qquad (4.12) $$

and attempt to solve this problem too. Under periodic boundary conditions ($S_{N+1} = S_1$) the
partition function becomes

$$ \begin{aligned} Z_N(h, K) &= \sum_{\{S\}} \exp\left( h \sum_{i=1}^{N} S_i + K \sum_{i=1}^{N} S_i S_{i+1} \right) = \\ &= \sum_{S_1} \ldots \sum_{S_N} \left[ e^{\frac{h}{2}(S_1 + S_2) + K S_1 S_2} \right] \left[ e^{\frac{h}{2}(S_2 + S_3) + K S_2 S_3} \right] \ldots \left[ e^{\frac{h}{2}(S_N + S_1) + K S_N S_1} \right], \end{aligned} $$

where we introduced h = gµB H/(kB T).


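The quantities in square brackets are transfer functions,

$$ T_{i,i+1} = e^{\frac{h}{2}(S_i + S_{i+1}) + K S_i S_{i+1}}. \qquad (4.13) $$

Let’s define the transfer matrix operator $\mathbf{T}$ in such a way that $\langle S_i | \mathbf{T} | S_{i+1} \rangle = T_{i,i+1}$. Then the
partition function can be written in a remarkably simple form

$$ Z_N(h, K) = \sum_{S_1} \ldots \sum_{S_N} \langle S_1 | \mathbf{T} | S_2 \rangle \langle S_2 | \mathbf{T} | S_3 \rangle \ldots \langle S_N | \mathbf{T} | S_1 \rangle = \sum_{S_1} \langle S_1 | \mathbf{T}^N | S_1 \rangle = {\rm Tr}\, (\mathbf{T}^N), \qquad (4.14) $$

where we used the completeness property, namely, for an orthonormal basis set $|\lambda\rangle$ the
summation over all states yields unity,

$$ \sum_{\lambda} |\lambda\rangle \langle\lambda| = 1. $$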

This way, we reduced the problem of calculating the partition function to the problem of
calculating the trace of a matrix, which we don’t know yet, but will find quite appealing once
we do.
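Transfer matrix: Eq. (4.14) shows that we are only interested in the representation of $\mathbf{T}$
in the basis of $S_1$ states, which are $S_1 = 1$ and $S_1 = -1$. Therefore, the transfer matrix takes the
form

$$ \mathbf{T} = \begin{pmatrix} T_{1,1} & T_{1,-1} \\ T_{-1,1} & T_{-1,-1} \end{pmatrix} = \begin{pmatrix} e^{h+K} & e^{-K} \\ e^{-K} & e^{-h+K} \end{pmatrix} \qquad (4.15) $$

Calculating the trace of $\mathbf{T}^N$ is more of a technical problem. Let’s use a unitary transformation
to diagonalize $\mathbf{T}$,

$$ \mathbf{T}' = \mathbf{M}^{-1} \mathbf{T} \mathbf{M}, \quad {\rm where} \quad \mathbf{M}^{-1} = \mathbf{M}^{\dagger}, \quad \mathbf{T}' = \begin{pmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{pmatrix}. $$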

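The trace in question is then directly related to $\lambda_+$ and $\lambda_-$, because

$$ {\rm Tr}\, (\mathbf{T}^N) = {\rm Tr}\, (\mathbf{T}^N \mathbf{M} \mathbf{M}^{-1}) = {\rm Tr}\, (\mathbf{M}^{-1} \mathbf{T}^N \mathbf{M}) = {\rm Tr}\, [(\mathbf{M}^{-1} \mathbf{T} \mathbf{M})^N] = {\rm Tr}\, [(\mathbf{T}')^N] = \lambda_+^N + \lambda_-^N. $$

Free energy: we are left to compute eigenvalues of $\mathbf{T}$,

$$ \begin{vmatrix} e^{h+K} - \lambda & e^{-K} \\ e^{-K} & e^{-h+K} - \lambda \end{vmatrix} = 0 \quad \Rightarrow \quad \lambda_{\pm} = e^{K} \left( \cosh h \pm \sqrt{\sinh^2 h + e^{-4K}} \right). $$

We can also write

$$ Z_N(h, K) = \lambda_+^N \left[ 1 + \left( \frac{\lambda_-}{\lambda_+} \right)^N \right]. $$

Because $\lambda_-/\lambda_+ < 1$, the second term vanishes in the N → ∞ limit. Therefore, in the
thermodynamic limit,

$$ F(h, K, T) = \lim_{N \to \infty} \frac{F_N(h, K, T)}{N} = -k_BT \ln \lambda_+ = -J - k_BT \ln\left( \cosh h + \sqrt{\sinh^2 h + e^{-4K}} \right). $$

At low temperatures, the $e^{-4K}$ term can be neglected, and

$$ F(h, K, T) \simeq -J - k_BT \ln(\cosh h + |\sinh h|) = -(J + g\mu_B |H|) $$

that matches the low-temperature limit of the zero-field case. Likewise, at high temperatures,
h ≃ 0 and K ≃ 0 leads to $\lambda_+ \simeq 2$ and $F = -k_BT \ln 2$ (entropy of an Ising spin). Finally, at
h = 0, $\lambda_+ = 2\cosh K$, and we recover Eq. (4.8), as expected.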
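The transfer-matrix result can be tested against a brute-force sum over all configurations of a short ring in a field. The sketch below uses the reduced variables h and K of the text; the ring length and the parameter values are arbitrary, and the function names are hypothetical.

```python
from itertools import product
from math import cosh, sinh, exp, sqrt


def transfer_eigenvalues(h, K):
    """Eigenvalues lambda_+ and lambda_- of the transfer matrix, Eq. (4.15)."""
    root = sqrt(sinh(h) ** 2 + exp(-4 * K))
    return exp(K) * (cosh(h) + root), exp(K) * (cosh(h) - root)


def z_transfer(N, h, K):
    """Z_N = Tr(T^N) = lambda_+^N + lambda_-^N."""
    lam_plus, lam_minus = transfer_eigenvalues(h, K)
    return lam_plus ** N + lam_minus ** N


def z_brute_force(N, h, K):
    """Explicit sum over all 2^N configurations of a periodic Ising chain in a field."""
    Z = 0.0
    for s in product((1, -1), repeat=N):
        field = sum(s)
        bonds = sum(s[i] * s[(i + 1) % N] for i in range(N))
        Z += exp(h * field + K * bonds)
    return Z


N, h, K = 6, 0.3, 0.8
print(f"transfer matrix: {z_transfer(N, h, K):.8f}")
print(f"brute force:     {z_brute_force(N, h, K):.8f}")
```

At h = 0 the eigenvalues reduce to 2 cosh K and 2 sinh K, which reproduces Eq. (4.7).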
Magnetization is calculated as the derivative of F with respect to H (with the minus
sign). At T = 0 the result is trivial,
(
+1, H > 0
M/(gµB ) =
−1, H < 0

At T 6= 0,
 
gµB sinh h 1 + √ cosh h
∂F sinh2 h+e−4K gµB sinh h
M =− = p =p . (4.16)
∂H cosh h + sinh2 h + e−4K sinh2 h + e−4K
Note that we normalized the free energy per spin and thus obtained the magnetization per
atom.
Magnetic susceptibility is obtained as the low-field limit of dM/dH. To simplify the
calculation, we first calculate M in the low-field limit using sinh h ≃ h and sinh² h ≪ e^{−4K}.
Then,
$$
\chi = \lim_{H \to 0} \frac{dM}{dH} \simeq \frac{d}{dH} \left( g\mu_B\, h\, e^{2K} \right)
= (g\mu_B)^2 \, \frac{e^{2J/k_B T}}{k_B T} . \tag{4.17}
$$
This corresponds to the 1/T behavior at high temperatures and exponential divergence at low
temperatures. Here, again, we obtained the susceptibility per atom and may have to multiply
it by N_A, should molar susceptibility be of interest. Note also that the paramagnetic effective
moment in this case is gµB S and not gµB √(S(S + 1)). Compared to the standard Curie law,
the factor of 3 in the denominator is missing, because Ising spins have only two orientations, so
they don't average to S(S + 1)/3.
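Both Eq. (4.16) and Eq. (4.17) are easy to verify numerically. The following sketch works in
reduced units that are an assumption of this example (m = M/(gµB), h = gµB H/(kB T),
K = J/(kB T)); it differentiates ln λ+ by finite differences and compares the result with the
closed-form magnetization, and it estimates the zero-field susceptibility as the slope of m(h).

```python
import math


def ln_lambda_plus(h, K):
    """ln(lambda_+) = -F / (k_B T) per spin in the thermodynamic limit."""
    return K + math.log(math.cosh(h) + math.sqrt(math.sinh(h) ** 2 + math.exp(-4.0 * K)))


def reduced_magnetization(h, K):
    """m = M / (g mu_B) from Eq. (4.16)."""
    return math.sinh(h) / math.sqrt(math.sinh(h) ** 2 + math.exp(-4.0 * K))


def reduced_susceptibility(K, dh=1e-6):
    """chi * k_B T / (g mu_B)^2, estimated as the slope dm/dh at h = 0."""
    return (reduced_magnetization(dh, K) - reduced_magnetization(-dh, K)) / (2.0 * dh)
```

Under these assumptions reduced_susceptibility(K) should reproduce e^{2K}, the prefactor of
Eq. (4.17), and the finite-difference derivative of ln_lambda_plus with respect to h should
match reduced_magnetization.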

Low-energy excitations deserve a separate comment. The simplest possible excitation
involves a single spin flip and creates two domain walls (kinks). These domain walls can separate
from each other and propagate along the chain at no energy cost in zero field. In a spin-1/2 Ising
chain, each of the domain walls carries spin-1/2, so one considers this situation as the creation of
two free spinon quasiparticles.
Longitudinal field H plays against the separation, as it increases the energy by nH, where
n is the number of flipped spins (separation between the domain walls). In other words, the
region between the two domain walls represents a string with the tension λ = H/a, where a
is the lattice period. String tension leads to spinon confinement, i.e., spinons can no longer
propagate freely. Instead, each jump of a spinon involves an energy cost of H, and discrete
bound states of spinons are formed. These bound states are readily tracked experimentally
even in zero field, because interchain interactions create an effective (molecular) field that binds
the spinons.
The spectrum of bound states (individual strings) is limited from above by the two-string
continuum. Indeed, a sufficiently long string can break into two pieces, again at no energy
cost. This behavior shows close similarity to quark confinement in hadrons, where an attempt
to separate a quark from an antiquark leads to the creation of new quark-antiquark pairs. If
you want to read more on this, check [R. Coldea et al. Science 327, 177 (2010)] as well as its
popular summary in [I. Affleck, Nature 464, 362 (2010)] and references therein.
Effect of transverse field: we note in passing that a transverse field,
$$
\mathcal{H} = -J \sum_{i=1}^{N-1} S_i^z S_{i+1}^z - H \sum_{i=1}^{N} S_i^x , \tag{4.18}
$$
leads to drastically different physics, because |↑↑⟩ is no longer an eigenstate of the
Hamiltonian. The Ising model becomes quantum in this case, with more than one spin component
contributing to the magnetization. Here, we deem the z axis fixed by the local anisotropy
(crystal field), and the external field applied perpendicular to this axis.

4.3 Mean-field solution


Solving Ising model beyond 1D proves to be difficult, so different approximations were developed
over time. We shall consider the simplest one, Weiss molecular-field theory. It will give us a feel
of how good and how bad such an approach can be. Other mean-field solutions (by Bogolyubov,
Bragg-Williams, Bethe) were designed to improve over the molecular-field one. While they do
better in some respects, they also share all major deficiencies that we shall arrive at below.
Formulation of the problem: Let's define the magnetization as the average value of
a single spin, M = gµB ⟨S_i⟩. We then decompose each spin into this average value and the
deviation from the average value, S_j = M/(gµB) + [S_j − M/(gµB)]. The idea behind the
molecular-field approach is choosing a single spin S_0 and treating the influence of its neighbors
as an effective magnetic field. Then,
$$
\mathcal{H}_{MF} = -S_0 \left( J \sum_{j=1}^{z} S_j + g\mu_B H \right)
= -g\mu_B S_0 \left[ \frac{zJM}{(g\mu_B)^2} + H \right]
  - J S_0 \sum_{j=1}^{z} \left[ S_j - \frac{M}{g\mu_B} \right],
$$
where H is the external field, and we do the summation only within z nearest neighbors of the
reference spin S_0.
The second part of the molecular-field approximation requires that spin fluctuations are set
to zero, i.e., S_j − M/(gµB) = S_j − ⟨S_j⟩ → 0. This way, we forget about the second term and
arrive at the molecular-field Hamiltonian
$$
\mathcal{H}_{MF} = -g\mu_B S_0 \left[ \frac{zJM}{(g\mu_B)^2} + H \right] = -g\mu_B S_0 H_{\rm eff} \tag{4.19}
$$
that describes the behavior of S_0 in the effective field H_eff. The influence of neighboring spins
is now contained in H_eff.
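Magnetization: with S_0 = ±1 the partition function is trivial,
$$
Z_{MF} = e^{g\mu_B H_{\rm eff}/k_B T} + e^{-g\mu_B H_{\rm eff}/k_B T} = 2 \cosh \frac{g\mu_B H_{\rm eff}}{k_B T},
$$
and leads to the thermal average of S_0 (viz. magnetization M),
$$
M = g\mu_B \langle S_0 \rangle
= g\mu_B \, \frac{e^{g\mu_B H_{\rm eff}/k_B T} - e^{-g\mu_B H_{\rm eff}/k_B T}}{2 \cosh (g\mu_B H_{\rm eff}/k_B T)}
= g\mu_B \tanh \frac{g\mu_B H_{\rm eff}}{k_B T}
= g\mu_B \tanh \left[ \frac{g\mu_B}{k_B T} \left( \frac{zJM}{(g\mu_B)^2} + H \right) \right]. \tag{4.20}
$$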

This is a self-consistent equation for the magnetization. Let's explore the zero-field case
only (H = 0). M = 0 is one solution (paramagnetic), but two other solutions may exist too.
Note that |tanh x| ≤ 1 bounds the right-hand side, whereas the left-hand side (M) grows
without limit. Therefore, the two sides must cross at some M ≠ 0 whenever the initial slope
of the right-hand side exceeds 1. For H = 0 this leads to the condition
$$
\frac{d}{dM} \left[ g\mu_B \tanh \frac{zJM}{g\mu_B k_B T} \right]_{M=0} \geq 1
\quad \Rightarrow \quad
\frac{zJ}{k_B T} \, \frac{1}{\cosh^2 (zJM/g\mu_B k_B T)} \bigg|_{M=0} = \frac{zJ}{k_B T} \geq 1.
$$
This way, at low enough temperatures a solution with M ≠ 0 appears. The critical temperature
Tc is given by a simple expression
$$
k_B T_c = zJ. \tag{4.21}
$$
The result is obviously incorrect in 1D (z = 2), where no phase transition occurs. It is
also quite inaccurate in 2D (z = 4), where the exact value kB Tc = 2.269J is lower by almost
a factor of two, as we shall see below. On the other hand, in 3D (z = 6) the overestimate of
the transition temperature becomes less dramatic, with the accurate numerical value of
kB Tc = 4.511J, only 25% lower than the molecular-field result. We note that in general the
mean-field approach overestimates Tc. The error becomes smaller with the increase of
dimensionality and z. This can be understood from the fact that the mean-field theory neglects
spin fluctuations, which prevent the system from ordering. These fluctuations become less
important in higher dimensions and in the presence of multiple neighbors, both of which bring
the system closer to the mean-field behavior.
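The self-consistent equation is easily solved by iteration. The sketch below assumes zero
field and reduced variables introduced only for this example: m = M/(gµB) and
t = kB T/(zJ) = T/Tc, so that Eq. (4.20) reads m = tanh(m/t). Starting from saturation, the
iteration settles on the ordered solution below Tc and decays to m = 0 above Tc.

```python
import math


def mean_field_magnetization(t, n_iter=2000):
    """Fixed-point iteration of m = tanh(m / t) at H = 0, with t = T / Tc."""
    m = 1.0  # start from the saturated state to land on the ordered branch
    for _ in range(n_iter):
        m = math.tanh(m / t)
    return m
```

Convergence slows down close to t = 1, where the slopes of the two sides of the equation
nearly coincide, so a root finder would be a better choice in that region.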
Critical behavior: magnetization vanishes at Tc, so in the vicinity of Tc it can be treated
as a small parameter, such that tanh x = x − x³/3 + O(x⁵), and
$$
M \simeq \frac{zJM}{k_B T} - \frac{1}{3}\, g\mu_B \left( \frac{zJM}{g\mu_B k_B T} \right)^3,
$$
hence
$$
M = \pm \sqrt{3}\, g\mu_B \left( \frac{k_B T}{zJ} \right)^{3/2} \left( \frac{zJ}{k_B T} - 1 \right)^{1/2}
  = \pm \sqrt{3}\, g\mu_B \left( \frac{T}{T_c} \right)^{3/2} \left( \frac{T_c}{T} - 1 \right)^{1/2}. \tag{4.22}
$$
This expression demonstrates that M(T) vanishes upon approaching Tc from below and follows the
(Tc − T)^{1/2} power law. It is an example of the critical behavior M ∼ (Tc − T)^β, where β = 1/2
stands for the critical exponent that is usually generic for a given system type (Ising magnet in
our case). The prediction of the power-law behavior is qualitatively correct, although details are
totally wrong, as usual for the mean-field theory. Exact solution yields β = 1/8 in 2D, whereas
β ≃ 0.32 in 3D, both quite far from the mean-field prediction.
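To see how well the expansion works, one can compare it with the full solution of the
mean-field equation. The sketch below is illustrative only and uses reduced units assumed for
this example (m = M/(gµB), t = T/Tc, H = 0): it finds the nonzero root of m = tanh(m/t) by
bisection, evaluates Eq. (4.22), and estimates the exponent β from two temperatures close to Tc.

```python
import math


def exact_mean_field_m(t, tol=1e-14):
    """Nonzero root of m = tanh(m / t) for t < 1, found by bisection."""
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.tanh(mid / t) > mid:
            lo = mid  # still below the crossing point
        else:
            hi = mid
    return 0.5 * (lo + hi)


def expansion_m(t):
    """Eq. (4.22) in reduced units: sqrt(3) * t^(3/2) * (1/t - 1)^(1/2)."""
    return math.sqrt(3.0) * t ** 1.5 * math.sqrt(1.0 / t - 1.0)


def estimate_beta(t1=0.999, t2=0.9999):
    """Slope of ln m versus ln(1 - t) between two temperatures below Tc."""
    ratio_m = exact_mean_field_m(t1) / exact_mean_field_m(t2)
    return math.log(ratio_m) / math.log((1.0 - t1) / (1.0 - t2))
```

At t = 0.99 the two magnetization values differ by less than one percent, and estimate_beta()
should return a value close to the mean-field exponent 1/2.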
Specific heat: on the mean-field level, internal energy can be expressed via the product
of thermal averages of S, each of them equal to the magnetization,
$$
U = \langle \mathcal{H} \rangle = -J \sum_{\langle ij \rangle} \langle S_i \rangle \langle S_j \rangle
  = -\frac{NJz}{2} \left( \frac{M}{g\mu_B} \right)^2 .
$$
Above Tc, M = 0, and so is the internal energy. Below Tc, Eq. (4.22) can be used. Therefore,
U(T) is continuous at Tc, but its derivatives, including the specific heat, are not. By taking the
derivative of U, we arrive at some kind of a λ-type anomaly, because C_V = 0 at T > Tc. Below
Tc,
$$
C_V = \frac{\partial U}{\partial T} = -\frac{3}{2}\, NJz\, \frac{T}{T_c^3}\, (2T_c - 3T).
$$
At T = Tc, one recovers a finite value of C_V = (3/2) N kB.
Such a behavior is typical for a second-order phase transition, but it's not the true behavior
of a 2D Ising magnet, as we shall discuss below.
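The jump of the specific heat can be reproduced in a few lines. This sketch assumes reduced
units chosen for the example, u = U/(N zJ) and t = T/Tc, and it inherits the limitation of
Eq. (4.22): the expansion is only meaningful close to Tc, so the numbers far below Tc should
not be taken seriously.

```python
def reduced_energy(t):
    """u = U / (N z J) on the mean-field level, using Eq. (4.22) below Tc."""
    if t >= 1.0:
        return 0.0
    m_squared = 3.0 * t ** 3 * (1.0 / t - 1.0)  # (M / g mu_B)^2 from Eq. (4.22)
    return -0.5 * m_squared


def reduced_specific_heat(t):
    """C_V / (N k_B) = du/dt; zero above Tc, finite jump of 3/2 at Tc."""
    if t > 1.0:
        return 0.0
    return -1.5 * t * (2.0 - 3.0 * t)
```

A finite-difference derivative of reduced_energy just below t = 1 approaches 3/2, whereas the
value just above t = 1 is zero.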

4.4 Exact solution in 2D


Several steps preceded the exact solution of the Ising model in 2D. In 1936, Peierls
demonstrated that M ≠ 0 at T > 0, i.e., he confirmed the formation of magnetic order at a finite
temperature. His result was extended by Kramers and Wannier in 1941. They developed the
concept of dual lattice and obtained the exact transition temperature, kB Tc/J = 2/ln(1 + √2),
without calculating the partition function and thermodynamic properties. This last step was
accomplished by Onsager in 1944. His solution is too complex to be presented here, so we shall
restrict ourselves to several key aspects. First of all, there is indeed a transition. Second, the
magnetization follows the power law with β = 1/8 in the vicinity of the transition.
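For orientation, the Kramers-Wannier value can be evaluated and compared with the mean-field
estimate of Eq. (4.21) for the square lattice (z = 4). The function names in this small sketch
are chosen for the example only.

```python
import math


def square_lattice_tc():
    """Exact k_B Tc / J for the 2D square-lattice Ising model: 2 / ln(1 + sqrt(2))."""
    return 2.0 / math.log(1.0 + math.sqrt(2.0))


def mean_field_tc(z):
    """Mean-field estimate k_B Tc / J = z from Eq. (4.21)."""
    return float(z)


ratio = square_lattice_tc() / mean_field_tc(4)  # about 0.57
```

The ratio shows that the molecular-field theory overestimates Tc of the square lattice by
nearly a factor of two.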
The third and perhaps most important result is the specific heat in the vicinity of Tc,
$$
C_V \sim -\frac{N J^2}{2\pi k_B T_c} \, \ln \left| \frac{T - T_c}{T_c} \right| .
$$
It diverges logarithmically upon approaching the transition from either below or above. This
logarithmic divergence was at odds with Ehrenfest's classification of phase transitions into first
and second order. Those of the first order show discontinuity already in first derivatives of the
free energy, dF/dα (e.g., entropy or volume), whereas second-order transitions are characterized
by the continuous change of dF/dα and discontinuities in d²F/dα² (e.g., thermal expansion or
specific heat). Onsager's case seemed to be neither of the two, as the specific heat shows no jump
(it approaches +∞ from both below and above), but diverges at the transition. Later on, it
became clear that experimental cases deemed as second-order transitions within the Ehrenfest
scheme are in fact characterized by the logarithmic divergence (e.g., in thermal expansion
of helium at the λ-transition). Therefore, the original Ehrenfest scheme was supplanted by
the revised one, where phase transitions are classified into continuous and discontinuous
depending on the change in dF/dα at the transition.
Triangular case: the antiferromagnetic Ising model on the triangular lattice is very instructive
too. Its solution, reported by G. Wannier in 1950, delivers a state with large residual entropy:
Ising spins do not reach magnetic order in the triangular geometry, and part of them remain
fluctuating, which causes a large non-zero entropy at T = 0.

4.5 Beyond magnetism
Ising model is widely used beyond magnetism. A few representative cases will be described
below.
Lattice gas models: the distribution of particles on a lattice is conveniently described
by an Ising variable. Consider particles, such as atoms on the surface, that can occupy fixed
adsorption sites. The presence or absence of a particle at a given site is encoded by η = 1 and
η = 0, respectively. The total energy is then
$$
E = -\sum_{\langle ij \rangle} J_{ij}\, \eta_i \eta_j - \mu \sum_i \eta_i , \tag{4.23}
$$
where the first term is the energy gain/loss due to the occupation of neighboring lattice sites,
and the second term is the energy due to individual particles (adding a particle changes the
energy by −µ). The sign of the first term is chosen such that it agrees with E_0 and J' below.
Here, µ is the chemical potential that acts as the (longitudinal) magnetic field
of the original Ising model.
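The transformation to the Ising spins S_i = ±1 is straightforward, η_i = (1 + S_i)/2. Then,
$$
\sum_i \eta_i = \sum_i \frac{1 + S_i}{2} = \frac{N}{2} + \frac{1}{2} \sum_i S_i , \tag{4.24}
$$
$$
\sum_{\langle ij \rangle} \eta_i \eta_j = \sum_{\langle ij \rangle} \frac{1 + S_i}{2}\, \frac{1 + S_j}{2}
= \frac{1}{4} \sum_{\langle ij \rangle} S_i S_j + \frac{z}{4} \sum_i S_i + \frac{zN}{8} , \tag{4.25}
$$
where we used that the lattice contains zN/2 bonds and that each site belongs to z of them.
For a uniform coupling J_ij = J,
$$
E = E_0 - J' \sum_{\langle ij \rangle} S_i S_j - H \sum_i S_i
$$
with
$$
E_0 = -\frac{\mu N}{2} - \frac{zNJ}{8}, \qquad J' = \frac{J}{4}, \qquad H = \frac{\mu}{2} + \frac{zJ}{4} .
$$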

Binary alloys: the idea is similar to the above, but with a couple of amendments. Consider
the distribution of particles A and B on a lattice. Their total energy is
$$
E = -E_{AA} N_{AA} - E_{AB} N_{AB} - E_{BB} N_{BB} , \tag{4.26}
$$
where we count the A–A, A–B, and B–B pairs. The N_AA, N_AB, and N_BB parameters are related
to the total number of particles of each type, N_A and N_B, via
$$
2N_{AA} + N_{AB} = zN_A, \qquad 2N_{BB} + N_{AB} = zN_B, \qquad N_A + N_B = N,
$$
and z stands for the number of neighbors, as in Sec. 4.3 above. We can thus express
$$
N_{AB} = zN_A - 2N_{AA}, \qquad N_{BB} = N_{AA} - zN_A + zN/2,
$$
and
$$
E = -N_{AA} (E_{AA} + E_{BB} - 2E_{AB}) - zN_A (E_{AB} - E_{BB}) - \frac{zN}{2} E_{BB} .
$$
The Ising variable η_i equals 1 for atom A and 0 for atom B. Using
$$
N_A = \sum_i \eta_i , \qquad N_{AA} = \sum_{\langle ij \rangle} \eta_i \eta_j ,
$$
we arrive at the standard form of the Ising Hamiltonian,
$$
E = E_0 - J \sum_{\langle ij \rangle} \eta_i \eta_j - H \sum_i \eta_i ,
$$
with
$$
J = E_{AA} + E_{BB} - 2E_{AB}, \qquad H = z (E_{AB} - E_{BB}), \qquad E_0 = -\frac{zN}{2} E_{BB} ,
$$
and the remaining transformations repeat Eqs. (4.24) and (4.25).
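The pair-counting argument can be tested on a small example. The sketch below assumes a
one-dimensional ring (z = 2) with a random arrangement of A and B atoms; the function names
and the choice of lattice are illustrative only. It evaluates the energy once from the pair
counts of Eq. (4.26) and once from the Ising-like form with J, H, and E_0 given above.

```python
import random


def pair_counts(eta):
    """Count A-A, A-B, and B-B nearest-neighbor pairs on a ring (eta = 1 for A, 0 for B)."""
    n = len(eta)
    n_aa = n_ab = n_bb = 0
    for i in range(n):
        left, right = eta[i], eta[(i + 1) % n]
        if left == 1 and right == 1:
            n_aa += 1
        elif left == 0 and right == 0:
            n_bb += 1
        else:
            n_ab += 1
    return n_aa, n_ab, n_bb


def energy_from_pairs(eta, e_aa, e_ab, e_bb):
    """Eq. (4.26): E = -E_AA N_AA - E_AB N_AB - E_BB N_BB."""
    n_aa, n_ab, n_bb = pair_counts(eta)
    return -e_aa * n_aa - e_ab * n_ab - e_bb * n_bb


def energy_ising_form(eta, e_aa, e_ab, e_bb, z=2):
    """E = E0 - J sum(eta_i eta_j) - H sum(eta_i) on the same ring."""
    n = len(eta)
    coupling = e_aa + e_bb - 2.0 * e_ab
    field = z * (e_ab - e_bb)
    e_zero = -0.5 * z * n * e_bb
    n_a = sum(eta)
    n_aa = sum(eta[i] * eta[(i + 1) % n] for i in range(n))
    return e_zero - coupling * n_aa - field * n_a
```

Both functions should return the same number for any configuration, e.g. for
eta = [random.randint(0, 1) for _ in range(20)].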

4.6 Monte-Carlo simulations


Exact solution of the Ising model is usually impossible, and numerical techniques have to be
used. In Monte-Carlo simulations, the behavior of the system is sampled by a small number
of representative configurations. One starts with a random configuration and initiates a guided
walk. The result is improved as the number of visited states increases.
Monte-Carlo techniques exist in many varieties that differ in the way the configurations
are sampled. The very first Monte-Carlo algorithm is due to Metropolis. The Metropolis
algorithm runs on the following principle. A random configuration is chosen, and one spin
is flipped with the associated change of δE in energy. For δE ≤ 0 the new configuration is
accepted. For δE > 0 one generates a random number 0 < R ≤ 1. If R ≤ e^{−δE/(kB T)}, the new
configuration is accepted. If R > e^{−δE/(kB T)}, the new configuration is rejected. In either
case, the next step proceeds with another randomly chosen spin.
By repeating this loop, one reaches a reasonable description of the real probability distribu-
tion and estimates any thermodynamic parameter of interest. The accuracy of the Monte-Carlo
result increases with the number of steps. Lower temperatures entail a sharper probability dis-
tribution and require a larger number of steps. This way, the Monte-Carlo simulations become
increasingly more difficult as temperature is decreased.
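A minimal Metropolis loop may look as follows. This is a sketch under assumptions made for
the example, not a production code: a periodic Ising chain in zero field with S_i = ±1,
coupling expressed through K = J/(kB T), random single-spin updates, and the first fifth of
the sweeps discarded as equilibration. The function returns the average energy per spin in
units of J.

```python
import math
import random


def metropolis_chain(n_spins, K, n_sweeps, seed=0):
    """Metropolis estimate of <E> / (N J) for a periodic Ising chain at H = 0."""
    rng = random.Random(seed)
    spins = [rng.choice((1, -1)) for _ in range(n_spins)]  # random starting configuration
    energies = []
    for sweep in range(n_sweeps):
        for _ in range(n_spins):
            i = rng.randrange(n_spins)
            neighbors = spins[i - 1] + spins[(i + 1) % n_spins]
            delta = 2.0 * K * spins[i] * neighbors  # delta E / (k_B T) for flipping spin i
            r = 1.0 - rng.random()                  # random number in (0, 1]
            if delta <= 0.0 or r <= math.exp(-delta):
                spins[i] = -spins[i]                # accept the flip
        if sweep >= n_sweeps // 5:                  # skip the equilibration stage
            bonds = sum(spins[j] * spins[(j + 1) % n_spins] for j in range(n_spins))
            energies.append(-bonds / n_spins)
    return sum(energies) / len(energies)
```

For a long chain the result can be compared with the exact zero-field value −tanh K per spin;
for example, metropolis_chain(50, 0.5, 2000) should come out close to −tanh(0.5) within the
statistical error.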

5 Heisenberg model
5.1 Mean-field theory for ferromagnets
We shall repeat the analysis of Sec. 4.3, but now for Heisenberg spins. The magnetization is
an average of the single spin, M = gµB ⟨S_i⟩, whereas the scalar product Ŝ_i Ŝ_j is replaced by
S_i ⟨S_j⟩ = S_i M/(gµB). Therefore, the Hamiltonian
$$
\mathcal{H} = -J \sum_{\langle ij \rangle} \hat{S}_i \hat{S}_j - g\mu_B H \sum_i \hat{S}_i
$$
becomes
$$
\mathcal{H}_{MF} = -\left( \frac{zJM}{g\mu_B} + g\mu_B H \right) S_i = -g\mu_B H_{\rm eff} S_i , \tag{5.1}
$$
with the effective (mean) field
$$
H_{\rm eff} = \frac{zJM}{(g\mu_B)^2} + H. \tag{5.2}
$$
The first term here is the exchange field, or Weiss molecular field, and z stands for the
number of neighbors.
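Magnetization: the solution of Eq. (5.1) resembles the one in atomic magnetism. The
direction of the effective field defines the quantization axis, and the partition function is the
summation over the allowed values of S_i^z,
$$
Z = \sum_{S_i^z = -S}^{S} e^{-\alpha S_i^z} = e^{\alpha S} \sum_{n=0}^{2S} e^{-\alpha n}
= e^{\alpha S} \, \frac{1 - e^{-\alpha (2S+1)}}{1 - e^{-\alpha}}
= \frac{e^{\alpha (S + \frac12)} - e^{-\alpha (S + \frac12)}}{e^{\alpha/2} - e^{-\alpha/2}}
= \frac{\sinh \alpha (S + \frac12)}{\sinh \alpha/2} ,
$$
where we used α = gµB H_eff/(kB T). The magnetization is
$$
M = k_B T \, \frac{\partial \ln Z}{\partial H} = \frac{k_B T}{Z} \frac{\partial Z}{\partial H}
= g\mu_B \left[ \left( S + \tfrac12 \right) \coth \left( \alpha (S + \tfrac12) \right) - \tfrac12 \coth \frac{\alpha}{2} \right]
= g\mu_B S \times B_S(\alpha S)
$$
with the Brillouin function B_S(x) defined as
$$
B_S(x) = \frac{2S + 1}{2S} \coth \left( \frac{2S + 1}{2S}\, x \right) - \frac{1}{2S} \coth \frac{x}{2S} . \tag{5.3}
$$
At x → ∞, coth x ≃ 1, and B_S(x) ≃ 1. Then, M = gµB S, i.e., the maximum (saturation)
magnetization is reached. However, in the present context we shall be more interested in the
x → 0 limit, where
$$
\coth x \simeq \frac{1}{x} + \frac{x}{3}, \quad \text{and} \quad B_S(x) \simeq \frac{S + 1}{3S}\, x. \tag{5.4}
$$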
Transition temperature: similar to Eq. (4.20), we get an equation for the magnetization
$$
M = g\mu_B S \times B_S \left[ \frac{g\mu_B S}{k_B T} \left( \frac{zJM}{(g\mu_B)^2} + H \right) \right]. \tag{5.5}
$$
At H = 0, the M ≠ 0 solutions to this equation exist when the derivative of the right-hand
side at M → 0 is greater than 1. The calculation becomes quite simple if we use the low-x
limit of Eq. (5.4) right away,
$$
\frac{d}{dM} \left[ g\mu_B S \times B_S(\alpha S) \right]_{M \to 0} \geq 1
\quad \Rightarrow \quad \frac{zJS(S + 1)}{3 k_B T} \geq 1.
$$
This leads to the mean-field estimate of the Curie temperature
$$
k_B T_C = zJS(S + 1)/3. \tag{5.6}
$$

High-temperature limit: using Eq. (5.4) in Eq. (5.5), we get a simple linear equation for
the magnetization,
$$
M = \frac{(g\mu_B)^2}{3 k_B T}\, S(S + 1) \left( \frac{zJM}{(g\mu_B)^2} + H \right),
$$
that yields
$$
\chi = M/H = \frac{(g\mu_B)^2 S(S + 1)}{3 k_B} \, \frac{1}{T - \theta}
           = \frac{(g\mu_B)^2 S(S + 1)}{3 k_B} \, \frac{1}{T - T_C} , \tag{5.7}
$$
the Curie-Weiss law for ferromagnets. Note that the form with θ is preferable, because it
is exact in the high-temperature limit, whereas the mean-field expression for T_C is not accurate
at all. The Curie-Weiss temperature θ is directly related to the exchange interactions in
the system,
$$
\theta = \frac{S(S + 1)}{3}\, zJ = \frac{S(S + 1)}{3} \sum_j J_{ij} , \tag{5.8}
$$
where in the general case one does the summation of J_ij at a given lattice site.
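The Curie-Weiss law can be compared with the full self-consistent solution. The sketch below
assumes reduced variables introduced only for this example: m = M/(gµB S), t = T/T_C with
kB T_C = zJS(S + 1)/3, and b = gµB S H/(kB T_C). In these units the argument of the Brillouin
function in Eq. (5.5) becomes [3S m/(S + 1) + b]/t, and the Curie-Weiss law reads
m = (S + 1) b/[3S (t − 1)].

```python
import math


def brillouin(S, x):
    """Brillouin function B_S(x) of Eq. (5.3); x must be nonzero."""
    a = (2.0 * S + 1.0) / (2.0 * S)
    b = 1.0 / (2.0 * S)
    return a / math.tanh(a * x) - b / math.tanh(b * x)


def self_consistent_m(S, t, b_field, n_iter=500):
    """Iterate Eq. (5.5) in reduced units; intended for t > 1 and small b_field > 0."""
    m = 1e-3
    for _ in range(n_iter):
        x = (3.0 * S / (S + 1.0) * m + b_field) / t
        m = brillouin(S, x)
    return m


def curie_weiss_m(S, t, b_field):
    """Linear-response magnetization from the Curie-Weiss law, Eq. (5.7)."""
    return (S + 1.0) / (3.0 * S) * b_field / (t - 1.0)
```

In a weak field well above T_C, e.g. S = 1, t = 2, b = 1e-4, the two functions should agree
closely; the agreement deteriorates in stronger fields and near t = 1, where the linear
approximation breaks down.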
Low-temperature limit: at T → 0, the argument of the Brillouin function becomes
infinitely large. It can be approximated using
$$
\coth x = \frac{e^{x} + e^{-x}}{e^{x} - e^{-x}} = \frac{1 + e^{-2x}}{1 - e^{-2x}}
= \frac{2}{1 - e^{-2x}} - 1 \simeq 1 + 2e^{-2x} .
$$
To a first approximation, the e^{−2x} part should be kept in the second term of Eq. (5.3) only.
Then,
$$
B_S(x) \simeq 1 - \frac{1}{S}\, e^{-x/S} . \tag{5.9}
$$
At H = 0,
$$
M \simeq g\mu_B S \left( 1 - \frac{1}{S}\, e^{-zJS/k_B T} \right),
$$
where we used M ≃ gµB S in the exponent.


The departure of M from its maximum value of gµB S is due to excitations. Here, the
exponential term reflects activated (gapped) nature of magnetic excitations in the mean-field
approximation. It would be correct in the Ising case, where excitations are spin flips and require
a finite energy of 2J, but does not hold in the Heisenberg case, where excitations are gapless.

5.2 Spin-wave theory for ferromagnets


So far our results for the Heisenberg model were not very different from the results for the
Ising model in Sec. 4.3. In fact, the mean-field approximation makes no difference between
the two, except for the statistics, which leads to the factor of S(S + 1) in TC and θ. The low-
temperature magnetization is qualitatively wrong in the mean-field approximation, because the
true nature of spin excitations could not be properly captured. The spin-wave theory, which
we shall introduce here, serves as a much better approximation.

5.2.1 Spin waves


We start with a simplistic version of the spin-wave theory and elucidate the nature of
low-energy excitations in Heisenberg ferromagnets. According to Sec. 3.2, the ground state of a
Heisenberg ferromagnet is a ferromagnetic ("all-up") configuration, which we denote by |0⟩.
The Ŝ_r^− operator changes S_r^z from S to S − 1 on the site r, so we can define a new state
$$
|r\rangle = \frac{1}{\sqrt{2S}}\, \hat{S}_r^- |0\rangle ,
$$
where the pre-factor of 1/√(2S) is needed because Ŝ^−|S⟩ = √(2S) |S − 1⟩, see Eq. (A.13).
Moreover,
$$
\hat{S}_{r'}^- \hat{S}_r^+ |r\rangle = 2S\, |r'\rangle ,
$$
which is the action of moving the "reduced" spin from site r to site r' (here,
Ŝ^+|S − 1⟩ = √(2S) |S⟩). Finally,
$$
S_{r'}^z |r\rangle = S\, |r\rangle \;\; (r' \neq r), \qquad S_{r'}^z |r\rangle = (S - 1)\, |r\rangle \;\; (r' = r).
$$

These relations define the action of the Heisenberg Hamiltonian,
$$
\mathcal{H} = -\sum_{\langle pq \rangle} J(p - q)\, \hat{S}_p \hat{S}_q
= -\sum_{\langle pq \rangle} J(p - q) \left[ \hat{S}_p^z \hat{S}_q^z + \tfrac12 \left( \hat{S}_p^+ \hat{S}_q^- + \hat{S}_p^- \hat{S}_q^+ \right) \right], \tag{5.10}
$$
on the state |r⟩,
$$
\mathcal{H} |r\rangle = E_0 |r\rangle + S \sum_p J(r - p) \left( |r\rangle - |p\rangle \right), \tag{5.11}
$$
where E_0 = −(N S²/2) Σ_p J(p) is the ground-state energy (see footnote 4 below). To understand
this result, consider the following. In the ferromagnetic state |0⟩, only Ŝ_p^z Ŝ_q^z contributes
to the total energy, yielding −JS² for each lattice bond. In the state |r⟩, the Ŝ_p^z Ŝ_q^z part of
the Hamiltonian yields the same −JS² when p, q ≠ r and −JS(S − 1) when p = r or q = r. Because
the summation is done over lattice bonds, ⟨pq⟩, we can choose r as the "central" site and deduce that
$$
-\sum_{\langle pq \rangle} J(p - q)\, \hat{S}_p^z \hat{S}_q^z |r\rangle = \left[ E_0 + S \sum_p J(r - p) \right] |r\rangle ,
$$
where S is the difference between S² and S(S − 1). On the other hand, Ŝ_p^− Ŝ_q^+ yields 2S|p⟩
when q = r, which elucidates the rest of Eq. (5.11).
The main result at this point is that |r⟩ is not an eigenstate of the Hamiltonian, but H|r⟩
represents a linear combination of similar states |p⟩, where spin is reduced from S to S − 1 on
one of the lattice sites. The true eigenstate is a Fourier transform of |r⟩,
$$
|k\rangle = \frac{1}{\sqrt{N}} \sum_r e^{ikr} |r\rangle .
$$
Then,
$$
\mathcal{H} |k\rangle = E_0 |k\rangle + \frac{S}{\sqrt{N}} \sum_{r,p} J(r - p)\, e^{ikr} |r\rangle
- \frac{S}{\sqrt{N}} \sum_{r,p} J(r - p)\, e^{ikr} |p\rangle .
$$
By introducing p' = r − p, we obtain
$$
\mathcal{H} |k\rangle = E_0 |k\rangle
+ S \left( \sum_{p'} J(p') \right) \left( \frac{1}{\sqrt{N}} \sum_r e^{ikr} |r\rangle \right)
- S \left( \sum_{p'} J(p')\, e^{ikp'} \right) \left( \frac{1}{\sqrt{N}} \sum_p e^{ikp} |p\rangle \right)
= E_0 |k\rangle + S \sum_{p'} J(p') \left( 1 - e^{ikp'} \right) |k\rangle = E_k |k\rangle . \tag{5.12}
$$
This way, |k⟩ is an eigenstate of the Hamiltonian with the energy E_k.

Footnote 4: The expression for E_0 can be clarified as follows,
$$
\mathcal{H} |0\rangle = -\sum_{\langle pq \rangle} J(p - q)\, \hat{S}_p^z \hat{S}_q^z |0\rangle
= -\tfrac12 \sum_{p,q} J(p - q)\, \hat{S}_p^z \hat{S}_q^z |0\rangle
= -\tfrac12 S^2 |0\rangle \sum_p \sum_{p'} J(p')
= -\frac{N S^2}{2} \sum_{p'} J(p')\, |0\rangle ,
$$
where we introduced p' = p − q, and the summation over p yields N.


Let's analyze what |k⟩ looks like. First, as a linear combination of |r⟩ it can be characterized
by the spin reduction from S to S − 1, but this reduction is now non-local and distributed over
the whole lattice. Second, the product of |k⟩ and |r⟩ yields |⟨k|r⟩|² = 1/N, so the spin reduction
is shared between all lattice sites evenly. Finally, we can take a look at the transverse spin
component in the state |k⟩. To this end, note that
$$
\hat{S}_q^+ |k\rangle = \frac{1}{\sqrt{N}} \sum_r e^{ikr}\, \hat{S}_q^+ |r\rangle
= \sqrt{\frac{2S}{N}} \sum_r e^{ikr}\, \delta_{rq} |0\rangle = \sqrt{\frac{2S}{N}}\, e^{ikq} |0\rangle .
$$
Therefore,
$$
\langle k | \hat{S}_p^- \hat{S}_q^+ | k \rangle = \frac{2S}{N}\, e^{ik(q - p)} , \tag{5.13}
$$
where ⟨k|Ŝ_p^− is the Hermitian conjugate of Ŝ_p^+|k⟩ and thus equivalent to the above.
Eq. (5.13) describes the wave-like propagation of the transverse spin component. In other
words, the excitation of the Heisenberg ferromagnet entails the reduction in S^z and the wave-like
propagation of S^x and S^y, hence the name spin wave.
Lastly, we note that lattice symmetry requires J(r) = J(−r). Then, the spin-wave energy
E_k − E_0 of Eq. (5.12) can be simplified to
$$
E_k - E_0 = S \sum_r J(r) \left( 1 - e^{ikr} \right) = S \sum_r J(r) (1 - \cos kr)
= 2S \sum_r J(r) \sin^2 \frac{kr}{2} \tag{5.14}
$$
(here, we omit the factor of 2 in front of the cos kr, because the summation is still done over
all possible r, i.e., both r and −r are included). At small k, sin kr ≃ kr, and E_k − E_0 ∼ (kr)²,
the spin-wave dispersion is quadratic in k. The spin-wave excitations are gapless, as
E_k − E_0 → 0 at k → 0.
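The eigenstate property of |k⟩ can be verified explicitly on a small system. The sketch below
assumes a periodic chain with nearest-neighbor coupling J and unit lattice spacing, which is a
choice made for this example. It applies the operator H − E_0 of Eq. (5.11) to the amplitudes
e^{ikr}/√N and compares the result with the dispersion of Eq. (5.14), which for this chain
becomes E_k − E_0 = 4SJ sin²(k/2).

```python
import cmath
import math


def plane_wave(k, n_sites):
    """Amplitudes <r|k> = exp(i k r) / sqrt(N) of the one-magnon state |k>."""
    return [cmath.exp(1j * k * r) / math.sqrt(n_sites) for r in range(n_sites)]


def apply_hamiltonian(amplitudes, J, S):
    """Apply H - E0 of Eq. (5.11) to a one-magnon state on a nearest-neighbor ring."""
    n = len(amplitudes)
    result = [0j] * n
    for r, c in enumerate(amplitudes):
        for p in ((r + 1) % n, (r - 1) % n):
            result[r] += S * J * c  # diagonal part: +S J |r> for each neighbor
            result[p] -= S * J * c  # transfer part: -S J |p>
    return result


def spin_wave_energy(k, J, S):
    """E_k - E_0 from Eq. (5.14) for the nearest-neighbor chain."""
    return 4.0 * S * J * math.sin(0.5 * k) ** 2
```

With k = 2πm/N (m integer), apply_hamiltonian(plane_wave(k, N), J, S) should equal
spin_wave_energy(k, J, S) times the input amplitudes, site by site.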
Anisotropic magnets: the spin-wave dispersion can be easily generalized to the case of
the XXZ model defined by
$$
\mathcal{H} = -\sum_{\langle pq \rangle} \left[ J^z(p - q)\, \hat{S}_p^z \hat{S}_q^z
+ \tfrac12 J^{xy}(p - q) \left( \hat{S}_p^+ \hat{S}_q^- + \hat{S}_p^- \hat{S}_q^+ \right) \right], \tag{5.15}
$$
where J^z > J^{xy}. In Eq. (5.12), J^z gives rise to the first term with J(p'), whereas the second
term with J(p') e^{ikp'} ensues from J^{xy}. Altogether,
$$
E_k = E_0 + S \sum_r J^z(r) - S \sum_r J^{xy}(r)\, e^{ikr} . \tag{5.16}
$$
The nature of excitations does not change, but they become gapped, as the lowest excitation
energy is now equal to S Σ_r [J^z(r) − J^{xy}(r)], i.e., the gap is proportional to the anisotropy.
This manifests one case of the general Goldstone theorem, which can be formulated as
follows: upon any transition that breaks continuous symmetry in a system with sufficiently
short-range interactions, gapless excitations appear. A smarter and fancier way of making the
same statement would be through the appearance of massless bosons, which are quasiparticles
responsible for gapless excitations. The Heisenberg model is symmetric with respect to spin
rotations, but this symmetry is broken in the ordered state, because spins become dependent
on each other and lose this freedom to rotate. Therefore, gapless spin-wave excitations must
occur. In contrast, the XXZ model lacks the symmetry with respect to spin rotations, there is
nothing to lose, and no gapless modes appear in the ordered state.

5.2.2 Bosonic representation


A more general description of spin-wave excitations can be obtained using second quantization.
If you seriously intend to understand the rest of this section, please, get familiar with App. D
first.
We have seen that the spin-wave excitation corresponds to the change of S^z by −1, so it
can be interpreted as a bosonic quasiparticle. Naively, one would define Ŝ^− = â† and Ŝ^+ = â,
as the former creates an excitation, and the latter annihilates it. It's a bad starting point,
though, because the commutation relation [â, â†] = 1 of Eq. (D.8) should hold, but instead
[Ŝ^+, Ŝ^−] = 2Ŝ^z per Eq. (A.7). So a different definition of Ŝ^+ and Ŝ^− in terms of bosonic
operators is needed. Several definitions are possible (Sec. 5.5). We start with the one that is
most suitable for a systematic derivation of the spin-wave theory.
Holstein-Primakoff transformation: introduces bosonic operators in the following form,
$$
\hat{S}_r^+ = \sqrt{2S} \left( 1 - \frac{\hat{a}_r^\dagger \hat{a}_r}{2S} \right)^{1/2} \hat{a}_r , \qquad
\hat{S}_r^- = \sqrt{2S}\, \hat{a}_r^\dagger \left( 1 - \frac{\hat{a}_r^\dagger \hat{a}_r}{2S} \right)^{1/2} , \qquad
\hat{S}_r^z = S - \hat{a}_r^\dagger \hat{a}_r . \tag{5.17}
$$
As counter-intuitive as it seems, this definition ensures a direct link between spin operators and
bosonic operators of magnetic excitations. For now we drop the site index and consider a single
spin operator with the eigenstates |m⟩, where m is the projection of S onto the quantization
axis. Alternatively, we can label these eigenstates with |n⟩, where n = S − m, i.e., n is the
deviation of S^z from its maximum value of S.
According to Eq. (A.12),
$$
\hat{S}^+ |m\rangle = \sqrt{S(S + 1) - m(m + 1)}\; |m + 1\rangle = \sqrt{(S - m)(S + m + 1)}\; |m + 1\rangle ,
$$
hence
$$
\hat{S}^+ |n\rangle = \sqrt{n (2S - n + 1)}\; |n - 1\rangle .
$$
On the other hand, the action of Ŝ^+ written via bosonic operators is
$$
\sqrt{2S} \left( 1 - \frac{\hat{a}^\dagger \hat{a}}{2S} \right)^{1/2} \hat{a} |n\rangle
= \sqrt{2S - \hat{n}}\; \hat{a} |n\rangle = \sqrt{2S - \hat{n}}\, \sqrt{n}\; |n - 1\rangle
= \sqrt{n (2S - (n - 1))}\; |n - 1\rangle ,
$$
which is identical to the above (we used â|n⟩ = √n |n − 1⟩ per Eq. (D.3)).
Likewise, Eq. (A.13) defines the action of Ŝ^−,
$$
\hat{S}^- |m\rangle = \sqrt{S(S + 1) - m(m - 1)}\; |m - 1\rangle = \sqrt{(S + m)(S - m + 1)}\; |m - 1\rangle ,
\qquad
\hat{S}^- |n\rangle = \sqrt{(2S - n)(n + 1)}\; |n + 1\rangle .
$$
This should be compared to
$$
\hat{a}^\dagger \sqrt{2S - \hat{n}}\; |n\rangle = \hat{a}^\dagger \sqrt{2S - n}\; |n\rangle = \sqrt{(2S - n)(n + 1)}\; |n + 1\rangle .
$$
Alternatively, one can use the fact that Ŝ^− = (Ŝ^+)†, which immediately explains the relation
between the bosonic forms of Ŝ^+ and Ŝ^−, as well as the fact that â† stands in front in the
definition of Ŝ^−.
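Commutation relation: we can verify that [Ŝ^+, Ŝ^−] = 2Ŝ^z holds. Let's write Ŝ^+ = Â â
and Ŝ^− = â† Â with Â = √(2S − n̂). Then,
$$
[\hat{S}^+, \hat{S}^-] = [\hat{A} \hat{a}, \hat{a}^\dagger \hat{A}]
= \hat{A} \hat{a} \hat{a}^\dagger \hat{A} - \hat{a}^\dagger \hat{A} \hat{A} \hat{a}
= \hat{A}^2 + \hat{A}^2 \hat{a}^\dagger \hat{a} - \hat{a}^\dagger \hat{A}^2 \hat{a}
= \hat{A}^2 + [\hat{A}^2, \hat{a}^\dagger]\, \hat{a} ,
$$
where in the first term we used â â† = 1 + â† â = 1 + n̂ and [Â, n̂] = 0 (since Â contains n̂ only).
We should now compute
$$
[\hat{A}^2, \hat{a}^\dagger] = [2S - \hat{n}, \hat{a}^\dagger] = -[\hat{n}, \hat{a}^\dagger] = -\hat{a}^\dagger .
$$
This way,
$$
[\hat{S}^+, \hat{S}^-] = 2S - \hat{a}^\dagger \hat{a} - \hat{a}^\dagger \hat{a} = 2 (S - \hat{a}^\dagger \hat{a}) = 2 \hat{S}^z ,
$$
as expected.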
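The same check can be done with explicit matrices. The sketch below builds Ŝ^+, Ŝ^−, and Ŝ^z
from Eq. (5.17) in the basis |n⟩, n = 0, ..., 2S; the restriction to this physical subspace,
the plain-list matrix representation, and the function names are choices made for the example.

```python
import math


def holstein_primakoff_matrices(S):
    """Matrices <n'|S+|n>, <n'|S-|n>, <n'|Sz|n> for n = 0..2S, built from Eq. (5.17)."""
    dim = int(round(2 * S)) + 1
    s_plus = [[0.0] * dim for _ in range(dim)]
    s_z = [[0.0] * dim for _ in range(dim)]
    for n in range(dim):
        s_z[n][n] = S - n
        if n > 0:
            # a|n> = sqrt(n)|n-1>, then sqrt(2S - n_hat) acts on |n-1>
            s_plus[n - 1][n] = math.sqrt(2 * S - (n - 1)) * math.sqrt(n)
    s_minus = [[s_plus[j][i] for j in range(dim)] for i in range(dim)]  # Hermitian conjugate
    return s_plus, s_minus, s_z


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]


def commutator(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    dim = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(dim)] for i in range(dim)]
```

For any spin value, commutator(s_plus, s_minus) should coincide with 2 s_z. The state
n = 2S is annihilated by Ŝ^−, so the physical subspace is closed and no truncation error arises.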
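Fourier transforms will be useful. We define
$$
\hat{a}_k = \frac{1}{\sqrt{N}} \sum_r e^{ikr}\, \hat{a}_r , \qquad
\hat{a}_k^\dagger = \frac{1}{\sqrt{N}} \sum_r e^{-ikr}\, \hat{a}_r^\dagger , \tag{5.18}
$$
and, likewise,
$$
\hat{a}_r = \frac{1}{\sqrt{N}} \sum_k e^{-ikr}\, \hat{a}_k , \qquad
\hat{a}_r^\dagger = \frac{1}{\sqrt{N}} \sum_k e^{ikr}\, \hat{a}_k^\dagger \tag{5.19}
$$
(here the summation over r runs over all N lattice sites, whereas the summation over k runs
over N associated points in the first Brillouin zone).
The commutation relation of Eq. (D.8) becomes
$$
[\hat{a}_k, \hat{a}_{k'}^\dagger] = \frac{1}{N} \sum_{p,q} e^{i(kp - k'q)}\, [\hat{a}_p, \hat{a}_q^\dagger]
= \frac{1}{N} \sum_p e^{i(k - k')p} = \delta_{kk'} , \tag{5.20}
$$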

where we used the standard summation rule for periodic lattices,
$$
\sum_r e^{ikr} = N \delta_{k,0} . \tag{5.21}
$$
It is verified by a simple argument that all r in this summation can be shifted by an arbitrary
lattice vector r_0 (r' = r + r_0), which should not change the result. Therefore,
$$
A = \sum_r e^{ikr'} = e^{ikr_0} \sum_r e^{ikr} = e^{ikr_0} A.
$$
This equation holds if e^{ikr_0} = 1, or A = 0. The former condition is satisfied by any
reciprocal-lattice vector, but only one such vector, k = 0, is available in the first Brillouin
zone. Therefore, as long as we restrict ourselves to the first Brillouin zone, the sum in
Eq. (5.21) should be N × 1 = N for k = 0 and 0 otherwise.
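The summation rule is readily illustrated for a periodic chain. In this sketch, which assumes
a ring of N sites with unit spacing, the allowed wave vectors are k = 2πm/N with integer m.

```python
import cmath
import math


def lattice_sum(m, n_sites):
    """Sum of exp(i k r) over a ring of n_sites, with k = 2 pi m / n_sites."""
    k = 2.0 * math.pi * m / n_sites
    return sum(cmath.exp(1j * k * r) for r in range(n_sites))
```

lattice_sum(0, N) returns N, whereas any m that is not a multiple of N gives zero up to
rounding errors; m = N corresponds to a reciprocal-lattice vector and gives N again.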

5.2.3 Linear spin-wave theory


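Representation of spin operators: we shall now write down the full Heisenberg Hamiltonian
using bosonic operators. A useful approximation in this case is
$$
\frac{\langle \hat{a}_r^\dagger \hat{a}_r \rangle}{2S} \ll 1, \tag{5.22}
$$
which basically tells us that the number of excitations measured by â_r† â_r should be small (i.e.,
temperature should be low). Then, Eq. (5.17) can be re-written as
$$
\hat{S}_r^+ = \sqrt{2S} \left( 1 - \frac{\hat{a}_r^\dagger \hat{a}_r}{2S} \right)^{1/2} \hat{a}_r
= \sqrt{2S} \left[ \hat{a}_r - \frac{1}{4S}\, \hat{a}_r^\dagger \hat{a}_r \hat{a}_r + \dots \right]
= \sqrt{\frac{2S}{N}} \left( \sum_k e^{-ikr}\, \hat{a}_k
- \frac{1}{4SN} \sum_{k_1, k_2, k_3} e^{i(k_1 - k_2 - k_3) r}\, \hat{a}_{k_1}^\dagger \hat{a}_{k_2} \hat{a}_{k_3} + \dots \right). \tag{5.23}
$$
Likewise,
$$
\hat{S}_r^- = \sqrt{\frac{2S}{N}} \left( \sum_k e^{ikr}\, \hat{a}_k^\dagger
- \frac{1}{4SN} \sum_{k_1, k_2, k_3} e^{i(k_1 + k_2 - k_3) r}\, \hat{a}_{k_1}^\dagger \hat{a}_{k_2}^\dagger \hat{a}_{k_3} + \dots \right), \tag{5.24}
$$
whereas Ŝ_r^z takes a simpler form,
$$
\hat{S}_r^z = S - \frac{1}{N} \sum_{k_1, k_2} e^{i(k_1 - k_2) r}\, \hat{a}_{k_1}^\dagger \hat{a}_{k_2} . \tag{5.25}
$$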

Linear approximation: we shall now plug these lengthy expressions into the Heisenberg
Hamiltonian, Eq. (5.10), and distinguish between the three terms that ensue,
$$
\mathcal{H} = -\sum_{\langle pq \rangle} J(p - q) \left[ \hat{S}_p^z \hat{S}_q^z
+ \tfrac12 \left( \hat{S}_p^+ \hat{S}_q^- + \hat{S}_p^- \hat{S}_q^+ \right) \right] = E_0 + \mathcal{H}_0 + \mathcal{H}_1 . \tag{5.26}
$$
Here, E_0 is the ground-state energy originating from the first part of Ŝ_r^z. It's no different from
E_0 in Eq. (5.11). H_0 accumulates all terms that are quadratic in â_k† and â_k (i.e., linear in the
number of excitations), whereas H_1 includes higher-order terms. Within the linear spin-wave
theory, one neglects H_1 and considers H_0 only (hence the name "linear"). So we do.
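Individual terms: the H_0 part includes the terms arising from Ŝ_p^z Ŝ_q^z, Ŝ_p^+ Ŝ_q^−, and Ŝ_p^− Ŝ_q^+,
$$
\mathcal{H}_0 = \mathcal{H}_0^{zz} + \mathcal{H}_0^{+-} + \mathcal{H}_0^{-+} .
$$
Let's analyze H_0^{zz} first. We shall transform the summation over lattice bonds ⟨pq⟩ into the
independent summation over sites p and q, with the pre-factor 1/2 to avoid double-counting.
Then,
$$
\mathcal{H}_0^{zz} = \frac12 \frac{S}{N} \sum_{k_1, k_2} \sum_{p,q} J(p - q) \left[ e^{i(k_1 - k_2) p} + e^{i(k_1 - k_2) q} \right] \hat{a}_{k_1}^\dagger \hat{a}_{k_2}
= \frac{S}{2N} \sum_{k_1, k_2} \left[ \sum_p e^{i(k_1 - k_2) p} \right] \left[ \sum_{p'} J(p') \left( 1 + e^{-i(k_1 - k_2) p'} \right) \right] \hat{a}_{k_1}^\dagger \hat{a}_{k_2}
= S \left( \sum_r J(r) \right) \sum_k \hat{a}_k^\dagger \hat{a}_k ,
$$
where p' = p − q, as usual, and we used
$$
\sum_p e^{i(k_1 - k_2) p} = N \delta_{k_1 k_2}
$$
per Eq. (5.21).
Using the same idea for H_0^{+−} and H_0^{−+}, one arrives at
$$
\mathcal{H}_0^{+-} = -\frac14 \frac{2S}{N} \sum_{k_1, k_2} \sum_{p,q} J(p - q) \left( e^{-ik_1 p}\, e^{ik_2 q} \right) \hat{a}_{k_1} \hat{a}_{k_2}^\dagger
= -\frac{S}{2N} \sum_{k_1, k_2} \left[ \sum_p e^{-i(k_1 - k_2) p} \right] \left[ \sum_{p'} J(-p')\, e^{-ik_2 p'} \right] \hat{a}_{k_1} \hat{a}_{k_2}^\dagger
= -\frac{S}{2} \sum_k \left( \sum_r J(r)\, e^{ikr} \right) \hat{a}_k \hat{a}_k^\dagger
$$
and
$$
\mathcal{H}_0^{-+} = -\frac{S}{2N} \sum_{k_1, k_2} \sum_{p,q} J(p - q) \left( e^{ik_1 p}\, e^{-ik_2 q} \right) \hat{a}_{k_1}^\dagger \hat{a}_{k_2}
= -\frac{S}{2} \sum_k \left( \sum_r J(r)\, e^{-ikr} \right) \hat{a}_k^\dagger \hat{a}_k .
$$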

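Magnon Hamiltonian: the symmetry J(r) = J(−r) implies that the summation of J(r)
yields the same result for H_0^{+−} and H_0^{−+}, the Fourier transform J(k) = Σ_r J(r) e^{ikr}. Then,
$$
\mathcal{H}_0 = S \sum_k \left[ J(0)\, \hat{a}_k^\dagger \hat{a}_k - \tfrac12 J(k) \left( \hat{a}_k \hat{a}_k^\dagger + \hat{a}_k^\dagger \hat{a}_k \right) \right],
$$
where we used J(0) = Σ_r J(r), i.e., J(k) at k = 0. For the last step, we recall that
[â_k, â_k†] = 1 per Eq. (5.20) and obtain
$$
\mathcal{H}_0 = S \sum_k [J(0) - J(k)]\, \hat{a}_k^\dagger \hat{a}_k - \frac{S}{2} \sum_k J(k)
= S \sum_k [J(0) - J(k)]\, \hat{a}_k^\dagger \hat{a}_k = \sum_k \hbar\omega_k\, \hat{a}_k^\dagger \hat{a}_k , \tag{5.27}
$$
because the summation of J(k) yields zero,
$$
\sum_k J(k) = \sum_r J(r) \left( \sum_k e^{ikr} \right) = N \sum_r J(r)\, \delta_{r,0} = N J(r = 0) = 0,
$$
via a relation reciprocal to Eq. (5.21). Here, J(r = 0) is the coupling of a spin to itself,
which is absent from the Hamiltonian.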


Eq. (5.27) essentially repeats Eq. (5.14) but in the language of second quantization. Instead
of spin waves we get magnons, bosonic quasiparticles that describe excitations of a Heisenberg
ferromagnet within linear spin-wave theory. Each magnon reduces S^z by 1, which is
also obvious from the form of S^z within the Holstein-Primakoff transformation, Eq. (5.17).
Simple lattices with only one type of exchange coupling allow a more intuitive form of
the magnon Hamiltonian
$$
\mathcal{H} = zJS \sum_k (1 - g_k)\, \hat{a}_k^\dagger \hat{a}_k \tag{5.28}
$$
with the geometrical factor of the lattice,
$$
g_k = \frac{1}{z} \sum_\delta e^{ik\delta} . \tag{5.29}
$$
Note that the summation is done only over vectors δ that connect a lattice site to its neighbors.
For a simple cubic lattice, the number of neighbors is z = 6, and
$$
g_k = \frac13 \left( \cos k_x a + \cos k_y a + \cos k_z a \right) \simeq 1 - \frac{(ka)^2}{6} . \tag{5.30}
$$
Then, ħω_k ≃ JS(ka)², the quadratic magnon dispersion at low k.
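As a quick numerical illustration of Eqs. (5.28)-(5.30), the sketch below evaluates the magnon
energy of a simple cubic ferromagnet and can be used to confirm the quadratic low-k form. The
function names and the default lattice constant a = 1 are assumptions of this example.

```python
import math


def geometric_factor(kx, ky, kz, a=1.0):
    """g_k of Eq. (5.30) for the simple cubic lattice (z = 6)."""
    return (math.cos(kx * a) + math.cos(ky * a) + math.cos(kz * a)) / 3.0


def magnon_energy(kx, ky, kz, J, S, a=1.0):
    """hbar * omega_k = z J S (1 - g_k) with z = 6, from Eq. (5.28)."""
    return 6.0 * J * S * (1.0 - geometric_factor(kx, ky, kz, a))
```

For small wave vectors, magnon_energy(kx, ky, kz, J, S) approaches J S a² (kx² + ky² + kz²),
and at the zone corner k = (π/a, π/a, π/a) it reaches its maximum of 12 J S.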

5.2.4 Thermodynamic properties
Internal energy: Eq. (5.27) describes the gas of free bosons. Its energy depends on the
number of bosons,
$$
U = E_0 + \sum_k \hbar\omega_k\, n_k , \qquad n_k = \frac{1}{e^{\hbar\omega_k/(k_B T)} - 1} ,
$$
where n_k is determined by the conventional Bose-Einstein statistics.
High-temperature limit: at high temperatures, ħω_k/(kB T) ≪ 1, and e^x ≃ 1 + x. Therefore,
$$
U = E_0 + \sum_k \hbar\omega_k\, \frac{k_B T}{\hbar\omega_k} = E_0 + \sum_k k_B T = E_0 + N k_B T.
$$
As temperature derivative of U, specific heat is constant in the high-temperature limit,
C_V = N kB.
Low-temperature limit: at low temperatures, only the low-energy modes in the vicinity
of k = 0 contribute to the energy. We have seen that the power-law behavior (e.g., ωk ∼ k 2 )
holds. Let’s use the generic representation ωk = g |k|η and calculate temperature dependence
of the specific heat for an arbitrary dimension D. We replace the summation over the first
Brillouin zone with an integral over the k-space (this adds the V /(2π)D pre-factor),

V 2π D/2 ~ g kη
Z Z
V D ~ωk D−1
U = E0 + d k = E0 + dk k ,
(2π)D e~ωk /(kB T ) − 1 (2π)D Γ( D2 ) e~ g kη /(kB T ) − 1

and integrate over a D-dimensional sphere with the radius k. This way, dD k is reduced to dk
using an expression for the spherical volume element that includes Euler Γ-function. For the
reference, we mention that
√ (n − 2)!!
Γ(n) = (n − 1)! and Γ( n2 ) = π (n−1)/2 ,
2
which restore the familiar d2 r = 2πr dr and d3 r = 4πr2 dr expressions in 2D and 3D, respec-
tively.
Let’s define 1/η
~ g kη

kB T y k
y= ⇒ k= , dk = dy.
kB T ηg ηy
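The idea of this transformation is to remove any explicit $T$-dependence of the integrand and keep $T$ only in the pre-factor. By replacing $k$ with $y$, we arrive at
$$U = E_0 + \frac{V}{(2\pi)^D}\,\frac{2\pi^{D/2}}{\Gamma(D/2)}\int dy\,k^D\,\frac{k_B T}{\eta}\,\frac{1}{e^y - 1} = E_0 + \frac{V}{(2\pi)^D}\,\frac{2\pi^{D/2}}{\Gamma(D/2)}\,\eta^{-1}\int dy\left(\frac{k_B T\,y}{\hbar g}\right)^{D/\eta}\frac{k_B T}{e^y - 1} =$$
$$= E_0 + \frac{V}{(2\pi)^D}\,\frac{2\pi^{D/2}}{\Gamma(D/2)}\,\eta^{-1}\,\frac{(k_B T)^{D/\eta+1}}{(\hbar g)^{D/\eta}}\int dy\,\frac{y^{D/\eta}}{e^y - 1}.$$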

The integral over $y$ does not depend on $T$, so the temperature dependence is fully determined by the pre-factor. Therefore,
$$C_V = \frac{\partial U}{\partial T} = A\left(\frac{k_B T}{\hbar g}\right)^{D/\eta}. \tag{5.31}$$
For example, three-dimensional Heisenberg ferromagnets ($D = 3$, $\eta = 2$) exhibit the $T^{3/2}$ power-law behavior of the specific heat. The specific heat of two-dimensional Heisenberg ferromagnets is linear in $T$, etc.
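The power law of Eq. (5.31) can be checked numerically. The sketch below is only an illustration under stated assumptions: units with $\hbar = k_B = 1$, a pure power-law dispersion $\omega_k = g k^\eta$ extended to all $k$ (no Brillouin-zone boundary), and hypothetical helper names `thermal_energy` and `cv_exponent` that do not appear in the notes. It integrates the thermal part of $U$ on a fixed grid and estimates the exponent of $C_V$ from two temperatures.

```python
import numpy as np

def thermal_energy(T, D, eta, g=1.0, k_max=60.0, n=400_000):
    """Thermal part of U up to a T-independent prefactor (hbar = k_B = 1)."""
    dk = k_max / n
    k = (np.arange(n) + 0.5) * dk            # midpoint grid, skips k = 0
    w = g * k**eta                           # assumed dispersion omega_k = g k^eta
    x = w / T
    bose = np.exp(-x) / (-np.expm1(-x))      # 1/(e^x - 1), written to avoid overflow
    return np.sum(k**(D - 1) * w * bose) * dk

def cv_exponent(D, eta, T1=0.5, T2=1.0):
    """Exponent of C_V, using U - E_0 ~ T^(D/eta + 1) so that C_V ~ T^(D/eta)."""
    u1 = thermal_energy(T1, D, eta)
    u2 = thermal_energy(T2, D, eta)
    return np.log(u2 / u1) / np.log(T2 / T1) - 1.0

# (D, eta) pairs: 3D and 2D ferromagnets (eta = 2), 3D and 2D linear dispersion (eta = 1)
for D, eta in [(3, 2), (2, 2), (3, 1), (2, 1)]:
    print(D, eta, round(cv_exponent(D, eta), 3))
```

The fixed grid is chosen on purpose: rescaling the grid with $T$ would make the scaling exact by construction and the check would be empty.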

5.2.5 Magnetization and Bloch's $T^{3/2}$ law
The deviation of the magnetization from the saturation value is also determined by $n_k$,
$$M(T) = M(0)\left[1 - \frac{1}{NS}\sum_k n_k\right] = M(0)\left[1 - \frac{V}{NS}\int\frac{d^3k}{(2\pi)^3}\,\frac{1}{e^{\hbar\omega_k/(k_B T)} - 1}\right].$$

Similar to the above, we would like to remove the explicit $T$-dependence of the integrand. To this end, we use $y = k/\sqrt{k_B T}$, $d^3k = 4\pi k^2\,dk = 4\pi (k_B T)^{3/2} y^2\,dy$, $\omega_k = g k^2$, and obtain
$$M(T) = M(0)\left[1 - \frac{4\pi}{NS}\,\frac{V}{(2\pi)^3}\,(k_B T)^{3/2}\int dy\,\frac{y^2}{e^{\hbar g y^2} - 1}\right] = M(0)\,(1 - A\,T^{3/2}),$$
Bloch's $T^{3/2}$ law for the magnetization, which holds in three-dimensional Heisenberg ferromagnets. Unlike the specific heat, it can't be extended to low dimensions, where no magnetic order exists at $T \neq 0$ (see Sec. 5.4.5).

5.3 Mean-field theory for antiferromagnets


The mean-field description of antiferromagnets is similar to the one for ferromagnets in Sec. 5.1.
However, antiferromagnets feature two ferromagnetic sublattices with the magnetizations M1
and M2 . The sublattice magnetization serves as an order parameter and tracks the formation
of the ordered state. The total magnetization M = M1 + M2 will be central to the calculation
of the magnetic susceptibility.
Sublattice magnetization: similar to Eq. (5.1), the mean-field Hamiltonian reads as

HMF,i = −gµB Heff,i Si , (5.32)

where S1 belongs to the first (”spin-up”) sublattice, S2 belongs to the second (”spin-down”)
sublattice, and the effective fields acting on these sublattices may be different. These fields,
Heff,i = H + Hexch,i , are a combination of the external field H and internal (exchange) fields,
which we write as

Hexch,1 = λ12 M2 + λ11 M1 , Hexch,2 = λ21 M1 + λ22 M2 . (5.33)

In a fully compensated antiferromagnet, the two sublattices are equivalent. Therefore, λ12 = λ21
and λ11 = λ22 . These coefficients are obtained by summing magnetic interactions,
$$\lambda_{12} = -\frac{1}{(g\mu_B)^2}\,{\sum_j}''\,J_{ij}, \qquad \lambda_{11} = -\frac{1}{(g\mu_B)^2}\,{\sum_j}'\,J_{ij}, \tag{5.34}$$
where $\sum'$ denotes the summation within the sublattice, and $\sum''$ stands for the summation between the sublattices.
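The mean-field solution for the magnetization is obtained as a system of equations,
$$M_1 = g\mu_B S \times B_S\!\left[\frac{g\mu_B S}{k_B T}\,(\lambda_{11}M_1 + \lambda_{12}M_2 + H)\right],$$
$$M_2 = g\mu_B S \times B_S\!\left[\frac{g\mu_B S}{k_B T}\,(\lambda_{21}M_1 + \lambda_{22}M_2 + H)\right],$$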

where BS (x) stands for the Brillouin function. In zero field, M1 = −M2 , Heff,1 = −Heff,2 =
(λ11 − λ12 )M1 , and the two equations become equivalent. It is then sufficient to solve the

equation for M1 , which is no different from Eq. (5.5) in the ferromagnetic case, but contains
(λ11 − λ12 ) instead of zJ/(gµB )2 .
Néel temperature: by repeating the derivation of Eq. (5.6), we find
$$k_B T_N = \frac{S(S+1)}{3}\,(g\mu_B)^2\,(\lambda_{11} - \lambda_{12}). \tag{5.35}$$
Note that λ11 and λ12 have different signs, because interactions within the sublattice are fer-
romagnetic (negative), whereas those between the sublattices are antiferromagnetic (positive).
This way, all magnetic interactions are added together in (λ11 − λ12 ), and it is their overall
energy that determines TN .
High-temperature limit: full compensation of the sublattices is altered in the applied field, and we can no longer restrict ourselves to a single equation, but should use two of them, where $M_1$ and $M_2$ are intertwined. These equations are still tractable in the high-$T$ limit using the approximation $B_S(x) \simeq x(S+1)/(3S)$. The quantity of interest is the total magnetization $M$, which comes out as
$$M = M_1 + M_2 \simeq g\mu_B S\,\frac{S+1}{3S}\,\frac{g\mu_B S}{k_B T}\,(\lambda_{11}M_1 + \lambda_{12}M_2 + \lambda_{21}M_1 + \lambda_{22}M_2 + 2H).$$
With $\lambda_{12} = \lambda_{21}$ and $\lambda_{11} = \lambda_{22}$, this equation simplifies to
$$M = \frac{(g\mu_B)^2 S(S+1)}{3k_B T}\,\big[(\lambda_{11} + \lambda_{12})M + 2H\big],$$
which leads to the high-temperature susceptibility
$$\chi = \frac{M/2}{H} = \frac{(g\mu_B)^2 S(S+1)}{3k_B}\,\frac{1}{T - \theta}, \tag{5.36}$$
the Curie-Weiss law for antiferromagnets, where we added the factor of $\tfrac12$ to get the susceptibility per atom ($M$ was a sum over the two sublattices and thus twice the magnetization per atom).
This result looks deceptively similar to the one for ferromagnets, Eq. (5.7), but there is an important difference regarding the Curie-Weiss temperature $\theta$ that now has the form
$$\theta = \frac{(g\mu_B)^2 S(S+1)}{3k_B}\,(\lambda_{11} + \lambda_{12}) = -\frac{S(S+1)}{3k_B}\sum_j J_{ij}. \tag{5.37}$$

Because λ11 and λ12 have different signs, we effectively subtract ferromagnetic couplings from
the antiferromagnetic ones, and θ can be quite low, even if the underlying couplings are strong.
In general, we can’t write θ = TN even on the mean-field level, except for a special situation
when λ11 = 0, and only antiferromagnetic couplings are operative.
Low-temperature limit splits into two distinct cases, depending on the direction of the applied field with respect to the spin direction. When the field direction is collinear with the spin direction, we measure the parallel susceptibility $\chi_\parallel$, whereas the field applied perpendicular to the spin direction yields the perpendicular susceptibility $\chi_\perp$.
Parallel susceptibility: in zero field, $M_1 = -M_2$ at any temperature. A weak magnetic field will slightly shift the balance (without changing the directions of $M_1$ and $M_2$), so we can write $M_1 = M_0 + \delta M$ and $M_2 = -M_0 + \delta M$, where $M_0$ is the zero-field sublattice magnetization (at a given temperature), and $\delta M$ is a small parameter. Let's now expand $B_S[f(H)]$ considering the external field $H$ as another small parameter. To the first order,
$$B_S[f(H)] \simeq B_S[f(0)] + H\,B_S'[f(0)]\,\left.\frac{df}{dH}\right|_{H=0}.$$

The argument of the Brillouin function at zero field includes only $M_0$ without $\delta M$, so we come to the familiar situation where the argument of the Brillouin function for $M_1$ contains $(\lambda_{11} - \lambda_{12})M_0$, and the argument for $M_2$ contains $-(\lambda_{11} - \lambda_{12})M_0$. Another important observation at this juncture is that $B_S(x)$ is odd, whereas $B_S'(x)$ is even. When we sum up the Brillouin functions of $M_1$ and $M_2$, the $B_S[f(0)]$ terms vanish, $B_S(x) + B_S(-x) = 0$, whereas the $B_S'[f(0)]$ terms add up. With this knowledge, we can write
$$M = M_1 + M_2 \simeq g\mu_B S\,H \times B_S'\!\left[\frac{g\mu_B S}{k_B T}\,(\lambda_{11} - \lambda_{12})M_0\right] \times \left.\left(\frac{df_1}{dH} + \frac{df_2}{dH}\right)\right|_{H=0}. \tag{5.38}$$

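This is close to the linear dependence of $M(H)$, but we still have to consider the last term, which is not that simple, because $f_1 \sim (\lambda_{11}M_1 + \lambda_{12}M_2 + H)$, whereas $f_2 \sim (\lambda_{12}M_1 + \lambda_{11}M_2 + H)$, and we can't replace $M_1$ and $M_2$ with $M_0$ before taking the derivative. This leads to
$$\left.\left(\frac{df_1}{dH} + \frac{df_2}{dH}\right)\right|_{H=0} = \frac{g\mu_B S}{k_B T}\left[2 + (\lambda_{11} + \lambda_{12})\left.\frac{dM}{dH}\right|_{H=0}\right] = \frac{g\mu_B S}{k_B T}\,\big[2 + 2\chi_\parallel(\lambda_{11} + \lambda_{12})\big],$$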

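where we add the factor of 2 in front of $\chi_\parallel$, because $M$ is the total magnetization of both sublattices, whereas $\chi_\parallel = M/(2H)$ is defined per atom. Upon substituting into Eq. (5.38), we get the linear equation for $\chi_\parallel$,
$$\chi_\parallel = \frac{(g\mu_B S)^2}{k_B T} \times \big[1 + (\lambda_{11} + \lambda_{12})\chi_\parallel\big] \times B_S'(\alpha), \qquad \alpha = \frac{g\mu_B S}{k_B T}\,(\lambda_{11} - \lambda_{12})M_0.$$
Then,
$$\chi_\parallel = \frac{(g\mu_B S)^2\,B_S'(\alpha)}{k_B T - (g\mu_B S)^2(\lambda_{11} + \lambda_{12})\,B_S'(\alpha)}. \tag{5.39}$$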

At low temperatures, $\alpha \to \infty$ and Eq. (5.9) applies. The derivative becomes $B_S'(\alpha) \simeq e^{-\alpha/S}/S^2$, and the parallel susceptibility grows exponentially starting from 0 at $T = 0$. The opposite limit of $T \simeq T_N$ is also easy to handle, because $\alpha$ becomes a small parameter ($M_0 \to 0$), and the conventional form $B_S(\alpha) \simeq \alpha(S+1)/(3S)$ can be used. The derivative $B_S'(\alpha) = (S+1)/(3S)$ does not depend on $\alpha$, so
$$\chi_\parallel = \frac{(g\mu_B)^2 S(S+1)/3}{k_B T - [(g\mu_B)^2 S(S+1)/3]\,(\lambda_{11} + \lambda_{12})} = \frac{(g\mu_B)^2 S(S+1)}{3k_B}\,\frac{1}{T - \theta}, \tag{5.40}$$
which is equivalent to the Curie-Weiss law derived above. This way, $\chi_\parallel$ grows exponentially at low temperatures and merges into the Curie-Weiss behavior at $T \simeq T_N$, with the maximum around $T_N$.
Perpendicular susceptibility should be handled differently, because we don’t really want
to exercise with the Brillouin function when H and M1 , M2 are non-collinear. The effect of the
transverse field is a slight canting of M1 and M2 away from their zero-field directions. With
the canting angle ϕ, the angle between M1 and M2 is (π − 2ϕ).
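The energy of an antiferromagnet is given by Eq. (5.32), where $g\mu_B\mathbf{S}_i = \mathbf{M}_i$. Therefore,
$$U = -\mathbf{M}_1\mathbf{H}_{\rm eff,1} - \mathbf{M}_2\mathbf{H}_{\rm eff,2} = -\mathbf{M}_1\mathbf{H} - \lambda_{11}\mathbf{M}_1^2 - \mathbf{M}_2\mathbf{H} - \lambda_{11}\mathbf{M}_2^2 - \lambda_{12}\mathbf{M}_1\mathbf{M}_2.$$
One detail here is that we take $\lambda_{12}\mathbf{M}_1\mathbf{M}_2$ only once to avoid double-counting. In the limit of small $\varphi$,
$$\mathbf{M}_1\mathbf{H} = \mathbf{M}_2\mathbf{H} = M_0 H\cos(\tfrac{\pi}{2} - \varphi) = M_0 H\sin\varphi \simeq M_0 H\varphi,$$
$$\mathbf{M}_1\mathbf{M}_2 = M_0^2\cos(\pi - 2\varphi) = -M_0^2\,(1 - 2\sin^2\varphi) \simeq M_0^2\,(2\varphi^2 - 1),$$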

where $M_0$ is the zero-field sublattice magnetization, as in the case of $\chi_\parallel$ above. The resulting energy is a simple function of $\varphi$,
$$U = -(2\lambda_{11} - \lambda_{12})M_0^2 - 2M_0 H\varphi - 2M_0^2\lambda_{12}\varphi^2.$$
The energy minimum is at $\varphi_0 = -H/(2M_0\lambda_{12})$, which gives rise to
$$|\mathbf{M}| = |\mathbf{M}_1 + \mathbf{M}_2| = 2M_0\cos(\tfrac{\pi}{2} - \varphi_0) \simeq 2M_0\varphi_0 = -H/\lambda_{12}$$
(the minus sign here reflects the fact that $\lambda_{12} < 0$).
This way, we arrive at a remarkably simple expression for the perpendicular susceptibility, which is temperature-independent in the ordered state,
$$\chi_\perp = -\frac{1}{2\lambda_{12}} \tag{5.41}$$
(the factor of $\tfrac12$ should be introduced again, because $|\mathbf{M}|$ is the magnetization per two sublattices). Remarkably, $\chi_\perp = \chi(T_N)$, where $\chi(T_N)$ is obtained from the Curie-Weiss law, Eq. (5.36). Indeed, from $\theta \sim (\lambda_{11} + \lambda_{12})$ and $T_N \sim (\lambda_{11} - \lambda_{12})$ one infers that $(T_N - \theta) \sim -2\lambda_{12}$, and all pre-factors cancel out, resulting in $\chi(T_N) = -1/(2\lambda_{12})$.
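The three mean-field results, Eqs. (5.35), (5.39), and (5.41), can be put side by side in a few lines of code. The sketch below makes several assumptions that are not part of the notes: units with $g\mu_B = k_B = 1$, spin $S = \tfrac12$ (for which the Brillouin function reduces to $\tanh x$, so that $B_S'(\alpha) = 1/\cosh^2\alpha$), a plain fixed-point iteration for $M_0$, and illustrative function names. It is meant only to show that $\chi_\parallel$ vanishes at low $T$ and meets $\chi_\perp$ at $T_N$.

```python
import math

def neel_and_weiss(lam11, lam12, S=0.5):
    """T_N and theta from Eqs. (5.35) and (5.37), in units g*mu_B = k_B = 1."""
    c = S * (S + 1) / 3.0
    return c * (lam11 - lam12), c * (lam11 + lam12)

def sublattice_m0(T, lam11, lam12, iters=2000):
    """Zero-field sublattice magnetization for S = 1/2, where B_S(x) = tanh(x)."""
    S = 0.5
    t_neel, _ = neel_and_weiss(lam11, lam12, S)
    if T >= t_neel:
        return 0.0                       # paramagnetic state, no order
    m = S                                # start from the saturated value
    for _ in range(iters):               # plain fixed-point iteration
        m = S * math.tanh(S * (lam11 - lam12) * m / T)
    return m

def chi_parallel(T, lam11, lam12):
    """Eq. (5.39) for S = 1/2, with B_S'(alpha) = 1/cosh(alpha)^2."""
    S = 0.5
    alpha = S * (lam11 - lam12) * sublattice_m0(T, lam11, lam12) / T
    b_prime = 1.0 / math.cosh(alpha) ** 2
    return S**2 * b_prime / (T - S**2 * (lam11 + lam12) * b_prime)

def chi_perp(lam12):
    """Eq. (5.41); lam12 < 0 for antiferromagnetic inter-sublattice coupling."""
    return -1.0 / (2.0 * lam12)

lam11, lam12 = 0.5, -1.0                 # illustrative values only
t_neel, theta = neel_and_weiss(lam11, lam12)
for frac in (0.1, 0.5, 0.9, 1.0):
    print(frac, chi_parallel(frac * t_neel, lam11, lam12), chi_perp(lam12))
```

Very low temperatures (roughly $T < T_N/700$) would overflow `math.cosh` in this simple form; the sketch does not guard against that.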

5.4 Spin-wave theory for antiferromagnets


5.4.1 Linear spin-wave theory
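Holstein-Primakoff transformation: we have to deal with the two sublattices again and introduce individual bosonic operators for the sublattices A and B,
$$\hat S^+_{A r} = \sqrt{2S}\left(1 - \frac{\hat a^\dagger_r\hat a_r}{2S}\right)^{1/2}\hat a_r, \quad \hat S^-_{A r} = \hat a^\dagger_r\,\sqrt{2S}\left(1 - \frac{\hat a^\dagger_r\hat a_r}{2S}\right)^{1/2}, \quad \hat S^z_{A r} = S - \hat a^\dagger_r\hat a_r, \tag{5.42}$$
$$\hat S^+_{B r} = \hat b^\dagger_r\,\sqrt{2S}\left(1 - \frac{\hat b^\dagger_r\hat b_r}{2S}\right)^{1/2}, \quad \hat S^-_{B r} = \sqrt{2S}\left(1 - \frac{\hat b^\dagger_r\hat b_r}{2S}\right)^{1/2}\hat b_r, \quad \hat S^z_{B r} = -S + \hat b^\dagger_r\hat b_r. \tag{5.43}$$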

Note that the transformation is ”asymmetric”, because sublattice B is the spin-down sublattice
that features the spin of −S in the ground state. A more subtle issue here is that we start from
the classical Néel state with the spins of +S and −S in the sublattices A and B, respectively.
As we have seen in Sec. 3.2, this Néel state may not be the ground state. Spin-wave theory will
confirm that it’s not the ground state indeed, but it should be deemed a reasonable starting
approximation that can be improved using spin-wave corrections.
More generally, any kind of spin-wave theory requires an ordered state to begin with, because
â and ↠must act on some well-defined (preferably, classical) state. This is not a problem for
ferromagnets, where ferromagnetic state is always a good choice, but may impose difficulties
for antiferromagnets, where the type of magnetic order is not known a priori, and in special
cases (frustrated magnets) no magnetic order exists.
Bosonic Hamiltonian: all of the following is essentially similar to Sec. 5.2.3 and could be written with an arbitrary set of exchange couplings $J(p - q)$, but for simplicity we shall restrict ourselves to the spin Hamiltonian with only one (antiferromagnetic) coupling $J$ that connects the sublattices A and B,
$$H = J\sum_{p,q}\left(\hat S^z_{Ap}\hat S^z_{Bq} + \tfrac12\,\hat S^+_{Ap}\hat S^-_{Bq} + \tfrac12\,\hat S^-_{Ap}\hat S^+_{Bq}\right), \tag{5.44}$$

where we do the summation over all p and q assuming that p belongs to the sublattice A and
q belongs to the sublattice B (this way, we essentially do the summation over pairs of atoms).
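By restricting ourselves to the linear spin-wave theory, we can simplify the spin operators to
$$\hat S^+_{Ap} = \sqrt{2S}\,\hat a_p, \quad \hat S^-_{Ap} = \sqrt{2S}\,\hat a^\dagger_p, \quad \hat S^z_{Ap} = S - \hat a^\dagger_p\hat a_p,$$
$$\hat S^+_{Bq} = \sqrt{2S}\,\hat b^\dagger_q, \quad \hat S^-_{Bq} = \sqrt{2S}\,\hat b_q, \quad \hat S^z_{Bq} = -S + \hat b^\dagger_q\hat b_q.$$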

We shall also use N for the number of atoms in each sublattice, because this preserves the form
of Eqs. (5.18), (5.19), and (5.21). The total number of atoms is 2N then.
Using Fourier transforms of the creation and annihilation operators, Eq. (5.19), and disregarding all terms that are non-linear in $\hat a^\dagger\hat a$, we arrive at
$$H_{\rm LSW} = E_g^{(0)} + H_0, \qquad H_0 = H_0^{zz} + H_0^{+-} + H_0^{-+}, \tag{5.45}$$
where $E_g^{(0)} = -NzJS^2$ is the classical energy of the Néel state. The form of $H_0$,
$$H_0 = zJS\sum_k\left[(\hat a^\dagger_k\hat a_k + \hat b^\dagger_k\hat b_k) + g_k\,(\hat a_k\hat b_{-k} + \hat a^\dagger_{-k}\hat b^\dagger_k)\right], \tag{5.46}$$
is similar to that of Sec. 5.2.3, but with the geometrical factor $g_k$ replacing the Fourier transform $J(k)$. It's easy to notice that the first term in Eq. (5.46) stands for $H_0^{zz}$, whereas the second term should be due to $[H_0^{+-} + H_0^{-+}]$.
At this point, the antiferromagnetic and ferromagnetic cases diverge, because, instead of the simple $\hat a^\dagger\hat a$ (viz. $\hat a\hat a^\dagger$) terms, we have got a mixture of $\hat a$ and $\hat b$. To better understand their origin, consider, for example,
$$H_0^{+-} = \frac12\,\frac{2S}{N}\sum_{k_1,k_2}\sum_{p,q} J(p - q)\,e^{-ik_1 p}\,e^{-ik_2 q}\,\hat a_{k_1}\hat b_{k_2} = \frac{JS}{N}\sum_{k_1,k_2}\left[\sum_p e^{-i(k_1 + k_2)p}\right]\left[\sum_\delta e^{ik_2\delta}\right]\hat a_{k_1}\hat b_{k_2},$$
where we use $\delta = p - q$ in the place of $p'$ of Sec. 5.2.3 to emphasize that the summation is done over nearest neighbors only (i.e., over those $p - q$ where $J \neq 0$). The summation over $p$ yields $N\delta_{k_1,-k_2}$, and the summation over $\delta$ yields $z\,g_k$, thus leading to the $\hat a_k\hat b_{-k}$ term of Eq. (5.46).
At first glance, H0 of an antiferromagnet is just slightly more complicated than in the
ferromagnetic case of Eq. (5.28). However, the spin-wave Hamiltonian is no longer diagonal,
which means that we don’t have the solution yet.

5.4.2 Bogolyubov transformation


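Before we proceed, let's symmetrize Eq. (5.46) using the fact that, for example,
$$\sum_k \hat a_k\hat b_{-k} = \sum_{k' = -k}\hat a_{-k'}\hat b_{k'} = \frac12\sum_k\left(\hat a_k\hat b_{-k} + \hat a_{-k}\hat b_k\right), \tag{5.47}$$
and for a centrosymmetric structure $g_k = g_{-k}$. Then,
$$H_0 = \frac{zJS}{2}\left(\sum_k H_k^{(1)} + \sum_k H_k^{(2)}\right), \tag{5.48}$$
where
$$H_k^{(1)} = (\hat a^\dagger_k\hat a_k + \hat b^\dagger_{-k}\hat b_{-k}) + g_k\,(\hat a_k\hat b_{-k} + \hat a^\dagger_k\hat b^\dagger_{-k}),$$
$$H_k^{(2)} = (\hat a^\dagger_{-k}\hat a_{-k} + \hat b^\dagger_k\hat b_k) + g_k\,(\hat a_{-k}\hat b_k + \hat a^\dagger_{-k}\hat b^\dagger_k).$$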

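Consider the Bogolyubov transformation,
$$\hat\alpha_k = c_1\hat a_k + c_2\hat b^\dagger_{-k}, \qquad \hat\alpha^\dagger_k = c_1\hat a^\dagger_k + c_2\hat b_{-k}, \tag{5.49}$$
$$\hat\beta_k = d_1\hat a^\dagger_{-k} + d_2\hat b_k, \qquad \hat\beta^\dagger_k = d_1\hat a_{-k} + d_2\hat b^\dagger_k, \tag{5.50}$$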

where the coefficients c1 , c2 , d1 , and d2 remain to be determined. First of all, we impose the
standard commutation relations,

[α̂k , α̂k† ] = 1, [β̂k , β̂k† ] = 1, [α̂k , β̂k ] = 0, [α̂k† , β̂k† ] = 0 (5.51)

that lead to the conditions

c21 − c22 = 1, d22 − d21 = 1, c1 d1 = c2 d2 . (5.52)


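We seek to write $H_k^{(1)}$ of Eq. (5.48) in the diagonal form
$$H_k^{(1)} = E_\alpha + \lambda^\alpha_k\,\hat\alpha^\dagger_k\hat\alpha_k + \lambda^\beta_k\,\hat\beta^\dagger_{-k}\hat\beta_{-k}, \tag{5.53}$$
where we choose $\hat\beta_{-k}$ and $\hat\beta^\dagger_{-k}$, because they contain $\hat a_k$ and $\hat b_{-k}$ entering $H_k^{(1)}$. Eq. (5.53) implies that
$$[\hat\alpha_k, H_k^{(1)}] = \lambda^\alpha_k\hat\alpha_k = \lambda^\alpha_k\,(c_1\hat a_k + c_2\hat b^\dagger_{-k}).$$
On the other hand, we can write the same commutator using the actual form of $H_k^{(1)}$ given below Eq. (5.48),
$$[\hat\alpha_k, H_k^{(1)}] = \left[c_1\hat a_k + c_2\hat b^\dagger_{-k},\; (\hat a^\dagger_k\hat a_k + \hat b^\dagger_{-k}\hat b_{-k}) + g_k\,(\hat a_k\hat b_{-k} + \hat a^\dagger_k\hat b^\dagger_{-k})\right] =$$
$$= (c_1 - c_2 g_k)\,\hat a_k + (c_1 g_k - c_2)\,\hat b^\dagger_{-k}.$$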

The expressions on the right-hand side of the last two equations should match, which leads to the system of linear equations for $c_1$ and $c_2$,
$$\begin{cases} c_1 - c_2 g_k = c_1\lambda^\alpha_k \\ c_1 g_k - c_2 = c_2\lambda^\alpha_k \end{cases} \;\Rightarrow\; \begin{pmatrix}\lambda^\alpha_k - 1 & g_k \\ -g_k & \lambda^\alpha_k + 1\end{pmatrix}\begin{pmatrix}c_1 \\ c_2\end{pmatrix} = 0. \tag{5.54}$$
The non-zero solution for $c_1$ and $c_2$ exists when the determinant of the matrix is zero, hence
$$(\lambda^\alpha_k)^2 - 1 + g_k^2 = 0 \;\Rightarrow\; \lambda^\alpha_k = \sqrt{1 - g_k^2}, \tag{5.55}$$
where we picked the positive solution, because $\lambda^\alpha_k$ is proportional to the magnon energy. By repeating the same procedure for $\hat\beta_{-k}$, we arrive at the conditions for $d_1$, $d_2$, and $\lambda^\beta_k$,
$$\begin{cases} -d_1 + d_2 g_k = d_1\lambda^\beta_k \\ -d_1 g_k + d_2 = d_2\lambda^\beta_k \end{cases} \;\Rightarrow\; \lambda^\beta_k = \sqrt{1 - g_k^2} = \lambda^\alpha_k = \lambda_k.$$
Moreover, the same conditions apply for the diagonalization of $H_k^{(2)}$ in terms of $\hat\alpha^\dagger_{-k}\hat\alpha_{-k} + \hat\beta^\dagger_k\hat\beta_k$. It is also easy to notice that all of the above requirements for the coefficients of the Bogolyubov transformation are fulfilled if we choose $d_2 = c_1$ and $d_1 = c_2$, so the transformation is in fact very simple.
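The conditions collected so far can be verified numerically for any value of the geometrical factor. In the sketch below, the closed forms $c_1^2 = (1 + \lambda_k)/(2\lambda_k)$ and $c_2 = c_1 g_k/(1 + \lambda_k)$ are not quoted in the notes; they are obtained here by combining Eq. (5.54) with the normalization $c_1^2 - c_2^2 = 1$ of Eq. (5.52) and should be read as an assumption of this example. The function names are illustrative.

```python
import math

def bogolyubov(g):
    """Return (lam, c1, c2) for a geometrical factor with 0 < |g| < 1."""
    lam = math.sqrt(1.0 - g * g)                # Eq. (5.55)
    c1 = math.sqrt((1.0 + lam) / (2.0 * lam))   # fixes c1^2 - c2^2 = 1
    c2 = c1 * g / (1.0 + lam)                   # second line of Eq. (5.54)
    return lam, c1, c2

def residuals(g):
    """Left-hand minus right-hand sides of Eqs. (5.52) and (5.54); all should vanish."""
    lam, c1, c2 = bogolyubov(g)
    d1, d2 = c2, c1                             # the choice d2 = c1, d1 = c2
    return [
        c1 - c2 * g - c1 * lam,                 # Eq. (5.54), first line
        c1 * g - c2 - c2 * lam,                 # Eq. (5.54), second line
        -d1 + d2 * g - d1 * lam,                # conditions for d1, d2
        -d1 * g + d2 - d2 * lam,
        c1**2 - c2**2 - 1.0,                    # Eq. (5.52)
        d2**2 - d1**2 - 1.0,
        c1 * d1 - c2 * d2,
    ]

for g in (0.1, 0.5, -0.7, 0.99):
    print(g, max(abs(r) for r in residuals(g)))
```

At $g_k = \pm 1$ (that is, at $k = 0$) $\lambda_k$ vanishes and both coefficients diverge, which is why the sketch restricts itself to $|g_k| < 1$.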
Coming back to Eq. (5.45), we can write the spin-wave Hamiltonian as
$$H_{\rm LSW} = E_g^{(1)} + \sum_k\hbar\omega_k\,(\hat\alpha^\dagger_k\hat\alpha_k + \hat\beta^\dagger_k\hat\beta_k), \qquad \hbar\omega_k = zJS\sqrt{1 - g_k^2}, \tag{5.56}$$
where we use the same trick as in Eq. (5.47) and do the summation over $k$ or $-k$ in order to merge the terms like $\hat\beta^\dagger_k\hat\beta_k$ of $H_k^{(2)}$ and $\hat\beta^\dagger_{-k}\hat\beta_{-k}$ of $H_k^{(1)}$.
If you are not convinced yet, please read on to Sec. 5.4.3, where we demonstrate the diagonal form explicitly and also elucidate the meaning of $E_g^{(1)}$. For now, we only want to discuss the $k$-dependent part, which is a direct consequence of Eq. (5.53).
Low-k limit: using the geometrical factor for the cubic lattice, Eq. (5.30), we note that at small $k$
$$g_k^2 = \left(1 - \frac{(ka)^2}{6}\right)^2 \simeq 1 - \frac{(ka)^2}{3} \;\Rightarrow\; \hbar\omega_k \sim ka. \tag{5.57}$$
Antiferromagnetic magnons are gapless, but their dispersion is linear at low $k$. From Eq. (5.31) with $\eta = 1$ we immediately infer that in 3D antiferromagnets $C_V \sim T^3$, whereas in 2D antiferromagnets $C_V \sim T^2$.

5.4.3 Ground-state energy


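We shall now elucidate the ground-state energy $E_g^{(1)}$ in Eq. (5.56) and the physical meaning of antiferromagnetic magnons. To this end, we have to invert the Bogolyubov transformation,
$$\hat a_k = c_1\hat\alpha_k - c_2\hat\beta^\dagger_{-k}, \qquad \hat a^\dagger_k = c_1\hat\alpha^\dagger_k - c_2\hat\beta_{-k},$$
$$\hat b_k = c_1\hat\beta_k - c_2\hat\alpha^\dagger_{-k}, \qquad \hat b^\dagger_k = c_1\hat\beta^\dagger_k - c_2\hat\alpha_{-k},$$
and calculate different terms of Eq. (5.46). Let's start with
$$\hat a^\dagger_k\hat a_k = c_1^2\,\hat\alpha^\dagger_k\hat\alpha_k + c_2^2\,\hat\beta_{-k}\hat\beta^\dagger_{-k} - c_1c_2\,(\hat\alpha^\dagger_k\hat\beta^\dagger_{-k} + \hat\beta_{-k}\hat\alpha_k) =$$
$$= c_2^2 + c_1^2\,\hat\alpha^\dagger_k\hat\alpha_k + c_2^2\,\hat\beta^\dagger_{-k}\hat\beta_{-k} - c_1c_2\,(\hat\alpha^\dagger_k\hat\beta^\dagger_{-k} + \hat\alpha_k\hat\beta_{-k}),$$
$$\hat b^\dagger_k\hat b_k = c_1^2\,\hat\beta^\dagger_k\hat\beta_k + c_2^2\,\hat\alpha_{-k}\hat\alpha^\dagger_{-k} - c_1c_2\,(\hat\beta^\dagger_k\hat\alpha^\dagger_{-k} + \hat\alpha_{-k}\hat\beta_k) =$$
$$= c_2^2 + c_2^2\,\hat\alpha^\dagger_{-k}\hat\alpha_{-k} + c_1^2\,\hat\beta^\dagger_k\hat\beta_k - c_1c_2\,(\hat\alpha^\dagger_{-k}\hat\beta^\dagger_k + \hat\alpha_{-k}\hat\beta_k).$$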

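Interpretation of antiferromagnetic magnons: we can now represent $\hat S^z$ in terms of $\hat\alpha$ and $\hat\beta$. Indeed, from Eqs. (5.42) and (5.43),
$$\langle\hat S^z\rangle = \Big\langle\sum_p\hat S^z_p\Big\rangle + \Big\langle\sum_q\hat S^z_q\Big\rangle = NS - \Big\langle\sum_p\hat a^\dagger_p\hat a_p\Big\rangle - NS + \Big\langle\sum_q\hat b^\dagger_q\hat b_q\Big\rangle =$$
$$= \sum_k\left(\langle\hat b^\dagger_k\hat b_k\rangle - \langle\hat a^\dagger_k\hat a_k\rangle\right) = (c_1^2 - c_2^2)\sum_k\left(\langle\hat\beta^\dagger_k\hat\beta_k\rangle - \langle\hat\alpha^\dagger_k\hat\alpha_k\rangle\right) = \sum_k\left(\langle\hat\beta^\dagger_k\hat\beta_k\rangle - \langle\hat\alpha^\dagger_k\hat\alpha_k\rangle\right),$$
where we again merge similar terms by performing the summation over $k$ or $-k$ and use $c_1^2 - c_2^2 = 1$ from Eq. (5.52). The merging is made possible by the fact that $g_k = g_{-k}$ and, thus, via Eq. (5.54) the coefficients $c_1$ and $c_2$ are also the same for $k$ and $-k$.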
The excitations created by $\hat\alpha^\dagger_k$ and $\hat\beta^\dagger_k$ change $S^z$ by $-1$ and $+1$, respectively, just as ferromagnetic magnons change $S^z$ by $-1$. This should be paralleled to the fact that the $\hat\alpha$- and $\hat\beta$-excitations are degenerate per Eq. (5.56). In antiferromagnets, the balance between the two sublattices can be shifted in both directions, $S^z > 0$ or $S^z < 0$, with no difference in the energy. Another important feature is that the excitations are never restricted to one sublattice. Both $\hat\alpha_k$ and $\hat\beta_k$ mix $\hat a_k$ and $\hat b_k$ in a rather intricate way, where $c_1$ and $c_2$ are $k$-dependent via Eq. (5.54).

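Diagonalization of the spin-wave Hamiltonian: in order to represent Eq. (5.46), we shall also need
$$\hat a^\dagger_{-k}\hat b^\dagger_k = -c_1c_2 + c_1^2\,\hat\alpha^\dagger_{-k}\hat\beta^\dagger_k + c_2^2\,\hat\alpha_{-k}\hat\beta_k - c_1c_2\,(\hat\alpha^\dagger_{-k}\hat\alpha_{-k} + \hat\beta^\dagger_k\hat\beta_k),$$
$$\hat a_k\hat b_{-k} = -c_1c_2 + c_2^2\,\hat\alpha^\dagger_k\hat\beta^\dagger_{-k} + c_1^2\,\hat\alpha_k\hat\beta_{-k} - c_1c_2\,(\hat\alpha^\dagger_k\hat\alpha_k + \hat\beta^\dagger_{-k}\hat\beta_{-k}).$$
Then, $H_0$ will include three terms,
$$H_0 = zJS\sum_k\left[A_0 + A_1\,(\hat\alpha^\dagger_k\hat\alpha_k + \hat\beta^\dagger_k\hat\beta_k) + A_2\,(\hat\alpha^\dagger_k\hat\beta^\dagger_k + \hat\alpha_k\hat\beta_k)\right]. \tag{5.58}$$
We consider them one by one using Eq. (5.54).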


First, the pre-factor in front of the mixed $\hat\alpha\hat\beta$ term should vanish,
$$A_2 = -2c_1c_2 + g_k\,(c_1^2 + c_2^2) = c_1(c_1g_k - c_2) + c_2(c_2g_k - c_1) = c_1c_2\lambda_k - c_2c_1\lambda_k = 0, \tag{5.59}$$
and it does. Second, the pre-factor in front of the magnon term yields the magnon energy,
$$A_1 = c_1^2 + c_2^2 - 2g_kc_1c_2 = c_1(c_1 - g_kc_2) + c_2(c_2 - g_kc_1) = c_1^2\lambda_k - c_2^2\lambda_k = \lambda_k. \tag{5.60}$$
Finally, the term that contributes to the ground-state energy is
$$A_0 = 2c_2^2 - 2c_1c_2g_k = c_2^2 - 1 + c_1^2 - 2c_1c_2g_k = -1 + c_2(c_2 - c_1g_k) + c_1(c_1 - c_2g_k) =$$
$$= -1 + (c_1^2 - c_2^2)\lambda_k = -1 + \sqrt{1 - g_k^2}, \tag{5.61}$$
so
$$E_g^{(1)} = E_g^{(0)} + zJS\sum_k A_0 = -NzJS^2 + zJS\sum_k\left(-1 + \sqrt{1 - g_k^2}\right). \tag{5.62}$$
Because $\sqrt{1 - g_k^2} < 1$, the correction is negative. The energy is reduced compared to the classical energy of the Néel state. On the other hand, we can write
$$E_g^{(1)} = -NzJ\,S(S+1) + zJS\sum_k\sqrt{1 - g_k^2}, \tag{5.63}$$
which reflects the fact that the spin-wave energy is higher than the quantum limit with $S(S+1)$, as given by Eq. (3.8). A compromise way of writing the ground-state energy is
$$E_g^{(1)} = -NzJ\,S(S + \zeta), \qquad \zeta = \frac{1}{N}\sum_k\left(1 - \sqrt{1 - g_k^2}\right). \tag{5.64}$$

Here, ζ is the spin-wave correction to the classical energy and can be calculated using the
lattice structure. In a simple cubic lattice ζ ' 0.097, whereas in the bcc lattice ζ ' 0.072. For
lower dimensions, one finds ζ ' 0.158 in 2D (square lattice) and ζ ' 0.363 in 1D (linear chain).
Quantum effects are increasingly more important in low dimensions.
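The lattice sum in Eq. (5.64) is easy to evaluate on a computer. The sketch below is an illustration under assumptions of its own: lattice constant $a = 1$, nearest-neighbor geometrical factor $g_k = \frac{1}{D}\sum_\alpha\cos k_\alpha$ for the linear chain, square lattice, and simple cubic lattice, and a uniform midpoint grid in place of the exact Brillouin-zone sum. The function name `zeta` and the grid sizes are arbitrary choices. The results can be compared with the values quoted above.

```python
import numpy as np

def zeta(dim, n):
    """Spin-wave correction of Eq. (5.64) on an n^dim midpoint grid (a = 1)."""
    k = 2.0 * np.pi * (np.arange(n) + 0.5) / n           # one axis of the Brillouin zone
    grids = np.meshgrid(*([k] * dim), indexing="ij")
    g = sum(np.cos(q) for q in grids) / dim              # geometrical factor g_k
    root = np.sqrt(np.clip(1.0 - g**2, 0.0, None))       # clip guards against rounding below 0
    return float(np.mean(1.0 - root))

for dim, n in [(1, 4096), (2, 256), (3, 48)]:
    print(dim, round(zeta(dim, n), 4))
```

In 1D the sum can also be done by hand, since $\sqrt{1 - \cos^2 k} = |\sin k|$ averages to $2/\pi$ over the zone, giving $\zeta = 1 - 2/\pi$.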
What is the origin of this correction? If we start with the quantum scenario, Eq. (5.63) can be seen as the quantum energy of the Heisenberg antiferromagnet plus some energy proportional to the magnon frequency but independent of the number of excited magnons. It is the zero-point energy of a quantum system. On the other hand, if we start with the classical Néel state having the energy $E_g^{(0)}$, we note that it is the vacuum state for the $\hat a_k$ and $\hat b_k$ operators, but not for $\hat\alpha_k$ and $\hat\beta_k$. They act on their own vacuum state having the energy $E_g^{(1)}$, where some excitations due to $\hat a_k$ and $\hat b_k$ are allowed. Such excitations define quantum (zero-temperature) fluctuations of the Heisenberg antiferromagnet and reduce its energy compared to $E_g^{(0)}$. The effect of these quantum fluctuations will become even more obvious in the following, where the sublattice magnetization is considered.

5.4.4 Staggered magnetization
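Antiferromagnetic magnons change the total $S^z$ by $\pm 1$, whereas their effect on individual sublattices is less straightforward. In contrast to the mean-field theory (Sec. 5.3), $M_1$ and $M_2$ can't be used as the order parameter, and a staggered magnetization, the difference between the sublattice magnetizations, should be introduced instead,
$$M_{\rm st} = \frac{1}{2N}\left\langle\Big(\sum_A S^z_p\Big) - \Big(\sum_B S^z_q\Big)\right\rangle = S - \frac{1}{2N}\sum_k\left[\langle\hat a^\dagger_k\hat a_k\rangle + \langle\hat b^\dagger_k\hat b_k\rangle\right], \tag{5.65}$$
where $\langle\ldots\rangle$ denotes the thermal average. We shall use the expressions for $\hat a^\dagger_k\hat a_k$ and $\hat b^\dagger_k\hat b_k$ from the previous section, but realize that none of the terms cancel (unlike in the calculation of $S^z$ above),
$$M_{\rm st} = S - \frac{1}{2N}\sum_k\left[2c_2^2 + (c_1^2 + c_2^2)\,(\langle\hat\alpha^\dagger_k\hat\alpha_k\rangle + \langle\hat\beta^\dagger_k\hat\beta_k\rangle) - 2c_1c_2\,(\langle\hat\alpha^\dagger_k\hat\beta^\dagger_k\rangle + \langle\hat\alpha_k\hat\beta_k\rangle)\right]. \tag{5.66}$$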

Ground state. By the ground state we mean the vacuum state $|0\rangle$ of the $\hat\alpha$ and $\hat\beta$ bosons. Here, thermal averages are simply expectation values, and all expectation values are zero. This is obvious for $\langle\hat\alpha^\dagger_k\hat\alpha_k\rangle = 0$ and $\langle\hat\beta^\dagger_k\hat\beta_k\rangle = 0$, because they are particle numbers, and no particles exist in the vacuum state. It is also clear that $\langle 0|\hat\alpha_k\hat\beta_k|0\rangle = 0$, because $\hat\beta_k|0\rangle = 0$. Finally, $\langle 0|\hat\alpha^\dagger_k\hat\beta^\dagger_k|0\rangle = (\langle 0|\hat\alpha_k\hat\beta_k|0\rangle)^\dagger = 0$, and only $2c_2^2$ is left. It can be expressed using $2c_1c_2 = g_k(c_1^2 + c_2^2) = g_k(2c_2^2 + 1)$ of Eq. (5.59). Then, Eq. (5.61) leads to
$$2c_2^2 - 2c_1c_2g_k = -1 + \sqrt{1 - g_k^2} \;\Rightarrow\; 2c_2^2 - g_k^2\,(2c_2^2 + 1) = -1 + \sqrt{1 - g_k^2},$$
$$2c_2^2\,(1 - g_k^2) = -(1 - g_k^2) + \sqrt{1 - g_k^2} \;\Rightarrow\; 2c_2^2 = -1 + \frac{1}{\sqrt{1 - g_k^2}}.$$
This brings $M_{\rm st}$ to the tractable form
$$M_{\rm st} = S - \frac{1}{2N}\sum_k\left(-1 + \frac{1}{\sqrt{1 - g_k^2}}\right). \tag{5.67}$$

Here, the first term is the classical value for the Néel state (it's $S$, because we get $NS - (-NS)$ from the summation). The second term is the spin-wave correction, which is positive and reduces the staggered magnetization with respect to its classical value. The reduction is due to quantum fluctuations in the ordered state. In other words, the presence of $\hat a_k$ and $\hat b_k$ excitations in the state $|0\rangle$ reduces the staggered magnetization, an effect that is well documented experimentally.
In a simple cubic antiferromagnet, $\Delta M = S - M_{\rm st} \simeq 0.078$. In 2D, the effect becomes more pronounced, and $\Delta M \simeq 0.197$ (for $S = \tfrac12$, the ordered moment is reduced by 40% due to quantum fluctuations). Lastly, in the 1D case we run into trouble known as the infrared catastrophe, which is further discussed in Sec. 5.4.5.
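The zero-temperature reduction $\Delta M$ of Eq. (5.67) can be estimated in the same way as a lattice sum. The sketch below assumes $a = 1$, the nearest-neighbor factor $g_k = \frac{1}{D}\sum_\alpha\cos k_\alpha$ for the square and simple cubic lattices, and a midpoint grid that never hits the points where $g_k = \pm 1$; the function name `delta_m` and the grid sizes are choices made for this example. Because the integrand has an integrable $1/k$ singularity, convergence with the grid size is slow in 2D, so the last digit should not be trusted.

```python
import numpy as np

def delta_m(dim, n):
    """Zero-temperature reduction S - M_st of Eq. (5.67) on an n^dim midpoint grid."""
    k = 2.0 * np.pi * (np.arange(n) + 0.5) / n           # grid avoids k = 0 and k = pi
    grids = np.meshgrid(*([k] * dim), indexing="ij")
    g = sum(np.cos(q) for q in grids) / dim              # geometrical factor g_k
    return float(0.5 * np.mean(1.0 / np.sqrt(1.0 - g**2) - 1.0))

print("2D:", round(delta_m(2, 1024), 3))
print("3D:", round(delta_m(3, 64), 3))
```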
Finite temperatures render other terms of Eq. (5.66) non-zero. Specifically, $\langle\hat\alpha^\dagger_k\hat\alpha_k\rangle$ and $\langle\hat\beta^\dagger_k\hat\beta_k\rangle$ are the numbers of bosons $n_k$ given by the Bose-Einstein distribution, whereas the off-diagonal terms like $\hat\alpha^\dagger_k\hat\beta^\dagger_k$ vanish, because the $\hat\alpha_k$ and $\hat\beta_k$ excitations occur independently. Then,
$$M_{\rm st}(T) = M_{\rm st}(0) - \frac{1}{N}\sum_k\frac{c_1^2 + c_2^2}{e^{\hbar\omega_k/(k_B T)} - 1} = M_{\rm st}(0) - \frac{1}{N}\sum_k\frac{1}{\sqrt{1 - g_k^2}}\,\frac{1}{e^{\hbar\omega_k/(k_B T)} - 1}, \tag{5.68}$$
where we used $c_1^2 + c_2^2 = 2c_2^2 + 1 = 1/\sqrt{1 - g_k^2}$.
At low temperatures and energies, where $\omega_k \simeq g k$ and $\sqrt{1 - g_k^2} \sim k$ [Eq. (5.57)], the result can be assessed similar to the magnetization of a ferromagnet in Sec. 5.2.5,
$$M_{\rm st}(T) = M_{\rm st}(0) - \frac{A}{N}\,V\int\frac{d^3k}{(2\pi)^3}\,\frac{1}{k}\,\frac{1}{e^{\hbar g k/(k_B T)} - 1},$$
and we use the prefactor $A$ to account for the coefficient in the linear dependence of $\sqrt{1 - g_k^2}$ on $k$. With $d^3k = 4\pi k^2\,dk$ and $y = k/(k_B T)$, one takes temperature out of the integral and arrives at
$$M_{\rm st}(T) = M_{\rm st}(0) - \frac{A}{N}\,\frac{V}{2\pi^2}\,(k_B T)^2\int\frac{y\,dy}{e^{\hbar g y} - 1}, \tag{5.69}$$
the quadratic reduction of the staggered magnetization at finite temperatures. That was the 3D case. The behavior in 2D and 1D is different, as we shall see below.

5.4.5 Low-dimensional magnets and Mermin-Wagner theorem


Linear spin-wave theory can be applied to low-dimensional Heisenberg magnets, although it will lead to some strange results. We demonstrate this by analyzing the staggered magnetization in low dimensions. As before, we shall look at the low-$k$ and low-energy limit, because all problems appear around $k = 0$, where $\sqrt{1 - g_k^2} \sim k$.
1D antiferromagnets. In 1D, the zero-temperature result of Eq. (5.67) can be transformed into an integral as follows,
$$M_{\rm st} = S - \frac{a}{4\pi N}\int dk\left(-1 + \frac{1}{k}\right),$$
and the integration is over the first Brillouin zone, i.e., it should include $k = 0$. The first term is fine, but the second term is not. It yields $\ln k$, which diverges at $k \to 0$. Therefore, quantum fluctuations are so strong that they destroy the antiferromagnetic order completely and render $M_{\rm st} = 0$ (the divergent spin-wave correction does not mean that $M_{\rm st}$ diverges, but simply indicates the inapplicability of linear spin-wave theory in this case).
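The logarithmic divergence in 1D can be seen directly in the lattice sum of Eq. (5.67). The sketch below assumes a chain with $a = 1$ and $g_k = \cos k$, so that $1/\sqrt{1 - g_k^2} = 1/|\sin k|$, and evaluates the sum on a midpoint grid with an even number of points $n$ (an odd $n$ would place a grid point at $k = \pi$). The function name is illustrative. Each doubling of the chain length adds roughly the same amount to the correction, which is the signature of $\ln N$ growth; a short estimate of the sum suggests an increment close to $\ln 2/\pi$, although this number is not taken from the notes.

```python
import math

def delta_m_chain(n):
    """(1/2N) * sum_k (1/|sin k| - 1) on a midpoint grid with n points (n even, a = 1)."""
    total = 0.0
    for j in range(n):
        k = 2.0 * math.pi * (j + 0.5) / n
        total += 1.0 / abs(math.sin(k)) - 1.0
    return total / (2.0 * n)

previous = None
for n in (256, 512, 1024, 2048, 4096):
    value = delta_m_chain(n)
    step = None if previous is None else value - previous
    print(n, round(value, 4), step)      # the step stays nearly constant per doubling
    previous = value
```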
2D antiferromagnets. The zero-temperature result is fine here, because $d^2k = 2\pi k\,dk$, and this additional $k$ eliminates the divergence of the integrand in $M_{\rm st}(0)$. However, we have a problem at $T \neq 0$ while using Eq. (5.68),
$$M_{\rm st}(T) = M_{\rm st}(0) - \frac{2AV}{N}\int\frac{d^2k}{(2\pi)^2}\,\frac{1}{k}\,\frac{1}{e^{\hbar g k/(k_B T)} - 1} = M_{\rm st}(0) - \frac{2AV}{N\,2\pi}\int\frac{dk}{e^{\hbar g k/(k_B T)} - 1}.$$
This integral diverges at $k = 0$, because it also yields a logarithm per
$$\frac{1}{e^y - 1} = \frac{e^{-y}}{1 - e^{-y}} = \frac{d}{dy}\ln(1 - e^{-y}).$$
This way, $M_{\rm st}(0) \neq 0$, but the staggered magnetization should vanish at any finite temperature, and the magnetic order exists at $T = 0$ only.
Ferromagnets: we shall encounter exactly the same problem in 2D ferromagnets upon calculating the magnetization in the vein of Sec. 5.2.5,
$$M(T) = M(0)\left[1 - \frac{V}{NS}\int\frac{d^2k}{(2\pi)^2}\,\frac{1}{e^{\hbar g k^2/(k_B T)} - 1}\right] = M(0)\left[1 - \frac{V}{NS\,4\pi}\int\frac{d(k^2)}{e^{\hbar g k^2/(k_B T)} - 1}\right],$$
so the 2D ferromagnets lack any magnetic order at finite temperatures too. You can guess that the situation gets only worse in 1D ferromagnets, where we have an additional $1/k$ divergence of the integrand.
Altogether, no magnetic order exists in Heisenberg systems in 1D and 2D at any finite
temperature. This is a simplified statement of the general Mermin-Wagner theorem: in
a system with sufficiently short-range interactions, continuous symmetries can not be sponta-
neously broken at finite temperature in dimensions D ≤ 2. At zero temperature, magnetic
order exists in ferromagnets in any dimension, and in antiferromagnets in 2D but not in 1D.

5.5 Beyond linear spin-wave theory


Linear spin-wave theory is by no means exact. Its accuracy varies depending on the system in
question. Here, we shall juxtapose the effect of non-linear terms in ferromagnets and antiferro-
magnets, and elucidate yet another important difference between the two.
Ferromagnets feature one-magnon states as eigenstates of the exact Heisenberg Hamilto-
nian. We have basically shown this in Sec. 5.2.1, but shall give some further arguments here
using second quantization. We do it for the sake of eventual comparison with antiferromagnets,
where the solution in the vein of Sec. 5.2.1 is no longer possible.
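Suppose that $\hat a^\dagger_k|0\rangle$ is an eigenstate of $H$ with the classical energy $E_0$ plus magnon energy $\hbar\omega_k$. Then,
$$[H, \hat a^\dagger_k]|0\rangle = H(\hat a^\dagger_k|0\rangle) - \hat a^\dagger_k H|0\rangle = (E_0 + \hbar\omega_k)\,\hat a^\dagger_k|0\rangle - E_0\,\hat a^\dagger_k|0\rangle = \hbar\omega_k\,\hat a^\dagger_k|0\rangle. \tag{5.70}$$
Let's now check how different components of $H$ commute with $\hat a^\dagger_k$. The linear term $H_0$ yields
$$[H_0, \hat a^\dagger_k] = \sum_{k'}\hbar\omega_{k'}\,[\hat a^\dagger_{k'}\hat a_{k'}, \hat a^\dagger_k] = \hbar\omega_k\,\hat a^\dagger_k,$$
which is the result of Eq. (5.70). Consequently, $[H_1, \hat a^\dagger_k]|0\rangle = 0$, and we illustrate it below.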
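Consider, for example, the 4th-order term arising from $\hat S^-_p\hat S^+_q$,
$$[H^{-+}_{\rm 4th\ order}, \hat a^\dagger_k]|0\rangle = \frac{J}{2N^2}\sum_{k_1,k_2}\sum_{k_3,k_4}\sum_{p,\delta} e^{ik_1p}\,e^{i(k_2 - k_3 - k_4)(p + \delta)}\,[\hat a^\dagger_{k_1}\hat a^\dagger_{k_2}\hat a_{k_3}\hat a_{k_4}, \hat a^\dagger_k]|0\rangle +$$
$$+ \frac{J}{2N^2}\sum_{k_1,k_2}\sum_{k_3,k_4}\sum_{p,\delta} e^{i(k_1 + k_2 - k_3)p}\,e^{-ik_4(p + \delta)}\,[\hat a^\dagger_{k_1}\hat a^\dagger_{k_2}\hat a_{k_3}\hat a_{k_4}, \hat a^\dagger_k]|0\rangle,$$
where we assumed a uniform lattice with the interaction $J$ between the sites $p$ and $p + \delta$ and used the expressions for $\hat S^-_p$ and $\hat S^+_q$ from Eqs. (5.23) and (5.24). Let's look at the commutator,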

[â†k1 â†k2 âk3 âk4 , â†k ]|0i = â†k1 â†k2 âk3 (1 + â†k âk4 )|0i − â†k â†k1 â†k2 âk3 âk4 |0i = 0,

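because there is always an annihilation operator standing on the right and producing zero upon acting on the state $|0\rangle$. Therefore, the 4th-order contribution to $H_1$ does not affect the energy of the $\hat a^\dagger_k|0\rangle$ state, and the same holds true for any high-order term (show this explicitly, if you will).
Antiferromagnets produce a similar expression for $[H^{-+}_{\rm 4th\ order}, \hat a^\dagger_k]$. We shall again use Eq. (5.24) for $\hat S^-_{Ap}$ and take $\hat S^+_{Bq}$ to the third order as
$$\hat S^+_B = \sqrt{2S}\left(\hat b^\dagger - \frac{1}{4S}\,\hat b^\dagger\hat b^\dagger\hat b\right).$$
Then,
$$[H^{-+}_{\rm 4th\ order}, \hat a^\dagger_k]|0\rangle = -\frac{J}{2N^2}\sum_{k_1,k_2}\sum_{k_3,k_4}\sum_{p,\delta} e^{i(k_1 + k_2 - k_3)p}\,e^{ik_4(p + \delta)}\,[\hat a^\dagger_{k_1}\hat a^\dagger_{k_2}\hat a_{k_3}\hat b^\dagger_{k_4}, \hat a^\dagger_k]|0\rangle\, -$$
$$- \frac{J}{2N^2}\sum_{k_1,k_2}\sum_{k_3,k_4}\sum_{p,\delta} e^{ik_1p}\,e^{i(k_2 + k_3 - k_4)(p + \delta)}\,[\hat a^\dagger_{k_1}\hat b^\dagger_{k_2}\hat b^\dagger_{k_3}\hat b_{k_4}, \hat a^\dagger_k]|0\rangle.$$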

The second term is zero, because b̂k4 is there and commutes with any â operator. On the other
hand, the first term yields a non-zero component, which we obtain through the application of
commutation relations,

[â†k1 â†k2 âk3 b̂†k4 , â†k ]|0i = δk3 ,k â†k1 â†k2 b̂†k4 (1 + â†k âk3 )|0i = δk3 ,k â†k1 â†k2 b̂†k4 |0i

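Therefore,
$$[H^{-+}_{\rm 4th\ order}, \hat a^\dagger_k]|0\rangle = -\frac{J}{2N}\sum_{k_1,k_2}\sum_{k_3,k_4}\sum_\delta e^{ik_4\delta}\,\delta_{k_1 + k_2,\,k_3 - k_4}\,\delta_{k_3,k}\,\hat a^\dagger_{k_1}\hat a^\dagger_{k_2}\hat b^\dagger_{k_4}|0\rangle =$$
$$= -\frac{J}{2N}\sum_{k_1,k_2}\sum_\delta e^{i(k - k_1 - k_2)\delta}\,\hat a^\dagger_{k_1}\hat a^\dagger_{k_2}\hat b^\dagger_{k - k_1 - k_2}|0\rangle,$$
where we used the summation rule
$$\sum_p e^{i(k_1 + k_2 - k_3 + k_4)p} = N\,\delta_{k_1 + k_2,\,k_3 - k_4}.$$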

The meaning of this result is simpler than its mathematical form. A magnon âk sponta-
neously decays to produce three new magnons, âk1 + âk2 + b̂k−k1 −k2 , with the overall momentum
k conserved. In other words, an excitation within the A-sublattice breaks down into two ex-
citations within the A sublattice, which should be balanced by an additional excitation in the
B-sublattice.
More generally, we see that ferromagnetic magnons are stable entities. Once created, a single magnon will be preserved until it meets another magnon, and they interact. In contrast, antiferromagnetic magnons can spontaneously decay, which happens most often in low-dimensional and frustrated antiferromagnets and yields characteristic broad features in neutron scattering.

6 Hubbard model
6.1 Hubbard Hamiltonian
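Let's introduce the generic Hubbard Hamiltonian,
$$H = -\mu\sum_{j,\sigma}\hat n_{j\sigma} - t\sum_{i \neq j,\,\sigma}\hat c^\dagger_{j\sigma}\hat c_{i\sigma} + U\sum_j\hat n_{j\uparrow}\hat n_{j\downarrow}, \tag{6.1}$$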

where the first term is the chemical potential (number of electrons viz. band filling), the second
term stands for the hoppings between the lattice sites, and the third term is the on-site Coulomb
repulsion. This Hamiltonian is written for one band or, more precisely, for one orbital per atom
(which may still lead to several bands). Further orbitals can be included at the expense of
introducing several U -terms. In the following, we shall consider the one-orbital case only. We
shall also disregard the term with the chemical potential, because it is simply a constant unless
the band filling is varied.
k-dependent form: Fourier transforms of ĉjσ and ĉ†jσ modify the hopping term as follows,
X t X X X ik1 r −ik2 (r+δ) †
−t ĉ†jσ ĉiσ = − e e ĉk1 σ ĉk2 σ =
i,j,σ
N σ k1 ,k2 r,δ
!
t X X X X i(k1 −k2 )r −ik2 δ † XX X
=− e e ĉk1 σ ĉk2 σ = −t e−ikδ ĉ†kσ ĉkσ = εk ĉ†kσ ĉkσ ,
N σ k ,k δ r k,σ δ k,σ
1 2

where we did the same thing as in Sec. 5, namely, took the sum over each lattice site r and
its neighbors r + δ, and used Eq. (5.21). Similar transformations applied to the first term of
Eq. (6.1) yield
\[
H = H_t + H_U = \sum_{k,\sigma} (\varepsilon_k - \mu)\, \hat n_{k\sigma} + U \sum_{j} \hat n_{j\uparrow}\hat n_{j\downarrow}. \tag{6.2}
\]

Band energy $\varepsilon_k$ is represented by a sum of exponentials or, in a simple $D$-dimensional
lattice, where $\delta$ are lattice directions (with $\delta$ equivalent to $-\delta$), by a sum of cosines,

\[
\varepsilon_k = -t \sum_{\delta} e^{-ik\delta} = -2t \sum_{\alpha=1}^{D} \cos k_\alpha a. \tag{6.3}
\]

In general, band structure (experimental or calculated) should be represented by a sum of
cosine functions to extract the values of $t$ (tight-binding representation).
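As a quick numerical sanity check of Eq. (6.3), the following sketch sums the exponentials over the nearest-neighbour vectors of a simple cubic lattice and compares the result with the cosine form. The values of `t`, `a`, the dimension and the random k-points are illustrative assumptions, not taken from the notes.

```python
import numpy as np

t, a, dim = 1.0, 1.0, 3          # hopping, lattice constant, dimension (assumed values)

# Nearest-neighbour vectors of a simple cubic lattice: +a and -a along each axis
deltas = np.vstack([a * np.eye(dim), -a * np.eye(dim)])

rng = np.random.default_rng(0)
k_points = rng.uniform(-np.pi / a, np.pi / a, size=(5, dim))

# Left-hand side of Eq. (6.3): -t * sum over delta of exp(-i k.delta)
eps_exp = -t * np.exp(-1j * k_points @ deltas.T).sum(axis=1)

# Right-hand side of Eq. (6.3): -2t * sum over alpha of cos(k_alpha a)
eps_cos = -2.0 * t * np.cos(k_points * a).sum(axis=1)

print("largest deviation:", np.max(np.abs(eps_exp - eps_cos)))
```

The imaginary parts cancel because every $\delta$ comes with its partner $-\delta$, which is why the band energy is real.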
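Relation to spin operators: spin operators (for spin-$\tfrac12$, as we have not more than one unpaired
electron per site here) are directly linked to electron creation and annihilation operators
via
\[
\hat S^+ = \hat c^\dagger_\uparrow \hat c_\downarrow, \qquad \hat S^- = \hat c^\dagger_\downarrow \hat c_\uparrow, \qquad \hat S^z = \tfrac12 (\hat n_\uparrow - \hat n_\downarrow). \tag{6.4}
\]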
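We can verify that the commutation relation for spin operators holds, namely,

\[
[\hat S^+, \hat S^-] = \hat c^\dagger_\uparrow \hat c_\downarrow \hat c^\dagger_\downarrow \hat c_\uparrow - \hat c^\dagger_\downarrow \hat c_\uparrow \hat c^\dagger_\uparrow \hat c_\downarrow
= \hat n_\uparrow \hat c_\downarrow \hat c^\dagger_\downarrow - \hat n_\downarrow \hat c_\uparrow \hat c^\dagger_\uparrow
= \hat n_\uparrow (1 - \hat n_\downarrow) - \hat n_\downarrow (1 - \hat n_\uparrow) = 2\hat S^z,
\]

where $\hat c_\sigma \hat c^\dagger_\sigma = 1 - \hat c^\dagger_\sigma \hat c_\sigma = 1 - \hat n_\sigma$ via the fundamental (anti)commutation relation of Eq. (D.8).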
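Rotational spin invariance: the interaction term $H_U$ has some hidden symmetry and
depends on the total spin at site $j$ only. To demonstrate this, consider

\[
\hat n_\uparrow \hat n_\downarrow = \hat c^\dagger_\uparrow \hat c_\uparrow \hat c^\dagger_\downarrow \hat c_\downarrow = \hat c^\dagger_\uparrow \hat c_\uparrow (1 - \hat c_\downarrow \hat c^\dagger_\downarrow) = \hat n_\uparrow - \hat S^+ \hat S^-.
\]

Likewise,
\[
\hat n_\uparrow \hat n_\downarrow = \hat n_\downarrow - \hat S^- \hat S^+.
\]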
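Finally, we can take the square of $\hat S^z$ represented via the particle number operators,

\[
4(\hat S^z)^2 = \hat n_\uparrow^2 + \hat n_\downarrow^2 - 2\hat n_\uparrow \hat n_\downarrow \quad\Rightarrow\quad \hat n_\uparrow \hat n_\downarrow = \frac{\hat n}{2} - 2(\hat S^z)^2,
\]
where we used $\hat n = \hat n_\uparrow + \hat n_\downarrow$ and noticed that $\hat n_\uparrow^2 = \hat n_\uparrow$ (viz. $\hat n_\downarrow^2 = \hat n_\downarrow$), as the occupation number
is 0 or 1.
By combining three different representations of $\hat n_\uparrow \hat n_\downarrow$, we arrive at the square of the spin
operator, $\hat{\mathbf S}^2 = (\hat S^z)^2 + \tfrac12 \hat S^+ \hat S^- + \tfrac12 \hat S^- \hat S^+$,

\[
\hat n_\uparrow \hat n_\downarrow = \frac{\hat n}{2} - \frac{2}{3}\, \hat{\mathbf S}^2. \tag{6.5}
\]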
Two important consequences follow. First, the Hubbard Hamiltonian is invariant with respect
to the choice of the quantization axis, i.e., it does not matter what we call spin-up and spin-down.
Second, the interaction term $H_U$ favors magnetism: the doubly-occupied
states ($n = 2$, $S^2 = 0$) are higher in energy than the singly-occupied states ($n = 1$, $S^2 = \tfrac34$),
which carry a local magnetic moment.
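The single-site identities above are easy to check numerically. The sketch below represents $\hat c_\uparrow$ and $\hat c_\downarrow$ as 4×4 matrices (a Jordan-Wigner construction, which is an assumption of this example and not part of the notes) and verifies both the spin commutator and Eq. (6.5).

```python
import numpy as np

# Single-mode annihilation operator and the fermionic sign factor
a = np.array([[0.0, 1.0], [0.0, 0.0]])
z = np.diag([1.0, -1.0])

# Two modes on one site; basis index = 2 * n_up + n_down
c_up = np.kron(a, np.eye(2))
c_dn = np.kron(z, a)            # the sign factor enforces anticommutation with c_up

n_up = c_up.T @ c_up
n_dn = c_dn.T @ c_dn

# Spin operators of Eq. (6.4); the matrices are real, so .T is the Hermitian conjugate
s_plus = c_up.T @ c_dn
s_minus = c_dn.T @ c_up
s_z = 0.5 * (n_up - n_dn)
s_sq = s_z @ s_z + 0.5 * (s_plus @ s_minus + s_minus @ s_plus)

commutator = s_plus @ s_minus - s_minus @ s_plus
double_occ = n_up @ n_dn
eq_6_5 = 0.5 * (n_up + n_dn) - (2.0 / 3.0) * s_sq

print("[S+, S-] = 2 Sz :", np.allclose(commutator, 2.0 * s_z))
print("Eq. (6.5) holds :", np.allclose(double_occ, eq_6_5))
```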

6.2 t − J model
In the $t \gg U$ limit, the system is metallic and not very different from a simple metal, the case
that does not really interest us at the moment. We rather want to know what happens in the
$t \ll U$ limit, where the $H_U$ term favors magnetism. This understanding is achieved through
a rather lengthy transformation toward the so-called $t-J$ model.

6.2.1 Classification of hopping events


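For each atom, the possible states are

\[
|0\rangle, \qquad |{\uparrow}\rangle = \hat c^\dagger_\uparrow |0\rangle, \qquad |{\downarrow}\rangle = \hat c^\dagger_\downarrow |0\rangle, \qquad |d\rangle = \hat c^\dagger_\uparrow \hat c^\dagger_\downarrow |0\rangle. \tag{6.6}
\]

One can easily verify that the following operators

\[
\hat P_0 = (1 - \hat n_\uparrow)(1 - \hat n_\downarrow), \quad \hat P_\uparrow = \hat n_\uparrow (1 - \hat n_\downarrow), \quad \hat P_\downarrow = \hat n_\downarrow (1 - \hat n_\uparrow), \quad \hat P_d = \hat n_\uparrow \hat n_\downarrow \tag{6.7}
\]

act as projection operators (i.e., they yield 1 for the given state and 0 for all other states),
and the completeness condition

\[
\hat P_0 + \hat P_\uparrow + \hat P_\downarrow + \hat P_d = 1
\]

is fulfilled.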
We know that the state $|d\rangle$ is unlikely to form in the $t \ll U$ limit. Therefore, we would like
to transform the Hubbard Hamiltonian in such a way that the states of type $|d\rangle$ are separated
from the rest. This idea underlies the classification of hopping events into several groups,
depending on whether they create the $|d\rangle$ states or not.
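Hoppings that do not involve the $|d\rangle$ states are those that transfer an unpaired electron
to an empty site. The relevant part of the Hamiltonian can be singled out using the projection
operators $\hat P$. It is enough to apply one projector for the initial state (e.g., on site $i$) and one
projector for the final state (on site $j$ then), although you can apply two projectors each time,
which will lead to the same result. For example, the transfer of a spin-up electron from site $i$
to site $j$ corresponds to

\[
\hat P_{j\uparrow}\, \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow}\, \hat P_{i\uparrow} = \hat n_{j\uparrow}(1 - \hat n_{j\downarrow})\, \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow}\, \hat n_{i\uparrow}(1 - \hat n_{i\downarrow}) = (1 - \hat n_{j\downarrow})\, \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow}\, (1 - \hat n_{i\downarrow}),
\]

where we drop $\hat n_{j\uparrow}$ and $\hat n_{i\uparrow}$ because the former yields 1 for any state created by $\hat c^\dagger_{j\uparrow}$, whereas the
latter yields 1 for any state that will eventually produce a non-zero result upon the action with
$\hat c_{i\uparrow}$. In contrast, we have to keep the $\hat n_\downarrow$ operators, as $\hat c_\uparrow$ and $\hat c^\dagger_\uparrow$ don't tell us anything about the
down-spins. If you want to show this mathematically, consider, for example,

\[
\hat c_{i\uparrow} \hat n_{i\uparrow} = \hat c_{i\uparrow} \hat c^\dagger_{i\uparrow} \hat c_{i\uparrow} = (1 - \hat c^\dagger_{i\uparrow} \hat c_{i\uparrow})\, \hat c_{i\uparrow} = \hat c_{i\uparrow},
\]

where the second term vanishes because $\hat c_{i\uparrow}^2 = 0$.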


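Altogether, the relevant part of $H_t$ reads as
\[
H_t^0 = -t \sum_{i\neq j,\,\sigma} (1 - \hat n_{j\bar\sigma})\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma}\, (1 - \hat n_{i\bar\sigma}), \tag{6.8}
\]

where $\bar\sigma = -\sigma$.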
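Hopping events that create the $|d\rangle$ states are those where an electron hops from site
$i$ to site $j$ that already contains an electron. Consider for example the spin-up electron on site
$i$ that hops to site $j$ with the spin-down electron. This way, the $d$-state is produced on site $j$,
and the projection reads as

\[
\hat P_{jd}\, \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow}\, \hat P_{i\uparrow} = \hat n_{j\uparrow} \hat n_{j\downarrow}\, \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow}\, \hat n_{i\uparrow} (1 - \hat n_{i\downarrow}) = \hat n_{j\downarrow}\, \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow}\, (1 - \hat n_{i\downarrow})
\]

following the same arguments as above. Altogether, the relevant part of the Hamiltonian becomes
\[
H_t^+ = -t \sum_{i\neq j,\,\sigma} \hat n_{j\bar\sigma}\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma}\, (1 - \hat n_{i\bar\sigma}). \tag{6.9}
\]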

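Hopping events that destroy the $|d\rangle$ states are similar to the above,
\[
H_t^- = -t \sum_{i\neq j,\,\sigma} (1 - \hat n_{j\bar\sigma})\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma}\, \hat n_{i\bar\sigma}. \tag{6.10}
\]

Lastly, the hopping events that preserve the $|d\rangle$ states (and simply move them around)
yield
\[
H_t^d = -t \sum_{i\neq j,\,\sigma} \hat n_{j\bar\sigma}\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma}\, \hat n_{i\bar\sigma}. \tag{6.11}
\]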

It is straightforward to verify by a direct summation that

\[
H_t = H_t^0 + H_t^+ + H_t^- + H_t^d. \tag{6.12}
\]

6.2.2 Canonical transformation


Schematically, the matrix of the Hubbard Hamiltonian can be represented in the form

\[
H = \begin{pmatrix} H_t^0 & H_t^+ \\ H_t^- & H_t^d + H_U \end{pmatrix},
\]

where the first line corresponds to the states with not more than one electron per site ($|0\rangle$, $|{\uparrow}\rangle$,
and $|{\downarrow}\rangle$), and the second line stands for the $|d\rangle$ states. Our intention is to transform $H$ in such
a way that the upper left and bottom right cells are singled out, or, in other words, the action
of $H_t^+$ and $H_t^-$ becomes null.
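Consider an arbitrary unitary transformation
\[
\begin{aligned}
H_{\rm eff} = e^{i\hat T} H e^{-i\hat T} &= H + i[\hat T, H] + \frac{i^2}{2} [\hat T, [\hat T, H]] + \ldots \simeq \\
&\simeq (H_t^0 + H_t^d + H_U) + (H_t^+ + H_t^-) + i[\hat T, H_U] + i[\hat T, H_t^0 + H_t^d] + i[\hat T, H_t^+ + H_t^-] + \frac{i^2}{2} [\hat T, [\hat T, H]].
\end{aligned}
\]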
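Elimination of $H_t^+ + H_t^-$. We shall now try to find $\hat T$ that cancels $(H_t^+ + H_t^-)$ via the
$i[\hat T, H_U]$ term. To this end, we compute
\[
\begin{aligned}
[H_t^+, H_U] &= -tU \sum_{i\neq j,\,k} \sum_{\sigma} \left[ \hat n_{j\bar\sigma}\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma}\, (1 - \hat n_{i\bar\sigma}),\ \hat n_{k\uparrow} \hat n_{k\downarrow} \right] = \\
&= -tU \sum_{i\neq j,\,k} \sum_{\sigma} \left( \hat n_{j\bar\sigma} \hat n_{k\bar\sigma}\, [\hat c^\dagger_{j\sigma}, \hat n_{k\sigma}]\, \hat c_{i\sigma} (1 - \hat n_{i\bar\sigma}) + \hat n_{j\bar\sigma}\, \hat c^\dagger_{j\sigma} (1 - \hat n_{i\bar\sigma})\, \hat n_{k\bar\sigma}\, [\hat c_{i\sigma}, \hat n_{k\sigma}] \right),
\end{aligned}
\]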

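where the $H_U$ part may not commute with either $\hat c^\dagger_{j\sigma}$ or $\hat c_{i\sigma}$. We can now consider

\[
[\hat c^\dagger, \hat n] = \hat c^\dagger \hat c^\dagger \hat c - \hat c^\dagger \hat c\, \hat c^\dagger = -\hat c^\dagger (1 - \hat c^\dagger \hat c) = -\hat c^\dagger, \qquad [\hat c, \hat n] = \hat c.
\]

Therefore, $[\hat c^\dagger_{j\sigma}, \hat n_{k\sigma}] = -\delta_{jk}\, \hat c^\dagger_{j\sigma}$ and $[\hat c_{i\sigma}, \hat n_{k\sigma}] = \delta_{ik}\, \hat c_{i\sigma}$, such that
\[
[H_t^+, H_U] = -tU \sum_{i\neq j,\,\sigma} \left( -\hat n_{j\bar\sigma}^2\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma} (1 - \hat n_{i\bar\sigma}) + \hat n_{j\bar\sigma}\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma} (1 - \hat n_{i\bar\sigma})\, \hat n_{i\bar\sigma} \right) = -U H_t^+,
\]

where the second term does not survive, because $(1 - \hat n_{i\bar\sigma})\, \hat n_{i\bar\sigma} = \hat n_{i\bar\sigma} - \hat n_{i\bar\sigma}^2 = \hat n_{i\bar\sigma} - \hat n_{i\bar\sigma} = 0$, whereas
$\hat n_{j\bar\sigma}^2 = \hat n_{j\bar\sigma}$ in the first term.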
In the case of $H_t^-$, the second term will survive, and the first term won't. Eventually,
$[H_t^-, H_U] = U H_t^-$. It is then easy to verify that
\[
\hat T = -\frac{i}{U} (H_t^+ - H_t^-) \tag{6.13}
\]
does the job and cancels the respective term in $H_{\rm eff}$, because
\[
i[\hat T, H_U] = \frac{1}{U} [H_t^+ - H_t^-, H_U] = -(H_t^+ + H_t^-).
\]
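The operator identities of this subsection can be checked on the smallest possible lattice. The sketch below builds the two-site Hubbard model as 16×16 matrices and verifies Eq. (6.12), the commutators $[H_t^\pm, H_U] = \mp U H_t^\pm$, and the cancellation achieved by Eq. (6.13). The Jordan-Wigner helper `fermion_ops` and the values of `t` and `U` are assumptions made for this illustration.

```python
import numpy as np

def fermion_ops(n_modes):
    """Annihilation operators as real matrices (Jordan-Wigner construction)."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])   # removes a particle from one mode
    z = np.diag([1.0, -1.0])                  # fermionic sign string
    ops = []
    for m in range(n_modes):
        op = np.eye(1)
        for factor in [z] * m + [a] + [np.eye(2)] * (n_modes - m - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops

t, U = 1.0, 8.0                               # assumed illustrative values
c = fermion_ops(4)                            # mode index = 2 * site + spin (0 = up, 1 = down)
n = [op.T @ op for op in c]
one = np.eye(16)

def hop(j, i, s, left, right):
    """-t * left * c^dagger_{j,s} c_{i,s} * right"""
    return -t * left @ c[2 * j + s].T @ c[2 * i + s] @ right

H_t, H_0, H_plus, H_minus, H_d = (np.zeros((16, 16)) for _ in range(5))
for i, j in [(0, 1), (1, 0)]:
    for s in (0, 1):
        n_j, n_i = n[2 * j + 1 - s], n[2 * i + 1 - s]   # opposite-spin occupations
        H_t += hop(j, i, s, one, one)
        H_0 += hop(j, i, s, one - n_j, one - n_i)        # Eq. (6.8)
        H_plus += hop(j, i, s, n_j, one - n_i)           # Eq. (6.9)
        H_minus += hop(j, i, s, one - n_j, n_i)          # Eq. (6.10)
        H_d += hop(j, i, s, n_j, n_i)                    # Eq. (6.11)

H_U = U * (n[0] @ n[1] + n[2] @ n[3])

def comm(A, B):
    return A @ B - B @ A

T = -1j / U * (H_plus - H_minus)              # Eq. (6.13)

print("Eq. (6.12):", np.allclose(H_t, H_0 + H_plus + H_minus + H_d))
print("[H+, HU] = -U H+:", np.allclose(comm(H_plus, H_U), -U * H_plus))
print("[H-, HU] = +U H-:", np.allclose(comm(H_minus, H_U), U * H_minus))
print("i[T, HU] = -(H+ + H-):", np.allclose(1j * comm(T, H_U), -(H_plus + H_minus)))
```

The two commutators simply express that $H_t^+$ raises and $H_t^-$ lowers the number of doubly-occupied sites by one, so they shift the eigenvalue of $H_U$ by $\pm U$.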
Before we go further, let's note that $H_t^+$ and $H_t^-$ are of the order of $t$, so $\hat T$ is of the order
of $t/U$. By inspecting the commutators within the expansion of $H_{\rm eff}$, we can notice that the
lowest-order corrections to the "basic" part $(H_t^0 + H_t^d + H_U)$ are of the order of $t^2/U$, which
is the physics of the $t-J$ model that we seek to establish. Further terms are of the order of
$t^3/U^2$, etc.
What's left? Since $\hat T$ includes $H_t^+$ and $H_t^-$, its commutators contain the unwanted
contributions that admix the $|d\rangle$ states. For example,
\[
i[\hat T, H_t^+ + H_t^-] = \frac{1}{U} [H_t^+ - H_t^-, H_t^+ + H_t^-] = \frac{2}{U} [H_t^+, H_t^-] \tag{6.14}
\]
is of the order of $t^2/U$ and should be retained. As for the double commutator, its $[\hat T, [\hat T, H_t]]$
part is of the order of $t^3/U^2$ and can be neglected, but we still have to consider the other part,

\[
\frac{i^2}{2} [\hat T, [\hat T, H_U]] = -\frac{i}{2} [\hat T, H_t^+ + H_t^-] = -\frac{1}{U} [H_t^+, H_t^-], \tag{6.15}
\]
which nicely combines with the previous one.
We also have to take care of $[\hat T, H_t^0 + H_t^d]$, but in fact this term can be eliminated by
adjusting our transformation to
\[
\hat{\tilde T} = \hat T + \hat T_1 = -\frac{i}{U} (H_t^+ - H_t^-) + \hat T_1, \tag{6.16}
\]

where $\hat T_1$ is chosen in such a way that

\[
i[\hat T_1, H_U] = -i[\hat T, H_t^0 + H_t^d] = -\frac{1}{U} [H_t^+ - H_t^-, H_t^0 + H_t^d].
\]

The expression on the right-hand side is of the order of $t^2/U$, whereas $\hat T_1$ should be of the
order of $(t/U)^2$, so it won't bring any additional terms in the lowest order. The commutators
$[\hat T_1, H_t^+ + H_t^-]$ and $[\hat T_1, H_t^0 + H_t^d]$ are of the order of $t^3/U^2$ and can be neglected.

Altogether, using $\hat{\tilde T}$ instead of $\hat T$ and Eqs. (6.14) and (6.15), we arrive at
\[
H_{\rm eff} = e^{i\hat{\tilde T}} H e^{-i\hat{\tilde T}} = H_t^0 + H_t^d + H_U + \frac{1}{U} [H_t^+, H_t^-] + O\!\left( \frac{t^3}{U^2} \right). \tag{6.17}
\]

The commutator $[H_t^+, H_t^-]$ can be evaluated in a straightforward way, but we shall introduce
a special algebra of Hubbard operators that renders the calculation more intuitive.

6.2.3 Hubbard operators


The Hubbard operators are projected versions of the creation and annihilation operators. Suppose
that we want to create the state $|{\uparrow}\rangle$ and make sure that no spin-down electron is present
at the same site. Then we should use

\[
\hat X^{\uparrow\leftarrow 0} = \hat P_\uparrow\, \hat c^\dagger_\uparrow = \hat n_\uparrow (1 - \hat n_\downarrow)\, \hat c^\dagger_\uparrow = (1 - \hat n_\downarrow)\, \hat c^\dagger_\uparrow, \tag{6.18}
\]

where we can drop $\hat n_\uparrow$ because it always yields 1 for a state acted upon by $\hat c^\dagger_\uparrow$. Incidentally, we
note that exactly this combination of $\hat c^\dagger_\uparrow$ and $\hat n_\downarrow$ enters our expression for $H_t^0$ in Eq. (6.8). So
we could conveniently write and analyze the hopping terms via Hubbard operators, should we
know how to deal with them in the first place.
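Properties of the Hubbard operators are easier to understand if we think of these
operators as projectors of some kind. In general,

\[
\hat X^{b\leftarrow a} = |b\rangle\langle a|, \qquad \hat P_\alpha = |\alpha\rangle\langle\alpha| = \hat X^{\alpha\leftarrow\alpha}. \tag{6.19}
\]

Then it becomes obvious that

\[
(\hat X^{b\leftarrow a})^\dagger = |a\rangle\langle b| = \hat X^{a\leftarrow b} \tag{6.20}
\]
and
\[
\hat X^{b\leftarrow a} \hat X^{d\leftarrow c} = |b\rangle\langle a|d\rangle\langle c| = \delta_{ad}\, |b\rangle\langle c| = \delta_{ad}\, \hat X^{b\leftarrow c}. \tag{6.21}
\]
This facilitates the computation of commutators,

\[
[\hat X_i^{b\leftarrow a}, \hat X_j^{d\leftarrow c}] = \delta_{ij} \left( \delta_{ad}\, \hat X_j^{b\leftarrow c} - \delta_{bc}\, \hat X_j^{d\leftarrow a} \right). \tag{6.22}
\]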

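Hubbard operators of $H_t$: we shall need

\[
\hat X^{\sigma\leftarrow 0} = (1 - \hat n_{\bar\sigma})\, \hat c^\dagger_\sigma, \qquad \hat X^{0\leftarrow\sigma} = \hat c_\sigma (1 - \hat n_{\bar\sigma}),
\]

which is a simple generalization of Eq. (6.18). Things get more complicated when the $|d\rangle$ state
comes into play. Namely,
\[
\hat X^{d\leftarrow\downarrow} = \hat P_d\, \hat c^\dagger_\uparrow = \hat n_\downarrow\, \hat c^\dagger_\uparrow, \qquad \hat X^{d\leftarrow\uparrow} = -\hat n_\uparrow\, \hat c^\dagger_\downarrow,
\]
where the minus sign appears in the second part, because $|d\rangle = |{\uparrow\downarrow}\rangle = \hat c^\dagger_\uparrow \hat c^\dagger_\downarrow |0\rangle$, but $\hat c^\dagger_\downarrow |{\uparrow}\rangle =
\hat c^\dagger_\downarrow \hat c^\dagger_\uparrow |0\rangle = |{\downarrow\uparrow}\rangle = -|d\rangle$ following the antisymmetric nature of the fermionic wavefunction with
respect to permutations. This can be generalized in the form
\[
\hat X^{d\leftarrow\bar\sigma} = \eta(\sigma)\, \hat n_{\bar\sigma}\, \hat c^\dagger_\sigma, \qquad \hat X^{\bar\sigma\leftarrow d} = \eta(\sigma)\, \hat c_\sigma\, \hat n_{\bar\sigma}, \qquad
\eta(\sigma) = \begin{cases} +1, & \sigma = \uparrow \\ -1, & \sigma = \downarrow \end{cases}
\]
The hopping terms are written via the Hubbard operators as follows,
\[
H_t^0 = -t \sum_{i\neq j,\,\sigma} \hat X_j^{\sigma\leftarrow 0} \hat X_i^{0\leftarrow\sigma}, \qquad H_t^+ = -t \sum_{i\neq j,\,\sigma} \eta(\sigma)\, \hat X_j^{d\leftarrow\bar\sigma} \hat X_i^{0\leftarrow\sigma}, \tag{6.23}
\]
\[
H_t^d = -t \sum_{i\neq j,\,\sigma} \hat X_j^{d\leftarrow\bar\sigma} \hat X_i^{\bar\sigma\leftarrow d}, \qquad H_t^- = -t \sum_{i\neq j,\,\sigma} \eta(\sigma)\, \hat X_j^{\sigma\leftarrow 0} \hat X_i^{\bar\sigma\leftarrow d}. \tag{6.24}
\]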
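The single-site Hubbard operators can be cross-checked against their fermionic expressions, including the sign convention for $|d\rangle$. In the sketch below, the basis ordering, the state labels and the Jordan-Wigner matrices are assumptions of this example; the checks themselves follow Eqs. (6.18) to (6.21).

```python
import numpy as np

a = np.array([[0.0, 1.0], [0.0, 0.0]])
z = np.diag([1.0, -1.0])
c_up = np.kron(a, np.eye(2))          # basis index = 2 * n_up + n_down
c_dn = np.kron(z, a)
n_up, n_dn = c_up.T @ c_up, c_dn.T @ c_dn
one = np.eye(4)

# With this ordering: |0> -> 0, |down> -> 1, |up> -> 2, |d> = c_up^+ c_dn^+ |0> -> 3
states = ["0", "dn", "up", "d"]
ket = {s: np.eye(4)[:, k] for k, s in enumerate(states)}

def X(b, a_):
    """Hubbard operator X^{b <- a} = |b><a|, Eq. (6.19)."""
    return np.outer(ket[b], ket[a_])

checks = {
    "X^{up<-0} = (1 - n_dn) c_up^+": np.allclose(X("up", "0"), (one - n_dn) @ c_up.T),
    "X^{d<-dn} = n_dn c_up^+": np.allclose(X("d", "dn"), n_dn @ c_up.T),
    "X^{d<-up} = -n_up c_dn^+": np.allclose(X("d", "up"), -n_up @ c_dn.T),
    "X^{up<-dn} = S^+": np.allclose(X("up", "dn"), c_up.T @ c_dn),
}

# Multiplication rule of Eq. (6.21) for all 4^4 combinations of labels
rule_ok = all(
    np.allclose(X(b, a_) @ X(d, c_), (a_ == d) * X(b, c_))
    for b in states for a_ in states for d in states for c_ in states
)

for name, ok in checks.items():
    print(name, ":", ok)
print("Eq. (6.21):", rule_ok)
```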

6.2.4 Two-site and three-site terms


Both $H_t^+$ and $H_t^-$ consist of individual terms related to two sites. When the term of $H_t^+$ acts
on the sites $i$ and $j$, the term of $H_t^-$ acts on two other sites, $k$ and $l$, and none of these four
sites match, the result is simply zero. So we only need to consider the two-site and three-site
contributions to each commutator,
\[
[H_t^+, H_t^-] = \sum_{i\neq j} [H^+_{t,ij}, H^-_{t,ij}] + \sum_{i\neq j} [H^+_{t,ij}, H^-_{t,ji}] + \sum_{i\neq j\neq k} [H^+_{t,\langle ij\rangle}, H^-_{t,\langle jk\rangle}],
\]

where the first two terms are two-site processes, the third term stands for three-site processes,
and we use angular brackets to denote all possible combinations, e.g., $\langle ij\rangle = \{ij, ji\}$.
Two-site terms can be easily analyzed qualitatively. By looking at the action of $H_t^+$ and
$H_t^-$, one infers that $H_t^+ H_t^-$ is active for the $ij-ij$ pairs, whereas $H_t^- H_t^+$ is active for the $ij-ji$
pairs. The former transfers the $|d\rangle$ states between sites $i$ and $j$, while the latter exchanges
unpaired spins at the sites $i$ and $j$.
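We can also show this rigorously,
\[
\begin{aligned}
[H^+_{t,ij}, H^-_{t,ij}] &= t^2 \sum_{\sigma,\sigma'} \eta(\sigma)\eta(\sigma') \left( \hat X_j^{d\leftarrow\bar\sigma} \hat X_j^{\sigma'\leftarrow 0}\, \hat X_i^{0\leftarrow\sigma} \hat X_i^{\bar\sigma'\leftarrow d} - \hat X_j^{\sigma'\leftarrow 0} \hat X_j^{d\leftarrow\bar\sigma}\, \hat X_i^{\bar\sigma'\leftarrow d} \hat X_i^{0\leftarrow\sigma} \right) = \\
&= t^2 \sum_{\sigma,\sigma'} \delta_{\bar\sigma\sigma'}\, \eta(\sigma)\eta(\sigma')\, \hat X_j^{d\leftarrow 0} \hat X_i^{0\leftarrow d},
\end{aligned}
\]

which is the transfer of $|d\rangle$ states and belongs to the $d$-part of the Hamiltonian, which is of no
interest to us at the moment.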
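As for the $ij-ji$ term,
\[
[H^+_{t,ij}, H^-_{t,ji}] = t^2 \sum_{\sigma,\sigma'} \eta(\sigma)\eta(\sigma') \left( \hat X_j^{d\leftarrow\bar\sigma} \hat X_j^{\bar\sigma'\leftarrow d}\, \hat X_i^{0\leftarrow\sigma} \hat X_i^{\sigma'\leftarrow 0} - \hat X_j^{\bar\sigma'\leftarrow d} \hat X_j^{d\leftarrow\bar\sigma}\, \hat X_i^{\sigma'\leftarrow 0} \hat X_i^{0\leftarrow\sigma} \right).
\]

Its first part boils down to $\delta_{\sigma\sigma'} \hat P_{jd} \hat P_{i0}$ and can be discarded on the same grounds as above. On
the other hand, the second part describes the aforementioned spin-exchange process,
\[
\begin{aligned}
[H^+_{t,ij}, H^-_{t,ji}] &\to -t^2 \sum_{\sigma,\sigma'} \eta(\sigma)\eta(\sigma')\, \hat X_j^{\bar\sigma'\leftarrow\bar\sigma} \hat X_i^{\sigma'\leftarrow\sigma} = \\
&= -t^2 \hat X_j^{\uparrow\leftarrow\uparrow} \hat X_i^{\downarrow\leftarrow\downarrow} - t^2 \hat X_j^{\downarrow\leftarrow\downarrow} \hat X_i^{\uparrow\leftarrow\uparrow} + t^2 \hat X_j^{\uparrow\leftarrow\downarrow} \hat X_i^{\downarrow\leftarrow\uparrow} + t^2 \hat X_j^{\downarrow\leftarrow\uparrow} \hat X_i^{\uparrow\leftarrow\downarrow}.
\end{aligned}
\]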

The spin-flip operators have the meaning of the raising and lowering spin operators $\hat S^+$ and
$\hat S^-$, respectively. For example,

\[
\hat X^{\uparrow\leftarrow\downarrow} = \hat n_\uparrow (1 - \hat n_\downarrow)\, \hat c^\dagger_\uparrow \hat c_\downarrow = \hat c^\dagger_\uparrow \hat c_\downarrow = \hat S^+
\]

per Eq. (6.4), so the last two terms in the above commutator are $t^2 (\hat S_j^+ \hat S_i^- + \hat S_j^- \hat S_i^+)$. One then
expects that the first two terms combine into $\hat S_j^z \hat S_i^z$, and that's true but more difficult to show. Qualitatively,
these terms have no effect on the sites $i$ and $j$ with parallel spins, while lowering the energy for
antiparallel spins by $t^2$, so they act as an antiferromagnetic coupling that can be represented by
\[
2t^2 \left( \hat S_i^z \hat S_j^z - \frac{\hat n_i \hat n_j}{4} \right).
\]

Three-site terms are, expectedly, more involved and generate many contributions in general.
We can simplify the analysis using the fact that $H_t^+ H_t^-$ acts on a state containing at least
one $|d\rangle$ and yields another state with $|d\rangle$, so such an operator is not relevant for the low-energy
model. The only relevant term is $-H_t^- H_t^+$ that acts in two ways: i) the consecutive hopping
of the same-spin electron from $i$ to $j$ and from $j$ to $k$; ii) the hopping of an electron from $i$ to
$j$ followed by the hopping of an electron with a different spin from $j$ to $k$. Those produce two
contributions as follows,
\[
H_t^{\text{3-site}} = -t^2 \sum_{i\neq j\neq k} \sum_{\sigma} \hat X_i^{0\leftarrow\sigma} \hat X_j^{\bar\sigma\leftarrow\bar\sigma} \hat X_k^{\sigma\leftarrow 0} + t^2 \sum_{i\neq j\neq k} \sum_{\sigma} \hat X_i^{0\leftarrow\sigma} \hat X_j^{\sigma\leftarrow\bar\sigma} \hat X_k^{\bar\sigma\leftarrow 0}, \tag{6.25}
\]

and these contributions have different signs, because the former involves same-spin electrons,
whereas the latter features two consecutive hoppings of electrons with different spins. This
adds to the magnetic coupling, but only away from half-filling, because any such process
requires an empty state to begin with.
It's not difficult to verify that any other combination, like $-H^-_{t,kj} H^+_{t,ij}$, yields zero contribution,
so Eq. (6.25) is exhaustive within the $|0\rangle$, $|{\uparrow}\rangle$, and $|{\downarrow}\rangle$ subspace.

6.2.5 Formulation of the t − J model


By combining the results of the previous paragraphs, we can write down the Hamiltonian of
the $t-J$ model,
\[
\begin{aligned}
H_{t-J} &= H_t^0 + \frac{1}{U} \sum_{i\neq j} [H^+_{t,ij}, H^-_{t,ji}] + \frac{1}{U} H_t^{\text{3-site}} = \\
&= -t \sum_{i\neq j,\,\sigma} (1 - \hat n_{j\bar\sigma})\, \hat c^\dagger_{j\sigma} \hat c_{i\sigma}\, (1 - \hat n_{i\bar\sigma}) + \frac{2t^2}{U} \sum_{i\neq j} \left( \hat{\mathbf S}_i \hat{\mathbf S}_j - \frac{\hat n_i \hat n_j}{4} \right) + \frac{1}{U} H_t^{\text{3-site}},
\end{aligned}
\]

which is the truncated version of Eq. (6.17) with all terms related to the $d$-states removed. The
$t-J$ model is defined in the space of three states ($|0\rangle$, $|{\uparrow}\rangle$, $|{\downarrow}\rangle$) and includes all terms of the
order of $t^2/U$. It is used to describe the low-energy physics of strongly correlated materials
and contains an explicit magnetic term, as well as some implicit magnetic effects within the
three-site terms.
The $t-J$ model becomes particularly simple in the case of half-filling, where the first term
as well as the three-site terms vanish, and we are left with the standard Heisenberg Hamiltonian
\[
H_{t-J}^{\text{half-filling}} = J \sum_{i\neq j} \left( \hat{\mathbf S}_i \hat{\mathbf S}_j - \frac{\hat n_i \hat n_j}{4} \right), \qquad J = \frac{2t^2}{U}, \tag{6.26}
\]
that essentially repeats our result of Sec. 2.3, where we derived the kinetic exchange of $4t^2/U$.
The factor of 2 difference is due to the fact that the kinetic exchange was calculated per bond,
while here we sum over $i$ and $j$ independently, so each bond is counted twice and the
exchange integral is half as large.
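The per-bond exchange of $4t^2/U$ can be compared with an exact diagonalization of the two-site Hubbard model at half-filling, where the singlet-triplet splitting is known in closed form. This is a minimal sketch: the helper `fermion_ops` (a Jordan-Wigner construction) and the values `t = 1`, `U = 20` are assumptions chosen for illustration.

```python
import numpy as np

def fermion_ops(n_modes):
    """Annihilation operators as real matrices (Jordan-Wigner construction)."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])
    z = np.diag([1.0, -1.0])
    ops = []
    for m in range(n_modes):
        op = np.eye(1)
        for factor in [z] * m + [a] + [np.eye(2)] * (n_modes - m - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops

t, U = 1.0, 20.0                       # assumed values deep in the t << U regime
c = fermion_ops(4)                     # mode index = 2 * site + spin
n = [op.T @ op for op in c]

# Two-site version of Eq. (6.1) without the chemical-potential term
H = np.zeros((16, 16))
for s in (0, 1):
    H += -t * (c[2 + s].T @ c[s] + c[s].T @ c[2 + s])
for site in (0, 1):
    H += U * n[2 * site] @ n[2 * site + 1]

# Restrict to the half-filled sector (two electrons on two sites)
n_total = np.diag(sum(n))
half_filled = np.isclose(n_total, 2.0)
energies = np.linalg.eigvalsh(H[np.ix_(half_filled, half_filled)])

gap = energies[1] - energies[0]        # triplet (E = 0) minus singlet ground state
print("exact singlet-triplet gap:", gap)
print("4 t^2 / U                :", 4 * t**2 / U)
```

The exact gap equals $(\sqrt{U^2 + 16t^2} - U)/2$ and approaches $4t^2/U$ as $t/U \to 0$; the remaining difference comes from the higher-order terms neglected in Eq. (6.17).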
It's now time to reveal that the result for the half-filled case could be obtained in a much
simpler way using second-order perturbation theory. We shall treat $H_U$ as the "base" Hamiltonian
and $H_t$ as a small perturbation. The ground state of $H_U$ is formed by the manifold of
states with one electron on each site and different spin configurations (basically, ferro- and
antiferromagnetic). Excited states with one empty and one doubly-occupied site have the energy
of $U$.
The second-order correction to the energy has the form
\[
E^{(2)} = -\sum_{n,n'} \sum_{k\neq n} \frac{\langle n|H_t|k\rangle \langle k|H_t|n'\rangle}{E_k - E_n},
\]

where $n$ and $n'$ are configurations of the ground-state manifold ($E_n = E_{n'}$), $k$'s are excited
states, and $E_k - E_n = U$. The matrix elements of $H_t$ are non-zero when the excited state $|k\rangle$
is obtained from $|n\rangle$ by a hopping process and, likewise, when $|n'\rangle$ is obtained from $|k\rangle$ by a
(possibly different) hopping process. This becomes possible for antiferromagnetic states, where
electron hoppings are allowed, resulting in the matrix elements of $t$. On the other hand, no
hoppings are possible in the ferromagnetic states, and $E^{(2)} = 0$. The antiferromagnetic states
lower their energy through second-order processes by $t^2/U$.
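The energy splitting between the ferro- and antiferromagnetic states can be expressed by
means of an effective Hamiltonian acting within the subspace of singly-occupied states. To
obtain this Hamiltonian, we simply write down all possible second-order processes,
\[
H_{\text{2-nd order}} = -\frac{t^2}{U} \sum_{i\neq j} \left( \hat c^\dagger_{i\uparrow} \hat c_{j\uparrow} \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow} + \hat c^\dagger_{i\downarrow} \hat c_{j\downarrow} \hat c^\dagger_{j\downarrow} \hat c_{i\downarrow} + \hat c^\dagger_{i\downarrow} \hat c_{j\downarrow} \hat c^\dagger_{j\uparrow} \hat c_{i\uparrow} + \hat c^\dagger_{i\uparrow} \hat c_{j\uparrow} \hat c^\dagger_{j\downarrow} \hat c_{i\downarrow} \right), \tag{6.27}
\]

arising from two consecutive hoppings: i) either spin-up or spin-down electron hops from $i$ to
$j$; ii) either spin-up or spin-down electron hops from $j$ to $i$.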
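The last two terms within the brackets in Eq. (6.27) are
\[
-\hat c^\dagger_{i\downarrow} \hat c_{i\uparrow} \hat c^\dagger_{j\uparrow} \hat c_{j\downarrow} - \hat c^\dagger_{i\uparrow} \hat c_{i\downarrow} \hat c^\dagger_{j\downarrow} \hat c_{j\uparrow} = -\hat S_i^- \hat S_j^+ - \hat S_i^+ \hat S_j^-
\]
per the fundamental commutation relation of Eq. (D.8) and the definition of spin operators in
Eq. (6.4). As for the first two terms, they can be written as
\[
\hat n_{i\uparrow} (1 - \hat n_{j\uparrow}) + \hat n_{i\downarrow} (1 - \hat n_{j\downarrow}) = \hat n_i - \hat n_{i\uparrow} \hat n_{j\uparrow} - \hat n_{i\downarrow} \hat n_{j\downarrow}.
\]
On the other hand, the same Eq. (6.4) yields
\[
\begin{aligned}
4 \hat S_i^z \hat S_j^z = (\hat n_{i\uparrow} - \hat n_{i\downarrow})(\hat n_{j\uparrow} - \hat n_{j\downarrow}) &= \hat n_{i\uparrow} \hat n_{j\uparrow} + \hat n_{i\downarrow} \hat n_{j\downarrow} - \hat n_{i\uparrow} (\hat n_j - \hat n_{j\uparrow}) - \hat n_{i\downarrow} (\hat n_j - \hat n_{j\downarrow}) = \\
&= 2 (\hat n_{i\uparrow} \hat n_{j\uparrow} + \hat n_{i\downarrow} \hat n_{j\downarrow}) - \hat n_i \hat n_j.
\end{aligned}
\]
Therefore, the first two terms of Eq. (6.27) are
\[
\hat n_{i\uparrow} (1 - \hat n_{j\uparrow}) + \hat n_{i\downarrow} (1 - \hat n_{j\downarrow}) = \hat n_i - 2 \hat S_i^z \hat S_j^z - \frac{\hat n_i \hat n_j}{2} = -2 \hat S_i^z \hat S_j^z + \frac{1}{2},
\]
because $n_i = n_j = 1$ in the case of half-filling.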
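This way, the second-order corrections yield
\[
H_{\text{2-nd order}} = \frac{2t^2}{U} \sum_{i\neq j} \left( \hat S_i^z \hat S_j^z + \frac12 \hat S_i^+ \hat S_j^- + \frac12 \hat S_i^- \hat S_j^+ - \frac14 \right) = \frac{2t^2}{U} \sum_{i\neq j} \left( \hat{\mathbf S}_i \hat{\mathbf S}_j - \frac14 \right), \tag{6.28}
\]

the result of Eq. (6.26).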
6.3 High-order exchange terms
The terms neglected in Eq. (6.17) will include some exchange processes as well. We won't
try to analyze them rigorously, because the usefulness of the result does not justify the
tediousness of the procedure. Therefore, let's simply sketch it.
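Consider a square plaquette. Four consecutive hoppings can lead to a process like

\[
|{\uparrow\downarrow\uparrow\downarrow}\rangle \Rightarrow |0\, d\, {\uparrow\downarrow}\rangle \Rightarrow |{\downarrow\uparrow\uparrow\downarrow}\rangle \Rightarrow |{\uparrow\downarrow}\, d\, 0\rangle \Rightarrow |{\downarrow\uparrow\downarrow\uparrow}\rangle
\]

that corresponds to the exchange term $(\hat{\mathbf S}_1 \hat{\mathbf S}_2)(\hat{\mathbf S}_3 \hat{\mathbf S}_4)$. A similar sequence of hoppings but
running in the opposite direction over the plaquette results in $(\hat{\mathbf S}_1 \hat{\mathbf S}_4)(\hat{\mathbf S}_2 \hat{\mathbf S}_3)$. Finally, we can
also envisage something like

\[
|{\uparrow\uparrow\downarrow\downarrow}\rangle \Rightarrow |{\uparrow}\, d\, 0\, {\downarrow}\rangle \Rightarrow |{\uparrow\downarrow\uparrow\downarrow}\rangle \Rightarrow |d\, {\downarrow\uparrow}\, 0\rangle \Rightarrow |{\downarrow\downarrow\uparrow\uparrow}\rangle
\]

that yields an exchange $(\hat{\mathbf S}_1 \hat{\mathbf S}_3)(\hat{\mathbf S}_2 \hat{\mathbf S}_4)$ between the sites that are not directly connected to each
other.
The full four-spin exchange term (ring exchange) on the square plaquette reads as

\[
P_{\rm ring} = \frac{5t^4}{U^3} \left[ (\hat{\mathbf S}_1 \hat{\mathbf S}_2)(\hat{\mathbf S}_3 \hat{\mathbf S}_4) + (\hat{\mathbf S}_1 \hat{\mathbf S}_4)(\hat{\mathbf S}_2 \hat{\mathbf S}_3) - (\hat{\mathbf S}_1 \hat{\mathbf S}_3)(\hat{\mathbf S}_2 \hat{\mathbf S}_4) \right]. \tag{6.29}
\]

Bilinear exchange $\hat{\mathbf S}_1 \hat{\mathbf S}_2$ will also feature contributions of the order of $t^4/U^3$. Moreover, for
$S > \tfrac12$ one finds the biquadratic exchange term $(\hat{\mathbf S}_1 \hat{\mathbf S}_2)^2$, which is of the same order.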

6.4 Stoner criterion


Our analysis shows that magnetism is easily obtained in the $t \ll U$ limit of the Hubbard model
with the Mott-insulating solution. On the other hand, some of the metals are ferromagnetic,
a fact that does not seem to be captured by the $t \ll U$ limit. Stoner developed a simple
criterion of a ferromagnetic instability depending on the density of states at the Fermi level
$D(\varepsilon_F)$,
\[
I\, D(\varepsilon_F) > 1, \tag{6.30}
\]
where $I$ is the Stoner parameter. Here, we shall analyze the Stoner criterion on the level of
the one-band Hubbard model.
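Mean-field approximation: consider $n_\sigma = \langle n_\sigma \rangle + \delta n_\sigma$, which yields

\[
\begin{aligned}
n_\uparrow n_\downarrow &= \langle n_\uparrow \rangle \langle n_\downarrow \rangle + \delta n_\uparrow \langle n_\downarrow \rangle + \delta n_\downarrow \langle n_\uparrow \rangle + \delta n_\uparrow \delta n_\downarrow \simeq \\
&\simeq \langle n_\downarrow \rangle (\langle n_\uparrow \rangle + \delta n_\uparrow) + \langle n_\uparrow \rangle (\langle n_\downarrow \rangle + \delta n_\downarrow) - \langle n_\uparrow \rangle \langle n_\downarrow \rangle \simeq \langle n_\downarrow \rangle n_\uparrow + \langle n_\uparrow \rangle n_\downarrow - \langle n_\uparrow \rangle \langle n_\downarrow \rangle,
\end{aligned}
\]

where we neglected $\delta n_\uparrow \delta n_\downarrow$ as a small term. Then the Hamiltonian of Eq. (6.1) takes the form
\[
H_{\rm MF} = -t \sum_{i\neq j,\,\sigma} \hat c^\dagger_{j\sigma} \hat c_{i\sigma} + \sum_{j,\sigma} (U \langle \hat n_{\bar\sigma} \rangle - \mu)\, \hat c^\dagger_{j\sigma} \hat c_{j\sigma} - U N \langle n_\uparrow \rangle \langle n_\downarrow \rangle.
\]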

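It can be diagonalized by introducing $k$-dependent $\hat c_\sigma$ operators. The transformation of the
hopping term was described in Sec. 6.1, whereas the on-site term becomes
\[
\sum_{j,\sigma} \langle \hat n_{\bar\sigma} \rangle\, \hat c^\dagger_{j\sigma} \hat c_{j\sigma} = \frac{1}{N} \sum_{\sigma} \langle \hat n_{\bar\sigma} \rangle \sum_{k_1,k_2} \left( \sum_{r} e^{i(k_1 - k_2) r} \right) \hat c^\dagger_{k_1\sigma} \hat c_{k_2\sigma} = \sum_{k,\sigma} \langle \hat n_{\bar\sigma} \rangle\, \hat c^\dagger_{k\sigma} \hat c_{k\sigma}.
\]

Altogether,
\[
H_{\rm MF} = \sum_{k,\sigma} (\varepsilon_k - \mu + U \langle \hat n_{\bar\sigma} \rangle)\, \hat c^\dagger_{k\sigma} \hat c_{k\sigma} - U N \langle n_\uparrow \rangle \langle n_\downarrow \rangle, \tag{6.31}
\]

which is the Hamiltonian of a Fermi gas with the constant energy offset of $-U N \langle n_\uparrow \rangle \langle n_\downarrow \rangle$.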
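Thermal averages. We can now calculate thermal averages of $n_\uparrow$ and $n_\downarrow$ using Fermi
statistics,
\[
\langle n_\sigma \rangle = \frac{1}{N} \sum_{k} \frac{1}{e^{\beta(\varepsilon_k - \mu + U \langle \hat n_{\bar\sigma} \rangle)} + 1}, \tag{6.32}
\]
where $\beta = 1/(k_B T)$. By introducing

\[
n = \langle \hat n_\uparrow \rangle + \langle \hat n_\downarrow \rangle \quad \text{and} \quad m = \langle \hat n_\uparrow \rangle - \langle \hat n_\downarrow \rangle, \tag{6.33}
\]

we can write
\[
\langle n_\uparrow \rangle = \frac{1}{N} \sum_{k} \frac{1}{e^{\beta[\varepsilon_k - \mu + U(n-m)/2]} + 1} \quad \text{and} \quad \langle n_\downarrow \rangle = \frac{1}{N} \sum_{k} \frac{1}{e^{\beta[\varepsilon_k - \mu + U(n+m)/2]} + 1}
\]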

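that yields an expression for the magnetization

\[
\begin{aligned}
m &= \frac{1}{N} \sum_{k} \left( \frac{1}{e^{\beta[\varepsilon_k - \mu + U(n-m)/2]} + 1} - \frac{1}{e^{\beta[\varepsilon_k - \mu + U(n+m)/2]} + 1} \right) = \\
&= \frac{1}{N} \sum_{k} \frac{e^{\beta(\varepsilon_k - \mu + Un/2)} \left( e^{\beta U m/2} - e^{-\beta U m/2} \right)}{e^{2\beta(\varepsilon_k - \mu + Un/2)} + 1 + e^{\beta(\varepsilon_k - \mu + Un/2)} \left( e^{\beta U m/2} + e^{-\beta U m/2} \right)} = \\
&= \frac{1}{N} \sum_{k} \frac{2 \sinh \frac{\beta U m}{2}}{e^{\beta(\varepsilon_k - \mu + Un/2)} + e^{-\beta(\varepsilon_k - \mu + Un/2)} + 2 \cosh \frac{\beta U m}{2}},
\end{aligned}
\]

and, upon introducing the relative chemical potential $\mu_R = \mu - Un/2$,

\[
m = \frac{1}{N} \sum_{k} \frac{\sinh \frac{\beta U m}{2}}{\cosh \frac{\beta U m}{2} + \cosh[\beta(\varepsilon_k - \mu_R)]}.
\]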


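Using Taylor expansion at small $m$, $\sinh x \simeq x$ and $\cosh x \simeq 1 + x^2/2$, we can transform
this expression to
\[
m \simeq \frac{1}{N} \sum_{k} \frac{\beta U m/2}{1 + \cosh[\beta(\varepsilon_k - \mu_R)] + \frac12 \left( \frac{\beta U m}{2} \right)^2}.
\]
One further Taylor expansion,
\[
\frac{1}{A + B m^2} \simeq \frac{1}{A} - \frac{B}{A^2}\, m^2 + O(m^4),
\]
yields
\[
m \simeq \frac{1}{N} \sum_{k} \left[ \frac{\beta U m/2}{1 + \cosh[\beta(\varepsilon_k - \mu_R)]} - \frac{\frac12\, \frac{\beta U m}{2} \left( \frac{\beta U m}{2} \right)^2}{\left( 1 + \cosh[\beta(\varepsilon_k - \mu_R)] \right)^2} \right] = a m - b m^3, \tag{6.34}
\]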

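where the coefficients $a$ and $b$ can be conveniently represented via derivatives of the occupation
numbers for the paramagnetic electron gas with $m = 0$ and $\varepsilon_F = \mu_R$,

\[
n_k^0 = \frac{1}{e^{\beta(\varepsilon_k - \mu_R)} + 1}, \qquad \frac{\partial n_k^0}{\partial \mu_R} = -\frac{\partial n_k^0}{\partial \varepsilon_k} = \frac{\beta\, e^{\beta(\varepsilon_k - \mu_R)}}{[e^{\beta(\varepsilon_k - \mu_R)} + 1]^2} = \frac{\beta/2}{1 + \cosh[\beta(\varepsilon_k - \mu_R)]},
\]

such that
\[
a = \frac{U}{N} \sum_{k} \frac{\partial n_k^0}{\partial \mu_R} = -\frac{U}{N} \sum_{k} \frac{\partial n_k^0}{\partial \varepsilon_k} \quad \text{and} \quad b = \frac{\beta U^3}{4N} \sum_{k} \left( \frac{\partial n_k^0}{\partial \mu_R} \right)^2. \tag{6.35}
\]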

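The solution for $m$ could be either $m = 0$ (paramagnetic) or $m^2 = (a-1)/b$ (ferromagnetic).
The latter exists when $(a-1)/b > 0$, which essentially means $a > 1$, because $b > 0$ by definition.
The condition for ferromagnetism is then $a > 1$. To explore it further, we have to calculate $a$
via Eq. (6.35) that basically contains the energy derivative of the Fermi function. At $T \to 0$,
this derivative converges to $-\delta(\varepsilon - \varepsilon_F)$. This way, $a$ becomes
\[
a = \frac{U}{N} \sum_{k} \delta(\varepsilon - \varepsilon_F) = \frac{U}{N} \frac{V}{(2\pi)^3} \int d^3k\ \delta(\varepsilon - \varepsilon_F) = \frac{U}{N} \frac{V}{2\pi^2} \int_0^\infty dk\ k^2\, \delta(\varepsilon - \varepsilon_F),
\]

where we did the standard transformation from the summation over $k$ to the integration over
$k$. Then, using $\varepsilon = \hbar^2 k^2/2m$ and $d\varepsilon = (\hbar^2/m)\, k\, dk$, we find

\[
a = U \int_{-\infty}^{+\infty} \frac{V m \sqrt{2m\varepsilon}}{2\pi^2 N \hbar^3}\, \delta(\varepsilon - \varepsilon_F)\, d\varepsilon = U \int_{-\infty}^{\infty} d\varepsilon\ D(\varepsilon)\, \delta(\varepsilon - \varepsilon_F) = U D(\varepsilon_F) > 1, \tag{6.36}
\]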

where we defined all the integrand but δ(ε − εF ) as D(ε) that has the meaning of electronic
density of states. It is similar to the density of states introduced in the standard Sommerfeld
theory of metals (e.g., Ch. 2 of Ashcroft and Mermin), but with an additional factor of V /N
that yields the density of states per atom and not per unit of volume.
Eq. (6.36) is equivalent to Eq. (6.30) and serves as the Stoner criterion of a ferromagnetic
instability in the electron gas, with Hubbard U playing the role of the Stoner parameter. Note
that D(εF ) is the density of states of the paramagnetic electron gas, and εF = µR = µ − U n/2
is the Fermi level in the paramagnetic state.
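To see the criterion at work, one can iterate the self-consistency equation for $m$ (the expression with $\sinh$ and $\cosh$ derived above) for a model band. The sketch below assumes a flat band of width `W` with `n_k` equally spaced levels and $\mu_R = 0$, so that $D(\varepsilon_F) = 1/W$ per spin and per atom; the band shape, temperature and starting value are illustrative assumptions, not part of the notes.

```python
import numpy as np

def magnetization(U, W=1.0, beta=50.0, n_k=2000, m_start=0.1, n_iter=500):
    """Fixed-point iteration of m = (1/N) sum_k sinh(y) / (cosh(y) + cosh(beta * eps_k)),
    with y = beta * U * m / 2 and mu_R = 0 (assumed flat band of width W)."""
    eps = (np.arange(n_k) + 0.5) / n_k * W - W / 2     # levels spread over [-W/2, W/2]
    m = m_start
    for _ in range(n_iter):
        y = beta * U * m / 2
        m = np.mean(np.sinh(y) / (np.cosh(y) + np.cosh(beta * eps)))
    return m

W = 1.0
m_para = magnetization(U=0.5 * W)     # U * D(eps_F) = 0.5 < 1
m_ferro = magnetization(U=1.5 * W)    # U * D(eps_F) = 1.5 > 1

print("U D = 0.5 -> m =", m_para)
print("U D = 1.5 -> m =", m_ferro)
```

For $U D(\varepsilon_F) < 1$ the iteration falls back to $m = 0$, whereas for $U D(\varepsilon_F) > 1$ it runs away from the small starting value. With a flat band the magnetization then saturates, because nothing limits the growth of $m$ until one spin channel is completely filled.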
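Curie temperature is obtained in a similar way, but without resorting to the $T \to 0$ limit.
Let's expand $D(\varepsilon)$ around $\varepsilon_F = \mu_R$ (which we choose as zero energy),

\[
D(\varepsilon) = D(\varepsilon_F) + D'(\varepsilon_F)(\varepsilon - \mu_R) + \tfrac12 D''(\varepsilon_F)(\varepsilon - \mu_R)^2 + O(\varepsilon^3),
\]

and retain the terms up to $\varepsilon^2$,

\[
\begin{aligned}
a &= -U \int_{-\infty}^{\infty} d\varepsilon\ D(\varepsilon)\, \frac{\partial n^0}{\partial \varepsilon} = -U D(\varepsilon_F) \int_{-\infty}^{\infty} d\varepsilon\ \frac{\partial n^0}{\partial \varepsilon} - U D'(\varepsilon_F) \int_{-\infty}^{\infty} d\varepsilon\ (\varepsilon - \mu_R)\, \frac{\partial n^0}{\partial \varepsilon} - \\
&\quad - \frac{U D''(\varepsilon_F)}{2} \int_{-\infty}^{\infty} d\varepsilon\ (\varepsilon - \mu_R)^2\, \frac{\partial n^0}{\partial \varepsilon} = U D(\varepsilon_F) - \frac{U D''(\varepsilon_F)}{2} \int_{-\infty}^{\infty} d\varepsilon\ (\varepsilon - \mu_R)^2\, \frac{\partial n^0}{\partial \varepsilon},
\end{aligned}
\]

where one of the integrals vanishes, because with $x = \beta(\varepsilon - \mu_R)$

\[
\int_{-\infty}^{\infty} d\varepsilon\ (\varepsilon - \mu_R)\, \frac{\partial n^0}{\partial \varepsilon} = -\int_{-\infty}^{\infty} dx\ \frac{x}{2\beta (1 + \cosh x)},
\]

and the integrand is an odd function that yields zero upon integration.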

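The remaining integral is expressed as
\[
\begin{aligned}
-\int_{-\infty}^{\infty} d\varepsilon\ (\varepsilon - \mu_R)^2\, \frac{\partial n^0_\varepsilon}{\partial \varepsilon} &= \int_{-\infty}^{\infty} d\varepsilon\ (\varepsilon - \mu_R)^2\, \frac{\beta\, e^{\beta(\varepsilon - \mu_R)}}{[e^{\beta(\varepsilon - \mu_R)} + 1]^2} = \int_{-\infty}^{\infty} d\varepsilon\ \frac{[\beta(\varepsilon - \mu_R)]^2\, e^{\beta(\varepsilon - \mu_R)}}{\beta\, [e^{\beta(\varepsilon - \mu_R)} + 1]^2} = \\
&= \frac{1}{\beta^2} \int_{-\infty}^{\infty} dx\ \frac{x^2 e^x}{(e^x + 1)^2} = \frac{\pi^2}{3\beta^2}.
\end{aligned}
\]

Then from the $a = 1$ condition one finds

\[
U D(\varepsilon_F) + \frac{U \pi^2}{6 \beta^2}\, D''(\varepsilon_F) = 1 \quad\Rightarrow\quad k_B T_C = \sqrt{\frac{1 - U D(\varepsilon_F)}{\frac{\pi^2}{6}\, U D''(\varepsilon_F)}}. \tag{6.37}
\]

Numerical estimates yield $T_C$ on the order of $10^4$ K and overestimate the Curie temperature of
real metals by at least a factor of 10. This is a typical story with the mean-field solution, where
fluctuations are neglected.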
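The dimensionless integral $\int dx\, x^2 e^x/(e^x+1)^2 = \pi^2/3$ used in the derivation of Eq. (6.37) can be confirmed numerically. The sketch below uses a plain Riemann sum on a truncated grid; the cutoff of $\pm 60$ and the step size are assumptions that are more than sufficient for an exponentially decaying integrand.

```python
import numpy as np

x = np.linspace(-60.0, 60.0, 120001)
dx = x[1] - x[0]

# x^2 e^x / (e^x + 1)^2 rewritten as x^2 / (4 cosh^2(x/2)) to avoid overflow in e^x
integrand = x**2 / (4.0 * np.cosh(x / 2.0) ** 2)
integral = np.sum(integrand) * dx

print("numerical integral:", integral)
print("pi^2 / 3          :", np.pi**2 / 3)
```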

7 Kondo model
7.1 Anderson model and Kondo effect
The Hubbard model considered in the previous section has been developed for and primarily
applied to $d$-metals. It can be applied to insulating compounds of $f$-metals as well, although
here one always ends up in the $t \ll U$ limit complicated by the spin-orbit coupling, crystal
fields, and such. Very different physics is observed in metallic compounds of $f$-elements, where
the interaction (hybridization) between the itinerant conduction electrons and localized $f$-electrons
appears as a new and central effect.
Its simplest description is achieved on the level of the Anderson model,
\[
H = \sum_{k,\sigma} \varepsilon_k\, \hat c^\dagger_{k\sigma} \hat c_{k\sigma} + \sum_{k,\sigma} \varepsilon_f\, \hat f^\dagger_{k\sigma} \hat f_{k\sigma} + \sum_{k,\sigma} \left( v_k\, \hat c^\dagger_{k\sigma} \hat f_{k\sigma} + v_k^*\, \hat f^\dagger_{k\sigma} \hat c_{k\sigma} \right) + U_f \sum_{j} \hat n_{j\uparrow} \hat n_{j\downarrow}, \tag{7.1}
\]

where the $\hat c_\sigma$ operators stand for the itinerant electrons, and the $\hat f$ operators denote the localized
$f$-electrons. The first term is the standard hopping term. The second term is merely the
orbital energy of the $f$-states, where for simplicity we assumed complete localization (no
dispersion). The third term is the aforementioned hybridization $v_k$, which stands for the transfer
of an electron from the $f$-shell to the conduction sea, or vice versa. The last term is the Coulomb
repulsion for the $f$-states.
The first two terms of Eq. (7.1) produce a dispersive band of conduction electrons and a
flat band of localized electrons. The third term introduces their mixing, which leads to a band
splitting. The simplest form of the Hamiltonian matrix is
\[
\begin{pmatrix} \varepsilon_k & v_k \\ v_k & \varepsilon_f \end{pmatrix}
\]

that, upon a straightforward diagonalization, yields band energies of

\[
\varepsilon_\pm = \frac12 \left( \varepsilon_k + \varepsilon_f \pm \sqrt{(\varepsilon_k - \varepsilon_f)^2 + 4 v_k^2} \right). \tag{7.2}
\]
Unless $v_k = 0$ at some of the $k$-points, there is a gap between the lower and upper bands
all over the Brillouin zone. To estimate the indirect band gap, one has to find the maximum of
the lower band and the minimum of the upper band. Consider the 1D case and the cosine-like
dispersion for itinerant electrons, $\varepsilon_k = -2t \cos(ka)$. Then
\[
\varepsilon_- = \frac12 \left( -2t \cos ka + \varepsilon_f - \sqrt{(2t \cos ka + \varepsilon_f)^2 + 4 v_k^2} \right)
\]
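has the maximum at the zone boundary ($ka = \pi$), because that is the point where $-2t \cos ka$
is maximal and $2t \cos ka$ is minimal. Therefore,
\[
\begin{aligned}
\varepsilon_-^{\max} &= \frac12 \left( 2t + \varepsilon_f - \sqrt{(2t - \varepsilon_f)^2 + 4 v^2} \right) = \\
&= \frac12 \left( 2t + \varepsilon_f - (2t - \varepsilon_f) \sqrt{1 + \frac{4 v^2}{(2t - \varepsilon_f)^2}} \right) \simeq \varepsilon_f - \frac{v^2}{2t - \varepsilon_f},
\end{aligned}
\]

where we assumed $\varepsilon_f \ll t$ to ensure that $2t - \varepsilon_f > 0$ and expanded the square root to first
order in $v^2$. Similar considerations show that $\varepsilon_+$ has the minimum at $k = 0$ with
\[
\varepsilon_+^{\min} = \varepsilon_f + \frac{v^2}{2t + \varepsilon_f}.
\]

Then the indirect band gap

\[
\Delta|_{U=0} = v^2 \left( \frac{1}{2t + \varepsilon_f} + \frac{1}{2t - \varepsilon_f} \right) = \frac{4 v^2 t}{4t^2 - \varepsilon_f^2} \simeq v^2/t, \tag{7.3}
\]

so it is of the order of $v^2/W$ ($W$ is the bandwidth) and fairly small.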


Suppose that our system features two electrons, one itinerant and one localized. Together
they completely fill the lower band $\varepsilon_-$ while leaving the upper band $\varepsilon_+$ empty. The nominally
metallic system unexpectedly becomes non-metallic. The small gap opened by $v$ is largely
smeared out at high temperatures, where the system behaves as a normal metal (e.g., resistivity
is low and slowly increases with temperature). However, at low temperatures the conduction
electrons become localized due to interaction with $f$-electrons ("impurities"), and the resistivity
increases upon cooling. That's the simplest picture of the Kondo effect.
We note in passing that the above mechanism of localization is very different from the
Hubbard mechanism, where the gap opening is due to $U$, and the gap size is of the order of
$U$, so it can't be overcome by any thermal fluctuations, and a robust insulating behavior is
observed.
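The hybridization gap can also be read off directly from Eq. (7.2) evaluated on a grid of $k$-points. The sketch below does this for the 1D cosine band and compares the result with the leading-order expansion in $v^2$, $\Delta \simeq v^2 \left[ 1/(2t + \varepsilon_f) + 1/(2t - \varepsilon_f) \right]$. The parameter values and the $k$-independent hybridization are assumptions chosen for illustration.

```python
import numpy as np

t, eps_f, v = 1.0, 0.1, 0.05              # assumed values with eps_f << t and small v

ka = np.linspace(-np.pi, np.pi, 2001)     # grid includes ka = 0 and ka = +-pi
eps_k = -2.0 * t * np.cos(ka)

# Hybridized bands of Eq. (7.2)
root = np.sqrt((eps_k - eps_f) ** 2 + 4.0 * v**2)
lower = 0.5 * (eps_k + eps_f - root)
upper = 0.5 * (eps_k + eps_f + root)

gap_numeric = upper.min() - lower.max()   # indirect gap between the two bands
gap_estimate = v**2 * (1.0 / (2 * t + eps_f) + 1.0 / (2 * t - eps_f))

print("top of lower band at ka =", abs(ka[np.argmax(lower)]))
print("bottom of upper band at ka =", abs(ka[np.argmin(upper)]))
print("indirect gap (numeric) :", gap_numeric)
print("indirect gap (estimate):", gap_estimate)
```

The gap scales as $v^2/t$, i.e., it is much smaller than both the bandwidth and the hybridization itself, which is why it is easily smeared out by temperature.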

7.2 Single impurity


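We start with the single-impurity problem that reduces Eq. (7.1) to the form
\[
H_{\rm imp} = \sum_{k,\sigma} \varepsilon_k\, \hat c^\dagger_{k\sigma} \hat c_{k\sigma} + \sum_{\sigma} \varepsilon_f\, \hat f^\dagger_\sigma \hat f_\sigma + \sum_{k,\sigma} v_k \left( \hat c^\dagger_{k\sigma} \hat f_\sigma + \hat f^\dagger_\sigma \hat c_{k\sigma} \right) + U_f\, \hat n_\uparrow \hat n_\downarrow, \tag{7.4}
\]

where for simplicity we assumed $v_k = v_k^*$, i.e., $v_k$ is real. The first term defines the free electron
gas with the ground-state wavefunction
\[
|\psi_0\rangle = \prod_{k}^{k_F} \hat c^\dagger_{k\uparrow} \hat c^\dagger_{k\downarrow} |0\rangle. \tag{7.5}
\]

To solve the actual model, we construct a trial wavefunction

\[
|\psi\rangle = a_0 |\psi_0\rangle + \frac{1}{\sqrt2} \sum_{k,\sigma} a_k\, (\hat f^\dagger_\sigma \hat c_{k\sigma}) |\psi_0\rangle, \tag{7.6}
\]

which admixes to $|\psi_0\rangle$ the states where one conduction electron is excited to the impurity level.
For simplicity we assume $a_k$ real, although this won't have any significant effect on the result.
Placing the second electron onto the impurity level is forbidden by the large $U_f$.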
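The coefficients $a_0$ and $a_k$ are determined using a variational procedure. Before we proceed,
let's make sure that $|\psi\rangle$ is properly normalized. To this end, we calculate
\[
\begin{aligned}
\langle \psi | \psi \rangle &= \langle \psi_0 | \left( a_0 + \frac{1}{\sqrt2} \sum_{k',\sigma'} a_{k'}\, \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'} \right) \left( a_0 + \frac{1}{\sqrt2} \sum_{k,\sigma} a_k\, \hat f^\dagger_\sigma \hat c_{k\sigma} \right) | \psi_0 \rangle = \\
&= a_0^2 + \frac{a_0}{\sqrt2} \sum_{k,\sigma} a_k \left( \langle \psi_0 | \hat f^\dagger_\sigma \hat c_{k\sigma} | \psi_0 \rangle + \text{H.c.} \right) + \frac12 \sum_{k,k'} \sum_{\sigma,\sigma'} a_k a_{k'}\, \langle \psi_0 | \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'} \hat f^\dagger_\sigma \hat c_{k\sigma} | \psi_0 \rangle.
\end{aligned}
\]

The second term vanishes, because we annihilate a conduction electron and create a state
orthogonal to $|\psi_0\rangle$. As for the last term, the only non-vanishing contribution appears when $\hat c_{k\sigma}$
annihilates a conduction electron and $\hat c^\dagger_{k'\sigma'}$ restores it, so we get $\delta_{kk'} \delta_{\sigma\sigma'}$,
\[
\begin{aligned}
\sum_{k,k'} \sum_{\sigma,\sigma'} a_k a_{k'}\, \langle \psi_0 | \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'} \hat f^\dagger_\sigma \hat c_{k\sigma} | \psi_0 \rangle\, \delta_{kk'} \delta_{\sigma\sigma'} &= \sum_{k,\sigma} a_k^2\, \langle \psi_0 | \hat c^\dagger_{k\sigma} \hat f_\sigma \hat f^\dagger_\sigma \hat c_{k\sigma} | \psi_0 \rangle = \\
&= \sum_{k,\sigma} a_k^2\, \langle \psi_0 | \hat c^\dagger_{k\sigma} \hat c_{k\sigma} (1 - \hat f^\dagger_\sigma \hat f_\sigma) | \psi_0 \rangle = \sum_{k,\sigma} a_k^2,
\end{aligned}
\]

where $\hat f^\dagger_\sigma \hat f_\sigma$ yields zero because $\hat f_\sigma |\psi_0\rangle = 0$. The summation over $\sigma$ yields the factor of 2 that
cancels the $\tfrac12$ pre-factor, and we end up with the condition
\[
a_0^2 + \sum_{k} a_k^2 = 1. \tag{7.7}
\]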

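Energy of the trial state is calculated in a similar way and includes several contributions,
\[
\begin{aligned}
E &= \langle \psi_0 | \left( a_0 + \frac{1}{\sqrt2} \sum_{k',\sigma'} a_{k'}\, \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'} \right) H_{\rm imp} \left( a_0 + \frac{1}{\sqrt2} \sum_{k'',\sigma''} a_{k''}\, \hat f^\dagger_{\sigma''} \hat c_{k''\sigma''} \right) | \psi_0 \rangle = \\
&= \sum_{k,\sigma} \varepsilon_k a_0^2\, \langle \psi_0 | \hat c^\dagger_{k\sigma} \hat c_{k\sigma} | \psi_0 \rangle + \frac12 \sum_{k,\sigma} \sum_{k',\sigma'} \sum_{k'',\sigma''} \varepsilon_k a_{k'} a_{k''}\, \langle \psi_0 | \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'}\, \hat c^\dagger_{k\sigma} \hat c_{k\sigma}\, \hat f^\dagger_{\sigma''} \hat c_{k''\sigma''} | \psi_0 \rangle + \\
&\quad + \frac12 \sum_{\sigma} \sum_{k',\sigma'} \sum_{k'',\sigma''} \varepsilon_f a_{k'} a_{k''}\, \langle \psi_0 | \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'}\, \hat f^\dagger_\sigma \hat f_\sigma\, \hat f^\dagger_{\sigma''} \hat c_{k''\sigma''} | \psi_0 \rangle + \\
&\quad + \frac{a_0}{\sqrt2} \sum_{k,\sigma} v_k \left[ \sum_{k'',\sigma''} a_{k''}\, \langle \psi_0 | (\hat c^\dagger_{k\sigma} \hat f_\sigma + \hat f^\dagger_\sigma \hat c_{k\sigma})\, \hat f^\dagger_{\sigma''} \hat c_{k''\sigma''} | \psi_0 \rangle + \sum_{k',\sigma'} a_{k'}\, \langle \psi_0 | \hat c^\dagger_{k'\sigma'} \hat f_{\sigma'}\, (\hat c^\dagger_{k\sigma} \hat f_\sigma + \hat f^\dagger_\sigma \hat c_{k\sigma}) | \psi_0 \rangle \right],
\end{aligned}
\]

where we skipped all terms with an odd number of $\hat c_\sigma$ and $\hat c^\dagger_\sigma$, because they yield zero matrix
elements. The $U_f \hat n_\uparrow \hat n_\downarrow$ term is not present, because our trial wavefunction does not feature the
double occupation of the impurity level.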
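Let's represent the energy as $E = E_0 + E_c + E_f + E_{cf}$ and analyze individual terms. First,
\[
E_0 = a_0^2 \sum_{k,\sigma} \varepsilon_k = 2 a_0^2 \sum_{k} \varepsilon_k.
\]

Second,
\[
E_f = \frac12\, \varepsilon_f \sum_{\sigma} \sum_{k',\sigma'} \sum_{k'',\sigma''} a_{k'} a_{k''}\, \delta_{k'k''} \delta_{\sigma'\sigma''} \delta_{\sigma\sigma''} = \frac12\, \varepsilon_f \sum_{k,\sigma} a_k^2 = \varepsilon_f \sum_{k} a_k^2 = \varepsilon_f (1 - a_0^2),
\]

where $\delta_{\sigma\sigma''}$ is due to the sequence of $\hat f_\sigma \hat f^\dagger_{\sigma''}$.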


The case of $E_c$ is slightly more involved. The combination of $\hat c^\dagger_{k'\sigma'}\hat c_{k''\sigma''}$ yields $\delta_{k'k''}\delta_{\sigma'\sigma''}$, but
additionally we have to ensure that $\hat c_{k''\sigma''} = \hat c_{k'\sigma'}$ is different from $\hat c_{k\sigma}$, as two identical annihilation
operators can’t act one after the other. We shall denote this condition ($k' \neq k$ or $\sigma' \neq \sigma$) with
the prime sign in the summation. It can be achieved by performing an unrestricted summation
over $k'$ and $\sigma'$ and subtracting the single excluded term with $k' = k$ and $\sigma' = \sigma$. This way,
$$
E_c = \frac12\sum_{k,\sigma}\,{\sum_{k',\sigma'}}'\varepsilon_k a_{k'}^2
 = \frac12\sum_{k,\sigma}\Big[\varepsilon_k\Big(\sum_{k',\sigma'}a_{k'}^2\Big) - \varepsilon_k a_k^2\Big]
 = 2\sum_k\varepsilon_k(1-a_0^2) - \sum_k\varepsilon_k a_k^2.
$$

Finally, each of the terms in $E_{cf}$ yields something like $\delta_{kk'}\delta_{\sigma\sigma'}$, such that
$$
E_{cf} = \frac{2}{\sqrt2}\sum_{k,\sigma}v_k a_0 a_k = 2\sqrt2\sum_k v_k a_0 a_k.
$$

Altogether,
$$
E = 2\sum_k\varepsilon_k - \sum_k\varepsilon_k a_k^2 + (1-a_0^2)\,\varepsilon_f + 2\sqrt2\sum_k v_k a_0 a_k. \tag{7.8}
$$

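Variational procedure involves the minimization of E with respect to $a_0$ and $a_k$ under
the condition of Eq. (7.7). The variation of $a_0$ yields
$$
\frac{\delta}{\delta a_0}\Big[E - \lambda\Big(a_0^2 + \sum_k a_k^2 - 1\Big)\Big] = -2a_0\varepsilon_f + 2\sqrt2\sum_k v_k a_k - 2\lambda a_0 = 0 \;\Rightarrow\;
\lambda a_0 = -a_0\varepsilon_f + \sqrt2\sum_k v_k a_k. \tag{7.9}
$$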

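Likewise, the variation of $a_k$ results in
$$
\frac{\delta}{\delta a_k}\Big[E - \lambda\Big(a_0^2 + \sum_k a_k^2 - 1\Big)\Big] = -2\varepsilon_k a_k + 2\sqrt2\,v_k a_0 - 2\lambda a_k = 0 \;\Rightarrow\;
\lambda a_k = -\varepsilon_k a_k + \sqrt2\,v_k a_0. \tag{7.10}
$$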

By multiplying this equation by $a_k$ and making the summation over k, we arrive at the second
term of Eq. (7.8),
$$
-\sum_k\varepsilon_k a_k^2 = \lambda\sum_k a_k^2 - \sqrt2\sum_k v_k a_0 a_k.
$$

On the other hand, multiplying Eq. (7.9) by $a_0$ gives
$$
\sqrt2\sum_k v_k a_0 a_k = \lambda a_0^2 + \varepsilon_f a_0^2.
$$

Together with Eq. (7.7), these two conditions bring Eq. (7.8) to a quite simple form of
$$
E = 2\sum_k\varepsilon_k + \varepsilon_f + \lambda, \tag{7.11}
$$

which is the energy of free conduction electrons, plus the energy of the f -state, plus an additional
energy λ due to the interaction.
Kondo singlet: the condition for the interaction energy λ is determined by inserting
Eq. (7.10) into Eq. (7.9),
$$
a_k(\lambda + \varepsilon_k) = \sqrt2\,v_k a_0 \;\Rightarrow\; \lambda = -\varepsilon_f + \sum_k\frac{2v_k^2}{\lambda + \varepsilon_k}. \tag{7.12}
$$

λ is the energy of a Kondo singlet, the state formed upon the interaction between the con-
duction electrons and impurity. What happens here is that a conduction electron hops onto the
impurity state and creates a spin polarization. The conduction electrons around the impurity
become spin polarized and, together with the electron on the impurity level, form a singlet
state with the characteristic energy of λ.
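As a hedged numerical sanity check of Eqs. (7.7)-(7.12), the sketch below takes a small set of discrete conduction levels, solves Eq. (7.12) for the negative root λ by bisection, builds $a_0$ and $a_k$ from Eqs. (7.10) and (7.7), and compares the energy of Eq. (7.8) with the closed form of Eq. (7.11). The level positions, the value of εf, and the k-independent hybridization v are illustrative assumptions, not values from the lecture.

```python
import math

# Illustrative (assumed) parameters: occupied conduction levels below the
# Fermi level at 0, impurity level eps_f < 0, k-independent hybridization v.
eps_k = [-0.1 * m for m in range(1, 11)]
eps_f = -1.0
v = 0.3


def gap_function(lam):
    """Eq. (7.12) written as F(lam) = 0."""
    return lam + eps_f - sum(2 * v**2 / (lam + e) for e in eps_k)


def solve_lambda(lo=-100.0, hi=-1e-12, steps=200):
    """Bisection for the negative root; F grows monotonically for lam < 0."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if gap_function(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


lam = solve_lambda()

# Eq. (7.10): a_k = sqrt(2) v a_0 / (lam + eps_k); Eq. (7.7) then fixes a_0.
ratio = [math.sqrt(2) * v / (lam + e) for e in eps_k]
a0 = 1.0 / math.sqrt(1.0 + sum(r**2 for r in ratio))
a_k = [r * a0 for r in ratio]

# Energy from Eq. (7.8) and from the closed form of Eq. (7.11).
energy_78 = (2 * sum(eps_k)
             - sum(e * a**2 for e, a in zip(eps_k, a_k))
             + (1 - a0**2) * eps_f
             + 2 * math.sqrt(2) * sum(v * a0 * a for a in a_k))
energy_711 = 2 * sum(eps_k) + eps_f + lam

print(f"lambda = {lam:.6f}, E(7.8) = {energy_78:.6f}, E(7.11) = {energy_711:.6f}")
```

The two energies should agree to rounding error, because Eq. (7.11) follows from Eq. (7.8) using only the stationarity conditions and the normalization.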

Kondo temperature: let’s transform Eq. (7.12) by replacing the summation with an
integration over energy, starting from the bottom of the band (εk = ε0) and up to the Fermi
energy at εk = εF = 0,
$$
\lambda = -\varepsilon_f + 2\int_{\varepsilon_0}^{0}\frac{d\varepsilon\,D(\varepsilon)}{\lambda+\varepsilon}\,v_k^2.
$$

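This is not an easy integral to solve, but we expect that the main contribution comes from
the states around the Fermi level, where ε is small, and the integrand large. Therefore, we use
$D(\varepsilon)\simeq D(\varepsilon_F)$ and $v_k\simeq v_{k_F} = \text{const}$ to obtain
$$
\lambda \simeq -\varepsilon_f + 2v_{k_F}^2 D(\varepsilon_F)\int_{\varepsilon_0}^{0}\frac{d\varepsilon}{\lambda+\varepsilon}
 = -\varepsilon_f + 2v_{k_F}^2 D(\varepsilon_F)\,\ln\frac{|\lambda|}{|\lambda+\varepsilon_0|}. \tag{7.13}
$$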

The logarithm function involved is rather peculiar. It approaches zero at |λ| → ∞, diverges
to −∞ at λ = 0 and to +∞ at λ = −ε0 > 0. The solution can be found graphically, where
it immediately becomes clear that the λ = λ line crosses the logarithm curve at three points,
with only one crossing at λ < 0, which is the solution we are looking for as the bound state of
the Kondo singlet.
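It seems plausible to assume $|\lambda| \ll |\varepsilon_f|$ and $|\lambda| \ll |\varepsilon_0|$, which simplifies Eq. (7.13) to
$$
-|\varepsilon_f| = 2v_{k_F}^2 D(\varepsilon_F)\,\ln\frac{|\lambda|}{|\varepsilon_0|} \;\Rightarrow\;
\lambda = -|\varepsilon_0|\exp\left(-\frac{|\varepsilon_f|}{2v_{k_F}^2 D(\varepsilon_F)}\right), \tag{7.14}
$$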

where the minus sign is due to the fact that λ should be negative. Alternatively, one can define
the Kondo temperature $T_K$ such that $\lambda = -k_B T_K$, and
$$
T_K = \frac{|\varepsilon_0|}{k_B}\exp\left(-\frac{|\varepsilon_f|}{2v_{k_F}^2 D(\varepsilon_F)}\right). \tag{7.15}
$$

The energy of the Kondo singlet and the associated Kondo temperature increase, as the
impurity level approaches the Fermi level, and decrease as the hybridization strength v or
density of states at the Fermi level decrease. This all seems logical.
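The quality of the approximation leading from Eq. (7.13) to Eq. (7.14) can be probed numerically. The following sketch, with purely illustrative parameter values (εf, ε0, and the shorthand g for $2v_{k_F}^2 D(\varepsilon_F)$ are assumptions), finds the negative root of Eq. (7.13) by bisection and compares it with the closed-form estimate.

```python
import math

# Illustrative (assumed) parameters in arbitrary energy units.
eps_f = -2.0      # impurity level below the Fermi level
eps_0 = -5.0      # bottom of the conduction band
g = 0.2           # shorthand for 2 v_kF^2 D(eps_F)


def residual(lam):
    """Eq. (7.13) written as lam - RHS(lam) = 0."""
    return lam + eps_f - g * math.log(abs(lam) / abs(lam + eps_0))


def bound_state_energy(lo=-50.0, hi=-1e-15, steps=300):
    """Bisection on lam < 0, where the residual grows monotonically."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


lam_numeric = bound_state_energy()
lam_approx = -abs(eps_0) * math.exp(-abs(eps_f) / g)   # Eq. (7.14)

print(f"numerical root of Eq. (7.13): {lam_numeric:.6e}")
print(f"estimate from Eq. (7.14):     {lam_approx:.6e}")
```

For these parameters |λ| is several orders of magnitude smaller than |εf| and |ε0|, so the two numbers should differ only slightly; making g larger or |εf| smaller degrades the agreement.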

7.3 Kondo lattice


Real crystals feature more than one ”impurity” state, so we should generally work with the
full Anderson Hamiltonian, Eq. (7.1), that includes a complete impurity band. The resulting
Kondo physics is as rich as the Hubbard physics, and we won’t even try to cover all of its
aspects here, but will only discuss implications for the magnetism of f -ions embedded in the
conduction matrix. In other words, we are interested in the half-filling of the f -bands (one
electron in the impurity bands) and the low-energy effects. The general procedure is similar to
our derivation of the t − J model in Sec. 6.2, namely, one has to project the hopping terms,
introduce the Hubbard operators, and collect the results to the lowest order. This is known as
the Schrieffer-Wolff transformation and can be found, e.g., in the book by Fazekas. Here, we
only discuss the nature of the coupling term qualitatively.

7.3.1 Derivation of the Kondo Hamiltonian


In Sec. 6.2.5, we realized that magnetism of the Hubbard model at half-filling can be decently
understood on the level of second-order perturbation theory. The same approach will now be
used to describe the physics of half-filled f -states surrounded by the conduction electrons. The

hybridization term acts as the perturbation, while the rest of Eq. (7.1) constitutes the non-
interacting Hamiltonian, which we consider in the limit of large U with one electron on each
localized state.
Let’s consider the hopping processes that combine the hopping of an electron from the
localized state onto the itinerant state k, and from the itinerant state q onto the localized
state. Two situations are possible: i) an electron leaves the localized state first (f → k), and
another electron comes to replace it later (q → f ), i.e., the double occupation does not occur;
ii) an electron hops onto the localized state first (q → f ), leading to the double occupation,
and only later one of the electrons leaves to the itinerant states (f → k).
We consider the former mechanism first. The ground-state energy is εk + εf, whereas the
relevant excited states have the energy of 2εk, assuming that εk ≃ εq,⁵ so the energy difference
is εk − εf.
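To determine the matrix elements, let’s introduce the local operators $\hat f_{r\sigma}$ into the hybridization
term of Eq. (7.1),
$$
\sum_{k,\sigma}\left(v_k\,\hat c^\dagger_{k\sigma}\hat f_{k\sigma} + v_k^*\,\hat f^\dagger_{k\sigma}\hat c_{k\sigma}\right)
 = \frac{1}{\sqrt N}\sum_{k,\sigma}\left(\sum_r v_k\,e^{ikr}\,\hat c^\dagger_{k\sigma}\hat f_{r\sigma} + \sum_r v_k^*\,e^{-ikr}\,\hat f^\dagger_{r\sigma}\hat c_{k\sigma}\right),
$$
which yields the matrix element of $(1/N)\,v_{k_1}e^{ik_1 r}\,v^*_{k_2}e^{-ik_2 r}$ for the hopping processes in question.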
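Keeping in mind the minus sign in front of the perturbation correction to the energy, we
find the following combinations of the operators,
$$
\begin{aligned}
&-\left(\hat f^\dagger_{r\downarrow}\hat c_{q\downarrow}\hat c^\dagger_{k\uparrow}\hat f_{r\uparrow}
 + \hat f^\dagger_{r\uparrow}\hat c_{q\uparrow}\hat c^\dagger_{k\downarrow}\hat f_{r\downarrow}
 + \hat f^\dagger_{r\uparrow}\hat c_{q\uparrow}\hat c^\dagger_{k\uparrow}\hat f_{r\uparrow}
 + \hat f^\dagger_{r\downarrow}\hat c_{q\downarrow}\hat c^\dagger_{k\downarrow}\hat f_{r\downarrow}\right) = \\
&= \hat f^\dagger_{r\downarrow}\hat f_{r\uparrow}\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow}
 + \hat f^\dagger_{r\uparrow}\hat f_{r\downarrow}\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}
 + \hat f^\dagger_{r\uparrow}\hat f_{r\uparrow}\,\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow}
 + \hat f^\dagger_{r\downarrow}\hat f_{r\downarrow}\,\hat c^\dagger_{k\downarrow}\hat c_{q\downarrow},
\end{aligned}
$$
so the minus sign is used to swap $\hat c^\dagger_k$ and $\hat c_q$.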
The second mechanism involves the doubly-occupied state with the energy 2εf + Uf, so the
energy difference is εf + Uf − εk in this case. Regarding the operators, they come in a different
sequence and do not require the minus sign to swap k and q, but now we need one to swap $\hat f_r$
and $\hat f^\dagger_r$. For example,
$$
-\hat c^\dagger_{k\uparrow}\hat f_{r\uparrow}\hat f^\dagger_{r\downarrow}\hat c_{q\downarrow}
 = \hat f^\dagger_{r\downarrow}\hat f_{r\uparrow}\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow}.
$$
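At the end of the day, all terms come without the minus sign. The spin-flip contributions can
be written as
$$
\frac1N\sum_{k,q}\sum_r v_k v_q^*\,e^{i(k-q)r}\left(\frac{1}{\varepsilon_k-\varepsilon_f} + \frac{1}{\varepsilon_f+U_f-\varepsilon_k}\right)
\left(\hat S_r^-\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow} + \hat S_r^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\right),
$$
where we recognized the spin operators for the f-states, $\hat S^+ = \hat f^\dagger_\uparrow\hat f_\downarrow$ and $\hat S^- = \hat f^\dagger_\downarrow\hat f_\uparrow$.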
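When both hoppings involve electrons of the same spin, one finds⁶
$$
\hat f^\dagger_{r\uparrow}\hat f_{r\uparrow}\,\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} + \hat f^\dagger_{r\downarrow}\hat f_{r\downarrow}\,\hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}
 = \hat c^\dagger_{k\uparrow}\hat c_{q\uparrow}\,\hat n_{r\uparrow} + \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}\,\hat n_{r\downarrow},
$$
where the $\hat n_r$ operators are defined for the f-states. Using $\hat n_{r\uparrow} = \hat n_r - \hat n_{r\downarrow}$ and $\hat n_{r\downarrow} = \hat n_r - \hat n_{r\uparrow}$,
we can write
$$
\begin{aligned}
\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow}\,\hat n_{r\uparrow} + \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}\,\hat n_{r\downarrow}
 &= \tfrac12\,\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow}\,(\hat n_{r\uparrow} + \hat n_r - \hat n_{r\downarrow}) + \tfrac12\,\hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}\,(\hat n_{r\downarrow} + \hat n_r - \hat n_{r\uparrow}) = \\
 &= \tfrac12(\hat n_{r\uparrow} - \hat n_{r\downarrow})(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} - \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}) + \tfrac12\,\hat n_r\,(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} + \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}) = \\
 &= \hat S_r^z\,(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} - \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}) + \tfrac12\,\hat n_r\,(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} + \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}),
\end{aligned}
$$
so we get the $S^z$ term plus something that does not include the localized electrons and is of no
interest for us at the moment.

⁵ In other words, q is right under the Fermi level, whereas k is right above the Fermi level. This is the only
situation that would allow us to use the perturbation theory in a proper way.
⁶ More precisely, the second mechanism contributes terms like $-\hat f_{r\uparrow}\hat f^\dagger_{r\uparrow} = -1 + \hat f^\dagger_{r\uparrow}\hat f_{r\uparrow}$, and we disregard −1,
because it does not include the localized states.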
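Combining the spin-flip and non-spin-flip contributions, one finds
$$
\frac1N\sum_{k,q}\sum_r v_k v_q^*\,e^{i(k-q)r}\left(\frac{1}{\varepsilon_k-\varepsilon_f} + \frac{1}{\varepsilon_f+U_f-\varepsilon_k}\right)\times
\left[\hat S_r^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow} + \hat S_r^-\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow} + \hat S_r^z\,(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} - \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow})\right]. \tag{7.16}
$$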

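A simpler notation becomes possible if we assume $v_k = v$ and put all energies into the pre-factor
J (it will be explained later),
$$
H_{\rm KL} = \sum_{k,\sigma}\varepsilon_k\,\hat c^\dagger_{k\sigma}\hat c_{k\sigma}
 + \frac J2\,\frac1N\sum_r\sum_{k,q}e^{i(k-q)r}\,\hat{\mathbf S}_r\Big(\sum_{\sigma,\sigma'}\hat c^\dagger_{k\sigma}\,\boldsymbol\sigma_{\sigma\sigma'}\,\hat c_{q\sigma'}\Big), \tag{7.17}
$$
where $\boldsymbol\sigma$ is the vector of Pauli matrices $\sigma_\alpha$ of Eq. (A.15), and the notation in brackets implies
$$
\left(\hat c^\dagger_k\,\boldsymbol\sigma\,\hat c_q\right)_\alpha = \begin{pmatrix}\hat c^\dagger_{k\uparrow} & \hat c^\dagger_{k\downarrow}\end{pmatrix}\times\sigma_\alpha\times\begin{pmatrix}\hat c_{q\uparrow}\\ \hat c_{q\downarrow}\end{pmatrix}.
$$
The Pauli matrices introduce 1/2 in the pre-factor.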
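The square bracket of Eq. (7.16) is the product S·σ written out in components. A minimal sketch, treating the components of S as ordinary numbers (an assumption made purely for illustration; in the lecture they are operators), checks the 2×2 structure of this product against the combination of $S^\pm$ and $S^z$ used above.

```python
# Pauli matrices as nested lists; rows and columns are ordered (up, down).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def s_dot_sigma(sx, sy, sz):
    """Return the 2x2 matrix Sx*sigma_x + Sy*sigma_y + Sz*sigma_z."""
    return [[sx * SIGMA_X[a][b] + sy * SIGMA_Y[a][b] + sz * SIGMA_Z[a][b]
             for b in range(2)] for a in range(2)]


# Arbitrary test values for the spin components (classical stand-ins).
sx, sy, sz = 0.3, -0.7, 0.5
s_plus, s_minus = sx + 1j * sy, sx - 1j * sy

matrix = s_dot_sigma(sx, sy, sz)
# Element [a][b] multiplies c^dagger_{k,a} c_{q,b}:
#   (up, up) -> Sz, (down, down) -> -Sz,
#   (up, down) -> S^-, (down, up) -> S^+,
# i.e. the combination in the square brackets of Eq. (7.16).
expected = [[sz, s_minus], [s_plus, -sz]]
max_deviation = max(abs(matrix[a][b] - expected[a][b])
                    for a in range(2) for b in range(2))
print(f"largest deviation: {max_deviation:.2e}")
```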


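Another useful observation at this juncture is that the summations over k and q are Fourier
transforms of real-space operators $\hat c_{r\sigma}$. This also removes the 1/N pre-factor, and we end up
with the Hamiltonian of the Kondo lattice,
$$
H_{\rm KL} = \sum_{k,\sigma}\varepsilon_k\,\hat c^\dagger_{k\sigma}\hat c_{k\sigma} + \frac J2\sum_r\sum_{\sigma,\sigma'}\hat{\mathbf S}_r\,(\hat c^\dagger_{r\sigma}\,\boldsymbol\sigma_{\sigma\sigma'}\,\hat c_{r\sigma'})
 = \sum_{k,\sigma}\varepsilon_k\,\hat c^\dagger_{k\sigma}\hat c_{k\sigma} + J\sum_r\hat{\mathbf S}_r\,\hat{\mathbf s}_r, \tag{7.18}
$$
because $\frac12\sum_{\sigma,\sigma'}\hat c^\dagger_{r\sigma}\,\boldsymbol\sigma_{\sigma\sigma'}\,\hat c_{r\sigma'} = \hat{\mathbf s}_r$ is merely the spin of the itinerant electron at site r.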
The Hamiltonian of this type was introduced by Kondo on purely phenomenological grounds.
Having derived this Hamiltonian from the Anderson model, we are able to understand the nature
of the Kondo coupling J. It is given by
$$
J = v^2\left(\frac{1}{\varepsilon-\varepsilon_f} + \frac{1}{\varepsilon_f+U_f-\varepsilon}\right), \tag{7.19}
$$

where we assumed $\varepsilon_k \simeq \varepsilon = \text{const}$ to avoid k-dependence (such an approximation holds when
εf is sufficiently far away from the Fermi level and gives the main contribution to the denominators).
In the general case, we have to use the k-dependent Kondo coupling,
$$
J_{kq} = v_k v_q^*\left(\frac{1}{\varepsilon_k-\varepsilon_f} + \frac{1}{\varepsilon_f+U_f-\varepsilon_k}\right), \tag{7.20}
$$
and stay on the level of Eq. (7.16) without introducing $\hat{\mathbf s}_r$.


We can now re-define the Kondo temperature derived in Sec. 7.2 and re-write Eq. (7.15) in
the more common form of
$$
T_K = \frac{W}{k_B}\,e^{-1/[J D(\varepsilon_F)]}, \tag{7.21}
$$
where W is the width of the conduction band replacing |ε0|, and $|\varepsilon_f|/(2v^2)$ in the exponent is replaced by the
inverse Kondo coupling, 1/J.

7.3.2 Indirect exchange (RKKY coupling)
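Direct interaction (hybridization) between the localized and itinerant electrons triggers an
effective interaction between the localized spins. We assess it by resorting to Eq. (7.16) and
writing the Hamiltonian for two localized spins located at the sites r1 and r2,
$$
H_{\text{2-site}} = H_0 + H_1, \quad\text{where}\quad H_0 = \sum_{k,\sigma}\varepsilon_k\,\hat c^\dagger_{k\sigma}\hat c_{k\sigma}
$$
and
$$
H_1 = \frac{J}{2N}\sum_{j=1}^{2}\sum_{k,q}e^{i(k-q)r_j}\left[\hat S_j^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow} + \hat S_j^-\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow} + \hat S_j^z\,(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} - \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow})\right]. \tag{7.22}
$$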

We seek to eliminate the effect of conduction electrons. To this end, we conceive a suitable
canonical transformation in the same vein as the transformation of the Hubbard Hamiltonian
in Sec. 6.2.2.
Canonical transformation: consider an operator $\hat T$ and
$$
H_{\rm eff} = e^{i\hat T}H_{\text{2-site}}\,e^{-i\hat T} \simeq H_0 + H_1 + i[\hat T, H_0] + i[\hat T, H_1] + \frac{i^2}{2}[\hat T,[\hat T, H]] + \ldots. \tag{7.23}
$$
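Let’s impose the condition
$$
H_1 + i[\hat T, H_0] = 0 \tag{7.24}
$$
that eliminates the effect of conduction electrons to the lowest order. Similar to Sec. 6.2.2, the
required form of $\hat T$ is in fact very similar to the form of $H_1$ itself,
$$
i\hat T = \frac{J}{2N}\sum_{j=1}^{2}\sum_{k,q}\frac{e^{i(k-q)r_j}}{\varepsilon_k-\varepsilon_q}\left[\hat S_j^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow} + \hat S_j^-\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow} + \hat S_j^z\,(\hat c^\dagger_{k\uparrow}\hat c_{q\uparrow} - \hat c^\dagger_{k\downarrow}\hat c_{q\downarrow})\right].
$$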

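It is straightforward to verify that Eq. (7.24) holds. For example,
$$
[\hat S_j^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}, H_0]
 = \varepsilon_k\,\hat S_j^+\,[\hat c^\dagger_{k\downarrow}, \hat c^\dagger_{k\downarrow}\hat c_{k\downarrow}]\,\hat c_{q\uparrow}
 + \varepsilon_q\,\hat S_j^+\,\hat c^\dagger_{k\downarrow}\,[\hat c_{q\uparrow}, \hat c^\dagger_{q\uparrow}\hat c_{q\uparrow}]
 = (\varepsilon_q - \varepsilon_k)\,\hat S_j^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow},
$$
which cancels the first term of $H_1$ upon division by (εk − εq), the denominator contained in $i\hat T$.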


An important observation at this point is that $\hat T$ is of the order of J/W, where W ∼ (εk − εq)
is the width of the conduction band. Therefore, Eq. (7.23) is essentially an expansion in powers
of J/W. The lowest-order contributions are of the order of J²/W and arise from
$$
i[\hat T, H_1] + \frac{i^2}{2}[\hat T,[\hat T, H_0]] = i[\hat T, H_1] - \frac i2[\hat T, H_1] = \frac i2[\hat T, H_1].
$$
Similar to Sec. 6.2.2, we are left with computing just one commutator that hopefully contains
the interaction we need.
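Calculation of $[\hat T, H_1]$ is a tedious exercise, because lots of different terms have to be
considered. Let’s look at the intersite terms ($\hat S_1 \leftrightarrow \hat S_2$), because the on-site terms like $\hat S_1 \leftrightarrow \hat S_1$
won’t lead to the interaction we search for. The spin operators on different sites commute,
and the only non-commuting part arises from itinerant electrons. For example, cross-terms like
$\hat S_1^+\hat S_2^-$ will produce the non-zero result only when k and q are swapped on site 2. This way,
$$
\begin{aligned}
[\hat S_1^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow},\; \hat S_2^-\,\hat c^\dagger_{q\uparrow}\hat c_{k\downarrow}]
 &= \hat S_1^+\hat S_2^-\left(\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\hat c^\dagger_{q\uparrow}\hat c_{k\downarrow} - \hat c^\dagger_{q\uparrow}\hat c_{k\downarrow}\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\right) = \\
 &= \hat S_1^+\hat S_2^-\left[\hat n_{k\downarrow}(1-\hat n_{q\uparrow}) - \hat n_{q\uparrow}(1-\hat n_{k\downarrow})\right] = \hat S_1^+\hat S_2^-\,(\hat n_{k\downarrow} - \hat n_{q\uparrow}).
\end{aligned}
$$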

The complementary $\hat S_2^-\hat S_1^+$ term yields $(\hat n_{k\uparrow} - \hat n_{q\downarrow})$, so together we get the contribution of
$$
\hat S_1^+\hat S_2^-\,(\hat n_k - \hat n_q).
$$

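On the other hand, if we don’t swap k and q on site 2, the result is
$$
\begin{aligned}
[\hat S_1^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow},\; \hat S_2^-\,\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow}]
 &= \hat S_1^+\hat S_2^-\left(\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\hat c^\dagger_{k\uparrow}\hat c_{q\downarrow} - \hat c^\dagger_{k\uparrow}\hat c_{q\downarrow}\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\right) = \\
 &= \hat S_1^+\hat S_2^-\left(\hat c^\dagger_{k\uparrow}\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\hat c_{q\downarrow} - \hat c^\dagger_{k\uparrow}\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\hat c_{q\downarrow}\right) = 0,
\end{aligned}
$$
where we do two permutations in each term, so the signs do not change.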


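The absence of terms like $\hat S_1^+\hat S_2^z$ is ensured by the fact that the 1 − 2 and 2 − 1 contributions
cancel. For example,
$$
\begin{aligned}
[\hat S_1^+\,\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow},\; \hat S_2^z\,\hat c^\dagger_{q\uparrow}\hat c_{k\uparrow}]
 &= \hat S_1^+\hat S_2^z\left(\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\hat c^\dagger_{q\uparrow}\hat c_{k\uparrow} - \hat c^\dagger_{q\uparrow}\hat c_{k\uparrow}\hat c^\dagger_{k\downarrow}\hat c_{q\uparrow}\right) = \\
 &= \hat S_1^+\hat S_2^z\left[\hat c^\dagger_{k\downarrow}\hat c_{k\uparrow}(1-\hat n_{q\uparrow}) + \hat c^\dagger_{k\downarrow}\hat c_{k\uparrow}\,\hat n_{q\uparrow}\right] = \hat S_1^+\hat S_2^z\,\hat c^\dagger_{k\downarrow}\hat c_{k\uparrow},
\end{aligned}
$$
but
$$
\begin{aligned}
[-\hat S_2^z\,\hat c^\dagger_{k\downarrow}\hat c_{q\downarrow},\; \hat S_1^+\,\hat c^\dagger_{q\downarrow}\hat c_{k\uparrow}]
 &= -\hat S_1^+\hat S_2^z\left(\hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}\hat c^\dagger_{q\downarrow}\hat c_{k\uparrow} - \hat c^\dagger_{q\downarrow}\hat c_{k\uparrow}\hat c^\dagger_{k\downarrow}\hat c_{q\downarrow}\right) = \\
 &= -\hat S_1^+\hat S_2^z\left[\hat c^\dagger_{k\downarrow}\hat c_{k\uparrow}(1-\hat n_{q\downarrow}) + \hat c^\dagger_{k\downarrow}\hat c_{k\uparrow}\,\hat n_{q\downarrow}\right] = -\hat S_1^+\hat S_2^z\,\hat c^\dagger_{k\downarrow}\hat c_{k\uparrow}.
\end{aligned}
$$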

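Finally, the $\hat S_1^z\hat S_2^z$ terms come with coefficients like $(\hat n_{k\uparrow} - \hat n_{q\uparrow})$ and $(\hat n_{k\downarrow} - \hat n_{q\downarrow})$ that together
yield
$$
2\hat S_1^z\hat S_2^z\,(\hat n_k - \hat n_q).
$$
So we recover the Heisenberg interaction $2\hat S_1^z\hat S_2^z + \hat S_1^+\hat S_2^- + \hat S_1^-\hat S_2^+ = 2\,\hat{\mathbf S}_1\hat{\mathbf S}_2$ between the localized
spins, and the factor of 2 cancels the 1/2 in front of $[\hat T, H_1]$.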
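The fermionic commutators used above can be checked with explicit matrices. The sketch below is an illustration rather than part of the derivation: it represents four modes (k↑, k↓, q↑, q↓) through a Jordan-Wigner construction, which is an assumed but standard way to obtain anticommuting matrices, and evaluates the "swapped" and "non-swapped" commutators. The spin operators are left out because operators on different sites commute and factor out.

```python
import numpy as np
from functools import reduce

# Single-mode matrices in the basis (|0>, |1>).
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # annihilation operator
Z = np.diag([1.0, -1.0])                 # Jordan-Wigner string factor
I2 = np.eye(2)


def annihilator(mode, n_modes):
    """Jordan-Wigner matrix of the annihilation operator for one mode."""
    factors = [Z] * mode + [A] + [I2] * (n_modes - mode - 1)
    return reduce(np.kron, factors)


# Four fermionic modes labelled as in the text.
labels = ["k_up", "k_dn", "q_up", "q_dn"]
c = {name: annihilator(i, 4) for i, name in enumerate(labels)}
cd = {name: op.T for name, op in c.items()}          # matrices are real
n = {name: cd[name] @ c[name] for name in labels}    # number operators


def commutator(x, y):
    return x @ y - y @ x


# k and q swapped on site 2: expected result n_{k,dn} - n_{q,up}.
swapped = commutator(cd["k_dn"] @ c["q_up"], cd["q_up"] @ c["k_dn"])
# k and q not swapped: expected result zero.
unswapped = commutator(cd["k_dn"] @ c["q_up"], cd["k_up"] @ c["q_dn"])

swapped_ok = np.allclose(swapped, n["k_dn"] - n["q_up"])
unswapped_ok = np.allclose(unswapped, 0.0)
print(swapped_ok, unswapped_ok)
```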
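The RKKY Hamiltonian reads as
$$
H_{\rm RKKY} = \sum_{k,\sigma}\varepsilon_k\,\hat c^\dagger_{k\sigma}\hat c_{k\sigma} + \frac12\sum_{i\neq j}J_{ij}\,\hat{\mathbf S}_i\hat{\mathbf S}_j, \tag{7.25}
$$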

where we neglected the constant-energy offset due to the single-site terms in $[\hat T, H_1]$. The
RKKY coupling, named after Ruderman, Kittel, Kasuya, and Yosida, can be expressed by
$$
J_{ij} = -\left(\frac JN\right)^2\sum_{k,q}\cos[(k-q)(r_i-r_j)]\,\frac{n_k-n_q}{\varepsilon_q-\varepsilon_k}, \tag{7.26}
$$
and the cosine term arises from the product of the exponentials $e^{i(k-q)r_1}$ and $e^{i(k-q)r_2}$ in
Eq. (7.22).
After replacing the summation with integration in Eq. (7.26) and solving the not-so-simple
integrals, one evaluates the RKKY interaction as function of $r_{ij}$,
$$
J_{ij} = -\frac{J^2}{\pi^3}\left(\frac{k_F a_0}{2}\right)^6\frac{\sin x - x\cos x}{x^4}, \qquad x = 2k_F r_{ij}. \tag{7.27}
$$
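This is an oscillating function of x (viz. $r_{ij}$) with the amplitude vanishing as $1/r_{ij}^3$ at large
distances, where the x cos x term dominates. The sign of the RKKY interaction depends on the
interatomic distance. At very short distances, $J_{ij}$ is negative, i.e., the interaction is
ferromagnetic in the convention of Eq. (7.25).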
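The distance dependence in Eq. (7.27) is easy to explore numerically. The sketch below evaluates only the dimensionless factor (sin x − x cos x)/x⁴, without the pre-factor (so the sign of $J_{ij}$ is opposite to the sign of this factor), locates its first sign change by bisection, and tabulates the envelope. Function names are hypothetical helpers introduced for this illustration.

```python
import math


def rkky_shape(x):
    """Dimensionless distance dependence of Eq. (7.27), x = 2 k_F r."""
    return (math.sin(x) - x * math.cos(x)) / x**4


def first_sign_change(lo=math.pi, hi=1.5 * math.pi, steps=100):
    """Bisection for the first zero of sin x - x cos x beyond x = 0."""
    def numerator(x):
        return math.sin(x) - x * math.cos(x)

    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if numerator(lo) * numerator(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


x0 = first_sign_change()
print(f"first sign change at x = {x0:.4f}")
for x in (1.0, 3.0, 6.0, 12.0, 24.0):
    shape = rkky_shape(x)
    print(f"x = {x:5.1f}  shape = {shape:+.3e}  x^3 * shape = {x**3 * shape:+.3f}")
```

The product x³ · shape stays bounded at large x, which is a quick way to read off the power law of the envelope.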

7.3.3 Doniach phase diagram
A localized electron inside the metal can choose one of two options: i) interact with the
conduction electrons to form the Kondo singlet; or ii) interact with other localized electrons
(through RKKY) to form a magnetically ordered state. These two scenarios are controlled by
the characteristic temperatures
$$
k_B T_K = W e^{-1/[J D(\varepsilon_F)]} \quad\text{and}\quad T_{\rm RKKY} \simeq \frac{J^2}{W}, \tag{7.28}
$$
where $T_K$ is the Kondo temperature of Eq. (7.21) and $T_{\rm RKKY}$ is the characteristic temperature
of the RKKY interaction.
Both temperatures increase with J, but in a rather different manner. The J-dependence
is quadratic for TRKKY and exponential for TK , so we expect TK < TRKKY at small J and
the other way around at larger J. In the former case, the formation of Kondo singlets is
excluded, because localized spins choose to interact. In the latter case, Kondo singlets exclude
cooperative magnetism, because they form before the localized moments may start interacting.
Here, conduction electrons screen the localized spins (Kondo screening) and prevent cooperative
magnetism.
There is a well-defined border between the two regimes that can be accessed by pressure or
doping.

8 Neutron scattering
8.1 Propagation vector
This section is written under an assumption that you are already familiar with the basics of
scattering on periodic structures, know the meaning of structure factor, and understand how
(x-ray) scattering on atoms works. If you are not sure, please check any standard solid-state
physics textbook before you proceed.
Here, we shall discuss neutron scattering, because neutrons scatter on both nuclei (nuclear
scattering) and electron spins (magnetic scattering). Nuclear scattering leads to reflections
at the reciprocal-lattice sites (hkl), similar to x-ray diffraction. Magnetic neutron scattering
stands for the scattering of neutrons by electron spins. The primary effect of the magnetic order
is an additional periodicity imposed on the crystal. For example, simple antiferromagnetic order
in a cubic crystal renders two neighboring sites of the crystal non-equivalent. Therefore, lattice
period increases, and the reciprocal lattice shrinks, because its period is, consequently, reduced.
A general effect of the magnetic order is the appearance of additional reflections that may or
may not overlap with the nuclear ones. The positions q of the magnetic reflections are described
by
q = q0 ± 2πk, (8.1)
where q0 = 2π(h, k, l) is the position of a nuclear peak, and k is the propagation vector of the
magnetic structure.
Examples. Ferromagnetic order does not change lattice periodicity. Therefore, k = 0.
Simple antiferromagnetic order will generally increase the periodicity by a factor of two. If
each spin on a square lattice is surrounded by opposite spins, the unit cell doubles in size.
This implies the propagation vector k = (1/2, 1/2). The new lattice translations are am = a + b
and bm = a − b (diagonals of the parent unit cell).
Another possibility of an antiferromagnetic order is the stripe order featuring stripes of
parallel spins alternating along one of the crystal directions. The unit cell doubles in size
again, but now am = 2a and bm = b giving rise to k = (1/2, 0). This type of order is typically
observed when antiferromagnetic interactions between next-nearest neighbors (J2 ) are stronger
than interactions between nearest-neighbors (J1 ). The sign of J1 does not really matter, because
this interaction does not contribute to the energy of the stripe antiferromagnetic state.
The aforementioned examples may leave an impression that the component kα = 1/2 of the
propagation vector always stands for the antiferromagnetic order, whereas kα = 0 stands for
the ferromagnetic order along a given crystal direction α. This is generally not true. It may
well happen that the crystallographic unit cell is large enough to fit some antiparallel spins, and
antiferromagnetic order is characterized by k = 0 then. Such an antiferromagnetic order may
be even indistinguishable from ferromagnetic order, unless reflection intensities are considered.
Two further remarks are in order. First, k can be any vector. Its components are not
necessarily simple fractions like 1/3 or 1/2. As long as the components of k are rational numbers,
the magnetic structure is called commensurate. An irrational component of k renders the
magnetic structure incommensurate.
Second, kα may be not only fractions, but also integers. For example, kα = 1 is not unreasonable
as long as some of the nuclear hkl reflections on the diffraction pattern are missing. For
instance, a body-centered lattice imposes the reflection condition h + k + l = 2n (even) on the
hkl reflections. Magnetic ordering may break the body-centering and waive this reflection
condition. In this case, magnetic reflections will appear at (100), (300), (111), and so on,
producing the propagation vector k = (1, 0, 0).
Relation to the spin arrangement. We introduced the propagation vector in a very
formal way, but in fact it is directly related to the spin arrangement. Consider two magnetic

atoms at r and r + δ, where δ is a lattice vector. Magnetic moments on these atoms are $\mu_r$
and $\mu_{r+\delta} = A\,\mu_r$.
Scattering from the crystal at a given point q of the reciprocal space is obtained in the form
of the structure factor,
$$
F(\mathbf q) = \sum_j f_j(\mathbf q)\,e^{i\mathbf q\mathbf r_j} = \sum_j f_j(\mathbf q)\,e^{2\pi i(hx_j+ky_j+lz_j)}, \tag{8.2}
$$

where $f_j(\mathbf q)$ stands for the scattering from an individual atom, and the summation goes over all
atoms. The magnetic moment is contained within $f_j(\mathbf q)$ (details will be discussed in Sec. 8.3),
but for now it’s enough to know that $f_j(\mathbf q)$ is proportional to the magnetic moment (which
is most logical), so the contribution of these two atoms to the reflection at q = 2πk (assume
q0 = 0) is
$$
\mu_r\,e^{2\pi i\mathbf k\mathbf r} + \mu_{r+\delta}\,e^{2\pi i\mathbf k(\mathbf r+\boldsymbol\delta)} = \mu_r\,e^{2\pi i\mathbf k\mathbf r}\left(1 + A\,e^{2\pi i\mathbf k\boldsymbol\delta}\right).
$$
Constructive interference occurs when $A\,e^{2\pi i\mathbf k\boldsymbol\delta} = 1$, hence $\mu_{r+\delta} = \mu_r\,e^{-2\pi i\mathbf k\boldsymbol\delta}$. Therefore, the
propagation vector defines how magnetic moments change under the lattice translation. Consider,
for example, δ = (1, 0, 0). In this case, $\mu_{r+a} = \mu_r\,e^{-2\pi ik_x}$. For kx = 1/2, $\mu_{r+a} = -\mu_r$,
i.e., the ordering along a is simple antiferromagnetic. On the other hand, kx = 0 results in
$\mu_{r+a} = \mu_r$ and ferromagnetic ordering along a.
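The link between the propagation vector and the positions of magnetic reflections can be illustrated on a one-dimensional chain. The sketch below is a toy model under stated assumptions: unit form factors, a finite chain of 16 sites, and real cosine-modulated moments (so both +k and −k show up). It evaluates |F(q)|² on the grid q = 2πm/N and lists the reduced positions m/N where intensity is found.

```python
import cmath
import math


def magnetic_intensity(k, n_sites=16):
    """|F(q)|^2 on the grid q = 2*pi*m/n_sites for moments mu_n = cos(2*pi*k*n)."""
    moments = [math.cos(2 * math.pi * k * n) for n in range(n_sites)]
    intensity = []
    for m in range(n_sites):
        q = 2 * math.pi * m / n_sites
        amplitude = sum(mu * cmath.exp(1j * q * n) for n, mu in enumerate(moments))
        intensity.append(abs(amplitude) ** 2)
    return intensity


def peak_positions(k, n_sites=16, tol=1e-6):
    """Reduced positions m/n_sites where the intensity is non-negligible."""
    intensity = magnetic_intensity(k, n_sites)
    return [m / n_sites for m, value in enumerate(intensity) if value > tol]


print(peak_positions(0.0))    # ferromagnet
print(peak_positions(0.5))    # simple antiferromagnet
print(peak_positions(0.25))   # period-4 modulation
```

The position 0.75 obtained for k = 0.25 is the −k satellite shifted by one reciprocal-lattice unit.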
The factor of 2π is sometimes added to the propagation vector. Therefore, (π, π) is equivalent
to (1/2, 1/2). Crystallographers (including neutron crystallographers) prefer the notation
without the 2π, whereas theorists made a habit of adding 2π to the propagation vector.
Incommensurate magnetic structures: when kα ≠ 0, 1/2, magnetic moments are generally
complex, which may look strange. However, each moment at r + δ has its counterpart at
r − δ, and together they form a real magnetic moment
$$
\mu^{\rm obs}_{r+\delta} = \frac12\left(\mu_{r+\delta} + \mu_{r-\delta}\right) = \mu_r\cos(2\pi\mathbf k\boldsymbol\delta)
$$
that follows cosine modulation. This modulation can have different flavors. The simplest one
is the cosine expression written above. It means that the direction of the moment does not
change, while the size of the moment is modulated. Such magnetic structure is known as the
spin-density wave, because spin density is distributed throughout the crystal as a long-period
modulation. It serves an example of an incommensurate magnetic structure, which is collinear.
Another possibility involves a phase shift between the modulation of different components of
µ. For example, $\mu^x_{r\pm\delta} = \mu_r\cos(2\pi\mathbf k\boldsymbol\delta)$ and $\mu^y_{r\pm\delta} = \mu_r\cos(2\pi\mathbf k\boldsymbol\delta + \pi/2) = -\mu_r\sin(2\pi\mathbf k\boldsymbol\delta)$. In this
case, the size of the moment remains constant ($\mu_r$), but its direction is modulated. Directions
of the magnetic moments follow a circle in the xy plane, thus forming a spiral structure.
It essentially implies a phase shift of π/2 between µx and µy . In the general notation, both
µx and µy are complex numbers. However, whenever µx is purely real on a given site, µy is
purely imaginary, and the other way around. This mathematical construction is very useful for
understanding and solving incommensurate magnetic structures.
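The difference between the two flavors of modulation is easy to tabulate. In this minimal sketch, the value of k and the unit moment amplitude are arbitrary assumptions; it prints the moment size for a spin-density wave and for a spiral on consecutive lattice sites.

```python
import math

K = 0.137   # assumed propagation-vector component (illustrative, not from the text)
MU = 1.0    # assumed moment amplitude


def sdw_moment(delta):
    """Collinear spin-density wave: only the x component is modulated."""
    return (MU * math.cos(2 * math.pi * K * delta), 0.0)


def spiral_moment(delta):
    """Spiral: the y component is shifted by pi/2 with respect to x."""
    phase = 2 * math.pi * K * delta
    return (MU * math.cos(phase), -MU * math.sin(phase))


for delta in range(6):
    sdw_size = math.hypot(*sdw_moment(delta))
    spiral_size = math.hypot(*spiral_moment(delta))
    print(f"site {delta}: |SDW| = {sdw_size:.3f}, |spiral| = {spiral_size:.3f}")
```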
Remark: different flavors of spiral structures exist. Assume that xy is the plane, where
spins rotate. When k ⊥ xy, i.e., the spiral propagates in the direction perpendicular to the
rotation plane, the magnetic structure is called helix. On the other hand, when k ∥ xy, the
magnetic structure is called cycloid.
Another remark: despite their very different nature, the phase shift of π/2 is the only
difference between the spin-density wave and helical magnetic structures. The bad news is that
this phase shift cannot be observed experimentally, at least with standard neutron scattering.
Indeed, any phase shift is of the type $e^{i\varphi}$ and becomes unity when $|F(\mathbf q)|^2$ is calculated.
Therefore, any given neutron diffraction pattern with an incommensurate k can always be interpreted

as a spin-density wave or a suitably chosen helical structure. The choice between the two should
be based on physical arguments, or requires advanced experiments, such as polarized neutron
scattering.
Why do incommensurate structures form? They are usually driven by a competition
of exchange couplings. We can understand this by considering frustrated spin chain as an
example. Suppose that nearest-neighbor spins are coupled by J1 , whereas next-nearest-neighbor
spins are coupled by J2 . Antiferromagnetic J2 is incompatible with J1 , no matter whether J1
is ferro- or antiferromagnetic. Let’s consider an arbitrary spiral configuration with the pitch
angle ϕ. Its energy is
$$
E = -\sum_i J_1\,\mathbf S_i\mathbf S_{i+1} - \sum_i J_2\,\mathbf S_i\mathbf S_{i+2} = N(-J_1\cos\varphi - J_2\cos 2\varphi), \tag{8.3}
$$
where N is the chain length. Energy minimum can be found by taking the derivative,
$$
\frac{dE}{d\varphi} = N(J_1\sin\varphi + 2J_2\sin 2\varphi) = 0, \tag{8.4}
$$
resulting in the condition
$$
\cos\varphi = -\frac{J_1}{4J_2}. \tag{8.5}
$$
Spiral order forms for |J2| > |J1|/4 only. Exact value of the pitch angle depends on the
ratio of the competing exchange couplings. In the |J2| ≫ |J1| limit, ϕ approaches 90°, and the
magnetic structure splits into two antiferromagnetic chains, because J2 dominates.
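Eq. (8.5) can be cross-checked by minimizing the energy of Eq. (8.3) directly. The sketch below uses a brute-force grid search over the pitch angle with unit-length classical spins; the coupling values are arbitrary illustrations, and the sign convention follows Eq. (8.3), where a negative J is antiferromagnetic.

```python
import math


def energy_per_site(phi, j1, j2):
    """E/N of Eq. (8.3) for a spiral with pitch angle phi."""
    return -j1 * math.cos(phi) - j2 * math.cos(2 * phi)


def pitch_angle_numeric(j1, j2, n_grid=200001):
    """Brute-force minimization of E/N over phi in [0, pi]."""
    grid = (math.pi * i / (n_grid - 1) for i in range(n_grid))
    return min(grid, key=lambda phi: energy_per_site(phi, j1, j2))


def pitch_angle_formula(j1, j2):
    """Eq. (8.5); only meaningful when |J1 / (4 J2)| <= 1."""
    return math.acos(-j1 / (4 * j2))


# Ferromagnetic J1 and antiferromagnetic J2 of equal strength: spiral.
phi_numeric = pitch_angle_numeric(1.0, -1.0)
phi_formula = pitch_angle_formula(1.0, -1.0)
print(math.degrees(phi_numeric), math.degrees(phi_formula))

# |J2| < |J1|/4: the minimum stays at the collinear (ferromagnetic) state.
print(math.degrees(pitch_angle_numeric(1.0, -0.2)))
```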
Remark: it may be tempting to conclude that non-collinear magnetic structures are always
incommensurate, but that’s not necessarily the case. Non-collinear structures can also be
commensurate. For example, canted order, where spins systematically tilt into clockwise and
counter-clockwise directions, is commensurate with the lattice.

8.2 Symmetry of the magnetic structures


Magnetic structures obey symmetry relations that can be expressed using two different ap-
proaches, magnetic space groups and irreducible representations of symmetry groups. The
ideas behind the symmetries are basically the same as in standard (”atomic”) crystallography.
Symmetry simplifies the description of a magnetic structure by reducing the number of param-
eters that describe it. This is vital for the magnetic structure determination, where the amount
of experimental data (measured magnetic reflections) is never too high. Further on, symmetry
imposes constraints on the interaction of the system with the field and on the response functions
like magnetic susceptibility.
Symmetry elements. Before we construct magnetic space groups, let’s look at individual
symmetry elements. At first glance, we perfectly know what a rotation axis or a mirror plane
should do to a vector. This intuitive knowledge is, however, only true for polar vectors. Spin
(and any magnetic moment) is an axial vector, a cross-product of two vectors r and p per
Eq. (1.3), so the symmetry transformations are given by the effect of the symmetry element on
r and p and not on S directly.
For example, the inversion center 1̄ reverses the direction of r and p, but does not reverse the
direction of S! Consider now the mirror plane perpendicular to the c-axis. The spin along c,
S = (0, 0, S), arises from r = (r, 0, 0) and p = (0, p, 0). As the mirror plane does not change the
directions of r and p, the direction of S is also retained. On the other hand, with S = (S, 0, 0)
one implies r = (0, r, 0) and p = (0, 0, p), and p changes the direction to (0, 0, −p) but r does
not. Therefore, the spin direction is also reversed to (−S, 0, 0). In summary, the mirror plane
flips the spin parallel to the plane and does not flip the spin perpendicular to the plane.
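The transformation rules for an axial vector can be reproduced by transforming r and p as ordinary (polar) vectors and recomputing the cross product. The sketch below does this for the inversion center and for the mirror plane perpendicular to c; representing the operations as tuples of ±1 factors is a simplification chosen for this illustration and only covers diagonal operations.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def apply(op, vec):
    """Apply a diagonal symmetry operation (tuple of +1/-1 factors) to a polar vector."""
    return tuple(s * x for s, x in zip(op, vec))


def transform_axial(op, r, p):
    """Transform r and p as polar vectors and rebuild the axial vector r x p."""
    return cross(apply(op, r), apply(op, p))


INVERSION = (-1, -1, -1)
MIRROR_C = (1, 1, -1)        # mirror plane perpendicular to the c-axis

# Spin along c from r || a, p || b; spin along a from r || b, p || c.
print(cross((1, 0, 0), (0, 1, 0)), transform_axial(INVERSION, (1, 0, 0), (0, 1, 0)))
print(cross((1, 0, 0), (0, 1, 0)), transform_axial(MIRROR_C, (1, 0, 0), (0, 1, 0)))
print(cross((0, 1, 0), (0, 0, 1)), transform_axial(MIRROR_C, (0, 1, 0), (0, 0, 1)))
```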

We see that not every symmetry element changes the spin direction. Because
we still need a way to flip the spin, one introduces an additional operation, time reversal,
that changes the spin direction. This operation is denoted with the prime sign. For example,
m′ flips the spin perpendicular to the plane and does not flip the spin parallel to the plane, so
its effect is opposite to that of m.
Magnetic space groups are similar to crystallographic space groups. The latter are
formed by several symmetry elements that give rise to 230 different combinations allowed in
crystals. Magnetic symmetry brings further opportunities. By combining standard symmetry
operations with time reversal, one finds 1651 different magnetic space groups that are allowed
in 3D periodic crystals. Such space groups are sometimes called black and white depending on
the presence or absence of the time-reversal operator. The parent crystallographic space groups
are then colourless. For any colourless space group F, F′ = F + 1′ is a grey (paramagnetic)
space group, because each spin is flipped by 1′ resulting in a paramagnetic state.
For k ≠ 0 the magnetic unit cell is larger than the crystallographic one. When kα = 1/2,
this can be taken into account by combining translation operations with the spin flip 1′. For
example, P stands for primitive translations, and $P_{2a}$ means that all spins are flipped upon the
translation by (1, 0, 0), resulting in the propagation vector k = (1/2, 0, 0). Likewise, $P_C$ implies
the k = (1/2, 1/2, 0) order, and $P_F$ stands for the k = (1/2, 1/2, 1/2) order.
Irreducible representations: the space-group approach works as long as we are con-
cerned with commensurate magnetic structures. Incommensurate magnetic structures require
the formalism of (3+n)-dimensional crystallography, which is tedious enough to be avoided.
Alternatively, we can resort to another approach, which is based on irreducible representations
(irreps) of symmetry groups.
In a nutshell, each irrep defines how individual symmetry elements act upon a given object.
For magnetic moments, +1 means that the magnetic moment component remains intact, and
−1 means that this component changes sign. Values other than +1 and −1 are possible too and
imply that the size of the moment changes upon the symmetry operation. Irreps are always
defined with respect to the given atomic position and the propagation vector.
One reason for using irreps in magnetic structure analysis is Landau’s theory of phase transi-
tions. It requires that a second-order phase transition follows a single irreducible representation
of the symmetry group. Since magnetic transitions are usually of second order (and this can
be verified experimentally for a given material), magnetic structure should be described by a
single irrep. This is not always the case (and not all transitions are of second order), but irreps
always give useful guidance on what the magnetic structure could look like.

8.3 Structure factors


Magnetic excitations can be observed by inelastic neutron scattering. It should not be con-
fused with the elastic scattering (neutron diffraction), which is used for magnetic structure
determination. The description of these phenomena is, however, quite similar. Most generally,
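the differential cross-section for neutron scattering can be expressed as
$$
\left(\frac{d\sigma}{d\Omega\,dE}\right)_{k_0\to k_1} = \frac1N\,\frac{k_1}{k_0}\left(\frac{m_n}{2\pi\hbar^2}\right)^2\sum_{\lambda_0,\lambda_1}p_{\lambda_0}p_{\lambda_1}\,|\langle k_1\lambda_1|V|k_0\lambda_0\rangle|^2\,\delta(E - (E_{\lambda_1} - E_{\lambda_0})). \tag{8.6}
$$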

Let’s try to understand all the different symbols entering this expression. σ is the scattered
neutron flux, here calculated per solid angle Ω and energy E. The δ-function on the right-hand side
means that we pick only those energies that match the energy difference between the states
$|k_1,\lambda_1\rangle$ and $|k_0,\lambda_0\rangle$, where the former is the final state and the latter is the initial state of the
system. What remains is the pre-factor ($m_n$ stands for the neutron mass), the probability $p_{\lambda_i}$
for a state $\lambda_i$ to occur, and the matrix elements of the interaction potential V that describes
the interaction between the neutron spin and electron spin. Eq. (8.6) is the master equation
for neutron scattering (note that this expression is for the non-polarized case; when neutron
polarization is taken into account, additional symbols and summations appear).
The interaction potential V has a fairly unpleasant form
$$
V = -2\gamma\mu_N\mu_B\,\boldsymbol\sigma\left[\nabla\times\left(\frac{\mathbf s\times\hat{\mathbf r}}{r^2}\right) + \frac1\hbar\,\frac{\mathbf p\times\hat{\mathbf r}}{r^2}\right], \tag{8.7}
$$
where r defines the position of an electron, and $\hat{\mathbf r} = \mathbf r/|\mathbf r|$ is the unit vector along
r. The first term stands for the interaction with the spin moment s, while the second one is
the interaction with the orbital motion of the electron with momentum p. The pre-factor contains
γ = 1.9132 (gyromagnetic ratio for a neutron) and the nuclear magneton $\mu_N$. Lastly, σ is the vector
of Pauli matrices for the neutron as a spin-1/2 particle.
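What we want to do now are two Fourier transforms. The first one will convert real-space
vectors r into reciprocal-space vectors q. To this end, we introduce
$$
\mathbf M_\perp(\mathbf q) = 2\mu_B\sum_i\left(\hat{\mathbf q}\times[\mathbf s_i\times\hat{\mathbf q}] + \frac{i}{\hbar q}\,\mathbf p_i\times\hat{\mathbf q}\right)e^{i\mathbf q\mathbf r_i},
$$
where the summation is over atoms i, q is the scattering vector, and $\hat{\mathbf q} = \mathbf q/|\mathbf q|$ is the
corresponding unit vector. The subscript ⊥ appears because $\hat{\mathbf q}\times[\mathbf s_i\times\hat{\mathbf q}]$ selects the spin component, which is
perpendicular to q, and that’s why $\mathbf M_\perp$ appears in the magnetic form-factor.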
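We can now write
$$
\langle k_1|V|k_0\rangle = 4\pi\gamma\mu_N\,\boldsymbol\sigma\,\mathbf M_\perp(\mathbf q),
$$
where the scattering vector q is, naturally, q = k1 − k0. Moreover, the projection $\mathbf M_\perp$ can be
expressed as
$$
\mathbf M_\perp = \mathbf M - (\mathbf M\hat{\mathbf q})\,\hat{\mathbf q} \;\Rightarrow\; \mathbf M_\perp^+\mathbf M_\perp = \sum_{\alpha,\beta=x,y,z}(\delta_{\alpha\beta} - \hat q_\alpha\hat q_\beta)\,M^+_\alpha M_\beta,
$$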

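where
\[
\mathbf{M}(\mathbf{q}) = \sum_i \mathbf{M}(\mathbf{r}_i)\, e^{i\mathbf{q}\cdot\mathbf{r}_i} \tag{8.8}
\]
is the Fourier transform of the magnetization. This brings Eq. (8.6) to the form
\[
\left(\frac{d\sigma}{d\Omega\,dE}\right)_{\mathbf{q}} = \frac{1}{N}\frac{k_1}{k_0}(\gamma r_0)^2 \sum_{\alpha,\beta=x,y,z} (\delta_{\alpha\beta}-\hat{q}_\alpha\hat{q}_\beta) \sum_{\lambda_0,\lambda_1} p_{\lambda_0} p_{\lambda_1}\, \langle\lambda_0|M_\alpha^{+}|\lambda_1\rangle\langle\lambda_1|M_\beta|\lambda_0\rangle\, \delta\big(E-(E_{\lambda_1}-E_{\lambda_0})\big)
\]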

and gives us a glimpse of its physical meaning: the neutron scattering is determined by matrix
elements of the magnetization. The pre-factor has changed quite a bit and now contains the classical
electron radius $r_0 = e^2/(m_e c_0^2) = 2.82\times 10^{-15}$ m.
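As a quick arithmetic check of the pre-factor, the Python sketch below evaluates the classical electron radius and the product γr₀. The text quotes r₀ in Gaussian units; the sketch assumes the equivalent SI form r₀ = e²/(4πε₀mₑc²), uses CODATA constants entered by hand, and takes γ ≈ 1.913. All variable names are illustrative.

```python
import math

# Physical constants in SI units (CODATA values entered by hand; an assumption of this sketch).
E_CHARGE = 1.602176634e-19      # elementary charge, C
EPSILON_0 = 8.8541878128e-12    # vacuum permittivity, F/m
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
C_LIGHT = 299792458.0           # speed of light in vacuum, m/s
GAMMA_NEUTRON = 1.913           # neutron magnetic moment in nuclear magnetons (magnitude)

# SI counterpart of the Gaussian expression r0 = e^2 / (m_e c^2).
r0 = E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C_LIGHT**2)

# Length that sets the scale of magnetic neutron scattering.
gamma_r0 = GAMMA_NEUTRON * r0

# Cross-section scale (gamma r0)^2 expressed in barn (1 barn = 1e-28 m^2).
cross_section_barn = gamma_r0**2 / 1e-28

print(f"r0           = {r0:.3e} m")
print(f"gamma * r0   = {gamma_r0:.3e} m")
print(f"(gamma r0)^2 = {cross_section_barn:.2f} barn")
```

The resulting (γr₀)² is roughly 0.29 barn, which is comparable to typical nuclear scattering cross-sections and is the reason magnetic and nuclear scattering of neutrons have similar strength.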
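The remaining part is the second Fourier transform, which converts the energy variable into
the time variable. Matrix elements are replaced by time-dependent magnetizations, and we end
up with
\[
\left(\frac{d\sigma}{d\Omega\,dE}\right)_{\mathbf{q}} = \frac{1}{N}\frac{k_1}{k_0}\frac{(\gamma r_0)^2}{2\pi\hbar} \sum_{\alpha,\beta=x,y,z} (\delta_{\alpha\beta}-\hat{q}_\alpha\hat{q}_\beta) \int \langle M_\alpha(-\mathbf{q},0)\,M_\beta(\mathbf{q},t)\rangle\, e^{-iEt/\hbar}\,dt. \tag{8.9}
\]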

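We are left to express M using Eq. (8.8) and add the magnetic form-factors $f_{\mathrm{mag}}$. This provides
the complete expression for the neutron scattering cross-section,
\[
\left(\frac{d\sigma}{d\Omega\,dE}\right)_{\mathbf{q}} = \frac{1}{N}\frac{k_1}{k_0}\frac{(\gamma r_0)^2}{2\pi\hbar} \sum_{\alpha,\beta=x,y,z} (\delta_{\alpha\beta}-\hat{q}_\alpha\hat{q}_\beta) \sum_{i,j} f_{\mathrm{mag},i}^{*}(\mathbf{q})\, f_{\mathrm{mag},j}(\mathbf{q}) \int \langle \mu_{i\alpha}(0)\,\mu_{j\beta}(t)\rangle\, e^{-i\mathbf{q}\cdot(\mathbf{r}_i-\mathbf{r}_j)}\, e^{-iEt/\hbar}\,dt. \tag{8.10}
\]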

The physical meaning of this complex expression is contained in the integrand. Neutron
scattering is generally sensitive to the correlation between the magnetic moment $\mu_i$ at time 0
and the magnetic moment $\mu_j$ at time t. The double sum with the time integral on the right-hand
side of Eq. (8.10) is called the dynamic structure factor,
\[
S_{\alpha\beta}(\mathbf{q},\omega) = \sum_{i,j} f_{\mathrm{mag},i}^{*}(\mathbf{q})\, f_{\mathrm{mag},j}(\mathbf{q}) \int \langle \mu_{i\alpha}(0)\,\mu_{j\beta}(t)\rangle\, e^{-i\mathbf{q}\cdot(\mathbf{r}_i-\mathbf{r}_j)}\, e^{-i\omega t}\,dt \tag{8.11}
\]
(E = ħω), and it is the quantity measured experimentally by inelastic neutron scattering. It
contains (albeit in a rather hidden form) full information on spin-spin correlations in the system.
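The static structure factor is defined as the structure factor at t = 0,
\[
S_{\alpha\beta}(\mathbf{q}) = \sum_{i,j} f_{\mathrm{mag},i}^{*}(\mathbf{q})\, f_{\mathrm{mag},j}(\mathbf{q})\, \langle \mu_{i\alpha}(0)\,\mu_{j\beta}(0)\rangle\, e^{-i\mathbf{q}\cdot(\mathbf{r}_i-\mathbf{r}_j)}. \tag{8.12}
\]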

It probes the instantaneous state of the system. In theoretical studies of magnetic systems, the static
structure factor is the best probe of long-range magnetic order, because in the fully ordered
state $S_{\alpha\beta}(\mathbf{q})$ grows with the system size and becomes infinitely large at q = k (the propagation
vector) in an infinite system, while remaining small at any other q. However, the static structure
factor is not directly measured in the experiment.
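The behaviour of the static structure factor in an ordered state can be illustrated with a toy model. The Python sketch below assumes a one-dimensional chain of classical moments ±1 in Néel order, unit lattice spacing, and form-factors set to 1, so that the correlator reduces to the product µᵢµⱼ. These simplifications and all names are assumptions made for illustration only.

```python
import cmath
import math

def static_structure_factor(moments, q):
    """Double sum over sites of mu_i * mu_j * exp(-i q (r_i - r_j)) with r_i = i.

    Classical, static moments are assumed, so the correlator <mu_i mu_j>
    is simply the product mu_i * mu_j; form-factors are set to 1.
    """
    n_sites = len(moments)
    total = 0.0 + 0.0j
    for i in range(n_sites):
        for j in range(n_sites):
            total += moments[i] * moments[j] * cmath.exp(-1j * q * (i - j))
    return total.real

N_SITES = 16
neel_chain = [(-1) ** i for i in range(N_SITES)]   # up, down, up, down, ...

s_at_pi = static_structure_factor(neel_chain, math.pi)           # q at the propagation vector
s_at_zero = static_structure_factor(neel_chain, 0.0)             # q = 0
s_at_half_pi = static_structure_factor(neel_chain, math.pi / 2)  # generic q

print(s_at_pi, s_at_zero, s_at_half_pi)
```

At q = π every term of the double sum equals 1, so the result is N² and grows without bound as the chain gets longer, whereas the sums at the other two wave vectors cancel.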
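What can be measured in the experiment is the elastic scattering,
\[
S_{\alpha\beta}(\mathbf{q},0) = \sum_{i,j} f_{\mathrm{mag},i}^{*}(\mathbf{q})\, f_{\mathrm{mag},j}(\mathbf{q}) \int \langle \mu_{i\alpha}(0)\,\mu_{j\beta}(t)\rangle\, e^{-i\mathbf{q}\cdot(\mathbf{r}_i-\mathbf{r}_j)}\,dt, \tag{8.13}
\]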

which is the time-average of spin-spin correlations. For a long-range ordered state this yields robust
Bragg peaks at selected q’s, because spins are correlated at any time t.
Short-range order implies local spin-spin correlations that do not extend over the whole crystal
and do not persist to arbitrarily long times t. At shorter t’s, however, these correlations are present.
Therefore, they can be probed by elastic scattering experiments, giving rise to the so-called
diffuse scattering. A typical example is the diffuse scattering from short-range order
right above the Néel temperature, where sharp Bragg peaks give way to broad diffuse features
that occur at about the same positions in the q-space.
The opposite limit is a paramagnet, where the time-average of intersite correlations is zero, and the
time-average of on-site correlations yields the square of the effective moment ($g\mu_B\sqrt{S(S+1)}$),
similar to the Curie law. This explains magnetic scattering from paramagnets that decays with
q following
\[
\left(\frac{d\sigma}{d\Omega\,dE}\right)_{\mathbf{q}} \propto \frac{2}{3}\,|f_{\mathrm{mag}}(\mathbf{q})|^2\, S(S+1). \tag{8.14}
\]

A Angular momentum operator
A.1 Commutation relations
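Consider the angular momentum operator $\hat{\mathbf{L}} = \mathbf{r}\times\hat{\mathbf{p}}$ with the components
\[
\hat{L}_x = y\hat{p}_z - z\hat{p}_y, \qquad \hat{L}_y = z\hat{p}_x - x\hat{p}_z, \qquad \hat{L}_z = x\hat{p}_y - y\hat{p}_x.
\]
To calculate their commutation relations, we shall make use of two general statements,
\[
[\hat{A},\hat{B}\hat{C}] = [\hat{A},\hat{B}]\hat{C} + \hat{B}[\hat{A},\hat{C}], \tag{A.1}
\]
and
\[
[r_\alpha,\hat{p}_\beta] = i\hbar\,\delta_{\alpha\beta}, \tag{A.2}
\]
where the former is trivial, and the latter follows from
\[
[r_\alpha,\hat{p}_\beta]\psi = -i\hbar\, r_\alpha\frac{\partial\psi}{\partial r_\beta} + i\hbar\,\frac{\partial(r_\alpha\psi)}{\partial r_\beta} = i\hbar\,\psi\,\frac{\partial r_\alpha}{\partial r_\beta} = i\hbar\,\delta_{\alpha\beta}\,\psi.
\]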

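We can now use this knowledge to express $[\hat{L}_x,\hat{L}_y]$ as
\[
\begin{aligned}
{}[\hat{L}_x,\hat{L}_y] &= [y\hat{p}_z, z\hat{p}_x] - [y\hat{p}_z, x\hat{p}_z] - [z\hat{p}_y, z\hat{p}_x] + [z\hat{p}_y, x\hat{p}_z] \\
&= y[\hat{p}_z,z]\hat{p}_x - y[\hat{p}_z,\hat{p}_z]x - \hat{p}_y[z,z]\hat{p}_x + \hat{p}_y[z,\hat{p}_z]x = i\hbar(x\hat{p}_y - y\hat{p}_x) = i\hbar\hat{L}_z,
\end{aligned}
\]
where we took advantage of Eq. (A.1) in the sense that an operator can be taken out of the
commutator when it commutes with the other part. This way,
\[
[\hat{L}_x,\hat{L}_y] = i\hbar\hat{L}_z, \qquad [\hat{L}_y,\hat{L}_z] = i\hbar\hat{L}_x, \qquad [\hat{L}_z,\hat{L}_x] = i\hbar\hat{L}_y. \tag{A.3}
\]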

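Consider now $\hat{\mathbf{L}}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2$ that commutes with all individual components of $\hat{\mathbf{L}}$. For
example,
\[
\begin{aligned}
{}[\hat{L}_x,\hat{\mathbf{L}}^2] &= [\hat{L}_x,\hat{L}_x^2] + [\hat{L}_x,\hat{L}_y^2] + [\hat{L}_x,\hat{L}_z^2] = [\hat{L}_x,\hat{L}_y]\hat{L}_y + \hat{L}_y[\hat{L}_x,\hat{L}_y] \\
&\quad + [\hat{L}_x,\hat{L}_z]\hat{L}_z + \hat{L}_z[\hat{L}_x,\hat{L}_z] = i\hbar(\hat{L}_z\hat{L}_y + \hat{L}_y\hat{L}_z - \hat{L}_y\hat{L}_z - \hat{L}_z\hat{L}_y) = 0.
\end{aligned}
\]
Altogether,
\[
[\hat{L}_x,\hat{\mathbf{L}}^2] = 0, \qquad [\hat{L}_y,\hat{\mathbf{L}}^2] = 0, \qquad [\hat{L}_z,\hat{\mathbf{L}}^2] = 0. \tag{A.4}
\]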

A.2 Eigenvalues of L̂z


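To obtain eigenvalues of $\hat{L}_z$, we shall use spherical coordinates
\[
x = r\cos\varphi\sin\theta, \qquad y = r\sin\varphi\sin\theta, \qquad z = r\cos\theta
\]
and calculate
\[
\frac{\partial\psi}{\partial\varphi} = \frac{\partial\psi}{\partial x}\frac{\partial x}{\partial\varphi} + \frac{\partial\psi}{\partial y}\frac{\partial y}{\partial\varphi} + \frac{\partial\psi}{\partial z}\frac{\partial z}{\partial\varphi} = -r\sin\varphi\sin\theta\,\frac{\partial\psi}{\partial x} + r\cos\varphi\sin\theta\,\frac{\partial\psi}{\partial y} = -y\,\frac{\partial\psi}{\partial x} + x\,\frac{\partial\psi}{\partial y},
\]
where the right-hand side is $\hat{L}_z\psi$ up to the factor of $-i\hbar$. Therefore,
\[
\hat{L}_z = -i\hbar\,\frac{\partial}{\partial\varphi}. \tag{A.5}
\]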

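The eigenvalues of $\hat{L}_z$ are determined from the first-order differential equation
\[
-i\hbar\,\frac{\partial\psi}{\partial\varphi} = \lambda\psi.
\]
Its solution $\psi = \psi_0\, e^{i\lambda\varphi/\hbar}$ must be a periodic function of ϕ (by definition of ϕ). Therefore,
\[
\psi(\varphi) = \psi(\varphi + 2\pi) \;\Rightarrow\; e^{2\pi i\lambda/\hbar} = 1 \;\Rightarrow\; 2\pi\lambda/\hbar = 2\pi m,
\]
hence λ = mħ with m integer.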

A.3 Ladder operators


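From now on, we shall switch to the general angular momentum operator $\hat{\mathbf{J}}$ defined by the
commutation relations of Eq. (A.3) without any explicit relation to r and $\hat{\mathbf{p}}$. As commuting
operators, $\hat{\mathbf{J}}^2$ and $\hat{J}_z$ have a common set of eigenstates that can be written as $|\lambda,m\rangle$, where
$\lambda\hbar^2$ and $m\hbar$ are eigenvalues of $\hat{\mathbf{J}}^2$ and $\hat{J}_z$, respectively.
Let’s introduce the operators
\[
\hat{J}_+ = \hat{J}_x + i\hat{J}_y, \qquad \hat{J}_- = \hat{J}_x - i\hat{J}_y \tag{A.6}
\]
that, by virtue of Eq. (A.3), follow the commutation relations
\[
[\hat{J}_z,\hat{J}_+] = [\hat{J}_z,\hat{J}_x + i\hat{J}_y] = i\hbar\hat{J}_y + \hbar\hat{J}_x = \hbar\hat{J}_+, \qquad [\hat{J}_z,\hat{J}_-] = -\hbar\hat{J}_- \tag{A.7}
\]
and
\[
[\hat{J}_+,\hat{J}_-] = -i[\hat{J}_x,\hat{J}_y] + i[\hat{J}_y,\hat{J}_x] = 2\hbar\hat{J}_z. \tag{A.8}
\]
This way,
\[
\hat{J}_z\hat{J}_+|\lambda,m\rangle = [\hat{J}_z,\hat{J}_+]|\lambda,m\rangle + \hat{J}_+\hat{J}_z|\lambda,m\rangle = \hbar\hat{J}_+|\lambda,m\rangle + m\hbar\,\hat{J}_+|\lambda,m\rangle = (m+1)\hbar\,\hat{J}_+|\lambda,m\rangle,
\]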

and $\hat{J}_+|\lambda,m\rangle$ is an eigenstate of $\hat{J}_z$ with the eigenvalue of (m + 1)ħ. Likewise, $\hat{J}_-|\lambda,m\rangle$ is an
eigenstate with the eigenvalue of (m − 1)ħ. Therefore, $\hat{J}_\pm$ are ladder operators, or raising and
lowering operators. Their effect on $|\lambda,m\rangle$ is given by
\[
\hat{J}_+|\lambda,m\rangle = \alpha_+|\lambda,m+1\rangle, \qquad \hat{J}_-|\lambda,m\rangle = \alpha_-|\lambda,m-1\rangle, \tag{A.9}
\]
where the coefficients $\alpha_+$ and $\alpha_-$ remain to be determined.

A.4 Eigenvalues of Ĵ2


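The eigenvalues of $\hat{\mathbf{J}}^2$ are necessarily non-negative, because
\[
\lambda\hbar^2 = \langle\lambda,m|\hat{\mathbf{J}}^2|\lambda,m\rangle = \sum_\alpha \langle\lambda,m|\hat{J}_\alpha^\dagger\hat{J}_\alpha|\lambda,m\rangle = \sum_\alpha \big\|\hat{J}_\alpha|\lambda,m\rangle\big\|^2 \geq 0.
\]
On the other hand,
\[
m^2\hbar^2 = \langle\lambda,m|\hat{J}_z^2|\lambda,m\rangle = \langle\lambda,m|\hat{\mathbf{J}}^2|\lambda,m\rangle - \langle\lambda,m|\hat{J}_x^2+\hat{J}_y^2|\lambda,m\rangle = \lambda\hbar^2 - \big\|\hat{J}_x|\lambda,m\rangle\big\|^2 - \big\|\hat{J}_y|\lambda,m\rangle\big\|^2,
\]
and $m^2 \leq \lambda$ because both norms are non-negative. So m is bounded, and we can define $m_{\min}$ and
$m_{\max}$ such that $m_{\min} \leq m \leq m_{\max}$.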

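Let’s now express
\[
\hat{J}_-\hat{J}_+ = (\hat{J}_x - i\hat{J}_y)(\hat{J}_x + i\hat{J}_y) = \hat{J}_x^2 + \hat{J}_y^2 + i[\hat{J}_x,\hat{J}_y] = \hat{\mathbf{J}}^2 - \hat{J}_z^2 - \hbar\hat{J}_z, \tag{A.10}
\]
\[
\hat{J}_+\hat{J}_- = (\hat{J}_x + i\hat{J}_y)(\hat{J}_x - i\hat{J}_y) = \hat{J}_x^2 + \hat{J}_y^2 - i[\hat{J}_x,\hat{J}_y] = \hat{\mathbf{J}}^2 - \hat{J}_z^2 + \hbar\hat{J}_z \tag{A.11}
\]
and calculate
\[
\begin{aligned}
\hat{J}_-\hat{J}_+|\lambda,m_{\max}\rangle &= (\hat{\mathbf{J}}^2 - \hbar\hat{J}_z - \hat{J}_z^2)\,|\lambda,m_{\max}\rangle = (\lambda - m_{\max} - m_{\max}^2)\hbar^2\,|\lambda,m_{\max}\rangle = 0, \\
\hat{J}_+\hat{J}_-|\lambda,m_{\min}\rangle &= (\hat{\mathbf{J}}^2 + \hbar\hat{J}_z - \hat{J}_z^2)\,|\lambda,m_{\min}\rangle = (\lambda + m_{\min} - m_{\min}^2)\hbar^2\,|\lambda,m_{\min}\rangle = 0,
\end{aligned}
\]
where both vectors are zero, because we apply $\hat{J}_+$ to the eigenstate with the highest possible m
and, likewise, $\hat{J}_-$ to the eigenstate with the lowest possible m. This leads to a simple condition
\[
\lambda = m_{\max}^2 + m_{\max} = m_{\min}^2 - m_{\min} \;\Rightarrow\; m_{\max}^2 - m_{\min}^2 = -(m_{\max} + m_{\min})
\]
that only holds when $m_{\max} = -m_{\min}$, because the second solution $m_{\max} - m_{\min} = -1$ violates
$m_{\max} \geq m_{\min}$.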
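It is now convenient to define $m_{\max} = -m_{\min} = j \geq 0$ and obtain $\lambda\hbar^2 = j(j+1)\hbar^2$ as the
eigenvalue of $\hat{\mathbf{J}}^2$. The ladder operators $\hat{J}_+$ and $\hat{J}_-$ change m in steps of 1, so they will span the
full range of −j ≤ m ≤ +j only if 2j is an integer, that is, if j is integer or half-integer. This
defines the possible eigenvalues of $\hat{\mathbf{J}}^2$ and $\hat{J}_z$.
To elucidate the effect of the ladder operators, we note that $\hat{J}_+^\dagger = \hat{J}_x - i\hat{J}_y = \hat{J}_-$ (and,
likewise, $\hat{J}_-^\dagger = \hat{J}_+$) and make use of Eqs. (A.10) and (A.11),
\[
\begin{aligned}
|\alpha_+|^2 &= \langle\lambda,m|\hat{J}_+^\dagger\hat{J}_+|\lambda,m\rangle = \langle\lambda,m|\hat{J}_-\hat{J}_+|\lambda,m\rangle = \langle\lambda,m|(\hat{\mathbf{J}}^2 - \hat{J}_z^2 - \hbar\hat{J}_z)|\lambda,m\rangle, \\
|\alpha_-|^2 &= \langle\lambda,m|\hat{J}_-^\dagger\hat{J}_-|\lambda,m\rangle = \langle\lambda,m|\hat{J}_+\hat{J}_-|\lambda,m\rangle = \langle\lambda,m|(\hat{\mathbf{J}}^2 - \hat{J}_z^2 + \hbar\hat{J}_z)|\lambda,m\rangle,
\end{aligned}
\]
which lead to
\[
|\alpha_+|^2 = \hbar^2[j(j+1) - m(m+1)] = \hbar^2(j-m)(j+m+1), \tag{A.12}
\]
\[
|\alpha_-|^2 = \hbar^2[j(j+1) - m(m-1)] = \hbar^2(j+m)(j-m+1). \tag{A.13}
\]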
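Equations (A.12) and (A.13) are easy to tabulate. The Python sketch below works in units of ħ² and uses exact rational arithmetic; the function names `alpha_sq` and `m_values` are hypothetical helpers introduced only for this illustration. It shows that the raising coefficient vanishes at m = j and the lowering coefficient at m = −j, so the ladder terminates at both ends.

```python
from fractions import Fraction

def alpha_sq(j, m, sign):
    """|alpha_+|^2 (sign=+1) or |alpha_-|^2 (sign=-1) in units of hbar^2."""
    return j * (j + 1) - m * (m + sign)

def m_values(j):
    """Allowed m = -j, -j+1, ..., +j; requires 2j to be a non-negative integer."""
    two_j = 2 * j
    if two_j.denominator != 1 or two_j < 0:
        raise ValueError("j must be a non-negative integer or half-integer")
    return [-j + k for k in range(int(two_j) + 1)]

j = Fraction(3, 2)
for m in m_values(j):
    # The ladder stops at both ends: alpha_+ = 0 at m = j, alpha_- = 0 at m = -j.
    print(m, alpha_sq(j, m, +1), alpha_sq(j, m, -1))
```

Passing a value such as j = 1/3 raises an error in this sketch, which mirrors the statement that the ladder can only connect −j and +j when j is integer or half-integer.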

A.5 Pauli matrices


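Using Eqs. (A.12) and (A.13), one can write the matrices of $\hat{J}_+$ and $\hat{J}_-$ in the $|j,m\rangle$ basis, and
calculate the matrices of $\hat{J}_x = (\hat{J}_+ + \hat{J}_-)/2$ and $\hat{J}_y = (\hat{J}_+ - \hat{J}_-)/(2i)$. For j = 1/2 and m = ±1/2,
\[
J_+ = \hbar\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad J_- = \hbar\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{A.14}
\]
This way, disregarding the pre-factor of ħ/2, we arrive at the Pauli matrices
\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{A.15}
\]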
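The step from the ladder matrices to the Pauli matrices can be reproduced in a few lines. The Python sketch below works in units of ħ, orders the basis as m = +1/2, −1/2, and uses small hypothetical helper functions for 2×2 matrix algebra. It builds σx and σy from J₊ and J₋ through Jx = (J₊ + J₋)/2 and Jy = (J₊ − J₋)/(2i), and obtains σz from the commutator [σx, σy] = 2iσz.

```python
def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every matrix element by the number c."""
    return [[c * x for x in row] for row in a]

def matmul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

# Ladder matrices of Eq. (A.14) in units of hbar.
J_plus = [[0, 1], [0, 0]]
J_minus = [[0, 0], [1, 0]]

J_x = scale(0.5, add(J_plus, J_minus))
J_y = scale(1 / (2j), add(J_plus, scale(-1, J_minus)))

# Pauli matrices are the spin matrices with the factor 1/2 removed.
sigma_x = scale(2, J_x)
sigma_y = scale(2, J_y)
sigma_z = scale(1 / (2j), commutator(sigma_x, sigma_y))

print(sigma_x, sigma_y, sigma_z)
```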

A.6 S = 1 case
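It is also useful to know the matrices for j = 1, where three states with m = −1, 0, +1 are
possible,
\[
J_+ = \hbar\begin{pmatrix} 0 & \sqrt{2} & 0 \\ 0 & 0 & \sqrt{2} \\ 0 & 0 & 0 \end{pmatrix}, \qquad J_- = \hbar\begin{pmatrix} 0 & 0 & 0 \\ \sqrt{2} & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{pmatrix}, \qquad J_z = \hbar\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{A.16}
\]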
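The same construction works for any j. The Python sketch below, in units of ħ and with the basis ordered as m = j, j − 1, ..., −j (matching the matrices above), fills the ladder matrices from Eq. (A.12) and checks Eq. (A.8) as well as the eigenvalue j(j + 1) of Ĵ². The function name `spin_matrices` and its argument `two_j` are hypothetical choices made for this illustration.

```python
import math

def spin_matrices(two_j):
    """Return (J_plus, J_minus, J_z) in units of hbar for j = two_j / 2."""
    dim = two_j + 1
    j = two_j / 2.0
    m_values = [j - k for k in range(dim)]          # m = j, j-1, ..., -j
    j_plus = [[0.0] * dim for _ in range(dim)]
    j_minus = [[0.0] * dim for _ in range(dim)]
    j_z = [[m_values[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    for col in range(1, dim):
        m = m_values[col]
        amplitude = math.sqrt((j - m) * (j + m + 1))  # alpha_+ from Eq. (A.12)
        j_plus[col - 1][col] = amplitude              # J+ |m> = alpha_+ |m+1>
        j_minus[col][col - 1] = amplitude             # Hermitian conjugate of J+
    return j_plus, j_minus, j_z

def matmul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)] for i in range(n)]

def max_abs_difference(a, b):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

j_plus, j_minus, j_z = spin_matrices(2)             # j = 1
pm = matmul(j_plus, j_minus)
mp = matmul(j_minus, j_plus)
zz = matmul(j_z, j_z)
dim = len(j_z)

# Eq. (A.8): [J+, J-] = 2 J_z (hbar = 1).
commutator = [[pm[r][c] - mp[r][c] for c in range(dim)] for r in range(dim)]
twice_j_z = [[2.0 * x for x in row] for row in j_z]

# J^2 = J_z^2 + (J+ J- + J- J+)/2, which follows from Eqs. (A.10) and (A.11).
j_squared = [[zz[r][c] + 0.5 * (pm[r][c] + mp[r][c]) for c in range(dim)] for r in range(dim)]

print(j_plus)
print(max_abs_difference(commutator, twice_j_z))
print([j_squared[r][r] for r in range(dim)])
```

Calling `spin_matrices(1)` gives the j = 1/2 matrices of Eq. (A.14), so the sketch covers both cases discussed in the text.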

B Exchange integral
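Here, we shall calculate the exchange integral from Eq. (2.4) and demonstrate that it cannot be negative.
Let’s introduce
\[
f(\mathbf{r}_2) = \varphi_b^*(\mathbf{r}_2)\,\varphi_a(\mathbf{r}_2), \qquad V(\mathbf{r}_1 - \mathbf{r}_2) = \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}.
\]
Then,
\[
J_{ab} = \iint d\mathbf{r}_1\, d\mathbf{r}_2\, f^*(\mathbf{r}_1)\,V(\mathbf{r}_1-\mathbf{r}_2)\,f(\mathbf{r}_2) = \int d\mathbf{r}_1\, f^*(\mathbf{r}_1) \int d\mathbf{r}_2\, V(\mathbf{r}_1-\mathbf{r}_2)\,f(\mathbf{r}_2) = \int d\mathbf{r}_1\, f^*(\mathbf{r}_1)\,I(\mathbf{r}_1).
\]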

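We shall now calculate the Fourier transform of $I(\mathbf{r}_1)$,
\[
\begin{aligned}
\mathcal{F}[I(\mathbf{r}_1)] &= \int d\mathbf{r}_1 \int d\mathbf{r}_2\, e^{i\mathbf{k}\mathbf{r}_1}\, V(\mathbf{r}_1-\mathbf{r}_2)\, f(\mathbf{r}_2) = \int d\mathbf{r}_1 \int d\mathbf{r}_2\, e^{i\mathbf{k}(\mathbf{r}_1-\mathbf{r}_2)}\, V(\mathbf{r}_1-\mathbf{r}_2)\, e^{i\mathbf{k}\mathbf{r}_2} f(\mathbf{r}_2) \\
&= \int d\mathbf{r}\, e^{i\mathbf{k}\mathbf{r}}\, V(\mathbf{r}) \int d\mathbf{r}_2\, e^{i\mathbf{k}\mathbf{r}_2} f(\mathbf{r}_2) = \tilde{V}(\mathbf{k})\,\tilde{f}(\mathbf{k}),
\end{aligned}
\]
where we introduced $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ and basically split the integral into two independent Fourier
transforms. This way,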
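\[
\begin{aligned}
J_{ab} &= \int d\mathbf{r}_1\, f^*(\mathbf{r}_1) \int d\mathbf{k}\, e^{-i\mathbf{k}\mathbf{r}_1}\, \tilde{V}(\mathbf{k})\,\tilde{f}(\mathbf{k}) = \int d\mathbf{k} \left[\int d\mathbf{r}_1\, e^{-i\mathbf{k}\mathbf{r}_1} f^*(\mathbf{r}_1)\right] \tilde{V}(\mathbf{k})\,\tilde{f}(\mathbf{k}) \\
&= \int d\mathbf{k}\, \tilde{V}(\mathbf{k})\,[\tilde{f}(\mathbf{k})]^*\,\tilde{f}(\mathbf{k}).
\end{aligned}
\]
So we transformed $J_{ab}$ into an integral over k of the product of $|\tilde{f}(\mathbf{k})|^2 \geq 0$ and $\tilde{V}(\mathbf{k})$.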


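To calculate $\tilde{V}(\mathbf{k})$, we choose V in the form of the Yukawa potential, $V'(r) = e^{-\alpha r}/r$. Then,
\[
\begin{aligned}
\tilde{V}'(\mathbf{k}) &= \int d\mathbf{r}\, e^{i\mathbf{k}\mathbf{r}}\, \frac{e^{-\alpha r}}{r} = \int_0^\infty dr \int_0^{2\pi} d\varphi \int_0^\pi d\theta\; r\sin\theta\; e^{ikr\cos\theta}\, e^{-\alpha r} \\
&= 2\pi \int_0^\infty dr\; r\, e^{-\alpha r} \left[-\frac{e^{ikr\cos\theta}}{ikr}\right]_0^\pi = \frac{4\pi}{k} \int_0^\infty dr\; e^{-\alpha r}\sin(kr) = \frac{4\pi}{k^2+\alpha^2},
\end{aligned}
\]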

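where the last integral is expressed through partial integration,
\[
\begin{aligned}
\int_0^\infty dr\; e^{-\alpha r}\sin(kr) &= -\frac{1}{k}\int_0^\infty e^{-\alpha r}\, d[\cos(kr)] = \frac{1}{k} - \frac{\alpha}{k}\int_0^\infty dr\; e^{-\alpha r}\cos(kr) \\
&= \frac{1}{k} - \frac{\alpha}{k^2}\int_0^\infty e^{-\alpha r}\, d[\sin(kr)] = \frac{1}{k} - \frac{\alpha^2}{k^2}\int_0^\infty dr\; e^{-\alpha r}\sin(kr).
\end{aligned}
\]
The integral appears on both sides, and solving for it gives $k/(k^2+\alpha^2)$.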

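We are left to set α = 0, which reduces $V'(r)$ to $V(r)$ up to the factor of $e^2$ and gives
$\tilde{V}(\mathbf{k}) = 4\pi e^2/k^2$. This way, we have shown that $\tilde{V}(\mathbf{k}) > 0$, hence $J_{ab} \geq 0$.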
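The closed form 4π/(k² + α²) can be cross-checked numerically. The Python sketch below evaluates the radial integral (4π/k)∫ e^{−αr} sin(kr) dr with Simpson's rule and compares it with the analytic result. The cutoff at r = 40/α, the number of steps, and the function names are assumptions of this sketch, not part of the derivation.

```python
import math

def yukawa_ft_numeric(k, alpha, n_steps=40000):
    """(4 pi / k) * integral_0^r_max exp(-alpha r) sin(k r) dr by Simpson's rule.

    The upper limit r_max = 40 / alpha is an assumed cutoff; the integrand is
    suppressed by exp(-40) beyond it. n_steps must be even.
    """
    r_max = 40.0 / alpha
    h = r_max / n_steps

    def integrand(r):
        return math.exp(-alpha * r) * math.sin(k * r)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(i * h)
    return 4.0 * math.pi / k * total * h / 3.0

def yukawa_ft_exact(k, alpha):
    """Analytic Fourier transform of exp(-alpha r) / r."""
    return 4.0 * math.pi / (k * k + alpha * alpha)

for k, alpha in [(2.0, 1.0), (0.5, 0.1)]:
    print(k, alpha, yukawa_ft_numeric(k, alpha), yukawa_ft_exact(k, alpha))
```

The transform is positive for every k and α in this check, which is the property used above to conclude that the exchange integral cannot be negative.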

C Perturbation theory
C.1 Non-degenerate case
C.2 Degenerate case

D Second quantization
D.1 Creation and annihilation operators
Many-body problems are conveniently (and conventionally) defined in terms of creation and
annihilation operators. Let’s introduce the vacuum state $|0\rangle$ and postulate that the creation
operator $\hat{c}_k^\dagger$ creates a particle in the state k,
\[
|k\rangle = \hat{c}_k^\dagger|0\rangle.
\]
Several bosonic particles can occupy the same state simultaneously, so we shall define
\[
\hat{a}_k^\dagger|k_1 \ldots k_N\rangle = \sqrt{N+1}\;|k_0 k_1 \ldots k_N\rangle, \tag{D.1}
\]
where we imply that the particles 1 . . . N are in the state k, and the particle 0 is added. The
factor of $\sqrt{N+1}$ is needed for proper normalization, as will become clear later on.
In the case of fermions, not more than one electron occupies each state, so we shall describe
the many-body state by $|k_1, \ldots k_N\rangle$, where particle i occupies the state $k_i$, and define
\[
\hat{c}_\beta^\dagger|k_1, \ldots k_N\rangle = |k_\beta, k_1, \ldots k_N\rangle, \tag{D.2}
\]
without any pre-factor. Note that we use $\hat{a}^\dagger$ for bosons and $\hat{c}^\dagger$ for fermions.
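To understand the effect of $\hat{a}_k$, consider
\[
\big(\langle k_2 \ldots k_N|\hat{a}_k|k_1 k_2 \ldots k_N\rangle\big)^\dagger = \langle k_1 k_2 \ldots k_N|\hat{a}_k^\dagger|k_2 \ldots k_N\rangle = \sqrt{N}\,\langle k_1 k_2 \ldots k_N|k_1 k_2 \ldots k_N\rangle = \sqrt{N}.
\]
Therefore, we can interpret $\hat{a}_k$ as the annihilation operator,
\[
\hat{a}_k|k_1 k_2 \ldots k_N\rangle = \sqrt{N}\;|k_2 \ldots k_N\rangle, \tag{D.3}
\]
and
\[
\hat{c}_1|k_1, k_2, \ldots k_N\rangle = |k_2, \ldots k_N\rangle \tag{D.4}
\]
in the case of fermions.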
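Commutation relations: consider the state $|k_3 \ldots k_N\rangle$. We can act on it with $\hat{a}_1^\dagger\hat{a}_2^\dagger$ or
$\hat{a}_2^\dagger\hat{a}_1^\dagger$,
\[
\hat{a}_1^\dagger\hat{a}_2^\dagger|k_3 \ldots k_N\rangle = \sqrt{N(N-1)}\;|k_1 k_2 k_3 \ldots k_N\rangle, \qquad \hat{a}_2^\dagger\hat{a}_1^\dagger|k_3 \ldots k_N\rangle = \sqrt{N(N-1)}\;|k_2 k_1 k_3 \ldots k_N\rangle.
\]
The resulting states are essentially the wavefunctions that should have certain symmetry with
respect to permutations. In the case of bosons, $|k_1 k_2 k_3 \ldots k_N\rangle = |k_2 k_1 k_3 \ldots k_N\rangle$, so $\hat{a}_1^\dagger$ and $\hat{a}_2^\dagger$
commute. In the case of fermions,
\[
\hat{c}_1^\dagger\hat{c}_2^\dagger|k_3, \ldots k_N\rangle = |k_1, k_2, k_3, \ldots k_N\rangle, \qquad \hat{c}_2^\dagger\hat{c}_1^\dagger|k_3, \ldots k_N\rangle = |k_2, k_1, k_3, \ldots k_N\rangle,
\]
but $|k_1, k_2, k_3, \ldots k_N\rangle = -|k_2, k_1, k_3, \ldots k_N\rangle$, so $\hat{c}_1^\dagger$ and $\hat{c}_2^\dagger$ anticommute.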


This can be expressed via fundamental commutation relations,
\[
[\hat{a}_k^\dagger,\hat{a}_p^\dagger] = [\hat{a}_k,\hat{a}_p] = 0, \tag{D.5}
\]
\[
\{\hat{c}_k^\dagger,\hat{c}_p^\dagger\} = \{\hat{c}_k,\hat{c}_p\} = 0, \tag{D.6}
\]
where the relations for the annihilation operators are obtained as straightforward Hermitian
conjugates. Note that square brackets denote commutators, whereas curly brackets denote
anticommutators,
\[
[\hat{A},\hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}, \qquad \{\hat{A},\hat{B}\} = \hat{A}\hat{B} + \hat{B}\hat{A}.
\]

D.2 Particle number operator
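We can introduce the operator
\[
\hat{n}_k = \hat{a}_k^\dagger\hat{a}_k \;\;\text{(bosons)}, \qquad \hat{n}_\alpha = \hat{c}_\alpha^\dagger\hat{c}_\alpha \;\;\text{(fermions)} \tag{D.7}
\]
that yields the number of particles in the state k (bosons) and the number of particles in the
state α (fermions). This is the particle number operator.
Consider now the commutator $[\hat{a}_\alpha,\hat{a}_\beta^\dagger]$ and multiply it by $\hat{a}_\beta$ from the right,
\[
[\hat{a}_\alpha,\hat{a}_\beta^\dagger]\,\hat{a}_\beta = \hat{a}_\alpha\hat{n}_\beta - \hat{n}_\beta\hat{a}_\alpha = 0
\]
for α ≠ β, because the number of particles in the state β is unrelated to the creation/annihilation
of a particle in the state α for β ≠ α. Likewise, for fermions we shall use the anticommutator
$\{\hat{c}_\alpha,\hat{c}_\beta^\dagger\}$ and multiply by $\hat{c}_\beta$ from the right,
\[
\{\hat{c}_\alpha,\hat{c}_\beta^\dagger\}\,\hat{c}_\beta = \hat{c}_\alpha\hat{n}_\beta - \hat{n}_\beta\hat{c}_\alpha = 0
\]
for α ≠ β, where the minus sign arises from $\hat{c}_\alpha\hat{c}_\beta = -\hat{c}_\beta\hat{c}_\alpha$.
The α = β case is easier to understand from the following considerations,
\[
(\hat{a}_1\hat{a}_1^\dagger - \hat{a}_1^\dagger\hat{a}_1)|k_2 \ldots k_N\rangle = \sqrt{N}\cdot\sqrt{N}\;|k_2 \ldots k_N\rangle - \sqrt{(N-1)(N-1)}\;|k_2 \ldots k_N\rangle = |k_2 \ldots k_N\rangle,
\]
and for fermions
\[
(\hat{c}_1\hat{c}_1^\dagger + \hat{c}_1^\dagger\hat{c}_1)|k_1, k_2, \ldots k_N\rangle = 0 + |k_1, k_2, \ldots k_N\rangle = |k_1, k_2, \ldots k_N\rangle.
\]
Altogether, we obtain another fundamental commutation relation,
\[
[\hat{a}_\alpha,\hat{a}_\beta^\dagger] = \delta_{\alpha\beta}, \qquad \{\hat{c}_\alpha,\hat{c}_\beta^\dagger\} = \delta_{\alpha\beta}. \tag{D.8}
\]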
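The relations of Eq. (D.8) can be verified with small explicit matrices. The Python sketch below makes two assumptions that go beyond the text: the bosonic mode is truncated to a finite number of occupation levels (so the commutator deviates from 1 only on the highest level), and the two fermionic modes are represented through a Jordan-Wigner-type construction, which is one standard way to build anticommuting matrices. All function and variable names are hypothetical.

```python
import math

def matmul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    """Hermitian conjugate of a real matrix (plain transpose)."""
    return [list(row) for row in zip(*a)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def bracket(a, b, sign):
    """Commutator (sign=-1) or anticommutator (sign=+1) of a and b."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + sign * ba[i][j] for j in range(n)] for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Bosons: a|n> = sqrt(n)|n-1> on a Fock space truncated to DIM levels (assumption).
DIM = 6
a_op = [[math.sqrt(c) if c == r + 1 else 0.0 for c in range(DIM)] for r in range(DIM)]
boson_commutator = bracket(a_op, dagger(a_op), -1)

# Fermions: two modes, basis (|0>, |1>) for each mode, Jordan-Wigner sign string Z.
c_single = [[0, 1], [0, 0]]
z_string = [[1, 0], [0, -1]]
c1 = kron(c_single, identity(2))
c2 = kron(z_string, c_single)

same_mode = bracket(c1, dagger(c1), +1)       # expected: identity
different_modes = bracket(c1, dagger(c2), +1)  # expected: zero matrix
two_annihilators = bracket(c1, c2, +1)         # expected: zero matrix, Eq. (D.6)

print([boson_commutator[n][n] for n in range(DIM)])
print(same_mode, different_modes, two_annihilators)
```

The last diagonal element of the bosonic commutator equals −(DIM − 1) rather than 1, which is purely an artifact of the truncation and disappears as the cutoff is raised.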

D.3 Representation of one-particle and two-particle operators


D.4 One-particle correlation functions
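Bosons: from the definition of the field operator one finds
\[
G_1(\mathbf{r},\mathbf{r}') = \frac{1}{V}\sum_{\mathbf{k}}\sum_{\mathbf{k}'} e^{i\mathbf{k}\mathbf{r}}\, e^{-i\mathbf{k}'\mathbf{r}'}\, \langle\Phi|\hat{a}_{\mathbf{k}}^\dagger\hat{a}_{\mathbf{k}'}|\Phi\rangle = \frac{1}{V}\sum_{\mathbf{k}} e^{i\mathbf{k}(\mathbf{r}-\mathbf{r}')}\, \langle\hat{n}_{\mathbf{k}}\rangle,
\]
because all states with $\mathbf{k} \neq \mathbf{k}'$ are orthogonal. At low temperatures, bosons condense into the
k = 0 state, so the summation simply yields the number of particles N. Consequently, the
one-particle correlation function $G_1(\mathbf{r},\mathbf{r}') = N/V$ does not depend on the distance $|\mathbf{r}-\mathbf{r}'|$.
Fermions: the reasoning is similar here, and we find
\[
G_1(s,\mathbf{r},\mathbf{r}') = \frac{1}{V}\sum_{\mathbf{k}}\sum_{\mathbf{k}'} e^{i\mathbf{k}\mathbf{r}}\, e^{-i\mathbf{k}'\mathbf{r}'}\, \langle\Phi|\hat{c}_{\mathbf{k}s}^\dagger\hat{c}_{\mathbf{k}'s}|\Phi\rangle = \frac{1}{V}\sum_{\mathbf{k}} e^{i\mathbf{k}(\mathbf{r}-\mathbf{r}')}\, \langle\hat{n}_{\mathbf{k}s}\rangle.
\]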

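At T = 0 the summation runs from k = 0 to $k = k_F$ with $n_{\mathbf{k}s} = 1$. It can be replaced by an
integration,
\[
G_1(s,\mathbf{r},\mathbf{r}') = \frac{1}{(2\pi)^3}\int^{k_F} d^3k\; e^{i\mathbf{k}(\mathbf{r}-\mathbf{r}')} = \frac{1}{(2\pi)^3}\int_0^{k_F} dk\, k^2 \int_0^{2\pi} d\varphi \int_0^{\pi} d\theta\, \sin\theta\; e^{ik|\mathbf{r}-\mathbf{r}'|\cos\theta},
\]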

where we switched to the spherical coordinates and defined θ as the angle between k and $\mathbf{r}-\mathbf{r}'$.
Using x = cos θ, the integral over θ can be written as
\[
\int_{-1}^{1} dx\; e^{ik|\mathbf{r}-\mathbf{r}'|x} = \frac{2}{k|\mathbf{r}-\mathbf{r}'|}\,\sin(k|\mathbf{r}-\mathbf{r}'|).
\]

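Therefore,
\[
G_1(s,\mathbf{r},\mathbf{r}') = \frac{2}{(2\pi)^2|\mathbf{r}-\mathbf{r}'|}\int_0^{k_F} dk\; k\,\sin(k|\mathbf{r}-\mathbf{r}'|) = \frac{k_F^3}{2\pi^2}\,\frac{\sin y - y\cos y}{y^3} = \frac{n}{2}\,\frac{3(\sin y - y\cos y)}{y^3},
\]
where we defined $y = k_F|\mathbf{r}-\mathbf{r}'|$ and took advantage of $k_F^3 = 3\pi^2 n$ for the free electron gas.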
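The closed form for the fermionic correlation function can be compared with a direct numerical evaluation of the radial integral. The Python sketch below uses Simpson's rule for ∫ k sin(k|r − r′|) dk; the function names, the step count, and the test values of k_F and the distance are assumptions chosen for illustration.

```python
import math

def g1_closed_form(k_f, distance):
    """(n/2) * 3 (sin y - y cos y) / y^3 with y = k_F * distance and n = k_F^3 / (3 pi^2)."""
    density = k_f**3 / (3.0 * math.pi**2)
    y = k_f * distance
    return 1.5 * density * (math.sin(y) - y * math.cos(y)) / y**3

def g1_numeric(k_f, distance, n_steps=2000):
    """1 / (2 pi^2 R) * integral_0^k_F k sin(k R) dk by Simpson's rule (n_steps even)."""
    h = k_f / n_steps

    def integrand(k):
        return k * math.sin(k * distance)

    total = integrand(0.0) + integrand(k_f)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(i * h)
    radial_integral = total * h / 3.0
    return radial_integral / (2.0 * math.pi**2 * distance)

K_F = 1.0                               # Fermi wave vector in arbitrary units (assumed)
density = K_F**3 / (3.0 * math.pi**2)

for distance in (0.01, 1.0, 2.5, 6.0):
    print(distance, g1_closed_form(K_F, distance), g1_numeric(K_F, distance))
```

At short distances the function approaches n/2, the density of electrons with a given spin, and at larger distances it oscillates and decays, in contrast to the distance-independent result for condensed bosons.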
