1718 Advanced Quantum Notes

The document contains course notes for Advanced Quantum Physics (PHYS6003) by A.G. Akeroyd, covering foundational concepts in quantum mechanics, vector spaces, and their applications. It outlines the structure of the course, including topics such as state vectors, linear operators, and quantum systems, along with mathematical formalism. The course aims to provide a rigorous mathematical framework for understanding quantum mechanics, building on previous knowledge from earlier courses.


Course notes for Advanced Quantum Physics

(PHYS6003, academic year 2017/2018)


A.G. Akeroyd
September 12, 2017

Contents

1 Introduction
  1.1 The state vector in quantum mechanics
  1.2 The superposition principle

2 Vector Spaces
  2.1 Vector Space (definition)
  2.2 Linear Dependence and Linear Independence
  2.3 Dimension of vector space, and definition of basis
  2.4 Subspace
  2.5 Inner product
  2.6 Orthogonality
  2.7 Norm of a vector
  2.8 Orthonormal basis
  2.9 Examples
  2.10 Gram-Schmidt procedure
  2.11 Index notation and summation convention
  2.12 Dual spaces and Dirac notation
  2.13 Example of manipulating inner products
  2.14 Operators
  2.15 Adjoint of an operator
  2.16 Matrix representation of linear operators
  2.17 Matrix representation of an adjoint operator
  2.18 Outer product representation of operators
  2.19 Completeness relation
  2.20 Projection operators
  2.21 Transformations using a unitary operator
  2.22 Active and passive transformations
  2.23 Hermitian operators
  2.24 Degeneracy of eigenvalues
  2.25 Continuous bases in a vector space
  2.26 Generalisation to infinite dimensional vector spaces
  2.27 Functions can be expressed as vectors in a continuous basis
  2.28 Hilbert space

3 Vector spaces applied to Quantum Mechanics
  3.1 Postulates of Quantum Mechanics
  3.2 Collapse of the state vector
  3.3 How to test Quantum Theory
  3.4 Expectation value
  3.5 The uncertainty ∆Ω̂
  3.6 Compatible and incompatible operators (observables)
  3.7 Time dependence of the state vector
  3.8 Normalisation of state is constant under time evolution
  3.9 Solution of the Schrödinger equation
  3.10 The Schrödinger picture and the Heisenberg picture

4 Simple Harmonic Oscillator in Quantum Mechanics
  4.1 Raising and lowering operators
  4.2 Matrix representations of operators and eigenvectors of the simple harmonic oscillator
  4.3 Simple Harmonic Oscillator wave functions
  4.4 Example: Calculating expectation values

5 Angular Momentum for Quantum Systems
  5.1 Ladder operators of angular momentum
  5.2 Matrix representation of angular momentum
  5.3 Calculating probabilities of eigenvalues of angular momentum with matrix representations
  5.4 Orbital angular momentum, and coordinate representation of eigenvectors
  5.5 Spin angular momentum
  5.6 Examples of elementary particles and their spins
  5.7 Matrix representations of spin 1/2 particles
  5.8 Eigenvectors of states with spin 1/2
  5.9 Addition of angular momentum
  5.10 Tensor products
  5.11 Clebsch-Gordan coefficients
  5.12 Matrix representation of a tensor product
  5.13 Entanglement

6 Non-locality and Bell's inequalities

7 Quantum key distribution (non-examinable)

8 Qubits, Bell states and Quantum teleportation

1 Introduction
The "old quantum theory" was developed between 1900 and 1925, and has been covered in the course PHYS1011 (Waves, Light and Quanta). The "new quantum theory" (which was developed from 1925) refers to what we now call "(non-relativistic) quantum mechanics", and has been covered in PHYS2003 (Quantum Physics) and PHYS3008 (Atomic Physics). This course, PHYS6003, puts quantum mechanics on a firm mathematical footing. Several of the results that we derive will be familiar to you, but they will be obtained using the formalism of a vector space. Originally there were two versions of the "new quantum theory": i) Heisenberg's "Matrix mechanics" (July 1925), to which Jordan and Born contributed (working both with and without Heisenberg) later that year, and ii) Schrödinger's "Wave mechanics" (January 1926).

Matrix mechanics focussed on observables (e.g. the energy levels of the hydrogen atom) rather than on quantities that are not readily observable (e.g. the trajectory of an electron). Wave mechanics developed the idea of the de Broglie wavelength, and proposed a wave equation (the "Schrödinger equation") for the wave that is associated with a particle. Wave mechanics was the version preferred by most theorists at the time (due to its simpler mathematics and more intuitive approach). Both theories agreed with experiment, despite having very different approaches. It was Dirac (1930) and later von Neumann (1932) who showed that "Matrix mechanics" and "Wave mechanics" are different aspects of a unifying approach to quantum mechanics that makes use of an infinite-dimensional vector space. In this course we will present quantum mechanics in the approach of Dirac and von Neumann using a vector called a "state vector". The necessary mathematics will be described in chapter 2, with the application to quantum mechanics starting at the end of chapter 2.

In this section we briefly introduce the state vector and discuss its analogy with the familiar vectors of three-dimensional space. We then recall the superposition principle.

1.1 The state vector in quantum mechanics

In a first course on quantum mechanics (QM) one is introduced to the concept of a "wave function", denoted by ψ(x, t) in a one-dimensional system, which describes the particle (or group of particles) under consideration in a very non-classical way. A more advanced (and abstract) treatment of QM considers a quantum system to be described by a "state vector", which contains the information of the wave function.

The state vector (or just "state") resides in a multi-dimensional vector space. The state vector is usually denoted by |ψ⟩, called a "ket"; this notation for multi-dimensional vector spaces was introduced by Dirac. We are familiar with three-dimensional Euclidean space (i.e. n = 3, where n is the number of dimensions), with a vector v (e.g. a force or velocity) given by

v = ai + bj + ck . (1)

Here, i, j and k are called "basis vectors", and every v is a linear combination of these basis vectors. It is important to note that there is not just one basis for this n = 3 vector space: we could simply rotate i, j, k to i′, j′, k′, and this latter basis could also be used to write the vector v. The coefficients a, b and c are real numbers ("scalars"), and are not vectors themselves. These coefficients are called the "components" of v in a given basis. Euclidean space is an example of a finite-dimensional (i.e. n = 3) vector space. Multi-dimensional vector spaces have n > 3, and thus are difficult to visualise. It is best to think of such multi-dimensional vector spaces as mathematical tools, which have an application within QM.

In QM the state vector |ψ⟩ is analogous to the vector v, and a choice of the basis vectors (analogous to the choice of i, j and k) is the set of eigenvectors of some operator with real eigenvalues (a "Hermitian operator"). Each state vector can be expanded in this basis of eigenvectors (see section 2.23), and thus each |ψ⟩ will have n components a, b, c, ... etc., in analogy with the n = 3 case in eq. (1) for a vector v. We will deal with both discrete and continuous bases, which correspond to the cases of the Hermitian operator having discrete eigenvalues and continuous eigenvalues respectively. A "discrete basis" is one in which the number of basis vectors is countable (i.e. they can be labelled by an integer). This is still true even if there are an infinite number of basis vectors (referred to as "countably infinite", where the counting goes on for ever). In a "continuous basis", the basis vectors are labelled by a continuous parameter (e.g. x) instead of being labelled by an integer, and there are an "uncountably infinite" number of basis vectors. The components a, b, c, ... for a general state vector in the vector space then merge into a continuous function, and this function is the wave function ψ(x). In general there will be a time dependence of the wave function as well, and so ψ(x) → ψ(x, t). The state vector |ψ⟩ and the wave function ψ(x, t) are not the same mathematical object. After a continuous basis is chosen, the wave function ψ(x, t) corresponds to the (continuous) components of the state vector |ψ⟩ in this continuous basis. As mentioned earlier, in QM we write the state vector |ψ⟩ as a superposition of eigenvectors of a Hermitian operator in the same way that v is a superposition of i, j and k in eq. (1). Since Hermitian operators always have real eigenvalues, and experimental observables (e.g. the continuous parameters of position x and momentum p) are real numbers, one associates each experimental observable with a Hermitian operator (e.g. position operator, momentum operator, etc.). Hence the eigenvalues obtained from a vector space (in which |ψ⟩ resides) correspond to the "physics", i.e. the eigenvalues are what we measure in experiments.

1.2 The superposition principle

The superposition principle in QM states that if the kets |A⟩ and |B⟩ represent possible states of a particle, then so does the superposition

α|A⟩ + β|B⟩ . (2)

We are familiar with superpositions in classical physics, e.g. waves on strings, water waves, etc.

In QM, the interpretation of the superposition is different to that in classical physics. If state |A⟩ gives rise to an experimental result "a" (e.g. a measurement of energy) and |B⟩ gives rise to an experimental result "b", then a measurement performed on the superposed state α|A⟩ + β|B⟩ will sometimes give a and sometimes give b. It will never give anything different from a and b, as would be expected in classical physics. A superposition in classical physics would not maintain the individual properties of |A⟩ and |B⟩, and instead would have physical properties which come from combining |A⟩ and |B⟩; e.g. think of superimposing water waves, which gives rise to a new wave with its own physical properties distinct from the original waves. We cannot predict the outcome of an individual measurement of a superposition in QM. Instead, we can predict the probabilities of obtaining a given result, and these probabilities depend on the coefficients α and β.

In the next chapter we will discuss the mathematics of vector spaces.
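As a forward look at how the coefficients fix the probabilities (the Born rule, stated precisely in chapter 3), here is a small numerical sketch; the values of α and β are made up for illustration and are not from the notes:

```python
import numpy as np

# Hypothetical two-state example: |A> and |B> are orthonormal states,
# and the system is prepared in the superposition alpha|A> + beta|B>.
alpha = 1 / np.sqrt(3)
beta = np.sqrt(2 / 3) * 1j

# Normalise so that |alpha|^2 + |beta|^2 = 1 (here it already is).
norm_sq = abs(alpha) ** 2 + abs(beta) ** 2

# Probabilities of obtaining result "a" or "b" on measurement.
p_a = abs(alpha) ** 2 / norm_sq
p_b = abs(beta) ** 2 / norm_sq

print(p_a, p_b)  # approximately 1/3 and 2/3
```

Each individual measurement still gives either a or b; only the long-run frequencies are predicted.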

2 Vector Spaces
We are familiar with using three-dimensional vectors ("arrows") for quantities with a direction (e.g. velocity, momentum, etc.). In QM we need to generalise the concept of a three-dimensional vector space to an arbitrary number of dimensions, and in such vector spaces the vectors are denoted by kets (e.g. |u⟩). The states of a quantum system will then be represented by vectors in an appropriate vector space.

2.1 Vector Space (definition)

A vector space, denoted by V, contains mathematical objects called "vectors", which we will denote by the kets |u⟩, |v⟩, etc. Vectors may be added together and/or multiplied by scalars (denoted by a, b, c, ...). The scalars are real or complex numbers. In QM the scalars are in general complex, and thus we are dealing with a "complex vector space". The vectors themselves are neither real nor complex; the adjective "real" or "complex" applies to the scalars only.

2.1.1 Addition of vectors

Addition is commutative and associative. There is a null vector, denoted by |0⟩, and every vector has an inverse under addition (i.e. not under multiplication); see item (v) below. A vector space has the following (linear) properties:

(i) The vector space is closed under addition, i.e. if |u⟩ and |v⟩ are in V then so is |u⟩ + |v⟩.

(ii) Commutativity: |u⟩ + |v⟩ = |v⟩ + |u⟩.

(iii) Associativity: (|u⟩ + |v⟩) + |w⟩ = |u⟩ + (|v⟩ + |w⟩).

(iv) Addition of the null vector does not change a vector: |u⟩ + |0⟩ = |u⟩.

(v) For every vector |u⟩ in V there is an inverse vector under addition, denoted by |−u⟩, such that |u⟩ + |−u⟩ = |0⟩.

2.1.2 Multiplication of vectors by scalars

Multiplication by scalars is defined by the following axioms:

(vi) a(|u⟩ + |v⟩) = a|u⟩ + a|v⟩ (distributive).

(vii) (a + b)|u⟩ = a|u⟩ + b|u⟩ (distributive).

(viii) a(b|u⟩) = (ab)|u⟩ (associative).

(ix) 1|u⟩ = |u⟩ (unit law).

Sometimes |0⟩ is written as just 0. Remember that 0 in this context is really a vector and is not just a number. Note that 0|u⟩ = |0⟩. To see this, take a = 1 and b = 0 in rule (vii) and then use rule (iv). A good way to remember all the above rules is to do what comes naturally.

2.1.3 Examples of vector spaces

Here are some examples of vector spaces:

(i) The vector space whose elements are ordered lists of n real or complex numbers, with addition and multiplication by a scalar respectively given by:

(λ₁, λ₂, ..., λₙ) + (u₁, u₂, ..., uₙ) = (λ₁ + u₁, λ₂ + u₂, ..., λₙ + uₙ) (3)

and

a(λ₁, λ₂, ..., λₙ) = (aλ₁, aλ₂, ..., aλₙ) , (4)

i.e. the vector space of row vectors. Column vectors and matrices satisfy the axioms of a vector space as well.

(ii) Three-dimensional Euclidean space with the basis vectors i, j and k, or in ket notation |u⟩ = u₁|i⟩ + u₂|j⟩ + u₃|k⟩.

2.2 Linear Dependence and Linear Independence

A set of vectors {|i⟩} labelled by the integer i (i.e. |1⟩, |2⟩, ..., |n⟩) is "linearly dependent" if there is a corresponding set of scalars a₁, a₂, ..., aₙ, which are not all zero, such that

∑ᵢ₌₁ⁿ aᵢ|i⟩ = |0⟩ . (5)

If, on the other hand, eq. (5) can only be satisfied by aᵢ = 0 for each value of i, then the set of vectors {|i⟩} is "linearly independent".

It is not possible to write any member of a linearly independent set in terms of the other vectors. However, if the set of vectors is linearly dependent then one could have (for example) the following case of a₃ ≠ 0 and at least one of the other aᵢ being non-zero:

|3⟩ = − ∑ᵢ≠₃ (aᵢ/a₃)|i⟩ , (6)

where the sum runs over i = 1, ..., n with i ≠ 3. Hence, in this example one can express the vector |3⟩ in terms of the other vectors. If only a₃ ≠ 0 and all the other aᵢ are zero then |3⟩ would be the null vector.
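The definition above can be checked numerically: stacking the component vectors as columns of a matrix, the set is linearly independent exactly when the matrix has full rank. A minimal sketch with numpy (the specific vectors are illustrative, not from the notes):

```python
import numpy as np

# Three vectors in C^3, written as the columns of a matrix.
# The set is linearly independent iff the matrix has rank 3.
vecs = np.array([[1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 2]], dtype=complex)  # columns: (1,0,1), (0,1,1), (1,1,2)

rank = np.linalg.matrix_rank(vecs)
# Here the third column is the sum of the first two, so the set is
# linearly dependent and the rank is 2, not 3.
print(rank)  # 2
```

A full-rank result would mean eq. (5) forces every aᵢ to vanish.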

2.3 Dimension of vector space, and definition of basis

The dimension of a vector space is defined as follows:
A vector space has dimension n if no more than n vectors can be linearly independent.

A basis for the vector space is defined as follows:
A set of n linearly independent vectors in an n-dimensional vector space is called a basis.

Theorem
Any vector |u⟩ in an n-dimensional vector space can be expanded as a linear combination of basis vectors, which here are denoted by {|i⟩}:

|u⟩ = ∑ᵢ₌₁ⁿ uᵢ|i⟩ . (7)

The coefficients uᵢ of the expansion are unique in a given basis, and are called the "components" of the vector in that basis. These components are complex numbers in general. In contrast, in our 3D space of everyday vectors (like velocity) the components are, of course, real numbers.

To prove this theorem, suppose that there were some vector |v⟩ which could not be expanded in terms of our basis {|i⟩} alone. Consequently, the new set of vectors formed from {|i⟩} and |v⟩ would be a set of n + 1 linearly independent vectors, which is impossible (by definition) in an n-dimensional vector space.

Now suppose that the vector components in a given basis were not unique for a vector |u⟩. Then we could also write

|u⟩ = ∑ᵢ₌₁ⁿ u′ᵢ|i⟩ , (8)

where u′ᵢ ≠ uᵢ for at least one i. However, subtracting eq. (8) from eq. (7), one has

|0⟩ = ∑ᵢ₌₁ⁿ (uᵢ − u′ᵢ)|i⟩ . (9)

This would contradict the linear independence of the basis vectors unless uᵢ = u′ᵢ for all i. Hence uᵢ = u′ᵢ for all i, the expansion of |u⟩ in a given basis is unique, and this completes the proof of the theorem above.
Note that in a given basis the components are unique, but if we change the basis the components will change, e.g.

|u⟩ = ∑ᵢ₌₁ⁿ u′ᵢ|i′⟩ , (10)

where {|i′⟩} is a basis which is different to {|i⟩}. One refers to |u⟩ as the vector "in the abstract", having an existence of its own and satisfying various relations involving other vectors (e.g. |u⟩ = |v⟩ + |w⟩). When the choice of a basis is made, the vectors assume concrete forms in terms of their components, and the relation between vectors is satisfied by the components. So if one has

|u⟩ = ∑ᵢ₌₁ⁿ uᵢ|i⟩ and |v⟩ = ∑ᵢ₌₁ⁿ vᵢ|i⟩ (11)

then

|u⟩ + |v⟩ = ∑ᵢ₌₁ⁿ (uᵢ + vᵢ)|i⟩ . (12)

2.4 Subspace
A subset of the vector space V whose elements also form a vector space is called a "subspace" of V. It will have dimensionality m, with m ≤ n, where n is the dimensionality of V (a proper subspace has m < n). For example, the vector space of the x, y plane is a subspace with m = 2 of 3D Euclidean space. Note that the positive x-axis alone is not a subspace of 3D Euclidean space because there is no inverse under addition, i.e. no element −i to give −i + i = 0.

2.5 Inner product
Notice that the definition of a vector space does not include a definition of a length or direction, although we are familiar with these concepts for ordinary 3-dimensional vectors ("arrows"). We therefore define an "inner product" (or "scalar product" or "dot product") on a (real or complex) vector space. This is a rule which, for any two vectors, gives us a (real or complex) number. The inner product is defined by the following axioms:

Axiom (i): ⟨u|v⟩ = ⟨v|u⟩*. This property ensures that ⟨u|u⟩ = ⟨u|u⟩*, and so ⟨u|u⟩ is a real number.

Axiom (ii): ⟨u|u⟩ ≥ 0; ⟨u|u⟩ = 0 if and only if |u⟩ = |0⟩.

Axiom (iii): ⟨u|(a|v⟩ + b|w⟩) ≡ ⟨u|av + bw⟩ = a⟨u|v⟩ + b⟨u|w⟩. An inner product is linear in the second vector of the inner product.

Now what if the first vector in the inner product is a linear superposition? Consider the inner product ⟨au + bv|w⟩:

⟨au + bv|w⟩ = ⟨w|au + bv⟩* by axiom (i) (13)
= (a⟨w|u⟩ + b⟨w|v⟩)* by axiom (iii) (14)
= a*⟨w|u⟩* + b*⟨w|v⟩* (15)
= a*⟨u|w⟩ + b*⟨v|w⟩ by axiom (i) (16)

Thus the inner product is anti-linear with respect to the first vector in the inner product. This asymmetry, unfamiliar in real vector spaces, is here to stay. A vector space with an inner product is called an "inner product space". Note that we have not yet given an explicit rule for actually evaluating the inner product. We are merely demanding that any rule we come up with must satisfy the three axioms above.

2.6 Orthogonality
Two distinct vectors are "orthogonal" (or "perpendicular") if their inner product vanishes, ⟨u|v⟩ = 0, e.g. i · j = 0 in 3D space. When we say that a set of vectors is "mutually orthogonal" (or "pairwise orthogonal") we mean that the inner product of every distinct pair of vectors in the set is zero, i.e. ⟨u|v⟩ = 0, ⟨u|w⟩ = 0, ⟨v|w⟩ = 0, etc.

1) If a set of vectors is mutually orthogonal then the vectors are also linearly independent.

2) The vectors in a linearly independent set are not necessarily mutually orthogonal.

We will give an example of 2) later in section 2.9. To show 1), consider the following equation for an orthogonal set of vectors denoted by {|i⟩}:

∑ᵢ₌₁ⁿ aᵢ|i⟩ = |0⟩ . (17)

Now multiply from the left by ⟨j|, where j is a particular value of i (e.g. j = 3). Recall that i is summed over and takes all values from 1 to n, while j is a fixed integer value. Now note that ⟨j|0⟩ = 0, which can be obtained from Axiom (iii) since ⟨j|0⟩ = ⟨j|0u⟩ = 0⟨j|u⟩ = 0, and one has:

∑ᵢ₌₁ⁿ aᵢ⟨j|i⟩ = 0 (i.e. the RHS of the equation is 0 and not |0⟩) . (18)

Now since the set of vectors is mutually orthogonal we have ⟨j|i⟩ = 0 for j ≠ i and ⟨j|i⟩ ≠ 0 for j = i. Only the i = j term survives in the sum, and consequently for this particular value of j one has

aⱼ = 0 . (19)

Repeating this, one can multiply from the left by ⟨j| for all choices of j from 1 to n, and so aⱼ = 0 for all j from 1 to n. Therefore the only solution to eq. (17) is aᵢ = 0 for all i, and so {|i⟩} is linearly independent.

2.7 Norm of a vector

The "norm" (or generalised length) is given by

||u|| = √⟨u|u⟩ . (20)

A normalised vector has unit norm, ||u|| = 1. Axiom (ii) ensures that ⟨u|u⟩ ≥ 0 ("positive semi-definite"), and we can always take the square root √⟨u|u⟩ to obtain a norm which is either positive, or zero if |u⟩ is the null vector.

2.8 Orthonormal basis
A set of basis vectors, each with unit norm, which are pairwise (mutually) orthogonal is called an "orthonormal basis". That is, if the basis vectors are denoted by |i⟩ and |j⟩ for i, j = 1, ..., n then

⟨i|j⟩ = 1 for i = j, and ⟨i|j⟩ = 0 for i ≠ j (21)

or simply

⟨i|j⟩ = δᵢⱼ (22)

where δᵢⱼ is the Kronecker delta symbol defined by

δᵢⱼ = 1 for i = j , and δᵢⱼ = 0 for i ≠ j . (23)

We are now ready to obtain a concrete formula for the inner product in terms of components. Consider the kets

|u⟩ = ∑ᵢ₌₁ⁿ uᵢ|i⟩ and |v⟩ = ∑ⱼ₌₁ⁿ vⱼ|j⟩ , (24)

where i and j are used to label the basis vectors |i⟩ of a given basis, which is not necessarily an orthonormal basis. The inner product of |u⟩ and |v⟩ is ⟨u|v⟩, and can be written explicitly in terms of the components as follows:

⟨u|v⟩ = ⟨u₁1 + u₂2 + ... + uₙn | v₁1 + v₂2 + ... + vₙn⟩ , (25)

and using axiom (iii)

= ⟨u₁1 + u₂2 + ... + uₙn|1⟩v₁ + ⟨u₁1 + u₂2 + ... + uₙn|2⟩v₂ + ... + ⟨u₁1 + u₂2 + ... + uₙn|n⟩vₙ . (26)

Using axioms (i) and (iii)

⟨u|v⟩ = ∑ᵢ₌₁ⁿ ∑ⱼ₌₁ⁿ uᵢ* vⱼ ⟨i|j⟩ . (27)

To go any further we need to evaluate ⟨i|j⟩. This depends on the details of the basis, and all we know is that it is linearly independent. If the basis is orthonormal then the double sum collapses to a single sum over i:

⟨u|v⟩ = ∑ᵢ₌₁ⁿ ∑ⱼ₌₁ⁿ uᵢ* vⱼ δᵢⱼ = ∑ᵢ₌₁ⁿ uᵢ* vᵢ . (28)

This is the form of the inner product that we will be using from now on, i.e. we will use an orthonormal basis. Note that ⟨u|v⟩ can be positive, negative, real or complex. Moreover, one can see that

⟨u|u⟩ = ∑ᵢ₌₁ⁿ uᵢ* uᵢ = ∑ᵢ₌₁ⁿ |uᵢ|² , (29)

and so the norm ||u|| is positive semi-definite (≥ 0) as expected.

A vector is uniquely specified by its components in a given basis, and so we may choose to represent given vectors |u⟩ and |v⟩ by column vectors of their components in this orthonormal basis:

|u⟩ → (u₁, u₂, ..., uₙ)ᵀ , |v⟩ → (v₁, v₂, ..., vₙ)ᵀ , (30)

where ᵀ denotes the transpose, so these are column vectors. Hence the first row contains the component along the first basis vector |1⟩, the second row contains the component along the second basis vector |2⟩, and so on. In this way, the inner product ⟨u|v⟩ is given by matrix multiplication of the column vector representing |v⟩ and the complex conjugate transpose of the column vector representing |u⟩:

⟨u|v⟩ = (u₁*, u₂*, ..., uₙ*) (v₁, v₂, ..., vₙ)ᵀ = ∑ᵢ₌₁ⁿ uᵢ* vᵢ . (31)

Note that representing the inner product of the bras and kets in this way can only be done if the set of basis vectors is mutually orthogonal (e.g. an orthonormal basis). Only then does the inner product simplify, as in eq. (28).
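The "conjugate transpose times column" recipe of eq. (31) is easy to check numerically. As an illustrative aside (the component values are made up), numpy's vdot implements exactly this rule:

```python
import numpy as np

# Components of |u> and |v> in an orthonormal basis, as complex vectors.
u = np.array([1 + 1j, 2, -1j])
v = np.array([0.5, 1j, 3])

# <u|v> = sum_i u_i^* v_i : np.vdot conjugates its FIRST argument,
# matching the anti-linearity of the inner product in the first slot.
inner = np.vdot(u, v)
print(inner)

# <u|u> is real and non-negative, and ||u|| is its square root (eq. 29).
norm_u = np.sqrt(np.vdot(u, u).real)
print(norm_u)
```

Note that `np.dot(u, v)` would omit the complex conjugation and would therefore not be an inner product in the sense of axioms (i)-(iii).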

2.9 Examples
1) Find the norm of the vector

|u⟩ → (2, −7i, 1, 2i)ᵀ . (32)

Answer:

⟨u|u⟩ = (2, 7i, 1, −2i)(2, −7i, 1, 2i)ᵀ = 4 + 49 + 1 + 4 = 58 . (33)

Therefore

||u|| = √⟨u|u⟩ = √58 . (34)

To normalise |u⟩ one can form a new vector

|ũ⟩ = |u⟩ / ||u|| , (35)

where ⟨ũ|ũ⟩ = 1 and so ||ũ|| = 1.

2) Show that the following vectors are orthogonal:

|u⟩ = (1/√2, 1/√2)ᵀ and |v⟩ = (1/√2, −1/√2)ᵀ . (36)

Answer:

⟨u|v⟩ = (1/√2, 1/√2)(1/√2, −1/√2)ᵀ = 1/2 − 1/2 = 0 (i.e. they are orthogonal) . (37)

3) Show that the following vectors are linearly independent:

|u⟩ = (3, 4)ᵀ and |v⟩ = (2, −6)ᵀ . (38)

Answer: First check to see what values of the scalars a and b are required to satisfy eq. (17) applied to these kets:

a|u⟩ + b|v⟩ = |0⟩ . (39)

This leads to:

(3a, 4a)ᵀ + (2b, −6b)ᵀ = (0, 0)ᵀ . (40)

Therefore

3a + 2b = 0 and 4a − 6b = 0 . (41)

Multiplying the first equation by 3 and adding the second gives 13a = 0, and so a = 0, and consequently b = 0. Since a = b = 0 is the only solution, |u⟩ and |v⟩ are linearly independent. (If there existed solutions not of this type then the vectors would not be linearly independent.) Note that ⟨u|v⟩ = 3 × 2 − 6 × 4 = −18, and so the vectors are not orthogonal. This is an explicit example of a case where linearly independent vectors are not orthogonal (see section 2.6).
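The three worked examples above can be reproduced numerically. This sketch reuses the vectors from eqs. (32) and (38); only the numpy phrasing is new:

```python
import numpy as np

# Example 1: norm of |u> = (2, -7i, 1, 2i)^T, eqs. (32)-(34).
u = np.array([2, -7j, 1, 2j])
inner_uu = np.vdot(u, u).real          # <u|u> = 4 + 49 + 1 + 4 = 58
u_tilde = u / np.sqrt(inner_uu)        # normalised vector, eq. (35)

# Example 3: |u> = (3, 4)^T and |v> = (2, -6)^T are linearly independent
# (the 2x2 matrix of columns has full rank) but NOT orthogonal.
A = np.array([[3, 2],
              [4, -6]])
rank = np.linalg.matrix_rank(A)        # 2, so linearly independent
overlap = np.vdot(np.array([3, 4]), np.array([2, -6]))  # -18, not zero

print(inner_uu, rank, overlap)  # 58.0 2 -18
```

This makes point 2) of section 2.6 concrete: full rank does not imply a vanishing inner product.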

2.10 Gram-Schmidt procedure

Orthonormal bases are generally the easiest ones to work with. The Gram-Schmidt procedure tells us how to obtain an orthonormal basis from any given basis. Let the set of kets |e₁⟩, ..., |eₙ⟩ be a non-orthonormal basis. We can convert this basis into an orthonormal basis, denoted by {|i⟩}, as follows:

1) Make the first vector of the orthonormal basis {|i⟩} by setting

|1⟩ = |e₁⟩ / ||e₁|| . (42)

Clearly the ket |1⟩ is of unit norm, as we learnt from the discussion around eq. (35).

2) Make the second orthonormal basis vector by first defining a vector |2′⟩ as follows:

|2′⟩ = |e₂⟩ − |1⟩⟨1|e₂⟩ . (43)

One can check that |1⟩ and |2′⟩ are orthogonal:

⟨1|2′⟩ = ⟨1|e₂⟩ − ⟨1|1⟩⟨1|e₂⟩ = 0 . (44)

Since ⟨1|1⟩ = 1, then ⟨1|2′⟩ = ⟨1|e₂⟩ − ⟨1|e₂⟩ = 0, and so |1⟩ and |2′⟩ are indeed orthogonal. Then normalise |2′⟩ to find the second vector of the orthonormal basis, denoted by |2⟩:

|2⟩ = |2′⟩ / ||2′|| . (45)

3) For the third orthonormal basis vector set:

|3′⟩ = |e₃⟩ − |1⟩⟨1|e₃⟩ − |2⟩⟨2|e₃⟩ . (46)

One can show that ⟨1|3′⟩ = 0 and ⟨2|3′⟩ = 0, e.g.

⟨1|3′⟩ = ⟨1|e₃⟩ − ⟨1|1⟩⟨1|e₃⟩ − ⟨1|2⟩⟨2|e₃⟩ = 0 , (47)

since ⟨1|1⟩ = 1 and ⟨1|2⟩ = 0. Then normalise to obtain the third vector of the orthonormal basis, denoted by |3⟩:

|3⟩ = |3′⟩ / ||3′|| . (48)

4) At the i-th stage we have

|i′⟩ = |eᵢ⟩ − |1⟩⟨1|eᵢ⟩ − ... − |i−1⟩⟨i−1|eᵢ⟩ , (49)

and normalising gives

|i⟩ = |i′⟩ / ||i′|| . (50)

The original basis {|eᵢ⟩} is composed of linearly independent vectors, and thus one never has |i′⟩ = |0⟩. This is because |i′⟩ in eq. (49) is composed of |eᵢ⟩ (the first term) and a sum of vectors built from |e₁⟩ to |eᵢ₋₁⟩ (the remaining terms). The latter terms can never cancel the first term |eᵢ⟩ because of the linear independence of the original basis {|eᵢ⟩}, and so one can never have |i′⟩ = |0⟩. The Gram-Schmidt procedure shows us that the dimension of a vector space (i.e. the number of linearly independent vectors) is equal to the maximum number of mutually orthogonal vectors that we can construct, thereby linking to our everyday notion of dimensionality (e.g. 3D space has three mutually orthogonal vectors of unit norm, denoted by i, j and k).
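The steps above translate directly into a short routine: subtract the projections onto the vectors built so far, then normalise. This is an illustrative sketch (the test basis is made up, and numerical round-off is ignored):

```python
import numpy as np

def gram_schmidt(basis):
    """Orthonormalise a list of linearly independent vectors,
    following steps 1)-4) above (eqs. 42-50)."""
    ortho = []
    for e in basis:
        v = e.astype(complex)
        # |i'> = |e_i> - sum over built vectors |q> of |q><q|e_i>, eq. (49)
        for q in ortho:
            v = v - q * np.vdot(q, v)
        # normalise, eq. (50)
        ortho.append(v / np.linalg.norm(v))
    return ortho

# A made-up non-orthonormal (but linearly independent) basis of C^3.
basis = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]
q = gram_schmidt(basis)

# Check <i|j> = delta_ij for the output.
for i in range(3):
    for j in range(3):
        expected = 1.0 if i == j else 0.0
        assert np.isclose(np.vdot(q[i], q[j]), expected)
```

If the input vectors were linearly dependent, some |i′⟩ would vanish and the normalisation step would divide by zero, which mirrors the remark above that independence guarantees |i′⟩ ≠ |0⟩.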

2.11 Index notation and summation convention

If two vectors |u⟩ and |v⟩ are equal then their components in a given basis are equal, and we can write uᵢ = vᵢ for all values of i. We could equally write uⱼ = vⱼ. Here i and j are called "free indices", which can take each of the values from 1 to n (where n is the dimensionality of the vector space). Whether we use i or j, each index should appear once and once only in each term of an equation, e.g.

|u⟩ = 2|v⟩ + 5|w⟩ implies uᵢ = 2vᵢ + 5wᵢ . (51)

It would be incorrect to write uᵢ = 2vᵢ + 5wⱼ, since the index i does not appear in the last term. From an abstract equation Ω̂ = Λ̂ + Θ̂ for operators (which have two indices, see later) one would have:

Ω̂ᵢⱼ = Λ̂ᵢⱼ + Θ̂ᵢⱼ , (52)

where each free index appears once and once only in each term. It would be incorrect to write:

Ω̂ᵢⱼ = Λ̂ᵢₖ + Θ̂ᵢⱼₖ . (53)

A summation convention is sometimes used to avoid explicitly writing the summation signs ∑ᵢ₌₁ⁿ, ∑ⱼ₌₁ⁿ, etc. When an index is repeated just once it is summed over all its n values, e.g.

uᵢ*vᵢ ≡ u₁*v₁ + u₂*v₂ + ... + uₙ*vₙ . (54)

Now i is a "dummy index", and it can be replaced by some other dummy index, e.g. uᵢ*vᵢ = uⱼ*vⱼ. Using this summation convention one can write

⟨u|v⟩ = uᵢ*vⱼδᵢⱼ = uᵢ*vᵢ . (55)

However, in this course we will always write the sum signs (∑) explicitly:

⟨u|v⟩ = ∑ᵢ₌₁ⁿ ∑ⱼ₌₁ⁿ uᵢ*vⱼδᵢⱼ = ∑ᵢ₌₁ⁿ uᵢ*vᵢ . (56)
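As an aside, numpy's einsum uses precisely this repeated-index convention, which gives a quick way to check eq. (55) numerically (the example vectors are made up):

```python
import numpy as np

u = np.array([1 + 2j, 3j, -1])
v = np.array([2, 1 - 1j, 4j])

# In einsum's notation, 'i,i->' means: multiply elementwise over the
# repeated index i and sum over it, with no free index left over.
# The conjugation of u must be done explicitly, as in u_i^* v_i.
inner = np.einsum('i,i->', np.conj(u), v)

# This agrees with the conjugate-transpose rule of eq. (31).
assert np.isclose(inner, np.vdot(u, v))
```

A repeated index with a free index left over, e.g. `'ij,j->i'`, would instead correspond to a matrix acting on a vector, ∑ⱼ Ωᵢⱼvⱼ.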

2.12 Dual spaces and Dirac notation


We have been using symbols like |ui to denote abstract vectors in a vector space. This
notation was introduced by Dirac, and is called a “ket”. Using the inner product we
can introduce “bra” vectors, denoted by hu|. Hence hu|vi is a “bra”c”ket”.
An inner product is a mapping from the vector space V to the (real or complex)
numbers. The bra vectors hu| live in a separate vector space V ∗ , called the dual space
of V . For every ket in V there is an associated bra in V ∗ . We can think of V as being
the vector space of column vectors, and V ∗ as the vector space of complex conjugated
row vectors. We say that hu| is the “adjoint” of |ui.
Now multiply |ui by a scalar a. What bra corresponds to a|ui ?

         ( au1 )
         ( au2 )
a|ui −→  ( ... )   and so   hau| −→ ( a∗ u∗1 , a∗ u∗2 , ..., a∗ u∗n ) = hu|a∗ ,   (57)
         ( aun )

where “−→”, means “can be represented as” (in a given orthonormal basis). A common
notation is to write:
|aui = a|ui and hau| = hu|a∗ . (58)

The general rule for finding the adjoint of a linear equation involving kets (bras) is to
replace every ket (bra) by its associated bra (ket), and complex conjugate all coefficients
(i.e. the scalars). For example, a relation between kets such as:

a|ui = b|vi + c|wi (59)

leads to the equation


hu|a∗ = hv|b∗ + hw|c∗ . (60)
These two equations are adjoints of each other.
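Representing kets as numpy arrays of components, the adjoint rule can be checked directly: complex-conjugating every component of eq. (59) reproduces eq. (60). The scalars and components below are arbitrary examples:

```python
import numpy as np

a, b, c = 2j, 1 + 1j, -3.0
v = np.array([1.0, 2j])
w = np.array([0.5, 1 - 1j])

# Kets as column vectors: a|u> = b|v> + c|w>  =>  |u> = (b|v> + c|w>)/a
u = (b * v + c * w) / a

# Bras are the conjugated row vectors: the adjoint of the ket equation,
# <u|a* = <v|b* + <w|c*, in components
lhs = np.conj(a * u)          # components of <u| a*
rhs = np.conj(b * v + c * w)  # components of <v| b* + <w| c*

assert np.allclose(lhs, rhs)
```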
Now consider the ket
|ui = Σ_{i=1}^{n} ui |ii (61)

in an orthonormal basis {|ii}. Then multiply both sides from the left by hj|, where j
takes a particular value from 1 to n. One has
hj|ui = Σ_{i=1}^{n} ui hj|ii = Σ_{i=1}^{n} ui δji = uj . (62)

Swapping j for i, one sees that each component ui is given by an inner product, and
so one can write eq. (61) as:
|ui = Σ_{i=1}^{n} hi|ui |ii . (63)

Now since ui = hi|ui then u∗i = hu|ii by axiom (i). The adjoint of |ui is given by:
hu| = Σ_{i=1}^{n} hu|ii hi| . (64)

To obtain the adjoint of an equation involving bras and kets, reverse the order of all
inner products (i.e. axiom (i)), exchange bras and kets, and complex conjugate all
coefficients (i.e. the scalars).

2.13 Example of manipulating inner products


Show that:
2Rehψ|φi ≤ hψ|ψi + hφ|φi , (65)
where “Re” denotes the real part of the complex number hψ|φi.

Solution:
Use the fact that hf |f i ≥ 0 for any ket |f i, and let |f i = |ψ − φi.
Therefore one has:

hψ − φ|ψ − φi = hψ|ψi − hψ|φi − hφ|ψi + hφ|φi ≥ 0 . (66)

Now recall that hψ|φi = hφ|ψi∗ and z + z ∗ = 2Re(z) for any complex number z.
Therefore one has:
hψ|φi + hφ|ψi = 2Rehψ|φi , (67)
and so eq. (66) becomes

hψ|ψi + hφ|φi − 2Rehψ|φi ≥ 0 . (68)

After rearranging, we arrive at the required result:

2Rehψ|φi ≤ hψ|ψi + hφ|φi . (69)

Two important inequalities involving inner products are:

1) The Cauchy-Schwarz inequality:

|hφ|ψi|2 ≤ hψ|ψihφ|φi . (70)

2) The triangle inequality:


√hψ + φ|ψ + φi ≤ √hψ|ψi + √hφ|φi . (71)
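All three inequalities, eqs. (65), (70) and (71), can be spot-checked numerically (a sketch with random complex vectors in numpy; `np.vdot` conjugates its first argument, matching the inner product axioms):

```python
import numpy as np

rng = np.random.default_rng(0)
psi = rng.standard_normal(4) + 1j * rng.standard_normal(4)
phi = rng.standard_normal(4) + 1j * rng.standard_normal(4)

def inner(a, b):
    return np.vdot(a, b)  # conjugates the first argument, like <a|b>

# eq. (65): 2 Re<psi|phi> <= <psi|psi> + <phi|phi>
assert 2 * inner(psi, phi).real <= inner(psi, psi).real + inner(phi, phi).real

# Cauchy-Schwarz, eq. (70): |<phi|psi>|^2 <= <psi|psi><phi|phi>
assert abs(inner(phi, psi))**2 <= inner(psi, psi).real * inner(phi, phi).real

# Triangle inequality, eq. (71)
s = psi + phi
assert np.sqrt(inner(s, s).real) <= (np.sqrt(inner(psi, psi).real)
                                     + np.sqrt(inner(phi, phi).real))
```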

2.14 Operators
The states of a quantum system are vectors in an appropriate vector space. To get
some physics out we will need to think about transforming one vector into another e.g.
a given initial state will generally evolve in time into another state. “Operators” are
the objects we use to transform one vector into another, and are denoted by a hat on
an algebraic symbol e.g. Ω̂. Physical observables will also correspond to a particular
kind of operator. If Ω̂ is an operator, it acts on |ui in V to produce another vector
|u0 i, also in V . Likewise, Ω̂ acts on hv| in V ∗ to produce another vector hv 0 |, also in
V ∗ . Operators stand to the left of kets and to the right of bras:

Ω̂|ui = |u0 i and hv|Ω̂ = hv 0 | . (72)

2.14.1 Linear operators
We will only be concerned with linear operators, which are defined as follows:
(1) Ω̂a|ui = aΩ̂|ui, and similar for bras, hu|aΩ̂ = hu|Ω̂a .
(2) Ω̂(a|ui + b|vi) = aΩ̂|ui + bΩ̂|vi (and similar for bras) .
These rules imply Ω̂|0i = |0i, since one has:
Ω̂|0i = Ω̂0|ui = 0Ω̂|ui = 0|u0 i = |0i . (73)
The benefit of linear operators is that once their action on the basis vectors is known,
their action on any vector in the vector space is determined i.e. if Ω̂|ii = |i′i for an
orthonormal basis {|ii}, then for any |ui = Σ_{i=1}^{n} ui |ii one has:

Ω̂|ui = Ω̂ Σ_{i=1}^{n} ui |ii = Σ_{i=1}^{n} ui Ω̂|ii = Σ_{i=1}^{n} ui |i′i . (74)

Definition: The identity operator, written as I (and not usually Î), maps every vector into itself.

That is, for all kets and bras one has:

I|ui = |ui and hu|I = hu| . (75)

2.14.2 Product of operators and the commutator


If Λ̂ and Ω̂ are operators then
(Λ̂Ω̂)|ui = Λ̂(Ω̂|ui) = Λ̂|Ω̂ui . (76)
Note that we have introduced the notation Ω̂|ui = |Ω̂ui. If Λ̂|ui = |u′i and Ω̂|ui = |u″i
then
Ω̂Λ̂|ui = Ω̂|u′i and Λ̂Ω̂|ui = Λ̂|u″i . (77)
Now Ω̂|u′i ≠ Λ̂|u″i in general, and so the order of the operators in a product matters.
Since Λ̂Ω̂|ui and Ω̂Λ̂|ui do not give the same result in general then the commutator
[Λ̂, Ω̂] defined by
[Λ̂, Ω̂] ≡ Λ̂Ω̂ − Ω̂Λ̂ (78)
does not vanish in general. Two useful identities are:
[Ω̂, Λ̂Θ̂] = Λ̂[Ω̂, Θ̂] + [Ω̂, Λ̂]Θ̂ , (79)
and
[Ω̂Λ̂, Θ̂] = Ω̂[Λ̂, Θ̂] + [Ω̂, Θ̂]Λ̂ . (80)
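These identities hold for arbitrary operators, so they can be spot-checked with random matrices (a numpy sketch; the matrices stand in for operators in some orthonormal basis):

```python
import numpy as np

rng = np.random.default_rng(1)
def rand_op(n=3):
    return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

Omega, Lam, Theta = rand_op(), rand_op(), rand_op()

def comm(A, B):
    return A @ B - B @ A   # the commutator [A, B], eq. (78)

# eq. (79): [Omega, Lam Theta] = Lam [Omega, Theta] + [Omega, Lam] Theta
assert np.allclose(comm(Omega, Lam @ Theta),
                   Lam @ comm(Omega, Theta) + comm(Omega, Lam) @ Theta)

# eq. (80): [Omega Lam, Theta] = Omega [Lam, Theta] + [Omega, Theta] Lam
assert np.allclose(comm(Omega @ Lam, Theta),
                   Omega @ comm(Lam, Theta) + comm(Omega, Theta) @ Lam)
```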

2.14.3 Inverse of an operator
If Ω̂|ui = |0i implies |ui = |0i and the vector space is finite dimensional, then the
linear operator Ω̂ has an inverse, Ω̂−1 , where

Ω̂−1 Ω̂ = Ω̂Ω̂−1 = I . (81)

2.15 Adjoint of an operator


The adjoint, written Ω̂† , of an operator is defined by:

Adjoint Ω̂† of Ω̂ satisfies the following: If Ω̂|ui = |u0 i then hu|Ω̂† = hu0 |

i.e. if Ω̂ turns a ket |ui into |u0 i, then Ω̂† turns the bra hu| into hu0 |. We can also write

hu|Ω̂† = hΩ̂u| because hΩ̂u| = hu0 | . (82)

We will often use this result when dealing with inner products

hu|Ω̂† = hΩ̂u|
i.e. when one “pulls” the operator out of the bra vector, one adds the † sign onto the
operator. One has the relation:

hu|Ω̂† |vi = hΩ̂u|vi = hv|Ω̂ui∗ = hv|Ω̂|ui∗ , (83)

where we have used axiom (i) to obtain the third term. Another important result is
that the adjoint of a product of operators is the product of adjoints in reverse:

(Ω̂Λ̂)† = Λ̂† Ω̂† . (84)

To prove this, consider the bra vector hΩ̂Λ̂u|. First treat Ω̂Λ̂ as one operator, and so:

hΩ̂Λ̂u| = h(Ω̂Λ̂)u| = hu|(Ω̂Λ̂)† . (85)

Next treat hΛ̂u| as a separate bra vector:

hΩ̂Λ̂u| = hΩ̂(Λ̂u)| = hΛ̂u|Ω̂† . (86)

Next, pull out Λ̂, which pushes Ω̂† further out:

hΛ̂u|Ω̂† = hu|Λ̂† Ω̂† . (87)

Now compare eq. (85) with eq. (87), and one clearly has

(Ω̂Λ̂)† = Λ̂† Ω̂† . (88)
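In a matrix representation the adjoint is the transpose conjugate (see section 2.17), so eq. (88) can be checked directly (a numpy sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(2)
Omega = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Lam = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

def dag(A):
    return A.conj().T   # matrix adjoint: transpose conjugate

# eq. (88): the adjoint of a product is the product of adjoints in reverse
assert np.allclose(dag(Omega @ Lam), dag(Lam) @ dag(Omega))
```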

Consider an equation consisting of kets, bras, scalars and operators:

a1 |ui = a2 |vi + a3 |wihp|qi + a4 Ω̂Λ̂|ri . (89)

The adjoint of this equation is:

hu|a∗1 = hv|a∗2 + hq|pihw|a∗3 + hr|Λ̂† Ω̂† a∗4 . (90)

i.e. make the substitutions bra→ket and ket→bra, ai ⇔ a∗i , Ω̂ ⇔ Ω̂† , and reverse the
order of all terms. Note that there is no need to reverse the location of the scalars ai ,
except in the interest of uniformity.

Definition: Hermitian operator

An operator is “Hermitian” or “self-adjoint” if Ω̂† = Ω̂

The identity operator is Hermitian, I = I † .

Definition: Unitary operator, Û .

An operator is “unitary” if Û † = Û −1

A unitary operator satisfies


Û † Û = Û Û † = I . (91)
Unitary operators preserve the inner product and norm. If we act Û on the kets |ui and
|vi, then the inner product of the transformed kets is the same as the inner product of
the original kets:
hÛ u|Û vi = hu|Û † Û |vi = hu|vi . (92)
This is important in QM because it means that unitary operators preserve probabilities,
which are given by inner products. Unitary operators are generalisations of rotations
in 3D Euclidean space e.g. rotations do not change the length (or “norm”) of a vector.
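A quick numerical illustration (a sketch; the Q factor of a QR decomposition is used here merely as a convenient way to generate a random unitary matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(A)          # the Q factor of a QR decomposition is unitary

assert np.allclose(U.conj().T @ U, np.eye(3))   # eq. (91)

u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# eq. (92): <Uu|Uv> = <u|v>, so norms and probabilities are preserved
assert np.isclose(np.vdot(U @ u, U @ v), np.vdot(u, v))
```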

2.16 Matrix representation of linear operators
So far we have given an abstract definition of operators acting on kets and bras. How-
ever, we have seen that a ket |ui in a n−dimensional vector space can be represented
by a column vector of its components in an orthonormal basis, and likewise, a bra by
a (complex conjugated) row vector (recall eq. (57)). Now we will see that an operator
can be represented by an n × n matrix of components (called “matrix elements”) in
that basis. The matrix elements (like components of vectors) are basis dependent.
Our starting point is Ω̂|ii = |i0 i, where |i0 i is known i.e. we know how Ω̂ acts on a
given orthonormal basis {|ii}. Therefore one has for an arbitrary ket |ui:
Ω̂|ui = Ω̂ Σ_{i=1}^{n} ui |ii = Σ_{i=1}^{n} ui Ω̂|ii = Σ_{i=1}^{n} ui |i′i . (93)

When we said that |i0 i is “known”, we mean that we know its components when
expanded in the original basis, which we now denote by {|ji} instead of {|ii} (i.e. just
swapping the label). The components of |i0 i are given by hj|i0 i in the basis {|ji} (recall
eq. (62)):
hj|i0 i = hj|Ω̂|ii ≡ Ω̂ji (94)
for particular values of i and j. Hence
|i′i = Σ_{j=1}^{n} Ω̂ji |ji . (95)

The n2 numbers, Ω̂ij , are the “matrix elements” of Ω̂ in this orthonormal basis {|ji}.
If we let Ω̂ act on a ket |ui, i.e. Ω̂|ui = |u0 i, the components of the transformed vector
in our orthonormal basis are given by:
u′i = hi|u′i = hi|Ω̂|ui = hi|Ω̂( Σ_{j=1}^{n} uj |ji ) = Σ_{j=1}^{n} uj hi|Ω̂|ji ≡ Σ_{j=1}^{n} Ω̂ij uj , (96)

where Ω̂ij = hi|Ω̂|ji as above. Eq. (96) is a concise way of writing the matrix equation:

( u′1 )   ( h1|Ω̂|1i  h1|Ω̂|2i  ...  h1|Ω̂|ni ) ( u1 )
( u′2 ) = ( h2|Ω̂|1i  h2|Ω̂|2i  ...  h2|Ω̂|ni ) ( u2 )   .   (97)
( ... )   (   ...      ...    ...     ...  ) ( ... )
( u′n )   ( hn|Ω̂|1i    ...    ...  hn|Ω̂|ni ) ( un )

In abstract ket notation (i.e. no basis specified), eq. (97) is written as

|u0 i = Ω̂|ui , (98)

and so by comparing the two equations one can read off the matrix representation of
the operator Ω̂, which is transforming a column vector |ui to another column vector |u0 i.

Example:
Question: Suppose that in some orthonormal basis {|1i, |2i, |3i} an operator acts as
follows:
Ω̂|1i = 2|1i, Ω̂|2i = 3|1i − i|3i, Ω̂|3i = −|2i . (99)
Write the matrix representation of Ω̂ in this basis, denoted by Ω̂ij .

Answer: Using eq. (97) one has:

       ( 2h1|1i   3h1|1i − ih1|3i   −h1|2i )
Ω̂ij =  ( 2h2|1i   3h2|1i − ih2|3i   −h2|2i )  .   (100)
       ( 2h3|1i   3h3|1i − ih3|3i   −h3|2i )

Since hi|ji = δij one has

       ( 2    3    0  )
Ω̂ij =  ( 0    0   −1  )  .   (101)
       ( 0   −i    0  )
In a different basis, Ω̂ij will be (in general) a different matrix.
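The same construction can be done numerically: the jth column of the matrix representing Ω̂ is just the column vector representing Ω̂|ji (a numpy sketch of the example above):

```python
import numpy as np

# Basis kets |1>, |2>, |3> as the columns of the identity
e = np.eye(3, dtype=complex)

# The action of Omega on each basis ket, from eq. (99)
Omega_e1 = 2 * e[:, 0]
Omega_e2 = 3 * e[:, 0] - 1j * e[:, 2]
Omega_e3 = -e[:, 1]

# Matrix elements Omega_ij = <i|Omega|j>: the j-th column is Omega|j>
Omega = np.column_stack([Omega_e1, Omega_e2, Omega_e3])

expected = np.array([[2, 3, 0],
                     [0, 0, -1],
                     [0, -1j, 0]])
assert np.allclose(Omega, expected)   # matches eq. (101)
```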

2.17 Matrix representation of an adjoint operator


The matrix representing Ω̂† is the transpose conjugate (“Hermitian conjugate”) of Ω̂.

(Ω̂† )ij = hi|Ω̂† |ji = hΩ̂i|ji = hj|Ω̂ii∗ = hj|Ω̂|ii∗ , (102)

and so
(Ω̂† )ij = (Ω̂ji )∗ . (103)
The matrix representation of the identity operator is given by:

Iij = hi|I|ji = hi|ji = δij . (104)

Clearly Iij is the identity matrix.

2.18 Outer product representation of operators
The “outer product” between a ket and a bra is written as:
Outer product : |uihv| . (105)
This is a linear operator. If we apply |uihv| to a ket |wi one gets a new ket |uihv|wi
(where hv|wi is just a complex number), as expected for operators. Since |ui and hv|
can be represented by a column vector and a row vector respectively, |uihv| gives a
matrix (as expected, since |uihv| is an operator):
( a1 )                        ( a1 b∗1   ...   ...   a1 b∗n )
( ...)  ( b∗1  ...  b∗n )  =  (   ...    ...   ...     ...  )  .   (106)
( ...)                        (   ...    ...   ...     ...  )
( an )                        ( an b∗1   ...   ...   an b∗n )
We can write any operator in terms of outer products of the basis vectors as follows,
where Ω̂ij are the matrix elements:
Ω̂ = Σ_{i,j=1}^{n} Ω̂ij |iihj| ,   where   Σ_{i,j=1}^{n} ≡ Σ_{i=1}^{n} Σ_{j=1}^{n} .   (107)

Proof: Let |ki and |mi also be basis vectors:

hk|( Σ_{i,j=1}^{n} Ω̂ij |iihj| )|mi = Σ_{i,j=1}^{n} Ω̂ij hk|iihj|mi = Σ_{i,j=1}^{n} hi|Ω̂|ji δki δjm = hk|Ω̂|mi ,   (108)

thus proving eq. (107).
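Eq. (107) can also be verified numerically by rebuilding a random matrix from its elements and the outer products |iihj| (a numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
Omega = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
e = np.eye(n, dtype=complex)

# Rebuild Omega from its matrix elements and outer products |i><j|, eq. (107)
rebuilt = sum(Omega[i, j] * np.outer(e[:, i], e[:, j].conj())
              for i in range(n) for j in range(n))

assert np.allclose(rebuilt, Omega)
```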

2.19 Completeness relation


Consider |ui = Σ_{i=1}^{n} ui |ii, where ui = hi|ui. Therefore one has:

|ui = Σ_{i=1}^{n} hi|ui|ii = Σ_{i=1}^{n} |iihi|ui = ( Σ_{i=1}^{n} |iihi| ) |ui .   (109)

This is true for any ket |ui and so we deduce that

Σ_{i=1}^{n} |iihi| = I .   (110)

Eq. (110) is the “completeness relation”, or ”resolution of the identity”. One rule of
thumb in quantum mechanics is that if you get stuck while manipulating expressions
involving bras and kets, try inserting the completeness relation.

2.20 Projection operators
We can also write the completeness relation as
I = Σ_{i=1}^{n} |iihi| = Σ_{i=1}^{n} P̂i   where   P̂i = |iihi| .   (111)

Here P̂i is the projection operator onto the ith basis vector. The effect of acting P̂i on
an arbitrary ket is to project out the component along the ith basis vector:

P̂i |ui = |iihi|ui = ui |ii , (112)

and a corresponding result for bras

hu|P̂i = hu|iihi| = u∗i hi| . (113)

Projection operators have two important properties:

(1) P̂i = |iihi| is self-adjoint (Hermitian). Clearly P̂i =P̂i† since (|iihi|)† = |iihi|, using
our usual rules for forming the adjoint.

(2) P̂i is equal to its own square, P̂i = P̂i2 , since

P̂i2 = (|iihi|)(|iihi|) = |iihi|iihi| = |iihi| = P̂i , since hi|ii = 1 . (114)

Note that
P̂i P̂j = |iihi|jihj| = δij |iihj| . (115)
So if i ≠ j then
P̂i P̂j |ui = 0̂|ui = |0i . (116)
Note that P̂i P̂j is the zero operator 0̂ in this case, and not the scalar zero. The matrix
elements of P̂i are (where we are taking a particular value of i):

(P̂i )kl = hk|iihi|li = δki δil . (117)

Here we have just inserted P̂i = |iihi| into the usual definition of the matrix elements
in eq. (94). Now δki δil is only non-zero for k = l = i, which tells us that the matrix
representation of P̂i has a 1 at the ith entry on the diagonal, with all other elements

being zero e.g. in a four dimensional vector space one would have for a particular P̂i
(say i = 2):

       ( 0 0 0 0 )
P̂2 =   ( 0 1 0 0 )  .   (118)
       ( 0 0 0 0 )
       ( 0 0 0 0 )
Clearly the sum over i of all the P̂i gives the identity matrix, as expected from the com-
pleteness relation. Finally, using completeness, one can find the matrix representation
of a product of operators:
(Ω̂Λ̂)ij = hi|Ω̂Λ̂|ji = Σ_{k=1}^{n} hi|Ω̂|kihk|Λ̂|ji = Σ_{k=1}^{n} Ω̂ik Λ̂kj .   (119)

This is the usual rule for matrix multiplication, and in obtaining the third term we
have inserted the completeness relation Σ_{k=1}^{n} |kihk| = I.
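The projector properties (1) and (2), eq. (116), and the completeness relation can all be checked numerically (a numpy sketch in a four-dimensional space, matching the example of eq. (118)):

```python
import numpy as np

n = 4
e = np.eye(n, dtype=complex)
P = [np.outer(e[:, i], e[:, i].conj()) for i in range(n)]  # P_i = |i><i|

P2 = P[1]  # projector onto the 2nd basis vector (i = 2 in the notes' labelling)
assert np.allclose(P2, P2.conj().T)      # Hermitian, property (1)
assert np.allclose(P2 @ P2, P2)          # equal to its own square, property (2)
assert np.allclose(P[0] @ P[1], 0)       # P_i P_j = 0 for i != j, eq. (116)
assert np.allclose(sum(P), np.eye(n))    # completeness, eqs. (110)/(111)
```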

2.21 Transformations using a unitary operator


We will now show that a unitary transformation changes a vector of a particular basis to
a vector of another basis. This is the generalisation of a ”rotation“ in three-dimensional
space. Recall that the basis vectors i, j, and k can be rotated to another basis i0 , j0 ,
and k0 , and the rotation matrix satisfies Û † Û = I. Convince yourself that this is true
e.g. take the simple case of a 2 × 2 rotation matrix for 2D space written in terms of
sin θ and cos θ, and you’ll see that Û † Û = I because cos2 θ + sin2 θ = 1. Of course, the
3D case will have two angles, θ and φ. Let us now show this result for an arbitrary ket
in an n dimensional vector space.
Consider two orthonormal bases {|ii} and {|i0 i} for i, i0 = 1 → n, where |i0 i = Û |ii.
We will show that Û has to be a unitary operator. We can expand each vector of the
primed basis as a linear combination of vectors from the unprimed basis. Recalling
eq. (95) with Ω̂ = Û (so |i0 i = Û |ii), one has
|i′i = Σ_{j=1}^{n} |jiÛji .   (120)

The linear operator Û , has matrix elements in the unprimed basis given by:

Ûij = hi|Û |ji . (121)

Now from eq. (120) one has
hi′| = Σ_{k=1}^{n} Ûki∗ hk|   and   |j′i = Σ_{l=1}^{n} |liÛlj .   (122)

To obtain hi′| we have taken the adjoint of eq. (120) and the index to be summed
over has been swapped from j (in eq. (120)) to k. The scalars Ûki need to be complex
conjugated when taking the adjoint and hence Ûki → Ûki∗ . To obtain |j′i, where j′ takes
a specific value, we use eq. (120) but with the fixed index i0 changed to j 0 , and the
index to be summed over changed from j to l. Now hi0 |j 0 i = hÛ i|Û ji = hi|Û † |Û ji =
hi|ji = δij , and using eq. (122) for hi0 | and |j 0 i one has:
δij = hi′|j′i = Σ_{k,l} Ûki∗ hk|li Ûlj = Σ_{k} Ûki∗ Ûkj = (Û † Û )ij .   (123)

Hence Û † Û = I, and so the operator which transforms one orthonormal basis into an-
other is unitary. Conversely, given a unitary operator, we can use it to generate a new
orthonormal basis from an existing one, and (as shown in eq. (92)) this transformation
preserves inner products.

A unitary operator generates a new orthonormal basis from an existing one.

2.22 Active and passive transformations


An ”active transformation” is one in which all vectors are transformed by a unitary
operator Û i.e. |ii → Û |ii = |i0 i. What happens to the matrix elements of an operator
Ω̂? The matrix elements of Ω̂ in the primed basis are expressed in terms of those in
the original (unprimed) basis by:

Ω̂i0 j 0 ≡ hi0 |Ω̂|j 0 i = hÛ i|Ω̂|Û ji = hi|Û † Ω̂Û |ji . (124)


Now insert I = Σ_k |kihk| to the left of Ω̂ and I = Σ_l |lihl| to the right of Ω̂:

Ω̂i′j′ = Σ_{k,l=1}^{n} hi|Û †|kihk|Ω̂|lihl|Û |ji = Σ_{k,l=1}^{n} (Û †)ik Ω̂kl Ûlj .   (125)

Clearly, Ω̂i′j′ ≠ Ω̂ij under the transformation by Û . This is to be expected because the
bases are different, and the matrix representation of an operator is basis dependent.

A ”passive transformation” is one in which the vectors are left alone and all oper-
ators are transformed as
Ω̂ −→ Û † Ω̂Û . (126)
This is just an equivalent way of describing the same transformation, as can be seen
from eq. (124).
The components of a vector in different bases take different numerical values, but
the abstract vector is the same. The different values of the components of an ordinary
3-dimensional vector in different co-ordinate systems (Cartesian, Spherical, Polar) are
an example. We would also like physical observables, which come from matrix elements
of operators like hu|Ω̂|ui to be basis independent. We can do this by saying that under
a change of basis, both states and operators transform as

|ui → Û |ui, Ω̂ → Û Ω̂Û † , (127)

because then
hu|Ω̂|ui → hÛ u|Û Ω̂Û † |Û ui = hu|Ω̂|ui . (128)
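The invariance in eq. (128) is easy to confirm numerically (a sketch; a random unitary Û, generated via a QR decomposition, plays the role of the change of basis):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(A)                      # a unitary "change of basis"

Omega = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Transform state and operator together, eq. (127)
u_new = U @ u
Omega_new = U @ Omega @ U.conj().T

# The matrix element <u|Omega|u> is basis independent, eq. (128)
before = np.vdot(u, Omega @ u)
after = np.vdot(u_new, Omega_new @ u_new)
assert np.isclose(before, after)
```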

2.23 Hermitian operators


Observables in QM correspond to “Hermitian operators”, whose eigenvalues give the
possible results of carrying out a measurement of the observable. After making a
measurement, the system (described by the state vector) will be an eigenvector which
corresponds to the measured eigenvalue. We will study this in more detail in section
3. In this subsection we write down some properties of Hermitian operators and their
eigenvalues and eigenvectors.
An eigenvector |ei of a linear operator Ω̂ is a vector with a non-zero norm which is
simply rescaled under the action of the operator:

Ω̂|ei = ω|ei . (129)

Here ω is a scalar called the “eigenvalue”. This equation does not fix the norm of the
eigenvector, and this can be seen by considering the scaled eigenvector a|ei:

Ω̂(a|ei) = aΩ̂|ei = aω|ei = ω(a|ei) . (130)

Eq. (130) shows that a|ei is also an eigenvector of Ω̂ with eigenvalue ω. The eigenvalues
can be obtained from the equation

det(Ω̂ − ωI) = 0 , (131)

where “det” means the determinant of the matrix formed from subtracting the matrix
ωIij from the matrix Ω̂ij . Eq. (131) leads to an expression of the form
Σ_{m=0}^{n} cm ω^m = 0 ,   (132)

which is called the “characteristic equation”, with cm being scalars and n is the dimension
of the space. We define the characteristic polynomial P^n (ω) as

P^n (ω) = Σ_{m=0}^{n} cm ω^m .   (133)

The eigenvalues are the roots of the equation P n (ω) = 0, and are independent of the
basis, since the abstract equation Ω̂|ei = ω|ei has not specified a basis. Hence it does
not matter which basis we use for the matrix Ω̂ij ; the eigenvalues will be the same.
Of course, the column vector representations of the eigenvectors are basis dependent.
Every nth order polynomial has n roots, not necessarily distinct, and not necessarily
real. So every operator in an n−dimensional vector space (i.e. an n × n matrix) has n
eigenvalues.

Examples
1) Consider the following matrix in a 3-dimensional space:
 
1 0 0
 0 0 −1  . (134)
0 1 0

This matrix has the characteristic equation

      ( 1−ω    0     0 )
det   (  0    −ω    −1 )  = 0   i.e.   (1 − ω)(ω^2 + 1) = 0 .   (135)
      (  0     1    −ω )

The three roots are ω = 1, i, −i, which are distinct but are not all real.

2) The matrix:  
1 0 1
 0 2 0  (136)
1 0 1

has characteristic equation (ω − 2)2 ω = 0. The roots are ω = 2, 2, 0, which are all real
and have a repeated or “degenerate” eigenvalue.

3) The matrix:  
2 0 −1
 0 3 1  (137)
1 0 4
leads to (3 − ω)(ω 2 − 6ω + 9) = 0, with eigenvalues ω = 3, 3, 3 i.e. all real and a 3-fold
degeneracy.
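The three characteristic equations above can be cross-checked with a numerical eigenvalue routine (a numpy sketch; note that example 3 is not Hermitian and is defective, so a looser tolerance is used for its numerically delicate repeated root):

```python
import numpy as np

M1 = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]])   # example 1
M2 = np.array([[1, 0, 1], [0, 2, 0], [1, 0, 1]])    # example 2
M3 = np.array([[2, 0, -1], [0, 3, 1], [1, 0, 4]])   # example 3

def eigs(M):
    # Roots of det(M - w I) = 0, rounded and sorted for easy comparison
    w = np.round(np.linalg.eigvals(M), 8)
    return sorted(w, key=lambda z: (z.real, z.imag))

assert np.allclose(eigs(M1), [-1j, 1j, 1])           # distinct, not all real
assert np.allclose(eigs(M2), [0, 2, 2])              # real, one degenerate pair
assert np.allclose(eigs(M3), [3, 3, 3], atol=1e-4)   # 3-fold degenerate
```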
The eigenvectors are then found by using Ω̂|ei = ω|ei for each eigenvalue ω. For a
Hermitian operator the n eigenvectors are distinct, even if some of the eigenvalues are
degenerate. This method can give all the eigenvectors if the eigenvalues are distinct.
For the above matrix  
1 0 0
 0 0 −1  (138)
0 1 0
one has the following eigenvectors:

( x1 )                 (  0  )                (   0  )
(  0 )  for ω = 1 ;    ( ix1 )  for ω = i ;   ( −ix1 )  for ω = −i .   (139)
(  0 )                 (  x1 )                (   x1 )
Here x1 is an arbitrary scaling factor. In QM it is in our interest to normalise the
eigenvector, and so

( x1 )   ( 1 )          (  0  )            ( 0 )
(  0 ) → ( 0 )   and    ( ix1 ) → (1/√2)   ( i )   etc.   (140)
(  0 )   ( 0 )          (  x1 )            ( 1 )
Note that this normalisation condition still does not fix the eigenvector uniquely, since
the normalised eigenvector may be multiplied by eiθ (for some θ 6= 0) without changing
the norm. There is no universally accepted convention for eliminating this freedom,
except perhaps to choose the vector with real components when possible i.e.

( 1 )                 ( eiθ )
( 0 )  instead of     (  0  )  .   (141)
( 0 )                 (  0  )
If there is a degeneracy of the eigenvalues (e.g. the matrix in examples 2) and 3) above)
then this method will not give all the eigenvectors.

If, however, we know that the eigenvectors are orthogonal (as we shall see shortly is
indeed the case for Hermitian operators) then we can find all the remaining eigenvectors
by orthogonality. We now state an important result for the eigenvalues of Hermitian
operators, which we will prove shortly:

Hermitian operators have real eigenvalues

This has important consequences for QM, since the real eigenvalues correspond to
experimental observables like energy and position which can (of course) only take real
values. Operators which correspond to observables can only be Hermitian operators.
Proof: Hermitian operators have real eigenvalues.

Let Ω̂ = Ω̂† be a Hermitian operator, with eigenvector |ei and corresponding eigen-
value ω. Then one has:

ωhe|ei = he|Ω̂|ei = hΩ̂e|ei∗ = he|Ω̂† |ei∗ = he|Ω̂|ei∗ = ω ∗ he|ei . (142)

Hence ωhe|ei = ω ∗ he|ei. Since |ei 6= |0i, then he|ei > 0, leading to ω = ω ∗ and so ω is
real.
When one takes the adjoint of the eigenvalue equation Ω̂|ei = ω|ei one has

he|Ω̂† = he|ω ∗ or equivalently he|Ω̂ = he|ω , (143)

since Ω̂† = Ω̂ and ω = ω ∗ . We now state an important result regarding the eigenvectors
of a Hermitian operator, which was mentioned earlier:

Eigenvectors corresponding to distinct eigenvalues of a Hermitian operator are orthogonal

To prove this, let Ω̂ = Ω̂† be a Hermitian operator, and let Ω̂|e1 i = ω1 |e1 i and Ω̂|e2 i =
ω2 |e2 i, with ω1 ≠ ω2 (i.e. distinct eigenvalues). Now

he1 |Ω̂|e2 i = ω1 he1 |e2 i = ω2 he1 |e2 i , (144)

where ω1 he1 |e2 i comes from Ω̂ acting on he1 | and ω2 he1 |e2 i comes from Ω̂ acting on |e2 i.
Hence one has
(ω1 − ω2 )he1 |e2 i = 0 . (145)

Now if ω1 ≠ ω2 (which we assumed) then he1 |e2 i = 0, and so the eigenvectors are
orthogonal. This result is true for any pair of eigenvectors (e.g. |ei i and |ej i) in the
vector space. Recall that a set of mutually orthogonal vectors are linearly independent
and thus form a basis for the vector space.
This (short) proof only works for the case of distinct eigenvalues. However, even in
the case of degenerate eigenvalues it can be proved that the normalised eigenvectors of
a Hermitian operator form an orthonormal basis (although the proof is longer).
Definition: Normal operator

An operator N̂ is “normal” if [N̂ , N̂ † ] = N̂ N̂ † − N̂ † N̂ = 0

This definition includes Hermitian (N̂ = N̂ † ) and unitary (N̂ † N̂ = N̂ N̂ † = I) operators.
There is a theorem which shows that every normal operator on a finite-dimensional
vector space has an orthonormal basis of eigenvectors, called the “eigenbasis” of that
operator. We do not prove this here, but we do consider some implications of the
result. The matrix representation of the operator is diagonal in its eigenbasis, with
the eigenvalues as the diagonal entries (we will show this shortly). We say that the
operator can be “diagonalised”. Let’s adopt the convention of labelling an eigenvector
by its corresponding eigenvalue. Then in an n-dimensional space we have a set of n
eigenvector-eigenvalue pairs, each satisfying
Ω̂|ωi i = ωi |ωi i . (146)
Not all of the eigenvalues need be distinct, and some could be zero (in which case Ω̂ has
no inverse). The theorem tells us that we have an orthonormal basis of eigenvectors of
Ω̂, and so we know that hωi |ωj i = δij . Thus the matrix elements of Ω̂ in its eigenbasis
are:
Ω̂ij = hωi |Ω̂|ωj i = δij ωj . (147)
Hence the matrix representation is

( ω1   0   ...   0  )
(  0   ω2  ...   0  )   ,   (148)
(  0   0   ...   ωn )
i.e. a diagonal matrix.
Now any operator can be written in the form of eq. (107), and so using eq. (147)
in eq. (107) one has:
X n
Ω̂ = ωi |ωi ihωi | . (149)
i=1

This form of writing a Hermitian operator is called the “spectral decomposition” i.e.
the Hermitian operator is written as an outer product in its eigenbasis. If all ωi are
non-zero then there is an inverse operator Ω̂−1 given by:
Ω̂−1 = Σ_{i=1}^{n} ωi−1 |ωi ihωi | .   (150)

One can check that Ω̂−1 Ω̂ = Ω̂Ω̂−1 = I (i.e. make use of the completeness relation
I = Σ_{i=1}^{n} |ωi ihωi |).
If we start with an orthonormal basis |1i,...|ni, there will be a unitary matrix Û
which ”rotates” these basis vectors to the eigenbasis of the particular operator Ω̂ (recall
section 2.21):
|ωi i = Û |ii , (151)
i.e. the orthonormal basis that is generated from the basis {|ii}, which was labelled
by the vectors {|i0 i} in section 2.21, is now labelled by |ωi i because we have chosen a
specific Û which rotates to the eigenbasis. Since Ω̂ is diagonal in its eigenbasis (recall
eq. (148)), our result on active transformations in eq. (124) shows us that

hωi |Ω̂|ωj i = ωj δij = hi|Û † Ω̂Û |ji = (Û † Ω̂Û )ij , (152)

This result can be restated as follows: if Ω̂ is a Hermitian matrix, there is a unitary


matrix such that the matrix Û † Ω̂Û is diagonal.
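For a Hermitian matrix this whole chain of results (real eigenvalues, an orthonormal eigenbasis, diagonalisation by Û , and the spectral decomposition of eq. (149)) can be confirmed with numpy's `eigh` (a sketch with a random Hermitian matrix):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Omega = A + A.conj().T                 # a random Hermitian matrix

w, U = np.linalg.eigh(Omega)           # real eigenvalues w; eigenvectors as columns of U

assert np.allclose(U.conj().T @ U, np.eye(4))           # eigenbasis is orthonormal
assert np.allclose(U.conj().T @ Omega @ U, np.diag(w))  # U^dag Omega U is diagonal, eq. (152)

# Spectral decomposition, eq. (149): Omega = sum_i w_i |w_i><w_i|
rebuilt = sum(w[i] * np.outer(U[:, i], U[:, i].conj()) for i in range(4))
assert np.allclose(rebuilt, Omega)
```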

2.24 Degeneracy of eigenvalues


If all the eigenvalues are distinct then the orthonormal eigenbasis is unique. What if
there is degeneracy? Suppose that there is a degeneracy where two eigenvalues are the
same, ω1 = ω2 = ω. Then we have two orthonormal vectors obeying

Ω̂|ω1 i = ω|ω1 i and Ω̂|ω2 i = ω|ω2 i . (153)

Thus it follows that for any scalars α and β:

Ω̂[α|ω1 i + β|ω2 i] = αω|ω1 i + βω|ω2 i = ω[α|ω1 i + β|ω2 i] . (154)

Since the eigenvectors |ω1 i and |ω2 i are orthogonal, we find that there is a two di-
mensional subspace with basis vectors |ω1 i and |ω2 i, in which all the elements are
eigenvectors of Ω̂ with eigenvalue ω. Consequently, besides the vectors |ω1 i and |ω2 i,
there exists an infinity of orthonormal pairs |ω1 i and |ω2 i obtained by rotation of |ω1 i

and |ω2 i, from which we may select any pair in forming the eigenbasis of Ω̂ i.e. the
eigenvectors sin θ|ω1 i + cos θ|ω2 i and − cos θ|ω1 i + sin θ|ω2 i, which both have unit norm
and are orthogonal. So in the case of degeneracy of eigenvalues there is not a single,
unique eigenbasis, but instead an infinite number of eigenbases.
In general, if an eigenvalue ω occurs m times there will be a subspace V m (m < n)
from which we may choose any m orthonormal vectors. These m vectors, together
with the unique eigenvectors, give a choice of eigenbasis for the whole space V n . The
subspace V m can also be called Vωm .
For the eigenvectors in the subspace Vωm we need to modify our labelling. We
introduce a degeneracy label, which is an integer running from 1 to m e.g. if ω1 =
ω2 = ω then we label the orthonormal eigenvectors in Vωm as

|ω, 1i and |ω, 2i , (155)

instead of |ω1 i and |ω2 i. Note that |ω, 1i and |ω, 2i are not unique kets, but refer to
any orthonormal pairs in the subspace Vωm .
Now consider another operator Λ̂ which commutes with Ω̂ i.e.

[Λ̂, Ω̂] = 0 , (156)

and |λi i are eigenvectors of Λ̂ with none of the eigenvalues λi being degenerate

Λ̂|λi i = λi |λi i . (157)

One has
Λ̂(Ω̂|λi i) = Ω̂Λ̂|λi i = Ω̂λi |λi i = λi (Ω̂|λi i) , (158)
and so (Ω̂|λi i) is an eigenvector of Λ̂. What is the effect of Ω̂ on |λi i? We know that

Ω̂|λi i ≠ |ui ,   (159)

i.e. Ω̂ cannot change |λi i to a different ket |ui (say). If this were true then eq. (158)
would give Λ̂|ui = λi |ui, which contradicts our assumption that |λi i is the only ket
with eigenvalue λi . Hence the only possibility is

Ω̂|λi i = ωi |λi i , (160)

and so the set of eigenvectors {|λi i} of Λ̂ are also eigenvectors of Ω̂. Thus the operators
Λ̂ and Ω̂ have ”common eigenvectors“.

If both Λ̂ and Ω̂ have degenerate eigenvalues then the proof is longer, but the
conclusion is still the same:

If two Hermitian operators commute then they have common eigenvectors

We recall that {|λi i} is a unique basis when none of λi have degenerate eigenvalues.
We can then label the unique and common basis of eigenvectors of Ω̂ and Λ̂ as |ωi , λi i.
e.g. even if ω1 = ω2 = ω, since λ1 6= λ2 then one has two unique kets |ω, λ1 i and |ω, λ2 i
which are distinguished by λi , instead of having an infinite choice of eigenvector pairs
within the subspace Vωm (where m = 2 in this example).
Now if λi has degeneracy in the subspace Vωm where ωi has degeneracy (e.g. λ1 =
λ2 = λ in the above example), then to have unique kets it is necessary to have a third
operator Γ̂ with no degeneracy in this subspace Vωm , and which commutes with Ω̂ and
Λ̂. Then the eigenvalues γi of Γ̂ can distinguish the kets e.g. |ω, λ, γ1 i and |ω, λ, γ2 i. A
”complete set of commuting operators” (Ω̂, Λ̂, Γ̂....) is the set of operators which nails
down a unique, common eigenbasis. A single Hermitian operator with no degenerate
eigenvalues is by itself a complete commuting set.
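A toy numerical illustration (a sketch: the commuting pair below is constructed by hand from a common eigenbasis, which is an assumption of this example rather than part of the notes). An eigenvector of the non-degenerate operator Λ̂ is automatically an eigenvector of Ω̂, as in eq. (160):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(A)                      # random unitary: a common eigenbasis

Omega = U @ np.diag([1.0, 1.0, 2.0]) @ U.conj().T   # degenerate eigenvalue 1
Lam = U @ np.diag([5.0, 6.0, 7.0]) @ U.conj().T     # non-degenerate eigenvalues

assert np.allclose(Omega @ Lam, Lam @ Omega)        # [Omega, Lambda] = 0

# Each eigenvector of Lambda is also an eigenvector of Omega, eq. (160)
w, V = np.linalg.eigh(Lam)
for i in range(3):
    v = V[:, i]
    ratio = np.vdot(v, Omega @ v)          # candidate eigenvalue <v|Omega|v>
    assert np.allclose(Omega @ v, ratio * v)
```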

2.25 Continuous bases in a vector space


Up to now we have discussed vector spaces in which the basis vectors are labelled by
a finite integer i ranging from 1 to n, and are thus countable. In QM, the measured
values of observables correspond to eigenvalues of Hermitian operators, and after a
measurement the state vector will be an eigenvector of the Hermitian operator under
consideration. However, QM can have an infinite tower of energies e.g. the hydrogen
atom and the quantum simple harmonic oscillator have an infinite number of discrete
energy levels. Consequently, to describe physical systems like these we need to allow
n to extend to infinity in order to have an eigenvector of the energy operator (say) for
each allowed energy level. Moreover, a particle in a one-dimensional well of finite depth
can have both discrete energy eigenvalues (”bound states“) together with a continuous
range of allowed energies for states which are freely propagating outside the well.
So we have to extend our ideas to include vector spaces with an infinite number of
dimensions, and to allow not only discrete eigenvalues, but also continuous eigenvalues.
We will now discuss the differences between a discrete basis (in which the basis vectors
are labelled by an integer) and a continuous basis (in which the basis vectors are
labelled by a continuous parameter). The orthonormality of the basis vectors differs
for these two types of basis.

2.26 Generalisation to infinite dimensional vector spaces
Below is a comparison of a discrete basis (finite or infinite dimensional) and a continuous
basis (infinite dimensional):
Discrete basis

i) Orthonormal n dimensional basis |1i, |2i, ..., |ni, where n can be a finite integer or tend
to infinity.
ii) An arbitrary ket is written as: |vi = Σ_{i=1}^{n} vi |ii.
iii) hu|vi = u∗1 v1 + ... + u∗n vn .
iv) hi|ji = δij (Kronecker delta orthonormality).
Continuous basis

v) Orthonormal ∞ dimensional basis |xi, where x is real and continuous in some
interval a < x < b, e.g. −∞ < x < ∞.
vi) |vi = ∫_a^b v(x)|xi dx and hv| = ∫_a^b v∗(x)hx| dx.
vii) hu|vi = ∫_a^b u∗(x)v(x) dx.
viii) hx|x′i = δ(x − x′) (Dirac delta function orthonormality).
Notes

1) For a discrete basis we label the basis vectors by integers i (e.g. |ii). In a continuous
basis we label them by a continuous real number in some range, |xi.

2) The ket |vi can be thought of as an infinitely long column vector, with each row
corresponding to a different value of the continuous variable x. Consequently, the basis
vectors |xi are also infinitely long column vectors.

3) Point vii) above is the definition of the inner product for a bra and a ket which have
been expanded in a continuous basis. The sum over components in the discrete case
has been replaced by an integral over a continuous variable, which is like a “sum over
the continuous components v(x)”.

4) The function v(x) is the direct analogue of the vector components vi i.e. vi → vx →
v(x). If we interpret the continuous variable x as the observable “position”, then v(x)
is the wave function in the position basis, and |xi are the eigenvectors of the Hermitian

38
position operator x̂. The vector |vi is the state vector. Since the wave function in QM
also depends on time, then we have v(x, t), and |v(t)i.
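As a quick numerical aside (my own illustration, not part of the notes): the continuous inner product in vii) can be approximated by discretising x on a grid, which makes the analogy with the discrete sum in iii) explicit. The grid, spacing and Gaussian test functions below are arbitrary choices.

```python
import numpy as np

# Discretise the interval so the integral in vii) becomes a finite sum,
# in direct analogy with the discrete-basis inner product iii).
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# Two normalised Gaussian "wave functions" u(x) and v(x) (arbitrary test choices).
u = np.pi**-0.25 * np.exp(-x**2 / 2)
v = np.pi**-0.25 * np.exp(-(x - 1.0)**2 / 2)

# <u|v> = integral of u*(x) v(x) dx, approximated by a Riemann sum.
inner_uv = np.sum(np.conj(u) * v) * dx
norm_u = np.sum(np.conj(u) * u) * dx

print(norm_u)    # ~1, since u is normalised
print(inner_uv)  # analytic overlap for these Gaussians is exp(-1/4)
```

The sum over grid points times dx plays the role of the "sum over continuous components v(x)" described in note 3.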
Note that the result hv|vi = 1 (i.e. a normalised ket |vi) leads to the familiar result
for normalising the wave function
∫_{−∞}^{∞} v∗(x)v(x) dx = 1 . (161)

The requirement that vii) above follows from vi) fixes the orthonormality condition in
viii). To see this explicitly, using vi) one has
|vi = ∫_a^b v(x)|xi dx and hu| = ∫_a^b u∗(x0 )hx0 | dx0 . (162)

Then
hu|vi = ∫_a^b dx0 ∫_a^b dx u∗(x0 )v(x)hx0 |xi . (163)

If we require vii) to hold then the following relation is necessary


∫_a^b dx v(x)hx0 |xi = v(x0 ) , (164)

and then swap x0 with x to agree with vii). Clearly eq. (164) is just the sampling
property of the Dirac delta function, and so

hx|x0 i = δ(x − x0 ) . (165)

Thus we have derived viii) from vi) and vii).


Now since |vi = ∫_a^b v(x)|xi dx then multiplying from the left by hx0 | one has:
hx0 |vi = ∫_a^b v(x)hx0 |xi dx = ∫_a^b v(x)δ(x0 − x) dx = v(x0 ) . (166)

Swapping x0 for x one has hx|vi = v(x) and so one can write our arbitrary ket |vi now as
|vi = ∫_a^b |xihx|vi dx . (167)

Hence the wave function v(x) in vi) is an inner product hx|vi of the state vector with
a ket from a continuous basis. In other words, the component of the state vector |vi
along the basis vector |xi is the value of the wave function at x. From eq. (167) one

has the following “resolution of the identity” (completeness relation) for the vectors in a continuous basis:
I = ∫_a^b |xihx| dx . (168)
This is analogous to I = Σ_i |iihi| in the discrete case. Think of eq. (168) as a piece of maths which can be inserted in expressions involving bras and kets, e.g. one can multiply a ket without changing it as follows:
|vi = I|vi = ∫_a^b |xihx|vi dx . (169)
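In the discrete case the completeness relation I = Σ_i |iihi| can be checked directly with small matrices. A sketch (my own illustration, using a random orthonormal basis generated via a QR decomposition):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random orthonormal basis of C^4 via QR decomposition.
n = 4
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

# The sum of outer products |i><i| over a complete orthonormal basis gives I.
resolution = sum(np.outer(Q[:, i], np.conj(Q[:, i])) for i in range(n))

# Inserting I leaves any ket unchanged, as in eq. (169).
v = rng.normal(size=n) + 1j * rng.normal(size=n)
print(np.allclose(resolution, np.eye(n)))   # True
print(np.allclose(resolution @ v, v))       # True
```

The continuous relation (168) is the same statement with the sum over i replaced by an integral over x.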

2.27 Functions can be expressed as vectors in a continuous basis
A ket |f (x)i expanded in a continuous basis can represent a function f (x), and vice versa (i.e. x is a label that takes continuous values from −∞ to ∞):
|f (x)i ↔ ( f (−∞), ..., f (−1), ..., f (0), ..., f (1), ..., f (∞) )^T or ( hx|f i_{x=−∞}, ..., hx|f i_{x=−1}, ..., hx|f i_{x=0}, ..., hx|f i_{x=1}, ..., hx|f i_{x=∞} )^T . (170)
The ... stand for the continuous values of the function f (x) in between the chosen
values of x. Take a particular value x = x0 . This basis vector |x0 i is a Dirac-delta
function expressed as a ket, and can be represented by a column vector of zeros, with

the only non-zero entry at x = x0 (in the |xi-basis):
|x0 i ↔ ( 0, ..., 0, ..., ∞, ..., 0, ..., 0 )^T . (171)
Operators can be thought of as a square matrix with an infinite number of rows
and columns, with matrix elements labelled by x and x0 e.g. Ω̂xx0 instead of Ω̂ij . As
usual, an operator (in general) changes a ket to another ket.
Ω̂|f i = |f̃ i . (172)
The differential operator d/dx changes a function f (x) to another function df /dx. In
ket notation we would write
D̂|f i = |df /dxi , (173)
where |f i is the ket corresponding to the function f (x) and |df /dxi is the transformed
ket corresponding to the function df /dx. The operator D̂ plays the role of d/dx. To
obtain the matrix representing D̂ in the basis of {|xi} one multiplies eq. (173) from
the left with hx|:
hx|D̂|f i = hx|df /dxi = df (x)/dx . (174)
Now insert the resolution of the identity in hx|D̂|f i only, giving
∫ hx|D̂|x0 ihx0 |f i dx0 = ∫ hx|D̂|x0 i f (x0 ) dx0 = df (x)/dx , (175)
and from the sampling property of the delta function one can see that
hx|D̂|x0 i = δ(x − x0 ) ∂/∂x0 , (176)
i.e. an infinitely long square matrix with ∂/∂x0 on the diagonal. Now consider the momentum operator in the x−direction, p̂ = −ih̄∂/∂x. Clearly one has
hx|p̂|x0 i = −ih̄ δ(x − x0 ) ∂/∂x0 . (177)
The momentum operator is a Hermitian operator because it satisfies the following
condition for Hermitian operators:

hg|p̂|f i = hf |p̂|gi∗ (178)

This result follows from eq. (83) by taking Ω̂ = Ω̂† = p̂. Eq. (178) can be verified by
integrating by parts:
−∫_{−∞}^{∞} g∗(x) ih̄ (∂f (x)/∂x) dx = ( −∫_{−∞}^{∞} f ∗(x) ih̄ (∂g(x)/∂x) dx )∗ , (179)

provided that the surface term vanishes
[−ih̄ g∗(x)f (x)]_{−∞}^{∞} = 0 . (180)

This term is guaranteed to vanish if f (x) = 0 and g(x) = 0 at x = ±∞. Wave functions
are required to satisfy this condition, which is known as “square integrability”, and so
the momentum operator is Hermitian.
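One can mimic this on a computer by representing p̂ = −ih̄ ∂/∂x as a matrix on a grid. With a central-difference derivative and periodic boundary conditions (my own sketch, with h̄ set to 1; the grid size is arbitrary) the resulting matrix is Hermitian, matching eq. (178):

```python
import numpy as np

hbar = 1.0
n, dx = 200, 0.1

# Central-difference first derivative with periodic boundaries:
# (D f)_j = (f_{j+1} - f_{j-1}) / (2 dx).
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * dx)
D[0, -1], D[-1, 0] = -1 / (2 * dx), 1 / (2 * dx)

# Momentum matrix p = -i hbar D. D is antisymmetric, so p is Hermitian.
p = -1j * hbar * D

print(np.allclose(p, p.conj().T))  # True
```

The antisymmetry of the discrete derivative plays the role of the integration by parts in eq. (179), and the periodic boundary plays the role of the vanishing surface term (180).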
In ket notation one has the eigenvalue equation for the possible measured values of
the momentum:
p̂|pi = p|pi , (181)
where p is a real number (as required for a Hermitian operator) and |pi is an eigenvector
of momentum. Now multiply eq. (181) from the left-hand side by the bra vector hx|
on both terms
⇒ hx|p̂|pi = phx|pi (182)
Now insert I = ∫ |x0 ihx0 | dx0 between p̂ and |pi on the left-hand side of eq. (182) only
⇒ ∫ hx|p̂|x0 ihx0 |pi dx0 = phx|pi . (183)

Using eq. (177) one has:
⇒ −ih̄ ∫ δ(x − x0 ) (∂/∂x0 ) hx0 |pi dx0 = phx|pi , (184)
⇒ −ih̄ (∂/∂x) hx|pi = phx|pi , (185)
⇒ hx|pi = A e^{ipx/h̄} , (186)

where A is a free parameter unrestricted by the eigenvalue equation. We choose
A = 1/√(2πh̄) . (187)
With this choice one has
hp|p0 i = ∫_{−∞}^{∞} hp|xihx|p0 i dx = (1/(2πh̄)) ∫_{−∞}^{∞} e^{−i(p−p0 )x/h̄} dx , (188)

where we have made use of eq. (186) and hp|xi = hx|pi∗ . This is one definition of the
Dirac-delta function and so
hp|p0 i = δ(p − p0 ) . (189)
Therefore the eigenvectors of momentum |pi also form a continuous basis because
they satisfy the orthonormality condition. Consequently, we have the resolution of the
identity for the eigenvectors |pi, i.e.
I = ∫ |pihp| dp . (190)

Since |xi and |pi are two alternative bases, we can expand either in terms of the other.
Hence for a particular eigenvalue of p and varying x one has
|pi = ∫ |xihx|pi dx . (191)

Here,
hx|pi = (1/√(2πh̄)) e^{ipx/h̄} (192)
are the components of a momentum eigenstate |pi in the |xi basis, or in other words,
the position space wave function of a momentum eigenstate. In a similar way, for a
particular value of x and varying p we have
|xi = ∫ |pihp|xi dp . (193)

Here,
hp|xi = hx|pi∗ = (1/√(2πh̄)) e^{−ipx/h̄} (194)
are the components of a position eigenstate |xi in the |pi basis, or in other words, the
momentum space wave function of a position eigenstate.

Now let’s consider an arbitrary state |ψi, which is neither an eigenstate of position nor of momentum, but (of course) can be expanded in terms of the basis vectors (either the |xi basis or the |pi basis). Take the inner product hx|ψi and insert I = ∫ |pihp| dp. Hence one has
hx|ψi = ∫ hx|pihp|ψi dp . (195)

Now hx|ψi = ψ(x) (position space wave function), hp|ψi = ψ̃(p) (momentum space
wave function), and we know hx|pi from eq. (192). Making these substitutions one sees
that eq. (195) is the same as the following equation:
ψ(x) = (1/√(2πh̄)) ∫ e^{ipx/h̄} ψ̃(p) dp . (196)

Clearly, ψ(x) is the Fourier transform of ψ̃(p). In a similar way we can start with hp|ψi and insert I = ∫ |xihx| dx, which leads to
hp|ψi = ∫ hp|xihx|ψi dx , (197)

from which we obtain
ψ̃(p) = (1/√(2πh̄)) ∫ e^{−ipx/h̄} ψ(x) dx , (198)
which is again a Fourier transform.
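Eq. (198) can be checked numerically for a Gaussian, which (with h̄ = 1 and unit width) is its own Fourier transform under this convention. A sketch using direct quadrature; the grid and test function are my own choices:

```python
import numpy as np

hbar = 1.0
x = np.linspace(-20.0, 20.0, 2001)
dx = x[1] - x[0]

# Position-space Gaussian, psi(x) = pi^(-1/4) exp(-x^2/2), normalised.
psi_x = np.pi**-0.25 * np.exp(-x**2 / 2)

# psi~(p) = (2 pi hbar)^(-1/2) * integral of exp(-i p x / hbar) psi(x) dx, eq. (198).
p = np.linspace(-5.0, 5.0, 201)
phase = np.exp(-1j * np.outer(p, x) / hbar)
psi_p = (phase @ psi_x) * dx / np.sqrt(2 * np.pi * hbar)

# For this Gaussian the exact transform is again pi^(-1/4) exp(-p^2/2).
exact = np.pi**-0.25 * np.exp(-p**2 / 2)
print(np.max(np.abs(psi_p - exact)))  # ~0
```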
The orthonormal basis |xi has come from the eigenvectors of a Hermitian position
operator x̂, which satisfies the following equations

x̂|xi = x|xi and hx|x̂ = hx|x . (199)

The matrix elements of x̂ in the |xi basis are

hx0 |x̂|xi = xhx0 |xi = xδ(x0 − x) . (200)

Now consider the action of x̂ on an arbitrary state |ψi, and look at the components in
the x−basis. The operator x̂ will change |ψi to a new ket x̂|ψi, which can be written
as one ket |x̂ψi. Hence the components of this new ket in the x−basis are given by hx|x̂|ψi. Making use of I = ∫ |x0 ihx0 | dx0 one has
hx|x̂|ψi = ∫_{−∞}^{∞} hx|x̂|x0 ihx0 |ψi dx0 , (201)

Quantity | Abstract (no basis) | x−space representation | p−space representation
State | |ψi | ψ(x) = hx|ψi | ψ̃(p) = hp|ψi
Position op | x̂ | x | ih̄ ∂/∂p
Momentum op | p̂ | −ih̄ ∂/∂x | p
Commutator | [x̂, p̂] = ih̄ | [x, −ih̄ ∂/∂x] = ih̄ | [ih̄ ∂/∂p, p] = ih̄

Table 1: The operators x̂ and p̂ in position and momentum space.

and so
hx|x̂|ψi = ∫_{−∞}^{∞} x0 δ(x − x0 ) ψ(x0 ) dx0 = xψ(x) . (202)
Now hx|ψi = ψ(x) are the components of |ψi in the |xi basis, and hx|x̂ψi = xψ(x) are
the components of the transformed ket |x̂ψi in the x-basis. Hence the effect of x̂ on
an arbitrary state |ψi written in the |xi basis is to multiply the components by the
number x.
Now we consider the effect of x̂ on |ψi with |ψi expanded in the |pi basis. Using I = ∫ |xihx| dx we have
hp|x̂|ψi = ∫ hp|xihx|x̂|ψi dx . (203)

Then using either hx|x̂ = hx|x or the result in eq. (202) one has
⇒ hp|x̂|ψi = ∫ (e^{−ipx/h̄}/√(2πh̄)) x hx|ψi dx , (204)
and taking out ih̄∂/∂p (which removes the factor of x)
hp|x̂|ψi = ih̄ (∂/∂p) ∫ hp|xihx|ψi dx = ih̄ (∂/∂p) hp|ψi , (205)
where in the last line we have removed I = ∫ |xihx| dx. Now hp|ψi = ψ̃(p) are the components of |ψi in the |pi basis, and hp|x̂|ψi = ih̄ ∂ ψ̃(p)/∂p are the components of the transformed ket |x̂ψi in the |pi basis. Therefore the effect of x̂ on |ψi written in the |pi basis is to apply the differential operator ih̄∂/∂p.

Summary
The commutator [x̂, p̂] = ih̄, which leads to Heisenberg’s uncertainty relation, is given in position space by [x, −ih̄∂/∂x] acting on a wave function ψ(x) and in momentum space by [ih̄∂/∂p, p] acting on ψ̃(p). This is summarised in Table 1. Position
|xi and momentum |pi representations of |ψi give two completely equivalent physical
descriptions. Which one we choose to use depends on the context. Note that once
we have an expression for the position operator x̂ in the two bases, we automatically
know the form of the momentum operator p̂ in the two bases because of the commu-
tator [x̂, p̂] = ih̄. Alternatively, we can derive the above forms of p̂ in the x−basis and
p−basis from algebra similar to what we used for finding the form of x̂ in these bases.
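The position-space entry of Table 1, [x, −ih̄ ∂/∂x]ψ = ih̄ ψ, can be verified numerically by applying both orderings to a smooth test function. A hedged sketch (my own, with h̄ = 1 and numpy's `gradient` for the derivative):

```python
import numpy as np

hbar = 1.0
x = np.linspace(-10.0, 10.0, 4001)
psi = np.exp(-x**2 / 2)          # smooth test function, decays at the edges

# p acting in the x-basis: p psi = -i hbar d(psi)/dx (central differences).
def p_op(f):
    return -1j * hbar * np.gradient(f, x)

# [x, p] psi = x (p psi) - p (x psi); Table 1 says this equals i hbar psi.
comm = x * p_op(psi) - p_op(x * psi)

# Compare away from the grid edges, where the one-sided differences are used.
inner = slice(10, -10)
err = np.max(np.abs(comm[inner] - 1j * hbar * psi[inner]))
print(err)  # small (finite-difference error only)
```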

2.28 Hilbert space


Quantum mechanics deals with states in “Hilbert space”. Hilbert spaces of interest
for quantum mechanics are complex inner product spaces, finite or infinite dimensional
with the properties we need. For example, we can expand any vector using a basis of
eigenvectors of a Hermitian operator. If the eigenvectors are of a Hermitian operator
with discrete eigenvalues then they respect Kronecker delta orthonormality, while the
eigenvectors of a Hermitian operator with continuous eigenvalues respect Dirac delta
function orthonormality.

3 Vector spaces applied to Quantum Mechanics


3.1 Postulates of Quantum Mechanics
We’ll consider a system with one spatial dimension e.g. a particle confined to the
x−axis. The generalisation to three dimensions is straightforward. There are four
postulates of Quantum Mechanics:

(I) The state of a particle is represented by a vector |ψ(t)i (the state vector) in
Hilbert space.

(II) The independent variables x and p of classical mechanics now become Hermitian
operators, x̂ and p̂ respectively, defined by the commutator [x̂, p̂] = ih̄. Dependent
variables f (x, p) in classical mechanics are given by the Hermitian operators
Ω̂(x̂, p̂) = f (x → x̂, p → p̂) i.e. Ω̂ is the same function of x̂ and p̂ as f is of x and
p.

(III) If a particle is in a state |ψi, a measurement of the observable corresponding to Ω̂


will yield one of the eigenvalues ωi with probability prob(ωi ) ∝ |hωi |ψi|2 , where
|ωi i is the eigenvector with eigenvalue ωi . The state vector of the system will
change from |ψi → |ωi i as a result of the measurement.

(IV) The state vector |ψ(t)i obeys the Schrödinger equation
ih̄ (d/dt)|ψ(t)i = Ĥ|ψ(t)i . (206)
Here Ĥ(x̂, p̂) = H(x → x̂, p → p̂), where Ĥ is the QM Hamiltonian operator
and H is the Hamiltonian (i.e. not an operator) for the corresponding classical
problem.
We will see later that postulate IV leads to the more familiar form of the Schrödinger
equation written in terms of the wave function. If a particle is in a state |ψi and we
wish to measure the observable associated with the operator Ω̂ (e.g. could be energy)
we proceed as follows:

Step (1): Find the orthonormal eigenvectors |ωi i and eigenvalues ωi of Ω̂.

Step (2): Expand |ψi in this basis: |ψi = Σ_i |ωi ihωi |ψi.

Step (3): Introducing the projection operator P̂ωi = |ωi ihωi |, and recalling that P̂ωi P̂ωi =
P̂ωi and P̂ωi = P̂ω†i , the probability that the result ωi will be obtained is given by the
following:
Prob(ωi ) ∝ |hωi |ψi|2 = hψ|ωi ihωi |ψi = hψ|P̂ωi |ψi = hψ|P̂ωi P̂ωi |ψi = hP̂ωi ψ|P̂ωi ψi ,
(207)
and so one has:
Prob(ωi ) ∝ ||P̂ωi ψ||2 . (208)

Comments

1) The theory only makes probabilistic predictions for the result of a measurement of
Ω̂. It assigns relative probabilities for obtaining eigenvalues ωi of Ω̂. The only possible
values of a measurement of Ω̂ are its eigenvalues. These eigenvalues are real because
postulate II demands that Ω̂ be a Hermitian operator.

2) To get an absolute probability for a particular ωi we divide |hωi |ψi|² by the sum of all relative probabilities
Prob(ωi ) = |hωi |ψi|² / Σ_j |hωj |ψi|² = |hωi |ψi|² / Σ_j hψ|P̂ωj |ψi = |hωi |ψi|² / hψ|ψi , (209)

since Σ_j P̂ωj = I. Clearly, if we had started with a normalised state hψ|ψi = 1 then
we would have had
Prob(ωi ) = |hωi |ψi|2 . (210)
3) If |ψi is solely composed of a particular eigenvector |ωi i, a measurement of Ω̂ is
guaranteed to yield the result ωi . A particle in a such a state may be said to have a
definite value ωi in the classical sense.
4) For a normalised superposition such as (where α and β are complex numbers)
|ψi = (α|ω1 i + β|ω2 i)/(|α|² + |β|²)^{1/2} , (211)

where hω1 |ω1 i = hω2 |ω2 i = 1, a measurement of Ω̂ yields either ω1 or ω2 , with probabilities
|α|²/(|α|² + |β|²) and |β|²/(|α|² + |β|²) respectively . (212)
This has no analogue in classical physics, where superpositions also occur (e.g. strings,
water waves) but one does not expect to obtain ω1 some of the time and ω2 the rest
of the time. Instead, in classical physics one expects a unique value each time which
is generally distinct from ω1 and ω2 , and depends on the values of α and β in the
superposition.
We now discuss a couple of complications:
Complication 1: If the operator Ω̂ has degenerate eigenvalues ω = ω1 = ω2 and corre-
sponding kets |ω, 1i and |ω, 2i, then the probability of obtaining ω is given by

Prob(ω) = |hω, 1|ψi|2 + |hω, 2|ψi|2 . (213)

Complication 2: The eigenvalue spectrum of Ω̂ can be continuous. In this case one has:
|ψi = ∫ |ωihω|ψi dω . (214)

Can we interpret |hω|ψi|² as the probability of finding the particle with eigenvalue ω for the observable Ω̂? The answer is no, because the number of possible values of ω is uncountably infinite, and since the total probability is unity, each single value of ω can only be assigned an infinitesimal probability. One instead interprets |hω|ψi|² as the probability density at ω, i.e. |hω|ψi|² dω is the probability of obtaining a result between ω and ω + dω. This definition meets the requirement that the total probability be unity:
∫ Probdens(ω) dω = ∫ |hω|ψi|² dω = ∫ hψ|ωihω|ψi dω = hψ|I|ψi = hψ|ψi = 1 . (215)
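For a finite-dimensional example, postulate III and eq. (210) can be implemented directly: diagonalise a Hermitian matrix, project a normalised state onto each eigenvector, and check that the probabilities sum to one. A sketch with an arbitrary 3-state system of my own choosing:

```python
import numpy as np

# A Hermitian "observable" on a 3-dimensional Hilbert space (arbitrary example).
Omega = np.array([[2.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, -2.0]])

# A normalised state |psi>.
psi = np.array([1.0, 1.0j, 0.5])
psi = psi / np.linalg.norm(psi)

# Eigenvalues omega_i and orthonormal eigenvectors |omega_i> (columns of V).
omega, V = np.linalg.eigh(Omega)

# Eq. (210): Prob(omega_i) = |<omega_i|psi>|^2 for a normalised state.
probs = np.abs(V.conj().T @ psi)**2
print(omega)
print(probs, probs.sum())  # probabilities sum to 1
```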

3.2 Collapse of the state vector


Postulate III states that a measurement of Ω̂ changes the state from |ψi = Σ_{j=1}^{n} |ωj ihωj |ψi to |ωi i for a particular value of j = i. In terms of the projection operator P̂ωi one has:
|ψi → Ω̂ measured and ωi obtained → P̂ωi |ψi / hP̂ωi ψ|P̂ωi ψi^{1/2} . (216)

Assuming that |ψi is already normalised, one has to normalise P̂ωi |ψi, which in general
is not normalised because
P̂ωi |ψi = |ωi ihωi |ψi , (217)
where |ωi i is a normalised eigenvector, and |hωi |ψi|2 ≤ 1.
The act of measurement is like “acting on the state” with a projection operator. This is called “collapse of the state”, and is a common feature whenever a probabilistic prediction is confronted with an individual experimental result. To draw an analogy, if you buy a lottery ticket you have a (tiny) probability of being a millionaire before the draw is announced:
|ψi = α|wini + β|don’t wini (218)
However, once the draw is made you become either |wini or |don’t wini.

3.3 How to test Quantum Theory


We need to test probabilistic predictions and not definite predictions. Assume that we
have prepared an eigenvector of Ω̂, denoted by |ωi i, by measuring a non-degenerate
eigenvalue ωi . Now suppose that we measure some other variable Λ̂ immediately after
the measurement of ωi so that the state could not have changed from |ωi i with time
evolution. Suppose that the expansion of |ωi i in terms of the eigenvectors {|λi i} of Λ̂
is given by the simple expression:

1 2
|ωi i = √ |λ1 i + √ |λ2 i . (219)
3 3

The theory predicts that the eigenvalues λ1 and λ2 will be obtained with probabilities
of 1/3 and 2/3 respectively. If our measurement of Λ̂ gives a value that is neither λ1 nor
λ2 then that is the end of the theory, because the prediction of the theory did not agree
with experiment. Let’s say that the result of our measurement is λ1 . This is consistent
with the theory, but it does not vindicate it, since the theoretical prediction for the
probability of obtaining λ1 could have been 1/30 instead of 1/3, and we would still
expect to obtain the result λ1 some of the time from an individual measurement. To
test the prediction in eq. (219) we must repeat the experiment many times. However,
we cannot repeat the experiment with this state because it is now in the state |λ1 i,
and not in the state given by eq. (219). We must start afresh with another particle
in a state |ωi i. Clearly we need an ensemble of N particles all in the same state |ωi i.
Then after a measurement of Λ̂ for each of these particles the theory would predict
that approximately N/3 would have a value λ1 and 2N/3 would have value λ2 . For
a sufficiently large N the differences from 1/3 and 2/3 will be negligible if quantum
theory is correct.
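The ensemble test described above is easy to simulate: draw N independent measurement outcomes with probabilities 1/3 and 2/3 and watch the observed frequencies converge. A toy Monte Carlo simulation (mine, not from the notes; the seed and N are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

# Each member of the ensemble is prepared in |omega_i> of eq. (219) and then
# Lambda is measured: outcome lambda_1 with prob 1/3, lambda_2 with prob 2/3.
N = 100_000
outcomes = rng.choice([1, 2], size=N, p=[1/3, 2/3])

freq1 = np.mean(outcomes == 1)
freq2 = np.mean(outcomes == 2)
print(freq1, freq2)  # close to 1/3 and 2/3 for large N
```

The statistical fluctuation around 1/3 scales like 1/√N, which is why a sufficiently large ensemble is needed to discriminate between 1/3 and, say, 1/30.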
The chief difference between a classical ensemble and a quantum ensemble is that
in the classical case the ensemble contained N/3 particles with λ1 and 2N/3 particles
with λ2 before the measurement. This is also what “common sense” would expect. In the quantum ensemble every particle is in the same state |ωi i before the measurement of Λ̂, and each particle is capable of yielding λ1 or λ2 . Only after a measurement are a third of them in the state |λ1 i and two-thirds in the state |λ2 i.

3.4 Expectation value


If one is not concerned with the detailed information of the state (i.e. the explicit form of the eigenvectors and eigenvalues) one can calculate an average value of the observable Ω̂ over the ensemble. This is called the “expectation value” (denoted by hΩ̂i), where every particle is in the state |ψi:
hΩ̂i = Σ_i Prob(ωi ) ωi = Σ_i |hωi |ψi|² ωi = Σ_i hψ|ωi ihωi |ψi ωi . (220)

Now since Ω̂|ωi i = ωi |ωi i for each i, then we have
hΩ̂i = Σ_i hψ|Ω̂|ωi ihωi |ψi . (221)
Now we can use Σ_i |ωi ihωi | = I to obtain the following expression for the expectation value:
hΩ̂i = hψ|Ω̂|ψi . (222)

This derivation also works in the continuous case with eigenvalues ω, by replacing Σ with ∫ dω and using ∫ |ωihω| dω = I.

Comments
1) To calculate hΩ̂i in eq. (222) one only needs to be given the state vector and the
operator (say, a column vector and a matrix in some basis). There is no need to find
the eigenvalues and eigenvectors of Ω̂.
2) If |ψi = |ωi i then hΩ̂i = ωi , since hψ|Ω̂|ψi = hωi |Ω̂|ωi i = ωi hωi |ωi i = ωi .
3) By hΩ̂i we mean the average over the ensemble. A given particle will only yield one of the eigenvalues. In general, these eigenvalues will not coincide with hΩ̂i, e.g. the mean number of children per couple might be 2.12 but the “eigenvalues” are integers (1, 2, 3, ...).

3.5 The uncertainty ∆Ω̂


In any situation described by probabilities, the standard deviation measures the average fluctuation around the mean:
∆Ω̂ = h(Ω̂ − hΩ̂i)²i^{1/2} = ( hΩ̂²i − hΩ̂i² )^{1/2} . (223)

In QM, ∆Ω̂ is referred to as the “uncertainty” in the result of a measurement of Ω̂. Notice that ∆Ω̂, just like hΩ̂i, is calculable given just the state and the operator (i.e. knowledge of the eigenvectors is not needed). One has
∆Ω̂ = ( hψ|Ω̂²|ψi − hψ|Ω̂|ψi² )^{1/2} , (224)

where Ω̂2 = Ω̂Ω̂. Usually the expectation value and the uncertainty provide us with a
fairly good description of the system. Note that ∆Ω̂ is not related to the experimental
error, which is an additional error/uncertainty. The uncertainty ∆Ω̂ is an uncertainty
in prediction which would still be present even if the experimental uncertainty were
reduced to zero.
The uncertainty ∆Ω̂ quantifies the spread of a set of repeated measurements on the
same state. If the state is an eigenvector |ωi i of Ω̂ then ∆Ω̂ = 0, since we would always
obtain the same result ωi from a measurement of Ω̂. In classical mechanics we would
also have ∆Ω̂ = 0, since repeated measurements on the same initial state would always
give the same result.
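Eqs. (222) and (224) need only the state and the operator, as comment 1) of section 3.4 notes. A numerical sketch (my own arbitrary 2x2 example) comparing hψ|Ω̂|ψi directly with the eigenvalue-weighted average of eq. (220), and computing the uncertainty:

```python
import numpy as np

# Hermitian observable and normalised state (arbitrary 2x2 example).
Omega = np.array([[1.0, 2.0], [2.0, -1.0]])
psi = np.array([3.0, 4.0j]) / 5.0

# Expectation value directly from eq. (222): <Omega> = <psi|Omega|psi>.
exp_direct = np.real(psi.conj() @ Omega @ psi)

# Cross-check via eq. (220): sum_i Prob(omega_i) * omega_i.
w, V = np.linalg.eigh(Omega)
probs = np.abs(V.conj().T @ psi)**2
exp_eig = np.sum(probs * w)

# Uncertainty from eq. (224): sqrt(<Omega^2> - <Omega>^2).
exp_sq = np.real(psi.conj() @ Omega @ Omega @ psi)
delta = np.sqrt(exp_sq - exp_direct**2)
print(exp_direct, exp_eig, delta)
```

No eigenvectors were needed for `exp_direct` or `delta`; the eigendecomposition is used only as a cross-check.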

3.6 Compatible and incompatible operators (observables)
Consider a measurement of the observable Ω̂ on a general state |ψi. We obtain the
result ωi (say). Now immediately measure the observable Λ̂ and we obtain the result
λi (say). Do we now have a state that definitely gives rise to the results ωi and λi ? In
general, this is not true. After the measurement of Ω̂ we would have:
|ψi = Σ_j |ωj ihωj |ψi → measure Ω̂, obtain ωi → state collapses to |ωi i . (225)

Now |ωi i can be written in the basis of eigenvectors of Λ̂, and measurement of Λ̂ will
give one of the eigenvalues λi :
|ωi i = Σ_j |λj ihλj |ωi i → measure Λ̂, obtain λi → state collapses to |λi i . (226)

Now if we measure Ω̂ again, this time on the state |λi i, we will in general obtain a value ωi0 , where i0 ≠ i, because |λi i is a superposition of the eigenvectors |ωj i:
|λi i = Σ_j |ωj ihωj |λi i → measure Ω̂, obtain ωi0 → state collapses to |ωi0 i , (227)

In general, we do not obtain our original result of ωi in eq. (225). So the state |λi i is
not guaranteed to give back the original value ωi when a second measurement of Ω̂ is
made.
However, an exception occurs if |ωi i = |λi i i.e. the particular eigenvector |ωi i is
identical to an eigenvector of Λ̂. Then we have a simultaneous eigenvector of Ω̂ and Λ̂,
labelled by |ωi , λi i. Then one has:

Ω̂|ωi , λi i = ωi |ωi , λi i and Λ̂|ωi , λi i = λi |ωi , λi i . (228)

Hence the second measurement of Ω̂ gives the same result as the first measurement
of Ω̂. When will two operators admit simultaneous eigenvectors? If two Hermitian
operators commute [Ω̂, Λ̂] = 0̂ then they have a basis of common eigenvectors, |ωi , λi i.
This was discussed around eq. (160), with a proof for the case where at least one of
the operators has no degenerate eigenvalue. A longer proof is needed for the case when
both operators have degenerate eigenvalues.
Each |ωi , λi i has well-defined values of the observables Ω̂ and Λ̂, and thus Ω̂ and Λ̂ are called “compatible operators”. In contrast, “incompatible operators” have [Ω̂, Λ̂] ≠ 0̂, and thus there are no simultaneous eigenvectors. The famous example is that of [x̂, p̂] = ih̄. An eigenvector of x̂ (denoted by |xi) is a sum of eigenvectors of p̂ (denoted by |pi), and |xi = |pi is not possible.
Now let us see if the probability of obtaining the results ωi and λi (for two compat-
ible observables) depends on the order of the measurements of Ω̂ and Λ̂. We start with
a general state |ψi and assume no degeneracy among ωi and λi (i.e. all ωi are distinct
and all λi are distinct). If we measure Ω̂ first we obtain
Prob(ωi ) = |hωi , λi |ψi|2 . (229)
The state then changes to |ωi , λi i. Measuring Λ̂ then gives the result λi with certainty,
and |ωi , λi i stays as |ωi , λi i. Therefore one has
Prob(ωi then λi ) = |hωi , λi |ψi|2 × 1 = |hωi , λi |ψi|2 . (230)
Now we measure Λ̂ first and then Ω̂. We get the same result |hωi , λi |ψi|2 × 1 for the
probability. Hence
Prob(ωi then λi ) = Prob(λi then ωi ) . (231)
Note that the state vector did not change under the second measurement in either case,
|ψi → measure Ω̂ (Λ̂) → |ωi , λi i → measure Λ̂ (Ω̂) → |ωi , λi i . (232)
In the case of degenerate eigenvalues the state vector can change under the second
measurement if a degenerate eigenvalue is measured first. Let ωi be distinct eigenvalues
and λ1 = λ2 = λ (i.e. degeneracy of two eigenvalues of Λ̂). Consider the normalised
state:
|ψi = α|ω3 , λ3 i + β|ω1 , λi + γ|ω2 , λi . (233)
One has:
|ψi → measure Ω̂, obtain ω1 → |ω1 , λi → measure Λ̂, obtain λ → |ω1 , λi . (234)
The state did not change under the second measurement and Prob(ω1 then λ) =
|β|2 × 1 = |β|2 . However, if we measure the degenerate eigenvalue first:
|ψi → measure Λ̂, obtain λ → (β|ω1 , λi + γ|ω2 , λi)/(|β|² + |γ|²)^{1/2} → measure Ω̂, obtain ω1 → |ω1 , λi . (235)
Then
Prob(λ then ω1 ) = (|β|² + |γ|²) × |β|²/(|β|² + |γ|²) = |β|² = Prob(ω1 then λ) . (236)
So the probabilities are the same, but the state suffered a change under the measurement in the latter case. Note that the term (|β|² + |γ|²)^{1/2} appears in eq. (235) in order to normalise the state after the measurement of Λ̂.
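The statement that commuting Hermitian operators admit a common eigenbasis can be illustrated numerically: build two commuting matrices from a shared eigenbasis, then verify that the eigenvectors of the non-degenerate one also diagonalise the other. A sketch with hand-picked eigenvalues (Λ̂ deliberately degenerate, as in the discussion above):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))

# Two Hermitian operators built from a COMMON orthonormal eigenbasis Q,
# so they commute by construction (Lambda has a degenerate eigenvalue).
Omega = Q @ np.diag([1.0, 2.0, 3.0]) @ Q.T
Lam = Q @ np.diag([5.0, 5.0, 7.0]) @ Q.T

print(np.allclose(Omega @ Lam - Lam @ Omega, 0))  # [Omega, Lambda] = 0

# Omega has no degenerate eigenvalues, so its eigenvectors are fixed (up to
# phase) and must also be eigenvectors of Lambda.
w, V = np.linalg.eigh(Omega)
Lam_in_Omega_basis = V.T @ Lam @ V
off_diag = Lam_in_Omega_basis - np.diag(np.diag(Lam_in_Omega_basis))
print(np.max(np.abs(off_diag)))  # ~0: simultaneous eigenvectors
```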

3.7 Time dependence of the state vector
Recall postulate IV:
ih̄ (d/dt)|ψ(t)i = Ĥ|ψ(t)i . (237)
Here Ĥ is the Hamiltonian Ĥ = Ĥ(x̂, p̂). The Hamiltonian can have a time dependence
but we will assume that it does not e.g. for a particle of mass m in a potential V (x)
in |xi basis:
Ĥ = −(h̄²/2m) ∂²/∂x² + V (x) . (238)
We are more familiar with the Schrödinger equation:

ψ(x, t) = Ĥψ(x, t) .
ih̄ (239)
∂t
This equation is just postulate IV in the |xi basis. To see this explicitly, multiply
eq. (237) from the left by hx|:
ih̄ (∂/∂t)hx|ψ(t)i = hx|Ĥ|ψ(t)i . (240)
Now insert I = ∫ |x0 ihx0 | dx0 on the RHS:
ih̄ (∂/∂t)hx|ψ(t)i = ∫ hx|Ĥ|x0 ihx0 |ψi dx0 . (241)
One then has
ih̄ ∂ψ(x, t)/∂t = ∫ δ(x − x0 ) Ĥ(x0 , −ih̄ ∂/∂x0 ) hx0 |ψi dx0 , (242)
where Ĥ has x̂ and p̂ written in the x0 representation. Therefore one arrives at the
usual form of the Schrödinger equation
ih̄ ∂ψ(x, t)/∂t = Ĥ(x, −ih̄ ∂/∂x) ψ(x, t) . (243)
Note that the step
hx|Ĥ|x0 i = δ(x − x0 ) Ĥ(x0 , −ih̄ ∂/∂x0 ) (244)
for any Ĥ(x̂, p̂) follows from the following results in sections 2.26 and 2.27 (e.g. eq. (177)):
hx|x0 i = δ(x − x0 ) , hx|x̂|x0 i = x0 δ(x − x0 ) , hx|p̂|x0 i = −ih̄ δ(x − x0 ) ∂/∂x0 . (245)
For example, take Ĥ = x̂2 + p̂2 (where we are neglecting constants which give the right
dimensions for Ĥ), which is similar to the Hamiltonian for a particle in a potential,
and one has:
hx|Ĥ|x0 i → hx|x̂x̂ + p̂p̂|x0 i = δ(x − x0 )(x02 − h̄² ∂²/∂x02 ) , (246)
since
hx|p̂p̂|x0 i = ∫∫ hx|p̂|x00 ihx00 |p̂|x000 ihx000 |x0 i dx00 dx000 , (247)

by inserting twice the completeness relation, and then use eq. (245).

3.8 Normalisation of state is constant under time evolution


The norm of the state vector does not have a dependence on time, i.e. one has the following relation:
(d/dt) hψ(t)|ψ(t)i = 0 . (248)
This can be shown as follows. First notice that
(d/dt) hψ(t)|ψ(t)i = hψ(t)|dψ(t)/dti + hdψ(t)/dt|ψ(t)i . (249)
This step can be shown by explicitly inserting a column vector with entries ai (t) and so
hψ(t)|ψ(t)i = Σ_i a∗i (t) ai (t) . (250)
Then use the adjoint of the Schrödinger equation
ih̄ |dψ/dti = Ĥ|ψ(t)i → −ih̄ hdψ/dt| = hψ(t)|Ĥ † = hψ(t)|Ĥ . (251)
Then substitute for |dψ/dti and hdψ/dt| in eq. (249) and one has:
(1/ih̄) hψ(t)|Ĥψ(t)i − (1/ih̄) hψ(t)|Ĥψ(t)i = 0 , (252)
and so we have proved eq. (248).
Let the state be normalised, hψ(t)|ψ(t)i = 1, and it will stay normalised for all t.
Now evaluate dhQ̂i/dt for an operator Q̂ which we assume does not depend on time.
Of course, hQ̂i depends on time because |ψ(t)i depends on time:
dhQ̂i/dt = hdψ(t)/dt|Q̂|ψ(t)i + hψ(t)|Q̂|dψ/dti . (253)
Again this step can be shown by writing out explicitly a column vector for |ψ(t)i (with
time dependent components) and a matrix for Q̂ (whose entries do not depend on
time). One then has:

dhQ̂i/dt = −(1/ih̄) hψ(t)|Ĥ Q̂|ψ(t)i + (1/ih̄) hψ(t)|Q̂Ĥ|ψ(t)i , (254)
or in compact notation
ih̄ dhQ̂i/dt = h[Q̂, Ĥ]i . (255)
Hence, if a time independent operator Q̂ commutes with the Hamiltonian then its
expectation value is constant. For example, take Q̂ = Ĥ. Since [Ĥ, Ĥ] = 0 then

dhĤi/dt = 0 (256)
for a time independent Hamiltonian. This has similarities with the classical statement
of energy conservation. If we use Ĥ = p̂2 /(2m) + V (x̂) (i.e. a particle in a one-
dimensional potential) and put Q̂ = x̂ then one has

dhx̂i/dt = hp̂/mi , (257)
where we have used [V (x̂), x̂] = 0̂ and [x̂, p̂2 ] = 2ih̄p̂ (using the commutation relation
in eq. (79)). The laws of classical mechanics (i.e. p = mdx/dt) hold for the expectation
values of the QM operators x̂ and p̂.
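Eq. (255) can be checked for a small matrix Hamiltonian by evolving the state exactly in the energy eigenbasis and differencing hQ̂i(t) numerically. A sketch (mine, with h̄ = 1 and arbitrary 2x2 Hermitian matrices):

```python
import numpy as np

hbar = 1.0

# Arbitrary Hermitian H and Q (time independent), and a normalised state.
H = np.array([[1.0, 0.5], [0.5, -1.0]])
Q = np.array([[0.0, 1.0], [1.0, 0.0]])
psi0 = np.array([1.0, 1.0]) / np.sqrt(2)

# Exact evolution via the energy eigenbasis (diagonalise H once).
E, V = np.linalg.eigh(H)
def psi(t):
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ psi0))

def expQ(t):
    p = psi(t)
    return np.real(p.conj() @ Q @ p)

# ihbar d<Q>/dt compared with <[Q, H]> at t = 0.7, as in eq. (255).
t, dt = 0.7, 1e-5
lhs = 1j * hbar * (expQ(t + dt) - expQ(t - dt)) / (2 * dt)
p = psi(t)
rhs = p.conj() @ (Q @ H - H @ Q) @ p
print(lhs, rhs)  # agree up to the finite-difference error
```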

3.9 Solution of the Schrödinger equation


If Ĥ does not explicitly depend on time (i.e. its matrix elements in a given basis are not
time dependent) then its eigenvalues Ei are time independent. Since Ĥ is a Hermitian
operator then we have a basis of eigenvectors of Ĥ, denoted by {|Ei i} with eigenvalues
Ei , and thus:
Ĥ|Ei i = Ei |Ei i . (258)
Now expand the state |ψ(t)i in this basis so that one has:
|ψ(t)i = Σ_j aj (t)|Ej i . (259)

The components aj (t) carry the time dependence, while the basis vectors |Ei i (i.e.
“directions” in Hilbert space) are independent of time. Now substitute this expression
for |ψ(t)i in the Schrödinger equation:
X daj (t) X
ih̄ |Ej i = Ej aj (t)|Ej i . (260)
j
dt j

Since |Ej i are linearly independent then each aj (t) satisfies

ih̄ daj (t)/dt = Ej aj (t) , (261)
with solution aj (t) = aj (0)e^{−iEj t/h̄} . The solution of the Schrödinger equation is then
|ψ(t)i = Σ_j aj (0) e^{−iEj t/h̄} |Ej i . (262)

Taking t = 0 one sees that aj (0) = hEj |ψ(0)i for each value of j. Since the eigenvalues
Ej are in general different, the exponential term which carries the time dependence
is different for each eigenvector |Ej i, and so the relative phases of aj (t) will change
with time. These phases affect probabilities and so are physically significant e.g. the
modulus squared of the x−space wave function |hx|ψ(t)i|2 has a time dependence.
If the initial state |ψ(0)i at t = 0 is equal to one of the energy eigenvectors i.e.
|ψ(0)i = |Ei i for a particular value of j = i (where i is fixed) then one has ai (0) = 1
and all other aj (0) = 0. In this case one has

|ψ(t)i = e−iEi t/h̄ |Ei i . (263)

This phase is not physically significant, since the modulus squared of the x−space wave
function has no dependence on t

|hx|ψ(t)i|2 = |hx|Ei ie−iEi t/h̄ |2 = |hx|Ei i|2 . (264)
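A small numerical example of eq. (262) (my own, with h̄ = 1 and arbitrary energies): for a superposition of energy eigenstates the relative phases make probabilities in another basis oscillate, while for a single eigenstate only the overall phase of eq. (263) evolves and nothing observable moves.

```python
import numpy as np

hbar = 1.0
E = np.array([1.0, 2.0])          # energy eigenvalues (arbitrary choice)

def evolve(a0, t):
    # eq. (262): a_j(t) = a_j(0) exp(-i E_j t / hbar) in the energy eigenbasis.
    return a0 * np.exp(-1j * E * t / hbar)

u = np.array([1.0, 1.0]) / np.sqrt(2)    # some other measurement direction

# Superposition of the two energy eigenstates: relative phases evolve.
a_sup = np.array([1.0, 1.0]) / np.sqrt(2)
probs_sup = [np.abs(u.conj() @ evolve(a_sup, t))**2 for t in (0.0, np.pi)]

# Single energy eigenstate: only an overall phase, as in eq. (264).
a_eig = np.array([1.0, 0.0])
probs_eig = [np.abs(u.conj() @ evolve(a_eig, t))**2 for t in (0.0, np.pi)]

print(probs_sup)  # changes with t: physically significant relative phase
print(probs_eig)  # constant: overall phase is not physically significant
```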

3.10 The Schrödinger picture and the Heisenberg picture


Up to now we have considered the case that the time evolution of the system resides in
the state vector |ψ(t)i, while the operators Ω̂ have no time dependence. This approach
is called the “Schrödinger picture”, and we could put a subscript “S” on the state
vector, |ψ(t)i → |ψS (t)i and on a general operator Ω̂ → Ω̂S . An alternative (and
physically equivalent) approach is the “Heisenberg picture”, in which the state vector

does not depend on time and the time evolution is solely contained in the operators.
Time evolution can be considered as a “rotation” in Hilbert space, i.e. the state vector
at t = 0 is acted upon by a unitary time evolution operator Û (t). From eq. (262), and
recalling that aj (0) = hEj |ψ(0)i, one can see that:

|ψS (t)i = Û (t)|ψS (0)i , (265)

where Û (t) is given by:
Û (t) = Σ_j |Ej ihEj | e^{−iEj t/h̄} . (266)

The time evolution operator Û (t) in eq. (266) is a unitary operator because Û † (t)Û (t) =
I due to the orthonormality of |Ej i and the completeness relation applied to the outer
products |Ej ihEj |:
Û † (t)Û (t) = Σ_{i,j} |Ei ihEi |Ej ihEj | e^{iEi t/h̄} e^{−iEj t/h̄} = Σ_j |Ej ihEj | = I . (267)

We can now define the state vector in the Heisenberg picture, |ψH (t)i, as

|ψH (t)i = Û † (t)|ψS (t)i = |ψS (0)i , (268)

where we have used eq. (265) for |ψS (t)i. Hence the state vector |ψH (t)i has no time
evolution in the Heisenberg picture, being equal at all times to the state vector in the
Schrödinger picture at t = 0. In order to keep matrix elements invariant under the
unitary transformation in eq. (268), we learnt in eq. (127) that an operator (in this
case the time independent Ω̂S ) must transform as

Ω̂S → Û † (t)Ω̂S Û (t) = Ω̂H (t) . (269)

Hence operators in the Heisenberg picture Ω̂H carry all the time evolution of the system.
We note that our rotation matrix in eq. (268) is Û † and so when we apply eq. (127)
(which was written for a rotation matrix Û ) we obtain the transformation in eq. (269).
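The equivalence of the two pictures is easy to verify numerically. The following sketch is not part of the notes: it uses an arbitrary illustrative 2-level Hamiltonian and sets h̄ = 1, builds Û (t) from the energy eigenbasis as in eq. (266), checks unitarity as in eq. (267), and confirms that matrix elements agree in the two pictures as in eq. (269).

```python
import numpy as np

# Hypothetical 2-level Hamiltonian (hbar = 1); an arbitrary choice, not from the notes.
H = np.array([[1.0, 0.3], [0.3, 2.0]])
E, V = np.linalg.eigh(H)                 # energy eigenvalues and eigenvectors

t = 0.7
# U(t) = sum_j |E_j><E_j| exp(-i E_j t), eq. (266) with hbar = 1
U = V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T

# Unitarity check, eq. (267): U†U = I
assert np.allclose(U.conj().T @ U, np.eye(2))

psi0 = np.array([1.0, 0.0])              # |psi_S(0)> = |psi_H>
Omega = np.array([[0.0, 1.0], [1.0, 0.0]])   # some observable
psi_t = U @ psi0                         # Schroedinger-picture state at time t
Omega_H = U.conj().T @ Omega @ U         # Heisenberg-picture operator, eq. (269)

# The same matrix element computed in both pictures:
s_val = psi_t.conj() @ Omega @ psi_t
h_val = psi0.conj() @ Omega_H @ psi0
assert np.isclose(s_val, h_val)
```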
4 Simple Harmonic Oscillator in Quantum Mechanics
We take the Hamiltonian for a 1-dimensional harmonic oscillator of mass m and frequency ω to be

Ĥ = p̂2 /(2m) + (1/2)mω 2 x̂2 . (270)
Clearly this Hamiltonian has been obtained from the Hamiltonian for the classical
simple harmonic oscillator, with the replacements x → x̂ and p → p̂. We will now
determine the energy eigenvalues using only the commutation relation [x̂, p̂] = ih̄,
instead of applying the series solution method to the Schrödinger equation.

4.1 Raising and lowering operators
Let us define the scaled operators
X̂ = √(mω/h̄) x̂ , P̂ = p̂/√(mh̄ω) (271)

so that

[X̂, P̂ ] = i and Ĥ = (1/2)(X̂ 2 + P̂ 2 )h̄ω . (272)

Now introduce two new operators a and a† :

a = (1/√2)(X̂ + iP̂ ) and a† = (1/√2)(X̂ − iP̂ ) . (273)

Since these are operators, a should really be denoted by â and a† by â† , but I
will not write the hat explicitly for these operators. Note that a† is the adjoint of a.
The eigenvalues of the operators a, a† , X̂ and P̂ are dimensionless. Classically (i.e.
when x and p and the Hamiltonian H are not operators) one would have H = aa† h̄ω
or H = a† ah̄ω. However, this is not the case for the QM simple harmonic oscillator,
and instead one has:

aa† = Ĥ/(h̄ω) − (i/2)[X̂, P̂ ] = Ĥ/(h̄ω) + 1/2 ,
a† a = Ĥ/(h̄ω) + (i/2)[X̂, P̂ ] = Ĥ/(h̄ω) − 1/2 . (274)
Here 1/2 is really I/2 where I is the identity operator, but it is common practice to
write I as “1”. One can also write
Ĥ = (a† a + 1/2)h̄ω or Ĥ = (aa† − 1/2)h̄ω . (275)
Equating these two expressions for Ĥ gives a† a − aa† = 1, or equivalently

[a, a† ] = 1 . (276)

Again note that “1” here really means the identity operator. Now Ĥ = (a† a + 1/2)h̄ω
and so to find the eigenvalues of Ĥ we just need to find the eigenvalues of a† a = N ,
where N is the “number operator” defined by:

N = a† a . (277)

Although N is an operator it is common practice to write N instead of N̂ . Let |ni be
a normalised eigenvector of N , so that

N |ni = n|ni and hn|ni = 1 . (278)

Here we are following the convention of labelling the eigenvector by its eigenvalue.
Note that |ni is also an eigenvector of Ĥ with eigenvalue (n + 1/2)h̄ω, as can be seen
as follows:
Ĥ|ni = (N + 1/2)h̄ω|ni = (n + 1/2)h̄ω|ni , (279)
where we have used the first form of Ĥ in eq. (275). It is important to emphasise at
this stage that we do not know if n is positive, negative, integer or non-integer. We just
know that n is real, since N † = (a† a)† = a† a = N , and so N is a Hermitian operator
(with necessarily real eigenvalues).
From eq. (278) it follows that hn|N |ni = n. One can also show that hn|N |ni is
equal to the norm squared of the ket a|ni as follows:

hn|N |ni = hn|a† a|ni = han|ani = ||a|ni||2 . (280)

Hence ||a|ni||2 = n, and since any norm ≥ 0 then n ≥ 0. Consequently, none of the
eigenvalues can be negative.
Now we show that a and a† shift the eigenvalues of N by −1 and +1 respectively i.e.
they are “lowering” and “raising” operators respectively. Consider a† |ni. One has
the following identity:
N a† |ni ≡ [N, a† ]|ni + a† N |ni . (281)
Just write out the commutator explicitly and you will see that the right-hand side is
equivalent to the left-hand side. Using the commutator identity in eq. (80) one has:

[N, a† ] = [a† a, a† ] = a† aa† − a† a† a = a† [a, a† ] = a† , (282)

where we have used [a, a† ] = 1 (where “1” really means I). Hence using the result of
eq. (282) in eq. (281) one has

N a† |ni = a† |ni + na† |ni = (n + 1)a† |ni . (283)

Therefore a† |ni is an eigenvector of N with eigenvalue n + 1 unless a† |ni is the null
vector. Now let us calculate the norm squared of the ket a† |ni:

||a† |ni||2 = hn|aa† |ni = hn|a† a + 1|ni = hn|N + 1|ni = nhn|ni + hn|ni = n + 1 > 0 .
(284)
One knows that n + 1 > 0 because we showed in eq. (280) that n ≥ 0. Eq. (284)
says that the ket a† |ni does not vanish i.e. it can never be the null vector. Repeated
applications of a† can generate infinitely more eigenstates with higher eigenvalues i.e.
for an integer m, the ket (a† )m |ni is an eigenvector of N with eigenvalue n + m.
So far we still do not know if n is an integer or not. We just know that n ≥ 0. Now
consider the ket a|ni. One has

N a|ni ≡ [N, a]|ni + aN |ni = (n − 1)a|ni , (285)

where we have used [N, a] = −a. So either a|ni is the null vector or a|ni is an
eigenvector of N with eigenvalue n − 1. Let us evaluate the norm squared of a|ni:

||a|ni||2 = hn|a† a|ni = hn|N |ni = n . (286)

So a|ni is the null vector for n = 0. Now act N on the ket a|n − 1i, and the result can
be obtained from eq. (285) with the replacement n → (n − 1):

N a|n − 1i = (n − 2)a|n − 1i . (287)

Repeating the above reasoning, either n − 1 is zero (i.e. a|n − 1i is the null vector) or
a|n − 1i is an eigenvector of N with eigenvalue n − 2. Since all eigenvalues are non-
negative, the lowering process must stop by reaching the null vector (since subsequent
applications of a on the null vector just give null vector again). This is only possible
if n is a positive integer or zero. If we had some non-integer value of n (say n = 2.5)
then we could not satisfy the above either/or requirements since the lowering process
would have a contradiction i.e. the process would not eventually give the null vector
and would try to keep lowering to negative eigenvalues (which is forbidden). We have
shown that the eigenvalues of Ĥ = (N + 1/2)h̄ω are (n + 1/2)h̄ω, with n being a positive
integer or zero. The operators a and a† are the “lowering” and “raising” operators
respectively.
Now a word about notation. We have been labelling our eigenvectors of N (and
Ĥ) by |ni. The eigenvector with n = 0 is the eigenvector corresponding to the ground
state energy (i.e. the state with the lowest energy), and would then be denoted by |0i
in this notation. However, we learnt in Chapter 2 that a null vector in a vector space
is denoted by |0i. Of course, the null vector and an eigenvector of N are not the same
vector and should be labelled differently. To avoid this problem we will refer to the
eigenvector of N with eigenvalue n = 0 as |n0 i, while all other eigenvectors of N will
just be denoted by |1i, |2i etc (instead of |n1 i, |n2 i etc). The null vector will continue
to be labelled by |0i.
One has
N |n0 i = 0|n0 i = |0i . (288)
The following important relations can be derived (see problem sheet 2):
a† |ni = √(n + 1) |n + 1i ,
a|ni = √n |n − 1i , (recalling that a|n0 i = |0i and a|0i = |0i) , (289)

where all kets are normalised. Using eqs. (289), we can express all normalised eigenvectors in terms of the ground state |n0 i as follows:

|ni = a† |n − 1i/√n = a† a† |n − 2i/(√n √(n − 1)) = ..... = (a† )n |n0 i/√(n!) . (290)

4.2 Matrix representations of operators and eigenvectors of the simple harmonic oscillator
Let us now write down explicit matrix representations in the |ni basis of the above
operators and eigenvectors. These are all infinitely long:
     
|n0 i = (1, 0, 0, ..)T , |1i = (0, 1, 0, ..)T , |0i = (0, 0, 0, ..)T . (291)
   √ 
0 0 0 0 .. 0 1 √0 0 ..

 0 1 0 0 ..  

 0 0 2 √0 .. 

N =
 0 0 2 0 ..   a=
 0 0 0 3 .. .
 (292)
 0 0 0 3 ..   0 0 0 0 .. 
.. .. .... .. .. .. .. .. ..
 
√0 0 0 0 ..
 1

 √0 0 0 .. 

a = 0 2 √0 0 .
..  (293)
 0 0 3 0 .. 
.. .. .. .. ..
One can check that [a, a† ] = 1 (identity matrix), a|n0 i = |0i and N |ni = n|ni etc.
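One way to make these checks concrete is with truncated matrices in numpy. This is an illustrative sketch, not part of the notes; note that in any finite truncation the relation [a, a† ] = 1 necessarily fails in the last row and column.

```python
import numpy as np

dim = 8
# a|n> = sqrt(n)|n-1>, eq. (289): sqrt(1), sqrt(2), ... on the superdiagonal
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
adag = a.T              # a† is the adjoint (real matrix, so just the transpose)
N = adag @ a            # number operator, eq. (277)

# N is diagonal with eigenvalues 0, 1, 2, ...
assert np.allclose(np.diag(N), np.arange(dim))

# [a, a†] equals the identity, except at the truncation edge
comm = a @ adag - adag @ a
assert np.allclose(comm[:-1, :-1], np.eye(dim - 1))
assert np.isclose(comm[-1, -1], -(dim - 1))   # truncation artifact, not physics
```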

4.3 Simple Harmonic Oscillator wave functions
Using the lowering and raising operators a and a† we can find the x−space wave function
ψ(x) = hx|ψi, or in our notation for the simple harmonic oscillator, ψn (x) = hx|ni for
the wave function of the eigenvector |ni in the x−basis.
It is convenient to work with the operators X̂ and P̂ to find the X−space wave
function, and change to x at the end. In X−space, P̂ is represented by −i∂/∂X. We
can see this from the following:
p̂ = −ih̄ ∂/∂x ⇒ P̂ = p̂/√(mh̄ω) = (−ih̄/√(mh̄ω)) (∂X/∂x) ∂/∂X , (294)

and so using ∂X/∂x = √(mω/h̄) one has the result

P̂ = −i∂/∂X . (295)

Now we saw earlier that a|n0 i = |0i and so hX|a|n0 i=0 (zero). Recall that

a = (X̂ + iP̂ )/√2 , (296)

and so

(1/√2) (hX|X̂|n0 i + ihX|P̂ |n0 i) = 0 . (297)
Using eq. (177) one has
hX|P̂ |n0 i = ∫ hX|P̂ |X 0 ihX 0 |n0 i dX 0 = ∫ δ(X − X 0 )(−i ∂/∂X 0 ) ψ0 (X 0 ) dX 0 = −i ∂ψ0 (X)/∂X . (298)
Hence eqn. (297) becomes
(1/√2) (Xψ0 (X) + ∂ψ0 (X)/∂X) = 0 . (299)
This equation has solution ψ0 (X) ∝ exp(−X 2 /2). Using X = √(mω/h̄) x we obtain the solution

ψ0 (x) = c exp(−mωx2 /(2h̄)) , (300)
and c is a normalisation constant which is fixed by the usual condition for wave functions

∫ ψ0∗ (x)ψ0 (x) dx = 1 leading to c = (mω/(πh̄))1/4 , (301)
where we choose to make c real. To find the wave function ψ1 (x) of the first excited
state (i.e. the normalised state |1i) we use ψ1 (x) = hx|1i and |1i = a† |n0 i, from which
one has
ψ1 (x) = hx|a† |n0 i . (302)

Then use a† = (X̂ − iP̂ )/√2, which gives an equation for ψ1 (x) in terms of ψ0 as follows
ψ1 (x) = (c/√2) (X − ∂/∂X) exp(−X 2 /2) . (303)

Repeating the process, one can obtain a generalised formula for the normalised wave
function for ψn :

ψn (x) = (c/√(2n n!)) (X − ∂/∂X)n exp(−X 2 /2) , (304)
with X = (mω/h̄)1/2 x. This formula can be written in terms of Hermite polynomials

ψn (x) = (1/√(2n n!)) (mω/(πh̄))1/4 Hn (X) exp(−mωx2 /(2h̄)) , (305)

where H0 = 1, H1 = 2X, H2 = 4X 2 − 2, H3 = 8X 3 − 12X etc.
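As a numerical sanity check (an illustration, not part of the notes), one can build ψn (x) from eq. (305) using numpy's physicists' Hermite polynomials, setting mω/h̄ = 1 so that X = x, and verify normalisation and orthogonality by a simple quadrature:

```python
import numpy as np
from math import factorial, pi
from numpy.polynomial.hermite import hermval   # physicists' Hermite H_n

def psi(n, x):
    """psi_n(x) from eq. (305), with m*omega/hbar = 1 so that X = x (an assumption)."""
    c = np.zeros(n + 1)
    c[n] = 1.0                                  # coefficient vector selecting H_n
    norm = pi**(-0.25) / np.sqrt(2.0**n * factorial(n))
    return norm * hermval(x, c) * np.exp(-x**2 / 2)

x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
for n in range(4):
    # normalisation: integral of |psi_n|^2 over x equals 1
    assert abs(np.sum(psi(n, x)**2) * dx - 1.0) < 1e-4
# orthogonality of different eigenstates
assert abs(np.sum(psi(0, x) * psi(1, x)) * dx) < 1e-8
```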


The probability density function for the ground state of a quantum simple harmonic
oscillator (i.e. |ψ0 (x)|2 ) and the probability density function for a classical simple
harmonic oscillator are quite different. In QM, |ψ0 (x)|2 peaks at x = 0, while in the
classical case the minimum is at x = 0; a classical harmonic oscillator would spend
least time around x = 0 because it would be travelling at its fastest speed when x = 0,
where its kinetic energy is a maximum. As n increases, the wave functions ψn (x) of
the excited states look increasingly like the classical probability distribution. In the
quantum case there are nodes and anti-nodes of ψn (x) but on macroscopic scales (i.e.
very large n) these are too close together to be experimentally distinguished. In a given
interval of x one would obtain the classical probability for finding the particle inside
that interval.

4.4 Example: Calculating expectation values
Question:
Show that the expectation value of the potential energy of a simple harmonic oscillator
in the state |ni is half of the total energy.
Answer:
We need to calculate
hn| (1/2)mω 2 x̂2 |ni = (1/2)h̄ω hn|X̂ 2 |ni . (306)

Now one has:

X̂ = (a† + a)/√2 ⇒ X̂ 2 = (a† a† + aa† + a† a + aa)/2 . (307)
The term with a† a† vanishes because
a† |ni = √(n + 1) |n + 1i and so a† a† |ni = √((n + 1)(n + 2)) |n + 2i . (308)

Now hn|n + 2i = 0 by orthonormality (recall hn|mi = δnm ). The term with aa also
vanishes because of orthonormality

hn|aa|ni ∝ hn|n − 2i = 0 . (309)

The term with aa† can be rearranged using [a, a† ] = 1 as follows:

aa† = 1 + a† a = 1 + N , (310)

and so the terms aa† + a† a in eq. (307) can be written as

(aa† + a† a)/2 = (1 + N + N )/2 = 1/2 + N . (311)
Hence one has

(1/2)h̄ω hn|X̂ 2 |ni = (1/2)h̄ω hn|(1/2 + N )|ni = (1/2)h̄ω(n + 1/2) , (312)
which is half of the total energy of the oscillator, En = h̄ω(n + 1/2).
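The same result can be checked numerically with the truncated matrices of section 4.2 (a sketch, not part of the notes); away from the truncation edge the diagonal of X̂ 2 reproduces n + 1/2 exactly:

```python
import numpy as np

dim = 12
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)  # a|n> = sqrt(n)|n-1>
X = (a + a.T) / np.sqrt(2.0)                  # X = (a† + a)/sqrt(2), eq. (307)
X2 = X @ X

# <n|X^2|n> = n + 1/2, so <n|(1/2) m w^2 x^2|n> = (1/2) hbar w (n + 1/2) = E_n/2
for n in range(dim - 1):                      # skip the truncation edge
    assert np.isclose(X2[n, n], n + 0.5)
```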

5 Angular Momentum for Quantum Systems
In classical physics the angular momentum of a particle is given by L = r × p, where
r is the displacement from the origin and p is the linear momentum. Using Cartesian
coordinates (we are now working in three dimensions), one has

            | i   j   k  |
L = r × p = | x   y   z  | = i(ypz − zpy ) + j(zpx − xpz ) + k(xpy − ypx ) . (313)
            | px  py  pz |

Of course, in going from classical angular momentum to quantum angular momentum,
we make the replacements x → x̂ and px → p̂x etc (following postulate II of Quantum
Mechanics). Therefore one has

L̂x = ŷ p̂z − ẑ p̂y (operator corresponding to x component of L) , (314)


L̂y = ẑ p̂x − x̂p̂z (operator corresponding to y component of L) , (315)
L̂z = x̂p̂y − ŷ p̂x (operator corresponding to z component of L) . (316)

Now we know that (QM postulates generalised to three dimensions)

[x̂, p̂x ] = [ŷ, p̂y ] = [ẑ, p̂z ] = ih̄ , (317)

but if the coordinates do not match then

[x̂, p̂y ] = 0, [x̂, p̂z ] = 0 etc . (318)

This can be seen from the following, where in the coordinate basis (i.e. x,y,z) one has
x̂ = x and p̂y = −ih̄∂/∂y etc, and ψ(x, y, z) is the coordinate basis wave function:
[x, −ih̄ ∂/∂y] ψ(x, y, z) = −ih̄ x ∂ψ(x, y, z)/∂y + ih̄ ∂(xψ(x, y, z))/∂y (319)
= −ih̄ x ∂ψ(x, y, z)/∂y + ih̄ x ∂ψ(x, y, z)/∂y + 0 = 0 . (320)
Using the above commutation relations one can show:

[L̂x , L̂y ] = ih̄L̂z and [L̂y , L̂z ] = ih̄L̂x and [L̂z , L̂x ] = ih̄L̂y . (321)

All other combinations of operators commute. One can write

[L̂i , L̂j ] = ih̄ Σk εijk L̂k (sum over k = 1, 2, 3) . (322)

Here εijk is the Levi-Civita tensor, for which ε123 = 1. This tensor is anti-symmetric
under exchange of two indices and so εijk = 0 if two indices are the same i.e. ε122 =
−ε122 = 0. So the non-zero values of the tensor are

ε123 = ε231 = ε312 = 1; ε132 = ε213 = ε321 = −1. (323)

The other 21 combinations of the indices give εijk = 0. Since each of L̂x , L̂y , L̂z does
not commute with the others, as shown in eq. (321), then they do not have common
eigenvectors.
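A compact closed form for the Levi-Civita symbol, for indices in {1, 2, 3}, is εijk = (i − j)(j − k)(k − i)/2. This identity is not in the notes, but it is easy to verify against eq. (323):

```python
import numpy as np

def eps(i, j, k):
    """Levi-Civita symbol for i, j, k in {1, 2, 3}; any repeated index gives 0."""
    return (i - j) * (j - k) * (k - i) / 2

# the six non-zero values, eq. (323)
assert eps(1, 2, 3) == eps(2, 3, 1) == eps(3, 1, 2) == 1
assert eps(1, 3, 2) == eps(2, 1, 3) == eps(3, 2, 1) == -1

# the remaining 21 of the 27 index combinations vanish
vals = [eps(i, j, k) for i in (1, 2, 3) for j in (1, 2, 3) for k in (1, 2, 3)]
assert sum(1 for v in vals if v == 0) == 21
```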
Now consider the magnitude squared of the total angular momentum vector:

L.L = |L|2 = L̂2 = L̂2x + L̂2y + L̂2z . (324)

This operator for the total angular momentum, denoted by L̂2 , commutes with each of
L̂x , L̂y and L̂z . We can see this explicitly as follows:

[L̂z , L̂2 ] = [L̂z , L̂2x ] + [L̂z , L̂2y ] + [L̂z , L̂2z ] . (325)

The last term is zero, and the remaining terms can be expressed as follows by using
the commutator identity in eq. (79):

[L̂z , L̂2 ] = L̂x [L̂z , L̂x ] + [L̂z , L̂x ]L̂x + L̂y [L̂z , L̂y ] + [L̂z , L̂y ]L̂y , (326)

and so
[L̂z , L̂2 ] = ih̄L̂x L̂y + ih̄L̂y L̂x − ih̄L̂y L̂x − ih̄L̂x L̂y = 0 , (327)
In a similar way, [L̂x , L̂2 ] = 0 and [L̂y , L̂2 ] = 0. Therefore L̂2 and one of L̂x , L̂y and L̂z
share common eigenvectors (and thus can be measured without uncertainty).
5.1 Ladder operators of angular momentum
There are two types of angular momentum in quantum mechanics:
(i) Orbital angular momentum, denoted by the operators L̂2 , L̂i etc.
(ii) Spin angular momentum, denoted by the operator Ŝ 2 , Ŝi .
We will study spin angular momentum later. Let us use the symbol J to refer to either
of these types of angular momentum e.g. Jˆ2 could mean L̂2 or Ŝ 2 . We can then write
a generic commutation relation analogous to eq. (322)
[Jˆi , Jˆj ] = ih̄ Σk εijk Jˆk (sum over k = 1, 2, 3) . (328)

Now let’s see what we can learn from these commutation relations. Consider the nor-
malised, common eigenvectors of Jˆ2 and Jˆz , with eigenvalues λh̄2 and mh̄ respectively.
Thus we can write
Jˆ2 |λ, mi = λh̄2 |λ, mi; Jˆz |λ, mi = mh̄|λ, mi; hλ, m|λ, mi = 1 . (329)
It is important to note that at this stage we have no restriction on the values that λ
and m can take. Due to the commutation relations we will show that λ and m can
only take specific values (e.g. m can only take integer or half-integer values). Now
introduce “ladder operators”
Jˆ+ = Jˆx + iJˆy ; Jˆ− = Jˆx − iJˆy with Jˆ+† = Jˆ− . (330)
One can show that
[Jˆz , Jˆ+ ] = h̄Jˆ+ ; [Jˆz , Jˆ− ] = −h̄Jˆ− ; [Jˆ+ , Jˆ− ] = 2h̄Jˆz ; [Jˆ2 , Jˆ± ] = 0 . (331)
For example, to show the first of these
[Jˆz , Jˆ+ ] = [Jˆz , Jˆx + iJˆy ] = [Jˆz , Jˆx ] + i[Jˆz , Jˆy ] = ih̄Jˆy + i(−ih̄Jˆx ) = h̄(Jˆx + iJˆy ) = h̄Jˆ+ .
(332)
Moreover, one also has
Jˆ− Jˆ+ = (Jˆx − iJˆy )(Jˆx + iJˆy ) = Jˆx2 + Jˆy2 + i[Jˆx , Jˆy ] = Jˆ2 − Jˆz2 − h̄Jˆz , (333)

where we have made use of Jˆ2 = Jˆx2 + Jˆy2 + Jˆz2 in the final step. Similarly one can show
that
Jˆ+ Jˆ− = Jˆ2 − Jˆz2 + h̄Jˆz . (334)
Now calculate the norm of the ket ||Jˆ+ |λ, mi|| (and recall that norm ≥ 0):

||Jˆ+ |λ, mi||2 = hλ, m|Jˆ− Jˆ+ |λ, mi ≥ 0 . (335)

Now making use of eq. (333) for Jˆ− Jˆ+ , and eq. (329), one has

hλ, m|Jˆ− Jˆ+ |λ, mi = h̄2 (λ − m2 − m) ≥ 0; → (λ − m2 − m) ≥ 0 . (336)

This equation tells us that λ ≥ m(m + 1). We also know that λ ≥ 0 since λh̄2 is an
eigenvalue of Jˆ2 (i.e. the magnitude of the total angular momentum is ≥ 0). Instead
of using λ we can change to a different parameter j, writing

λ = j(j + 1) with j ≥ 0 . (337)

One can see that j has a unique value for each value of λ, and we will see that the
allowed values of j are integers or half-integers.
Hence using eq. (336) one has

j(j + 1) ≥ m(m + 1) . (338)

This equation can be rewritten as (j − m)(j + m + 1) ≥ 0, which tells us that

−j − 1 ≤ m ≤ j , (339)

where we recall that j is positive. Now calculate the norm of Jˆ− |λ, mi:

||Jˆ− |λ, mi||2 = hλ, m|Jˆ+ Jˆ− |λ, mi = h̄2 (λ − m2 + m) ≥ 0; → (λ − m2 + m) ≥ 0 , (340)

where we have made use of eq. (334). One then has j(j +1) ≥ m(m−1) or equivalently
(j + m)(j − m + 1) ≥ 0. From this one obtains

−j ≤ m ≤ j + 1 . (341)

Combining eq. (339) and eq. (341) gives

−j ≤ m ≤ j . (342)

Therefore we have found a limitation on the possible values of m. Recall that j ≥ 0,
and so m can take positive or negative values.
To summarise, so far we have learnt the following (changing the label λ to j):
(i) Jˆ2 |j, mi = j(j + 1)h̄2 |j, mi .
(ii) Jˆz |j, mi = mh̄|j, mi .

(iii) −j ≤ m ≤ j and j ≥ 0 .

Now consider

Jˆz (Jˆ± |j, mi) ≡ ([Jˆz , Jˆ± ] + Jˆ± Jˆz )|j, mi = (±h̄Jˆ± + Jˆ± Jˆz )|j, mi = (m ± 1)h̄(Jˆ± |j, mi) . (343)

So Jˆ± |j, mi is an eigenvector of Jˆz with eigenvalue (m ± 1)h̄, and Jˆ+ and Jˆ− are raising
and lowering operators respectively of angular momentum Jˆz . Note that Jˆ± |j, mi has
the same Jˆ2 eigenvalue as |j, mi, which can be seen as follows (where we make use of
the relation [Jˆ2 , Jˆ± ] = 0):

Jˆ2 (Jˆ± |j, mi) = Jˆ± Jˆ2 |j, mi = j(j + 1)h̄2 (Jˆ± |j, mi) . (344)

Alternatively, Jˆ± |j, mi could be the null vector. This occurs for j = m (i.e. the
maximum value of m from eq. (342)) for which

||Jˆ+ |j, mi||2 = h̄2 (j − m)(j + m + 1) = 0 , (345)

and for j = −m for which

||Jˆ− |j, mi||2 = h̄2 (j + m)(j − m + 1) = 0 . (346)

Hence Jˆ+ |j, ji = |0i and Jˆ− |j, −ji = |0i.

Starting from any m in the allowed range −j ≤ m ≤ j we can use Jˆ+ to move the
ˆ
Jz eigenvalue up in integer steps until for some integer r we reach m + r = j, at which
value further applications of Jˆ+ annihilate the state. Likewise, we can step down s
times until m − s = −j (where s is an integer). Combining, we have

2j = r + s , (347)

and so 2j is an integer (and non-negative, due to j ≥ 0) because r + s is a non-negative
integer. Since 2j is an integer then we have the following important result:

j is a non-negative integer or a positive half-integer

Since m = j − r and r is an integer then:

m is an integer when j is an integer, and m is a half-integer when j is a half-integer.
In conclusion, the allowed possibilities for j and m are (recalling −j ≤ m ≤ j):

j = 0, m = 0 , (348)
j = 1/2, m = −1/2, 1/2 ,
j = 1, m = −1, 0, 1 ,
j = 3/2, m = −3/2, −1/2, 1/2, 3/2 .

i.e. there are 2j + 1 possible values of m.
One can show that

Jˆ+ |j, mi = h̄ √((j − m)(j + m + 1)) |j, m + 1i ,
Jˆ− |j, mi = h̄ √((j + m)(j − m + 1)) |j, m − 1i . (349)

Here all kets are normalised. Let us derive the result for Jˆ+ |j, mi. We know that
Jˆ+ |j, mi = Ch̄|j, m + 1i, where C is a normalisation constant to be determined and
can be taken as a real number. The kets |j, mi and |j, m + 1i are normalised. Writing
J+ |j, mi as a single ket |Jˆ+ jmi one has:

hJˆ+ jm|Jˆ+ jmi = |C|2 h̄2 hj, m + 1|j, m + 1i , (350)
⇒ hj, m|Jˆ− Jˆ+ |j, mi = |C|2 h̄2 , (351)
⇒ hj, m|Jˆ2 − Jˆz2 − h̄Jˆz |j, mi = |C|2 h̄2 , (352)
⇒ j(j + 1) − m2 − m = |C|2 , ⇒ C = √((j − m)(j + m + 1)) . (353)

The same can be done for Jˆ− |j, mi, and hence we have reproduced eq.(349).
A final comment. The magnitude of the total angular momentum is given by h̄√(j(j + 1)) since

Jˆ2 |j, mi = j(j + 1)h̄2 |j, mi . (354)

The maximum value of the angular momentum in the z (or x or y) direction is when
|m| = j (i.e. the largest allowed value of m). Now since √(j(j + 1)) > j then the angular
momentum can never be aligned with the z axis (or x or y axis). The total angular
momentum vector has a definite magnitude, but its direction is indeterminate because
the x and y components of the angular momentum do not have definite values when
the z component is measured.

5.2 Matrix representation of angular momentum
Angular momentum operators can be represented by matrices, with states being rep-
resented by column vectors. We will continue to work in the orthonormal basis of
simultaneous eigenvectors of Jˆz and Jˆ2 i.e. the normalised kets |j, mi. Let us fix j = 1,
and so the possible values of m are 1, 0 and −1. These three eigenvectors of Jˆ2 and Jˆz
can be labelled as |1, 1i, |1, 0i, and |1, −1i. An alternative notation is to label them
by m only:
|1z i |0z i | − 1z i , (355)
corresponding to eigenvalues of h̄, 0 and −h̄ respectively for Jˆz . These three eigen-
vectors form an orthonormal basis in this three-dimensional vector space, which is a
subspace of the Hilbert space. We also have

Jˆ2 |1z i = h̄2 j(j + 1)|1z i , (356)

and same for |0z i and | − 1z i.
The matrix representation of Jˆz in this basis is as follows:

          ( h1z |Jˆz |1z i    h1z |Jˆz |0z i    h1z |Jˆz | − 1z i  )
(Jˆz )ij = ( h0z |Jˆz |1z i    h0z |Jˆz |0z i    h0z |Jˆz | − 1z i  ) . (357)
          ( h−1z |Jˆz |1z i   h−1z |Jˆz |0z i   h−1z |Jˆz | − 1z i )

After acting on the kets with the Jˆz operator one has

          ( h̄h1z |1z i    0   −h̄h1z | − 1z i  )
(Jˆz )ij = ( h̄h0z |1z i    0   −h̄h0z | − 1z i  ) . (358)
          ( h̄h−1z |1z i   0   −h̄h−1z | − 1z i )

Note that the centre column of zeros arises because Jˆz |0z i = 0|0z i = |0i, and any inner
product with the null vector is zero. Due to orthonormality of the kets, (Jˆz )ij simplifies
further to

            ( 1  0  0 )
(Jˆz )ij = h̄ ( 0  0  0 ) . (359)
            ( 0  0 −1 )
Having found the matrix representation of the Jˆz operator, one can easily show that the
column representations of the normalised eigenvectors are as follows, where we neglect
any overall (global) phase eiθ :
     
|1z i = (1, 0, 0)T , |0z i = (0, 1, 0)T , | − 1z i = (0, 0, 1)T . (360)
The matrix representations of the operators Jˆ2 , Jˆ+ , and Jˆ− in this basis are as follows,
and can be derived in the same way as we derived (Jˆz )ij :

              ( 1 0 0 )                   ( 0 1 0 )                   ( 0 0 0 )
(Jˆ2 )ij = 2h̄2 ( 0 1 0 ) , (Jˆ+ )ij = √2 h̄ ( 0 0 1 ) , (Jˆ− )ij = √2 h̄ ( 1 0 0 ) . (361)
              ( 0 0 1 )                   ( 0 0 0 )                   ( 0 1 0 )
Using these representations one can easily check the results for the abstract ket equations that we had before (e.g. Jˆ+ |0z i = √2 h̄|1z i) and the commutation relations (e.g. [Jˆ2 , Jˆz ] = 0).
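These checks can be carried out directly in numpy (an illustrative sketch, not part of the notes, setting h̄ = 1):

```python
import numpy as np

hbar = 1.0
Jz = hbar * np.diag([1.0, 0.0, -1.0])             # eq. (359)
Jp = np.sqrt(2) * hbar * np.diag([1.0, 1.0], k=1) # J+, eq. (361)
Jm = Jp.conj().T                                  # J- is the adjoint of J+
Jx = (Jp + Jm) / 2
Jy = (Jp - Jm) / (2 * 1j)
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz

# J^2 = j(j+1) hbar^2 I = 2 hbar^2 I for j = 1, so [J^2, Jz] = 0 trivially
assert np.allclose(J2, 2 * hbar**2 * np.eye(3))
# [Jx, Jy] = i hbar Jz, eq. (321)
assert np.allclose(Jx @ Jy - Jy @ Jx, 1j * hbar * Jz)
# J+ |0z> = sqrt(2) hbar |1z>
ket0 = np.array([0.0, 1.0, 0.0])
assert np.allclose(Jp @ ket0, np.sqrt(2) * hbar * np.array([1.0, 0.0, 0.0]))
```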

5.3 Calculating probabilities of eigenvalues of angular momentum with matrix representations
Suppose that a particle is in the j = 1 state of angular momentum. The result h̄ is
found from a measurement of Jˆz . Then Jˆx is measured. What values can be found and
with what probabilities?
First find the matrix representation of Jˆx in this basis of |1z i, |0z i and | − 1z i.
Writing Jˆx in terms of ladder operators in the usual way (see eq. 330) one has:
                                            ( 0 1 0 )
(Jˆx )ij = ((Jˆ+ )ij + (Jˆ− )ij )/2 = (h̄/√2) ( 1 0 1 ) . (362)
                                            ( 0 1 0 )

The eigenvalues of this matrix are h̄, 0 and −h̄ (i.e. the same as for Jˆz ), and can be
obtained from solving the characteristic equation. Label the eigenvectors of Jˆx as |1x i,
|0x i and | − 1x i. We can find their column vector representations by acting the matrix
(Jˆx )ij on a column vector with unknown complex numbers a, b, c to be determined. For
the eigenvector |1x i one has the equation:
    
        ( 0 1 0 ) ( a )     ( a )
(h̄/√2) ( 1 0 1 ) ( b ) = h̄ ( b ) . (363)
        ( 0 1 0 ) ( c )     ( c )

This gives three simultaneous equations:

b/√2 = a , (a + c)/√2 = b , b/√2 = c . (364)
Clearly c = a and so the eigenvector has the form

(a, √2 a, a)T . (365)
This column vector is not normalised, but this can be done as follows:

h1x |1x i = 1 = (a∗ , √2 a∗ , a∗ )(a, √2 a, a)T = |a|2 + 2|a|2 + |a|2 = 4|a|2 . (366)

Hence a = 1/2, where we have neglected overall multiplicative (global) phases eiθ (i.e.
we choose a to be real). Physical observables do not depend on this global phase.
Therefore the normalised eigenvector |1x i is as follows:
|1x i = (1/2) (1, √2, 1)T . (367)
Using the same method one finds:

|0x i = (1/√2) (1, 0, −1)T ; | − 1x i = (1/2) (1, −√2, 1)T . (368)

The question tells us that the measurement of Jˆz gave the result h̄. Hence the state
after this measurement is the normalised eigenvector (see eq. 360):
|1z i = (1, 0, 0)T . (369)

Now expand this state |1z i in terms of the normalised eigenvectors of Jˆx (which also
form an orthonormal basis) as follows:
|1z i = h1x |1z i|1x i + h0x |1z i|0x i + h−1x |1z i| − 1x i. (370)
We can evaluate the components (inner products) h1x |1z i, h0x |1z i, h−1x |1z i e.g.
h1x |1z i = (1/2) (1, √2, 1)(1, 0, 0)T = 1/2 . (371)

74
Hence one arrives at the following form for the state |1z i in the orthonormal basis of
eigenvectors of Jˆx :
|1z i = (1/2)|1x i + (1/√2)|0x i + (1/2)| − 1x i . (372)
The probabilities of obtaining the eigenvalues from a measurement of Jˆx are just given
by the modulus squared of the components. Hence the values of h̄ and −h̄ (correspond-
ing to |1x i and | − 1x i respectively) occur with a probability of 1/4, and the value of
0 occurs with a probability 1/2. After the measurement of Jˆx the state will be one of
these three eigenvectors of Jˆx . A second measurement of Jˆz would not necessarily give
back the value h̄ we started with. On performing this second measurement of Jˆz we
would have to expand the eigenvector of Jˆx in terms of the eigenvectors of Jˆz , and any
of the values h̄, 0 and −h̄ would be possible from the measurement of Jˆz . e.g.

|0x i = α|1z i + β|0z i + γ| − 1z i , (373)

and α = h1z |0x i etc.
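The whole calculation can be reproduced numerically (a sketch, not part of the notes, with h̄ = 1): diagonalise the Jˆx matrix of eq. (362) and take the modulus squared of the overlaps with |1z i.

```python
import numpy as np

hbar = 1.0
Jx = hbar / np.sqrt(2) * np.array([[0., 1., 0.],
                                   [1., 0., 1.],
                                   [0., 1., 0.]])   # eq. (362)
vals, vecs = np.linalg.eigh(Jx)          # eigenvalues sorted: -hbar, 0, +hbar
assert np.allclose(vals, [-1.0, 0.0, 1.0])

ket_1z = np.array([1.0, 0.0, 0.0])       # state after measuring Jz = +hbar, eq. (369)
# probability of each Jx outcome = |<eigvec|1z>|^2, as in eq. (372)
probs = np.abs(vecs.conj().T @ ket_1z) ** 2
# outcomes -hbar, 0, +hbar occur with probabilities 1/4, 1/2, 1/4
assert np.allclose(probs, [0.25, 0.5, 0.25])
```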

5.4 Orbital angular momentum, and coordinate representation of eigenvectors
Now let us focus on orbital angular momentum. Earlier we gave expressions for L̂x ,
L̂y and L̂z in terms of Cartesian coordinates. We can also write these expressions in
spherical coordinates using the standard relations

x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ . (374)

One can show that

L̂x = ih̄ (sin φ ∂/∂θ + cot θ cos φ ∂/∂φ) , (375)
L̂y = ih̄ (− cos φ ∂/∂θ + cot θ sin φ ∂/∂φ) , (376)
L̂z = −ih̄ ∂/∂φ , (377)
L̂± = ±h̄ e±iφ (∂/∂θ ± i cot θ ∂/∂φ) . (378)
Note that these expressions do not involve the radial coordinate r (due to the fact that
angular momentum is intimately related to rotations). Let us now determine the wave
function in the position basis of an eigenvector |l, mi of the operators L̂2 and L̂z . Recall
that quantum numbers of orbital angular momentum are l (instead of our generic j)
and m. The abstract equation L̂z |l, mi = mh̄|l, mi leads to the following equation in
the position basis
L̂z [R(r)Ylm (θ, φ)] = mh̄R(r)Ylm (θ, φ) . (379)
Here the function R(r) is the radial part of the wave function, and Ylm (called a
“spherical harmonic”) is the angular part of the wave function, and is a function of θ
and φ only. The total wave function in the position basis is just R(r)Ylm (θ, φ).
Since L̂z does not depend on the radial coordinate, R(r) is a constant with respect
to L̂z and so cancels out from eq.(379). We seek separable solutions Ylm = f (φ)g(θ):

L̂z Ylm = mh̄Ylm ⇒ −ih̄ (∂f /∂φ) g(θ) = mh̄f (φ)g(θ) ⇒ ∂f /∂φ = imf (φ) . (380)
This equation has solution

f (φ) = eimφ up to a constant . (381)

We require that the wave function be single-valued for a given coordinate φ:

eimφ = eim(φ+2π) ⇒ eim2π = 1 . (382)

Hence this equation shows that for orbital angular momentum the quantum number m
is an integer, and cannot take half-integer values. Since m is an integer then l is also an
integer. Recall that the commutation relations for L̂x , L̂y and L̂z allowed both integer
and half-integer values for l, but the requirement of a single valued wave function forces
l and m to take integer values.
We can now write
Ylm = Clm eimφ Plm (θ) , (383)
where we have written g(θ) = Plm (θ) and Clm is a normalisation constant. The func-
tions Plm (θ) are called the “associated Legendre functions”. To find the explicit form
of Plm (θ) we use the fact that L̂+ Yll = 0 i.e. the raising operator annihilates the state
with m = l. Use the explicit form of L̂+ given in eq. (378) acting on Yll = Cll eilφ Pll (θ):
∂Pll (θ)/∂θ = l cot θ Pll (θ) , (384)
with solution
Pll (θ) = const × sinl θ . (385)
We then have
Yll = Cll0 eilφ sinl θ , (386)

where Cll0 is a normalisation constant different to Cll . The constant Cll0 is fixed by the
normalisation condition
∫ |Ylm |2 dΩ = ∫ |Ylm |2 sin θ dθ dφ = 1 , (387)

over all solid angles Ω i.e. the particle is certain to be found in the region 0 ≤ Ω ≤ 4π.
We can obtain the remaining Ylm from Yll by applying the lowering operator to Yll .
One finds (using eq. (349)) that
Ylm = √( (l + m)! / ((2l)!(l − m)!) ) (L̂− /h̄)l−m Yll . (388)

The first few (normalised) spherical harmonics are:

Y00 = √(1/(4π)) , Y10 = √(3/(4π)) cos θ , Y1±1 = ∓√(3/(8π)) e±iφ sin θ . (389)
Plots of the modulus squared (probability density) of these wave functions can be found
in textbooks or online. The form of Y00 is spherically symmetric and so the particle is
equally likely to be found at any value of θ and φ.
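The normalisation condition eq. (387) can be checked numerically for the harmonics in eq. (389) with a simple grid quadrature over the sphere (an illustration, not part of the notes):

```python
import numpy as np

# Y00, Y10, Y11 from eq. (389)
def Y00(theta, phi): return np.sqrt(1 / (4 * np.pi)) * np.ones_like(theta)
def Y10(theta, phi): return np.sqrt(3 / (4 * np.pi)) * np.cos(theta)
def Y11(theta, phi): return -np.sqrt(3 / (8 * np.pi)) * np.exp(1j * phi) * np.sin(theta)

# grid quadrature over the sphere: dOmega = sin(theta) dtheta dphi
n = 400
theta = np.linspace(0, np.pi, n)
phi = np.linspace(0, 2 * np.pi, n)
T, P = np.meshgrid(theta, phi, indexing="ij")
dA = (theta[1] - theta[0]) * (phi[1] - phi[0])

def norm2(Y):
    """Integral of |Y|^2 over all solid angles, eq. (387)."""
    return np.sum(np.abs(Y(T, P)) ** 2 * np.sin(T)) * dA

for Y in (Y00, Y10, Y11):
    assert abs(norm2(Y) - 1.0) < 1e-2          # normalisation to 1
# orthogonality of Y00 and Y10 (integrand odd about theta = pi/2)
overlap = np.sum(np.conj(Y00(T, P)) * Y10(T, P) * np.sin(T)) * dA
assert abs(overlap) < 1e-6
```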

5.5 Spin angular momentum
We now discuss spin angular momentum S, which has no classical counterpart. Spin
angular momentum is different from orbital angular momentum L. Spin angular mo-
mentum was discovered in the Stern-Gerlach experiment in 1922 in which particles
were seen to be deflected by a magnetic field even when the particles were arranged
such that l = 0. This deflection cannot then be attributed to orbital angular momen-
tum, and it confirmed the existence of another type of angular momentum, which is
now known as spin angular momentum.
We recall that the quantum mechanical operators L̂i for i = x, y, z were obtained
from the expressions for the classical angular momentum with the replacements x → x̂,
y → ŷ, z → ẑ and p → p̂. As we know, in both classical and quantum physics, orbital
angular momentum L depends on the motion and position of the particle. In contrast,
spin angular momentum S has no classical counterpart.
Spin is an intrinsic property of a particle, like a particle’s mass or charge.

The magnitude of S is fixed for each species of particle. Being an angular momentum,
it is postulated that the operators of spin angular momentum, denoted by Ŝi , obey the
same commutation relations as the operators of orbital angular momentum L̂i :
[Ŝi , Ŝj ] = ih̄ Σk εijk Ŝk , i, j, k = x, y, or z (sum over k = 1, 2, 3) . (390)

In the case of orbital angular momentum L, these commutation relations were obtained
from quantising the classical result L = r × p, making the replacements x → x̂, p → p̂
etc, and then substituting the explicit expressions for these operators in the coordinate
basis (i.e. x̂ → x, p̂ → −ih̄∂/∂x). For spin angular momentum we emphasise again
that the commutation relations in eq. (390) are postulated, with the motivation for
this postulate coming from the commutation relations for L̂i . The quantum numbers
of spin are s and ms , in analogy with the quantum numbers l and m respectively for
orbital angular momentum. All the results before for l and m carry across to s and
ms because the commutation relations are the same (recall that we defined a generic
angular momentum operator Jˆi before, which can be taken to be either L̂i or Ŝi ).
Hence for each value of s there are 2s + 1 possible values of ms for the operator Ŝz .
78
orbital angular momentum is that both integer and half-integer values are possible for
s from the commutation relations, but in the case of orbital angular momentum we
saw earlier that l and m must take integer values. States of spin do not have a position
space wave function and so the requirement that the wave function be single-valued is
not relevant for spin.
It is important to emphasise that the spin angular momentum of a quantum particle
cannot be thought of as the particle spinning on its own axis, although it is tempting
to do so. Models of finite-sized electrons have been constructed, but in order to account
for the spin observed in experiments (e.g. the Stern-Gerlach experiment) the surface of
the electron would have to rotate faster than the speed of light. This result further
corroborates the notion that spin is indeed an intrinsic property of a particle and does
not depend on its motion through space.
The spin operators act on a state vector of spin, which resides in a vector space
Vs . It is important to note that this vector space Vs is distinct from the Hilbert space
(called VH , say) in which the state vector |ψi resides. The Hilbert space VH is of
infinite dimension, with eigenbases corresponding to all the other observables (such
as position |xi, momentum |pi, energy |Ei and orbital angular momentum |l, mi).
The completeness relations for these eigenvectors have been given earlier, and for the
eigenvectors |l, m⟩ it is Σ_{l=0}^{∞} Σ_{m=−l}^{l} |l, m⟩⟨l, m| = I. The vector space Vs of spin is of
finite dimension, because s is a fixed and finite number for a given species of particle,
while l in principle can take any integer value up to infinity. Spin states in Vs cannot
be expanded in terms of the (infinite dimensional) eigenbases of the Hilbert space VH ,
and instead are expanded in terms of the basis vectors in Vs . The dimensionality of
Vs is fixed by the value of s e.g. if s = 1/2 then ms = +1/2 or −1/2, and thus there
are only two (physically distinct) eigenvectors of the Hermitian operator Ŝz (which
form a two-dimensional eigenbasis for the vector space Vs ). As we learnt before, a
given normalised eigenvector can be multiplied by a phase and so be mathematically
different, but physically it is the same eigenvector, and will lead to the same predictions
for observables. The full Hilbert space of the electron (which has s = 1/2) is a tensor
product (see later) of the separate Hilbert spaces VH and Vs . To summarise, the two
main differences between spin and orbital angular momentum are:

(i) States of spin reside in a finite dimensional Hilbert space for which there is no
position basis or momentum basis. States of orbital angular momentum reside
in an infinite dimensional Hilbert space, for which there is a position basis and
a momentum basis. Or put more simply, the spin of a particle does not depend
on its spatial coordinates and motion, while the orbital angular momentum of a
particle does (spin operators cannot be written in terms of x̂ and p̂).

           Spin 0    Spin 1/2                        Spin 1
Neutral    h0        νe, νµ, ντ                      γ, Z, g
Charged    (H±)?     e±, µ±, τ±, u, d, s, c, b, t    W±

Table 2: The values of the spin and charge of the elementary particles.

(ii) Spin angular momentum can have integer or half-integer values of the quantum
numbers s and ms while orbital angular momentum can only take integer values
of l and m.

5.6 Examples of elementary particles and their spins


We can classify the known elementary particles by their spin and charge (see Table 2).
Here u, d, s, c, b, t are the six flavours of quarks, and γ, Z, W± and g are the photon, Z boson,
W boson and the gluon respectively. The particle h0 is the Higgs boson, which was
discovered in 2012 at the Large Hadron Collider with a mass of 125 GeV/c², and is the
second heaviest elementary particle discovered to date. The Higgs boson was predicted
to have s = 0, and current measurements suggest s = 0. Hence the Higgs boson seems
to be the first elementary particle with s = 0. Note that composite particles like pions
π ± , π 0 also have s = 0 but they are made up of quarks with s = 1/2.
The discovery of the Higgs boson provokes many questions. Is the Higgs boson a
composite particle or is it a fundamental particle with s = 0? Do more (electrically
neutral) Higgs bosons exist? Note that the discovery of a particle with electric charge
and s = 0 would complete the table. An example of such a particle is a “charged Higgs
boson” denoted by H ± , and such particles are being searched for at the Large Hadron
Collider.

5.7 Matrix representations of spin 1/2 particles


For a particle with s = 1/2 (e.g. the electron) we have a two-dimensional Hilbert
space. This is because there are only two possible values of the quantum number ms :
ms = 1/2 and ms = −1/2. The two eigenvectors of Ŝz with eigenvalues h̄/2 and −h̄/2
(and likewise for Ŝx and Ŝy ) are basis vectors for the vector space of states of spin 1/2.
As usual, kets and operators can be represented by column vectors and matrices etc.
One can obtain the matrix representations of Ŝx , Ŝy and Ŝz by the usual method (see
eq. 357), which in the “standard basis” of the orthonormal eigenvectors of Ŝz are given

by:

    Ŝx = (h̄/2) [0 1; 1 0] ,   Ŝy = (h̄/2) [0 −i; i 0] ,   Ŝz = (h̄/2) [1 0; 0 −1] .   (391)
One can see that these matrices satisfy the commutation relations [Ŝ_i, Ŝ_j] = ih̄ Σ_{k=1}^{3} ε_{ijk} Ŝ_k.
The eigenvalues of these matrices are indeed h̄/2 and −h̄/2. This can be seen by solving
the characteristic equation, e.g for Ŝz with eigenvalue ms h̄ one has:
det(Ŝz − ms h̄I) = 0 ⇒ ms = ±1/2 . (392)
The matrix for the total spin angular momentum is given by

    Ŝ² = Ŝx² + Ŝy² + Ŝz² = (3h̄²/4) [1 0; 0 1] .   (393)

Hence Ŝ² is proportional to the identity matrix, and so every vector in the space is
an eigenvector of the operator for the total spin angular momentum, with eigenvalue
3h̄²/4. One can check that [Ŝ², Ŝz] = 0 etc. From eq. (391), and noting that the spin
raising operator is Ŝ+ = Ŝx + iŜy and the spin lowering operator is Ŝ− = Ŝx − iŜy, one
can easily obtain the following:
   
    Ŝ+ = h̄ [0 1; 0 0] and Ŝ− = h̄ [0 0; 1 0] .   (394)
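These matrix results can be checked numerically. The following is a minimal pure-Python sketch (with h̄ set to 1, a convenient choice for the check rather than part of the notes) that verifies the commutation relations, the form of Ŝ², and the raising operator:

```python
# Check of the spin-1/2 matrices in eq. (391): they satisfy
# [S_i, S_j] = i*hbar*eps_ijk*S_k and give the ladder operator of eq. (394).
# Pure-Python 2x2 complex matrices; hbar = 1 for convenience.

hbar = 1.0

Sx = [[0, hbar/2], [hbar/2, 0]]
Sy = [[0, -1j*hbar/2], [1j*hbar/2, 0]]
Sz = [[hbar/2, 0], [0, -hbar/2]]

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c*A[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# [Sx, Sy] = i*hbar*Sz, and cyclic permutations (eq. (390))
assert close(commutator(Sx, Sy), scale(1j*hbar, Sz))
assert close(commutator(Sy, Sz), scale(1j*hbar, Sx))
assert close(commutator(Sz, Sx), scale(1j*hbar, Sy))

# S^2 = Sx^2 + Sy^2 + Sz^2 = (3*hbar^2/4)*I, as in eq. (393)
S2 = [[sum(matmul(S, S)[i][j] for S in (Sx, Sy, Sz)) for j in range(2)]
      for i in range(2)]
assert close(S2, [[0.75*hbar**2, 0], [0, 0.75*hbar**2]])

# Raising operator of eq. (394): S+ = Sx + i*Sy
Splus = [[Sx[i][j] + 1j*Sy[i][j] for j in range(2)] for i in range(2)]
assert close(Splus, [[0, hbar], [0, 0]])
```
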

5.8 Eigenvectors of states with spin 1/2


In Table 3 the normalised eigenvectors for the vector space of s = 1/2 are listed.
These normalised eigenvectors of Ŝx or Ŝy or Ŝz form an orthonormal basis for this
two-dimensional Hilbert space (as expected for a Hermitian operator).
Consider now the spin component in a different direction. In the xz plane at an
angle φ to the z−axis one has, in analogy with the classical expression, the following:
Ŝφ = Ŝz cos φ + Ŝx sin φ . (395)
Hence

    Ŝφ = (h̄/2) [cos φ  sin φ; sin φ  −cos φ] .   (396)
The eigenvalues of this matrix are again h̄/2 and −h̄/2 as expected, because axes can
be arbitrarily defined and rotated. The eigenvectors are:
   
cos(φ/2) 1 − sin(φ/2) 1
for h̄ and for − h̄ . (397)
sin(φ/2) 2 cos(φ/2) 2
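As a quick numerical check (again with h̄ = 1, and an arbitrarily chosen angle for the test), one can verify that these column vectors are eigenvectors of Ŝφ with eigenvalues ±1/2:

```python
# Verify eq. (397): the given vectors are eigenvectors of S_phi in eq. (396)
# with eigenvalues +1/2 and -1/2 (hbar = 1).
import math

phi = 0.7   # an arbitrary angle for the check

Sphi = [[0.5*math.cos(phi), 0.5*math.sin(phi)],
        [0.5*math.sin(phi), -0.5*math.cos(phi)]]

v_plus = [math.cos(phi/2), math.sin(phi/2)]     # eigenvalue +1/2
v_minus = [-math.sin(phi/2), math.cos(phi/2)]   # eigenvalue -1/2

def apply(M, v):
    """Apply a 2x2 matrix to a 2-component vector."""
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

for v, lam in [(v_plus, 0.5), (v_minus, -0.5)]:
    w = apply(Sphi, v)
    assert all(abs(w[i] - lam*v[i]) < 1e-12 for i in range(2))
```
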

Spin component   eigenvalue   normalised eigenvector
Ŝx               +h̄/2         (1/√2) [1; 1]
Ŝx               −h̄/2         (1/√2) [1; −1]
Ŝy               +h̄/2         (1/√2) [1; i]
Ŝy               −h̄/2         (1/√2) [1; −i]
Ŝz               +h̄/2         [1; 0]
Ŝz               −h̄/2         [0; 1]

Table 3: The normalised eigenvectors for the vector space of spin s = 1/2 in the
standard basis (i.e. the orthonormal basis of eigenvectors of Ŝz).

Example
A particle of spin 1/2 is in the normalised state of spin:
 
    |ψs⟩ = (1/√5) [2; i] .   (398)
Find the probability of measuring spin h̄/2 and −h̄/2 in the y and z directions, and
evaluate the expectation values hŜy i and hŜz i.
Solution
Note that the state is clearly not an eigenvector of Ŝy and Ŝz because it does not
coincide with any of the eigenvectors listed in Table 3. It can be considered as a
superposition of either the eigenvectors of Ŝy or of the eigenvectors of Ŝz . To express
the state |ψs i in terms of the eigenvectors of Ŝz we just need to find the components in
this basis by the usual method. e.g. the component of the eigenvector with eigenvalue
h̄/2 is

    (1 0) (1/√5) [2; i] = 2/√5 .   (399)
Hence one arrives at

    |ψs⟩ = (2/√5) [1; 0] + (i/√5) [0; 1] .   (400)

The probability of measuring Ŝz = h̄/2 is |2/√5|² = 4/5, and the probability of
measuring Ŝz = −h̄/2 is |i/√5|² = 1/5. A similar method with the eigenvectors of Ŝy
gives

    |ψs⟩ = (3/√10) (1/√2)[1; i] + (1/√10) (1/√2)[1; −i] ,   (401)
which leads to probabilities of 9/10 and 1/10 for a measurement of Ŝy giving h̄/2 and
a measurement of Ŝy giving −h̄/2 . To calculate the expectation values of Ŝy and Ŝz
one simply calculates ⟨ψs|Ŝy|ψs⟩ and ⟨ψs|Ŝz|ψs⟩, e.g.

    ⟨Ŝz⟩ = (2/√5, −i/√5) (h̄/2) [1 0; 0 −1] [2/√5; i/√5] = 3h̄/10 .   (402)

One can also obtain this expectation value by taking our previous results for the
probabilities and multiplying each by the corresponding eigenvalue, e.g. (4/5 × h̄/2) + (1/5 × −h̄/2) = 3h̄/10.
In a similar way, ⟨Ŝy⟩ = (9/10 × h̄/2) + (1/10 × −h̄/2) = 4h̄/10.
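The whole worked example can be reproduced numerically; the sketch below (with h̄ = 1) recomputes the probabilities and expectation values from the eigenvectors in Table 3:

```python
# Reproduce the worked example (hbar = 1): probabilities and expectation
# values of S_y and S_z for |psi_s> = (1/sqrt(5)) (2, i)^T.
import math

psi = [2/math.sqrt(5), 1j/math.sqrt(5)]

def inner(u, v):
    """<u|v> with the complex conjugate taken on the bra."""
    return sum(a.conjugate()*b for a, b in zip(u, v))

# Eigenvectors of S_z and S_y from Table 3
z_up, z_dn = [1, 0], [0, 1]
y_up = [1/math.sqrt(2), 1j/math.sqrt(2)]
y_dn = [1/math.sqrt(2), -1j/math.sqrt(2)]

p_z_up = abs(inner(z_up, psi))**2   # 4/5
p_z_dn = abs(inner(z_dn, psi))**2   # 1/5
p_y_up = abs(inner(y_up, psi))**2   # 9/10
p_y_dn = abs(inner(y_dn, psi))**2   # 1/10

assert abs(p_z_up - 4/5) < 1e-12 and abs(p_z_dn - 1/5) < 1e-12
assert abs(p_y_up - 9/10) < 1e-12 and abs(p_y_dn - 1/10) < 1e-12

# Expectation values as probability-weighted eigenvalues (+1/2 and -1/2)
exp_Sz = p_z_up*0.5 + p_z_dn*(-0.5)   # 3/10, i.e. 3*hbar/10
exp_Sy = p_y_up*0.5 + p_y_dn*(-0.5)   # 4/10
assert abs(exp_Sz - 0.3) < 1e-12 and abs(exp_Sy - 0.4) < 1e-12
```
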
Finally, we note that every column vector in the vector space of spin 1/2 states is
an eigenvector of Ŝ² with eigenvalue 3h̄²/4, since for arbitrary complex numbers a and
b one has

    (3h̄²/4) [1 0; 0 1] [a; b] = (3h̄²/4) [a; b] .   (403)
The exception is the null vector (i.e. a column vector of zeros), which corresponds to
no particle of spin 1/2 being present.

5.9 Addition of angular momentum


Suppose that two distinct states with angular momentum (either orbital or spin) quan-
tum numbers j1 , m1 for the first state and j2 , m2 for the second state are considered
as a single system. Examples of this scenario are the case of two distinct elementary
particles, or the case of combining the orbital angular momentum and the spin angular
momentum of a single particle. For concreteness, let us consider two particles and we
wish to consider the allowed spin quantum numbers of the combined system. However,
for generality, let us label the angular momentum operators by Jˆ2 and Jˆi instead of
Ŝ 2 and Ŝi , even though we are taking the addition of the spins of two particles as our
example. Since we are considering two particles we need to have a particle label on
each angular momentum operator.
i) Jˆ(1) refers to an angular momentum operator for particle 1.

ii) Jˆ(2) refers to an angular momentum operator for particle 2.
For the operators corresponding to the components of angular momentum we have
    [Ĵ_i^(1), Ĵ_j^(2)] = 0 for i, j = x, y, z ,   (404)

because the coordinates of particle 1 are independent of those of particle 2 i.e. the x
coordinate of particle 1 is x1 and of particle 2 is x2 , which are considered as independent
variables, and same for y1 , y2 , z1 and z2 . Moreover, we have from our earlier results:
    [Ĵ_i^(1), Ĵ_j^(1)] = ih̄ Σ_k ε_{ijk} Ĵ_k^(1) and [Ĵ_i^(2), Ĵ_j^(2)] = ih̄ Σ_k ε_{ijk} Ĵ_k^(2) .   (405)

The operators Ĵ_z^(1), (Ĵ^(1))², Ĵ_z^(2), (Ĵ^(2))² all commute with each other:

    [Ĵ_z^(1), Ĵ_z^(2)] = [Ĵ_z^(1), (Ĵ^(1))²] = [Ĵ_z^(1), (Ĵ^(2))²] = [Ĵ_z^(2), (Ĵ^(1))²] = [Ĵ_z^(2), (Ĵ^(2))²] = [(Ĵ^(1))², (Ĵ^(2))²] = 0 .   (406)

The second and fifth commutators are zero since they are the same as eq. (327). All
others vanish due to the two different coordinate systems. Hence there is a basis of
common eigenvectors |j1 , j2 , m1 , m2 i and so a maximum of four quantum numbers for
the spin angular momentum of the two particles can be simultaneously specified.
Now consider the operator for the combined angular momentum Ĵ (i.e. no superscript
on Ĵ) of the two particles:

    Ĵ = Ĵ^(1) + Ĵ^(2) .   (407)

Note that Ĵ is a vector operator, and is given by

    Ĵ = i Ĵx + j Ĵy + k Ĵz .   (408)

The magnitude squared of this combined angular momentum is given by Ĵ·Ĵ = Ĵ².


The components of Ĵ satisfy the usual commutation relations for angular momentum
operators

    [Ĵ_i, Ĵ_j] = ih̄ Σ_k ε_{ijk} Ĵ_k .   (409)

For example, take i = x and j = y, and one has

    [Ĵ_x^(1) + Ĵ_x^(2), Ĵ_y^(1) + Ĵ_y^(2)] = ih̄ (Ĵ_z^(1) + Ĵ_z^(2)) ,   (410)

because the operators for particle 1 combine to give Ĵ_z^(1) (and the same for particle 2), while
[Ĵ_x^(1), Ĵ_y^(2)] = 0 because the two coordinate systems are independent of each other. Let

the quantum numbers for Jˆ2 and Jˆz be j and m respectively (i.e. no subscript). Since
the components of Jˆ satisfy the usual commutation relations in eq.(409) then our
previous results for the allowed values of the quantum numbers of angular momentum
apply to j and m. Having defined this operator Jˆ for the combined angular momentum
of the two particles one can show that the following four operators all commute with
each other.
Jˆ2 , Jˆz , (Jˆ(1) )2 , (Jˆ(2) )2 . (411)
We see that we have exchanged Ĵ_z^(1) and Ĵ_z^(2) for Ĵ² and Ĵ_z as our choice of four
commuting operators. For example, [Ĵ², (Ĵ^(1))²] = 0. To show this, first note that
    Ĵ² = Ĵ·Ĵ = (Ĵ^(1) + Ĵ^(2))·(Ĵ^(1) + Ĵ^(2)) = (Ĵ^(1))² + (Ĵ^(2))² + 2 Σ_{i=1}^{3} Ĵ_i^(1) Ĵ_i^(2) ,   (412)

where the factor of two arises because Ĵ_i^(1) Ĵ_i^(2) = Ĵ_i^(2) Ĵ_i^(1). Thus one has:

    [Ĵ², (Ĵ^(1))²] = [(Ĵ^(1))², (Ĵ^(1))²] + [(Ĵ^(2))², (Ĵ^(1))²] + 2 Σ_{i=1}^{3} [Ĵ_i^(1) Ĵ_i^(2), (Ĵ^(1))²] .   (413)

The first two commutators clearly vanish, and in the third term just expand the com-
mutator and use the fact that the three operators all commute with each other and
so their order can be freely changed. Therefore the common eigenvectors |j, m, j1 , j2 i
also form a basis for the states of angular momentum of this system of two particles.
A given state can be expanded in either basis and each is maximally informative i.e.
four quantum numbers can be specified simultaneously with definite values as follows:
    Ĵ² |j, m, j1, j2⟩ = j(j + 1)h̄² |j, m, j1, j2⟩ ,   (414)
    Ĵ_z |j, m, j1, j2⟩ = mh̄ |j, m, j1, j2⟩ ,
    (Ĵ^(1))² |j, m, j1, j2⟩ = j1(j1 + 1)h̄² |j, m, j1, j2⟩ ,
    (Ĵ^(2))² |j, m, j1, j2⟩ = j2(j2 + 1)h̄² |j, m, j1, j2⟩ .
Now returning to the basis |j1 , m1 , j2 , m2 i, for fixed j1 and j2 (e.g. j1 (= s1 ) = 1/2
and j2 (= s2 ) = 1 for a spin 1/2 particle and a spin 1 particle), particle 1 has (2j1 + 1)
distinct values of m1 and particle 2 has (2j2 + 1) distinct values of m2 . Therefore there
are (2j1 + 1)(2j2 + 1) distinct eigenvectors |j1 , m1 , j2 , m2 i. Therefore there must be the
same number of basis vectors in the alternative basis |j, m, j1 , j2 i.
Now consider the allowed values of j and m for the combined angular momentum
Ĵ = Ĵ^(1) + Ĵ^(2) of the two-particle system. For fixed j1 and j2, the quantum number j for
the combined system will have several possible values. We know that Ĵ_z = Ĵ_z^(1) + Ĵ_z^(2),
and we have the following:

(i) The maximum value of Ĵ_z is m^max.

(ii) The maximum value of Ĵ_z^(1) is m_1^max = j1 (since −j1 ≤ m1 ≤ j1).

(iii) The maximum value of Ĵ_z^(2) is m_2^max = j2 (since −j2 ≤ m2 ≤ j2).

From this we conclude that

    m^max = m_1^max + m_2^max = j1 + j2 .   (415)
Since the components of the operator Ĵ satisfy the commutation relations in eq. (409)
then we know that the usual result for the allowed values of the angular momentum
quantum numbers must apply to Jˆ as well i.e. −j ≤ m ≤ j. Now mmax = j1 + j2 ,
and so there must be a value j max = mmax = j1 + j2 , which is the largest value of j.
There cannot be a value of j greater than this, because then there would be a value of
m which is larger than mmax (recall −j ≤ m ≤ j). There will be 2j max + 1 different
values of m for j = j max . So far we have found j max in terms of j1 and j2 . Lower values
of j must differ in integer steps from j max down to some minimum value j min , because
m = (m1 + m2 ) can only change in integer steps. What is the value of j min ? For each
value of j there are 2j + 1 distinct states with a different value of m, which gives 2j + 1
basis vectors. We know that
    Σ_{j=j^min}^{j^max} (2j + 1) = (2j1 + 1)(2j2 + 1) ,   (416)

so that the number of vectors in the two bases |j1 , m1 , j2 , m2 i and |j, m, j1 , j2 i is the
same. The summation in eq. (416) fixes the value of j min to be j min = |j1 − j2 |. Let’s
check this for the case of j1 = 3/2 and j2 = 1, for which (2j1 + 1)(2j2 + 1) = 12. One
has

    Σ_{j=|j1−j2|}^{j1+j2} (2j + 1) = (2 × 1/2 + 1) + (2 × 3/2 + 1) + (2 × 5/2 + 1) = 12 .   (417)

We have derived an important result:


The allowed values of j for the combined system are |j1 − j2 | ≤ j ≤ j1 + j2 in integer steps.
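This rule is easy to check in code. The sketch below uses Python's Fraction type so that half-integer spins are handled exactly; `allowed_j` and `count_states` are illustrative helper names, not part of the notes:

```python
# The addition rule just derived: for fixed j1 and j2 the allowed values of j
# run from |j1 - j2| to j1 + j2 in integer steps, and the total number of
# |j, m> states matches (2*j1 + 1)*(2*j2 + 1).
from fractions import Fraction

def allowed_j(j1, j2):
    """List the allowed total-angular-momentum quantum numbers j."""
    j1, j2 = Fraction(j1), Fraction(j2)
    j = abs(j1 - j2)
    values = []
    while j <= j1 + j2:
        values.append(j)
        j += 1                      # j changes in integer steps
    return values

def count_states(j1, j2):
    """Total number of |j, m> basis vectors, summing 2j + 1 over allowed j."""
    return sum(2*j + 1 for j in allowed_j(j1, j2))

# The j1 = 3/2, j2 = 1 check of eq. (417):
assert allowed_j(Fraction(3, 2), 1) == [Fraction(1, 2), Fraction(3, 2), Fraction(5, 2)]
assert count_states(Fraction(3, 2), 1) == (2*Fraction(3, 2) + 1)*(2*1 + 1)  # 12
```
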

5.10 Tensor products


In the previous section we discussed the addition of spin angular momentum for a
system with two particles (particle 1 and particle 2). It is clear that we need to
describe multi-particle systems in QM. We have learnt that each particle is assigned a
ket:

(i) The spin angular momentum of particle 1 is described by the ket |j1 , m1 i in a
vector space VA .

(ii) The spin angular momentum of particle 2 is described by the ket |j2 , m2 i in a
vector space VB .

If particle 1 and particle 2 are both spin 1/2 particles, then the distinct vector spaces
VA and VB are each of dimension 2 (i.e. the vector space of spin 1/2 particles that we
studied in section 5.7). The kets |j1 , m1 , j2 , m2 i reside in the combined vector space
VA ⊗ VB of dimension 4 i.e. |j1 , m1 , j2 , m2 i can be represented by 4 × 1 column vectors.
The symbol “⊗” denotes the tensor product, and is an operation that combines the
two vector spaces into a larger one. In fact,

|j1 , m1 , j2 , m2 i = |j1 , m1 i ⊗ |j2 , m2 i . (418)

If VA and VB have orthonormal bases |iiA and |kiB respectively, where i, k take values
from 1 to n (n is the dimensionality of the space) then

|iiA ⊗ |kiB = |ikiAB are basis vectors for the vector space VA ⊗ VB . (419)

The dimension of VA ⊗ VB is n². Any vector |ψ⟩_AB in the vector space VA ⊗ VB can be
expressed in terms of the basis vectors

    |ψ⟩_AB = Σ_{i,k} a_ik |ik⟩_AB = Σ_{i,k} a_ik (|i⟩_A ⊗ |k⟩_B) ,   (420)

where aik are complex numbers. If Q̂A is an operator on the vector space VA , and Q̂B
is an operator on the vector space VB , then we define:
    (Q̂_A ⊗ Q̂_B)|ψ⟩_AB = Σ_{i,k} a_ik Q̂_A|i⟩_A ⊗ Q̂_B|k⟩_B .   (421)

Hence Q̂A acts on the vectors in VA and Q̂B acts on the vectors in VB . For example,

    Ĵ_z^(1)|j1, m1, j2, m2⟩ ⇒ (Ĵ_z^(1) ⊗ I)(|j1, m1⟩ ⊗ |j2, m2⟩) = m1h̄ |j1, m1⟩ ⊗ |j2, m2⟩ = m1h̄ |j1, m1, j2, m2⟩ .   (422)
We can see in eq. (422) that the first and fourth terms are our old notation from section
5.9, but this old notation is somewhat “sloppy”. The operator Ĵ_z^(1) (think of it as a 2×2
matrix) only acts on the vector space VA of dimension 2, while the kets |j1, m1, j2, m2⟩
are in a vector space of larger dimension, which we now call VA ⊗ VB. The second term
shows the step in which the operator Ĵ_z^(1) is converted into an operator which can act
on the enlarged vector space VA ⊗ VB, and the third step makes use of our definition
in eq. (421).
It is common to drop the symbol ⊗ and just write

|iiA |kiB instead of |iiA ⊗ |kiB . (423)

We need to define an inner product for the vector space VA ⊗ VB. If |ψ⟩_AB = |u⟩_A|w⟩_B
and |ψ′⟩_AB = |u′⟩_A|w′⟩_B then the inner product ⟨ψ|ψ′⟩_AB is defined as:

    ⟨ψ|ψ′⟩_AB = ⟨u|u′⟩_A ⟨w|w′⟩_B .   (424)
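The definitions of this section can be illustrated with a small pure-Python sketch of the Kronecker (tensor) product for two-dimensional spaces; the matrices and vectors used here are arbitrary test values, not anything from the notes:

```python
# Tensor products of section 5.10: the Kronecker product of two kets (as in
# eq. (446)), of two matrices (as in eq. (448)), and the rule
# (Q_A x Q_B)(|u> x |w>) = (Q_A|u>) x (Q_B|w>) of eq. (421).

def kron_vec(u, w):
    """Tensor product of two column vectors."""
    return [a*b for a in u for b in w]

def kron_mat(A, B):
    """Tensor product of two square matrices (block structure of eq. (448))."""
    n, m = len(A), len(B)
    return [[A[i//m][j//m]*B[i % m][j % m] for j in range(n*m)]
            for i in range(n*m)]

def apply(M, v):
    """Apply a matrix to a vector."""
    return [sum(M[i][j]*v[j] for j in range(len(v))) for i in range(len(v))]

# Arbitrary test operators and kets
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
u, w = [1, 2], [3, 5]

# (A x B)(u x w) equals (A u) x (B w)
lhs = apply(kron_mat(A, B), kron_vec(u, w))
rhs = kron_vec(apply(A, u), apply(B, w))
assert lhs == rhs
```
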

5.11 Clebsch-Gordan coefficients


Any of the basis vectors |j, m, j1, j2⟩ (i.e. a particular choice of j, m, j1 and j2) can be
written in terms of the basis vectors |j1, m1, j2, m2⟩ as follows:

    |j, m, j1, j2⟩ = Σ_{|m1|≤j1; |m2|≤j2; m1+m2=m} |j1, m1, j2, m2⟩⟨j1, m1, j2, m2|j, m, j1, j2⟩ .   (425)

The inner products ⟨j1, m1, j2, m2|j, m, j1, j2⟩ are called the “Clebsch-Gordan coefficients”
(CG coefficients). The CG coefficients are determined as functions of the quantum
numbers up to some sign conventions. Let's take the simplest case of two spin 1/2
particles i.e. j1 = s1 = 1/2 and j2 = s2 = 1/2. We now show how to determine the
CG coefficients for this case. Since j1 and j2 are both fixed in this example, we can
streamline our notation by writing

|j, m, j1 , j2 i = |j, mi and |j1 , m1 , j2 , m2 i = |j1 , m1 i|j2 , m2 i = |m1 i|m2 i . (426)

We learnt before that |j1 − j2 | ≤ j ≤ j1 + j2 in integer steps and so the possible values
of j are j = 1 and j = 0. Hence the four states of |j, mi are

|1, 1i, |1, 0i, |1, −1i, |0, 0i . (427)

The four states of |m1 i|m2 i are

|1/2i|1/2i, |1/2i| − 1/2i, | − 1/2i|1/2i, | − 1/2i| − 1/2i . (428)

Now we proceed to find the CG coefficients for these two bases. First let us start with
the state |1, 1i. We can write eq. (425) as

|1, 1i = c1 |1/2i|1/2i + c2 |1/2i| − 1/2i + c3 | − 1/2i|1/2i + c4 | − 1/2i| − 1/2i . (429)

Since |1, 1i has m = 1, and the only way to make m = 1 is from m1 = 1/2 and
m2 = 1/2 then we know immediately that c1 = 1 and c2 = c3 = c4 = 0 i.e.

|1, 1i = |1/2i|1/2i , (430)

and
c1 = h1/2, 1/2, 1/2, 1/2|1, 1, 1/2, 1/2i = 1 . (431)
The convention is to define the CG coefficients as being real (i.e. no complex phase),
and hence we chose “1” instead of e^{iθ}.
We can obtain the state |1, 0⟩ from the state |1, 1⟩ by making use of the lowering
operator Ĵ_− for the kets |j, m⟩. One applies Ĵ_− to the left-hand side of eq. (430), and
since Ĵ_− = Ĵ_−^(1) + Ĵ_−^(2), we apply Ĵ_−^(1) + Ĵ_−^(2) to the right-hand side of eq. (430). This
will give |1, 0⟩ written in terms of |m1⟩|m2⟩ and |m2⟩|m1⟩. Of course, Ĵ_−^(1) and Ĵ_−^(2)
are the lowering operators for the individual systems of particle 1 and particle 2. As
explained before, since Ĵ_−^(1) and Ĵ_−^(2) act on the dimension 2 vector spaces of particle 1
and particle 2 respectively, we need to form the following tensor products when these
operators act on the combined vector space in which the kets |m1⟩|m2⟩ reside:

    Ĵ_−^(1) ⇒ Ĵ_−^(1) ⊗ I ;   Ĵ_−^(2) ⇒ I ⊗ Ĵ_−^(2) ,   (432)

where I is the identity matrix of a vector space of dimension 2. So applying these


lowering operators to eq. (430) we have

    Ĵ_−|1, 1⟩ = (Ĵ_−^(1) ⊗ I + I ⊗ Ĵ_−^(2))|1/2⟩|1/2⟩ .   (433)

Now recall eq. (349):

    Ĵ_−|j, m⟩ = h̄ √((j + m)(j − m + 1)) |j, m − 1⟩ .   (434)

So the left-hand side of eq. (433) gives

    Ĵ_−|1, 1⟩ = h̄ √((1 + 1)(1 − 1 + 1)) |1, 0⟩ = h̄√2 |1, 0⟩ .   (435)

Now let’s work out the right-hand side of eq. (433). We use eq. (434) with j and m
replaced by j1 and m1 respectively, and recall that |m1 i is compact notation for the
ket |j1 , m1 i:
(1)
Jˆ− ⊗ I|1/2i|1/2i = h̄ (1/2 + 1/2)(1/2 − 1/2 + 1)| − 1/2i|1/2i = h̄| − 1/2i|1/2i .
p

(436)

Note here that Ĵ_−^(1) has acted on the first ket only and has lowered it to |−1/2⟩, while
the identity operator I has acted on the second ket, leaving it unchanged. In a similar
way we have

    (I ⊗ Ĵ_−^(2))|1/2⟩|1/2⟩ = h̄ √((1/2 + 1/2)(1/2 − 1/2 + 1)) |1/2⟩|−1/2⟩ = h̄ |1/2⟩|−1/2⟩ .   (437)
Hence combining eqs. (435), (436) and (437), we obtain the following result from eq. (433):

    |1, 0⟩ = ( |−1/2⟩|1/2⟩ + |1/2⟩|−1/2⟩ ) / √2 .   (438)
Thus the two CG coefficients are

    ⟨1/2, ±1/2, 1/2, ∓1/2|1, 0, 1/2, 1/2⟩ = 1/√2 .   (439)

The CG coefficient for the state |1, −1i can be obtained by inspection, because the
only way to obtain m = −1 is from m1 = m2 = −1/2. Hence one has

    |1, −1⟩ = |−1/2⟩|−1/2⟩ .   (440)

Alternatively, one can obtain this result by applying the lowering operators to both
sides of eq. (438). One has

    Ĵ_−|1, 0⟩ = h̄√2 |1, −1⟩ ,   (441)

and

    (Ĵ_−^(1) ⊗ I) ( |−1/2⟩|1/2⟩ + |1/2⟩|−1/2⟩ ) / √2 = h̄ |−1/2⟩|−1/2⟩ / √2 .   (442)
Here we see that Ĵ_−^(1) has annihilated the ket |−1/2⟩ to the null vector in the first term
in the parentheses, and has lowered the ket |1/2⟩ to |−1/2⟩ in the second term in the
parentheses. There is a similar result for Ĵ_−^(2), and thus one obtains eq. (440).
The final state |0, 0i must be composed of |1/2i| − 1/2i and | − 1/2i|1/2i in order
to ensure that m = 0, and it must be orthogonal to |1, 0i because |0, 0i and |1, 0i are
basis vectors. Note that |0, 0i is not obtained by using lowering operators. It turns out
that

    |0, 0⟩ = ( |1/2⟩|−1/2⟩ − |−1/2⟩|1/2⟩ ) / √2 .   (443)

Using the definition of the inner product in eq. (424) one can check the orthogonality:

    ⟨0, 0|1, 0⟩ = (1/2)⟨1/2|−1/2⟩⟨−1/2|1/2⟩ + (1/2)⟨1/2|1/2⟩⟨−1/2|−1/2⟩ − (1/2)⟨−1/2|−1/2⟩⟨1/2|1/2⟩ − (1/2)⟨−1/2|1/2⟩⟨1/2|−1/2⟩ = 0 + 1/2 − 1/2 + 0 = 0 .   (444)
Another possible choice is

    |0, 0⟩ = ( |−1/2⟩|1/2⟩ − |1/2⟩|−1/2⟩ ) / √2 ,   (445)

i.e. multiplication of eq. (443) by −1, but with no complex phase (recall that the convention
is to have no phase in the CG coefficients).
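The lowering-operator derivation above can be verified numerically. The sketch below (with h̄ = 1) builds Ĵ_− on the four-dimensional product space, applies it to |1, 1⟩ = |1/2⟩|1/2⟩, and recovers the CG coefficients 1/√2 of eq. (439):

```python
# Lowering-operator construction of section 5.11 for two spin-1/2 particles
# (hbar = 1).  Basis ordering: |1/2>|1/2>, |1/2>|-1/2>, |-1/2>|1/2>,
# |-1/2>|-1/2>.
import math

Sminus = [[0, 0], [1, 0]]   # eq. (394) with hbar = 1
I2 = [[1, 0], [0, 1]]

def kron(A, B):
    n, m = len(A), len(B)
    return [[A[i//m][j//m]*B[i % m][j % m] for j in range(n*m)]
            for i in range(n*m)]

def apply(M, v):
    return [sum(M[i][j]*v[j] for j in range(len(v))) for i in range(len(v))]

# J_- on the product space, as in eq. (432)
Jminus = [[kron(Sminus, I2)[i][j] + kron(I2, Sminus)[i][j] for j in range(4)]
          for i in range(4)]

ket_11 = [1, 0, 0, 0]              # |1,1> = |1/2>|1/2>, eq. (430)
lowered = apply(Jminus, ket_11)    # = sqrt(2)|1,0>, eq. (435)
norm = math.sqrt(sum(abs(c)**2 for c in lowered))
ket_10 = [c/norm for c in lowered]

# |1,0> = (|1/2>|-1/2> + |-1/2>|1/2>)/sqrt(2), eq. (438)
r = 1/math.sqrt(2)
assert all(abs(a - b) < 1e-12 for a, b in zip(ket_10, [0, r, r, 0]))

# The singlet |0,0> of eq. (443) is orthogonal to |1,0>, as in eq. (444)
ket_00 = [0, r, -r, 0]
assert abs(sum(a*b for a, b in zip(ket_10, ket_00))) < 1e-12
```
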

5.12 Matrix representation of a tensor product


So far we have discussed the tensor product in terms of the abstract kets and operators.
Let us now define the matrix representation of a tensor product. The tensor product
of an n × 1 column vector and an m × 1 column vector is an nm × 1 matrix e.g. for
the case of n = m = 2 one has

    [a; b] ⊗ [c; d] = [ac; ad; bc; bd] .   (446)

The tensor product of an n × m matrix and a p × q matrix is an np × mq matrix e.g.


for the matrices

    A = [a b; c d] ,   B = [w x; y z] ,   (447)

one has

    A ⊗ B = [aB bB; cB dB] = [aw ax bw bx; ay az by bz; cw cx dw dx; cy cz dy dz] .   (448)

5.13 Entanglement
A general vector |ψ⟩_AB in a vector space VA ⊗ VB can be written in terms of the basis
vectors as follows (recall eq. (420)):

    |ψ⟩_AB = Σ_{i,k} a_ik (|i⟩ ⊗ |k⟩) .   (449)

Not all vectors in VA ⊗ VB can be written as the tensor product of a vector |ψ⟩_A (in
VA) and a vector |ψ⟩_B (in VB), where

    |ψ⟩_A = Σ_i a_i^A |i⟩_A and |ψ⟩_B = Σ_k a_k^B |k⟩_B .   (450)

To show this with an explicit example let us consider the following:

    |ψ⟩_A ⊗ |ψ⟩_B ⇒ [a_1^A; a_2^A] ⊗ [a_1^B; a_2^B] = [a_1^A a_1^B; a_1^A a_2^B; a_2^A a_1^B; a_2^A a_2^B] ,   (451)
for the vectors in eq. (450), and

    |ψ⟩_AB = [a_11^AB; a_12^AB; a_21^AB; a_22^AB]   (452)

for the vector in eq. (449). Now take the case of a_11^AB = a_22^AB = 0 and a_12^AB = a_21^AB = 1
for |ψ⟩_AB. Thus one has

    |ψ⟩_AB = [0; 1; 1; 0] .   (453)
Clearly eq. (453) is a vector in the vector space VA ⊗ VB, but when compared with
the form of the vector |ψ⟩_A ⊗ |ψ⟩_B in eq. (451), one sees that there are no solutions
of a_1^A, a_2^A, a_1^B, a_2^B which enable eq. (453) to be written in the form |ψ⟩_A ⊗ |ψ⟩_B. To
see this, note that one of {a_1^A, a_1^B} and one of {a_2^A, a_2^B} must be zero to obtain the
zero entries in eq. (453), but this requirement contradicts the condition for obtaining
the entries with a “1”. The vector in eq. (453) is an example of an “entangled” state.

We say that a state in VA ⊗ VB is “separable” if a_ik^AB = a_i^A a_k^B, so that the state of two
particles can be expressed as a product of states of the individual particles. If a_ik^AB cannot
be written as a_i^A a_k^B then the state of the two particles is “entangled”. The members of an
entangled state do not have their individual quantum states, i.e. each particle cannot
be assigned separate kets |ψ⟩_A and |ψ⟩_B. However, the system as a whole does have
its own quantum state, and is represented by |ψ⟩_AB.
Recall the state |j, m⟩ = |0, 0⟩ for a system of two particles each with spin 1/2:

    |0, 0⟩ = ( |1/2⟩|−1/2⟩ − |−1/2⟩|1/2⟩ ) / √2 .   (454)
This is an entangled state. This can be seen by making use of the column vector
representations in section 5.7:
   
    |1/2⟩ = [1; 0] and |−1/2⟩ = [0; 1] .   (455)

After performing the tensor products in eq. (454) one obtains

    |0, 0⟩ = (1/√2) [0; 1; −1; 0] ,   (456)

which is clearly an entangled state like eq. (453).
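The separability argument can be automated. For a two-qubit pure state the condition a_ik^AB = a_i^A a_k^B is equivalent to the vanishing of the determinant a11 a22 − a12 a21 of the coefficient matrix; this determinant (rank-one) criterion is a standard result, used here as a convenient test rather than something derived in these notes:

```python
# Separability test for a two-qubit pure state with coefficients a_ik:
# the state is separable iff the 2x2 coefficient matrix has rank one,
# i.e. iff a11*a22 - a12*a21 = 0 (a standard criterion, assumed here).
import math

def is_separable(a11, a12, a21, a22, tol=1e-12):
    return abs(a11*a22 - a12*a21) < tol

# A product state |psi>_A (x) |psi>_B is always separable:
aA = [1/math.sqrt(2), 1/math.sqrt(2)]
aB = [3/5, 4/5]
assert is_separable(aA[0]*aB[0], aA[0]*aB[1], aA[1]*aB[0], aA[1]*aB[1])

# The state of eq. (453) and the singlet of eq. (456) are entangled:
r = 1/math.sqrt(2)
assert not is_separable(0, 1, 1, 0)     # eq. (453)
assert not is_separable(0, r, -r, 0)    # eq. (456)
```
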

6 Non-locality and Bell’s inequalities


Please see the separate file Chap6-Bell.pdf on blackboard. Here is a detailed
derivation of the expectation value ⟨ab⟩, which is defined in eq. (4.2) in the file Chap6-
Bell.pdf. Note that the ket |1/2⟩ is denoted by |↑⟩ (signifying spin up) and the ket
|−1/2⟩ is denoted by |↓⟩ (signifying spin down). We start with the entangled state
|ψ⟩:
    |ψ⟩ = (1/√2)( |↑⟩_A|↓⟩_B − |↓⟩_A|↑⟩_B ) .   (457)
Note that I explicitly write the subscripts A and B on |↑⟩ etc, where A and B denote
which particle we are referring to. Now form the bra vector ⟨ψ|:

    ⟨ψ| = (1/√2)( ⟨↑|_A⟨↓|_B − ⟨↓|_A⟨↑|_B ) .   (458)

For convenience, now write

    Ŝ_Bz cos θ_ab + Ŝ_Bx sin θ_ab = Ŝ_B .   (459)

Therefore one has the following expression for ⟨ab⟩:

    ⟨ab⟩ = (1/√2)² (2/h̄)² ( ⟨↑|_A⟨↓|_B − ⟨↓|_A⟨↑|_B ) Ŝ_Az ⊗ Ŝ_B ( |↑⟩_A|↓⟩_B − |↓⟩_A|↑⟩_B ) .   (460)
Now use the identity in eq. (421),

    Â ⊗ B̂ |v⟩|w⟩ = Â|v⟩ B̂|w⟩ ,   (461)

and so one arrives at

    ⟨ab⟩ = (2/h̄²) ( ⟨↑|_A⟨↓|_B − ⟨↓|_A⟨↑|_B ) ( Ŝ_Az|↑⟩_A Ŝ_B|↓⟩_B − Ŝ_Az|↓⟩_A Ŝ_B|↑⟩_B ) .   (462)

Now use eq. (424) for the inner product of |ψ⟩ = |u⟩|w⟩ and |ψ′⟩ = |u′⟩|w′⟩:

    ⟨ψ|ψ′⟩ = ⟨u|u′⟩⟨w|w′⟩ .   (463)

Using this one arrives at:

    ⟨ab⟩/(2/h̄²) = ⟨↑|_A Ŝ_Az|↑⟩_A ⟨↓|_B Ŝ_B|↓⟩_B − ⟨↑|_A Ŝ_Az|↓⟩_A ⟨↓|_B Ŝ_B|↑⟩_B − ⟨↓|_A Ŝ_Az|↑⟩_A ⟨↑|_B Ŝ_B|↓⟩_B + ⟨↓|_A Ŝ_Az|↓⟩_A ⟨↑|_B Ŝ_B|↑⟩_B .   (464)

The second and third terms vanish as follows:

    ⟨↑|_A Ŝ_Az|↓⟩_A = −(h̄/2) ⟨↑|↓⟩_A = 0 by orthonormality ,   (465)

and

    ⟨↓|_A Ŝ_Az|↑⟩_A = (h̄/2) ⟨↓|↑⟩_A = 0 by orthonormality .   (466)
Recall our definition of Ŝ_B in eq. (459) and use the usual relation

    Ŝ_Bx = (Ŝ_B+ + Ŝ_B−)/2 ,   (467)

where Ŝ_B+ and Ŝ_B− are the spin raising and lowering operators for particle B. In the
first and fourth terms the contribution from Ŝ_Bx sin θ_ab vanishes because

    Ŝ_B+|↓⟩_B ∼ |↑⟩_B and Ŝ_B−|↑⟩_B ∼ |↓⟩_B .   (468)

Here sin θ_ab ⟨↓|↑⟩_B = 0 and sin θ_ab ⟨↑|↓⟩_B = 0 by orthonormality, and one also has

    Ŝ_B+|↑⟩_B = |0⟩ and Ŝ_B−|↓⟩_B = |0⟩ .   (469)

So the only non-vanishing contributions to the first and fourth terms of the expression for ⟨ab⟩
in eq. (464) are of the form

    ⟨↑|_A Ŝ_Az|↑⟩_A = h̄/2 and ⟨↓|_A Ŝ_Az|↓⟩_A = −h̄/2 ,   (470)

and similarly for Ŝ_Bz but with a factor cos θ_ab. Hence we finally arrive at the following
expression for ⟨ab⟩:

    ⟨ab⟩ = (2/h̄²) [ (h̄/2)(−(h̄/2) cos θ_ab) + (−h̄/2)((h̄/2) cos θ_ab) ] = − cos θ_ab .   (471)
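This final result can be confirmed numerically. The sketch below (with h̄ = 1) evaluates ⟨ab⟩ = (2/h̄)² ⟨ψ| Ŝ_Az ⊗ Ŝ_B |ψ⟩ for the normalised singlet state at several angles:

```python
# Check of eq. (471): for the singlet state, the spin correlation
# <ab> = (2/hbar)^2 <psi| S_Az (x) (S_Bz cos(theta) + S_Bx sin(theta)) |psi>
# equals -cos(theta).  hbar = 1 throughout.
import math

Sx = [[0, 0.5], [0.5, 0]]
Sz = [[0.5, 0], [0, -0.5]]

def kron(A, B):
    n, m = len(A), len(B)
    return [[A[i//m][j//m]*B[i % m][j % m] for j in range(n*m)]
            for i in range(n*m)]

def expectation(M, v):
    """<v|M|v> for a real state vector (no conjugation needed)."""
    Mv = [sum(M[i][j]*v[j] for j in range(4)) for i in range(4)]
    return sum(v[i]*Mv[i] for i in range(4))

r = 1/math.sqrt(2)
psi = [0, r, -r, 0]   # the entangled state of eq. (457)

for theta in [0.0, 0.4, math.pi/2, 2.0]:
    SB = [[Sz[i][j]*math.cos(theta) + Sx[i][j]*math.sin(theta)
           for j in range(2)] for i in range(2)]
    ab = 4*expectation(kron(Sz, SB), psi)   # prefactor (2/hbar)^2 = 4
    assert abs(ab - (-math.cos(theta))) < 1e-12
```
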

7 Quantum key distribution (non-examinable)
The Bell inequality has an application (EPR key distribution, E91 protocol, Ekert,
1991) in ensuring that information has not been intercepted by an eavesdropper. In-
formation can be encrypted (plaintext → ciphertext) by using a secret key, which
should only be known to the sender and to the recipient. If this secret key is sent
classically between the communicators, an eavesdropper could read it and pass it on
to the recipient without altering it. The sender and recipient cannot be sure that
their exchange is secure (i.e. that the key hasn’t been intercepted). In quantum me-
chanics, the action of a measurement on a superposition changes the state, and so the
eavesdropper’s reading of the secret key can be detected. In EPR key distribution the
measurement performed by the eavesdropper would change an entangled state (which
is used to send the key) to a separable state, and only entangled states can violate the
Bell inequality. This difference can be used to test whether any eavesdropper had read
the secret key before it reached the recipient.

8 Qubits, Bell states and Quantum teleportation


Please see the separate file Chap8-Qubits.pdf on blackboard. Below is a
summary of the steps in quantum teleportation. A qubit is information carried by
a particle, and in the examples we study we take the qubit to be spin.

(i) Alice has a qubit in an unknown state |φ⟩, i.e. α and β are unknown in the
superposition |φ⟩ = α|0⟩ + β|1⟩. She would like this state to be transported
(“teleported”) to Bob. Note that Alice's particle will not be physically moved to
Bob's location, but the state of Alice's particle (given by the unknown coefficients
α and β) will be transferred to Bob's particle.

(ii) Alice and Bob have a pair of qubits in an entangled state


    |ψ⟩ = |B0⟩ = (1/√2)( |00⟩ + |11⟩ ) .   (472)
Alice keeps one of the qubits in |ψi (the left one) and sends the other qubit (the
right qubit) to Bob. Hence Alice is in possession of two particles (|φi and the
first qubit of |ψi) and Bob is in possession of one particle.

(iii) The product state |φi|ψi describes the three qubits in the combined system of

Alice's and Bob's particles:

    |φ⟩|ψ⟩ = (1/√2)( α|000⟩ + β|100⟩ + α|011⟩ + β|111⟩ ) .   (473)
The first two qubits (which are with Alice) in the kets on the right-hand side of
eq. (473) are expanded in terms of Bell states, i.e. by performing a Bell measurement.
This gives the expression

    |φ⟩|ψ⟩ = (1/2)[ |B0⟩(α|0⟩+β|1⟩) + |B1⟩(α|1⟩+β|0⟩) + |B2⟩(α|0⟩−β|1⟩) + |B3⟩(α|1⟩−β|0⟩) ] .   (474)

Here we have used |00⟩ = (|B0⟩ + |B2⟩)/√2, and |01⟩ = (|B1⟩ + |B3⟩)/√2 etc.

(iv) The result of Alice’s measurement is one of the four Bell states |Bi i. Thus |φi|ψi
in eq .(474) collapses to one of the four terms. Bob’s qubit (which has not been
measured yet) is in a state given by the single qubit part of one of the four terms,
and contains the coefficients α and β.

(v) Alice sends the number i = 0, 1, 2 or 3 (i.e. two bits of classical information)
to Bob depending on whether the state of her qubits after measurement is |B0 i,
|B1 i, |B2 i or |B3 i. This is done by a classical communication channel.
(vi) Bob applies one of the operators Î, X̂, Ẑ, Ŷ to his qubit (i.e. not a measurement),
according to whether Alice sends him 0, 1, 2 or 3. This operation transforms Bob's
qubit into |φ⟩ (= α|0⟩ + β|1⟩).

(vii) The qubit |φ⟩ has been transported from Alice to Bob, without either of them
knowing the state |φ⟩ (i.e. the coefficients α and β are still unknown).
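The seven steps above can be simulated directly. The sketch below assumes the Bell-basis convention |B0⟩ = (|00⟩+|11⟩)/√2, |B1⟩ = (|01⟩+|10⟩)/√2, |B2⟩ = (|00⟩−|11⟩)/√2, |B3⟩ = (|01⟩−|10⟩)/√2, which reproduces eq. (474); if Chap8-Qubits.pdf uses different sign conventions the correction table changes accordingly. The test values of α and β are arbitrary:

```python
# Simulation of the teleportation protocol.  Index convention for the
# three-qubit state: index = 4*q1 + 2*q2 + q3, with q1, q2 held by Alice
# and q3 held by Bob.
import math

r = 1/math.sqrt(2)

# (i) Alice's unknown qubit |phi> = alpha|0> + beta|1> (arbitrary test values)
alpha, beta = 0.6, 0.8j
phi = [alpha, beta]

# (ii) The shared entangled pair |B0> on qubits 2 (Alice) and 3 (Bob)
bell0 = [r, 0, 0, r]

# (iii) The three-qubit product state of eq. (473)
state = [phi[q1]*bell0[2*q2 + q3]
         for q1 in (0, 1) for q2 in (0, 1) for q3 in (0, 1)]

# Assumed Bell-basis convention (matches eq. (474))
bells = {0: [r, 0, 0, r], 1: [0, r, r, 0], 2: [r, 0, 0, -r], 3: [0, r, -r, 0]}

# (iv) Project Alice's two qubits onto |B_i>; Bob's (unnormalised) qubit is
# bob[q3] = sum over (q1, q2) of conj(B_i[q1,q2]) * state[q1,q2,q3]
def bob_state_after(i):
    B = bells[i]
    return [sum(B[2*q1 + q2].conjugate()*state[4*q1 + 2*q2 + q3]
                for q1 in (0, 1) for q2 in (0, 1)) for q3 in (0, 1)]

# (v)-(vi) Bob's correction: I, X, Z or Y according to Alice's two classical bits
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
Y = [[0, -1j], [1j, 0]]
correction = {0: I2, 1: X, 2: Z, 3: Y}

for i in range(4):
    v = bob_state_after(i)
    norm = math.sqrt(sum(abs(c)**2 for c in v))
    v = [c/norm for c in v]          # each Bell outcome occurs with probability 1/4
    M = correction[i]
    w = [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]
    # (vii) Bob's qubit equals |phi> up to an irrelevant global phase
    overlap = phi[0].conjugate()*w[0] + phi[1].conjugate()*w[1]
    assert abs(abs(overlap) - 1) < 1e-12
```
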

