Why Are Complex Numbers Needed in Quantum Mechanics
Ricardo Karam∗
Department of Science Education, University of Copenhagen, Denmark
Abstract
Complex numbers are broadly used in physics, normally as a calculation tool that makes things
easier due to Euler’s formula. In the end, it is only the real component that has physical meaning or
the two parts (real and imaginary) are treated separately as real quantities. However, the situation
seems to be different in quantum mechanics, since the imaginary unit i appears explicitly in its
fundamental equations. From a learning perspective, this can create some challenges for newcomers.
In this article, four conceptually different justifications for the use/need of complex numbers in
quantum mechanics are presented and some pedagogical implications are discussed.
I. INTRODUCTION
Complex numbers were invented (or discovered?) in 16th-century Italy as a calculation
tool to solve cubic equations. In the beginning, not much attention was paid to their
meaning, since the imaginary terms should cancel out during the calculations and only (real)
roots were considered. Around 250 years later, complex numbers were given a geometrical
interpretation and have since become quite a useful tool in physics.1 One reason
is that complex numbers represent direction algebraically (2D vectors) and many of their
operations have a direct geometrical meaning (e.g., the product rule: multiply the norms
and add the angles).
It is particularly helpful to use complex numbers to model periodic phenomena, especially
to operate with phase differences. Mathematically, one can treat a physical quantity as being
complex, but attribute physical meaning only to its real part. Another possibility is to treat
the real and imaginary parts of a complex number as two related (real) physical quantities.
In both cases, the structure of complex numbers is useful for making calculations easier,
but no physical meaning is actually attached to complex variables.
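The classical use of complex numbers described above can be illustrated with a minimal sketch. It adds two oscillations with a phase difference by summing their complex amplitudes (phasors) and taking the real part only at the end. The function name `add_oscillations` and the numerical values are hypothetical choices made for illustration.

```python
import cmath
import math

def add_oscillations(a1, phi1, a2, phi2):
    """Return (amplitude, phase) of a1*cos(wt + phi1) + a2*cos(wt + phi2)."""
    # Each oscillation is the real part of a_k * exp(i*phi_k) * exp(i*w*t),
    # so the sum is governed by the sum of the two complex amplitudes.
    phasor = a1 * cmath.exp(1j * phi1) + a2 * cmath.exp(1j * phi2)
    return abs(phasor), cmath.phase(phasor)

# Two unit-amplitude oscillations that are 90 degrees out of phase.
amplitude, phase = add_oscillations(1.0, 0.0, 1.0, math.pi / 2)

# Cross-check against the direct real-valued sum at an arbitrary instant.
wt = 0.7
direct = math.cos(wt) + math.cos(wt + math.pi / 2)
via_phasor = amplitude * math.cos(wt + phase)

print(amplitude, phase, direct, via_phasor)
```

The complex amplitude only serves as bookkeeping here: the physically meaningful quantity is the real part, which is the point made in the paragraph above.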
Quantum mechanics seems to use complex numbers in a more fundamental way. It suffices
to look at some of the most basic equations, both in the matrix ($[\hat{p}, \hat{x}] = -i\hbar$) and wave
($i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\psi$) formulations, to wonder about the presence of the imaginary unit. What is
essentially different in quantum mechanics is that it deals with complex quantities (e.g. wave
functions and quantum state vectors) of a special kind, which cannot be split up into pure
real and imaginary parts that can be treated independently. Furthermore, physical meaning
is not attached directly to the complex quantities themselves, but to some other operation
that produces real numbers (e.g. the square modulus of the wave function or of the inner
product between state vectors).
This complex nature of quantum mechanical quantities puzzled some of the very founders
of the theory. Schrödinger, for instance, was bothered by the fact that his wave function
was complex and tried for quite some time to find physical interpretations for its (real)
component.2,3 It was probably not until Dirac formulated his bra-ket notation that it became
clearer that the complex quantities of quantum mechanics were of a different kind than the
ones commonly used in classical physics.4
It is therefore very reasonable to expect that introductory level students will face considerable
difficulties with the peculiar use of complex numbers in quantum mechanics. However,
it is not uncommon to find textbooks that postulate, from the get-go, that wave functions
and quantum state vectors are complex valued, without providing any sort of explanation for
why this is the case. It is not only natural, but even desirable, that students inquire about
the reasons why certain mathematical structures are useful to describe physical properties.
Thus, for newcomers to the field, one can say that it is perfectly legitimate to ask: “Why
do we need complex numbers in quantum mechanics?”.
There is certainly no unique straight answer to this question5–8; constructing plausible
arguments to address it is a matter of pedagogical creativity. The aim of this paper is to
present synthetic (and slightly modified) versions of four different justifications found in the
literature9 and discuss some of their pedagogical implications. The first two justifications
(IIA and IIB) align best with a position first approach, whereas the last two (IIC and
IID) with a spin first approach. The main selection criteria that guided this choice were
the plausibility of the arguments and their applicability for the very first lessons of an
introductory quantum mechanics course.
II. WHY ARE COMPLEX NUMBERS NEEDED IN QUANTUM MECHANICS?
A. Due to the uncertainty principle
This argument is found in R. Shankar’s textbook10 for the introductory level and goes
as follows. Consider one particle in one dimension and assume de Broglie’s matter waves
relation ($\lambda = \frac{2\pi\hbar}{p}$), which is previously justified by empirical evidence from the double-slit
experiment with electrons. Since there appears to be a wave associated with the electron, it
is reasonable to assume that there is a wave function $\psi(x)$ describing it (ignore time
dependence). Postulate that the absolute square of this function ($|\psi(x)|^2$) gives the probability
density, i.e., the likelihood to find the particle between the positions $x$ and $x + dx$.
The last piece needed for the argument is inspired by Heisenberg’s uncertainty principle.
It is presented qualitatively in Shankar’s textbook by thought experiments like the gamma-ray
microscope. One way to state the principle is to say that “it is impossible to prepare a
particle in a state in which its momentum and position (along one axis) are exactly known.”
Now, suppose the electron is in a state of definite momentum. What would the wave
function for this state look like? According to de Broglie’s relation, there is a wavelength
associated with the electron. Therefore, it seems plausible to assume a classical/real oscillating
wave function of the form
$$\psi_p(x) = A \cos\frac{2\pi x}{\lambda} = A \cos\frac{px}{\hbar}, \tag{1}$$
where $\psi_p(x)$ denotes a state of definite momentum. $\psi_p(x)$, as well as $|\psi_p(x)|^2$, are plotted
in Fig. 1.
FIG. 1. [Plots of $\psi_p(x)$ and $|\psi_p(x)|^2$.]
By looking at the graph of $|\psi_p(x)|^2$ (probability distribution) one realizes that it is in
contradiction with the uncertainty principle. If you know the momentum of the electron
exactly, you should have no information about its position. However, the graph of $|\psi_p(x)|^2$
is implying that there are regions where it is more likely to find the electron than others. If
the uncertainty principle were to be respected, Shankar argues, the probability distribution
should be “flat”, meaning that it is equally likely to find it anywhere. Thus, one cannot
$$\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}. \tag{3}$$
Can we allow the solutions to this equation to be complex valued? In other words, what
are the consequences of writing the solution as y(x, t) = A(x, t) + iB(x, t)? By substituting
this complex function in (3), and setting the real and imaginary parts to be respectively
equal, we get
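$$\frac{\partial^2 A}{\partial t^2} = c^2 \frac{\partial^2 A}{\partial x^2} \quad \text{and} \quad \frac{\partial^2 B}{\partial t^2} = c^2 \frac{\partial^2 B}{\partial x^2}, \tag{4}$$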
which means that the functions A and B can be chosen freely, as long as they both satisfy
the wave equation. Due to its nice mathematical properties that make calculations easier,
we often use the complex exponential $e^{i(kx-\omega t)}$ to represent oscillating travelling waves, but
attach physical meaning only to its real part. As we can see from Euler’s formula, both
A(x, t) = cos (kx − ωt) and B(x, t) = sin (kx − ωt) are solutions to the wave equation, but
given A(x, t), B(x, t) could be another function that satisfies the same equation.
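The independence of the real and imaginary parts in the classical case can be checked numerically. The sketch below is only an illustration: the wave speed, the wavenumber, the evaluation point, and the helper names are hypothetical. It uses finite differences to confirm that a cosine wave and an unrelated Gaussian pulse $f(x - ct)$ each satisfy the classical wave equation, so the pair could serve as $A$ and $B$ without any relation between them.

```python
import math

C = 2.0          # wave speed (arbitrary illustrative value)
K = 1.5          # wavenumber of the cosine wave
OMEGA = C * K    # dispersion relation of the classical wave equation

def second_derivative(f, u, h=1e-3):
    """Central finite-difference estimate of f''(u)."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / h**2

def wave_residual(y, x, t):
    """Return y_tt - c^2 * y_xx for a real function y(x, t)."""
    y_tt = second_derivative(lambda s: y(x, s), t)
    y_xx = second_derivative(lambda s: y(s, t), x)
    return y_tt - C**2 * y_xx

def cosine_wave(x, t):
    # Candidate for A(x, t).
    return math.cos(K * x - OMEGA * t)

def gaussian_pulse(x, t):
    # Candidate for B(x, t): any f(x - c*t) solves the wave equation,
    # and this one has nothing to do with cosine_wave.
    return math.exp(-(x - C * t) ** 2)

x0, t0 = 0.3, 0.2
residual_A = wave_residual(cosine_wave, x0, t0)
residual_B = wave_residual(gaussian_pulse, x0, t0)

# Both residuals vanish up to finite-difference error.
print(residual_A, residual_B)
```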
Now let us try to do the same with the Schrödinger equation. Consider, for simplicity,
the equation of a free particle in one dimension13
$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2}. \tag{5}$$
Substituting ψ(x, t) = A(x, t) + iB(x, t) in (5) and setting the real and imaginary parts
to be respectively equal, we obtain the following conditions
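$$\frac{\partial A}{\partial t} = -\frac{\hbar}{2m} \frac{\partial^2 B}{\partial x^2} \quad \text{and} \quad \frac{\partial B}{\partial t} = \frac{\hbar}{2m} \frac{\partial^2 A}{\partial x^2}. \tag{6}$$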
Now we no longer have the freedom to choose the functions A(x, t) and B(x, t) arbitrarily
since they are coupled by the relations (6). This implies that we need two functions to
describe the physical situation and cannot assign meaning to just one of them, as was
the case with classical waves. This will generally happen when the imaginary unit appears
explicitly in the differential equation, as in Schrödinger’s. Another way to justify the need
for two (real) functions is to see how they both appear in the definition of probability density
$P = \psi^* \psi = A^2 + B^2$.
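The coupling between the real and imaginary parts can be made concrete with a small numerical sketch. It assumes, for illustration only, units with $\hbar = m = 1$ and the free-particle plane wave $\psi = e^{i(kx - \omega t)}$ with $\omega = \hbar k^2 / 2m$, so that $A = \cos(kx - \omega t)$ and $B = \sin(kx - \omega t)$. The helper names and the evaluation point are hypothetical.

```python
import math

HBAR = 1.0                           # illustrative units
MASS = 1.0
K = 2.0
OMEGA = HBAR * K**2 / (2.0 * MASS)   # free-particle dispersion relation

def A(x, t):
    return math.cos(K * x - OMEGA * t)

def B(x, t):
    return math.sin(K * x - OMEGA * t)

def d_dt(f, x, t, h=1e-4):
    """Central finite-difference estimate of the time derivative."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)

def d2_dx2(f, x, t, h=1e-3):
    """Central finite-difference estimate of the second space derivative."""
    return (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / h**2

x0, t0 = 0.4, 0.1
coef = HBAR / (2.0 * MASS)

# Residuals of the two coupled conditions: both vanish for this pair.
res_first = d_dt(A, x0, t0) + coef * d2_dx2(B, x0, t0)
res_second = d_dt(B, x0, t0) - coef * d2_dx2(A, x0, t0)

# If B were dropped (B = 0), the first condition would demand dA/dt = 0,
# which the cosine alone does not satisfy.
res_real_only = d_dt(A, x0, t0)

probability_density = A(x0, t0) ** 2 + B(x0, t0) ** 2

print(res_first, res_second, res_real_only, probability_density)
```

In this example the cosine cannot stand on its own as a solution, whereas the pair $(A, B)$ satisfies both conditions and yields a position-independent probability density.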
But then another question arises immediately: Why is there an i in the time-dependent
Schrödinger equation? One answer can be found in Schrödinger’s original derivation of his
time-dependent equation, which consists in eliminating the energy parameter E from his
time-independent equation14
$$\frac{\partial^2 \psi}{\partial x^2} + \frac{2m}{\hbar^2}(E - V)\psi = 0. \tag{7}$$
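The derivation involves assuming the Planck-Einstein relation $\omega = E/\hbar$ and using
separation of variables to represent $\psi(x, t) = \Psi(x) \cdot e^{i\frac{E}{\hbar}t}$. In order to eliminate the energy
parameter, we differentiate $\psi(x, t)$ with respect to time
$$\frac{\partial \psi}{\partial t} = i\frac{E}{\hbar}\psi, \tag{8}$$
and substitute the resulting expression for $E\psi$ into Eq. (7) to obtain
$$\frac{\partial^2 \psi}{\partial x^2} - i\frac{2m}{\hbar}\frac{\partial \psi}{\partial t} - \frac{2m}{\hbar^2}V\psi = 0, \tag{9}$$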
which, for a free particle, is exactly Eq. (5). Thus, one can say that the imaginary unit i
appeared explicitly in the equation because a time periodic function was represented by a
complex exponential. This enabled the isolation of the energy parameter already after the
first time derivative.16
One nice thing about this justification is that it emphasizes formal differences between the
classical wave equation and Schrödinger’s equation, and is therefore closer to Schrödinger’s
original reasoning. For some people this can actually be a drawback, since it puts too much
emphasis on the wave formalism, which can hinder a more authentic understanding of mod-
ern quantum mechanics. Another interesting aspect is that it highlights the relation between
a complex function and two coupled real functions, implying that in quantum mechanics
one needs two real functions to describe the dynamics of a system. As in the previous case,
here we are also led to wonder about the implications of certain formal differences between
a cosine and a complex exponential. One negative aspect of this justification is that if
one truly wants to explore the consequences of a real wave function which is the solution
of a fourth order differential equation (see note 16), then the explanation can become too
demanding/distracting and not worth the trade-off from a pedagogical point of view.
C. $S_x$, $S_y$ and $S_z$ in sequential Stern-Gerlach experiments
This justification appears in some textbooks that adopt the so-called spin first approach
(e.g. Refs. 18–20) and therefore differs fundamentally from the two previous ones. Sequential
Stern-Gerlach (SG) experiments are presented to motivate the need for a departure from
classical mechanics and to build the new formalism of quantum states. The situation illus-
trated in Fig. 3 is used to highlight some peculiarities of quantum mechanical systems, since
it is rather counterintuitive that a component of Sz in the negative (↓) direction “reappears”
after the beam emerges from the third apparatus. The reason is that one cannot determine
Sz and Sx simultaneously. In other words, they are incompatible observables and the selec-
tion of the component of Sx in the positive (↑) direction by the second apparatus destroys
any previous information about $S_z$.18
FIG. 3. Sequential SG experiments showing the counterintuitive result that particles with a negative
$S_z$ “reappear” after the beam emerges from the third apparatus, although they were previously
In his initial discussion of sequential SG experiments, Sakurai makes use of a formal
analogy with light polarization to find a solution to this problem. He compares the
decomposition of the $S_x$ states with the mathematical expression of linearly polarized light in the
xy-plane and inclined 45° with the x-direction. In order to represent the $S_y$ states in a different
way, Sakurai argues, one can refer to the mathematical expression of circularly polarized
light, which is clearly a distinct physical situation. Because the perpendicular components of
circularly polarized light are out of phase by 90°, complex numbers appear to represent this
phase difference. Although the analogy is meaningful from a formal perspective, it seems
confusing if one tries to make some kind of physical association between the two situations.
McIntyre19 and Townsend20 do not refer to the analogy with polarization, but introduce
directly the formalism of quantum state vectors and the probabilistic interpretation of the
square modulus of the inner product between two state vectors. This enables them to treat
more formally the issue of distinguishing the $S_x$ from the $S_y$ states and to motivate the need
for complex numbers. These states are first represented in the $S_z$ basis as follows:
$$
\begin{aligned}
|+\rangle_x &= a\,|+\rangle + b\,|-\rangle, & |+\rangle_y &= e\,|+\rangle + f\,|-\rangle, \\
|-\rangle_x &= c\,|+\rangle + d\,|-\rangle, & |-\rangle_y &= g\,|+\rangle + h\,|-\rangle.
\end{aligned} \tag{10}
$$
nough information about how the states + x and - approach x behaveinstead mathematically.
of the Dirac bra-ket
Rather, we will use
approach. approach
22
where
are using we firstthehave
(at0 +
the 90 + 9 0Thus,
used
top). and
(1.60) the0 representation
the
- 9 basis -ornew9thebasis symbol ofor Wethe
the 0 cannot
- 9 x" Szsay
state to
basis.
in Eq. signify
theWe
(1.36) “is
1cannot
1is equals 1 represented
1 column say vector, that the by,”ket and it (1.60)
equals is understo
the colu
these kets1instead of the Dirac bra-ket approach.
are using the and Sz basis. that ket the
Note that the imaginary components of!"" are 1required. would
They are benot different
merely Note even
that
a mathemati- the though
imaginary the
components vector 0+
of 0 is 99unchanged.
these
- x "
xy ! kets area required. b¢. We ≤complex
,Theyneed are to not have
merely aa mathemati-
convention fo
he results of the experiment to determine these states. Recalling that the experimental results would 0 - 9xy ! a b. are
because using
because thethe
the ketketis0 an+ is9 and
an
abstract 0
abstract
vector - 9 inbasis
thevectorstateor space thein and theS thebasis. state
column We
space
vector iscannot
and
just i two the say column that
num- the ket
vector is
equalsjust the
two colu
com
!""
!""x
0 9 22 - 1
Consider
25 Experiment 2 in the case where the secondConsider
cal convenience as one sees in classical Stern-Gerlach
mechanics. 22 In-
25 that the
i Experiment
general,
ing analyzer
quantum
of
bers. the
If we 2
mechanical in
istothe
amplitudes
were aligned
choosecase
cal convenience
state vectors
in
a different along
where
the as one
column
basis0 !""
- 9for the second
sees
x "representing
1
in 1
classical
vector. z
¢ ≤ the vector,
d +Stern-Gerlach
mechanics.
The , 25 22
standard
then
In general,
the complex convention analyzer
quantum
(1.48)
coefficients isistoaligned
mechanical state
put vectors
the along
spin up
e the same if the first analyzer prepared the system to the !""
be in the - state [see
y-axis. We said before thatcomplex. Fig. 1.11(b)],
the
Noteresults
that
Onthe
we 0 9
have
have complex coefficients.!"" But this does not mean
are the
theimaginary
contrary, same
components
we always thecalculate
ofas inkets
y-axis.
these atheareWe case
required.
measurement said
bers.
results
because
shown
first
They
of
before
would
If be
(at
are
probability notin
we
physical
the
the
were
Fig.
that
different
merely
using
measurements
ket
top).
is
have
the
1.4.
toan
even
Note
a acomplex Thus,
mathemati-
complex.
choose
abstract
complex
results
Thus,
though
that
square,
On
the
are
the the
the
a different
we vector
coefficients.
are the
vector
imaginary
representation
contrary,
is unchanged.
components
we50
But
22 in
same
always
basis
the
-1
this
We
does
of
of
d
state
asthe
need
calculate
for
0 -not
these
9 mean
in 0 athe
to
representing
space
have
kets a
are
that
case
9 xIn“is and
convention
-measurement required.
state
the theresults
shownfor
the
They
inprobability
Eq.the
ofvector,
column physical
order-
are inusing
not
(1.36) Fig.
then
vector
merely
is
is
measurements the
1.4. Thus,
a
a complex
just
mathemati-
complex
square,
two
are
wecom
would
where
real. bers. ing If be
we wedifferent
have
were used
to even the
choose though
new a one symbol
different the invector "convention
basis to ismechanics.
for unchanged.
signify representing We
therepresented need toby,”have astate
and convention
itvectors
is understo fo
are real. the vector, then the complex
of the amplitudes cal in the column
convenience vector.
as The
sees standard classical is to put general, spin up amplitude
quantum mechanical
our results for the two experiments: 50 have calquantum
so all convenience as one sees
mechanics 50in classical
predictions
have mechanics. In are
of probabilities general, quantum mechanical state
so
vectors
all quantum mechanics predictions of probabilities
ing of the 0 + 9 and
amplitudes 0 -the
in 9 basis
thecolumn x vector. of the 0 -But 9 state
is The standard convention
x-axis.theisket
to equals
aput the the
spincolu
up
areofusing first (at the top). Thus, havethecomplex
representation coefficients. this indoes Eq. (1.36) not mean is that the results of physical measurements are
be the
have complex coefficients. But this does not mean that the results physical
FIGURE 1.4 measurements
Experiment 2are
measures spin or the
component basis.
Szalong We
the z-axis cannot
and say
then along thethat
FIGURE
FIGURE 1.4 Experiment 2 measures the spin component along the z-axis and then along 1.4 Experiment
the x-axis. 2 measures
On thethe spin component
we alwaysalong the z-axis would
and then along the x-axis. different evenOn though the
we vector unchanged.
11 0a-
Wed 0need to
+ 9(1.36) have convention fo
2
complex. contrary,
P1, + ymechanics + + of =
= y predictions
2calculate
1 a measurement probability
are real. because
using
first (at thethetop).
complex.
a complex square,
ket soisThus, Pin
the contrary,
@ 8 @ 9@
themechanics
anquantum
abstract representation
vector
always2calculate
+¢ 0in dof
+≤9the the
state
9 2 space 9 x1are
measurement
state
and
probability
in Eq.
real.the column
using a complex
vector
square,
is is just two com @ 8 @ 9@
P 1
= 0 8+ 0 +9 0 = so all quantum probabilities
ing of the amplitudes y0 - 9=
1, +the column 1-
y 1predictions
vector."0 +=
The ¢
standard ≤ convention ,
0 - 9vector, then the complexup
is to put the spin
2 all x of probabilities
, (1.48)
1, + x x 2 bers. If we were to choose a xdifferent "
22 -1 basis d 0 - 9for22 -1 d the
representing
2 first (atbethedifferent
top). Thus, the representation 2of the 11 0 - 9 x1stateWe din Eq. + 9(1.36)
0need is a convention fo
1, - x x P
2 1
= 0 8 - 0 +9 0 =
1.4 " GENERAL QUANTUM SYSTEMS
P 1, - y = y - + = 1
2
would @ 8 @ 9@ even
1.4 " GENERAL QUANTUM P though
1, - y = the- vector
+ is
0 - 9x " 2 ¢ ≤
ySYSTEMS = unchanged. @ 8 @ 9@
,
to have
2
(1.28) ing of the amplitudes in(1.56) the column vector. The -1
22 standard d 0 -9
convention is to put the(1.56) spin up
2
The machinery we have developed for spin-1/2
2 systems can be generalized to other quantum systems. 2 11 0 - 9 x1 d 0 + 9(1.36)
@ y8 + @ - 9 @quantized 8 +A 0yields
@ --for99xspin-1/2
1.4 GENERAL QUANTUM SYSTEMS state in Eq.
first (at the top). Thus, the representation @ of to other is
"=thesystems
1
P2, + x = 0 x8 + 0 - 9 0 = 1
"
+ y = @ ySYSTEMS
For example,Pif an observable rangePof
2, + ythe= = measurement results an for1.4 "The machinery we
n, have developed can
2
A yields some finite
GENERAL
depiction of a 2Stern-Gerlach measurement to a measurement
QUANTUM
2, 2 ¢ ≤be generalized ,for quantum systems.
then we generalize schematic For example, ofif
the
an observable quantized -1
measurement d 0 -
results 9
a some finite range of n,
2 of 22
The machinery we have developed for spin-1/2 systems can be generalized to other quantum systems. n
2 2 then we generalize the schematic
developeddepiction asystems
11Stern-Gerlach d 0 + 9 to other
measurement
1 be generalized to a measurement of the
P2, - x = 0 x8 - 0 - 9 0 = 12 .
Therange
machinery
of n, we have for spin-1/2 can quantum systems.
then we generalize @ y8 - @ depiction
-9 @ =
For example, if an observable A yields quantized
P2, - y the=schematic 1 measurement results a for some finite
of a 2 ,
Stern-Gerlach For example,
measurement to a measurement P2,
of the if an = @ y8 -A 0yields
n
- yobservable @ --99x @ quantized
"= 2 ,measurement
¢ ≤ results an for , some finite range of n,
then we generalize the schematic depiction of 22
a Stern-Gerlach d 0 - 9 to a measurement of the
- 1 measurement
Because the kets 0 + 9 and 0 - 9 form a complete basis,
as depicted in the histograms of Fig. 1.14. 0 + 9 x as depicted in the histograms of Fig. 1.14.
(a) theThese kets describing the Sx measurement,
(b)in" 9 y corresponding kets(c)
!a1"
\[
|a|^2 = |b|^2 = |c|^2 = |d|^2 = \frac{1}{2}. \tag{12}
\]
Thus, we know the absolute square of each coefficient, but their phase is indeterminate.
Since the absolute phase is not physically measurable, one coefficient of each vector of the $S_x$
states can be chosen to be real without loss of generality,24 so that we can take $a = c = 1/\sqrt{2}$.
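Both state vectors ($|+\rangle_x$ and $|-\rangle_x$) are already normalized, but not necessarily orthogonal.
Imposing the latter condition we obtain
\[
\begin{aligned}
\left|{}_x\langle -|+\rangle_x\right| &= 0 \\
\left(\frac{1}{\sqrt{2}}\langle +| + d^{*}\langle -|\right)\left(\frac{1}{\sqrt{2}}|+\rangle + b\,|-\rangle\right) &= 0 \\
d^{*} b &= -\frac{1}{2}.
\end{aligned} \tag{13}
\]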
Since our goal is to try to use only real numbers, we are free to choose $b = 1/\sqrt{2}$ and
$d = -1/\sqrt{2}$, which enables us to represent the $S_x$ states (in matrix notation)25 as
\[
|+\rangle_x \doteq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}
\quad\text{and}\quad
|-\rangle_x \doteq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \tag{14}
\]
showing that we managed to represent the $S_x$ states in the $S_z$ basis without using complex
numbers. So far, so good.
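The real-valued representation in (14) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument; the helper names `inner` and `prob` are illustrative assumptions, and kets are stored as pairs of amplitudes in the $S_z$ basis.

```python
import math

# Kets are stored as (amplitude on |+>, amplitude on |->) in the S_z basis.
s = 1 / math.sqrt(2)
plus_x = (s, s)       # |+>_x from Eq. (14)
minus_x = (s, -s)     # |->_x from Eq. (14)
plus_z = (1.0, 0.0)
minus_z = (0.0, 1.0)


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))


def prob(bra, ket):
    """Measurement probability |<bra|ket>|^2."""
    return abs(inner(bra, ket)) ** 2


# Orthogonality condition of Eq. (13).
overlap_x = inner(minus_x, plus_x)
# Normalization of both S_x states.
norms_x = [prob(plus_x, plus_x), prob(minus_x, minus_x)]
# Probabilities of each S_x outcome for each S_z input state, cf. Eq. (12).
probs_zx = [prob(x, z) for z in (plus_z, minus_z) for x in (plus_x, minus_x)]

print("x<-|+>x =", overlap_x)
print("norms   =", norms_x)
print("P(S_x outcome | S_z input) =", probs_zx)
```

Up to floating-point rounding, the overlap vanishes, both norms equal 1, and all four probabilities equal 1/2, so real coefficients suffice at this stage.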
Now let us try to do the same for the $S_y$ states. From Fig. 4(b) we arrive at relations similar
to (12), i.e., $|e|^2 = |f|^2 = |g|^2 = |h|^2 = \frac{1}{2}$, and make the same initial assumption that
$e = g = 1/\sqrt{2}$. Orthogonality of the $S_y$ states gives us, just as in (13), $f^{*} h = -\frac{1}{2}$. In order
to avoid complex numbers and to distinguish the $S_y$ from the $S_x$ states, we are tempted to
choose $f = -1/\sqrt{2}$ and $h = 1/\sqrt{2}$.
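However, this would be in contradiction with the situation represented by Fig. 4(c). Taking,
for instance, the first line,
\[
\begin{aligned}
\left|{}_y\langle +|+\rangle_x\right|^2 &= \frac{1}{2} \\
\left|\left(\frac{1}{\sqrt{2}}\langle +| + f^{*}\langle -|\right)\frac{1}{\sqrt{2}}\left(|+\rangle + |-\rangle\right)\right|^2 &= \frac{1}{2} \\
\left|\frac{1}{2} + \frac{f^{*}}{\sqrt{2}}\right|^2 &= \frac{1}{2},
\end{aligned} \tag{15}
\]
we see that neither real choice works: $f = -1/\sqrt{2}$ gives 0 on the left-hand side and
$f = +1/\sqrt{2}$ gives 1. Writing instead $f = e^{i\alpha}/\sqrt{2}$, which still satisfies $|f|^2 = \frac{1}{2}$, we obtain
\[
\begin{aligned}
\left(\frac{1}{2} + \frac{e^{-i\alpha}}{2}\right)\left(\frac{1}{2} + \frac{e^{i\alpha}}{2}\right) &= \frac{1}{2} \\
\frac{1}{4} + \frac{1}{4}e^{-i\alpha} + \frac{1}{4}e^{i\alpha} + \frac{1}{4} &= \frac{1}{2} \\
\cos\alpha &= 0 \\
\alpha &= \pm\frac{\pi}{2}.
\end{aligned} \tag{16}
\]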
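Choosing $\alpha = \frac{\pi}{2}$ gives $f = i/\sqrt{2}$ and, since $f^{*} h = -\frac{1}{2}$, $h = -i/\sqrt{2}$. These coefficients
also verify all the relations of Fig. 4(c). Thus, we can represent the $S_y$ states in the $S_z$ basis as
\[
|+\rangle_y \doteq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}
\quad\text{and}\quad
|-\rangle_y \doteq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}. \tag{17}
\]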
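The failure of the real choice and the success of the imaginary one can be compared side by side. The sketch below is an illustration under stated assumptions: the function name `fig4c_probs` is hypothetical, and $h$ is fixed from $f$ through the orthogonality relation $f^{*}h = -\frac{1}{2}$ used in the text.

```python
import math

s = 1 / math.sqrt(2)
plus_x, minus_x = (s, s), (s, -s)   # S_x states from Eq. (14), S_z basis


def prob(bra, ket):
    """|<bra|ket>|^2, with the bra components complex-conjugated."""
    amp = sum(complex(b).conjugate() * k for b, k in zip(bra, ket))
    return abs(amp) ** 2


def fig4c_probs(f):
    """Return the four probabilities |y<+-|+->x|^2 for a trial coefficient f.

    Assumes |+>_y = (1/sqrt(2), f) and |->_y = (1/sqrt(2), h), where h is
    fixed by the orthogonality relation f* h = -1/2.
    """
    f = complex(f)
    h = -1 / (2 * f.conjugate())
    plus_y, minus_y = (s, f), (s, h)
    return [prob(y, x) for y in (plus_y, minus_y) for x in (plus_x, minus_x)]


real_choice = fig4c_probs(-s)          # tempting real choice f = -1/sqrt(2)
complex_choice = fig4c_probs(1j * s)   # f = i/sqrt(2), i.e. alpha = pi/2

print("real f   :", real_choice)
print("complex f:", complex_choice)
```

With the real choice the $S_y$ states simply coincide with the $S_x$ states, so the probabilities come out as 0 or 1 instead of 1/2; with $f = i/\sqrt{2}$ all four equal 1/2 up to rounding, as required by Fig. 4(c).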
The appearance of i’s [in (17)] is one of the key ingredients of a description
of nature by quantum mechanics. Whereas in classical physics we often use
complex numbers as an aid to do calculations, there they are not essential.
The straightforward Stern-Gerlach experiments we have outlined [...] demand
complex numbers for their explanation.
This example shows that although it is possible to write the $S_x$ states in the $S_z$ basis
without complex numbers, they are unavoidable when describing the $S_y$ states in the same basis.
A more general way to put it is to say that complex numbers are indispensable for describing
(two-state) systems with more than two incompatible observables.
This type of reason is analogous to other more mathematical arguments where complex
numbers appear due to a necessity of some kind of generalization/expansion of our notion
of number, e.g., in connection with the fundamental theorem of algebra. It is also
related to the first geometrical interpretation of complex numbers, where they appear as
a necessity to fully represent direction in the plane.21 One aspect that can be criticized in
this approach is the fact that the coefficients of quantum state vectors are assumed to be
complex numbers from the beginning, and no justification for that is usually offered.
The fourth justification presented here is based on the last of Lucien Hardy’s “five reasonable
axioms” from which quantum theory can be deduced22 and has been presented in an
insightful paper written by Artur Ekert.23 It starts with the possibility of building quantum
mechanics entirely from classical probability in order to see where this leads. The initial
assumption is that a quantum state is described by a column vector in which each term
represents the probability of the system to be measured in a given configuration. For instance,
in a two-state system like the spin 1/2 discussed in the previous section, such a vector could
look like
\[
\vec{p}_0 = \begin{pmatrix} 0.3 \\ 0.7 \end{pmatrix}, \tag{18}
\]
meaning that it has 30% probability to be measured with its spin up and 70% down.
In such a formalism, transitions are represented by stochastic matrices, whose elements
are non-negative real numbers and whose values in each column add up to 1. One random
example of a transition ($T_1$) applied to $\vec{p}_0$ is
\[
T_1 \vec{p}_0 = \begin{pmatrix} 0.2 & 0.4 \\ 0.8 & 0.6 \end{pmatrix}
\begin{pmatrix} 0.3 \\ 0.7 \end{pmatrix}
= \begin{pmatrix} 0.34 \\ 0.66 \end{pmatrix}, \tag{19}
\]
meaning that after this transition there is 34% probability for spin up and 66% for spin
down. The elements of the stochastic matrices represent the probabilities for the system to
either remain (main diagonal) or swap states (antidiagonal). Note that after the transition
the probabilities still add up to 1.
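The transition in (19) is an ordinary matrix-vector product. As a small sketch (the helper names `apply` and `is_stochastic` are illustrative, not taken from the paper), one can reproduce the numbers and check the two defining properties of a column-stochastic matrix:

```python
def apply(T, p):
    """Matrix-vector product T p for a matrix given as a list of rows."""
    return [sum(T[i][j] * p[j] for j in range(len(p))) for i in range(len(T))]


def is_stochastic(T, tol=1e-12):
    """True if all entries are real and non-negative and each column sums to 1."""
    entries_ok = all(
        not isinstance(x, complex) and x >= 0 for row in T for x in row
    )
    columns_ok = all(
        abs(sum(row[j] for row in T) - 1) < tol for j in range(len(T[0]))
    )
    return entries_ok and columns_ok


p0 = [0.3, 0.7]                  # Eq. (18): 30% spin up, 70% spin down
T1 = [[0.2, 0.4], [0.8, 0.6]]    # transition matrix of Eq. (19)

p1 = apply(T1, p0)
print("T1 is stochastic:", is_stochastic(T1))
print("p1 =", p1, "sum =", sum(p1))
```

The result is approximately (0.34, 0.66), up to floating-point rounding, and the entries still add up to 1.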
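When one thinks about time evolution, a physically desirable requirement is that these
transitions are continuous. In other words, it should be possible to view any transition as a
sequence of independent transitions over shorter periods of time. Among other things, this
means that one should be able to extract square or cube roots (in fact any nth root) of any
transition and the result should be another stochastic matrix. Let us take, for instance, one
square root of $T_1$:
\[
(T_1)^{1/2} = \frac{1}{3}\begin{pmatrix}
1 + \frac{2}{\sqrt{5}}\,i & 1 - \frac{1}{\sqrt{5}}\,i \\
2 - \frac{2}{\sqrt{5}}\,i & 2 + \frac{1}{\sqrt{5}}\,i
\end{pmatrix},
\]
\[
T_1 = \frac{1}{3}\begin{pmatrix}
1 + \frac{2}{\sqrt{5}}\,i & 1 - \frac{1}{\sqrt{5}}\,i \\
2 - \frac{2}{\sqrt{5}}\,i & 2 + \frac{1}{\sqrt{5}}\,i
\end{pmatrix}
\frac{1}{3}\begin{pmatrix}
1 + \frac{2}{\sqrt{5}}\,i & 1 - \frac{1}{\sqrt{5}}\,i \\
2 - \frac{2}{\sqrt{5}}\,i & 2 + \frac{1}{\sqrt{5}}\,i
\end{pmatrix}. \tag{20}
\]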
One can easily see that our initial goal was not achieved, i.e., it was not possible to write
the stochastic matrix $T_1$ as the product of two (equal) stochastic matrices. The elements of
$(T_1)^{1/2}$ are complex numbers with non-zero imaginary parts.26
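Equation (20) can be verified directly with Python's built-in complex numbers. This is only a numerical check of the matrix quoted above, not a general square-root algorithm; the helper `matmul` and the variable names are illustrative.

```python
import math

r5 = math.sqrt(5)
T1 = [[0.2, 0.4], [0.8, 0.6]]

# Candidate square root from Eq. (20); every entry has an imaginary part.
root = [
    [(1 + 2j / r5) / 3, (1 - 1j / r5) / 3],
    [(2 - 2j / r5) / 3, (2 + 1j / r5) / 3],
]


def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


square = matmul(root, root)
max_error = max(abs(square[i][j] - T1[i][j]) for i in range(2) for j in range(2))

# Negative determinant: det(B)^2 = det(T1) < 0 rules out any real square root B.
det_T1 = T1[0][0] * T1[1][1] - T1[0][1] * T1[1][0]

# The columns of the root still sum to 1, but its entries are not probabilities.
column_sums = [root[0][j] + root[1][j] for j in range(2)]
all_complex = all(abs(x.imag) > 1e-12 for row in root for x in row)

print("max |root^2 - T1| =", max_error)
print("det(T1) =", det_T1)
print("column sums of root =", column_sums)
print("all entries have non-zero imaginary part:", all_complex)
```

The determinant check reflects the remark in footnote 26: since $\det(T_1) < 0$, the determinant of any square root must be purely imaginary, so no real (and hence no stochastic) square root exists.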
This example is enough to show that if one requires continuity of transitions, real numbers
are not sufficient. The attempt to describe quantum mechanical systems with classical
probability failed and complex numbers appeared inevitably. Like the first justification, this one
also has the structure of a reductio ad absurdum. Another advantage is that one counterexample
is sufficient to show the need for a more general set of numbers. This justification can also
lead to arguments related to the need for a unitary time evolution. A possible drawback is
that this explanation can be too mathematical and may not make much sense to newcomers
who are not used to a probabilistic description of the physical world. Another critical remark
is that, although it shows that real numbers are not sufficient, it does not fully prove
that complex numbers are enough for a complete quantum mechanical formalism.
III. CONCLUSIONS
Complex numbers seem to be fundamental for the description of the world proposed by
quantum mechanics. In principle, this can be a source of puzzlement: Why do we need such
abstract entities to describe real things?
One way to dispel this bewilderment is to stress that what we can measure is essentially
real, so complex numbers are not directly related to observable quantities. A more
philosophical argument is to say that real numbers are no less abstract than complex ones; the
actual question is why mathematics is so effective for the description of the physical world.28
When comparing the answers presented here, it is quite interesting to see how they differ
from one another. Four apparently different physical principles/properties are used to justify
the need for complex numbers, namely (II.A): the impossibility of having information on
position when momentum is exactly known; (II.B): the fact that i appears explicitly in the
Schrödinger equation; (II.C): the descriptions of $S_x$, $S_y$ and $S_z$ in sequential Stern-Gerlach
experiments; and (II.D): the demand for continuous transitions. For courses adopting a
position-first framework, A and B should be more appropriate, whereas C and D are more
suitable for those that choose a spin-first one. This diversity of explanations reflects a historical
(and current) characteristic trait of quantum mechanics, which is the possibility of multiple
theoretical approaches.
It is reasonable to assume that people with different backgrounds will have different
preferences. But perhaps the question “Which is THE best justification?” is not the most
appropriate. One learns something from each justification and a deep understanding of
the matter is probably related to the establishment of different connections. For a physics
instructor, one can conjecture that “the more the better”, as it seems plausible to assume
that one important teaching competence is to possess a broad repertoire of explanations.
ACKNOWLEDGMENTS
I would like to thank the many colleagues and friends who commented on previous
versions of this paper and/or with whom I had the privilege to discuss this intriguing
topic. Among them are Elina Palmgren, Ismo Koponen, Tommi Kokkonen, Giacomo Zuc-
carini, Magnus Bøe, Carlos Eduardo Aguiar, Nelson Studart, Débora Coimbra, Helge Kragh,
Christian Joas, Thiago Hartz, Ramon Lopez, Paul van Kampen, Mossy Kelly, Núria Munoz
Garganté, Robert Lambourne, John Stack and R. Shankar. I am also indebted to the re-
viewers for their valuable comments and suggestions.
∗ [email protected]
1 Salomon Bochner, “The significance of some basic mathematical conceptions for physics,” Isis
54(2), 179-205 (1963).
2 This struggle is well documented in Schrödinger’s letters and papers. Here we give two examples.
In a letter to Lorentz on June 6, 1926, Schrödinger wrote: “What is unpleasant here, and
indeed directly to be objected to, is the use of complex numbers. ψ is surely fundamentally a
real function”. The last paragraph of his fourth communication to the Annalen der Physik also
illustrates this feeling of puzzlement and frustration: “Meantime, there is no doubt a certain
crudeness in the use of a complex wave function.”
3 C.N. Yang, “Square root of minus one, complex phases and Erwin Schrödinger,” in Schrödinger:
Centenary celebration of a polymath, edited by C.W. Kilmister (Cambridge University Press,
Cambridge, 1987), pp. 53-64.
4 In fact, Dirac is quite explicit about that in the fourth edition of his celebrated book The
Principles of Quantum Mechanics (p. 20, our emphasis): “Our bra and ket vectors are complex
quantities, since they can be multiplied by complex numbers and are then of the same nature
as before, but they are complex quantities of a special kind which cannot be split up into real
and pure imaginary parts. The usual method of getting the real part of a complex quantity, by
taking half the sum of the quantity itself and its conjugate, cannot be applied since a bra and
a ket vector are of different natures and cannot be added.”
5 It is worth mentioning that we are scratching the surface of a much deeper question that has been
addressed by many serious researchers on mathematical foundations of quantum mechanics (e.g.
Ref. 6). Among them, the work of Stueckelberg (Ref. 7) stands out as a plausible formulation
of quantum mechanics in real Hilbert space (see Ref. 8 for a pedagogical introduction). The
interested reader will also find numerous discussions about the topic in internet forums, such as
https://physics.stackexchange.com/questions/32422/qm-without-complex-numbers.
6 Felix M. Lev, “Why is quantum physics based on complex numbers?” Finite Fields and Their
Applications, 12, 336 (2006).
7 Ernst C. G. Stueckelberg, “Quantum Theory in Real Hilbert Space.” Helv. Phys. Acta 33, 727
(1960).
8 Jan Myrheim, “Quantum Mechanics on a Real Hilbert Space.” arXiv:quant-ph/9905037.
9 A broad literature review shows very few attempts to explain the need for complex numbers
in quantum mechanics, both in introductory level textbooks or physics education journals. An
exception is the paper “Why i ?” (W. E. Baylis et al., Am J Phys 60 (9), 788, 1992), which
actually deals with the question of why complex numbers are useful for physics in general. This
paper is based on the geometric algebra by David Hestenes, who proposes a new mathematical
formalism for physics (Am J Phys 71 (2), 104, 2003). Because of its geometrical interpretation,
the reasons for “Why i ?” in quantum mechanics enabled by this formalism are very convincing
and it is definitely worthwhile pursuing this project. However, here we are concerned with
physics instructors who use more standard approaches.
10 Ramamurti Shankar, Fundamentals of Physics II: Electromagnetism, Optics, and Quantum
Mechanics (Yale University Press, New Haven, 2016).
11 If one takes the precise definition of uncertainty in quantum theory, the cosine function does
not violate the uncertainty principle, since its uncertainty is infinite (plane wave). Nevertheless,
the general argument that this function does provide information about position still holds. I
am indebted to two reviewers for pointing this out.
12 David Bohm, Quantum Theory (Prentice-Hall, New York, 1951).
13 The main argument would remain valid in the general case.
14 From a pedagogical perspective it would be important to show where this equation comes
from. Schrödinger derived it from Hamilton’s optical-mechanical analogy (See Ref. 15). For our
purpose here, it is worth stressing that one does not need to use the complex exponential in
this derivation, i.e., the same expression is obtained if one assumes $\psi(x,t) = \Psi(x)\cos(Et/\hbar)$.
15 Erwin Schrödinger, Four lectures on wave mechanics, delivered at the Royal institution, London,
on 5th, 7th, 12th, and 14th March, 1928. (Blackie & son limited, London, Glasgow, 1928).
16 If we had represented ψ as a purely real periodic function, e.g. $\psi(x,t) = \Psi(x)\cos(Et/\hbar)$, then we
would need to differentiate it twice with respect to time to isolate the energy parameter
($\partial^2\psi/\partial t^2 = -(E^2/\hbar^2)\,\psi$). In order to substitute this in Eq. (7), we would also need to square
the whole equation and would end up with a rather complicated fourth-order equation that looks like
$\left(\frac{\partial^2}{\partial x^2} - \frac{2m}{\hbar^2}V\right)^2\psi + \frac{4m^2}{\hbar^2}\frac{\partial^2\psi}{\partial t^2} = 0$.
Curiously, this fourth-order equation was the one Schrödinger initially claimed to be “the
uniform and general wave equation for the scalar field ψ”, since apparently it would be possible
to consider ψ as a real function. One can hypothesize that Schrödinger did this in order to avoid
an explicit i in his fundamental equation. Dealing with this matter is beyond the scope of this
article, but the interested reader can find deeper discussions about the implications of a real
wave function in the continuation of Bohm’s argument12 and in Ref. 17.
17 Robert L. W. Chen, “Derivation of the real form of Schrödinger’s equation for a nonconservative
system and the unique relation between Re(ψ) and Im(ψ)”, J. Math. Phys., 30 (1), 1988.
18 Jun John Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Boston, 1967).
19 D. H. McIntyre, C. A. Manogue, J. Tate, Quantum Mechanics: A Paradigms Approach (Pearson
Addison-Wesley, Boston, 2012).
20 John S. Townsend, A Modern Approach to Quantum Mechanics, Second Edition (University
Science Books, Mill Valley, CA, 2013).
21 Paul J. Nahin, An Imaginary Tale: The Story of $\sqrt{-1}$ (Princeton University Press, Princeton, 1998).
22 Lucien Hardy, “Quantum Physics From Five Reasonable Axioms,” arXiv:quant-ph/0101012v4.
23 Artur Ekert, “Complex and unpredictable Cardano,” Int. J. Theor. Phys. 47, 2101 (2008).
24 We will assume that a and c are real because we want to see if we can get away without complex
numbers. A more general treatment of the situation is presented in Ref 20.
25 The symbol $\doteq$ is used to indicate “is represented by”. One cannot set the ket equal to the
column vector, because the former is an abstract vector and the latter is its representation in a
given basis.
26 There is a great deal of gymnastics to calculate the square root of a square matrix. For this
particular case, the fact that the determinant of T1 is negative already guarantees that the
components of its square roots are complex numbers with non-zero imaginary parts (see Ref.
27).
27 Bernard W. Levinger, “The Square Root of a 2 x 2 Matrix”, Mathematics Magazine, 53 (4),
222-224 (1980).
28 See, for instance, Wigner’s Unreasonable Effectiveness paper (Pure Appl. Math. 13, 114 , 1960)
and the debates related to it.