Maria Chekhova, Peter Banzer - Polarization of Light - in Classical, Quantum, and Nonlinear Optics (De Gruyter Textbook) - de Gruyter (2021)
Maria Chekhova, Peter Banzer
Polarization of Light
In Classical, Quantum, and Nonlinear Optics
Authors
Prof. Dr. Maria Chekhova
Max-Planck Institute for the Science of Light
Staudtstr. 2
91058 Erlangen
Germany
[email protected]

Univ.-Prof. Dr. Peter Banzer
University of Graz
Universitätsplatz 5
8010 Graz
Austria

Max-Planck Institute for the Science of Light
Staudtstr. 2
91058 Erlangen
Germany
[email protected]
ISBN 978-3-11-066801-8
e-ISBN (PDF) 978-3-11-066802-5
e-ISBN (EPUB) 978-3-11-060509-9
www.degruyter.com
For Vladimir, Rosa, Emma, Leo and Eva
Contents
1 Introduction
1.1 About this book
1.2 Brief history of polarization optics
2 Necessary basics
2.1 Analytic signal
2.2 Maxwell’s equations
5 Polarization transformations
5.1 Phase (retardation) plates
5.1.1 Half-wave plate
5.1.2 Quarter-wave plate
5.2 Rotators
5.3 Poincaré-sphere representation
6 Geometric phase
6.1 Examples of geometric phase
6.1.1 The Foucault pendulum
6.1.2 Non-planar optical path
6.2 Interference of arbitrarily polarized beams
6.3 Decomposition of a beam in two differently polarized components
6.3.1 Decomposition of a beam in two orthogonally polarized components
6.3.2 Decomposition of a beam in two non-orthogonally polarized components
6.4 Pancharatnam phase
6.4.1 Calculation of the Pancharatnam phase
6.4.2 Measurement of the Pancharatnam phase
6.5 Berry phase
7 Structured light
7.1 The paraxial wave equation
7.2 Structured scalar light beams—transverse phase patterns and phase singularities
7.3 Vectorial spatial modes and light beams—non-homogeneous polarization distributions
7.4 Polarization singularities and generic ellipse fields
7.5 Basic principles of structured light beam generation
Index
1 Introduction
1.1 About this book
It is difficult to overestimate the role of polarization in modern optics and photonics.
A brief glance at the website of any company producing optical components shows
how large the section ‘polarization optics’ is. Polarization elements are used in imag-
ing and spectroscopy, they can be essential in interferometers and light modulators,
they are ubiquitous in lasers and laser systems. In nonlinear optics, polarization of
light is crucial for understanding the phase matching and for the analysis of the tensor
properties of different nonlinear susceptibilities. Polarization is important for liquid
crystals, which are part of our everyday life: they are used in liquid crystal displays
(LCDs) in computers and smartphones, and in spatial light modulators, installed in
beam projectors. The fact that LCDs use polarization can be verified by simply looking
at your mobile phone through a polarizer, or wearing polarizing sunglasses (another
object familiar to everyone). As you rotate the polarizer, at a certain angle your smart-
phone screen will become dark.
Importantly, polarization plays a central role in modern quantum optics, quantum
information, and the booming field of quantum technology. The main reason for that
is that the ‘building bricks’ of quantum information, so-called qubits, are so easily
realized in the form of polarized photons. The quantum state of a polarized photon
is similar to the one of a spin-1/2 particle, or of a two-level atom. Meanwhile, photons
are the best carriers of information: they do not easily interact with each other or the
environment; this means they can propagate relatively far without being lost or scat-
tered. This is why polarized photons are used in quantum key distribution, one of the
most robust quantum information technologies to date.
This book considers polarization of light and its manifestations and use in modern
optics and photonics. It is mainly addressed to master's and PhD students working in
various fields of modern optics, and it is essentially based on the courses we teach
at the Friedrich-Alexander University of Erlangen-Nürnberg. A large part of this book
originates from the lecture course started at the Lomonosov Moscow State University
by David Klyshko (and further continued by Maria Chekhova). In the quantum optics
part, this book is considerably based on his work.
After introducing some necessary basics in Chapter 2, we start from the formal
description of polarization (Chapter 3) in terms of the polarization ellipse, Jones vector
and matrices, and Stokes vector and Mueller matrices. Optics of crystals, necessary
for understanding the operation of polarization optical elements, is briefly reviewed in
Chapter 4. Polarization transformations with waveplates and polarization rotators are
then considered in Chapter 5. Chapter 6 is devoted to the manifestations of geometric
(Pancharatnam) phase in optics, similar to the Berry phase in quantum physics. Chap-
ter 7 considers structured light, whose polarization state differs from point to point.
Chapter 8 deals with polarization at the nanoscale, a subject that recently emerged in
connection with the rapidly developing fields of nanooptics and nanoscale nonlinear
optics. An overview of polarization elements used in modern optics experiments is
given in Chapter 9. Chapter 10 is devoted to polarization in nonlinear optics, its role in
phase matching, and its manifestations due to the tensor properties of nonlinear sus-
ceptibilities. Finally, the last three chapters cover polarization-based quantum optics.
Chapter 11 introduces polarization from the viewpoint of quantum physics, in terms
of the Stokes operators and simplest polarization states. Chapter 12 deals with various
quantum states of polarized light and related effects. Chapter 13 describes two applica-
tions of polarized light in quantum optics: one is testing the foundations of quantum
mechanics, the other is quantum key distribution.
Most of the chapters are written in a textbook style and do not require special
knowledge. But some of them, namely Chapters 8, 12, and 13, are also intended to give
brief reviews of modern literature on the subject. Correspondingly, each of them has
an extensive list of references for the interested reader. However, to keep these lists
short, wherever possible we cite review papers rather than original works.
ordinary and extraordinary rays are polarized orthogonally to each other, one in the
plane of the crystal optic axis and the other one, perpendicularly to it. These effects
of double refraction and spatial walk-off will be considered in detail in Chapter 4. An
example of double refraction can be seen in Fig. 1.1, which shows this text on a com-
puter screen, photographed through a 3 cm calcite crystal. The existence of only two
possible polarization states,1 for instance, vertical and horizontal, follows from light
being a transverse wave; this idea was first formulated by Hooke in 1757 and further
proven by Young in 1817 [7].
The 19th century brought enormous progress in the study of polarization. In 1808,
Malus discovered that initially unpolarized light becomes partially polarized as a re-
sult of oblique reflection from a dielectric surface. The way he observed this effect was
by looking through a calcite crystal at the reflections from the windows of the Luxem-
bourg Palace in Paris, where he was an officer of the guard [4, 7]. As he rotated the
crystal around its axis, one of the reflected images was extinguished. One can repeat
this experiment by looking at an oblique reflection through a polarizer: at a certain
orientation of the polarizer the reflected image gets weaker. Figure 1.2 shows a picture
of a window in the Luxembourg Palace taken in 2019 through a polarizer selecting ver-
tical (left) and horizontal (right) polarization. The right-hand photo obviously shows
an ‘extinguished’ reflection. The same effect must have been seen by Malus. (Unfortu-
nately today’s guards do not let people come close to the fence, and the pictures were
taken from a large distance.) The fact that for a certain angle of incidence (Brewster’s
1 In some special cases, like the one of an evanescent wave, there are three possible polarization
states; this will be briefly discussed in Chapter 8.
depends on the polarization of the pump and the orientation of the crystal. Later, the
phase matching conditions were formulated, and at that time they were satisfied only
through the choice of different polarization modes for the pump and the frequency
converted radiation. This will be the subject of Chapter 10.
In quantum optics, polarized photons offered a possibility to realize some
gedanken (thought) experiments formulated at the dawn of quantum mechanics. For
instance, Schrödinger’s concept of entanglement and the famous Einstein–Podolsky–
Rosen (EPR) paradox of 1935 were considerably simplified by passing from the
‘position-momentum’ picture to the concept of a spin 1/2 particle, which was done
by Bohm [1, 6]. The situation described in the EPR paradox could in this case be
reproduced not in an abstract gedanken experiment, but in a real experiment of
Stern–Gerlach type. For the latter, in 1964 Bell formulated a theorem, leading to an
inequality that could help to test some statements of quantum mechanics in experiment.
Although experiments testing Bell’s inequalities indeed started with
spin-1/2 particles, it was only through the use of polarization-entangled photons that
Bell’s inequalities could be tested in a relatively simple way. Polarization-entangled
photons, as well as other types of nonclassical light, will be considered in detail in
Chapter 12, and the exciting story of the EPR paradox, Bell’s inequalities, and their final
experimental tests will be described in Chapter 13.
In 1984, Bennett and Brassard proposed an idea that later revolutionized cryptog-
raphy and in fact was one of the main triggers of the quantum information theory [2].
In their method of secret key distribution between two users (now known as the BB84
protocol) they encoded information into the polarization states of single photons. The
fragility of the polarization state of a single photon provided the protection of this
secret information against eavesdropper’s attacks. Generally, an information bit en-
coded into the state of a ‘two-level’ quantum system, like a photon with two polariza-
tion states, is now known as a quantum bit, a qubit, and forms the basis of quantum
information. The quantum key distribution with polarized photons will be discussed
in Chapter 13.
A very young field of optics, originating from the end of the 20th century, is
nanooptics. It considers light at the subwavelength scale, where polarization, like
many other phenomena, behaves differently than at the macroscopic scale. For instance,
light is no longer a purely transverse wave: it can be polarized longitudinally. Today, with
the miniaturization of optical devices, nanoscale polarization optics becomes very
important. It will be the subject of Chapter 8, and the related subject of structured
light beams will be considered in Chapter 7.
Bibliography
[1] D. Bohm. Quantum theory. Prentice-Hall, 1952.
[2] D. Bouwmeester, A. Ekert, and A. Zeilinger. The physics of quantum information.
Springer-Verlag, 2000.
[3] R. W. Boyd. Nonlinear optics. Academic Press, 2008.
[4] D. Goldstein. Polarized light. GRC, 2003.
[5] C. Huygens. Treatise on light. MacMillan and Co, 1912.
[6] D. N. Klyshko. Basic quantum mechanical concepts from the operational viewpoint. Phys. Usp.,
41(9):885–922, 1998.
[7] W. A. Shurcliff. Polarized light. Harvard University Press, 1962.
2 Necessary basics
2.1 Analytic signal
Although the electric field E(t) is real-valued, it is very convenient to describe it by
introducing a complex field, so that the observed field is its real part. To introduce
this complex field [3], let us decompose the real field into a Fourier integral,
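$$ E(t) = \int_{-\infty}^{\infty} d\omega\, e^{-i\omega t} E(\omega), \tag{2.1} $$
where $E(\omega)$ is the field spectral amplitude, and split the integral in two parts, one
including the integration over negative frequencies,
$$ E^{(-)}(t) = \int_{-\infty}^{0} d\omega\, e^{-i\omega t} E(\omega), \tag{2.2} $$
and the other one, over positive frequencies,
$$ E^{(+)}(t) = \int_{0}^{\infty} d\omega\, e^{-i\omega t} E(\omega). \tag{2.3} $$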
The fields (2.2) and (2.3) are called negative-frequency and positive-frequency
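fields. They are complex conjugates of each other, as
$$ [E^{(+)}(t)]^* = \int_{0}^{\infty} d\omega\, e^{i\omega t} E^*(\omega) = \int_{-\infty}^{0} d\omega\, e^{-i\omega t} E^*(-\omega) = \int_{-\infty}^{0} d\omega\, e^{-i\omega t} E(\omega) = E^{(-)}(t), \tag{2.4} $$
where we have changed the integration variable from −ω to ω and used the fact that
for the spectral amplitude of a real field, $E^*(\omega) = E(-\omega)$.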
The positive-frequency field $E^{(+)}(t)$ will be further called the analytic signal [1].
The observed field is proportional to its real part,
$$ E(t) = E^{(+)}(t) + E^{(-)}(t) = 2\,\mathrm{Re}\, E^{(+)}(t). \tag{2.5} $$
Figure 2.1 shows the analytic signal $E^{(+)}(t) = A(t)e^{i(-\omega t+\Phi(t))}$ for monochromatic
light. The arrow depicting the analytic signal is rotating with the optical frequency
ω (dashed line), but it is convenient to consider a ‘stroboscopic’ picture by passing
to a frame of reference rotating with the same frequency. The length of the arrow is
the field amplitude A(t) and its angle with the horizontal axis is the phase Φ(t). For
non-monochromatic light, both the amplitude and the phase are random functions of
time. Typical times of their variation are given by the inverse spectral width of light.
The intensity of a light beam is then calculated, up to a dimensional factor, as the
squared amplitude of the analytic signal. In what follows, we will omit this dimen-
sional factor and write the instantaneous intensity as
$$ I(t) = E^{(-)}(t)E^{(+)}(t) = |E^{(+)}(t)|^2. \tag{2.6} $$
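The relations between the real field, its positive- and negative-frequency parts, and the intensity can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a hypothetical real field made of three cosine components (the list `COMPONENTS` is invented for illustration) and uses the $e^{-i\omega t}$ convention of the Fourier integral above, so that each cosine contributes one term to $E^{(+)}(t)$ and its conjugate to $E^{(-)}(t)$.

```python
import cmath
import math

# Hypothetical real field: (amplitude, angular frequency, phase) of each cosine.
COMPONENTS = [(1.0, 2.0, 0.3), (0.5, 2.3, 1.1), (0.8, 1.8, -0.7)]

def real_field(t):
    """Observed real-valued field E(t) = sum of a*cos(w*t - p)."""
    return sum(a * math.cos(w * t - p) for a, w, p in COMPONENTS)

def positive_frequency_field(t):
    """E^(+)(t): only the exp(-i*w*t) terms with w > 0 (the analytic signal)."""
    return sum(0.5 * a * cmath.exp(-1j * (w * t - p)) for a, w, p in COMPONENTS)

def negative_frequency_field(t):
    """E^(-)(t): the exp(+i*w*t) terms, i.e. spectral weight at frequency -w."""
    return sum(0.5 * a * cmath.exp(1j * (w * t - p)) for a, w, p in COMPONENTS)

def intensity(t):
    """Instantaneous intensity |E^(+)(t)|^2, dimensional factor omitted."""
    return abs(positive_frequency_field(t)) ** 2

if __name__ == "__main__":
    for t in (0.0, 0.37, 1.9):
        e_plus = positive_frequency_field(t)
        print(t, real_field(t), 2 * e_plus.real, intensity(t))
```

For every sampled time, the second and third printed columns should coincide, which is the statement that the observed field is twice the real part of the analytic signal.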
where $\vec{P}(\vec{r}, t)$ is the polarization of the matter (not to be confused with the polarization
of light, the subject of this book). Physically, it is the dipole moment density, i. e., the
Here $\hat{\chi}$ and $\hat{\chi}_m$ are the electric and magnetic susceptibilities, which in the framework
of this book will be assumed to be constant in space and time.
Bibliography
[1] M. Born and E. Wolf. Principles of optics. Pergamon Press, 1970.
[2] R. W. Boyd. Nonlinear optics. Academic Press, 2008.
[3] J. W. Goodman. Statistical optics. John Wiley and Sons, Inc., 2000.
3 Polarization of light: classical description
3.1 Polarization ellipse
Polarization of a light wave is defined by the way its electric field vector oscillates.
Imagine that we can take a snapshot showing us the ‘trajectory’ of the electric field
vector (Fig. 3.1). If the electric field $\vec{E}(\vec{r}, t)$ is oscillating in one fixed plane, light is said to
be linearly polarized. If it moves along a spiral (the projection on the plane transverse
to the propagation direction is a circle), the polarization state is right- or left-hand
circular, depending on the direction of rotation. Finally, if the projection of the $\vec{E}(\vec{r}, t)$
trajectory on the transverse plane is an ellipse, light is elliptically polarized [6].
Figure 3.1 shows the cases of a horizontally polarized beam (a), vertically polarized beam
(b), and elliptically polarized beams with different rotation direction of the electric
field vector (c, d). The wave propagation direction z is the same in all pictures, and
the horizontal and vertical directions are marked H and V, respectively. The pictures
in the bottom row show the trajectories as seen from the direction in which the wave
propagates.
Figure 3.1: The trajectory of the electric field vector of a horizontally polarized wave (a), vertically po-
larized wave (b), right- (c) and left-handed (d) elliptically polarized waves. Bottom row: trajectories
of the electric field vector as ‘seen’ by an ‘observer staring into the beam’.
The latter is the most general case of a polarization state: the projection of the elec-
tric field vector trajectory on the plane orthogonal to the propagation direction can be
considered as an ellipse (Fig. 3.2). The parameters of this polarization ellipse differ for
different polarization states.
Namely, the ellipticity is given by the ratio of the semiaxes, b/a. Linear polarization
corresponds to zero ellipticity and circular polarization, to unit ellipticity. Instead
of the ellipticity, one can speak of its opposite, the eccentricity, $\sqrt{a^2 - b^2}/a$. The
eccentricity is unity for linearly polarized light and zero for circularly polarized light. The
tilt angle Ω of the ellipse is called the azimuth angle. Finally, the direction of rotation
(shown in the figure by an arrow) is called the handedness. It can be positive (right)
or negative (left). These three numbers (ellipticity, azimuth, handedness) fully define
the state of polarization [6].
In Chapter 8, we will introduce another description of the polarization ellipse,
which is of importance in the context of polarization singularities.
From Fig. 3.2, it is clear that the polarization state is linear when the horizontal
and vertical components of the electric field oscillate in phase with each other. The
azimuth is then determined by the ratio of their amplitudes. Right (left) circular polar-
ization will be observed in the case where the horizontal and vertical components of
the electric field have equal amplitudes and the latter has a phase delay of π/2 (−π/2)
with respect to the former. Generally, the ellipticity is determined by the phase delay
between the horizontal and vertical components of the electric field.
For monochromatic light, the phase and amplitude of the electric field (see
Fig. 2.1) do not vary with time; therefore, the relative phase between the two field com-
ponents, as well as the ratio of their amplitudes, is constant as well. It follows that
monochromatic light is always polarized, i. e., its polarization state does not change
in time. The situation is different for non-monochromatic light: it can happen that its
polarization state drifts with time. For instance, the polarization ellipse can rotate,
or its ellipticity can change. In this case, light is referred to as partially polarized or
unpolarized.
Because the typical time over which the phase and amplitude of non-monochromatic
light drift is the coherence time, given by the inverse width of the spectrum [3],
one can conclude that over time intervals much shorter than the coherence time even
non-monochromatic light will be polarized. In particular, quite counter-intuitively,
even sunlight will be polarized over sufficiently short time intervals.
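3.2 The Jones vector and the Jones matrices
The most complete classical description of the polarization state of light is in terms of
the analytic signal in its vectorial form (Chapter 2). For a plane non-monochromatic
wave, it can be written as
$$ \vec{E}^{(+)}(\vec{r}, t) = \vec{E}_0(t)\, e^{i(\vec{k}\cdot\vec{r} - \omega t)}, \tag{3.1} $$
where $\vec{k}$ is the wave vector, ω the central frequency, and the slowly varying amplitude
$\vec{E}_0(t)$ can be decomposed in two vectors, along the horizontal (H) and vertical (V) directions:
$$ \vec{E}_0(t) = E_H(t)\,\vec{e}_H + E_V(t)\,\vec{e}_V. \tag{3.2} $$
The sum of the squared moduli of the two components is denoted as
$$ S_0(t) \equiv |E_H(t)|^2 + |E_V(t)|^2. \tag{3.3} $$
$S_0(t)$ is equal to the instantaneous intensity of the light wave. In the general case, this
value varies with time [3].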
The Jones vector is then defined as a two-component column vector [2, 4, 6],
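$$ \vec{e}(t) \equiv \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \equiv \frac{1}{\sqrt{S_0(t)}} \begin{pmatrix} E_H(t) \\ E_V(t) \end{pmatrix}. \tag{3.4} $$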
Here we omitted the time dependence of α and β, but we will keep in mind that both
are functions of time. Therefore the Jones vector describes the instantaneous state of
polarization.
Clearly, the Jones vector is normalized, i. e.,
$$ |\alpha|^2 + |\beta|^2 = 1. \tag{3.5} $$
Since we are not interested in the overall phase of the Jones vector, but only in the
relative phase between its complex components α and β, we can define them as
$$ \alpha \equiv \cos(\vartheta/2), \qquad \beta \equiv e^{i\varphi} \sin(\vartheta/2). \tag{3.6} $$
Now we can describe different polarization states in terms of the Jones vector. For in-
stance, for linearly polarized light, the relative phase φ between the components is
zero. In particular, the Jones vector for horizontally polarized light is
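$$ \vec{e}_H = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{3.7} $$
and for vertically polarized light it is
$$ \vec{e}_V = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3.8} $$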
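Two other important cases are the ones with $\alpha = \beta = 1/\sqrt{2}$ and with $\alpha = -\beta = 1/\sqrt{2}$.
They correspond, respectively, to linear diagonal and anti-diagonal polarization
states:
$$ \vec{e}_{D,A} = \begin{pmatrix} 1/\sqrt{2} \\ \pm 1/\sqrt{2} \end{pmatrix}. \tag{3.9} $$
Circular polarization corresponds to $\alpha = 1/\sqrt{2}$, $\beta = \pm i/\sqrt{2}$, the upper (lower) sign
describing the right-hand (left-hand) state:
$$ \vec{e}_{R,L} = \begin{pmatrix} 1/\sqrt{2} \\ \pm i/\sqrt{2} \end{pmatrix}. \tag{3.10} $$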
A Jones vector with arbitrary components α, β or, equivalently, with arbitrary val-
ues of ϑ and φ, obviously describes an arbitrary elliptical polarization state. The ellip-
ticity and azimuth of the polarization ellipse are then related to the parameters of the
Jones vector by
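$$ \frac{a}{b} = \sqrt{\frac{1 + \sqrt{1 - \sin^2\vartheta \sin^2\varphi}}{1 - \sqrt{1 - \sin^2\vartheta \sin^2\varphi}}}, \qquad \tan(2\Omega) = \tan\vartheta \cos\varphi. \tag{3.11} $$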
The handedness is given by φ: it is right for φ < π and left for φ > π.
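As a cross-check of Eq. (3.11), one can compare it with the ellipse obtained by tracing the real field over one optical period. The sketch below is illustrative only: the function names are hypothetical, the axis ratio is evaluated as b/a with the inner square root taken as $\sqrt{1 - \sin^2\vartheta\sin^2\varphi}$, and the azimuth is obtained with `atan2` to select the branch of $\tan(2\Omega)$, so it is defined modulo π.

```python
import cmath
import math

def jones(theta, phi):
    """Jones vector components (alpha, beta) of Eq. (3.6)."""
    return math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)

def ellipse_from_formula(theta, phi):
    """Axis ratio b/a and azimuth (mod pi) following Eq. (3.11)."""
    c = math.sqrt(max(0.0, 1.0 - (math.sin(theta) * math.sin(phi)) ** 2))
    b_over_a = math.sqrt((1.0 - c) / (1.0 + c))
    azimuth = 0.5 * math.atan2(math.sin(theta) * math.cos(phi), math.cos(theta))
    return b_over_a, azimuth

def ellipse_from_trajectory(theta, phi, samples=20000):
    """Trace Re(alpha*exp(-i*w*t)), Re(beta*exp(-i*w*t)) and measure the ellipse."""
    alpha, beta = jones(theta, phi)
    longest, longest_angle = -1.0, 0.0
    shortest = float("inf")
    for n in range(samples):
        phase = cmath.exp(-2j * math.pi * n / samples)
        e_h, e_v = (alpha * phase).real, (beta * phase).real
        radius = math.hypot(e_h, e_v)
        if radius > longest:
            longest, longest_angle = radius, math.atan2(e_v, e_h)
        shortest = min(shortest, radius)
    return shortest / longest, longest_angle

if __name__ == "__main__":
    for theta, phi in ((1.0, 0.7), (2.0, 2.5)):
        print(ellipse_from_formula(theta, phi), ellipse_from_trajectory(theta, phi))
```

The two azimuth values may differ by π, since the major axis has no preferred direction; the axis ratios should agree to within the sampling accuracy.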
One can see that the Jones vectors (3.7), (3.8), as well as two vectors (3.9) and two
vectors (3.10), are orthogonal to each other. This can be verified by calculating their
inner product; however, one should keep in mind that the inner product of two Jones
vectors e⃗1,2 with the components α1,2 and β1,2 is defined as
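$$ \vec{e}_1 \cdot \vec{e}_2 = (\alpha_1,\ \beta_1) \begin{pmatrix} \alpha_2^* \\ \beta_2^* \end{pmatrix} = \alpha_1\alpha_2^* + \beta_1\beta_2^*. \tag{3.12} $$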
Moreover, two arbitrary Jones vectors e⃗1,2 with parameters ϑ1,2 and φ1,2 are orthogonal
if ϑ2 = π − ϑ1 and φ2 = π + φ1 . According to Eq. (3.11), this means that for orthogonal
states, the polarization ellipses have the same ellipticity, the azimuths differing by π/2
(the principal axes orthogonal), and opposite handedness.
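The orthogonality condition can be verified directly from the inner product (3.12). The following short sketch assumes the parametrization (3.6); the helper names `jones`, `inner`, and `orthogonal_partner` are hypothetical and chosen only for this illustration.

```python
import cmath
import math

def jones(theta, phi):
    """Jones vector (alpha, beta) = (cos(theta/2), exp(i*phi)*sin(theta/2))."""
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))

def inner(e1, e2):
    """Inner product of Eq. (3.12): alpha1*conj(alpha2) + beta1*conj(beta2)."""
    return e1[0] * e2[0].conjugate() + e1[1] * e2[1].conjugate()

def orthogonal_partner(theta, phi):
    """State with theta2 = pi - theta1 and phi2 = pi + phi1."""
    return jones(math.pi - theta, math.pi + phi)

if __name__ == "__main__":
    for theta, phi in ((0.0, 0.0), (math.pi / 2, 0.0), (math.pi / 2, math.pi / 2), (1.2, 2.9)):
        overlap = inner(jones(theta, phi), orthogonal_partner(theta, phi))
        print(theta, phi, abs(overlap))
```

The first three parameter pairs correspond to the H, D, and R states, whose partners are V, A, and L; the last pair is an arbitrary elliptical state.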
We see now why light beams with orthogonal polarization states (orthogonal
Jones vectors) do not form an interference pattern. Indeed, if two fields overlap at
some point, the total analytic signal is given by the vectorial sum of their analytic
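signals $\vec{E}^{(+)}_{1,2}(t)$, $\vec{E}^{(+)}(t) = \vec{E}^{(+)}_1(t) + \vec{E}^{(+)}_2(t)$, and the intensity is
$$ I = |\vec{E}_1^{(+)}(t)|^2 + |\vec{E}_2^{(+)}(t)|^2 + 2\,\mathrm{Re}\{\vec{E}_1^{(+)}(t) \cdot \vec{E}_2^{(-)}(t)\}. \tag{3.15} $$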
The first two terms are the intensities of the two beams, and the last term is responsible
for the interference. Its value is proportional to the inner product of the two Jones
vectors, e⃗1 ⋅ e⃗2 . This term will be equal to zero if the Jones vectors of the two beams are
orthogonal.
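To illustrate how the interference term depends on the inner product, the sketch below superposes two beams of equal (unit) intensity with a variable relative phase and evaluates the fringe visibility. It is a simplified model under stated assumptions: both beams are fully polarized and monochromatic, and the function names are hypothetical. In this case the visibility equals the modulus of the inner product of the two Jones vectors.

```python
import cmath
import math

def inner(e1, e2):
    """Inner product alpha1*conj(alpha2) + beta1*conj(beta2)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(e1, e2))

def intensity(e1, e2, delta):
    """Intensity of the vector sum of two unit-intensity beams, relative phase delta."""
    shift = cmath.exp(1j * delta)
    return sum(abs(a + shift * b) ** 2 for a, b in zip(e1, e2))

def visibility(e1, e2, steps=3600):
    """Fringe visibility (Imax - Imin)/(Imax + Imin) from a scan of the relative phase."""
    values = [intensity(e1, e2, 2 * math.pi * n / steps) for n in range(steps)]
    return (max(values) - min(values)) / (max(values) + min(values))

if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    e_h, e_v, e_d = (1, 0), (0, 1), (s, s)
    print(visibility(e_h, e_v), abs(inner(e_h, e_v)))  # orthogonal beams: no fringes
    print(visibility(e_h, e_d), abs(inner(e_h, e_d)))  # partial overlap
    print(visibility(e_h, e_h), abs(inner(e_h, e_h)))  # identical polarization
```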
In Section 3.2.1 we decomposed the analytic signal into vertically and horizontally po-
larized components. This means that so far we were using the basis (frame of refer-
ence) formed by Jones vectors e⃗H and e⃗V . We will further call it the HV basis. Mean-
while, in some cases it is more convenient to use other bases. We will now consider
transformations from one basis to another.
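Linear algebra tells us that from a basis formed by two orthonormal complex vectors
$\vec{e}_{1,2}$ we can pass to a new orthonormal basis, $\vec{e}\,'_{1,2}$, by a transformation
$$ (\vec{e}\,'_1, \vec{e}\,'_2) = (\vec{e}_1, \vec{e}_2)\, A, \tag{3.16} $$
where A is a unitary 2 × 2 matrix. The components of the Jones vector are then transformed as
$$ \begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} = A^{+} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \tag{3.17} $$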
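For example, the transformations from the HV basis to the ‘diagonal’ basis and
‘circular’ basis take, according to Eqs. (3.9) and (3.10), the form
$$ (\vec{e}_D, \vec{e}_A) = (\vec{e}_H, \vec{e}_V) \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix} \tag{3.18} $$
and
$$ (\vec{e}_R, \vec{e}_L) = (\vec{e}_H, \vec{e}_V) \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \end{pmatrix}, \tag{3.19} $$
so that the corresponding transformation matrices are
$$ A_{DA} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \tag{3.20} $$
$$ A_{RL} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}. \tag{3.21} $$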
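A quick numerical check of the basis change is sketched below. It assumes that the Jones components in the new basis are obtained by applying the Hermitian conjugate of the transformation matrix to the components in the HV basis; the helper names are hypothetical. A diagonally polarized state should then have components (1, 0) in the DA basis, and a right-hand circular state components (1, 0) in the RL basis.

```python
import math

S = 1 / math.sqrt(2)
A_DA = [[S, S], [S, -S]]              # Eq. (3.20)
A_RL = [[S, S], [1j * S, -1j * S]]    # Eq. (3.21)

def dagger(m):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(m[c][r]).conjugate() for c in range(2)] for r in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def to_new_basis(a, jones_hv):
    """Components in the new basis: (alpha', beta') = A^+ (alpha, beta)."""
    ad = dagger(a)
    return tuple(ad[r][0] * jones_hv[0] + ad[r][1] * jones_hv[1] for r in range(2))

if __name__ == "__main__":
    print(matmul(dagger(A_RL), A_RL))        # should be the identity (unitarity)
    print(to_new_basis(A_DA, (S, S)))        # D state in the DA basis
    print(to_new_basis(A_RL, (S, 1j * S)))   # R state in the RL basis
```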
Now, with the polarization state of light described by the Jones vector, it is worth dis-
cussing how this state can be transformed. We know that certain optical elements
change the polarization of light. For instance, as mentioned in Chapter 1, the polarization
state changes as a result of reflection. Some materials, for instance a
sugar solution, can rotate the plane of polarization. Another example is retardation
plates used in optical laboratories: they can transform linear into circular polarization
and vice versa.
In this book we will consider all these polarization elements, and some others. Our
consideration will only cover lossless elements, i. e., those conserving the intensity of
light. Correspondingly, these transformations will be described by unitary matrices,
called the Jones matrices [2, 4]. As a result of such a transformation, an initial Jones
vector $\vec{e}$ will become
$$ \vec{e}\,' = J\vec{e}, \tag{3.22} $$
where the Jones matrix J is a 2 × 2 unitary matrix. Because the total phase of the Jones
vector is irrelevant¹ and is therefore ignored, the Jones matrices have an additional
property det(J) = 1. It follows that an arbitrary Jones matrix belongs to the SU(2) group
(2 × 2 unitary matrices with the special property of unimodularity, i. e., having a unity
determinant) and has the general form
$$ J = \begin{pmatrix} t & r \\ -r^* & t^* \end{pmatrix}. \tag{3.23} $$
Here, t and r are complex numbers satisfying the condition $|t|^2 + |r|^2 = 1$.
¹ It is, however, important for effects involving the geometric phase; see Chapter 6.
It is worth noting that the same SU(2) form is typical for matrices describing a
lossless beamsplitter: for two fields E1 and E2 at its input, the output fields will have
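the form
$$ E_1' = tE_1 + rE_2, \qquad E_2' = -r^*E_1 + t^*E_2. \tag{3.24} $$
Here, r and t are the reflection and transmission coefficients of the beamsplitter with
respect to the field (rather than the intensity).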
This analogy between a polarization transformation and the transformation of elec-
tric fields on a beamsplitter has a simple explanation: the two polarization modes are
similar to any other binary set of modes, like spatial modes, which are involved in
transformations (3.24).
Thus, every polarization transforming element, like a cuvette with sugar solution,
a retardation plate, or in fact any crystalline slab, will be described by a Jones matrix J.
If there are several such elements placed in series, each described by a matrix $J_j$,
$j = 1, \dots, n$, then the total transformation of the Jones vector is given by the matrix
$J = J_n \cdots J_1$. In Chapter 5 we will calculate the Jones matrices of various optical elements.
Polarization elements we use in the lab will be discussed in more detail in Chapter 9.
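The composition rule for elements in series can be sketched as follows. The three elements below are hypothetical: they are generated from the general SU(2) form with arbitrarily chosen parameters, not from the Jones matrices of real devices (those are derived in Chapter 5). The sketch checks that the product, taken with the first element rightmost, acts like the elements applied one after another and stays unitary with unit determinant.

```python
import cmath
import math

def su2(t, r):
    """General lossless Jones matrix [[t, r], [-r*, t*]]; needs |t|^2 + |r|^2 = 1."""
    return [[t, r], [-r.conjugate(), t.conjugate()]]

def element(a, p, q):
    """Hypothetical element: t = cos(a)*exp(i*p), r = sin(a)*exp(i*q)."""
    return su2(math.cos(a) * cmath.exp(1j * p), math.sin(a) * cmath.exp(1j * q))

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def apply(j, e):
    """Transformed Jones vector J e."""
    return (j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1])

def chain(matrices):
    """Total matrix J = J_n ... J_1 for elements listed in the order light meets them."""
    total = [[1, 0], [0, 1]]
    for j in matrices:
        total = matmul(j, total)
    return total

def det(j):
    return j[0][0] * j[1][1] - j[0][1] * j[1][0]

if __name__ == "__main__":
    elements = [element(0.4, 0.3, 1.2), element(1.1, -0.5, 0.2), element(0.7, 2.0, -1.0)]
    total = chain(elements)
    print(det(total))                    # close to 1
    print(apply(total, (0.6, 0.8j)))     # output Jones vector, still normalized
```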
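3.3 The Stokes vector and the Poincaré sphere
The Stokes observables are defined as
$$ \begin{aligned} S_1 &\equiv |E_H|^2 - |E_V|^2, \\ S_2 &\equiv 2\,\mathrm{Re}(E_H^* E_V), \\ S_3 &\equiv 2\,\mathrm{Im}(E_H^* E_V), \end{aligned} \tag{3.25} $$
and together they form the Stokes vector
$$ \vec{S} \equiv \begin{pmatrix} S_1 \\ S_2 \\ S_3 \end{pmatrix}. \tag{3.26} $$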
In addition, the zeroth Stokes observable S0 is defined by Eq. (3.3). All Stokes ob-
servables are, in the general case, functions of time. Using the definitions (3.3), (3.25),
one can verify the relation
$$ S_1^2 + S_2^2 + S_3^2 = S_0^2. \tag{3.27} $$
Importantly, the second and third Stokes observables $S_{2,3}$ can be expressed similarly
to $S_1$ by passing to the ‘DA’ and ‘RL’ bases. Indeed, because $E_{D,A} = (E_H \pm E_V)/\sqrt{2}$
and $E_{R,L} = (E_H \pm iE_V)/\sqrt{2}$, we can obtain
$$ S_2 \equiv |E_D|^2 - |E_A|^2, \qquad S_3 \equiv |E_L|^2 - |E_R|^2. \tag{3.28} $$
From this, a very simple interpretation of the Stokes observables $S_{1,2,3}$ follows: each of
them is the difference of intensities in two orthogonal polarization modes,
$$ S_1 \equiv I_H - I_V, \qquad S_2 \equiv I_D - I_A, \qquad S_3 \equiv I_L - I_R. \tag{3.29} $$
The mean values of the Stokes observables, ⟨S1,2,3 ⟩, will be further called the
Stokes parameters. Here we denote averaging by angular brackets, typically used in
quantum mechanics, because in Chapter 11 we will apply the same notation to the
quantum mechanical description. But in this chapter and further, wherever a classi-
cal description is considered, we will understand a mean value as a time-averaged
quantity. Alternatively, one can imagine that the mean value is found by averaging
over the ensemble formed by several independent beams.
Partially polarized light can now be described in terms of the Stokes parameters.
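Namely, the degree of polarization is introduced as
$$ P \equiv \frac{\sqrt{\langle S_1\rangle^2 + \langle S_2\rangle^2 + \langle S_3\rangle^2}}{\langle S_0\rangle}. \tag{3.30} $$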
For fully polarized light, the degree of polarization takes its maximum value P = 1 [due
to Eq. (3.27)], for unpolarized light P = 0. In the general case, for partially polarized
light, 0 ≤ P ≤ 1.
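One can also describe partially polarized light by the coherence matrix,
$$ C = \begin{pmatrix} \langle E_H^* E_H\rangle & \langle E_H^* E_V\rangle \\ \langle E_H E_V^*\rangle & \langle E_V^* E_V\rangle \end{pmatrix}. \tag{3.31} $$
Then the degree of polarization (3.30) is related to the determinant and trace of
the coherence matrix as [4]
$$ P = \sqrt{1 - \frac{4 \det C}{(\mathrm{Tr}\, C)^2}}. \tag{3.32} $$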
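The two expressions for the degree of polarization can be compared on a set of field samples. The sketch below is an illustration under explicit assumptions: the averaging is an unweighted mean over samples, and `sample_fields` generates a hypothetical, deterministic drift of a linear polarization state (it is not a model of any real source). The function names are invented for this example.

```python
import math

def sample_fields(n=200, spread=0.8):
    """Hypothetical samples (E_H, E_V): linear polarization whose azimuth wanders."""
    fields = []
    for k in range(n):
        angle = spread * math.sin(2 * math.pi * k / n)
        fields.append((complex(math.cos(angle)), complex(math.sin(angle))))
    return fields

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def degree_from_stokes(fields):
    """Degree of polarization from the averaged Stokes observables."""
    s0 = mean(abs(h) ** 2 + abs(v) ** 2 for h, v in fields)
    s1 = mean(abs(h) ** 2 - abs(v) ** 2 for h, v in fields)
    s2 = mean(2 * (h.conjugate() * v).real for h, v in fields)
    s3 = mean(2 * (h.conjugate() * v).imag for h, v in fields)
    return math.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / s0

def degree_from_coherence_matrix(fields):
    """Degree of polarization from det and trace of the coherence matrix."""
    c_hh = mean(abs(h) ** 2 for h, v in fields)
    c_vv = mean(abs(v) ** 2 for h, v in fields)
    c_hv = mean(h.conjugate() * v for h, v in fields)
    determinant = c_hh * c_vv - abs(c_hv) ** 2
    trace = c_hh + c_vv
    return math.sqrt(max(0.0, 1 - 4 * determinant / trace ** 2))

if __name__ == "__main__":
    fields = sample_fields()
    print(degree_from_stokes(fields), degree_from_coherence_matrix(fields))
```

With `spread=0` all samples coincide and both functions return 1; an equal mixture of H and V samples gives 0.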
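It is convenient to normalize the Stokes vector to the intensity,
$$ \vec{\sigma} \equiv \vec{S}/S_0. \tag{3.33} $$
Recalling the definition (3.4) of the Jones vector, we find that its components define
the ones of the normalized Stokes vector:
$$ \vec{\sigma} = \begin{pmatrix} |\alpha|^2 - |\beta|^2 \\ 2\,\mathrm{Re}(\alpha^* \beta) \\ 2\,\mathrm{Im}(\alpha^* \beta) \end{pmatrix}. \tag{3.34} $$
With the parametrization (3.6), this becomes
$$ \vec{\sigma} = \begin{pmatrix} \cos\vartheta \\ \sin\vartheta \cos\varphi \\ \sin\vartheta \sin\varphi \end{pmatrix}, \tag{3.35} $$
so that the polarization state is depicted by a point on a unit sphere, the Poincaré sphere (Fig. 3.3).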
1. The equator corresponds to the case where the angle φ is either zero or π, which
means that both components of the Jones vector, α and β, are real. Then there is no
phase difference between the horizontal and vertical field components, i. e., the
polarization is linear. Accordingly, Fig. 3.3 shows polarization states at different
points of the equator as blue arrows. In particular, the point on the equator with
ϑ = 0 corresponds to α = 1, β = 0, i. e., to the horizontal polarization, and the
opposite point, with ϑ = π, to α = 0, β = 1, i. e., to the vertical polarization.
Correspondingly, the points on the equator with ϑ = π/2 and φ = 0, π denote the
diagonal (D) and anti-diagonal (A) polarization, respectively. As the ‘longitude’
on the equator increases, the plane of linear polarization tilts gradually (Fig. 3.3).
2. At the poles, ϑ = π/2, while φ = π/2 for the North Pole and φ = 3π/2 for the South
Pole. The components of the Jones vector are then α = 1/√2, β = i/√2 for the
North Pole and α = 1/√2, β = −i/√2 for the South Pole. It follows that the North
Pole depicts the right-hand circular polarization and the South Pole, the left-hand
circular polarization.
These examples, so far, show that opposite points on the Poincaré sphere corre-
spond to orthogonal polarization states: H and V, D and A, R and L; see Fig. 3.3.
This is also true in the general case: for opposite points A, B on the sphere the co-
ordinates are ϑB = π − ϑA and φB = π + φA ; then, according to Eq. (3.11), opposite
points on the sphere correspond to orthogonal polarization states.
3. From Eq. (3.11), it also follows that the northern hemisphere (0 < φ < π) has right-hand
polarization, while the southern one (π < φ < 2π) has left-hand polarization.
4. Points on a single meridian have the same azimuth of the polarization ellipse,
equal to half of the longitude μ: 2Ω = μ. In particular, for all points on the Greenwich
meridian, the azimuth is zero: the major semiaxes of the polarization ellipses
are oriented along the H direction.
5. Points with the same latitude λ have the same ellipticity. From Eq. (3.11) one can
see that the ellipticity is $b/a = \sqrt{(1 - \cos\lambda)/(1 + \cos\lambda)}$. In particular, for points on
the equator, λ = 0, b/a = 0, while for the poles, λ = π/2 and b/a = 1.
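The properties listed above can be explored numerically by mapping a Jones vector to its point on the Poincaré sphere. This is a minimal sketch with hypothetical helper names: it computes the normalized Stokes vector from α and β, compares it with the angular form, and derives the longitude μ (measured in the plane of the first two components) and latitude λ (from the third component), from which the azimuth μ/2 and the ellipticity follow.

```python
import cmath
import math

def jones(theta, phi):
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))

def stokes_from_jones(alpha, beta):
    """Normalized Stokes vector (|a|^2 - |b|^2, 2 Re(a* b), 2 Im(a* b))."""
    cross = alpha.conjugate() * beta
    return (abs(alpha) ** 2 - abs(beta) ** 2, 2 * cross.real, 2 * cross.imag)

def stokes_from_angles(theta, phi):
    """Same vector in terms of the angles theta and phi."""
    return (math.cos(theta),
            math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi))

def sphere_coordinates(sigma):
    """Longitude mu and latitude lam of a point on the unit sphere."""
    mu = math.atan2(sigma[1], sigma[0])
    lam = math.asin(max(-1.0, min(1.0, sigma[2])))
    return mu, lam

def ellipse_parameters(sigma):
    """Azimuth (half the longitude) and ellipticity b/a from the latitude."""
    mu, lam = sphere_coordinates(sigma)
    return mu / 2, math.sqrt((1 - math.cos(lam)) / (1 + math.cos(lam)))

if __name__ == "__main__":
    for name, (theta, phi) in {"H": (0, 0), "D": (math.pi / 2, 0),
                               "R": (math.pi / 2, math.pi / 2)}.items():
        sigma = stokes_from_jones(*jones(theta, phi))
        print(name, sigma, ellipse_parameters(sigma))
```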
Partially polarized light or unpolarized light is described in the classical picture as some
variation of the Stokes vector. One can see it as the Stokes vector changing its direc-
tion with time;2 then the point depicting the polarization state will ‘wander’ over the
Poincaré sphere. After time averaging, the components of the Stokes vector reduce
and the degree of polarization (3.30) becomes smaller than unity or, for completely
unpolarized light, zero.
Alternatively, the Stokes vector can vary over an ensemble. Indeed, one can imag-
ine that a light beam is an ensemble of beams with different polarization states. Ac-
cordingly, instead of a single point, there will be a set of points on the Poincaré sphere.
To find the degree of polarization for the whole beam, one should average the Stokes
vector components over all points. This procedure is equivalent to finding the vector
sum of all Stokes vectors and dividing it by the number of points. As a result of this
averaging, the degree of polarization becomes smaller than unity.
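The ensemble picture lends itself to a very short worked example. The sketch below assumes that all beams in the ensemble have equal intensity, so that the degree of polarization is simply the length of the averaged normalized Stokes vector; the point labels and function names are introduced here only for illustration.

```python
import math

# Normalized Stokes vectors of the six basic states on the Poincare sphere.
POINTS = {
    "H": (1, 0, 0), "V": (-1, 0, 0),
    "D": (0, 1, 0), "A": (0, -1, 0),
    "R": (0, 0, 1), "L": (0, 0, -1),
}

def mean_stokes_vector(labels):
    """Vector sum of the Stokes vectors divided by the number of points."""
    n = len(labels)
    return tuple(sum(POINTS[name][k] for name in labels) / n for k in range(3))

def degree_of_polarization(labels):
    """Length of the averaged normalized Stokes vector (equal-intensity beams)."""
    return math.sqrt(sum(c ** 2 for c in mean_stokes_vector(labels)))

if __name__ == "__main__":
    print(degree_of_polarization(["H"]))            # a single state: fully polarized
    print(degree_of_polarization(["H", "V"]))       # opposite points: unpolarized
    print(degree_of_polarization(["H", "D"]))       # partially polarized
    print(degree_of_polarization(list(POINTS)))     # all six states: unpolarized
```

For the mixture of H and D, the averaged vector is (1/2, 1/2, 0), so the degree of polarization is $1/\sqrt{2}$.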
$$ \vec{\sigma}\,' = M\vec{\sigma}, \tag{3.36} $$
where M is called a Mueller matrix. Clearly, it is a 3 × 3 matrix with real elements (as it
relates a real vector to a real vector). From Eq. (3.34) and from the general form (3.23) of
the Jones matrix we can find the corresponding Mueller matrix in terms of t and r [4]:
2 Similarly, for spatially multimode beams, the Stokes vector can be seen as varying in space.
3.3 The Stokes vector and the Poincaré sphere | 21
Note that in the general case, a Mueller matrix describes a transformation of the
non-normalized Stokes vector S⃗ with four components S0,1,2,3 and it is therefore a 4 × 4
matrix [2, 6]. But here, we will only consider lossless transformations preserving the
total intensity S0 ; therefore it is sufficient to consider the normalized Stokes vector σ⃗
and its transformations by 3 × 3 matrices.
Similar to the Jones matrix, a Mueller matrix (3.37) is unitary and unimodular. It
follows that a Mueller matrix conserves the length of a vector—indeed, it transforms a
unit Stokes vector into another unit Stokes vector—but it also conserves the angle be-
tween two Stokes vectors. In other words, it conserves the inner product of two Stokes
vectors. Moreover, one can show that it also conserves a vector product of two Stokes
vectors; in particular, it transforms a ‘right-hand’ rectangular triplet of vectors into an-
other ‘right-hand’ triplet. This means a transformation given by a Mueller matrix is a
rotation on the Poincaré sphere, also known as an SO(3) transformation (a rotation in
a three-dimensional space).
To describe such a rotation, there are at least two alternative ways. One is to spec-
ify three Euler angles. The other one, which we will follow here, is to specify the axis
of rotation and the angle of rotation. They can be found from the elements of the ma-
trix M. Indeed, a vector transformation (3.36) in the Cartesian space, described by a
3 × 3 matrix M with the elements mij , corresponds to the rotation of every point by an
angle ν about a rotation axis determined by the direction cosines c1 , c2 , c3 . (The rota-
tion axis will be invariant to this transformation.) The rotation angle and the direction
cosines are given by [5]
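\[ \cos\nu = \frac{1}{2}\left(\mathrm{Tr}(M) - 1\right) = \frac{1}{2}\left(m_{11} + m_{22} + m_{33} - 1\right), \]
\[ c_1 = \frac{m_{32} - m_{23}}{2\sin\nu}, \qquad c_2 = \frac{m_{13} - m_{31}}{2\sin\nu}, \qquad c_3 = \frac{m_{21} - m_{12}}{2\sin\nu}. \tag{3.38} \]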
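Equations (3.38) translate directly into code. The following is a minimal sketch, assuming that M is a proper rotation matrix and that the rotation angle is neither 0 nor π (where sin ν vanishes and the formulas for the direction cosines cannot be used); the function name is hypothetical.

```python
import numpy as np

def rotation_axis_angle(M):
    """Rotation angle nu and direction cosines (c1, c2, c3) from Eqs. (3.38).

    M: 3x3 rotation matrix with elements m_ij (1-based in the text,
    0-based here, so m_32 is M[2, 1]).
    """
    M = np.asarray(M, dtype=float)
    cos_nu = 0.5 * (np.trace(M) - 1.0)
    nu = np.arccos(np.clip(cos_nu, -1.0, 1.0))  # clip guards against rounding
    denominator = 2.0 * np.sin(nu)
    if abs(denominator) < 1e-12:
        raise ValueError("axis is undefined by Eqs. (3.38) for nu = 0 or pi")
    axis = np.array([M[2, 1] - M[1, 2],
                     M[0, 2] - M[2, 0],
                     M[1, 0] - M[0, 1]]) / denominator
    return nu, axis
```

As a check, a rotation by 90° about the third axis, with rows (0, −1, 0), (1, 0, 0), (0, 0, 1), gives ν = π/2 and direction cosines (0, 0, 1).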
From Eqs. (3.38) we obtain the angle of rotation and the direction cosines of the rota-
tion axis for a polarization transformation in terms of t, r:
From the definition of the Stokes observables S0 and S1 as the sum and difference of
the intensities in the H and V polarization modes [see Eqs. (3.3) and (3.25)], it is clear
how to measure these values. One should split the input beam in two, polarized hor-
izontally and vertically, and then measure the intensities of both beams. From what
was discussed in Chapter 1, this splitting can be performed with the help of a calcite
crystal, but this method will be considered in detail in Chapter 4. Here we will
describe a more common way, using a polarizing prism. Such a prism, also called a
polarizing beamsplitter, reflects vertically polarized light sideways, in the horizontal
direction, and transmits horizontally polarized light. Different types of polarization prisms will
be considered in Chapter 9; the operation of such a prism is shown in Fig. 3.4. After the
prism, the horizontally and vertically polarized beams are measured by two detectors,
and then, by summing and subtracting their readings, one can measure, respectively,
the instantaneous values of S0 and S1 .
From Eqs. (3.28), one can see that the Stokes observables S2 and S3 can be obtained
in the same way, as long as we can split the beam into two beams with D and A po-
larizations and into two beams with R and L polarizations. The first task is easy: one
has to rotate the polarization prism by 45°. The second task is solved by using the
quarter-wave plate, considered at the end of the previous section. Such a plate transforms
H, V linear polarization states into R, L circular polarization states, respectively,
and vice versa. Then, if it is placed in front of the polarization prism, a light beam ini-
tially polarized right-circularly will become horizontally polarized after the plate and
get transmitted through the prism. Similarly, a light beam polarized left-circularly will
be transformed into vertically polarized light and will be reflected. The two detectors
then will measure the intensities in the R and L polarization modes (Fig. 3.5).
With all four Stokes observables S0,1,2,3 measured, one can calculate the normalized
Stokes vector σ⃗. By averaging over time, the Stokes parameters and the degree
of polarization can be found. In addition, it is interesting to measure the fluctuations
(noise) of the Stokes observables. This is possible provided that the detectors are fast
enough to follow the intensity fluctuations in each output channel of the prism. This
measurement is especially important for quantum optics and will be discussed in detail
in Chapters 11 and 12.
Figure 3.5: A simplified setup for the measurement of Stokes observables S2 (a) and S3 (b).
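The data processing for this measurement can be sketched as follows. This is a minimal illustration, assuming the sign conventions S1 = I_H − I_V, S2 = I_D − I_A, S3 = I_R − I_L and assuming that each detector reading is available as a sequence of time samples; the function name and argument names are hypothetical.

```python
import numpy as np

def stokes_from_intensities(i_h, i_v, i_d, i_a, i_r, i_l):
    """Time-averaged Stokes parameters and degree of polarization.

    Each argument is a sequence of intensity samples recorded in one output
    channel of the prism for the H/V, D/A and R/L settings of the setup.
    Returns (array [S0, S1, S2, S3], degree_of_polarization).
    """
    i_h, i_v, i_d, i_a, i_r, i_l = (
        np.asarray(x, dtype=float) for x in (i_h, i_v, i_d, i_a, i_r, i_l)
    )
    s0 = np.mean(i_h + i_v)   # total intensity
    s1 = np.mean(i_h - i_v)
    s2 = np.mean(i_d - i_a)
    s3 = np.mean(i_r - i_l)
    dop = np.sqrt(s1**2 + s2**2 + s3**2) / s0
    return np.array([s0, s1, s2, s3]), float(dop)
```

For horizontally polarized light of unit intensity, the readings are I_H = 1, I_V = 0 and 1/2 in each of the remaining four channels, which gives the Stokes parameters (1, 1, 0, 0) and a degree of polarization equal to one.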
Although we have only considered the measurement of the three Stokes observables
S1,2,3 and the total intensity S0, it is clear that an arbitrary Stokes observable
Sϑ,φ ≡ S1 cos ϑ + S2 sin ϑ cos φ + S3 sin ϑ sin φ can also be measured. The setup for this
most general Stokes measurement will be considered in Chapter 5.
Bibliography
[1] M. Born and E. Wolf. Principles of optics. Pergamon Press, 1970.
[2] D. Goldstein. Polarized light. CRC, 2003.
[3] J. W. Goodman. Statistical optics. John Wiley and Sons, Inc., 2000.
[4] D. N. Klyshko. Polarization of light: fourth-order effects and polarization-squeezed states. J. Exp.
Theor. Phys., 84:1065–1079, 1997.
[5] G. A. Korn and T. M. Korn. Mathematical handbook for scientists and engineers. Dover
Publications, Inc., 2000.
[6] W. A. Shurcliff. Polarized light. Harvard University Press, 1962.
[7] G. Strang. Linear algebra and its applications. Thomson Brooks-Cole, 2006.
4 Optics of crystals: basic concepts
In this chapter, we will consider the anisotropy of optical materials, leading to bire-
fringence and many other polarization effects that are discussed further in this book.
We will restrict the discussion to linear optics, leaving the nonlinear effects to Chap-
ter 10. In addition, here we will consider light at a fixed wavelength and therefore avoid
the consideration of optical dispersion. We will also ignore magnetic phenomena, by
assuming the matter to be non-magnetic.
We start with the Maxwell equations, assuming that the medium contains no charges
and no currents. In this case, the general equations (2.7) simplify to (we omit the space
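and time arguments for brevity)
\[ \vec{\nabla} \times \vec{E} = -\dot{\vec{B}}, \tag{4.1} \]
\[ \vec{\nabla} \times \vec{H} = \dot{\vec{D}}, \tag{4.2} \]
\[ \vec{\nabla} \cdot \vec{D} = 0, \tag{4.3} \]
\[ \vec{\nabla} \cdot \vec{B} = 0. \tag{4.4} \]
\[ \vec{D} = \epsilon_0 \epsilon \vec{E}, \tag{4.5} \]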
where ϵ = 1 + χ is the dielectric permittivity, also known as the dielectric function, and
χ is the linear susceptibility. We have ignored the nonlinear dependence of the polarization
P⃗ on the electric field E⃗. We will take it into account in Chapter 10, dedicated to
nonlinear polarization optics.
https://doi.org/10.1515/9783110668025-004
4.1 Anisotropy of linear optical properties | 25
Consider now the anisotropy contained in Eq. (4.5). It sets the relation between
two vectors, the electric field and the displacement, through ϵ, which is therefore a
tensor, ϵij . In the matrix form, Eq. (4.5) reads
\[ D_i = \epsilon_0 \sum_{j=1}^{3} \epsilon_{ij} E_j. \tag{4.6} \]
The dielectric tensor is symmetric. Indeed, the energy density of the electric field
can be written as
\[ U_e = \frac{1}{2}\,\vec{D}\cdot\vec{E} = \frac{1}{2}\,\epsilon_0 \sum_{i,j=1}^{3} \epsilon_{ij} E_i E_j. \tag{4.7} \]
In this expression, the indices i, j can be interchanged. This leads to the symmetry of
the dielectric tensor [2]:
\[ \epsilon_{ij} = \epsilon_{ji}. \tag{4.8} \]
For a generic frame of reference i, j, k, there can be six different values of the di-
electric tensor: ϵii , ϵjj , ϵkk , ϵij , ϵik , ϵjk . But because the energy is always positive, (4.7)
is a so-called positive definite quadratic form. It is always possible to diagonalize it,
i. e., to pass to the coordinates x, y, z such that
\[ U_e = \frac{1}{2}\,\epsilon_0 \left(\epsilon_{xx} E_x^2 + \epsilon_{yy} E_y^2 + \epsilon_{zz} E_z^2\right). \tag{4.9} \]
The corresponding coordinate axes x, y, z are called the principal axes. In this
frame of reference, the relation between the displacement and the electric field takes
the simplest form
Di = ϵ0 ϵii Ei . (4.10)
The values ϵxx , ϵyy , ϵzz are called principal dielectric permittivities and denoted simply
as ϵx , ϵy , ϵz . In general, these values can change with the frequency of light and are
therefore called dielectric functions. However, their symmetry should always obey the
symmetry of the crystal [3] (see also Section 4.2).
It follows from Eq. (4.10) that, for an arbitrary field direction, the vectors D⃗ and E⃗ are
parallel only if ϵx = ϵy = ϵz.
Equation (4.9) can be rewritten as
\[ U_e = \frac{1}{2\epsilon_0}\left(\frac{D_x^2}{\epsilon_x} + \frac{D_y^2}{\epsilon_y} + \frac{D_z^2}{\epsilon_z}\right). \tag{4.11} \]
In a plane monochromatic electromagnetic wave, all fields have harmonic space and
time dependence. For the positive-frequency parts, it will be
and similarly for D(+)(r⃗, t), H(+)(r⃗, t), B(+)(r⃗, t). Here, ω is the frequency and k⃗ is the
wavevector.
From this dependence, we can introduce the phase velocity v⃗. It is directed along
the wavevector and its absolute value is
\[ v = \frac{\omega}{k}. \tag{4.13} \]
With the refractive index n defined through the equation
\[ k = \frac{n\omega}{c}, \tag{4.14} \]
the phase velocity is
\[ \vec{v} = \frac{\vec{k}}{k}\,\frac{c}{n}. \tag{4.15} \]
With the time and space dependences of all variables taken into account, the first
two Maxwell equations (4.1) and (4.2) become
\[ \vec{k}\times\vec{E} = \omega\vec{B}, \qquad \vec{k}\times\vec{H} = -\omega\vec{D}. \tag{4.16} \]
At the same time, in a non-magnetic material, B⃗ and H⃗ are parallel. Then, it follows
from Eqs. (4.16) that the three vectors k⃗, E⃗, D⃗ are all in the plane orthogonal to B⃗ and
H⃗ (Fig. 4.1), and in this plane, k⃗ ⊥ D⃗. But because E⃗ and D⃗ are in general not parallel,
E⃗ is in general not orthogonal to k⃗.
At the same time, the Poynting vector, which determines the energy flux, is defined
as [3]
\[ \vec{S} = \vec{E}\times\vec{H}. \tag{4.17} \]
Since S⃗ ⊥ E⃗ and k⃗ ⊥ D⃗, and all four vectors lie in the same plane, the angle between
D⃗ and E⃗ is the same as the angle between k⃗ and S⃗ (Fig. 4.1). This angle α is called the
angle of anisotropy and we calculate it in Section 4.3. Similar to the vectors E⃗, D⃗, k⃗, the
Poynting vector S⃗ is in the plane orthogonal to H⃗ and B⃗.
The total energy density of the electromagnetic wave is twice as large as its electric
energy density (4.7), because its electric and magnetic parts are equal. Therefore,
U = D⃗ ⋅ E.⃗ (4.18)
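The group velocity of an electromagnetic wave is defined as its Poynting vector normalized
to the energy density,
\[ \vec{u} \equiv \frac{\vec{S}}{U} = \frac{\vec{E}\times\vec{H}}{\vec{D}\cdot\vec{E}}. \tag{4.19} \]
Expressing D⃗ from the second of Eqs. (4.16), we find D⃗ ⋅ E⃗ = k⃗ ⋅ S⃗/ω = kS cos α/ω, and therefore
\[ \vec{u} = \frac{\vec{S}}{S}\,\frac{\omega}{k\cos\alpha}. \tag{4.20} \]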
We see that the group velocity differs from the phase velocity only by a factor given
by the cosine of the angle of anisotropy α. There is of course an additional difference
between the velocities of energy and phase propagation, caused by the dispersion.
However, at the moment we do not consider dispersion. For this reason, it is better to
call u⃗ the ray velocity: it has nothing to do with the group of monochromatic waves but
it shows the direction of a ray.
Let us derive the value of the phase velocity for a given direction of the wavevector.
From the first equation of the system (4.16), taking into account that B⃗ = μ0 H⃗, we get
\[ \vec{H} = \frac{1}{\omega\mu_0}\,\vec{k}\times\vec{E}. \tag{4.21} \]
After substituting this expression into the second equation in (4.16) and taking into
account the relation (4.5) between D⃗ and E⃗, we obtain
\[ \frac{1}{\omega^2}\,\vec{k}\times(\vec{k}\times\vec{E}) = -\mu_0\epsilon_0\epsilon\vec{E}. \tag{4.22} \]
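Now, we take into account that μ0 ϵ0 = 1/c² and use the vector algebra rule
\[ \vec{a}\times(\vec{b}\times\vec{c}) = \vec{b}\,(\vec{a}\cdot\vec{c}) - \vec{c}\,(\vec{a}\cdot\vec{b}). \tag{4.23} \]
We then get
\[ \frac{c^2}{\omega^2}\left[\vec{k}(\vec{k}\cdot\vec{E}) - \vec{E}k^2\right] = -\epsilon\vec{E}. \tag{4.24} \]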
Let us write this equation in the frame of reference where the ϵ tensor is diagonal.
For each component i = x, y, z, we obtain
\[ \frac{c^2}{\omega^2}\left[E_i k^2 - k_i \sum_{j=1}^{3} k_j E_j\right] = \epsilon_i E_i. \tag{4.25} \]
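\[ E_i = k_i\,\frac{\sum_j k_j E_j}{k^2 - \frac{\omega^2}{c^2}\epsilon_i}, \tag{4.26} \]
\[ \sum_{i=1}^{3} k_i E_i = \sum_{j=1}^{3} k_j E_j \sum_{i=1}^{3} \frac{k_i^2}{k^2 - \frac{\omega^2}{c^2}\epsilon_i}. \tag{4.27} \]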
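We obtain, after cancelling the common factor ∑j kj Ej, substituting k = nω/c, and using
∑i ki² = k²,
\[ \sum_{i=1}^{3} \frac{k_i^2}{\frac{1}{\epsilon_i} - \frac{1}{n^2}} = 0. \tag{4.31} \]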
This relation is known as the Fresnel equation for wavevectors (sometimes one
says ‘wave normals’). After multiplying (4.31) by the product of all denominators, we
obtain
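\[ k_x^2\left(\frac{1}{\epsilon_y} - \frac{1}{n^2}\right)\left(\frac{1}{\epsilon_z} - \frac{1}{n^2}\right)
 + k_y^2\left(\frac{1}{\epsilon_x} - \frac{1}{n^2}\right)\left(\frac{1}{\epsilon_z} - \frac{1}{n^2}\right)
 + k_z^2\left(\frac{1}{\epsilon_x} - \frac{1}{n^2}\right)\left(\frac{1}{\epsilon_y} - \frac{1}{n^2}\right) = 0. \tag{4.32} \]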
For any direction of k⃗, this is a quadratic equation in n⁻². It follows that, for any
direction of k⃗, there are two values of the refractive index n. This effect is called birefringence,1
or double refraction. For instance, for k⃗ directed along the x axis, the second
and third terms are zero, and the solutions are n = √ϵy and n = √ϵz. In other words,
for any direction of k⃗, there are two possible values of the phase velocity. For instance,
for k⃗ directed along the x axis, the phase velocity can be v = c/√ϵy or v = c/√ϵz.
A similar equation can be derived for the group velocity values [2].
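The statement that Eq. (4.32) is quadratic in n⁻² can be checked numerically. Below is a minimal sketch, assuming the principal permittivities are given as a triple (ϵx, ϵy, ϵz) and the wavevector direction as any nonzero vector; the function name is hypothetical. Writing x = n⁻² and a_i = 1/ϵ_i, Eq. (4.32) for a unit direction vector s becomes x² − x ∑ s_i²(a_j + a_k) + ∑ s_i² a_j a_k = 0.

```python
import numpy as np

def fresnel_indices(eps_principal, direction):
    """Two refractive indices for a given wavevector direction, from Eq. (4.32).

    eps_principal: (eps_x, eps_y, eps_z), principal dielectric permittivities.
    direction: any nonzero vector along k (normalized internally).
    Returns the two values of n in ascending order.
    """
    a = 1.0 / np.asarray(eps_principal, dtype=float)   # a_i = 1 / eps_i
    s = np.asarray(direction, dtype=float)
    s = s / np.linalg.norm(s)
    pairs = [(1, 2), (0, 2), (0, 1)]                   # (j, k) for i = x, y, z
    c1 = -sum(s[i]**2 * (a[j] + a[k]) for i, (j, k) in enumerate(pairs))
    c0 = sum(s[i]**2 * a[j] * a[k] for i, (j, k) in enumerate(pairs))
    x = np.roots([1.0, c1, c0])                        # roots in x = 1 / n^2
    return np.sort(1.0 / np.sqrt(x.real))
```

For illustrative values (ϵx, ϵy, ϵz) = (2.25, 2.56, 2.89) and k⃗ along x, the function returns n = 1.6 and n = 1.7, i.e., √ϵy and √ϵz, as stated above.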
There is a clear visual explanation of the double refraction. Indeed, for a fixed electric
energy of the wave, Eq. (4.11) defines an ellipsoid,
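\[ \frac{D_x^2}{\epsilon_x} + \frac{D_y^2}{\epsilon_y} + \frac{D_z^2}{\epsilon_z} = \mathrm{const}. \tag{4.33} \]
\[ \frac{x^2}{\epsilon_x} + \frac{y^2}{\epsilon_y} + \frac{z^2}{\epsilon_z} = \mathrm{const}. \tag{4.34} \]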
This surface is shown in red in Fig. 4.2. The ellipsoid has three planes of symmetry
but it is not necessarily an ellipsoid of rotation.
Let us choose a direction of the wavevector k⃗; the displacement vector D⃗ lies in the
plane orthogonal to it. This plane intersects the ellipsoid (4.34) along an ellipse,
because a cross-section of an ellipsoid is always an ellipse. This ellipse is shown in
orange in Fig. 4.2. One can prove [2] that the large and small semi-axes of this ellipse
are along the two possible directions of D⃗ (denoted D⃗′ and D⃗″ in the figure). These are
orthogonal directions. In addition, the two possible values of the phase velocity scale
as the inverse lengths of these semi-axes:
We arrive at an important conclusion: for the two possible values of the refractive
index, the directions of displacement D⃗ are orthogonal. The three vectors D⃗′, D⃗″, and
k⃗ form an orthogonal triplet of vectors.
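The geometric construction with the ellipsoid can also be carried out numerically. The sketch below is an illustration under stated assumptions: the ellipsoid (4.34) is taken with the constant equal to one, so that the semi-axes of its cross-section are the refractive indices, and the cross-section is found by diagonalizing the inverse dielectric tensor projected onto the plane orthogonal to k⃗. The function name is hypothetical.

```python
import numpy as np

def normal_modes(eps_principal, direction):
    """Refractive indices and D directions from the ellipsoid cross-section.

    eps_principal: (eps_x, eps_y, eps_z) in the principal axes.
    direction: any nonzero vector along k.
    Returns (indices, d_directions): two refractive indices and the two
    corresponding unit vectors along D (rows of d_directions).
    """
    s = np.asarray(direction, dtype=float)
    s = s / np.linalg.norm(s)
    eta = np.diag(1.0 / np.asarray(eps_principal, dtype=float))
    proj = np.eye(3) - np.outer(s, s)       # projector onto the plane orthogonal to k
    values, vectors = np.linalg.eigh(proj @ eta @ proj)
    # eigh returns eigenvalues in ascending order; the smallest one (close to
    # zero) belongs to s itself, the other two equal 1/n^2 for the two waves.
    indices = 1.0 / np.sqrt(values[1:])
    d_directions = vectors[:, 1:].T
    return indices, d_directions
```

Because the projected matrix is symmetric, its eigenvectors are mutually orthogonal, which reproduces the conclusion that the two directions of D⃗ and the wavevector form an orthogonal triplet.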
One can also prove, in a similar way, that the two possible polarization directions,
E⃗′ and E⃗″, and the Poynting vector S⃗ form an orthogonal triplet of vectors as well.
Optic axis or optic axes. For any ellipsoid, there are two circular cross sections
passing through its center. (For an ellipsoid of rotation, these two sections coincide.)
The wavevectors normal to these circular cross sections have the property that for
them, D′ = D″; hence n′ = n″. Then there is no birefringence along these directions of
k⃗, and this is the definition of an optic axis. Depending on the symmetry of the crystal,
there can be one or two optic axes.
A more difficult task is to find the following surface: for each direction of k⃗ one plots
two wavevectors, having the two possible lengths: k′ = n′ω/c and k″ = n″ω/c. This
surface, often called the Fresnel surface, has, in the general case, a more complicated
shape than an ellipsoid (4.34). In the next section we consider it separately for different
types of crystals.
Let us find, for a given direction of the wavevector k⃗, the two possible values of the
phase velocity v or the refractive index n. For this purpose, we will use Eq. (4.31). After
substituting v = c/n and denoting vi² ≡ c²/ϵi, i = x, y, z, we get
This dependence determines the so-called normal surface: for each direction of
the wavevector two values of v are plotted. Indeed, we have shown in the previous
section that there are two solutions. This surface is more complicated than an ellipsoid
and it consists of two shells. In terms of the refractive index n, it defines the Fresnel
surface.
Similarly, one can derive the equation for the ray surface. It takes the form
\[ \frac{S_x^2}{u^{-2} - u_x^{-2}} + \frac{S_y^2}{u^{-2} - u_y^{-2}} + \frac{S_z^2}{u^{-2} - u_z^{-2}} = 0, \tag{4.38} \]
and it gives the two possible values of the group velocity u for a given direction of the
Poynting vector S⃗.
Crystals used in linear and nonlinear optics are categorized into different classes and
groups, depending on the symmetry of their elementary cell. There are seven crystal
systems, namely, cubic, hexagonal, trigonal, tetragonal, orthorhombic, monoclinic,
and triclinic, each of them containing several crystal symmetry classes [6, 7]. A sym-
metry class involves crystals whose symmetry elements (center of symmetry, mirror
planes, 1-, 2-, 3-, 4-, 6-fold rotation axes or 1-, 2-, 3-, 4-, 6-fold inversion axes) belong
to a certain point group. Cubic crystals are most symmetric, and triclinic crystals are
least symmetric. There are 32 crystal symmetry classes in total, listed in Table 4.1. Here
we use the so-called Hermann–Mauguin notation showing the symmetry elements of
the crystals. For example, a cubic crystal of class 23 has 2-fold rotation symmetry axes
parallel to the cube axes and 3-fold rotation symmetry axes parallel to the major diago-
nals of the cube. A cubic crystal of class 432 has 4-fold rotation symmetry axes parallel
to the cube axes, 3-fold rotation symmetry axes parallel to the major diagonals of the
cube, and 2-fold rotation symmetry axes parallel to the diagonals of the cube faces.
For trigonal class 3m crystals, there is a 3-fold axis and three mirror planes parallel to
it. A detailed explanation of this notation can be found in Ref. [6].
According to Neumann’s principle, the symmetry of crystal physical properties in
general, and optical properties in particular, should include the elements of the ele-
mentary cell symmetry. In full accordance with this statement, the optical properties
of different crystal systems correspond to their symmetry; for example, cubic crystals
are expected to have most symmetric optical properties.
Table 4.1: Crystal systems and their symmetry classes (columns: System, Classes).
Depending on the symmetry, from the viewpoint of their linear optical properties, crys-
tals can be isotropic, uniaxial, and biaxial.
The group of isotropic crystals only includes crystals with cubic structure. Their
optical properties are the same as for isotropic materials like gases, liquids, or glasses.
For such crystals, ϵx = ϵy = ϵz ≡ ϵ0 . In this case, the Fresnel equation (4.31) has only
one solution, n = √ϵ0 , and there is no birefringence.
Uniaxial crystals are less symmetric; to this class belong crystals of hexagonal,
trigonal, and tetragonal systems. They have two different values of the dielectric per-
mittivity, ϵx = ϵy ≡ ϵ0 and ϵz . In this case, the Fresnel equation (4.31), after multipli-
cation by the product of the denominators, becomes
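\[ \left[\frac{1}{\epsilon_0} - \frac{1}{n^2}\right]\left[(k_x^2 + k_y^2)\left(\frac{1}{\epsilon_z} - \frac{1}{n^2}\right) + k_z^2\left(\frac{1}{\epsilon_0} - \frac{1}{n^2}\right)\right] = 0. \tag{4.39} \]
\[ \frac{1}{\epsilon_0} - \frac{1}{n^2} = 0, \]
\[ \sin^2\vartheta\left(\frac{1}{\epsilon_z} - \frac{1}{n^2}\right) + \cos^2\vartheta\left(\frac{1}{\epsilon_0} - \frac{1}{n^2}\right) = 0. \tag{4.41} \]
\[ n = \sqrt{\epsilon_0}, \]
\[ n = \left(\frac{\sin^2\vartheta}{\epsilon_z} + \frac{\cos^2\vartheta}{\epsilon_0}\right)^{-1/2}. \tag{4.42} \]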
These two solutions represent the Fresnel surface, shown in Fig. 4.3. The surface consists
of two shells. The first one is a sphere (shown in red); for any direction of k⃗ there is
the same value of the refractive index. The corresponding wave is the ordinary wave.
The second one is an ellipsoid of rotation (orange in Fig. 4.3), which has a circle in
the xy-cut and an ellipse in any cut passing through the z axis (because the refractive
index depends only on the angle ϑ between k⃗ and the z axis). This wave is called the
extraordinary wave. For a wavevector along the z axis, there is just a single value of
the refractive index, n0 = √ϵ0 . According to the definition above, z is then the optic
axis, i. e., the direction along which the two shells of the Fresnel surface intersect.
Figure 4.3: Two shells of the Fresnel surface for a uniaxial crystal, a
sphere and an ellipsoid of rotation. The two possible polarization
directions are shown in blue.
In other words, in a uniaxial crystal, there are two possible waves (solutions to the
Maxwell equations) for any direction of the wavevector k⃗. One of them, the ordinary
wave, has the refractive index independent of the k⃗ direction. For the other one, the
extraordinary wave, the refractive index depends on the k⃗ direction. Along the optic
axis z, both refractive indices coincide.
The directions of the electric field E⃗ for the ordinary and extraordinary waves are
shown in Fig. 4.3 by a blue dot and a blue arrow, respectively. The ordinary wave is
polarized orthogonally to the plane containing the wavevector and the optic axis. For
the extraordinary wave, the electric field vector lies in the (z⃗, k⃗) plane but it is not
orthogonal to k⃗. Indeed, one can show that the direction of the electric field vector
is always tangent to the corresponding Fresnel surface [2]. For the ellipsoid shell of
the Fresnel surface in Fig. 4.3, the tangent is not orthogonal to k⃗. This is in agreement
with Fig. 4.1: in the general case, the electric field vector is not orthogonal to the
wavevector.
Sometimes, n0 is called the ordinary refractive index, no . The other refractive in-
dex is called the extraordinary one, ne ≡ √ϵz . A more commonly used form for the
refractive index of the extraordinary wave is
\[ \frac{1}{n^2} = \frac{\sin^2\vartheta}{n_e^2} + \frac{\cos^2\vartheta}{n_o^2}. \tag{4.43} \]
Figure 4.3 shows a situation where no > ne . This type of uniaxial crystal is called neg-
ative; in the opposite case, no < ne , the crystal is called positive.
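Equation (4.43) and the sign convention are easy to evaluate numerically. The following is a minimal sketch; the function names are hypothetical, and the calcite values used in the check (no ≈ 1.658, ne ≈ 1.486 in the visible range) are commonly quoted approximate numbers, taken here as an assumption rather than from the text.

```python
import math

def extraordinary_index(theta, n_o, n_e):
    """Refractive index of the extraordinary wave from Eq. (4.43).

    theta: angle between the wavevector and the optic axis, in radians.
    """
    inv_n2 = math.sin(theta)**2 / n_e**2 + math.cos(theta)**2 / n_o**2
    return 1.0 / math.sqrt(inv_n2)

def crystal_sign(n_o, n_e):
    """Classify a uniaxial crystal as 'negative' (n_o > n_e) or 'positive'."""
    if n_o > n_e:
        return "negative"
    if n_o < n_e:
        return "positive"
    return "isotropic"
```

Along the optic axis (ϑ = 0) the function returns no, so both waves have the same index, while for ϑ = π/2 it returns ne.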
The case of biaxial crystals is the most general one. Biaxial crystals are those be-
longing to triclinic, monoclinic, and orthorhombic systems. For a biaxial crystal, all
values of the dielectric tensor are different, ϵx ≠ ϵy ≠ ϵz , and the Fresnel equation
has the most general form. In this case, one can say that there are two different shells
of the Fresnel surface, i. e., two possible solutions to the Fresnel equation (4.31), as
we have shown in Section 4.1.3. The analysis of the two Fresnel surfaces shows that
they only intersect at four points, which of course should be symmetric with respect
to the crystal axes. For this reason, these four points should lie in one of the principal
planes, i. e., one of the planes where ki = 0, i = x, y, z.
To find these points, consider such a plane. In this case Eq. (4.32) simplifies. For
instance, for the plane orthogonal to z, we have either
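\[ \frac{1}{\epsilon_z} - \frac{1}{n^2} = 0, \tag{4.44} \]
or
\[ k_x^2\left(\frac{1}{\epsilon_y} - \frac{1}{n^2}\right) + k_y^2\left(\frac{1}{\epsilon_x} - \frac{1}{n^2}\right) = 0. \tag{4.45} \]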
The first solution defines a circle for the possible values of the refractive index n in the
plane x, y: n = √ϵz regardless of the direction of k⃗. For the second solution, we get the
equation of an ellipse,
\[ \frac{1}{n^2} = \frac{1}{k^2}\left(\frac{k_x^2}{\epsilon_y} + \frac{k_y^2}{\epsilon_x}\right). \tag{4.46} \]
Both solutions (4.44) and (4.46), i. e., the sections of the Fresnel surfaces by the
plane x, y, are shown in Fig. 4.4 by red and orange lines, respectively. The circle has
the radius nz ≡ √ϵz and the ellipse, semi-axes nx ≡ √ϵx and ny ≡ √ϵy . The figure also
shows the two polarization directions (blue arrow and blue dot), i. e., the directions of
the electric field E⃗. For the circle, the electric field is orthogonal to the x, y plane and,
in particular, to the k⃗ vector, as it should be for an ordinary wave. For the ellipse, light
is polarized in the x, y plane (blue arrow); the electric field vector is not orthogonal
Figure 4.4: Fresnel surfaces for a biaxial crystal with ϵy < ϵz <
ϵx : cross-section by the xy plane. The two possible polarization
directions are shown in blue.
4.2 Optical types of crystals | 35
Figure 4.5: Fresnel surfaces for a biaxial crystal with ϵy < ϵz <
ϵx : cross-section by the xz plane. The two possible polarization
directions are shown in blue.
Similarly, the third principal plane (not shown) will also intersect with the Fresnel
surfaces along a circle and an ellipse; in this case the circle will have the radius nx
and lie outside of the ellipse, which will have the semi-axes ny and nz .
We can now combine this information about the three principal planes, each one
containing an ellipse and a circle, and draw a three-dimensional plot of the Fresnel
surface. A single octant of this surface is shown in Fig. 4.6. In each plane, there is a
circle (red) and an ellipse (orange). As we have just seen, there is only a single intersection
point within this octant, belonging to the x, y plane. There are three more
intersection points in the same plane, placed symmetrically with respect to the x and y
axes. These four points define the two optic axes, one of them shown by a green dashed
line in the figure.
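The position of the intersection points, and hence the direction of the optic axes, follows from Eqs. (4.44) and (4.46). A short worked derivation is given below, assuming the ordering ϵy < ϵz < ϵx of Fig. 4.4 and denoting by φ (a symbol introduced here only for this illustration) the angle between k⃗ and the x axis.

```latex
% On the circle (4.44), n^2 = \epsilon_z. Substituting this into the
% ellipse equation (4.46) with k_x = k\cos\varphi, k_y = k\sin\varphi:
\frac{1}{\epsilon_z} = \frac{\cos^2\varphi}{\epsilon_y} + \frac{\sin^2\varphi}{\epsilon_x}
\quad\Longrightarrow\quad
\tan^2\varphi = \frac{1/\epsilon_y - 1/\epsilon_z}{1/\epsilon_z - 1/\epsilon_x}
              = \frac{\epsilon_x(\epsilon_z - \epsilon_y)}{\epsilon_y(\epsilon_x - \epsilon_z)} .
```

For ϵy < ϵz < ϵx the right-hand side is positive, so there are four solutions, ±φ and π ± φ. They correspond to the four intersection points and to two optic axes lying in the x, y plane.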
Figure 4.6 also shows the polarization directions on each shell of the Fresnel sur-
face. In the principal planes, the segments of the circle and the ellipse have normal
and tangential polarization directions, respectively. Outside of the principal planes,
polarization directions are shown on the inner Fresnel surface by red lines.
After discussing linear birefringence in the previous section, we might ask a natural
question: are there cases where the normal waves are elliptically or circularly polar-
ized?
Indeed, it turns out that materials where birefringence is elliptical or circular also
exist. It can take place in certain anisotropic crystals, such as uniaxial or biaxial crys-
tals considered above. In such crystals, called gyrotropic, or optically active (this term
will be clear from what follows), even in the direction of the optic axis there are still
two values of the refractive index n. The two corresponding solutions to the Maxwell
equations yield complex values of the analytic signal, i. e., they represent right- and
left-hand circularly polarized light.
The theory of this phenomenon is based on the spatial dispersion, i. e., the dependence
of the dielectric tensor on the wavevector, ϵ = ϵ(k⃗) [1]. Here we will not consider
this theory, but only mention some examples of circular birefringence and discuss its
physical consequences.
An example of a gyrotropic crystal is quartz [5]. It is a uniaxial crystal and for di-
rections of the wavevector that are far from the optic axis it behaves as described in
Section 4.2.2. The difference of the refractive indices for ordinary and extraordinary
waves, no − ne, is on the order of 10⁻². Meanwhile, if the k⃗ vector is directed along
the optic axis, there exist, instead of a single linearly polarized wave with the refractive
index no, two circularly polarized normal waves with refractive indices nR (for the
right-hand polarization) and nL (left-hand polarization). The corresponding circular
birefringence nR − nL is on the order of 10⁻⁴, much smaller than the linear birefringence
no − ne. In the directions close to the optic axis but not coinciding with it, normal waves
are polarized elliptically. In terms of the Fresnel surfaces, optical activity means that
the sphere and the ellipsoid do not touch in the direction of the optic axis, but there
is a tiny gap between them [7].
Circular birefringence leads to the effect of optical activity, i. e., polarization
rotation—and this is why gyrotropic crystals are also called optically active. A lin-
early polarized wave entering such a crystal will split into two circularly polarized
waves, whose refractive indices, and therefore phase velocities, differ. During the
propagation in the crystal, they acquire different phases and at the output, the plane
of polarization is rotated. For instance, if linearly polarized light with the wavelength
500 nm is sent along the optic axis of a 3 mm quartz crystal, the plane of polarization
will rotate by about 90°. This effect and the elements based on it will be considered in
Chapter 5.
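The numbers in this example can be related to the circular birefringence. The sketch below assumes the standard relation, not written out in this section, that the two circular components acquire a phase difference 2πL|nR − nL|/λ over a length L and that the plane of polarization rotates by half of this phase difference. The function names are hypothetical.

```python
import math

def rotation_angle_deg(delta_n, length, wavelength):
    """Polarization rotation (degrees) for circular birefringence delta_n.

    delta_n: |n_R - n_L|; length and wavelength in the same units.
    Assumes rotation = pi * length * delta_n / wavelength (radians).
    """
    return math.degrees(math.pi * length * delta_n / wavelength)

def circular_birefringence(angle_deg, length, wavelength):
    """Inverse relation: |n_R - n_L| needed for a given rotation angle."""
    return math.radians(angle_deg) * wavelength / (math.pi * length)
```

With a rotation of 90° over 3 mm at 500 nm, `circular_birefringence(90, 3e-3, 500e-9)` gives about 8 × 10⁻⁵, consistent with the order of magnitude 10⁻⁴ quoted above for quartz.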
Circular birefringence also occurs in liquids. A well-known example is sugar solu-
tion, but many organic substances have the same property. In such substances, each
molecule is chiral, i. e., it cannot be superposed with its mirror image. An example
of a chiral object is a bolt with a thread: it is either left- or right-handed, and the two
versions cannot be superposed. Similarly, a chiral molecule can be either right- or left-
handed. The two different versions (enantiomers) do not coexist in the same solution.
The circular birefringence of organic liquids has the following explanation. For
each molecule, the response to an electric field is given by the polarizability γ: the dipole
moment d⃗ induced by a field E⃗ is
\[ \vec{d} = \gamma\vec{E}. \tag{4.47} \]
The polarizability of a molecule is therefore a tensor, but it is its effective value that
matters for a given electric field direction. For a molecule of a certain handedness,
the effective polarizability differs depending on whether the incident light is right- or
left-circularly polarized: γR ≠ γL .
The dielectric permittivity ϵ of the liquid depends on the polarizability of a single
molecule according to the Lorentz–Lorenz law:
\[ \frac{\epsilon - 1}{\epsilon + 2} = \frac{1}{3}N\gamma, \tag{4.48} \]
where N is the density of molecules. It follows that the dielectric permittivity also has
different values ϵR,L for right- and left-circular polarization of the input light. The re-
sulting circular birefringence nR − nL of a liquid depends on the chiral properties of
each molecule and also scales with the concentration of the molecules.
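Equation (4.48) can be solved for ϵ explicitly: with x = Nγ/3, one gets ϵ = (1 + 2x)/(1 − x). The following minimal sketch uses this to estimate the circular birefringence of a liquid. It assumes that N and γ are expressed in units in which the product Nγ is dimensionless, as in Eq. (4.48), and that n = √ϵ; the function names and the numerical values in the check are purely illustrative.

```python
import math

def permittivity(number_density, polarizability):
    """Dielectric permittivity from the Lorentz-Lorenz law, Eq. (4.48)."""
    x = number_density * polarizability / 3.0
    if not 0.0 <= x < 1.0:
        raise ValueError("N * gamma / 3 must lie in [0, 1)")
    return (1.0 + 2.0 * x) / (1.0 - x)

def liquid_circular_birefringence(number_density, gamma_r, gamma_l):
    """n_R - n_L for effective polarizabilities gamma_R and gamma_L."""
    n_r = math.sqrt(permittivity(number_density, gamma_r))
    n_l = math.sqrt(permittivity(number_density, gamma_l))
    return n_r - n_l
```

For equal polarizabilities the birefringence vanishes, and for a small difference γR − γL it grows with the density N, in line with the statement that the effect scales with the concentration of the molecules.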
The effect of polarization rotation also depends on these parameters. Using the
example of sugar, one can say that its circular birefringence is even smaller than in
quartz. Correspondingly, the effect of polarization rotation is weaker: for rotating the
polarization of visible light by 90°, one needs about 20 cm of saturated sugar solution.
Interestingly, in contrast to organic liquids, the optical activity of quartz is not
caused by the structure of its molecules but by the crystalline structure. For instance,
amorphous quartz (fused silica) does not manifest optical activity.
Liquid crystals are liquids that, similarly to solid crystals, manifest anisotropy. The
reason for this is that their molecules are elongated.2 For nematic and smectic liquid
crystals, the molecules are axially symmetric, but in some cases (chiral, or twisted,
liquid crystals) the molecules also exhibit chirality.
For such a stretched molecule, the polarizability tensor γ [see Eq. (4.47)] has at
least two different components in the frame of reference of the molecule. In other
words, for light polarized parallel to the long axis of the molecule (called director)
and orthogonal to it, the response of the molecule is different.
Moreover, due to this property, the molecules can be oriented by an external static
electric field. If all molecules have their directors parallel, the macroscopic optical
properties are similar to the microscopic ones. Then the dielectric tensor of the whole
liquid is similar to the polarizability tensor of a single molecule. For such a liquid, one
can solve the Fresnel equation and find the Fresnel surfaces. For instance, in the case
of a nematic or smectic liquid crystal, the Fresnel surfaces will be axially symmetric:
a sphere and an ellipsoid, like for a uniaxial crystal. The optic axis direction will be
given by the director of each molecule.
If light with wavevector k⃗ is incident on a cuvette with such a liquid, there will
be two normal waves, ordinary and extraordinary. For the extraordinary wave, the
refractive index will depend on the angle between k⃗ and the directors of the molecules.
By varying the external static electric field, one can re-orient the molecules and thus
change the refractive index.
These properties of liquid crystals are used in spatial light modulators and other
beam-shaping devices, which will be considered in Chapters 7 and 9.
2 Sometimes one says that the molecules are cigar-shaped but this is a simplification.
4.3 Walk-off effects | 39
For the ordinary wave, the Fresnel surface is a sphere, and the electric field is
orthogonal to the wavevector. But for the extraordinary wave, whose Fresnel surface
is an ellipsoid, this is not the case: the tangent to the ellipsoid is only orthogonal to the
wavevector if the latter is directed along the optic axis or orthogonally to it (Fig. 4.3).
For a biaxial crystal, there are also directions of the wavevector k⃗ for which the electric
field for the extraordinary wave is orthogonal to it (Fig. 4.5) but this is not the general
case.
Consider now the Poynting vector (4.17), which is orthogonal to the electric field
and, according to Fig. 4.1, lies in the (k,⃗ E)⃗ plane. It follows that, for the ordinary wave
in a uniaxial crystal, as well as for certain parts of the Fresnel surface in a biaxial crys-
tal (see Fig. 4.6), the Poynting vector is parallel to the wavevector. This, in its turn,
means that the group (ray) velocity is parallel to the phase velocity: the energy prop-
agates in the same direction as the phase of the wave.
In contrast, for the extraordinary wave in a uniaxial crystal, unless the wavevector
is parallel or orthogonal to the optic axis, the group velocity u⃗ is directed differently
from the phase velocity v.⃗ The angle between them, as mentioned before, is called the
angle of anisotropy. This effect is called spatial walk-off , because the energy of the
wave ‘walks off’ its phase propagation direction. The same walk-off effect takes place
for almost all directions of the wavevector in a biaxial crystal (Fig. 4.6).
We can now explain the effect described in Chapter 1, i. e., the splitting of an image
seen through a calcite crystal. If an unpolarized or arbitrarily polarized beam enters
a calcite crystal (Fig. 4.7) at some angle 0 < θ < 90° to the optic axis, shown by a
black dashed line, it splits into two beams: the ordinary one and the extraordinary one.
Inside the crystal, the ordinary beam is parallel to the k⃗ vector and therefore to its
direction outside. But the extraordinary beam is tilted by the anisotropy angle α inside
the crystal. Outside the crystal, there is no anisotropy and both beams are parallel
again, but now they are shifted from each other by a distance d = L tan α, where L
is the crystal length. If another crystal, with the same orientation, is placed after the
first one, the shift increases. If the optic axis of the other crystal is tilted in the opposite
direction, the shift is eliminated.
Figure 4.7: Spatial walk-off. The incoming beam has non-zero components of the polarization both
perpendicular and parallel to the plane of the figure. If another crystal with a parallel optic axis is
placed after the first one, the shift between the ordinary and extraordinary beams increases.
Calcite is a negative uniaxial crystal, with the refractive indices differing by about
15 %; for instance, at the wavelength 532 nm the ordinary refractive index is no = 1.66
and the extraordinary one, ne = 1.49. This large birefringence leads to a very stretched
Fresnel ellipsoid for the extraordinary wave (Fig. 4.3).
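Taking into account that the electric field of the extraordinary wave is along the
tangent to the ellipse in Fig. 4.3, we can find the anisotropy angle α from the condition
$$ n \tan\alpha = \frac{\partial n}{\partial\theta}. \qquad (4.49) $$
$$ \frac{\partial n}{\partial\theta} = -\frac{\sin(2\theta)\, n^3}{2}\left(\frac{1}{n_e^2} - \frac{1}{n_o^2}\right). $$
As a result,
$$ \tan\alpha = -\frac{\sin(2\theta)\, n^2}{2}\left(\frac{1}{n_e^2} - \frac{1}{n_o^2}\right). \qquad (4.50) $$
$$ \tan(\alpha \pm \theta) = \pm\frac{n_o^2}{n_e^2}\tan\theta, \qquad (4.51) $$
where the upper signs are for negative crystals and lower signs are for positive crystals.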
For the ordinary wave, α = 0 and the Poynting vector is along the wavevector.
This is why the splitting of ordinary and extraordinary beams appears. As is clear from
Eqs. (4.50), (4.51), it is most pronounced for θ close to 45°. The anisotropy angle is also
called the walk-off angle.
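As a rough numerical illustration, the sketch below evaluates Eq. (4.50) for calcite with the indices quoted above and converts the result into a beam shift d = L tan α. It assumes the index-ellipsoid relation 1/n² = cos²θ/n_o² + sin²θ/n_e² for n(θ), which reproduces the derivative ∂n/∂θ quoted above; the function names and the 10 mm crystal length are hypothetical choices made only for this example.

```python
import math

def extraordinary_index(theta, n_o, n_e):
    """n(theta) of the extraordinary wave (assumed index-ellipsoid relation)."""
    inv_n2 = math.cos(theta) ** 2 / n_o ** 2 + math.sin(theta) ** 2 / n_e ** 2
    return 1.0 / math.sqrt(inv_n2)

def tan_walkoff(theta, n_o, n_e):
    """tan(alpha) exactly as written in Eq. (4.50), including its sign."""
    n = extraordinary_index(theta, n_o, n_e)
    return -0.5 * math.sin(2 * theta) * n ** 2 * (1 / n_e ** 2 - 1 / n_o ** 2)

n_o, n_e = 1.66, 1.49          # calcite at 532 nm, values from the text
theta = math.radians(45.0)     # angle between the wavevector and the optic axis
alpha = abs(math.atan(tan_walkoff(theta, n_o, n_e)))  # magnitude of the walk-off angle

crystal_length_mm = 10.0       # hypothetical crystal length L
shift_mm = crystal_length_mm * math.tan(alpha)        # d = L tan(alpha)

print(f"walk-off angle: {math.degrees(alpha):.2f} deg, shift: {shift_mm:.2f} mm")
```

For these numbers the magnitude of α also satisfies Eq. (4.51) with the upper signs, tan(α + θ) = (n_o²/n_e²) tan θ, which is a convenient consistency check.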
We have shown that spatial walk-off appears due to the non-collinearity of the
phase and group velocity vectors. Similarly, the temporal walk-off effect is caused by the
difference between the absolute values of the group and phase velocities. This happens
due to dispersion, i. e., the dependence of the dielectric tensor on the frequency,
which leads to the energy of an optical pulse propagating with a velocity different
from that of its phase. However, in this chapter we neglect the ϵ(ω) dependence.
Still, according to Eq. (4.20), even in the absence of dispersion, the group and phase
velocities differ by a factor cos α. This factor appears due to the non-collinearity of u⃗
and v⃗.
Spatial walk-off can be used for measuring the Stokes observables in a way different
from the one described in Section 3.3.4. This method is shown in Fig. 4.8. The
light beam under study is sent to a strongly birefringent crystal (usually, calcite) with
the optic axis oriented at 45°. Light polarized orthogonally to the plane formed by the
wavevector and the optic axis (in the figure, horizontally) will be the ordinary beam
and propagate straight, while light polarized vertically (the extraordinary beam) will
be shifted. The two output beams are registered by a camera or two detectors.
The zeroth and first Stokes observables can be measured as the sum and difference
of the two intensities, respectively:
$$ S_{0,1} = I_H \pm I_V. \qquad (4.52) $$
With a 45° rotation, the same setup can be used to measure the second Stokes
observable S2. In order to measure the third Stokes observable S3, one needs a
quarter-wave plate in front of the setup. Its operation has already been mentioned in Chapter 3
but it will be considered in more detail in Chapter 5.
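Equation (4.52) can be sketched in a few lines. The function name and the two detector readings below are hypothetical and serve only as an illustration.

```python
def stokes_01(i_h, i_v):
    """Return (S0, S1) from the intensities of the two output beams, Eq. (4.52)."""
    return i_h + i_v, i_h - i_v

# Hypothetical detector readings in arbitrary units.
s0, s1 = stokes_01(0.75, 0.25)
normalized_s1 = s1 / s0   # S1 divided by the total intensity

print(s0, s1, normalized_s1)
```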
Bibliography
[1] V. M. Agranovich and V. L. Ginzburg. Crystal optics with allowance for spatial dispersion and
exciton theory. Sov. Phys. Usp., 5(2):323–346, feb 1962.
[2] M. Born and E. Wolf. Principles of optics. Pergamon Press, 1970.
[3] R. W. Boyd. Nonlinear optics. Academic Press, 2008.
[4] V. G. Dmitriev, G. G. Gurzadyan, and D. N. Nikogosyan. Handbook of nonlinear crystals. Springer,
1999.
[5] E. Hecht. Optics. Pearson, 2017.
[6] C. Malgrange, C. Ricolleau, and M. Schlenker. Symmetry and physical properties of crystals.
Springer, 2014.
[7] J. F. Nye. Physical properties of crystals. Clarendon Press, 1985.
5 Polarization transformations
In this chapter we will consider optical elements that perform polarization transfor-
mations. We will use both the Jones vector and the Stokes vector formalisms, described
in detail in Chapter 3. Here we will only consider lossless elements like waveplates and
polarization rotators. Accordingly, polarizers of various types will not be the subject
of this chapter but will be considered in Chapter 9.
5.1 Phase (retardation) plates
Sometimes, when describing a waveplate, one uses the terms ‘fast axis’ and ‘slow
axis’. The fast axis is the polarization direction for which the normal wave has a larger
phase velocity; the orthogonal direction is the slow axis. Usually, phase plates are
made of quartz, which is a positive crystal (no < ne), and the ordinary wave has a
larger phase velocity than the extraordinary wave. Therefore the optic axis of a quartz
phase plate coincides with its slow axis.
Let us calculate [1] the transformation that a phase plate of thickness l performs
on the input Jones vector
$$ \vec{e} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \qquad (5.1) $$
We assume that the optic axis is at an angle χ to the horizontal axis (Fig. 5.1).
For convenience, let us first pass to the Jones vector basis associated with the
plate. That is, from the original (‘laboratory’) basis (e⃗H , e⃗V ) we pass to the one formed
by the Jones vector corresponding to the linear polarization along the optic axis, e⃗z ,
and the one orthogonal to it, e⃗⊥ .
This new basis is obtained by acting on the laboratory basis with the rotation matrix A,
where
$$ A = \begin{pmatrix} \cos\chi & -\sin\chi \\ \sin\chi & \cos\chi \end{pmatrix}. \qquad (5.3) $$
Then (see Section 3.2.3) the input Jones vector in the eigenbasis of the plate has
the components
$$ \begin{pmatrix} \alpha_0 \\ \beta_0 \end{pmatrix} = A^{+} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \qquad (5.4) $$
where we use the subscript ‘0’ to stress that the Jones vector is written in the eigenbasis
of the plate.
Inside the plate, the evolution of the Jones vector is trivial. The ordinary wave
acquires a phase shift ko l, where the wavevector ko = 2πno/λ, and λ is the wavelength.
Accordingly, the extraordinary wave acquires a phase shift ke l, ke = 2πne/λ. As a result,
the Jones vector components after the plate become (still written in the eigenbasis of
the plate)
$$ \begin{pmatrix} \alpha_0' \\ \beta_0' \end{pmatrix} = J_0 \begin{pmatrix} \alpha_0 \\ \beta_0 \end{pmatrix}, \qquad (5.5) $$
and the Jones matrix of the plate in its eigenbasis has the form
$$ J_0 = \begin{pmatrix} e^{i\delta} & 0 \\ 0 & e^{-i\delta} \end{pmatrix}, \qquad (5.6) $$
where δ ≡ π(ne − no)l/λ and we omitted the overall phase of J0, π(ne + no)l/λ, to make
it an SU(2) matrix. (This phase does not change the Jones vector transformation.) As
one would expect, the evolution of the Jones vector in the eigenbasis of the plate is
described by a diagonal matrix.
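Then, passing back to the laboratory basis, we get
$$ \begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} = (A^{+})^{-1} \begin{pmatrix} \alpha_0' \\ \beta_0' \end{pmatrix}. \qquad (5.7) $$
It follows that the transformation performed by the phase plate is written in the
laboratory frame as
$$ \begin{pmatrix} \alpha' \\ \beta' \end{pmatrix} = J_p \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \qquad (5.8) $$
where
$$ J_p = (A^{+})^{-1} J_0 A^{+} = \begin{pmatrix} \cos\delta + i\sin\delta\cos(2\chi) & i\sin\delta\sin(2\chi) \\ i\sin\delta\sin(2\chi) & \cos\delta - i\sin\delta\cos(2\chi) \end{pmatrix}. \qquad (5.10) $$
This expression can be written in the general form (3.23) of an SU(2) matrix, with
t = cos δ + i sin δ cos(2χ) and r = i sin δ sin(2χ).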
A generic Jones matrix (3.23) is given by two complex numbers t, r, which boil
down, taking into account the condition |t|² + |r|² = 1, to three real parameters. From
the structure of the Jones matrix (5.10) for a phase plate, we see that it has only two
parameters, the phase δ and the angle χ. Therefore, a general type of polarization
transformation cannot be realized by a single phase plate, but only by a combination of two
plates.
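The derivation above can be checked numerically. The following minimal sketch builds the plate matrix as the product A J0 A⁺ from Eqs. (5.3) and (5.6) and compares its elements with the quoted expressions for t and r. The helper names and the test values of δ and χ are arbitrary choices made for this illustration, and the layout of the general SU(2) form, with rows (t, r) and (−r*, t*), is an assumption about Eq. (3.23).

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation(chi):
    """Rotation matrix A of Eq. (5.3)."""
    return [[math.cos(chi), -math.sin(chi)],
            [math.sin(chi), math.cos(chi)]]

def plate_jones(delta, chi):
    """Jones matrix of a phase plate in the laboratory basis, A J0 A^+."""
    a = rotation(chi)
    j0 = [[cmath.exp(1j * delta), 0], [0, cmath.exp(-1j * delta)]]  # Eq. (5.6)
    return matmul(matmul(a, j0), dagger(a))

delta, chi = 0.7, 0.3   # arbitrary illustrative values, in radians
jp = plate_jones(delta, chi)

# Closed-form elements quoted in the text.
t = math.cos(delta) + 1j * math.sin(delta) * math.cos(2 * chi)
r = 1j * math.sin(delta) * math.sin(2 * chi)

print(jp[0][0], t)
print(jp[0][1], r)
```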
In the next section, we consider the most commonly used types of phase plates.
5.1.1 Half-wave plate
For a half-wave plate (HWP), the phase δ = π/2. According to Eq. (5.6), the ordinary
and extraordinary waves acquire in this plate a relative phase of π. The Jones matrix
of such a plate is then
$$ J_{\mathrm{HWP}} = i \begin{pmatrix} \cos(2\chi) & \sin(2\chi) \\ \sin(2\chi) & -\cos(2\chi) \end{pmatrix}. \qquad (5.11) $$
Such a plate will leave a linearly polarized beam linearly polarized but it will rotate
the polarization direction. Indeed, consider the Jones vector of linearly polarized light,
$$ \vec{e} = \begin{pmatrix} \cos(\vartheta/2) \\ \sin(\vartheta/2) \end{pmatrix}, \qquad (5.12) $$
with the initial azimuth Ω = ϑ/2. After a HWP oriented at an angle χ, the Jones vector
becomes, up to a phase factor i,
$$ \vec{e}\,' = \begin{pmatrix} \cos(2\chi - \vartheta/2) \\ \sin(2\chi - \vartheta/2) \end{pmatrix}. \qquad (5.13) $$
We see that light is still polarized linearly but the azimuth is now 2χ − ϑ/2. In particular,
if the HWP is oriented at an angle π/4 to the initial polarization direction, the
polarization plane will be rotated by 90° with respect to its initial state.
Typically, half-wave plates are used in laboratories to rotate the polarization
plane. However, this operation is only possible with linearly polarized light. As we
will see in Section 5.3, the way a HWP acts on elliptically or circularly polarized
light is different. In particular, it transforms right-hand circularly polarized light into
left-hand circularly polarized light and vice versa.
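Both statements can be illustrated with a short sketch based on Eq. (5.11). The helper names are hypothetical. The Stokes components are computed from the Jones vector with a sign convention, assumed here, in which the vector (1, i)/√2 has S3 = +1.

```python
import math

def hwp(chi):
    """Jones matrix of a half-wave plate, Eq. (5.11)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    return [[1j * c, 1j * s], [1j * s, -1j * c]]

def apply(j, e):
    """Act with a 2x2 Jones matrix on a Jones vector."""
    return [j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1]]

def stokes(e):
    """Normalized (S1, S2, S3); S3 = +1 for (1, i)/sqrt(2) is an assumed convention."""
    ex, ey = complex(e[0]), complex(e[1])
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    cross = ex.conjugate() * ey
    return ((abs(ex) ** 2 - abs(ey) ** 2) / s0, 2 * cross.real / s0, 2 * cross.imag / s0)

def azimuth(e):
    """Azimuth of the polarization ellipse; insensitive to the overall phase."""
    s1, s2, _ = stokes(e)
    return 0.5 * math.atan2(s2, s1)

chi = math.radians(30)    # plate orientation
a_in = math.radians(10)   # input azimuth, i.e. theta/2 in Eq. (5.12)
linear_out = apply(hwp(chi), [math.cos(a_in), math.sin(a_in)])
print(math.degrees(azimuth(linear_out)))   # compare with 2*chi - a_in

right = [1 / math.sqrt(2), 1j / math.sqrt(2)]
circular_out = apply(hwp(chi), right)
print(stokes(right)[2], stokes(circular_out)[2])   # the sign of S3 flips
```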
5.1.2 Quarter-wave plate
For a quarter-wave plate (QWP), the phase δ = π/4. The Jones matrix of such a plate is
$$ J_{\mathrm{QWP}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 + i\cos(2\chi) & i\sin(2\chi) \\ i\sin(2\chi) & 1 - i\cos(2\chi) \end{pmatrix}. \qquad (5.14) $$
Although in the general case the transformation performed by a QWP is quite
complicated, it is mostly used to convert linearly polarized light into circularly polarized
light and vice versa. In this case, the orientation of the QWP should be χ = π/4. Then,
if light at the input of a QWP is horizontally polarized, i. e., its Jones vector has the form
(3.7), after the QWP the Jones vector is
$$ \vec{e}\,' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, \qquad (5.15) $$
i. e., light is right-hand circularly polarized [compare with Eq. (3.10)]. With vertically
polarized light at the input of a QWP, light at its output will be left-hand circularly
polarized.
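A minimal sketch of this calculation, using Eq. (5.14) with χ = π/4, is given below. The helper names are hypothetical. For vertical input the result equals the vector (1, −i)/√2 multiplied by an overall phase factor i.

```python
import math

def qwp(chi):
    """Jones matrix of a quarter-wave plate, Eq. (5.14)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    k = 1 / math.sqrt(2)
    return [[k * (1 + 1j * c), k * 1j * s], [k * 1j * s, k * (1 - 1j * c)]]

def apply(j, e):
    """Act with a 2x2 Jones matrix on a Jones vector."""
    return [j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1]]

plate = qwp(math.pi / 4)
out_h = apply(plate, [1, 0])   # horizontal input, compare with Eq. (5.15)
out_v = apply(plate, [0, 1])   # vertical input

print(out_h)
print(out_v)
```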
5.2 Rotators
A polarization rotator is a device rotating the polarization plane of linearly polarized
light [1, 4]. Such an element can be constructed using an optically active material (see
Section 4.2.3), for instance a cuvette with sugar solution or a gyrotropic crystal. At first
sight, a rotator performs the same operation as a HWP. However, this is not the case,
as we will soon see.
As a result, we obtain
$$ J_r = \begin{pmatrix} \cos\delta & \sin\delta \\ -\sin\delta & \cos\delta \end{pmatrix}. \qquad (5.17) $$
This is a rotation matrix, with the angle of rotation δ. For instance, if horizontally
polarized light is at the input of the rotator, at the output the Jones vector becomes
$$ \vec{e}\,' = \begin{pmatrix} \cos\delta \\ -\sin\delta \end{pmatrix}, \qquad (5.18) $$
i. e., the initial polarization azimuth is rotated by an angle δ (towards negative azimuths
for the sign convention of Eq. (5.17)). Clearly, Eq. (5.17) is also
an SU(2) matrix, characterized by a single parameter δ.
Now we can see how the operation of a rotator is different from the one of a HWP.
For linearly polarized light, the former rotates the polarization by the same amount,
regardless of its orientation. The latter, meanwhile, rotates the polarization depending
on its orientation. In particular, if the input light is polarized along the fast or slow axis
of a HWP, no polarization rotation occurs: these states are eigenstates of the HWP. For a
rotator, the eigenstates are circularly polarized; therefore, it leaves circularly polarized
light unchanged.
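The difference between the two elements can be made explicit with a short sketch. It applies the rotator matrix (5.17) and the HWP matrix (5.11) to linearly polarized inputs with several azimuths and records the change of azimuth. The helper names and the chosen angles are hypothetical. With the sign convention of Eq. (5.17), the rotator changes every azimuth by −δ.

```python
import math

def rotator(delta):
    """Jones matrix of a rotator, Eq. (5.17)."""
    return [[math.cos(delta), math.sin(delta)],
            [-math.sin(delta), math.cos(delta)]]

def hwp(chi):
    """Jones matrix of a half-wave plate, Eq. (5.11)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    return [[1j * c, 1j * s], [1j * s, -1j * c]]

def apply(j, e):
    """Act with a 2x2 Jones matrix on a Jones vector."""
    return [j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1]]

def azimuth(e):
    """Azimuth of the polarization ellipse; insensitive to the overall phase."""
    ex, ey = complex(e[0]), complex(e[1])
    s1 = abs(ex) ** 2 - abs(ey) ** 2
    s2 = 2 * (ex.conjugate() * ey).real
    return 0.5 * math.atan2(s2, s1)

def azimuth_change(j, a):
    """Change of azimuth for linear input at azimuth a, wrapped to [-pi/2, pi/2)."""
    out = azimuth(apply(j, [math.cos(a), math.sin(a)]))
    return (out - a + math.pi / 2) % math.pi - math.pi / 2

angle = math.radians(20)   # used both as the rotator phase and the HWP orientation
inputs = [math.radians(a) for a in (0, 30, 75)]
rotator_changes = [azimuth_change(rotator(angle), a) for a in inputs]
hwp_changes = [azimuth_change(hwp(angle), a) for a in inputs]
print([round(math.degrees(x), 6) for x in rotator_changes])   # identical entries
print([round(math.degrees(x), 6) for x in hwp_changes])       # input-dependent entries

# Circular light is an eigenstate of the rotator: only an overall phase appears.
circular = [1 / math.sqrt(2), 1j / math.sqrt(2)]
circular_out = apply(rotator(angle), circular)
# Light polarized along the plate axis is an eigenstate of the HWP.
axis_change = azimuth_change(hwp(angle), angle)
```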
An example of a rotator is a quartz crystal cut orthogonally to the optic axis.
Figure 5.2 shows the thickness of a quartz slab needed to rotate linear polarization by 45°
(red line) and 90° (blue line) for different wavelengths [3]. The optical activity of quartz
is especially strong in the ultraviolet (UV) range, where a 100 μm thick plate already
rotates polarization by a few tens of degrees. Such a device is convenient because the
rotation is performed regardless of the initial polarization, unlike in the case of a HWP.
The drawback of a quartz polarization rotator is that the angle of rotation depends on
the wavelength, which can be a problem when dealing with broadband light.
Apart from an optically active material like crystalline quartz or a sugar solution, a
rotator can be constructed using the Faraday effect, i. e., rotation of the plane of
polarization in a magnetic field. This way one obtains a Faraday cell, which will be
considered in more detail in Chapter 9. The most important feature of this type of rotator
is that its phase δ depends on the direction of propagation of light, which is not the
case for an optically active material.
Figure 5.2: Thickness of a quartz plate cut orthogonal to the optic axis, needed to rotate the
polarization by 45° (red) and 90° (blue) [3].
Let us briefly mention here another, very elegant way of rotating the plane of
polarization for linearly polarized light. This method consists of reflecting a light beam in
three different planes, so that the final direction of the beam coincides with the initial
one. Such triple reflection, in the general case, leads to a polarization transformation.
In particular, if reflections occur in three mutually orthogonal planes, light remains
linearly polarized but its plane of polarization is rotated by π/2. This method is based
on an effect similar to the geometric phase and will be considered in Chapter 6. The
corresponding device, consisting of three mirrors, is called an optical tower or a
periscope.
5.3 Poincaré-sphere representation
It follows that a plate with the phase δ and the orientation angle χ rotates the
Stokes vector by an angle 2δ around an axis that lies in the equatorial plane at an
angle 2χ to the σ1 axis. Figure 5.3 shows this rotation on the Poincaré sphere by a red
arrow; the axis of rotation is shown by a red dotted line.
Note that the same rotation can be considered in two different ways. For instance,
Fig. 5.3 shows a point on the Poincaré sphere rotated clockwise by an angle 2δ. This is
the so-called active viewpoint. But the same effect will be achieved if the point stays
the same, but the sphere rotates anticlockwise by an angle 2δ. Sometimes this passive
viewpoint is more convenient.
A half-wave plate will always rotate the Stokes vector by an angle π. It means, in
agreement with the result we obtained in Section 5.1.1, that it will always transform
linearly polarized light (a point on the equator of the Poincaré sphere) into linearly
polarized light (another point on the equator). For instance, if the initial polarization
state is horizontal (ϑ = 0), after the rotation we get ϑ = 4χ, i. e., the final state is
linearly polarized at an angle 2χ. (Note that angles on the Poincaré sphere are always
doubled compared to their values in the usual space.) These rotations are shown in
Fig. 5.4 by different colors: the HWP is oriented at 11.25° (orange), at 22.5° (green), and
at 45° (purple). The latter is the most commonly used setting of the HWP: when it is
oriented at π/4 to the incident light polarization, it rotates the polarization by π/2. The
axis of rotation in each case is plotted by a dotted line of the corresponding color.
If the state is originally circularly polarized, it corresponds to one of the poles of
the Poincaré sphere, and the HWP will transform it into the other pole, regardless of
the orientation χ of the plate. In Fig. 5.4, the magenta line shows the transformation a
HWP performs on right-hand circularly polarized light. The trajectory, however, will
go along different meridians, depending on the angle χ (for the example in Fig. 5.4,
χ ≈ −10°). This fact will be important in Chapter 6, where we discuss the geometric
phase.
A quarter-wave plate will rotate the Stokes vector by an angle π/2. It is clear now
why it transforms linearly polarized light into circularly polarized light. For instance,
if the initial polarization state is horizontal (Fig. 5.5, red dot) and the plate is oriented
at χ = π/4, the rotation is around the σ2 axis and the final state is right-hand circularly
polarized (solid purple arrow in Fig. 5.5). Similarly, if the initial state is vertically
polarized, light will be left-hand circularly polarized after the plate (dashed purple arrow
in Fig. 5.5). The same transformation will turn circularly polarized light into linearly
polarized light: R into V, L into H. Other transformations with the QWP are shown in
the figure by orange (QWP at 11.25°) and green (QWP at 22.5°) colors.
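The chain H into R, R into V, V into L, L into H can be reproduced by applying the QWP matrix (5.14) with χ = π/4 four times and converting each Jones vector into a point on the Poincaré sphere. This is only a sketch: the helper names are hypothetical, and the Stokes components are defined with the assumed convention that the vector (1, i)/√2 (right-hand circular in the text) has S3 = +1.

```python
import math

def qwp(chi):
    """Jones matrix of a quarter-wave plate, Eq. (5.14)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    k = 1 / math.sqrt(2)
    return [[k * (1 + 1j * c), k * 1j * s], [k * 1j * s, k * (1 - 1j * c)]]

def apply(j, e):
    """Act with a 2x2 Jones matrix on a Jones vector."""
    return [j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1]]

def stokes(e):
    """Normalized (S1, S2, S3); S3 = +1 for (1, i)/sqrt(2) is an assumed convention."""
    ex, ey = complex(e[0]), complex(e[1])
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    cross = ex.conjugate() * ey
    return ((abs(ex) ** 2 - abs(ey) ** 2) / s0, 2 * cross.real / s0, 2 * cross.imag / s0)

plate = qwp(math.pi / 4)
state = [1, 0]            # start from horizontal polarization, S = (1, 0, 0)
trajectory = []
for _ in range(4):
    state = apply(plate, state)
    trajectory.append(stokes(state))

for point in trajectory:
    print(tuple(round(x, 6) for x in point))
```

Each step is a quarter turn around the σ2 axis, so four steps bring the state back to the starting point.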
It is worth mentioning that two QWPs with the same orientation χ, stacked
together, form a HWP. This simply follows from the fact that the phase δ is additive,
and it will be π/2 for the system of two QWPs. The same conclusion follows if we
consider two consecutive rotations by π/2 around the same axis: together they make a π rotation
around this axis. These simple considerations explain a very convenient method
used in many polarization setups: if a linearly polarized beam passes through a QWP
oriented at π/4 to its polarization plane and then is reflected by a mirror at normal
incidence, then its polarization, after passing the QWP twice, will be rotated by π/2.
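The first statement is easy to verify with the matrices (5.11) and (5.14): the square of the QWP matrix equals the HWP matrix for the same χ. The sketch below also sends horizontally polarized light twice through a QWP at π/4. It idealizes the mirror as leaving the Jones vector unchanged in a fixed laboratory basis, which is an assumption of this example and not a model of a real mirror. The helper names are hypothetical.

```python
import math

def qwp(chi):
    """Jones matrix of a quarter-wave plate, Eq. (5.14)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    k = 1 / math.sqrt(2)
    return [[k * (1 + 1j * c), k * 1j * s], [k * 1j * s, k * (1 - 1j * c)]]

def hwp(chi):
    """Jones matrix of a half-wave plate, Eq. (5.11)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    return [[1j * c, 1j * s], [1j * s, -1j * c]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(j, e):
    """Act with a 2x2 Jones matrix on a Jones vector."""
    return [j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1]]

chi = 0.37                                   # arbitrary orientation, in radians
two_qwps = matmul(qwp(chi), qwp(chi))
half_wave = hwp(chi)
max_difference = max(abs(two_qwps[i][j] - half_wave[i][j])
                     for i in range(2) for j in range(2))
print(max_difference)                        # numerically zero

# Double pass through a QWP at pi/4 (idealized mirror, see the lead-in).
double_pass = apply(qwp(math.pi / 4), apply(qwp(math.pi / 4), [1, 0]))
print(double_pass)                           # vertical polarization up to a phase
```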
If the QWP is oriented at an angle smaller than π/4, then the initial point H will be
transformed into a point with a lower latitude (orange and green arrows in Fig. 5.5),
and the resulting polarization will be elliptical. In the opposite situation, where light
of given ellipticity has to be transformed into linearly polarized light, one can also use
a QWP: the required orientation χ is then determined by the ellipticity.
A plate with an arbitrary phase δ will perform a rotation by an arbitrary angle, but
always around an axis lying in the equatorial plane. This is a restriction imposed by the
fact that, for any plate, polarization eigenstates are linearly polarized. As mentioned
above, this does not allow for performing an arbitrary SU(2) transformation with a
single plate. At the same time, if the goal is to transform a given initial polarization
state into another polarization state, it is always possible to find a plate performing
such a transformation. Note also that a transformation from one point on the Poincaré
sphere into another point can be performed by infinitely many rotations, only one of
them being possible with a phase plate.
A combination of several plates will result in a combination of rotations, the total
Mueller matrix being the product of the matrices for all plates. The best-known ex-
ample is a combination of three plates: QWP + HWP + QWP. For an arbitrary input
polarization state, the first plate (QWP) is used to transform it into linear polarization,
then a HWP rotates the linear polarization by a necessary amount, and the last plate
(another QWP) produces a state with a given ellipticity. The same system can be real-
ized with fiber loops (which will be discussed in more detail in Chapter 9). A simplified
version, a combination of a QWP and a HWP, transforms an arbitrary state into an ar-
bitrary linearly polarized state and is used in the measurement of an arbitrary Stokes
observable (Fig. 5.7).
Transformations with a polarization rotator are considered similarly [2]. From
Eqs. (3.39), we find that, for a rotator with the phase δ, the rotation is by the angle ν,
with
$$ c_1 = c_2 = 0, \quad c_3 = -1. \qquad (5.22) $$
This rotation is around the σ3 axis, by an angle 2δ (Fig. 5.6). A rotator can then
move any point along the equator, and it would leave the poles (right- and left-circular
polarization states) unchanged.
A circularly polarized state stays invariant under the action of a rotator because
it is an eigenstate of the rotator. Similarly, in the case of a waveplate,
if the incident light beam is polarized linearly along the optic axis of the waveplate,
the rotation on the Poincaré sphere is around the initial point itself. This rotation
obviously leaves the polarization state the same. Note that, if light is initially polarized
orthogonally to the optic axis of the plate, the point depicting its polarization state
on the Poincaré sphere lies on the other side of the same diameter, and therefore this
polarization state is also left unchanged under this transformation.
We see that the Poincaré sphere is a very convenient tool to visualize polarization
transformations. It is also helpful for understanding the transitions between the
measurements of different Stokes observables. Indeed, from the structure of the Poincaré
sphere we see that there is no fundamental difference between the Stokes observables σ1,
σ2, σ3 and any generalized Stokes observable
$$ \sigma_{\vartheta,\varphi} \equiv \cos\vartheta\,\sigma_1 + \sin\vartheta\cos\varphi\,\sigma_2 + \sin\vartheta\sin\varphi\,\sigma_3. \qquad (5.23) $$
In order to switch from the measurement of the first Stokes observable to the mea-
surement of the second Stokes observable, we need to rotate our frame of reference so
that the σ2 axis is transformed into the σ1 axis. This rotation of the frame of reference
can be performed by placing a HWP with χ = π/8 in front of the measurement scheme
in Fig. 3.4. From the active viewpoint, it rotates the input linear polarization by π/4,
transforming A into H and D into V. But from the passive viewpoint [2], it transforms
the σ2 axis into the σ1 axis and therefore enables the measurement of the second Stokes
observable. Note that the same transformation is achieved by means of a rotator with
δ = π/4.
Similarly, for the measurement of the third Stokes observable we need to trans-
form the σ3 axis into the σ1 axis. This transformation is performed by a QWP set at
an angle χ = π/4. This is why this plate is used in the setup shown in Fig. 3.5 for the
measurement of S3 .
It follows that an arbitrary Stokes observable (5.23) can be measured with a setup
where the polarization prism is preceded by a set of waveplates performing a polarization
transformation of a general form. Because the final polarization basis is set
by the prism, a combination of only two plates suffices, QWP + HWP. By setting their
orientation angles χ1 and χ2, any values of ϑ, φ can be achieved. Figure 5.7 shows this
general scheme of the Stokes measurement. By propagating the Stokes vectors ‘backwards’
through the setup, we see that the Stokes vector σ⃗out with σ1 = 1, σ2 = σ3 = 0 at
the output corresponds to the input Stokes vector
$$ \vec{\sigma}_{\mathrm{in}} = M_1 M_2 \vec{\sigma}_{\mathrm{out}}, \qquad (5.24) $$
where M1, M2 are the Mueller matrices of the QWP and HWP, respectively. Calculation
yields for σ⃗in the spherical angles ϑ and φ given by
$$ \cos\vartheta = \cos(2\chi_1)\cos(2\chi_1 - 4\chi_2), \qquad \tan\varphi = -\frac{\tan(2\chi_1 - 4\chi_2)}{\sin(2\chi_1)}. \qquad (5.25) $$
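The first relation in Eq. (5.25) can be cross-checked with Jones matrices instead of Mueller matrices. The sketch below finds the input state that a QWP at χ1 followed by a HWP at χ2 maps onto horizontal polarization, and compares the first Stokes component of this state with cos(2χ1) cos(2χ1 − 4χ2). The helper names and the two angles are hypothetical. Only the magnitude of the third component is compared, because its sign depends on the handedness convention.

```python
import math

def qwp(chi):
    """Jones matrix of a quarter-wave plate, Eq. (5.14)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    k = 1 / math.sqrt(2)
    return [[k * (1 + 1j * c), k * 1j * s], [k * 1j * s, k * (1 - 1j * c)]]

def hwp(chi):
    """Jones matrix of a half-wave plate, Eq. (5.11)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    return [[1j * c, 1j * s], [1j * s, -1j * c]]

def dagger(a):
    """Hermitian conjugate, which is the inverse for these unitary matrices."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def apply(j, e):
    """Act with a 2x2 Jones matrix on a Jones vector."""
    return [j[0][0] * e[0] + j[0][1] * e[1], j[1][0] * e[0] + j[1][1] * e[1]]

def stokes(e):
    """Normalized (S1, S2, S3) of a Jones vector."""
    ex, ey = complex(e[0]), complex(e[1])
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    cross = ex.conjugate() * ey
    return ((abs(ex) ** 2 - abs(ey) ** 2) / s0, 2 * cross.real / s0, 2 * cross.imag / s0)

chi1, chi2 = 0.4, 0.9    # arbitrary QWP and HWP orientations, in radians

# Propagate the horizontal state backwards: first undo the HWP, then the QWP.
e_in = apply(dagger(qwp(chi1)), apply(dagger(hwp(chi2)), [1, 0]))
s1, s2, s3 = stokes(e_in)
cos_theta = math.cos(2 * chi1) * math.cos(2 * chi1 - 4 * chi2)   # Eq. (5.25)
print(s1, cos_theta)

# Forward check: the setup maps e_in onto horizontal polarization.
e_out = apply(hwp(chi2), apply(qwp(chi1), e_in))
```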
The setup shown in Fig. 5.7 can be used to measure the degree of polarization. According
to the definition (3.30), it is given by the visibility of the intensity modulation
obtained at one of the outputs under all possible rotations of the QWP and HWP [2]:
$$ P = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}}, \qquad (5.26) $$
where Imax and Imin are, respectively, the maximal and minimal intensity measured by
one of the detectors of the setup in Fig. 5.7.
As a final comment, let us mention that apart from rotations, which we consid-
ered so far, one might also consider other transformations, like mirror reflections and
inversion. Inversion, which brings every point on the sphere into the opposite point, is
especially interesting for polarization optics because it would transform every polar-
ization state into an orthogonal one. Unfortunately, inversion or mirror reflection on
the Poincaré sphere is not possible using only phase plates or rotators because these
elements perform only rotations. Any rotation leaves the points on the axis of rotation
invariant and therefore cannot lead to their inversion or mirror reflection.
Bibliography
[1] D. N. Klyshko. Berry geometric phase in oscillatory processes. Phys. Usp., 36(11):1005–1019,
1993.
[2] D. N. Klyshko. Polarization of light: fourth-order effects and polarization-squeezed states. J. Exp.
Theor. Phys., 84:1065–1079, 1997.
[3] T. Radhakrishnan. The dispersion, birefringence and optical activity of quartz. Proc. Indian Acad.
Sci. A, 25:260–265, 1947.
[4] W. A. Shurcliff. Polarized light. Harvard University Press, 1962.
6 Geometric phase
In this chapter, we discuss a phenomenon that appears in different fields of physics
but is most important for polarization optics, where it is known as the Pancharatnam
phase. Generally, this effect appears wherever we deal with the trajectory of a point on
a curved surface. It has the general term ‘topological phase’ or ‘geometric phase’—the
latter will be used throughout this chapter. In quantum physics, this phase is usually
called the Berry phase. Below, we start from several simple examples [3].
6.1 Examples of geometric phase
6.1.1 The Foucault pendulum
Imagine a very large pendulum, called the Foucault pendulum (after Léon Foucault
who first introduced it), which is used to demonstrate the rotation of the Earth. Such
a pendulum can be found in many universities, science museums and other public
places like the Panthéon in Paris. During a demonstration, a guide usually starts its
swinging motion and marks the oscillation plane. After the pendulum swings for sev-
eral minutes, the spectators see that the oscillation plane has rotated, due to the fact
that the Earth has turned while the plane of the pendulum was constant. A naïve ex-
pectation would be that in 24 hours it will return to the initial position. But this will be
only the case on the North and South Poles, where an observer will see the plane of os-
cillations rotating with the same angular speed as the Earth. If the pendulum swings
at a point with the latitude γ, after 24 hours the plane of oscillation will turn by an angle
$$ \beta = 2\pi(1 - \sin\gamma). \qquad (6.1) $$
This effect is illustrated by Fig. 6.1. Here, we can imagine that the Earth is stationary,
but the pendulum is displaced around it, along a single parallel, so that its plane of
oscillations is constant. In the end it returns to the same point, but the line of oscil-
lations will have to tilt in order to remain tangent to the Earth. In other words, for a
stationary observer, the pendulum’s plane of oscillations will rotate more slowly than
the Earth, and the difference (6.1) will be seen after 24 hours.
At the North or South Pole, where γ = ±π/2, the Foucault effect is maximal, and
the naïve picture will be correct: the plane of the pendulum oscillations will rotate for
the observer with the same angular velocity as the Earth and it will recover in 24 hours.
But at the equator, γ = 0, an observer will not see any effect of the Earth rotation, and
the angle (6.1) will be 2π. One can notice that
β = Ω, (6.2)
which is the solid angle that would be covered if the pendulum were driven around
the Earth along a parallel (Fig. 6.1).
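The relation between the angle and the solid angle can be illustrated numerically. The sketch below assumes that Eq. (6.1) has the form β = 2π(1 − sin γ), which is consistent with the limiting cases discussed above (β = 2π at the equator and a full recovery at the North Pole). It compares this expression with the solid angle enclosed by the parallel at latitude γ, obtained by direct numerical integration. The function names and the chosen latitude are hypothetical.

```python
import math

def foucault_angle(latitude):
    """Assumed form of Eq. (6.1): beta = 2*pi*(1 - sin(latitude))."""
    return 2 * math.pi * (1 - math.sin(latitude))

def cap_solid_angle(latitude, steps=20000):
    """Solid angle between the parallel and the North Pole (midpoint rule)."""
    width = (math.pi / 2 - latitude) / steps
    total = 0.0
    for i in range(steps):
        lat = latitude + (i + 0.5) * width
        total += 2 * math.pi * math.cos(lat) * width   # ring of width d(lat)
    return total

latitude = math.radians(48.85)   # illustrative latitude, roughly that of Paris
beta = foucault_angle(latitude)
omega = cap_solid_angle(latitude)
print(beta, omega)               # the two values agree, Eq. (6.2)
```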
In another hypothetical example, the Foucault pendulum is indeed driven around the
Earth. For instance, let us start such a pendulum at the North Pole, drive it down to the
equator along some meridian, coinciding with the direction of swinging, then drive it
along the equator by an angle π/2 and return it back to the North Pole. Figure 6.2 shows
the trajectory of the pendulum in orange colour.1 To understand what happens, we
have to shift the vector describing the oscillations (red in the figure) so that it is always
in the same plane but still tangent to the surface of the Earth. This, however, leads to
its rotation, upon return to the North Pole, by an angle π/2. We notice that once again,
this angle coincides with the solid angle covered by the pendulum on the surface of
the Earth.
This phase (or angular) shift appears because a vector (of the pendulum oscillations)
is shifted so that it stays in one plane but yet has to be tangent to a curved space (in
this case, the surface of the Earth).
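The solid angle covered in this example is that of a spherical triangle with one vertex at the North Pole and two vertices on the equator, separated by π/2 in longitude. As a minimal sketch, it can be computed with the Van Oosterom and Strackee formula for the solid angle subtended by a triangle with unit-vector vertices. This formula is brought in here only for illustration and is not part of the text; the function names are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_solid_angle(a, b, c):
    """Solid angle of the spherical triangle with unit-vector vertices a, b, c."""
    numerator = abs(dot(a, cross(b, c)))
    denominator = 1 + dot(a, b) + dot(b, c) + dot(c, a)
    return 2 * math.atan2(numerator, denominator)

north_pole = (0.0, 0.0, 1.0)
equator_start = (1.0, 0.0, 0.0)   # where the first meridian meets the equator
equator_end = (0.0, 1.0, 0.0)     # a quarter turn further along the equator

omega = triangle_solid_angle(north_pole, equator_start, equator_end)
print(omega, math.pi / 2)
```

The result is one octant of the sphere, 4π/8 = π/2, equal to the rotation angle of the oscillation direction found above.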
1 This situation is only possible if the pendulum has very low damping, so that it is still swinging
when it returns.
6.1.2 Non-planar optical path
In another simple example, we consider the polarization of a light beam whose optical
path does not lie in a single plane. While in the previous example the oscillation
vector of the pendulum had to be tangent to the curved surface of the Earth, now the
restriction will be that polarization should be orthogonal to the k-vector and hence to
the beam trajectory.
Consider first the optical tower (also called a periscope), briefly mentioned in
Chapter 5. In a periscope, the beam, initially propagating along the x axis and
polarized along the z axis (vertically), is reflected by three mirrors, in three mutually
orthogonal planes (Fig. 6.3, left panel). After the first mirror, the beam rotates by 90°
around the z axis, then mirror 2 rotates it by 90° around the x axis, then mirror 3
rotates it by 90° around the y axis. Eventually the beam is again parallel to the x axis,
but now it is horizontally polarized.
This rotation of polarization can be also related to some solid angle covered in a three-
dimensional (3D) space. Indeed, the right panel of Fig. 6.3 shows the evolution of the
k-vector of the beam. Initially along the x axis, its trajectory covers a solid angle π/2
in the k-space and then returns again to the same direction. The ‘trajectory’ is shown
by orange and the electric field direction, by red arrows. The situation is very similar
to the one with the Foucault pendulum (Fig. 6.2). As a result, the angle of polarization
rotation is again given by the solid angle covered in the 3D space (on a sphere); see
Eq. (6.2).
An optical tower is used in laboratories to implement a π/2 polarization rotation
without the use of dispersive elements. At the same time, this effect of polarization
rotation can be a problem in experiments where polarization should be maintained to
a very high accuracy. Indeed, any reflections of the beam that are not within a single
plane will change its polarization state. The larger the solid angle covered by the
beam direction in the 3D space, the larger the change.
Propagation in a fiber. This situation with reflections in different planes can be
generalized to the case of light propagating in an optical fiber. If the fiber lies in one
plane, the polarization will not be changed. But if the fiber direction forms a trajectory
in 3D space, then the polarization will be rotated by an angle given by the solid
angle covered by this trajectory. This effect is used in a fiber-based device called a
‘polarization controller’, which will be discussed in Chapter 9, although with a different
interpretation.
6.2 Interference of arbitrarily polarized beams
Let the Jones vectors of the two beams be e⃗A, e⃗B. If beams A and B are overlapped, and
they have the same intensity I0, the total intensity will be given by
$$ I = 2I_0\left(1 + |\vec{e}_A \cdot \vec{e}_B| \cos\Delta\right), \qquad (6.3) $$
where Δ is the phase of the interference (as defined by Pancharatnam, and hereafter
called the Pancharatnam phase). Clearly, the visibility, defined as
\[
V \equiv \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}}, \tag{6.4}
\]
is equal to |e⃗A ⋅ e⃗B |. (Note that, as mentioned in Chapter 3, the scalar product of two
Jones vectors implies the complex conjugation of one of them.) In other words, the
interference visibility for two polarized beams of equal intensity is given by the absolute
value of the scalar product of their Jones vectors.
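The statement can be checked numerically. The sketch below scans the relative phase between two equal-intensity beams, records the maximum and minimum of the total intensity, and compares the visibility of Eq. (6.4) with the modulus of the scalar product. The helper names and the sign convention chosen for the circular state `R` are assumptions made for this illustration.

```python
import cmath
import math

def inner(a, b):
    """Scalar product of two Jones vectors, with complex conjugation of the first."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def visibility_scan(a, b, steps=3600):
    """Scan the relative phase of beam b and evaluate V of Eq. (6.4)."""
    intensities = []
    for n in range(steps):
        phase = cmath.exp(2j * math.pi * n / steps)
        total = [x + phase * y for x, y in zip(a, b)]
        intensities.append(inner(total, total).real)
    i_max, i_min = max(intensities), min(intensities)
    return (i_max - i_min) / (i_max + i_min)

s = 1 / math.sqrt(2)
H, V, D, R = (1, 0), (0, 1), (s, s), (s, -1j * s)  # R: assumed circular convention

for name, pair in {"H,V": (H, V), "H,D": (H, D), "H,R": (H, R)}.items():
    print(name, round(visibility_scan(*pair), 4), round(abs(inner(*pair)), 4))
```

Orthogonal states give zero visibility, while the linear/diagonal and linear/circular pairs both give 1/√2.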
The absolute value of this scalar product has a clear interpretation on the Poincaré
sphere. Indeed, if A and B correspond to linearly polarized states (green points in
Fig. 6.4), with coordinates ϑA,B (for linearly polarized states, φA,B = 0), then the scalar
product is [see Eq. (3.6)]
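\[
\vec{e}_A \cdot \vec{e}_B = \cos\frac{\vartheta_A}{2}\cos\frac{\vartheta_B}{2} + \sin\frac{\vartheta_A}{2}\sin\frac{\vartheta_B}{2} = \cos\frac{\vartheta_A - \vartheta_B}{2}. \tag{6.5}
\]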
But ϑA − ϑB = γ, the length of the arc between the two green points in Fig. 6.4.
(Note that two points on a sphere can be connected by infinitely many arcs; this one is
the shortest, called the geodesic line.) But because it is only the relative position of two
points on the Poincaré sphere that matters, this relation can be generalized: for any
two points A, B with the angular separation γ (red points), the visibility of interference
will be given by the cosine of half the angular distance between them,
\[
V = \cos\frac{\gamma}{2}. \tag{6.6}
\]
For instance, the visibility will be zero for opposite points, in full accordance with
the statement that orthogonally polarized beams do not form an interference pattern.
For points separated by a quadrant, the visibility will be equal to 1/√2. This will be
the case, for instance, for beams polarized vertically and diagonally, or linearly and
circularly.
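Equation (6.6) can be compared with the Jones-vector result for arbitrary pairs of states. The following sketch maps Jones vectors to normalized Stokes vectors, computes the angular separation γ on the sphere, and checks that cos(γ/2) coincides with the modulus of the normalized scalar product. The Stokes sign conventions used here are one common choice and are an assumption; the identity does not depend on them.

```python
import math

def stokes(e):
    """Normalized Stokes vector (s1, s2, s3) of a Jones vector e = (ex, ey)."""
    ex, ey = complex(e[0]), complex(e[1])
    norm = abs(ex) ** 2 + abs(ey) ** 2
    cross = ex.conjugate() * ey
    return ((abs(ex) ** 2 - abs(ey) ** 2) / norm,
            2 * cross.real / norm,
            2 * cross.imag / norm)

def visibility_from_sphere(a, b):
    """cos(gamma/2), with gamma the angle between the two Stokes vectors."""
    dot = sum(p * q for p, q in zip(stokes(a), stokes(b)))
    gamma = math.acos(max(-1.0, min(1.0, dot)))
    return math.cos(gamma / 2)

def visibility_from_jones(a, b):
    """Modulus of the scalar product of the normalized Jones vectors."""
    overlap = sum(complex(p).conjugate() * q for p, q in zip(a, b))
    norm = math.sqrt(sum(abs(p) ** 2 for p in a) * sum(abs(q) ** 2 for q in b))
    return abs(overlap) / norm

pairs = [((1, 0), (0, 1)), ((1, 0), (1, 1)), ((1, 0), (1, 1j)), ((1, 1), (2, 1 - 1j))]
for a, b in pairs:
    print(round(visibility_from_sphere(a, b), 6), round(visibility_from_jones(a, b), 6))
```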
We saw that the angular length of the geodesic arc between any two points A, B on the
Poincaré sphere determines the modulus of the scalar product of the Jones vectors:
|e⃗A ⋅ e⃗B| = cos(γ/2). Suppose we need to decompose an arbitrarily polarized beam, given by a point C
on the Poincaré sphere, with the Jones vector e⃗C , in two orthogonally polarized beams,
given by points A and A’ (Fig. 6.5). This is always possible, and the projections of the
Jones vector e⃗C on the Jones vectors of the two beams will be
\[
\vec{e}_C \cdot \vec{e}_A = \cos\frac{\alpha}{2} \tag{6.7}
\]
and
\[
\vec{e}_C \cdot \vec{e}_{A'} = \cos\frac{\pi - \alpha}{2} = \sin\frac{\alpha}{2}. \tag{6.8}
\]
Similarly, a beam in a polarization state C can be decomposed into two beams in po-
larization states A and B, the angular distance between them being γ (Fig. 6.6). One
can show that the relation between the Jones vectors will be [6]
\[
\sin\frac{\gamma}{2}\,\vec{e}_C = \sin\frac{\beta}{2}\,\vec{e}_A + \sin\frac{\alpha}{2}\,\vec{e}_B, \tag{6.10}
\]
where α and β are the angular distances from point C to points A and B, respectively.
This relation can be understood by realizing its similarity to the decomposition of
a usual vector in a 2D Cartesian space in two non-orthogonal components (Fig. 6.7). If
A⃗, B⃗, C⃗ are unit vectors, then for the non-orthogonal projection a of C⃗ on A⃗ (shown by
and for the non-orthogonal projection b of C⃗ on B⃗ (another green dashed line) we have
a similar relation,
\[
\vec{C} = \frac{\sin\beta}{\sin\gamma}\,\vec{A} + \frac{\sin\alpha}{\sin\gamma}\,\vec{B}. \tag{6.13}
\]
This decomposition almost perfectly coincides with Eq. (6.10). The only difference is
that the latter contains halved angles. But this is because on the Poincaré sphere all
angles are a factor of two larger than their counterparts in the Cartesian space.
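Both decompositions can be verified numerically for a simple configuration. The sketch below assumes that C lies between A and B, so that γ = α + β, and that all three states are linear, so that a Poincaré-sphere angle corresponds to half that angle between the real Jones vectors. The function names are illustrative only.

```python
import math

def unit(angle):
    """Real unit vector in the plane at the given angle."""
    return (math.cos(angle), math.sin(angle))

def cartesian_residual(alpha, beta):
    """Largest deviation from Eq. (6.13) for unit vectors A, B, C in a plane."""
    gamma = alpha + beta
    A, B, C = unit(0.0), unit(gamma), unit(alpha)
    rhs = [math.sin(beta) / math.sin(gamma) * A[i]
           + math.sin(alpha) / math.sin(gamma) * B[i] for i in range(2)]
    return max(abs(C[i] - rhs[i]) for i in range(2))

def jones_residual(alpha, beta):
    """Largest deviation from Eq. (6.10) for linear states on the equator."""
    gamma = alpha + beta
    eA, eB, eC = unit(0.0), unit(gamma / 2), unit(alpha / 2)  # halved angles
    lhs = [math.sin(gamma / 2) * eC[i] for i in range(2)]
    rhs = [math.sin(beta / 2) * eA[i] + math.sin(alpha / 2) * eB[i] for i in range(2)]
    return max(abs(lhs[i] - rhs[i]) for i in range(2))

alpha, beta = math.radians(25.0), math.radians(50.0)
print(cartesian_residual(alpha, beta), jones_residual(alpha, beta))
```

Both residuals vanish to machine precision, which illustrates the factor of two between angles on the sphere and angles between Jones vectors.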
We can now find the Pancharatnam phase according to its definition: if two beams,
originally in polarization states A and B, are brought together into a beam with po-
larization C, the Pancharatnam phase is the phase of their interference. Taking the
squared modulus of Eq. (6.10), we get
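\[
\sin^2\frac{\gamma}{2} = \sin^2\frac{\alpha}{2} + \sin^2\frac{\beta}{2} + 2\,|\vec{e}_A \cdot \vec{e}_B| \cos\Delta\, \sin\frac{\alpha}{2}\sin\frac{\beta}{2}, \tag{6.14}
\]
or, after substituting the value of the scalar product |e⃗A ⋅ e⃗B|,
\[
\sin^2\frac{\gamma}{2} = \sin^2\frac{\alpha}{2} + \sin^2\frac{\beta}{2} + 2\cos\frac{\gamma}{2} \cos\Delta\, \sin\frac{\alpha}{2}\sin\frac{\beta}{2}. \tag{6.15}
\]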
Then the Pancharatnam phase is given by the relation
\[
\cos\Delta = \frac{\sin^2\frac{\gamma}{2} - \sin^2\frac{\alpha}{2} - \sin^2\frac{\beta}{2}}{2\cos\frac{\gamma}{2}\sin\frac{\alpha}{2}\sin\frac{\beta}{2}}. \tag{6.16}
\]
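\[
\cos(\pi - \Delta) = \frac{\cos^2\frac{\alpha}{2} + \cos^2\frac{\beta}{2} + \cos^2\frac{\gamma}{2} - 1}{2\cos\frac{\gamma}{2}\cos\frac{\alpha}{2}\cos\frac{\beta}{2}}. \tag{6.17}
\]
\[
\cos(\pi - \Delta) = \cos\frac{\Omega}{2}. \tag{6.18}
\]
\[
\Delta = \frac{\Omega}{2}. \tag{6.19}
\]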
It means that, if two beams in polarization states A and B form an interference pat-
tern in the polarization state C, the phase of the interference will be given by half of the
solid angle subtended by the geodesic triangle ABC. One can implement this situation
by ‘inverting’ the scheme we have considered before, with a polarization prism fol-
lowed by two arbitrary polarization transformations.
In particular, if the interference of states A and B leads to a state C lying on the
geodesic connecting them, the solid angle is zero. Correspondingly, the phase of the
interference is zero, i. e., the interference is constructive.
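The geodesic case can be checked directly from Eq. (6.16). The sketch below evaluates cos Δ for given arc lengths α, β, γ and confirms that the phase vanishes when C lies on the geodesic between A and B, i. e., when γ = α + β. The function name is hypothetical, and the clamping of the cosine only guards against rounding errors.

```python
import math

def pancharatnam_phase(alpha, beta, gamma):
    """Phase Delta obtained from Eq. (6.16) for arc lengths alpha, beta, gamma."""
    numerator = (math.sin(gamma / 2) ** 2
                 - math.sin(alpha / 2) ** 2
                 - math.sin(beta / 2) ** 2)
    denominator = 2 * math.cos(gamma / 2) * math.sin(alpha / 2) * math.sin(beta / 2)
    cos_delta = max(-1.0, min(1.0, numerator / denominator))  # guard against rounding
    return math.acos(cos_delta)

# C on the geodesic between A and B: gamma = alpha + beta
on_geodesic = pancharatnam_phase(0.4, 0.7, 1.1)
# C off the geodesic: a genuine triangle on the sphere
off_geodesic = pancharatnam_phase(0.9, 0.8, 1.1)
print(on_geodesic, off_geodesic)
```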
This means that, if a polarization state is transformed in such a way that the corre-
sponding point moves along the geodesic line on the Poincaré sphere, no phase shift
appears until a half-circle is made. But if the point is moved along a closed contour
ABC in Fig. 6.6, then there appears a phase shift: the Pancharatnam phase (6.19).
direction (the electric field vector) along the rotator, which could be, for instance, a
cuvette with sugar solution.
Because we ignore the total phase of the Jones vector, we say that the polarization
state did not change: light is still vertically polarized. But then the direction of the
electric field vector is related to a phase that has been acquired, and if we make this
beam interfere with the initial beam, a phase shift of π will appear, which is exactly
equal to Ω/2.
Actually we already came across a similar effect in Chapter 5 when we considered
the action of a HWP on linearly polarized light. While obtaining Eq. (5.13), we omitted
the phase factor i, saying that it was irrelevant for the Jones vector and the final state
was linearly polarized. However, if the transformation were performed with a rotator,
the trajectory on the Poincaré sphere would be along a geodesic (equator), and no
phase factor would emerge. In an experiment, one could see this phase difference by
placing a HWP in one arm of an interferometer and a rotator in the other arm.
A simpler experiment is to put HWPs into both arms of a Mach–Zehnder inter-
ferometer fed with circularly polarized light. For instance, let the input beam be right-
hand circularly polarized. Both plates will transform this state into left-hand circularly
polarized light, and at the output there will be a perfect (with 100 % visibility) interference
pattern. But the phase of the interference will depend on the orientation of the
plates. If one of them is oriented with the optic axis horizontal, it performs the rota-
tion on the Poincaré sphere around the σ1 axis (see Section 5.3). At the same time, the
other HWP, oriented at an angle χ, will rotate the initial point around an axis that is in
the equatorial plane at an angle 2χ to σ1 . The two trajectories on the Poincaré sphere
(shown in Fig. 6.9 by green and magenta colours) are two meridians separated by the
azimuthal angle 2χ; the lune between them subtends a solid angle Ω = 4χ. Then the
interference phase at the output will be given by the orientation of the second HWP:
Δ = Ω/2 = 2χ. This is the simplest experiment in which the Pancharatnam phase
can be directly observed.
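The dependence of the interference phase on the plate orientation can be reproduced with Jones calculus. The sketch below uses the standard Jones matrix of a half-wave plate with the global phase factor dropped, sends the same circular state through a plate at angle 0 and a plate at angle χ, and reads off the phase of the overlap of the two outputs. The sign convention for the circular input state is an assumption; with the opposite handedness the phase changes sign.

```python
import cmath
import math

def hwp(chi):
    """Jones matrix of a half-wave plate with axis at angle chi (global phase dropped)."""
    c, s = math.cos(2 * chi), math.sin(2 * chi)
    return ((c, s), (s, -c))

def apply(matrix, e):
    """Apply a 2x2 Jones matrix to a Jones vector."""
    return tuple(matrix[i][0] * e[0] + matrix[i][1] * e[1] for i in range(2))

def interference(chi):
    """Phase and modulus of the overlap between the two interferometer arms."""
    s = 1 / math.sqrt(2)
    circular = (s, 1j * s)                 # assumed convention for the input state
    arm1 = apply(hwp(0.0), circular)       # plate with horizontal axis
    arm2 = apply(hwp(chi), circular)       # plate rotated by chi
    overlap = sum(a.conjugate() * b for a, b in zip(arm1, arm2))
    return cmath.phase(overlap), abs(overlap)

chi = 0.3
phase, modulus = interference(chi)
print(f"chi = {chi}, interference phase = {phase:.6f}, visibility = {modulus:.6f}")
```

With this matrix the two outputs are in the same polarization state (unit visibility), and their relative phase evaluates to twice the plate angle.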
This example shows that, from the viewpoint of the Pancharatnam phase, what matters
is not only the initial and final polarization states, but also the trajectory on the
Poincaré sphere along which the polarization transformation happens. As mentioned
in Chapter 5, there are infinitely many ways to transform the polarization from one
state into another, and all these ways are accompanied by different Pancharatnam
phase shifts. In particular, as we have just seen, if the rotation is along a geodesic (the
equator, in particular), no phase shift occurs unless a semi-circle is covered. For this
reason, if the rotation of linear polarization is performed with a rotator (for instance,
polarization is transformed from horizontal to diagonal), no phase shift occurs. Mean-
while, when the same result is achieved with a HWP oriented at χ = π/8, then a certain
solid angle is covered (the trajectory is closed by completing it with a geodesic line).
Concluding this section, let us stress again that there are two equivalent defini-
tions of the Pancharatnam phase. On the one hand, it is a phase acquired due to the
evolution of a point on the Poincaré sphere along a closed trajectory. On the other
hand, it is the interference phase for two beams whose polarization states are de-
scribed by two points coming together along two different parts of the same trajectory.
In this section, let us briefly discuss other manifestations of the geometric phase,
relating to quantum physics. The general concept is known as the Berry phase, as it
was first described by Michael Berry. While the rigorous consideration can be found
in the original work [2], here we will consider only two examples of the Berry phase.
One example is the evolution of a two-level quantum system like a spin 1/2 particle
in a magnetic field or a two-level atom interacting with resonant radiation. (Another
quantum system with the same description is a polarized single photon, considered
in detail in Chapter 11.) The state of a two-level quantum system can be described by
the Bloch vector, which is introduced in terms of the density matrix ρij , i, j = 1, 2 [5]:
For a pure quantum system, the state can be described by a vector |Ψ⟩ = α|1⟩ +
β|2⟩, where |1⟩ and |2⟩ are the ground and excited states, respectively, and the complex
numbers α, β satisfy the normalization condition |α|² + |β|² = 1. The density matrix
elements are then ρ11 = |α|², ρ22 = |β|², ρ12 = αβ* = ρ21*. In this case, |R⃗| = 1, the Bloch
vector is a unit vector, and both its structure and properties are equivalent to the ones
of the normalized Stokes vector σ⃗ for polarized light (Section 3.3). In full accordance
with this analogy, α and β correspond to the components of the Jones vector.
Similarly to the normalized Stokes vector, the Bloch vector is usually depicted as a
point on a unit sphere (the Bloch sphere), which is therefore equivalent to the Poincaré
sphere. The South Pole of the Bloch sphere corresponds to the ground state of the
quantum system, and the North Pole, to the excited state. Points on the equator corre-
spond to the system being in a coherent superposition of ground and excited states.
For a mixed state, the Bloch vector has absolute value |R| < 1, and this situation
is similar to the one of partially polarized light. In this case, the point depicting the
state of the two-level system is inside the Bloch sphere.
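The distinction between pure and mixed states can be illustrated with a short calculation. The component ordering and signs of the Bloch vector used below are an assumed convention, chosen only so that the ground state |1⟩ sits at the South Pole and the excited state |2⟩ at the North Pole, as described above; the definition used in [5] may differ in such details.

```python
import math

def density_matrix(alpha, beta):
    """Elements (rho11, rho12, rho22) of the pure state alpha|1> + beta|2>."""
    alpha, beta = complex(alpha), complex(beta)
    return (abs(alpha) ** 2, alpha * beta.conjugate(), abs(beta) ** 2)

def mix(weight, rho_a, rho_b):
    """Statistical mixture weight*rho_a + (1 - weight)*rho_b."""
    return tuple(weight * p + (1 - weight) * q for p, q in zip(rho_a, rho_b))

def bloch_vector(rho):
    """Bloch vector in an ASSUMED convention (ground state at the South Pole)."""
    rho11, rho12, rho22 = rho
    return (2 * rho12.real, 2 * rho12.imag, rho22 - rho11)

def length(vec):
    return math.sqrt(sum(v * v for v in vec))

s = 1 / math.sqrt(2)
ground = density_matrix(1, 0)
excited = density_matrix(0, 1)
superposition = density_matrix(s, s)
mixture = mix(0.5, ground, excited)

for label, rho in [("ground", ground), ("superposition", superposition), ("mixture", mixture)]:
    print(label, bloch_vector(rho), round(length(bloch_vector(rho)), 6))
```

The pure states give unit length, with the equal superposition on the equator, while the equal mixture of ground and excited states sits at the centre of the sphere.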
Similarly to how the Stokes vector rotates due to the polarization transformations
with phase plates or polarization rotators, rotation of the Bloch vector describes the
evolution of the two-level system under the action of an external force. For instance,
a two-level atom driven by an external resonant field performs transitions from the
ground state into the excited state and back (Rabi oscillations). Its Bloch vector makes
circles along a meridian of the Bloch sphere. If the field is non-resonant, the atom, ini-
tially in the ground state, never gets into the excited state, and its Bloch vector makes
smaller circles around a point on the equator [5]. These rotations are perfectly sim-
ilar to the SO(3) rotations of the Stokes vector. And, quite similarly, if such a trans-
formation or a series of transformations covers a closed trajectory, a geometric phase
appears, given by the solid angle subtended by the trajectory. This geometric phase ac-
quired by the quantum state of an atom is therefore equivalent to the Pancharatnam
phase of polarized light.
An equivalent case is the evolution of a spin 1/2 particle in an alternating magnetic
field. The state of the particle is also shown as a point on a Bloch sphere, and variation
of the magnetic field leads to the transport of the point over the sphere. A closed
trajectory is again associated with the phase shift [2, 3].
Another manifestation of the Berry phase is the Aharonov–Bohm effect [1, 2, 4].
This effect entails that an electron moving along a closed trajectory around a solenoid
with magnetic field acquires a phase determined by the magnetic flux through the
surface subtended by this trajectory. Similar to the case of the evolution over the Bloch
sphere or Poincaré sphere, the phase does not depend on the trajectory itself but only
on the area subtended by it. Note that along the trajectory of the electron, the magnetic
field can be zero. Alternatively, and similar to the case of the Pancharatnam phase, the
geometric phase shift can be observed by making electrons move along two different
trajectories forming a closed circuit [4]. At the output, interference can be observed,
with the phase scaling as the magnetic field flux through the circuit.
Bibliography
[1] Y. Aharonov and D. Bohm. Significance of electromagnetic potentials in the quantum theory.
Phys. Rev., 115:485–491, 1959.
[2] M. V. Berry. Quantal phase factors accompanying adiabatic changes. Proc. R. Soc. Lond. A,
392:45–57, 1984.
[3] D. N. Klyshko. Berry geometric phase in oscillatory processes. Phys. Usp., 36(11):1005–1019,
1993.
[4] D. N. Klyshko. Basic quantum mechanical concepts from the operational viewpoint. Phys. Usp.,
41(9):885–922, 1998.
[5] D. Klyshko. Physical foundations of quantum electronics. World Scientific, 2011.
[6] S. Pancharatnam. Generalized theory of interference and its applications. Proc. Indian Acad.
Sci., 44:247, 1956.
7 Structured light
So far, we considered only the polarization state of propagating plane waves. In the
general discussion, we conveniently referred to light beams without introducing a the-
oretical framework beyond plane waves. However, to better understand the spatial de-
grees of freedom for light fields, we now need to turn away from single plane waves
and consider more realistic solutions to Maxwell’s equations. In this context we will
also learn that light, although usually described as a transverse electromagnetic wave,
cannot be described in full compliance with Maxwell’s equations if only transverse
field (polarization) components orthogonal to the mean propagation direction are as-
sumed. We need to take into account also longitudinal field components (see discus-
sion in Chapter 8), i. e., electric and magnetic field components oscillating along the
direction of propagation. Additionally and unexpectedly, we will also show that the
spatial distribution of light propagating in free space can even feature points or lines
where only electric or magnetic fields are present (similar to standing waves). The ap-
pearance of such purely electric or magnetic fields is intimately connected with the
aforementioned longitudinal components. But here we start our discussion with the
derivation of the paraxial wave or Helmholtz equation resulting in analytical and ap-
proximate beam solutions. Afterwards, we discuss the spatial structure of light beams
from a scalar and a vectorial perspective. At the end of the chapter, different methods
for the generation of structured light will be summarized.
\[
\vec{\nabla} \times \vec{\nabla} \times \vec{E} = -\mu_0 \vec{\nabla} \times \dot{\vec{H}}. \tag{7.1}
\]
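We omit the time dependence here for brevity. If we assume a harmonic time dependence
of the fields (∝ e^{−iωt}), we get
\[
\vec{\nabla} \times \vec{\nabla} \times \vec{E} = i\omega\mu_0 \vec{\nabla} \times \vec{H}. \tag{7.2}
\]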
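We now substitute the term ∇⃗ × H⃗ with the curl equation for the magnetic field,
∇⃗ × H⃗ = −iωϵ0 E⃗ (in the absence of sources), which gives
\[
\vec{\nabla} \times \vec{\nabla} \times \vec{E} = \omega^2 \mu_0 \epsilon_0 \vec{E}. \tag{7.3}
\]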
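This equation can be further simplified by taking advantage of the vector algebra rule
(4.23),
\[
\vec{\nabla} \times \vec{\nabla} \times \vec{E} = \vec{\nabla}(\vec{\nabla} \cdot \vec{E}) - \vec{\nabla}^2 \vec{E} = \omega^2 \mu_0 \epsilon_0 \vec{E}. \tag{7.4}
\]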
https://doi.org/10.1515/9783110668025-007
Together with ∇⃗ ⋅ E⃗ = 0, μ0 ϵ0 = 1/c² and k = ω/c, we finally end up with the (full) wave
or Helmholtz equation:
\[
\left(\vec{\nabla}^2 + k^2\right)\vec{E} = 0. \tag{7.5}
\]
In an equivalent manner, the corresponding equation for the magnetic field can be
derived.
In this chapter, we are interested in the retrieval of analytical solutions repre-
senting light beams propagating in a paraxial and, hence, a collimated fashion, just
like light beams emitted by a laser. This restriction will allow us to derive a parax-
ial wave equation. Without loss of generality, we let the light propagate along the
z-direction. Hence, the electric field is assumed to depend on the coordinate z as
E⃗(x, y, z) = E⃗0(x, y, z) e^{ikz}. If we substitute the electric field in (7.5) with this
dependence, we get
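\[
\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} + k^2\right)\vec{E}_0(x, y, z)\,e^{ikz} = 0, \tag{7.6}
\]
leading, after several steps, to the following equation:
\[
\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\vec{E}_0(x, y, z) + \frac{\partial^2}{\partial z^2}\vec{E}_0(x, y, z) + 2ik\frac{\partial}{\partial z}\vec{E}_0(x, y, z) = 0. \tag{7.7}
\]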
If we assume propagation in a paraxial and collimated fashion, i. e., we expect the
intensity of the beam on the optical axis to vary only very slowly with propagation
(along z) in comparison to its variation in the transverse direction (x, y), the term with
the second derivative in z can be neglected. We obtain
\[
\left(\vec{\nabla}_\perp^2 + 2ik\frac{\partial}{\partial z}\right)\vec{E}_0(x, y, z) = 0, \tag{7.8}
\]
with ∇⃗⊥² = ∂²/∂x² + ∂²/∂y². Equation (7.8) is usually referred to as the paraxial wave or
Helmholtz equation. This equation can be solved analytically. In the next section,
selected scalar solutions to this equation will be introduced and discussed. It should
be noted here already that solutions to the paraxial wave equation do not exactly satisfy
Maxwell’s equations or the full wave equation, because they result from an approximation.
This will become crucial in the discussion of three-dimensional fields
in Chapter 8. However, solutions to Eq. (7.8) represent very good approximations for
propagating collimated light beams we use in the lab. In the following section we
discuss some of the most prominent solutions of Eq. (7.8).
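To make the role of the approximation concrete, the following sketch evaluates both Eq. (7.8) and Eq. (7.7) by finite differences for the Gaussian envelope that appears in the next section as Eq. (7.9). The wavelength, waist, sampling point and step size are arbitrary values assumed for illustration, and the helper names are hypothetical.

```python
import cmath
import math

WAVELENGTH = 1.0                 # arbitrary length unit (assumed)
W0 = 10.0                        # beam waist in the same unit (assumed)
K = 2 * math.pi / WAVELENGTH

def envelope(x, y, z):
    """Gaussian envelope E0(x, y, z) of Eq. (7.9), without the factor exp(ikz)."""
    q = 1 + 2j * z / (K * W0 ** 2)
    return cmath.exp(-(x * x + y * y) / (W0 ** 2 * q)) / q

def second_derivative(f, h):
    """Central finite-difference estimate of f''(0)."""
    return (f(h) - 2 * f(0.0) + f(-h)) / h ** 2

def residuals(x, y, z, h=1e-2):
    """Moduli of the left-hand sides of Eq. (7.8) and Eq. (7.7) at one point."""
    d2x = second_derivative(lambda s: envelope(x + s, y, z), h)
    d2y = second_derivative(lambda s: envelope(x, y + s, z), h)
    d2z = second_derivative(lambda s: envelope(x, y, z + s), h)
    dz = (envelope(x, y, z + h) - envelope(x, y, z - h)) / (2 * h)
    paraxial = d2x + d2y + 2j * K * dz   # Eq. (7.8)
    full = paraxial + d2z                # Eq. (7.7), keeps the second z-derivative
    return abs(paraxial), abs(full)

paraxial_residual, full_residual = residuals(3.0, 2.0, 100.0)
print(paraxial_residual, full_residual)
```

The paraxial residual is limited only by the finite-difference error, whereas the residual of Eq. (7.7) is set by the neglected second derivative in z, which is small but non-zero for a weakly diverging beam.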
featuring a polarization state that is fixed and homogeneous across their lateral ex-
tent and upon propagation, while the intensity (and phase) are structured and, hence,
non-homogeneous. These solutions can be treated in a scalar framework, i. e., the
polarization state does not depend on the spatial coordinates. Several sets of scalar
spatial modes can be retrieved from the paraxial wave equation (7.8). Depending on the
coordinate system (Cartesian, cylindrical, etc.) or, more practically speaking, on
the symmetry of the cavity mirrors in a laser from which these modes
originate, various complete and orthogonal sets of spatial scalar modes can be de-
rived, including, but not limited to, Hermite–Gaussian (HG) or Laguerre–Gaussian
(LG) modes. These modes take their names from the Hermite or Laguerre polynomials
they depend on, which are multiplied by a Gaussian function. The fundamental mode
$\vec{E} = \vec{E}^{G}_{0,0}$ of both LG and HG mode sets is the familiar Gaussian light beam,
\[
\vec{E}^{G}_{0,0}(x, y, z) = \vec{E}_G\, \frac{1}{1 + (2iz/kw_0^2)}\, \exp\!\left[-\frac{x^2 + y^2}{w_0^2\,[1 + (2iz/kw_0^2)]}\right] e^{ikz}, \tag{7.9}
\]
with w0 denoting the beam waist (radius at z = 0) and E⃗ G the actual position-
independent transverse polarization of the beam. This analytical expression for a
paraxially propagating Gaussian beam can be rewritten in the following way:
\[
\vec{E}^{G}_{0,0}(x, y, z) = \vec{E}_G\, \frac{w_0}{w(z)}\, e^{-\frac{x^2 + y^2}{w^2(z)}}\, e^{i\left[kz + \frac{k(x^2 + y^2)}{2R(z)} - \eta(z)\right]}. \tag{7.10}
\]
The beam radius w(z), the wave front radius of curvature R(z), and the Gouy phase
term η(z) [18] are defined as follows:
\[
w(z) = w_0 \sqrt{1 + \frac{z^2}{z_R^2}}, \tag{7.11}
\]
\[
R(z) = z\left(1 + \frac{z_R^2}{z^2}\right), \tag{7.12}
\]
\[
\eta(z) = \arctan\!\left(\frac{z}{z_R}\right), \tag{7.13}
\]
with z_R = k w0²/2 being the Rayleigh range (see Fig. 7.1).
As we can see, the polarization as well as the beam shape can be approximated to
be constant upon propagation along the z-axis, while the beam diameter, the radius
of phase-front curvature, as well as an additional phase term (η) are z-dependent. The
Gouy phase term defines the relative phase lag between a Gaussian beam propagating
from −∞ to +∞ and a plane wave. It reflects the converging and diverging nature of
beam propagation.
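The equivalence of the two forms (7.9) and (7.10) and the meaning of the Rayleigh range can be checked numerically. In the sketch below the polarization vector is dropped, the wavenumber and waist are arbitrary assumed values, and the function names are illustrative.

```python
import cmath
import math

def beam_parameters(z, w0, k):
    """Beam radius, wave-front radius and Gouy phase of Eqs. (7.11)-(7.13); z != 0."""
    z_r = k * w0 ** 2 / 2
    w = w0 * math.sqrt(1 + (z / z_r) ** 2)
    radius = z * (1 + (z_r / z) ** 2)      # diverges for z -> 0 (planar phase front)
    gouy = math.atan(z / z_r)
    return w, radius, gouy, z_r

def gaussian_7_9(x, y, z, w0, k):
    """Scalar amplitude of Eq. (7.9)."""
    q = 1 + 2j * z / (k * w0 ** 2)
    return cmath.exp(-(x * x + y * y) / (w0 ** 2 * q)) / q * cmath.exp(1j * k * z)

def gaussian_7_10(x, y, z, w0, k):
    """Scalar amplitude of Eq. (7.10)."""
    w, radius, gouy, _ = beam_parameters(z, w0, k)
    r2 = x * x + y * y
    phase = k * z + k * r2 / (2 * radius) - gouy
    return (w0 / w) * math.exp(-r2 / w ** 2) * cmath.exp(1j * phase)

k, w0 = 2 * math.pi, 5.0                   # assumed values, arbitrary length unit
z_r = k * w0 ** 2 / 2
w_at_zr, r_at_zr, gouy_at_zr, _ = beam_parameters(z_r, w0, k)
difference = abs(gaussian_7_9(1.5, -2.0, 40.0, w0, k) - gaussian_7_10(1.5, -2.0, 40.0, w0, k))
print(w_at_zr / w0, r_at_zr / z_r, gouy_at_zr, difference)
```

At one Rayleigh range from the waist the beam radius has grown by √2, the wave-front radius equals 2z_R, and the Gouy phase equals π/4.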
The aforementioned higher-order HG modes (in a Cartesian coordinate frame)
form a complete set of modes and can be constructed from the fundamental Gaussian
Figure 7.1: Paraxial propagation of a beam of light. The beam with a Gaussian intensity profile prop-
agates along the horizontal (z) axis and reaches its smallest radius (beam waist w0 ) in the center
of the sketch where the phase front is planar and the radius of the phase-front curvature diverges.
The dashed white lines indicate the geometrical rays crossing at the focus and defining the angu-
lar spread of the beam for large distances from the waist plane. The intensity along the optical axis
changes only very slowly (small convergence and divergence angle) allowing for the application of a
paraxial approximation.
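\[
\vec{E}^{HG}_{m,n}(x, y, z) = w_0^{m+n}\, \frac{\partial^m}{\partial x^m} \frac{\partial^n}{\partial y^n}\, \vec{E}^{G}_{0,0}(x, y, z), \tag{7.14}
\]
\[
\vec{E}^{HG}_{m,n}(x, y, z) = \vec{E}_{HG}\, \frac{w_0}{w(z)}\, H_m\!\left(\sqrt{2}\,\frac{x}{w(z)}\right) H_n\!\left(\sqrt{2}\,\frac{y}{w(z)}\right) \cdot e^{-\frac{x^2 + y^2}{w^2(z)}}\, e^{i\left[kz + \frac{k(x^2 + y^2)}{2R(z)} - \eta_{HG}(z)\right]}, \tag{7.15}
\]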
with ηHG(z) being the generalized Gouy phase term defined by ηHG(z) = (m + n +
1) arctan(z/zR). The polynomial pre-factors result in changes of the phase within the
beam cross-section (beam profile for a fixed z position), affecting also the intensity.
The modes therefore naturally feature a structured (transversely varying) phase
distribution in combination with a non-homogeneous intensity pattern. In addition,
Hermite–Gaussian functions are eigenfunctions of the Fourier transform and, hence, keep their shapes
upon propagation. Selected first-order (m, n) = (0, 1) (a) and (m, n) = (1, 0) (b) HG
modes are shown in Fig. 7.2. As can be seen, the modes feature multiple intensity lobes
with their number depending on the chosen indices, while the phase changes from
lobe to lobe by π.
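The lobe structure and the π phase jumps follow directly from the Hermite polynomials. The sketch below evaluates the transverse profile of Eq. (7.15) in the waist plane z = 0, where w(z) = w0 and the curvature and Gouy phase terms vanish; constant prefactors are dropped and the function names are chosen here for illustration.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * m * h_prev
    return h

def hg_waist(m, n, x, y, w0=1.0):
    """HG_{m,n} amplitude of Eq. (7.15) at z = 0, constant prefactors dropped."""
    root2 = math.sqrt(2.0)
    return (hermite(m, root2 * x / w0) * hermite(n, root2 * y / w0)
            * math.exp(-(x * x + y * y) / w0 ** 2))

def gouy_hg(m, n, z, z_r):
    """Generalized Gouy phase (m + n + 1) arctan(z / z_R)."""
    return (m + n + 1) * math.atan(z / z_r)

left, right = hg_waist(1, 0, -0.5, 0.2), hg_waist(1, 0, 0.5, 0.2)
print(left, right)                 # equal magnitude, opposite sign: phase jump of pi
print(gouy_hg(0, 0, 1.0, 1.0), gouy_hg(1, 0, 1.0, 1.0))
```

The real amplitude changes sign between the two lobes of the (1, 0) mode, which is the π phase step mentioned above, and the first-order mode accumulates twice the Gouy phase of the fundamental mode.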
Similarly, another prominent and full set of solutions can be derived, i. e., the
aforementioned LG modes based on Laguerre polynomials. These modes feature cylin-
drically symmetric ring-like intensity distributions, with the number and size of rings
depending on the chosen indices. The full set of modes can be described by the equa-
tions
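\[
\vec{E}^{LG}_{l,p}(r, \phi, z) = \vec{E}_{LG}\, \frac{w_0}{w(z)} \left(\frac{r\sqrt{2}}{w(z)}\right)^{|l|} L_p^{|l|}\!\left(\frac{2r^2}{w^2(z)}\right) \cdot e^{il\phi}\, e^{-\frac{r^2}{w^2(z)}}\, e^{i\left[kz + \frac{kr^2}{2R(z)} - \eta_{LG}(z)\right]}, \tag{7.16}
\]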
Figure 7.2: Examples of first-order HG (a, b) and first azimuthal order LG (c) paraxial modes, respec-
tively. Distributions of the normalized (electric field) intensity (color-coded), phase (inset) and a
snapshot of the electric field vectors (white arrows) are shown. Figure reproduced from [2].
where l is the azimuthal and p the radial index defining the order of the mode and relat-
ing to the azimuthal (ϕ) and radial (r) coordinate in the cylindrical coordinate system.
ηLG(z) is the generalized Gouy phase term defined by ηLG(z) = (|l| + 2p + 1) arctan(z/zR).
Figure 7.2(c) also shows intensity and phase distributions of the selected LG mode.
As a direct consequence of the term e^{ilϕ}, LG modes of azimuthal order l = 1 (l = −1)
or higher (lower) exhibit a phase that changes by 2πl over one full turn around the
optical axis, leading to a
spiral phase front. The phase is undefined at the origin (optical axis) where the corre-
sponding modes are dark (intensity reaches zero). Such a point of darkness (or lines
of darkness along the beam axis) is referred to as phase singularity or phase vortex.
Due to the topological nature of this feature, l is usually also called the topological
charge of the vortex. Optical vortices can also be observed when multiple light waves
interfere, for instance, in the case of diffraction.
It should also be noted here that LG modes can also be constructed from HG modes
and vice versa. This is particularly easy to see for first (azimuthal) order LG modes,
which can be constructed by superposing two spatially orthogonal HG modes with in-
dices (0, 1) and (1, 0) polarized along the same axis (see Fig. 7.2(a) and (b)). When su-
perposed in-phase or with a phase delay of π, the resultant mode still features an HG
mode shape but appears rotated by ±45°. However, if the two constituent modes are
dephased by ±π/2, a first (l = ±1) order LG mode is created featuring a ring-like intensity
distribution and the aforementioned spiral phase front (Fig. 7.2(c)). Based on this sim-
ple construction it is easy to see that an LG beam of, e. g., first azimuthal order shows a
first-order HG modal profile for a fixed snapshot in time. As time evolves, this simple
pattern rotates around the optical axis at the optical frequency. If HG, LG or other
paraxial modes of different orders are superposed, interesting propagation-induced
effects appear as a direct consequence of mode-order-dependent Gouy phases [3, 9,
31, 44].
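The construction of a first-order LG mode from two HG modes can be illustrated by counting the phase winding around the optical axis. The sketch below works in the waist plane with w0 = 1 and constant prefactors dropped, so that the (1, 0) and (0, 1) modes are simply proportional to x and y times a Gaussian; the function names are hypothetical.

```python
import cmath
import math

def hg10(x, y):
    return x * math.exp(-(x * x + y * y))

def hg01(x, y):
    return y * math.exp(-(x * x + y * y))

def superposition(x, y, relative_phase):
    """HG10 plus HG01 with the given relative phase."""
    return hg10(x, y) + cmath.exp(1j * relative_phase) * hg01(x, y)

def topological_charge(relative_phase, radius=0.5, steps=360):
    """Phase accumulated on a counter-clockwise loop, in units of 2*pi."""
    total = 0.0
    for n in range(steps):
        p0 = 2 * math.pi * n / steps
        p1 = 2 * math.pi * (n + 1) / steps
        f0 = superposition(radius * math.cos(p0), radius * math.sin(p0), relative_phase)
        f1 = superposition(radius * math.cos(p1), radius * math.sin(p1), relative_phase)
        total += cmath.phase(f1 / f0)      # wrapped phase step between neighbours
    return round(total / (2 * math.pi))

print(topological_charge(+math.pi / 2), topological_charge(-math.pi / 2))
```

A relative phase of +π/2 or −π/2 yields a vortex of charge +1 or −1, respectively, while the intensity on the loop is constant, as expected for a ring-like LG mode.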
In this context it is also worth mentioning that the spiral phase fronts and, there-
fore, the azimuthal mode indices of LG beams are also associated with another very
Figure 7.3: Selected examples of paraxial (a) azimuthally and (b) counter-rotating azimuthally po-
larized as well as (c) Poincaré-type beams. Left: Distributions of the normalized total (electric field)
intensity (color-coded). Center and right: Color-coded (electric field) intensity distributions of the
transverse electric field components Ex and Ey . Snapshots of the electric field vectors (white arrows)
are shown as overlays. For better visibility, the field vectors are not shown for the full electric field
intensity of the Poincaré beam in (c). Maps of the relative phase between the field components are
shown as insets. Figure reproduced from [2].
order. We can superpose two of the above-mentioned scalar spatial modes, e. g., HG
modes, but now with orthogonal polarization states,
\[
\vec{E}_{\mathrm{radial}} = E^{HG}_{1,0}\,\vec{e}_x + E^{HG}_{0,1}\,\vec{e}_y, \tag{7.17}
\]
\[
\vec{E}_{\mathrm{azimuthal}} = E^{HG}_{0,1}\,\vec{e}_x - E^{HG}_{1,0}\,\vec{e}_y. \tag{7.18}
\]
Alternatively, also the superposition of two scalar LG modes of azimuthal index 1 and
−1 carrying opposite handedness of circular polarization results in either radial or az-
imuthal polarization. It should be noted here that depending on the chosen relative
phase and polarization of the two constituent modes brought together, also beams
carrying a spiral polarization [15] can be created, finding interesting applications es-
pecially when highly confined spatially (see the discussion in Chapter 8). These beams
can be visualized very instructively when overlapping a radially and an azimuthally
polarized doughnut beam, which oscillate either in phase or with a phase-shift of π.
Locally, the beam therefore exhibits a non-zero radial and simultaneously azimuthal
electric field component, forming a spiral-like pattern of the field for a snapshot in
time, imprinting a certain type of handedness [15, 21, 41].
Radial, azimuthal and spiral cylindrical vector beams all feature a locally linear
electric field, which rotates clockwise when walking around the optical axis in a clock-
wise sense. These beams therefore have a polarization order of 1, i. e., a rotation of 2π
for a full trip around the axis. The polarization order is thus defined in analogy to
the azimuthal order of LG modes introduced above, which counts the number
of 2π azimuthal phase changes. Likewise, the polarization order can be
larger than 1 (in integer steps) if the linear polarization rotates by more than 2π
in one trip around the axis [25, 40]. If a cylindrical vector beam carries a negative polarization
order, the electric field necessarily rotates counter-clockwise.
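The polarization order can be obtained numerically by following the direction of the electric field around the optical axis. The sketch below uses the waist-plane forms of Eqs. (7.17) and (7.18) with w0 = 1 and constant prefactors dropped. The third field, `counter_rotating`, is a hypothetical example added here to illustrate a negative order; it is not an expression from the text.

```python
import math

def radial(x, y):
    """Eq. (7.17): HG10 along e_x plus HG01 along e_y."""
    g = math.exp(-(x * x + y * y))
    return (x * g, y * g)

def azimuthal(x, y):
    """Eq. (7.18): HG01 along e_x minus HG10 along e_y."""
    g = math.exp(-(x * x + y * y))
    return (y * g, -x * g)

def counter_rotating(x, y):
    """Hypothetical field whose direction rotates against the sense of the path."""
    g = math.exp(-(x * x + y * y))
    return (x * g, -y * g)

def polarization_order(beam, radius=0.5, steps=360):
    """Net rotation of the field direction on a counter-clockwise loop, in units of 2*pi."""
    total = 0.0
    for n in range(steps):
        angles = []
        for m in (n, n + 1):
            phi = 2 * math.pi * m / steps
            ex, ey = beam(radius * math.cos(phi), radius * math.sin(phi))
            angles.append(math.atan2(ey, ex))
        step = (angles[1] - angles[0] + math.pi) % (2 * math.pi) - math.pi  # wrap
        total += step
    return round(total / (2 * math.pi))

print(polarization_order(radial), polarization_order(azimuthal),
      polarization_order(counter_rotating))
```

Radial and azimuthal beams both have order +1 and are locally orthogonal to each other, whereas the counter-rotating example has order −1.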
Although cylindrical vector beams discussed so far feature interesting polariza-
tion patterns, they are still linearly polarized everywhere with only the orientation of
the polarization plane varying spatially. However, this is by far not the end of the story.
Light beams may also feature multiple different polarization types, from linear via
elliptical to circular, all present in one cross-section. This remark leads us directly
to another very interesting class of spatial vectorial modes of light, which deserves
mention here because it is intimately connected to complex polarization patterns
(see next section): the so-called Poincaré
beams [5] (see Fig. 7.3). The surface of the Poincaré sphere (see Chapter 3) contains all
possible homogeneous polarization states. A Poincaré beam is therefore a
beam featuring spatially varying polarization states covering either the full Poincaré
sphere (full Poincaré beams) or a part of it. In a certain sense, radially and azimuthally
polarized modes can be seen as trivial versions of Poincaré beams, because they cover
the full equator. Also this class of beams can be constructed, e. g., by superposing
modes we met with already [3, 5]. For instance, if a circularly polarized fundamental
Gaussian beam is co-propagating with a first (or minus first) order LG mode of oppo-
site handedness, the resulting beam will be circularly polarized on the optical axis,
elliptically polarized off-axis, where both modes feature non-zero intensity, opposite
handedness and unequal amplitudes, and linearly polarized, where the two modes
have the same amplitude. In addition, the orientation of the polarization ellipse is
ruled by the azimuthally changing phase of the LG mode, hence covering a large por-
tion of the Poincaré sphere’s surface.
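The radial variation of the polarization state in such a beam can be sketched with the waist-plane amplitudes of the two constituent modes. The example below assumes equal amplitude prefactors for the Gaussian and the LG mode (l = 1, p = 0) of Eq. (7.16) and an arbitrary sign convention in which +1 denotes the handedness carried by the Gaussian; both are assumptions made for illustration.

```python
import cmath
import math

def circular_amplitudes(r, phi, w0=1.0):
    """Waist-plane amplitudes: Gaussian in one circular state, LG (l=1, p=0) in the other."""
    envelope = math.exp(-r * r / w0 ** 2)
    gauss = envelope
    lg = (math.sqrt(2.0) * r / w0) * cmath.exp(1j * phi) * envelope
    return gauss, lg

def s3(r, phi=0.0, w0=1.0):
    """Normalized circular Stokes parameter (+1: handedness of the Gaussian, assumed sign)."""
    gauss, lg = circular_amplitudes(r, phi, w0)
    return (abs(gauss) ** 2 - abs(lg) ** 2) / (abs(gauss) ** 2 + abs(lg) ** 2)

for r in (0.0, 0.3, 1 / math.sqrt(2), 1.5):
    print(f"r = {r:.3f}  s3 = {s3(r):+.3f}")
```

Under these assumptions the beam is circularly polarized on the axis, linearly polarized on the circle r = w0/√2 where the two amplitudes are equal, and dominated by the opposite handedness further out.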
It should be noted here that beams of different order accumulate different phases
upon propagation (Gouy phase) influencing the relative polarization in the transverse
plane depending on the propagation distance [3, 9, 31, 44].
As we will see in Chapter 8, the spatial confinement of light will naturally lead to
an even more complex structure of the electric and/or magnetic components of elec-
tromagnetic fields.
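\[
a = \frac{\mathrm{Re}\!\left(\vec{E}\sqrt{\vec{E}^* \cdot \vec{E}^*}\right)}{\left|\sqrt{\vec{E} \cdot \vec{E}}\right|}, \tag{7.19}
\]
\[
b = \frac{\mathrm{Im}\!\left(\vec{E}\sqrt{\vec{E}^* \cdot \vec{E}^*}\right)}{\left|\sqrt{\vec{E} \cdot \vec{E}}\right|}, \tag{7.20}
\]
\[
c = \mathrm{Im}\!\left(\vec{E}^* \times \vec{E}\right), \tag{7.21}
\]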
with the spatial dependence of the field and the ellipse parameters omitted. c is also
proportional to the spin density, which defines the local degree of circular polarization.
If a and b are of the same length, the polarization ellipse turns into a circle, i. e., the
field is locally circularly polarized. The orientation of the ellipse is then no longer
defined, and it does not feature distinguishable major or minor semiaxes. Such a point
of circular polarization is therefore usually referred to as polarization singularity in
general and a C-point in particular. Similarly, if a field is locally linearly polarized, the
ellipse is a line and the minor axis b as well as the parameter c are zero, and thus
singular as well. Such a point is therefore called an L-point.1
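Equations (7.19)–(7.21) can be evaluated for simple transverse fields to see how the two kinds of singular points arise. The sketch below treats a 2D field E = (Ex, Ey), returns only the z component of c, and uses the principal branch of the complex square root; the function name and the decision to raise an error at a C-point are choices made here for illustration.

```python
import cmath

def ellipse_parameters(ex, ey):
    """Semi-axis vectors a, b and the z component of c for a transverse field (ex, ey)."""
    ex, ey = complex(ex), complex(ey)
    root = cmath.sqrt((ex * ex + ey * ey).conjugate())   # sqrt(E* . E*)
    scale = abs(cmath.sqrt(ex * ex + ey * ey))           # |sqrt(E . E)|
    if scale == 0.0:
        raise ValueError("C-point: E.E = 0, the ellipse axes are undefined")
    a = ((ex * root).real / scale, (ey * root).real / scale)
    b = ((ex * root).imag / scale, (ey * root).imag / scale)
    c = (ex.conjugate() * ey - ey.conjugate() * ex).imag  # z component of Im(E* x E)
    return a, b, c

# Elliptical polarization with semi-axes 1 (along x) and 0.5 (along y)
a_ell, b_ell, c_ell = ellipse_parameters(1, 0.5j)
# Linear polarization along the diagonal, with an arbitrary global phase
phase = cmath.exp(0.3j)
a_lin, b_lin, c_lin = ellipse_parameters(phase, phase)
print(a_ell, b_ell, c_ell)
print(a_lin, b_lin, c_lin)
```

For the linear field b and c vanish (an L-point), and for a circular field such as (1, i) the normalization |√(E⃗ ⋅ E⃗)| itself vanishes, which is the numerical signature of a C-point.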
The convenient and powerful notion of polarization singularities and polarization
distributions in generic 2D fields, following that of scalar phase singularities and
their surroundings in scalar fields or beams, was introduced and studied in detail by
Nye, Hajnal, Berry, Dennis, Soskin, Freund and others [6, 7, 11, 16, 19, 20, 32, 38]. They
also found that around C-points, fields of elliptical polarization form, which take on
specific distributions with respect to the ellipse orientations in 2D (see Fig. 7.4). These
distributions, together with their polarization singularity in the center, form topolog-
ical structures, just like the phase singularities and the phase map around them. The
polarization ellipses for the most generic fields rotate by ±π when walking along a
closed circle around the central C-point, resulting in the definition of a topological
index of ±1/2 with the sign depending on the sense of rotation of the ellipses (rotating
clockwise or counter-clockwise for a clockwise path). Hence, the ellipses only rotate
by 180 degrees in contrast to the polarization order discussed above, which was equal
to full integer numbers and, thus, full 360 degree rotation of the electric field vector for
one round-trip. This is possible because the major and minor semiaxes of the polar-
ization ellipse are directors and not vectors. A rotation by 180 degrees brings us back
to the original orientation of the ellipse. The ellipse field is usually surrounded by a
closed line of linear polarization separating points of elliptical polarization of opposite
handedness. The magnitude of the topological index can also be higher (single or multiple
full or half turns of the ellipse) for more complicated distributions.
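The half-integer index can be checked numerically by tracking the ellipse orientation along a closed loop around a C-point. The sketch below is an illustration under stated assumptions: the model field (a uniform circular component plus a vortex in the opposite circular component) and the names `c_point_index` and `model_field` are not from the text, and the loop is traversed counter-clockwise, so the sign convention may differ from the one used above.

```python
import numpy as np

def c_point_index(Ex, Ey, radius=0.5, n=721):
    """Winding of the ellipse orientation on a circle around the origin.

    Ex, Ey: callables returning the complex field components at (x, y).
    The orientation angle is psi = 0.5 * arg(S1 + i*S2), with S1 and S2 the
    local Stokes parameters. The index is the net change of psi over one
    counter-clockwise loop divided by 2*pi.
    """
    phi = np.linspace(0.0, 2.0 * np.pi, n)
    x, y = radius * np.cos(phi), radius * np.sin(phi)
    ex, ey = Ex(x, y), Ey(x, y)
    s1 = np.abs(ex) ** 2 - np.abs(ey) ** 2
    s2 = 2.0 * np.real(np.conj(ex) * ey)
    psi = 0.5 * np.unwrap(np.angle(s1 + 1j * s2))
    return (psi[-1] - psi[0]) / (2.0 * np.pi)

def model_field(sign):
    """Uniform circular component plus a vortex (x + i*sign*y) in the other one."""
    zeta = lambda x, y: x + 1j * sign * y
    Ex = lambda x, y: (1 + zeta(x, y)) / np.sqrt(2)
    Ey = lambda x, y: 1j * (1 - zeta(x, y)) / np.sqrt(2)
    return Ex, Ey

index_plus = c_point_index(*model_field(+1))    # lemon-type pattern
index_minus = c_point_index(*model_field(-1))   # star-type pattern
```

For this model field S1 + iS2 is proportional to x ± iy, so the orientation angle advances by half the azimuth and the ellipses complete only half a turn per round trip.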
Two fundamental distributions of ellipse fields around C-points are depicted in Fig. 7.4.
Based on their shape they are called star, lemon and monstar (le-mon-star; not shown
here) [7]. The star-type ellipse field distribution is similar to the field in the cross-
section of the above-mentioned Poincaré beam with a central C-point surrounded by
elliptically polarized light with position-dependent ellipse orientation (see Fig. 7.3). In
Chapter 8, we extend our discussion to cover also intriguing phenomena appearing in
3D ellipse fields.
1 In 2D ellipse fields, points of linear polarization can usually be found along lines, while C-points are
isolated.
7.5 Basic principles of structured light beam generation
be converted into a spatial mode is manipulated by acting on the beam’s local amplitude,
phase or polarization. By implementing a position-dependent modification of these
parameters, the beam can be engineered spatially on demand and almost without
limits. The utilization of liquid crystal technology adds an extra level of
control to such methods. Liquid crystal molecules show strong optical birefringence
thus acting like miniaturized waveplates and phase-shifters (see also Section 4.2.4),
while also aligning with an external electric field, which can be used to control the
effective birefringence or induced phase precisely. This enables fine control over the
beam shape and quality, and it also allows for spectral tunability. The spatial degree
of freedom in the corresponding manipulation of polarization and phase is achieved
by different means, for instance, by imposing a spatial orientation of the molecules
by structured alignment layers or by sub-dividing a liquid crystal cell into pixels and
providing pixel-by-pixel voltage control. The latter class of devices is referred to as
spatial light modulators (SLM), described in more detail in Chapter 9. They can be
implemented as reflective or transmissive devices acting on the local phase of a ho-
mogeneously polarized input mode. Position-dependent manipulation is realized via
discretized pixel arrays controllable individually. An SLM can therefore be run like
a computer screen, turning it into a very versatile, highly flexible and extraordinar-
ily powerful device for beam shaping, imposing almost no limits with respect to the
mode order or type to be generated. Despite their phase-only (and amplitude) oper-
ation, SLMs were also successfully utilized for the generation of cylindrical vector
beams and other vectorial spatial modes [10, 29, 30]. Hence, SLMs are the right choice
if flexibility with respect to mode type or order as well as wavelength is required. In
addition, so-called q-plates [28], which are also based on voltage-driven liquid crys-
tal cells (see also Section 9.6.2) and position-dependent manipulation of the field in
the sense of a HWP, have been established as simple yet highly efficient and reliable
devices with various areas of application. The underlying idea is the same as the
one discussed above for segmented waveplates. The liquid crystal molecules behave
like microscopic HWPs and feature azimuthally varying orientations, hence influenc-
ing the polarization in a position-dependent manner. In particular, they are usually
implemented for the generation of LG modes or cylindrical vector beams. However,
they only allow for the generation of a specific group of modes defined by the cho-
sen arrangement of liquid-crystal molecular chains. The concept behind q-plates is
discussed in more detail in Section 9.6.2.
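The action of a plate built from local half-wave plates with azimuthally varying orientation can be sketched with Jones matrices. The snippet below is a minimal illustration, assuming an ideal HWP (global phase dropped) whose axis angle varies as q·φ + α0 with the azimuth φ. The parameter names `q` and `alpha0` and this linear dependence are assumptions made here for illustration and are not specified in this section.

```python
import numpy as np

def hwp(alpha):
    """Jones matrix of an ideal half-wave plate with its axis at angle alpha."""
    c, s = np.cos(2 * alpha), np.sin(2 * alpha)
    return np.array([[c, s], [s, -c]], dtype=complex)

def q_plate_output(jones_in, phi, q=0.5, alpha0=0.0):
    """Local output Jones vector at azimuth phi for the axis angle q*phi + alpha0."""
    return hwp(q * phi + alpha0) @ np.asarray(jones_in, dtype=complex)

phi = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)

# x-polarized input, q = 1/2: the output points along the local radial direction
radial = np.array([q_plate_output([1, 0], p) for p in phi])

# circular input: the handedness flips and a helical phase exp(2i*q*phi) appears
circ_in = np.array([1, 1j]) / np.sqrt(2)
helical = np.array([q_plate_output(circ_in, p) for p in phi])
```

With these assumptions, a linearly polarized input yields a cylindrical vector beam pattern, while a circularly polarized input acquires an azimuthal phase, which is the type of phase front associated with LG modes.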
Bibliography
[1] L. Allen, M. W. Beijersbergen, R. J. C. Spreeuw, and J. P. Woerdman. Orbital angular momentum
of light and the transformation of Laguerre–Gaussian laser modes. Phys. Rev. A, 45(11):8185,
1992.
[2] P. Banzer. Nano-optics and plasmonics with complex spatial modes of light—structured
electromagnetic fields at the nanoscale, 2019. Habilitation Thesis.
[3] T. Bauer, P. Banzer, E. Karimi, S. Orlov, A. Rubano, L. Marrucci, E. Santamato, R. W. Boyd, and
G. Leuchs. Observation of optical polarization Möbius strips. Science, 347(6225):964–966,
2015.
[4] V. Y. Bazhenov, M. S. Soskin, and M. V. Vasnetsov. Screw dislocations in light wavefronts.
J. Mod. Opt., 39(5):985–990, 1992.
[5] A. M. Beckley, T. G. Brown, and M. A. Alonso. Full Poincaré beams. Opt. Express,
18(10):10777–10785, 2010.
[6] M. V. Berry. Index formulae for singular lines of polarization. J. Opt. A, Pure Appl. Opt.,
6(7):675, 2004.
[7] M. V. Berry and J. H. Hannay. Umbilic points on Gaussian random surfaces. J. Phys. A, Math.
Gen., 10(11):1809, 1977.
[8] Z. Bomzon, G. Biener, V. Kleiner, and E. Hasman. Real-time analysis of partially polarized light
with a space-variant subwavelength dielectric grating. Opt. Lett., 27(3):188–190, 2002.
[9] F. Cardano, E. Karimi, L. Marrucci, C. de Lisio, and E. Santamato. Generation and dynamics of
optical beams with polarization singularities. Opt. Express, 21(7):8815–8820, 2013.
[10] V. Chille, S. Berg-Johansen, M. Semmler, P. Banzer, A. Aiello, G. Leuchs, and C. Marquardt.
Experimental generation of amplitude squeezed vector beams. Opt. Express,
24(11):12385–12394, 2016.
[11] M. R. Dennis. Polarization singularities in paraxial vector fields: morphology and statistics.
Opt. Commun., 213(4–6):201–221, 2002.
[12] R. Dorn, S. Quabis, and G. Leuchs. Sharper focus for a radially polarized light beam. Phys. Rev.
Lett., 91(23):233901, 2003.
[13] J. Durnin. Exact solutions for nondiffracting beams. I. The scalar theory. JOSA A, 4(4):651–654,
1987.
[14] J. Durnin, J. J. Miceli Jr, and J. H. Eberly. Diffraction-free beams. Phys. Rev. Lett., 58(15):1499,
1987.
[15] J. S. Eismann, M. Neugebauer, and P. Banzer. Exciting a chiral dipole moment in an achiral
nanostructure. Optica, 5(8):954–959, 2018.
[16] I. Freund. Polarization flowers. Opt. Commun., 199(1–4):47–63, 2001.
[17] Z. Ghadyani, I. Harder, N. Lindlein, A. Berger, W. Iff, I. Vartiainen, and M. Kuittinen.
Concentric ring metal grating for generating radially polarized light. Appl. Opt., 50:2451,
2011.
[18] L. G. Gouy. Sur une propriété nouvelle des ondes lumineuses. C. R. Acad. Sci. Paris, 110:1251,
1890.
[19] J. V. Hajnal. Singularities in the transverse fields of electromagnetic waves. I. Theory. Proc. R.
Soc. Lond. Ser. A, Math. Phys. Sci., 414(1847):433–446, 1987.
[20] J. V. Hajnal. Singularities in the transverse fields of electromagnetic waves.
II. Observations on the electric field. Proc. R. Soc. Lond. Ser. A, Math. Phys. Sci.,
414(1847):447–468, 1987.
[21] A. Holleczek, A. Aiello, C. Gabriel, C. Marquardt, and G. Leuchs. Classical and quantum
properties of cylindrically polarized states of light. Opt. Express, 19(10):9714–9736, 2011.
[22] J. Kalwe, M. Neugebauer, C. Ominde, G. Leuchs, G. Rurimo, and P. Banzer. Exploiting
cellophane birefringence to generate radially and azimuthally polarised vector beams. Eur.
J. Phys., 36(2):025011, 2015.
[23] E. Karimi, S. A. Schulz, I. De Leon, H. Qassim, J. Upham, and R. W. Boyd. Generating optical
orbital angular momentum at visible wavelengths using a plasmonic metasurface. Light Sci.
Appl., 3(5):e167, 2014.
[24] E. Karimi, G. Zito, B. Piccirillo, L. Marrucci, and E. Santamato. Hypergeometric-Gaussian
modes. Opt. Lett., 32(21):3053–3055, 2007.
https://doi.org/10.1515/9783110668025-008
8 Polarization of light at the nanoscale
get a better understanding of effects to appear close to or in the focal plane, we assume
a linearly (y) polarized light beam propagating along the z-axis in free space and plug
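it into Maxwell’s equations. In preparation for this step, we rewrite Gauss’ law in
differential form for the vacuum [30]:
\[ \nabla\cdot\vec{E} = 0 \quad\Rightarrow\quad \frac{\partial E_z}{\partial z} = -\frac{\partial E_y}{\partial y}. \tag{8.1} \]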
Here we still assume for convenience that the x-component (crossed in-plane com-
ponent) of the electric field is strictly zero. However, it should be noted here already
that this is not the case for symmetry reasons (see Section 8.2.1). The equation above
should help us now to characterize the longitudinal field component and its distribu-
tion in the focal plane (z = 0). For a beam exhibiting a fundamental Gaussian intensity
envelope in the transverse plane,
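\[ E_y \propto e^{-\frac{x^2+y^2}{w_0^2}}\, e^{ikz}, \tag{8.2} \]
the longitudinal component in the focal plane follows as
\[ E_z(x, y, 0) = -i\,\frac{2y}{k w_0^2}\, E_y(x, y, 0). \tag{8.3} \]
With k = 2π/λ we obtain
\[ E_z(x, y, 0) = -i\,\frac{\lambda}{\pi w_0^2}\, y\, E_y(x, y, 0). \tag{8.4} \]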
We can immediately identify various interesting features. First and foremost, the lon-
gitudinal field component is indeed non-zero (for y ≠ 0). Furthermore, Ez features a
Hermite–Gaussian transverse distribution depending on the distribution of the (main)
y-component, which is parallel to the polarization. The strength of the longitudinal
field component scales with the wavelength and inversely with the square of the beam
waist w0 ; the smaller the beam in the focal plane, the stronger the z-component. Last
but not least, we also find that the transverse field component Ey and the longitudinal
one Ez are ±π/2 out of phase, as indicated by the imaginary unit i [6, 33]. This seemingly
minor detail holds immense potential in terms of fundamental physics and applications
(see [1]) and will be discussed in more detail in Section 8.4.1.
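To get a feeling for the magnitudes involved, Eq. (8.4) can be evaluated for different beam waists. The sketch below is an illustration only: the function name `ez_over_ey_peak` is hypothetical, and the maximum of |Ez| is located at y = w0/√2, where y·exp(−y²/w0²) peaks. For waists approaching the wavelength the paraxial expression is no longer reliable, so the last value is merely indicative.

```python
import numpy as np

def ez_over_ey_peak(wavelength, w0):
    """Peak |Ez| relative to the on-axis |Ey| in the focal plane, from Eq. (8.4).

    |Ez| ~ (lambda / (pi * w0**2)) * y * exp(-y**2 / w0**2) is maximal at
    y = w0 / sqrt(2) (x = 0).
    """
    y = w0 / np.sqrt(2.0)
    return wavelength / (np.pi * w0**2) * y * np.exp(-(y / w0) ** 2)

# beam waists in units of the wavelength
for w0_in_wavelengths in (100.0, 10.0, 1.0):
    ratio = ez_over_ey_peak(1.0, w0_in_wavelengths)
    print(f"w0 = {w0_in_wavelengths:6.1f} lambda: max|Ez| / max|Ey| = {ratio:.4f}")
```

The ratio scales as λ/w0, so reducing the waist by a factor of ten increases the relative longitudinal field strength by the same factor.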
In summary, this short theoretical calculation shows that longitudinal field components
must be present for the considered case (and many others), although their contribution
might not be significant for large beam waists (limit of the paraxial approximation).
However, they can grow to a substantial level quite rapidly in the case of tight focusing,
discussed in the next section. We also mention the work by Lax et al., who use a slightly
different approach to show that longitudinal electric fields must appear [23].
8.2 3D-Structured landscapes of light resulting from strong spatial confinement
that the field is plane polarized while featuring a limited spatial extent [23]. They pro-
posed extensions, or corrections, to the paraxial solutions for coping with this appar-
ent issue. This ansatz provides for a more accurate description of the field.
Nonetheless, we want to focus here on an intuitive and powerful description,
which is capable of also describing tight focusing by a lens. At its core, it is based
on the viewpoint of geometrical optics. We know of course that geometrical optics
in its original sense neither provides information about the vectorial properties
of a focal spot nor reflects properly how the beam propagates within or close
to the focal volume. Furthermore, it even suggests an unphysical, infinitely small focal
area, or a focal point. The power of geometrical optics should, however, not be
underestimated, as the case of ray tracing impressively shows. It can serve as an
underlying framework, which can be extended by including the polarization of light. For
the discussion, we assume the following setting, also sketched in Fig. 8.1. A bundle of
aligned parallel rays represented by an incoming wave vector k⃗in impinges on a thin
lens (blue vertical line). This bundle of parallel rays can also be replaced by a single
planar wave with wave vector k⃗in propagating along the z-axis. The incoming rays are
refracted by the lens and redirected to meet at the geometrical focus point. Therefore,
the field behind the lens cannot be described by a single planar wave anymore. Each
ray now has its own wave vector k⃗out,n pointing towards the focal point with the lim-
iting angle depending on the focusing strength and geometrical aperture size of the
lens. Wave vectors above and below the optical axis (positive and negative y-axis) at
equal distances feature the same z-component but opposite signs of the y-component.
The plane waves propagating along each ray interfere and form the respective focal
pattern in the focal volume similar to the interference pattern behind a double-slit
or grating. The solid cone of rays and wave vectors is usually referred to as angular
spectrum. The electric and magnetic field vectors for each geometrical ray or the cor-
responding plane wave have to be orthogonal to the ray or local wave vector due to
the transversality condition. If the input field (planar wave) is vertically linearly
polarized, the output polarization of each ray behind the lens depends on the
respective lateral position (along the x- and y-axis). This already illustrates that
the lens (or non-paraxial propagation) induces a depolarization effect by redirecting
the partial rays, which can also take a more complex form. A plane wave is an exact
solution to the full wave equation and Maxwell’s equations. Hence, a superposition
or spectrum of plane waves is also a solution. Consequently, we have not applied any
approximations up to this point. We also see that, due to the depolarization, the focal
spot will of course not be infinitely small, because the interference of plane waves
results in an interference pattern consisting of three-dimensional fields (electric
or magnetic or both).
Figure 8.1: An incoming plane wave (or bundle of aligned parallel plane waves) impinges on a thin
lens (blue vertical line), focusing individual light rays to meet at the geometrical focus. Each ray is
redirected by the refraction in the lens. Hence, the propagation behind the lens can no longer be
represented by a single k⃗in but requires an angular spectrum of wave vectors k⃗out,n.
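The geometrical picture can be turned into a short numerical sketch. The code below is an illustration under stated assumptions, not the method of the text: each ray hitting an ideal thin lens at (x, y) is bent towards the focus by an angle θ with tan θ = d/f (plain thin-lens geometry), the azimuthal field component is left unchanged, the radial one is tilted together with the ray, and all amplitude factors are ignored. The function name `refract_ray` is hypothetical.

```python
import numpy as np

def refract_ray(E_in, x, y, f):
    """Field vector and propagation direction of one ray behind an ideal thin lens.

    E_in: transverse input field (Ex, Ey) of the incoming plane wave.
    The ray at lateral position (x, y) is redirected towards the focus at
    distance f. Returns (E_out, k_dir) as 3-component arrays.
    """
    E3 = np.array([E_in[0], E_in[1], 0.0])
    z = np.array([0.0, 0.0, 1.0])
    d = np.hypot(x, y)
    if d == 0.0:
        return E3, z                          # the on-axis ray is not deflected
    theta = np.arctan2(d, f)                  # deflection angle (thin-lens geometry)
    rho = np.array([x, y, 0.0]) / d           # radial unit vector
    phi = np.array([-y, x, 0.0]) / d          # azimuthal unit vector
    k_dir = -np.sin(theta) * rho + np.cos(theta) * z
    E_out = (E3 @ phi) * phi + (E3 @ rho) * (np.cos(theta) * rho + np.sin(theta) * z)
    return E_out, k_dir

E_up, k_up = refract_ray([0.0, 1.0], 0.0, 1.0, 1.0)     # ray above the axis
E_down, k_down = refract_ray([0.0, 1.0], 0.0, -1.0, 1.0)  # ray below the axis
E_diag, k_diag = refract_ray([0.0, 1.0], 1.0, 1.0, 2.0)   # ray on the diagonal
```

For a y-polarized input, rays above and below the axis acquire longitudinal components of opposite sign, and rays off the principal axes also acquire a small x-component, in line with the position-dependent output polarization described above.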
In the discussion we assumed that the lens is arbitrarily thin while still warranting
a lens-like generation of a converging field (see Fig. 8.1). This would, however, require
the rays originating from the rear surface of the lens to take different path lengths
to reach the geometrical focus point. Different path lengths result in different phases
across the aperture, influencing the focal interference. This is of course an artificial
problem resulting from the chosen configuration. In a realistic lens, the light rays have
to propagate through the lens material (glass or similar) with the thickness varying
across the aperture. The optical path length thus depends on the position where the
input ray hits the lens. Microscope objectives and lenses are usually designed such
that they convert a planar incoming wave into a spherical wave converging towards
the focus, warranting equal path lengths and no parasitic phase delays. This type of
focusing element is usually called an aplanatic lens. It fulfills Abbe’s sine condition for a
collimated input field:
\[ \sin\theta = \frac{d}{f}, \tag{8.5} \]
with d the distance of the input ray from the optical axis, θ the corresponding angle of
the resulting ray with respect to the optical axis, and f the focal length on the side of
the image plane. An important aspect for the theoretical treatment of this conversion
is also energy conservation, which must be taken into account.
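As a small numerical illustration of Eq. (8.5), the sketch below maps ray heights to focusing angles and evaluates the resulting numerical aperture in air. The focal length and aperture radius are hypothetical values. The amplitude factor √cos θ is the form commonly used for aplanatic lenses to account for energy conservation; it is quoted here as an assumption and is not derived in the text.

```python
import numpy as np

def aplanatic_angle(d, f):
    """Ray angle behind an aplanatic lens from Abbe's sine condition, Eq. (8.5)."""
    return np.arcsin(np.asarray(d, dtype=float) / f)

def apodization(theta):
    """Amplitude factor sqrt(cos(theta)), a common way to enforce energy
    conservation for an aplanatic lens (assumption, not derived in the text)."""
    return np.sqrt(np.cos(theta))

f = 2.0e-3                      # focal length in m (hypothetical objective)
aperture_radius = 1.8e-3        # aperture radius in m (hypothetical)
d = np.linspace(0.0, aperture_radius, 5)
theta = aplanatic_angle(d, f)
na = np.sin(theta[-1])          # numerical aperture in air: sin of the largest angle
weights = apodization(theta)    # relative field amplitude of each ray
```

Rays far from the optical axis are redirected under steep angles and are weighted less in amplitude, which matters when the contributions of all rays are summed in the focal region.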
Without going any deeper into these important but rather technical details, we
now turn to the discussion of a powerful and versatile method, which is based on
the aforementioned fundamental aspects for calculating focal field distributions. This
method was introduced by Richards and Wolf in 1959 [33], extending scalar diffraction
theory. Their seminal work, which marks the birth of vectorial diffraction
theory, is still heavily used today and can be implemented in a rather straightforward
manner. The starting point for the derivation (not to be elaborated on here)
is the angular spectrum introduced above. The electric field in real space can be rep-
resented by the field in momentum space (and vice versa) by the following integral
assigning different wave vector components to the spectrum of plane waves (shown
\[ \vec{E}(x, y, z) = \iint_{-\infty}^{\infty} \hat{\vec{E}}(k_x, k_y; 0)\, e^{-i[k_x x + k_y y \pm k_z z]}\, \mathrm{d}k_x\, \mathrm{d}k_y. \tag{8.6} \]
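The angular spectrum representation lends itself to a compact numerical sketch based on fast Fourier transforms. The example below is an illustration under stated assumptions, not the Richards and Wolf method itself: a y-polarized Gaussian field is decomposed into plane waves, the longitudinal component of each plane wave is fixed by transversality (k·E = 0), and the result is compared with the paraxial expression of Eq. (8.3). The sign convention exp(+i(kx x + ky y + kz z)) is chosen here to match NumPy's inverse FFT and the factor e^{ikz} in Eq. (8.2); the grid parameters and the function name are hypothetical.

```python
import numpy as np

def longitudinal_from_transverse(Ey, dx, wavelength):
    """Ez in the plane z = 0 from a sampled, y-polarized field (Ex = 0).

    Every plane wave must satisfy k . E = 0, hence Ez_hat = -ky * Ey_hat / kz.
    Convention: plane waves ~ exp(+i(kx x + ky y + kz z)).
    """
    k = 2.0 * np.pi / wavelength
    n_y, n_x = Ey.shape
    kx = 2.0 * np.pi * np.fft.fftfreq(n_x, dx)[np.newaxis, :]
    ky = 2.0 * np.pi * np.fft.fftfreq(n_y, dx)[:, np.newaxis]
    kz = np.sqrt((k**2 - kx**2 - ky**2).astype(complex))   # imaginary: evanescent
    kz[np.abs(kz) < 1e-6 * k] = 1e-6 * k                   # guard against kz = 0
    Ey_hat = np.fft.fft2(Ey)
    return np.fft.ifft2(-ky * Ey_hat / kz)

wavelength, w0 = 1.0, 5.0                   # lengths in units of the wavelength
n, dx = 256, 0.16
coords = (np.arange(n) - n // 2) * dx
x, y = np.meshgrid(coords, coords)          # rows: y, columns: x
Ey = np.exp(-(x**2 + y**2) / w0**2)         # Eq. (8.2) at z = 0

Ez = longitudinal_from_transverse(Ey, dx, wavelength)
k = 2.0 * np.pi / wavelength
Ez_paraxial = -1j * 2.0 * y / (k * w0**2) * Ey            # Eq. (8.3)
rel_dev = np.max(np.abs(Ez - Ez_paraxial)) / np.max(np.abs(Ez_paraxial))
```

For this moderately focused beam the two results agree closely. The small remaining deviation reflects the non-paraxial correction contained in kz.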
Figure 8.2: Left: Total electric energy density distribution in the focal plane for tight focusing. Center
and right: energy density distributions of all electric field components. Maps of relative phases are
shown as insets covering the same axes dimensions. Two cases are shown: tightly focused (a) lin-
early polarized Gaussian beam and (b) radially polarized doughnut mode. All distributions are nor-
malized to the maximum of the total electric energy density. Reproduced from Ref. [4].
volume, the transverse electric field components on the optical axis cancel for sym-
metry reasons while the longitudinal components interfere constructively to form a
strong on-axis component (see also Fig. 8.2 (b), right). The central z-component is sur-
rounded by ring-shaped transverse field components. In the focal plane we therefore
find a three-dimensional electric field with the polarization depending on the actual
position with respect to the optical axis. Based on these two examples it becomes clear
that the focal field can be engineered by modifying the input intensity, polarization
and phase distributions.
Before we move on to discuss in more detail other configurations and related phe-
nomena, we want to allude to an intriguing aspect with respect to the democracy of
electric and magnetic fields. The energy density distributions of electric and magnetic
fields (or electric and magnetic field intensities) for plane waves or paraxial beams
of light are equivalent. This is a direct consequence of the fact that their electric and
magnetic fields are locally orthogonal to each other. For instance, while the electric
field of a radially polarized light beam in the paraxial domain is oriented like the
spokes of a wheel, the magnetic field is pointing locally in a direction tangential to
the tire or along the azimuthal angle. Both of them equal zero on the optical axis and
therefore form doughnut-shaped field intensity distributions. However, the situation
is drastically different for light fields described in a full (Maxwellian) and complete
The discussion of light’s structure can also be extended to electromagnetic fields near
interfaces, scatterers, diffractive elements or similar inhomogeneities light is interact-
ing with in space. The electromagnetic field close to refractive index steps, metal in-
terfaces or similar naturally features an evanescent contribution. From a more gen-
eral perspective, evanescent fields need to be considered whenever space is inho-
mogeneous in a certain sense. Evanescent fields decay exponentially in amplitude in
contrast to propagating waves. To illustrate the importance of evanescent fields, their
inherently structured nature, and their strong contribution to the near field, it is in-
structive to consider the simple and very fundamental example of a point-like electric
dipole oscillating at a given frequency ω0 . When observed from a sufficiently large
distance (≫ λ0 = 2πc/ω0) and at an angle of 90° with respect to the orientation of
the dipole moment, the emission can be considered to be plane-wave-like. Further-
more, when resolved angularly, the emission pattern features the well-known sine-
squared intensity distribution with zero emission along the dipole moment (dipole
axis). However, the situation is considerably different in close vicinity to the dipole.
A charge oscillating along the dipole axis creates a highly structured near-field dis-
tribution resulting from the strongly curved field lines. The evanescent field contri-
butions originating from the charge(s) die out quickly with increasing distance. How-
ever, and in contrast to the far-field distribution discussed above, which features no
intensity along the dipole axis, the near field peaks at those positions. This effect can
be attributed solely to the evanescent fields. This simple example visualizes the sub-
stantial differences between near- and far-field intensity distributions and the highly
structured nature of evanescent waves. Now, we move on to the discussion of the
polarization of these near fields. Owing to the strong spatial dependence of the field in
close vicinity to the dipole, the polarization also varies substantially on a small scale
(smaller than the wavelength corresponding to the oscillation frequency of the dipole). When the
charges are separated, Maxwell’s equations tell us that we should expect a gradient
of the electric field surrounding the charges. While the charges flow along the dipole
axis (just like in an antenna), a magnetic field arises, curling around the current. The
resulting fields oscillate in time with the dipole frequency. The time-oscillating mag-
netic fields result in a curl of the electric field and so forth. The polarization therefore
varies on a very small length scale.
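The contrast between near- and far-field behavior can be made quantitative with the standard closed-form expression for the field of an oscillating point dipole in vacuum. This is a sketch with assumptions: the formula is the textbook expression from classical electrodynamics rather than one quoted in this chapter, the constant prefactor 1/(4πε0) is dropped, and the function names are hypothetical.

```python
import numpy as np

def dipole_field(r_vec, p_vec, wavelength):
    """Complex E-field of an oscillating point dipole in vacuum, with the
    prefactor 1/(4*pi*eps0) and the harmonic time dependence dropped."""
    k = 2.0 * np.pi / wavelength
    r = np.linalg.norm(r_vec)
    n = np.asarray(r_vec, dtype=float) / r
    p = np.asarray(p_vec, dtype=float)
    far = k**2 * np.cross(np.cross(n, p), n) / r                    # radiation term
    near = (3.0 * n * np.dot(n, p) - p) * (1.0 / r**3 - 1j * k / r**2)
    return (far + near) * np.exp(1j * k * r)

def axis_to_equator_intensity(r, wavelength=1.0):
    """|E|^2 on the dipole axis divided by |E|^2 at 90 degrees, equal distance."""
    p = np.array([0.0, 0.0, 1.0])
    on_axis = dipole_field([0.0, 0.0, r], p, wavelength)
    equator = dipole_field([r, 0.0, 0.0], p, wavelength)
    return np.sum(np.abs(on_axis) ** 2) / np.sum(np.abs(equator) ** 2)

near_ratio = axis_to_equator_intensity(0.05)   # distance of lambda / 20
far_ratio = axis_to_equator_intensity(50.0)    # distance of 50 lambda
```

Close to the dipole the field on the axis exceeds the field at 90°, whereas at large distances the on-axis contribution becomes negligible, consistent with the vanishing far-field emission along the dipole axis.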
Another important case to be shortly discussed here is the evanescent field at di-
electric interfaces, for instance between air and a dielectric material such as glass. If
a plane wave impinges from the glass side of the interface at an angle greater than the
critical angle of total internal reflection, it will be fully reflected. No propagating component
of the field will be found in the air half-space. However, the field is also not strictly zero
there. As a direct consequence of the boundary conditions, the field does not drop to
zero abruptly, but decays exponentially with increasing distance from the interface.
The evanescent wave propagates along the interface and it is spatially confined to it.
The electric field is parallel to the propagation plane (spanned by the input and output
ray) while the magnetic field is orthogonal to it. As we will see below in Section 8.4.1,
the electric field of evanescent waves features an intriguing property with respect to
the polarization.
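The exponential decay can be quantified with a few lines of code. The following sketch rests on assumptions that are not taken from the text: lossless media, a plane interface, and the example values for refractive indices, angle and wavelength. The field ratio follows from applying Gauss' law to the evanescent wave, in analogy to Eq. (8.3); the function name is hypothetical.

```python
import numpy as np

def evanescent_wave(n1, n2, theta_deg, wavelength):
    """Evanescent wave behind a totally reflecting interface (n1 > n2).

    Returns (critical angle in degrees, 1/e amplitude decay length,
    ratio of the field component along the interface to the component
    normal to it, for a field lying in the plane of incidence).
    """
    theta_c = np.degrees(np.arcsin(n2 / n1))
    if theta_deg <= theta_c:
        raise ValueError("no total internal reflection below the critical angle")
    k0 = 2.0 * np.pi / wavelength
    k_par = k0 * n1 * np.sin(np.radians(theta_deg))   # wave vector along the interface
    kappa = np.sqrt(k_par**2 - (k0 * n2) ** 2)        # decay constant in medium 2
    # Gauss' law in medium 2: k_par * E_par + (1j * kappa) * E_normal = 0
    ratio = -1j * kappa / k_par
    return theta_c, 1.0 / kappa, ratio

# glass-air interface, 60 degrees incidence, 633 nm (example values)
theta_c, decay_length, ratio = evanescent_wave(1.5, 1.0, 60.0, 633e-9)
```

The decay length comes out at a fraction of the wavelength, and the purely imaginary ratio indicates that the two field components are a quarter cycle out of phase.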
8.3 Measuring structured light at the nanoscale
Multiple experimental methods have been introduced, tested and discussed in the
literature, which allow for the measurement of electric field intensity distributions, for
the study of individual field components, or even the reconstruction of the full-field in-
formation (amplitude and phase distributions of all field components) of propagating
or evanescent light with deep sub-wavelength spatial resolution. Below, we discuss
briefly some selected methods capable of providing quantitative access to nanostruc-
tured propagating or evanescent fields.
In this section, which is written in the style of a review rather than a didactic text, we
want to give a brief overview of experimental techniques for probing the electromagnetic
field at nanoscale dimensions. As mentioned before, it seems intuitive to utilize
probe-based scanning approaches for the measurement of confined beam profiles, instead
of directly using detectors with insufficient spatial resolution and lacking phase and
polarization sensitivity. In fact, most techniques presented
to date feature probes locally interacting with the light field. The level of information
gained by such measurements strongly depends on probe design and analysis strat-
egy. To gain, for instance, information about the relative phase of the individual field
components in a spatially resolved manner, usually additional measures have to be
taken to provide a phase reference just like in an interferometer.
For the sake of convenience, we focus here on a small selection of powerful meth-
ods to be discussed in a bit more detail. However, in passing, we plan to mention
briefly some other methods as well. We start with a discussion of conventional tech-
niques for profiling relatively large beams.
Many beam profilers available commercially nowadays are based on simple high-
resolution cameras, which allow for measuring the intensity profile of light beams.
With ever decreasing pixel sizes and pitches, the intensity distribution can be mea-
sured accurately, even for small beam diameters on the order of tens of micrometers.
If we are interested in the polarization distribution of paraxial beams of light, such
cameras can be combined with polarizing elements (see Chapter 9) to perform a spa-
tially resolved Stokes measurement, which defines the polarization distribution in the
beam cross-section in addition to the intensity profile. Furthermore, polarization cam-
eras have been introduced, which feature linear polarizers in front of the pixels to
distinguish between different linear polarization states. If, in addition to the polarization,
the phase front is also of interest, the complexity of the measurement procedure
increases because additional measurement devices, such as phase front sensors, are
required or interferometric schemes have to be implemented. Phase, polarization and
intensity distributions together fully characterize the beam.
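A spatially resolved Stokes measurement can be sketched as a pixel-wise combination of camera images recorded behind different analyzers. The example below is a minimal illustration with assumptions: it uses the common six-image scheme (horizontal, vertical, diagonal, anti-diagonal and the two circular projections), simulates ideal images for a radially polarized test field, and the assignment of which circular projection counts as "right" is a convention chosen here. All names are hypothetical.

```python
import numpy as np

def stokes_from_images(i_h, i_v, i_d, i_a, i_r, i_l):
    """Pixel-wise Stokes parameters from six analyzer images."""
    return i_h + i_v, i_h - i_v, i_d - i_a, i_r - i_l

def simulate_images(ex, ey):
    """Ideal camera images behind six analyzers for a paraxial field (ex, ey).
    The labeling of the two circular projections is a convention (assumption)."""
    proj = lambda u, v: np.abs(u * ex + v * ey) ** 2
    s = 1.0 / np.sqrt(2.0)
    return (proj(1, 0), proj(0, 1), proj(s, s), proj(s, -s),
            proj(s, -1j * s), proj(s, 1j * s))

# test field: radially polarized doughnut beam, coordinates in units of the waist
coords = np.linspace(-2.0, 2.0, 101)
x, y = np.meshgrid(coords, coords)
envelope = np.exp(-(x**2 + y**2))
s0, s1, s2, s3 = stokes_from_images(*simulate_images(x * envelope, y * envelope))
orientation = 0.5 * np.arctan2(s2, s1)      # local polarization orientation
```

For this test field s3 vanishes everywhere, and the orientation map follows the azimuth, as expected for radial polarization.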
Figure 8.3: Different types of probes for profiling focused beams of light. From left to right: pinhole
or slit, knife-edge, and (nano-)particle.
As an alternative to a camera-based measurement of the beam profile, photodetectors
combined with moving mechanical elements (see Fig. 8.3), e.g., slits or pinholes
in opaque films, sharp metal edges (knife-edges) or fluorescent point-like probes
[15, 35], are also used and have been partially commercialized as beam profilers. The
transmitted power is measured with a photodetector, and the profile can be reconstructed
tomographically from the position-dependent scan data. Such devices usually only
measure the intensity profile or certain beam parameters like the radius. The reader
might wonder now, whether or not such schemes can also be applied if non-paraxially
propagating or tightly focused light beams are to be profiled. In the previous sections it
became clear that, besides a structured transverse electromagnetic field, strongly
confined light fields will feature complex three-dimensional distributions with spa-
tially varying phase, polarization and intensity. It is therefore immediately apparent
that the aforementioned methods for beam profiling might not be sufficiently capable
or have to be adapted. An instructive example in this context is indeed the knife-edge-
based method. It utilizes an opaque layer with a sharp edge (see Fig. 8.3, center), which
is moved across the beam to be profiled, to block the beam partially. From the photo-
current curve recorded by a photo-detector, the beam profile projection along the scan
direction can be reconstructed. To adapt this scheme to also work for tightly focused
light beams, a number of modifications are necessary to reduce the amount of artifacts
and errors. Because a tightly focused light beam diverges quickly after passing the
focal plane, the detector should be placed very close to the knife edge itself. In addi-
tion, the steep wave vectors involved in tight focusing require an ultra-thin footprint
of the sharp-edged material layer forming the knife edge, while still being opaque.
Experimentally, this was realized by fabricating knife edges from thin metal films directly
on top of detectors. However, additional problems arise from the complex
interaction of focused beams with metal edges. For instance, the sharp knife edge will
interact with the impinging light in a polarization sensitive manner, e. g., differently
for light polarized along the edge or orthogonal to it. This polarization dependence can
influence the resulting beam profile in a parasitic way, deforming, shifting or skewing
it [20, 27]. The impact on the measured data depends on various parameters, for in-
stance on the edge material [14, 27, 32], the substrate or the detector material under-
neath [21]. The artifacts introduced by the measurement can be compensated for by a
proper data post-processing strategy [20] or circumvented by an appropriate choice of
the involved materials [14, 32]. The knife-edge method therefore turns into a powerful
tool for reconstructing the total electric field intensity profile of even tightly focused
light beams. But, so far, its capabilities are limited to intensity distributions.
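The basic reconstruction step of the knife-edge method can be sketched in a few lines. The example is an idealized illustration: it assumes a Gaussian beam with intensity ∝ exp(−2x²/w0²), a perfectly opaque edge, and no polarization-dependent interaction with the edge, so none of the artifacts discussed above are modeled. The function names are hypothetical.

```python
import numpy as np
from math import erf

def knife_edge_signal(edge_pos, w0, x0=0.0):
    """Normalized detected power of a Gaussian beam with intensity
    ~ exp(-2 (x - x0)^2 / w0^2) when everything at x < edge_pos is blocked."""
    return np.array([0.5 * (1.0 - erf(np.sqrt(2.0) * (e - x0) / w0))
                     for e in edge_pos])

def profile_from_scan(edge_pos, power):
    """Beam projection along the scan direction: minus the derivative of the
    recorded power curve with respect to the edge position."""
    return -np.gradient(power, edge_pos)

edge = np.linspace(-3.0, 3.0, 601)              # edge positions in units of w0
power = knife_edge_signal(edge, w0=1.0)         # simulated photocurrent curve
profile = profile_from_scan(edge, power)

# beam radius from the second moment of the reconstructed projection
center = np.sum(edge * profile) / np.sum(profile)
w_est = 2.0 * np.sqrt(np.sum((edge - center) ** 2 * profile) / np.sum(profile))
```

In this idealized setting the estimated radius reproduces the input value. In a real measurement on a tightly focused beam, the recorded curve would additionally carry the material- and polarization-dependent distortions mentioned above.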
To get access to individual field components and their energy density distri-
butions, the aforementioned method of fluorescent point probes has been adopted
and modified accordingly. A tightly focused light beam is scanned across randomly
oriented dye molecules embedded into a transparent dielectric matrix [31, 35, 37].
Although introduced as a method for molecule orientation sensing, the patterns ob-
served in a fluorescence confocal scanning microscope also allow for retrieval of
the focal field distributions (amplitude distributions of field components) if various
molecules of different orientation are scanned.
An alternative version of this method can also enable the measurement of relative
phase maps. For this purpose, various groups have utilized near-field scanning optical
microscopes (NSOM) [38]. A detailed discussion of NSOMs would go beyond the scope
of this book. In short, NSOMs are based on sharp metal tips brought into close vicinity
of scatterers, waveguides or interfaces to pick up the near-field information (by scat-
tering or propagating it to the far field). NSOMs are excellently suited for measuring
the complex near-field polarization distribution and other field parameters [34]. Fur-
thermore, they also have been successfully applied for measuring focal fields [19, 25].
For granting access to polarization and phase information, the implementation of ad-
ditional polarization elements and a phase reference is required.
We complete our overview by discussing a more recent technique for the full re-
construction of focal fields, not requiring the implementation of any additional phase
reference, polarization analysis, or similar. This method is based on a very intuitive
approach and probe-design, while enabling the reconstruction of both amplitude and
phase distributions of the individual electric field components. It utilizes a nanopar-
ticle acting as a scanning probe (see Fig. 8.3, right). The nanoparticle supports multi-
polar resonances and is immobilized on a dielectric substrate. It is placed in the focal
plane of a tightly focused light beam under study, and raster-scanned across the latter
[5, 6]. It might sound surprising that such a seemingly simple scheme is sufficient to
fully reconstruct the electric field in its amplitude and phase distributions. As elabo-
rated on in the above-mentioned references, the key ingredients to this technique en-
abling also access to the phase are a rigorous theoretical backbone and the measure-
ment of transmitted (or reflected) and scattered light in an angularly resolved fashion.
No additional polarization or phase measurement apparatus is required to analyze
the light after it has interacted with the sample. The intensity distribution is measured
with a camera, which images the back focal plane of a collecting microscope objective.
This distribution contains information about the scattered part of the light field, the
input beam itself, and their interference. Hence, the interference term carries the in-
formation about the phase of the excitation field components. With the probe interact-
ing locally with the excitation field under investigation, the desired information about
the local field can be retrieved. From a theoretical perspective, the power detected in
the back focal plane (resulting from an integration across different areas of the in-
tensity distribution for each particle position) can be decomposed into input power,
scattered power and the aforementioned interference term. Scattered and input fields
are related to each other via a scattering matrix (T-matrix), which contains information
about the scatterer and the substrate underneath. The fields themselves can be repre-
sented by a series of vector-spherical harmonics, which represent multipoles (dipoles,
quadrupoles etc.). The latter is a natural basis for this kind of interaction because it
reflects the optical response of the probing particle. It can be shown that a limited
number of multipole orders are sufficient together with an adapted number of particle
steps and step-sizes. Furthermore, the integration limits in the measured back focal
plane images (angular spectrum) have to be chosen accordingly, representing certain
solid angles within which the detected power is taken into account for the analysis. As
a result, the input field in the probed plane can be retrieved accurately with deep
sub-wavelength spatial resolution. The achievable resolution depends on the aforementioned
parameters. With the experimental reconstruction of amplitudes and phases,
the complex field structure at nanoscale dimensions becomes experimentally accessible
[5, 6]. This method can be used to characterize tightly focused light beams that serve as
tools in nano-optics experiments, to analyze focusing systems, and for many related tasks.
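The decomposition of the detected power described above can be sketched schematically as follows. The symbols for the input field, the scattered field, and the integration solid angle Ω are notation assumed here for illustration only; they are not taken from Refs. [5, 6], and all prefactors are omitted.

```latex
\[
P(\Omega) \;\propto\; \int_{\Omega} \bigl|\vec{E}_{\mathrm{in}} + \vec{E}_{\mathrm{sc}}\bigr|^{2}\,\mathrm{d}\Omega
= \underbrace{\int_{\Omega} \bigl|\vec{E}_{\mathrm{in}}\bigr|^{2}\,\mathrm{d}\Omega}_{\text{input}}
+ \underbrace{\int_{\Omega} \bigl|\vec{E}_{\mathrm{sc}}\bigr|^{2}\,\mathrm{d}\Omega}_{\text{scattered}}
+ \underbrace{2\,\mathrm{Re}\int_{\Omega} \vec{E}_{\mathrm{in}}^{*}\cdot\vec{E}_{\mathrm{sc}}\,\mathrm{d}\Omega}_{\text{interference}}
\]
```

Only the last term depends on the relative phase between the input and the scattered field, which is why an intensity measurement that resolves the angular spectrum can give access to the phase of the excitation field.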
In summary, the last decade has seen a promising development of novel tech-
niques and refined methods with proven capabilities. They allow for accessing the
sub-wavelength features and the three-dimensional nature of highly confined elec-
tromagnetic fields and their complex polarization distributions.
On the theoretical side, the previously introduced frameworks for the description
of polarization and its spatial distribution, such as Stokes parameters, angular
momenta, and topological features, also need to be revisited. Angular momenta and
topological features will be covered below; we focus first on the Stokes parameters.
By definition, the Stokes parameters as introduced in Section 3 are based on the
assumption that the electric field is restricted to a plane orthogonal to the propaga-
tion direction. More generally speaking, they can be applied if the field is polarized
in a plane. This assumption is perfectly valid for plane waves or paraxially propagating
beams of light. However, the electromagnetic field is in general inherently
three-dimensional, and especially in the case of strong spatial confinement (tight
focusing, evanescent fields, etc.), longitudinal components contribute substantially to the
total field. The Stokes formalism thus has to be extended to cover the full extent of
the field. This can be done in a very intuitive and convenient manner also for fully po-
larized three-dimensional fields, following the original idea behind the Stokes vector
and its components. To cover the full extent of the three-dimensional field, the three
orthogonal (electric) field components Ex , Ey , and Ez need to be compared to each
other with respect to their field intensities and phases, just like before in the case of
conventional Stokes parameters and purely transverse fields. It is convenient to start
again with the Jones vector, while we follow the discussion and notation in References
[12, 36]. The Jones vector for a fully polarized three-dimensional field now takes the
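form of a three-component column vector and reads as follows:
\[
\vec{E} \equiv \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix}
= \begin{pmatrix} E_{0x}\, e^{i\delta_x} \\ E_{0y}\, e^{i\delta_y} \\ E_{0z}\, e^{i\delta_z} \end{pmatrix}. \tag{8.7}
\]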
The phases can be defined with respect to an absolute phase reference, reducing the
number of independent variables to five (including a total intensity reference). Al-
though the field is now allowed to oscillate in three-dimensional space with no phys-
ical restriction to the transverse plane with respect to the propagation direction, the
electric field can still be described, in the most general case and for fully polarized
light, by a polarization ellipse (or line/circle for the limiting cases) just like in the two-
dimensional case discussed in Section 3.3. Starting with the generalized Jones vector,
we can now also write down the generalized Stokes parameters Λi for a fully polarized
field as follows:
with
\[
\Lambda_0^2 = \frac{1}{3} \sum_{i=1}^{8} \Lambda_i^2. \tag{8.9}
\]
We see that we have nine generalized Stokes parameters together with a relationship
connecting all parameters comparable to that for the two-dimensional case. However,
we noted above that the three-dimensional field can be defined by five independent
parameters in the Jones vector. It can be shown that the generalized Stokes parameters
are not all independent of each other and that a set of five is sufficient to describe
the field’s behavior. For more details and possible combinations of Stokes parameters
allowing for the full characterization of the field, the interested reader should refer to
Ref. [36]. The parameters listed above can be used to represent the local polarization in
fully polarized three-dimensional fields. However, their measurement is considerably
more complicated than the measurement of the conventional Stokes parameters in
paraxial light (see discussion in Section 8.3).
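As a numerical illustration of relation (8.9), the following sketch builds nine generalized Stokes parameters from a three-component Jones vector. It rests on assumptions that are not taken from the text: the parameters are formed as Λ0 = tr Φ and Λi = (3/2) tr(Φ gi), with Φ the coherency matrix of the fully polarized field and gi the eight Gell-Mann matrices. This normalization is chosen because it reproduces Eq. (8.9); the ordering and signs of the individual parameters in Refs. [12, 36] may differ.

```python
import numpy as np

S3 = 1.0 / np.sqrt(3.0)

# The eight Gell-Mann matrices, normalized so that tr(g_i g_j) = 2 * delta_ij.
GELL_MANN = np.array([
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
], dtype=complex)


def generalized_stokes(jones_3d):
    """Return [L0, L1, ..., L8] for a fully polarized 3D field (assumed normalization)."""
    e = np.asarray(jones_3d, dtype=complex)
    phi = np.outer(e, e.conj())                 # coherency matrix, phi_jk = E_j E_k^*
    lam0 = np.trace(phi).real                   # total intensity
    lam = 1.5 * np.array([np.trace(phi @ g).real for g in GELL_MANN])
    return np.concatenate(([lam0], lam))


# Arbitrary example field with three de-phased components
E = np.array([1.0, 0.5 * np.exp(0.3j), 0.8j])
L = generalized_stokes(E)
print("L0^2           :", L[0] ** 2)
print("(1/3) sum L_i^2:", np.sum(L[1:] ** 2) / 3.0)
```

The two printed numbers agree for any fully polarized input, which is the content of Eq. (8.9) under the assumed normalization.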
In the following section we will see that the three-dimensional character of spa-
tially confined fields evokes intriguing and fascinating phenomena connected to po-
larization in general as well as spin angular momentum and the topological structure
of light in particular.
In the previous chapter we mentioned already that the polarization is also intimately
connected to the spin angular momentum of light. The spin per photon reaches values
of up to ±ℏ for circularly polarized light and is identically zero for linear polarization.
The spin angular momentum is an integrated quantity resulting from the spatial integration
across the spin density distribution s⃗ of, e. g., a light beam. The spin density
defines the local spin of the field and, following the notation shown in [1, 7], it is given
by
\[
\vec{s} = \vec{s}_E + \vec{s}_H = \mathrm{Im}\left(\epsilon_0\, \vec{E}^{*} \times \vec{E} + \mu_0\, \vec{H}^{*} \times \vec{H}\right)/4\omega. \tag{8.10}
\]
In what follows, we focus our discussion on the electric part of the spin density. The spin density is strictly zero if the
field is locally linearly polarized, while it takes non-zero values for elliptical polariza-
tion, and reaches its maximum (minimum) for circular polarization. The spin density
is proportional to the parameter c as defined in Eq. (7.19). In the case of paraxial light
or individual plane waves, the electric and magnetic fields are restricted to the trans-
verse plane, which forces the spin density to be a scalar number in principle (only
the sz component can be different from zero). Hence, the spin is always aligned with
the propagation axis (parallel or anti-parallel) for such cases; the spin is longitudinal.
However, if we allow for three-dimensional electric and magnetic fields as they appear
in focused light beams or other scenarios, the spin density in Eq. (8.10) will also be
a vector with three entries. From its construction above it becomes apparent that
transverse components of the spin density ($s_E^x$ and $s_E^y$) may also appear as long as the
corresponding field components, i. e., $E_y$ and $E_z$ as well as $E_x$ and $E_z$, respectively, are
non-zero and appropriately de-phased. In the discussion of tightly focused light beams
(Fig. 8.2) and also in the introductory section, Section 8.1, we learned already that
spatially confined light beams naturally feature longitudinal together with transverse
field components [6]. The only missing ingredient for the spin density to have non-zero
transverse components, and for the field to be elliptically or circularly polarized
in the propagation plane, is an appropriate phase relation (insets in Fig. 8.2). If we re-
visit again the focal distributions shown in this figure, we can see that, for the shown
cases also the remaining requirements for non-zero transverse spin components, i. e.,
a relative phase of ±π/2, are fulfilled. In Fig. 8.4, we show the focal spin density dis-
tributions (transverse components only) for the cases of tightly focused linearly and
radially polarized light beams, respectively [1, 28]. We can clearly see that the trans-
verse components of the spin density are non-zero for certain areas where distribu-
tions of longitudinal and transverse field components overlap. In addition, also their
sign is position-dependent. The de-phasing of the field components responsible for
the appearance of a non-zero transverse spin density can be explained in a simple and
intuitive manner by the Gouy phase discussed above in the context of paraxial light
propagation. Also for tight focusing, the field accumulates a phase delay (with respect
to a planar reference wave). The total phases accumulated while propagating towards
the focal plane depend on the mode order or spatial profile of the field. The longi-
tudinal field component of, e. g., a tightly focused radially polarized beam features
for symmetry reasons a different spatial profile than the transverse field. Hence, the
respective field components are de-phased differently, resulting in an effective ±π/2
phase difference in the focal plane. This relative phase together with the spatially par-
tially overlapping field distributions result in non-zero transverse spin density com-
ponents with the corresponding elliptically or circularly polarized field spinning in
the propagation plane (see Fig. 8.5).
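The electric part of Eq. (8.10) can be evaluated directly for a few simple field vectors. The sketch below is illustrative only: the wavelength of 600 nm, the unit amplitudes, and the function and variable names are assumptions, and the fields are single local Jones vectors rather than a real focal distribution.

```python
import numpy as np

EPS0 = 8.8541878128e-12          # vacuum permittivity (F/m)
C = 299792458.0                  # speed of light in vacuum (m/s)


def electric_spin_density(E, omega):
    """Electric part of Eq. (8.10): Im(eps0 * conj(E) x E) / (4 * omega)."""
    E = np.asarray(E, dtype=complex)
    return EPS0 * np.imag(np.cross(E.conj(), E)) / (4.0 * omega)


omega = 2 * np.pi * C / 600e-9   # assumed wavelength of 600 nm
unit = EPS0 / (4.0 * omega)      # magnitude for a unit-amplitude circular field

linear_x = np.array([1, 0, 0])                   # linear polarization
circular_xy = np.array([1, 1j, 0]) / np.sqrt(2)  # spinning in the transverse plane
spinning_xz = np.array([1, 0, 1j]) / np.sqrt(2)  # Ex and Ez de-phased by pi/2

for name, field in [("linear", linear_x),
                    ("circular in xy", circular_xy),
                    ("spinning in xz", spinning_xz)]:
    # Print the spin density in units of eps0 / (4 * omega)
    print(name, electric_spin_density(field, omega) / unit)
```

The linear field gives zero spin density, the conventional circular field gives a purely longitudinal (z) component, and the field with de-phased transverse and longitudinal components gives a purely transverse (y) component.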
It is also worth noting that the longitudinal component of the spin density sz for
the presented cases is strictly 0 everywhere, which can be understood by looking at
Fig. 8.2. We also note that for the chosen input beams the distributions are all anti-
symmetric with respect to one of the coordinate axes spanning the transverse plane
(x or y). Hence, the integrated quantity S⃗E at z = 0, obtained by integrating the electric
spin density over the transverse plane and
also called the net spin, is strictly zero in all its components for the presented and
many other cases. As a side note we mention here briefly that also beams carrying non-
zero transverse components of the net spin can be constructed. They feature transverse
spin density distributions, which do not integrate to zero and therefore are not anti-
symmetric. For more details the interested reader is referred to Refs. [1] and [3].
The reader might be under the impression now that transverse spin only appears in the
tight focusing regime. However, if we recall the discussion in Section 8.1, longitudinal
field components (electric or magnetic) are ubiquitous and also appear for weakly
focused beams, as required by Maxwell’s equations. For a collimated beam, though,
the contribution of longitudinal fields is negligibly small for geometric reasons,
and, thus, the transverse spin density components are small as well.
In general, the phenomenon of transverse spin densities is not exclusive to free-
space focusing. It can also be found in either the electric or magnetic field or both for
other types of spatial confinement. Interesting and important examples are evanes-
cent waves at dielectric interfaces [9], or equivalently, propagating waves at metal–
dielectric surfaces (propagating surface plasmon polaritons; SPPs) [10]. Both types
of waves are confined to a volume close to the surface (exponential decay along the
surface normal) while propagating along it. For a better understanding, we again
turn back to the discussion of evanescent waves at dielectric interfaces (see also
Section 8.3.1). Transverse spin density components are equivalent to an elliptical or
circular polarization component with the field spinning in the propagation plane
(plane spanned by transverse and longitudinal axis). To qualitatively predict the
polarization of an evanescent wave resulting from total internal reflection at, e. g.,
a glass-air interface, we discuss the process of a plane wave totally reflected more
carefully. For the discussion, the wave impinging above the critical angle is set to be
in-plane linearly polarized. The wave interferes partially with the reflected copy of
itself. The reflected planar wave has the same amplitude but it is slightly shifted in
its phase as a direct consequence of the Fresnel coefficients. Both waves—incoming
and reflected—are superposed and interfere partially (if their polarization states are
not fully orthogonal, i. e., for an angle of incidence different from 45°). The phase
delay introduced by reflection (Fresnel coefficients) is strongly angle dependent and
changes from 0 to π between the critical angle and grazing incidence (90°). The resulting
polarization of the two-plane-wave pattern in the glass half-space shows therefore
a complex polarization distribution varying with increasing distance from the inter-
face also defining the relative path difference. Close to the interface in the optically
denser medium (glass) where the path difference goes to zero, the field is elliptically
polarized in the plane of incidence, the signature of a non-zero transverse spin den-
sity (of the electric field). On the other side of the interface, in the air half-space, the
field is also elliptically polarized as a consequence of the continuity conditions for the
electric field (and displacement field) components at the interface. Hence, the evanes-
cent wave traveling along the interface and decaying exponentially with increasing
distance to it, is elliptically polarized in the propagation plane and therefore features
also a transverse spin density different from zero.
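The elliptical polarization of the evanescent wave also follows directly from the transversality condition, div E = 0. The sketch below assumes a geometry that is not specified in the text: the interface is the plane z = 0, air fills z > 0, the wave propagates along x, and the field varies as exp(i kx x − κ z). The refractive index of 1.5, the wavelength, and the function name are illustrative assumptions, and the overall amplitude (Fresnel transmission coefficient) is omitted.

```python
import numpy as np


def evanescent_field_air(theta_deg, n_glass=1.5, wavelength=600e-9):
    """Unnormalized in-plane polarized evanescent field (Ex, Ey, Ez) on the air side."""
    k0 = 2 * np.pi / wavelength
    kx = n_glass * k0 * np.sin(np.radians(theta_deg))   # conserved along the interface
    if kx <= k0:
        raise ValueError("angle below the critical angle: no evanescent wave")
    kappa = np.sqrt(kx**2 - k0**2)                       # decay constant along z
    # div E = 0 for E ~ exp(i*kx*x - kappa*z) gives i*kx*Ex - kappa*Ez = 0
    Ex = 1.0 + 0.0j
    Ez = 1j * kx / kappa * Ex
    return np.array([Ex, 0.0, Ez])


E = evanescent_field_air(60.0)
ratio = E[2] / E[0]
spin_direction = np.imag(np.cross(E.conj(), E))          # proportional to s_E

print("relative phase of Ez and Ex / pi:", np.angle(ratio) / np.pi)
print("|Ez| / |Ex|:", abs(ratio))
print("Im(E* x E):", spin_direction)
```

The longitudinal and transverse components are de-phased by π/2, so the field is elliptically polarized in the x-z propagation plane, and the only non-zero component of Im(E* × E) points along y, i.e., transverse to the propagation direction.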
In conclusion, the appearance of transverse spin, and equivalently, electric and
magnetic fields polarized elliptically or circularly with the polarization ellipse lying in
the meridional plane is a ubiquitous phenomenon, although introduced and put on
a solid theoretical foundation related to angular momenta only very recently [1, 11].
Transverse spin is one of the key features enabling interesting applications in light-
routing, nano-metrology, quantum optics and more [1, 26].
In the previous chapters, the appearance of topological features in light fields was
discussed already in the context of phase vortices and polarization singularities in the
scalar case or for generic two-dimensional ellipse fields. By definition, the correspond-
ing electromagnetic fields were two-dimensional evolving in the same plane. We now
want to briefly extend this discussion to three-dimensional field distributions. We will
see that these may host exotic and surprising topological structures. The phenomena
that follow closely resemble the geometric-phase effects described in Chapter 6.
The aforementioned polarization distributions (ellipse fields) evolving around C-
points form topological structures with a topological index defining their type. Natu-
rally, the addition of an extra field component orthogonal to the plane lifts the field out
of the plane and creates a three-dimensional landscape of the field in the given plane.
Isaac Freund was one of the first to study corresponding systems [16, 17]. His stud-
ies showed that, for instance, the major axes of the polarization ellipses traced along
closed lines around C-points can show intriguing topological structures. In particular
cases, the major axis twists and turns around the chosen trace and when returning to
the starting point, the number of turns will be equal to m/2 with m an odd integer. This
behavior is similar to the case of two-dimensional ellipse fields (see Fig. 7.4) where the
ellipse rotates in plane by ±180°. However, with the rotation now happening in 3D
space, the resulting structure formed by the major axes of the traced polarization ellipses
features the shape and properties of a Möbius strip. A Möbius strip is an object
that possesses only one edge and one surface by construction (see Fig. 8.6). These properties
are preserved under continuous deformations, which makes it a topological object. As shown in Fig. 8.6, it can
be constructed easily from paper by twisting one end of a paper strip by 180 degrees
and taping both ends together. By construction, the originally separated surfaces of
the paper strip are now connected. The same is true for the two (long) edges.
In his work, Freund also proposed a method for creating such elusive strips of
light. His suggestion was based on the interaction of two orthogonally polarized (left-
and right-handed circular) non-coaxially propagating light beams of different spatial
phase structures (different azimuthal phase ramps). In his case, a fundamental Gaus-
sian beam and a Laguerre–Gaussian beam of first azimuthal order were chosen. When
studied in the plane of interaction, the field would naturally feature also field compo-
nents orthogonal to this plane. The differences in the azimuthal phase change and
the opposite handedness of polarization give rise to the appearance of Möbius strips
around a central C-point. In 2015, Bauer et al. proved the existence of polarization
Möbius strips experimentally. In their case, they created the topological structures re-
quiring three-dimensional field distributions in a different and more efficient manner.
They chose the scheme of tight focusing of co-propagating light beams. Upon tight fo-
cusing, strong longitudinal field components appeared, which together with the cho-
sen input states generated optical polarization Möbius strips formed around the opti-
cal axis (C-point) in the focal plane [5]. A particle-based probing technique [6] allowed
them to measure the full field in the focal volume (see Section 8.3). The orientation of
the major ellipse axes can be calculated using Eq. (7.19). Two years later, Galvez and
coworkers also realized the scheme originally proposed by Freund and measured the
strips for this configuration [18].
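For a measured or calculated three-dimensional field, the ellipse axes at each point can be obtained from the complex field vector alone. The sketch below uses a commonly quoted expression for the axes, Re and Im of E* √(E·E) / |√(E·E)|; whether this coincides in form with Eq. (7.19) is an assumption, and the function name and example field are purely illustrative.

```python
import numpy as np


def ellipse_axes(E):
    """Return (major, minor) semi-axis vectors of the polarization ellipse of E.

    Both vectors are defined only up to a common sign: the axes are directors,
    not arrows, which is what allows a half-twist along a closed trace.
    """
    E = np.asarray(E, dtype=complex)
    root = np.sqrt(np.dot(E, E))          # E.E without complex conjugation
    if np.isclose(abs(root), 0.0):
        raise ValueError("circular polarization (C-point): axes are undefined")
    v = E.conj() * root / abs(root)
    return v.real, v.imag


# Example: ellipse with semi-axes of length 2 and 1, tilted out of the x-y plane
a = 2.0 * np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # major semi-axis
b = 1.0 * np.array([0.0, 0.0, 1.0])                  # minor semi-axis
E = np.exp(0.7j) * (a + 1j * b)                      # arbitrary global phase

major, minor = ellipse_axes(E)
print("major axis:", major)
print("minor axis:", minor)
```

Tracing the major-axis vector along a closed path around a C-point and following its orientation continuously is the basic step behind the reconstruction of the twisted strips discussed above.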
Another topological phenomenon, which we mention here only briefly and which may
be found in scalar or polarization-varying field distributions, is the formation of so-called
knots. Dennis, Berry, Padgett and others studied these intriguing and mind-boggling structures
in theoretical and experimental detail [8, 13, 24]. Knots can be formed by phase singu-
larities in three-dimensional space or also by the polarization (e. g. knotted C-lines). In
the latter case, points of conventional circular polarization (with the field spinning in
the transverse plane) form closed lines in a given volume of a propagating beam [22].
These lines turn out to be knotted under certain circumstances. The appearance of
knotted polarization structures is again linked to different phases accumulated upon
propagation of modes of different order. These additional examples showcase again
the richness of electromagnetic fields.
Bibliography
[1] A. Aiello, P. Banzer, M. Neugebauer, and G. Leuchs. From transverse angular momentum to
photonic wheels. Nat. Photonics, 9(12):789–795, 2015.
[2] A. Aiello and M. V. Berry. Note on the helicity decomposition of spin and orbital optical
currents. J. Opt., 17(6):062001, 2015.
[3] P. Banzer, M. Neugebauer, A. Aiello, C. Marquardt, N. Lindlein, T. Bauer, and G. Leuchs. The
photonic wheel—demonstration of a state of light with purely transverse angular momentum.
J. Eur. Opt. Soc., Rapid Publ., 8(0), 2013.
[4] P. Banzer. Nano-optics and plasmonics with complex spatial modes of light—structured
electromagnetic fields at the nanoscale, 2019. Habilitation Thesis.
[5] T. Bauer, P. Banzer, E. Karimi, S. Orlov, A. Rubano, L. Marrucci, E. Santamato, R. W. Boyd, and G.
Leuchs. Observation of optical polarization Möbius strips. Science, 347(6225):964–966, 2015.
[6] T. Bauer, S. Orlov, U. Peschel, P. Banzer, and G. Leuchs. Nanointerferometric amplitude and
phase reconstruction of tightly focused vector beams. Nat. Photonics, 8(1):23–27, 2014.
[7] M. V. Berry. Optical currents. J. Opt. A, Pure Appl. Opt., 11(9):094001, 2009.
[8] M. V. Berry and M. R. Dennis. Knotted and linked phase singularities in monochromatic waves.
Proc. R. Soc. Lond., Ser. A, Math. Phys. Eng. Sci., 457(2013):2251–2263, 2001.
[9] K. Y. Bliokh, A. Y. Bekshaev, and F. Nori. Extraordinary momentum and spin in evanescent
waves. Nat. Commun., 5(1):1–8, 2014.
[10] K. Y. Bliokh and F. Nori. Transverse spin of a surface polariton. Phys. Rev. A, 85(6):061801,
2012.
[11] K. Y. Bliokh and F. Nori. Transverse and longitudinal angular momenta of light. Phys. Rep.,
592:1–38, 2015.
[12] T. Carozzi, R. Karlsson, and J. Bergman. Parameters characterizing electromagnetic wave
polarization. Phys. Rev. E, 61(2):2024, 2000.
[13] M. R. Dennis, R. P. King, B. Jack, K. O’Holleran, and M. J. Padgett. Isolated optical vortex knots.
Nat. Phys., 6(2):118–121, 2010.
[14] R. Dorn, S. Quabis, and G. Leuchs. Sharper focus for a radially polarized light beam. Phys. Rev.
Lett., 91(23):233901, 2003.
[15] A. H. Firester, M. E. Heller, and P. Sheng. Knife-edge scanning measurements of subwavelength
focused light beams. Appl. Opt., 16(7):1971–1974, 1977.
[16] I. Freund. Cones, spirals, and Möbius strips, in elliptically polarized light. Opt. Commun.,
249(1–3):7–22, 2005.
[17] I. Freund. Optical Möbius strips in three-dimensional ellipse fields: I. Lines of circular
polarization. Opt. Commun., 283(1):1–15, 2010.
[18] E. J. Galvez, I. Dutta, K. Beach, J. J. Zeosky, J. A. Jones, and B. Khajavi. Multitwist Möbius strips
and twisted ribbons in the polarization of paraxial light beams. Sci. Rep., 7(1):1–9, 2017.
[19] T. Grosjean, I. A. Ibrahim, M. A. Suarez, G. W. Burr, M. Mivelle, and D. Charraut. Full vectorial
imaging of electromagnetic light at subwavelength scale. Opt. Express, 18(6):5809–5824,
2010.
[20] C. Huber, S. Orlov, P. Banzer, and G. Leuchs. Corrections to the knife-edge based
reconstruction scheme of tightly focused light beams. Opt. Express, 21(21):25069–25076,
2013.
[21] C. Huber, S. Orlov, P. Banzer, and G. Leuchs. Influence of the substrate material on the
knife-edge based profiling of tightly focused light beams. Opt. Express, 24(8):8214–8227,
2016.
[22] H. Larocque, D. Sugic, D. Mortimer, A. J. Taylor, R. Fickler, R. W. Boyd, M. R. Dennis,
and E. Karimi. Reconstructing the topology of optical polarization knots. Nat. Phys.,
14(11):1079–1082, 2018.
[23] M. Lax, W. H. Louisell, and W. B. McKnight. From Maxwell to paraxial wave optics. Phys. Rev. A,
11(4):1365, 1975.
[24] J. Leach, M. R. Dennis, J. Courtial, and M. J. Padgett. Knotted threads of darkness. Nature,
432(7014):165, 2004.
[25] K. G. Lee, H. W. Kihm, J. E. Kihm, W. J. Choi, H. Kim, C. Ropers, D. J. Park, Y. C. Yoon, S. B. Choi,
D. H. Woo, et al. Vector field microscopic imaging of light. Nat. Photonics, 1(1):53–56, 2007.
[26] P. Lodahl, S. Mahmoodian, S. Stobbe, A. Rauschenbeutel, P. Schneeweiss, J. Volz, H. Pichler,
and P. Zoller. Chiral quantum optics. Nature, 541(7638):473–480, 2017.
[27] P. Marchenko, S. Orlov, C. Huber, P. Banzer, S. Quabis, U. Peschel, and G. Leuchs. Interaction of
highly focused vector beams with a metal knife-edge. Opt. Express, 19(8):7244–7261, 2011.
[28] M. Neugebauer, T. Bauer, A. Aiello, and P. Banzer. Measuring the transverse spin density of
light. Phys. Rev. Lett., 114(6):063901, 2015.
[29] M. Neugebauer, J. S. Eismann, T. Bauer, and P. Banzer. Magnetic and electric transverse spin
density of spatially confined light. Phys. Rev. X, 8(2):021042, 2018.
[30] L. Novotny and B. Hecht. Principles of nano-optics. Cambridge University Press, 2006.
[31] M. Prummer, B. Sick, B. Hecht, and U. P. Wild. Three-dimensional optical polarization
tomography of single molecules. J. Chem. Phys., 118(21):9824–9829, 2003.
[32] S. Quabis, R. Dorn, M. Eberler, O. Glöckl, and G. Leuchs. The focus of light–theoretical
calculation and experimental tomographic reconstruction. Appl. Phys. B, 72(1):109–113, 2001.
[33] B. Richards and E. Wolf. Electromagnetic diffraction in optical systems, II. Structure of
the image field in an aplanatic system. Proc. R. Soc. Lond., Ser. A, Math. Phys. Eng. Sci.,
253(1274):358–379, 1959.
[34] N. Rotenberg and L. Kuipers. Mapping nanoscale light fields. Nat. Photonics, 8(12):919–926,
2014.
[35] M. B. Schneider and W. W. Webb. Measurement of submicron laser beam radii. Appl. Opt.,
20(8):1382–1388, 1981.
[36] C. J. R. Sheppard. Jones and Stokes parameters for polarization in three dimensions. Phys. Rev.
A, 90(2):023809, 2014.
[37] B. Sick, B. Hecht, and L. Novotny. Orientational imaging of single molecules by annular
illumination. Phys. Rev. Lett., 85(21):4482, 2000.
[38] E. H. Synge. XXXVIII. A suggested method for extending microscopic resolution into the
ultra-microscopic region. The London, Edinburgh, and Dublin Philosophical Magazine and
Journal of Science, 6(35):356–362, 1928.
9 Polarization elements that we use in the lab
Polarization elements are important tools for various experiments, from very tradi-
tional interferometry to the most recent quantum key distribution experiments. This
chapter is focused on the commercial polarization elements that today (2020) one can
buy from companies like Thorlabs, Edmund Optics, and Laser Components. Similar to
all textbooks, this one will get out of date at some point, but this chapter will be the
first to ‘expire’, because new, more advanced elements will be developed and man-
ufactured. Even now, new ‘user-inspired’ products appear almost every month, and
this rate will certainly get even higher. Nevertheless, it is useful to give a review of
what is available right now. We will consider waveplates and rotators for polarization
transformations, beam displacers and polarization prisms for the measurement of po-
larization states, and finally spatial light modulators, elements that use polarization
for preparing structured light beams. The few elements of fiber polarization optics will
also be briefly reviewed.
9.1 Waveplates
A waveplate (retardation plate) can be most simply made out of a piece of crystalline
quartz, cut so that the optic axis (z) is in its plane. Quartz is a positive uniaxial crystal,
ne > no . Therefore, for light polarized linearly along the z-axis, the refractive index n =
ne , and the phase (and group) velocity is smaller than for light polarized orthogonally
to z-axis (n = no ). The ‘fast axis’, perpendicular to z, is sometimes marked by a flat
cut.
A plate of reasonable thickness (0.5 mm or thicker) will provide a given phase only
for light with a sufficiently narrow bandwidth. Indeed, the phase δ of a plate with the
thickness l (see Chapter 5) depends strongly on the wavelength λ,
\[
\delta = \frac{\pi\, \Delta n\, l}{\lambda}, \tag{9.1}
\]
https://doi.org/10.1515/9783110668025-009
the phase δ = (m + 1/2)π is a HWP of order m. The plate we are considering is therefore
a 7th-order HWP.
At the same time, by using Eq. (9.1) again we find that for the wavelength λ =
615 nm the same plate will produce a phase delay of δ = 7.25π and will then be a QWP
of order 7. For intermediate wavelengths, 600 < λ < 615 nm, the plate will be neither
a HWP nor a QWP, but something in between.
We see that a multiple-order waveplate can only be used for very narrowband radiation.
For instance, if we require that the deviation from the expected phase shift
be no more than 0.1 rad, the plate considered here will be suitable for a bandwidth
of only about 2 nm.
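The numbers in this example can be checked with a few lines of code based on Eq. (9.1). The sketch relies on two assumptions that are not stated in this form in the text: the birefringence is taken as the value implied by the 33 µm zero-order HWP at 600 nm quoted in this chapter (about 0.0091), and its dispersion is neglected. The function and variable names are illustrative.

```python
import numpy as np


def plate_phase(thickness_um, wavelength_nm, delta_n):
    """Phase delta = pi * dn * l / lambda of Eq. (9.1), in radians."""
    return np.pi * delta_n * thickness_um * 1e3 / wavelength_nm


# Assumed birefringence: a 33 um plate is a zero-order HWP (delta = pi/2) at 600 nm
DELTA_N = 600.0 / (2 * 33e3)

delta_600 = plate_phase(495, 600, DELTA_N)
order = int(np.floor(delta_600 / np.pi))     # m in delta = (m + 1/2) * pi

# |d(delta)/d(lambda)| = delta / lambda when dispersion of dn is neglected,
# so a tolerance of 0.1 rad corresponds to this one-sided wavelength detuning:
detuning_nm = 0.1 * 600.0 / delta_600

print("delta / pi at 600 nm:", delta_600 / np.pi)
print("plate order m:", order)
print("detuning for 0.1 rad (nm):", detuning_nm)
```

Under these assumptions the tolerated detuning comes out at a few nanometres, the same order of magnitude as the bandwidth quoted above; the exact figure depends on how the bandwidth is defined and on the dispersion of the birefringence.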
For a zero-order plate, the phase delay should be δ ≤ π/2: exactly π/2 for a HWP and
exactly π/4 for a QWP. For the wavelength λ = 600 nm, a zero-order HWP should have
a thickness of 33 µm, and a zero-order QWP, a thickness of 16.5 µm. Such plates will
have a broader bandwidth but they will be too fragile. To avoid this, one can make a
zero-order plate by stacking together two plates, with the optic axes orthogonal and
thicknesses l1 , l2 large enough to provide the necessary rigidity (Fig. 9.1). Because the
extraordinary beam in the first plate will be the ordinary beam in the second plate and
vice versa, the phase delays in the two plates will be of different sign. The resulting
phase delay will be
\[
\delta = \frac{\pi\, \Delta n\, (l_1 - l_2)}{\lambda}. \tag{9.2}
\]
It follows that this composite plate has an effective thickness l1 −l2 . If l1 and l2 are close
enough, then the composite plate will act like a zero-order plate. For instance, a plate
with l1 = 533 µm and l2 = 500 µm will be a zero-order HWP for λ = 600 nm.
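A short comparison illustrates why the composite plate behaves like a thin one. The sketch evaluates Eq. (9.2) for the 533 µm / 500 µm pair and Eq. (9.1) for a single 495 µm plate when the wavelength changes from 600 nm to 615 nm. As an assumption, the birefringence is fixed at the value implied by the 33 µm zero-order HWP (about 0.0091) and its dispersion is neglected, so the single-plate result is only approximate.

```python
import math


def composite_phase(l1_um, l2_um, wavelength_nm, delta_n):
    """Phase of two crossed plates, Eq. (9.2); l2_um = 0 gives a single plate, Eq. (9.1)."""
    return math.pi * delta_n * (l1_um - l2_um) * 1e3 / wavelength_nm


DELTA_N = 600.0 / (2 * 33e3)     # assumed, dispersion neglected

# Composite zero-order HWP from the text
comp_600 = composite_phase(533, 500, 600, DELTA_N)
comp_615 = composite_phase(533, 500, 615, DELTA_N)

# Single multiple-order plate for comparison
single_600 = composite_phase(495, 0, 600, DELTA_N)
single_615 = composite_phase(495, 0, 615, DELTA_N)

print("composite: delta/pi at 600 nm =", comp_600 / math.pi)
print("composite: phase change over 15 nm (rad) =", abs(comp_615 - comp_600))
print("single   : phase change over 15 nm (rad) =", abs(single_615 - single_600))
```

The phase change of the composite plate is smaller than that of the thick single plate by the ratio of the effective thicknesses, 33/495.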
A composite zero-order waveplate can be made by gluing two quartz plates
together; in this case it cannot withstand very intense radiation because the glue can
burn. To get a higher damage threshold, the two plates can instead be joined without
glue, by ‘optical contact’.
There are also so-called ‘true zero-order plates’. Such a plate is a thin layer of
polymer (liquid crystal) placed between two glass plates. The liquid crystal provides
in this case birefringence while the isotropic glass plates make the whole construction
mechanically strong.
The thickness of a plate can be chosen such that the plate will perform different polar-
ization transformations for different wavelengths, for instance, it can be a HWP for one
wavelength and a QWP for another one. An example was discussed in Section 9.1.1: a
plate of thickness 495 µm will be a HWP for the wavelength λ = 600 nm but a QWP
for the wavelength λ = 615 nm. Alternatively, a HWP for a certain wavelength can si-
multaneously have δ = mπ for another one, then it will not affect the polarization of
light at this other wavelength.
Even for a zero-order plate, the bandwidth is not very large. In the example considered
in Section 9.1.2, a zero-order plate for λ = 600 nm, with the (effective) thickness 33 µm,
will have a bandwidth of 30 nm. Clearly, the bandwidth of a plate scales as the inverse
of its thickness.
One can increase this bandwidth by compensating the wavelength in the denom-
inator of Eq. (9.2) with the wavelength-dependent birefringence Δn in its numerator.
In the normal dispersion range, the birefringence reduces with the wavelength and
such compensation is impossible. To overcome this problem, the two plates forming
a zero-order plate are made of different crystals, for instance, quartz and magnesium
fluoride (MgF₂) or quartz and sapphire (Al₂O₃) [6]. Then, instead of Eq. (9.2), the phase
of the plate will be given by
\[
\delta = \frac{\pi\, (\Delta n_1 l_1 - \Delta n_2 l_2)}{\lambda}, \tag{9.3}
\]
where the indices 1, 2 correspond to the two different materials. Because the birefringence
of the two materials depends differently on the wavelength, the lengths l1 and l2 can be
chosen such that the numerator in this expression scales approximately linearly
with the wavelength. Such a waveplate will have nearly the same phase within a very
large bandwidth, up to about 200 nm.
Some experiments require waveplates with a variable phase. A convenient variable
waveplate is the Soleil-Babinet compensator (Fig. 9.2). It consists of a
plate with a fixed thickness and another one with a variable thickness. Both plates
are usually made of quartz. The optic axes of the two plates are orthogonal (shown by
blue dots and arrow), as in the case of a zero-order plate. To make the thickness of one
plate variable, it is made out of two wedges, which can be displaced with
respect to each other (shown by a gray arrow in the figure). The phase of such a plate
can be varied from 0 to π. Due to the use of two plates with orthogonal optic axes, the
Soleil-Babinet compensator is equivalent to a zero-order plate.
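The dependence of the phase on the wedge displacement can be sketched with Eq. (9.2), taking the thickness of the wedge pair to grow linearly with the displacement. All numbers below (wedge angle, plate thicknesses, birefringence, wavelength) and both function names are hypothetical values chosen for illustration; they do not describe a particular commercial device.

```python
import math


def soleil_babinet_phase(shift_mm, wedge_angle_deg, fixed_um, base_um,
                         wavelength_nm, delta_n):
    """Phase (rad) of a wedge pair of thickness base_um + shift * tan(angle)
    crossed with a fixed plate of thickness fixed_um, following Eq. (9.2)."""
    variable_um = base_um + shift_mm * 1e3 * math.tan(math.radians(wedge_angle_deg))
    return math.pi * delta_n * (variable_um - fixed_um) * 1e3 / wavelength_nm


def shift_for_phase(delta, wedge_angle_deg, fixed_um, base_um,
                    wavelength_nm, delta_n):
    """Wedge displacement (mm) that produces the phase delta (inverse of the above)."""
    thickness_diff_um = delta * wavelength_nm / (math.pi * delta_n * 1e3)
    return (thickness_diff_um + fixed_um - base_um) / (
        1e3 * math.tan(math.radians(wedge_angle_deg)))


# Hypothetical device parameters
params = dict(wedge_angle_deg=2.5, fixed_um=2000.0, base_um=2000.0,
              wavelength_nm=600.0, delta_n=0.0091)

for target in (math.pi / 4, math.pi / 2, math.pi):   # QWP, HWP, full wave
    shift = shift_for_phase(target, **params)
    print(f"delta = {target / math.pi:.2f} pi  ->  displacement = {shift:.3f} mm")
```

With the two thicknesses equal at zero displacement, the phase starts at zero and grows linearly, so a single calibration of displacement against phase at one wavelength is enough to set any value between 0 and π.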
9.2 Rotators
In many devices or experiments, it is necessary to rotate linear polarization by a cer-
tain angle regardless of its initial state. In such cases, as discussed in Chapter 5, a good
solution is a polarization rotator.
In the simplest case, a polarization rotator is a plate of crystalline quartz, cut orthogonal
to the optic axis. As described in Chapter 4, such a z-cut has circularly polarized
normal waves and circular birefringence. For light in the visible range, a slab
with a few mm thickness will rotate the polarization by 90° (Fig. 5.2). The disadvantage
of such a rotator is its small bandwidth. For instance, a 2 mm slab will rotate the
polarization by 90° for light at 600 nm but by 45° at 700 nm.
For operation with different wavelengths, much more convenient are Faraday ro-
tators. These are rotators based on the Faraday effect: in a longitudinal magnetic field
H, the plane of polarization rotates by angle δ = VHL, where L is the length of the
crystal and V is the Verdet constant [2]. The Verdet constant is determined
by the magneto-optical properties of the material and depends on the wavelength.
But for any wavelength it is possible to set the magnetic field in such a way that a given
rotation angle δ is achieved. Faraday rotators are made of ferromagnetic materials like
9.3 Beam displacers
Beam displacers can be useful in many optics experiments. For instance, with the help
of a beam displacer one can realize Young’s double-slit interference: at the output of
the displacer the ordinary and extraordinary beams represent two copies of the same
beam. For the interference to take place, both beams should have the same polariza-
tion state, but this can be achieved by placing another polarizer after the displacer,
projecting both polarization states on a single one. In other types of interferometers,
beam splitting can be also conveniently realized with the help of several beam dis-
placers. As an example, Fig. 9.5 shows a Mach–Zehnder interferometer formed by two
beam displacers with the optic axes parallel. A HWP oriented at 45°, placed between
them, converts the ordinary beam at the output of the first displacer into the extraor-
dinary beam in the second one, and vice versa. Such an interferometer is easy to align
and it does not suffer from rapid phase drift. By placing more beam displacers, the
interferometer can be transformed into a multipath one.
Even more important is the use of beam displacers in the Stokes measurement, already
discussed in Chapter 4. With a camera registering both beams after a beam displacer
(see Fig. 4.8), one can measure the intensities for the horizontally and vertically
polarized radiation and this way measure the Stokes variables S1 and S0. For the
measurement of S2, the displacer should be rotated by 45°, and for the measurement of S3, it
should be preceded by a QWP. Especially convenient is that by varying the size of the
incident beam, one can modify the quality of the measurement. If the beam is broader
than the transverse displacement between the ordinary and extraordinary beams, the
measurement becomes ‘weak’. This might seem to be a bad idea, but for the reasons
that will be clear further, weak measurements are widely used in quantum optics. They
will be considered in detail in Chapter 11.
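The bookkeeping of such a measurement can be sketched in a few lines. The example assumes that the camera delivers three pairs of intensities, one pair for each setting of the displacer (unrotated, rotated by 45°, preceded by a QWP). The function name and the sign convention for S3 are assumptions made here, not taken from the text.

```python
def stokes_from_intensities(i_h, i_v, i_d, i_a, i_r, i_l):
    """Stokes variables from the two output intensities of each displacer setting."""
    s0 = i_h + i_v    # total intensity (sum of the two beams)
    s1 = i_h - i_v    # horizontal minus vertical
    s2 = i_d - i_a    # diagonal minus anti-diagonal (displacer rotated by 45 degrees)
    s3 = i_r - i_l    # right minus left circular (QWP before the displacer, assumed sign)
    return s0, s1, s2, s3

# A horizontally polarized beam of unit intensity splits equally in the other two bases.
S_HORIZONTAL = stokes_from_intensities(1.0, 0.0, 0.5, 0.5, 0.5, 0.5)
print(S_HORIZONTAL)
```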
9.4 Prisms and polarizing beamsplitters
Total internal reflection follows from Snell's law: for the angles of incidence and
refraction $\theta_i$, $\theta_r$ at an interface between two materials whose refractive indices
are $n_1$ and $n_2$ (Fig. 9.6) [1],
$$\frac{\sin\theta_i}{\sin\theta_r} = \frac{n_2}{n_1}. \tag{9.4}$$
If the second medium is optically less dense (for instance, air), then $n_1 > n_2$, and total
internal reflection is possible. The angle $\theta_{i0}$ of total internal reflection is defined as
$$\sin\theta_{i0} = \frac{n_2}{n_1}. \tag{9.5}$$
For a beam incident at this angle, $\theta_r = \pi/2$, and the refracted beam propagates along
the interface. For larger angles of incidence, total internal reflection occurs and there
is no refracted beam.
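Equations (9.4) and (9.5) are easy to evaluate numerically. The following minimal sketch uses helper names that are assumptions of this example; it returns `None` whenever there is no critical angle or no refracted beam.

```python
import math

def critical_angle(n1, n2):
    """Angle of total internal reflection, Eq. (9.5); None if n1 <= n2."""
    if n2 >= n1:
        return None
    return math.asin(n2 / n1)

def refraction_angle(theta_i, n1, n2):
    """Refraction angle from Snell's law, Eq. (9.4); None for total internal reflection."""
    s = n1 * math.sin(theta_i) / n2
    if s > 1.0:
        return None
    return math.asin(s)

# Example: glass (n1 = 1.5, an illustrative value) to air (n2 = 1.0).
THETA_C = critical_angle(1.5, 1.0)
print(f"critical angle: {math.degrees(THETA_C):.2f} deg")
```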
The Brewster law follows from the Fresnel formulas for reflectivity and transmissivity
of an interface. If the angle of incidence is nonzero, it is convenient to introduce the
notation ‘p’ and ‘s’, respectively, for the waves polarized parallel to the plane of
incidence (parallel) and orthogonal to it (from the German senkrecht). From the Fresnel
formulas, it follows [1] that reflectivity and transmissivity differ for p- and s-polarized
beams. In particular, for a beam incident at the Brewster angle, defined as
$$\tan\theta_B = \frac{n_2}{n_1}, \tag{9.6}$$
the p-component is not reflected at all. The reason for this can be qualitatively
understood if we notice that condition (9.6) means that the reflected and refracted beams
are orthogonal to each other (see Fig. 9.6). Indeed, from Snell's law (9.4) in combination
with Eq. (9.6), we obtain $\sin\theta_i = \cos\theta_r$. Then the absence of the ‘p’ polarization
in the reflected beam can be explained in simple terms. Recall that the reflected beam
emerges due to the oscillation of electrons on the right of the interface. Then, for the
reflected beam to be ‘p’-polarized, the electrons should oscillate in the direction
transverse to it. But this is exactly the direction in which the refracted beam propagates;
therefore, these oscillations cannot be excited by the refracted beam.
Unlike total internal reflection, which occurs only at the boundary with a less
dense medium, the Brewster effect does not require this condition. The relation be-
tween the two refractive indices n1 , n2 only affects the value of the Brewster angle.
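Both statements, the vanishing p-reflection at the Brewster angle and its independence of which medium is denser, can be checked with the standard Fresnel amplitude coefficients. The sketch below is only an illustration; the function names and the index values are assumptions of this example.

```python
import math

def brewster_angle(n1, n2):
    """Brewster angle from Eq. (9.6)."""
    return math.atan2(n2, n1)

def fresnel_r(theta_i, n1, n2):
    """Amplitude reflection coefficients (r_s, r_p) for a beam going from n1 to n2."""
    sin_t = n1 * math.sin(theta_i) / n2
    if sin_t > 1.0:
        raise ValueError("total internal reflection: no refracted beam")
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    return r_s, r_p

# Air to glass and glass to air, with an illustrative index of 1.5.
for n1, n2 in ((1.0, 1.5), (1.5, 1.0)):
    theta_b = brewster_angle(n1, n2)
    r_s, r_p = fresnel_r(theta_b, n1, n2)
    print(f"n1={n1}, n2={n2}: theta_B={math.degrees(theta_b):.2f} deg, "
          f"r_s={r_s:.3f}, r_p={r_p:.1e}")
```

In both directions r_p vanishes to numerical precision while r_s stays finite, so only the s-component is reflected.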
The Brewster effect is used for making so-called Brewster windows. These are
plates made of a crystalline or glass material and placed at the Brewster angle, so that
they do not reflect an incident p-polarized beam. As a result, there is no loss for the
p-polarization. Such windows are used for gas laser tubes, in which they reduce the
losses. Brewster-angle incidence is also used in all kinds of prisms, to reduce reflection
losses.
Figure 9.7 shows a Glan–Taylor prism. Its two halves, separated by an air gap, are made
of crystalline material. Usually calcite is chosen, due to its large birefringence and
small dispersion. For the operation in the UV range, Glan–Taylor prisms can be also
made of α-BBO [6], which has lower birefringence than calcite but is transparent down
to 200 nm. The optic axes in both halves of the prism are vertical (shown by blue
arrows in the figure).
A light beam incident on the prism and then hitting the air gap contains, in general,
both s- and p-polarized components. The s-component is an ordinary beam whose
refractive index is $n_o$, while the p-component (extraordinary beam) has the refractive
index $n_e$. These two indices differ considerably; for instance, for calcite at 532 nm
the values of $n_o$ and $n_e$ are, respectively, 1.66 and 1.49. The angle of incidence on
the crystal-air interface is chosen to be larger than the total internal reflection
angle for the s-polarized beam, $\sin\theta_i > 1/n_o$, but smaller than the one for the
p-polarized beam. Therefore, the s-polarized beam is fully reflected from the interface
and the transmitted beam is perfectly cleaned of the s-polarization (horizontal in the
figure). At the same time, for the p-polarized (extraordinary) beam, the angle of
incidence at the same interface is close to the Brewster angle, $\tan\theta_B = 1/n_e$.
Therefore, very little p-polarization (vertical) is contained in the reflected beam.
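To get a feeling for the numbers, the three angles that matter at the crystal-air interface can be evaluated directly from the calcite indices quoted above. This is a simplified sketch: it treats the extraordinary beam as having the single index n_e, as the text does, and the variable names are assumptions of this example.

```python
import math

N_O, N_E = 1.66, 1.49   # calcite at 532 nm, values quoted in the text

# Total internal reflection angles at the crystal-air interface, Eq. (9.5) with n2 = 1
TIR_ORDINARY = math.degrees(math.asin(1.0 / N_O))
TIR_EXTRAORDINARY = math.degrees(math.asin(1.0 / N_E))
# Brewster angle for the extraordinary beam, Eq. (9.6) with n2 = 1
BREWSTER_EXTRAORDINARY = math.degrees(math.atan(1.0 / N_E))

print(f"TIR angle, ordinary beam:      {TIR_ORDINARY:.1f} deg")
print(f"TIR angle, extraordinary beam: {TIR_EXTRAORDINARY:.1f} deg")
print(f"Brewster angle, extraordinary: {BREWSTER_EXTRAORDINARY:.1f} deg")
```

The window between the two total-internal-reflection angles is only about five degrees wide, which is why such a prism is sensitive to the divergence and to the angle of incidence of the input beam.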
Typically, a Glan–Taylor prism provides an extinction ratio of 1000000 : 1, which
means that the transmissivity for the incident s-polarized beam is six orders of magni-
tude lower than for the p-polarized beam. For the reflected beam, the extinction ratio
is somewhat worse but still high enough.
Because the angle of incidence is only a bit larger than the total internal reflection
angle, a Glan–Taylor prism does not provide perfect polarization splitting for divergent
beams. It is very important therefore that the incident beam is collimated and hits the
prism at normal incidence.
A Glan–Laser prism has the same design as the Glan–Taylor prism but it is made
of high-quality crystal material and therefore withstands a higher intensity and produces
less scattered light.
A Glan–Thompson prism has a slightly different structure (Fig. 9.8). The angle of inci-
dence on the interface is larger, and in both halves of the prism, the optic axes (z) are
parallel to the interface (blue dots in the figure). Now the s and p components change
roles: the first one is an extraordinary beam and is transmitted (with low loss due to
the Brewster law), and the second one is an ordinary beam and is reflected.
Because of the larger angle of incidence, the gap between the two halves can be filled
with optical cement instead of the air. This reduces the tolerance of the prism to dam-
age but increases the range of angles for which it can operate. The width of the field
of view for a Glan–Thompson prism can be as large as 40° [6].
For a Wollaston prism (Fig. 9.9), similarly to Glan prisms, two halves are made of bire-
fringent crystalline material, which can be calcite, α-BBO, quartz, or magnesium fluo-
ride. But in contrast to the Glan prisms, there is no gap between the two halves, and the
refractive indices for an incident beam differ because the optic axes in the two halves
are orthogonal to each other. For instance, in Fig. 9.9, the input half of the prism has
the optic axis horizontal and the output part, vertical. Then a horizontally polarized
beam is an extraordinary polarized beam in the first half, and has a refractive index
ne , but it is an ordinary beam after the interface and its refractive index is no . For an
incident beam with vertical polarization, it is the other way round. In the case of cal-
cite, which is a negative crystal, no > ne . Therefore, at the boundary the horizontally
polarized beam goes from a less dense medium into a more dense one; this beam is re-
fracted downwards. The vertically polarized beam, on the contrary, enters a less dense
medium and is therefore refracted upwards. The angular separation of the beams can
be different, from 1° to 20°, depending on the orientation of the interface.
Calcite Wollaston prisms provide an extinction ratio as high as 1000000 : 1 for both
vertically and horizontally polarized beams. For this reason, a Wollaston prism is the
best option if both beams are to be used after the splitting.
The simplest and cheapest polarizers are made with the help of dielectric coatings
placed on isotropic materials, like glass or fused silica. For instance, a beam incident
at 45° on a polarizing plate will experience high transmissivity for the p-polarization
and high reflectivity for the s-polarization. In order to eliminate the reflection from the
second surface of the plate, the plate is made slightly wedged. The extinction ratio for a
polarizing plate can be up to 10000 : 1, considerably worse than for Glan or Wollaston
prisms.
Polarizing cubes are based on a similar principle as polarizing plates, but they
are easier to align. A polarizing cube is made of an isotropic material with a dielectric
coating covering the 45° interface between its two halves. The dielectric coating
is chosen such that the p-polarization is not reflected due to the Brewster effect. The
s-polarization is partly reflected, partly transmitted, but it can be made to be reflected
completely due to the interference. The extinction ratio of a polarizing cube is typically
1000 : 1, worse than for a polarizing plate.
9.5 Fiber polarization components
ratio per meter. For instance, if the H-parameter is $10^{-6}$ m$^{-1}$, then after 1 km of this
fiber its extinction of the ‘wrong’ polarization is 30 dB.
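The arithmetic behind this example is a one-liner. The sketch assumes that the fraction of power leaking into the ‘wrong’ polarization is simply the product of the H-parameter and the fiber length, which is reasonable only while this product is much smaller than one. The function name is an assumption of this example.

```python
import math

def extinction_db(h_parameter_per_m, length_m):
    """Extinction (in dB) of the unwanted polarization after a fiber of given length."""
    leaked_fraction = h_parameter_per_m * length_m   # assumed linear accumulation
    return -10.0 * math.log10(leaked_fraction)

EXTINCTION_1KM = extinction_db(1e-6, 1000.0)
print(f"{EXTINCTION_1KM:.1f} dB")
```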
With the help of a PM fiber, fiber polarization beam splitters are produced. In such
a device, a single-mode fiber is connected to the input of a polarization prism (for
instance, made of calcite), whose two outputs are, in turn, coupled to PM fibers with
fast axes oriented orthogonally. Such fiber splitters are used for splitting orthogonally
polarized modes of a fiber, or combining them (in the latter case the device is used in
the opposite direction).
To control the polarization of light after or before an optical fiber, there are special de-
vices called fiber paddles. In each such paddle, a fiber is several times looped around
a spool, which leads to a birefringence appearing in the fiber, the fast axis being in the
plane of the loops. The phase introduced by such a paddle is [6]
$$\delta = \frac{2\pi^2 a N d^2}{\lambda D}, \tag{9.7}$$
where a is a constant depending on the material of the fiber, N the number of loops, d
the cladding diameter, λ the wavelength, and D the diameter of the loop. By properly
choosing the diameter of a single loop and the number of loops, one can make a paddle
equivalent to a QWP or to a HWP. A system of three paddles, two QWPs and a HWP
between them, provides an arbitrary polarization transformation from any input state
to any desired output state. The polarization controller is operated by changing the
tilt of each of the three paddles.
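Equation (9.7) can be used to estimate how many loops a paddle needs. The sketch below is illustrative only: the value a = 0.133 is a number often quoted for silica fiber and is an assumption here, as are the fiber and spool dimensions and the function names.

```python
import math

def paddle_phase(a, n_loops, d_cladding, wavelength, d_loop):
    """Phase introduced by a fiber paddle, Eq. (9.7). All lengths in meters."""
    return 2.0 * math.pi ** 2 * a * n_loops * d_cladding ** 2 / (wavelength * d_loop)

def loops_for_phase(target_phase, a, d_cladding, wavelength, d_loop):
    """Number of loops (not necessarily an integer) that gives the target phase."""
    return target_phase * wavelength * d_loop / (2.0 * math.pi ** 2 * a * d_cladding ** 2)

# Assumed example values: silica fiber, 125 um cladding, 1550 nm, 56 mm loop diameter.
A_SILICA = 0.133
D_CLAD, LAM, D_LOOP = 125e-6, 1550e-9, 56e-3

N_QWP = loops_for_phase(math.pi / 2, A_SILICA, D_CLAD, LAM, D_LOOP)
N_HWP = loops_for_phase(math.pi, A_SILICA, D_CLAD, LAM, D_LOOP)
print(f"loops for QWP: {N_QWP:.2f}, loops for HWP: {N_HWP:.2f}")
```

Since the number of loops must be an integer, in practice the loop diameter is adjusted so that the nearest integer number of loops gives the desired phase.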
9.6 Liquid-crystal devices
Spatial light modulators (SLMs) are used in numerous fields of optics, from holography
to beam shaping, and in standard devices like beam projectors. An SLM is a device
for modulating the phase of a beam, and sometimes also its amplitude, with high spatial
resolution. The operation of an SLM is based on the polarization properties of liquid
crystals (Chapter 4). This operation is schematically demonstrated in Fig. 9.10 [4]
where, as an example, four pixels of an SLM are shown.
Each pixel contains a tiny amount of a liquid crystal. Molecules of the liquid crystal
(their directors shown by green arrows) are oriented by the electric field Ei , applied in-
dependently to every pixel. As a result, in every pixel there is a different orientation of
all molecules. As described in Section 4.2.4, the refractive index for light with a certain
polarization will be different depending on the orientation of the molecules in each
pixel. Light (shown by red arrows) is sent to the SLM and reflected by a mirror on its
back side. The phase acquired by the incident light at each pixel will be proportional
to the refractive index value, and it can be varied independently by setting the electric
field at the pixel. The accessible phases cover the range from 0 to 2π.
This way a different phase can be imparted on every pixel. To make sure that only
the phase-modulated beam is used, usually a blazed grating is also written on the SLM.
The superposition of the blazed grating and the required phase distribution defines
the phase profile of the diffracted beam.
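The superposition mentioned here amounts to adding a linear phase ramp to the desired phase map and wrapping the sum into the accessible range from 0 to 2π. The sketch below works on a plain list of rows with the grating running along x; the function name and the pixel layout are assumptions of this example and do not describe any particular device interface.

```python
import math

def slm_phase_pattern(desired_phase, period_px):
    """Add a blazed grating (2*pi ramp per period along x) and wrap to [0, 2*pi)."""
    two_pi = 2.0 * math.pi
    return [
        [(phi + two_pi * x / period_px) % two_pi for x, phi in enumerate(row)]
        for row in desired_phase
    ]

# A flat desired phase on a 2 x 8 pixel patch, grating period of 4 pixels.
FLAT = [[0.0] * 8 for _ in range(2)]
PATTERN = slm_phase_pattern(FLAT, 4)
print([round(v, 3) for v in PATTERN[0]])
```

For a flat desired phase the result is just the sawtooth of the blazed grating; any additional phase structure shifts the sawtooth locally.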
As mentioned, one of the applications of an SLM is generation of scalar struc-
tured beams. For example, to convert a Gaussian beam into a Hermite–Gaussian or
a Laguerre–Gaussian beam, one has to modulate both its phase and its intensity as
in the examples of Fig. 7.2. The phase distributions, shown in the insets of the figure,
are then imparted on the beam with the help of an SLM. For modulating the intensity,
additional elements have to be used. In particular, because a liquid crystal modifies
not only the phase but also the polarization state of a beam, the intensity can be mod-
ulated by means of a polarizer introduced after the SLM.
9.6.2 Q-plates
The fact that a liquid crystal modifies the polarization state of light underpins the op-
eration of so-called q-plates—devices that appeared recently but are already available
commercially [5]. A q-plate is designed to substitute segmented waveplates mentioned
in Chapter 7 as well as more complicated devices. Indeed, a liquid crystal layer with all
molecules oriented the same way is similar to a slab of a birefringent crystal. Suppose
that the directors of all molecules are parallel and lie in the plane of the layer. Then,
depending on the thickness of the layer, the liquid crystal will act as a waveplate with
a certain phase, for instance a HWP or a QWP. A segmented waveplate can then be
made by orienting the directors of the molecules differently in different segments of
such a layer.
Figure 9.11 shows, as an example, an 8-segment waveplate made of a liquid crystal.
In each next segment of this waveplate, the directors of the molecules are rotated by
π/8 compared to the previous one (panel a). This construction has been also described
in Section 7.5. Let the thickness of the liquid-crystal layer be such that each segment
acts as a HWP,
with the optic axis along the directors. Then, if the input light beam is horizontally
polarized, its polarization after the kth segment will be still linear but rotated by (k −
1)π/4. Panel b illustrates this transformation as a rotation on the Poincaré sphere, with
the numbered green lines marking the rotation trajectories. Segments 1 and 5 lead to
no rotation. Panel d below shows the resulting distribution of the polarization after the
plate. The output beam is radially polarized, similar to the one discussed in Chapter 7.
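The action of the segments on a horizontally polarized beam can be verified with Jones matrices. The sketch uses the standard Jones matrix of a HWP with its axis at angle θ (up to a global phase); the function names are assumptions of this example.

```python
import math

def hwp(theta):
    """Jones matrix of a half-wave plate with its axis at angle theta (global phase dropped)."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return ((c, s), (s, -c))

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component Jones vector."""
    return (matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1])

def segment_output(k):
    """Jones vector after segment k (k = 1..8) for horizontally polarized input."""
    director_angle = (k - 1) * math.pi / 8
    return apply(hwp(director_angle), (1.0, 0.0))

for k in range(1, 9):
    ex, ey = segment_output(k)
    angle = math.degrees(math.atan2(ey, ex)) % 180.0
    print(f"segment {k}: polarization direction at {angle:.0f} deg")
```

The polarization direction advances by π/4 from segment to segment, i.e. twice as fast as the director, so over the eight segments it follows the azimuth of the segment and the beam is radially polarized.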
Figure 9.11: An 8-segment q-plate (a) with the thickness equivalent to the one of a HWP. Green ar-
rows show the directors of the molecules. Panels b, c show the Poincaré-sphere representation of
the q-plate’s action on a horizontally polarized beam (b) and a right-hand circularly polarized beam
(c). Red points show the Stokes vector of the initial polarization state and the numbered green lines
are rotations performed on the Poincaré sphere by different segments of the plate. After the q-plate,
a horizontally polarized beam becomes a radially polarized beam (d) and a right-hand circularly po-
larized beam becomes left-hand circularly polarized and acquires an azimuthal phase shift (e).
If the input beam is right-hand circularly polarized, we know from Chapter 5 that after
a HWP it will be left-hand circularly polarized, regardless of the plate orientation. But,
as described in Section 6.4.2 of Chapter 6, depending on the plate orientation it will
acquire a different geometric phase (see Figure 6.9). Figure 9.11(c) shows the Poincaré-
sphere representation of the polarization transformations in this case. For instance,
segment 1 of the plate results in the rotation of the initial Stokes vector (red point on
the North Pole) by π around the σ1 axis. Segment 2 rotates the same point by the same
angle around the (σ1 + σ2)/√2 axis, segment 3 around the σ2 axis, and so on. After
segment k, the solid angle Ωk = (k − 1)π/2 is covered on the Poincaré sphere compared
to the trajectory due to the first segment, and a geometric phase βk = (k − 1)π/4 is
acquired. The resulting beam will have a phase varying with the azimuthal angle by
2π, i.e., an orbital angular momentum l = 1 (panel e).
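The same result follows from a short Jones-matrix calculation. The sketch assumes that right-hand circular polarization is represented by the Jones vector (1, i)/√2; with the opposite convention the sign of the phase is reversed. The function names are assumptions of this example.

```python
import cmath
import math

def hwp(theta):
    """Jones matrix of a half-wave plate with its axis at angle theta (global phase dropped)."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return ((c, s), (s, -c))

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component Jones vector."""
    return (matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1])

INV_SQRT2 = 1.0 / math.sqrt(2.0)
RIGHT = (INV_SQRT2, 1j * INV_SQRT2)    # assumed convention for right-hand circular
LEFT = (INV_SQRT2, -1j * INV_SQRT2)

def segment_amplitude(k):
    """Projection of the output of segment k onto left-hand circular polarization."""
    out = apply(hwp((k - 1) * math.pi / 8), RIGHT)
    return LEFT[0].conjugate() * out[0] + LEFT[1].conjugate() * out[1]

for k in range(1, 9):
    amp = segment_amplitude(k)
    print(f"segment {k}: |amplitude| = {abs(amp):.3f}, "
          f"phase = {math.degrees(cmath.phase(amp)):.0f} deg")
```

The modulus is one for every segment, so the output is purely left-hand circular, while the phase grows in steps of π/4, in agreement with βk = (k − 1)π/4.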
In a similar way, one can azimuthally modulate both the phase and the polariza-
tion state of the beam. For instance, if the waveplate is a QWP, and the input beam
is right-hand circularly polarized, then the output beam will have radial polarization
and an azimuthally varying phase. This becomes clear from Fig. 9.11(c) if we imagine
the trajectories stopped half-way, according to the fact that a QWP performs only a π/2
rotation on the Poincaré sphere.
In order to make a q-plate [5], a nematic liquid crystal is placed between two
polymer-coated substrates. The polymer is corrugated or structured using polarized
ultraviolet light; then the directors of the nematic liquid crystal are aligned accord-
ingly. This way, any spatial distribution of the molecules’ directors can be obtained. In
particular, the boundaries between the segments in Fig. 9.11(a) can be made smooth,
so that the polarization state of light or a phase at the output of a q-plate is varied
continuously and not stepwise as in panels d and e of the figure.
Some q-plates enable tuning of their phase delay at a given point by means of an
applied AC electric field, which changes the birefringence of the oriented liquid crystal.
Moreover, it is possible to independently control the phases for different wavelengths.
This is a considerable advantage compared to a segmented waveplate [3].
Bibliography
[1] M. Born and E. Wolf. Principles of optics. Pergamon Press, 1970.
[2] D. Goldstein. Polarized light. CRC, 2003.
[3] A. Rubano, F. Cardano, B. Piccirillo, and L. Marrucci. Q-plate technology: a progress review.
J. Opt. Soc. Am. B, 36(5):D70–D87, May 2019.
[4] Hamamatsu selection guide. https://www.hamamatsu.com/resources/pdf.
[5] ARCoptix website. http://www.arcoptix.com/Q_Plate.htm.
[6] Thorlabs website. https://www.thorlabs.com.
10 Polarization in nonlinear optics
Nonlinear optics describes frequency conversion of light due to the nonlinearity of
the electromagnetic response of matter. Many aspects of nonlinear optics depend
strongly on the polarization of light, which is why this chapter is included in
this book. However, the reader is expected to have some basic knowledge of nonlinear
optics.
$$\vec P(\vec r, t) = \epsilon_0\left[\hat\chi^{(1)}\cdot\vec E(\vec r, t) + \hat\chi^{(2)} : \vec E(\vec r, t)\vec E(\vec r, t) + \hat\chi^{(3)}\,\vdots\,\vec E(\vec r, t)\vec E(\vec r, t)\vec E(\vec r, t) + \cdots\right], \tag{10.1}$$
where we introduced the nth-order nonlinear susceptibility $\hat\chi^{(n)}$. The nonlinear
susceptibility $\hat\chi^{(n)}$ sets the relation between a vector ($\vec P$) and a product of n other
vectors ($\vec E$) and is therefore a tensor of rank n + 1. The tensor nature of the nonlinear
susceptibility is often ignored in simplified descriptions of nonlinear optical interactions;
but in this section, it will be the main issue. The signs ‘⋅’, ‘:’, etc. denote the procedure of
multiplying a tensor by one, two, etc. vectors. In what follows, we will omit the ‘hats’
over the susceptibilities but will bear in mind that they are tensors.
We have assumed here that the response of the medium is instantaneous and local:
the space and time dependences of the polarization $\vec P(\vec r, t)$ repeat those of the field
$\vec E(\vec r, t)$. This is possible only in a material without absorption and dispersion [2], but in
this book, as a rule, we ignore both effects.
We will further distinguish between the linear polarization
$\vec P^{(1)}(\vec r, t) = \epsilon_0\chi^{(1)}\cdot\vec E(\vec r, t)$,
second-order nonlinear polarization
and so on.
https://doi.org/10.1515/9783110668025-010
10.1 Nonlinear susceptibilities: tensor description | 117
$$P_i = \epsilon_0\Big(\sum_j \chi^{(1)}_{ij}E_j + \sum_{jk}\chi^{(2)}_{ijk}E_jE_k + \sum_{jkl}\chi^{(3)}_{ijkl}E_jE_kE_l + \cdots\Big), \tag{10.4}$$
where we have omitted, for brevity, the time and space dependences of the fields. The
indices i, j, k, l correspond to the Cartesian coordinates and each one can take values
x, y, z.
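The component form (10.4) translates directly into nested sums. The sketch below evaluates it for susceptibilities stored as nested lists, with the indices 0, 1, 2 standing for x, y, z. The function name and the numerical values are assumptions used only for illustration, and ϵ0 is set to one by default.

```python
from itertools import product

def polarization(E, chi1, chi2=None, chi3=None, eps0=1.0):
    """Components P_i of Eq. (10.4), truncated after the third-order term."""
    P = []
    for i in range(3):
        total = sum(chi1[i][j] * E[j] for j in range(3))
        if chi2 is not None:
            total += sum(chi2[i][j][k] * E[j] * E[k]
                         for j, k in product(range(3), repeat=2))
        if chi3 is not None:
            total += sum(chi3[i][j][k][l] * E[j] * E[k] * E[l]
                         for j, k, l in product(range(3), repeat=3))
        P.append(eps0 * total)
    return P

# Isotropic linear response plus a single second-order element chi_zxx (illustrative values).
CHI1 = [[0.5 if i == j else 0.0 for j in range(3)] for i in range(3)]
CHI2 = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
CHI2[2][0][0] = 2.0

P_EXAMPLE = polarization((3.0, 0.0, 0.0), CHI1, CHI2)
print(P_EXAMPLE)
```

An x-polarized field here produces a linear polarization along x and a second-order polarization along z, which illustrates why the tensor indices matter.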
In Chapter 4 we have already dealt with the second-rank first-order susceptibility
tensor $\chi^{(1)}_{ij}$ and the dielectric permittivity tensor $\epsilon_{ij}$ related to it. In the scalar form, we
had (see Chapters 2, 4) ϵ = 1 + χ. In the matrix form, we get for the linear dielectric
permittivity tensor
From Chapter 4, we know that by properly choosing the frame of reference, the
number of elements for the dielectric permittivity tensor can be reduced. The optimal
frame of reference corresponds to the symmetry of the matter, and in this frame the
maximal number of ϵij elements is three. In materials of higher symmetry, it can be
two (uniaxial crystals) or one (isotropic crystals or amorphous materials).
In the next two sections, we will consider the tensor properties of nonlinear sus-
ceptibilities and, in particular, the number of their independent elements.
1 At the boundary between two isotropic materials, for instance, on a surface of an isotropic solid, one
can still observe second-order nonlinear effects because the boundary breaks the central symmetry.
In other words, there is always a surface contribution to the second-order nonlinear susceptibility.
Usually, nonlinear effects are observed for rather narrowband (laser) radiation, so
that one can speak about radiation at certain frequencies converted into radiation at
other frequencies. For instance, in second-harmonic generation the fundamental
radiation at a frequency ω is converted into its second harmonic at frequency 2ω. It then
makes sense to consider the fields and polarization of the matter at specific frequencies,
rather than at specific times, as in Eq. (10.1). In other words, we will further use the
Fourier components of the fields and nonlinear polarization from Eq. (10.1), and the
corresponding Fourier components of the susceptibilities.
Consider now the general case of second-order frequency conversion, for in-
stance, conversion from frequencies ωn and ωm (which could be any, including neg-
ative values) to the frequency ωn + ωm . This includes sum- and difference-frequency
generation, second-harmonic generation, and optical rectification. The second-order
polarization at frequency ωn + ωm is [2]
where we stressed that the value of the susceptibility depends on the process we con-
sider.
For instance, second-harmonic generation will correspond to $\chi^{(2)}_{ijk}(2\omega, \omega, \omega)$;
difference-frequency generation, to $\chi^{(2)}_{ijk}(\omega_1 - \omega_2, \omega_1, -\omega_2)$; optical rectification, to
$\chi^{(2)}_{ijk}(0, \omega, -\omega)$.
Let us consider the most general case of conversion between three frequencies $\omega_1$,
$\omega_2$, and $\omega_1 + \omega_2 \equiv \omega_3$, and ask the question: how many elements of the
$\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2)$ tensor are there?
Without any restrictions, there would be six different tensors: $\chi^{(2)}_{ijk}(\omega_1, -\omega_2, \omega_3)$;
$\chi^{(2)}_{ijk}(\omega_1, \omega_3, -\omega_2)$; . . .; $\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2)$. Another six tensors can be obtained by flipping
the signs of all frequencies. Finally, in every tensor there will be 27 combinations of
indices i, j, k. Altogether, it makes 12 × 27 = 324 complex numbers, because
susceptibilities are, in the general case, complex.
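The counting can be reproduced by brute-force enumeration. In the sketch below the frequency labels are plain strings and serve only as placeholders; the variable names are assumptions of this example.

```python
from itertools import permutations, product

# Six ways of ordering the three frequencies among the three arguments
ORDERINGS = list(permutations(("w1", "w2", "w3")))
# Each ordering comes with the original and the sign-flipped set of frequencies
SIGNS = (+1, -1)
# Every tensor has 3 x 3 x 3 Cartesian index combinations
INDEX_TRIPLES = list(product("xyz", repeat=3))

TOTAL = len(ORDERINGS) * len(SIGNS) * len(INDEX_TRIPLES)
print(len(ORDERINGS), len(INDEX_TRIPLES), TOTAL)
```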
Fortunately, there are some restrictions imposed by symmetry considerations.
1. Complex conjugation. Because all fields entering Eq. (10.1) are real, the Fourier
components of the susceptibility satisfy the condition
$$\chi^{(2)}_{ijk}(-\omega_3, -\omega_1, -\omega_2) = \left[\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2)\right]^*, \tag{10.8}$$
i.e., flipping the signs of the frequencies is equivalent to complex conjugation. This
reduces the number of independent tensor values by a factor of 2.
2. Intrinsic permutation symmetry. From the definition of the process (10.7), it
follows that the frequencies $\omega_{1,2}$ summing up to a frequency $\omega_3$ can be interchanged
together with their indices [2]:
$$\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2) = \chi^{(2)}_{ikj}(\omega_3, \omega_2, \omega_1). \tag{10.9}$$
3. Full permutation symmetry [2]. In a lossless medium there are additional symme-
try restrictions imposed on the nonlinear susceptibility tensor. First, in the absence of
absorption the imaginary parts of all susceptibilities should be zero. Then $\chi^{(2)}_{ijk}$ should
be a real tensor, and from Eq. (10.8) we have
$$\chi^{(2)}_{ijk}(-\omega_3, -\omega_1, -\omega_2) = \chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2). \tag{10.10}$$
Second, in a lossless medium we can also interchange any two frequencies to-
gether with the corresponding indices:
$$\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2) = \chi^{(2)}_{jik}(-\omega_1, -\omega_3, \omega_2). \tag{10.11}$$
Importantly, the first frequency in the three arguments of the second-order suscepti-
bility should be the sum of the other two. This property is called the full permutation
symmetry.
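But according to Eq. (10.10) the signs of all frequencies can be changed as well;
therefore, we get
$$\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2) = \chi^{(2)}_{jik}(\omega_1, \omega_3, -\omega_2). \tag{10.12}$$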
Full permutation symmetry directly follows from the fact that the energy density
of the electric field in a lossless medium is constant [2]. The proof is similar to the
one we used in Chapter 4. Indeed, in the presence of nonlinear susceptibilities, ex-
pression (4.7) for the energy density of the electric field should be completed with
nonlinear terms:
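$$\begin{aligned}
U_e ={}& \frac{\epsilon_0}{2}\sum_{ij}\sum_{n}\epsilon_{ij}(\omega_n)E_i^*(\omega_n)E_j(\omega_n) \\
&+ \frac{\epsilon_0}{3}\sum_{ijk}\sum_{nm}\chi^{(2)}_{ijk}(-\omega_n-\omega_m, \omega_n, \omega_m)E_i^*(\omega_n+\omega_m)E_j(\omega_n)E_k(\omega_m) \\
&+ \frac{\epsilon_0}{4}\sum_{ijkl}\sum_{nmo}\chi^{(3)}_{ijkl}(-\omega_n-\omega_m-\omega_o, \omega_n, \omega_m, \omega_o) \\
&\quad\times E_i^*(\omega_n+\omega_m+\omega_o)E_j(\omega_n)E_k(\omega_m)E_l(\omega_o) + \cdots.
\end{aligned} \tag{10.13}$$
Because i, j, k, l are just dummy (summation) indices, the second-order and third-order
susceptibilities $\chi^{(2)}_{ijk}(-\omega_n-\omega_m, \omega_n, \omega_m)$, $\chi^{(3)}_{ijkl}(-\omega_o-\omega_n-\omega_m, \omega_n, \omega_m, \omega_o)$ should have
full permutation symmetry. Therefore, the susceptibilities with modified frequency
arguments, $\chi^{(2)}_{ijk}(\omega_n+\omega_m, \omega_n, \omega_m)$ and $\chi^{(3)}_{ijkl}(\omega_o+\omega_n+\omega_m, \omega_n, \omega_m, \omega_o)$, should also have
this property.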
4. Kleinman’s symmetry. The strongest symmetry restriction for the number of el-
ements of the second-order susceptibility tensor is valid in materials without optical
dispersion. The assumption that dispersion is absent, i.e., that the optical properties
do not depend on the frequency, is even stronger than the assumption of no optical
loss. This assumption is valid when any material resonances are far from the frequency
range of interest. Typically, this is true in the middle of the visible–near-infrared range.
Then in Eq. (10.12) the sequence of frequencies is immaterial. One can therefore
write
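$$d_{il}(\omega_3, \omega_1, \omega_2) = \frac{1}{2}\chi^{(2)}_{ijk}(\omega_3, \omega_1, \omega_2), \tag{10.15}$$
where the factor 1/2 is added for convenience and there is a correspondence between
the combination of j, k indices and the l index:
$$\begin{aligned}
d_{12} &= \frac{1}{2}\chi^{(2)}_{122} = \frac{1}{2}\chi^{(2)}_{212} = d_{26}; \\
d_{13} &= \frac{1}{2}\chi^{(2)}_{133} = \frac{1}{2}\chi^{(2)}_{313} = d_{35}; \\
d_{23} &= \frac{1}{2}\chi^{(2)}_{233} = \frac{1}{2}\chi^{(2)}_{323} = d_{34}; \ \ldots.
\end{aligned} \tag{10.18}$$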
We find that in the presence of Kleinman’s symmetry, there are only ten different
components of the tensor:
Furthermore, for a certain crystal some components of this tensor will be zero
according to the symmetry class to which the crystal belongs, in full agreement with
the Neumann principle. The structure of the dil tensor for different symmetry classes
will be described in the next section.
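The number of independent components under Kleinman's symmetry can be checked by enumeration. The sketch assumes the standard contracted notation, in which the index pairs jk = 11, 22, 33, 23 (or 32), 13 (or 31), 12 (or 21) are mapped to l = 1, ..., 6, and groups all d components whose three underlying indices differ only by a permutation. The names are assumptions of this example, and the indices in the code are zero-based.

```python
from itertools import product

# Assumed standard contraction (zero-based): pair (j, k) -> column l of the d matrix
CONTRACT = {(0, 0): 0, (1, 1): 1, (2, 2): 2,
            (1, 2): 3, (2, 1): 3,
            (0, 2): 4, (2, 0): 4,
            (0, 1): 5, (1, 0): 5}

def kleinman_classes():
    """Group the elements d_il that become equal when all three indices may be permuted."""
    classes = {}
    for i, j, k in product(range(3), repeat=3):
        key = tuple(sorted((i, j, k)))            # same key for all permutations of (i, j, k)
        classes.setdefault(key, set()).add((i, CONTRACT[(j, k)]))
    return classes

CLASSES = kleinman_classes()
for members in CLASSES.values():
    # Print with one-based labels, e.g. d12 = d26
    print(" = ".join(f"d{i + 1}{l + 1}" for i, l in sorted(members)))
print("independent components:", len(CLASSES))
```

The enumeration reproduces, among others, the equalities of Eq. (10.18) and gives ten groups in total.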
In Section 4.2 we considered the linear optical properties for different crystal systems
and classes. We saw that for crystals with the highest symmetry, the dielectric
permittivity tensor has the simplest structure. The lower the crystal symmetry, the larger
the number of different elements of $\epsilon_{ij}$. According to the Neumann principle, the situation
with the second-order susceptibility is similar. We will now describe the properties of
$d_{il}$ for different types of crystals. So far, we are not assuming Kleinman’s symmetry.
Isotropic crystals (cubic). One might think that, for cubic crystals, which have
isotropic linear optical properties, there will be no second-order nonlinear effects.
However, among the cubic crystals, only the ones belonging to classes m3̄ and m3̄m are
centrosymmetric and therefore have no second-order susceptibility. Crystals of another
cubic class, 432, although non-centrosymmetric, also have no second-order
susceptibility because of other symmetry restrictions. But for crystals of classes
23 and 4̄3m (an example is gallium arsenide, GaAs), there are three nonzero elements
of the d tensor: $d_{14} = d_{25} = d_{36}$. Moreover, the second-order susceptibility of GaAs is
one of the highest known.
Uniaxial crystals. In the hexagonal system, classes 6/m and 6/mmm are cen-
trosymmetric. For the rest, the d matrices have the structure
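$$d^{6}_{il} = \begin{pmatrix} 0 & 0 & 0 & d_{14} & d_{15} & 0 \\ 0 & 0 & 0 & d_{15} & -d_{14} & 0 \\ d_{31} & d_{31} & d_{33} & 0 & 0 & 0 \end{pmatrix} \tag{10.20}$$
$$d^{622}_{il} = \begin{pmatrix} 0 & 0 & 0 & d_{14} & 0 & 0 \\ 0 & 0 & 0 & 0 & -d_{14} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \tag{10.22}$$
$$d^{6mm}_{il} = \begin{pmatrix} 0 & 0 & 0 & 0 & d_{15} & 0 \\ 0 & 0 & 0 & d_{15} & 0 & 0 \\ d_{31} & d_{31} & d_{33} & 0 & 0 & 0 \end{pmatrix} \tag{10.23}$$
$$d^{\bar 6 m2}_{il} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & d_{16} \\ d_{16} & -d_{16} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \tag{10.24}$$
$$d^{4}_{il} = d^{6}_{il};$$
$$d^{\bar 4}_{il} = \begin{pmatrix} 0 & 0 & 0 & d_{14} & d_{15} & 0 \\ 0 & 0 & 0 & -d_{15} & d_{14} & 0 \\ d_{31} & -d_{31} & 0 & 0 & 0 & d_{36} \end{pmatrix}; \tag{10.25}$$
$$d^{422}_{il} = d^{622}_{il};$$
$$d^{4mm}_{il} = d^{6mm}_{il};$$
$$d^{\bar 4 2m}_{il} = \begin{pmatrix} 0 & 0 & 0 & d_{14} & 0 & 0 \\ 0 & 0 & 0 & 0 & d_{14} & 0 \\ 0 & 0 & 0 & 0 & 0 & d_{36} \end{pmatrix}. \tag{10.26}$$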
Biaxial crystals. In the orthorhombic system, crystals of class mmm are centrosym-
metric. For the other classes,
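$$d^{222}_{il} = \begin{pmatrix} 0 & 0 & 0 & d_{14} & 0 & 0 \\ 0 & 0 & 0 & 0 & d_{25} & 0 \\ 0 & 0 & 0 & 0 & 0 & d_{36} \end{pmatrix}; \tag{10.30}$$
$$d^{mm2}_{il} = \begin{pmatrix} 0 & 0 & 0 & 0 & d_{15} & 0 \\ 0 & 0 & 0 & d_{24} & 0 & 0 \\ d_{31} & d_{32} & d_{33} & 0 & 0 & 0 \end{pmatrix}. \tag{10.31}$$
$$d^{2}_{il} = \begin{pmatrix} 0 & 0 & 0 & d_{14} & 0 & d_{16} \\ d_{21} & d_{22} & d_{23} & 0 & d_{25} & 0 \\ 0 & 0 & 0 & d_{34} & 0 & d_{36} \end{pmatrix}; \tag{10.32}$$
$$d^{m}_{il} = \begin{pmatrix} d_{11} & d_{12} & d_{13} & 0 & d_{15} & 0 \\ 0 & 0 & 0 & d_{24} & 0 & d_{26} \\ d_{31} & d_{32} & d_{33} & 0 & d_{35} & 0 \end{pmatrix}. \tag{10.33}$$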
Finally, for triclinic crystals, class 1̄ is centrosymmetric, and for class 1 all elements
of the dil tensor are nonzero and, in the general case, different.
In the presence of Kleinman’s symmetry, several components of dil become zero
or equal to each other [2].
Due to the intrinsic permutation symmetry, contracted notation can also be introduced
for the cubic susceptibility. Instead of the rank-4 tensor $\chi^{(3)}_{ijkl}$, one can consider
a 3 × 10 matrix $c_{im}$ with i = 1, 2, 3 and m = 1, …, 9, 0:
\[
c_{im}(\omega_4, \omega_1, \omega_2, \omega_3) = \chi^{(3)}_{ijkl}(\omega_4, \omega_1, \omega_2, \omega_3), \tag{10.36}
\]
with the indices jkl and m related as follows (10.37):

    jkl                             m
    111                             1
    222                             2
    333                             3
    233, 323, 332                   4
    223, 322, 232                   5
    133, 313, 331                   6
    113, 311, 131                   7
    122, 212, 221                   8
    112, 121, 211                   9
    123, 132, 213, 231, 312, 321    0
related as
χ1 = χ2 + χ3 + χ4 . (10.39)
Equation (10.39) follows from the simple fact that in an isotropic medium, the
nonlinear effects are the same for any input polarization state. For instance, if the
incident light is polarized diagonally and propagates along the z axis, E⃗ = E0 e⃗D , then the
third-order nonlinear polarization (for instance, for the third-harmonic generation)
should also be polarized diagonally and its value should be
At the same time, we can write it formally according to Eq. (10.34), assuming that the
electric field has components Ex = E0 /√2 and Ey = E0 /√2. The components of the
polarization vector P⃗ (3) , taking into account Eq. (10.38), will then be
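\[
P_x^{(3)} = \epsilon_0 \frac{E_0^3}{2\sqrt{2}} (\chi_1 + \chi_2 + \chi_3 + \chi_4),
\]
\[
P_y^{(3)} = \epsilon_0 \frac{E_0^3}{2\sqrt{2}} (\chi_1 + \chi_2 + \chi_3 + \chi_4). \tag{10.41}
\]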
From Eqs. (10.40), (10.41), we obtain Eq. (10.39).
10.2 Phase matching | 125
The same way, i. e., by assuming that for an isotropic medium all polarization
states of the input electric field are equivalent, we can derive an interesting feature: the
third harmonic from a circularly polarized wave in an isotropic material is absent [6].
Indeed, let the incident field be right-hand circularly polarized, E⃗ = E0 e⃗R . Then, as in
the previous example, on the one hand,
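\[
P_x^{(3)} = \epsilon_0 \frac{E_0^3}{2\sqrt{2}} (\chi_1 - \chi_2 - \chi_3 - \chi_4),
\]
\[
P_y^{(3)} = -i\epsilon_0 \frac{E_0^3}{2\sqrt{2}} (\chi_1 - \chi_2 - \chi_3 - \chi_4). \tag{10.43}
\]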
From Eq. (10.39), it follows that Px(3) = Py(3) = 0. But even without Eq. (10.39), Eq. (10.43)
is in contradiction with Eq. (10.42) because it describes a left-circularly polarized wave.
Therefore, the third-harmonic generation should be absent.
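Both symmetry arguments can be checked numerically. The sketch below is only an illustration and rests on assumptions: the labels χ1 = χ1111, χ2 = χ1122, χ3 = χ1212, χ4 = χ1221 (the defining equation is not reproduced here), the relation P_i = ϵ0 χ_ijkl E_j E_k E_l, and arbitrary numerical values for the susceptibilities. The isotropic tensor is built from Kronecker deltas, so Eq. (10.39) holds by construction.

```python
import numpy as np

def isotropic_chi3(chi2, chi3, chi4):
    """Isotropic rank-4 tensor:
    chi_ijkl = chi2*d_ij*d_kl + chi3*d_ik*d_jl + chi4*d_il*d_jk (assumed labelling)."""
    d = np.eye(3)
    return (chi2 * np.einsum("ij,kl->ijkl", d, d)
            + chi3 * np.einsum("ik,jl->ijkl", d, d)
            + chi4 * np.einsum("il,jk->ijkl", d, d))

def third_harmonic_polarization(chi, E, eps0=1.0):
    """P_i = eps0 * chi_ijkl E_j E_k E_l (no complex conjugation for third harmonic)."""
    return eps0 * np.einsum("ijkl,j,k,l->i", chi, E, E, E)

chi = isotropic_chi3(0.3, 0.5, 0.2)   # arbitrary illustrative values
chi1 = chi[0, 0, 0, 0]                # chi_1111 = chi2 + chi3 + chi4

E_diag = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # diagonal polarization
E_circ = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2)  # circular (either handedness works)

P_diag = third_harmonic_polarization(chi, E_diag)  # parallel to E_diag, magnitude chi1
P_circ = third_harmonic_polarization(chi, E_circ)  # vanishes identically
```

For any isotropic tensor of this form, P⃗ is proportional to E⃗ (E⃗ ⋅ E⃗), and E⃗ ⋅ E⃗ = 0 for circular polarization, which is why the third harmonic disappears.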
Cubic crystals have the same 21 nonzero elements of the $\chi^{(3)}_{ijkl}$ tensor as isotropic
materials, but more of them are independent: 4 for some classes and 7 for others [2]. For
instance, for classes 432, 4̄3m, and m3m, there are the same 4 nonzero elements (10.38)
as for isotropic materials, but without Eq. (10.39). Further, the less symmetric a crystal
is, the more nonzero and independent elements the tensor $\chi^{(3)}$ has. The least symmetric,
triclinic crystals have 81 nonzero elements of the $\chi^{(3)}$ tensor, all of them independent.
A detailed table of nonzero elements of $\chi^{(3)}_{ijkl}$ for each crystal group can be found
in Ref. [2].
In the general case of a nonlinear interaction, all participating waves can have dif-
ferent polarization states. For instance, in the simplest case of the second-harmonic
generation, Eq. (10.7) becomes
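\[
\begin{pmatrix} P_x^{(2)}(2\omega) \\ P_y^{(2)}(2\omega) \\ P_z^{(2)}(2\omega) \end{pmatrix}
= 2\epsilon_0
\begin{pmatrix} d_{11} & d_{12} & d_{13} & d_{14} & d_{15} & d_{16} \\ d_{16} & d_{22} & d_{23} & d_{24} & d_{14} & d_{12} \\ d_{15} & d_{24} & d_{33} & d_{23} & d_{13} & d_{14} \end{pmatrix}
\begin{pmatrix} E_x^2(\omega) \\ E_y^2(\omega) \\ E_z^2(\omega) \\ 2E_y(\omega)E_z(\omega) \\ 2E_x(\omega)E_z(\omega) \\ 2E_x(\omega)E_y(\omega) \end{pmatrix}. \tag{10.45}
\]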
It follows that the direction of the nonlinear polarization vector P⃗ (2) (2ω) depends on
both the structure of the nonlinear tensor dil and the polarization of the fundamental
harmonic wave. The same will be valid for any second-order or third-order nonlinear
process.
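As an illustration of Eq. (10.45), the following minimal sketch evaluates the second-harmonic polarization for a given contracted d matrix and fundamental field. The function name, the numerical values of d14 and d36, and the choice ϵ0 = 1 are assumptions made only for this example; the matrix has the 4̄2m structure of Eq. (10.26).

```python
import numpy as np

def shg_polarization(d, E, eps0=1.0):
    """P(2w) = 2*eps0 * d @ (Ex^2, Ey^2, Ez^2, 2EyEz, 2ExEz, 2ExEy), cf. Eq. (10.45)."""
    Ex, Ey, Ez = E
    products = np.array([Ex**2, Ey**2, Ez**2, 2*Ey*Ez, 2*Ex*Ez, 2*Ex*Ey])
    return 2 * eps0 * d @ products

# Contracted d matrix with the structure of class 4-bar 2m, Eq. (10.26);
# the numbers are placeholders, not material data.
d14, d36 = 1.0, 0.8
d_42m = np.array([[0, 0, 0, d14, 0,   0],
                  [0, 0, 0, 0,   d14, 0],
                  [0, 0, 0, 0,   0,   d36]], dtype=float)

# Fundamental field polarized diagonally in the xy plane.
E = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
P = shg_polarization(d_42m, E)   # only the z component is nonzero
```

In this example a field lying in the xy plane drives a nonlinear polarization along z, which shows how the tensor structure, together with the input polarization, fixes the direction of P⃗(2ω).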
How will the electromagnetic waves emerging due to the nonlinear interaction be
polarized? For instance, what will be the polarization state of the second harmonic
resulting from nonlinear polarization (10.44)? The approach used in nonlinear optics
is based on the Helmholtz equation describing the electric field wave E⃗ induced by the
nonlinear polarization wave:
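\[
\nabla^2 \vec{E} - \frac{\epsilon}{c^2} \frac{\partial^2 \vec{E}}{\partial t^2} = \frac{1}{\epsilon_0 c^2} \frac{\partial^2 \vec{P}^{NL}}{\partial t^2}. \tag{10.46}
\]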
This Helmholtz equation differs from the one we derived in Chapter 7 by the right-
hand side: now it is nonzero. This is a typical equation describing oscillations of a
system (electric field) due to the driving force (nonlinear polarization). At each point,
the electric field will have the same direction and the same oscillation frequency as the
polarization (10.45). At the same time, the propagation of a wave E⃗(r⃗, t) in a material
requires certain conditions to be satisfied. First, the wavevector and the frequency of
a propagating wave should obey the dispersion relation k⃗ = k⃗(ω). Second, as was
shown in Chapter 4, only waves in two polarization states can propagate in a crystal:
the ordinary wave and the extraordinary wave. The first restriction leads to the phase
matching condition. The second one dictates the possible polarization types of phase
matching.
The wave vectors of the nonlinear polarization wave and the induced field wave
are, in the general case, different. For instance, in the case of second-harmonic
generation, the incident field at frequency ω with the wavevector k⃗(ω) will generate a wave of
second-order nonlinear polarization P⃗(2) at frequency 2ω. This polarization wave will
propagate with the wavevector 2k⃗(ω), but the induced electric field wave at frequency
2ω can only propagate with a certain wavevector k⃗(2ω), satisfying the dispersion
relation in the medium. In the presence of dispersion, the refractive index depends on the
frequency, and the wave vector is (cf. Eq. (4.14), which did not take dispersion
into account)
\[
k(\omega) = \frac{n(\omega)\omega}{c}. \tag{10.47}
\]
Therefore, in the presence of dispersion the nonlinear polarization and the electric
field will have different wavevectors, whose absolute values are 2k(ω) = 2n(ω)ω/c
and k(2ω) = 2n(2ω)ω/c, respectively. Unless n(ω) = n(2ω), the waves of the electric
field E⃗(r⃗, t) and the nonlinear polarization P⃗(2)(r⃗, t) will propagate with different phase
velocities and get out of phase at some point. Then the interaction becomes inefficient. This
explains, in simple terms, why nonlinear optical processes require phase matching. In
particular, for second-harmonic generation the phase matching requires the condition
For sum- and difference-frequency generation, the conditions are more complicated:
Each of the conditions (10.48), (10.49), (10.50) is impossible to satisfy under normal
dispersion, where n(ω) increases with ω. (For instance, in Eq. (10.50), the left-hand side
and right-hand side are of opposite signs.) As an example, Fig. 10.1 shows the dispersion
of ordinary and extraordinary refractive indices in lithium niobate crystal. If the
second harmonic has to be generated from the radiation at 1.064 μm (the wavelength
of a Nd:YAG laser), the refractive indices for the second-harmonic and fundamental
radiation will differ by almost 0.1, and the phase matching condition (10.48) will not
be satisfied.
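To get a feeling for how quickly the two waves run out of phase, one can estimate the distance over which the phase mismatch accumulates to π. This distance, the coherence length L_c = π/|Δk| with Δk = k(2ω) − 2k(ω), is a standard quantity that is not introduced in the text above. The sketch below uses hypothetical index values chosen only so that their difference is 0.1, as quoted for lithium niobate.

```python
import math

def coherence_length_shg(wavelength, n_fund, n_sh):
    """Coherence length L_c = pi / |dk| for second-harmonic generation.

    dk = k(2w) - 2k(w) = (4*pi/wavelength) * (n_sh - n_fund),
    where wavelength is the vacuum wavelength of the fundamental.
    """
    delta_k = 4 * math.pi * (n_sh - n_fund) / wavelength
    return math.pi / abs(delta_k)

# Hypothetical indices; only their difference (about 0.1) is taken from the text.
L_c = coherence_length_shg(wavelength=1.064e-6, n_fund=2.23, n_sh=2.33)
print(f"coherence length: {L_c * 1e6:.2f} um")
```

With these numbers the coherence length is a few micrometres, far shorter than any practical crystal, which is why the mismatch cannot simply be ignored.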
However, the phase matching conditions for second-harmonic, sum- and difference-
frequency generation, and sometimes even for third-harmonic generation can be satis-
fied by using crystal birefringence and requiring that the fields at different frequencies
should be polarized differently.
128 | 10 Polarization in nonlinear optics
Figure 10.2: Orientation of the wave vector (red) and the polar-
ization directions (green, blue) for second-harmonic generation
in a uniaxial crystal.
The cases (i) and (ii) are called type-I interaction, the cases (v) and (vi), type-II inter-
action, and the cases (iii) and (iv), type-0 interactions.
Some of these interactions can help to satisfy condition (10.48). For instance, for
second-harmonic generation from 1.064 μm to 0.532 μm in lithium niobate (Fig. 10.1), the
dispersion in the visible range is somewhat smaller than birefringence. We see that,
as it should be for normal dispersion, $n_o(\omega) < n_o(2\omega)$ and $n_e(\omega) < n_e(2\omega)$, so
type-0 interaction is impossible. However (see the two green points and the red point in
Fig. 10.1), because lithium niobate is a negative crystal, $n_e(2\omega) < n_o(\omega) < n_o(2\omega)$. By
properly choosing the angle ϑ to the optic axis one can make the effective index [see
Eq. (4.43) of Chapter 4],
\[
n(2\omega) = \left( \frac{\sin^2\vartheta}{n_e^2(2\omega)} + \frac{\cos^2\vartheta}{n_o^2(2\omega)} \right)^{-1/2}, \tag{10.51}
\]
take any value between $n_e(2\omega)$ and $n_o(2\omega)$.
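Then, one can make $n(2\omega) = n_o(\omega)$ (blue point in Fig. 10.1), and the phase matching
is satisfied for type-I interaction oo→e. From Eq. (10.51) we find that the angle
between the wave vector and the optic axis should be
\[
\sin^2\vartheta = \frac{n_o^{-2}(\omega) - n_o^{-2}(2\omega)}{n_e^{-2}(2\omega) - n_o^{-2}(2\omega)}. \tag{10.52}
\]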
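The phase-matching angle can also be found numerically. Setting n(2ω) = n_o(ω) in Eq. (10.51) and solving for sin²ϑ gives the expression used in the sketch below. The index values are hypothetical placeholders, chosen only to satisfy n_e(2ω) < n_o(ω) < n_o(2ω); for real crystals they should be taken from a Sellmeier fit such as that of Ref. [8].

```python
import math

def effective_index(theta, n_o, n_e):
    """Extraordinary index at angle theta to the optic axis, Eq. (10.51)."""
    return (math.sin(theta)**2 / n_e**2 + math.cos(theta)**2 / n_o**2) ** -0.5

def type1_ooe_angle(n_o_w, n_o_2w, n_e_2w):
    """Angle at which n(2w, theta) = n_o(w) for type-I oo->e phase matching."""
    sin2 = (n_o_w**-2 - n_o_2w**-2) / (n_e_2w**-2 - n_o_2w**-2)
    if not 0.0 <= sin2 <= 1.0:
        raise ValueError("no birefringent phase matching for these indices")
    return math.asin(math.sqrt(sin2))

# Hypothetical indices for a negative uniaxial crystal (not material data).
n_o_w, n_o_2w, n_e_2w = 2.232, 2.323, 2.230
theta_pm = type1_ooe_angle(n_o_w, n_o_2w, n_e_2w)
print(f"phase-matching angle: {math.degrees(theta_pm):.1f} deg")
```

The `ValueError` branch corresponds to the case where n_o(ω) lies outside the interval between n_e(2ω) and n_o(2ω), so that no angle satisfies the condition.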
2 Doping lithium niobate with magnesium (typically, 5 %) reduces the photorefractive effect and is
therefore widely used in nonlinear optics.
Now, with the geometry of a nonlinear optical process dictated by the phase match-
ing, and the nonlinear susceptibility tensor components determined by the symmetry
of the nonlinear material, we can find the effective value of the nonlinear suscepti-
bility. This can be done with the help of Eq. (10.45) [3]. In the example considered
above, lithium niobate is a class 3m crystal, and its d tensor has the form (10.29). Then
Eq. (10.45) takes the form
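\[
\begin{pmatrix} P_x^{(2)}(2\omega) \\ P_y^{(2)}(2\omega) \\ P_z^{(2)}(2\omega) \end{pmatrix}
= 2\epsilon_0
\begin{pmatrix} 0 & 0 & 0 & 0 & d_{15} & d_{16} \\ d_{16} & -d_{16} & 0 & d_{15} & 0 & 0 \\ d_{15} & d_{15} & d_{33} & 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} E_x^2(\omega) \\ E_y^2(\omega) \\ E_z^2(\omega) \\ 2E_y(\omega)E_z(\omega) \\ 2E_x(\omega)E_z(\omega) \\ 2E_x(\omega)E_y(\omega) \end{pmatrix}. \tag{10.53}
\]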
\[
\vec{E}(\omega) = E(\omega) \begin{pmatrix} -\sin\varphi \\ \cos\varphi \\ 0 \end{pmatrix}, \tag{10.54}
\]
where φ is the azimuthal angle of the incident wave (see Fig. 10.2).
The components of the nonlinear polarization (10.53) are then
\[
\vec{e} \equiv \begin{pmatrix} \cos\vartheta \cos\varphi \\ \cos\vartheta \sin\varphi \\ -\sin\vartheta \end{pmatrix}, \tag{10.56}
\]
where we have ignored the spatial walk-off and assumed that the electric field of the
extraordinary wave is orthogonal to the wavevector. The extraordinarily polarized electric
field will be determined by the projection of the nonlinear polarization at frequency
2ω (10.55) on the unit vector (10.56),
\[
P(2\omega)_{\mathrm{eff}} = -2\epsilon_0 \left[ d_{16} \sin(3\varphi) \cos\vartheta + d_{15} \sin\vartheta \right] E^2(\omega). \tag{10.57}
\]
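The chain of Eqs. (10.53), (10.54), (10.56) and (10.57) can be verified numerically. The sketch below builds the 3m matrix, inserts an ordinary-polarized fundamental field, projects the resulting polarization on the extraordinary unit vector, and compares with the closed form. The d values and angles are arbitrary placeholders rather than lithium niobate data, and ϵ0 is set to 1.

```python
import numpy as np

def d_matrix_3m(d15, d16, d33):
    """Contracted d matrix of class 3m in the form used in Eq. (10.53)."""
    return np.array([[0.0,  0.0,  0.0, 0.0, d15, d16],
                     [d16, -d16,  0.0, d15, 0.0, 0.0],
                     [d15,  d15,  d33, 0.0, 0.0, 0.0]])

def projected_polarization(d, theta, phi, E0=1.0, eps0=1.0):
    """Project P(2w) from Eq. (10.53) on the extraordinary unit vector (10.56)."""
    Ex, Ey, Ez = E0 * np.array([-np.sin(phi), np.cos(phi), 0.0])   # Eq. (10.54)
    products = np.array([Ex**2, Ey**2, Ez**2, 2*Ey*Ez, 2*Ex*Ez, 2*Ex*Ey])
    P = 2 * eps0 * d @ products
    e = np.array([np.cos(theta) * np.cos(phi),
                  np.cos(theta) * np.sin(phi),
                  -np.sin(theta)])
    return P @ e

def closed_form(d15, d16, theta, phi, E0=1.0, eps0=1.0):
    """Right-hand side of Eq. (10.57)."""
    return -2 * eps0 * (d16 * np.sin(3 * phi) * np.cos(theta)
                        + d15 * np.sin(theta)) * E0**2

d15, d16, d33 = -4.4, -2.1, -25.0     # placeholder values
theta, phi = 1.44, 0.37               # arbitrary angles in radians
numeric = projected_polarization(d_matrix_3m(d15, d16, d33), theta, phi)
analytic = closed_form(d15, d16, theta, phi)
```

Note that d33 drops out, because the ordinary wave has no field component along the optic axis.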
Similarly, one can find effective nonlinearities for any type of nonlinear interac-
tion. Expressions for deff for all symmetry classes and for all types of interaction can
be found in Refs. [3, 9].
This result can be generalized by recalling the whole procedure: we find the nonlinear
polarization vector by multiplying the $\hat{\chi}^{(2)}$ tensor by two vectors corresponding
to the pump polarization, $\vec{P} \propto \hat{\chi}^{(2)} : \vec{E}\vec{E}$. Then we project this vector on the unit vector
corresponding to the polarization direction of the ordinary or extraordinary wave in
the crystal: $\vec{P} \cdot \vec{e}$. As a result, the effective second-order susceptibility is
\[
\chi^{(2)}_{\mathrm{eff}} = \hat{\chi}^{(2)} \,\vdots\, \vec{e}_{2\omega} \vec{e}_{\omega} \vec{e}_{\omega}, \tag{10.59}
\]
where $\vec{e}_{2\omega}$ and $\vec{e}_{\omega}$ are unit vectors corresponding to the polarization directions
(ordinary or extraordinary) of the second-harmonic and incident waves, respectively, and the three
dots, as before, denote the multiplication of a tensor by three vectors. Then the effective
nonlinearity is related to the second-order susceptibility as $d_{\mathrm{eff}} = \frac{1}{2} \chi^{(2)}_{\mathrm{eff}}$.
Generalizing Eq. (10.59) to the case of an arbitrary three-wave interaction, we obtain
\[
d_{\mathrm{eff}} = \frac{1}{2} \hat{\chi}^{(2)} \,\vdots\, \vec{e}_1 \vec{e}_2 \vec{e}_3. \tag{10.60}
\]
One way to reduce the detrimental effect of spatial walk-off is to use, instead of a long
crystal, two crystals with optic axes tilted symmetrically. Figure 10.3 shows this
arrangement for the case of type-II second-harmonic generation [10]. To provide type-II
interaction, the fundamental wave (ω) should be polarized at 45° to the plane formed
by the optic axis (z) and the incident wave vector direction. Inside the crystal the
incident beam splits in two, the ordinary beam (o) propagating in the original direction,
and the extraordinary beam (e) tilted by the walk-off angle. If a single long crystal is
used (panel a), the two beams are strongly displaced from each other at the end of
the crystal, and the second harmonic (2ω) is generated only at the very beginning.
Instead, one can place two shorter crystals one after another (panel b), so that their optic
axes are both oriented at the phase matching angle ϑ to the incident wavevector, but
tilted symmetrically. Then the walk-off directions in the two crystals will be opposite,
and the shift of the extraordinary wave in the first crystal will be compensated for in
the second crystal. As a result, the range of efficient second-harmonic generation will
be increased by a factor of two.
Figure 10.3: Spatial walk-off in type-II second-harmonic generation (a) and its compensation by
using two shorter crystals with symmetric orientations of the optic axes (b).
10.3 The effect of spatial walk-off and its elimination | 133
As shown in Section 4.3, the amount of spatial walk-off depends on the angle ϑ
between the wavevector and the optic axis. It is maximal if ϑ is 45° but it is absent if ϑ
is 0° or 90°. Accordingly, a way to completely eliminate the walk-off effect is to find
phase matching at ϑ = 90°. For instance, type-I phase matching for second-harmonic
generation from 1.064 μm in lithium niobate (Section 10.2) occurs at an angle ϑ = 82.6°,
i. e., in the direction almost orthogonal to the optic axis. It turns out that by changing
the temperature of the crystal, this angle can be made exactly 90° [2]. This type of
phase matching, called non-critical phase matching, is possible because the ordinary
and extraordinary refractive indices of lithium niobate depend on the temperature
differently. Temperature-tunable non-critical phase matching is often used in lithium
niobate, KTP, lithium triborate (LBO) and several other crystals. Unlike the compensa-
tion method considered above, it enables complete elimination of the walk-off effect.
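The angular dependence of the walk-off can be sketched numerically. The expression used below, tan ρ = (n²(ϑ)/2)(n_e⁻² − n_o⁻²) sin 2ϑ, is the commonly quoted walk-off formula for a uniaxial crystal and follows from tan ρ = −(1/n) dn/dϑ applied to the effective index; it is assumed here rather than copied from Section 4.3, and the index values are hypothetical.

```python
import numpy as np

def effective_index(theta, n_o, n_e):
    """Extraordinary index at angle theta to the optic axis."""
    return (np.sin(theta)**2 / n_e**2 + np.cos(theta)**2 / n_o**2) ** -0.5

def walkoff_angle(theta, n_o, n_e):
    """Walk-off angle rho (radians); assumed formula, sign convention arbitrary."""
    n = effective_index(theta, n_o, n_e)
    return np.arctan(0.5 * n**2 * (n_e**-2 - n_o**-2) * np.sin(2 * theta))

n_o, n_e = 2.232, 2.156                        # hypothetical indices
theta = np.linspace(0.0, np.pi / 2, 9001)      # angles from 0 to 90 degrees
rho = walkoff_angle(theta, n_o, n_e)

theta_max_deg = np.degrees(theta[np.argmax(np.abs(rho))])
rho_max_deg = np.degrees(np.max(np.abs(rho)))
print(f"largest walk-off {rho_max_deg:.2f} deg at theta = {theta_max_deg:.1f} deg")
```

For weak birefringence the maximum sits close to 45°, and the walk-off vanishes at 0° and 90°, in line with the statement above.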
10.3.3 Quasi-phasematching
Bibliography
[1] D. J. Armstrong, W. J. Alford, T. D. Raymond, A. V. Smith, and M. S. Bowers. Parametric
amplification and oscillation with walkoff-compensating crystals. J. Opt. Soc. Am. B, 1997.
[2] R. W. Boyd. Nonlinear optics. Academic Press, 2008.
[3] V. G. Dmitriev, G. G. Gurzadyan, and D. N. Nikogosyan. Handbook of nonlinear crystals.
Springer, 1999.
[4] F. Gravier and B. Boulanger. Cubic parametric frequency generation in rutile single crystal. Opt.
Express, 14(24):11715–11720, Nov 2006.
[5] J. E. Midwinter and J. Warner. The effects of phase matching method and of crystal symmetry
on the polar dependence of third order non-linear optical polarization. Br. J. Appl. Phys.,
16:1667–1674, 1965.
[6] G. New. Introduction to nonlinear optics. Cambridge University Press, 2011.
[7] A. Penzkofer, F. Ossig, and P. Qiu. Picosecond third-harmonic light generation in calcite. Appl.
Phys. B, 47:71–81, 1988.
[8] D. E. Zelmon, D. L. Small, and D. Jundt. Infrared corrected Sellmeier coefficients for congruently
grown lithium niobate and 5 mol. % magnesium oxide-doped lithium niobate. J. Opt. Soc. Am.
B, 14(12):3319–3322, Dec 1997.
[9] F. Zernike and J. E. Midwinter. Applied nonlinear optics. John Wiley and Sons, 1973.
[10] J.-J. Zondy, M. Abed, and S. Khodja. Twin-crystal walk-off-compensated type-II second-harmonic
generation: single-pass and cavity-enhanced experiments in KTiOPO4. J. Opt. Soc. Am. B,
11(12):2368–2379, Dec 1994.
11 Quantum description of polarization
In this chapter we will re-consider the description of polarization used so far through-
out the book. In quantum optics, like generally in quantum mechanics, every physi-
cal observable corresponds to some operator. Accordingly, in this chapter, instead of
the Stokes observables we will introduce the Stokes operators. Unlike classical Stokes
observables, they will be defined not in terms of intensities, but in terms of photon-
number operators. Instead of electric fields, which were the basic classical observables
of the previous chapters, we will now use electric field operators, and further, photon-
creation and -annihilation operators. All these operators will act on the quantum states
of polarized light. In this chapter, we will mainly consider single-photon polarized
states; more complicated states of polarized light will be the subject of Chapter 12.
The reader of this chapter is expected to have studied quantum mechanics, but not
necessarily quantum optics. Therefore, the basic notions and instruments of quantum
optics will be briefly introduced here from the polarization point of view.
https://doi.org/10.1515/9783110668025-011
136 | 11 Quantum description of polarization
it [15]. By assuming that the field distribution is periodic in all three Cartesian
coordinates x, y, z, with the period given by the box size L, we obtain the condition that only
a discrete set of wavevectors is allowed:
\[
\vec{k} \equiv \{k_x; k_y; k_z\} = \frac{2\pi}{L} \{l; m; n\}, \tag{11.1}
\]
where l, m, n are integer numbers.
These discrete modes are shown in Fig. 11.1. To specify such a mode, we need to
fix three components of the wavevector. Alternatively, we can specify the direction of
the wavevector and its absolute value, related to the frequency ω via the dispersion
dependence, k = n(ω)ω/c.
But according to Chapter 3, for every such mode, i. e., for every direction and absolute
value of the wavevector, there are also two polarization states. Then for completeness,
one should specify a mode by four, rather than three numbers,
where σ denotes the polarization and can take two values. Depending on the situation,
the two polarization modes can be linear, or circular, or elliptic, but they should
be orthogonal. Which polarization basis to use depends on the specific problem: for
instance, in a crystal, it is more convenient to use linearly polarized modes corresponding
to the ordinary and extraordinary beams. In an optically active material, the convenient
polarization mode basis will comprise left- and right-circularly polarized modes, L and
R. But in the laboratory frame of reference, the most convenient are horizontally and
vertically polarized modes. In what follows, we will usually consider one wavevector mode
and two polarization modes: H and V, or D and A, or L and R.
The complex fields in classical optics, the negative-frequency one E (−) [Eq. (2.2)]
and the positive-frequency one, E (+) [Eq. (2.3)], in quantum optics become operators,
Ê (−) and Ê (+) , respectively. These operators are, in turn, written as superpositions over
modes,
\[
\hat{E}^{(+)}(\vec{r}, t) = \sum_{\vec{k}} \hat{E}^{(+)}_{\vec{k}} \, e^{i\vec{k}\cdot\vec{r} - i\omega_{\vec{k}} t}, \tag{11.3}
\]
11.1 Basic notions of quantum optics | 137
Summation over wavevectors k⃗ implies summation over the indices l, m, n [see (11.1)]
and two polarization states. Similarly, the negative-frequency field operator is Hermitian
conjugated to the positive-frequency one. We obtain
\[
\hat{E}^{(+)}(\vec{r}, t) = \sum_{\vec{k}} c_{\vec{k}} \, a_{\vec{k}} \, e^{i\vec{k}\cdot\vec{r} - i\omega_{\vec{k}} t},
\]
\[
\hat{E}^{(-)}(\vec{r}, t) = \sum_{\vec{k}} c^{*}_{\vec{k}} \, a^{\dagger}_{\vec{k}} \, e^{-i\vec{k}\cdot\vec{r} + i\omega_{\vec{k}} t}, \tag{11.4}
\]
where $a^{\dagger}_{\vec{k}}$ is the photon-creation operator in mode k⃗. Being dimensionless, photon-creation
and -annihilation operators are more convenient than electric field operators.
It follows from the Maxwell equations [23] that each mode of the electromagnetic
field is similar to a harmonic oscillator and, as such, can be populated by a certain
number of quanta (photons). The energy operator, or Hamiltonian, of each mode can
be written as
\[
\hat{\mathcal{H}}_{\vec{k}} = \hbar\omega_{\vec{k}} \left( a^{\dagger}_{\vec{k}} a_{\vec{k}} + \frac{1}{2} \right). \tag{11.5}
\]
Throughout this chapter and Chapter 12, the focus of our discussion will be on the po-
larization modes, usually H and V. Accordingly, there will be two photon-annihilation
operators, aH,V and their Hermitian conjugates a†H,V .
These operators are not Hermitian: a†H,V ≠ aH,V . Photon creation and annihilation
operators for each mode do not commute:
where 0 ≤ φ < π.
Photon creation and annihilation operators form the photon-number operators
for each mode:
N̂ H† = N̂ H , N̂ V† = N̂ V . (11.12)
As any Hermitian operators, they correspond to real observables, which can be mea-
sured. Indeed, according to Eq. (2.6), the classical counterparts of photon-number op-
erators N̂ H,V are intensities in the polarization modes.
Quantum states are often introduced in quantum mechanics as eigenstates of var-
ious operators. Here we will briefly describe the states further used in this book. Note
that because the two polarization modes are orthogonal, in the simplest case a state
|Ψ⟩ is a direct product of states in the two polarization modes: |Ψ⟩ = |ΨH ⟩H ⊗ |ΨV ⟩V ,
or simply |Ψ⟩ = |ΨH ⟩H |ΨV ⟩V .
Fock states. The eigenvalues of photon-number operators N̂ H,V are non-negative
integer numbers NH,V , and their eigenstates are so-called Fock states, or number
states:
(We denote, as is common in quantum mechanics, the operator, its eigenvalue, and
its eigenstate by the same character.) Fock states are states in which the number of
photons populating a mode is fixed and does not fluctuate. In practice, among all Fock
states, only single-photon and two-photon ones can be prepared in laboratories in a
relatively simple way.
Fock states of each mode form a complete orthonormal basis, and any state can
be decomposed over this basis. The possibility of such a decomposition leads to the
decomposition of the identity operator:
\[
\hat{1} = \sum_{N} |N\rangle\langle N|, \tag{11.14}
\]
These so-called ladder equations describe the transitions between different states of a
harmonic oscillator populated by a fixed number of quanta.
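A convenient way to experiment with these relations is to represent the operators as matrices in a truncated Fock basis. The sketch below assumes the standard ladder relations a|N⟩ = √N |N−1⟩ and a†|N⟩ = √(N+1) |N+1⟩ (the corresponding equations are not reproduced above); the basis size `dim` is an arbitrary choice, and the truncation makes the commutator deviate from unity in the last basis state only.

```python
import numpy as np

def annihilation(dim):
    """Matrix of the annihilation operator in the Fock basis |0>, ..., |dim-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def fock(n, dim):
    """Column of amplitudes representing the Fock state |n>."""
    state = np.zeros(dim)
    state[n] = 1.0
    return state

dim = 6
a = annihilation(dim)
a_dag = a.conj().T
number_op = a_dag @ a                      # photon-number operator N = a^dagger a

lowered = a @ fock(3, dim)                 # sqrt(3) |2>
raised = a_dag @ fock(3, dim)              # sqrt(4) |4>
commutator = a @ a_dag - a_dag @ a         # identity, except for the truncation edge
```

The diagonal of `number_op` reproduces the eigenvalues 0, 1, 2, … of the photon-number operator.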
Coherent states are defined as the eigenstates of photon-annihilation operators
aH,V :
Similarly,
The uncertainty of an observable is estimated as the square root of its variance. Ac-
cordingly, the uncertainty given by the shot noise is ΔNcoh = √⟨N⟩coh .
The variances of the quadrature operators are also nonzero in a coherent state.
From the definitions of the quadrature operators (11.9), it follows that
\[
\mathrm{Var}(q_{\mathrm{coh}}) = \mathrm{Var}(p_{\mathrm{coh}}) = \frac{1}{4}. \tag{11.21}
\]
Therefore, any coherent state has the same uncertainty of any quadrature: Δq = Δp =
Δqφ = 1/2. The mean values of the quadratures in a coherent state are ⟨α|q̂|α⟩ = Re{α}
and ⟨α|p̂|α⟩ = Im{α}.
A real-world coherent state differs from this idealized picture. First, it has in-
evitable phase fluctuations due to the phase drift of the laser. This problem can be
overcome in experiments by using the same laser as the source of a quantum state
and as a reference. Because what matters is the relative phase between the state and
the reference, the phase drift does not affect the measurement. The second problem
is the excess noise in the number of photons, which makes the uncertainty in the
photon number larger than the shot noise. The photon-number variance due to the
excess noise scales quadratically with the mean photon number; due to this fact, the
role of excess noise reduces as the mean photon number decreases. Therefore, no
laser is shot-noise limited, i. e., has the photon-number uncertainty given by the shot
noise; however, any laser can be made shot-noise limited by sufficiently reducing its
intensity.
The vacuum state |0⟩ belongs to both the Fock states and the coherent states. As
a Fock state, it has the eigenvalue N = 0, and this photon number does not fluctu-
ate. Meanwhile, as a coherent state, it has nonzero uncertainties of the quadratures.
This shows that a quantum vacuum is not just ‘nothing’: despite having zero mean
number of photons and zero mean field, it has nonzero field fluctuations (zero-point
fluctuations of the electromagnetic vacuum).
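The quadrature and photon-number statistics of coherent states, including the vacuum as the special case α = 0, can be checked in a truncated Fock basis. The sketch assumes the quadrature definitions q̂ = (a + a†)/2 and p̂ = (a − a†)/(2i), which are consistent with the variance 1/4 and the mean values quoted above; the value of α and the basis size are arbitrary.

```python
import math
import numpy as np

def coherent_state(alpha, dim):
    """Fock-basis amplitudes exp(-|alpha|^2/2) * alpha^n / sqrt(n!) for n < dim."""
    amplitudes = [alpha**n / math.sqrt(math.factorial(n)) for n in range(dim)]
    return math.exp(-abs(alpha)**2 / 2) * np.array(amplitudes, dtype=complex)

def mean(op, psi):
    """Expectation value <psi|op|psi> of a Hermitian operator."""
    return np.vdot(psi, op @ psi).real

def variance(op, psi):
    return mean(op @ op, psi) - mean(op, psi)**2

dim = 40                                        # truncation, large compared with |alpha|^2
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)    # annihilation operator
q = (a + a.conj().T) / 2                        # assumed quadrature definitions
p = (a - a.conj().T) / 2j
n_op = a.conj().T @ a

alpha = 1.5 + 0.8j
psi = coherent_state(alpha, dim)
vac = coherent_state(0.0, dim)                  # vacuum state |0>

print(mean(q, psi), mean(p, psi))               # close to Re(alpha), Im(alpha)
print(variance(q, psi), variance(q, vac))       # both close to 1/4
print(mean(n_op, psi), variance(n_op, psi))     # both close to |alpha|^2 (shot noise)
```

The vacuum has zero mean photon number and zero mean field, yet the same quadrature variance 1/4 as any other coherent state.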
Squeezed states are formally defined as the eigenstates of the operator μa +
νa† [30]. More commonly, they are known as the states in which the uncertainty
of one quadrature is smaller than of the other one, for instance, Δq < Δp. At the
same time, these are minimal-uncertainty states, i. e., the product of their quadrature
uncertainties is ΔqΔp = 1/4, as in the case of coherent states. It follows that the uncer-
tainty of some quadrature q̂ φ is smaller than the shot-noise uncertainty, for instance,
Δq < 1/2. One says that the quantum fluctuations are squeezed for this quadrature and
anti-squeezed for the conjugated one, Δp > 1/2. The reduced noise in the squeezed
quadrature makes squeezed states useful for metrology. In particular, a squeezed-
vacuum state has ⟨q̂⟩ = ⟨p̂⟩ = 0, but unequal uncertainties of different quadratures.
Due to the increased noise in the anti-squeezed quadrature, a squeezed vacuum state
has a mean number of photons that is not only nonzero, but sometimes very large.
To characterize and distinguish various quantum states, several instruments are used.
Here we will consider two of them: Glauber’s correlation functions and the Wigner
function. Further, we will apply these instruments to characterize nonclassical states
of polarized light.
Glauber’s correlation functions of order n describe n-photon absorption. In par-
ticular, they determine the outcome of the Hanbury Brown–Twiss experiment [15, 23]
where an incident beam is split, in general, into n beams, with a detector in each beam,
and the simultaneous photocounts of these detectors are registered. The photocount
coincidence rate or, in the case of bright light, the correlation of the photocurrents, is
given by the normally ordered nth-order correlation function. Formally, it is defined
as [15]
where $\vec{r}_1, \dots, \vec{r}_n$ and $t_1, \dots, t_n$ are the positions and times at which the detectors measure
and the averaging is over the quantum state. Note that the definition assumes nor-
mal ordering, i. e., all negative-frequency operators standing on the left and positive-
frequency operators, on the right.
Because correlation functions (11.22) depend on the mean number of photons, it
is convenient to normalize them as
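\[
g^{(2)}(0) = \frac{\langle (a^{\dagger})^2 a^2 \rangle}{\langle a^{\dagger} a \rangle^2} \equiv \frac{\langle : \hat{N}^2 : \rangle}{\langle \hat{N} \rangle^2}. \tag{11.25}
\]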
The normal ordering is omitted here because the operators of orthogonal polarization
modes commute.
where Rc is the coincidence rate and R1 , R2 are the rates of counts in the two
detectors [23]. The time delay τ can be introduced electronically and the spatial
displacement ρ⃗, by shifting one of the detectors. In the case of pulsed light, the equation for
calculating $g^{(2)}$ is modified and, in general, contains both the coincidence resolution
Tc and the pulse duration. However, if the pulse is much shorter than Tc , the bunching
parameter can be calculated using a simplified formula [11]:
\[
g^{(2)}(0) = \frac{N_c}{N_1 N_2}, \tag{11.28}
\]
where Nc is the mean number of coincidences per pulse and N1 , N2 are the mean num-
bers of photocounts per pulse in the two detectors.
The denominators in Eqs. (11.27) and (11.28) are equal, respectively, to the rate
and mean number per pulse of coincidences in the case where detectors 1, 2 regis-
ter light from independent sources (accidental coincidences). For coherent light, the
arrival of each photon is independent of the others, all coincidences are accidental,
and g (2) (0) = 1. For single-mode thermal light, the number of coincidences is twice as
large, yielding g (2) (0) = 2. This result of the Hanbury Brown–Twiss experiment [15, 23]
was interpreted as ‘bunching’ of photons [21] in thermal light—this is where the term
‘bunching parameter’ comes from. But this result also follows from the classical de-
scription of intensity fluctuations in thermal light [15]. What indeed requires a quan-
tum description is anti-bunching, i. e., the case of g (2) (0) < 1. Anti-bunching of photons
can be observed for Fock states; in particular, for a single-photon state g (2) (0) = 0.
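These three values can be reproduced from the photon-number distributions alone. For a single mode, normal ordering gives ⟨(a†)²a²⟩ = ⟨N(N − 1)⟩, so that g(2)(0) = ⟨N(N − 1)⟩/⟨N⟩². The sketch below applies this to a Poissonian (coherent), a Bose-Einstein (single-mode thermal) and a Fock distribution; the mean photon number and the truncation `n_max` are arbitrary choices.

```python
import numpy as np

def g2_zero(prob):
    """g2(0) = <N(N-1)> / <N>^2 for a photon-number distribution prob[n]."""
    n = np.arange(len(prob))
    mean_n = np.sum(n * prob)
    return np.sum(n * (n - 1) * prob) / mean_n**2

def poisson(mean_n, n_max):
    """Poissonian distribution (coherent state), built recursively to avoid overflow."""
    p = np.empty(n_max)
    p[0] = np.exp(-mean_n)
    for k in range(1, n_max):
        p[k] = p[k - 1] * mean_n / k
    return p

def thermal(mean_n, n_max):
    """Bose-Einstein distribution of single-mode thermal light."""
    n = np.arange(n_max)
    return mean_n**n / (1.0 + mean_n)**(n + 1)

def fock(n_photons, n_max):
    """Distribution of a Fock state with exactly n_photons photons."""
    p = np.zeros(n_max)
    p[n_photons] = 1.0
    return p

n_max, mean_n = 200, 2.0
g2_coherent = g2_zero(poisson(mean_n, n_max))   # close to 1
g2_thermal = g2_zero(thermal(mean_n, n_max))    # close to 2
g2_single = g2_zero(fock(1, n_max))             # exactly 0
g2_two = g2_zero(fock(2, n_max))                # 1 - 1/N = 0.5
```

For a Fock state with N photons the same formula gives g(2)(0) = 1 − 1/N, which is below unity for any N.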
For measuring the cross-correlation function (11.26), the beamsplitter in Fig. 11.2
should be a polarizing one. The Hanbury Brown–Twiss setup can also be extended to
the general case of measuring the nth-order correlation function. For this, one should
use n detectors after a sufficient number of beamsplitters, and register n-fold, instead
of two-fold, coincidences.
Quasi-probabilities. In order to describe a state in terms of quadratures q and p,
i. e., in the phase space, it would be very convenient to have some joint probability
distribution P(q, p). But because the quadrature operators do not commute, their joint
probability distribution is unphysical. Several quasi-probabilities can be introduced,
but in each case there is a price to pay: the quasi-probabilities violate certain rules
that normal probabilities should obey.
In classical probability theory, a probability distribution has a Fourier transform,
called the characteristic function. Similarly, quantum quasi-probabilities can be de-
fined as Fourier transforms of certain characteristic functions. The normally ordered
characteristic function is defined as [23]
where w is a complex number and the averaging is over the quantum state to be char-
acterized. The Fourier transform of Cn (w) is the Glauber–Sudarshan quasi-probability,
or P-distribution:
\[
P(z) = \frac{1}{\pi^2} \int d^2 w \, C_n(w) \, e^{-w z^{*} + w^{*} z}, \tag{11.30}
\]
with z = q + ip being a complex number. The P-distribution has the meaning of the
density matrix in the coherent-state representation. It can be singular or negative for
some states—this is why it is not a true probability distribution. In fact, its negativity is
a criterion of nonclassicality: by definition, a nonclassical state is one that has a nega-
tive Glauber–Sudarshan quasi-probability P(q, p). The negativity of the P-distribution
means that a state cannot be described in terms of classical statistical optics [30]. How-
ever, because P(q, p) can be singular, it cannot be measured directly. For the measure-
ment, most convenient is the Wigner function.
The Wigner function is defined as the two-dimensional Fourier transform of the
symmetrized characteristic function [23]:
\[
W(z) = \frac{1}{\pi^2} \int d^2 w \, C_s(w) \, e^{-w z^{*} + w^{*} z}. \tag{11.31}
\]
The Wigner function cannot be singular, but it can be negative. From the negativity of
the Wigner function, the negativity of the P-distribution follows; in other words, the
negativity of the Wigner function is a sufficient condition for nonclassicality.
The most important property of the Wigner function is that it can be used for calcu-
lating the mean values and moments of a quadrature in the same manner as a ‘normal’
probability distribution:
This feature means that the marginal distribution Wq (q) ≡ ∫ W(q, p)dp is a true prob-
ability distribution. The same property is valid for the marginal distribution of any
generalized quadrature qφ ≡ q cos φ + p sin φ. It also provides a way to measure the
Wigner function in experiment [20].
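The marginal property can be illustrated with the simplest example, a coherent state. The sketch below assumes that, in the normalization where Var(q) = Var(p) = 1/4, the Wigner function of a coherent state |α⟩ is the Gaussian W(q, p) = (2/π) exp[−2(q − Re α)² − 2(p − Im α)²]; this explicit form is not given above and is used here only as an illustration. The integration grid is an arbitrary choice.

```python
import numpy as np

def wigner_coherent(q, p, alpha):
    """Assumed Gaussian Wigner function of a coherent state (variance 1/4 convention)."""
    return (2 / np.pi) * np.exp(-2 * (q - alpha.real)**2 - 2 * (p - alpha.imag)**2)

alpha = 1.0 + 0.5j
q = np.linspace(-5.0, 7.0, 701)
p = np.linspace(-5.5, 6.5, 701)
dq, dp = q[1] - q[0], p[1] - p[0]

Q, P = np.meshgrid(q, p, indexing="ij")
W = wigner_coherent(Q, P, alpha)

# Marginal distribution W_q(q): integrate over p (simple Riemann sum).
marginal_q = W.sum(axis=1) * dp

norm = marginal_q.sum() * dq
mean_q = (q * marginal_q).sum() * dq
var_q = ((q - mean_q)**2 * marginal_q).sum() * dq
print(norm, mean_q, var_q)   # close to 1, Re(alpha), 1/4
```

The marginal is non-negative and normalized, and its mean and variance agree with the quadrature statistics of the coherent state, as expected of a true probability distribution.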
Measurement of the Wigner function is performed through balanced homodyne de-
tection (Fig. 11.3). A state to be characterized is sent to a beamsplitter, as in the Han-
bury Brown–Twiss experiment, but now another state is sent into the second input
port, namely, a strong coherent state known as the local oscillator [2]. Importantly,
the beamsplitter should be perfectly balanced (50 %), and the local oscillator should
be much brighter than the state under study. The detectors should then not be count-
ing photons but registering strong photon fluxes. At the output, their photocurrents
i1,2 should be subtracted.
The difference photocurrent, i− ≡ i1 − i2 , is then proportional to the quadrature of
the input state,
i− = 2ηα0 qφ (11.34)
where η is the quantum efficiency of the detectors (assumed to be the same), α0 the
amplitude of the local oscillator, and φ its phase. Equation (11.34) can be used to mea-
sure the mean value and all statistical moments of the quadrature qφ , by analyzing
the probability distribution of the difference photocurrent i− . Such distributions can
be acquired for a set of quadratures {qφ } with different φ, and then, using the property
(11.33), one can reconstruct the Wigner function through the inverse Radon transfor-
mation or some other method. This procedure is called the Wigner-function tomogra-
phy.
It is worth noting that the scheme in Fig. 11.3 resembles the Stokes measurement
setup (Fig. 5.7), in that the value of interest is obtained by subtracting the photocur-
rents of two detectors. This analogy will be further developed in Section 11.3.
Observable signs of nonclassicality. Although the rigorous definition of nonclassi-
cal light is in terms of the P-distribution and cannot be applied in experiment, there are
many observable features that follow from the negativity of the P-function and there-
fore are sufficient conditions for nonclassicality. Some of them were briefly mentioned
above, but here we present a more complete (but not exhaustive) list of nonclassicality
signs used in experiment.
1. Anti-bunching, g (2) (0) < 1. This feature, already mentioned above, is equivalent
to another property, namely, sub-Poissonian statistics. Because the bunching param-
eter is related to the variance and mean of the photon number as
\[ g^{(2)}(0) - 1 = \frac{\mathrm{Var}(N) - \langle N\rangle}{\langle N\rangle^{2}}, \tag{11.35} \]
anti-bunching means that the variance is less than the mean, Var(N) < ⟨N⟩. However,
experimental conditions for observing anti-bunching and sub-Poissonian statistics
are different: according to Eq. (11.35), anti-bunching is easier to detect for faint light,
⟨N⟩ < 1, while sub-Poissonian statistics is better noticeable for bright light, ⟨N⟩ ≫ 1.
2. Anti-bunching can be generalized to the condition involving higher-order nor-
malized correlation functions g (n) ≡ g (n) (0) [13]. The resulting sufficient conditions for
nonclassicality are
\[ \frac{g^{(n-1)}\, g^{(n+1)}}{[g^{(n)}]^{2}} < 1. \tag{11.36} \]
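Both criteria can be checked directly on a photon-number distribution. The sketch below assumes single-mode light, for which g^(n)(0) is the n-th factorial moment divided by ⟨N⟩^n; the helper names (`g_n`, `nonclassicality_ratio`, `poisson`, `thermal`) are hypothetical.

```python
import numpy as np
from math import exp, factorial

def g_n(probs, n):
    """Normalized correlation function g^(n)(0) for a distribution probs[N]."""
    probs = np.asarray(probs, dtype=float)
    N = np.arange(len(probs))
    falling = np.ones(len(probs))
    for k in range(n):                 # N(N-1)...(N-n+1)
        falling *= (N - k)
    mean = np.sum(probs * N)
    return np.sum(probs * falling) / mean**n

def nonclassicality_ratio(probs, n):
    """Left-hand side of Eq. (11.36); a value below 1 signals nonclassicality."""
    return g_n(probs, n - 1) * g_n(probs, n + 1) / g_n(probs, n) ** 2

def poisson(mu, nmax=80):
    return np.array([exp(-mu) * mu**k / factorial(k) for k in range(nmax)])

def thermal(mu, nmax=400):
    k = np.arange(nmax)
    return (mu / (1.0 + mu)) ** k / (1.0 + mu)

if __name__ == "__main__":
    fock2 = np.array([0.0, 0.0, 1.0])  # two-photon Fock state
    for name, p in [("Poisson", poisson(2.0)), ("thermal", thermal(2.0)), ("Fock |2>", fock2)]:
        print(name, g_n(p, 2), nonclassicality_ratio(p, 2))
```

For the two-photon Fock state both conditions hold, while the thermal and Poissonian distributions satisfy neither, as expected for classical light.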
146 | 11 Quantum description of polarization
Otherwise, the state |Ψ⟩1,2 is called separable. A mixed state ρ12 is called separable if
its density matrix can be represented as a convex sum of factorizable states ρ1 , ρ2 in
modes 1, 2; otherwise it is called inseparable.
To certify entanglement in experiment, there are various witnesses and measures,
which, however, will not be used in this book.
The Stokes operators are Hermitian, by definition, and they correspond to real observ-
ables. Furthermore, we will consider the measurement of these observables, but from
now on we will use the term ‘Stokes parameters’ exclusively for their mean values. As
in classical optics, it is worth introducing a Stokes operator of a general form,
\[ \hat S(\vartheta,\varphi) \equiv \hat S_1 \cos\vartheta + \hat S_2 \sin\vartheta\cos\varphi + \hat S_3 \sin\vartheta\sin\varphi. \tag{11.42} \]
Because this definition is insufficient to describe certain effects, other definitions have
been proposed [17]. We will consider them in Chapter 12.
Generally, the Stokes operators (11.41) do not commute. Their commutation relations
are obtained from definitions (11.41), with the help of the rules
These commutation relations resemble the ones of the Pauli operators. This again
points at the analogy between polarized photons and a spin 1/2 particle, which was
already mentioned in Chapter 6. In what follows, we will further develop this analogy.
From Eqs. (11.45), the uncertainty relations follow. Indeed, one can show [19] that
the uncertainties of non-commuting operators Â and B̂, defined as ΔA ≡ √⟨(Â − ⟨Â⟩)²⟩
and ΔB ≡ √⟨(B̂ − ⟨B̂⟩)²⟩, satisfy the condition
\[ \Delta A\,\Delta B \ge \frac{1}{2}\,\bigl|\langle[\hat A,\hat B]\rangle\bigr|. \tag{11.46} \]
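Relation (11.46) can be verified numerically for a single photon. The sketch below assumes that, in the basis {|1⟩H, |1⟩V}, the Stokes operators Ŝ1 and Ŝ2 act as the Pauli matrices σz and σx; this representation and the function names are assumptions made for illustration.

```python
import numpy as np

# Assumed single-photon representation in the basis {|1>_H, |1>_V}
S1 = np.array([[1, 0], [0, -1]], dtype=complex)
S2 = np.array([[0, 1], [1, 0]], dtype=complex)

def mean(op, psi):
    """Expectation value <psi|op|psi> for a normalized state psi."""
    return np.vdot(psi, op @ psi)

def uncertainty(op, psi):
    m = mean(op, psi).real
    return np.sqrt(max(mean(op @ op, psi).real - m**2, 0.0))

def robertson_gap(A, B, psi):
    """dA*dB - |<[A,B]>|/2, which Eq. (11.46) requires to be non-negative."""
    commutator = A @ B - B @ A
    return uncertainty(A, psi) * uncertainty(B, psi) - 0.5 * abs(mean(commutator, psi))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)
    print("gap for a random state:", robertson_gap(S1, S2, psi))
```

For a circularly polarized photon the gap vanishes, so the inequality is saturated in that case.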
This identity leads to another inequality for the Stokes observables. Indeed, the
definition (11.43) of the degree of polarization can be rewritten as
Subtracting this equation from Eq. (11.48), averaging and using the conditions 0 ≤ P ≤
1, Var(Ŝ0 ) ≥ 0, we obtain the relation
In quantum optics, to measure the mean value of some observable A means to find
the average of the corresponding operator  over a state. If the averaging is over a
pure state |Ψ⟩, it is defined as
\[ \langle\hat A\rangle \equiv \langle\Psi|\hat A|\Psi\rangle. \tag{11.50} \]
11.2 Stokes observables | 149
The averaging over a mixed state with the density matrix ρ̂ is written as
\[ \langle\hat A\rangle \equiv \mathrm{Tr}(\hat A\hat\rho). \tag{11.51} \]
Further, we will use the last notation for brevity. With the normalization condition
|α|2 + |β|2 = 1, the state (11.52) describes a single photon that is ‘spread’ over the two
polarization modes H, V. The state can be also represented as a two-component vec-
tor,
\[ |\Psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \tag{11.53} \]
which resembles the Jones vector (3.4) we considered in Chapter 3. In particular, the
common phase of the coefficients α and β plays no role now.
The state (11.53) represents a general state of a qubit, the state of a quantum system
with two eigenstates. This can be a two-level atom, a spin 1/2 particle like an electron,
or—as we see—a polarized photon. The latter, as a result, can represent any of these
other quantum systems.
Some particular cases of a polarized single photon are as follows. The states
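\[ |1\rangle_D = \frac{1}{\sqrt2}\bigl(|1\rangle_H + |1\rangle_V\bigr), \qquad |1\rangle_A = \frac{1}{\sqrt2}\bigl(|1\rangle_H - |1\rangle_V\bigr) \tag{11.55} \]
\[ |1\rangle_L = \frac{1}{\sqrt2}\bigl(|1\rangle_H + i|1\rangle_V\bigr), \qquad |1\rangle_R = \frac{1}{\sqrt2}\bigl(|1\rangle_H - i|1\rangle_V\bigr) \tag{11.56} \]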
measurement) [24]. For instance, above we saw that a Fock state is an eigenstate of the
photon-number operator. A generic Hermitian operator Â always has a complete
orthonormal set of eigenstates: Â|An⟩ = An|An⟩. Any state |Φ⟩ can be decomposed over
this set,
\[ |\Phi\rangle = \sum_n c_n |A_n\rangle, \tag{11.57} \]
where cn = ⟨An |Φ⟩. The mean value of  over |Φ⟩ is found according to Eq. (11.50),
\[ \langle\hat A\rangle = \langle\Phi|\hat A|\Phi\rangle = \sum_n |c_n|^2 A_n, \tag{11.58} \]
where we used the orthogonality of the eigenstates |An ⟩, ⟨An |Am ⟩ = δmn . The nth term
in Eq. (11.58) is the probability that the state is |An ⟩, Pn ≡ |cn |2 = |⟨Φ|An ⟩|2 , times
the value of the operator  in this state. This expression is perfectly clear from the
viewpoint of the probability theory: with the probability Pn , observable A takes the
value An .
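The same result is achieved using the decomposition of the identity over the
eigenstates of Â. Indeed, by plugging the decomposition
\[ \sum_n |A_n\rangle\langle A_n| = \hat 1 \tag{11.59} \]
into Eq. (11.50), we obtain
\[ \langle\hat A\rangle = \langle\Phi| \sum_n \hat A|A_n\rangle\langle A_n|\Phi\rangle = \sum_n A_n P_n. \tag{11.60} \]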
Substituting (11.52) and taking into account that N̂ H |1⟩H = |1⟩H , N̂ V |1⟩V = |1⟩V , we get
Because the Fock states |1⟩H and |1⟩V belong to orthogonal modes, the factors in
front of them should be zero. From this, we find two possibilities: either s1 = 1 and
β = 0, or s1 = −1 and α = 0.
We obtained the result that the eigenstates of Ŝ1 are horizontally polarized single
photon and vertically polarized single photon, and the corresponding eigenvalues are
+1 and −1.
For the second Stokes operator Ŝ2 , the eigenvalue problem leads to the equation
To solve this equation, we notice that the second Stokes operator converts hori-
zontally and vertically polarized photons into each other: Ŝ2 |1⟩H ≡ (a†H aV +
a†V aH )|1⟩H = |1⟩V , and similarly Ŝ2 |1⟩V = |1⟩H . Then the equation becomes
Again, requiring that the factors in front of the states |1⟩H and |1⟩V are both zero, we obtain
two solutions: either s2 = 1 and α = β = 1/√2, or s2 = −1 and α = −β = 1/√2. Thus,
the eigenstates of Ŝ2 are a diagonally polarized single photon, with the eigenvalue +1,
and an anti-diagonally polarized single photon, with the eigenvalue −1.
Similarly, one can show that the eigenstates of Ŝ3 are a left-circularly polarized
single photon, with the eigenvalue +1, and a right-circularly polarized single photon,
with the eigenvalue −1.
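These eigenvalue problems can be cross-checked with 2×2 matrices. The sketch assumes that, in the basis {|1⟩H, |1⟩V}, the single-photon Stokes operators are represented by the Pauli matrices σz, σx, σy, which is consistent with the eigenstates listed above; the dictionary and function names are hypothetical.

```python
import numpy as np

# Assumed single-photon representation in the basis {|1>_H, |1>_V}
S1 = np.array([[1, 0], [0, -1]], dtype=complex)
S2 = np.array([[0, 1], [1, 0]], dtype=complex)
S3 = np.array([[0, -1j], [1j, 0]], dtype=complex)

r = 1 / np.sqrt(2)
STATES = {
    "H": np.array([1, 0], dtype=complex),
    "V": np.array([0, 1], dtype=complex),
    "D": np.array([r, r], dtype=complex),
    "A": np.array([r, -r], dtype=complex),
    "L": np.array([r, 1j * r], dtype=complex),
    "R": np.array([r, -1j * r], dtype=complex),
}

def stokes_vector(psi):
    """Mean values (<S1>, <S2>, <S3>) for a normalized single-photon state."""
    return np.array([np.vdot(psi, S @ psi).real for S in (S1, S2, S3)])

if __name__ == "__main__":
    for name, psi in STATES.items():
        print(name, np.round(stokes_vector(psi), 12))
```

Each of the six states gives a Stokes vector pointing along one axis of the Stokes space, with the signs stated in the text.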
For measuring a certain Stokes observable for a single photon, one should project
it on the corresponding eigenstate. This is done with the same setup as in classical op-
tics (Figs. 3.4, 3.5), with the only difference that the detectors should be able to register
single photons (single-photon, or ‘click’ detectors). For instance, to measure S1 we use
the setup shown in Fig. 3.4. If the upper detector clicks, we say that the photon is hor-
izontally polarized, and write down the result: s1 = 1. If the lower detector clicks, our
result is s1 = −1. This procedure is actually very similar to using the formula S1 ≡ IH −IV
(Section 3.3.4), with the intensities replaced by photon numbers: S1 ≡ NH − NV . But
for a single photon, either NH = 1, NV = 0 or the other way round, hence the Stokes
observable takes values ±1.
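The click-by-click character of this measurement, and the way a mean value emerges only after many identically prepared photons, can be imitated with a small Monte Carlo sketch. It assumes ideal detectors and a photon in the state α|1⟩H + β|1⟩V; the function name and the number of tries are arbitrary choices.

```python
import numpy as np

def measure_s1(alpha, beta, tries, seed=0):
    """Simulate `tries` single-photon measurements of S1 and return their average.

    Each try yields s1 = +1 (upper detector, H) with probability |alpha|^2
    or s1 = -1 (lower detector, V) otherwise; ideal detectors are assumed.
    """
    rng = np.random.default_rng(seed)
    p_h = abs(alpha) ** 2 / (abs(alpha) ** 2 + abs(beta) ** 2)
    upper_clicks = rng.random(tries) < p_h
    outcomes = np.where(upper_clicks, 1, -1)
    return outcomes.mean()

if __name__ == "__main__":
    theta = np.radians(30.0)
    print("estimate:", measure_s1(np.cos(theta), np.sin(theta), 100_000))
    print("expected:", np.cos(theta) ** 2 - np.sin(theta) ** 2)
```

A single try never returns anything but ±1; only the average approaches |α|² − |β|², with a statistical error that decreases as the inverse square root of the number of tries.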
After M tries with identically prepared photons, we calculate the mean value, i. e.,
the first Stokes parameter, by averaging the results of all tries:
\[ \langle S_1\rangle = \frac{1}{M}\sum_{i=1}^{M} s_{1i}, \tag{11.63} \]
projection of its spin σ⃗ on the direction of the magnetic field (x in the figure). Note that,
for measuring the projection of σ⃗ on some other direction, for instance, y, the magnet
in the figure should be rotated. It is impossible to measure σx and σy simultaneously
because the corresponding Pauli operators do not commute.
The same is true for the Stokes observables. Indeed, although there is a setup for
the measurement of any desired Stokes observable (Fig. 5.7), it requires different
settings for different Stokes observables. In this setup, shown for the case of quantum
measurement in Fig. 11.4(b), a polarization prism is preceded by a QWP and a HWP.
The orientation angles of the plates, χ1 and χ2 , determine the parameters ϑ and φ of
the generic Stokes observable (11.42). For instance, to measure S1 , both plates should
be oriented horizontally: χ1 = χ2 = 0°. For the measurement of S2 , the angles should
be χ1 = 45° and χ2 = 22.5°, and for the measurement of S3 , χ1 = 45° and χ2 = 0°. For
general settings of the plates, the Stokes observable (11.42) measured in the setup is
given by Eq. (5.25).
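The three plate settings quoted above can be checked with Jones matrices. The sketch assumes that the light passes the QWP first and the HWP second, that the prism separates H and V, and that a retarder at angle θ is R(θ) diag(1, e^{iδ}) R(−θ); with the opposite retardance sign the roles of left- and right-circular light are exchanged, so only the magnitude of the S3 result is convention-independent. All function names are hypothetical.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def wave_plate(theta_deg, retardance):
    """Jones matrix of a retarder with its axis at theta_deg to the horizontal."""
    r = rotation(np.radians(theta_deg))
    return r @ np.diag([1.0, np.exp(1j * retardance)]) @ r.T

def stokes_mean(psi, chi1_deg, chi2_deg):
    """Mean of N_H - N_V behind a QWP (chi1), a HWP (chi2) and an H/V prism."""
    psi = np.asarray(psi, dtype=complex)
    out = wave_plate(chi2_deg, np.pi) @ wave_plate(chi1_deg, np.pi / 2) @ psi
    return abs(out[0]) ** 2 - abs(out[1]) ** 2

if __name__ == "__main__":
    r = 1 / np.sqrt(2)
    H, D, L = [1, 0], [r, r], [r, 1j * r]
    print("S1 setting, H input:", stokes_mean(H, 0.0, 0.0))
    print("S2 setting, D input:", stokes_mean(D, 45.0, 22.5))
    print("S3 setting, L input:", stokes_mean(L, 45.0, 0.0))
    print("S1 setting, D input:", stokes_mean(D, 0.0, 0.0))
```

The last line illustrates the statement that a measurement of S1 on a diagonally polarized photon carries no information: both outputs are equally likely.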
Clearly, the measurement of different Stokes observables S1 , S2 , S3 requires differ-
ent settings in the setup and therefore cannot be performed at once. In particular, if a
photon is polarized diagonally or circularly, the measurement of S1 will not give any
information about its polarization state. In the setup of Fig. 11.4(b), the photon will be
reflected or transmitted with 50 % probability. The fact that the photon is detected, say,
in the ‘transmitted path’ only tells us that it was not a vertically polarized photon. As we
will see in Chapter 12, this feature underlies the principle of quantum key distribution
with polarized photons.
The Stokes observables can be alternatively measured using the walk-off effect,
as it was shown in Chapter 4. In this case, the photon should be detected after a long
birefringent crystal, for instance, calcite (Fig. 4.8) with the optic axis in the vertical
plane. If a photon is displaced in the course of propagation through the crystal, then
it is vertically polarized, and the value s1 = −1 is registered. If the photon is not dis-
placed, the conclusion is that s1 = 1. The measured value of s1 should then correspond
to the displacement Δx along the x axis: s1 = 1 − 2Δx/d, where d is the shift due to the
walk-off. It is assumed here that the beam size a is much smaller than d. A combina-
tion of quarter-wave and half-wave plates in front of the calcite turns the measurement
of S1 into the measurement of any other Stokes observable (Fig. 11.4(c)).
For d ≫ a, the value of the vertical displacement, measured, for instance, with a
camera, can be associated with the eigenvalue of Sϑ,φ . This condition makes the mea-
surement projective (one also says ‘strong’). In the next subsection we will see how the
violation of this condition makes the measurement uncertain (one says ‘weak’) but,
surprisingly, brings new interesting possibilities.
made visible. The result of such a measurement is a so-called weak value, which can
be larger than any of the Stokes eigenvalues [9].
The procedure of weakly measuring an observable A, corresponding to an operator Â,
is mathematically described as follows. Let the initial state be |Ψin⟩, a pure state
for simplicity. First, observable A is measured weakly for this state. Then the resulting
state is projected on the eigenstates of another operator (B̂), which does not commute
with Â. One of the eigenstates |Bn⟩ is postselected.
Consider first the state Â|Ψin⟩. Its decomposition over the eigenstates |Bn⟩, according
to Eq. (11.57), yields
\[ \hat A|\Psi_{\rm in}\rangle = \sum_n \langle B_n|\hat A|\Psi_{\rm in}\rangle\,|B_n\rangle. \tag{11.65} \]
\[ \langle\hat A\rangle \equiv \langle\Psi_{\rm in}|\hat A|\Psi_{\rm in}\rangle = \sum_n \langle B_n|\hat A|\Psi_{\rm in}\rangle\langle\Psi_{\rm in}|B_n\rangle, \tag{11.66} \]
\[ \langle\hat A\rangle = \sum_n P_n A_n^{w}, \tag{11.68} \]
where
\[ P_n = \bigl|\langle B_n|\Psi_{\rm in}\rangle\bigr|^{2} \tag{11.69} \]
\[ A_n^{w} = \frac{\langle B_n|\hat A|\Psi_{\rm in}\rangle}{\langle B_n|\Psi_{\rm in}\rangle} \tag{11.70} \]
\[ A_0^{w} \equiv \frac{\langle B_0|\hat A|\Psi_{\rm in}\rangle}{\langle B_0|\Psi_{\rm in}\rangle}. \tag{11.71} \]
The denominator in this expression can be very small if the output and input states
are almost orthogonal. Then the weak value of A is very large. In particular, it can be
larger than any of the eigenvalues An of the operator A.̂
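Equations (11.68) to (11.70) are easy to evaluate for a two-level system. The sketch below assumes the single-photon matrix representation Ŝ1 = σz, Ŝ2 = σx in the basis {|1⟩H, |1⟩V} and a nearly anti-diagonal input state; the function name and the chosen angle are illustrative.

```python
import numpy as np

def weak_values(A, B, psi_in):
    """Return eigenvalues B_n, probabilities P_n and weak values A_n^w.

    P_n follows Eq. (11.69) and A_n^w follows Eq. (11.70). The weak value
    is undefined if |psi_in> is exactly orthogonal to |B_n>.
    """
    eigenvalues, vectors = np.linalg.eigh(B)
    amplitudes = vectors.conj().T @ psi_in        # <B_n|psi_in>
    numerators = vectors.conj().T @ (A @ psi_in)  # <B_n|A|psi_in>
    return eigenvalues, np.abs(amplitudes) ** 2, numerators / amplitudes

if __name__ == "__main__":
    S1 = np.array([[1.0, 0.0], [0.0, -1.0]])
    S2 = np.array([[0.0, 1.0], [1.0, 0.0]])
    theta = np.radians(42.5)
    psi = np.array([np.cos(theta), -np.sin(theta)])   # almost anti-diagonal
    b, p, aw = weak_values(S1, S2, psi)
    for bn, pn, awn in zip(b, p, aw):
        print(f"postselect B = {bn:+.0f}: P = {pn:.4f}, weak value = {awn:+.3f}")
    print("sum of P_n * A_n^w:", np.sum(p * aw), " <A>:", psi @ S1 @ psi)
```

The rarely occurring postselection (small P_n) is the one with the large weak value, so the weighted sum still reproduces the ordinary mean value, as Eq. (11.68) requires.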
Figure 11.5: Left panel: weak measurement of the Stokes observable Ŝ 1 for a state prepared by the
first polarizer. Right panel: intensity distribution along the vertical axis without the second polarizer
(green dashed line) and with it, scaled up by a factor of 200 (red solid line). Blue dotted line shows
the intensity distribution for a narrow beam without the second polarizer (strong measurement).
Figure 11.5 shows how a weak value of a Stokes observable can be measured. The walk-
off scheme, as in Fig. 11.4(d), performs the weak measurement of one Stokes observ-
able and then a polarizer projects the state on another Stokes observable, not commut-
ing with the first one. Suppose that the state at the input is a photon polarized almost
anti-diagonally, |Ψin ⟩ = α|1⟩H − β|1⟩V , with α ≈ β ≈ 1/√2. It is prepared by transmitting
single photons through a polarizer oriented at an angle close to −45°. Then we weakly
measure the observable  = Ŝ1 , with the help of a thin calcite crystal with the optic
axis in the vertical plane. A camera placed close to the crystal (i. e., in the near field)
measures the transverse displacement of the photon. This displacement is supposed
to tell us the value of S1 , but for a broad input beam, it is hardly visible: the measurement
is weak. However, before the camera we place a polarizer oriented at 45°, and
thus project the state onto an eigenstate of the B̂ = Ŝ2 operator, namely |B0 ⟩ = |1⟩D .
(Note that this state is nearly orthogonal to the input one.) Surprisingly, the displace-
ment of the beam in the direction of the walk-off will be then very pronounced, and
different for different input states. Indeed, the result of the measurement will be the
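weak value of operator Ŝ1 for the state α|1⟩H − β|1⟩V , which is [see Eq. (11.71)]
\[ S_1^{w} = \frac{{}_D\langle 1|\hat S_1|\Psi_{\rm in}\rangle}{{}_D\langle 1|\Psi_{\rm in}\rangle} = \frac{\alpha+\beta}{\alpha-\beta}. \tag{11.72} \]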
This value is very large if α ≈ β, and definitely larger than any eigenvalue of the op-
erator Ŝ1 . In other words, the displacement of the beam in the vertical direction will
much exceed the displacement of a thin beam in a ‘strong’ measurement. At β = 0 or
α = 0, the weak value approaches the eigenvalues of Ŝ1 and the beam is displaced as
in the ‘strong’ measurement.
The corresponding intensity distributions are shown in the right-hand panel of
Fig. 11.5 for the case of a beam with full width at half maximum (FWHM) a = 5 mm and
the displacement due to walk-off only d = 0.1 mm. The input state is linearly polarized
at an angle 42.5°, which corresponds to α = 0.74, β = 0.68. In the absence of the
second polarizer, the intensity distributions for the ordinary and extraordinary beams
in calcite overlap (green dashed line) and cannot be distinguished. (For comparison,
blue dotted line shows the intensity distributions for these beams if their widths are
0.01 mm, which is the case of strong measurement.) With the second polarizer inserted
(red solid line), there is a single intensity peak, shifted about ten times more than in
the case of a strong measurement. In other words, the small transverse shift due to
the walk-off is amplified 10 times. In the limit of very small walk-off, the amplification
factor tends to α/(α − β) = 11.4 [26].
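The amplified shift can be reproduced with a one-dimensional model of the two overlapping beams. The sketch assumes Gaussian beam profiles, an undisplaced ordinary (H) beam at x = 0, an extraordinary (V) beam displaced by d, and an ideal second polarizer at 45°; the function name, the grid, and the sign of the x axis are choices made for illustration.

```python
import numpy as np

def weak_measurement_peak(fwhm=5.0, walk_off=0.1, angle_deg=42.5):
    """Peak position behind the second polarizer, measured from the midpoint
    between the two beams, and the shift predicted from the weak value.
    Lengths are in mm."""
    theta = np.radians(angle_deg)
    alpha, beta = np.cos(theta), np.sin(theta)
    sigma = fwhm / (2.0 * np.sqrt(np.log(2.0)))    # intensity ~ exp(-x^2/sigma^2)

    x = np.linspace(-20.0, 20.0, 400001)
    envelope = lambda u: np.exp(-u**2 / (2.0 * sigma**2))

    # Projection of alpha*g(x)|H> - beta*g(x - d)|V> on the diagonal state
    field = (alpha * envelope(x) - beta * envelope(x - walk_off)) / np.sqrt(2.0)
    peak = x[np.argmax(np.abs(field) ** 2)]

    weak_value = (alpha + beta) / (alpha - beta)
    predicted = -weak_value * walk_off / 2.0       # limit of very small walk-off
    return peak - walk_off / 2.0, predicted

if __name__ == "__main__":
    shift, predicted = weak_measurement_peak()
    print(f"numerical shift: {shift:.3f} mm, small-walk-off limit: {predicted:.3f} mm")
```

With the parameters of Fig. 11.5 the numerical shift is close to ten walk-off distances and lies towards the stronger (H) beam, slightly below the small-walk-off limit because the beam width is finite.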
This result has a simple classical explanation [26] in terms of the interference be-
tween the ordinary and extraordinary beams after the calcite. These beams overlap
due to their large widths and have the same polarization states after the second po-
larizer. Their interference is destructive because the fields in the two beams have a π
phase shift after the polarizer and very close absolute values. The resulting intensity
peak is weak and shifted towards the stronger beam.
In a similar experiment, Salvail et al. [27] obtained weak values of the Stokes ob-
servables as large as 4, by projecting on states that were almost orthogonal to the input
state. This shows how a weak measurement can retrieve very small displacements and
therefore provides a precision higher than its strong counterpart.1
The weak value defined by Eq. (11.70) can be, in principle, also complex. It can be
therefore used to directly measure the wavefunction of a quantum particle [22]. To this
end, the near-field measurement discussed in this section should be complemented
by a far-field measurement [27].
Finally, weak measurement can provide insights into the fundamental nonclassi-
cal features of light, such as the violation of Bell and Leggett–Garg inequalities and
several quantum paradoxes [9].
For the quantum description of polarized light, it is natural to introduce some quasi-
probability distribution in the space of the Stokes observables S1 , S2 , S3 , similar to the
quasi-probabilities in the phase space. Because the Stokes operators do not commute,
this quasi-probability distribution is bound to have ‘strange’ features like negativity or
singularity; however, it is still useful to describe the polarization part of the quantum
state. Moreover, one could expect that its one- or two-dimensional marginals would
have the properties of true probability distributions. These marginals can be helpful
to develop some experimental state reconstruction procedure.
1 Strictly speaking, the precision is determined not only by the beam displacement, but also by its
brightness (the number of photons). With this taken into account, a weak measurement provides no
advantage.
11.3 Polarization quasi-probability | 157
Indeed, in 2001 Karassiov and Masalov [5, 12] introduced the polarization quasi-probability
as the Fourier transform of the symmetrized characteristic function
\[ \chi(u_1,u_2,u_3) = \bigl\langle e^{\,i u_1\hat S_1 + i u_2\hat S_2 + i u_3\hat S_3}\bigr\rangle, \tag{11.73} \]
where u1,2,3 are real Cartesian coordinates and the angular brackets denote the aver-
aging over the quantum state, in the general case a mixed one.
The quasi-probability distribution, called the polarization Wigner function, is then
defined as
\[ W(S_1,S_2,S_3) = \iiint \frac{du_1\,du_2\,du_3}{(2\pi)^3}\, \chi(u_1,u_2,u_3)\, e^{-iu_1S_1 - iu_2S_2 - iu_3S_3}. \tag{11.74} \]
just ‘a guide for the eye’ and can indicate, for instance, the mean number of photons.
The green ellipsoid shows schematically the surface where the quasi-probability dis-
tribution W(S1 , S2 , S3 ) has a certain value, for instance, its half-maximum value. For
given angles ϑ, φ, determined by the orientations χ1 and χ2 of the HWP and QWP in
Fig. 5.7 [see Eq. (5.25)], a single direction in the three-dimensional (3D) Stokes space,
along the unit vector n⃗(ϑ, φ) = {cos ϑ; sin ϑ cos φ; sin ϑ sin φ}, is probed. Each point of
the histogram (11.75) is given by the surface area of the cross-section of the distribution
W(S1 , S2 , S3 ) orthogonal to n⃗(ϑ, φ) at the corresponding value of Sϑ,φ (shown by
red lines).
This setup for polarization quantum tomography [5] strongly resembles the stan-
dard homodyne tomography setup (Fig. 11.3). In both cases, the settings of the setup
(the phase of the local oscillator for homodyne tomography and the orientations of
the HWP and QWP for polarization tomography) determine the direction in space:
phase space for the Wigner-function tomography and the Stokes space for the polar-
ization tomography. In both cases, the difference photocurrent of the two detectors is
analyzed; its histogram determines the marginal probability distribution of the corre-
sponding observable. But there is an important difference between the two schemes.
Unlike the Wigner function W(q, p), the polarization quasi-probability distribution
(11.74) is defined in terms of the Stokes observables, and the latter, unlike the
quadratures, are integer-valued. This leads to singularities and negative values even in the
marginal distributions of the Stokes observables [6]. Even in the simplest case of a
horizontally polarized weak coherent state, the calculated marginal probability P(S2 , S3 )
has singularities at integer values of S⊥ ≡ √(S2² + S3²) and is negative in the
neighborhood of these values. The reconstructed quasi-probability distribution W(S1 , S2 , S3 )
will also be singular and negative at some values of S1 , S2 , S3 . This behavior can be
observed through polarization tomography using the quantum Stokes measurement
setup (Fig. 11.4(b)), where the detectors can count single photons. The resulting his-
togram of any Stokes observable will be discrete. After the reconstruction, W(S1 , S2 , S3 )
will show negative regions.
To demonstrate this, Spasibko et al. [28] reconstructed the probability distribu-
tion P(S2 , S3 ) for a coherent state with the mean number of photons 0.19. Sections of
the polarization quasi-probability distribution by various planes and by the S2 axis
indeed showed negativities near the eigenvalues S2 = ±1. The theory also predicts
singularities at these points but they cannot be reconstructed from the experimental
data.
Polarization tomography can also be performed with more advanced photon-
number resolving detectors, which can register not only single photons but also mul-
tiphoton states and can distinguish between different photon numbers. An example
is transition-edge sensors (TES). The use of such detectors would enable polariza-
tion quantum tomography in a larger space of photon numbers, but still, due to the
therefore does not have to be phase locked. This also shows robustness of polariza-
tion tomography to phase drifts.
This situation is formally described as follows [18]. The photon-annihilation op-
erators in modes H, V can be approximated by
where αH = α0 and αV = iα0 are the amplitudes of the coherent states in the hori-
zontal and vertical polarization modes, assumed to be classical, and δq̂ H,V , δp̂ H,V are
quadrature operators in the polarization modes H, V, which are assumed to be weakly
populated. The Stokes operators Ŝ1,2 can then be expressed using definition (11.41) as
We see that the Stokes operators Ŝ1 , Ŝ2 scale as linear combinations of the quadrature
operators in the H, V polarization modes, and the proportionality constant is given by
the amplitude of the bright coherent state. The variances of the Stokes operators are
also linearly related to the quadrature variances:
Therefore, quadrature squeezing of the states populating the horizontally and ver-
tically polarized modes will lead to the reduction in the fluctuations of the Stokes ob-
servables. This effect of polarization squeezing will be described in detail in the next
chapter.
Bibliography
[1] Y. Aharonov, D. Z. Albert, and L. Vaidman. How the result of a measurement of a component of
the spin of a spin-1/2 particle can turn out to be 100. Phys. Rev. Lett., 60:1351–1354, Apr 1988.
[2] H.-A. Bachor and T. C. Ralph. A guide to experiments in quantum optics. Wiley-VCH, 2004.
[3] W. P. Bowen, R. Schnabel, H. A. Bachor, and P. K. Lam. Polarization squeezing of continuous
variable Stokes parameters. Phys. Rev. Lett., 88:093601, Feb 2002.
[4] V. B. Braginsky and F. Y. Khalili. Quantum measurement. Cambridge University Press, 1992.
[5] P. A. Bushev, V. P. Karassiov, A. V. Masalov, and A. A. Putilin. Biphoton light with hidden
polarization and its polarization tomography. Opt. Spectrosc., 91:558–564, 2001.
[6] M. V. Chekhova and F. Y. Khalili. Nonclassical features of the polarization quasiprobability
distribution. Phys. Rev. A, 88:023822, Aug 2013.
[7] A. S. Chirkin, A. A. Orlov, and D. Y. Parashchuk. Quantum theory of two-mode interactions
in optically anisotropic media with cubic nonlinearities: Generation of quadrature- and
polarization-squeezed light. Rus. Journ. Quantum Electronics, 23:870–874, 1993.
[8] J. F. Corney, J. Heersink, R. Dong, V. Josse, P. D. Drummond, G. Leuchs, and U. L. Andersen.
Simulations and experiments on polarization squeezing in optical fiber. Phys. Rev. A,
78:023831, Aug 2008.
[9] J. Dressel, M. Malik, F. M. Miatto, A. Jordan, and R. W. Boyd. Colloquium: Understanding
quantum weak values: basics and applications. Rev. Mod. Phys., 86:307–316, 2014.
[10] R. Filip and L. Lachman. Hierarchy of feasible nonclassicality criteria for sources of photons.
Phys. Rev. A, 88:043827, Oct 2013.
[11] O. A. Ivanova, T. S. Iskhakov, A. N. Penin, and M. V. Chekhova. Multiphoton correlations in
parametric down-conversion and their measurement in the pulsed regime. Quantum Electron.,
36(10):951–956, Oct 2006.
[12] V. P. Karassiov and A. V. Masalov. Quantum interference of light polarization states via
polarization quasiprobability functions. J. Opt. B, Quantum Semiclass. Opt., 4(4):S366–S371,
Aug 2002.
[13] D. N. Klyshko. The nonclassical light. Phys. Usp., 39:573–596, 1996.
[14] D. N. Klyshko. Polarization of light: fourth-order effects and polarization-squeezed states.
J. Exp. Theor. Phys., 84:1065–1079, 1997.
[15] D. Klyshko. Physical foundations of quantum electronics. World Scientific, 2011.
[16] J. Knight and L. Vaidman. Weak measurement of photon polarization. Phys. Lett. A,
143:357–361, 1990.
[17] N. Korolkova. Quantum polarization for continuous-variable information processing,
chapter 30, pages 405–417. John Wiley & Sons, Ltd, 2005.
[18] N. Korolkova, G. Leuchs, R. Loudon, T. C. Ralph, and C. Silberhorn. Polarization squeezing and
continuous-variable polarization entanglement. Phys. Rev. A, 65:052306, Apr 2002.
[19] L. D. Landau and E. M. Lifshitz. Quantum mechanics. Elsevier, 3rd edition, 1977.
[20] U. Leonhardt. Measuring the quantum state of light. Cambridge University Press, 1997.
[21] R. Loudon. The quantum theory of light. Oxford University Press, 1973.
[22] J. S. Lundeen, B. Sutherland, A. Patel, C. Stewart, and C. Bamber. Direct measurement of the
quantum wavefunction. Nature, 474:188–191, 2011.
[23] L. Mandel and E. Wolf. Optical coherence and quantum optics. Cambridge University Press,
1995.
[24] M. A. Nielsen and I. L. Chuang. Quantum computation and quantum information. Cambridge
University Press, 2010.
[25] G. J. Pryde, J. L. O’Brien, A. G. White, T. C. Ralph, and H. M. Wiseman. Measurement of quantum
weak values of photon polarization. Phys. Rev. Lett., 94:220405, 2005.
where the dipole moment is the polarization of the matter integrated over the volume,
\[ \vec d = \int d^3r\, \vec P(\vec r). \tag{12.2} \]
https://doi.org/10.1515/9783110668025-012
164 | 12 Nonclassical states of polarized light
The integral should be done over the whole volume where the nonlinear interaction
takes place, and P⃗(r⃗) is the nonlinear polarization.
Consider first the second-order nonlinear polarization P⃗ (2) (the third-order one
will be responsible for four-wave mixing and similar effects in Section 12.2). The ex-
pression for P⃗ (2) has been introduced in Chapter 10. After substituting Eq. (10.2) and
Eq. (12.2) into Eq. (12.1), we obtain
\[ \mathcal{H} = -\epsilon_0 \int d^3r\, \chi^{(2)}(\vec r)\,\vdots\,\vec E(\vec r,t)\,\vec E(\vec r,t)\,\vec E(\vec r,t). \tag{12.3} \]
The three dots, as before, mean multiplication of a rank-3 tensor by three vectors.
We assume the field E⃗ contains three components: the pump E⃗ 0 , the signal E⃗ s , and
the idler E⃗ i . In reality, only the pump is present at the input of the nonlinear crystal;
the other two fields are accounted for formally, keeping in mind that the correspond-
ing modes are populated by only vacuum states. The reason is that in the quantum
description (Chapter 11), we have to assign a field operator to every field mode, and
only later, in order to calculate some observables, average the operators over states,
as given by Eq. (11.50). Writing the total field E⃗ in terms of the analytic signals, we get
The energy (12.3) will comprise many different terms, every one describing some
nonlinear optical process. For instance, the term containing [E0(+) ]2 E0(−) will corre-
spond to the second-harmonic generation from the pump. But as we know from
Chapter 10, only those processes will be efficient, for which the phase-matching con-
ditions are satisfied. This will not be the case for all nonlinear effects. In this section,
we are interested only in SPDC, and the relevant term in the expression for the energy
is
\[ \mathcal{H}_{\rm SPDC} = -\epsilon_0 \int d^3r\, \chi^{(2)}(\vec r)\,\vdots\,\vec E_0^{(+)}(\vec r,t)\,\vec E_s^{(-)}(\vec r,t)\,\vec E_i^{(-)}(\vec r,t) + \text{c. c.} \tag{12.5} \]
The interaction described by Eq. (12.5) takes place in the area shown in Fig. 12.1,
namely where the pump beam is in the crystal of length L with the second-order
nonlinear susceptibility χ (2) .
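The pump, usually a laser beam, can be seen as a plane monochromatic classical
wave propagating along the z direction, $\vec E_0^{(+)}(\vec r,t) = \vec e_0 E_0 e^{-i\omega_0 t + ik_0 z}$, $\vec e_0$ being the unit
Jones vector defining the polarization state of the pump. The situation with the signal
and idler fields is different: because they are vacuum fields, we write $\vec E_s^{(-)}(\vec r,t)$ and
$\vec E_i^{(-)}(\vec r,t)$ as quantum operators, according to Eq. (11.4):
\[ \vec E_s^{(-)}(\vec r,t) = \sum_m \vec e_m^{\,*}\, c_m\, a_m^{\dagger}\, e^{i\omega_m t - i\vec k_m \vec r}, \qquad \vec E_i^{(-)}(\vec r,t) = \sum_n \vec e_n^{\,*}\, c_n\, a_n^{\dagger}\, e^{i\omega_n t - i\vec k_n \vec r}. \tag{12.6} \]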
Here, the indices m, n number all wavevector modes of the signal and idler fields, e⃗m,n
are the Jones vectors defining their polarization states, and ωm,n ≡ ω(k⃗m,n ) are their
frequencies. We will further assume that the second-order nonlinear susceptibility has
a constant value χ0 over the whole nonlinear crystal.
Then the energy is
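\[ \hat{\mathcal{H}}_{\rm SPDC} = -\epsilon_0 E_0 \sum_{m,n} c_m c_n\, \chi_0\,\vdots\,\vec e_0\, \vec e_m^{\,*}\, \vec e_n^{\,*} \int d^3r\; a_m^{\dagger} a_n^{\dagger}\, e^{i\Delta\vec k_{mn}\vec r - i\Delta\omega_{mn} t} + \text{h. c.}, \tag{12.7} \]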
where Δωmn ≡ ωm + ωn − ω0 and Δk⃗mn ≡ k⃗m + k⃗n − k⃗0 are frequency and wavevector mis-
matches, respectively. The energy (12.7) is now an operator, a Hamiltonian; this is why
we wrote the conjugated part as ‘Hermitian conjugated’, ‘h. c.’, instead of ‘complex
conjugated’, ‘c. c.’
Every term in Hamiltonian (12.7) contains two photon creation operators, a†m and
a†n . This means it will lead to the generation of photon pairs in the corresponding
modes, which we labeled by m, n. Due to the sum in the Hamiltonian, the pairs can
be generated into many different signal and idler modes: each of the indices m, n im-
plies three degrees of freedom in the wavevector space, as well as the polarization (see
Section 11.1.1). However, there are additional restrictions related to the frequency and
wavevector mismatches entering the Hamiltonian.
Consider first the frequency mismatch. If it is nonzero, the Hamiltonian will os-
cillate in time, and the nonlinear interaction will not be accumulated. Therefore, it is
necessary that
ωm + ωn = ω0 . (12.8)
Equation (12.8) means that the frequencies of the two generated photons should sum
up to give the pump frequency. In the simplest case of ωm = ωn (frequency-degenerate
SPDC), the frequencies of the generated photons are half the pump frequency. We
obtained a similar condition in Chapter 10, where the frequency of the second harmonic
was twice the frequency of the pump. Now, in the ‘photon language’, we can say that
Eq. (12.8) formulates the energy conservation: the energy of the pump photon ℏω0 is
equal to the sum of the daughter photon energies, ℏωm and ℏωn .1
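In terms of vacuum wavelengths, condition (12.8) reads 1/λ0 = 1/λs + 1/λi, which gives a quick way to find the idler wavelength once the pump and signal wavelengths are chosen. The function name and the example wavelengths below are illustrative.

```python
def idler_wavelength(pump_nm, signal_nm):
    """Idler vacuum wavelength from omega_s + omega_i = omega_0, Eq. (12.8).

    Equivalent to 1/lambda_0 = 1/lambda_s + 1/lambda_i; lengths are in nm.
    """
    if signal_nm <= pump_nm:
        raise ValueError("the signal photon cannot carry more energy than the pump photon")
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)

if __name__ == "__main__":
    # Frequency-degenerate case: both photons at twice the pump wavelength
    print(idler_wavelength(405.0, 810.0))
    # Non-degenerate case: a shorter-wavelength signal implies a longer-wavelength idler
    print(idler_wavelength(532.0, 800.0))
```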
The volume integration in Eq. (12.7), d3 r ≡ dxdydz, imposes additional restrictions
on the modes n, m, in which the signal and idler photons are generated. For simplicity,
we can assume that the transverse size of the pump is very large. Then the integration
over the transverse coordinates (x and y in Fig. 12.1) will lead to a factor δ(Δkx )δ(Δky ) in
the Hamiltonian: the transverse wavevector mismatch is equal to zero. In the ‘photon
language’, this means conservation of the transverse momentum of photons. Because
the pump photons had only momentum along the z axis, their transverse momenta
were zero, and for the signal and idler photons, the projections of the momenta on
both x and y axes should be opposite. The integration in z (Fig. 12.1) is over the length
of the crystal, and as in all coherent nonlinear optical effects, it leads to the expression
\[ \int_{-L}^{0} dz\, e^{i\Delta k_z z} = L\, e^{-\frac{i\Delta k_z L}{2}}\, \mathrm{sinc}\,\frac{\Delta k_z L}{2}, \tag{12.9} \]
where sinc(x) ≡ sin x / x.
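Expression (12.9) can be confirmed by direct numerical integration. The sketch uses a plain trapezoidal rule and NumPy's `sinc`, which is the normalized function sin(πx)/(πx), hence the division of the argument by π; the function names and test values are arbitrary.

```python
import numpy as np

def phase_matching_integral(dk, length, points=200001):
    """Trapezoidal estimate of the integral of exp(i*dk*z) over z from -L to 0."""
    z = np.linspace(-length, 0.0, points)
    f = np.exp(1j * dk * z)
    dz = z[1] - z[0]
    return dz * (f.sum() - 0.5 * (f[0] + f[-1]))

def closed_form(dk, length):
    """Right-hand side of Eq. (12.9); np.sinc(x) = sin(pi*x)/(pi*x)."""
    return length * np.exp(-1j * dk * length / 2.0) * np.sinc(dk * length / (2.0 * np.pi))

if __name__ == "__main__":
    L = 2.0
    for dk in (0.0, 1.0, 3.0, 2.0 * np.pi / L):
        print(dk, phase_matching_integral(dk, L), closed_form(dk, L))
```

The integral equals L at perfect phase matching and first vanishes at Δkz = 2π/L, which sets the width of the region where the Hamiltonian is appreciable.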
The conditions that the frequency mismatch and the transverse wavevector mis-
match are zero remove part of the summation in Eq. (12.7). Namely, out of the six sums
over the wavevector modes, conditions Δω = Δkx = Δky = 0 eliminate three. In addition,
the Hamiltonian contains the factor (12.9), which is appreciable only in the vicinity
of the longitudinal phase matching Δkz = 0. All this leaves much less freedom in
the choice of modes where the signal and idler photons are generated. What remains
unrestricted is the frequency of one of them (let it be the signal frequency ωs; the
idler frequency is anti-correlated to it through condition (12.8)), the azimuthal angle
of emission ϕs of the signal photon, and the polarization states of the signal and idler
photons. The latter will be the subject of the following sections.
Then the Hamiltonian can be simplified to
$$\hat{\mathcal{H}} = i\hbar\Gamma\,a_s^\dagger a_i^\dagger + \mathrm{h.c.}, \tag{12.10}$$
where we deliberately pulled out the Planck constant and included all relevant parameters
(crystal length, second-order nonlinearity, the amplitude of the pump, etc.)
into the coupling parameter Γ. We also selected only one mode for the signal photon
and, correspondingly, one for the idler photon, but we will keep in mind that there is
still a choice of signal photon frequencies. This property of producing new states of
light not into a single mode but into multiple modes, among many other features, dis-
tinguishes SPDC from the second-harmonic generation where, for instance, the spec-
trum of generated light is just the spectrum of the pump shifted (in the logarithmic
scale) by an octave. In contrast, SPDC produces new frequency and wavevector states
and, most importantly for this book, various polarization states.
1 This picture is valid in the case of a continuous-wave pump or a pump with relatively long pulses.
For femtosecond-pulse pump, Eq. (12.8) is satisfied up to the pump spectral width; this does not mean
that the energy is not conserved but only that the energy of a short pump pulse has an uncertainty.
The polarization modes of the signal and idler photons will be determined by the type
of phase matching, described in Section 10.2. In SPDC, the phase matching is usu-
ally satisfied by using a birefringent crystal and choosing different polarizations for
the pump and for the signal/idler photons. For instance, in type-I SPDC in a negative
crystal, the pump is polarized as an extraordinary beam while the down-converted
photons are ordinarily polarized (e→oo interaction, reverse to the process discussed
in Section 10.2.2). They can still differ in frequency or direction of emission, but a
special case is the one where they also have the same frequencies and wavevectors. The
Hamiltonian (12.10) then takes the form
$$\hat{\mathcal{H}}_{I} = i\hbar\Gamma\,[a^\dagger]^2 + \mathrm{h.c.}, \tag{12.11}$$
where a† is the photon creation operator in the mode into which SPDC photons are
emitted. In accordance with the phase matching conditions, these photons should
have the frequency equal to half of the pump frequency and the wavevector collinear
with the pump wavevector. This type of SPDC phase matching is found in exactly the
same way as for the second-harmonic generation (see Section 10.2.2).
The states generated through this type of phase matching manifest various re-
markable quantum features, such as non-monotonic photon-number distribution,
strong photon bunching, and quadrature squeezing. But because these properties are
not relevant to polarization quantum optics, we will not discuss them here.
In a more general case of type-I SPDC, the daughter photons are emitted not
collinearly with the pump but along cones (Fig. 12.2) whose opening angles depend
on the frequencies of the signal and idler photons (and if they are not equal, the pho-
ton with a lower frequency will be emitted along a larger cone). The angle ϑ between
the optic axis ζ and the pump wavevector defines the angle of emission for photons
of a given frequency. In particular, for a certain angle, photons at the degenerate fre-
quency ω0 /2 will be emitted along the pump, and the case of collinear degenerate
type-I SPDC will be realized.
One can also implement type-II phase matching, for which signal and idler photons
are polarized orthogonally. The Hamiltonian then takes the form
$$\hat{\mathcal{H}}_{II} = i\hbar\Gamma\,a_H^\dagger a_V^\dagger + \mathrm{h.c.}, \tag{12.12}$$
where we assumed that the polarization states of the two photons are horizontal and
vertical and the other parameters of the signal and idler photons, such as wavelength
and wavevector direction, are the same. This case is called type-II collinear frequency-
degenerate SPDC. In general, the two photons are polarized as the ordinary and ex-
traordinary normal waves in the crystal, i. e., linearly and orthogonally to each other.
For simplicity, we assume that the crystal is oriented so that the directions of linear
polarization are horizontal and vertical.
This case of type-II SPDC is formally similar to other cases where the signal and
idler photons are emitted into two distinguishable modes. These can be different fre-
quency modes, or different wavevector directions, or both. But in the context of this
book, type-II SPDC is most interesting because it provides special polarization states,
whose properties (hidden polarization, polarization squeezing) will be discussed in
the next sections.
Type-II SPDC can be considered as reversed type-II second-harmonic generation
(Section 10.2.2); the necessary orientation of the nonlinear crystal and the effective
value of the second-order susceptibility should be calculated the same way.
Even more interesting in connection with the polarization states of nonclassical
light is the phase matching involving two polarization modes and two other modes,
for instance, two different wavevector directions. In this case, polarization-entangled
photons are generated. This situation, depicted in Fig. 12.3, was first realized
by Kwiat et al. [38]. Importantly, SPDC produces photon pairs not only along the
pump wavevector, but in a continuum of other directions and frequencies—unlike the
second-harmonic generation, as already mentioned. In particular, even for a given
(degenerate) frequency of the signal and idler photons ωs = ωi = ω0 /2, SPDC can be
non-collinear: the wavevectors of the two daughter photons are in this case not paral-
lel to the pump wavevector. Calculation shows that, for type-II frequency-degenerate
SPDC, the ordinary (o) and extraordinary (e) photons are emitted along two different
cones, tilted with respect to each other in the plane containing the incident pump
wavevector and the optic axis (Fig. 12.3). These cones intersect along two lines, de-
noted as A and B in the figure.
The situation shown in Fig. 12.3 is the most general one. As the angle ϑ between
the optic axis and the incident pump is varied, the cones become larger or smaller. In
particular, for a certain angle ϑ they touch along a single line that is collinear with the
pump wavevector. This is the case of collinear degenerate type-II SPDC we described
previously.
But consider now the case shown in Fig. 12.3, namely the photon pairs emitted
in the directions A and B. Along each line, there is both an e-polarized photon (V) and
an o-polarized photon (H). Because of the transverse wavevector matching condition,
Δkx = Δky = 0, the two photons should always be emitted symmetrically with respect
to the pump. They also should have orthogonal polarizations. There are therefore two
possibilities: photon A is H-polarized and photon B is V-polarized, or vice versa.
The Hamiltonian can then be rewritten as the sum of two Hamiltonians:
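$$\hat{\mathcal{H}}_{\mathrm{ent}} = i\hbar\Gamma\left(a_{AH}^\dagger a_{BV}^\dagger + e^{i\phi}\,a_{AV}^\dagger a_{BH}^\dagger\right) + \mathrm{h.c.}, \tag{12.13}$$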
where the phase ϕ can be different depending on the pump, signal, and idler phase
delays in the nonlinear crystal.2
In addition to the non-collinear type-II SPDC, there is another experimental
scheme to obtain Hamiltonian (12.13) [39]. In this scheme, two nonlinear crystals,
cut for type-I phase matching as in Fig. 12.2, are placed one after another into a com-
mon pump beam (Fig. 12.4). One of the crystals is oriented with the optic axis ζ in the
vertical plane and the other one, with the optic axis in the horizontal plane. If the
pump is polarized diagonally, it has an extraordinarily polarized component in each
crystal and generates SPDC with the e→oo phase matching. Both crystals emit photon
pairs at the degenerate frequency ω0 /2 along the cone with the same opening angle,
but the photons from the first crystal are polarized horizontally and the photons from
the second one, vertically.
2 There is an additional element in the setup, not shown in Fig. 12.3: to make the two terms in Hamil-
tonian (12.13) coherent, a birefringent element after the nonlinear crystal has to compensate for the
group-velocity delay between ordinary and extraordinary photons [38].
Let us select two directions, A and B, into which signal and idler photons are emitted.
(Note that, unlike in the scheme shown in Fig. 12.3, here we can choose such directions
in many different ways.) The Hamiltonian provided by the first crystal can then be
written as ℋ̂ 1 = iℏΓa†AH a†BH + h. c., where a†AH and a†BH are photon creation operators
in the horizontally polarized modes of beams A and B, respectively. Meanwhile, the
Hamiltonian of the second crystal is ℋ̂ 2 = iℏΓa†AV a†BV + h. c., where the same notation
is used.
Because the two SPDC sources are pumped coherently, by a common laser beam,
the total Hamiltonian is the sum of Hamiltonians ℋ̂ 1 and ℋ̂ 2 , with a constant phase
between them. This phase ϕ is due to the phase delays of the pump and the emitted
down-converted radiation. The result is
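$$\hat{\mathcal{H}} = \hat{\mathcal{H}}_1 + e^{i\phi}\hat{\mathcal{H}}_2 = i\hbar\Gamma\left(a_{AH}^\dagger a_{BH}^\dagger + e^{i\phi}\,a_{AV}^\dagger a_{BV}^\dagger\right) + \mathrm{h.c.} \tag{12.14}$$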
and a HWP at 45∘ placed in beam B will convert it into Hamiltonian (12.13) with ϕ = 0.
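The action of the half-wave plate on the pair-creation term can be sketched with Jones matrices. The code below stores the coefficients of the four possible products of creation operators for beams A and B as a 2×2 table and applies an ideal HWP at 45° to beam B only. The function names, the coefficient-table representation, and the HWP matrix (written up to a global phase) are assumptions made for this illustration.

```python
import cmath
import math


def hwp_jones(theta: float):
    """Jones matrix (H/V basis) of an ideal half-wave plate with its axis at angle theta."""
    c, s = math.cos(2.0 * theta), math.sin(2.0 * theta)
    return [[c, s], [s, -c]]


def act_on_beam_b(coeff, jones):
    """Transform the beam-B polarization index of a pair-coefficient table.

    coeff[a][b] is the coefficient of the term creating polarization a in beam A
    and polarization b in beam B (index 0 = H, index 1 = V).
    """
    return [
        [sum(jones[b_out][b_in] * coeff[a][b_in] for b_in in range(2)) for b_out in range(2)]
        for a in range(2)
    ]


phi = 0.0  # relative phase of the two crystals (assumed value)
pairs_in = [[1.0, 0.0], [0.0, cmath.exp(1j * phi)]]  # HH and VV terms, as in Eq. (12.14)
pairs_out = act_on_beam_b(pairs_in, hwp_jones(math.pi / 4.0))
for row in pairs_out:
    print([round(abs(x), 12) for x in row])
```

After the plate, only the HV and VH entries of the table are populated, which is the structure of Hamiltonian (12.13).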
This scheme is more efficient than the one based on type-II SPDC for many rea-
sons. First, as we already mentioned, it includes many directions A and B. Second,
type-I SPDC, in most crystals, has a higher effective susceptibility. Finally, this scheme
is simpler in operation.
The states generated by the Hamiltonians (12.10), (12.12), (12.13) we derived here
will be the subject of the next sections. But at this point we notice that each of them
creates photons only in pairs. The probability of pair creation depends on the coupling
parameter Γ, which scales as the second-order nonlinear susceptibility, the pump field
amplitude, and the length of the crystal. Depending on the magnitude of the coupling
parameter, the interaction can be weak or strong. Correspondingly, one can distin-
guish between two cases: low-gain SPDC, which generates photon pairs, and high-
gain SPDC, which generates bright beams with photon-number correlations. We will
consider the first case in Section 12.3 and the second one, in Section 12.4. But before do-
ing that, in the next section we show how the same pair-creating Hamiltonians emerge
through third-order nonlinear effects.
do not offer such a rich platform for producing polarization states as crystals do. Here
we will only briefly describe the methods of generating nonclassical states of polarized
light in optical fibers. As in the case of SPDC, our consideration will start with deriving
the Hamiltonian, i. e., the energy of the nonlinear interaction.
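$$\vec{d} = \int d^3r\,\vec{P}^{(3)}(\vec{r}), \tag{12.16}$$
$$\mathcal{H} = -\epsilon_0 \int d^3r\,\chi^{(3)}(\vec{r}) \;⁞\; \vec{E}(\vec{r},t)\,\vec{E}(\vec{r},t)\,\vec{E}(\vec{r},t)\,\vec{E}(\vec{r},t). \tag{12.17}$$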
Similarly to the case of SPDC, we assume that the total field E(⃗ r,⃗ t) contains the pump
E⃗ 0 (r,⃗ t), which is a classical plane monochromatic wave, and the signal and idler field
operators given by Eqs. (12.6). Then, among the many terms the energy will contain,
we are interested in
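$$\mathcal{H}_{\mathrm{SFWM}} = -\epsilon_0 \int d^3r\,\chi^{(3)}(\vec{r}) \;⁞\; \vec{E}_0^{(+)}(\vec{r},t)\,\vec{E}_0^{(+)}(\vec{r},t)\,\vec{E}_s^{(-)}(\vec{r},t)\,\vec{E}_i^{(-)}(\vec{r},t) + \mathrm{c.c.} \tag{12.18}$$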
This term describes spontaneous four-wave mixing (SFWM) and modulation instability
(MI); the difference between these two processes will be clear from what follows.
We proceed in the same way as in the case of SPDC: by substituting into Eq. (12.18)
the expressions for the pump, signal and idler fields, we obtain the SFWM Hamiltonian
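$$\hat{\mathcal{H}}_{\mathrm{SFWM}} = -\epsilon_0 E_0^2 \sum_{m,n} c_m^* c_n^*\,\chi_0^{(3)} \;⁞\; \vec{e}_0\,\vec{e}_0\,\vec{e}_m\,\vec{e}_n \int d^3r\,a_m^\dagger a_n^\dagger\,e^{i\Delta\vec{k}_{mn}\cdot\vec{r} - i\Delta\omega_{mn} t} + \mathrm{h.c.}, \tag{12.19}$$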
where now, Δωmn ≡ ωm + ωn − 2ω0 and Δk⃗mn ≡ k⃗m + k⃗n − 2k⃗0 . Here, for uniformity
we use the same notation (k) for the wavevectors as in the case of SPDC in nonlinear
crystals. But usually, third-order interactions are implemented in optical fibers where
the propagation constant β is used instead. Also, in the phase mismatch we omitted
the contribution of self-phase-modulation and cross-phase-modulation; this can be
done in the case of a weak continuous-wave pump but would be wrong in the case of
a pulsed pump with a high peak power.
Equation (12.19), with the restrictions imposed by phase matching, boils down to
the same pair-producing Hamiltonian (12.10) as in the case of SPDC. The difference
between the SFWM and MI Hamiltonians and that of SPDC is that now the coupling parameter
scales as the third-order susceptibility and the squared pump amplitude:
$$\Gamma \propto \chi^{(3)} E_0^2. \tag{12.20}$$
The fact that the coupling parameter of third-order nonlinear interactions scales
quadratically with the pump amplitude leads to some important features. As will be
clear from the next sections, the rate of pair production scales as the square of the
coupling parameter; therefore, in SFWM and MI it will scale as the pump intensity
squared. This means that, for a pulsed pump, the efficiency of SFWM and MI will be
higher than for a continuous-wave pump with the same average power—the same fea-
ture is typical for the second-harmonic generation and other effects nonlinear in the
pump power. Similarly, tight focusing of the pump should also increase the efficiency
of SFWM and MI. In contrast, the pair generation rate of SPDC scales linearly with the
pump power, and it is only the average pump power that matters.
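The scaling argument can be made concrete with a toy model. The sketch below assumes rectangular pump pulses filling a fraction `duty_cycle` of the time, a low-gain pair rate proportional to the instantaneous pump power raised to the power `order` (1 for SPDC, 2 for SFWM and MI), and ignores dispersion and walk-off; the function name and the numbers are illustrative assumptions.

```python
def pulsed_rate_enhancement(duty_cycle: float, order: int) -> float:
    """Average pair rate of a pulsed pump divided by that of a CW pump of equal average power.

    duty_cycle: fraction of the time during which the (rectangular) pulses are on.
    order: power law of the pair rate in the pump power (1 for SPDC, 2 for SFWM/MI).
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty cycle must lie in (0, 1]")
    peak_power = 1.0 / duty_cycle          # peak power in units of the average power
    return duty_cycle * peak_power ** order  # rate during the pulse times the on-time fraction


# Hypothetical duty cycle of 1e-3 (e.g. short pulses at a moderate repetition rate).
print("SPDC:", pulsed_rate_enhancement(1e-3, 1))
print("SFWM:", pulsed_rate_enhancement(1e-3, 2))
```

In this model the SPDC rate is unchanged by pulsing, while the SFWM and MI rates grow as the inverse duty cycle.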
In optical fibers, birefringence is usually absent or too small to satisfy the phase
matching. Periodic poling, often used to phase match second-harmonic generation
(Section 10.3.3) and SPDC, does not work here either, because poling does not change
the third-order susceptibility χ (3) . Still, phase matching is possible, and the mecha-
nisms are different depending on whether the pump wavelength λ0 = 2πc/ω0 is below
or above the zero-dispersion wavelength λZDW .
Figure 12.5 shows a typical dispersion dependence k(ω). It is steep at low frequencies,
due to infrared resonances of molecular vibrations, and it is steep again at high
frequencies, approaching electronic resonances. In between, there is the zero-dispersion
frequency ωZDW = 2πc/λZDW , where the dispersion dependence has an inflection point.
The dispersion dependence is convex upward (d²k/dω² < 0) below the zero-dispersion
point and convex downward (d²k/dω² > 0) above it. The two intervals on the left and
on the right of ωZDW are called, respectively, anomalous and normal group-velocity
dispersion (GVD) ranges.
The phase-matching condition requires that the pump wavevector k0 is the arithmetic
mean of the signal and idler wavevectors, 2k0 = ks + ki , while the pump frequency
ω0 is the arithmetic mean of the signal and idler frequencies, 2ω0 = ωs + ωi . If the pump
frequency is above the zero-dispersion frequency ωZDW , as shown by the blue dashed
line in Fig. 12.5, it is possible to satisfy this condition. This is geometrically illustrated
by the blue solid line in the figure, connecting points s1 and i1 on the dispersion de-
pendence. The phase-matching condition can be satisfied in the normal GVD range, as
long as the pump (p1) is not too far from the zero-dispersion point. This regime of pair
generation is called spontaneous four-wave mixing, and it produces signal and idler
photons at widely separated frequencies. These frequencies are solely determined
by the dispersion of the nonlinear material and do not depend on the pump
power [52].
But if the pump frequency (shown by a red dashed line) is below ωZDW , i. e., in the
anomalous GVD range, the pump wavevector cannot be the arithmetic mean of the signal
and idler wavevectors. This is shown in Fig. 12.5 by a red straight line connecting
two points s2, i2 on the dispersion dependence that are symmetric with respect to ω0 .
The arithmetic mean of k(ωs ) and k(ωi ) is always below the dispersion dependence.
And here the cross- and self-phase-modulation come into play. We ignored them so
far, because these effects change the phase matching very little, but it still matters
if only a small wavevector mismatch has to be compensated. Indeed, if the pump is
strong, it changes both its refractive index (self-phase modulation) and the refractive
indices of the signal and idler radiation (cross-phase-modulation) [2]. These two ef-
fects add to the negative mismatch Δk = ks + ki − 2k0 a positive term scaling as the
pump power, which reduces the absolute value of the mismatch. This term, shown
by a red vertical bar in Fig. 12.5, can be viewed as reducing the wavevector of the pump
and therefore making the phase matching satisfied. Naturally, the stronger the pump,
the larger the nonlinear change in the refractive index, and hence the further apart
the signal and idler wavelengths. This regime is called modulation instability. In this
regime, signal and idler photons are generated spectrally rather close to the pump,
but their frequency separation considerably increases with the increase in the pump
power [52].
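The two regimes can be compared in a simple numerical model. The sketch below is not taken from the text: it assumes the common fiber-optics approximation in which k(ω) is Taylor-expanded around the pump up to fourth order, so that the mismatch at detuning Ω is β2Ω² + β4Ω⁴/12, and the self- and cross-phase-modulation add 2γP, with γ the nonlinear coefficient of the fiber and P the pump power. All parameter values are hypothetical.

```python
import math


def sideband_detunings(beta2: float, beta4: float, gamma: float, power: float):
    """Positive detunings (rad/s) solving beta2*W**2 + beta4*W**4/12 + 2*gamma*power = 0.

    beta2 (s^2/m) and beta4 (s^4/m) are dispersion coefficients at the pump frequency,
    gamma (1/(W m)) is the nonlinear coefficient, power (W) is the pump power.
    """
    a, b, c = beta4 / 12.0, beta2, 2.0 * gamma * power
    if a == 0.0:
        roots = [-c / b] if b != 0.0 else []
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return []
        sq = math.sqrt(disc)
        roots = [(-b + sq) / (2.0 * a), (-b - sq) / (2.0 * a)]
    # Each root is a value of W**2; only positive ones correspond to real detunings.
    return sorted(math.sqrt(x) for x in roots if x > 0.0)


# Anomalous GVD (beta2 < 0): modulation instability, sidebands move with the pump power.
for p in (1.0, 4.0):
    print("MI  ", p, sideband_detunings(-2e-26, 0.0, 0.01, p))

# Normal GVD close to the zero-dispersion point (beta2 > 0, beta4 < 0): SFWM.
for p in (0.1, 1.0):
    print("SFWM", p, sideband_detunings(1e-27, -1e-55, 0.01, p))
```

With these numbers the MI detuning doubles when the pump power is quadrupled, whereas the SFWM detuning is much larger and changes by less than a tenth of a percent for a tenfold change in power, in line with the qualitative picture above.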
In the regime of moderately pumped MI, where the signal and idler photons are
generated even closer to the pump frequency (as so-called sidebands of the pump),
the emergence of nonclassical light has a different interpretation [49]. In this case,
the pump and the signal/idler sidebands can be considered as a single strong beam.
Due to the Kerr effect, the refractive index of this beam gets a nonlinear additional
part, caused by self-phase-modulation. This additional part scales linearly with the
intensity I of light,
$$n = n_0 + n_2 I, \tag{12.21}$$
where n2 is called the nonlinear refractive index. If a coherent state enters a fiber, its
intensity is constant only in the classical picture; from the quantum-optical viewpoint
even a coherent state has shot-noise fluctuations of the amplitude (Section 11.1.2) and
photon-number/intensity fluctuations. Its Wigner function (Section 11.1.3) is Gaus-
sian, with the width of 1/2 in each quadrature. This Wigner function is schematically
shown in Fig. 12.6 with a circle. Without the Kerr nonlinearity the state would only
change its phase, i. e., the Wigner function would simply rotate in the phase space
with the angular frequency ω0 . Now we will look at the Wigner function as a real
probability distribution of the quadratures, i. e., as a set of points in the phase space.3
Due to the nonlinear change of the refractive index, points with larger amplitudes acquire
larger phase shifts than points with smaller amplitudes (dashed lines in the
figure). After a sufficiently long nonlinear medium (fiber), the Wigner-function distri-
bution stretches, as shown in Fig. 12.6 with an ellipse. Note, however, that the Kerr
effect does not change the amplitude of light, only its phase; therefore the stretching
occurs along a certain quadrature qa that is different from the amplitude. The ampli-
tude uncertainty remains the same, and the anti-squeezing of quadrature qa leads to
the squeezing of the orthogonal quadrature qs .
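This classical point-cloud picture can be reproduced in a few lines. The sketch below samples points from a Gaussian distribution centered at a real amplitude, rotates every point by a phase proportional to its squared amplitude, and compares the principal variances of the cloud before and after. The variance 1/2 per quadrature, the displacement, and the Kerr strength `kappa` are assumptions for illustration; like the picture itself, this is not a rigorous quantum calculation.

```python
import cmath
import math
import random


def kerr_cloud(n_points=20000, r0=20.0, kappa=1.0 / 800.0, variance=0.5, seed=1):
    """Return phase-space points before and after an amplitude-dependent phase shift."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    before, after = [], []
    for _ in range(n_points):
        z = complex(r0 + rng.gauss(0.0, sd), rng.gauss(0.0, sd))
        before.append(z)
        # Kerr effect in this toy model: phase shift proportional to |z|^2, modulus unchanged.
        after.append(z * cmath.exp(1j * kappa * abs(z) ** 2))
    return before, after


def principal_variances(points):
    """Smallest and largest eigenvalues of the 2x2 covariance matrix of the cloud."""
    n = len(points)
    mx = sum(z.real for z in points) / n
    my = sum(z.imag for z in points) / n
    sxx = sum((z.real - mx) ** 2 for z in points) / n
    syy = sum((z.imag - my) ** 2 for z in points) / n
    sxy = sum((z.real - mx) * (z.imag - my) for z in points) / n
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt(max(half_trace ** 2 - (sxx * syy - sxy ** 2), 0.0))
    return half_trace - root, half_trace + root


before, after = kerr_cloud()
print("input  variances:", principal_variances(before))
print("output variances:", principal_variances(after))
```

The modulus of every point is preserved, yet the cloud is sheared: its largest principal variance grows and its smallest one drops below the input value, which corresponds to the anti-squeezed and squeezed quadratures qa and qs.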
Phase matching is satisfied in this ‘Kerr squeezing’ effect automatically, because the
signal and idler sidebands have only minute spectral separation from the pump. What
is important, however, is that, for very short pulses, the interaction is reduced by the
GVD of the fiber. To mitigate this problem, it is useful to work in the soliton regime of
pulse propagation, where the spreading of a pulse is compensated by the Kerr effect [4,
49].
3 This is not a rigorous picture, just a classical interpretation (see Section 11.1.3), but it gives a good
intuition on how the Kerr nonlinearity leads to squeezing.
Third-order nonlinear interactions discussed in the previous two sections all lead to
the pair-generating Hamiltonians of the form (12.10). The difference between SFWM,
MI, and Kerr squeezing is only that in the first case, the frequencies ωs , ωi of the signal
and idler photons are well separated from the pump frequency, in the second case,
very little separated, and in the last case, not separated at all. This does not leave much
freedom for engineering the Hamiltonian and the resulting quantum states, especially
from the polarization viewpoint. The signal and idler photons are typically emitted
into the same polarization mode as the pump.
SFWM also enables the interaction where both photons are emitted into the same
mode, i. e., where the Hamiltonian has the form (12.11). Then, for the signal/idler mode
not to coincide with the one of the strong pump, two pump beams are used instead of
one. These pump beams are separated in frequency or, in the case of SFWM in atoms,
in wavevector.
Of course, if the signal and idler photons have frequencies sufficiently far apart,
their polarization states can be transformed independently and, for instance, orthog-
onally polarized pairs can be produced, like in the case of Hamiltonian (12.12). But the
signal and idler photons will still differ in frequency, which is a restriction in some
cases.
The solution to this engineering problem is similar to the one where two SPDC
crystals are used. This time (Fig. 12.7(a)), two fibers are placed one after another into
the same pump beam (green in the figure), each of them generating signal and idler
photons in sidebands A, B (shown with blue and yellow colors) through some third-
order process, for instance, SFWM. If the pump is polarized vertically, signal and idler
photons at the output of the first fiber are also polarized vertically. Further, a HWP
oriented at 45∘ rotates the polarization of signal and idler photons by 90∘ but does not
change the pump polarization. Then the second fiber will also generate signal and
idler photons polarized vertically. At the output of the second fiber, there will be verti-
cally polarized pairs from the second fiber, but also horizontally polarized pairs from
the first fiber. The total Hamiltonian will have the form (12.15). The only difference from
the SPDC case shown in Fig. 12.4 is that there beams A and B have different directions,
and now they have the same direction but differ in frequencies. Alternatively, the HWP
can only change the pump polarization and maintain the polarization states of the sig-
nal and idler photons; then the pairs from the first fiber will be polarized vertically and
from the second fiber, horizontally.
This strategy is part of a more general principle of ‘nonlinear interferometry’: if
two nonlinear processes are pumped coherently, there is interference between them.
Nonlinear effects within the same polarization mode can enhance or suppress each
other, but fields emitted into orthogonal polarization modes can form new superposi-
tions and therefore new polarization states [12].
Figure 12.7: Interferometric schemes to generate photon pairs in optical fibers: two fibers placed
into a common pump beam (a), the use of fast and slow axes of a polarization-maintaining fiber (b),
and the Sagnac interferometer (c).
Usually, selective polarization rotation for only signal and idler, or only for the pump,
is difficult—and impossible if signal and idler frequencies coincide with the pump one.
To solve this problem in the case of Kerr squeezing, Heersink et al. [20] used, instead
of two different fibers, a single polarization-maintaining fiber (Section 9.5.1). Because
such a fiber suppresses the cross-talk between fields polarized along the ‘fast’ and
‘slow’ axes, nonlinear interaction occurs for both polarizations independently. With
the pump polarized diagonally (Fig. 12.7(b)) and slow (s) and fast (f) axes correspond-
ing to V and H polarizations, the fiber produced both vertically and horizontally po-
larized photon pairs (Fig. 12.7(b)). The group delay between such pairs had to be pre-
compensated [20].
Another solution is to use an interferometer, for instance, a Sagnac interferome-
ter shown in Fig. 12.7(c). In a very elegant way, this strategy was applied to SFWM [50].
A diagonally polarized pump was split at a polarizing beamsplitter, and the horizon-
tally and vertically polarized beams propagated in the Sagnac loop clockwise and an-
ticlockwise, respectively. Correspondingly, signal and idler photons from horizontally
polarized pump, also horizontally polarized, were reflected into the output port of the
Sagnac interferometer, and overlapped with the vertically polarized pairs, which were
transmitted. Again, Hamiltonian (12.15) was realized. An unbalanced Sagnac interfer-
ometer was earlier applied to Kerr squeezing [45, 48], in a more complicated scheme
that we will not discuss here, but later replaced by the same group with a scheme
based on a polarization-maintaining fiber (Fig. 12.7(b)).
find the quantum state by using the Schrödinger approach in combination with the
perturbation theory.
In the Schrödinger approach, the state |Ψ(t)⟩ varies with time due to the action of
the Hamiltonian, and the operators are considered to be time-independent. The initial
state of the modes where signal and idler photons are generated is the vacuum state,
because there are no signal or idler photons at the input of the crystal. The evolution of
the state is then described by the Schrödinger equation with one of the Hamiltonians
ℋ̂ we derived in the previous sections:
$$i\hbar\,\frac{d|\Psi(t)\rangle}{dt} = \hat{\mathcal{H}}\,|\Psi(t)\rangle. \tag{12.22}$$
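Its formal solution is
$$|\Psi(t)\rangle = \exp\left(-\frac{i}{\hbar}\int_0^{t}\hat{\mathcal{H}}\,dt'\right)|\Psi(0)\rangle, \tag{12.23}$$
where the initial state |Ψ(0)⟩ is the vacuum. Because the Hamiltonians we derived in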
Section 12.1 are time-independent, the integration in Eq. (12.23) boils down to the mul-
tiplication by the integration time (time ti of the interaction). Then the magnitude of
the exponent in Eq. (12.23) is given by Γti . This parameter is the key characteristic of the
‘strength’ of pair-producing interaction, SPDC or SFWM, and will be further denoted
as G = Γti and called the parametric gain.
If the interaction is weak, G ≪ 1, the exponential in Eq. (12.23) can be expanded
into a Taylor series, with only the first two terms kept, which yields
$$|\Psi\rangle = |0\rangle - \frac{i t_i}{\hbar}\,\hat{\mathcal{H}}\,|0\rangle, \tag{12.24}$$
where |0⟩ is the vacuum state for all involved modes: a single mode in the case of
degenerate collinear SPDC, polarization modes H and V for collinear type-II SPDC,
and four modes AH, AV, BH, BV for the processes shown in Figs. 12.3, 12.4, and 12.7.
The state (12.24) is a superposition of the vacuum state and a state formed by ac-
tion of two photon creation operators on the vacuum state. Depending on whether the
photon creation operators belong to the same mode or different modes, the second
term in Eq. (12.24) describes either a two-photon Fock state or a product of single-
photon Fock states in two orthogonal polarization modes. If the Hamiltonian in-
volves photon creation operators in four modes, as in Eq. (12.13), the resulting state
is polarization-entangled and will be considered in Section 12.3.3. But in all cases the
state (12.24) describes a superposition of the vacuum and a photon pair, a biphoton.
Even in the simplest case where SPDC produces photon pairs into the same mode, the
state at the output of the nonlinear crystal,
$$|\Psi\rangle = |0\rangle + G\,[a^\dagger]^2|0\rangle = |0\rangle + \sqrt{2}\,G\,|2\rangle, \tag{12.25}$$
is nonclassical. For instance, its bunching parameter $g^{(2)} = 1/(4|G|^2)$ is very high at
|G| ≪ 1, and condition (11.36) for n = 2 is satisfied even if higher-order terms in the
expansion (12.25) are taken into account and calculation gives a nonzero third-order
correlation function $g^{(3)}$. This extreme bunching can be observed by sending the state
into a Hanbury Brown–Twiss setup (Fig. 11.2): the vacuum part of the state will not
affect the experiment, but the photon pairs, with a probability 50 %, will be split on
the beamsplitter, and the detectors will ‘click’ simultaneously in each such case. If the
photon flux is low, accidental coincidences will be very few, much fewer than those
caused by photon pairs, and the resulting bunching parameter will be high.
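The 50 % splitting probability follows from the fact that n photons entering one port of a lossless beamsplitter (with vacuum in the other port) are distributed over the two outputs binomially. A minimal sketch, with an assumed function name and an adjustable intensity transmission:

```python
from math import comb


def partition_probabilities(n_photons: int, transmission: float = 0.5):
    """Probabilities of finding k = 0..n photons in the transmitted port.

    Assumes an n-photon Fock state in one input port, vacuum in the other,
    and a lossless beamsplitter with the given intensity transmission.
    """
    reflection = 1.0 - transmission
    return [
        comb(n_photons, k) * transmission ** k * reflection ** (n_photons - k)
        for k in range(n_photons + 1)
    ]


probabilities = partition_probabilities(2)
print("both reflected, split, both transmitted:", probabilities)
```

For a photon pair on a 50 % beamsplitter the middle entry, the probability of one photon in each output, equals 1/2; only these events produce coincidences of the two detectors.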
Condition (11.37) is also satisfied for state (12.25), for even m. To see this, we can
continue the Taylor expansion in Eq. (12.25) and obtain the result that it contains no
three-photon Fock state but only a four-photon one, and so on. The state produced by
frequency-degenerate collinear type-I SPDC is for this reason sometimes called ‘light
with even photon numbers’ [30]. But from the viewpoint of polarization properties,
this state of light is not very interesting, and we will pass now to the state generated
via frequency-degenerate collinear type-II SPDC.
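The statement about even photon numbers can be checked by continuing the expansion numerically. For the Hamiltonian (12.11) with a real gain G, the evolution operator is exp[G([a†]² − a²)]; the sketch below applies its Taylor series to the vacuum in a truncated Fock basis. The truncation dimension, the number of series terms, and the value of G are assumptions chosen so that the truncation error is negligible.

```python
import math


def apply_generator(vec, gain):
    """Apply gain * ([a†]^2 - a^2) to a vector of Fock-state amplitudes (truncated basis)."""
    dim = len(vec)
    out = [0.0] * dim
    for n, amp in enumerate(vec):
        if amp == 0.0:
            continue
        if n + 2 < dim:
            out[n + 2] += gain * math.sqrt((n + 1) * (n + 2)) * amp  # [a†]^2 |n>
        if n >= 2:
            out[n - 2] -= gain * math.sqrt(n * (n - 1)) * amp        # a^2 |n>
    return out


def evolve_vacuum(gain, dim=30, terms=40):
    """Amplitudes of exp(gain*([a†]^2 - a^2)) |0>, summed term by term."""
    state = [0.0] * dim
    term = [0.0] * dim
    term[0] = 1.0
    for k in range(terms):
        state = [s + t for s, t in zip(state, term)]
        term = [x / (k + 1) for x in apply_generator(term, gain)]
    return state


amplitudes = evolve_vacuum(0.05)
for n in range(6):
    print(n, amplitudes[n])
```

All odd-photon-number amplitudes vanish identically, and for small G the two-photon amplitude is close to √2·G, as in Eq. (12.25).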
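This state has the form
$$|\Psi\rangle = |0\rangle + G\,a_H^\dagger a_V^\dagger|0\rangle = |0\rangle + G\,|1\rangle_H|1\rangle_V. \tag{12.26}$$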
It also consists of biphotons, but now the photons within one biphoton are or-
thogonally polarized. Their correlation can be measured in a modified Hanbury
Brown–Twiss setup, where a non-polarizing beamsplitter is replaced by a polariz-
ing one (Fig. 12.8). The two detectors will always ‘click’ simultaneously in this case,
as long as there are no losses. This way, one can measure the second-order cross-correlation
function (11.26) for the H and V polarization modes, and the result will be
$g^{(2)}_{H,V}(0) = 1/|G|^2 \gg 1$.
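The value of the cross-correlation function follows from simple counting. The sketch below assumes a state consisting of the vacuum plus a single H–V photon pair with amplitude G, no losses, and ideal detectors; it keeps the normalization factor that is dropped in the low-gain formula. The function name is an assumption.

```python
def cross_correlation(gain: complex) -> float:
    """g2 between the H and V modes for the (unnormalized) state |0> + gain * |1>_H |1>_V."""
    norm = 1.0 + abs(gain) ** 2
    p_pair = abs(gain) ** 2 / norm  # probability that a pair is present
    mean_h = p_pair                 # <n_H>: one H photon per pair
    mean_v = p_pair                 # <n_V>: one V photon per pair
    coincidences = p_pair           # <n_H n_V>: both photons always come together
    return coincidences / (mean_h * mean_v)


for g in (0.3, 0.1, 0.01):
    print(g, cross_correlation(g), 1.0 / g ** 2)
```

The exact value for this state is (1 + |G|²)/|G|², which approaches 1/|G|² at low gain.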
This type of biphotons has a very important application: by using one of the detec-
tors in Fig. 12.8 as a trigger, one can produce single-photon states in the other channel.
For instance, if the detector registering a vertically polarized photon ‘clicks’, we know
for sure that there is a horizontally polarized photon in the ‘transmitted’ path of the
polarization prism, and we can further use it. For instance, we can open a gate in the
transmitted arm if there is a photon in the reflected arm. The probability that a second
pair is accompanying the first one is very low if the parametric gain is low; therefore
only a single photon will pass through the gate, and the state in the transmitted arm
will be very close to the Fock state |1⟩. This method is called heralded preparation of
single photons, and it is used, for instance, for quantum key distribution. It does not
necessarily require type-II SPDC: it is only necessary that photon pairs are emitted into
two distinguishable modes. For instance, in the pioneering experiment by Hong and
Mandel [21], the two photons were emitted in two different directions. But the polar-
ization version is technically simpler.
A very unusual feature of both the state (12.26) and its two-photon part |1⟩H |1⟩V is
that despite being pure, it is completely unpolarized. Indeed, calculation of the Stokes
parameters for this state yields ⟨S1⟩ = ⟨S2⟩ = ⟨S3⟩ = 0 and ⟨S0⟩ = 2|G|². This looks
strange at first sight, but we need to recall that two orthogonally polarized photons
(or, generally, electric fields) will interfere and give a polarized state only if they are
coherent. Meanwhile, the two photons generated via SPDC are not coherent with each
other—although they are correlated. Figure 12.9 illustrates this feature: it shows the
pair of orthogonally polarized photons |1⟩H |1⟩V on the Poincaré sphere. If there is no
coherence between the two photons, their Stokes vectors should be summed geomet-
rically and yield zero.4 Therefore the degree of polarization of state (12.26) is zero. In
Section 12.5 we will see that this state and other similar states feature what is called
‘hidden polarization’, and how this behavior suggests alternative definitions for the
degree of polarization (Section 12.5.3).
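The vanishing Stokes parameters can be verified by direct calculation in a small two-mode Fock basis. In the sketch below a state is a dictionary mapping photon numbers (nH, nV) to amplitudes, and the Stokes operators are taken as S0 = nH + nV, S1 = nH − nV, S2 = a†H aV + a†V aH, S3 = −i(a†H aV − a†V aH). The sign convention for S3 and all function names are assumptions of this illustration.

```python
import math


def lower(state, mode):
    """Apply the annihilation operator of mode 0 (H) or 1 (V)."""
    out = {}
    for key, amp in state.items():
        n = key[mode]
        if n == 0:
            continue
        new = (key[0] - 1, key[1]) if mode == 0 else (key[0], key[1] - 1)
        out[new] = out.get(new, 0.0) + math.sqrt(n) * amp
    return out


def lift(state, mode):
    """Apply the creation operator of mode 0 (H) or 1 (V)."""
    out = {}
    for key, amp in state.items():
        n = key[mode]
        new = (key[0] + 1, key[1]) if mode == 0 else (key[0], key[1] + 1)
        out[new] = out.get(new, 0.0) + math.sqrt(n + 1) * amp
    return out


def inner(bra, ket):
    """Scalar product <bra|ket>."""
    return sum(amp.conjugate() * ket.get(key, 0.0) for key, amp in bra.items())


def stokes(state):
    """Mean Stokes parameters (S0, S1, S2, S3) of a possibly unnormalized state."""
    norm = inner(state, state).real
    n_h = inner(state, lift(lower(state, 0), 0)).real
    n_v = inner(state, lift(lower(state, 1), 1)).real
    hv = inner(state, lift(lower(state, 1), 0))  # <a_H† a_V>
    return [x / norm for x in (n_h + n_v, n_h - n_v, 2.0 * hv.real, 2.0 * hv.imag)]


gain = 0.1  # assumed low parametric gain
print(stokes({(0, 0): 1.0, (1, 1): gain}))                        # vacuum plus H-V pair
print(stokes({(1, 0): 1 / math.sqrt(2), (0, 1): 1 / math.sqrt(2)}))  # one diagonal photon
```

For the pair state only S0 is nonzero, whereas a single photon in a coherent superposition of H and V, shown for comparison, is fully polarized with S2 = S0.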
Figure 12.9: The state |1⟩H |1⟩V represented by two points on the
Poincaré sphere.
4 Using two SPDC sources in a configuration as in Fig. 12.4, together with polarization transformations
at the input and output, one can prepare biphotons in a single frequency and wavevector mode with
an arbitrary degree of polarization [9].
A question arises whether the state |1⟩H |1⟩V is an entangled state. As written here,
it is a factorable state because it is a product of two single-photon Fock states in two
orthogonal polarization modes. Indeed, if the quantum states are understood as states
populating certain modes of radiation, the state |1⟩H |1⟩V is perfectly factorable. There
is, however, another attitude [15]: if the two photons of the state are ‘labeled’, say, as
‘s’ and ‘i’ (signal and idler), then the two-photon state should be ‘symmetrized’ with
respect to the photon exchange and written as
$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|H\rangle_s |V\rangle_i + |V\rangle_s |H\rangle_i\right). \tag{12.27}$$
Equation (12.27) means that either the signal photon is horizontally polarized and the
idler photon, vertically polarized, or vice versa.⁵ This type of entanglement is similar to
another debatable case, namely of the entanglement of a single photon in two arms of
a Mach–Zehnder interferometer [35] or a single photon in two polarization modes [36].
Regardless of whether to consider the state (12.26) as entangled or factorable, it
can be projected on an entangled state [6]. Indeed, let us split it on a non-polarizing
beamsplitter, and label as ‘s’ and ‘i’ the photons in different arms. If we ignore the pairs
that are directed into the same arm (transmitted or reflected) of the beamsplitter, the
rest of the state will be written as in Eq. (12.27) and be entangled, because it is not
factorable into some states of photons ‘s’ and ‘i’, or modes ‘s’ and ‘i’.
The state (12.26) manifests one of the most intriguing effects in quantum optics,
namely the Hong–Ou–Mandel dip; however, in its polarization version. In the ‘stan-
dard’ version of the Hong–Ou–Mandel effect [22], two photons are overlapped on a
50 % beamsplitter (Fig. 12.10(a)). If the photons are indistinguishable in all parame-
ters, namely frequency, wavevector direction, polarization, and perfectly overlapped
in time and space, then they are never directed into two different output ports of the
beamsplitter. They are always in the same output port, and a pair of detectors placed
in different output ports will never ‘click’ simultaneously. If the time delay t between
the arrivals of the two photons on the beamsplitter is larger than the coherence time of
the photons, they ‘do not notice each other’ and are independently split on the beam-
splitter; the detectors then ‘click’ in coincidence with the probability 50 %, i. e., very
often. This forms a ‘dip’ in the rate Rc of coincidence counting.
In the polarization version of the Hong–Ou–Mandel effect [44, 46, 47]
(Fig. 12.10(b)), the two-photon part of state (12.26) is transformed by a HWP oriented
at 22.5° into |1⟩D |1⟩A, a pair of diagonally and anti-diagonally polarized photons, and
then sent to a polarizing beamsplitter. The beamsplitter is oriented so that it transmits
a horizontally polarized photon and reflects a vertically polarized photon. Then each
of the photons of the |1⟩D |1⟩A pair has a 50 % chance to be reflected or transmitted. But
due to the interference, the pair always goes into the same port, and there are no
coincidences between the photocounts of two detectors after the polarizing beamsplitter
in Fig. 12.10(b). We will now derive this result rigorously, highlighting the similarity
with the ‘standard’ HOM effect.
⁵ Actually, in quantum mechanics a superposition state means not ‘either–or’ but ‘both’, similar to
how a single photon in the Young experiment can pass, in principle, through two slits at the same time
and interfere with itself.
Figure 12.10: The ‘standard’ Hong–Ou–Mandel effect (a) and its polarization version (b).
As mentioned in Chapter 3, a polarization transformation is similar to a two-mode
beamsplitter transformation, and it is described by a similar Jones matrix. In particular,
a HWP (Chapter 5) placed at an angle of 22.5° performs the same transformation on
polarization modes as a 50 % beamsplitter on spatial modes, and its Jones matrix is
(see Eq. (5.11))
$$J_{\mathrm{HWP}} = \frac{i}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{12.28}$$
Let us find the state after the HWP. We will do it by writing the input state as
a†_H a†_V |0⟩, then transforming the photon annihilation operators a_H and a_V with the
Jones matrix (12.28), and Hermitian conjugating them to obtain photon creation operators.
For the new (primed) operators in the H, V modes we get
$$a'^{\dagger}_H = -\frac{i}{\sqrt{2}}\left(a^\dagger_H + a^\dagger_V\right), \qquad a'^{\dagger}_V = -\frac{i}{\sqrt{2}}\left(a^\dagger_H - a^\dagger_V\right). \tag{12.29}$$
Inverting this transformation to obtain a†_H, a†_V in terms of the new (primed) operators
a′†_H, a′†_V and substituting these expressions into the two-photon part of the state
(12.26), after the HWP we get a biphoton of the form
$$|\Psi\rangle^{(2)} = \frac{1}{2}\left([a'^{\dagger}_V]^2 - [a'^{\dagger}_H]^2\right)|0\rangle = \frac{1}{\sqrt{2}}\left(|2\rangle_V - |2\rangle_H\right), \tag{12.30}$$
shown in Fig. 12.10(b). Of course, this effect is only possible if the input horizontally
and vertically polarized photons are indistinguishable; in reality, there is always a
time delay between the orthogonally polarized photons at the output of a type-II SPDC
process. This delay is caused by the difference between the group velocities of the or-
dinary and extraordinary waves in the nonlinear crystal and can be compensated in
experiment by birefringent plates inserted after it [47].
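The operator algebra leading to the two-photon state behind the HWP can be reproduced in a few lines. This is a sketch, not the authors' derivation: it assumes that the annihilation operators transform as a′ = J a with a unitary Jones matrix J, so that the old creation operators are a†_j = Σ_k J_kj a′†_k, and the function name `two_photon_output` is hypothetical.

```python
import numpy as np

def two_photon_output(J):
    """Amplitudes of |2>_H, |1>_H |1>_V and |2>_V after the input a_H^dag a_V^dag |0>
    passes an element with unitary Jones matrix J (assumed rule: a' = J a)."""
    J = np.asarray(J, dtype=complex)
    # a_H^dag = J[0,0] a'_H^dag + J[1,0] a'_V^dag
    # a_V^dag = J[0,1] a'_H^dag + J[1,1] a'_V^dag
    c_HH = J[0, 0] * J[0, 1]                        # coefficient of (a'_H^dag)^2
    c_VV = J[1, 0] * J[1, 1]                        # coefficient of (a'_V^dag)^2
    c_HV = J[0, 0] * J[1, 1] + J[1, 0] * J[0, 1]    # coefficient of a'_H^dag a'_V^dag
    # (a^dag)^2 |0> = sqrt(2) |2>, while a'_H^dag a'_V^dag |0> = |1>_H |1>_V
    return np.sqrt(2) * c_HH, c_HV, np.sqrt(2) * c_VV

J_hwp = 1j / np.sqrt(2) * np.array([[1, 1], [1, -1]])   # HWP at 22.5 degrees, Eq. (12.28)
amp_2H, amp_11, amp_2V = two_photon_output(J_hwp)
```

For the matrix (12.28) the coincidence amplitude `amp_11` vanishes, while `amp_2H` and `amp_2V` come out as −1/√2 and +1/√2, i.e. both photons leave through the same port of the polarizing beamsplitter.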
Consider now the most complicated Hamiltonian realized through SPDC, SFWM, or
MI, namely the one of Eq. (12.13), corresponding to the situation shown in Figs. 12.3,
12.4, and 12.7. As everywhere in Section 12.3, here we assume that the parametric gain,
equal to the product of Γ and the interaction time ti , is small. Then the generated state
will be, as in all other cases of this section, a superposition of the vacuum and a two-
photon state. Its two-photon part, for instance, in the case of Fig. 12.3, is
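$$|\Psi\rangle^{(2)} = \frac{1}{\sqrt{2}}\left(|H\rangle_A |V\rangle_B + e^{i\phi}|V\rangle_A |H\rangle_B\right). \tag{12.31}$$
For ϕ = 0 and ϕ = π, this state takes the form
$$|\Psi^{(+)}\rangle \equiv \frac{1}{\sqrt{2}}\left(|H\rangle_A |V\rangle_B + |V\rangle_A |H\rangle_B\right), \tag{12.32}$$
$$|\Psi^{(-)}\rangle \equiv \frac{1}{\sqrt{2}}\left(|H\rangle_A |V\rangle_B - |V\rangle_A |H\rangle_B\right). \tag{12.33}$$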
The two states (12.32) and (12.33) are polarization-entangled biphotons with photons
in a pair polarized orthogonally.
These states can be easily converted into biphotons where both photons in a pair
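are in the same polarization states. This can be done by placing a HWP oriented at 45°
into arm B, thus rotating the polarization of photon B by 90°. This gives the states
$$|\Phi^{(+)}\rangle \equiv \frac{1}{\sqrt{2}}\left(|H\rangle_A |H\rangle_B + |V\rangle_A |V\rangle_B\right), \tag{12.34}$$
$$|\Phi^{(-)}\rangle \equiv \frac{1}{\sqrt{2}}\left(|H\rangle_A |H\rangle_B - |V\rangle_A |V\rangle_B\right). \tag{12.35}$$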
Equations (12.32)–(12.35) describe the so-called Bell states, the most famous example
of entangled photons. They are named so due to their role in the tests of Bell’s
inequalities, which will be the subject of Chapter 13. The states (12.32)–(12.35) are
maximally polarization-entangled and orthogonal to each other.
One of the distinguishing features of an entangled state of two subsystems A, B is
that taken separately, each subsystem is in a mixed state [43]. The polarization analog
of a mixed state is an unpolarized state. Accordingly, for each of the photons A, B the
degree of polarization is zero. This can be verified by calculating the Stokes parameters
for photons A and B in the Bell states. In experiment, if a Stokes measurement setup
(Fig. 11.4(b)) is placed in each path, A and B in Fig. 12.3, at any settings of the HWP and
QWP the upper and lower detectors will ‘click’ equally frequently. However, there will
be correlations between their ‘clicks’. For instance, if the state at the input is |Φ(−)⟩ and
all phase plates are oriented at 0° (the first Stokes observable is measured), the upper
and lower detectors in paths A and B will always register photons simultaneously: if
photon A is detected with the horizontal polarization, its match photon B also has
horizontal polarization.
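Both statements, zero degree of polarization for each photon and perfect correlation between them, are easy to verify numerically. The sketch below is illustrative: the mapping of the Stokes parameters S1, S2, S3 onto the Pauli matrices σz, σx, σy is an assumed convention, and the helper names (`reduced_state_A`, `degree_of_polarization`, `joint_probability`) are hypothetical.

```python
import numpy as np

H = np.array([1, 0], dtype=complex)
V = np.array([0, 1], dtype=complex)
s = 1 / np.sqrt(2)
bell = {
    "Psi+": s * (np.kron(H, V) + np.kron(V, H)),
    "Psi-": s * (np.kron(H, V) - np.kron(V, H)),
    "Phi+": s * (np.kron(H, H) + np.kron(V, V)),
    "Phi-": s * (np.kron(H, H) - np.kron(V, V)),
}
# Assumed convention: S1 <-> sigma_z (H/V), S2 <-> sigma_x (D/A), S3 <-> sigma_y (R/L)
pauli = [np.array([[1, 0], [0, -1]]), np.array([[0, 1], [1, 0]]),
         np.array([[0, -1j], [1j, 0]])]

def reduced_state_A(psi):
    m = psi.reshape(2, 2)          # m[a, b] is the amplitude of |a>_A |b>_B
    return m @ m.conj().T          # partial trace over photon B

def degree_of_polarization(rho):
    stokes = [np.real(np.trace(rho @ p)) for p in pauli]
    return float(np.sqrt(sum(x * x for x in stokes)) / np.real(np.trace(rho)))

def joint_probability(psi, pol_A, pol_B):
    """Probability to find photon A in pol_A and photon B in pol_B."""
    return float(abs(np.vdot(np.kron(pol_A, pol_B), psi)) ** 2)

dop = {name: degree_of_polarization(reduced_state_A(psi)) for name, psi in bell.items()}
```

All four entries of `dop` are zero, while for |Φ(−)⟩ the joint probabilities for HH and VV are 1/2 each and those for HV and VH vanish, which is the correlation of ‘clicks’ described above.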
Because it is antisymmetric with respect to the exchange of the photons, the state |Ψ(−)⟩
is called the singlet state, while the other three Bell states |Φ(+)⟩, |Φ(−)⟩, and |Ψ(+)⟩,
which are symmetric, are said to form a triplet. The singlet state has a remarkable property: it maintains its form
in any polarization basis [17, 41]. For instance, in the diagonal and circular bases it
remains a pair of orthogonally polarized photons:
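$$|\Psi^{(-)}\rangle = \frac{1}{\sqrt{2}}\left(|D\rangle_A |A\rangle_B - |A\rangle_A |D\rangle_B\right) = \frac{1}{\sqrt{2}}\left(|R\rangle_A |L\rangle_B - |L\rangle_A |R\rangle_B\right). \tag{12.36}$$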
The other Bell states can be obtained from this one by means of local (i. e., in only
one beam) or global (in both beams) polarization transformations.
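The invariance of the singlet state, and the way a local transformation turns it into another Bell state, can be illustrated numerically. This is a sketch under assumptions: the same unit-determinant unitary U is applied to both photons as U⊗U, a diagonal matrix diag(1, −1) stands in for a phase plate in arm B up to an overall phase, and the names `random_su2` and `overlap_after_rotation` are hypothetical.

```python
import numpy as np

H = np.array([1, 0], dtype=complex)
V = np.array([0, 1], dtype=complex)
s = 1 / np.sqrt(2)
singlet = s * (np.kron(H, V) - np.kron(V, H))    # |Psi(-)>
psi_plus = s * (np.kron(H, V) + np.kron(V, H))   # |Psi(+)>

def random_su2(rng):
    """A random 2x2 unitary with unit determinant."""
    z = rng.normal(size=2) + 1j * rng.normal(size=2)
    al, be = z / np.linalg.norm(z)
    return np.array([[al, -np.conj(be)], [be, np.conj(al)]])

def overlap_after_rotation(state, U):
    """|<state| U (x) U |state>|; equals 1 if the state keeps its form."""
    return float(abs(np.vdot(state, np.kron(U, U) @ state)))

rot45 = s * np.array([[1, -1], [1, 1]], dtype=complex)    # basis rotation by 45 degrees
flip_B = np.kron(np.eye(2), np.diag([1.0, -1.0]))         # local transformation in arm B only
local_overlap = float(abs(np.vdot(psi_plus, flip_B @ singlet)))
```

The overlap stays equal to 1 for the singlet under any `random_su2` matrix, whereas |Ψ(+)⟩ rotated by `rot45` becomes orthogonal to itself; `local_overlap` equals 1, showing that a transformation in one beam maps |Ψ(−)⟩ onto |Ψ(+)⟩ up to a phase.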
Instead of finding the quantum state at the output of the nonlinear source, here we
will look at the operators, such as photon creation, annihilation, and photon-number
operators in both polarization modes. In the end, we are only interested in observable
quantities [32], and these are mean photon numbers or various statistical moments
(variances, correlation functions, etc.). For this purpose, the Heisenberg picture is at
least as suitable as the Schrödinger one; moreover, it has the advantage of treating
operators similarly to fields in classical optics; it is therefore more intuitive [34].
The time evolution of an operator  is governed by the Heisenberg equation
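$$i\hbar\frac{d\hat A}{dt} = [\hat A, \hat{\mathcal H}], \tag{12.37}$$
which, for the single-mode Hamiltonian (12.11), gives for the annihilation operator a
$$\frac{da}{dt} = 2\Gamma a^\dagger, \tag{12.38}$$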
and the equation for a† is obtained by Hermitian conjugation. The solution for the
operator a is [33, 53]
$$a(t) = a(0)\cosh(2\Gamma t) + a^\dagger(0)\sinh(2\Gamma t), \tag{12.39}$$
where a(0) is the initial operator, i. e., the operator before the evolution imposed by the
Hamiltonian. As in the previous section, we define the interaction time ti and denote
Γti ≡ G, the parametric gain. Equation (12.39) is called the Bogolyubov transformation.
For the quadrature operators (11.9) q̂, p̂, the Bogolyubov transformation leads
to [53]
$$\hat q(t) = e^{2G}\hat q(0), \qquad \hat p(t) = e^{-2G}\hat p(0). \tag{12.40}$$
These transformations mean that the quadratures evolve along hyperbolas q(t)p(t) =
const, which is shown in Fig. 12.11 [11]. As in Section 12.2.2, we use the Wigner-function
Because the initial state is the vacuum, Var[q(0)] = Var[p(0)] = 1/4, the final state
shows quadrature squeezing:
$$\Delta p(t) = \frac{e^{-2G}}{2} < \frac{1}{2} < \Delta q(t) = \frac{e^{2G}}{2}. \tag{12.42}$$
From the Bogolyubov transformation (12.39), we can find the mean photon num-
ber in the squeezed vacuum state. It is found by averaging the output photon-number
operator over the input (vacuum) state:
$$\langle \hat N\rangle \equiv \langle 0|a^\dagger(t)\,a(t)|0\rangle. \tag{12.43}$$
The initial (vacuum) photon annihilation operator a(0) yields zero after acting on the
vacuum; accounting for this and using commutation relations in (12.43), we obtain
$$\langle \hat N\rangle = \sinh^2(2G). \tag{12.44}$$
We see that at the output of SPDC, the mean photon number can be very large if the
parametric gain G is high. For instance, in experiments with strongly pumped SPDC
(usually the pump is pulsed), values of G as high as 8 can be obtained, leading to
mean numbers of photons on the order of 10¹³ [11]. This regime of SPDC is known
as high-gain parametric down-conversion (PDC), and the state at the output is called
bright squeezed vacuum. The term ‘bright’ here means that the number of photons per
radiation mode is high—as high as in laser radiation [11].
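The orders of magnitude quoted here follow directly from Eqs. (12.42) and (12.44). The short sketch below just evaluates these formulas; the function names are hypothetical and nothing beyond the two equations is assumed.

```python
import math

def mean_photon_number(G):
    """Mean photon number of single-mode squeezed vacuum, Eq. (12.44)."""
    return math.sinh(2 * G) ** 2

def quadrature_uncertainties(G):
    """(Delta p, Delta q) for vacuum input, Eq. (12.42)."""
    return math.exp(-2 * G) / 2, math.exp(2 * G) / 2

N_high = mean_photon_number(8)          # about 2e13 photons per mode
N_low = mean_photon_number(0.01)        # about (2G)^2 = 4e-4 at low gain
dp, dq = quadrature_uncertainties(8)
```

For G = 8 the mean photon number is indeed on the order of 10¹³, while the product Δp·Δq stays at the vacuum value 1/4: squeezing one quadrature is paid for by anti-squeezing the other.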
We will now consider the Hamiltonians that involve both polarization modes: the
type-II SPDC Hamiltonian (12.12) and the ‘entangling’ Hamiltonian (12.13).
In the case of type-II SPDC, the Heisenberg equations for the annihilation opera-
tors in both polarization modes are
$$\frac{da_H}{dt} = \Gamma a_V^\dagger, \qquad \frac{da_V}{dt} = \Gamma a_H^\dagger. \tag{12.45}$$
The solution to these equations, as one can verify by substitution, is given by the
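two-mode Bogolyubov transformations,
$$a_H(t) = a_H(0)\cosh(\Gamma t) + a_V^\dagger(0)\sinh(\Gamma t), \qquad a_V(t) = a_V(0)\cosh(\Gamma t) + a_H^\dagger(0)\sinh(\Gamma t). \tag{12.46}$$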
Here, aH (0) and aV (0) are initial operators, i. e., operators before the evolution im-
posed by the SPDC Hamiltonian. As in the previous case, we define the interaction
time ti and denote Γti ≡ G, the parametric gain.
From the Bogolyubov transformations (12.46), we can find various parameters of
the SPDC radiation, similar to the case of Hamiltonian (12.11).
The mean numbers of photons in the horizontal and vertical polarization modes
are found as in the previous case, and are
$$\langle \hat N_H\rangle = \langle \hat N_V\rangle = \sinh^2 G. \tag{12.47}$$
We obtain the result that the mean photon numbers in the horizontal and vertical
polarization modes are the same. This is not surprising because the Hamiltonian (12.12)
is symmetric with respect to the interchange of these modes. Moreover,
Eq. (12.47) shows that these photon numbers can be very large: the parametric gain
is half of that in the case of degenerate collinear high-gain PDC, but it still can
be very high. But the most remarkable feature follows from the fact that type-II SPDC
always generates photons in pairs, so that every time a photon appears in mode H, its
twin appears in mode V. This leads to the effect of polarization squeezing, which will
be the subject of the next section.
From ⟨NH ⟩ = ⟨NV ⟩, it follows that the mean value of the first Stokes operator is zero,
⟨S1 ⟩ = 0. By calculating the other Stokes parameters, one can also verify that ⟨S2 ⟩ =
⟨S3 ⟩ = 0. But the most surprising and, in fact, nonclassical feature of the radiation
produced through high-gain type-II PDC is that the variance of the first Stokes observ-
able is zero as well. Indeed, fluctuations of the photon numbers in modes H and V
are perfectly correlated, because photons appear in these two modes simultaneously.
Therefore, the difference of these photon numbers, i. e., the first Stokes observable S1 ,
does not fluctuate.
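According to quantum mechanics, the first Stokes observable does not fluctuate
simply because the operator Ŝ1 commutes with Hamiltonian (12.12):
$$[\hat S_1, \hat{\mathcal H}_{II}] = 0. \tag{12.48}$$
Since the variance of Ŝ1 is zero in the initial vacuum state, it remains zero,
$$\mathrm{Var}(\hat S_1) = 0. \tag{12.49}$$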
This perfectly noiseless behavior can be only observed in the absence of losses and
under the condition of perfect detection efficiency. In reality this is not the case. If
the detection efficiency, including all losses on the way from the generation to the
detection, is η for both H and V polarization modes, then the variance of the first Stokes
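observable is [11]
$$\mathrm{Var}(\hat S_1) = (1-\eta)\langle \hat S_0\rangle, \tag{12.50}$$
where ⟨S0⟩ is the mean detected photon number.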
But even with non-unity detection efficiency, the variance of the first Stokes
observable is smaller than ⟨S0⟩, i. e., the total number of photons. Meanwhile, for a
coherent state the variances of all Stokes observables are equal to ⟨S0⟩, which can be
considered as the shot-noise limit for polarization noise. This effect, when fluctuations
in one of the Stokes observables are reduced below the shot-noise limit,
$$\mathrm{Var}(\hat S_i) < \langle \hat S_0\rangle, \tag{12.51}$$
has been defined as polarization squeezing [13, 31]. Any Stokes observable can be
squeezed, from i = 1, 2, 3 to any generic Stokes observable (11.42). Obviously, by an
appropriate polarization transformation the squeezing can be transferred from the first
Stokes observable to any generic one.
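The effect of imperfect detection on the noise of S1 can be illustrated with a simple Monte Carlo simulation. The model below is an assumption, not taken from the text: photon pairs in modes H and V follow a thermal pair-number distribution with mean sinh²G (as for a two-mode squeezed vacuum), and each photon is detected independently with probability η (binomial loss). The function name `detected_s1_statistics` and all parameter values are hypothetical.

```python
import numpy as np

def detected_s1_statistics(G, eta, shots=200_000, seed=0):
    """Monte Carlo estimate of Var(S1) and <S0> after detection with efficiency eta."""
    rng = np.random.default_rng(seed)
    m = np.sinh(G) ** 2                          # mean number of pairs
    # Thermal (Bose-Einstein) distribution on n = 0, 1, 2, ... with mean m
    n = rng.geometric(1.0 / (1.0 + m), size=shots) - 1
    nH = rng.binomial(n, eta)                    # independent loss in mode H
    nV = rng.binomial(n, eta)                    # independent loss in mode V
    s1, s0 = nH - nV, nH + nV
    return float(s1.var()), float(s0.mean())

var_s1, mean_s0 = detected_s1_statistics(G=1.0, eta=0.7)
ratio = var_s1 / mean_s0                         # close to 1 - eta in this model
```

In this model the ratio Var(S1)/⟨S0⟩ comes out close to 1 − η, i.e. below the shot-noise value of 1 for any η > 0, and it drops to exactly zero for η = 1.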
Polarization squeezing is a special case of twin-beam squeezing (Section 11.1.3),
where the variance of the photon-number difference between two beams is less than
the mean total number of photons in both beams. Like any twin-beam squeezing,
polarization squeezing is a nonclassical feature.
Polarization squeezing was first reported by Bushev et al. [10], who observed
it at the output of a type-II parametric oscillator operating below threshold. In this
regime, a parametric oscillator emits bright squeezed vacuum, similar to high-gain
PDC. The Hamiltonian realized in this case was (12.12), and squeezing was in the first
Stokes observable. Polarization tomography revealed a polarization quasi-probability
(Section 11.3) whose width in S2 and S3 directions exceeded the shot noise but whose
width in the S1 direction was below the shot noise. This polarization quasi-probability
distribution is schematically shown in Fig. 12.12 (red shape, labeled 1). It is at the cen-
ter of the Stokes space because, as mentioned above, Hamiltonian (12.12) generates
unpolarized light, ⟨S1 ⟩ = ⟨S2 ⟩ = ⟨S3 ⟩ = 0. For comparison, the green shape 2 shows
in this figure the polarization quasi-probability of a coherent circularly polarized state
with the same mean number of photons.
Suppression of noise in polarization observables is useful in all measurements
based on polarization, an obvious example being polarimetry. As in all measurements,
suppression of the noise below the shot-noise level offers a quantum advantage
over classical measurements, which in the best case use coherent light. For instance,
Feng and Pfister [16] achieved a signal-to-noise ratio in polarimetry 4.8 dB better than
with coherent states. Using polarization squeezing in an atomic magnetometer,
Wolfgramm et al. [54] demonstrated an advantage of 3.2 dB over a coherent-state
measurement.
Polarization squeezing is related to quadrature squeezing. For instance, by passing
from H, V to A, D modes, we can rewrite Hamiltonian (12.12) in the form
$$\hat{\mathcal H}_{II} = \frac{i\hbar\Gamma}{2}\left([a_D^\dagger]^2 - [a_A^\dagger]^2\right) + \mathrm{h.c.}, \tag{12.52}$$
which is the difference of two single-mode Hamiltonians. Here, we see exactly the
same effect as in the experiment of Fig. 12.4: two pairs of orthogonally polarized
photons are equivalent to a coherent superposition of a photon pair in the same po-
larization state and another photon pair in the orthogonal polarization state. The
same interference phenomenon underpins the polarization Hong–Ou–Mandel effect
(Fig. 12.10(b)). In other words, one can obtain polarization squeezing in the first Stokes
variable S1 (‘pancake-like’ shape 1 in Fig. 12.12) by placing two type-I SPDC crystals
one after the other into a common pump beam, the first crystal producing diagonally (D)
polarized pairs and the second crystal, anti-diagonally (A) polarized pairs. Each of
these crystals will then produce quadrature-squeezed vacuum; correspondingly, the
two terms in Hamiltonian (12.52) describe quadrature squeezing in modes A, D.
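The operator identity behind Eq. (12.52), a†_H a†_V = ([a†_D]² − [a†_A]²)/2, can be checked with truncated Fock-space matrices. This is a minimal sketch assuming the mode convention a_D = (a_H + a_V)/√2, a_A = (a_H − a_V)/√2, which may differ from the book's by phases; the function name is hypothetical.

```python
import numpy as np

def pair_creation_identity_error(dim=6):
    """Largest matrix element of a_H^dag a_V^dag - ([a_D^dag]^2 - [a_A^dag]^2)/2."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # truncated annihilation operator
    I = np.eye(dim)
    aH, aV = np.kron(a, I), np.kron(I, a)
    aD = (aH + aV) / np.sqrt(2)                    # assumed definition of mode D
    aA = (aH - aV) / np.sqrt(2)                    # assumed definition of mode A
    lhs = aH.T @ aV.T                              # real matrices: .T is the adjoint
    rhs = 0.5 * (aD.T @ aD.T - aA.T @ aA.T)
    return float(np.max(np.abs(lhs - rhs)))

err = pair_creation_identity_error()
```

The difference vanishes to machine precision, because the creation operators of modes H and V commute and the cross terms of the two squares add up while the squared terms cancel.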
Bowen et al. [7] used this principle to produce polarization squeezing by com-
bining two quadrature-squeezed beams polarized orthogonally. In their experiment,
beams were polarized horizontally (H) and vertically (V); clearly, the resulting state
was not squeezed in S1 . Depending on the phase between the two beams, squeezing
was obtained either in S2 (green ‘pancake-like’ shape 3 in Fig. 12.12) or in both S2 and
S3 (green ‘cigar-like’ shape 4 in Fig. 12.12). But because the initial quadrature-squeezed
beams were not squeezed vacuums but displaced squeezed states, with coherent
components, the resulting states also had a coherent component. This component was
polarized, as would always be the case for a superposition of two coherent orthogonally
polarized states. The states therefore were displaced in the Stokes space: they
had ⟨S3⟩ = ⟨S0⟩ and ⟨S1⟩ = ⟨S2⟩ = 0, i. e., they were circularly polarized (Fig. 12.12).
Remarkably, while the ‘pancake-like’ state produced in Ref. [7] was not squeezed
in any Stokes observable except S2 , the ‘cigar-like’ state was additionally squeezed
in S0 , i. e., it had sub-shot-noise intensity fluctuations. Moreover, the polarization
squeezing of this state was of a different nature than the one of the squeezed vacuum
state obtained by Bushev et al. [10]. Indeed, for state 4 in Fig. 12.12 not only the noise in
some Stokes observables is less than for a coherent state with the same mean photon
number (shape 2 in Fig. 12.12). Additionally, this state satisfies the inequality
i. e., its polarization squeezing has a meaning with respect to the uncertainty relations
(11.47) for the Stokes observables. In contrast, a polarization-squeezed vacuum state
is unpolarized, and the right-hand sides of uncertainty relations (11.47) are zero for
this state.
Partly for this reason, and also because the condition (12.51) of polarization
squeezing boils down to the quadrature squeezing in two polarization modes A, D,
Korolkova et al. [19, 36, 37] proposed an alternative definition for the polarization
squeezing. In this definition, a state of light is polarization-squeezed if
This type of polarization squeezing was also obtained by overlapping two orthog-
onally polarized Kerr-squeezed pulses, using a Sagnac interferometer (Fig. 12.7(c)) [19]
and later, two polarization modes in a polarization-maintaining fiber (Fig. 12.7(b)). The
state produced by Heersink et al. [19] is shown as blue shape 5 in Fig. 12.12. For such
a state, Marquardt et al. [42] performed the polarization tomography and observed a
‘cigar-like’ polarization quasi-probability distribution.
Let us now return to the experimental schemes involving two polarization modes and
two other modes, for instance, wavevector (directional), as in Figs. 12.3, 12.4. The cor-
responding Hamiltonians have the form (12.13) or (12.14). At low parametric gain, the
states at the output are pairs of polarization-entangled photons. At high parametric
gain, as in the cases considered in previous sections, the flux of pairs becomes very
strong. What are the properties and, in particular, nonclassical features of the output
states? As always in the case of bright light, it is convenient to treat this problem in
the Heisenberg picture. We will use H and V polarization modes and also the modes A
and B, which can be wavevector (directional) or frequency modes. We will only con-
sider Hamiltonian (12.13) with ϕ = 0, which at low parametric gain produces Bell state
|Ψ(+)⟩, but we will keep in mind that there are three other ‘entangling’ Hamiltonians,
producing the other three Bell states at low gain.
The Heisenberg equations for four operators aAH , aAV , aBH , aBV lead to the four-
mode Bogolyubov transformations
We can notice that these four equations form two independent pairs: two equa-
tions (the first and the fourth) involve creation and annihilation operators in modes
AH, BV, and the other two (the second and the third) involve operators in modes BH,
AV. In both cases, the relations between two operators are the same as between aH ,
aV in the Bogolyubov transformations (12.46). We therefore expect that the operator
N̂ AH − N̂ BV , i. e., the difference of photon numbers in modes AH, BV will not fluctuate,
or at least, even in the presence of loss will have fluctuations below the shot-noise
limit. The same will be the case for the operator N̂ BH − N̂ AV . Summing both operators,
we see that the difference of the total photon numbers in polarization modes H and V
will not fluctuate. We can come to the same conclusion by noticing that the operator
N̂ AH + N̂ BH − N̂ AV − N̂ BV commutes with the Hamiltonian. This leads to several important
properties of the state.
In 1993, Karassiov and Masalov [27, 28] proposed to consider the resulting state in
terms of joint Stokes observables for modes A and B, introduced as
$$\hat S_i^{\mathrm{joint}} \equiv \hat S_i^{A} + \hat S_i^{B}, \quad i = 0, 1, 2, 3. \tag{12.56}$$
These joint Stokes observables have a physical meaning if A and B are wavelength
modes. Then observables (12.56) can be measured in a standard Stokes measurement
setup as shown in Fig. 5.7 or, in the quantum case, in Fig. 11.4(b). The setup should
not distinguish between modes A and B and the detectors should measure the total
number of photons.
It follows that, for the state described by Hamiltonian (12.13) with ϕ = 0, the first
joint Stokes observable will have, ideally, no noise—and noise reduced below the shot-
noise limit even under imperfect detection and losses. The polarization probability
distribution for such a state should be ‘pancake-like’, ideally, infinitely thin along S1
and having a large size (noise considerably larger than the shot noise) along the other
two observables. This probability distribution is shown by blue color in Fig. 12.13. Its
polarization properties are similar to the ones of the Bell state |Ψ(+) ⟩ but, unlike a two-
photon entangled state, it contains a large (macroscopic) number of photons. This
Similarly, by stronger pumping a system that at low parametric gain generates the
other three Bell states, (12.34), (12.35), and (12.33), one can obtain macroscopic Bell
states |Φ_mac^(+)⟩, |Φ_mac^(−)⟩, and |Ψ_mac^(−)⟩. Their polarization quasi-probability
distributions are shown in Fig. 12.13 with green, red, and black colors, respectively.
Each of the states |Φ_mac^(+)⟩ and |Φ_mac^(−)⟩ is squeezed in one Stokes observable:
the first one in S3, the second one in S2.
Most interesting is the macroscopic singlet Bell state |Ψ_mac^(−)⟩, shown in Fig. 12.13 as
a black point at the origin. Because, similarly to the two-photon singlet Bell state, it is
invariant to polarization transformations, and because, similarly to the state |Ψ_mac^(+)⟩,
it should feature squeezing of the first Stokes observable, it should have squeezed
fluctuations of all Stokes observables. This is why this state is shown as a point at the
center of the Stokes space. Because of these unusual properties, theoretically predicted
by Karassiov and Masalov [27], it was called ‘polarization-scalar light’. This state was
generated by Iskhakov et al. [25], and its polarization tomography [26] showed that
indeed, the fluctuations of all Stokes observables in this state were suppressed below
the shot-noise level.
Another remarkable property of the macroscopic singlet Bell state is that it vio-
lates inequality (11.49) for the joint Stokes observables. Indeed, it was experimentally
shown in Ref. [23] that, for this state,
The contradiction with inequality (11.49) is because the latter was derived for a
state of light in a single frequency and wavevector mode; meanwhile, the singlet state
and all other macroscopic Bell states generated in Refs. [10, 23, 24, 26] contained at
least two frequency modes (A,B). Nevertheless, Eq. (12.58) leads to a remarkable
result [23]. In the state |Ψ_mac^(−)⟩, photons are emitted into modes A and B in large
groups, of 10⁵ on the average. One can place Stokes measurement setups (Fig. 5.7) in each mode
A and B, and count photons that are reflected and transmitted by polarization prisms
in each case. If the settings of waveplates in beams A and B are the same, the number
n of photons reflected in arm A is uncertain, but always the same as the number of
photons transmitted in arm B (Fig. 12.14). The same is true for the number m of trans-
mitted photons in arm A: it will be always the same as the number of reflected photons
in arm B.
This result means that, for the macroscopic singlet Bell state, each Stokes observable
for each mode A, B is uncertain, but there are correlations in the Stokes observables
between modes A and B. This property, according to the general definition, can be
considered as entanglement. Indeed, as shown in Ref. [23], it can be interpreted as
polarization entanglement for macroscopic pulses of light.
However, it was pointed out by Korolkova et al. [36, 37] that an unpolarized state
like |Ψ_mac^(−)⟩ or any polarization-squeezed vacuum state is irrelevant for the Stokes
observables uncertainty relations, because for such a state ⟨Si⟩ = 0, i = 1, 2, 3, and the
right-hand sides of uncertainty relations (11.47) are zero. Therefore, despite the cor-
relation between the quantum Stokes observables for modes A,B in Fig. 12.14, such a
situation does not enable simultaneous accurate measurement of two non-commuting
Stokes observables for the same mode, with an ‘apparent violation of the uncertainty
principle’ [36]. The definition of polarization entanglement was formulated in Ref. [36]
as the condition
$$\mathrm{Var}(\hat S_i^{A}|S_i^{B})\,\mathrm{Var}(\hat S_j^{A}|S_j^{B}) < \langle \hat S_k\rangle^2, \quad i \neq j \neq k, \tag{12.59}$$
where Var(Ŝ_i^A|S_i^B) is the variance of the ith Stokes observable for mode A conditioned
on the measurement of the same Stokes observable for mode B.
Polarization entanglement of the form (12.59) has been reported by Bowen et al. [8]
using a modification of their experiment on polarization squeezing [7] and by Dong
et al. [14] using Kerr squeezing in optical fibers.
12.5 ‘Hidden polarization’
The most basic definition (3.30) of the degree of polarization dictates the following
way of measuring it [31]: in a Stokes measurement setup (Fig. 5.7), the orientations of
the HWP and QWP are varied in all possible ways, and the visibility in the intensity
modulation measured by one of the detectors gives the degree of polarization [Chap-
ter 5, Eq. (5.26)]. According to this definition, orthogonally polarized photon pairs
(Section 12.3.2), polarization-entangled photon pairs including the Bell states (Sec-
tion 12.3.3), and the macroscopic analogues of all these states emitted through high-
gain PDC (Section 12.4) are unpolarized. But at the same time, for most of these states,
rotation of the waveplates in the scheme of Fig. 5.7 does lead to some observable mod-
ulation. This modulation, however, is not in the mean intensity but in higher-order
intensity moments.
Examples have already been given in several sections of this chapter. Consider,
for instance, the polarization Hong–Ou–Mandel effect (Section 12.3.2, Fig. 12.10(b)). If
the HWP is at 22.5°, the two detectors almost never ‘click’ in coincidence. But if
the HWP is oriented at 0°, the detectors will always ‘click’ in coincidence. As the HWP
is rotated, the rate of single counts will be constant for each detector, but the rate of co-
incidence counts will be 100 % modulated. ‘Distinguished’ directions, in which max-
imum coincidence rate will be measured, will be the ones of the initial polarizations
of the photons in pairs (H,V). This is a typical example of hidden polarization. The re-
sult of such a measurement is shown in Fig. 12.15 with a red solid line. It is sufficient
to remove the QWP and rotate only the HWP (left panel in the figure); as its orienta-
tion χ2 is varied, the rate of coincidences is modulated. The maxima correspond to the
cases where the photons are not split on the beamsplitter: the horizontally polarized
photon goes to one detector and its vertically polarized ‘match’, to the other one. As
mentioned, the count rates of both detectors do not show any modulation as the HWP
is rotated.
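The modulation of the coincidence rate at constant single-count rates can be sketched numerically for the pair |1⟩H |1⟩V. The example rests on assumptions: the HWP at angle χ is described by the Jones matrix i[[cos 2χ, sin 2χ], [sin 2χ, −cos 2χ]], chosen so that it reduces to Eq. (12.28) at 22.5° (the convention of Eq. (5.11) may differ), annihilation operators transform as a′ = J a, and the function names are hypothetical.

```python
import numpy as np

def hwp(chi):
    """Assumed Jones matrix of a half-wave plate at angle chi (radians)."""
    c, s = np.cos(2 * chi), np.sin(2 * chi)
    return 1j * np.array([[c, s], [s, -c]])

def coincidence_probability(chi):
    """Probability that the two photons of |1>_H |1>_V leave different PBS ports."""
    J = hwp(chi)
    amp = J[0, 0] * J[1, 1] + J[1, 0] * J[0, 1]   # amplitude of |1>_H |1>_V at the output
    return float(abs(amp) ** 2)

def singles_means(chi):
    """Mean photon numbers in the two PBS output ports."""
    J = hwp(chi)
    return (float(abs(J[0, 0]) ** 2 + abs(J[0, 1]) ** 2),
            float(abs(J[1, 0]) ** 2 + abs(J[1, 1]) ** 2))

angles = np.deg2rad([0.0, 22.5, 45.0])
coincidences = [coincidence_probability(x) for x in angles]
```

With these conventions the coincidence probability equals cos²(4χ): it is 1 at 0° and 45° and 0 at 22.5°, i.e. 100 % modulated, while `singles_means` returns one photon per port on average at every angle.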
Two-photon Bell states (Section 12.3.3) manifest a similar behavior. For instance,
in the scheme of Fig. 12.4, the radiation in any direction is fully unpolarized, according
to definition (3.30) or (5.26). But correlated measurement of the Stokes observables in
modes A and B simultaneously will reveal ‘distinguished’ directions for the three Bell
states |Φ(±)⟩, |Ψ(+)⟩. However, this example involves two radiation modes (A, B).
Figure 12.15: A setup to observe ‘hidden polarization’ (left) and the dependence of normalized
coincidence rate (red thin solid line) and normalized variance of the Stokes observable (blue thick solid
line) on the HWP orientation (right) in the case of SPDC radiation at the input. The mean count rates
of both detectors (not shown) have no modulation. Dashed lines show the dependences for classical
radiation at the input.
If the flux of photon pairs at the input of the measurement setup of Fig. 12.10(b)
is very high, almost all coincidences will become accidental, as mentioned in the be-
ginning of Section 12.4. The modulation in coincidence count rate will disappear, but
it does not mean that the hidden polarization effect disappears as well. It just means
that the measurement should be different. In order to observe ‘hidden polarization’
in the radiation of strongly pumped type-II SPDC, one should measure not the rate of
coincidences but the variance of the Stokes observable that is set by the orientations
of the plates [31]. For instance, as shown theoretically in Section 12.4.2, the first Stokes
observable has, in the absence of losses, zero variance. Meanwhile, the variances of
the S2 and S3 Stokes observables for this state exceed the shot-noise limit. Therefore,
by measuring the variance of the difference of signals from the two detectors in Fig. 5.7
and rotating the waveplates, one will see, ideally, a 100 % modulation. Again, to see
this effect in high-gain type-II PDC it suffices to have only the HWP (Fig. 12.15). Rotation
of this plate leads to the modulation of the measured Stokes observable variance (blue
thick solid line). The minima correspond to the measurement of S1 , for which the noise
is suppressed; this happens when the HWP is at 0°, 45°, and 90°. With the HWP
at 22.5° and 67.5°, the setup probes observable S2, which has enhanced noise.
At the same time, no modulation is observed in the signals of both detectors.
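The correspondence between the HWP orientation and the probed Stokes observable can be checked with a short numerical sketch. It assumes the common convention that an HWP at angle χ, followed by a polarizing beamsplitter in the H/V basis, measures cos(4χ) S1 + sin(4χ) S2; the function name and the sign convention are illustrative assumptions and are not taken from the text.

```python
import math

def probed_stokes_coefficients(hwp_angle_deg):
    """Return (c1, c2) such that the setup measures c1*S1 + c2*S2.

    Assumption: an HWP at angle chi rotates linear polarization by 2*chi,
    i.e. it rotates the Stokes vector by 4*chi in the S1-S2 plane.
    """
    phi = math.radians(4.0 * hwp_angle_deg)
    return math.cos(phi), math.sin(phi)

# HWP settings quoted in the text for the minima and maxima of the variance
for angle in (0.0, 22.5, 45.0, 67.5, 90.0):
    c1, c2 = probed_stokes_coefficients(angle)
    print(f"HWP at {angle:5.1f} deg -> {c1:+.2f}*S1 {c2:+.2f}*S2")
```

An overall sign of the observable does not change its variance, so under this convention the settings 0°, 45°, and 90° all probe the noise of S1, while 22.5° and 67.5° probe the noise of S2.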
This behavior has indeed been reported for the radiation at the output of a type-II
optical parametric oscillator below threshold by Bushev et al. [10]. The dependence of
the Stokes observable noise resembled the one shown by the blue solid line in Fig. 12.15,
albeit with lower visibility: because of the losses and imperfect detection, the noise
was never exactly zero.
For photon pairs emitted via type-II low-gain SPDC, hidden polarization was ob-
served by Usachev et al. [51] by using the same setup and measuring coincidence count
rate. Rotating an HWP in front of a polarizing prism, they obtained a dependence
similar to the solid red curve in Fig. 12.15.
Experiments with the macroscopic Bell state |Ψ(−)_mac⟩ showed that this state does
not have ‘hidden polarization’ [25]. Because this state is invariant to polarization ro-
tation, all moments of its Stokes observables S1,2,3 are zero. This is why this state was
termed ‘polarization-scalar light’ [27]: it is unpolarized in all orders in the intensity.
One might think that ‘hidden polarization’ is a typically quantum effect. This is not
true: as proposed by Klyshko [31], two overlapped orthogonally polarized laser beams
would produce the same effects as for SPDC radiation at the input, albeit with lower
visibilities. In the case of coincidence counting, the visibility will be only 33 % (red thin
dashed line in Fig. 12.15). In the case of Stokes variance measurement, 50 % visibility
should be observed (blue thick dashed line in Fig. 12.15). In this measurement, it might
seem that there is no difference between the quantum and classical cases, because the
quantum case will give finite visibility as well due to losses and imperfect detection.
But the boundary between the classical and quantum cases is set by the shot-noise
limit, which can be overcome for nonclassical light.
An experimental realization of this proposal has a certain difficulty. Namely, the
two lasers whose beams should be combined to obtain the ‘hidden polarization’ effect
have to be incoherent but still of the same wavelength. As a solution, one could take a
single laser beam, polarized at 45°, split it on a polarizing beamsplitter, delay one
polarization component with respect to the other by more than a coherence length, and
recombine the beams on a polarizing beamsplitter (Fig. 12.16). The output beam will
contain an incoherent mixture of horizontally and vertically polarized components.
If it is sent into a measurement setup as shown in Fig. 12.15, it will manifest ‘hidden
polarization’. For instance, if variances of different Stokes observables are measured,
the one of S1 will be smaller than the one of S2 .
Such an experiment was performed by Guzun and Penin [18] by registering
coincidences in the scheme shown in Fig. 12.15 with an attenuated laser beam at the input.
As expected, they observed a dependence similar to the one shown in Fig. 12.15 by the red
thin dashed line.
The existence of ‘hidden polarization’ shows that the degree of polarization, as de-
fined by Eq. (3.30), is incomplete: it classifies states like polarization-squeezed vac-
uum, or orthogonally polarized photon pairs, as unpolarized. An example has been
pointed out by Agarwal and Puri [1]: in the course of propagation through optical fiber,
a coherent beam can generate a quantum state through Kerr squeezing, and its degree
of polarization may change. Besides, there are other features that make this definition
inconvenient. For instance, a vacuum state turns out to be fully polarized [3], which
is unphysical. In the framework of nano-optics (Chapter 8), definition (3.30) also has
to be modified, because it takes into account only transverse (two-dimensional) po-
larization.
A straightforward way to generalize the degree of polarization is to replace clas-
sical averages in Eq. (3.30) by quantum averages. Similarly, in the coherence matrix
(3.31) the field correlators should be replaced by quantum correlators, for instance
⟨EH∗ EV ⟩ → ⟨a†H aV ⟩ [1]. However, this does not help to circumvent the problems men-
tioned above.
Alodjants et al. [3] introduced the degree of polarization as
expected in both cases; but the variance measurement is affected by the detection
efficiency.
Several theoretical definitions for the degree of polarization were introduced as
distances on the Poincaré sphere from a completely unpolarized state [5]. The latter
is defined as a state ‘spread’ over the Poincaré sphere, so that it is invariant to any
polarization transformation. However, these measures are not operational: they do
not enable direct measurement in experiment.
Finally, to take into account the three-dimensional structure of polarization at the
nanoscale (Chapter 8), as well as the vacuum fluctuations, Luis [40] proposed to gen-
eralize both the Stokes observables and the degree of polarization to include a third
dimension.
Bibliography
[1] G. S. Agarwal and R. R. Puri. Quantum theory of propagation of elliptically polarized light
through a Kerr medium. Phys. Rev. A, 40:5179–5186, Nov 1989.
[2] G. Agrawal. Nonlinear fiber optics. Academic Press, 1989.
[3] A. P. Alodjants, S. M. Arakelian, and A. S. Chirkin. Polarization quantum states of light in
nonlinear distributed feedback systems; quantum nondemolition measurements of the Stokes
parameters of light and atomic angular momentum. Appl. Phys. B, Lasers Opt., 66(1):53–65,
January 1998.
[4] H.-A. Bachor and T. C. Ralph. A guide to experiments in quantum optics. Wiley-VCH, 2004.
[5] G. Björk, J. Söderholm, L. L. Sánchez-Soto, A. B. Klimov, I. Ghiu, P. Marian, and T. A. Marian.
Quantum degrees of polarization. Opt. Commun., 283:4440–4447, 2010.
[6] D. Bouwmeester, A. Ekert, and A. Zeilinger. The physics of quantum information.
Springer-Verlag, 2000.
[7] W. P. Bowen, R. Schnabel, H. A. Bachor, and P. K. Lam. Polarization squeezing of continuous
variable Stokes parameters. Phys. Rev. Lett., 88:093601, Feb 2002.
[8] W. P. Bowen, N. Treps, R. Schnabel, and P. K. Lam. Experimental demonstration of continuous
variable polarization entanglement. Phys. Rev. Lett., 89:253601, Dec 2002.
[9] A. V. Burlakov and M. V. Chekhova. Polarization optics of biphotons. JETP Lett., 75:432–438,
2002.
[10] P. A. Bushev, V. P. Karassiov, A. V. Masalov, and A. A. Putilin. Biphoton light with hidden
polarization and its polarization tomography. Opt. Spectrosc., 91:558–564, 2001.
[11] M. V. Chekhova, G. Leuchs, and M. Zukowski. Bright squeezed vacuum: entanglement of
macroscopic light beams. Opt. Commun., 337:27, 2014.
[12] M. V. Chekhova and Z. Y. Ou. Nonlinear interferometers in quantum optics. Adv. Opt. Photonics,
8(1):104–155, Mar 2016.
[13] A. S. Chirkin, A. A. Orlov, and D. Y. Parashchuk. Quantum theory of two-mode interactions
in optically anisotropic media with cubic nonlinearities: Generation of quadrature- and
polarization-squeezed light. Rus. Journ. Quantum Electronics, 23:870–874, 1993.
[14] R. Dong, J. Heersink, J.-I. Yoshikawa, O. Glöckl, U. L. Andersen, and G. Leuchs. An efficient
source of continuous variable polarization entanglement. New J. Phys., 9(11):410, Nov 2007.
[15] M. V. Fedorov and N. I. Miklin. Schmidt modes and entanglement. Contemp. Phys., 2014.
[16] S. Feng and O. Pfister. Sub-shot-noise heterodyne polarimetry. Opt. Lett., 29(23):2800–2802,
Dec 2004.
[17] C. C. Gerry and P. L. Knight. Introductory quantum optics. Cambridge University Press, 2005.
[18] D. I. Guzun and A. N. Penin. Hidden polarization of two-mode coherent light. In S. N. Bagayev
and A. S. Chirkin, editors, Atomic and Quantum Optics: High-Precision Measurements,
volume 2799, pages 249–254. International Society for Optics and Photonics, SPIE, 1996.
[19] J. Heersink, T. Gaber, S. Lorenz, O. Gloeckl, N. Korolkova, and G. Leuchs. Polarization
squeezing of intense pulses with a fiber-optic Sagnac interferometer. Phys. Rev. A, 68:013815,
2003.
[20] J. Heersink, V. Josse, G. Leuchs, and U. Andersen. Efficient polarization squeezing in optical
fibers. Opt. Lett., 30:1192, 2005.
[21] C. K. Hong and L. Mandel. Experimental realization of a localized one-photon state. Phys. Rev.
Lett., 56:58–60, Jan 1986.
[22] C. K. Hong, Z. Y. Ou, and L. Mandel. Measurement of subpicosecond time intervals between two
photons by interference. Phys. Rev. Lett., 59:2044–2046, Nov 1987.
[23] T. S. Iskhakov, I. N. Agafonov, M. V. Chekhova, and G. Leuchs. Polarization-entangled light
pulses of 10^5 photons. Phys. Rev. Lett., 109:150502, Oct 2012.
[24] T. S. Iskhakov, I. N. Agafonov, M. V. Chekhova, G. O. Rytikov, and G. Leuchs. Polarization
properties of macroscopic Bell states. Phys. Rev. A, 84:045804, Oct 2011.
[25] T. S. Iskhakov, M. V. Chekhova, G. O. Rytikov, and G. Leuchs. Macroscopic pure state of light
free of polarization noise. Phys. Rev. Lett., 106:113602, Mar 2011.
[26] B. Kanseri, T. Iskhakov, I. Agafonov, M. Chekhova, and G. Leuchs. Three-dimensional quantum
polarization tomography of macroscopic Bell states. Phys. Rev. A, 85:022126, Feb 2012.
[27] V. P. Karasev and A. V. Masalov. Unpolarized light states in quantum optics. Opt. Spectrosc.,
74:551, 1994.
[28] V. P. Karassiov. Polarization structure of quantum light fields: a new insight: I. General outlook.
J. Phys. A, 26:4345, 1993.
[29] D. N. Klyshko. Multiphoton light and polarization effects. Phys. Lett. A, 163:349, 1992.
[30] D. N. Klyshko. The nonclassical light. Phys. Usp., 39:573–596, 1996.
[31] D. N. Klyshko. Polarization of light: fourth-order effects and polarization-squeezed states.
J. Exp. Theor. Phys., 84:1065–1079, 1997.
[32] D. N. Klyshko. Basic quantum mechanical concepts from the operational viewpoint. Phys. Usp.,
41(9):885–922, 1998.
[33] D. Klyshko. Photons and nonlinear optics. Gordon and Breach, 1988.
[34] D. Klyshko. Physical foundations of quantum electronics. World Scientific, 2011.
[35] N. Korolkova and G. Leuchs. Quantum correlations in separable multi-mode states and in
classically entangled light. Rep. Prog. Phys., 82:056001, 2019.
[36] N. Korolkova, G. Leuchs, R. Loudon, T. C. Ralph, and C. Silberhorn. Polarization squeezing and
continuous-variable polarization entanglement. Phys. Rev. A, 65:052306, Apr 2002.
[37] N. Korolkova and R. Loudon. Nonseparability and squeezing of continuous polarization
variables. Phys. Rev. A, 71:032343, 2005.
[38] P. G. Kwiat, K. Mattle, H. Weinfurter, A. Zeilinger, A. V. Sergienko, and Y. Shih. New
high-intensity source of polarization-entangled photon pairs. Phys. Rev. Lett., 75:4337–4341,
Dec 1995.
[39] P. G. Kwiat, E. Waks, A. G. White, I. Appelbaum, and P. H. Eberhard. Ultrabright source of
polarization-entangled photons. Phys. Rev. A, 60:R773–R776, Aug 1999.
[40] A. Luis. Quantum polarization for three-dimensional fields via Stokes operators. Phys. Rev. A,
71:023810, Feb 2005.
[41] A. I. Lvovsky. Quantum physics: an introduction based on photons. Springer, 2018.
In 1935, Einstein, Podolsky, and Rosen formulated a paradox that seemed to under-
mine the very basics of quantum mechanics. In the gedanken (thought) experiment
proposed by EPR, two particles A, B had, at the time of their birth, correlated values
of position and anti-correlated values of momentum [15], so that they were born at one
point and propagated in the opposite directions. The paradoxical statement, accord-
ing to the quantum-mechanical idea of measurement, was that upon the measure-
ment performed on particle A, for instance, the measurement of position, the state
of particle B would be instantly turned into a position state, i. e., into a state with a
fixed position. Alternatively, a momentum measurement on particle A would instantly
put particle B into a momentum state. This instant state reduction should happen re-
gardless of the distance between the particles, which could be very large. This meant
‘spooky action at a distance’, as Einstein formulated it [2], and could not be accepted
by most physicists. Moreover, by measuring non-commuting variables, momentum
for particle A and position for particle B, one could apparently violate the uncertainty
relation.
The argument by EPR went further to propose that the quantum-mechanical
description of a quantum system, such as these two particles, was incomplete. The
theory, according to their viewpoint, should additionally contain some ‘hidden vari-
ables’, i. e., the a priori values of the position and momentum for each particle of the
https://doi.org/10.1515/9783110668025-013
13.1 Bell’s inequality and its violation | 201
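$$
|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}\rangle_A |{\downarrow}\rangle_B + |{\downarrow}\rangle_A |{\uparrow}\rangle_B\right), \tag{13.1}
$$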
where | ↑⟩ denotes a ‘spin up’ state, and | ↓⟩, a ‘spin down’ state. Note that here the
states are distinguished relative to the vertical magnetic field; we could also consider
the direction of the spin with respect to the horizontal magnetic field or, actually, any
magnetic field direction.
Equation (13.1) gives an example of an entangled state—similar to the |Ψ(+) ⟩ Bell
state we considered in the previous section: taken separately, each particle has the
spin direction completely uncertain, but if particle A has the spin directed ‘up’, then
the spin of particle B is directed ‘down’, and vice versa. (In fact, the term ‘entangled’,
in German ‘verschränkt’, was proposed by Schrödinger precisely for this case.)
The EPR paradox, in Bohm’s formulation, reads as follows: if one measures the spin
direction of particle A, and the result is ‘down’, then particle B is instantly reduced to
the ‘spin up’ state; but how can this be true if the particles are very far apart? Also, simi-
lar to the argument of EPR, one can then simultaneously measure two non-commuting
Pauli operators: say, σx for particle A and σy for particle B.
This binary version of the EPR paradox is extremely useful for two reasons. First,
it allows for the derivation of an inequality that can be tested in experiment [2]. (We
will derive this Bell inequality in the next few paragraphs.) Second, as mentioned in
the previous chapters, there is an analogy between the spin of a spin-1/2 particle and
the Stokes observables for a photon. The latter can be measured very simply using the
setup of Fig. 11.4(b) instead of a bulky, complicated setup with magnets, as in the case
of the Stern–Gerlach experiment (Fig. 11.4(a)).
Based on the binary interpretation of the EPR paradox, in 1964 John Bell formu-
lated a theorem that assumed the existence of local hidden variables for the two par-
ticles and resulted in an inequality. The term ‘local’ means here that the variables for
particle A cannot affect particle B, and vice versa. Because the original derivation of
Bell’s inequality is a bit complicated [2], here we will present its very simple version in
the form of Clauser–Horne–Shimony–Holt (CHSH) inequality, following Klyshko [14].
Let the state of particles A, B be described by binary variables a, b, each of them
taking values 1 or −1. Suppose these variables can be measured in an experiment as
shown in Fig. 11.4(a) or Fig. 11.4(b): if the upper detector clicks, then we say that a = 1,
and if the lower detector clicks, then a = −1. For the measurement of b we have another
similar setup, placed in the path of particle B. Moreover, the same setups in different
configurations can measure other variables: a′ for particle A and b′ for particle B. The
variables a′ and b′ are also binary and take values +1 or −1. One can imagine that, for
the measurement of a, the setup in the path of particle A looks as shown in Fig. 11.4(a),
or in Fig. 11.4(b) with a certain setting of the waveplates. For the measurement of a′,
the setup has to be modified; for instance, in the case of spin, the setup in Fig. 11.4(a)
should be rotated. In the case of the Stokes measurement (Fig. 11.4(b)), the
orientations of the HWP and QWP should be changed to measure a′.
To derive Bell’s inequality, we assume that, as soon as particles A, B are created,
each of them has certain parameters: particle A has parameters a, a′ and particle B has
parameters b, b′. These are exactly the local hidden variables in the EPR argument. For
instance, for a spin-1/2 particle these can be projections of the spin on the horizontal
and vertical axes. The set of these parameters {λ} ≡ {a, a′, b, b′} is assumed to have some
probability distribution, p({λ}). We assume this probability distribution to be ‘well-
behaved’, i. e., to be non-negative and normalized: p({λ}) ≥ 0, ∫ p({λ})d{λ} = 1. In the
Copenhagen picture of quantum mechanics, there are no a priori values of {λ}, i. e.,
there are no hidden variables. Let us see where the hidden-variable assumption leads
us. Of course there is another assumption here, namely, locality: particle A cannot
affect particle B, and vice versa.
We now introduce a new variable,
$$
F \equiv \frac{1}{2}\{ab + a'b + ab' - a'b'\} = \frac{1}{2}\{a(b + b') + a'(b - b')\}. \tag{13.2}
$$
Because b, b′ = ±1, either b = b′ is possible or b = −b′; then only one of the round
brackets in Eq. (13.2) can be nonzero, and the absolute value of this nonzero bracket
is 2. Because |a| = |a′| = 1, the new variable F can only take values +1 or −1. Therefore
F is also binary, and its absolute value is always |F| = 1. Let us now look at its mean
value. If we try to average F experimentally, every measurement will yield either F =
+1 or F = −1, and the averaging should yield a value −1 ≤ ⟨F⟩ ≤ 1. An example of
such a measurement is shown in Fig. 13.1: the measured points, up to the experimental
uncertainty, are either at F = 1 or at F = −1. By averaging these results over time t or
over many experimental trials, we will always get |⟨F⟩| ≤ 1.
The same result follows from the calculation using the probability theory,
|⟨F⟩| ≤ 1. (13.4)
In other words, from the assumptions of (i) the existence of hidden variables and (ii)
their locality, inequality (13.4) follows. This is a modified [14] Bell inequality in the
CHSH form [2, 5, 7]. It is one of the numerous formulations of Bell’s inequality, and,
like the other Bell inequalities, it can be tested in experiment.
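The hidden-variable bound can be illustrated by brute force. The sketch below enumerates all 16 assignments of the binary variables a, a′, b, b′, evaluates F from Eq. (13.2), and averages it over an arbitrary non-negative weight distribution; the function names and the weight list are illustrative assumptions introduced here.

```python
from itertools import product

def chsh_F(a, a_prime, b, b_prime):
    """F from Eq. (13.2) for one set of predetermined (hidden) values."""
    return 0.5 * (a * b + a_prime * b + a * b_prime - a_prime * b_prime)

# All 16 possible hidden-variable configurations {a, a', b, b'}
CONFIGS = list(product((+1, -1), repeat=4))

def mean_F(weights):
    """Average F over a non-negative distribution p({lambda}).

    `weights` holds one non-negative number per configuration;
    it is normalized here, so it need not sum to 1.
    """
    total = sum(weights)
    return sum(w * chsh_F(*lam) for w, lam in zip(weights, CONFIGS)) / total

all_values = {chsh_F(*lam) for lam in CONFIGS}
print("possible values of F:", sorted(all_values))
print("mean for an arbitrary distribution:", mean_F(list(range(1, 17))))
```

Since every configuration gives F = ±1, any average with non-negative weights necessarily stays within the interval from −1 to 1.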
for photon A cannot be performed simultaneously. This is going to be the key point in
the explanation why the inequality will be violated.
2. For the measurements on photon B we introduce more complicated variables.
Variable b will be the Stokes observable (S1 + S2)/√2. For this measurement, the HWP
in path B should be oriented at 11.25° (half-way between the measurement of the first
and second Stokes observables). Now, a click of detector B1 will tell us that b = −1 and
a click of detector B2 will indicate b = 1.
For the variable b′, we will choose the Stokes observable (S1 − S2)/√2, and to measure
it, we will orient the HWP in path B at an angle −11.25° or, equivalently, at 78.75°.
The clicks of detectors B1 and B2, again, will mean that b′ = −1 and b′ = 1, respectively.
As in the case of the measurements on photon A, the measurements of b and b′ are
impossible to perform simultaneously.
Because we are doing a quantum-mechanical calculation, all observables should
be operators:
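$$
\hat{a} \equiv \hat{S}_1^A, \quad \hat{a}' \equiv \hat{S}_2^A, \quad \hat{b} \equiv \frac{1}{\sqrt{2}}\left(\hat{S}_1^B + \hat{S}_2^B\right), \quad \hat{b}' \equiv \frac{1}{\sqrt{2}}\left(\hat{S}_1^B - \hat{S}_2^B\right), \tag{13.5}
$$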
where the upper indices A, B mean that the Stokes operators relate to photons A, B.
The variable F will then also be an operator, and take the form
$$
\hat{F} = \frac{1}{2}\{\hat{a}(\hat{b} + \hat{b}') + \hat{a}'(\hat{b} - \hat{b}')\} = \frac{1}{\sqrt{2}}\{\hat{S}_1^A \hat{S}_1^B + \hat{S}_2^A \hat{S}_2^B\}. \tag{13.6}
$$
Let us now calculate the mean value of F̂ by averaging it over the singlet Bell state
|Ψ(−)⟩ (12.33). Actually, the inequality will be violated for all four Bell states, but then
different operators F̂ have to be chosen. With the one chosen according to Eq. (13.6),
the inequality will be violated only for |Ψ(−)⟩ and |Φ(+)⟩.
As the first step, let us show that |Ψ(−)⟩ is an eigenstate of F̂. We have to calculate
the eigenvalues 1, −1, while Ŝ2 acts on them as follows: Ŝ2|H⟩ = |V⟩, Ŝ2|V⟩ = |H⟩. Taking
all this into account, the algebra becomes very simple, and we obtain
$$
\hat{F}|\Psi^{(-)}\rangle = -\sqrt{2}\,|\Psi^{(-)}\rangle. \tag{13.8}
$$
Hence, |Ψ(−)⟩ is indeed an eigenstate of F̂, and the mean value ⟨F̂⟩ is easy to calculate:
$$
\langle\Psi^{(-)}|\hat{F}|\Psi^{(-)}\rangle = -\sqrt{2}, \tag{13.9}
$$
and
$$
S = \frac{1}{2}\left(\langle\hat{F}\rangle - 1\right), \tag{13.11}
$$
which includes the mean values of the same four terms of Eq. (13.2). For S, the CHSH
inequality takes the form
− 1 ≤ S ≤ 0. (13.12)
Quantum-mechanical calculation for the state |Ψ(−)⟩ gives ⟨F̂⟩ = −√2 and therefore
S = −(√2 + 1)/2 ≈ −1.21. For the state |Φ(+)⟩, the result is S = (√2 − 1)/2 ≈ 0.21.
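These numbers can be reproduced with a small matrix calculation. The sketch below builds the operators of Eq. (13.5) in the basis |HH⟩, |HV⟩, |VH⟩, |VV⟩, forms F̂ as in Eq. (13.6), and evaluates ⟨F̂⟩ and S for the four Bell states. The helper functions, the basis ordering, and the explicit state vectors are assumptions made here for illustration; Ŝ1 is taken diagonal with eigenvalues ±1 for |H⟩, |V⟩, and Ŝ2 swaps |H⟩ and |V⟩, as stated in the text.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def combine(A, B, sign):
    """Return A + sign*B element-wise."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, c):
    return [[c * x for x in row] for row in A]

S1 = [[1, 0], [0, -1]]   # S1|H> = |H>, S1|V> = -|V>
S2 = [[0, 1], [1, 0]]    # S2|H> = |V>, S2|V> = |H>
ID = [[1, 0], [0, 1]]
r = 1 / math.sqrt(2)

# Operators of Eq. (13.5): the first tensor factor is photon A, the second is photon B
a = kron(S1, ID)
a_prime = kron(S2, ID)
b = kron(ID, scale(combine(S1, S2, +1), r))
b_prime = kron(ID, scale(combine(S1, S2, -1), r))

# Operator F of Eq. (13.6)
F = scale(combine(matmul(a, combine(b, b_prime, +1)),
                  matmul(a_prime, combine(b, b_prime, -1)), +1), 0.5)

# Bell states in the basis (HH, HV, VH, VV); all amplitudes are real
bell = {
    "Psi-": [0, r, -r, 0],
    "Psi+": [0, r, r, 0],
    "Phi+": [r, 0, 0, r],
    "Phi-": [r, 0, 0, -r],
}

def mean_F(psi):
    return sum(psi[i] * F[i][j] * psi[j] for i in range(4) for j in range(4))

def chsh_S(psi):
    return 0.5 * (mean_F(psi) - 1)

for name, psi in bell.items():
    print(f"{name}: <F> = {mean_F(psi):+.3f}, S = {chsh_S(psi):+.3f}")
```

With this choice of F̂, only |Ψ(−)⟩ and |Φ(+)⟩ fall outside the interval −1 ≤ S ≤ 0; for the other two Bell states the mean value of F̂ vanishes.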
In order to test the CHSH inequality (13.12), an experimentalist has to measure the
mean values of all terms in Eq. (13.2). As mentioned above, each of them requires a
different setting of the setup in Fig. 13.2. For each setting, the correlation of two Stokes
observables is measured:
(i) $\frac{1}{\sqrt{2}}\langle \hat{S}_1^A(\hat{S}_1^B + \hat{S}_2^B)\rangle$ is measured with the HWP in channel A at 0° and the HWP in
channel B at 11.25°.
(ii) $\frac{1}{\sqrt{2}}\langle \hat{S}_2^A(\hat{S}_1^B + \hat{S}_2^B)\rangle$ is measured with the HWP in channel A at 22.5° and the HWP in
channel B at 11.25°.
(iii) $\frac{1}{\sqrt{2}}\langle \hat{S}_1^A(\hat{S}_1^B - \hat{S}_2^B)\rangle$ is measured with the HWP in channel A at 0° and the HWP in
channel B at 78.75°.
(iv) $\frac{1}{\sqrt{2}}\langle \hat{S}_2^A(\hat{S}_1^B - \hat{S}_2^B)\rangle$ is measured with the HWP in channel A at 22.5° and the HWP in
channel B at 78.75°.
In practice, one only needs to measure the rate of coincidences between the
‘transmitted-path’ detectors A2 and B2. Indeed, consider the mean value (i). Each
of the Stokes observables entering it is measured (see Section 11.2.3) as the difference
of the number of photons hitting detector 2 and the number of photons hitting detec-
tor 1, for a certain setting of the HWP. Meanwhile, the number of photons NA1 hitting
detector A1 is equal to NA − NA2 , where NA2 is the number of photons hitting A2 and
NA is the total number of photons in channel A (which can be measured by removing
the polarizing beamsplitter). Then the mean value (i) can be written as
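$$
\frac{1}{\sqrt{2}}\langle \hat{S}_1^A(\hat{S}_1^B + \hat{S}_2^B)\rangle
= \frac{\langle[N_{A2}(0^\circ) - N_{A1}(0^\circ)][N_{B2}(11.25^\circ) - N_{B1}(11.25^\circ)]\rangle}{\langle N_A N_B\rangle}
= \frac{4\langle N_{A2}(0^\circ)N_{B2}(11.25^\circ)\rangle - 2\langle N_A N_{B2}(11.25^\circ)\rangle - 2\langle N_B N_{A2}(0^\circ)\rangle + \langle N_A N_B\rangle}{\langle N_A N_B\rangle},
$$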
where the angles in brackets denote the orientations of the HWPs. This expression
involves only the numbers of transmitted photons and the total photon numbers in
channels A, B.
By performing the same calculation for each of the mean values (i)–(iv), we obtain
the mean value of ⟨F⟩, and then the value of S, in the form
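$$
S = \frac{1}{\langle N_A N_B\rangle}\bigl[\langle N_{A2}(0^\circ)N_{B2}(11.25^\circ)\rangle + \langle N_{A2}(22.5^\circ)N_{B2}(11.25^\circ)\rangle
+ \langle N_{A2}(0^\circ)N_{B2}(78.75^\circ)\rangle - \langle N_{A2}(22.5^\circ)N_{B2}(78.75^\circ)\rangle
- \langle N_A N_{B2}(11.25^\circ)\rangle - \langle N_{A2}(0^\circ)N_B\rangle\bigr]. \tag{13.13}
$$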
Experiments on testing the CHSH inequality, according to Eq. (13.13), require the
following measurements: four series with polarizing beamsplitters in both arms, two
series with polarizing beamsplitters in one arm, and one series with no polarizing
beamsplitters. Because the reflected paths are not used, flat polarizers can replace
polarizing beamsplitters. Also, real experiments use no HWPs but, instead, polarizers
oriented at 0°, 22.5°, 45°, and 67.5°.
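As a consistency check of Eq. (13.13), one can insert the ideal count statistics of the singlet state. The sketch below assumes one photon pair per trial (so that ⟨N_A N_B⟩ = 1), lossless detection, an HWP at angle χ selecting linear polarization at 2χ in the transmitted port, and the standard singlet coincidence probability (1/2) sin²(α − β) for analyzers at angles α and β. The function names are illustrative and these assumptions are not taken from the text.

```python
import math

def coincidence_prob(hwp_a_deg, hwp_b_deg):
    """Probability that detectors A2 and B2 both click for one |Psi(-)> pair.

    Assumptions: an HWP at angle chi makes the transmitted port select linear
    polarization at 2*chi, and the singlet gives P = (1/2) sin^2(alpha - beta).
    """
    alpha = math.radians(2 * hwp_a_deg)
    beta = math.radians(2 * hwp_b_deg)
    return 0.5 * math.sin(alpha - beta) ** 2

# Each photon of the pair, taken alone, is unpolarized: half are transmitted
SINGLE_PROB = 0.5

def chsh_S_singlet():
    """Evaluate Eq. (13.13) for ideal singlet statistics, one pair per trial."""
    pairs = 1.0  # normalization <N_A N_B>
    coincidences = (coincidence_prob(0.0, 11.25)
                    + coincidence_prob(22.5, 11.25)
                    + coincidence_prob(0.0, 78.75)
                    - coincidence_prob(22.5, 78.75))
    singles = SINGLE_PROB + SINGLE_PROB  # <N_A N_B2(11.25)> + <N_A2(0) N_B>
    return (coincidences - singles) / pairs

print(f"S = {chsh_S_singlet():.4f}")
```

Under these assumptions the result coincides with S = −(√2 + 1)/2 obtained from the operator calculation, which lies outside the interval allowed by inequality (13.12).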
The first photon-based experiments on Bell’s inequality violation obtained pho-
ton pairs from the cascaded transitions of atoms. Early experiments involved fewer
settings of polarizers [9] and rather limited statistics, but later, experiments were more
and more advanced, and by 1982 the violation of Bell’s inequalities was convincingly
proved. Especially important were experiments by Aspect et al. [1], where the orien-
tation of the polarizers was varied during the flight of photons from the source to the
detectors. This enabled the refutation of several local hidden-variable theories, for
instance, the hypothesis that after the detection of photon A in a certain polarization
state, this information was somehow transmitted to photon B.
Nevertheless, there remained certain ‘loopholes’ for the local hidden-variable
theories. One of them, called the fair sampling loophole, or detection loophole, is that all,
or almost all, pairs should be probed; otherwise there is still room for local hidden
variables. Experiments with atomic cascaded transitions still left this loophole open,
because atoms emit photon pairs into the full solid angle of 4π steradians, and it is very
difficult to detect even half of them.
A breakthrough was made when Bell’s inequality tests started to use SPDC as a
source of photon pairs. In the first experiments of this kind, Shih and Alley [19] and,
independently, Ou and Mandel [16], produced entangled states from type-I SPDC, us-
ing HOM-type interference on a beamsplitter. Further tests of Bell’s inequality were
performed with SPDC configurations shown in Fig. 12.3 and Fig. 12.4.
After this pioneering work, Bell tests with SPDC were repeated with higher
and higher accuracy, and all of them resulted in the violation of Bell’s inequalities.
And still, some ‘loopholes’ for local realism remained. First, closing the aforemen-
tioned fair sampling loophole required a detection efficiency of at least 82 % in the
case of maximally entangled states (12.32)–(12.35) and somewhat lower in the case of
non-maximally entangled states (with unbalanced terms in the superposition)—but
still above 70 % for realistic cases. Second, there remained the locality (communica-
tion) loophole. To exclude the possibility of communication between particles A and B,
measurements on A, B should be separated by a spacelike interval in the Minkowski
space. This situation is shown in Fig. 13.3: the source S (blue circle) emits photons
A, B along the red dashed lines in the Minkowski space. The detection procedures
are shown by red rectangles. To avoid communication between the setting of polariz-
ers in paths A, B (green rectangles) and the measurements on the other sides, these
events should be also separated by spacelike intervals (green dashed lines in the fig-
ure). This requires a relatively fast setting of the polarizers and measurement and a
relatively large separation between the measurement stations and the source. Finally,
there existed the freedom-of-choice loophole: the settings of the polarizers should be
chosen freely or randomly. This implies using a truly random choice, for instance, a
true random-number generator.
All three loopholes were overcome in recent experiments [11, 13, 18]. The locality
loophole was overcome by using large distances between detectors A and B (60 m,
170 m and 1.3 km, respectively) in combination with fast electronics. The freedom-
of-choice loophole was eliminated by applying various random-number generators.
The fair sampling loophole was closed by using a non-maximally entangled state
and detection with 73 % [18] and 76 % [11] efficiencies, provided by superconduct-
ing nanowires and transition-edge sensors. Two experiments [11, 18] were performed
with entangled photons generated via SPDC and one [13], with spin-1/2 excitations
in nitrogen-vacancy centers in diamond, which allowed for a perfect detection effi-
ciency. These three loophole-free tests of Bell’s inequality seem to have put an end to
the debates about the local hidden variables.
In the end, it is time to explain: why does the Bell inequality (13.12) or (13.4) break
down? For many, a good answer will be ‘because this is what quantum mechanics
says; see the result of calculation (13.10). Photons A, B form a joint quantum system,
each part of which is in a mixed state and has zero mean values of all Stokes observables
S1,2,3 before the measurement.’ But the question is then: where did we make a mis-
take in the derivation of inequality (13.4)? Following again Klyshko’s argument [14],
the answer is that we assumed the existence of a joint probability distribution p({λ})
for all variables a, a′, b, b′ entering the expression for F. Moreover, we assumed this
probability distribution to be non-negative. At the same time, we have noticed that
different mean values entering inequality (13.4) cannot be measured simultaneously.
Therefore, their joint probability distribution is unphysical. We faced a similar situa-
tion in Section 11.1.3: the Wigner function pretends to be a joint probability distribu-
tion for variables (position and momentum) that cannot be measured simultaneously;
the price is that it can be negative. Similarly, we can force Bell’s inequality to hold true
by allowing the probability distribution p({λ}) to take negative values [8].
People have always needed to send secret messages. No matter what the secrets were,
military, trade, or personal, it was often important to protect a message against a possible
interception. To encrypt a message means to assign to every letter
some symbol, number, or other letter, thus creating a cipher. To recover the original
message, the cipher has to be used again. If the same cipher is used several times, then
it can be broken by noticing certain regularities in the encrypted texts. This is exactly
how the Enigma machine codes were broken during World War II [4].
A powerful way to encrypt a message is to use the so-called one-time pad, pro-
posed by Vernam. The message should then first be binary encoded, i. e., represented
by a sequence of ‘zeros’ and ‘ones’. For instance, the word ‘light’ is represented by a
string of 40 bits (the intervals are added for clarity):
0110 1100 0110 1001 0110 0111 0110 1000 0111 0100
The encryption is done by summing the message modulo 2 with the key, i. e., another string
of bits of the same length. For instance, here is a randomly generated key of 40 bits:
0110 0001 0011 0001 1000 0000 0001 1001 0011 0110
The sum of the message and the key modulo 2 is
0000 1101 0101 1000 1110 0111 0111 0001 0100 0010
One who knows the secret key can decrypt the message by summing it with the
key modulo 2 again. As proved by Shannon, if the secret key is used only once, has the
same length as the message, and is purely random, then the encryption is perfectly
secure [4].
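The one-time pad example above can be reproduced in a few lines. The sketch below uses the message and key bit strings from the text; summing modulo 2 is implemented as a bitwise XOR, and the function names are illustrative choices made here.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Sum two equal-length byte strings modulo 2, bit by bit (XOR)."""
    if len(data) != len(key):
        raise ValueError("a one-time pad key must have the same length as the message")
    return bytes(d ^ k for d, k in zip(data, key))

def to_bits(data: bytes) -> str:
    """Format bytes as groups of four bits, as in the text."""
    return " ".join(f"{byte >> 4:04b} {byte & 0x0F:04b}" for byte in data)

message = "light".encode("ascii")
# The 40-bit key from the example, written byte by byte
key = bytes([0b01100001, 0b00110001, 0b10000000, 0b00011001, 0b00110110])

cipher = xor_bytes(message, key)       # encryption
recovered = xor_bytes(cipher, key)     # decryption with the same key

print("message:", to_bits(message))
print("key:    ", to_bits(key))
print("cipher: ", to_bits(cipher))
print("decoded:", recovered.decode("ascii"))
```

Applying the same key twice returns the original message because every bit summed with itself modulo 2 gives zero.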
Therefore the only remaining task of cryptography is to distribute the secret key between two
users, in such a way that it is best protected from an interceptor (an eavesdropper).
There are many ways to do it—for instance, by simply sending it with some person—but
all these ways are vulnerable. The great advantage offered by quantum physics is the
‘fragility’ of a single quantum system, and the fact that a measurement performed on
it should, in general, destroy its state. Therefore in quantum cryptography, the secret
key is distributed by imparting the bits to the state of a quantum system, thus turning
them into qubits. The qubits can be then physically sent from one user to another.
This ground-breaking idea started the whole field of quantum key distribution (QKD),
which is now the most industrialized part of quantum information science.
An important part of the whole principle of QKD is that a single quantum system
cannot be copied (cloned). This no-cloning theorem can be rigorously proved [4] and
it means that an eavesdropper cannot copy the transmitted qubits and thereby gain
access to the secret key, or at least to a part of it.
The goal of QKD is to distribute the secret key between two legitimate users, who
are traditionally called Alice (A) and Bob (B). In the course of distribution, the key
should be protected from the eavesdropper, who is traditionally called Eve (E). The
exchange of information involves at least two channels: the quantum channel, through
which the qubits are sent, and the public channel, which could be radio, television,
Internet, and which is accessible to everyone. The quantum channel can be noisy;
the public channel is more robust: the information transmitted through it cannot be
modified.
Below we only briefly describe the main QKD protocols, with an emphasis on the
use of the polarization degree of freedom. We provide only the basic
principles of each protocol; for the details we refer the reader to two reviews: one
of the earliest, on the basics of the method [10], and the most recent one, on practical
QKD [21].
The first QKD protocol was proposed in 1984 by Bennett and Brassard [4], and is
referred to as BB84, according to the tradition of labeling protocols in cryptography. This
was the first implementation of the idea to encode every bit of the secret key into the
state of a quantum system. As such, Bennett and Brassard proposed the polarization
state of a single photon. As described in Chapter 11, it is impossible to measure two
different Stokes observables for a single photon simultaneously. For example, if a pho-
ton is diagonally polarized, then in a setup for measuring the first Stokes observable
S1 (Fig. 3.4) it will be reflected or transmitted with 50 % probability. Without knowing
in which basis the photon was prepared, an eavesdropper cannot determine its polarization
with certainty. Cloning a polarized photon is also impossible, according to the no-cloning theorem. In
an attempt to intercept a qubit, Eve the eavesdropper will have to detect the photon,
and then she will either reveal herself, as the photon would not arrive at the receiver,
or she will have to re-send this photon. But then she will inevitably send a photon with
a wrong polarization state. This will lead to errors, and again Eve will reveal herself.
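To make the size of these errors concrete, here is a minimal sketch of one specific and simple eavesdropping strategy, assumed here for illustration only: Eve measures every photon in a randomly chosen basis ('+' or 'X') and re-sends the state she detected. The sketch enumerates all cases in which Alice and Bob use the same basis and applies Malus's law, cos² of the angle between polarization directions. The function names are hypothetical.

```python
import math
from itertools import product

# Linear polarization directions in degrees.
ANGLE = {"H": 0.0, "V": 90.0, "D": 45.0, "A": 135.0}
BASES = {"+": ("H", "V"), "X": ("D", "A")}


def prob(prepared: str, detected: str) -> float:
    """Probability that a photon prepared in one state is detected in another."""
    delta = math.radians(ANGLE[prepared] - ANGLE[detected])
    return math.cos(delta) ** 2


def intercept_resend_error_rate() -> float:
    """Average error probability in the sifted key under intercept-resend."""
    total = 0.0
    cases = 0
    for basis, eve_basis in product(BASES, BASES):
        for sent in BASES[basis]:
            wrong = next(s for s in BASES[basis] if s != sent)
            cases += 1
            # Eve detects eve_result and re-sends it; Bob measures in Alice's basis.
            for eve_result in BASES[eve_basis]:
                total += prob(sent, eve_result) * prob(eve_result, wrong)
    return total / cases


print(round(intercept_resend_error_rate(), 6))
```

Under these assumptions the error rate in the sifted key comes out at about one quarter: Eve picks the wrong basis in half of the cases, and then Bob's result is wrong half of the time.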
The protocol then runs as follows. Alice sends a sequence of pulses (for instance,
femtosecond pulses with 80 MHz repetition rate), each of which, ideally, contains a
single photon polarized differently. Alice encodes each bit of the secret key into the
polarization states of these photons. But she uses two different rules for encoding.
In half of the cases, she encodes ‘0’s into horizontally polarized photons |H⟩ and ‘1’s
into vertically polarized photons |V⟩ (red arrows in Fig. 13.4). But the other half of bits,
chosen randomly, are encoded using a diagonal polarization basis (blue arrows in
Fig. 13.4). Then, a diagonally polarized photon |D⟩ corresponds to bit ‘0’ and the anti-
diagonally polarized photon |A⟩, to bit ‘1’.
Figure 13.4: Encoding bits of the key in the BB84 protocol. Alice
randomly switches between the HV (red) basis and the DA (blue)
basis.
In order to encode the bit, Alice could have all photons initially polarized the same
way, for instance, horizontally (|H⟩), and then, for each bit, perform a different
polarization transformation. In the simplest case, she can rotate an HWP: by ±22.5° to
prepare photons |D⟩ and |A⟩, and by 45° to prepare photons |V⟩. In practice, of course,
the procedure is implemented differently in order to increase the transmission rate [21].
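The HWP angles quoted above can be checked with a short Jones-calculus sketch. It assumes the usual Jones matrix of a half-wave plate with its axis at angle θ (up to a global phase) and the sign convention |D⟩ = (|H⟩ + |V⟩)/√2, |A⟩ = (|H⟩ − |V⟩)/√2; the function name is hypothetical.

```python
import math


def hwp_on_horizontal(theta_deg: float) -> tuple[float, float]:
    """Jones vector (E_H, E_V) after an HWP at angle theta acts on |H>.

    HWP Jones matrix, up to a global phase:
        [[cos 2θ,  sin 2θ],
         [sin 2θ, -cos 2θ]]
    Applied to (1, 0) it gives (cos 2θ, sin 2θ).
    """
    t = math.radians(2.0 * theta_deg)
    return (math.cos(t), math.sin(t))


for theta in (0.0, 22.5, -22.5, 45.0):
    e_h, e_v = hwp_on_horizontal(theta)
    print(f"HWP at {theta:+6.1f} deg -> ({e_h:+.4f}, {e_v:+.4f})")
```

A plate at ±22.5° yields amplitudes of equal magnitude with equal or opposite signs (the |D⟩ and |A⟩ states), while 45° moves all the amplitude into the vertical component.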
Let us illustrate the protocol by making a table [4, 10]. The first line in Table 13.1
shows the secret key to be transmitted. As an example, we will use the first 10 bits of
the random sequence from the previous section. The second line shows the basis cho-
sen by Alice: ‘X’ for the diagonal–antidiagonal one and ‘+’ for the horizontal–vertical
one, chosen randomly. In the third line, there are states that Alice prepares. They are
unambiguously determined by the bit in the first line and the chosen basis in the sec-
ond line: for instance, bit ‘0’ in the ‘X’ basis should be |D⟩.
The receiver, Bob, measures the polarization using a standard Stokes measure-
ment setup, as shown in Fig. 11.4 (b) or (c). It is sufficient to have a single HWP, se-
lecting either the measurement of S1 (‘horizontal–vertical’ measurement basis), or the
measurement of S2 (the ‘diagonal–antidiagonal’ basis). As in the case of transmission,
real-life setups do not use rotation of waveplates but faster methods to switch between
the measurements in different polarization bases. Bob does not know in which basis
a bit was encoded; therefore he chooses the basis randomly (the fourth line in Table 13.1).
This way, Bob unambiguously distinguishes between H and V polarizations
if he uses the ‘+’ basis. But in approximately half of the cases he uses the ‘X’ basis,
and then his result is random if the qubit was |H⟩ or |V⟩, so that he makes a mistake
in half of such cases. The conclusions Bob makes are shown in the fifth line of the table.
Random bit     0    1    1    0    0    0    0    1    0    0
Alice’s basis  +    +    X    +    +    X    X    +    X    +
Alice’s qubit  |H⟩  |V⟩  |A⟩  |H⟩  |H⟩  |D⟩  |D⟩  |V⟩  |D⟩  |H⟩
Bob’s basis    +    X    X    X    +    X    X    +    X    X
Bob measures   H    D    A    A    H    D    D    V    D    A
Same basis?    Y    N    Y    N    Y    Y    Y    Y    Y    N
Sifted key     0    -    1    -    0    0    0    1    0    -
Test Eve?      Y Y
Secret key     0 1 0 0 1
After a certain number of bits have been transmitted (and all photons have been
detected and destroyed!), Bob publicly announces which basis he used for each bit.
Alice then says in which cases they used the same bases. Alice and Bob discard the
bits where they used different bases, and leave only those where they used the same
ones. After this procedure, called key sifting, the length of the key is approximately
halved, because the probability for Alice and Bob to use the same basis is
50 %. Nevertheless, the part of the key that remains is random, and long enough if
sufficiently many bits have been sent. The key is common for Alice and Bob, as long
as there were no errors during the transmission. These could be caused both by depo-
larization or loss of the photons and by the presence of Eve. Indeed, Eve could have
‘stolen’ some qubits, reproduced them somehow (with errors of course) and then, after
the public announcement of the bases used, acquired some part of the random key.
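The encoding and sifting steps can be traced with a few lines of code, using the bits and bases listed in Table 13.1. The function and variable names are hypothetical; only the rules (bit plus basis determines the state, and bits with mismatched bases are discarded) come from the text.

```python
# Encoding rule of BB84: (bit, basis) -> polarization state.
STATE = {("0", "+"): "H", ("1", "+"): "V", ("0", "X"): "D", ("1", "X"): "A"}


def prepare(bits: str, bases: str) -> list[str]:
    """States Alice sends for the given bits and basis choices."""
    return [STATE[(bit, basis)] for bit, basis in zip(bits, bases)]


def sift(bits: str, alice_bases: str, bob_bases: str) -> str:
    """Keep only the bits for which Alice and Bob used the same basis."""
    return "".join(
        bit for bit, a, b in zip(bits, alice_bases, bob_bases) if a == b
    )


bits = "0110000100"          # first line of Table 13.1
alice_bases = "++X++XX+X+"   # second line
bob_bases = "+XXX+XX+XX"     # fourth line

print(prepare(bits, alice_bases))
print(sift(bits, alice_bases, bob_bases))   # 0100010
```

Seven of the ten bits survive here, slightly more than the 50 % expected on average for a long key.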
Therefore, Alice and Bob need to check whether Eve interfered in the course of
transmission. To this end, they take a part of the key, for instance 10 %, and compare
it (line 8 of the table). This comparison is also made through the public channel,
so these 10 % of the key are then discarded. If eavesdropping took place, the key
would contain more errors than losses and depolarization alone would cause. In that case, the
whole key is thrown out and the procedure is repeated anew. Otherwise (as shown
in the table), the rest of the key is kept and, after applying error correction procedures
[4, 10], used as the secret key.
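A minimal sketch of this check is given below. The function name, the `seed` argument, and the example keys are assumptions made for illustration; the acceptable error level, which depends on the losses and depolarization of the actual channel, is deliberately left to the caller.

```python
import random


def check_for_eve(alice_key: str, bob_key: str, fraction: float = 0.1, seed=None):
    """Publicly compare a random fraction of the sifted key.

    Returns the error rate observed in the disclosed bits together with
    Alice's and Bob's remaining (undisclosed) keys.
    """
    if len(alice_key) != len(bob_key):
        raise ValueError("sifted keys must have the same length")
    rng = random.Random(seed)
    n = len(alice_key)
    k = max(1, round(fraction * n))           # number of bits to disclose
    tested = set(rng.sample(range(n), k))     # positions announced publicly
    errors = sum(alice_key[i] != bob_key[i] for i in tested)
    keep = [i for i in range(n) if i not in tested]
    return (
        errors / k,
        "".join(alice_key[i] for i in keep),
        "".join(bob_key[i] for i in keep),
    )


rate, alice_rest, bob_rest = check_for_eve("01" * 10, "01" * 10, fraction=0.1, seed=1)
print(rate, len(alice_rest))
```

The disclosed positions are removed from both keys, since anything sent over the public channel is known to Eve.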
The BB84 protocol can be further advanced by using, apart from the four linearly
polarized states |H⟩, |V⟩, |D⟩, |A⟩, also circularly polarized photons |R⟩, |L⟩. Various
other modifications, as well as the strategies of Eve, are described in Ref. [10]. The current
state of the art in QKD can be found in Ref. [21].
An important question is how to produce single photons for QKD. Ideally, the
transmitted states should not contain multiphoton components; otherwise the eavesdropper
can tap off and use one of the photons. Until recently, most QKD systems
operated with weak coherent pulses, with the mean photon number per pulse below
0.2. Then, the probability to have two photons per pulse is small. However, this probability
is still nonzero; in addition, most pulses are ‘empty’, which strongly reduces the
rate of transmission. One way to obtain single photons is to use single-photon emitters
such as atoms, molecules, color centers in diamond, or quantum dots; another
way is to obtain single photons from SPDC or FWM through heralding (Section 12.3.2).
State-of-the-art QKD methods use both these techniques [21].
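The trade-off for weak coherent pulses can be quantified with the Poissonian photon-number statistics of a coherent state. The sketch below takes the mean photon number to be exactly 0.2, the upper bound quoted above; the function name is hypothetical.

```python
import math


def poisson(n: int, mean: float) -> float:
    """Probability of finding n photons in a coherent pulse with the given mean."""
    return math.exp(-mean) * mean**n / math.factorial(n)


mean = 0.2
p_empty = poisson(0, mean)           # no photon in the pulse
p_single = poisson(1, mean)          # exactly one photon
p_multi = 1.0 - p_empty - p_single   # two or more photons

print(f"empty:  {p_empty:.4f}")
print(f"single: {p_single:.4f}")
print(f"multi:  {p_multi:.4f}")
```

With these numbers about 82 % of the pulses are empty, about 16 % carry a single photon, and somewhat under 2 % carry two or more photons.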
Unlike in the BB84 protocol, where the qubits are prepared by the sender (Alice) and
detected by the receiver (Bob), in EPR-based protocols a single distributor sends the
qubits to the two users [10]. The qubits should form an entangled state, namely one of the four
Bell states (12.32)–(12.35); see Section 12.3.3. Both Alice and Bob have Stokes measurement
setups and do the same as Bob in the BB84 protocol: they randomly choose bases
and write down a zero or a one, depending on whether their photon gets reflected or
transmitted.
Two-qubit BB84 protocol. In the simplest case, the state is |Φ(+) ⟩, and Alice and
Bob randomly and independently switch between ‘horizontal-vertical’ and ‘diagonal–
antidiagonal’ bases. If they both use the ‘+’ basis, they have perfect correlation. Either
both photons are horizontally polarized and get reflected in both setups—then both
Alice and Bob record a ‘0’ bit—or both photons are vertically polarized and get trans-
mitted; then both Alice and Bob write down ‘1’. But whenever they switch to the ‘X’
basis, the situation remains the same, because the |Φ(+) ⟩ state is invariant to linear
polarization rotation; in the AD basis it takes the same form:
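\[
|\Phi^{(+)}\rangle = \frac{1}{\sqrt{2}}\bigl(|H\rangle_A |H\rangle_B + |V\rangle_A |V\rangle_B\bigr) = \frac{1}{\sqrt{2}}\bigl(|D\rangle_A |D\rangle_B + |A\rangle_A |A\rangle_B\bigr). \tag{13.14}
\]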
This symmetry can be seen, for instance, for the macroscopic analogue of the |Φ(+) ⟩
state in Fig. 12.13. Then, if Alice and Bob use the same bases, they have exactly the
same bits of their key. The rest of the protocol works similarly to BB84.
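The basis invariance of the |Φ(+)⟩ state can be verified numerically. The sketch assumes the convention |D⟩ = (|H⟩ + |V⟩)/√2, |A⟩ = (|H⟩ − |V⟩)/√2 and writes two-photon states as four amplitudes in the order HH, HV, VH, VV; the helper names are hypothetical.

```python
import math

s = 1.0 / math.sqrt(2.0)
H, V = (1.0, 0.0), (0.0, 1.0)
D, A = (s, s), (s, -s)


def kron(a, b):
    """Tensor product of two single-photon states (amplitudes HH, HV, VH, VV)."""
    return [x * y for x in a for y in b]


def add(u, v):
    """Component-wise sum of two amplitude lists."""
    return [x + y for x, y in zip(u, v)]


phi_hv = [s * c for c in add(kron(H, H), kron(V, V))]
phi_da = [s * c for c in add(kron(D, D), kron(A, A))]

same = all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(phi_hv, phi_da))
print(same)
```

Both expansions give the same four amplitudes, so Alice and Bob see perfect correlations in the 'X' basis as well as in the '+' basis.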
The E91 protocol, proposed by Ekert in 1991, uses the singlet Bell state |Ψ(−) ⟩. As
discussed in Section 12.3.3, this state is invariant to any polarization transformation.
Therefore, whatever polarization bases Alice and Bob use, their results are perfectly
anticorrelated as long as the bases are the same, so that they obtain the same bits
once one of them inverts his or her record.
In the E91 protocol, they randomly switch between three polarization bases: the
‘HV’ one, the ‘AD’ one, and a third one, in which the polarization states are also
linear but the polarization directions are at 22.5° and 112.5°. In the Stokes measurement
setup of Fig. 11.4 (b), this basis is accessed with no QWP and the HWP oriented at 11.25°;
this setting corresponds to the measurement of the observable (S1 + S2)/√2.
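The form of this observable follows from a short calculation, sketched here under the assumption that a projection onto linear polarization at angle θ to the horizontal measures the combination S1 cos 2θ + S2 sin 2θ of the first two Stokes observables (θ = 0 giving S1 and θ = 45° giving S2). An HWP at angle χ selects the direction θ = 2χ, so χ = 11.25° gives θ = 22.5°:

```latex
\[
S_\theta = S_1 \cos 2\theta + S_2 \sin 2\theta ,
\qquad
S_{22.5^\circ} = S_1 \cos 45^\circ + S_2 \sin 45^\circ
             = \frac{1}{\sqrt{2}}\,(S_1 + S_2).
\]
```

The orthogonal direction, 112.5°, corresponds to the opposite sign of the same observable, so the two directions together form one measurement basis.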
As in the other protocols, after receiving some number of bits, Alice and Bob publicly
discuss the bases they used and sift the key by discarding all cases where they
used different bases. But in contrast to the other protocols, now they can check for the
presence of an eavesdropper by testing Bell’s inequalities with the data they have.
EPR-based protocols were further developed into so-called device-independent
protocols; for more details, see Ref. [21].
The B92 protocol, proposed by Bennett in 1992, uses two non-orthogonal states for
QKD. For instance, Alice uses |H⟩ to encode ‘0’ and |D⟩ to encode ‘1’ (Fig. 13.5).
As in the other protocols, Bob chooses his basis randomly, between ‘+’ and ‘X’.
If he has the photon reflected in the ‘+’ basis, he concludes that the bit
was ‘1’: if it were ‘0’, then the photon would be horizontally polarized and would
definitely be transmitted. But if, in the same basis, Bob has the photon transmitted,
no conclusion can be made: the photon could be either diagonally polarized or
horizontally polarized in this case. Therefore, Bob says that the result is inconclusive
(Fig. 13.5) and discards this bit.
The same happens if Bob uses the ‘X’ basis and has a photon transmitted: it could
be diagonally polarized, but it could also be horizontally polarized; therefore the result
is inconclusive (Fig. 13.5) and the bit is discarded. Only if Bob gets the photon
reflected in the ‘X’ basis does he write down ‘0’, because the photon could not have been
diagonally polarized.
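Bob's decision rule in B92 can be summarized in a short sketch. It assumes, as in the description above, that |H⟩ is always transmitted in the '+' basis and |D⟩ is always transmitted in the 'X' basis, so that the reflected ports correspond to V and A polarization; the function names are hypothetical.

```python
import math
from typing import Optional

ANGLE = {"H": 0.0, "D": 45.0}              # Alice's two non-orthogonal states
REFLECTED_PORT = {"+": 90.0, "X": 135.0}   # reflected port: V in '+', A in 'X'
BIT_OF_STATE = {"H": 0, "D": 1}


def bob_decision(basis: str, outcome: str) -> Optional[int]:
    """Bit inferred by Bob, or None if the result is inconclusive."""
    if basis not in REFLECTED_PORT:
        raise ValueError("basis must be '+' or 'X'")
    if outcome != "reflected":
        return None                  # a transmitted photon proves nothing
    return 1 if basis == "+" else 0


def prob_reflected(state: str, basis: str) -> float:
    """Malus's law for the reflected port of the chosen basis."""
    delta = math.radians(ANGLE[state] - REFLECTED_PORT[basis])
    return math.cos(delta) ** 2


def conclusive_fraction() -> float:
    """Average probability of a conclusive result for random bits and bases."""
    cases = [(s, b) for s in ANGLE for b in REFLECTED_PORT]
    return sum(prob_reflected(s, b) for s, b in cases) / len(cases)


print(bob_decision("+", "reflected"), bob_decision("X", "reflected"))
print(round(conclusive_fraction(), 6))
```

Under these assumptions only about a quarter of the transmitted photons give a conclusive result, and every conclusive result is correct in the absence of noise.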
The B92 protocol is easier to realize than BB84. Besides, as we will now see, it can
be applied to continuous-variable states.
Any two non-orthogonal states can be used for the B92 protocol. It is important,
though, to have states orthogonal to them, or at least approximately orthogonal. An
example is a set of two coherent states |α⟩ and |β⟩. Two coherent states are never exactly
orthogonal: the squared modulus of their scalar product is |⟨α|β⟩|² = exp{−|α − β|²}.
For two weak coherent states, this overlap differs considerably from zero; therefore
two weak coherent states are far from orthogonal.
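For a feeling of the numbers, the overlap of two coherent states can be evaluated directly. The sketch uses the standard expression ⟨α|β⟩ = exp(−|α|²/2 − |β|²/2 + α*β), whose squared modulus equals exp(−|α − β|²); the amplitudes chosen below are illustrative assumptions and the function names are hypothetical.

```python
import cmath
import math


def overlap(alpha: complex, beta: complex) -> complex:
    """Scalar product <alpha|beta> of two coherent states."""
    alpha, beta = complex(alpha), complex(beta)
    exponent = -abs(alpha) ** 2 / 2 - abs(beta) ** 2 / 2 + alpha.conjugate() * beta
    return cmath.exp(exponent)


def overlap_squared(alpha: complex, beta: complex) -> float:
    """Squared modulus of the scalar product, exp(-|alpha - beta|^2)."""
    return math.exp(-abs(complex(alpha) - complex(beta)) ** 2)


weak = overlap_squared(0.5, -0.5)     # weak states: substantial overlap
bright = overlap_squared(3.0, -3.0)   # bright states: practically orthogonal
print(f"{weak:.4f}  {bright:.2e}")
```

For α = 0.5 and β = −0.5 the squared overlap is exp(−1), roughly 0.37, whereas for amplitudes ±3 it is already negligible.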
In the simplest continuous-variable QKD (CV QKD) protocol with discrete modulation
[21], Alice encodes the bits into two coherent states that differ only in
phase: ‘0’ is encoded into the state |−α⟩ and ‘1’ into the state |α⟩ (Fig. 13.6). This encoding
is very simple in practice and requires only a phase modulator. The amplitude
of the coherent state should satisfy |α| < 1.
To measure these states, Bob uses homodyne detection. If the value of the q quadra-
ture exceeds the one shown by the right-hand dashed line in Fig. 13.6, the conclusion
is that the state was |α⟩, and Bob writes down ‘1’; if the value is lower than shown by
the left-hand dashed line, then the state must have been |−α⟩, and Bob writes down
‘0’. In all other cases the result is inconclusive. The protocol can be improved by using
quadrature-squeezed, rather than coherent states [20].
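Bob's post-selection rule can be written as a three-way decision. The sketch assumes that the two dashed lines of Fig. 13.6 lie symmetrically at ±`threshold` on the q axis; the threshold value, like the function name, is a free illustrative parameter, not one taken from the text.

```python
from typing import Optional


def homodyne_decision(q: float, threshold: float) -> Optional[int]:
    """Bit inferred from a measured q quadrature, or None if inconclusive."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    if q > threshold:
        return 1        # far on the right: the state was |alpha>
    if q < -threshold:
        return 0        # far on the left: the state was |-alpha>
    return None         # between the dashed lines: discard


for q in (-1.7, -0.2, 0.4, 2.1):
    print(q, homodyne_decision(q, threshold=1.0))
```

Raising the threshold lowers the error probability of the accepted bits at the cost of discarding a larger share of the measurements.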
In connection with the main subject of this book, let us finally mention that the
same CV QKD protocol can be realized with polarized bright states of light. Indeed,
consider two strong coherent beams polarized approximately circularly, but with
small deviations towards the H and V directions (red arrows in Fig. 13.7). Such states
can be shown on the Poincaré sphere with a radius given by the mean photon number.
Both states will be close to the North Pole, and provided the radius of the sphere is
large, the landscape around the states will be practically flat. This situation has been
described in Section 11.3; see Fig. 11.7. In the neighborhood of the North Pole, the two
bright polarized states will look similar to the two coherent states of Fig. 13.6: they
will be displaced in the H-V direction and partially overlap. A Stokes measurement will
distinguish them only partially, like the two weak coherent states. In contrast to homodyne
detection, here one does not need a local oscillator, since its role
is played by the circularly polarized component. Recently, this protocol
was implemented with polarization-squeezed states [17].
Bibliography
[1] A. Aspect, J. Dalibard, and G. Roger. Experimental test of Bell’s inequalities using time-varying
analyzers. Phys. Rev. Lett., 49:1804–1807, Dec 1982.
[2] J. S. Bell. Speakable and unspeakable in quantum mechanics. Cambridge University Press,
1987.
[3] D. Bohm. Quantum theory. Prentice-Hall, 1952.