2024 Kueng Quantum Computing

The document is a set of lecture notes for an 'Introduction to Quantum Computing' course at Johannes Kepler University Linz for Fall 2024, authored by Prof. Dr. Richard Kueng. It covers a wide range of topics including quantum processing units, single and two-qubit circuits, entanglement, quantum teleportation, and Shor's algorithm for integer factorization. The notes also include details on grading, exercise classes, and acknowledgments for contributions from other individuals.

Introduction to Quantum Computing

Johannes Kepler University Linz / Fall 2024

Univ. Prof. Dr. Richard Kueng, MSc ETH


Special thanks to Kristina Kirova, Johannes Kofler, Alexander Ploier, Florian
Schwarcz and Jadwiga Wilkens (alphabetical ordering) for carefully checking
and revising these lecture notes.

Copyright ©2024. All rights reserved.

These lecture notes are composed using an adaptation of a template designed by
Mathias Legrand, licensed under CC BY-NC-SA 3.0
(http://creativecommons.org/licenses/by-nc-sa/3.0/).
Contents

1 Motivation and outline
1.1 Motivation: integer factorization
1.2 Quantum processing units (QPUs)
1.2.1 Non-technical analogy
1.2.2 Different types of quantum hardware
1.3 Tentative overview of topics
1.3.1 Lectures
1.3.2 Exercise classes
1.4 Grading process
1.4.1 Lecture: open-book, written exam
1.4.2 Exercises: moodle quizzes + group project

2 Single qubit circuits I
2.1 Introduction
2.2 Gaining intuition
2.2.1 Overall layout of single-qubit quantum circuits
2.2.2 Classical options: identity and bit-flip gate
2.2.3 Quantum options: superposition and sign-flip
2.3 Rigorous formalism: matrix-vector multiplication
2.4 Application: the BB84 quantum key distribution

3 Single qubit circuits II
3.1 Motivation and outline
3.2 Excursion: complex numbers
3.3 Ultimate limits of single-qubit logic
3.3.1 Recapitulation
3.3.2 Clifford gates
3.3.3 Universal gate sets
3.4 Pauli rotation gates
3.5 Application: restricted sum of parity computations

4 Two-qubit circuits
4.1 Classical reversible operations on 2 bits
4.1.1 Combining single-bit operations in parallel
4.1.2 The Kronecker product
4.2 Quantum operations on 2 qubits
4.2.1 Quantum gates on 2 qubits
4.2.2 Quantum states on 2 qubits
4.2.3 Universal 2-qubit gate sets
4.2.4 Example 1: CNOT with control and target flipped
4.2.5 Example 2: a two-bit random number generator

5 Bell states & Superdense Coding
5.1 Motivation: The Bell state
5.1.1 Stronger than classical correlations
5.2 More Bell states
5.3 Bell measurement
5.4 Superdense Coding
5.5 Bell states as universal inputs for quantum circuit verification & learning

6 Entanglement
6.1 Entanglement
6.1.1 Rotated Bell states
6.2 The CHSH game and Bell inequalities
6.2.1 The CHSH game
6.2.2 Optimal classical strategies
6.2.3 Optimal quantum strategy
6.3 CHSH rigidity and monogamy of entanglement
6.4 Bell inequalities and the violation of local realism
6.5 The E91 protocol for quantum key distribution

7 Quantum teleportation
7.1 Motivation
7.2 Background: marginal and conditional probabilities
7.2.1 Marginal probabilities
7.2.2 Conditional probability distributions
7.2.3 Example 1: Bell state readout
7.2.4 Example 2: Drawing straws
7.3 Quantum $T$-gate teleportation
7.4 Quantum state teleportation

8 General $n$-qubit architectures
8.1 General $n$-qubit architectures
8.2 Classical description of $n$-qubit architectures
8.2.1 State vector representation of general $n$-qubit states
8.2.2 Circuit matrix representation of general $n$-qubit circuits
8.2.3 Classical simulation of $n$-qubit logic and readout
8.3 Implementing classical circuits with quantum logic
8.3.1 Quantum realizations of elementary logical gates
8.3.2 Quantum realization of entire Boolean circuits
8.4 Synopsis

9 Amplitude amplification circuits
9.1 Motivation
9.2 Setup
9.3 Overall idea for a quadratic quantum advantage
9.3.1 High-level vision
9.4 Concrete circuit construction
9.4.1 Circuit 1: reflection about good solutions (‘function oracle’)
9.4.2 Circuit 2: reflection about uniform superposition (‘diffusion operator’)
9.4.3 Combination of the two circuit blocks
9.5 Full amplitude amplification circuit

10 Quantum Fourier-type transforms
10.1 General overview
10.2 Walsh-Hadamard transform
10.2.1 Formal definition
10.2.2 Implementation as a quantum circuit
10.2.3 Fast Walsh-Hadamard transform
10.3 Discrete Fourier transform
10.3.1 Formal definition
10.3.2 Implementation as a quantum circuit
10.3.3 Fast discrete Fourier Transform
10.4 Synopsis

11 Quantum Phase Estimation
11.1 Background: eigenvalue decomposition of normal matrices
11.2 Quantum Phase estimation circuits
11.3 Analysis
11.3.1 Phase kickback effect
11.3.2 Quantum Phase estimation for $m = 1$ readout bit
11.3.3 Quantum Phase estimation for $m = 2$ readout bits
11.3.4 Quantum phase estimation for $m$ readout bits

12 Shor’s algorithm for integer factorization
12.1 Motivation: hard instances of integer factorization
12.2 Reducing Integer Factorization to order finding
12.2.1 The order finding problem
12.2.2 Solving integer factorization via order finding
12.3 Efficiently solving order finding on a quantum computer
12.3.1 Recapitulation: Quantum Phase Estimation (QPE)
12.3.2 Identifying the order parameter in eigenvalues of a simple reversible circuit
12.3.3 Approximate eigenvalues of the modular multiplication circuit via QPE
12.4 Synopsis: implementation of Shor’s algorithm

13 Learning from quantum experiments
13.1 Motivation
13.2 Stylized learning challenge: data hiding
13.2.1 Encoding strategy
13.2.2 Conventional approach
13.2.3 Quantum-enhanced approach
13.3 Demonstration on an actual quantum computer

Bibliography

1. Motivation and outline

Date: October 2, 2024

1.1 Motivation: integer factorization

Agenda:
1 motivation: integer factorization
2 quantum processing units (QPUs)
3 overview of topics
4 grading process

One of the core objectives in computer science is, well, to compute things. On
a high level, this is typically achieved by developing algorithms that break
down potentially complicated tasks into a sequence of simpler, standardized
operations. These standardized operations can then be executed on (classical)
hardware. A modern CPU, for instance, can execute billions of elementary
logical and arithmetic operations in mere seconds, so we are blessed with
substantial amounts of raw computing power.
Alas, raw computing power may not always be enough. There is a wealth
of computing problems where scalability issues prevent even supercomputers
from going to really large problem sizes. One well-known problem of this
kind is integer factorization: decompose a (typically large) number
$N \in \mathbb{N}$ consisting of $n$ bits ($n = \lfloor \log_2(N) \rfloor + 1$)
into a product of prime numbers, i.e.

$$N = F_0 \times \cdots \times F_{m-1} \quad \text{with } F_0, \ldots, F_{m-1} \in \mathbb{N} \text{ prime}. \tag{1.1}$$

[Margin note: integer factorization of an $n$-bit number]

The fundamental theorem of arithmetic states that every integer $N \geq 2$ has
a unique prime factorization (if we arrange the factors in non-decreasing
order, i.e. $F_{i-1} \leq F_i$). And it is relatively easy to check that the
number of factors $m$ must obey $m \leq \log_2(N) \approx n$, because every
prime factor is at least 2 and therefore $2^m \leq N$. So, there can never be
too many factors. Also, and more remarkably, it is possible to efficiently
check that each proposed factor $F_i$ is actually a prime number. This is
courtesy of the AKS algorithm, whose runtime also scales polynomially in $n$.
Together, these two insights ensure that it is always possible to efficiently
check whether a proposed integer factorization (1.1) is valid. Here,
efficiently means that the number of required operations scales (at most)
polynomially in the representation size $n$ (bit length) of $N$.
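The verification argument above can be sketched in a few lines of Python. This
is a minimal illustration under stated assumptions, not part of the lecture
material: the function names (`bit_length`, `is_prime`, `verify_factorization`)
are hypothetical, and a Miller-Rabin test with fixed bases (deterministic for
inputs below 2**64) stands in for the AKS algorithm, simply because it is much
shorter to write down.

```python
from math import prod

def bit_length(N: int) -> int:
    """Number of bits n = floor(log2(N)) + 1 needed to write N >= 1 in binary."""
    return N.bit_length()

def is_prime(p: int) -> bool:
    """Miller-Rabin test with fixed bases; deterministic for p < 2**64.

    Assumption: this is a compact stand-in for AKS, not AKS itself.
    """
    if p < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for q in bases:
        if p % q == 0:
            return p == q          # small primes and their multiples
    d, s = p - 1, 0
    while d % 2 == 0:              # write p - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, p)           # modular exponentiation
        if x in (1, p - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, p)
            if x == p - 1:
                break
        else:
            return False           # base a witnesses that p is composite
    return True

def verify_factorization(N: int, factors: list[int]) -> bool:
    """Check a proposed factorization N = F_0 * ... * F_{m-1}."""
    if not factors or len(factors) > bit_length(N):
        return False               # m <= log2(N), so never more than n factors
    return all(is_prime(f) for f in factors) and prod(factors) == N

print(verify_factorization(12, [2, 2, 3]))   # True
print(verify_factorization(12, [2, 6]))      # False: 6 is not prime
```

Every step here (a bounded number of primality tests plus one product) needs
only polynomially many elementary operations in the bit length $n$, which is
the sense in which checking a factorization is easy.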
But, how can we actually find an integer factorization in the first place?
The easiest algorithm is trial division which goes back to Fibonacci and is
often taught in middle school: systematically test whether $N$ is divisible by a
smaller number. For instance, $12 = 2 \times 6 = 2 \times 2 \times 3$ which is a
valid integer factorization. This algorithm works well if there are a lot of
small prime factors, because it is comparatively cheap to identify those. And
subsequent divisions reduce the remaining problem size considerably. But, trial
division can become extremely resource-intensive if this is not the case. The
worst case occurs if $N = F_0 \times F_1$, where $F_0 < F_1$ are unknown prime
numbers of comparable size $\approx \sqrt{N}$. In such a situation, we need very
many trial divisions to find the first prime factor. Indeed, naively trying all
numbers between 2 and $F_0 \approx \sqrt{N}$ requires a total of
$\approx \sqrt{N}$ trial divisions. This number alone is exponentially
large in the bit size $n$ required to represent $N$:

$$\sqrt{N} = N^{1/2} = 2^{\log_2(N)/2} \approx 2^{n/2}.$$

[Margin note: cost of factoring algorithms scales super-polynomially in bit size $n$]
It is possible to considerably improve the (worst-case) runtime of trial division
by only considering numbers that are prime to begin with (e.g. don’t try
4, 6, 9, ... at all). But, this is not enough to overcome this general
worst-case scaling. The number of required operations for every trial division
variant known to date is still dominated by $2^{n/2}$. It should be noted that
trial division is not the best known algorithm for integer factorization. But
even the best current state of the art – the general number field sieve – cannot
factor an $n$-bit number with a number of basic operations that scales
polynomially in $n$. This feature is actually exploited by widely employed
cryptography schemes, most notably RSA public-key encryption.
[Margin note: hardness of factoring is basis of RSA public-key encryption]
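To make the worst-case count tangible, here is a small Python sketch of plain
trial division that also counts how many divisions it performs. The function
name `trial_division` and the division counter are illustrative choices and do
not come from the lecture.

```python
def trial_division(N: int) -> tuple[list[int], int]:
    """Return (prime factors in non-decreasing order, number of trial divisions)."""
    factors, divisions = [], 0
    d = 2
    while d * d <= N:              # candidates above sqrt(N) are never needed
        divisions += 1
        if N % d == 0:
            factors.append(d)      # d divides N: record it and shrink the problem
            N //= d
        else:
            d += 1                 # naive variant: try every integer, prime or not
    if N > 1:
        factors.append(N)          # whatever remains is itself prime
    return factors, divisions

print(trial_division(12))            # ([2, 2, 3], 2)
print(trial_division(1009 * 1013))   # ([1009, 1013], 1008)
```

The second call is a miniature version of the worst case: two primes of
comparable size force roughly $\sqrt{N} \approx 1011$ divisions. For a number
with $n = 2048$ bits, the same loop would need on the order of $2^{1024}$
divisions.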
To summarize: integer factorization is an example of a problem that is
difficult to solve (best known algorithms scale super-polynomially in input
size), but easy to verify (correctness of a proposed factorization (1.1) can be
checked in only polynomially many steps). This is the trademark structure of a
wide and important class of problems – the problem class NP which you might
remember from your computational complexity lecture. Other example problems of
this kind include 3-SAT, the traveling salesman problem, knapsack, minesweeper
and many more. But, among NP-problems, integer factorization is special.
Firstly, we have strong reasons to believe that it is not quite as difficult as other
problems, like 3-SAT. Secondly, and more importantly for this course, we have
omitted an important detail in our discussion of known factoring algorithms.
We actually know an algorithm that is capable of factoring an $n$-bit number
using a number of elementary operations that scales only polynomially in $n$ –
Shor’s algorithm. The “only” caveat is that this algorithm cannot be executed
on conventional hardware, but requires a different type of hardware that is
more expressive – a quantum computer. We will discuss the precise workings
of this breakthrough algorithm in this class, but also the underlying type of
computational model.
[Margin note: Shor’s algorithm can factor integers efficiently, but requires a
quantum computer]

1.2 Quantum processing units (QPUs)


Shor’s algorithm for efficient integer factorization poses an interesting
conundrum for tried and tested computer science. For decades, we have divided
computational problems into different classes of difficulty. This difficulty is
measured by the number of operations (or runtime) that is required to solve
these problems on any type of computing architecture. But, at least initially,
we have only considered computing architectures that model the layout of
a modern computer, e.g. Turing machines, or logical circuits. The existence
of Shor’s algorithm suggests that this may be too restrictive. It works on a
different hardware proposal – a quantum computer – and can efficiently solve
a problem that we believed to be hard (factoring). A hypothetical quantum
computer is at least as powerful as any conventional type of hardware, but it
can also natively do things that are impossible – or at least: very expensive –
for modern computers. And unprecedented advances over the last decade have
brought us closer to actually building and operating these machines.

1.2.1 Non-technical analogy


Quantum computers are not the next generation of supercomputers. Rather,
they are an entirely new type of computing hardware based on the rules
of quantum mechanics – the laws of nature that govern physical systems at
microscopic scales (e.g. on the level of individual atoms). And, although
well-understood, these rules are radically different from everyday experience.
Understanding how a quantum computer actually works is therefore not that
easy and we will do so in multiple steps. Although more successful than any
other physical theory, quantum mechanics does not have the reputation of being
either simple, or intuitive. Some quantum mechanical effects are responsible
for the astonishing power of quantum computers, while other effects again limit
their potential considerably. Balancing these blessings and curses against each
other to still obtain a net gain can be surprisingly tricky. And, as a result, we
actually do not know many problems for which quantum computers offer an
unconditional (mathematically rigorous) advantage. But, we do know some
and are constantly looking for more.
In order to get a first intuition about quantum computers, a high-level
comparison with conventional hardware can be helpful. The core of most
current computing devices is a central processing unit (CPU). It can be tasked
to carry out any possible set of instructions we throw at it, but is not necessarily
good at computing specific things (a jack of all trades, master of none). This
is where alternative processing units come in. One important example is the
graphics processing unit (GPU). GPUs are designed to perform specialized
mathematical operations, in this case large matrix multiplications, much more
efficiently than traditional CPUs. The original motivation for this setup is
computer graphics, but GPUs are also well-suited for training neural networks
and simulating macroscopic physical systems.

Figure 1.1 Schematic illustration of a hybrid quantum-classical computer: A
conventional Central Processing Unit (CPU) can outsource certain computational
tasks to a Quantum Processing Unit (QPU). The resulting hybrid architecture
combines the strengths of both hardware platforms, but also suffers from
information-transmission bottlenecks (input problem and readout problem).
[Diagram labels: CPU (macroscopic world), QPU (quantum realm), input problem,
readout problem.]

However, even GPUs struggle with the excessive number of mathematical
operations that would be required to accurately simulate physical and chemical
processes beneath the nanoscale. Problems of this type occur naturally
in materials science (e.g. the search for high-temperature superconductors),
pharmaceutics and chemistry (e.g. ab initio drug design) and fundamental
physics (e.g. probing exotic field theories). All these problems have one thing
in common: They adhere to the rules of quantum mechanics. And this renders
them extremely difficult to handle with classical (in the sense of macroscopic;
not quantum mechanical) computations and hardware. Hence, it would be
great if we had a different type of processing unit that is capable of handling
these kinds of problems. This is the conceptual origin of quantum computers
that is often attributed to Richard Feynman:

“Nature isn’t classical, dammit, and if you want to make a simulation
of nature, you’d better make it quantum mechanical, and by golly it’s a
wonderful problem, because it doesn’t look so easy.”
Richard Feynman, 1981.

The term Quantum Processing Unit (QPU) captures the intended purpose more
accurately than the colloquially used term quantum computer. QPUs are not
designed to supersede conventional computers (like CPUs or GPUs) as a whole,
but are specialized processing units that can further augment computing power.
The result is a hybrid quantum-classical computer, schematically illustrated in
Figure 1.1. This combination produces a completely new and different type
of computing architecture that comes with novel opportunities, but also novel
challenges. We will discuss both throughout the course of this lecture.
[Margin note: quantum computers are special-purpose processing units (QPUs)]

Figure 1.2 Pictures of different types of quantum hardware: ion-trap computer
(left), superconducting circuit architecture (center) and optical platform (right).
Pictures are taken from phys.org, qmunity.tech and phys.org, respectively.

1.2.2 Different types of quantum hardware


Let us now briefly discuss the most promising ways to actually implement a
quantum computing device. Note that a lot of quantum physics enters when it
comes to the precise working of these devices. This is not the main focus of
this lecture and we therefore content ourselves with a high-level overview of
the three prevalent platforms. Photographs of each are collected in Fig. 1.2.

Trapped ion quantum computing


Atoms are individual particles that are very small (radius about $10^{-10}$ m).
They consist of a positively charged nucleus and a shell of electrons that is
negatively charged. It is possible to remove individual electrons to produce a
positively charged particle – an ion. This charge interacts with external
electric fields. And one can use these effects to trap ions at a specific
location in 3D space (Paul trap, Nobel Prize 1989). This has allowed quantum
pioneers to create entire chains of ions that are trapped in a 1D line, see
Fig. 1.2 (left). Qubits – the fundamental ‘binary’ carriers of quantum
information – are stored in electronic states of each ion, e.g. 0 ↔ ground
state, 1 ↔ first excited state. External lasers are then used to flip
individual qubits, while multi-qubit operations are achieved by coupling the
state of the qubit in question with the motional states of the entire ion chain.
[Margin note: electronic states of ions carry quantum information]
The result is a fully-functional quantum computing platform where infor-
mation is stored in a collection of ions. Today, more than 100 qubits can be
implemented in this fashion. The clever way of executing multi-qubit opera-
tions ensures that this device has full connectivity. That is, we can let every
qubit talk to every other qubit. One of the downsides of ion-trap platforms is
that they are relatively slow and the motional states – which are essential for
all-to-all interactions between the qubits – are difficult to initialize and can
have rather brief lifetimes. Also, scaling up to (much) larger qubit numbers is
challenging, because the underlying geometry – a chain of ions – is inherently
one-dimensional. A two-dimensional lattice of ions could host many more
qubits, but such arrangements are still at a rather early stage.
Finally, it is worthwhile to point out that ion trap quantum computing is
(almost) an Austrian invention. The theoretical proposal is due to I. Cirac and
P. Zoller (1995) who then both worked in Innsbruck. To this date, Innsbruck
remains a global player in ion trap quantum computing. The spin-off company
Alpine Quantum Technologies is the first European company that actually sets
out to sell these quantum computers.

Superconducting quantum computing


In this platform type, a QPU is implemented with superconducting electronic
circuits. Qubits are stored in microscopic circuits that roughly resemble an
oscillating resonant circuit. The key difference is that they are made of
superconducting material (no resistance) and contain a capacitor (like a
resonant circuit), but also an additional nonlinearity – a superconducting tunnel
junction (Nobel Prize in 1973). The state of such a qubit is determined by the
number of electrons which reside on one side of the circuit compared to the
other. These states can be flipped by microwave pulses sent to an antenna
coupled to the qubit.
[Margin note: electronic circuits carry quantum information]
This basic building block (circuit plus antenna) is in principle scalable using
existing chip manufacturing techniques. Many such individual circuits can be
arranged on a 2D plane which today can host several hundred superconducting
qubits. Interactions between these qubits can be achieved by coupling two
superconducting qubits to an intermediate coupling circuit. These intermediate
circuits must reside between the two qubits in question and must also be
fabricated. This effectively limits interactions to nearest neighbors: every qubit
can only talk to its immediate neighbors.
Today, superconducting platforms are the prevalent type of quantum hard-
ware. Companies like IBM, Google, Rigetti and more have really pushed this
technology over the last couple of years. Today’s devices can operate reliably
with about 100 qubits and they also operate much quicker than existing ion trap
devices. These desirable effects have led to the first justifiable demonstrations
of a quantum advantage – i.e. a well-defined task where quantum computers
outperform even the largest supercomputers to date by a substantial margin.
We will discuss one such result towards the end of this course. However,
superconducting quantum computers are not perfect. They have to operate at
extremely low temperatures (300 mK) to ensure that the underlying material
is perfectly superconducting. This may, in fact, soon become a bottleneck
for further scaling up to larger qubit numbers, because the additional electrical
components produce heat that needs to be counteracted. Also, the stringent
limitation to nearest-neighbor interactions can considerably slow down the
actual realization of a given quantum circuit.
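The cost of restricted connectivity can be illustrated with a toy model. The
sketch below assumes that qubits sit on a square grid and can only interact
with their horizontal or vertical neighbors; real devices use various layouts,
so this geometry is an assumption made purely for illustration. To apply a
two-qubit operation to distant qubits, one of them is first moved next to the
other by repeatedly swapping it with a neighbor. The function name
`swaps_needed` is hypothetical.

```python
def swaps_needed(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Minimum number of neighbor swaps that make grid positions a and b adjacent.

    Assumption: square grid, interactions only along grid edges. Each swap
    moves one qubit by one step, so the count is the Manhattan distance minus 1.
    """
    (r1, c1), (r2, c2) = a, b
    distance = abs(r1 - r2) + abs(c1 - c2)
    return max(distance - 1, 0)

print(swaps_needed((0, 0), (0, 1)))   # 0: already neighbors
print(swaps_needed((0, 0), (3, 4)))   # 6: six extra operations before the gate
```

On a platform with full connectivity, such as the ion chain described earlier,
this overhead is zero for every pair of qubits.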

Optical architectures
So far, we have mainly talked about quantum computing platforms. But these
are only one aspect of the larger field of quantum technology. When it comes
to communication and networks – think internet – light is a very promising
carrier of information. Light can quickly and reliably cover large distances
to convey information. And, it comes in discrete packages of ‘light particles’,
called photons. These photons can be used to represent a qubit, e.g. via
polarization: 0 ↔ horizontal, 1 ↔ vertical. These polarizations can be picked
out and modified with linear optical elements – think mirrors – which effectively
implement single-qubit transformations. What is more, optical devices don’t
require low temperatures and can be built and scaled up relatively easily.
[Margin note: photons carry quantum information]
The big problem is that photons don’t directly interact with each other. This
makes it very difficult to execute multi-qubit operations – a prerequisite for
creating interesting correlations between qubits and performing interesting
computations. One way to overcome this issue is at the source: nanomaterials,
like quantum dots, can be used to create multiple photons at once which
do already exhibit powerful correlations (entanglement). This initial budget
of quantum correlation can subsequently be consumed to execute powerful
quantum computing procedures, like state teleportation which we will discuss
in due time. This technology also forms the basis of the nascent quantum
internet. Alain Aspect, John F. Clauser and Anton Zeilinger – a compatriot who
was born in Ried – are pioneers of this type of quantum technology. On
04.10.2022, they were awarded the Nobel Prize in Physics for these
groundbreaking contributions.
So far, we have emphasized the potential of photons to carry quantum bits
over large distances. However, it is also possible to devise actual quantum
computers based on photons. These so-called measurement-based computing
architectures are remarkably different from conventional circuit architectures.
As such, they go beyond the scope of this lecture.

Synopsis
The above explanations are extremely crude and superficial; they sweep a lot of
important and groundbreaking insights under the rug. But we hope that they
convey a high-level message: it is actually possible to build quantum computing
platforms and there are, in fact, several ways to do so. And each comes
with its own advantages and disadvantages.
[Margin note: different ways to realize QPUs]
It should also be noted that qubits and elementary operations (single-qubit +
two-qubit) are not enough by themselves. We also need a way to initialize the
qubits (qubit initialization) and a way to access the final result (qubit
readout). We will discuss all these
ingredients in the next lecture. It is also important to note that all these
operations are challenging (we operate on the tiniest scales imaginable) and
not perfect. Every operation is bound to incur a small error. And these errors
can, and do, accumulate when we start combining many operations to build
a larger circuit. Today, this severely hampers our ability to scale up quantum
computing technology. There are, however, ways to overcome these issues. We
will briefly cover this topic of quantum error correction towards the end of this
course.
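A back-of-the-envelope model shows how quickly small errors add up. Assume,
purely for illustration, that every operation fails independently with the
same probability $p$. Then a circuit with $G$ operations runs without any
error with probability $(1 - p)^G$. This error model and the function names
below are simplifying assumptions and are not taken from the lecture.

```python
import math

def success_probability(p: float, num_ops: int) -> float:
    """Probability that num_ops independent operations all succeed."""
    return (1.0 - p) ** num_ops

def max_operations(p: float, target: float) -> int:
    """Largest G such that (1 - p)**G is still at least `target`."""
    return math.floor(math.log(target) / math.log(1.0 - p))

print(round(success_probability(0.001, 1000), 3))   # 0.368
print(max_operations(0.01, 0.5))                    # 68
```

Even with a per-operation error of only 0.1%, a circuit with a thousand
operations succeeds in little more than one third of all runs under this
model. This is one way to see why error rates limit the circuit sizes we can
run reliably today.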

1.3 Tentative overview of topics


1.3.1 Lectures
Lectures will occur weekly on Wednesdays, 13:45–15:15 and we aim for about
13 lectures in total. This is very little time to cover the vast area of quantum
computing. As a result, we will make compromises and sub-select topics which
we then discuss in depth. The core focus of this class will be quantum circuits
and, by extension, quantum algorithms. These are arguably the ‘raison d’être’
for building quantum hardware in the first place. They also can be understood
as an interesting generalization of digital circuits.
[Margin note: lecture focus on quantum circuits & algorithms]
To get everyone up to speed, we start small and gradually build up to more
involved and potentially impactful algorithms described by quantum circuits.
Here is a tentative list of 13 topics:

1 Motivation and outline
2 Single-qubit circuits 1: gaining intuition, mathematical formalism and
  one application: quantum cryptography (BB84)
3 Single-qubit circuits 2: universal gate sets, approximation theorems
  (Solovay-Kitaev), reversibility and one application: compute the parity of
  a sum of rational numbers
4 Two-qubit circuits 1: basic building blocks and mathematical formalism
5 Two-qubit circuits 2: Bell states (entanglement) and superdense coding
6 Quantum teleportation: gate & state teleportation
7 Many-qubit circuits: relation between quantum and classical circuits
8 Amplitude amplification: quadratic speedup for 3-SAT (Grover)
9 Quantum Fourier transforms: Walsh-Hadamard transformation and the
  Quantum Fourier transform
10 Quantum Phase Estimation: counting the number of 3-SAT solutions
11 Shor’s algorithms for discrete logarithms and integer factorization
12 Basics of quantum error correction
13 (Machine) learning from quantum experiments

1.3.2 Exercise classes


In the lecture, we will approach the aforementioned topics from a conceptual
and, admittedly, rather theoretical side. The focus is on laying the framework
that allows us to reason about quantum computing in a consistent, correct and
reproducible fashion. And doing so will involve quite a bit of math, mostly
linear algebra and probability theory. The newly introduced exercise classes
are intended to complement the lecture material with practical hands-on
experience. The overarching goal of the exercise classes is that you program
your own classical simulator of quantum circuits that runs on your own device.
This overarching goal is achieved in several weekly steps: every Wednesday
after the lecture, Jadwiga and Kristina help you to add the things you have
learned in class that day to your personal quantum simulation tool.
Around Xmas, this should provide you with your own classical simulator of
quantum circuits that can – in principle – run everything you have learned in
class. And this includes small-scale realizations of the most famous quantum
algorithms we know to date! Playtime will have arrived by January and the
exercise classes culminate in a group project where you really discuss and
implement a small-scale variant of a famous quantum algorithm. This should
be fun!

1.4 Grading process


1.4.1 Lecture: open-book, written exam
The tentative lecture exam date is Friday February 6th, 2025, 12:00–14:00. We
offer a Moodle exam which you can attend either from home or in a designated
lecture hall. We also propose to do an open-book exam, i.e. you can use
lecture notes and personal summaries and even the internet. This, however,
means that the individual questions will be more challenging: you will have to
analyze and understand how concrete quantum circuits operate. Additional
details and example questions will follow in due course.

1.4.2 Exercises: moodle quizzes + group project


The exercise grading will involve two parts: (i) regular Moodle quizzes that
check whether your classical simulation tool performs correctly on given input
quantum circuits and (ii) a group project where you introduce, discuss and
simulate a famous quantum algorithm of your choosing. Neither part strictly
requires you to attend exercise classes or code your own classical
simulator of quantum circuits, but we highly advise it.
2. Single qubit circuits I

Date: October 10, 2024

Agenda:
1 introduction
2 gaining intuition
3 rigorous formalism: matrix-vector multiplication
4 application: quantum key distribution (BB84)

2.1 Introduction

Today, we take our first steps into the realm of quantum computation. A high-
level schematic of a quantum processing unit (QPU) is displayed in Fig. 2.1.
Note that a QPU is a digital device: it ‘eats’ bitstrings and ‘spits out’ bitstrings.
This figure also highlights an admittedly unconventional choice of convention.

Convention (reading circuit diagrams from right to left). In this class, we read
circuit diagrams from right to left. The red arrow in Fig. 2.1 visually
illustrates this flow of information (“arrow of time”). The reason for this
convention will become clear later on: it plays nicely with the mathematical
formalism we use to capture quantum logic (matrix-vector multiplication).

Figure 2.1 Schematic illustration of a quantum processing unit (QPU): on a high


level, a QPU maps bitstrings to bitstrings. Also, in this class we read circuit
diagrams from right to left. The red arrow underscores this convention.
Figure 2.2 Schematic illustration of a single-qubit processor (QPU): input (very
right) and output (very left) of a single-qubit QPU are conventional bits.
In between, single-qubit logic (blue) is used to process the input bit directly at
the quantum level. Disruptive effects happen at the quantum-classical interface
(purple arrows), in particular the readout stage.

For the remainder of this chapter – and most lectures further down the road
– we adopt a hardware perspective and represent QPUs by quantum circuits.
Today, we focus on circuits that affect a single quantum bit, called qubit. We
will see that these circuits can execute well-known logical functionalities (e.g.
negation), but other elementary gates don’t have a classical counterpart at
all. For illustrative purposes, we will heavily use the Quantum Circuit Library
developed by our very own JKU team member Jadwiga Wilkens [Wil23].

2.2 Gaining intuition


2.2.1 Overall layout of single-qubit quantum circuits
A QPU operates on two fundamentally different levels. Input and output
do correspond to conventional bit strings. However, the logic in-between is
executed on extremely small scales – the realm of individual atoms and photons
(particles of light). There, genuine quantum effects become available and
can be used to perform completely new types of (quantum) logic. Fig. 2.2
illustrates such a setting for a single qubit. A single qubit is initialized with an
input bit value 𝑏 ∈ {0, 1} (right). Subsequently, a collection of single-qubit
logical gates is applied (center). This is where the actual quantum computation
happens. Once this is completed, we perform a readout step where the qubit is
measured to produce a single output bit 𝑜 ∈ {0, 1}. Throughout the course of
today’s lecture, we will explore the workings of such a hybrid quantum-classical
architecture. We will discover that the quantum logic part is captured by
a nice deterministic and even reversible formalism. The interfaces between
quantum and classical realm are more disruptive by comparison. Readout,
in particular, can produce true randomness – something that is impossible
for conventional (deterministic) hardware. Let us now start to discover the
workings and interplay of these different constituents in a step-by-step fashion.
2.2.2 Classical options: identity and bit-flip gate


Hybrid quantum-classical architectures, like the one displayed in Fig. 2.2, are
capable of executing basic logical functionalities. A simple, but often under-
appreciated, logical gate is the identity operation, i.e. do nothing. This operation
maps a bit to itself, i.e.

𝕀(𝑏) = 𝑏 for 𝑏 ∈ {0, 1}.

Equivalently, we can fully capture this (trivial) action by the following truth
table:

\[
\begin{array}{c|cc}
  & 0 & 1 \\ \hline
0 & 1 & 0 \\
1 & 0 & 1
\end{array}
\qquad \text{(identity truth table)}. \tag{2.1}
\]

A full pipeline with qubit initialization, identity gate (blue) and readout stage
looks as follows:

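[Circuit diagram: qubit initialized with input bit 𝑏 , identity gate 𝕀, readout of output bit 𝑜 = 𝑏 .] (2.2)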

As intended, the final output bit 𝑜 is equal to the input bit 𝑏 . In formulas:
𝑜 = 𝑏 . This showcases that we can use a single-qubit QPU to reliably store one
bit of information within the quantum realm and recover it exactly at some
later point in time.
Another important logical gate is the negation, or bit-flip, operation:

X (𝑏) = ¬𝑏 for 𝑏 ∈ {0, 1},

where ¬0 = 1 and ¬1 = 0. This important logical operation is covered by the
following truth table:

\[
\begin{array}{c|cc}
  & 0 & 1 \\ \hline
0 & 0 & 1 \\
1 & 1 & 0
\end{array}
\qquad \text{(bit-flip truth table)}. \tag{2.3}
\]

A single-qubit QPU can also implement this functionality:

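[Circuit diagram: qubit initialized with input bit 𝑏 , bit-flip gate 𝑿 , readout of output bit 𝑜 = ¬𝑏 .] (2.4)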

Together, Eq. (2.2) and Eq. (2.4) showcase that we can use a hybrid quantum-
classical pipeline to implement the two most important single-bit operations.
These operations, however, have an additional feature. Applying them twice


has no effect on the (qu)bit in question:

𝕀 (𝕀(𝑏)) = 𝕀(𝑏) = 𝑏 and X ( X (𝑏)) = ¬ (¬𝑏) = 𝑏 for 𝑏 ∈ {0, 1}.

This, in particular, means that identity and bit-flip are reversible logical oper-
ations. They do not erase any information about the input bit. In fact, their
action can be readily undone by applying the same gate again. Note that not all
conceivable single-bit functions have this feature. There are in total 4 Boolean
functions that map a single bit onto a single bit. Identity and bit-flip are two of
them. The other two correspond to re-setting the bit to one particular value.
I.e. 𝑓 (𝑏) = 0 (reset to 0) or 𝑓 (𝑏) = 1 (reset to 1), for 𝑏 ∈ {0, 1}. Clearly, these
re-set operations are not reversible, because they completely erase the input bit.
The quantum implementations of identity and bit-flip do adhere to reversibility.
This is captured by the following streamlined circuit diagram equations:

[Circuit diagram equations: two consecutive 𝕀 gates equal a single 𝕀 gate, and two consecutive 𝑿 gates equal 𝕀.] (2.5)
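
The counting argument above (four single-bit Boolean functions, only two of them reversible) can be checked by brute force. The following is a minimal sketch; the dictionary `SINGLE_BIT_FUNCTIONS` and the helper names are illustrative choices made here and are not part of the lecture's simulator.

```python
# The four Boolean functions {0,1} -> {0,1}, each stored as (f(0), f(1)).
SINGLE_BIT_FUNCTIONS = {
    "identity": (0, 1),
    "bit-flip": (1, 0),
    "reset to 0": (0, 0),
    "reset to 1": (1, 1),
}


def is_reversible(table):
    """A single-bit function is reversible iff distinct inputs give distinct outputs."""
    return table[0] != table[1]


def is_self_inverse(table):
    """Check f(f(b)) == b for both input bits b."""
    return all(table[table[b]] == b for b in (0, 1))


for name, table in SINGLE_BIT_FUNCTIONS.items():
    print(f"{name:>10}: reversible={is_reversible(table)}, "
          f"self-inverse={is_self_inverse(table)}")
```

Only identity and bit-flip pass both checks; the two reset functions map both inputs to the same output and therefore erase the input bit.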

2.2.3 Quantum options: superposition and sign-flip


We have just seen that a single-qubit QPU can reproduce basic logical functional-
ities, as long as they are reversible (𝕀 and 𝑿 ). This is a good start that suggests
that QPUs may be capable of executing conventional logical operations. But,
by itself, this is not really disruptive yet. Let us now discuss some genuinely
quantum operations that don’t have a conventional counterpart. First and
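foremost, there is the Hadamard or superposition gate:

[Circuit diagram: qubit initialized with input bit 𝑏 , Hadamard gate 𝑯 , readout of output bit 𝑜 , with 𝑜 = 0 w.p. 1/2 and 𝑜 = 1 w.p. 1/2.] (2.6)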

where ‘w.p. 1/2’ is short for ‘with probability 1/2’. This classical-quantum-
classical pipeline takes an arbitrary single-bit input 𝑏 and produces a uniformly
random output bit. We write
\[
o \overset{\mathrm{unif}}{\sim} \{0, 1\}
\]

to denote that 𝑜 = 0 and 𝑜 = 1 happen with equal probability 1/2 each. This
feature is a striking deviation from conventional logic which is fundamentally
deterministic. The execution of a Hadamard gate uses an interesting quantum
effect, called superposition: a binary quantum system (qubit) can assume both
bit values at the same time.
If we read out (measure) such a superposition of binary values, the outcome
bit we obtain is truly random: both 𝑜 = 0 and 𝑜 = 1 occur with equal
probability. This is the same situation as a fair coin flip. The Hadamard gate
provides the means to observe true randomness by bringing a qubit into equal
superposition. And, equally strikingly, we can use another Hadamard gate to
exit superposition again. Much like identity and bit-flip, the Hadamard gate is
also reversible. In fact, it is its own inverse as well:

[Circuit diagram equation: two consecutive 𝑯 gates equal 𝕀.] (2.7)

Together, Eq. (2.6) and Eq. (2.7) reveal a striking quantum phenomenon. The
first equation showcases that the Hadamard gate can be used to generate
uniformly random bits. In standard binary logic, randomization requires an
external seed and cannot be undone without erasing the bit in question. Or,
put differently: the only way to map a random bit 𝑟 into a deterministic bit 𝑜
is to erase and reset. This breaks any correlations with the original input bit 𝑏 .
The Hadamard gate, however, is not like this at all! We can apply it twice to
completely undo its effect and recover a perfect correlation between input bit
𝑏 and output bit 𝑜 . This is impossible in classical logic (even when we allow for
true randomness).
The ‘truth table’ of the Hadamard gate reflects this, because it doesn’t
adhere to the rules of conventional logic:

\[
\begin{array}{c|cc}
  & 0 & 1 \\ \hline
0 & 1/\sqrt{2} & 1/\sqrt{2} \\
1 & 1/\sqrt{2} & -1/\sqrt{2}
\end{array}
\qquad \text{(Hadamard ‘truth table’)}.
\]
The detailed numbers in this table should become clear later on. For now, we
emphasize two things:

(i) the magnitude of each entry is the same, that is 0 and 1 feature in equal
measure within the superposition;
(ii) the two rows (columns) are distinct. This means that information about
the input qubit is actually preserved.

We conclude this section with another quantum logic gate, the sign-flip gate.
The ‘truth table’ of the Hadamard gate already suggests that quantum logic can
feature positive and negative numbers. The sign-flip is used to change the sign
of the 1-contribution within a superposition:

\[
\begin{array}{c|cc}
  & 0 & 1 \\ \hline
0 & 1 & 0 \\
1 & 0 & -1
\end{array}
\qquad \text{(sign-flip ‘truth table’)}.
\]
Interestingly, this sign-flip doesn’t do anything if we apply it to quantum


encodings of conventional logic. In particular,

[Circuit diagram: qubit initialized with input bit 𝑏 , sign-flip gate 𝒁 , readout of output bit 𝑜 = 𝑏 .] (2.8)

which looks identical to the action of the identity gate in Eq. (2.2). Also, much
like the identity gate (and every other gate we’ve encountered so far), the
sign-flip gate is also its own inverse:

[Circuit diagram equation: two consecutive 𝒁 gates equal 𝕀.] (2.9)

Importantly, the action of a sign-flip gate becomes nontrivial if we apply it to
superpositions of different bit values. This is achieved by combining 𝒁 with the
Hadamard gate 𝑯 . For instance,

[Circuit diagram equation: 𝒁 sandwiched between two 𝑯 gates equals 𝑿 .] (2.10)

which showcases that we can sandwich 𝒁 (sign-flip) between two Hadamard
gates 𝑯 (enter/leave superposition) to execute a logical bit-flip 𝑿 . This is a
very strong case for the nontrivial behavior of a sign-flip gate (𝒁 ≠ 𝕀).

2.3 Rigorous formalism: matrix-vector multiplication


We have now completed a first look at how quantum circuits work. We have seen
that they can implement standard logic (e.g. identity and bit-flip), but also have
new and seemingly mysterious features. The apparent randomness generation
via Hadamard which can be completely undone via another Hadamard comes
to mind here. We now present a mathematical formalism that can be used
to reproduce all these new quantum features. It equips the intuition we
gained so far with a rigorous underpinning. This is essential when it comes to
exploring the realm of quantum computing and is one of the most transformative
developments within quantum science. Over the past 30 years or so, the field
has moved away from complicated physical equations and gradually developed
a succinct and finite-dimensional alternative that nonetheless captures most
interesting quantum effects. In fact, basic matrix-vector multiplication is enough
to keep track of any QPU that is comprised of (finitely many) qubits. Today,
we present these rules for the special case of a single qubit. The central object
used to describe a single-qubit QPU is the current state of the qubit involved.
Definition 2.1 (single-qubit state vector). The state of a single qubit keeps track of
its quantum logical value. At each point, it is given by a 2-dimensional vector
\[
|\psi\rangle := \boldsymbol{\psi} = \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} \in \mathbb{C}^2 .
\]
The individual coefficients can be complex-valued numbers, but must obey
\[
\big\| |\psi\rangle \big\|^2 = \|\boldsymbol{\psi}\|^2 = |\psi_0|^2 + |\psi_1|^2 = 1 \qquad \text{(state normalization)}. \tag{2.11}
\]

We use the somewhat strange notation |𝜓 ⟩ to denote state vectors. This


is called a ket and features prominently in the quantum computing literature.
The following example tells us how we can imprint a classical bit 𝑏 ∈ {0, 1}
into the initial state of a qubit.
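Example 2.2 (Qubit initialization). Consider the following two state vectors:
\[
|0\rangle := \boldsymbol{e}_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad \text{and} \quad
|1\rangle := \boldsymbol{e}_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]
We interpret the first state vector | 0⟩ as ‘everything is concentrated at 0 (first
entry)’, while | 1⟩ means that ‘everything is concentrated at 1’. Moreover, both
state vectors obey the normalization condition (2.11):
\[
\begin{aligned}
\big\| |0\rangle \big\|^2 &= \|\boldsymbol{e}_0\|^2 = |1|^2 + |0|^2 = 1, \\
\big\| |1\rangle \big\|^2 &= \|\boldsymbol{e}_1\|^2 = |0|^2 + |1|^2 = 1.
\end{aligned}
\]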
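
Qubit initialization and the normalization condition (2.11) translate directly into code. The sketch below represents a state vector as a plain Python list of two complex amplitudes; the function names `init_qubit` and `squared_norm` are assumptions made for illustration.

```python
def init_qubit(b):
    """Return the state vector |b> as a list of two complex amplitudes."""
    if b not in (0, 1):
        raise ValueError("input bit must be 0 or 1")
    return [1 + 0j, 0j] if b == 0 else [0j, 1 + 0j]


def squared_norm(psi):
    """Squared Euclidean norm |psi_0|^2 + |psi_1|^2 of a state vector."""
    return sum(abs(amplitude) ** 2 for amplitude in psi)


for b in (0, 1):
    psi = init_qubit(b)
    print(f"|{b}> = {psi}, squared norm = {squared_norm(psi)}")
```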

As Definition 2.1 suggests, state vectors can be used to keep track of qubits
throughout a sequence of quantum logical gates. This, however, necessitates a
formalism to unambiguously characterize the action of quantum gates. And a
way to imprint their action onto the current state vector of a qubit.
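Definition 2.3 (single-qubit gate action). A single-qubit gate is fully described by
a 2 × 2 matrix
\[
\boldsymbol{U} = \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix} \in \mathbb{C}^{2 \times 2} .
\]
The action of gate 𝑼 on quantum state |𝜓 ⟩ = 𝝍 ∈ ℂ² is captured by matrix-
vector multiplication:
\[
|\psi_{\mathrm{final}}\rangle = \boldsymbol{\psi}_{\mathrm{final}} = \boldsymbol{U}\boldsymbol{\psi}
= \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix}
  \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix}
= \begin{pmatrix} U_{0,0}\psi_0 + U_{0,1}\psi_1 \\ U_{1,0}\psi_0 + U_{1,1}\psi_1 \end{pmatrix} \in \mathbb{C}^2 .
\]
What is more, each gate matrix 𝑼 must be unitary and, therefore, reversible:
\[
\boldsymbol{U}^\dagger \boldsymbol{U} = \boldsymbol{U}\boldsymbol{U}^\dagger
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\quad \text{where} \quad
\begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix}^\dagger
= \begin{pmatrix} \bar{U}_{0,0} & \bar{U}_{1,0} \\ \bar{U}_{0,1} & \bar{U}_{1,1} \end{pmatrix} \tag{2.12}
\]
denotes the matrix adjoint (transposition plus taking complex conjugates¹).
¹ Recall that the complex conjugate of a complex number 𝑧 = 𝑎 + i𝑏 is defined as 𝑧¯ = 𝑎 − i𝑏 .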
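
The unitarity condition (2.12) can be tested numerically for any 2 × 2 matrix. The sketch below is one possible way to do this with nested lists; the helper names and the tolerance `tol` are assumptions, and the matrix `RESET_TO_0` (the "reset to 0" function written as a matrix) is included only as an illustrative non-unitary counterexample.

```python
import math


def adjoint(U):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[complex(U[c][r]).conjugate() for c in range(2)] for r in range(2)]


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def is_unitary(U, tol=1e-12):
    """Check U^dagger U = U U^dagger = identity up to a small tolerance."""
    identity = [[1, 0], [0, 1]]
    for product in (matmul(adjoint(U), U), matmul(U, adjoint(U))):
        for r in range(2):
            for c in range(2):
                if abs(product[r][c] - identity[r][c]) > tol:
                    return False
    return True


s = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]           # bit-flip
H = [[s, s], [s, -s]]          # Hadamard
RESET_TO_0 = [[1, 1], [0, 0]]  # maps both |0> and |1> to |0>: not a valid gate

for name, gate in (("X", X), ("H", H), ("reset to 0", RESET_TO_0)):
    print(f"{name}: unitary = {is_unitary(gate)}")
```

The tolerance is needed because 1/√2 is only represented approximately in floating-point arithmetic.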
The following instructive example showcases how this matrix-vector multi-


plication formalism allows us to recover classical logical operations.
Example 2.4 (matrix representation of classical gates). The matrix representations
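of the classical operations identity (𝕀) and bit-flip (𝑿 ) are
\[
\mathbb{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad
\boldsymbol{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
It is easy to check that both matrices are unitary matrices. What is more,
\[
\begin{aligned}
\mathbb{I}|0\rangle &= \mathbb{I}\boldsymbol{e}_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \boldsymbol{e}_0 = |0\rangle, \\
\mathbb{I}|1\rangle &= \mathbb{I}\boldsymbol{e}_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \boldsymbol{e}_1 = |1\rangle,
\end{aligned}
\]
which puts ‘do nothing’ into concrete formulas. Likewise
\[
\begin{aligned}
\boldsymbol{X}|0\rangle &= \boldsymbol{X}\boldsymbol{e}_0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \boldsymbol{e}_1 = |1\rangle, \\
\boldsymbol{X}|1\rangle &= \boldsymbol{X}\boldsymbol{e}_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \boldsymbol{e}_0 = |0\rangle,
\end{aligned}
\]
puts formulas to the action of a bit-flip. It is not a coincidence that these
matrix representations are in one-to-one correspondence to the truth tables of
the corresponding logical functionalities. Matrix-vector multiplication is just
another way of reading logical truth tables. ■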
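
The computations of Example 2.4 amount to a two-line matrix-vector product. The following sketch reproduces them; `apply_gate` and the constant names are hypothetical and chosen here for illustration only.

```python
def apply_gate(U, psi):
    """Matrix-vector product U @ psi for a 2x2 gate and a 2-entry state vector."""
    return [
        U[0][0] * psi[0] + U[0][1] * psi[1],
        U[1][0] * psi[0] + U[1][1] * psi[1],
    ]


IDENTITY = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
KET0 = [1, 0]
KET1 = [0, 1]

for name, gate in (("I", IDENTITY), ("X", X)):
    for label, ket in (("|0>", KET0), ("|1>", KET1)):
        print(f"{name}{label} = {apply_gate(gate, ket)}")
```

Reading off the results column by column recovers the truth tables (2.1) and (2.3).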

Exercise 2.5 (matrix representation of the Hadamard gate). The matrix representa-
tion of the Hadamard gate is given as
\[
\boldsymbol{H} = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \in \mathbb{C}^{2 \times 2} .
\]

1 Show that this matrix is unitary by verifying Eq. (2.12) for 𝑼 = 𝑯 .
2 The action of a Hadamard gate maps deterministic bit states | 0⟩ and
| 1⟩ into uniform superposition states |+⟩ and |−⟩ . Use matrix-vector
multiplication to verify the following state vector representations:
\[
|+\rangle = \boldsymbol{H}|0\rangle = \boldsymbol{H}\boldsymbol{e}_0 = \tfrac{1}{\sqrt{2}}(\boldsymbol{e}_0 + \boldsymbol{e}_1) = \tfrac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \tag{2.13}
\]
\[
|-\rangle = \boldsymbol{H}|1\rangle = \boldsymbol{H}\boldsymbol{e}_1 = \tfrac{1}{\sqrt{2}}(\boldsymbol{e}_0 - \boldsymbol{e}_1) = \tfrac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right). \tag{2.14}
\]
Both expressions on the very right should be interpreted as ‘both 0 and
1 in equal measure’. Note, however, that one combination has a ‘+’
in between, while the other one has a ‘−’. This sign difference ensures
that both superpositions remain distinct and can be undone again.
3 Verify reversibility by showing 𝑯² = 𝕀 via matrix-vector multiplication.


Definition 2.1 (state vector) and Definition 2.3 (gate action) provide us with
all the rules we need to keep track of a state vector throughout an arbitrarily
long sequence of single-qubit gates. The following proposition can be derived
from these rules and allows for compressing multiple gate actions into a single
matrix.
Proposition 2.6 (sequential gate composition rule). Let 𝑼 ,𝑽 ∈ ℂ^{2×2} be matrix
representations of two single-qubit gates. Then, the total action of sequentially
applying first 𝑼 and then 𝑽 is captured by the matrix-matrix product of the two
gate representations involved:
\[
\boldsymbol{U}_{\mathrm{tot}} = \boldsymbol{V} \times \boldsymbol{U}
= \begin{pmatrix} V_{0,0} & V_{0,1} \\ V_{1,0} & V_{1,1} \end{pmatrix}
  \times
  \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix}
= \begin{pmatrix}
    V_{0,0}U_{0,0} + V_{0,1}U_{1,0} & V_{0,0}U_{0,1} + V_{0,1}U_{1,1} \\
    V_{1,0}U_{0,0} + V_{1,1}U_{1,0} & V_{1,0}U_{0,1} + V_{1,1}U_{1,1}
  \end{pmatrix}.
\]
Note that the gate applied first (𝑼 ) stands on the right of the product. This
composition rule with the matrix product straightforwardly extends to 𝑁
sequential gate applications: \(\boldsymbol{U}_{\mathrm{tot}} = \boldsymbol{U}_N \times \boldsymbol{U}_{N-1} \times \cdots \times \boldsymbol{U}_2 \times \boldsymbol{U}_1\).
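
The composition rule can be sketched in a few lines of code. In the following illustration, `compose` takes gates in the order in which they are applied and multiplies each new gate from the left, which mirrors the right-to-left ordering of the matrix product. All function names and the tolerance in `close` are assumptions made for this sketch.

```python
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def compose(*gates):
    """Total matrix U_N x ... x U_2 x U_1 for gates passed in order of application."""
    total = [[1, 0], [0, 1]]
    for gate in gates:
        total = matmul(gate, total)  # a later gate multiplies from the left
    return total


def close(A, B, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices up to a small tolerance."""
    return all(abs(A[r][c] - B[r][c]) <= tol for r in range(2) for c in range(2))


s = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
H = [[s, s], [s, -s]]

print("H, Z, H composes to X:", close(compose(H, Z, H), X))
print("order matters:", not close(compose(X, H), compose(H, X)))
```

The first check reproduces Eq. (2.10); the second shows that applying first 𝑿 and then 𝑯 differs from applying first 𝑯 and then 𝑿, so the order of the factors cannot be swapped in general.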
We leave the proof as an instructive exercise and instead want to draw
attention to the way we write down and read large matrix-vector products.
Suppose that we sequentially apply 𝑁 single-qubit gates 𝑼 1 , . . . , 𝑼 𝑁 to an
arbitrary starting state (vector) |𝜓 ⟩ = 𝝍 ∈ ℂ2 . Then, we can compute the
final state vector as

|𝜓final ⟩ = 𝝍 final = 𝑼 𝑁 × · · · × 𝑼 2 × 𝑼 1𝝍 ∈ ℂ2 .

In words: we start our matrix-vector multiplication on the very right and keep
going. The convention to read circuit diagrams from right to left exactly
resembles this ordering:

[Circuit diagram, read from right to left: state |𝜓 ⟩, followed by the gates 𝑼 1 , 𝑼 2 , . . . , 𝑼 𝑁 .] (2.15)

We now have all the pieces in place to initialize a qubit and keep track of its
state throughout a sequence of arbitrarily many single-qubit gates. All that is
missing now is a formula for executing the readout at the very end. That is, we
need to assign meaning to the following operation:

[Circuit diagram: readout (measurement) of a qubit in state |𝜓 ⟩, producing an output bit 𝑜 .]
Definition 2.7 (single-qubit readout). A single-qubit readout (measurement)
operation always produces a valid bit value 𝑜 ∈ {0, 1}. But it does so probabilistically.
The probability of obtaining outcome 𝑜 ∈ {0, 1} depends on the underlying
state vector |𝜓 ⟩ = 𝝍 ∈ ℂ²:
\[
p_0 = \Pr\nolimits_{|\psi\rangle}[o = 0] = |\langle 0|\psi\rangle|^2 = \big|\boldsymbol{e}_0^\dagger \boldsymbol{\psi}\big|^2 = |\psi_0|^2 \geq 0, \tag{2.16}
\]
\[
p_1 = \Pr\nolimits_{|\psi\rangle}[o = 1] = |\langle 1|\psi\rangle|^2 = \big|\boldsymbol{e}_1^\dagger \boldsymbol{\psi}\big|^2 = |\psi_1|^2 \geq 0. \tag{2.17}
\]
This should be read as: ‘the probability of obtaining outcome 𝑜 = 0 (𝑜 = 1)
when reading out a qubit in state |𝜓 ⟩ is |𝜓0 |² ( |𝜓1 |² )’.
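Here, we use another bit of quantum notation: we write ⟨0 | ( ⟨1 | ) to denote
the dual/adjoint state vector of | 0⟩ = 𝒆 0 ( | 1⟩ = 𝒆 1 ):
\[
\langle 0| = \boldsymbol{e}_0^\dagger = \begin{pmatrix} 1 & 0 \end{pmatrix}
\quad \text{and} \quad
\langle 1| = \boldsymbol{e}_1^\dagger = \begin{pmatrix} 0 & 1 \end{pmatrix}.
\]
This is called a bra and plays nicely with the ket we introduced earlier. Indeed,
combining a bra (row vector) with a ket (column vector) produces a number.
Such bra-ket expressions compute an inner product. For instance,
\[
\begin{aligned}
\langle 0|0\rangle &= \boldsymbol{e}_0^\dagger \boldsymbol{e}_0 = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 1 \times 1 + 0 \times 0 = 1, \\
\langle 0|1\rangle &= \boldsymbol{e}_0^\dagger \boldsymbol{e}_1 = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = 1 \times 0 + 0 \times 1 = 0, \\
\langle 1|0\rangle &= \boldsymbol{e}_1^\dagger \boldsymbol{e}_0 = \begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 0 \times 1 + 1 \times 0 = 0, \\
\langle 1|1\rangle &= \boldsymbol{e}_1^\dagger \boldsymbol{e}_1 = \begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = 0 \times 0 + 1 \times 1 = 1.
\end{aligned}
\]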
An extension to other bra and ket vectors is now straightforward.
Note that normalization of the state vector (Eq. (2.11) in Definition 2.1)
ensures that 𝑝 0 and 𝑝 1 defined in Eqs. (2.16), (2.17) define a valid binary
probability distribution (think: coin toss): 𝑝 0 , 𝑝 1 ≥ 0 and
\[
p_0 + p_1 = |\psi_0|^2 + |\psi_1|^2 = \|\boldsymbol{\psi}\|^2 = \big\| |\psi\rangle \big\|^2 = 1.
\]
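
The readout rule of Definition 2.7 can be sketched as follows. The state `[0.6, 0.8j]` is an arbitrary normalized example chosen here (it does not appear in the lecture), and Python's pseudo-random generator merely stands in for the true randomness of a physical readout.

```python
import random


def readout_probabilities(psi):
    """Return (p0, p1): the squared magnitudes of the two state vector entries."""
    return abs(psi[0]) ** 2, abs(psi[1]) ** 2


def readout(psi, rng=random):
    """Sample an output bit o with Pr[o = 0] = |psi_0|^2 (pseudo-random stand-in)."""
    p0, _ = readout_probabilities(psi)
    return 0 if rng.random() < p0 else 1


psi = [0.6, 0.8j]  # illustrative state with a complex amplitude
p0, p1 = readout_probabilities(psi)
print(f"p0 = {p0:.2f}, p1 = {p1:.2f}, sum = {p0 + p1:.2f}")
print("sampled output bit:", readout(psi))
```

For the initialization states | 0⟩ and | 1⟩ the sampled bit is deterministic, which matches the perfect storage property of Eq. (2.2).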
We now have everything in place to reproduce the actual workings of all
example circuits so far. We’ll do one concrete example and leave the rest as
very instructive exercises.
Example 2.8 (quantum random number generator). Consider the quantum circuit
from Eq. (2.6), i.e.:

[Circuit diagram: qubit initialized with input bit 𝑏 , Hadamard gate 𝑯 , readout of output bit 𝑜 .]
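Let us do the computation for 𝑏 = 1 (the case for 𝑏 = 0 is similar). We start by
using matrix-vector multiplication to compute the final state vector (blue):
\[
|\psi\rangle = \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} = \boldsymbol{H}|1\rangle
= \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}
  \begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}.
\]
Hence, \(\psi_0 = 1/\sqrt{2}\), \(\psi_1 = -1/\sqrt{2}\) and perfect randomness generation follows
from invoking the rule for single-qubit readout (Definition 2.7):
\[
\begin{aligned}
\Pr\nolimits_{|\psi\rangle}[o = 0] &= |\langle 0|\psi\rangle|^2 = |\psi_0|^2 = \big|1/\sqrt{2}\big|^2 = 1/2, \\
\Pr\nolimits_{|\psi\rangle}[o = 1] &= |\langle 1|\psi\rangle|^2 = |\psi_1|^2 = \big|-1/\sqrt{2}\big|^2 = 1/2.
\end{aligned}
\]
This is just the definition of a perfect random bit \(o \overset{\mathrm{unif}}{\sim} \{0, 1\}\). ■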
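
A classical simulation of this circuit can only imitate the readout with a pseudo-random generator, but it does reproduce the predicted statistics. The sketch below runs the circuit many times and reports the observed frequency of the outcome 𝑜 = 1; the function names, the number of shots and the seed are arbitrary choices made for illustration.

```python
import math
import random


def run_hadamard_circuit(b, rng):
    """Initialize |b>, apply H, and sample one readout bit."""
    s = 1 / math.sqrt(2)
    H = [[s, s], [s, -s]]
    psi = [1.0, 0.0] if b == 0 else [0.0, 1.0]
    psi = [H[0][0] * psi[0] + H[0][1] * psi[1],
           H[1][0] * psi[0] + H[1][1] * psi[1]]
    p0 = abs(psi[0]) ** 2
    return 0 if rng.random() < p0 else 1


def frequency_of_one(b, shots=10_000, seed=2024):
    """Fraction of runs with output bit 1 (seeded for reproducibility)."""
    rng = random.Random(seed)
    return sum(run_hadamard_circuit(b, rng) for _ in range(shots)) / shots


for b in (0, 1):
    print(f"input b = {b}: observed frequency of o = 1 is {frequency_of_one(b):.3f}")
```

Both input bits should yield a frequency close to 1/2, in line with the calculation above.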

2.4 Application: the BB84 quantum key distribution


We now have gathered enough insights about single-qubit circuits to discuss
our first quantum application: the BB84 protocol for quantum key distribution
(QKD) [BB14]. The main goal of key distribution is to distribute a uniformly random
seed among two parties – Alice and Bob – such that it remains private, i.e.
information about the secret bit string is not available to any third party (Eve).
The Q in QKD indicates that this is achieved by using quantum computing
features. The twist is that the rules of quantum computing allow for detecting
whether an eavesdropper might tamper with the connection between Alice and
Bob. If this happens, then Alice and Bob can abort the protocol and throw away
the random key generated so far. This does not protect private randomness,
but allows for detecting an attack and aborting – the next best thing. This is a
striking advantage over conventional key distribution protocols where ‘person
in the middle attacks’ cannot be detected on the fundamental level.
The setup for the BB84 key exchange protocol is depicted in Fig. 2.3. At the
core is a classical identity channel, like the one presented in Eq. (2.2). Alice
(right) uses private randomness to sample a random bit 𝑟 , imprints this into a
qubit and sends this qubit to Bob. Bob can perform the readout and perfectly
recovers this bit. Here, it is also useful to think in terms of photons as carriers
of quantum information: photons can cover a lot of distance in short time and
are ideally suited for this type of information transmission protocol. But, a
mere identity channel is not secure. In particular, it does not allow to detect
actions of a potential eavesdropper in the middle. This is where the green
quantum gate boxes come into play. They are a placeholder for one of two
possible actions: (i) do nothing (𝕀) or (ii) enter/leave superposition (𝑯 ). Alice
and Bob toss a private random coin (\(a, b \overset{\mathrm{unif}}{\sim} \{0, 1\}\)) to decide which action
they apply. This gives rise to four potential quantum circuits that each occur
Figure 2.3 Schematic illustration of the BB84-protocol: Alice and Bob use a
single-qubit circuit to communicate a single bit of information (purple). Each
player obfuscates this identity channel by either applying 𝕀 (do nothing) or 𝑯
(enter/leave superposition). If both players happen to apply the same gate, the
transmission is perfect (𝑠 = 𝑟 ). Otherwise, the output bit is completely random
(\(s \overset{\mathrm{unif}}{\sim} \{0, 1\}\)). This dichotomy is impossible to achieve classically and allows
for sharing a key and detecting person in the middle attacks.

with probability 1/4:

[Circuit diagrams: the four combinations of Alice’s and Bob’s gates, each chosen from {𝕀, 𝑯 }.]

Note that whenever 𝑎 = 𝑏 , the effective classical-quantum-classical channel
transmits the original bit perfectly, i.e. 𝑠 = 𝑟 . If instead 𝑎 ≠ 𝑏 , then a uniformly
random bit is produced, i.e. \(s \overset{\mathrm{unif}}{\sim} \{0, 1\}\). Such effective classical-quantum-
classical channels are completely useless, because they erase all information
about the initial bit. This is an interesting and nontrivial dichotomy: whenever
Bob’s gate action cancels Alice’s obfuscation, he can recover her bit perfectly.
Otherwise, he effectively destroys Alice’s (qu)bit and produces a random bit
instead.
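
This dichotomy can be verified exactly by computing Bob's output distribution for all combinations of 𝑟 , 𝑎 and 𝑏 . The sketch below assumes the encoding "coin value 0 means 𝕀, coin value 1 means 𝑯 "; this encoding, like the function names, is a choice made here for illustration.

```python
import math

S = 1 / math.sqrt(2)
# Assumed encoding of the private coins: 0 -> identity, 1 -> Hadamard.
GATES = {0: [[1, 0], [0, 1]], 1: [[S, S], [S, -S]]}


def apply_gate(U, psi):
    """Matrix-vector product for a 2x2 gate and a 2-entry state vector."""
    return [U[0][0] * psi[0] + U[0][1] * psi[1],
            U[1][0] * psi[0] + U[1][1] * psi[1]]


def output_distribution(r, a, b):
    """Probabilities (Pr[s = 0], Pr[s = 1]) when Alice sends bit r without interception."""
    psi = [1.0, 0.0] if r == 0 else [0.0, 1.0]
    psi = apply_gate(GATES[a], psi)  # Alice's obfuscation gate
    psi = apply_gate(GATES[b], psi)  # Bob's gate before readout
    return abs(psi[0]) ** 2, abs(psi[1]) ** 2


for a in (0, 1):
    for b in (0, 1):
        for r in (0, 1):
            p0, p1 = output_distribution(r, a, b)
            print(f"a={a}, b={b}, r={r}: Pr[s=0]={p0:.2f}, Pr[s=1]={p1:.2f}")
```

Whenever 𝑎 = 𝑏 the distribution is concentrated on 𝑠 = 𝑟 ; whenever 𝑎 ≠ 𝑏 both outcomes have probability 1/2, independent of 𝑟 .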
Why is this noteworthy? Well, this dichotomy also affects any potential
eavesdropper who intercepts the qubit somewhere in the middle, see Fig. 2.4
for an illustration. Let us call this eavesdropper Eve. She intercepts Alice’s
qubit, performs a readout and then initializes a new qubit which she transmits
on to Bob. Two situations can arise:

(i) Eve’s success: Eve guesses Alice’s obfuscation gate 𝑼 𝑎 correctly. Then,
she can perfectly recover Alice’s bit (𝑡 = 𝑟 ) and can re-initialize the qubit
to exactly the correct quantum logical value. In so doing, she hides her
eavesdropping activity from Bob.
(ii) Eve’s failure: Eve doesn’t guess Alice’s obfuscation gate 𝑼 𝑎 correctly.
Then, she destroys the underlying bit message, gets a completely random
bit (\(t \overset{\mathrm{unif}}{\sim} \{0, 1\}\)) and re-initializes the qubit with an uncorrelated, random
bit value.

Figure 2.4 Illustration of an eavesdropper attack on the BB84 protocol: a third
party (Eve, red) intercepts the qubit in the middle. She performs qubit readout
to obtain her own bit 𝑡 , before re-initializing the qubit to send it on to Bob. In
order to maximize her chances of learning 𝑟 and to obfuscate her attack, she
can also apply two quantum gates 𝑽 and 𝑽 ′ that resemble the gate actions of
Alice and Bob.

And here is where things get interesting. If Alice uses private randomness
to decide her obfuscation action (i.e. \(a \overset{\mathrm{unif}}{\sim} \{0, 1\}\)), then Eve has no way to
anticipate her move. And, in turn, she must fail about half of the time. Crucially,
whenever she guesses wrong, she forwards a random bit value to Bob. And
this interference can now be detected! After all, Alice and Bob expect to get
perfectly correlated bits whenever they happen to execute the same obfuscation
gate (which happens in about 1/2 of all cases): 𝑟 = 𝑠 . An interception of
Eve in the middle must break this perfect correlation. And repeating the
protocol sufficiently often (with new private randomness in each go) will allow
Alice+Bob to detect eavesdropping by comparing some of their input/output
values over a public channel. More precisely, they execute 𝑇 ≫ 1 rounds
of their protocol and, after completion, they broadcast their private random
bitstrings 𝒂, 𝒃 ∈ {0, 1}^𝑇 . This allows them to discard all uncorrelated bits
and focus on the instances where perfect correlations should happen: 𝑎𝑡 = 𝑏𝑡
should imply 𝑠𝑡 = 𝑟𝑡 . This already shrinks the possible bits to roughly 𝑇 /2.
They then select certain instances within this remaining string to check whether
𝑠𝑡 = 𝑟𝑡 is actually true. This is again achieved via public communication over
insecure channels. An instance where 𝑠𝑡 ≠ 𝑟𝑡 raises suspicion. Assuming the
hardware works perfectly as intended (which is a very strong assumption), this
can only be the signature of an eavesdropper!
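
The detection mechanism can be illustrated with a small Monte Carlo simulation. The sketch below makes several simplifying assumptions that go beyond the text: Eve picks a single random gate per round and uses it both before her readout and after re-initialization (𝑽 ′ = 𝑽 ), the hardware is noiseless, coin value 0 encodes 𝕀 and 1 encodes 𝑯 , and all function names are hypothetical. It estimates how often 𝑠 ≠ 𝑟 occurs among the rounds that Alice and Bob keep (𝑎 = 𝑏 ).

```python
import math
import random

S = 1 / math.sqrt(2)
GATES = {0: [[1, 0], [0, 1]], 1: [[S, S], [S, -S]]}  # assumed: 0 -> I, 1 -> H


def apply_gate(U, psi):
    """Matrix-vector product for a 2x2 gate and a 2-entry state vector."""
    return [U[0][0] * psi[0] + U[0][1] * psi[1],
            U[1][0] * psi[0] + U[1][1] * psi[1]]


def readout(psi, rng):
    """Sample one output bit with Pr[0] = |psi_0|^2."""
    return 0 if rng.random() < abs(psi[0]) ** 2 else 1


def bb84_round(rng, eve_present):
    """One protocol round; returns (r, a, b, s)."""
    r, a, b = rng.randrange(2), rng.randrange(2), rng.randrange(2)
    psi = apply_gate(GATES[a], [1.0, 0.0] if r == 0 else [0.0, 1.0])
    if eve_present:
        v = rng.randrange(2)                       # Eve's guess of Alice's gate
        t = readout(apply_gate(GATES[v], psi), rng)
        fresh = [1.0, 0.0] if t == 0 else [0.0, 1.0]
        psi = apply_gate(GATES[v], fresh)          # assumption: V' = V
    s = readout(apply_gate(GATES[b], psi), rng)
    return r, a, b, s


def sifted_error_rate(rounds, eve_present, seed=0):
    """Fraction of kept rounds (a == b) in which Bob's bit differs from Alice's."""
    rng = random.Random(seed)
    kept = errors = 0
    for _ in range(rounds):
        r, a, b, s = bb84_round(rng, eve_present)
        if a == b:
            kept += 1
            errors += int(s != r)
    return errors / kept


print("error rate without Eve:", sifted_error_rate(20_000, eve_present=False))
print("error rate with Eve:   ", sifted_error_rate(20_000, eve_present=True))
```

Under these assumptions, the error rate on the kept rounds is zero without Eve and roughly one quarter with Eve: she guesses wrong in about half of the kept rounds, and each wrong guess gives Bob a uniformly random bit. Comparing even a modest number of kept bits over a public channel therefore exposes the attack with high probability.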
This ability to detect the actions of an eavesdropper is a core feature of
quantum cryptography protocols. Here, we have only scratched the surface and
discussed the key ideas behind one of the oldest and most basic protocols of this
type. However, at this point, you have all the information you need to execute
a proper analysis of the entire protocol – under the additional assumption
that Eve intercepts in the middle via readout plus initialization and only uses
single-qubit gates that mimic the actions of Alice and Bob.
Exercise 2.9 (complete analysis of the BB84-protocol). Perform a rigorous treat-
ment of the BB84-protocol under the additional assumption that Eve’s gates are
𝑽 ,𝑽 ′ ∈ {𝕀, 𝑯 }. This results in a total of 2 × 4 × 2 = 16 different incarnations
of the intercepted protocol – one for each choice of 𝑎 (Alice), as well as 𝑽 ,𝑽 ′
(Eve) and 𝑏 (Bob). How many of these incarnations lead to an undetectable and
successful attack, i.e. 𝑠 = 𝑟 , as well as 𝑡 = 𝑟 ? How many of these incarnations
leave traces of Eve’s actions that can be detected by Alice and Bob? And how
many leave Eve completely clueless, i.e. \(t \overset{\mathrm{unif}}{\sim} \{0, 1\}\)?
Use your findings to argue that Eve does not stand a chance if Alice+Bob repeat
the protocol many times and use private randomness to choose 𝑎 and 𝑏 .
It turns out that the aforementioned assumptions on the type of attack can
be removed completely. A proper security analysis for any type of quantum
attack took another 29 years to develop [Tom+13]. Needless to say,
these arguments go beyond the scope of this introductory lecture.

Problems
Problem 2.10 (random number generator). Do the computation in Example 2.8
for input bit 𝑏 = 0.
Problem 2.11 (gate composition rule). Prove Proposition 2.6.
Problem 2.12 (Reversibility of all gates introduced so far). Use the sequential gate
composition rule (Proposition 2.6) to rigorously prove the following circuit
identities:

[Circuit diagram equations not reproduced here; cf. Eqs. (2.5), (2.7) and (2.9).]

This, in particular, ensures that all these gates are reversible and constitute
their own inverses.
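Problem 2.13 (some useful circuit identities). Use the sequential gate composition
rule (Proposition 2.6) to rigorously prove the following circuit identities:

[Circuit diagram equations not reproduced here; per the note below, they relate 𝑿 and 𝒁 via two Hadamard gates, cf. Eq. (2.10).]

Note that this implies that two Hadamard gates convert an 𝑿 -gate (bit-flip) into a
𝒁 -gate (sign-flip) and vice versa. Hint: use the following matrix representations
of all gates involved:
\[
\boldsymbol{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\boldsymbol{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\quad \text{and} \quad
\boldsymbol{H} = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}.
\]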

Problem 2.14 (matrix representation of the Hadamard gate, see Exercise 2.5). The
matrix representation of the Hadamard gate is given as
 √ √ 
1/ 2 1/ 2
𝑯 = √ √ ∈ ℂ2×2 .
1/ 2 −1/ 2

1 Show that this matrix is unitary by verifying Eq. (2.12) for 𝑼 = 𝑯.


25 Lecture 2: Single qubit circuits I

2 The action of a Hadamard gate maps the deterministic bit states | 0⟩ and
| 1⟩ into the uniform superposition states |+⟩ and |−⟩ . Use matrix-vector
multiplication to verify the following state vector representations:

\begin{align*}
|+\rangle &= \boldsymbol{H}|0\rangle = \boldsymbol{H}\boldsymbol{e}_0 = \tfrac{1}{\sqrt{2}}(\boldsymbol{e}_0 + \boldsymbol{e}_1) = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \tag{2.18} \\
|-\rangle &= \boldsymbol{H}|1\rangle = \boldsymbol{H}\boldsymbol{e}_1 = \tfrac{1}{\sqrt{2}}(\boldsymbol{e}_0 - \boldsymbol{e}_1) = \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle). \tag{2.19}
\end{align*}
Both expressions on the very right should be interpreted as ‘both 0 and
1 in equal measure’. Note, however, that one combination has a ‘+’
in between, while the other one has a ‘−’. This sign difference ensures
that both superpositions can be undone again.
3 Verify reversibility by showing 𝑯² = 𝕀 via matrix-matrix multiplication.

Problem 2.15 (complete analysis of the BB84-protocol, see Exercise 2.9). Perform a
rigorous treatment of the BB84-protocol under the additional assumption that
Eve’s gates are 𝑽 ,𝑽 ′ ∈ {𝕀, 𝑯 }. This results in a total of 2 × 4 × 2 = 16 different
incarnations of the intercepted protocol – one for each choice of 𝑎 (Alice), as
well as 𝑽 ,𝑽 ′ (Eve) and 𝑏 (Bob). How many of these incarnations lead to an
undetectable and successful attack, i.e. 𝑠 = 𝑟 , as well as 𝑡 = 𝑟 ? How many of
these incarnations leave traces of Eve’s actions that can be detected by Alice
and Bob? And how many leave Eve completely clueless, i.e. 𝑡 is uniformly
distributed, $t \overset{\text{unif}}{\sim} \{0, 1\}$?
Use your findings to argue that Eve does not stand a chance if Alice+Bob repeat
the protocol many times and use private randomness to choose 𝑎 and 𝑏 .
3. Single qubit circuits II

Date: October 16, 2024

Agenda:
1 complex numbers
2 ultimate limits of single-qubit logic
3 list of prominent gates
4 application: restricted parity computations

3.1 Motivation and outline

Last week (Lecture 2), we started to explore single-qubit quantum logic. We saw
that quantum gates allow a native execution of completely new functionalities.
Today, we continue along these lines and push single-qubit quantum logic to
the ultimate limits of its capabilities. In a certain sense, quantum logic is
‘infinitely more expressive’ than conventional single-bit logic. As a teaser, we
point out that the negation gate 𝑿 is the only non-trivial reversible single-bit
gate. But, because 𝑿² = 𝕀 (do nothing), the list of all reversible single-bit
circuits is a very short one:

(3.1)

are the only two options.


In quantum logic, the situation could not be more different. We will see
that two (appropriately chosen) elementary quantum gates suffice to generate
an infinite amount of single-qubit circuits that have distinct logical functionality.
Every 2 × 2 unitary matrix is valid in quantum logic:

And, what is more, we can approximate it with only logarithmically many


elementary quantum gates (in the desired approximation accuracy). This
powerful circuit synthesis result is known as the Solovay-Kitaev theorem. En
route to this surprising result, we will need to remember the basic structure
and elementary properties of complex numbers. So, today is a good occasion
to review them.

3.2 Excursion: complex numbers


The field of complex numbers ℂ is an extension of the real numbers ℝ. Arguably,
the most characteristic feature of complex numbers is the imaginary unit
$\mathrm{i} = \sqrt{-1}$. It was first introduced to formally solve polynomial equations, most
notably 𝑥² = −1 (which cannot have a solution over real numbers only). Today,
we know that the solutions of every polynomial equation can be expressed as
complex numbers, i.e. combinations of purely real and imaginary numbers:

𝑧 = 𝑎 + i𝑏 with 𝑎, 𝑏 ∈ ℝ.

This deep result is known as the fundamental theorem of algebra. The collection
of all such 𝑧 ’s forms the field of complex numbers ℂ. Much like normal numbers,
we can add complex numbers using familiar rules:

\[
z + z' = (a + \mathrm{i}b) + (a' + \mathrm{i}b') = (a + a') + \mathrm{i}(b + b'),
\]
i.e. we add real and imaginary parts of complex numbers separately. Multiplication
takes a bit more work and we must also use the formal definition i² = −1
to obtain

\begin{align*}
z \times z' &= (a + \mathrm{i}b) \times (a' + \mathrm{i}b') = a \times a' + \mathrm{i}\,b \times a' + \mathrm{i}\,a \times b' + \mathrm{i}^2\, b \times b' \\
&= (a \times a' - b \times b') + \mathrm{i}\,(a \times b' + b \times a'). \tag{3.2}
\end{align*}

This is a bit cumbersome in the real+imaginary part representation. We will
soon discuss another representation which makes multiplication much easier.
For now, we point out that complex numbers come with a new operation called
complex conjugation:
\[
\bar{z} = \overline{(a + \mathrm{i}b)} = a - \mathrm{i}b.
\]
This operation flips the sign of the imaginary part. It is
trivial for numbers that are real-valued to begin with: $\bar{a} = a$ for all 𝑎 ∈ ℝ.
Complex conjugation allows us to define the absolute value of a complex number:
\[
|z| = \sqrt{\bar{z} \times z} = \sqrt{(a - \mathrm{i}b) \times (a + \mathrm{i}b)} = \sqrt{a^2 + b^2}.
\]
This looks like the (Euclidean) length of a 2-dimensional (real-valued) vector
𝒛 ∈ ℝ². Such an analogy between complex numbers 𝑧 ∈ ℂ and 2D vectors
𝒛 ∈ ℝ² is no coincidence. We can obtain a lot of geometric intuition by
envisioning
\[
z = a + \mathrm{i}b \quad \text{as} \quad \boldsymbol{z} = \begin{pmatrix} a \\ b \end{pmatrix}.
\]

Figure 3.1 Complex unit circle (left) and Euler’s formula (right): (Left) It is
instructive to view the field of complex numbers as a 2-dimensional plane
(real+imaginary part). The unit circle within this plane contains complex
numbers 𝑧 ∈ ℂ with absolute value |𝑧 | = 1. These are called complex
phases and can be parametrized by a single angle 𝜑 . (Right) Euler’s theorem
provides the justification for our notation of complex phases: exp ( i 𝜑 ) =
cos (𝜑 ) + i sin ( 𝜑 ) .

The 𝑥 -coordinate tabulates the purely real part (ℝ), while the 𝑦 -coordinate
tabulates the purely imaginary part (iℝ). General complex numbers have
nontrivial coordinates in both and therefore live in a complex plane. Prominent
real and imaginary numbers obey |𝑧 | = 1, e.g. | + 1 | = | − 1 | = 1 and also
| + i | = | − i | = 1. These four points therefore live on the unit circle within the
complex plane. We refer to Fig. 3.1 (left) for a visual illustration.
Complex numbers 𝑧 ∈ ℂ that obey $|z| = \sqrt{\bar{z}z} = 1$ are called (complex)
phases. These are points on the complex unit circle and can be conveniently
represented by a single angle 𝜑 ∈ [ 0, 2𝜋) in radians¹:
\[
\boldsymbol{z} = \begin{pmatrix} \cos(\varphi) \\ \sin(\varphi) \end{pmatrix} \in \mathbb{R}^2
\quad \Leftrightarrow \quad
z = \cos(\varphi) + \mathrm{i}\sin(\varphi) \in \mathbb{C}.
\]
Finally, we can use Euler’s theorem to compactly express the right-hand repre-
sentation as a single exponential function:
exp ( i 𝜑 ) = cos ( 𝜑 ) + i sin (𝜑 ) . (3.3)
The geometric intuition behind this celebrated result is displayed in Fig. 3.1
(right). We leave a rigorous proof as an instructive exercise at the end of this
section.
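The rules of this section can be tried out numerically. The following is a minimal sketch in Python (not part of the original notes); the helper name `multiply` and the concrete numbers are chosen purely for illustration. It implements the multiplication rule (3.2) with real arithmetic, computes a conjugate and an absolute value, and evaluates both sides of Euler's formula (3.3) for one angle.

```python
import cmath
import math


def multiply(z, w):
    """Multiply two complex numbers via Eq. (3.2), using real arithmetic only."""
    a, b = z.real, z.imag
    a_prime, b_prime = w.real, w.imag
    return complex(a * a_prime - b * b_prime, a * b_prime + b * a_prime)


z = complex(3, 4)   # z = 3 + 4i
w = complex(1, -2)  # w = 1 - 2i

product = multiply(z, w)                           # same value as the built-in z * w
conjugate = z.conjugate()                          # flips the sign of the imaginary part
absolute_value = math.sqrt((conjugate * z).real)   # |z| = sqrt(conj(z) * z)

phi = 0.7                                          # an arbitrary angle in radians
euler_lhs = cmath.exp(1j * phi)                    # exp(i * phi)
euler_rhs = complex(math.cos(phi), math.sin(phi))  # cos(phi) + i * sin(phi)
```

A numerical agreement of `euler_lhs` and `euler_rhs` for a single angle is of course only a sanity check and does not replace the proof requested in the exercise.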
Example 3.1 (‘showoff’ formula). The following mathematical formula combines
i, e and 𝜋 in a perfectly correct fashion:
\[
e^{\mathrm{i}\pi} = \exp(\mathrm{i}\pi) = -1.
\]
It describes an angle representation of the ‘west pole’ (−1) on the complex unit
circle (recall that 180◦ ↔ 𝜋 ). ■

¹ Here are the conversion rules for the most important angles: 0° ↔ 0, 90° ↔ 𝜋/2,
180° ↔ 𝜋 , 270° ↔ 3𝜋/2 and 360° ↔ 2𝜋 .

The last thing we need to know about complex phases is that they play
nicely with each other when it comes to multiplication. Let exp ( i 𝜑 ) =
cos (𝜑 ) + i sin (𝜑 ) and exp ( i 𝜑 ′ ) = cos (𝜑 ′ ) + i sin (𝜑 ′ ) be two complex phases.
We can use the multiplication rule (3.2) and trigonometric identities to compute

exp ( i 𝜑 ) × exp ( i 𝜑 ′ ) = ( cos (𝜑 ) + i sin (𝜑 )) × ( cos ( 𝜑 ′ ) + i sin ( 𝜑 ′ ))


= ( cos ( 𝜑 ) cos ( 𝜑 ′ ) − sin (𝜑 ) sin (𝜑 ′ ))
+i ( cos (𝜑 ) sin (𝜑 ′ ) + sin (𝜑 ) cos (𝜑 ′ ))
= cos ( 𝜑 + 𝜑 ′ ) + i sin (𝜑 + 𝜑 ′ )
= exp ( i (𝜑 + 𝜑 ′ )) .

This confirms a famous property of exponential functions in the realm of


complex numbers: multiplication of exponentials is the same as adding the
exponents. The concept of a complex phase and this multiplication rule will be
important in our study of quantum circuits. It deserves a prominent display.

Fact 3.2 (complex phase). A complex number 𝑧 ∈ ℂ is called a complex
phase if $|z| = \sqrt{\bar{z} \times z} = 1$. It can be represented as 𝑧 = exp ( i 𝜑 ) , where
𝜑 ∈ [0, 2𝜋) is a single angle. Multiplication of two complex phases is the
same as adding the corresponding angles, i.e.

\[
z \times z' = \exp(\mathrm{i}\varphi) \times \exp(\mathrm{i}\varphi') = \exp(\mathrm{i}(\varphi + \varphi')).
\]

Let us do a simple example that cycles through the four prominent points on
the complex unit circle:

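\begin{align*}
\mathrm{i} &= \exp(\mathrm{i}\pi/2), \\
\mathrm{i}^2 &= \mathrm{i} \times \mathrm{i} = \exp(\mathrm{i}\pi/2) \times \exp(\mathrm{i}\pi/2) = \exp(\mathrm{i}(\pi/2 + \pi/2)) = \exp(\mathrm{i}\pi) = -1, \\
\mathrm{i}^3 &= \mathrm{i} \times \mathrm{i}^2 = \exp(\mathrm{i}\pi/2) \times \exp(\mathrm{i}\pi) = \exp(\mathrm{i}(\pi/2 + \pi)) = \exp(\mathrm{i}3\pi/2) = -\mathrm{i}, \\
\mathrm{i}^4 &= \mathrm{i} \times \mathrm{i}^3 = \exp(\mathrm{i}\pi/2) \times \exp(\mathrm{i}3\pi/2) = \exp(\mathrm{i}(\pi/2 + 3\pi/2)) = \exp(\mathrm{i}2\pi) = 1.
\end{align*}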
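The multiplication rule from Fact 3.2 and the cycle through the four prominent points can be checked with a few lines of Python. This is an illustrative sketch only; the function name `phase` and the two test angles are assumptions made for this example.

```python
import cmath
import math


def phase(phi):
    """Return the complex phase exp(i * phi) for an angle phi in radians."""
    return cmath.exp(1j * phi)


# Multiplying two phases adds the angles (Fact 3.2).
phi, phi_prime = 1.1, 2.5
product_of_phases = phase(phi) * phase(phi_prime)
phase_of_sum = phase(phi + phi_prime)

# Powers of i correspond to the angles pi/2, pi, 3*pi/2 and 2*pi.
powers_of_i = [phase(k * math.pi / 2) for k in range(1, 5)]
expected = [1j, -1, -1j, 1]
max_deviation = max(abs(p - e) for p, e in zip(powers_of_i, expected))
```

Because floating-point arithmetic is inexact, the comparison is made up to a small tolerance rather than with exact equality.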

Exercise 3.3 (Proof of Euler’s theorem). Prove Eq. (3.3) by using i² = −1 and the
following three Taylor series expansions:
\[
\exp(z) = \sum_{k=0}^{\infty} \frac{z^k}{k!}, \quad
\cos(z) = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!}, \quad
\sin(z) = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!}.
\]

3.3 Ultimate limits of single-qubit logic


3.3.1 Recapitulation
Recall that a single-qubit quantum processor maps bitstrings to bitstrings. An
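initial bit value 𝑏 ∈ {0, 1} is used to initialize the qubit state vector
\[
|\psi\rangle = |0\rangle = \boldsymbol{e}_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad \text{or} \quad
|\psi\rangle = |1\rangle = \boldsymbol{e}_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]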

Figure 3.2 Schematic illustration of a single-qubit processor from Lecture 2 (see also
Fig. 2.2 there): a single-qubit processor takes a single bit 𝑏 ∈ {0, 1} as input
and produces a single-bit output 𝑜 ∈ {0, 1}. The logic in-between is executed
on the quantum level, where different logical operations become available. The
readout stage is also special and can give rise to true randomness.

This qubit state vector is a 2-dimensional vector with complex entries that can
be used to keep track of the quantum logic content when we apply quantum
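gates. The action of each single-qubit gate is described by a complex-valued
matrix $\boldsymbol{U} \in \mathbb{C}^{2 \times 2}$ that is also unitary, i.e. 𝑼 †𝑼 = 𝑼𝑼 † = 𝕀. The action of one
gate matrix 𝑼 on state vector |𝜓 ⟩ = 𝝍 can be computed with matrix-vector
multiplication:
\[
\boldsymbol{\psi}' = \boldsymbol{U}\boldsymbol{\psi}
= \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix}
\begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix}
= \begin{pmatrix} U_{0,0}\psi_0 + U_{0,1}\psi_1 \\ U_{1,0}\psi_0 + U_{1,1}\psi_1 \end{pmatrix}.
\]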

This rule readily extends to the application of 𝑇 ≫ 1 different gates:
$|\psi'\rangle = \boldsymbol{\psi}' = \boldsymbol{U}_{T-1} \times \cdots \times \boldsymbol{U}_1 \times \boldsymbol{U}_0 \boldsymbol{\psi}$, where × denotes matrix-matrix multiplication.
The last ingredient concerns the readout stage. If the final state is 𝝍 ′ , then the
probabilities of obtaining outcome bit 𝑜 = 0 and 𝑜 = 1 are

\[
p_0 = \Pr_{|\psi'\rangle}[o = 0] = |\psi'_0|^2 \quad \text{and} \quad p_1 = \Pr_{|\psi'\rangle}[o = 1] = |\psi'_1|^2.
\]

Note that each formula features the absolute value of a complex number. And
such absolute values are invariant under multiplication with any complex phase
exp ( i 𝜑 ) :

\[
p_0 = |\psi'_0|^2 = |\exp(\mathrm{i}\varphi)\,\psi'_0|^2 \quad \text{and} \quad p_1 = |\psi'_1|^2 = |\exp(\mathrm{i}\varphi)\,\psi'_1|^2.
\]

This has an important consequence. Our description of quantum logic in terms


of matrix-vector multiplication carries redundant information. Indeed, the
overall complex phase does not matter at all. This affects both state vectors and
gate matrices.
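The recapitulated rules (matrix-vector multiplication, readout probabilities and their invariance under a global phase) can be sketched in plain Python. The helper names `apply_gate` and `readout_probabilities`, the example circuit and the phase angle 0.3 are assumptions chosen for illustration only.

```python
import cmath
import math


def apply_gate(U, psi):
    """Apply a 2x2 gate matrix U to a state vector psi (matrix-vector product)."""
    return (U[0][0] * psi[0] + U[0][1] * psi[1],
            U[1][0] * psi[0] + U[1][1] * psi[1])


def readout_probabilities(psi):
    """Return (p0, p1), the squared absolute values of the two amplitudes."""
    return (abs(psi[0]) ** 2, abs(psi[1]) ** 2)


s = 1 / math.sqrt(2)
H = ((s, s), (s, -s))   # Hadamard gate
X = ((0, 1), (1, 0))    # bit-flip gate
ket0 = (1, 0)           # state vector for input bit b = 0

# Apply X first and H second; the rightmost gate acts first.
psi = apply_gate(H, apply_gate(X, ket0))
p0, p1 = readout_probabilities(psi)

# Multiplying the state by a global phase leaves both probabilities unchanged.
global_phase = cmath.exp(1j * 0.3)
psi_with_phase = (global_phase * psi[0], global_phase * psi[1])
q0, q1 = readout_probabilities(psi_with_phase)
```

Here both outcomes are equally likely, with or without the extra phase factor.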

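Example 3.4 (states). The following four state vectors all describe a logical 0 bit
(‘everything is in zero’):
\[
|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
+\mathrm{i}|0\rangle = \begin{pmatrix} \mathrm{i} \\ 0 \end{pmatrix}, \quad
-|0\rangle = \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \quad
-\mathrm{i}|0\rangle = \begin{pmatrix} -\mathrm{i} \\ 0 \end{pmatrix}.
\]
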

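Example 3.5 (gates). The following two unitary matrices describe the same gate
action:
\[
\begin{pmatrix} \exp(-\mathrm{i}\pi/8) & 0 \\ 0 & \exp(\mathrm{i}\pi/8) \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 1 & 0 \\ 0 & \exp(\mathrm{i}\pi/4) \end{pmatrix}.
\]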

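Non-example 3.6 (states). The following two states are not equivalent
\[
|+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}
\quad \text{and} \quad
|\mathrm{i}+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + \mathrm{i}|1\rangle) = \begin{pmatrix} 1/\sqrt{2} \\ \mathrm{i}/\sqrt{2} \end{pmatrix},
\]

because there is no way to write | i+⟩ as exp ( i 𝜑 ) |+⟩ . ■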

This phase invariance of both state vectors and gate actions is an important
feature of quantum computing and also deserves a prominent display.

Fact 3.7 (phase invariance of state vectors and gate matrices). Two state vectors
|𝜓 ⟩, |𝜓 ′ ⟩ encode the same information content if |𝜓 ′ ⟩ = exp ( i 𝜑 ) |𝜓 ⟩ for
some 𝜑 ∈ [ 0, 2𝜋) . Likewise, two gate matrices 𝑼 , 𝑼 ′ encode the same
action if 𝑼 ′ = exp ( i 𝜑 ′ ) 𝑼 for some 𝜑 ′ ∈ [ 0, 2𝜋) . If this is the case, we
succinctly write

|𝜓 ′ ⟩ = 𝝍 ′ ∼ 𝝍 = |𝜓 ⟩ and also 𝑼 ′ ∼ 𝑼 .
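Fact 3.7 can be turned into a small numerical test. The sketch below is one possible way to do this and is not taken from the notes: the function name `equivalent_up_to_phase` and the tolerance are assumptions, and gate matrices are passed as flattened tuples of their four entries. The test uses the Cauchy-Schwarz inequality: two vectors of equal length differ only by a complex phase exactly when the absolute value of their inner product equals the product of their lengths.

```python
import cmath
import math


def equivalent_up_to_phase(v, w, tol=1e-9):
    """Return True if w = exp(i * phi) * v for some angle phi."""
    inner = sum(a.conjugate() * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(abs(a) ** 2 for a in v))
    norm_w = math.sqrt(sum(abs(b) ** 2 for b in w))
    same_length = abs(norm_v - norm_w) < tol
    parallel = abs(abs(inner) - norm_v * norm_w) < tol
    return same_length and parallel


s = 1 / math.sqrt(2)
ket0 = (1, 0)
plus = (s, s)            # |+>
i_plus = (s, 1j * s)     # |i+>

# The two gate matrices from Example 3.5, flattened row by row.
rz_matrix = (cmath.exp(-1j * math.pi / 8), 0, 0, cmath.exp(1j * math.pi / 8))
t_matrix = (1, 0, 0, cmath.exp(1j * math.pi / 4))

example_3_4 = [equivalent_up_to_phase(ket0, (c, 0)) for c in (1j, -1, -1j)]
example_3_5 = equivalent_up_to_phase(rz_matrix, t_matrix)
non_example_3_6 = equivalent_up_to_phase(plus, i_plus)
```

In this sketch, `example_3_4` and `example_3_5` confirm the equivalences stated in the examples, while `non_example_3_6` evaluates to `False`.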

3.3.2 Clifford gates


In the introduction, we have already emphasized that there are only two
reversible single-bit circuits: 𝕀 (do nothing) and 𝑿 (bit-flip), see Eq. (3.1). Let
us now explore what happens in the quantum case. Let’s start with the two
most prominent single-qubit gates that are featured in Lecture 2:
\[
\boldsymbol{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\quad \text{and} \quad
\boldsymbol{H} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]

The quantum circuit model allows us to combine these elementary gates to con-
struct new quantum logical functionalities. Note that many gate combinations
must be trivial because both 𝑿 and 𝑯 are reversible. In particular,

\[
\boldsymbol{X}^2 = \boldsymbol{X} \times \boldsymbol{X} = \mathbb{I} \quad \text{and} \quad \boldsymbol{H}^2 = \boldsymbol{H} \times \boldsymbol{H} = \mathbb{I},
\]

which tells us that long sequences of only 𝑿 or only 𝑯 don’t accomplish
anything. Indeed, $\boldsymbol{X}^D = \boldsymbol{X}$ if 𝐷 is odd and $\boldsymbol{X}^D = \mathbb{I}$ if 𝐷 is even. Likewise,
$\boldsymbol{H}^D = \boldsymbol{H}$ if 𝐷 is odd and $\boldsymbol{H}^D = \mathbb{I}$ otherwise. Truly new logical functionalities can
only be achieved if we alternate 𝑿 and 𝑯 gates. For instance, we can create
the following ‘cousins’ of the Hadamard gate:

. (3.4)

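Each of them features the minus sign in a different location. We can also
sandwich the 𝑿 gate between two 𝑯 s to obtain the sign-flip gate:

\[
\boldsymbol{H} \times \boldsymbol{X} \times \boldsymbol{H} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \boldsymbol{Z}. \tag{3.5}
\]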

This matrix 𝒁 is also called the Pauli-z gate, while 𝑿 is known as the Pauli-x
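gate. Finally, we can combine 𝑿 and 𝒁 to get an action that is equivalent to
the Pauli-y gate 𝒀 :
\[
\boldsymbol{X}\boldsymbol{Z} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
= (-\mathrm{i}) \begin{pmatrix} 0 & -\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix}
\sim \begin{pmatrix} 0 & -\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix} = \boldsymbol{Y}, \tag{3.6}
\]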

where we have used the gate invariance under global phases from Fact 3.7. This
phase invariance of quantum gate actions, in fact, also implies that we are done.
We cannot reach any new functionalities anymore. Also, note that Eq. (3.5)
allows us to replace the elementary 𝑿 -gate with an elementary 𝒁 -gate. After
all, we can transform one into the other by investing two Hadamard gates.
Lemma 3.8 The two elementary gates 𝑿 (bit-flip) and 𝑯 (Hadamard) generate
a total of 8 distinct quantum functionalities: one identity (𝕀), three
Pauli gates (𝑿 ,𝒀 , 𝒁 ) and the four Hadamard-type gates displayed in Eq. (3.4).
We leave a proof of this technical statement as an instructive exercise
that can be easily automated. Fig. 3.3 tries to visualize this as actions on
the Bloch sphere. The Bloch sphere is a convenient geometric representation
of single-qubit states that you might have seen in the Quantum Information
lecture. It uses a geometric analogy between single-qubit state vectors (with
complex coefficients and phase invariance) and real-valued 3D-vectors that
are confined to a sphere. In this Bloch sphere representation, antipodal points
are always orthogonal to each other. E.g. | 0⟩ and | 1⟩ form the north and south
poles of the sphere and also obey $\langle 0 | 1 \rangle = \boldsymbol{e}_0^\dagger \boldsymbol{e}_1 = 0$ (orthogonality). We refer to
the internet for further information – the precise workings of the Bloch sphere
representation will not be essential for this course.
Exercise 3.9 (Proof of Lemma 3.8). Verify the correctness of Lemma 3.8 by writing
a piece of code that generates gate combinations of a certain length, computes
their matrix representation via matrix-matrix multiplication and terminates
once no new functionally distinct matrices can be achieved (i.e. all newly
generated matrices are equivalent to existing ones up to a global phase).
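One possible way to automate this exercise is a breadth-first search over gate sequences. The sketch below is only an illustration under stated assumptions: the function names `matmul`, `same_up_to_phase` and `generate_gates` are hypothetical, and two unitary 2 × 2 matrices 𝑨, 𝑩 are treated as equivalent whenever the absolute value of the trace of 𝑨†𝑩 equals 2, which holds exactly when they differ by a global phase.

```python
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def same_up_to_phase(A, B, tol=1e-9):
    """For unitary 2x2 matrices: |tr(A^dagger B)| = 2 iff B = exp(i * phi) * A."""
    overlap = sum(A[i][j].conjugate() * B[i][j] for i in range(2) for j in range(2))
    return abs(abs(overlap) - 2) < tol


def generate_gates(generators):
    """Collect all gates (up to global phase) reachable from the identity."""
    identity = ((1, 0), (0, 1))
    found = [identity]
    frontier = [identity]
    while frontier:                      # stop once a round adds nothing new
        next_frontier = []
        for U in frontier:
            for G in generators:
                candidate = matmul(G, U)  # append one more elementary gate
                if not any(same_up_to_phase(candidate, V) for V in found):
                    found.append(candidate)
                    next_frontier.append(candidate)
        frontier = next_frontier
    return found


s = 1 / math.sqrt(2)
H = ((s, s), (s, -s))
X = ((0, 1), (1, 0))
S = ((1, 0), (0, 1j))

gates_from_h_and_x = generate_gates([H, X])
gates_from_h_and_s = generate_gates([H, S])
```

With these definitions, `len(gates_from_h_and_x)` reproduces the count of 8 from Lemma 3.8. Running the same search with the phase gate 𝑺 in place of 𝑿 yields 24 gates, the number quoted for the single-qubit Clifford gates in Definition 3.10.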
Lemma 3.8 lists a total of 8 different logical functionalities that can be
obtained from combining two elementary quantum gates: 𝑯 and 𝒁 (or,
equivalently: 𝑯 and 𝑿 ). Let us now see if we can push this number even
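further if we replace 𝒁 with another quantum gate. The phase gate
\[
\boldsymbol{S} = \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix}
\]

introduces complex numbers into our gate model. It is also closely related to
the sign-flip gate. Indeed,
\[
\boldsymbol{S}^2 = \boldsymbol{S} \times \boldsymbol{S}
= \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix} \times \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i}^2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \boldsymbol{Z},
\]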

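so it is instructive to think of 𝑺 as a ‘square root’ of the sign-flip gate 𝒁 . Higher
powers of 𝑺 lead to lower right matrix entries that continue to jump around
the complex unit circle:
\[
\boldsymbol{S}^3 = \begin{pmatrix} 1 & 0 \\ 0 & -\mathrm{i} \end{pmatrix}
\quad \text{and finally} \quad
\boldsymbol{S}^4 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \mathbb{I}.
\]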

The phase gate is the first gate we encounter that is not its own reverse. If we
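want to undo the action of 𝑺 , we must apply $\boldsymbol{S}^\dagger = \boldsymbol{S}^3$. A single application of
the phase gate inserts complex numbers into rows or columns of existing gate
matrix descriptions. Whether it is rows or columns depends on the ordering of
gates. For instance,
\begin{align*}
\boldsymbol{S} \times \boldsymbol{H} &= \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix} \times \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ \mathrm{i} & -\mathrm{i} \end{pmatrix}, \quad \text{while} \\
\boldsymbol{H} \times \boldsymbol{S} &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \times \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & \mathrm{i} \\ 1 & -\mathrm{i} \end{pmatrix}.
\end{align*}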
We can now use a combination of 𝑺 and 𝑯 to generate the actual Pauli-Y matrix
from Eq. (3.6):

\[
\boldsymbol{Y} = \boldsymbol{S} \times \boldsymbol{H} \times \boldsymbol{Z} \times \boldsymbol{H} \times \boldsymbol{S}^3.
\]

Figure 3.3 All 8 functionally distinct quantum gates formed from only the gates 𝑿
and 𝑯 , applied to the zero state | 0⟩ and visualized on a Bloch sphere.


We leave a verification of this formula as an instructive exercise. By now, it


should not be a surprise that replacing 𝒁 with 𝑺 (its square root) allows us to
build more stuff. However, the total number of different functionalities is still
finite.
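The identities involving the phase gate can be double-checked numerically before attempting the pen-and-paper verification. The following sketch uses hypothetical helper names (`matmul`, `max_deviation`) and compares matrices entry by entry up to floating-point rounding.

```python
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def max_deviation(A, B):
    """Largest entry-wise distance between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


s = 1 / math.sqrt(2)
I2 = ((1, 0), (0, 1))
H = ((s, s), (s, -s))
Z = ((1, 0), (0, -1))
S = ((1, 0), (0, 1j))
Y = ((0, -1j), (1j, 0))

S2 = matmul(S, S)    # should equal Z
S3 = matmul(S2, S)   # the inverse of S
S4 = matmul(S3, S)   # should equal the identity

# Right-hand side of Y = S x H x Z x H x S^3.
candidate_y = matmul(S, matmul(H, matmul(Z, matmul(H, S3))))
```

Note that the last identity holds as an exact matrix equality and not merely up to a global phase, which is why `candidate_y` is compared with `Y` entry by entry.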
Definition 3.10 (single-qubit Clifford gates). The two gates 𝑯 (Hadamard) and 𝑺
(phase) generate a total of 24 different single-qubit gate functionalities. These
are called (single-qubit) Clifford gates.
Table 3.1 provides a complete list of these 24 operations and shows how
to realize them using Pauli rotations – the topic of today’s final chapter.

3.3.3 Universal gate sets


We have now seen that two quantum gates already generate many different
logical functionalities – many more than the two reversible single-bit circuits 𝕀
and 𝑿 . Single-qubit Clifford gates, for instance, comprise a total of 24 gates
with different quantum logic. It is therefore natural to wonder if we can do
even better. Let us see if we can apply the same trick another time. More
precisely, we replace 𝑺 with its ‘square root’ which is called the T-gate:
\[
\boldsymbol{T} = \begin{pmatrix} 1 & 0 \\ 0 & \exp(\mathrm{i}\pi/4) \end{pmatrix}
\quad \text{such that} \quad
\boldsymbol{T}^2 = \begin{pmatrix} 1 & 0 \\ 0 & \exp(\mathrm{i}\pi/2) \end{pmatrix} = \boldsymbol{S},
\]
where we have used the rule for multiplying complex phases from Fact 3.2,
as well as exp ( i𝜋/2) = +i (see e.g. Fig. 3.1 (left)). Euler’s formula (3.3) also
implies
\[
\exp(\mathrm{i}\pi/4) = \cos(\pi/4) + \mathrm{i}\sin(\pi/4) = (1 + \mathrm{i})/\sqrt{2},
\]
so this complex phase contains both a real-valued and an imaginary contribution.
Due to 𝑺 = 𝑻 2 , replacing 𝑺 by 𝑻 can only increase the number of quantum
functionalities that can be reached. The actual gain is astonishing.

Theorem 3.11 (universal gate set). Together, the elementary Clifford gates
𝑯 , 𝑺 and the T-gate 𝑻 form a universal gate set: Every 2 × 2 unitary matrix
can be approximated to an arbitrary degree with sequences comprised of
only 𝑯 and 𝑻 (we actually don’t need 𝑺 , because 𝑺 = 𝑻² ).

This powerful statement stems from group theory and a proof would go
beyond the scope of this lecture. Here, we instead emphasize the implications.
There are infinitely many unitary 2 × 2 matrices that are distinct from each
other (even if we take into account phase invariance). Nonetheless, we can
approximate each and every such matrix with sequential quantum circuits that
only feature 𝑯 (Hadamard) and 𝑻 (T-gate). What is more, such universal
gate sets are easy to find. The elementary Clifford gates {𝑯 , 𝑺 } only need one
additional non-Clifford gate 𝑼 to become universal². The precise nature of this
additional third gate 𝑼 does not matter at all!
² For technical reasons, we might actually need one additional gate 𝑼 , as well as its inverse
𝑼 †.

Figure 3.4 (a) A Bloch sphere showing the labeling of the axes. (b) The single-qubit
quantum states reached with only combinations of 𝑿 and 𝑯 , see Lemma 3.8. (c)
The single-qubit quantum states reached with only Clifford gates, see Definition 3.10.
(d) The single-qubit quantum states reached with arbitrarily long combinations of
Clifford gates and the 𝑻 gate, as stated in Theorem 3.11.

Theorem 3.11 is interesting and tells us something about the ultimate


possibilities of single-qubit logic. However, it does not tell us anything about the
cost associated with realizing this apparent potential. A natural cost parameter
for quantum circuits is circuit depth: how many layers of either 𝑯 or 𝑻 are
required to approximate a given target unitary 𝑼 up to accuracy 𝜀 ∈ ( 0, 1) ? This quantum
synthesis problem has a surprisingly strong answer that is valid at a remarkable
level of generality [Kit97; DN05].

Theorem 3.12 (efficient single-qubit synthesis (Solovay-Kitaev Theorem)). Let
G be a universal gate setᵃ, e.g. G = {𝑯 ,𝑻 }, and let 𝜀 be a desired
approximation accuracy. Then, for every unitary 2 × 2 matrix 𝑼 , there
exists a sequence of (at most)

\[
D = O\left(\log^{c}(1/\varepsilon)\right) \tag{3.7}
\]

elementary gates (e.g. 𝑯 and 𝑻 ) that approximates the action of 𝑼 up
to accuracy 𝜀 . Here, 𝑐 ∈ [ 1, 3 + 𝑜 ( 1)] is a constant that depends on the
universal gate set in question.
ᵃ The original statement also requires that this gate set either contains inverses or can
generate them in a constant number of steps.

Some elementary gate sets even achieve 𝑐 = 1, in which case Eq. (3.7) is
tight up to a constant factor. We emphasize that the required circuit depth
only scales (poly)logarithmically in the desired target accuracy. Or, to put it
differently: after some burn-in period, the approximation error of a quantum
circuit approximation diminishes exponentially in the circuit depth one is
willing to invest.
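To get a feeling for circuit synthesis, one can search over all short sequences of 𝑯 and 𝑻 by brute force. The sketch below is emphatically not the Solovay-Kitaev algorithm: it enumerates all 2^𝑑 gate words of length 𝑑, which is only feasible for small depths. The target matrix (a real rotation by 0.4 radians), the phase-invariant distance measure and all function names are assumptions chosen for this illustration.

```python
import cmath
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def distance(U, V):
    """Phase-invariant distance; it vanishes exactly when V = exp(i * phi) * U."""
    overlap = sum(U[i][j].conjugate() * V[i][j] for i in range(2) for j in range(2))
    return math.sqrt(max(0.0, 1.0 - abs(overlap) / 2.0))


s = 1 / math.sqrt(2)
H = ((s, s), (s, -s))
T = ((1, 0), (0, cmath.exp(1j * math.pi / 4)))


def best_errors(target, max_depth):
    """Smallest error over all H/T words of length <= d, for d = 0, ..., max_depth."""
    identity = ((1, 0), (0, 1))
    level = [identity]                 # all words of the current length
    best = distance(target, identity)
    errors = [best]
    for _ in range(max_depth):
        level = [matmul(G, U) for U in level for G in (H, T)]
        best = min(best, min(distance(target, U) for U in level))
        errors.append(best)
    return errors


angle = 0.4                            # arbitrary example target
target = ((math.cos(angle), -math.sin(angle)),
          (math.sin(angle), math.cos(angle)))
errors = best_errors(target, max_depth=10)
```

The list `errors` can never increase with depth, because deeper searches include all shallower ones. How quickly it decreases for a given target is an empirical question that this sketch lets you explore.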
We have come a long way so far. Starting with 𝑯 and 𝑿 (or 𝒁 ), we found
out that we can generate a total of 8 different quantum functionalities. This
number went up to 24 after replacing 𝑿 with the phase gate 𝑺 . Moving from
𝑯 and 𝑺 to 𝑯 and 𝑻 – the ‘square root’ of 𝑺 – had even more disruptive
consequences. These two gates can approximate every 2 × 2 unitary matrix.
We refer to Fig. 3.4 for an illustration of this journey.

3.4 Pauli rotation gates


Let us now move on and discuss a different approach towards quantum gates
that is more analog in nature. This is also how modern quantum hardware
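executes quantum logic. The basic building blocks are the 3 Pauli matrices:
\[
\boldsymbol{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\boldsymbol{Y} = \begin{pmatrix} 0 & -\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix} \quad \text{and} \quad
\boldsymbol{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.8}
\]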

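They describe important quantum logical operations, like bit-flip (𝑿 ), sign-flip
(𝒁 ) and a combination of both (𝒀 = i𝑿 𝒁 ∼ 𝑿 𝒁 ). However, Pauli matrices also
feature prominently in the quantum physical formalism that is used to describe
individual qubits³. They can also be used to generate qubit rotations along
different axes. Let 𝜃 ∈ [ 0, 2𝜋] be an angle and define the Pauli rotations
\begin{align*}
\boldsymbol{R}_x(\theta) &= \exp(-\mathrm{i}(\theta/2)\boldsymbol{X}) = \begin{pmatrix} \cos(\theta/2) & -\mathrm{i}\sin(\theta/2) \\ -\mathrm{i}\sin(\theta/2) & \cos(\theta/2) \end{pmatrix} && (\boldsymbol{X}\text{-rotation by } \theta), \\
\boldsymbol{R}_y(\theta) &= \exp(-\mathrm{i}(\theta/2)\boldsymbol{Y}) = \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix} && (\boldsymbol{Y}\text{-rotation by } \theta), \\
\boldsymbol{R}_z(\theta) &= \exp(-\mathrm{i}(\theta/2)\boldsymbol{Z}) = \begin{pmatrix} \exp(-\mathrm{i}\theta/2) & 0 \\ 0 & \exp(\mathrm{i}\theta/2) \end{pmatrix} && (\boldsymbol{Z}\text{-rotation by } \theta).
\end{align*}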

Exercise 3.13 (computing matrix exponentials). Derive the matrix expressions of
𝑹 𝑥 (𝜃 ) , 𝑹 𝑦 (𝜃 ) and 𝑹 𝑧 (𝜃 ) by explicitly computing and simplifying the general
formula for matrix exponentials:
\[
\exp(\boldsymbol{A}) = \sum_{k=0}^{\infty} \frac{1}{k!} \boldsymbol{A}^k = \sum_{k=0}^{\infty} \frac{1}{k!} \underbrace{\boldsymbol{A} \times \cdots \times \boldsymbol{A}}_{k \text{ times}}.
\]

Hint: use 𝑿² = 𝒀² = 𝒁² = 𝕀, as well as the Taylor series expressions for sin (𝜃 )
and cos (𝜃 ) .
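Before deriving the closed-form expressions by hand, it can be reassuring to compare them against a truncated version of the matrix exponential series. The sketch below does this for one arbitrarily chosen angle; the helper names, the truncation after 40 terms and the angle 1.3 are assumptions made for illustration.

```python
import cmath
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def matrix_exp(A, terms=40):
    """Truncated series exp(A) = sum_k A^k / k! for a 2x2 matrix A."""
    term = ((1, 0), (0, 1))            # the k = 0 term: A^0 / 0! = identity
    total = term
    for k in range(1, terms):
        product = matmul(term, A)      # A^k / (k-1)!
        term = tuple(tuple(entry / k for entry in row) for row in product)
        total = tuple(
            tuple(total[i][j] + term[i][j] for j in range(2)) for i in range(2)
        )
    return total


def generator(theta, pauli):
    """Return the exponent -i * (theta / 2) * P for a Pauli matrix P."""
    return tuple(tuple(-1j * (theta / 2) * entry for entry in row) for row in pauli)


X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))

theta = 1.3
c, s = math.cos(theta / 2), math.sin(theta / 2)
closed_forms = {
    "x": (X, ((c, -1j * s), (-1j * s, c))),
    "y": (Y, ((c, -s), (s, c))),
    "z": (Z, ((cmath.exp(-1j * theta / 2), 0), (0, cmath.exp(1j * theta / 2)))),
}

deviations = {}
for axis, (pauli, closed_form) in closed_forms.items():
    series = matrix_exp(generator(theta, pauli))
    deviations[axis] = max(
        abs(series[i][j] - closed_form[i][j]) for i in range(2) for j in range(2)
    )
```

All three entries of `deviations` should be at the level of floating-point rounding errors.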
For now, we emphasize that one should really view these three operations
as rotations. The following technical statement explains why.
³ They are related to the spin of a quantum particle across the different coordinate axes in
3-dimensional space.

Lemma 3.14 All three Pauli rotations are 2𝜋 -periodic (up to a global phase) and
obey an angle addition rule. In formulas, we have for 𝑤 = 𝑥, 𝑦 , 𝑧 ,

𝑹 𝑤 ( 2𝜋) ∝ 𝑹 𝑤 ( 0) = 𝕀 and 𝑹 𝑤 (𝜃 )𝑹 𝑤 (𝜃 ′ ) = 𝑹 𝑤 (𝜃 + 𝜃 ′ ), (3.9)

regardless of 𝜃 , 𝜃 ′ ∈ ℝ.
Again, we leave a rigorous derivation as an instructive exercise. Note that
this angle addition rule is very similar to the phase multiplication rule from
Fact 3.2. Lemma 3.14 ensures 𝑅𝑤 (𝜃 ) ∝ 𝑅𝑤 (𝜃 mod 2𝜋) (2𝜋 -periodicity) and
also completely specifies the reverse operation:

𝑹 𝑤 (𝜃 ) −1 = 𝑹 𝑤 (−𝜃 ) because 𝑹 𝑤 (𝜃 )𝑹 𝑤 (−𝜃 ) = 𝑹 𝑤 (𝜃 − 𝜃 ) = 𝑹 𝑤 ( 0) = 𝕀.
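The statements of Lemma 3.14 and the formula for the reverse operation can be spot-checked numerically. The sketch below uses the closed-form rotation matrices from above; the helper names and the two test angles are assumptions made for illustration, and a numerical check for particular angles is no substitute for the derivation.

```python
import cmath
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def max_deviation(A, B):
    """Largest entry-wise distance between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -1j * s), (-1j * s, c))


def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))


def rz(theta):
    return ((cmath.exp(-1j * theta / 2), 0), (0, cmath.exp(1j * theta / 2)))


identity = ((1, 0), (0, 1))
minus_identity = ((-1, 0), (0, -1))
theta, theta_prime = 0.9, 2.3

addition_errors, full_turn_errors, inverse_errors = [], [], []
for rotation in (rx, ry, rz):
    # Angle addition rule: R(theta) R(theta') = R(theta + theta').
    addition_errors.append(max_deviation(
        matmul(rotation(theta), rotation(theta_prime)), rotation(theta + theta_prime)))
    # A full turn yields -identity, i.e. the identity up to a global phase.
    full_turn_errors.append(max_deviation(rotation(2 * math.pi), minus_identity))
    # Rotating back by -theta undoes the rotation.
    inverse_errors.append(max_deviation(
        matmul(rotation(theta), rotation(-theta)), identity))
```

The full-turn check makes the global phase in the periodicity statement explicit: a rotation by 2𝜋 equals −𝕀, which is equivalent to 𝕀 in the sense of Fact 3.7.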

Now suppose that we execute multiple Pauli rotations around the same axis.
Then, a similar argument highlights that the order of rotations does not matter:

𝑹 𝑤 (𝜃 )𝑹 𝑤 (𝜃 ′ ) = 𝑹 𝑤 (𝜃 + 𝜃 ′ ) = 𝑹 𝑤 (𝜃 ′ + 𝜃 ) = 𝑹 𝑤 (𝜃 ′ )𝑹 𝑤 (𝜃 ).

This means that we can accumulate and/or permute different Pauli rotations
of the same kind at will – an extremely useful feature. The following circuit
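visualization depicts a straightforward extension of this formula to 𝐷 + 1
different rotations:

\[
\boldsymbol{R}_w(\theta_D) \times \cdots \times \boldsymbol{R}_w(\theta_1) \times \boldsymbol{R}_w(\theta_0) = \boldsymbol{R}_w(\theta_0 + \theta_1 + \cdots + \theta_D), \tag{3.10}
\]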

for 𝑤 = 𝑥, 𝑦 , 𝑧 and 𝜃 0 , . . . , 𝜃𝐷 ∈ [ 0, 2𝜋) arbitrary. Note, however, that this is


only true for rotations along the same axis. Pauli rotations across different axes
don’t have this feature:

\[
\boldsymbol{R}_w(\theta)\boldsymbol{R}_{w'}(\theta') \neq \boldsymbol{R}_{w'}(\theta')\boldsymbol{R}_w(\theta) \quad \text{whenever } w \neq w'
\]

for almost all angles 𝜃 , 𝜃 ′ ∈ [ 0, 2𝜋) . The following diagrammatic reformulation


reminds us that order matters a lot in general:

We conclude this section by reproducing some of the gates we have seen


today with Pauli rotations.

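1 The T-gate 𝑻 is equivalent to a 𝑧 -rotation with angle 𝜃 = 𝜋/4:
\[
\boldsymbol{R}_z(\pi/4) = \begin{pmatrix} \exp(-\mathrm{i}\pi/8) & 0 \\ 0 & \exp(\mathrm{i}\pi/8) \end{pmatrix}
\sim \begin{pmatrix} 1 & 0 \\ 0 & \exp(\mathrm{i}\pi/4) \end{pmatrix} = \boldsymbol{T}.
\]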

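No.  Rotation composition                                  Matrix
1    $\mathbb{I}$                                          $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
2    $\boldsymbol{X}$                                      $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
3    $\boldsymbol{Y}$                                      $\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$
4    $\boldsymbol{Z}$                                      $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
5    $R_z(\pi/2)$                                          $\frac{\sqrt{2}}{2}\begin{pmatrix} 1-i & 0 \\ 0 & 1+i \end{pmatrix}$
6    $R_z(-\pi/2)$                                         $\frac{\sqrt{2}}{2}\begin{pmatrix} 1+i & 0 \\ 0 & 1-i \end{pmatrix}$
7    $R_x(\pi/2)$                                          $\frac{\sqrt{2}}{2}\begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}$
8    $R_x(-\pi/2)$                                         $\frac{\sqrt{2}}{2}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$
9    $R_y(\pi/2)$                                          $\frac{\sqrt{2}}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$
10   $R_y(-\pi/2)$                                         $\frac{\sqrt{2}}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$
11   $R_x(\pi/2) \times R_y(\pi/2)$                        $\frac{1}{2}\begin{pmatrix} 1-i & -1-i \\ 1-i & 1+i \end{pmatrix}$
12   $R_x(\pi/2) \times R_y(-\pi/2)$                       $\frac{1}{2}\begin{pmatrix} 1+i & 1-i \\ -1-i & 1-i \end{pmatrix}$
13   $R_x(-\pi/2) \times R_y(\pi/2)$                       $\frac{1}{2}\begin{pmatrix} 1+i & -1+i \\ 1+i & 1-i \end{pmatrix}$
14   $R_x(-\pi/2) \times R_y(-\pi/2)$                      $\frac{1}{2}\begin{pmatrix} 1-i & 1+i \\ -1+i & 1+i \end{pmatrix}$
15   $R_y(\pi/2) \times R_x(\pi/2)$                        $\frac{1}{2}\begin{pmatrix} 1+i & -1-i \\ 1-i & 1-i \end{pmatrix}$
16   $R_y(-\pi/2) \times R_x(\pi/2)$                       $\frac{1}{2}\begin{pmatrix} 1-i & 1-i \\ -1-i & 1+i \end{pmatrix}$
17   $R_y(\pi/2) \times R_x(-\pi/2)$                       $\frac{1}{2}\begin{pmatrix} 1-i & -1+i \\ 1+i & 1+i \end{pmatrix}$
18   $R_y(-\pi/2) \times R_x(-\pi/2)$                      $\frac{1}{2}\begin{pmatrix} 1+i & 1+i \\ -1+i & 1-i \end{pmatrix}$
19   $\boldsymbol{X} \, R_y(\pi/2)$                        $\frac{\sqrt{2}}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
20   $\boldsymbol{X} \, R_y(-\pi/2)$                       $\frac{\sqrt{2}}{2}\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}$
21   $\boldsymbol{Y} \, R_x(\pi/2)$                        $\frac{\sqrt{2}}{2}\begin{pmatrix} -1 & -i \\ i & 1 \end{pmatrix}$
22   $\boldsymbol{Y} \, R_x(-\pi/2)$                       $\frac{\sqrt{2}}{2}\begin{pmatrix} 1 & -i \\ i & -1 \end{pmatrix}$
23   $R_x(\pi/2) \times R_y(\pi/2) \times R_x(\pi/2)$      $\frac{\sqrt{2}}{2}\begin{pmatrix} 0 & -1-i \\ 1-i & 0 \end{pmatrix}$
24   $R_x(-\pi/2) \times R_y(\pi/2) \times R_x(-\pi/2)$    $\frac{\sqrt{2}}{2}\begin{pmatrix} 0 & -1+i \\ 1+i & 0 \end{pmatrix}$

Table 3.1 All single-qubit Clifford gates that can be constructed using only 𝑯
(Hadamard) and 𝑺 (phase). The first column also provides a decomposition
into (at most) 3 Pauli rotations and/or Pauli gates.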

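2 The phase gate 𝑺 is equivalent to a 𝑧 -rotation with angle 𝜃 = 𝜋/2:
\[
\boldsymbol{R}_z(\pi/2) = \begin{pmatrix} \exp(-\mathrm{i}\pi/4) & 0 \\ 0 & \exp(\mathrm{i}\pi/4) \end{pmatrix}
\sim \begin{pmatrix} 1 & 0 \\ 0 & \exp(\mathrm{i}\pi/2) \end{pmatrix} = \boldsymbol{S}.
\]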

This angle is exactly twice as large as the angle required for the T-gate.
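3 The sign gate 𝒁 is equivalent to a 𝑧 -rotation with angle 𝜃 = 𝜋 :
\[
\boldsymbol{R}_z(\pi) = \begin{pmatrix} \exp(-\mathrm{i}\pi/2) & 0 \\ 0 & \exp(\mathrm{i}\pi/2) \end{pmatrix}
\sim \begin{pmatrix} 1 & 0 \\ 0 & \exp(\mathrm{i}\pi) \end{pmatrix} = \boldsymbol{Z}.
\]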

This angle is exactly twice as large as the angle required for the phase
gate and four times as large as the angle required for the T-gate.
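4 The Hadamard gate 𝑯 is equivalent to a combination of an 𝑥 -rotation
and a 𝑦 -rotation with angles 𝜋 and 𝜋/2, respectively:

\begin{align*}
\boldsymbol{R}_x(\pi) \times \boldsymbol{R}_y(\pi/2)
&= \begin{pmatrix} \cos(\pi/2) & -\mathrm{i}\sin(\pi/2) \\ -\mathrm{i}\sin(\pi/2) & \cos(\pi/2) \end{pmatrix}
\times \begin{pmatrix} \cos(\pi/4) & -\sin(\pi/4) \\ \sin(\pi/4) & \cos(\pi/4) \end{pmatrix} \\
&= (-\mathrm{i}) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \times \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
= (-\mathrm{i}) \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \sim \boldsymbol{H}.
\end{align*}

Here, we have used cos (𝜋/2) = 0, sin (𝜋/2) = 1 and
$\cos(\pi/4) = \sin(\pi/4) = 1/\sqrt{2}$.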
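The four equivalences in this list can be confirmed numerically. The sketch below compares each gate with its rotation counterpart up to a global phase; the helper names and the trace-based equivalence test are assumptions made for illustration and are not taken from the notes.

```python
import cmath
import math


def matmul(A, B):
    """Matrix-matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def same_up_to_phase(A, B, tol=1e-9):
    """For unitary 2x2 matrices: |tr(A^dagger B)| = 2 iff B = exp(i * phi) * A."""
    overlap = sum(A[i][j].conjugate() * B[i][j] for i in range(2) for j in range(2))
    return abs(abs(overlap) - 2) < tol


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -1j * s), (-1j * s, c))


def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))


def rz(theta):
    return ((cmath.exp(-1j * theta / 2), 0), (0, cmath.exp(1j * theta / 2)))


r = 1 / math.sqrt(2)
H = ((r, r), (r, -r))
Z = ((1, 0), (0, -1))
S = ((1, 0), (0, 1j))
T = ((1, 0), (0, cmath.exp(1j * math.pi / 4)))

checks = {
    "T ~ Rz(pi/4)": same_up_to_phase(T, rz(math.pi / 4)),
    "S ~ Rz(pi/2)": same_up_to_phase(S, rz(math.pi / 2)),
    "Z ~ Rz(pi)": same_up_to_phase(Z, rz(math.pi)),
    "H ~ Rx(pi) Ry(pi/2)": same_up_to_phase(H, matmul(rx(math.pi), ry(math.pi / 2))),
}
```

Every entry of `checks` should evaluate to `True`. The same helper can be used to test other candidate decompositions.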

The final statement of this section is an immediate consequence of Theorem 3.11


and the fact that we can use Pauli rotations to represent 𝑻 ∼ 𝑹 𝑧 (𝜋/4) and
𝑯 ∼ 𝑹 𝑥 (𝜋)𝑹 𝑦 (𝜋/2) .
Corollary 3.15 (Single-qubit Pauli rotations are universal). Sequential combinations
of Pauli rotations 𝑹 𝑥 (𝜃 𝑥 ), 𝑹 𝑦 (𝜃 𝑦 ), 𝑹 𝑧 (𝜃 𝑧 ) with variable angles 𝜃 𝑥 , 𝜃 𝑦 , 𝜃 𝑧 ∈
[ 0, 2𝜋) can be used to reach every single-qubit functionality.
This elementary gate set is even more powerful than the ones we’ve seen so
far because we can choose continuous angles. As a result, only very few such
continuous gates suffice to exactly implement a unitary functionality 𝑼 .
Exercise 3.16 (formal proofs of the matrix analysis equations).

1 Verify all three matrix representations of Pauli rotations by writing down


the matrix exponential and performing simplifications that are similar to
the proof of Euler’s equation (Exercise 3.3). Hint: use 𝑿 2 = 𝒀 2 = 𝒁 2 = 𝕀
to decompose each Taylor series into an even and odd part.
2 Verify the rotation properties in Eq. (3.9), e.g. by inserting concrete angles
and combining matrix-matrix multiplication rules with trigonometric
identities.

3.5 Application: restricted sum of parity computations


Let us now showcase how analog degrees of freedom can be exploited to
solve certain problems that don’t have an (obvious) digital solution. The
following stylized challenge illustrates that this feature alone can already be
quite empowering.
Envision a lecture hall with one instructor (me) and 𝐷 ≥ 2 students (you).
Once the game begins, the instructor hands out one integer number 𝑘𝑑 ∈ ℤ
to each student. The students win the challenge if they can determine the
parity (odd vs. even) of the sum of all integers, $k_{\text{tot}} = k_0 + \cdots + k_{D-1}$. But
they must do so with very limited information transfer: the students can only
receive/transmit a single bit of information. This limitation forces students to
act sequentially and we can completely describe every possible strategy as a
dynamic 1-bit circuit, where each student contributes one logical operation
(that depends on the integer they have received).
The following classical strategy – visualized as a quantum circuit pipeline
with only classical constituents – allows the students to always win this chal-
lenge:

(3.11)

This circuit cleverly exploits a cute feature of the parity: the parity of a sum is
the sum of the parities modulo 2. In this strategy, each student computes the
parity of their number and, once they receive the bit, they either flip it (if their
number is odd) or do nothing (if their number is even). This ensures that the
final outcome bit is

\[
o = \text{parity}(k_{D-1}) \oplus \cdots \oplus \text{parity}(k_0) = \text{parity}(k_0 + \cdots + k_{D-1}) = \text{parity}(k_{\text{tot}}).
\]

In words: the outcome bit is 1 if and only if 𝑘 tot is an odd number. Otherwise, it
evaluates to 0. Executing this strategy allows the students to win this restricted
parity of sum computation with certainty.
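The classical strategy is easy to simulate. In the following sketch, the function name `classical_parity_strategy` and the example numbers are chosen for illustration; the single shared bit is passed from student to student and flipped whenever a student holds an odd number.

```python
def classical_parity_strategy(numbers):
    """Simulate the 1-bit protocol: each student flips the bit iff their number is odd."""
    bit = 0                      # the first student starts with bit value 0
    for k in numbers:
        if k % 2 == 1:           # Python's % returns 0 or 1 also for negative integers
            bit ^= 1             # bit-flip (X gate)
        # otherwise: do nothing (identity)
    return bit


handed_out = [3, 4, -7, 10]      # hypothetical integers, one per student
outcome = classical_parity_strategy(handed_out)
direct_parity = sum(handed_out) % 2
```

Here `outcome` agrees with `direct_parity`, although no student ever learns more than a single bit about the other students' numbers.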
So, let’s increase the difficulty level. Instead of distributing integers
𝑘 0 , . . . , 𝑘𝐷 −1 ∈ ℤ, the instructor now distributes rational numbers 𝑘 0 , . . . , 𝑘𝐷 −1 ∈
ℚ (i.e. fractions) with the additional promise that the total sum 𝑘 tot =
𝑘 0 + · · · + 𝑘𝐷 −1 ∈ ℤ still adds up to an integer value. He then asks the students
to come up with a strategy that again only involves a single bit of commu-
nication between them. This modification is quite nasty because the parity
is not well-defined for general rational numbers. This prevents the students
from re-using their winning strategy from before. In fact, we believe that it is
impossible to come up with a 1-bit communication protocol that is still capable
of winning this challenge with certainty.
Nonetheless, a clever group of students can overcome this sorry state of
affairs by going quantum: they replace the single bit of shared (classical) infor-
mation with a single qubit of shared quantum information. This generalization
allows the students to implement a sequential single-qubit quantum circuit of
the following form:

. (3.12)

Instead of an individual bit, the students now pass an individual qubit between
themselves. They can choose to apply a quantum gate and/or perform a readout
and initialize a new qubit. A moment of thought reveals that only the very
last student should perform an actual readout. And only the very first student
should initialize their qubit. All students in the middle apply a quantum gate
that depends on the fraction 𝑘𝑑 ∈ ℚ they received. The strategy depicted in
Eq. (3.12) suggests using a Pauli-y rotation with angle $\theta_d = \pi k_d$. The angle
addition rule for Pauli rotations from Eq. (3.10) highlights why this strategy is
a good idea:
𝑹 𝑦 (𝜋𝑘𝐷 −1 ) × · · · × 𝑹 𝑦 (𝜋𝑘 0 ) = 𝑹 𝑦 (𝜋 (𝑘𝐷 −1 + · · · + 𝑘 0 )) = 𝑹 𝑦 (𝜋𝑘 tot ) .
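Or, in pictures:

[Circuit diagram: the 𝐷 consecutive rotations 𝑹 𝑦 (𝜋𝑘 0 ), . . . , 𝑹 𝑦 (𝜋𝑘𝐷 −1 ) combine into the single rotation 𝑹 𝑦 (𝜋𝑘 tot ).]
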
And, with the additional promise that 𝑘 tot is integer, we can conclude that
the final state vector is
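\[
\begin{aligned}
|\psi(k_{\mathrm{tot}})\rangle &= \boldsymbol{R}_y(\pi k_{\mathrm{tot}})\,|0\rangle \\
&= \begin{pmatrix} \cos(\pi k_{\mathrm{tot}}/2) & -\sin(\pi k_{\mathrm{tot}}/2) \\ \sin(\pi k_{\mathrm{tot}}/2) & \cos(\pi k_{\mathrm{tot}}/2) \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
&= \begin{pmatrix} \cos(\pi k_{\mathrm{tot}}/2) \\ \sin(\pi k_{\mathrm{tot}}/2) \end{pmatrix} \\
&= \begin{cases} \pm|0\rangle & \text{if } k_{\mathrm{tot}} \text{ even}, \\ \pm|1\rangle & \text{if } k_{\mathrm{tot}} \text{ odd}. \end{cases}
\end{aligned}
\]

The last simplification follows from the behavior of trigonometric functions: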
whenever 𝑘 tot is even, sin (𝜋𝑘 tot /2) = 0 and all of the information must
concentrate in the ‘only 0’ branch. Conversely, if 𝑘 tot is odd, then cos (𝜋𝑘 tot /2) =
0 and all information must concentrate in the ‘only 1’ branch. There can be an
additional sign factor, but this does not matter, because the readout stage is
invariant under global phases.
This means that once the last person contributes their gate and starts the
readout, the outcome bit 𝑜 is perfectly correlated with the parity of the total
sum:

1 𝑜 = 0 must happen with certainty if 𝑘 tot = 𝑘 0 + · · · + 𝑘𝐷 −1 is an even number.
2 𝑜 = 1 must happen with certainty if 𝑘 tot = 𝑘 0 + · · · + 𝑘𝐷 −1 is an odd number.

So, going quantum allows the students to always win a very constrained
multi-player challenge for which no optimal classical strategy is known.
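
The quantum strategy above can be checked numerically. The following is only a minimal sketch and not part of the original notes: it assumes NumPy, uses the real-valued 𝑹 𝑦 matrix from the derivation above, and the helper names `ry` and `play_parity_game` as well as the two lists of fractions are made up for illustration.

```python
from fractions import Fraction

import numpy as np


def ry(theta):
    """Pauli-y rotation by angle theta (same matrix as in the derivation above)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def play_parity_game(ks):
    """Pass one qubit along the chain of students; return [Pr(o=0), Pr(o=1)]."""
    state = np.array([1.0, 0.0])              # the first student initializes |0>
    for k in ks:                              # student d applies R_y(pi * k_d)
        state = ry(np.pi * float(k)) @ state
    return np.abs(state) ** 2                 # readout probabilities (Born rule)


# Hypothetical inputs: fractions whose total sum is promised to be an integer.
odd_sum = [Fraction(1, 3), Fraction(1, 2), Fraction(7, 6), Fraction(1)]        # sum = 3
even_sum = [Fraction(3, 4), Fraction(5, 4), Fraction(-1, 2), Fraction(5, 2)]   # sum = 4

probs_odd = play_parity_game(odd_sum)
probs_even = play_parity_game(even_sum)
print(probs_odd.round(6), probs_even.round(6))
```

Up to floating-point rounding, the readout is deterministic in both cases, even though none of the individual fractions has a well-defined parity.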

Problems
Problem 3.17 (Proof of Euler’s theorem, see Exercise 3.3). Prove exp(i𝜑) = cos(𝜑) + i sin(𝜑)
by using i² = −1 and the following three Taylor series expansions:
\[
\exp(z) = \sum_{k=0}^{\infty} \frac{z^k}{k!}, \qquad
\cos(z) = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!}, \qquad
\sin(z) = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!}.
\]
Problem 3.18 (Finding all gates generated by 𝑯 and 𝒁 , see Exercise 3.9). Verify
the correctness of Lemma 3.8 by writing a piece of code that generates gate
combinations of certain length, computes their matrix representation via matrix-
matrix multiplication and terminates once no new functionally distinct matrices
can be achieved (i.e. all newly generated matrices are equivalent to existing
ones up to a global phase).
Problem 3.19 (formal proofs of matrix exponential identities, see Exercise 3.16).

1 Verify all three matrix representations of Pauli rotations by writing down
the matrix exponential and performing simplifications that are similar to
the proof of Euler’s equation (Exercise 3.3). Hint: use 𝑿² = 𝒀² = 𝒁² = 𝕀
to decompose each Taylor series into an even and odd part.
2 Verify the rotation properties in Eq. (3.9), e.g. by inserting concrete angles
and combining matrix-matrix multiplication rules with trigonometric
identities.
4. Two-qubit circuits

Written by: Kristina Kirova & Jadwiga Wilkens


Date: 23 October 2024
Revised by: Richard Kueng

Agenda:
1 motivation
2 classical operations and their truth tables
3 Kronecker product vs. matrix product
4 XOR and CNOT gate
5 universal 2-qubit gate set and its implications

A 1-qubit quantum processing unit (QPU) is a great starting point for gaining
a better understanding of the distinctions between classical and quantum
computing by introducing the concept of superposition. However, to harness
the full power of quantum phenomena in quantum computing, we require a
multi-qubit QPU and new logical quantum operations that address (at least) 2
qubits.
In this lecture, we make a first step in this direction and delve into the
world of 2-qubit gate logic. We begin by exploring the classical bit level, first
making sure that the gates are reversible, then examining concepts like the
sequential and parallel application of logical gates with a truth table, which are
subsequently expressed in mathematical terms. Following this, we transition
to the realm of 2-qubit quantum gates, and we discuss their implications for
universality in both the 2-qubit and multi-qubit scenarios, making a promising
point about why and how quantum computing can be more powerful than
classical computing.

Figure 4.1 Schematic illustration of a two-qubit processor: Input and output consist
of 2 combined conventional bits. In-between, depicted in blue, the 2-qubit logic
operates at the quantum level.

Figure 4.2 Illustration of all 1-bit logical operations as maps. (a) The identity
operation (do nothing), a valid reversible mapping; (b) the bit-flip operation,
another valid reversible mapping; (c) a reset to 0, not reversible.

Figure 4.3 Illustration of some 2-bit logical operations as maps. (a) and (b) are
examples of reversible mappings. (c) and (d) are examples of mappings that
are not reversible: In (c) 01 is mapped to both 00 and 10 and in (d) both 00
and 01 are mapped to 00.

4.1 Classical reversible operations on 2 bits


As we saw in Lecture 2, there are two classical 1-bit gate operations that take
an input bit 𝑏 ∈ {0, 1} to an output bit 𝑜 ∈ {0, 1} in such a way that no
information is lost (reversibility), namely the bit-flip operation 𝑿 and the
‘do nothing’ operation 𝕀. To illustrate what reversibility means,
Fig. 4.2 shows examples of different logical operations that map an input bit
to an output bit. Every input value has to be mapped to exactly one output value,
and no two input values may be mapped to the same output value;
otherwise the logical operation (gate) is not reversible.
Adding a second (qu)bit to this process, the general layout now looks like
Fig. 4.1: input bit pairs 𝑏 ∈ {00, 01, 10, 11} are manipulated and mapped onto
output bit pairs 𝑜 ∈ {00, 01, 10, 11}. In the 1-bit case, there are 2 distinct
reversible logical operations, where every value from the input set is mapped to
exactly one value in the output set and vice versa (reversibility). How many
reversible operations are there now for 2 bits? The reversibility criterion for
1-bit operations carries over unchanged to 2-bit operations. Fig. 4.3
illustrates some reversible and non-reversible 2-bit gates.
(a) Flipping the first bit
      00  01  10  11
 00    0   0   1   0
 01    0   0   0   1
 10    1   0   0   0
 11    0   1   0   0

(b) Flipping the second bit
      00  01  10  11
 00    0   1   0   0
 01    1   0   0   0
 10    0   0   0   1
 11    0   0   1   0

(c) Flipping both bits
      00  01  10  11
 00    0   0   0   1
 01    0   0   1   0
 10    0   1   0   0
 11    1   0   0   0

(d) Identity on both bits
      00  01  10  11
 00    1   0   0   0
 01    0   1   0   0
 10    0   0   1   0
 11    0   0   0   1

Figure 4.4 Logical truth tables for all parallel combinations of bit flip (𝑿 ) and
identity (𝕀). Fig. 4.5 displays the corresponding quantum circuit realizations.

This argument, when followed through, boils down to a combinatorics
problem: how many distinct orderings of 4 distinguishable objects (here the 4
combinations of 2 bit values) are there? Or formulated differently: how many
distinct permutations of 4 distinct objects are there? The answer is 4!, read as
4 factorial, which is 4! = 4 × 3 × 2 × 1 = 24. There are 24 reversible 2-bit
operations.
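
The counting argument can be reproduced by brute force. The short sketch below is an illustration only (the variable names are our own, and only the Python standard library is assumed): it lists every bijection on the four 2-bit strings and compares the count with the number of all maps, reversible or not.

```python
from itertools import permutations, product

# All 2-bit strings: '00', '01', '10', '11'.
bitstrings = ["".join(bits) for bits in product("01", repeat=2)]

# A reversible operation is a bijection on the 4 bit strings, i.e. a permutation.
reversible_ops = [dict(zip(bitstrings, perm)) for perm in permutations(bitstrings)]

# For comparison: all maps from 2-bit strings to 2-bit strings (reversible or not).
all_ops = list(product(bitstrings, repeat=4))

print(len(reversible_ops), "reversible out of", len(all_ops))
# prints: 24 reversible out of 256
```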
4.1.1 Combining single-bit operations in parallel
The analysis above only shows how many distinct reversible classical gates there
are, not what their truth tables or gate sequences look like.
For that, let us look at some easy examples to get a grasp of how to build
compositions of 2-(qu)bit gates. Since the bit-flip operator is already known,
the easiest non-trivial example is flipping one bit while doing nothing to the
other bit. In Fig. 4.4 all four truth tables of combinations
of bit-flip and identity operations are depicted, while Fig. 4.5 presents the
corresponding quantum circuit realizations.
After visualizing the parallel combinations of bit-flip and identity on 2 (qu)bits, we
are now ready to present a mathematical operation that allows us to combine
two (matrix representations of) single-bit truth tables to obtain the matching
truth table for the resulting 2-bit operation.
Applying two gates in parallel is written with a Kronecker product symbol
⊗ in between. This is a different type of matrix product that we haven’t seen
yet. Recall that we use the ordinary matrix product × to sequentially multiply
(single-qubit) gate matrix actions. In short: sequential gates → matrix product,
parallel gates → Kronecker product. We can use this new Kronecker product to
turn Fig. 4.5 into the following formulas:
Figure 4.5 Quantum circuits that realize all parallel combinations of bit flip (𝑿 )
and identity (𝕀). (a) flip the first bit and leave the second. (b) flip the second
bit and leave the first. (c) flip both bits at the same time (parallel). (d) don’t
do anything to both bits. Fig. 4.4 depicts the associated truth tables.

(a) 𝑿 ⊗ 𝕀 Flipping the first bit,
(b) 𝕀 ⊗ 𝑿 Flipping the second bit,
(c) 𝑿 ⊗ 𝑿 Flipping both bits,
(d) 𝕀 ⊗ 𝕀 Identity on both bits.

The order of gates, or rather of the underlying matrices, within the Kronecker
product ⊗ is crucial, since gates written to the left of ⊗ act only on the first (qu)bit
and gates written to the right act only on the second (qu)bit. This
is similar to the fact that ordering also matters a lot in the ordinary matrix
product ×, which describes sequential gate applications: 𝑨 × 𝑩 ≠ 𝑩 × 𝑨 in
general. An example of the importance of ordering can be seen in Lecture
3, where the different combinations of 𝑯 and 𝑿 are shown (more precisely, in
Eq. (3.4)). The matrix representations there show very clearly that
𝑯 × 𝑿 ≠ 𝑿 × 𝑯.
The same holds for the Kronecker product: in general, 𝑨 ⊗ 𝑩 ≠ 𝑩 ⊗ 𝑨 . The ordering of gates matters! This
can already be seen in the truth table representation of bit-flip and identity
combinations in Fig. 4.4: 𝑿 ⊗ 𝕀 ≠ 𝕀 ⊗ 𝑿 .
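
As a quick numerical illustration (not part of the original notes), the two truth tables can be generated with NumPy's `np.kron`. The variable names `flip_first` and `flip_second` are our own.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])   # bit flip
I = np.eye(2, dtype=int)         # identity ('do nothing')

flip_first = np.kron(X, I)       # X on the first bit, identity on the second
flip_second = np.kron(I, X)      # identity on the first bit, X on the second

print(flip_first)
print(flip_second)
print("equal:", np.array_equal(flip_first, flip_second))
```

The two printed matrices should match truth tables (a) and (b) of Fig. 4.4, and the final comparison confirms that they differ.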

4.1.2 The Kronecker product


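Before moving on to a formal definition of the Kronecker product, let us review
its action by means of a concrete example. The Kronecker product between
bit-flip (𝑿 ) and identity (𝕀) can be computed as follows:
\[
\boldsymbol{X} \otimes \mathbb{I}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathbb{I}
= \begin{pmatrix} 0\,\mathbb{I} & 1\,\mathbb{I} \\ 1\,\mathbb{I} & 0\,\mathbb{I} \end{pmatrix}
= \begin{pmatrix}
0 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} & 1 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \\
1 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} & 0 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{pmatrix}
= \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0
\end{pmatrix}.
\]
Here (and in Fig. 4.4), blue color highlights the entries of the gate acting on
the first (qu)bit which are not zero.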
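Definition 4.1 (Kronecker product of two 2 × 2 matrices). Consider two complex-valued
2 × 2 matrices
\[
\boldsymbol{A} = \begin{pmatrix} A_{0,0} & A_{0,1} \\ A_{1,0} & A_{1,1} \end{pmatrix} \in \mathbb{C}^{2 \times 2}
\quad \text{and} \quad
\boldsymbol{B} = \begin{pmatrix} B_{0,0} & B_{0,1} \\ B_{1,0} & B_{1,1} \end{pmatrix} \in \mathbb{C}^{2 \times 2}.
\]
Then, the parallel execution of gate 𝑨 on the first qubit and 𝑩 on the second
qubit is written as 𝑨 ⊗ 𝑩 . This Kronecker product produces a 4 × 4 matrix that
is calculated as follows:
\[
\boldsymbol{A} \otimes \boldsymbol{B}
= \begin{pmatrix} A_{0,0} \times \boldsymbol{B} & A_{0,1} \times \boldsymbol{B} \\ A_{1,0} \times \boldsymbol{B} & A_{1,1} \times \boldsymbol{B} \end{pmatrix}
= \begin{pmatrix}
A_{0,0}B_{0,0} & A_{0,0}B_{0,1} & A_{0,1}B_{0,0} & A_{0,1}B_{0,1} \\
A_{0,0}B_{1,0} & A_{0,0}B_{1,1} & A_{0,1}B_{1,0} & A_{0,1}B_{1,1} \\
A_{1,0}B_{0,0} & A_{1,0}B_{0,1} & A_{1,1}B_{0,0} & A_{1,1}B_{0,1} \\
A_{1,0}B_{1,0} & A_{1,0}B_{1,1} & A_{1,1}B_{1,0} & A_{1,1}B_{1,1}
\end{pmatrix} \in \mathbb{C}^{4 \times 4}.
\]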
Note that the resulting matrix has 2 × 2 = 4 columns and 2 × 2 = 4
rows. So, a parallel application of gates increases the size of the resulting
truth table. In contrast, the matrix multiplication of gates which are applied
sequentially (one after the other) does not increase the dimension of the truth
table representation.
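
To make the block structure of Definition 4.1 concrete, here is a small sketch that assembles the four blocks by hand and compares the result with NumPy's built-in `np.kron`. The function name `kron_2x2` and the random test matrices are illustrative assumptions, not part of the original notes.

```python
import numpy as np


def kron_2x2(A, B):
    """Kronecker product of two 2x2 matrices, assembled block by block as in Definition 4.1."""
    A, B = np.asarray(A), np.asarray(B)
    return np.block([[A[0, 0] * B, A[0, 1] * B],
                     [A[1, 0] * B, A[1, 1] * B]])


# Hypothetical complex-valued test matrices.
rng = np.random.default_rng(seed=0)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

C = kron_2x2(A, B)
print(C.shape)                         # a 4 x 4 matrix
print(np.allclose(C, np.kron(A, B)))   # agrees with NumPy's implementation
```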
In order to get more familiar with the Kronecker product, here are
some properties that are good to keep in mind for further calculations and
for general understanding. For example, taking the matrix product of two layers of parallel
gates, built from matrices 𝑨, 𝑩, 𝑪 , and 𝑫, boils down to:

(𝑨 ⊗ 𝑩) × (𝑪 ⊗ 𝑫) = (𝑨 × 𝑪 ) ⊗ (𝑩 × 𝑫). (4.1)

This rule has a neat consequence that helps to grasp what it means to apply two gates in
parallel: applying two gates at the same time is equivalent to first applying one
gate to the first bit while leaving the other bit alone, and then applying the second
gate to the other bit while keeping the first one unchanged with an identity
gate. Putting this explanation into an equation gives

(𝑨 ⊗ 𝕀) × (𝕀 ⊗ 𝑩) = (𝑨 ⊗ 𝑩).
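And putting this explanation into a figure gives

[Circuit diagram: 𝑨 on the first wire and 𝑩 on the second wire, applied in parallel, equals the sequential combination of 𝑨 ⊗ 𝕀 and 𝕀 ⊗ 𝑩.]
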
It is easy to count the total number of classical 2-bit truth tables (matrices)
which arise from Kronecker products of reversible single-bit operations. In
circuit language, this corresponds to the parallel application of single-bit gates
and there are exactly 2 × 2 = 4 of them (either 𝑿 or 𝕀 on each bit). This number
(4) is much smaller than the total number of reversible two-bit circuits (24).
So, there must be circuits whose truth table cannot be expressed as a single
Kronecker product.
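
Both observations, the product rule in Eq. (4.1) and the count of 4 versus 24 truth tables, can be checked with a few lines of NumPy. This is only a sketch under our own naming (`perm_matrices`, `parallel`, `not_parallel`); the random integer matrices are arbitrary test inputs.

```python
from itertools import permutations, product

import numpy as np

X = np.array([[0, 1], [1, 0]])
I = np.eye(2, dtype=int)

# Eq. (4.1) on arbitrary integer test matrices (integers avoid rounding issues).
rng = np.random.default_rng(seed=1)
A, B, C, D = (rng.integers(-3, 4, size=(2, 2)) for _ in range(4))
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
print("Eq. (4.1) holds:", np.array_equal(lhs, rhs))

# All 24 reversible 2-bit truth tables, written as 4x4 permutation matrices.
perm_matrices = [np.eye(4, dtype=int)[list(p)] for p in permutations(range(4))]

# The truth tables reachable by applying X or identity to each bit in parallel.
parallel = [np.kron(G0, G1) for G0, G1 in product([I, X], repeat=2)]

# Reversible truth tables that are NOT of this parallel form.
not_parallel = [P for P in perm_matrices
                if not any(np.array_equal(P, K) for K in parallel)]
print(len(perm_matrices), len(parallel), len(not_parallel))
# prints: 24 4 20
```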
One such gate is a reversible implementation of the XOR operation, called
Controlled Not or CNOT in quantum computing. This operation flips the value
of the second bit if and only if the first bit value is 1. If the first bit
value is 0, nothing happens:

      00  01  10  11
 00    1   0   0   0
 01    0   1   0   0
 10    0   0   0   1
 11    0   0   1   0
(CNOT0→1 ’truth table’). (4.2)
This operation always requires 2 bits: a control bit and a target bit. Here the
first bit, denoted as 0, is the control bit and the second bit, denoted as 1, is the
target bit, written as 0 → 1 in the subscript of CNOT. Note that CNOT can be
viewed as a reversible implementation of the exclusive or, or XOR, operation:
CNOT0→1 (𝑏 0 , 𝑏 1 ) = (𝑏 0 , XOR (𝑏 0 , 𝑏 1 )) = (𝑏 0 , 𝑏 0 ⊕ 𝑏 1 ) .
In quantum computing, we use the following symbol to denote a 2-(qu)bit
CNOT gate:

[Circuit symbol: a filled dot on the control wire, connected by a vertical line to a ⊕ on the target wire.] (4.3)

Control and target (qu)bit are clearly singled out in this symbol. There is also
the CNOT1→0 operation, where control and target qubit are exchanged. We will
discuss it in a later subsection.
Since information is flowing from the control bit to the target bit, this gate
cannot be viewed as acting separately on the first and separately on the second
bit. It always acts simultaneously on both bits and cannot be written in any
other way if there are only two bits available.
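
The truth table (4.2) can be generated directly from the XOR rule. The sketch below is an illustration with our own helper name `cnot_0_to_1`; it places a 1 in the row of the output bit pair and the column of the input bit pair, and then checks reversibility by applying the gate twice.

```python
from itertools import product

import numpy as np


def cnot_0_to_1(b0, b1):
    """Control on the first bit, target on the second: (b0, b1) -> (b0, b0 XOR b1)."""
    return b0, b0 ^ b1


labels = list(product([0, 1], repeat=2))      # (0,0), (0,1), (1,0), (1,1)
CNOT = np.zeros((4, 4), dtype=int)
for col, bits in enumerate(labels):           # column = input bit pair
    row = labels.index(cnot_0_to_1(*bits))    # row = output bit pair
    CNOT[row, col] = 1

print(CNOT)
print("self-inverse:", np.array_equal(CNOT @ CNOT, np.eye(4, dtype=int)))
```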
4.2 Quantum operations on 2 qubits


Equipped with a new tool to express parallel operations, let us have a look at
quantum gates and quantum states.

4.2.1 Quantum gates on 2 qubits


The generalization of the Kronecker product from classical gates to quantum
gates is straightforward: As long as a gate that now acts in parallel with other
gates has a matrix representation one can always calculate the Kronecker
product of this composition.
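Example 4.2 (Kronecker product of 𝑯 and 𝒁 ). Recall the following two single-qubit
quantum gate matrix representations:
\[
\boldsymbol{H} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\quad \text{and} \quad
\boldsymbol{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]
which describe superposition and sign flip, respectively. Then, 𝑯 ⊗ 𝒁 describes
a 2-qubit circuit where 𝑯 acts on the first qubit and 𝒁 acts on the second qubit.
Its ‘truth table’ can be readily computed by invoking Definition 4.1 (Kronecker
product):
\[
\boldsymbol{H} \otimes \boldsymbol{Z}
= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & -1 & 0 & -1 \\
1 & 0 & -1 & 0 \\
0 & -1 & 0 & 1
\end{pmatrix}.
\tag{4.4}
\]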
Note that this 4 × 4 matrix is again a unitary matrix. In fact, it is its own reverse.
Using Eq. (4.1), we obtain

(𝑯 ⊗ 𝒁 ) × (𝑯 ⊗ 𝒁 ) = (𝑯 × 𝑯 ) ⊗ (𝒁 × 𝒁 ) = 𝕀 ⊗ 𝕀,

i.e. do nothing on the first qubit and do nothing on the second qubit. ■
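
The claims of Example 4.2 can be verified numerically. The following sketch assumes NumPy; the flag names `is_unitary` and `is_self_inverse` are ours.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
Z = np.array([[1, 0], [0, -1]])                # sign flip

HZ = np.kron(H, Z)                             # H on the first qubit, Z on the second

# Multiply by sqrt(2) to compare with the integer matrix in Eq. (4.4).
print(np.round(HZ * np.sqrt(2)).astype(int))

is_unitary = np.allclose(HZ.conj().T @ HZ, np.eye(4))
is_self_inverse = np.allclose(HZ @ HZ, np.eye(4))
print(is_unitary, is_self_inverse)
```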

4.2.2 Quantum states on 2 qubits


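By now, we have gathered quite some intuition about how 2-qubit circuits affect
quantum logic. To complete this treatment, only a formal definition of how a
2-qubit quantum state is expressed in the state vector formalism is missing.
Let us start this treatment with 2-qubit initialization. Recall the following
two single-qubit state vectors that correspond to initialization in 0 and
initialization in 1, respectively:
\[
|0\rangle = \boldsymbol{e}_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \in \mathbb{C}^2
\quad \text{and} \quad
|1\rangle = \boldsymbol{e}_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \in \mathbb{C}^2.
\]
We can use the Kronecker product for 2-dimensional vectors (interpreted as
2 × 1 matrices) to compute a 4 = 2²-dimensional state vector that corresponds
to initializing the first qubit in 𝑏 0 ∈ {0, 1} and the second qubit in 𝑏 1 ∈ {0, 1}:
\[
|b_0 b_1\rangle = \boldsymbol{e}_{b_0 b_1} = |b_0\rangle \otimes |b_1\rangle = \boldsymbol{e}_{b_0} \otimes \boldsymbol{e}_{b_1} \in \mathbb{C}^4
\quad \text{for } b_0 b_1 \in \{0, 1\}^2.
\tag{4.5}
\]
In other words, a 2-qubit initialization is the Kronecker product of two 1-qubit
initializations. This display succinctly summarizes a total of four possible input state vectors:
\[
\begin{aligned}
|00\rangle &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, &
|01\rangle &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \\
|10\rangle &= \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, &
|11\rangle &= \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.
\end{aligned}
\]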
Note that these four vectors are precisely the four different possibilities of a
4-dimensional vector with exactly one 1 and zero everywhere else. The position
of the single 1 is in one-to-one correspondence with the underlying bitstring.
A general 2-qubit state vector arises from applying (unitary) quantum gates
to one such initialization. This can produce complex-valued state entries, but
must not change the overall normalization of the state vector.
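
The one-to-one correspondence between bit strings and the position of the single 1 can be made explicit in code. In the sketch below (helper name `basis_state` is our own), the position turns out to be the bit string read as a binary number, 2·𝑏 0 + 𝑏 1 , counting positions from 0.

```python
import numpy as np

ket = {0: np.array([1, 0]), 1: np.array([0, 1])}   # single-qubit |0> and |1>


def basis_state(b0, b1):
    """State vector |b0 b1> = |b0> (Kronecker product) |b1>."""
    return np.kron(ket[b0], ket[b1])


for b0 in (0, 1):
    for b1 in (0, 1):
        vec = basis_state(b0, b1)
        position = int(np.argmax(vec))             # index of the single 1
        print(f"|{b0}{b1}> = {vec}, single 1 at position {position}")
```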
Definition 4.3 (2-qubit quantum state). The state of a 2-qubit system is characterized
by a 4-dimensional state vector
\[
|\psi\rangle = \boldsymbol{\psi} = \begin{pmatrix} \psi_{00} \\ \psi_{01} \\ \psi_{10} \\ \psi_{11} \end{pmatrix} \in \mathbb{C}^4.
\tag{4.6}
\]
Each state vector entry corresponds to one of the 4 = 2² possible bit strings
(00, 01, 10, 11). The following reformulation makes this precise:
\[
|\psi\rangle = \psi_{00}|00\rangle + \psi_{01}|01\rangle + \psi_{10}|10\rangle + \psi_{11}|11\rangle
= \sum_{b_0, b_1 = 0}^{1} \psi_{b_0 b_1} |b_0 b_1\rangle.
\]
The coefficients can be complex-valued numbers and must obey the normalization
condition
\[
\| |\psi\rangle \|^2 = |\psi_{00}|^2 + |\psi_{01}|^2 + |\psi_{10}|^2 + |\psi_{11}|^2 = 1
\quad \text{(state normalization)}.
\tag{4.7}
\]

Analogous to 1-qubit quantum states, the state vector entries tell us the
relative contribution of a particular 2-bit string to the current quantum logical
content. A 2-qubit readout procedure converts them into probabilities of
observing the underlying bit string.
Fact 4.4 (2-qubit readout). Let |𝜓 ⟩ ∈ ℂ⁴ be a 2-qubit state vector. Then, we can
perform a readout of both the first and second qubit. This results in a single
2-bit string 𝑜 0𝑜 1 ∈ {0, 1}² and the probability of observing a concrete value is
given by the squared modulus of the underlying state vector entry (Born’s rule):
\[
\Pr_{|\psi\rangle}[o_0 o_1] = |\psi_{o_0 o_1}|^2 \quad \text{for } o_0 o_1 \in \{0, 1\}^2.
\tag{4.8}
\]
Eq. (4.8) succinctly tabulates the following four outcome probabilities when
reading out a general 2-qubit state |𝜓 ⟩ :
\[
\begin{aligned}
\Pr_{|\psi\rangle}[00] &= \Pr_{|\psi\rangle}[o_0 = 0, o_1 = 0] = |\langle 00|\psi\rangle|^2 = |\psi_{00}|^2, \\
\Pr_{|\psi\rangle}[01] &= \Pr_{|\psi\rangle}[o_0 = 0, o_1 = 1] = |\langle 01|\psi\rangle|^2 = |\psi_{01}|^2, \\
\Pr_{|\psi\rangle}[10] &= \Pr_{|\psi\rangle}[o_0 = 1, o_1 = 0] = |\langle 10|\psi\rangle|^2 = |\psi_{10}|^2, \\
\Pr_{|\psi\rangle}[11] &= \Pr_{|\psi\rangle}[o_0 = 1, o_1 = 1] = |\langle 11|\psi\rangle|^2 = |\psi_{11}|^2.
\end{aligned}
\]
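
The readout rule of Fact 4.4 translates into a few lines of code. The sketch below is an illustration only: the function name `readout_probabilities`, the example state and the seed are assumptions of ours, and the sampling step uses NumPy's random generator to mimic repeated readouts.

```python
import numpy as np


def readout_probabilities(psi):
    """Map a normalized 2-qubit state vector to outcome probabilities (Born's rule)."""
    psi = np.asarray(psi, dtype=complex)
    assert np.isclose(np.linalg.norm(psi), 1.0), "state vector must be normalized"
    probs = np.abs(psi) ** 2
    # Entry i belongs to the bit string given by i written as a 2-digit binary number.
    return {f"{i:02b}": float(p) for i, p in enumerate(probs)}


# Hypothetical example state: 0.6 |00> + 0.8i |11>.
psi = np.array([0.6, 0.0, 0.0, 0.8j])
probs = readout_probabilities(psi)
print(probs)

# Simulate 10 independent readouts of freshly prepared copies of this state.
rng = np.random.default_rng(seed=42)
samples = rng.choice(list(probs), size=10, p=list(probs.values()))
print(samples)
```

Note that the complex phase of the last entry has no influence on the probabilities, since only the squared modulus enters.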

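Analogous to the already discussed gates, two single-qubit states can be
paired together via the Kronecker product. This is essentially a generalization
of our construction of 2-qubit initialization.
Example 4.5 (2-qubit quantum state vector from two 1-qubit states). We can combine
two general single-qubit quantum states
\[
|\psi_0\rangle = \begin{pmatrix} \psi_{0,0} \\ \psi_{0,1} \end{pmatrix} \in \mathbb{C}^2
\quad \text{and} \quad
|\psi_1\rangle = \begin{pmatrix} \psi_{1,0} \\ \psi_{1,1} \end{pmatrix} \in \mathbb{C}^2
\]
to a single 2-qubit state vector
\[
|\psi\rangle = |\psi_0\rangle \otimes |\psi_1\rangle
= \begin{pmatrix} \psi_{0,0} \\ \psi_{0,1} \end{pmatrix} \otimes \begin{pmatrix} \psi_{1,0} \\ \psi_{1,1} \end{pmatrix}
= \begin{pmatrix} \psi_{0,0}\psi_{1,0} \\ \psi_{0,0}\psi_{1,1} \\ \psi_{0,1}\psi_{1,0} \\ \psi_{0,1}\psi_{1,1} \end{pmatrix} \in \mathbb{C}^4.
\tag{4.9}
\]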

Every 2-qubit system that is comprised of two independent single-qubit
state vectors can be written in this fashion. If, for example, the circuit starts
in a state initialized from classical bits and the gates applied do not transfer any
information from one qubit to the other, the action of the gate (𝑼 ⊗ 𝑽 ) is fully
captured by letting 𝑼 only act on the first qubit and 𝑽 only act on the second
qubit,

(𝑼 ⊗ 𝑽 ) (𝝍 0 ⊗ 𝝍 1 ) = 𝑼𝝍 0 ⊗ 𝑽 𝝍 1 . (4.10)

But not every 2-qubit quantum state is of this particular form! As discussed in
the classical 2-bit gate section, there are gates whose action cannot be described
as a single gate acting on the first qubit and another single gate acting on
the second qubit. As in the classical case, the most prominent candidate is
the quantum CNOT gate, whose matrix representation is indeed the same;
see the truth table (4.2) and the circuit symbol (4.3). In quantum
computing, this gate and its properties have huge implications for
computing power, which will be discussed in more
detail in the next lecture. Much like in classical circuit logic, such operations
allow us to conditionally modify qubits. This is essential to build more complicated
functionalities out of simple elementary building blocks.
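
Eq. (4.10) can be checked on a concrete instance. The sketch below picks 𝑼 = 𝑯 and 𝑽 = 𝑻 and the initialization | 01⟩ purely as an illustrative choice of ours; any other pair of single-qubit gates and states would do.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)        # Hadamard gate
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])  # T gate

psi0 = np.array([1, 0])   # first qubit initialized in |0>
psi1 = np.array([0, 1])   # second qubit initialized in |1>

# Left-hand side of Eq. (4.10): act with the 4x4 matrix on the 4-dim state vector.
lhs = np.kron(H, T) @ np.kron(psi0, psi1)

# Right-hand side: act on each qubit separately, then combine.
rhs = np.kron(H @ psi0, T @ psi1)

print(np.allclose(lhs, rhs))
```

The right-hand side only ever multiplies 2 × 2 matrices with 2-dimensional vectors, which is why product states under parallel gates can be tracked qubit by qubit.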

4.2.3 Universal 2-qubit gate sets


In the last lecture, we saw that two single-qubit gates, namely the Hadamard
gate 𝑯 and the 𝑻 -gate, are universal in the sense that any 2 × 2 unitary gate
action matrix can be approximated (to arbitrary accuracy) by a single-qubit
circuit comprised of only these two gates. An extension of this result to 2-qubit
circuits must contain at least one genuine 2-qubit gate (i.e. a functionality that
cannot be decomposed into a Kronecker product of two single-qubit gates), and
the CNOT gate comes to mind here. Remarkably, this is all we need: paired
with a universal 1-qubit gate set, the CNOT gate forms a universal 2-qubit gate
set. This means that every 2-qubit unitary 𝑼 ∈ ℂ^{4×4} with 16 entries can be
approximated, to arbitrary precision, by circuits built from only 3 types of gates: the 1-qubit Hadamard gate
𝑯 , the 1-qubit T-gate 𝑻 , and the CNOT gate 𝑪 𝑵 𝑶𝑻 acting on two qubits.

Theorem 4.6 (efficient 2-qubit synthesis). Let G be a universal 2-qubit gate
setᵃ, e.g. G = {𝑯 ,𝑻 , 𝑪 𝑵 𝑶𝑻 }, and let 𝜀 ′ be a desired approximation
accuracy. Then, for every unitary 4 × 4 matrix 𝑼 realized by a circuit containing 𝑚 𝑪 𝑵 𝑶𝑻
gates and arbitrarily many single-qubit gates, there exists a sequence of (at
most)
\[
D = O\left(m \log^c(m/\varepsilon')\right) \tag{4.11}
\]
gates that approximates the action of 𝑼 up to accuracy 𝜀 ′ . Here, 𝑐 ∈
[1, 3 + 𝑜 ( 1)] is a constant that depends on the universal gate set in question.
ᵃ The original statement also requires that this gate set either contains inverses or can
generate them in a constant number of steps.

This theorem resembles Theorem 3.12 for universal 1-qubit gate sets, with
the big difference that the number of occurrences of one distinguished gate, the CNOT gate,
plays a role in the depth of the approximating circuit. This may at first be
surprising, since no gate received this kind of special treatment in
the 1-qubit case. To better understand the theorem,
we look at an arbitrary circuit consisting of single-qubit
unitaries and 𝑚 CNOT gates in arbitrary order.
Taking a closer look at the role of the CNOT gate, it becomes clear that every
application of it is a disruptive operation that conditionally modifies one
qubit state. If we zoom in on one wire, say the wire of the second
qubit, all the single-qubit gate operations between two CNOT gates can be
combined into one unitary (by simply calculating their matrix product).
This arbitrary unitary can now be approximated with O ( log𝑐 ( 1/𝜀)) many 𝑯
and 𝑻 gates to a precision of 𝜀 . But a CNOT cannot be written as a unitary
acting solely on the second qubit, meaning it cannot simply be absorbed into a
single-qubit unitary which is then approximated. The CNOT changes the qubit
state conditioned on the first one, so after the CNOT the next block of single-qubit
gates has to be approximated separately, again with O ( log𝑐 ( 1/𝜀)) gates.
This procedure continues until all blocks of single-qubit gates between the 𝑚 CNOT
gates are approximated. Since the errors of the individual blocks add up, the total
precision is of order 𝜀 ′ = 𝑚𝜀 . Choosing 𝜀 = 𝜀 ′ /𝑚 for each block then reproduces the
gate count in Eq. (4.11). Note that here we
use two different precision variables, namely 𝜀 for the single-qubit unitary
precision and 𝜀 ′ for the whole 2-qubit unitary precision.

Figure 4.6 Flipping CNOT control and target using 4 Hadamard gates. This
circuit identity shows that the target and control qubits can be interchanged
by applying a single-qubit Hadamard gate on every qubit before and after the
CNOT gate.

4.2.4 Example 1: CNOT with control and target flipped


We will now introduce a very useful (and perhaps somewhat surprising) circuit
identity displayed in Fig. 4.6. It states that we can flip the direction of a
CNOT gate action by sandwiching it in between four Hadamard (superposition)
gates. By “flipping”, we mean that we exchange control and target qubit of
the logical operation. Let us verify this identity by direct computations that
involve Kronecker products ( ⊗ ) for parallel gate application and ordinary matrix
products (×) for sequential gate combinations:
\[
\begin{aligned}
(\boldsymbol{H} \otimes \boldsymbol{H})\, \boldsymbol{CNOT}\, (\boldsymbol{H} \otimes \boldsymbol{H})
&= \frac{1}{2} \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix}
\frac{1}{2} \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix} \\
&= \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0
\end{pmatrix}.
\end{aligned}
\]
Here, the first line collects Eqs. (4.12) and (4.13), and the final matrix is Eq. (4.14).

Figure 4.7 Random number generator with 2 qubits. This quantum circuit accepts
two classical bits, 𝑏 ∈ {0, 1} which initialize the two quantum states in either
| 0⟩ or | 1⟩ . Applying a Hadamard to both qubits and then measuring results in
two independent random bits.

Basic 2-bit logic now allows us to recognize the truth table of CNOT1→0 – i.e.
an XOR operation on the first bit – in the final matrix:

CNOT1→0 ( 0, 0) = ( XOR ( 0, 0), 0) = ( 0, 0) ,
CNOT1→0 ( 0, 1) = ( XOR ( 0, 1), 1) = ( 1, 1) ,
CNOT1→0 ( 1, 0) = ( XOR ( 1, 0), 0) = ( 1, 0) ,
CNOT1→0 ( 1, 1) = ( XOR ( 1, 1), 1) = ( 0, 1) .

This result implies that by acting locally on the two qubits we can change the
flow of information!
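
The circuit identity of Fig. 4.6 is easy to confirm numerically. The following sketch (variable names are ours, NumPy assumed) multiplies the three layers and compares the result with the truth table of CNOT1→0 written down by hand.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

CNOT_01 = np.array([[1, 0, 0, 0],    # control: first qubit, target: second qubit
                    [0, 1, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]])

CNOT_10 = np.array([[1, 0, 0, 0],    # control: second qubit, target: first qubit
                    [0, 0, 0, 1],
                    [0, 0, 1, 0],
                    [0, 1, 0, 0]])

HH = np.kron(H, H)                   # Hadamard on both qubits in parallel
flipped = HH @ CNOT_01 @ HH          # sequential combination via matrix products

print(np.round(flipped).astype(int))
print(np.allclose(flipped, CNOT_10))
```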

4.2.5 Example 2: a two-bit random number generator


Let us now extend our example of a quantum random number generator from the
second lecture, Example 2.8, to 2 qubits. Consider the simple circuit in Fig. 4.7,
where we initialize both qubits in the | 1⟩ state (for 𝑏 = 1), apply a Hadamard gate
to each, and measure.
The state just before the measurement is:
\[
\begin{aligned}
|\psi\rangle = \begin{pmatrix} \psi_{00} \\ \psi_{01} \\ \psi_{10} \\ \psi_{11} \end{pmatrix}
&= (\boldsymbol{H} \otimes \boldsymbol{H})\,(|1\rangle \otimes |1\rangle) \\
&= \left[ \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \otimes \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \right]
\left[ \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] \\
&= \begin{pmatrix}
1/2 & 1/2 & 1/2 & 1/2 \\
1/2 & -1/2 & 1/2 & -1/2 \\
1/2 & 1/2 & -1/2 & -1/2 \\
1/2 & -1/2 & -1/2 & 1/2
\end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1/2 \\ -1/2 \\ -1/2 \\ 1/2 \end{pmatrix}.
\end{aligned}
\]
Now recall that the outcome probabilities are the squared magnitudes of
the state vector entries 𝜓00 , 𝜓01 , 𝜓10 , 𝜓11 . Analogous to the rule
for single-qubit readout (Definition 2.7), we now compute:
\[
\begin{aligned}
\Pr_{|\psi\rangle}[00] &= |\langle 00|\psi\rangle|^2 = |\psi_{00}|^2 = |1/2|^2 = 1/4, \\
\Pr_{|\psi\rangle}[01] &= |\langle 01|\psi\rangle|^2 = |\psi_{01}|^2 = |-1/2|^2 = 1/4, \\
\Pr_{|\psi\rangle}[10] &= |\langle 10|\psi\rangle|^2 = |\psi_{10}|^2 = |-1/2|^2 = 1/4, \\
\Pr_{|\psi\rangle}[11] &= |\langle 11|\psi\rangle|^2 = |\psi_{11}|^2 = |1/2|^2 = 1/4.
\end{aligned}
\]
We see that all 4 possible outcomes are equally likely. Let us now focus
only on the first bit, without loss of generality. The probability of measuring
outcome 0 on it is given by
\[
\Pr[o_0 = 0] = \sum_{k=0}^{1} \Pr[o_0 = 0, o_1 = k] = 1/4 + 1/4 = 1/2.
\]
This is just the definition of a perfect random bit \(o \overset{\mathrm{unif}}{\sim} \{0, 1\}\), which we saw
in Example 2.8. We now have a device which can generate 2 random bits
simultaneously. It is important to note that these 2 bits are independent of each
other. This is not always the case and in the next two lectures, we will see
examples of creating correlations with quantum circuits which cannot be done
classically.
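
The two-bit random number generator can be simulated in a few lines. The sketch below is only illustrative (variable names and the seed are ours): it reproduces the state vector, the four outcome probabilities, and checks independence by comparing the joint distribution with the product of its marginals.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
ket1 = np.array([0, 1])

# State just before the readout: (H ⊗ H)(|1> ⊗ |1>).
psi = np.kron(H, H) @ np.kron(ket1, ket1)
probs = np.abs(psi) ** 2                     # order: 00, 01, 10, 11

joint = probs.reshape(2, 2)                  # joint[o0, o1]
marginal_first = joint.sum(axis=1)           # distribution of o0
marginal_second = joint.sum(axis=0)          # distribution of o1
independent = np.allclose(joint, np.outer(marginal_first, marginal_second))

print(psi, probs, marginal_first, independent)

# Draw a few simulated readouts (each one requires a fresh run of the circuit).
rng = np.random.default_rng(seed=7)
outcomes = [f"{i:02b}" for i in rng.choice(4, size=8, p=probs)]
print(outcomes)
```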
5. Bell states & Superdense Coding

Written by: Kristina Kirova & Jadwiga Wilkens


Date: 30 October 2024
Revised by: Richard Kueng

Agenda:
1 Bell States
2 Equivalence Theorem
3 Superdense Coding
4 Implications for circuit verification & learning

Today, we will start analyzing and exploiting the extraordinary correlations
that become possible when working with quantum circuits. As we shall see, the
(joint) quantum state of two qubits (𝑛 = 2) can be correlated in ways that are
inconceivable from a classical perspective. This has both important conceptual
implications, as well as practical ones.

5.1 Motivation: The Bell state
Our starting point is the output of the following, seemingly innocuous, quantum
circuit, Fig. 5.1. It contains two qubits, one Hadamard (superposition) gate
and a CNOT (classical gate).
We can use the unitary matrix framework to compute all amplitudes of the
resulting quantum state:
\[
\begin{aligned}
|\psi_{\mathrm{Bell}}\rangle &= \boldsymbol{CNOT}\,(\boldsymbol{H} \otimes \mathbb{I})\,|00\rangle \\
&= \boldsymbol{CNOT}\,(\boldsymbol{H}|0\rangle \otimes |0\rangle) \\
&= \boldsymbol{CNOT}\left( \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes |0\rangle \right) \\
&= \tfrac{1}{\sqrt{2}}\,\boldsymbol{CNOT}\,|00\rangle + \tfrac{1}{\sqrt{2}}\,\boldsymbol{CNOT}\,|10\rangle \\
&= \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle).
\end{aligned}
\tag{5.1}
\]

Figure 5.1 Circuit generating a Bell state. This quantum circuit initializes both
qubits in 0, then applies Hadamard to the first qubit, followed by a CNOT.

Let us now compute the probability of each readout outcome in the same
manner as we did for the single-qubit quantum random number generator (see
Example 2.8 in Lecture 2):
\[
\begin{aligned}
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 0] &= |\langle 00|\psi_{\mathrm{Bell}}\rangle|^2 = |\psi_{00}|^2 = |1/\sqrt{2}|^2 = 1/2, \\
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 1] &= |\langle 01|\psi_{\mathrm{Bell}}\rangle|^2 = |\psi_{01}|^2 = |0|^2 = 0, \\
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 0] &= |\langle 10|\psi_{\mathrm{Bell}}\rangle|^2 = |\psi_{10}|^2 = |0|^2 = 0, \\
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 1] &= |\langle 11|\psi_{\mathrm{Bell}}\rangle|^2 = |\psi_{11}|^2 = |1/\sqrt{2}|^2 = 1/2.
\end{aligned}
\]
Out of the four outcome probabilities, only two are nonzero, and both are equal to 1/2.
This looks a lot like the value of two random coins that are perfectly correlated
with each other. Both coins always show the same value (0 or 1) with a
probability of 1/2 each.
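
The Bell state computation in Eq. (5.1) can be replayed with explicit matrices. This is a minimal sketch assuming NumPy; the variable names are our own.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

ket00 = np.array([1, 0, 0, 0])

# First H on the first qubit (in parallel with identity), then CNOT.
psi_bell = CNOT @ np.kron(H, I) @ ket00
probs_bell = np.abs(psi_bell) ** 2           # order: 00, 01, 10, 11

print(psi_bell)
print(probs_bell)
```

Only the outcomes 00 and 11 receive nonzero probability, so the two readout bits always agree.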

5.1.1 Stronger than classical correlations


The above result is nice, but does not look very revolutionary yet, as one can
also imitate such correlated randomness by first tossing a fair coin and then
communicating the value to both readout locations. What is so special about
this Bell state then? Let us now apply the Hadamard gate to one or both qubits.
Fig. 5.2 summarizes the four possibilities as circuit diagrams. Case (a) (𝕀 on first
qubit and 𝕀 on second qubit) is easy. We already did that above and observed
perfect correlations between the two readout bits:
\[
\text{Case (a):} \quad o_1 = o_0 \overset{\mathrm{unif}}{\sim} \{0, 1\}.
\]

Figure 5.2 Four simple circuits including the Bell state. (a) Bell state followed
by a measurement. Outcomes are perfectly correlated. (b) A Hadamard gate
applied to the first qubit destroys the correlation. (c) A Hadamard gate applied
to the second qubit also destroys the correlation. (d) A Hadamard gate applied
to both qubits preserves the correlation.

Case (b) (𝑯 on first qubit, 𝕀 on second qubit) pilots us into new territory. We
can use the matrix-vector framework to compute the 2-qubit state vector just
before the readout stage:
\[
\begin{aligned}
|\psi_{\mathrm{final(b)}}\rangle &= (\boldsymbol{H} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})\,\tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle) \\
&= \tfrac{1}{\sqrt{2}} \left( \tfrac{|0\rangle + |1\rangle}{\sqrt{2}}\,|0\rangle + \tfrac{|0\rangle - |1\rangle}{\sqrt{2}}\,|1\rangle \right) \\
&= \tfrac{1}{2}(|00\rangle + |10\rangle + |01\rangle - |11\rangle).
\end{aligned}
\]
This final state features all possible bit configurations with equal-sized weights
and a negative sign that flags | 11⟩ . Performing the readout absorbs these sign
differences and we obtain
\[
\begin{aligned}
\Pr_{|\psi_{\mathrm{final(b)}}\rangle}[o_0 = 0, o_1 = 0] &= |\langle 00|\psi_{\mathrm{final(b)}}\rangle|^2 = |{+1/2}|^2 = 1/4, \\
\Pr_{|\psi_{\mathrm{final(b)}}\rangle}[o_0 = 0, o_1 = 1] &= |\langle 01|\psi_{\mathrm{final(b)}}\rangle|^2 = |{+1/2}|^2 = 1/4, \\
\Pr_{|\psi_{\mathrm{final(b)}}\rangle}[o_0 = 1, o_1 = 0] &= |\langle 10|\psi_{\mathrm{final(b)}}\rangle|^2 = |{+1/2}|^2 = 1/4, \\
\Pr_{|\psi_{\mathrm{final(b)}}\rangle}[o_0 = 1, o_1 = 1] &= |\langle 11|\psi_{\mathrm{final(b)}}\rangle|^2 = |{-1/2}|^2 = 1/4.
\end{aligned}
\]
This is the signature of two independent and uniformly random readout bits:
\[
\text{Case (b):} \quad o_0 \overset{\mathrm{unif}}{\sim} \{0, 1\} \quad \text{and} \quad o_1 \overset{\mathrm{unif}}{\sim} \{0, 1\}.
\]
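The third case (c) (𝕀 on the first qubit and 𝑯 on the second qubit) paints a
picture that is strikingly similar to case (b):
\[
\begin{aligned}
|\psi_{\mathrm{final(c)}}\rangle &= (\mathbb{I} \otimes \boldsymbol{H})\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\mathbb{I} \otimes \boldsymbol{H})\,\tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle) \\
&= \tfrac{1}{\sqrt{2}} \left( |0\rangle\,\tfrac{|0\rangle + |1\rangle}{\sqrt{2}} + |1\rangle\,\tfrac{|0\rangle - |1\rangle}{\sqrt{2}} \right) \\
&= \tfrac{1}{2}(|00\rangle + |10\rangle + |01\rangle - |11\rangle) \\
&= |\psi_{\mathrm{final(b)}}\rangle.
\end{aligned}
\]
Note that this final state vector is identical to the one from case (b). In turn,
performing the readout also yields the same result:
\[
\text{Case (c):} \quad o_0 \overset{\mathrm{unif}}{\sim} \{0, 1\} \quad \text{and} \quad o_1 \overset{\mathrm{unif}}{\sim} \{0, 1\}.
\]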

The final case (d) (apply 𝑯 to the first qubit and apply 𝑯 to the second qubit)
produces a final state vector that is identical to case (a) (do nothing at all).

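\[
\begin{aligned}
|\psi_{\mathrm{final}}^{(d)}\rangle &= (\boldsymbol{H} \otimes \boldsymbol{H})\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\boldsymbol{H} \otimes \boldsymbol{H})\,\tfrac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right) \\
&= \tfrac{1}{\sqrt{2}}\left(\tfrac{|0\rangle + |1\rangle}{\sqrt{2}} \otimes \tfrac{|0\rangle + |1\rangle}{\sqrt{2}} + \tfrac{|0\rangle - |1\rangle}{\sqrt{2}} \otimes \tfrac{|0\rangle - |1\rangle}{\sqrt{2}}\right) \\
&= \tfrac{1}{2\sqrt{2}}\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle + |00\rangle - |01\rangle - |10\rangle + |11\rangle\right) \\
&= \tfrac{1}{2\sqrt{2}}\left(2\,|00\rangle + 2\,|11\rangle\right) \\
&= \tfrac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right) \\
&= |\psi_{\mathrm{Bell}}\rangle.
\end{aligned}
\]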

In turn, we again obtain perfect correlations when performing the readout:


\[
\text{Case (d):} \quad o_1 = o_0 \overset{\mathrm{unif}}{\sim} \{0,1\}.
\]

This is surprising and without a classical counterpart! A single-qubit Hadamard gate is designed to split a deterministic bit configuration into an equal-weight superposition of both bit values. The above case study shows that this intuition only works partially, namely if we apply a single Hadamard gate to one qubit and do nothing to the other qubit. If we instead apply Hadamard gates to both qubits, their actions seem to cancel out completely, even though they act independently on two separate qubits! What is happening here? Is this because of the Hadamard gate or because of specific properties of the Bell state?
What is more, there is no classical device that behaves in the same way under such conditions: it would have to produce perfectly correlated bits, then random bits if a Hadamard gate is applied to either register, and yet correlated bits again if Hadamard gates are applied to both. This is what we call stronger-than-classical correlations, and you will see more examples of this in future lectures.

Figure 5.3 Local equivalence of Bell state. Applying a T gate on either of the two qubits in a Bell state leads to the same final quantum state.

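
The four cases of Fig. 5.2 can also be checked numerically. The following is a minimal sketch in plain Python (no external libraries); the helper names `kron`, `apply` and `outcome_probabilities` are illustrative choices and not part of the lecture notes. Amplitudes are ordered as |00⟩, |01⟩, |10⟩, |11⟩.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
bell = [s, 0, 0, s]  # (|00> + |11>) / sqrt(2)

# The four circuits of Fig. 5.2: (gate on first qubit) x (gate on second qubit).
circuits = {
    "a": kron(I2, I2),
    "b": kron(H, I2),
    "c": kron(I2, H),
    "d": kron(H, H),
}

def outcome_probabilities(gates):
    """Readout probabilities for the outcomes 00, 01, 10, 11."""
    return [abs(amplitude) ** 2 for amplitude in apply(gates, bell)]

for label, gates in circuits.items():
    print(label, [round(p, 3) for p in outcome_probabilities(gates)])
```

Cases (a) and (d) put all weight on the outcomes 00 and 11, while cases (b) and (c) spread it evenly over all four outcomes, in line with the pen-and-paper calculation.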
For now, let us do a quick calculation with a different gate to see whether a
similar situation also arises.
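Example 5.1 (T gate on one qubit of a Bell state while doing nothing to the other). Let us now consider the two circuits depicted in Fig. 5.3 and compute the final two-qubit state vector just prior to the readout stage. For the first circuit (𝑻 on the first qubit, 𝕀 on the second qubit), we obtain

\[
\begin{aligned}
|\psi_{\mathrm{final-left}}\rangle &= (\boldsymbol{T} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\boldsymbol{T} \otimes \mathbb{I})\,\tfrac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right) \\
&= \tfrac{1}{\sqrt{2}}\left((\boldsymbol{T} \otimes \mathbb{I})\,|00\rangle + (\boldsymbol{T} \otimes \mathbb{I})\,|11\rangle\right) \\
&= \tfrac{1}{\sqrt{2}}\left((\boldsymbol{T}|0\rangle) \otimes (\mathbb{I}|0\rangle) + (\boldsymbol{T}|1\rangle) \otimes (\mathbb{I}|1\rangle)\right) \\
&= \tfrac{1}{\sqrt{2}}\left((1 \times |0\rangle) \otimes (+1 \times |0\rangle) + \left(\mathrm{e}^{\mathrm{i}\pi/4} \times |1\rangle\right) \otimes (+1 \times |1\rangle)\right) \\
&= \tfrac{1}{\sqrt{2}}\left(1 \times 1 \times |00\rangle + \mathrm{e}^{\mathrm{i}\pi/4} \times 1 \times |11\rangle\right) \\
&= \tfrac{1}{\sqrt{2}}\left(|00\rangle + \mathrm{e}^{\mathrm{i}\pi/4}\,|11\rangle\right).
\end{aligned}
\]

This final result should not come as a complete surprise. The T gate does nothing to |0⟩ and multiplies the |1⟩ state by the phase factor e^{i𝜋/4}. Viewed from this angle, it is also not really a surprise that the second circuit in Fig. 5.3 produces an identical final state:

\[
|\psi_{\mathrm{final-right}}\rangle = (\mathbb{I} \otimes \boldsymbol{T})\,|\psi_{\mathrm{Bell}}\rangle = \tfrac{1}{\sqrt{2}}\left(|00\rangle + \mathrm{e}^{\mathrm{i}\pi/4}\,|11\rangle\right).
\]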
We leave a detailed derivation – which follows along the same steps as before
– as an instructive exercise in pen-and-paper calculation of quantum circuit
actions.
For now, we tabulate what we just learned: applying the T-gate either to
the first or the second qubit of a Bell state produces the same final state vector.
In formulas:
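\[
(\boldsymbol{T} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = (\mathbb{I} \otimes \boldsymbol{T})\,|\psi_{\mathrm{Bell}}\rangle. \tag{5.2}
\]

It is worthwhile to summarize two noteworthy results that we just obtained by direct calculation:

\[
(\boldsymbol{H} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = (\mathbb{I} \otimes \boldsymbol{H})\,|\psi_{\mathrm{Bell}}\rangle, \tag{5.3}
\]
\[
(\boldsymbol{T} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = (\mathbb{I} \otimes \boldsymbol{T})\,|\psi_{\mathrm{Bell}}\rangle. \tag{5.4}
\]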

These are some of the properties that make the Bell State so special and worth
dedicating a whole lecture to! The above observations are a signature of a
more general and remarkable feature of the Bell state.

Theorem 5.2 (gate equivalence for Bell states). Let 𝑼 be an arbitrary single-qubit unitary matrix. Then,

\[
(\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = (\mathbb{I} \otimes \boldsymbol{U}^{T})\,|\psi_{\mathrm{Bell}}\rangle,
\]

where ᵀ denotes matrix transposition (not the adjoint: entries are not complex-conjugated!).

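Proof. Recall Theorem 3.11, which states that {𝑯, 𝑻} is a universal single-qubit gate set. Hence, any arbitrary unitary gate 𝑼 can be written as 𝑼 = 𝑽_𝑁 ⋯ 𝑽_1 𝑽_0 with 𝑽_𝑘 ∈ {𝑯, 𝑻} for some 𝑁 ≥ 1. Let 𝑼 act on the first qubit of the Bell state:

\[
\begin{aligned}
(\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle &= \left((\boldsymbol{V}_N \cdots \boldsymbol{V}_1 \boldsymbol{V}_0) \otimes \mathbb{I}\right)|\psi_{\mathrm{Bell}}\rangle \\
&= (\boldsymbol{V}_N \otimes \mathbb{I}) \cdots (\boldsymbol{V}_1 \otimes \mathbb{I})\,(\boldsymbol{V}_0 \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\boldsymbol{V}_N \otimes \mathbb{I}) \cdots (\boldsymbol{V}_1 \otimes \mathbb{I})\,(\mathbb{I} \otimes \boldsymbol{V}_0)\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\mathbb{I} \otimes \boldsymbol{V}_0)\,(\boldsymbol{V}_N \otimes \mathbb{I}) \cdots (\boldsymbol{V}_1 \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle,
\end{aligned}
\]

where we have used Eq. (5.2) or Eq. (5.3) to move the first gate 𝑽_0 ∈ {𝑯, 𝑻} from the top wire to the bottom wire. The last line is a simple commutation operation: gates that act on different qubits commute. Repeating this step for each of the remaining gates 𝑽_1, …, 𝑽_𝑁 yields

\[
\begin{aligned}
(\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle &= (\mathbb{I} \otimes \boldsymbol{V}_0)\,(\mathbb{I} \otimes \boldsymbol{V}_1) \cdots (\mathbb{I} \otimes \boldsymbol{V}_N)\,|\psi_{\mathrm{Bell}}\rangle \\
&= \left(\mathbb{I} \otimes (\boldsymbol{V}_0 \boldsymbol{V}_1 \cdots \boldsymbol{V}_N)\right)|\psi_{\mathrm{Bell}}\rangle.
\end{aligned} \tag{5.5}
\]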

Figure 5.4 Visualization of Corollary 5.3: The following three circuits are all
equivalent in the sense that they produce the same final quantum state. There
is no way of distinguishing these circuits.

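The sequence of gates (𝑽_0 𝑽_1 ⋯ 𝑽_𝑁) looks like the gate decomposition of 𝑼, but in reverse order. This reverse ordering can be expressed via the transpose operation. Indeed, 𝑽_𝑘ᵀ = 𝑽_𝑘, because 𝑯ᵀ = 𝑯 and 𝑻ᵀ = 𝑻. In turn,

\[
\boldsymbol{U}^{T} = (\boldsymbol{V}_N \cdots \boldsymbol{V}_1 \boldsymbol{V}_0)^{T} = \boldsymbol{V}_0^{T} \boldsymbol{V}_1^{T} \cdots \boldsymbol{V}_N^{T} = \boldsymbol{V}_0 \boldsymbol{V}_1 \cdots \boldsymbol{V}_N,
\]

and inserting this back into Eq. (5.5) produces the desired equivalence:

\[
(\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = (\mathbb{I} \otimes \boldsymbol{U}^{T})\,|\psi_{\mathrm{Bell}}\rangle. \qquad ■
\]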
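
To see why the transpose (and not the adjoint) is the right operation, it helps to test the statement on a gate that is both complex-valued and non-symmetric. The sketch below uses 𝑼 = 𝑻𝑯 as such a test gate; the helper functions and variable names are illustrative assumptions, not notation from the lecture.

```python
import cmath
import math

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def adjoint(A):
    """Conjugate transpose."""
    return [[complex(x).conjugate() for x in col] for col in zip(*A)]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
bell = [s, 0, 0, s]  # (|00> + |11>) / sqrt(2)

U = matmul(T, H)  # test gate: complex-valued and not symmetric

on_first = apply(kron(U, I2), bell)
transpose_on_second = apply(kron(I2, transpose(U)), bell)
adjoint_on_second = apply(kron(I2, adjoint(U)), bell)

print("U on qubit 0 equals U^T on qubit 1:    ", close(on_first, transpose_on_second))
print("U on qubit 0 equals U^dagger on qubit 1:", close(on_first, adjoint_on_second))
```

The first comparison succeeds and the second one fails, because the adjoint additionally conjugates the phase e^{i𝜋/4} contributed by the T gate.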



Here is an immediate consequence of Theorem 5.2 that justifies the particular
result of circuit Fig. 5.2 (d) where applying Hadamard gates to both qubits did
not change the final state.
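Corollary 5.3 Let 𝑼, 𝑽 be arbitrary single-qubit gates. Then,

\[
(\boldsymbol{U} \otimes \boldsymbol{V})\,|\psi_{\mathrm{Bell}}\rangle = \left((\boldsymbol{U} \times \boldsymbol{V}^{T}) \otimes \mathbb{I}\right)|\psi_{\mathrm{Bell}}\rangle = \left(\mathbb{I} \otimes (\boldsymbol{V} \times \boldsymbol{U}^{T})\right)|\psi_{\mathrm{Bell}}\rangle.
\]

We refer to Fig. 5.4 for a visualization of this equivalence relation.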


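Ultimately, the three circuits shown in Fig. 5.4 all produce the same quantum state. This also explains why applying Hadamard gates to both qubits of the Bell state did nothing to the state, Fig. 5.2(d):

\[
(\boldsymbol{H} \otimes \boldsymbol{H})\,|\psi_{\mathrm{Bell}}\rangle = \left((\boldsymbol{H} \times \boldsymbol{H}^{T}) \otimes \mathbb{I}\right)|\psi_{\mathrm{Bell}}\rangle = (\mathbb{I} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle,
\]

where the last step uses 𝑯ᵀ = 𝑯 and 𝑯 × 𝑯 = 𝕀.
It also looks like we have found a way to “teleport” gates from one of the qubits in the Bell state to the other. This is a very powerful property which we will use intensively in the future.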

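Proof of Corollary 5.3. Both claims follow from decomposing 𝑼 ⊗ 𝑽 either as (𝑼 ⊗ 𝕀) × (𝕀 ⊗ 𝑽) or as (𝕀 ⊗ 𝑽) × (𝑼 ⊗ 𝕀) and subsequently applying Theorem 5.2. For the second decomposition, we can apply Theorem 5.2 to conclude

\[
\begin{aligned}
(\boldsymbol{U} \otimes \boldsymbol{V})\,|\psi_{\mathrm{Bell}}\rangle &= (\mathbb{I} \otimes \boldsymbol{V}) \times (\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle \\
&= (\mathbb{I} \otimes \boldsymbol{V}) \times (\mathbb{I} \otimes \boldsymbol{U}^{T})\,|\psi_{\mathrm{Bell}}\rangle \\
&= \left(\mathbb{I} \otimes (\boldsymbol{V} \times \boldsymbol{U}^{T})\right)|\psi_{\mathrm{Bell}}\rangle.
\end{aligned}
\]

The other reformulation can be obtained in an analogous fashion after decomposing 𝑼 ⊗ 𝑽 as (𝑼 ⊗ 𝕀) × (𝕀 ⊗ 𝑽). ■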

Figure 5.5 Circuit generating all the four Bell states. Depending on the input bits,
𝑏 , a different state is generated.

5.2 More Bell states


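So far we have been analysing the simple circuit which produces the Bell state starting with both qubits initialized in the |0⟩ state. What happens if we were to choose different inputs? There are 2^2 = 4 different choices, one for each input string of length 𝑛 = 2, as depicted in Fig. 5.5.
The 4 resulting Bell-type state vectors are

\begin{align}
|\psi_{\mathrm{Bell}}(0,0)\rangle &= \boldsymbol{CNOT}\,(\boldsymbol{H} \otimes \mathbb{I})\,|0,0\rangle = \tfrac{1}{\sqrt{2}}\left(|0,0\rangle + |1,1\rangle\right), \tag{5.6} \\
|\psi_{\mathrm{Bell}}(0,1)\rangle &= \boldsymbol{CNOT}\,(\boldsymbol{H} \otimes \mathbb{I})\,|0,1\rangle = \tfrac{1}{\sqrt{2}}\left(|0,1\rangle + |1,0\rangle\right), \tag{5.7} \\
|\psi_{\mathrm{Bell}}(1,0)\rangle &= \boldsymbol{CNOT}\,(\boldsymbol{H} \otimes \mathbb{I})\,|1,0\rangle = \tfrac{1}{\sqrt{2}}\left(|0,0\rangle - |1,1\rangle\right), \tag{5.8} \\
|\psi_{\mathrm{Bell}}(1,1)\rangle &= \boldsymbol{CNOT}\,(\boldsymbol{H} \otimes \mathbb{I})\,|1,1\rangle = \tfrac{1}{\sqrt{2}}\left(|0,1\rangle - |1,0\rangle\right). \tag{5.9}
\end{align}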
These quantum states do look different but have similar features overall. They all describe perfect correlation or anti-correlation between the two qubits involved. In fact, all 4 Bell states are locally convertible: we can generate each of them from the original Bell state |𝜓Bell⟩ = |𝜓Bell(0, 0)⟩ by acting on one qubit only. To this end, we recall the following elementary single-qubit operations:

\[
\boldsymbol{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \text{i.e.} \quad \begin{cases} \boldsymbol{X}|0\rangle = |1\rangle, \\ \boldsymbol{X}|1\rangle = |0\rangle, \end{cases} \quad \text{(bit flip)},
\]
\[
\boldsymbol{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \text{i.e.} \quad \begin{cases} \boldsymbol{Z}|0\rangle = +|0\rangle, \\ \boldsymbol{Z}|1\rangle = -|1\rangle, \end{cases} \quad \text{(sign flip)}.
\]

We can define a conditional application of these single-qubit gates that depends on an external input bit 𝑏 ∈ {0, 1}:

\[
\boldsymbol{X}^{0} = \mathbb{I}, \quad \boldsymbol{X}^{1} = \boldsymbol{X} \quad \text{and} \quad \boldsymbol{Z}^{0} = \mathbb{I}, \quad \boldsymbol{Z}^{1} = \boldsymbol{Z}.
\]

In words: if 𝑏 = 0, we do nothing (identity operation). Else, if 𝑏 = 1, we apply the bit flip (sign flip) gate.

Figure 5.6 Bell measurement. Applying a 𝑪𝑵𝑶𝑻 and a 𝑯 just before the measurement is equivalent to measuring in the Bell basis.

Proposition 5.4 Let 𝑏_0, 𝑏_1 ∈ {0, 1} be two possible input bits. Then,

\[
|\psi_{\mathrm{Bell}}(b_0, b_1)\rangle = \left(\boldsymbol{Z}^{b_0} \boldsymbol{X}^{b_1} \otimes \mathbb{I}\right)|\psi_{\mathrm{Bell}}(0,0)\rangle.
\]

Exercise 5.5 Prove Proposition 5.4, e.g. by directly computing all advertised
2-qubit states and verifying equality.
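
Before attempting the pen-and-paper proof, one can sanity-check Proposition 5.4 numerically. The sketch below compares both sides for all four input pairs; it is a numerical check, not a substitute for the proof, and the function names `bell_from_circuit` and `bell_from_local_gate` are illustrative.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
# CNOT with the first qubit as control and the second qubit as target.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
bell = [s, 0, 0, s]  # |psi_Bell(0,0)>

def power(gate, b):
    """Conditional gate: identity for b = 0, the gate itself for b = 1."""
    return gate if b == 1 else I2

def bell_from_circuit(b0, b1):
    """Left-hand side: CNOT (H x I) |b0, b1>, as in Eqs. (5.6)-(5.9)."""
    basis = [0, 0, 0, 0]
    basis[2 * b0 + b1] = 1
    return apply(CNOT, apply(kron(H, I2), basis))

def bell_from_local_gate(b0, b1):
    """Right-hand side: (Z^b0 X^b1 x I) |psi_Bell(0,0)>."""
    local = matmul(power(Z, b0), power(X, b1))
    return apply(kron(local, I2), bell)

for b0 in (0, 1):
    for b1 in (0, 1):
        lhs = bell_from_circuit(b0, b1)
        rhs = bell_from_local_gate(b0, b1)
        print((b0, b1), all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs)))
```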

5.3 Bell measurement


A Bell measurement is a joint measurement performed on two qubits, which determines which of the four Bell states the two qubits are in. It consists of the same operations that prepare the Bell state, but in reverse order: first a CNOT gate acting on both qubits and then a Hadamard gate on the first qubit, followed by a readout of both qubits, see Fig. 5.6. We can think of the CNOT gate as un-entangling the two previously entangled qubits. This converts the quantum information about which Bell state is present into classical information that a standard readout can reveal.
Each of the 4 possible outcomes denotes one of the Bell states, as defined in Eqs. (5.6) to (5.9).

5.4 Superdense Coding


The superdense coding protocol is often used in the literature to showcase how the Bell state can be used in a real-world application. For simplicity, only two qubits are involved, which keeps the calculations manageable. The name will become clearer after stating the protocol.

Figure 5.7 Scheme of the superdense coding protocol. Charlie prepares a Bell state and sends one part to Alice and the other part to Bob. Alice performs a quantum operation on her qubit depending on which 2 bits (𝑏_0 𝑏_1) ∈ {00, 01, 10, 11} she wants to send to Bob. Then she sends her qubit to Bob, who applies a CNOT01 gate and a Hadamard gate on the first qubit. A measurement performed by Bob in the computational basis will reveal which 2 bits Alice sent him.

The setting is depicted in Fig. 5.7 and is the following: Alice wants to send Bob a message which is encoded in two classical bits 𝑏 ∈ {00, 01, 10, 11}, but she may only send a single qubit to Bob. For that to work, they have to share a strongly correlated qubit pair which is prepared and distributed by a neutral third party, here called Charlie. As one can see in the diagram in Fig. 5.7, the prepared qubit pair is in the first Bell basis state, as stated in Eq. (5.1). Alice then applies quantum operations to her qubit to encode the two classical bits she wants to send to Bob. She has four different options to encode her message such that Bob can decode the received qubit correctly:

[Eq. (5.10): Alice's four encoding operations, one for each message 𝑏_0 𝑏_1 ∈ {00, 01, 10, 11}.]

After Alice applies one of the four operations, she sends her qubit to Bob. Bob, who already holds the other part of the Bell pair (or receives it at the same time), performs a measurement in the Bell basis: he applies a CNOT01 gate, where Alice’s qubit is the control qubit and his qubit is the target qubit, followed by a Hadamard gate on Alice’s qubit, and then measures both qubits and records the classical bit string.
This protocol takes advantage of Proposition 5.4: Alice’s action on her part of the Bell state is equivalent to changing the input bits with which the Bell state preparation circuit is initialized, see Fig. 5.8. Since the CNOT and Hadamard gates are reversible, Bob’s decoding circuit undoes the preparation circuit, and Alice’s actions directly determine the result of Bob’s readout.
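
The whole protocol fits into a few lines of simulation. The sketch below is hedged in one important respect: it assumes that Alice encodes the message (𝑏_0, 𝑏_1) by applying 𝒁^{𝑏_0} 𝑿^{𝑏_1} to her qubit, which is the choice suggested by Proposition 5.4; the function names `encode` and `decode` are illustrative.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
# CNOT with Alice's (first) qubit as control and Bob's (second) qubit as target.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
bell = [s, 0, 0, s]  # shared pair prepared by Charlie

def encode(b0, b1):
    """Alice acts on her qubit only (assumed encoding: Z^b0 X^b1)."""
    z_part = Z if b0 == 1 else I2
    x_part = X if b1 == 1 else I2
    return apply(kron(matmul(z_part, x_part), I2), bell)

def decode(state):
    """Bob: CNOT, then H on Alice's qubit, then computational-basis readout."""
    final = apply(kron(H, I2), apply(CNOT, state))
    probabilities = [abs(amplitude) ** 2 for amplitude in final]
    index = max(range(4), key=lambda i: probabilities[i])
    return index // 2, index % 2, probabilities[index]

for message in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    o0, o1, probability = decode(encode(*message))
    print("sent", message, "received", (o0, o1), "with probability", round(probability, 6))
```

In each of the four cases Bob's readout is deterministic, so a single transmitted qubit suffices to convey two classical bits.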

Figure 5.8 Diagrams sketching the equivalences in the superdense coding protocol.

5.5 Bell states as universal inputs for quantum circuit verification & learning
Let us conclude today’s lecture with another application of Bell states: they
serve as powerful stimuli to probe unknown single-qubit circuits. This has
applications for formal verification of quantum logic, as well as quantum circuit
design (equivalence checking, synthesis, compilation).
Today, we only discuss the single-qubit base case, where all these tasks
are not too daunting (yet). The underlying procedure does, however, readily
extend to 𝑛 -qubit circuits and (Kronecker products of) Bell states on 2𝑛 qubits.
And there, these genuinely quantum ideas can truly start to make a difference.
We will discuss this in a future lecture and/or in a group project.
For now, let us set the stage and build the foundation for using Bell states
as inputs to probe entire single-qubit circuits: ‘one input state vector to rule
them all’.
 
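Theorem 5.6 Applying a single-qubit circuit

\[
\boldsymbol{U} = \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix} \in \mathbb{C}^{2 \times 2}
\]

to one half of a Bell state produces a state vector that preserves all information:

\[
(\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} U_{0,0} \\ U_{0,1} \\ U_{1,0} \\ U_{1,1} \end{pmatrix}
\quad \text{and} \quad
(\mathbb{I} \otimes \boldsymbol{U})\,|\psi_{\mathrm{Bell}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} U_{0,0} \\ U_{1,0} \\ U_{0,1} \\ U_{1,1} \end{pmatrix}.
\]

Note that these state vectors describe two flattening operations of the 2D array 𝑼: row vectorization (left) and column vectorization (right).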

A statement like this is impossible in the classical realm. If we have black-box access to a classical circuit, we must test different inputs to uncover the underlying functionality. Or, to pinpoint this: evaluating a classical single-bit function 𝑓(𝑥) for 𝑥 = 0 doesn’t tell us anything about its behavior for input 𝑥 = 1.¹ Theorem 5.6 highlights that the quantum case is strikingly different. There, a single input state is capable of ‘recording’ all 2 × 2 = 4 entries of the underlying quantum truth table. The only caveat is that this input state must first be quantum correlated with a second qubit that doesn’t participate in the computation.
Many of the observations we made earlier in this lecture are, in fact, special cases of this fundamental one-to-one correspondence between 2 × 2 matrices (single-qubit gates) and 4-dimensional vectors (two-qubit states). For instance,

\[
\boldsymbol{U} = \mathbb{I}: \quad (\mathbb{I} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathbb{I}_{0,0} \\ \mathbb{I}_{0,1} \\ \mathbb{I}_{1,0} \\ \mathbb{I}_{1,1} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix},
\]
\[
\boldsymbol{U} = \boldsymbol{H} \text{ (a)}: \quad (\boldsymbol{H} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} H_{0,0} \\ H_{0,1} \\ H_{1,0} \\ H_{1,1} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} +1 \\ +1 \\ +1 \\ -1 \end{pmatrix},
\]
\[
\boldsymbol{U} = \boldsymbol{H} \text{ (b)}: \quad (\mathbb{I} \otimes \boldsymbol{H})\,|\psi_{\mathrm{Bell}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} H_{0,0} \\ H_{1,0} \\ H_{0,1} \\ H_{1,1} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} +1 \\ +1 \\ +1 \\ -1 \end{pmatrix}.
\]

This mathematical relation between actions (think: circuit) and state vectors is remarkably general and known as the Choi-Jamiołkowski isomorphism. We will see later that it extends to 𝑛-qubit circuits (2^𝑛 × 2^𝑛 matrices) and 2𝑛-qubit states (2^{2𝑛} different amplitudes). It even applies to quantum actions that cannot be described by a single quantum circuit (e.g. a randomized quantum algorithm), but this would go beyond the scope of this introductory lecture.

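Proof of Theorem 5.6. We can establish both formulas by direct computation in the matrix-vector framework. The two-qubit Bell state is described by the following 4-dimensional state vector:

\[
|\psi_{\mathrm{Bell}}\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} \in \mathbb{C}^{4}.
\]

Likewise, we can compute the 4 × 4 circuit matrix for (𝑼 ⊗ 𝕀) by using the Kronecker product:

\[
(\boldsymbol{U} \otimes \mathbb{I}) = \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} U_{0,0} & 0 & U_{0,1} & 0 \\ 0 & U_{0,0} & 0 & U_{0,1} \\ U_{1,0} & 0 & U_{1,1} & 0 \\ 0 & U_{1,0} & 0 & U_{1,1} \end{pmatrix} \in \mathbb{C}^{4 \times 4}.
\]

Finally, we use matrix-vector multiplication to derive the first formula:

\[
(\boldsymbol{U} \otimes \mathbb{I})\,|\psi_{\mathrm{Bell}}\rangle = \begin{pmatrix} U_{0,0} & 0 & U_{0,1} & 0 \\ 0 & U_{0,0} & 0 & U_{0,1} \\ U_{1,0} & 0 & U_{1,1} & 0 \\ 0 & U_{1,0} & 0 & U_{1,1} \end{pmatrix} \times \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} U_{0,0} \\ U_{0,1} \\ U_{1,0} \\ U_{1,1} \end{pmatrix}.
\]

The derivation for the second formula is very similar. But this time, the Kronecker product yields

\[
(\mathbb{I} \otimes \boldsymbol{U}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} U_{0,0} & U_{0,1} \\ U_{1,0} & U_{1,1} \end{pmatrix} = \begin{pmatrix} U_{0,0} & U_{0,1} & 0 & 0 \\ U_{1,0} & U_{1,1} & 0 & 0 \\ 0 & 0 & U_{0,0} & U_{0,1} \\ 0 & 0 & U_{1,0} & U_{1,1} \end{pmatrix}
\]

and we instead obtain

\[
(\mathbb{I} \otimes \boldsymbol{U})\,|\psi_{\mathrm{Bell}}\rangle = \begin{pmatrix} U_{0,0} & U_{0,1} & 0 & 0 \\ U_{1,0} & U_{1,1} & 0 & 0 \\ 0 & 0 & U_{0,0} & U_{0,1} \\ 0 & 0 & U_{1,0} & U_{1,1} \end{pmatrix} \times \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} U_{0,0} \\ U_{1,0} \\ U_{0,1} \\ U_{1,1} \end{pmatrix},
\]

as advertised. ■

¹ Strictly speaking, single-bit functions are an annoying border case here. If we insist on reversibility, checking input 𝑥 = 0 tells us everything about input 𝑥 = 1. This one-to-one correspondence is, however, lost if we either drop reversibility or consider 𝑛-bit inputs with 𝑛 ≥ 2. The quantum approach requires reversibility, but works for any number of input qubits.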
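
The two flattening operations of Theorem 5.6 are easy to reproduce numerically. The sketch below uses 𝑼 = 𝑻𝑯 as a test gate (chosen because it is not symmetric, so row and column vectorization differ); the helper names and the variables `row_vectorization` and `column_vectorization` are illustrative.

```python
import cmath
import math

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
bell = [s, 0, 0, s]

U = matmul(T, H)  # test gate with U[0][1] != U[1][0]

# Flatten the 2x2 array U row by row and column by column, scaled by 1/sqrt(2).
row_vectorization = [s * U[0][0], s * U[0][1], s * U[1][0], s * U[1][1]]
column_vectorization = [s * U[0][0], s * U[1][0], s * U[0][1], s * U[1][1]]

print(close(apply(kron(U, I2), bell), row_vectorization))
print(close(apply(kron(I2, U), bell), column_vectorization))
```

Reading off the four amplitudes of either output state therefore recovers every matrix entry of 𝑼.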
6. Entanglement

Date: 15 November 2023 Lecturer: Johannes Kofler

Agenda:
1 entanglement
2 the CHSH game
3 quantum key distribution (QKD) revisited: the E91 protocol

6.1 Entanglement

Entanglement is the phenomenon when two or more quantum systems are correlated in such a (non-classical) way that even a perfect and complete description of all individual systems does not fully specify their joint state. And vice versa, knowing everything about their joint state does not imply maximal knowledge about the individual constituents. When two or more systems are in an entangled state, they cannot, in some sense, be thought of as individual systems anymore, even if they are separated in space. This is, in fact, what Erwin Schrödinger called the “essence of quantum physics”.
In the bipartite case, i.e., for two quantum systems 𝐴 and 𝐵 , product (or
separable) states have the form

|𝜓 ⟩𝐴𝐵 = | 𝜑 ⟩𝐴 ⊗ | 𝜑 ⟩𝐵 . (6.1)

Definition 6.1 (entanglement, bipartite case). A pure bipartite quantum state |𝜓⟩_𝐴𝐵 is entangled if and only if it is not a product state (i.e. not separable). This means that it is impossible to write |𝜓⟩_𝐴𝐵 in the form of Eq. (6.1).

The Bell states are, of course, not of this form, i.e. there is no way to write a Bell state such that it factorizes into an individual state |𝜑⟩_𝐴 for Alice and an individual state |𝜑⟩_𝐵 for Bob. There also exist measures to quantify the amount of entanglement in a quantum state, and the Bell states are indeed maximally entangled.
As we will see later, entangled states can give rise to correlations whose “strength” cannot be achieved by any classical process. Moreover, entanglement is a necessary resource for many quantum information technologies such as quantum computing and entanglement-based quantum cryptography.

6.1.1 Rotated Bell states


Corollary 5.3, see also Fig. 5.4, highlights that the Bell state somehow correlates both qubit wires in a nontrivial way. In fact, it is impossible to discern whether a certain unitary (think: gate, circuit) has been applied to the first qubit or the second one. Both qubits (wires) are linked. These circuit reformulations, however, do feature a transpose. And this can be different from the (complex-valued) adjoint. For real-valued unitary matrices, however, transposition and taking the adjoint coincide, and both yield the inverse of the gate.
Rotation gates are one rich family of real-valued unitary transformations. Parametrized by a single angle, they correspond to 2 × 2 rotation matrices

\[
\boldsymbol{R}(\theta) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \quad \text{with } \theta \in [0, 2\pi).
\]

Note that we use polarization (or spin) angles here, which are half as large
as angles on the Bloch sphere. So, we do not need all the factors 1/2 as in
Lecture 3. Rotation matrices are comparatively intuitive and obey very nice
composition and inversion rules.
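Fact 6.2 Let 𝑹(𝜃_𝐴), 𝑹(𝜃_𝐵) be two rotation matrices. Then,

\[
\begin{aligned}
\boldsymbol{R}(\theta_A)\,\boldsymbol{R}(\theta_B) &= \boldsymbol{R}(\theta_A + \theta_B) && \text{(composition)}, \\
\boldsymbol{R}(-\theta_A) &= \boldsymbol{R}(\theta_A)^{T} = \boldsymbol{R}(\theta_A)^{-1} && \text{(transposition/inversion)}.
\end{aligned}
\]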

Exercise 6.3 Verify the composition rule of Fact 6.2 by computing the matrix
product and using trigonometric identities. Then, conclude the transposi-
tion/inversion rule directly from the composition rule.
It is interesting to consider Bell states, where each qubit is rotated by a
different angle. From now on we use subscript 𝐴 to denote the first qubit and
subscript 𝐵 to denote the second qubit. This notation convention will become
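clear later on. For angles 𝜃_𝐴, 𝜃_𝐵 ∈ [0, 2𝜋), we define

\[
|\psi_{\mathrm{Bell}}(\theta_A, \theta_B)\rangle = \left(\boldsymbol{R}(\theta_A) \otimes \boldsymbol{R}(\theta_B)\right)|\psi_{\mathrm{Bell}}\rangle. \tag{6.2}
\]

This state can be created by applying an independent rotation gate to each qubit, Fig. 6.1.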

Figure 6.1 Preparing the state |𝜓Bell(𝜃_𝐴, 𝜃_𝐵)⟩ by rotating each qubit at an angle 𝜃_𝐴 and 𝜃_𝐵, respectively.

Access to a quantum computer is one way to prepare such rotated Bell states. But it is not the only one, and far from the most interesting case. All we need to create such a state is a source that produces two qubits whose joint quantum state is described by |𝜓Bell⟩. This can, for instance, be achieved by nanomaterials, so-called quantum dots, that create a pair of photons whose polarization degree of freedom is maximally entangled.¹ These photons can then travel (with the speed of light) to distant locations. There, quantum aficionados, whom we call Alice and Bob, can ‘catch’ the photons and use polarization filters to implement each rotation. In this context, the following visualization is more appropriate:

[Diagram: a Bell source emits qubit 𝐴 towards Alice, who applies 𝑅(𝜃_𝐴), and qubit 𝐵 towards Bob, who applies 𝑅(𝜃_𝐵); Alice and Bob are separated by a ‘large distance’.]

This visualization depicts a scenario that is equivalent to the circuit in Fig. 6.1. But the underlying geometry is very different. Since their creation, the two qubits have travelled in opposite directions. The recipients, Alice and Bob, are very far away from each other, apply rotations independently and perform single-qubit measurements. Nonetheless, the measurement outcomes they can obtain remain strongly correlated. And, what is more striking, rotations performed by Alice appear to affect Bob’s side and vice versa. This alternative viewpoint isolates the strangeness of the following, mathematically valid, statement.
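Lemma 6.4 For any 𝜃_𝐴, 𝜃_𝐵 ∈ [0, 2𝜋), the outcome probabilities of measuring the rotated Bell state in Eq. (6.2) always obey

\[
\begin{aligned}
\Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 00] = \Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 11] &= \tfrac{1}{2}\cos^2(\theta_A - \theta_B), \\
\Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 01] = \Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 10] &= \tfrac{1}{2}\sin^2(\theta_A - \theta_B).
\end{aligned}
\]

Proof. Let us combine Corollary 5.3, Theorem 5.2 and the defining properties of rotation matrices (Fact 6.2) to obtain

\[
\begin{aligned}
|\psi_{\mathrm{Bell}}(\theta_A, \theta_B)\rangle &= \left(\boldsymbol{R}(\theta_A) \otimes \boldsymbol{R}(\theta_B)\right)|\psi_{\mathrm{Bell}}\rangle = \left(\mathbb{I} \otimes \boldsymbol{R}(\theta_B)\boldsymbol{R}(\theta_A)^{T}\right)|\psi_{\mathrm{Bell}}\rangle \\
&= \left(\mathbb{I} \otimes \boldsymbol{R}(\theta_B - \theta_A)\right)|\psi_{\mathrm{Bell}}\rangle \\
&= \tfrac{\cos(\theta_B - \theta_A)}{\sqrt{2}}\,|0,0\rangle + \tfrac{\sin(\theta_B - \theta_A)}{\sqrt{2}}\,|0,1\rangle - \tfrac{\sin(\theta_B - \theta_A)}{\sqrt{2}}\,|1,0\rangle + \tfrac{\cos(\theta_B - \theta_A)}{\sqrt{2}}\,|1,1\rangle.
\end{aligned}
\]

We can now square these amplitudes to obtain the probabilities of the 2-bit outcomes in question. The sign of the angle difference does not matter, because cos² and sin² are even functions. ■

¹ Armando Rastelli from the physics department at JKU is, in fact, one of the world-leading experts in fabricating such entangling photon sources.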
The following two extreme cases are noteworthy:
1 𝜃_𝐴 = 𝜃_𝐵 (same rotation on both qubits): in this case 𝜃_𝐴 − 𝜃_𝐵 = 0 and the trigonometric relations cos²(0) = 1, sin²(0) = 0 ensure

\[
\Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 00] = \Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 11] = \tfrac{1}{2},
\]

while the outcomes 01 and 10 can never occur. This always produces perfectly correlated outcome bits for both qubits.
2 𝜃_𝐴 = 𝜃_𝐵 ± 𝜋/2 (shifted angle): in this case 𝜃_𝐴 − 𝜃_𝐵 = ±𝜋/2 and the trigonometric relations cos²(±𝜋/2) = 0, sin²(±𝜋/2) = 1 ensure

\[
\Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 01] = \Pr{}_{|\psi_{\mathrm{Bell}}(\theta_A,\theta_B)\rangle}[o = 10] = \tfrac{1}{2},
\]

while the outcomes 00 and 11 can never occur. This produces perfectly anticorrelated outcome bits for both qubits.
Lemma 6.4 interpolates between those extreme cases in a smooth fashion. If 𝜃_𝐴 and 𝜃_𝐵 are close, we are still likely to get correlated output bits. If instead 𝜃_𝐴 − 𝜃_𝐵 is close to ±𝜋/2, we are likely to obtain anti-correlated output bits. Changing the relative difference of both angles allows us to interpolate between perfect correlation and perfect anticorrelation.
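
Lemma 6.4 can be cross-checked by rotating the Bell state explicitly and comparing the squared amplitudes with the closed-form expressions. The following sketch does this for a few angle pairs; the function names and the particular test angles are illustrative choices.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

s = 1 / math.sqrt(2)
bell = [s, 0, 0, s]

def rotation(theta):
    """2x2 rotation matrix R(theta)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def rotated_bell_probabilities(theta_a, theta_b):
    """Probabilities of outcomes 00, 01, 10, 11 for the state in Eq. (6.2)."""
    state = apply(kron(rotation(theta_a), rotation(theta_b)), bell)
    return [amplitude ** 2 for amplitude in state]  # amplitudes are real here

def lemma_prediction(theta_a, theta_b):
    """Closed-form probabilities stated in Lemma 6.4."""
    correlated = 0.5 * math.cos(theta_a - theta_b) ** 2
    anticorrelated = 0.5 * math.sin(theta_a - theta_b) ** 2
    return [correlated, anticorrelated, anticorrelated, correlated]

for theta_a, theta_b in [(0.3, 0.3), (1.0, 1.0 - math.pi / 2), (0.7, 2.1)]:
    simulated = rotated_bell_probabilities(theta_a, theta_b)
    predicted = lemma_prediction(theta_a, theta_b)
    print([round(p, 4) for p in simulated], [round(p, 4) for p in predicted])
```

The first two angle pairs correspond to the two extreme cases discussed above; the third one lies in between.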

6.2 The CHSH game and Bell inequalities


6.2.1 The CHSH game
The CHSH game is a modern view on a seminal thought experiment by John Clauser (Nobel Prize 2022), Michael Horne, Abner Shimony and Richard Holt from 1969. It was designed to test the fundamental limits of (any) classical explanation for quantum mechanical effects that involve the Bell state. This is a clever sharpening of revolutionary observations by John Bell in 1964 [Bel64].
The CHSH game involves two players, A (for Alice) and B (for Bob), as well as a referee (also called quizmaster). We refer to Fig. 6.2 for a visualization. Throughout the duration of the game, A and B must not communicate with each other. Each of them receives a uniformly random bit from the referee and is tasked to commit to another single bit as output:

Figure 6.2 Three variants of the CHSH game: Two players, A (for Alice) and B (for Bob), play as partners in the following setting. A quizmaster (red) provides each with a uniformly random input: 𝑥 ∈ {0, 1} for A and 𝑦 ∈ {0, 1} for B. They then have to output one bit (blue) each. They win the game if 𝑎 ⊕ 𝑏 = 𝑥 ∧ 𝑦. The interesting twist is that A and B cannot talk to each other once the game has started. There are three potential scenarios that meet these overall desiderata: (Left) A and B perform purely deterministic strategies (grey boxes that implement a single-bit function). (Center) Similar setting, but A and B share some joint random seed Λ (green) before the game starts. This allows them to potentially hedge bets and gamble. (Right) In the quantum variant of the CHSH game, A and B share a Bell state (magenta) which they can rotate and measure after the game has started.

• A receives 𝑥 ∈ {0, 1} (uniformly random) and outputs 𝑎 ∈ {0, 1},
• B receives 𝑦 ∈ {0, 1} (uniformly random) and outputs 𝑏 ∈ {0, 1}.

The players win if the two output bits obey 𝑎 ⊕ 𝑏 = 𝑥 ∧ 𝑦, so the optimal strategy depends on both input bits. More precisely:

1 (𝑥, 𝑦) = (0, 0) implies that they win if (𝑎, 𝑏) = (0, 0) or (𝑎, 𝑏) = (1, 1) (perfect correlation),
2 (𝑥, 𝑦) = (0, 1) implies that they win if (𝑎, 𝑏) = (0, 0) or (𝑎, 𝑏) = (1, 1) (perfect correlation),
3 (𝑥, 𝑦) = (1, 0) implies that they win if (𝑎, 𝑏) = (0, 0) or (𝑎, 𝑏) = (1, 1) (perfect correlation),
4 (𝑥, 𝑦) = (1, 1) implies that they win if (𝑎, 𝑏) = (0, 1) or (𝑎, 𝑏) = (1, 0) (perfect anti-correlation).

This looks like an easy and somewhat boring game. But, remember that Alice
and Bob cannot talk to each other! And while three game settings suggest an
easy winning strategy, the fourth setting (perfect anti-correlation) asks for a
completely orthogonal strategy. And, since A and B only have access to one
input bit, how should they prepare for this situation?

6.2.2 Optimal classical strategies


The dilemma from above turns out to be impossible to overcome with traditional
means. This is the content of the following statement that basically underscores
that the naive strategy – Alice and Bob both always output 0 (1) – is optimal
among all deterministic strategies.

Proposition 6.5 The best deterministic classical strategy for the CHSH game
wins with probability 3/4 = 0.75.

Proof. Both players, Alice and Bob, receive a (uniformly random) bit and are
tasked to produce a single output bit. There are only four different single-bit
functions that make sense in this context:

\[
\begin{aligned}
f_0(0) &= 0, & f_0(1) &= 0 && \text{(constant, always 0)}, \\
f_1(0) &= 0, & f_1(1) &= 1 && \text{(balanced, identity)}, \\
f_2(0) &= 1, & f_2(1) &= 0 && \text{(balanced, bit-flip)}, \\
f_3(0) &= 1, & f_3(1) &= 1 && \text{(constant, always 1)}.
\end{aligned}
\]

Recall that there are 4 possible inputs for the CHSH game. Three of them
( ( 0, 0), ( 0, 1), ( 1, 0) ) ask for perfectly correlated output bits ( ( 0, 0) or ( 1, 1) )
while only one input ( 1, 1) requires perfectly anti-correlated output bits ( ( 0, 1)
or ( 1, 0) ). Importantly, Alice (Bob) doesn’t have access to the full input string,
they only see the first (second) bit. So, there is no way to prepare a strategy to
handle the fourth scenario. Instead, it seems better to always provide correlated
outputs, e.g. by jointly agreeing to always output 0 (Alice and Bob always apply
𝑓 0 to their input) or, equivalently, by jointly agreeing to always output 1 (Alice
and Bob both apply 𝑓 3 to their input bits). Such a perfectly correlated strategy
succeeds in 3 of the 4 possible scenarios. Since all four scenarios occur with
equal probability, the overall probability of success is 3/4 = 0.75, as advertised.
An exhaustive search over all 4^2 = 16 possible combinations of functions (one for Alice, one for Bob) confirms that these strategies are indeed as good as it gets. ■
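
The exhaustive search mentioned at the end of the proof is small enough to run directly. The sketch below enumerates all 16 pairs of single-bit functions and computes each pair's winning probability under uniformly random inputs; the dictionary `FUNCTIONS` and the function `win_probability` are illustrative names.

```python
from fractions import Fraction
from itertools import product

# Each single-bit function is stored as the tuple (f(0), f(1)).
FUNCTIONS = {"f0": (0, 0), "f1": (0, 1), "f2": (1, 0), "f3": (1, 1)}

def win_probability(f_alice, f_bob):
    """Fraction of the four equally likely inputs (x, y) with a XOR b == x AND y."""
    wins = sum(
        (f_alice[x] ^ f_bob[y]) == (x & y)
        for x, y in product((0, 1), repeat=2)
    )
    return Fraction(wins, 4)

results = {
    (name_a, name_b): win_probability(f_a, f_b)
    for (name_a, f_a), (name_b, f_b) in product(FUNCTIONS.items(), repeat=2)
}

best = max(results.values())
print("best deterministic winning probability:", best)
for pair, probability in sorted(results.items()):
    print(pair, probability)
```

Every deterministic strategy pair wins either 1 or 3 of the 4 input settings, so no pair exceeds 3/4.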

The above proof reveals a dilemma that arises when Alice and Bob want
to come up with a very good strategy for the CHSH game. Each of them only
receives one-half of the input string, and they must not (cannot) communicate
while the game is going on. This partial information is not enough for them to discern whether the interesting special case (input (1, 1), which asks for anti-correlated bits) has actually happened. And so, it does not seem to make
sense to pay attention to this special case at all. Stubbornly outputting 0,
regardless of the actual input, looks like a highly competitive strategy.
An interesting question is now whether there are randomized strategies that
allow Alice and Bob to do better than that. A randomized strategy could look
as follows: before the game starts, Alice and Bob are allowed to meet, conspire
and exchange information. They could use this opportunity to share a random
seed Λ that allows them to hedge bets and ‘gamble’ later on in the game. This
shared random seed should really be viewed as a mathematical abstraction of
a joint strategy that may involve additional information, as well as multiple
strategies.
But, to not interfere with the defining rule of the CHSH game, we insist that the shared random seed is (statistically) independent from the random bits 𝑥 and 𝑦 the referee is going to use once the game actually starts in earnest. This assumption is crucial and actually sufficient to prove the following generalization of Proposition 6.5.

Theorem 6.6 (Bell inequality for CHSH). Any classical strategy conceivable, even one that uses (arbitrary amounts of) shared randomness between both players, can win the CHSH game with a probability of at most 3/4 = 0.75. Hence, the Bell inequality for the CHSH game, limiting all classical strategies, reads:

\[
p_{\mathrm{succ}}^{\mathrm{class}} \leq \tfrac{3}{4}.
\]

To be more precise, this upper bound on the optimal classical strategy hinges on two explicit assumptions:

• separated players (‘locality’): Alice receives input bit 𝑥 and outputs 𝑎 ∈ {0, 1}; she has zero information about either 𝑦 or 𝑏 from Bob’s side. Analogously, Bob’s outcome 𝑏 does not depend on 𝑥 or 𝑎.
• uncorrelated randomness (‘free will’): the referee samples both input bits 𝑥, 𝑦 uniformly at random; importantly, these random numbers are completely uncorrelated from the shared random seed Λ that Alice and Bob use to power their strategies.

There is a third implicit assumption, that is sometimes called ‘realism’. It states that it is possible to assign probabilities to all input/output tuples arising from potential strategies, irrespective of whether they occur in the actual game or not.

Proof sketch of Theorem 6.6. The overall idea is that shared randomness cannot
improve over the best deterministic strategy. A shared random seed allows the
players to switch between different deterministic strategies – depending on the
value of the random seed. The resulting probability of success then becomes a
weighted sum over winning probabilities of individual deterministic strategies:
∑︁ ∑︁
𝑝 succ = Pr [ success|strategy 𝑘 ] 𝑝 (𝑘 ) with 𝑝𝑘 ≥ 0, 𝑝 (𝑘 ) = 1.
𝑘 𝑘

But, we already know from Proposition 6.5 that Pr [ success|strategy 𝑘 ] ≤ 3/4


for all 𝑘 . It is impossible to overcome this threshold with probabilistic averaging.
Note, however, that this strategy only works if the random seed (𝑝 (𝑘 ) in
our case) is statistically independent of the game settings (𝑥, 𝑦 ) . Otherwise,
the expression would not factorize nicely and the argument becomes void.
This is why uncorrelated randomness is an important assumption behind
Theorem 6.6. ■
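
The proof sketch above can be checked by brute force. The following minimal sketch enumerates all 16 deterministic strategies (a lookup table per player) and then averages them with a seed distribution; the uniform weights are a hypothetical choice made only for illustration, any other probability vector would do.

```python
from itertools import product

def wins(x, y, a, b):
    """CHSH winning condition: x AND y must equal a XOR b."""
    return (x & y) == (a ^ b)

def success_probability(alice, bob):
    """Success probability of a deterministic strategy under uniform inputs.

    alice and bob are lookup tables (output for input 0, output for input 1).
    """
    won = sum(wins(x, y, alice[x], bob[y]) for x, y in product((0, 1), repeat=2))
    return won / 4

# All 4 x 4 = 16 pairs of deterministic lookup tables.
tables = list(product((0, 1), repeat=2))
scores = {(al, bo): success_probability(al, bo) for al in tables for bo in tables}
best = max(scores.values())

# Shared randomness: a convex combination of deterministic scores.
weights = [1 / len(scores)] * len(scores)  # hypothetical uniform seed distribution
mixed = sum(w * s for w, s in zip(weights, scores.values()))

print(best, mixed)
```

Since `mixed` is a weighted average of the individual scores, it can never exceed `best`, which is exactly the averaging argument used in the proof.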

6.2.3 Optimal quantum strategy


We have now prepared the stage for a very astonishing observation: a quantum
generalization of the CHSH game – which does not violate any game rules – allows
the players to win with a success probability that is quite a bit larger than
3/4 = 0.75. To achieve such a noteworthy improvement, Alice and Bob share a
Bell state between them before the game starts, see Fig. 6.2 (right). Once the
game begins, each player uses the input bit to perform a certain rotation on
their half of the Bell state and measure their qubit:

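[Circuit diagram: Alice and Bob each hold one qubit of a shared Bell state (qubit pair). Alice applies the rotation 𝑹 (𝑥 𝜃𝐴 ) to her qubit and reads out 𝑎 ; Bob applies the rotation 𝑹 (𝑦 𝜃𝐵 ) followed by 𝑹 (𝜑 ) to his qubit and reads out 𝑏 .]
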
At first sight, this setup looks a bit asymmetric. Bob features an additional
rotation gate 𝑅 (𝜑 ) while Alice doesn’t. Due to the invariance properties of a rotated Bell
state, this is in fact the most general setup conceivable. Consider now the
scenario where Alice receives input bit 𝑥 and Bob receives input bit 𝑦 . After
applying rotations, they produce a rotated Bell state that depends on 𝑥 and 𝑦 .
We succinctly write
|𝜓 (𝑥, 𝑦 )⟩ := |𝜓Bell (𝑥 𝜃 𝐴 , 𝑦 𝜃𝐵 + 𝜑 )⟩ = 𝑹 (𝑥 𝜃 𝐴 ) ⊗ 𝑹 (𝑦 𝜃𝐵 + 𝜑 )|𝜓Bell ⟩.
Note that since this state depends on the rotation angles 𝜃 𝐴 , 𝜃 𝐵 , 𝜑 ∈ [ 0, 2𝜋) ,
Lemma 6.4 conveniently provides us with the associated outcome probabilities
for perfectly correlated and anti-correlated measurement outcome bits:
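\[
\begin{aligned}
\Pr_{|\psi(x,y)\rangle}[o = 00] &= \Pr_{|\psi(x,y)\rangle}[o = 11] = \tfrac{1}{2}\cos^2(-x\theta_A + y\theta_B + \varphi), \\
\Pr_{|\psi(x,y)\rangle}[o = 01] &= \Pr_{|\psi(x,y)\rangle}[o = 10] = \tfrac{1}{2}\sin^2(-x\theta_A + y\theta_B + \varphi).
\end{aligned}
\]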
We can now use these probabilities to analyze the success probability in each
CHSH game setting as a function of the rotation angles involved:
1 (𝑥, 𝑦 ) = ( 0, 0) which asks for 0 = 𝑥 ∧ 𝑦 = 𝑎 ⊕ 𝑏 . In words, Alice and
Bob win the game if their output bits are perfectly correlated, that is they
should output either (𝑎, 𝑏) = ( 0, 0) or (𝑎, 𝑏) = ( 1, 1) . The probability
of success is
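\[
\begin{aligned}
p_{\mathrm{succ}}(0,0) &= \Pr_{|\psi(0,0)\rangle}[o = 00] + \Pr_{|\psi(0,0)\rangle}[o = 11] \\
&= \tfrac{1}{2}\cos^2(-0\,\theta_A + 0\,\theta_B + \varphi) + \tfrac{1}{2}\cos^2(-0\,\theta_A + 0\,\theta_B + \varphi) \\
&= \cos^2(\varphi).
\end{aligned}
\]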
2 (𝑥, 𝑦 ) = ( 0, 1) which asks for 0 = 𝑥 ∧ 𝑦 = 𝑎 ⊕ 𝑏 . Again, Alice and Bob
win if their output bits are perfectly correlated. The probability of success
is
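\[
\begin{aligned}
p_{\mathrm{succ}}(0,1) &= \Pr_{|\psi(0,1)\rangle}[o = 00] + \Pr_{|\psi(0,1)\rangle}[o = 11] \\
&= \tfrac{1}{2}\cos^2(-0\,\theta_A + 1\,\theta_B + \varphi) + \tfrac{1}{2}\cos^2(-0\,\theta_A + 1\,\theta_B + \varphi) \\
&= \cos^2(\theta_B + \varphi).
\end{aligned}
\]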
3 (𝑥, 𝑦 ) = ( 1, 0) which asks for 0 = 𝑥 ∧ 𝑦 = 𝑎 ⊕ 𝑏 . For the last time, Alice


and Bob win if their output bits are perfectly correlated. The probability
of success is

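\[
\begin{aligned}
p_{\mathrm{succ}}(1,0) &= \Pr_{|\psi(1,0)\rangle}[o = 00] + \Pr_{|\psi(1,0)\rangle}[o = 11] \\
&= \tfrac{1}{2}\cos^2(-1\,\theta_A + 0\,\theta_B + \varphi) + \tfrac{1}{2}\cos^2(-1\,\theta_A + 0\,\theta_B + \varphi) \\
&= \cos^2(-\theta_A + \varphi).
\end{aligned}
\]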

4 (𝑥, 𝑦 ) = ( 1, 1) which asks for 1 = 𝑥 ∧ 𝑦 = 𝑎 ⊕ 𝑏 . In this last scenario,


Alice and Bob win if their output bits are perfectly anti-correlated. The
probability of success is

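\[
\begin{aligned}
p_{\mathrm{succ}}(1,1) &= \Pr_{|\psi(1,1)\rangle}[o = 01] + \Pr_{|\psi(1,1)\rangle}[o = 10] \\
&= \tfrac{1}{2}\sin^2(-1\,\theta_A + 1\,\theta_B + \varphi) + \tfrac{1}{2}\sin^2(-1\,\theta_A + 1\,\theta_B + \varphi) \\
&= \sin^2(-\theta_A + \theta_B + \varphi).
\end{aligned}
\]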

We can now take into account the distribution of inputs (𝑥, 𝑦 ) to combine
all these four probabilities into a single success probability. By assumption,
all possible 2-bit inputs occur with equal probability 1/22 = 1/4 (uniform
distribution). So, the overall probability of success becomes
\[
p_{\mathrm{succ}} = \frac{1}{4}\left( \cos^2(\varphi) + \cos^2(\theta_B + \varphi) + \cos^2(-\theta_A + \varphi) + \sin^2(-\theta_A + \theta_B + \varphi) \right).
\]
We are now in a position to optimize the quantum strategy by choosing
𝜃 𝐴 , 𝜃𝐵 , 𝜑 to make this success probability as large as possible. The cosine
function becomes large if all the angles are (relatively) close to zero and is
symmetric, i.e. cos (−𝛼) = cos (+𝛼) . Here is one choice that yields the same
success probability for each of the four challenges:

𝜃 𝐴 = −𝜋/4, 𝜃𝐵 = +𝜋/4, 𝜑 = −𝜋/8.

It achieves
\[
p_{\mathrm{succ}} = \cos^2(\pi/8) = \frac{1}{4}\left(2 + \sqrt{2}\right) \gtrsim 0.85.
\]
Although not perfect, this success probability is considerably larger than the
best classical success probability (3/4 = 0.75) conceivable. In fact, it can
be shown that this quantum success probability is optimal. Even by using
(arbitrary) quantum resources and (arbitrary) quantum circuits one cannot
beat this threshold. This is worth a prominent display.
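
The formula for the overall success probability is easy to evaluate numerically. The sketch below plugs in the angles chosen above and compares the result with the closed form; the coarse grid search (step size 𝜋/16, an arbitrary choice) is only a plausibility check for optimality within this family of strategies, not a proof.

```python
import math

def chsh_success(theta_a, theta_b, phi):
    """Average success probability over the four uniformly random inputs (x, y)."""
    return 0.25 * (
        math.cos(phi) ** 2
        + math.cos(theta_b + phi) ** 2
        + math.cos(-theta_a + phi) ** 2
        + math.sin(-theta_a + theta_b + phi) ** 2
    )

p_quantum = chsh_success(-math.pi / 4, math.pi / 4, -math.pi / 8)
p_closed_form = (2 + math.sqrt(2)) / 4

# Coarse grid search over all three angles in [0, 2*pi).
grid = [k * math.pi / 16 for k in range(32)]
p_grid_max = max(chsh_success(a, b, f) for a in grid for b in grid for f in grid)

print(p_quantum, p_closed_form, p_grid_max)
```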

Theorem 6.7 (optimal quantum strategy for the CHSH game). For the CHSH
game (2 players), there is an optimal quantum strategy that (only) uses
2-qubit Bell states and two single-qubit rotations for each player. This
strategy achieves a success probability of 𝑝 succ = ( 2 + √2)/4 ≳ 0.85.

6.3 CHSH rigidity and monogamy of entanglement


In this section, we briefly discuss the implications of actually observing a success
probability that is (close to) 𝑝 succ = ( 2 + √2)/4. It turns out that such a high
probability of success is only possible if Alice and Bob share a (state close to
the) maximally entangled Bell state.
Fact 6.8 (CHSH rigidity and monogamy, informal). Suppose that Alice and Bob
play the CHSH game, but do not necessarily know (or trust) whether they
actually both measure one half of a Bell state. Then, they can use the success
probability they achieve to test their assumptions. In fact, an optimal success
probability of ( 2 + √2)/4 is only achievable if they indeed share a (possibly
rotated) Bell state. Moreover, this even implies that their two-qubit state is
completely uncorrelated (think: private) from any external players. ■

This fact subsumes two strong observations: (i) playing a CHSH game and
winning with optimal success probability essentially certifies that the underlying
protocol works as intended. There is no additional need for benchmarking
and/or bug fixing. This remarkable feature is also called ‘self-testing’.
(ii) the quantum correlations within a two-qubit Bell state are maximally strong.
They are, in fact, so strong that it is impossible to still couple this quantum
system to another one. This feature is known as monogamy of entanglement: if
two quantum systems are maximally entangled, they cannot be entangled with a
third one.

6.4 Bell inequalities and the violation of local realism


We should note that the importance of the violation of Bell’s inequalities goes
far beyond winning a game with larger-than-classical success probability. In
fact, what John Bell did [Bel64], was a rather historic achievement: He made
it possible to experimentally test a hitherto almost metaphysical problem,
which some of the greatest minds in the history of physics could not solve,
namely whether or not the physical world may eventually allow for a classical
description after all. I.e. can there exist an underlying reality “beneath” the
quantum state, described by “hidden variables” (similar to the random seed Λ
introduced earlier) not yet known in quantum theory? This classical worldview
is called “realism”. Alone, it does not seem to be testable. But Bell combined it
with two more very plausible assumptions, resulting in the worldview of
“local realism”.
Local realism encompasses all classical theories about the physical universe
which obey the three assumptions of realism (physical properties are defined by
hidden variables and exist independent of and prior to measurement), locality
(no influence can propagate faster than the speed of light), and freedom of
choice (measurement settings can be chosen independently of the hidden
variables).
The worldview of local realism implies Bell inequalities. Over many decades,
Bell inequalities have been violated in laboratories all around the world. The
2022 Nobel Prize in Physics was awarded for some of the most significant of
these experiments.

6.5 The E91 protocol for quantum key distribution


We now have seen two compelling features of the two-qubit Bell state. On the
one hand, it allows two agents, Alice and Bob, to obtain perfectly correlated
output bits that are still random ( ( 0, 0) and ( 1, 1) with probability 1/2 each).
And, on the other hand, the agents can play a CHSH game to self-test and
certify the underlying setup. Arthur Ekert was the first to realize the potential
impact of a combination of these two effects [Eke91]. He devised a protocol,
which first appeared in 1991, that uses shared entanglement to establish a
private random key between two distant parties. This key can then be used in
a one-time-pad protocol, which is an information-theoretic secure encryption
technique, i.e. it remains safe even if an adversary has infinite computing power.
The key idea is to distribute many Bell states among Alice and Bob. Each of
them uses private randomness to measure halves of each state in one of several
designated bases. Each measurement provides Alice with a private bit that is
strongly correlated – via entanglement in the original Bell state – with the
private bit that Bob has obtained in the same round.
Once this randomized (Bell) measurement stage is completed, Alice and
Bob exchange their basis choices over a public channel, e.g. the internet
or a telephone. By itself, the choice of basis setting does not reveal any
information about the obtained measurement outcomes. They then identify
instances, where they happened to measure in the same basis, to identify
perfectly correlated output bits. These then form the basis of their private key.
But, before jumping to premature conclusions, they also use a considerable
amount of their measurement data to imitate a CHSH game and compute their
(approximate) success probability. Note that, in this step, it is necessary to
also communicate measurement outcomes with each other. So, the bits they
use for CHSH testing cannot be used as a private key anymore. Nonetheless,
this CHSH game imitation is essential for the protocol and equips it with the
uncanny ability to detect ‘wiretap’ or ‘man in the middle’ attacks.
If they achieve a close-to-optimal CHSH success probability, the rigidity of
the CHSH game and monogamy of entanglement (Fact 6.8) ensure that their
protocol must have worked as intended. In particular, the shared correlations
between their Bell states are private in the sense that they cannot be correlated
to any external quantum system that is beyond their control. The latter is very
general and includes a potential eavesdropper (‘man in the middle’), even if
this agent is very powerful and has access to arbitrary amounts of (quantum)
computing resources. As soon as such a hypothetical eavesdropper tampers
with the shared Bell states, they must interfere with the perfect entanglement
between Alice and Bob (monogamy of entanglement). And such an attack
would necessarily manifest itself in a (much) smaller success probability for the
CHSH game. Alice and Bob, however, can check for exactly that and abort their
protocol if their imitation of the CHSH game is not as successful as it ought to
be.
The actual E91 protocol uses 3 different rotations for each participant (Alice
and Bob) that are depicted in Fig. 6.3. A quick look at them reveals that two of
each are perfectly aligned with each other. These are well-suited to establish
shared randomness (via perfect correlation of Bell outcome measurements).
On the other hand, two of each, are also perfectly suited to play the winning
strategy for the CHSH game. A full and secure execution of the E91 protocol
requires access to many shared Bell pairs so that we can generate sufficiently
long shared private keys and also have enough statistics to approximately
determine the CHSH success probability. We encourage you to have a closer
look by yourself and implement a variant of the E91 protocol using, for instance,
QISKIT to simulate the generation and subsequent measurement of Bell states.
Such a small coding project would also allow you to discover for yourself, how
powerful the CHSH game is at detecting potential attacks on the Bell state.
A brutal man-in-the-middle-attack could, for instance, involve an additional
third qubit which the eavesdropper swaps into the circuit to funnel out one-half
of the entangled state. The CHSH test, however, would detect such a clumsy
attack almost immediately.
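
As a starting point for such a project, here is a minimal sketch that works with exact probabilities instead of a full circuit simulation. It rests on several assumptions that are not fixed by the text: the angles are read off Fig. 6.3, their pairing with the CHSH input bits is our own guess, the rotation is taken to be the real rotation 𝑹 (𝜃 )| 0⟩ = cos (𝜃 )| 0⟩ + sin (𝜃 )| 1⟩ (consistent with Lemma 6.4), and the eavesdropper is modeled as simply reading out one qubit in the computational basis and passing it on.

```python
import math

# Angle options read off Fig. 6.3 (pairing with CHSH inputs is an assumption).
ALICE = [0.0, -math.pi / 8, -math.pi / 4]
BOB = [math.pi / 8, 0.0, -math.pi / 8]
CHSH_ALICE = {0.0: 0, -math.pi / 4: 1}        # theta_A -> input bit x
CHSH_BOB = {-math.pi / 8: 0, math.pi / 8: 1}  # theta_B -> input bit y

def p_equal_honest(ta, tb):
    """Pr[a = b] for an untouched Bell state (Lemma 6.4): cos^2(-ta + tb)."""
    return math.cos(-ta + tb) ** 2

def p_equal_intercepted(ta, tb):
    """Pr[a = b] after Eve has read out one qubit in the computational basis.

    The pair is then |00> or |11> with probability 1/2 each; for real
    rotations both branches give the same value.
    """
    return (math.cos(ta) * math.cos(tb)) ** 2 + (math.sin(ta) * math.sin(tb)) ** 2

def chsh_estimate(p_equal):
    """Exact CHSH success probability over the four CHSH setting pairs."""
    total = 0.0
    for ta, x in CHSH_ALICE.items():
        for tb, y in CHSH_BOB.items():
            same = p_equal(ta, tb)
            total += (1 - same) if (x & y) else same  # (1, 1) needs a != b
    return total / 4

# Setting pairs with identical angles yield perfectly correlated key bits.
key_settings = [(ta, tb) for ta in ALICE for tb in BOB if ta == tb]
p_honest = chsh_estimate(p_equal_honest)
p_attacked = chsh_estimate(p_equal_intercepted)
print(len(key_settings), p_honest, p_attacked)
```

Under these assumptions, the intercepted pairs cannot even reach the classical bound of 3/4, so Alice and Bob would notice the attack from their CHSH statistics.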
Exercise 6.9 Go through our high-level description of the E91 protocol and
make it precise. Which pairs of basis rotations allow for extracting a pair of
perfectly correlated, uniformly random bits? Which pairs of basis rotations
instead allow for playing a CHSH game? Optional: use QISKIT to simulate the
E91 protocol in the presence of a malicious eavesdropper who tampers with the
pristine Bell-state preparation circuit. Show that the (estimated) CHSH success
probability really drops below ( 2 + √2)/4 in any such scenario.
[Figure 6.3: Alice picks a random angle θA ∈ {0, −π/8, −π/4} and Bob picks a random angle θB ∈ {π/8, 0, −π/8}. For each shared Bell state (qubit pair), Alice applies R(θA) and Bob applies R(θB) before reading out; they record (oA , θA ) and (oB , θB ), respectively. This is repeated many times, independently.]

Figure 6.3 Illustration of the E91 protocol: Two players – Alice on the left and
Bob on the right – share many presumed Bell pairs (blue) among them. For each
Bell state, they perform independent rotations that are selected uniformly
from three options (red circle for Alice, green circle for Bob) and follow it
up with a computational basis measurement (magenta). Two of the three
angles are the same on each side. The entire process gives each party a list
of measurement settings and corresponding results. They then share their
measurement settings over a public channel. If the same angle was measured –
this occurs in 2 of the 9 setting combinations – then the perfect correlations
of Bell state measurements provide them with one shared random bit. In 4
out of 9 cases, their measurement setting combination belongs to the CHSH
game. Then, they also communicate their measurement outcomes as they are
required for the calculation of the success probability. If the estimated success
probability is (close to) optimal, they can be sure that the entire protocol has
worked as intended. In particular, no eavesdropping whatsoever can have taken
place. If the success probability is not close to optimal, something fishy must
have happened. Alice and Bob then abort the protocol, because their shared
key may not be correct and/or secure.
7. Quantum teleportation

Date: 6 November 2024

Agenda:
1 motivation
2 marginal & conditional probabilities
3 𝑇 -gate teleportation
4 state teleportation

7.1 Motivation

Today, we prepare the stage for scaling up our quantum architectures to many
qubits. An important prerequisite to doing so is the ability to recognize
subroutines and reason about them. In particular, we want to know whether two
quantum subroutines are equivalent or not.

The following rigorous statement fully resolves this question for the case of a
single qubit¹.

Theorem 7.1 (equivalence of single-qubit functionalities). Two single-qubit
functionalities (e.g. sub-circuits) 𝑨, 𝑩 are equivalent if they always lead to
the same readout probabilities. That is, for all input states |𝜓 ⟩ and all
single-qubit unitaries 𝑼 , we must have

\[
\Pr_{U A |\psi\rangle}[o = 0] = \Pr_{U B |\psi\rangle}[o = 0]
\quad \text{and} \quad
\Pr_{U A |\psi\rangle}[o = 1] = \Pr_{U B |\psi\rangle}[o = 1].
\]

¹ A generalization to 𝑛 ≥ 1 qubits is relatively straightforward, but would go beyond the
scope of this lecture.

The requirements put forth by this statement must hold for every input state |𝜓 ⟩
and every subsequent unitary gate 𝑼 . This characterization of equivalence is
intuitive: two (black box) subroutines are equivalent if
and only if it is impossible to detect any functional difference between the two.
It should not come as a surprise that quantum circuits can ‘hide’ functional
differences better than conventional circuits. The following example highlights
that different input states and different unitaries can both be necessary to
detect them.
Example 7.2 (checking different input states |𝜓 ⟩ and unitaries 𝑼 matters for Theorem 7.1).
Consider 𝑨 = 𝕀 (do nothing) and 𝑩 = 𝒁 (sign flip). Then,

\[
B|0\rangle = |0\rangle = A|0\rangle \quad \text{and} \quad B|1\rangle = -|1\rangle \sim |1\rangle = A|1\rangle,
\]

but the two gates are clearly not equivalent. To see this, set |𝜓 ⟩ = |+⟩ =
𝑯 | 0⟩ = (| 0⟩ + | 1⟩)/√2 (and recall |−⟩ = 𝑯 | 1⟩ = (| 0⟩ − | 1⟩)/√2). Then,

\[
\begin{aligned}
B|\psi\rangle &= B|+\rangle = B\,(|0\rangle + |1\rangle)/\sqrt{2} = (|0\rangle - |1\rangle)/\sqrt{2} = |-\rangle = H|1\rangle, \\
A|\psi\rangle &= A|+\rangle = |+\rangle = H|0\rangle.
\end{aligned}
\]

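Both states describe equal superpositions between 0 and 1 and produce identical
readout probabilities:

\[
\begin{aligned}
\Pr_{B|\psi\rangle}[o_0 = 0] &= \Pr_{|-\rangle}[o_0 = 0] = |\langle 0|-\rangle|^2 = \tfrac{1}{2}, \\
\Pr_{A|\psi\rangle}[o_0 = 0] &= \Pr_{|+\rangle}[o_0 = 0] = |\langle 0|+\rangle|^2 = \tfrac{1}{2}.
\end{aligned}
\]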
Nonetheless, the actual states are very different from each other. Applying one
subsequent Hadamard gate 𝑼 = 𝑯 reveals this difference: 𝑼 𝑩 |𝜓 ⟩ = 𝑯 |−⟩ =
𝑯 × 𝑯 | 1⟩ = | 1⟩ , while 𝑼 𝑨 |𝜓 ⟩ = 𝑯 |+⟩ = 𝑯 × 𝑯 | 0⟩ = | 0⟩ . This ensures

\[
\begin{aligned}
\Pr_{U B|\psi\rangle}[o_0 = 0] &= \Pr_{|1\rangle}[o_0 = 0] = 0, \\
\Pr_{U A|\psi\rangle}[o_0 = 0] &= \Pr_{|0\rangle}[o_0 = 0] = 1.
\end{aligned}
\]

These readout probabilities are as different as they get: one always produces 0
and one always produces 1. ■
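
The example can be replayed numerically with plain 2 × 2 matrices. This is a minimal sketch; the helper names `apply` and `prob_zero` are our own and not part of any library.

```python
import math

def apply(gate, state):
    """Multiply a 2x2 gate (nested lists) with a single-qubit state vector."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

def prob_zero(state):
    """Readout probability Pr[o = 0] = |<0|state>|^2."""
    return abs(state[0]) ** 2

r = 1 / math.sqrt(2)
A = [[1, 0], [0, 1]]     # identity: do nothing
B = [[1, 0], [0, -1]]    # Z gate: sign flip
H = [[r, r], [r, -r]]    # Hadamard: prepares |+> and serves as U

zero, one = [1, 0], [0, 1]
plus = apply(H, zero)

# Without a subsequent unitary, A and B look identical on |0>, |1> and |+>.
direct = [(prob_zero(apply(A, psi)), prob_zero(apply(B, psi)))
          for psi in (zero, one, plus)]

# With U = H applied afterwards, the input |+> separates them perfectly.
p_a = prob_zero(apply(H, apply(A, plus)))
p_b = prob_zero(apply(H, apply(B, plus)))
print(direct, p_a, p_b)
```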

This example showcases that we may need the ability to choose different
input states and different subsequent unitaries to unravel a functional difference
between quantum sub-routines. Theorem 7.1 then follows from carefully
arguing that this is enough to unravel all functional differences between
single-qubit circuits. We leave this analysis as an instructive exercise.
Exercise 7.3 (Proof of Theorem 7.1). Provide a proof of Theorem 7.1 for the special
case where both 𝑨 and 𝑩 are single-qubit gates.
At this point, it is worthwhile to emphasize that Theorem 7.1 also applies
to more general quantum subroutines. The subroutines we analyze today,
for instance, involve more than one qubit. A readout is performed on these
auxiliary qubits. And, depending on the outcome obtained (𝑜𝑘 = 0 or 𝑜𝑘 = 1),
we perform different quantum gates on the remaining qubit wire. For instance,

where double lines indicate the conditional application of a quantum gate (if
readout is 1, we apply the gate; otherwise we do nothing).
In order to properly analyze such quantum (sub-)routines, we need a
framework that allows us to reason about such conditional operations, i.e.
(gate) actions that depend on a certain – possibly randomly sampled – readout
bit we have observed earlier.

7.2 Background: marginal and conditional probabilities


For simplicity and concreteness, we will focus on probability distributions that
address binary outcomes/events. This will be enough to reason about quantum
readout procedures, because they only ever produce bit values.

7.2.1 Marginal probabilities


Marginal probability distributions arise from averaging over certain portions of
a multi-variate probability distribution. This is akin to ignoring (or forgetting)
parts of the full distribution.
Definition 7.4 (marginal probability distributions (two binary events)). Consider a
joint probability distribution Pr [𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 ] over two binary events 𝑠 , 𝑡 ∈
{0, 1}. Then, the marginal probability for the first event (𝑜0 ) is

\[
\Pr[o_0 = s] = \sum_{t=0}^{1} \Pr[o_0 = s, o_1 = t] = \Pr[o_0 = s, o_1 = 0] + \Pr[o_0 = s, o_1 = 1],
\]

while the marginal probability for the second event (𝑜 1 ) is

\[
\Pr[o_1 = t] = \sum_{s=0}^{1} \Pr[o_0 = s, o_1 = t] = \Pr[o_0 = 0, o_1 = t] + \Pr[o_0 = 1, o_1 = t].
\]
Note that this definition readily extends to a joint probability distribution


over more than two events. For three binary events (𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 , 𝑜 2 = 𝑢 ) we
get
\[
\begin{aligned}
\Pr[o_0 = s] &= \sum_{t,u=0}^{1} \Pr[o_0 = s, o_1 = t, o_2 = u], \\
\Pr[o_1 = t] &= \sum_{s,u=0}^{1} \Pr[o_0 = s, o_1 = t, o_2 = u], \\
\Pr[o_2 = u] &= \sum_{s,t=0}^{1} \Pr[o_0 = s, o_1 = t, o_2 = u].
\end{aligned}
\]
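
In code, marginalization amounts to summing a table of joint probabilities over the positions we want to forget. The following sketch stores a joint distribution as a dictionary from outcome tuples to probabilities; the three-bit example distribution (two independent fair bits and their parity) is a hypothetical illustration and does not come from the lecture.

```python
from itertools import product

def marginal(joint, keep):
    """Sum a joint distribution over all positions except those in `keep`.

    `joint` maps outcome tuples such as (s, t, u) to probabilities.
    """
    result = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        result[key] = result.get(key, 0.0) + p
    return result

# Hypothetical joint distribution: o0 and o1 are independent fair bits,
# o2 is their parity o0 XOR o1.
joint = {(s, t, u): (0.25 if u == (s ^ t) else 0.0)
         for s, t, u in product((0, 1), repeat=3)}

p_o0 = marginal(joint, keep=[0])
p_o2 = marginal(joint, keep=[2])
p_o0_o2 = marginal(joint, keep=[0, 2])
print(p_o0, p_o2, p_o0_o2)
```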

7.2.2 Conditional probability distributions


Conditional probability distributions are similar to marginal distributions in
the sense that they only address one part of a larger probability distribution.
However, now, we fix the outcome of the other part instead of averaging over
all possible outcomes.
Definition 7.5 (conditional probability distributions (two binary events)). Consider a
joint probability distribution Pr [𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 ] over two binary events 𝑠 , 𝑡 ∈
{0, 1}. Then, the two conditional probabilities for the first event (𝑜0 ) are

\[
\begin{aligned}
\Pr[o_0 = s \mid o_1 = 0] &= \frac{\Pr[o_0 = s, o_1 = 0]}{\Pr[o_1 = 0]} \quad \text{for } s = 0, 1, \\
\Pr[o_0 = s \mid o_1 = 1] &= \frac{\Pr[o_0 = s, o_1 = 1]}{\Pr[o_1 = 1]} \quad \text{for } s = 0, 1.
\end{aligned}
\]

Likewise, the two conditional probabilities for the second event (𝑜 1 ) are

\[
\begin{aligned}
\Pr[o_1 = t \mid o_0 = 0] &= \frac{\Pr[o_0 = 0, o_1 = t]}{\Pr[o_0 = 0]} \quad \text{for } t = 0, 1, \\
\Pr[o_1 = t \mid o_0 = 1] &= \frac{\Pr[o_0 = 1, o_1 = t]}{\Pr[o_0 = 1]} \quad \text{for } t = 0, 1.
\end{aligned}
\]

(Care must be taken when the denominator approaches zero. This would mean
that we condition on an event that can (almost) never happen).
Note that there are two conditional probability distributions for each event:
one that assumes 𝑜 0/1 = 0 and one that assumes 𝑜 0/1 = 1. Each of them is a
valid probability distribution in its own right. Non-negativity follows directly
from the construction. Normalization, on the other hand, follows from the
fact that the relevant marginal distribution features in the denominator. For
instance,
\[
\sum_{s=0}^{1} \Pr[o_0 = s \mid o_1 = 0] = \frac{\sum_{s=0}^{1} \Pr[o_0 = s, o_1 = 0]}{\Pr[o_1 = 0]} = \frac{\Pr[o_1 = 0]}{\Pr[o_1 = 0]} = 1
\]

and we obtain the same result for all other conditional probability distributions.
Similar to marginal probability distributions, Definition 7.5 also readily
extends to more than two binary variables. However, the number of different
possibilities grows very quickly! For three binary events (𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 , 𝑜 2 = 𝑢 ),


we can construct
\[
\begin{aligned}
\Pr[o_0 = s \mid o_1 = t, o_2 = u] &= \frac{\Pr[o_0 = s, o_1 = t, o_2 = u]}{\Pr[o_1 = t, o_2 = u]} \quad \text{for } s = 0, 1, \\
\Pr[o_1 = t \mid o_0 = s, o_2 = u] &= \frac{\Pr[o_0 = s, o_1 = t, o_2 = u]}{\Pr[o_0 = s, o_2 = u]} \quad \text{for } t = 0, 1, \\
\Pr[o_2 = u \mid o_0 = s, o_1 = t] &= \frac{\Pr[o_0 = s, o_1 = t, o_2 = u]}{\Pr[o_0 = s, o_1 = t]} \quad \text{for } u = 0, 1,
\end{aligned}
\]

(condition on two events), but also

\[
\begin{aligned}
\Pr[o_0 = s, o_1 = t \mid o_2 = u] &= \frac{\Pr[o_0 = s, o_1 = t, o_2 = u]}{\Pr[o_2 = u]} \quad \text{for } s, t = 0, 1, \\
\Pr[o_0 = s, o_2 = u \mid o_1 = t] &= \frac{\Pr[o_0 = s, o_1 = t, o_2 = u]}{\Pr[o_1 = t]} \quad \text{for } s, u = 0, 1, \\
\Pr[o_1 = t, o_2 = u \mid o_0 = s] &= \frac{\Pr[o_0 = s, o_1 = t, o_2 = u]}{\Pr[o_0 = s]} \quad \text{for } t, u = 0, 1
\end{aligned}
\]

(condition on a single event).
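
All of these variants follow one recipe: keep the outcomes compatible with the conditioning event and divide by that event's (marginal) probability. The sketch below implements this recipe for outcome tuples of arbitrary length; the function name `conditional` and the parity example are hypothetical illustrations, and a vanishing denominator is reported as `None` rather than being divided by.

```python
from itertools import product

def conditional(joint, given):
    """Condition a joint distribution on fixed values for some positions.

    `given` maps positions to bit values, e.g. {2: 0} for o2 = 0. Returns a
    distribution over the remaining positions, or None if the conditioning
    event has probability zero.
    """
    matching = {o: p for o, p in joint.items()
                if all(o[i] == v for i, v in given.items())}
    denominator = sum(matching.values())  # marginal probability of the event
    if denominator == 0:
        return None
    free = [i for i in range(len(next(iter(joint)))) if i not in given]
    result = {}
    for o, p in matching.items():
        key = tuple(o[i] for i in free)
        result[key] = result.get(key, 0.0) + p / denominator
    return result

# Hypothetical example: o0 and o1 are independent fair bits, o2 = o0 XOR o1.
joint = {(s, t, u): (0.25 if u == (s ^ t) else 0.0)
         for s, t, u in product((0, 1), repeat=3)}

given_two = conditional(joint, {1: 1, 2: 0})      # Pr[o0 = s | o1 = 1, o2 = 0]
given_one = conditional(joint, {2: 0})            # Pr[o0 = s, o1 = t | o2 = 0]
impossible = conditional(joint, {0: 0, 1: 0, 2: 1})
print(given_two, given_one, impossible)
```

Each returned dictionary is again a normalized probability distribution, in line with the normalization argument given for Definition 7.5.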


Exercise 7.6 (Bayes’ theorem). Prove the following statement known as Bayes’
theorem:

\[
\Pr[o_1 = t \mid o_0 = s] = \frac{\Pr[o_0 = s \mid o_1 = t]\, \Pr[o_1 = t]}{\Pr[o_0 = s]}.
\]

Context: Bayes’ theorem highlights that the direction of conditional influences


can be inverted. As such, it plays a pivotal role in statistics.
Exercise 7.7 (Perfect correlations go both ways). Suppose that we have a joint
distribution of two binary variables that obey
\[
\Pr[o_1 = t \mid o_0 = s] =
\begin{cases}
1 & \text{if } s = t, \\
0 & \text{else if } s \neq t.
\end{cases}
\tag{7.1}
\]

In words: the value of 𝑜 0 completely determines the value of 𝑜 1 (perfect


correlation). Use Bayes’ rule to show that this also implies
\[
\Pr[o_0 = s \mid o_1 = t] =
\begin{cases}
1 & \text{if } s = t, \\
0 & \text{else if } s \neq t.
\end{cases}
\]

Is the converse direction also true? That is, does

Pr [𝑜 1 = 𝑡 |𝑜 0 = 𝑠 ] = Pr [𝑜 0 = 𝑠 |𝑜 1 = 𝑡 ]

necessarily imply perfect correlations in the sense of Eq. (7.1)?


7.2.3 Example 1: Bell state readout


Recall a basic Bell state preparation circuit, followed by reading out the two
qubits involved:

We already know from Lecture 3 that this state (vector) corresponds to

\[
|\psi_{\mathrm{Bell}}\rangle = \mathrm{CNOT}\,(H \otimes \mathbb{I})\,|00\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right).
\]

And, for readout values 𝑜 0 = 𝑠 and 𝑜 1 = 𝑡 with 𝑠 , 𝑡 = 0, 1, we obtain


\[
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = s, o_1 = t] =
\begin{cases}
1/2 & \text{if } s = t, \\
0 & \text{else if } s \neq t.
\end{cases}
\tag{7.2}
\]

Access to this joint probability distribution over a pair of events, allows us to


compute both marginal and conditional probabilities. Let us start with the
marginal probabilities. For the first outcome 𝑜 0 , we obtain
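\[
\begin{aligned}
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0] &= \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 0] + \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 1] = \tfrac{1}{2} + 0 = \tfrac{1}{2}, \\
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1] &= \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 0] + \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 1] = 0 + \tfrac{1}{2} = \tfrac{1}{2}.
\end{aligned}
\]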
(Alternatively, we could have also deduced the second line from the first one:
Pr |𝜓Bell ⟩ [𝑜 0 = 1] = 1 − Pr |𝜓Bell ⟩ [𝑜 0 = 0] = 1 − 1/2 = 1/2.) An almost identical
computation reveals the same marginal probabilities for the second outcome
𝑜1 :
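\[
\begin{aligned}
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = 0] &= \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 0] + \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 0] = \tfrac{1}{2} + 0 = \tfrac{1}{2}, \\
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = 1] &= \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 1] + \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 1] = 0 + \tfrac{1}{2} = \tfrac{1}{2}.
\end{aligned}
\]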
We can put everything together into a single formula that succinctly captures
the marginal probabilities of a Bell state readout procedure:

\[
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = s] = \Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = s] = \tfrac{1}{2} \quad \text{for } s = 0, 1.
\tag{7.3}
\]
This display highlights two things: (i) the two marginal probabilities are
identical and (ii) each marginal probability is equivalent to a uniformly random
bit (think: coin toss).
Let us now move on to determining the conditional probabilities. We start


with conditioning on 𝑜 1 = 0:

\[
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0 \mid o_1 = 0] = \frac{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 0]}{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = 0]} = \frac{1/2}{1/2} = 1,
\]

where we have inserted Eq. (7.2) for the numerator and Eq. (7.3) for the
denominator. In a similar fashion, we obtain

\[
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1 \mid o_1 = 0] = \frac{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 0]}{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = 0]} = \frac{0}{1/2} = 0.
\]

This confirms that these two conditional probabilities indeed form a valid
probability distribution. Both probabilities are non-negative and add up to one.
Conditioning on 𝑜 1 = 1 instead, tells a similar story, but with reversed roles:

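\[
\begin{aligned}
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0 \mid o_1 = 1] &= \frac{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 0, o_1 = 1]}{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = 1]} = \frac{0}{1/2} = 0, \\
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1 \mid o_1 = 1] &= \frac{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = 1, o_1 = 1]}{\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = 1]} = \frac{1/2}{1/2} = 1.
\end{aligned}
\]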

This is again a valid probability distribution over a single binary outcome. In


fact, we can combine both into a single formula:
\[
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_0 = s \mid o_1 = t] =
\begin{cases}
1 & \text{if } s = t, \\
0 & \text{else if } s \neq t.
\end{cases}
\tag{7.4}
\]

This display highlights another feature of Bell state readout: the first outcome
bit is perfectly correlated with the second outcome bit: 𝑜 0 = 𝑜 1 . It should not
come as a surprise that this perfect correlation persists if we exchange the roles
of the two outcome bits. We leave this as an instructive exercise.
Exercise 7.8 (Bell readout probabilities conditioned on the first outcome bit 𝑜 0 ). Use
Eq. (7.2) and Eq. (7.3) to derive the following conditional probability distributions:

\[
\Pr_{|\psi_{\mathrm{Bell}}\rangle}[o_1 = t \mid o_0 = s] =
\begin{cases}
1 & \text{if } t = s, \\
0 & \text{else if } t \neq s.
\end{cases}
\]

7.2.4 Example 2: Drawing straws


You and four of your friends have just had dinner and now have to decide who
does the dishes. To make the decision fair, all of you agree to decide by drawing
straws. There are 5 straws, 1 of which is considerably shorter than
the other 4. They are mixed in a hat or beanie, and one person at a time
draws one straw from the hat without looking and without putting it back.
The person drawing the short straw loses and has to do the
dishes; everybody else wins.
This can be described as a joint probability distribution over 5 binary events


𝑜0 , 𝑜1 , 𝑜2 , 𝑜3 , 𝑜4 ∈ {0, 1} – one for each player of the game.
We set 𝑜𝑘 = 1 if player 𝑘 draws the short straw (and loses) and 𝑜𝑘 = 0
otherwise. The associated probability distribution then becomes

\[
\Pr[o_0 = s, o_1 = t, o_2 = u, o_3 = v, o_4 = w] =
\begin{cases}
1/5 & \text{if } s + t + u + v + w = 1, \\
0 & \text{else,}
\end{cases}
\]

for 𝑠 , 𝑡 , 𝑢, 𝑣 , 𝑤 ∈ {0, 1}. Access to this distribution allows us to compute the


marginal probabilities for 𝑜 0 only:
\[
\begin{aligned}
\Pr[o_0 = 1] &= \sum_{t,u,v,w=0}^{1} \Pr[o_0 = 1, o_1 = t, o_2 = u, o_3 = v, o_4 = w] \\
&= 1/5 + 0 + \cdots + 0 = 1/5.
\end{aligned}
\]

This readily allows us to conclude that Pr [𝑜 0 = 0] = 1 − Pr [𝑜 0 = 1] = 1 − 1/5 =


4/5. A computation of the other four marginal probabilities looks very similar
and produces exactly the same results (why?). We can therefore succinctly
write

Pr [𝑜𝑘 = 1] = 1/5 and Pr [𝑜𝑘 = 0] = 4/5 for all 𝑘 = 0, 1, 2, 3, 4.

This ensures that the drawing of the straws is fair: each player has the same
odds of winning/losing.
However, all five players draw from the same hat. This introduces depen-
dencies between the individual scores of the players involved. Conditional
probabilities are the proper way to reason about these effects which become
most pronounced if we look at the score of the last player conditional on the
scores of all players before them. For instance,

\[
\begin{aligned}
\Pr[o_4 = 1 \mid o_3 = 0, o_2 = 0, o_1 = 0, o_0 = 0]
&= \frac{\Pr[o_0 = 0, o_1 = 0, o_2 = 0, o_3 = 0, o_4 = 1]}{\Pr[o_0 = 0, o_1 = 0, o_2 = 0, o_3 = 0]} \\
&= \frac{\Pr[o_0 = 0, o_1 = 0, o_2 = 0, o_3 = 0, o_4 = 1]}{\Pr[o_0 = 0, o_1 = 0, o_2 = 0, o_3 = 0, o_4 = 0] + \Pr[o_0 = 0, o_1 = 0, o_2 = 0, o_3 = 0, o_4 = 1]} \\
&= \frac{1/5}{0 + 1/5} = 1.
\end{aligned}
\]

And, in a similar fashion, we can conclude

Pr [𝑜 4 = 0 |𝑜 3 = 0, 𝑜 2 = 0, 𝑜 1 = 0, 𝑜 0 = 0] = 0.

Note that this final conditional probability distribution for 𝑜 4 is actually deter-
ministic: player five is guaranteed to lose if players one, two, three, and four all
win. A converse of this observation is also true. Suppose, for concreteness, that
player three loses, i.e. 𝑜 2 = 1. Then, player five has no chance of also losing. In
formulas,

\[
\begin{aligned}
\Pr[o_4 = 1 \mid o_3 = 0, o_2 = 1, o_1 = 0, o_0 = 0] &= 0, \\
\Pr[o_4 = 0 \mid o_3 = 0, o_2 = 1, o_1 = 0, o_0 = 0] &= 1
\end{aligned}
\]

and we leave the actual derivation as a quick exercise. By now it should not
come as a surprise that the same is true if any of the other players already lost.
We can succinctly summarize our insights in the following display

\[
\Pr\Big[ o_4 = 1 \,\Big|\, \sum_{k=0}^{3} o_k = 0 \Big] = 1
\quad \text{while} \quad
\Pr\Big[ o_4 = 1 \,\Big|\, \sum_{k=0}^{3} o_k = 1 \Big] = 0.
\]

In words: player five must lose (𝑜 4 = 1) if everybody else wins before and they
must win (𝑜 4 = 0) if somebody else has already lost.

Figure 7.1 𝑻 -gate teleportation: This quantum subroutine effectively acts on a
single qubit (lower line). Apart from a single 𝑇 -gate (red), it only features
Clifford operations (𝑺 , 𝑿 and 𝑪 𝑵 𝑶𝑻 ) and the readout of the top qubit.
Interestingly, this subroutine is equivalent to applying a 𝑻 -gate on the second
qubit, i.e. |𝜓out ⟩ = 𝑻 |𝜓in ⟩ .
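
The drawing-straws example of Section 7.2.4 is small enough to enumerate completely. The following sketch builds the joint distribution over all 2⁵ outcome tuples with exact fractions and recomputes the marginal and conditional probabilities discussed in that example; the helper `prob` is our own naming.

```python
from fractions import Fraction
from itertools import product

# Joint distribution: exactly one of the five players holds the short straw.
joint = {o: (Fraction(1, 5) if sum(o) == 1 else Fraction(0))
         for o in product((0, 1), repeat=5)}

def prob(event):
    """Total probability of all outcomes o for which event(o) is true."""
    return sum(p for o, p in joint.items() if event(o))

# Marginals: every player loses with the same probability.
marginals = [prob(lambda o, k=k: o[k] == 1) for k in range(5)]

# Last player (o4), conditioned on what happened in the four earlier draws.
all_won = lambda o: sum(o[:4]) == 0
one_lost = lambda o: sum(o[:4]) == 1
p_lose_if_all_won = prob(lambda o: all_won(o) and o[4] == 1) / prob(all_won)
p_lose_if_one_lost = prob(lambda o: one_lost(o) and o[4] == 1) / prob(one_lost)

print(marginals, p_lose_if_all_won, p_lose_if_one_lost)
```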

7.3 Quantum 𝑇 -gate teleportation


The concept of conditional probabilities is vital to analyze quantum subroutines
that combine unitary (quantum) gates with partial qubit readout and conditional
operations. Fig. 7.1 displays one such subroutine that plays a very prominent
role in fault-tolerant quantum computation. It also highlights that quantum
circuits are really different from conventional circuit architectures [GC99].
The circuit in Fig. 7.1 contains notation we haven’t seen before. In particular,
a double line emanates from the readout symbol and enters a quantum gate
box. This depicts a conditional gate application that depends on the readout bit
𝑜0 we observe:
(i) if 𝑜 0 = 0, we do nothing (i.e. apply the identity 𝕀 to the remaining qubit),
(ii) else if 𝑜 0 = 1, we apply the gate 𝑺 𝑿 to the remaining qubit.

The subroutine depicted in Fig. 7.1 takes a single qubit as input and also
outputs a single qubit. It, therefore, corresponds to an effective single-qubit
operation that we can execute if we have a quantum computer with (at least)
two qubits. The main result of this section highlights that this effective operation
is equivalent to one we already know.
Theorem 7.9 (T-gate teleportation). The effective single-qubit subroutine
displayed in Fig. 7.1 is equivalent to applying a single-qubit 𝑇 -gate:
|𝜓 ⟩ ↦→ 𝑻 |𝜓 ⟩ for every input |𝜓 ⟩ .

Before moving on to a step-by-step analysis, it is worthwhile to suggestively
rewrite part of the $T$-gate teleportation circuit. To this end, we define

\[
|T\rangle = T H |0\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + e^{i\pi/4} |1\rangle \right),
\]

which is known as a magic state. Theorem 7.9 then tells us that we can use
one such magic state to effectively apply a single $T$-gate to another (arbitrary)
qubit. In pictures,

[circuit diagram]

and this version of the protocol is known as magic state injection. It only
features Clifford gates ($S$, $H$, $\mathrm{CNOT}$ and $X = H S^2 H$), as well as one
single-qubit readout operation. Magic state injection does, however, convert access
to one magic state $|T\rangle$ into an effective application of a $T$-gate, which is not a
Clifford gate. This trick will become important once we discuss quantum error
correction and fault-tolerant quantum computation.
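
The two identities used above can be checked numerically. The following is a minimal sketch, assuming NumPy and the gate matrices for $H$, $S$ and $T$ from these notes; the variable names are illustrative only.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])
X = np.array([[0, 1], [1, 0]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)

# Magic state |T> = T H |0>, compared against its closed form
magic = T @ H @ ket0
expected = np.array([1, np.exp(1j * np.pi / 4)]) / np.sqrt(2)

# X written with Clifford generators only: X = H S^2 H
X_from_clifford = H @ S @ S @ H

print(np.allclose(magic, expected))
print(np.allclose(X_from_clifford, X))
```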
Let us now move on to actually prove Theorem 7.9. Theorem 7.1 tells us
that it is enough to show

for an arbitrary single-qubit unitary $U$ and an arbitrary single-qubit state vector
$|\psi\rangle$. The readout probabilities for the r.h.s. are now simply

\[
\Pr\nolimits_{U T |\psi\rangle} [o_0 = t] = |\langle t | U T | \psi \rangle|^2 \quad \text{for } t = 0, 1. \tag{7.5}
\]

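The other side requires quite a bit more work. But we have all the necessary
prerequisites to analyze it as well. Let us start with tracking the two-qubit state
vector throughout the quantum circuit. We can write $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ with
$\alpha, \beta \in \mathbb{C}$ and $|\alpha|^2 + |\beta|^2 = 1$ and obtain the following effective starting state:

\[
\begin{aligned}
|\varphi_0\rangle &= |T\rangle \otimes |\psi\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + e^{i\pi/4}|1\rangle \right) \otimes \left( \alpha|0\rangle + \beta|1\rangle \right) \\
&= \frac{\alpha}{\sqrt{2}}|00\rangle + \frac{\beta}{\sqrt{2}}|01\rangle + \frac{\alpha e^{i\pi/4}}{\sqrt{2}}|10\rangle + \frac{\beta e^{i\pi/4}}{\sqrt{2}}|11\rangle.
\end{aligned}
\]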

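Next, we apply two CNOT-gates with different control qubits. On a bit logic
level, they act as

\[
\mathrm{CNOT}_{2\to 1}|s,t\rangle = |s \oplus t, t\rangle \quad \text{and} \quad \mathrm{CNOT}_{1\to 2}|s,t\rangle = |s, t \oplus s\rangle
\]

for $s, t = 0, 1$ and we obtain

\[
\begin{aligned}
|\varphi_1\rangle &= \mathrm{CNOT}_{2\to 1}|\varphi_0\rangle = \frac{\alpha}{\sqrt{2}}|00\rangle + \frac{\beta}{\sqrt{2}}|11\rangle + \frac{\alpha e^{i\pi/4}}{\sqrt{2}}|10\rangle + \frac{\beta e^{i\pi/4}}{\sqrt{2}}|01\rangle \quad \text{and} \\
|\varphi_2\rangle &= \mathrm{CNOT}_{1\to 2}|\varphi_1\rangle = \frac{\alpha}{\sqrt{2}}|00\rangle + \frac{\beta}{\sqrt{2}}|10\rangle + \frac{\alpha e^{i\pi/4}}{\sqrt{2}}|11\rangle + \frac{\beta e^{i\pi/4}}{\sqrt{2}}|01\rangle.
\end{aligned}
\]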
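We can use elementary (state) vector and Kronecker operations to rewrite this
state suggestively as

\[
\begin{aligned}
|\varphi_2\rangle &= \frac{1}{\sqrt{2}}|0\rangle \otimes \left( \alpha|0\rangle + e^{i\pi/4}\beta|1\rangle \right) + \frac{e^{-i\pi/4}}{\sqrt{2}}|1\rangle \otimes \left( e^{i\pi/4}\beta|0\rangle + e^{i\pi/2}\alpha|1\rangle \right) \\
&= \frac{1}{\sqrt{2}}|0\rangle \otimes \left( T(\alpha|0\rangle + \beta|1\rangle) \right) + \frac{e^{-i\pi/4}}{\sqrt{2}}|1\rangle \otimes \left( S\left( e^{i\pi/4}\beta|0\rangle + \alpha|1\rangle \right) \right) \\
&= \frac{1}{\sqrt{2}}|0\rangle \otimes \left( T|\psi\rangle \right) + \frac{e^{-i\pi/4}}{\sqrt{2}}|1\rangle \otimes \left( S X \left( \alpha|0\rangle + e^{i\pi/4}\beta|1\rangle \right) \right) \\
&= \frac{1}{\sqrt{2}}|0\rangle \otimes \left( T|\psi\rangle \right) + \frac{e^{-i\pi/4}}{\sqrt{2}}|1\rangle \otimes \left( S X T|\psi\rangle \right). \tag{7.6}
\end{aligned}
\]

The second line uses $e^{i\pi/2} = i$, which is exactly the relative phase that $S = \mathrm{diag}(1, i)$ imprints on $|1\rangle$.

This reformulation of the state $|\varphi_2\rangle$ already tells us quite a bit about the
quantum logical configuration just before reading out the first qubit. It is a
superposition of two distinct contributions: one for each classical readout value
associated with the first qubit. If the readout value is $o_0 = 0$, we don’t do
anything to the remaining qubit and obtain

\[
\begin{aligned}
\Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_{\mathrm{final}}\rangle}[o_0 = 0, o_1 = t] &= \Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_2\rangle}[o_0 = 0, o_1 = t] \\
&= \left| \langle 0t| \left( \frac{1}{\sqrt{2}}|0\rangle \otimes (U T|\psi\rangle) + \frac{e^{-i\pi/4}}{\sqrt{2}}|1\rangle \otimes (U S X T|\psi\rangle) \right) \right|^2 \\
&= \frac{1}{2}|\langle t|U T|\psi\rangle|^2.
\end{aligned}
\]

Else if $o_0 = 1$, we apply $S X$ to the second qubit. Because $(S X)^2 = i\,\mathbb{I}$, this undoes the unwanted $S X$ in Eq. (7.6) up to a global phase, and we obtain

\[
\begin{aligned}
\Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_{\mathrm{final}}\rangle}[o_0 = 1, o_1 = t] &= \Pr\nolimits_{(\mathbb{I}\otimes U)(\mathbb{I}\otimes (S X))|\varphi_2\rangle}[o_0 = 1, o_1 = t] \\
&= \left| \langle 1t| \left( \frac{1}{\sqrt{2}}|0\rangle \otimes (U S X T|\psi\rangle) + \frac{e^{-i\pi/4}}{\sqrt{2}}|1\rangle \otimes (U S X S X T|\psi\rangle) \right) \right|^2 \\
&= \frac{1}{2}|\langle t|U S X S X T|\psi\rangle|^2 = \frac{1}{2}|i\,\langle t|U T|\psi\rangle|^2 = \frac{1}{2}|\langle t|U T|\psi\rangle|^2.
\end{aligned}
\]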
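These two computations nail down all four joint probabilities $\Pr[o_0 = s, o_1 = t]$ for $s, t = 0, 1$.
Marginalization then implies

\[
\begin{aligned}
\Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_{\mathrm{final}}\rangle}[o_0 = 0] &= \frac{1}{2}|\langle 0|U T|\psi\rangle|^2 + \frac{1}{2}|\langle 1|U T|\psi\rangle|^2 = \frac{1}{2}, \\
\Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_{\mathrm{final}}\rangle}[o_0 = 1] &= \frac{1}{2}|\langle 0|U T|\psi\rangle|^2 + \frac{1}{2}|\langle 1|U T|\psi\rangle|^2 = \frac{1}{2},
\end{aligned}
\]

and the conditional outcome probabilities become

\[
\begin{aligned}
\Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_{\mathrm{final}}\rangle}[o_1 = t \,|\, o_0 = 0] &= \frac{(1/2)\,|\langle t|U T|\psi\rangle|^2}{1/2} = |\langle t|U T|\psi\rangle|^2, \\
\Pr\nolimits_{(\mathbb{I}\otimes U)|\varphi_{\mathrm{final}}\rangle}[o_1 = t \,|\, o_0 = 1] &= \frac{(1/2)\,|\langle t|U T|\psi\rangle|^2}{1/2} = |\langle t|U T|\psi\rangle|^2.
\end{aligned}
\]

In words: each conditional readout probability is equivalent to reading out the
single-qubit state $U T|\psi\rangle$ instead. Also, this is valid for any input state $|\psi\rangle$
and any subsequent unitary $U$. This allows us to conclude that Theorem 7.9 is
valid as stated.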
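
The argument above can also be sanity-checked numerically. The following is a minimal sketch, assuming NumPy, the qubit ordering $|s, t\rangle = \boldsymbol{e}_{2s+t}$, and the conditional correction from case (ii) taken as the matrix product $S X$ (first $X$, then $S$). The function name `t_gate_teleportation` is hypothetical.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])
P0, P1 = np.diag([1, 0]), np.diag([0, 1])   # |0><0| and |1><1|

# Two-qubit CNOTs in the ordering |s, t> = e_{2s + t}
CNOT_12 = np.kron(P0, I2) + np.kron(P1, X)  # control: qubit 1, target: qubit 2
CNOT_21 = np.kron(I2, P0) + np.kron(X, P1)  # control: qubit 2, target: qubit 1


def t_gate_teleportation(psi, outcome):
    """Return (probability of o_0 = outcome, corrected state of the second qubit)."""
    magic = T @ H @ np.array([1, 0], dtype=complex)
    phi2 = CNOT_12 @ CNOT_21 @ np.kron(magic, psi)
    branch = phi2.reshape(2, 2)[outcome]     # unnormalized second-qubit state given o_0
    prob = np.vdot(branch, branch).real
    post = branch / np.sqrt(prob)
    if outcome == 1:
        post = S @ X @ post                  # conditional correction: X first, then S
    return prob, post


rng = np.random.default_rng(seed=0)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

results = [t_gate_teleportation(psi, o) for o in (0, 1)]
for o, (prob, post) in enumerate(results):
    overlap = abs(np.vdot(T @ psi, post))    # equals 1 iff post = T|psi> up to a global phase
    print(o, round(prob, 6), round(overlap, 6))
```

Comparing via the absolute overlap deliberately ignores global phases, which do not affect readout probabilities.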

7.4 Quantum state teleportation


We are now ready for the main topic of today’s lecture: quantum state tele-
portation. The protocol dates back to 1993 [Ben+93], but we present it in a
more modern framework – as another quantum circuit black box that has one
incoming qubit wire and one outgoing qubit wire, see Fig. 7.2.

Theorem 7.10 (correctness of quantum teleportation). The teleportation subroutine
in Fig. 7.2 acts like an effective one-qubit operation. Every input state on
the top right gets exactly transferred to the bottom left, i.e. $|\psi_{\mathrm{out}}\rangle = |\psi_{\mathrm{in}}\rangle$.

We provide a justification of the name ‘teleportation’ protocol in Fig. 7.3.


Figure 7.2 Quantum teleportation subroutine: An arbitrary quantum state enters
this subroutine on the top right (first qubit wire). Once the protocol is completed,
the same state leaves the lower left corner (third qubit wire), i.e. $|\psi_{\mathrm{out}}\rangle = |\psi_{\mathrm{in}}\rangle$.
This is remarkable because there seems to be no quantum logical connection
between the first and the last qubit.

Let us now move on to provide a proof sketch for Theorem 7.10. Similar to
$T$-gate teleportation, we build our arguments on the (sufficient) conditions
on subroutine equivalence from Theorem 7.1. Concretely, we fix an arbitrary
single-qubit state $|\psi\rangle$, an arbitrary single-qubit unitary $U$ and set out to show

[circuit diagram] (7.7)

The equality sign here indicates that the outcome probabilities associated with
𝑜2 ∈ {0, 1} (left) and 𝑜 ∈ {0, 1} (right) must be identical. We have also
already streamlined this display a bit by incorporating the two-qubit Bell state.
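This readily allows us to write down the actual 3-qubit starting state. For
$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ (with $\alpha, \beta \in \mathbb{C}$ and $|\alpha|^2 + |\beta|^2 = 1$), we obtain

\[
\begin{aligned}
|\varphi_0\rangle &= |\psi\rangle \otimes |\psi_{\mathrm{Bell}}\rangle = (\alpha|0\rangle + \beta|1\rangle) \otimes \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) \\
&= \frac{\alpha}{\sqrt{2}}|000\rangle + \frac{\alpha}{\sqrt{2}}|011\rangle + \frac{\beta}{\sqrt{2}}|100\rangle + \frac{\beta}{\sqrt{2}}|111\rangle.
\end{aligned}
\]

The CNOT between qubits 1 and 2 turns this state into

\[
\begin{aligned}
|\varphi_1\rangle &= (\mathrm{CNOT}_{1\to 2} \otimes \mathbb{I})|\varphi_0\rangle \\
&= \frac{\alpha}{\sqrt{2}}(\mathrm{CNOT}_{1\to 2} \otimes \mathbb{I})|000\rangle + \frac{\alpha}{\sqrt{2}}(\mathrm{CNOT}_{1\to 2} \otimes \mathbb{I})|011\rangle \\
&\quad + \frac{\beta}{\sqrt{2}}(\mathrm{CNOT}_{1\to 2} \otimes \mathbb{I})|100\rangle + \frac{\beta}{\sqrt{2}}(\mathrm{CNOT}_{1\to 2} \otimes \mathbb{I})|111\rangle \\
&= \frac{\alpha}{\sqrt{2}}|000\rangle + \frac{\alpha}{\sqrt{2}}|011\rangle + \frac{\beta}{\sqrt{2}}|110\rangle + \frac{\beta}{\sqrt{2}}|101\rangle
\end{aligned}
\]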
Figure 7.3 Interpretation as a quantum state ‘teleportation’ protocol: This inter-
pretation stems from the observation that part of this quantum circuit can be
prepared in advance. It is easy to recognize a Bell state preparation circuit at
the beginning of qubits 2 and 3. Inserting it provides the following suggestive
reformulation where we have artificially elongated the wires of qubit 2 and
qubit 3. This highlights that the actual generation of the Bell state between
qubit 2 and 3 can actually lie in the past, i.e. it occurred a long time before
the actual state |𝜓 ⟩ enters the picture. This extra time can, in principle, be
spent on moving the two parts of the Bell state (qubit 2 and qubit 3) very far
away from each other. The state teleportation subroutine then uses existing
entanglement (Bell state) between two very distant locations to perfectly
transmit a single-qubit state |𝜓 ⟩ from one location (Alice’s side, aka qubits 1
and 2) to a completely different location (Bob’s side, aka qubit 3).
Note, however, that this protocol only works as intended if Alice communicates
her readout values to Bob and Bob uses them to apply conditional quantum
gates. If this is not the case, the protocol produces complete garbage. This
subtle feature reconciles state teleportation with Einstein’s postulate that no
‘information’ can move faster than the speed of light.
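
and a Hadamard gate on the first qubit produces

\[
\begin{aligned}
|\varphi_2\rangle &= (H \otimes \mathbb{I} \otimes \mathbb{I})|\varphi_1\rangle \\
&= \frac{\alpha}{\sqrt{2}}|{+}00\rangle + \frac{\alpha}{\sqrt{2}}|{+}11\rangle + \frac{\beta}{\sqrt{2}}|{-}10\rangle + \frac{\beta}{\sqrt{2}}|{-}01\rangle \\
&= \frac{\alpha}{2}|000\rangle + \frac{\alpha}{2}|100\rangle + \frac{\alpha}{2}|011\rangle + \frac{\alpha}{2}|111\rangle \\
&\quad + \frac{\beta}{2}|010\rangle - \frac{\beta}{2}|110\rangle + \frac{\beta}{2}|001\rangle - \frac{\beta}{2}|101\rangle.
\end{aligned}
\]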
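Written as is, this final 3-qubit state looks rather complicated. However, an
interesting structure reveals itself if we start grouping the amplitudes in terms
of the possible outcome bits for qubit 1 ($o_0$) and qubit 2 ($o_1$):

\[
\begin{aligned}
|\varphi_2\rangle &= \frac{1}{2}|00\rangle \otimes (\alpha|0\rangle + \beta|1\rangle) + \frac{1}{2}|01\rangle \otimes (\alpha|1\rangle + \beta|0\rangle) \\
&\quad + \frac{1}{2}|10\rangle \otimes (\alpha|0\rangle - \beta|1\rangle) + \frac{1}{2}|11\rangle \otimes (\alpha|1\rangle - \beta|0\rangle) \\
&= \frac{1}{2}|00\rangle \otimes (|\psi\rangle) + \frac{1}{2}|01\rangle \otimes (X|\psi\rangle) \\
&\quad + \frac{1}{2}|10\rangle \otimes (Z|\psi\rangle) + \frac{1}{2}|11\rangle \otimes (X Z|\psi\rangle),
\end{aligned}
\]

where we have used $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, $X|\psi\rangle = \alpha|1\rangle + \beta|0\rangle$, $Z|\psi\rangle = \alpha|0\rangle - \beta|1\rangle$
and $X Z|\psi\rangle = \alpha|1\rangle - \beta|0\rangle$. This regrouping tells an interesting story that
comes in four parts: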

(i) If $o_0 = 0$, $o_1 = 0$, the third qubit must be in the state $|\psi\rangle$. Conditioned
on these two readout outcomes, the protocol works perfectly and exactly
transmits $|\psi\rangle$ from the first qubit wire to the third one.
(ii) Else if 𝑜 0 = 0, 𝑜 1 = 1, the third qubit must be in the state 𝑿 |𝜓 ⟩ . This is
not a perfect state transmission from qubit one to qubit three, but the
superfluous bit flip 𝑿 can be undone. Conditioned on 𝑜 1 = 1, we apply
an additional 𝑿 -gate to qubit three to also recover 𝑿 × 𝑿 |𝜓 ⟩ = |𝜓 ⟩
perfectly.
(iii) Else if 𝑜 0 = 1, 𝑜 1 = 0, the third qubit must be in the state 𝒁 |𝜓 ⟩ . This
is not a perfect state transmission from qubit one to qubit three, but
the superfluous sign flip 𝒁 can be undone. Conditioned on 𝑜 0 = 1, we
apply an additional 𝒁 -gate to qubit three to also recover 𝒁 × 𝒁 |𝜓 ⟩ = |𝜓 ⟩
perfectly.
(iv) Else if $o_0 = 1$, $o_1 = 1$, the third qubit must be in the state $X Z|\psi\rangle$. This is
essentially a combination of cases (ii) and (iii). Conditional application of
both $Z$ (because $o_0 = 1$) and $X$ (because $o_1 = 1$) recovers the
state perfectly, provided that we apply these gates in the correct order.
Doing $X$ first (further right) and $Z$ second (further left) ensures that the
resulting state is $Z \times X \times X \times Z|\psi\rangle = Z \times Z|\psi\rangle = |\psi\rangle$.

Understanding these four cases is enough to complete a rigorous proof of
Theorem 7.10. We have just shown that the conditional applications of $X$
(associated with $o_1 = 1$) and $Z$ (associated with $o_0 = 1$) produce a teleportation
output that is always exactly equal to the teleportation input state $|\psi\rangle$. The
final unitary $U$ in Eq. (7.7) turns this state into $U|\psi\rangle$ just before the final
readout. This readout procedure is therefore equivalent to the right-hand side
(simply prepare $U|\psi\rangle$ and perform the readout).
Note, however, that our analysis above is merely a proof sketch and not yet a
complete proof. Turning it into one requires a proper treatment via conditional
readout probabilities that is similar to Section 7.3. We leave it as an instructive
exercise that may be very relevant for the written exam.
Exercise 7.11 (complete proof of Theorem 7.10). Flesh out this proof sketch into a
complete proof of correctness for quantum state teleportation. See Prob. 7.16
for a detailed outline.
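
A numerical check can complement the proof sketch, although it does not replace the exercise. The following is a minimal sketch, assuming NumPy and the qubit ordering $|b_0 b_1 b_2\rangle = \boldsymbol{e}_{4 b_0 + 2 b_1 + b_2}$. It follows cases (i) to (iv): $X$ is applied if $o_1 = 1$, then $Z$ if $o_0 = 1$. The function name `teleport` is hypothetical.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
CNOT_12 = np.kron(np.diag([1, 0]), I2) + np.kron(np.diag([0, 1]), X)


def teleport(psi, o0, o1):
    """Return (probability of readout (o0, o1), Bob's corrected qubit)."""
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    phi0 = np.kron(psi, bell)                  # qubit 1 carries |psi>
    phi1 = np.kron(CNOT_12, I2) @ phi0         # CNOT between qubits 1 and 2
    phi2 = np.kron(H, np.eye(4)) @ phi1        # Hadamard on qubit 1
    branch = phi2.reshape(2, 2, 2)[o0, o1]     # unnormalized third-qubit state
    prob = np.vdot(branch, branch).real
    bob = branch / np.sqrt(prob)
    if o1 == 1:
        bob = X @ bob                          # undo the bit flip first
    if o0 == 1:
        bob = Z @ bob                          # then undo the sign flip
    return prob, bob


rng = np.random.default_rng(seed=0)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

outcomes = {(o0, o1): teleport(psi, o0, o1) for o0 in (0, 1) for o1 in (0, 1)}
for key, (prob, bob) in outcomes.items():
    print(key, round(prob, 6), np.allclose(bob, psi))
```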
Problems
Problem 7.12 (Proof of Theorem 7.1). Consider two single-qubit gate matrices
𝑨, 𝑩 (unitaries). Show that they must be equivalent (i.e. 𝑩 = ei 𝜑 𝑨 for some
𝜑 ∈ [ 0, 2𝜋) ) if the following equality is true for all input states |𝜓 ⟩ and all
subsequent unitary gates 𝑼 :
Pr𝑼 𝑩 |𝜓 ⟩ [𝑜 = 𝑠 ] = Pr𝑼 𝑨 |𝜓 ⟩ [𝑜 = 𝑠 ] for 𝑠 = 0, 1.
Challenging bonus question: is it really necessary to consider all possible
input states, as well as all possible unitaries?
Problem 7.13 (Bayes’ theorem). Prove the following statement known as Bayes’
theorem:
\[
\Pr[o_1 = b \,|\, o_0 = a] = \frac{\Pr[o_0 = a \,|\, o_1 = b] \, \Pr[o_1 = b]}{\Pr[o_0 = a]}.
\]
Context: Bayes’ theorem highlights that the direction of correlations can be
inverted. As such, it plays a pivotal role in statistics.
Problem 7.14 (Drawing straws revisited). Recall the drawing straws scenario,
Ex. 7.2.4. What happens if everyone puts their straw back into the hat
after their turn? What would the probability of winning or losing be? Would it
change after every turn? Justify your findings with the mathematical formulas
developed in this lecture.
Problem 7.15 (Perfect correlations go both ways). Suppose that we have a joint
distribution of two binary variables that obey
\[
\Pr[o_1 = t \,|\, o_0 = s] = \begin{cases} 1 & \text{if } s = t, \\ 0 & \text{else if } s \neq t. \end{cases} \tag{7.8}
\]
In words: the value of $o_0$ completely determines the value of $o_1$ (perfect
correlation). Use Bayes’ rule to show that this also implies
\[
\Pr[o_0 = s \,|\, o_1 = t] = \begin{cases} 1 & \text{if } s = t, \\ 0 & \text{else if } s \neq t. \end{cases}
\]
Is the converse direction also true? That is, does
\[
\Pr[o_1 = t \,|\, o_0 = s] = \Pr[o_0 = s \,|\, o_1 = t]
\]
necessarily imply perfect correlations in the sense of Eq. (7.8)?
Problem 7.16 (proof of correctness for quantum state teleportation). Consider the
following two quantum circuits

[circuit diagram]

where $U$ is an arbitrary single-qubit gate and $|\psi\rangle$ is an arbitrary single-qubit
input state (vector).

1 Write down the readout probabilities of the left-hand-circuit, i.e.

Pr𝑼 |𝜓 ⟩ [𝑜 = 𝑢] for 𝑢 = 0, 1.

2 Compute all joint readout probabilities of the final right-hand side state:
Pr | 𝜑 final ⟩ [𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 , 𝑜 2 = 𝑢] for 𝑠 , 𝑡 , 𝑢 = 0, 1.
3 Use your result from 2 to derive the marginal probabilities for readout
bits one and two, i.e. Pr | 𝜑 final ⟩ [𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 ] for 𝑠 , 𝑡 = 0, 1.
4 Combine your results from 2 and 3 to compute all conditional probabilities
Pr | 𝜑 final ⟩ [𝑜 2 = 𝑢 |𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 ] . Conclude that

Pr | 𝜑 final ⟩ [𝑜 2 = 𝑢 |𝑜 0 = 𝑠 , 𝑜 1 = 𝑡 ] = Pr𝑼 |𝜓 ⟩ [𝑜 = 𝑢] for 𝑢 = 0, 1,

regardless of the bit values for 𝑜 0 and 𝑜 1 . In other words: the state
teleportation always operates as intended!
5 Suppose that Bob becomes impatient and performs a readout on his qubit
before receiving the readout values of Alice (and before performing the
conditional corrections). Then, the full teleportation protocol is cut short
and effectively becomes

[circuit diagram]

Compute the marginal probability distribution for Bob’s readout of the
third qubit: $\Pr\nolimits_{|\tilde{\varphi}_{\mathrm{final}}\rangle}[o_2 = u]$ for $u = 0, 1$. Argue that this readout
probability distribution does not contain any information about Alice’s
input state $|\psi\rangle$ whatsoever.
Context: this observation resolves an apparent conflict between quantum
state teleportation and the widespread belief that ‘nothing’ can propagate
faster than the speed of light. According to the rules of quantum
computing, the teleportation of |𝜓 ⟩ from qubit one to qubit three happens
instantly – regardless of the distance spanned by the initial Bell state.
However, the readout values on Alice’s side (𝑜 0 and 𝑜 1 ) do affect the
teleported state in a very particular fashion. If not properly undone, this
‘washes out’ all information about |𝜓 ⟩ . In other words: the teleported
state is useless for Bob until he receives Alice’s readout values for 𝑜 0 and
𝑜1 . This information, however, is classical and can only travel at the speed
of light.
Problem 7.17 (quantum repeaters). Consider the following quantum circuit that
involves 4 qubits and two partial measurements on qubit 2 and 3:

1 Compute the two-qubit output state 𝜌 out (𝑜0 , 𝑜1 ) for the special case
where 𝑜 0 , 𝑜 1 = 0. Can you recognize it?
2 What is the probability of obtaining 𝑜0 = 𝑜1 = 0 when performing the
partial measurement?
3 Argue that this circuit actually encompasses a quantum repeater for
spreading entanglement across larger distances. But, the way we have
set it up is probabilistic. The entanglement exchange protocol only works
with a certain success probability (which one?).
4 Optional: do a full analysis that applies to all possible measurement
outcomes 𝑜 0 , 𝑜 1 ∈ {0, 1}. Can you correct the protocol (using 𝑜 0 and 𝑜 1 )
such that it is guaranteed to work in a deterministic fashion?
8. General 𝑛 -qubit architectures

Date: 13 November 2024

8.1 General 𝑛 -qubit architectures

Agenda:
1 general $n$-qubit circuits
2 strong & weak (classical) simulation
3 implementing classical circuits on quantum hardware
4 synopsis

Today, we make a substantial jump: we transition from few-qubit architectures
(1, 2 or 3 qubits) to large-scale architectures that contain $n \gg 1$ qubits.
Conceptually, we are well-prepared for this increase in complexity. We already
know all the relevant concepts, like qubit initialization at the beginning and
qubit readout at the very end. The quantum circuits in between are also
combinations of elementary 1- and 2-qubit gates that we already know. We
refer to Fig. 8.1 for a visualization¹.

Figure 8.1 A general $n$-qubit architecture has $n$ input bits $b_0, \ldots, b_{n-1}$, a central
block of quantum logic and a final readout stage that recovers $n$ bits $o_0, \ldots, o_{n-1}$.
The central block consists solely of elementary quantum gates, e.g. $H$, $T$, $S$
and $\mathrm{CNOT}$. The number $s$ of elementary quantum gates is called the size of
the quantum circuit.


We are going to explore the fundamental possibilities of such large quantum
architectures. We will see that simulation on classical hardware is possible, but
does come with an exponential overhead (in the number of qubits 𝑛 ). Conversely,
we can use a hypothetical quantum architecture to execute any classical Boolean
circuit with only a linear overhead. To paraphrase: quantum architectures
are never much worse than conventional hardware. But, conversely, building
them might unlock exponential improvements in terms of (conventional) circuit
size and, by extension, running time. This window of opportunity is exploited
by seminal quantum circuit constructions, like the ones by Shor for integer
factorization and discrete logarithm. This, however, will be the topic of a future
lecture.

8.2 Classical description of 𝑛 -qubit architectures


We have already introduced quite a bit of formalism that allows us to reason
about quantum circuit architectures.

8.2.1 State vector representation of general 𝑛 -qubit states


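Recall that a single qubit wire contains two complex-valued degrees of freedom.
We can capture both of them with a $2^1$-dimensional state vector of amplitudes:
\[
|\psi\rangle = \sum_{b_0=0}^{1} \psi_{b_0}|b_0\rangle = \boldsymbol{\psi} = \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} \in \mathbb{C}^{2^1} = \mathbb{C}^2, \tag{8.1}
\]
where we must also meet the normalization condition
$\|\boldsymbol{\psi}\|^2 = \langle\psi|\psi\rangle = \sum_{b_0=0}^{1} |\psi_{b_0}|^2 = |\psi_0|^2 + |\psi_1|^2 = 1$.
The two most basic examples encode a single
logical 0 and a single logical 1, respectively:
\[
|0\rangle = \boldsymbol{e}_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad |1\rangle = \boldsymbol{e}_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{8.2}
\]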

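We can use these basic building blocks to construct state vectors of more
complex bit configurations. The case $n = 2$, for instance, featured prominently
in Lecture 4. There are in total $4 = 2^2$ bit strings of length $n = 2$. And we can
use the Kronecker product to construct all of them from the basic state vectors
in Eq. (8.2):
\[
\begin{aligned}
|00\rangle &= |0\rangle \otimes |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}^{T} = \boldsymbol{e}_0 \in \mathbb{C}^4 = \mathbb{C}^{2^2}, \\
|01\rangle &= |0\rangle \otimes |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \end{pmatrix}^{T} = \boldsymbol{e}_1 \in \mathbb{C}^4 = \mathbb{C}^{2^2}, \\
|10\rangle &= |1\rangle \otimes |0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 & 0 \end{pmatrix}^{T} = \boldsymbol{e}_2 \in \mathbb{C}^4 = \mathbb{C}^{2^2}, \\
|11\rangle &= |1\rangle \otimes |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 1 \end{pmatrix}^{T} = \boldsymbol{e}_3 \in \mathbb{C}^4 = \mathbb{C}^{2^2},
\end{aligned}
\]
where we have written down the final expression as a row vector (transposition)
to save a bit of paper space. Note that this identification between bitstrings
(left) and standard basis vectors (right) is even nicer than one might expect:
the 2-bit string on the left corresponds to a bit encoding $\llcorner l \lrcorner$ of the standard
basis vector index. For $l$ between 0 and $3 = 2^2 - 1$,
\[
\boldsymbol{e}_l = |\llcorner l \lrcorner\rangle \quad \text{or equivalently} \quad |b_0 b_1\rangle = \boldsymbol{e}_{2 \times b_0 + b_1}. \tag{8.3}
\]
Here the first bit $b_0$ is the most significant one, in line with the four examples above (e.g. $|01\rangle = \boldsymbol{e}_1$ and $|10\rangle = \boldsymbol{e}_2$).

¹ An $n$-qubit architecture can also feature partial readout and conditional gate applications
(see Lecture 7). We disregard this option here for the sake of keeping things simple.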

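A general 2-qubit state vector can always be decomposed into a superposition
over all these $2^2$ bit configurations:
\[
|\psi\rangle = \sum_{b_0, b_1 = 0}^{1} \psi_{b_0 b_1}|b_0 b_1\rangle = \boldsymbol{\psi} = \begin{pmatrix} \psi_{00} \\ \psi_{01} \\ \psi_{10} \\ \psi_{11} \end{pmatrix} \in \mathbb{C}^4 = \mathbb{C}^{2^2}. \tag{8.4}
\]
This state vector must obey the, by now, familiar normalization condition:
\[
\|\boldsymbol{\psi}\|^2 = \langle\psi|\psi\rangle = \sum_{b_0, b_1 = 0}^{1} |\psi_{b_0 b_1}|^2 = 1.
\]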

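Comparing Eq. (8.4) with Eq. (8.1) already provides us with a blueprint
on how to scale up these vector representations further. State vectors of
$n$-bit strings can be constructed by forming $n$-fold Kronecker products of the
basic bit configurations (8.2) involved. Each Kronecker product doubles the
number of dimensions involved. So, we end up with state vectors that live in a
$2^n$-dimensional complex space $\mathbb{C}^{2^n}$:
\[
\begin{aligned}
|0 \cdots 00\rangle &= |0\rangle \otimes \cdots \otimes |0\rangle \otimes |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \end{pmatrix}^{T} = \boldsymbol{e}_0 \in \mathbb{C}^{2^n}, \\
|0 \cdots 01\rangle &= |0\rangle \otimes \cdots \otimes |0\rangle \otimes |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \end{pmatrix}^{T} = \boldsymbol{e}_1 \in \mathbb{C}^{2^n}, \\
&\;\;\vdots \\
|1 \cdots 10\rangle &= |1\rangle \otimes \cdots \otimes |1\rangle \otimes |0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & \cdots & 0 & 1 & 0 \end{pmatrix}^{T} = \boldsymbol{e}_{2^n - 2} \in \mathbb{C}^{2^n}, \\
|11 \cdots 1\rangle &= |1\rangle \otimes |1\rangle \otimes \cdots \otimes |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \end{pmatrix}^{T} = \boldsymbol{e}_{2^n - 1} \in \mathbb{C}^{2^n}.
\end{aligned}
\]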

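More succinctly, we obtain the following generalization of Eq. (8.3) to $n$-bit
strings and numbers $l$ between 0 and $2^n - 1$:
\[
\boldsymbol{e}_l = |\llcorner l \lrcorner\rangle \quad \text{or, equivalently} \quad |b_0 \cdots b_{n-1}\rangle = \boldsymbol{e}_{b_0 \times 2^{n-1} + b_1 \times 2^{n-2} + \cdots + b_{n-1}}. \tag{8.5}
\]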

A general $n$-qubit state vector can form a superposition of all these $2^n$ $n$-bit
configurations:
\[
|\psi\rangle = \sum_{b_0, \ldots, b_{n-1} = 0}^{1} \psi_{b_0 \cdots b_{n-1}}|b_0 \cdots b_{n-1}\rangle = \begin{pmatrix} \psi_{0 \cdots 00} \\ \psi_{0 \cdots 01} \\ \vdots \\ \psi_{1 \cdots 10} \\ \psi_{11 \cdots 1} \end{pmatrix} \in \mathbb{C}^{2^n}. \tag{8.6}
\]
Each amplitude $\psi_{b_0 \cdots b_{n-1}}$ can be a complex number, but together they must
obey the following normalization condition:
\[
\|\boldsymbol{\psi}\|^2 = \langle\psi|\psi\rangle = \sum_{b_0, \ldots, b_{n-1} = 0}^{1} |\psi_{b_0 \cdots b_{n-1}}|^2 = 1.
\]
Note that this sum ranges over all $2^n$ different amplitudes that feature in
Eq. (8.6).
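
The correspondence between bit strings and standard basis vectors in Eq. (8.5) can be illustrated in a few lines. The following is a minimal sketch, assuming NumPy and the convention that $b_0$ is the most significant bit; the helper names `basis_state` and `basis_index` are hypothetical.

```python
import numpy as np
from functools import reduce

KET = {0: np.array([1, 0]), 1: np.array([0, 1])}


def basis_state(bits):
    """n-fold Kronecker product |b_0> (x) ... (x) |b_{n-1}>."""
    return reduce(np.kron, (KET[b] for b in bits))


def basis_index(bits):
    """Integer l whose binary encoding is b_0 ... b_{n-1} (b_0 most significant)."""
    return int("".join(str(b) for b in bits), 2)


bits = (1, 0, 1)
vec = basis_state(bits)

# The single nonzero entry of the Kronecker product sits at position l
print(basis_index(bits), int(np.argmax(vec)), vec.size)
```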
Exercise 8.1 The following representation of a general state puts more emphasis
on the exponential number of different superpositions that are allowed:
\[
\boldsymbol{\psi} = \sum_{l=0}^{2^n - 1} \psi_l \, \boldsymbol{e}_l \in \mathbb{C}^{2^n} \quad \text{or} \quad |\psi\rangle = \sum_{l=0}^{2^n - 1} \psi_l \, |\llcorner l \lrcorner\rangle.
\]
Show that both formulas are equivalent to Eq. (8.6).


8.2.2 Circuit matrix representation of general 𝑛 -qubit circuits


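By now, we are already acquainted with quantum gate matrices. They are
extensions of reversible binary logic. Prominent single-qubit gate
matrices are
\[
H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \quad \text{and} \quad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix},
\]
as well as the $T$-gate:
\[
T = \begin{pmatrix} 1 & 0 \\ 0 & \exp(i\pi/4) \end{pmatrix}.
\]
In addition, we have also seen two important 2-qubit gates that allow us to do
conditional quantum logic. The two possible CNOT gates are
\[
\begin{aligned}
\mathrm{CNOT}_{1\to 2} &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}}_{|0\rangle\langle 0|} \otimes\, \mathbb{I} + \underbrace{\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}}_{|1\rangle\langle 1|} \otimes\, X, \\
\mathrm{CNOT}_{2\to 1} &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = \mathbb{I} \otimes \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}}_{|0\rangle\langle 0|} + X \otimes \underbrace{\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}}_{|1\rangle\langle 1|},
\end{aligned}
\]
where the subscript denotes the ‘flow of information’ from control to target.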
Much like in conventional Boolean circuitry, we can now take these elementary
gates and combine them to construct nontrivial functionalities on $n$
qubits. Such a quantum circuit has to map a $2^n$-dimensional state vector $|\psi_{\mathrm{in}}\rangle$
($n$-qubit input state) to another $2^n$-dimensional state vector $|\psi_{\mathrm{out}}\rangle$ ($n$-qubit
output state). This action is described by a $2^n \times 2^n$ circuit matrix $U$:
\[
|\psi_{\mathrm{out}}\rangle = U|\psi_{\mathrm{in}}\rangle \quad \text{for all } |\psi_{\mathrm{in}}\rangle \in \mathbb{C}^{2^n}. \tag{8.7}
\]
The matrix $U$ depends on the elementary gates involved, as well as their
location within the circuit. Similar to the 1- and 2-qubit case, we can use
products to construct this final matrix out of matrix representations of the
individual constituents:

1 Parallel gate applications use the Kronecker product ‘$\otimes$’ of the individual
gate matrices involved (including $\mathbb{I} \in \mathbb{C}^{2 \times 2}$ for qubit wires where nothing
happens). For $n$ qubits, this always produces a single $2^n \times 2^n$ matrix for
each gate layer.
2 Combining sequential gate layers uses the matrix product ‘$\times$’ of $2^n \times 2^n$
gate layer matrices.

This general construction is best explained by means of an example.



Example 8.2 (3-qubit Toffoli gate). Consider the following combination of $H$, $T$,
$T^\dagger = T^7 = \mathrm{diag}(1, \exp(-i\pi/4))$ and $\mathrm{CNOT}$ that act on three qubit wires:

[circuit diagram] (8.8)

The r.h.s. displays a sequential combination of 12 gate layers. We can use the
Kronecker product to compute a $2^3 \times 2^3$ matrix representation for each of them:
\[
\begin{aligned}
&C_0 = \mathbb{I} \otimes \mathbb{I} \otimes H, \quad C_1 = \mathbb{I} \otimes \mathrm{CNOT}_{1\to 2}, \quad C_2 = \mathbb{I} \otimes \mathbb{I} \otimes T^\dagger, \\
&C_3 = |0\rangle\langle 0| \otimes \mathbb{I} \otimes \mathbb{I} + |1\rangle\langle 1| \otimes \mathbb{I} \otimes X \quad \text{(why?)}, \\
&C_4 = \mathbb{I} \otimes \mathbb{I} \otimes T, \quad C_5 = \mathbb{I} \otimes \mathrm{CNOT}_{1\to 2}, \quad C_6 = \mathbb{I} \otimes \mathbb{I} \otimes T^\dagger, \\
&C_7 = C_3, \quad C_8 = \mathbb{I} \otimes T \otimes T, \quad C_9 = \mathrm{CNOT}_{1\to 2} \otimes H, \\
&C_{10} = T \otimes T^\dagger \otimes \mathbb{I} \quad \text{and} \quad C_{11} = \mathrm{CNOT}_{1\to 2} \otimes \mathbb{I}.
\end{aligned}
\]
The final gate matrix then corresponds to the (ordered) matrix product of the
12 matrices that describe the individual layers:
\[
U = C_{11} \times C_{10} \times C_9 \times C_8 \times C_7 \times C_6 \times C_5 \times C_4 \times C_3 \times C_2 \times C_1 \times C_0. \tag{8.9}
\]
Proper execution of this construction produces the following simple expression
for the final unitary matrix, the 3-qubit Toffoli gate (two-fold controlled-NOT):
\[
\mathrm{CCNOT} = U = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}. \tag{8.10}
\]
This is the truth table of a two-fold controlled bitflip: the third truth value is
flipped if the first two bits are both equal to 1. Otherwise, nothing happens. The
short-hand notation on the l.h.s. of Eq. (8.8) succinctly captures this functionality.

Exercise 8.3 Derive the matrix representation (8.10) yourself by completing the
argument from the example: (i) form all 12 layer matrices 𝑪 0 , . . . , 𝑪 11 using
a Kronecker product construction, (ii) combine these 12 layers sequentially
by computing the matrix product in Eq. (8.9). Hint: don’t do this by hand.
Instead, write a piece of code that does it for you. Computers are good at this
type of linear algebra.
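
One possible way to organize such a computation is sketched below. It assumes NumPy and takes the 12 layer matrices exactly as listed in Example 8.2; the helper `kron` and the variable names are illustrative, not prescribed by the notes.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
Td = T.conj().T                                   # T^dagger
P0, P1 = np.diag([1, 0]), np.diag([0, 1])         # |0><0| and |1><1|
CNOT = np.kron(P0, I2) + np.kron(P1, X)           # CNOT_{1->2} on two qubits


def kron(*ops):
    """Kronecker product of several matrices (parallel gate application)."""
    return reduce(np.kron, ops)


C3 = kron(P0, I2, I2) + kron(P1, I2, X)           # CNOT from wire 1 to wire 3
layers = [
    kron(I2, I2, H), kron(I2, CNOT), kron(I2, I2, Td), C3,
    kron(I2, I2, T), kron(I2, CNOT), kron(I2, I2, Td), C3,
    kron(I2, T, T), kron(CNOT, H), kron(T, Td, I2), kron(CNOT, I2),
]

# Later layers act later, so they multiply from the left: U = C11 ... C1 C0
U = reduce(lambda acc, layer: layer @ acc, layers, np.eye(8))

toffoli = np.eye(8)
toffoli[[6, 7]] = toffoli[[7, 6]]                 # swap |110> and |111>
print(np.allclose(U, toffoli))
```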

Kronecker and matrix products have another nice feature: they play nicely
with unitary matrices².
Lemma 8.4 (Kronecker and matrix product respect unitary structure). The Kronecker
product of two (or more) unitary matrices is again a unitary matrix. Likewise,
the matrix product of two (or more) unitary matrices is again a unitary matrix.
The proof readily follows from the following appealing features of Kronecker
and matrix products: (𝑨 ⊗ 𝑩) † = 𝑨 † ⊗ 𝑩 † , (𝑨 ⊗ 𝑩) (𝑪 ⊗ 𝑫) = (𝑨𝑪 ) ⊗ (𝑩𝑫)
and, finally, (𝑼 × 𝑽 ) † = 𝑽 † × 𝑼 † . We leave it as an instructive exercise in
(multi-)linear algebra.
Exercise 8.5 Prove Lemma 8.4.
The following proposition is now an immediate consequence of Lemma 8.4
and the way we construct matrix representations of general 𝑛 -qubit circuits.
Proposition 8.6 (circuit matrix). The functionality of every $n$-qubit circuit is
completely described by a $2^n \times 2^n$ circuit matrix $U$. This matrix is unitary and
Kronecker & matrix products allow us to construct it from the circuit diagram.

This statement tells us that every $n$-qubit circuit is fully characterized by a
$2^n \times 2^n$ unitary matrix $U$. Remarkably, the converse is also true:

Theorem 8.7 (Solovay-Kitaev ($n$-qubit case)). Every unitary $2^n \times 2^n$ matrix $U$
can be approximated to arbitrary precision by an $n$-qubit quantum circuit
that is solely comprised of $H$, $S$, $\mathrm{CNOT}$ (Clifford) and $T$-gates.

This is the quantum generalization of a seminal result in logical circuitry:
Every logical function $f: \{0,1\}^n \to \{0,1\}^m$ can be represented as a logical
circuit that is comprised solely of $\neg$, $\vee$ and $\wedge$. We don’t have the time, nor the
necessary background, to prove this powerful result. We emphasize instead
that we now start to feel the sheer size of quantum state space. A $2^n \times 2^n$ unitary
has exactly $(2^n)^2 = 4^n$ real-valued degrees of freedom (why?). And we need
a comparable number of quantum gates to address all of them simultaneously.
In other words: almost all possible $n$-qubit unitaries need exponentially many
quantum gates (in $n$) to approximate. This is a straightforward quantum
generalization of Shannon’s famous result that almost all $n$-bit logical functions
require exponentially many logical operations ($\neg$, $\wedge$, $\vee$) to realize.

8.2.3 Classical simulation of 𝑛 -qubit logic and readout


We now have all the ingredients in place to present a classical strategy that
fully simulates the execution of a quantum circuit on an $n$-qubit architecture,
like the one presented in Fig. 8.1. The key idea is to use $2^n$-dimensional state
vectors to keep track of the quantum logic at each step of the quantum circuit.
We start this procedure by forming a state vector representation of the $n$-qubit
initialization, using the index convention from Eq. (8.5):
\[
|b_0 \cdots b_{n-1}\rangle = \boldsymbol{e}_{b_0 \times 2^{n-1} + b_1 \times 2^{n-2} + \cdots + b_{n-1}} \in \mathbb{C}^{2^n}. \tag{8.11}
\]

² Recall that a $D \times D$ matrix $U$ is unitary if $U^\dagger U = U U^\dagger = \mathbb{I}$, where $\dagger$ denotes the adjoint
(transposition and complex conjugation) and $\mathbb{I}$ is the $D \times D$ identity matrix with ones on the
main diagonal and zeroes everywhere else.

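Now, let $s$ be the size of the quantum circuit, i.e. the total number of nontrivial
gates. Then, we can re-express the circuit as a sequential combination of $s$
layers, where each layer contains exactly one nontrivial gate:
\[
U = C_{s-1} \times \cdots \times C_1 \times C_0.
\]
The individual layer matrices must be Kronecker products of exactly one
nontrivial gate matrix with only identity matrices ($\mathbb{I}$). For a single-qubit gate $V$
at qubit wire $a \in \{0, \ldots, n-1\}$, we get
\[
C_k = \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{a \text{ times}} \otimes\, V \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(n-a-1) \text{ times}}. \tag{8.12}
\]
Else if we have a CNOT-gate with control at qubit wire $a \in \{0, \ldots, n-1\}$
and target $b \in \{0, \ldots, n-1\}$, there are two options: if $a < b$ (control before
target), we get
\[
\begin{aligned}
C_k &= \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{a \text{ times}} \otimes\, |0\rangle\langle 0| \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(n-a-1) \text{ times}} \\
&\quad + \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{a \text{ times}} \otimes\, |1\rangle\langle 1| \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(b-a-1) \text{ times}} \otimes\, X \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(n-b-1) \text{ times}}. \tag{8.13}
\end{aligned}
\]
And if $a > b$ (target before control), we get
\[
\begin{aligned}
C_k &= \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{a \text{ times}} \otimes\, |0\rangle\langle 0| \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(n-a-1) \text{ times}} \\
&\quad + \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{b \text{ times}} \otimes\, X \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(a-b-1) \text{ times}} \otimes\, |1\rangle\langle 1| \otimes \underbrace{\mathbb{I} \otimes \cdots \otimes \mathbb{I}}_{(n-a-1) \text{ times}}. \tag{8.14}
\end{aligned}
\]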

Importantly, every such layer matrix is sparse by construction.

Fact 8.8 The matrices $C_k$ that arise from Eqs. (8.12), (8.13) and (8.14) are all
extremely sparse: each row/column contains at most 2 nonzero entries. ■

This helps a lot with performance when it comes to matrix-vector multiplication.
And, as it happens, this is precisely the operation we need to update our
quantum logical configurations. We start with Eq. (8.11) and compute the final
state vector $U|b_0 \cdots b_{n-1}\rangle$ via a sequence of matrix-vector multiplications:
\[
|\psi_0\rangle = |b_0 \cdots b_{n-1}\rangle \in \mathbb{C}^{2^n} \quad \text{and} \quad |\psi_{k+1}\rangle = C_k|\psi_k\rangle \quad \text{for } k = 0, \ldots, s-1.
\]
This computes the $n$-qubit final state $|\psi\rangle = U|\psi_0\rangle$ as a sequence of $s$ sparse
matrix-vector multiplications:
\[
|\psi\rangle = |\psi_s\rangle = C_{s-1}|\psi_{s-1}\rangle = \cdots = C_{s-1} \times \cdots \times C_0|\psi_0\rangle = U|b_0 \cdots b_{n-1}\rangle.
\]
110 Lecture 8: General 𝑛 -qubit architectures
More precisely, computing $|\psi\rangle \in \mathbb{C}^{2^n}$ in this fashion involves a total of $s$ matrix-vector multiplications in $2^n$ dimensions. Fortunately, each matrix $\boldsymbol{C}_k$ involved is extremely sparse, see Fact 8.8. Sparse matrix-vector multiplication routines execute each update using only (order) sparsity × dimension $= 2 \times 2^n$ arithmetic operations. This produces a total resource cost of (at most) $2s \times 2^n$ arithmetic operations, where $s$ is the size of the circuit involved.
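The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`apply_single`, `apply_cnot`, `simulate`) and the tuple-based circuit format are assumptions made here, and the code applies each layer directly to the state array instead of storing the sparse matrices of Eqs. (8.12) to (8.14). Each gate still touches every one of the $2^n$ amplitudes a constant number of times.

```python
import numpy as np

def apply_single(state, gate, a, n):
    """Apply a 2x2 gate to wire a; wire 0 is the leftmost tensor factor."""
    psi = state.reshape(2**a, 2, 2**(n - a - 1))
    return np.einsum("ij,ajb->aib", gate, psi).reshape(-1)

def apply_cnot(state, control, target, n):
    """Flip the target wire on all basis states whose control bit equals 1."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1  # select the |1><1| branch of the control wire
    # the control axis is gone inside the selected block, so later axes shift by one
    t = target - 1 if target > control else target
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t).copy()
    return psi.reshape(-1)

def simulate(circuit, bits):
    """Strong simulation; circuit entries are ("gate", matrix, a) or ("cnot", a, b)."""
    n = len(bits)
    state = np.zeros(2**n, dtype=complex)
    state[int("".join(map(str, bits)), 2)] = 1.0  # basis state |b_0 ... b_{n-1}>
    for op in circuit:
        if op[0] == "cnot":
            state = apply_cnot(state, op[1], op[2], n)
        else:
            state = apply_single(state, op[1], op[2], n)
    return state

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
# Hadamard on wire 0 followed by CNOT(0 -> 1) acting on |00>
bell = simulate([("gate", H, 0), ("cnot", 0, 1)], [0, 0])
```

The memory footprint is one array of length $2^n$, which is exactly the exponential bottleneck discussed in the text.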

Theorem 8.9 (strong simulation of an $n$-qubit circuit). Let $\boldsymbol{U}$ be an $n$-qubit quantum circuit comprised of $s$ elementary gates (e.g. $\boldsymbol{H}$, $\boldsymbol{S}$, $\boldsymbol{CNOT}$ and $\boldsymbol{T}$). Given an input configuration $|b_0 \cdots b_{n-1}\rangle$ with $b_0, \ldots, b_{n-1} \in \{0, 1\}$, we can compute the final state vector $|\psi\rangle = \boldsymbol{U}|b_0 \cdots b_{n-1}\rangle \in \mathbb{C}^{2^n}$ using a sequence of $s$ sparse matrix-vector multiplications. The total cost is (at most) $2s \times 2^n$ arithmetic operations.

This process is called a strong simulation of the underlying $n$-qubit logic. After all, it does provide us with a complete classical description of all $2^n$ amplitudes that feature in the final $n$-qubit configuration. In a sense, this description is even more fine-grained than the inner workings of an actual $n$-qubit processor. The latter, after all, does not access amplitudes directly. Instead, we can only sample $n$-bit strings $o_0 \cdots o_{n-1} \in \{0, 1\}^n$ according to a probability distribution that involves the squared amplitudes:

$$\Pr_{\boldsymbol{U}|b_0 \cdots b_{n-1}\rangle}[o_0 \cdots o_{n-1}] = \Pr_{|\psi\rangle}[o_0 \cdots o_{n-1}] = \left|\psi_{o_0 \cdots o_{n-1}}\right|^2. \qquad (8.15)$$

[Margin note: strong simulation = keep track of $2^n$-dim. state vector]

One repetition of an $n$-qubit quantum computation produces exactly one $n$-bit string $o_0 \cdots o_{n-1}$. And we may need very, very many repetitions of this process to get even a rough idea about the $2^n$ different (squared) amplitudes involved. The process of being able to sample $n$-bit strings from the correct probability distribution (8.15) is called a weak simulation protocol. Such protocols actually implement the functionality of an $n$-qubit architecture from the beginning (input $n$-bit string) to the end (outcome $n$-bit string).
[Margin note: weak simulation = sample from correct readout distribution]
Access to the full $2^n$-dimensional state vector is enough to sample outcome strings according to the accompanying probability distribution (8.15). Note that this step involves (pseudo-)randomness. The resource cost involved is linear in the length of the state vector and therefore exponential in the number of qubits.
Proposition 8.10 (strong simulation implies weak simulation). Assuming access to uniformly random bits and a state vector $|\psi\rangle \in \mathbb{C}^{2^n}$, we can sample $n$-bit strings according to $\Pr_{|\psi\rangle}[o_0 \cdots o_{n-1}] = \left|\psi_{o_0 \cdots o_{n-1}}\right|^2$ with order $2^n$ arithmetic operations.
We leave a proof of this statement as an instructive exercise (there are
multiple ways to construct a weak simulator). The following consequence now
immediately follows from combining Theorem 8.9 with Proposition 8.10.
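One of the possible constructions behind Proposition 8.10 can be sketched as follows. It is only an illustration under assumptions made here: the function name `sample_bitstrings` is hypothetical, and inverse-transform sampling on the cumulative distribution is just one of several valid strategies. Building the cumulative sums costs order $2^n$ arithmetic operations.

```python
import numpy as np

def sample_bitstrings(state, shots, rng):
    """Draw outcome strings o_0 ... o_{n-1} with probability |psi_o|^2."""
    n = len(state).bit_length() - 1            # len(state) = 2^n
    cdf = np.cumsum(np.abs(state) ** 2)        # order 2^n arithmetic operations
    u = rng.random(shots)                      # uniform random numbers in [0, 1)
    # first index whose cumulative probability exceeds u (clipped against rounding)
    idx = np.minimum(np.searchsorted(cdf, u, side="right"), len(state) - 1)
    return [format(int(i), f"0{n}b") for i in idx]

rng = np.random.default_rng(seed=0)
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)     # amplitudes of (|00> + |11>)/sqrt(2)
samples = sample_bitstrings(bell, shots=1000, rng=rng)
```

Each returned string corresponds to one repetition of the quantum computation; the amplitudes themselves are never revealed to the caller.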
Corollary 8.11 (exponential overhead when moving from quantum to classical hardware). Suppose that we wish to execute a circuit with $s$ elementary gates on an $n$-qubit architecture. Then, we can simulate the entire pipeline on classical hardware. The overhead in resource cost is, however, exponential in $n$: we need roughly $2(s + \mathrm{const}) \times 2^n$ arithmetic operations (and access to random bits).
[Margin note: strong (& weak) simulation doable, but with exponential overhead (in $n$)]
Exercise 8.12 (does weak simulation imply strong simulation? (challenging)). Suppose you have access to a weak simulator for an unknown quantum state vector $|\psi\rangle$. How often do you need to run it and produce an outcome bitstring $o_0 \cdots o_{n-1}$ until you get a good idea about all $2^n$ amplitudes $\psi_{o_0 \cdots o_{n-1}}$ involved? Is this task possible at all?

8.3 Implementing classical circuits with quantum logic


Before, we have asked ourselves whether we can use classical hardware to
simulate access to a (hypothetical) 𝑛 -qubit architecture. Let us now ask the
reverse question: can we use a quantum architecture with 𝑛 ′ ≥ 𝑛 qubits to
simulate a given (classical) 𝑛 -bit Boolean circuit?
Recall that a Boolean circuit has 𝑛 input bits and 𝑚 output bits. In between,
it can execute elementary logical gates like NOT (¬), AND (∧) and OR (∨).
The size 𝑠 of a circuit counts the total number of logical gates. We refer to
standard textbooks and lecture notes for details, e.g. [Kue22].
Note that these classical gates are not native for a quantum architecture. AND and OR, in particular, are irreversible gates. Let us now show how one can nonetheless realize them as sequences of native quantum gates (like $\boldsymbol{CNOT}$, $\boldsymbol{H}$ and $\boldsymbol{T}$), albeit with additional qubits involved.

8.3.1 Quantum realizations of elementary logical gates


Reversible negation (¬)
The negation operation is special, because it is reversible. The bitflip gate $\boldsymbol{X}$ implements it natively on single qubits:

[Margin note: reversible NOT: 1 bitflip $\boldsymbol{X}$]

At face value, there is no overhead when implementing this logical functionality on a quantum chip. We may, however, have to revert it if we need access to the original bit value at a later point within the circuit execution. This reversion costs one additional $\boldsymbol{X}$-gate and is displayed at the left of the above diagram.

Reversible AND (∧)


Let us now move on to realizing the logical AND operation (∧):
$$b_0 \wedge b_1 = \begin{cases} 0 & \text{if } b_0 = 0 \text{ or } b_1 = 0, \\ 1 & \text{else } (b_0 = 1 \text{ and } b_1 = 1). \end{cases}$$

This operation is not reversible. If we receive 1, we know that both $b_0$ and $b_1$ must have been 1. But, if we receive 0 instead, we don't know which of the three input configurations produced it. We can, however, implement ∧ in a reversible fashion if we allow for an additional (qu)bit wire. The following 3-qubit circuit uses a Toffoli gate (two-fold controlled-NOT) to do the job:

[Margin note: reversible AND: 3 qubits, 1 Toffoli gate]

Correctness of the implementation follows directly from the truth table of the
Toffoli gate (8.10): the third bit is flipped from 0 to 1 if and only if 𝑏 0 = 𝑏 1 = 1.
Otherwise, it stays in 0.
Note that the Toffoli gate is typically not a native gate on a quantum
computer. We can, however, use Example 8.2 to decompose 𝑪 𝑪 𝑵 𝑶𝑻 into a
combination of 2 Hadamards, 25 𝑻 gates and 6 𝑪 𝑵 𝑶𝑻 s. This produces the
following overhead in terms of elementary quantum gates.
Lemma 8.13 (reversible implementation of AND (∧)). We can realize one two-bit
AND-gate using one additional qubit (initialized in 0), 6 𝑪 𝑵 𝑶𝑻 gates and 27
single-qubit gates (i.e. 33 elementary gates in total).
Exercise 8.14 (impossibility of implementing ∧ with two qubits). Argue that it is
impossible to realize a logical AND operation with only two qubits.

1 Why is it impossible to find a 2-qubit unitary $\boldsymbol{U} \in \mathbb{C}^{4 \times 4}$ such that $\boldsymbol{U}|b_0 b_1\rangle = |\psi(b_0, b_1)\rangle \otimes |b_0 \wedge b_1\rangle$ for all $b_0, b_1 \in \{0, 1\}$?
2 Does this situation change if we allow for a partial readout of the first
qubit combined with a conditional unitary on the remaining qubit?

Reversible OR (∨)
Finally, we need to realize the logical OR operation (∨):
$$b_0 \vee b_1 = \begin{cases} 0 & \text{if } b_0 = 0 \text{ and } b_1 = 0, \\ 1 & \text{else } (b_0 = 1 \text{ or } b_1 = 1). \end{cases}$$

Again, this operation is not reversible. If we receive 0, we know that both $b_0$ and $b_1$ must have been 0. But, if we receive 1 instead, we don't know which of the three possible input configurations produced it. Similar to logical AND (∧), we can still implement ∨ in a reversible fashion if we allow for an additional (qu)bit wire. The following 3-qubit circuit again uses a single Toffoli gate:

[Margin note: reversible OR: 3 qubits, 1 Toffoli gate, 5 bitflips]

This realization draws inspiration from our AND-implementation. The third qubit wire is initialized to the truth value 1. The only way to change this is to trigger a two-fold controlled-NOT gate. Additional negations on the first two qubit wires (which are undone later on) ensure that the Toffoli gate fires if and only if both inputs are 0. This produces an effective logical OR gate.
Similar to before, we can further decompose the Toffoli gate into elementary
quantum gates (see Example 8.2). Doing so produces the following resource
count.
Lemma 8.15 (reversible implementation of OR (∨)). We can realize one two-bit
OR-gate using one additional qubit (initialized in 0), 6 𝑪 𝑵 𝑶𝑻 gates and 32
single-qubit gates (i.e. 38 elementary gates in total).
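Since both gadgets never create superpositions, their correctness can be checked on plain classical bits. The following sketch does so; the function names are chosen here for illustration, and the OR gadget follows the description above (one bitflip to set the ancilla to 1, two bitflips before and two after the Toffoli gate).

```python
from itertools import product

def toffoli(c0, c1, t):
    """Two-fold controlled NOT: flip t if and only if c0 = c1 = 1."""
    return c0, c1, t ^ (c0 & c1)

def reversible_and(b0, b1):
    # the ancilla starts in 0 and is flipped iff b0 = b1 = 1
    return toffoli(b0, b1, 0)

def reversible_or(b0, b1):
    anc = 0 ^ 1                          # one bitflip sets the ancilla to 1
    b0, b1 = b0 ^ 1, b1 ^ 1              # negate both inputs
    b0, b1, anc = toffoli(b0, b1, anc)   # fires iff both original inputs are 0
    return b0 ^ 1, b1 ^ 1, anc           # undo the two negations

for b0, b1 in product((0, 1), repeat=2):
    print(b0, b1, reversible_and(b0, b1)[2], reversible_or(b0, b1)[2])
```

In both cases the first two wires leave the gadget unchanged, which is what makes the construction reversible.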

8.3.2 Quantum realization of entire Boolean circuits


The reversible execution of a single AND or OR gate requires one additional qubit
wire and a total of (at most) 38 elementary quantum gates (see Lemma 8.13
and Lemma 8.15). Putting everything together produces an extra overhead
that takes into account these extra costs for implementing AND and OR gates.

Theorem 8.16 (realizing a classical circuit on a quantum architecture). Consider a


Boolean circuit with 𝑛 input bits, 𝑚 output bits and circuit size 𝑠 . Then, we
can (perfectly and deterministically) simulate this circuit with a quantum
architecture that features 𝑛 + 𝑚 + 𝑠 qubits and requires (at most) 38 × 𝑠
elementary quantum gates (𝑯 ,𝑻 , 𝑪 𝑵 𝑶𝑻 , 𝑿 ).

For most circuits, circuit size 𝑠 is the dominant cost parameter. Theorem 8.16
then highlights that the overheads scale (at most) linearly in this dominant cost
factor: nr. of qubits = 𝑂 (𝑠 ) and nr. of quantum gates = 𝑂 (𝑠 ) .
Corollary 8.17 (linear overhead when moving from classical to quantum). Every classical circuit can also be executed on a quantum architecture. The overhead is (at most) linear in the original circuit size.
[Margin note: linear overhead permits executing classical circuits on quantum hardware]

Figure 8.2 Implementation of Boolean circuits on quantum hardware: every Boolean circuit with $n$ input bits, $m$ output bits and $s$ elementary logical gates (green, lhs) can be converted into a functionally equivalent quantum circuit (purple, rhs). This quantum circuit is larger and longer, but the increase in cost is (at most) linear in circuit size $\max\{n, m, s\}$.

Let us first illustrate the conversion behind Theorem 8.16 by means of a concrete example. Consider the following logical function with $n = 2$ input bits and $m = 1$ output bit:

[Circuit diagram with inputs $b_0$, $b_1$, two ¬ gates and output $(\bar{b}_0 \vee b_1) \wedge (b_0 \vee \bar{b}_1)$] (8.16)

This circuit goes from left to right and contains 𝑠 = 5 Boolean gates (two ¬,
two ∨ and one ∧). Here is a quantum implementation of the same functionality
which we read from right to left instead:

Each readout is guaranteed to recover the advertised truth value with certainty. There are no superpositions whatsoever and the functionality is perfectly deterministic. Already a single look at this circuit reveals that there is a lot of potential for improvement. The $\boldsymbol{X}$-gates, in particular, are their own inverses (i.e. $\boldsymbol{X} \times \boldsymbol{X} = \mathbb{I}$) and many of them cancel. Here is a more streamlined representation of this quantum circuit:

In order to compile it into elementary quantum gates, we can use the decom-
position of the Toffoli gate (two-fold controlled-NOT gate) into 𝑯 ,𝑻 ,𝑻 † and
𝑪 𝑵 𝑶𝑻 from Example 8.2. Doing so produces an actual quantum circuit that
acts on
𝑛′ = 5 = 2 + 3
qubit wires. The first two wires carry the logical inputs 𝑏 0 and 𝑏 1 . Qubits 3, 4
and 5 compute intermediate logical values. Incidentally, the final logical value
is also the output of the circuit. There are 𝑠˜ = 3 non-reversible operations (∧
and ∨). Using Lemma 8.13 and Lemma 8.15, we can invest one additional
qubit wire and a total of 33 (∧) and 38 (∨) elementary quantum gates to
implement these functionalities in a reversible fashion. Logical negation (¬)
is easier by comparison. We can achieve it by applying $\boldsymbol{X}$ and (potentially) reverting this bit flip again at a later stage to recover the original truth value. To summarize:

1 For each non-reversible operation (∧, ∨), we need one additional qubit
wire and (at most) 38 elementary quantum gates.
2 For each output bit, we (may) need an additional qubit wire that is
initialized to 0. A single 𝑪 𝑵 𝑶𝑻 allows us to copy the relevant truth
variable into this wire.

It should not come as a surprise that this construction is general. Carrying it


out for a general Boolean circuit with 𝑛 input bits, 𝑚 output bits and 𝑠 logical
operations, produces the resource overhead displayed in Theorem 8.16.
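The bookkeeping of the example can be replayed on classical bits. The sketch below assumes the wire layout described above (wires 0 and 1 carry $b_0$ and $b_1$, wires 2 to 4 are ancillas initialized to 0); the helper names are hypothetical and the unstreamlined gate order is kept on purpose, so several bitflips cancel each other.

```python
from itertools import product

def toffoli(w, c0, c1, t):
    """Flip wire t if and only if wires c0 and c1 both carry 1."""
    w[t] ^= w[c0] & w[c1]

def or_into(w, i, j, t):
    """Write (w[i] OR w[j]) into ancilla wire t (assumed 0) and restore the inputs."""
    w[t] ^= 1
    w[i] ^= 1
    w[j] ^= 1
    toffoli(w, i, j, t)
    w[i] ^= 1
    w[j] ^= 1

def equality_circuit(b0, b1):
    w = [b0, b1, 0, 0, 0]
    w[0] ^= 1; or_into(w, 0, 1, 2); w[0] ^= 1   # wire 2: (not b0) or b1
    w[1] ^= 1; or_into(w, 0, 1, 3); w[1] ^= 1   # wire 3: b0 or (not b1)
    toffoli(w, 2, 3, 4)                         # wire 4: AND of wires 2 and 3
    return w

for b0, b1 in product((0, 1), repeat=2):
    print(b0, b1, equality_circuit(b0, b1)[4])
```

The last wire ends up carrying the output of Eq. (8.16), while the two input wires are returned to their original values.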
Exercise 8.18 (Proof of Theorem 8.16). Generalize the construction above to general Boolean circuits with $n$ input bits, $m$ output bits and a total of $s$ nontrivial Boolean gates (¬, ∧, ∨).
Exercise 8.19 (more direct quantum execution of logical equality). Can you find a
more direct and less wasteful implementation of the logical functionality behind
Eq. (8.16)? Hint: look up parity check circuits.

8.4 Synopsis
Today, we started to compare classical and quantum (hardware) architectures
directly with each other. In particular, we have seen that quantum hardware
is never much worse than classical hardware. Theorem 8.16 states that every
classical circuit can also be executed on a quantum architecture with (at most)
linear overhead. In contrast, the transition from quantum to classical hardware
looks much more daunting. Theorem 8.9 states that it is possible to simulate
quantum architectures with classical software (and hardware, by extension).
But, the overhead in cost is substantial: our sparse matrix-vector subroutine could only guarantee a runtime of $2^n \times$ circuit-size for a general $n$-qubit circuit. This exponential overhead grows quickly. For $n = 10$, we obtain $2^n = 1024$, which is still manageable. But already $n = 100$ produces $2^{100} \approx 1.27 \times 10^{30}$; this overhead quickly exhausts even the most powerful supercomputers.
[Margin note: classical → quantum: linear overhead; quantum → classical: exponential overhead]
At this point, we should point out that sparse matrix-vector multiplication is only one approach to simulate quantum architectures on classical hardware. This approach is also called array-based simulation because it involves large
arrays (matrices and vectors). Other approaches include tensor network-based
simulations, stabilizer-based simulations, simulation based on decision diagrams
and many more. These all have their own strengths and weaknesses. But,
ultimately, each and every classical simulator developed to date starts to struggle
with an exponential cost increase (in the number of qubits 𝑛 ).
Exercise 8.20 (consequences of efficient classical simulation of quantum architectures).
Suppose that it was possible to simulate a general 𝑛 -qubit architecture with
only polynomial overhead (in $n$). I.e. every quantum circuit with $s$ gates can be simulated with $\mathrm{poly}(n) \times s$ arithmetic operations. Shor's algorithm is such a quantum circuit: it factorizes an $n$-bit integer by repeatedly executing a quantum circuit of size $s = O(n^3)$. Use this piece of information to conclude that efficient
classical simulation of quantum circuits would imply a polynomial-runtime
algorithm (in bit size 𝑛 ) for integer factorization. What would this mean for
the RSA public key encryption protocol?
9. Amplitude amplification circuits

Date: 19 November 2024

Agenda:
1 setup
2 overall idea
3 circuit design
4 pros and cons

9.1 Motivation

By now, we have all necessary pieces in place to start talking about actual quantum algorithms. Note that the term quantum algorithm can be a bit misleading. Actually, these are quantum circuits designed to achieve certain computational tasks. Today we discuss amplitude amplification circuits. The well-known Grover 'algorithm' is a particularly prominent example of this kind. It uses quantum effects to find satisfying assignments of a Boolean function faster than a conventional brute-force search ever could. Today, we analyze this circuit and show that this quantum speedup is quadratic in nature. Our analysis will also highlight the advantages of a clean mathematical formalism. The matrix-vector multiplication framework will allow us to draw rigorous conclusions without explicitly having to know the underlying logical functionalities. This is a key feature of existing quantum algorithm design and analysis. After all, we don't have the hardware (yet) to run these algorithms.

9.2 Setup
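Consider an $n$-bit Boolean function

$$f : \{0, 1\}^n \to \{0, 1\}.$$

Our task is to find a satisfying assignment:

$$\boldsymbol{b} = b_0 \cdots b_{n-1} \in \{0, 1\}^n \quad\text{such that}\quad f(\boldsymbol{b}) = 1.$$

[Margin note: task: find $\boldsymbol{b} \in \{0, 1\}^n$ s.t. $f(\boldsymbol{b}) = 1$ ('positive answer')]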
Let us start by collecting important cost parameters. The first one is the cost
of computing 𝑓 (𝒃) for a given input. To ease our transition into the quantum
realm, we don’t consider the function 𝑓 directly, but a Boolean circuit 𝐶 𝑓 that
implements it. The circuit size

size ( 𝑓 ) = # of elementary logical gates in 𝐶 𝑓

measures the cost of implementing 𝑓 as a circuit. It is also in one-to-one


correspondence with the runtime required to compute 𝑓 on a digital computer
(or a Turing machine). Another important indicator for the difficulty of this
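problem is the ratio of positive answers:

$$r(f) = \frac{\#\text{ of inputs such that } f(\boldsymbol{b}) = 1}{\text{total number of inputs } \boldsymbol{b} \in \{0, 1\}^n} = \frac{1}{2^n} \sum_{\boldsymbol{b} \in \{0,1\}^n} f(\boldsymbol{b}) \in [0, 1].$$

[Margin note: ratio of positive answers $r(f)$]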

Together, circuit size and ratio of positive answers bound the expected runtime
of a simple randomized solution strategy:

Algorithm 9.1 randomized search for positive answers


set success = 0
while success = 0 do
    sample $\boldsymbol{b} = b_0 \ldots b_{n-1}$ uniformly at random from $\{0, 1\}^n$
    compute $f(\boldsymbol{b}) \in \{0, 1\}$
    if $f(\boldsymbol{b}) = 1$ then
        set success = 1 and output $\boldsymbol{b}$
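A runnable version of Algorithm 9.1 might look as follows. The function name, the trial cap `max_trials` and the example function `f` (only the all-ones string is a positive answer) are assumptions introduced here for illustration.

```python
import random

def randomized_search(f, n, rng, max_trials=10**6):
    """Sample uniformly random n-bit strings until f reports a positive answer."""
    for trial in range(1, max_trials + 1):
        b = tuple(rng.randrange(2) for _ in range(n))
        if f(b) == 1:
            return b, trial          # positive answer and number of trials used
    return None, max_trials          # gave up (e.g. f has no positive answer)

# hypothetical example: ratio of positive answers r(f) = 1/2^n
f = lambda b: int(sum(b) == len(b))
rng = random.Random(0)
b, trials = randomized_search(f, n=4, rng=rng)
```

The cap on the number of trials is a practical safeguard; the pseudocode above would loop forever if $f$ has no positive answer.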

Instead of discussing this classical strategy, let us reformulate it in terms of


a (𝑛 + size ( 𝑓 ) + 1) -qubit architecture. We can use Hadamard gates to create
uniform superpositions of all input strings 𝒃 ∈ {0, 1}𝑛 and use a reversible
implementation of 𝐶 𝑓 to subsequently compute the superposition of all 𝑓 (𝒃) s.
Fig. 9.1 provides a visualization of such a quantum circuit. How often do we
need to execute this circuit (or equivalently: test random input bit strings) until
we find a positive answer?
Proposition 9.1 Let $f : \{0, 1\}^n \to \{0, 1\}$ be a Boolean function with ratio of positive answers $r(f)$. Then with probability at least 95%, a total of $3/r(f)$ repetitions of the quantum circuit displayed in Fig. 9.1 provide us with (at least) one bitstring $\boldsymbol{b} \in \{0, 1\}^n$ that obeys $f(\boldsymbol{b}) = 1$.
[Margin note: randomized search for positive answers requires $3/r(f)$ attempts]
Before providing a proof, we emphasize that a number of (approximately)
1/𝑟 (𝑓 ) random input choices is also necessary. It is extremely unlikely to get
lucky and sample a positive answer earlier.

Proof of Proposition 9.1. The proof follows from computing the state vector of the $(n + s + 1)$ qubits from beginning to end, where $s = \mathrm{size}(f)$. In the beginning, we have $|\varphi_0\rangle = |0 \ldots 0\rangle = (|0\rangle)^{\otimes(n+s+1)}$. Applying Hadamards to the first $n$ qubits produces a (partial) superposition

$$\begin{aligned}
|\varphi_1\rangle &= \left(\boldsymbol{H}^{\otimes n} \otimes \mathbb{I}^{\otimes(s+1)}\right)|0\rangle^{\otimes(n+s+1)} = (\boldsymbol{H}|0\rangle)^{\otimes n} \otimes |0\rangle^{\otimes(s+1)} \\
&= \frac{1}{\sqrt{2^n}} \sum_{b_0, \ldots, b_{n-1} = 0}^{1} |b_0 \ldots b_{n-1}\rangle \otimes |0\rangle^{\otimes(s+1)} = \frac{1}{\sqrt{2^n}} \sum_{\boldsymbol{b} \in \{0,1\}^n} |\boldsymbol{b}\, 0^s\, 0\rangle.
\end{aligned}$$

Figure 9.1 Quantum implementation of a randomized search for positive answers: this quantum circuit implements Algorithm 9.1. The purple quantum circuit evaluates a Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ in a reversible fashion. This requires (at most) $\mathrm{size}(f)$ auxiliary qubits to implement ∧ and ∨ and a final qubit onto which the result $f(\boldsymbol{b})$ is imprinted:
$$\boldsymbol{C}_f |b_0 \ldots b_{n-1}\, 0^s\, 0\rangle = |b_0 \ldots b_{n-1}\, 0^s\, f(\boldsymbol{b})\rangle.$$

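It is now time to apply the purple circuit block from Fig. 9.1:

$$|\varphi_2\rangle = \boldsymbol{C}_f |\varphi_1\rangle = \frac{1}{\sqrt{2^n}} \sum_{\boldsymbol{b} \in \{0,1\}^n} |\boldsymbol{b}\, 0^s\, f(\boldsymbol{b})\rangle.$$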

This is a uniform superposition over $2^n$ different bitstrings. Note furthermore that for each $\boldsymbol{b} \in \{0, 1\}^n$, the accompanying function value $f(\boldsymbol{b})$ now features at the last qubit location. The readout stage collapses this equal superposition and selects one of the $2^n$ contributions uniformly at random. The first $n$ bit values tell us the (randomly selected) input string $\boldsymbol{b} = b_0 \ldots b_{n-1}$, while the final readout value tells us $f(\boldsymbol{b})$.
The probability with which we randomly select a bit string $\boldsymbol{b}$ that obeys $f(\boldsymbol{b}) = 1$ is in one-to-one correspondence with the ratio of positive answers:

$$\Pr_{\boldsymbol{b} \overset{\text{unif}}{\sim} \{0,1\}^n}[f(\boldsymbol{b}) = 1] = r(f) \quad\text{and}\quad \Pr_{\boldsymbol{b} \overset{\text{unif}}{\sim} \{0,1\}^n}[f(\boldsymbol{b}) = 0] = 1 - r(f).$$

How many trials do we need until we sample (at least) one bit string that obeys $f(\boldsymbol{b}) = 1$ with high probability? To answer this question, let $T$ denote the total number of times we evaluate the circuit (i.e. the total number of random bitstrings $\boldsymbol{b}$ we produce with readout). Then, the probability of never sampling a positive answer is

$$\Pr[\text{all } T \text{ trials fail}] = (1 - r(f))^T = \exp\left(\log(1 - r(f)) \times T\right) \leq \exp(-r(f)\, T),$$

because $\log(1 - x) \leq -x$ for all $x \in [0, 1)$. If we set $T = 3/r(f)$, we obtain a failure probability of at most $\exp(-3) < 0.05$. In other words: we get a positive answer with probability $> 1 - 0.05 = 0.95$. ■
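The bound from the proof is easy to check numerically. The sketch below compares the exact failure probability $(1 - r(f))^T$ with the bound $\exp(-r(f) T)$ for a few illustrative ratios; the ratios are arbitrary choices, and $T$ is rounded up to an integer, which is an assumption added here because $3/r(f)$ need not be a whole number.

```python
import math

def failure_probability(r, T):
    """Exact probability that T independent uniform trials all miss."""
    return (1.0 - r) ** T

for r in (0.5, 0.1, 1 / 1024):
    T = math.ceil(3 / r)                 # number of trials, rounded up
    exact = failure_probability(r, T)
    bound = math.exp(-r * T)             # bound used in the proof
    print(f"r={r:.4g}  T={T}  exact={exact:.4f}  bound={bound:.4f}")
```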

Many important search problems fall into this broad category. Let us
provide two examples.
Example 9.2 (SAT). Let $f : \{0, 1\}^n \to \{0, 1\}$ be a Boolean formula in CNF. Then, the task of finding a positive answer $f(\boldsymbol{b}) = 1$ also solves the famous satisfiability problem. CNF formulas also have efficient circuit implementations, i.e. $\mathrm{size}(f) = \mathrm{poly}(n)$. So, evaluating $f$ (or constructing the circuit $C_f$) is not the main bottleneck. What makes this problem hard is that we may have to check exponentially many inputs: the ratio of positive answers $r(f)$ can be as small as $r(f) = 1/2^n$. ■
[Margin note: search for positive answers covers satisfiability (SAT)]

Example 9.3 (unstructured database search). We can also interpret the Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ as a label function in an unstructured database of bit encodings. Viewed from this angle, the task of finding $\boldsymbol{b} \in \{0, 1\}^n$ with $f(\boldsymbol{b}) = 1$ boils down to finding a database entry with the label 'yes'. Problems of this kind often occur as subroutines in more involved algorithms. ■
[Margin note: unstructured database search also covered]

9.3 Overall idea for a quadratic quantum advantage


9.3.1 High-level vision
We have just seen that the ratio of positive answers $r(f)$ plays an important role when it comes to looking for positive answers. A standard execution of randomized search requires $T \approx 3/r(f)$ invocations of the function/circuit in question. Each such evaluation comes with a circuit of size $\mathrm{size}(f)$ to evaluate $f$, as well as (up to) $n$ gates for randomization (Hadamard in the quantum case, negation in the classical case). Hence, the total cost is

$$T \times (\mathrm{size}(f) + n) = 3 \times (\mathrm{size}(f) + n)/r(f) \qquad (9.1)$$

and scales linearly in the inverse ratio of positive answers ($1/r(f)$). Note furthermore that the ratio of positive answers can very well be exponentially small ($r(f) = 1/2^n$ if there is exactly one positive answer). In these cases, the total cost (9.1) also explodes exponentially.
[Margin note: total classical cost is $\approx \mathrm{size}(f)/r(f)$]
Let us now present a high-level idea that uses quantum circuits to achieve a quadratic improvement in total cost. The resulting quantum circuit $\boldsymbol{G}$ will have (quantum) circuit size

$$\mathrm{size}(\boldsymbol{G}) = O\left((\mathrm{size}(f) + n)/\sqrt{r(f)}\right), \qquad (9.2)$$

where $O$ denotes the big-O notation which suppresses constants and subleading terms. The first contribution is comparable to the cost of executing the circuit $C_f$ once. The second contribution is where things get interesting: $1/\sqrt{r(f)} = \sqrt{1/r(f)}$ is quadratically smaller than the $1/r(f)$-term that dominates the classical cost (9.1). How can we hope to achieve such a quadratic improvement? There are two main ideas that we shall now cover.
[Margin note: total quantum cost is $\approx \mathrm{size}(f)/\sqrt{r(f)}$]

Figure 9.2 Geometric intuition behind amplitude amplification: view the uniform superposition $|\omega\rangle$ as a combination of the superposition of bad input strings $|\psi_{\mathrm{bad}}\rangle$ ($f(\boldsymbol{b}) = 0$) and the superposition of good input strings $|\psi_{\mathrm{good}}\rangle$ ($f(\boldsymbol{b}) = 1$). The amplitude of $|\psi_{\mathrm{good}}\rangle$ is proportional to $\theta \approx \sin(\theta) = \sqrt{r(f)} \ll 1$ (top left). A reflection about $|\psi_{\mathrm{bad}}\rangle$ (top center) followed by a reflection about $|\omega\rangle$ (top right) implements a rotation $\boldsymbol{R}$ that amplifies this amplitude (bottom right). A sequential application of many rotations amplifies this good amplitude to approximately everything (bottom left).

Idea I: re-interpret the uniform superposition in a problem-related fashion:


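Fix the Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ and let $t_+(f)$ be the number of inputs such that $f(\boldsymbol{b}) = 1$. Likewise, let $t_-(f) = 2^n - t_+(f)$ be the number of inputs such that $f(\boldsymbol{b}) = 0$. Then, we can rewrite a uniform superposition over all $n$-bit strings as

$$\begin{aligned}
|\omega\rangle = (\boldsymbol{H}|0\rangle)^{\otimes n} &= \frac{1}{\sqrt{2^n}} \sum_{\boldsymbol{b} \in \{0,1\}^n} |\boldsymbol{b}\rangle \\
&= \frac{1}{\sqrt{2^n}} \sum_{\boldsymbol{b}: f(\boldsymbol{b}) = 0} |\boldsymbol{b}\rangle + \frac{1}{\sqrt{2^n}} \sum_{\boldsymbol{b}: f(\boldsymbol{b}) = 1} |\boldsymbol{b}\rangle \\
&= \sqrt{\frac{t_-(f)}{2^n}} \frac{1}{\sqrt{t_-(f)}} \sum_{\boldsymbol{b}: f(\boldsymbol{b}) = 0} |\boldsymbol{b}\rangle + \sqrt{\frac{t_+(f)}{2^n}} \frac{1}{\sqrt{t_+(f)}} \sum_{\boldsymbol{b}: f(\boldsymbol{b}) = 1} |\boldsymbol{b}\rangle \\
&= \sqrt{\frac{2^n - t_+(f)}{2^n}}\, |\psi_{\mathrm{bad}}\rangle + \sqrt{\frac{t_+(f)}{2^n}}\, |\psi_{\mathrm{good}}\rangle \\
&= \sqrt{1 - r(f)}\, |\psi_{\mathrm{bad}}\rangle + \sqrt{r(f)}\, |\psi_{\mathrm{good}}\rangle. \qquad (9.3)
\end{aligned}$$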

This tells us that the uniform superposition is actually a linear combination of two quantum states: the superposition of 'bad inputs' $|\psi_{\mathrm{bad}}\rangle$ and the superposition of 'good inputs' $|\psi_{\mathrm{good}}\rangle$. The amplitude in front of the superposition of good inputs scales like the square root $\sqrt{r(f)}$ of the ratio of positive answers. This is a small number, but not quite as small as $r(f)$ itself. Amplitude amplification ultimately leverages this quadratic discrepancy. To make our life easier later on, we replace the amplitudes in Eq. (9.3) with trigonometric functions. Let $\theta \in [0, \pi/2]$ be the angle that obeys

$$\sqrt{r(f)} = \sin(\theta), \quad\text{such that}\quad |\omega\rangle = \cos(\theta)|\psi_{\mathrm{bad}}\rangle + \sin(\theta)|\psi_{\mathrm{good}}\rangle. \qquad (9.4)$$

We refer to Fig. 9.2 (top left) for a visual illustration. The problem is that for $\theta \ll 1$ (which happens if $r(f) \ll 1$), $\cos(\theta) \approx 1$ and $\sin(\theta) \approx 0$. This is also why our first quantum circuit needs so many trials to get a positive answer. However, it is also true that the uniform superposition $|\omega\rangle = \boldsymbol{H}^{\otimes n}|0 \ldots 0\rangle$ does contain some share of the good answers: $\sin(\theta) > 0$ whenever $r(f) > 0$. The second quantum idea is designed to increase the amplitude of $|\psi_{\mathrm{good}}\rangle$ at the cost of diminishing the amplitude of $|\psi_{\mathrm{bad}}\rangle$.
[Margin note: uniform superposition $|\omega\rangle$ contains both bad and good bitstrings]

Idea II: amplify the amplitude belonging to the superposition of ‘good’ bitstrings
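Recall that quantum circuits act as unitary matrices on state vectors. What is more, we can view Eq. (9.4) as an effective single-qubit state vector with only two amplitudes:

$$|\omega\rangle = \cos(\theta)|\psi_{\mathrm{bad}}\rangle + \sin(\theta)|\psi_{\mathrm{good}}\rangle = \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \end{pmatrix} \quad\text{with}\quad \langle\psi_{\mathrm{bad}}|\psi_{\mathrm{good}}\rangle = 0.$$

Next, suppose that we are somehow able to implement the following rotation gate on this effective qubit:

$$\boldsymbol{R} = \begin{pmatrix} \cos(2\theta) & -\sin(2\theta) \\ \sin(2\theta) & \cos(2\theta) \end{pmatrix}. \qquad (9.5)$$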

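We emphasize that this is not really a single-qubit gate, but a full-fledged $n$-qubit circuit (perhaps even larger). We demand, however, that it transforms the two special states $|\psi_{\mathrm{bad}}\rangle, |\psi_{\mathrm{good}}\rangle$ in exactly this fashion. In particular,

$$\begin{aligned}
\boldsymbol{R}|\omega\rangle &= \begin{pmatrix} \cos(2\theta) & -\sin(2\theta) \\ \sin(2\theta) & \cos(2\theta) \end{pmatrix} \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \end{pmatrix} \\
&= \begin{pmatrix} \cos(2\theta)\cos(\theta) - \sin(2\theta)\sin(\theta) \\ \sin(2\theta)\cos(\theta) + \cos(2\theta)\sin(\theta) \end{pmatrix} = \begin{pmatrix} \cos((2+1)\theta) \\ \sin((2+1)\theta) \end{pmatrix},
\end{aligned}$$

because $\sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)$ and $\cos(\alpha + \beta) = \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)$. More generally, we obtain for $T \geq 2$

$$\boldsymbol{R}^T|\omega\rangle = \underbrace{\boldsymbol{R} \times \cdots \times \boldsymbol{R}}_{T\text{ times}} |\omega\rangle = \begin{pmatrix} \cos((2T+1)\theta) \\ \sin((2T+1)\theta) \end{pmatrix}.$$

[Margin note: iteratively increase amplitude of good bitstrings within $|\omega\rangle$]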
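The closed-form expression for $\boldsymbol{R}^T|\omega\rangle$ can be verified numerically on the effective qubit. The sketch below is an illustration only: the ratio $r(f) = 1/64$ is an arbitrary choice and the $2 \times 2$ matrix stands in for the (much larger) circuit that would realize $\boldsymbol{R}$.

```python
import numpy as np

def rotation(theta):
    """Effective 2x2 rotation matrix from Eq. (9.5)."""
    return np.array([[np.cos(2 * theta), -np.sin(2 * theta)],
                     [np.sin(2 * theta),  np.cos(2 * theta)]])

r = 1 / 64                                   # assumed ratio of positive answers
theta = np.arcsin(np.sqrt(r))                # angle from Eq. (9.4)
omega = np.array([np.cos(theta), np.sin(theta)])

for T in range(0, 7):
    amplified = np.linalg.matrix_power(rotation(theta), T) @ omega
    closed_form = np.array([np.cos((2 * T + 1) * theta),
                            np.sin((2 * T + 1) * theta)])
    assert np.allclose(amplified, closed_form)
    print(T, amplified[1] ** 2)              # probability of reading out a good string
```

The printed probabilities grow with $T$ until the angle $(2T + 1)\theta$ passes $\pi/2$.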

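In other words: $T$ sequential applications of $\boldsymbol{R}$ increase the angle $\theta$ approximately $2T$-fold. And a larger angle also makes the sine-contribution larger at the cost of the cosine-contribution. The extreme case occurs at angle $\pi/2$ where $\sin(\pi/2) = 1$ and $\cos(\pi/2) = 0$. That is, if we choose

$$T_\sharp \geq 2 \quad\text{such that}\quad (2T_\sharp + 1)\theta \approx \pi/2, \qquad (9.6)$$

the resulting quantum state will (almost) only feature good contributions:

$$\begin{aligned}
\boldsymbol{R}^{T_\sharp} \times \boldsymbol{H}^{\otimes n}|0 \ldots 0\rangle &= \cos\left((2T_\sharp + 1)\theta\right)|\psi_{\mathrm{bad}}\rangle + \sin\left((2T_\sharp + 1)\theta\right)|\psi_{\mathrm{good}}\rangle \\
&\approx \cos(\pi/2)|\psi_{\mathrm{bad}}\rangle + \sin(\pi/2)|\psi_{\mathrm{good}}\rangle \\
&= 0 \times |\psi_{\mathrm{bad}}\rangle + 1 \times |\psi_{\mathrm{good}}\rangle \\
&= |\psi_{\mathrm{good}}\rangle = \frac{1}{\sqrt{2^n r(f)}} \sum_{\boldsymbol{b}: f(\boldsymbol{b}) = 1} |\boldsymbol{b}\rangle.
\end{aligned}$$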

Note that the final expression only features 'good' bit strings $\boldsymbol{b} \in \{0, 1\}^n$ such that $f(\boldsymbol{b}) = 1$. Any one of them solves our problem! And performing the readout will provide us with precisely one of them. Finally, note that $\theta$ is in one-to-one correspondence with $\sqrt{r(f)}$, courtesy of Eq. (9.4), and that $\sin(\theta) \approx \theta$ for $\theta \ll 1$ (in fact $\sin(\theta) \leq \theta$ for all $\theta \geq 0$). Hence $\theta \approx \sqrt{r(f)}$ and Eq. (9.6) yields $T_\sharp \approx \pi/(4\theta) \approx \pi/(4\sqrt{r(f)})$. The following rigorous proposition is an immediate consequence of our analysis.
Proposition 9.4 (amplitude amplification, high-level). Fix a satisfiable Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ with ratio of positive answers $r(f) \in (0, 1]$ and set $\theta \in (0, \pi/2]$ such that $\sin(\theta) = \sqrt{r(f)}$. Suppose that it is possible to generate an $n$-qubit circuit $\boldsymbol{R}$ that acts as Eq. (9.5) on $|\psi_{\mathrm{bad}}\rangle$ and $|\psi_{\mathrm{good}}\rangle$. Then, $T_\sharp \approx \pi/(4\sqrt{r(f)})$ sequential applications of $\boldsymbol{R}$ (approximately) turn the uniform superposition $|\omega\rangle = \boldsymbol{H}^{\otimes n}|0 \ldots 0\rangle$ into a superposition $|\psi_{\mathrm{good}}\rangle$ of only positive answers.
Note that the term approximately is actually necessary in Proposition 9.4. The optimal choice of $T_\sharp$ in Eq. (9.6) requires us to approximate the possibly irrational fraction $\pi/(2\theta)$ with an integer $2T_\sharp + 1$. This necessarily introduces rounding errors. Fortunately, these rounding errors are small and sine is a smoothly varying function. In particular, $2T_\sharp + 1 \approx \pi/(2\theta)$ is enough to ensure $\sin\left((2T_\sharp + 1)\theta\right) \approx \sin(\pi/2) = 1$ via a continuity argument.
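To get a feeling for the rounding step, one can compute $T_\sharp$ and the resulting success probability for concrete ratios. The sketch below picks the integer $T_\sharp$ whose odd multiple $2T_\sharp + 1$ is closest to $\pi/(2\theta)$; the function name `amplification_schedule` and the rounding rule are assumptions made here, not part of the lecture notes.

```python
import math

def amplification_schedule(r):
    """Return (theta, T_sharp, success probability) for a ratio r in (0, 1]."""
    theta = math.asin(math.sqrt(r))                        # Eq. (9.4)
    T = max(0, round((math.pi / (2 * theta) - 1) / 2))     # (2T + 1) * theta ~ pi/2
    p_good = math.sin((2 * T + 1) * theta) ** 2            # readout hits a good string
    return theta, T, p_good

for r in (1 / 4, 1 / 1024):
    theta, T, p_good = amplification_schedule(r)
    print(f"r={r:.4g}  T_sharp={T}  pi/(4 sqrt(r))={math.pi / (4 * math.sqrt(r)):.2f}  "
          f"success={p_good:.4f}  classical trials 3/r={3 / r:.0f}")
```

For small ratios the number of rotations stays close to $\pi/(4\sqrt{r(f)})$, far below the $3/r(f)$ trials of randomized search.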

9.4 Concrete circuit construction


Proposition 9.4 provides us with a clear strategy on how to look for positive
answers with the help of a quantum architecture. What is still missing is
a concrete realization of the (effective) rotation matrix 𝑹 in Eq. (9.5). The
following concrete construction dates back to Grover’s work from 1996. It
combines two big quantum circuits that each act as a geometric reflection
(think: mirrors). Both are illustrated in Fig. 9.3 and deserve a bit of extra
attention.
Figure 9.3 Quantum circuit blocks that achieve amplitude amplification:
(left): the function oracle $\boldsymbol{U}_f$ uses a reversible implementation of $C_f$ to change the sign of good bitstring answers: $\boldsymbol{U}_f|\boldsymbol{b}\rangle = (-1)^{f(\boldsymbol{b})}|\boldsymbol{b}\rangle$ for $\boldsymbol{b} \in \{0, 1\}^n$.
(right): the diffusion operator $\boldsymbol{S} = 2|\omega\rangle\langle\omega| - \mathbb{I}^{\otimes n}$ acts as a reflection about the uniform superposition: $\boldsymbol{S}|\omega\rangle = +|\omega\rangle$ and $\boldsymbol{S}|\nu\rangle = -|\nu\rangle$ whenever $\langle\omega|\nu\rangle = 0$ (orthogonality).

9.4.1 Circuit 1: reflection about good solutions (‘function oracle’)


Let us first take a closer look at the left-hand-side circuit in Fig. 9.3, the so-called function oracle or Grover oracle (named after the person who discovered it in the 90s).
Lemma 9.5 (action of function oracle). The first circuit displayed in Fig. 9.3 acts on the first $n$ qubits as

$$\boldsymbol{U}_f = \sum_{\boldsymbol{b} \in \{0,1\}^n} (-1)^{f(\boldsymbol{b})} |\boldsymbol{b}\rangle\langle\boldsymbol{b}|. \qquad (9.7)$$

In words: this unitary matrix reflects 'good input states' (it flips their sign) and leaves 'bad input states' as they are.
[Margin note: the function oracle acts as a reflection about 'good bitstrings' ($\boldsymbol{b}$ s.t. $f(\boldsymbol{b}) = 1$)]

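Proof. We can without loss of generality restrict our attention to the first $n$ qubits plus the last one. The part in the middle is only required to reversibly implement logical ∧, ∨. It starts in 0 and ends in 0. The remaining circuit prepares the last qubit in the state $|-\rangle = \boldsymbol{H}\boldsymbol{X}|0\rangle = \boldsymbol{H}|1\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$. We then XOR the value of $f(\boldsymbol{b})$ to it. For $\boldsymbol{b} \in \{0, 1\}^n$, we obtain

$$\boldsymbol{C}_f \times \left(\mathbb{I}^{\otimes n} \otimes (\boldsymbol{H}\boldsymbol{X})\right)|\boldsymbol{b}\, 0\rangle = |\boldsymbol{b}\rangle \otimes \left(|0 \oplus f(\boldsymbol{b})\rangle - |1 \oplus f(\boldsymbol{b})\rangle\right)/\sqrt{2} = (-1)^{f(\boldsymbol{b})}|\boldsymbol{b}\rangle \otimes |-\rangle.$$

The final $\boldsymbol{X}$ and $\boldsymbol{H}$ convert the $|-\rangle$-state back into the $|0\rangle$-state we start with:

$$\boldsymbol{U}_f|\boldsymbol{b}\, 0\rangle = (-1)^{f(\boldsymbol{b})}|\boldsymbol{b}\, 0\rangle \quad\text{for } \boldsymbol{b} \in \{0, 1\}^n.$$

Eq. (9.7) subsumes all these actions on different bit strings into a single display. ■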

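We can use Lemma 9.5 to infer the action of $\mathbf{U}_f$ on the effective qubit spanned by $|\psi_{\mathrm{bad}}\rangle$ and $|\psi_{\mathrm{good}}\rangle$. Linearity ensures
\[
\begin{aligned}
\mathbf{U}_f |\psi_{\mathrm{bad}}\rangle
&= \frac{1}{\sqrt{2^n (1 - r(f))}} \sum_{\mathbf{b}:\, f(\mathbf{b}) = 0} \mathbf{U}_f |\mathbf{b}\rangle \\
&= \frac{1}{\sqrt{2^n (1 - r(f))}} \sum_{\mathbf{b}:\, f(\mathbf{b}) = 0} (-1)^{f(\mathbf{b})} |\mathbf{b}\rangle \\
&= \frac{1}{\sqrt{2^n (1 - r(f))}} \sum_{\mathbf{b}:\, f(\mathbf{b}) = 0} |\mathbf{b}\rangle \\
&= + |\psi_{\mathrm{bad}}\rangle
\end{aligned}
\]
and, likewise,
\[
\begin{aligned}
\mathbf{U}_f |\psi_{\mathrm{good}}\rangle
&= \frac{1}{\sqrt{2^n r(f)}} \sum_{\mathbf{b}:\, f(\mathbf{b}) = 1} \mathbf{U}_f |\mathbf{b}\rangle \\
&= \frac{1}{\sqrt{2^n r(f)}} \sum_{\mathbf{b}:\, f(\mathbf{b}) = 1} (-1)^{f(\mathbf{b})} |\mathbf{b}\rangle \\
&= - \frac{1}{\sqrt{2^n r(f)}} \sum_{\mathbf{b}:\, f(\mathbf{b}) = 1} |\mathbf{b}\rangle \\
&= - |\psi_{\mathrm{good}}\rangle .
\end{aligned}
\]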

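Putting everything together produces
\[
\mathbf{U}_f^{\mathrm{eff}} = |\psi_{\mathrm{bad}}\rangle\langle\psi_{\mathrm{bad}}| - |\psi_{\mathrm{good}}\rangle\langle\psi_{\mathrm{good}}| = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{9.8}
\]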

9.4.2 Circuit 2: reflection about uniform superposition (‘diffusion operator’)


Let us now take a closer look at the right-hand-side circuit in Fig. 9.3, the so-called diffusion operator. At the heart of this diffusion operator is an $n$-fold controlled-NOT gate, a straightforward generalization of the Toffoli gate to $(n+1)$ qubits.
Lemma 9.6 The second circuit displayed in Fig. 9.3 acts on the first $n$ qubits as
\[
\mathbf{S} = 2 |\omega\rangle\langle\omega| - \mathbb{I}^{\otimes n} . \tag{9.9}
\]
In words: this is a reflection about the state $|\omega\rangle$, i.e. $\mathbf{S}|\omega\rangle = +|\omega\rangle$ and $\mathbf{S}|\nu\rangle = -|\nu\rangle$ whenever $\langle\omega|\nu\rangle = 0$ (orthogonality).
(Margin note: the diffusion operator acts as a reflection about $|\omega\rangle$, the uniform superposition.)
Exercise 9.7 (Proof of Lemma 9.6). Show that the circuit depicted in Fig. 9.3 is
indeed described by Eq. (9.9).
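We can use $|\omega\rangle = \cos(\theta)|\psi_{\mathrm{bad}}\rangle + \sin(\theta)|\psi_{\mathrm{good}}\rangle$ (Eq. (9.4)) to infer the action of $\mathbf{S}$ on the effective qubit spanned by $|\psi_{\mathrm{bad}}\rangle$ and $|\psi_{\mathrm{good}}\rangle$. To declutter notation, we abbreviate $|\psi_g\rangle = |\psi_{\mathrm{good}}\rangle$ and $|\psi_b\rangle = |\psi_{\mathrm{bad}}\rangle$:
\[
\begin{aligned}
\mathbf{S}^{\mathrm{eff}}
&= 2|\omega\rangle\langle\omega| - \mathbb{I} = 2|\omega\rangle\langle\omega| - \left( |\psi_b\rangle\langle\psi_b| + |\psi_g\rangle\langle\psi_g| \right) \\
&= 2 \left( \cos(\theta)|\psi_b\rangle + \sin(\theta)|\psi_g\rangle \right) \left( \cos(\theta)\langle\psi_b| + \sin(\theta)\langle\psi_g| \right) - \left( |\psi_b\rangle\langle\psi_b| + |\psi_g\rangle\langle\psi_g| \right) \\
&= \left( 2\cos^2(\theta) - 1 \right) |\psi_b\rangle\langle\psi_b| + 2\sin(\theta)\cos(\theta) |\psi_b\rangle\langle\psi_g| \\
&\quad + 2\sin(\theta)\cos(\theta) |\psi_g\rangle\langle\psi_b| + \left( 2\sin^2(\theta) - 1 \right) |\psi_g\rangle\langle\psi_g| \\
&= \begin{pmatrix} 2\cos^2(\theta) - 1 & 2\sin(\theta)\cos(\theta) \\ 2\sin(\theta)\cos(\theta) & 2\sin^2(\theta) - 1 \end{pmatrix}
= \begin{pmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{pmatrix}. \tag{9.10}
\end{aligned}
\]
Here, we have used some additional trigonometric identities (see footnote 1). Although it looks similar, this is not a rotation. The determinant, in particular, is $-\cos^2(2\theta) - \sin^2(2\theta) = -1$, which is indicative of a reflection (think mirror image).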

9.4.3 Combination of the two circuit blocks


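We now have effective single-qubit actions of both the function oracle (Eq. (9.8)) and the diffusion operator (Eq. (9.10)). A combination of the two yields the following effective gate matrix:
\[
\mathbf{R} = \mathbf{S}^{\mathrm{eff}} \times \mathbf{U}_f^{\mathrm{eff}}
= \begin{pmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{pmatrix} \times \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
= \begin{pmatrix} \cos(2\theta) & -\sin(2\theta) \\ \sin(2\theta) & \cos(2\theta) \end{pmatrix}. \tag{9.11}
\]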

This is now a proper rotation matrix. Note furthermore that this effective single-qubit functionality has exactly the form we need for Proposition 9.4.
(Margin note: a combination of diffusion and function oracles acts as an effective rotation.)
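As a sanity check of Eq. (9.11), the product of the two effective reflections can be multiplied out numerically. This is a small sketch under the assumption that the effective qubit is written in the ordered basis $(|\psi_{\mathrm{bad}}\rangle, |\psi_{\mathrm{good}}\rangle)$; the helper names and the angle `0.3` are illustrative choices, not taken from the lecture.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def effective_rotation(theta):
    """S_eff x U_eff, built from Eq. (9.10) and Eq. (9.8)."""
    s_eff = [[math.cos(2 * theta), math.sin(2 * theta)],
             [math.sin(2 * theta), -math.cos(2 * theta)]]
    u_eff = [[1.0, 0.0], [0.0, -1.0]]
    return matmul2(s_eff, u_eff)

def amplify(theta, steps):
    """Apply the effective rotation `steps` times to |omega> = (cos(theta), sin(theta))."""
    r = effective_rotation(theta)
    state = [math.cos(theta), math.sin(theta)]
    for _ in range(steps):
        state = [r[0][0] * state[0] + r[0][1] * state[1],
                 r[1][0] * state[0] + r[1][1] * state[1]]
    return state

theta = 0.3  # arbitrary illustrative angle
rotation = effective_rotation(theta)
after_two_steps = amplify(theta, 2)
```

Each application advances the angle by $2\theta$, so after $T$ steps the state should sit at angle $(2T+1)\theta$; for the values above, the good amplitude `after_two_steps[1]` should agree with $\sin(5\theta)$ up to rounding.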

9.5 Full amplitude amplification circuit


We now have all the pieces in place for a fully-fledged quantum search algorithm via amplitude amplification. At the heart of it are two building blocks:

(i) the function oracle which is essentially a reversible implementation of the function $f$ (Fig. 9.3, left) and
(ii) the diffusion operator with an $n$-fold controlled NOT-gate at its center (Fig. 9.3, right).

In contrast to our quantum implementation of random search in Fig. 9.1, we now group many of these elementary blocks into a single quantum circuit. A total of $T_\sharp \approx \pi/(4\theta) \approx \pi/(4\sqrt{r(f)})$ to be precise (each combined block rotates by an angle $2\theta$, see Eq. (9.11), and $\theta \approx \sqrt{r(f)}$ for small ratios). The last missing ingredient is an initial preparation of the uniform superposition $|\omega\rangle = \mathbf{H}^{\otimes n}|0 \ldots 0\rangle$ on the first $n$ qubits. We refer to Fig. 9.4 for a visual illustration. Proposition 9.4 and the explicit construction of the required rotation matrix in Eq. (9.11) readily imply the following theoretical underpinning.
1 In particular: $2\sin(\theta)\cos(\theta) = \sin(2\theta)$, $1 - 2\sin^2(\theta) = \cos(2\theta)$ and $1 - 2\cos^2(\theta) = 2\sin^2(\theta) - 1 = -\cos(2\theta)$ (because $\cos^2(\theta) + \sin^2(\theta) = 1$).

Figure 9.4 Full amplitude amplification circuit: generate a uniform superposition over all $n$-bit strings and then sequentially apply $T_\sharp$ combinations of diffusion operator and function oracle. A proper choice of $T_\sharp \approx \pi/(4\sqrt{r(f)})$ amplifies the amplitudes of ‘good bit strings’ ($\mathbf{b}$ s.t. $f(\mathbf{b}) = 1$) while essentially erasing the amplitudes of ‘bad bit strings’ ($\mathbf{b}$ s.t. $f(\mathbf{b}) = 0$). The total circuit size is $O\left( T_\sharp \, (n + \mathrm{size}(f)) \right)$.

Theorem 9.8 (amplitude amplification circuit). Let $f: \{0,1\}^n \to \{0,1\}$ be a Boolean function with ratio of positive answers $r(f)$ and circuit size $\mathrm{size}(f)$. Set $T_\sharp \approx \pi/(4\sqrt{r(f)})$. Then, the circuit displayed in Fig. 9.4 almost produces a uniform superposition of all bitstrings $\mathbf{b}$ such that $f(\mathbf{b}) = 1$. A readout of the first $n$ qubits produces one of them with a very high probability.
(Margin note: amplitude amplification achieves quadratic speedup.)

The total size of this circuit is $O\left( \mathrm{size}(f)/\sqrt{r(f)} \right)$ (assuming $\mathrm{size}(f) \geq n$). For sufficiently large input sizes $n \gg 1$, this becomes much cheaper than the total cost of $3 \times \mathrm{size}(f)/r(f)$ required to execute a random search protocol (Algorithm 9.1). This quadratic discrepancy becomes particularly pronounced if the ratio of positive answers $r(f)$ is very small ($r(f) \approx 1/2^n$).
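To make the theorem concrete, here is a hedged state-vector sketch of the circuit in Fig. 9.4. It does not build gates; it applies the two reflections from Eq. (9.7) and Eq. (9.9) directly to the $2^n$ amplitudes. The function name `amplitude_amplification`, the marked string `11` and the brute-force computation of $r(f)$ are assumptions made for illustration (a real application would not know $r(f)$ this cheaply).

```python
import math

def amplitude_amplification(n, f):
    """Simulate T_sharp rounds of (diffusion x oracle) on the uniform superposition."""
    dim = 2 ** n
    good = [b for b in range(dim) if f(b)]
    ratio = len(good) / dim                      # r(f), found by brute force here
    steps = round(math.pi / (4 * math.sqrt(ratio)))
    amp = [1 / math.sqrt(dim)] * dim             # |omega> = H^{(x)n} |0...0>
    for _ in range(steps):
        # Function oracle U_f: flip the sign of every good bitstring.
        amp = [-a if f(b) else a for b, a in enumerate(amp)]
        # Diffusion S = 2|omega><omega| - I: reflect each amplitude about the mean.
        mean = sum(amp) / dim
        amp = [2 * mean - a for a in amp]
    success = sum(amp[b] ** 2 for b in good)
    return steps, success

# Hypothetical toy instance: 4 qubits, a single good bitstring (index 11).
steps, success = amplitude_amplification(4, lambda b: b == 11)
```

For this toy instance the sketch uses three rounds, compared with roughly $1/r(f) = 16$ expected draws for random search, and the readout probability for the good string should come out close to, but not exactly, one.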
Next, we point out that the quantifier ‘almost’ in Theorem 9.8 takes into account rounding issues when choosing the optimal number of iterations $T_\sharp \approx \pi/(4\sqrt{r(f)})$. We can handle this by repeating the circuit multiple times until we get an output $\mathbf{b}$ that fulfils $f(\mathbf{b}) = 1$ (evaluating the function for a single input is very cheap by comparison). And continuity arguments ensure that the probability of getting a good string is very close to $\sin^2(\pi/2) = 1$.
A more serious issue is the fact that the number of repetitions $T_\sharp$ depends on the ratio of positive answers $r(f)$. And this ratio may not be known in advance! To make matters worse, it is absolutely possible to overshoot amplitude amplification. If we get $T_\sharp$ wrong by a factor of 2, for instance, we effectively end with a superposition of bad answers:
\[
\left( \mathbf{S} \times \mathbf{U}_f \right)^{2 T_\sharp} |\omega\rangle \approx \cos(2(\pi/2)) |\psi_{\mathrm{bad}}\rangle + \sin(2(\pi/2)) |\psi_{\mathrm{good}}\rangle = -|\psi_{\mathrm{bad}}\rangle .
\]
(Margin note: amplitude amplification can overshoot.)

This periodicity is an unavoidable feature when constructing amplitude amplification the way we have done it here (which dates back to Grover in the 90s). However, much more recently, researchers have developed an alternative way to construct amplitude amplification circuits that don’t have this problem. This construction is based on a technique called quantum singular value transformation. Discussing it would go beyond the scope of this introductory lecture.
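The overshooting effect is easy to quantify with the effective rotation picture: after $T$ rounds the state sits at angle $(2T+1)\theta$, so a readout yields a good string with probability $\sin^2((2T+1)\theta)$. The sketch below evaluates this closed form; the ratio $1/1024$ and the function name are hypothetical choices for illustration.

```python
import math

def success_probability(ratio, steps):
    """Probability of reading out a good string after `steps` rounds."""
    theta = math.asin(math.sqrt(ratio))          # sin(theta) = sqrt(r(f))
    return math.sin((2 * steps + 1) * theta) ** 2

ratio = 1 / 1024                                 # one good string among 2^10
t_opt = round(math.pi / (4 * math.sqrt(ratio)))  # recommended number of rounds
p_opt = success_probability(ratio, t_opt)        # close to 1
p_double = success_probability(ratio, 2 * t_opt) # close to 0: overshoot
```

With these numbers the recommended number of rounds succeeds almost surely, whereas twice as many rounds rotate the state almost entirely back onto the bad strings.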
10. Quantum Fourier-type transforms

Date: 4 December 2024

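Agenda:
1 General overview
2 Walsh-Hadamard transform
3 Discrete Fourier transform

10.1 General overview

A discrete Fourier transform (FT) in $D$ dimensions is a unitary $D \times D$ matrix $\mathbf{F} \in \mathbb{C}^{D \times D}$ whose matrix entries have flat magnitudes. Applying this matrix to vectors (matrix-vector multiplication) describes a change of perspective. It preserves the (Euclidean) length of the vectors but maps sparse vectors to flat ones and vice versa. In formulas, length preservation (aka unitarity) manifests itself as
\[
\| \mathbf{F}\mathbf{x} \|_2^2 = (\mathbf{F}\mathbf{x})^\dagger (\mathbf{F}\mathbf{x}) = \mathbf{x}^\dagger \mathbf{F}^\dagger \mathbf{F} \mathbf{x} = \mathbf{x}^\dagger \mathbb{I} \mathbf{x} = \mathbf{x}^\dagger \mathbf{x} = \| \mathbf{x} \|_2^2 \quad \text{for every } \mathbf{x} \in \mathbb{C}^D ,
\]
where we have used $\mathbf{F}^\dagger \mathbf{F} = \mathbb{I}$ ($\mathbf{F}$ is unitary). Flatness manifests itself in the action of $\mathbf{F}$ on standard basis vectors $\mathbf{e}_0 = (1\, 0 \cdots 0)^\dagger, \ldots, \mathbf{e}_{D-1} = (0 \cdots 0\, 1)^\dagger$. Let $\mathbf{f}_k = \mathbf{F}\mathbf{e}_k \in \mathbb{C}^D$ be the image of the $k$th standard basis vector under the Fourier transform $\mathbf{F}$. Then, for all $l, k \in \{0, \ldots, D-1\}$
\[
\left| [\mathbf{f}_k]_l \right|^2 = \left| \mathbf{e}_l^\dagger \mathbf{f}_k \right|^2 = \left| \mathbf{e}_l^\dagger \mathbf{F} \mathbf{e}_k \right|^2 = \left| [\mathbf{F}]_{l,k} \right|^2 = \frac{1}{D} . \tag{10.1}
\]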
In words: all coefficients of all Fourier basis vectors (or, in fact, every entry in the Fourier-type matrix itself) must have the same magnitude. This indeed ensures that the Fourier-type transformation maps (very) sparse vectors $\mathbf{x} \in \mathbb{C}^D$ to (very) dense vectors.
(Margin note: Fourier type transform: unitary matrix with ‘flat’ matrix entries.)
We can recognize a quantum computing interpretation of these two mathematical properties if we set $D = 2^n$ ($n$ qubits), see Fig. 10.1 for a visual illustration. Length preservation (aka unitarity) ensures that $\mathbf{F} \in \mathbb{C}^{2^n \times 2^n}$ describes the truth table of a valid $n$-qubit quantum circuit. And the flatness condition (10.1) demands that this quantum circuit maps every deterministic bit encoding $|b_0 \cdots b_{n-1}\rangle$ into a uniform superposition of all $2^n$ possible bitstrings. What is more, all these uniform superpositions must be mutually orthogonal and hence perfectly distinguishable. After all
\[
\mathbf{f}_k^\dagger \mathbf{f}_l = \mathbf{e}_k^\dagger \mathbf{F}^\dagger \mathbf{F} \mathbf{e}_l = \mathbf{e}_k^\dagger \mathbf{e}_l =
\begin{cases} 1 & \text{if } k = l , \\ 0 & \text{if } k \neq l . \end{cases}
\]

Figure 10.1 General layout of a Fourier type circuit: an $n$-qubit circuit comprised of $\mathrm{poly}(n)$ gates effectively implements a Fourier-type (unitary+flat) matrix multiplication in $D = 2^n$ dimensions.
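These are all naturally desirable features in quantum computing. In fact, we are already intimately familiar with one Fourier-type transform.
Example 10.1 (the Hadamard gate is a Fourier-type transform for $D = 2$). Consider the truth table of a single-qubit Hadamard gate, where $D = 2^1 = 2$:
\[
\mathbf{H} = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 \\ +1 & -1 \end{pmatrix} \in \mathbb{C}^{2 \times 2} .
\]
This matrix is unitary ($\mathbf{H}^\dagger \times \mathbf{H} = \mathbf{H} \times \mathbf{H} = \mathbb{I}_{2 \times 2}$) and therefore preserves vector lengths. What is more, every matrix entry obeys
\[
\left| [\mathbf{H}]_{k,l} \right|^2 = \left| \frac{\pm 1}{\sqrt{2}} \right|^2 = \frac{1}{2} = \frac{1}{D} ,
\]
which satisfies the flatness condition (10.1). These two features ensure that the Hadamard gate maps the two possible deterministic input states $|0\rangle$ and $|1\rangle$ onto uniform superpositions that are perfectly distinguishable from each other:
\[
\begin{aligned}
|+\rangle &= \mathbf{H}|0\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 \\ +1 \end{pmatrix} =: \mathbf{f}_0 \in \mathbb{C}^2 , \\
|-\rangle &= \mathbf{H}|1\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 \\ -1 \end{pmatrix} =: \mathbf{f}_1 \in \mathbb{C}^2 .
\end{aligned}
\]
(Margin note: the Hadamard gate is a Fourier-type transform.)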

We emphasize that the study and application of Fourier-type transforms predates quantum computing by several centuries. These transforms have been, and still are, powerful tools in mathematics, engineering and natural sciences. Here are a couple of examples that underscore their prominent role across scientific disciplines:
(Margin note: Fourier-type transforms have many applications.)
• Uncertainty relations: by definition, Fourier-type transforms map (very)
sparse vectors to (very) dense ones and vice versa. This, in turn, implies
that it is impossible to find vectors whose dominant entries are confined
to adjacent subregions in both representations simultaneously. Results
of this type are known as uncertainty relations and feature prominently
in acoustics (harmonic analysis), wireless communication and quantum
mechanics (Heisenberg uncertainty principle).
• Differential equations: many differential equations can be solved by
performing a continuous version of one particular Fourier-type transform.
This is largely because the Fourier transform maps derivatives (think:
𝜕/𝜕𝑥 ) to multiplications (think ×𝑥 ) and vice versa. A concrete example
is the heat equation on a circle in ℝ2 . The study of differential equations
is where Fourier-type transforms actually originated.
• Image compression: the popular JPEG format for pixel images computes
a Fourier-type transform (e.g. a discrete cosine transform) to transform
a 2D pixel image into another basis where natural images become
(approximately) sparse. Cutting off the small vector entries does not
distort the image too much and can save a lot of storage space: a dense
vector in pixel space is encoded as a (very) sparse vector in discrete cosine
space. Wavelets and also convolutional neural networks build on this
idea and have achieved considerable success in image analysis.

The wealth of applications of $D$-dimensional Fourier-type transforms, combined with their possible interpretation as unitary circuits on only $n = \lceil \log_2(D) \rceil$ qubits, suggests a potential window of opportunity. These $n$-qubit circuits are efficient to realize on quantum hardware and effectively operate on $D = 2^n$ different amplitudes simultaneously. This is an enormous compression potential, especially when $n$ and $D = 2^n$ become very large!
But merely representing Fourier-type transforms as hypothetical quantum circuits already comes with remarkable benefits. Recall from Lecture 8 that we can use sparse matrix-vector multiplication to classically simulate the action of a Fourier-type circuit $\mathbf{F}$ on $n$-qubit state vectors, at a cost that (roughly) scales as $\mathrm{size}(\mathbf{F}) \times 2^n = \mathrm{poly}(n) \times 2^n$, provided that $\mathrm{size}(\mathbf{F}) = \mathrm{poly}(n)$ (polynomial circuit size). If we now substitute in $D = 2^n$ (the actual dimension of the matrix-vector multiplications involved), we obtain a runtime of (order) $\mathrm{polylog}(D) \times D$ to compute $\mathbf{F}\mathbf{x}$ (think $\mathbf{F}|\xi\rangle$, where $|\xi\rangle$ is an arbitrary $n$-qubit state). This is much, much better than the naive cost of $D^2$ that is associated with general matrix-vector multiplication!
(Margin note: Fast (classical) Fourier-type transformations.)
Such smart executions of a Fourier-type matrix-vector multiplication $\mathbf{F}\mathbf{x}$ are known as Fast Fourier (type) transforms (or FFT): $\mathrm{polylog}(D) \times D$ vs. $D^2$. Not only are Fourier-type transforms useful, they are also much cheaper to execute than standard matrix-vector multiplications. We will derive two such FFTs in today’s lecture!

10.2 Walsh-Hadamard transform


10.2.1 Formal definition
One Fourier-type transformation is a straightforward extension of the Hadamard gate we revisited in Example 10.1: we populate a $D \times D$ matrix with $\pm 1$ entries (same magnitude) in a specific way to ensure that a normalized version of the overall matrix is unitary. For $D = 2$ this strategy produces the Hadamard truth table
\[
\mathbf{WH}_{(2)} = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 \\ +1 & -1 \end{pmatrix} \in \mathbb{R}^{2 \times 2} . \tag{10.2}
\]
Note that this matrix is real-valued and unitary: the two columns are orthogonal to each other and normalized to unit length. A matrix with this property is known as a Walsh-Hadamard transformation.
(Margin note: Walsh-Hadamard transformation.)
Unfortunately, this construction does not extend to the next larger dimension $D = 3$. It is easy to convince oneself that it is impossible to construct three $(\pm 1)$-vectors in $D = 3$ dimensions that are all orthogonal to each other (the inner product of two such vectors is a sum of three $\pm 1$ terms and hence odd, so it can never be zero). It does, however, work again for $D = 4$:
\[
\mathbf{WH}_{(4)} = \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & -1 & +1 & -1 \\ +1 & +1 & -1 & -1 \\ +1 & -1 & -1 & +1 \end{pmatrix} \in \mathbb{R}^{4 \times 4} .
\]
This matrix is unitary (all column vectors are mutually orthogonal and normalized to unit length) and obeys the flatness condition from Eq. (10.1). A closer look at this matrix reveals that the 4-dimensional Walsh-Hadamard matrix is actually a two-fold Kronecker product of the 2-dimensional Hadamard gate from Eq. (10.2):
\[
\mathbf{WH}_{(4)} = \mathbf{H}^{\otimes 2} \in \mathbb{R}^{2^2 \times 2^2} = \mathbb{R}^{4 \times 4} .
\]
This alternative representation in terms of a Kronecker product allows us to very easily verify the necessary conditions: (i) $\mathbf{WH}_{(4)}$ is a unitary matrix, because it is the Kronecker product of two unitary matrices. (ii) the individual entries are proportional to $\pm 1$, because entries of a Kronecker product are products of the entries of the smaller matrices involved. And these are themselves always $+1$ and $-1$ (up to the normalization factor $1/\sqrt{2}$).
These arguments readily extend to larger Kronecker products of Hadamard gates and give rise to an infinite family of Fourier-type transformations. The following result is an immediate consequence of this observation:

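Theorem 10.2 (Walsh-Hadamard transform in $D = 2^n$ dimensions). The Walsh-Hadamard transform (WHT) exists for dimensions $D = 2^n$ and is constructed as
\[
\mathbf{WH}_{(2^n)} = \mathbf{H}^{\otimes n} \in \mathbb{R}^{2^n \times 2^n} . \tag{10.3}
\]
This produces the following formula for the individual entries: $\left[ \mathbf{WH}_{(2^n)} \right]_{k,l} = (-1)^{k \odot l} / \sqrt{D}$, where $k \odot l = \sum_{j=0}^{n-1} k_j l_j$ denotes the bitwise inner product of the binary representations $k_0 \cdots k_{n-1}$ and $l_0 \cdots l_{n-1}$ of the two indices (for $n = 1$ this is the ordinary product $k \times l$). Every such matrix is unitary and every entry has the same magnitude. Hence, the Walsh-Hadamard transform is a Fourier-type matrix.

Note that, since $D = 2^n$, Eq. (10.3) is equivalent to
\[
\mathbf{WH}_{(D)} = \mathbf{H}^{\otimes \log_2(D)} , \tag{10.4}
\]
which only makes sense if $\log_2(D)$ is a positive integer.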
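The Kronecker-product definition can be spelled out in a few lines. The sketch below builds $\mathbf{H}^{\otimes n}$ from nested lists and compares every entry with a sign determined by the parity of the bitwise AND of row and column index, scaled by $1/\sqrt{D}$. Both the helper names and the bit-parity reading of the entry formula are assumptions of this sketch; for a single qubit the parity of the bitwise AND coincides with the ordinary product of the indices.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # single-qubit Hadamard truth table

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def walsh_hadamard(n):
    """n-fold Kronecker product of H, a 2^n x 2^n matrix."""
    w = [[1.0]]
    for _ in range(n):
        w = kron(w, H)
    return w

def entry(k, l, n):
    """Sign from the parity of the bitwise AND of k and l, scaled by 1/sqrt(2^n)."""
    parity = bin(k & l).count("1") % 2
    return (-1) ** parity / math.sqrt(2 ** n)

n = 3
W = walsh_hadamard(n)
max_dev = max(abs(W[k][l] - entry(k, l, n))
              for k in range(2 ** n) for l in range(2 ** n))
```

Building the full matrix costs on the order of $D^2$ memory and is only meant as a check for small $n$.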


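Exercise 10.3 (matrix entries of Walsh-Hadamard transform). Derive the entry formula $\left[ \mathbf{WH}_{(D)} \right]_{k,l} = (-1)^{k \odot l} / \sqrt{D}$ for $0 \leq k, l \leq (D-1)$ of the Walsh-Hadamard transform matrix from the Kronecker product definition in Eq. (10.4). Here, $k \odot l = \sum_j k_j l_j$ is the bitwise inner product of the binary representations of $k$ and $l$.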

10.2.2 Implementation as a quantum circuit


The formal definition of the Walsh-Hadamard transform as a Kronecker product of $n$ single-qubit Hadamard gates provides us with a straightforward quantum circuit realization: one Hadamard gate on each of the $n$ qubit wires. This $n$-qubit circuit is arguably as easy and cheap as it gets: a single layer of $n$ parallel single-qubit gates is enough.
(Margin note: quantum realization of WH transform requires $n$ Hadamard gates.)
10.2.3 Fast Walsh-Hadamard transform
We now show how one can use the quantum circuit implementation of the Walsh-Hadamard transformation matrix $\mathbf{WH}_{(D)}$ from Sub. 10.2.2 to construct a fast classical transformation. That is, an alternative way to compute matrix-vector multiplications
\[
\mathbf{WH}_{(D)} \times \mathbf{x} \in \mathbb{C}^D \quad \text{for arbitrary } \mathbf{x} \in \mathbb{C}^D , \tag{10.5}
\]
whose resource cost scales much more favorably than the general cost of (order) $D^2$ for multiplying a $D \times D$ matrix with a $D$-dimensional vector. Remarkably, we will get an almost quadratic improvement.

Theorem 10.4 (Fast (classical) Walsh-Hadamard transform). Fix $D = 2^n$. Then, we can compute the Walsh-Hadamard transform $\mathbf{WH}_{(D)} \times \mathbf{x}$ of any vector $\mathbf{x} \in \mathbb{C}^D$ with only (order) $\log_2(D) \times D \ll D^2$ (classical) operations.

Proof. First, let us assume that the vector $\mathbf{x} \in \mathbb{C}^D$ is normalized to unit length, i.e. $\| \mathbf{x} \|_2 = 1$. Then, we can interpret the $2^n$ coefficients of $\mathbf{x} \in \mathbb{C}^{2^n}$ as amplitudes belonging to a superposition of all $2^n$ possible $n$-bit configurations $|b_0 \cdots b_{n-1}\rangle$:
\[
|\xi\rangle = \sum_{l=0}^{2^n - 1} [\mathbf{x}]_l \, |\llcorner l \lrcorner\rangle = \sum_{b_0, \ldots, b_{n-1} = 0}^{1} [\mathbf{x}]_{2^{n-1} b_0 + \cdots + 2 b_{n-2} + b_{n-1}} \, |b_0 \cdots b_{n-1}\rangle .
\]
The key idea is now to interpret $\mathbf{WH}_{(2^n)} = \mathbf{H}^{\otimes n}$ as an $n$-qubit quantum circuit comprised of $n = \log_2(D)$ Hadamard gates that acts on the $n$-qubit state $|\xi\rangle$:
\[
\mathbf{WH}_{(D)} \times \mathbf{x} = \mathbf{H}^{\otimes n} |\xi\rangle .
\]
So, computing the Walsh-Hadamard transform of $\mathbf{x}$ is equivalent to classically computing the action of the $n$-qubit quantum circuit $\mathbf{H}^{\otimes n}$ on a fixed $n$-qubit state vector $|\xi\rangle$. The techniques we developed in Lecture 8 tell us how to do this. First, we interpret the quantum circuit in the following fashion:

The circuit on the right contains $n$ layers. And each layer only contains a single Hadamard gate. This reformulation is desirable when we attempt to simulate this circuit on classical hardware. Indeed,
\[
\mathbf{WH}_{(2^n)} = \underbrace{\left( \mathbb{I}^{\otimes (n-1)} \otimes \mathbf{H} \right) \times \cdots \times \left( \mathbf{H} \otimes \mathbb{I}^{\otimes (n-1)} \right)}_{n \text{ matrix products}}
\]
and each matrix on the right is extremely sparse: only $s = 2$ entries in each row/column are different from zero. This observation allows us to decompose the full matrix-vector multiplication into a sequence of $n$ sparse matrix-vector multiplications:
\[
\mathbf{y}_0 = \mathbf{x}, \quad \mathbf{y}_1 = \left( \mathbf{H} \otimes \mathbb{I}^{\otimes (n-1)} \right) \mathbf{y}_0 , \quad \ldots, \quad \mathbf{y}_n = \left( \mathbb{I}^{\otimes (n-1)} \otimes \mathbf{H} \right) \mathbf{y}_{n-1} .
\]
Each matrix-vector multiplication only costs about (order) $2 \times 2^n$ resources, because the matrix involved is extremely sparse. This produces a total cost of (order) $n \times 2^n$ operations to compute $\mathbf{WH}_{(D)} \times \mathbf{x}$, provided that $\| \mathbf{x} \|_2 = 1$. This result looks more impressive when we set $D = 2^n$ and $n = \log_2(D)$: the total cost is (order) $\log_2(D) \times D$.
Finally, suppose that the vector $\mathbf{x} \in \mathbb{C}^D$ is not normalized to unit length, i.e. $\| \mathbf{x} \|_2 \neq 1$. Then, we can invest (order) $D$ operations to compute the length
\[
\| \mathbf{x} \|_2 = \sqrt{ \sum_{l=0}^{D-1} \left| [\mathbf{x}]_l \right|^2 } > 0
\]
and normalize the vector (see footnote 1): $\mathbf{x} \mapsto \hat{\mathbf{x}} = \mathbf{x} / \| \mathbf{x} \|_2$. We can now use linearity of matrix-vector multiplication to rewrite the task as
\[
\mathbf{WH}_{(D)} \times \mathbf{x} = \mathbf{WH}_{(D)} \times \left( \| \mathbf{x} \|_2 \times \hat{\mathbf{x}} \right) = \| \mathbf{x} \|_2 \times \left( \mathbf{WH}_{(D)} \times \hat{\mathbf{x}} \right)
\]
and use the strategy developed above to execute the bracketed matrix-vector product. To obtain the final, unnormalized, result, we only need to multiply each entry of $\left( \mathbf{WH}_{(D)} \times \hat{\mathbf{x}} \right) \in \mathbb{C}^D$ by the single number $\| \mathbf{x} \|_2$. This can again be done with only (order) $D$ elementary operations. So, the total overhead of this pre- and post-processing is (order) $D$ which is smaller than the cost of executing the normalized matrix-vector multiplication. ■
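The layer-by-layer strategy from the proof translates into a short classical routine. The sketch below is one common in-place formulation (a butterfly loop), offered as an illustration rather than as the lecture's own implementation; the function name is hypothetical. Each pass of the outer loop plays the role of one single-Hadamard layer, and the common factor $1/\sqrt{D}$ is applied once at the end, so by linearity no prior normalization of the input is needed.

```python
import math

def fast_walsh_hadamard(x):
    """Return WH_(D) x for len(x) = D = 2**n using about n * D additions."""
    y = list(x)
    d = len(y)
    if d == 0 or d & (d - 1) != 0:
        raise ValueError("length must be a power of two")
    h = 1
    while h < d:                              # log2(D) layers, one per qubit
        for start in range(0, d, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b  # unnormalized Hadamard on one pair
        h *= 2
    scale = 1 / math.sqrt(d)                  # n factors of 1/sqrt(2) collected
    return [scale * v for v in y]

column_1 = fast_walsh_hadamard([0, 1, 0, 0])
column_3 = fast_walsh_hadamard([0, 0, 0, 1])
```

Feeding in standard basis vectors returns the corresponding columns of the matrix, which can be compared with $\mathbf{WH}_{(4)}$ from Section 10.2.1.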

1 Note that this strategy only works if $\| \mathbf{x} \|_2 \neq 0$. But $\| \mathbf{x} \|_2 = 0$ is only possible if $\mathbf{x} = \mathbf{0} = (0 \cdots 0)^T$ is the all-zeros vector. This case is, however, trivial and can be handled separately: $\mathbf{WH}_{(D)} \times \mathbf{0} = \mathbf{0}$.

10.3 Discrete Fourier transform

10.3.1 Formal definition
The Walsh-Hadamard transform is one way to extend the Fourier-type properties of the Hadamard gate to more than $D = 2$ dimensions. However, it only works for dimensions $D = 2^n$ that are a power of two. The discrete Fourier transform (DFT) follows from an alternative route that is well defined for any dimension $D$. The starting point is to recognize $+1$ and $-1$ as solutions to the quadratic equation $x^2 = +1$. In other words: $(-1) = \exp(\mathrm{i} 2\pi/2) =: \omega_2$ is a second root of unity and $(+1)$ can be interpreted as the 0th power of this root: $(+1) = \exp(0) = \exp(\mathrm{i} 2\pi/2 \times 0) = \omega_2^0$. Inserting these interpretations of $\pm 1$ into the Hadamard gate produces
\[
\mathbf{H} = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 \\ +1 & -1 \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} \omega_2^0 & \omega_2^0 \\ \omega_2^0 & \omega_2^1 \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} \omega_2^{0 \times 0} & \omega_2^{0 \times 1} \\ \omega_2^{1 \times 0} & \omega_2^{1 \times 1} \end{pmatrix} .
\]

Note that the reformulation on the very right, where we express the powers as product of row and column index, is very reminiscent of the entry formula of the general Walsh-Hadamard transform in Theorem 10.2. Doing this for the Hadamard gate might look overly complicated at first sight. But, the interpretation on the very right suggests a way to generalize this matrix structure to $D = 3$ dimensions. We replace the second root of unity $\omega_2 = \exp(2\pi \mathrm{i}/2)$ with a third root of unity $\omega_3 = \exp(2\pi \mathrm{i}/3)$, i.e. one solution of the cubic equation $x^3 = +1$. Then, we use products of column and row indices to populate a $3 \times 3$ matrix:
\[
\begin{aligned}
\mathbf{F}_{(3)}
&= \frac{1}{\sqrt{3}} \begin{pmatrix} \omega_3^{0 \times 0} & \omega_3^{0 \times 1} & \omega_3^{0 \times 2} \\ \omega_3^{1 \times 0} & \omega_3^{1 \times 1} & \omega_3^{1 \times 2} \\ \omega_3^{2 \times 0} & \omega_3^{2 \times 1} & \omega_3^{2 \times 2} \end{pmatrix}
= \frac{1}{\sqrt{3}} \begin{pmatrix} \omega_3^0 & \omega_3^0 & \omega_3^0 \\ \omega_3^0 & \omega_3^1 & \omega_3^2 \\ \omega_3^0 & \omega_3^2 & \omega_3^4 \end{pmatrix} \\
&= \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega_3 & \omega_3^2 \\ 1 & \omega_3^2 & \omega_3 \end{pmatrix} \in \mathbb{C}^{3 \times 3} . \tag{10.6}
\end{aligned}
\]
Here, we have used $\omega_3^0 = 1$ and $\omega_3^4 = \omega_3 \times \omega_3^3 = \omega_3 \times 1 = \omega_3$ ($\omega_3$ is a third root of unity) to simplify the matrix somewhat. This matrix is complex-valued and symmetric ($\mathbf{F}_{(3)}^{\mathsf{T}} = \mathbf{F}_{(3)}$). It also obeys the flatness condition, because all powers of roots of unity lie on the complex unit circle. In particular, $|\omega_3| = |\omega_3^2| = |1| = 1$. Finally, this matrix is also a unitary matrix:
\[
\mathbf{F}_{(3)}^\dagger \times \mathbf{F}_{(3)} = \mathbb{I}_{3 \times 3} .
\]

We leave the derivation as an instructive exercise in arithmetic with complex numbers.
Exercise 10.5 Verify the unitarity relation $\mathbf{F}_{(3)}^\dagger \times \mathbf{F}_{(3)} = \mathbb{I}_{3 \times 3}$ by direct computation. Hint: use the following formulas for 3rd roots of unity: (i) complex conjugation acts as $(\omega_3)^* = \omega_3^2$ and $(\omega_3^2)^* = \omega_3$ (why?) and (ii) $1 + \omega_3 + \omega_3^2 = 0$ (why?).


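Let us do one more example before stating the general definition. For $D = 4$, our fourth root of unity becomes $\omega_4 = \exp(2\pi \mathrm{i}/4) = \mathrm{i}$. In this dimension, the DFT matrix is populated with the imaginary unit $\mathrm{i}$, as well as its powers:
\[
\begin{aligned}
\mathbf{F}_{(4)}
&= \frac{1}{\sqrt{4}} \begin{pmatrix} \omega_4^{0 \times 0} & \omega_4^{0 \times 1} & \omega_4^{0 \times 2} & \omega_4^{0 \times 3} \\ \omega_4^{1 \times 0} & \omega_4^{1 \times 1} & \omega_4^{1 \times 2} & \omega_4^{1 \times 3} \\ \omega_4^{2 \times 0} & \omega_4^{2 \times 1} & \omega_4^{2 \times 2} & \omega_4^{2 \times 3} \\ \omega_4^{3 \times 0} & \omega_4^{3 \times 1} & \omega_4^{3 \times 2} & \omega_4^{3 \times 3} \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} \mathrm{i}^0 & \mathrm{i}^0 & \mathrm{i}^0 & \mathrm{i}^0 \\ \mathrm{i}^0 & \mathrm{i}^1 & \mathrm{i}^2 & \mathrm{i}^3 \\ \mathrm{i}^0 & \mathrm{i}^2 & \mathrm{i}^4 & \mathrm{i}^6 \\ \mathrm{i}^0 & \mathrm{i}^3 & \mathrm{i}^6 & \mathrm{i}^9 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & +\mathrm{i} & -1 & -\mathrm{i} \\ +1 & -1 & +1 & -1 \\ +1 & -\mathrm{i} & -1 & +\mathrm{i} \end{pmatrix} \in \mathbb{C}^{4 \times 4} . \tag{10.7}
\end{aligned}
\]
Here, we have used $\mathrm{i}^2 = -1$, $\mathrm{i}^3 = -\mathrm{i}$ and $\mathrm{i}^4 = +1$ to simplify the powers of $\mathrm{i}$. This matrix certainly obeys the flatness condition ($|\pm \mathrm{i}| = |\pm 1| = 1$) and is also unitary.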
Exercise 10.6 (𝑭 ( 4 ) is a unitary matrix). Show that the matrix 𝑭 ( 4 ) is unitary, i.e.
𝑭 †(4 ) × 𝑭 ( 4 ) = 𝕀4×4 . Hint: the adjoint features both transposition and complex
conjugation, also i2 = −1.
We are now ready to present the 𝐷 -dimensional discrete Fourier transform
(DFT) matrix. It is a straightforward extension from the examples we just did
for 𝐷 = 2 (Hadamard gate, base case), 𝐷 = 3 and 𝐷 = 4.
Definition 10.7 (Discrete Fourier Transform (DFT) in $D$ dimensions). Fix a dimension $D \in \mathbb{N}$ and let $\omega_D = \exp(2\pi \mathrm{i}/D) \in \mathbb{C}$ be a $D$th root of unity (i.e. a solution to $x^D = +1$). The $D$-dimensional Discrete Fourier Transform (DFT) is a complex-valued $D \times D$ matrix $\mathbf{F}_{(D)}$ with entries
\[
\left[ \mathbf{F}_{(D)} \right]_{k,l} = \frac{1}{\sqrt{D}} \, \omega_D^{k \times l} \quad \text{with } k, l \in \{0, \ldots, D-1\} . \tag{10.8}
\]
By construction, the DFT matrix is a Fourier-type matrix. In fact, this is
where the name comes from.
Proposition 10.8 The DFT matrix defined in Eq. (10.8) is a unitary matrix
that also obeys the flatness condition from Eq. (10.1). In other words: the
𝐷 -dimensional DFT is a Fourier-type transform.
The proof is an extension of our arguments for 𝐷 = 2, 𝐷 = 3 and 𝐷 = 4 to
general dimensions. It follows from elementary properties of complex phases,
in particular, 𝐷 -th roots of unity.

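Proof of Proposition 10.8. Let us first show that the matrix is unitary, i.e. $\mathbf{F}_{(D)}^\dagger \times \mathbf{F}_{(D)} = \mathbb{I}_{D \times D}$. We do this by showing that $\mathbf{P} = \mathbf{F}_{(D)}^\dagger \mathbf{F}_{(D)}$ obeys $[\mathbf{P}]_{a,b} = \delta_{a,b} = [\mathbb{I}]_{a,b}$ for $a, b \in \{0, \ldots, D-1\}$ arbitrary. To this end, use $\overline{\omega_D^x} = \bar{\omega}_D^x = \omega_D^{-x}$, $\omega_D^a \times \omega_D^b = \omega_D^{a+b}$ (why?) and the rules of matrix multiplication to deduce
\[
\begin{aligned}
[\mathbf{P}]_{a,b}
&= \sum_{k=0}^{D-1} \left[ \mathbf{F}^\dagger \right]_{a,k} \times [\mathbf{F}]_{k,b} = \sum_{k=0}^{D-1} \overline{[\mathbf{F}]_{k,a}} \times [\mathbf{F}]_{k,b} \\
&= \frac{1}{D} \sum_{k=0}^{D-1} \bar{\omega}_D^{k \times a} \times \omega_D^{k \times b} = \frac{1}{D} \sum_{k=0}^{D-1} \omega_D^{k (b-a)} .
\end{aligned}
\]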

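Two situations can arise: (i) $b = a$. In this case, we get
\[
[\mathbf{P}]_{a,a} = \frac{1}{D} \sum_{k=0}^{D-1} \omega_D^{k (a-a)} = \frac{1}{D} \sum_{k=0}^{D-1} \omega_D^0 = 1 .
\]
The second case (ii) $b \neq a$ is more interesting. In this case, the exponent of $\omega_D$ is nonzero for every $k \geq 1$ and the sum lets us jump around on the complex unit circle. But, we do so with equal spacing and (almost) end up where we started. Taking the average over all points we visited produces the origin of the circle (we haven’t moved on average). And this origin is 0. Algebraically, this is a geometric sum with ratio $q = \omega_D^{b-a} \neq 1$ (because $0 < |b - a| < D$):
\[
\sum_{k=0}^{D-1} q^k = \frac{1 - q^D}{1 - q} = 0 , \quad \text{because } q^D = \left( \omega_D^D \right)^{b-a} = 1 .
\]
So,
\[
[\mathbf{P}]_{a,b} = \frac{1}{D} \sum_{k=0}^{D-1} \omega_D^{k (b-a)} = 0 .
\]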

This establishes [𝑷 ] 𝑎,𝑏 = 𝛿𝑎,𝑏 = 𝕀𝑎,𝑏 for any 𝑎, 𝑏 ∈ {0, . . . , 𝐷 − 1}. In short:
𝑭 †(𝐷 ) 𝑭 (𝐷 ) = 𝕀𝐷 ×𝐷 .
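The flatness condition is easy by comparison. Every entry of the DFT matrix obeys
\[
\left| \left[ \mathbf{F}_{(D)} \right]_{k,l} \right|^2 = \left| \frac{1}{\sqrt{D}} \, \omega_D^{k \times l} \right|^2 = \frac{1}{D} \left| \exp\left( 2\pi \mathrm{i}/D \, (k \times l) \right) \right|^2 = \frac{1}{D} ,
\]
because $\omega_D$ is a complex phase ($|\omega_D| = 1$) and all powers of complex phases are again complex phases. ■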
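Both properties can also be confirmed numerically for small dimensions, including ones that are not powers of two. The following sketch builds the matrix from Eq. (10.8) with Python's `cmath` module; the helper names and the tolerance are illustrative assumptions.

```python
import cmath
import math

def dft_matrix(d):
    """D x D matrix with entries omega_D^(k*l) / sqrt(D), following Eq. (10.8)."""
    omega = cmath.exp(2j * math.pi / d)
    return [[omega ** (k * l) / math.sqrt(d) for l in range(d)] for k in range(d)]

def gram(f):
    """P = F^dagger F, computed entry by entry."""
    d = len(f)
    return [[sum(f[k][a].conjugate() * f[k][b] for k in range(d))
             for b in range(d)] for a in range(d)]

def is_fourier_type(d, tol=1e-9):
    """Check unitarity (P = identity) and flatness (|entry|^2 = 1/D)."""
    f = dft_matrix(d)
    p = gram(f)
    unitary = all(abs(p[a][b] - (1.0 if a == b else 0.0)) < tol
                  for a in range(d) for b in range(d))
    flat = all(abs(abs(f[k][l]) ** 2 - 1.0 / d) < tol
               for k in range(d) for l in range(d))
    return unitary and flat

checks = {d: is_fourier_type(d) for d in range(1, 8)}
```

Such a check is of course no substitute for the proof, but it is a quick way to catch index or sign mistakes when implementing Eq. (10.8).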
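Note that the DFT is a Fourier-type transform that exists for every dimension $D$. This is in stark contrast to the Walsh-Hadamard (WH) transform which only exists if $D = 2^n$ is a power of two. In power-of-two dimensions, our analysis thus gives rise to two different Fourier-type transforms:
\[
\begin{aligned}
\left[ \mathbf{WH}_{(2^n)} \right]_{k,l} &= \frac{1}{\sqrt{2^n}} (-1)^{k \odot l} = \frac{1}{\sqrt{2^n}} \exp\left( \mathrm{i} 2\pi/2 \, (k \odot l) \right) , \\
\left[ \mathbf{F}_{(2^n)} \right]_{k,l} &= \frac{1}{\sqrt{2^n}} \exp\left( \mathrm{i} 2\pi/2^n \, (k \times l) \right) ,
\end{aligned}
\]
where $k \odot l = \sum_j k_j l_j$ denotes the bitwise inner product of the binary representations of $k$ and $l$, while $k \times l$ is the ordinary product.
The behaviors of these two Fourier-type transforms are very different. The WH transform jumps between only two points on the complex unit circle: $(-1)$ (‘west pole’) and $(+1)$ (‘east pole’). And it does so a total of $D/2$ times. In contrast, the DFT entries move across the complex unit circle in $D$ constant-angle steps. For $D = 4$, we obtain, for instance:
\[
\mathbf{WH}_{(4)} = \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & -1 & +1 & -1 \\ +1 & +1 & -1 & -1 \\ +1 & -1 & -1 & +1 \end{pmatrix}
\quad \text{vs.} \quad
\mathbf{F}_{(4)} = \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & +\mathrm{i} & -1 & -\mathrm{i} \\ +1 & -1 & +1 & -1 \\ +1 & -\mathrm{i} & -1 & +\mathrm{i} \end{pmatrix} .
\]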

10.3.2 Implementation as a quantum circuit


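Like with the Walsh-Hadamard transform, the $D = 2^n$-dimensional DFT can also be implemented as an $n$-qubit quantum circuit. To achieve this, we need a Hadamard gate ($\mathbf{H}$) as well as the following Z-rotation gate:
\[
\mathbf{R}(k) = \begin{pmatrix} 1 & 0 \\ 0 & \omega_{2^k} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \exp\left( 2\pi \mathrm{i}/2^k \right) \end{pmatrix} \in \mathbb{C}^{2 \times 2} .
\]
(Margin note: Z rotation $\mathbf{R}(k)$.)
Note that the angle of this phase-type rotation gate depends inversely exponentially on its input ($k \mapsto 1/2^k$). Many familiar single-qubit gates are special cases of this object:
\[
\mathbf{R}(0) = \mathbb{I}, \quad \mathbf{R}(1) = \mathbf{Z}, \quad \mathbf{R}(2) = \mathbf{S} \quad \text{and} \quad \mathbf{R}(3) = \mathbf{T} .
\]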

To construct our DFT circuit, we need a controlled version of this rotation gate that acts on two qubits:
\[
\mathbf{CR}(k) |b_0 b_1\rangle =
\begin{cases} \left( \mathbb{I} \otimes \mathbb{I} \right) |b_0 b_1\rangle & \text{if } b_1 = 0 , \\ \left( \mathbf{R}(k) \otimes \mathbb{I} \right) |b_0 b_1\rangle & \text{if } b_1 = 1 . \end{cases}
\]
Here, the second qubit is the control and the first qubit is the target. This operational definition is equivalent to the following 2-qubit truth table:
\[
\mathbf{CR}(k) = \mathbb{I}_{2 \times 2} \otimes |0\rangle\langle 0| + \mathbf{R}(k) \otimes |1\rangle\langle 1| = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \omega_{2^k} \end{pmatrix} \in \mathbb{C}^{4 \times 4}
\]
(Margin note: controlled Z rotation $\mathbf{CR}(k)$.)
and we use the following gate symbol to describe it in a larger circuit:

Let us use these building blocks to compute quantum circuit representations of the DFT matrix for $D = 2^1$, $D = 2^2$ and $D = 2^3$.

Quantum Fourier transform circuit for $D = 2$ (1 qubit)

For $D = 2^1 = 2$, the DFT matrix is simply the Hadamard gate and the 1-qubit DFT circuit consists of a single Hadamard gate.

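Quantum Fourier transform circuit for $D = 2^2 = 4$ (2 qubits)

For $D = 2^2 = 4$ (2 qubits), things get a bit more interesting already. The corresponding circuit identity is Eq. (10.9): the 2-qubit DFT block on the left equals, on the right, a Hadamard gate on the first qubit, followed by a controlled rotation $\mathbf{CR}(2)$ and a Hadamard gate on the second qubit, with swapped output qubit labels.

Let us ignore the qubit labels on the left hand side for now and compute a matrix representation of the right hand side. This circuit is comprised of three sequential layers:
\[
\begin{aligned}
\mathbf{L}_0 &= \mathbf{H} \otimes \mathbb{I} = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 \\ +1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & 0 & +1 & 0 \\ 0 & +1 & 0 & +1 \\ +1 & 0 & -1 & 0 \\ 0 & +1 & 0 & -1 \end{pmatrix} , \\
\mathbf{L}_1 &= \mathbf{CR}(2) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \omega_{2^2} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \mathrm{i} \end{pmatrix} , \\
\mathbf{L}_2 &= \mathbb{I} \otimes \mathbf{H} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 \\ +1 & -1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 & 0 & 0 \\ +1 & -1 & 0 & 0 \\ 0 & 0 & +1 & +1 \\ 0 & 0 & +1 & -1 \end{pmatrix} .
\end{aligned}
\]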
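So, the total circuit matrix becomes
\[
\begin{aligned}
\mathbf{L}_2 \times \mathbf{L}_1 \times \mathbf{L}_0
&= \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & +1 & 0 & 0 \\ +1 & -1 & 0 & 0 \\ 0 & 0 & +1 & +1 \\ 0 & 0 & +1 & -1 \end{pmatrix} \times \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \mathrm{i} \end{pmatrix} \times \frac{1}{\sqrt{2}} \begin{pmatrix} +1 & 0 & +1 & 0 \\ 0 & +1 & 0 & +1 \\ +1 & 0 & -1 & 0 \\ 0 & +1 & 0 & -1 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} +1 & +1 & 0 & 0 \\ +1 & -1 & 0 & 0 \\ 0 & 0 & +1 & +1 \\ 0 & 0 & +1 & -1 \end{pmatrix} \times \begin{pmatrix} +1 & 0 & +1 & 0 \\ 0 & +1 & 0 & +1 \\ +1 & 0 & -1 & 0 \\ 0 & +\mathrm{i} & 0 & -\mathrm{i} \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & -1 & +1 & -1 \\ +1 & +\mathrm{i} & -1 & -\mathrm{i} \\ +1 & -\mathrm{i} & -1 & +\mathrm{i} \end{pmatrix} . \tag{10.10}
\end{aligned}
\]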
This already looks a lot like the 4-dimensional DFT matrix from Eq. (10.7), but close inspection reveals that we are not quite there yet. Two rows are not at exactly the right position yet. This can be resolved by reversing/swapping the output qubits involved, i.e. $q_0 \leftrightarrow q_1$. For two qubits, such a SWAP operation has truth table
\[
\mathbf{SWAP} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{10.11}
\]
and can be realized by concatenating three CNOT gates.
(Margin note: 2-qubit SWAP.)
If we apply this qubit SWAP to the original DFT matrix $\mathbf{DFT}_{(4)} = \mathbf{F}_{(4)}$ from Eq. (10.7), we obtain
\[
\mathbf{SWAP} \times \mathbf{DFT}_{(4)}
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \times \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & +\mathrm{i} & -1 & -\mathrm{i} \\ +1 & -1 & +1 & -1 \\ +1 & -\mathrm{i} & -1 & +\mathrm{i} \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ +1 & -1 & +1 & -1 \\ +1 & +\mathrm{i} & -1 & -\mathrm{i} \\ +1 & -\mathrm{i} & -1 & +\mathrm{i} \end{pmatrix} .
\]
This matrix is now exactly equal to the circuit matrix in Eq. (10.10). This also explains our notational convention in Eq. (10.9): the 2-qubit circuit on the right hand side reproduces the 4-dimensional DFT matrix $\mathbf{DFT}_{(4)}$ up to a relabelling of the output qubits.
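The matrix algebra of this subsection is small enough to replay in a few lines. The sketch below multiplies the three layers, applies the SWAP from Eq. (10.11) and compares the result with the entry formula of Eq. (10.8) for $D = 4$. The helper `matmul` and all variable names are illustrative choices; since SWAP is its own inverse, applying it to the circuit matrix should return the DFT matrix.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = 1 / math.sqrt(2)
L0 = [[s, 0, s, 0], [0, s, 0, s], [s, 0, -s, 0], [0, s, 0, -s]]   # H (x) I
L1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1j]]     # CR(2)
L2 = [[s, s, 0, 0], [s, -s, 0, 0], [0, 0, s, s], [0, 0, s, -s]]   # I (x) H
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

circuit = matmul(L2, matmul(L1, L0))                # right-hand side of Eq. (10.10)
F4 = [[(1j) ** (k * l) / 2 for l in range(4)] for k in range(4)]  # Eq. (10.7)
swapped = matmul(SWAP, circuit)                     # undo the qubit reversal
max_dev = max(abs(swapped[k][l] - F4[k][l]) for k in range(4) for l in range(4))
```

A deviation `max_dev` at the level of floating-point rounding indicates that the circuit and the DFT matrix agree up to the relabelling of the two output qubits.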

Quantum Fourier transform circuit for $D = 2^3 = 8$ (3 qubits)

The circuit construction and insights gathered so far for $D = 2$ and $D = 2^2 = 4$ do extend to $D = 2^3 = 8$ (3-qubit DFT circuit).

We leave a verification of this equivalence relation as an instructive exercise in the study of 3-qubit circuits and instead point out some familiar patterns and recursive structure that begins to emerge:

• Reversed order of output qubits: the purple high-level circuit block reverses the ordering of the qubits involved. There are two ways to deal with this: (i) hardware solution: we can use SWAP gates introduced in Eq. (10.11) to physically re-arrange qubit wires in a way that reproduces the original qubit ordering. (ii) software solution: remember that the qubit ordering has changed and adjust the action of future gates and the interpretation of readout bits accordingly. E.g. the readout of the first qubit wire now produces $o_{n-1}$ instead of $o_0$.
• Recursive block structure: the DFT implementation in terms of quantum gates features $\log_2(D) = 3$ distinctive blocks. Each of them starts with a Hadamard gate and follows it up with controlled rotations on subsequent qubits. These controlled rotations start with $\mathbf{R}(2)$ (angle $2\pi/2^2$) for the first neighboring qubit and proceed to $\mathbf{R}(3)$ (angle $2\pi/2^3$) for the second neighboring qubit.

Figure 10.2 Quantum circuit realization of the discrete Fourier transform: this $n$-qubit circuit involves $n$ Hadamard gates, as well as $n(n-1)/2$ controlled Z rotations. Note furthermore that the ordering of the outcome qubits becomes reversed in the process.

Exercise 10.9 Verify the correctness of this 3-qubit realization of the DFT in
𝐷 = 8, e.g. by computing Kronecker and matrix products on paper or by using
your own classical simulation tool.

Quantum Fourier transform circuit for $D = 2^n$ ($n$ qubits)


Our study of the DFT circuit for $D = 2, 4, 8$ starts to reveal a recursive structure. Its generalization to $D = 2^n$ dimensions/amplitudes is illustrated in Fig. 10.2. This circuit involves $n$ qubits and the order of these qubits is reversed at the end of the circuit: if we start with qubit labels $q_0, \ldots, q_{n-1}$, we end up with qubit labels $q_{n-1}, \ldots, q_0$. It is possible to use a collection of (at most) $n/2$ SWAP gates to reverse this order in hardware. Alternatively, we can also remember this new ordering in software and adjust subsequent gates and readout accordingly without extra cost.
The actual circuit involves $n$ Hadamard gates ($\boldsymbol{H}$), one for each qubit, as well as $(n-1-k)$ controlled Z rotations ($\boldsymbol{R}^{(l)}$) for qubits $k = 0, \ldots, (n-1)$.
This produces a total of

$$
s(n) = \sum_{k=0}^{n-1} (n-1-k) = \sum_{k=0}^{n-1} n - \sum_{k=0}^{n-1} (k+1) = n^2 - \sum_{k=1}^{n} k = n^2 - \frac{1}{2} n(n+1) = \frac{1}{2} n(n-1)
$$

controlled $Z$ rotations by angles ranging from $1/2^2 = 1/4$ (for $\boldsymbol{R}^{(2)}$) to $1/2^n$ (for $\boldsymbol{R}^{(n)}$). If we also count the $n$ Hadamard gates, the total gate count becomes

$$
\mathrm{size}\left(\boldsymbol{F}_{(D)}\right) = n + s(n) = n + \frac{1}{2} n(n-1) = \frac{1}{2} n(n+1) = O\left(n^2\right)
$$

in this high-level framework. This is a quadratic scaling in the number of qubits $n$. A proof that this circuit really reproduces the $D = 2^n$-dimensional DFT can be done by decomposing its functionality recursively. This, however, would go beyond the scope of this introductory lecture and we merely state the main result.

Theorem 10.10 (Quantum Fourier Transform (QFT) circuit). In dimensions $D = 2^n$, the DFT matrix defined in Def. 10.7 can be realized by the high-level $n$-qubit circuit depicted in Fig. 10.2. This circuit involves $n$ Hadamard gates, as well as $n(n-1)/2 = O(n^2)$ controlled Z rotations $\boldsymbol{R}^{(l)}$ by angles $1/2^l$ with $l = 2, \ldots, n$. Importantly, this circuit also reverses the order of the qubits involved. This circuit is called a Quantum Fourier Transform (QFT).
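
The theorem can be probed numerically by applying the circuit gate by gate to a state vector and comparing against the DFT matrix. The sketch below is one possible simulation, not the construction used in these notes: the function names are hypothetical, qubit $t$ is assumed to be bit $t$ of the basis-state index, the DFT uses $\omega_D = \exp(2\pi\mathrm{i}/D)$ with prefactor $1/\sqrt{D}$, and the final qubit reversal is handled in software by re-indexing.

```python
import cmath
import math

def apply_hadamard(state, t):
    """Apply a Hadamard gate to qubit t (bit t of the basis-state index)."""
    s = 1 / math.sqrt(2)
    bit = 1 << t
    out = list(state)
    for i in range(len(state)):
        if not (i & bit):
            a, b = state[i], state[i | bit]
            out[i], out[i | bit] = s * (a + b), s * (a - b)
    return out

def apply_controlled_phase(state, c, t, l):
    """Controlled Z rotation R^(l): phase exp(2*pi*i / 2**l) if bits c and t are both 1."""
    mask = (1 << c) | (1 << t)
    phase = cmath.exp(2j * math.pi / 2 ** l)
    return [amp * phase if (i & mask) == mask else amp
            for i, amp in enumerate(state)]

def reverse_bits(i, n):
    """Reverse the n-bit binary representation of i."""
    r = 0
    for _ in range(n):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def qft_circuit(x, n):
    """Run the n-qubit QFT circuit on amplitudes x; return (output, gate count)."""
    state, gates = list(x), 0
    for t in reversed(range(n)):          # one block per qubit
        state = apply_hadamard(state, t)
        gates += 1
        for c in reversed(range(t)):      # controlled rotations on later qubits
            state = apply_controlled_phase(state, c, t, t - c + 1)
            gates += 1
    out = [0j] * len(state)
    for i, amp in enumerate(state):       # software fix for the reversed qubit order
        out[reverse_bits(i, n)] = amp
    return out, gates

def dft(x):
    """Naive unitary DFT with D**2 multiplications, used as a reference."""
    D = len(x)
    return [sum(cmath.exp(2j * math.pi * j * k / D) * x[k] for k in range(D))
            / math.sqrt(D) for j in range(D)]

n = 3
x = [complex(k + 1, -k) for k in range(2 ** n)]
y, gates = qft_circuit(x, n)
deviation = max(abs(a - b) for a, b in zip(y, dft(x)))
print(gates, deviation)
```

Each gate touches every amplitude at most once, so this simulation spends on the order of $n(n+1)/2 \times 2^n$ operations instead of the $4^n$ operations of the naive reference.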

We conclude this section by raising and addressing an important issue: the gates depicted in Fig. 10.2 are not yet elementary gates. If we want to write the DFT circuit solely in terms of our elementary gate set $\{\boldsymbol{H}, \boldsymbol{T}, \boldsymbol{CNOT}\}$, we need to approximate all controlled Z rotations by sequences of elementary gates. It can be shown that the correct working of the overall circuit remains stable if we replace the exact controlled rotations by accurate approximations comprised of elementary gates: small approximation errors in each controlled rotation only produce a small approximation error of the $2^n \times 2^n$ unitary matrix $\boldsymbol{F}_{(D)}$. However, the approximation accuracy has to be very high. The rotation matrix $\boldsymbol{R}^{(n)}$, in particular, contains an exponentially small complex phase $\omega_{2^n} = \exp(2\pi\mathrm{i}/2^n)$ that needs to be approximated. Fortunately, the Solovay-Kitaev theorem (our main tool to approximate arbitrary quantum circuits with elementary gates) scales only poly-logarithmically in the desired accuracy: $O(\log^c(1/\epsilon^2))$ with $c \in [1, 3 + o(1)]$ elementary gates are enough to approximate every 2-qubit gate up to accuracy $\epsilon$, see Lecture 4. If we set $\epsilon = \tilde{\epsilon}/2^n$ (approximation up to exponential accuracy), Solovay-Kitaev gives us an overhead of at most

$$
\text{compilation-overhead} = O\left(\log^c(1/\epsilon)\right) = O\left(\log^c(2^n/\tilde{\epsilon})\right) = O\left(n^c \log^c(1/\tilde{\epsilon})\right)
$$

to decompose every controlled Z rotation into elementary gates. In other words: this extra overhead remains polynomial in $n$, even if we insist on extremely accurate approximations. The details of this extra overhead are, however, very dependent on the elementary gate set used (which affects the exponent $c$ in Solovay-Kitaev) and the desired accuracy of the overall circuit.
Corollary 10.11 (approximating the DFT matrix with elementary quantum gates). It is possible to (very) accurately approximate the high-level QFT circuit from Fig. 10.2 with $\mathrm{poly}(n)$ elementary quantum gates, e.g. $\boldsymbol{H}, \boldsymbol{T}, \boldsymbol{CNOT}$.

10.3.3 Fast discrete Fourier Transform


We have just constructed an $n$-qubit quantum circuit that implements the functionality of a $2^n$-dimensional Fourier transform. What is more, our high-level circuit requires only a quadratic number of single- and two-qubit gates. More precisely,

$$
\mathrm{size}\left(\boldsymbol{F}_{2^n}\right) = \frac{1}{2} n(n+1).
$$

The following important result is now an immediate consequence of our ability to classically simulate size-$s$ circuits on $n$ qubits with (order) $2^n \times s$ elementary operations.

Theorem 10.12 (Fast discrete Fourier Transform (FFT)). For $D = 2^n$, the Fourier transform admits a fast matrix-vector multiplication. For any $\boldsymbol{x} \in \mathbb{C}^D$, we can compute $\boldsymbol{F}_D \times \boldsymbol{x}$ with only (order) $\log_2^2(D) \times D \ll D^2$ (classical) operations.

Note that this is much better than the cost of (order) $D^2$ operations associated with naive matrix-vector multiplication in $D$ dimensions. Additional improvements allow us to implement the Fast Fourier Transform at (order) $\log_2(D) \times D$ cost only. The proof is very similar to our derivation of the fast classical Walsh-Hadamard transform: use sparse matrix-vector multiplication to compute the DFT as a sequence of $n(n+1)/2$ sparse (and often even diagonal) matrix-vector multiplications in $D = 2^n$ dimensions. We leave it as an instructive exercise.
Exercise 10.13 (Proof of Theorem 10.12). Use the $n$-qubit simulation framework from Lecture 8 and the concrete QFT circuit from Figure 10.2 to compute $\boldsymbol{F}_{(D)} \times \boldsymbol{x}$ with only order $n^2 \times 2^n = \log_2(D)^2 \times D$ elementary operations.
We emphasize that the fast Fourier transform, also known as FFT, has been
a cornerstone in many branches of computational science. Applications range
from signal processing to image analysis. Today, we managed to derive it as
a by-product of our quest to find quantum circuit realizations of the Fourier
transform. Sometimes, merely thinking in a quantum fashion can provide us
with insights that are valuable outside the strict scope of quantum computing
itself!

10.4 Synopsis
The formal description of 𝑛 -qubit quantum circuits is intimately connected to
matrix-vector multiplications in 𝐷 = 2𝑛 dimensions. We have developed this
correspondence in Lecture 8 to use matrix-vector multiplications in order to
describe general quantum circuits. Today, we started to venture in the opposite
direction: we identified two very prominent 𝐷 × 𝐷 matrices – the Walsh-
Hadamard transform and the discrete Fourier transform – and strove to realize
them as 𝑛 -qubit quantum circuits comprised of only poly (𝑛) = polylog (𝐷)
elementary quantum gates. This can be an exponential compression in both ‘memory’ and ‘runtime’ for executing Fourier-type transforms:
(i) ‘memory’ (aka number of qubits): $n$ qubits are (in principle) enough to store a general $D = 2^n$-dimensional vector in its amplitudes.
(ii) ‘runtime’ (aka circuit size): $\mathrm{poly}(n)$ gates are enough to implement the WH transform and the DFT on the quantum level.

Such Fourier-type transforms are a powerful tool when designing high-level


algorithms. And quantum architectures can execute them exponentially cheaper
than classical computers! Many seminal quantum algorithms build on this
observation. Chief among them is Shor’s algorithm for integer factorization
which we will discuss in a future lecture.
11. Quantum Phase Estimation

Date: 11 December 2024

Agenda:
1 Background: eigenvalues and eigenvectors
2 Quantum Phase Estimation (QPE) circuits
3 Analysis

Today, we discuss one of the most important quantum algorithmic primitives: quantum phase estimation. At face level, this quantum circuit addresses a rather abstract (and seemingly simple) problem from matrix analysis: learn something about eigenvalues of a unitary matrix given access to the associated eigenvector. But the quantum circuit framework puts an interesting twist on this problem and offers a new, genuinely quantum approach. Known as quantum phase estimation, this circuit family uses the (inverse) quantum Fourier transform from last lecture as an important building block.
Quantum phase estimation plays a prominent role as a subroutine in
several famous quantum algorithms. Examples are quantum approximate
counting (which we will briefly discuss today) and Shor’s algorithm for integer
factorization (which we will discuss next week).

11.1 Background: eigenvalue decomposition of normal matrices


A matrix can be viewed as an object that eats a vector and spits out another
vector. Some vectors, however, are special. They enter and leave the matrix
almost unchanged, except for a linear scaling factor. Such vectors are called
eigenvectors of the matrix in question. The associated scaling factor is called
an eigenvalue. Remarkably, certain classes of matrices can be completely
characterized by this effect.

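Theorem 11.1 (Spectral theorem for normal matrices). Let $\boldsymbol{B} \in \mathbb{C}^{D \times D}$ be a normal matrix, i.e. $\boldsymbol{B}^\dagger \boldsymbol{B} = \boldsymbol{B} \boldsymbol{B}^\dagger$. Then, we can decompose $\boldsymbol{B}$ as

$$
\boldsymbol{B} = \boldsymbol{U} \boldsymbol{D} \boldsymbol{U}^\dagger \quad \text{(eigenvalue decomposition)} \tag{11.1}
$$

such that: (i) $\boldsymbol{D} = \mathrm{diag}(\lambda_1, \ldots, \lambda_D) \in \mathbb{C}^{D \times D}$ is a diagonal matrix (collecting eigenvalues $\lambda_k \in \mathbb{C}$ as diagonal entries); (ii) $\boldsymbol{U} = (\boldsymbol{u}_1 \cdots \boldsymbol{u}_D) \in \mathbb{C}^{D \times D}$ is a unitary matrix (subsuming orthonormal eigenvectors $\boldsymbol{u}_k \in \mathbb{C}^D$ as columns).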

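Note that the eigenvalue decomposition from Eq. (11.1) is equivalent to writing

$$
\boldsymbol{B} = \sum_{k=1}^{D} \lambda_k \boldsymbol{u}_k \boldsymbol{u}_k^\dagger,
$$

where $\lambda_k \in \mathbb{C}$ denotes the eigenvalues and $\boldsymbol{u}_k \in \mathbb{C}^D$ the associated eigenvectors. Because $\boldsymbol{U} = (\boldsymbol{u}_1 \cdots \boldsymbol{u}_D)$ is unitary, these eigenvectors are all normalized and orthogonal to each other (i.e. $\boldsymbol{u}_k^\dagger \boldsymbol{u}_l = \delta_{k,l}$). We call an eigenvalue decomposition with this feature an orthonormal eigenvalue decomposition. This in turn ensures

$$
\boldsymbol{B} \boldsymbol{u}_l = \sum_{k=1}^{D} \lambda_k \boldsymbol{u}_k \boldsymbol{u}_k^\dagger \boldsymbol{u}_l = \sum_{k=1}^{D} \lambda_k \delta_{k,l} \boldsymbol{u}_k = \lambda_l \boldsymbol{u}_l \quad \text{for } 1 \leq l \leq D.
$$

This is the defining property of an eigenvalue/eigenvector pair: when multiplying the matrix $\boldsymbol{B}$ with an eigenvector $\boldsymbol{u}_l \in \mathbb{C}^D$, the eigenvector gets scaled by a number $\lambda_l \in \mathbb{C}$ (the eigenvalue), but is otherwise left unchanged.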
Normal matrices subsume two important categories of matrices that feature
prominently in quantum computing: Hermitian matrices (where 𝑨 † = 𝑨
ensures 𝑨 † 𝑨 = 𝑨 2 = 𝑨𝑨 † ) and unitary matrices (where 𝑼 †𝑼 = 𝕀 = 𝑼𝑼 † ).
Corollary 11.2 (Spectral theorem for Hermitian matrices). Let $\boldsymbol{A} \in \mathbb{C}^{D \times D}$ be a Hermitian matrix, i.e. $\boldsymbol{A}^\dagger = \boldsymbol{A}$. Then, this matrix admits an orthonormal eigenvalue decomposition $\boldsymbol{A} = \sum_{k=1}^{D} \lambda_k \boldsymbol{u}_k \boldsymbol{u}_k^\dagger$ where each eigenvalue is real-valued, i.e. $\lambda_k \in \mathbb{R}$ for all $1 \leq k \leq D$.
The proof is an immediate consequence of Theorem 11.1 and the defining
property of a Hermitian matrix. We leave it as an instructive exercise and move
on to a specification of the spectral theorem for unitary matrices. This is again
an immediate consequence of Theorem 11.1 and the defining property of a
unitary matrix:
Corollary 11.3 (Spectral theorem for unitary matrices). Let $\boldsymbol{V} \in \mathbb{C}^{D \times D}$ be a unitary matrix, i.e. $\boldsymbol{V}^\dagger \boldsymbol{V} = \boldsymbol{V} \boldsymbol{V}^\dagger = \mathbb{I}$. Then, this matrix admits an orthonormal eigenvalue decomposition $\boldsymbol{V} = \sum_{k=1}^{D} \lambda_k \boldsymbol{u}_k \boldsymbol{u}_k^\dagger$, where each eigenvalue is a complex phase, i.e. $\lambda_k = \exp(2\pi\mathrm{i}\theta_k)$ with $\theta_k \in [0, 1)$ for all $1 \leq k \leq D$.
Exercise 11.4 Derive Corollary 11.2 and Corollary 11.3 from Theorem 11.1 by
utilizing the defining properties of Hermitian and unitary matrices.
We conclude this section with a 2-dimensional example.

Figure 11.1 Setup in quantum phase estimation: we are given a classical description of an $n$-qubit circuit $\boldsymbol{U}$ and an $n$-qubit quantum state $|\psi\rangle$ such that $\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle$ with $\theta \in [0, 1)$. Our task is to approximate the phase $\theta \in [0, 1)$ up to $m$ bits of accuracy.

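Example 11.5 (eigenvectors and eigenvalues of a 2D rotation matrix). Consider a counter-clockwise rotation matrix with angle $2\pi\theta$, where $\theta \in [0, 1)$:

$$
\boldsymbol{R}(\theta) = \begin{pmatrix} \cos(2\pi\theta) & -\sin(2\pi\theta) \\ \sin(2\pi\theta) & \cos(2\pi\theta) \end{pmatrix} \in \mathbb{R}^{2 \times 2}.
$$

This matrix is a unitary matrix and has two eigenvector/eigenvalue pairs. In bra-ket notation ($|0\rangle = \boldsymbol{e}_0 = (1\ 0)^T$ and $|1\rangle = \boldsymbol{e}_1 = (0\ 1)^T$), we obtain:

$$
\begin{aligned}
|\psi_+\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle - \mathrm{i}|1\rangle \right) \quad \text{obeys} \quad \boldsymbol{R}(\theta)|\psi_+\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi_+\rangle, \\
|\psi_-\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle + \mathrm{i}|1\rangle \right) \quad \text{obeys} \quad \boldsymbol{R}(\theta)|\psi_-\rangle = \mathrm{e}^{2\pi\mathrm{i}(1-\theta)}|\psi_-\rangle.
\end{aligned}
$$

What is more, these eigenvectors obey the following interesting relationship:

$$
\frac{\mathrm{e}^{2\pi\mathrm{i}(\theta/2)}}{\sqrt{2}} |\psi_+\rangle + \frac{\mathrm{e}^{2\pi\mathrm{i}(1-\theta/2)}}{\sqrt{2}} |\psi_-\rangle = \cos(\pi\theta)|0\rangle + \sin(\pi\theta)|1\rangle.
$$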
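
The statements of Example 11.5 are easy to confirm numerically for one fixed phase. The following is a minimal sketch in plain Python; the helper names (`rotation`, `apply`, `close`) and the test value $\theta = 0.3$ are assumptions chosen for this illustration.

```python
import cmath
import math

def rotation(theta):
    """Counter-clockwise rotation matrix with angle 2*pi*theta."""
    c, s = math.cos(2 * math.pi * theta), math.sin(2 * math.pi * theta)
    return [[c, -s], [s, c]]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def close(u, v, tol=1e-12):
    """Entry-wise comparison of two vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

theta = 0.3  # arbitrary test phase (assumption for this example)
R = rotation(theta)
s = 1 / math.sqrt(2)
psi_plus = [s, -1j * s]
psi_minus = [s, 1j * s]
lam_plus = cmath.exp(2j * math.pi * theta)
lam_minus = cmath.exp(2j * math.pi * (1 - theta))

# Eigenvalue equations R|psi> = lambda |psi>
ok_plus = close(apply(R, psi_plus), [lam_plus * a for a in psi_plus])
ok_minus = close(apply(R, psi_minus), [lam_minus * a for a in psi_minus])

# Weighted superposition of both eigenvectors gives a real vector
w_plus = cmath.exp(2j * math.pi * (theta / 2)) * s
w_minus = cmath.exp(2j * math.pi * (1 - theta / 2)) * s
combo = [w_plus * p + w_minus * m for p, m in zip(psi_plus, psi_minus)]
ok_combo = close(combo, [math.cos(math.pi * theta), math.sin(math.pi * theta)])
print(ok_plus, ok_minus, ok_combo)
```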

11.2 Quantum Phase estimation circuits


Eigenvalue decompositions are a powerful tool to think about and work with (large) matrices. Unitary matrices are an interesting special case that feature prominently in quantum computing, where the dimension $D = 2^n$ grows exponentially with the number of qubits $n$ involved. The spectral theorem for unitary matrices (Corollary 11.3) states that every $2^n \times 2^n$ unitary matrix $\boldsymbol{U}$ (think: $n$-qubit circuit) must have (at least) one normalized, $2^n$-dimensional vector $|\psi\rangle \in \mathbb{C}^{2^n}$ (think: $n$-qubit state vector) that obeys the quantum phase estimation promise

$$
\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle \quad \text{with phase } \theta \in [0, 1).
$$

In other words: the unitary $\boldsymbol{U}$ is described by an $n$-qubit circuit, the $n$-qubit state $|\psi\rangle$ encodes an eigenvector of the circuit matrix $\boldsymbol{U}$ and the eigenvalue is a complex phase parametrized by $\theta \in [0, 1)$. This endows a matrix analysis question with a uniquely quantum flavor that is summarized in Fig. 11.1. The main topic of today’s lecture is a genuine quantum solution to this problem.

Figure 11.2 Quantum phase estimation circuit (QPE): this circuit works on $(m + n)$ qubits and produces an $m$-bit approximation $o_0 \cdots o_{m-1}$ of the phase $\theta \in [0, 1)$ by reading out the first $m$ qubits. The circuit itself requires access to the $n$-qubit state $|\psi\rangle$ (red), executes $m$ controlled applications of powers of the $n$-qubit circuit $\boldsymbol{U}$ (blue boxes) and also features a $2^m$-dimensional inverse quantum Fourier transform (purple). Note that, in general, the size of this circuit increases exponentially in the phase resolution $m$: $\boldsymbol{U}^{2^{m-1}}$ corresponds to sequentially applying the circuit $\boldsymbol{U}$ a total of $2^{m-1}$ times.

Theorem 11.6 (Quantum Phase Estimation (QPE) circuits; informal). Suppose that we have access to an $n$-qubit circuit $\boldsymbol{U}$ with size $\mathrm{size}(\boldsymbol{U})$ and an $n$-qubit state $|\psi\rangle$ such that $\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle$ with $\theta \in [0, 1)$. Then, for any $m = 1, 2, 3, \ldots$, the Quantum Phase Estimation (QPE) circuit displayed in Fig. 11.2 finds a ‘good’ $m$-bit approximation of the phase $\theta \in [0, 1)$ with high probability. This circuit operates on $(m + n)$ qubits and, without further assumptions on $\boldsymbol{U}$, requires a total of $O((2^m - 1)\, \mathrm{size}(\boldsymbol{U}))$ elementary quantum gates.

Note that the size of this circuit scales very poorly in the number of approximation bits $m$. This scaling becomes a bit less daunting if we remember that an $m$-bit approximation determines $\theta$ up to accuracy $\epsilon = 2^{-m}$ (equivalently: it determines $2^m\theta$ up to constant accuracy). Hence, we need a total of $2^m = O(1/\epsilon)$ controlled applications of $\boldsymbol{U}$ and only $m = O(\log(1/\epsilon))$ additional readout qubits.

Figure 11.3 Visualization of the phase kickback effect: under the assumption that
𝑼 |𝜓 ⟩ = e2𝜋 i𝜃 |𝜓 ⟩ , we can bring one control qubit into uniform superposition,
apply a conditional 𝑼 -circuit to the 𝑛 -qubit state |𝜓 ⟩ in order to kick back the
phase e2𝜋 i𝜃 into the amplitudes of the superposition within the control qubit.

The next question is: what does it mean to find a ‘good’ $m$-bit approximation of the unknown phase $\theta \in [0, 1)$? Our analysis will provide the following rigorous and deterministic statement for the special case that $\theta$ can be represented exactly by $m$ bits.
Proposition 11.7 (Quantum Phase Estimation (QPE) circuits, rigorous special case). Instantiate the assumptions from Theorem 11.6 and suppose furthermore that $2^m\theta$ is an integer with bit representation $\llcorner 2^m\theta \lrcorner = t_{m-1} \cdots t_0 \in \{0, 1\}^m$. Then, the Quantum Phase Estimation circuit displayed in Fig. 11.2 is guaranteed to produce $o_{m-1} = t_{m-1}, \ldots, o_0 = t_0$ (i.e. perfect and deterministic recovery).
A ‘good’ approximation generalizes this behavior to phases that do not have an exact $m$-bit representation. In this case, it is possible to show that the readout bits $o_{m-1}, \ldots, o_0 \in \{0, 1\}$ yield one of the best $m$-bit approximations to $\theta$ with probability at least 40%. In contrast, all other bit strings (aka bad approximations) only occur with probability at most 25%. Hence, we can run QPE multiple times and take a majority vote to quickly isolate one of the best $m$-bit approximations to $\theta$. Although not really difficult per se, this rigorous justification of a ‘good’ approximation would go beyond the scope of this introductory course. We will instead content ourselves with rigorously deriving Proposition 11.7 and sketching the way to generalize it for the special cases of $m = 1$ and $m = 2$.

11.3 Analysis
11.3.1 Phase kickback effect
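At the heart of quantum phase estimation is an interesting observation about the action of controlled circuit applications on $n$-qubit states that correspond to an eigenvector of the unitary circuit $\boldsymbol{U}$. This effect is called phase kickback and we refer to Fig. 11.3 for a visualization.

Lemma 11.8 (Phase kickback effect). Fix an $n$-qubit circuit $\boldsymbol{U}$ and let $|\psi\rangle$ be an $n$-qubit state such that $\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle$ with $\theta \in [0, 1)$. Then,

$$
\boldsymbol{CU} \left( \boldsymbol{H} \otimes \mathbb{I}^{\otimes n} \right) |0\rangle \otimes |\psi\rangle = |r(\theta)\rangle \otimes |\psi\rangle \quad \text{with} \quad |r(\theta)\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + \mathrm{e}^{2\pi\mathrm{i}\theta}|1\rangle \right).
$$

Here, $\boldsymbol{CU} = |0\rangle\langle 0| \otimes \mathbb{I}^{\otimes n} + |1\rangle\langle 1| \otimes \boldsymbol{U}$ is the controlled version of $\boldsymbol{U}$.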

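Proof. By direct computation. The circuit displayed in Fig. 11.3 has three sequential layers. We use $(n+1)$-qubit state vectors $|\varphi_0\rangle, |\varphi_1\rangle, |\varphi_2\rangle$ to track and reformulate the underlying quantum logic throughout the circuit evolution. We start with $|\varphi_0\rangle = |0\rangle \otimes |\psi\rangle$ and then move on to bring the first qubit into uniform superposition:

$$
|\varphi_1\rangle = \left( \boldsymbol{H} \otimes \mathbb{I}^{\otimes n} \right) |\varphi_0\rangle = (\boldsymbol{H}|0\rangle) \otimes \left( \mathbb{I}^{\otimes n}|\psi\rangle \right) = |+\rangle \otimes |\psi\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle \otimes |\psi\rangle + |1\rangle \otimes |\psi\rangle \right).
$$

Now, we apply the circuit $\boldsymbol{U}$ in a controlled fashion: if the first qubit is in $|0\rangle$, we do nothing. Else if the first qubit is in $|1\rangle$, we apply $\boldsymbol{U}$. This produces the final quantum state vector

$$
|\varphi_2\rangle = \boldsymbol{CU}|\varphi_1\rangle = \frac{1}{\sqrt{2}} \left( \boldsymbol{CU}\, |0\rangle \otimes |\psi\rangle + \boldsymbol{CU}\, |1\rangle \otimes |\psi\rangle \right) = \frac{1}{\sqrt{2}} \left( |0\rangle \otimes |\psi\rangle + |1\rangle \otimes (\boldsymbol{U}|\psi\rangle) \right).
$$

The additional assumption $\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle$ considerably simplifies this superposition and yields the desired result:

$$
\begin{aligned}
|\varphi_2\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle \otimes |\psi\rangle + |1\rangle \otimes (\boldsymbol{U}|\psi\rangle) \right) \\
&= \frac{1}{\sqrt{2}} \left( |0\rangle \otimes |\psi\rangle + |1\rangle \otimes \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle \right) \\
&= \frac{1}{\sqrt{2}} \left( |0\rangle + \mathrm{e}^{2\pi\mathrm{i}\theta}|1\rangle \right) \otimes |\psi\rangle = |r(\theta)\rangle \otimes |\psi\rangle. \qquad \blacksquare
\end{aligned}
$$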
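
Phase kickback can be observed directly on a small instance. The sketch below assumes $n = 1$, uses the rotation matrix $\boldsymbol{R}(\theta)$ from Example 11.5 as the circuit $\boldsymbol{U}$ together with its eigenvector $|\psi_+\rangle$, and orders amplitudes as control qubit first; the function names and the test phase are hypothetical choices for illustration.

```python
import cmath
import math

def rotation(theta):
    """Rotation matrix R(theta) from Example 11.5."""
    c, s = math.cos(2 * math.pi * theta), math.sin(2 * math.pi * theta)
    return [[c, -s], [s, c]]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def kron(u, v):
    """Tensor product of two state vectors (first factor is most significant)."""
    return [a * b for a in u for b in v]

def kickback_state(U, psi):
    """State CU (H x I) |0>|psi>, written as [control=0 branch, control=1 branch]."""
    s = 1 / math.sqrt(2)
    branch0 = [s * a for a in psi]            # control in |0>: do nothing
    branch1 = [s * a for a in apply(U, psi)]  # control in |1>: apply U
    return branch0 + branch1

theta = 0.3  # arbitrary test phase (assumption for this example)
psi = [1 / math.sqrt(2), -1j / math.sqrt(2)]  # eigenvector |psi_+> of R(theta)
phi2 = kickback_state(rotation(theta), psi)

# Prediction of Lemma 11.8: |r(theta)> tensor |psi>
r = [1 / math.sqrt(2), cmath.exp(2j * math.pi * theta) / math.sqrt(2)]
max_dev = max(abs(a - b) for a, b in zip(phi2, kron(r, psi)))
print(max_dev)
```

The target register ends up unchanged, while the eigenvalue phase appears only in the relative phase of the control qubit.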

11.3.2 Quantum Phase estimation for 𝑚 = 1 readout bit


The phase kickback effect is, in principle, enough to understand and appreciate
what Quantum Phase Estimation circuits do and why they work. However, the
precise role of the inverse Quantum Fourier transform can be a bit daunting
at first sight. To mitigate this risk, let us first look at the simplest instance of
quantum phase estimation, where we only have 𝑚 = 1 readout bit to gain
guiding intuition.

Figure 11.4 Visualization of QPE with 𝑚 = 1 readout qubit: the (inverse) Quantum
Fourier transform in 𝐷 = 2 dimensions is simply the Hadamard gate. This
produces a simple special case where 𝜃 = 0 ensures 𝑜 0 = 0 and 𝜃 = 1/2
ensures 𝑜 0 = 1. In other words: 𝑜 0 = 2𝜃 whenever 2𝜃 is an integer.

Lemma 11.9 ($m = 1$ readout bit QPE, promise problem). Suppose that either $\theta = 0$ or $\theta = 1/2$, i.e. $2\theta = 2^1\theta$ is an integer. Then, the $m = 1$-qubit QPE circuit illustrated in Fig. 11.4 produces a deterministic readout bit

$$
o_0 = \begin{cases} 0 & \text{if } \theta = 0, \\ 1 & \text{else if } \theta = 1/2. \end{cases}
$$

Note that this outcome bit encodes the exact bit representation of $2\theta = 2^1\theta \in \{0, 1\}$.

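Proof. We can use the phase kickback effect in Lemma 11.8 to compute the $(n+1)$-qubit state $|\varphi_2\rangle$ directly after the controlled application of $\boldsymbol{U}$:

$$
|\varphi_2\rangle = \boldsymbol{CU} \left( \boldsymbol{H} \otimes \mathbb{I}^{\otimes n} \right) |0\rangle \otimes |\psi\rangle = |r(\theta)\rangle \otimes |\psi\rangle.
$$

From now on, only the single-qubit state $|r(\theta)\rangle$ matters. The promise $\theta = 0$ or $\theta = 1/2$ manifests itself in two distinct single-qubit states we need to distinguish:

$$
\begin{aligned}
|r(0)\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle + \mathrm{e}^{2\pi\mathrm{i}0}|1\rangle \right) = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) = |+\rangle, \\
|r(1/2)\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle + \mathrm{e}^{2\pi\mathrm{i}(1/2)}|1\rangle \right) = \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) = |-\rangle.
\end{aligned}
$$

So, the task boils down to distinguishing the uniform superposition $|+\rangle$ from the uniform superposition $|-\rangle$. This can be achieved with 100% success chance by applying a Hadamard gate and performing a readout in the computational basis: $|+\rangle \mapsto \boldsymbol{H}|+\rangle = |0\rangle$ vs. $|-\rangle \mapsto \boldsymbol{H}|-\rangle = |1\rangle$. ■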

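Note that we can achieve a much stronger result than Lemma 11.9 (which relies on a promise) by keeping $\theta$ as a variable in the analysis and computing the action of the Hadamard gate (inverse Fourier transform for $D = 2$) on $|r(\theta)\rangle$ directly. To make this task as simple as possible, we first suggestively rewrite the single-qubit phase state as

$$
\begin{aligned}
|r(\theta)\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle + \mathrm{e}^{2\pi\mathrm{i}\theta}|1\rangle \right) = \frac{1}{\sqrt{2}} \left( \mathrm{e}^{(2\pi\mathrm{i}/2)(2\theta) \times 0}|0\rangle + \mathrm{e}^{(2\pi\mathrm{i}/2)(2\theta) \times 1}|1\rangle \right) \\
&= \frac{1}{\sqrt{2}} \sum_{a_0=0}^{1} \mathrm{e}^{(2\pi\mathrm{i}/2)(2\theta) a_0}|a_0\rangle = \frac{1}{\sqrt{2}} \sum_{a_0=0}^{1} \omega_2^{(2\theta) \times a_0}|a_0\rangle,
\end{aligned}
$$

where $\omega_2 = \exp(2\pi\mathrm{i}/2) = (-1)$ is a second root of unity. The Hadamard gate describes the (inverse) Fourier transform in dimension $D = 2^1 = 2$. We can write this inverse transform as

$$
\boldsymbol{H} = \frac{1}{\sqrt{2}} \sum_{b,c=0}^{1} \omega_2^{-b \times c}|b\rangle\langle c|
$$

and obtain

$$
\begin{aligned}
\boldsymbol{H}|r(\theta)\rangle &= \frac{1}{\sqrt{2}} \sum_{b,c=0}^{1} \omega_2^{-b \times c}|b\rangle\langle c| \; \frac{1}{\sqrt{2}} \sum_{a_0=0}^{1} \omega_2^{(2\theta) \times a_0}|a_0\rangle \\
&= \frac{1}{2} \sum_{a_0,b,c=0}^{1} \omega_2^{(2\theta) \times a_0 - b \times c}|b\rangle\langle c|a_0\rangle \\
&= \sum_{b=0}^{1} \left( \frac{1}{2} \sum_{c=0}^{1} \omega_2^{((2\theta) - b) \times c} \right) |b\rangle.
\end{aligned}
$$

The expression in brackets becomes maximal in magnitude when $2\theta$ and $b \in \{0, 1\}$ are as close as possible. The promise $\theta = 0$ ($2\theta = 0$) or $\theta = 1/2$ ($2\theta = 1$) ensures that this happens in the strongest possible sense. But even if this is not the case, this closed-form expression allows us to compute the outcome probabilities for $o_0$ analytically:

$$
\begin{aligned}
\mathrm{Pr}_{\boldsymbol{H}|r(\theta)\rangle}[o_0 = 0] &= \left| \langle 0|\boldsymbol{H}|r(\theta)\rangle \right|^2 = \left| \frac{1}{2} \sum_{c=0}^{1} \omega_2^{((2\theta) - 0) \times c} \right|^2 \\
&= \frac{1}{4} \left| 1 + \omega_2^{2\theta} \right|^2 = \left| \frac{1}{2} \left( 1 + \exp\left( (2\pi\mathrm{i}/2)(2\theta) \right) \right) \right|^2 \\
&= \left| \mathrm{e}^{\mathrm{i}\pi\theta} \times \frac{1}{2} \left( \mathrm{e}^{-\mathrm{i}\pi\theta} + \mathrm{e}^{+\mathrm{i}\pi\theta} \right) \right|^2 = |\cos(\pi\theta)|^2 = \cos^2(\pi\theta).
\end{aligned}
$$

This deserves a prominent display:

$$
\mathrm{Pr}_{\mathrm{QPE}, m=1}[o_0 = 0] = \cos^2(\pi\theta) \quad \text{and} \quad \mathrm{Pr}_{\mathrm{QPE}, m=1}[o_0 = 1] = \sin^2(\pi\theta).
$$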

As a function in the unknown phase 𝜃 ∈ [ 0, 1) , the probability to obtain 𝑜 0 = 0


peaks at 𝜃 = 0 (the first promise case) and then decays smoothly and achieves
0 at 𝜃 = 1/2 (the other promise case) before rising again. Recall that the
problem is periodic. Conversely, the probability to obtain 𝑜 0 = 1 peaks at
𝜃 = 1/2. We refer to Fig. 11.5 for a visualization and further discussion.
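
The two readout probabilities can be reproduced by applying a Hadamard gate to $|r(\theta)\rangle$ numerically. This is a small sketch in plain Python; the function name `one_bit_qpe_probs` and the sampled phases are assumptions for this illustration.

```python
import cmath
import math

def one_bit_qpe_probs(theta):
    """Return [Pr(o_0 = 0), Pr(o_0 = 1)] for QPE with m = 1 readout qubit."""
    s = 1 / math.sqrt(2)
    r = [s, s * cmath.exp(2j * math.pi * theta)]    # state after phase kickback
    out = [s * (r[0] + r[1]), s * (r[0] - r[1])]    # Hadamard = inverse QFT for D = 2
    return [abs(a) ** 2 for a in out]

# Compare against the closed-form expressions cos^2 and sin^2
deviations = []
for theta in [0.0, 0.1, 0.25, 0.5, 0.8]:
    p0, p1 = one_bit_qpe_probs(theta)
    deviations.append(abs(p0 - math.cos(math.pi * theta) ** 2))
    deviations.append(abs(p1 - math.sin(math.pi * theta) ** 2))
print(max(deviations))
```

At the promise values $\theta = 0$ and $\theta = 1/2$ one of the two probabilities equals 1, and in between the values interpolate smoothly.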

Figure 11.5 Plot of the readout probabilities for QPE with $m = 1$ qubit: the blue line displays the probability of reading out $o_0 = 0$ ($y$-axis) as a function of $\theta \in [0, 1)$ ($x$-axis). This probability peaks at $\theta = 0$ (and $\theta = 1$ due to periodicity), which corresponds to the first promise case in Lemma 11.9. The yellow line displays the probability of reading out $o_0 = 1$ as a function of $\theta \in [0, 1)$. This probability peaks at $\theta = 1/2$, which corresponds to the second promise case in Lemma 11.9. The actual probabilities do, however, interpolate smoothly between these two extreme cases. This is what we mean by stating that quantum phase estimation provides ‘good’ bit approximations of the underlying phase $\theta$ even if it cannot be represented exactly.

11.3.3 Quantum Phase estimation for 𝑚 = 2 readout bits


Before moving to the general case, let us also discuss the $m = 2$ readout bit case. Here, we will see how the $D = 2^2 = 4$-dimensional (inverse) Fourier transform arises naturally from the quantum phase estimation problem structure.

Figure 11.6 Visualization of QPE with $m = 2$ readout qubits.

Lemma 11.10 ($m = 2$ readout bit QPE, promise problem). Suppose that $\theta \in \{0, 1/4, 1/2, 3/4\}$ (i.e. $2^2\theta = 4\theta$ is an integer). Then, the $m = 2$-qubit QPE circuit illustrated in Fig. 11.6 produces a deterministic readout bit string

$$
o_1 o_0 = \begin{cases} 00 & \text{if } \theta = 0, \\ 01 & \text{else if } \theta = 1/4, \\ 10 & \text{else if } \theta = 1/2, \\ 11 & \text{else if } \theta = 3/4. \end{cases}
$$

Note that this outcome bit string encodes the exact bit representation of $2^2\theta \in \{0, 1, 2, 3\}$.

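Proof. The proof is a straightforward extension of the 1-bit base case. First note that the two controlled applications of $\boldsymbol{U}$ and $\boldsymbol{U}^2$ act on completely separate control qubits. The eigenvalue equation ensures $\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle$ and also $\boldsymbol{U}^2|\psi\rangle = \boldsymbol{U} \times \boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}2\theta}|\psi\rangle$. Hence, we can apply the phase kickback lemma (Lemma 11.8) twice and obtain the following simplification: after both controlled applications, the $(2+n)$-qubit state is $|r(2\theta)\rangle \otimes |r(\theta)\rangle \otimes |\psi\rangle$.

From now on, we can discard the final $n$ qubits (which are still in state $|\psi\rangle$) and can fully focus on the first two readout qubits. Similar to before, we can rewrite $|r(2\theta)\rangle \otimes |r(\theta)\rangle$ in a suggestive fashion:

$$
\begin{aligned}
|\tilde{\varphi}_{\text{after kickback}}(\theta)\rangle &= |r(2\theta)\rangle \otimes |r(\theta)\rangle && (11.2) \\
&= \left( \frac{1}{\sqrt{2}} \sum_{a_1=0}^{1} \exp\left( 2\pi\mathrm{i}(2\theta) \times a_1 \right) |a_1\rangle \right) \otimes \left( \frac{1}{\sqrt{2}} \sum_{a_0=0}^{1} \exp\left( 2\pi\mathrm{i}\theta \times a_0 \right) |a_0\rangle \right) \\
&= \frac{1}{2} \sum_{a_1,a_0=0}^{1} \exp\left( 2\pi\mathrm{i}\theta \times (2a_1 + a_0) \right) |a_1 a_0\rangle \\
&= \frac{1}{2} \sum_{j=0}^{3} \exp\left( 2\pi\mathrm{i}\theta \times j \right) |\llcorner j \lrcorner\rangle \\
&= \frac{1}{\sqrt{4}} \sum_{j=0}^{3} \omega_4^{(4\theta) \times j} |\llcorner j \lrcorner\rangle. && (11.3)
\end{aligned}
$$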

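Here, we have tacitly moved from bit strings $a_1 a_0 \in \{0, 1\}^2$ to the actual integer $j = 2a_1 + a_0 \in \{0, 1, 2, 3\}$ this bit string encodes (equivalently: $\llcorner j \lrcorner = a_1 a_0$). The resulting expression features a sum over $D = 4$ index elements which suggests a connection to the Fourier transform in $D = 2^2 = 4$ dimensions. Rewriting the complex phase as a power of a fourth root of unity $\omega_4 = \exp(2\pi\mathrm{i}/4) = (+\mathrm{i})$ further underscores this connection.
We are now ready to use the promises we have on $\theta$. By assumption, there are four distinct possibilities: $4\theta = k$ with $k \in \{0, 1, 2, 3\}$. In each case, the state in Eq. (11.3) becomes $\frac{1}{\sqrt{4}} \sum_{j=0}^{3} \omega_4^{k \times j} |\llcorner j \lrcorner\rangle$, which is the $k$-th column of the 4-dimensional DFT matrix. The inverse Fourier transform maps this state to $|\llcorner k \lrcorner\rangle$, so the readout deterministically yields the two bits of $k = 4\theta$. ■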

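Note that, similar to the 1-bit readout case, we can also execute a general inverse Fourier transform to get closed-form expressions of the final two-qubit amplitudes (2-bit readout analysis without promise). We can write the $D = 2^2 = 4$ dimensional inverse Fourier transform matrix as

$$
\boldsymbol{QFT}_{(4)}^{-1} = \frac{1}{\sqrt{4}} \sum_{k,l=0}^{3} \omega_4^{-k \times l} |\llcorner k \lrcorner\rangle\langle\llcorner l \lrcorner|.
$$

If we apply it to the reformulation of $|\tilde{\varphi}_{\text{after kickback}}(\theta)\rangle = |r(2\theta)\rangle \otimes |r(\theta)\rangle$ from Eq. (11.3), we obtain

$$
\begin{aligned}
|\tilde{\varphi}_{\text{final}}(\theta)\rangle &= \boldsymbol{QFT}_{(4)}^{-1} |\tilde{\varphi}_{\text{after kickback}}(\theta)\rangle \\
&= \frac{1}{\sqrt{4}} \sum_{k,l=0}^{3} \omega_4^{-k \times l} |\llcorner k \lrcorner\rangle\langle\llcorner l \lrcorner| \; \frac{1}{\sqrt{4}} \sum_{j=0}^{3} \omega_4^{(4\theta) \times j} |\llcorner j \lrcorner\rangle \\
&= \frac{1}{4} \sum_{j,k,l=0}^{3} \omega_4^{(4\theta) \times j - k \times l} |\llcorner k \lrcorner\rangle\langle\llcorner l \lrcorner|\llcorner j \lrcorner\rangle \\
&= \sum_{k=0}^{3} \left( \frac{1}{4} \sum_{l=0}^{3} \omega_4^{((4\theta) - k) \times l} \right) |\llcorner k \lrcorner\rangle,
\end{aligned}
$$

where we have used orthogonality of deterministic bit-string states: $\langle\llcorner l \lrcorner|\llcorner j \lrcorner\rangle = \delta_{j,l}$. The amplitude within the bracket peaks for the $k$ closest to $4\theta$ and rapidly falls off for values of $k$ that are further off. The extreme case occurs if $4\theta = k$ for exactly one $k \in \{0, 1, 2, 3\}$. This is the promise case we analyzed earlier and exactly one bit encoding $|\llcorner k \lrcorner\rangle$ survives while all other amplitudes are pushed to zero. The formula above interpolates smoothly between these four deterministic extreme cases.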
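
The 2-bit analysis can be replayed numerically by building $|r(2\theta)\rangle \otimes |r(\theta)\rangle$ explicitly and applying the 4-dimensional inverse Fourier transform. The following sketch uses hypothetical helper names and the index convention $j = 2a_1 + a_0$ from Eq. (11.3); it is meant as an illustration, not as a full circuit simulation.

```python
import cmath
import math

def r(theta):
    """Single-qubit phase state |r(theta)> = (|0> + exp(2*pi*i*theta)|1>)/sqrt(2)."""
    s = 1 / math.sqrt(2)
    return [s, s * cmath.exp(2j * math.pi * theta)]

def kron(u, v):
    """Tensor product; the first factor carries the more significant bit."""
    return [a * b for a in u for b in v]

def inverse_dft(x):
    """Unitary inverse DFT with entries omega_D^(-k*l) / sqrt(D)."""
    D = len(x)
    return [sum(cmath.exp(-2j * math.pi * k * l / D) * x[l] for l in range(D))
            / math.sqrt(D) for k in range(D)]

def two_bit_qpe_probs(theta):
    """Readout probabilities of the bit strings 00, 01, 10, 11 (as integers 0..3)."""
    state = kron(r(2 * theta), r(theta))   # index j = 2*a1 + a0
    return [abs(a) ** 2 for a in inverse_dft(state)]

# Promise cases: theta = k/4 yields readout k with certainty
promise = [two_bit_qpe_probs(k / 4)[k] for k in range(4)]
# A phase without exact 2-bit representation spreads over several outcomes
generic = two_bit_qpe_probs(0.3)
print(promise)
print(generic)
```

For $\theta = 0.3$ we have $4\theta = 1.2$, so the largest probability sits at $k = 1$, in line with the statement that the amplitude peaks at the $k$ closest to $4\theta$.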

11.3.4 Quantum phase estimation for 𝑚 readout bits


We are now ready for a general analysis of the Quantum Phase Estimation
Circuit with 𝑚 readout bits, see Fig. 11.2 for a visual illustration. Let us restate
the main result for convenience.
Proposition 11.11 (Restatement of Proposition 11.7). Instantiate the assumptions from Theorem 11.6 and suppose furthermore that $2^m\theta$ is an integer with bit representation $\llcorner 2^m\theta \lrcorner = t_{m-1} \cdots t_0 \in \{0, 1\}^m$. Then, the Quantum Phase Estimation circuit displayed in Fig. 11.2 is guaranteed to produce $o_{m-1} = t_{m-1}, \ldots, o_0 = t_0$ (i.e. perfect and deterministic recovery).

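Proof. Note that by iterating the promise $\boldsymbol{U}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}\theta}|\psi\rangle$ a total of $2^k$ times, we can also deduce $\boldsymbol{U}^{2^k}|\psi\rangle = \mathrm{e}^{2\pi\mathrm{i}(2^k\theta)}|\psi\rangle$ for every $k = 0, \ldots, m-1$. What is more, the state $|\psi\rangle$ describes an eigenvector of $\boldsymbol{U}^{2^k}$ and therefore survives each controlled circuit application unchanged. Hence, we can use the phase kickback lemma (Lemma 11.8) a total of $m$ times (once per readout qubit) to simplify the $(m+n)$-qubit state after all controlled circuit applications to $|r(2^{m-1}\theta)\rangle \otimes \cdots \otimes |r(\theta)\rangle \otimes |\psi\rangle$.

From now on, we can discard the final $n$ qubits (which continue to reside in the state $|\psi\rangle$) and focus on the $m$-qubit state after all kickbacks:

$$
\begin{aligned}
|\tilde{\varphi}_{\text{after kickback}}(\theta)\rangle &= |r(2^{m-1}\theta)\rangle \otimes \cdots \otimes |r(\theta)\rangle \\
&= \left( \frac{1}{\sqrt{2}} \sum_{a_{m-1}=0}^{1} \exp\left( (2\pi\mathrm{i}/2^m)(2^m\theta) \times 2^{m-1} a_{m-1} \right) |a_{m-1}\rangle \right) \otimes \cdots \\
&\quad \otimes \left( \frac{1}{\sqrt{2}} \sum_{a_0=0}^{1} \exp\left( (2\pi\mathrm{i}/2^m)(2^m\theta) \times a_0 \right) |a_0\rangle \right) \\
&= \frac{1}{\sqrt{2^m}} \sum_{a_{m-1},\ldots,a_1,a_0=0}^{1} \omega_{2^m}^{(2^m\theta) \times (2^{m-1}a_{m-1} + \cdots + 2a_1 + a_0)} |a_{m-1} \cdots a_1 a_0\rangle \\
&= \frac{1}{\sqrt{2^m}} \sum_{j=0}^{2^m-1} \omega_{2^m}^{(2^m\theta) \times j} |\llcorner j \lrcorner\rangle.
\end{aligned}
$$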

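We can write the $D = 2^m$-dimensional inverse Fourier transform in a similar fashion as

$$
\boldsymbol{F}_{(2^m)}^{-1} = \frac{1}{\sqrt{2^m}} \sum_{k,l=0}^{2^m-1} \omega_{2^m}^{-k \times l} |\llcorner k \lrcorner\rangle\langle\llcorner l \lrcorner|.
$$

This allows us to compute a general expression of the amplitudes that feature in the final $m$-qubit state:

$$
\begin{aligned}
|\tilde{\varphi}_{\text{final}}(\theta)\rangle &= \boldsymbol{QFT}_{(2^m)}^{-1} |\tilde{\varphi}_{\text{after kickback}}(\theta)\rangle \\
&= \frac{1}{\sqrt{2^m}} \sum_{k,l=0}^{2^m-1} \omega_{2^m}^{-k \times l} |\llcorner k \lrcorner\rangle\langle\llcorner l \lrcorner| \; \frac{1}{\sqrt{2^m}} \sum_{j=0}^{2^m-1} \omega_{2^m}^{(2^m\theta) \times j} |\llcorner j \lrcorner\rangle \\
&= \frac{1}{2^m} \sum_{j,k,l=0}^{2^m-1} \omega_{2^m}^{(2^m\theta) \times j - k \times l} |\llcorner k \lrcorner\rangle\langle\llcorner l \lrcorner|\llcorner j \lrcorner\rangle \\
&= \sum_{k=0}^{2^m-1} \left( \frac{1}{2^m} \sum_{l=0}^{2^m-1} \omega_{2^m}^{((2^m\theta) - k) \times l} \right) |\llcorner k \lrcorner\rangle.
\end{aligned}
$$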

The amplitude expression within the bracket peaks for the integer $k \in \{0, 1, \ldots, 2^m - 1\}$ that is closest to $2^m\theta$. The extreme case occurs if $2^m\theta = k_\sharp$ for exactly one $k_\sharp$. In this case, we obtain a simple, deterministic expression for $|\tilde{\varphi}_{\text{final}}\rangle$:

$$
|\tilde{\varphi}_{\text{final}}(k_\sharp/2^m)\rangle = \sum_{k=0}^{2^m-1} \left( \frac{1}{2^m} \sum_{l=0}^{2^m-1} \omega_{2^m}^{(k_\sharp - k) \times l} \right) |\llcorner k \lrcorner\rangle = \sum_{k=0}^{2^m-1} \delta_{k,k_\sharp} |\llcorner k \lrcorner\rangle = |\llcorner k_\sharp \lrcorner\rangle.
$$

We will explain the occurrence of $\delta_{k,k_\sharp}$ in a moment. But before, let us emphasize that performing a readout on $|\tilde{\varphi}_{\text{final}}(k_\sharp/2^m)\rangle$ produces the $m$-bit encoding $\llcorner k_\sharp \lrcorner$ of the correct $m$-bit approximation $k_\sharp = 2^m\theta$.
Last, but not least, let us quickly argue why the amplitudes collapse to deterministic values if $\theta = k_\sharp/2^m$ with $k_\sharp \in \{0, \ldots, 2^m - 1\}$. To see this, it is enough to look at the amplitude associated with summand $k = k_\sharp$:

$$
\frac{1}{2^m} \sum_{l=0}^{2^m-1} \omega_{2^m}^{(k_\sharp - k_\sharp) \times l} = \frac{1}{2^m} \sum_{l=0}^{2^m-1} \omega_{2^m}^{0} = \frac{1}{2^m} \sum_{l=0}^{2^m-1} 1 = 1.
$$

An amplitude of magnitude 1 in one $m$-bit string is only possible if all other amplitudes vanish identically, because the final state is normalized. This justifies the use of the Kronecker delta $\delta_{k_\sharp,k}$ further above. ■
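
The closed-form amplitudes derived above are straightforward to evaluate for any $m$. The sketch below (hypothetical function name `qpe_probs`, example values $m = 4$, $\theta = 11/16$ and $\theta = 0.3$ chosen for illustration) evaluates the bracket for every $k$ and compares the probability of the closest $k$ with the lower bound $4/\pi^2 \approx 40\%$ that is commonly quoted for the best $m$-bit approximation.

```python
import cmath
import math

def qpe_probs(theta, m):
    """Readout probabilities |amplitude_k|^2 for k = 0, ..., 2^m - 1."""
    D = 2 ** m
    probs = []
    for k in range(D):
        # amplitude = (1/D) * sum_l omega_D^((D*theta - k) * l)
        amp = sum(cmath.exp(2j * math.pi * (D * theta - k) * l / D)
                  for l in range(D)) / D
        probs.append(abs(amp) ** 2)
    return probs

m = 4

# Promise case: 2^m * theta = 11 is an integer, so the readout is deterministic
exact = qpe_probs(11 / 16, m)

# Generic case: 2^m * theta = 4.8, the closest integer is k = 5
generic = qpe_probs(0.3, m)
best_k = max(range(2 ** m), key=lambda k: generic[k])
print(exact[11], best_k, generic[best_k], 4 / math.pi ** 2)
```

Note that this evaluates the final formula only; it does not simulate the $2^m - 1$ controlled applications of $\boldsymbol{U}$ that dominate the cost of the actual circuit.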

12. Shor’s algorithm for integer factorization

Date: 18 December 2024

12.1 Motivation: hard instances of integer factorization

Agenda:
1 motivation: hard integer factorization instances
2 (classical) reduction of integer factorization to order finding
3 (quantum) algorithm for order finding
4 synopsis: Shor’s algorithm

Today, we finally have all the pieces in place to properly discuss one of the most prominent quantum algorithms to date: Shor’s algorithm for integer factorization from 1994. Integer factorization is the problem of decomposing a (large) integer 𝑁 into a product of smaller numbers that are all prime. The first algorithms to solve integer factorization date back to Fibonacci in 1202. Methods like trial division work well for products of many small primes. But they become really expensive if 𝑁 is the product of two distinct prime numbers:

\[ N = p \times q \quad \text{where } p, q \text{ prime}. \tag{12.1} \]

In this worst case, trial division may require up to √𝑁 different attempts at division – a number that scales exponentially in the bit length 𝑛 = ⌊log2(𝑁)⌋ + 1 of 𝑁. Even the best algorithms known to date scale like

\[ O\Big( \exp\big( c \times n^{1/3} \log^{2/3}(n) \big) \Big), \]

which is still exponential in 𝑛^{1/3}. The apparent hardness of such integer factorization problems is not only a curse, but also a blessing. Hardness of factoring forms the basis of several cryptographic primitives, most notably RSA.
(Margin note: task: factor 𝑛-bit integer into a product of prime numbers)
(Margin note: best classical factorization scaling is exponential in 𝑛^{1/3})
The main result of today is Shor’s algorithm for integer factorization from
1994. This is really a hybrid classical-quantum approach, where the bulk is
fully classical. It does, however, use a ( 3𝑛 + 1) = 𝑂 (𝑛) -qubit architecture to
(exponentially) speed up one crucial subroutine. This is achieved by cleverly
employing quantum phase estimation – the main topic of Lecture 11.

The remainder of today’s lecture discusses different aspects of this algorithm.


In Sec. 12.2 we use modular arithmetic to reduce the (classically hard) problem
of integer factorization to another (classically hard) problem that looks very
different: finding the order of an exponential modulo 𝑁 . Subsequently, we
adapt quantum phase estimation to solve this order finding problem much
faster than the best-known classical procedure. This will require ( 3𝑛 + 1)
qubits, a total of 𝑂 (𝑛 3 ) elementary one- and two-qubit gates and (possibly) a
few rounds of repetitions to guarantee that we read out the correct solution by
measuring the first ( 2𝑛 + 1) qubits. This is the content of Sec. 12.3. Finally,
Sec. 12.4 combines both parts and states Shor’s algorithm in its full glory.

12.2 Reducing Integer Factorization to order finding


We now present a fully classical analysis that reduces integer factorization to
another problem called order finding.

12.2.1 The order finding problem


Let us first introduce a couple of concepts from arithmetic. For two positive integers 𝑎, 𝑁, the greatest common divisor (gcd) is the largest number that divides both 𝑎 and 𝑁. We denote it by writing gcd(𝑎, 𝑁). For example, gcd(8, 12) = 4 and gcd(11, 27) = 1. Note that we can compute gcd’s efficiently on a classical computer. Euclid’s algorithm, for instance, runs in time 𝑂(𝑛^2), where 𝑛 is the bit length of max{𝑎, 𝑁}.
(Margin note: greatest common divisor (gcd))
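To make the gcd computation concrete, here is a minimal Python sketch of Euclid's algorithm. The function name `euclid_gcd` is chosen for illustration only; Python's standard library already provides `math.gcd`, which could be used instead.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    while b != 0:
        # Replace (a, b) by (b, a mod b); the gcd stays the same.
        a, b = b, a % b
    return a


# The two examples from the text:
print(euclid_gcd(8, 12))
print(euclid_gcd(11, 27))
```

Each loop iteration shrinks the second argument, which is why the number of iterations stays small even for large inputs.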
Let us now move on to review the basics of modular arithmetic (think:
binary math, but extended to 𝑁 ary number systems). For a positive integer 𝑁 ,
we define
ℤ𝑁 = {0, 1, 2, . . . , 𝑁 − 1} .
For example ℤ2 = {0, 1} (binary alphabet), ℤ3 = {0, 1, 2} (ternary alphabet)
and ℤ16 = {0, . . . , 15} (integer representation of the hexadecimal alphabet).
Every ℤ_𝑁 is a set of integers which we can endow with arithmetic operations modulo 𝑁:

\[ x + y \bmod N \quad \text{and} \quad x \times y \bmod N . \]

(Margin note: arithmetic modulo 𝑁)
For instance, the following modular relations should be very familiar for
computer scientists:

1 × 1 mod 2 = 1 = 1 mod 2,
2 × 1 mod 2 = 1 + 1 mod 2 = 0 mod 2,
3 × 1 mod 2 = 1 + 1 + 1 mod 2 = 1 mod 2,
..
.

Example 12.1 (first 10 digits modulo 𝑁 for 𝑁 = 3, 4, 5). The first ten digits (ℤ10 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}) assume the following values in different modular arithmetic frameworks:

𝑁 = 3 : ℤ10 ↦→ {0, 1, 2, 0, 1, 2, 0, 1, 2, 0} ,
𝑁 = 4 : ℤ10 ↦→ {0, 1, 2, 3, 0, 1, 2, 3, 0, 1} ,
𝑁 = 5 : ℤ10 ↦→ {0, 1, 2, 3, 4, 0, 1, 2, 3, 4} .

Notice the repeating patterns that occur if we represent 10 numbers in arithmetic


modulo 𝑁 < 10. This periodicity is a general feature of modular arithmetic
that will become crucial later on. ■

The set ℤ_𝑁 defined above is guaranteed to contain elements 𝑎 ∈ ℤ_𝑁 that share no common divisor with 𝑁 other than 1. We accumulate all of them and define

\[ \mathbb{Z}_N^* = \{ a \in \mathbb{Z}_N : \gcd(a, N) = 1 \} . \]

These numbers are special, because each 𝑎 ∈ ℤ_𝑁^∗ is coprime with 𝑁. For instance,

\[ \mathbb{Z}_{15}^* = \{1, 2, 4, 7, 8, 11, 13, 14\} , \tag{12.2} \]

\[ \mathbb{Z}_{35}^* = \{1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 16, 17, 18, 19, 22, 23, 24, 26, 27, 29, 31, 32, 33, 34\} . \tag{12.3} \]
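The sets of coprime elements can be enumerated by brute force. The following Python sketch does this for small 𝑁; the helper name `coprime_elements` is an assumption made for illustration.

```python
from math import gcd


def coprime_elements(N: int) -> list[int]:
    """All a in Z_N = {0, ..., N-1} that obey gcd(a, N) = 1."""
    return [a for a in range(N) if gcd(a, N) == 1]


print(coprime_elements(15))
print(len(coprime_elements(35)))
```

Note that 𝑎 = 0 is never included for 𝑁 > 1, because gcd(0, 𝑁) = 𝑁.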

We are now ready to present a key fact from modular arithmetic on which we build our integer factorization algorithm.
Fact 12.2 (order of 𝑎 in ℤ_𝑁^∗). Fix 𝑁 ∈ ℕ. Then, for every 𝑎 ∈ ℤ_𝑁^∗ there must exist a positive integer 𝑟 such that 𝑎^𝑟 = 1 mod 𝑁. The smallest such 𝑟 is called the order of 𝑎 in ℤ_𝑁^∗. ■
(Margin note: order finding problem)

For this fact to hold, it is important that 𝑎 ∈ ℤ_𝑁^∗ to begin with. If gcd(𝑎, 𝑁) ≠ 1, there can be no order to begin with (why?). Note that the order 𝑟 of 𝑎 in ℤ_𝑁^∗ is closely related to periodicities, like the ones we saw above. More precisely, it is the period of the function 𝑓_{𝑎,𝑁}(𝑥) = 𝑎^𝑥 mod 𝑁. Indeed,

\[ f_{a,N}(x + r) = a^{x+r} \bmod N = a^x \times a^r \bmod N = a^x \times 1 \bmod N = f_{a,N}(x) . \]

Let us familiarize ourselves with this concept by executing two concrete example
calculations.
Example 12.3 (order of 𝑎 = 13 in ℤ_15^∗). Let us consider the set ℤ_15^∗ from Eq. (12.2) and choose 𝑎 = 13. We can then evaluate the order by trial search over candidate exponents 𝑟 = 1, 2, 3, . . .:

\[
\begin{aligned}
13^1 &= 13 \bmod 15, & 13^2 &= 169 = 4 \bmod 15, \\
13^3 &= 2197 = 7 \bmod 15, & 13^4 &= 28561 = 1 \bmod 15 .
\end{aligned}
\]

So, the order of 𝑎 = 13 in ℤ_15^∗ is 𝑟 = 4. ■



Example 12.4 (order of 𝑎 = 2 in ℤ_35^∗). Let us consider the set ℤ_35^∗ from Eq. (12.3) and choose 𝑎 = 2. We can then evaluate the order by trial search over the candidate exponents. Since 2^1, 2^2, 2^3, 2^4, 2^5 < 35 = 𝑁, none of these powers can equal 1 modulo 𝑁 and we can start our search at 𝑟 = 6:

\[
\begin{aligned}
2^6 &= 64 \bmod 35 = 29 \bmod 35, & 2^7 &= 128 \bmod 35 = 23 \bmod 35, \\
2^8 &= 256 \bmod 35 = 11 \bmod 35, & 2^9 &= 512 \bmod 35 = 22 \bmod 35, \\
2^{10} &= 1024 \bmod 35 = 9 \bmod 35, & 2^{11} &= 2048 \bmod 35 = 18 \bmod 35, \\
2^{12} &= 4096 \bmod 35 = 1 \bmod 35 .
\end{aligned}
\]

This allows us to conclude that the order of 𝑎 = 2 in ℤ_35^∗ is 𝑟 = 12. We can also see that the cost of trial search over possible exponents can grow very quickly once 𝑁 becomes somewhat large. ■
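The trial search used in the two examples can be written down in a few lines of Python. This is only a brute-force sketch (the function name `order` is our own choice); its runtime grows with 𝑟 and therefore becomes impractical for large 𝑁.

```python
from math import gcd


def order(a: int, N: int) -> int:
    """Smallest positive r with a^r = 1 mod N, found by trial exponentiation."""
    if gcd(a, N) != 1:
        raise ValueError("the order is only defined if gcd(a, N) = 1")
    r, power = 1, a % N
    while power != 1:
        # Multiply by a once more and reduce modulo N.
        power = (power * a) % N
        r += 1
    return r


print(order(13, 15))  # Example 12.3
print(order(2, 35))   # Example 12.4
```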

These examples highlight that order finding is possible. But the number of trial exponentiations can grow quickly once 𝑁 becomes (somewhat) large. This increase in cost seems to be a fundamental problem. To date, no efficient classical algorithm is known that can determine the order of 𝑎 in ℤ_𝑁^∗ with polynomial resources in 𝑛 = ⌊log2(𝑁)⌋ + 1. We will, however, develop an efficient quantum algorithm in Sec. 12.3 below.

12.2.2 Solving integer factorization via order finding


Let us now present a (classical) protocol that solves the worst case of integer
factorization, see Eq. (12.1), via the order finding problem defined in Fact 12.2.
Formally speaking, this is a proper reduction: every step in our procedure
will be (very) efficient – with the sole exception of order finding. We refer to
Algorithm 12.1 for details.

Theorem 12.5 (integer factorization via order finding). Suppose that 𝑁 = 𝑝 × 𝑞 is an odd product of two distinct primes. Then, Algorithm 12.1 is guaranteed to terminate and output (at least) one factor: 𝑓_0 ∈ {𝑝, 𝑞} or 𝑓_1 ∈ {𝑝, 𝑞}.
(Margin note: (classical) reduction from integer factorization to order finding)

This result implies that order finding is at least as difficult as (our special
case of) integer factorization. Typically, such reduction arguments are used
to argue that a problem is difficult. Here, we will do the opposite. We will
develop an efficient algorithm for order finding in order to argue that integer
factorization becomes cheap if we had access to a quantum computer.

Algorithm 12.1 Integer Factorization via order finding

Input: 𝑁 ∈ ℕ  ⊲ integer to be factorized
Output: two positive integers 𝑓_0, 𝑓_1 ∈ ℕ  ⊲ pair of candidate factors
 1 while true do
 2   choose 𝑎 ∈ ℤ_𝑁 = {0, 1, . . . , 𝑁 − 1} randomly
 3   if gcd(𝑎, 𝑁) > 1 then  ⊲ we found a factor by chance
 4     success, set 𝑓_0 = gcd(𝑎, 𝑁), 𝑓_1 = 1 and break while loop
 5   else if gcd(𝑎, 𝑁) = 1 then
 6     identify smallest 𝑟 s.t. 𝑎^𝑟 = 1 mod 𝑁  ⊲ find order of 𝑎 in ℤ_𝑁^∗
 7     if 𝑟 is odd then
 8       failure, move back to step 2
 9     else if 𝑟 is even then
10       set 𝑥 = 𝑎^{𝑟/2} mod 𝑁 and compute (𝑥 + 1), (𝑥 − 1)
11       if (𝑥 ± 1) = 0 mod 𝑁 then
12         failure, move back to step 2
13       else if (𝑥 ± 1) ≠ 0 mod 𝑁 then
14         success, compute 𝑓_0 = gcd(𝑥 + 1, 𝑁) and 𝑓_1 = gcd(𝑥 − 1, 𝑁) and break while loop
15 output 𝑓_0 and 𝑓_1

Proof of Theorem 12.5. The input 𝑁 allows us to initialize arithmetic modulo 𝑁 and sample 𝑎 ∈ ℤ_𝑁 uniformly. Although very unlikely, it can happen that 𝑎 shares a nontrivial greatest common divisor with 𝑁. In this case, 𝑁 = 𝑝 × 𝑞 and 𝑝, 𝑞 prime demands that this gcd is either 𝑝 or 𝑞. In other words: we got lucky and have found a factor by pure chance!
Let us now focus on the much more likely case, where gcd(𝑎, 𝑁) = 1. In this case, the order of 𝑎 in ℤ_𝑁^∗ is well-defined. We can identify the smallest 𝑟 such that 𝑎^𝑟 = 1 mod 𝑁 (e.g. via brute force search). Two situations can arise: 𝑟 may be odd or 𝑟 may be even. If 𝑟 is odd, the algorithm fails and we try again with a different 𝑎 ∈ ℤ_𝑁. Else if 𝑟 is even, then 𝑟/2 is a positive integer and we can compute 𝑥 = 𝑎^{𝑟/2} mod 𝑁. The modular version of (𝑎 + 𝑏) × (𝑎 − 𝑏) = 𝑎^2 − 𝑏^2 then ensures

\[
\begin{aligned}
(x + 1) \times (x - 1) \bmod N &= \big( a^{r/2} \big)^2 - 1 \bmod N \\
&= a^r - 1 \bmod N \\
&= 0 \bmod N ,
\end{aligned}
\]

because 𝑎^𝑟 = 1 mod 𝑁. Returning to ordinary arithmetic, this is equivalent to

\[ (x + 1) \times (x - 1) = k N \quad \text{for some } k \in \mathbb{N}. \tag{12.4} \]

This starts to look very close to a factorization. However, this display becomes completely trivial if either (𝑥 + 1) or (𝑥 − 1) is itself equal to a multiple of 𝑁. The last if-condition in line 11 of Alg. 12.1 checks for precisely this possibility. If (𝑥 ± 1) = 0 mod 𝑁, then we have learned nothing new and restart. Else if (𝑥 ± 1) ≠ 0 mod 𝑁, Eq. (12.4) must contain nontrivial information about the factors. Using 𝑁 = 𝑝 × 𝑞, we rewrite it as (𝑥 + 1) × (𝑥 − 1) = 𝑘 × 𝑝 × 𝑞 for some 𝑘 ∈ ℕ and (𝑥 ± 1) ≠ 𝑘′ × 𝑝 × 𝑞. This is only possible if either gcd(𝑥 + 1, 𝑁) ∈ {𝑝, 𝑞} or gcd(𝑥 − 1, 𝑁) ∈ {𝑝, 𝑞} (or both). ■
So far, Algorithm 12.1 looks rather abstract and difficult to grasp. It is
therefore instructive to execute it in small-scale examples.

Example 12.6 (Factoring 𝑁 = 15 with Algorithm 12.1). Suppose that we start with 𝑎 = 13 which obeys gcd(13, 15) = 1 and therefore doesn’t yet provide any information about the factors of 𝑁 = 15. To change this, we use trial exponentiation to find the order of 𝑎 = 13 in ℤ_15^∗. We already did that in Example 12.3: 𝑟 = 4. This number is even and we can use it to define 𝑥 = 𝑎^{𝑟/2} mod 15 = 13^2 mod 15 = 4 and, in turn, (𝑥 + 1) = 5, (𝑥 − 1) = 3. Neither of these numbers is a multiple of 15, so the last condition is also met and we are guaranteed to learn something about the factors of 15. Indeed,

\[ \gcd(x + 1, N) = \gcd(5, 15) = 5 \quad \text{and} \quad \gcd(x - 1, N) = \gcd(3, 15) = 3 . \]

Remarkably, we have learned both factors of 𝑁 = 3 × 5 in one go. This is even better than the promise from Theorem 12.5. ■

Example 12.7 (Factoring 𝑁 = 35 with Algorithm 12.1). Here, we will also discuss choices of 𝑎 for which the algorithm fails. If this happens, we start over with a new choice of 𝑎 ∈ ℤ_35:
Try 𝑎 = 11 and use brute force to identify the order of 11 in ℤ_35^∗: 𝑟 = 3, which is not even, so we have to start over.
Try 𝑎 = 2 and use brute force to identify the order of 2 in ℤ_35^∗: 𝑟 = 12, which is even and allows us to define 𝑥 = 𝑎^{𝑟/2} mod 35 = 2^6 mod 35 = 29. Fortunately, neither (𝑥 + 1) = 30 nor (𝑥 − 1) = 28 is itself a multiple of 𝑁. So, computing their gcds with 𝑁 reveals at least one factor (and, by chance, we in fact get both of them again):

\[ \gcd(x + 1, N) = \gcd(30, 35) = 5 \quad \text{and} \quad \gcd(x - 1, N) = \gcd(28, 35) = 7 . \]
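The steps of Algorithm 12.1 can be sketched in Python as follows. This is an illustrative, fully classical version: order finding is done by brute force (the helper `order` is our own), and, as a simplifying assumption that deviates slightly from the algorithm as stated, candidates are drawn from {2, . . . , 𝑁 − 1} so that the uninformative choices 𝑎 = 0 and 𝑎 = 1 are skipped.

```python
import random
from math import gcd


def order(a: int, N: int) -> int:
    """Brute-force order finding: smallest r >= 1 with a^r = 1 mod N."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r


def factor_via_order_finding(N: int, rng: random.Random) -> tuple[int, int]:
    """Return candidate factors (f0, f1) for N = p * q (odd, p != q prime)."""
    while True:
        a = rng.randrange(2, N)      # assumption: skip a = 0 and a = 1
        g = gcd(a, N)
        if g > 1:                    # lucky guess: a shares a factor with N
            return g, 1
        r = order(a, N)              # the (classically) expensive step
        if r % 2 == 1:               # odd order: try another a
            continue
        x = pow(a, r // 2, N)
        if (x + 1) % N == 0 or (x - 1) % N == 0:
            continue                 # trivial case: try another a
        return gcd(x + 1, N), gcd(x - 1, N)


print(factor_via_order_finding(15, random.Random(1)))
print(factor_via_order_finding(35, random.Random(1)))
```

In the hybrid quantum-classical setting discussed later, only the call to `order` would be replaced by a quantum subroutine; everything else stays classical.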

12.3 Efficiently solving order finding on a quantum computer


Section 12.2 provided an alternative approach for (hard instances of) integer factorization. This method is very different from well-known factoring methods, like trial division. It isolates most of the hardness in a single sub-task: find the order of 𝑎 in ℤ_𝑁^∗ (where gcd(𝑎, 𝑁) = 1):

find the smallest positive integer 𝑟 such that 𝑎^𝑟 = 1 mod 𝑁.

We don’t know any classical algorithm for this order finding problem that scales polynomially in the bit length 𝑛 of 𝑁. Trial and error, for instance, may require a number of attempts that is exponential in 𝑛. However, unlike the original factoring problem, order finding is much more amenable to a genuine quantum solution. In fact, we can use phase estimation for it. This was Shor’s seminal insight in 1994.

Figure 12.1 Quantum phase estimation circuit: Let 𝑼 be an 𝑛-qubit unitary and let |𝜓⟩ be a state vector representation of an eigenvector with eigenvalue (‘phase’) exp(2𝜋i𝜃) of 𝑼. Then, for 𝑚 ≥ 1, the advertised (𝑚 + 𝑛)-qubit circuit approximates the first 𝑚 bits of the (eigen-)phase 𝜃 ∈ [0, 1). More precisely, with high probability, the 𝑚-bit readout ⌞𝑦⌟ describes an integer approximation 𝑦 ∈ {0, . . . , 2^𝑚 − 1} of 2^𝑚 × 𝜃.

12.3.1 Recapitulation: Quantum Phase Estimation (QPE)


Quantum phase estimation is a versatile quantum circuit architecture that exploits the (very, very fast) quantum Fourier transform. At the heart are an 𝑛-qubit unitary circuit 𝑼 and an 𝑛-qubit state (vector) |𝜓⟩ that obey

\[ \boldsymbol{U} |\psi\rangle = \exp(2\pi \mathrm{i} \theta)\, |\psi\rangle . \tag{12.5} \]

(Margin note: quantum phase estimation for eigenvalue+eigenvector pair of 𝑛-qubit unitary 𝑼)

In mathematical terms, 𝑼 describes a unitary matrix, |𝜓⟩ denotes an eigenvector and the associated eigenvalue is exp(2𝜋i𝜃). Our task is to approximate the phase 𝜃 up to 𝑚 bits of accuracy.
There is a quantum solution to phase estimation. It requires that we can create the state |𝜓⟩ and know a quantum circuit realization for 𝑼. The associated circuit uses (𝑚 + 𝑛) qubits and is displayed in Fig. 12.1. It combines an 𝑚-qubit Quantum Fourier transform¹ with 𝑚 controlled applications of powers of 𝑼 on the lower batch of 𝑛 qubits. Finally, we apply a reverse QFT to the first 𝑚 qubits and perform a readout. This produces a bit string ⌞𝑦⌟ ∈ {0, 1}^𝑚 which we interpret as bit encoding of 0 ≤ 𝑦 ≤ 2^𝑚 − 1. Remarkably, the fraction 𝑦/2^𝑚 is likely to be the best approximation of 𝜃 up to accuracy 1/2^{𝑚+1}.
Fact 12.8 (performance guarantee for quantum phase estimation). Suppose that 𝑼 and |𝜓⟩ obey Eq. (12.5) with phase 𝜃 and choose 𝑚 ∈ ℕ. Then, with probability at least 40%, the (𝑚 + 𝑛)-qubit circuit displayed in Fig. 12.1 produces an outcome ⌞𝑦⌟ ∈ {0, 1}^𝑚 which can be viewed as bit encoding of 𝑦 ∈ {0, . . . , 2^𝑚 − 1} that obeys

\[ \left| \frac{y}{2^m} - \theta \right| \leq \frac{1}{2^{m+1}} . \tag{12.6} \]

¹ Note that the action of the QFT on the initial 𝑚-qubit state |0 · · · 0⟩ is the same as the action of 𝑚 parallel Hadamard gates. This bridges the gap to the Phase Estimation Circuits discussed in Lecture 11.

Note that this fact says nothing about the cost of executing the phase estimation circuit. This depends on the way we construct |𝜓⟩ and – which is much more severe – how we implement the controlled application of 𝑼^{2^𝑘} = 𝑼 × · · · × 𝑼 (2^𝑘 factors) for 𝑘 = 0, . . . , 𝑚 − 1. This gate count quickly explodes if one is not careful. Fortunately, for Shor’s algorithm, a trick will allow us to deal with this issue nicely.

12.3.2 Identifying the order parameter in eigenvalues of a simple reversible circuit


In order to build a bridge between (classical) order finding and quantum phase estimation, we need to identify qubit encodings of ℤ_𝑁, the ‘right’ unitary circuit 𝑼 and a promising eigenvector |𝜓⟩.
A qubit encoding of ℤ_𝑁 is easy to accomplish. Set 𝑛 = ⌊log2(𝑁)⌋ + 1 and identify each 𝑧 ∈ ℤ_𝑁 with its bit encoding ⌞𝑧⌟ ∈ {0, 1}^𝑛. In a second step, we associate such bit encodings with deterministic 𝑛-bit input state vectors:

\[ |z\rangle_N := |\llcorner (z \bmod N) \lrcorner\rangle \quad \text{for } z \in \mathbb{Z}_N \text{ and } \llcorner z \lrcorner \in \{0, 1\}^n . \]

(Margin note: 𝑛-qubit encoding of ℤ_𝑁 (𝑛 = 𝑂(log(𝑁))))

With a slight abuse of notation, we denote an entire bit string by the number
𝑧 ∈ ℤ𝑁 it encodes. Note, moreover, that we can also encode modular arithmetic
in these state vector labels. In particular,

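\[
\begin{aligned}
|z + z'\rangle_N &= |(z + z' \bmod N)\rangle_N = |\llcorner (z + z' \bmod N) \lrcorner\rangle , \\
|z \times z'\rangle_N &= |(z \times z' \bmod N)\rangle_N = |\llcorner (z \times z' \bmod N) \lrcorner\rangle .
\end{aligned}
\]

This (qu-)bit encoding allows us to represent modular addition and multiplication as classical reversible circuits. For 𝑎 ∈ ℤ_𝑁^∗ fixed, we define

\[ \boldsymbol{A}_a |z\rangle_N = |z + a\rangle_N \quad \text{(addition by } a \text{ modulo } N\text{)}, \tag{12.7} \]

\[ \boldsymbol{M}_a |z\rangle_N = |a \times z\rangle_N \quad \text{(multiplication by } a \text{ modulo } N\text{)}, \tag{12.8} \]

for all possible 𝑧 ∈ ℤ_𝑁. Modulo some technical fineprint², these operations map 𝑛-bit encodings of 𝑧 ∈ ℤ_𝑁 to 𝑛-bit encodings of 𝑧′ ∈ ℤ_𝑁 in a reversible fashion. The reverse of 𝑨_𝑎 is simply 𝑨_{(𝑁−𝑎)}:

\[ \boldsymbol{A}_{(N-a)} \boldsymbol{A}_a |z\rangle_N = \boldsymbol{A}_{(N-a)} |z + a\rangle_N = |(z + a) + (N - a)\rangle_N = |z + N\rangle_N = |z\rangle_N . \]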


² Because 𝑁 < 2^𝑛 for 𝑛 = ⌊log2(𝑁)⌋ + 1, there are 𝑛-bit strings 𝑦 ∈ {0, 1}^𝑛 that do not describe bit encodings of any 𝑧 ∈ ℤ_𝑁. To complete the formal definition of 𝑨_𝑎 and 𝑴_𝑎, we let both of them act trivially on such input configurations: 𝑨_𝑎|𝑦⟩ = |𝑦⟩ and 𝑴_𝑎|𝑦⟩ = |𝑦⟩.

The reverse operation of 𝑴_𝑎 is much more interesting by comparison. It is 𝑴_{𝑎^{𝑟−1}}, where 𝑟 is the order of 𝑎 ∈ ℤ_𝑁^∗:

\[ \boldsymbol{M}_{a^{r-1}} \boldsymbol{M}_a |z\rangle_N = \boldsymbol{M}_{a^{r-1}} |a \times z\rangle_N = |a^r \times z\rangle_N = |1 \times z\rangle_N = |z\rangle_N . \]

These displays ensure that both 𝑨_𝑎 and 𝑴_𝑎 are classical reversible circuits (and therefore also unitary). What is more, they can be implemented with only 𝑂(𝑛^2) elementary quantum gates.
Exercise 12.9 (quantum circuit implementation of reversible addition and reversible multiplication). Sketch how you would construct a quantum implementation of the permutation matrices 𝑨_𝑎 (addition by 𝑎 modulo 𝑁) and 𝑴_𝑎 (multiplication by 𝑎 modulo 𝑁) on 𝑛 = ⌊log2(𝑁)⌋ + 1 qubits. Show that the number of elementary gates required does not exceed 𝑂(𝑛^2).
(Margin note: efficient quantum circuits for addition & multiplication by 𝑎 modulo 𝑁 (𝑂(𝑛^2) gates))
We leave the proof as an instructive exercise. Instead, let us emphasize
that the multiplication operator 𝑴 𝑎 looks to be connected to the order 𝑟 of
𝑎 in ℤ𝑁∗ in a nontrivial fashion that we might be able to exploit. In order to

make this correspondence precise, we must take a look at the eigenvalues and
eigenvectors of 𝑴 𝑎 .
Proposition 12.10 (eigenvalues+eigenvectors of the modular multiplication operator). For 𝑁 ∈ ℕ and 𝑎 ∈ ℤ_𝑁^∗ (i.e. gcd(𝑎, 𝑁) = 1), let 𝑟 be the order of 𝑎 in ℤ_𝑁^∗ and set 𝜔_𝑟 = exp(2𝜋i/𝑟) (𝑟-th root of unity). Then, the modular multiplication circuit 𝑴_𝑎 defined in Eq. (12.8) has the following 𝑟 eigenvalue+eigenvector pairs:

\[ \lambda_j = \omega_r^{j} = \exp\big(2\pi \mathrm{i}\, (j/r)\big) \quad \text{and} \quad |\psi_j\rangle = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} \omega_r^{-j \times k}\, |a^k\rangle_N \]

for 𝑗 = 0, . . . , 𝑟 − 1.
(Margin note: (many) eigenvalues of modular multiplication circuit by 𝑎 isolate order 𝑟)

Proof sketch. Let us start with an explicit calculation for the special case 𝑗 = 0:

\[
\begin{aligned}
\boldsymbol{M}_a |\psi_0\rangle_N
&= \frac{1}{\sqrt{r}} \Big( \boldsymbol{M}_a |1\rangle_N + \boldsymbol{M}_a |a\rangle_N + \cdots + \boldsymbol{M}_a |a^{r-2}\rangle_N + \boldsymbol{M}_a |a^{r-1}\rangle_N \Big) \\
&= \frac{1}{\sqrt{r}} \Big( |a\rangle_N + |a^2\rangle_N + \cdots + |a^{r-1}\rangle_N + |a^r\rangle_N \Big) \\
&= \frac{1}{\sqrt{r}} \Big( |a\rangle_N + |a^2\rangle_N + \cdots + |a^{r-1}\rangle_N + |1\rangle_N \Big) \\
&= (+1) |\psi_0\rangle_N = \omega_r^{0} |\psi_0\rangle_N .
\end{aligned}
\]

Here, we have used |𝑎^𝑟⟩_𝑁 = |1⟩_𝑁 which follows from the fact that 𝑟 is the order of 𝑎 in ℤ_𝑁^∗ (𝑎^𝑟 = 1 mod 𝑁). In short, 𝑴_𝑎|𝜓_0⟩ = 𝜔_𝑟^0 |𝜓_0⟩ which is the defining property of an eigenvector |𝜓_0⟩ with eigenvalue 𝜔_𝑟^0. Verifying the remaining (𝑟 − 1) eigenvalue equations with 𝑗 = 1, . . . , 𝑟 − 1 can be achieved in a similar fashion (hint: use 𝜔_𝑟^𝑟 = 1). We leave it as an exercise. ■
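The eigenvalue equations of Proposition 12.10 can also be checked numerically for small instances. The sketch below represents a state vector as a Python dictionary that maps labels 𝑧 ∈ ℤ_𝑁 to amplitudes; all function names (`order`, `eigenvector`, `apply_M`, `is_eigenpair`) are chosen for illustration and are not part of the lecture notes.

```python
import cmath


def order(a: int, N: int) -> int:
    """Brute-force order of a modulo N (assumes gcd(a, N) = 1)."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r


def eigenvector(a: int, N: int, j: int) -> dict[int, complex]:
    """Amplitudes of |psi_j> = r^(-1/2) sum_k omega_r^(-j k) |a^k mod N>."""
    r = order(a, N)
    return {
        pow(a, k, N): cmath.exp(complex(0, -2 * cmath.pi * j * k / r)) / r ** 0.5
        for k in range(r)
    }


def apply_M(a: int, N: int, state: dict[int, complex]) -> dict[int, complex]:
    """Modular multiplication M_a: relabel |z> -> |a z mod N>."""
    return {(a * z) % N: amp for z, amp in state.items()}


def is_eigenpair(a: int, N: int, j: int, tol: float = 1e-9) -> bool:
    """Check M_a |psi_j> = omega_r^j |psi_j> up to numerical tolerance."""
    r = order(a, N)
    eigenvalue = cmath.exp(complex(0, 2 * cmath.pi * j / r))
    psi = eigenvector(a, N, j)
    image = apply_M(a, N, psi)
    return all(abs(image[z] - eigenvalue * psi[z]) < tol for z in psi)


print(all(is_eigenpair(13, 15, j) for j in range(order(13, 15))))
```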
Exercise 12.11 (eigenvalues and eigenvectors of the modular addition operator). Fix 𝑁 ∈ ℕ, 𝑎 ∈ ℤ_𝑁^∗ and let 𝑟 be the order of 𝑎 in ℤ_𝑁^∗. Identify at least 𝑟 eigenvector-eigenvalue pairs of the modular addition operator 𝑨_𝑎 defined in Eq. (12.7). Hint: take inspiration from our analysis of 𝑴_𝑎 and/or Fourier series with finite periodicity.

12.3.3 Approximate eigenvalues of the modular multiplication circuit via QPE


Proposition 12.10 is noteworthy for the following reason: 𝑴_𝑎 admits a simple reversible circuit (see Exercise 12.9) and the eigenvalue associated with eigenvector |𝜓_𝑗⟩ is a complex phase that encodes the period we are looking for:

\[ \boldsymbol{M}_a |\psi_j\rangle_N = \exp\big(2\pi \mathrm{i}\, (j/r)\big) |\psi_j\rangle_N = \exp\big(2\pi \mathrm{i}\, \theta_j\big) |\psi_j\rangle_N \quad \text{with } \theta_j = j/r . \]
Quantum Phase estimation now looks to be almost tailor-made for this task!
But with quantum phase estimation, we always have to be careful that the
requirements for approximation accuracy are not too large. Otherwise, the
required number 𝑚 of readout qubits can explode and the entire procedure
becomes infeasible. The following fact provides good news: the number of
additional qubits 𝑚 can be chosen to be linear in the original qubit size 𝑛 . This
is as good as it gets.
Fact 12.12 (required accuracy for quantum phase estimation). Suppose that we can run quantum phase estimation with 𝑚 control qubits for 𝑴_𝑎 (unitary matrix) and |𝜓_𝑗⟩ (eigenvector with phase 𝜃_𝑗 = 𝑗/𝑟), where 𝑗 = 0, . . . , 𝑟 − 1:
(Margin note: quantum phase estimation with 𝑚 = 𝑂(𝑛) qubits suffices to read out order 𝑟)

• worst case: 𝑗 = 0, 𝜃_0 = 0/𝑟 = 0 and we learn nothing about the order whatsoever;
• best case: 𝑗 = 1, 𝜃_1 = 1/𝑟 and we can approximate 𝑟 by the closest integer to (𝑦/2^𝑚)^{−1} = 2^𝑚/𝑦.
• general case: 𝜃_𝑗 = 𝑗/𝑟 is an actual fraction. We can use a classical algorithm – called continued fractions – to efficiently find the closest fraction 𝑢/𝑣 in lowest terms satisfying 𝑢, 𝑣 ∈ ℤ_𝑁 (and 𝑣 ≠ 0).
Unless 𝑗 = 0, setting 𝑚 = ⌈log2(2𝑁^2)⌉ ≤ 2𝑛 + 1 is enough to determine 𝑗/𝑟 as fraction with probability (at least) 40%. ■
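The classical post-processing described in Fact 12.12 can be sketched with Python's standard library: `fractions.Fraction.limit_denominator` returns the closest fraction whose denominator does not exceed a given bound, which plays the role of the continued fraction step here. The function name `candidate_order` and the chosen readout values 𝑦 are illustrative assumptions, not measured data.

```python
from fractions import Fraction


def candidate_order(y: int, m: int, N: int) -> int:
    """Denominator v of the fraction u/v (v < N) closest to y / 2^m."""
    return Fraction(y, 2 ** m).limit_denominator(N - 1).denominator


# N = 15, a = 13 (order r = 4), m = 9 readout qubits, so 2^m = 512.
N, a, m = 15, 13, 9
for y in (128, 256, 384):            # idealized readouts for j = 1, 2, 3
    v = candidate_order(y, m, N)
    # A candidate is only accepted if it passes the classical check a^v = 1 mod N.
    print(y, v, pow(a, v, N) == 1)
```

The readout 𝑦 = 256 corresponds to 𝑗/𝑟 = 2/4 = 1/2, so the recovered denominator is only a divisor of 𝑟. The classical consistency check exposes this, and one simply repeats the procedure.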

We refer to standard textbooks for a proof of this fact and a review of the continued fraction algorithm, which also allows us to recover the order 𝑟 itself.
Note also that this fact only guarantees a successful approximation with a
certain probability of success, namely greater than 40%. So, it can happen
that we get unlucky with our quantum phase estimation approximation ⌞𝑦 ⌟ .
This issue can be offset by repeating the entire procedure multiple times and
classically checking the observed outcomes ⌞𝑦𝑘 ⌟ for consistency (is it really a
fraction? is it really minimal? and does the period 𝑟 feature?)
Here is another noteworthy and essential feature that renders the quantum phase estimation circuit more efficient than one might naively assume.
Lemma 12.13 (efficient circuit implementation of 𝑴_𝑎^{2^𝑘}). For each 𝑘 = 0, . . . , 𝑚, we can implement the 2^𝑘-th power 𝑴_𝑎^{2^𝑘} of 𝑴_𝑎 as 𝑴_{𝑎^{2^𝑘}}. The latter is another multiplication matrix that can be implemented with only 𝑂(𝑛^2) elementary quantum gates.

Proof. The following equation is valid for all 𝑛-bit strings 𝑧 ∈ {0, 1}^𝑛:

\[ \boldsymbol{M}_a^{2^k} |z\rangle_N = \underbrace{\boldsymbol{M}_a \times \cdots \times \boldsymbol{M}_a}_{2^k \text{ times}} |z\rangle_N = |a^{2^k} z\rangle_N = \boldsymbol{M}_{a^{2^k}} |z\rangle_N . \]

The technical fineprint³ from above ensures that this suffices to conclude that the two matrices 𝑴_𝑎^{2^𝑘} and 𝑴_{𝑎^{2^𝑘}} must be identical overall. ■
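The multipliers 𝑎^{2^𝑘} mod 𝑁 that label these circuits can be precomputed classically by repeated squaring, so no long product of circuits is ever needed. The following Python sketch illustrates this; the function name `controlled_multipliers` is our own.

```python
def controlled_multipliers(a: int, N: int, m: int) -> list[int]:
    """Return [a^(2^0), a^(2^1), ..., a^(2^(m-1))] modulo N via repeated squaring."""
    multipliers = []
    current = a % N
    for _ in range(m):
        multipliers.append(current)
        # Squaring doubles the exponent: a^(2^k) -> a^(2^(k+1)).
        current = (current * current) % N
    return multipliers


print(controlled_multipliers(13, 15, 4))
print(controlled_multipliers(2, 35, 5))
```

Each squaring is a single modular multiplication of 𝑛-bit numbers, so the whole list costs only 𝑚 such multiplications.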
Lemma 12.13 ensures that we can implement each controlled operation in quantum phase estimation (central part of Fig. 12.1) with only 𝑂(𝑛^2) gates. Since there are 𝑚 = (2𝑛 + 1) such controlled circuits, this produces a total cost of 𝑂(𝑛^3) gates for the central block. The cost of the initial Walsh Hadamard transform is 𝑚 = 2𝑛 + 1 = 𝑂(𝑛) while the cost of the final inverse QFT is 𝑂(𝑚^2) = 𝑂(𝑛^2). This produces a total cost of 𝑂(𝑛) + 𝑂(𝑛^3) + 𝑂(𝑛^2) = 𝑂(𝑛^3) gates to execute the full quantum phase estimation circuit.
(Margin note: efficient implementation of phase estimation circuit (𝑂(𝑛^3) gates))
The only thing missing now is the preparation of an eigenvector |𝜓_𝑗⟩ (ideally with 𝑗 ≠ 0). Explicitly constructing one |𝜓_𝑗⟩ in a deterministic fashion is a daunting task. We can, however, easily prepare the uniform superposition over all possible eigenvectors |𝜓_𝑗⟩. This is the content of the following lemma.
Lemma 12.14 Let |𝜓_𝑗⟩_𝑁 be the eigenvectors of 𝑴_𝑎 defined in Proposition 12.10. Then, we have

\[ \frac{1}{\sqrt{r}} \sum_{j=0}^{r-1} |\psi_j\rangle_N = |1\rangle_N = |0 \cdots 01\rangle . \]

In words: the uniform superposition of all 𝑛-qubit eigenvectors |𝜓_𝑗⟩ is the computational basis state |1⟩_𝑁 = |0 · · · 01⟩ with bit string 0 · · · 01 ∈ {0, 1}^𝑛.
(Margin note: efficient preparation of a superposition of eigenvectors (0 gates))

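Proof. This follows from an identity that we already saw in previous lectures. Summing over all powers of an 𝑟-th root of unity (other than 1) produces 0 (amplitudes cancel):

\[
\begin{aligned}
\frac{1}{\sqrt{r}} \sum_{j=0}^{r-1} |\psi_j\rangle_N
&= \Big( \frac{1}{r} \sum_{j=0}^{r-1} \omega_r^{0} \Big) |1\rangle_N
 + \Big( \frac{1}{r} \sum_{j=0}^{r-1} \omega_r^{-j} \Big) |a\rangle_N
 + \cdots
 + \Big( \frac{1}{r} \sum_{j=0}^{r-1} \omega_r^{-j(r-1)} \Big) |a^{r-1}\rangle_N \\
&= 1 \times |1\rangle_N + 0 \times |a\rangle_N + \cdots + 0 \times |a^{r-1}\rangle_N \\
&= |1\rangle_N .
\end{aligned}
\]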
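The cancellation in this proof can be reproduced numerically for a small instance. The sketch below sums the amplitudes of all 𝑟 eigenvectors from Proposition 12.10, again storing state vectors as dictionaries from labels 𝑧 to amplitudes; the names `order` and `uniform_superposition` are illustrative.

```python
import cmath


def order(a: int, N: int) -> int:
    """Brute-force order of a modulo N (assumes gcd(a, N) = 1)."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r


def uniform_superposition(a: int, N: int) -> dict[int, complex]:
    """Amplitudes of r^(-1/2) * sum_j |psi_j>, indexed by basis label z."""
    r = order(a, N)
    total: dict[int, complex] = {}
    for j in range(r):
        for k in range(r):
            z = pow(a, k, N)
            phase = cmath.exp(complex(0, -2 * cmath.pi * j * k / r))
            # Factor 1/r = (1/sqrt r from the superposition) * (1/sqrt r from |psi_j>).
            total[z] = total.get(z, 0) + phase / r
    return total


state = uniform_superposition(2, 35)
print({z: round(abs(amp), 6) for z, amp in state.items()})
```

Only the label 𝑧 = 1 retains a nonzero amplitude, in agreement with Lemma 12.14.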


This is the last ingredient that we needed to implement order finding on a quantum computer: we can simply initialize the second 𝑛-qubit register in |0 · · · 01⟩ = |1⟩_𝑁 and view this deterministic bit string as a uniform superposition of all 𝑟 eigenvectors |𝜓_𝑗⟩. Linearity of the quantum computing formalism (matrix-vector multiplication) then ensures that the remaining (𝑛 + 𝑚)-qubit circuit produces a superposition of all 𝑟 phase estimation protocols – one for each eigenvector. Reading out the first 𝑚 qubits then collapses this superposition into one particular branch, i.e. one particular 𝑗 = 0, . . . , 𝑟 − 1. Out of these 𝑟 possible trajectories, only 𝑗 = 0 is completely useless. So, with probability (𝑟 − 1)/𝑟 ≥ 1/2, we are in a position to extract useful information about 𝑟. This, however, is contingent on the quantum phase estimation subroutine succeeding. According to Fact 12.8, this happens with probability at least 40%. Total odds of about (𝑟 − 1)/𝑟 × 0.4 ≥ 0.2 are not bad at all. A constant number of repetitions should suffice to learn 𝑗/𝑟 and, by extension, the period 𝑟. This realization deserves a prominent display and even its own synopsis section.

Figure 12.2 Quantum part of Shor’s algorithm: This adaptation of quantum phase estimation is designed to solve the order finding problem for 𝑎 in ℤ_𝑁^∗. The circuit requires 3𝑛 + 1 qubits, where 𝑛 = ⌊log2(𝑁)⌋ + 1, and initializes them in the bit string 0 · · · 01. Subsequently, we apply a Quantum Fourier Transform on the first 𝑚 = (2𝑛 + 1) qubits and follow it up with 𝑚 controlled applications of different arithmetic multiplication circuits 𝑴_{𝑎^𝑘} with 𝑘 = 1, 2, 4, . . . , 2^{𝑚−1}. Finally, we apply an inverse Quantum Fourier Transform on the first 𝑚 = (2𝑛 + 1) qubits and read them out to obtain a bit string representation of 𝑦 ∈ {0, . . . , 2^𝑚 − 1} that is likely to encode the order 𝑟 of 𝑎 in ℤ_𝑁^∗: 𝑦/2^𝑚 ≈ 𝑗/𝑟 for some 𝑗 = 0, . . . , 𝑟 − 1 with constant probability of success (≥ 0.2).

³ This is a continuation of the previous footnote. We let every modular multiplication circuit act trivially on input configurations 𝑦 that don’t encode elements 𝑧 ∈ ℤ_𝑁: 𝑴_𝑎|𝑦⟩ = |𝑦⟩ = 𝑴_𝑎^{2^𝑘}|𝑦⟩.
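To get a feeling for the readout statistics, one can evaluate the phase estimation amplitudes from Lecture 11 for every branch 𝜃_𝑗 = 𝑗/𝑟 and average over the 𝑟 equally likely branches. The Python sketch below does this by direct summation. It is a classical illustration for tiny parameters, not a simulation of the full circuit, and the function name `readout_distribution` is our own.

```python
import cmath


def readout_distribution(r: int, m: int) -> list[float]:
    """Probability of each readout y in {0, ..., 2^m - 1} for order r.

    Branch j has amplitude (1/2^m) * sum_l exp(2 pi i (2^m * j/r - y) l / 2^m);
    the r branches are orthogonal and occur with weight 1/r each.
    """
    size = 2 ** m
    probs = [0.0] * size
    for j in range(r):
        theta = j / r
        for y in range(size):
            amplitude = sum(
                cmath.exp(complex(0, 2 * cmath.pi * (size * theta - y) * l / size))
                for l in range(size)
            ) / size
            probs[y] += abs(amplitude) ** 2 / r
    return probs


# Order r = 4 with m = 5 readout qubits: peaks at multiples of 2^m / r = 8.
probs = readout_distribution(4, 5)
print([y for y, p in enumerate(probs) if p > 0.01])
```

If 𝑟 does not divide 2^𝑚, the peaks are smeared over neighbouring values of 𝑦, which is where the 40% guarantee of Fact 12.8 becomes relevant.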

12.4 Synopsis: implementation of Shor’s algorithm


Theorem 12.15 (Quantum Order Finding). Fix 𝑁 ∈ ℕ, set 𝑛 = ⌊log2(𝑁)⌋ + 1 and let 𝑎 ∈ ℤ_𝑁 be such that gcd(𝑎, 𝑁) = 1. Then, the quantum circuit displayed in Fig. 12.2 is comprised of 𝑂(𝑛^3) elementary one- and two-qubit gates and identifies the smallest integer 𝑟 such that 𝑎^𝑟 = 1 mod 𝑁 with a constant probability of success.
(Margin note: order finding via quantum phase estimation works with gate count 𝑂(𝑛^3))

To paraphrase: this quantum circuit solves the order finding problem


with only 𝑂 (𝑛 3 ) quantum resources!

Note that this circuit has a lot of structure and can be divided into three qualitatively different blocks: a quantum Fourier transform (on one part of the qubits) followed by a classical reversible circuit followed by the inverse quantum Fourier transform (again on one part of the qubits):

\[ \boldsymbol{U} = \big( \mathrm{QFT}_{2^{2n+1}}^{\dagger} \otimes \mathbb{I}^{\otimes n} \big)\, \boldsymbol{R}\, \big( \mathrm{QFT}_{2^{2n+1}} \otimes \mathbb{I}^{\otimes n} \big) . \]

What is more, we apply this circuit to a very simple starting configuration: |0 . . . 01⟩ with bit string in {0, 1}^{3𝑛+1}. This is remarkable, because

\[ \big( \mathrm{QFT}_{2^{2n+1}} \otimes \mathbb{I}^{\otimes n} \big) |\boldsymbol{b}\rangle, \quad \boldsymbol{R} |\boldsymbol{b}\rangle \quad \text{and} \quad \big( \mathrm{QFT}_{2^{2n+1}}^{\dagger} \otimes \mathbb{I}^{\otimes n} \big) |\boldsymbol{b}\rangle \]

would be easy to compute classically for every bit string 𝒃 ∈ {0, 1}^{3𝑛+1}. A sequential combination of these three quantum subroutines, however, would yield an exponential improvement over the best known approach for order finding. This exponential speedup becomes much more relevant if we use it as a (quantum) subroutine in Algorithm 12.1. There, all other computational steps can be executed on a classical computer in cubic runtime 𝑂(𝑛^3). Hence, the following corollary is an immediate consequence of Theorem 12.15 and Theorem 12.5 (reformulate factoring as an order finding problem).
Corollary 12.16 (Efficient hybrid quantum-classical algorithm for Integer Factorization (Shor, 1994)). Let 𝑁 = 𝑝 × 𝑞, with 𝑝, 𝑞 prime and set 𝑛 = ⌊log2(𝑁)⌋ + 1. Then, we can determine one factor (𝑝 or 𝑞) by repeating Algorithm 12.1 sufficiently often. The order finding step, in particular, is outsourced to a (3𝑛 + 1)-qubit architecture which executes a circuit of size 𝑂(𝑛^3). This also bounds the overall cost of all remaining classical computing steps.
(Margin note: hybrid quantum-classical algorithm solves integer factorization at 𝑂(𝑛^3) cost)
Contrast this 𝑂(𝑛^3) scaling with the best known fully classical factorization strategy for 𝑁 = 𝑝 × 𝑞 with 𝑝, 𝑞 prime that scales exponentially in 𝑛^{1/3}.

Problems
Problem 12.17 (quantum circuit implementation of reversible addition and reversible
multiplication). Sketch how you would construct a quantum implementation of
the permutation matrices 𝑨 𝑎 (addition by 𝑎 modulo 𝑁 ) and 𝑴 𝑎 (multiplication
by 𝑎 modulo 𝑁 ) on 𝑛 = ⌊ log2 (𝑁 )⌋ + 1 qubits. Show that the number of
elementary gates required does not exceed 𝑂 (𝑛 2 ) .
Problem 12.18 (eigenvalues and eigenvectors of modular addition operator). Fix 𝑁 ∈ ℕ, 𝑎 ∈ ℤ_𝑁^∗ and let 𝑡 be the smallest integer that obeys 𝑡 × 𝑎 = 1 mod 𝑁. Identify at least 𝑡 eigenvector-eigenvalue pairs of the modular addition operator 𝑨_𝑎 defined in Eq. (12.7). Hint: take inspiration from our analysis of 𝑴_𝑎 and/or Fourier series with finite periodicity.
13. Learning from quantum experiments

Date: 8 January 2025

13.1 Motivation Agenda:


Broadly speaking, the main promise and raison d’être of quantum computers 1 motivation
is that they may have the potential to solve certain problems more efficiently 2 stylized learning chal-
than traditional processing units. Exponential improvements in resource cost lenge: data hiding
are therefore the ultimate objective. 3 two approaches: con-
And we do know a couple of computational problems, where fully functional ventional and quantum-
quantum computers can make a substantial difference. Shor’s quantum-classical enhanced
4 execution on real quan-
approach to integer factorization comes into mind here. Such approaches isolate
tum computer
quantum circuit size (and classical runtime) as the main cost parameter. And
they focus on computational tasks that seem to be hard for existing computing
platforms. This, however, means that the proposed quantum solutions must
also reflect some of this complexity. And this demand goes way beyond the
capabilities of today’s nascent quantum computers. These are not perfect and
each applied gate is prone to errors. This, unfortunately, restricts us to
executing quantum circuits with very short depth. And although the number of
qubits $n$ is growing, the polynomial scaling of famous quantum circuits, like
the $O(n^3)$ size of Shor’s algorithm, grows even faster. So, the more qubits we
use, the larger these circuits must become. For now, this unavoidable scaling
prevents us from executing Shor’s algorithm on existing quantum hardware. If
$n = 53$, then we simply can’t execute on the order of $53^3 \approx 1.5 \times 10^5$ elementary
quantum gates in a sufficiently reliable fashion.
(Margin note: current quantum hardware too noisy to execute large quantum circuits)
But, at the same time, the qubit sizes of existing quantum computing
platforms do become respectable. Google’s Sycamore chip, for instance,
boasts 𝑛 = 53 qubits. So, there should be ‘something’ new and unexpected
that we can actually do with them. (Machine) learning can serve as a guiding
motivation in this regard. There, the broad goal is not to solve a computationally
hard problem, but to learn something that is initially hidden from us. And
such learning processes come with their own reasonable cost parameters, in
particular training data size. This is not a computational cost parameter,
but a statistical one. How much data (information) do we need in order
to distill underlying principles? And this statistical nature plays nicely with
quantum computing architectures which also produce outcome bit strings that
are statistical in nature.
(Margin note: switch cost from runtime (algorithms) to training data size (machine learning))
In fact, this correspondence goes even deeper. Because in quantum comput-
ing, we can manipulate the way we generate data by adjusting the quantum
circuit we use prior to measurements. And it is reasonable to expect that some
quantum circuit executions produce more valuable outcome data than others.
This effect is also well-known in machine learning: ‘good data’ lets you learn
quickly while ‘bad data’ may require a lot more training effort.
And today, we shall use an ML-inspired view on learning problems based on
quantum (computing) experiments to identify exponential quantum advantages.

13.2 Stylized learning challenge: data hiding


Let us approach the broad topic of learning from the quantum world by means
of a concrete example. This example is very stylized, but intended to illustrate
the underlying principles and possibilities. The key idea is based on data
hiding and involves two players. Player 1 possesses a private ternary string
$\boldsymbol{w} \in \{x, y, z\}^n$ of length $n$ and Player 2 wants to learn that string. Player 1
encodes the string $\boldsymbol{w}$ into a quantum state $|\hat{\psi}(\boldsymbol{w})\rangle$. This encoding process is
known to both players. Player 2 can now request copies of this quantum state
from Player 1, but for a price. Let’s say one state copy costs 1000 USD. Being
thrifty, Player 2 wants to learn the message as cheaply as possible, meaning with
as few copies as possible. She can do whatever
she wants with each state copy: apply more gates, leave it lying around for a year,
measure the qubits to gather classical data, you name it. The only
constraint is that she should request as few copies as possible from
Player 1 and that the ultimate goal is to learn the hidden string. And it turns
out that the strategy used to learn the message has a big impact on the
overall cost, i.e. the number of quantum state copies required to learn the
message (with very high probability of success).
(Margin note: stylized data hiding challenge)
As we shall see later on, this stylized scenario is designed such that a fully
quantum learning strategy (exploiting entanglement) is exponentially faster
than any traditional quantum-to-classical learning strategy. However, there
are a couple of steps involved to ensure that speed-up. For example, it is
necessary for the quantum approach to excel that the message consists of three
symbols $\{x, y, z\}$, and not of bits $\{0, 1\}$ as usual. The encoding stage, which
maps $\boldsymbol{w}$ to the $n$-qubit state $|\hat{\psi}(\boldsymbol{w})\rangle$, is also deliberately not ‘too easy’, in order to produce
an actual gap between traditional and quantum-enhanced learning strategies.
Let us get a bit more concrete and jump into the details of preparing the
initial quantum state $|\hat{\psi}_0\rangle$ (onto which the message is then imprinted).
Recall that Player 1 possesses a private ternary string $\boldsymbol{w} \in \{x, y, z\}^n$ and
imprints it into an $n$-qubit state vector. This state vector starts with randomly
initializing the first $n-1$ qubits: $\hat{b}_0, \ldots, \hat{b}_{n-2} \overset{\text{unif}}{\sim} \{0, 1\}$. The final $n$-th qubit
is then initialized such that the parity of the total sum is even:

\[
|\hat{\psi}_0\rangle = |\hat{b}_0, \ldots, \hat{b}_{n-2}, \hat{b}_0 \oplus \cdots \oplus \hat{b}_{n-2}\rangle
\quad \text{with} \quad \hat{b}_0, \ldots, \hat{b}_{n-2} \overset{\text{unif}}{\sim} \{0, 1\}. \tag{13.1}
\]

We use a hat (‘ˆ’) to explicitly highlight the fact that the initialization bits are
randomly generated. The even-parity constraint then fully determines
the value of the last bit, $\hat{b}_{n-1} = \hat{b}_0 \oplus \cdots \oplus \hat{b}_{n-2} \in \{0, 1\}$, in a deterministic
fashion.
(Margin note: randomly generated qubit initialization $|\hat{\psi}_0\rangle$)
Example 13.1 ($|\hat{\psi}_0\rangle$ for $n = 1$ and $n = 2$ qubits). For a single qubit, the
parity constraint is completely binding. The result must be

\[
(n = 1): \quad |\hat{\psi}_0\rangle = |0\rangle \quad \text{with certainty.}
\]

For $n = 2$, there are two bit strings with even parity. The state vector assumes
either initialization with equal probability:

\[
(n = 2): \quad |\hat{\psi}_0\rangle =
\begin{cases}
|0, 0\rangle & \text{with prob. } 1/2, \\
|1, 1\rangle & \text{with prob. } 1/2.
\end{cases}
\]
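The random initialization from Eq. (13.1) is purely classical bookkeeping, so it can be mimicked in a few lines. The following is a minimal sketch using only the Python standard library; the function names `sample_initial_bits` and `all_even_parity_strings` are our own illustrative choices and not part of the lecture.

```python
import itertools
import random


def sample_initial_bits(n, rng=random):
    """Draw the n classical bits that label the initial state in Eq. (13.1)."""
    # The first n - 1 bits are uniformly random.
    free_bits = [rng.randint(0, 1) for _ in range(n - 1)]
    # The last bit is the XOR of all free bits, which enforces even parity.
    parity_bit = sum(free_bits) % 2
    return free_bits + [parity_bit]


def all_even_parity_strings(n):
    """Enumerate every n-bit string whose bits sum to an even number."""
    return [bits for bits in itertools.product((0, 1), repeat=n)
            if sum(bits) % 2 == 0]


for n in (1, 2, 3):
    # The number of admissible strings is 2^(n-1).
    print(n, all_even_parity_strings(n))
```

For $n = 1$ and $n = 2$ the enumeration reproduces the two cases of Example 13.1: a single admissible string `(0,)`, and the two strings `(0, 0)` and `(1, 1)`.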

The randomization of the initial quantum state used to imprint one and the same
message ensures that the number of copies needed grows exponentially whenever
we do not choose the optimal quantum strategy.
The encoding strategy $\boldsymbol{w} \mapsto |\hat{\psi}(\boldsymbol{w})\rangle = \boldsymbol{U}(\boldsymbol{w})|\hat{\psi}_0\rangle$ is known to both players
and will be explained in the next section. At this point it is worthwhile to
emphasize that Player 2 will have to read out the qubits involved in order to
get any actionable advice at all. And measurements destroy the underlying
quantum state (collapse of the wave function). So Player 2 must be prepared
to invest several 1000 USD to learn anything at all.
(Margin note: secret $n$-trit string is imprinted on $|\hat{\psi}_0\rangle$)
The overarching questions now are:

1 How does the number of state copies required (i.e. the money spent),
scale with the number of qubits 𝑛 and, by extension, with the size of the
hidden string? This is our toy model for training data size. It is reasonable
to expect that this cost will grow. But, does it grow polynomially (efficient
scaling) or exponentially (prohibitively expensive scaling)?
2 Does the way the state copies are processed have an impact on this cost
parameter? After all, different ways of accessing this quantum state
vector may lead to readout bits that carry more or less information about
the underlying secret. This is our way of varying the quality of training
data.
Today, we construct a variant of this data-hiding game, where discrepancies are
as pronounced as possible. Any conceivable quantum-classical learning approach
that only addresses individual copies of $|\hat{\psi}(\boldsymbol{w})\rangle$ must scale exponentially with
qubit size $n$. We will call this the conventional approach and an exponential
scaling is probably not too surprising. After all, there are $3^n$ possibilities for
the hidden string $\boldsymbol{w} \in \{x, y, z\}^n$. What is surprising is that we also offer
an alternative that is exponentially more efficient. It is possible to construct
a quantum-enhanced readout protocol that processes pairs of $|\hat{\psi}(\boldsymbol{w})\rangle$ in a
genuine quantum fashion. We call this the quantum-enhanced approach. It
is cheap (short circuits) and uncovers the hidden string after only a linear
number of repetitions. In turn, the costs associated with the two secret learning
approaches differ exponentially in the size $n$ of the task:

\[
T_{\mathrm{conv}}(n) = 2^{\Omega(n)} \quad \text{while} \quad T_{\mathrm{qe}}(n) = O(n).
\]

The first statement highlights that the cost for any conventional approach scales
(at least) exponentially in the number of qubits $n$. The second statement, in
stark contrast, states that there is a quantum-enhanced approach whose cost scales (at
most) linearly in the number of qubits $n$.
(Margin note: exponential discrepancy in cost to learn hidden secret)

13.2.1 Encoding strategy


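We are now ready to explain the high-level rules of our learning challenge that
result in the provably exponential cost discrepancy displayed above. To
this end, we must first specify the encoding procedure

\[
\{x, y, z\}^n \ni \boldsymbol{w} \mapsto |\hat{\psi}(\boldsymbol{w})\rangle = \boldsymbol{U}(\boldsymbol{w})|\hat{\psi}_0\rangle \in \left(\mathbb{C}^2\right)^{\otimes n},
\]

where $|\hat{\psi}_0\rangle$ is the randomly initialized $n$-qubit state from Eq. (13.1) (with an
even-parity constraint). The reason why this string involves trits instead of
bits is based on the elementary building blocks of our encoding. It uses the
three most prominent single-qubit gates, namely identity, Hadamard and phase
gates:

\[
\mathbb{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\boldsymbol{H} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad
\boldsymbol{S} = \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix}.
\]

We will use these gates to imprint the secret key onto (probabilistic mixtures
of) $n$-qubit computational basis states. For $w \in \{x, y, z\}$, we set

\[
\boldsymbol{V}_z = \mathbb{I}, \quad \boldsymbol{V}_x = \boldsymbol{H} \quad \text{and} \quad \boldsymbol{V}_y = \boldsymbol{S} \times \boldsymbol{H}. \tag{13.2}
\]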

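The labels $x, y, z$ reflect an intimate connection between these three unitaries
and the three Pauli matrices.
Exercise 13.2 (connection between Eq. (13.2) and Pauli matrices). The three
non-trivial Pauli matrices are defined as

\[
\boldsymbol{X} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\boldsymbol{Y} = \begin{pmatrix} 0 & -\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix}, \quad
\boldsymbol{Z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]

Verify the following relation between these Pauli matrices and the unitary
transformations in Eq. (13.2):

\[
\boldsymbol{X} = \boldsymbol{V}_x \times \boldsymbol{Z} \times \boldsymbol{V}_x^\dagger, \quad
\boldsymbol{Y} = \boldsymbol{V}_y \times \boldsymbol{Z} \times \boldsymbol{V}_y^\dagger, \quad
\boldsymbol{Z} = \boldsymbol{V}_z \times \boldsymbol{Z} \times \boldsymbol{V}_z^\dagger.
\]

In other words, $\boldsymbol{V}_w$ maps the eigenvectors $|0\rangle, |1\rangle$ of $\boldsymbol{Z}$ onto eigenvectors of the Pauli matrix with label $w$.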
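A quick numerical sanity check of how the three unitaries from Eq. (13.2) relate to the Pauli matrices can complement the pen-and-paper verification. The sketch below conjugates $\boldsymbol{Z}$ as $\boldsymbol{V}_w \boldsymbol{Z} \boldsymbol{V}_w^\dagger$ and compares the result with the Pauli matrix of the same label. It uses plain Python lists instead of a linear algebra library; the helper names `matmul`, `dagger` and `close` are our own.

```python
import math

SQ = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[SQ, SQ], [SQ, -SQ]]
S = [[1, 0], [0, 1j]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def close(a, b, tol=1e-12):
    """Entry-wise comparison up to floating-point rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))


# Encoding unitaries from Eq. (13.2) and the Pauli matrix with the same label.
V = {"x": H, "y": matmul(S, H), "z": I2}
PAULI = {"x": X, "y": Y, "z": Z}

for w in "xyz":
    conjugated = matmul(matmul(V[w], Z), dagger(V[w]))
    print(w, close(conjugated, PAULI[w]))
```

All three comparisons evaluate to `True`. For the label $y$ the order of the factors matters: conjugating the other way around, $\boldsymbol{V}_y^\dagger \boldsymbol{Z} \boldsymbol{V}_y$, yields $\boldsymbol{X}$ rather than $\boldsymbol{Y}$.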
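These three unitaries allow us to imprint exactly one trit $w \in \{x, y, z\}$ on a
single-qubit computational basis state. This encoding strategy readily extends
to $n$-trit strings $\boldsymbol{w} = (w_0, \ldots, w_{n-1}) \in \{x, y, z\}^n$ and $n$ qubits:

\[
|\hat{\psi}(\boldsymbol{w})\rangle = \boldsymbol{U}(\boldsymbol{w})|\hat{\psi}_0\rangle
\quad \text{with} \quad
\boldsymbol{U}(\boldsymbol{w}) = \boldsymbol{V}_{w_0} \otimes \cdots \otimes \boldsymbol{V}_{w_{n-1}}. \tag{13.3}
\]

Here, $|\hat{\psi}_0\rangle = |\hat{b}_0, \ldots, \hat{b}_{n-2}, \hat{b}_0 \oplus \cdots \oplus \hat{b}_{n-2}\rangle$ is the randomly constructed input
state from Eq. (13.1). With the encoding strategy at hand, we can introduce
the two different approaches on how to access this quantum state.
(Margin note: concrete encoding strategy)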
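To make the encoding tangible, here is a small sketch that builds the amplitude vector of one encoded state for a given trit string and bit string. It assumes that the encoding acts qubit by qubit, i.e. it applies $\boldsymbol{V}_{w_k}$ from Eq. (13.2) to the $k$-th qubit, and it orders amplitudes with the first qubit as the most significant bit. The names `ENCODED`, `kron`, `encode` and `sample_encoded_state` are illustrative only.

```python
import math
import random

SQ = 1 / math.sqrt(2)
# Single-qubit states V_w|0> and V_w|1> for the encoders of Eq. (13.2).
ENCODED = {
    "z": {0: [1, 0], 1: [0, 1]},                  # V_z = identity
    "x": {0: [SQ, SQ], 1: [SQ, -SQ]},             # V_x = H
    "y": {0: [SQ, 1j * SQ], 1: [SQ, -1j * SQ]},   # V_y = S H
}


def kron(u, v):
    """Kronecker product of two amplitude vectors."""
    return [a * b for a in u for b in v]


def encode(trits, bits):
    """Amplitudes of V_{w_0}|b_0> (x) ... (x) V_{w_{n-1}}|b_{n-1}>."""
    state = [1]
    for w, b in zip(trits, bits):
        state = kron(state, ENCODED[w][b])
    return state


def sample_encoded_state(trits, rng=random):
    """Draw even-parity bits as in Eq. (13.1), then encode the trit string."""
    free = [rng.randint(0, 1) for _ in range(len(trits) - 1)]
    bits = free + [sum(free) % 2]
    return bits, encode(trits, bits)


bits, state = sample_encoded_state("xyz")
print(bits, len(state))  # three qubits give 2**3 = 8 amplitudes
```

Note that the vector length doubles with every qubit, so this brute-force representation is only meant for small $n$.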

13.2.2 Conventional approach


Our first approach is inspired by the way people have performed experiments
for centuries now. This is why we dub it the conventional approach. The key
idea consists of three steps:
(i) acquire a scientific probe/sample (buy the state |𝜓ˆ (𝒘 )⟩ ),
(ii) do something with the given probe (apply a unitary circuit),
(iii) perform an $n$-qubit readout to observe the underlying behavior.
These three basic steps are then repeated many times to get enough data in
order to draw credible conclusions. Here is an illustration of such an approach for
our hidden-data challenge:

[Circuit illustration of the conventional approach (sequential): modify single states, readout, repeat.] (13.4)
In this illustration, we always start with buying a single copy of $|\hat{\psi}(\boldsymbol{w})\rangle$. We
then apply a unitary circuit (blue) and perform an $n$-qubit readout to get a
classical bit string (magenta). This conversion is probabilistic and destructive.
Hence, a single run of this procedure will not provide us with enough statistical
data. This is why we repeat it 𝑇conv times. Note, however, that the type
of experiment – i.e. the choice of unitary circuit – can, and should, change
between repetitions. This allows us, for instance, to sequentially check different
aspects of the underlying state |𝜓ˆ (𝒘 )⟩ one at a time. But more sophisticated
methods can also come into play. For instance, we could use powerful learning
algorithms in order to execute optimised scheduling procedures to make the
most of the data we generate. Machine learning comes to mind here. What
is more, we don’t impose any constraint on the computational cost associated
with individual unitary transformations. Every conceivable quantum circuit
(even if it is arbitrarily large) is fair game.
Although brief and high-level, these arguments should highlight that the
conventional readout approach from Eq. (13.4) is extremely general. It virtually
encompasses every conceivable readout strategy that uses quantum measure-
ments in a sequential fashion to learn something about a quantum system.
Given this level of generality – and the absence of any computational restrictions
on both the quantum computation and the conventional data-processing – the
following rigorous result might be surprising.

Theorem 13.3 (Lower bound on any conventional strategy [HKP21]). Consider
the data hiding strategy $\{x, y, z\}^n \ni \boldsymbol{w} \mapsto |\hat{\psi}(\boldsymbol{w})\rangle$ from Eq. (13.3) for
$n$ qubits. Then, any conventional (data) readout procedure illustrated in
Eq. (13.4) requires
\[
T_{\mathrm{conv}}(n) = 2^{\Omega(n)}
\]
repetitions to recover $\boldsymbol{w} \in \{x, y, z\}^n$ with probability of success > 50%.

The big-Ω notation highlights that the cost grows (at least) exponentially
in 𝑛 . This is a very general statement and the proof is not easy. Technically
speaking it requires an additional randomization over one sign factor for each
qubit and employs advanced proof techniques from statistical learning theory,
probability theory and quantum computing. Such a level of sophistication
is necessary to handle arbitrary quantum circuits and arbitrary classical data
processing, including any machine learning model.
The underlying idea, however, is rather simple and boils down to our data
hiding strategy (13.3). Although it looks simple, it does imprint the secret trit
string into an $n$-qubit system $|\hat{\psi}(\boldsymbol{w})\rangle$ that must be interpreted as a probability
distribution over a total of $2^n/2 = 2^{n-1}$ different quantum state vectors

\[
\boldsymbol{U}(\boldsymbol{w})|b_0, \ldots, b_{n-2}, b_0 \oplus \cdots \oplus b_{n-2}\rangle
= \boldsymbol{V}_{w_0}|b_0\rangle \otimes \cdots \otimes \boldsymbol{V}_{w_{n-1}}|b_0 \oplus \cdots \oplus b_{n-2}\rangle
\]

with $(b_0, \ldots, b_{n-2}) \in \{0, 1\}^{n-1}$. So there are a lot of degrees of freedom
available to maliciously hide even $3^n$ different trit strings. What is more, the
encoding strategy is based on very specific single-qubit unitaries that are as
different from each other as possible. This ensures that we actually occupy
radically different corners of this huge $n$-qubit amplitude space. And this makes
it extremely difficult to do anything (substantially) smarter than iteratively
asking binary questions: is $\boldsymbol{w} = \boldsymbol{v}$, where $\boldsymbol{v} \in \{x, y, z\}^n$ is our current best
guess? And since there are $3^n$ possible guesses, we are forced to ask this
question exponentially often (in $n$). We will sketch a single-qubit caricature of
this effect in Annex 1 below.

13.2.3 Quantum-enhanced approach


In the previous section, we have outlined remarkably general arguments that
seem to suggest that the data hiding procedure from Eq. (13.3) is very secure.
In our data hiding game, this means that Player 2 is forced to buy exponentially
many copies of |𝜓ˆ (𝒘 )⟩ in order to learn a secret string 𝒘 ∈ {𝑥, 𝑦 , 𝑧 }𝑛 . This
holds despite the fact that the underlying data extraction model looks very
general and powerful. However, the conventional approach from the previous
section is still deeply rooted in a conventional way of thinking. And if one is
willing to accept the quantum computing paradigm, it is not the only way to
approach this challenge.
If we assume access to a (sufficiently large) quantum computer, we can
envision processing multiple copies of the unknown state $|\hat{\psi}(\boldsymbol{w})\rangle$ at the same
time and within the same quantum circuit. The easiest setup for such a quantum
parallel data processing routine involves two state copies and looks as follows:

[Circuit illustration of the quantum-enhanced approach (parallel): modify state pairs, readout, repeat.] (13.5)

Here, $\hat{\boldsymbol{U}}$ can be an arbitrary quantum circuit on $2n$ qubits ($n$ qubits for the first
state copy and another $n$ qubits for the second state copy).
It is easy to see that this framework is at least as general as the conventional
approach. After all, we can always choose to divide up the general 2𝑛 -qubit
unitary circuit into two parallel (and uncorrelated) 𝑛 -qubit unitary circuits, i.e.
𝑼ˆ = 𝑼 1 ⊗ 𝑼 2 . Doing so effectively reduces one run of this two-copy protocol
into two independent runs of the conventional (single-copy) protocol. But
from this new point of view, such a Kronecker product construction starts to
look rather restrictive. What if we instead use this additional expressiveness
to insert correlations (think: CNOTs) and superpositions (think: Hadamards)
between qubits from the first copy of |𝜓ˆ (𝒘 )⟩ (top) and the second copy of
|𝜓ˆ (𝒘 )⟩ (bottom)? Thinking further along these lines highlights that this new,
quantum-enhanced approach is capable of doing something that the conventional
approach cannot fully mimic (at least not without an exponential overhead in
repetitions): creating entanglement between qubits from each copy directly
at the quantum level. The following constructive result highlights that such
quantum-enhanced readout protocols become a game changer for our data
hiding challenge:

Theorem 13.4 (Upper bound for a fixed quantum-enhanced strategy [HKP21]).
Consider the data hiding strategy $\{x, y, z\}^n \ni \boldsymbol{w} \mapsto |\hat{\psi}(\boldsymbol{w})\rangle$ from Eq. (13.3)
for $n$ qubits. Then, there is a simple quantum-enhanced procedure that
allows for uncovering the hidden string $\boldsymbol{w}$ after already

\[
T_{\mathrm{qe}} = O(n)
\]

repetitions. The quantum part of this procedure is very cheap and illustrated
in Fig. 13.1. A total of $T_{\mathrm{qe}}$ repetitions produce enough statistical data to
check $\boldsymbol{v} \overset{?}{=} \boldsymbol{w}$ for every candidate $\boldsymbol{v} \in \{x, y, z\}^n$ in linear time.

In contrast to Theorem 13.3, this result is constructive. We have one
concrete solution circuit and need to show that it actually works (efficiently).
In turn, the actual proof is also much simpler than the general no go statement
in Theorem 13.3. In fact, a thorough analysis of the single-qubit case (𝑛 = 1)
already conveys much of the main ideas. We will provide such an analysis
in Annex 2 below. The key idea is to use Bell-type measurements on both
single-qubit states to unravel information about all possible encoding unitaries
𝑼 (𝑤 ) at once. This general idea extends to 𝑛 -qubit systems, because we have
fine-tuned the (randomized) initial state |𝜓ˆ0 ⟩ from Eq. (13.1) in precisely the
right way to make it work.
To summarize: Our data imprinting strategy is designed to be hard for
conventional readout procedures, but also contains a trapdoor-type feature:
it becomes very easy to crack with a simple quantum-enhanced (two-copy)
procedure. This is deliberate: after all, we are looking for stylized example
challenges that herald an exponential quantum advantage. And the discrepancy
between Theorem 13.3 (exponential lower bound on any conventional strategy)
and Theorem 13.4 (linear upper bound for one quantum-enhanced strategy)
achieves just that. At this point, it is worthwhile to re-emphasize that this
exponential discrepancy does not (necessarily) manifest itself in algorithm
runtime. Instead, it targets the number of repetitions that are required to
learn the underlying secret. This is an appropriate and very general model for
training data size in machine learning.

Figure 13.1 Quantum-enhanced learning protocol: this $2n$-qubit circuit processes
two copies of $|\hat{\psi}(\boldsymbol{w})\rangle$ in parallel. The circuit executes a total of $n$ Bell basis
measurements that each connect the $k$-th qubit of the first copy with the
$k$-th qubit of the second copy. This readout stage requires exactly $n$ single-qubit
(Hadamard) gates and exactly $n$ two-qubit CNOT gates. Each data
hiding state is also comparatively cheap to create: we need two random $n$-bit
initializations and (at most) $2n$ single-qubit gates ($\mathbb{I}$, $\boldsymbol{H}$, $\boldsymbol{S} \times \boldsymbol{H}$). Viewed as one
circuit from beginning to end, this demonstration requires $2n$ qubits, (at most)
$3n$ single-qubit gates and exactly $n$ CNOT gates.

13.3 Demonstration on an actual quantum computer


In the previous section, we have laid out an alternative approach towards
quantum advantage. We have presented a stylized learning challenge where
we first imprint a trit string 𝒘 ∈ {𝑥, 𝑦 , 𝑧 }𝑛 onto a (classically) randomized
𝑛 -qubit quantum state and subsequently analyze how many experiments it
takes to acquire enough (training) data in order to confidently recover this now
hidden string. And we have seen that the type of data acquisition makes a huge
difference: any conventional data acquisition procedure necessarily requires
exponentially many repetitions (in 𝑛 ) while a quantum-enhanced approach
gets by with only a linear number of repetitions. And, what is more, all steps in
the quantum-enhanced protocol are relatively simple, even for large qubit sizes
𝑛.
Let us first see how expensive it is to create $|\hat{\psi}(\boldsymbol{w})\rangle$ for a fixed (and arbitrary)
trit string $\boldsymbol{w} \in \{x, y, z\}^n$. To this end, we must first prepare $|\hat{\psi}_0\rangle$ which is a
(classical) probabilistic average over all possible $2^{n-1}$ input bit configurations
$|b_0, \ldots, b_{n-2}, b_0 \oplus \cdots \oplus b_{n-2}\rangle$ with even parity. We can effectively
generate such a state by sampling $(n-1)$ classical bits $\hat{b}_0, \ldots, \hat{b}_{n-2}$ uniformly
at random and initializing the $n$ qubits in $|\hat{\psi}_0\rangle = |\hat{b}_0, \ldots, \hat{b}_{n-2}, \hat{b}_{n-1}\rangle$ with
$\hat{b}_{n-1} = \hat{b}_0 \oplus \cdots \oplus \hat{b}_{n-2}$ (note that $\hat{b}_{n-1}$ is itself not random, but fully determined
by the first $(n-1)$ random bits). Subsequently, we need to imprint the $n$-trit
string $\boldsymbol{w} = (w_0, \ldots, w_{n-1}) \in \{x, y, z\}^n$ by applying $\boldsymbol{V}_{w_0} \otimes \cdots \otimes \boldsymbol{V}_{w_{n-1}}$ to our
randomized initial state $|\hat{b}_0, \ldots, \hat{b}_{n-1}\rangle$ (with even parity). But this is
also cheap, because we apply one single-qubit unitary to each qubit.
Next, we need to have a look at the different readout protocols. Conventional
readout protocols are quite tricky, because the model is very general. Any
collection of unitary circuits is fair game, even ones with exponential circuit
size. Fortunately, implementing a conventional strategy is not the main point
here. Rigorous math tells us that any such approach must be bad! So, let
us instead focus on the quantum-enhanced protocol. There, we must first
prepare two copies of |𝜓ˆ (𝒘 )⟩ in parallel. This is not difficult, but requires 2𝑛
available (and working) qubits. If we have a quantum computer with, say, 53
qubits, then we can only play the data hiding game up to 𝑛 = ⌊ 53/2⌋ = 26
qubits. The subsequent entangling procedure involves a total of 𝑛 parallel
CNOT gates followed by 𝑛 parallel Hadamard gates. And, finally, we need to
perform measurements on all 2𝑛 qubits involved. This is again a routine task
for any working quantum computer. So, in summary, a full execution of our
data hiding game with quantum-enhanced readout boils down to the following
depth-4 circuit which involves (at most) $4n$ gates:

[Circuit diagram of the full data hiding game with quantum-enhanced readout.] (13.6)
Note that all CNOT gates affect different qubit pairs and can therefore be
stacked into a single gate layer of depth one. This looks doable even on existing
quantum computers that are noisy (which limits available circuit depth) and
comparatively small (which limits the maximum 𝑛 we can go to). And, in
fact, it has been done. Google – who operates one of the largest and most
reliable quantum computers to date – teamed up with researchers from Caltech
and JKU (yours truly) to actually run a slight modification of our data hiding
game [Hua+22]. The results are depicted in Figure 13.2.
[Plot, panel (c); legend: Prediction Accuracy (Q), Prediction Accuracy (C), Training Loss (Q), Rigorous LB (C).]
Figure 13.2 Empirical performance of conventional (gray) and quantum-enhanced
(purple) learning on the 53-qubit Google Sycamore chip [Hua+22]: The task
is to recover a hidden trit string $\boldsymbol{w} \in \{x, y, z\}^n$ that is imprinted onto a
randomized $n$-qubit state $|\hat{\psi}(\boldsymbol{w})\rangle$. This log plot displays the number of
quantum executions/training data size as a function of qubit number $n$. The
dashed line illustrates the fundamental lower bound from Theorem 13.3.
(Margin note: data hiding case study on 53-qubit processor (Google))

For technical reasons (see footnote 1) we could only scale up to $n = 20$ (for one $|\hat{\psi}(\boldsymbol{w})\rangle$),
which translates to $2n = 40$ qubits in total. But this is enough to actually
witness a substantial discrepancy between conventional strategies and
quantum-enhanced strategies: $2^{20} \approx 1.05 \times 10^6$ is much larger than $n = 20$. The plot
in Fig. 13.2 shows how this fundamental difference in scaling manifests itself in
actual executions of the learning challenge. The dashed line is the lower
bound from Theorem 13.3. Any conventional readout strategy must be
north-west of this exponential growth line. The green dots depict one such strategy
which essentially involves randomly guessing the correct imprinting unitaries
in a hardware-friendly and automated fashion. The purple dots are where
things get interesting. These depict the performance of the quantum-enhanced
readout protocol and it is way south-west of the dashed line. This is strong
empirical support for the theoretical assertion from Theorem 13.4 and one of
the first large-scale demonstrations of a quantum advantage on real quantum
hardware.

Annex 1: single-qubit analysis of a conventional approach


For $n = 1$ qubit, our initial state collapses to a simple (deterministic) initialization:

\[
|\hat{\psi}_0\rangle = |0\rangle.
\]

Footnote 1: One qubit is broken and connectivity becomes an issue.
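Now, let us implement the encoding strategy. It is based on mapping a single
trit $w \in \{x, y, z\}$ onto this initial state by applying a single-qubit unitary
that depends on $w$. According to our encoding strategy, we have $\boldsymbol{V}_x = \boldsymbol{H}$
(Hadamard), $\boldsymbol{V}_y = \boldsymbol{S} \times \boldsymbol{H}$ (Hadamard+phase) and $\boldsymbol{V}_z = \mathbb{I}$ (do nothing). It is
easy to determine the resulting states:

\[
|\hat{\psi}(w)\rangle = \boldsymbol{V}_w|0\rangle =
\begin{cases}
\boldsymbol{H}|0\rangle = |+\rangle & \text{if } w = x, \\
\boldsymbol{S} \times \boldsymbol{H}|0\rangle = |\mathrm{i}+\rangle & \text{else if } w = y, \\
\mathbb{I}|0\rangle = |0\rangle & \text{else if } w = z.
\end{cases}
\]

Note that both

\[
|+\rangle = \boldsymbol{H}|0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)
\quad \text{and} \quad
|\mathrm{i}+\rangle = \boldsymbol{S} \times \boldsymbol{H}|0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + \mathrm{i}|1\rangle\right)
\]

are perfect superpositions of $|0\rangle$ and $|1\rangle$, but the amplitudes differ.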


The task is now to devise a quantum circuit that can identify the single
trit 𝑤 ∈ {𝑥, 𝑦 , 𝑧 } hidden within |𝜓ˆ (𝑤 )⟩ . We will not do the fully general case
advertised in Sec. 13.2, but instead, consider a problem-informed special case.
Since the encoding strategy is known to us, we can try to undo it. Here is a
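guessing circuit that attempts to achieve this goal:

[Circuit diagram of the guessing circuit: apply $\boldsymbol{V}_v^\dagger$ for a guess $v$ to $|\hat{\psi}(w)\rangle = \boldsymbol{V}_w|0\rangle$, then read out the qubit.] (13.7)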
Here, we have already suggestively included the encoding procedure on the
right hand side. A quick look at this single-qubit circuit architecture already
tells us that two things can happen:

1 $v = w$ (correct guess): in this case, the two unitaries in Eq. (13.7) cancel
out and we end up simply measuring the zero state $|0\rangle$. More formally,

\[
\Pr_{\boldsymbol{V}_w^\dagger|\hat{\psi}(w)\rangle}[0] = \left|\langle 0|\boldsymbol{V}_w^\dagger \boldsymbol{V}_w|0\rangle\right|^2 = \left|\langle 0|\mathbb{I}|0\rangle\right|^2 = 1.
\]

(Here we have stuck to the density matrix formalism, because it is this
one that will generalize to $n$ qubits).

2 $v \neq w$ (incorrect guess): in this case, the two unitaries $\boldsymbol{V}_w$ and $\boldsymbol{V}_v$ cannot
cancel, but their combination produces a state that is always in perfect
superposition between $|0\rangle$ and $|1\rangle$. More formally,

\[
\Pr_{\boldsymbol{V}_v^\dagger|\hat{\psi}(w)\rangle}[0] = \left|\langle 0|\boldsymbol{V}_v^\dagger \boldsymbol{V}_w|0\rangle\right|^2 = \frac{1}{2}
\quad \text{for all } v, w \in \{x, y, z\} \text{ with } v \neq w.
\]

We leave a derivation as an instructive exercise.
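Exercise 13.5 Verify the following general equation for $v, w \in \{x, y, z\}$ by
directly computing all $3 \times 3 = 9$ state vector amplitudes:

\[
\Pr_{\boldsymbol{V}_v^\dagger \boldsymbol{V}_w|0\rangle}[0] = \left|\langle 0|\boldsymbol{V}_v^\dagger \times \boldsymbol{V}_w|0\rangle\right|^2 =
\begin{cases}
1 & \text{if } v = w, \\
1/2 & \text{else if } v \neq w.
\end{cases}
\]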
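The nine amplitudes can also be tabulated numerically as a cross-check of the pen-and-paper calculation. The sketch below relies on the observation that $\langle 0|\boldsymbol{V}_v^\dagger \boldsymbol{V}_w|0\rangle$ is simply the inner product between the encoded states $\boldsymbol{V}_v|0\rangle$ and $\boldsymbol{V}_w|0\rangle$; the dictionary `ENCODED_ZERO` and the function `prob_zero` are our own naming.

```python
import math

SQ = 1 / math.sqrt(2)
# Encoded single-qubit states V_w|0> from Eq. (13.2), as amplitude pairs.
ENCODED_ZERO = {
    "x": (SQ, SQ),        # H|0>   = |+>
    "y": (SQ, 1j * SQ),   # S H|0> = |i+>
    "z": (1, 0),          # I|0>   = |0>
}


def prob_zero(v, w):
    """Probability of reading 0 after applying V_v^dagger to V_w|0>."""
    a, b = ENCODED_ZERO[v], ENCODED_ZERO[w]
    # <0| V_v^dagger V_w |0> is the inner product of V_v|0> with V_w|0>.
    overlap = sum(complex(x).conjugate() * y for x, y in zip(a, b))
    return abs(overlap) ** 2


# Rows are indexed by the guess v, columns by the hidden trit w.
for v in "xyz":
    print(v, [round(prob_zero(v, w), 3) for w in "xyz"])
```

Up to floating-point rounding, the diagonal entries equal 1 and all off-diagonal entries equal 1/2.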

It is worthwhile to emphasize that the two possible outcome distributions
are radically different. One (correct guess) produces a deterministic outcome
of always 0, while the other (incorrect guess) produces a uniform distribution,
where 0 and 1 both occur with probability 1/2. It is very easy to distinguish
these two situations by re-running the quantum circuit a few times. If we ever
observe a 1, we know that our guess must have been incorrect. And 1 occurs
with probability 1/2, so we don’t have to wait very long. Already 4 quantum
circuit executions are enough to push the failure probability down to $(1/2)^4 \approx 0.06$ (why?).
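The failure probability of this test is easy to quantify: a wrong guess goes undetected only if all runs return 0, which happens with probability $(1/2)^k$ for $k$ independent runs. The following short sketch tabulates this quantity and computes how many runs a given target requires; the function names are illustrative.

```python
import math


def failure_probability(runs):
    """Chance that a wrong guess yields only 0s in `runs` independent runs."""
    return 0.5 ** runs


def runs_needed(target):
    """Smallest number of runs with failure probability at most `target`."""
    return math.ceil(math.log2(1 / target))


for runs in range(1, 7):
    print(runs, failure_probability(runs))
print(runs_needed(0.05))
```

Four runs leave a failure probability of 1/16 = 0.0625, while a strict target of 0.05 requires five runs (1/32 ≈ 0.031).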
But this is also where the issues arise. Re-running the test circuit from
Eq. (13.7) only ever allows us to check whether our current guess 𝑣 equals the
hidden trit 𝑤 . If this is not the case, i.e. 𝑣 ≠ 𝑤 , we do not get any actionable
advice on what string to try next. After all, every possible combination of 𝑣
(guess) and 𝑤 with 𝑣 ≠ 𝑤 produces the same (uniform) outcome distribution.
So, what does that mean for us if we want to recover a hidden data trit
𝑤 ∈ {𝑥, 𝑦 , 𝑧 } that is encoded into |𝜓ˆ (𝑤 )⟩ ? The strategy displayed above
only really leaves us with one option: making random guesses 𝑣 and using
repetitions of the corresponding quantum circuit to check whether 𝑣 = 𝑤 . For
$n = 1$ (a single trit of hidden data) this is not too costly (yet). There are only
$3^1 = 3$ possibilities and we obtain

\[
T_{\mathrm{conv}}(1) \gtrsim 4 \times 3^1, \tag{13.8}
\]

where 4 takes into account the (expected) number of circuit evaluations required
to distinguish a deterministic distribution (correct guess) from the uniform one
(incorrect guess).
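The guess-and-check strategy for $n = 1$ can be simulated classically, because we know the outcome statistics of the guessing circuit: a correct guess always returns 0, an incorrect guess returns a fair coin flip. The sketch below is a toy model under exactly this assumption and does not simulate any quantum hardware. It tries the three candidates in random order and spends at most 4 copies per candidate, as in Eq. (13.8); all names are our own.

```python
import random

TRITS = ("x", "y", "z")
RUNS_PER_GUESS = 4  # repetitions spent on each candidate, as in Eq. (13.8)


def readout(guess, secret, rng):
    """One run of the guessing circuit: 0 if guess is right, else a fair coin."""
    return 0 if guess == secret else rng.randint(0, 1)


def guess_and_check(secret, rng):
    """Try candidates in random order; return (accepted guess, copies used)."""
    copies = 0
    order = list(TRITS)
    rng.shuffle(order)
    for guess in order:
        outcomes = []
        for _ in range(RUNS_PER_GUESS):
            copies += 1
            outcomes.append(readout(guess, secret, rng))
            if outcomes[-1] == 1:
                break  # a single 1 rules this guess out
        if all(o == 0 for o in outcomes):
            return guess, copies  # never saw a 1: accept this guess
    return None, copies


rng = random.Random(2025)
print(guess_and_check("y", rng))
```

The number of copies never exceeds $4 \times 3 = 12$. Occasionally a wrong guess survives all 4 runs and is accepted by mistake, which is the failure probability of the test showing up in practice.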
However, the form of Eq. (13.8) suggests an exponential dependence on the
number of qubits. Indeed, as 𝑛 increases, the total number of secret trit strings
𝒘 ∈ {𝑥, 𝑦 , 𝑧 }𝑛 grows as 3𝑛 . And, if randomly guessing the correct string is
essentially our only chance, we will feel this exponential growth in the number
of options.
Exercise 13.6 Extend this analysis to $n = 2$ and $n = 3$ qubits. Show that the
number of ‘guess circuits’ you need will indeed grow proportionally to $3^2$ and
$3^3$, respectively. This is a strong indicator of exponential growth in the number
of qubits. Hint: the random initialization of $|\hat{\psi}_0\rangle$ plays an important role here.
We emphasize that such a generalization is not enough to deduce Theo-
rem 13.3. After all, we are analyzing only one particular readout strategy (guess
the correct string by unravelling) and not all of them. It does, however, convey
the main gist: our secret imprinting is designed in a way that makes it very
difficult to extract ‘global information’ about the secret string 𝒘 ∈ {𝑥, 𝑦 , 𝑧 }𝑛 .
Annex 2: single-qubit analysis of the quantum-enhanced approach


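For $n = 1$, our quantum-enhanced circuit strategy boils down to a
circuit on $2 \times 1 = 2$ qubits.

There are three different circuit configurations, one for each $w \in \{x, y, z\}$,
that need to be analyzed. We learn everything about the measurement outcome
if we compute the final 2-qubit quantum state

\[
|\varphi(w)\rangle = (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, (\boldsymbol{V}_w \otimes \boldsymbol{V}_w)|0, 0\rangle.
\]

1 $w = x$, i.e. $\boldsymbol{V}_w = \boldsymbol{V}_x = \boldsymbol{H}$:

\[
\begin{aligned}
|\varphi(x)\rangle &= (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, (\boldsymbol{H} \otimes \boldsymbol{H})|0, 0\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, |+, +\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})|+, +\rangle = |0, +\rangle \\
&= \frac{1}{\sqrt{2}}\left(|0, 0\rangle + |0, 1\rangle\right).
\end{aligned}
\]

2 $w = y$, i.e. $\boldsymbol{V}_w = \boldsymbol{V}_y = \boldsymbol{S} \times \boldsymbol{H}$:

\[
\begin{aligned}
|\varphi(y)\rangle &= (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, (\boldsymbol{S}\boldsymbol{H} \otimes \boldsymbol{S}\boldsymbol{H})|0, 0\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, |\mathrm{i}+, \mathrm{i}+\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})\, \frac{1}{2}\left(|0, 0\rangle + \mathrm{i}|0, 1\rangle + \mathrm{i}|1, 1\rangle - |1, 0\rangle\right) \\
&= \frac{1}{\sqrt{2}}\left(|1, 0\rangle + \mathrm{i}|0, 1\rangle\right).
\end{aligned}
\]

3 $w = z$, i.e. $\boldsymbol{V}_w = \boldsymbol{V}_z = \mathbb{I}$:

\[
\begin{aligned}
|\varphi(z)\rangle &= (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, (\mathbb{I} \otimes \mathbb{I})|0, 0\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, |0, 0\rangle \\
&= (\boldsymbol{H} \otimes \mathbb{I})|0, 0\rangle = |+, 0\rangle \\
&= \frac{1}{\sqrt{2}}\left(|0, 0\rangle + |1, 0\rangle\right).
\end{aligned}
\]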
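So, in summary, the three different choices for $w$ lead to different outcome
probability distributions for the two readout bits. If we collect the probabilities
for obtaining 00, 01, 10 and 11 in a 4-dimensional vector, we obtain

\[
\boldsymbol{p}_x = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad
\boldsymbol{p}_y = \frac{1}{2}\begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}, \quad
\boldsymbol{p}_z = \frac{1}{2}\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \tag{13.9}
\]

where the rows list $\Pr_{|\varphi(w)\rangle}[00]$, $\Pr_{|\varphi(w)\rangle}[01]$, $\Pr_{|\varphi(w)\rangle}[10]$ and $\Pr_{|\varphi(w)\rangle}[11]$ from top to bottom.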
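The three outcome distributions can be double-checked with a tiny state vector simulation of the two-qubit circuit $(\boldsymbol{H} \otimes \mathbb{I})\, \boldsymbol{CNOT}\, (\boldsymbol{V}_w \otimes \boldsymbol{V}_w)|0, 0\rangle$. The sketch below hard-codes the action of the CNOT (first qubit as control) and of the Hadamard on the first qubit instead of using a quantum computing library; the function name `two_copy_distribution` is our own.

```python
import math

SQ = 1 / math.sqrt(2)
ENCODED_ZERO = {"x": (SQ, SQ), "y": (SQ, 1j * SQ), "z": (1, 0)}  # V_w|0>


def two_copy_distribution(w):
    """Outcome probabilities (00, 01, 10, 11) of the n = 1 two-copy circuit."""
    a = ENCODED_ZERO[w]
    # Amplitudes of V_w|0> (x) V_w|0>, indexed by 2 * first_bit + second_bit.
    state = [a[i] * a[j] for i in (0, 1) for j in (0, 1)]
    # CNOT with the first qubit as control swaps the amplitudes of |10> and |11>.
    state[2], state[3] = state[3], state[2]
    # Hadamard on the first qubit mixes |0,j> with |1,j> for each j.
    out = [0, 0, 0, 0]
    for j in (0, 1):
        out[j] = SQ * (state[j] + state[2 + j])
        out[2 + j] = SQ * (state[j] - state[2 + j])
    return [abs(amp) ** 2 for amp in out]


for w in "xyz":
    print(w, [round(p, 3) for p in two_copy_distribution(w)])
```

Up to floating-point rounding, the three printed rows agree with the vectors in Eq. (13.9).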
Note that these three possible outcome distributions are very different from
each other. A small, constant number of quantum circuit repetitions (coin
tosses) allows us to carry out a 2-step identification procedure.
Suppose, for the sake of illustration, that the underlying secret is $w = x$.
Then, $\boldsymbol{p}_x$ tells us that we obtain outcome 00 with probability 1/2, and we
expect to see this outcome after only 2 repetitions (why?). As soon as we
observe 00 once, we can exclude $w = y$ for the hidden trit ($\boldsymbol{p}_y$ has 0 weight
on 00) and are left with two options: $w = x$ or $w = z$. We then invest a
couple of extra repetitions (coin tosses) to look for outcome 01. As soon as
we observe this outcome, we can also rule out $w = z$ ($\boldsymbol{p}_z$ has 0 weight on 01)
and can be sure that the underlying hidden trit is $w = x$. We expect that 2
additional repetitions are enough (why?). The search procedure for the other
trits is analogous and gets by with equally few runs of the quantum-enhanced
readout procedure.
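The identification procedure can be tried out with a small Monte Carlo simulation. The sketch below is an illustration under stated assumptions, not the protocol as fixed in the notes. It samples readouts from the distributions in Eq. (13.9) and uses a greedy elimination rule (an assumption of this sketch): after every run, each candidate trit with zero weight on the observed outcome is discarded. On every sequence of readouts this finishes no later than the ordered two-step search described above, so the count of roughly 4 repetitions remains a conservative estimate. The function names and the trial count are illustrative choices.

```python
import random

# Outcome distributions from Eq. (13.9), ordered as 00, 01, 10, 11.
P = {
    "x": [0.5, 0.5, 0.0, 0.0],
    "y": [0.0, 0.5, 0.5, 0.0],
    "z": [0.5, 0.0, 0.5, 0.0],
}


def identify(secret, rng):
    """Repeat the circuit until a single candidate trit survives.

    Returns the identified trit and the number of repetitions used.
    """
    candidates = set(P)
    runs = 0
    while len(candidates) > 1:
        # One circuit repetition: sample a readout index for the true secret.
        k = rng.choices(range(4), weights=P[secret])[0]
        runs += 1
        # Discard every candidate that gives this readout zero probability.
        candidates = {w for w in candidates if P[w][k] > 0}
    return candidates.pop(), runs


def average_runs(secret, trials=20000, seed=0):
    """Average number of repetitions over many simulated experiments."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        guess, runs = identify(secret, rng)
        assert guess == secret  # elimination never discards the true trit
        total += runs
    return total / trials


for w in "xyz":
    print(w, average_runs(w))
```

With this elimination rule the first run always excludes one candidate, and the remaining one is excluded after 2 further runs on average, so the printed averages should be close to 3 for every secret.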
In summary, we can conclude that the quantum-enhanced readout protocol
gets by with
\[
T_{\mathrm{qe}}(1) = \mathrm{const} \quad \text{where} \quad \mathrm{const} \approx 4
\]
repetitions only. What is more, this procedure is fixed and not conditional on a
guess by us. The fixed Bell-type measurement allows us to correctly uncover
the secret trit 𝑤 after very few repetitions of the same quantum circuit. These
are all excellent signs for a benign generalization to 𝑛 ≥ 1 qubits.
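The value of the constant can be traced back to the geometric distribution. The following is a short sketch, assuming independent repetitions and the illustrative secret $w = x$ from above. Let $K$ denote the number of runs until an outcome of probability $1/2$ appears for the first time, so that $\mathrm{Pr}[K = k] = (1/2)^k$. Then

\[
\mathbb{E}[K] = \sum_{k \geq 1} k \left(\tfrac{1}{2}\right)^{k} = 2,
\qquad
T_{\mathrm{qe}}(1) \approx \underbrace{2}_{\text{wait for } 00} + \underbrace{2}_{\text{wait for } 01} = 4 .
\]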
Bibliography

[Bel64] J. Bell. “On The Einstein Podolsky Rosen Paradox”. In: Physics 1.3 (1964),
pages 195–200. doi: https://doi.org/10.1103/PhysicsPhysiqueFizika.
1.195.
[BB14] C. H. Bennett and G. Brassard. “Quantum cryptography: Public key
distribution and coin tossing”. In: Theor. Comput. Sci. 560 (2014), pages 7–
11. doi: 10.1016/j.tcs.2014.05.025. url: https://doi.org/10.
1016/j.tcs.2014.05.025.
[Ben+93] C. H. Bennett et al. “Teleporting an unknown quantum state via dual
classical and Einstein-Podolsky-Rosen channels”. In: Phys. Rev. Lett. 70 (13
1993), pages 1895–1899. doi: 10.1103/PhysRevLett.70.1895. url:
https://link.aps.org/doi/10.1103/PhysRevLett.70.1895.
[DN05] C. M. Dawson and M. A. Nielsen. The Solovay-Kitaev algorithm. 2005.
arXiv: quant-ph/0505030 [quant-ph].
[Eke91] A. Ekert. “Quantum Cryptography Based on Bell’s Theorem”. In: Phys. Rev.
Lett. 67.6 (1991), pages 661–663. doi: https://doi.org/10.1103/
PhysRevLett.67.661.
[GC99] D. Gottesman and I. L. Chuang. “Demonstrating the viability of universal
quantum computation using teleportation and single-qubit operations”.
In: Nature 402.6760 (1999), pages 390–393. doi: 10.1038/46503. url:
https://doi.org/10.1038/46503.
[HKP21] H.-Y. Huang, R. Kueng, and J. Preskill. “Information-Theoretic Bounds on
Quantum Advantage in Machine Learning”. In: Phys. Rev. Lett. 126 (19
2021), page 190505. doi: 10.1103/PhysRevLett.126.190505. url:
https://link.aps.org/doi/10.1103/PhysRevLett.126.190505.
188 Lecture 13: Learning from quantum experiments

[Hua+22] H.-Y. Huang et al. “Quantum advantage in learning from experiments”.


In: Science 376.6598 (2022), pages 1182–1186. doi: 10.1126/science.
abn7293. eprint: https://www.science.org/doi/pdf/10.1126/
science.abn7293. url: https://www.science.org/doi/abs/10.
1126/science.abn7293.
[Kit97] A. Y. Kitaev. “Quantum computations: algorithms and error correction”.
In: Russian Mathematical Surveys 52.6 (1997), page 1191. doi: 10.1070/
RM1997v052n06ABEH002155. url: https://dx.doi.org/10.1070/
RM1997v052n06ABEH002155.
[Kue22] R. Kueng. Introduction to Computational Complexity (lecture notes). JKU
Linz, Austria, 2022. url: https://iic.jku.at/files/eda/kueng-
complexity.pdf.
[Tom+13] M. Tomamichel et al. “A monogamy-of-entanglement game with appli-
cations to device-independent quantum cryptography”. In: New J. Phys.
15 (2013), pages 103002, 24. issn: 1367-2630. doi: 10.1088/1367-
2630 / 15 / 10 / 103002. url: https : / / doi . org / 10 . 1088 / 1367 -
2630/15/10/103002.
[Wil23] J. Wilkens. Quantum Circuit Library. 2023. url: https://github.com/
wilkensJ/drawio-library.
