Qbook 1
QUANTUM COMPUTING
Jozef Gruska
http://mcgraw-hill.co.uk/gruska
one finds:
1. Basic information about the contents of the book.
2. Ordering and price information.
3. Second part of the Appendix. (A survey of basic concepts from complexity theory and
models of computing. Additional exercises. Historical and bibliographical references.)
4. EPS versions of figures from the book.
5. Corrections.
6. Additions.
To my parents
for their love and care.
To my wife
for her ever increasing care, support and patience.
To my children
with best wishes for their future.
To my grandson
with best wishes for the quantum computing age.
Contents
Preface
1 FUNDAMENTALS
1.1 Why Quantum Computing
1.2 Prehistory of Quantum Computing
1.3 From Randomized to Quantum Computation
1.3.1 Probabilistic Turing machines
1.3.2 Quantum Turing machines
1.4 Hilbert Space Basics
1.4.1 Orthogonality, bases and subspaces
1.4.2 Operators
1.4.3 Observables and measurements
1.4.4 Tensor products in Hilbert spaces
1.4.5 Mixed states and density operators
1.5 Experiments
1.5.1 Classical experiments
1.5.2 Quantum experiments: single particle interference
1.5.3 Quantum experiments: measurements
1.6 Quantum Principles
1.6.1 States and amplitudes
1.6.2 Measurements: the projection approach
1.6.3 Evolution of quantum systems
1.6.4 Compound quantum systems
1.6.5 Quantum theory interpretations
1.7 Classical Reversible Gates and Computing
1.7.1 Reversible gates
1.7.2 Reversible Turing machines
1.7.3 Billiard ball model of (reversible) computing
2 ELEMENTS
2.1 Quantum Bits and Registers
2.1.1 Qubits
2.1.2 Two-qubit registers
2.1.3 No-cloning theorem
2.1.4 Quantum registers
3 ALGORITHMS
3.1 Quantum Parallelism and Simple Algorithms
3.1.1 Deutsch’s problem
3.1.2 The Deutsch–Jozsa promise problem
3.1.3 Simon’s problems
3.2 Shor’s Algorithms
3.2.1 Number theory basics
3.2.2 Quantum Fourier Transform
3.2.3 Shor’s factorization algorithm
3.2.4 Shor’s discrete logarithm algorithm
3.2.5 The hidden subgroup problems
3.3 Quantum Searching and Counting
3.3.1 Grover’s search algorithm
3.3.2 G-BBHT search algorithm
3.3.3 Minimum-finding algorithm
3.3.4 Generalizations and modifications of search problems
3.4 Methodologies to Design Quantum Algorithms
3.4.1 Amplitude amplification: boosting search probabilities
3.4.2 Amplitude amplification: speeding of the states searching
3.4.3 Case studies
3.5 Limitations of Quantum Algorithms
3.5.1 No quantum speed-up for the parity function
3.5.2 Framework for proving lower bounds
3.5.3 Oracle calls limitation of quantum computing
4 AUTOMATA
4.1 Quantum Finite Automata
4.1.1 Models of classical finite automata
4.1.2 One-way quantum finite automata
4.1.3 1QFA versus 1FA
4.1.4 Two-way quantum finite automata
4.1.5 2QFA versus 1FA
4.2 Quantum Turing Machines
4.2.1 One-tape quantum Turing machines
4.2.2 Variations on the basic model
4.2.3 Are quantum Turing machines analogue or discrete?
4.2.4 Programming techniques for quantum Turing machines
4.3 Quantum Cellular Automata
5 COMPLEXITY
5.1 Universal Quantum Turing Machines
5.1.1 Efficient implementation of unitary transformations
5.1.2 Design of a universal quantum Turing machine
5.2 Quantum Computational Complexity
5.2.1 Basic quantum versus classical complexity classes
5.2.2 Relativized quantum complexity
5.3 Quantum Communication Complexity
5.3.1 Classical and quantum communication protocols and complexity
5.3.2 Quantum communication versus computation complexity
5.4 Computational Power of Quantum Non-linear Mechanics
6 CRYPTOGRAPHY
6.1 Prologue
6.2 Quantum Key Generation
6.2.1 Basic ideas of two parties quantum key generation
6.2.2 Security issues of QKG protocols
6.2.3 Quantum key generation protocols BB84 and B92
6.2.4 Multiparty key generation
6.2.5 Entanglement-based QKG protocols
6.2.6 Unconditional security of QKG∗
6.2.7 Experimental quantum cryptography
6.3 Quantum Cryptographic Protocols
6.3.1 Quantum coin-flipping and bit commitment protocols
6.3.2 Quantum oblivious transfer protocols
6.3.3 Security of the quantum protocols
6.3.4 Security limitations of the quantum cryptographic protocols
6.3.5 Insecurity of quantum one-sided two-party computation protocols
6.4 Quantum Teleportation and Superdense Coding
6.4.1 Basic principles
6.4.2 Teleportation circuit
6.4.3 Quantum secret sharing
6.4.4 Superdense coding
7 PROCESSORS
7.1 Early Quantum Computers Ideas
7.1.1 Benioff’s quantum computer
7.1.2 Feynman’s quantum computer
7.1.3 Peres’ quantum computer
7.1.4 Deutsch’s quantum computer
7.2 Impacts of Imperfections
7.2.1 Internal imperfections
7.2.2 Decoherence
7.3 Quantum Computation and Memory Stabilization
8 INFORMATION
8.1 Quantum Entropy and Information
8.1.1 Basic concepts of classical information theory
8.1.2 Quantum entropy and information
8.2 Quantum Channels and Data Compression
8.2.1 Quantum sources, channels and transmissions
8.2.2 Shannon’s coding theorems
8.2.3 Schumacher’s noiseless coding theorem
8.2.4 Dense quantum coding
8.2.5 Quantum noisy channel transmissions
8.2.6 Capacities of erasure and depolarizing channels
8.3 Quantum Entanglement
8.3.1 Transformation and the partial order of entangled states
8.3.2 Entanglement purification/distillation
8.3.3 Entanglement concentration and dilution
8.3.4 Quantifying entanglement
8.3.5 Bound entanglement
8.4 Quantum Information Processing Principles and Primitives
8.4.1 Search for quantum information principles
8.4.2 Quantum information processing primitives
APPENDIX
9.1 Quantum Theory
9.1.1 Pre-history of quantum theory
9.1.2 Heisenberg’s uncertainty principle
9.1.3 Quantum theory versus physical reality
9.1.4 Quantum measurements
9.1.5 Quantum paradoxes
9.1.6 The quantum paradox
9.1.7 Interpretations of quantum theory
Bibliography
PREFACE
Come forth into the light of things.
Let Nature be your teacher.
W. Wordsworth (1770–1850)
It has been known, but not sufficiently appreciated, since the birth of modern quantum
mechanics that the most basic processes of Nature are actually quantum
information processing processes, and that the amount of information processing going
on everywhere around us in a tiny portion of matter and time is incomparably
larger than all the information processing classical technology has ever provided. In
addition, it had not been realised, until the birth of modern quantum information
processing research, that the information processing capabilities of Nature cannot be
matched by classical information processing tools. Due to severe limitations
on the retrieval of information from the quantum to the classical world, it was also not clear
at all whether and how we can harness the enormous information processing power
of Nature for classical information processing.
At the same time, as is often the case with very fundamental and powerful theories
and ideas, the very basic concepts of quantum computing are surprisingly simple and elegant
even though they seem to deal with mysterious and puzzling phenomena. Moreover, the
technical—mathematical—tools needed to present an introduction to quantum computing
are mostly those that are included in basic science education. The book demonstrates and
utilises this fact in a way that is readable and understandable by the broad science and
technology community.
It is hard to foresee exactly where the research and development in quantum computing
will take us. However, we can safely say that something important will come out and that
quantum computing is a challenge not only for informatics and physics—theoretical and also
experimental — but also for science, technology and society in general.
For informatics as a science, quantum computing may bring the most radical change in
its main research aims, scope and paradigms. Indeed, so far informatics has been devel-
oped, largely, with the global aims of serving current and foreseeable information processing
technology. Quantum computing (with molecular computing) is perhaps the first significant
challenge, chance and necessity for informatics to free itself from this short-term role of the
servant of technology and to start to concentrate more on its most basic long term aims:
to study the laws and limitations of the information processing world, to contribute to the
development of new global theories and to deepen our understanding of various worlds: for
example physical, biological, and chemical.
For informatics as a technology, the development of quantum information processing
technologies can make a revolutionary contribution to the potential and security of infor-
mation processing and communication systems.
For theoretical physics, quantum computing can be seen as a new challenge and also
as an important new source of aims, stimuli, scientific methods and paradigms for dealing
with one of the most basic problems of current science (physics). Namely, how to deepen
and extend one of the most basic, powerful and fascinating theories in physics: quantum
Referencing. A great effort has been made to ensure that the results and ideas presented are
properly credited and referenced. This has been a hard task because the field develops very fast.
I therefore apologize for all omissions, imperfections or even incorrect claims, and ask those
who feel that an addition or correction should be made along these lines to let me know;
I will try to do that on the book web pages.
Using the book as a textbook. The following is a possible structure of a one
semester course: (1) Introduction (1.1–1.3, 1.7); (2) Hilbert space basics (1.4, 9.2); (3)
Quantum principles (1.5–1.6 + Appendix 9.1) or Computational complexity (Appendix 9.3,
on the book web pages); (4) Quantum bits, registers, gates and networks (2.1–2.3); (5)
Basic quantum algorithms (3.1); (6) Shor’s algorithms I (3.2); (7) Search algorithms (3.3);
(8) Quantum algorithms design methodologies and limitations (3.4 and 3.5); (9) Quantum
finite automata and Turing machines (4.1 and 4.2); (10) Quantum key generation
(6.1 and 6.2); (11) Quantum cryptographic protocols and teleportation (6.3 and 6.4); (12)
Quantum error correction codes (7.2–7.4); (13) Quantum fault-tolerant methods (7.5); (14)
Quantum processors (7.1 and 7.6); (15) Quantum information theory (8.1–8.2); (16) Quantum
entanglement theory (8.3–8.4).
Additional subjects: quantum computational complexity (5.1–5.4), quantum cellular
automata (4.3).
FUNDAMENTALS
INTRODUCTION
The power of quantum computing is based on several phenomena and laws of the quantum
world that are fundamentally different from those one encounters in classical computing:
complex probability amplitudes, quantum interference, quantum parallelism, quantum
entanglement and the unitarity of quantum evolution. In order to understand these features,
and to make use of them for the design of quantum algorithms, networks and processors,
one has to understand several basic principles on which quantum mechanics is based, as well
as the basics of the Hilbert space formalism that provides the mathematical framework used
in quantum mechanics.
The chapter starts with an analysis of the current interest in quantum computing. It
then discusses the main intellectual barriers that had to be overcome to make a vision of the
quantum computer an important challenge to current science and technology. The basic and
specific features of quantum computing are first introduced by a comparison of randomized
computing and quantum computing. An introduction to quantum phenomena is given in
three stages. First, several classical and similar quantum experiments are analysed. This
is followed by Hilbert space basics and by a presentation of the elementary principles of
quantum mechanics and the elements of classical reversible computing.
LEARNING OBJECTIVES
The aim of the chapter is to learn
2 CHAPTER 1. FUNDAMENTALS
You have nothing to do but mention the
quantum theory, and people will take your
voice for the voice of science, and believe
anything.
Quantum computing is without doubt one of the hottest topics at the current frontiers of
computing, or even of science as a whole. It sounds very attractive and looks very promising.
There are several natural basic questions to ask before we start to explore the concepts
and principles as well as the mystery and potentials of quantum computing.
1. Why consider quantum computing at all? The development of classical
computers is still making enormous progress and no end to that seems to be in sight. Moreover,
the design of quantum computers seems to be very questionable and almost surely
enormously expensive. All this is true. However, there are at least four very good reasons
1.1. WHY QUANTUM COMPUTING 3
Church–Turing thesis concerning computability. They cannot compute what could not be computed by
classical computers. Their main advantage is that they can solve some important computational tasks much
more efficiently than classical computers.
2 In such a case it will be necessary to include in the design and description of computers quantum
theory and such quantum phenomena as superposition and entanglement, to obtain correct predictions
about computer behaviour. However, the clear necessity to go deeper into the quantum level for improving
performance of computers does not immediately imply that the way pursued under the current interpretation
of the term “quantum computing” is the only one, or even the best one.
3 The single electron transistor is already under development, see page 313.
4 At the same time one should note that, while quantum physics has long been essential to
the understanding of the operation of transistors and other key elements of modern computers, computation
has remained a classical process. In addition, at first sight there are good reasons for computing and
quantum physics to be very far apart, because the determinism and certainty required of computations seem
to be in strong contrast with the uncertainty principle and the probabilistic nature of quantum mechanics.
harvested through quantum cryptography, can offer, in view of our current knowledge,
unconditional security of communication, unachievable by classical means.
• Finally, the development of quantum computing is a drive and gives new impetus
to explore in more detail and from new points of view concepts, potentials, laws and
limitations of the quantum world and to improve our knowledge of the natural world.
The study of information processing laws, limitations and potentials is nowadays in
general a powerful methodology to extend our knowledge, and this seems to be partic-
ularly true for quantum mechanics. Information is being identified as one of the basic
and powerful concepts of physics and quantum entanglement is an important commu-
nication resource. Several profound insights into the natural world have already been
obtained on this basis.5
Remark 1.1.1 The above ideas are so new and important that they deserve additional
analysis.
Historically, the fundamental principles of physics first concerned the problems of
matter—what things are made of and how they move. Later, the problems of energy started
to be reflected in the leading principles of physics—how energy is created, expressed and
transformed. As the next stage an alternative seems to be to look to information processing
for a new source of fundamental principles and basic laws. For example, concerning the par-
ticles, the questions of the movement of particles may be superseded by how particles can be
utilized for information processing. Finally, let us observe some similarities between energy
and information. Both of them have many representations, but basic principles, and also
equations, hold independently of the form in which energy or information is presented.
The increasing importance of information processing principles for current science has
been first, correctly, reflected in the views and understanding (due to Landauer, 1991), that
“information is physical” and in the corresponding changes of emphases on the essence and
ways to deal with information processing problems. However, it could be the case that this is
only the first step and perhaps even more fundamental changes in the principles of physics
could be obtained from the view that “physics is informational”.6
These new views of the role of information in quantum physics also bring new potentials,
challenges and questions for quantum physics. Is the well known “weirdness” of the quantum
world due to the fact that physical reality is governed by even more basic laws of the infor-
mation processing world? Is quantum theory a theory of the physical or of the information
world? Can the study of quantum information help to deal with the most basic problems
quantum theory has?
As an example of a change of research aims in physics under the influence of computer
science research paradigms, consider quantum evolution. Traditionally, quantum physics
5 For example, manifestations of quantum nonlocality that go beyond entanglement (see Bennett et al.
1998), the use of quantum principles for secure transmission of classical information (quantum cryptography),
the use of quantum entanglement for reliable transmission of quantum states over a distance (quantum
teleportation), the possibility of preserving quantum coherence in the presence of irreversible noise processes
(quantum error correction and fault tolerant computation). In addition, by Steane (1997), one has to realize
that historically much of fundamental physics has been concerned with discovering fundamental particles
of Nature and the equations which describe their motions and interactions. It now appears that a different
program may be equally important. Namely, to discover the ways Nature allows, and prevents, information
to be expressed and manipulated, rather than particles to move.
6 A lot of research is still needed to determine the position and real role information plays in physics. The
extreme views go even so far that information is a physical quantity, similar to energy in thermodynamics
(Horodecki, 1991, and Landauer, 1991, 1995), or even that information is deeper than reality—a substance
that is more fundamental than matter and energy.
has been concerned with the study or design of particular quantum systems and the study
of various related fundamental problems. In addition to these problems quantum computing
brought up new general and fundamental questions. Namely, what are the best quantum
evolutions, from well-defined quantitative points of view, for solving particular algorithmic or
communication tasks? Another is the problem of the maximum quantum computation power achievable
in a quantum system of a certain dimension and disturbance level (Steane, 1998b), and of
the way to reach such a maximum.
New fundamental questions in quantum mechanics are raised also in connection with
the following problem: how secure are, or can be, quantum cryptographic protocols? For
example, how much information can be extracted from a quantum system for a
given amount of expected disturbance? These questions are of fundamental importance far
beyond quantum cryptography. To answer these questions, new theoretical insights and also
new experiments seem to be needed.
In addition, an awareness has been emerging also in the foundations of computing that
fundamental questions regarding computability and computational complexity are in a deep
sense questions about physical processes.7 If they are studied on a mathematical level then
the underlying models have to reflect fully the properties of our physical world. This in
particular implies that computational complexity theory has to be, in its most fundamental
form, based on models of quantum computers.8
2. Can quantum computers do what classical ones cannot? The answer de-
pends on the point of view. It can be YES. Indeed, the simplest example is generation of
random numbers. Quantum algorithms can generate truly random numbers. Deterministic
algorithms can generate only pseudo-random numbers. Other examples come from the sim-
ulation of quantum phenomena. On the other hand, the answer can be also NO. A classical
computer can produce truly random numbers when attached to a proper physical source.
3. Where lie the differences between the classical and quantum information
processing? Some of the differences have already been mentioned. Let us now discuss
some others.
Classical information can be read, transcribed (into any medium), duplicated at will,
transmitted and broadcast. Quantum information, on the other hand, cannot in general be
read or duplicated without being disturbed, but it can be “teleported” (as discussed in
Section 6.4).
In classical randomized computing, a computer always selects one of the possible com-
putation paths, according to a source of randomness, and “what-could-happen-but-did-not”
7 An understanding has emerged that each specific computation is performed by a physical system evolv-
ing in time and, consequently, that one of the basic problems of computing, namely “what is efficiently
computable?” is deeply related to one of the basic problems of physics, namely “which dynamical systems
are physically realizable?”
8 The following citations reflect a dissatisfaction with the fact that the development of complexity theory
ignored one of its most fundamental tasks. The fact that this had been so is in one way explainable but, in
another way, hardly forgivable.
A. Ekert (1995): Computers are physical objects and computations are physical processes. The theory
of computation is not a branch of pure mathematics. Fundamental questions regarding computability and
computational complexity are questions about physical processes that reveal to us properties of abstract
entities such as numbers or ideas. Those questions belong to physics rather than mathematics.
J. Beckman et al (1996): The theory of computation would be bootless if the computations that it describes
could not be carried out using physically realizable devices. Hence it is really a task of physics to charac-
terize what is computable, and to classify the efficiency of computations. The physical world is quantum
mechanical. Therefore, the foundations of the theory of computation must be quantum mechanical as well.
The classical theory of computation should be viewed as an important special case of a more general theory.
has no influence whatsoever on the outcome of the computation. On the other hand, in
quantum computing, exponentially many computational paths can be taken simultaneously
in a single piece of hardware and in a special quantum way and “what-could-happen-but-
did-not” can really matter.
Acquiring information about a quantum system inevitably disturbs the state of the
system. The tradeoff between acquiring quantum information and creating a disturbance of
the system is due to quantum randomness. The outcome of a quantum measurement has a
random element and because of that we cannot always faithfully infer the (initial) state
of the system from the measurement outcome.
Perhaps the main difference between classical and quantum information processing lies
in the fact that quantum information can be encoded in mutual correlations between remote
parts of physical systems and quantum information processing can make essential use of this
phenomenon, called entanglement, which is not available for classical information processing.
Another big difference between the classical and quantum worlds that strongly influences
quantum information processing stems from the fact that the relationship between a system
and its subsystems is different in the quantum world than in the classical world. For example,
the states of a quantum system composed of quantum subsystems cannot be in general
decomposed into states of these subsystems.
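The last point can be made concrete with a minimal sketch in Python. It relies on a standard fact rather than on notation from this chapter: a two-qubit pure state with amplitudes a00, a01, a10, a11 can be written as a product of two one-qubit states exactly when a00·a11 − a01·a10 = 0. The function name `is_product_state` and the numerical tolerance are illustrative assumptions.

```python
import math

def is_product_state(a00, a01, a10, a11, tol=1e-12):
    """Return True if a two-qubit pure state factorizes into one-qubit states.

    The four amplitudes form a 2x2 matrix; the state is a product state
    exactly when that matrix has rank 1, i.e. when its determinant vanishes.
    """
    return abs(a00 * a11 - a01 * a10) < tol

s = 1 / math.sqrt(2)

# (|00> + |01>)/sqrt(2) = |0> (|0> + |1>)/sqrt(2): a product state.
print(is_product_state(s, s, 0, 0))   # True

# (|00> + |11>)/sqrt(2): cannot be decomposed into states of the two qubits.
print(is_product_state(s, 0, 0, s))   # False
```

The second state is one of the entangled states discussed in the surrounding text: no assignment of separate states to the two qubits reproduces it.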
4. Can quantum computers solve some practically important problems much
more efficiently? Yes. For example, integer factorization can be done in polynomial time
on quantum computers, which seems to be impossible on classical computers. Searching an
unordered database can provably be done with fewer queries on a quantum computer.
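The search claim can be illustrated by a small classical simulation, assuming the algorithm meant is Grover's search (Section 3.3.1 in the table of contents). The sketch below tracks the amplitudes of all N items explicitly; the function name `grover_success_probability` and the choice N = 1024 are illustrative assumptions. Each loop iteration corresponds to one oracle query.

```python
import math

def grover_success_probability(n_items, marked, iterations):
    """Simulate Grover's search on an explicit list of n_items amplitudes."""
    amp = [1 / math.sqrt(n_items)] * n_items   # uniform superposition
    for _ in range(iterations):
        amp[marked] = -amp[marked]             # oracle query: flip the marked sign
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]      # inversion about the mean
    return amp[marked] ** 2                    # probability of measuring `marked`

n = 1024
k = math.floor(math.pi / 4 * math.sqrt(n))     # about (pi/4) * sqrt(N) queries
p = grover_success_probability(n, marked=123, iterations=k)
print(k, p)
```

For N = 1024 the simulation uses 25 queries and finds the marked item with probability close to 1, whereas a classical search of an unordered list needs about N/2 = 512 queries on average.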
5. Where does the power of quantum computing come from? On one side,
quantum computation offers enormous parallelism. The size of the computational state
space is exponential in the physical size of the system and the energy available. A quantum
bit can be in any of a potentially infinite number of states and quantum systems can be
simultaneously in a superposition of exponentially many of the basis states. A linear number
of operations can create an exponentially large superposition of states and, in parallel, an
exponentially large number of operations can be performed in one step.
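A minimal sketch of the statement that a linear number of operations creates an exponentially large superposition, assuming the operation used is the one-qubit Hadamard gate (the paragraph itself does not name a gate). The state of n qubits is stored as a list of 2^n amplitudes; the helper `apply_hadamard` is a hypothetical name.

```python
import math

def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a state vector (list of amplitudes)."""
    s = 1 / math.sqrt(2)
    new = [0.0] * len(state)
    bit = 1 << qubit
    for i, a in enumerate(state):
        if a == 0:
            continue
        i0, i1 = i & ~bit, i | bit          # same basis state with the qubit set to 0 / 1
        sign = -1 if i & bit else 1         # |1> -> (|0> - |1>)/sqrt(2)
        new[i0] += s * a
        new[i1] += sign * s * a
    return new

n = 4
state = [1.0] + [0.0] * (2 ** n - 1)        # the basis state |00...0>
for q in range(n):                          # only n gate applications
    state = apply_hadamard(state, q)

print(len([a for a in state if abs(a) > 1e-12]))   # 16 = 2**n nonzero amplitudes
```

The classical simulation has to store all 2^n amplitudes, which is exactly why the state space is said to be exponential in the size of the system.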
Secondly, it is the branching and quantum interference that create parallel computation
and constructive/destructive superpositions of states and can amplify or destroy the impacts
of some computations. Due to this fact, we can, in spite of the peculiarities of quantum
measurements, utilize quantum parallelism.
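The role of interference can be seen in the smallest possible case. The sketch below compares two successive Hadamard gates on one qubit with two successive fair coin flips; the Hadamard gate and the function names are illustrative assumptions, not taken from this section.

```python
import math

S = 1 / math.sqrt(2)

def hadamard(amp0, amp1):
    """Hadamard gate on one qubit, given the amplitudes of |0> and |1>."""
    return S * (amp0 + amp1), S * (amp0 - amp1)

def fair_coin(p0, p1):
    """Classical analogue: a fair coin flip applied to a probability distribution."""
    return 0.5 * (p0 + p1), 0.5 * (p0 + p1)

amps = hadamard(*hadamard(1.0, 0.0))       # two "quantum coin flips" starting from |0>
probs = fair_coin(*fair_coin(1.0, 0.0))    # two classical coin flips starting from 0

print([round(a * a, 10) for a in amps])    # [1.0, 0.0]
print(probs)                               # (0.5, 0.5)
```

In the quantum case the two computation paths leading to |1> carry amplitudes of opposite sign and cancel, while the two paths leading to |0> reinforce each other. Probabilities, being non-negative, can never cancel in this way.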
Thirdly, it is mainly the existence of so-called “entangled states” that makes quantum
computing more powerful than classical and allows even very distant parts of systems to be
strongly tied. This creates a base for developing and exploring quantum teleportation and
other phenomena that are outside of the realm of the classical world.
*****
After all this excitement let us start to deal with more prosaic and “harder” questions.
6. Where are the drawbacks and bottlenecks of quantum computing? There
are, unfortunately, quite a few. Let us mention here only two of them.
• Quantum computing can provide enormous parallelism. However, there are also
enormous problems with harnessing the power of its parallelism. According to the basic
principles of quantum mechanics, a (projection) measurement can extract from a
(large) quantum superposition only one classical result, randomly chosen, and the
remaining quantum information can be irreversibly destroyed.
1.2. PREHISTORY OF QUANTUM COMPUTING 7
• An interaction of a quantum system with its environment can lead to so-called
decoherence effects and can greatly influence, or even completely destroy, subtle
quantum interference mechanisms. This appears to make long reliable quantum
computations practically impossible.
sponding systems or experiments are in principle possible. Sometimes it is sufficient that no physical law is
known that would not allow such an experiment.
8 CHAPTER 1. FUNDAMENTALS
Since 1945 we have been witnessing a rapid growth of the raw performance of computers
with respect to their speed and memory size. An important step in this development was
the invention of transistors, which already use some quantum effects in their operation.
However, it is clear that if such an increase in performance of computers continues, then
after 50 years, our chips will have to contain 10^16 gates and operate at a 10^14 Hz clock rate
(thus delivering 10^30 logic operations per second)10. It seems that the only way to achieve
that is to learn to build computers directly out of the laws of quantum physics.
In order to come up seriously with the idea of quantum information processing, and to
develop it so far and so fast, it has been necessary to overcome several intellectual barriers.
The most basic one concerned an important feature of quantum physics—reversibility
(see Section 1.7).11 None of the known models of universal computers was reversible. This
barrier was overcome first by Bennett12 (1973), who showed the existence of universal re-
versible Turing machines, and then by Toffoli (1980, 1981) and Fredkin and Toffoli (1982),
who showed the existence of universal classical reversible gates.13
The second intellectual barrier was overcome by Benioff (1980, 1982, 1982a) who showed
that quantum mechanical computational processes can be at least as powerful as classical
computational processes. He did that by showing how a quantum system can simulate
actions of the classical reversible Turing machines. However, his “quantum computer” was
not fully quantum yet and could not outperform classical ones.
The overcoming of these basic intellectual barriers had significant and broad conse-
quences. Relations between physics and computation started to be investigated on a more
general and deeper level. This has also been due to the fact that reversibility results im-
plied the theoretical possibility of zero-energy computations.14 A Workshop on Physics and
Computation started to be organized and in his keynote speech at the first of these work-
shops, in 1981, R. Feynman (1982)15 asked an important question: Can (quantum) physics
be (efficiently) simulated by (classical) computers? At the same time he showed good rea-
sons to believe that the answer is negative. Namely, that it appears to be impossible to
10 Due to these facts, the concern was voiced quite a while ago on the possible negative effects that quantum
phenomena could induce in the “classical” operations of computers. For example, what fundamental limits
could Heisenberg’s uncertainty principle impose on memory chips whose bits are stored in single electron
states? This approach was later superseded, as we shall see, by more optimistic, more constructive and more
ambitious aims to harness the power of quantum mechanics to perform computations.
11 Reversibility is actually not an exclusive phenomenon of the quantum world. Reversibility also occurs
in classical physics. It is only the physics of large systems (classical but also quantum) that is not
reversible. The fact is that classical computationally reversible systems suggested by Bennett and others, as
discussed later, were not practically realizable. This brought up the idea of considering quantum reversible
information processing systems.
12 For earlier references see Section 9.5 in Appendix.
13 Bennett (1988) traces the need to think seriously about the thermodynamics of mental processes (and
computation was thought of this way in the nineteenth century), back to the famous paradox of “Maxwell’s
demon” from 1871, which seemed to violate the second law of thermodynamics, see Appendix, Section 9.1.5.
14 Actually, the original motivation for studying the reversibility of computation came from the interest in
determining the ultimate thermodynamic costs of elementary information processing operations, especially
because heat removal has always been a major engineering concern in the design of classical computers,
limiting the density with which active components could be packed. In the beginnings of the modern
computer era there was a folklore belief, going back to a lecture by von Neumann in 1949 (see Burks, 1966),
that at least kT ln 2 of energy is needed per bit operation. Attempts to prove this misleading folklore belief
led Landauer to the discovery of reversible computing.
15 Richard P. Feynman (1918-1988), an American physicist. His main scientific contributions were in
quantum electrodynamics and in the study of interactions of elementary particles. He gave a mathematical
description of helium. Feynman received the 1965 Nobel prize for physics for his contributions to quantum
electrodynamics. He was also known for his extraordinary ability to explain physical phenomena,
and his lectures and textbooks represent additional important contributions to modern physics.
and uncomputable” pointed out explicitly the potential advantages of quantum computing (exponential
number of basis states to work with simultaneously) and emphasized a need to design a theory of quantum
automata that would be abstract enough and would have a proper balance between mathematical principles
and fundamental principles of quantum mechanics without specification of some physical realizations.
18 Deutsch centered his attention on the computability and not on complexity issues.
“promise problems”, Berthiaume and Brassard (1992, 1992a, 1992b, 1994) proved the first
separation results in the relativized quantum complexity theory. For example, they showed
that there is an oracle A such that QEP^A ⊄ ZPP^A; that is, they proved the existence of an
oracle for which there are computational problems that a QTM can solve in polynomial time
with certainty, but for which each probabilistic Turing machine solving them with certainty
needs exponential time for some inputs. These results were first improved by Bernstein and
Vazirani (1993) and later by Simon (1994). He proved the following result that was at that
time the strongest argument in favor of the superiority of quantum computers over classical
ones.
Theorem 1.2.1 There exists an oracle relative to which there is a problem solvable in poly-
nomial time (with bounded error probability) on a quantum computer, but any probabilistic
Turing machine with bounded error probability solving this problem (using the oracle) will
require exponential time (at least 2^{n/2} steps) on infinitely many inputs (of length n).
Results of Bernstein and Vazirani (1993) and Simon (1994) provide formal evidence that, in
the relativized setting, QTM are more powerful than PTM.19
However, all these problems were quite artificial. Very important and much needed
steps along these lines have been the results of Shor (1994, 1997) who, building on the
works of the above mentioned authors, especially on Simon’s method, showed how to factor
integers, and how to compute discrete logarithms in polynomial time on potential quantum
computers—two problems of crucial importance for public-key cryptography.
Due to these results quantum computing, which until then had been considered a curiosity
for a few visionaries, started to be of broader scientific, and not only scientific, interest. An
intensive search started to discover physical principles and processes that could eventually
make quantum computation practical. Moreover, several groups of experimental physicists
around the world have begun projects to explore experimentally the basic principles of
quantum computing.
The next question to address was whether one can build a practically successful quantum
computer. Could quantum computing be brought from a visionary stage to an experimental
stage (and later to an engineering stage)?
This question is still to be answered. An intensive effort to deal with quantum computer
design problems has brought some remarkable success, but also revealed new problems.
On one hand success came in an unexpected area. Quantum cryptography, in which
one tries to exploit quantum phenomena20 to transmit quantum information in such a way
that undetectable eavesdropping is impossible, has already reached an experimental stage.
There has also been success in the effort to find sufficiently simple reversible quantum gates
that could be used to build potential quantum computers. The classical universal reversible
gates have three inputs and outputs. Sleator and Weinfurter (1995), Barenco (1995) and
DiVincenzo (1995) have shown universal two bit quantum gates. This has been an important
result because the problem to control interaction of three particles seems to be much more
complex than for the case of two particles. In addition, Barenco (1995) and Lloyd (1995) have
shown that almost any quantum two-bit gate is universal. These results greatly simplified
the search for physical implementations of quantum computational networks.
On the other hand, it has also turned out that the first models of quantum computers
were oversimplified and that for quantum computing to come to an experimental or even
19 However, it is necessary to make clear that the question whether quantum computers allow one to obtain
essentially more computational power has not yet been completely satisfactorily answered.
20 Heisenberg’s uncertainty principle—see Section 9.1.2.
engineering stage many fundamental problems still need to be solved. The necessity of
examining the impact of inaccuracies, emissions and coupling with the environment of any
realistic device on the capability of quantum computing to meet its promises has long
been emphasized by Landauer (1994). In particular, the problems caused by decoherence led
many to believe that it is in principle impossible to design a quantum computer that
functions reliably enough.21
The situation started to look almost hopeless. A breakthrough came after overcoming
another intellectual barrier: it was realised that the situation is not as bad as it looks and that
physics does not need to rely on itself only in the search for how to overcome problems of the
imperfections of operations, emission and of the decoherence. Mathematics and informatics
seem to be able to help significantly. The first important and encouraging result was due
to Bernstein and Vazirani (1993). They showed that quite weak precision requirements
are sufficient for quantum computing—only logarithmic precision for inputs and gates is
needed. Discovery of error-correcting codes by Shor (1995), and soon by many others,
allowed one to cope with decoherence and operational imperfections during transmission
and storage of quantum information. (Behind this was the key discovery that quantum
noise/errors, in principle continuous, can be viewed and dealt with as being discrete.) The
discovery of quantum fault-tolerant computations by Shor (1996) allowed one to cope with
decoherence and imprecisions during processing of quantum information.22 The discovery of
“concatenated codes” (Knill and Laflamme, 1996) and “quantum repeaters” (Briegel, 1998),
allows one to cope with the problem of storage and transmission of quantum information
for a long time and long distance with desirable reliability.
Quantum cryptography has also contributed to an awareness that quantum computing
is full of pitfalls, not fully understood yet. In 1993, Brassard, Crépeau, Jozsa and Langlois
surprised the community by the claim (proof) that a quantum bit commitment protocol
provably unbreakable by both parties is possible. It took three years until Lo and
Chau (1997, 1997a) and Mayers (1998) found that the proposed protocols are, in principle, insecure.
Another intellectual barrier was overcome by contributions of Cirac and Zoller (1995).
They showed, at least on the laboratory level, that in the search for technology to build
quantum processors and computers one does not need to wait till some “unobtainium” is
available, but that one can start with the existing technologies with which there are already
rich experimental experiences. (Of course, this is not the whole story. One also has to realize
that even if it might be possible to build small quantum computers, scaling up to machines
large enough to make really important computations could present fundamental difficulties.)
21 Pessimism that technology cannot be made reliable enough to realize useful computations is not a new
phenomenon in the short history of modern computers. For example, in the autobiography of K. Zuse (1984),
there is a story about sceptical reactions to his talk in 1938 in which he anticipated, based on discussions
with Schreyer, that about 2000 tubes would be needed to build an electronic computer. (At that time the
biggest electronic devices were broadcasting stations with a few hundred valves.) Similarly, the idea that
ENIAC with its 16000 tubes could work for a sufficiently long time was for that time an engineering fantasy
that would hardly get through a granting agency of “peace time”.
22 Actually Landauer’s constant challenge of “visionaries” to show a really workable path to the future
has been of immense significance for making correct research agenda in quantum computing. Quantum
computing is an excellent example of the rapid progress in science and technology that can be achieved
by optimists and visionaries if they closely cooperate with, and listen to, sceptics and pessimists directing
constructively the effort of visionaries and optimists on the key problems to attack.
δ : Σ × Q × Σ × Q × {←, ↓, →} −→ [0, 1]
assigning to each possible transition a probability in such a way that for each configuration24
c0 and all its successor-configurations c1 , . . . , ck , the following local probability condition
is satisfied: If pi , 1 ≤ i ≤ k, is the probability, assigned by δ, of the transition from c0 to ci ,
then (see Figure 1.1a):
$$\sum_{i=1}^{k} p_i = 1.$$
On the basis of the transition function δ of a PTM M we can assign probabilities to all
edges, to all nodes and also to all configurations of each level of any configuration tree of T .
The probability assigned to an edge c → c′ of such a tree is given directly by δ and represents
the probability that computation goes, in one step, from c to c′ . From that we can assign
a probability to each node N of any configuration tree, see Figure 1.2a, as the product of
all probabilities assigned to the edges on the path from the root to N . (The probability
assigned to the root is defined to be 1.) The probability assigned to an arbitrary node N is
therefore the probability that a computation starting at the root reaches the node N .
It may happen that at a certain level of a configuration tree there are several occurrences
c(1) , . . . , c(m) of the same configuration c, see Figure 1.3a. In such a case, if pi is the
23 Alan M. Turing (1912-1954) an English mathematician. He wrote fundamental papers on computability
and artificial intelligence and invented a computation model bearing his name. During the Second World
War Turing participated in the cryptanalysis project ULTRA in Bletchley Park and in the design of the first
powerful electronic computer Colossus. After the war he supervised the design and building of ACE, a large
electronic digital computer at the National Physical Laboratory.
24 A configuration is a full description of the global state of a PTM. It can be seen as having the form
w1 qw2 , where w1 w2 is the current content of the tape, q is the current state and the current position of the
head of the PTM is on the cell with the first symbol of w2 .
1.3. FROM RANDOMIZED TO QUANTUM COMPUTATION 13
[Figure 1.1: transitions from a configuration c0 to its successors c1, ..., ck, labeled (a) with
probabilities p1, ..., pk satisfying p1 + p2 + ... + pk = 1 and (b) with amplitudes α1, ..., αk
satisfying |α1|^2 + |α2|^2 + ... + |αk|^2 = 1.]
probability assigned to the occurrence c(i) of the configuration c, then the total probability
that the configuration c occurs at that level of the configuration tree is given by the sum
$$\sum_{i=1}^{m} p_i.$$
Now, if c1 , . . . , ck are all distinct configurations occurring at a certain level of the con-
figuration tree, and p1 , . . . , pk are their global probabilities of occurrence at that level, then
the following global probability condition has to be satisfied:
$$\sum_{i=1}^{k} p_i = 1.$$
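The bookkeeping described above can be sketched in a few lines of Python: node probabilities are products of edge probabilities, occurrences of the same configuration at one level are merged by addition, and the resulting level distribution sums to 1. The transition table `delta` and the configuration names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

def next_level(level, delta):
    """Push a probability distribution over configurations one level down the tree.

    `level` maps configuration -> total probability at the current level;
    `delta` maps configuration -> list of (successor, probability) pairs.
    """
    result = defaultdict(float)
    for config, prob in level.items():
        for successor, p in delta[config]:
            # node probability = product of edge probabilities along the path;
            # several occurrences of the same configuration are merged by addition
            result[successor] += prob * p
    return dict(result)

# Hypothetical transition table: every row satisfies the local probability condition.
delta = {
    "c0": [("a", 0.5), ("b", 0.5)],
    "a":  [("c", 0.5), ("d", 0.5)],
    "b":  [("c", 0.5), ("d", 0.5)],
    "c":  [("c", 1.0)],
    "d":  [("d", 1.0)],
}

level = {"c0": 1.0}                     # the root has probability 1
for _ in range(2):
    level = next_level(level, delta)
print(level, sum(level.values()))       # total probability at the second level
```

In this example the configurations c and d each occur twice at the second level with probability 0.25, so each has total probability 0.5.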
Figure 1.2: Configuration trees with probabilities and the probability amplitudes
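[Figure 1.3: several occurrences c(1), c(2), ..., c(m) of the same configuration c at one level of a
configuration tree, labeled (a) with probabilities p1, ..., pm and (b) with amplitudes α1, ..., αm.]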
Exercise 1.3.1 Show that if a PTM satisfies the local probability condition, then it also
satisfies the global probability condition.
The local probability condition can also be seen as assigning to each configuration a
“linear superposition” of successor configurations
$$F(c) = p_1 c_1 + p_2 c_2 + \ldots + p_k c_k,$$
with $\sum_{i=1}^{k} p_i = 1$, where F is some kind of “global transition function” of T and $p_i$ is the
probability of having $c_i$ as the next configuration of c.25 Let us now consider any
superposition $\sum_{i=1}^{k} p_i c_i$ of configurations with $\sum_{i=1}^{k} p_i = 1$. If we replace, in such a superposition,
any particular configuration c by the superposition of its successor configurations, as above,
and make corresponding multiplications by constants and corresponding additions, we get
again a superposition of configurations with coefficients summing up to 1.
All this implies that the transition function δ of a PTM M actually determines a so-called
transition matrix M_M, rows and columns of which are labeled by configurations
of M (and therefore the matrix can be infinite) and M_M(i, j) is the probability that the
configuration ci is the successor configuration of cj. Such a transition matrix clearly has all
entries nonnegative and the sum of its entries in each column is 1.
In this case, if we multiply M_M with a column vector of the same dimension and only
nonnegative elements, the sum of which is 1, we get again a column vector with only
nonnegative elements the sum of which is 1. We can therefore see M_M as a mapping that maps
any superposition of configurations satisfying the global probability condition to another
superposition of configurations satisfying again the global probability condition.
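The preservation of the global probability condition can be checked in one line. Writing v for a column vector with nonnegative entries v_j summing to 1 (the symbol v is introduced here only for this illustration), and using the fact that every column of the transition matrix sums to 1, we get

$$\sum_i (M_M v)_i = \sum_i \sum_j M_M(i,j)\, v_j = \sum_j v_j \sum_i M_M(i,j) = \sum_j v_j = 1.$$

Each entry $(M_M v)_i$ is a sum of products of nonnegative numbers and is therefore nonnegative, which also justifies exchanging the order of summation when the matrix is infinite.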
The time evolution of a probabilistic Turing machine M can therefore also be described
by a sequence of probability distributions, represented by superpositions, which begins with
25 This assignment of linear superpositions to PTM has no real meaning for PTM but helps us to make a
better analogy with QTM.
the superposition containing only the initial configuration, and such that the ith distribution
provides the likelihood of each possible configuration after the (i−1)st step of the evolution of
M. Each next superposition (probability distribution) can be obtained from the previous one
by multiplying with the matrix M_M. The matrix M_M can therefore be seen as representing
the evolution of the PTM M.
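The following sketch shows this evolution for a small example: the sequence of probability distributions is obtained by repeated multiplication with the transition matrix. The 3 × 3 matrix `M` is hypothetical, chosen only so that every column sums to 1.

```python
def step(matrix, distribution):
    """One step of the evolution: multiply the column vector by the transition matrix."""
    size = len(distribution)
    return [sum(matrix[i][j] * distribution[j] for j in range(size))
            for i in range(size)]

# Hypothetical machine with three configurations; entry [i][j] is the probability
# that configuration i is the successor of configuration j (columns sum to 1).
M = [
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 1.0],
]

distribution = [1.0, 0.0, 0.0]          # only the initial configuration
history = [distribution]
for _ in range(3):
    distribution = step(M, distribution)
    history.append(distribution)

for i, d in enumerate(history, start=1):
    print(i, d, sum(d))                 # the i-th distribution, after step i - 1
```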
In each particular computation of a PTM only one path is taken from the whole set of
paths of the configuration tree, in accordance with the assigned probabilities. To simulate a
PTM we need therefore to keep track of only a constant amount of information. We could
also imagine a PTM M as being put into a box with a glass top through which we could
watch (not influence) the particular steps taken by M, one after another. At the end of
the computation we could see the result obtained. A PTM computation can therefore be
observed and the act of its observation has no effect on its further course. (Why
should it have?)
Concerning the outcomes, a PTM can be seen as defining a random sample, a probability
distribution on the final configurations for each initial configuration.
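A single computation of a PTM can be sketched as follows: only the current configuration is kept, and the next one is drawn at random according to the transition probabilities. Repeating the run many times gives an estimate of the probability distribution on the final configurations. The table `delta` and the function name `run_once` are assumptions made for this example.

```python
import random

def run_once(delta, start, steps, rng):
    """Follow a single path of the configuration tree, chosen at random."""
    config = start                      # the only information kept between steps
    for _ in range(steps):
        successors = [c for c, _ in delta[config]]
        weights = [p for _, p in delta[config]]
        config = rng.choices(successors, weights=weights)[0]
    return config

# Hypothetical machine using only the probabilities 1/2 and 1.
delta = {
    "c0": [("a", 0.5), ("b", 0.5)],
    "a":  [("a", 1.0)],
    "b":  [("b", 1.0)],
}

rng = random.Random(0)                  # fixed seed so that runs are repeatable
samples = [run_once(delta, "c0", 2, rng) for _ in range(1000)]
print(samples.count("a") / len(samples))  # an estimate of the probability 1/2
```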
In order to study the computational power of PTM we need to impose some restriction on
the probabilities allowed. Otherwise one could hide hard-to-compute numbers or properties
into them. It is well known that in order to study computational complexity problems of
randomized computing, it is sufficient to allow only probabilities from the set {0, 1/2, 1}.
After this lengthy review of probabilistic Turing machines and their behaviour we are in
a better position to discuss quantum Turing machines and their behaviour.
δ : Σ × Q × Σ × Q × {←, ↓, →} −→ C_{[0,1]}
and therefore |αi|^2 can be seen (and will be seen) as a probability of transition from c0 to ci.
However, as discussed later, this is not the only condition a transition function of a QTM
has to satisfy.
The transition function of a QTM can be used to assign amplitudes (not probabilities)
also to all edges, nodes and all configurations of the same level of a configuration tree. The
amplitude assigned to an edge is given directly by δ. The amplitude assigned to a node is
the product of the amplitudes assigned to all edges on the path from the root to that node,
assuming again that the amplitude 1 is assigned to the root (see Figure 1.2b).
As for the case of PTM, let us assume that at a particular level of the configuration
tree there are several occurrences, say c(1) , c(2) , . . . , c(m) , of the same configuration c. (See
26 The concept of configuration is defined in a similar way as for PTM.
Figure 1.3b). Let now αi be the amplitude of the configuration c(i) at that level. In such a
case the total amplitude of c at that level is defined to be
$$\beta = \sum_{i=1}^{m} \alpha_i.$$
So far all that looks quite similar to the case of PTM. The only difference is that in the
case of PTM we have worked with probabilities and now we are working with (probability)
amplitudes. An essential difference between PTM and QTM concerning their computations
comes now. If c1 , c2 , . . . , ck are all mutually different configurations at a certain level of
the configuration tree, then their total amplitudes β1 , . . . , βk have to satisfy the following
global probability condition
$$\sum_{i=1}^{k} |\beta_i|^2 = 1,$$
and $|\beta_i|^2$ is said to be the probability of the occurrence of the configuration ci at that level
of computation.
It is not true that if a QTM satisfies all local probability conditions, then it also satisfies
all global probability conditions. A counterexample is shown in Figure 1.2c.
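The gap between the local and the global condition can be illustrated by a small numerical sketch. Both transition tables below are hypothetical (they are not claimed to be the example of Figure 1.2c); each satisfies the local condition in every row, but only the first one yields total amplitudes whose squared absolute values sum to 1.

```python
import math
from collections import defaultdict

def next_level(level, delta):
    """Propagate total amplitudes one level down the configuration tree."""
    result = defaultdict(complex)
    for config, amp in level.items():
        for successor, a in delta[config]:
            result[successor] += amp * a    # amplitudes, not probabilities, are added
    return dict(result)

def total_probability(level):
    """Sum of |beta|^2 over all distinct configurations of a level."""
    return sum(abs(beta) ** 2 for beta in level.values())

s = 1 / math.sqrt(2)
# In every row of both tables |s|^2 + |s|^2 = 1, so the local condition holds.
good = {"c0": [("a", s), ("b", s)], "a": [("c", s), ("d", s)], "b": [("c", s), ("d", -s)]}
bad = {"c0": [("a", s), ("b", s)], "a": [("c", s), ("d", s)], "b": [("c", s), ("d", s)]}

for name, delta in (("good", good), ("bad", bad)):
    level = {"c0": 1}
    for _ in range(2):
        level = next_level(level, delta)
    print(name, total_probability(level))
```

In the first table the two paths to d cancel and the two paths to c reinforce each other, so the level has total probability 1; in the second table nothing cancels and the sum is 2, which violates the global condition.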
What does all this actually imply? As we shall see soon, the way probabilities are assigned
to configurations at particular levels of computations represents an enormous difference with
respect to the case of PTM. Indeed, one of the important consequences is the existence of
constructive and destructive interferences.
where θ is the angle which the vectors α and β subtend at the origin of the complex plane.
Analyse the value of |α + β|^2 for cases: (a) α = β; (b) α = −β; (c) α = iβ.
so? This is surely puzzling. But this is the way the quantum world is. Such are the rules that
say how much and which information one can get from the quantum world to the classical
world by an “observation” or a “measurement”, one of the most puzzling phenomena of
quantum physics, to be discussed in more detail in Sections 1.4.3, 1.5.3 and 1.6.2 as well
as in Appendix, Section 9.1.4.
From the fact that we can have positive and negative interferences, one of the basic
tricks of quantum computing follows. One has to program quantum computers in such a
way that correct and desirable answers, due to positive interference, have large probability,
and incorrect, or not desirable answers, due to destructive interference, have very small, or
zero, probability.
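Both kinds of interference can be made explicit for a configuration reached along two paths with amplitudes α and β. The following worked equation assumes that the identity meant in the (incompletely reproduced) exercise above is the usual law of cosines for complex numbers, with θ the angle between α and β:

$$|\alpha + \beta|^2 = |\alpha|^2 + |\beta|^2 + 2|\alpha||\beta|\cos\theta.$$

For α = β (θ = 0) this gives $4|\alpha|^2$, twice the sum $|\alpha|^2 + |\beta|^2$ (constructive interference). For α = −β (θ = π) it gives 0 (destructive interference). For α = iβ (θ = π/2) the last term vanishes and the result is $|\alpha|^2 + |\beta|^2$, exactly as if probabilities had been added.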
What does this all imply? Is there some other condition, more transparent than the
global probability condition, that a quantum Turing machine has to satisfy? Yes, there is.
In the same way as for PTM, to each QTM M we can associate a matrix M_M of
configuration transitions such that M_M(i, j) is the amplitude of having the configuration ci
as the successor configuration of the configuration cj. Entries of M_M are therefore complex
numbers and the local probability condition implies that the Euclidean norm of each of its
column vectors is 1.
$$M_M M_M^{*} = M_M^{*} M_M = I,$$
where $M_M^{*}$ is the conjugate transpose of M_M, i.e. the transposition of M_M and conjugation
of its elements, and I is the unit matrix.
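For finite matrices the condition above is easy to test numerically. The sketch below is an illustration in plain Python; the helper names `conjugate_transpose`, `multiply` and `is_unitary` are assumptions of this example. It checks both products against the unit matrix up to a rounding tolerance.

```python
def conjugate_transpose(m):
    """Transpose the matrix and conjugate every entry."""
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def multiply(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_unitary(m, tol=1e-12):
    """Check M M* = M* M = I entrywise, up to a rounding tolerance."""
    n = len(m)
    star = conjugate_transpose(m)
    for product in (multiply(m, star), multiply(star, m)):
        for i in range(n):
            for j in range(n):
                if abs(product[i][j] - (1 if i == j else 0)) > tol:
                    return False
    return True

sigma_y = [[0, -1j], [1j, 0]]
stochastic = [[0.5, 0.5], [0.5, 0.5]]   # a valid PTM transition matrix, but not unitary
print(is_unitary(sigma_y), is_unitary(stochastic))   # prints: True False
```

The second matrix shows that a perfectly good transition matrix of a PTM need not be an admissible transition matrix of a QTM.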
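$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
(a) (b) (c)
$$\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix} \qquad
\frac{1}{2}\begin{pmatrix} 1-i & 1+i \\ 1+i & 1-i \end{pmatrix}$$
(d) (e)
$$\begin{pmatrix} i\cos\theta & \sin\theta \\ \sin\theta & i\cos\theta \end{pmatrix} \qquad
\begin{pmatrix} e^{i\alpha}\cos\theta & -ie^{i(\alpha-\theta)}\sin\theta \\ -ie^{i(\alpha+\theta)}\sin\theta & e^{i\alpha}\cos\theta \end{pmatrix}$$
(f) (g)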
Exercise 1.3.3 (a) Show that if A and B are unitary matrices of the same dimension,
then the matrix $\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$ is also unitary; (b) show that if A and B are unitary matrices
of the same dimension, then so is the matrix A · B.
Exercise 1.3.4 Show the following properties of Pauli matrices: (a) $\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = I$;
(b) $\sigma_k \sigma_l = i\sigma_m$, where (k, l, m) is a cyclic permutation of (x, y, z); (c) all Pauli matrices
have eigenvalues 1 and −1.
The requirement of unitarity is far from obvious. Examples of unitary matrices of degree
two, including Pauli matrices27 , are shown in Figure 1.428 . A general form of the unitary
matrix of degree two can be found on page 64.
It can be shown that if the transition matrix of a QTM is unitary, then all global
probability conditions are satisfied. Another important consequence of unitarity is that
each QTM is reversible (and the same is true for each quantum evolution). This means that
from a given superposition of configurations in a step of a computation we can uniquely
deduce the superposition of configurations in the previous step.
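Reversibility can be seen directly in a small numerical sketch: applying the conjugate transpose of a unitary matrix undoes one step of the evolution. The two-configuration system, the amplitude vector `before` and the helper names are hypothetical; the matrix U is matrix (d) of Figure 1.4.

```python
import math

def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(len(vector)))
            for i in range(len(matrix))]

def conjugate_transpose(m):
    """Transpose the matrix and conjugate every entry."""
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]                    # matrix (d) of Figure 1.4
before = [0.6, 0.8j]                     # amplitudes with |0.6|^2 + |0.8|^2 = 1
after = apply(U, before)                 # one step forward
recovered = apply(conjugate_transpose(U), after)   # one step backward
print(recovered)                         # equals `before` up to rounding errors
```

No such inverse step exists in general for the transition matrix of a PTM, which may map different distributions to the same one.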
Exercise 1.3.5 (a) Verify the unitarity of the matrices shown in Figure 1.4; (b) show
the unitarity of any n × n matrix A with $A[i, j] = \frac{2}{n}$ if $i \neq j$ and $A[i, i] = -1 + \frac{2}{n}$, for
any 1 ≤ i ≤ n.
Exercise 1.3.6 Show that: (a) the determinant of any unitary matrix has absolute value 1; (b) all
unitary matrices of degree n form a group, with respect to multiplication, usually denoted
U (n); (c) all unitary matrices of degree n and with determinant equal to 1 also form a
group, usually denoted SU (n); (d)∗ all eigenvalues of unitary matrices have absolute value
1.
Another essential difference between a PTM and a QTM can be seen when we
try “to observe” the evolution of a QTM and to find out the results of its computations.
In the case of a PTM, at each particular computation a single path through the config-
uration tree has to be chosen, and we could watch (though not influence) the path being
taken. The result would be obtained with the probability attached to the final configuration.
On the other hand, a QTM always follows all paths of the configuration tree simultane-
ously! Since the number of nodes at the levels of a configuration tree can grow exponentially,
this means that a QTM can, simultaneously, take an exponentially large number of paths
and can be, at particular steps of computation, in a superposition of exponentially many
configurations (with respect to the number of computational steps), at the same time! In
addition, the computational evolution of a QTM, and of any quantum computation, is fully
determined by its unitary matrix and it is deterministic.
27 Wolfgang Pauli (1900–1958), an American physicist of Austrian origin, with positions in Hamburg,
Zürich and Princeton. He received the 1945 Nobel prize for his exclusion principle (formulated in 1924),
according to which no two electrons in an atom may be in the same quantum state. In 1930 Pauli derived the
existence of the neutrino, before it was experimentally observed. He also made fundamental contributions
in quantum electrodynamics, quantum field theory and paramagnetism.
28 Matrices σx, σy and σz are called Pauli matrices and they play an important role in the theory of
spin-1/2 electrons. They were introduced by W. Pauli in 1927 to describe angular momentum and magnetic
moment of electrons.
1.4. HILBERT SPACE BASICS 19
Moreover, there is no way “to watch” the computations of a QTM. We could “put it into
a box and let it run” but we cannot watch it, or open the box before computation is done—
at least not without serious consequences. This would be an observation (a measurement)
and, according to the laws of quantum mechanics, it could immediately lead to a disruption
of the computation and could result in a loss of (quantum) information! At the end of the
computation we can try to observe (measure) the result. However, we cannot in general get
the whole resulting superposition of configurations. We get only one of them, randomly chosen,
with the probability determined by the corresponding global amplitude. In addition, once
such information is obtained, all other results of the computation (all other configurations)
are irreversibly lost. Finally, it is only at this point, in general, in a measurement, or an
observation, where probabilities and indeterminacy enter quantum computation.
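The statistics of such a final measurement can be mimicked classically, as in the following sketch: one configuration is returned, chosen with probability equal to the squared absolute value of its amplitude. The superposition `final` and the function name `measure` are hypothetical; each call stands for a fresh run of the whole computation, since a real measurement destroys the superposition.

```python
import random

def measure(amplitudes, rng):
    """Return one configuration, chosen with probability |amplitude|^2."""
    configs = list(amplitudes)
    weights = [abs(amplitudes[c]) ** 2 for c in configs]
    return rng.choices(configs, weights=weights)[0]

# Hypothetical final superposition of three configurations.
final = {"c1": 0.6, "c2": 0.8j, "c3": 0.0}
rng = random.Random(1)                  # fixed seed so that runs are repeatable
outcomes = [measure(final, rng) for _ in range(1000)]
print({c: outcomes.count(c) / 1000 for c in final})   # roughly 0.36, 0.64 and 0
```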
A QTM can therefore get in linear time an exponential number of results, but unfortu-
nately we cannot read them all out. (Disappointing, but this is the way it is. In spite of
that, QTM can be more efficient than classical ones.)
This seems to imply that we actually cannot get more with a QTM than with a proba-
bilistic Turing machine that would provide us, randomly, with one result! Fortunately, this
is not true. There are sometimes clever ways to make a QTM use its enormous parallelism
to get to the single needed result, with a high probability, which seems not to be obtainable
efficiently without this quantum parallelism. This is, however, not a simple task, as will
be demonstrated in Section 3.2.
Both PTM and QTM produce their results with certain probabilities. Therefore they
actually define probability distributions on possible outputs.
In order to study the computational power of a QTM we also need to make some re-
striction on the probability amplitudes allowed. Otherwise one could hide hard-to-compute
numbers or properties into them. It has been shown, as discussed in Chapter 5, that in
order to study the computational complexity problems of quantum computing it is sufficient
to allow only amplitudes from the set {−1, −4/5, −3/5, 0, 3/5, 4/5, 1}.
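A short check, offered as an illustration and not as the argument of Chapter 5, shows that the rational amplitudes ±3/5 and ±4/5 are compatible with the probability conditions. Since

$$\left(\tfrac{3}{5}\right)^2 + \left(\tfrac{4}{5}\right)^2 = \tfrac{9}{25} + \tfrac{16}{25} = 1,$$

the matrix

$$\begin{pmatrix} \frac{3}{5} & \frac{4}{5} \\ -\frac{4}{5} & \frac{3}{5} \end{pmatrix}$$

has columns of Euclidean norm 1 whose inner product is $\frac{12}{25} - \frac{12}{25} = 0$, and it is therefore unitary. The particular matrix is an assumption chosen for this example.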
Remark 1.3.7 Figure 1.5 summarizes, very informally, the main features of classical versus
quantum computations. As we shall see in Section 1.6, quantum computation, as determined
by quantum evolution only, is a deterministic process, contradicting widespread naive beliefs,
and probabilities appear only when “creatures” of the classical world try “to observe” the
outcomes of quantum processes. On the other hand, modern complexity theory considers
probabilistic computations as the main mode of efficient computing.
It is one of the profound problems in science to determine what classical and quantum
worlds actually are and where a borderline between them is, if there is any. Interestingly
enough, even some of the founders of quantum mechanics have been very careful about it.
For example, Bohr avoided referring explicitly to two types of worlds. He only emphasized
a need to use two different languages to talk about quantum and classical phenomena.
Hilbert29 space is a mathematical framework suitable for describing the concepts, prin-
ciples, processes and laws of quantum mechanics. Pure states of quantum systems are
considered to be vectors of a Hilbert space. One can say that to each isolated quantum sys-
tem corresponds a Hilbert space. Some even go farther by claiming that there is no reality
on the quantum level; such a reality emerges only in the case of a measurement, and what
we know about the quantum level are only computational procedures, expressed in terms of
Hilbert space concepts, to compute evolutions of quantum systems and probabilities of the
measurement outcomes.
Let us start with the two most important examples of Hilbert spaces for quantum me-
chanics.30
Example 1.4.1 (Hilbert spaces l2 (D)) For any countable set D, let l2 (D) be the space
29 David Hilbert (1862–1943), a German mathematician and logician. Hilbert was perhaps the most
influential mathematician of his period. After work on the theory of invariants, he developed a new approach
to abstract algebra and functional analysis. Of key importance for quantum mechanics are his abstract spaces
bearing his name. He made important contributions to algebraic number theory, functional analysis, integral
equations and to variation calculus. He also worked on several fundamental problems of physics.
30 von Neumann’s idea to formulate quantum mechanics in terms of Hilbert spaces was one of the most
We say that l2 (D) is a Hilbert space with respect to the inner product ⟨ · | · ⟩ : l2 (D) ×
l2 (D) → C,32 defined by
$$\langle x_1 | x_2 \rangle = \sum_{i \in D} x_1^*(i)\, x_2(i).$$
Elements of l2 (D) are usually called vectors (to be indexed by elements of D). The notation
l2 = l2 (N) is usually used in the case D = N.
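The inner product of l2 (D) can be sketched in a few lines for vectors with finitely many nonzero entries. The representation below (a dictionary from elements of D to complex values, with missing keys read as 0) and the name `inner` are assumptions made for this illustration only.

```python
def inner(x1, x2):
    """Inner product <x1|x2> = sum over i in D of conj(x1(i)) * x2(i).

    Vectors are dicts mapping elements of D to complex values;
    an element that is missing from a dict has value 0.
    """
    return sum(complex(v).conjugate() * x2[i] for i, v in x1.items() if i in x2)

# Two hypothetical vectors indexed by the countable set D = {"a", "b", "c", ...}.
x = {"a": 1 + 1j, "b": 2}
y = {"a": 1j, "c": 5}

print(inner(x, y))        # conjugate-linear in the first argument
print(inner(y, x))        # the complex conjugate of inner(x, y)
print(inner(x, x).real)   # the squared norm of x
```

Note that the conjugate is taken on the first argument, in agreement with the convention used in this book.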
Our second example of a Hilbert space is actually the main one considered in quantum
mechanics. However, it is needed less for dealing with the very basic concepts of quantum
computing.
Example 1.4.2 (Hilbert space L2 ) 33 Let (a, b) be an interval, with finite or infinite
bounds, on the real axis. By L2 ((a, b)), or simply L2 , we denote the set of all complex
valued functions f such that $\int_a^b |f(x)|^2\,dx$ exists, equipped with the inner product
$$\langle f | g \rangle = \int_a^b f^*(t)\, g(t)\, dt < \infty.$$
If f and g are such that |f |2 and |g|2 are integrable functions34 on (a, b), then so are functions
cf and f + g, for any complex number c, and therefore L2 is a linear space.35
Hilbert spaces are discussed in more detail in Appendix, Section 9.2. In this section only
the very basic concepts and results are summarized.
⟨φ|ψ⟩ = ⟨ψ|φ⟩∗ ,
⟨ψ|ψ⟩ ≥ 0 and ⟨ψ|ψ⟩ = 0 if and only if ψ = 0,
⟨ψ|c1 φ1 + c2 φ2⟩ = c1 ⟨ψ|φ1⟩ + c2 ⟨ψ|φ2⟩.36
31 x∗ denotes the conjugate of the complex number x; i.e., x∗ = a − bi if x = a + bi, where a, b are real.
32 In this book we use the following notation for sets of numbers: C—the set of complex numbers; R—the
set of reals; Q—the set of rationals; Z—the set of whole numbers; N (N≥0 )—the set of (nonnegative)
integers.
33 Hilbert studied the spaces l2 and L2 in his work on linear integral systems, and that is why von Neumann
named all spaces of such types Hilbert spaces.
34 With respect to Lebesgue measure.
35 To be more precise, L2 is to be the set of Lebesgue integrable functions on (a, b), and we do not consider
as different a pair of functions that differ only on a set of measure zero. In such a linear space the zero
element is a function that is equal to zero almost everywhere on (a, b).
36 Caution! In more mathematically oriented literature the third axiom for the inner product often has
the form ⟨c1 φ1 + c2 φ2 |ψ⟩ = c1 ⟨φ1 |ψ⟩ + c2 ⟨φ2 |ψ⟩. In order to transfer results from one of these two axiomatic
approaches to the other, conjugate values have to be taken.
Exercise 1.4.4 Show the following properties of the norm for φ, ψ ∈ H, a ∈ C: (a)
||φ|| ≥ 0 for all φ ∈ H; (b) ||φ|| = 0 if and only if φ = 0; (c) ||φ + ψ|| ≤ ||φ|| + ||ψ||
(triangle inequality); (d) ||aφ|| = |a| ||φ||; (e) |⟨φ|ψ⟩| ≤ ||φ|| ||ψ|| (Schwarz inequality).
Unit norm vectors of an inner-product space H are also called (pure) states of H.37
Exercise 1.4.5 Show the following properties for the distance mapping for any φ, ψ, χ ∈
H: (a) dist(φ, ψ) ∈ R≥0 ; (b) dist(φ, ψ) = dist(ψ, φ); (c) dist(φ, ψ) = 0 if and only if
φ = ψ; (d) dist(φ, ψ) ≤ dist(φ, χ) + dist(χ, ψ).
fφ (ψ) = ⟨φ|ψ⟩
is a linear mapping on H in the sense that fφ (cψ) = cfφ (ψ) and fφ (ψ1 + ψ2 ) = fφ (ψ1 ) +
fφ (ψ2 ). One can even show that we get all linear mappings from H to C by this construction.
Namely, it holds:
The space of all linear mappings (called also functionals) of a Hilbert space H forms
again a Hilbert space, the so-called dual Hilbert space (or conjugate Hilbert space)
H ∗ with the inner product ⟨f |g⟩ = ⟨φf |φg⟩, for f, g ∈ H ∗ .
The mapping fφ (ψ) = ⟨φ|ψ⟩ is a functional for any φ ∈ H. Therefore, the last theorem
establishes a bijection between H and H ∗ , and H ≅ H ∗ . On the basis of this relation the
handy “ket-bra” notation, due to Dirac,38 can be introduced.
37 The idea of representing pure quantum states of a quantum system by unit vectors of a Hilbert space,
one of the key ideas of modern quantum theory, is due to von Neumann (1932).
38 Paul Adrien Maurice Dirac (1902–1984), an English physicist. He formulated a version of quantum
mechanics that took into account the theory of relativity. He shared the 1933 Nobel prize with E. Schrödinger.
Together with E. Fermi he determined the laws of statistical mechanics of a system of atoms and he envisioned
the existence of positrons.
Exercise 1.4.8 Let A be a set of states of a Hilbert space H, each of which has norm 1, and
suppose there is an ε > 0 such that ||φ − ψ|| ≥ ε if φ, ψ ∈ A, φ ≠ ψ. Is it true that the set A has to
be finite?
Definition 1.4.9 Two vectors φ and ψ of a Hilbert space are called orthogonal, notation
φ ⊥ ψ, if ⟨φ|ψ⟩ = 0. A set S ⊆ H is orthogonal if any two of its elements are orthogonal.
S is orthonormal if it is orthogonal and all its elements have norm 1.
Exercise 1.4.10 Show that if φ, ψ are distinct elements of an orthonormal set, then
||φ − ψ|| ≤ √2.
Exercise 1.4.11 Show that if n nonzero vectors of a Hilbert space are mutually orthog-
onal, then they are linearly independent.
Remark 1.4.12 The key role of orthogonality for quantum computing is that whenever a
measurement is performed on a quantum system, then those quantum states that lead to
distinguishable outcomes have to be mutually orthogonal. No measurement or observation
of a quantum system is able to distinguish faithfully between non-orthogonal states.
It can be shown that all bases of a Hilbert space H have the same cardinality, called the
dimension of H. In addition, two Hilbert spaces, H1 and H2 , of the same dimension are
isomorphic. A d-dimensional Hilbert space will be denoted by Hd .
Exercise 1.4.15 Show that the dimension of a finite dimensional Hilbert space H is the
maximum number of linearly independent vectors of H.
Finite dimensional Hilbert spaces correspond, for example, to such properties of parti-
cles as spin value or polarization. Infinite dimensional Hilbert spaces correspond to such
properties of particles as position or momentum. In the case of quantum automata the
corresponding Hilbert space has as an orthonormal basis the set of all configurations of the
automaton. In the case of space bounded computations, the corresponding Hilbert space is
finite dimensional; otherwise it is infinite dimensional.
There are several other equivalent and important definitions of the orthonormal basis of
Hilbert spaces:
Theorem 1.4.16 Let B be an orthonormal set of a Hilbert space H. The following state-
ments are equivalent:
1. B is an orthonormal basis of H.
Theorem 1.4.19 For each closed subspace W of a Hilbert space H there exists a unique
subspace W ⊥ such that ⟨φ|ψ⟩ = 0, whenever φ ∈ W and ψ ∈ W ⊥ and each ψ ∈ H can be
uniquely expressed in the form ψ = φ1 + φ2 , with φ1 ∈ W and φ2 ∈ W ⊥ . In such a case we
write H = W ⊕ W ⊥ and we say that W and W ⊥ form an orthogonal decomposition of H.
H = W1 ⊕ W2 ⊕ . . . ⊕ Wn ,
of H into mutually orthogonal subspaces W1 , . . . , Wn such that each ψ ∈ H has a unique
representation as ψ = φ1 + φ2 + . . . + φn , with φi ∈ Wi , 1 ≤ i ≤ n.
1.4.2 Operators
Definition 1.4.20 A linear operator on a Hilbert space H is a linear mapping A : H → H.
The set of all linear operators of a Hilbert space H will be denoted L(H). L(H1 , H2 ) will
stand for the set of all linear operators from a Hilbert space H1 into the Hilbert space H2 .
An application of a linear operator A to a vector |ψ⟩ is denoted A|ψ⟩ or A(|ψ⟩). A is also
a linear operator of the dual Hilbert space H ∗ , mapping each linear functional ⟨φ| of the
dual space to the linear functional, denoted by ⟨φ|A. (If ⟨φ|A is applied to a vector, then
A is applied first and then ⟨φ|.) A linear operator A is called positive or semi-definite,
notation A ≥ 0, if ⟨ψ|Aψ⟩ ≥ 0 for every |ψ⟩ ∈ H.
Each linear operator A of a countable Hilbert space H with a basis B = {|θi⟩ | i ∈ I} can
be represented by a matrix, in general infinite dimensional, whose rows and columns are
labeled by elements of I and with ⟨θi |A|θj⟩ in the i-th row and j-th column. In such a case
a row indexed by i ∈ I is the vector ⟨θi |A and a column indexed by j is the vector A|θj⟩.39
The norm ||A|| of a linear operator A is defined as ||A|| = sup_{||φ||=1} ||A|φ⟩||. A linear
operator A is called bounded if its norm ||A|| is finite. A linear operator is bounded if and
only if it is continuous.
Exercise 1.4.21 Show that if M is a complex matrix of degree d such that |M (i, j)| ≤ ε
for all i, j, then ||M || ≤ dε.
Exercise 1.4.23 Let A, A1 , A2 be linear operators. Show: (a) ||A1 A2 || ≤ ||A1 || ||A2 ||;
(b) ||A1 || − ||A2 || ≤ ||A1 + A2 || ≤ ||A1 || + ||A2 ||.
39 Linear operators are often identified with the corresponding matrices. This correspondence is always
related to some basis and if no basis is explicitly mentioned, then it should be clear implicitly from the
context which basis is considered.
Exercise 1.4.25 Show that if A is a linear operator and A∗ exists, then ||A∗ || = ||A||.
3. The trace (the sum of diagonal elements) of a Hermitian matrix A, notation Tr(A),
equals the sum of its eigenvalues.
Actually, the last two properties hold for all matrices. In addition, if A ≥ 0 is a Hermitian
matrix, then there exists a unique matrix B ≥ 0 such that B · B = A. This B is then denoted
√A.
A self-adjoint operator A of a finite dimensional Hilbert space H has the so-called spectral
representation. If λ1 , . . . , λk are its distinct eigenvalues, then A can be expressed in
the form
$$A = \sum_{i=1}^{k} \lambda_i P_i,$$
where Pi is the projection operator onto the subspace of H spanned by the eigenvectors
corresponding to λi .
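The spectral representation can be checked by hand on a small case. The sketch below uses, as an assumed example, the 2 × 2 matrix A with rows (0, 1) and (1, 0), whose eigenvalues are +1 and −1 with unit eigenvectors (1, 1)/√2 and (1, −1)/√2; the helper `outer` is a hypothetical name for the projection |v⟩⟨v|.

```python
import math

def outer(v):
    """Projection |v><v| onto the span of a unit vector v."""
    return [[vi * complex(vj).conjugate() for vj in v] for vi in v]

s = 1 / math.sqrt(2)
# Eigenvalues and unit eigenvectors of the example matrix A (assumed known).
eigenpairs = [(+1, [s, s]), (-1, [s, -s])]

A = [[0, 1], [1, 0]]
# Rebuild A as the sum of lambda_i * P_i over its distinct eigenvalues.
reconstructed = [[sum(lam * outer(v)[i][j] for lam, v in eigenpairs)
                  for j in range(2)] for i in range(2)]
max_error = max(abs(reconstructed[i][j] - A[i][j])
                for i in range(2) for j in range(2))
print(max_error < 1e-12)
```

Here each eigenvalue has a one-dimensional eigenspace, so each Pi is a single outer product; for a degenerate eigenvalue Pi would be the sum of such products over an orthonormal basis of the eigenspace.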
that can be computed according to the rules under which the preparation phase was done.
Testing is a randomized process.
Quantum tests can produce non-numerical values, for example, the polarizations of photons.
In order to make a quantum measurement out of a quantum test, different numerical
labels have to be associated with different outcomes of the test.
An observable is a property of the physical system that can be measured. In classical
physics, position, speed and momentum are examples of observables. In quantum theory a
(sharp) observable is a self-adjoint operator.
In quantum mechanics we used to consider the measurement of states mainly with respect
to sharp observables. The numerical outcome of the measurement of a pure state |ψ⟩ with
respect to an observable A is one of the eigenvalues of A and the side effect of such a
measurement is a “collapse” of |ψ⟩ into a state |ψ′⟩. In the measurement the eigenvalue λi
is obtained with probability
$$\Pr(\lambda_i) = \langle\psi|P_i|\psi\rangle = \|P_i|\psi\rangle\|^2$$
and the new state |ψ′⟩, into which |ψ⟩ collapses, has the form
$$|\psi'\rangle = \frac{P_i|\psi\rangle}{\sqrt{\langle\psi|P_i|\psi\rangle}}.$$
This means that a measurement of |ψ⟩ with respect to A irreversibly destroys |ψ⟩, unless
|ψ⟩ is an eigenvector of A.
Sometimes it is important to know the average value of an observable A when a state
|ψ⟩ is measured. This average value, denoted ⟨A⟩ψ , is defined to be ⟨ψ|Aψ⟩ = ⟨ψ|A|ψ⟩.
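The measurement rule can be traced numerically on a two-dimensional example. The sketch below assumes the standard rule that the eigenvalue λi is obtained with probability ⟨ψ|Pi|ψ⟩, and uses a hypothetical observable with eigenvalues +1 and −1 whose projections are onto the two basis vectors; the state and all function names are chosen for illustration only.

```python
import math

def apply(P, psi):
    """Apply a matrix P (list of rows) to a vector psi."""
    return [sum(P[i][j] * psi[j] for j in range(len(psi))) for i in range(len(P))]

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

psi = [0.6, 0.8j]                                   # a unit vector
spectral = [(+1, [[1, 0], [0, 0]]),                 # (eigenvalue, projection P_i)
            (-1, [[0, 0], [0, 1]])]

results = []
for lam, P in spectral:
    p = inner(psi, apply(P, psi)).real              # probability <psi|P_i|psi>
    collapsed = [c / math.sqrt(p) for c in apply(P, psi)]   # state after the outcome
    results.append((lam, p, collapsed))

# Average value: sum of eigenvalue times probability, equal to <psi|A|psi>.
average = sum(lam * p for lam, p, _ in results)
print([(lam, round(p, 6)) for lam, p, _ in results], round(average, 6))
```

The two probabilities add up to 1 because the projections add up to the identity, and the average value agrees with ⟨ψ|A|ψ⟩ computed directly.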
Exercise 1.4.28 Show that if |ψ1⟩, . . . , |ψn⟩ are eigenvectors of a self-adjoint operator
A that form an orthonormal basis and λ1 , . . . , λn are the corresponding eigenvalues, then
for every vector $|\psi\rangle = \sum_{i=1}^{n} \alpha_i |\psi_i\rangle$, it holds $\langle\psi|A|\psi\rangle = \sum_{i=1}^{n} |\alpha_i|^2 \lambda_i$.
In the last 10–15 years it has been shown that unsharp observables and the corresponding
POV measurements, see Section 9.2.8, are of key importance in many cases. In quantum
information processing they play an important role in quantum cryptography (Chapter 6)
and in quantum information theory (Chapter 8).
(x1 y1 , . . . , x1 yn , x2 y1 , . . . , x2 yn , . . . , xm y1 , . . . , xm yn )
Exercise 1.4.29 Show that the tensor product of vectors is an associative operation.
Exercise 1.4.30 Show that if x, y are vectors, c ∈ R, then: (a) c(x ⊗ y) = (cx) ⊗ y =
x ⊗ (cy); (b) x ⊗ z + y ⊗ z = (x + y) ⊗ z; (c) x ⊗ y + x ⊗ z = x ⊗ (y + z).
as follows
$$A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{n1}B & \cdots & a_{nn}B \end{pmatrix}$$
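The block structure of A ⊗ B translates directly into code. The following is a minimal sketch with matrices stored as lists of rows; the function name `kron` and the two example matrices are assumptions of this illustration.

```python
def kron(A, B):
    """Tensor (Kronecker) product of matrices given as lists of rows.

    The result consists of blocks a_ij * B, arranged as in the matrix A.
    """
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

X = [[0, 1], [1, 0]]
I = [[1, 0], [0, 1]]
for row in kron(X, I):
    print(row)
```

Applied to a 1 × m row vector and a 1 × n row vector, the same function yields the vector (x1 y1 , . . . , x1 yn , . . . , xm y1 , . . . , xm yn ), so the tensor product of vectors is the special case of a single row.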
$$B_1 \otimes \ldots \otimes B_k = \bigotimes_{i=1}^{k} B_i = \{x_1 \otimes \ldots \otimes x_k \mid x_i \in B_i\}$$
$$H = \bigotimes_{i=1}^{k} H_i.$$
Example 1.4.31 If H2 is the two-dimensional vector space with the basis B2 = {|0⟩, |1⟩},
where |0⟩ = (1, 0)T , |1⟩ = (0, 1)T , then
$$\bigotimes_{i=1}^{n} B_2 = \{|x_1\rangle \otimes |x_2\rangle \otimes \ldots \otimes |x_n\rangle \mid x_1 \ldots x_n \in \{0,1\}^n\}$$
and instead of |x1⟩ ⊗ |x2⟩ ⊗ . . . ⊗ |xn⟩ the following notations are often used:
If {αi }i∈I is a basis of a Hilbert space H, {βj }j∈J is a basis of a Hilbert space H ′ , then
in H ⊗ H ′ the inner product is induced by the mapping ⟨αi βj |αk βl⟩ = ⟨αi |αk⟩⟨βj |βl⟩.
To each mixed state [ψ⟩ corresponds a density operator ρ[ψ⟩ . If [ψ⟩ = |φ⟩ for a pure
state |φ⟩, then ρ[φ⟩ = |φ⟩⟨φ|. If $[\psi\rangle = \bigoplus_{i=1}^{k} (p_i, |\phi_i\rangle)$, where |φi⟩ are pure states, then
$$\rho_{[\psi\rangle} = \sum_{i=1}^{k} p_i\, |\phi_i\rangle\langle\phi_i|.$$
Exercise 1.4.34 Let |φ⟩ and |ψ⟩ be two states and |ψ⟩ = U |φ⟩ for a unitary operator U .
Express |ψ⟩⟨ψ| in terms of the matrices |φ⟩⟨φ| and U .
The representation of pure states depends on the choice of the basis. The same is true
for density operators, which are uniquely represented in the matrix form through density
matrices. Denote by $M_\rho^B$ the matrix representation of the density operator ρ, with respect
to the basis B. The concepts of density operator and density matrix are often considered
as synonymous and the basis is considered to be clear from the context, unless described
explicitly.
As discussed in more detail on page 372, the same density matrix can correspond to different
mixed states.
Instead of “a density matrix ρ” one can sometimes read also “a state ρ” because ρ is
indistinguishable from the mixed states to which ρ corresponds.
In general, ρ is a density operator41 if ρ is Hermitian, ρ ≥ 0 and Tr(ρ) = 1. This implies
40 A more modern term for an unisolated quantum system is “open” system.
41 Density operators were introduced by von Neumann (1932).
ρ = ρ∗ .
A general form of a density matrix is:
$$\rho = \sum_{i,j=1}^{n} p_{ij}\, |\phi_i\rangle\langle\phi_j|,$$
where $p_{ij} = p_{ji}^*$, $\sum_{i=1}^{n} p_{ii} = 1$ and |φi⟩ are pure states.
To each density operator ρ there is a basis Bρ in which the matrix $M_\rho^{B_\rho}$ is diagonal. This
is used to define f (ρ) for functions f : C → C as follows:
$$M_{f(\rho)}^{B_\rho}(i, j) = f\big(M_\rho^{B_\rho}(i, j)\big).$$
For example, in this sense one understands such operations as $\sqrt{\rho\rho^*}$ or lg ρ.
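The degree of ignorance embodied in a mixed state $[\psi\rangle = \bigoplus_{i=1}^{k} (p_i, \phi_i)$ is represented by
its (quantum) von Neumann entropy
$$S(\rho_{[\psi\rangle}) = -\mathrm{Tr}\big(\rho_{[\psi\rangle} \lg \rho_{[\psi\rangle}\big).$$
One can show that S(ρ[ψ⟩ ) ≤ H(p1 , . . . , pk ), where H is the classical Shannon entropy.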
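The inequality can be observed on a small example. The sketch below assumes the usual definition of the von Neumann entropy as the Shannon entropy of the eigenvalues of ρ, and takes as a hypothetical mixed state the equal-weight mixture of (1, 0) and (1, 1)/√2, whose density matrix has rows (0.75, 0.25) and (0.25, 0.25). For a 2 × 2 density matrix the eigenvalues follow from the trace (equal to 1) and the determinant.

```python
import math

def shannon(ps):
    """Shannon entropy -sum p lg p (base 2), with 0 lg 0 taken as 0."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Density matrix of the assumed mixture: weights 1/2, 1/2 on (1,0) and (1,1)/sqrt(2).
rho = [[0.75, 0.25], [0.25, 0.25]]

# Eigenvalues of a 2x2 Hermitian matrix with trace 1: (1 +- sqrt(1 - 4 det)) / 2.
det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
root = math.sqrt(1 - 4 * det)
eigenvalues = [(1 + root) / 2, (1 - root) / 2]

S = shannon(eigenvalues)      # von Neumann entropy of rho
H = shannon([0.5, 0.5])       # Shannon entropy of the mixing probabilities
print(round(S, 4), round(H, 4), S <= H)
```

The inequality is strict here because the two pure states in the mixture are not orthogonal; for an orthogonal mixture the two entropies coincide.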
One of the profound differences between the quantum and classical systems lies in the
relation between a system and its subsystems. As discussed below, a state of a Hilbert space
H = HA ⊗ HB cannot always be decomposed into states of its subspaces HA and HB .
We also cannot define any natural mapping from the space of linear operators on H into
the space of linear operators on HA (or HB ). However, density operators are much more
robust and that is also one reason for their importance. A density operator ρ on H can be
“projected” into HA by the operation of tracing out HB , to give the density operator (for
finite dimensional Hilbert spaces):
$$\rho_{H_A} = \mathrm{Tr}_{H_B}(\rho) = \sum_{|\phi\rangle, |\phi'\rangle \in B_{H_A}} |\phi\rangle \Big( \sum_{|\psi\rangle \in B_{H_B}} \langle\phi, \psi|\rho|\phi', \psi\rangle \Big) \langle\phi'|,$$
$$\rho(H, \psi, H') = \mathrm{Tr}_{H'}\, |\psi\rangle\langle\psi|.$$
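Tracing out a subsystem is easy to carry out on matrices. The sketch below is an illustration under stated assumptions: the basis vector |a⟩|b⟩ of HA ⊗ HB is given the index a·dim(HB) + b, the function name `partial_trace_B` is hypothetical, and the example state (|00⟩ + |11⟩)/√2 is chosen because it cannot be decomposed into states of the two factors.

```python
import math

def partial_trace_B(rho, dim_a, dim_b):
    """Trace out the second factor of H_A (x) H_B.

    The basis vector |a>|b> is assumed to have index a * dim_b + b.
    """
    return [[sum(rho[a * dim_b + b][a2 * dim_b + b] for b in range(dim_b))
             for a2 in range(dim_a)] for a in range(dim_a)]

s = 1 / math.sqrt(2)
psi = [s, 0, 0, s]                       # the pure state (|00> + |11>)/sqrt(2)
rho = [[x * complex(y).conjugate() for y in psi] for x in psi]   # |psi><psi|

rho_A = partial_trace_B(rho, 2, 2)
print([[round(entry.real, 6) for entry in row] for row in rho_A])
```

Although the state of the whole system is pure, the reduced density matrix is one half of the identity matrix, that is, a mixed state.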
(open) quantum systems, pure states can naturally evolve into mixed states (which can
also be described as pure states of a larger system, composed of a given system and the
environment). For unisolated quantum systems the basic elements are therefore mixed
states, or density matrices, and evolutions are performed by the so-called superoperators42
(certain positive and trace preserving linear transformations over density matrices), acting
on density matrices (see page 97, and also Davies (1976), for more detail).
which is just the average of (1.2) over mixed states with density matrix ρ2 . Several other
concepts of fidelity are considered in Chapter 8.
Remark 1.4.36 The importance of mixed states and especially of density matrices for
quantum mechanics theory in general, and for quantum information processing in particu-
lar, is growing. For quantum computation this lies in the following. Real quantum computers
can rarely be in pure states, rather they are in mixed states, and interact with their envi-
ronment leading to non-unitary evolutions. Research in quantum error-correcting codes and
in fault-tolerant computation has shown that non-unitary evolutions, in the case of weak
interactions with the environment, do not need to imply a loss of quantum computational
42 Sometimes the term “quantum operator” or even simply “operator” is used instead of “superoperator”.
43 A new meaning of density matrices, which gives them the same ontological status as the wavefunction
describing a pure state is discussed by Aharonov and Anandan (1998).
power. Computation with mixed states has been shown (see Section 2.3.5) to be as powerful
as with pure states. Finally, some of the technologies being used for experimenting in
quantum computing work with highly mixed states.
Remark 1.4.37 The Hilbert space is a very nice and useful abstraction of physical reality.
However, one should not forget that real quantum computing is not performed in a Hilbert
space but in laboratories.
1.5 Experiments
The assignment of complex amplitudes to quantum events, superposition rules and the
special way of considering probabilities in the quantum world are quite puzzling. Naturally
a question arises: why is it so?
One of the possible answers is: this is the way it works in the quantum world; such are its
rules (see Feynman et al. 1964).
In this section we take a little more didactic approach and describe, in a very simplified
“textbook form”44 , several basic quantum experiments that suggest and justify the above
mentioned principles and rules (see Feynman et al. 1964).
[Figure 1.6: Experiment with bullets. Panels (a), (b), (c); labels: gun, wall with holes H1 and H2, wall with detector; curves P1(x), P2(x), P12(x).]
The second experiment, this time with waves on the water, is illustrated in Figure 1.7a. A
generator vibrates and makes waves. They move toward the first wall, with two holes again,
and then to the second wall with a detector, which can detect the intensity I(x) = |h(x)|² of
the wave (or its energy), i.e. the square of the height-amplitude $h(x) = e^{i\phi(x)}\sqrt{I(x)}$ of the
wave. In the next Figure 1.7b, the curve shows the level of the intensity I1 (x) (I2 (x)) for the
case that only the hole H1 (H2 ) is open. The results are again as expected. The intensity
curve, for the case that both holes are open, is shown in Figure 1.7c, and from the wave
theory it is well known that this is a consequence of the wave interference phenomenon. In
this case actually I12 (x) = |h1 (x) + h2 (x)|². This means that in some cases we have positive
and in some cases negative interference. It is also well known in which positions there are
local minima and where local maxima (as well as how large they are).
[Figure 1.7: Experiments with waves. Panels (a), (b), (c); labels: wave source, wall with holes H1 and H2, wall with detector; curves I1(x), I2(x), I12(x).]
probabilities that electrons reach given positions on the second wall. The results are shown
in Figure 1.8b, by the curve P1 (x) (P2 (x)) for the case that electrons reach the position x at
the second wall and that only one slit, namely H1 (H2 ), is open. Again, the results are as
expected, the maxima are exactly at points where the straight lines from the source through
the slits reach the second wall. However, contrary to our intuition, in the case that both slits
are open we get the curve P12 (x), shown in Figure 1.8c, similar to that in Figure 1.7c. Very
surprisingly, at some places one observes fewer electrons when both slits are open than in
the case only one slit is open!47 A similarity of Figures 1.7 and 1.8 indicates that electrons,
particles, sometimes behave as waves!
There seem to be two surprising conclusions one can draw from these experimental
results. By opening the second slit, it suddenly seems that electrons are somehow prevented
from doing what they could do before! It seems that by allowing an electron to take an
alternative route we have actually stopped it from traversing either route.
Electrons are particles, but they seem to have a wave-like behaviour as they pass through
the holes! Each particle seems to behave as if it is going through both holes at once and
afterwards creating waves that interfere, as in the second, wave, experiment.48 However, we
cannot predict the precise path for any electron.
Observe that by opening the second slit, the number of electrons arriving at some places
47 This is, actually, only a rough idea of a potential experiment. In order really to function, its components
would need to have extraordinarily small sizes. However, from similar and really feasible experiments, it is
pretty well known how the results would look in the experiment considered here.
48 In our experiment, electrons behave as being little packets of waves and they well demonstrate the
“wave-particle” duality in quantum mechanics. According to this duality principle, fundamental quantum
objects are neither waves nor particles, but sometimes one thing or the other, or perhaps always a little of
both—see Lindley (1996).
[Figure 1.8: Two slit experiment. Panels (a), (b), (c); labels: source of electrons, wall with slits H1 and H2, wall with detector; curves P1(x), P2(x), P12(x).]
increases four times and at some places decreases to zero. Adjacent local minima
(maxima) are known to be separated by the distance lλ/L, where l is the distance between the two walls, L is
the distance between the two slits and λ is the wavelength.
The similarity of the curves I12 (x) and P12 (x) in Figures 1.7 and 1.8 can be seen as a motivation for
trying to assign a complex amplitude to the fact that an electron reaches a position, and then
to find the corresponding probabilities in a similar way to how we found the intensities in
the previous example and so to have interference responsible for counter-intuitive outcomes.
Indeed, it turns out that one can assign to the event that an electron reaches a position
x going through the first (second) slit a complex amplitude ψ1 (x) (ψ2 (x)) in such a way
that P1 (x) = |ψ1 (x)|² , P2 (x) = |ψ2 (x)|² and P12 (x) = |ψ1 (x) + ψ2 (x)|² . Now it is easy to
see the reason for oscillations in the case of P12 (x). Indeed, we get
$$P_{12}(x) = |\psi_1(x) + \psi_2(x)|^2 = P_1(x) + P_2(x) + 2\,\mathrm{Re}(\psi_1(x))\,\mathrm{Re}(\psi_2(x)) + 2\,\mathrm{Im}(\psi_1(x))\,\mathrm{Im}(\psi_2(x)),$$
where Re(y) and Im(y) denote the real and imaginary parts of y. Oscillations are due to
the term 2Re(ψ1 (x))Re(ψ2 (x)) + 2Im(ψ1 (x))Im(ψ2 (x)).49
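The role of the interference term can be seen numerically. The sketch below evaluates P1, P2, the term 2Re(ψ1)Re(ψ2) + 2Im(ψ1)Im(ψ2) and P12 = |ψ1 + ψ2|² for two amplitudes of equal magnitude and several relative phases; the particular amplitudes are hypothetical values chosen for illustration, not data from an experiment.

```python
import cmath

def probabilities(psi1, psi2):
    """Return P1, P2, the interference term, and P12 = |psi1 + psi2|^2."""
    p1, p2 = abs(psi1) ** 2, abs(psi2) ** 2
    interference = 2 * psi1.real * psi2.real + 2 * psi1.imag * psi2.imag
    p12 = abs(psi1 + psi2) ** 2
    return p1, p2, interference, p12

# Hypothetical amplitudes of equal magnitude with a relative phase between them.
for phase in (0.0, cmath.pi / 2, cmath.pi):
    psi1 = 0.5 + 0j
    psi2 = 0.5 * cmath.exp(1j * phase)
    p1, p2, term, p12 = probabilities(psi1, psi2)
    # P12 always equals P1 + P2 + interference term, up to rounding.
    print(round(p1 + p2, 6), round(term, 6), round(p12, 6))
```

With equal phases P12 is four times P1, and with opposite phases it vanishes, which matches the observation that opening the second slit increases the number of electrons four times at some places and decreases it to zero at others.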
It is also important to mention that the results of this experiment do not depend on the
frequency with which electrons are shot. The same interference pattern would be obtained,
49 However, since there is such a similarity between the interference produced by waves, as the elements
of the classical world, and the interference exhibited by quantum phenomena, one may wonder how it is
possible that quantum computers can be essentially more powerful than classical ones. As discussed in more
detail in Section 2.1, it is mainly quantum entanglement that has no counterpart in the classical world and
makes quantum computation so powerful.
in the end, if each electron is shot only after the previous one hits the wall or if there is an
interval of several years between two consecutive electrons.50 It is also important to realize
that quantum physics has no explanation for where a particular electron reaches the detector
wall. All quantum physics can offer are statements on the probability that an electron
reaches a certain position on the detector wall.
In order to illustrate another quantum phenomenon, to be discussed later, we consider
a modification of the previous experiment—with a measurement during the experiment.
[Figure 1.9: Two-slit experiment with an observation. Panels (a), (b), (c); labels: source of electrons, light source, wall with slits H1 and H2, wall with detector; curves P1(x), P2(x), P12(x).]
In the experiment depicted in Figure 1.9a, the basic setting is similar to the previous
experiment. However, this time we have in addition a source of light on the right hand side
of the first wall, just in the middle between the two slits. If we now watch the experiment
we can detect (as indicated by the small black circle in the figure) through which slit a
particular electron passes the first wall. If it goes through the slit H1 , some light
appears for a moment in its neighborhood, as a reflection; if it passes through the slit H2 ,
we see some light near that slit.
Again, we can determine the probabilities that electrons reach positions on the second
wall for the case where only one slit is open, Figure 1.9b, and for the case where both slits
are open, Figure 1.9c. The curves in Figure 1.9b are similar to those in Figure 1.7b, as
expected. However, the curve for the case where both slits are open is different from that
in Figure 1.8c, and actually similar to that in Figure 1.6c. This is again a counter-intuitive
phenomenon. The explanation is that the resulting behaviour of electrons is due to the fact
that we have been observing (or at least could observe) their behaviour by putting a light
source next to the slits. In this case an observation or a measurement results in the loss
of interference.
50 A similar one-particle interference has been observed with photons, neutrons and atoms.
We again have a particular case of a very well known phenomenon in the quantum world.
A quantum system behaves differently when it is observed than when it is not! Moreover,
the interference pattern would also disappear if we changed our original electron experiment in
some other way in order to find out which way electrons go. This can also be seen as
another example of the uncertainty principle of quantum mechanics, see Section 9.1.2. Either
you observe an interference pattern or you can find out which way the electron went, but
you cannot do both. Seeing the interference pattern and detecting an electron are both
measurements that cannot be performed in the same experiment—one has to choose one
or the other. Observe also that detecting which slit an electron went through is sort of a
“particle measurement”; recording the interference pattern is a “wave measurement”. One
can do one or the other, but not both.
A slight modification of the previous experiment is that instead of a light source we have
a single photon. In such a case there is an interaction between the electron and photon. If
the photon has a short wavelength, then the interference pattern disappears. However, if a
photon of a longer wavelength is used (thereby reducing the momentum kick conveyed to
the electron), then the interference pattern is restored when the wavelength is greater than
the slit distance.
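The gradual loss and restoration of interference can be mimicked by a simple model that is not part of the text above and is offered only as a sketch under stated assumptions: suppose each path leaves the photon in some “marker” state, and let the parameter `overlap` stand for the inner product of the two marker states. The detection probability is then |ψ1|² + |ψ2|² + 2Re(ψ1∗ ψ2 · overlap), so that overlap 1 (no path information) gives full interference and overlap 0 (perfect path information) gives none. The function name and the amplitudes are hypothetical.

```python
def detection_probability(psi1, psi2, overlap):
    """Probability at a screen position when each path leaves a which-slit marker.

    overlap is the assumed inner product of the two marker states:
    1.0 means the markers are identical, 0.0 means they are orthogonal.
    """
    cross = 2 * (psi1.conjugate() * psi2 * overlap).real
    return abs(psi1) ** 2 + abs(psi2) ** 2 + cross

# Hypothetical amplitudes at a position where the two paths cancel.
psi1, psi2 = 0.5 + 0j, -0.5 + 0j

unobserved = detection_probability(psi1, psi2, 1.0)   # interference: dark fringe
observed = detection_probability(psi1, psi2, 0.0)     # probabilities simply add
partial = detection_probability(psi1, psi2, 0.5)      # partially restored pattern
print(unobserved, observed, partial)
```

In this picture a short-wavelength photon corresponds to an overlap near 0 and a long-wavelength photon to an overlap near 1; that correspondence is an interpretation of the model, not a derivation.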
In the case of an additional photon between the slits, it is useful to turn the experiment
on its head and to ask, on the basis of outcomes, whether there was an attempt to measure
the way the electron went. The resulting pattern of measurement answers the question. If
one sees the interference, then there was no measurement. This observation, namely, that
we can detect the existence of a measurement from the outcome is the basis of the quantum
cryptography discussed in Chapter 6.
The above example suggests how laws of quantum physics can be used to detect eavesdropping
during quantum communications. Let Alice be shooting electrons and let Bob observe
the probability pattern the electrons produce. If he sees no interference pattern then he knows that
somebody tried to observe flying electrons; otherwise he can be sure that nobody tried to
interfere with electrons.
Another modification of the basic two-slit experiment is obtained by placing another
particle with spin before the first wall and close to one of the slits. In such a case an electron
passing through the slit interacts with the particle and flips the spin (see DiVincenzo and
Terhal, 1998). This interaction with the environment is called decoherence and in this
case decoherence causes the interference pattern to disappear.51
A different kind of experiment, the so-called “delayed choice experiment”, employs as the
basic component an often used device, a beam splitter, which sends the incoming photon
one way or another, with equal probability, see Figure 1.10. If the light from the laser is
divided in this way, the two beams are physically distinct, but nevertheless coherent. The
two routes along which photons travel can be as long as needed. Assume that on one route
we install a photon detector, distant enough that we can wait until the photon has passed
through the beam splitter to switch the detector on or off. With the detector off we have a
standard means of creating an interference pattern. With the detector on, we are actually
asking which way the photon went, and therefore we lose the interference pattern.
51 In essence we keep discussing the same problem over and over because each measurement is an
interaction with the environment and each interaction with the environment can be seen as a measurement.
[Figure 1.10: labels: laser, beam splitter, detector, wall.]
The experiment was performed in 1922 by Otto Stern and Walther Gerlach in Frankfurt (see
Figure 1.11). They shot a beam of atoms with random magnetic orientation (thought of,
for this experiment, as little bar magnets, with North and South Poles) through a magnetic
field, graded by intensity from top to bottom. The magnetic field is created by two vertically
positioned magnets, to sort atoms according to their magnetic orientation, and the results
are recorded on a screen. It was expected that the atoms emerging from the magnetic field
would be smeared out, when they hit the screen, into a range of all possible deflections, with
all possible orientations of their magnetic axes (as it would be the case with real magnets).
Instead of that, it was discovered that atoms emerging from the magnetic field struck the
screen in exactly two places, each with only one orientation, say “up” or “down”, each with
equal probability, so they came up in a “half-up and half-down manner”. Moreover, the
same phenomenon appeared when the magnets themselves were turned ninety degrees so
that their magnetic field was horizontal and not vertical. (See Figure 1.12a). Again, the
atoms hit the screen in exactly two spots, to the left and right of the original line of the
beam and again with the same probability. We can say they came out in a “half-left and
half-right” manner.
Actually, no matter how the magnetic field was lined up, it always split the beam of
atoms into two. It was as if each atom were forced somehow to take up one or the other of
just two possible orientations, dictated by the alignment of the magnets.
[Figure 1.12: Stern-Gerlach magnets, panels (a) and (b).]
It can be demonstrated that magnets do not physically ‘sort’ atoms passing through by
directly manipulating their magnetic axes. The quantum theory explanation is the following
one: Passing an atom through a magnetic field amounts to a measurement of its magnetic
alignment, and until you make such a measurement there is no sense in saying what the
atom’s magnetic alignment might be; only when you make a measurement do you obtain
one of only two possible outcomes, with equal probability, and those two possibilities are
defined by the direction of the magnetic field that you use to make the measurement.
Finally, let us see what happens if one of the streams of atoms that came out
of the vertically aligned magnet passes through another Stern-Gerlach magnet (see
Figure 1.12b). If the second magnet is again vertically aligned, the stream just gets through
without being divided into two. However, if the stream goes through horizontally aligned
magnets, then it again comes out in two streams, in a “half-right and half-left manner”.
Quantum mechanics provides the following explanation: In the first magnet a measurement
is performed with respect to one orientation, up or down. If the second magnet
performs the same measurement, then the same outcome is obtained. However, if atoms with
orientation “up” (or “down”) are measured in the second magnet with respect to the orientation
“left or right”, then half of them get oriented left and half get oriented right.
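The explanation above can be checked numerically. The following is a minimal sketch, assuming a spin-1/2 particle whose "up"/"down" states are the vectors (1, 0) and (0, 1) and whose "left"/"right" states are their normalized difference and sum; these vector choices and the helper name `outcome_probabilities` are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Assumed representation: vertical ("up"/"down") basis states of a spin-1/2 particle.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
# Assumed representation: horizontal ("left"/"right") basis states.
left = (up - down) / np.sqrt(2)
right = (up + down) / np.sqrt(2)

def outcome_probabilities(state, basis):
    """Return |<b|state>|^2 for every basis vector b (hypothetical helper)."""
    return [abs(np.vdot(b, state)) ** 2 for b in basis]

# A stream that left the first (vertical) magnet in the "up" beam.
p_vertical = outcome_probabilities(up, [up, down])        # second magnet vertical
p_horizontal = outcome_probabilities(up, [left, right])   # second magnet horizontal

print("vertical magnet:  ", p_vertical)
print("horizontal magnet:", p_horizontal)
```

In this sketch the vertical magnet passes the stream undivided, while the horizontal magnet splits it into two streams of equal weight.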
The same experiment can be performed with electrons and other particles. The property
of a particle measured by Stern-Gerlach magnets is called spin.52 A particle is called a
spin-n/2 particle if the Stern-Gerlach magnets sort the particles into exactly n + 1 possible
outcomes, with equal probability for each outcome.
The Stern-Gerlach experiment has become an important part of the empirical foundations
of quantum theory. It helped to discover the basic principles of quantum measurements.
Remark 1.5.1 For a more detailed treatment of basic experiments of quantum mechanics,
with respect to unsharp observables, see Busch et al. (1997).
52 Of course, there is much more to say about spins, see, for example, Peres (1993).
In order to explain the basic principles of quantum computing, and to develop quantum
algorithms and networks, it is not necessary to explore why the strange behaviour of quantum
systems appears as discussed in Section 1.5. It is sufficient to use some basic empirically-
based principles of quantum behaviour.53 In order to formulate these principles we need to
introduce some basic concepts: events, states, amplitudes of events, evolution, compound
quantum systems and measurement.
Remark 1.6.1 We are adhering here to the standard (or canonical or orthodox) interpretation
of quantum mechanics, that is, to the one actually used by physicists to predict and analyze
experimental results. One can also say that we are using the Copenhagen interpretation
of quantum mechanics in the sense of Dirac and von Neumann,54 in which quantum states
are considered as a complete description of reality, measurements are projections, and each
physical quantity is represented by a Hermitian operator, called an observable.
Principle 1.6.2 The probability p that an event happens is given as $p = |\alpha|^2$, where $\alpha$ is a
complex number called the probability amplitude, or simply amplitude,56 of the event.
The amplitude of an event with the initial state IS and a final state FS is usually denoted
by
$$\langle FS \mid IS \rangle,$$
53 See Feynman et al. (1964) and Berthiaume (1997). However, it is not possible to derive the whole of
quantum theory from empirically based postulates, and additional postulates have to be introduced with
mathematical intuition as the main guide. (For example, the requirement of completeness for Hilbert spaces.)
54 John von Neumann (1903-1957), an American mathematician and physicist of Hungarian origin, one of
the leading scientists of his period. He made lasting contributions to almost all areas of modern mathematics
and its applications as well as to theoretical physics, especially quantum physics. Von Neumann was one of
the leading scientists in the development of the first very powerful electronic computer as well as of the first
atomic bomb.
55 In quantum mechanics, a state is a ray in the corresponding Hilbert space. By a ray, an equivalence class
of vectors is understood, whose members differ by multiplication with a non-zero complex number. Often the
vectors with the norm 1 are considered as representing such equivalence classes. Instead of the term
“quantum state”, the term “wave-function” is often used in (older) physics-oriented literature.
56 In some sense, probability amplitudes can be seen as “complex square roots of probabilities”.
1.6. QUANTUM PRINCIPLES 41
Principle 1.6.3 (a) If an event can be decomposed into two sequential subevents, then the
amplitude of the event is the product of amplitudes of the subevents.
(b) If an event consists of several alternative and independent subevents, then the amplitude
of the event is the sum of the amplitudes of all subevents considered separately.
$$\langle x|s\rangle = \langle x|\mathrm{wall}\rangle\langle \mathrm{wall}|s\rangle
= \langle x|1\rangle\langle 1|1\rangle\langle 1|s\rangle + \langle x|2\rangle\langle 2|2\rangle\langle 2|s\rangle,$$
where $\langle x|i\rangle$ is the amplitude of an electron arriving at x, given that it came out of the slit i.
Similarly, $\langle i|s\rangle$ is the amplitude of the event of having an electron arriving at the slit i after
leaving the source s. In addition, $\langle i|i\rangle$ expresses the amplitude that an electron arriving at
the slit i leaves through the same slit.
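To see what the two composition rules imply, it may help to plug in numbers. The sketch below assumes purely hypothetical values for the amplitudes $\langle i|s\rangle$ and $\langle x|i\rangle$ (equal magnitudes, opposite phases on the two paths) and takes $\langle i|i\rangle = 1$; it compares the quantum rule (add amplitudes, then square) with the classical rule (add probabilities).

```python
import cmath
import math

# Hypothetical amplitudes, chosen only for illustration.
slit_from_source = {1: 1 / math.sqrt(2), 2: 1 / math.sqrt(2)}   # <i|s>
x_from_slit = {1: cmath.rect(1 / math.sqrt(2), 0.0),            # <x|1>
               2: cmath.rect(1 / math.sqrt(2), math.pi)}        # <x|2>, opposite phase

def path_amplitude(i):
    """Sequential subevents: the amplitude is the product <x|i><i|i><i|s> with <i|i> = 1."""
    return x_from_slit[i] * 1 * slit_from_source[i]

def probability(amplitude):
    return abs(amplitude) ** 2

# Alternative subevents: amplitudes are added before squaring.
p_quantum = probability(path_amplitude(1) + path_amplitude(2))
# Classical reasoning would add the probabilities of the two paths instead.
p_classical = probability(path_amplitude(1)) + probability(path_amplitude(2))

print("quantum:", p_quantum, "classical:", p_classical)
```

With these assumed phases the two paths cancel at x (destructive interference), although each path alone has probability 1/4.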
Remark 1.6.4 $\langle x|1\rangle\langle 1|s\rangle$ is the amplitude that an electron arrives at x through the slit $H_1$
after leaving s. However, from that we cannot conclude that $|\langle x|1\rangle\langle 1|s\rangle|^2$ is the probability
that an electron actually passes through the slit $H_1$ to reach x. On the other hand, if we
choose to detect the electron at the slit $H_1$, by magnifying the effect of its presence in the slit
to the classical level, then we can see $|\langle 1|s\rangle|^2$ as the probability that the electron is actually
present at the slit $H_1$. For interference to occur, we need to ensure that the passage
of the electron through the slit remains on the quantum level.
In our experiment, slits are natural elements for expressing events. We can even consider
general events $\langle i|j\rangle$, $i, j \in \{1, 2\}$. In such a case:
$$\langle 1|1\rangle = \langle 2|2\rangle = 1, \qquad \langle 2|1\rangle = \langle 1|2\rangle = 0.$$
A similar situation, as in the above experiment, occurs in quantum systems in general.
Whenever an appropriate set of basis states (conditions) is chosen, then any event can be
decomposed into events entering and leaving these basis states. This motivates the following
definition.
Definition 1.6.5 The set B = {i | i denotes a state} is a set of basis states, if for all
$i, j \in B$, $\langle i|j\rangle = \delta_{ij}$, i.e.,
$$\langle i|j\rangle = \begin{cases} 1, & \text{if } i = j; \\ 0, & \text{otherwise;} \end{cases}$$
and for any initial state X and any final state Y it holds
$$\langle Y|X\rangle = \sum_{i \in B} \langle Y|i\rangle\langle i|X\rangle. \qquad (1.3)$$
The next principle says that such basis states always exist.
Principle 1.6.6 Any event can be described in terms of a set of states of a basis, often
called basis states if the basis is fixed, by giving the transition amplitudes to and from those
basis states.
Our experiment, shown in Figure 1.8, seems to have only one natural set of basis states
(electrons are at the slits), but actually, for any experiment and any quantum system, there
are infinitely many sets of basis states, and all these sets of basis states have the same
cardinality.57
By the first principle, the squared absolute value of an amplitude gives the probability of
the corresponding event. By the last principle, the amplitude of an event is the sum of
amplitudes of subevents corresponding to the given set of basis states. The set of basis
states is to be “complete” (none can be added), two basis states have to be orthogonal (this
refers to the fact that $\langle i|j\rangle = 0$ if $i \neq j$, $i, j \in B$), and, finally, the sum of probabilities for
reaching a basis state from any initial state has to be one. We have therefore a motivation
for the next principle.
Principle 1.6.7 The amplitude of an event is a sum of amplitudes of events corresponding
to the given set of basis states. In addition, for any set B of basis states and for any initial
state X
$$\sum_{i \in B} |\langle i|X\rangle|^2 = 1.$$
The proper mathematical framework to deal with quantum systems is that of Hilbert
spaces, as discussed in Section 1.4 and in more detail in Appendix 9.2. The basic principle
is:
Principle 1.6.8 To each quantum system there corresponds a Hilbert space whose dimension
equals the maximum number of reliably distinguishable states of the system.
In the Hilbert space framework, to each state X of the system there correspond a (column) state
vector, or ket-vector, $|X\rangle$ and also a (row) bra-vector $\langle X|$. In addition, a scalar
product “·” of a bra-vector and a ket-vector is defined in such a way that $\langle X| \cdot |Y\rangle = \langle X|Y\rangle$
for any states X, Y of the system.
For ket- and bra-vectors similar relations hold as in 1.3. They are summarized in the
next principle:
Principle 1.6.9 For any states X, Y of a quantum system, and any set B of basis states it
holds:
$$|X\rangle = \sum_{i \in B} |i\rangle\langle i|X\rangle,^{58} \qquad (1.4)$$
57 Basis states might be various possible locations of a particle, or some other properties of a particle,
such as its spin value. For any nonnegative integer n there is a spin-n/2 particle. Its spin has n + 1 possible
values and the corresponding Hilbert space has dimension n + 1. For example, pions are spin-0 particles,
and electrons are spin-1/2 particles.
58 In more mathematically-oriented literature the equality 1.4 is usually written in the form
$|X\rangle = \sum_{i \in B} \langle i|X\rangle |i\rangle$.
and
$$\langle Y| = \sum_{i \in B} \langle Y|i\rangle\langle i|. \qquad (1.5)$$
In addition, if $|X\rangle = \sum_{i \in B} \alpha_i |i\rangle$, then $\langle X| = \sum_{i \in B} \alpha_i^* \langle i|$.
Observe that equations 1.4 and 1.5 can be seen as being obtained from 1.3 by abstracting
either from the final condition or from the initial condition.
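The decomposition 1.3 and the normalization condition of Principle 1.6.7 are easy to test numerically. The following is a minimal sketch, assuming a 4-dimensional Hilbert space, randomly chosen states X and Y, and a randomly chosen orthonormal basis obtained from a QR decomposition; all of these choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(dim):
    """A random normalized complex vector (hypothetical helper)."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

dim = 4
X, Y = random_state(dim), random_state(dim)

# The columns of a unitary matrix form an orthonormal set of basis states.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
basis = [Q[:, i] for i in range(dim)]

# np.vdot conjugates its first argument, so np.vdot(Y, X) is <Y|X>.
direct = np.vdot(Y, X)
via_basis = sum(np.vdot(Y, b) * np.vdot(b, X) for b in basis)      # equation 1.3
total_probability = sum(abs(np.vdot(b, X)) ** 2 for b in basis)    # Principle 1.6.7

print(direct, via_basis, total_probability)
```

Any other orthonormal basis would give the same two results, which is the content of the claim that infinitely many sets of basis states exist.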
59 For a more detailed discussion of the measurement problem see Appendix 9.1.4.
Principle 1.6.12 Let $|\phi\rangle$ be a state and $O = \{E_1, \ldots, E_k\}$ be an observable. $|\phi\rangle$ can be
expressed uniquely, as a linear superposition of its components (projections) along each of
the $E_i$’s:
$$|\phi\rangle = \sum_{i=1}^{k} \alpha_i |\phi_{E_i}\rangle,$$
where $|\phi_{E_i}\rangle$ is a state in $E_i$ (the projection of $|\phi\rangle$ into $E_i$), and $\langle\phi_{E_i}|\phi_{E_i}\rangle = 1$ for all i.
(Uniqueness is up to a phase factor.)
An observation (measurement) of $|\phi\rangle$ by O has the following consequences.
1. One of the subspaces $E_1, \ldots, E_k$, say $E_i$, is selected and the value $\mu(E_i)$ is produced.
The probability that a subspace $E_i$ is selected is $|\alpha_i|^2$.
2. After the observation, the state $|\phi\rangle$ “collapses” into the (renormalized) state $|\phi_{E_i}\rangle$.60
3. The only classical information given by O is the value of the function $\mu$. In the case
$\mu(E_i) = i$, this is just the information which of the subspaces $E_1, \ldots, E_k$ was selected (or
into which of the subspaces the state $|\phi\rangle$ was projected). All information not in $|\phi_{E_i}\rangle$ is
irreversibly lost.
Example 1.6.13 Let us assume that we have a quantum system with exactly two basis
states $|0\rangle$ and $|1\rangle$. (And therefore a two-dimensional Hilbert space corresponds to it.)61
The so-called standard observable for a state $|\phi\rangle = \alpha|0\rangle + \beta|1\rangle$, with $|\alpha|^2 + |\beta|^2 = 1$,
is $B = \{E_0, E_1\}$, where $E_i$, i = 0, 1, is the linear subspace generated (spanned) by the vector
$|i\rangle$. An example of another, so-called “dual”, observable for $|\phi\rangle$ is $O = \{E_0', E_1'\}$, where $E_0'$
is the linear subspace generated by the vector $|0'\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $E_1'$ is the subspace
generated by the vector $|1'\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$.62
then we say that 1.6 expresses the state being measured, the measurement being made
(given by the observable defined by the bases) and the probabilities of various measurement
outcomes: $\{|\alpha_i|^2\}_{i=1}^{n}$. (In such a case it is implicitly assumed that the numerical value is
provided by the mapping $\mu(E_i) = i$.)
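The difference between the standard and the dual observable of Example 1.6.13 can be illustrated with a few lines of code. This is a sketch only: the state being measured is assumed to be $|0'\rangle$, and the helper name `measurement_probabilities` is hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket0_dual = (ket0 + ket1) / np.sqrt(2)   # |0'>
ket1_dual = (ket0 - ket1) / np.sqrt(2)   # |1'>

def measurement_probabilities(state, observable):
    """Probabilities |alpha_i|^2 of selecting each one-dimensional subspace E_i."""
    return [abs(np.vdot(e, state)) ** 2 for e in observable]

phi = ket0_dual   # assumed example state
p_standard = measurement_probabilities(phi, [ket0, ket1])
p_dual = measurement_probabilities(phi, [ket0_dual, ket1_dual])

print("standard observable:", p_standard)
print("dual observable:    ", p_dual)
```

For this assumed state the standard observable gives a completely random answer, while the dual observable gives a definite one.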
Remark 1.6.14 Each outcome of a quantum measurement is given with a certain probabil-
ity that is uniquely determined by the state being measured and the observable being used.
However, these probabilities are not a “cause of our ignorance”. One cannot improve them
using more sophisticated measurements. Results of quantum mechanics are probabilistic not
because of our insufficient understanding of the reality, but because quantum theory itself
has “nothing more to say”, or even because Nature has “nothing more to say”.
60 A state $|\phi\rangle$ is seen as having “norm” 1 if $\langle\phi|\phi\rangle = 1$. If $\langle\phi|\phi\rangle = k > 0$, then for the state
$|\phi'\rangle = \frac{1}{\sqrt{k}}|\phi\rangle$ it holds that $\langle\phi'|\phi'\rangle = 1$. $|\phi'\rangle$ is said to be obtained from $|\phi\rangle$ by
“renormalization”.
61 We can well imagine $|0\rangle$ as the (column) vector $(1, 0)^T$ and $|1\rangle$ as the (column) vector $(0, 1)^T$.
62 We will see in Section 2.2 how one can make use of such an observable. The basic idea is that the same
state observed through different observables can give a definite answer in one case and a completely random
answer in another case.
In a very general sense, a quantum system is whatever admits a closed dynamical de-
scription within quantum theory.
So far, we have considered only the static case, namely that the initial condition (state
vector) $|X\rangle$ does not change after being set. We could then consider the amplitudes $\langle Y|X\rangle$
for different Y. In the evolution of a quantum system, in particular in a computational
process, some transformation of the initial state has to be performed. On the physical side,
some apparatus A is used. On the mathematical side, some operator A is used that maps
one state into another.
The main question of interest now is the following one: what is the amplitude of the
event $\langle Y|\,A \text{ applied to } X\rangle$? Or, in a more common notation, what is the value of
$$\langle Y|A|X\rangle.$$
Fortunately, all such operators A of quantum systems are well understood. They are
linear operators of a special form:
Principle 1.6.18 To an evolution in an isolated quantum system there corresponds a transformation
by a unitary operator in its Hilbert space. Relative to a given basis B, a unitary
operator P is represented by a unitary matrix $M_P^B$ such that, for $i, j \in B$, $M_P^B[i, j]$ is the
amplitude of the transition from the state j to the state i.
Since unitary matrices preserve the norm of state vectors they can be seen as performing
rotations on quantum states. This actually means that all we can do with quantum states
is to “rotate” them.
In principle, to any unitary matrix there exists a quantum system evolving according to
that matrix. However, the real design of such systems can give rise to formidable techno-
logical problems.
On the other hand, a variety of experimental techniques have already been developed
to realize unitary evolutions of two-state quantum systems. This is of practical importance
because there is a technique, see Section 5.1, to decompose large unitary matrices into a
product of simpler ones, which represent the evolutions of two-state systems.
In a more detailed way, the evolution of a quantum system S is described by the linear
Schrödinger equation64
$$i\hbar \frac{\partial |\psi(t)\rangle}{\partial t} = H(t)|\psi(t)\rangle,$$
where $\hbar$ is the reduced Planck constant, H(t) is an observable of S called the Hamiltonian of the
system at time t (a quantum analogue of the Hamiltonian of classical mechanics, the total energy
of the system, which can be represented by a Hermitian matrix), and $|\psi(t)\rangle$ is the
state of S at time t. $\psi(t)$ is also called the wave function of S.
64 Erwin Schrödinger (1887-1961), an Austrian theoretical physicist. He formulated the basic equation for
quantum evolution that now bears his name. For this equation and the total contribution to quantum
mechanics he shared, with P. Dirac, the 1933 Nobel prize for physics. Schrödinger was the first to notice, in
1935, entanglement as a phenomenon of quantum physics.
In the case where the Hamiltonian is time independent, the formal solution of the
Schrödinger equation has the form
$$|\psi(t)\rangle = U(t)|\psi(0)\rangle$$
and
$$U(t) = e^{-iHt/\hbar}$$
is the evolution operator which can be represented by a unitary matrix.
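As an illustration, the evolution operator of a small system can be computed and its unitarity checked directly. The sketch below assumes units in which $\hbar = 1$, an arbitrarily chosen 2 × 2 Hermitian matrix as the Hamiltonian, and computes the matrix exponential through the eigendecomposition of H; the function name `evolution_operator` is hypothetical.

```python
import numpy as np

def evolution_operator(H, t, hbar=1.0):
    """U(t) = exp(-i H t / hbar) for a time-independent Hermitian matrix H."""
    energies, vectors = np.linalg.eigh(H)            # H = V diag(E) V^*
    phases = np.exp(-1j * energies * t / hbar)
    return vectors @ np.diag(phases) @ vectors.conj().T

# Illustrative Hermitian matrix; the values are assumptions, not from the text.
H = np.array([[1.0, 0.5 - 0.5j],
              [0.5 + 0.5j, -1.0]])

U = evolution_operator(H, t=0.7)
psi0 = np.array([1, 0], dtype=complex)
psi_t = U @ psi0

print(np.round(U.conj().T @ U, 10))   # should be the identity matrix
print(np.linalg.norm(psi_t))          # the norm of the state is preserved
```

The same helper also shows the composition property U(t1 + t2) = U(t1) U(t2) that holds for a time-independent Hamiltonian.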
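Exercise 1.6.19 State which of the following matrices are unitary or Hermitian:
$$\text{(a)}\ \begin{pmatrix} \frac{1+i}{2} & \frac{1-i}{2} \\ \frac{1-i}{2} & \frac{1+i}{2} \end{pmatrix} \qquad
\text{(b)}\ \begin{pmatrix} 2 & 1-3i \\ 1+3i & 5 \end{pmatrix} \qquad
\text{(c)}\ \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}.$$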
There is also another way to see that linear operators representing quantum evolution
in finite dimensional quantum systems have to be unitary.
A quantum evolution operator A has to map quantum states into quantum states. This
implies that for any state x it has to hold
$$\langle Ax|Ax\rangle = \langle x|x\rangle = 1$$
and therefore
$$\langle x|x\rangle = \langle A^*Ax|x\rangle,$$
which yields
$$A^*A = I.$$
The last equation also implies that if Ax = y, then $A^*y = x$, and therefore each quantum
process is reversible and $A^*$ is the operator for the reverse process corresponding to the
process given by A.
If A is a finite matrix, then $AA^* = I \iff A^*A = I$. However, this is not the case for
infinite matrices. For example, if
$$A = \begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 0 & 1 & 0 & \cdots \\
0 & 0 & 0 & 0 & 1 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$
then $AA^* = I$, but $A^*A \neq I$.
Finally, observe that the equality $AA^* = I$ is equivalent to the assertion that row vectors
of A are orthonormal and the property $A^*A = I$ is equivalent to the claim that column
vectors of A are orthonormal.
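The infinite matrix cannot be stored, but a rectangular truncation already shows the asymmetry between rows and columns. The following sketch assumes we keep the first n rows and the first n + 1 columns (which contain all non-zero entries of those rows); the function name `truncated_A` is hypothetical.

```python
import numpy as np

def truncated_A(n):
    """First n rows and n + 1 columns of the infinite matrix A from the example."""
    A = np.zeros((n, n + 1))
    A[0, 0] = A[0, 1] = 1 / np.sqrt(2)
    for k in range(1, n):
        A[k, k + 1] = 1.0      # row k has its single 1 in column k + 1
    return A

n = 5
A = truncated_A(n)             # A is real, so its adjoint is the transpose

rows_orthonormal = np.allclose(A @ A.T, np.eye(n))
columns_orthonormal = np.allclose(A.T @ A, np.eye(n + 1))

print(rows_orthonormal, columns_orthonormal)
```

The first two columns are equal, so they cannot be orthogonal, which is why the second check fails in this sketch.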
Exercise 1.6.21 Show that if a matrix A is such that $A^*A = I$, then A is unitary if and
only if the mapping defined by A is surjective.
The quantum evolution of a pure state $|\phi_0\rangle$ can also be seen as a sequence65 of pure
states $|\phi_0\rangle, |\phi_1\rangle, |\phi_2\rangle, \ldots$ and unitary operators $U_1, U_2, U_3, \ldots$ such that
$|\phi_i\rangle = U_i|\phi_{i-1}\rangle$ for $i = 1, 2, 3, \ldots$.
Similarly, the evolution of a density matrix $\rho_0$ can be seen as a sequence of density
matrices $\rho_0, \rho_1, \rho_2, \ldots$ and unitary operators $U_1, U_2, U_3, \ldots$ such that
$\rho_i = U_i \rho_{i-1} U_i^*$ for $i = 1, 2, 3, \ldots$.
The following property of the tracing out operation is also of importance:
where $U_A$ ($U_B$) is a unitary operation on the Hilbert space $H_A$ ($H_B$) and $\rho$ is a density
matrix of $H = H_A \otimes H_B$. This implies that we can commute two important operations on
density matrices: an evolution step and a tracing out operation, in the following sense. We
can either first perform an evolution step of $\rho$ in H and then trace out $H_B$, or first trace
out $H_B$ and then make an evolution step on the resulting density matrix in $H_A$.
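The commutation of an evolution step with tracing out can be checked numerically. The sketch below assumes that the property in question is the identity $\mathrm{Tr}_B\big((U_A \otimes U_B)\,\rho\,(U_A \otimes U_B)^*\big) = U_A\,\mathrm{Tr}_B(\rho)\,U_A^*$, which matches the verbal description above; the dimensions, the random inputs and all helper names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(d):
    """Unitary factor of the QR decomposition of a random complex matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

def random_density_matrix(d):
    """A positive semidefinite matrix with trace 1."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = M @ M.conj().T
    return rho / np.trace(rho)

def trace_out_B(rho, dA, dB):
    """Partial trace over the second factor of H_A (x) H_B."""
    return np.einsum('ibjb->ij', rho.reshape(dA, dB, dA, dB))

dA, dB = 2, 3
rho = random_density_matrix(dA * dB)
UA, UB = random_unitary(dA), random_unitary(dB)
U = np.kron(UA, UB)

evolve_then_trace = trace_out_B(U @ rho @ U.conj().T, dA, dB)
trace_then_evolve = UA @ trace_out_B(rho, dA, dB) @ UA.conj().T

print(np.allclose(evolve_then_trace, trace_then_evolve))
```

The local unitary on the traced-out system drops out because the trace is invariant under unitary conjugation.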
Principle 1.6.23 Let $S_1$ and $S_2$ be two quantum systems and let $H_1$ and $H_2$ be the corresponding
Hilbert spaces. Let the compound system of $S_1$ and $S_2$ be S. It holds:
Remark 1.6.24 The state spaces of n particles combine classically through the Cartesian
product and quantumly through the tensor product. In order to understand quantum
computation, it is crucial to see the difference between the Cartesian and the tensor product. The
Cartesian product of two spaces X × Y has dimension dim(X × Y) = dim(X) + dim(Y).
For the tensor product X ⊗ Y we have dim(X ⊗ Y) = dim(X) × dim(Y).
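A small numerical sketch makes the difference in dimensions concrete. The example vectors are arbitrary, and the helper `dimensions` (for n two-dimensional systems) is a hypothetical name introduced only for this illustration.

```python
import numpy as np

x = np.array([1.0, 2.0])          # a vector in a 2-dimensional space X
y = np.array([3.0, 4.0, 5.0])     # a vector in a 3-dimensional space Y

cartesian = np.concatenate([x, y])   # an element of X x Y: dimensions add
tensor = np.kron(x, y)               # an element of X (x) Y: dimensions multiply

def dimensions(n_systems, d=2):
    """(Cartesian, tensor) dimension of the joint space of n d-dimensional systems."""
    return d * n_systems, d ** n_systems

print(cartesian.size, tensor.size)
print(dimensions(10))
```

For ten two-dimensional systems the Cartesian product has dimension 20, whereas the tensor product has dimension 1024.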
Remark 1.6.25 The most general type of measurements, the so-called generalized measurements
or POVM measurements, are discussed in Section 9.2.8. In some cases they allow one
to extract more information from a quantum state than projection measurements, and therefore
they need to be considered as possible tools of an eavesdropper when the security of
quantum cryptographic protocols is investigated.
66 Surprisingly, we have no really satisfactory reason for assigning objective existence to physical quantities
as distinguished from numbers which we correlate with them. There are no reasons to suppose that a particle
has at any moment a definite, but unknown, position which may be revealed by a measurement. On the
contrary, we run into contradictions by assuming that (see Peres, 1993).
The original motivation for the study of reversibility in classical computing came from the
observation that heat dissipation is one of the major obstacles for miniaturization of classical
computers and the fact that the second law of thermodynamics implies that irreversible state
changes during computation must dissipate heat.67 This is nowadays known (see Bennett,
1998b) as:
Landauer’s principle. To erase a bit of classical information within a computer,
1 bit of entropy must be expelled into the computer’s environment (typically
in the form of waste heat).
The importance of the investigations of the classical reversible computing follows from
the fact that such computations are special cases of quantum computations.
A B | A NAND B
0 0 | 1
0 1 | 1
1 0 | 1
1 1 | 0
(c) NAND
Figure 1.13: Classical Boolean gates
Unfortunately, both AND and NAND are irreversible Boolean operations. By that we
mean that from the output value(s) of the gate one cannot determine unambiguously the
input values; information gets irreversibly lost “during the gate operation”.
A gate, Boolean function, operation or computation is called reversible if its outputs
always contain enough information to deduce the inputs unambiguously. Such
operations, gates and computations are crucial for quantum computing because of the
reversibility of the evolution in quantum physics.
Three reversible gates have turned out to be of special importance: the usual NOT
gate (N), CONTROL NOT gate (CN or CNOT or XOR) (see Figure 1.14a), CONTROL
CONTROL NOT gate (CCN or CCNOT), see Figure 1.14b, also called Toffoli gate.
67 In modern computers heat dissipation is about $10^8\,kT$ per logical operation. The heat must be removed
by external means, for example by constant cooling of all components of classical computers through the thermal
coupling of the circuits to a heat reservoir, i.e. air. However, for quantum computing such a cooling by
thermal coupling is not an option because it would lead to decoherence effects (see Section 7.2.2) and would
destroy the superpositions of states, an important source of the power of quantum computing.
1.7. CLASSICAL REVERSIBLE GATES AND COMPUTING 51
[Figure 1.14: the N gate (A → Ā), (a) the CN gate and (b) the CCN gate, with their truth tables.]
CN gate (control bit A, target bit B):
A B | A′ B′
0 0 | 0 0
0 1 | 0 1
1 0 | 1 1
1 1 | 1 0
CCN gate (control bits A and B, target bit C):
A B C | A′ B′ C′
0 0 0 | 0 0 0
0 0 1 | 0 0 1
0 1 0 | 0 1 0
0 1 1 | 0 1 1
1 0 0 | 1 0 0
1 0 1 | 1 0 1
1 1 0 | 1 1 1
1 1 1 | 1 1 0
In the CN gate A′ = A, i.e. the input A gets through unchanged. The filled circle on
the first wire represents a control in the following sense: if A = 0, then ⊕ on the second wire
just lets the signal B get through and therefore B′ = B. If A = 1, then ⊕ on the second
wire acts as a NOT gate and B′ = B̄. In the CCN gate A′ = A and B′ = B. The ⊕ on the
last wire acts as a NOT gate, but only if A = B = 1.
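The behaviour of the CN and CCN gates, and the difference from an irreversible gate such as NAND, can be sketched in a few lines. Bits are represented as the integers 0 and 1; the function names below are hypothetical and chosen only for this illustration.

```python
from itertools import product

def cn(a, b):
    """CONTROL NOT: A' = A, and B is flipped exactly when A = 1."""
    return a, b ^ a

def ccn(a, b, c):
    """CONTROL CONTROL NOT (Toffoli): C is flipped exactly when A = B = 1."""
    return a, b, c ^ (a & b)

def nand_gate(a, b):
    """The irreversible NAND gate, with a single output."""
    return (1 - (a & b),)

def is_reversible(gate, n_inputs):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    inputs = list(product((0, 1), repeat=n_inputs))
    return len({gate(*bits) for bits in inputs}) == len(inputs)

for bits in product((0, 1), repeat=2):
    print(bits, "->", cn(*bits))

print(is_reversible(cn, 2), is_reversible(ccn, 3), is_reversible(nand_gate, 2))
```

Both controlled gates are their own inverses: applying CN (or CCN) twice restores the original inputs.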
It has been shown that any reversible Boolean function in $B_n^m$, n ≥ 3, can be computed by
a reversible network composed from the gates N, CN and CCN. It has actually been shown
that the single 3-input and 3-output Toffoli gate (Figure 1.14) and the so-called Fredkin
gate (Figure 1.15b) are universal.68
Exercise 1.7.1 Find all functions $f : \{0, 1\}^3 \to \{0, 1\}$ such that the mapping (a, b, c) →
(a, b, f(a, b, c)) is injective.
Example 1.7.2 It is actually easy to show universality of Fredkin and Toffoli gates. In the
Toffoli gate, if A = 1 then B′ = B ⊕ C; if B = 0, then B′ = A ∧ C; if A = 1, C = 1, then
B′ = B̄. For the Fredkin gate, if C = 0 then B′ = A ∧ B; if B = 0 and C = 1, then B′ = Ā.
Thus, gates AND and NOT are realizable using both Fredkin and Toffoli gates.
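The claims of Example 1.7.2 can be verified by exhaustive enumeration. The sketch below assumes the gate definitions f(a, b, c) = (a, ab ∨ āc, āb ∨ ac) for the Fredkin gate and f′(a, b, c) = (a, b ⊕ (a ∧ c), c) for the Toffoli gate, i.e. the variant with the middle wire as target; the function names are hypothetical.

```python
from itertools import product

def fredkin(a, b, c):
    """Fredkin gate: (a, ab OR (not a)c, (not a)b OR ac)."""
    na = 1 - a
    return a, (a & b) | (na & c), (na & b) | (a & c)

def toffoli(a, b, c):
    """Toffoli gate in the variant with the middle wire as target."""
    return a, b ^ (a & c), c

def example_claims_hold():
    """Check every case distinction of Example 1.7.2 on all eight inputs."""
    for a, b, c in product((0, 1), repeat=3):
        _, tb, _ = toffoli(a, b, c)
        _, fb, _ = fredkin(a, b, c)
        if a == 1 and tb != b ^ c:
            return False
        if b == 0 and tb != a & c:
            return False
        if a == 1 and c == 1 and tb != 1 - b:
            return False
        if c == 0 and fb != a & b:
            return False
        if b == 0 and c == 1 and fb != 1 - a:
            return False
    return True

def fredkin_conserves_ones():
    """The Fredkin gate outputs as many 1's as it receives."""
    return all(sum(fredkin(*bits)) == sum(bits) for bits in product((0, 1), repeat=3))

print(example_claims_hold(), fredkin_conserves_ones())
```

Fixing some inputs to constants in this way is exactly how AND and NOT are obtained from a single reversible gate.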
The fact that for each reversible Boolean function there is a Boolean circuit composed
of copies of a single reversible gate is certainly of interest. However, how important is this
really for computing in general? In most of the computational tasks
68 Observe that in Fredkin’s gate the output always has the same number of 1’s as the input.
In short, the Fredkin gate realizes the function f(a, b, c) = (a, ab ∨ āc, āb ∨ ac) and the Toffoli gate the
function f′(a, b, c) = (a, b ⊕ (a ∧ c), c) or f′(a, b, c) = (a, b, c ⊕ (a ∧ b)). Observe that there are two ways the
Toffoli gate is considered (see Figure 1.15c).
The CN gate, the Toffoli gate and the Fredkin gate were first presented by C. A. Petri in 1965, but
their publication in 1967, in German and in not very widely circulated proceedings, apparently went unnoticed
by most of those working on reversible computing. However, in view of the above fact, it would perhaps
be historically more proper to talk about Petri-Toffoli and Petri-Fredkin gates. Petri has also shown the
universality of these two gates for classical reversible computing.
[Figure 1.15: Fredkin gate and Toffoli gate.]
Figure 1.16: A reversible implementation of a two-bit adder
Exercise 1.7.4 (a) Design a reversible circuit for a three-bit adder; (b) design a reversible
circuit for multiplication of two-bit integers; (c) design a reversible circuit with
six inputs and outputs such that one of the outputs determines whether the input 5-bit
word is a palindrome.
Exercise 1.7.5 Find a necessary and sufficient condition for the transition function of
a one-tape Turing machine $M = \langle \Sigma, Q, q_0, \delta \rangle$ to be reversible.
All these three computational tasks are clearly reversible. Moreover, fanout does not
require any additional garbage space.
Example 1.7.6 For any function f : {0, 1} → {0, 1}, the mapping (x, 0) → (x, f(x)) is
one-to-one and therefore we can compute $f^4(x)$ reversibly as follows:
$f^4(x)$ can now be copied and the “garbage” $f(x), f^2(x), f^3(x)$ can be removed by “uncomputing”
as illustrated in Figure 1.17.
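The compute, copy and uncompute pattern can be sketched in code. This is an illustration under stated assumptions, not the circuit of Figure 1.17: registers hold bit strings encoded as Python integers, each reversible step is taken to be (x, y) → (x, y ⊕ f(x)), which coincides with (x, 0) → (x, f(x)) on a cleared register, and the sample function f is arbitrary. In this sketch the register that held $f^4(x)$ is cleared as well, after its value has been copied to the output.

```python
def compute_with_uncompute(f, x, steps=4):
    """Compute f^steps(x), copy it to an output register, then undo the work registers."""
    regs = [x] + [0] * steps              # regs[0] holds x, the rest start cleared

    for i in range(steps):                # forward pass: regs[i + 1] becomes f^(i+1)(x)
        regs[i + 1] ^= f(regs[i])

    output = 0
    output ^= regs[steps]                 # reversible copy (bitwise CN) of the result

    for i in reversed(range(steps)):      # backward pass: same steps in reverse order
        regs[i + 1] ^= f(regs[i])         # XOR with the same value clears the register

    return regs, output

def sample_f(v):
    """Arbitrary, not necessarily invertible, function on 4-bit values (an assumption)."""
    return (3 * v + 1) % 16

registers, result = compute_with_uncompute(sample_f, 5)
print(registers, result)
```

After the backward pass all work registers are cleared again, so the only remaining data are the input x and the copied result.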
[Figures 1.17 and 1.18: wire labels x, 0, f(x), $f^4(x)$; panels (a) to (e).]
[Figure 1.19 panels: switch gate with inputs c and x and outputs c, cx, c̄x, shown for c = 0 and c = 1.]
Figure 1.19: Switch gate
at a “mirror”, see Figure 1.18c. For example, Figures 1.18d and 1.18e show the billiard ball
model implementation of a shift and a delay of the signals. Figure 1.20 shows a billiard ball
implementation of the switch gate from Figure 1.19.
Switch gates are of importance because with four of them the Toffoli gate can be imple-
mented as shown in Figure 1.21.
Remark 1.7.7 In practice, the (irreversible) computers in use today dissipate orders of magnitude
more heat per bit operation than the theoretical lower bound kT ln 2 given by
Landauer’s principle. However, if computer hardware continues to shrink in size as it has so far,
then the only feasible option to beat Landauer’s lower bound seems to be reversible computation.
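To get a feeling for the size of the bound kT ln 2, one can evaluate it numerically. The sketch below uses the SI value of the Boltzmann constant and assumes a room temperature of 300 K as an example; the function name `landauer_bound` is hypothetical.

```python
import math

BOLTZMANN_K = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)

def landauer_bound(temperature_kelvin):
    """Lower bound k T ln 2 (in joules) on the heat dissipated when one bit is erased."""
    return BOLTZMANN_K * temperature_kelvin * math.log(2)

bound_300K = landauer_bound(300.0)   # 300 K is an assumed example temperature
print(f"kT ln 2 at 300 K: {bound_300K:.3e} J")
```

At the assumed temperature the bound is of the order of $10^{-21}$ joules per erased bit.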
After Bennett’s discovery of the universality of reversibility, the question arose whether
such computers, dissipating no energy, can ever be built. The billiard ball model is clearly
unfeasible: very small imprecisions would soon cause the balls to leave the track. As an
alternative, Bennett (1973) considered a computational model in molecular dynamics and
he was able to show that energy dissipation per bit operation would be much smaller (about
20 to 200 kT). However, that model was also seen as not feasible. Although they had
no direct applications, all these results showed the limits and paved the way to the current
reversible CMOS devices that dissipate very little energy.
[Figures 1.20 and 1.21: billiard ball implementation of the switch gate; Toffoli gate built from switch gates.]
ELEMENTS
INTRODUCTION
The basic elements of quantum computing are easy to identify: quantum bits, quantum
registers, quantum gates and quantum networks. However, at this point the analogy with
classical computing ends. Quantum bits, registers, gates and networks are very different from
their classical counterparts: they have other properties and greater power.
A quantum bit can be in any state within an infinite set of states. A quantum register
of n quantum bits can be in any of the infinitely many superpositions
of $2^n$ basis states. The parallelism a quantum register can exhibit is striking. The key new
feature is that a quantum register can be in an entangled state. On one side, entangled
states with their non-locality features are a hallmark of quantum mechanics. On the other
side, quantum entanglement is an important resource of quantum information processing.
There is a larger variety of quantum gates than of classical gates. There are already
infinitely many one-input/output quantum gates. In addition, almost any two-input/output
quantum gate is universal. A simple two-input/output gate together with one-input rotation
gates forms a set of universal gates.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
58 CHAPTER 2. ELEMENTS
2.1.1 Qubits
Let S be a two-dimensional quantum system with two orthonormal states, denoted $|0\rangle$ and
$|1\rangle$, that can be considered as forming a natural, or standard, or preferred, basis of S.
1 One can also say that a qubit is a unit vector in a two-dimensional inner-product space. For the
representation of qubits we often assume that a particular basis, say $\{|0\rangle, |1\rangle\}$, has been fixed.
2 The term qubit was coined by B. Schumacher (1995). A more classical term is a “two-level system”, or
a “two-state system”.
For the purpose of quantum information processing two basis states $|0\rangle$ and $|1\rangle$ are usually taken as
encoding classical bit values 0 and 1. A classical bit can be seen as a qubit promised to be in one of the
basis states. In general, we can call any two-state system a physical bit and when the system is quantum
and the two states are orthogonal quantum states, we can refer to it as a qubit. Therefore, any two-state
quantum system is a potential candidate for a qubit.
In physics literature the following notation is often used for the states of the standard basis of various
two-level quantum systems: in the case of spin-1/2 particles, and in the case of vertical or horizontal polarization
of photons, $|\uparrow\rangle$ or $|\updownarrow\rangle$ is taken instead of $|0\rangle$, and $|\downarrow\rangle$ or $|\leftrightarrow\rangle$ is used instead of $|1\rangle$; in the case of diagonal
2.1. QUANTUM BITS AND REGISTERS 59
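Exercise 2.1.2 Show that any qubit state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ can be expressed in the form
$|\psi\rangle = \cos\theta|0\rangle + e^{i\phi}\sin\theta|1\rangle$ in the sense that $|\alpha|^2 = |\cos\theta|^2$ and $|\beta|^2 = |e^{i\phi}\sin\theta|^2$.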
Remark 2.1.3 The term “qubit” is often used in a more abstract way—as a variable that
can take on any qubit state (2.1). In such a case we can talk about a (particular) state of a
qubit. This is in accordance with the use of the term “bit” in classical computing where we
talk about a value, or a state, of a bit.
The above definition leaves the actual physical medium of a qubit completely undefined—
as long as objects are treated according to the quantum principles discussed in the previous
chapter.
[Figure 2.1 panels: (a), (b); general state $\alpha|0\rangle + \beta|1\rangle$ with amplitudes $\alpha$, $\beta$ and $|\alpha|^2 + |\beta|^2 = 1$.]
Figure 2.1: Qubit representations by energy levels of an electron in a hydrogen atom and
by a spin-1/2 particle. The condition $|\alpha|^2 + |\beta|^2 = 1$ is a legal one if $|\alpha|^2$ and $|\beta|^2$ are to be
the probabilities of being in one of two basis states (of electrons or photons).
There are many ways to realize qubits, since many interesting and important two-dimensional
quantum systems are known in physics. A qubit can be realized, for example, by the polarization of a
photon or by the ground (n = 1) and excited (n = 2) states of an electron in the hydrogen
atom (Figure 2.1a). One of the most frequently used and best-explored two-level quantum systems is
that of spin-1/2 particles with two basis states: spin-up (notation $|\uparrow\rangle$ or $|0\rangle$) and spin-down
(notation $|\downarrow\rangle$ or $|1\rangle$) (Figure 2.1b).
From the implementation point of view the most promising candidates for qubits are so
far photons, trapped ions and spins of atomic nuclei.
States |0⟩ and |1⟩ of a qubit can be seen, and are often referred to, as representing
classical states (bits).3 The main difference between classical bits and qubits is that while a
classical bit can be set up only to one of the two states, namely 0 or 1, a qubit can take any
quantum linear superposition of |0⟩ and |1⟩, i.e., in principle can be in any of uncountably
polarizations of photons (45◦ and 135◦ ), | րi (| ցi) is used for |0i (|1i) in the case of circular polarization
| i (| i) is used for |0i (|1i).
3 This is in principle incorrect, but it is often a useful simplification, especially for drawing an analogy with classical
computing.
60 CHAPTER 2. ELEMENTS
many states. This means that a large, even infinite, amount of information could potentially
be encoded in amplitudes of a single qubit by appropriately choosing α and β.4 Of course,
this does not mean that in any implementation a qubit can take any of its potentially infinite
number of theoretically possible states.5
One way to represent states of qubits geometrically is as points on the surface of a
unit Riemann6 sphere, where the North and South poles correspond to the basis states (that
correspond to bits) (see Figure 2.2a).7 Qubits can also be represented by points on a Bloch8
sphere, also called the Poincaré sphere (see Allen and Eberly, 1975, and Figure 2.2b), using the
spherical coordinate system. This representation is based on the fact that any qubit can be
represented (see Exercise 2.1.2) as cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩.9
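The Bloch-sphere representation can be made concrete with a small sketch. The helper names `bloch_angles` and `bloch_vector` are hypothetical, and the convention that φ is set to 0 at the poles is an assumption made only for this illustration.

```python
import cmath
import math

def bloch_angles(alpha, beta):
    """Return (theta, phi) such that, up to a global phase, the state equals
    cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    alpha, beta = alpha / norm, beta / norm
    theta = 2 * math.atan2(abs(beta), abs(alpha))
    if abs(alpha) > 1e-12 and abs(beta) > 1e-12:
        # Relative phase between the two amplitudes.
        phi = cmath.phase(beta) - cmath.phase(alpha)
    else:
        # At the poles phi is arbitrary; 0 is chosen here by convention.
        phi = 0.0
    return theta, phi % (2 * math.pi)

def bloch_vector(theta, phi):
    """Cartesian coordinates of the point with spherical angles (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
```

With these conventions |0⟩ lands on the North Pole (θ = 0), |1⟩ on the South Pole (θ = π), and equal-weight superpositions on the equator.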
[Figure 2.2: (a) Riemann sphere with poles N and S, the points 1, i, −1, −i on the equator, and a point P of the complex plane projected onto the point P′ of the sphere; (b) Bloch sphere with axes x, y, z and angles θ and φ.]
4 For this reason, in order to study computational complexity problems of quantum computing a restriction
will have to be made on α and β. In principle, it will be required that they can be computed in polynomial
time and have logarithmic size.
5 For example, a trine is a qubit that can be in only one of the following three states (see Peres and
Wootters, 1991): |0⟩, (1/2)|0⟩ + (√3/2)|1⟩ or (1/2)|0⟩ − (√3/2)|1⟩.
6 Georg Friedrich Bernhard Riemann (1826–1866), a German mathematician. His main contributions were
in the theory of complex variable functions and their representations (on Riemann surfaces), non-Euclidean
geometry (representation of elliptic spaces and an extension of Gauss’s work on differential geometry to
n-dimensional objects), and electromagnetic theory.
7 The Riemann sphere is a sphere of unit radius whose equatorial plane is the complex plane and whose centre
is the origin of the plane. A qubit state |φ⟩ = α|0⟩ + β|1⟩ can be represented by a point on a Riemann
sphere as follows. If β ≠ 0 we mark in the complex plane the point P that represents the number α/β and
then we project P from the South Pole onto the sphere to get the point P′ that then represents |φ⟩. If α = 0
one gets the North Pole this way; if β = 0 the South Pole is the limit (Penrose, 1994).
8 Felix Bloch (1905–1983), an American physicist of Swiss origin. His main contributions were in the
quantum theory of solid bodies, ferromagnetism and quantum electrodynamics. Bloch developed methods
to measure magnetic moments of atomic nuclei.
9 θ is the angle from z axis and φ the angle in the x − y plane from the x axis—a phase. Representation
of qubits by points on a Bloch sphere is of interest also because it provides an isomorphism between qubit
operations and solid-body rotations.
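Just as in the previous chapter, we can represent basis states as vectors as follows:
\[ |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]
In such a case we have the representations
\[ |\nearrow\rangle = \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix}, \qquad |\searrow\rangle = \begin{pmatrix} \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} \end{pmatrix}, \]
and the two circular polarization states are represented by
\[ \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{i}{\sqrt 2} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \frac{1}{\sqrt 2} \\ -\frac{i}{\sqrt 2} \end{pmatrix}. \]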
Qubit measurements
Unfortunately, what goes into a qubit does not necessarily come out. A single qubit is in
principle not fully identifiable, i.e., given an unknown state |ψ⟩ of a qubit, it is in general not
possible to identify it fully by a projection measurement. Quantum physics has strict rules on
how to extract information out of an unknown quantum state. The outcome of any projection
measurements of a qubit must be formulated in classical terms. More exactly, we can get out
of any projection measurement of one qubit only one classical bit of information. Therefore,
even though there is a continuum of possible quantum states of a single qubit, these states
cannot all be distinguished reliably from each other. No (von Neumann) measurement
can extract more than one expected bit of information from any given qubit.10 From an
information theory point of view, from a qubit one can obtain by a (projection) measurement
exactly the same amount of (classical) information as a classical bit has, even if it has
infinitely many potential states.
Example 2.1.4 A measurement of a qubit state |φ⟩ = α|0⟩ + β|1⟩, corresponding to the
observable {E_0, E_1}, where E_0 (E_1) is the subspace spanned by the state |0⟩ (|1⟩), or, in
other words, with respect to the standard basis {|0⟩, |1⟩}, provides as the output bit 0 (1)
with the probability |α|² (|β|²) and the state |φ⟩ = α|0⟩ + β|1⟩ collapses into the state |0⟩
(|1⟩). All other information about the superposition is irreversibly lost. For an observer a
qubit therefore represents a probability distribution.
However, the qubit |φ⟩ = α|0⟩ + β|1⟩ can also be measured with respect to infinitely many
other bases. For example, with respect to the often used dual basis D = {|0′⟩, |1′⟩}, where
\[ |0'\rangle = \frac{1}{\sqrt 2}(|0\rangle + |1\rangle), \qquad |1'\rangle = \frac{1}{\sqrt 2}(|0\rangle - |1\rangle). \]
Since
\[ |0\rangle = \frac{1}{\sqrt 2}(|0'\rangle + |1'\rangle), \qquad |1\rangle = \frac{1}{\sqrt 2}(|0'\rangle - |1'\rangle) \]
we have
\[ |\phi\rangle = \frac{1}{\sqrt 2}\bigl((\alpha + \beta)|0'\rangle + (\alpha - \beta)|1'\rangle\bigr) \]
and a measurement of |φ⟩ with respect to the dual basis gives 0 (1) with probability (1/2)|α + β|²
((1/2)|α − β|²).
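The two sets of outcome probabilities in Example 2.1.4 can be compared side by side. The following is a minimal sketch; the function names `probs_standard` and `probs_dual` are hypothetical and the amplitudes are assumed to be normalized.

```python
import math

def probs_standard(alpha, beta):
    """Probabilities of outcomes 0 and 1 in the standard basis {|0>, |1>}."""
    return abs(alpha) ** 2, abs(beta) ** 2

def probs_dual(alpha, beta):
    """Probabilities of outcomes 0 and 1 in the dual basis {|0'>, |1'>}."""
    return abs(alpha + beta) ** 2 / 2, abs(alpha - beta) ** 2 / 2

s = 1 / math.sqrt(2)
# The state |0'> = (|0> + |1>)/sqrt(2) looks random in the standard basis
# but gives a definite outcome in the dual basis.
standard = probs_standard(s, s)
dual = probs_dual(s, s)
```

The same state can thus behave like a fair coin under one measurement and like a deterministic bit under another.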
10 Of course, this is not the whole story. In order to illustrate the peculiarities of quantum measurement, let
us consider again trines. No measurement on a trine can do better than rule out one of the three possibilities,
leaving one bit of uncertainty about the original state of the trine. Thus, it is not possible to extract more
than lg 3 − 1 ≈ 0.585 bits of classical information from a single trine. However, if we have two trines that
are guaranteed to be identical, then (by Peres and Wootters, 1991) there is a measurement that can extract
(√2/3)(1 + lg(17 + 12√2)) − 3/2 ≈ 1.369 > 2 × 0.585 bits of information from both trines. This implies that in
some cases we can extract more than twice as much information from two identical qubits than from either
one alone. In order to extract more information from two identical qubits than projection measurements
allow, POV measurements, see Section 9.2.8, are used.
In addition, as discussed in Section 8.2.4, one can encode 3 bits into one qubit in such a way that any of
them (but not all of them) can be retrieved (by a proper measurement) with success probability 0.79.
Example 2.1.5 If the state |0⟩ is measured with respect to the standard basis we get as
outcome 0 with probability 1 and the state collapses into itself. On the other hand, if |0⟩ is
measured with respect to the dual basis we get as the outcome 0 or 1, both with probability
1/2, and the state will collapse either into the state |0′⟩ or into the state |1′⟩. The result of
the measurement in this case is actually a random bit.
Remark 2.1.6 A measurement of a qubit with respect to the basis {|0⟩, |1⟩} corresponds
to the projection operators (1/2)(I ± σ_z). In the case of spin-1/2 particles this corresponds to
measuring the spin along the z-axis. Projections (1/2)(I ± σ_x) and (1/2)(I ± σ_y) correspond to
measurements along the x-axis and y-axis, respectively.
Exercise 2.1.7 Determine probabilities of the outcomes of the measurements of the qubit
|φ⟩ = α|0⟩ + β|1⟩ with respect to the bases:
(a) {(1/√2)(|0⟩ + i|1⟩), (1/√2)(i|0⟩ − |1⟩)}; (b) {(1/2)|0⟩ + (√3/2)|1⟩, (√3/2)|0⟩ − (1/2)|1⟩}.
Exercise 2.1.8 Given 0 < p < 1, determine the basis with respect to which the measurement
of a qubit α|0⟩ + β|1⟩ gives outcome 0 with probability p.
Qubit evolution
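Any quantum evolution of a qubit, or any quantum operation on a qubit, is described, as
already mentioned, by a unitary matrix
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},{}^{11} \]
which transforms any qubit state α|0⟩ + β|1⟩ into the state (aα + bβ)|0⟩ + (cα + dβ)|1⟩.
For example, the evolution given by the Hadamard matrix (transformation)
\[ H = \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \]
which is also called the Hadamard rotation, transforms the states |0⟩, |1⟩, |0′⟩ and |1′⟩ as follows:
\[ H|0\rangle = |0'\rangle, \quad H|1\rangle = |1'\rangle, \quad H|0'\rangle = |0\rangle, \quad H|1'\rangle = |1\rangle. \]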
The basis B = {|0⟩, |1⟩} is called the standard basis, or the computational basis; D =
{|0′⟩, |1′⟩} is called the dual basis or the Hadamard basis or the Fourier basis. As we
could see, by applying H we can switch between the standard and the dual bases. Observe
also that H² = I. The so-called circular (polarization) basis
\[ |0''\rangle = \frac{1}{\sqrt 2}(|0\rangle + i|1\rangle), \qquad |1''\rangle = \frac{1}{\sqrt 2}(|0\rangle - i|1\rangle) \]
is also of importance.
Exercise 2.1.9 Construct matrices to transform a qubit state from: (a) standard basis
to circular polarization basis and vice versa; (b) dual basis to circular polarization basis
and vice versa.
If the states |0′⟩ and |1′⟩ are measured with respect to the standard basis B, we get both
outcomes, 0 and 1, with the same probability 1/2. The evolution H applied on the states of
the standard basis can therefore be seen as implementing a fair coin tossing.
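The role of H as a switch between the two bases, and as a fair coin, can be checked numerically. This is a minimal sketch with plain Python lists; the helper `apply` and the variable names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]          # Hadamard matrix

def apply(matrix, state):
    """Multiply a 2x2 matrix by a 2-component state vector."""
    return [sum(matrix[r][c] * state[c] for c in range(2)) for r in range(2)]

ZERO, ONE = [1, 0], [0, 1]
ZERO_DUAL = apply(H, ZERO)     # amplitudes of |0'>
ONE_DUAL = apply(H, ONE)       # amplitudes of |1'>
back = apply(H, ZERO_DUAL)     # H applied twice returns |0>, since H^2 = I
coin = [abs(a) ** 2 for a in ZERO_DUAL]   # standard-basis outcome probabilities
```

Measuring `ZERO_DUAL` in the standard basis gives each outcome with probability 1/2, which is the coin-tossing behaviour described above.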
Example 2.1.10 If the matrix
\[ H' = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \]
is applied to the states of the standard basis, then H′²|0⟩ = −|1⟩, H′²|1⟩ = |0⟩, and therefore
H′² acts as a NOT operation, up to the phase sign.
Exercise 2.1.11 What do you get if the matrix from Figure 1.4e is applied: (a) once;
(b) twice, to the standard basis states? (Transformation defined by this matrix is known
as the square root of not. Explain why.)
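Three other important unitary operations on qubits are shown in Figure 2.3: rotation
(by θ) R(θ); phase shift (with respect to α) PS(α); and scale (with respect to δ) Scal(δ).
\[ R(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ -\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \quad PS(\alpha) = \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix}, \quad Scal(\delta) = \begin{pmatrix} e^{i\delta} & 0 \\ 0 & e^{i\delta} \end{pmatrix} \]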
Exercise 2.1.12 (Barenco, 1996) Show that the following properties hold for matrices
R, PS, Scal and the Pauli matrix σ_x.
(a) R(θ_1) · R(θ_2) = R(θ_1 + θ_2);
(b) PS(α_1) · PS(α_2) = PS(α_1 + α_2);
(c) Scal(δ_1) · Scal(δ_2) = Scal(δ_1 + δ_2);
(d) σ_x · R(θ) · σ_x = R(−θ); (e) σ_x · PS(α) · σ_x = PS(−α).
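Before proving the identities of Exercise 2.1.12 it can help to confirm them numerically for sample angles. The sketch below assumes the matrix forms R(θ) = [[cos θ/2, sin θ/2], [−sin θ/2, cos θ/2]], PS(α) = diag(e^{iα/2}, e^{−iα/2}) and Scal(δ) = diag(e^{iδ}, e^{iδ}); the helpers `matmul` and `close` are hypothetical. A numerical check is of course not a proof.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def R(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, s], [-s, c]]

def PS(alpha):
    return [[cmath.exp(1j * alpha / 2), 0], [0, cmath.exp(-1j * alpha / 2)]]

def Scal(delta):
    return [[cmath.exp(1j * delta), 0], [0, cmath.exp(1j * delta)]]

SIGMA_X = [[0, 1], [1, 0]]

# Property (d) for one sample angle: conjugation by sigma_x reverses the rotation.
d_holds = close(matmul(matmul(SIGMA_X, R(0.7)), SIGMA_X), R(-0.7))
```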
Exercise 2.1.15 Prove Theorem 2.1.13 using the fact that a matrix is unitary if and
only if its rows, and likewise its columns, are orthonormal.
Exercise 2.1.16 (Vazirani, 1997) Given a matrix M and a real λ, define e^{iλM} =
Σ_{k=0}^{∞} ((iλ)^k / k!) M^k. (a) Compute e^{iσ_x}, e^{iσ_y}, e^{iσ_z}, where σ_x, σ_y, σ_z are Pauli matrices; (b)
show that each unitary matrix of degree 2 has the form U = e^{iγ} e^{iασ_z} e^{iθσ_x} e^{iβσ_z}.
In principle, there is a continuous range of rotation, phase shift and scale matrices.
However, already finitely many of them are sufficient to perform all quantum computations
with an arbitrary precision.
Exercise 2.1.17 Show that any rotation R(α) can be decomposed, with an arbitrarily
small error, into polynomially many, with respect to the error, gates R(θ), with θ =
2π Σ_{k=0}^{∞} 2^{−2^k}.
One of the basic tools for unitary operations in H_2 is the beam splitter; see Figure 2.4
for two ways a beam splitter is depicted. The beam splitter has two input and two output
ports. By varying both the phases of the incoming beams and the reflectivity of the beam
splitter, one can realize by a beam splitter any unitary operation in H_2. A beam splitter
is often depicted with only one input and with no specification of phasing and reflecting.
In such a case it is assumed that the second input refers to the vacuum or other reference
state and the incoming basis state is transformed into an equally weighted superposition of
both basis states.
Example 2.1.18 In the case of photons a half-silver mirror (Figure 2.5b) acts as the beam
splitter (see Figure 2.5a). A full-silver mirror (Figure 2.5c) reflects the photon. In the case
of the half-silver mirror both detectors D1 and D2 detect the photon with the same probability.
The fact that this cannot be interpreted as “photon goes one way with probability 1/2 and the
other way with the same probability” is well demonstrated using the so-called Mach–Zehnder
interferometer (Figure 2.5d). If two half-silver mirrors and two full-silver mirrors
are arranged as Figure 2.5d shows, then, as one can easily calculate, detector D1 detects the
photon with probability 1. However, if we put an obstacle on one of the paths (Figure 2.5e),
then both detectors detect the photon with the same probability. If we put on one path an
obstruction that the photon can get through, but which acts as a measuring device, then both detectors
again detect the photon with the same probability (see Figure 2.5f). Finally, if we put on one path
a glass that causes a precalculated delay (see Figure 2.5g), then only detector D2 detects the
photon.
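The four interferometer configurations can be reproduced with 2×2 matrix arithmetic. The sketch below rests on several modelling assumptions that are not taken from the text: each half-silver mirror is modelled by the Hadamard matrix, full-silver mirrors are ignored because they only redirect both paths, detector D1 is identified with path 0 and D2 with path 1, and the glass is assumed to add a phase of π to path 1.

```python
import math

S = 1 / math.sqrt(2)
BS = [[S, S], [S, -S]]   # half-silver mirror, modelled as a Hadamard matrix (assumption)

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component amplitude vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def detect(v):
    """Probabilities that D1 (path 0) and D2 (path 1) click."""
    return [abs(a) ** 2 for a in v]

photon = [1, 0]
inside = apply(BS, photon)                     # state between the two half-silver mirrors

free = detect(apply(BS, inside))               # (d) both paths open
blocked = detect(apply(BS, [inside[0], 0]))    # (e) obstacle absorbs path 1
# (f) which-path measurement: average over the two collapsed states
measured = [sum(abs(inside[p]) ** 2 * detect(apply(BS, [1 - p, p]))[d] for p in (0, 1))
            for d in (0, 1)]
delayed = detect(apply(BS, [inside[0], -inside[1]]))   # (g) phase of pi on path 1
```

In case (e) the two detection probabilities are equal but sum to 1/2, because the photon is absorbed by the obstacle half of the time.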
[Figure 2.5: (a) beam splitter; (b) half-silver mirror; (c) full-silver mirror; (d)–(g) Mach–Zehnder interferometer with detectors D1 and D2: with free paths, with an obstacle, with an obstruction, and with a glass on one path.]
Remark 2.1.19 In addition to qubits as states of H_2, the special name (qu)trit is used
for states in H_3 (for spin-1 particles). Their general form is
|φ⟩ = α|0⟩ + β|1⟩ + γ|2⟩,
where |α|² + |β|² + |γ|² = 1, and {|0⟩, |1⟩, |2⟩} is its standard basis.
One cannot do too much computation with a single qubit. Actually, it is the concept of
an n-bit quantum register that is a proper framework for designing quantum algorithms.
Figure 2.6: Bases in H_4; Bell states are denoted by Φ± and Ψ±, and for each of these states
the corresponding value is given one row below; the state Ψ− is called the singlet.
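It is usual to represent states of the standard basis in one of the following forms:
\[ |0\rangle = |00\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad |1\rangle = |01\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad |2\rangle = |10\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad |3\rangle = |11\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \]
\[ |\psi\rangle = \alpha_{00}|00\rangle + \alpha_{01}|01\rangle + \alpha_{10}|10\rangle + \alpha_{11}|11\rangle, \tag{2.2} \]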
Exercise 2.1.20 Express the basis states |00⟩, |01⟩, |10⟩ and |11⟩ in terms of the states
|0′0′⟩, |0′1′⟩, |1′0′⟩, |1′1′⟩ and vice versa.
Exercise 2.1.21 (a) Design a unitary matrix that maps the standard basis
{|00⟩, |01⟩, |10⟩, |11⟩} into Bell’s basis. (b) Design a unitary matrix that maps Bell’s
basis into the standard one; (c) The Hadamard matrix transforms the standard basis of H_2 into
its dual basis and vice versa. Does there exist a unitary matrix that maps the standard
basis of H_4 into Bell’s basis and vice versa?
Exercise 2.1.22 Show that vectors of Bell’s basis are eigenvectors of the unitary matrix
implementing the mapping |x, y⟩ → |x̄, ȳ⟩.
Exercise 2.1.23 (DiVincenzo et al. 1998a) For the following 5 states of two qutrits:
|φ_1⟩ = (1/√2)|0⟩(|0⟩ − |1⟩), |φ_2⟩ = (1/√2)(|0⟩ − |1⟩)|2⟩, |φ_3⟩ = (1/√2)|2⟩(|1⟩ − |2⟩), |φ_4⟩ = (1/√2)(|1⟩ −
|2⟩)|0⟩ and |φ_5⟩ = (1/3)(|0⟩ + |1⟩ + |2⟩)(|0⟩ + |1⟩ + |2⟩), show that they form an “unextendable
product basis” of H_9 in the sense that: (a) they form an orthonormal set; (b) there is no
product state |χ_1⟩|χ_2⟩, where |χ_1⟩ and |χ_2⟩ are qutrit states, which is orthogonal to all 5 states
above.
Exercise 2.1.24 Determine probabilities of possible outcomes when the state (2.2) is
measured with respect to: (a) dual basis; (b) Bell’s basis; (c) magic basis.
It is often necessary to measure only one qubit. This can be done using the observable
B_1 = {E_1^0, E_1^1}, in the case of the first qubit;
B_2 = {E_2^0, E_2^1}, in the case of the second qubit,
where E_1^i, i = 0, 1, is the subspace spanned by the vectors {|i0⟩, |i1⟩} and E_2^i, i = 0, 1, is the
subspace spanned by the vectors {|0i⟩, |1i⟩}.
Hence, if the first qubit is measured, we get as the outcome bit 0 with probability
|α_00|² + |α_01|², and the post-measurement state
\[ |\psi'\rangle = \frac{\alpha_{00}|00\rangle + \alpha_{01}|01\rangle}{\sqrt{|\alpha_{00}|^2 + |\alpha_{01}|^2}}. \]
(Note that the state |ψ⟩ is first projected and then renormalized to give |ψ′⟩.) In a similar way the probabilities
and the resulting state are determined when the outcome is 1 and when the second qubit is measured.
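The project-then-renormalize rule for a measurement of the first qubit can be written out directly. In this minimal sketch the function name `measure_first_qubit` is hypothetical and amplitudes are ordered as |00⟩, |01⟩, |10⟩, |11⟩.

```python
import math

def measure_first_qubit(a00, a01, a10, a11, outcome):
    """Return (probability, post-measurement amplitudes) for the given outcome
    of a standard-basis measurement of the first qubit."""
    # Projection: keep only the amplitudes whose first bit equals the outcome.
    kept = [a00, a01, 0, 0] if outcome == 0 else [0, 0, a10, a11]
    prob = sum(abs(a) ** 2 for a in kept)
    if prob == 0:
        return 0.0, None          # this outcome never occurs
    # Renormalization: divide by the square root of the outcome probability.
    return prob, [a / math.sqrt(prob) for a in kept]
```

For the state (1/√2)(|00⟩ + |11⟩), outcome 1 occurs with probability 1/2 and leaves the register in |11⟩, so the second qubit is then determined as well.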
Exercise 2.1.25 Verify that if I denotes the unit matrix of degree 2, then
Exercise 2.1.26 Show that the tensor products of matrices and quantum states have the
following property:
Exercise 2.1.27 Show that XOR matrix cannot be obtained as a tensor product of two
unitary matrices of degree 2.
Theorem 2.1.28 (No Cloning (copying) Theorem) An unknown quantum state cannot
be cloned. (Namely, there is no unitary transformation U such that for any one-qubit
state |ψ⟩, U(|ψ, 0⟩) = |ψ, ψ⟩.13) The no-cloning theorem holds for any Hilbert space.
Proof 1. Assume that such a U exists and for two different orthogonal states |α⟩ and
|β⟩, U(|α, 0⟩) = |α, α⟩, U(|β, 0⟩) = |β, β⟩. Let |γ⟩ = (1/√2)(|α⟩ + |β⟩). Then U(|γ, 0⟩) =
(1/√2)(|α, α⟩ + |β, β⟩) ≠ |γ, γ⟩ = (1/2)(|α, α⟩ + |β, β⟩ + |α, β⟩ + |β, α⟩).14
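Proof 1 can be illustrated with a concrete candidate for U. The sketch below takes the XOR (controlled-NOT) gate as that candidate, which is an assumption made for illustration only: it copies the basis states |0⟩ and |1⟩ correctly but fails on their superposition, exactly as the linearity argument predicts.

```python
import math

def cnot(state):
    """XOR gate on amplitudes ordered as |00>, |01>, |10>, |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

def tensor(u, v):
    """Tensor product of two one-qubit amplitude vectors."""
    return [x * y for x in u for y in v]

S = 1 / math.sqrt(2)
zero, one, plus = [1, 0], [0, 1], [S, S]

copied_zero = cnot(tensor(zero, zero))   # |00>: a correct copy of |0>
copied_one = cnot(tensor(one, zero))     # |11>: a correct copy of |1>
attempt = cnot(tensor(plus, zero))       # (|00> + |11>)/sqrt(2)
wanted = tensor(plus, plus)              # (|00> + |01> + |10> + |11>)/2
```

The vectors `attempt` and `wanted` differ, and `attempt` is an entangled state rather than two independent copies.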
13 The no-cloning theorem seems to be bad news. However, this would be a very simplified view of its
impact. For example, the very good news that unconditionally secure quantum key generation is possible, see
Section 6.2, is to a large extent due to this “bad news”. In addition, new techniques have appeared that
allow one to make approximate copies of qubits (Bužek and Hillery, 1996, and Bužek et al. 1997). They
proposed the Universal Quantum Copy Machine that can produce two (imperfect, but equivalent in some
weaker sense) copies of any qubit, and the quality of copying is independent of particular qubits. It is
even possible to make three imperfect copies. However, this good news also has “bad” aspects. The copies
obtained are entangled.
14 The discovery that no general quantum copying procedure exists is a surprising and profound result of
quantum mechanics. The reason behind it is that any attempt to copy a coherent superposition of states
results in a state reduction, destruction of coherence, and the addition of noise. Moreover, the feasibility of
cloning would have surprising consequences and would lead to paradoxes. For example, in combination with
quantum teleportation, Section 6.4, it would allow faster-than-light transmission of information.
Proof 2. Assume there is a unitary operator U such that U|φ, 0⟩ = |φ, φ⟩, U|ψ, 0⟩ =
|ψ, ψ⟩, for arbitrary φ, ψ. Since U is unitary we have
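The step that unitarity supplies can be sketched as follows. This is the standard inner-product argument, offered as an assumption about how the proof continues rather than as a reproduction of the original display.

```latex
\begin{align*}
\langle\phi|\psi\rangle
  &= \langle\phi|\psi\rangle\,\langle 0|0\rangle
   = \langle\phi,0|\,U^{\dagger}U\,|\psi,0\rangle \\
  &= \langle\phi,\phi|\psi,\psi\rangle
   = \langle\phi|\psi\rangle^{2}.
\end{align*}
```

Hence ⟨φ|ψ⟩ is either 0 or 1, so the two states are orthogonal or equal, which contradicts the assumption that φ and ψ are arbitrary.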
Exercise 2.1.30 Show that we can learn an unknown quantum state with arbitrary pre-
cision when we have an unlimited number of copies of the state and can measure it with
respect to any observable we need.
Observe also that “No-cloning theorem” only implies that there is no general unitary
transformation for perfect copying of quantum information without destroying the original
copy of information. As we shall see in Section 6.4, quantum information can be teleported
(a copy of it can be moved to some different place) but the original copy will get destroyed.
Exercise 2.1.31 Show that to any state |ψ⟩ of a qubit there is a unitary transformation
U_{|ψ⟩} such that U_{|ψ⟩}(|ψ⟩|0⟩) = |ψ⟩|ψ⟩.
Exercise 2.1.32 Show that if two different states |φ⟩ and |ψ⟩ can be copied by the same
circuit, then ⟨φ|ψ⟩ = 0.
Exercise 2.1.33 Show that there is no universal quantum NOT gate in the following
sense: there is no unitary one-qubit transformation NOT such that ⟨NOT(φ)|φ⟩ = 0 for
any one-qubit state |φ⟩.15
B = {|i⟩ | 0 ≤ i < 2^n},
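Example 2.1.36 Let a unitary matrix
\[ U = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix} \]
be applied to the first qubit of an n-qubit register in the state |φ⟩ = Σ_{x∈{0,1}^n} α_x|x⟩. In such a case
\[ U|\phi\rangle = \sum_{y_1 \ldots y_n \in \{0,1\}^n} \Bigl( \sum_{j=0}^{1} a_{y_1 j}\, \alpha_{j y_2 \ldots y_n} \Bigr) |y_1 \ldots y_n\rangle. \]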
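The formula of Example 2.1.36 translates into a short loop over the remaining n − 1 bits. The sketch below is one possible rendering; the function name `apply_to_first_qubit` is hypothetical, and the amplitudes are assumed to be stored in a list indexed by the integer whose most significant bit is the first qubit.

```python
def apply_to_first_qubit(u, amplitudes):
    """Apply the 2x2 matrix u to the first qubit of a register whose
    2**n amplitudes are indexed with the first qubit as most significant bit."""
    half = len(amplitudes) // 2
    out = [0j] * len(amplitudes)
    for rest in range(half):                  # rest encodes the bits y_2 ... y_n
        for y1 in (0, 1):
            # New amplitude of |y1 y2...yn> = sum over j of a_{y1 j} * alpha_{j y2...yn}
            out[y1 * half + rest] = sum(u[y1][j] * amplitudes[j * half + rest]
                                        for j in (0, 1))
    return out
```

The loop touches each of the 2^n amplitudes once, so a single one-qubit gate already costs time proportional to 2^n in a direct classical simulation.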
16 Therefore, already 100-qubit computers should have, in some sense, incredible power.
One of the aims of quantum algorithms and network design is to decompose a problem-solving
unitary operator into a sequence of one- and two-qubit operations from a small set
of such available operations. As discussed in Section 2.3, for any reasonable set of such
basic operations such a decomposition always exists. However, an important problem is to
determine for particular unitary operations whether such a decomposition into a polynomial
number of one- or two-qubit operations can be done. When a considered unitary operation
can be decomposed into a polynomial number p(n) of one- and two-qubit operations, then
the overall quantum computation time is O(p(n)). When a classical simulation is to be
performed, the total number of operations needed is O(p(n)2^n).
Measurements
If the state |φ⟩ of an n-qubit register is measured with respect to the standard basis we
get as the outcome n bits, each n-tuple of bits with a precalculated probability, and the
state |φ⟩, in the (potential) superposition of 2^n basis states, collapses to just one of the basis
states.
Let us now consider a measurement of the jth qubit only. The corresponding observable
is B_j = {E_j^0, E_j^1}, where E_j^0 (E_j^1) is the subspace of the 2^n-dimensional Hilbert space spanned
by all basis vectors having 0 (1) in the jth component.
The measurement of the jth qubit gives
\[ 0\ (1) \text{ with probability } \sum_{i \,|\, i_j = 0} |\alpha_i|^2 \quad \Bigl( \sum_{i \,|\, i_j = 1} |\alpha_i|^2 \Bigr), \]
In this case the states of the basis are seen as pairs of integers 0 ≤ i < 2^n and 0 ≤ j < 2^m.
If we now measure the first n qubits we get each number i ∈ [0, 2^n) with probability
\[ p(i) = \sum_{j=0}^{2^m - 1} |c_{ij}|^2 \]
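The probability p(i) is a marginal over the second register. A minimal sketch follows, under the assumption that c_ij denotes the amplitude of the basis state |i⟩|j⟩ and that the amplitudes are stored in a flat list at position i·2^m + j; the function name is hypothetical.

```python
def marginal_first_register(c, n, m):
    """Return [p(0), ..., p(2**n - 1)], where c[i * 2**m + j] is the
    amplitude of |i>|j> in a register of n + m qubits."""
    size_m = 2 ** m
    return [sum(abs(c[i * size_m + j]) ** 2 for j in range(size_m))
            for i in range(2 ** n)]
```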
Exercise 2.1.37 What are the results of the measurement of the first qubit of the following
states with respect to the dual basis: (a) α|000⟩ + β|111⟩; (b) (1/√2)(|0000⟩ + |1111⟩);
(c) (1/√2)(|0^(n)⟩ + |1^(n)⟩) for an arbitrary n?
Exercise 2.1.39 Describe unitary matrices that map the state |0^(n)⟩ into the cat state
(1/√2)(|0^(n)⟩ + |1^(n)⟩) for: (a) n = 3; (b) n = 4; (c) n = 5.
Exercise 2.1.42 (Steane, 1996) If |φ_c⟩ = Σ_{x∈{0,1}^n} α_x|flip(x)⟩, where flip(x) flips all
bits of the binary representation of x, then |φ_c⟩ = Σ_{x′∈{0′,1′}^n} α′_{x′}(−1)^{parity(x′)}|x′⟩, i.e.,
if all basis words are flipped in the standard basis, then all odd parity words in the dual
basis change their sign.
Exercise 2.1.43 Express the states |1^(n)⟩ and (1/√2)(|0^(n)⟩ + |1^(n)⟩) in the dual basis.
In all of the above exercises short states in one basis have long representations in another
basis. This is inevitable because the following inequality holds (see Bialynicki-Birula and
Mycielski, 1975, Deutsch, 1983, and Steane, 1996).
Theorem 2.1.44 If a state |φ⟩ of a Hilbert space H_{2^n} can be written as a superposition of
m_1 basis states in the standard basis and m_2 basis states in the dual basis, then
m_1 m_2 ≥ 2^n. (2.5)
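The inequality (2.5) can be tested on small examples by counting nonzero amplitudes in both bases. The sketch below obtains the dual-basis coefficients by applying the Hadamard transformation to every qubit, implemented as a fast Walsh-Hadamard transform; the function names and the tolerance are illustrative choices.

```python
import math

def to_dual_basis(amplitudes):
    """Coefficients of the state in the dual basis, computed by applying
    the Hadamard matrix to every qubit (fast Walsh-Hadamard transform)."""
    a = list(amplitudes)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    scale = 1 / math.sqrt(len(a))
    return [x * scale for x in a]

def support_size(amplitudes, tol=1e-12):
    """Number of basis states with a nonzero amplitude."""
    return sum(1 for x in amplitudes if abs(x) > tol)

s = 1 / math.sqrt(2)
cat = [s, 0, 0, 0, 0, 0, 0, s]            # (|000> + |111>)/sqrt(2)
m1 = support_size(cat)
m2 = support_size(to_dual_basis(cat))
```

For the three-qubit cat state the product m1·m2 meets the bound 2^3 with equality.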
20 The name of the operation comes from the fact that 2E − x = E + E − x and therefore the new value
is as much above (below) the average as it was initially below (above) the average—which is precisely the
inversion about the average.
\[ |\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle = \Bigl(\sum_{i=0}^{1} \alpha_i |i\rangle\Bigr) \otimes \Bigl(\sum_{j=0}^{1} \beta_j |j\rangle\Bigr). \tag{2.6} \]
Exercise 2.2.1 Determine: (a) |ψ⟩ ⊗ |ψ⟩ ⊗ |ψ⟩ for |ψ⟩ = (1/√2)(|0⟩ + |1⟩); (b) |ψ⟩ ⊗ |ψ⟩ ⊗
|ψ⟩ ⊗ |ψ⟩ for |ψ⟩ = (1/√2)(|0⟩ − |1⟩); (c) |ψ⟩ ⊗ |ψ⟩ ⊗ |ψ⟩ for |ψ⟩ = (1/√2)(|0⟩ + i|1⟩).
If we now observe the first qubit of the state |ψ⟩ in (2.6), we get:
Exercise 2.2.2 (Cleve et al. 1998) Show that the state Σ_{y=0}^{2^n−1} e^{2πiay/2^n}|y⟩ is unentangled
if a ∈ {0, . . . , 2^n − 1} and can be expressed in the form ⊗_{i=1}^{n} (|0⟩ + α_i|1⟩), for proper
amplitudes α_i.
Exercise 2.2.3 (Cleve et al. 1998) Show that the state Σ_{y=0}^{2^n−1} e^{2πiφy}|y⟩ is unentangled
for all φ, and find its decomposition into the tensor product of one-qubit states.
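A numerical check can suggest what the product form in these two exercises looks like. The sketch below assumes that the first qubit carries the most significant bit of y and tries the amplitudes α_k = e^{2πia/2^k} for k = 1, …, n; both this choice and the function names are assumptions made for illustration, and the check does not replace the requested proof.

```python
import cmath

def phase_state(a, n):
    """Unnormalized amplitudes e^{2 pi i a y / 2^n} for y = 0 .. 2^n - 1."""
    return [cmath.exp(2j * cmath.pi * a * y / 2 ** n) for y in range(2 ** n)]

def product_form(a, n):
    """Tensor product of (|0> + e^{2 pi i a / 2^k}|1>) for k = 1..n,
    with the k = 1 factor as the most significant qubit."""
    state = [1 + 0j]
    for k in range(1, n + 1):
        factor = [1, cmath.exp(2j * cmath.pi * a / 2 ** k)]
        # Each new factor is appended as the next less significant qubit.
        state = [x * f for x in state for f in factor]
    return state
```

Calling both functions with a non-integer `a` (for example a = φ·2^n) covers the case of Exercise 2.2.3 as well.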
A pair of two-state particles which is in the entangled state (1/√2)(|00⟩ + |11⟩) or (1/√2)(|01⟩ +
|10⟩) is often said to be in the Bohm state or EPR state or Bell state, or to form an
EPR pair, or to exhibit the EPR effect (“EPR” stands here for “Einstein, Podolsky and Rosen”, see
Section 9.1.5), and it is said to create a so-called “EPR channel”. Such a channel can be used
to “teleport” quantum information as discussed in Section 6.4.
An EPR channel is created, for example, when certain types of atom or molecule decay
with the emission of two photons, and is characterized by the fact that the two photons are always
found to have opposite polarization, regardless of the basis used to measure them, provided
both are measured with respect to the same basis. Figure 2.7 shows one way in which an EPR pair
and channel can be generated.
[Figure 2.7 labels: a source of high-energy photons, a high-energy photon, nonlinear optical crystal, beam splitter, mirrors, rotator, EPR pair, EPR channel.]
Figure 2.7: Generation of an EPR pair of polarized photons and an EPR-channel, adapted
from Glanz (1995).
22 Entangled are, for example, the states of Bell basis. Observe that XOR maps all states of Bell basis into
states that are not entangled; it therefore performs disentanglement on the states of Bell basis.
The phenomenon of quantum entanglement was first noticed by Schrödinger, in 1935, and the English
term “entanglement” is the direct translation of the German term “Verschränkung” used by Schrödinger.
Exercise 2.2.4 Show that by an appropriate choice of the basis any entangled (pure) state
of two qubits can be written as |φ⟩ = cos θ|00⟩ + sin θ|11⟩.
Example 2.2.6 Let Alice and Bob each possess one particle of an entangled pair in the
state (1/√2)(|01⟩ + |10⟩) and assume they travel to distant places and agree to measure their
particles immediately after their arrival. Let us assume that Alice gets as the result of
her measurement the state |0⟩. She knows immediately that Bob’s particle is in the state
|1⟩. However, this situation could happen in two ways. The first possibility is that Alice
has arrived first and her measurement determined the state of Bob’s particle. The other
possibility is that Bob has arrived first, got |1⟩ as the result of his measurement, and that
determined |0⟩ as the state of Alice’s particle. How can Alice find out which of these
two cases really happened? She could call Bob to ask about the time of his measurement.
Interestingly enough, such a classical communication seems to be the only way to solve this
problem.
A pair of quantum systems in a maximally entangled state is the purest form of inherently
quantum information: it is capable of interconnecting two parties far apart, it cannot be copied
or eavesdropped on without disturbance, nor can it be used by itself to send classical messages.
At the same time it can assist in speeding up both classical and quantum communication.
Quantum entanglement is also the main reason why quantum computers cannot be efficiently
simulated by classical ones.
To describe fully a state of an n-qubit register we need to write down in general 2^n complex
coefficients. Already for a small n = 100 this would require 2^100 ≈ 10^30 numbers, which is
outside the potential of foreseeable classical computers. (In addition, to simulate a quantum
computer with a 100-qubit register we would need to manipulate matrices of degree 10^30.)
Since quantum computers are probabilistic it could seem that keeping a complete state
description at each stage of simulation is not the only way to simulate quantum computers on
probabilistic ones. It is therefore natural to ask whether it is possible to simulate quantum
computers on probabilistic ones which do not always keep a complete description of the
current quantum state and only provide various outcomes with the same probability as the
simulated quantum computer does. Could not we perform some “local simulations” in which
each qubit has a definite value at each computation step and each quantum gate can act on
the input qubits in various possible ways, only one of which is always selected as determined
by a (pseudo)random generator? Such simulations could avoid a need for exponential space!
However, it is a consequence of quantum entanglement, of its non-local correlations, that we
cannot always divide the state of the quantum system under consideration into parts and
compute them separately. Due to the quantum entanglement there is in general no local
probabilistic classical algorithm simulating quantum computers.
An easy way to demonstrate directly that it is exactly because of quantum entanglement
that quantum computers cannot be efficiently simulated by classical ones goes as follows.
Let us assume that an evolution of a state |ψ⟩ of an n-qubit register goes through a
sequence of states |ψ_i⟩, i ≥ 1, and |ψ_i⟩ = |φ_{i1}⟩ ⊗ |φ_{i2}⟩ ⊗ . . . ⊗ |φ_{in}⟩, where each |φ_{ij}⟩ is a qubit
state. No matter how long such an evolution is, it can be easily simulated by a classical
computer because it can simulate the evolution qubit by qubit and there is no exponential
increase in the number of coefficients a classical computer has to store.
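The qubit-by-qubit simulation can be sketched as follows. For simplicity the sketch assumes that every step consists of single-qubit gates only, which is one way (not the only one) to guarantee that the register stays in a product state; all names are hypothetical.

```python
import math

def apply_gate(gate, qubit):
    """Apply a 2x2 gate to one qubit given as a pair of amplitudes."""
    return [gate[0][0] * qubit[0] + gate[0][1] * qubit[1],
            gate[1][0] * qubit[0] + gate[1][1] * qubit[1]]

def evolve_product_state(qubits, steps):
    """qubits: list of n amplitude pairs; steps: list of layers, each a list
    of n single-qubit gates. Only 2n numbers are stored at any time."""
    for layer in steps:
        qubits = [apply_gate(g, q) for g, q in zip(layer, qubits)]
    return qubits

def full_state(qubits):
    """Expand a product state to all 2**n amplitudes (small n only)."""
    state = [1]
    for q in qubits:
        state = [x * a for x in state for a in q]
    return state

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
# Fifty qubits need 100 stored amplitudes here, instead of 2**50.
register = evolve_product_state([[1, 0]] * 50, [[H] * 50])
```

As soon as a gate entangles two qubits this factorized storage stops being sufficient, which is the point made in the paragraph above.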
It often used to be emphasized that quantum superposition is the root of the extraordi-
nary power of quantum computing. However, it is nowadays clear that this is too simplified
a view of quantum computing.
Indeed, as already discussed in Section 1.5, classical waves also exhibit superpositions.
As a consequence, any effect depending on quantum interference alone seems to be readily
simulated by classical waves. Entanglement can be seen as a very special type of superpo-
sition that has no classical analog. Jozsa (1997) provides arguments that we cannot always
simulate quantum parallelism (see page 103), by classical waves.
Quantum entanglement should be seen as a computational resource that allows qualita-
tively and quantitatively new types of information processing. At the same time entangle-
ment is a resource which is very difficult to create and to preserve.
Applications of quantum entanglement: speed-up of classical computations, Sec-
tions 3.1, 3.2, 3.3; quantum key generation, Sections 6.2 and 6.2.4; teleportation, Section 6.4;
superdense coding, Section 6.4.4; entanglement enhanced classical communication (Bennett,
Fuchs and Smolin, 1997); quantum data compression, Section 8.2; error-correction codes,
Section 7.4; fault-tolerant computing, Section 7.5, dense coding, Section 8.2.4.
It is easy to see that |φ_{2n}⟩ = |Φ+⟩^{⊗n}, i.e., |φ_{2n}⟩ is the tensor product of the Bell state
|Φ+⟩ with itself n times.
There is a variety of technical results showing that entropy of entanglement as defined
above is a reasonable measure of entanglement. For example, if two parties share an entan-
gled pair, then they cannot change the entropy of entanglement by local actions (even with
the help of classical communication). In addition, two pure entangled states with the same
entropy of entanglement can be transferred into each other by local actions (see Bennett,
1998a).
The concept of entanglement is defined for mixed states and density matrices similarly
as for pure states.
A mixed state [ψ⟩ is separable or disentangled if [ψ⟩ = ⊕_{i=1}^{n} (p_i, |φ_i⟩ ⊗ |ψ_i⟩), where
Σ_{i=1}^{n} p_i = 1 and |φ_i⟩, |ψ_i⟩ are pure states. [ψ⟩ is entangled if it is not disentangled.
A density matrix ρ is disentangled if ρ = Σ_{i=1}^{n} p_i ρ_i ⊗ ρ′_i, Σ_{i=1}^{n} p_i = 1, and it is entangled
if it is not disentangled.
Several approaches to the quantification of entanglement of mixed states and density matrices
are dealt with in Section 8.3, which also covers various ways of creating and manipulating entangled
states.
Remark 2.2.7 As illustrated in the rest of the book, entanglement can be used in various
ways to make quantum communication more efficient and more secure. It is also well
known in quantum mechanics that to any entangled state one can find operators whose
correlations violate Bell inequalities and contradict the “local realism” view of quantum physics
(see Section 2.7). An important role in quantum information processing is played by quantum
error-correcting codes whose highly entangled codewords protect quantum information. As
23 On the Bell states, which play an important role in the quantum theory of entanglement, one can also
illustrate the enormous difference between what is possible in quantum theory and in practice. Theoretically, a
single projection measurement can distinguish the four Bell states. However, until now no experimental way to
do that in $H_4$ is known! Only recently (see Kwiat and Weinfurter, 1998) has a way been demonstrated
to distinguish these four states, but only by working in a larger Hilbert space, making use of additional
entanglement.
24 The term “ebit” is also almost a trademark for “Electron-Beam Ion Trap”.
2.2. QUANTUM ENTANGLEMENT 79
Denote by $f_k : \{0, \dots, 2^n - 1\}^k \to \{0, 1\}$ the Boolean function defined, on valid inputs
only, by
$$f_k(x_1, \dots, x_k) = \Bigl(\frac{1}{2^{n-1}} \sum_{i=1}^{k} x_i\Bigr) \bmod 2.$$
Clearly, fk (x1 , . . . , xk ) ∈ {0, 1} for all valid inputs.
First we show that $QC(f_k, n, k) = k$ provided the k communicating parties share k entangled
qubits, each party having one. Later we show that $C(f_k, n, k) \ge k \lg k - k$ for $n \ge \lg k$.
In this way we find that for the functions $f_k$ defined above communication with the help of
entanglement is asymptotically better.
Let us assume that k parties share k qubits, the party $P_i$ holding the qubit $q_i$, and that together
these qubits are in the entangled state $|q_1 \dots q_k\rangle = \frac{1}{\sqrt{2}}(|0^{(k)}\rangle + |1^{(k)}\rangle)$. Let each party $P_j$
independently apply to its qubit $q_j$ the following procedure.
1. A phase-changing transformation
$$|0\rangle \to |0\rangle, \qquad |1\rangle \to e^{\frac{2\pi i x_j}{2^n}} |1\rangle.$$
2. Hadamard transformation.
3. A measurement of the jth qubit $q_j$ with respect to the standard observable $\{|0\rangle, |1\rangle\}$,
providing an output $b_j \in \{0, 1\}$.
25 See Hromkovič (1997), Gruska (1997).
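After the first step the resulting state is
$$\frac{1}{\sqrt{2}} \Bigl( |0^{(k)}\rangle + e^{\frac{2\pi i}{2^n} \sum_{j=1}^{k} x_j} |1^{(k)}\rangle \Bigr).$$
(Observe that for a valid input $e^{\frac{2\pi i}{2^n} \sum_{j=1}^{k} x_j}$ equals 1 or −1.) After the second step the resulting state is
$$\frac{1}{\sqrt{2}} \Bigl( \frac{1}{\sqrt{2^k}} \sum_{j=0}^{2^k-1} |j\rangle + e^{\frac{2\pi i}{2^n} \sum_{j=1}^{k} x_j} \frac{1}{\sqrt{2^k}} \sum_{j=0}^{2^k-1} (-1)^{parity(j)} |j\rangle \Bigr),$$
where parity(j) = 0 if the binary representation of j contains an even number of 1's, and 1 otherwise. Since $e^{\frac{2\pi i}{2^n} \sum_{j=1}^{k} x_j} = (-1)^{f_k(x_1, \dots, x_k)}$ we
see that the resulting state is
$$\frac{1}{\sqrt{2^{k-1}}} \sum_{parity(j) = f_k(x_1, \dots, x_k)} |j\rangle.$$
Hence the bits obtained by the measurements in the third step satisfy
$$b_1 \oplus b_2 \oplus \dots \oplus b_k = f_k(x_1, \dots, x_k).$$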
Exercise 2.2.8 Show in detail that $\bigoplus_{i=1}^{k} b_i = f_k(x_1, \dots, x_k)$ for: (a) k = 2, 3; (b)
for an arbitrary k.
All parties can compute the value of fk if bits bi , i = 1, . . . , k are broadcast. On the
other hand, broadcasting of less than k bits cannot be sufficient to compute fk because if
one of the parties does not broadcast its bit, then no other party can compute the value
of fk .
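The entanglement-assisted protocol above can be checked numerically for small k. The following is a minimal sketch, assuming NumPy is available; the function names (`f_k`, `final_state`, `outcome_parities`) are illustrative and not from the book. It builds the shared state, lets every party apply its phase change and Hadamard gate, and collects the parity of every measurement outcome that has nonzero probability.

```python
import numpy as np
from functools import reduce

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def f_k(xs, n):
    """Value of f_k on a valid input (sum of xs divisible by 2**(n-1))."""
    return (sum(xs) // 2 ** (n - 1)) % 2

def final_state(xs, n):
    """State of the k shared qubits after steps 1 and 2 of the protocol."""
    k = len(xs)
    state = np.zeros(2 ** k, dtype=complex)
    state[0] = state[-1] = 1 / np.sqrt(2)      # (|0...0> + |1...1>)/sqrt(2)
    # Party j applies the phase change and then the Hadamard gate to its own qubit.
    local = [H @ np.diag([1, np.exp(2j * np.pi * x / 2 ** n)]) for x in xs]
    return reduce(np.kron, local) @ state

def outcome_parities(xs, n):
    """Values of b_1 xor ... xor b_k that occur with nonzero probability."""
    probs = np.abs(final_state(xs, n)) ** 2
    return {bin(j).count("1") % 2 for j, p in enumerate(probs) if p > 1e-12}
```

For a valid input the returned set should contain the single value `f_k(xs, n)`, so broadcasting the k measured bits suffices to compute the function.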
Let us now consider communications between the parties in the case that only classical
information is used.
The simplest way to communicate is for all but one of the parties to broadcast their
inputs. The last party then computes the result and broadcasts it to all other parties. This
implies
$$C(f_k, n, k) \le (k - 1)n + 1.$$
Another possibility is that parties $P_1, P_2, \dots, P_{k-1}$ broadcast the d most significant bits of
their inputs, i.e., $y_i = x_i - (x_i \bmod 2^{n-d})$ for some d > 1. The last party, say $P_k$, then
computes the sum
$$\Bigl(\sum_{i=1}^{k} x_i\Bigr) - \delta,$$
where
$$\delta = \sum_{i=1}^{k-1} (x_i \bmod 2^{n-d}).$$
Exercise 2.2.9 (Cleve and Buhrman, 1997) Let each of three parties A, B and C
possess an n-bit string $w^A$, $w^B$ and $w^C$, such that $w_i^A \oplus w_i^B \oplus w_i^C = 1$ for 1 ≤ i ≤ n.
In addition, let each party P possess one of the particles $q_i^P$, 1 ≤ i ≤ n, of n triples of
particles, each in the entangled state $|\psi\rangle = \frac{1}{2}(|001\rangle + |010\rangle + |100\rangle - |111\rangle)$. Show that the
three parties can compute the function $f(w^A, w^B, w^C) = \bigoplus_{i=1}^{n} w_i^A \wedge w_i^B \wedge w_i^C$ in such a
way that each party P ∈ {A, B, C} performs the following protocol:
for i from 1 to n do
if $w_i^P = 0$ then apply Hadamard rotation to $q_i^P$;
get the bit $s_i^P$ by measuring $q_i^P$;
compute $s^P \leftarrow \bigoplus_{i=1}^{n} s_i^P$.
and then let parties B and C send two bits, namely $s^B$ and $s^C$, to A such that A can
compute $s^A \oplus s^B \oplus s^C$, which equals $f(w^A, w^B, w^C)$. (It can be shown that the classical
communication complexity of this problem is 3. Using quantum entanglement only two
bits of communication are necessary, as demonstrated by the above protocol.)
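One round (a single triple of particles) of the protocol in Exercise 2.2.9 can be simulated directly. This is a small sketch under the assumption that NumPy is available; `round_parities` is a hypothetical helper name. It applies a Hadamard gate to the particle of every party whose input bit is 0 and returns the parities of all outcomes that can occur.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)

def round_parities(wa, wb, wc):
    """Possible values of s_A xor s_B xor s_C for one triple of particles."""
    # |psi> = (|001> + |010> + |100> - |111>)/2
    psi = np.zeros(8)
    for index, sign in ((0b001, 1), (0b010, 1), (0b100, 1), (0b111, -1)):
        psi[index] = sign * 0.5
    # A party applies the Hadamard rotation only if its input bit is 0.
    gates = [H if w == 0 else I2 for w in (wa, wb, wc)]
    out = np.kron(np.kron(gates[0], gates[1]), gates[2]) @ psi
    return {bin(j).count("1") % 2 for j in range(8) if abs(out[j]) ** 2 > 1e-12}
```

For every input triple with `wa ^ wb ^ wc == 1` the set should contain exactly the bit `wa & wb & wc`, which is the per-position contribution to f.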
Remark 2.2.10 In the result presented in this section it has been demonstrated that en-
tanglement has the potential to act as a substitute for communication during multiparty
cooperation. On the other hand, the potential of entanglement for direct communication is
very restricted. This is discussed in more detail in Section 6.4 on teleportation, another
important application of entanglement.
Open problem 2.2.11 Can quantum entanglement decrease also two-party communication
complexity for some communication problems?
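Example 2.3.2 The so-called Hadamard (rotation) gates are represented by matrices
$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad H' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \qquad H'' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.$$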
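Example 2.3.3 The following one-parameter set of rotation gates (represented by matrices)
is also often used:
$$R_x(\theta) = \begin{pmatrix} \cos\theta & i\sin\theta \\ i\sin\theta & \cos\theta \end{pmatrix}, \quad R_y(\theta) = \begin{pmatrix} i\cos\theta & \sin\theta \\ \sin\theta & i\cos\theta \end{pmatrix}, \quad R_z(\theta) = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix},$$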
and
As already mentioned, the XOR gate (Figure 2.8) is of central importance for quantum computing.
Observe that if the target qubit has the input $|0\rangle$, then this gate can be used to
copy the basis states $|0\rangle$ and $|1\rangle$ from the control qubit. At the same time the gate in Figure 2.8 can
be seen as a classical wire. Indeed, inputs $|0\rangle$ and $|1\rangle$ on the control qubit come out on the
target qubit output, but a superposition $\alpha|0\rangle + \beta|1\rangle$ on the control qubit is transferred into
the entangled state $\alpha|00\rangle + \beta|11\rangle$. If we consider the output of the target qubit
as the overall output, then this output is a mixed state: $|0\rangle$ with probability $|\alpha|^2$ and $|1\rangle$ with probability
$|\beta|^2$; if $|\alpha| = |\beta|$, then the output is a random $|0\rangle$ or $|1\rangle$.
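The "classical wire" behaviour can be made concrete with a few lines of NumPy. The sketch below is illustrative only (the name `wire_output` is an assumption): it applies the XOR gate to $(\alpha|0\rangle + \beta|1\rangle)|0\rangle$ and traces out the control qubit to obtain the mixed state seen on the target output.

```python
import numpy as np

# XOR (CNOT) with the first qubit as control and the second as target.
XOR = np.array([[1, 0, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=complex)

def wire_output(alpha, beta):
    """Joint state after the gate and the reduced density matrix of the target."""
    control = np.array([alpha, beta], dtype=complex)
    target = np.array([1, 0], dtype=complex)              # |0>
    state = XOR @ np.kron(control, target)                # alpha|00> + beta|11>
    rho = np.outer(state, state.conj()).reshape(2, 2, 2, 2)
    rho_target = np.trace(rho, axis1=0, axis2=2)          # trace out the control qubit
    return state, rho_target
```

The reduced density matrix comes out diagonal with entries $|\alpha|^2$ and $|\beta|^2$: the target carries no coherence, exactly as a classical wire would.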
Figure 2.8: XOR gate as a real wire (control input $|\psi\rangle$, target input $|0\rangle$)
Just as with classical gates, quantum gates can also be described by “truth tables”
showing the outputs for the cases in which the inputs are the basis states $|0\rangle$ and $|1\rangle$. The point is that once
such a truth table is given, the linearity of quantum gate mappings allows us to determine the gate
outputs for all possible input states. In this way several often-used gates have been described
in Figures 1.14 and 1.15: the CNOT or XOR gate as well as the Fredkin and Toffoli gates.
Exercise 2.3.4 Describe unitary matrices for (a) Toffoli gate; (b) Fredkin gate.
Exercise 2.3.5 The XOR gate can be expressed using outer and tensor products as follows:
$|0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes \sigma_x$. Find a similar representation for: (a) Toffoli gate; (b) Fredkin
gate.
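The outer-product expression for the XOR gate quoted in Exercise 2.3.5 is easy to confirm numerically. The sketch below (NumPy assumed; variable names are illustrative) builds the two projectors and compares the result with the usual 4 × 4 matrix. It deliberately stops at the XOR gate and leaves the Toffoli and Fredkin cases to the exercise.

```python
import numpy as np

ket0 = np.array([[1], [0]])
ket1 = np.array([[0], [1]])
I2 = np.eye(2)
sigma_x = np.array([[0, 1], [1, 0]])

# |0><0| (x) I + |1><1| (x) sigma_x
XOR = np.kron(ket0 @ ket0.T, I2) + np.kron(ket1 @ ket1.T, sigma_x)
```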
The gates for basic Boolean reversible operations NOT, CNOT and CCNOT can be
described also using a notation for registers as follows:
The relation between a quantum circuit and the corresponding unitary matrix is far from
transparent even for simple circuits, and some experience is needed to develop the right
intuition in this respect. That is why there are quite a few (very simple) examples and
exercises in this section that are worth paying detailed attention to.
Figure 2.9: Elementary networks I (panels (a), (b), (c), built from the gates A and B)
Exercise 2.3.7 Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, $B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$ be unitary matrices of degree
2. Design unitary matrices of degree 4 that represent the mappings realized by the networks: (a)
in Figure 2.9a; (b) in Figure 2.9b; (c) in Figure 2.9c.
Exercise 2.3.8 Let us assume that unitary matrices A, B, C of degree 2 are given. De-
sign unitary matrices representing networks shown: (a) in Figure 2.10a; (b) in Fig-
ure 2.10b; (c) in Figure 2.10c; (d) for a network obtained by serial composition of net-
works in Figure 2.10a,b; (e) for a network obtained by serial composition of networks
from Figure 2.10a,c.
If gates $G_1$ and $G_2$ realize the mappings described by unitary matrices $A_1$ and $A_2$, then
the network in Figure 2.11a realizes the mapping described by the matrix $A_1 \otimes A_2$.
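A quick numerical illustration of this rule, assuming NumPy: `parallel` is a hypothetical helper that forms $A_1 \otimes A_2$, and the check confirms that acting with it on a product state is the same as letting each gate act on its own qubit.

```python
import numpy as np

def parallel(A1, A2):
    """Matrix of two gates acting side by side on different qubits."""
    return np.kron(A1, A2)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
sigma_x = np.array([[0, 1], [1, 0]])

u = np.array([0.6, 0.8])   # state of the first qubit
v = np.array([1.0, 0.0])   # state of the second qubit

joint = parallel(H, sigma_x) @ np.kron(u, v)
separate = np.kron(H @ u, sigma_x @ v)
```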
26 We are using here similar terminology as for classical circuits, in spite of the fact that there are no real
wires on the quantum level. Two ports of the gates communicate either by sharing a physical qubit or via
field interactions or using other physical means.
When considering a class of quantum circuits $\{C_i\}_{i=1}^{n}$ of a certain type, it is necessary to assume, if we
want to assign computations on such circuits to uniform complexity classes, that all such circuits can be
designed (computed) by a single classical Turing machine in polynomial time with respect to n.
Figure 2.10: Elementary networks II (panels (a), (b), (c), built from the gates A, B and C)
(Figure 2.11: panel (a) with the gates G1 and G2; panel (b) with Hadamard gates H.)
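The processing in the network on the left side of the identity in Figure 2.11b for the
input $|0\rangle|1\rangle$ can be depicted as follows:
$$|0\rangle|1\rangle \xrightarrow{\text{H gates}} \frac{1}{2}(|0\rangle|0\rangle + |1\rangle|0\rangle - |0\rangle|1\rangle - |1\rangle|1\rangle)$$
$$\xrightarrow{\text{XOR gate}} \frac{1}{2}(|0\rangle|0\rangle + |1\rangle|1\rangle - |0\rangle|1\rangle - |1\rangle|0\rangle)$$
$$\xrightarrow{\text{H gates}} |1\rangle|1\rangle.$$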
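The three-step calculation above can be replayed numerically. The sketch assumes NumPy and assumes that the identity of Figure 2.11b is the familiar one in which Hadamard gates on both qubits, applied before and after an XOR gate, exchange the roles of control and target; the figure itself is not reproduced here, so `XOR_reversed` is an assumption about its right-hand side.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
HH = np.kron(H, H)

# XOR with the first qubit as control, and with the second qubit as control.
XOR = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
XOR_reversed = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]])

left_side = HH @ XOR @ HH                 # H gates, XOR gate, H gates

ket01 = np.array([0, 1, 0, 0])            # |0>|1>
result = left_side @ ket01                # expected to be |1>|1>
```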
Exercise 2.3.9 Prove the equivalence of networks shown in: (a) Figure 2.11b; (b) Fig-
ure 2.12.
Exercise 2.3.10 Show how to design quantum circuits producing the cat state $\frac{1}{\sqrt{2}}(|0^{(l)}\rangle + |1^{(l)}\rangle)$
for: (a) l = 2; (b) l = 3; (c) for an arbitrary l.
Figure 2.12: Two equivalent circuits
Example 2.3.11 The XOR gate determined by the matrix XOR is an important example
2.3. QUANTUM CIRCUITS 85
of a 2-qubit gate. If depicted as in Figure 1.14, then a simple circuit of three such gates,
shown in Figure 2.13b, flips (that is, exchanges) the two qubits.
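That three XOR gates exchange two qubits can be checked by multiplying the matrices. The sketch below assumes NumPy and assumes the usual arrangement in which the middle gate has control and target interchanged; the names `XOR_12`, `XOR_21` and `SWAP` are illustrative.

```python
import numpy as np

XOR_12 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])  # control: qubit 1
XOR_21 = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]])  # control: qubit 2
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]])

three_xors = XOR_12 @ XOR_21 @ XOR_12

# Effect on a product state |phi>|psi>: the two factors change places.
phi = np.array([0.6, 0.8])
psi = np.array([1.0, 0.0])
swapped = three_xors @ np.kron(phi, psi)
```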
Figure 2.13: Generalized XOR gate notations (panels (a1) to (a4)) and a quantum circuit to flip the qubits (panel (b), taking $|\phi\rangle, |\psi\rangle$ to $|\psi\rangle, |\phi\rangle$)
The notation used for XOR and Toffoli gates in Figure 1.14 is often generalized to cover
the cases in which the target bits should flip if one of the control bits is 0 (represented by an empty circle),
in which there are several control and/or target bits, or in which we have a general controlled gate.
For example, the circuit in Figure 2.13a1 realizes the mapping $(a, b, c) \to (a, b, (a \wedge \bar{b}) \oplus c)$;
the circuit in Figure 2.13a2, the mapping $(a, b, c, d) \to (a, b, c, (a \wedge \bar{b} \wedge c) \oplus d)$. In the circuit
in Figure 2.13a3 the gate G is applied if and only if $a \wedge b \wedge \bar{c} = 1$. Finally, in the circuit
from Figure 2.13a4 the last two bits flip if and only if $a \wedge b = 1$ (this actually stands for two
consecutive XOR gates with the same control bits but different target bits). In addition, the
notation $XOR_{i,j}$ will be used to denote the case that an XOR gate is applied to the ith and
the jth qubit as the control and the target qubits, respectively.
(Figure 2.14: the bilateral quantum XOR circuit.)
Exercise 2.3.13 Show that the quantum circuit, the so-called bilateral quantum XOR
(BXOR), shown in Figure 2.14, transforms a pair of Bell states $|\phi\rangle$ and $|\psi\rangle$ into a pair
of Bell states $|\phi_1\rangle$, $|\psi_1\rangle$.
Exercise 2.3.14 Show that each of the Pauli matrices maps, in a one-to-one way, the states
of the Bell basis onto themselves if the matrix is applied to one of the two qubits of the given
Bell state. (In other words, Bell states can be converted into one another by unilateral
Pauli rotations.)
XOR is an important example of a quantum gate with two or more inputs that performs
so-called conditional quantum dynamics, in which one system (qubit) undergoes an
evolution that depends on the quantum state of another system (of other qubits).
Exercise 2.3.15 The circuit in Figure 2.15 (see Moore and Nilsson, 1998) performs a
permutation of 3 qubit states using 3 ancilla qubits. Show that any permutation of n
qubit states can be performed: (a) using 4 layers of XOR gates with n ancilla qubits; (b)
using 6 layers of XOR gates and no ancilla qubits.
Exercise 2.3.17 Design: (a) a circuit to recognize Bell states; (b) a circuit to generate
GHZ states; (c) a circuit to recognize GHZ states.
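(Figure 2.15: a circuit taking the inputs $|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle$ to the outputs $|\psi_3\rangle, |\psi_1\rangle, |\psi_2\rangle$ with three ancilla qubits that start and end in the state 0.)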
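Exercise 2.3.19 Apply the Hadamard transform $H_n$ to the following states, where $x, y \in \{0, 1\}^n$:
(a) $\frac{1}{\sqrt{2}}(|x\rangle + |y\rangle)$; (b) $\frac{1}{\sqrt{2}}(|x\rangle - |y\rangle)$.
Exercise 2.3.20 Design a quantum circuit transforming the state $\alpha|0\rangle + \beta|1\rangle$ into the
state: (a) $-\beta|0\rangle + \alpha|1\rangle$; (b) $\beta|0\rangle - \alpha|1\rangle$; (c) $\alpha|0\rangle - \beta|1\rangle$; (d) $-\alpha|0\rangle + \beta|1\rangle$.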
Figure 2.16: The Hadamard circuit $H_n$ and its application to the state $|0^{(n)}\rangle$ with the
outcome $|\phi\rangle = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |i\rangle$.
Exercise 2.3.21 Show that the operation $AND : (x, y) \to (x, y, x \wedge y)$ can be implemented:
(a) up to a phase by a quantum circuit consisting of four Hadamard gates and
three XOR gates; (b) by a quantum circuit consisting of six XOR gates and eight one-bit
gates.
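Exercise 2.3.22 (Barenco et al. 1997) Design a quantum circuit to realize the unitary
transformation $U_k : |0^{(k)}\rangle \to \frac{1}{\sqrt{k}} \bigl(|0^{(k)}\rangle + \sum_{i=1}^{k-1} |2^i\rangle\bigr)$. (Hint: use gates corresponding to
the following unitary matrices:
$$A_k = \frac{1}{\sqrt{k+1}} \begin{pmatrix} 1 & -\sqrt{k} \\ \sqrt{k} & 1 \end{pmatrix}$$
and
$$T_{j,k} = \frac{1}{\sqrt{k-j+1}} \begin{pmatrix} \sqrt{k-j+1} & 0 & 0 & 0 \\ 0 & 1 & \sqrt{k-j} & 0 \\ 0 & -\sqrt{k-j} & 1 & 0 \\ 0 & 0 & 0 & \sqrt{k-j+1} \end{pmatrix}.)$$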
The computational meaning of quantum circuits is defined as follows. For any quantum
circuit C with input variables x1 , . . . , xn and output variables y1 , . . . , ym , m ≤ n (they are
to be a subset of outputs), we associate to any input x ∈ {0, 1}n the probability distribution
ρx over {0, 1}m defined in the following way (see Yao, 1993).
For any input x the final quantum state v, corresponding to all output wires, not only
to those carrying output variables, has the form
$$v = \sum_{y \in \{0,1\}^m} \alpha_y |y\rangle,$$
where $\alpha_y$ is the amplitude obtained by the projection of v when the output variables are
set to the value y, i.e. $\alpha_y$ is the sum of amplitudes of those final outcomes having value
y in the wires corresponding to output variables. Then $\rho_x(y) = |\alpha_y|^2$ is the corresponding
probability and $\{\rho_x \mid x \in \{0, 1\}^n\}$ is said to be the distribution generated by the circuit C.
One of the main results of classical computation theory says that each TM can be simulated
in polynomial time by a uniform family of Boolean circuits. A similar result also holds
for quantum Turing machines and quantum circuits, and due to this result quantum circuits
are nowadays used more often than QTMs to develop quantum algorithms. To present
the corresponding result we first need to introduce the concept of simulation of a QTM by
a quantum circuit.
Definition 2.3.23 A quantum circuit C with n input variables is said to (n, t)-simulate a
QTM M , if for each x ∈ {0, 1}n the probability distribution ρx generated by C is identical
to the distribution of the configurations of M after t steps with x as input.
Theorem 2.3.24 If M is a QTM and n, t ∈ N≥0 , then there exists a quantum Boolean
circuit C of size (number of gates) polynomial in n and t that (n, t)-simulates M .
Remark 2.3.25 Quantum gates are far from easy to implement. It seems safe to say that
the potential computational power of quantum computers does not come from the intrinsic
speed of quantum gates, but from the fact that quantum circuits solving some problems
can have exponentially fewer gates than their classical counterparts.
The task of designing efficient quantum algorithms can be seen as a unitary matrix
factorization problem: given a universal set $\mathcal{U}$ of basic unitary matrices and an n × n unitary
matrix U, how to decompose U into a product of poly(lg n) matrices from $\mathcal{U}$. Quantum
programming, that is, the design and analysis of quantum algorithms and networks, therefore requires quite
a different expertise (see, for example, Høyer, 1997). The development of efficient factorization
methods and of proper sets of basic unitary matrices is still a task to be dealt with. In the next
chapter we deal with quantum algorithm design problems in a more traditional spirit: as
an art of composing unitary transformations to perform the final unitary transformation.
Example 2.3.26 Consider the two quantum circuits depicted in Figure 2.17. The first one
consists of two Hadamard gates $H_n$ followed by a measurement gate with respect
to the standard observable. The second circuit has, in addition, a measurement gate
between the two Hadamard gates.
There is an essential difference between these two circuits, and the inclusion of the
measurement gate between the two Hadamard gates makes the second circuit more interesting.
Indeed, since $H_n^2 = I$, the result of the measurement gate of the circuit in Figure 2.17a
is $|0^n\rangle$ with probability 1. On the other hand, in the second circuit, the first measurement
gate will observe a random n-bit string x with probability $\frac{1}{2^n}$. The output of the second gate $H_n$
is then $H_n|x\rangle = \frac{1}{\sqrt{2^n}} \sum_{y \in \{0,1\}^n} (-1)^{y \cdot x} |y\rangle$. The second measurement gate therefore observes
each n-bit string y with probability $\frac{1}{2^n}$.
Figure 2.17: Measurement gates and their role
Generation of a random string, as provided by the gates in Figure 2.17b, is often needed in
quantum computation. It is therefore natural to ask whether the inclusion of an intermediate
measurement is the only (easy) way of doing that or whether there is a simple way to avoid
such intermediate measurements. As we shall see, there is: the so-called copying technique,
using XOR gates. This will be illustrated in the following simple example using only H
gates, but the technique can be used in general. This technique will play an important role
in Sections 7.4 and 7.5.
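The difference between the two circuits of Figure 2.17 can be reproduced with a short simulation. This is a sketch assuming NumPy; `circuit_a` and `circuit_b` are illustrative names, and the intermediate measurement is modelled by replacing the state with the classical mixture of basis states it produces.

```python
import numpy as np
from functools import reduce

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def hadamard_n(n):
    """Matrix of H_n, the n-fold tensor power of the Hadamard gate."""
    return reduce(np.kron, [H] * n)

def circuit_a(n):
    """Output distribution of H_n, H_n, measurement applied to |0...0>."""
    Hn = hadamard_n(n)
    state = np.zeros(2 ** n)
    state[0] = 1.0
    return np.abs(Hn @ Hn @ state) ** 2

def circuit_b(n):
    """Output distribution of H_n, measurement, H_n, measurement applied to |0...0>."""
    Hn = hadamard_n(n)
    state = np.zeros(2 ** n)
    state[0] = 1.0
    p_first = np.abs(Hn @ state) ** 2      # distribution of the intermediate outcome x
    # After observing x the state is |x>, so the second H_n yields column x of H_n.
    return (np.abs(Hn) ** 2) @ p_first
```

For n = 3, `circuit_a` returns the distribution concentrated on the all-zero string, while `circuit_b` returns the uniform distribution over the eight 3-bit strings.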
Example 2.3.27 As discussed in the previous example, the network in Figure 2.18a can be
seen as generating a random bit. However, the networks in Figure 2.18b,c can do the same.
In the circuit in Figure 2.18b, after the XOR gate the state is $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ and the
first bit is random. After the next application of the Hadamard gate on the first bit we get
$$(H \otimes I)\Bigl(\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)\Bigr) = \frac{1}{2}(|00\rangle + |10\rangle + |01\rangle - |11\rangle)$$
Figure 2.18: Power of copying circuits
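A minimal numerical sketch of the copying technique, assuming NumPy and assuming that the circuit of Figure 2.18b is H on the first qubit, an XOR onto a second qubit prepared in $|0\rangle$, and a second H on the first qubit (the figure itself is not reproduced here). The helper `first_qubit_distribution` is a hypothetical name; it returns the measurement statistics of the first qubit with and without the copying XOR gate.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
XOR = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])

def first_qubit_distribution(use_xor):
    """Probabilities of observing 0 and 1 on the first qubit at the end."""
    state = np.array([1.0, 0.0, 0.0, 0.0])         # |00>
    state = np.kron(H, I2) @ state
    if use_xor:
        state = XOR @ state                        # copy the first qubit onto the second
    state = np.kron(H, I2) @ state
    probs = np.abs(state) ** 2
    return np.array([probs[0] + probs[1], probs[2] + probs[3]])
```

With the XOR gate the first qubit is uniformly random, as if it had been measured between the two Hadamard gates; without it the two Hadamard gates cancel and the outcome is 0 with certainty.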
It is easy to see that if a matrix M is ε-close to a unitary matrix, then 1−ε ≤ ||M || ≤ 1+ε
and its rows, Mi , have the norm close to 1 and, in addition, they are almost orthogonal.
Exercise 2.3.29 (Bernstein and Vazirani, 1997.) Show that if a matrix M of degree d is
ε-close to a unitary matrix, then the following holds for its rows $M_i$: (a) $1 - \varepsilon \le \|M_i\| \le 1 + \varepsilon$;
(b) if $i \ne j$, then $\|M_i M_j^*\| \le 2\varepsilon + 3\varepsilon^2$.
27 One also says that a set of gates is universal if the subgroup of unitary transformations generated
by the unitary transformations corresponding to the gates in the set is dense in the group of all unitary
transformations U (n), for any n.
Exercise 2.3.31 Let $\phi_0$ be an irrational fraction of π. (a) Show that any gate
$$U_\phi = \begin{pmatrix} e^{i\phi} & 0 \\ 0 & 1 \end{pmatrix}$$
can be implemented with an arbitrary precision using (several copies of) a single gate
$U_{\phi_0}$; (b) show that the number of gates required to achieve ε accuracy of approximation
is $O(\frac{1}{\varepsilon})$.
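The first result concerning the universality of quantum gates was due to Deutsch (1989),
who showed that any 3-qubit gate with the unitary matrix
$$D(\theta) = \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & i\cos\theta & \sin\theta \\ 0 & 0 & \sin\theta & i\cos\theta \end{matrix} \end{pmatrix},$$
where $\mathbf{0}$ and $\mathbf{1}$ are the zero and unit matrices of degree 4, depicted in short in Figure 2.19a, with
$D = R_y(\theta)$ (see page 82), is universal provided θ/π is irrational. Observe that the Deutsch
gate can be seen as a quantum generalization of the Toffoli gate.
(Figure 2.19: short notations for the gates D, A and V; panels (a) and (b).)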
The following example presents the first step which led to a universal 2-qubit gate.
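Example 2.3.32 (Sleator and Weinfurter, 1995) Consider the gates realizing the operators
specified by the matrices:
$$S(\tau) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{\frac{i\pi}{4}} \cos\frac{\pi\tau}{2} & e^{-\frac{i\pi}{4}} \sin\frac{\pi\tau}{2} \\ 0 & 0 & e^{-\frac{i\pi}{4}} \sin\frac{\pi\tau}{2} & e^{\frac{i\pi}{4}} \cos\frac{\pi\tau}{2} \end{pmatrix}, \quad S^{-1}(\tau) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{-\frac{i\pi}{4}} \cos\frac{\pi\tau}{2} & e^{\frac{i\pi}{4}} \sin\frac{\pi\tau}{2} \\ 0 & 0 & e^{\frac{i\pi}{4}} \sin\frac{\pi\tau}{2} & e^{-\frac{i\pi}{4}} \cos\frac{\pi\tau}{2} \end{pmatrix}.$$
Both gates perform transformations of the target bit conditional on the control
bit being 1. Clearly $S(\tau) S^{-1}(\tau) = I$. In addition, it holds
$$S^2\Bigl(\frac{\tau}{2}\Bigr) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & i\cos\frac{\pi\tau}{2} & \sin\frac{\pi\tau}{2} \\ 0 & 0 & \sin\frac{\pi\tau}{2} & i\cos\frac{\pi\tau}{2} \end{pmatrix},$$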
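The identities stated in Example 2.3.32 can be verified numerically. The following sketch assumes NumPy; the function `S` with its `inverse` flag is an illustrative encoding of the two matrices above, not notation from the book.

```python
import numpy as np

def S(tau, inverse=False):
    """Sleator-Weinfurter gate S(tau), or S^{-1}(tau) if inverse is True."""
    c, s = np.cos(np.pi * tau / 2), np.sin(np.pi * tau / 2)
    p, q = np.exp(1j * np.pi / 4), np.exp(-1j * np.pi / 4)
    if inverse:
        p, q = q, p
    M = np.eye(4, dtype=complex)
    M[2:, 2:] = [[p * c, q * s], [q * s, p * c]]
    return M

def S_squared_formula(tau):
    """Closed form given in the text for S(tau/2) applied twice."""
    c, s = np.cos(np.pi * tau / 2), np.sin(np.pi * tau / 2)
    M = np.eye(4, dtype=complex)
    M[2:, 2:] = [[1j * c, s], [s, 1j * c]]
    return M

XOR = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
```

In particular, two applications of S(1/2) reproduce the XOR gate, which is the sense in which S(1/2) is a square root of XOR.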
(Figure: networks on the qubits b and c composed of the gates $S_{\tau/2}$ and $S_{\tau/2}^{-1}$; panels (a) and (b).)
We now show a modified result, adapted from Barenco (1995), namely that any
2-input/output gate with the matrix
$$A(\phi, \alpha, \theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{i\alpha}\cos\theta & -ie^{i(\alpha-\phi)}\sin\theta \\ 0 & 0 & -ie^{i(\alpha+\phi)}\sin\theta & e^{i\alpha}\cos\theta \end{pmatrix},$$
and with the short notation for the gate in Figure 2.19b, is universal if α, φ and θ are
irrational multiples of π and of each other.
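The proof is by providing an explicit construction of the circuit implementing the gate
D(θ) via a 3-qubit gate specified by the matrix
$$V(\phi, \alpha, \theta) = \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{i\alpha}\cos\theta & -ie^{i(\alpha-\phi)}\sin\theta \\ 0 & 0 & -ie^{i(\alpha+\phi)}\sin\theta & e^{i\alpha}\cos\theta \end{matrix} \end{pmatrix},$$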
Since α and θ are irrational in the above sense, any transformation $A(\phi, \alpha_1, \theta_1)$, with $\alpha_1, \theta_1 \in [0, 2\pi]$,
can be implemented with arbitrary precision using several gates A.
Exercise 2.3.33 Show that if $\alpha_1$ and $\theta_1$ are specified with accuracy ±ε, then $O(\frac{1}{\varepsilon^2})$
applications of the gate A are needed to approximate $A(\phi, \alpha_1, \theta_1)$ with a given precision
ε.
28 $S(\frac{1}{2})$ is therefore called “a square root of XOR”.
Let us call “repertoire of A” the set of unitary transformations that can be approximated
with an arbitrary precision by networks composed of A-gates only. In this repertoire there
is clearly the inverse of the transformation A because
$$A^{-1}(\phi, \alpha, \theta) = A(\phi, 2\pi - \alpha, 2\pi - \theta).$$
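The stated formula for the inverse can be confirmed numerically for sample parameter values. A small sketch, assuming NumPy; the function `A` below simply encodes the matrix $A(\phi, \alpha, \theta)$ given earlier in this section, and the sample angles are arbitrary.

```python
import numpy as np

def A(phi, alpha, theta):
    """Two-qubit gate A(phi, alpha, theta); it acts on the target when the control is 1."""
    M = np.eye(4, dtype=complex)
    M[2:, 2:] = [
        [np.exp(1j * alpha) * np.cos(theta), -1j * np.exp(1j * (alpha - phi)) * np.sin(theta)],
        [-1j * np.exp(1j * (alpha + phi)) * np.sin(theta), np.exp(1j * alpha) * np.cos(theta)],
    ]
    return M

phi, alpha, theta = 0.4, 1.3, 0.9          # arbitrary sample angles
gate = A(phi, alpha, theta)
candidate_inverse = A(phi, 2 * np.pi - alpha, 2 * np.pi - theta)
```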
Let $A_{ij}$, $1 \le i \ne j \le 3$, denote the two-qubit gate obtained from A by having the ith
qubit as its control qubit and the jth qubit as its target qubit. All such gates are
clearly in the repertoire of A. It is now easy to verify that the network from Figure 2.21,
described by the matrix
$$A_{23}\bigl(\phi, \tfrac{\alpha}{2}, \tfrac{\theta}{2}\bigr)\, A_{13}\bigl(\phi, \tfrac{\alpha}{2}, \tfrac{\theta}{2}\bigr)\, A_{12}\bigl(\phi, \tfrac{\pi}{2}, \tfrac{\pi}{2}\bigr)\, A_{23}^{-1}\bigl(\phi, \tfrac{\alpha}{2}, \tfrac{\theta}{2}\bigr)\, A_{12}\bigl(\phi, \tfrac{\pi}{2}, \tfrac{\pi}{2}\bigr)$$
implements the gate V(φ, α, θ); see Example 2.3.32.
(Figure 2.21: the gate V expressed as a network of gates A and $A^{-1}$.)
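Denote now by V′ the gate obtained from V by exchanging the second and the third
qubit, and let us denote by P the matrix (gate) V′(φ, π/2, π/2). In such a case
$$P = V'(\phi, \pi/2, \pi/2) = \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & e^{-i\phi} \\ 0 & 0 & 1 & 0 \\ 0 & e^{i\phi} & 0 & 0 \end{matrix} \end{pmatrix}.$$
In addition, let us denote
$$Q = V'\bigl(\phi, \tfrac{\pi}{2}, -\tfrac{\pi}{2}\bigr)\, V\bigl(\phi, \tfrac{\pi}{2}, -\tfrac{\pi}{2}\bigr)\, V'\bigl(\phi, \tfrac{\pi}{2}, -\tfrac{\pi}{2}\bigr) = \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \end{pmatrix}$$
and
$$T(\phi, \beta) = Q\,[V(\phi, 0, \beta) P]^2\,[V(\phi, 0, -\beta) P]^2\, Q.$$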
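For very small β, we get
$$T(\phi, \beta) = \mathbf{1} + i\beta^2 \begin{pmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & ie^{-i\phi} \\ 0 & 0 & -ie^{i\phi} & 0 \end{matrix} \end{pmatrix} + O(\beta^3),$$
where $O(\beta^3)$ denotes a matrix the norm of which is $O(\beta^3)$. Hence the transformation
$$V\bigl(\phi - \tfrac{\pi}{2}, 0, \beta\bigr) = \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\beta & e^{-i\phi}\sin\beta \\ 0 & 0 & -e^{i\phi}\sin\beta & \cos\beta \end{matrix} \end{pmatrix} = \lim_{n \to \infty} T\bigl(\phi, \sqrt{\beta/n}\bigr)^n$$
can also be performed with an arbitrary precision by networks with the gate A as the only
gate, and the same is therefore true for the transformation (gate)
$$R_z(\beta) = \lim_{n \to \infty} \Bigl[ V\Bigl(\phi, 0, \sqrt{\tfrac{\beta}{2n}}\Bigr)\, V\Bigl(\phi - \tfrac{\pi}{2}, 0, \sqrt{\tfrac{\beta}{2n}}\Bigr)\, V\Bigl(\phi, 0, -\sqrt{\tfrac{\beta}{2n}}\Bigr)\, V\Bigl(\phi - \tfrac{\pi}{2}, 0, -\sqrt{\tfrac{\beta}{2n}}\Bigr) \Bigr]^n$$
$$= \begin{pmatrix} \mathbf{1} & \mathbf{0} \\ \mathbf{0} & \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & e^{i\beta} & 0 \\ 0 & 0 & 0 & e^{-i\beta} \end{matrix} \end{pmatrix}.$$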
Exercise 2.3.34 (Barenco, 1996) Show that: (a) there is no one-qubit universal gate; (b)
no classical gate can be universal for quantum computing; (c) the gate $A(\frac{\pi}{2}, \frac{\pi}{4}, \theta)$ is universal
under certain conditions.
Open problem 2.3.35 Determine the set of all non-universal quantum gates.
The task of finding 2-qubit universal gates is of theoretical and also practical importance.
However, this is not the only way to go in searching for simple gates that can be used to
design a quantum circuit.
As already mentioned, the XOR gate is not universal for quantum computing. In spite of
that, it has a firm role in the search for universality in quantum computing.
Barenco et al. (1995) have shown that the XOR gate, when supplemented by the set of
the following one-qubit gates (that perform general rotations of single qubits), is sufficient to
implement any unitary transformation:
$$\begin{pmatrix} e^{i(\delta + \frac{\alpha}{2} + \frac{\beta}{2})} \cos\frac{\theta}{2} & e^{i(\delta + \frac{\alpha}{2} - \frac{\beta}{2})} \sin\frac{\theta}{2} \\ -e^{i(\delta - \frac{\alpha}{2} + \frac{\beta}{2})} \sin\frac{\theta}{2} & e^{i(\delta - \frac{\alpha}{2} - \frac{\beta}{2})} \cos\frac{\theta}{2} \end{pmatrix}$$
On the basis of the above results we see that while one- and two-bit operations are
classical computation primitives, one- and two-qubit unitary operations are quantum
computation primitives.
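As a sanity check of the four-parameter family of one-qubit gates above, the following sketch (NumPy assumed; `general_rotation` and the sample parameters are illustrative) builds the matrix and tests that it is unitary and that its determinant is $e^{2i\delta}$.

```python
import numpy as np

def general_rotation(delta, alpha, beta, theta):
    """General one-qubit gate with the four real parameters used in the text."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [np.exp(1j * (delta + alpha / 2 + beta / 2)) * c,
         np.exp(1j * (delta + alpha / 2 - beta / 2)) * s],
        [-np.exp(1j * (delta - alpha / 2 + beta / 2)) * s,
         np.exp(1j * (delta - alpha / 2 - beta / 2)) * c],
    ])

U = general_rotation(0.3, 1.1, -0.7, 2.0)   # arbitrary sample parameters
```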
XOR gate has also been shown to be an important component of various decompositions
of 3-qubit gates into 2-input/output gates. All these results make the XOR gate of central
importance for quantum computation.
In addition, it can be shown that XOR gate and a single one-qubit gate form a universal
set of gates (see Section 5.1).
Remark 2.3.36 Current quantum computing algorithms use qubits and quantum registers
as the basic building blocks. This implies that the quantum mechanical systems currently used for
quantum computing are composed of two-state quantum systems. Theoretically, we could
use quantum systems with three or more states as the basic quantum systems. This would be a
generalization of old attempts to use 3-valued or multi-valued logic for classical computing.
In the classical case it has not been demonstrated that such a generalization brings any
essential advantages. The quantum case has not yet been investigated and the situation may
be quite different. The non-trivial question of universal quantum gates for
3-valued logic is also of interest.
Exercise 2.3.38 (a) Design the carry gate from Figure 2.22a, using one XOR and one
Toffoli gate; (b) design the summation gate from Figure 2.22b, using two XOR gates.
Figure 2.23a shows a schematic notation for a binary adder. By reversing the order in
which the gates of an adder are applied we get a network, schematically shown in Figure 2.23b,
to compute an ordinary subtraction $(a, b) \to (a, b - a)$ when b ≥ a, and a “modulo $2^n$
subtraction” $(a, b) \to (a, 2^n + (b - a))$ if a > b. In the latter case the most significant bit of
the second register always contains 1. This will be essentially used in the next construction
of the modular adder.
A modular adder for (a+b) mod N is shown in Figure 2.24. The basic idea is very simple.
Adder A1 provides the outputs (a, a + b) and the subtractor S1 produces a + b − N for the
(Figure 2.22: (a) carry gate; (b) summation gate; (c) addition network.)
or as
Figure 2.24: A quantum network for modular addition (adders A1, A2, A3 and subtractors S1, S2 acting on the registers a and b, with the register N and a temporary qubit t = 0; the outputs are a and a + b mod N)
$$ab = \Bigl(\sum_{i=0}^{n-1} a 2^i b_i\Bigr) \bmod N,$$
where $b = \sum_{i=0}^{n-1} b_i 2^i$, i.e. as (n − 1) additions, where in the ith addition $a 2^i$ is added if and
only if $b_i = 1$.
Exercise 2.3.39 (a) Design a quantum network for modular multiplication; (b) design
a quantum network for modular exponentiation.
$$a^b = \prod_{i=0}^{n-1} (a^{2^i})^{b_i}.$$
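The two decompositions above (multiplication as a sequence of conditional additions, exponentiation as a sequence of conditional multiplications) can be sketched classically. The code below is only a classical illustration of the arithmetic that the quantum networks have to realize, not a quantum network; the function names `mod_mul` and `mod_exp` are assumptions.

```python
def mod_mul(a, b, N):
    """a*b mod N as a sum of terms a*2^i, each added only if bit b_i equals 1."""
    acc = 0
    for i in range(b.bit_length()):
        if (b >> i) & 1:                    # controlled modular addition
            acc = (acc + (a << i)) % N
    return acc

def mod_exp(a, b, N):
    """a^b mod N as a product of factors a^(2^i), each used only if b_i equals 1."""
    result = 1 % N
    power = a % N                           # holds a^(2^i) mod N
    for i in range(b.bit_length()):
        if (b >> i) & 1:                    # controlled modular multiplication
            result = mod_mul(result, power, N)
        power = mod_mul(power, power, N)    # squaring gives a^(2^(i+1))
    return result
```

In a quantum network each `if` becomes a control on the qubit holding $b_i$, so the number of modular additions grows with the square of the register length for multiplication and one further factor for exponentiation.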
Concerning the efficiency of networks for (modular) arithmetical operations, two quantitative
measures are of importance: the total number of elementary gates and the total
number of qubits needed. The problem has received special attention because in Shor’s
algorithm the efficiency of the exponentiation $a^x \bmod N$ is of key importance for the overall efficiency.
According to Vedral, Barenco and Ekert (1996), 4n + 3 qubits are sufficient for exponentiation in
Shor’s factorization algorithm, where n is the number of qubits needed to store N, 2n is
the number of qubits needed to store x (because x can be as large as $N^2$), and n + 3
temporary qubits are sufficient.
Definition 2.3.40 A superoperator gate G of type (k, l) is a completely positive map which
maps density matrices on k qubits to density matrices on l qubits. Its action on a density
matrix ρ will be denoted symbolically by G ◦ ρ.
Two important special cases of superoperator gates are unitary and measurement gates.
In the case of a unitary gate U the corresponding operator is $U \cdot U^*$. For a pure state $|\phi\rangle$, U
maps $\rho = |\phi\rangle\langle\phi|$ into $U \circ \rho = U \rho U^*$. A measurement gate represents a probabilistic projection
into a set of mutually orthogonal subspaces, which produces a mixed state. Superoperators
are in general not reversible.
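The two special cases can be illustrated on density matrices of a single qubit. This is a minimal sketch assuming NumPy; `unitary_gate` and `measurement_gate` are illustrative names, and the measurement gate is modelled as the map $\rho \to \sum_i P_i \rho P_i$ over orthogonal projectors $P_i$.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
P0 = np.diag([1.0, 0.0])                   # projector onto |0>
P1 = np.diag([0.0, 1.0])                   # projector onto |1>

def unitary_gate(U, rho):
    """Action of a unitary gate on a density matrix: rho -> U rho U*."""
    return U @ rho @ U.conj().T

def measurement_gate(rho, projectors=(P0, P1)):
    """Probabilistic projection onto orthogonal subspaces; the outcome is not recorded."""
    return sum(P @ rho @ P for P in projectors)

plus = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)
rho_plus = np.outer(plus, plus.conj())
rho_minus = np.outer(minus, minus.conj())
```

Both `rho_plus` and `rho_minus` are mapped by the measurement gate to the same mixed state I/2, so the input cannot be recovered from the output; this is one way to see that superoperators are in general not reversible.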
There is a well-known and well-understood relation between superoperators and unitary
operators.
Lemma 2.3.41 The following conditions are equivalent for any two Hilbert spaces $H_n$ and
$H_m$ and sets of linear operators $L(H_n)$ and $L(H_m)$.
2. There is a Hilbert space F with $\dim(F) \le \dim(H_n) \dim(H_m)$, and a unitary embedding
$E : H_n \to H_n \otimes F$ such that $T\rho = \mathrm{Tr}_F(E \rho E^*)$ for all $\rho \in L(H_n)$.
Figure 2.25: Encoder as superoperator and its unitary embedding
Example 2.3.42 Superoperators of special interest for quantum information processing are
encoders and decoders. An encoder E that maps n qubits to m qubits, m ≥ n (see Figure 2.25a),
can be seen in a larger quantum space, extended by k ≥ m − n qubits in the
initial state $|0^{(k)}\rangle$, the so-called ancilla qubits, as a unitary operator $U_E$ (Figure 2.25b), some
of the outputs of which are then discarded.
Each superoperator circuit produces, for a given input density matrix, an output density
matrix which is defined in the natural way as follows. If Q is a quantum superoperator
circuit and G1 , . . . , Gt is a topological sort of its gates, then Q computes the density matrix
Q ◦ ρ = Gt ◦ Gt−1 ◦ . . . ◦ G1 ◦ ρ.
In order to show that this definition is consistent one has to show that any two different
topological orderings of gates yield the same result. A step in this direction is to show that
if $G_1$ and $G_2$ are superoperator gates operating on different qubits, then $G_1 \circ G_2 \circ \rho = G_2 \circ G_1 \circ \rho$
for any density matrix ρ. This can be shown easily if we consider an extension
of superoperators by tensoring with unitary matrices as discussed above. The probability
distribution that such a circuit computes is defined in the following way:
Definition 2.3.44 Let Q be a quantum circuit with n inputs (blanks) and m outputs (results).
The probability distribution $f_Q : \{0, 1\}^n \to [0, 1]^{\{0,1\}^m}$ that Q computes is defined as
follows: For an input i the probability of the output j is
As a quite straightforward corollary of Lemma 2.3.41 we have the main result concerning
the computational power of superoperator gates.
Lemma 2.3.45 If $G : L(H_{2^n}) \to L(H_{2^n})$ is a superoperator gate of type (n, m), then there
exists a unitary quantum gate $U_g$ on 2n + m qubits such that for any density matrix ρ of the
order n,
$$G \circ \rho = \bigl(U_g \circ (\rho \otimes |0^{(n+m)}\rangle\langle 0^{(n+m)}|)\bigr)\big|_A,$$
where A is the set of the first n qubits.
$$G = \mathrm{Tr}_{H_{2^{n+m}}}(U V_0 \cdot V_0^* U^*).$$
As a corollary we get
Theorem 2.3.46 The model of quantum circuits with mixed states is polynomially equiva-
lent, in computational power, to the standard model of quantum circuits over pure states.
Chapter 3
ALGORITHMS
INTRODUCTION
Quantum algorithms make use of several specific features of the quantum world, for example
quantum superposition, to get from classical inputs, through entangled states, to classical
outputs more efficiently than classical algorithms. A variety of quantum algorithms are
presented in this chapter. They range from pioneering algorithms, simple but powerful, for
several promise problems, through seminal Shor’s algorithms and a variety of algorithms for
various search problems and their modifications, due to Grover and others.
The design of faster-than-classical quantum algorithms for important algorithmic problems
has been an interesting intellectual adventure and achievement. Their existence remains
one of the key stimuli for those trying to overcome the enormous technological problems of building
(powerful) quantum computers.
Methods to design quantum algorithms and to show limitations of quantum power have
also been developed gradually and will be presented and illustrated in this chapter.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
102 CHAPTER 3. ALGORITHMS
Q
converting its opponents: it rarely happens
that Saul becomes Paul. What does happen is
that its opponents gradually die out and that
the growing generation is familiarized with the
idea from the beginning.
Max Planck (1936)
yields
$$A|\phi\rangle = \sum_{i=0}^{2^n-1} c_i A|i\rangle,$$
i.e. by a single application of the operator A (on a “single processor”), exponentially many,
namely 2n , operations on basis states are performed. This phenomenon is called quantum
parallelism and it is of great importance for the design of efficient quantum algorithms.
Observe that quantum parallelism is already for a modest n a really massive parallelism.
Quantum computing can therefore trade exponentiality in time for exponentiality in
quantum interference. In addition, in quantum registers the amount of parallelism increases
exponentially with the size of the system, and this exponential growth of parallelism requires
only a linear increase in the amount of physical space needed.1
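$$|\phi\rangle = \frac{1}{\sqrt{2^m}} \sum_{x=0}^{2^m-1} |x, 0\rangle \xrightarrow{U_f} \frac{1}{\sqrt{2^m}} \sum_{x=0}^{2^m-1} |x, f(x)\rangle = U_f|\phi\rangle = |\psi\rangle \qquad (3.2)$$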
which changes the sign of the amplitude for those basis states $|x\rangle$ for which $f(x) = 1$. Using
one additional qubit, in the state $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$, the operator $V_f$ can be expressed using the
operator $U_f$ as follows:
$$U_f\,|x, \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle)\rangle = \tfrac{1}{\sqrt{2}}(|x, f(x)\rangle - |x, 1 \oplus f(x)\rangle) = (-1)^{f(x)}\,|x, \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle)\rangle.$$
With the exception of some trivial cases, the resulting state $|\psi\rangle$ in (3.2) is entangled.
Indeed, by measuring the first m qubits (or “x-register”) with respect to the standard basis,
we get a value $x_0$ randomly chosen from the set $\{0, 1, \ldots, 2^m - 1\}$ and the state collapses to
$|x_0, f(x_0)\rangle$. A subsequent measurement of the second register then gives us $f(x_0)$. However,
used this way, quantum algorithms provide no advantage over classical ones. Fortunately, as
illustrated in the following examples, in some cases there is a cleverer way to make use
of quantum entanglement in (3.2) to compute efficiently some global properties of f.
Example 3.1.2 (van Dam, 1998) Let a function $f : \{1, \ldots, n\} \to \{0, 1\}$ be given as a
black box. To determine f classically, n calls of f are needed to get the string
$w_f = f(1)f(2) \ldots f(n)$. Quantumly, this can be done, with probability greater than 0.95, using
$\frac{n}{2} + \sqrt{n}$ quantum calls of f. Indeed, by (2.3)
$$|w_f\rangle = H_n \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} (-1)^{x \cdot w_f} |x\rangle \qquad (3.3)$$
In order to compute $x \cdot w_f$ one needs hw(x) calls of f, where hw(x) is the Hamming weight
of x, that is, the number of 1’s in x.
The basic trick is to compute the sum in (3.3) but only for x such that hw(x) ≤ k, for a
suitable k.
If $F_k$ is such a function that for $x \in \{0, 1\}^n$, $F_k(x) = x \cdot w_f$ if hw(x) ≤ k and $F_k(x) = 0$
otherwise, then
$$V_{F_k}|x\rangle = (-1)^{x \cdot w_f}|x\rangle,$$
if hw(x) ≤ k and $V_{F_k}|x\rangle = |x\rangle$, otherwise. Therefore if $V_{F_k}$ is applied to the (initial) state
$$|\psi_k\rangle = \frac{1}{\sqrt{M_k}} \sum_{x \in \{0,1\}^n,\ hw(x) \le k} |x\rangle,$$
where $M_k = \sum_{i=0}^{k} \binom{n}{i}$, then
$$|\psi_k'\rangle = V_{F_k}|\psi_k\rangle = \frac{1}{\sqrt{M_k}} \sum_{x \in \{0,1\}^n,\ hw(x) \le k} (-1)^{x \cdot w_f}|x\rangle.$$
In order to compute $|\psi_k'\rangle$, at most k calls of f are needed. Let us now measure all n qubits
of $|\psi_k''\rangle = H_n|\psi_k'\rangle$. The probability that this way we get $w_f$ is
$$Pr(|\psi_k''\rangle \text{ yields } w_f) = |\langle w_f|\psi_k''\rangle|^2 = \frac{M_k}{2^n} = \frac{1}{2^n} \sum_{i=0}^{k} \binom{n}{i}$$
and, as one can calculate, this probability is greater than 0.95 if $k = \frac{n}{2} + \sqrt{n}$.
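The success probability $M_k / 2^n$ can be evaluated directly for small n. The following is a minimal sketch, not taken from the book: the function names are hypothetical, and rounding $\frac{n}{2} + \sqrt{n}$ up to the next integer is an assumption made so that k is a whole number of calls.

```python
from math import ceil, comb, sqrt

def success_probability(n: int, k: int) -> float:
    """Return M_k / 2^n, where M_k counts the n-bit strings of Hamming weight <= k."""
    k = min(k, n)
    m_k = sum(comb(n, i) for i in range(k + 1))
    return m_k / 2 ** n

def call_budget(n: int) -> int:
    """Smallest integer k with k >= n/2 + sqrt(n) (rounding up is an assumption)."""
    return ceil(n / 2 + sqrt(n))

for n in (16, 64, 100, 400):
    k = call_budget(n)
    # Each printed probability can be compared with the 0.95 bound quoted in the example.
    print(n, k, round(success_probability(n, k), 4))
```

The sum is the cumulative binomial distribution at about two standard deviations above the mean n/2, which is why the value stays above 0.95 as n grows.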
3.1. QUANTUM PARALLELISM AND SIMPLE ALGORITHMS 105
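$$|0\rangle|1\rangle \xrightarrow{H_2} \tfrac{1}{2}(|0\rangle + |1\rangle)(|0\rangle - |1\rangle)$$
$$= \tfrac{1}{2}\big(|0\rangle(|0\rangle - |1\rangle) + |1\rangle(|0\rangle - |1\rangle)\big)$$
$$\xrightarrow{U_f} \tfrac{1}{2}\big(|0\rangle(|0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle) + |1\rangle(|0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle)\big) \qquad (3.4)$$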
2 From now on the assumption that a function f is given as a black box, or oracle, means that it is not
possible to obtain knowledge about f by any other means than by evaluating it on points of its domain.
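[Figure 3.1, circuit diagrams. Panel (a): the first qubit starts in $|0\rangle$, passes through H, $U_f$ and H and is measured ($|0\rangle$: f is constant; $|1\rangle$: f is balanced); the second qubit starts in $|0\rangle$ and is measured ($|0\rangle$: no information about f; $|1\rangle$: information by first qubit). Panel (b): the first qubit starts in $|0\rangle$, passes through H, $U_f$ and H and is measured, giving $|f(0) \oplus f(1)\rangle$ ($|0\rangle$: f is constant; $|1\rangle$: f is balanced); the second qubit starts and ends in $|1\rangle$.]
Figure 3.1: Circuits for randomized and deterministic solution of Deutsch’s problem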
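$$= \tfrac{1}{2}\Big(\sum_{x=0}^{1} (-1)^{f(x)}|x\rangle\Big)(|0\rangle - |1\rangle)$$
$$= \tfrac{1}{2}(-1)^{f(0)}\big(|0\rangle + (-1)^{f(0) \oplus f(1)}|1\rangle\big)(|0\rangle - |1\rangle). \qquad (3.5)$$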
From the right side in (3.4), the two possibilities for f to be constant lead to the left sides
in (3.6) and (3.7) and the two possibilities for f to be balanced lead to the left sides in (3.8) and
(3.9):
$$\tfrac{1}{2}(|0\rangle + |1\rangle)(|0\rangle - |1\rangle) = |0'\rangle|1'\rangle \quad \text{if } f(0) = 0; \qquad (3.6)$$
$$\tfrac{1}{2}(|0\rangle + |1\rangle)(|1\rangle - |0\rangle) = -|0'\rangle|1'\rangle \quad \text{if } f(0) = 1; \qquad (3.7)$$
$$\tfrac{1}{2}(|0\rangle - |1\rangle)(|0\rangle - |1\rangle) = |1'\rangle|1'\rangle \quad \text{if } f(0) = 0; \qquad (3.8)$$
$$\tfrac{1}{2}(|0\rangle - |1\rangle)(|1\rangle - |0\rangle) = -|1'\rangle|1'\rangle \quad \text{if } f(0) = 1. \qquad (3.9)$$
By measuring the first qubit, with respect to the dual basis, we can immediately see whether f
is constant or balanced.
Another way, and a more straightforward one, to come to the same outcome is to transform,
at the right side in (3.5), the states of both qubits to the dual basis with the outcome
$$(-1)^{f(0)}\,|(f(0) \oplus f(1))'\rangle\,|1'\rangle.$$
The circuit for this algorithm is in Figure 3.1b. It is now easy to see how we can simplify
the algorithm and the corresponding circuit. Indeed, since the final measurement is on the
first qubit only we can omit the second Hadamard rotation on the second qubit. In addition,
we can also omit the first Hadamard rotation on the second qubit, if its initial state is
$\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. The resulting circuit is in Figure 3.2a, where a special notation is used for
the f-controlled NOT.
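The deterministic algorithm for Deutsch’s problem can be checked with a small state-vector simulation. This is a minimal sketch in plain Python, not the book’s own code: the helper names (`hadamard_on`, `apply_uf`, `deutsch`) and the representation of a two-qubit state as a list of four amplitudes (index 2x + y for $|x\rangle|y\rangle$) are assumptions made for illustration. The last Hadamard rotation on the second qubit is left out, since only the first qubit is measured.

```python
from math import sqrt

def hadamard_on(state, qubit):
    """Apply H to one qubit of a 2-qubit state; qubit 0 is the first (left) qubit."""
    s = 1 / sqrt(2)
    new = [0.0] * 4
    for index, amp in enumerate(state):
        bits = [(index >> 1) & 1, index & 1]
        for b in (0, 1):
            # H|u> = (|0> + (-1)^u |1>) / sqrt(2)
            sign = -1.0 if (bits[qubit] == 1 and b == 1) else 1.0
            target = list(bits)
            target[qubit] = b
            new[2 * target[0] + target[1]] += s * sign * amp
    return new

def apply_uf(state, f):
    """U_f |x, y> = |x, y XOR f(x)> for a function f: {0, 1} -> {0, 1}."""
    new = [0.0] * 4
    for index, amp in enumerate(state):
        x, y = (index >> 1) & 1, index & 1
        new[2 * x + (y ^ f(x))] += amp
    return new

def deutsch(f):
    """Decide 'constant' or 'balanced' with a single application of U_f."""
    state = [0.0, 1.0, 0.0, 0.0]                 # |0>|1>
    state = hadamard_on(hadamard_on(state, 0), 1)
    state = apply_uf(state, f)
    state = hadamard_on(state, 0)                 # first qubit back to the standard basis
    prob_first_is_1 = state[2] ** 2 + state[3] ** 2
    return "balanced" if prob_first_is_1 > 0.5 else "constant"

for name, f in [("0", lambda x: 0), ("1", lambda x: 1),
                ("x", lambda x: x), ("1-x", lambda x: 1 - x)]:
    print(name, deutsch(f))
```

In exact arithmetic the probability of finding the first qubit in $|1\rangle$ is 0 for constant f and 1 for balanced f; the threshold 0.5 only guards against rounding.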
In the second algorithm we have used two simple but powerful techniques which one often
encounters in the design of efficient algorithms and quantum error-correcting networks: a
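[Figure 3.2, circuit diagrams. Panel (a): gates H, f-controlled NOT, H on the first qubit. Panel (b): gates $H_n$, $U_f$, $H_n$ on the first register, which holds $|\psi\rangle$. In both panels the auxiliary qubit enters and leaves in the state $|0\rangle - |1\rangle$.]
Figure 3.2: Circuit for Deutsch’s problem and the “Hadamard-twice scheme”. The state
$|0\rangle - |1\rangle$ should be normalized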
change between the standard and the dual basis (for some qubits), and the computation
scheme, called Hadamard twice, depicted in the general form in Figure 3.2b, which
again uses the f-controlled NOT. On closer examination one sees that the key point of the
“Hadamard twice” scheme is again the change of the basis from standard to dual, some
natural computations, and again the change of the basis back.
Exercise 3.1.5 Given a unitary operation $U_y$ which maps any state $|\psi\rangle$ into $(-1)^y|\psi\rangle$,
for a fixed y, design a network using a conditional $U_y$-gate and two Hadamard gates to
determine y.
Deutsch’s problem was the first one for which a separation was found between what
classical and quantum computers can do. In this case the better performance of the quantum
algorithm is due to the fact that a quantum algorithm can act in one step on a superposition
of the states $|0\rangle$ and $|1\rangle$ and in this way it can extract global information about the function.
The second algorithm for Deutsch’s problem was the first quantum algorithm experimentally
implemented, using NMR technology (see page 312 for more detail).
Definition 3.1.7 A function $f : \{0, 1\}^n \to \{0, 1\}$ is balanced if none of the values of f
has majority and it is constant if there exist no $x, y \in \{0, 1\}^n$ such that $f(x) \ne f(y)$.
What has been achieved by these operations? The values of f were transferred to the
amplitudes, relative to each of the basis states. This can now be utilized, through the power
of quantum superposition and a proper observable, to solve the problem through a single
measurement as follows.
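Let us consider the observable $D = \{E_a, E_b\}$, where $E_a$ is the one-dimensional subspace
spanned by the vector
$$|\psi_a\rangle = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |i\rangle,$$
and $E_b = (E_a)^{\perp}$. The projection of $|\phi_1\rangle$ into $E_a$ and $E_b$ has the form
$$|\phi_1\rangle = \alpha|\psi_a\rangle + \beta|\psi_b\rangle,$$
where $|\psi_b\rangle$ is a vector in $E_b$ such that $|\psi_a\rangle \perp |\psi_b\rangle$. A measurement through D will provide
“the value a or b” with probability $|\alpha|^2$ or $|\beta|^2$, respectively.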
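It is easy to determine α, using the projection of $|\phi_1\rangle$ onto $E_a$ by the computation
$$\alpha = \langle\psi_a|\phi_1\rangle = \Big(\frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} \langle i|\Big)\Big(\frac{1}{\sqrt{2^n}} \sum_{j=0}^{2^n-1} (-1)^{f(j)}|j\rangle\Big)$$
$$= \frac{1}{2^n} \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} (-1)^{f(j)}\langle i|j\rangle = \frac{1}{2^n} \sum_{i=0}^{2^n-1} (-1)^{f(i)}.$$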
If f is balanced, then the sum for α contains the same number of 1s and −1s and therefore
α = 0. A measurement of $|\phi_1\rangle$ with respect to D therefore provides, for sure, the outcome b.
If f is constant, then either α = 1 or α = −1 and therefore the measurement of $|\phi_1\rangle$
with respect to D always gives the outcome a.
A single measurement of $|\phi_1\rangle$, with respect to D, therefore provides the solution of the
problem with probability 1.
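The value of α, and with it the probability $|\alpha|^2$ of the outcome a, can be computed for concrete functions. This is a minimal sketch under the assumption that f is given as a Python function on the integers 0, ..., $2^n - 1$; the names `alpha`, `constant_zero`, `constant_one` and `parity` are illustrative only.

```python
def alpha(f, n):
    """alpha = 2^(-n) * sum over all n-bit inputs i of (-1)^f(i)."""
    return sum((-1) ** f(i) for i in range(2 ** n)) / 2 ** n

n = 3
constant_zero = lambda i: 0
constant_one = lambda i: 1
parity = lambda i: bin(i).count("1") % 2   # balanced: half of the inputs have odd parity

for name, f in [("constant 0", constant_zero), ("constant 1", constant_one), ("parity", parity)]:
    a = alpha(f, n)
    # a ** 2 is the probability of the outcome "a" when measuring with respect to D.
    print(name, a, a ** 2)
```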
The Deutsch-Jozsa problem was the first one that was found to need only linear time on
a quantum computer but exponential time on a deterministic Turing machine.
Exercise 3.1.9 Show that the Deutsch–Jozsa problem can be solved by first applying the
Hadamard transformation to the state $|\phi_1\rangle$ and then checking whether all resulting qubits
are $|0\rangle$.
The quantum algorithm presented above solves the Deutsch–Jozsa problem exactly in
polynomial time. As shown above, the problem cannot be solved in polynomial time on a
deterministic computer. However, it can be solved in polynomial time on a PTM.
Exercise 3.1.11 Show that the Deutsch–Jozsa problem can be solved on a PTM in polynomial
time provided an arbitrarily small one-sided error is allowed.
There are several variations of the Deutsch–Jozsa problem that can be solved with a
small modification of the above techniques.
Exercise 3.1.12 (Cleve et al. 1998) Given a function $f : \{0, 1\}^n \to \{0, 1\}^m$, m ≤ n,
that is promised to have the property that the parity of the elements in the range of f is
either constant or equally balanced. Show that there is a quantum algorithm to determine
which of these two properties f has. (Hint: choose an auxiliary register of m qubits all
in the initial state $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$.)
Example 3.1.13 (Simon’s XOR Problem) Let $f : \{0, 1\}^n \to \{0, 1\}^n$ be a function such
that either f is one-to-one or f is two-to-one and there exists a single non-zero $s \in \{0, 1\}^n$
such that
$$\forall x \ne x' \ (f(x) = f(x') \Leftrightarrow x' = x \oplus s).$$
The task is to determine which of the above conditions holds for f and, in the second
case, to determine also s.
To solve the problem two registers are used, both with n qubits and the initial states $|0^{(n)}\rangle$,
and (expected) O(n) repetitions of the following version of the Hadamard-twice scheme:
1. Apply the Hadamard transformation on the first register, with the initial value $|0^{(n)}\rangle$,
to produce the superposition $\frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x, 0^{(n)}\rangle$.
2. Apply $U_f$ to compute $|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x, f(x)\rangle$.
3. Apply Hadamard transformation on the first register to get
$$\frac{1}{2^n} \sum_{x, y \in \{0,1\}^n} (-1)^{x \cdot y}|y, f(x)\rangle.$$
Proof. Let $a \not\equiv \pm 1 \pmod{n}$ be such that $a^2 \equiv 1 \pmod{n}$. Since $a^2 - 1 = (a + 1)(a - 1)$,
if n is not prime, then a prime factor of n has to be a prime factor of either $a + 1$ or $a - 1$.
By applying Euclid’s algorithm to $(n, a + 1)$ and $(n, a - 1)$ we can find, in $O(\log n)$ steps, a
nontrivial factor of n.
The second concept to be used in the following is that of the period of the function
$f_{n,x}(k) = x^k \bmod n$. It is the smallest positive integer r such that $f_{n,x}(k + r) = f_{n,x}(k)$ for any
k; i.e., the smallest positive r such that $x^r \equiv 1 \pmod{n}$. Such an r is also called the order of x,
in short ord(x), in $Z_n^*$.3 The problem of finding the period of a function is more technically
named the order problem.
Algorithm 3.2.3
If this algorithm stops, then $y^{r/2}$ is a nontrivial solution of the equation $x^2 \equiv 1 \pmod{n}$.
Exercise 3.2.4 Show the following result, which indicates why it is a good idea to exclude
powers of primes in Algorithm 3.2.3. Let $n = p^e$, where p is an odd prime and e > 1,
let y be an integer with gcd(y, p) = 1 and let r be the order of y. Then either r is odd or
$y^{r/2} \equiv \pm 1 \pmod{n}$.
Exercise 3.2.5 Show the following result which implies that the exclusion of powers of
primes in the factorization Algorithm 3.2.3 is not an essential restriction: powers of
primes can be factorized in polynomial time.
Lemma 3.2.6 If a y such that 1 < y < n and gcd(n, y) = 1 is selected randomly and an
odd n is not a power of a prime, then $Pr\{r \text{ is even and } y^{r/2} \not\equiv \pm 1 \pmod{n}\} \ge \frac{1}{4}$.
Proof. Let a prime factorization of n be $n = \prod_{i=1}^{k} p_i^{e_i}$. By the Chinese remainder
theorem, the groups $Z_n^*$ and $Z_{p_1^{e_1}}^* \times \ldots \times Z_{p_k^{e_k}}^*$ are isomorphic by the following mapping
$$a \bmod n \leftrightarrow (a \bmod p_1^{e_1}, \ldots, a \bmod p_k^{e_k}).$$
In the rest of the proof we consider the following unique decompositions $\varphi(n) = 2^l m$,
$\varphi(p_i^{e_i}) = 2^{l_i} m_i$,4 $1 \le i \le k$, where m and all $m_i$ are odd.
Since all groups $Z_{p_i^{e_i}}^*$ are cyclic (see, for example, Gruska, 1997, page 53), in each of them
a generator $g_i$ can be found and fixed. In such a case choosing randomly and independently
$x_i \in \{1, \ldots, 2^{l_i} m_i\}$ and considering $a = (g_1^{x_1}, \ldots, g_k^{x_k}) \in Z_{p_1^{e_1}}^* \times \ldots \times Z_{p_k^{e_k}}^*$ is a way to get a
random $a \in Z_n^*$. The claim of the Lemma now follows from the following two sublemmas:
Lemma 3.2.7 If $a \in Z_n^*$ is chosen randomly and n is odd, then $Pr\{r = \mathrm{ord}(a) \text{ is even}\} \ge \frac{1}{2}$.
Proof. The order r of a in $Z_n^*$ is the smallest positive integer such that all $\frac{r x_i}{2^{l_i} m_i}$
are integers.
Hence r is the least common multiple (LCM) of the set
$$\Big\{\frac{2^{l_1} m_1}{x_1}, \ldots, \frac{2^{l_k} m_k}{x_k}\Big\}.$$
Since n is odd all $p_i$ have to be odd primes and therefore all $\varphi(p_i^{e_i})$ are even and, naturally,
all $l_i > 0$. Thus if any of the $x_i$ is odd, then the LCM must be even. Since the $x_i$ are chosen
randomly, the probability of this is at least $1 - \frac{1}{2^k} \ge \frac{1}{2}$.
4 φ(n) is Euler’s totient function; φ(n) is the number of elements of the group $Z_n^*$.
Lemma 3.2.8 For a random $a \in Z_n^*$, r = ord(a), r even, $Pr\{a^{r/2} \not\equiv \pm 1 \pmod{n}\} \ge \frac{1}{2}$.
Proof. Fix an a and let r = ord(a). $a^{r/2}$ corresponds to $(g_1^{x_1 r/2}, \ldots, g_k^{x_k r/2})$. Since all $Z_{p_i^{e_i}}^*$
are cyclic, in each $Z_{p_i^{e_i}}^*$ there are only two square roots of 1, namely +1 and −1. This implies
that square roots of 1 in $Z_n^*$ are exactly those corresponding to k-tuples (±1, ..., ±1) with
the correspondence 1 ↔ (1, ..., 1), −1 ↔ (−1, ..., −1). For $1 \le i \le k$, $g_i^{x_i r/2}$ is −1 if $\frac{x_i r}{2}$
is not a multiple of $2^{l_i} m_i$. This happens if the highest power of 2 dividing $x_i r$ is at most $l_i$.
It is clear that not all $g_i^{x_i r/2}$ are 1 because otherwise r would not be the order of a. This
implies that in order to show the lemma, it is sufficient to bound the probability that all
$g_i^{x_i r/2}$ are −1.
The only way this can happen is that for all $1 \le i \le k$, the highest power of 2 dividing $r x_i$
is $l_i$. Suppose now that each $x_i$ is chosen randomly. Let t be the highest power of 2 dividing
$x_1$. In order that $g_1^{x_1 r/2}$ is −1, the highest power of 2 dividing r has to be $l_1 - t > 0$. The
probability of choosing $x_2$ such that the highest power of 2 dividing it is exactly $l_2 - (l_1 - t)$
(which implies $g_2^{x_2 r/2} = -1$) is less than or equal to $\frac{1}{2}$. This proves the lemma.
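The bound of Lemma 3.2.6 can be checked by brute force for small odd n that are not prime powers. This is a minimal sketch, not part of the proof: the names `order` and `good_fraction` are hypothetical, and the order is found by exhaustive search, which is feasible only for tiny n.

```python
from math import gcd

def order(y, n):
    """Multiplicative order of y modulo n by brute force (requires gcd(y, n) == 1)."""
    r, value = 1, y % n
    while value != 1:
        value = (value * y) % n
        r += 1
    return r

def good_fraction(n):
    """Fraction of y with 1 < y < n and gcd(y, n) = 1 for which
    r = ord(y) is even and y^(r/2) is neither 1 nor -1 modulo n."""
    candidates = [y for y in range(2, n) if gcd(y, n) == 1]
    good = 0
    for y in candidates:
        r = order(y, n)
        if r % 2 == 0 and pow(y, r // 2, n) not in (1, n - 1):
            good += 1
    return good / len(candidates)

for n in (15, 21, 35, 45):
    print(n, round(good_fraction(n), 3))
```

For n = 15 the only failing choice is y = 14, so the fraction is 6/7, comfortably above the guaranteed 1/4.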
Exercise 3.2.10 Show that if 1 < y < n is selected randomly, then the probability that
gcd(y, n) = 1 is: (a) greater than $\Omega(\frac{1}{\lg n})$ (Hint: use the Prime Number Theorem); (b)
greater than $\Omega(\frac{1}{\lg \lg n})$.
Exercise 3.2.11 Use Lemma 3.2.1 and Algorithm 3.2.3 to factorize: (a) 91; (b) 899;
(c) 5183.
Example 3.2.12 Let n = 15 and select 1 < y < 15 such that gcd(y, 15) = 1. The set of such
y is {2, 4, 7, 8, 11, 13, 14}. Let us choose y = 11. Values of $11^x \bmod 15$ form, for x = 1, 2, ...,
the sequence 11, 1, 11, 1, 11, 1, ... with the period r = 2. Hence $y^{r/2} = 11$ and we have to
compute gcd(15, 11 + 1) = 3 and gcd(15, 11 − 1) = 5 to get both factors of 15. Observe
also that the corresponding periods of elements 2, 4, 7, 8, 11, 13, 14 are 4, 2, 4, 4, 2, 4, 2 and in
this case any choice of y with the exception of y = 14 leads to a desirable factorization. For
y = 14 we get r = 2, $14^{2/2} \equiv -1 \pmod{15}$ and the method fails.
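The reduction from factoring to order finding used in this example can be written out as a short classical routine. This is a minimal sketch: the function names are hypothetical, and the brute-force `order` merely stands in for the quantum order-finding step, so it is practical only for very small n.

```python
from math import gcd

def order(y, n):
    """Multiplicative order of y modulo n by brute force (requires gcd(y, n) == 1)."""
    r, value = 1, y % n
    while value != 1:
        value = (value * y) % n
        r += 1
    return r

def factor_with(y, n):
    """Try to split n using y; return a sorted pair of factors, or None if this y fails."""
    z = gcd(y, n)
    if z != 1:
        return tuple(sorted((z, n // z)))   # y already shares a factor with n
    r = order(y, n)
    if r % 2 == 1:
        return None                          # odd order: no square root available
    half = pow(y, r // 2, n)
    if half == n - 1:
        return None                          # y^(r/2) = -1 (mod n): the method fails
    return tuple(sorted((gcd(n, half + 1), gcd(n, half - 1))))

for y in (2, 4, 7, 8, 11, 13, 14):
    print(y, order(y, 15), factor_with(y, 15))
```

Running the loop reproduces the observation of the example: every admissible y except 14 yields the factors 3 and 5.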
Exercise 3.2.13 Analyse the case n = 21. Find all integers y such that gcd(21, y) = 1
and their orders.
The task now is to find out how to make use of quantum parallelism to compute the
period of the function $f_{n,x}(k)$ for $n = 2^d - 1$. Let $U_{f_{n,x}}$ be the unitary operator to realize
5 It has been shown (see, for example, Gruska, 1997, Section 1.8.1), that if n is not prime and has at least
2 different odd factors, then the equation $x^2 \equiv 1 \pmod{n}$ has at least four solutions.
3.2. SHOR’S ALGORITHMS 115
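the mapping $(k, 0) \to (k, f_{n,x}(k))$. An application of this operator to the state
$$|\psi\rangle = \frac{1}{\sqrt{2^d}} \sum_{k=0}^{2^d-1} |k, 0^{(d)}\rangle,$$
yields
$$U_{f_{n,x}}|\psi\rangle = \frac{1}{\sqrt{2^d}} \sum_{k=0}^{2^d-1} |k, f_{n,x}(k)\rangle = |\psi_1\rangle. \qquad (3.10)$$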
Observe that all possible values of $f_{n,x}$ are encoded in values of the second register of the
state $|\psi_1\rangle$. However, as already pointed out, in this context we are actually not interested
in particular values of the function $f_{n,x}$, only in its period. It is therefore of importance and
interest to locate the potentials and pitfalls of attempts to find the period from the state
$|\psi_1\rangle$. To see that let us consider again the case n = 15 and x = 7. In such a case (3.10) has
the form
$$\tfrac{1}{4}(|0\rangle|1\rangle + |1\rangle|7\rangle + |2\rangle|4\rangle + |3\rangle|13\rangle + |4\rangle|1\rangle + |5\rangle|7\rangle + \ldots + |14\rangle|4\rangle + |15\rangle|13\rangle).$$
If we measure at this point the second register, then we get as the outcome one of the
numbers 1, 4, 7 or 13, and the following table shows the corresponding post-measurement
states in the second column. The corresponding sequences of values of the first register are
periodic with period 4 but they have different offsets (pre-periods) listed in column 3 of the
table.
result   post-measurement state                                                              offset
1        $\frac{1}{2}(|0\rangle + |4\rangle + |8\rangle + |12\rangle)|1\rangle$              0
4        $\frac{1}{2}(|2\rangle + |6\rangle + |10\rangle + |14\rangle)|4\rangle$             2
7        $\frac{1}{2}(|1\rangle + |5\rangle + |9\rangle + |13\rangle)|7\rangle$              1
13       $\frac{1}{2}(|3\rangle + |7\rangle + |11\rangle + |15\rangle)|13\rangle$            3
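The rows of this table follow from grouping the arguments k by the value $x^k \bmod n$. A minimal sketch that regenerates them for n = 15, x = 7 and d = 4; the function name `post_measurement_table` is hypothetical.

```python
from collections import defaultdict

def post_measurement_table(n, x, d):
    """Group all k in [0, 2^d) by the value x^k mod n found in the second register."""
    groups = defaultdict(list)
    for k in range(2 ** d):
        groups[pow(x, k, n)].append(k)
    return dict(groups)

table = post_measurement_table(15, 7, 4)
for result in sorted(table):
    ks = table[result]
    # The first listed k is the offset; consecutive entries differ by the period.
    print(result, ks, "offset", ks[0], "period", ks[1] - ks[0])
```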
One natural way to obtain the period seems to be to repeat computation (3.10) many
times and each time to measure at first the second register and then the first one. If we get
for some value z of the second register the values y1 < y2 < y3 of the first register we know
that the period is at most gcd(y2 − y1 , y3 − y2 ).
Unfortunately, this method is not efficient enough. Due to the fact that preperiods may
be different we cannot compare values of the first register for different values of the second
register. In addition, on average the period r grows exponentially with d. Therefore an
exponential number of repetitions of computation 3.10 would be in general needed to get
the period this way.
Fortunately, there is a method of “massaging” the state (3.10) in such a way that from
the result the period can be obtained efficiently, without sampling the state. The key step is
to transform the pre-period into the phase in which it has no influence on the corresponding
probabilities. The key tool to use is the Quantum Fourier Transform discussed in the next
subsection.
$$QFT_q : |a\rangle \to \frac{1}{\sqrt{q}} \sum_{c=0}^{q-1} e^{2\pi i ac/q}|c\rangle \qquad (3.12)$$
Exercise 3.2.14 Demonstrate why the nature of QFT is different from that of DFT.
If applied to a quantum superposition, $QFT_q$ transforms the state $\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} f(a)|a\rangle$ as
follows
$$QFT_q : \sum_{a=0}^{q-1} f(a)|a\rangle \to \sum_{c=0}^{q-1} \bar{f}(c)|c\rangle,$$
and therefore the impact of QFT on $|0\rangle$ is the same as that of the Hadamard transformation, see
Section 2.1.
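The definition (3.12) can be applied directly to a vector of amplitudes. This is a minimal classical sketch for small q, assuming a state is stored as a Python list indexed by a; the names `qft` and `basis_state` are illustrative. It takes $O(q^2)$ arithmetic operations and only checks the definition, it says nothing about the quantum gate count.

```python
import cmath
from math import sqrt

def qft(amplitudes):
    """Apply QFT_q of (3.12) to a state given as a list of q amplitudes."""
    q = len(amplitudes)
    return [
        sum(amplitudes[a] * cmath.exp(2j * cmath.pi * a * c / q) for a in range(q)) / sqrt(q)
        for c in range(q)
    ]

def basis_state(a, q):
    """The basis state |a> as a list of q amplitudes."""
    return [1.0 if i == a else 0.0 for i in range(q)]

# On |0> the result is the uniform superposition, as for the Hadamard transformation.
print([round(abs(x), 4) for x in qft(basis_state(0, 8))])
# For q = 2 the transform coincides with the Hadamard gate: |1> -> (|0> - |1>)/sqrt(2).
print([round(x.real, 4) for x in qft(basis_state(1, 2))])
```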
Most of the known important/interesting quantum algorithms use QFT either in its full
strength or its special case—the Hadamard transformation. Because of that the question of
how efficiently one can compute QFT on quantum computers is of key importance.
Exercise 3.2.15 (Mosca, 1998b) Let $x, y \in [0, 2^n - 1]$ and let $W_y : |x\rangle|\psi\rangle \to \xi_n^{x \cdot y}|x\rangle|\psi\rangle$,
$\xi_n = e^{2\pi i/2^n}$, be a unitary transformation. Design a network to determine y that uses two
QFT gates.
$QFT_q$ is usually used with the base $q = 2^n$. In such a case the classical Fourier Transform
algorithm requires time $O(2^{2n})$. The classical Fast Fourier Transform algorithm requires
only time $O(n 2^n)$, a very significant saving. With a quantum implementation time can be
reduced to $O(n^2)$ for some n. The fact that the QFT can be performed in polynomial time
is of key importance for polynomial running time of quantum algorithms using it.
We first prove a partial result, namely that if
$$a \leftrightarrow (a_1, a_2, \ldots, a_k),$$
where $a_i = a \bmod p_i^{e_i}$ is one-to-one (and therefore it can be performed by a unitary
transformation).
For any $0 \le c < q$ let $c_i = c \bmod p_i^{e_i}$. Then $ac \equiv a_i c_i \pmod{p_i^{e_i}}$ and, therefore, by the
Chinese remainder theorem,
$$ac \equiv \sum_{i=1}^{k} a_i c_i r_i \prod_{j \ne i} p_j^{e_j} \pmod{q}, \qquad (3.14)$$
where
$$r_i = \Big(\prod_{j \ne i} p_j^{e_j}\Big)^{-1} \bmod p_i^{e_i} \qquad (3.15)$$
and all $r_i$ can be computed easily using the extended Euclid’s algorithm.
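Relations (3.14) and (3.15) can be verified exhaustively for a small modulus. This is a minimal sketch under the assumption that the prime powers $p_i^{e_i}$ are passed as a list of pairwise coprime integers; the function names are hypothetical, and Python’s built-in `pow(b, -1, m)` (available from Python 3.8) is used in place of a hand-written extended Euclid’s algorithm.

```python
from math import prod

def crt_weights(moduli):
    """For pairwise coprime moduli m_i return the pairs (M_i, r_i), where
    M_i is the product of the other moduli and r_i = M_i^(-1) mod m_i, as in (3.15)."""
    q = prod(moduli)
    weights = []
    for m in moduli:
        big = q // m
        weights.append((big, pow(big, -1, m)))
    return weights

def check_identity(moduli):
    """Check (3.14) for every pair a, c in [0, q)."""
    q = prod(moduli)
    weights = crt_weights(moduli)
    for a in range(q):
        for c in range(q):
            rhs = sum((a % m) * (c % m) * r * big
                      for m, (big, r) in zip(moduli, weights)) % q
            if rhs != (a * c) % q:
                return False
    return True

print(crt_weights([4, 3, 5]))
print(check_identity([4, 3, 5]))
```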
The mapping
$$|a_1, \ldots, a_k\rangle \to |a_1 r_1, \ldots, a_k r_k\rangle$$
is also one-to-one because each $r_i$ is invertible modulo $p_i^{e_i}$ and therefore, if we apply $QFT_{p_i^{e_i}}$
on the ith register of $|a_1 r_1, \ldots, a_k r_k\rangle$, then from (3.11) the following cumulative result
$$\frac{1}{\sqrt{p_1^{e_1} \cdots p_k^{e_k}}} \sum_{c_1=0,\ldots,c_k=0}^{p_1^{e_1}-1,\ldots,p_k^{e_k}-1} \exp\Big(2\pi i\Big(\frac{a_1 r_1 c_1}{p_1^{e_1}} + \ldots + \frac{a_k r_k c_k}{p_k^{e_k}}\Big)\Big)|c_1, \ldots, c_k\rangle$$
follows. This expression can be simplified using the relations (3.14) and (3.15) to the form
$$\frac{1}{\sqrt{q}} \sum_{c=0}^{q-1} e^{2\pi i ac/q}|c_1, \ldots, c_k\rangle$$
and by relabeling $|c_1, \ldots, c_k\rangle$ with $|c\rangle$ we get exactly the same expression as in (3.12).
A simple implementation of $QFT_q$ was discovered by Coppersmith (1994) and Deutsch
(see Ekert and Jozsa, 1996) for the case that $q = 2^n$. The circuit implementing $QFT_q$ uses
the Hadamard gate H and a conditional phase shift on the second qubit provided the first qubit
is in the state $|1\rangle$. The phase shift by $e^{2\pi i/2^j}$ is represented by the matrices $X_j$, $j = 0, \ldots, n - 1$,
where
$$X_j = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i/2^j} \end{pmatrix}.$$
Let us denote by $H_j$ the gate H operating on the jth qubit and by $S_{j,k}$ the conditional
$X_{k-j}$ gate operating on the jth and kth qubit, j < k.
The algorithm is based on the fact that if $q = 2^n$, $a = \sum_{i=0}^{n-1} a_i 2^{n-i-1}$, then $QFT_q|a\rangle$
is not entangled and it holds
$$QFT_q|a\rangle = \frac{1}{\sqrt{2^n}} (|0\rangle + e^{2\pi i 0.a_{n-1}}|1\rangle)(|0\rangle + e^{2\pi i 0.a_{n-2}a_{n-1}}|1\rangle) \ldots (|0\rangle + e^{2\pi i 0.a_0 \ldots a_{n-1}}|1\rangle)$$
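The product form can be compared numerically with definition (3.12). This is a minimal sketch with hypothetical function names; it assumes that the leftmost factor of the product belongs to the most significant bit of c, and it uses the fact that the binary fraction $0.a_{n-l-1} \ldots a_{n-1}$ equals $(a \bmod 2^{l+1}) / 2^{l+1}$.

```python
import cmath
from math import sqrt

def qft_amplitude(a, c, n):
    """Amplitude of |c> in QFT_q|a> for q = 2^n, taken from the definition (3.12)."""
    q = 2 ** n
    return cmath.exp(2j * cmath.pi * a * c / q) / sqrt(q)

def product_amplitude(a, c, n):
    """Amplitude of |c> obtained from the product form. Factor l (l = 0 is the
    leftmost one) contributes the phase 0.a_{n-l-1}...a_{n-1} when bit l of c is 1."""
    amp = 1 / sqrt(2 ** n)
    for l in range(n):
        c_bit = (c >> (n - 1 - l)) & 1
        fraction = (a % 2 ** (l + 1)) / 2 ** (l + 1)
        amp *= cmath.exp(2j * cmath.pi * fraction * c_bit)
    return amp

n = 3
worst = max(abs(qft_amplitude(a, c, n) - product_amplitude(a, c, n))
            for a in range(2 ** n) for c in range(2 ** n))
print(worst)   # largest deviation over all a and c; zero up to rounding error
```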
Exercise 3.2.16 Show that in the network in Figure 3.3 the output value of the qubit
with input $|a_j\rangle$ is $|0\rangle + e^{2\pi i 0.a_j \ldots a_{n-1}}|1\rangle$.
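[Figure 3.3, network diagram; recoverable labels: qubit $|a_2\rangle$ with gates H, $X_1$, ..., $X_{n-3}$; qubit $|a_{n-2}\rangle$ with gates H, $X_1$; qubit $|a_{n-1}\rangle$ with gate H.]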
The number of gates and, consequently, the computation time of this network is $\Theta(n^2)$.
For details see Coppersmith (1994) and Cleve et al. (1998). These results were generalized by
Kitaev (1997), who showed how to design a polynomial time approximate quantum algorithm
for Fourier transform on any finite Abelian group presented as a product of cyclic groups.
Beals (1997) showed how to compute quantum Fourier transform over symmetric groups.
For an analysis of the role Fourier transform has in design of quantum algorithms, and for
the general construction of the Fourier transform on Abelian groups see Jozsa (1997a).
Exercise 3.2.17 Make a formal proof that the above quantum network computes QFT.
Exercise 3.2.18 Show that if $B = \{\beta_i\}_{i=1}^{n}$ is a basis, then $G = \{\gamma_i\}_{i=1}^{n}$ is also a basis
if $\gamma_k = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \beta_j e^{2\pi i jk/n}$ and these two bases are mutually unbiased. (Quantum
measurements corresponding to two mutually unbiased bases (see page 367), are called
complementary.)
Quantum Fourier transform has been so far the key tool in designing efficient quantum
algorithms. It is therefore of importance to analyze the performance of the QFT in the
presence of decoherence. Barenco et al. (1996) have shown that so-called approximate QFT
can provide better results concerning the period estimation than (exact) QFT.
The main problem with using QFT to extract the period, as we shall see, is that it works
only approximately in general and a special effort is needed to derive from the approximation
the exact period.
Remark 3.2.19 If we can factorize an integer n we can break any RSA cryptosystem with
the public key n, e. In order to do factorization, as we can see from the flow diagram in
Figure 3.4, we need in general to do order computing several times. However, to break
RSA we actually do not need to factorize n. There is a simple method of breaking RSA, as
pointed out by Ekert (1997) and Cleve et al. (1998), for which it is sufficient to compute the
order of the cryptotext, and only once.
Indeed, given a cryptotext $c = w^e \bmod n$ for an integer plaintext w we have, since e is
relatively prime to $\varphi(n)$, order(c) = order(w). Let now d be such that $ed \equiv 1 \pmod{\mathrm{order}(c)}$,
i.e. $ed = k \cdot \mathrm{order}(w) + 1$ for some k. In such a case $c^d \equiv w^{ed} = w^{\mathrm{order}(w)k+1} \equiv w \pmod{n}$
and in this way we can get the plaintext from the cryptotext.
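The argument of this remark can be traced on a toy instance. This is a minimal sketch: the modulus 3233 = 61 · 53 and the exponent 17 are illustrative values far too small for real use, the function names are hypothetical, and the brute-force `order` stands in for the quantum order computation. It assumes the plaintext is relatively prime to n.

```python
def order(c, n):
    """Multiplicative order of c modulo n by brute force (requires gcd(c, n) == 1)."""
    r, value = 1, c % n
    while value != 1:
        value = (value * c) % n
        r += 1
    return r

def decrypt_via_order(c, e, n):
    """Recover w from c = w^e mod n using only the order of c; n is never factorized."""
    r = order(c, n)
    d = pow(e, -1, r)      # e * d = 1 (mod order(c)); Python 3.8+ modular inverse
    return pow(c, d, n)

n, e = 3233, 17            # toy public key (assumption for illustration)
w = 65
c = pow(w, e, n)
print(decrypt_via_order(c, e, n))   # recovers the plaintext 65
```

The exponent d computed here generally differs from the secret key exponent, since it is an inverse of e modulo order(c) rather than modulo φ(n); it decrypts this particular cryptotext only.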
6 To design an RSA cryptosystem two large primes p, q (512–1024 bits) are first chosen and n = pq, φ(n) =
(p − 1)(q − 1) are computed. d is then chosen such that gcd(d, φ(n)) = 1 and e is computed such that
ed ≡ 1 (mod φ(n)). n and e form the public key; p, q, d form the secret key. Encoding of a plaintext w:
$c = w^e \bmod n$; decoding of the cryptotext c: $w = c^d \bmod n$. Encryption seems to be secure provided it is not
feasible to get p and q from n, though it is not known if breaking RSA is as hard as integer factorization.
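[Figure 3.4, flow diagram; recoverable boxes, in order: choose randomly a ∈ {2, ..., n − 1}; compute z = gcd(a, n); test “z = 1?” (yes/no); test “r is even?” (yes/no); compute $z = \max\{\gcd(n, a^{r/2} - 1), \gcd(n, a^{r/2} + 1)\}$; test “z = 1?” (yes/no); output z.]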
As the next step we perform a measurement on the last register. Let y be the value
obtained, i.e. $y = x^l \bmod n$ for the smallest l with this property. If r is the period of
$f_{n,x}$, then $x^l \equiv x^{jr+l} \pmod{n}$ for all j. Therefore, the measurement actually selects the
following sequence of a’s values (in the fourth register), $l, l + r, l + 2r, \ldots, l + Ar$, where A is
the largest integer such that $l + Ar \le q$, and $l \le r$ has been chosen essentially randomly by
the measurement. Since $l \le r < n$ and $q = \Theta(n^2)$, we get $A \approx \frac{q}{r}$. The post-measurement
state is then
$$|\phi_l\rangle = \frac{1}{\sqrt{A + 1}} \sum_{j=0}^{A} |n, x, q, jr + l, y\rangle. \qquad (3.16)$$
Since n, x, q and y will be fixed from now on, we will no longer write them down explicitly
and therefore the previous state can be considered as having the form
$$|\phi_l\rangle = \frac{1}{\sqrt{A + 1}} \sum_{j=0}^{A} |jr + l\rangle. \qquad (3.17)$$
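Phase II: amplitude amplification by QFT. From now on we consider in detail only
a special case, namely that $A = \frac{q}{r} - 1$. In such a case the state (3.17) can be written in the
form
$$|\phi_l\rangle = \sqrt{\frac{r}{q}} \sum_{j=0}^{\frac{q}{r}-1} |jr + l\rangle$$
and an application of $QFT_q$ gives
$$QFT_q|\phi_l\rangle = \frac{1}{\sqrt{q}} \sum_{c=0}^{q-1} \sqrt{\frac{r}{q}} \sum_{j=0}^{\frac{q}{r}-1} e^{2\pi i c(jr+l)/q}|c\rangle \qquad (3.18)$$
$$= \frac{\sqrt{r}}{q} \sum_{c=0}^{q-1} e^{2\pi i lc/q} \Big(\sum_{j=0}^{\frac{q}{r}-1} e^{2\pi i jcr/q}\Big)|c\rangle = \sum_{c=0}^{q-1} \alpha_c|c\rangle. \qquad (3.19)$$
If c is a multiple of $\frac{q}{r}$, then every term $e^{2\pi i jcr/q}$ equals 1; otherwise the inner sum is 0,
because the above sum is over a set of $\frac{q}{r}$ roots of unity equally spaced around the unit circle.
Thus
$$\alpha_c = \begin{cases} \frac{1}{\sqrt{r}} e^{2\pi i lc/q}, & \text{if } c \text{ is a multiple of } \frac{q}{r}; \\ 0, & \text{otherwise;} \end{cases}$$
and therefore, with $c = j\frac{q}{r}$,
$$|\phi_{out}\rangle = QFT_q|\phi_l\rangle = \frac{1}{\sqrt{r}} \sum_{j=0}^{r-1} e^{2\pi i lc/q}\,|j\tfrac{q}{r}\rangle.$$
The key point is that the trouble-making offset l appears now in the phase factor $e^{2\pi i lc/q}$
and has no influence either on the probabilities or on the values in the register.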
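The claim that the offset l does not influence the measurement probabilities can be observed on a small instance of the special case in which r divides q. This is a minimal classical sketch with hypothetical names (`qft`, `periodic_state`); the values q = 16 and r = 4 are chosen only for illustration.

```python
import cmath
from math import sqrt

def qft(amplitudes):
    """Apply QFT_q, as defined in (3.12), to a list of q amplitudes."""
    q = len(amplitudes)
    return [
        sum(amplitudes[a] * cmath.exp(2j * cmath.pi * a * c / q) for a in range(q)) / sqrt(q)
        for c in range(q)
    ]

def periodic_state(q, r, l):
    """sqrt(r/q) * sum over j of |j*r + l>, assuming r divides q and 0 <= l < r."""
    state = [0.0] * q
    for j in range(q // r):
        state[j * r + l] = sqrt(r / q)
    return state

q, r = 16, 4
for l in range(r):
    probs = [abs(x) ** 2 for x in qft(periodic_state(q, r, l))]
    # Only multiples of q/r carry probability, and the row is the same for every offset l.
    print(l, [round(p, 3) for p in probs])
```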
Phase III: period extraction. Each measurement of the state $|\phi_{out}\rangle$ therefore yields
one of the multiples $c = \lambda\frac{q}{r}$, $\lambda \in \{0, 1, \ldots, r - 1\}$, where each λ is chosen with the same
probability $\frac{1}{r}$. Observe also that in this case the QFT transforms a function with the period
r (and an offset l) to a function with the period $\frac{q}{r}$. After each measurement we therefore
know c and q and
$$\frac{c}{q} = \frac{\lambda}{r},$$
where λ is randomly chosen. If gcd(λ, r) = 1, then from q we can determine r by dividing q
by gcd(c, q). Since λ is chosen randomly, the probability that gcd(λ, r) = 1 is greater than
$\Omega(\frac{1}{\lg \lg r})$. If the above computation is repeated O(lg lg r) times, then the success probability
can be as close to 1 as desired and therefore r can be determined efficiently.8
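The division of q by gcd(c, q) amounts to reducing the fraction c/q to lowest terms and reading off its denominator. A minimal sketch for the special case in which r divides q; the function name `period_candidate` is hypothetical.

```python
from fractions import Fraction
from math import gcd

def period_candidate(c, q):
    """Denominator of c/q in lowest terms, i.e. q // gcd(c, q)."""
    return Fraction(c, q).denominator

q, r = 16, 4
for lam in range(r):
    c = lam * q // r
    # The candidate equals r exactly when gcd(lam, r) == 1.
    print(lam, c, period_candidate(c, q), gcd(lam, r) == 1)
```

For λ = 2 the candidate is 2, a proper divisor of r, which is why several repetitions are needed.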
In the general case, i.e., if $A \ne \frac{q}{r} - 1$, there is only a more sophisticated computation of the
resulting probabilities and a more sophisticated way to determine r (using a continued fraction
method to extract the period from its approximation). No new “quantum-computing”
ideas are involved. For details see Shor (1997).
[Figure 3.5, panels (a) to (e); recoverable labels: offset l, period r, period q/r.]
Figure 3.5: Representation of particular steps of Shor’s order-finding algorithm, adapted
from Bennett (1998b)
Perspectives of factorization
Hughes (1997) has analyzed the perspectives of factoring using, on one side, the currently
most powerful factorization technique (the Number Field Sieve method) on state-of-the-art
workstations (assuming that the power of processors keeps increasing according to Moore’s law), and
on the other side a potential quantum computer with a minimal clock speed of 100 MHz.
Table 3.6 shows estimations of factoring times on networks of 1000 workstations. Table 3.7
provides estimations for the number of qubits, gates and factoring time for a (potential)
quantum computer.
The above analyses show that using 2048-bit numbers seems to be safe for the next 50
years against classical computers. However, this is not so even for 4096-bit numbers if sufficiently
powerful quantum computers become available.
$$|\phi'\rangle = \frac{1}{p - 1} \sum_{a=0}^{p-2} \sum_{b=0}^{p-2} |x, g, a, b, g^a x^{-b} \bmod p\rangle.$$
Since x, g will not be changed in the following computations we will not write them explicitly
any longer.
As the next step we apply $QFT_{p-1}$ on $|\phi'\rangle$ twice, once to map a → c with amplitude
$\frac{1}{\sqrt{p-1}} e^{2\pi i ac/(p-1)}$ and once to map b → d with amplitude $\frac{1}{\sqrt{p-1}} e^{2\pi i bd/(p-1)}$. Since p − 1 is
smooth, this can be done in polynomial time. The resulting state is
$$|\phi_1\rangle = \frac{1}{(p-1)^2} \sum_{a,b,c,d=0}^{p-2} e^{\frac{2\pi i}{p-1}(ac+bd)} |c, d, g^a x^{-b} \bmod p\rangle.$$
Let us now measure the last register and let us determine the probability that what we
get is the state $y \equiv g^k \bmod p$ for some k.
The probability equals the square of the absolute value of the sum of all amplitudes of all
states having y in the last register, i.e., the sum of amplitudes over all a and b satisfying the
equality $a - rb \equiv k \pmod{p - 1}$ for some k. This is due to the fact that the computational
paths interfere only if $y \equiv g^a (g^r)^{-b} = g^{a-rb} \equiv g^k \bmod p$.9 (Indeed, $g^{a-rb} = g^{j(p-1)+k} \equiv g^k \pmod{p}$.)
The probability is therefore
$$\Big|\frac{1}{(p-1)^2} \sum_{a,b,c,d=0}^{p-2} \{e^{\frac{2\pi i}{p-1}(ac+bd)} \mid a - rb = k\}\Big|^2.$$
Substituting $a = k + rb$ gives
$$\Big|\frac{1}{(p-1)^2} \sum_{b,c,d=0}^{p-2} e^{\frac{2\pi i}{p-1}(kc+b(d+rc))}\Big|^2.$$
The measurements on the first and second register provide a (random) $c < p - 1$ such
that $d \equiv -rc \pmod{p - 1}$, since the sum over b of the terms with exponent $b(d + rc)$ vanishes
unless $d + rc \equiv 0 \pmod{p - 1}$. If gcd(c, p − 1) = 1, r can be obtained by division. As already
mentioned, the probability that gcd(c, p − 1) = 1 is $\Omega(\frac{1}{\lg p})$. Therefore, the number of
computations needed to perform in order to get the probability close to 1 for finding r is
polynomial in lg p.
An important outcome in this direction has been an observation (for example, see Høyer,
1997), that all currently known quantum algorithms which run superpolynomially faster than
their most efficient probabilistic classical counterparts solve a hidden subgroup problem.
The first results along these lines were due to Simon (1994), Shor (1994) and Kitaev
(1995), who discovered a bounded-error polynomial time algorithm for the so-called Abelian
subgroup stabilizer problem to which both integer factorization and discrete logarithm prob-
lem can be reduced. This problem is also a special case of the following problem:
9 It follows from Fermat’s theorem that if p is a prime and $a \equiv b \pmod{p - 1}$, then $g^a \equiv g^b \pmod{p}$,
because $g^{p-1} \equiv 1 \pmod{p}$.
Abelian group stabilizer problem. Let G be a group of elements acting on a set R; that
is if a ∈ G, then a : R → R and if a, b ∈ G, then a(b(x)) = (ab)(x) for each x ∈ R. For
x ∈ R let Stx = {a | a ∈ G, a(x) = x}. Stx is a subgroup—the so-called stabilizer for
x. For each x ∈ R let fx : G → R be such that fx (a) = a(x). The hidden subgroup
corresponding to fx is G0 = Stx .
It is still not known whether the hidden subgroup problem has a bounded-error polynomial
time algorithm also in the general case of non-abelian groups. This problem is of
interest for various reasons. One of them is that the graph isomorphism problem is of
such a type. (The graph isomorphism problem is reducible to finding a hidden subgroup of the
symmetric group $S_n$.)11
Polynomial time algorithms for the hidden subgroup problem for certain types of non-
abelian groups have been designed by Rötteler and Beth (1998) and Ettinger and Høyer
(1998, 1999). In the first paper the problem is solved for certain semi-direct (namely wreath)
group products; in the second paper for the so-called dihedral groups but the algorithm is
10 A way to solve the problem is to show that in polynomial number of oracle calls (or time) the states
corresponding to different candidate subgroups have exponentially small inner product and are therefore
distinguishable.
11 Indeed, let G be the disjoint union graph of connected graphs $G_1$ and $G_2$. The automorphism group H
of G is a subgroup of the group $S_n \wr S_2$ (where $S_i$ is the symmetric group of the “order” i and ≀ stands for
the wreath product of groups). Knowledge of a set of generators for H is sufficient to decide isomorphism
of G1 and G2 . Ettinger and Høyer (1999) defined an observable on l2 ((Sn ≀ S2 )m ), for any m, through a
projection P such that if |ψi is the tensor product of the coset states of H, then if G1 and G2 are (are not)
isomorphic, then hψ|P |ψi = 1 ( ≥ 1 − 2n! m ). It remains, as an open problem, to determine whether this
observable is efficiently implementable.
In this connection of importance seems to be to determine for which non-abelian groups there are efficient
QFT algorithms. For permutation non-commutative groups the existence of such an efficient algorithm was
shown by Beals (1997).
3.3. QUANTUM SEARCHING AND COUNTING 127
polynomial only with respect to the number of quantum oracle calls; the classical postprocessing
requires exponential time. In both cases the key subresult is an efficient implementation
of the Fourier transform for some non-abelian groups (for that see also Püschel et al.
1998). Ettinger et al. (1999) showed that the hidden subgroup problem can be solved in
linear (O(lg |G|)) number of calls for any finite group G. However, their algorithm requires
again exponential time for classical postprocessing. They have actually shown that there is a
POV measurement that can distinguish among the possible states corresponding to different
subgroups. It remains an open problem whether there is such a POVM that does the
same, is efficiently implementable, and also allows efficient postprocessing.
There are two basic methods for solving the hidden subgroup problems. The first one
is presented in Section 3.2 and follows an already familiar scheme: a Fourier transform, a
function evaluation, again a Fourier transform and a sampling of the resulting superposition
distribution.12 ,13 ,14 The second approach, introduced by Kitaev (1995), is based on an
estimation of eigenvalues of certain unitary operators. (For a detailed exposition of Kitaev’s
algorithm see Aharonov (1998).) These two approaches have been shown equivalent (see
Mosca and Ekert, 1998). Shor’s and Kitaev’s algorithms are bounded-error polynomial
time algorithms. An important question is for which hidden subgroup problems there exist
also exact polynomial time algorithms. So far only one partial result is known. Brassard and
Høyer (1997) have shown the existence of such an algorithm for a generalization of Simon’s
problem. In this way (see also Brassard and Høyer, 1996) they established an exponential
gap between the power of exact quantum computation and that of classical bounded-error
randomized computation for decision problems.
The hidden subgroup problem for finitely generated Abelian subgroups is dealt with in
detail by Mosca and Ekert (1998).
probability measure 0. However, it is far from clear whether this negative result has a significant implication
concerning algorithmic problems one is really interested in.
with algorithms whose power is in a clever amplitude magnification in such a way that the
desirable outcome has far the largest probability to come up at the measurement.
3. Apply the inversion about average operator $D_n = -H_n R_n^1 H_n$, page 72, to the state
received in the previous step;
4. Iterate steps 2 and 3 $\lceil \frac{\pi}{4}\sqrt{2^n} \rceil$ times, i.e., apply the transformation $G_n = -H_n R_n^1 H_n V_f$ (the
so-called Grover's iterate, or G-iteration).
5. Measure the x-register.
Figure 3.8: “Cooking” the solution with Grover's algorithm (panels (a)-(d), each marking x0)
Grover’s algorithm is very simple and it is easy to verify that it works. Less trivial is
to get a proper insight why it works. For that see a detailed analysis by Jozsa (1999). For
example, as also discussed later, instead of the transformation Hn , essentially any unitary
operator can be used.
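The iteration can be followed numerically on a small state vector. The following is a minimal sketch, assuming that steps 1 and 2 (not shown above) prepare the uniform superposition and apply the sign-flip oracle $V_f$ to a single marked item; the function name and the choice n = 10 are illustrative only.

```python
import math

def grover_success_probability(n_qubits, marked, iterations):
    """Simulate Grover's iterate on a real state vector of length 2**n_qubits."""
    size = 2 ** n_qubits
    # Assumed step 1: uniform superposition (the effect of H_n on |0...0>).
    state = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        # Assumed step 2 (V_f): flip the sign of the marked amplitude.
        state[marked] = -state[marked]
        # Step 3 (D_n): inversion about the average, a -> 2*mean - a.
        mean = sum(state) / size
        state = [2.0 * mean - a for a in state]
    # Step 5: probability that a measurement yields the marked item.
    return state[marked] ** 2

n = 10
j = math.ceil(math.pi / 4 * math.sqrt(2 ** n))  # iteration count from step 4
p = grover_success_probability(n, marked=5, iterations=j)
print(j, p)
```

Running the loop for fewer or more iterations than step 4 prescribes lowers the returned probability, which is the "oven" effect the figure caption alludes to.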
Unless some structure of the problem is given, any classical algorithm for
the unsorted database search problem has to check elements one by one until the one
with the desired property is found, and for that $N/2$ checks are needed on average. Grover's
quantum algorithm presented above can do that in $O(\sqrt{N})$ steps. 18
An analysis by Bøyer et al. (1996) showed that after $\frac{\pi}{8}\sqrt{2^n}$ iterations the failure rate is
$\frac{1}{2}$. Zalka (1997) has shown that Grover's algorithm is optimal for finding a solution with
probability at least $\frac{1}{2}$. Grover (1998a) gives a simple proof that at least $0.707\sqrt{2^n}$ queries
are needed by any quantum search algorithm.19
Exercise 3.3.1 Show that Grover’s algorithm could, in principle, be used to break such
cryptosystems as DES.
17 Steane (1997) refers to K. Fuchs for the following view of Grover’s techniques: “It is like cooking a
soufflé. The state is placed in the “quantum oven” and the desired answer rises slowly. You must open
the oven at the right time, neither too soon nor too late, to guarantee success. Otherwise the soufflé will
fall—the state collapses to the wrong answer.”
18 Grover’s result does not put the unsorted database search problem into another complexity class. In spite
of that it is remarkable that for such a surprisingly simple problem such an improvement can be obtained.
In addition, such an algorithm could also be of large importance for cryptanalysis—to find a plaintext to a
given cryptotext.
19 Under special conditions (for example, if queries about multiple items are allowed or highly structured
search problems are considered), see Grover (1997), Terhal and Smolin (1997) and Hogg (1998) a single
database query is sufficient.
Remark 3.3.2 Experimental realization, using NMR technology, of the simplest interesting
case of n = 4 and f : {0, 1, 2, 3} → {−1, 1} was reported by Chuang et al. (1998). In the
case x0 has the same probability to be 0, 1, 2 or 3, the average number of classical queries is
2.25 and Grover’s algorithm reduces to a single query.
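The two numbers in the remark can be checked by hand. The sketch below assumes that the classical strategy queries the four items one by one and needs no query for the last item once three misses have been seen; the variable names are illustrative.

```python
from fractions import Fraction

# Classical search among 4 items with the marked item x0 uniformly placed:
# positions 1, 2, 3 cost that many queries; position 4 is known after 3 misses.
queries_if_x0_at = [1, 2, 3, 3]
average_classical = Fraction(sum(queries_if_x0_at), 4)

# One Grover iteration on 4 items: sign flip of x0, then inversion about the average.
state = [0.5, 0.5, 0.5, 0.5]
x0 = 2
state[x0] = -state[x0]
mean = sum(state) / 4
state = [2 * mean - a for a in state]
print(average_classical, state)
```

After the single iteration all amplitude sits on x0, so one quantum query suffices in this case.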
Various generalizations and modifications of the original search problem soon began to be
investigated. First, the requirement that the solution be unique was lifted. Then
the problems of exact and approximate counting of potential solutions were
considered.
Let us now formulate, following Mosca (1998b), the corresponding decision, search,
counting and approximation problems and results formally. Let f : X → {0, 1} be a
function. Define X1 = {x | f (x) = 1}, X0 = {x | f (x) = 0} and x1 = |X1 |, x0 = |X0 |.
The decision problem associated with f is the problem of deciding whether |X1 | = 0.
The search problem, or the generation problem is the problem to find an x ∈ X1 . The
counting problem is to determine |X1 |. To approximately count X1 with accuracy ε
means to get an x such that
(1 − ε)x1 ≤ x ≤ (1 + ε)x1 . (3.20)
A randomized approximation scheme for f is a randomized algorithm that for any real
parameter ε > 0 outputs an x such that (3.20) holds with probability $\frac{2}{3}$.
Figure 3.9 summarizes the main complexity results for the above algorithmic problems
for classical and quantum computing, where N = |X| and t = |X1 |.
In addition, from the results of Example 3.1.2 it follows that all $2^n$ functions from
{1, 2, . . ., n} → {−1, 1} can be distinguished by k oracle calls with probability
$\left(1 + \binom{n}{1} + \dots + \binom{n}{k}\right)/2^n$.
Let X1 = {x | f(x) = 1}, X0 = {x | f(x) = 0}, and let, after the jth iteration of Step 2:
$$|\psi_j\rangle = k_j \sum_{x \in X_1} |x\rangle + l_j \sum_{x \in X_0} |x\rangle,$$
for suitable $l_j$, $k_j$ with $k_0 = \frac{1}{\sqrt{2^n}}$, $l_0 = \frac{1}{\sqrt{2^n}}$.
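In Step 2 the unitary operation, the inversion about the average, is performed that
corresponds to the matrix
$$-H_n \cdot R_n^1 \cdot H_n = \begin{pmatrix} -1 + \frac{2}{2^n} & \frac{2}{2^n} & \dots & \frac{2}{2^n} \\ \frac{2}{2^n} & -1 + \frac{2}{2^n} & \dots & \frac{2}{2^n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{2}{2^n} & \frac{2}{2^n} & \dots & -1 + \frac{2}{2^n} \end{pmatrix}.$$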
For $k_j$ and $l_j$ the following recursive relations have been derived:
$$k_{j+1} = \frac{2^n - 2t}{2^n} k_j + \frac{2(2^n - t)}{2^n} l_j, \qquad (3.22)$$
$$l_{j+1} = \frac{2^n - 2t}{2^n} l_j - \frac{2t}{2^n} k_j \qquad (3.23)$$
where $\sin^2\theta = \frac{t}{2^n}$.
Exercise 3.3.6 (a) Derive recurrences (3.22) and (3.23); (b) Show that kj and lj from
(3.24) and (3.25) satisfy recurrences (3.22) and (3.23).
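Part (b) of the exercise can be sanity-checked numerically before attempting a proof. The sketch below iterates the recurrences (3.22) and (3.23) and compares them with the closed form $k_j = \sin((2j+1)\theta)/\sqrt{t}$, $l_j = \cos((2j+1)\theta)/\sqrt{2^n - t}$; that this is what (3.24) and (3.25) state is an assumption here, since those equations are not reproduced in this excerpt. Function names are illustrative.

```python
import math

def iterate_recurrences(n_qubits, t, steps):
    """Return [(k_0, l_0), ..., (k_steps, l_steps)] from (3.22) and (3.23)."""
    size = 2 ** n_qubits
    k = l = 1.0 / math.sqrt(size)
    history = [(k, l)]
    for _ in range(steps):
        k, l = ((size - 2 * t) / size * k + 2 * (size - t) / size * l,
                (size - 2 * t) / size * l - 2 * t / size * k)
        history.append((k, l))
    return history

def closed_form(n_qubits, t, j):
    """Assumed closed form of k_j and l_j, with sin^2(theta) = t / 2^n."""
    size = 2 ** n_qubits
    theta = math.asin(math.sqrt(t / size))
    return (math.sin((2 * j + 1) * theta) / math.sqrt(t),
            math.cos((2 * j + 1) * theta) / math.sqrt(size - t))

history = iterate_recurrences(8, 3, 20)
max_gap = max(abs(a - b)
              for j, pair in enumerate(history)
              for a, b in zip(pair, closed_form(8, 3, j)))
print(max_gap)
```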
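The aim and the art is to choose a number of steps that maximizes $k_j$ and minimizes
$l_j$. Let us therefore take j such that $\cos((2j + 1)\theta) = 0$. This yields
$$j = \frac{\pi}{4\theta} - \frac{1}{2} + m\frac{\pi}{\theta} \quad \text{for some } m \in \mathbb{Z}.$$
Since j has to be an integer, choose
$$j_0 = \left\lfloor \frac{\pi}{4\theta} \right\rfloor.$$
Because $\sin^2\theta = \frac{t}{2^n}$ we have $\theta \geq \sin\theta = \sqrt{\frac{t}{2^n}}$ and therefore $j_0 \leq \frac{\pi}{4}\sqrt{\frac{2^n}{t}} = O\left(\sqrt{\frac{2^n}{t}}\right)$.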
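The choice of the iteration count can be illustrated with a few concrete values. This is a sketch under the assumption that the success probability after j iterations is $\sin^2((2j+1)\theta)$ with $\sin^2\theta = t/2^n$; the function name is hypothetical.

```python
import math

def best_iteration_count(size, t):
    """Return j0 = floor(pi / (4 theta)), its success probability and the bound on j0."""
    theta = math.asin(math.sqrt(t / size))
    j0 = math.floor(math.pi / (4 * theta))
    success = math.sin((2 * j0 + 1) * theta) ** 2
    bound = math.pi / 4 * math.sqrt(size / t)
    return j0, success, bound

for t in (1, 2, 5, 17, 3000):
    print(t, best_iteration_count(2 ** 12, t))
```

In each case j0 stays below $\frac{\pi}{4}\sqrt{2^n/t}$ and the failure probability does not exceed $t/2^n$, since $(2j_0+1)\theta$ lies within θ of π/2.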
General case
Let us now consider the general case in which the number t of solutions is not known.
Without loss of generality we assume that $1 \leq t \leq \frac{3}{4}2^n$. Indeed, if $t > \frac{3}{4}2^n$, then a simple
algorithm, namely
3. To achieve no error, i.e., ε = 0, n queries are needed (Buhrman et al. 1998, see also
Section 3.5);
4. The following lower bound for ε is due to Buhrman and de Wolf (1998), where b is a
fixed constant and T < n:
$$\varepsilon \in \Omega\left(e^{-\frac{4bT^2}{n} - \frac{8T}{\sqrt{n}}}\right).$$
Remark 3.3.8 It has been shown that for the closely related problem of searching in an
ordered list we can gain a bit, but not too much, by using quantum algorithms. The
best current lower bound, in terms of the number of comparisons, is $\frac{\lg n}{2 \lg\lg n}$ (for exact and
bounded-error computation), due to Farhi et al. (1998a). The upper bound of $\frac{3}{4}\lg n + O(1)$
quantum queries on average, with probability $\frac{1}{2}$, is due to Röhrig (1998). An interesting and
stimulating feature of the last result is that not the Fourier but the Haar transform (see Haar, 1910, and
Høyer, 1997) was used. Farhi et al. (1999) presented a $0.55 \lg n$ algorithm.
(b) Apply the G-BBHT search algorithm to the first register to find a marked element.
(c) Measure the first register. If y ′ is the outcome and sy′ < sy take as the new
threshold index y ′ .
3. Return y.
Theorem 3.3.11 The minimum-finding algorithm presented in the example above finds the
minimum with probability at least $1 - \frac{1}{2}$ if the measurement is done after a total number of
$O(\sqrt{n})$ iterations.
Proof. Let us denote by p(M, t) the probability that when the above algorithm searches
the minimum among M items, one of the threshold indices chosen will be that of the element
s of the rank21 t ∈ {1, . . . , M }. On the basis of the identity
$$\Pr(k + 1, t) = \frac{1}{k + 1} + \frac{1}{k + 1} \sum_{r > t}^{k+1} \Pr(r - 1, t)$$
$$\sum_{t=2}^{n} \frac{1}{t}\, 4.5 \sqrt{\frac{n}{t - 1}} \leq 4.5\sqrt{n} \left( \frac{1}{2} + \sum_{t=2}^{n-1} t^{-\frac{3}{2}} \right) \leq \frac{45}{4}\sqrt{n}.$$
The expected total number of time steps of the Phase 2a before the minimum is found
is therefore at most
$$\sum_{t=2}^{n} \Pr(n, t) \lg n = (H_n - 1) \lg n \leq \ln n \lg n \leq \frac{7}{10} \lg^2 n,$$
$$m = \frac{45}{4}\sqrt{n} + \frac{7}{10} \lg^2 n. \qquad (3.26)$$
Of course, by running the algorithm c times the probability of success can be improved to at
least $1 - \frac{1}{2^c}$. Using the Markov inequality we have that after 2m steps the minimum is found
with probability at least $\frac{1}{2}$, and therefore Theorem 3.3.11 holds.
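The identity used in the proof determines the probability that the element of rank t ever becomes a threshold. The sketch below evaluates it exactly with rational arithmetic, assuming that the probability is 0 when fewer than t items are searched; the value 1/t it produces is consistent with the sum $(H_n - 1)\lg n$ appearing above. The function name `p` follows the notation p(M, t) of the proof.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def p(m, t):
    """Probability that the rank-t element is chosen as a threshold among m items."""
    if m < t:
        return Fraction(0)  # assumption: rank t does not exist among m < t items
    k = m - 1
    # Sum over first choices r > t, after which r - 1 candidates remain.
    tail = sum((p(r - 1, t) for r in range(t + 1, k + 2)), Fraction(0))
    return Fraction(1, k + 1) + Fraction(1, k + 1) * tail

print(p(10, 3), sum(p(10, t) for t in range(2, 11)))
```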
Exercise 3.3.12 (a) Design a minimum-search algorithm for the case that not all ele-
ments of the sequence are distinct; (b) reformulate the above algorithm to find extreme
points of functions.
Theorem 3.3.13 Let F be a search problem chosen from a family $\mathcal{F}$ according to some probability
distribution. If, using a heuristic H, a solution to F can be found in expected time t, then
there is a quantum algorithm to find a solution in expected time $O(\sqrt{t})$.
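Proof. We combine the G-BBHT search algorithm with the heuristic H. Define H′(r) =
F(H(F, r)) and x = H(F, G-BBHT(H′)). Hence F(x) = 1. By the results of Section 3.3.2,
for each $F \in \mathcal{F}$ we have an expected running time $\Theta\left(\sqrt{|R|/h_F}\right)$. If $p_F$ is the probability that F
is chosen, then $\sum_{F \in \mathcal{F}} p_F = 1$. The expected computation time is then of the order
$$\sum_{F \in \mathcal{F}} p_F \sqrt{\frac{|R|}{h_F}} = \sum_{F \in \mathcal{F}} \sqrt{p_F \frac{|R|}{h_F}} \sqrt{p_F} \leq \left( \sum_{F \in \mathcal{F}} p_F \frac{|R|}{h_F} \right)^{1/2} \left( \sum_{F \in \mathcal{F}} p_F \right)^{1/2} = \left( \sum_{F \in \mathcal{F}} p_F \frac{|R|}{h_F} \right)^{1/2}$$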
Quantum counting
We present a way to perform an approximate counting of the number of solutions of the
equation f(x) = 1, where $f : \{0, 1\}^n \to \{0, 1\}$, due to Brassard, Høyer and Tapp (1998a)22,
which combines basic ideas of Grover's and Shor's algorithms.
22 Another approach, due to Mosca (1998) is based on the analysis of eigenvalues of Grover’s iterate.
Basic idea: in the G-BBHT algorithm the amplitudes of the sets A0 and A1 vary with
the number of iterations, according to a periodic function. We know from Section 3.3.2 that
this period is directly related to the size of these sets. An estimation of the common period,
using Fourier analysis, provides an approximation of the size of sets A0 and A1 .
The quantum algorithm COUNT presented below to provide the approximate counting
has two parameters: the function f, given as a black box, and an integer $p = 2^k$, for some k,
to determine the precision of the approximation. The algorithm uses two transformations
$$C_f : |m, \psi\rangle \to |m, G_f^{(m)} \psi\rangle,$$
$$F_p : |k\rangle \to \frac{1}{\sqrt{p}} \sum_{l=0}^{p-1} e^{2\pi i k l / p} |l\rangle,$$
where $G_f = -H_n R_n^1 H_n V_f$ is the Grover's iterate for f, and $G_f^{(m)}$ denotes its mth iteration.
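Algorithm 3.3.14 COUNT(f, p)
1. $|\psi_0\rangle \leftarrow (H_n \otimes H_n)|0^{(n)}, 0^{(n)}\rangle$;
2. $|\psi_1\rangle \leftarrow C_f |\psi_0\rangle$;
3. $|\psi_2\rangle \leftarrow (F_p \otimes I)|\psi_1\rangle$;
4. $f \leftarrow$ if measure of $|\psi_2\rangle > \frac{p}{2}$ then $p - M(|\psi_2\rangle)$ else $M(|\psi_2\rangle)$;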
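The outcome distribution of steps 1 to 4 can be computed classically for small parameters. The sketch below is not the algorithm itself but a simulation under stated assumptions: the first register is taken to be uniform over p values, the second register is tracked only through its good and bad components $\sin((2m+1)\theta)$ and $\cos((2m+1)\theta)$, and the final estimate $2^n \sin^2(\pi \tilde{f}/p)$ is the usual closing step of the Brassard, Høyer and Tapp method, which is not shown in this excerpt. All names are illustrative.

```python
import cmath
import math

def count_estimate(n_qubits, t, p):
    """Most likely folded outcome of COUNT and the resulting estimate of t."""
    size = 2 ** n_qubits
    theta = math.asin(math.sqrt(t / size))
    # After C_f, branch m holds G_f^m applied to the uniform state, i.e.
    # sin((2m+1)theta)|good> + cos((2m+1)theta)|bad> (normalized components).
    good = [math.sin((2 * m + 1) * theta) / math.sqrt(p) for m in range(p)]
    bad = [math.cos((2 * m + 1) * theta) / math.sqrt(p) for m in range(p)]

    def fourier(amplitudes, l):
        # Amplitude of |l> after F_p acts on the first register.
        return sum(a * cmath.exp(2j * math.pi * m * l / p)
                   for m, a in enumerate(amplitudes)) / math.sqrt(p)

    probs = [abs(fourier(good, l)) ** 2 + abs(fourier(bad, l)) ** 2
             for l in range(p)]
    outcome = max(range(p), key=lambda l: probs[l])        # most likely result
    folded = p - outcome if outcome > p / 2 else outcome   # step 4
    estimate = size * math.sin(math.pi * folded / p) ** 2  # assumed final step
    return folded, estimate, probs

folded, estimate, probs = count_estimate(8, 16, 64)
print(folded, estimate)
```

With $2^n = 256$, t = 16 and p = 64 the distribution peaks near $p\theta/\pi \approx 5.1$, so the estimate comes out close to, but not exactly at, the true count; a larger p sharpens it.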
If we call states $|i, \cdot\rangle$ good provided $i \in f^{-1}(1)$, and bad otherwise, then $|\phi_1\rangle$ ($|\phi_0\rangle$) is a
projection of $|\phi\rangle$ into the subspace spanned by good (bad) states. Moreover, the probability
that by measuring $|\phi\rangle$ we get a good (bad) state is $\langle\phi_1|\phi_1\rangle$ ($\langle\phi_0|\phi_0\rangle$).
Let U be any unitary transformation on H. The key tool for the amplitude amplification
procedure to be presented below is the unitary operator
$$Q = Q(U, f, p, q) = -U S_0^p U^{-1} S_f^q,$$
where p, q are complex numbers such that |p| = |q| = 1. The operator $S_f^q$ conditionally
changes the phase by the phase factor q as follows
$$S_f^q |i, \cdot\rangle \to q|i, \cdot\rangle \text{ if } f(i) = 1;$$
$$S_f^q |i, \cdot\rangle \to |i, \cdot\rangle \text{ if } f(i) = 0.$$
The operator $S_0^p$ changes the phase of a state by a factor p if and only if the first register
holds 0. Using this notation the original Grover's iterate has the form $Q(H_n, f, -1, -1)$.
The following properties of the operators $U S_0^p U^{-1}$ and Q are easy to verify.
In particular, if $|\psi\rangle = U|0\rangle$ and $|\psi\rangle$ is decomposed, similarly as above, into $|\psi_1\rangle$ and $|\psi_0\rangle$,
then it holds
Lemma 3.4.2
As a consequence of the above lemma, for any vector $|\psi\rangle$ the subspace spanned by
vectors $|\phi_0\rangle$, $|\phi_1\rangle$, $|\psi_0\rangle$ and $|\psi_1\rangle$ is invariant under the transformation Q. For the special
case p = q = −1 simpler relations are obtained:
Lemma 3.4.4 Let $U|0\rangle = |\psi\rangle = |\psi_0\rangle + |\psi_1\rangle$, Q = Q(U, f, −1, −1), then
The recurrences (3.28) and (3.29) have actually been solved in Section 3.3.2 with the result
Theorem 3.4.5 Let $U|0\rangle = |\psi\rangle = |\psi_0\rangle + |\psi_1\rangle$ and Q = Q(U, f, −1, −1). Then, for j ≥ 0,
where
$$k_j = \frac{1}{\sqrt{a}} \sin((2j + 1)\theta) \quad \text{and} \quad l_j = \frac{1}{\sqrt{1 - a}} \cos((2j + 1)\theta)$$
and θ is such that $\sin^2\theta = a = \langle\psi_1|\psi_1\rangle$, $0 \leq \theta \leq \frac{\pi}{2}$.
In Theorem 3.4.5, U can be any quantum algorithm that uses no measurement. We
therefore get a general method to increase the probability of success of searching; let
us analyze this situation in more detail.
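That the amplification works for an arbitrary U, and not only for $H_n$, can be checked on a small example. The sketch below assumes real amplitudes and uses the fact that for p = q = −1 the operator $-U S_0^{-1} U^{-1}$ equals $2|\psi\rangle\langle\psi| - I$ with $|\psi\rangle = U|0\rangle$, so U itself never has to be written down; the function name and the test vector are illustrative.

```python
import math

def amplify(psi, good, steps):
    """Probability of a good state after applying Q(U, f, -1, -1) `steps` times to |psi>."""
    state = list(psi)
    for _ in range(steps):
        # S_f with q = -1: flip the sign of the good basis states.
        state = [-a if i in good else a for i, a in enumerate(state)]
        # -U S_0 U^{-1} with p = -1 acts as the reflection 2|psi><psi| - I.
        overlap = sum(x * a for x, a in zip(psi, state))
        state = [2 * overlap * x - a for x, a in zip(psi, state)]
    return sum(a * a for i, a in enumerate(state) if i in good)

raw = [3, 1, 4, 1, 5, 9, 2, 6]          # arbitrary non-uniform amplitudes
norm = math.sqrt(sum(x * x for x in raw))
psi = [x / norm for x in raw]
good = {1, 5}
a = amplify(psi, good, 0)                # initial success probability
theta = math.asin(math.sqrt(a))
gaps = [abs(amplify(psi, good, j) - math.sin((2 * j + 1) * theta) ** 2)
        for j in range(6)]
print(a, max(gaps))
```

The simulated success probabilities agree with $\sin^2((2j+1)\theta)$, where $\sin^2\theta = a$.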
If $A|0\rangle$ is computed, then a is the success probability to get into a good state. On
the other hand, if the transformation $Q^j A$ is applied, then the success probability is
$a k_j^2 = \sin^2((2j + 1)\theta)$. One can achieve a high probability of success by choosing j such that
$\sin^2((2j + 1)\theta) \approx 1$. However, for that one needs to know θ, which depends in turn on a. In case a > 0
and $j = \lfloor \frac{\pi}{4\theta} \rfloor$, we have $a k_j^2 \geq 1 - a$ and therefore it holds.
3.4. METHODOLOGIES TO DESIGN QUANTUM ALGORITHMS 139
and therefore we have basically the same recurrences (with different parameters) as in Sec-
tions 3.3.2 and 3.4.1.
By applying a similar procedure as in Section 3.3.2 one can show that after T iterations of
Q we get the superposition $a_s|\phi_s\rangle + a_t U^{-1}|\phi_t\rangle$, where $a_s = \cos(2T|a^U_{st}|)$, $|a_t| = |\sin(2T|a^U_{st}|)|$.
If $T = \frac{\pi}{4|a^U_{st}|}$, then we get the superposition $U^{-1}|\phi_t\rangle$ and by an application of U we reach the
target state $|\phi_t\rangle$. Therefore in $O\left(\frac{1}{|a^U_{st}|}\right)$ steps we reach $|\phi_t\rangle$ when starting in $|\phi_s\rangle$.
presented in Section 3.4.2 provides an algorithm to reach the target state in $O\left(\frac{1}{|a^{H_n}_{0t}|}\right) = O(\sqrt{2^n})$
steps. In this case the operator $-H_n I_0 H_n$ is simply the inversion about the average
operator as introduced on page 72.
Search when a basis state near the target basis state is given
The aim is to reach in $H_{2^n}$ a basis state $|t\rangle$ starting from the initial state $|s\rangle$ provided a
state $|q\rangle$ is given such that q and t differ in k bits and k is known.
This time instead of the Hadamard transformation, the transformation $W_\alpha^n = \bigotimes_{i=1}^{n} W_\alpha$
is applied to each of n qubits, where
$$W_\alpha = \begin{pmatrix} \sqrt{1 - \frac{1}{\alpha}} & \frac{1}{\sqrt{\alpha}} \\ \frac{1}{\sqrt{\alpha}} & -\sqrt{1 - \frac{1}{\alpha}} \end{pmatrix}.$$
In this case $|a_{st}^{W_\alpha^n}| = (1 - \frac{1}{\alpha})^{\frac{n-k}{2}} (\frac{1}{\alpha})^{\frac{k}{2}}$ and this value is maximal if $\alpha = \frac{n}{k}$. The algorithm
presented in Section 3.4.2 provides the solution in time $O\left(\frac{1}{|a_{st}^{W_\alpha^n}|}\right)$.
It remains to find out how good this solution is with respect to the classical case of doing
exhaustive search. The size of the search space is now $\binom{n}{k}$. Using Stirling's approximation
for factorial we have $\lg \binom{n}{k} \approx n \lg \frac{n}{n-k} - k \lg \frac{k}{n-k}$. On the other hand, for $\alpha = \frac{n}{k}$,
$\lg |a_{st}^{W_\alpha^n}| = \frac{n}{2} \lg \frac{n-k}{n} - \frac{k}{2} \lg \frac{n-k}{k}$ and therefore the number of steps of the quantum algorithm just derived
is about the square root of the number of steps of the classical exhaustive search algorithm.
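The square-root relation can be illustrated with concrete numbers. The sketch below compares $\lg(1/|a_{st}|)$ at $\alpha = n/k$ with $\lg\binom{n}{k}$ for n = 100 and k = 10; the helper name and the chosen values are illustrative, and the agreement is only approximate because Stirling's formula ignores lower-order terms.

```python
import math

def lg_inverse_overlap(n, k, alpha):
    """lg(1/|a_st|) for |a_st| = (1 - 1/alpha)^((n-k)/2) * (1/alpha)^(k/2)."""
    return -((n - k) / 2 * math.log2(1 - 1 / alpha)
             + k / 2 * math.log2(1 / alpha))

n, k = 100, 10
quantum = lg_inverse_overlap(n, k, n / k)   # lg of the quantum step count
classical = math.log2(math.comb(n, k))      # lg of the classical search space
ratio = quantum / classical
others = [lg_inverse_overlap(n, k, alpha) for alpha in (2, 5, 20, 50)]
print(quantum, classical, ratio)
```

The ratio is close to 1/2, and every other tested value of α gives a larger $\lg(1/|a_{st}|)$, in line with the claim that $\alpha = n/k$ maximizes the overlap.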
where the second register is considered as a qubit taking value −1 or 1. If we now define
$$|x, 0'\rangle = \frac{1}{\sqrt{2}}(|x, 1\rangle + |x, -1\rangle), \quad |x, 1'\rangle = \frac{1}{\sqrt{2}}(|x, -1\rangle - |x, 1\rangle),$$
then for y ∈ {0, 1}, if we denote $f^y(x) = (f(x))^y$,
and therefore, in the basis $\{|x, 0'\rangle, |x, 1'\rangle\}$, the quantum operator $A_f$ is simply a multiplication
by $f^y(x)$.
Our aim is to show that any quantum algorithm computing par(f) has to use at least
$\frac{n}{2}$ calls of f. In order to do that we need to consider also the cases where a quantum
algorithm computing par(f) works in a larger Hilbert space than the (2n)-dimensional Hilbert
space spanned by vectors $\{|x, y\rangle \mid 1 \leq x \leq n, y \in \{0, 1\}\}$. We shall therefore assume that
a quantum algorithm A to compute par(f) is given which works in a Hilbert space $H_{n,Z}$
spanned by the vectors $\{|x, y, z\rangle \mid 1 \leq x \leq n, y \in \{0, 1\}, 1 \leq z \leq Z\}$, for some Z.
A can be seen as a sequence of unitary operators that act on an initial vector $|\psi_0\rangle$ and
produce a final vector $|\psi_f\rangle$ in such a way that there is a projection operator (observable)
P corresponding to a decomposition of the underlying Hilbert space into two orthogonal
subspaces, such that if P is applied to $|\psi_f\rangle$, then either the value 0 is obtained, corresponding
to the case par(f) = −1, or the value 1, corresponding to par(f) = 1.
The algorithm A will be considered as computing par(f) with error ε, if for the expectation
value of P, with respect to $|\psi_f\rangle$, it holds
$$\langle\psi_f|P|\psi_f\rangle \begin{cases} \geq \frac{1}{2} + \varepsilon, & \text{if } \mathrm{par}(f) = 1; \\ \leq \frac{1}{2} - \varepsilon, & \text{if } \mathrm{par}(f) = -1. \end{cases} \qquad (3.33)$$
The operator $A_f$ can clearly be expressed (as a generalization of (3.32)), in the form
$$A_f = \sum_{x=1}^{n} \sum_{y=0}^{1} f^y(x) P_x P_y,$$
where, for any 1 ≤ x ≤ n, $P_x$ is the projection operator into the space spanned by the
vectors $\{|x, y', z\rangle \mid y \in \{0, 1\}, 1 \leq z \leq Z\}$ and $P_y$ is the projection into the subspace with
the basis $\{|x, y', z\rangle \mid 1 \leq x \leq n, 1 \leq z \leq Z\}$. (This is actually the spectral representation of
the query operator $A_f$.)
If the algorithm A contains k applications of Af , then it can be seen as having the form
A = Uk Af Uk−1 Af . . . Af U1 Af , (3.34)
where
Φ(x1 , q1 , . . . , x2k , q2k ) = hψ0 |Px1 Pq1 U1∗ . . . Uk∗ Pf Uk . . . U1 Px2k Pq2k |ψ0 i
does not depend on f .
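There are $2^n$ functions $f \in F_n$. The summation over all such functions yields
$$\sum_{f \in F_n} \mathrm{par}(f) \langle\psi_f|P|\psi_f\rangle = \sum_{f \in F_n} \sum_{x_1=1}^{n} \sum_{q_1=0}^{1} \dots \sum_{x_{2k}=1}^{n} \sum_{q_{2k}=0}^{1} \Phi(x_1, q_1, \dots, x_{2k}, q_{2k}) \prod_{i=1}^{2k} f^{q_i}(x_i) \prod_{y=1}^{n} f(y). \qquad (3.35)$$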
Since the summation is over all functions f ∈ Fn , it holds
$$\sum_{f \in F_n} f(z) = 0 \quad \text{for any } z \in \{1, \dots, n\},$$
where $x_1, \dots, x_{2k}$ and also $q_1, \dots, q_{2k}$ are fixed. Since $f^0(x_i) = 1$, we have
$$S = \sum_{f \in F_n} \prod_{\{i \mid q_i = 1\}} f(x_i) \prod_{y=1}^{n} f(y). \qquad (3.37)$$
Observe that $f^2(z) = 1$ for any f and any z. From (3.36) and (3.37) it follows that S = 0
unless each term in the second product in (3.37) can be matched by a different term of the
first product.
The first product has at most 2k terms, and the second product has n terms. This
immediately implies that S = 0 if 2k < n. In addition, in such a case
$$\sum_{f \in F_n} \mathrm{par}(f) \langle\psi_f|P|\psi_f\rangle = 0$$
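The cancellation argument can be confirmed by brute force for a small n. The sketch below sums the two products of (3.37) over all functions f from {1, ..., n} to {−1, 1}, for a given list of query positions $x_i$ (those with $q_i = 1$); the function name is illustrative.

```python
from itertools import product

def character_sum(n, xs):
    """Sum over all f: {1..n} -> {-1, 1} of prod_i f(x_i) * prod_y f(y)."""
    total = 0
    for values in product((-1, 1), repeat=n):
        f = dict(zip(range(1, n + 1), values))
        term = 1
        for x in xs:                 # first product: queried positions
            term *= f[x]
        for y in range(1, n + 1):    # second product: the parity of f
            term *= f[y]
        total += term
    return total

print(character_sum(4, [1, 2, 3]), character_sum(4, [1, 2, 3, 4]))
```

With fewer than n queried positions some f(y) always remains unmatched and the sum vanishes; only when every y is matched do all $2^n$ terms equal 1.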
Exercise 3.5.1 Show that the above bound holds also for n odd. Namely, that with $\frac{n+1}{2}$
applications of $A_f$ one can determine the parity of f.
Exercise 3.5.2 (Farhi et al. 1998) Consider computation of $f^n(x)$ for functions that
map a set of size 2n to itself. Show that no quantum algorithm which uses f as a black box
can solve the problem with fewer than $\frac{n}{2}$ applications of the unitary operator corresponding
to the given function.
Three types of algorithms to compute a Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ are to be considered:
exact algorithms providing f(x) for any $x \in \{0, 1\}^n$; Las Vegas or zero-error algorithms
providing a result with probability at least $\frac{1}{2}$ (and if they deliver a result, then surely a correct
one); Monte Carlo (2-sided error) or bounded-error algorithms providing a result
that is correct with probability $\frac{2}{3}$.
Representation and approximation of Boolean functions by polynomials is defined as
follows:
Observe that $x^n = x$ for any positive integer n whenever x ∈ {0, 1}. It is therefore sufficient to
consider multilinear polynomials when representation or approximation of Boolean functions
by polynomials is the task.
Example 3.5.4 (1) The polynomials $x_1 x_2$ and $1 - (1 - x_1)(1 - x_2)$ represent the Boolean functions
$x_1 \wedge x_2$ and $x_1 \vee x_2$; (2) the polynomial $x_1 + x_3 - x_1 x_3$ represents the Boolean function
$(x_1 \vee x_2 \vee x_3) \wedge (x_1 \vee \bar{x}_2 \vee x_3)$.
Definition 3.5.5 The degree of a multilinear polynomial is the maximum number of variables
occurring in a term of the polynomial. For a Boolean function f let deg(f) ($\widetilde{\deg}(f)$)
be the degree of a minimum degree polynomial representing (approximating) f.
Remark 3.5.6 It is well known that for any Boolean function there is exactly one multilinear polynomial
representing this function. Since the n-ary OR function, $x_1 \vee x_2 \vee \dots \vee x_n$, can be
represented by the polynomial $1 - \prod_{i=1}^{n}(1 - x_i)$, we have that deg(OR) = n. Clearly, the
same holds for n-ary AND.
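The unique multilinear representation of a Boolean function can be computed from its truth table by Möbius inversion over subsets of variables. The sketch below does this for small n and reads off the degree; the function names and the 0-based variable indexing are choices made here for illustration.

```python
from itertools import combinations

def multilinear_coefficients(f, n):
    """Coefficients c_S of the multilinear polynomial sum_S c_S prod_{i in S} x_i equal to f."""
    coeffs = {}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            # Moebius inversion: c_S = sum over T subset of S of (-1)^(|S|-|T|) f(1_T).
            c = 0
            for r in range(size + 1):
                for inner in combinations(subset, r):
                    point = tuple(1 if i in inner else 0 for i in range(n))
                    c += (-1) ** (size - r) * f(point)
            if c != 0:
                coeffs[subset] = c
    return coeffs

def degree(f, n):
    """Degree of the representing polynomial (0 for a constant function)."""
    return max((len(s) for s in multilinear_coefficients(f, n)), default=0)

x1_or_x3 = lambda x: int(x[0] or x[2])
print(multilinear_coefficients(x1_or_x3, 3))
print(degree(lambda x: int(any(x)), 4), degree(lambda x: sum(x) % 2, 4))
```

For the function of Example 3.5.4 (2) this recovers the terms $x_1$, $x_3$ and $-x_1 x_3$ (indices 0 and 2 in the code), and both OR and PARITY on four variables have full degree 4.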
Several general lower bounds on the degree of Boolean functions were shown by Nisan
and Szegedy (1994), von zur Gathen and Rucke (1997) and by Paturi (1992):
Theorem 3.5.7 If f : {0, 1}n → {0, 1} is a Boolean function that depends on all its vari-
ables, then
(1) deg(f ) ≥ lg n − O(lg lg n);
(2) deg(f ) ≥ n − O(n0.548 ), if f is non-constant and symmetric;
(3) deg(f ) = n, if f is non-constant and symmetric and n + 1 is prime.
(4) $\widetilde{\deg}(f) = \Theta\left(\sqrt{n(n - \Gamma(f))}\right)$, where
Exercise 3.5.8 (a) Show for functions f equal to OR and AND that Γ(f) = n − 1 (and
therefore $\widetilde{\deg}(f) = \Theta(\sqrt{n})$); (b) show for functions f equal to PARITY or MAJORITY
that Γ(f) = 1, and therefore $\widetilde{\deg}(f) = \Theta(n)$.
Model of computation
We shall consider the following oracle setting for computing Boolean functions. There is a
vector of N Boolean variables X = (x0 , . . . , xN −1 ) given by an oracle that produces xi on the
input i and we want to compute a Boolean function f : X → {0, 1}. (If f : {0, 1}n → {0, 1},
then X = (x0 , . . . , x2n −1 ).)
A quantum network with T oracle calls to an oracle X will be represented by a sequence
of unitary transformations
U0 , O, U1 , O, U2 , O, . . . . . . , O, UT −1 , O, UT ,
Basic results
Lemma 3.5.9 Let N be a quantum circuit that makes T calls to an oracle X. Then there
exist complex-valued multilinear polynomials $p_i$, $0 \leq i < 2^m$, each of the degree at most T,
such that the final state of the network is
$$|\phi_X\rangle = \sum_{k=0}^{2^m - 1} p_k(X)|k\rangle.$$
Proof. Let $|\phi_i\rangle$ be the state of the network just before the ith oracle call; i.e. $|\phi_{i+1}\rangle = U_i O|\phi_i\rangle$.
The amplitudes in $|\phi_0\rangle$ depend on the initial state and on $U_0$, but not on the oracle,
and therefore they are polynomials of the degree 0. An oracle call maps a basis state $|i, b, z\rangle$
to $|i, b \oplus x_i, z\rangle$.23 If therefore the amplitude of $|i, 0, z\rangle$ ($|i, 1, z\rangle$) is α (β), then the amplitude
of $|i, 0, z\rangle$ ($|i, 1, z\rangle$) after the oracle call becomes $(1 - x_i)\alpha + x_i\beta$ ($x_i\alpha + (1 - x_i)\beta$), which
are polynomials of degree 1. In the same way we can show that if the amplitudes of the
state before an oracle call are polynomials of the degree ≤ j, then after the oracle call they
are polynomials of the degree ≤ j + 1. Moreover, no unitary transformation $U_i$ increases
the maximal degree of the amplitude polynomials because such transformations only create
linear combinations of the already existing polynomials. By induction we can now prove
that all amplitude polynomials have degree at most T.
Lemma 3.5.10 Let N be a quantum circuit that makes T calls to an oracle X and B be a
set of basis states. Then there exists a real-valued multilinear polynomial p(X) of degree at
most 2T which equals the probability that observing the final state of N with oracle X yields
a state from B.
Proof. By Lemma 3.5.9, the final state of the network can be written as
$$\sum_{j=0}^{2^m - 1} p_j(X)|j\rangle,$$
where $p_j$ are complex valued polynomials of degree at most T. The probability of observing
a state from B is
$$P_B(X) = \sum_{j \in B} |p_j(X)|^2.$$
If $p_j(X) = Re_j(X) + i\,Im_j(X)$, where $Re_j(X)$ and $Im_j(X)$ are the real and imaginary parts
(real valued polynomials of degree at most T), then $|p_j(X)|^2 = (Re_j(X))^2 + (Im_j(X))^2$ and
therefore the resulting polynomials have degree at most 2T.
23 The mapping O : |i, b, yi → |i, b ⊕ xi , yi has the following interpretation: i is information for the oracle
and xi is the output of the oracle.
Theorem 3.5.11 Let f be a Boolean function and N a quantum circuit computing f with
T oracle calls. Then T ≥ deg(f )/2.
Proof. Let B be the set of those basis states $|i\rangle$ that have 1 as the rightmost bit (of
the binary representation of i). By Lemma 3.5.10, there is a real-valued polynomial p of
degree at most 2T and such that, for all X, p(X) equals the probability that the result of
measurement of the last qubit is 1. Since N computes f exactly it must hold that p(X) = 1
if and only if f (X) = 1, and p(X) = 0 if and only if f (X) = 0. Hence p(X) = f (X) for all
X and therefore the degree of p has to be at least deg(f ).
As a corollary of Theorems 3.5.7 and 3.5.11 we have
Theorem 3.5.12 Let f be a Boolean function that depends on n variables and N a quantum
network that computes f exactly using T oracle calls. Then $T \geq \frac{1}{2}\lg n - O(\lg\lg n)$.
Better bounds can be obtained if more can be assumed about f . By Theorems 3.5.7
and 3.5.11 we have
Theorem 3.5.14 Let f be a Boolean function and N be a quantum circuit that computes
f with 2-sided error probability at most $\frac{1}{3}$ using T oracle calls. Then $T \geq \widetilde{\deg}(f)/2$.
Lemma 3.5.15 Any quantum circuit that computes n-variable OR with zero-error requires
n oracle calls.
3.5. LIMITATIONS OF QUANTUM ALGORITHMS 147
Let B be the set of all basis states ending with 0, that is, with the output 0. For every k ∈ B there
has to be $p_k(X) = 0$ if X ≠ 0; otherwise the probability of getting the incorrect answer
on the input X would be non-zero. At the same time there has to be at least one $k_0 \in B$
such that $p_{k_0}(0) \neq 0$, since the probability of getting the correct answer 0 on $|\phi_0\rangle$ must be
non-zero. Let p(X) be the real part of the polynomial $1 - p_{k_0}(X)/p_{k_0}(0)$. This polynomial
has degree at most t and can be seen as representing the OR function. Hence p has
to have degree at least deg(OR) = n. Therefore, t ≥ n.
As a consequence we have
Corollary 3.5.16 A quantum circuit for exact or zero-error n-element database search re-
quires n oracle calls.
Finally, let us discuss again the problem of computing the parity function of n arguments.
Since deg(par) = n, it follows from Theorem 3.5.11 that at least $\frac{n}{2}$ oracle calls are needed,
and as discussed at the end of Section 3.5.2, the fact that the Deutsch XOR problem can
be solved with one oracle call implies that parity computation can be done for n even with
$\frac{n}{2}$ oracle calls.
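The upper bound can be made concrete by simulating Deutsch's one-query XOR on pairs of bits. The sketch below tracks the four amplitudes of an index qubit and a target qubit, assuming the oracle acts as $|i, b\rangle \to |i, b \oplus x_i\rangle$ on the current pair; function names are illustrative, and the second function counts one oracle call per pair.

```python
import math

def xor_with_one_query(x0, x1):
    """Deutsch's trick: one oracle call |i, b> -> |i, b XOR x_i> reveals x0 XOR x1."""
    h = 1 / math.sqrt(2)
    bits = (x0, x1)
    # State over basis |i, b>, prepared as (H x H)|0, 1>.
    amp = {(i, b): h * h * (-1 if b else 1) for i in (0, 1) for b in (0, 1)}
    # The single oracle call.
    amp = {(i, b ^ bits[i]): a for (i, b), a in amp.items()}
    # Hadamard on the index qubit; probability of reading index value 1.
    prob_one = sum(abs(h * (amp[(0, b)] - amp[(1, b)])) ** 2 for b in (0, 1))
    return round(prob_one)

def parity(bits):
    """Parity of an even-length bit list using len(bits) // 2 simulated oracle calls."""
    result = 0
    for j in range(0, len(bits), 2):
        result ^= xor_with_one_query(bits[j], bits[j + 1])
    return result

print(parity([1, 0, 1, 1]))
```

The XOR of each pair is obtained with certainty, so combining the n/2 results classically gives the parity exactly.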
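[Figure 3.10: decision tree to sort three different numbers a, b, c, with comparison nodes a<b, b<c and a<c, branches + and −, and leaves such as a<b<c and b<a<c]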
Decision trees are perhaps the simplest computational model convenient to deal with
the above problems. In deterministic decision trees (see Figure 3.10 for a decision tree
to sort three different numbers), to each node corresponds a query concerning the input
data (a use of the black box), and computation then proceeds according to the result of the
query. In order to simplify the matter we will consider, without loss of generality concerning
the complexity, decision trees to compute Boolean functions. The depth-cost of such a tree
is the length of its longest root-to-leaf path, and D(f) denotes the minimum of depth-costs over all
deterministic decision trees for f.
Randomized decision trees are the corresponding model for randomized computation.
They can be seen as a probability distribution on the set of deterministic decision trees.
The depth-cost of such a decision tree is the expected number of calls on the worst case
input. Depending on the type of error allowed several types of randomized decision trees
are considered: zero-error, one-sided error and bounded-error. Randomized bounded-error
complexity R(f) of a Boolean function f is the minimum cost of a randomized decision tree
that computes f with error probability at most $\frac{1}{3}$ for all inputs.
Quantum complexity Q(f) of a Boolean function f is defined as the number of black-box
calls in the best network that computes f with error probability at most $\frac{1}{3}$.
The relation $D(f) = O(R(f)^3)$ is due to Nisan (1991), and the relation $D(f) = O(Q(f)^6)$ is
due to Beals et al. (1998). The result implies that if some Boolean function can be computed
quantumly with bounded-error probability, then it can be computed by a deterministic
decision tree with only polynomial increase of black-box queries.
Open problem 3.5.17 Can the upper bound $D(f) = O(Q(f)^6)$ be improved?
In the case of monotone (symmetric) Boolean functions it holds $D(f) = O(Q(f)^4)$
($D(f) = O(Q(f)^2)$), and the best separation known is D(f) = n and $Q(f) = O(\sqrt{n})$ for
the OR function.
Chapter 4
AUTOMATA
INTRODUCTION
In addition to the study of problems of the design and analysis of algorithms and circuits,
as well as of the computational complexity of algorithmic problems, another main method
of theoretical computing to get an insight into the power of computational resources is to
study models of quantum computing devices and the corresponding complexity classes. This
will be done in this chapter for the three most basic models of quantum automata: quantum
versions of finite automata, Turing machines and cellular automata.
Quantum finite automata are perhaps the most elementary model of quantum automata.
They are in addition the only model so far for which it has been fully proved that they have
larger power than their classical counterparts.
Quantum Turing machines are the main model to study the most fundamental questions
concerning the power of quantum computing itself and the power of quantum versus classical
computing. Quantum cellular automata are of a special interest. They seem to be a model
much closer to the physical reality than quantum Turing machines. In addition, it is still a
major open problem whether quantum cellular automata are more powerful than quantum
Turing machines.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
1. the basic models of one-way and two-way quantum finite automata as well as their
computational power and space efficiency in comparison with classical finite automata;
2. the basic models of quantum Turing machines and methods of implementing the main
classical programming primitives in a reversible way;
3. the types and precision of amplitudes needed to ensure sufficient accuracy of compu-
tation on quantum Turing machines;
4. the basic models of quantum cellular automata and partitioned quantum cellular au-
tomata, as well as simulations between quantum cellular automata and quantum Tur-
ing machines.
150 CHAPTER 4. AUTOMATA
By convention there is color, by convention
sweetness, by convention bitterness, but in re-
ality there are atoms and space.
evolution.
[Figure: finite automata models, panels (a)-(e), with labels 1FA, 2FA, NFA and PFA]
Basic concepts
One-way quantum finite automata are the first type of quantum automata we consider and
therefore we shall deal with them in more detail.
Definition 4.1.1 A one-way quantum finite automaton A is specified by the finite
(input) alphabet Σ, the finite set of states Q, the initial state q0, the sets Qa ⊆ Q and
Qr ⊆ Q of accepting and rejecting states, respectively, with Qa ∩ Qr = ∅, and the transition
function
\[ \delta : Q \times \Gamma \times Q \to \mathbf{C}_{[0,1]}, \]
where Γ = Σ ∪ {#, $} is the tape alphabet of A, and # and $ are endmarkers not in Σ.
The evolution (computation) of A is performed in the inner-product space l2(Q), i.e.,
with the basis {|q⟩ | q ∈ Q}, using the linear operators Vσ, σ ∈ Γ, defined by
\[ V_\sigma(|q\rangle) = \sum_{q' \in Q} \delta(q, \sigma, q')\,|q'\rangle, \]
l2 (Q) = Ea ⊕ Er ⊕ En ,
4.1. QUANTUM FINITE AUTOMATA 153
Example 4.1.2 (Ambainis and Freivalds, 1998) We show that the language L0 =
{0^i 1^j | i, j ≥ 0} can be recognized by a 1QFA A0 with probability p such that p = 1 − p^3
(p ≈ 0.68).
A0 has the set of states Q = {q0, q1, q2, qa, qr}, Qa = {qa}, Qr = {qr}; q0 is the initial
state. The transition function δ is specified by the transitions:
\[
\begin{aligned}
V_\#|q_0\rangle &= \sqrt{1-p}\,|q_1\rangle + \sqrt{p}\,|q_2\rangle,\\
V_0|q_1\rangle &= (1-p)\,|q_1\rangle + \sqrt{p(1-p)}\,|q_2\rangle + \sqrt{p}\,|q_r\rangle,\\
V_0|q_2\rangle &= \sqrt{p(1-p)}\,|q_1\rangle + p\,|q_2\rangle - \sqrt{1-p}\,|q_r\rangle,
\end{aligned}
\]
\[ (1 - p) + p(1 - p) + p^2(1 - p) = 1 - p^3 = p. \]
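The two numerical facts used in Example 4.1.2 can be checked with a short sketch. It assumes nothing beyond the formulas printed above: it finds the root of p = 1 − p^3 by bisection and verifies that the images of |q1⟩ and |q2⟩ under V0 are orthonormal, as unitarity requires. The helper names (solve_p, v0_columns, dot) are ours and purely illustrative.

```python
import math

def solve_p(tol=1e-12):
    """Bisection for the root of f(p) = p**3 + p - 1 on [0, 1] (f is increasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid**3 + mid - 1 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def v0_columns(p):
    """Images of |q1> and |q2> under V_0, as coordinates over (q1, q2, qr)."""
    col_q1 = (1 - p, math.sqrt(p * (1 - p)), math.sqrt(p))
    col_q2 = (math.sqrt(p * (1 - p)), p, -math.sqrt(1 - p))
    return col_q1, col_q2

def dot(u, v):
    """Real inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))

p = solve_p()
c1, c2 = v0_columns(p)
print(round(p, 4))                          # 0.6823
print(abs(dot(c1, c1) - 1) < 1e-9)          # True: V_0|q1> has norm 1
print(abs(dot(c2, c2) - 1) < 1e-9)          # True: V_0|q2> has norm 1
print(abs(dot(c1, c2)) < 1e-9)              # True: the two images are orthogonal
```

The orthonormality in fact holds for every p in [0, 1]; only the acceptance probability singles out the root p ≈ 0.68.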
Exercise 4.1.3 What is the largest probability with which a 1QFA can accept the
language {0^i 1^j 0^k | i, j, k ≥ 0}?
Exercise 4.1.4 Show: (a) || · ||u is a norm on VA ; (b) there is a constant c such that
||Tx v − Tx v ′ ||u ≤ c||v − v ′ ||u for any v, v ′ ∈ VA , x ∈ Γ∗ ; (c) if a set A ⊂ D is such that
there exists an ε > 0 such that for all v, v ′ ∈ A it holds ||v − v ′ ||u > ε, then there can be
only finitely many elements in A.
Remark 4.1.5 Another possibility to define acceptance by one-way QFA was considered
by Moore and Crutchfield (1997)—to measure only at the right endmarker. Let us denote
one-way QFA with such acceptance by mc-1QFA. Even if this definition seems at first
sight more natural, technical results do not confirm that. It holds that mc-1QFA ⊊ 1QFA.
Language recognition
The basic question is whether 1QFA have larger recognition power than 1FA. The answer
is negative (Kondacs and Watrous, 1997).
Proof. The proof is a small modification of the proof that probabilistic FA accept only
regular languages, see Rabin (1976). Here is the basic idea:
Let A = ⟨Q, Σ, q0, Qa, Qr, δ⟩ be a 1QFA recognizing the language L with probability
1/2 + ε.
For w, w′ ∈ Σ∗ we define that the prefix relation w ≡L w′ holds if for all y ∈ Σ∗ , wy ∈ L
if and only if w′ y ∈ L. It is well known (see, for example Gruska, 1997), that a language L
is regular if and only if there are only finitely many equivalence classes with respect to its
prefix equivalence.
Let W ⊆ Σ∗ be any set of strings that are mutually inequivalent with respect to the
equivalence ≡L . If we prove that W is finite, the theorem will be proved. This can be done
as follows.
If w, w′ ∈ W, w ≢_L w′, then there must exist a y such that only one of the strings wy
and w′y is in L. Therefore, for v = T_{#w}(|q0⟩, 0, 0) and v′ = T_{#w′}(|q0⟩, 0, 0) it has to hold:
Theorem 4.1.9 The regular language L0 = {0, 1}∗0 cannot be recognized by a 1QFA with
bounded-error probability.
and let µ = inf_{w ∈ {0,1}*} {||ψ_{#w}||}. For each w ∈ {0, 1}*, w0 ∈ L0 and w1 ∉ L0. If µ = 0,
then clearly A cannot recognize L0 with bounded-error probability 1/2 + ε. Let us therefore
assume that µ > 0.
For any ε > 0 there is a w such that ||ψ_{#w}|| < µ + ε, and also ||ψ_{#wy}|| ∈ [µ, µ + ε] for
any y ∈ {0, 1}*. In particular, for any m > 0
\[ \|(V_1')^{m}|\psi_{\#w0}\rangle\| \in [\mu, \mu + \varepsilon). \tag{4.1} \]
This implies that the sequence {(V_1')^i |ψ_{#w0}⟩}, i = 0, 1, 2, . . . , is bounded in the finite dimensional
inner-product space and must have a limit point. Therefore there have to exist j and k such
that
\[ \|(V_1')^{j}\,(|\psi_{\#w0}\rangle - (V_1')^{k}|\psi_{\#w0}\rangle)\| < \varepsilon. \]
The last inequality together with (4.1) implies (see Lemma 4.1.10) that there is a constant
c which does not depend on ε and such that
\[ \|\,|\psi_{\#w0}\rangle - (V_1')^{k}|\psi_{\#w0}\rangle\,\| < c\,\varepsilon^{1/4}. \]
This implies that
\[ \|T_{\#w0\$}(|q_0\rangle, 0, 0) - T_{\#w01^k\$}(|q_0\rangle, 0, 0)\|_u < c'\varepsilon^{1/4} \]
for fixed c′. However, this has to be valid for an arbitrarily small ε. This is not possible if A
accepts L0 because A should accept the string w0 and reject w01^k. Hence A cannot accept
L0 with bounded-error probability.
Lemma 4.1.10 (J. Watrous) If |u⟩ and |v⟩ are vectors such that for a linear operator A,
reals 0 < ε < 1 and µ > 0, ||A(u − v)|| < ε, and ||v||, ||u||, ||Au||, ||Av|| are in [µ, µ + ε],
then there is a constant c, that does not depend on ε, such that ||u − v|| < cε^{1/4}.
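Proof. First observe that if we can show
\[ \|u - v\|^2 < \|A(u - v)\|^2 + c'\sqrt{\varepsilon} \tag{4.2} \]
then we have our proof, because the square root of the right-hand side in (4.2) is smaller
than
\[ \sqrt{\varepsilon^2 + c'\sqrt{\varepsilon}} \le \sqrt{c' + 1}\;\varepsilon^{1/4}, \]
since ε < 1 implies ε^2 ≤ √ε. To show (4.2) let us compute
\[
\begin{aligned}
\|u - v\|^2 - \|A(u - v)\|^2 &= \langle u - v | u - v\rangle - \langle Au - Av | Au - Av\rangle\\
&= \|u\|^2 + \|v\|^2 - \langle u|v\rangle - \langle v|u\rangle\\
&\quad - \|Au\|^2 - \|Av\|^2 + \langle Au|Av\rangle + \langle Av|Au\rangle\\
&= (I) + (II) + (III) + (IV),
\end{aligned}
\]
where
\[
\begin{aligned}
(I) &= \|u\|^2 - \|Au\|^2, & (II) &= \|v\|^2 - \|Av\|^2,\\
(III) &= \langle u | A^*Av - v\rangle, & (IV) &= \langle v | A^*Au - u\rangle.
\end{aligned}
\]
Since ||u||, ||v||, ||Au||, ||Av|| ∈ [µ, µ + ε], we have
\[ \|u\|^2 - \|Au\|^2 \le 2\mu\varepsilon + \varepsilon^2, \qquad \|v\|^2 - \|Av\|^2 \le 2\mu\varepsilon + \varepsilon^2. \]
In order to get an estimate for (III) and (IV) we proceed as follows: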
Another natural type of automata one should try to compare with 1QFA are one-way
reversible finite automata (1RFA). They can be defined as 1QFA having transition ampli-
tudes either 0 or 1. The following result, due to Ambainis and Freivalds (1998), shows that
if 1QFA are required to give the correct answer with high probability, then they have the
same recognition power as 1RFA.
Theorem 4.1.11 A language can be recognized by a 1QFA with probability 7/9 + ε if and only
if it can be recognized by a 1RFA.
Size-space efficiency
The number of states is a natural space measure for both 1FA and 1QFA. A natural related
problem to investigate is the following one: if a language L is accepted by a 1QFA, then
how many states does a minimal 1QFA recognizing L have in comparison with minimal 1FA
recognizing L? The results obtained by Ambainis and Freivalds (1998) and Ambainis et
al. (1998) show that sometimes a 1QFA is exponentially smaller than any 1FA recognizing
the same language, but sometimes a 1FA is almost exponentially smaller than any 1QFA
recognizing the same language. It holds:
Theorem 4.1.12 (1) Let p be prime. Any 1FA recognizing the language Lp =
{0^i | i is divisible by p} has to have p states, but for any ε > 0 there is a 1QFA Ap,ε
with O(lg p) states accepting Lp with probability 1/2 + ε.
(2) Let n ≥ 1 be an integer. The language Ln = {w0 | w ∈ {0, 1}*, |w| ≤ n} can be
recognized by a 1FA with 2n + 3 states, but for any ε > 0 any 1QFA recognizing Ln with
probability 1/2 + ε has to have 2^{Ω(n/lg n)} states.
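To give some intuition for part (1), the following sketch shows a single two-state "rotation" component; it is not the O(lg p)-state automaton of the theorem, whose construction is not reproduced in this section. We assume, for illustration only, that each input symbol 0 rotates a real unit vector by the angle 2πk/p (the parameter k and the function name are our own), and that the word is accepted if the start state is observed at the end. Words 0^i with p | i are then accepted with probability 1, and for an odd prime p all other words are accepted with probability cos^2(2πki/p) < 1.

```python
import math

def accept_probability(i, p, k=1):
    """Probability of observing the start state after reading 0^i,
    when each symbol 0 rotates the state by the angle 2*pi*k/p."""
    angle = 2 * math.pi * k * i / p
    return math.cos(angle) ** 2

p = 7  # an odd prime, chosen only for the demonstration
for i in range(0, 2 * p + 1):
    marker = "in L_p" if i % p == 0 else ""
    print(i, round(accept_probability(i, p), 4), marker)
```

A single rotation leaves some words outside Lp with a fairly high acceptance probability; the gap can be improved by combining several values of k, which is where the extra states are spent.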
It is easy to design a (2n + 3)-state 1FA accepting the language Ln. Concerning the
proof of the lower bound on the number of states of any 1QFA for Ln, the very basic idea
is simple. Since a 1QFA can read each input symbol only once, any 1QFA for Ln, which
is necessarily reversible, is forced to “remember” all symbols read until it is clear whether
the input word is in the language. Consequently, the state the automaton reaches after
reading n input symbols has to be an encoding of these n symbols. Since this encoding has
to be such that any input n-bit word can be recovered, the number of states has to be at
least 2^n. Unfortunately, the above idea is not fully valid because a 1QFA can terminate
before it reads the whole input. The problem is therefore more complex and consequently the
proof is more involved. It makes use of the ideas of dense coding discussed in Section 8.2.4.
It follows from the above theorem that in some cases the requirement of unitarity makes
1QFA much larger than the minimal equivalent 1FA.
Basic concepts
The definition of a two-way QFA (2QFA) is significantly more complex than that of a 1QFA,
especially due to the need to make sure that its evolution is unitary.
Definition 4.1.13 A two-way quantum finite automaton A is specified by the finite (input)
alphabet Σ, the finite set of states Q, the initial state q0, the sets Qa ⊂ Q and Qr ⊂ Q of
accepting and rejecting states, respectively, with Qa ∩ Qr = ∅, and the transition function
\[ \delta : Q \times \Gamma \times Q \times \{\leftarrow, \downarrow, \rightarrow\} \longrightarrow \mathbf{C}_{[0,1]}, \]
where Γ = Σ ∪ {#, $} is the tape alphabet of A and # and $ are endmarkers not in Σ, which
satisfies the following conditions (of well-formedness) for any q1, q2 ∈ Q, σ, σ1, σ2 ∈ Γ,
d ∈ {←, ↓, →}:
1. Local probability and orthogonality condition.
\[ \sum_{q', d} \delta^*(q_1, \sigma, q', d)\,\delta(q_2, \sigma, q', d) = \begin{cases} 1, & \text{if } q_1 = q_2;\\ 0, & \text{otherwise.} \end{cases} \]
2. Separability condition I.
\[ \sum_{q'} \delta^*(q_1, \sigma_1, q', \rightarrow)\,\delta(q_2, \sigma_2, q', \downarrow) = 0. \]
3. Separability condition II.
\[ \sum_{q'} \delta^*(q_1, \sigma_1, q', \downarrow)\,\delta(q_2, \sigma_2, q', \leftarrow) = 0. \]
4. Separability condition III.
\[ \sum_{q'} \delta^*(q_1, \sigma_1, q', \rightarrow)\,\delta(q_2, \sigma_2, q', \leftarrow) = 0. \]
(These conditions are equivalent, as shown below, to the requirement that evolution of A is
unitary.) Formally, A = ⟨Σ, Q, q0, Qa, Qr, δ⟩.
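The four well-formedness conditions are finite sums and can be checked mechanically. The following is a minimal sketch, assuming that δ is given as a dictionary from (q, σ, q′, d) to an amplitude, with missing entries read as 0; the encoding of the directions as strings and the function name is_well_formed are our own choices, not notation from the text.

```python
import itertools
import math

LEFT, STAY, RIGHT = "<-", "v", "->"
DIRECTIONS = (LEFT, STAY, RIGHT)

def is_well_formed(states, tape_alphabet, delta, tol=1e-9):
    """Check conditions 1-4 of Definition 4.1.13 for a transition table delta."""
    def amp(q, s, q_next, d):
        return complex(delta.get((q, s, q_next, d), 0))

    def overlap(q1, s1, d1, q2, s2, d2):
        # sum over q' of conj(delta(q1, s1, q', d1)) * delta(q2, s2, q', d2)
        return sum(amp(q1, s1, t, d1).conjugate() * amp(q2, s2, t, d2)
                   for t in states)

    for q1, q2 in itertools.product(states, repeat=2):
        # Condition 1: same symbol, summed over all directions.
        for s in tape_alphabet:
            total = sum(overlap(q1, s, d, q2, s, d) for d in DIRECTIONS)
            if abs(total - (1 if q1 == q2 else 0)) > tol:
                return False
        # Conditions 2-4: fixed pairs of different directions, any two symbols.
        for s1, s2 in itertools.product(tape_alphabet, repeat=2):
            for d1, d2 in ((RIGHT, STAY), (STAY, LEFT), (RIGHT, LEFT)):
                if abs(overlap(q1, s1, d1, q2, s2, d2)) > tol:
                    return False
    return True

h = 1 / math.sqrt(2)
states, alphabet = ("a", "b"), ("#", "0", "$")

# Hypothetical automaton: target state "a" always moves right, "b" always left.
good = {}
for s in alphabet:
    good[("a", s, "a", RIGHT)] = h
    good[("a", s, "b", LEFT)] = h
    good[("b", s, "a", RIGHT)] = h
    good[("b", s, "b", LEFT)] = -h

# Same amplitudes, but one target state is entered with two different directions.
bad = {}
for s in alphabet:
    bad[("a", s, "a", RIGHT)] = h
    bad[("a", s, "a", STAY)] = h
    bad[("b", s, "a", RIGHT)] = h
    bad[("b", s, "a", STAY)] = -h

print(is_well_formed(states, alphabet, good))  # True
print(is_well_formed(states, alphabet, bad))   # False
```

The second table satisfies condition 1 but violates separability condition I, which illustrates why the separability conditions are needed in addition to local orthonormality.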
States from Qa ∪ Qr are called halting states and states from Qn = Q − (Qa ∪ Qr ) are
called non-halting states.
In order to process an input word x ∈ Σ∗ by A we assume that the input is written on
the tape with the endmarkers in the form wx = #x$ and that such a tape, of length |x| + 2,
is circular, i.e. the symbol to the right of $ is #.1
For an integer n let Cn be the set (of size (n + 2)|Q|) of all possible configurations of A
for inputs of length n. For each specific input x such a configuration is uniquely determined
by a pair (q, k), where q is the state of the configuration and 0 ≤ k ≤ |x| + 1 denotes the
position of the head.
To a computation of A on an input x of length n corresponds a unitary evolution in the
underlying inner-product space H_{A,n} = l2(Cn). For each configuration c ∈ Cn, |c⟩ denotes
the basis vector in l2(Cn). Each state in H_{A,n} will therefore have the form
\[ \sum_{c \in C_n} \alpha_c |c\rangle \quad \text{where} \quad \sum_{c \in C_n} |\alpha_c|^2 = 1. \]
The automaton A induces for any input x ∈ Σ^n a linear operator U_x^δ on H_{A,n} defined for a
configuration (q, k) ∈ Cn by
\[ U_x^{\delta}|q, k\rangle = \sum_{q', d} \delta(q, w_x(k), q', d)\,|q', (k + \mu(d)) \bmod (n + 2)\rangle, \]
where w_x(k) denotes the kth symbol of w_x = #x$ for 0 ≤ k ≤ |x| + 1 (for µ see footnote 2). By linearity U_x^δ is
extended to map any superposition of basis states.
1 The requirement of circularity of the tape is not essential but slightly simplifies the treatment of 2QFA.
2 By definition, µ(d) = −1 (0) [1] if d = ← (↓) [→].
Lemma 4.1.14 For any nonempty input string x the mapping U_x^δ is unitary if and only if
the conditions (1) to (4) of Definition 4.1.13 are satisfied.
Proof. To prove the lemma, it is sufficient to investigate the orthogonality of the vectors
U_x^δ|q, k⟩ for q ∈ Q, 0 ≤ k ≤ |x| + 1. The condition (1) is equivalent to the statement that for
every x, |x| ≥ 0, ||U_x^δ|q, k⟩|| = 1 for all q and k and that U_x^δ|q1, k⟩ ⊥ U_x^δ|q2, k⟩ for q1 ≠ q2.
Conditions (2) to (4) are equivalent to the statement that U_x^δ|q1, k1⟩ ⊥ U_x^δ|q2, k2⟩ if k1 and
k2 differ at most by 2. Finally, it is trivially true that U_x^δ|q1, k1⟩ ⊥ U_x^δ|q2, k2⟩ if the head
positions are more than two cells apart, because the head can move only one cell per step.
Lemma 4.1.16 A simple 2QFA A satisfies the well-formedness condition if and only if
\[ \sum_{q'} \langle q'|V_\sigma|q_1\rangle^{*}\,\langle q'|V_\sigma|q_2\rangle = \begin{cases} 1, & \text{if } q_1 = q_2;\\ 0, & \text{otherwise,} \end{cases} \]
V#|q0⟩ = |q0⟩, V0|q0⟩ = |q0⟩, V1|q0⟩ = |q1⟩, V$|q0⟩ = |q1⟩, D(q0) = +1,
V#|q1⟩ = |q2⟩, V0|q1⟩ = |q2⟩, V1|q1⟩ = |q0⟩, V$|q1⟩ = |q0⟩, D(q1) = −1,
V#|q2⟩ = |q4⟩, V0|q2⟩ = |q4⟩, V1|q2⟩ = |q2⟩, V$|q2⟩ = |q3⟩, D(q2) = +1,
V#|q3⟩ = |q3⟩, V0|q3⟩ = |q3⟩, V1|q3⟩ = |q3⟩, V$|q3⟩ = |q2⟩, D(q3) = 0,
V#|q4⟩ = |q1⟩, V0|q4⟩ = |q1⟩, V1|q4⟩ = |q4⟩, V$|q4⟩ = |q4⟩, D(q4) = 0.
By inspection one sees that all Vσ operators are unitary (each of them permutes the basis
states) and therefore if δ is defined as in (4.3), then A is well-formed. Example 4.1.19 below
shows how A works.
Example 4.1.19 We show that L(A) = 0*1* for the automaton from Example 4.1.18.
More exactly, each x ∈ L is accepted with probability 1 and each x ∈ {0, 1}* − L is rejected
with probability 1.
Let us start with an illustration of computations on A for two different input strings.
For the input 0^3 1^2 we get the following sequence of states (the symbol over each arrow is the
symbol read in that step):
\[ |q_0,0\rangle \xrightarrow{\#} |q_0,1\rangle \xrightarrow{0} |q_0,2\rangle \xrightarrow{0} |q_0,3\rangle \xrightarrow{0} |q_0,4\rangle \xrightarrow{1} |q_1,3\rangle \xrightarrow{0} |q_2,4\rangle \xrightarrow{1} |q_2,5\rangle \xrightarrow{1} |q_2,6\rangle \xrightarrow{\$} |q_3,6\rangle. \]
It is easy to verify that for an input 0^i 1^j, i, j ≥ 0, the automaton A enters the state
|q3, i + j + 1⟩ after i + j + 4 steps, and that this is the first time A gets into a
halting state. An input 1^i 0x with i > 0 is rejected after i + 3 steps and an input of the form
0^i 1^j 0x with i > 0 and j > 0 after i + j + 4 steps.
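Since all operators Vσ of Example 4.1.18 permute the basis states, the automaton can be traced deterministically. The sketch below does so under two assumptions that are read off from the trace above rather than stated in this section: one step applies Vσ for the scanned symbol and then moves the head by D of the state just entered, and q3 is taken as the accepting and q4 as the rejecting state. States are encoded by their indices 0 to 4.

```python
# Operators V_sigma of Example 4.1.18 as maps "state index -> state index".
V = {
    "#": {0: 0, 1: 2, 2: 4, 3: 3, 4: 1},
    "0": {0: 0, 1: 2, 2: 4, 3: 3, 4: 1},
    "1": {0: 1, 1: 0, 2: 2, 3: 3, 4: 4},
    "$": {0: 1, 1: 0, 2: 3, 3: 2, 4: 4},
}
# Head movement D(q), determined by the state that is entered.
D = {0: +1, 1: -1, 2: +1, 3: 0, 4: 0}
ACCEPTING, REJECTING = {3}, {4}  # assumption, see the text above

def run(x, max_steps=10_000):
    """Return (verdict, number of steps, trace of configurations) for input x."""
    tape = "#" + x + "$"
    q, k = 0, 0
    trace = [(q, k)]
    for step in range(1, max_steps + 1):
        q = V[tape[k]][q]            # apply V_sigma for the scanned symbol
        k = (k + D[q]) % len(tape)   # move the head on the circular tape
        trace.append((q, k))
        if q in ACCEPTING:
            return "accept", step, trace
        if q in REJECTING:
            return "reject", step, trace
    raise RuntimeError("no halting state reached")

verdict, steps, trace = run("00011")
print(verdict, steps, trace[-1])  # accept 9 (3, 6)
print(run("010")[0])              # reject
```

For the input 0^3 1^2 the trace coincides with the sequence of configurations displayed in the example.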
Exercise 4.1.20 Design a simple 2QFA accepting the languages: (a) {0∗ 1∗ 0∗ }; (b)
{(00)∗ (11)∗ }.
As the next step we show that the family of languages recognized by one-sided bounded-error
simple 2QFA in linear time contains also non-regular languages, namely the language
L = {0^i 1^i | i > 0}.
For each integer n let A^(n) = ⟨{0, 1}, Q^(n), q0, Qa^(n), Qr^(n), δ^(n)⟩ be a simple 2QFA with
\[ Q^{(n)} = \{q_0, q_1, q_2, q_3\} \cup \{r_{j,k} \mid 1 \le j \le n,\ 0 \le k \le \max\{j, n - j + 1\}\} \cup \{s_j \mid 1 \le j \le n\}, \]
Qa^(n) = {s_n}, Qr^(n) = {q3} ∪ {s_j | 1 ≤ j < n}. The transition function δ^(n) is defined as in
(4.3), where V_σ^(n), σ ∈ {#, 0, 1, $}, and D are given in Figure 4.2. An extension of V_σ^(n) to
other basis states in such a way that all V_σ^(n) are unitary is straightforward.
Lemma 4.1.21 Let x ∈ {0, 1}*, n ∈ N. If x ∈ {0^i 1^i | i ≥ 1}, then the 2QFA A^(n) accepts x
with probability 1; otherwise A^(n) rejects x with probability at least 1 − 1/n. In either
case A^(n) halts after O(n|x|) steps.
Proof. Figure 4.3 illustrates the basic trick of the 2QFA described in Figure 4.2, which
accepts strings from the language {0^i 1^i | i > 0}. Each computation of A^(n) consists of three
phases. In the first phase, in which only states q0, q1, q2 and q3 are involved, any input
word not of the form 0^i 1^j is rejected, in a similar way as in Example 4.1.19. For words of
the type 0^i 1^j the phase ends in the state |q2⟩ with the head on the right endmarker
$. As the first step of the second phase the operator V$ is applied and a superposition of n
states is formed. This way the computation branches into n parallel paths starting in the states
|r_{1,0}⟩, . . . , |r_{n,0}⟩, each with the amplitude 1/√n.
In the jth of the paths, starting in the state |r_{j,0}⟩, the head moves, deterministically, to
the left endmarker according to the following rules. Each time the head is on a new cell and
reads 0 (1) it remains stationary for j (n − j + 1) steps and then moves one cell left. Therefore,
for an input of the form 0^u 1^v the jth head requires exactly (j + 1)u + (n − j + 2)v + 1 steps
to reach the left endmarker. If j ≠ j′, then
\[ (j + 1)u + (n - j + 2)v + 1 = (j' + 1)u + (n - j' + 2)v + 1 \quad \text{if and only if} \quad u = v. \]
This implies that any two heads of all n different computational paths reach the left
endmarker at the same time if and only if u = v (see footnote 3).
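The timing argument can be checked directly from the step count (j + 1)u + (n − j + 2)v + 1 given above: the difference of the arrival times of branches j and j′ is (j − j′)(u − v). A minimal sketch, with function names of our own choosing:

```python
def arrival_time(j, n, u, v):
    """Steps the j-th branch needs to reach the left endmarker on 0^u 1^v."""
    return (j + 1) * u + (n - j + 2) * v + 1

def distinct_arrival_times(n, u, v):
    """Number of different arrival times among the branches j = 1, ..., n."""
    return len({arrival_time(j, n, u, v) for j in range(1, n + 1)})

n = 5
print(distinct_arrival_times(n, 3, 3))  # 1: all branches arrive together
print(distinct_arrival_times(n, 3, 4))  # 5: every branch arrives alone
```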
3 This means that in the second phase the automaton is always in a superposition of n basis states
(configurations) each corresponding to a different branch, and therefore one can see the computation as
being performed by n different heads which can be on different cells.
In the third phase, consisting of only one step, each computation path splits again; this
time the resulting superposition is obtained by an application of the QFT.
In the case u = v all these splittings occur simultaneously and the resulting superposition
has the form
\[ \frac{1}{n} \sum_{j=1}^{n} \sum_{l=1}^{n} e^{\frac{2\pi i}{n} jl}\,|s_j\rangle, \tag{4.4} \]
and equals exactly |s_n⟩ on the basis of the same reasoning as in Sections 3.1 and 3.2. In
the last phase, in addition, an observation is performed using the observable O. In the case
u = v the result of such an observation is “accept” with probability 1.
Finally, in the case u ≠ v, no two different computational paths come to the left endmarker
at the same time and therefore after Phase 3, when the first observation is made, we
get “accept” with probability only 1/n and “reject” with probability 1 − 1/n.
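The two acceptance probabilities can be reproduced numerically. The sketch below assumes, as in the description above, that each branch carries the amplitude 1/√n, that the QFT step contributes another factor 1/√n with the phase e^{2πi jl/n}, that branches interfere only if they perform this step at the same time, and that the automaton is observed after every step, so that probabilities from different steps add up. The function names are ours.

```python
import cmath
from collections import defaultdict

def amplitude_on(l, branches, n):
    """Amplitude of |s_l> produced by branches that apply the QFT in the same step."""
    return sum(cmath.exp(2j * cmath.pi * b * l / n) for b in branches) / n

def accept_probability(n, u, v):
    """Probability of observing the accepting state s_n on the input 0^u 1^v."""
    arrivals = defaultdict(list)
    for b in range(1, n + 1):
        arrivals[(b + 1) * u + (n - b + 2) * v + 1].append(b)
    return sum(abs(amplitude_on(n, group, n)) ** 2 for group in arrivals.values())

print(round(accept_probability(5, 3, 3), 6))  # 1.0
print(round(accept_probability(5, 3, 4), 6))  # 0.2
```

For u = v the amplitudes of all states s_l with l < n cancel, since they are sums over all nth roots of unity; this is the content of the statement that (4.4) equals |s_n⟩.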
The above method can be used to show that some non-context-free languages can also
be accepted in linear time by bounded error 2QFA.
Simple 2QFA can be shown to accept all regular languages. Actually a stronger statement
has been shown. Namely, that any deterministic (one-way) FA can be simulated by a two-
way reversible finite automaton (which can be defined as a 2QFA, amplitudes of which are
only 0 or 1). As a corollary we have
This theorem has been shown by Kondacs and Watrous (1997) directly and it also follows
from a more general result due to Lange et al. (1997).
Exercise 4.1.24 Show that any 1FA can be simulated by a reversible two-way finite
automaton.
Exercise 4.1.25 Show that the language of words over the alphabet {0, 1} that contain
the same number of 0’s and 1’s can be accepted by a 2QFA.
Exercise 4.1.26 Let us call a 2QFA A one-directional if it never moves left. Show
that one-directional 2QFA can accept non-regular languages.
Open problem 4.1.27 1. Is it true that one-directional 2QFA accept all regular lan-
guages?
Remark 4.1.28 One-directional 2QFA seem to be very powerful. Amano and Iwama
(1999), who denote them as 1.5QFA, showed that one of the basic decision problems, the
emptiness problem, which is decidable even for pushdown automata, is undecidable for
1.5QFA. Perhaps it would have been more proper to call them 1QFA and to call those denoted
that way so far “real-time QFA”.
There are several variants of the models of quantum finite automata discussed here
and they need to be explored in order to get a better insight into the power of quantum
computation on the “finite state” level. Some of these models are presented in the following
exercises (Ambainis and Freivalds (1998)).
Exercise 4.1.29 Let us consider special two-way QFA in which the head keeps making
left-to-right and right-to-left passes between endmarkers. (a) Show that such a quantum
automaton can recognize the language L = {0^i 1^i | i ≥ 0} in the following sense. If x ∉ L the
automaton stops with probability 1 after O(|x|) scans of the tape; if x ∈ L, then it never
stops; (b) explore more potentials of such quantum automata.
Exercise 4.1.30 Show that for any ε > 0 there is a 2-way probabilistic finite automaton
A1 and a 1QFA A2 such that with probability at least 1 − ε: (1) A1 stops in time
quadratic in the length of the input; (2) A2 accepts the output of A1 if and only if the
input x of A1 is in {0^i 1^i | i > 0}.
The result of Exercise 4.1.30 is of interest for the following reason: neither 1QFA nor
2PFA can recognize non-regular languages (see Freivalds, 1981, and Kaneps and Freivalds,
1991), but their composition can.
δ : Q × Σ × Σ × Q × {←, ↓, →} −→ C[0,1]
4 Quite a different approach to QTM is pursued by Benioff, see page 261 and Benioff (1998) for the recent
paper. It is the so-called “physical QTM”, and the main difference is in problems on which investigation
concentrates. Benioff (1998) is concerned with the process defined by a step operator that is used to construct
Hamiltonian according to Feynman’s ideas—see page 261. It is interesting to compare this “physical”
approach to QTM with the “automata theoretic” approach as presented in this section. In spite of the fact
that Benioff’s approach may be seen, from an implementation point of view, as better reflecting quantum
physics laws; so far the automata-theoretic approach seems to be far more insightful, stimulating, and
productive.
4.2. QUANTUM TURING MACHINES 165
The concept of configuration is the basic one for the description of the quantum evolution
of a QTM. A configuration of M is determined by the content τ of the tape, τ ∈ Σ^Z, by an
i ∈ Z which specifies the position of the head, and by a q ∈ Q, the current state of M.
Let C_M denote the set of all configurations of M. Computation (evolution) of M is
performed in the inner-product space H_M = l2(C_M) with the basis {|c⟩ | c ∈ C_M}.
The transition function δ uniquely determines a mapping a : C_M × C_M → C such that
for c1, c2 ∈ C_M, a(c1, c2) is the amplitude of the transition of M from the basis state |c1⟩
to |c2⟩. The time evolution mapping U_M : H_M → H_M is then defined as follows.
1. If |c⟩ is a basis state, then
\[ U_M|c\rangle = \sum_{c' \in C_M} a(c, c')\,|c'\rangle. \]
2. If |φ⟩ = Σ_{c∈C_M} α_c|c⟩ is a superposition, then
\[ U_M|\phi\rangle = \sum_{c \in C_M} \alpha_c\, U_M|c\rangle. \]
Remark 4.2.3 1. Sometimes it is convenient to see the basis states of a QTM as being
tensor products of the form |q⟩|τ⟩|i⟩, where q ∈ Q, τ ∈ Σ^Z, i ∈ Z, with i representing the
position of the head.
2. Observe that the subspace of H_M = l2(C_M) consisting of finite sums of configurations
is not a Hilbert space but a dense subspace of l2(C_M).
Measurements
In addition to a standard measurement with respect to the basis {|c⟩ | c ∈ C_M}, which,
when applied to a state |φ⟩ = Σ_{c∈C_M} α_c|c⟩, provides a configuration c with probability
|α_c|^2, measurements of certain cells of the tape are also of importance. The concept of an
equivalence of configurations is needed to introduce the corresponding observables.
If c is a configuration, then let c(i) be the ith symbol of its tape. The equivalence relation
on configurations, denoted ∼_I and associated to a set I = {−N, . . . , N}, N ∈ N, is defined
by
\[ c_i \sim_I c_j \iff c_i(k) = c_j(k) \ \text{whenever } k \in I. \]
(In other words, two configurations are equivalent over I if they have the same symbols
in the positions from I.) Let us now denote by [c] the equivalence class of configurations
containing the configuration c. In the case of the state |ψ⟩ = Σ_{c∈C_M} α_c|c⟩, the probability
of such an equivalence class is defined by
\[ p([c]) = \sum_{c_i \in [c]} |\alpha_{c_i}|^2. \]
Definition 4.2.4 The standard observable of the cells I of the tape yields the equivalence
class [c] with probability p([c]) and the post-observation superposition is
\[ \frac{1}{\sqrt{\sum_{c_j \in [c]} |\alpha_{c_j}|^2}} \sum_{c_j \in [c]} \alpha_{c_j}\, c_j, \]
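The observable of Definition 4.2.4 can be illustrated on a finite superposition. The sketch below is only an illustration under our own encoding assumptions: a configuration is a tuple (q, head, tape), the tape is a tuple of (position, symbol) pairs, all unlisted cells hold a blank written "_", and a superposition is a dictionary from configurations to amplitudes.

```python
import math
from collections import defaultdict

BLANK = "_"

def symbol(config, k):
    """Symbol in cell k of a configuration (q, head, tape)."""
    _, _, tape = config
    return dict(tape).get(k, BLANK)

def observe_cells(state, n_cells):
    """Standard observable of the cells I = {-N, ..., N} with N = n_cells.

    Returns a dictionary mapping the symbols found on I (which identify the
    class [c]) to the pair (p([c]), post-observation superposition).
    """
    cells = range(-n_cells, n_cells + 1)
    classes = defaultdict(dict)
    for config, amplitude in state.items():
        key = tuple(symbol(config, k) for k in cells)
        classes[key][config] = amplitude
    result = {}
    for key, members in classes.items():
        p = sum(abs(a) ** 2 for a in members.values())
        if p > 0:
            norm = math.sqrt(p)
            result[key] = (p, {c: a / norm for c, a in members.items()})
    return result

h = 1 / math.sqrt(2)
psi = {
    ("q", 0, ((0, "1"),)): 0.5,   # cell 0 holds 1, head on cell 0
    ("q", 1, ((0, "1"),)): 0.5,   # cell 0 holds 1, head on cell 1
    ("q", 0, ((0, "0"),)): h,     # cell 0 holds 0
}
for key, (p, post) in observe_cells(psi, 0).items():
    print(key, round(p, 3), post)
```

Observing only cell 0 does not collapse the head position: the two configurations with symbol 1 stay in superposition, each with the renormalized amplitude 1/√2.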
Well-formedness conditions
As shown in Theorem 4.2.6, unitarity of a qQTM M is ensured if M satisfies the following
so-called strong well-formedness conditions.
δ : Q × Σ × Σ × Q × {←, ↓, →} −→ C
2. Separability condition I. For any two different pairs (q1, σ1), (q2, σ2) from the set
Q × Σ:
\[ \sum_{(q,\sigma,d) \in Q \times \Sigma \times \{\leftarrow,\downarrow,\rightarrow\}} \delta^*(q_1, \sigma_1, \sigma, q, d)\,\delta(q_2, \sigma_2, \sigma, q, d) = 0. \]
3. Separability condition II. For any (q, σ, d), (q′, σ′, d′) from the set Q × Σ × {←, ↓, →}
such that (q, σ, d) ≠ (q′, σ′, d′):
\[ \sum_{(q_1,\sigma_1) \in Q \times \Sigma} \delta^*(q_1, \sigma_1, \sigma, q, d)\,\delta(q_1, \sigma_1, \sigma', q', d') = 0. \]
4. Separability condition III. For any (q1, σ1, σ1′), (q2, σ2, σ2′) ∈ Q × Σ × Σ and d1 ≠
d2 ∈ {←, ↓, →}:
\[ \sum_{q \in Q} \delta^*(q_1, \sigma_1, \sigma_1', q, d_1)\,\delta(q_2, \sigma_2, \sigma_2', q, d_2) = 0. \]
Theorem 4.2.6 (Hirvensalo, 1997; see footnote 5) If a qQTM M satisfies the above strong
well-formedness conditions, then its evolution is unitary.
Proof. Let us assume a fixed enumeration of all configurations. For any configuration
c_i then
\[ U_M(c_i) = \sum_{l=1}^{\infty} \alpha_{li}\, c_l, \tag{4.5} \]
where α_{li} is the amplitude of reaching c_l from c_i. Observe that for each i the sum (4.5)
is actually finite, because there are only finitely many configurations reachable from any c_i
in one step. U_M can now be seen as a finite or infinite matrix with U_M(l, i) = α_{li}. Each
column of the matrix has therefore only finitely many nonzero elements.
5 The proof makes use of the ideas of Bernstein and Vazirani (1993).
If now U_M^* is the adjoint matrix to U_M, then
\[ U_M^*(c_i) = \sum_{l=1}^{\infty} \alpha_{il}^{*}\, c_l. \]
We first show that U_M^* U_M = I. This will imply that the mapping U_M is one-to-one.
Indeed,
\[ U_M^*(U_M(c_i)) = \sum_{l=1}^{\infty} \Big( \sum_{k=1}^{\infty} \alpha_{ki}\,\alpha_{kl}^{*} \Big) c_l. \]
The right-hand side is exactly c_i if
\[ \sum_{k=1}^{\infty} \alpha_{ki}\,\alpha_{kl}^{*} = 0, \tag{4.6} \]
whenever l ≠ i and
\[ \sum_{k=1}^{\infty} \alpha_{ki}\,\alpha_{ki}^{*} = \sum_{k=1}^{\infty} |\alpha_{ki}|^2 = 1. \]
The last equalities follow from the condition (1) in the definition of well-formedness.
Let us now consider condition (4.6), which actually requires orthogonality of
the sequences of amplitudes {α_k | c_k is the successor configuration of c_i, k ∈ N} and
{β_k | c_k is the successor configuration of c_l, k ∈ N}.
Observe that from different configurations c_i and c_l it is possible to obtain c_k in one step
only if the head positions in c_i and c_l differ at most by two positions.
Now if the head is in the same position in c_i and c_l, the orthogonality follows from
condition 3; if the heads are in different positions, the orthogonality follows from condition
4 of the definition of well-formedness.
To finish the proof of the theorem it is now sufficient to show that the mapping U_M is
surjective. This will imply that U_M^* = U_M^{-1} and the theorem is proved.
A configuration c_i will be called reachable (by M) if c_i occurs with a non-zero coefficient
in some superposition U_M(c_k) = Σ_{j=1}^{∞} α_j c_j, which happens if the ith row of U_M is not
empty.
We first show that each configuration is reachable. This will imply that UM has no
empty row.
Let us assume, on the contrary, that there is a configuration c′ not reachable from
any other configuration. Let σ1 qσ2 σ3 be a “local subconfiguration” of c′ containing the
state q, symbols in the cell with the head on σ2 , and in two neighbouring cells σ1 , σ3 . If
c′ is not reachable, then the same has to be true for all configurations having σ1 qσ2 σ3
as the local subconfiguration. All these configurations will be said to be locally like c′ .
Let us now take an n ≥ 4 and consider all configurations c_{i_1}, . . . , c_{i_K} such that in these
configurations all cells outside cells 0, 1, . . . , n have blanks and the state symbol is not in
the nth cell. For the subspace V ⊂ C_M generated by configurations c_{i_1}, . . . , c_{i_K}, we have
dim(U_M(V)) = K = n|Q||Σ|^n because U_M is injective.
In one step from any c_{i_j} one can either reach another c_{i_k}, or exit the chosen n + 1 cells,
and there are at most 2|Q||Σ|^n configurations to exit into. In total this gives K + 2|Q||Σ|^n
possible successor configurations. However, at least (n − 2)|Σ|^{n−3} of them look locally like
c′, and therefore they cannot be reached from any configuration. Consequently we get
\[ U_M(c_{i_k}) = \sum_{j_l \in J} \alpha_l\, c_{j_l} \]
with the index set J of configurations of cardinality at most K + 2|Q||Σ|^n − (n − 2)|Σ|^{n−3}.
Therefore,
\[ K = \dim(U_M(V)) \le K + 2|Q|\,|\Sigma|^n - (n - 2)|\Sigma|^{n-3}, \]
which yields n ≤ 2|Q||Σ|^3 + 2 and contradicts the fact that n could be chosen arbitrarily
large.
Without loss of generality let us now assume that the configuration c1 is not in the
range of UM . As shown above, the first row of UM is not empty. Moreover, it follows from
the condition 2 of the definition of well-formedness of QTM that any two rows of UM are
orthogonal.
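Let us now choose N such that α_{1j} = 0 for j > N. From the orthogonality of rows of
U_M it then follows that
\[
\begin{pmatrix}
\alpha_{11} & \alpha_{12} & \dots & \alpha_{1N}\\
\alpha_{21} & \alpha_{22} & \dots & \alpha_{2N}\\
\vdots & \vdots & & \vdots
\end{pmatrix}
\begin{pmatrix}
\alpha_{11}^{*}\\ \alpha_{12}^{*}\\ \vdots\\ \alpha_{1N}^{*}
\end{pmatrix}
=
\begin{pmatrix}
A\\ 0\\ 0\\ \vdots
\end{pmatrix},
\]
where A = |α_{11}|^2 + . . . + |α_{1N}|^2 > 0. Hence
\[ U_M(\alpha_{11}^{*} c_1 + \dots + \alpha_{1N}^{*} c_N) = A c_1 \]
and by dividing both sides by A we get that c1 is in the range of U_M.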
Hirvensalo was the first to come up with conditions for a general qQTM to have unitary
evolution. However, the conditions presented in Definition 4.2.5 are only sufficient conditions.
As shown by Ozawa, they are not necessary. The first set of sufficient and necessary
well-formedness conditions is due to Ozawa (1998). Instead of separability conditions II and III,
he considers the following two conditions:
3. Separability condition II’: For any (q1, σ1, σ1′), (q2, σ2, σ2′) ∈ Q × Σ × Σ
\[ \sum_{p \in Q} \delta^*(q_1, \sigma_1, \sigma_1', p, \rightarrow)\,\delta(q_2, \sigma_2, \sigma_2', p, \leftarrow) = 0. \]
The proof that these conditions are really sufficient and necessary has been given in
Ozawa and Nishimura (1998) and it is too technical and lengthy to be presented here.
The following QTM, due to Ozawa (1998), is an example of a QTM that does not satisfy the
conditions of Definition 4.2.5:
Exercise 4.2.9 Verify that the QTM with the set of states Q = {0, 1}, Σ = {0} and the
transition function defined for simplicity as the mapping δ : Q × Q × {←, ↓, →} → C
defined for a = δ(p, q, d) as follows

q p d a      | q p d a      | q p d a      | q p d a
0 0 ← 0      | 0 1 ← 1/2    | 1 0 ← 0      | 1 1 ← 1/2
0 0 ↓ 1/2    | 0 1 ↓ 1/2    | 1 0 ↓ 1/2    | 1 1 ↓ −1/2
0 0 → −1/2   | 0 1 → 0      | 1 0 → 1/2    | 1 1 → 0

satisfies Ozawa’s but not Hirvensalo’s conditions.
Input–output conventions
The initial configuration of a QTM M = ⟨Σ, Q, q0, qf, δ⟩ has the form q0x, x ∈ (Σ − {λ})*,
where λ stands for the blank symbol, and x is written on cells numbered 0, 1, 2, . . . , |x| − 1,
with all other cells filled with the blank and the head is on the cell number 0. M halts on
the input x when it finally enters the final state qf . The number of steps needed to reach qf
is the computation time of M on x. After M halts the standard measurement is performed
and its output is the string on the tape of the resulting configuration consisting of the tape
contents between the leftmost and rightmost non-blank symbol. Each output is therefore
produced with certain probability. For each input M produces a sample from a probability
distribution on its outputs.
Remark 4.2.10 It has been shown by Bernstein (1997) that it is sufficient to consider a
single final state and a single measurement, after halting, if time efficiency of quantum Turing
machines is considered. On the other hand, when space efficiency of QTM is considered,
then, as in the case of QFA, a measurement after each step seems to be more appropriate.
Two QTM will be considered as equivalent if for each input their output probability
distributions are close to each other; or more exactly, if their total variation distance is
small.6
Exercise 4.2.11 Let |ψi, |φi be states of an inner product space H such that ||φ|| =
||ψ|| = 1, ||φ − ψ|| ≤ ε. Show that the total variation distance between probability distri-
butions corresponding to measurements of |φi and |ψi, with respect to the standard basis,
is at most 4ε.
it far from trivial to perform even small modifications of the transition function of a QTM.
For example, even a simple classical design step, to add a transition from the final state to
the initial state, may not work—unitarity can disappear—and the new qQTM is not really
quantum.
There are several other models of QTM that are easier to deal with because of the
restrictive nature of the movement of the head, and their computational power, or even
efficiency, is not worse than that of the most general model presented in Section 4.2.1.
It is not obvious whether a restriction to uQTM does not represent an essential reduc-
tion of the power or efficiency of QTM. In order to deal with this problem the concept of
simulation of QTM is needed.
Concerning the power and efficiency of uQTM the following basic result has been shown
by Bernstein and Vazirani (1997).
Theorem 4.2.14 Any QTM can be simulated by a uQTM with slowdown factor at most 5.
Exercise 4.2.15 Show that if M is an RTM, then there is a DBV-RTM M′ that simu-
lates M with slowdown at most 2.
Exercise 4.2.16 Show that if M is a QTM, then there is a DBV-QTM M′ that simulates
M with constant slowdown.
Exercise 4.2.19 Show that to any quantum circuit over k qubits we can design an equiv-
alent (in a reasonable sense) quantum circuit over k + 1 qubits all gates of which are
represented by unitary matrices over reals.
Observe that the QTM M′ constructed in the proof of Theorem 4.2.17, which uses only
real amplitudes and simulates M, works in the Hilbert space of twice as large dimension as
M.
Theorem 4.2.17 was strengthened by Adleman et al. (1997). They showed that with
respect to the computational power it is sufficient to consider only rational amplitudes, and
that the set of amplitudes {0, ±3/5, ±4/5, 1} is sufficient to construct a universal QTM.
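One reason why the amplitudes ±3/5 and ±4/5 are convenient is that (3/5)^2 + (4/5)^2 = 1 holds exactly, so they can fill a unitary matrix with rational entries. The following sketch checks this with exact rational arithmetic for a 2 × 2 rotation; the matrix is our own illustration and is not taken from the construction of Adleman et al.

```python
from fractions import Fraction

c, s = Fraction(3, 5), Fraction(4, 5)
rotation = [[c, -s],
            [s,  c]]

def gram(m):
    """Return the product (transpose of m) * m for a real square matrix."""
    n = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A real matrix is unitary exactly when this product is the identity.
print(gram(rotation) == [[1, 0], [0, 1]])  # True
```

Because the arithmetic is exact, no rounding tolerance is needed, in contrast to amplitudes such as 1/√2.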
Remark 4.2.20 It can be shown (see Watrous, 1995), quite surprisingly, that any QTM
M can be simulated with a constant slowdown by a QTM M′ with the “deterministic head
position”. In other words, if M′ is observed during a computation, then the probability
that the head will be observed at any given tape cell will be either 1 or 0. This implies
that the position of the head of M′ can be observed at every time step without affecting its
computation.
Theorem 4.2.21 Let $U_M$ be the evolution operator of a QTM. If $|\phi_i\rangle$ and $|\phi'_i\rangle$, $i = 0, 1, \ldots, t$, are superpositions from $H_M$ such that $\|\phi_i - \phi'_i\| \le \varepsilon$ and $|\phi_i\rangle = U_M|\phi'_{i-1}\rangle$, then
$$\|\phi'_t - U_M^t \phi_0\| \le t\varepsilon.$$
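The linear growth of the error in Theorem 4.2.21 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes $\phi'_0 = \phi_0$, draws a random unitary matrix in place of $U_M$, and adds a perturbation of norm exactly ε after every step. The names `random_unitary` and `perturbed_run` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def random_unitary(n, rng):
    """Return an n x n unitary matrix (Q factor of a random complex matrix)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def perturbed_run(u, phi0, t, eps, rng):
    """Apply u for t steps, perturbing the state by a vector of norm eps
    after every step; return the distance to the unperturbed evolution."""
    exact = phi0.copy()    # U^i phi_0
    approx = phi0.copy()   # phi'_i, with phi'_0 = phi_0 (an assumption)
    for _ in range(t):
        exact = u @ exact
        approx = u @ approx                      # phi_i = U phi'_{i-1}
        noise = rng.normal(size=phi0.shape) + 1j * rng.normal(size=phi0.shape)
        approx = approx + eps * noise / np.linalg.norm(noise)  # phi'_i
    return np.linalg.norm(approx - exact)

rng = np.random.default_rng(0)
u = random_unitary(8, rng)
phi0 = np.zeros(8, dtype=complex)
phi0[0] = 1.0
deviation = perturbed_run(u, phi0, t=50, eps=1e-3, rng=rng)
```

Because a unitary step preserves distances, each step can add at most ε to the deviation, so `deviation` stays below `50 * 1e-3` in this run.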
To formulate the second result we need the concept of “closeness” for QTM.
Definition 4.2.22 Two QTM M and M′ are ε-close, ε > 0, if they have the same sets of
states and symbols and if the difference between pairs of the corresponding amplitudes has
magnitude at most ε.
Two close QTM produce two close evolutions in the following sense.
Theorem 4.2.23 (Bernstein, Vazirani, 1998) If two QTM, M and M′ , with the set of
states Q and alphabet Σ are ε-close, then the difference of their time evolutions has the
norm bounded by 2|Q| |Σ|ε. (The statement holds also in the case M′ is a qQTM.)
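Proof. Let M and M′ be QTM that are ε-close and let Σ and Q be their tape symbols
and states. Both $H_M$ and $H_{M'}$ have the same basis $\{|c\rangle \mid c \in C_M = C_{M'}\}$. The difference
in their evolution from a state $|\phi\rangle = \sum_{c \in C_M} \alpha_c |c\rangle$ can be expressed as follows
$$U|\phi\rangle - U'|\phi\rangle = \sum_{c_j \in C_M} \sum_{c_i \in C_{M'}} (\varepsilon_{i,j} - \varepsilon'_{i,j})\,\alpha_{c_i} |c_j\rangle,$$
where $\varepsilon_{i,j}$ and $\varepsilon'_{i,j}$ are amplitudes of the transition from $c_i$ to $c_j$ in M and M′. Since each
configuration has at most 2|Σ| |Q| predecessor configurations and $|\sum_{i=1}^{n} a_i|^2 \le n \sum_{i=1}^{n} a_i^2$
for any real $a_i$ and any n, we get
$$\begin{aligned}
\|(U - U')\phi\|^2 &= \sum_{c_j \in C_M} \Big| \sum_{c_i \in C_M} (\varepsilon_{i,j} - \varepsilon'_{i,j})\,\alpha_{c_i} \Big|^2 \\
&\le 2|\Sigma|\,|Q| \sum_{c_j \in C_M} \sum_{c_i \in C_M} |(\varepsilon_{i,j} - \varepsilon'_{i,j})\,\alpha_{c_i}|^2 \\
&\le 2|\Sigma|\,|Q|\,\varepsilon^2 \sum_{c_i \in C_M} \sum_{c_j \in C_M} |\alpha_{c_i}|^2 \\
&\le 4|\Sigma|^2 |Q|^2 \varepsilon^2.
\end{aligned}$$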
Corollary 4.2.24 Let M be a QTM, and let M′ be a QTM which is $\frac{\varepsilon}{24|\Sigma|\,|Q|\,t}$-close to M,
where ε > 0. Then M′ simulates M for t steps with accuracy ε. (The statement holds also
in the case when M′ is a qQTM.)
Proof. Let ε > 0. If unitary operators $U_M$ and $U_{M'}$ are applied to the same state,
then, by Theorem 4.2.23, the norm of the difference of the resulting states is bounded from
above by $\delta = \frac{\varepsilon}{12t}$.
An application of $U_{M'}$ can also be seen as an application of $U_M$ and then an addition of
perturbations of length at most δ times the length of current superposition. Therefore the
length of the superpositions of $U_{M'}$ after t steps is increased by $(1 + \delta)^t \le e$, because $\delta \le \frac{1}{t}$.
By Theorem 4.2.21, the difference between superpositions of M and M′ after t steps is a
superposition of the norm at most $3\delta t \le \frac{\varepsilon}{4}$. Now the Corollary follows from Exercise 4.2.11.
It follows from the last corollary that O(lg t) bits of precision in transition amplitudes
are sufficient to support t steps of a QTM with the resulting precision ε. QTM can therefore
be considered as discrete models of computation.
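The O(lg t) claim can be made concrete with a small calculation. The sketch below is an illustration only: it assumes that b bits of precision mean an error of at most $2^{-b}$ per amplitude, and it takes the closeness bound $\varepsilon / (24|\Sigma||Q|t)$ from Corollary 4.2.24. The function name `amplitude_bits` is hypothetical.

```python
import math

def amplitude_bits(num_symbols, num_states, t, eps):
    """Smallest b with 2**-b <= eps / (24 * |Sigma| * |Q| * t).

    Assumption: b bits of precision give an error of at most 2**-b
    in each transition amplitude.
    """
    closeness = eps / (24 * num_symbols * num_states * t)
    return math.ceil(math.log2(1 / closeness))

# Multiplying t by 1024 adds about 10 bits: the growth is logarithmic in t.
bits_short = amplitude_bits(num_symbols=3, num_states=5, t=1_000, eps=0.01)
bits_long = amplitude_bits(num_symbols=3, num_states=5, t=1_024_000, eps=0.01)
```

For fixed |Σ|, |Q| and ε the result grows by one bit each time t doubles, which is the O(lg t) behaviour stated above.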
Definition 4.2.26 (1) A QTM M is called well-behaved if it halts for all inputs in a
state all configurations of which are in the final state and have the head on the same cell. If
this cell is always the starting cell, then M is called stationary.
(2) A QTM M is said to be in the normal form if each transition from the final state
is into its initial state.
Remark 4.2.27 If a QTM is in the normal form, then there can be no other transition into
the initial state than those from the final state. This allows one to redirect transitions from
the final state and then add transitions to the initial state.
Example 4.2.28 We design a stationary, normal-form QTM which maps any state |ψ⟩ of
the n-qubit register into the state H_n|ψ⟩, where H_n is the Hadamard transformation, and
which halts in time 2n + 4 with its head back on the starting cell. The machine has tape
alphabet {0, 1, λ}, states {q0, qa, qb, qc, qf} and the transition function shown in Table 4.1:
      λ              0                                     1
q0                   |0, qa, ←⟩                            |1, qa, ←⟩
qa    |λ, qb, →⟩
qb    |λ, qc, ←⟩     (1/√2)(|0, qb, →⟩ + |1, qb, →⟩)       (1/√2)(|0, qb, →⟩ − |1, qb, →⟩)
qc    |λ, qf, →⟩     |0, qc, ←⟩                            |1, qc, ←⟩
qf    |λ, q0, →⟩     |0, q0, →⟩                            |1, q0, →⟩
The machine processes a string x from {0, 1}∗ as follows. Starting in the state q0 on the
leftmost square of x the machine moves left and right and enters the state qb . In this state
it keeps moving right, replacing each 0 by (1/√2)(|0⟩ + |1⟩) and each 1 by (1/√2)(|0⟩ − |1⟩), until
the first blank is reached. Afterwards the machine keeps moving left until the first blank left
from x is reached. As the last step it moves right into the final state.
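The cell-by-cell replacement performed in state qb amounts to applying the n-fold tensor product of the one-qubit Hadamard matrix. The following sketch checks this numerically for one basis string; it models only the linear map on the register, not the head movement of the machine, and the helper names (`hadamard_n`, `cellwise_image`, `basis_state`) are assumptions introduced for the illustration.

```python
from functools import reduce
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # one-qubit Hadamard matrix

def hadamard_n(n):
    """n-fold tensor (Kronecker) product of H, for n >= 1."""
    return reduce(np.kron, [H] * n)

def basis_state(bits):
    """Basis vector |x> for a bit list x, first bit most significant."""
    index = int("".join(str(b) for b in bits), 2)
    vec = np.zeros(2 ** len(bits))
    vec[index] = 1.0
    return vec

def cellwise_image(bits):
    """Replace each 0 by (|0>+|1>)/sqrt(2) and each 1 by (|0>-|1>)/sqrt(2),
    then combine the cells by tensor product."""
    return reduce(np.kron, [H[:, b] for b in bits])

x = [1, 0, 1]
global_result = hadamard_n(len(x)) @ basis_state(x)
local_result = cellwise_image(x)
```

Both vectors agree, so the sweep to the right in Table 4.1 realizes H_n on basis states and, by linearity, on every superposition.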
Concepts of well-behavedness, stationarity, and normal form have their meaning also for
deterministic TM. They also play the key role in the following result, due to Bernstein and
Vazirani (1997), that modifies Bennett’s result on reversibility.
Theorem 4.2.29 (Synchronization Theorem) If f is a string-to-string function which
can be computed by a DTM in polynomial time and such that the length of f (x) depends only
on the length of x, then there is a polynomial time, stationary, normal-form and reversible
TM which on any input x produces x, f (x) and whose running time depends only on the
length of x.
Programming primitives
Just as in the classical case, one of the techniques that makes the design and behaviour of
QTM more transparent is to consider cells (tapes) as having several tracks. An equivalent
way is to use tape alphabets that are cartesian products of subalphabets. Some of the very
basic techniques of manipulation with several tracks that can be used in the case of quantum
Turing machines are presented in the following exercises.
Exercise 4.2.31 Show that given a QTM (RTM) M = ⟨Σ, Q, δ⟩ and a set Σ′, there is
a QTM (RTM) M′ = ⟨Σ × Σ′, Q, δ′⟩ such that M′ behaves as M while leaving its second
track unchanged.
Exercise 4.2.32 Show that given any QTM (RTM) M = ⟨Σ1 × . . . × Σk, Q, δ⟩ and a
permutation π on {1, . . . , k}, there exists a QTM (RTM) M′ = ⟨Σπ(1) × . . . × Σπ(k), Q, δ′⟩
such that M′ behaves exactly as M except that its tracks are permuted according to the
permutation π.
The results of the following two easy exercises will be used below in showing how to
compose programs in several ways. The first exercise deals with swapping of transitions.
We talk about swapping of outgoing transitions of states q1 and q2 in a QTM with transition
function δ, if δ is replaced by δ′ defined as follows: δ′(q1, σ) = δ(q2, σ), δ′(q2, σ) = δ(q1, σ)
for all tape symbols σ, and δ′(q, σ) = δ(q, σ) for all q ∉ {q1, q2}. Swapping of the
incoming transitions of two states is defined in a similar way.
Exercise 4.2.33 Show that if M is a well-formed QTM (RTM), then swapping of the
incoming or outgoing transitions between a pair of states in M yields another well-formed
QTM (RTM).
Exercise 4.2.34 Show that if M1 = ⟨Σ, Q1, δ1⟩ and M2 = ⟨Σ, Q2, δ2⟩ are two well-formed
QTM (RTM) with disjoint sets of states, then the “union” of these two QTM
(RTM) M = ⟨Σ, Q1 ∪ Q2, δ1 ∪ δ2⟩ is also a well-formed QTM (RTM).
Lemma 4.2.35 If M1 and M2 are normal-form QTM (RTM) with the same alphabet and
q is a state of M1 , then there is a normal-form QTM M which acts as M1 except that each
time M1 would enter the state q it runs M2 instead of that.
Proof. Let $q_0^{(1)}$, $q_0^{(2)}$, $q_f^{(1)}$ and $q_f^{(2)}$ be the initial and final states of M1 and M2. M
will be designed as union of M1 and M2 (see Exercise 4.2.34), with the initial state $q_0^{(1)}$ if
$q \ne q_0^{(1)}$ and $q_0^{(2)}$ otherwise. In addition, the incoming transitions of q and $q_0^{(2)}$ and also the
outgoing transitions of q and $q_f^{(2)}$ will be swapped (see Exercise 4.2.33). According to the
above two exercises, the resulting QTM M is well-formed. Since M1 is in the normal form
the final state of M leads back to its initial state (no matter whether q is the initial or the
final state of M1 or neither of them).
Now we can formulate the first two basic QTM design techniques.
at all. On the other hand, if R is run with the blank tape, then R visits q2 once with the
blank tape and does not visit q1 at all. R will now be used to design M.
At first M1 and M2 are extended to have also the second track with the alphabet of R—
see Exercise 4.2.31. Afterwards R is extended to have a first track with the common alphabet
of M1 and M2 . Finally, M1 and M2 are inserted for the states q1 and q2 , respectively.
R will have the alphabet {λ, 1}, the set of states {q0 , q1 , q1′ , q2 , q2′ , q3 , qf } and transitions:
Loops are the third main classical design tool. Building an RTM that loops indefinitely
is easy. Finite loops can be realized in the following way.
Remark 4.2.40 Observe that all constructions of this section that support main classical
algorithm design primitives—composition, branching, and looping—can be implemented
without any inherently quantum step.
array network of identical finite automata with Q as their set of states. For each node
n ∈ Z^d, the neighbourhood N determines the set {n} + N of |N| “neighbours” of the node
n. Formally, A = ⟨d, Q, N, δ⟩ and elements (nodes) of Z^d are regarded as representing those
finite automata which A consists of.
[Figure 4.5: panel (a) shows the neighbourhoods N1, N2, N3; panel (b) shows the neighbourhoods N4, N5]
Figure 4.5a depicts three neighbourhoods of the dotted nodes for one-dimensional
cellular automata: N1 = {0, 1}, N2 = {−1, 0, 1}, N3 = {−2, −1, 0, 1, 2}. Fig-
ure 4.5b illustrates two often-used neighbourhoods (again for the dotted nodes)
for two-dimensional cellular automata: von Neumann neighbourhood N4 =
{(−1, 0), (0, −1), (0, 0), (0, 1), (1, 0)}, and Moore neighbourhood N5 = {(−1, −1),
(−1, 0), (−1, 1), (0, −1), (0, 0), (0, 1), (1, −1), (1, 0), (1, 1)}.
All finite automata of a cellular automaton work concurrently, synchronized, and in
discrete time steps. At each time moment the new state of each finite automaton is defined
to be the value of the local transition function applied to the current states of a cell and all
its neighbours.
In order to describe more formally
the overall behaviour of a cellular automaton A = ⟨d, Q, N, δ⟩ the concept of configuration
Gδ (c)(i, j) = δ(c(i − 1, j), c(i, j − 1), c(i, j), c(i, j + 1), c(i + 1, j)).
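The global map for the von Neumann neighbourhood can be sketched directly from this formula. The code below is an illustration under two assumptions that are not in the text: the configuration is a finite grid with wrap-around (toroidal) boundaries instead of the infinite lattice, and the local rule `parity_rule` is a hypothetical example chosen only to have something to run.

```python
def von_neumann_step(c, delta):
    """One synchronous step: every cell is updated from the old grid.

    Argument order follows the formula above:
    delta(c(i-1,j), c(i,j-1), c(i,j), c(i,j+1), c(i+1,j)).
    Assumption: indices wrap around (finite torus).
    """
    rows, cols = len(c), len(c[0])
    return [
        [
            delta(
                c[(i - 1) % rows][j],
                c[i][(j - 1) % cols],
                c[i][j],
                c[i][(j + 1) % cols],
                c[(i + 1) % rows][j],
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]

def parity_rule(up, left, centre, right, down):
    """Hypothetical two-state local rule: XOR of the five neighbours."""
    return up ^ left ^ centre ^ right ^ down

grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1
next_grid = von_neumann_step(grid, parity_rule)
```

Building the new grid from the old one in a single comprehension keeps the update synchronous: no cell ever sees a neighbour that has already been updated in the same step.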
Example 4.3.2 A simple four-state cellular automaton, due to I. Korec, with the neighbour-
hood {0, 1} is depicted in Figure 4.6a and its reversible counterpart, with the neighbourhood
{−1, 0} is shown in Figure 4.6b.
∗ 0 1 2 3 ∗ 0 1 2 3
0 0 1 1 0 0 0 0 3 3
1 2 3 3 2 1 2 2 1 1
2 0 1 1 0 2 0 0 3 3
3 2 3 3 2 3 2 2 1 1
(a) (b)
There do not seem to be many reversible cellular automata. For two-state automata with
a neighbourhood N where |N | = 2 or |N | = 3 there are none. For the neighbourhood N =
{−1, 0, 1, 2} there are 65 536 cellular two-state automata but only 8 of them are reversible
and all of them are insignificant modifications of the same one.7 The following theorem,
due to Toffoli (1977), Dubacq (1985) and Kari (1990), of importance for cellular automata
applications, is therefore quite a surprise.
Theorem 4.3.3 (1) Any k-dimensional CA can be simulated in real time by a (k + 1)-
dimensional reversible CA. (2) There is a universal cellular automaton that is reversible.
(3) It is decidable whether a one-dimensional cellular automaton is reversible but it is un-
decidable whether a two-dimensional cellular automaton is reversible.
Exercise 4.3.4 Show that the one-dimensional cellular automaton with the
neighbourhood N = {0, 1}, states {0, 1, . . . , 9} and the transition function
δ(x, y) = (5x + ⌈5y/10⌉) mod 10 is reversible.
Gruska (1997).
In one step A transfers from one basis state $|c_1\rangle$ to another $|c_2\rangle$. The amplitude of such
a transition, $\alpha(c_1, c_2)$, is defined as follows:
$$\alpha(c_1, c_2) = \prod_{i \in \mathbf{Z}} \delta(c_1(i + n_1), c_1(i + n_2), \ldots, c_1(i + n_r), c_2(i)).^{9}$$
The evolution operator $E_A$ of A maps any state $|\phi\rangle \in l_2(C(A))$ into the state $|\psi\rangle = E_A|\phi\rangle$
such that
$$|\psi\rangle = E_A|\phi\rangle = \sum_{c \in C(A)} \beta_c |c\rangle, \quad \text{where } \beta_c = \sum_{c' \in C(A)} \alpha_{c'}\,\alpha(c', c).$$
Exercise 4.3.6 What is a good upper bound for deciding unitarity of arbitrary 1QCA?
Unitarity is easy to verify for trivial 1QCA; we can make use of this in Section 4.3.3.
Lemma 4.3.7 Evolution of a trivial 1QCA is unitary if and only if the following condition
holds for any q1 , q2 ∈ Q.
$$\sum_{q \in Q} \delta^*(q_1, q)\,\delta(q_2, q) = \begin{cases} 1, & \text{if } q_1 = q_2; \\ 0, & \text{otherwise.} \end{cases} \qquad (4.7)$$
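Condition (4.7) says that the rows of the local transition matrix are orthonormal, which is easy to test numerically. The sketch below assumes that δ is stored as a square complex array with `delta[q1, q]` the amplitude of the local transition from q1 to q; the function name `satisfies_4_7` and the floating-point tolerance are assumptions of this illustration.

```python
import numpy as np

def satisfies_4_7(delta, tol=1e-12):
    """Check sum_q conj(delta[q1, q]) * delta[q2, q] == [q1 == q2]
    for all pairs of states q1, q2, up to a numerical tolerance."""
    delta = np.asarray(delta, dtype=complex)
    gram = delta.conj() @ delta.T   # gram[q1, q2] is the sum in (4.7)
    return bool(np.allclose(gram, np.eye(delta.shape[0]), atol=tol))

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
shear = np.array([[1, 1], [0, 1]])

ok_hadamard = satisfies_4_7(hadamard)
ok_shear = satisfies_4_7(shear)
```

The Hadamard matrix passes the check and the shear matrix fails it, as expected for a unitary and a non-unitary matrix.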
9 It follows from the stability of the quiescent state condition that this infinite product is a well-defined
complex number.
Satisfiability of such a condition can be verified in time O(n^2) (Dürr, LêThanh and
Santha, 1996). The satisfiability of the above well-formedness condition is not sufficient for
the unitarity condition to hold, as the following example shows:
Example 4.3.8 Each classical one-dimensional CA A can be seen as a q1QCA. Its evolu-
tion operator is a matrix with the property that in every column indexed by a configuration
there is a single non-zero entry with value 1 (for the unique next configuration). This matrix
has all columns of norm 1. Its columns are all pairwise orthogonal if and only if the automa-
ton mapping is injective. Its evolution is unitary if and only if its global mapping is bijective.
Any classical 1CA whose global evolution is injective but not surjective is well-formed but
not unitary.
For example, due to Ch. Dürr, in the one-dimensional classical CA A, called “controlled
not”, with states {0, 1}, the quiescent state 0, the neighbourhood N = {0, 1}, and the tran-
sition function δ(x, y) = x ⊕ y, no configuration of the form 0∗ 10∗ has a pre-image (among
configurations in CA ). The global transition function of the automaton is injective but not
surjective and the automaton satisfies the well-formedness condition.
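The two claims about the “controlled not” automaton can be checked by brute force on small configurations. The following sketch is only a finite sanity check, not a proof: it assumes that all non-quiescent cells lie in a window of n = 8 cells, with the quiescent state 0 everywhere else. The names `xor_step`, `images`, `injective` and `single_one_reached` are introduced for the illustration.

```python
from itertools import product

def xor_step(cells):
    """One step of delta(x, y) = x XOR y with neighbourhood {0, 1}.

    `cells` holds c(0..n-1); all other cells are quiescent (0).
    The result holds the new values of cells -1..n-1.
    """
    padded = [0] + list(cells) + [0]
    return tuple(padded[k] ^ padded[k + 1] for k in range(len(padded) - 1))

n = 8
images = {cells: xor_step(cells) for cells in product((0, 1), repeat=n)}

# Injective on the window: distinct configurations have distinct images.
injective = len(set(images.values())) == len(images)

# Not surjective: no image contains exactly one 1 (the form 0*10*).
single_one_reached = any(sum(img) == 1 for img in images.values())
```

The reason behind the second observation is a parity argument: every 1 in a finite configuration is counted twice in the sum of the new cell values, so each image contains an even number of 1s.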
Remark 4.3.9 The history of attempts to introduce the concept of quantum cellular au-
tomata goes back to Grössing and Zeilinger (1988) and a series of subsequent papers of the
same authors. However, their model has little in common with the one discussed above.
This history well illustrates the merits and limits of methodologies developed in physics
and theoretical computing to deal with such basic problems and to come up with proper
concepts.
Remark 4.3.10 In order to study more in depth computability problems for QCA and
to consider the problem of universality, some restrictions have to be made on the types of
amplitudes allowed. For example, for each amplitude α there has to be an algorithm to
compute α to within $2^{-n}$ in time polynomial in n; or all amplitudes have to be rational.
Even for 1QCA several basic problems are still open. One of them concerns universality. Is
there a single 1QCA which would be universal in a reasonable sense for the whole class of
1QCA and could efficiently simulate any 1QCA?10
10 van Dam (1996) solved positively the universality problem for a special “circular” model of 1QCA, in
which cells may form cycles of various length. Or, in other words, for space-periodic configurations only.
This interesting result did not seem to contribute much to the solution of the basic universality problem for
1QCA. However, as shown by Dürr (1997), if a q1QCA is a QCA with respect to van Dam's model, then it
is also a 1QCA with respect to our model. This implies that a subset of 1QCA inherits nice properties from van
Dam’s model. For example, this subset has a universal instance and can be simulated by a QTM.
Remark 4.3.11 (Ch. Dürr). A suitable definition of two- and more-dimensional quantum
cellular automata is a nontrivial issue. Formally, it can be done by a similar modification
of the definition of the classical cellular automata (assigning complex amplitudes to
transitions) as in the case of one-dimensional cellular automata. The difficulty with such a
straightforward approach is that in such a case one cannot decide in polynomial time whether
a qQCA is really a QCA. This follows from the result of Kari (1990) that reversibility of
two-dimensional cellular automata is undecidable.
Another basic problem not yet fully solved is the problem of mutual efficient simulation
of quantum cellular automata and quantum Turing machines. There is so far only a partial
solution of this problem for a special class of 1QCA, for the so-called partitioned quantum
cellular automata (PQCA). They are a natural quantum version of the model introduced
by Morita and Harao (1989) and have been shown to be important in the classical case
because they are much easier to deal with. For PQCA it is also easy to verify whether
their evolution is unitary. Results presented in the following section follow Watrous (1995).
The function $\delta_c$ defines a 1CA (one-way CA) $A_c = \langle Q, \lambda, N, \delta_c \rangle$ and the global transition
function $G_{\delta_c}$ is a permutation on configurations of A such that for any $a \in C_A$, $i \in N$,
$[G_{\delta_c}(a)](i) = \delta_c(a(i + N))$. The case N = {−1, 0, 1} is illustrated in Figure 4.7b.
$A_c$ can also be seen as a 1QCA whose evolution operator $U_{A_c}$ is defined by
$$U_{A_c}(q_2, q_1) = \begin{cases} 1, & \text{if } G_{\delta_c}(q_1) = q_2; \\ 0, & \text{otherwise,} \end{cases}$$
Theorem 4.3.12 Evolution of a P1QCA A is unitary if and only if the evolution of the
corresponding trivial 1CA is unitary, and this holds if and only if the local transition matrix
of A is unitary.
Example 4.3.13 (Watrous, 1997) We describe a simple PQCA A = ⟨Q, λ, U⟩ that simulates,
in a sense, the EPR phenomenon. Let
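and for q, q′ ∈ S,
$$U(q', q) = \begin{cases} -\frac{1}{\sqrt{2}}, & \text{if } q = q' = (-, 0, +); \\ \frac{1}{\sqrt{2}}, & \text{otherwise.} \end{cases}$$
where
$$c_0(n) = \begin{cases} (+, 0, -), & \text{if } n = 0; \\ (0, 0, 0), & \text{otherwise,} \end{cases} \qquad d_0(n) = \begin{cases} (-, 0, +), & \text{if } n = 0; \\ (0, 0, 0), & \text{otherwise,} \end{cases}$$
By an induction one can show that after t > 1 steps A is in the superposition
$$\frac{1}{\sqrt{2}}\,(|c_{t-1}\rangle + |d_{t-1}\rangle),$$
where
$$c_t(n) = \begin{cases} (+, 0, 0), & \text{if } n = -t; \\ (0, 0, -), & \text{if } n = t; \\ (0, 0, 0), & \text{otherwise,} \end{cases} \qquad d_t(n) = \begin{cases} (-, 0, 0), & \text{if } n = -t; \\ (0, 0, +), & \text{if } n = t; \\ (0, 0, 0), & \text{otherwise.} \end{cases}$$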
If the states (0, 0, −) and (−, 0, 0) are interpreted as negative particles and
(0, 0, +), (+, 0, 0) as positive particles, then the configuration $|c_t\rangle$ models the situation in
which the positive particle moves to the left and the negative one to the right; the configuration
$|d_t\rangle$ models the situation when particles are reversed.
Exercise 4.3.15 (a) Replace in the definition of the 1QCA from Example 4.3.14 H ′ with
the square-root-of-not matrix, page 63, and show that the inverse of the global evolution
of the resulting 1QCA is not evolution of any 1QCA; (b) find necessary and sufficient
condition that if the matrix H ′ from Example 4.3.14 is replaced by a matrix U , then
the resulting 1QCA has such a global evolution function the inverse of which is not an
evolution function of a 1QCA.
4. For any $q_1, q_2 \in Q$ for which $U(q_1, q_2)$ has not been defined in (1) to (3), let
$$U(q_1, q_2) = \begin{cases} 1, & \text{if } q_1 = q_2; \\ 0, & \text{otherwise.} \end{cases}$$
where xyn denotes the nth symbol of the string xy, or λ if out of the range.
Exercise 4.3.17 Show, for any n ∈ N, that the probability that M accepts the initial
configuration q0 w after n steps is equal to the probability that A accepts T (q0 w) after n
steps.
To summarize:
Theorem 4.3.18 (Watrous, 1995) To any QTM M there is a PQCA simulating M with
a constant slowdown.
Σ = Q ∪ Q′ ∪ {b},
Not all of the above transitions are really needed for simulation. Some of them are just
to make evolution unitary.
The aim of the first phase was to put all the information needed for the main second
phase in which transitions of A defined by U are simulated by transitions of M according
to the rules:
In the third phase M just moves its head to come to the cell representing the current
leftmost cell of the configuration of A. This is achieved by the transitions:
Finally, let δ take the value 0 everywhere not defined above. In addition, let k ′ = k and
Σa = {q, q ′ | q ∈ Qa }.
On the base of Theorem 4.3.12 it is now straightforward to show:
Lemma 4.3.19 If A and M are defined as above, then if A is a PQCA, then M is a QTM.
In order to formulate the new main simulation result we need to define two mappings:
T : CA → CM and f : N × N → N.
For $a \in C_A$ let $n^a_l$ and $n^a_r$ be the indices of the leftmost and rightmost non-quiescent
cells of a. Define
$$T(a) = (\lambda, s_0, \lambda)c,$$
where
$$c(i) = \begin{cases} a(i), & \text{if } i \ne n^a_l,\ i \ne n^a_r; \\ a^*(i), & \text{otherwise,} \end{cases}$$
and let $f(t, |c|) = 4t^2 + 4|c|t - t$ be the number of steps needed by M to simulate t steps of
A. One can now prove (Watrous, 1995):
Lemma 4.3.20 For any t ∈ N and c ∈ CA the probability that A accepts c after t steps is
equal to the probability that M accepts T (c) after f (t, |T (c)|) steps.
To summarize:
Theorem 4.3.21 To any PQCA A there exists a QTM simulating A with quadratic slow-
down.
Chapter 5
COMPLEXITY
INTRODUCTION
The study of complexity questions and of complexity classes, computational and communi-
cational, has proved to be very enlightening and important for classical computation. It has
developed a firm theoretical basis for our understanding of the potentials and limitations of
computational resources, models, and modes. There is reason to expect the same for the
complexity investigations in quantum computation and communication.
It is of utmost importance to determine whether quantum classification of inherent
computational complexity is indeed different from the classical one. Should this prove to be the
case, the very foundations of computing would be shaken.
Quantum computational complexity theory is characterized, as its classical counterpart,
by a number of fundamental open problems concerning the proper inclusions of complexity
classes. In order to get a better insight into these problems, and to test potential methods
to solve them, the relativized quantum complexity theory is of interest and importance.
It is also of importance to find out how much quantum features can speed up computa-
tions, shorten communications and achieve efficiency in size or space.
Investigations of the potential impacts on the power of computing of the existence of
slightly non-linear evolutions in quantum physics are also of interest.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
1. the way universal quantum Turing machines can be constructed;
2. the basic quantum complexity classes and their properties;
3. the basic relations between classical and quantum complexity classes;
4. the basic results concerning relativized quantum complexity;
5. the basic concepts of quantum communication complexity;
6. a reduction of quantum communication protocols to quantum computation problems;
7. the potential impacts of non-linearity on the power of quantum computing.
The main problem with having a single QTM performing an arbitrary unitary transfor-
mation is that a fixed QTM maps in a single step any configuration to a superposition of
a fixed number of configurations and therefore it cannot simulate in one step a step of any
QTM. The way out is first to decompose, efficiently, any given unitary transformation U (or
its approximation) to a “small number” of elementary unitary transformations and then to
carry out these elementary transformations, in order to realize U .
1. M is like a unit matrix except for one diagonal element which has the form $e^{i\theta}$, θ ∈
[0, 2π]. (Such a transformation is called a near-trivial shift and denoted by (j, j, θ)
if the element $e^{i\theta}$ is in the jth row.)
2. M is like the unit matrix except for the elements in the intersections of a jth and a
kth row and column which form the matrix
$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
for some θ ∈ [0, 2π]. (Such a transformation is called a near-trivial rotation and it
is denoted by (j, k, θ).)
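Both kinds of near-trivial matrices are easy to build explicitly. The sketch below is an illustration, with two conventions that are assumptions of the example rather than of the text: indices are 0-based, and the helper names `near_trivial_shift`, `near_trivial_rotation` and `is_unitary` are hypothetical.

```python
import numpy as np

def near_trivial_shift(d, j, theta):
    """Near-trivial shift (j, j, theta): identity except e^{i theta} at (j, j)."""
    m = np.eye(d, dtype=complex)
    m[j, j] = np.exp(1j * theta)
    return m

def near_trivial_rotation(d, j, k, theta):
    """Near-trivial rotation (j, k, theta): identity except for the 2x2
    block [[cos, -sin], [sin, cos]] in rows and columns j and k."""
    m = np.eye(d, dtype=complex)
    c, s = np.cos(theta), np.sin(theta)
    m[j, j], m[j, k] = c, -s
    m[k, j], m[k, k] = s, c
    return m

def is_unitary(m, tol=1e-12):
    """True if m times its conjugate transpose is the identity."""
    return bool(np.allclose(m @ m.conj().T, np.eye(m.shape[0]), atol=tol))

shift = near_trivial_shift(4, 2, np.pi / 3)
rotation = near_trivial_rotation(4, 0, 3, np.pi / 5)
```

Each matrix differs from the identity in at most four entries, and both are unitary, which is what makes them convenient elementary steps for a fixed QTM.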
In the proof of the decomposition theorem presented below we make use of the following
result.
Lemma 5.1.2 There is a deterministic algorithm A such that given a d-dimensional column
vector v of complex numbers, and an ε > 0, A computes near-trivial matrices $U_1, \ldots, U_{2d-1}$
such that
$$\|U_1 \cdots U_{2d-1} v - \|v\| e_1\| \le \varepsilon,$$
where $e_1$ is the unit vector, with one in the first component, and A runs in polynomial time
with respect to d, lg(1/ε) and length of the input.
Proof. The first step is to use, for j = 1, . . . , d, phase shifts $P_j = (j, j, \phi_j)$, where
$\phi_j = 2\pi - \arccos\frac{\mathrm{Re}(v_j)}{|v_j|}$ or $\phi_j = \arccos\frac{\mathrm{Re}(v_j)}{|v_j|}$ (depending on whether $\mathrm{Im}(v_j)$ is positive or
negative), if $v_j \ne 0$ and $\phi_j = 0$ otherwise, to replace all components $v_j$ by $|v_j|$.
As the next step d − 1 rotations $R_j$, j = d − 1, . . . , 1, are used to move all weights of the
vector into its first component; all other components will have at the end of this procedure
the value 0. This can be achieved with $R_j = (j, j + 1, \psi_j)$ and $\psi_j = \arccos\frac{|v_j|}{\sqrt{\sum_{i=j}^{d} |v_i|^2}}$ if the
The angles $\phi_j$ and $\psi_j$ can be computed in polynomial time, with respect to d, lg(1/ε) and the
length of input, with the precision $\delta = \frac{\varepsilon}{(2d-1)\|v\|}$. Let us denote by $P'_j$, $R'_j$ the near-trivial
matrices corresponding to $P_j$, $R_j$ but with elements determined with the precision δ. Since
$|\phi_j - \phi'_j| \le \delta$ we have $\|P_j - P'_j\| \le \delta$ and since $|\psi_j - \psi'_j| \le \delta$ we get $\|R_j - R'_j\| \le \delta$, and
therefore also the inequality
$$\|R'_1 \cdots R'_{d-1} P'_1 \cdots P'_d - R_1 \cdots R_{d-1} P_1 \cdots P_d\| \le (2d - 1)\delta,$$
and consequently,
$$\|R'_1 \cdots R'_{d-1} P'_1 \cdots P'_d\, v - \|v\| e_1\| \le (2d - 1)\delta\|v\| = \varepsilon.$$
(The last inequality is due to the fact that if the distance between angles is at most δ, then
so it is between the corresponding points on the unit circle in both real and complex plane.)
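The two stages of this proof (d phase shifts, then d − 1 rotations) can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the lemma itself: angles are computed in floating point rather than to a prescribed precision δ, the phase angle is taken as the negated argument of $v_j$ via `np.angle`, and the rotation angle is chosen with `np.arctan2` so that the block [[cos, −sin], [sin, cos]] zeroes the lower component. Indices are 0-based and all function names are hypothetical.

```python
import numpy as np

def shift(d, j, theta):
    m = np.eye(d, dtype=complex)
    m[j, j] = np.exp(1j * theta)
    return m

def rotation(d, j, k, theta):
    m = np.eye(d, dtype=complex)
    c, s = np.cos(theta), np.sin(theta)
    m[j, j], m[j, k] = c, -s
    m[k, j], m[k, k] = s, c
    return m

def reduce_to_e1(v):
    """Return (factors, w): 2d-1 near-trivial matrices, listed in the
    order in which they are applied, and the final vector w."""
    w = np.asarray(v, dtype=complex).copy()
    d = len(w)
    factors = []
    # Stage 1: phase shifts replace every component v_j by |v_j|.
    for j in range(d):
        theta = -np.angle(w[j]) if w[j] != 0 else 0.0
        p = shift(d, j, theta)
        w = p @ w
        factors.append(p)
    # Stage 2: rotations move all weight into the first component,
    # working from the last pair of components towards the first.
    for j in range(d - 2, -1, -1):
        a, b = w[j].real, w[j + 1].real
        r = rotation(d, j, j + 1, -np.arctan2(b, a))
        w = r @ w
        factors.append(r)
    return factors, w

v = np.array([1 + 2j, -3, 0.5j, 0])
factors, w = reduce_to_e1(v)
```

With the rotation angle θ = −arctan2(b, a), the pair (a, b) of non-negative reals is mapped to (√(a² + b²), 0), so after the last rotation only the first component is non-zero and it equals ||v||.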
Proof. Let us say that a matrix of the degree d is k-simple if its first k rows and columns
are identical with those of the unit matrix of degree d. The basic idea of the proof is to show
how to reduce the problem of the decomposition of an i-simple matrix of degree d, close to a
unitary matrix, to the problem of decomposition of an (i + 1)-simple matrix of degree d, for
i = 0, 1, . . . , d − 1. (Observe that the product of two k-simple matrices is again a k-simple
matrix.)
Suppose that we have started to approximate a k-simple matrix U by near-trivial matrices
whose product is V . In order to produce the desirable reduction, we need to create another
sequence of near-trivial matrices whose product is W = U V ∗ . In order to reduce the
problem, V should be such that U V ∗ is close to being (k + 1)-simple. To achieve that
Lemma 5.1.2 will be used.
Let U be k-simple and δ-close to a unitary matrix. Let Z be the lower right (d−k)×(d−k)
submatrix of U and $Z_1$ its first row. By Lemma 5.1.2, one can construct a sequence of near-trivial
matrices $V_1, \ldots, V_{2(d-k)-1}$ of degree d − k whose product $V = V_1 \cdots V_{2(d-k)-1}$ is such
that $\|V Z_1^T - \|Z_1\| e_1\| \le \delta$. Finally, let us extend the matrices V and all $V_i$ into the matrices
of degree d that are k-simple.
Denote W = U V ∗ . V is unitary and W is k-simple and δ-close to a unitary matrix
(because V is unitary and U is δ-close to a unitary matrix). In addition, W is close to being
(k + 1)-simple in the following sense. For the (k + 1)th row Wk+1 of W it holds, as discussed
below,
||Wk+1 − ek+1 || ≤ 2δ and all ||Wj,k+1 || ≤ 6δ for all j ≥ k + 1. (5.2)
Let X be a (k + 1)-simple matrix of degree d such that $X_{i,j} = W_{i,j}$ if i, j > k + 1. From
(5.2) and Exercise 1.4.21 it follows that $\|W - X\| \le 2\delta + 6\sqrt{d}\,\delta$. Since W is δ-close to a
unitary matrix, X is $(3\delta + 6\sqrt{d}\,\delta)$-close to a unitary matrix.
The problem with the above idea is that it may happen that not all entries of W = U V ∗
cannot be assumed to be computable exactly. However, there is a way out. By Exercise 1.4.21, it is sufficient
to compute them with the accuracy $\frac{\delta}{d}$ to obtain a matrix $W_1$ such that $\|W - W_1\| \le \delta$.
Let us now use $W_1$ to construct a new matrix $X_1$ in a similar way as above. On the base
of the triangle inequality we can then derive that $\|W - X_1\| \le 3\delta + 6\sqrt{d}\,\delta$ and that $X_1$ is
$(4\delta + 6\sqrt{d}\,\delta)$-close to a unitary matrix.
The problem of decomposing a k-simple matrix U was reduced in this way to the problem
of the decomposition of a (k + 1)-simple matrix $X_1$ provided we are willing to accept two
errors:
1. An error $\|W - X_1\| \le 3\delta + 6\sqrt{d}\,\delta$, because we are going to decompose in the next step
$X_1$ and not W.
2. $X_1$ is only $(4\delta + 6\sqrt{d}\,\delta)$-close to a unitary matrix.
If $\delta' = 10\sqrt{d}\,\delta$, then δ′ is the upper bound on both of the above errors. The total error, after
d steps, is therefore $\sum_{j=1}^{d} (10\sqrt{d})^j \delta \le 2(10\sqrt{d})^d \delta$, since $10\sqrt{d} \ge 2$. Since U is δ-close to a
unitary matrix with $\delta = \frac{\varepsilon}{2(10\sqrt{d})^d}$, the total error of the approximation is, as required, at
most ε.
To finish the proof, it remains to show (5.2). This requires to make several careful
estimations and to use the inequality from Exercise 2.3.29b. This is left to the reader (see
also Bernstein, 1997, and Bernstein and Vazirani 1997).
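and since
$$\sum_{i=\lg n+2}^{\infty} m\,2^{n-2^i} \le m\,2^{n-4n+1} \le 2^{-2n+1},$$
we have
$$|m2^nR - \theta| \bmod 2\pi \le \left|m2^nR - \frac{2\pi m}{2^n}\right| \bmod 2\pi + \left|\frac{2\pi m}{2^n} - \theta\right| \le \frac{2\pi}{2^{2n-1}} + \frac{2\pi}{2^n} < \frac{2\pi}{2^{n-1}} < \varepsilon.$$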
The following theorem yields the key result used in Section 5.1.2 to design a universal
QTM.
U keeps shifting the description of M so it is always close to the position of the head (on
another track) of M.
There are surprisingly small classical universal TM (see Rogozhin 1996, Gruska 1997).
Currently the smallest universal TM has 5 states and 5 tape symbols. There are also
universal TM with 2 states (and 18 tape symbols) and with 2 tape symbols (and 24 states). The
design of a universal QTM is much more complex.
It is natural to try to design a UQTM on the basis of similar ideas as the classical UTM.
However, there are some fundamental difficulties that need to be overcome.
1. Unitary transformations specifying evolutions of QTM have in principle infinite di-
mension. A way out is, given a QTM, first to design an equivalent unidirectional QTM,
evolution of which can be specified by a finite unitary matrix.
2. A natural idea that a UQTM U simulates the evolution of any given QTM M step by
step is not so easy to realize. The problem lies in the following. Each evolution step of M is
an application of its unitary matrix. If this is to be simulated by U then it seems that U has
to implement one step of M using several steps of U. Therefore U will not map a state of
M immediately to the proper superposition, but it has to create this superposition during
several steps. This causes difficulties because it is necessary that U is reversible. The way
out has been found in Section 5.1.1. A complex unitary transformation can be decomposed
into a sequence of near-trivial unitary matrices.
Encodings of QTM
Any universal QTM U gets as input an encoding of a QTM M and of its input x. Both
encodings have to be words in the alphabet of U. The number of tape symbols and states
can be encoded in a similar way as for classical TM (see, for example, page 224 in Gruska
(1997) or Appendix on web). The transition function δM is encoded by an algorithm that
for given arguments q, σ, σ′, q′, d computes the amplitude $\delta_M(q, \sigma, \sigma', q', d)$ with precision
$2^{-n}$ in time polynomial in n.
Design of UQTM
Theorem 5.1.6 There is a normal form QTM U such that for any QTM M, its input x,
ε > 0, and T ∈ N, U can simulate M with accuracy ε for T steps in time polynomial in T
and 1/ε.
Proof. The universal QTM U, the existence of which we are going to demonstrate, gets
as input a description of M, x, T , and the simulation accuracy ε. U works in two phases.
2. The description of the unitary transformation $U_{M'}$ for the evolution of M′, written
with the accuracy $\frac{\varepsilon}{40T(10\sqrt{d})^{d+2}}$, where d = |Σ| |Q|.
3. The binary string s of length |Q| of the directions of the head moves of M′ , for |Q|
states of M′ .
4. The desired number of the simulation steps, 5T , and the desired accuracy ε.
All that can be computed by a DTM in time polynomial in T, 1/ε and the length of the
input and therefore, by the synchronization theorem, with the asymptotically same efficiency
also by a QTM.
Simulation of the computation steps of M′ . Let us first describe a QTM called
STEP that simulates one step of M′ . STEP operates as follows:
1. STEP transfers the current state q and the tape symbol σ to an empty work space
near the starting cell, leaving a special marker in their places.
2. STEP applies UM′ to (q, σ) to produce, with accuracy ε, a superposition of new states
and symbols (q ′ , σ ′ ).
3. STEP reverses Step 1, to replace the marker with (q ′ , σ ′ ) and to empty the working
space.
4. STEP transfers the state on the tape either one cell left or right according to the value
of q ′ th bit of the string s.
Steps 1, 3 and 4 can be done easily by DTM and therefore, by the synchronization
theorem, there exists a stationary, normal form QTM that realizes the above steps that
works, for a fixed M, in time polynomial in T . The fact that Step 2 can be done by a
QTM in time polynomial in T, d and 1/ε follows from Theorem 5.1.5. Moreover, using the
results of Exercises 4.2.31 and 4.2.32, QTM for Steps 1 to 4 can be composed to get the
resulting QTM STEP to perform simulation of one step of M′ . By inserting STEP for the
special state in the RTM constructed according to the looping lemma 4.2.38, together with
an additional input T , the resulting QTM simulating M′ will compute in time polynomial
with respect to T and 1/ε and with the accuracy Tε.
The resulting universal QTM is obtained by composing QTM of both phases.
Analysis. If, in the preprocessing, the transformation UM′ is computed to the specified
accuracy, then the transformation provided for the simulation is within ε/(40T(10√d)^d)
of the desired unitary transformation UM′ and therefore it will be ε/(40T(10√d)^d)-close
to a unitary transformation, as required for the output of STEP. This implies that if each
time STEP works with accuracy ε/(40T), then it applies a unitary transformation which is
within ε/(20T) of UM′. After 5T runs of STEP the transformation is applied which is within
ε/4 of the
Exercise 5.1.7 (Bernstein and Vazirani, 1997) Let M be a QTM such that there is a
polynomial time algorithm A such that, for any input x and any t > 0, all measurements
of M at time t are determined by A. Show that there is a QTM M′ such that for any
t > 0, any input x and any ε > 0, a measurement of M′ at time t, determined by a fixed
algorithm polynomial time in t and 1/ε, allows sampling from a probability distribution of
M′ , which is within the total variation distance ε of the distribution samples from M,
with respect to the outcomes of A over t steps.
Quantum variations of the main time and space computational complexity classes of
classical computing are other important theoretical concepts to use in order to get a deeper
insight into the power of quantum computing.
For quantum time complexity classes, and for their relations to classical computational
complexity classes, several inclusion results are already known and will be discussed below.
Unfortunately, as is usual in complexity theory, in many cases it is not known whether
inclusions obtained are proper. However, it is known that many major new open problems
cannot be resolved without making a breakthrough concerning the separation of the classical
computational complexity classes.
In the case of quantum space complexity classes the situation is quite different: quantum
computing has not brought an asymptotic decrease in the space resources needed.
Types of QTM. In order to define the time complexity classes, one-tape multitrack QTM
are considered. To define the space complexity classes, off-line multitape QTM are
considered with one-way, read-only, input tape, a working tape, and one-way, write-
only, output tape. Such a model is needed to investigate sublinear complexity classes.
In both cases only such QTM are considered all amplitudes of which are rational. The
case of more general amplitudes is discussed separately.
Measurements and probabilities. It has been shown by Bernstein (1997) (see also
Exercise 5.1.7) that in order to study the time complexity classes, it is sufficient to
consider only computations in which the measurement is done only after the machine
comes into the halting state. On the other hand, to study the space complexity of
multitape QTM (see Watrous, 1997a), a measurement is done each time a symbol is
written on the output tape. More precisely, to study time complexity classes, it is
sufficient to consider only such multitrack QTM M where the last track alphabet is
{λ, 0, 1} and if M runs with input string x on the first track and with all other tracks
empty, then the last track is observed each time M gets into the halting state, which
can be determined by a measurement that has no effect on the computation, and the
probability p with which we see 1 on the starting cell of the last track is the overall
probability x is accepted by M.1
Acceptance. Table 5.1 summarizes the definitions of the main complexity classes. Each
of the classes is defined as the class of languages L such that for x ∈ L and for
x ∉ L certain conditions for the acceptance probabilities are satisfied. E stands here
for “error-free computation”; R stands for “one-sided bounded-error computation”
(Monte Carlo); B stands for “bounded-error computation”; N for “nondeterministic
computation” and Pr for “unbounded-error computation”. (In the Appendix, in
Section 9.3.2, this class is denoted PP in the case of the acceptance in polynomial time.)

         EQ    RQ           BQ           NQ     PrQ
x ∈ L    1     ≥ 1/2 + ε    ≥ 1/2 + ε    > 0    > 1/2
x ∉ L    0     0            ≤ 1/2 − ε    0      ≤ 1/2

Table 5.1: Definitions of the main complexity classes. The occurrence of ε in a column
of the table means that there is an ε > 0 such that the corresponding condition for
acceptance and rejection holds for all words x over the alphabet of L.
bounded computation. In this case the resulting space-bounded classes correspond to the “halting” classes,
i.e., classes for which QTM halts absolutely
2 A function f : N → N is t(n)-time-constructible and s(n)-space-constructible if the function f′ : {1}* →
{0, 1}*, defined by f′(1^n) = bin^{−1}(f(n)), the binary representation of f(n), is computable by a t(n)-time-
bounded and s(n)-space-bounded 2-tape TM. f is called time-constructible (space-constructible) if f is
f-time-constructible (f-space-constructible).
3 A CQ-Turing machine is a QTM such that the language it accepts is accepted in a way satisfying the
The following basic relations between the main classical and quantum time complexity
classes hold:
P ⊆ EQP ⊆ BQP,    BPP ⊆ BQP ⊆ PSPACE.
However, in order to prove these inclusions we need to show more precisely how the accep-
tance of languages by QTM is defined and how the probability of acceptance is determined.
We consider only such multitrack QTM M where the last track alphabet is {λ, 0, 1} and
if M runs with input string x on the first track and with all other tracks empty, then the
last track is observed when M gets into the halting state (which can be determined by a
measurement that has no effect on the computation) and the probability p with which we
see 1 on the starting cell of the last track is the overall probability x is accepted by M.
(Here the polynomial time restriction translates as follows: there exists a polynomial p such
that the halting state is observed after at most p(|x|) steps.)
Open problem 5.2.5 Determine which of the inclusions P ⊆ EQP ⊆ BQP and BPP ⊆
BQP ⊆ PP ⊆ P^{#P} ⊆ PSPACE are proper.
It follows from the above inclusions that one cannot expect a proof that QTM are more
time efficient than the classical TM, in the sense that some strict inclusions of the classical
and the corresponding quantum polynomial time complexity classes would be shown, unless
there is some breakthrough in the classical complexity theory.
does not fully correspond to the basic idea behind the class NP: guess and verify. A different view of NQP,
represented as a complexity class called quantum NP was presented by Kitaev at AQIP’99. He defines this
class in terms of quantum witnesses and verifiers—see Section 9.3 for the classical version of this approach.
7 This class, introduced by Wagner (1986), contains, for example, the graph nonisomorphism problem,
which is not known to be in NP. Equality 5.3 therefore implies that there is a QTM that can accept in
polynomial time with nonzero probability a description of two graphs if and only if they are nonisomorphic.
8 A space-bounded version of equality 5.3 was shown by Watrous (1997).
Sketch of the proof. The following two technical results form the basis of the proof. For the
first one see Allender and Ogihara (1996); the second is due to Watrous (1997a).
1. If w_{A,B} denotes a binary encoding of integer matrices A, B, then L = {w_{A,B} | det(A) >
det(B)} ∈ PrSpace(lg n).
2. Let M be an s(n)-space bounded multitape QTM. Then for each input x there are
integer matrices A, B of degree 2^{O(s(n))}, elements of which are integers of length 2^{O(s(n))},
such that the following properties hold:
• det(A) > det(B) if and only if M accepts x with probability > 1/2.
• There exists a DTM M_0 which on input x and an integer k ∈ 2^{O(s(n))}, initially
written on its working tape, computes the kth bit of w_{A,B} in space O(s(n)).
Let M be a QTM running in space s(n), and let A and B be matrices the existence of which
follows from Claim 2. Since both matrices are of degree 2^{O(s(n))} and all their elements have
2^{O(s(n))} size, the encoding w_{A,B} can be assumed to have asymptotically at most the same
length.
According to Claim 1, there is a lg-space bounded PTM M_1 accepting the string w_{A,B}
such that det(A) > det(B) with probability > 1/2. On this basis we can design a PTM
M_2 which works as follows. On an input x, M_2 simulates M_1 and keeps recording the
position of the head of M_1 on w_{A,B}. (For that M_2 needs O(s(n)) space.) On the basis of
this position, using the machine M_0 the existence of which is assumed in Claim 2, second
item, M_2 computes in O(lg 2^{O(s(n))}) = O(s(n)) space the bit of w_{A,B} which M_1 needs to
inspect. Since det(A) > det(B) if and only if M accepts x with probability > 1/2, M_2 accepts x
with probability > 1/2 exactly when M does.
Open problem 5.2.8 Are QTM with algebraic amplitudes equivalent to those with rational
amplitudes with respect to space efficiency?
As follows from Theorem 5.2.1 and results of Watrous (1998), bounded-error probabilistic
computations can be simulated by QTM either in a time-efficient way or in a space-efficient
way. An open problem is whether they can be simulated by QTM in a way that is both time
and space efficient. The existence of such simulations for a class of important graph problems was
shown by Watrous (1998).
Remark 5.2.9 Quantum version QNC of the classical parallel time complexity class NC is
also of interest (see, for example, Moore and Nilsson, 1998a). Due to the enormous problems
decoherence causes, it is of special interest to find out what can be computed in quantum
parallel polylogarithmic time. (However, from a complexity-theoretic point of view (e.g.
with a uniformity condition analogous to the classical case) a fully satisfactory definition of
QNC seems to be still a challenge.)
Oracle QTM
In classical computing an oracle can be seen as a special subroutine each call of which costs
only one time unit (see page 384 in the Appendix, Section 9.3, for classical oracle TM). In
the context of QTM, subroutine calls have to satisfy a special requirement which has no
classical parallel. It is necessary that subroutine calls leave behind no garbage, only
their outcomes, because computational paths with the same result but different garbage
left behind do not interfere.
The simplest case of an oracle QTM is that with a Boolean oracle f : {0, 1}∗ → {0, 1}.
An oracle QTM M with the oracle f has a special query (oracle) track, with the alphabet
{λ, 0, 1}, on which the machine writes its question to the oracle in the form of a string xb
with x ∈ {0, 1}∗, b ∈ {0, 1} and it has two distinguished query states: a pre-query state q?
and a post-query state q!. If M enters the state q? and the oracle track contains at that
moment a string xb, where x ∈ {0, 1}∗, b ∈ {0, 1}, then M enters, in one step, the state
q! and the content of the oracle tape is changed to x · (b ⊕ f (x)), where · is here a symbol
for concatenation. In other words, the XOR operation is performed with b and f (x) as
arguments.
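The effect of one oracle call on the query track can be sketched classically on basis strings. This is only an illustration: the function names and the example oracle `f_example` are hypothetical, and a real oracle QTM applies the same map linearly to superpositions of such strings.

```python
from itertools import product

def oracle_call(track: str, f) -> str:
    """Map the query track content x.b to x.(b XOR f(x))."""
    x, b = track[:-1], int(track[-1])
    return x + str(b ^ f(x))

def f_example(x: str) -> int:
    # Hypothetical Boolean oracle: 1 iff x contains an odd number of 1s.
    return x.count("1") % 2

track = "1011" + "0"                      # query x = 1011, answer bit b = 0
after_one = oracle_call(track, f_example)
after_two = oracle_call(after_one, f_example)

# The map is a permutation of the basis strings of a fixed length,
# which is why it extends to a unitary (reversible) operation.
strings = ["".join(s) for s in product("01", repeat=4)]
images = {oracle_call(s, f_example) for s in strings}
```

Applying the call twice restores the original track, so the call is its own inverse and leaves no garbage behind.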
In a more general case the oracle is represented by a unitary operator U and the oracle
call changes the state |x⟩ into U|x⟩.
As in the classical case, one can define such complexity classes as C^A, where C is a
quantum complexity class and A is an oracle, and also classes C^𝒜, where 𝒜 is a set of
oracles.
Proof. We prove in detail the existence of an oracle A with the above property. The
existence of oracle B can be shown in a similar way and we only show the main new trick.
The construction of A is done recursively. In doing that we use two integer-to-integer
functions, p(n) and s(n), where
and s(n) is any recursive function which maps N into N and takes each value infinitely
many times.
Let {M_i}_{i=1}^∞ be a fixed enumeration of oracle DTM with input alphabet Σ = {0, 1}.
The oracle A is defined by A = ⋃_{i=1}^∞ A_i, where A_1 = ∅ and, for n ≥ 1, A_{n+1} = A_n ∪ R_n,
where R_n is chosen as follows.
Let the machine M_{s(n)} be simulated on the input 1^{p(n)} for 2^{p(n)−1} steps with A_n as
the oracle. If M_{s(n)} does not stop within 2^{p(n)−1} steps, or rejects the input within 2^{p(n)−1}
steps, then R_n = ∅. Otherwise, let Q_n be the set of queries asked during the computation of
M_{s(n)} on the input 1^{p(n)}. Clearly, |Q_n| ≤ 2^{p(n)−1}. In this case R_n is chosen as any subset
of Σ^{p(n)} of size 2^{p(n)−1} such that R_n ∩ Q_n = ∅. Such a set must exist because at most half
of the binary strings of length p(n) can be query strings in the computation of M_{s(n)} on
1^{p(n)} for 2^{p(n)−1} steps.
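The choice of R_n in the accepting case can be illustrated by a small sketch. The name `choose_R` and the sample query set are hypothetical; the text allows any subset with the stated properties, and the sketch simply takes the first suitable strings in lexicographic order.

```python
from itertools import product

def choose_R(p: int, queries: set) -> set:
    """Pick 2**(p-1) binary strings of length p, none of which was queried."""
    half = 2 ** (p - 1)
    # At most half of the strings of length p can have been queried.
    assert len(queries) <= half
    R = set()
    for bits in product("01", repeat=p):
        w = "".join(bits)
        if w not in queries:
            R.add(w)
            if len(R) == half:
                break
    return R

Q = {"000", "101", "010", "11"}   # hypothetical queries (any lengths)
R = choose_R(3, Q)
```

Because exactly half of the strings of length p are added, the restriction of the characteristic function of A to that length becomes balanced.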
Let now
S_A = {1^n | A ∩ Σ^n = ∅}.
We show first that there is an oracle QTM with the oracle A which accepts S_A in
polynomial time.
Observe that 1^n ∈ S_A if and only if there are no words of length n in A. Let f :
{0, 1}* → {0, 1} be a recursive function such that f(x) = 1 if and only if x ∈ A. For any
n ∈ N let f_n be the restriction of f to inputs of length n. From the definition of A it follows
that f_n is either constant or balanced for each n. Hence 1^n ∈ S_A if and only if f_n is not
a balanced function. For a given n this can be decided by a quantum computer in
one step using the Deutsch–Jozsa algorithm. It therefore requires at most linear time (to
write down 1^n) on a quantum computer to decide whether 1^n ∈ S_A.
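The decision "constant or balanced" can be checked on small instances with a classical state-vector simulation of the Deutsch-Jozsa algorithm. This is a sketch under assumptions: the oracle is used in its phase form (equivalent to the XOR oracle with the answer qubit prepared in the state (|0⟩ − |1⟩)/√2), the function names are hypothetical, and the simulation itself takes exponential classical time.

```python
import math

def hadamard_all(amps):
    """Apply a Hadamard gate to every qubit (fast Walsh-Hadamard transform)."""
    a = list(amps)
    s = math.sqrt(2)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = (a[j] + a[j + h]) / s, (a[j] - a[j + h]) / s
        h *= 2
    return a

def is_constant(f, n):
    """Decide whether f: {0..2^n - 1} -> {0,1} is constant, given the promise
    that it is constant or balanced, using one (simulated) oracle call."""
    amps = [0.0] * (2 ** n)
    amps[0] = 1.0
    amps = hadamard_all(amps)                                # uniform superposition
    amps = [(-1) ** f(x) * a for x, a in enumerate(amps)]    # the single oracle call
    amps = hadamard_all(amps)
    # Probability of observing 0...0 is 1 for constant f and 0 for balanced f.
    return abs(amps[0]) ** 2 > 0.5
```

In the setting above, 1^n ∈ S_A corresponds to `is_constant(f_n, n)` returning `True` for the restriction f_n of the oracle.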
Suppose now that S_A can be accepted by an oracle DTM M_i with oracle A in polynomial
time. Let n be such that s(n) = i. We claim that M_i^A accepts the input 1^{p(n)} and uses at
least 2^{p(n)−1} steps to do so.
Suppose first that M_i^A rejects the input 1^{p(n)}. Then 1^{p(n)} ∉ S_A because M_i^A should
recognize S_A. Therefore, A ∩ Σ^{p(n)} ≠ ∅. However, this means, by the definition of oracle A,
that M_i accepts 1^{p(n)} within 2^{p(n)−1} steps using oracle A_n. By the definition of A_n, the set
A_n does not contain any oracle queries asked during such a computation. In addition, none
of these queries can be longer than 2^{p(n)−1}: there would be no time to write down such a
query. Therefore, the sets A and A_n can differ in strings shorter than 2^{p(n)} only in strings
from R_n that are in A, but not in A_n. Therefore the same accepting path in M_i^{A_n} for input
1^{p(n)} exists if A_n is replaced by A. Consequently, M_i^A accepts 1^{p(n)}, a contradiction. Hence
S_A cannot be accepted by an oracle DTM with the oracle A.
Suppose now that M_i^A accepts 1^{p(n)} with less than 2^{p(n)−1} steps. Then 1^{p(n)} ∈ S_A, since
M_i^A is supposed to accept S_A. Hence A ∩ Σ^{p(n)} = ∅ and therefore A and A_n are identical
when restricted to strings of length 2^{p(n)} or less. Consequently, it does not matter whether
we use as oracle A or A_n for the first 2^{p(n)−1} steps of M_i. Hence M_i^{A_n} also accepts 1^{p(n)},
and therefore A ∩ Σ^{p(n)} ≠ ∅ by the construction of A, a contradiction.
In a similar way we can show the existence of an oracle B and a set Y_B ⊆ {1}* such
that Y_B ∈ EQP^B, Y_B ∉ NP^B and the set Y_B ∪ {0^n | 1^n ∉ Y_B} is neither in NP^B nor in
co-NP^B, but it is in EQP^B.
In addition, a variety of relativized results concerning BQP have been obtained by
Fortnow and Rogers (1998) that seem even to suggest a hypothesis that BQP actually contains
no interesting complexity class outside BPP. For example, there is a relativized setting
where P = BQP and the polynomial time hierarchy (see, for example, Gruska (1997)) is
finite. And there is a relativized setting in which BQP does not have complete sets.
Simon (1994) showed the existence of an oracle relative to which BQP cannot be
simulated by a PTM in 2^{n/2} steps.
Another key question is whether NP ⊆ BQP. Bennett et al. (1997) showed that relative
to a random oracle, with probability 1, the class NP cannot be solved on a QTM in time
o(2^{n/2}). This bound is tight due to Grover’s result in Section 3.3.⁹
9 These results do not rule out the possibility that NP ⊆ BQP. (It is not even clear whether BQP ⊆
BPP^NP; i.e. whether nondeterminism+randomness is sufficient to simulate QTM.) These results only imply
that there is no “black-box approach” to solving NP-complete problems utilizing some uniquely quantum
mechanical features of QTM (see Bennett et al. 1997).
5.3. QUANTUM COMMUNICATION COMPLEXITY 207
As already mentioned, oracles of QTM have to fulfil special conditions in order not to
have undesirable effects. An important task is therefore to determine how powerful oracles
can be used. Bennett et al. (1997) showed that each BQP-machine¹⁰ M can be modified
into another equivalent BQP-machine whose final superposition consists almost entirely of
a tape configuration containing just the input and a single bit answer. On this basis they
have shown that BQP^BQP = BQP. In addition, Fortnow and Rogers (1998) have shown
that PP^BQP = PP.
In the classical complexity one considers, in addition to various classes P, NP, ... of
decision problems, also their F-versions: classes of functions computable with similar resources.
The same can be done also in quantum computing. For example, Aharonov et al. (1998)
have introduced the class FQP of functions computable by uniform quantum circuits with
polynomial size and depth. For this class they have shown that FQP^FQP = FQP.
Theorem 5.2.11 There is an oracle A relative to which one-way functions exist and
P^A = BQP^A,
Remark 5.2.12 A variety of results concerning the power of quantum computation have
been presented in Chapters 3 to 5. In spite of all these insights we have to realize the fact
that we are still far away from understanding well the computational power of quantum
systems.
In the so-called entanglement model only classical bits are communicated but
communication is facilitated by an a priori distribution of entangled qubits among the
communicating parties. It has already been demonstrated in Section 2.2 that this can bring asymptotic
improvement compared to the classical communication complexity.¹¹
In the so-called qubit communication model, discussed in this section, communi-
cating parties exchange qubits. This model is quite a straightforward generalization of the
model of communication in classical communication complexity.
Classical communication complexity theory (see Hromkovič (1997), Kushilevitz
and Nisan (1997), Gruska (1997)) has already been much developed. Its importance stems
from the experience that it is to a large extent the complexity of the communication that
is behind the complexity of parallel and distributed computing. Lower-bound results on
communication complexity are often used to derive lower-bound results for computation
complexity. On the other hand, the development of the quantum communication complexity
theory is only just beginning and seems to be essentially more difficult.
It is not immediately clear whether qubits can reduce communication costs because one
of the fundamental results of the quantum information theory, due to Holevo (1973), says
that by sending n qubits one cannot convey faithfully more than n bits of information.
Example 5.3.1 Let x, y ∈ {0, 1}^n, π = {x, y}. If f(x, y) is the parity of the string xy, then
clearly C_π(f) = 1. However, if f(x, y) = 1 if and only if x = y, then it can be shown that
C_π(f) = n.
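The one-bit protocol for parity can be sketched as follows, assuming that one party (called Alice here) holds x and the other (Bob) holds y. The function names are hypothetical.

```python
def parity(bits: str) -> int:
    """Parity of a binary string: 1 iff it contains an odd number of 1s."""
    return bits.count("1") % 2

def parity_protocol(x: str, y: str):
    """Return (parity of xy, number of bits communicated)."""
    message = parity(x)               # the holder of x sends one bit
    result = message ^ parity(y)      # the holder of y combines it locally
    return result, 1

value, cost = parity_protocol("1101", "0110")
```

No comparable shortcut is available for the equality function, which is why its cost grows to n bits.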
In the above setting we have considered a fixed partition of inputs. Another model of the
classical communication complexity is to take the least number of communication bits needed
to solve the problem with respect to all balanced (equal size) partitions of inputs between
Alice and Bob; this is denoted C_b(f).
Example 5.3.2 (Addition of binary numbers.) Assume that parties A and B are to compute
the sum of two n-bit numbers x = a_n ... a_1, y = b_n ... b_1, where n is even, and each of them
knows exactly half of the input bits. Assume also that B is to compute n/2 of the least
11 Sometimes also a modification of the above entanglement communication model is considered in which
qubits are used for communication. As shown in Section 6.4.4, with n entangled pairs of particles one can
send n qubits by sending 2n bits. The entanglement model therefore needs for communication at most twice as
many bits as qubits, if enough entangled pairs are available.
significant bits of the sum and A the rest. How many bits do they need to exchange? The
answer depends largely on how the input bits are divided between the two parties. Let
us consider two possible cases.
1. If B knows a_{n/2} ... a_1, b_{n/2} ... b_1 and A the rest of the input bits, then it is clearly enough
that B sends to A the single bit, namely 0, if
and 1 otherwise. A can then compute the remaining bits of the sum.
2. However, if A knows a_n ... a_1 and B knows b_n ... b_1, then it seems to be intuitively
clear that B needs to get bits a_{n/2} ... a_1 and A needs to get at least bits b_n ... b_{n/2} and an
additional bit carrying information whether the sum of n/2 least significant parts of both
numbers is or is not larger than 2^{n/2}.
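The first case can be sketched in a few lines, under the assumption that the single bit B sends is the carry out of the n/2 least significant positions. The function name is hypothetical and the numbers are handled as Python integers rather than bit strings.

```python
def split_addition(x: int, y: int, n: int):
    """Add two n-bit numbers when B holds the low halves and A the high halves.

    Returns (sum, number of bits sent from B to A)."""
    half = n // 2
    mask = (1 << half) - 1
    # B: adds the n/2 least significant bits of both numbers.
    low_sum = (x & mask) + (y & mask)
    carry = low_sum >> half            # the only bit B has to send to A
    low_bits = low_sum & mask
    # A: adds the remaining bits together with the received carry.
    high_bits = (x >> half) + (y >> half) + carry
    return (high_bits << half) | low_bits, 1

total, bits_sent = split_addition(0b1011, 0b0110, 4)
```

With the second partition no such split is possible, because neither party can add even the lowest positions without receiving the other party's bits.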
Example 5.3.3 Let f, g : {0, 1}^n → {0, 1} and let Alice get as the input f and Bob g. The
following communication problems belong to the basic ones.
• EQ(f, g) = ⋀_{x∈{0,1}^n} (f(x) = g(x)),
• IP(f, g) = ⨁_{x∈{0,1}^n} (f(x) ∧ g(x)),
• DISJ(f, g) = ⋁_{x∈{0,1}^n} (f(x) ∧ g(x)),
where these functionals represent equality, inner product and disjointness¹² of two functions,
respectively.
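For small n the three functionals can be evaluated directly from their definitions. The sketch below computes them centrally, ignoring communication, and the example functions are hypothetical.

```python
from itertools import product

def inputs(n):
    """All points x of {0,1}^n as tuples of bits."""
    return list(product((0, 1), repeat=n))

def EQ(f, g, n):
    # Big AND over all x of (f(x) = g(x)).
    return int(all(f(x) == g(x) for x in inputs(n)))

def IP(f, g, n):
    # Big XOR over all x of (f(x) AND g(x)).
    return sum(f(x) & g(x) for x in inputs(n)) % 2

def DISJ(f, g, n):
    # Big OR over all x of (f(x) AND g(x)), as written in the definition.
    return int(any(f(x) & g(x) for x in inputs(n)))

first = lambda x: x[0]          # hypothetical example functions on {0,1}^2
second = lambda x: x[1]
not_first = lambda x: 1 - x[0]
```

Note that, as defined, DISJ takes the value 1 exactly when some x satisfies both f and g.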
The following results hold; see references in Gruska (1997) and Buhrman et al. (1998).
the model considered here the communication complexity of DISJ problem and its complement are equal
(Buhrman et al. 1998).
1. To derive upper bounds for quantum communication complexity from the upper
bounds for quantum computation complexity.
2. To derive lower bounds for quantum computational complexity from the lower bounds
for quantum communication complexity.
Let F_n denote the set of Boolean functions f : {0, 1}^n → {0, 1}.
Theorem 5.3.4 Let F : F_n → {0, 1} and L : {0, 1} × {0, 1} → {0, 1}. L induces a mapping
L : F_n × F_n → F_n such that L(g, h)(x) = L(g(x), h(x)) for all x ∈ {0, 1}^n. If there is
a quantum algorithm A to compute F(f) with t calls of f, then there is a t(2n + 4)-qubit
quantum communication protocol P for the following problem: Alice gets g, Bob gets h and
the aim is for Alice to determine F(L(g, h)) by communication with Bob. In addition, the
probability that communication according to P produces the correct result is the same as for
the algorithm A.
1. Alice performs the unitary operation |x, y, 0⟩ → |x, y, g(x)⟩ and sends n + 2 qubits to
Bob;
2. Bob performs the unitary operation |x, y, g(x)⟩ → |x, L(g(x), h(x)) ⊕ y, g(x)⟩ and sends
n + 2 qubits to Alice;
Since there are t calls for which such a communication is needed, the total amount of
exchanged qubits is t(2n + 4).
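The bookkeeping of one simulated call can be traced on a single basis state. This is a classical sketch only: x is held as an integer below 2^n, the function names are hypothetical, and the final step in which Alice uncomputes g(x) is an assumption added here to restore the work qubit, since it is not among the steps listed above.

```python
def simulate_call(x, y, g, h, L, n):
    """Route one call of f = L(g, h) on the basis state |x, y> through Alice and Bob.

    Returns the final register (x, y', work) and the number of qubits exchanged."""
    sent = 0
    # Alice: |x, y, 0> -> |x, y, g(x)>, then sends n + 2 qubits to Bob.
    reg = (x, y, g(x))
    sent += n + 2
    # Bob: |x, y, g(x)> -> |x, L(g(x), h(x)) XOR y, g(x)>, then sends them back.
    reg = (reg[0], L(reg[2], h(reg[0])) ^ reg[1], reg[2])
    sent += n + 2
    # Assumed clean-up: Alice resets the work qubit by XOR-ing g(x) again.
    reg = (reg[0], reg[1], reg[2] ^ g(reg[0]))
    return reg, sent

AND = lambda a, b: a & b
g = lambda x: x & 1            # hypothetical function held by Alice
h = lambda x: (x >> 1) & 1     # hypothetical function held by Bob
reg, sent = simulate_call(3, 0, g, h, AND, 2)
```

Each simulated call costs 2n + 4 qubits, so an algorithm with t calls gives a protocol with t(2n + 4) qubits of communication.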
Exercise 5.3.5 (a) Does the protocol in the proof of Theorem 5.3.4 make use of
entanglement? If yes, then where? If not, could we do the whole protocol classically? (b) Does
Bob need to send Alice n + 2 qubits? Is it not sufficient for him to send back only one
qubit with L(g(x), h(x)) ⊕ y?
Example 5.3.6 The upper bound QC(DISJ) = O(n√(2^n)) follows from Theorem 5.3.4, by
taking L to be the binary AND function and F to be the 2^n-ary OR function, because for computation
of the 2^n-ary OR function we have the upper bound √(2^n), see Section 3.3.
Example 5.3.7 An exponential gap between the exact classical and quantum communication
complexity has been shown by Buhrman et al. (1998) for the following problem.
Let f, g : {0, 1}^n → {0, 1} and let ∆(f, g) be the Hamming distance between f and g, which
equals the Hamming distance of the 2^n-bit strings f(0)f(1)...f(2^n − 1) and g(0)g(1)...g(2^n − 1).
Let EQ′ be the partial function defined by

EQ′(f, g) = 1, if ∆(f, g) = 0,
EQ′(f, g) = 0, if ∆(f, g) = 2^{n−1},

and undefined for other arguments. (In the case of partial functions we require that
communication yields the correct outcome only for arguments at which the partial function is
defined.)
The upper bound QC_0(EQ′) = O(n) is a consequence of Theorem 5.3.4. Indeed, take
L to be the XOR function and F to be the 2^n-ary OR function restricted to balanced or zero
functions. The upper bound now follows from the analysis of the Deutsch–Jozsa algorithm
in Section 3.1. It has been shown, by Buhrman et al. (1998), that C_0(EQ′) = Ω(2^n).
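The reason the Deutsch-Jozsa analysis applies can be seen numerically. In the sketch below the functions are given as truth tables (lists of bits), the names are hypothetical, and `zero_amplitude` is the standard closed form for the amplitude of |0...0⟩ at the end of the Deutsch-Jozsa algorithm run on f ⊕ g.

```python
def eq_prime(f_bits, g_bits):
    """Partial function EQ': 1 if the tables agree, 0 if they differ in exactly
    half of the positions, None where it is undefined."""
    dist = sum(a != b for a, b in zip(f_bits, g_bits))
    if dist == 0:
        return 1
    if dist == len(f_bits) // 2:
        return 0
    return None

def zero_amplitude(f_bits, g_bits):
    """Amplitude of |0...0> after Deutsch-Jozsa on the function f XOR g."""
    return sum((-1) ** (a ^ b) for a, b in zip(f_bits, g_bits)) / len(f_bits)

f_tab = [0, 1, 1, 0]
g_half = [1, 0, 1, 0]     # differs from f_tab in exactly 2 of 4 positions
g_one = [1, 1, 1, 0]      # differs in 1 position: EQ' is undefined here
```

On the domain of EQ′ the amplitude is either ±1 or 0, so a single measurement decides the value without error.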
Example 5.3.7 shows a communication task for which the number of bits needed to
communicate in order to compute a given function with zero error is exponentially larger
than the number of qubits that need to be communicated. However, for this task there is a
classical randomized communication protocol that achieves the same result with small error
and requires communicating the same number of bits as the number of qubits needed for
the best quantum communication protocol.
The first fully exponential gap between classical bounded-error randomized
communication and quantum communication has been shown by Ambainis et al. (1998a) for the
following sampling task. Alice has a subset A ⊆ {1, 2, ..., n} = S of cardinality k and
Bob's task is to pick another subset B ⊆ S of cardinality k disjoint from A (if possible).
The result was obtained as a byproduct of a method to deal with the following important
communication primitive.
Definition 5.3.8 (Sampling) Let f : X × Y → {0, 1} and let D be any probability distribu-
tion on X × Y . A communication protocol P is said to sample f according to D with error
ε > 0, if the distribution the protocol induces on {(x, y, z)} is ε-close, in the total variation
distance, to the distribution (D, f (D)) obtained by picking first (x, y) according to D and
then computing z = f (x, y).
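The total variation distance used in the definition can be computed directly for small finite distributions. In this sketch distributions are dictionaries from outcomes (x, y, z) to probabilities; the function name and the example protocol distribution are hypothetical.

```python
def total_variation(p: dict, q: dict) -> float:
    """Total variation distance: half the L1 distance between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Target (D, f(D)) for f = AND on one-bit inputs with D uniform.
target = {(x, y, x & y): 0.25 for x in (0, 1) for y in (0, 1)}

# Hypothetical protocol that outputs the wrong z on input (1, 1) one time in ten.
protocol = dict(target)
protocol[(1, 1, 1)] = 0.225
protocol[(1, 1, 0)] = 0.025

error = total_variation(target, protocol)
```

Here the protocol samples f according to D with error about 0.025.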
For the disjointness problem discussed above and k = Θ(√n) they give a quantum
protocol in which Alice sends O(lg n lg(1/ε)) qubits to Bob which allow him to sample
from a distribution ε-close to the desired uniform distribution on subsets of S disjoint from
A. In addition, they show that each classical randomized protocol needs Ω(√n) bits to be
exchanged between Alice and Bob.
Nonlinear transformation (5.4) used in the algorithm seems to be artificial, chosen just to
get the desirable result. However, Abrams and Lloyd (1998) demonstrated that virtually any
deterministic nonlinear quantum theory will include such a nonlinear operator. In addition,
they have shown that virtually any deterministic nonlinear operation can be recast into this
form and the operation (5.4) can be constructed from an arbitrary unitary operation and a
simple nonlinear operation |0⟩ → |0⟩ and |1⟩ → |0⟩.
Abrams and Lloyd (1998) developed their algorithms within Weinberg’s (1989) model
of nonlinear dynamics and they used operators that do not preserve the scalar product of states.
In such a case Weinberg’s model can exhibit unphysical effects. According to Czachor (1998), slight
modifications of both algorithms also work in a model that is known to be free of unphysical
influences.
Using quite a different model of quantum computing, Černý (1993) has shown how
to solve one NP-complete problem, namely the traveling salesman problem, in quantum
polynomial time, but using an exponentially large amount of energy.
Chapter 6
CRYPTOGRAPHY
INTRODUCTION
Secure communication is one of the areas of key importance for modern society in which
quantum information transmission and processing seems to be able to bring significant con-
tributions. For example, quantum cryptography may be the main defence against quantum
code breaking in the future.
An important new feature of quantum cryptography is that security of quantum key
generation and quantum cryptographic protocols is based on a more reliable fact, on the
laws of nature as revealed by quantum mechanics, than in the case of classical cryptography,
whose security is based on unproven assumptions concerning the computational hardness of
some algorithmic problems.
It is difficult to overemphasize the importance of quantum cryptography for an under-
standing and utilization of quantum information processing. Quantum cryptography was
the first area in which quantum laws were directly exploited to bring an essential advantage
in information processing.
Closely related are quantum teleportation and quantum superdense coding—special ways
of the transmission of quantum or classical information using one of the most puzzling
phenomena of the quantum world—non-locality features of the quantum entanglement.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
1. several methods of secret key generation by two parties;
2. a method of multiparty secret key generation;
3. the unconditional security of quantum key generation;
4. the basic quantum cryptographic protocols;
5. the problems related to the security of quantum cryptographic protocols;
6. the main principles, circuits and some applications of quantum teleportation;
7. the quantum superdense coding procedure.
“Amazing, Holmes.”
“Elementary, my dear Watson, elementary.”
1 Secret-key cryptography, such as DES, has not yet been shown to be efficiently breakable by quantum
computing. Whether this can be done is an interesting but hardly practical problem because DES is expected
to be obsolete before any quantum computer is built.
6.1 Prologue
Quantum cryptography is, like classical cryptography, a continuous fight between good and
bad. The basic setting is that Alice tries to send a quantum system to Bob and an adversarial
eavesdropper, Eve, tries to learn, or to change, as much as possible without being detected.
An eavesdropper has this time an especially hard task. Quantum states cannot be copied
and cannot be measured without causing, in general, a disturbance.
The key problem can be formulated as follows (see Fuchs and Peres, 1996). Alice prepares
a quantum system in a specific way unknown to Eve and sends it to Bob. The question
is how much information Eve can extract from that quantum system and what the cost of
that information is, in terms of the disturbance of the quantum system. Two concepts are
therefore crucial here: information and disturbance.
Let us consider two extreme cases within the scheme that Alice sends a state |ψ⟩ to Bob.
The first is that Eve has no information about how |ψ⟩ was prepared. The only thing Eve
can then do is to choose some basis {|e_i⟩} of orthonormal states and to use the corresponding
projection measurement on |ψ⟩. In such a case |ψ⟩ collapses into one of the states |e_i⟩ and
the only information Eve has learned by that is that |ψ⟩ is not orthogonal to |e_i⟩. At the
same time |ψ⟩ can be heavily disturbed.
The second extreme case is that Eve knows that |ψ⟩ is one of the states of the basis {|e_i⟩}.
In such a case Eve gets, by measuring |ψ⟩ with respect to the basis {|e_i⟩}, full information
about |ψ⟩ because |ψ⟩ collapses into itself. No disturbance to |ψ⟩ occurs.
The most interesting, important and hard case for quantum cryptography is the third
case, where Eve knows that $|\psi\rangle$ is one of the states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ that are mutually
nonorthogonal and $p_i$ is the probability that $|\psi_i\rangle$ is the state Alice sends. In this case the
question is how much information Eve can get by a measurement and how much disturbance
she causes by that.
How can the information gain of Eve be measured? The Shannon entropy $-\sum_{i=1}^{n} p_i \lg p_i$ is a measure of
her ignorance about the system before the transmission takes place. She can try to decrease
this entropy by some measurement. What she gets is called mutual information.
In the case of eavesdropping Bob does not get a pure state but a mixed state specified
by a density matrix $\rho_i$ for the case Alice sends $|\psi_i\rangle$. The disturbance detectable by Bob is
given by $D = 1 - \langle\psi_i|\rho_i|\psi_i\rangle$.
An important case is that $n = 2$, $p_1 = p_2$ and $|\psi_1\rangle$ and $|\psi_2\rangle$ are states of $H_2$ such that
a measurement of $|\psi_b\rangle$ with respect to $|\psi_{\bar{b}}\rangle$ produces both states with the same probability.
In such a case Eve has a 50% chance of making a correct guess for her measurement. In the
case of a correct guess she learns $|\psi\rangle$ with certainty, otherwise with probability 50%. On average she gets
75% of the information. Can she do better by making some other measurements? As we shall
see in this chapter, she can get up to 85% of the information but not more, no matter what she
does. This may sound pretty good, but it is not enough for cryptography, as we shall see.
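The two percentages can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the two states have overlap $1/\sqrt{2}$ (for example $|0\rangle$ and $|0'\rangle$) and that the 85% figure corresponds to the Helstrom bound for distinguishing two equiprobable pure states; the function names are hypothetical.

```python
import math

def average_success_naive():
    """Eve guesses which of the two bases to measure in.

    With probability 1/2 the guess is right and she learns the state with
    certainty; otherwise her outcome is right only half of the time.
    """
    return 0.5 * 1.0 + 0.5 * 0.5

def helstrom_success(overlap):
    """Best success probability for identifying one of two equiprobable
    pure states whose inner product has magnitude `overlap`
    (Helstrom bound; assumed here to be the source of the 85% figure)."""
    return 0.5 * (1.0 + math.sqrt(1.0 - overlap ** 2))

if __name__ == "__main__":
    print(average_success_naive())                        # 0.75
    print(round(helstrom_success(1 / math.sqrt(2)), 4))   # 0.8536
```

Under these assumptions the optimal value is $\frac{1}{2}(1 + \frac{1}{\sqrt{2}}) = \cos^2(\pi/8) \approx 0.854$.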
Of course, when Alice sends a sequence of states, Eve also has other options for
eavesdropping than trying to measure each sent state immediately, one by one. For example, she
can postpone measurements until the end of the transmissions.
In such a general case the problem of Eve’s information gain versus disturbance is one
of the central ones for the security of quantum cryptography protocols.
How to send quantum states in general and qubits in particular is another general prob-
lem. Transmission of polarized photons is so far one of the basic tools in quantum cryp-
tographic protocols for sending qubit states.2 The main property to be used is that if a
2 For more about polarization of photons see Section 9.1.2 and 9.2.6.
This way such well-known and practically important cryptosystems as DES and ONE-TIME
PAD work. The second cryptosystem is perfectly secure under the assumption that each
key has the same length as the plaintext w and each time the cryptosystem is used with
a new randomly chosen key. In the case of binary plaintexts and keys the encryption and
decryption algorithms of ONE-TIME PAD cryptosystem are very simple
c = k ⊕ w and w = k ⊕ c,
where ⊕ stands for a component-wise exclusive-or operation. (Observe that using ONE-
TIME PAD cryptosystem we need one secret bit to transfer securely one bit—quite a price.)
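A minimal sketch of the two formulas, working on bytes rather than single bits. The function names are illustrative and the use of Python's `secrets` module as the source of random keys is an assumption of this sketch, not part of the original description.

```python
import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Compute c = k XOR w component-wise; the key must be as long as w."""
    if len(key) != len(plaintext):
        raise ValueError("key must have the same length as the plaintext")
    return bytes(k ^ w for k, w in zip(key, plaintext))

# Decryption is the same operation: w = k XOR c.
otp_decrypt = otp_encrypt

def fresh_key(length: int) -> bytes:
    """Draw a new random key; it must never be reused."""
    return secrets.token_bytes(length)

if __name__ == "__main__":
    w = b"attack at dawn"
    k = fresh_key(len(w))
    c = otp_encrypt(w, k)
    print(otp_decrypt(c, k) == w)
```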
Since the search for limitations concerning security is also one of the main tasks of the
quantum computing research, let us look a bit more closely into the problem in which sense
ONE-TIME PAD cryptosystem is perfectly secure. Let w and c be random variables for
plaintexts and cryptotexts. ONE-TIME PAD cryptosystem is perfectly secure in the sense
that Pr(w|c) = Pr (w), or, equivalently Pr (c|w) = Pr(c).
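The condition $\Pr(w|c) = \Pr(w)$ can be checked exhaustively for short plaintexts. The following sketch assumes a uniformly random key and an arbitrary, hypothetical prior on 2-bit plaintexts; exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def posterior_given_cryptotext(prior, bits):
    """Return Pr(w | c) for ONE-TIME PAD with a uniformly random key.

    `prior` maps every `bits`-bit plaintext (as an int) to its probability.
    """
    size = 2 ** bits
    key_prob = Fraction(1, size)
    joint = {}  # (w, c) -> Pr(w, c)
    for w, p_w in prior.items():
        for k in range(size):
            c = w ^ k
            joint[(w, c)] = joint.get((w, c), Fraction(0)) + p_w * key_prob
    pr_c = {}  # c -> Pr(c)
    for (w, c), p in joint.items():
        pr_c[c] = pr_c.get(c, Fraction(0)) + p
    return {(w, c): p / pr_c[c] for (w, c), p in joint.items()}

# Hypothetical non-uniform prior on the plaintexts 00, 01, 10, 11.
prior = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
posterior = posterior_given_cryptotext(prior, bits=2)
perfectly_secure = all(posterior[(w, c)] == prior[w] for (w, c) in posterior)
```

Seeing the cryptotext leaves Eve's probability distribution over plaintexts unchanged, whatever the prior is.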
Exercise 6.2.1 Show why it is necessary for the security of ONE-TIME PAD cryptosys-
tem to require that each key is used only once.
Classical cryptography can therefore be only as secure as the available key distribution
methods.
The main practical importance of public key cryptography is therefore for key dis-
tribution. It provides computationally secure key distribution under some unproven
6.2. QUANTUM KEY GENERATION 219
Protocols
The basic idea behind key generation is simple. If Alice and Bob want to share a binary
key of length n, then each of them first independently generates a private random binary
sequence of length m ≫ n. In order to extract from these random sequences a common key of
length n, Alice prepares a sequence of m tokens, one type for bit 1 and another type for bit
0, and sends her random sequence to Bob as a sequence of such tokens. Bob reports
to Alice the positions, but not the values, of those of Alice’s bits that are the same as his. From
this sequence of bits they select n bits, say the first n.
The idea is simple but not secure in the classical setting because Eve can tamper with
transmissions and read Alice’s tokens and Bob’s report.
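The classical version of the idea can be sketched as follows. This is only an illustration with hypothetical names; the tokens are modelled directly as bits, which is exactly why an eavesdropper who reads them learns the key.

```python
import random

def classical_key_agreement(m, n, rng):
    """Alice and Bob each draw m random bits; Bob reports the positions
    (not the values) where Alice's tokens agree with his own bits, and
    the first n such positions define the key."""
    alice = [rng.randint(0, 1) for _ in range(m)]
    bob = [rng.randint(0, 1) for _ in range(m)]
    matching = [i for i in range(m) if alice[i] == bob[i]]  # Bob's public report
    if len(matching) < n:
        raise ValueError("not enough matching positions; increase m")
    positions = matching[:n]
    alice_key = [alice[i] for i in positions]
    bob_key = [bob[i] for i in positions]
    return alice_key, bob_key, positions

if __name__ == "__main__":
    a_key, b_key, _ = classical_key_agreement(200, 16, random.Random(1))
    print(a_key == b_key)
```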
In the quantum setting the above idea can be implemented in such a way that its security
is based on the following basic principles of quantum mechanics.
1. Non-orthogonal states cannot be copied.4
2. Any measurement of states can change them, with high probability, irreversibly and
can create a significant and detectable rate of errors.
3 In the literature, for historical reasons, the term “quantum key distribution” (QKD) is mostly used even
(1983) to suggest money that could not be counterfeited without detection. The idea is simple. The bank
embeds qubits, randomly chosen from a nonorthogonal set, into the bank notes and puts the notes into circulation,
keeping a record of which quantum states were embedded into which notes. When a note returns to the bank its
qubit is measured, according to the record. If there was no attempt to counterfeit the note, the readout
corresponds to the record and, in addition, the embedded state is not disturbed. (The main problem with
this idea, not solved yet, is how to store qubits for a long time.)
The quantum version of the basic idea for key generation goes as follows. Alice sends Bob
a sequence of randomly polarized photons. Bob measures them using a randomly chosen
basis. This necessarily disturbs some of the photons. Any eavesdropping by Eve introduces
additional disturbance. After all photons have been transmitted, Alice and Bob determine, in a
public communication, the amount (probability) of eavesdropping. If it is not too much,
they select from the sent and received photons a shared secret key.
Three QKG protocols are discussed in the following. All of them are of both historical
and practical importance.
1. Protocol BB84, or 4-state protocol, with encodings and decodings based on the
existence of two non-commuting observables (Bennett and Brassard, 1984). BB84 was
the first fully successful attempt to exploit quantum laws to obtain a fundamental
advantage in information processing.
2. Protocol E91 with encodings based on quantum entanglement (Ekert, 1991).
3. Protocol B92, or 2-state protocol, with encodings based on two non-orthogonal states
(Bennett, 1992).
Security problem
The aim of a QKG protocol is to allow two parties, Alice and Bob, that share no information
initially, to share a secret key (a binary string) at the end.
There are two potential obstacles to overcome. First, the communication channel between
Alice and Bob can be noisy and faulty (some photons can get lost). Second, communication
during the key extraction phase has to be assumed to be performed “before the eyes” of
Eve, who can do her best to prevent Alice and Bob from meeting their aim. (Eve can try
to learn the key Alice and Bob generate, or at least to get some information about it. She
can also try to ensure that at the end of the generation protocol Alice and Bob actually do
not share the same key.)
We shall not consider here the case that Eve can just disrupt communication between
Alice and Bob. We shall consider only the more difficult case that Eve tries to meet her aims
without being detected. We shall also not consider the case that Eve tries to alter public
communication between Alice and Bob or to pretend to be one of those parties. Alice and
Bob can use some authentication protocols for communication, to avoid such an interference.
We shall also consider the worst case, the usual one for cryptography, that Eve knows
which protocol is used. The only things she does not know are the private random keys of the parties.
What are the means Eve can use to achieve her goals? First observe that she cannot be merely
a passive eavesdropper because the key is extracted only from bits Bob receives. She cannot
“tap” quantum transmissions. A single photon cannot be split and no quantum system can
clone nonorthogonal quantum states. Therefore it would seem that the only thing Eve can
do is to measure transmitted states (according to one of the observables Bob uses$^{5}$) and
then forward to Bob the states she gets as results of her measurement. In such a case, as
already discussed, she has a 25% chance of making an error. (As a consequence, if Alice sends
$n$ bits and Eve measures all of them, then there is only a $(\frac{3}{4})^n$ chance that no error will be
introduced by her. For $n = 100$ the probability that no error is introduced by Eve is therefore
only about $3 \cdot 10^{-13}$.) However, this is not the whole story. There are various
attacks/measurements she can make.
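The numbers in the parenthetical remark are easy to recompute. The sketch below assumes, as an illustration, that the 25% arises because Eve picks the wrong observable half of the time and a wrong observable gives a wrong bit half of the time; the function names are hypothetical.

```python
def per_bit_error_probability():
    """Chance that Eve's measure-and-forward step causes an error on one bit:
    wrong observable (1/2) times wrong outcome given a wrong observable (1/2)."""
    wrong_observable = 0.5
    wrong_bit_given_wrong_observable = 0.5
    return wrong_observable * wrong_bit_given_wrong_observable

def probability_no_error(n):
    """Probability that none of n measured bits is disturbed: (3/4)^n."""
    return (1.0 - per_bit_error_probability()) ** n

if __name__ == "__main__":
    print(per_bit_error_probability())      # 0.25
    print(f"{probability_no_error(100):.1e}")
```

For $n = 100$ this evaluates to roughly $3.2 \cdot 10^{-13}$, in line with the value quoted above.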
Several types of attack have been identified so far. The two extreme types are the
following ones.
1. Intercept–resend attacks. Eve tries to learn as much as possible from particular
transmissions of Alice, qubit by qubit (photon by photon) using von Neumann mea-
surements.
2. Coherent or joint attacks. Instead of measuring the particles while they are in
transit from Alice to Bob, one–by–one, Eve regards all the transmitted particles as
a single entity. She then couples this entity with a simple auxiliary system (ancilla),
prepared in a special state, and creates the compound system. Afterwards, she sends
the particles to Bob and keeps the ancilla. After the end of the public interactions
between Alice and Bob (for error detection, error correction and privacy amplification),
Eve extracts from her ancilla some information about the key. Such attacks are directed
against the final key. They represent the most general type of attack possible.
(However, no particular attack of this type has been suggested so far.)
Error rate
In practice eavesdropping is not the only source of errors in transmission. Imperfections
of source, channels and detectors may also produce errors, usually up to a few per cent.
The number of such errors, as a fraction of the total number of detected bits, is called the
quantum bit error rate, and it is one of the parameters that characterizes how well a
transmission system works.
When a noisy channel is used to transmit quantum states the problem is to detect
eavesdropping. One way out for Alice and Bob is first to calculate the likely error rate
caused by a noisy channel, and then to consider the real error rate to be suspicious if it is
higher than estimated. Of course this is not a very secure method. Usually the best is to
assume the worst case, namely that all errors are due to an eavesdropper.
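A minimal sketch of how the quantum bit error rate could be estimated from a publicly compared sample of bits and checked against an expected channel error rate. The function names and the threshold comparison are illustrative assumptions, not a prescribed procedure.

```python
def quantum_bit_error_rate(sent_bits, detected_bits):
    """Fraction of detected bits that differ from the bits that were sent."""
    if len(sent_bits) != len(detected_bits) or not sent_bits:
        raise ValueError("need two non-empty bit sequences of equal length")
    errors = sum(1 for a, b in zip(sent_bits, detected_bits) if a != b)
    return errors / len(sent_bits)

def is_suspicious(observed_rate, expected_channel_rate):
    """Flag a transmission whose error rate exceeds the estimate for the
    channel alone (a heuristic; the worst-case view attributes every
    error to an eavesdropper)."""
    return observed_rate > expected_channel_rate

if __name__ == "__main__":
    rate = quantum_bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0])
    print(rate, is_suspicious(rate, 0.03))
```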
Error correction
One way to deal with the problems of noisy channels and faulty detectors is for Alice first to
encode the sequence she wants to transmit using an error correction code.
Privacy amplification
Privacy amplification is a tool developed by Bennett, Brassard and Robert (1988) to select
a short and very secret binary string $s$ from a longer but less secret string $s'$.
The main idea is simple. If $|s| = n$, then one picks $n$ random subsets $S_1, \ldots, S_n$ of
bits of $s'$ and lets $s_i$, the $i$th bit of $s$, be the parity of $S_i$. One way to do this is to take a
random binary matrix $M$ of size $|s| \times |s'|$ and to perform the multiplication $M s'^T$ modulo 2, where $s'^T$ is the
binary column vector corresponding to $s'$.
The point is that even in the case where an eavesdropper knows quite a few bits of $s'$,
she will have almost no information about $s$.
More exactly, if Eve knows parity bits of $k$ subsets of $s'$, then if a random subset of bits
of $s'$ is chosen, the probability that Eve has any information about its parity bit is less
than $2^{-(n-k-1)}/\ln 2$.
Of particular importance is the case that a linear error correcting code is first used to encode
the transmitted sequence and then a syndrome of it is distributed by a public channel.
In such a case if Eve already knows $t$ bits of $s'$, and if no more than $n - t - r - 1$ bits are given
to Eve as the syndrome of $s'$, where $r$ is a security parameter, then the expected amount of
information Eve knows on the parity of a random subset of bits of $s'$ is less than $\frac{2^{-r}}{\ln 2}$.
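The matrix formulation of privacy amplification can be sketched in a few lines. The names are hypothetical, and Python's `random` module stands in for the shared source of randomness; a real protocol would need Alice and Bob to agree on the same matrix.

```python
import random

def random_parity_matrix(n, length, rng):
    """Random binary n x length matrix; row i selects the subset S_i."""
    return [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]

def apply_parity_matrix(matrix, bits):
    """Multiply the matrix by the bit vector modulo 2: output bit i is the
    parity of the bits of `bits` selected by row i."""
    return [sum(m & b for m, b in zip(row, bits)) % 2 for row in matrix]

if __name__ == "__main__":
    rng = random.Random(7)
    long_key = [1, 0, 1, 1, 0, 0, 1, 0]           # the less secret string s'
    M = random_parity_matrix(3, len(long_key), rng)
    print(apply_parity_matrix(M, long_key))       # the short string s
```

Since the map is linear over GF(2), both parties obtain the same short string whenever they start from the same long string and the same matrix.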
Preparation phase
BB84 protocol. Alice generates independently two private random binary sequences of
length m ≫ n bits and Bob generates one such private sequence of random bits.
B92 protocol. Both Alice and Bob generate their two private random binary sequences of
length m.
Assumptions: Alice is assumed to have four transmitters of photons in one of the following
four polarizations: 0, 45, 90 and 135 (or −45) degrees in the case of the BB84 protocol
(see Figure 6.1a); and in one of two polarizations, 90 and 135 degrees, in the case of the
B92 protocol (see Figure 6.1b).$^{7}$
[Figure 6.1: photon polarizations used in (a) the BB84 protocol and (b) the B92 protocol]
In accordance with the laws of quantum physics Bob has a detector that can be set up
to distinguish between rectilinear polarizations (0 and 90 degrees) or can be quickly
reset to distinguish between diagonal polarizations (45 and 135 degrees). However,
again in accordance with the laws of quantum physics, there is no detector that could
distinguish between nonorthogonal polarizations. In a more formal setting, Bob can use
either the standard observable $B = \{|0\rangle, |1\rangle\}$ or the dual observable $D = \{|0'\rangle, |1'\rangle\}$,
discussed on page 44, to measure the incoming photon.
Transmissions
BB84 protocol: To send a bit 0 (1) of her first random sequence through a quantum
channel$^{8}$, Alice chooses, on the basis of her second random sequence, one of the
encodings $|0\rangle$ or $|0'\rangle$ ($|1\rangle$ or $|1'\rangle$), i.e., the standard or dual basis, and sends the
photon of the corresponding polarization.
It is assumed here and in the following that photons are sent one by one at regular
intervals. As a consequence Bob knows when some photon does not get through
and the order index of all received bits.
Bob chooses, each time on the basis of his private random sequence, one of the
observables $B$ or $D$ to measure the photon he is to receive. Bob records the
results of his measurements and keeps them secret. Observe that there are three
situations Bob can encounter: the photon is not received, Bob uses the correct basis
(with respect to Alice’s choice) for his measurement, or Bob uses the incorrect
basis for measurement.
Figure 6.2 shows the possible results of the measurements and their probabilities.
When Bob guesses correctly the polarization basis chosen by Alice, he obtains with certainty
the same bit as Alice sent. However, when Bob fails to guess the basis,
and does not use the corresponding observable, he obtains the correct result
only with probability $\frac{1}{2}$. An example of an encoding–decoding process is shown in
Figure 6.3.
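The transmission phase of BB84 can be simulated classically, because only the outcome statistics matter here. The sketch below assumes an ideal channel (no losses, no noise, no eavesdropper); the letters `"B"` and `"D"` for the two observables follow the text, while the function name and return format are hypothetical.

```python
import random

def bb84_transmit(m, rng):
    """Simulate m BB84 transmissions over an ideal channel.

    Alice's first random sequence gives the bits, her second one the bases;
    Bob's random sequence gives his observables.
    """
    alice_bits = [rng.randint(0, 1) for _ in range(m)]
    alice_bases = [rng.choice("BD") for _ in range(m)]
    bob_bases = [rng.choice("BD") for _ in range(m)]
    bob_bits = []
    for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases):
        if a_basis == b_basis:
            bob_bits.append(bit)                # correct basis: same bit for sure
        else:
            bob_bits.append(rng.randint(0, 1))  # wrong basis: random outcome
    return alice_bits, alice_bases, bob_bases, bob_bits

if __name__ == "__main__":
    a_bits, a_bases, b_bases, b_bits = bb84_transmit(10, random.Random(0))
    print(a_bits, "".join(a_bases), "".join(b_bases), b_bits, sep="\n")
```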
7 Expressed in a more general form, Alice uses for encoding states from the set $\{|0\rangle, |1\rangle, |0'\rangle, |1'\rangle\}$ in the
case of the BB84 protocol and states from the set $\{|0\rangle, |1'\rangle\}$ in the case of the B92 protocol.
8 A quantum channel is a transmission medium that isolates the quantum state from interactions with the
environment.
Figure 6.3: Quantum transmissions in the BB84 protocol—R stands for the case that the
result of the measurement is random
B92 protocol. Alice uses encodings $0 \to |0\rangle$ and $1 \to |1'\rangle$ and sends each bit by a photon
in one of the two nonorthogonal states.
Bob chooses, on the basis of his random sequence, observable $D$ for 0 and $B$ for 1, and
checks whether the photon he has received was polarized as $|0'\rangle$ or $|1\rangle$.$^{9}$ He records
the results of his measurements and keeps them secret.
Table 6.4 shows the possible results of the measurements and their probabilities. Where
the corresponding bits of their random sequences are different, the test fails with
probability 1. Otherwise, it fails with probability $\frac{1}{2}$. An example of an encoding/decoding
procedure is shown in Figure 6.5.
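The test statistics just described can be mimicked with a short simulation. As before, an ideal channel without eavesdropping is assumed, and all names are hypothetical.

```python
import random

def b92_tests(m, rng):
    """Simulate Bob's tests in m rounds of B92 over an ideal channel.

    If Alice's and Bob's bits differ, Bob tests for a state orthogonal to
    the one sent, so the test fails with probability 1; if the bits agree,
    the test passes with probability 1/2.
    """
    alice_bits = [rng.randint(0, 1) for _ in range(m)]
    bob_bits = [rng.randint(0, 1) for _ in range(m)]
    test_passed = []
    for a, b in zip(alice_bits, bob_bits):
        if a != b:
            test_passed.append(False)
        else:
            test_passed.append(rng.random() < 0.5)
    return alice_bits, bob_bits, test_passed

if __name__ == "__main__":
    a, b, passed = b92_tests(4000, random.Random(5))
    print(sum(passed) / len(passed))  # about 1/4 of the rounds pass
```

A passed test therefore certifies that the two bits in that round are equal.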
Exercise 6.2.2 Could we modify the B92 protocol in such a way that Bob tests not for
$|0'\rangle$ and $|1\rangle$ but for $|0\rangle$ and $|1'\rangle$, or for $|0\rangle$ and $|1\rangle$?
BB84 protocol: Bob makes public the sequence of observables he used to measure
the photons he received (but not the results of the measurements) and Alice
tells Bob, through a classical channel, in which cases he has chosen the same
basis for his observable as she did for encoding. The corresponding bits then form
the basic key both parties agree on.
B92 protocol: Those bits for which Bob’s tests pass he takes as the key being
extracted and reports their positions to Alice through a public channel. (Only a passed
test guarantees that Alice’s and Bob’s bits are the same.)
The B92 protocol is simpler because only two polarizations of photons are used and that
is why this protocol is sometimes said to be the “minimal protocol” for QKG.
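Key extraction in BB84 amounts to keeping the positions where the announced observables coincide. A minimal sketch with hypothetical names, taking the recorded sequences as plain lists:

```python
def sift_key(alice_bits, alice_bases, bob_bases, bob_bits):
    """Keep only the positions where Bob used the same basis as Alice.

    Bob announces `bob_bases`; Alice answers with the matching positions.
    The measurement results themselves are never made public.
    """
    same = [i for i, (a, b) in enumerate(zip(alice_bases, bob_bases)) if a == b]
    alice_key = [alice_bits[i] for i in same]
    bob_key = [bob_bits[i] for i in same]
    return alice_key, bob_key, same

if __name__ == "__main__":
    a_key, b_key, kept = sift_key([0, 1, 1, 0], "BDBD", "BBDD", [0, 0, 1, 0])
    print(kept, a_key, b_key)  # [0, 3] [0, 0] [0, 0]
```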
The basic description of the BB84 and B92 protocols is thereby complete. In the following
we describe a more involved test for errors and a more involved protocol for key extraction
in the case of the BB84 protocol. This will be used later when discussing unconditional
security of the BB84 protocol. The basic ideas presented below (see Mayers and Yao, 1998)
are of general importance for making quantum key generation protocols more robust to
noise of various types.
1. Bob makes public the vector $B_B$ (but not $B_b$). Alice lets Bob know the set $T =
\{i \mid A_B[i] = B_B[i]\}$ of those cases where Bob chose the correct basis (and therefore he should
get the same bit Alice sent).
2. Bob chooses randomly a set $R$ of $\frac{n}{2}$ indices $i \in [1, 2, \ldots, n]$ and makes public the
set $\{(i, B_b[i]) \mid i \in T \cap R\}$. Alice verifies whether the number of positions $i \in T \cap R$ such
that $A_b[i] \neq B_b[i]$ is smaller than $\delta\frac{|T|}{2}$. If not, they stop the key generation
process because of the suspiciously large rate of errors. Otherwise, they continue to find out
whether there are still enough bits to use for key extraction.
3. Alice and Bob verify whether $|T \cap R| \geq \bar{n} = (\frac{1}{4} - \beta)n$. If this is not the case, the
protocol is stopped. The parameter $\beta$ is needed to make sure, using Chernoff’s bound$^{10}$,
that $|T - R| \geq \bar{n}$ with probability larger than $1 - e^{-2\beta^2 n}$.
4. If $|T - R| \geq \bar{n}$, a set $E$ of size $\bar{n}$ is randomly chosen from the set $T - R$.
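Steps 1 to 4 can be sketched as a single function. This is only an illustration under stated assumptions: indices run from 0 rather than 1, the basis and bit vectors are plain lists, $\bar{n}$ is rounded down to an integer, both size conditions mentioned in steps 3 and 4 are checked, and the function name is hypothetical.

```python
import random

def error_test_and_selection(a_bases, b_bases, a_bits, b_bits, delta, beta, rng):
    """Return the set E of step 4, or None if the protocol stops."""
    n = len(a_bases)
    # Step 1: positions where Bob chose the correct basis.
    T = {i for i in range(n) if a_bases[i] == b_bases[i]}
    # Step 2: random test set R of n/2 indices; count disagreements on T and R.
    R = set(rng.sample(range(n), n // 2))
    errors = sum(1 for i in T & R if a_bits[i] != b_bits[i])
    if errors >= delta * len(T) / 2:
        return None  # suspiciously large rate of errors
    # Step 3: enough positions must remain.
    n_bar = int((0.25 - beta) * n)
    if len(T & R) < n_bar or len(T - R) < n_bar:
        return None
    # Step 4: choose E of size n_bar at random from T - R.
    return set(rng.sample(sorted(T - R), n_bar))

if __name__ == "__main__":
    rng = random.Random(3)
    n = 1000
    a_bases = [rng.randint(0, 1) for _ in range(n)]
    b_bases = [rng.randint(0, 1) for _ in range(n)]
    bits = [rng.randint(0, 1) for _ in range(n)]
    E = error_test_and_selection(a_bases, b_bases, bits, list(bits), 0.05, 0.1, rng)
    print(None if E is None else len(E))
```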
Efficiency improvements
Protocols BB84 and B92 were presented above in their most basic form. Several general-
izations, modifications and improvements have already been developed (see Brassard, 1994,
and Brassard and Crépeau, 1996, for older references). For example, the B92 protocol can be
based on any two nonorthogonal states $\cos\theta|0\rangle + \sin\theta|1\rangle$ and $\cos\theta|0\rangle - \sin\theta|1\rangle$. Bruß (1998)
explored the security and efficiency of a generalized BB84 protocol in which three bases have
been used: classical, dual and circular.
A modification of the BB84 protocol that can almost double its efficiency was developed
by Ardehali et al. (1998). The basic idea behind such an increase of efficiency is very simple
and will now be presented.
In the original BB84 protocol, as presented above, both Alice and Bob choose their bases
with equal probabilities. As a consequence in about 50% of the cases Bob uses a different
polarization than Alice and therefore about 50% of polarized photons are discarded.
Two modifications were suggested in the protocol developed by Ardehali et al. (1998).
To select the basic key Alice chooses her polarizations with probabilities $p$ and $1 - p$
and Bob with probabilities $p'$ and $1 - p'$, $0 < p, p' < 1$. The probability that the same
basis is used for encoding and for measurement is then $pp' + (1 - p)(1 - p')$. If $p \neq \frac{1}{2} \neq p'$
and both are biased toward the same basis, this probability is larger than $\frac{1}{2}$. This way
efficiency can be in the limit doubled, to approach 100%.
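The gain from biased basis choices follows from the probability that Alice and Bob pick the same basis, $pp' + (1-p)(1-p')$. A small sketch with a hypothetical function name:

```python
def same_basis_probability(p, p_prime):
    """Probability that Alice (bias p) and Bob (bias p_prime) choose the
    same basis, i.e. the fraction of photons that is not discarded."""
    return p * p_prime + (1.0 - p) * (1.0 - p_prime)

if __name__ == "__main__":
    for p in (0.5, 0.75, 0.9, 0.99):
        print(p, round(same_basis_probability(p, p), 4))
```

With equal biases the kept fraction grows from 0.5 at $p = \frac{1}{2}$ towards 1 as $p$ approaches 1; opposite biases (for instance $p = 0.9$, $p' = 0.1$) reduce it instead.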
10 Chernoff’s bound. Let $X_1, \ldots, X_n$ be independent Bernoulli variables and $S = \sum_{i=1}^{n} X_i$. If
$\Pr(X_i = 1) = p_i$ for $1 \leq i \leq n$, then for all $0 < \varepsilon \leq 1$ it holds:
(a) $\Pr(S - pn \leq -\varepsilon n) \leq e^{-\frac{n\varepsilon^2}{2n}}$;
(b) $\Pr(S - pn \leq -\varepsilon n) \geq e^{-\frac{n\varepsilon^2}{3n}}$.
11 Observe that key is extracted only from those bits Bob received. This fact makes such QKG protocols
quantum channel and Bob’s detectors. In such a case Alice does not transmit to Bob a randomly chosen
sequence of bits but Alice first encodes her random sequence of bits using a linear error correcting code both
Alice and Bob agree on beforehand.
To make the protocol secure a refined error analysis is used. Instead of creating one
sequence of agreed-upon bits and computing a single error rate, two sequences are created:
one if both of them use the rectilinear polarization; the other for the case where they use
the diagonal polarization. The error rate is considered small if it is small for both of these
sequences.
them to both Bob and Charles. This ensures that neither Bob nor Charles, the one who is
dishonest, postpones his choice of the bases until he learns which bases were chosen by Alice
and the other one.
Exercise 6.2.3 (a) Verify the first of the following decompositions and complete the
second one.
$$|000\rangle + |111\rangle = \frac{1}{2}\left[(|0\rangle + |1\rangle)(|00\rangle + |11\rangle) + (|0\rangle - |1\rangle)(|00\rangle - |11\rangle)\right]$$
$$= \frac{1}{2}(|0\rangle + i|1\rangle)(?\ldots?)$$
(b) Show that
$$\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) = \frac{1}{2}\left[e^{-i\pi/4}(|0'\rangle|0''\rangle + |1'\rangle|1''\rangle) + e^{i\pi/4}(|0'\rangle|1''\rangle + |1'\rangle|0''\rangle)\right].$$
(c) Show that in the following cheating scheme for the multiparty key generation, the
probability of error is 25%. Bob succeeds in getting Charles’s particle. Alice measures her
particle using either the dual or the circular basis. Bob measures his two particles either
in the $(|00\rangle \pm |11\rangle)/\sqrt{2}$ basis or in the $(|00\rangle \pm i|11\rangle)/\sqrt{2}$ basis and after that he sends one
particle to Charles. Both Bob and Charles now measure their particles with respect to
either the dual or the circular basis.
Protocols
$$\frac{1}{\sqrt{2}}(|01\rangle + |10\rangle),$$
3. Test for eavesdropping. How secure is the above protocol? Eve has no chance to
get some information about the key from the particles while they are in transit because
there is no information encoded there. She has two possibilities:
(a) To measure one or both particles on their way from the source to Alice and Bob
and by that to disturb the protocol and to ensure that Alice and Bob do not
share at the end a common key.
(b) To substitute her own, carefully prepared, particles for those generated by the
ideal source.
Examples
Example 6.2.4 (Ekert, 1991) The source emits spin-$\frac{1}{2}$ particles in the state $\frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$.
Alice performs her measurement with respect to the angles $0^\circ$, $45^\circ$, $90^\circ$ and Bob with
respect to the angles $45^\circ$, $90^\circ$ and $135^\circ$.
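The correlations for this state and these angles can be checked numerically. The sketch below makes several assumptions that go beyond the text: a spin measurement at angle $\theta$ is modelled by the observable $\cos\theta\,\sigma_z + \sin\theta\,\sigma_x$ with outcomes $\pm 1$, and the quantity computed at the end is the standard CHSH combination, which may differ in form from the quantity the example goes on to define. All function names are hypothetical.

```python
import math

def spin_observable(theta):
    """2x2 matrix of cos(theta)*sigma_z + sin(theta)*sigma_x."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def correlation(theta_a, theta_b):
    """Expectation value of the product of Alice's and Bob's +/-1 outcomes
    in the state (|01> - |10>)/sqrt(2)."""
    r = 1 / math.sqrt(2)
    psi = [0.0, r, -r, 0.0]  # amplitudes of |00>, |01>, |10>, |11>
    op = kron(spin_observable(theta_a), spin_observable(theta_b))
    return sum(psi[i] * op[i][j] * psi[j] for i in range(4) for j in range(4))

def chsh_value():
    """CHSH combination for Alice's angles 0 and 90 degrees and Bob's
    angles 45 and 135 degrees."""
    a1, a3 = 0.0, math.pi / 2
    b1, b3 = math.pi / 4, 3 * math.pi / 4
    return (correlation(a1, b1) - correlation(a1, b3)
            + correlation(a3, b1) + correlation(a3, b3))

if __name__ == "__main__":
    print(round(chsh_value(), 4))  # -2.8284
```

Under these assumptions the correlation is $-\cos(\theta_a - \theta_b)$ and the combination equals $-2\sqrt{2}$, outside the range $[-2, 2]$ allowed by the Bell (CHSH) inequality.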
Denote by $E(i, j, b_1, b_2)$ the probability that, if Alice measures with respect to vector $\alpha_i$ and Bob
with respect to $\beta_j$, Alice (Bob) gets the outcome $b_1$ ($b_2$). Let us also denote
Example 6.2.5 (Phoenix and Townsend, 1995) The source emits pairs of photons linearly
polarized in the state $\frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$, where $|0\rangle$ ($|1\rangle$) corresponds to the vertical
(horizontal) polarization. Both Alice and Bob perform measurements with respect to the bases
corresponding to polarizations $\{0^\circ, 90^\circ\}$, $\{30^\circ, 120^\circ\}$ and $\{60^\circ, 150^\circ\}$.
Denote by $D(i, j)$ the difference between the probability of obtaining different outcomes
and the probability of obtaining the same outcome if Alice uses the $i$th basis and Bob uses the $j$th basis
for measurement and let
Entanglement protocols seem to be very different from BB84 and B92 protocols. However,
this is not really so. Bennett, Brassard and Mermin (1992) showed that a simplified
version of such a protocol, in which both Alice and Bob randomly choose for their
measurement $0^\circ$ or $90^\circ$, is actually equivalent to the BB84 protocol. In addition, as shown by
Barnett and Phoenix (1993) and Phoenix and Townsend (1995), and as will be illustrated
below, one can have with only one particle a protocol equivalent to an entanglement-based
protocol which has the same level of security.
[Figure 6.6: settings (a)–(d), differing in where the source of the particles is placed]
The basic setting discussed above is illustrated in Figure 6.6a. A far-away source of
maximally entangled particles sends one photon of each pair to Alice and the second to Bob,
and they perform their measurements. The same protocol can be used, and the same claim
about security and Bell inequality holds, if the source of photons is in Alice’s environment
(see Figure 6.6b), or if the second particle is not sent to Bob (see Figure 6.6c), but after
her measurement Alice makes a copy of the to-be-Bob particle in the corresponding state
(Alice knows this state after her measurement) and sends it to Bob, who performs a
measurement on it. Again the same holds about Bell inequality and its validity as in the original
case. Now it is evident that Alice actually does not need entangled particles at all. She can
randomly choose one of the six possible states Bob’s original particle could be in$^{13}$ and she
can send the particle in such a state to Bob (see Figure 6.6d).
As in the case of the BB84 and B92 protocols, the question arises whether one
can have an entanglement-based protocol that is secure even in the case of noise and eavesdropping.
The entanglement purification technique introduced in Section 8.3.2 is a way to deal with these
problems.
Security criteria
Two security criteria are considered: the privacy criterion and the security against
tampering criterion. Privacy means that the eavesdropper cannot learn the key, no
matter what she does. More exactly, that the eavesdropper is able to obtain only negligible
information (less than one bit), about the final key. Security against tampering means that
the eavesdropper cannot make Alice and Bob believe that they share a secret key if they do
not.
Usually, Alice and Bob perform a test and decide accordingly whether or not they share
a secret key. If the result of the test is negative the key generation process is repeated. A
very general view of the test will be used in the following —anything Alice and Bob consider
to decide whether or not they share a secret key. Denote by P such a test.
In the formal definitions of the security criteria we use the concept of a quantity $Q_n$ which
is exponentially small. By that is meant that there are $c, \varepsilon > 0$ such that $Q_n \leq c2^{-\varepsilon n}$
for almost all $n$. ($n$ will be used in this context as a security parameter, for example, the
number of qubits to be transmitted.)
Definition 6.2.6 A key generation protocol is secure against tampering if the joint proba-
bility that each test P defined above is fulfilled and the keys of Alice and Bob are different
is exponentially small.
The basic idea of the privacy criterion defined below is that whatever Eve’s attack is,
information i she can gather is such that either the test P is not passed or i has negligible
value.
To be more formal, let $k$ be the random variable whose values are potential keys and
$i$ the random variable each value of which is a piece of information $i$ Eve can obtain either from
the public communication or through her measurements of the transmitted qubits. For a
particular key $k$, let $\Pr(k)$ be the probability that Alice and Bob generate the key $k$ and let
$\Pr(i|k)$ be the probability that Eve gathers information $i$ if the key generated has the value $k$.
The formal definition is based on the concept of the “noninformative information” of Eve.
In addition, $N_\theta(i)$ will denote the event that the information $i$ of Eve is $\theta$-noninformative.
13 As a consequence of the possibility for Alice to choose one of the three measurement bases.
The basic idea behind the inequality 6.1 is that the difference $\Pr(i|k) - p$ should be
small. However, $\Pr(i|k)$ itself is small and therefore it would not be sufficient to ask only
that $|\Pr(i|k) - p|$ is small. It is needed that this difference is small even with respect to
small $\Pr(k)$.
Definition 6.2.8 A key generation protocol is secure with respect to privacy, if there are
two exponentially small positive real numbers $\gamma$ and $\theta$ such that $\Pr(P \cap \overline{N_\theta(i)}) \leq \gamma$.
Informally, Definition 6.2.8 says that, except with the probability $\theta$, Eve’s activity cannot
be such that the test $P$ passes and Eve obtains “informative” information.
A variety of other security criteria have been investigated. Those presented above seem
to be strong enough. They take into consideration all known types of attacks including
coherent attacks. The development of these criteria has been a significant step that allows
one to study the idea of “unconditional security” of QKG protocols.
Security requirements
It has been shown, as discussed below, that QKG protocol BB84 is secure and its security
holds:
Main result
In order to illustrate some proof methods we show first security against tampering of a
special version of the BB84 protocol with the following two properties.
1. The set $T$ has size exactly $\frac{n}{2}$. (This can be achieved if Bob first stores all received
photons and only after Alice announces $A_B$ chooses $B_B$ such that the set $T$ has
size $\frac{n}{2}$.)
2. When the size of $T - R$ is not large enough, the protocol does not stop and the error
correction is done on all bits with indices in $T - R$.
where $E$ is the event that the Hamming distance on bits with indices in $T - R$ between $A_b$
and $B_b$ is smaller than $(\delta + \varepsilon)\frac{n}{4}$ and
$$\mu(\varepsilon) = e^{-\frac{n\varepsilon^2}{32(\delta + \frac{\varepsilon}{2})}}.$$
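To get a feeling for how quickly $\mu(\varepsilon)$ decreases with $n$, the expression can be evaluated directly. The parameter values below are arbitrary illustrative choices, not values taken from the text.

```python
import math

def mu(eps, delta, n):
    """Evaluate mu(eps) = exp(-n * eps^2 / (32 * (delta + eps / 2)))."""
    return math.exp(-n * eps ** 2 / (32.0 * (delta + eps / 2.0)))

if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        print(n, f"{mu(0.1, 0.05, n):.2e}")
```

For $\varepsilon = 0.1$ and $\delta = 0.05$ the exponent is $-n/320$, so $\mu$ is about $2.7 \cdot 10^{-14}$ at $n = 10000$, which illustrates that $\mu(\varepsilon)$ is exponentially small in $n$.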
The proof is based on the fact that the set $R$ is random and remains secret until Alice’s
quantum transmissions are over. Eve is therefore not able to distinguish between the sets
$R \cap T$ and $R - T$ when an error is created. We can therefore consider as fixed the sequences
$A_B$, $B_B$, $A_b$ and $B_b$, and therefore also $T$. Let $e = A_b \oplus B_b$.
The rest of the proof is based on the fact that the weight of $e(T)$ (see Section 7.4.1),
namely $w(e(T))$, is either larger or smaller than $(\delta + \frac{\varepsilon}{2})\frac{n}{2}$. In the first case, the probability
that $P$ holds is small. In the second case, the probability that $\bar{E}$ holds is small. In both
cases, the probability that $P \wedge \bar{E}$ holds is small, as required.
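Case 1. $w(e(T)) \geq (\delta + \frac{\varepsilon}{2})\frac{n}{2}$. The test $P$ was defined in such a way that $P$ holds only if
$w(e(T \cap R)) \leq \frac{\delta n}{4}$. $R$ is constructed in such a way that every $i \in T$, and also every $i$ with
$A_b[i] \neq B_b[i]$, belongs to $T \cap R$ with probability $\frac{1}{2}$. Therefore, each of the $(\delta + \frac{\varepsilon}{2})\frac{n}{2}$ errors is
in $T \cap R$ with probability $\frac{1}{2}$. According to Chernoff’s bound (see footnote, page 226), the
number of errors in $T \cap R$ is smaller than
$$\left(\frac{1}{2} - \frac{\varepsilon}{4(\delta + \frac{\varepsilon}{2})}\right)\left(\delta + \frac{\varepsilon}{2}\right)\frac{n}{2} = \frac{\delta n}{4}$$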
Lemma 6.2.9 sounds very technical. Less formally it says that only with exponentially
small probability can it happen that the test $P$ succeeds on $R \cap T$ and yet the number of
errors in $T - R$ is greater than $(\delta + \varepsilon)\frac{n}{4}$. This further implies that if an adequate
error-correcting technique is used, then Eve cannot succeed in making Alice and Bob believe that
they share the key if they do not.
The main result about security of BB84 protocol, proven by Mayers and Yao (1998a)
has the following form.
Security against tampering: The probability that the test P passes and there is more than
(δ + ε)n̄ errors in E is smaller than γ 2 .
Security with respect to privacy: If
where $H(x)$ is the Shannon entropy of $x$ and the matrix $K$ (used for privacy amplification) is
random, then with probability at least $1 - 2^{-\varepsilon' \bar{n}}$ we get $K$ such that
$$I(i, k') \leq \frac{2\eta}{\ln 2} + m\gamma,$$
where $k' = k \oplus w$ and $w$ is a random string chosen by Alice and announced to Bob, $I(i, k')$
is mutual information and $\eta = 2m(2\sqrt{\gamma} + \gamma)$.
Remark 6.2.11 Theorem 6.2.10 is both a significant achievement and a questionable result.
Technically, it is correct. The problem is only whether such a complicated result fully justifies
its interpretation as “the ultimate proof of unconditional security of the BB84 protocol”.
Because of its very complex claim and dependence on several parameters, it is unlikely that
the result will soon be fully accepted by the whole quantum computing community as a
final step concerning the unconditional security of the BB84 protocol. At the same time it is unclear
whether a significantly simpler, more elegant and equally powerful proof is possible.
Security of the BB84 protocol and the quality of the photon source
Unconditional security of the BB84 protocol was obtained only under the assumption that
there is a perfect source of photons. It is known that imperfect sources may seem to behave
quite normally and, at the same time, seriously compromise security of the BB84 protocol.
The security problem for an imperfect source is difficult to deal with. An interesting and promising step in this direction was made by Mayers and Yao (1998a). They proposed a concrete design for a new concept of a self-checking source. The manufacturer of a photon source is required to provide certain tests designed in such a way that if they pass, then the source is guaranteed to make the BB84 protocol secure.
Photons seem to be the best medium to carry qubits. They are relatively easy to produce, and photons of certain wavelengths can be transmitted sufficiently reliably using existing optical fibres. They can also be detected quite well. Photons of wavelength 1.3 µm can travel 10 km in a fibre before half of them get absorbed. This should be enough to perform QKG in local networks. (Unfortunately, amplifiers cannot be used. This follows from the no-cloning theorem.) However, one could use “quantum repeaters” (see Briegel, 1998).
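To get a feeling for the 10 km figure, the following minimal sketch assumes simple exponential attenuation in which the surviving fraction of photons halves every 10 km. The function name and the sample distances are illustrative assumptions, not part of the cited experiments.

```python
def surviving_fraction(distance_km, half_length_km=10.0):
    """Fraction of photons not yet absorbed after distance_km of fibre,
    assuming the surviving fraction halves every half_length_km
    (a simplifying assumption made for this sketch)."""
    return 0.5 ** (distance_km / half_length_km)


if __name__ == "__main__":
    for d in (1, 10, 30, 100):
        print(f"{d:>4} km: surviving fraction {surviving_fraction(d):.6f}")
```

Under this assumption the loss grows quickly with distance, which is why the inability to amplify the signal matters.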
In experimental quantum state transmissions, the two main methods of encoding quantum states by photons are through the polarization of photons (using photons of shorter wavelength) and through the photon’s phase. (The latter set-up was used by BT (British Telecom) (see Phoenix and Townsend, 1995) and at Los Alamos National Laboratory (see Hughes, 1995).)
A practically very useful technique to realize QKG protocols was introduced by Bennett (1992). The technique uses a Mach–Zehnder interferometer. Alice and Bob each control one of the phase modulators on one of the arms of the interferometer. Encoding and measurements are done by setting the corresponding phase modulator. We describe very briefly the basic ideas only for the B92 and BB84 protocols. For implementation details see Hughes et al. (1995).
An interferometric implementation of the QKG scheme for the B92 protocol is shown in Figure 6.7. Alice has a single photon source that she can use to send photons into a Mach–Zehnder interferometer, in which she controls the phase φA along one of the optical paths. Bob has a single photon detector at one of the interferometer’s outputs and controls the phase φB along the other optical path. (In Figure 6.7 we indicate the phases corresponding to Alice’s and Bob’s random bits.) The probability that a photon sent by Alice is detected by Bob is
$$P = \cos^2\left(\frac{\varphi_A - \varphi_B}{2}\right)$$
and it depends on both paths. If Alice and Bob use the phase angles (φA, φB) = (0, 3π/2) for their 0-bits, and (φA, φB) = (π/2, π) for their 1-bits, we get the situation the B92 protocol requires.
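The effect of these phase settings can be checked numerically. The sketch below simply evaluates the detection probability P = cos²((φA − φB)/2) for all four combinations of Alice’s and Bob’s bits; the dictionary and function names are ours.

```python
import math


def detection_probability(phi_a, phi_b):
    """P = cos^2((phi_A - phi_B) / 2) for the output Bob monitors."""
    return math.cos((phi_a - phi_b) / 2) ** 2


# Phase settings quoted in the text for the B92 protocol.
ALICE_PHASE = {0: 0.0, 1: math.pi / 2}
BOB_PHASE = {0: 3 * math.pi / 2, 1: math.pi}

if __name__ == "__main__":
    for a_bit in (0, 1):
        for b_bit in (0, 1):
            p = detection_probability(ALICE_PHASE[a_bit], BOB_PHASE[b_bit])
            print(f"Alice bit {a_bit}, Bob bit {b_bit}: P(detect) = {p:.3f}")
```

With these settings a detection can occur only when the two bits agree (with probability 1/2), and never when they differ, so every detection tells Bob Alice’s bit.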
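[Figure 6.7: Interferometric set-up for the B92 protocol. Alice’s photon source feeds a Mach–Zehnder interferometer with a phase modulator (PM) on each path; Bob’s photon detector sits at one output. Phase settings: Alice’s bit 0 gives ΦA = 0 and bit 1 gives ΦA = π/2; Bob’s bit 0 gives ΦB = 3π/2 and bit 1 gives ΦB = π.]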
In the case of the BB84 protocol, both Alice and Bob use identical Mach–Zehnder interferometers with one path longer than the other (see Figure 6.8), and a phase modulator (PM) on the shorter path. In order to send a random bit Alice randomly adds, through her PM, a phase shift of 0, π/2, π or 3π/2 to her photon. Bob can add only phase shifts 0 or π/2. By adding a shift of 0 (π/2), Bob can detect whether Alice’s photon has a phase shift 0 or π (π/2 or 3π/2). If shifts 0 and π/2 are interpreted as 1 and the other two as 0, we get a basic setting for the BB84 protocol.
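[Figure 6.8: Alice’s and Bob’s Mach–Zehnder interferometers for the BB84 protocol, each with a phase modulator (PM).]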
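The role of Bob’s two phase shifts can be illustrated with a small table. The sketch below assumes, as in the B92 set-up, that the probability of constructive interference depends on the phase difference as cos²((φA − φB)/2); this formula is not stated in the text for the BB84 set-up and is used here only as a working assumption.

```python
import math


def prob_constructive(phi_a, phi_b):
    """Assumed interference law: cos^2 of half the phase difference."""
    return math.cos((phi_a - phi_b) / 2) ** 2


ALICE_SHIFTS = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
BOB_SHIFTS = [0.0, math.pi / 2]


def outcome_table():
    """Map (Alice shift index, Bob shift index) to the interference probability."""
    return {
        (ia, ib): prob_constructive(phi_a, phi_b)
        for ia, phi_a in enumerate(ALICE_SHIFTS)
        for ib, phi_b in enumerate(BOB_SHIFTS)
    }


if __name__ == "__main__":
    for (ia, ib), p in sorted(outcome_table().items()):
        print(f"Alice shift {ia}*pi/2, Bob shift {ib}*pi/2: P = {p:.3f}")
```

Under this assumption Bob’s shift 0 separates Alice’s shifts 0 and π with certainty (probabilities 1 and 0), his shift π/2 separates π/2 and 3π/2, and all mismatched combinations give probability 1/2.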
Both theoretical and experimental work is also under way to develop multiuser quantum cryptographic networks. The approach of Biham, Huttner and Mor (1996) makes use of nonlocality and assumes that a quantum memory is kept in a transition centre to which users “bring their particles”. An approach of Phoenix et al. (1995) uses optical networks and has been experimentally tested; see also Phoenix and Townsend (1995).
Exercise 6.3.1 Show how to design a bit commitment protocol once an oblivious transfer
protocol is given.
None of the classical BCP and OTP is absolutely secure. Their computational security
is always based on some unproven assumptions of the computability theory.
The history of cryptographic protocols started with the following Blum’s coin-flipping
protocol (1981):
Blum’s coin-flipping protocol was actually also the first non-trivial example of a bit commitment protocol. In the field of classical cryptographic protocols, BCP play an important role. For example, with such a protocol one can construct zero-knowledge proofs for a variety of statements.15
1. Alice randomly chooses a sequence of bits (for example 1000 should be enough) and
a polarization (rectilinear or diagonal—standard or dual). Finally, Alice sends the
resulting sequence of the polarized photons to Bob.
2. Bob chooses, for each received photon, randomly, an observable, B or D, and measures
the incoming photon. He records the result into two tables—one for the observable
B and the second for the observable D. Since some photons can get lost during the
transmissions, there can be holes in both tables. At the end of all transmissions, Bob
makes a guess whether Alice chose rectilinear or diagonal polarization and announces
his guess to Alice. He is to win if the guess is correct and to lose otherwise.
3. Alice tells Bob whether he won or lost by telling him the polarization she chose. She
can certify her claim by sending Bob the random sequence of bits she chose at Step 1.
14 See, for example, Gruska (1997).
15 Actually for the whole class IP (see, for example, Gruska, 1997).
4. Bob verifies Alice’s claim by comparing his records in the table for the basis she claims to have chosen. There should be a perfect agreement with the entries in that table and no correlation with the other table.
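The four steps above can be sketched as a small classical simulation of an honest run. It uses an idealized measurement model (the bit is recovered when Bob’s observable matches Alice’s polarization and is random otherwise) and ignores photon loss; all names are hypothetical.

```python
import random


def measure(bit, photon_basis, observable, rng):
    """Idealized measurement: correct bit if bases match, random bit otherwise."""
    return bit if photon_basis == observable else rng.randint(0, 1)


def coin_flip_round(n=1000, seed=None):
    rng = random.Random(seed)
    # Step 1: Alice chooses random bits and one polarization for all photons.
    alice_bits = [rng.randint(0, 1) for _ in range(n)]
    alice_basis = rng.choice("BD")
    # Step 2: Bob measures each photon with a random observable, keeping
    # one table per observable, and then announces a guess.
    tables = {"B": {}, "D": {}}
    for i, bit in enumerate(alice_bits):
        obs = rng.choice("BD")
        tables[obs][i] = measure(bit, alice_basis, obs, rng)
    bob_guess = rng.choice("BD")

    # Steps 3 and 4: Alice reveals basis and bits; Bob checks both tables.
    def agreement(table):
        return sum(alice_bits[i] == v for i, v in table.items()) / len(table)

    other = "D" if alice_basis == "B" else "B"
    return {
        "bob_wins": bob_guess == alice_basis,
        "agree_claimed": agreement(tables[alice_basis]),
        "agree_other": agreement(tables[other]),
    }


if __name__ == "__main__":
    print(coin_flip_round(1000, seed=1))
```

In an honest run the table for the claimed basis agrees completely with Alice’s bits, while the other table agrees only in about half of the positions.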
Can Alice or Bob cheat? Bob is not able to cheat. Indeed, he would be able to cheat only if he were able to determine, with probability larger than 1/2, on the basis of the sequence of photons he received, which polarization Alice has chosen. However, it can be shown (see, for example, Glaubner, 1988) that any measuring device capable of distinguishing the polarization from the stream of incoming photons could be used, together with the EPR phenomenon, to transmit information faster than light.
Alice has two possibilities to cheat: either at Step 1 or at Step 3. Let us first discuss the second case. The only way she could cheat is by sending a sequence of bits that would match the entries of Bob’s table for the other of the two possible polarizations (in our example for the diagonal one). However, she has no way to find out the results of the measurements Bob made using his observables, because they were just randomly chosen. She can only guess, but the probability of making correct guesses gets exponentially small with the length of the transmitted sequence and, consequently, the probability that her cheating is discovered rapidly approaches 1.
Alice also cannot cheat at Step 1 by sending a mixture of diagonally and rectilinearly polarized photons. In such a case, almost certainly, she would not be able in Step 3 to ensure that the sequence of bits she sent agrees with any of Bob’s tables. However, there is a special way Alice can cheat in Step 1: by making clever use of entangled photons.
In Step 1, instead of sending a sequence of photons polarized in one way or another, she produces pairs of polarization-entangled photons, each pair in the state $\frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$, sends to Bob one member of each pair, and stores the other one. After Bob announces his guess in Step 2, she measures her photons in the basis opposite to Bob’s guess. By that she obtains a sequence of bits perfectly correlated with Bob’s table corresponding to the basis he did not choose as his guess in Step 2 and completely uncorrelated with the other table. She then announces the sequence in Step 3.
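The correlations Alice exploits can be computed directly from the entangled state. The sketch below evaluates, for the state (|01⟩ + |10⟩)/√2, the probability that Alice’s and Bob’s outcomes coincide for each choice of bases; the conventions |0′⟩ = (|0⟩ + |1⟩)/√2 and |1′⟩ = (|0⟩ − |1⟩)/√2 for the diagonal basis are our assumption.

```python
import math

S = 1 / math.sqrt(2)
BASES = {
    "B": [(1.0, 0.0), (0.0, 1.0)],  # |0>, |1>
    "D": [(S, S), (S, -S)],          # |0'>, |1'> (assumed convention)
}
# Amplitudes of (|01> + |10>)/sqrt(2), indexed [Alice's qubit][Bob's qubit].
PSI = [[0.0, S], [S, 0.0]]


def joint_probability(alice_basis, a, bob_basis, b):
    """Probability that Alice obtains outcome a and Bob obtains outcome b."""
    u = BASES[alice_basis][a]
    v = BASES[bob_basis][b]
    amp = sum(u[i] * v[j] * PSI[i][j] for i in (0, 1) for j in (0, 1))
    return amp ** 2


def prob_equal(alice_basis, bob_basis):
    """Probability that both parties obtain the same outcome."""
    return sum(joint_probability(alice_basis, x, bob_basis, x) for x in (0, 1))


if __name__ == "__main__":
    for ab in "BD":
        for bb in "BD":
            print(f"Alice {ab}, Bob {bb}: P(equal outcomes) = {prob_equal(ab, bb):.2f}")
```

For this particular state the outcomes are always opposite in the basis B and always equal in the basis D, so in either case Alice can infer Bob’s entry exactly (after flipping her bit in the first case); with different bases the outcomes are uncorrelated.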
The cheating Alice can perform in this way is theoretically perfectly possible. However, the practical problems with storing entangled photons for a longer period are large. Moreover, every error in this process could result, with high probability, in an error Bob could discover and that would make him, at least, suspicious about the whole process.
1. Bob chooses a Boolean matrix G as a generator matrix (see Section 7.4.1) of a binary linear (n, k, d)-code C such that d/n > 10ε and k/n = 0.52, and announces it to Alice.
2. Alice chooses:
3. Bob chooses a random string b′ of n bits and measures the ith photon according to
the basis M (b′i ), where M (0) = B and M (1) = D. Let c′ be the n-bit vector where c′i
is the result of the measurement of the ith photon.
Alice keeps the bit x and vectors c and b secret, until the opening takes place, and Bob
keeps vectors b′ and c′ secret.
To open the commitment x, Alice initiates the following protocol:
There are two features of this protocol that need an explanation: the use of a linear code C, and the bounds 10ε and 0.52 on its parameters.
Both C and the bounds were chosen in order to be able to prove that the protocol is secure also in the case of a noisy channel. They have been chosen well enough to show that Bob can obtain only an exponentially small amount of Shannon information about b. However, not only did the original proof that Alice cannot cheat have a flaw but, as shown in Section 6.3.4, this flaw cannot be corrected. The way Alice can cheat is in principle similar to the one described in the protocol for coin-flipping.
In the BCJL protocol, Alice first chooses a random string r and a codeword c such that r · c = x. She sends r to Bob through a classical channel and c through a quantum channel, in a similar way as in the case of QKG protocols. As already mentioned (see page 217), this way Bob can obtain correctly roughly 75% of the bits. We show in Section 6.3.3 that a cheating Bob could obtain as much as 85% of the bits, but this is the best possible outcome for Bob.
The binary code C was chosen in such a way that there are exponentially many codewords around the vector c′, the result of Bob’s measurements. To show that, the bound k/n = 0.52 was used. This bound can also be used to show that if G is chosen randomly, then it defines, with large probability, an (n, k, d)-linear code.
1. Alice chooses a bit b and sends it to Bob through one photon encoded using a randomly
chosen basis—standard or dual.
2. Bob measures the photon with respect to a randomly chosen basis—standard or dual.
3. Alice lets Bob know the basis she chose.
At the end Bob has a 50% chance to know b for sure and he knows whether he knows b
for sure. Alice has no information whether Bob knows the bit for sure.
There are two problems with this protocol.
1. An imperfect source on Alice’s side, a noisy channel, or a faulty detector on Bob’s side could significantly affect the probability of success of Bob’s measurement.
2. Bob could cheat by making his measurement in the Breidbart basis (see page 244 for details). This way he could learn b with large probability cos²(π/8) ≈ 0.85.
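The value cos²(π/8) can be verified by a direct calculation. The sketch below assumes the usual definition of the Breidbart basis, θ0 = cos(π/8)|0⟩ + sin(π/8)|1⟩ and θ1 = −sin(π/8)|0⟩ + cos(π/8)|1⟩ (the definition on page 244 is not part of this excerpt), and computes the probability that Bob’s outcome equals the encoded bit for each of the four states Alice may send.

```python
import math

C, S = math.cos(math.pi / 8), math.sin(math.pi / 8)
R = 1 / math.sqrt(2)

# State vector (amplitudes of |0>, |1>) and the bit it encodes.
STATES = {
    "|0>": ((1.0, 0.0), 0),
    "|1>": ((0.0, 1.0), 1),
    "|0'>": ((R, R), 0),
    "|1'>": ((R, -R), 1),
}
# Assumed Breidbart basis vectors, indexed by the bit Bob reads off.
BREIDBART = {0: (C, S), 1: (-S, C)}


def prob_correct(state, bit):
    """Probability that a Breidbart measurement yields the encoded bit."""
    t = BREIDBART[bit]
    return (t[0] * state[0] + t[1] * state[1]) ** 2


if __name__ == "__main__":
    for name, (vec, bit) in STATES.items():
        print(f"{name}: P(correct) = {prob_correct(vec, bit):.4f}")
```

Under this assumption the success probability is the same for all four states, independently of the basis Alice used.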
The first nontrivial QOTP was due to Crépeau and Kilian (1988). A more robust version of this protocol (due to Bennett et al. 1991) will now be presented, at first in an idealized form, where polarized photons are used to transmit bits. A more practical version of the protocol will be briefly discussed next.
Let b0 and b1 be Alice’s choices of bits and c be Bob’s guess.
Remark 6.3.8 The paper by Bennett et al. (1991) also discussed how to derive and verify the parameters mentioned in Step 1 of the above protocol. In addition, it is assumed there that dim light pulses, instead of single photons, are used for transmission, to have a more realistic setting.
The security of the BBCS protocol will be discussed in Section 6.3.3. The main result
shown by Bennett et al. (1991) is:
Theorem 6.3.9 Let ∆ be data Bob obtained by the protocol. At least one of H(b0 |∆, b1 ) or
H(b1 |∆, b0 ) is exponentially close (in n) to 1; in no case does Alice learn something (where
H(b0 |∆, b1 ) is the conditional Shannon entropy).
For another improvement of the ideas presented by Crépeau and Kilian (1991) see Crépeau (1994).
Exercise 6.3.10 Design a quantum protocol identify(x, y) for Alice and Bob to decide
whether strings x and y are the same provided Alice knows only x and she does not want to
reveal it and Bob knows only y and does not want to reveal it. Assume they communicate
through: (a) noiseless channel; (b) noisy channel.
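Proof. Density matrices ρ0 and ρ1 describing the mixed states representing bits 0 and 1, respectively, have the form
$$\rho_0 = \frac{1}{2}\left(|0\rangle\langle 0| + |0'\rangle\langle 0'|\right) = \begin{pmatrix} \frac{3}{4} & \frac{1}{4} \\ \frac{1}{4} & \frac{1}{4} \end{pmatrix}, \qquad \rho_1 = \frac{1}{2}\left(|1\rangle\langle 1| + |1'\rangle\langle 1'|\right) = \begin{pmatrix} \frac{1}{4} & -\frac{1}{4} \\ -\frac{1}{4} & \frac{3}{4} \end{pmatrix}.$$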
Let us now denote by commit’ the protocol which is like the protocol commit, but which has, instead of step 2.c, the following step:
Alice chooses a binary sequence b of length n randomly in such a way that 0 (1) is chosen with probability cos²(π/8) (sin²(π/8)), and sends Bob a sequence of n photons with polarizations B0(ci ⊕ bi).
It is easy to verify that for the protocol commit’ the density matrices ρ′0 and ρ′1 that describe the quantum mixtures representing the states 0 and 1 are identical to matrices ρ0 and ρ1. Namely,
$$\rho'_0 = \cos^2\frac{\pi}{8}\,|\theta_0\rangle\langle\theta_0| + \sin^2\frac{\pi}{8}\,|\theta_1\rangle\langle\theta_1| = \rho_0$$
and, similarly, ρ′1 = ρ1.
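The equality of the two pairs of density matrices is easy to check numerically. The following sketch builds all four matrices from their defining mixtures; it assumes |0′⟩ = (|0⟩ + |1⟩)/√2, |1′⟩ = (|0⟩ − |1⟩)/√2 and the Breidbart vectors θ0 = cos(π/8)|0⟩ + sin(π/8)|1⟩, θ1 = −sin(π/8)|0⟩ + cos(π/8)|1⟩, which are our reading of the notation.

```python
import math

C, S = math.cos(math.pi / 8), math.sin(math.pi / 8)
R = 1 / math.sqrt(2)
THETA0, THETA1 = (C, S), (-S, C)


def mixture(weighted_states):
    """Density matrix sum_k w_k |v_k><v_k| for real 2-dimensional vectors v_k."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for w, v in weighted_states:
        for i in range(2):
            for j in range(2):
                rho[i][j] += w * v[i] * v[j]
    return rho


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


rho0 = mixture([(0.5, (1.0, 0.0)), (0.5, (R, R))])
rho1 = mixture([(0.5, (0.0, 1.0)), (0.5, (R, -R))])
rho0_prime = mixture([(C * C, THETA0), (S * S, THETA1)])
rho1_prime = mixture([(C * C, THETA1), (S * S, THETA0)])

if __name__ == "__main__":
    print("rho0 :", rho0)
    print("rho0':", rho0_prime)
    print("rho1 :", rho1)
    print("rho1':", rho1_prime)
```

Both comparisons agree up to rounding, which is the numerical counterpart of ρ′0 = ρ0 and ρ′1 = ρ1.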
Moreover, if we denote by ρc (ρ′c) the density matrix associated with the mixture of pure states used by the procedure commit (commit’) to send c, then
$$\rho_c = \bigotimes_{i=1}^{n} \rho_{c_i} = \bigotimes_{i=1}^{n} \rho'_{c_i} = \rho'_c.$$
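Since $|\{c \in C \mid c \cdot s = 0\}| = |\{c \in C \mid c \cdot s = 1\}| = 2^{k-1}$, the density matrices ρ0, ρ1, ρ′0 and ρ′1, describing the quantum mixture of all states sent to Bob to commit to 0 (to 1), have the form
$$\rho_0 = \sum_{\{c \in C \mid c \cdot r = 0\}} \frac{\rho_c}{2^{k-1}} = \sum_{\{c \in C \mid c \cdot r = 0\}} \frac{\rho'_c}{2^{k-1}} = \rho'_0, \qquad \rho_1 = \sum_{\{c \in C \mid c \cdot r = 1\}} \frac{\rho_c}{2^{k-1}} = \sum_{\{c \in C \mid c \cdot r = 1\}} \frac{\rho'_c}{2^{k-1}} = \rho'_1.$$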
Since mixed states represented by the same density matrices cannot be distinguished by any quantum measurement, the above result implies that Bob is able to get the same information about c and x in both protocols commit and commit’. The point now is that the measurement performed in the protocol commit’ maximizes Bob’s information about c (and therefore about x), because in the measurement performed in that protocol Bob gets all information available! Hence the optimal measurement for Bob in protocol commit is the same. This implies that no coherent measurement on all photons could provide more information for Bob.
As the next step we show that even if Bob performs the optimal measurement, he can get only very little information about x, and therefore he cannot cheat. As the first step we show that the vector c′ received by Bob must be quite far from the vector c sent by Alice.
Lemma 6.3.12 Even if Bob performs the optimal measurement, there exists an α, 0 < α < 1, such that the probability that hd(c, c′) < γn, where $\gamma = H^{-1}(\frac{1}{2}) = 0.1100279$, is at most $\alpha^n$.
Proof. Let us assume the most ideal situation for Bob: a noiseless channel. If Bob performs the optimal measurement, then $Pr(c_i = c'_i) = \cos^2\frac{\pi}{8}$ and $Pr(c_i \neq c'_i) = \sin^2\frac{\pi}{8}$. Hence the Hamming distance hd(c, c′) is expected to be $\sin^2\frac{\pi}{8}\, n \approx 0.14644n$. In order to estimate the probability that the number of errors will be less than γn, we use Bernstein’s law of large numbers16 as follows: $hd(c, c') = \sum_{i=1}^{n} x_i$, where $x_i = c_i \oplus c'_i$ and $Pr(x_i = 1) = \sin^2\frac{\pi}{8}$. Hence the probability that hd(c, c′) < γn can be estimated as follows:
$$Pr\left(\sum_{i=1}^{n} \frac{x_i}{n} \le \gamma\right) \le Pr\left(\left|\sum_{i=1}^{n} \frac{x_i}{n} - \sigma^2\right| \ge \sigma^2 - \gamma\right) \le 2e^{-n(\sigma^2 - \gamma)^2} \approx 2e^{-0.001326n},$$
where $\sigma^2 = \sin^2\frac{\pi}{8}$.
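The constants in this estimate can be reproduced numerically. The sketch below computes γ = H⁻¹(1/2) by bisection on the binary entropy function, takes sin²(π/8) as the per-position error probability, and evaluates the exponent of the Bernstein bound; the function names are ours.

```python
import math


def binary_entropy(p):
    """Shannon entropy H(p) of a binary variable with Pr(1) = p, in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def inverse_entropy(h, lo=1e-12, hi=0.5, iterations=100):
    """Solve H(p) = h for p in (0, 1/2) by bisection (H is increasing there)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if binary_entropy(mid) < h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


gamma = inverse_entropy(0.5)
error_rate = math.sin(math.pi / 8) ** 2
rate = (error_rate - gamma) ** 2


def bernstein_bound(n):
    """Upper bound 2 * exp(-n * rate) on Pr(hd(c, c') < gamma * n)."""
    return 2 * math.exp(-n * rate)


if __name__ == "__main__":
    print(f"gamma = {gamma:.7f}, error rate = {error_rate:.5f}, rate = {rate:.6f}")
    for n in (1000, 5000, 20000):
        print(f"n = {n:>6}: bound = {bernstein_bound(n):.3e}")
```

Because the exponent is small, the bound becomes meaningful (smaller than 1) only for n in the range of several hundred and above, and then decreases exponentially.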
Theorem 6.3.13 Even if Bob knows the Hamming distance hd(c, c′ ) = d he would have
asymptotically small information about x if d > γn.
Sketch of the proof. The number of codewords of length n at Hamming distance d from c′ is $\binom{n}{d}$. Using the assumption d > γn and a standard/clever approximation of $\binom{n}{d}$ one can derive that the average number of codewords at distance d from c′ is greater than $2^{k - \frac{n}{2} + \alpha n}/\sqrt{n}$ except with probability $2^{-\alpha n}$ for any α > 0.
The codeword c is one of the at least $2^{k - \frac{n}{2} + \alpha n}/\sqrt{n}$ equally likely codewords at distance d from c′. The following lemma (due to Bennett et al., 1998) will be used to determine the number of bits Bob can learn.
Lemma 6.3.14 If E is the set of equally probable candidates for c and a random subset of bits of c is chosen, then the expected amount of Shannon information available to Bob about the parity of this subset is less than $\frac{2}{|E| \ln 2}$ bits.
It follows from the above lemma that the number of bits of information Bob can learn about Alice’s commitment, after seeing c′, is less than $\frac{2\sqrt{n}}{2^{k - n/2 - \alpha n} \ln 2}$. This number is exponentially small if $k > \frac{n}{2} + \alpha n$. Since $\frac{k}{n} = 0.52$ we have for α = 0.1, $\frac{2\sqrt{n}}{2^{k - n/2 - \alpha n} \ln 2} \le \frac{2^{-0.1n}\sqrt{n}}{\ln 2}$.
16 Bernstein law of large numbers: Let x1, x2, . . . , xn be independent Bernoulli variables. If Pr(xi = 1) = p for 1 ≤ i ≤ n, then for all 0 < δ ≤ p(1 − p) we have $Pr\left(\left|\sum_{i=1}^{n} \frac{x_i}{n} - p\right| \ge \delta\right) \le 2e^{-n\delta^2}$.
The first protocol (due to Crépeau and Kilian, 1988) was considered secure provided neither party could store photons for a longer time and only projection measurements were used by Eve. Mayers (1998) made the final contribution to the numerous attempts to show that there is an unconditionally secure quantum oblivious transfer protocol provided there is an unconditionally secure QBCP. At that time this was considered a very encouraging result for quantum cryptography, because it was believed that unconditionally secure QBCP do exist.
The result mentioned above, and discussed in the next section, namely that uncondi-
tionally secure QBCP is impossible, implies that one cannot have an unconditionally secure
QOTP the security of which is based on security of a QBCP. However, this result does not
rule out the possibility that there is unconditionally secure QOTP.
The BBCS protocol was shown secure (see Bennett et al. 1991) even against cheating by
Bob with unlimited computing power under the assumption that Bob measures each photon
(or pulse), before the next one arrives, using a projection measurement, or else he loses the
opportunity to measure it at all.
Let us now discuss security of the BBCS protocol against the so-called photon (pulse)
storing attacks (due to Bennett et al. 1991). The basic idea is that Bob does not measure
the incoming photons in Step 3, he only stores them and waits with the measurement until
Alice makes clear in Step 4 which bases she used. In this way it seems that Bob could
present Alice with two good sequences and therefore he could get both bits b0 and b1 .
From the practical point of view this attack is very unlikely to succeed. First of all it
is technically hard to store photons (pulses) for a longer period. Secondly, even if such a
storage were to be available this would not be sufficient. The problem is that Bob needs to
tell Alice in Step 3 which photons (pulses) arrived successfully and were measured. However,
no technique is available or foreseeable to determine whether a measurement will succeed
without actually doing the measurement.
In addition, it is possible to change the BBCS protocol in such a way that it is fully secure against any photon (pulse) storage attack, provided there exists an (unconditionally) secure QBCP. The basic idea goes as follows.
Alice sends to Bob not 2n/α, but at least 3n/α photons, so that 3n of them arrive successfully. Then, before Step 4, Bob is required to use a QBCP to commit himself both to the bases he used in his measurements and to the outcomes of the measurements. Immediately after that Alice would choose randomly n of the reported successful measurements and ask Bob to unveil his commitments. This would allow Alice to check whether Bob’s commitments are correct (subject to the error rate ε) when his committed bases are correct and uncorrelated otherwise. In addition, this way Alice could be sure that Bob’s measurements took place before Step 4 and that he used bases as he was required to do.
addition, Lo and Chau were the first to argue that unconditionally secure QBCP may not
exist. This suspicion was then shown to be valid by Mayers (1998).17
The very basic idea behind breaking BCJL is that Alice can use ancilla to create a
compound quantum state that allows her to cheat as follows: She sends a part of the state
to Bob and keeps the rest. By measuring her part appropriately, without touching Bob’s
part, she can modify her state in such a way that she can cheat concerning her commitment.
1. Commitment phase.
(a) Alice and Bob put particles in their hands to some prescribed initial states.
(b) Alice and Bob repeat several times the following steps:
i. Depending on her commitment b Alice applies a unitary transformation Ub
on her particles and sends some of her particles to Bob.
ii. After receiving particles from Alice, Bob applies a unitary transformation to
particles in his hands and then sends some of his particles to Alice.
2. Opening phase.
(a) To open her commitment Alice sends all her particles to Bob.
(b) After receiving particles from Alice Bob performs some measurements on particles
in his hands to verify Alice’s honesty.
In the terms of Hilbert space concepts the above general scheme of QBCP has the
following transcription.
Let HA and HB be Hilbert spaces of Alice and Bob, and let HC correspond to their communication channel and the environment. They execute their QBCP in H = HA ⊗ HB ⊗ HC.
As the first step Alice prepares a state |0_A⟩ or |1_A⟩ in HA ⊗ HC, according to her commitment, and Bob prepares a state |a⟩ in HB ⊗ HC. The overall initial state is then |b_A⟩ ⊗ |a⟩.
In Step 1.b, in each communication round, each party D ∈ {A, B} performs a unitary transformation on HD ⊗ HC (and therefore also on H).
The key insight is now that, since each product of unitary transformations is again a unitary transformation, the whole communication process can be characterized by a single unitary transformation U applied to |b_A⟩ ⊗ |a⟩. Since both Alice and Bob know the protocol, they also know U. Bob can therefore readily verify Alice’s commitment after she sends him all her particles.
Remark 6.3.15 As is often the case, once the impossibility proof was made public attempts
started to show that it does not cover all cases, all possible QBCP. Two ideas were explored:
to introduce also classical communications, and to make Alice use classical BCP to commit
17 For a detailed treatment of the history of the attempts to deal with security of QBCP see Brassard et
al. (1998b).
herself, during the protocol, to some values, steps or measurements. All these attempts
failed. For some discussion of such ideas see, for example, Brassard et al. (1997). In
addition, Brassard et al. (1998b), have shown that even unconditionally secure classical
BCP does not help.
Exercise 6.3.16 Show that both QBCP on page 238 and 240 are special cases of the
above general scheme of QBCP.
Cheating
We show now that either Bob or Alice can cheat. More precisely, it will be shown that
if Bob cannot learn Alice’s commitment with high probability, then Alice can change her
commitment at the beginning of the opening phase, without Bob noticing it, and therefore
she can cheat—provided she has a quantum computer to perform unitary transformations.
Without loss of generality we can consider HC as a part of HA or HB , depending on
who is just making a transformation on H and therefore let H = HA ⊗ HB .
The key tool to do cheating is the Schmidt decomposition theorem (see page 374). According to this theorem, the total state of H at the end of the commitment phase can be seen as having the following form in the case of the commitment to 0:
$$|0_{final}\rangle = \sum_i \sqrt{\alpha_i}\, |e_i, \phi_i\rangle \qquad (6.2)$$
If ρ0 and ρ1 are very different, then Bob can learn, with high probability, Alice’s commitment
and therefore Bob can cheat. If ρ0 and ρ1 are not too different there are still two cases to
consider.
and therefore
$$|1_{final}\rangle = \sum_i \sqrt{\alpha_i}\, |e'_i, \phi_i\rangle.$$
18 It is assumed here that all eigenvalues are non-degenerate. The case of degenerate eigenvalues can be considered in a similar way.
Alice can therefore cheat by mapping |0_final⟩ into |1_final⟩ by applying on |0_final⟩ a unitary transformation (on HA only!) that maps, for all i, |e_i⟩ into |e′_i⟩.
This means that at the beginning of the commitment phase Alice can proceed by the protocol as if she were making the commitment 0 and at the end of the commitment phase, or better at the beginning of the opening phase, she can, without getting caught by Bob, change her commitment if she wishes to do so.
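Alice’s local transformation can be illustrated on the smallest possible example, with one qubit for Alice and one for Bob. The Schmidt coefficients and the bases {e_i}, {e′_i}, {φ_i} chosen below are purely hypothetical; the sketch only checks that a unitary acting on Alice’s side alone maps |0_final⟩ to |1_final⟩ while Bob’s reduced density matrix stays the same.

```python
import math

R = 1 / math.sqrt(2)
ALPHA = [0.7, 0.3]                    # hypothetical Schmidt coefficients
E = [(1.0, 0.0), (0.0, 1.0)]          # Alice's vectors |e_i> for commitment 0
E_PRIME = [(R, R), (R, -R)]           # Alice's vectors |e'_i> for commitment 1
PHI = [(1.0, 0.0), (0.0, 1.0)]        # Bob's vectors |phi_i>


def schmidt_state(alice_vectors):
    """sum_i sqrt(alpha_i) |a_i>|phi_i> as an amplitude table [Alice][Bob]."""
    psi = [[0.0, 0.0], [0.0, 0.0]]
    for w, a, p in zip(ALPHA, alice_vectors, PHI):
        for i in range(2):
            for j in range(2):
                psi[i][j] += math.sqrt(w) * a[i] * p[j]
    return psi


def apply_on_alice(u, psi):
    """Apply the 2x2 matrix u to Alice's qubit only."""
    return [[sum(u[i][k] * psi[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def bob_density(psi):
    """Bob's reduced density matrix (all amplitudes are real here)."""
    return [[sum(psi[k][i] * psi[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# U = sum_i |e'_i><e_i| acts on Alice's space only.
U = [[sum(E_PRIME[m][i] * E[m][k] for m in range(2)) for k in range(2)]
     for i in range(2)]

zero_final = schmidt_state(E)
one_final = schmidt_state(E_PRIME)

if __name__ == "__main__":
    print("U maps |0_final> to |1_final>:", close(apply_on_alice(U, zero_final), one_final))
    print("Bob's state unchanged:", close(bob_density(zero_final), bob_density(one_final)))
```

Since Bob’s reduced state is identical in both cases, nothing he holds can reveal which of the two commitments Alice currently "has".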
Non-ideal case: the difference between matrices ρ0 and ρ1 is small, with respect to the
fidelity F (ρ0 , ρ1 ), defined as follows.
This measure of fidelity has the following property: to any purification θ1 of ρ1 there exists a purification θ0 of ρ0 such that F(ρ0, ρ1) = |⟨θ0|θ1⟩|. Clearly, 0 ≤ F(ρ0, ρ1) ≤ 1.
Let us now assume that there is a small δ > 0 such that F (ρ0 , ρ1 ) = 1 − δ. In such a
case there is a state θ1 which is a purification of ρ0 and
It is now clear that Alice’s strategy for cheating in this non-ideal case can resemble that in the ideal case. Namely, Alice chooses 0 as her “preliminary commitment” at the beginning of the commitment phase and performs the commitment phase according to the protocol. If, at the beginning of the opening phase, Alice decides to cheat, she makes public that her commitment was 1 and she applies a local unitary transformation to change |0_final⟩ to |1_final⟩ such that ⟨0_A|1_final⟩ = 1 − δ. Since the difference between |0_A⟩ and |1_final⟩ is very small, Bob is not able to distinguish it and Alice can cheat with large probability.
disappear while an exact replica appears somewhere else. An underlying implicit assumption is that the teleported object does not traverse directly to its destination. Only the information necessary for its assembly is first extracted, then transferred and finally used to assemble the original. This idea has been discarded by scientists as defying physical laws, for example, Heisenberg’s uncertainty principle, which does not allow measurement of all needed information about the object to be teleported. “Quantum teleportation”, on the other hand, defies no physical law and therefore it is a term that sounds perhaps stronger than its real meaning is. Of course, this does not mean that it is not a very attractive concept (see Bennett et al., 1993).
20 She could do that were she to have a whole set of particles all in the state |ψ⟩. In such a case Alice could perform measurements on all these particles and determine |ψ⟩ pretty well and then send this information to Bob who could prepare his source of qubits to produce |ψ⟩.
than light but it can be argued that part of the information that was present in the particle
C is transmitted instantaneously (except two random bits that needed to be transported at
the speed of light at most).
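[Figure: teleportation scheme. Alice performs a measurement on the unidentified quantum state |ψ⟩ and her half of the EPR-pair (|ψ⟩ gets destroyed by the measurement) and sends 2 classical bits to Bob; Bob applies one of four unitary transformations to his half of the EPR-pair, shared over the EPR channel, and obtains |ψ⟩.]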
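Mathematical details are as follows: Assume that Alice and Bob share the EPR pair $|EPR\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. Let |ψ⟩ = α|0⟩ + β|1⟩ be the unknown quantum state of the particle owned by Alice. She first couples her particle A with C to create the state
$$|\phi\rangle = |\psi\rangle|EPR\rangle = \frac{1}{\sqrt{2}}(\alpha|000\rangle + \alpha|011\rangle + \beta|100\rangle + \beta|111\rangle).$$
The key point for teleportation is now that |φ⟩ can be expressed in a special way using the Bell basis {|Φ±⟩, |Ψ±⟩}.
Indeed, since
$$|00\rangle = \frac{1}{\sqrt{2}}(|\Phi^+\rangle + |\Phi^-\rangle), \qquad |01\rangle = \frac{1}{\sqrt{2}}(|\Psi^+\rangle + |\Psi^-\rangle),$$
$$|10\rangle = \frac{1}{\sqrt{2}}(|\Psi^+\rangle - |\Psi^-\rangle), \qquad |11\rangle = \frac{1}{\sqrt{2}}(|\Phi^+\rangle - |\Phi^-\rangle),$$
we get
$$\alpha|000\rangle = \alpha|00\rangle|0\rangle = \frac{1}{\sqrt{2}}(\alpha|\Phi^+\rangle|0\rangle + \alpha|\Phi^-\rangle|0\rangle),$$
$$\alpha|011\rangle = \alpha|01\rangle|1\rangle = \frac{1}{\sqrt{2}}(\alpha|\Psi^+\rangle|1\rangle + \alpha|\Psi^-\rangle|1\rangle),$$
$$\beta|100\rangle = \beta|10\rangle|0\rangle = \frac{1}{\sqrt{2}}(\beta|\Psi^+\rangle|0\rangle - \beta|\Psi^-\rangle|0\rangle),$$
$$\beta|111\rangle = \beta|11\rangle|1\rangle = \frac{1}{\sqrt{2}}(\beta|\Phi^+\rangle|1\rangle - \beta|\Phi^-\rangle|1\rangle)$$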
and therefore
$$|\phi\rangle = \frac{1}{2}|\Phi^+\rangle(\alpha|0\rangle + \beta|1\rangle) + \frac{1}{2}|\Psi^+\rangle(\beta|0\rangle + \alpha|1\rangle) + \frac{1}{2}|\Phi^-\rangle(\alpha|0\rangle - \beta|1\rangle) + \frac{1}{2}|\Psi^-\rangle(-\beta|0\rangle + \alpha|1\rangle).$$
If Alice now makes a measurement of the first two qubits of |φ⟩, with respect to the Bell basis, then she will get one of the four possible outcomes: 00+, 01+, 00− and 01−, and therefore two classical bits of information, and |φ⟩ gets reduced in such a way that Bob’s particle gets into one of the states
$$\alpha|0\rangle + \beta|1\rangle, \quad \beta|0\rangle + \alpha|1\rangle, \quad \alpha|0\rangle - \beta|1\rangle, \quad -\beta|0\rangle + \alpha|1\rangle. \qquad (6.5)$$
In order to tell Bob which of the above four states resulted, she needs to send him two classical bits of information. Bob needs them to know which of the following four unitary transformations
$$U_{00} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad U_{10} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad U_{01} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad U_{11} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
to apply to his particle in order to transform it to the original unknown state |ψ⟩ = α|0⟩ + β|1⟩.
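The whole calculation can be replayed numerically. The sketch below builds |φ⟩ = |ψ⟩|EPR⟩, projects the first two qubits onto each Bell state, normalizes what remains on Bob’s qubit and applies a correction. The pairing of Bell outcomes with the matrices U00, U10, U01, U11 is our reading of the text, and the sample amplitudes are arbitrary.

```python
import math

S = 1 / math.sqrt(2)

# Bell states on qubits 1, 2 as amplitudes of |00>, |01>, |10>, |11>.
BELL = {
    "Phi+": [S, 0, 0, S],
    "Phi-": [S, 0, 0, -S],
    "Psi+": [0, S, S, 0],
    "Psi-": [0, S, -S, 0],
}
# Bob's correction for each outcome (assumed pairing with U00, U10, U01, U11).
CORRECTION = {
    "Phi+": [[1, 0], [0, 1]],
    "Phi-": [[1, 0], [0, -1]],
    "Psi+": [[0, 1], [1, 0]],
    "Psi-": [[0, 1], [-1, 0]],
}


def teleport(alpha, beta):
    """Return {outcome: (probability, Bob's state after his correction)}."""
    # |phi> = |psi>|EPR>, amplitudes indexed by 4*q1 + 2*q2 + q3.
    phi = [0j] * 8
    for q1, amp in ((0, alpha), (1, beta)):
        for q in (0, 1):  # EPR pair (|00> + |11>)/sqrt(2) on qubits 2 and 3
            phi[4 * q1 + 2 * q + q] += amp * S
    results = {}
    for name, bell in BELL.items():
        # Project qubits 1, 2 onto the Bell state; the rest is Bob's qubit.
        bob = [sum(bell[p] * phi[2 * p + q3] for p in range(4)) for q3 in (0, 1)]
        prob = abs(bob[0]) ** 2 + abs(bob[1]) ** 2
        bob = [x / math.sqrt(prob) for x in bob]
        u = CORRECTION[name]
        fixed = [u[0][0] * bob[0] + u[0][1] * bob[1],
                 u[1][0] * bob[0] + u[1][1] * bob[1]]
        results[name] = (prob, fixed)
    return results


if __name__ == "__main__":
    for outcome, (prob, state) in teleport(0.6, 0.8j).items():
        print(outcome, round(prob, 3), state)
```

Each of the four outcomes occurs with probability 1/4, independently of α and β, so the two classical bits alone carry no information about |ψ⟩.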
The sending of two classical bits plays the key role in quantum teleportation. Indeed, it can be shown that if it were sufficient to send fewer than two classical bits of information in the above teleportation scheme, this could be used by Bob to send messages faster than light (see Bennett et al., 1993), i.e. to send so-called superluminal messages.
[Figure 6.11: teleportation circuit built from the gates L, R, S and T, with inputs |ψ⟩, |0⟩, |0⟩ and outputs |0′⟩, |0′⟩, |ψ⟩.]
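Consider the circuit in Figure 6.11 where L, R, S, and T are the gates implementing the following unitary transformations
$$R = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \quad L = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad S = \begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}, \quad T = \begin{pmatrix} -1 & 0 \\ 0 & -i \end{pmatrix}.$$
Let |ψ⟩ be a qubit state. If the state |ψ00⟩ is processed by the circuit from Figure 6.11, i.e. |ψ⟩ is put on the topmost input and |0⟩ on the other two, then the output will be |0′0′ψ⟩.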
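As a quick sanity check of the four one-qubit gates, the sketch below verifies that each matrix, as read off from the text, is unitary, and that L and R are inverse to each other. It does not simulate the circuit itself, since the placement of the two-qubit gates is given only in the figure.

```python
import math

R2 = 1 / math.sqrt(2)
GATES = {
    "L": [[R2, -R2], [R2, R2]],
    "R": [[R2, R2], [-R2, R2]],
    "S": [[1j, 0], [0, 1]],
    "T": [[-1, 0], [0, -1j]],
}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))


def is_unitary(a):
    return is_identity(matmul(dagger(a), a))


if __name__ == "__main__":
    for name, gate in GATES.items():
        print(name, "unitary:", is_unitary(gate))
    print("L * R = I:", is_identity(matmul(GATES["L"], GATES["R"])))
```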
Exercise 6.4.1 Design unitary matrices corresponding to circuits: (a) in Figure 6.12a;
(b) in Figure 6.12b; (c) in Figure 6.11 before the dashed line; (d) in Figure 6.11 after the
dashed line; (e) for the whole circuit in Figure 6.11.
Exercise 6.4.2 Determine the intermediate states of the computation of the circuit from
Figure 6.11 on input |ψ00i after all gates.
In the state of the circuit at the dashed line all three qubits are entangled. A measurement of the two topmost qubits provides two random classical bits, say u, v. Surprisingly enough (verify it), if these two bits are “returned” into the circuit, i.e. if the computation of the circuit to the right of the dashed line starts with input |uvz⟩, where z is the state of the third qubit at the dashed line, then the output of the circuit will be |uvψ⟩.
[Circuit diagrams: (a) inputs |0⟩, |0⟩, gate L, outputs x, y; (b) inputs |ψ⟩ and x, gate R, outputs u, v.]
Figure 6.12: A teleportation device
Out of the teleportation circuit shown in Figure 6.11 we can make, by a “cut along
the dashed line”, two circuits, one for Alice, one for Bob, to use for teleportation (see
Figure 6.13).
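[Figure 6.13: the teleportation circuit of Figure 6.11 cut along the dashed line into Alice’s subcircuits A1 and A2, the measurement M producing the bits u and v, and Bob’s circuit B.]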
Alice’s circuit consists of two subcircuits. Using the first one, A1 (see also Figure 6.12a), with the initial states |0⟩, Alice can create a pair of particles Q2 and Q3 in the entangled state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. Alice keeps Q2 and sends Q3 to Bob.
At some later time, let Alice want to teleport to Bob the unknown state |ψ⟩ of her new particle Q1.21 She can then use the second subcircuit A2 (see Figure 6.12b) to entangle |ψ⟩ with particles Q2 and Q3. (Observe that at the output of the second subcircuit all three qubits are entangled.)
Alice now makes a measurement of Q1 and Q2, with respect to the Bell basis. As the
result she gets two classical bits of information, the states of Q1 and Q2 will collapse and
Q3 will get into one of the states shown in (6.5). The two bits she gets Alice sends, using a
classical channel, to Bob.
If Bob gets two bits he can add them as inputs to his teleportation circuit with Q3 as
the third input. Bob’s circuit is to choose, on the basis of two inputs, the proper rotation
to apply to Q3 and to perform the required rotation to have his qubit Q3 in the state |ψi.
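The protocol just described can be checked numerically. The following is a minimal sketch, not the circuit of Figure 6.11: it assumes the common textbook realization in which the Bell measurement is a CNOT followed by a Hadamard gate and a standard-basis measurement, and in which Bob's "rotations" are the Pauli operations X and Z. The function name `teleport_outcomes` and the representation of states as dictionaries of amplitudes are illustrative choices.

```python
import math

def teleport_outcomes(alpha, beta):
    """Map each measurement outcome (u, v) to Bob's qubit after his correction."""
    s = 1 / math.sqrt(2)
    # Amplitudes indexed by the bits (q1, q2, q3) of particles Q1, Q2, Q3.
    # Q1 is in alpha|0> + beta|1>; Q2 and Q3 are in (|00> + |11>)/sqrt(2).
    state = {(q1, q, q): amp * s
             for q1, amp in ((0, alpha), (1, beta)) for q in (0, 1)}
    # Bell measurement of Q1, Q2, realized here as CNOT (Q1 controls Q2),
    # a Hadamard on Q1, and a standard-basis measurement of both qubits.
    state = {(q1, q2 ^ q1, q3): amp for (q1, q2, q3), amp in state.items()}
    rotated = {}
    for (q1, q2, q3), amp in state.items():
        for u in (0, 1):
            sign = -1 if (q1 and u) else 1
            rotated[(u, q2, q3)] = rotated.get((u, q2, q3), 0) + sign * s * amp
    corrected = {}
    for u in (0, 1):
        for v in (0, 1):
            # Each outcome (u, v) has probability 1/4, so the factor 2 normalizes Q3.
            b0 = 2 * rotated.get((u, v, 0), 0)
            b1 = 2 * rotated.get((u, v, 1), 0)
            if v:            # Bob applies X (bit flip)
                b0, b1 = b1, b0
            if u:            # Bob applies Z (phase flip)
                b1 = -b1
            corrected[(u, v)] = (b0, b1)
    return corrected
```

For any normalized pair (alpha, beta), all four entries of the returned dictionary should equal (alpha, beta) up to rounding, which is exactly the statement that Q3 ends in the state |ψ⟩ whatever bits Alice obtains.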
Applications? Let us first explore the natural question of whether teleportation could be
useful for quantum information processing. It could, because teleportation provides another way
to transmit information inside quantum computers and information systems, which can be of
interest especially if information has to be kept secret and should never be transmitted over
an insecure channel. Actually, it is in the area of quantum computers and communication
systems where the first applications of quantum teleportation are expected.
In addition, if Bob already possesses the state |ψ⟩, then teleportation can be used by Bob
to determine |ψ⟩ more completely by making measurements on both copies of |ψ⟩. Moreover,
teleportation is possible without Alice knowing the exact position of Bob. It is sufficient
to broadcast classical bits to all the possible locations Bob could be in (or to send him an
email).
Methods of quantum teleportation have been improved to work with arbitrarily high
fidelity even if the quantum channel is imperfect and the quantum noise is too strong to use
some quantum error-correction techniques (see Section 8.3.2 and Bennett et al. 1996a).
Partial implementations (without the last stage, Bob’s transformations) of quantum
teleportation over macroscopic distances have already been reported by Bouwmeester et al.
(1997), for a distance of 1 m, and Boschi et al. (1998), using optical systems and photons.
A complete implementation of quantum teleportation over interatomic distances using liquid
state NMR technology was reported by Nielsen, Knill and Laflamme (1998).
Remark 6.4.3 In spite of the remarkable power of quantum entanglement for quantum
teleportation, and also as a substitute for communication, see Section 7.4.1, the power of
entanglement to facilitate direct communication between two parties is quite restricted.
For example, let two entangled particles be possessed by Alice and Bob. If Alice receives an
unknown bit of information there is no operation she can perform on her particle in such a
way that Bob could then get the bit by performing an appropriate operation on his particle.
Splitting of information
There is a simple method, due to Hillery, Bužek and Berthiaume (1998), by which Alice can
teleport a (secret) qubit |φ⟩ = α|0⟩ + β|1⟩ to Bob and Charles in such a way that they have
to cooperate in order to have |φ⟩.
The basic idea is that Alice couples a given particle P in the state |φ⟩ with the state
|ψ⟩ = (1/√2)(|000⟩ + |111⟩) of three particles Pa, Pb and Pc she shares with Bob and Charles
and then performs a measurement on the state of particles P and Pa, with respect to the
Bell basis {Φ±, Ψ±}. Since
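\[
|\phi\rangle|\psi\rangle = \frac{1}{2}\Big( |\Phi^+\rangle(\alpha|00\rangle + \beta|11\rangle) + |\Phi^-\rangle(\alpha|00\rangle - \beta|11\rangle)
+ |\Psi^+\rangle(\beta|00\rangle + \alpha|11\rangle) + |\Psi^-\rangle(-\beta|00\rangle + \alpha|11\rangle) \Big),
\]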
the outcome of the measurement is that particles Pb and Pc get into one of the states
\[
\alpha|00\rangle + \beta|11\rangle, \quad \alpha|00\rangle - \beta|11\rangle, \quad \beta|00\rangle + \alpha|11\rangle, \quad -\beta|00\rangle + \alpha|11\rangle
\]
and Alice gets two bits to tell her about which of these four cases happened. However,
neither Bob nor Charles has information about which of these four states their particles are
in.
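The Bell-basis decomposition used above can be verified mechanically. The sketch below is only an illustration under stated assumptions: states are stored as dictionaries of amplitudes, the names `BELL` and `split_secret` are hypothetical, and the four Bell states are taken in their usual form (|00⟩ ± |11⟩)/√2 and (|01⟩ ± |10⟩)/√2. For each outcome of Alice's measurement it returns the normalized state of the particles Pb and Pc.

```python
import math

S = 1 / math.sqrt(2)

# Bell states on the particles (P, Pa), as amplitudes over bit pairs.
BELL = {
    "Phi+": {(0, 0): S, (1, 1): S},
    "Phi-": {(0, 0): S, (1, 1): -S},
    "Psi+": {(0, 1): S, (1, 0): S},
    "Psi-": {(0, 1): S, (1, 0): -S},
}

def split_secret(alpha, beta):
    """State of (Pb, Pc) for each outcome of Alice's Bell measurement."""
    # Joint state of P, Pa, Pb, Pc: (alpha|0> + beta|1>)(|000> + |111>)/sqrt(2).
    joint = {(p, g, g, g): amp * S
             for p, amp in ((0, alpha), (1, beta)) for g in (0, 1)}
    outcomes = {}
    for name, bell in BELL.items():
        rest = {}
        for (p, a, b, c), amp in joint.items():
            coeff = bell.get((p, a), 0)      # Bell amplitudes are real
            if coeff:
                rest[(b, c)] = rest.get((b, c), 0) + coeff * amp
        # Each outcome occurs with probability 1/4; the factor 2 normalizes.
        outcomes[name] = {bits: 2 * amp for bits, amp in rest.items()}
    return outcomes
```

For example, `split_secret(0.6, 0.8)["Psi-"]` should contain amplitude -0.8 at |00⟩ and 0.6 at |11⟩, in agreement with the last term of the decomposition.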
Bob now performs a measurement of his particle with respect to the dual basis. He
gets out of it one bit of information and Charles’s particle Pc gets into one of 8 possible
states, which is uniquely determined by bits both Alice and Bob got as the results of their
measurements, and which can be transformed into the state |φi using one or two applications
of Pauli matrices.
Exercise 6.4.4 (a) Determine the density matrix of Charles’s particle after Alice’s
measurements; (b) determine the 8 possible states into which Charles’s particle can get after
Alice’s and Bob’s measurements; (c) determine the transformations Charles has to perform in
order to have his qubit in state |φ⟩, depending on the bits learned by Alice and Bob in
their measurements.
Exercise 6.4.5 Show how to generalize the idea of splitting information between two
parties to the case of (a) 3 parties; (b) n parties.
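Exercise 6.4.6 Show (Cleve et al. 1999) that using the mapping
\[
\alpha|0\rangle + \beta|1\rangle + \gamma|2\rangle \to \alpha(|000\rangle + |111\rangle + |222\rangle) + \beta(|012\rangle + |120\rangle + |201\rangle) + \gamma(|021\rangle + |102\rangle + |210\rangle)
\]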
one can distribute a “secret qutrit” to three qutrits in such a way that if each qutrit is
owned by a different party, then any two of them can reconstruct the secret state, but no
single party can do that alone.
The secret sharing problem, a quantum analogue of the classical one, is solved in full generality
by Cleve et al. (1999). They showed that a (k, n) threshold scheme, in which any k of the n
parties can reconstruct the secret while fewer than k cannot, exists if n < 2k.
The last restriction is due to the “No-cloning theorem”.
[Figure: Alice and Bob sharing an EPR pair; Alice's input bits b1, b2; Bob's measurements in
the dual and standard bases yielding b1, b2.]
Assume that Alice and Bob share two particles in the EPR state (1/√2)(|00⟩ + |11⟩), which
forms the EPR-channel. If Alice receives two classical bits, b1, b2, she performs on her particle
one of the Pauli rotations as shown in the second column of Figure 6.15. The resulting state
is shown in the third column, and she then sends her particle to Bob. He first applies the
XOR operation to both particles and in this way disentangles the state, with the result shown in
column 5. Finally, Bob measures Alice’s qubit in the dual basis and
his own qubit in the standard basis to get the two bits (see column 6) that Alice has sent him in one
qubit. In a simplified form the resulting system is depicted in Figure 6.16.
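The steps of superdense coding can be traced in a few lines of code. This is a sketch under an explicit assumption: since Figure 6.15 is not reproduced here, the assignment of Pauli operations to bits (Z for b1, X for b2) is one consistent choice rather than necessarily the one tabulated in the figure. The function name `superdense` is illustrative, and the dual-basis measurement is modelled as a Hadamard gate followed by a standard-basis measurement.

```python
import math

S = 1 / math.sqrt(2)

def superdense(b1, b2):
    """Return the pair of bits Bob decodes when Alice encodes (b1, b2)."""
    # Shared EPR state (|00> + |11>)/sqrt(2); keys are (Alice's bit, Bob's bit).
    state = {(0, 0): S, (1, 1): S}
    if b1:   # Alice applies Z (phase flip) to her qubit
        state = {(a, b): (-amp if a else amp) for (a, b), amp in state.items()}
    if b2:   # Alice applies X (bit flip) to her qubit
        state = {(1 - a, b): amp for (a, b), amp in state.items()}
    # Bob applies XOR (CNOT): Alice's qubit is the control, his own the target.
    state = {(a, b ^ a): amp for (a, b), amp in state.items()}
    # Dual-basis measurement of Alice's qubit = Hadamard + standard measurement.
    final = {}
    for (a, b), amp in state.items():
        for u in (0, 1):
            sign = -1 if (a and u) else 1
            final[(u, b)] = final.get((u, b), 0) + sign * S * amp
    # After disentangling, a single outcome carries (almost) all the probability.
    return max(final, key=lambda bits: abs(final[bits]) ** 2)
```

Under these assumptions `superdense(b1, b2)` returns `(b1, b2)` for all four inputs: one transmitted qubit delivers two classical bits.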
Remark 6.4.7 1. Quantum superdense coding transmission was first put into practice in
Innsbruck by Zeilinger’s group with polarization-entangled photons (see Mattle et al. 1996).
[Figure 6.16: Superdense coding scheme. Alice encodes 2 bits (b1, b2) into 1 qubit sent to Bob,
who recovers the 2 bits; a source of EPR states provides the EPR channel between them.]
2. In the superdense coding presented above, it is essential that Alice and Bob use a
maximally entangled state. Indeed, Barenco and Ekert (1995) and Hausladen et al. (1996) have
shown that the amount of information communicated by the superdense coding decreases
from its maximum, 2 bits per qubit, with the decrease in the amount of entanglement, and
it becomes 1 bit when the entanglement is zero. For the case that the initial state of the
entangled pair of qubits is mixed, the capability to do superdense coding has been investigated,
in terms of various measures of entanglement, by Bose et al. (1998).
Remark 6.4.8 Experimental progress in the quantum key generation, creation of entangled
pairs over a long distance and in quantum teleportation has been such that the vision of
small quantum networks does not have to be far away. This puts the problems of quantum
multiparty communications and quantum distributed computing into a promising research
agenda.
Chapter 7
PROCESSORS
INTRODUCTION
Theoretical investigations concerning quantum algorithms, automata, complexity, information
theory and cryptography are of great interest and importance. However, progress
in the experimental efforts to design quantum information-processing systems is crucial for
seeing properly the overall perspectives of the future designs of real and powerful quantum
computers, and for isolating and solving the problems that need to be dealt with if powerful
quantum computers are ever to be built.
It has been realized, from the very early days of research in quantum computing, at
least by some, that powerful evolution of isolated quantum systems is hard to utilize in real
quantum processors, because of their interaction with the environment that can destroy very
large but fragile quantum superpositions; and because of the natural imperfections of (in-
herently analogue) quantum devices. In addition, quantum error correction was considered
impossible.
Fortunately, several developments brought the vision of quantum computers closer to
reality. Quantum computation stabilization methods and quantum error correction codes
have turned out to be possible and efficient. Techniques for fault-tolerant quantum com-
puting have been developed. Finally, some promising technologies to design quantum gates,
circuits, and processors have been identified and are being experimentally tested.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
A real quantum computer is just a physical system whose evolution can be interpreted as
performing some specific quantum computation. In order to design such systems, frontiers
of current technology have to be explored and insights have to be developed into the essence
of such crucial problems as imprecisions during quantum computations and decoherence,
which occurs when quantum information is sent, in time or space, through a noisy quantum
channel. These problems are either specific to quantum computing or have in this case a
very different nature than in classical computing. Finally, methods have to be developed
to deal with these problems. Some such methods can be seen as quantum generalizations
of classical ones, but methods with no classical analogue had to be developed as well.
Progress has been achieved in studying the main obstacle to practical quantum computing,
decoherence, and formidable successes have been achieved in developing quantum
computing and storage-stabilizing techniques: quantum error-correcting codes, entanglement
purification (or distillation) techniques, and so on. In addition, fault-tolerant techniques
have already been developed for quantum computing. It is clear today that arbitrarily
long quantum computations can be performed reliably, in principle, provided that the average
probability of error per quantum gate is less than a certain threshold. Therefore,
internal imprecisions and external decoherence no longer have to be considered
obstacles to quantum computation that we would be unable to cope with. Theoretical results
indicate that it might even be possible to build inherently fault-tolerant quantum hardware.
All that makes the vision of real quantum computers much closer than expected a few years
ago.
Significant progress has been achieved in the experimental development and testing of
several technologies that seem to have a potential use in the design of small experimental
quantum processors.
The main theoretical developments behind the design of experimental quantum proces-
sors as well as the main principles of technologies being currently explored for this purpose
are discussed in this chapter.
7.1. EARLY QUANTUM COMPUTERS IDEAS 261
should have for such a circuit. Feynman thereby found a systematic, though not very
efficient, way to transform a quantum circuit description of a quantum computer to the
dynamical Schrödinger equation that simulates computation steps of the circuit.
Feynman used k extra qubits, for so-called “program counter sites”, as well as the “creation”
operators c_i (i = 1, . . . , k) and the “annihilation” operators a_i (i = 0, . . . , k − 1).
Each creation operator c_i “sets the ith counter qubit to 1”, the annihilation operator to 0.
The overall Hamiltonian then has the form
\[
H = \sum_{i=0}^{k-1} \left( c_{i+1} \cdot a_i \cdot U_{i+1} + (c_{i+1} \cdot a_i \cdot U_{i+1})^{*} \right)
\]
where “·” denotes the product of matrices. The first terms take care of a sequential execution
of all gates. (One needs to add the conjugate terms because the resulting Hamiltonian has
to be Hermitian.) In total, Feynman’s approach needs a register with m + k − 1 qubits to
deal with k counters and m input qubits.
More exactly,
\[
c = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\]
and c_i (a_i) is just c (a) applied to the ith counter qubit. (Observe that c maps |0⟩ → |1⟩
and |1⟩ to a “null state”; similarly a maps |1⟩ → |0⟩ and |0⟩ to a “null state”.)
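The action of the two operators on the basis states is easy to confirm. The sketch below uses plain nested lists for the 2 × 2 matrices; the helper names `apply` and `transpose` are illustrative. It also shows that a is the transpose of c (both are real), which is why adding the conjugate terms makes the Hamiltonian Hermitian.

```python
C = [[0, 0],
     [1, 0]]   # creation operator c
A = [[0, 1],
     [0, 0]]   # annihilation operator a

KET0, KET1, NULL = [1, 0], [0, 1], [0, 0]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def transpose(m):
    """Transpose of a 2x2 matrix (equal to the adjoint for real entries)."""
    return [[m[0][0], m[1][0]],
            [m[0][1], m[1][1]]]
```

Here `apply(C, KET0)` gives `KET1`, `apply(C, KET1)` gives the null vector, and symmetrically for `A`.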
Computation on such a circuit begins by putting the input bits into the input register
and the pointer at site 0. One then checks whether site k is empty or
holds the pointer. Once the pointer (cursor) is found there, it is removed so that it cannot return
down the program line. At that moment the register contains all outputs, which just need to
be measured. Termination is not taken care of by such a quantum computer itself. It has
to be decided from the outside when a measurement is to be performed.
In Feynman’s model all of the quantum uncertainty of the computation is concentrated
in the time needed for computation to be completed, and not at all in the correctness of
the outcomes. Namely, if a computation is done, and a certain bit indicates it, the result
obtained is always correct.
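Peres has considered the case that one qubit is encoded by three (|0⟩ → |000⟩, |1⟩ →
|111⟩) and the potential error states have the form
\[
\alpha_0|000\rangle + \beta_0|100\rangle + \gamma_0|010\rangle + \delta_0|001\rangle
\]
or
\[
\alpha_1|111\rangle + \beta_1|011\rangle + \gamma_1|101\rangle + \delta_1|110\rangle,
\]
for suitable amplitudes α_i, β_i, γ_i, δ_i, i = 0, 1.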
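The basis states appearing in such error states are exactly those that differ from a codeword in at most one position. The following sketch enumerates them and decodes by majority; it is a classical illustration on basis-state labels only, with hypothetical function names, and says nothing about how Peres realized detection physically.

```python
def error_family(codeword):
    """Basis states reachable from a codeword by flipping at most one bit."""
    states = [codeword]
    for i in range(len(codeword)):
        flipped = "1" if codeword[i] == "0" else "0"
        states.append(codeword[:i] + flipped + codeword[i + 1:])
    return states

def majority_decode(bits):
    """Logical bit held by the majority of the three positions."""
    return 1 if bits.count("1") >= 2 else 0
```

For instance, `error_family("111")` lists 111, 011, 101 and 110, and every one of them decodes to 1.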
Peres also considered a way of using Stern–Gerlach magnets for error detection and error
correction. In addition, he realized that errors can be corrected by unitary operations.
Finally, he has considered ways error correction can be incorporated in the Hamiltonian, so
that the probability of error can be made arbitrarily small.
In spite of the fact that Peres’ error-correcting code is not good enough, see page 280, be-
cause it does not use entanglement to protect quantum information, it was the first attempt
to consider quantum error-correcting codes.
Deutsch (1985) presented a general, fundamentally new, and fully quantum model of
quantum computation. The tape (t) of Deutsch’s Turing machine U consists of an infinite
sequence of qubits and its finite control consists of a finite sequence of qubits (m). In
addition there is an observable x, which has any integer from Z as its potential value—a
pointer to the currently scanned tape cell. Deutsch deals with the problem of infinitely long
tape by assuming that tape is not rigid and there is a mechanism that can move the tape
according to signals transmitted at finite speed between adjacent segments. The state of
the quantum computer U is therefore a unit vector in the space spanned by basis vectors
|x, t, m⟩.
The dynamics of U is given by a constant unitary operator U and for the evolution of
the state |ψ(t)⟩ it holds
\[
|\psi(t)\rangle = U^{t}|\psi(0)\rangle, \quad \text{where } |\psi(0)\rangle = \sum_{n} \lambda_n |0, 0, t\rangle,
\]
and only finitely many λ_n are non-zero, with λ_n vanishing whenever an infinite number of
elements in t are non-zero.
U has to satisfy a special condition in order to perform operations “by finite means”.
Deutsch was fully aware of such features of quantum computing as quantum parallelism
and entanglement. To explain these features he used Everett’s many-world interpretation,
Section 9.1.7, because, as he explains, “the intuitive explanation of these properties places
an intolerable strain on all interpretations of quantum theory other than Everett’s”.
1 By “finitely realizable physical system” any physical object is meant upon which
experimentation is possible.
Deutsch also described the universal quantum computer capable of simulating every
finitely realisable physical system, therefore also any other quantum computer, with arbi-
trarily high precision. To design the universal quantum computer Deutsch made use of the
fact that if α is any irrational multiple of π, then the four transformations
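\[
\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}, \quad
\begin{pmatrix} \cos\alpha & i\sin\alpha \\ i\sin\alpha & \cos\alpha \end{pmatrix}, \quad
\begin{pmatrix} e^{i\alpha} & 0 \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix}.
\]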
For small ε the first term is almost the state |ψ(t)⟩ and the second one is negligible.
(On the other hand, many quantum algorithms are so sensitive to small changes in the
amplitudes of superpositions that the above result should not be overestimated.)
Operational imperfections
Another source of imperfections, that results in the errors in the Hamiltonian of the system
and consequently in the quantum evolution, is an unavoidable inaccuracy of quantum com-
puter components. This is basically due to the fact that quantum computer components
are mainly analogue type devices. As a consequence, the state of a quantum superposition
depends on several continuous parameters. For example, gates very often used are those per-
forming a rotation by an angle θ. If such a gate is applied there is naturally some inaccuracy
in θ. Errors of this type are caused by unitary transformations, actually by over-rotations
or under-rotations.
There are several ways to estimate impacts of operational imperfections.
In Section 4.2.3 we have seen, by analyzing imprecisions of Turing machine computations,
Theorem 4.2.21, that imprecisions during computations only add and do not grow exponen-
tially. In addition, it was shown that O(lg t) bits of precision in transition amplitudes are
sufficient to support t steps of a QTM with required precision.
Another way to approach the problem is to consider the case that instead of the correct
Hamiltonian H there is a slightly different Hamiltonian H′ = H + H_err. An analysis by
Williams and Clearwater (1997) shows that errors grow at most quadratically in time.
7.2.2 Decoherence
Two essential properties of quantum information processing that efficient quantum algo-
rithms essentially exploit are the existence of quantum superpositions and entanglement—
non-local correlations between different parts of physical systems. It is an elementary but
fundamentally important fact that both of them are ultrasensitive to interactions with the
environment. The enormous fragility of quantum states used to process information is the
main problem for any design of quantum processors.
In order to perform a successful quantum computation one has to maintain a coherent
unitary evolution until the completion of the computation. However, technologically it is
not possible to ensure that a quantum register is completely isolated from the environment.
There are several reasons why it is practically not possible to avoid interactions of a
real quantum computer with the environment. Quite sophisticated technology has to be
used to create quantum registers, and this infrastructure technology cannot be fully isolated
from the register. Secondly, there is an unavoidable coupling of quantum systems with the
thermal environment. (Even such phenomena as cosmic rays may have their impacts.)
There are several ways in which interactions of quantum systems with their environment
materialize. For example, decay (dissipation) is a process by which a quantum system
dissipates energy into the environment. If an excited (high energy) state
represents |1⟩ in a qubit and the lower energy state represents |0⟩, then the system (qubit)
can spontaneously make a transition from |1⟩ to |0⟩, emitting a photon in the process. A qubit
flip (from |1⟩ to |0⟩ only) is then the net result of such a spontaneous transition.
Decoherence is a general term for the process of coupling of a quantum system (processor)
with its environment when the system is not perfectly isolated. As a consequence, the quantum
state of the system is modified due to the interactions of the system with the environment.
Such interactions mean that the quantum dynamics of the environment is also relevant to
the operations of the quantum computer and its states become entangled with the states
of the environment. The effect of such an entanglement with the environment can be seen
as if the environment applied a measurement to the system. This has destructive impacts
on superpositions and interference. Decoherence tends to destroy irreversibly information
in a superposition of states in a quantum computer in a way we cannot control. It spoils
both constructive and destructive interferences that are essential for quantum computing
and long computations seem to be impossible. As the result, quantum information initially
encoded in a register becomes encoded instead in the correlations between the quantum
computer and its environment and we can no longer access the information by observing
only the computing device. Decoherence can therefore be seen as a physical process, in
which quantum systems lose, due to their interaction with environment, some of their key
quantum properties of importance for being able to have quantum information processes
more efficient than classical ones.
Especially delicate are entangled states. Making a measurement on an entangled state
will usually cause a collapse of it to a less entangled state. Small interactions with the
environment provide a sort of continuous measurement of the system. As a quantum system
gets larger this is harder and harder to ignore. The system decoheres more and more
and starts to look and behave more and more like a classical system. Decoherence is why
the quantum world looks classical at the human level (Gottesman, 1997).
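Decoherence is the most fundamental obstacle so far preventing the design of real quantum
computers. For example (see Barenco, 1996), the effects of decoherence in time on the
qubit state |φ⟩ = α|0⟩ + β|1⟩, represented by the density matrix
\[
\rho_{|\phi\rangle} = |\phi\rangle\langle\phi| = \begin{pmatrix} |\alpha|^2 & \alpha\beta^* \\ \alpha^*\beta & |\beta|^2 \end{pmatrix},
\]
can be described by the time-dependent density matrix
\[
\rho = \begin{pmatrix} |\alpha|^2 & e^{-t/\tau}\alpha\beta^* \\ e^{-t/\tau}\alpha^*\beta & |\beta|^2 \end{pmatrix},
\]
in which τ is the so-called decoherence time. As t grows, the matrix ρ converges to the
diagonal matrix
\[
\begin{pmatrix} |\alpha|^2 & 0 \\ 0 & |\beta|^2 \end{pmatrix}.
\]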
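The exponential damping of the off-diagonal elements can be tabulated directly from the formula above. A minimal sketch, with a hypothetical function name and nested lists standing for the 2 × 2 density matrix:

```python
import math

def decohered_density(alpha, beta, t, tau):
    """Density matrix of alpha|0> + beta|1> after time t, decoherence time tau."""
    alpha, beta = complex(alpha), complex(beta)
    damp = math.exp(-t / tau)          # factor multiplying the off-diagonal terms
    return [[abs(alpha) ** 2, damp * alpha * beta.conjugate()],
            [damp * alpha.conjugate() * beta, abs(beta) ** 2]]
```

At t = 0 the matrix is that of the pure state; at t = τ the off-diagonal elements have shrunk by the factor 1/e, while the diagonal (the measurement probabilities in the standard basis) never changes.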
Example 7.2.1 (Barenco, 1996) Decoherence can affect the probability distribution of the
possible outcomes of computations. In order to demonstrate that let us consider the situation
that the Hadamard matrix H is applied to the state |0⟩ of a qubit twice. We get
\[
|0\rangle \xrightarrow{H} \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \xrightarrow{H} |0\rangle. \tag{7.1}
\]
If (7.1) is reformulated in terms of the corresponding density matrices we see this evolution
as follows:
\[
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \xrightarrow{H} \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{H} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{7.2}
\]
If there is no decoherence, the measurement of the final state yields the outcome |0⟩ with
probability 1. Let us now assume that decoherence occurs between the two applications of the
operator H and annihilates completely off-diagonal elements of the second matrix in (7.2).²
Instead of (7.2) we get the evolution
\[
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \xrightarrow{H} \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{\text{decoherence}} \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \xrightarrow{H} \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
At the measurement we get both |0⟩ and |1⟩ with probability 1/2.
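The two evolutions of the example can be replayed with explicit 2 × 2 matrix arithmetic. In this sketch the helper names (`matmul`, `hadamard`, `full_decoherence`) are illustrative, and decoherence is modelled in the idealized way the example assumes, by setting the off-diagonal elements to zero.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]                      # Hadamard matrix

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def hadamard(rho):
    """Apply H to a density matrix: rho -> H rho H (H is its own inverse)."""
    return matmul(matmul(H, rho), H)

def full_decoherence(rho):
    """Annihilate the off-diagonal elements completely."""
    return [[rho[0][0], 0.0], [0.0, rho[1][1]]]

rho0 = [[1.0, 0.0], [0.0, 0.0]]            # density matrix of |0>
coherent = hadamard(hadamard(rho0))        # evolution (7.2)
decohered = hadamard(full_decoherence(hadamard(rho0)))
p0_coherent = coherent[0][0]               # probability of outcome |0>
p0_decohered = decohered[0][0]
```

Up to rounding, `p0_coherent` equals 1 and `p0_decohered` equals 1/2, reproducing the two conclusions of the example.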
It is clear that decoherence is a major problem. It is less clear how really big a problem
it is and whether we can successfully deal with it.
The first estimations of decoherence impacts were very pessimistic. The basic message
was that decoherence is so strong that the probability of getting a correct result decreases
exponentially. Indeed, the error rate due to decoherence is time-dependent and approximately
modeled by the function
\[
1 - e^{-t/\tau_{dec}},
\]
where τ_dec is the so-called decoherence time. Therefore, by Barenco (1996), if τ_dec is the
typical decoherence time for a single qubit, the probability of getting the correct result for
a quantum computation with the input of size n is
\[
P \approx P_0 e^{-s(n)t(n)/\tau_{dec}},
\]
where P_0 is the probability of the result with no errors, s(n) is the total number of qubits
necessary to perform the computation and t(n) is the time needed to perform the computation.
This implies, for example, that the decoherence problem cannot be efficiently dealt
with by simply increasing the number of runs.
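To see why repeating the computation does not help, one can evaluate the estimate for growing s(n). The sketch below is illustrative only: the function names are hypothetical, the numbers in the usage note are made up, and the expected number of runs 1/P assumes independent runs whose correct result can be recognized.

```python
import math

def success_probability(p0, s, t, tau_dec):
    """Estimate P = P0 * exp(-s * t / tau_dec) for s qubits and running time t."""
    return p0 * math.exp(-s * t / tau_dec)

def expected_runs(p0, s, t, tau_dec):
    """Expected number of independent runs until one succeeds (1 / P)."""
    return 1 / success_probability(p0, s, t, tau_dec)
```

With t = 1 and τ_dec = 10 in the same (arbitrary) time unit, each additional 10 qubits multiplies the expected number of runs by e, so the cost of repetition grows exponentially with the size of the register.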
It has slowly turned out that a more detailed analysis of the decoherence problem can
bring a different and less pessimistic view of decoherence, in some situations at least. For
example, the decoherence problem is usually considered under the assumption that each
of the qubits interacts with a different environment. However, if a different assumption is
taken, for example that all qubits interact with the same environment, then it can be shown
(Duan and Guo, 1996), that for some entangled initial states no decoherence occurs at all.
Table 7.1, due to DiVincenzo (1995), displays estimates for gate switching time ts , de-
coherence time τdec , as well as the number of steps that can be performed without losing
coherence, for several technologies of potential interest.
Experiments indicate (see DiVincenzo and Terhal, 1998), that using a single trapped
beryllium ion, decoherence time is about 1 ms and with NMR technology around 1 s.
2 In general, decoherence is often a process that eliminates off-diagonal elements of the
density matrix of mixed states.
Table 7.1: Switching time ts , decoherence τdec , both in seconds, and the number of compu-
tation steps performed before decoherence impacts occur
How to fight decoherence? Firstly, one should try to use technologies with long decoherence
times. It is believed that there is still much to discover along these lines.
Secondly, it is well known that under certain circumstances decoherence is smaller. For
example, some systems need more time to decohere at very low temperatures.
The ingenuity of experimental physicists is expected to bring significant improvements
concerning decoherence. However, the main way to fight decoherence seems to be an indirect one,
a “software approach”: to avoid the damage caused by decoherence by undoing its bad
effects using, as in the classical case, quantum error-correcting codes and other
methods presented in Section 7.4.
The third main source of environmental impacts is in some sense inverse to the first one.
They are impacts of external forces such as cosmic rays, or residual gas molecules, that can
hit a qubit and change its state. These impacts are much out of our control.
In this section we present a sort of quantum analogue of the majority voting method to
make use of the redundancy for quantum computation stabilization in a way that has no
parallel in classical computing.
Before presenting the method itself there is still one point to discuss concerning
stabilization: efficiency. Only computation/memory stabilization methods that are efficient
enough are of interest. Computations which require exponentially increasing precision,
or an exponential amount of time, space, energy or some other physical resource,
are considered unfeasible. The same is true for polynomial resource-bounded computations,
provided they do not have polynomial resource-bounded stabilization methods. A quantum
computing/memory stabilization technique is therefore required to make
polynomial resource-bounded computations out of any polynomial time-bounded computations.
Another fact of importance in this connection is that in some cases, for example for
decoherence, error probability grows exponentially with the size of input. It is therefore of
vital importance to determine whether we can have quantum stabilization that would make
polynomial algorithms stay polynomial even under such unfavourable circumstances.
The majority voting method, brought to classical computing by von Neumann, is very
successful in classical computing for a very simple reason: it is extremely good. If r
computations are performed and each has probability 1 − θ of being successful, then by the
Chernoff bound the probability that the method fails is less than e^{−θ² r/6}, and therefore
this probability decreases exponentially with the number r of redundant computations.
Such a performance we can hardly expect from a quantum computation stabilization
method. Fortunately, as will be shown, even much more modest stabilization methods may
still be useful. Indeed, let us have a polynomial time algorithm that performs each step of
the computation correctly with probability 1 − ε. After t steps the probability of successful
computation is (1 − ε)^t ≈ e^{−εt}. Suppose now that we have a computation stabilisation
method available which, using redundancy r, reduces the error in each step modestly, by
the factor 1/r. After t steps the probability of success is e^{−εt/r} ≈ 1 − δ if
r = −εt / lg(1 − δ), for any δ > 0. In addition, r is polynomial in t and therefore also in
the size of input.
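The required redundancy is easy to compute. The sketch below solves e^{−εt/r} = 1 − δ exactly, which gives the natural logarithm in the denominator; with the logarithm lg used in the text the value of r changes only by a constant factor. Function names and the sample numbers in the usage note are illustrative.

```python
import math

def required_redundancy(eps, t, delta):
    """Redundancy r for which exp(-eps * t / r) equals 1 - delta exactly."""
    return -eps * t / math.log(1 - delta)

def success_with_redundancy(eps, t, r):
    """Success probability after t steps when the per-step error is eps / r."""
    return math.exp(-eps * t / r)
```

For ε = 0.01, t = 1000 and δ = 0.1 this gives r of roughly 95, and doubling t doubles r: the redundancy grows only linearly with the length of the computation.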
We can therefore see that even with a moderately successful stabilization method, as
the one presented in this section, which is due to Barenco et al. (1997), we can stabilize
exponentially growing errors using only polynomial time computations.
The above definition does not bring a sufficient insight into the space SYM_H^{(r)}: one does
not see immediately that states |ψ1⟩|ψ2⟩ + |ψ2⟩|ψ1⟩ are in SYM_H^{(2)} if |ψ1⟩, |ψ2⟩ ∈ H. The
following theorem offers an alternative and equivalent definition.
Theorem 7.3.2 SYM_H^{(r)} is the subspace of all states in H^{(r)} which are symmetric in the
sense that they are unchanged under the interchange of sites for any pair of positions in the
tensor product of their basic states.
It is important that the subspace SYM_H^{(r)} is small. Its dimension is O(r^{n−1}).
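How small the symmetric subspace is can be made concrete. The sketch below relies on the standard counting fact that the symmetric subspace of the r-fold tensor power of an n-dimensional space has dimension C(n + r − 1, r), and it assumes that n denotes the dimension of H, which the bound O(r^{n−1}) suggests but the text here does not state explicitly. The function name is illustrative.

```python
import math

def sym_dimension(n, r):
    """Dimension of the symmetric subspace of r copies of an n-dimensional space."""
    return math.comb(n + r - 1, r)
```

For a single qubit (n = 2) and r = 10 copies the symmetric subspace has dimension 11, whereas the full space H^{(r)} has dimension 2^10 = 1024; for fixed n the growth in r is polynomial, of degree n − 1.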
3. Apply U^{−1} to the first ⌈lg r!⌉ qubits of |ψ1⟩. The resulting state is then
\[
|\psi_2\rangle = \frac{1}{\sqrt{r!}} \sum_{i=0}^{r!-1} |i, \xi_i\rangle,
\]
where |ξ_i⟩ ∈ H^{(r)}. Since U transforms |0^{(⌈lg r!⌉)}⟩ into an equal-amplitude superposition
of all |i⟩, U^{−1} transforms each |i⟩ back to |0^{(⌈lg r!⌉)}⟩ with equal amplitude. The
coefficient at |0^{(⌈lg r!⌉)}⟩ will therefore be an equal-amplitude superposition of all permutations
of |φ1⟩, . . . , |φr⟩, i.e., the required symmetrized state.
4. Measure the first ⌈lg r!⌉ qubits with respect to the standard basis. If the result
is 0^{(⌈lg r!⌉)}, then the state |ψ0⟩ = |φ1⟩ . . . |φr⟩ has been successfully projected into
SYM_n^{(r)}; otherwise the symmetrization failed.
7.4. QUANTUM ERROR-CORRECTING CODES 271
Due to the linearity of all processes the algorithm can be applied to the general state of
H (r) and not only to a basis state as illustrated above.
Exercise 7.3.4 Show that for the algorithm 7.3.3 the total number of operations is
O(r² lg n + (r lg r)²) and therefore the algorithm can be considered efficient, in spite
of the fact that we need to consider a creation of superpositions with r! members.
Exercise 7.3.5 Design a symmetrization network of the size O(r²) for r redundant
computations.
It has been shown by Barenco et al. (1997), by a detailed analysis, that both for unitary
errors and for errors due to decoherence, the stabilization by symmetrization of r-redundant
computations reduces the error in each step of stabilization by the factor 1/r.
are legal states. There are uncountably many such states. Since quantum evolution is
basically an analogue and therefore a continuous process, and since all the above states are
legal, there is no way to distinguish a state |φ⟩ from the one obtained from |φ⟩ by adding
some noise. Moreover, the number of possible errors seemed to be infinite and it seemed
that an error destroys a state in such a way that the original state cannot be recovered. In
addition, each attempt to make a restriction to a discrete set of states seemed to bring an
essential restriction to quantum computing.
However, it was shown first by Shor (1995) and Steane (1996), and soon by many others,
that good quantum error-correcting codes exist and they can protect qubits against general
types of error (which may be caused by imperfections or interactions with the environment).
Quantum error-correcting codes have to be based on different principles than classical error-
correcting codes, but they do exist. Since then the progress in the development of quantum
error-correcting codes has been remarkable—this has been one of the most successfully
developing areas of quantum computing and it is to a large extent due to these results, and
results discussed in the next section, on fault-tolerant quantum computing, that the vision
of real quantum computing is much closer.
The very basic idea of quantum computation with error-correcting codes goes as follows.
The evolution of the quantum computer is restricted to a subspace of the Hilbert space
carefully chosen in such a way that if quantum states are encoded using states of the chosen
subspace, then all departures from this subspace, due to errors, lead to mutually orthogonal
subspaces. After a quantum state is entangled with the environment and an “error” occurs,
one can determine, by a measurement, but without destroying the erroneous state, into
which of the error subspaces the erroneous state has fallen, and the error can be undone using
a unitary transformation.
However, it is far from trivial how to implement such an idea and how to utilize redun-
dancy for that. (It is well known that redundancy is not very useful in analogue computing.)
The ingenious idea of Shor and Steane was to use quantum entanglement for the design of
quantum error-correcting codes (QECC). The discovery of QECC caused much excitement
because it converted large-scale quantum computation from an impossibility to a possibility.
There are three new types of problem concerning quantum error-correcting codes, com-
pared to the classical situation:
2. The assumption that encoding and decoding are error-free is much less realistic.
A desirable error correction process can be seen as having the following form: the sender,
Alice, encodes a to-be-sent quantum state into a new quantum state which is then sent
through a noisy channel on which an error/noise operator acts and changes the transmitted
state. Encoding has to be such that even if the error operator changes the state being
transmitted, it cannot entangle it with the environment and, as a consequence, the receiver
Bob, who can act on the state he receives, but not on the environment, is able first to detect
which error operator was applied and then to undo its effect and so recover the original
state.
As discussed in more detail in Chapter 8, in order for Bob to be able to recover the state Alice
has sent, no information about her state should leak into the environment. Quantum codes
therefore have to hide information from the environment. The idea is to
use encodings in which the quantum information of k qubits is spread out
over n qubits in a non-local way through an entangled state, so that an environment
which can access only a small number of qubits can gain no information about the overall
state being transmitted; in this way the transmitted quantum information is protected.
Quantum error-correcting theory is a crucial part of quantum information theory. A
variety of quantum error-correction methods have their analogues in classical error-correcting
codes and rely heavily on their properties. That is why we start with a short summary of
the very basic concepts and methods of classical error-correcting codes. For more
see Hill (1986), Hoffman et al. (1991), van Lint (1995) and MacWilliams and Sloane (1977).
A code C is a subset of Σ_q^n for some n; its elements are called codewords. For error
detection and correction the minimal distance d(C) of a code C is of importance.
d(C_1, C_2) = min{hd(w_1, w_2) | w_1 ∈ C_1, w_2 ∈ C_2}.
This allows us to formulate one of the most basic results of the error-detecting and -correcting
codes.
Theorem 7.4.1 (i) A code C can detect up to s errors in any codeword if and only if
d(C) ≥ s + 1; (ii) A code C can correct up to t errors if and only if d(C) ≥ 2t + 1.
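The two conditions of Theorem 7.4.1 are easy to evaluate for a small code. The following is a minimal sketch, assuming that hd denotes the Hamming distance and that d(C) is the minimum of hd over pairs of distinct codewords; the function names and the repetition code used as input are illustrative choices, not taken from the text.

```python
from itertools import combinations

def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def minimal_distance(code):
    """d(C): smallest Hamming distance between two distinct codewords."""
    return min(hamming_distance(u, v) for u, v in combinations(code, 2))

# Binary repetition code of length 3 (illustrative choice).
code = ["000", "111"]
d = minimal_distance(code)
detectable = d - 1           # largest s with d >= s + 1
correctable = (d - 1) // 2   # largest t with d >= 2t + 1
print(d, detectable, correctable)  # 3 2 1
```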
Definition 7.4.2 An (n, M, d)-code is a code of M words of length n and minimal distance
d. A_q(n, d) denotes the largest M such that there exists a q-nary (n, M, d)-code.
Definition 7.4.4 Two q-nary codes are called equivalent if one can be obtained from the
other by a combination of the following operations:
One of the aims of coding theory is to find perfect codes. In order to define them
let, for u ∈ Σ_q^n and r ≥ 0, S(u, r) = {v ∈ Σ_q^n | hd(u, v) ≤ r} be the sphere of radius r.
words.
Exercise 7.4.6 Show that a q-nary (n, M, 2t+1)-code satisfies the following space packing
or Hamming inequality (bound):
\[ M \left\{ \binom{n}{0} + \binom{n}{1}(q-1) + \dots + \binom{n}{t}(q-1)^t \right\} \le q^n. \]
Definition 7.4.7 An (n, M, 2t + 1)-code C is called perfect (see footnote 4) if the equality holds in 7.4.6.
4 It is already pretty well known which codes are perfect.
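Whether a given triple of parameters meets the Hamming bound with equality can be checked by direct counting. The sketch below only evaluates the inequality of Exercise 7.4.6; it does not decide whether a code with these parameters exists. The function names are illustrative, and the second parameter set is a hypothetical example chosen only because it does not meet the bound.

```python
from math import comb

def sphere_size(n, q, t):
    """Number of words within Hamming distance t of a fixed word of length n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

def is_perfect(n, M, t, q=2):
    """True if the parameters (n, M, 2t+1) meet the Hamming bound with equality."""
    return M * sphere_size(n, q, t) == q ** n

print(is_perfect(7, 16, 1))  # True:  16 * (1 + 7) = 128 = 2^7
print(is_perfect(5, 4, 1))   # False: 4 * (1 + 5) = 24 < 32
```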
Linear codes
In order to formulate elegantly basic concepts and results of linear codes it is useful to
consider words u_1 u_2 ... u_n of Σ_q^n as vectors (u_1, u_2, ..., u_n) of length n with elements from
Z_q. The set of all such vectors is denoted V(n, q). Component-wise addition of two vectors
and scalar/vector multiplications in V(n, q) are done in Z_q.
Definition 7.4.9 A code C over V(n, q) is called linear if C is a subspace of
V(n, q), q prime.
Exercise 7.4.10 Show that a subset C ⊆ V(n, q) is a linear code if and only if: (1)
u + v ∈ C for all u, v in C; (2) au ∈ C for all u ∈ C, a ∈ Z_q.
If the dimension dim(C) of a linear code C in V(n, q), as that of the subspace C, is k
then C is said to be an [n, k]-code. If, in addition, C is of distance d, then it is said to be
an [n, k, d]-code. In other words, an [n, k, d]-code is a code by which n bits can store k bits of
information in such a way that correction of up to ⌊(d − 1)/2⌋ errors is always possible.
The rate of an [n, k, d]-linear code C is k/n. This is the ratio of the information content
of a codeword to the information content of an arbitrary string of length n.
If C is a linear code, then C^⊥ = {w | u · w = 0 for all u ∈ C} is called the dual code to C.
A code C is self-dual if C ⊥ = C.
Exercise 7.4.11 Show that all codewords of a binary self-dual code have an even number
of ones.
Exercise 7.4.12 Show that if C is an [n, k]-code over Z_q, then C^⊥ is an [n, n − k]-code
over Z_q.
Exercise 7.4.13 Show that two k × n matrices generate equivalent linear [n, k]-codes
over Zq if one matrix can be obtained from the other by a sequence of operations of the
following type: (1) permutation of rows; (2) multiplication of a row by a non-zero scalar;
(3) addition of a scalar multiple of one row to another; (4) permutation of columns; (5)
multiplication of any column by a non-zero scalar.
A matrix G whose rows are all vectors of a basis of a linear code C (as a subspace)
is said to be a generator of C. A generator matrix H of the dual code C ⊥ is called the
parity-check matrix of C.
If G is a generator matrix of an [n, k]-code C, then C = ⋃_{v ∈ {0,1}^k} vG. The name “parity-
check matrix” is derived from the fact that a parity-check matrix H of a code C can be used to
test whether a given word w is in C. Indeed, w ∈ C if and only if Hw^T = 0.
The following theorem, easy to show, provides a simple way to construct a parity-check
matrix of a linear code with a given generator matrix and vice versa.
Theorem 7.4.14 If G is the generator matrix of an [n, k]-code C written in the form G =
[I_k | A], where I_k is the k × k unit matrix, then a parity-check matrix for C is H = [−A^T | I_{n−k}].
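The construction of Theorem 7.4.14 can be sketched in a few lines. The generator matrix G below is an illustrative binary [7, 4] example in standard form, not one given in the text; it is chosen so that the resulting parity-check matrix coincides with the matrix H′ of Example 7.4.22. The function name is hypothetical.

```python
import numpy as np

def parity_check_from_generator(G, q=2):
    """Given G = [I_k | A] over Z_q, return H = [-A^T | I_{n-k}] reduced mod q."""
    G = np.asarray(G) % q
    k, n = G.shape
    assert np.array_equal(G[:, :k], np.eye(k, dtype=int)), "G must be in standard form"
    A = G[:, k:]
    return np.hstack([(-A.T) % q, np.eye(n - k, dtype=int)])

# Illustrative generator matrix of a binary [7, 4]-code (an assumption).
G = np.array([[1, 0, 0, 0, 0, 1, 1],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 1, 1, 1]])
H = parity_check_from_generator(G)
print(H)
print((G @ H.T) % 2)  # zero matrix: rows of G are orthogonal to rows of H
```

Encoding a message u with this code is then the product (u @ G) % 2.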
It follows from the definition of the parity-check matrix H of a code C that for each
w ∈ C
wH^T = 0^T, 0 = Hw^T
and if w ∉ C, w = w_1 + w_e, w_1 ∈ C, then w_1 H^T = 0^T, Hw_1^T = 0, and hence
wH^T = w_e H^T. This means that the
row space of H is orthogonal to C. In addition, GH^T = 0^T, 0 = HG^T.
Exercise 7.4.15 Denote by w(u) the (Hamming) weight of a word u. For a code C
let w(C) = min{w(u) | u ∈ C − {0}}. Show that d(C) = w(C) for any linear code C.
Encoding with linear codes. If C is an [n, k]-code over Z_q with a generator matrix
G, then C contains q^k codewords and therefore it can be used to communicate up to q^k
distinct messages. Let us identify messages with words in V(k, q). Encoding of a message u
is done by the matrix multiplication uG.
Syndrome decoding with linear codes is also easy, but several new concepts are
needed to formulate an efficient algorithm.
Definition 7.4.16 If C is an [n, k]-code over Z_q and a is any vector in V(n, q), then the
set a + C = {a + x | x ∈ C} is called a coset of C. A vector of a coset with the minimum
weight is its leader (which does not have to be unique).
Exercise 7.4.17 Suppose C is an [n, k]-code over Z_q. Show: (a) every vector of V(n, q)
is in some coset of C; (b) any coset contains exactly q^k vectors; (c) two cosets of C are
either identical or disjoint.
Suppose H is a parity-check matrix of an [n, k]-code C. For any y ∈ V(n, q) the row
vector S(y) = yH^T is called the syndrome of y (with respect to C).
As discussed above, if w = w_1 + w_e with w_1 ∈ C, then S(w) = S(w_e) and therefore
the syndrome only depends on the word (vector) w_e. That means that a syndrome specifies
an error without revealing anything about the codeword w_1 itself. This is an important
property of the syndromes of linear codes that plays the key role in several quantum error
correcting codes.
Exercise 7.4.18 Show that two vectors are in the same coset if and only if they have the
same syndrome.
Decoding is now easy once we have constructed the so-called standard array for an
[n, k]-code C. It is a q^(n−k) × q^k array of all vectors in V(n, q). The first row contains all
codewords of C starting with the word 0^(n). The first column contains leaders of all cosets.
Every other entry in the array is the sum of the element in the first row of the corresponding
column and the element in the first column of the corresponding row. In addition, one column is added
whose ith element is the syndrome of the coset in the ith row.
Algorithm 7.4.19 (Syndrome decoding for linear codes) Given a word y to decode
do the following:
1. Compute S(y) = yH^T;
2. Decode y as y − l_y, where l_y is the coset leader in the coset with the syndrome S(y).
In order to perform decoding, or error correction, one needs to flip the erroneous bits, i.e.,
to apply the NOT operation to them.
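Algorithm 7.4.19 can be sketched as follows. Instead of storing the whole standard array, the sketch keeps only a table from syndromes to coset leaders, found by brute-force enumeration, which is feasible only for small n. The binary [5, 2]-code defined by the matrix H below, and all function names, are illustrative assumptions.

```python
from itertools import product
import numpy as np

def syndrome(y, H, q=2):
    """S(y) = y H^T over Z_q, returned as a tuple."""
    return tuple(int(x) for x in (np.asarray(y) @ H.T) % q)

def coset_leader_table(H, q=2):
    """Map each syndrome to a minimum-weight vector (a coset leader) producing it."""
    n = H.shape[1]
    table = {}
    # Enumerate all q^n vectors by increasing weight; the first vector seen
    # for a given syndrome is a leader of the corresponding coset.
    by_weight = sorted(product(range(q), repeat=n),
                       key=lambda v: sum(x != 0 for x in v))
    for v in by_weight:
        table.setdefault(syndrome(v, H, q), np.array(v))
    return table

def decode(y, H, table, q=2):
    """Subtract the coset leader that has the same syndrome as y."""
    return (np.asarray(y) - table[syndrome(y, H, q)]) % q

# Parity-check matrix of an illustrative [5, 2]-code with codewords
# 00000, 10110, 01011, 11101 (minimal distance 3).
H = np.array([[1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 1, 0, 0, 1]])
table = coset_leader_table(H)
received = [1, 0, 0, 1, 0]             # codeword 10110 with its third bit flipped
print(decode(received, H, table))      # [1 0 1 1 0]
```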
Exercise 7.4.20 (Singleton bound) Show that if C is an (n, k, d) linear code, then d ≤
n − k + 1.
The Hamming bound, page 273, and Exercise 7.4.20 provide upper bounds on the size
of linear (n, k, d)-codes with the given distance. The Gilbert–Varshamov bound
\[ \sum_{j=0}^{d-2} \binom{n-1}{j} < 2^{n-k} \]
Hamming codes
The Hamming codes are interesting and important examples of the single error-correcting
linear codes with easy encoding and decoding.
Definition 7.4.21 Let r ∈ N^+ and H be an r × (2^r − 1) matrix whose columns are distinct
non-zero vectors of V(r, 2). The code having H as its parity-check matrix is called a binary
Hamming code and denoted by Ham(r, 2).
Example 7.4.22 Ham(3, 2) is the Hamming code with the parity-check matrices
\[
H = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix}
\quad\text{or}\quad
H' = \begin{pmatrix} 0&1&1&1&1&0&0 \\ 1&0&1&1&0&1&0 \\ 1&1&0&1&0&0&1 \end{pmatrix}.
\]
The main theoretical results on Hamming codes are summarized in the following theorem:
Theorem 7.4.23 The Hamming code Ham(r, 2) has the following properties: (1) it is a
[2^r − 1, 2^r − 1 − r]-code; (2) it has minimum distance 3; (3) it is a perfect code.
Hence the Ham(3, 2)-code is a [7, 4, 3]-code. The dual to a Hamming code is called the simplex
code or maximal-length feedback shift register code.
Exercise 7.4.24 Show that the dual code to the Hamming [7, 4, 3]-code consists of Hamming
code codewords of even weight.
The dual of the extended Hamming code is also an important code: the first-order Reed–Muller
code. Since each Hamming code is linear, encoding with it is easy as described
above. To describe decoding observe first that all coset leaders are exactly all vectors of
weight ≤ 1. The syndrome of each such coset leader (0, . . . , 1, . . . , 0) with 1 in the jth
position is just the transpose of the jth column of H. Therefore, if the columns of H are
arranged in order of the increasing binary numbers they represent, as in the example above,
we have the following decoding method:
Example 7.4.26 If y = 1110011, then S(y) = 001 and therefore y is decoded as 0110011.
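Example 7.4.26 can be reproduced with a short sketch. It assumes the decoding rule suggested by the preceding paragraph: with the columns of H ordered as increasing binary numbers, a non-zero syndrome, read as a binary number, is the position of the erroneous bit. The function name is illustrative.

```python
import numpy as np

# Parity-check matrix H of Ham(3, 2) from Example 7.4.22: column j is the
# binary representation of j, most significant bit in the first row.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def hamming_decode(y):
    """Correct at most one bit error in a length-7 word given as a bit string."""
    bits = np.array([int(b) for b in y])
    s = (H @ bits) % 2                        # syndrome S(y)
    position = int("".join(map(str, s)), 2)   # syndrome read as a binary number
    if position:                              # non-zero syndrome: flip that bit
        bits[position - 1] ^= 1
    return "".join(map(str, bits)), "".join(map(str, s))

print(hamming_decode("1110011"))  # ('0110011', '001')
```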
Cyclic codes
Definition 7.4.27 C is a cyclic code if it is a linear code and if a_{n−1} a_{n−2} ... a_0 ∈ C
implies that a_0 a_{n−1} ... a_1 ∈ C.
Cyclic codes have an interesting algebraic structure. To see it let us identify a polynomial
a_0 + a_1 x + a_2 x^2 + ... + a_{n−1} x^{n−1} with each codeword a_{n−1} a_{n−2} ... a_0.
Moreover, denote by R_n^p the set of all polynomials of one variable over Z_p, with p a
prime, taken modulo the polynomial x^n − 1. For a polynomial f(x) ∈ R_n^p denote ⟨f(x)⟩ =
{r(x)f(x) | r(x) ∈ R_n^p}. It holds:
Theorem 7.4.28 (1) For any f(x) ∈ R_n^p, the set ⟨f(x)⟩ is a cyclic code (generated by
f(x)). (2) If C is a non-zero cyclic code in R_n^p, then there is a polynomial g(x) such that
C = ⟨g(x)⟩ and g is a factor of x^n − 1.
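Part (1) of Theorem 7.4.28 can be illustrated by brute force for small parameters. The sketch below takes n = 7, p = 2 and g(x) = 1 + x + x^3, which is a factor of x^7 − 1 over Z_2; this particular choice and the function names are illustrative assumptions. Polynomials are stored as coefficient tuples with the lowest degree first.

```python
from itertools import product

def polymul_mod(a, b, n, p=2):
    """Multiply polynomials a and b modulo x^n - 1 over Z_p."""
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[(i + j) % n] = (result[(i + j) % n] + ai * bj) % p
    return tuple(result)

def generated_code(f, n, p=2):
    """The set <f(x)> = { r(x) f(x) : r(x) in R_n^p } as coefficient tuples."""
    return {polymul_mod(r, f, n, p) for r in product(range(p), repeat=n)}

def shift(word):
    """Multiplication by x modulo x^n - 1, i.e. one cyclic shift."""
    return word[-1:] + word[:-1]

n = 7
g = [1, 1, 0, 1]                       # g(x) = 1 + x + x^3
C = generated_code(g, n)
print(len(C))                          # 16
print(all(shift(w) in C for w in C))   # True
```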
A(n, d)A(n, d′) ≥ 2^n.
If C is an [n, k]-linear code, then C_dual has a neat form, which is easy to show:
According to Exercise 7.4.12, C_dual is an [n, n − k]-code with 2^(n−k) codewords. Linear
codes are therefore codes for which the equality holds in the inequality (2.5).
Classical error-correction techniques cannot be directly applied to quantum information
processing for two main reasons: (1) it is not possible, in general, to copy or measure
qubits without causing undesirable effects; (2) it is not sufficient to correct 0/1 values of
qubits—also amplitudes need to be preserved and this is a completely new feature with
which quantum error-correcting techniques have to be able to deal.
Error model
Inaccuracies, noise and decoherence can be described in terms of the most general quantum
operators—superoperators—or, equivalently, in terms of the unitary operators on the system
and its environment.
There is a large variety of possible quantum errors (see footnote 6). However, to consider QECC
successfully it is quite sufficient to make several simple, but (quasi-)realistic assumptions
concerning the character, frequency and types of errors (see footnote 7).
We shall assume that errors, due to decoherence and inaccuracies, on different qubits
or on the same qubit at different times are random and statistically uncorrelated. Namely,
that they are locally independent (errors in different qubits or gates are not correlated)
and sequentially independent (subsequent errors on the same qubit or in the same gate
are not correlated) (see footnote 8). (In other words, it is assumed that there are no interactions between
environments of different qubits and also between environments of the same qubit at different
time steps.) No knowledge about their nature will be assumed (see footnote 9). (As a consequence, an
error operator on n qubits can be written at each time step as a tensor product of errors
on particular qubits.) If these conditions are satisfied, then it is believed that errors are
correctable provided the error rate is below 10^(−5) per qubit and clock cycle (DiVincenzo and
Terhal, 1998). As a consequence, the above error model implies that a correlation between
errors on different qubits can exist only in the case of qubits interacting through a quantum
gate.
6 The term “error” is used here in a special way. As pointed out by Peres (1996) “A computer is a physical
system, subject to the laws of nature. No errors occur in the application of these laws. What we call an
error is a mismatch between what the computer does and what we wanted it to do.”
7 Without assumptions on how errors occur it is not possible to prove nontrivial results on error correction.
8 If additional information is available about the error process, more efficient quantum error-correcting
procedures can be developed to deal with errors of such processes. See, for example, Plenio, Vedral and
Knight (1996), for the case that the error process is a spontaneous emission.
Error types
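A general interaction between a qubit α|0⟩ + β|1⟩ and its environment leads to the evolution
of the form:
\[ |e\rangle(\alpha|0\rangle + \beta|1\rangle) \to \alpha(|e_{00}\rangle|0\rangle + |e_{01}\rangle|1\rangle) + \beta(|e_{11}\rangle|1\rangle + |e_{10}\rangle|0\rangle), \tag{7.3} \]
where |e⟩, |e_00⟩, |e_01⟩, |e_10⟩ and |e_11⟩ are states of the environment.
The right-hand side of (7.3) can now be written in the form
\[ (|e_{0+}\rangle I + |e_{0-}\rangle\sigma_z + |e_{1+}\rangle\sigma_x - |e_{1-}\rangle i\sigma_y)(\alpha|0\rangle + \beta|1\rangle), \tag{7.4} \]
where, comparing the coefficients of α|0⟩, β|1⟩, α|1⟩ and β|0⟩ in (7.3) and (7.4),
\[ |e_{0+}\rangle = \tfrac{1}{2}(|e_{00}\rangle + |e_{11}\rangle), \qquad |e_{0-}\rangle = \tfrac{1}{2}(|e_{00}\rangle - |e_{11}\rangle), \tag{7.5} \]
\[ |e_{1+}\rangle = \tfrac{1}{2}(|e_{01}\rangle + |e_{10}\rangle), \qquad |e_{1-}\rangle = \tfrac{1}{2}(|e_{01}\rangle - |e_{10}\rangle), \tag{7.6} \]
and σ_x, σ_y, σ_z are Pauli matrices.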
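The decomposition (7.4) can be checked numerically. The sketch below uses the coefficients obtained by matching the terms α|0⟩, β|1⟩, α|1⟩ and β|0⟩ of (7.3) against I, σz, σx and −iσy, namely |e0±⟩ = ½(|e00⟩ ± |e11⟩) and |e1±⟩ = ½(|e01⟩ ± |e10⟩) under the index labelling of (7.3). The environment dimension, the random environment vectors and the amplitudes are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_env = 3  # illustrative environment dimension (an assumption)

# Arbitrary (unnormalized) environment vectors e00, e01, e10, e11.
e00, e01, e10, e11 = (rng.normal(size=dim_env) + 1j * rng.normal(size=dim_env)
                      for _ in range(4))
alpha, beta = 0.6, 0.8j
ket0, ket1 = np.array([1, 0]), np.array([0, 1])
psi = alpha * ket0 + beta * ket1

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

# Right-hand side of (7.3) as a vector in (environment) x (qubit).
rhs_73 = (alpha * (np.kron(e00, ket0) + np.kron(e01, ket1))
          + beta * (np.kron(e11, ket1) + np.kron(e10, ket0)))

# Coefficients obtained by matching terms of (7.3) against I, Z, X, -iY.
e0p, e0m = (e00 + e11) / 2, (e00 - e11) / 2
e1p, e1m = (e01 + e10) / 2, (e01 - e10) / 2
form_74 = (np.kron(e0p, I @ psi) + np.kron(e0m, Z @ psi)
           + np.kron(e1p, X @ psi) - np.kron(e1m, (1j * Y) @ psi))

print(np.allclose(rhs_73, form_74))  # True
```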
The key ideas behind quantum error-correction codes can be seen by looking carefully at
the state (7.4): (1) any error can be seen as being composed of four basic errors and therefore
if we are able to correct any of these four types of error, we can correct any error; (2) the error
model resembles a discrete one more than a continuous one; (3) the resulting state of the
environment is independent of the state on which an error process acts and depends only
on the type of error operator being applied. This also suggests the following error-detection
and -recovery process: (1) compute which type of error has occurred (an error “syndrome”
is computed); (2) undo the errors.
The impact of Pauli matrices on a qubit |φ⟩ = α|0⟩ + β|1⟩ is shown in Figure 7.1.
σ_x therefore stands for the bit (flip) error (or “amplitude error”) and σ_z for the sign (flip)
error (or “phase (shift)” error); σ_y and σ_x σ_z stand for a bit-sign (flip) error (or “bit-phase”
error), a combination of the bit error and the phase error.
9 Sometimes the so-called no leakage assumption is made: a physical system which implements a qubit
has access only to the two-dimensional Hilbert space defined by the qubit. A photon with two basis states
represented by the horizontal and vertical polarizations is an example of a system which, without modifi-
cations, does not satisfy this assumption (photons have a tendency to be scattered or absorbed and in this
way lost for the computation).
Observe that σ_y = iσ_x σ_z. That is why sometimes a slightly different error model is used,
with three types of errors represented by matrices X = σ_x, Z = σ_z and Y = σ_x σ_z.
In the case of an n-qubit register the general type of error is therefore represented by
the matrix
\[ M = \bigotimes_{i=1}^{n} M_i, \]
where each M_i is one of the matrices X, Y, Z, I.
Example 7.4.31 Let us explore perhaps the simplest idea for a quantum error-correcting
code (Peres, 1985, Aharonov, 1998), namely the encoding of the basis states |0⟩ → |000⟩, |1⟩ → |111⟩,
which results in the encoding of the general one-qubit state α|0⟩ + β|1⟩ → α|000⟩ + β|111⟩.
Unfortunately, such an encoding is not good enough because it does not protect the quantum
state even against one error. Indeed, let us assume that a noise operator operates on the
first qubit and the environment |e⟩ in such a simple way that it does not change the first
qubit, but it changes the environment depending on the value of the qubit:
The resulting state is entangled with the environment and Bob cannot disentangle it by a
local action and cannot recover the original state.
E(|φ⟩|0^(n−k)⟩) → |φ_E⟩,
where |φ_E⟩ is said to be the quantum code of |φ⟩ defined by E. The encodings of
the basis states of k qubits are called codewords and they form an orthonormal basis of a
2^k-dimensional subspace of H_{2^n}.
If an error occurs in a codeword |φ_E⟩, then |φ_E⟩ is altered by some linear transformation,
superoperator, E and
\[ |\phi_E\rangle \xrightarrow{E} |E\phi_E\rangle. \]
(E is not required to be unitary; there is a need to correct also non-unitary errors.)
An error-correction process (ECP) can now be modeled by unitary transformations that
first entangle the erroneous state |Eφ_E⟩ with an ancilla (an auxiliary state of auxiliary
qubits), and then transform the resulting entangled state into a tensor product of the original
state |φ_E⟩ and a new state |A_E⟩ of the ancilla:
\[ |E\phi_E\rangle|A\rangle \xrightarrow{\mathrm{ECP}} |\phi_E\rangle|A_E\rangle. \]
Since the state |φ_E⟩|A_E⟩ is not entangled we can measure |A_E⟩ without disturbing |φ_E⟩
and this way we can determine a transformation which has to be applied to E|φ_E⟩ to get
|φ_E⟩.
Let us now look into the error-creation and -correction process in more detail for the
important case where erroneous states have the form
\[ \sum_{s=1}^{l} M_s|\phi_E\rangle \quad\text{or}\quad \sum_{s=1}^{l} |\psi_{env}^{s}\rangle M_s|\phi_E\rangle, \tag{7.7} \]
where each M_s is a tensor product of n error matrices from the set {X, Y, Z, I} (and it is
called an “error operator” or an “error”) and |ψ_env^s⟩ are states of the environment. (As
discussed more in Section 7.4.5, such error operators generate a group which will be denoted
by G_n.)
The basic task is to determine, without disturbing |φ_E⟩ in an irreversible way, an operation
that has to be performed in order to get |φ_E⟩ out of E|φ_E⟩. The basic idea is to compute,
as in the case of linear codes, syndromes of errors without disturbing |φ_E⟩. In order to do
that additional qubits of ancilla are introduced in a special initial state, for example in the
state |0^(n−k)⟩. In order to compute syndromes, a syndrome-extraction operator S is applied
with the effect
\[ \sum_{s=1}^{l} |\psi_{env}^{s}\rangle \left( M_s|\phi_E\rangle|s\rangle \right). \tag{7.8} \]
Since the states |s⟩ are orthogonal we can measure the ancilla qubits in the basis {|s⟩} to
get:
\[ |\psi_{env}^{s_0}\rangle \left( M_{s_0}|\phi_E\rangle|s_0\rangle \right) \]
for a single, randomly chosen, s_0. This is excellent: instead of a complicated erroneous state
we have now only one error operator M_{s_0} and by applying M_{s_0}^{-1}
we get as the result the state
|ψ_env^{s_0}⟩|φ_E⟩|s_0⟩. Therefore, the state |φ_E⟩ has been reconstructed; it is no longer entangled.
As shown above, the quantum error-correcting processes are, surprisingly, discrete
processes, not continuous ones. The key discretization step is projection measurement.
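The syndrome-and-recovery pattern described above can be illustrated on a deliberately simplified case: the three-qubit repetition encoding α|000⟩ + β|111⟩ subjected only to single bit (X) errors. This is a sketch under stated assumptions, not the general procedure: the environment is left out, and instead of coupling to ancilla qubits the syndrome is read off as the eigenvalues of the parity operators Z1Z2 and Z2Z3, which are sharp (+1 or −1) on every state considered. All names are illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def op(single, position, n=3):
    """Operator acting with `single` on qubit `position` (0 = leftmost) of n qubits."""
    out = np.array([[1.0]])
    for i in range(n):
        out = np.kron(out, single if i == position else I2)
    return out

alpha, beta = 0.6, 0.8
encoded = np.zeros(8)
encoded[0b000], encoded[0b111] = alpha, beta   # alpha|000> + beta|111>

# Parity checks Z1 Z2 and Z2 Z3; their eigenvalues form the syndrome.
checks = [op(Z, 0) @ op(Z, 1), op(Z, 1) @ op(Z, 2)]
# Syndrome -> qubit to flip back (None: no error detected).
recovery = {(1, 1): None, (-1, 1): 0, (-1, -1): 1, (1, -1): 2}

def correct(state):
    """Compute the syndrome of `state` and undo the indicated bit error."""
    syndrome = tuple(int(round(float(state @ c @ state))) for c in checks)
    qubit = recovery[syndrome]
    restored = state if qubit is None else op(X, qubit) @ state
    return syndrome, restored

for flipped in range(3):
    syndrome, restored = correct(op(X, flipped) @ encoded)
    print(flipped, syndrome, np.allclose(restored, encoded))
```

The syndrome depends only on which qubit was flipped and not on α and β, which is the property the text stresses: the error is identified without learning anything about the encoded state.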
Remark 7.4.32 Actually, it is not necessary to measure ancilla qubits to get the syndrome.
Indeed, after the syndrome extraction one can apply a unitary operator C
such that C(|x⟩|s⟩) = (M_s^{-1}|x⟩)|s⟩ to the sum in (7.8). The final state then would be
\[ |\phi_E\rangle \sum_{s=1}^{l} |\psi_{env}^{s}\rangle|s\rangle; \]
the entanglement between the state and the environment is transferred
into the entanglement between the environment and the ancilla.
A quantum t error-correcting code is a (unitary) mapping of k qubits into a subspace
of a quantum space of n > k qubits such that errors in any t of the qubits can be corrected;
i.e. the original quantum state can be perfectly recovered from the remaining n − t qubits.
The simplest case to consider is k = 1, even though it is quite clear that “more efficient” QECC
are expected to exist for k > 1.
Remark 7.4.33 Observe that whether a code is degenerate or not depends on the set of
errors that is expected to get corrected. For example, a degenerate two error-correcting code
can be nondegenerate when considering it as a one error-correcting code.
Exercise 7.4.34 How should condition (7.11) be changed if we wanted from a code
only the capability to detect an error or a set of errors?
\[ 2(3n + 1) \le 2^n \tag{7.16} \]
and 5 is the minimal n satisfying (7.16). Three questions now immediately come up:
where H is the Hadamard matrix, an application of which transforms the states expressed
in the standard basis to the dual basis and vice versa. In other words a sign error in the
standard basis is the bit error in the dual basis and vice versa.
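The statement that a sign error in one basis is a bit error in the other can be checked directly on the matrices. The sketch below assumes the usual matrix forms of the Hadamard matrix H and of the bit and sign errors X = σx and Z = σz.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard matrix
X = np.array([[0, 1], [1, 0]])                 # bit error
Z = np.array([[1, 0], [0, -1]])                # sign error

# Conjugation by H exchanges the two error types (H is its own inverse).
print(np.allclose(H @ Z @ H, X))  # True
print(np.allclose(H @ X @ H, Z))  # True
```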
Together with the fact that the classical error-correcting codes are used to correct bit
errors, the identities (7.17) suggest a simple approach to quantum error correction for the
case that only bit and sign errors are expected.
A classical error-correcting code C_1 is used to correct bit errors. In order to correct sign
errors the erroneous state is first transformed to the dual basis; sign errors are then corrected,
provided the basis transformation yields codewords with support from some error-correcting
code C_2. The resulting state is then transformed back to the standard basis. This will be
described later.
Notation
{n, k, d_1, d_2} is used to denote quantum codes C that encode k qubits with n qubits and
can correct ⌊(d_1 − 1)/2⌋ bit errors and ⌊(d_2 − 1)/2⌋ sign errors. Notation {n, k, d},
or [[n, k, d]], is used for {n, k, d, d} codes. Finally, notation {n, k, d}+ denotes codes whose
codewords are superpositions in which all basis states have amplitude 1.
Remark 7.4.35 Interestingly enough, quantum error-correcting codes are able to fight de-
coherence caused by entanglement of quantum systems with the environment, using again
entanglement as the main tool. As pointed out by Preskill (1998), “we can fight entangle-
ment with entanglement”.
Figure 7.2: Examples of 1-qubit quantum error-correcting codes; all superpositions are
equally weighted, but amplitudes are omitted in the table
Exercise 7.4.36 (Shor, 1995) (a) Show that if p is the probability of one qubit error,
then the probability that at least 2 qubits out of 9 are erroneous is approximately 36p^2;
(b) Show that if k qubits are encoded using Shor’s 9-qubit code, then the probability that
9k-qubits can be decoded is (1 − 36p^2)^k; (c) determine probabilities as in the cases (a) and
(b) for LMPZ’s code.
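Part (a) can be sanity-checked numerically before it is proved. The sketch below assumes that each of the 9 qubits is erroneous independently with probability p and compares the exact probability of at least two errors with the leading term, whose coefficient is the number of pairs of qubits, C(9, 2) = 36. The function name is illustrative.

```python
from math import comb

def prob_at_least_two_errors(p, n=9):
    """P(at least 2 of n qubits are erroneous), independent errors of probability p."""
    return 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)

for p in (1e-2, 1e-3, 1e-4):
    exact = prob_at_least_two_errors(p)
    approx = comb(9, 2) * p ** 2   # 36 p^2
    print(p, exact, approx)
```

The two columns agree better as p decreases, since the neglected terms are of order p^3.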
Shor’s code was obtained from the repetition code {|000⟩, |111⟩} by replacing |0⟩ with
(1/√2)(|000⟩ + |111⟩) and |1⟩ with (1/√2)(|000⟩ − |111⟩), two states of the Bell basis in H_8:
\[ \tfrac{1}{\sqrt{2}}(|000\rangle \pm |111\rangle),\quad \tfrac{1}{\sqrt{2}}(|001\rangle \pm |110\rangle),\quad \tfrac{1}{\sqrt{2}}(|010\rangle \pm |101\rangle),\quad \tfrac{1}{\sqrt{2}}(|011\rangle \pm |100\rangle). \]
The starting classical code for Steane’s code was the simplex (7, 8, 4)-code
C = {0000000, 1010101, 0110011, 1100110, 0001111, 1011010, 0111100, 1101001}.
Using the standard-to-dual basis transformation we get C_dual = C^⊥ = Ham(3, 2)
(page 276). The resulting Steane’s code is now obtained as
\[ |0_E\rangle = \sum_{\substack{c \in Ham(3,2) \\ c\ \mathrm{even}}} |c\rangle, \qquad |1_E\rangle = \sum_{\substack{c \in Ham(3,2) \\ c\ \mathrm{odd}}} |c\rangle. \]
The point is that the code C has distance 4 and C^⊥ has still large enough distance, 3, to
correct one bit error. Observe also that in both codes, |0_E⟩ and |1_E⟩, the last three bits of
any codeword uniquely determine the first four. (They are actually the parity bits.)
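The supports of the two Steane codewords can be generated by brute force. The sketch below assumes that Ham(3, 2) is defined by the parity-check matrix H of Example 7.4.22 and that “even” and “odd” refer to the Hamming weight of a codeword; it also checks that the even-weight codewords are exactly the simplex code C listed above, which is the row space of H.

```python
from itertools import product
import numpy as np

H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

# Ham(3, 2): all words w of length 7 with H w^T = 0.
words = [np.array(w) for w in product((0, 1), repeat=7)]
hamming = ["".join(map(str, w)) for w in words if not ((H @ w) % 2).any()]

support_0 = sorted(c for c in hamming if c.count("1") % 2 == 0)   # basis states of |0_E>
support_1 = sorted(c for c in hamming if c.count("1") % 2 == 1)   # basis states of |1_E>

# Simplex code: all linear combinations of the rows of H.
simplex = sorted("".join(map(str, (np.array(v) @ H) % 2))
                 for v in product((0, 1), repeat=3))
listed = sorted(["0000000", "1010101", "0110011", "1100110",
                 "0001111", "1011010", "0111100", "1101001"])

print(len(hamming), len(support_0), len(support_1))  # 16 8 8
print(support_0 == simplex == listed)                # True
```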
LMPZ’s code was obtained experimentally, by analysis of the orthogonality conditions,
as discussed in the next subsection, and by a computer search (Laflamme et al. 1996).
Finally, Barenco’s 3-qubit code can be obtained from the repetition code {000, 111},
using the Hadamard rotation on all three qubits providing:
\[ |0\rangle \to |000\rangle \to \tfrac{1}{\sqrt{8}}(|000\rangle + |001\rangle + |010\rangle + |011\rangle + |100\rangle + |101\rangle + |110\rangle + |111\rangle) = \tfrac{1}{\sqrt{2}}(|0_E\rangle + |1_E\rangle), \]
\[ |1\rangle \to |111\rangle \to \tfrac{1}{\sqrt{8}}(|000\rangle - |001\rangle - |010\rangle + |011\rangle - |100\rangle + |101\rangle + |110\rangle - |111\rangle) = \tfrac{1}{\sqrt{2}}(|0_E\rangle - |1_E\rangle). \]
10 LMPZ code is equivalent up to a change of basis of individual qubits to the code independently discovered
by Bennett et al. (1996a).
Exercise 7.4.37 Determine which bit, sign and bit–sign errors map the following codes
into mutually orthogonal states: (a) Shor’s 9-qubit code; (b) Steane’s 7-qubit code; (c)
LMPZ’s 5-qubit code; (d) Barenco’s 3-qubit code.
Encoding circuits—encoders
To use a quantum code with mappings |0⟩ → |0_E⟩, |1⟩ → |1_E⟩, a quantum circuit is needed
to transform an arbitrary quantum state α|0⟩ + β|1⟩ into the state α|0_E⟩ + β|1_E⟩. Encoding
circuits for Steane’s and LMPZ’s codes are shown in Figures 7.3a,b.
[Circuit diagrams (a) and (b) not reproduced.]
Figure 7.3: Encoding circuits for (a) Steane’s and (b) LMPZ’s codes; the π-gate realizes a π-rotation
The circuit to produce Steane’s code is simple (Preskill, 1998) and makes use of the fact
that codewords of the support of |1_E⟩ can be obtained from the codewords of the support for
|0_E⟩ by flipping the bits. The first two XORs produce the state α|0000000⟩ + β|1110000⟩. The
three Hadamard gates produce an equally weighted superposition of all eight possible values
for the last three bits. These three bits uniquely determine the first four bits and this is
taken care of by the remaining multiple XOR gates.
The encoding circuit for LMPZ’s code is more tricky (Laflamme et al. 1996). The first three
bits of each codeword uniquely determine the last two in each of the codewords. This is
then easy to implement using multiple XOR gates. The problem is with the signs, and they
are taken care of by π-gates that represent rotation by π (i.e. multiplication by e^{iπ}).
Exercise 7.4.38 Design encoding circuits for: (a) Shor’s 9 qubit code; (b) Barenco’s
3-qubit code.
In the case of Shor’s code, Shor (1995), it is instructive to see the impact of the decoherence
process
|e_0⟩|0⟩ → |a_0⟩|0⟩ + |a_1⟩|1⟩, |e_1⟩|1⟩ → |b_0⟩|0⟩ + |b_1⟩|1⟩,
where |a_0⟩, |a_1⟩, |b_0⟩, |b_1⟩ are states of the environment, on the first qubit of the states
(1/√2)(|000⟩ + |111⟩) and (1/√2)(|000⟩ − |111⟩). The resulting states are shown in Figure 7.4a,b.
Observe an important fact that the states of the environment (|a_0⟩ + |b_1⟩, ...) are the
same for both error states arising from encodings of |0_E⟩ and |1_E⟩.
It is now clear that using XOR gates with several ancillary qubits and comparing three
triplets one can easily determine whether an error occurred and in which qubit. By a
measurement with respect to the Bell basis the type of error can be determined. (By such
a measurement the overall state collapses but what remains is good enough to restore the
initial state.)
In the case of Steane’s code one can make use of the way the syndrome is computed for
Hamming codes, page 277. Namely, the matrix H, page 276, shows that the error syndrome
for the bit error is uniquely determined by three bits: (b_4 ⊕ b_5 ⊕ b_6 ⊕ b_7, b_2 ⊕ b_3 ⊕ b_6 ⊕ b_7, b_1 ⊕
b_3 ⊕ b_5 ⊕ b_7), where (b_1, ..., b_7) denote the seven bits of the codewords.
To correct also sign errors the code has to be transformed first from the standard to the
dual basis. The key point is now that the resulting classical code of codewords is Ham(3, 2),
a (7, 4, 3)-code. Now
\[ |0'_E\rangle = \tfrac{1}{\sqrt{2}}(|0_E\rangle + |1_E\rangle), \qquad |1'_E\rangle = \tfrac{1}{\sqrt{2}}(|0_E\rangle - |1_E\rangle), \]
and the same technique as above can be used to detect sign errors (and consequently also
bit–sign errors). Once this is done the code is transformed to the standard basis.
Exercise 7.4.39 Design error syndrome computation network for Steane’s code.
Syndromes for LMPZ’s code can be computed with the same circuit as for code gener-
ation; it is only necessary to run this circuit backward. A relation between syndromes and
errors is shown in Figure 7.5a (Laflamme et al. 1996).
Finally, in the case of Barenco’s 3-qubit code let us denote by |0_E^j⟩ and |1_E^j⟩, 1 ≤ j ≤ 3,
the states obtained from |0_E⟩ and |1_E⟩ by a sign error on the jth qubit. Error symbols can be
computed using two projection operators: P_1 on the subspace {|0_E⟩, |1_E⟩, |0_E^1⟩, |1_E^1⟩} and P_2
on the subspace {|0_E⟩, |1_E⟩, |0_E^2⟩, |1_E^2⟩}. The syndrome table is shown in Figure 7.5b.
Figure 7.5: Syndrome tables for (a) LMPZ’s code (B (S) stands for a bit (sign) error and
the number specifies the qubit) and (b) the 3-qubit code.
Correcting circuits
Once error symbols are computed, the error correction is done by applying the appropriate
unitary transformation on the erroneous state. Let us discuss it in some detail only for
Shor’s 9-qubit code.
At the measurement during the syndrome computation, with respect to Bell basis, the
erroneous state collapses into one of the states
In order to make the correction to each of such resulting states a unitary transformation has
to be applied that produces the original state
Exercise 7.4.40 Determine unitary transformations to get from the states in (7.18) to
the original state α(|000⟩ + |111⟩) + β(|000⟩ − |111⟩).
Base changes
The first method we are going to discuss is very simple but surprisingly useful and an
important component of some more general methods discussed later.
Each quantum code mapping k qubits into n qubits specifies a subspace of dimension 2^k in
H_{2^n}. A particular code is usually given by providing a particular basis of the code subspace.
By changing the basis a (potentially infinite) number of different code representations can
be obtained.
Of special importance is Hadamard rotation, which can switch a bit-correcting code to
a sign-correcting code and vice versa. Sometimes a variety of useful code modifications can
be obtained by applying Hadamard rotation to a few carefully chosen qubits.
Example 7.4.41 (DiVincenzo and Shor, 1996) Let us start with the codewords
\[
\begin{aligned}
|0_E\rangle = \tfrac{1}{4}(&|00000\rangle \\
 &+ |11000\rangle + |01100\rangle + |00110\rangle + |00011\rangle + |10001\rangle \\
 &- |10100\rangle - |01010\rangle - |00101\rangle - |10010\rangle - |01001\rangle \\
 &- |11110\rangle - |01111\rangle - |10111\rangle - |11011\rangle - |11101\rangle)
\end{aligned}
\]
and |1_E⟩ obtained from |0_E⟩ by flipping all bits in the basis states. (Observe that basis states
of both superpositions are classical cyclic codes.) By changing the basis for all qubits using
the Hadamard rotation we get the codewords
\[ |0'_E\rangle = \tfrac{1}{\sqrt{2}}(|0_E\rangle + |1_E\rangle), \qquad |1'_E\rangle = \tfrac{1}{\sqrt{2}}(|0_E\rangle - |1_E\rangle), \tag{7.19} \]
and if the Hadamard rotation is again applied to the first two qubits of the codewords
|0_E⟩, |1_E⟩, then exactly LMPZ’s code from Figure 7.2 is obtained.
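The codewords |0_E⟩ and |1_E⟩ of this example can be examined numerically. The sketch below builds both 32-dimensional vectors from the listed basis states and signs, applies the identity and each single-qubit error X, Y, Z to them, and prints whether the resulting 32 states are orthonormal, which is the kind of condition asked for in Exercise 7.4.37. The helper names are illustrative, and the check depends on the signs above having been transcribed correctly.

```python
import numpy as np

plus = ["00000", "11000", "01100", "00110", "00011", "10001"]
minus = ["10100", "01010", "00101", "10010", "01001",
         "11110", "01111", "10111", "11011", "11101"]

zero_E = np.zeros(32)
for w in plus:
    zero_E[int(w, 2)] = 0.25
for w in minus:
    zero_E[int(w, 2)] = -0.25
one_E = zero_E[::-1].copy()   # flipping all five bits maps index i to 31 - i

I2 = np.eye(2)
paulis = {"X": np.array([[0, 1], [1, 0]], dtype=complex),
          "Y": np.array([[0, -1j], [1j, 0]]),
          "Z": np.array([[1, 0], [0, -1]], dtype=complex)}

def on_qubit(P, j, n=5):
    """Operator acting with P on qubit j (0 = leftmost) and trivially elsewhere."""
    out = np.array([[1.0 + 0j]])
    for i in range(n):
        out = np.kron(out, P if i == j else I2)
    return out

# Identity plus the 15 single-qubit errors, each applied to both codewords.
errors = [np.eye(32)] + [on_qubit(P, j) for P in paulis.values() for j in range(5)]
states = np.array([E @ c for E in errors for c in (zero_E, one_E)])
gram = states.conj() @ states.T        # matrix of all pairwise inner products
print(np.allclose(gram, np.eye(32)))
```

If the 32 states are orthonormal they span the whole space of five qubits, so each error can be identified by a measurement and undone.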
[Figure: syndrome computation circuit with Hadamard gates (H) on the code qubits, four ancilla qubits initialized to 0, and syndrome outputs s1, s2, s3, s4]
Exercise 7.4.42 Verify, in detail, the above claim how one can get LMPZ’s code.
Changing a quantum code by changing the basis with the Hadamard rotation is also an
important tool for designing error syndrome computing circuits. For example, Figure 7.6 shows
such a syndrome computation circuit (due to DiVincenzo and Shor, 1996) for the code with
$|0'_E\rangle$ and $|1'_E\rangle$ given above as codewords. The systematic structure of this network indicates
that once the first two qubit rotations have been found useful, one could deduce the rest of
the circuit by a careful inspection of the codes obtained each time the bases of additional qubits
were changed (and then transformed back).
The last equality, which expresses $|c_w\rangle$ in the dual basis, follows from the properties of the
superpositions $|c_w\rangle$ summarized in the following exercises.
Exercise 7.4.44 Show the following properties of the superpositions $|c_w\rangle$. (a)
$\sum_{v\in\{0,1\}^k} (-1)^{vG\cdot w} = 0$ unless $vG\cdot w = 0$ for all $v \in \{0,1\}^k$; (b) if $w_1 + w_2 \in C_1^\perp$,
then $|c_{w_1}\rangle = |c_{w_2}\rangle$; (c) if $w_1 + w_2 \notin C_1^\perp$, then $\langle c_{w_1}|c_{w_2}\rangle = 0$; (d) $\{|c_w\rangle \mid w \in \{0,1\}^k\}$
contains exactly $2^k$ different states and all are mutually orthogonal (and therefore form a
basis for $H_{C_1}$); (e) cosets of $C_1^\perp$ are natural elements to index the states $\{|c_w\rangle \mid w \in \{0,1\}^k\}$.
If $C_2 \subseteq C_1$ is another code, then we define a quantum code $Q_{C_1,C_2} = \{|c_w\rangle \mid w \in C_2^\perp\}$.
Clearly, $\dim(Q_{C_1,C_2}) = \dim(C_1) - \dim(C_2)$. If $C$ is a self-dual $(n, k, d)$ code, then
$Q_C = Q_{C,C^\perp} = \{|c_w\rangle \mid w \in C^\perp\}$ and $\dim(Q_C) = k - (n - k) = 2k - n$.
Exercise 7.4.45 Design a $Q_{C_1,C_2}$ code for the case $C_1 = \mathrm{Ham}(3, 2)$, $C_2 = C_1^\perp$.
Exercise 7.4.46 Show that if $C$ is a self-dual code then in the dual basis $|c_w\rangle$ has the
form
$$|d_w\rangle = \frac{1}{\sqrt{2^{n-k}}} \sum_{c\in C^\perp} |w + c\rangle. \qquad (7.22)$$
The following theorem (see Calderbank and Shor, 1996) justifies the
introduction of the codes $Q_{C_1,C_2}$.
It is easy to see that $Q_{C_1,C_2}$ codes can be used to decode bit errors in one basis and sign
errors in the dual basis, and that these two error correction steps do not interfere; $Q_{C_1,C_2}$
can therefore also be used to correct bit–sign errors.
Using this approach, {13, 5, 3}+-code, {14, 6, 3}+-code, {17, 7, 3}+-code and {20, 9, 3}+-code
were designed by Steane (1997c). Using a systematic change of the signs of the basis states
of codewords, he has also found the following codes: {8, 3, 3}-code, {10, 4, 3}-code and
{11, 5, 3}-code.
An important example of CSS codes are the quantum Reed-Muller codes, due to Steane (1996b)
and Zhang and Fuss (1997); see Section 7.4.1 for their classical versions.
Steane (1998a) has also developed a method for converting certain CSS codes into
quantum codes with better parameters.
Analysis of orthogonality conditions and search for signs of the basis states
This method, introduced by Laflamme et al. (1996), is simple in principle, but elaborate,
and was used to get the first “perfect” encoding of one qubit by 5 qubits and 8 basis states
for each of the codewords —see Table 7.2.
Exercise 7.4.48 (Laflamme et al. 1996) Assume that an encoding has the form
$$|0\rangle \to \sum_{i=0}^{2^n-1} \alpha_i |i\rangle, \qquad |1\rangle \to \sum_{i=0}^{2^n-1} \beta_i |i\rangle$$
and that all states obtained by one qubit error are orthogonal. Show that in such a case
the following equalities have to hold:
$$\sum_{i_k=0,\, i_l=0} |\alpha_i|^2 = \sum_{i_k=0,\, i_l=1} |\alpha_i|^2 = \sum_{i_k=1,\, i_l=0} |\alpha_i|^2 = \sum_{i_k=1,\, i_l=1} |\alpha_i|^2. \qquad (7.23)$$
Encodings of $|0_E\rangle$ and $|1_E\rangle$ should be such that all states obtained from them by one
qubit error are orthogonal. This gives rise to a number of conditions the amplitudes
have to satisfy, such as the one in (7.23). These conditions then have to be analysed to determine
(to guess) the support for the code. As reported in Laflamme et al. (1996), this is how they
found the support for the code shown in Figure 7.7.
If one then makes a natural simplifying assumption, namely that all amplitudes of the
basis states are either +1 or −1, then a computer search can find quantum codes with a
given support and amplitudes ±1. Laflamme et al. (1996) discovered that all such minimal
encodings of one qubit have to have two amplitudes −1 in one codeword and four in the
other. A modified version of their code (see Williams and Clearwater, 1998)
has the form shown in Table 7.7:
Exercise 7.4.49 Determine whether each code with the same string-support as the LMPZ
code has to have four amplitudes −1 at basis states in one of the codewords and two in the
other.
$$E_1 \otimes E_2 \otimes \dots \otimes E_n, \qquad (7.24)$$
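The following property is of crucial importance for the “stabilizer codes” to be defined later:
If $M \in G_n$ and $S \in S_C$ are such that $\{M, S\} = 0$, then for any $|\phi\rangle, |\psi\rangle \in C$,
$$\langle\phi|M|\psi\rangle = \langle\phi|MS|\psi\rangle = -\langle\phi|SM|\psi\rangle = -\langle\phi|M|\psi\rangle$$
and therefore $\langle\phi|M|\psi\rangle = 0$. The code therefore satisfies the condition 7.11 whenever $M_a^{*}M_b$
anticommutes with some element of $S$.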
M1  Z Z I I I I I I I        M1  X X X X I I I
M2  Z I Z I I I I I I        M2  X X I I X X I
M3  I I I Z Z I I I I        M3  X I X I X I X
M4  I I I Z I Z I I I        M4  Z Z Z Z I I I
M5  I I I I I I Z Z I        M5  Z Z I I Z Z I
M6  I I I I I I Z I Z        M6  Z I Z I Z I Z
M7  X X X X X X I I I
M8  X X X I I I X X X
          (a)                        (b)

M1  X Z Z X I                M1  X X X X X X X X
M2  I X Z Z X                M2  Z Z Z Z Z Z Z Z
M3  X I X Z Z                M3  I X I X Y Z Y Z
M4  Z X I X Z                M4  I X Z Y I X Z Y
                             M5  I Y X Z X Z I Y
          (c)                        (d)
Example 7.4.50 Figures 7.8a,b,c (Gottesman, 1997) show generators of the stabilizers
for Shor’s 9-qubit code, Steane’s 7-qubit code and LMPZ’s 5-qubit code. Figure 7.8d shows
generators of the stabilizer for an [[8, 3, 3]]-code due to Gottesman (1996). Let us discuss
the design and use of the stabilizer for Shor’s code. The error vectors in Figure 7.8a can be
discovered in a straightforward way from how one detects a single bit or sign error for this
code. Indeed, to detect a bit error in a state $|\psi\rangle$ on one of the first three qubits it is sufficient
to compare the first qubit with the second and then the first qubit with the third in $|\psi\rangle$. One way
of doing that is to measure $|\psi\rangle$ with respect to M1 and M2 as observables. The error vectors
M3 to M6 play a similar role. The last two error vectors can be used to detect sign errors.
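Whether two error vectors commute can be read off from their strings: they anticommute exactly when they differ in an odd number of positions in which both are non-identity. The following is a small sketch built on this rule. The generator lists are the standard ones for the four codes named in the example, with the Z-type generators of Steane’s code taken from the same parity-check rows as the X-type ones; the function names are illustrative assumptions.

```python
def anticommute(p, q):
    """True if the Pauli strings p and q (letters I, X, Y, Z) anticommute."""
    clashes = sum(1 for a, b in zip(p, q)
                  if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1


def is_abelian(generators):
    """True if all generators commute pairwise."""
    return not any(anticommute(p, q) for p in generators for q in generators)


def syndrome(error, generators):
    """One bit per generator: 1 if the error anticommutes with it."""
    return [int(anticommute(error, g)) for g in generators]


SHOR9 = ["ZZIIIIIII", "ZIZIIIIII", "IIIZZIIII", "IIIZIZIII",
         "IIIIIIZZI", "IIIIIIZIZ", "XXXXXXIII", "XXXIIIXXX"]
STEANE7 = ["XXXXIII", "XXIIXXI", "XIXIXIX",
           "ZZZZIII", "ZZIIZZI", "ZIZIZIZ"]
LMPZ5 = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
GOTTESMAN8 = ["XXXXXXXX", "ZZZZZZZZ", "IXIXYZYZ", "IXZYIXZY", "IYXZXZIY"]

for name, gens in [("Shor", SHOR9), ("Steane", STEANE7),
                   ("LMPZ", LMPZ5), ("Gottesman", GOTTESMAN8)]:
    print(name, is_abelian(gens))

# A bit error on the first qubit of Shor's code is seen by M1 and M2,
# a sign error on the same qubit by M7 and M8.
print(syndrome("XIIIIIIII", SHOR9))
print(syndrome("ZIIIIIIII", SHOR9))
```

Measuring the generators as observables yields exactly these syndrome bits, without revealing anything else about the encoded state.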
Exercise 7.4.52 Show that if a code encodes k qubits using n qubits, then its stabilizer
has dimension $2^{n-k}$.
Implications? Let E and F be error operators, both of weight t or less. If the operator
$F^{*}E$ anticommutes with some operator in S, then the vectors $E|\phi\rangle$ and $F|\psi\rangle$ are orthogonal
for any $|\phi\rangle, |\psi\rangle \in C$. However, this is exactly the requirement a non-degenerate quantum
error-correcting code should satisfy. In other words, to get a non-degenerate code we just
need to find a code C and the corresponding stabilizer $S_C$ such that any non-identity error
operator of $G_n$ of weight at most 2t anticommutes with some element of $S_C$.
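For small codes this condition can be checked by brute force. The sketch below does so for t = 1, using the standard generators of the LMPZ 5-qubit code and of Shor’s 9-qubit code; the helper names are assumptions made for illustration. It is enough to test the generators, since an operator commuting with all generators commutes with the whole stabilizer.

```python
from itertools import combinations, product

LMPZ5 = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
SHOR9 = ["ZZIIIIIII", "ZIZIIIIII", "IIIZZIIII", "IIIZIZIII",
         "IIIIIIZZI", "IIIIIIZIZ", "XXXXXXIII", "XXXIIIXXX"]


def anticommute(p, q):
    """True if the Pauli strings p and q anticommute."""
    clashes = sum(1 for a, b in zip(p, q)
                  if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1


def errors_up_to_weight(n, max_weight):
    """All non-identity Pauli strings on n qubits of weight <= max_weight."""
    for weight in range(1, max_weight + 1):
        for positions in combinations(range(n), weight):
            for paulis in product("XYZ", repeat=weight):
                word = ["I"] * n
                for pos, p in zip(positions, paulis):
                    word[pos] = p
                yield "".join(word)


def undetected(generators, max_weight):
    """Errors of weight <= max_weight commuting with every generator."""
    n = len(generators[0])
    return [e for e in errors_up_to_weight(n, max_weight)
            if not any(anticommute(e, g) for g in generators)]


print(undetected(LMPZ5, 2))            # no such error for the 5-qubit code
print("ZZIIIIIII" in undetected(SHOR9, 2))
```

For the 5-qubit code every error of weight at most 2 anticommutes with some generator, so the code is non-degenerate and corrects one error. Shor’s code fails this particular test because some weight-2 operators belong to its stabilizer; it still corrects one error, but as a degenerate code.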
The above result suggests two methodologies to design non-degenerate codes: To choose
C and to look for SC or vice versa.
In order to facilitate the second methodology let us look for some properties SC should
have to be a stabilizer.
$S_C$ is clearly a subgroup of $G_n$. In addition, for any $|\psi\rangle \in C$, $S \in S_C$, $S^2|\psi\rangle = S|\psi\rangle = |\psi\rangle$
and therefore $S^2 = I$. Moreover, for any $|\psi\rangle \in C$ and any $M, N \in S_C$, $MN|\psi\rangle = NM|\psi\rangle =
|\psi\rangle$ and therefore $[M, N]|\psi\rangle = 0$. Since any two elements of $G_n$ either commute or
anticommute, we have that either $[M, N] = 0$ or $\{M, N\} = 0$.
In the case $\{M, N\} = 0$ we would have $[M, N] = 2MN$, but this contradicts the property
$[M, N]|\psi\rangle = 0$ for any $|\psi\rangle \in C$. Hence $[M, N] = 0$. $S_C$ therefore has to be an Abelian
group, and $S^2 = I$ has to hold for each of its elements S. It can be shown that these conditions
are sufficient for the existence of a nontrivial code C for which $S_C$ is a stabilizer (provided
$S_C$ is not too big).
Let C be a code of codewords of length n and $S_C$ be its stabilizer. The centralizer
of $S_C$, $C(S_C)$, is the set of elements of $G_n$ that commute with all elements of $S_C$. Clearly
$S_C \subseteq C(S_C)$. Define $F_S = S_C \cup (G_n - C(S_C))$. The code C corrects any set $E_0$ of errors such
that for any $M_a, M_b \in S_E$, $M_a^{*}M_b \in S_C \cup (G_n - C(S_C))$.
For the stabilizer codes there are straightforward ways to design networks for encoding
(see Gottesman, 1997) and also for syndrome computation (see DiVincenzo and Shor, 1995).
Gottesman (1996) has developed a methodology for choosing S so that error operators
of weight at most 2t anticommute with some member of S. On this basis he designed an
infinite class of codes saturating the quantum Hamming bound. One of them is the {8, 3, 3}-code
whose stabilizer is shown in Figure 7.8d.
The concept of the stabilizer has much simplified the design of quantum error correcting
codes—see Gottesman (1996), Steane (1996) and Calderbank et al. (1996, 1997).
Special notation
In order to describe stabilizers, a different notation is also used, which is especially handy in
the case of the quantum error-correcting codes derived from the classical codes. An error
operator $M_1 \otimes \dots \otimes M_n$ is written in the form $X(x_1, \dots, x_n)Z(z_1, \dots, z_n)$, where $x_i = 1$ if
$M_i \in \{X, Y\}$, and 0 otherwise, and $z_i = 1$ if $M_i \in \{Y, Z\}$, and 0 otherwise. Moreover, in
such a case the notation $(X_x|Z_z)$ is also used, where $x_i$ and $z_i$ are defined as above.
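The translation between the two notations is purely mechanical. A minimal sketch, with function names chosen here for illustration:

```python
def pauli_to_xz(pauli):
    """Binary vectors (x, z) of an error operator given as a Pauli string."""
    x = [1 if p in "XY" else 0 for p in pauli]
    z = [1 if p in "YZ" else 0 for p in pauli]
    return x, z


def xz_to_pauli(x, z):
    """Inverse translation, from the vectors (x, z) back to a Pauli string."""
    table = {(0, 0): "I", (1, 0): "X", (1, 1): "Y", (0, 1): "Z"}
    return "".join(table[(a, b)] for a, b in zip(x, z))


print(pauli_to_xz("XZZXI"))
print(xz_to_pauli(*pauli_to_xz("IXIXYZYZ")))
```

For instance, the first generator X Z Z X I of the LMPZ code becomes x = 10010, z = 01100.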
A set of error operators forming generators of a stabilizer can then be written in the
matrix form
$$(X|Z). \qquad (7.25)$$
For example, stabilizers for the LMPZ code and for the {8, 3, 3}-code mentioned above are
described in this form in Figure 7.9a,b.
If X and Z are $(n - k) \times n$ matrices for some n and k, then $(X|Z)$ forms a stabilizer if
$XZ^T + ZX^T = 0$ (Calderbank et al. 1997).
Let us consider a CSS code created out of two classical linear codes $C_1$ and $C_2$ and let $P_{C_1}$
and $P_{C_2}$ be their parity-check matrices.
A quantum code to correct just bit errors can be designed using the set of generators
$S_{C_1}$ obtained from the rows of $P_{C_1}$ by replacing each 1 with Z. To that we add the set of
generators obtained from the parity-check matrix $P_{C_2}$ for $C_2$, this time with X replacing the 1’s.
These generators can identify the sign errors. Together they also identify bit/sign errors.
In the case $C_2^\perp \subseteq C_1$, $C_1^\perp \subseteq C_2$ these two sets of generators can be combined into a single set of
generators for the code.
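As an illustration of this recipe, the sketch below takes both classical codes to be the Hamming code of length 7, which gives the generators of Steane’s code. The particular parity-check matrix and the function names are assumptions made for this example.

```python
# One parity-check matrix of the Hamming code of length 7; its columns are
# all seven nonzero binary vectors of length 3.
HAMMING_PARITY_CHECK = [
    [1, 1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1],
]


def generators_from_parity_check(rows, symbol):
    """Replace each 1 in each row by the given Pauli symbol, each 0 by I."""
    return ["".join(symbol if bit else "I" for bit in row) for row in rows]


def css_generators(p1, p2):
    """Z-type generators from p1 followed by X-type generators from p2."""
    return (generators_from_parity_check(p1, "Z")
            + generators_from_parity_check(p2, "X"))


def rows_orthogonal(p1, p2):
    """True if every row of p1 is orthogonal (mod 2) to every row of p2."""
    return all(sum(a * b for a, b in zip(r1, r2)) % 2 == 0
               for r1 in p1 for r2 in p2)


print(css_generators(HAMMING_PARITY_CHECK, HAMMING_PARITY_CHECK))
print(rows_orthogonal(HAMMING_PARITY_CHECK, HAMMING_PARITY_CHECK))
```

The orthogonality test expresses the inclusion condition: a Z-type and an X-type generator commute exactly when the corresponding parity-check rows overlap in an even number of positions.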
1 0 0 1 0 | 0 1 1 0 0        1 1 1 1 1 1 1 1 | 0 0 0 0 0 0 0 0
0 1 0 0 1 | 0 0 1 1 0        0 0 0 0 0 0 0 0 | 1 1 1 1 1 1 1 1
1 0 1 0 0 | 0 0 0 1 1        0 0 0 0 1 1 1 1 | 0 0 1 1 0 0 1 1
0 1 0 1 0 | 1 0 0 0 1        0 0 1 1 0 0 1 1 | 0 1 0 1 0 1 0 1
                             0 1 0 1 0 1 0 1 | 0 0 1 1 1 1 0 0
          (a)                                (b)
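The condition $XZ^T + ZX^T = 0$ (over the binary field) can be tested row by row. The following sketch applies it to the two matrices of Figure 7.9; the function name and the list-of-rows representation are assumptions made for illustration.

```python
def forms_stabilizer(X, Z):
    """Check X Z^T + Z X^T = 0 (mod 2) for binary matrices given as rows."""
    for i in range(len(X)):
        for j in range(len(X)):
            s = (sum(a * b for a, b in zip(X[i], Z[j]))
                 + sum(a * b for a, b in zip(Z[i], X[j])))
            if s % 2:
                return False
    return True


# (a) LMPZ code
X5 = [[1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0], [0, 1, 0, 1, 0]]
Z5 = [[0, 1, 1, 0, 0], [0, 0, 1, 1, 0], [0, 0, 0, 1, 1], [1, 0, 0, 0, 1]]

# (b) the {8, 3, 3}-code
X8 = [[1] * 8, [0] * 8,
      [0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 1, 1, 0, 0, 1, 1],
      [0, 1, 0, 1, 0, 1, 0, 1]]
Z8 = [[0] * 8, [1] * 8,
      [0, 0, 1, 1, 0, 0, 1, 1], [0, 1, 0, 1, 0, 1, 0, 1],
      [0, 0, 1, 1, 1, 1, 0, 0]]

print(forms_stabilizer(X5, Z5), forms_stabilizer(X8, Z8))
```

Entry (i, j) of $XZ^T + ZX^T$ is 0 exactly when generators i and j commute, so the condition says that the generators span an Abelian group.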
The first example of a non-stabilizer code that is better than any stabilizer code of the
same type is due to Rains et al. (1997). It is a {5, 6, 2}-code that is defined (as a subspace)
by the following projection operator:
$$P = \frac{1}{16}\bigl[3(I \otimes I \otimes I \otimes I \otimes I) + (I \otimes \sigma_z \otimes \sigma_y \otimes \sigma_y \otimes \sigma_z)_{cyc} + (I \otimes \sigma_x \otimes \sigma_z \otimes \sigma_z \otimes \sigma_x)_{cyc}
- (I \otimes \sigma_y \otimes \sigma_x \otimes \sigma_x \otimes \sigma_y)_{cyc} + 2(\sigma_z \otimes \sigma_x \otimes \sigma_y \otimes \sigma_y \otimes \sigma_x)_{cyc} - 2\,\sigma_z \otimes \sigma_z \otimes \sigma_z \otimes \sigma_z \otimes \sigma_z\bigr]$$
where the subscript “cyc” denotes that all five cyclic shifts of the corresponding error operator
have to be taken into the sum. The code was discovered by a combination of careful
reasoning and computations. The code is better than any stabilizer code of the same
type in the sense that each stabilizer code with the same length 5 and distance 2 has a
code subspace of dimension at most 4.
1. Quantum error-correcting codes help to deal only with the problem of reliable storage
and transmission of quantum information. They are by themselves insufficient to have
fault-tolerant quantum information processing.
2. Each use of quantum error correction methods brings additional requirements on quan-
tum memory, hardware and computation time. Indeed, to establish necessary redun-
dancy, additional qubits, and thereby additional quantum memory, are needed. In
addition encoding, error recovery and decoding operations require additional qubits,
gates and circuits. That may slow down the overall computation.
4. It is not sufficient that we can encode quantum information in such a way that it
is stored or transferred reliably for some time. Of key importance for real quantum
computing is that we can store and transmit quantum information reliably for a long
time and through a long distance.
5. It is also not sufficient that we can process quantum information for some time in a
fault-tolerant way. What is badly needed is to be able to do that for a sufficiently long
time.
The so-called concatenated codes (see Section 7.5.3), represent a way to deal with the
last two problems. Concatenated codes allow us to store and transmit a qubit with maximal
error ε, provided gates with errors at most cε (where c is a constant not dependent on ε)
and storage or channel elements with errors at most ε are given, independently of how long
we must store a state or how far we need to transmit it.
Remark 7.5.1 A way to cope with the main drawback of concatenated codes, namely the still high
requirements on tolerable error probabilities for transmissions and local operations, has
been suggested by Briegel et al. (1998). They suggest using “quantum repeaters” to form
entangled pairs over an arbitrarily large distance. The main idea is to use a sequence of
imperfectly entangled pairs of particles and a new (nested) purification protocol to create
a single distant pair of particles of high fidelity.
Methods presented in this section allow us to cope with inaccuracies and the decoherence
problem in three ways: (1) fault-tolerant circuits are used to perform quantum gates for
information processing; (2) between two applications of quantum gates to qubits, quantum
error detection and recovery circuits are used to take care of their recovery from errors;
(3) concatenated codes and quantum repeaters are used to perform quantum information
processing and transmission over long times and distances.
Error propagation
The first main problem we have to learn to deal with is error propagation due to entanglement.
If a two-qubit gate is applied and one of the qubits is erroneous, then the error can
propagate to the second qubit.
Example 7.5.2 As we could see in the previous section, XOR gates are of key importance
for processing with quantum error-correcting codes, especially for error syndrome computation.
It is evident that if an error occurs in an XOR gate’s control qubit, then the error
propagates (“forwards”) to its target qubit. Less obvious, but very important for quantum
error-correction processing networks, is the fact that under certain circumstances an error
can also propagate through an XOR gate “backwards”, from the target qubit to the control
qubit. This is easily seen from Figure 2.11, which shows the relation between an XOR gate and
its reverse, in which the control and target bits are interchanged. This network is a
very typical component of quantum error correction networks. Indeed, Hadamard rotations
applied to qubits change the basis from the standard to the dual (and vice versa) and
interchange bit and sign errors. Therefore, if a sign error occurs on the target qubit, it can become
a bit error (with respect to the dual basis) of the control qubit.
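Both directions of propagation can be verified by conjugating an error operator by the XOR gate. The following is a minimal sketch with 4 × 4 integer matrices, assuming the basis ordering |control, target⟩; the names are illustrative only.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]     # bit error
Z = [[1, 0], [0, -1]]    # sign error

# XOR gate in the basis |00>, |01>, |10>, |11> (control first, target second).
XOR = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]


def kron(a, b):
    """Tensor product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def propagate(error):
    """Error after the gate, given an error before it: XOR * error * XOR."""
    return matmul(XOR, matmul(error, XOR))


print(propagate(kron(X, I2)) == kron(X, X))   # bit error spreads forwards
print(propagate(kron(I2, Z)) == kron(Z, Z))   # sign error spreads backwards
```

A bit error on the target and a sign error on the control, on the other hand, pass through the gate without spreading.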
The backward error propagation in XOR gates discussed in the above example implies that
circuits such as the one in Figure 7.10a, in which one ancilla qubit is used “to xor”
information from several data qubits, are not good for error syndrome computation. Indeed, a
spontaneous phase error at an ancilla qubit, far from an unexpected event, could propagate
to a bit error in several code qubits. That is why the way of gathering information for syndrome
computation depicted in Figure 7.10b is surely superior with respect to error propagation
(even if it requires more qubits).
Figure 7.10: Three ways of gathering information from data to ancilla: bad, better and the
best
Fault-tolerant ancillas
Unfortunately, even the way of gathering information shown in Figure 7.10b is not good
enough for fault-tolerant quantum error correction. The point is that code qubits keep
being entangled with the ancilla qubits and therefore a measurement of ancilla qubits can
destroy the encoded state |ψi.
Example 7.5.3 In the case of Steane’s code we have seen that information
about its last 4 qubits is needed to get the first bit of the syndrome ($x_4 \oplus x_5 \oplus x_6 \oplus x_7$).
This information can be obtained using a four-qubit ancilla, with all qubits initially in the
state $|0\rangle$, one qubit of ancilla for each bit of the syndrome. However, if the ancilla qubits
yield, after the measurement, the result 0101, then in $|\psi\rangle$ the state $|0_E\rangle$ collapses to the
state $|1010101\rangle$, and the state $|1_E\rangle$ collapses to the state $|0100101\rangle$, which is not acceptable
because the state has lost all protection against phase errors.
There are three main operations to work with quantum error-correcting codes: encoding
(code generation), error recovery (syndrome computation and code correction) and
decoding. Networks to perform these operations, especially code generation and error
recovery, normally require an ancilla. It is therefore of crucial importance for the design of a
fault-tolerant quantum network to find a way of copying information into the ancilla without
destructive effects on the state being “copied”.
One way out was found by Shor (1995) and it is illustrated in Figure 7.10c for the case of
an ancilla with 4 qubits. Information is copied into the ancilla with the initial (Shor) state
$$|\phi\rangle = \frac{1}{\sqrt{8}} \sum_{i=0}^{7} |\mathrm{parity}(i)\rangle$$
of the equal superposition of all even-parity 4-bit codewords. This state can be created in
the way shown in Figure 7.10c. (The first Hadamard gate and the multiple XOR create the
“cat state” $\frac{1}{\sqrt{2}}(|0000\rangle + |1111\rangle)$. The next four Hadamard gates then create $|\phi\rangle$.)
Why is this way of copying information into the ancilla in the state $|\phi\rangle$ better? What is
needed in the case of a syndrome computation is only information about the parity of the four
bits copied into the ancilla. If this parity is 0, the ancilla state $|\phi\rangle$ is not changed; if the parity is
1, then the state of the ancilla changes into the equal superposition of all odd-weight 4-bit
words. If we therefore measure the ancilla state with respect to the standard basis in
$H_{16}$, we get a random four-bit word, and its parity is the corresponding syndrome bit. The
key point is that after the XOR operations the state of the ancilla is in this case not entangled
with the encoded qubit and therefore a measurement of the ancilla does not hurt it!
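The claims about the Shor state can be reproduced with a small state-vector calculation. The sketch below represents states as dictionaries from bit tuples to amplitudes; this representation and the function names are assumptions made for illustration.

```python
from itertools import product
from math import sqrt


def hadamard_all(state, n):
    """Apply the Hadamard rotation to each of the n qubits of a state."""
    out = {}
    norm = 1 / sqrt(2 ** n)
    for y in product((0, 1), repeat=n):
        amp = norm * sum(a * (-1) ** sum(xi * yi for xi, yi in zip(x, y))
                         for x, a in state.items())
        if abs(amp) > 1e-12:
            out[y] = amp
    return out


def xor_into(state, data_bits):
    """XOR classical data bits qubitwise into the ancilla basis states."""
    return {tuple(b ^ d for b, d in zip(bits, data_bits)): amp
            for bits, amp in state.items()}


cat = {(0, 0, 0, 0): 1 / sqrt(2), (1, 1, 1, 1): 1 / sqrt(2)}
shor_state = hadamard_all(cat, 4)
print(sorted(shor_state))                      # the 8 even-parity words

odd = xor_into(shor_state, (1, 0, 0, 0))       # data bits of parity 1
print(all(sum(bits) % 2 == 1 for bits in odd))
```

XOR-ing in data bits of even parity only permutes the even-parity words among themselves, so the ancilla state is unchanged; data bits of odd parity move it to the superposition of all odd-parity words. The measured word therefore reveals the parity and nothing more.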
Figure 7.11: Ancilla state verification
Another possible case is that a single error in the cat-state preparation circuit could cause
two sign errors in the cat state and therefore two bit errors in the initial Shor state.
However, such errors do not damage the encoded qubits; they can only cause
the resulting syndrome measurement to provide an incorrect result.
One way to guard against that is to repeat each syndrome measurement several times, until the same
error syndrome is obtained k times in a row, where k is some reliability parameter.
A simple procedure for syndrome generation for stabilizer codes was developed by
DiVincenzo and Shor (1996).
To each generator $X(x_1, \dots, x_n)Z(z_1, \dots, z_n)$ there corresponds one stage in the syndrome
computation circuit. Each stage consists of three phases: basis transformations on qubits,
syndrome gathering and undoing of the previous basis transformations. The basic idea is to
transform the generator to the form
$$X(0, \dots, 0)Z(z_1', \dots, z_n').$$
Exercise 7.5.4 Determine the number of gates needed to compute a syndrome in the
fault-tolerant way for additive codes.
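The XOR operation can also be performed qubitwise. Indeed, suppose we have two logical
qubits $|d_a\rangle$ and $|d_b\rangle$, see page 291. Then
$$|d_a\rangle|d_b\rangle = \frac{1}{2^{n-k}} \Bigl(\sum_{w\in C^\perp} |w + a\rangle\Bigr)\Bigl(\sum_{w'\in C^\perp} |w' + b\rangle\Bigr).$$
By applying XOR from the ith qubit of $|d_a\rangle$ to the ith qubit of $|d_b\rangle$, for each i, we get the state
$$\frac{1}{2^{n-k}} \sum_{w\in C^\perp} \sum_{w'\in C^\perp} |w + a\rangle |w' + b + w + a\rangle = |d_a\rangle|d_{a+b}\rangle,$$
since for each fixed $w$ the word $w' + w$ runs over the whole of $C^\perp$.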
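This identity can be checked on a small instance. The sketch below takes $C^\perp$ to be the code spanned by three parity-check rows of the Hamming code of length 7 (an assumption made for this example) and compares supports only, which suffices here because all amplitudes are equal and the qubitwise XOR permutes the basis states.

```python
from itertools import product

GENERATORS = ["1111000", "1100110", "1010101"]   # assumed basis of C^perp


def add(u, v):
    """Bitwise sum (mod 2) of two bit strings."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))


def span(generators):
    """All binary linear combinations of the generators."""
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        w = "0" * len(generators[0])
        for c, g in zip(coeffs, generators):
            if c:
                w = add(w, g)
        words.add(w)
    return words


C_PERP = span(GENERATORS)


def support_d(a):
    """Basis states occurring in |d_a>."""
    return {add(w, a) for w in C_PERP}


def transversal_xor(pairs):
    """Qubitwise XOR from the first block into the second block."""
    return {(u, add(u, v)) for u, v in pairs}


a, b = "0000000", "1111111"
before = {(u, v) for u in support_d(a) for v in support_d(b)}
after = transversal_xor(before)
expected = {(u, v) for u in support_d(a) for v in support_d(add(a, b))}
print(after == expected)
```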
A measurement of the fourth qubit with the outcome 0 results in the state
$|i_E, j_E, ((i \wedge j) \oplus k)_E\rangle$, the one the Toffoli gate should produce. However, a measurement with outcome
1 produces the state $-|i_E, j_E, ((i \wedge j) \oplus k)_E\rangle$. In such a case we need to change the sign,
which can be done fault-tolerantly, as already mentioned.12
As the next step we show how to design a fault-tolerant circuit for the mapping
$|i_E, j_E\rangle \to |i_E, j_E, (i \wedge j)_E\rangle$ provided we have a 3-qubit ancilla in the state
$$|A\rangle = \frac{1}{2}(|0_E 0_E 0_E\rangle + |0_E 1_E 0_E\rangle + |1_E 0_E 0_E\rangle + |1_E 1_E 1_E\rangle).$$
This can be done using the gates that have already been shown to have equivalent
fault-tolerant circuits.
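Let us first apply gates $\mathrm{XOR}_{3,1}$ and $\mathrm{XOR}_{4,2}$ to the state $|i_E, j_E\rangle|A\rangle$. This way we can
realize the mapping
$$|0_E 0_E\rangle|A\rangle \to \frac{1}{2}(|0_E 0_E 0_E 0_E 0_E\rangle + |0_E 1_E 0_E 1_E 0_E\rangle + |1_E 0_E 1_E 0_E 0_E\rangle + |1_E 1_E 1_E 1_E 1_E\rangle),$$
$$|0_E 1_E\rangle|A\rangle \to \frac{1}{2}(|0_E 1_E 0_E 0_E 0_E\rangle + |0_E 0_E 0_E 1_E 0_E\rangle + |1_E 1_E 1_E 0_E 0_E\rangle + |1_E 0_E 1_E 1_E 1_E\rangle),$$
$$|1_E 0_E\rangle|A\rangle \to \frac{1}{2}(|1_E 0_E 0_E 0_E 0_E\rangle + |1_E 1_E 0_E 1_E 0_E\rangle + |0_E 0_E 1_E 0_E 0_E\rangle + |0_E 1_E 1_E 1_E 1_E\rangle),$$
$$|1_E 1_E\rangle|A\rangle \to \frac{1}{2}(|1_E 1_E 0_E 0_E 0_E\rangle + |1_E 0_E 0_E 1_E 0_E\rangle + |0_E 1_E 1_E 0_E 0_E\rangle + |0_E 0_E 1_E 1_E 1_E\rangle).$$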
As the next step we measure the first two logical qubits. The outcome $0_E 0_E$ results
in the state $|i_E, j_E, i_E, j_E, (i \wedge j)_E\rangle$. The outcome $0_E 1_E$ results in a state
$|i_E, j_E, u_E, v_E, w_E\rangle$ such that
$$\mathrm{NOT}_4\, \mathrm{XOR}_{3,5} |i_E, j_E, u_E, v_E, w_E\rangle = |i_E, j_E, i_E, j_E, (i \wedge j)_E\rangle,$$
which is what is needed. In a similar way we can transform the results of the other measurement
outcomes, using XOR and NOT operations, to the desired state.
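The bookkeeping of this step can be checked on logical basis labels. The sketch below tracks only classical labels (i, j) and the four basis states of |A⟩; it is not a simulation of encoded qubits. The correction for the outcome 01 is the one given in the text, while the corrections for the outcomes 10 and 11 are derived here by the same reasoning and should be read as an assumption.

```python
from itertools import product

# Basis states (u, v, u AND v) occurring in the ancilla state |A>.
ANCILLA = [(u, v, u & v) for u, v in product((0, 1), repeat=2)]


def after_xors(i, j, anc):
    """Labels of qubits 1..5 after XOR_{3,1} and XOR_{4,2}."""
    u, v, w = anc
    return (i ^ u, j ^ v, u, v, w)


def correct(state):
    """Fix qubits 3, 4, 5 according to the measured qubits 1, 2."""
    m1, m2, u, v, w = state
    if m2:
        w ^= u          # XOR_{3,5}
    if m1:
        w ^= v          # XOR_{4,5}
    if m1 and m2:
        w ^= 1          # NOT_5
    if m1:
        u ^= 1          # NOT_3
    if m2:
        v ^= 1          # NOT_4
    return (u, v, w)


ok = all(correct(after_xors(i, j, anc)) == (i, j, i & j)
         for i, j in product((0, 1), repeat=2) for anc in ANCILLA)
print(ok)
```

Every measurement outcome thus leads, after the corresponding correction, to the labels (i, j, i ∧ j) on the last three logical qubits.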
The last step is to show that the ancilla state $|A\rangle$ can be designed in a fault-tolerant
way. Of course, this cannot be done using XOR and rotation gates only.
Let us consider two 3n-qubit states
$$|A\rangle = \frac{1}{2}(|0_E 0_E 0_E\rangle + |0_E 1_E 0_E\rangle + |1_E 0_E 0_E\rangle + |1_E 1_E 1_E\rangle),$$
$$|B\rangle = \frac{1}{2}(|0_E 0_E 1_E\rangle + |0_E 1_E 1_E\rangle + |1_E 0_E 1_E\rangle + |1_E 1_E 0_E\rangle),$$
12 One can make use of the transformation $|i_E j_E k_E\rangle \to (-1)^{ij}(-1)^k |i_E j_E k_E\rangle$, which can be seen as a
composition of the operations $|i_E j_E\rangle \to (-1)^{ij}|i_E j_E\rangle$ and $|k_E\rangle \to (-1)^k|k_E\rangle$.
Once this is done, $|A\rangle$ can be designed as follows. The state
$\frac{1}{\sqrt{2}}(|0^{(n)}\rangle + |1^{(n)}\rangle)(|A\rangle + |B\rangle)$
is first designed and then the transformation given just above is applied to produce
$$\frac{1}{\sqrt{2}}(|0^{(n)}\rangle + |1^{(n)}\rangle)|A\rangle + \frac{1}{\sqrt{2}}(|0^{(n)}\rangle - |1^{(n)}\rangle)|B\rangle.$$
Finally, let us measure the first n qubits with respect to the basis
$$\Bigl\{\frac{1}{\sqrt{2}}(|0^{(n)}\rangle + |1^{(n)}\rangle),\ \frac{1}{\sqrt{2}}(|0^{(n)}\rangle - |1^{(n)}\rangle)\Bigr\}.$$
This measurement shows whether the unmeasured qubits are in the state $|A\rangle$ or $|B\rangle$. In the
second case we can obtain $|A\rangle$ from $|B\rangle$ by a NOT operation.
Remark 7.5.6 A general theory of fault-tolerant operations for stabilizer codes has been
developed by Gottesman (1997). It is based on symmetries of the code stabilizer and it is
shown that fault-tolerant universal computation is possible for any stabilizer code. Gottes-
man (1997) discussed in detail fault-tolerant computation based on LMPZ code.
Remark 7.5.7 Another general method, this time for a class of CSS codes, to build a
universal set of fault-tolerant operations was developed by Steane (1998b). In addition, he has
introduced new techniques to restrict the accumulation of errors before and during recovery
operations. On this basis he develops an optimistic evaluation of the prospects of quantum
computing, from a certain point of view. Under the standard noise model of stochastic,
uncorrelated errors, a quantum computer needs to be only an order of magnitude larger than
the logical machine contained within it in order to be reliable. For example, a scale-up by
a factor of 22, with an error rate of order $10^{-5}$, is sufficient to permit large quantum algorithms,
such as the factorization of thousand-digit numbers.
Fault-tolerant gates
A different approach to fault tolerance of quantum gates was worked out by Kitaev (1997).
He showed, theoretically, that there is a universal set of gates that are fault-tolerant by their
physical nature and therefore they should be insensitive to local influences (and could be
operated quite carelessly).
The possibility of a near-future physical realization of such gates is far from clear or,
at least, it does not seem to be in sight. However, Kitaev’s approach at least indicates
that there may be essentially different ways to stabilize quantum computations than
those discussed above: to search for new physical principles that would lead to fault-tolerant
quantum hardware.
Basic idea
The method of Knill and Laflamme (1996) shows one way to deal with the problem.
It allows one to transmit or store a qubit with an error at most ε, regardless of time and
distance, provided that the gates used work with an error at most cε, for a constant c
independent of ε, and that the storage or channel elements have an error at most ε, for
suitable c and ε.
The basic idea is simple: to encode qubits recursively up to a certain level of recursion
or hierarchy (see Figure 7.12), and to perform recovery operations often. The overhead
of the method is polynomial in the time and storage and in the distance of transmission.
The method could be implemented by having quantum repeaters spread at regular
time or distance intervals of the quantum channel with sufficiently many parallel paths. The
method can work in principle with any error-correcting code. Concerning the nature and
frequency of errors it is only assumed that errors occur randomly and independently and
that the “no leakage” assumption (see page 279) is satisfied.
[Figure 7.12: recursive encoding of qubits, drawn as a tree of “qubit” and “encoding” nodes]
For example, the data is first encoded using an $[[n, k, d]]$-code. Qubits of the new codewords
are then encoded using an $[[n_1, 1, d_1]]$-code. The resulting qubits are again encoded
using some $[[n_2, 1, d_2]]$-code. This can continue up to an hth hierarchy level, resulting in an
$[[n n_1 \cdots n_{h-1}, k, d d_1 \cdots d_{h-1}]]$-code. In a special case all these encodings can be the same.
The case n = 5, k = 1, h = 2 is illustrated in Figure 7.12. Computation of the error
syndromes can be done in time proportional to the sum of the efforts to do so for all the
codes used, provided we can do that in parallel for all blocks of each hierarchy level, starting
with the last hierarchy level.
It is easy to see why several hierarchies of encodings can help. If a single encoding of
one qubit into n is used, ε is the probability of an error per qubit, errors are uncorrelated and
recovery is fault-tolerant, then the probability of error recovery failure is $\varepsilon^2$. However, if h
levels of encodings are used, then the probability of an error recovery failure is $\varepsilon^{2^h}$ (with the number
of qubits $n^h$).
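A quick numerical illustration of this trade-off follows, as a rough sketch that takes the two expressions above at face value (constant factors are ignored, and the function names are chosen here for illustration).

```python
def failure_probability(eps, h):
    """Failure probability eps^(2^h) after h levels of encoding."""
    return eps ** (2 ** h)


def physical_qubits(n, h):
    """Number n^h of qubits used for one logical qubit at h levels."""
    return n ** h


for h in range(1, 4):
    print(h, physical_qubits(5, h), failure_probability(0.01, h))
```

The number of qubits grows exponentially in h, but the failure probability falls doubly exponentially, which is what makes a small number of levels sufficient.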
Recursive encoding has to be combined, in addition, with frequent recovery operations.
The method can be simply described, for example (see Knill and Laflamme, 1996), in terms
of the recursive concatenated coding procedure $CCP_{h,r}$, where the parameter h specifies the
depth of recursion and r the frequency of the error-recovery operations.
The basic level of the recursion, the procedure $CCP_{1,r}$, begins with one qubit, encodes
it to n qubits using the code C, then applies the recovery operation r − 1 times and, finally,
decodes it back to a single qubit.
$CCP_{h,r}$, h > 1, also starts with one qubit, encodes it into n qubits using the code C,
applies $CCP_{h-1,r}$ to each of the n qubits of the code, applies r − 1 times the recovery operation,
then applies again $CCP_{h-1,r}$ to each qubit and, finally, decodes the state into a single qubit.
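The recursion can be traced with a short program. The sketch below lists the sequence of operations in time, with the work done in parallel on the n qubits of a block shown only once. It assumes one particular reading of the description: at each level, r rounds of the lower-level procedure (or of plain waiting at the basic level) are interleaved with r − 1 recovery operations. The function name and the event labels are hypothetical.

```python
def ccp_schedule(h, r):
    """Time-ordered list of operations of the procedure at depth h."""
    events = ["encode"]
    for round_index in range(r):
        if h == 1:
            events.append("wait")        # one waiting or transmitting period
        else:
            events.extend(ccp_schedule(h - 1, r))
        if round_index < r - 1:
            events.append("recover")
    events.append("decode")
    return events


print(ccp_schedule(1, 3))
print(ccp_schedule(2, 3).count("wait"))
```

Under this reading the number of waiting or transmitting periods multiplies by r with every additional level of recursion.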
Of course, two consecutive recovery operations are to be performed only after some time
or distance interval, the so-called recovery period.
Resources needed. If $CCP_{h,r}$ is used, then the total number of (waiting or transmitting)
periods is $r^h$ and the total number of qubits needed is $n^h$. The number p(h) of
error-correcting operations (encodings, recoveries and decodings) is given by the recursion
Example 7.5.8 An analysis of the method for the case of LMPZ’s code, made by Knill
and Laflamme (1996), shows that a qubit can be stored for an arbitrary amount of time or
transmitted over an arbitrary distance with error ε provided the following holds:
$\varepsilon \ge \frac{1}{120}$. The basic storage or channel elements have error at most $\frac{1}{120}$ and one- or two-qubit
gates have error at most $\frac{1}{21600}$.
$\varepsilon < \frac{1}{120}$. The basic storage or channel elements have error at most ε and one- or two-qubit
gates have error at most $\frac{\varepsilon(1 - 60\varepsilon^2)}{90}$.
The concatenated coding method prefers to use such codes at which all gates used in the
error-recovery operations can be performed transversally, i.e., a gate can be performed
on a logical qubit by being performed on its particular qubits in a qubitwise manner. (For
example Steane’s 7-qubit code has this property.) If this is the case, then error-recovery
operations can be performed simultaneously on all qubits at the highest level of hierarchy.
1. Control. The available quantum states must be precisely identified. In addition, tech-
niques have to be available for restricting states of the created quantum systems to the
corresponding subspace of the Hilbert space. There must be a way to create quantum
registers with qubits adequately isolated from interactions with the environment for
the duration of computation.
13 Of course a variety of other approaches are in different states of their “justifications” and “verifications”.
2. Storage. Techniques have to be developed for storing quantum bits reliably for time
needed to perform interesting computations.
3. Initial state. It has to be possible, in a reasonably simple way, to set the state vector
into a given simple (initial) state; for example to the state |0(n) i. (This can sometimes
be done by cooling the system into its ground state.)
Figure 7.13: Ion trap processor. Electrodes generate a time-dependent electric field which
generates an effective potential such that a string of ions is trapped, that is, “stored in a linear
trap”. Each ion stores one qubit and it is addressed by a pair of laser beams. One-qubit
operations are performed by shining precisely timed laser pulses on individual ions. The
motional degree of freedom serves as a single “qubit bus” to transport quantum information
among the ions. Two-qubit operations are performed by a sequence of laser pulses on the two
particular ions. Initial state preparation is by optical pumping or laser cooling. Measurement
is by electron shelving and resonance fluorescence. This makes it possible to measure the state of
each ion with a high signal-to-noise ratio (adapted from Steane, 1997, and Bennett, 1998).
Qubits. Each ion stores one qubit in its internal energy states. The quantum state of each
ion is a superposition of the ground state (interpreted as |0i) and the excited state
(interpreted as |1i).
Isolation. The ions are well isolated and spontaneous decay is the main source of decoherence.
Another source of decoherence is the heat produced in the coupling between the
charged ion strings and the noise voltage of electrodes.
Initial state. It can be set up through optical pumping and laser cooling. This is a nontrivial
technological problem because a very low temperature is needed.
One-qubit gates. Ions are sufficiently well separated and therefore they can be individually
addressed by a pulsed laser. By shining precisely timed laser pulses and choosing
the phase of the laser appropriately, any one-qubit unitary transformation can be
performed.
Interaction of qubits. It is provided through the so-called joint vibrational mode. It is
not easy to achieve and this is one of the major drawbacks of the ion-trap technology
because this seems to prevent design of larger quantum registers. Two-qubit operations
are performed by using a laser on one qubit to provide an impulse that ripples through
a sequence of ions to the second qubit where another laser pulse stops the rippling
and performs the operation. This way a single qubit “bus” is created to transport
quantum information among the ions. Cirac and Zoller showed that the XOR gate
can be implemented with ion-trap technology using altogether 5 pulses. In order to
implement XOR-gate they encoded both qubits within a single beryllium ion.
Performance conditions. They are very demanding. Extreme vacuum and extremely low
temperature are needed.
The ion-trap approach does not seem to be able to deal easily with registers with a large
number of qubits because of the problems with the interactions between ions. It has
proved difficult to progress beyond the one-qubit level. Estimates (guesses) of the potential of
this technology differ widely, from 10 or 12 up to 47 qubits.
Research, development and even applications of the nuclear magnetic resonance technology
already have a long tradition, and various experiments, in laboratories and hospitals,
have for many years routinely achieved spin-state manipulations and measurements
equivalent in complexity to those required for quantum computing on a few qubits. In
addition, this technology is known for its relatively long decoherence time and its capability to work
at room temperature. It is therefore no wonder that with this technology it has proved
relatively simple to design NMR quantum systems with 2 or 3 qubits, and it has been
natural that the first two-qubit quantum algorithm implementations used NMR technology, as
discussed in the example below.
The basic idea is that a quantum register is in this case a molecule containing a “backbone”
of about ten atoms (with other atoms allowed so as to be able to create the needed
chemical bonds). Each qubit is realized in the spin orientation of an individual atomic
nucleus (the direction of the nuclear magnetic dipole) in the atoms of the molecule. Each
dipole can either reinforce or oppose an externally applied magnetic field. The first state has
lower energy than the second. The state can be changed by the absorption or emission of
photons of the right energy. The molecule is placed in a large magnetic field and the spin states
of the nuclei are manipulated by applying oscillating magnetic fields in pulses of controlled
duration.
The basic difficulty is that the spin state of the nuclei of a single molecule can neither
be prepared nor measured. To overcome the problem not a single molecule but a cup of
liquid containing about 10²⁰ to 10²³ identical molecules is used. The number of qubits of
such an NMR computer is therefore equal to the number of backbone atoms per molecule.
This implies that there is an enormous redundancy.
One approach is to encode a qubit in the average spin state of a large number of nuclei.
The spin states are then manipulated by magnetic fields and the (average) spin state can
be measured with the NMR spectroscopy technology.
Let us now deal in more detail with the problem of addressing and manipulating particular
qubits. Using the so-called chemical shifts and spin–spin coupling one can identify the
nuclear structure of the molecule of the sample. Single-qubit rotations are easy to implement.
They correspond to rotations within the subspace of a single spin. Such
rotations can be achieved using proper radio-frequency pulses. By using molecules with 2–3
magnetically active nuclei one can implement either the XOR or the Toffoli/Fredkin transformation.
In both cases, first the current value of qubits is determined and then the corresponding
sequence of radio-frequency pulses is applied.
The other major problem is how to extract results from such “massively parallel
computers”. The way out is to measure the average spin state. This seems to be a fundamental
obstacle because quantum algorithms are probabilistic and averaging over an ensemble of
molecules is not equivalent to the computation on a single device. The main new
contribution of Gershenfeld et al. (1996) and Cory et al. (1998) was that they found a way in which an
“effective pure state” could be prepared, manipulated and measured by performing
suitable operations on the ensemble of molecules. One way to read out quantum information
is to excite the spin system and to observe the resulting NMR spectrum. Different qubits
correspond to different spins and give rise to signals at different resonance frequencies.
The NMR technology does not seem to scale up for several reasons: (1) distinguishing
qubits in a molecule by their chemical identities seems impossible for large
molecules; (2) the technology requires too large a redundancy; (3) it is hard to establish
an initial state with sufficient precision. It is currently expected that with this technology one
can hardly perform quantum computation with more than 12 qubits, unless radically new
ideas appear.
The NMR technology was used to make the first three implementations of quantum
algorithms: for the Deutsch problem discussed in Section 3.1 and for the Grover algorithm,
see Section 3.3. In order to give a bit of insight into such ventures, and a flavour of the
expertise needed, let us provide a few basic details about the implementations of algorithms
for Deutsch’s problem, which differ slightly from the algorithms presented in Section 3.1.
Example 7.6.1 Jones and Mosca (1998) demonstrated that ¹H nuclei in partially deuterated
cytosine can be used to implement a two-bit NMR quantum computer based on two
coupled spins. They used a 50 mM solution of the pyrimidine base cytosine in D₂O. Rapid
exchange of the two amine protons and the single amide proton with the deuterated solvent
leaves two remaining protons forming an isolated two-spin system. All experiments used
a home-made spectrometer with the ¹H operating frequency of 500 MHz. The first pair of
Hadamard gates was replaced by so-called $90^\circ_{y}$ pulses and the second pair by $90^\circ_{-y}$ pulses. Also
the U_f transformation was implemented by a series of pulses. The final outcome was obtained
by applying a $90^\circ_{y}$ pulse and by observing the spectrum.
Chuang et al. (1998) prepared the input state with a 200 mM, 0.5 ml sample of chloroform
dissolved in d6-acetone, at room temperature and standard pressure. The O(10¹⁶)
molecules in the bulk sample can be considered as independent quantum computers, all
functioning simultaneously. Pulsed radio-frequency electromagnetic fields were applied to perform
Hadamard rotations. The unitary mapping U_f was implemented using pulsed radio frequency
and spin–spin interaction. The outcomes were read out by applying special read-out pulses
that transformed spin values into a voltage.
The first proposals for NMR implementations suffered from a signal-to-noise ratio that
decayed exponentially with the number of qubits. An NMR implementation in which the
signal-to-noise ratio depends only on features of NMR technology and not on the number of
qubits was suggested by Schulman and Vazirani (1998). They also gave a new technique
for preparing the initial state. In addition, they developed an abstract model of an NMR
computer and proved several results.
Remark 7.6.2 For an introductory description of ion-trap and NMR technologies and of
implementations of NOT and XOR gates, as well as of the basic gates needed to implement QFT
(see Section 3.3), see Berman et al. (1998).
14 Quantum-dot technology is of importance also for classical computing in its attempts to develop single-
electron transistors. It is already possible, even at room temperature, to transfer a single electron from a
reservoir into a quantum dot in such a way that once inside, it blocks transfer of other electrons. The current
through such a transistor depends on the number of electrons in the quantum dot. This allows us to “write”
and “erase” information—see Berman et al. (1998).
Chapter 8
INFORMATION
INTRODUCTION
The development and the understanding of the basic concepts, methods and results of quantum
information theory and of the faithful transmission of quantum information in time and
space is the most fundamental problem of quantum information processing. In order to be
able to understand and to utilize fully information processing available in nature the concepts
of classical information theory need to be expanded to accommodate quantum information
carriers. Three central problems concerning quantum information and its communication are
dealt with, very briefly, in this chapter.
1. Quantum information theory. How to rebuild classical information theory on
quantum foundations. How to define fundamental concepts of quantum information
theory. How to identify and explore the inherently quantum elements of such a theory.
2. Quantum transmissions theory. How to use optimally quantum channels to send
classical information and how to use optimally quantum and classical channels to
transmit quantum information. How to define and determine the capacity of different
quantum channels.
3. Quantum entanglement theory. How to quantify and manipulate entanglement
and how to produce good entanglement.
LEARNING OBJECTIVES
The aim of the chapter is to learn:
1. the basic concepts of quantum information theory;
2. the basic concepts of quantum transmission;
3. the basic techniques of quantum data compression;
4. the basic techniques of communication through a noisy channel;
5. the basic modes and measures of entanglement;
6. the basic quantum entanglement concentration and purification techniques.
Hebrews ch 13, v 16
Classical information theory solved these basic problems elegantly and achieved enormous
practical applicability. Modern communications, space exploration, and very high quality
sound reproductions, for example, would be impossible without it.
where p(x, y) = p(y)p(x|y). S(X|Y ) can be seen as a measure of how much information, on
average, would remain in X if we were to learn Y .
From the above relations one can easily deduce that
where S(X, Y ) is the information content of X and Y (i.e., the information one gains if,
having a priori knowledge of neither, one learns the values of X and Y ).
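The relation between joint and conditional entropy can be checked numerically. The following is a minimal sketch, assuming the standard definitions S(X) = −Σ p(x) lg p(x) and S(X|Y) = −Σ p(x, y) lg p(x|y); the joint distribution and the function names are illustrative only and do not come from the text.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities (zeros are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """S(X|Y) = -sum_{x,y} p(x,y) lg p(x|y), where joint[x][y] = p(x, y)."""
    p_y = [sum(row[y] for row in joint) for y in range(len(joint[0]))]
    total = 0.0
    for row in joint:
        for y, pxy in enumerate(row):
            if pxy > 0:
                total -= pxy * math.log2(pxy / p_y[y])   # p(x|y) = p(x,y)/p(y)
    return total

# Hypothetical joint distribution p(x, y); rows are x, columns are y.
joint = [[0.5, 0.0],
         [0.25, 0.25]]

s_xy = entropy([p for row in joint for p in row])                  # S(X, Y)
s_y = entropy([sum(row[y] for row in joint) for y in range(2)])    # S(Y)
s_x_given_y = conditional_entropy(joint)                           # S(X|Y)

print(s_xy, s_y, s_x_given_y)
```

For this distribution the three numbers satisfy S(X, Y) = S(Y) + S(X|Y) up to rounding, which is the chain rule that the definitions imply.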
where p(x : y) is the mutual probability defined by (p(x)p(y))/p(x, y). I(X : Y ) is a measure
of how much X and Y contain information about each other. One can also say that I(X : Y )
is the amount of information about X that is obtained by determining the value of Y . It
holds
Remark 8.1.1 The last identity can be used to describe the trade-off between entropy and
information in measurements, if we write the above identity in the form
X → Y → Z =⇒ I(X : Z) ≤ I(Y : Z)
says that if in a (Markov) process X develops to Y and later to Z, then Y cannot pass
to Z more information about X than it has received.
2. If X = X₁, . . . , X_m, Z = Z₁, . . . , Z_m, then
\[ S(X|Z) \le \sum_{i=1}^{m} S(X_i|Z) \le \sum_{i=1}^{m} S(X_i|Z_i). \]
from the classical ones using the prefix “Q”. This Q is often omitted in the literature once the quantum case
is clear from the context.
which should be an analogue to the conditional probability p(x|y). In (8.2) I_A is the unit
matrix in the Hilbert space for X and ρ_Y = Tr[ρ_XY] denotes a “marginal” density matrix,
an analogue to the “marginal” probability p_y = Σ_x p(x, y).
The above definition of QS(X|Y) allows one to show that an analogue of the identity (8.1)
holds also for quantum entropy.⁴
In spite of the apparent similarity between quantum QS(X|Y ) and classical S(X|Y ) the
fact that in the quantum case we deal with (density) matrices, rather than with numbers, as
in the classical case, brings quite a different situation for quantum information theory and
potential far exceeding the classical one.
2 Actually, quantum entropy is defined as QS = −k_B Tr ρ_X lg ρ_X, where k_B is the Boltzmann constant. By
replacing k_B with 1 the entropy becomes dimensionless and has no direct physical meaning.
3 Suppose that a classical random variable X has probability distribution p(x). Let a quantum system
be prepared in the state |x⟩, dictated by the value of X, with the probability p(x). In such a case it holds
for the corresponding density matrix ρ that QS(ρ) is an upper bound on the classical mutual information
I(X : Y) between X and the result Y of a measurement of a POVM observable on the system (see Holevo,
1973, and Levitin, 1969).
4 The quite complicated expression in (8.2), and the fact that a limit has to be used are due to the fact
that the joint density matrix (ρXY ) and the marginal matrices IX ⊗ ρY do not commute in general. If they
do commute, the whole expression gets much simpler, as discussed later.
The main point is that while p(x|y) is a probability distribution in x (i.e. 0 ≤ p(x|y) ≤ 1),
its quantum analogue ρ(X|Y) is not a density matrix. It is Hermitian and positive but its
eigenvalues can be larger than 1 and, consequently, the conditional entropy can be negative.
This helps to explain the well-known fact that quantum entropy is non-monotonic, and it
can be the case that QS(X, Y) < QS(Y), i.e. the quantum entropy of the entire system can
be smaller than the entropy of one of its subparts (which is not possible in the classical case).
This happens, for example, in the case of quantum entanglement as shown in the following
example.
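Example 8.1.2 (Cerf and Adami, 1996) Consider the Bell state |ψ⟩ = (1/√2)(|00⟩ + |11⟩) of the
Hilbert space H_AB = H_A ⊗ H_B, where H_A = H_B = H₂. The density matrix ρ_AB = |ψ⟩⟨ψ|
is shown in Figure 8.1a. ρ_A = ρ_B = Tr_{H_B}[ρ_AB] = ½(|0⟩⟨0| + |1⟩⟨1|), see Figure 8.1b.
\[
\text{(a)}\ \rho_{AB} = \begin{pmatrix} 1/2 & 0 & 0 & 1/2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1/2 & 0 & 0 & 1/2 \end{pmatrix}, \qquad
\text{(b)}\ \rho_{A} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}, \qquad
\text{(c)}\ \rho_{A|B} = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}
\]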
Hence QS(A) = QS(B) = 1. The density matrix ρ_{A|B} = ρ_AB (I_A ⊗ ρ_B)⁻¹ (because in this
case the joint and marginal matrices ρ_AB and I_A ⊗ ρ_B commute) is shown in Figure 8.1c.
Hence QS(AB) = QS(B) + QS(A|B) = 1 − 1 = 0, because QS(A|B) = −1.
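The numbers in this example can be rechecked with a few lines of linear algebra. The sketch below assumes NumPy and the basis ordering |00⟩, |01⟩, |10⟩, |11⟩; the helper name `von_neumann_entropy` is illustrative, and the conditional entropy is obtained here simply as QS(AB) − QS(B).

```python
import numpy as np

def von_neumann_entropy(rho):
    """QS(rho) = -Tr(rho lg rho), computed from the eigenvalues of rho."""
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > 1e-12]        # 0 lg 0 is taken as 0
    return float(-np.sum(eigvals * np.log2(eigvals)))

# Bell state (|00> + |11>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_ab = np.outer(psi, psi)

# Partial trace over A: view rho_AB with indices (a, b, a', b') and sum over a = a'.
rho_b = np.einsum("abad->bd", rho_ab.reshape(2, 2, 2, 2))

qs_ab = von_neumann_entropy(rho_ab)
qs_b = von_neumann_entropy(rho_b)
qs_a_given_b = qs_ab - qs_b

# rho_{A|B} = rho_AB (I_A (x) rho_B)^{-1}, valid here because the two matrices commute.
rho_cond = rho_ab @ np.linalg.inv(np.kron(np.eye(2), rho_b))

print(qs_ab, qs_b, qs_a_given_b)
print(rho_cond)
```

The matrix `rho_cond` has eigenvalue 2, which is larger than 1, and this is what makes the negative conditional entropy possible.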
Observe that this definition is reduced to the classical one for the case ρXY is a diagonal
matrix.
However, not all basic equalities and inequalities of the classical information theory
transfer from the classical to the quantum case. For example, in the classical case we have
3. it remains stable in this entangled state (while en route between the sender and the receiver).
Figure 8.2: Quantum channels. [Panels (a), (b) and (c); labels in the figure: flow of time, classical channel, memory, gate.]
If the source states are pure and nonorthogonal, then no projection measurement can
extract full information about a state. In addition, whenever a source state is transmitted
through a quantum channel at most one faithful copy of the source state can be produced,
and only if no faithful copy remains at the sender.
An interesting intermediate case is that the source states are nonorthogonal but commuting
mixed states. Such source states ρ_i can be broadcast in the following sense. The two
systems, A and B (that of the sender and that of the receiver), can be prepared in a joint state ρ_i(AB)
which is not a clone of the source states, i.e., ρ_i(AB) ≠ ρ_i(A) ⊗ ρ_i(B), but from which ρ_i can be obtained
by tracing out either of the subsystems. Namely,
ρ_i = Tr_A ρ_i(AB) = Tr_B ρ_i(AB).
If density matrices of the source states do not commute, then the source can neither be
cloned nor broadcast (see Barnum et al., 1996, and Bennett, 1998a).
If a quantum channel is to transmit nonorthogonal states faithfully, it must operate on
states that are sent through the channel “blindly” (in an “oblivious way”)—without learning
anything about them. This is due to the fact that quantum information can be neither read
nor copied without disturbance.
Evolution of any quantum system can be seen as being done in the quantum channel of
the time flow. This evolution remains deterministic until some information starts to leak into
the environment. If this happens the channel becomes noisy. The quantum state of the system
gets randomized and entangled with the environment.
Transmission of an unknown quantum state requires the following resources.
There are two basic ways of transmission of quantum states. A direct unrestricted
transmission of qubits, in which the same particle provides both resources and functions. An
indirect one, through teleportation, see Section 6.4, in which both functions are performed
by an entangled pair of particles and by a communication through the classical channel.
Fidelity of transmissions
An important issue is how to measure quality of transmissions. (There are at least two rea-
sons why a transmission will generally be imperfect: (i) data compression is needed because
of the limited resources; (ii) noisy quantum channel corrupts the state being transmitted.)
If source states are pure and a quantum channel produces on each pure input state |φ_i⟩,
produced with probability p_i, an output (mixed state) W_i, then the quality of transmissions
is measured by the fidelity (Jozsa and Schumacher, 1994)
\[ F = \sum_{i} p_i \langle\phi_i|W_i|\phi_i\rangle. \tag{8.3} \]
F is the probability that a channel output state passes a test (conducted by someone who
knows the inputs) for being the same as the input.
In the case the input states ρ_i are mixed the fidelity is defined as follows:
\[ F = \sum_{i} p_i \left( \mathrm{Tr} \sqrt{\sqrt{\rho_i}\, W_i \sqrt{\rho_i}} \right)^{2}. \tag{8.4} \]
As shown by Uhlmann (1976) and Jozsa (1994), (8.4) is a natural generalization of (8.3)
and represents the maximum of (8.3) over all purifications |φ_i⟩ of ρ_i. More formally, if we define
\[ F(\rho_1, \rho_2) = \left( \mathrm{Tr} \sqrt{\sqrt{\rho_1}\, \rho_2 \sqrt{\rho_1}} \right)^{2}, \]
then F(ρ₁, ρ₂) = max{|⟨φ₁|φ₂⟩|² | φ_i is a purification of ρ_i}.
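The fidelity function F(ρ₁, ρ₂) defined above is straightforward to evaluate numerically. The sketch below assumes NumPy and computes matrix square roots by eigendecomposition; the function names and the two test states are illustrative choices, not taken from the text.

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(rho)
    vals = np.clip(vals, 0.0, None)               # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho1, rho2):
    """F(rho1, rho2) = (Tr sqrt( sqrt(rho1) rho2 sqrt(rho1) ))^2."""
    s = psd_sqrt(rho1)
    inner = s @ rho2 @ s
    inner = (inner + inner.conj().T) / 2          # enforce Hermiticity numerically
    vals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(vals)) ** 2)

# Illustrative states: a pure state |phi><phi| and the maximally mixed qubit state.
phi = np.array([np.sqrt(0.9), np.sqrt(0.1)])
pure = np.outer(phi, phi)
mixed = np.diag([0.5, 0.5])

print(fidelity(pure, mixed), fidelity(pure, pure))
```

When the first argument is pure, the value agrees with ⟨φ|ρ₂|φ⟩, so the mixed-state formula reduces to the term appearing in (8.3).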
Exercise 8.2.1 (Jozsa (1994)) Show the following properties of the fidelity function
defined above: (a) 0 ≤ F(ρ₁, ρ₂) ≤ 1; (b) if ρ₁ = |φ⟩⟨φ| is a pure state, then F(ρ₁, ρ₂) =
⟨φ|ρ₂|φ⟩; if p₁, p₂ ≥ 0, p₁ + p₂ = 1, then F(ρ, p₁ρ₁ + p₂ρ₂) ≥ p₁F(ρ, ρ₁) + p₂F(ρ, ρ₂); (c)
F(ρ₁ ⊗ ρ₂, ρ₃ ⊗ ρ₄) = F(ρ₁, ρ₃)F(ρ₂, ρ₄); (d) F(ρ₁, ρ₂) is preserved by unitary
transformations.
A quantum channel will be considered as faithful if, in an appropriate limit, the expected
fidelity of transmission tends to unity. In other words the chance to distinguish channel
outputs from inputs by any quantum measurement should tend to zero.
Remark 8.2.2 It may happen that a quantum state to be transmitted is entangled with
some other state. We may then be interested in whether entanglement is preserved by the
transmission. Because of that it was not clear how to define capacity of quantum channels. To
deal with the problem of the quality of transmission of entanglement the concept of
entanglement fidelity has been introduced by Barnum et al. (1998) and they have shown
the equivalence of the concept of quantum capacity based on entanglement fidelity and
transmission fidelity, to be discussed later.
Since one of the main aims of this section is to present an analogous result for quantum
data compression, let us discuss the basic ideas behind the proof of Shannon’s theorem for
the case of binary variables (Steane, 1997).
Let us assume that Alice wants to communicate to Bob a sequence X of n values
x1 , . . . , xn of a binary variable X that takes the value 1 with probability p and the value 0
with the probability 1 − p. The mean number of 1’s in X is in such a case np.
The result in (8.5) naturally leads to the following encoding strategy. A sequence X of
n binary values is said to be an ε-typical sequence, ε > 0, if its probability p(X) satisfies
the inequality
\[ 2^{-n(H(p)+\varepsilon)} \le p(X) \le 2^{-n(H(p)-\varepsilon)}. \]
It can be shown that the probability that the sequence Alice wants to send is an ε-typical
sequence is greater than 1 − ε, for sufficiently large n, no matter how small ε is.
The above facts lead naturally to the following communication decision. Alice does not
need to communicate X to Bob directly. It is enough that Alice tells Bob
which of the typical sequences she wants to send. (Of course, they have to agree beforehand
how to number all typical sequences.) Alice therefore sends only the number of the typical
sequence instead of the sequence itself (and any nontypical, unlikely sequence directly).
How good is the method? It can be shown that there are about $2^{nH(p)}$ typical sequences and
that they all have approximately the same probability. In order to communicate the number of
one of them, it is sufficient to send nH(p) < n bits. In addition, there is no way for Alice to do
better because all typical sequences have (approximately) the same probability.
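The counting argument can be illustrated numerically for a binary source. The following is a sketch under the stated definition of ε-typical sequences; the parameters n = 1000, p = 0.25 and ε = 0.05 are arbitrary illustrative values, and the function names are not from the text.

```python
import math

def binary_entropy(p):
    """H(p) = -p lg p - (1 - p) lg(1 - p), in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def typical_set(n, p, eps):
    """Count the eps-typical sequences of length n and sum their probability.

    A sequence with k ones has probability p^k (1-p)^(n-k); it is eps-typical
    if 2^(-n(H+eps)) <= probability <= 2^(-n(H-eps)).
    """
    h = binary_entropy(p)
    count, mass = 0, 0.0
    for k in range(n + 1):
        log_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if -n * (h + eps) <= log_prob <= -n * (h - eps):
            count += math.comb(n, k)                    # sequences with k ones
            mass += math.comb(n, k) * 2.0 ** log_prob   # their total probability
    return count, mass

n, p, eps = 1000, 0.25, 0.05        # illustrative parameters
count, mass = typical_set(n, p, eps)
print(math.log2(count) / n, binary_entropy(p), mass)
```

The first printed value is the number of bits per symbol needed to index a typical sequence; it stays below H(p) + ε, while the typical set carries almost all of the probability.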
Shannon’s noiseless theorem provides theoretical limits on how well (classical) data can
be compressed. The next important task is to develop simple methods that either achieve
optimality or get close to it.
Example 8.2.5 (Huffman code) For n = 4 and p = 1/4 the best possible data compression,
according to Shannon’s theorem, requires sending on average 4H(1/4) = 3.245 bits for
communicating 4 values of a binary variable X. A simple and practical method known as
Huffman code, see Table 8.3, in which less probable sequences are encoded by longer words
and more probable sequences by shorter words, requires sending on average 3.273 bits per
message.
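Both averages quoted in this example can be recomputed. The sketch below builds a Huffman code with Python's `heapq` for the 16 blocks of four bits, assuming that the value 1 has probability 1/4; the codeword assignment it implies may differ from the one in the table, but every Huffman code for the same distribution has the same average length.

```python
import heapq
import math
from itertools import product

def huffman_average_length(weights):
    """Average codeword length of a Huffman code: (sum of merged node weights) / total."""
    heap = list(weights)
    heapq.heapify(heap)
    total, cost = sum(weights), 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        cost += merged               # each merge adds one bit to every leaf below it
        heapq.heappush(heap, merged)
    return cost / total

# Integer weights 256 * Pr(block) for 4-bit blocks with Pr(1) = 1/4, Pr(0) = 3/4.
weights = [3 ** bits.count(0) for bits in product((0, 1), repeat=4)]

avg_huffman = huffman_average_length(weights)
shannon = 4 * (-(1 / 4) * math.log2(1 / 4) - (3 / 4) * math.log2(3 / 4))
print(avg_huffman, shannon)
```

Using integer weights keeps the merging exact, so the Huffman average comes out as the exact fraction 838/256.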
Remark 8.2.6 Shannon’s noiseless coding theorem also provides an interpretation of Shan-
non entropy S(X) as the mean number of bits necessary to code the outputs of a random
variable X using an ideal code.
Example 8.2.7 Let the source produce states |φ⟩ and |ψ⟩, both with probability 1/2. Let
|φ⟩ = √0.9 |a⟩ + √0.1 |b⟩ and |ψ⟩ = √0.9 |a⟩ − √0.1 |b⟩, where {|a⟩, |b⟩} is an orthonormal basis of
H₂. The density matrix of the source is
\[ \rho = \begin{pmatrix} 0.9 & 0 \\ 0 & 0.1 \end{pmatrix} \]
and S(ρ) = H(0.9) = 0.469.
Example 8.2.8 For blocks s₁s₂s₃ of three signals of the source X from Example 8.2.7 let
us encode s₁ and s₂ using the mapping |a⟩ → |0⟩, |b⟩ → |1⟩ and let us ignore s₃. During the
decoding process let us decode s₁ and s₂ fully and s₃ always as |a⟩. Encodings are therefore
into a 4-dimensional space and the fidelity of this encoding/decoding scheme is 0.9.
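The figures of Examples 8.2.7 and 8.2.8 can be verified directly. This sketch assumes NumPy and represents |a⟩ and |b⟩ as the standard basis vectors of a two-dimensional space; the variable names are illustrative.

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
phi = np.sqrt(0.9) * a + np.sqrt(0.1) * b
psi = np.sqrt(0.9) * a - np.sqrt(0.1) * b

# Density matrix of the source: both signals have probability 1/2.
rho = 0.5 * np.outer(phi, phi) + 0.5 * np.outer(psi, psi)
eigvals = np.linalg.eigvalsh(rho)
entropy = float(-np.sum(eigvals * np.log2(eigvals)))

# Example 8.2.8: s1 and s2 are reproduced exactly and s3 is always decoded as |a>,
# so the fidelity of a block is the overlap |<s3|a>|^2, averaged over the source.
block_fidelity = 0.5 * abs(a @ phi) ** 2 + 0.5 * abs(a @ psi) ** 2

print(rho)
print(entropy, block_fidelity)
```

The off-diagonal terms of the two signal states cancel, which is why the density matrix is diagonal in the basis {|a⟩, |b⟩}.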
Theorem 8.2.9 (Schumacher’s noiseless coding theorem) For any quantum source
X with the density matrix ρ and any ε, δ > 0 it holds:
(a) If S(ρ) + δ qubits are available per signal, then for large n, there exists a cod-
ing/decoding scheme of fidelity F ≥ 1 − ε for strings of the signals of length n.
(b) If S(ρ) − δ qubits are available per signal, then, for any encoding/decoding scheme,
some strings of length n will be decoded with the fidelity F < ε for n sufficiently large.
Proof. We assume a quantum source X producing pure states (sequences of pure states)
|φ₁⟩, . . . , |φ_m⟩ of a Hilbert space H_n with the probability p_i for |φ_i⟩. The density matrix ρ
of X is $\rho = \sum_{i=1}^{m} p_i |\phi_i\rangle\langle\phi_i|$. In the proof we use two lemmas.
Lemma 8.2.10 Let |φ_i⟩⟨φ_i| ←→ W_i, 1 ≤ i ≤ m, be an association of density matrices
to signals, where each W_i is a density matrix of a mixed state with pure states over a
d-dimensional subspace D of H_n. Let the sum of the d largest eigenvalues of ρ be θ. Then the
fidelity of transmissions |φ_i⟩ −→ W_i is at most θ.
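Proof of Lemma 8.2.10. Since each W_i is “supported” by D there is, for each 1 ≤ i ≤ m,
an orthonormal basis $|\psi_1^{(i)}\rangle, \ldots, |\psi_d^{(i)}\rangle$ of D consisting of eigenvectors of W_i such that
\[ W_i = \sum_{j=1}^{d} g_j^{(i)} |\psi_j^{(i)}\rangle\langle\psi_j^{(i)}|, \]
where $0 \le g_j^{(i)} \le 1$. Denote by Γ the projection into D. Hence for any 1 ≤ i ≤ m,
$\Gamma = \sum_{j=1}^{d} |\psi_j^{(i)}\rangle\langle\psi_j^{(i)}|$. In addition, for any 1 ≤ i ≤ m, we have
\[ \langle\phi_i|W_i|\phi_i\rangle = \sum_{j=1}^{d} g_j^{(i)} \langle\phi_i|\psi_j^{(i)}\rangle\langle\psi_j^{(i)}|\phi_i\rangle \le \mathrm{Tr}(|\phi_i\rangle\langle\phi_i|\Gamma). \]
Averaging over i with the weights p_i therefore gives F ≤ Tr(ρΓ).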
Let now |e₁⟩, . . . , |e_n⟩ be an orthonormal basis of eigenvectors of ρ with the corresponding
eigenvalues λ₁, . . . , λ_n. Then
\[ \mathrm{Tr}(\rho\Gamma) = \sum_{i=1}^{n} \lambda_i \langle e_i|\Gamma|e_i\rangle, \]
where $0 \le \langle e_i|\Gamma|e_i\rangle \le 1$, and $\sum_{i=1}^{n} \langle e_i|\Gamma|e_i\rangle = \mathrm{Tr}(\Gamma) = d$, see Exercises 9.2.24. From the
minimax properties of eigenvalues (see Kato, 1978), this implies that Tr(ρΓ) ≤ θ.
Observe that the above upper bound for Tr(ρΓ) is obtained if D is the subspace spanned
by d eigenvectors corresponding to d largest eigenvalues.
The lemma claims that for a fixed d we get the highest fidelity by taking as D the space
generated by eigenvectors corresponding to d largest eigenvalues. Let us denote by D_X such
a subspace of H_n. We present now an encoding-decoding method based on the choice of D
as D_X.
Let |0⟩ be any fixed state in D_X and let $D_X^{\perp}$ be the orthogonal complement of D_X in
H_n. For each |φ_i⟩ we construct W_i as follows:
If |φ_i⟩ = α_i|l_i⟩ + β_i|r_i⟩, where |l_i⟩ ∈ D_X, |r_i⟩ ∈ $D_X^{\perp}$ are unit vectors, then
\[ W_i = |\alpha_i|^2 |l_i\rangle\langle l_i| + |\beta_i|^2 |0\rangle\langle 0|. \tag{8.6} \]
Technically, W_i can be designed by applying the observable $P_{D_X}$ on |φ_i⟩ and, if the result is 0,
i.e. |φ_i⟩ projects to $D_X^{\perp}$, |0⟩ is taken as the post-measurement state.
Lemma 8.2.11 Suppose that the sum of the d largest eigenvalues of ρ is greater than 1 − ε.
Then the coding/decoding process defined by the mapping |φ_i⟩ ←→ W_i has fidelity F > 1 − 2ε.
Proof. Since
\[ \langle\phi_i|W_i|\phi_i\rangle = |\alpha_i|^2 |\langle\phi_i|l_i\rangle|^2 + |\beta_i|^2 |\langle\phi_i|0\rangle|^2 \ge |\alpha_i|^2 |\langle\phi_i|l_i\rangle|^2 = |\alpha_i|^4 \ge 2|\alpha_i|^2 - 1 \]
we get
\[ F = \sum_{i=1}^{m} p_i \langle\phi_i|W_i|\phi_i\rangle \ge 2 \sum_{i=1}^{m} p_i |\alpha_i|^2 - 1 = 2\,\mathrm{Tr}(\rho P_{D_X}) - 1 > 1 - 2\varepsilon, \]
if we make use of the fact that $\sum_{i=1}^{m} p_i |\alpha_i|^2 = \mathrm{Tr}(\rho P_{D_X})$ is the sum of the d largest eigenvalues
and therefore it is larger than 1 − ε.
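The two lemmas together sandwich the fidelity of the projection scheme (8.6) between 2θ − 1 and θ, where θ is the sum of the d largest eigenvalues of ρ. The sketch below checks this on a randomly generated source; the source, the dimension 8, the choice d = 4 and the choice of |0⟩ as one of the top eigenvectors are all assumptions made only for illustration.

```python
import numpy as np

def projection_scheme(states, probs, d):
    """Return (theta, F) for the scheme (8.6), with D_X spanned by the top-d eigenvectors."""
    rho = sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))
    eigvals, eigvecs = np.linalg.eigh(rho)         # eigenvalues in ascending order
    basis = eigvecs[:, -d:]                        # orthonormal basis of D_X
    theta = float(eigvals[-d:].sum())              # sum of the d largest eigenvalues
    zero = basis[:, -1]                            # the fixed state |0> in D_X
    fidelity = 0.0
    for p, s in zip(probs, states):
        alpha_sq = float(np.linalg.norm(basis.conj().T @ s) ** 2)   # |alpha_i|^2
        beta_sq = 1.0 - alpha_sq                                    # |beta_i|^2
        overlap_zero = abs(np.vdot(zero, s)) ** 2                   # |<phi_i|0>|^2
        # <phi_i|W_i|phi_i> = |alpha_i|^4 + |beta_i|^2 |<phi_i|0>|^2
        fidelity += p * (alpha_sq ** 2 + beta_sq * overlap_zero)
    return theta, float(fidelity)

rng = np.random.default_rng(0)                     # hypothetical random source
states = rng.normal(size=(6, 8)) + 1j * rng.normal(size=(6, 8))
states /= np.linalg.norm(states, axis=1, keepdims=True)
probs = np.full(6, 1 / 6)

theta, F = projection_scheme(states, probs, d=4)
print(2 * theta - 1, F, theta)
```

The three printed values appear in increasing order, in agreement with the lower bound of Lemma 8.2.11 and the upper bound of Lemma 8.2.10.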
To finish the proof of Schumacher’s theorem let us formulate another of its ingredients, a
quantum modification of the classical result on typical sequences mentioned in Section 8.2.2.
Let λ = {λ₁, . . . , λ_m} be the probabilities of X and, for any k ∈ N, let $\Lambda^k$ be the multiset of
probabilities of $X^k$ (obtained by multiplying the elements of k-tuples). Let
the probability of any subset of $X^k$ be the sum of the probabilities of its members. The result
on typical sequences we will use has the form (Jozsa and Schumacher, 1994):
Lemma 8.2.12 Let ε, δ > 0. (1) For sufficiently large k, the set $X^k$ may be partitioned into
a subset L of “likely sequences” with at most $2^{k(QS(X)+\delta)}$ elements, which has probability
greater than 1 − ε, and into its complement of “unlikely sequences”, which has probability
smaller than ε.
(2) Any subset of $X^k$ which has fewer than $2^{k(QS(\rho)-\delta)}$ elements has probability smaller
than ε.
Example 8.2.13 Let us consider again the source from Example 8.2.7 and let H_a
be its 4-dimensional subspace spanned by the following set of three-qubit states
{|aaa⟩, |aab⟩, |aba⟩, |baa⟩} having a majority of a's. Let us use the following encoding/decoding
scheme:
Encoding. Let the states in H_a be encoded as follows:
(0.9³, 0.9² · 0.1, 0.9² · 0.1, 0.9² · 0.1, 0.9 · 0.1², 0.9 · 0.1², 0.9 · 0.1², 0.1³)
and therefore Tr(ρΓ) = 0.9³ + 3(0.9)² · 0.1 = 0.972 = 1 − 0.028. By Lemma 8.2.11, this
encoding/decoding scheme has fidelity at least 0.944 = 1 − 2 · 0.028.
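The arithmetic of this example can be rechecked from the eigenvalues of the three-signal density matrix. The sketch assumes NumPy and the diagonal density matrix diag(0.9, 0.1) of Example 8.2.7; the fidelity bound is taken as 2θ − 1, with θ the weight of the majority-a subspace.

```python
import numpy as np

rho = np.diag([0.9, 0.1])                  # source of Example 8.2.7 in the basis |a>, |b>
rho3 = np.kron(np.kron(rho, rho), rho)     # three independent signals
eigvals = np.sort(np.diag(rho3))[::-1]     # rho3 is diagonal, so these are its eigenvalues

theta = float(eigvals[:4].sum())           # weight of span{|aaa>, |aab>, |aba>, |baa>}
lower_bound = 2 * theta - 1                # lower bound on the fidelity
print(eigvals, theta, lower_bound)
```

The four largest eigenvalues are exactly those of the basis states with at most one b, so the majority-a subspace is the optimal 4-dimensional choice.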
Remark 8.2.14 The first attempts to develop quantum information theory were based on
applying classical information theory to probabilities derived from the representations of
quantum mixed states. Schumacher’s theorem has actually been the beginning of a new
approach, in which quantum information theory is based on concepts, measures and codings
that are inherently quantum.
Remark 8.2.15 Observe that quantum data compression has the following remarkable
property: it allows one to compress and expand each of the 2ⁿ distinct sequences of
nonorthogonal states with fidelity approaching 1 even though the sequences cannot be reliably
distinguished from one another by any measurement.
Definition 8.2.16 An $m \stackrel{p}{\to} n$ random access encoding is a function
$f : \{0,1\}^m \times R_{\{0,1\}} \to H_n$ such that for any 1 ≤ i ≤ m there is a projection measurement O_i which
when applied to the value-state of f returns 0 or 1 and such that for any $b \in \{0,1\}^m$,
Pr(O_i(f(b, r)) = b_i) ≥ p, where R is a source of random bits (f is called the encoding
function and O_i are decoding mappings).
Example 8.2.17 (A $2 \stackrel{0.85}{\to} 1$ encoding.) Let |u₀⟩ = |0⟩, |u₁⟩ = |1⟩, |v₀⟩ = |0′⟩, |v₁⟩ = |1′⟩ and
\[ f(x_1, x_2) = \frac{1}{\sqrt{2+\sqrt{2}}} \left( |u_{x_1}\rangle + |v_{x_2}\rangle \right). \]
The mapping f has the desirable properties provided the
first (second) bit is measured with respect to the standard (dual) basis. (This follows from
the fact that the distance between the codeword and the right subspace is π/8 and the access
probability is therefore cos²(π/8) ≈ 0.853.)
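The access probability cos²(π/8) can be confirmed by writing the four codewords as real unit vectors at angles ±π/8 and ±3π/8 from |0⟩. The sketch below is one consistent realization, assuming |0′⟩ = (|0⟩ + |1⟩)/√2 and |1′⟩ = (|0⟩ − |1⟩)/√2; with this sign convention the codeword for (1, 1) is proportional to |1⟩ − |1′⟩, which is an assumption of the sketch rather than something stated in the text.

```python
import numpy as np

# Code states cos(t)|0> + sin(t)|1>; the angles are one consistent choice of signs.
ANGLES = {(0, 0): np.pi / 8, (0, 1): -np.pi / 8,
          (1, 0): 3 * np.pi / 8, (1, 1): -3 * np.pi / 8}

STANDARD = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]                          # |0>, |1>
DUAL = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)]   # |0'>, |1'>

def encode(x1, x2):
    """Codeword for the two bits (x1, x2) as a real qubit state."""
    t = ANGLES[(x1, x2)]
    return np.array([np.cos(t), np.sin(t)])

def success_probability(x1, x2, which):
    """Probability that the measurement for bit `which` (1 or 2) returns the encoded bit."""
    state = encode(x1, x2)
    basis, bit = (STANDARD, x1) if which == 1 else (DUAL, x2)
    return float(abs(basis[bit] @ state) ** 2)

probs = [success_probability(x1, x2, w)
         for x1 in (0, 1) for x2 in (0, 1) for w in (1, 2)]
print(min(probs), np.cos(np.pi / 8) ** 2)
```

All eight success probabilities coincide, so either bit can be recovered with probability about 0.853, although not both at once.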
A classical bit-to-bit $m \stackrel{p}{\to} n$ encoding can be defined similarly and it has been proved
by Ambainis et al. (1998) that no $2 \stackrel{p}{\to} 1$ classical random access encoding exists if p > 1/2.
The potential of classical $m \stackrel{p}{\to} n$ encodings is already well understood.
Lemma 8.2.18 If 1/2 < p ≤ 1, then n ≥ (1 − H(p))m for any classical $m \stackrel{p}{\to} n$ encoding.
Lemma 8.2.21 If there is an $m \stackrel{1-\varepsilon}{\to} n$ quantum encoding with $\varepsilon < \frac{1}{64m^2}$, then n = Ω(m).
For the proof, which is quite technical, see Ambainis et al. (1998).
Exercise 8.2.23 Design a $3 \stackrel{0.79}{\to} 1$ encoding. (Hint: make eight vertices of a maximal
cube embedded in the Bloch sphere to encode 3 bits. For decoding use: standard, dual and
circular basis.)⁵
Lemma 8.2.18 and Theorem 8.2.19 show the gap between the classical and quantum
random access encodings.
Figure 8.4: Channel transmission scheme
of quantum channel capacity, QC₁(N) and QC₂(N), are defined as asymptotic quantum
capacities of the quantum channel N assisted by one-way classical communication (from the
sender to the receiver) or by two-way classical communication between the sender and the
receiver (see Figure 8.5), where the superoperator E stands for the encoding and D for the
decoding processes. It has been shown, by Bennett et al. (1996a), that one-way classical
communication does not help and therefore QC₁(N) = QC(N) for any quantum channel.
On the other hand, two-way classical communication can increase the capacity of some noisy
quantum channels.
Figure 8.5: Communication through a quantum channel assisted by (a) one-way and (b) two-way
classical communication. Q denotes a quantum channel and C a classical channel.
The basic relations between these capacities can be summarized as follows (see Bennett
et al. 1996a and Bennett, 1998b):
• QC(N) ≤ C(N) for all quantum channels N and QC(N) < C(N) for some N.
• QC(N) ≤ QC₂(N) for all quantum channels N and QC(N) < QC₂(N) for some N.
• QC₂(N) ≤ C(N) for all known quantum channels N and QC₂(N) < C(N) for some N.
The inequality QC(N) ≤ QC₂(N) follows from the definition. From the fact that
orthogonal quantum states can be used to transmit bits, it follows that QC(N) ≤ C(N).
No channel N is known for which QC₂(N) > C(N).
if p < 0.25408 then QC(QDCH) > 0, QC₂(QDCH) > 0, C(QDCH) > 0;
if 1/3 < p < 2/3 then QC₁(QDCH) = 0, QC₂(QDCH) > 0, C(QDCH) > 0;
if 2/3 ≤ p ≤ 1 then QC₂(QDCH) = 0, C(QDCH) > 0.
Theorem 8.3.2 (Nielsen, 1998) If |ψ⟩ and |φ⟩ are pure states of H_A ⊗ H_B, then |ψ⟩ can be
transformed to |φ⟩ using local operations and classical communication if and only if λ_ψ ≺ λ_φ.
Two states in H_A ⊗ H_B are called incommensurate if neither can be obtained from the other
by local operations and classical communication.
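The majorization condition is easy to test once the vectors λ_ψ and λ_φ are known. The sketch below assumes, as is usual for this theorem, that λ_ψ denotes the vector of eigenvalues of the reduced density matrix of |ψ⟩ (the squared Schmidt coefficients) and that both vectors have the same length; the function names and the example vectors are illustrative.

```python
import numpy as np

def majorized_by(x, y, tol=1e-12):
    """Return True if x ≺ y: partial sums of x, sorted decreasingly, never exceed those of y."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]
    y = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumsum(x) <= np.cumsum(y) + tol))

def incommensurate(x, y):
    """Neither vector majorizes the other, so neither state can be reached from the other."""
    return not majorized_by(x, y) and not majorized_by(y, x)

bell = [0.5, 0.5]           # maximally entangled two-qubit state
product = [1.0, 0.0]        # unentangled state

print(majorized_by(bell, product), majorized_by(product, bell))
print(incommensurate([0.5, 0.25, 0.25], [0.4, 0.4, 0.2]))
```

The first line shows that a Bell state can be turned into a product state but not the other way round; the second line gives a pair of spectra for which neither transformation is possible.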
Open problem 8.3.4 1. Find a necessary and sufficient condition that a mixed state
ρ1 can be transformed into a mixed state ρ2 using only local operations and classical
communication.
2. Find a necessary and sufficient condition to transform one three (or more) party en-
tangled state into another one by local operations and classical communications of all
involved parties.
Figure 8.6: Basic step of the entanglement purification. [Figure labels: EPR-pairs shared by Alice (particles a₁, a₂) and Bob (particles b₁, b₂); measurements; communication using a public channel.]
noisy channel). To each pair of particles (a2i−1 , b2i−1 ), (a2i , b2i ) they perform the above
purification step at the end of which they either keep the pair (a2i−1 , b2i−1 ) or discard both
pairs.
Figure 8.7: Entanglement purification with 2-way communication. [Figure labels: EPR pairs, noise, measurements, good EPR pairs.]
The inputs to the purification process do not have to be perfectly entangled pairs. Such
pairs can be produced by a source and reach Alice and Bob through noisy channels on both
ways.
The above purification protocol can be modified and generalized by allowing both par-
ties to perform first various local operations (superoperators) and to use some other post-
selection procedures. Efficiency of purification is the main issue.
8.3. QUANTUM ENTANGLEMENT 335
Example 8.3.5 (Bennett et al. 1996b) Let us assume that a purification procedure is applied
to a mixed state ρ and the aim is to obtain the singleton |Ψ−⟩. The purity of ρ, with
respect to the goal |Ψ−⟩, can be expressed by the fidelity F = ⟨Ψ−|ρ|Ψ−⟩. In order to explore
the impact of a given purification procedure instead of ρ we can consider the so-called
Werner state
\[ W_F = F\,|\Psi^-\rangle\langle\Psi^-| + \frac{1-F}{3}\bigl(|\Psi^+\rangle\langle\Psi^+| + |\Phi^+\rangle\langle\Phi^+| + |\Phi^-\rangle\langle\Phi^-|\bigr), \qquad (8.8) \]
which has the same fidelity as ρ.
In the simplified purification protocol given by Bennett et al. (1996b), at first Hadamard
rotations are applied locally, on each of the two pairs and after the BXOR operation such
rotations are applied again. They showed that in such a case the new mixed state has, with
probability 1/4, a better fidelity F′ with respect to |Ψ−⟩, namely
\[ F' = \frac{F^2 + \frac{1}{9}(1-F)^2}{F^2 + \frac{2}{3}F(1-F) + \frac{5}{9}(1-F)^2} > F. \]
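The effect of repeating the purification step can be explored numerically. The sketch below only evaluates the fidelity map F ↦ F′ stated above; the function names are hypothetical, and it assumes that every round starts again from Werner states of the current fidelity. Evaluating the map shows that F = 1/2 is a fixed point, so an improvement F′ > F is obtained for 1/2 < F < 1.

```python
def purified_fidelity(f):
    """Fidelity F' after one successful purification step applied to pairs of fidelity F."""
    numerator = f ** 2 + (1 - f) ** 2 / 9
    denominator = f ** 2 + 2 * f * (1 - f) / 3 + 5 * (1 - f) ** 2 / 9
    return numerator / denominator

def iterate_purification(f, rounds):
    """List of fidelities obtained by applying the step repeatedly (successful rounds only)."""
    history = [f]
    for _ in range(rounds):
        f = purified_fidelity(f)
        history.append(f)
    return history

# Example: fidelities over five successful rounds, starting from F = 0.6.
fidelities = iterate_purification(0.6, 5)
```

The sketch ignores the cost of the procedure: each round consumes pairs and succeeds only with some probability, which is why the efficiency of purification is the main issue.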
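Using the Schmidt decomposition theorem, page 374, |φ⟩ can be expressed in the form
\[ |\phi\rangle = \sum_{i=1}^{d} c_i\,|\alpha_i\rangle|\beta_i\rangle, \]
where ci are positive and {αi }, {βi } (i = 1, . . . , d) are orthonormal sets of states in A and B. In
such a case
\[ \rho_A = \mathrm{Tr}_B\,|\phi\rangle\langle\phi| = \sum_{i=1}^{d} c_i^2\,|\alpha_i\rangle\langle\alpha_i| \quad\text{and}\quad E(\phi) = -\sum_{i=1}^{d} c_i^2 \lg c_i^2 . \]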
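The entanglement of a pure bipartite state can be computed directly from its Schmidt coefficients. The following is a minimal sketch, assuming the state is given as a vector of amplitudes in the product basis of A and B; the Schmidt coefficients are then the singular values of the amplitude matrix. The function name is hypothetical.

```python
import numpy as np

def entanglement_entropy(state, dim_a, dim_b):
    """E(phi) = -sum_i c_i^2 lg c_i^2, where c_i are the Schmidt coefficients of phi."""
    amplitudes = np.asarray(state, dtype=complex).reshape(dim_a, dim_b)
    schmidt_coefficients = np.linalg.svd(amplitudes, compute_uv=False)
    probabilities = schmidt_coefficients ** 2
    probabilities = probabilities[probabilities > 1e-15]  # treat 0 * lg 0 as 0
    return float(-np.sum(probabilities * np.log2(probabilities)))

# Illustrative two-qubit states (amplitudes in the basis 00, 01, 10, 11).
epr_pair = np.array([1, 0, 0, 1]) / np.sqrt(2)
product_state = np.array([1, 0, 0, 0])
```

In this sketch a maximally entangled pair of qubits carries one ebit and a product state carries none.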
A strong justification of the above definition of the entanglement of pure states is the
following result (see Bennett et al. 1996b). Consider n entangled pairs of particles, each in
the state |φ⟩. Let Alice hold one particle of each pair and Bob, spatially separated, hold
the other. If |φ⟩ has E ebits of entanglement, then n pairs can be reversibly converted, by
local operations and classical communications, into m singletons, where m/n approaches E for
large n and the fidelity approaches 1.
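Example 8.3.6 (Bennett et al. 1996a) If a state |φ⟩ has representation
\[ |\phi\rangle = \sum_{i=1}^{4} \alpha_i\,|m_i\rangle, \]
where |mi⟩ are states of the magic basis, then
\[ E(\phi) = H\Bigl(\tfrac{1}{2} + \tfrac{1}{2}\sqrt{1 - \bigl|\sum_{i=1}^{4}\alpha_i^2\bigr|^2}\,\Bigr). \]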
Several measures of entanglement have been defined and investigated for mixed states
and density matrices. They are based on clear physical ideas. Each of them can be seen as
a fundamental concept of quantum information theory.
\[ E_r(\rho) = \min_{\rho' \in D} \mathrm{Tr}\,\rho\,(\lg\rho - \lg\rho'), \]
separable, if 2 ≤ α ≤ 3
bound entangled, if 3 < α ≤ 4
free entangled, if 4 < α ≤ 5
A method to construct states with bound entanglement was developed by DiVincenzo et al.
(1998a).
The first section of this Appendix is devoted to quantum theory. It contains an informal,
often popular, overview and discussion of several basic issues of quantum physics. It is
written mainly for those in computing with (almost) no knowledge of the subject. Exposition
is therefore necessarily without many details needed if one wants to be (fully) precise. For
more details the reader is referred, for example, to Peres (1993), Bub (1997), Jammer (1966),
Penrose (1990, 1994) and Lindley (1996), listed roughly from more
technical to more popular. These works mainly influenced the presentation of this section.
Section 9.2 presents some basic concepts and results of Hilbert space theory in more
detail than in Section 1.4 and it contains additional subjects.
The third part of the Appendix, on the book web pages only, contains in Section 9.3
a survey of the basic concepts, models and results of the complexity theory. This part
is oriented mainly towards people outside of computing with (almost) no knowledge of the
computation and complexity theory. Section 9.4 contains additional exercises and Section 9.5
additional historical and bibliographical references.
There are two basic views of the goals a natural science has to meet: (a) to predict—to
provide results that allow us to predict behaviours and outcomes of natural processes; (b)
to explain—to provide explanations of the corresponding basic phenomena of Nature and
to help us to get an understanding of these phenomena.
Quantum mechanics superbly fulfils its prediction role. Concerning its capability to
explain phenomena of the quantum world, at least in the sense most of us would like, the
situation seems to be quite different.
342 Appendix–quantum theory
Quantum theory works perfectly in all practical applications and it describes with un-
precedented precision many phenomena of Nature. Predictions made on its basis have enor-
mous value and have been tested to about 14 orders of magnitude. No conflict between its
predictions and experiments is known. Quantum theory is at present the best mathematical
model to describe the physical world.
Without quantum theory we would not know, for example, how to explain the properties of
superfluids, the functioning of lasers, the nature of chemistry, the very existence and behaviour
of solids, the colours of stars, as well as atomic and nuclear phenomena.
On the other hand, quantum theory is often seen as forcing us to accept, as the best
we currently have, philosophically highly unsatisfying views of the world that do not square
with our common sense perception of the functioning of the Universe.1
The basic reasons for arriving at a quantum theory view of the world were experimental.
Quantum theory arose out of the observations of subtle discrepancies between the outcomes
of some experiments and predictions classical physics offered.
For many of the issues we discuss there is no unique understanding and it is therefore
inevitable that the point of view presented here is not the only one possible.
Classical physics describes Nature nicely and fully in accordance with our common sense.
In classical physics reality exists independently of ourselves. Our bodies and brains are
themselves parts of the classical world. An “objective physical reality” seems to correspond
to all concepts of classical physics. In particular, particles of classical physics always have
a position and a momentum that can be (statistically) determined.
One of the basic reasons why classical physics got into difficulties at the end of the
nineteenth century was the fact that it was not able to cope with the coexistence of two types of
physical objects: particles and fields. For a system with both particles and fields to be in
equilibrium, all energy gets taken from particles into fields. Since fields have infinitely many
degrees of freedom, particles are left without energy.
The question of whether light is a wave process or has a particle character goes back to
Newton. The celebrated Thomas Young’s two-slit experiment, performed in 1801, demon-
strated clearly the wave character and the interference of polarized light and pointed out
strongly the particle–wave dichotomy. However, this experiment did not get the attention it
merited. It was not yet time to handle the problems it brought up.
A famous, historically important, and influential example of the instability of the coex-
istence of fields and particles was the “black-body radiation problem”. Imagine an enclosed
empty box at some fixed temperature. Electromagnetic radiation of the object should be
in some equilibrium with particles. However, if there were more energy in the walls of the
box than in the enclosed radiation, then energy would move from the walls to the interior,
1 By Penrose: “Quantum theory seems to lead to philosophical standpoints that many find deeply unsat-
isfying. At best, and taking its descriptions at their most literal, it provides us with a very strange view of
the world indeed. At worst, and taking literally the proclamations of some of its most famous protagonists,
it provides us with no view of the world at all.”
9.1. QUANTUM THEORY 343
increasing the density of radiation within the box. If there were more energy in the elec-
tromagnetic radiation than in the walls, then it would heat up the walls to restore equality.
Classical theory predicted that all energy would be sucked up by the field. However, the
experiments did not confirm it. At the high frequencies, where classical physics predicted
strong discrepancies (a rapid increase of energy), the energy not only did not go up so much,
but actually dropped off. To summarize, classical physics could not find out how to define
the electromagnetic radiation that would be contained in the box of a constant temperature.
In 1900 Max Planck2 came up with a revolutionary theory that electromagnetic oscillations
could carry energy only in “quanta”, the energy E of which satisfies the relation
E = hν
to the frequency ν, where h is known today as Planck’s constant.3 With this idea the
black-body radiation problem could be dealt with.
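To get a feeling for the size of a single quantum, the relation E = hν can be evaluated numerically. The sketch below is an illustration only: the function names are hypothetical, and the values of h and of the speed of light are the ones fixed by definition in the SI.

```python
PLANCK_CONSTANT = 6.62607015e-34  # J*s, exact by definition in the SI since 2019
SPEED_OF_LIGHT = 299_792_458.0    # m/s, exact by definition in the SI

def photon_energy(frequency_hz):
    """Energy E = h * nu (in joules) of one quantum of frequency nu (in hertz)."""
    return PLANCK_CONSTANT * frequency_hz

def photon_energy_from_wavelength(wavelength_m):
    """Energy of one quantum of light with the given vacuum wavelength (in metres)."""
    return photon_energy(SPEED_OF_LIGHT / wavelength_m)

# Example: one quantum of green light with a wavelength of 500 nm.
green_quantum = photon_energy_from_wavelength(500e-9)
```

A single quantum of visible light carries an energy of the order of 10^-19 joules, which is why the discreteness of energy is not noticeable in everyday experience.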
In spite of Planck’s ability to come up with a new theory in accordance with the experi-
mental results, as in the case of the black body radiation problem, his theory did not receive
too much attention until the next step occurred. Einstein came up with the theory that
electromagnetic field also can exist only in discrete units and on this basis he was able to
explain the photoelectric effect.4 This implied, for example, that light itself must actually
be particles, because it was known, due to the results of Maxwell, that light consists of the
oscillations of the electromagnetic fields. However, it was also established experimentally
that light sometimes behaves as waves.
The task was then to explain how it can happen that light consists of particles and
field oscillations at the same time—some experiments showed light as particles and some as
waves.
Niels Bohr5 made an important use of the Planck relation in 1913. He discovered that
the angular momentum (spin) of electrons in orbit about the nucleus can occur only
in integer multiples of the number h/2π, for which Dirac introduced the symbol ħ. The only
permissible values of the spin of electrons are therefore
0, ħ, 2ħ, 3ħ, . . .
With this approach Bohr was able to put the “solar model” of atoms on a firmer base and
to explain the many different energy states and also special rules for spectral frequencies.
2 Max Planck (1858–1947), a German physicist. The quantum theory was developed from his hypothesis
that atoms emit energy only in discrete bundles (quanta). He received the 1918 Nobel prize for physics—for
his work on black-body radiation that depended on his hypothesis.
3 Planck’s ingenious solution was considered controversial for some time. Planck himself did not want
to believe that electromagnetic radiation was fundamentally structured in this new way. He hoped to
find some overlooked features of classical physics that would be able to explain why waves had to carry
energy in discrete quantities. Planck never attributed a genuine physical reality to those “little bundles of
electromagnetic energy”. He saw them as mathematical constructs hiding some physical principles (Lindley,
1996).
4 It was formally for this discovery that Einstein received the Nobel prize.
5 Niels Henrik David Bohr (1885–1962), a Danish physicist, one of the best physicists of the twentieth
century. He helped to found and then directed the Institute for Theoretical Physics at the University of
Copenhagen. He was the main representative of the influential Copenhagen school of quantum physics.
Bohr received the 1922 Nobel prize for physics for his work on atomic structures. He combined quantum
theory with his new concept of atomic structure. Bohr explained the stability of the nuclear model of atoms
by postulating that electrons move on restricted orbits around the atom’s nucleus and explaining how atoms
emit and absorb energy.
Remark 9.1.1 It seems to be not only of historical interest, but also of importance for the
future to look into the question why it took so long before quantum computing issues started
to be investigated with sufficient vigour, because it is now clear that already 40 years ago it
was quite natural to start to do that, both for computer science and for physics.
Computer science, and actually even the most theoretical parts of it, has been developed,
and is still developed, basically as a servant of the computer industry and, consequently,
with quite restricted and short-term goals. This is why it actually ignored in a sense its
most basic scientific goal—to explore fully the potentials and limitations of computing and
communication based on the laws of physics7 . Theoretical computer science, especially
complexity theory, found it interesting and important to work with computer models far outside
the framework of current classical computers, but these models, such as alternating Turing
machines, were considered only as tools. Attempts to see the main scientific aims of fundamental
computer science (informatics), in a broader context, as in Gruska (1993), have been
and still are rare.8
6 In this way Planck’s constant, introduced to explain black-body radiation, was found relevant also to
On the other hand, physics has only very slowly developed an understanding that infor-
mation is a physically important phenomenon, concept, and resource9. In addition, physics
practically ignored an important fact that within computer science in general, and in com-
plexity theory in particular, important new concepts, methods, results and insights have
been developed that can be useful also for extending our understanding of the physical
world. It is also worth noticing that one could hardly expect some deeper developments
in quantum computing before complexity theory made significant advances and found ways
to classify computational tasks in a deep way, and before such of its modern branches as
randomized complexity theory were developed.
and quantifies the uncertainty with which the values of the observable are given. (In the
case of spectral representation of A, ∆Aψ = 0 if and only if ψ is an eigenvector of A.)
Heisenberg’s principle claims that the following lower bound on the uncertainties holds
when two observables are measured independently.11
\[ \Delta A_\psi \cdot \Delta B_\psi \ge \tfrac{1}{2}\,\bigl|\langle AB - BA\rangle_\psi\bigr| \]
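The inequality can be verified numerically for small examples. The following is a minimal sketch, assuming that ∆Aψ is the standard deviation of the observable A in the state ψ, that is the square root of ⟨A²⟩ψ − ⟨A⟩ψ², and that ψ is a normalized vector. The function names and the choice of the Pauli matrices as observables are illustrative assumptions.

```python
import numpy as np

def expectation(operator, psi):
    """Expectation value <psi| operator |psi> for a normalized state psi."""
    return np.vdot(psi, operator @ psi)

def uncertainty(operator, psi):
    """Assumed definition: standard deviation of the observable in the state psi."""
    mean = expectation(operator, psi).real
    mean_square = expectation(operator @ operator, psi).real
    return float(np.sqrt(max(mean_square - mean ** 2, 0.0)))

def commutator_bound(a, b, psi):
    """Right-hand side of the inequality: |<AB - BA>_psi| / 2."""
    return 0.5 * abs(expectation(a @ b - b @ a, psi))

PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
PAULI_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
PAULI_Z = np.array([[1, 0], [0, -1]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)  # an eigenvector of PAULI_Z

product_of_uncertainties = uncertainty(PAULI_X, ket0) * uncertainty(PAULI_Y, ket0)
lower_bound = commutator_bound(PAULI_X, PAULI_Y, ket0)
```

For the eigenvector `ket0` of `PAULI_Z` the uncertainty of `PAULI_Z` vanishes, while the two non-commuting observables `PAULI_X` and `PAULI_Y` satisfy the bound.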
similar methodological impacts as mathematics. Its main scientific aims are to study the laws and limitations
of the information processing world—whatever it is.
9 There have been attempts to do so that can be traced back to Maxwell’s demon paradox, but these
main contribution was the development of the matrix mechanics theory and the discovery of the uncertainty
principle. In addition, he tried to formulate a theory of elementary particles such that all elementary particles
would come as solutions of one field equation. He also contributed to the theory of ferromagnetism, structure
of atoms, cosmic rays and field theory. Heisenberg received the 1932 Nobel prize for physics for his work on
the development of quantum theory.
11 For a derivation of Heisenberg’s uncertainty principle see, for example, Peres (1993).
Therefore, the only case where two observables A and B can be measured simultaneously
(independently) is precisely when they commute.
Many instances of Heisenberg’s principle are well known. One of them, and historically
the first one, considered already by Heisenberg, concerns the simultaneous measurement
of the position and the momentum of particles and it is called Heisenberg’s microscope.
Classically (see Figure 9.1a), you can measure both position and momentum. Quantum
mechanically, if you measure the position precisely (Figure 9.1b), then for various possible
values of the momentum only probabilities are given. Similarly, if you measure precisely
momentum first (Figure 9.1c), then there are many options for positions and for each of
them only the probability is known. More exactly, in this case it is assumed that two
conjugate observables, the position and momentum, are measured simultaneously, which
results in the fact that precision in the measurement of position is obtained at the expense
of precision of the measurement of momentum.
Figure 9.1: Measurement of the position and of the momentum of particles, panels (a), (b) and (c)
For example, one can use a stream of photons to measure an electron’s position and
momentum. Using an energetic photon, with short wavelength, one can get quite a good
idea where the electron is but one then has little idea about its momentum. (The problem
is that short-wavelength light implies a large-momentum kick to the electron.) On the other
hand, a soft collision, with long-wavelength light, provides a poor idea of the electron’s
position but a good idea of its momentum.
Another example of the uncertainty principle was already demonstrated in connection with
the two-slit experiment. Either one can detect through which slit an electron went or one can
detect the interference pattern. Detecting through which slit an electron went is a particle
measurement; recording the interference pattern is a wave measurement. One can do either of
them, but not both in the same experiment.
The following strong form of the uncertainty principle (Bennett 1998b) is also of interest for
quantum information processing: evolution of a quantum system remains deterministic only
if no information about it leaks out into the environment.
The uncertainty principle is a part of Nature and not a consequence of our technological
limitations.
In the following example, of importance for quantum key generation, an instance of
Heisenberg’s uncertainty principle is discussed in detail.
Example 9.1.2 (Uncertainty principle for polarized photons) Photons12 are electromagnetic
waves and their electric and magnetic fields are perpendicular to the direction
of propagation and also to each other. An important property of photons is polarization, which
refers to the bias of the electric field in the electromagnetic field of the photon. If the electric
field vector is always parallel to a fixed line we have linear polarization (see Figure 9.2).
If the electric field vector rotates about the direction of propagation forming a (right-) left-handed
screw, we have (right) left elliptic polarization. If the rotating electric field vector
inscribes a circle, we have (right) left circular polarization.
12 Photon, or light quantum, is a particle composing light and other forms of electromagnetic radiation.
Polarization13 is a property of photons with which one can demonstrate well what can and
cannot be done in the quantum world. In addition, polarized photons are key transmission
elements in quantum cryptography.
Polarization of photons. There is no way to determine exactly the polarization of a single
photon. However, for any angle θ there are θ-polarizers (“filters”) that always produce
θ-polarized photons from an incoming stream of photons. Moreover, they let θ-polarized
photons go through and θ1 -polarized photons get through with a probability of cos²(θ − θ1 ).
In other words, in order to create a photon whose electric field is oscillating in the required
plane one can use a polarizer whose polarization axis is set up at the desired angle. More
exactly, if the axis of the polarizer makes an angle θ with the plane of the electric field of the
photon entering the polarizer, then there is a probability cos² θ that the photon will emerge
with its polarization set at the desired angle and the remaining probability, sin² θ, that it will
not be observed.
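The behaviour of a polarizer described above can be imitated by a simple random experiment. The sketch below is an illustration under stated assumptions: angles are given in radians, a photon is represented only by its polarization angle, and the function names are hypothetical.

```python
import math
import random

def transmission_probability(theta_polarizer, theta_photon):
    """Probability cos^2(theta - theta_1) that a theta_1-polarized photon passes a theta-polarizer."""
    return math.cos(theta_polarizer - theta_photon) ** 2

def send_through_polarizer(theta_polarizer, theta_photon, rng=random):
    """Return the new polarization angle if the photon passes, or None if it is not observed."""
    if rng.random() < transmission_probability(theta_polarizer, theta_photon):
        return theta_polarizer  # the photon emerges polarized along the polarizer axis
    return None

def passed_fraction(theta_polarizer, theta_photon, trials, rng):
    """Fraction of identically prepared photons that pass the polarizer."""
    passed = sum(
        send_through_polarizer(theta_polarizer, theta_photon, rng) is not None
        for _ in range(trials)
    )
    return passed / trials

# Horizontally polarized photons sent through a polarizer set at 45 degrees.
fraction = passed_fraction(math.pi / 4, 0.0, 10_000, random.Random(0))
```

With the polarizer at 45° to the incoming polarization about half of the photons pass, and each photon that passes is afterwards polarized along the axis of the polarizer, so its original polarization cannot be recovered from it.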
Photons whose electric fields oscillate in a plane at either 0◦ or 90◦ to some reference
line are usually called rectilinearly polarized and those whose electric field oscillates in
a plane at 45◦ or 135◦ diagonally polarized. Polarizers that produce only vertically or
horizontally polarized photons are depicted in Figure 9.3a,b.
Generation of orthogonally polarized photons. For any two orthogonal polariza-
tions (that differ by 90◦ ) there are generators that produce photons of two given orthogonal
polarizations. For example, a calcite crystal, properly oriented, can do the job. Figure 9.3c
shows a calcite crystal that causes θ-polarized photons to be either horizontally polarized,
with probability cos2 θ, or vertically polarized, with probability sin2 θ.
Figure 9.3: Photon polarizers and measuring devices, panels (a), (b), (c) and (d)
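with two eigenvalues 1 and −1 and two eigenstates, |0⟩ and |1⟩. The same is true if in the
diagonal basis we have the observable Ad represented, in that basis, again by the matrix 9.9.
(Observe that the matrices Ar and Ad are the same but they refer to a different basis!) In
order to find out whether those two observables commute we have to express both in the same
basis. The diagonal basis can be rotated into the rectilinear by the rotation
\[ \begin{pmatrix} \cos(-\frac{\pi}{4}) & -\sin(-\frac{\pi}{4}) \\ \sin(-\frac{\pi}{4}) & \cos(-\frac{\pi}{4}) \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}. \]
Since
\[ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]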
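The matrix computation above can be reproduced numerically. The following is a minimal sketch, assuming that the observable Ad is brought into the rectilinear basis by conjugating the diagonal matrix with the rotation given above; the direction of the conjugation is an assumption of this sketch (the opposite convention only changes the sign of the result) and the variable names are hypothetical.

```python
import numpy as np

# Matrix representing A_r in the rectilinear basis (and A_d in the diagonal basis).
a_r = np.array([[1.0, 0.0], [0.0, -1.0]])

# Rotation by -pi/4 relating the diagonal and the rectilinear basis.
angle = -np.pi / 4
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle), np.cos(angle)]])

# A_d expressed in the rectilinear basis (assumed convention: R^T A R).
a_d_in_rectilinear = rotation.T @ a_r @ rotation

product_rd = a_r @ a_d_in_rectilinear
product_dr = a_d_in_rectilinear @ a_r
commutator = product_rd - product_dr
observables_commute = bool(np.allclose(commutator, 0.0))
```

Under this assumption `a_d_in_rectilinear` equals the matrix with rows (0, 1) and (1, 0), the two products differ, and `observables_commute` is false, so the rectilinear and the diagonal polarization cannot both be measured precisely.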
A. Einstein
Quantum theory, like other theories of Nature, is based on observations and experimental
results. Its mathematical abstraction culminated in a theory developed within
axiomatically defined Hilbert spaces. It is a theory that aims to provide mathematical
abstractions of physical concepts, observations and experimental results, and physical inter-
pretation of mathematical concepts, models, methods and results.
The first part of the aim was achieved well enough. However, from the earliest days
of quantum mechanics it was not clear what is the real physical interpretation of quantum
mechanical mathematical concepts. The idea of a unique and in the limit infinitely knowable
reality of the classical world appeared to be a fiction, but no clear reality for the quantum
world has emerged yet. It is also clear that to many natural language concepts which are
used to describe classical world phenomena one needs to assign different meanings when
using them for describing quantum phenomena. The problem with finding a “real world”
interpretation of quantum theory concepts is so severe that the quantum world is seen by
some as only an abstract concept.
It is often said that mathematical formalism of quantum theory provides only concepts,
results and methods that are superb for developing algorithms to compute probabilities of
experimental results. By Peres (1993), “in a strict sense quantum theory is a set of rules
allowing the computation of probabilities for the outcomes of tests which follow specific
preparations”.
For many quantum physicists this seems to be all they need, and they can be quite happy
with the current state of the theory. They have not encountered any difficulties with the
Copenhagen interpretation (discussed in Section 9.1.7).
On the other hand, it is not clear how important for quantum information processing
is the fact that the current state of quantum theory is clearly not fully satisfactory. Could
it happen that some other interpretations of quantum theory would significantly change
the merit of the current key results concerning quantum information processing, especially
concerning the extraordinary computational power of quantum computers (and/or their
limitations)?
There are several reasons why current quantum theory is considered by some schools
of thought in physics and in the philosophy of science not only as being far from complete
but actually as a theory no one fully understands, and a theory that is not able to attach
in a fully satisfactory way a definite physical meaning to its mathematical concepts.14 For
example
1. Quantum theory does not seem (to all) to provide a clear notion of what the reality
of quantum concepts could be. It provides no fully satisfactory and contradiction-free
physical interpretation of theoretical concepts. (However, there can be the following,
well-founded objections to this strong statement: What is a reality? Is there a (the)
14 On the other hand, those standing firmly on the Copenhagen interpretation consider many of these
problems as outside quantum mechanics. Moreover, new approaches to such a basic and controversial problem
as quantum measurement, see Bush et al. (1997), bring new ideas on how to deal with some of the open
problems of quantum mechanics.
Of course, there are attempts to assign physical reality to such concepts as quantum
state, quantum systems and quantum measurement. However, they lead to hard-to-accept
mysteries and so-called paradoxes.
For that reason a quantum state is seen by some (see Peres, 1993), as having no direct
physical meaning and by others (for example von Neumann) as being a complete description
of reality.
At the level of the mathematical formalism of Hilbert spaces there are no principal prob-
lems with a pragmatic understanding of such concepts as state, observable, entanglement
and measurement. However, the attempts to derive these theoretical concepts and princi-
ples only from the physical reality and to assign them physical meaning have not worked
well. To derive such models as Hilbert space, physical abstractions and reasoning seem to
be hardly sufficient. One needs to use principles of abstraction, logic and aesthetic mathe-
matical thinking to derive such models and to utilize by that the richness of mathematical
concepts and experiences (for example, to come up with the requirement of completeness
for Hilbert spaces).
The existing difficulties with interpretations of quantum theory concepts are also often
seen at the epistemology versus ontology level. There are views that existing quantum theory is
concerned only with our knowledge of reality, or, in other words, that it is directed primarily
towards epistemology (the studies that focus on questions of how to obtain our knowledge and
how to make use of it) and much less towards ontology.
The concept of a classical measurement or observation is one of the oldest on which science
is based. By a classical measurement, observation, or test, we acquire knowledge about the
15 The term “non-local interaction” is by itself intriguing. One way to understand it is that a local
interaction is one that either exhibits a direct contact, or at least employs an intermediary that is in direct
contact. The second part of this interpretation allows local interactions between objects astronomically
far apart—see for example gravity, which is considered as a local interaction because it is assumed to be
mediated by hypothetical quanta, gravitons, which travel between gravitating objects. It is usually assumed
that local interactions meet the following criteria: (1) They are mediated by other entities, such as particles
or fields; (2) they do not propagate faster than light; (3) their strength drops off with the distance. It is
known that all forces of the universe create local interactions. In such a case it is natural to ask where in
the physics we could find something that would allow non-local interactions. It seems that the measurement
postulate of quantum mechanics, with a force-free collapse of states, provides such a loophole on which the
existence of non-local interactions could be justified. The distaste in a large part of the scientific community for
any theory assuming non-local interactions is based on the assumption that it would contradict Einstein’s
special theory of relativity, which says that nothing can travel faster than the speed of light.
reality. During the measurement there is an interaction between the measuring devices and
the measured objects.
Quantum measurements are, on the other hand, very different and in many respects rev-
olutionary. As already discussed in Section 1.4, quantum measurement is perhaps the most
controversial issue of quantum theory. Peculiarities and controversial aspects of quantum
measurement are numerous and, in spite of the fact that several books have been written on
this subject, no really essential progress seemed to be made until the last two decades. Let us
now summarize some of the issues and peculiarities of the quantum measurement problem.
1. In the term quantum measurement the noun ‘measurement’ has a meaning very dif-
ferent from the one used in the classical world. As already mentioned, before quantum
mechanics it was taken for granted that when we measure something, we are gaining knowl-
edge of a pre-existing state—of an independent fact about the world. However, quantum
mechanics, at least its standard interpretation, says otherwise: some things are not deter-
mined except when they are measured—it is only by being measured that they take on
specific values. If we therefore attribute to the word “measurement” its ordinary meaning,
i.e. the acquisition of knowledge about some pre-existing objective reality, we reach
various contradictions.
2. Some classical “measurement tasks” are not appropriate in the quantum setting
because there they are not well defined. For example, in some cases it is not meaningful to
pose the question “what is the value of the property P of the object O”, but it is meaningful
to ask whether a particular x is the value of P for the object O.
Example 9.1.4 (Peres, 1993) There is a way to produce photons with various polarizations,
but there is no way to measure polarization of a particular photon.16 The question “What is
the polarization of that photon?” cannot be answered and is considered in quantum physics as
having no meaning. The legitimate question that can be answered experimentally is whether
or not a particular photon has a specified polarization. As a consequence, if Alice prepares a
sequence of photons and sends them to Bob without disclosing their polarization, then there
is no instrument whatsoever to sort them into bins for polarization 0–30◦ , 30–60◦ , . . . in
the way that agrees with the polarization as produced by Alice. This fact is actually used in
Section 6.2 to make a secure system for cryptographical key generation.
3. It is not always possible to obtain by measurements full information about the un-
known state of a quantum object. In addition, the unknown states of quantum objects are
considered as having no definite value except when they are measured.17
4. There have been attempts to consider quantum measurement as the very basic concept
of quantum physics (see Peres, 1993), and to use it to derive from it such concepts as
quantum state. By Peres, a measurement consists of a preparation, a test and a selection,
where preparations and tests are even more fundamental concepts. A quantum state is then
defined as an equivalence class of preparations.18
5. Quantum measurement can also be seen as an irreversible addition to otherwise
fundamentally reversible quantum evolution.
6. Measurements are considered as the key tools of science to get information that is then
used to abstract theories and to get knowledge. One can say that natural sciences are based
on observation of Nature. It is believed by most scientists that measurements in general
16 However, we can estimate the polarization with certain fidelity.
17 A different situation is, for example, when it is known how a quantum object was produced, or when
there is a possibility of producing an unlimited number of copies of the same object.
18 It is a good and fundamental question what kind of equivalence to consider for preparations.
help to get knowledge about objective reality that is not fully known to us. However, the
mysteries and paradoxes quantum theory runs into make some believe that quantum theory
is incompatible with the claim that quantum measurements can discover some unknown but
pre-existing reality. By Peres (1993), “We have no satisfactory reason for ascribing objective
existence to physical quantities as distinguished from the numbers obtained when we make
measurements we correlate with them.“
7. The problem of measurement also concerns the basic dichotomy between the classical
and the quantum world. Indeed, from a broader point of view the measuring device,
or at least a part of it, can also be considered as a quantum system. This fictitious process of
shifting the microscopic level can be repeated. Some see our consciousness as the last stage
of this chain.
8. Several types of measurements are of special importance. For example:
• Repeatable measurements. These are measurements M such that if M is applied
to the result x obtained by M, we get x again. Projections P (with the property
$P = P^2$) are examples of repeatable measurements. As discussed in Peres (1993), not
all meaningful measurements are repeatable, and repeatable measurements are more an
exception than the rule. In spite of that the term “measurement” is generally used for
repeatable measurements. The non-repeatable measurements are also of importance
because they may provide more information than ideal repeatable ones.
• Maximal or complete measurements. These are measurements that produce the
same number of outcomes as the dimension of the corresponding Hilbert space. Each
maximal test is uniquely determined by an orthonormal basis.
• YES-NO measurements. They are specified by a subspace and its orthogonal
complement. The projection into one of the subspaces is interpreted as a YES and into
the other as a NO answer. A special case is a primitive measurement, in which
one of the subspaces is spanned by just one vector.
• POV measurements. Positive operator valued measurements (see Bush et al. 1997,
Peres, 1993, and Section 9.2.8) are a different form of measurement. For example, the
number of outcomes they produce can be larger than the dimension of the underlying
Hilbert space. Formally, a POV measurement is given by a set $\{O_i\}_{i=1}^{k}$ of positive
Hermitian operators such that $\sum_{i=1}^{k} O_i = I$.
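The defining condition of a POV measurement can be checked numerically on a small case. The sketch below is only an illustration and is not taken from the text: it assumes a qubit (dimension 2) and the three real "trine" directions at angles 0, 60 and 120 degrees, with each operator chosen as (2/3)|ψ⟩⟨ψ|. All names (`outer`, `povm`, `outcome_probabilities`) are hypothetical helpers.

```python
import math

def outer(v):
    """Return the 2x2 matrix |v><v| for a real 2-vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def scale(c, m):
    """Multiply every entry of the matrix m by the scalar c."""
    return [[c * x for x in row] for row in m]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

# Assumed example: three directions in a 2-dimensional real space.
angles = [k * math.pi / 3 for k in range(3)]
povm = [scale(2 / 3, outer([math.cos(t), math.sin(t)])) for t in angles]

# Completeness: the operators must sum to the identity.
total = [[0.0, 0.0], [0.0, 0.0]]
for op in povm:
    total = add(total, op)

def outcome_probabilities(state):
    """p_i = <state|O_i|state> for a normalized real 2-vector."""
    return [
        sum(state[r] * op[r][c] * state[c] for r in range(2) for c in range(2))
        for op in povm
    ]

probs = outcome_probabilities([1.0, 0.0])
```

Here the measurement has three outcomes although the space is only two-dimensional, which is exactly the situation described above; each operator is positive because it is a non-negative multiple of a projection.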
Design of paradoxes19 has been an old methodology in physics to point out, in an irresistible
way, some inconsistency of physical theories. Three such paradoxes, which have played an
important role either in clarifying the role of information in physics or in the development
of quantum mechanics itself, will now be briefly discussed.
Maxwell’s demon
It seems that it was in connection with the attempts to explain Maxwell’s20 demon paradox,
from 1867, that information processing considerations started to play a significant role in
physics for the first time.
[Figure: a chamber with two compartments, A and B]
19 These paradoxes played and still play an important role even though not all physicists see all “quantum
paradoxes” as real paradoxes. For example, Peres’ (1993) position is as follows: “There is a temptation to
believe that each particle has a wave function which is its objective property. ... Unfortunately, there is
no experimental evidence whatsoever to support this naive belief. On the contrary, if this view is taken
seriously, it leads to many bizarre consequences, called ‘quantum paradoxes’. These so-called paradoxes
originate solely from an incorrect interpretation of quantum theory. The latter is thoroughly pragmatic and,
when carefully used, never yields two contradicting answers to well-posed questions. It is only the misuse of
quantum concepts, guided by pseudorealistic philosophy, which leads to those ‘paradoxical results’.”
20 James Clerk Maxwell (1831–1879), a Scottish physicist. In 1871 he became the first professor of
experimental physics at Cambridge. On the basis of Faraday’s laws for electricity and magnetism he developed a
uniform mathematical theory of electricity and magnetism. Maxwell discovered equations, nowadays bearing
his name, to describe phenomena of classical electromagnetism.
Maxwell’s demon is a creature that operates a shutter to open and to close a trapdoor
between two compartments A and B of a completely isolated chamber containing a gas
of molecules with a random distribution of particles and velocities. The demon pursues the
subversive policy of only opening the door when a fast molecule approaches it from the
right, or a slow one from the left. Hence A cools down and B heats up. Working in this
way for a while, the demon separates hot molecules from cold ones, establishes a temperature
difference between the two compartments and decreases the entropy of the system without doing
any work, apparently violating the second law of thermodynamics.
Maxwell’s demon paradox created enormous controversy among physicists.21 The first
explanation was offered by Szilard in 1929 and it was based on the belief that the measurements
that the demon has to perform, on the location and speed of molecules, increase its entropy
and that this compensates for the decrease of the entropy in the system.
A real explanation came only after a deeper insight into the thermodynamic cost of
information processing was obtained by Landauer and Bennett (the latter first showed that
the demon’s measurements can be performed reversibly and therefore without an increase of
entropy). It is based on the modern knowledge that it is not information acquisition but information
erasure that requires energy. In a simplified form the explanation goes as follows.
The demon has to collect and store information in his memory about the locations and
speeds of the molecules. Since his memory is finite he has to erase information from his
memory from time to time, and it is during this process that the entropy increases as
required by the second law of thermodynamics.
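The cost of erasure can be made quantitative. The text does not state a formula; the sketch below assumes the standard form of Landauer's bound, namely that erasing one bit at temperature T dissipates at least k_B T ln 2 of energy. The function name and the choice of 300 K are illustrative assumptions.

```python
import math

# Boltzmann constant in J/K (exact value in the SI since 2019).
BOLTZMANN_J_PER_K = 1.380649e-23

def landauer_limit_joules(temperature_kelvin, bits=1):
    """Minimum energy dissipated when erasing `bits` bits at the given
    temperature, assuming Landauer's bound k_B * T * ln(2) per bit."""
    return bits * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)

# Erasing a single bit of the demon's memory at room temperature (assumed 300 K).
energy_one_bit = landauer_limit_joules(300.0)
```

At room temperature this is of the order of 10^-21 joules per bit, far below the dissipation of ordinary hardware, which is why the effect matters in principle rather than in everyday engineering.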
Schrödinger’s cat
One of the most puzzling questions about our physical world, which is basically quantum
mechanical, is why there is no quantum superposition on “macroscopic scale objects” (with the
exception of such phenomena as superconductivity). Or does it actually exist and we are
just not able to observe it? In addition, why is it that in the case of measurement quantum
evolution does not continue as before? We can consider a given quantum system together
with a measuring device, and such a system should develop according to a unitary evolution.
There is therefore an apparent contradiction here, which was made very vivid through the famous
Gedanken experiment of Schrödinger’s cat. (The experiment has numerous formulations and
we consider one of them.)
Let us assume we have a completely isolated chamber with four key elements: an observer,
a cat with a cat-food pot, a cup of poison, and an apparatus controlled by a beam of photons
from a photon source, also inside the chamber. The beam of photons is directed, as in a
Mach-Zehnder interferometer (see Figure 2.5), to a half-silvered mirror. When a photon
gets through the mirror, nothing particular happens. The cat keeps having a good time. If
the photon is reflected at the mirror it triggers a photo-cell (as a measuring device) and this
causes poison to leak from the cup to the cat-pot and the cat dies immediately.
From the point of view of (an unfortunate) observer in the chamber there are two pos-
sibilities. Either the measuring device, the photo-cell, does not record the photon and the
(lucky) cat is alive, or it does and the (poor) cat is dead. There are only these two pos-
sibilities and one of them has to happen. For an internal observer there are therefore two
options concerning the cat: “alive or dead”, and both have the same probability.
However, the situation looks different to an external observer. He “sees” the whole
system in the chamber as a single quantum system in which only unitary evolutions occur,
no measurement. From his point of view the photon is in a superposition of two states and
the cat gets into the state $\frac{1}{\sqrt{2}}(|\text{alive}\rangle|{\uparrow}\rangle + |\text{dead}\rangle|{\downarrow}\rangle)$, that is, she is both alive and dead at
the same time but neither of the two, with the same probability. However, this contradicts our
experience. Cats we see are either alive or dead.
[Figure: labels “E. T.”, “observer” and “cat”]
21 For a more detailed treatment of the Maxwell paradox and its implications concerning entropy and
information see Leff and Rex (1990).
Where does Schrödinger’s paradox lead us, and how should we deal with it? Some problems
are easy to identify: there is no clear definition of what a measurement is. There seems to
be no way to draw a line between those measurements that are “possible” and those that are
“impossible”. Some have even suggested that the problem with Schrödinger’s cat lies in the fact
that we have a conscious observer both inside and outside the chamber and that perhaps
the laws of quantum physics do not apply to consciousness.
Schrödinger’s Gedanken experiment led to the development of the so-called many-world
interpretation of quantum mechanics (see Everett, 1957, 1977, and Section 9.1.4).22
Recently, an understanding of Schrödinger’s cat mystery has developed that uses
decoherence as the key element and goes briefly as follows. In order to specify fully a quantum
state of a cat one needs to specify quantum states of all its components: atoms, electrons, ...
There is a huge number of quantum states that correspond to alive cats and a huge number
of states that correspond to dead cats. All these states constantly evolve due to the inherent
interaction of their elements and interactions with the environment. The quantum
superposition “both alive and dead but neither of the two” can exist, but only for an unnoticeably tiny
fraction of time, because it is very unstable; it then evolves, due to decoherence, into a
mixed state: alive with probability one-half and dead with the same probability. In short,
Schrödinger’s cat does not exist. Or rather it has an immeasurably short lifetime before it evolves
into a classical or Newtonian cat (Lindley, 1996).
22 The many-world interpretation has recently been used by Deutsch (1997) when considering quantum
physics as one of four main strands of explanation that he considers as ones that may constitute the first
yet-to-be-developed “Theory of Everything”.
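The difference between the superposition and the mixed state it decays into can be seen on a toy model. The sketch below is a deliberate simplification and an assumption of this illustration: the cat is reduced to a two-level system with basis states "alive" and "dead" (the text stresses that a real cat has a huge number of such states), and decoherence is idealized as the complete removal of off-diagonal terms of the density matrix. The helper names are hypothetical.

```python
import math

def density_matrix(amplitudes):
    """rho = |psi><psi| for a list of complex amplitudes."""
    return [[a * b.conjugate() for b in amplitudes] for a in amplitudes]

def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, smaller for a mixed state."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n)).real

def dephase(rho):
    """Idealized complete decoherence: keep the diagonal, drop coherences."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0j for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)
cat_pure = density_matrix([complex(s), complex(s)])  # (|alive> + |dead>)/sqrt(2)
cat_mixed = dephase(cat_pure)                        # alive or dead, 1/2 each

purity_before = purity(cat_pure)
purity_after = purity(cat_mixed)
```

Both matrices give probability one-half for "alive" and one-half for "dead" on the diagonal; what decoherence destroys is the off-diagonal part, and the purity drops from 1 to 1/2 accordingly.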
EPR measurements
Einstein was a strong opponent of the key view of the Copenhagen interpretation of quantum
physics, namely that quantum properties are not determined (or that they do not even have
a meaning) until they are measured, and insisted that unmeasured quantities must exist in
some state even though we might not know what the state is.
Einstein believed that quantum mechanics is incomplete, and that there must be a deeper
and more detailed theory that would include all the necessary information to allow us to
make full and certain predictions of the outcomes of measurements, not only the statements
of possibilities and probabilities. (On the other hand, Bohr claimed that looking for such a
theory is a misguided venture, motivated by our romantic thoughts of what physics should
be rather than by pragmatic understanding of what physics really is.)
In order to discredit views of physics resulting from the Copenhagen interpretation,
Albert Einstein,23 Boris Podolsky and Nathan Rosen (EPR), in their famous 1935 paper “Can
quantum-mechanical description of physical reality be considered complete?”, developed an
experimental set-up that helped to illustrate what strange consequences follow from quantum
theory.
Their basic reasoning goes as follows. Let us imagine two particles, whose total momen-
tum is constant, flying apart at the same speed. Once they are far apart you measure the
position (or momentum) of the first particle and by that you immediately know the position
(momentum) of the second particle.
However, Einstein and his colleagues drew two important conclusions from that.
1. By measuring precisely the position of the first particle we get precisely the position
of the second particle. Since no measurement was involved on the second particle
we can now measure its momentum precisely. However, this contradicts Heisenberg’s
uncertainty principle.
2. If, without disturbing in any way a system (second particle), we can determine with
certainty the value of a physical quantity (position or momentum), there has to be an
element of physical reality that corresponds to this physical quantity. Therefore, both
position and momentum of particles have to be elements of physical reality.
Bohr came up with a surprising, but actually deep, explanation of why Einstein’s reasoning, by
which he derived a contradiction with Heisenberg’s uncertainty principle, is wrong.
According to Bohr one is not allowed to combine into one consideration outcomes of two
incompatible measurements (of the position of one particle and of the momentum of the
second particle).
23 Albert Einstein (1879–1955), considered as one of the best physicists of all time, an American theoretical
physicist of German origin. He wrote the basic papers on the special and general theory of relativity, showed
the equality of gravitation and inertia, contributed to the development of quantum theory postulating light
quanta. He received the 1921 Nobel prize for physics.
An important modification of the basic EPR Gedanken experiment, due to Bohm, shows
the problem in an even clearer way. His basic Gedanken experiment dealt with two particles
that fly apart in such a way that their spins add up to zero or, in a more modern setting,
that they form a pair of entangled qubits, in the total state
$\frac{1}{\sqrt{2}}(|0\rangle|1\rangle + |1\rangle|0\rangle)$
and are spatially separated. In such a case a measurement of the spin of one of the qubits
determines (observes) uniquely the spin of the other qubit without the second one being
disturbed by an indirect observation. Einstein called this phenomenon a “spooky-action-
at-a-distance” because measurement in one place seems to have an instantaneous effect at
the other place. The term “spooky” indicates that the influence was implied rather than
directly seen.
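The perfect correlation in the state above can be illustrated by sampling joint measurement outcomes. The sketch below is a classical simulation under stated assumptions: both qubits are measured in the standard basis, outcome probabilities are the squared amplitudes, and the fixed seed and sample size are arbitrary choices. It shows only the anti-correlation of outcomes in this one basis; it says nothing about other measurement bases, which is where the genuinely non-classical features appear.

```python
import math
import random

# Amplitudes of (|0>|1> + |1>|0>)/sqrt(2) in the basis 00, 01, 10, 11.
AMPLITUDES = {
    "00": 0.0,
    "01": 1 / math.sqrt(2),
    "10": 1 / math.sqrt(2),
    "11": 0.0,
}

def measure_pair(rng):
    """Sample one joint outcome with probabilities |amplitude|^2."""
    outcomes = list(AMPLITUDES)
    weights = [abs(AMPLITUDES[o]) ** 2 for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [measure_pair(rng) for _ in range(1000)]

# Knowing the first bit always determines the second one.
all_anticorrelated = all(s[0] != s[1] for s in samples)
fraction_first_is_zero = sum(s[0] == "0" for s in samples) / len(samples)
```

Each individual outcome is random (the first qubit is 0 in roughly half of the runs), yet the two results are never equal.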
Einstein and his colleagues concluded from their Gedanken experiment that the explanations
of real phenomena which current quantum physics offers (namely, its Copenhagen
interpretation) are not complete, and suggested a program to fix this, that is, what a proper
fundamental theory of Nature should look like. The EPR program asked for completeness
(“In a complete theory there is an element corresponding to each element of reality.”),
locality (“The real factual situation of system A is independent of what is done with system
B, which is spatially separated from the former.”) and reality (“If, without in any way
disturbing a system, we can predict with certainty (i.e., with probability equal to unity) the
value of a physical quantity, then there exists an element of physical reality corresponding
to this physical quantity.”).
In particular, they suggested that the wave functions (quantum states) do not provide
a complete description of the physical reality, and therefore there have to exist additional,
so-called “hidden” variables, whose objective values would unambiguously determine the
result of any quantum experiment.
explanation of how this step is performed. It only says what its outcomes are. There is
nothing in quantum theory that explains or determines the exact mechanism of quantum
measurement and the resulting state collapse. In particular, both measurements and state
collapse are presented in such a way that they would not require involvement of forces of
any kind.24
Fortunately, the problem of the quantum measurement paradox seems to be on the way to
being solved, and by means of quantum mechanics itself. Two concepts play the key role:
decoherence and theoretical insights into the behaviour of complex systems. They allow us
to understand how large assemblies of weird quantum objects can behave in a reasonable
way and how Nature gets around seemingly incomprehensible quantum measurement
phenomena.
The trouble with quantum mechanics, and the reason for its various “interpretations”,
is that despite its decades of phenomenal success, it is still not understood sufficiently well
what “quantum things” mean. We are not able to see “inside them”. Many of the best
physicists have worked on this problem, but not without reaching conflicting conclusions.
There are two reasons for this. First, we are not able to visualize quantum objects and
phenomena and to translate mathematical formulas and explanations provided by quantum
theory into recognizable pictures and understandable words. Secondly, quantum theory
actually offers us certain ideas of how the world works. However, they do not conform to
our prior expectations of how we should like, or think, the world to be.
Interpretations are attempts to get around these basic difficulties. The goal is to interpret
and explain quantum concepts and phenomena in such a way that we, or at least
some of us, can say “I got it”.
Finding and analysing interpretations of quantum theory is one of the main tasks of
modern science and also of the philosophy of science. The fact that this task seems to be
still far from being resolved is something we have to learn to live with, and not to blame
quantum physics for it. As Lindley says (1996), “if science sometimes provides explanations
without giving us what we would regard as an understanding, the deficiency belongs to us,
not to science”.
never had problems to determine whether a prospective measurement is really a measurement, and he just
did not bother with the problem of how measurements can be made in general.
1930, lacked at that time mathematical rigor, especially because of the use of the delta
functions, which were at that time mathematically unacceptable. This has since changed,
and therefore Dirac’s formalism represents an alternative for quantum mechanics. (For a
more detailed, but concise treatment of the formalisms of quantum mechanics see Jammer
(1974).)
The interpretation of quantum mechanics formalisms is one of the deepest and most
difficult problems of current science. Not only do physicists disagree on which formalism is
correct; philosophers of science disagree even on what it means to have an
interpretation.
In order to introduce problems of interpretations of quantum mechanics let us take a
widespread position that a physical theory is a partially interpreted formal system (Jammer,
1974). A physical theory is seen as having two components: an abstract formalism F and
the rules of correspondence R (or an “interpretation” of F ). F is a deductive logical calculus
without empirical meaning and contains, in addition to logical constructs, also nonlogical
terms (such as “state”). The rules of correspondence, R, make F physically meaningful by
assigning physical meaning to some of the nonlogical terms of F . Those nonlogical terms
that are not directly interpreted by R are called theoretical terms. They are only contextually
defined through the role they play within the logical structure of F . On one side, theoretical
terms can be in principle removed from a theory, but on the other side, they are an instrument
for new discoveries. Let FR denote F accompanied by R.
One school of thought takes FR as a physical theory, a mathematical system suitable
to describe as completely, concisely and precisely as possible our experimentally observable
knowledge of the “physical reality”. Another school of thought does not consider such an
approach sufficient and requires FR to be supplemented with some unifying principle which
“establishes some internal coherence among the descriptive features of FR and endorse it
thereby with explanatory and predictive power”. Such an interpretation of FR is
one of the most controversial problems of quantum mechanics and philosophy of science.
In addition to the rules of correspondence and the unifying principle, the third way to
provide an interpretation is to construct a model of the theory.
Main interpretations
There are several ways of categorizing the existing interpretations. One criterion is whether
a given interpretation is experimentally distinguishable from the orthodox one, or tries to
go beyond it.
If an interpretation is not experimentally distinguishable from the orthodox one, then
its benefit can be mainly on an aesthetic or logical or methodological level, as a change of
a research paradigm. It is natural that such an interpretation cannot easily supersede
the orthodox one unless it is much simpler, mainly for sociological reasons. The scientific
community is in general very conservative and prefers to stick with a theory until the facts
showing its obsolescence are much too strong.
The main interpretations of linear quantum mechanics, in which evolution is described by
the linear Schrödinger equation, are the following ones: the Copenhagen interpretation,25 the
many-worlds interpretation (Everett, 1957, 1977), and the hidden variable interpretations, Bohm
(1952), or its more modern version, an ontological interpretation of Bohm and Hiley (1993).
25 According to Peres (1993), there are many incompatible versions of the Copenhagen interpretation. In his words:
“There is no real conflict between Stapp (1972) and Ballentine (1970), two important expositions of the
Copenhagen interpretation, except that one calls Copenhagen interpretation what the other considers as
the exact opposite of the Copenhagen interpretation.”
For a recent analysis of these and other interpretations of linear quantum mechanics see Bub
(1997). Interestingly enough, probably due to the fact that recent quantum experiments
display ever more extreme forms of non-classical behaviour, interest in interpretations other
than the Copenhagen one seems to grow, but no real alternative has yet emerged.
There are also various models of nonlinear quantum mechanics in which evolution is
described by a nonlinear Schrödinger equation, obtained from the linear Schrödinger equation
by introducing nonlinear modifications. Examples are the models due to de Broglie (1956) and
Weinberg (1989). No experiments have yet supported the existence of nonlinear evolutions, but
some physicists do not consider their existence as completely ruled out.26
The Copenhagen interpretation is based on the following two principles.
1. A quantum system that has not yet been measured exists in a state of genuine inde-
terminacy. It makes no sense to say (and may even lead to contradictions) that it is
in a specific but unknown state.
2. The act of measurement forces the system to adopt one of the possible classical values,
with a probability that can be calculated from the appropriate quantum state of the
system and its measurement.
The first principle of the Copenhagen interpretation actually denies the existence of an
independent and unique objective reality on which all observers can agree, a fundamental
concept on which natural sciences are based. The second principle relies on a magic of
measurement that no one has been able to explain yet; no one has been able to understand why a
measurement could make indeterminacy go away.
The Copenhagen interpretation of quantum mechanics is not so much philosophy as an
“act of intellectual self-discipline”. It does not make quantum phenomena, such as the
two-slit experiment, any easier to understand; it just tells us that we should not hope to
understand them in the way we should like to. It resolves certain difficulties only by declaring
them out of bounds.
The Copenhagen interpretation implies that quantum physical properties are not, as in
the classical world, intrinsic and unchangeable characteristics of the things we are measuring
but instead arise, in the quantum world, as a result of the act of measurement and
cannot be ascribed any useful or consistent meaning before a measurement is made.
The Copenhagen interpretation divides the world into physical objects of two types:
things we measure (or might in principle measure) and things we measure with. Objects to
be measured are quantum objects and they live in uncertain, indefinite states until they are
measured. Objects we measure with are classical and always in a definite state.
There seem to be two inconsistencies in the Copenhagen interpretation, if looked at from
outside (even if it is consistent under certain internal assumptions).
1. Quantum objects can be assigned a definite value of a property only when a mea-
surement is made, but to make a measurement we need some sort of non-quantum—
classical—devices. At the same time quantum mechanics aspires to be the fundamen-
tal theory of physics and therefore also its measuring devices should be, in principle,
quantum and should adhere to all laws of quantum mechanics.
26 Nonlinear models may have “weird” properties and exhibit “unphysical effects”. As mentioned by Peres
(1993), it may happen that a state u(0) evolves after t steps into u(t) and the state v(0) into v(t) but
u(0) + v(0) does not evolve into u(t) + v(t). In addition, if this is the case, and other postulates of quantum
mechanics are not changed, evolution in such a system can violate the second law of thermodynamics.
From the practical point of view, the Copenhagen interpretation works satisfactorily.
From the philosophical or fundamental point of view many physicists consider the Copen-
hagen interpretation as unsatisfactory—in spite of its phenomenal practical success. It is
therefore natural that various other interpretations of quantum mechanics have been worked
out.
The hidden variable interpretation of quantum mechanics, due to Bohm (1952), tries
to formalize ideas of Einstein and to deny that indeterminacy in quantum mechanics
is an unavoidable reality. Bohm tried to restore for quantum mechanics the underlying reality,
even one that cannot be seen and detected. In Bohm’s interpretation, measurements
seem to follow probability laws only because we are ignorant of certain hidden properties of
the things we are measuring. Were we to know the values of the hidden variables, we could
say precisely what outcomes a measurement would produce.
Bohm has worked out a new deterministic formulation of quantum mechanics that is
mathematically the same as the standard theory, but which is rearranged so that everything
looks classical except one strange quantum potential, in which all nonclassical aspects
of quantum theory end up.
Bohm gave specific and almost classical meaning to the “wave” and “particle” halves of
quantum mechanics. The role of waves is to guide particles to create superpositions. For
example, in the two-slit experiment waves make particles go through one or another slit in
order to create the superposition pattern, and it is the quantum potential that gives rise to
guide waves. In Bohm’s interpretation the classical world is a part of the quantum world
and some paradoxes, such as Schrödinger’s cat paradox, do not exist.
Bohm built quantum mechanics on classical foundations. Everything is deterministic.
Every particle follows a predictable path and it is only because we do not have a precise
knowledge of the initial conditions that we need to use probabilities to describe the values
measurements will end up with. In the case of photons it is the precise but unknown initial
position and the initial momentum of each photon that form the “hidden variables”.
Several problems arise with Bohm’s interpretation. It can be shown that guide waves
cannot use force to control particles. To get around this problem Bohm and Hiley recently
suggested that guide waves carry (active) information about where particles should go and
through this information they guide particles. However, this implies that both quantum
potential and guide waves have to exhibit non-local phenomena because they need to gather
information instantaneously from all parts of the current experiment.
Another problem arises in connection with the determinism in Bohm’s theory. If the
movement of particles is deterministic, how is it possible that a stream of them, in the
two-slit experiment, creates a superposition pattern? Bohm’s way out is the following one:
particles in a stream are not completely identical. Each particle is fully determined and
has its own momentum and direction. Guide waves send each particle in a completely
deterministic way but since all of them have slightly different initial conditions they go
along different routes.
27 Bohr was aware of this problem but got away with it by claiming that physicists know how to measure
things and in practice there has never been any doubt about whether an action is or is not a measurement.
1. The outside view (mathematical structure) is physically real and the inside view (hu-
man language we use to describe it) is only a useful approximation for describing our
subjective perception.
2. The (subjectively perceived) inside view is physically real, and the outside view with
[Figure: a source and two switches, with outputs labelled A, B, C and D]
It is easy to verify that if A, B, C and D can take only values 1 and −1, then X can take
only one of the values 2 or −2.
If the experiment is repeated many times then elements of each of the pairs (A, C), (A, D),
(B, C) and (B, D) are assigned some of the values −1 or 1 in about the same number of
cases. It therefore makes sense to talk about the average value EX of X and it is evident
that this average value should be between −2 and 2, i.e. −2 ≤ EX ≤ 2. (This is actually
364 Appendix–Hilbert space
used three variables, equality $A(B - C) = \pm(1 - BC)$ and the corresponding inequality
$|E[A(B - C)]| \le 1 - E[BC]$.
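The three-variable equality quoted in this footnote can be verified by brute force, since each variable takes only the values 1 and −1. The sketch below enumerates all eight assignments; the function and variable names are hypothetical and chosen only for this illustration.

```python
from itertools import product

def identity_holds_everywhere():
    """Check A(B - C) = +/-(1 - BC) for every assignment of values
    +1 or -1 to A, B and C."""
    for a, b, c in product((-1, 1), repeat=3):
        lhs = a * (b - c)
        rhs = 1 - b * c
        if lhs not in (rhs, -rhs):
            return False
    return True

def absolute_value_form_holds():
    """Equivalent form used for the inequality: |A(B - C)| = 1 - BC."""
    return all(
        abs(a * (b - c)) == 1 - b * c
        for a, b, c in product((-1, 1), repeat=3)
    )

identity_holds = identity_holds_everywhere()
abs_form_holds = absolute_value_form_holds()
```

Because |A(B − C)| equals 1 − BC in every single case, averaging over many runs and using |E[Y]| ≤ E[|Y|] gives the inequality stated above.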
9.2. HILBERT SPACE FRAMEWORK FOR QUANTUM COMPUTING 365
The family of all possible (pure) states of a quantum system constitutes what is known as
Hilbert space. The Hilbert space formalism is therefore the basic framework for formally
precise definitions and a study of the quantum mechanical concepts, phenomena, systems,
algorithms and processes. This is especially true for the physical foundations of quantum
computing.
Definition 9.2.1 A vector (linear) space $\mathcal{S}$, with a carrier $H$, over a field $\mathcal{K}$ with the
carrier $K$ is an algebra $\mathcal{S} = \langle H, +, {}^{-1}, 0, K, +_f, \times_f, 0, 1, \cdot\rangle$ such that $\langle H, +, {}^{-1}, 0\rangle$ is a
commutative group, $\mathcal{K} = \langle K, +_f, \times_f, 0, 1\rangle$ is a field, and $\cdot : K \times H \to H$ is a scalar multiplication
satisfying the following axioms for any $a, b \in K$, $\phi$ and $\psi \in H$:
• $a \cdot (\phi + \psi) = a \cdot \phi + a \cdot \psi$, $(a +_f b) \cdot \phi = a \cdot \phi + b \cdot \phi$ {distributive laws}
• $a \cdot (b \cdot \phi) = (a \times_f b) \cdot \phi$
• $1 \cdot \phi = \phi$.29
Two basic notations are used for elements of inner product spaces. In the so-called von
Neumann notation a plain letter, say $\psi$, denotes an element (vector). In Dirac notation
an element is denoted as $|\psi\rangle$ and called a ket vector. We use mainly Dirac notation, which
is usually handier. However, in some cases, in order to avoid an excess of delimiters,
the simpler von Neumann notation is used.
Exercise 9.2.5 Show the following properties for any elements $\phi, \psi$ of an inner product
space $H$, and any $c \in \mathbb{C}$: (a) $\langle c\phi|\psi\rangle = c^{*}\langle\phi|\psi\rangle$; (b) $\|c\phi\| = |c|\,\|\phi\|$;
(c) $\|\phi + \psi\|^2 + \|\phi - \psi\|^2 = 2\|\phi\|^2 + 2\|\psi\|^2$ (parallelogram law).
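A numerical spot check is no substitute for the proofs the exercise asks for, but it helps to fix the conventions. The sketch below assumes finite-dimensional vectors stored as lists of complex numbers and an inner product that is conjugate-linear in its first argument, which is what property (a) implies. The sample vectors, the scalar and the helper names are arbitrary choices made for illustration.

```python
import math

def inner(phi, psi):
    """<phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm(phi):
    """||phi|| = sqrt(<phi|phi>)."""
    return math.sqrt(inner(phi, phi).real)

def scaled(c, phi):
    return [c * a for a in phi]

def added(phi, psi):
    return [a + b for a, b in zip(phi, psi)]

def subtracted(phi, psi):
    return [a - b for a, b in zip(phi, psi)]

# Arbitrary sample data in a 3-dimensional complex space.
phi = [1 + 2j, -0.5j, 3.0 + 0j]
psi = [0.5 - 1j, 2 + 0j, -1 + 1j]
c = 2 - 3j

# (a) <c phi|psi> = c* <phi|psi>
check_a = abs(inner(scaled(c, phi), psi) - c.conjugate() * inner(phi, psi)) < 1e-12
# (b) ||c phi|| = |c| ||phi||
check_b = abs(norm(scaled(c, phi)) - abs(c) * norm(phi)) < 1e-12
# (c) parallelogram law
lhs = norm(added(phi, psi)) ** 2 + norm(subtracted(phi, psi)) ** 2
rhs = 2 * norm(phi) ** 2 + 2 * norm(psi) ** 2
check_c = abs(lhs - rhs) < 1e-9
```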
An inner-product space with a carrier $H$ is called complete if for any sequence $\{\phi_i\}_{i=1}^{\infty}$
with $\phi_i \in H$, and with the property $\lim_{i,j\to\infty} \|\phi_i - \phi_j\| = 0$, there is a unique element $\phi \in H$
such that $\lim_{i\to\infty} \|\phi - \phi_i\| = 0$. A complete inner-product space is called a Hilbert space.31
The elements of $H$ are usually called vectors, and if they have norm 1, then (pure) states.
In the following we mostly identify $\mathcal{H}$ with its carrier $H$.
Definition 9.2.6 Two vectors $\phi$ and $\psi$ of a Hilbert space $H$ are called orthogonal if $\langle\phi, \psi\rangle = 0$.
A set $S \subseteq H$ is orthogonal if any two of its elements are orthogonal. $S$ is orthonormal
if it is orthogonal and all its elements have norm 1.
In ordinary terms, orthogonal states (represented by orthogonal vectors) are things that
are independent of each other; for example, both basic states of a spin-$\frac{1}{2}$ particle, all positions
a particle can be located in or all configurations of a quantum automaton.
Exercise 9.2.7 (a) Show that if two vectors $\phi, \psi$ of a Hilbert space H are orthogonal,
then $\|\phi + \psi\|^2 = \|\phi\|^2 + \|\psi\|^2$; (b) If $\{\phi_1, \ldots, \phi_n\}$ is an orthonormal set in H, then for
all $\phi \in H$: $\sum_{i=1}^{n} |\langle\phi_i|\phi\rangle|^2 \le \|\phi\|^2$ (Bessel's inequality).
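In a more general approach one can define the angle of two vectors $\phi$ and $\psi$ as follows:
$$\theta_{\phi,\psi} = \begin{cases} 0, & \text{if } \phi = 0 \text{ or } \psi = 0; \\ \arccos\bigl(\mathrm{Re}(\langle\phi|\psi\rangle / (\|\phi\| \cdot \|\psi\|))\bigr), & \text{otherwise.} \end{cases}$$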
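The definition of the angle can be tried out numerically. The following is a minimal sketch, assuming vectors are given as Python lists of (complex) numbers and that the inner product is conjugate-linear in its first argument, as in Exercise 9.2.5(a); the function names are illustrative only.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def norm(phi):
    """Norm ||phi|| = sqrt(<phi|phi>)."""
    return math.sqrt(inner(phi, phi).real)

def angle(phi, psi):
    """Angle of two vectors; 0 if either of them is the zero vector."""
    n1, n2 = norm(phi), norm(psi)
    if n1 == 0 or n2 == 0:
        return 0.0
    c = (inner(phi, psi) / (n1 * n2)).real
    # Clamp against rounding errors before calling acos.
    return math.acos(max(-1.0, min(1.0, c)))
```

For instance, `angle([1, 0], [0, 1])` evaluates to π/2 (up to rounding), in agreement with the orthogonality of the two vectors.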
31 Many of the results stated in the following for Hilbert spaces hold also for inner-product spaces in
general. However, to simplify presentation we shall talk about Hilbert spaces mostly. Note also that the
completeness requirement has no direct physical meaning. Proofs of many general results about Hilbert
spaces require the use of limits and to have limit elements within the space under consideration.
Exercise 9.2.12 (a) Show that if n nonzero vectors are mutually orthogonal, then they are
linearly independent; (b) Show that the dimension of a Hilbert space H can be defined as the
maximal number of linearly independent vectors in H.
If $B = \{\beta_i\}_{i=1}^{n}$, $G = \{\gamma_i\}_{i=1}^{n}$ are two orthogonal bases, and $\gamma_i = \sum_{j=1}^{n} c_{ij}\beta_j$, then
$c_{ij} = \langle\gamma_i, \beta_j\rangle$. Bases B and G are called mutually unbiased if $|c_{ij}|^2 = \frac{1}{n}$.
Two mutually unbiased bases can be seen as being as different as possible. The standard
and dual bases are mutually unbiased. Since each basis corresponds to a test, mutually
unbiased bases correspond to tests with clearly distinguishable outcomes. Example: tests
for the vertical and the horizontal polarization of photons or for the clockwise and counter-
clockwise polarization.
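The claim that the standard and dual bases are mutually unbiased can be checked directly. The sketch below assumes that the dual basis of a qubit is $\{(|0\rangle + |1\rangle)/\sqrt{2}, (|0\rangle - |1\rangle)/\sqrt{2}\}$ and that both bases are orthonormal; the helper names are hypothetical.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def mutually_unbiased(basis_b, basis_g, tol=1e-12):
    """True if |<g|b>|^2 = 1/n for every pair of basis vectors."""
    n = len(basis_b)
    return all(abs(abs(inner(g, b)) ** 2 - 1.0 / n) < tol
               for g in basis_g for b in basis_b)

s = 1 / math.sqrt(2)
standard = [[1, 0], [0, 1]]
dual = [[s, s], [s, -s]]   # assumed form of the dual basis
```

Here `mutually_unbiased(standard, dual)` holds, while a basis is never mutually unbiased with itself.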
Definition 9.2.13 If $B = \{\phi_i\}_{i=1}^{n}$ is a basis of an n-dimensional Hilbert space and for a
vector $\psi$ it holds $\psi = \sum_{i=1}^{n} \alpha_i\phi_i$, then the vector $(\alpha_1, \ldots, \alpha_n)$ is called a representation of
$\psi$ in the basis B.
Exercise 9.2.14 Let $B_1 = \{\phi_i\}_{i=1}^{n}$ and $B_2 = \{\psi_i\}_{i=1}^{n}$ be two orthonormal bases of
an n-dimensional Hilbert space. Show that there is a matrix $M_{B_1,B_2}$ which maps $B_1$-
representations of vectors to their $B_2$-representations.
Another important feature of Hilbert spaces is their self-duality. This concept refers to the
space of continuous linear mappings from H to $\mathbb{C}$ (called functionals). If addition and
scalar multiplication are defined in the most obvious way, then the space of linear functionals
for a given Hilbert space H is again a Hilbert space, the so-called conjugate space, denoted
$H^*$, related to H as follows:
Since the mapping $f_\psi(\phi) = \langle\psi|\phi\rangle$ is a functional for any $\psi \in H$, the last theorem
establishes a bijection between $H$ and $H^*$. Instead of $f_\psi$, the notation $\langle\psi|$ is used and $\langle\psi|$ is
called a bra vector corresponding to the ket vector $|\psi\rangle$. In this notation $\langle\psi|(|\phi\rangle) = \langle\psi|\phi\rangle$.
In the case of n-dimensional complex Hilbert spaces a "ket" $|\psi\rangle$ can be considered as
an n-dimensional column vector and a "bra" $\langle\phi|$ as an n-dimensional row vector. The
scalar product $\langle\phi|\psi\rangle$ is then a complex number, the result of a usual "row vector ×
column vector" product. The transformation $|\phi\rangle \leftrightarrow \langle\phi|$ corresponds to a transposition and
conjugation. The tensor product $|\psi\rangle\langle\phi|$ is a matrix, the result of a usual column vector
× row vector product.
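The row/column picture can be illustrated with a few lines of code. This is a sketch under the assumption that kets are stored as lists of complex numbers; the names `bra`, `braket` and `ketbra` are hypothetical helpers, not notation from the text.

```python
def bra(ket):
    """Row vector <psi|: the conjugate of the column vector |psi>."""
    return [complex(a).conjugate() for a in ket]

def braket(phi, psi):
    """Scalar product <phi|psi>: row vector times column vector."""
    return sum(a * b for a, b in zip(bra(phi), psi))

def ketbra(psi, phi):
    """Matrix |psi><phi|: column vector times row vector."""
    row = bra(phi)
    return [[a * b for b in row] for a in psi]

def trace(m):
    """Sum of the diagonal elements of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

psi = [1, 1j]
phi = [2, 3]
```

With these definitions `trace(ketbra(psi, phi))` and `braket(phi, psi)` agree, which is the identity $\mathrm{Tr}(|\psi\rangle\langle\phi|) = \langle\phi|\psi\rangle$.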
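Exercise 9.2.16 Show that if the trace of a matrix A, written trace(A) or Tr(A) for short, denotes
the sum of its diagonal elements, then (a) $\mathrm{Tr}(|\psi\rangle\langle\phi|) = \langle\phi|\psi\rangle$;
(b) $\mathrm{Tr}(a_1|\psi_1\rangle\langle\phi_1| + a_2|\psi_2\rangle\langle\phi_2|) = a_1\langle\phi_1|\psi_1\rangle + a_2\langle\phi_2|\psi_2\rangle$.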
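Exercise 9.2.17 Show: (a) $(|\psi_1\rangle + |\psi_2\rangle)\langle\phi| = |\psi_1\rangle\langle\phi| + |\psi_2\rangle\langle\phi|$; (b) $|\psi\rangle(\langle\phi_1| + \langle\phi_2|) =
|\psi\rangle\langle\phi_1| + |\psi\rangle\langle\phi_2|$; (c) $(a|\psi\rangle)\langle\phi| = a(|\psi\rangle\langle\phi|) = |\psi\rangle(a\langle\phi|)$.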
Exercise 9.2.19 Show that all eigenvalues of self-adjoint operators are real.
The n eigenvalues of A need not all be distinct. Let there be exactly k different eigenvalues
$\lambda_1, \ldots, \lambda_k$ of A and let m(i) be the multiplicity of $\lambda_i$. In such a case the set of
eigenvectors corresponding to $\lambda_i$ forms a subspace of $H_n$ of dimension m(i), the so-called
eigenspace, denoted by $H_{\lambda_i}$. An eigenvalue of multiplicity 1 (> 1) is called nondegenerate
(degenerate). A matrix M is called nondegenerate (degenerate) if (not) all of its
eigenvalues are nondegenerate.
For any Hilbert space H and any of its self-adjoint operators A there is an orthonormal
basis of H consisting of eigenvectors of A.
Projections play a special role among the operators of a Hilbert space. If $H = W \oplus W^{\perp}$ is
a decomposition of a Hilbert space H into two orthogonal subspaces $W$ and $W^{\perp}$, then each
$\phi \in H$ has a unique representation $\phi = \phi_W + \phi_{W^{\perp}}$, where $\phi_W \in W$ and $\phi_{W^{\perp}} \in W^{\perp}$. In
such a case the mapping
$$P_W \phi = \phi_W$$
is an operator called the projection onto the subspace W. If W is a subspace spanned by
a single vector $\phi$, then instead of $P_{\{\phi\}}$ we write simply $P_\phi$.
Exercise 9.2.21 Show that an operator A is a projection onto some subspace if and only
if A is idempotent (i.e., $A = A^2$) and self-adjoint.
Exercise 9.2.22 Show: (a) $\langle P_W\phi|\psi\rangle = \langle\phi|P_W\psi\rangle$ holds for any subspace W and any
vectors $\phi, \psi$; (b) the square of the length of the projection of $\phi$ into W, i.e. $\|P_W\phi\|^2$, is
$\langle\phi|P_W\phi\rangle$.
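Therefore, if $|\psi\rangle = \sum_{i=1}^{n} \langle\phi_i|\psi\rangle|\phi_i\rangle$ for an orthonormal basis $\{|\phi_1\rangle, \ldots, |\phi_n\rangle\}$, then
$P_{\phi_i}|\psi\rangle = \langle\phi_i|\psi\rangle|\phi_i\rangle$.
If $\phi \in H$, $\|\phi\| = 1$, then the operator denoted by $|\phi\rangle\langle\phi|$ and defined by
$$|\phi\rangle\langle\phi|(|\psi\rangle) = \langle\phi|\psi\rangle|\phi\rangle$$
is actually the projection $P_\phi$ onto the one-dimensional space spanned by the vector $|\phi\rangle$.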
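The properties of the one-dimensional projection $|\phi\rangle\langle\phi|$ can be verified on a concrete unit vector. The following is a minimal sketch; the vector `phi` and the helper names are chosen for illustration only.

```python
def projector(phi):
    """Matrix of |phi><phi| for a unit vector phi."""
    return [[a * complex(b).conjugate() for b in phi] for a in phi]

def apply(m, v):
    """Matrix-vector product m v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

phi = [0.6, 0.8j]          # unit vector: 0.36 + 0.64 = 1
P = projector(phi)
```

Up to rounding, `matmul(P, P)` equals `P` (idempotence), `P` equals its own conjugate transpose (self-adjointness), and `apply(P, psi)` equals $\langle\phi|\psi\rangle|\phi\rangle$ for any `psi`.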
where B is an orthonormal basis of H. This definition is sound because it can be shown (see
also the exercises below) that the trace of an operator is invariant under a change of basis. Let
T(H) be the set of all operators of H with finite traces.
Exercise 9.2.24 Show: (a) $\mathrm{Tr}(cA) = c\,\mathrm{Tr}(A)$ for any $c \in \mathbb{C}$ and any linear operator
A; (b) $\mathrm{Tr}(A_1 + A_2) = \mathrm{Tr}(A_1) + \mathrm{Tr}(A_2)$ for any linear operators $A_1, A_2$; (c) the trace
functional is cyclically invariant, namely: $\mathrm{Tr}(A_1A_2) = \mathrm{Tr}(A_2A_1)$ and $\mathrm{Tr}(A_1A_2A_3) =
\mathrm{Tr}(A_2A_3A_1)$ for any linear operators $A_1, A_2, A_3$.
Exercise 9.2.25 Show that if P is a projection operator into a k-dimensional space, then
Tr(P ) = k.
Definition 9.2.26 A linear operator A is called bounded if there is an $s \in \mathbb{R}_{\ge 0}$ such that
$\|A\psi\| \le s\|\psi\|$ for all $\psi \in H$.
If A is a linear bounded operator then its norm $\|A\|$ is defined by
$$\|A\| = \sup\left\{ \frac{\|A\psi\|}{\|\psi\|} \;\middle|\; 0 \neq \psi \in H \right\}.$$
Exercise 9.2.27 Show that |hAφ|ψi| ≤ ||A||2 and |hφ|Aφi| ≤ ||A||2 for any bounded
operator A and any states φ, ψ.
Exercise 9.2.28 Show that ||A + B|| ≤ ||A|| + ||B|| and ||AB|| ≤ ||A|| ||B|| for any two
bounded operators A, B.
[A, B] = AB − BA
Exercise 9.2.29 Determine when two operators $|\phi_1\rangle\langle\phi_1|$ and $|\phi_2\rangle\langle\phi_2|$ commute, if $|\phi_1\rangle$
and $|\phi_2\rangle$ are pure states.
One can show that two Hermitian operators commute if and only if there is a basis in
which they both have diagonal form.
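As a small illustration of the commutator $[A, B] = AB - BA$, the sketch below compares a pair of matrices that are both diagonal in the standard basis with a pair that is not simultaneously diagonalizable. The Pauli matrices $\sigma_x$, $\sigma_z$ and the matrix `diag` are chosen here as examples; they are not taken from the surrounding text.

```python
def matmul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def commutator(a, b):
    """[A, B] = AB - BA for square matrices of the same size."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def commute(a, b):
    """True if [A, B] is the zero matrix (exact arithmetic on integers)."""
    return all(x == 0 for row in commutator(a, b) for x in row)

sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]
diag = [[2, 0], [0, 3]]
```

`commute(sigma_z, diag)` is true, since both matrices are already diagonal, whereas `commute(sigma_x, sigma_z)` is false.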
If all $p_i$ are the same, the term random mixture is sometimes used. In general, a quantum
system is not in a pure state. There may be several reasons for that. A source producing
a quantum state may be imperfect and produce the state $\phi_i$ with probability $p_i$. It may
also be the case that we have only partial knowledge about a system. For example, in an
interpretation of quantum mechanics the result of the measurement of a pure state
$$|\psi\rangle = \sum_{i=1}^{n} \alpha_i|\phi_i\rangle$$
with respect to the observable given by an orthonormal basis $\{\phi_i\}_{i=1}^{n}$ can be considered as
the mixed state
$$[\psi\rangle = \bigoplus_{i=1}^{n} |\alpha_i|^2 |\phi_i\rangle.$$
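In general, quantum processes are not always fully specified. Mixtures and density
matrices are a suitable framework to deal with such cases.
To each mixed state $[\psi\rangle$ corresponds a density matrix $\rho_{[\psi\rangle}$. If $[\psi\rangle = |\phi\rangle = \sum_{i=1}^{n} c_i|\phi_i\rangle$
is a pure state, then $\rho_{[\psi\rangle} = |\phi\rangle\langle\phi|$, i.e. $\rho_{[\psi\rangle}(i, j) = c_i c_j^*$. For example, if $|\phi\rangle = \alpha|0\rangle + \beta|1\rangle$,
then
$$|\phi\rangle\langle\phi| = \begin{pmatrix} |\alpha|^2 & \alpha\beta^* \\ \alpha^*\beta & |\beta|^2 \end{pmatrix}.$$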
Remark 9.2.30 The traditional meaning of density matrices is related to their representa-
tion of mixed states—eigenvalues of density matrices correspond to probabilities for finding
the system in their corresponding eigenvectors. More modern approaches see density matri-
ces as an alternative to vectors of Hilbert space to describe states of quantum systems.
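The correspondence between mixed states and density matrices is not one-to-one. For example,
if
$$[\psi_1\rangle = \frac{1}{2}|0\rangle \oplus \frac{1}{2}|1\rangle \quad \text{and} \quad [\psi_2\rangle = \frac{1}{2}\Bigl(\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle)\Bigr) \oplus \frac{1}{2}\Bigl(\tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle)\Bigr)$$
are two mixed states in $H_2$ (the pure states of $[\psi_2\rangle$ being written here in normalized form), then
$$\rho_{[\psi_1\rangle} = \rho_{[\psi_2\rangle} = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix} = \frac{1}{2} I.$$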
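The coincidence of the two density matrices can be checked numerically. This is a minimal sketch, assuming a mixed state is given as a list of (probability, unit vector) pairs and that its density matrix is $\sum_i p_i|\phi_i\rangle\langle\phi_i|$; the function names are illustrative.

```python
import math

def outer(psi):
    """Matrix |psi><psi| of a vector psi."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

def density(mixture):
    """Density matrix of a mixture given as (probability, unit vector) pairs."""
    n = len(mixture[0][1])
    rho = [[0j] * n for _ in range(n)]
    for p, psi in mixture:
        proj = outer(psi)
        for i in range(n):
            for j in range(n):
                rho[i][j] += p * proj[i][j]
    return rho

s = 1 / math.sqrt(2)
rho1 = density([(0.5, [1, 0]), (0.5, [0, 1])])     # standard basis states
rho2 = density([(0.5, [s, s]), (0.5, [s, -s])])    # normalized (|0>+|1>), (|0>-|1>)
```

Both `rho1` and `rho2` equal $\frac{1}{2}I$ up to rounding, although the two mixtures are built from different pure states.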
Exercise 9.2.31 Show that if $|\phi_1\rangle, |\phi_2\rangle$ are orthonormal states of $H_2$, then the density
matrix for the state $[\psi\rangle = \frac{1}{2}|\phi_1\rangle \oplus \frac{1}{2}|\phi_2\rangle$ has the form $\frac{1}{2}I$.
Density matrices are linear operators, so-called density operators, of Hilbert space and
have various interesting and important properties. Observe that if all pure states of a mixed
state are states of a Hilbert space H then the corresponding density matrix is an element of
the space H ∗ ⊗ H.
Exercise 9.2.32 Show: (a) each density operator $\rho$ is self-adjoint, positive and $\mathrm{Tr}(\rho) = 1$;
(b) a density operator $\rho$ is the density operator of a pure state if and only if $\rho^2 = \rho$.
Exercise 9.2.33 Show an example of a mixed state with two pure states and another
example of a mixed state with three pure states such that the density matrices of both of
these mixed states are the same.
In a more general setting, density matrices are arbitrary positive and self-adjoint opera-
tors with trace 1.
Density matrices completely specify all practically distinguishable properties of mixed
states.
Example 9.2.34 Consider two spin-1/2 particles which are far apart and in the entangled
state $\frac{1}{\sqrt{2}}(|0\rangle|1\rangle + |1\rangle|0\rangle)$. If an observer of the first particle measures the particle with respect
to the basis $\{|0\rangle, |1\rangle\}$, then the outcomes $|0\rangle$ or $|1\rangle$ are obtained with the same probability.
The observer of the second particle then sees his particle as being in the state $|1\rangle$ in the first
case and in the state $|0\rangle$ in the second case. In total he can say that the second particle is
in the mixed state $(\frac{1}{2}, |0\rangle) \oplus (\frac{1}{2}, |1\rangle)$.
If the observer of the first particle measures his particle with respect to the dual basis
$\{|0'\rangle, |1'\rangle\}$, then again he gets $|0'\rangle$ with probability $\frac{1}{2}$ and $|1'\rangle$ with the same probability. In
such a case the second particle is in the state $|1'\rangle$ or $|0'\rangle$ depending on the outcome of the
first measurement. One can say again that the second particle is in the mixed state
$(\frac{1}{2}, |0'\rangle) \oplus (\frac{1}{2}, |1'\rangle)$. However, to both of these mixtures the same density matrix $\frac{1}{2}I$ corresponds, which
fully characterizes the state of the second particle.
Example 9.2.35 Let us assume that to polarize a stream of photons we use for each photon
one of the following two methods.
(a) A photon is randomly polarized either in the vertical or horizontal direction.
(b) A photon is randomly polarized either in the right-to-left or left-to-right direction.
In both cases we have the same density matrix $\frac{1}{2}I$ and the receiver has no way to find
out which of these two methods was used.
where all λi > 0, because ρ is positive. In such a case ρ can be seen as representing a
mixture of mutually orthogonal pure states φi with probabilities λi . This decomposition
into orthogonal states is unique, but, as already mentioned, ρ is also decomposable into a
mixture of (in general) non-orthogonal pure states φi .
If some eigenvalues of $\rho$ are degenerate, then
$$\rho = \sum_{i} \lambda_i P_{\lambda_i}$$
and the eigenvalues correspond to mutually orthogonal subspaces. The subspace $H_\lambda$
corresponding to the degenerate eigenvalue $\lambda$ is generated by the set of solutions of the eigenvalue
equation $\rho\omega = \lambda\omega$. In this case, it is not possible to represent $\rho$ as a mixture of a unique set
of orthogonal pure states.
Pure states, represented by idempotent density operators, are homogeneous in the sense
that no idempotent density operator is expressible as a nontrivial convex sum of two or more
different density operators.
Exercise 9.2.39 (Lloyd, 1997) Show that if for two mixtures $[\phi\rangle = \{(|\phi_i\rangle, p_i) \mid 1 \le i \le k\}$
and $[\psi\rangle = \{(|\psi_i\rangle, q_i) \mid 1 \le i \le k\}$ we define $\langle\phi|\psi\rangle = \sum_{j=1}^{k} \sqrt{p_j q_j}\,\langle\psi_j|\phi_j\rangle$, then this "scalar
product" indeed has all the properties a scalar product should have.
Superoperators
Transformations on density matrices are performed by superoperators: linear operators
that map linear operators of a Hilbert space $H_1$ to linear operators of another Hilbert
space $H_2$. The informal idea of physically implementable superoperators is captured by the
formal concept of completely positive maps. They are superoperators T that are positive
(they map positive semi-definite Hermitian matrices into positive semi-definite Hermitian
matrices), trace preserving and such that for any finite-dimensional Hilbert space H the
operator $T \otimes I_H$ also has the same property, where $I_H$ is the identity superoperator on H. (This
means that positivity must be preserved if the Hilbert spaces $H_1$ and $H_2$ are extended by adding more
qubits.)
Examples of superoperators are encoders, decoders and quantum channels. Superopera-
tors can also be seen as unitary operators in a larger Hilbert space.
where $\lambda_1, \ldots, \lambda_k$ are all its distinct eigenvalues, can be seen as an observable that represents
the decomposition of H into subspaces $H_{\lambda_i}$ corresponding to the eigenvalues $\lambda_i$, with $\lambda_i$ as the
numerical value associated with the subspace $H_{\lambda_i}$.
If the spectrum of A is simple, then the probability assigned to the eigenvalue λi by a
unit vector ψ is defined as follows:
where $\omega_i$ is the eigenvector corresponding to $\lambda_i$. Therefore, $\mathrm{prob}_\psi(\lambda_i)$ is the square of the
length of the projection of $\psi$ into $\omega_i$. If A is degenerate, then
$$\mathrm{prob}_\psi(W) = \mathrm{Tr}(\rho P_W),$$
where $\rho = |\psi\rangle\langle\psi|$ is the density operator. In general we define for a density operator $\rho$ and
$|\psi\rangle \in H$:
$$\mathrm{prob}_\rho(W) = \mathrm{Tr}(\rho P_W).$$
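Often we use the notation $\langle\psi|A|\psi\rangle$ or $\langle A\rangle_\psi$ instead of $\mathrm{Exp}_\psi(A)$, or even $\langle A\rangle$ if $\psi$ is clear from
the context.
Exercise 9.2.42 Show that: (a) $\mathrm{Exp}_\psi(A) = \langle\psi|A\psi\rangle$; (b) $\mathrm{Exp}_\psi(A) = \mathrm{Tr}(P_\psi A)$.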
In the case of a mixed state $\psi$ given by the pure states $\psi_1, \ldots, \psi_k$ and probabilities
$p_1, \ldots, p_k$, with $\sum_{i=1}^{k} p_i = 1$, we have
$$\mathrm{Exp}_\psi(A) = \sum_{i=1}^{k} p_i\,\mathrm{Exp}_{\psi_i}(A) = \sum_{i=1}^{k} p_i \langle\psi_i|A\psi_i\rangle.$$
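The formula for the expectation value in a mixed state can be compared with the trace expression $\mathrm{Tr}(\rho A)$, where $\rho = \sum_i p_i|\psi_i\rangle\langle\psi_i|$. The sketch below does this for an assumed example, the observable $\sigma_z$ and a mixture of the two standard basis states; all names are illustrative.

```python
def expectation_pure(psi, a):
    """<psi|A psi> for a unit vector psi and a square matrix a."""
    n = len(psi)
    a_psi = [sum(a[i][j] * psi[j] for j in range(n)) for i in range(n)]
    return sum(complex(x).conjugate() * y for x, y in zip(psi, a_psi))

def expectation_mixed(mixture, a):
    """Sum of p_i <psi_i|A psi_i> over (p_i, psi_i) pairs."""
    return sum(p * expectation_pure(psi, a) for p, psi in mixture)

def expectation_trace(mixture, a):
    """Tr(rho A) with rho = sum of p_i |psi_i><psi_i|."""
    n = len(a)
    rho = [[sum(p * psi[i] * complex(psi[j]).conjugate() for p, psi in mixture)
            for j in range(n)] for i in range(n)]
    return sum(rho[i][k] * a[k][i] for i in range(n) for k in range(n))

sigma_z = [[1, 0], [0, -1]]                      # assumed example observable
mixture = [(0.25, [1, 0]), (0.75, [0, 1])]       # assumed example mixture
```

Both `expectation_mixed(mixture, sigma_z)` and `expectation_trace(mixture, sigma_z)` give 0.25 · 1 + 0.75 · (−1) = −0.5.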
Exercise 9.2.43 Show for the mixed state $\psi$ given by pure states $\psi_1, \ldots, \psi_k$ and
probabilities $p_1, \ldots, p_k$ that $\sum_{i=1}^{k} p_i\,\mathrm{Exp}_{\psi_i}(A) = \mathrm{Tr}\bigl((\sum_{i=1}^{k} p_i P_{\psi_i})A\bigr)$.
The concept of observables is one of the most specific and important in quantum comput-
ing. The key point is that any pair consisting of an observable A and a state T determines
a probability distribution of values of the observable A on state T .
In the classical setting, examples of “observables” are position, velocity, energy, momen-
tum, and so on. Two of them, position and momentum, can be seen as “canonical” because
other classical observables can be expressed in terms of these two. In the Hilbert space
framework of quantum mechanics, observables are operators of a special type.
It follows from the above principle that unitary time evolution never turns a pure state
into a mixed state. Moreover, it holds for any $t_1, t_2 \ge 0$ that $U(t_1 + t_2) = U(t_1)U(t_2)$ and
for any $|\phi\rangle \in H$, $t_0 \in \mathbb{R}$
$$\lim_{t \to t_0} U(t)|\phi\rangle = U(t_0)|\phi\rangle.$$
On this basis one can show that the evolution of a mixed state, represented by the density
operator $\rho(t)$, is given by
9.2.6 Measurements
Experiments show, and this has been formulated as one of the main principles of the Copen-
hagen interpretation, that on the macroscopic level we can receive, as outcomes of mea-
surements, only classical values (states) and not quantum superpositions. However, on the
quantum level the outcome is the quantum state obtained by the “collapse” of the measured
state.
The basis of the main approach to the measurement in quantum computing is von Neu-
mann’s projection postulate: observation of a state ψ by an observable A provides, as the
result of the measurement, one of the eigenvalues of A and makes the state ψ collapse and
renormalize. For finite-dimensional Hilbert spaces this has been formally stated as follows:
Let H be a Hilbert space of dimension n corresponding to a quantum system S. Let
$$A = \sum_{i=1}^{n} \lambda_i |\phi_i\rangle\langle\phi_i|$$
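The tensor product $H_1 \otimes H_2$ of Hilbert spaces $H_1$ and $H_2$ is of the dimension
$\dim(H_1 \otimes H_2) = \dim(H_1)\dim(H_2)$ and with the basis $A \times B = \{(|\phi_i\rangle, |\psi_j\rangle) \mid |\phi_i\rangle \in A, |\psi_j\rangle \in B\}$.
Instead of $(|\phi_i\rangle, |\psi_j\rangle)$ we usually write $|\phi_i\rangle|\psi_j\rangle$ or simply $|\phi_i\psi_j\rangle$. In such a case we define
$$|\phi\rangle \otimes |\psi\rangle = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i d_j |\phi_i\psi_j\rangle.$$
If $|\phi_1\rangle, |\phi_2\rangle$ are states of $H_1$ and $|\psi_1\rangle, |\psi_2\rangle$ are states of $H_2$, then $|\phi_1\rangle \otimes |\psi_1\rangle$ and $|\phi_2\rangle \otimes |\psi_2\rangle$
are states of H and their scalar product is defined by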
Exercise 9.2.46 Show that: (a) the operation ⊗ on matrices is associative; (b) Tr(A ⊗
B) = Tr(A)Tr(B) for matrices A and B.
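In coordinates, the tensor product of vectors and of matrices is the Kronecker product. The following is a minimal sketch, assuming vectors are lists and matrices are lists of rows, with the usual ordering of the product basis; it can be used to test the identities of Exercise 9.2.46 on examples (which is of course not a proof). The matrices `A`, `B`, `C` are arbitrary illustrations.

```python
def kron_vec(u, v):
    """Coordinates of |u> tensor |v> in the product basis."""
    return [x * y for x in u for y in v]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def trace(m):
    """Sum of the diagonal elements of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
C = [[1, 1], [0, 2]]
```

For these matrices `trace(kron(A, B))` equals `trace(A) * trace(B)`, and `kron(kron(A, B), C)` equals `kron(A, kron(B, C))`.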
POV measurements
Let a quantum system S be in the mixed state represented by a density matrix $\rho$ and let H
be the corresponding Hilbert space. Let us consider an auxiliary system $S_a$, an ancilla, in
the state $\rho_a$. The resulting density matrix of the combined system $S \otimes S_a$ is $\rho \otimes \rho_a$.
Let a maximal measurement be performed on $\rho \otimes \rho_a$ in $S \otimes S_a$. In such a case different
outcomes of the measurement correspond to projections $P_\mu$ onto different orthogonal subspaces
and it holds
$$P_\mu P_\nu = \begin{cases} P_\mu, & \text{if } \mu = \nu; \\ 0, & \text{if } \mu \neq \nu, \end{cases} \qquad \sum_{\mu} P_\mu = I.$$
In such a case the probability $P_{\mu i}$ that the outcome $\mu$ is obtained by measuring a state $\rho_i$ of S is given by
$$P_{\mu i} = \mathrm{Tr}[P_\mu(\rho_i \otimes \rho_a)] = \sum_{mr,ns} P_\mu[mr, ns]\,\rho_i[n, m]\,\rho_a[s, r],$$
and therefore
$$P_{\mu i} = \mathrm{Tr}(A_\mu \rho_i),$$
where $A_\mu$ is defined by
$$A_\mu[m, n] = \sum_{rs} P_\mu[mr, ns]\,\rho_a[s, r],$$
and $A_\mu$ is an operator on H.
The set of all operators $A_\mu$ is called a positive operator valued measure (POVM),
because each $A_\mu$ is a positive operator. The matrices $A_\mu$ do not commute in general and
satisfy the relation
$$\sum_{\mu} A_\mu = I.$$
One difference between von Neumann's projection measurement and POVM is that in
the latter case the number of possible outcomes may be larger than the dimensionality of the
underlying Hilbert space H. The probability of the outcome $\mu$ is given by $\mathrm{Tr}(A_\mu \rho)$ instead
of von Neumann's $\mathrm{Tr}(P_\mu \rho)$.³⁵
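A POVM with more outcomes than the dimension of the space can be written down explicitly. The sketch below uses a three-outcome qubit POVM built from three real unit vectors at angles 0, 2π/3 and 4π/3 (often called a trine POVM); this particular example is an assumption chosen for illustration and does not come from the text above.

```python
import math

def trine_povm():
    """Three rank-1 operators A_k = (2/3)|v_k><v_k| with real unit vectors v_k."""
    elems = []
    for k in range(3):
        t = 2 * math.pi * k / 3
        v = [math.cos(t), math.sin(t)]
        elems.append([[2 / 3 * v[i] * v[j] for j in range(2)] for i in range(2)])
    return elems

def outcome_probabilities(povm, rho):
    """Tr(A_mu rho) for each element A_mu of the POVM."""
    n = len(rho)
    return [sum(a[i][k] * rho[k][i] for i in range(n) for k in range(n))
            for a in povm]

povm = trine_povm()
rho = [[1, 0], [0, 0]]                      # density matrix of the pure state |0>
probs = outcome_probabilities(povm, rho)
```

The three operators sum to the identity, and for the state $|0\rangle$ the outcome probabilities are 2/3, 1/6 and 1/6, which sum to 1 although the space is only two-dimensional.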
and the optimal POVM consists of matrices of rank 1, see Peres (1993) for more details.
35 POVM measurement is also related to the general view of a physical process which, starting in a state
$\psi$, produces a random classical outcome $\mu$ and causes a collapse of $\psi$ into the state $\psi_\mu$ of another Hilbert
space $H'$. This is then formalized through the mapping $\mu \to M_\mu$, where $M_\mu : H \to H'$ is a positive operator
called the measurement operator associated with the initial Hilbert space $H$ into the final Hilbert space $H'$.
The only requirement on $M_\mu$ is that $\sum_\mu M_\mu^* M_\mu = I$. The mapping $\mu \to E_\mu = M_\mu^* M_\mu$ is then called POVM.
Deterministic models
The very basic model of the universal computer, invented in 1937 by A. M. Turing, is
that of a one-tape Turing machine (TM). A Turing machine M, see Figure 9.7, consists
of a bi-infinite tape, divided into an infinite number of cells in both directions, with one
distinctive starting cell, or 0-cell. Cells of the tape can contain any symbol from a finite tape
alphabet Γ, or a symbol ⊔ representing the empty cell; a read-write head is positioned
at any moment of the discrete time on a cell; a finite control unit is always in one of the
states of a finite set Q of states and implements a transition function
δ : Q × Γ → Q × Γ × {←, ↓, →}.
The interpretation of $\delta(q, \sigma) = (q', \sigma', d)$ goes as follows: if M is in the state $q$ and the head
reads $\sigma$, then M enters the state $q'$, replaces $\sigma$ by $\sigma'$ in the cell the head is currently on, and
the head moves in the direction $d$ (to the right (left) if d = → (←), or does not move at all if
d = ↓). Formally, $M = \langle\Gamma, Q, q_0, \delta\rangle$ or $M = \langle\Sigma, \Gamma, q_0, Q_t, \delta\rangle$ if a set Σ ⊂ Γ is considered as the
input alphabet and $Q_t \subseteq Q$ contains the so-called terminating states. We can assume
that $Q_t = \{ACCEPT, REJECT\}$. It is assumed that once M gets into a terminating state
then it remains in such a state.
Figure 9.7: One-tape Turing machine (a tape of cells scanned by a head attached to a finite control in state q)
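The behaviour of a one-tape TM described above can be sketched as a small simulator. This is only an illustration under several assumptions: the transition function is a Python dictionary, the directions ←, ↓, → are written "L", "S", "R", the empty cell is written "_", and the example machine (which flips every bit of its input) is hypothetical.

```python
def run_tm(delta, q0, terminating, tape_input, blank="_", max_steps=10_000):
    """Run a one-tape TM; delta maps (state, symbol) to (state', symbol', move)."""
    tape = dict(enumerate(tape_input))      # cell index -> symbol, 0-cell first
    state, head = q0, 0
    for _ in range(max_steps):
        if state in terminating:            # a terminating state is never left
            break
        sym = tape.get(head, blank)
        state, new_sym, move = delta[(state, sym)]
        tape[head] = new_sym
        head += {"L": -1, "S": 0, "R": 1}[move]
    cells = "".join(tape[i] for i in sorted(tape))
    return state, cells.strip(blank)

# Hypothetical example machine: flip every bit, accept at the first empty cell.
flip = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("ACCEPT", "_", "S"),
}
```

For example, `run_tm(flip, "q0", {"ACCEPT", "REJECT"}, "0110")` returns the state `"ACCEPT"` together with the tape content `"1001"`. The step limit `max_steps` is a safeguard of the sketch, not part of the model.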
Exercise 9.3.1 Show that for each TM M we can design another Turing machine M′
that simulates M and in no case it can move into a state both from right and left.
Turing machines are a natural computer model to study, on the one hand, language
acceptance and decision problems and, on the other, computation of string-to-string and
integer-to-integer functions.
Definition 9.3.2 (1) Let M = hΣ, Γ, Q, q0 , δi be a TM with the input alphabet Σ. Then
Exercise 9.3.4 A TM, as defined above, can perform in one step three actions: a state
change, a writing and a head move. Show that to each TM M we can design a TM
M′ which performs in each step at most two of these three elementary actions and: (a)
accepts the same language as M; (b) computes the same function as M.
Exercise 9.3.5 Design a Turing machine that: (a) multiplies two integers; (b) recognizes
whether a given word is a palindrome; (c) recognizes whether a given integer is prime.
$$\delta : Q \times \Gamma^k \to Q \times \Gamma^k \times \{\leftarrow, \downarrow, \rightarrow\}^k.$$
If, at the beginning of a computation step, M is in the state $q$, its head on the ith tape
reads the symbol $\sigma_i$ and $\delta(q, \sigma_1, \ldots, \sigma_k) = (q', \sigma_1', \ldots, \sigma_k', d_1, \ldots, d_k)$, then M moves, in one
step, into the state $q'$, the head on the ith tape replaces $\sigma_i$ by $\sigma_i'$ and moves in the direction
$d_i$. A one-tape TM can be seen as a special case of an MTM.
It is straightforward to introduce basic concepts concerning time resources for com-
putation on MTM. If an MTM M starts with a string w on its input tape and with all
other tapes empty and yields in m steps a terminating configuration, then m is the time
of the computation of M on the input w. Denote by TimeM (n) the maximal number of
steps of M for inputs of length n. M is said to operate within the time bound f (n) for a
function f : N → N, or to be f (n)-time bounded, if M halts within f (|w|) steps, for any
input w ∈ Σ∗ . If a language L is accepted by an f (n)-time bounded MTM, then we write
L ∈ Time(f (n)). Thus, Time(f (n)) is the family of languages that can be decided by an
f (n)-time bounded MTM—a time complexity class.
Similarly, one can introduce analogous concepts for space as a computational resource
of MTM. $\mathrm{Space}_M(n)$ is the maximal number of cells, on any of the tapes, that M uses when
computing with inputs of length n. M is said to be s(n)-space bounded, where s : N → N,
if M uses at most s(|w|) cells on any of the tapes for any input w. Space(s(n)) is the family
of languages that can be accepted with space bound s(n), a space complexity class.
Of key importance for classical computing is the fact that there is a universal MTM U
that can efficiently, i.e., in polynomial time, simulate any other MTM.
The basic idea of such a universal MTM is very simple. If U gets on its input tape a
word w and a description (encoding) $\langle M\rangle$ of an MTM M (a description of its transition
function), in the form of a word in the tape alphabet of U, then in order to simulate one
step of M on the input w, U scans $\langle M\rangle$ to determine the step M would perform on w and
then U performs this particular step on w. The time for a simulation of one step of M can be
made proportional to the length of $\langle M\rangle$.
There are many ways an MTM can be encoded by a word. Let us illustrate one of
them for the case of a one-tape TM $M = \langle\Gamma, Q, q_1, Q_t, \delta\rangle$, where $\Gamma = \{a_1, a_2, \ldots, a_n\}$,
$Q = \{q_1, \ldots, q_m\}$, $Q_t = \{q_{m+1}, q_{m+2}\}$. M can be encoded by a binary string.
For example,
a transition $\delta(q_i, a_j) = (q_k, a_l, d)$ can be encoded as $0^i 1 0^j 1 0^k 1 0^l 1 0^{d'}$,
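The encoding of a single transition can be sketched as follows. The numeric code $d'$ of a direction is not specified in the passage above, so the mapping `DIRECTION_CODE` below is an assumption made only for this illustration, as are the function names.

```python
# Assumed numeric codes d' for the three directions (not given in the text).
DIRECTION_CODE = {"L": 1, "S": 2, "R": 3}
CODE_DIRECTION = {code: d for d, code in DIRECTION_CODE.items()}

def encode_transition(i, j, k, l, d):
    """Encode delta(q_i, a_j) = (q_k, a_l, d) as 0^i 1 0^j 1 0^k 1 0^l 1 0^d'."""
    return "1".join("0" * n for n in (i, j, k, l, DIRECTION_CODE[d]))

def decode_transition(word):
    """Inverse of encode_transition for a well-formed code word."""
    i, j, k, l, dcode = (len(block) for block in word.split("1"))
    return i, j, k, l, CODE_DIRECTION[dcode]
```

For example, `encode_transition(1, 2, 3, 1, "R")` yields `"01001000101000"`, and `decode_transition` recovers the five components from that word.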
Figure 9.9: Oracle Turing machine (a machine in state q? writes x on its oracle tape and asks the oracle for A whether x is in A)
Nondeterministic model
All models of TM introduced so far can be considered as realistic models of classical comput-
ing. This is not the case for so-called nondeterministic Turing machines—a mathematically
very natural generalization of TM that plays a very important role in the study of the
potentials and limitations of classical computing.
A one-tape nondeterministic Turing machine (NTM) $M = \langle\Gamma, Q, q_0, \delta\rangle$ is defined
formally in a similar way to a one-tape deterministic TM, except that instead of a transition
function we have a transition relation
δ ⊂ Q × Γ × Q × Γ × {←, ↓, →}.
This means that a step of an NTM is not in general uniquely determined and several alterna-
tive steps may be offered to choose (nondeterministically). As a consequence, a configuration
C of an NTM may have several potential next configurations, and M can go, nondetermin-
istically, from C to one of them. We can therefore see the overall computational process
of an NTM not as a sequence (of subsequent) configurations, but as a tree of (branching)
configurations—see Figure 9.10.
Figure 9.10: Tree of configurations of NTM
We say that an NTM M accepts an input w (in time t(|w|) and space s(|w|)), if there
is at least one path in the configuration tree, with q0 w being the configuration at the root,
that ends in the accepting state (and it has the length at most t(|w|), and none of the
configurations on the path is longer than s(|w|)). This can be used to define, in a natural
way, when an NTM computes a relation or a function with certain time and space bounds.
For an NTM M let L(M) be the language accepted by M.
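The acceptance condition, the existence of at least one accepting path of bounded length in the configuration tree, can be sketched by a breadth-first search over configurations. This is an illustration only: a deterministic search may take exponential time, the transition relation is assumed to be a dictionary mapping (state, symbol) to a set of alternatives, and the example machine (which guesses where two consecutive 1s occur) is hypothetical.

```python
from collections import deque

def ntm_accepts(delta, q0, accepting, word, max_steps, blank="_"):
    """True if some path of length <= max_steps reaches an accepting state."""
    start = (q0, 0, tuple(word) if word else (blank,))
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        (state, head, tape), depth = frontier.popleft()
        if state in accepting:
            return True
        if depth == max_steps:
            continue
        for nstate, nsym, move in delta.get((state, tape[head]), ()):
            ntape = tape[:head] + (nsym,) + tape[head + 1:]
            nhead = head + {"L": -1, "S": 0, "R": 1}[move]
            if nhead < 0:                       # extend the tape to the left
                ntape, nhead = (blank,) + ntape, 0
            elif nhead == len(ntape):           # extend the tape to the right
                ntape = ntape + (blank,)
            config = (nstate, nhead, ntape)
            if config not in seen:
                seen.add(config)
                frontier.append((config, depth + 1))
    return False

# Hypothetical NTM accepting words over {0, 1} that contain "11".
contains_11 = {
    ("q0", "0"): {("q0", "0", "R")},
    ("q0", "1"): {("q0", "1", "R"), ("q1", "1", "R")},
    ("q1", "1"): {("acc", "1", "S")},
}
```

With this machine, `ntm_accepts(contains_11, "q0", {"acc"}, "0110", 10)` is true, while the word `"0101"` is not accepted because no branch of its configuration tree ends in the accepting state.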
Exercise 9.3.6 Show that for each NTM M we can design an NTM M′ that can make
exactly two moves in each nonterminating configuration, accepts the same language as
M, and there is an integer k such that M accepts an input w in t steps if and only if
M′ accepts w in kt steps.
Exercise 9.3.7 Design an NTM that decides in polynomial time whether a given Boolean
formula is satisfiable.
δ : Q × Γ × Q × Γ × {←, ↓, →} → [0, 1]
that has to satisfy the following local probability condition: For each q ∈ Q, σ ∈ Γ
$$\sum_{q', \sigma', d'} \delta(q, \sigma, q', \sigma', d') = 1.$$
$\delta(q, \sigma, q', \sigma', d')$ is considered as the probability that if M is in the state $q$ and reads $\sigma$, then
it makes the following move: the state $q$ is changed into $q'$, $\sigma$ is replaced by $\sigma'$ on the square
scanned by the head, and the head moves in the direction $d'$.
The transition function δ induces the probability for any two configurations C and C ′
that C moves into C ′ in one step.
A computation of a PTM M can be again seen as a configuration tree in which to
each configuration transfer a probability is assigned—see Figure 9.11. From that one can
derive the probability that a computation reaches a particular node—this is the product of
probabilities assigned to all edges (transfers) on the path from the root to the particular
node.
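Both the local probability condition and the probability of reaching a node can be sketched in a few lines. The representation of δ as a dictionary from 5-tuples to probabilities, the use of exact fractions and the toy transition table are assumptions made for this illustration.

```python
from fractions import Fraction

def satisfies_local_condition(delta, states, symbols):
    """Check that for each (q, sigma) the probabilities of all moves sum to 1."""
    return all(
        sum(p for (q, s, *_), p in delta.items() if (q, s) == (q0, s0)) == 1
        for q0 in states for s0 in symbols
    )

def path_probability(edge_probs):
    """Probability of reaching a node: product of the edge probabilities on its path."""
    prob = Fraction(1)
    for p in edge_probs:
        prob *= p
    return prob

# Hypothetical PTM fragment: in state q reading a, two moves with probability 1/2 each.
delta = {
    ("q", "a", "q", "a", "R"): Fraction(1, 2),
    ("q", "a", "h", "b", "S"): Fraction(1, 2),
}
```

Here `satisfies_local_condition(delta, ["q"], ["a"])` holds, and a node reached through edges with probabilities 1/2, 1/2 and 1/4 has probability 1/16.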
In contrast to NTM, probabilistic TM are considered to be a realistic model of compu-
tation.
Algorithms that make use of random numbers (generators) are often very simple, and
their efficiency is either comparable to or better than that of deterministic algorithms for
the same problem.
Figure 9.11: Configuration tree of a PTM
1. Choose randomly 1 ≤ j ≤ n.
If the random choice of j’s corresponds to the uniform distribution, then RQUICKSORT
requires on average Θ(n lg n) time.
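Only the first step of RQUICKSORT survives in the text above, so the following is a sketch of a standard randomized quicksort rather than a reconstruction of the book's algorithm: a pivot position is chosen uniformly at random, the input is split around the pivot, and both parts are sorted recursively.

```python
import random

def rquicksort(items, rng=random):
    """Sort a list using a uniformly random pivot in every recursive call."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]    # random choice of the pivot index
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return rquicksort(smaller, rng) + equal + rquicksort(larger, rng)
```

The result does not depend on the random choices, only the running time does; for example, `rquicksort([3, 1, 2, 3, 0])` always returns `[0, 1, 2, 3, 3]`.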
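$$\mathbf{P} = \bigcup_{k=0}^{\infty} \mathrm{Time}(n^k), \qquad \mathbf{NP} = \bigcup_{k=0}^{\infty} \mathrm{NTime}(n^k),$$
$$\mathbf{PSPACE} = \bigcup_{k=0}^{\infty} \mathrm{Space}(n^k), \qquad \mathbf{NPSPACE} = \bigcup_{k=0}^{\infty} \mathrm{NSpace}(n^k)$$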
2. Is it true that $L_1^R \neq L_2^R$ for a randomly chosen oracle R?
For example, there are oracles A, B, C, D such that: (1) $\mathbf{P}^A = \mathbf{NP}^A$; (2) $\mathbf{P}^B \neq \mathbf{NP}^B$; (3)
$\mathbf{NP}^C = \mathbf{PSPACE}^C$; (4) $\mathbf{P}^D \neq \mathbf{PSPACE}^D$.
The main outcome of these results is the understanding that methods that relativize
(i.e. methods which, whenever they establish a relation between complexity classes, also
establish the same relation between the relativized complexity classes) cannot be used
to separate such complexity classes as P, NP and PSPACE.
RP    x ∈ L ⇒ Pr(A(x) accepts) ≥ 1/2    x ∉ L ⇒ Pr(A(x) accepts) = 0
PP    x ∈ L ⇒ Pr(A(x) accepts) > 1/2    x ∉ L ⇒ Pr(A(x) accepts) ≤ 1/2
BPP   x ∈ L ⇒ Pr(A(x) accepts) ≥ 3/4    x ∉ L ⇒ Pr(A(x) accepts) ≤ 1/4
Class ZPP can also be defined as the class of problems that can be solved by PTM with
certainty in expected polynomial time.
Classes ZPP, RP and PP fall nicely into the hierarchy of deterministic complexity
classes presented in the previous subsection. Indeed, it holds:
There are two simple models of probabilistic Turing machines that are sufficient and
often more convenient for definition and study of polynomial time randomized complexity
classes.
The first model (see Figure 9.12a), is that of a one-tape Turing machine enhanced by
a special “randomized tape” and with the head which moves one square right after each
reading from the random tape. Before any computation the randomized tape is assumed
to be filled up with an infinite random binary string. At some steps of computation the
symbol under the current position of the head on random tape is read—this corresponds
to a “random coin-tossing” at such a step—and computation continues depending on the
outcome of the reading.
Another elegant approach to the definition of the randomized complexity classes is
through special non-deterministic Turing machines M in which there are exactly two next
configurations to each non-terminating configuration and terminating states of M are di-
vided into accepting and rejecting states. In addition, these Turing machines are assumed
to have all computational paths of the same length. In this setting the classes RP, PP and
BPP can be defined as follows:
Figure 9.12: (a) A Turing machine with a tape of random bits; (b) a configuration tree of a
special NTM
BPP A language L is in BPP (bounded error (away from 1/2) probabilistic polynomial time)
if there is a polynomial time-bounded NTM M such that
1. If x ∈ L, then at least 3/4 of the computations of M on x terminate in the accepting
state.
2. If x ∉ L, then at least 3/4 of the computations of M on x terminate in the rejecting
state.
The class BPP is currently considered the most important complexity class
containing languages (problems) solvable in polynomial randomized time, and a problem
(language) is considered feasible if and only if it is in BPP.
It is clear that P ⊆ BPP but neither of the relations NP ⊆ BPP nor BPP ⊆ NP has
been shown yet. There is a belief, and some evidence, but not a proof, that P ⊊ BPP ⊆
NP. One technical result supporting this hypothesis is that each language in BPP has
polynomial-size Boolean circuits but a similar result for NP does not seem to hold because
for NP-complete problems no polynomial-size circuits are known.
In the thesis we take the word “reasonable” to mean “in principle physically realisable”
and by “efficiently realizable on a probabilistic Turing machine” to be in the class BPP.
9.4 Exercises
1. Show that the following matrix is unitary:
1 1 1 1
2 2 2 2
1 i
− 12 i
2 2 2
A= 1 1 1 i .
2 −2 2 2
1
2 − 2i − 12 i
2
4. Express, using rotation, phase shift and the Pauli matrix σx , matrices: (a) σy ; (b) σz .
5. Show that Pauli matrices have the following properties: (a) σi σk + σk σi = 2δik , for
i, k ∈ {x, y, z}; (b) σy σz − σz σy = 2iσx ; (c) σz σx − σx σz = 2iσy .
7. How many of n input/output gates are reversible for: (a) n = 1 and n = 2; (b) n = 3;
(c) for an arbitrary n.
8. Compute eigenvalues and eigenvectors of the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$.
9. Determine eigenvalues and eigenvectors of the following matrices: (a) Hadamard ma-
trix; (b) XOR matrix.
10. Determine when two non-orthogonal states $|\phi_1\rangle$ and $|\phi_2\rangle$ can be written, by a suitable
choice of the basis, as $|\phi_1\rangle = \cos\alpha|0\rangle + \sin\alpha|1\rangle$ and $|\phi_2\rangle = \sin\alpha|0\rangle + \cos\alpha|1\rangle$.
12. Express the state $|\phi\rangle \otimes |\phi\rangle$, where $|\phi\rangle = \alpha|0\rangle + \beta|1\rangle$, in the Bell basis.
13. Design a matrix that maps Bell states in $H_4$ into the standard basis.
14. A family of bases of a Hilbert space H_n is called conjugate if each pair of bases
from the family is conjugate. (a) Find a family of three conjugate bases in H_2; (b)
(Wiesner, 1983) Show that for any n there is an m such that the Hilbert space H_m
has a family of n conjugate bases.
15. Find an example of an infinite matrix A such that A^2 is not well defined.
16. Show that if |φ⟩ and |ψ⟩ are orthogonal vectors of a Hilbert space H, then there exists
a Hermitian operator A in H such that |φ⟩ and |ψ⟩ are eigenvectors of A.
17. Let A, B and C be Hermitian matrices such that 0 ≤ B ≤ C. Show that CA = 0
implies BA = 0.
18. Show that the trace of a Hermitian or unitary matrix is equal to the sum of its
eigenvalues, and the determinant of such a matrix equals the product of its eigenvalues.
19. Show that if a matrix U is unitary and H is Hermitian, then U^T HU is Hermitian, and
if H is unitary, then U^T HU is unitary too.
20. Let A be a (bounded) linear operator of a Hilbert space H. Show that the following
conditions are equivalent: (1) A is positive; (2) A = B ∗ B for some bounded linear
operator B; (3) A = A∗ and all its eigenvalues are positive.
21. (a) Show that if A is a Hermitian matrix, then there is a unique Hermitian matrix
B, denoted $\sqrt{A}$, such that $B^2 = A$. (b) Show that there are Hermitian matrices A, B
such that $\sqrt{AB} \neq \sqrt{A}\sqrt{B}$.
22. Show that if A is a Hermitian matrix, then the matrix $e^{iA}$ is unitary.
23. Let $\{\phi_i\}_{i=1}^{n}$ be n orthonormal states of a Hilbert space H_n and $\phi_0$ an arbitrary state
of H_n. Show that $\sum_{i=1}^{n} \|\phi_i - \phi_0\|^2 \geq 2n - 2\sqrt{n}$.
24. Let S be a set of elements of an inner-product space H. Is the concept of the smallest
subspace containing S well defined and unique?
25. (Jozsa, 1994) Denote by [a, b, c] the matrix from Exercise 2b; (a) determine Tr [a, b, c];
(b) determine Tr([a, b, c][d, e, f]); (c) show that any two matrices [a, b, c] and [d, e, f] commute.
26. Show that the sequence of functions
\[ E = \{e_k\}_{k=-\infty}^{\infty}, \quad e_k(x) = \frac{e^{ikx}}{\sqrt{2\pi}}, \]
forms an orthonormal set on $L^2([0, 2\pi))$. (One of the fundamental results in the theory
of Fourier series is that E is a basis of $L^2([0, 2\pi))$.)
27. Design a quantum circuit consisting of elementary gates and realizing the mapping:
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & e^{i\phi} \end{pmatrix}. \]
28. Describe the mapping realized by the gates described by the matrices
\[ \frac{1}{2} \begin{pmatrix} 1 + e^{i\pi\alpha} & 1 - e^{i\pi\alpha} \\ 1 - e^{i\pi\alpha} & 1 + e^{i\pi\alpha} \end{pmatrix}, \]
(a)
Figure 9.13: Examples of networks
30. Determine what kind of unitary transformation is performed by the circuit in Fig-
ure 9.13 provided the square gate represents: (a) Hadamard transformation; (b) trans-
formation H ′ ; (c) Pauli matrices.
31. Show how to design a network to transform the state $\frac{1}{\sqrt{2}}(|0000\rangle + |1111\rangle)$ into the
four-qubit state $\frac{1}{\sqrt{8}} \sum_{i \text{ even}} |i\rangle$.
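32. Design a 3-qubit input/output network to map the states {|i⟩ | 0 ≤ i ≤ 7} into the
3-particle Bell states: $\frac{1}{\sqrt{2}}(|000\rangle \pm |111\rangle)$, $\frac{1}{\sqrt{2}}(|001\rangle \pm |110\rangle)$, $\frac{1}{\sqrt{2}}(|010\rangle \pm |101\rangle)$,
$\frac{1}{\sqrt{2}}(|100\rangle \pm |011\rangle)$.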
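33. (Barenco et al. 1995) Let
\[ U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix} \]
be a unitary matrix. For $m \in \{0, 1, 2, \ldots\}$ let $\wedge_m(U)$ be an (m + 1)-bit operator
defined by
\[ \wedge_m(U)(|x_1, \ldots, x_m, y\rangle) = \begin{cases} u_{y0}|x_1, \ldots, x_m, 0\rangle + u_{y1}|x_1, \ldots, x_m, 1\rangle & \text{if } \bigwedge_{k=1}^{m} x_k = 1, \\ |x_1, \ldots, x_m, y\rangle & \text{otherwise} \end{cases} \]
for all $x_1, \ldots, x_m, y \in \{0, 1\}$. (a) Describe the matrix $\wedge_m(U)$; (b) $\wedge_m(U)$ is called the
(m + 1)-bit Toffoli gate if $U = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; explain why.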
34. Show that if W ∈ U(2), then the $\wedge_1(W)$ gate can be simulated by the network from
Figure 9.14, where A, B and C are in SU(2), if and only if W ∈ SU(2).
[Figure 9.14 (not reproduced): network built from the gates W, A, B and C]
35. Show that for any special unitary matrix W ∈ SU(2) there exist matrices A, B, C ∈
SU(2) such that ABC = I and Aσx Bσx C = W.
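As a numerical sanity check for Exercise 5 (it does not replace the requested proof), the identities for the Pauli matrices can be verified directly. The sketch below uses plain Python tuples as 2×2 complex matrices; all helper names (mul, add, scale, commutator, anticommutator, check_pauli_identities) are hypothetical, and the right-hand side 2δik of part (a) is read as 2δik times the identity matrix.

```python
# Identity and Pauli matrices as nested tuples (rows of a 2x2 complex matrix).
I2 = ((1, 0), (0, 1))
SIGMA = {
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
}


def mul(a, b):
    """Matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def scale(c, a):
    """Scalar multiple c * a."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))


def commutator(a, b):
    return add(mul(a, b), scale(-1, mul(b, a)))


def anticommutator(a, b):
    return add(mul(a, b), mul(b, a))


def check_pauli_identities():
    # (a) sigma_i sigma_k + sigma_k sigma_i = 2 * delta_ik * I
    for i, si in SIGMA.items():
        for k, sk in SIGMA.items():
            if anticommutator(si, sk) != scale(2 if i == k else 0, I2):
                return False
    # (b) and (c): the two commutation relations stated in the exercise
    return (
        commutator(SIGMA["y"], SIGMA["z"]) == scale(2j, SIGMA["x"])
        and commutator(SIGMA["z"], SIGMA["x"]) == scale(2j, SIGMA["y"])
    )


print(check_pauli_identities())
```

All entries are small exact values, so plain equality is safe here; for general matrices a tolerance-based comparison would be the better choice.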
motivation was to demonstrate that the Post correspondence problem remains undecidable even if it is
restricted to injective morphisms. He did not notice that his universal reversible TM requires more
than polynomial time for simulation. His ideas were extended and clarified by Ruohonen (1985).
37 In addition, Bennett (1989) showed that reversibility does not cost too much in terms of time or space.
9.5. HISTORICAL AND BIBLIOGRAPHICAL REFERENCES 397
proof money” required facilities to store quantum states for a longer time, which was seen as
infeasible at that time. However, in the early 1980s Bennett started to realize the potential of
Wiesner’s ideas for cryptography, and the basic observations were published by Bennett
et al. (1982). Wiesner himself published his original paper in 1983. Of great importance for
the development of quantum cryptography was the first conference on quantum cryptography,
organized by A. K. Ekert in 1993. The first cryptographic protocol, known as BB84,
was published by Bennett and Brassard (1984). Bennett and his collaborators also made the
first steps to verify QKD ideas experimentally and in 1989, using polarized photons,
demonstrated transmission of photons over a distance of 32 cm. In 1991 Ekert published his ideas
on QKD based on entanglement, and in 1992 Bennett published a “minimal” QKD protocol,
known as B92, and suggested that it could be implemented using single-photon interference
with photons propagating over optical fibre for a long distance. Since then several significant
experiments have been performed with signals propagating over distances of several (tens
of) kilometres; see Marand and Townsend (1995), Muller et al. (1995), and Hughes et al.
(1996). Muller et al. (1995) described sending quantum signals from Nyon to Geneva using
an optical fibre deployed beneath Lake Geneva as the quantum channel. Impressive progress
has also been made in open-space quantum key generation: from a successful experiment over
a distance of 32 cm in 1989 to almost one kilometre (Buttler et al.). For more details on
the history of quantum cryptography see Brassard and Crépeau (1996) and Brassard (1988,
1994).
The history of attempts to develop QCP and to prove their (unconditional) security is
short but rich in interesting, deep and surprising developments.
The first quantum coin-flipping protocol was developed by Bennett and Brassard (1984),
who also showed its insecurity. Crépeau and Kilian (1988) developed a protocol for
oblivious transfer which was secure provided neither party could store photons for a long time
and only projection measurements were allowed. It was later shown (see Mayers (1996) for
the last step) that a secure quantum bit commitment is sufficient to get secure quantum
oblivious transfer. A protocol for quantum bit commitment, the so-called BCJL protocol, was
developed by Brassard, Crépeau, Jozsa and Langlois (1993), and it was claimed to be
unconditionally secure. (Actually, the BCJL protocol was a modification of the protocol
developed by Brassard and Crépeau (1991).) However, Lo and Chau (1997) and Mayers (1998)
have shown a way to cheat, using an “EPR-attack” and delaying the measurement. (The
cheating strategy requires a quantum computer!)
Bennett et al. (1993) is the basic reference for teleportation. The teleportation circuit is due
to Brassard (1996). The idea of superdense coding is due to Bennett and Wiesner (1992).
The first analysis of errors in quantum computation in the initial state (data), called
software errors, and in the Hamiltonian, called hardware errors, was performed by Zurek
(1984). Measurement (or readout) errors were first analyzed by Peres (1985).
The discoveries of quantum error-correcting codes and of fault-tolerant techniques for
quantum computation have been further surprising and important steps toward the
feasibility of quantum computing.
Quantum error correction was assumed to be impossible until 1995. A breakthrough came
with the pioneering papers of Shor (1995) and Steane (1996). Shor exhibited a way to encode
one qubit by an entangled state of nine qubits in such a way that a single bit or sign error
could be corrected. Steane was able to encode one qubit by seven qubits. An optimal way
to encode one qubit, by an entangled state of five qubits, was first shown by Laflamme et
al. (1996) and by Bennett et al. (1996).
The first attempts to develop a methodology for design of quantum error correcting codes
were due to Calderbank and Shor (1996), Steane (1996a) and Calderbank et al. (1996); the
idea was to make a quantum error-correcting code out of one or two classical error-correcting codes.
A more general approach to stabilizer or additive codes was due to Gottesman (1996) and
Calderbank and Shor (1997). The first example of a nonadditive code is due to Rains et
al. (1997).
Necessary and sufficient conditions for a subspace of a Hilbert space to form a quantum
error-correcting code were derived by Bennett et al. (1996a) and Knill and Laflamme (1997).
Quantum versions of the Hamming bound and of the Gilbert-Varshamov bound were
derived by Ekert and Macchiavello (1996). An infinite class of codes satisfying the quantum
Hamming bound was developed by Gottesman (1996). Of great importance was also the
establishment of the relation of quantum error-correcting codes to the preservation of
quantum entanglement in a noisy environment by Bennett et al. (1996a, 1996b). For a
survey on quantum error-correcting codes see Gottesman (1997) and Steane (1998).
The method of stabilization of quantum computation and memory by symmetrization,
presented in Section 7.3, is due to Barenco et al. (1997). Its origin goes back to Deutsch, in
1993, and Berthiaume et al. (1994).
Pioneering papers in the area of quantum fault-tolerant computation were due to Shor
(1996) and Zurek and Laflamme (1996). Shor has shown that there exist fault-tolerant
implementations for a universal set of gates. DiVincenzo and Shor (1996) showed a
systematic way to design fault-tolerant syndrome computation circuits for additive codes. A
general approach to fault-tolerant computation based on stabilizer codes was developed by
Gottesman (1997) and Steane (1998b).
Concatenated codes, as a way to make long transmissions of quantum information
reliable, are due to Knill and Laflamme (1996). The first paper pointing out the possibility of
fault-tolerant gates was due to Kitaev (1997). For a survey on fault-tolerant computing
see Preskill (1998). The idea of quantum repeaters is due to Briegel et al.
The concept of quantum entropy is due to von Neumann (1932), and an extensive survey
of quantum entropy results is due to Wehrl (1978). For classical results on communication
theory see Shannon and Weaver (1969). For “classical results” on information theory for
quantum systems see Levitin (1987). Schumacher’s (1995) noiseless quantum coding theorem
has been the basis of a new approach to quantum information theory based on inherently
quantum concepts. In addition, the basic definition of the capacity of quantum channels
is also due to Schumacher (1995). Another proof of Schumacher’s basic theorem was
given by Jozsa and Schumacher (1994). An explicit algorithm for performing Schumacher’s
noiseless compression of quantum bits was given by Cleve and DiVincenzo (1996). For
a quantum analogue of Shannon’s bound and code for noisy channel transmission see
Lloyd (1997). Basic results on dense coding are due to Ambainis et al. (1998). Transmission
fidelities were defined and explored by Jozsa and Schumacher (1994) and Jozsa (1994). For
an attempt to develop quantum analogues of basic concepts of classical information theory
see Cerf and Adami (1996).
For basic results concerning quantum purification see Bennett et al. (1996a, 1996b).
For the relation between entanglement purification and quantum error-correcting codes see
Bennett et al. (1996a). Basic results concerning quantum concentration are due to Bennett
et al. (1996) and Lo and Popescu (1997). Measures of entanglement are due to Bennett et
al. (1996a), Vedral and Plenio (1997), Vedral et al. and DiVincenzo et al. (1998). Bennett
(1998, 1998a) surveys current quantum information theory concepts and
results and also presents directions for further research. For a survey of results on quantum
information primitives and reducibilities see Bennett (1998).
For popular presentations of basic problems of quantum mechanics see Penrose (1990,
1994) and Lindley (1996). Introductions to quantum mechanics can be found in von
Neumann (1932), Dirac (1947) and Feynman, Leighton and Sands (1964). One of the main
references on quantum mechanics is Peres (1993), a graduate-level textbook and reference
book. For POV measurements see Busch, Grabowski and Lahti (1997). For an introduction
to Hilbert spaces and their relation to quantum mechanics see Cohen (1989) and Packel
(1974). For the history and philosophy of quantum mechanics see Jammer (1974) and Bub
(1997).
There are quite a few excellent introductory papers, surveys and tutorials on quantum
computing and communication: Ekert (1995), DiVincenzo (1995a, 1996), Svozil (1995),
Brassard (1995, 1996), Berthiaume (1997), Steane (1997), Aharonov (1998), Rieffel and
Polak (1998), Bennett (1998). Good surveys are also found in the PhD theses of Barenco (1996),
Hirvensalo (1996) and Gottesman (1997). See also the papers in a special issue of Physics Today
(March 1998).
The first two books on quantum computing are due to Williams and Clearwater (1997),
with various quantum simulations on CD-ROM, and Berman et al. (1998), oriented mainly
towards physicists and offering a systematic and detailed treatment of various implementation issues.
An excellent source of papers on quantum computing is the following WWW site:
http://xxx.lanl.gov/archive/quant-ph
of the Quantum Physics Archive at Los Alamos National Laboratory, with the following
average number of papers per month in each year: 1995 (26), 1996 (38), 1997 (54), 1998 (85).
Bibliography
[1] D. S. Abrams and Seth Lloyd. Nonlinear quantum mechanics implies polynomial-time solution
for NP-complete and #P complete problems. Technical report, quant-ph/9801041, 1998.
[2] Leonard M. Adleman, Jonathan DeMarrais, and Ming-Deh A. Huang. Quantum computability.
SIAM Journal on Computing, 26(5):1524–1540, 1997.
[3] Dorit Aharonov. Quantum computation. Technical report, quant-ph/9812037, 1998.
[4] Dorit Aharonov, Alexei Kitaev, and Noam Nisan. Quantum circuits with mixed states. In
Proceedings of 30th ACM STOC, pages 20–30, 1998. quant-ph/9806029.
[5] Y. Aharonov and J. Anandan. Meaning of the density matrices. Technical report,
quant-ph/9803018, 1998.
[6] Eric Allender and Mitsunori Ogihara. Relationship among PL, #L and the determinant.
RAIRO, 30:1–21, 1996.
[7] L. Allen and Joseph H. Eberly. Optical resonance and two-level atoms. Dover Publications,
1975.
[8] Masami Amano and Kazuo Iwama. Undecidability on quantum finite automata. In Proceed-
ings of 31st ACM STOC, page to be published, 1999.
[9] Andris Ambainis and Rūsiņš Freivalds. 1-way quantum finite automata: strengths,
weaknesses and generalizations. In Proceedings of 39th IEEE FOCS, pages 332–341, 1998.
quant-ph/9802062.
[10] Andris Ambainis, Ashwin Nayak, Amnon Ta-Shma, and Umesh Vazirani. Dense quantum
coding and a lower bound for 1-way quantum finite automata. Technical report,
quant-ph/9804043, 1998.
[11] Andris Ambainis, Leonard Schulman, Amnon Ta-Shma, Umesh Vazirani, and Avi Wigderson.
The quantum communication complexity of sampling. In Proceedings of 39th IEEE FOCS,
pages 342–351, 1998.
[12] Serafino Amoroso and Yale N. Patt. Decision procedures for surjectivity and injectivity of
parallel maps for tessellation structures. Journal of Computer and System Sciences,
6:448–464, 1972.
[13] Mohammad Ardehali, Gilles Brassard, H. F. Chau, and Hoi-Kwong Lo. Efficient quantum
key distribution. Technical report, quant-ph/9803007, 1998.
[14] Alain Aspect, Jean Dalibard, and Gérard Roger. Experimental tests of Bell’s inequalities
using time-varying analyzers. Physical Review Letters, 49:1804–1807, 1982.
[15] Leslie E. Ballentine. Quantum mechanics. Prentice Hall, 1970.
[16] Adriano Barenco. Dense coding based on quantum entanglement. Journal of Modern Optics,
42:1253–1259, 1995.
[17] Adriano Barenco. A universal two-bit gate for quantum computation. Proceedings of Royal
Society London A, 449:679–683, 1995.
[18] Adriano Barenco. Quantum computation. PhD thesis, University of Oxford, 1996.
[19] Adriano Barenco, Charles H. Bennett, Richard Cleve, David P. DiVincenzo, Norman Margo-
lus, Peter W. Shor, Tycho Sleator, John A. Smolin, and Harald Weinfurter. Elementary gates
of quantum computation. Physical Review A, 52(5):3457–3467, 1995.
[20] Adriano Barenco, André Berthiaume, David Deutsch, Artur K. Ekert, Richard Jozsa, and
Chiara Macchiavello. Stabilization of quantum computations by symmetrization. SIAM
Journal on Computing, 26(5):1541–1557, 1997. quant-ph/9604028.
[21] Adriano Barenco, David Deutsch, Artur K. Ekert, and Richard Jozsa. Conditional quantum
dynamics and logic gates. Physical Review Letters, 74:4083–4086, 1995a.
[22] Adriano Barenco and Artur K. Ekert. Quantum computation. Acta Physica Slovaca,
45(3):205–216, 1995. In Proceedings of 3rd Central-European Workshop on Quantum Optics.
[23] Adriano Barenco, Artur K. Ekert, Kalle-Antti Suominen, and Päivi Törmä. Approximate
quantum Fourier transform and decoherence. Physical Review A, 54(1):139–146, 1996b.
[24] Stephen M. Barnett and Simon J. D. Phoenix. Bell’s inequality and rejected-data protocols
for quantum cryptography. Journal of Modern Optics, 40(8):1443–1448, 1993.
[25] Howard Barnum, C. M. Caves, Christopher A. Fuchs, Richard Jozsa, and Benjamin W.
Schumacher. Noncommuting mixed states cannot be broadcast. Physical Review Letters,
76(15):2828–2831, 1996. quant-ph/9511010.
[26] Howard Barnum, Emanuel H. Knill, and Michael A. Nielsen. On quantum fidelities and channel
capacities. Technical report, quant-ph/9809010, 1998.
[27] Howard Barnum, Michael A. Nielsen, and Benjamin W. Schumacher. Information transmission
through noisy quantum channels. Physical Review A, 57:4153–4175, 1997.
[28] Howard Barnum, John Smolin, and Barbara Terhal. Results on quantum channel capacity.
Technical report, quant-ph/9711032, 1997a.
[29] Robert Beals. Quantum computation of Fourier transforms over symmetric groups. In Pro-
ceedings of 29th ACM STOC, pages 48–53, 1997.
[30] Robert Beals, Harry Buhrman, Richard Cleve, and Michele Mosca. Quantum lower bounds
for polynomials. In Proceedings of 39th IEEE FOCS, pages 352–361, 1998.
[31] David Beckman, Amalovoyal N. Chari, Srikrishna Devabhaktuni, and John Preskill. Efficient
networks for quantum factoring. Physical Review A, 54:1034–1063, 1996.
[32] John S. Bell. On the Einstein-Podolsky-Rosen paradox. Physics 1, pages 195–200, 1964.
Reprinted in Quantum Theory and Measurement (eds. J. A. Wheeler and W. H. Zurek),
pages 403–408.
[33] Paul A. Benioff. The computer as a physical system: A microscopic quantum mechanical
Hamiltonian model of computers as represented by Turing machines. Journal of Statistical
Physics, 22:563–591, 1980.
[34] Paul A. Benioff. Quantum mechanical Hamiltonian models of discrete processes that erase
their own histories: application to Turing machines. International Journal of Theoretical
Physics, 21(3/4):177–202, 1982.
[35] Paul A. Benioff. Quantum mechanical Hamiltonian models of Turing machines that dissipate
no energy. Physical Review Letters, 48:1581–1585, 1982a.
[36] Paul A. Benioff. Models of quantum Turing machines. Fortschritte der Physik,
46(4-5):423–441, 1998.
[37] Charles H. Bennett. Logical reversibility of computation. IBM Journal of Research and
Development, 17:525–532, 1973.
[38] Charles H. Bennett. Notes on the history of reversible computation. IBM Journal of Research
and Development, 32(1):16–23, 1988.
[39] Charles H. Bennett. Logical depth and physical complexity. In The Universal Turing
Machine: a Half-Century Survey, pages 227–257. Kammerer & Unverzagt, Hamburg, 1988d.
[40] Charles H. Bennett. Time/space trade-offs for reversible computation. SIAM Journal on
Computing, 18:766–776, 1989.
[41] Charles H. Bennett. Quantum cryptography using any two nonorthogonal states. Physical
Review Letters, 68(21):3121–3124, 1992.
[42] Charles H. Bennett. Future directions for quantum information. In Introduction to quantum
computation and information, page to be published. World Scientific, 1998.
[43] Charles H. Bennett. Quantum information. Physica Scripta, T76:210–217, 1998a.
[44] Charles H. Bennett. Quantum information. Technical report, Tutorial, MFCS’98, 1998b.
[45] Charles H. Bennett, Ethan Bernstein, Gilles Brassard, and Umesh Vazirani. Strength and
weaknesses of quantum computation. SIAM Journal on Computing, 26(5):1510–1523, 1997.
[46] Charles H. Bennett, Herbert J. Bernstein, Sandu Popescu, and Benjamin W. Schumacher.
Concentrating partial entanglement by local operations. Physical Review A, 53:2046–2053,
1996.
[47] Charles H. Bennett, François Bessette, Gilles Brassard, Louis Salvail, and John A. Smolin.
Experimental quantum cryptography. Journal of Cryptology, 5(1):3–28, 1992.
[48] Charles H. Bennett and Gilles Brassard. Quantum cryptography: public key distribution
and coin tossing. In Proceedings of IEEE Conference on Computers, Systems and Signal
processing, Bangalore (India), pages 175–179, 1984.
[49] Charles H. Bennett and Gilles Brassard. The dawn of a new era for quantum cryptography.
The experimental prototype is working! SIGACT News, 20(4):78–82, 1989.
[50] Charles H. Bennett, Gilles Brassard, S. Breidbart, and Stephen J. Wiesner. Quantum
cryptography, or unforgeable subway tokens. In Proceedings of Crypto’82, pages 267–275, 1982.
[51] Charles H. Bennett, Gilles Brassard, Claude Crépeau, Richard Jozsa, Asher Peres, and
William K. Wootters. Teleporting an unknown quantum state via dual classical and
Einstein-Podolsky-Rosen channels. Physical Review Letters, 70:1895–1899, 1993.
[52] Charles H. Bennett, Gilles Brassard, Claude Crépeau, and Marie-Hélène Skubiszewska.
Practical quantum oblivious transfer. In Advances in Cryptology, Proceedings of Crypto’91,
LNCS 576, Springer-Verlag, pages 351–366, 1991.
[53] Charles H. Bennett, Gilles Brassard, and N. David Mermin. Quantum cryptography without
Bell’s theorem. Physical Review Letters, 68(5):557–559, 1992a.
[54] Charles H. Bennett, Gilles Brassard, Sandu Popescu, Benjamin W. Schumacher, John A.
Smolin, and William K. Wootters. Purification of noisy entanglement and faithful teleportation
via noisy channels. Physical Review Letters, 76(5):722–725, 1996b.
[55] Charles H. Bennett, Gilles Brassard, and J.-M. Robert. Privacy amplification by public
discussion. SIAM Journal on Computing, 17(2):210–229, 1988.
[56] Charles H. Bennett, David P. DiVincenzo, Christopher A. Fuchs, Tal Mor, Eric M. Rains,
Peter W. Shor, and John A. Smolin. Quantifying non-locality without entanglement. Technical
report, quant-ph/9804053, 1998.
[57] Charles H. Bennett, David P. DiVincenzo, and John A. Smolin. Capacities of quantum erasure
channels. Technical report, quant-ph/9701015, 1997a.
[58] Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters. Mixed
state entanglement and quantum error correction. Physical Review A, 54:3824–3851, 1996a.
quant-ph/9604024.
[59] Charles H. Bennett, Christopher A. Fuchs, and John A. Smolin. Quantum Communication,
Computing and Measurement, chapter Entanglement-enhanced classical communication on a
noisy quantum channel, pages 79–88. Plenum, New York, 1997b.
[60] Charles H. Bennett and Stephen J. Wiesner. Communication via one- and two-particle
operators on Einstein-Podolsky-Rosen states. Physical Review Letters, 69(20):2881–2884, 1992.
[61] Gennady P. Berman, Gary D. Doolen, Ronnie Mainieri, and Vladimir I. Tsifrinovich.
Introduction to quantum computing. World Scientific, 1998.
[62] Ethan Bernstein. Quantum complexity theory. PhD thesis, University of California, Berkeley,
1997.
[63] Ethan Bernstein and Umesh Vazirani. Quantum complexity theory. In Proceedings of 25th
ACM STOC, pages 11–20, 1993.
[64] Ethan Bernstein and Umesh Vazirani. Quantum complexity theory. SIAM Journal on
Computing, 26(5):1411–1473, 1997.
[65] André Berthiaume. Complexity theory retrospectives II, chapter Quantum computation, pages
23–51. Springer-Verlag, 1997.
[66] André Berthiaume and Gilles Brassard. Oracle quantum computing. In Proceedings of the
Workshop on Physics of Computation, pages 195–199, 1992.
[67] André Berthiaume and Gilles Brassard. Oracle quantum computing. In Proceedings of 7th
IEEE Conference on Structure in Complexity Theory, pages 132–137, 1992a.
[68] André Berthiaume and Gilles Brassard. The quantum challenge to structural complexity. In
Proceedings of Structure in Complexity Conference, pages 132–137, 1992b.
[69] André Berthiaume and Gilles Brassard. Oracle quantum computing. Journal of Modern
Optics, 41(12):2521–2535, 1994.
[70] André Berthiaume, David Deutsch, and Richard Jozsa. The stabilization of quantum com-
putation. In Proceedings of the Third Workshop on Physics and Computation, pages 60–62.
IEEE Computer Society Press, 1994.
[71] Michael Biafore. Can computers have simple Hamiltonians? In Proceedings of Physics and
Computation, PhysComp’94, pages 63–69, 1994.
[72] Ivo Bialynicki-Birula and J. Mycielski. Uncertainty relations for information entropy in wave
mechanics. Commun. Math. Phys., 44:129–132, 1975.
[73] Eli Biham, Ofer Biham, David Biron, Markus Grassl, and Daniel A. Lidar. Exact solution
of Grover’s quantum search algorithm for arbitrary initial amplitude distribution. Technical
report, quant-ph/9807027, 1998.
[74] Eli Biham, Bruno Huttner, and Tal Mor. Quantum cryptographic network based on quantum
memories. Physical Review A, 54(4):2651–2658, 1996.
[75] David Biron, Ofer Biham, Markus Grassl, and Daniel A. Lidar. Generalized Grover search
algorithm for arbitrary initial amplitude distribution. Technical report, quant-ph/9801066,
1998.
[76] Manuel Blum. Coin flipping by telephone. A protocol for solving impossible problems. In
Proceedings of the 24th IEEE FOCS, pages 133–137, 1982.
[77] Bruce M. Boghosian and Washington Taylor IV. Simulating quantum mechanics on a quantum
computer. Physica D, 120:30–42, 1997. quant-ph/9701019.
[78] David Bohm. A suggested interpretation of the quantum theory in terms of “hidden” variables
I and II. Physical Review, 85:166–193, 1952. Also in Quantum Theory and Measurement (eds.
J. A. Wheeler and W. H. Zurek), Princeton University Press, 1983.
[79] David Bohm and Basil J. Hiley. The undivided universe: an ontological interpretation of
quantum theory. Routledge, 1994.
[80] D. Boschi, S. Branca, F. DeMartini, L. Hardy, and Sandu Popescu. Experimental realization
of teleporting an unknown pure quantum state via dual classical Einstein–Podolsky–Rosen
channels. Physical Review Letters, 80:1121–1125, 1998.
[81] Sugato Bose, Martin B. Plenio, and Vlatko Vedral. Mixed state dense coding and its relation
to entanglement measures. Technical report, quant-ph/9810025, 1998.
[82] Dik Bouwmeester, Jian-Wei Pan, Matthew Daniel, Harald Weinfurter, and Anton Zeilinger.
Observation of three-photon Greenberger-Horne-Zeilinger entanglement. Technical report,
quant-ph/9810035, 1998.
[83] Dik Bouwmeester, Jian-Wei Pan, K. Mattle, M. Eibl, Harald Weinfurter, and Anton Zeilinger.
Experimental quantum teleportation. Nature, 390(6660):575–579, December 1997.
[84] Michel Boyer, Gilles Brassard, Peter Høyer, and Alain Tapp. Tight bounds on quantum
searching. In Fourth Workshop on Physics and Computation, Eds. T. Toffoli, M. Biafore, J.
Leao, pages 36–43. New England Complex Systems Institute, 1996. See also Fortschritte der
Physik, 46(4-5):493–505, 1998, and quant-ph/9605034.
[85] Gilles Brassard. Modern cryptography. A tutorial. Springer-Verlag, LNCS 325, 1988.
[86] Gilles Brassard. Cryptology column—quantum computing: the end of classical cryptography?
SIGACT News, 25(4):15–21, 1994.
[87] Gilles Brassard. Computer Science Today, chapter A quantum jump in Computer Science,
pages 1–14. Springer-Verlag, LNCS 1000, 1995.
[88] Gilles Brassard. New trends in quantum computing. In Proceedings of STACS’96, pages 3–10,
1996.
[89] Gilles Brassard. Teleportation as quantum computation. Proceedings of PhysComp-96, Phys-
ica D, 120:43–47, 1998.
[90] Gilles Brassard and Claude Crépeau. Quantum bit commitment and coin tossing protocols.
In Advances in Cryptology, CRYPTO’90, pages 49–61. LNCS 537, Springer-Verlag, 1991.
[91] Gilles Brassard and Claude Crépeau. Cryptology column—25 years of quantum cryptography?
SIGACT News, 27(3):13–24, 1996.
[92] Gilles Brassard, Claude Crépeau, Richard Jozsa, and Denis Langlois. A quantum bit com-
mitment scheme provably unbreakable by both parties. In Proceedings of 34th IEEE FOCS,
pages 362–371, 1993.
[93] Gilles Brassard, Claude Crépeau, Dominic C. Mayers, and Louis Salvail. A brief review on
the impossibility of quantum bit commitment. Technical report, quant-ph/9712023, 1997.
[94] Gilles Brassard, Claude Crépeau, Dominic C. Mayers, and Louis Salvail. Defeating classical
bit commitments with a quantum computer. Technical report, quant-ph/9806031, 1998b.
[95] Gilles Brassard and Peter Høyer. On the power of exact quantum polynomial time. Technical
report, quant-ph/9612017, 1996.
[96] Gilles Brassard and Peter Høyer. An exact quantum polynomial-time algorithm for Simon’s
problem. In Proceedings of Israeli Symposium on Theory of Computing and systems, pages
12–23, 1997.
[97] Gilles Brassard, Peter Høyer, and Alain Tapp. Quantum cryptanalysis of hash and claw-free
functions. In Proceedings of LATIN’98, pages 163–169. LNCS 1380, Springer-Verlag, 1998.
quant-ph/9705002.
[98] Gilles Brassard, Peter Høyer, and Alain Tapp. Quantum counting. Technical report, quant-
ph/9805082, 1998a.
[99] H.-J. Briegel, W. Dür, Juan I. Cirac, and Peter Zoller. Quantum repeaters for communication.
Technical report, quant-ph/9803056, 1998.
[100] Dagmar Bruß. Optimal eavesdropping in quantum cryptography with six states. Technical
report, quant-ph/9805019, 1998.
[101] Jeffrey Bub. Interpreting the quantum world. Cambridge University Press, 1997.
[102] Harry Buhrman, Richard Cleve, and Avi Wigderson. Quantum vs. classical communication
and computation. In Proceedings of 30th ACM STOC, pages 63–68, 1998. quant-ph/9802040.
[103] Harry Buhrman and Ronald de Wolf. Lower bounds for quantum search and derandomization.
Technical report, quant-ph/9811046, 1998.
[104] Guido Burkard, Daniel Loss, and David P. DiVincenzo. Coupled quantum dots as quantum
gates. Technical report, quant-ph/9808026, 1998.
[105] Arthur W. Burks. J. von Neumann: Theory of self-reproducing automata. University of
Illinois Press, 1966.
[106] Paul Busch, Marian Grabowski, and Pekka J. Lahti. Operational quantum physics. Springer,
1997.
[130] David G. Cory, Amr Fahmy, and Timothy F. Havel. Nuclear magnetic resonance spectroscopy:
an experimentally accessible paradigm for quantum computing. In Proceedings of Physics and
Computation, PhysComp96, pages 87–91, 1996.
[131] David G. Cory, W. Mass, M. Price, Emanuel H. Knill, Raymond Laflamme, Wojciech H.
Zurek, Timothy F. Havel, and S. S. Somaroo. Experimental quantum error correction. Physical
Review Letters, 81:2152–2155, 1998.
[132] Claude Crépeau. Equivalence between two flavours of oblivious transfers. In Proceedings of
CRYPTO’87, LNCS 293, Springer-Verlag, pages 350–354, 1987.
[133] Claude Crépeau. Quantum oblivious transfer. Journal of Modern Optics, 41(12):2445–2454,
1994.
[134] Claude Crépeau and Joe Kilian. Achieving oblivious transfer using weakened security
assumptions. In Proceedings of 29th IEEE FOCS, pages 42–52, 1988.
[135] Marek Czachor. Notes on nonlinear quantum algorithms. Technical report, quant-ph/9802051,
1998.
[136] Edward B. Davies. Quantum theory of open systems. Academic Press, 1976.
[137] Louis de Broglie. Une tentative d’interprétation causale et non linéaire de la mécanique
ondulatoire. Gauthier-Villars, 1956.
[138] David Deutsch. Uncertainty in quantum measurement. Physical Review Letters, 50:631–633,
1983.
[139] David Deutsch. Quantum theory, the Church-Turing principle and the universal quantum
computer. Proceedings of the Royal Society London A, 400:97–117, 1985.
[140] David Deutsch. Quantum computational networks. Proceedings of Royal Society of London
A, 425:73–90, 1989.
[141] David Deutsch. The fabric of reality. Allen Lane, The Penguin Press, 1997.
[142] David Deutsch, Adriano Barenco, and Artur K. Ekert. Universality in quantum computation.
Proceedings of Royal Society London A, 449:669–677, 1995. quant-ph/9505018.
[143] David Deutsch, Artur K. Ekert, Richard Jozsa, Chiara Macchiavello, Sandu Popescu, and
Anna Sanpera. Quantum privacy amplification and the security of quantum cryptography
over noisy channels. Physical Review Letters, 77(13):2818–2821, 1996.
[144] David Deutsch and Richard Jozsa. Rapid solution of problems by quantum computation.
Proceedings of the Royal Society London A, 439:553–558, 1992.
[145] D. Dieks. Communication by EPR devices. Physics Letters A, 92:271–272, 1982.
[146] Paul A. M. Dirac. The principles of quantum mechanics. Oxford University Press, 1947.
[147] David P. DiVincenzo. Two-bit gates are universal for quantum computation. Physical Review
A, 51:1015–1022, 1995.
[148] David P. DiVincenzo. Quantum computation. Science, 270:255–261, 1995a.
[149] David P. DiVincenzo. Quantum gates and circuits. Proceedings of the Royal Society of London,
Series A, 454:261–276, 1998b. quant-ph/9705009.
[150] David P. DiVincenzo, Christopher A. Fuchs, Hideo Mabuchi, John A. Smolin, Ashish Thap-
liyal, and Armin Uhlmann. Entanglement of assistance. Technical report, quant-ph/9803033,
1998.
[151] David P. DiVincenzo and Daniel Loss. Quantum information is physical. Superlattices and
Microstructures, 23:419–432, 1997. cond-mat/9710259.
[152] David P. DiVincenzo, Tal Mor, Peter W. Shor, John A. Smolin, and Barbara M. Terhal.
Unextendible product bases and bound entanglement. Technical report, quant-ph/9808030,
1998a.
[153] David P. DiVincenzo and Asher Peres. Quantum code words contradict local realism. Physical
Review A, 55(6):4089–4092, 1997.
[154] David P. DiVincenzo and Peter W. Shor. Fault-tolerant error correction with efficient quantum
codes. Physical Review Letters, 77:3260–3263, 1996.
[155] David P. DiVincenzo and Barbara M. Terhal. Decoherence: the obstacle to quantum
computing. Physics World, pages 53–57, March 1998.
[156] Peter Domokos, Jean-Michel Raimond, Michel Brune, and Serge Haroche. Simple cavity-
QED two-bit universal quantum logic gate: the principle and expected performance. Physical
Review A, 52(5):3554–3559, 1995.
[157] Lu-Ming Duan and Guang-Can Guo. Reducing decoherence in quantum computer memory
with all quantum bits coupling to the same environment. Technical report, quant-ph/9612003,
1996.
[158] Jean-Christophe Dubacq. How to simulate Turing machines by invertible one-dimensional
cellular automata. International Journal of Foundations of Computer Science, 6:395–402,
1995.
[159] Christoph Dürr. Automates cellulaires quantiques finis. PhD thesis, LRI, Université
Paris-Sud, 1997.
[160] Christoph Dürr and Peter Høyer. A quantum algorithm for finding the minimum. Technical
report, Université Paris-Sud-Odense, quant-ph/9607014, 1996.
[161] Christoph Dürr, Huong LêThanh, and Miklos Santha. A decision procedure for well-formed
linear quantum cellular automata. Random Structures and Algorithms, 11(4):381–394, 1996.
Preliminary version in Proceedings of 13th STACS’96, 281-292.
[162] Christoph Dürr and Miklos Santha. A decision procedure for unitary linear quantum cellular
automata. In Proceedings of 37th IEEE FOCS, pages 38–45, 1996. To appear in SIAM Journal
of Computing, quant-ph/9604007.
[163] Cynthia Dwork and Larry Stockmeyer. On the power of 2-way probabilistic finite automata.
In Proceedings of 30th IEEE FOCS, pages 480–485, 1989.
[164] Albert Einstein, Boris Podolsky, and Nathan Rosen. Can quantum mechanical description of
physical reality be considered complete? Physical Review, 47:777–780, 1935. Also in Quantum
theory and measurement, ed. J. A. Wheeler, Wojciech H. Zurek, Princeton University Press,
1983.
[165] Artur K. Ekert. Quantum cryptography based on Bell’s theorem. Physical Review Letters,
67(6):661–663, 1991.
[166] Artur K. Ekert. Quantum computation. In Atomic Physics 14 - 14th International Conference
on Atomic Physics, Boulder, 1994, AIP Proceedings 323 (AIP Press, New York), pages 450–
4??, 1995.
[167] Artur K. Ekert. From quantum code-making to quantum code-breaking. Technical report,
quant-ph/9703035, 1997.
[168] Artur K. Ekert and Richard Jozsa. Quantum computation and Shor’s quantum factoring
algorithm. Reviews of Modern Physics, 68:733–753, 1996.
[169] Artur K. Ekert and Richard Jozsa. Quantum algorithms: entanglement-enhanced information
processing. Philosophical Transactions of Royal Society, A356:1762–1782, 1998.
quant-ph/9803072.
[170] Artur K. Ekert and Chiara Macchiavello. Quantum error correction for communication.
Physical Review Letters, 77:2585–2588, 1996. quant-ph/9602022.
[171] Artur K. Ekert, John Rarity, Paul Tapster, and G. Palma. Practical quantum cryptography
based on two-photon interferometry. Physical Review Letters, 69:1293–1295, 1992.
[172] Mark Ettinger and Peter Høyer. On quantum algorithms for noncommutative subgroups.
Technical report, quant-ph/9807029, 1998.
[173] Mark Ettinger and Peter Høyer. Hidden subgroup states are almost orthogonal. Technical
report, quant-ph/9901034, 1999.
[174] Mark Ettinger and Peter Høyer. A quantum observable for the graph isomorphism problem.
Technical report, quant-ph/9901029, 1999a.
[175] Shimon Even, Oded Goldreich, and A. Lempel. A randomized protocol for signing contracts.
In Proceedings of Crypto’82, Plenum Press, pages 205–210, 1983.
[176] Hugh Everett. The theory of the universal wave function. In The many-world interpretation
of quantum mechanics, pages 1–140. Princeton University Press, 1977. Reprinted PhD Thesis.
[177] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser. A limit on the speed
of quantum computation in determining parity. Technical report, quant-ph/9802045, 1998.
[178] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser. A limit on the speed of
quantum computation for insertion into an ordered list. Technical report, quant-ph/9812057,
1998a.
[179] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser. How many functions
can be distinguished with k quantum queries? Technical report, quant-ph/9901012, 1999.
[180] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser. Invariant quantum
algorithms for insertion into an ordered list. Technical report, quant-ph/9901059, 1999.
[181] Stephen Fenner, Frederic Green, Steven Homer, and Randall Pruim. Quantum NP is hard
for PH. In Proceedings of 6th Italian Conference on Theoretical Computer Science, pages
241–252, 1998. quant-ph/9812056.
[182] Richard P. Feynman. Simulating physics with computers. International Journal of Theoretical
Physics, 21(6/7):467–488, 1982.
[183] Richard P. Feynman. Quantum mechanical computers. Foundations of Physics, pages 507–531,
1986. Originally appeared in Optics News, February 1985, pages 11–20.
[184] Richard P. Feynman. Feynman lectures on computation. Addison-Wesley, 1995. Edited by
Anthony J. G. Hey and Robin W. Allen.
[185] Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The Feynman lectures on
physics, Volume 3. Addison-Wesley, 1964.
[186] Lance Fortnow and John Rogers. Complexity limitations on quantum computation. In Pro-
ceedings of 13th IEEE Conference on Computational Complexity, pages 202–209, 1998.
[187] Edward Fredkin. Digital mechanics. Physica D, 45:254–270, 1990.
[188] Edward Fredkin and Tommaso Toffoli. Conservative logic. International Journal of Theoretical
Physics, 21(3-4):219–253, 1982.
[189] Rūsiņs̆ Freivalds. Probabilistic two-way machines. In Proceedings of MFCS’81, LNCS 118,
pages 33–45, 1981.
[190] Christopher A. Fuchs and Asher Peres. Quantum-state disturbance versus information gain:
uncertainty relation for quantum information. Physical Review A, 53(4):2038–2045, 1996.
[191] Neil A. Gershenfeld and Isaac L. Chuang. Bulk spin-resonance quantum computation. Science,
275:350–356, 1997.
[192] Neil A. Gershenfeld, Isaac L. Chuang, and Seth Lloyd. Bulk quantum computation. In
Proceedings of the 4th Workshop on Physics and Computation. Phys96, pages 134–134, 1996.
[193] James Glanz. Measurements are the only reality, say quantum tests. Science, 270:1439–1440,
1995.
[194] R. J. Glauber. Frontiers in Quantum Optics, chapter ?????, pages ???–???. Adam Hilger,
Bristol, 1988.
[195] Daniel Gottesman. Class of quantum error correcting codes saturating the quantum Hamming
bound. Physical Review A, 54:1862–1868, 1996. Preprint quant-ph/9604038.
[196] Daniel Gottesman. Stabilizer codes and quantum error correction. PhD thesis, California
Institute of Technology, 1997. quant-ph/9705052.
[197] Daniel Gottesman. A theory of fault-tolerant quantum computation. Physical Review A,
57:127–137, 1997a. quant-ph/9702029.
[198] Daniel Gottesman. The Heisenberg representation of quantum computers. Technical report,
quant-ph/9807006, 1998.
[199] Markus Grassl, Thomas Beth, and Thomas Pellizzari. Codes for quantum erasure channels.
Physical Review A, 56:33–38, 1997. quant-ph/9610042.
[200] Daniel M. Greenberger, Michael A. Horne, and Anton Zeilinger. Bell’s theorem and the con-
ception of the universe, chapter Going beyond Bell’s theorem, pages 69–. Kluwer Academic,
Dordrecht, 1989.
[201] Robert B. Griffiths and Chi-Sheng Niu. Semiclassical Fourier transform for quantum compu-
tation. Physical Review Letters, 76(17):3228–3231, 1996.
[202] Gerhard Grössing and Anton Zeilinger. Quantum cellular automata. Complex systems, 2:197–
208, 1988. Originally appeared in Proceedings of Conference on Cellular Automata, Cam-
bridge, MA, 1986.
[203] Lov K. Grover. A fast quantum mechanical algorithm for estimating a median. Technical
report, quant-ph/9607024, 1996.
[204] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of
28th ACM STOC, pages 212–219, 1996a.
[205] Lov K. Grover. Quantum computer can search arbitrarily large databases by a single query.
Physical Review Letters, 79:4709–4712, 1997.
[206] Lov K. Grover. Quantum mechanics helps in searching for a needle in a haystack. Physical
Review Letters, 78:325–328, 1997a. quant-ph/9605043.
[207] Lov K. Grover. Quantum telecomputation. Technical report, quant-ph/9704012, 1997b.
[208] Lov K. Grover. A framework for fast quantum mechanical algorithms. In Proceedings of 30th
ACM STOC, pages 53–62, 1998.
[209] Lov K. Grover. How fast can a quantum computer search? Technical report,
quant-ph/9809029, 1998a.
[210] Lov K. Grover. Quantum search on structured problems. Technical report, quant-ph/9802035,
1998b.
[211] Jozef Gruska. Why we should not any longer only repair, polish and iron computer science
education? Education and Computers, 8:303–330, 1993.
[212] Jozef Gruska. Foundations of computing. Thomson International Computer Press, 1997.
[213] Alfréd Haar. Zur Theorie der orthogonalen Funktionssysteme. Mathematische Annalen,
LXIX:331–371, 1910.
[214] E. Hagley, X. Maître, G. Nogues, C. Wunderlich, M. Brune, Jean-Michel Raimond, and Serge
Haroche. Generation of Einstein-Podolsky-Rosen pairs of atoms. Physical Review Letters,
79(1):1–5, 1997.
[215] Lisa Hales and Sean Hallgren. Sampling Fourier transforms on different domains. Technical
report, quant-ph/9812060, 1998.
[216] Paul Hausladen, Richard Jozsa, Benjamin Schumacher, Michael Westmoreland, and William
K. Wootters. Classical information capacity of a quantum channel. Physical Review A, 54:1869–
1876, 1996.
[217] Mark Hillery, Vladimı́r Bužek, and André Berthiaume. Quantum secret sharing. Technical
report, quant-ph/9806063, 1998.
[218] Raymond Hill. A first course in coding theory. Clarendon Press, Oxford, 1986.
[219] Scott Hill and William K. Wootters. Entanglement of a pair of quantum bits. Technical report,
quant-ph/9703041, 1997.
[220] Mika Hirvensalo. On quantum computation. PhD thesis, Turku Center for Computer Science,
1997.
[221] D. G. Hoffman, D. A. Leonard, C. C. Lindner, K. T. Phelps, C. A. Rodger, and J. R. Wall.
Coding theory, the essentials. Marcel Dekker Inc., 1991.
[222] Tad Hogg. Highly structured searches with quantum computers. Physical Review Letters,
80:2473–2476, 1998. quant-ph/9508012.
[223] Tad Hogg, Carlos Mochon, Wolfgang Polak, and Eleanor G. Rieffel. Tools for quantum
algorithms. Technical report, quant-ph/9811073, 1998.
[224] Alexander S. Holevo. Bounds for the quantity of information transmitted by a quantum
communication channel. Problemy Peredachi Informatsii, 9(3):3–11, 1973. English translation:
Problems of Information Transmission, V9, 1973, 177-183.
[225] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Inseparable two spin-1/2 density
matrices can be distilled to a singlet form. Physical Review Letters, 78:574–577, 1997.
[226] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Mixed-state entanglement and
distillation: is there a “bound” entanglement in nature? Technical report, quant-ph/9801069,
1998.
[227] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Bound entanglement can be
activated. Technical report, quant-ph/9806058, 1998a.
[228] Michal Horodecki and Ryszard Horodecki. Are there basic laws of quantum information
processing? Technical report, quant-ph/9705003, 1997.
[229] Ryszard Horodecki. Unitary transformation ether and its possible applications. Annalen der
Physik (Leipzig), 48:479–488, 1991.
[230] Peter Høyer. Efficient quantum transforms. Technical report, quant-ph/9702028, 1997.
[231] Juraj Hromkovič. Communication complexity and parallel computing. Springer Verlag, 1997.
[232] Richard J. Hughes. Cryptography, quantum computation and trapped ions. Technical report,
Los Alamos National Laboratory, 1997. preprint LA-UR-97-4986.
[233] Richard J. Hughes, D. M. Alde, P. Dyer, G. G. Luther, G. L. Morgan, and M. M. Schauer.
Quantum cryptography. Contemporary physics, 36(3):149–163, 1995.
[234] Richard J. Hughes, Daniel F. V. James, J. J. Gomez, M. S. Gulley, M. H. Holzscheiter,
Paul G. Kwiat, S. K. Lamoreaux, C. G. Peterson, V. D. Sandberg, M. M. Schauer, C. M.
Simmons, C. E. Thorburn, D. Tupa, P. Z. Wang, and A. G. White. The Los Alamos trapped
ion quantum computer experiment. Fortschritte der Physik, 46:329–361, 1998.
[235] Richard J. Hughes, Daniel F. V. James, Emanuel H. Knill, Raymond Laflamme, and Albert G.
Petschek. Decoherence bounds on quantum computation with trapped ions. Physical Review
Letters, 77:3240–3243, 1996a.
[236] Richard J. Hughes, G. G. Luther, G. L. Morgan, C. G. Peterson, and C. M. Simmons. Quantum
cryptography over underground optical fibers. In Advances in Cryptology, Proceedings of
CRYPTO’96, pages 329–342. LNCS 1109, Springer-Verlag, 1996a.
[237] L. P. Hughston, Richard Jozsa, and William K. Wootters. A complete classification of quantum
ensembles having a given density matrix. Physics Letters A, 183:14–18, 1993.
[238] Russell Impagliazzo and Steven Rudich. Limits on the provable consequences of one-way
permutations. In Proceedings of 21st ACM STOC, pages 44–61, 1989.
[239] Max Jammer. The philosophy of quantum mechanics. John Wiley & Sons, 1974.
[240] Jonathan A. Jones and Michele Mosca. Implementation of a quantum algorithm to solve
Deutsch’s problem on a nuclear magnetic resonance quantum computer. Journal of Chemical
Physics, 109(5):1648–1653, 1998. quant-ph/9801027, 9808056.
[241] T. F. Jordan. Linear operators for quantum mechanics. Wiley, 1969.
[242] Richard Jozsa. Fidelity for mixed quantum states. Journal of Modern Optics, 41(12):2315–
2323, 1994.
[243] Richard Jozsa. Entanglement and quantum computation. In S. Huggett, L. Mason, K. P.
Todd, S. T. Tsou, and N. M. J. Woodhouse, editors, The Geometric Universe, pages 369–379.
Oxford University Press, 1997.
[244] Richard Jozsa. Quantum algorithms and the Fourier transform. Proceedings of Royal Society
London A, 454:323–337, 1997a. quant-ph/9707033.
[245] Richard Jozsa. Searching in Grover’s algorithm. Technical report, quant-ph/9901021, 1999.
[246] Richard Jozsa and Benjamin W. Schumacher. A new proof of quantum noiseless coding
theorem. Journal of Modern Optics, 41(12):2343–2349, 1994.
[247] Jānis Kaņeps and Rūsiņs̆ Freivalds. Running-time to recognize nonregular languages by 2-way
probabilistic automata. In Proceedings of MFCS’91, LNCS 510, pages 174–185, 1991.
[248] B. E. Kane. A Si-based nuclear spin quantum computer. Nature, 393:133–13?, 1998.
[249] Jarkko Kari. Reversibility of 2D cellular automata is undecidable. Physica D, 45:379–385,
1990.
[250] T. Kato. Perturbation theory for linear operators. Springer-Verlag, 1976.
[251] Robert W. Keyes. Miniaturization of electronics and its bounds. IBM Journal of Research
and Development, 32:24–28, 1988.
[252] Joe Kilian. Founding cryptography on oblivious transfer. In Proceedings of 20th ACM STOC,
pages 20–31, 1988.
[253] Alexei Yu. Kitaev. Quantum measurements and Abelian stabilizer problem. Technical report,
quant-ph/9511026, 1995.
[254] Alexei Yu. Kitaev. Quantum communication, computing and measurement, chapter Quantum
error correction with imperfect gates, pages 181–188. Plenum Press, New York, 1996.
[255] Alexei Yu. Kitaev. Fault-tolerant quantum computation by anyons. Technical report, 1997.
[256] Emanuel H. Knill and Raymond Laflamme. Concatenated quantum codes. Technical report,
quant-ph/9608012, 1996.
[257] Emanuel H. Knill and Raymond Laflamme. Theory of quantum error-correcting codes. Phys-
ical Review A, 55(2):900–911, 1997. quant-ph/9604034.
[258] Emanuel H. Knill, Raymond Laflamme, and Wojciech H. Zurek. Accuracy threshold for
quantum computation. Technical report, quant-ph/9610011, 1996.
[259] Emanuel H. Knill, Raymond Laflamme, and Wojciech H. Zurek. Resilient quantum
computation: error models and thresholds. Technical report, quant-ph/9702058, 1997.
[260] Attila Kondacs and John Watrous. On the power of quantum finite state automata. In
Proceedings of 38th IEEE FOCS, pages 66–75, 1997.
[261] Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press,
1997.
[262] Paul G. Kwiat, K. Mattle, Harald Weinfurter, Anton Zeilinger, A. V. Sergienko, and Y. M.
Shih. New high-intensity source of polarization-entangled photon pairs. Physical Review
Letters, 75:4337–4341, 1995.
[263] Paul G. Kwiat and Harald Weinfurter. Embedded Bell-state analysis. Physical Review A,
58(4):2623–2626, 1998.
[264] Raymond Laflamme, Emanuel H. Knill, Wojciech H. Zurek,
P. Catasti, and S. U. S. Hanappen. NMR GHZ. Technical report, quant-ph/9709025, 1997.
[265] Raymond Laflamme, Cesar Miquel, Juan Paz, and Wojciech H. Zurek. Perfect quantum error
correction code. Physical Review Letters, 77:198–201, 1996. quant-ph/9602019.
[266] Ralf Landauer. Information is physical. Physics Today, 44:23–29, 1991.
[267] Ralf Landauer. Is quantum mechanically coherent computation useful? In D. H. Feng and
B.-L. Hu, editors, Proceedings of the Drexel-4 Symposium on Quantum Nonintegrability—
Quantum Classical Correspondence, pages 37–55. International Press, 1994.
[268] Ralf Landauer. Is quantum mechanics useful? Philosophical Transactions of Royal Society of
London A, 353:367–376, 1995.
[269] Klaus-Jörn Lange, Pierre McKenzie, and Alain Tapp. Reversible space equals deterministic
space. In Proceedings of 12th Annual IEEE Conference on Computational Complexity, pages
45–50, 1997.
[270] Yves Lecerf. Récursive insolubilité de l’équation générale de diagonalisation de deux
monomorphismes de monoïdes libres φx = ψx. Comptes Rendus de l’Académie des Sciences,
257:2940–2943, 1963.
[271] Yves Lecerf. Machines de Turing réversibles. Récursive insolubilité en n ∈ N de l’équation
u = θ^n u, où θ est un isomorphisme de codes. Comptes Rendus de l’Académie des Sciences,
257:2597–2600, 1963a.
[272] Harvey S. Leff and Andrew F. Rex. Maxwell’s demon: entropy, information, computing. Adam
Hilger, 1990.
[273] Lev B. Levitin. ????????? In Proceedings of the 4th All-Union Conference on Information
and Coding Theory, (Moscow-Tashkent), pages ???–???, 1969.
[274] Lev B. Levitin. Information complexity and control in quantum physics, chapter Information
theory for quantum systems, pages 15–?? Springer, 1987.
[275] Noah Linden and Sandu Popescu. The halting problem for quantum computers. Technical
report, quant-ph/9806054, 1998.
[276] David Lindley. Where does the weirdness go? BasicBooks, 1996.
[277] Seth Lloyd. A potentially realizable quantum computer. Science, 261:1569–1571, 1993.
[278] Seth Lloyd. Envision a quantum computer. Science, 263:695, 1994.
[279] Seth Lloyd. Almost any quantum gate is universal. Physical Review Letters, 75:346–349, 1995.
[280] Seth Lloyd. Capacity of the noisy quantum channel. Physical Review A, 55(3):1613–1622,
1997.
[281] Hoi-Kwong Lo. Insecurity of quantum secure computations. Physical Review A, 56(2):1154–
1172, 1997.
[282] Hoi-Kwong Lo and H. F. Chau. Is quantum bit commitment really possible? Physical Review
Letters, 78(17):3410–3413, 1997.
[283] Hoi-Kwong Lo and H. F. Chau. Why quantum bit commitment and ideal quantum coin tossing
are impossible? Technical report, quant-ph/9711065, 1997a.
[284] Hoi-Kwong Lo and Sandu Popescu. Concentrating entanglement by local actions—beyond
mean values. Technical report, quant-ph/9707038, 1997.
[285] Samuel J. Lomonaco. A quick glance at quantum cryptography. Technical report,
quant-ph/9811056, 1998.
[286] F. J. MacWilliams and Neil J. A. Sloane. The theory of error-correcting codes. North-Holland,
Amsterdam, 1977.
[287] Yurii Manin. Solvable and unsolvable. Soviet radio, 1980. In Russian.
[288] C. Marand and Paul D. Townsend. Quantum key distribution over the distance as long as 30
km. Optics Letters, 20:1695–1697, 1995.
[289] Norman Margolus. Complexity, entropy and physics, chapter Parallel quantum computation,
pages 273–. Addison-Wesley, 1994.
[290] K. Mattle, Harald Weinfurter, Paul G. Kwiat, and Anton Zeilinger. Dense coding in
experimental quantum communication. Physical Review Letters, 76:4656–4659, 1996.
[291] Dominic C. Mayers. Quantum key distribution and string oblivious transfer in noisy channels.
Technical report, quant-ph/9606003, 1996.
[292] Dominic C. Mayers. Unconditionally secure quantum bit commitment is impossible. Physical
Review Letters, 78:3414–3417, 1998.
[293] Dominic C. Mayers and Andrew C.-C. Yao. Unconditional security in quantum cryptography.
Technical report, quant-ph/9802025, 1998.
[294] Dominic C. Mayers and Andrew C.-C. Yao. Quantum cryptography with imperfect apparatus.
Technical report, quant-ph/9809039, 1998a.
[295] David A. Meyer. Quantum strategies. Technical report, quant-ph/9804010, 1998.
[296] Christopher Monroe, Dan M. Meekhof, B. E. King, W. M. Itano, and Daniel J. Wineland.
Demonstration of a fundamental quantum logic gate. Physical Review Letters, 75(25):4714–
4717, 1995.
[297] Cristopher Moore and James P. Crutchfield. Quantum automata and quantum grammars.
Technical report, Santa Fe, 1997.
[298] Cristopher Moore and Martin Nilsson. Parallel quantum computation and quantum codes.
Technical report, quant-ph/9808027, 1998.
[299] Cristopher Moore and Martin Nilsson. Some notes on parallel computation. Technical report,
quant-ph/9804034, 1998a.
[300] Kenichi Morita and Masateru Harao. Computational universality of one-dimensional re-
versible (injective) cellular automata. Transactions of the IEICE, E72:758–762, 1989.
[301] Michele Mosca. Quantum searching, counting and amplitude modification by eigenvector-
analysis. In Proceedings of MFCS’98 Workshop on Randomized Algorithms, pages 90–100,
1998.
[302] Michele Mosca. Counting on a quantum computer. Technical report, MFCS’98, 1998b. Talk
at MFCS’98 Workshop on randomization.
[303] Michele Mosca and Artur K. Ekert. The hidden subgroup problem and eigenvalue estimation
on a quantum computer. In Proceedings of 1st NASA International Conference on Quantum
Computing & Quantum Communication, page ??? LNCS 1504, Springer-Verlag, 1998.
[304] Rajeev Motwani and Prabhakar Raghavan. Randomized algorithms. Cambridge University
Press, 1995.
[305] Antoine Muller, Hugo Zbinden, and Nicolas Gisin. Underwater quantum coding. Nature,
378:449–449, 1995.
[306] Michael Nielsen. A partial order on the entangled states. Technical report, quant-ph/9811053,
1998.
[307] Michael A. Nielsen, Emanuel H. Knill, and Raymond Laflamme. Complete quantum
teleportation by nuclear magnetic resonance. Technical report, quant-ph/9811020, 1998.
[308] Noam Nisan. CREW PRAMs and decision trees. SIAM Journal on Computing, 20(6):999–1007,
1991.
[309] Noam Nisan and Mario Szegedy. On the degree of Boolean functions as real polynomials. In
Proceedings of 24th ACM STOC, pages 462–467, 1992.
[310] Masanao Ozawa. Quantum Turing machines: local transitions, preparation, measurement,
and halting problem. Technical report, quant-ph/9809038, 1998.
[311] Masanao Ozawa and Harumichi Nishimura. Local transition functions of quantum Turing
machines. Technical report, quant-ph/9811069, 1998.
[312] Yuri Ozhigov. Fast quantum verification for formulas of predicate calculus. Technical report,
quant-ph/9809015, 1998.
[313] Yuri Ozhigov. Quantum computers speed up classical with probability zero. Technical report,
quant-ph/9803064, 1998a.
[314] Edward W. Packel. Functional analysis, a short course. Intext Educational Publishers, 1974.
[315] Ramamohan Paturi. On the degree of polynomials that approximate Boolean functions. In
Proceedings of 24th ACM STOC, pages 468–474, 1992.
[316] Roger Penrose. The emperor’s new mind. Vintage, 1990.
[317] Roger Penrose. Shadows of the mind. Oxford University Press, 1994.
[318] Asher Peres. Reversible logic and quantum computers. Physical Review A, 32(6):3266–3276,
1985.
[319] Asher Peres. Quantum theory: Concepts and methods. Kluwer Academic Publisher, 1993.
[320] Asher Peres. Error symmetrization in quantum computers. Technical report, quant-
ph/9605009, 1996.
[321] Asher Peres and William K. Wootters. Optimal detection of quantum information. Physical
Review Letters, 66:1119–1122, 1991.
[322] Carl Adam Petri. Grundsätzliches zur Beschreibung diskreter Prozesse. In Proceedings 3.
Colloquium über Automatentheorie (Hannover), 1965, pages 121–140, 1967.
[323] Simon J. D. Phoenix, Stephen M. Barnett, Paul D. Townsend, and Keith J. Blow. Multi-user
quantum cryptography on optical networks. Journal of Modern Optics, 42:1155–1163, 1995.
[324] Simon J. D. Phoenix and Paul D. Townsend. Quantum cryptography: how to beat the code
breakers using quantum mechanics. Contemporary Physics, 36(3):165–195, 1995.
[325] Jean-Éric Pin. On the languages accepted by finite reversible automata. In Proceedings of
14th ICALP, pages 237–249. LNCS 267, Springer-Verlag, 1987.
[326] Martin B. Plenio and Peter L. Knight. Realistic lower bounds for the factorization time of
large numbers on a quantum computer. Physical Review A, 53(5):2986–2990, 1996.
[327] Martin B. Plenio, Vlatko Vedral, and Peter L. Knight. Quantum error correction in the
presence of spontaneous emission. Technical report, quant-ph/9603022, 1996.
[328] John Preskill. Fault tolerant quantum computation. Technical report, quant-ph/9712048,
1997. To appear in “Introduction to quantum computation”, edited by H.-K. Lo, S. Popescu
and T. P. Spiller.
[329] John Preskill. Quantum computing. Technical report, CALTECH, 1998.
http://www.theory.caltech.edu/~preskill/ph229.
[330] Markus Püschel, Martin Rötteler, and Thomas Beth. Fast Fourier transforms for a class of
non-abelian groups. Technical report, quant-ph/9807064, 1998.
[331] Michael O. Rabin. Probabilistic automata. Information and Control, 6(3):230–244, 1963.
[332] Michael O. Rabin. How to exchange secrets by oblivious transfer. Technical report, Technical
Memo TR-81, Aiken Computation Laboratory, Harvard University, 1981.
[333] Eric M. Rains. A rigorous treatment of distillable entanglement. Technical report,
quant-ph/9809078, 1998.
[334] Eric M. Rains, R. H. Hardin, Peter W. Shor, and Neil J. A. Sloane. A nonadditive quantum
code. Physical Review Letters, 79:953–954, 1997.
[335] Eleanor G. Rieffel and Wolfgang Polak. An introduction to quantum computing for non-
physicists. Technical report, quant-ph/9809016, 1998.
[336] Yurii Rogozhin. On the notion of universality and small universal Turing machines.
Theoretical Computer Science, 168:215–240, 1996.
[337] Hein Röhrig. An upper bound for searching an unordered list. Technical report, quant-
ph/9812061, 1998.
[338] D. A. Ross. A modification of Grover’s algorithm as a fast database search. Technical report,
quant-ph/9807078, 1998.
[339] Martin Rötteler and Thomas Beth. Polynomial-time solution to the hidden subgroup problem
for a class of non-abelian groups. Technical report, quant-ph/9812070, 1998.
[340] Keijo Ruohonen. Reversible machines and Post’s correspondence problem for biprefix mor-
phisms. EIK, 21(12):579–595, 1995.
[341] Bruce Schneier. Applied cryptography. Wiley, New York, 1996.
[342] Leonard Schulman and Umesh Vazirani. Scalable NMR quantum computation. Technical
report, quant-ph/9804060, 1998.
[343] Benjamin W. Schumacher. Quantum coding. Physical Review A, 51(4):2738–2747, 1995.
[344] Benjamin W. Schumacher and Michael A. Nielsen. Quantum data processing and error
correction. Physical Review A, 54(4):2629–2635, 1996.
[345] Claude E. Shannon and Warren Weaver. The mathematical theory of communication. Uni-
versity of Illinois Press, Urbana, 1949.
[346] Peter W. Shor. Algorithms for quantum computation: discrete log and factoring. In
Proceedings of 35th IEEE FOCS, pages 124–134, 1994.
[347] Peter W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical
Review A, 52:2493–2496, 1995.
[348] Peter W. Shor. Fault-tolerant quantum computation. In Proceedings of 37th IEEE FOCS,
pages 56–65, 1996.
[349] Peter W. Shor. Polynomial time algorithms for prime factorization and discrete logarithms on
a quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997.
[350] Daniel R. Simon. On the power of quantum computation. In Proceedings of 35th IEEE FOCS,
pages 116–123, 1994. See also SIAM Journal of Computing, V26, N5, 1474-1483, 1997.
[351] Tycho Sleator and Harald Weinfurter. Realizable universal quantum logic gates. Physical
Review Letters, 74(20):4087–4090, 1995.
[352] Henry P. Stapp. The Copenhagen interpretation. American Journal of Physics, 40:1098–1116,
1972.
[353] Andrew M. Steane. Error correcting codes in quantum theory. Physical Review Letters,
77(5):793–797, 1996.
[354] Andrew M. Steane. Multiple particle interference and quantum error correction. Proceedings
of Royal Society London, A 452:2551–2577, 1996a.
[355] Andrew M. Steane. Quantum Reed-Muller code. Technical report, quant-ph/9608026, 1996b.
[356] Andrew M. Steane. Simple quantum error-correcting codes. Physical Review A, 54:4741–4751,
1996c. quant-ph/9605021.
[357] Andrew M. Steane. Quantum computing. Technical report, quant-ph/9708022, 1997.
[358] Andrew M. Steane. Quantum error correction. In Introduction to quantum computation, page
to be published. World Scientific, 1998.
[359] Andrew M. Steane. Enlargement of Calderbank Shor Steane quantum codes. Technical report,
quant-ph/9802061, 1998a.
[360] Andrew M. Steane. Efficient fault-tolerant computing. Technical report, quant-ph/9809034,
1998b.
[361] Karl Svozil. Quantum computation and complexity theory I and II. Bulletin of EATCS, 55
and 56:170–207, 116–136, 1995.
[362] Max Tegmark. The interpretation of quantum mechanics: many worlds or many words.
Technical report, quant-ph/9709032, 1996.
[363] W. Teich, K. Obermayer, and G. Mahler. Structural basis of multistationary quantum
systems II. Effective few particle dynamics. Physical Review B, 37(4):8111–8120, 1988.
[364] Barbara M. Terhal and John A. Smolin. Single quantum querying of a database. Physical
Review A, 58:1822–1826, 1997. quant-ph/9705041.
[365] Wolfgang Tittel, J. Brendel, Hugo Zbinden, and Nicolas Gisin. Experimental demonstration
of quantum correlation over more than 10 km. Physical Review A, 57(5):3229–3232, 1998.
Preprint quant-ph/9806043.
[366] Wolfgang Tittel, Grégoire Ribordy, and Nicolas Gisin. Quantum cryptography. Physics
World, pages 41–45, 1998a.
[367] Tommaso Toffoli. Computation and construction universality of reversible cellular automata.
Journal of Computer and System Sciences, 15:213–231, 1977.
[368] Tommaso Toffoli. Reversible computing. In Proceedings of ICALP’80, pages 632–644. LNCS
84, Springer-Verlag, 1980.
[369] Tommaso Toffoli. Bicontinuous extensions of invertible combinatorial functions. Mathematical
Systems Theory, 14:13–23, 1981.
[370] Q. A. Turchette, C. J. Hood, W. Lange, Hideo Mabuchi, and H. J. Kimble. Measurement of
conditional phase shifts for quantum logic. Physical Review Letters, 75:4710–4713, 1995.
[371] Armin Uhlmann. The "transition probability" in the state space of a *-algebra. Reports
on Mathematical Physics, 9:273–279, 1976.
[372] Armin Uhlmann. Entropy and optimal decomposition of states relative to a maximal
commutative algebra. Technical report, quant-ph/9704017, 1997.
[373] Wim van Dam. A universal quantum cellular automaton. In M. Biafore, J. Leão, and T. Toffoli,
editors, Proceedings of Physics and Computation, PhysComp’96, pages 323–331. New England
Complex Systems Institute, 1996.
[374] Wim van Dam. Quantum oracle interrogation. Technical report, quant-ph/9805006, 1998.
[375] Wim van Dam, Peter Høyer, and Alain Tapp. Multiparty quantum communication
complexity. Technical report, quant-ph/9710054, 1997.
[376] Wim van Dam. Quantum oracle interrogation: getting all information for almost half price.
In Proceedings of 39th IEEE FOCS, pages 362–367, 1998.
[377] S. J. van Enk, Juan I. Cirac, and Peter Zoller. Ideal quantum communication over noisy
channels: a quantum optical implementation. Physical Review Letters, 78:4293–4296, 1997.
[378] J. H. van Lint. Introduction to coding theory. North Holland, 1995.
[379] Umesh Vazirani. Quantum computing. Technical report,
http://www.cs.berkeley.edu/~vazirani, 1997.
[380] Vlatko Vedral, Adriano Barenco, and Artur K. Ekert. Quantum networks for elementary
operations. Physical Review A, 54(1):147–153, 1996. quant-ph/9511018.
[381] Vlatko Vedral and Martin B. Plenio. Entanglement measures and purification procedures.
Technical report, quant-ph/9707035, 1997.
[382] Vlatko Vedral and Martin B. Plenio. Basics of quantum computations. Technical report,
quant-ph/9802065, 1998.
[383] Vlatko Vedral, Martin B. Plenio, Mark A. Rippin, and Peter L. Knight. Quantifying
entanglement. Technical report, quant-ph/9702027, 1997.
[384] Dan Ventura and Tony Martinez. A quantum computational learning algorithm. Technical
report, quant-ph/9807052, 1998.
[385] John von Neumann. Thermodynamik quantenmechanischer Gesamtheiten. Göttinger
Nachrichten, pages 273–291, 1927.
[386] John von Neumann. Mathematische Grundlagen der Quantenmechanik. Springer-Verlag, 1932.
English translation: Mathematical Foundations of Quantum Mechanics, Princeton University
Press, 1955.
[387] Joachim von zur Gathen and James R. Roche. Polynomials with two values. Combinatorica,
17(3):345–362, 1997.
[388] Klaus Wagner. The complexity of combinatorial problems with succinct input representation.
Acta Informatica, 23:325–356, 1986.
[389] John Watrous. On one-dimensional quantum cellular automata. In Proceedings of 36th IEEE FOCS,
pages 528–537, 1995.
[390] John Watrous. Notes on a quantum cellular automaton illustrating the EPR paradox.
Technical report, University of Wisconsin, 1997.
[391] John Watrous. On the power of 2-way quantum finite automata. Technical report, University
of Wisconsin, 1997.
[392] John Watrous. Relationships between classical and quantum space bounded complexity
classes. In Proceedings of 13th IEEE Conference on Computational Complexity, pages
210–227, 1997a.
[393] John Watrous. Quantum simulations of classical random walks and undirected graph
connectivity. Technical report, quant-ph/9812012, 1998.
[394] Alfred Wehrl. General properties of entropy. Reviews of Modern Physics, 50:221–260, 1978.
[395] Steven Weinberg. Testing quantum mechanics. Annals of Physics, 194:336–386, 1989.