Notes On Scott Aaronson's Quantum Information Science: Lecture 1 (January 17)
by Paulo Alves
Overview
Lecture 1 (January 17)
An introduction to Quantum Information Science. A few important concepts are introduced
(Probability, Locality, Local Realism, the Church-Turing Thesis and its extended variation) to
contextualize how quantum mechanics affects our understanding of physics.
Lecture 6 (February 2)
Density Matrices are introduced to represent Mixed States. We see the properties of density
matrices, including Trace and Rank, as well as operations we may want to perform on them, like applying
unitary transformations, performing Eigendecomposition, and Tracing Out.
Lecture 7 (February 7)
The Bloch Sphere is introduced as a useful representation of possible states of a qubit.
The No Communication Theorem and the No Cloning Theorem limit what can be done with
quantum information. These limits allow for the creation of Quantum Money schemes, such as
Wiesner’s Scheme.
Lecture 8 (February 9)
Attacks on Wiesner’s Scheme are explored, including an Interactive Attack, and an Attack
Based on the Elitzur Vaidman Bomb.
BB84 is a Quantum Key Distribution scheme allowing two parties to generate a shared secret.
Lecture 14 (March 2)
The optimality of our strategy for the CHSH Game is discussed and proven through Tsirelson’s
Inequality. The implications of experimentally Testing the Bell Inequality lead us to
Superdeterminism and modern skepticism of quantum mechanics.
Other non-local games (The Odd Cycle Game and The Magic Square Game) are covered.
Lecture 15 (March 9)
The CHSH Game can be applied to Generating Guaranteed Random Numbers, and many
other tasks, which brings us to Quantum Computing. We discuss the intellectual origins of the field and
a few conceptual points.
Lecture 28 (May 2)
Further discussion of the reliability of qubits leads us to Stabilizer Sets and their compact
representations through Generator Sets of Pauli Matrices. The Gottesman-Knill Theorem explains
why stabilizer circuits aren’t universal, and leads to the use of the Tableau Representation.
Lecture 29 (May 4)
Quantum error correction codes with Transversality are preferred.
Practical implementations of quantum computing are discussed, including the important
speedups it could provide, leading to a discussion of the HHL Algorithm. The DiVincenzo Criteria could be
satisfied with Trapped Ions or Superconducting Qubits, as well as Photonics (bringing us to the KLM
Theorem and Boson Sampling) or Non-abelian Anyons.
Lecture 1: Tues Jan 17
● Quantum Information Science is an inherently interdisciplinary field (Physics, CS, Math,
Engineering, Philosophy)
● About clarifying the workings of quantum mechanics.
○ We use it to ask questions about what you can and can’t do with quantum mechanics
○ Can help solve problems about the nature of quantum mechanics itself.
● Professor Aaronson is very much on the theoretical end of research.
○ Theorists inform what experimentalists build, which in turn informs theorists’ queries
There are several self-evident truths in the physical world. Quantum mechanics leaves some in
place, and slashes others. To start with…
Probability (p ∈ [0,1]) is the standard way of representing uncertainty in the world.
Probabilities have to follow certain obvious axioms like:
P1 + … + Pn = 1  (mutually exclusive, exhaustive possibilities sum to 1)
Pi ≥ 0
As an aside:
There’s a view that “probabilities are all in our heads.” Which is to say that if we knew
everything about the universe (say, the position and velocity of every atom in the solar system),
we could just crunch the equations and see that things either happen or they don’t.
Let’s say we have two points separated by a barrier with an open slit,
and we want to measure the probability that a particle goes from one
point to the other. It seems obviously true that increasing the number
of paths (say, by opening another slit) should increase the likelihood
that it will reach the other end.
Locality is the idea that things can only propagate through the structure of the universe at a certain speed.
When we update the state of a little patch of space, it should only require
knowledge of a little neighborhood around it. Conway’s Game Of Life is an
apt comparison: things you do to the system can affect it, but they propagate only at
a certain speed.
Einstein’s Theory of Relativity explains that a bunch of known physics things are a
direct result of light’s speed. Anything traveling past the speed of light would be
tantamount to travelling back in time.
Local Realism says that an instantaneous update in knowledge about far away events can be
explained by correlation of random variables.
For example, if you read your newspaper in Austin, you can instantly collapse the probability of
your friend-in-San-Francisco’s newspaper’s headline to whatever your headline is.
Some Pop Science articles may talk about seeing one particle’s spin instantaneously as a
result of knowing another particle’s spin, but that’s basically the same as the newspapers.
The Church-Turing Thesis says that every physical process can be simulated by a Turing machine to
any desired precision.
The way that Church and Turing understood this was as a definition of computation, but we think
of it instead as a falsifiable claim about the real world. You can think about this as the idea that the entire
universe is a video game: You’ve got all sorts of complicated things like quarks and whatnot, but at the
end of the day, you’ve got to be able to simulate it in a computer.
Theoretical computer science courses can be seen as basically math courses.
So what does connect them to reality? The Church-Turing Thesis.
The Extended Church-Turing Thesis says that there’s at most a polynomial blow-up for simulating
reality.
Lecture 2: Thurs Jan 19
Using time dilation, you could travel billions of years in the
future and get results to hard problems. Fun! But you’d need
a LOT of energy, and if you have that much energy in one
place you basically become a black hole. Not so fun!
Computational Universality says that there aren’t any computers that could exist which could solve a
problem that ours can’t already.
The Extended Church-Turing thesis says that if you can’t solve a problem in polynomial time on
today’s computers then no one will ever be able to. Quantum mechanics challenges this. With quantum
computers you could solve some problems faster than with a classical computer. With that said, however,
there could still be a quantum equivalent to the ECTT.
Feynman said that everything about quantum mechanics could be encapsulated in The Double Slit
Experiment.
In the double slit experiment, you shoot photons through a wall with two narrow slits.
Where the photon lands is probabilistic. If we plot where photons appear on the back wall, some
places are very likely, some not.
Note that this itself isn’t the weird part, we could totally justify this happening. What’s
weird is as follows. For some interval:
Let P be the probability that the photon lands on the interval.
Let P1 be the probability that the photon lands on the interval if only slit 1 is open.
Let P2 be the probability that the photon lands on the interval if only slit 2 is open.
You’d think that P = P1 + P2 , but that’s not the case. Dark fringes that exist with two slits end up being hit
by photons if only one slit is open.
The weirdness isn’t that “God plays dice,” but rather that “these aren’t normal dice”
You may think to measure which slit the photon went through, but doing so changes the
measurements into something that makes more sense. Note that this isn’t really a matter of having
a conscious observer: if the information about which slit the photon went through leaks out in any
way, the results go back to looking like they obey classical probability.
As if nature says “What? Me? I didn’t do anything!”
This is called Decoherence.
Decoherence is why the usual laws of probability look like they work in everyday life. A cat isn’t
in superposition because it interacts with normal stuff every day. These interactions essentially leak
information about the ‘cat system’ out.
It’s important to note that this relates to particles in isolation. Needing particles to be in isolation
is why it’s so hard to build a quantum computer.
The story of physics between 1900 and 1926 is that scientists kept finding things that didn’t fit with the
usual laws of mechanics or probability. They usually came up with hacky solutions that explained a thing
without connecting it to much else. That is, until Schrödinger and others came up with quantum mechanics.
A normal quantum physics class would go through this process of experimental proof to arrive at
quantum mechanics, but we’re just going to accept the rules as given and see what we can do from there.
For example, take the usual high school model of the electron, rotating around a
nucleus in a fixed orbit. Scientists realized that this model would mean that the electron
would need to be constantly losing energy until it spiraled into the nucleus. To explain this (and
many other phenomena) scientists modified the laws of probability.
Instead of using probabilities p ∈ [0,1] they started using Amplitudes α ∈ ℂ. Amplitudes can be
positive or negative and can have an imaginary part.
The central claim of quantum mechanics is that to describe a system you need to give one amplitude for
each possible configuration of its particles.
The Born Rule says that the probability you see a particular outcome is the absolute value of the
amplitude squared.
P = |α|² = Re(α)² + Im(α)²
So let’s see how amplitudes being complex leads them to act differently from probabilities. Let’s revisit
the Double Slit Experiment, considering Interference. We’ll say that:
If α1 = ½ and α2 = -½, then interference means that if both slits are open, P = |α1 + α2|² = 0, but if
only one of them is open, P = ¼.
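Since these rules are just arithmetic on amplitudes, they're easy to sanity-check with ordinary Python numbers. This is a sketch added for these notes, not something from the lecture:

```python
# Amplitudes for reaching the same dark fringe via slit 1 and slit 2.
a1, a2 = 0.5, -0.5

# Both slits open: amplitudes add *before* the Born Rule squares them,
# so they can cancel.
p_both = abs(a1 + a2) ** 2

# Only one slit open: there's no second path to interfere with.
p_slit1 = abs(a1) ** 2
p_slit2 = abs(a2) ** 2

print(p_both)   # 0.0
print(p_slit1)  # 0.25
print(p_slit2)  # 0.25
```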
We use Linear Algebra to model states of systems as vectors, and the evolution of systems in
isolation as transformations of vectors:

M ( α1 )   ( α1’ )
  ( α2 ) = ( α2’ )
For now, we’ll consider classical probability. Let’s look at flipping a coin. We model it with a
vector listing both possibilities and assigning a variable to each:

( p )  tails        p, q ≥ 0
( q )  heads        p + q = 1

Let’s say we flip the coin, and if(we get heads){we flip again}, but if(we get tails){we turn it to heads}:

( 0 ½ ) ( p )   ( q/2     )
( 1 ½ ) ( q ) = ( p + q/2 )
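A quick check of that update in Python (a sketch for these notes; the helper name `apply_matrix` is ours). The state is ordered [P(tails), P(heads)] to match the vector above:

```python
def apply_matrix(M, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

# Columns describe where each outcome goes:
# tails always becomes heads (column 1), heads gets a fair reflip (column 2).
M = [[0, 0.5],
     [1, 0.5]]

print(apply_matrix(M, [0.5, 0.5]))  # [0.25, 0.75]
```

Note that each column sums to 1 (a stochastic matrix), which is what keeps the probabilities normalized.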
We can see this clearly by using basis vectors: if ei is the vector with a 1 in the ith position and
0s everywhere else, then

A ei = (ith column of A)
Now let’s say we want to flip two coins, or rather, two bits. For the first coin a = P(getting 0) and
b = P(getting 1). For the second coin we’ll use c and d. To combine the two distributions we’ll use
the Tensor Product:

( a )   ( c )   ( ac )   ( P00 )
( b ) ⊗ ( d ) = ( ad ) = ( P01 )
                ( bc )   ( P10 )
                ( bd )   ( P11 )
It’s worth noting that not all combinations are possible. For example:

( ac )   ( ½ )
( ad )   ( 0 )
( bc ) = ( 0 )
( bd )   ( ½ )

would mean that (ac)(bd) = abcd = (ad)(bc), i.e. (½)(½) = (0)(0). Therefore it can’t be a tensor product.
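Both points can be sketched in Python (the helper name `tensor` is ours, not from the notes):

```python
def tensor(u, v):
    """Tensor product of two vectors, flattened in row-major order."""
    return [x * y for x in u for y in v]

# Two independent fair coins:
print(tensor([0.5, 0.5], [0.5, 0.5]))  # [0.25, 0.25, 0.25, 0.25]

# The correlated distribution above: for a product [ac, ad, bc, bd]
# we'd need (ac)(bd) == (ad)(bc) == abcd, which fails here.
w = [0.5, 0, 0, 0.5]
print(w[0] * w[3], w[1] * w[2])  # 0.25 vs 0, so w isn't a tensor product
```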
Let’s say that if(the first bit is 1){we want to flip the second bit}. Starting from the distribution

( ½ )  00
( 0 )  01
( ½ )  10
( 0 )  11

we’d do:

  00 01 10 11
( 1  0  0  0 ) ( ½ )   ( ½ )
( 0  1  0  0 ) ( 0 ) = ( 0 )     This is called the Controlled NOT;
( 0  0  0  1 ) ( ½ )   ( 0 )     it comes up again in quantum mechanics.
( 0  0  1  0 ) ( 0 )   ( ½ )
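The same multiplication carried out in Python (a sketch; the helper name `matvec` is ours):

```python
# Basis order: 00, 01, 10, 11.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

v = [0.5, 0, 0.5, 0]        # 00 and 10, each with probability 1/2
print(matvec(CNOT, v))      # probability moves from 10 to 11
```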
Quantum mechanics basically follows this process to model states in quantum systems except that it uses
amplitudes instead of probabilities.
( U ) ( α1 )   ( β1 )
      ( α2 ) = ( β2 )
      ( α3 )   ( β3 )

where ∑ |αi|² = 1 = ∑ |βi|² (summing i from 1 to n), and you measure outcome |i⟩ with probability |αi|².
Lecture 3: Tues Jan 24
Tensor Products are a way of building bigger vectors out of smaller ones.
Let’s apply a NOT operation to the first bit, and do nothing to the second bit. That’s really the same as
defining function f as f(00) = 10, f(01) = 11, f(10) = 00, f(11) = 01. So we can fill in the tensor product as
follows:
( 0 1 )   ( 1 0 )   ( 0 0 1 0 )  00
( 1 0 ) ⊗ ( 0 1 ) = ( 0 0 0 1 )  01
                    ( 1 0 0 0 )  10
                    ( 0 1 0 0 )  11
                     00 01 10 11
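The general recipe can be sketched in a few lines of Python (the helper name `kron` is ours):

```python
def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

NOT = [[0, 1], [1, 0]]
I = [[1, 0], [0, 1]]

for row in kron(NOT, I):    # rows in basis order 00, 01, 10, 11
    print(row)
```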
A Quantum State is a unit vector in ℂN referring to the state of a quantum system.
Formally a quantum state could exist in any dimension. Physics courses cover infinite-dimensional
vectors, but we’ll stick to discrete systems (which is to say that when we make a
measurement, there’s a discrete set of possible outcomes to be read, even though the amplitudes vary continuously).
● What does quantum mechanics say about the universe being discrete or continuous at the base
level? It suggests a strange, hybrid picture. There’s an infinite number of possibilities, but a
discrete outcome. Formalisms of quantum mechanics technically contain infinite possibilities:
a system with two amplitudes α and β has uncountably many possible states (given the only
restriction is that | α |² + | β |² = 1), but you could get that in classical mechanics as well, just by
making up a sufficiently elaborate story about the probabilities of flipping coins.
Often you’ll need to take the transpose of a vector, or, for complex values, the conjugate transpose:

( α )                ( α )
( β ) -> ( α β )     ( β ) -> ( α* β* )
Using the complex conjugate allows you to define a norm:

||v||² = v†v

Then we get:

v†v = ( α* β* ) ( α )
                ( β )
    = α*α + β*β
    = |α|² + |β|²
And we define ⟨x|y⟩ as the inner product of |x⟩ with |y⟩.
Therefore, for any quantum state |Ψ⟩ (a unit vector), ⟨Ψ|Ψ⟩ = 1.
Note also that ⟨v|w⟩ = ⟨w|v⟩*.
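These identities are easy to check with Python's built-in complex numbers (a sketch; the helper name `inner` is ours):

```python
def inner(x, y):
    """<x|y>: conjugate the bra's entries, then sum componentwise products."""
    return sum(complex(xi).conjugate() * yi for xi, yi in zip(x, y))

plus = [1 / 2**0.5, 1 / 2**0.5]      # |+>
ket_i = [1 / 2**0.5, 1j / 2**0.5]    # |i>

assert abs(inner(ket_i, ket_i) - 1) < 1e-9               # <i|i> = 1
assert abs(inner(plus, plus) - 1) < 1e-9                 # <+|+> = 1
assert abs(inner(plus, ket_i)
           - inner(ket_i, plus).conjugate()) < 1e-9      # <v|w> = <w|v>*
```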
Remember: the way we change quantum states is by applying linear transformations:

U ( α )   ( α’ )
  ( β ) = ( β’ )

A linear transformation is Unitary if it preserves the norm, i.e. |α|² + |β|² = |α’|² + |β’|² for every state it acts on.
All real possible states of a qubit define a circle, and all complex possible states define a sphere. That’s
because these states are all the quantum vectors of length 1.
We define:

|+⟩ = (|0⟩ + |1⟩)/√2
|-⟩ = (|0⟩ − |1⟩)/√2
|i⟩ = (|0⟩ + i|1⟩)/√2
|-i⟩ = (|0⟩ − i|1⟩)/√2
Unitary Transformations are norm-preserving linear transformations.
For any angle θ you could have

Rθ = ( cosθ -sinθ )
     ( sinθ  cosθ )

which grabs a vector and rotates it by the angle θ. For example:

Rπ/4 = ( 1/√2  -1/√2 )
       ( 1/√2   1/√2 )
Some examples:
Rπ/4|0⟩ = |+⟩
Rπ/4|+⟩ = |1⟩ You’ll get a full revolution after applying Rπ/4 eight times.
Rπ/4|1⟩ = -|-⟩
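A numeric check of the full revolution, as a sketch (the helper names are ours):

```python
import math

def rot(theta):
    """The rotation matrix R_theta as a list of rows."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def apply_matrix(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

v = [1.0, 0.0]                       # |0>
for _ in range(8):                   # eight 45-degree steps
    v = apply_matrix(rot(math.pi / 4), v)
print(v)                             # back to |0>, up to rounding error
```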
No matter what unitary transformation you apply: If |0⟩ goes to U|0⟩, then -|0⟩ goes to -U|0⟩.
The zero state and the minus zero state are indistinguishable mathematically, which is to say:
Global phase is unobservable.
Multiplying your entire quantum state by a scalar is like if last night someone moved the entire universe
twenty feet to the left. We can only really measure things relative to other things:
Relative phase is observable.
To distinguish between the states |+⟩ and |-⟩ we can rotate and then measure them.
There are no second chances. Once you measure, the outcome is set.
So you can distinguish some states via repeated measurement.
Lecture 4: Thurs Jan 26
We call the matrix

Rπ/4 = 1/√2 ( 1 -1 )
            ( 1  1 )

the √NOT gate, since squaring it gives

Rπ/4² = ( 0 -1 )
        ( 1  0 )

which acts as the NOT gate ( 0 1 ; 1 0 ), up to a sign.

The Hadamard Gate is

H = 1/√2 ( 1  1 )
         ( 1 -1 )

It’s useful because it represents a mapping between the |0⟩,|1⟩ basis and the |+⟩,|-⟩ basis:

H|0⟩ = 1/√2 ( 1  1 ) ( 1 )   ( 1/√2 )
            ( 1 -1 ) ( 0 ) = ( 1/√2 ) = |+⟩

Similarly H|+⟩ = |0⟩, H|1⟩ = |-⟩, and H|-⟩ = |1⟩.
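Those Hadamard facts, sketched in Python (the helper names are ours):

```python
s = 1 / 2**0.5
H = [[s, s], [s, -s]]

def apply_matrix(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

ket0 = [1.0, 0.0]
plus = apply_matrix(H, ket0)     # H|0> = |+>
back = apply_matrix(H, plus)     # H|+> = |0>: H is its own inverse

assert abs(plus[0] - s) < 1e-9 and abs(plus[1] - s) < 1e-9
assert abs(back[0] - 1) < 1e-9 and abs(back[1]) < 1e-9
```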
Note that we’ve got two orthogonal (complementary) bases: being maximally
certain in the |+⟩,|-⟩ basis means that you’re maximally uncertain in the |0⟩,|1⟩
basis, and vice versa.
So the probability of the outcome |vi⟩ is the squared projection onto that basis vector:
|⟨vi|Ψ⟩|² = |αi|²
We use bases |0⟩ and |1⟩ arbitrarily as a nice convention.
There’s an extreme point of view in quantum mechanics that unitary transformations are the only thing
that really exist, and measurements don’t really exist. And the converse also exists: the view that
measurements are the only thing that really exist, and that unitary transformations don’t.
● Reversible
Unitary transformations imply that the universe is reversible. We’ve known that the microscopic
laws of physics are reversible since Galileo’s time (i.e. a falling object observed backwards still
obeys the laws of gravity run backwards). So, for example, burning a book shouldn’t necessarily
destroy the information within it, as physics says that you could in principle recover all the
information from the smoke and ash left over.
● Deterministic
● Continuous
i.e. you can always apply them in a time-continuous way. That’s why it’s important that
unitary matrices are complex: if the transformation ( 1 0 ; 0 -1 ) took place over 1 second, then
( 1 0 ; 0 i ) took place over the first half of that second.
By the way, there is a 3x3 real matrix that squares to the (padded) transformation ( 1 0 ; 0 -1 ):

( 1  0  0 )²   ( 1  0  0 )
( 0  0  1 )  = ( 0 -1  0 )
( 0 -1  0 )    ( 0  0 -1 )

Which means that you could apply ( 1 0 ; 0 -1 ) to ( α ; β ) half a step at a time, by embedding the
vector as ( α ; β ; 0 ) and applying the 3x3 matrix, without ever needing complex numbers! That’s
because using complex numbers works in the same way as adding a new dimension to your vector.
Just like you could reflect your three-dimensional self by rotating yourself in the fourth dimension.
Important: Never eat anything in the fourth-dimension. It’ll mess with the chirality of your molecules.
Despite the philosophical conflict, unitary transformations and measurement sync up well because:
unitary transformations preserve the 2-norm and
Measurement gives probabilities given by the 2-norm
● We used to think everything was based on the 1-norm, until we found that quantum mechanics
was based on the 2-norm. This got researchers looking for things based on the 3-norm, 4-norm,
etc. They didn’t really find anything though (the extra credit problem on the homework on norm
preserving linear transformations sheds light on why).
○ Making quantum mechanics a bit of “an island in theory space”. If you try to adjust
anything about it in any way you get gunk. You could alternatively say that there’s
“nothing nice near quantum mechanics”.
● There are many reasons why complex numbers work better than the reals or quaternions.
One more example of a linear transformation:

1/√2 ( 1  1 )
     ( i -i )

maps |0⟩ -> |i⟩ = (|0⟩ + i|1⟩)/√2 and |1⟩ -> |-i⟩ = (|0⟩ − i|1⟩)/√2
There are several interesting phenomena that already happen in the quantum mechanics of one qubit.
Another interesting variant of the same kind of effect is as follows:
What we can do is apply the rotation

Rε = ( cosε -sinε )
     ( sinε  cosε )

giving us cosε|0⟩ + sinε|1⟩.
If there’s a bomb, the probability it explodes is sin²ε ≈ ε²; otherwise the state collapses back to |0⟩.
If there’s no bomb, the state stays cosε|0⟩ + sinε|1⟩.
So repeating about π/(2ε) times makes the total probability of setting off the
bomb on the order of 1/ε * ε² = ε
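The arithmetic behind that claim, sketched in Python. This only models the bomb-present case, where every small rotation is followed by a "measurement" by the bomb (variable names are ours):

```python
import math

eps = 0.001
steps = round(math.pi / (2 * eps))   # about pi/(2*eps) rounds

# Each round the bomb "measures": probability sin(eps)^2 of exploding,
# otherwise the state collapses back to |0> and we rotate again.
p_survive = (math.cos(eps) ** 2) ** steps
p_explode = 1 - p_survive
print(p_explode)                     # on the order of eps, not a constant
```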
Lecture 5: Tues Jan 30
Say you have a coin, and you want to figure out if it’s fair (p = ½) or if it’s biased (p = ½ + ε). How would
you go about doing so?
Given two orthogonal quantum states |v⟩ and |w⟩, there’s a basis that distinguishes them:
take the bisector of |v⟩ and |w⟩, and take the basis vectors at 45° to either side of it, ensuring
each original vector is the same distance from its closest basis vector. Non-orthogonal states,
on the other hand, cannot be distinguished perfectly.
For a two-qubit state α|00⟩ + β|01⟩ + γ|10⟩ + δ|11⟩:
The probability of getting |00⟩ = |α|², |01⟩ = |β|², |10⟩ = |γ|², |11⟩ = |δ|².
Note that |00⟩ is the same as |0⟩|0⟩ or |0,0⟩ or |0⟩⊗|0⟩.
In principle there’s no distance limitation between qubits. You could have one on Earth, and the
other could be with your friend on the moon.
You could also measure only the first qubit:
The probability of getting |0⟩ = |α|² + |β|², and |1⟩ = |γ|² + |δ|², because those are the amplitudes
compatible with each outcome for the 1st qubit.
Suppose I measure the first qubit and get |0⟩. What can I say about the second qubit?
Well, we’ve narrowed down the possibilities to α|00⟩ and β|01⟩. The state of the system is thus
now the superposition

|0⟩ ⊗ (α|0⟩ + β|1⟩) / √(|α|² + |β|²)   ←---- Don’t forget to normalize
This is called the Partial Measurement Rule
Systems collapse minimally to fit your measurements.
This is actually the last rule of quantum mechanics that we’ll cover in the course. Everything else
is just a consequence of rules we’ve already covered.
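The rule can be sketched in a few lines of Python (the function name and state layout are ours, not from the notes):

```python
def measure_first_qubit(state):
    """state = [a00, a01, a10, a11]. Return (Pr[first qubit = 0],
    the renormalized post-measurement state for outcome 0)."""
    a00, a01, a10, a11 = state
    p0 = abs(a00) ** 2 + abs(a01) ** 2
    norm = p0 ** 0.5
    post = [a00 / norm, a01 / norm, 0, 0]   # don't forget to normalize
    return p0, post

p0, post = measure_first_qubit([0.5, 0.5, 0.5, 0.5])
print(p0)      # 0.5
print(post)    # first qubit |0>, second left in (|0> + |1>)/sqrt(2)
```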
What if we wanted to always do NOT on the 2nd bit? This is I ⊗ NOT (nothing on the 1st bit,
NOT on the 2nd bit):

( 0 1 0 0 )
( 1 0 0 0 )
( 0 0 0 1 )
( 0 0 1 0 )
Very often in quantum information we’ll want to take a group of qubits and perform an operation to one
of them, say Hadamard the 3rd qubit.
What that means in terms of the matrices is applying I ⊗ I ⊗ H ⊗ … ⊗ I
What’s H ⊗ H?

½ ( 1  1  1  1 )
  ( 1 -1  1 -1 )
  ( 1  1 -1 -1 )
  ( 1 -1 -1  1 )

Why should it look like this? Look at the first column: (H ⊗ H)|00⟩ = |++⟩, which means for each
qubit there’s an equal probability its output lands on |0⟩ or |1⟩.
All of these are examples of using tensor products to build bigger unitary matrices, except for the
Controlled NOT, where the 1st bit affects the 2nd. We’ll need operations like that in order to have one
qubit affect another.
Start with 2 qubits at |00⟩:

( 1 )
( 0 )  =  |00⟩
( 0 )
( 0 )

Apply Hadamard to the 1st qubit:

( 1/√2 )
(  0   )  =  (|00⟩ + |10⟩)/√2  =  |+⟩ ⊗ |0⟩
( 1/√2 )
(  0   )

Apply a Controlled NOT with the 1st qubit as the control and the 2nd as the target:

( 1/√2 )
(  0   )  =  (|00⟩ + |11⟩)/√2
(  0   )
( 1/√2 )
The Controlled NOT can also be written as |x, y⟩ -> |x, y ⊕ x⟩ (the target gets XORed with the control).
The state that this circuit ends on, (|00⟩ + |11⟩)/√2, is called the Bell Pair or EPR Pair.
This state is particularly interesting because measuring the 1st qubit collapses the 2nd qubit. It
can’t be factored into a tensor product of the 1st qubit’s state and the 2nd’s.
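The whole circuit, sketched with small list-based helpers (the names are ours):

```python
def kron(A, B):
    """Tensor product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

s = 1 / 2**0.5
H = [[s, s], [s, -s]]
I = [[1, 0], [0, 1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

state = [1, 0, 0, 0]                  # |00>
state = matvec(kron(H, I), state)     # Hadamard the 1st qubit
state = matvec(CNOT, state)           # CNOT, 1st qubit controls the 2nd
print(state)                          # amplitude 1/sqrt(2) on 00 and 11 only
```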
An Entangled state cannot be decomposed into a tensor product, while an Unentangled state can.
The basic rules of quantum mechanics force these properties to exist. They were noticed fairly
early in the history of the field. It turns out that most states are entangled.
As we mentioned earlier, entanglement was what troubled Einstein about quantum mechanics. He
thought that it meant that quantum mechanics must entail faster than light communication.
That’s because particles need to be close to become entangled, but once they're entangled you can
separate them to an arbitrary distance and they’ll stay entangled. This has actually been demonstrated
experimentally for distances of up to 150 miles.
But what if, before that, Alice takes this state and Hadamards the 1st qubit?
Well, it maps |00⟩ to |00⟩ + |10⟩ and |11⟩ to |01⟩ − |11⟩ (ignoring normalization). That gives us:

(|00⟩ + |10⟩ + |01⟩ − |11⟩) / 2        Remember H|0⟩ = |+⟩, etc.
So now, applying the Partial Measurement Rule, what is Bob’s state?
If Alice sees |0⟩, then the state collapses to the possibilities where Alice sees |0⟩:
(|00⟩ + |01⟩)/√2, leaving Bob’s qubit in |+⟩.
Conversely, if Alice sees |1⟩:
(|10⟩ − |11⟩)/√2, leaving Bob’s qubit in |-⟩.
The EPR paper goes on to argue that this is more troubling than before: Alice’s choice of which
basis to measure in determines whether Bob’s qubit collapses in the |0⟩,|1⟩ basis or the |+⟩,|-⟩
basis. And that looks a lot like faster than light communication.
One thing we can do is ask “what happens if Bob makes a measurement?”
● In the case where Alice measured in |0⟩,|1⟩, Bob will see |0⟩ or |1⟩ with equal probability.
● In the case where Alice Hadamards her bit, then measures in |+⟩,|-⟩…
○ Bob will still see |0⟩ or |1⟩ with equal probability (measuring in the |0⟩,|1⟩ basis)
So the probability that Bob sees |0⟩ or |1⟩ is the same regardless of what Alice does.
People decided that it looked like there was something more general going on here, though. And
so a different description should exist of Bob’s part of the state that’s unaffected by Alice’s
measurements. Which brings us to…
Mixed States
We’ve only talked about Pure States so far (isolated quantum systems), but you can have
quantum uncertainties layered together with regular, old uncertainty. This becomes important when we
talk about states where we’re only measuring one part. If we look at the whole Alice-and-Bob-system
together, it’ll look like a pure state.
Lecture 6: Thurs Feb 2
Last time we discussed the Bell Pair, and how if Alice measures her qubit in any basis, the state
of Bob’s qubit collapses to whichever state she got for hers. That being said, there’s a formalism that tells
us that Bob can’t do anything to distinguish which basis Alice makes her measurement in, and thus no
information travels instantaneously. This brings us to…
Mixed States
Which are probability distributions over quantum superpositions.
We define a mixed state as a distribution over quantum states: {pi, |Ψi⟩} = {p1, |Ψ1⟩, … , pn, |Ψn⟩}.
Note that the |Ψi⟩ don’t have to be orthogonal.
Thus, we can think of a pure state as a degenerate case of a mixed state
where one of the probabilities is 1.
The tricky thing about mixed states is that they have to preserve the property we discussed above
(that the basis Alice measures in doesn’t affect Bob’s state), which is to say that if we used the {pi, |Ψi⟩}
notation, we’d be allowing multiple instances of the notation to represent the same state. For example,
the equal mixture of |0⟩ and |1⟩ could equally well be written as the equal mixture of |+⟩ and |-⟩.
To avoid this, we’ll use…
Density Matrices
represented as ρ = ∑i pi |Ψi⟩⟨Ψi|
Note that a mixture of |0⟩ and |1⟩ is different from a superposition of |0⟩ and |1⟩ (aka |+⟩), and so
they have different density matrices. However, the mixture of |0⟩ and |1⟩ and the mixture of |+⟩ and |-⟩
have the same density matrix: which makes sense because Alice converting between the two bases in our
example above should maintain Bob’s density matrix representation of the state.
In fact, this is true of whichever basis Alice chooses, and so for orthogonal
vectors |v⟩ and |w⟩ we have that (|v⟩⟨v| + |w⟩⟨w|)/2 = I/2.
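A sketch verifying that the two mixtures collapse to the same density matrix (the helper names are ours):

```python
def outer(v):
    """|v><v| as a list of rows."""
    return [[a * complex(b).conjugate() for b in v] for a in v]

def equal_mixture(states):
    """Average the pure-state density matrices |psi><psi|."""
    mats = [outer(st) for st in states]
    d = len(states[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(d)]
            for i in range(d)]

r = 1 / 2**0.5
rho_01 = equal_mixture([[1, 0], [0, 1]])    # mixture of |0> and |1>
rho_pm = equal_mixture([[r, r], [r, -r]])   # mixture of |+> and |->

print(rho_01)   # I/2, the maximally mixed state
print(rho_pm)   # the same matrix, as promised
```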
Measuring ρ in the basis |1⟩, … , |N⟩ gives us the probability of |i⟩ to be:
Pr[|i⟩] = ρii = ⟨i| ρ |i⟩
which is represented by the diagonal entries of the density matrix.
You don’t need to square the value or anything, because the Born Rule
is already encoded in the density matrix (i.e. α1·α1* = |α1|²).
That means that a density matrix which is diagonal,

( p1        )
(     …     )
(        pN )

is just a fancy way of writing a classical probability distribution. While a pure state like |+⟩ would look like

|+⟩⟨+| = ( ½ ½ )
         ( ½ ½ )
The matrix I/2 we’ve encountered above, as the even mixture of |0⟩ and |1⟩ (and also that of |+⟩
and |-⟩) is called the Maximally Mixed State. This state is basically just the outcome of a classical coin
flip, which gives it a special property:
Regardless of the basis we measure it in, both outcomes will be equally likely.
So for some basis |v⟩, |w⟩ you get the probabilities:
⟨v| I/2 |v⟩ = ½ ⟨v|v⟩ = ½
⟨w| I/2 |w⟩ = ½ ⟨w|w⟩ = ½
This explains why Alice is unsuccessful in sending a message to Bob: the maximally mixed state in any
other basis is still the maximally mixed state.
What happens when we apply a unitary U to a mixed state? Each element of the mixture gets transformed:

∑i pi (U|Ψi⟩)(U|Ψi⟩)† = ∑i pi U|Ψi⟩⟨Ψi|U† = UρU†

You can pull out the U’s since it’s the same one applied to each element of the mixture.
It’s worth noting that getting n² values in the density matrix isn’t some abstraction; you really need all
those extra parameters. What do the off-diagonal entries represent?
|+⟩⟨+| = ( ½ ½ )
         ( ½ ½ )

These are where all the ‘quantumness’ resides.
It’s where the interference between qubits is represented.
They can be different depending on relative phase:
|+⟩ has positive off-diagonal entries
|-⟩ has negative off-diagonal entries

|i⟩⟨i| = (  ½  -i/2 )
         ( i/2   ½  )
Later we’ll see that as a quantum system interacts with the environment, the off-diagonal entries get
pushed down toward zero. The density matrices in experimental quantum computing papers look like

( ½ ε )
( ε ½ )

The bigger the off-diagonal values, the better the experiment, because it
represents them seeing more of the quantum effect.
Remember that for any unitary U you can view ρ as UρU†, whose diagonal has to be a probability
distribution. If we want that condition to hold for every U, then in linear algebra terms, we need to
add the restriction that ρ be positive semidefinite: all of its eigenvalues must be ≥ 0.
As a refresher: for the matrix ρ, the eigenvectors |Ψ⟩ satisfy the equation
ρ|Ψ⟩ = λ|Ψ⟩ for some eigenvalue λ
If we had a negative eigenvalue, then ⟨Ψ|ρ|Ψ⟩ = λ would be < 0, which is nonsense for a probability.
So you can write ρ = ∑i λi |Ψi⟩⟨Ψi|, where the λi are the eigenvalues and the |Ψi⟩ are the eigenvectors.
One quantity you can always compute for density matrices is:
Rank
rank(ρ) = the number of non-zero λi’s
In general, if you have a bipartite pure state, it’ll look like

|Ψ⟩ = ∑ αij |i⟩|j⟩   (summing over i, j = 1 to N)

And you can get Bob’s local density matrix:

(ρBob)j,j’ = ∑i αij αij’*
This process of going from a pure state of the whole system to the mixed state of one of its parts is called Tracing Out.
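That formula, sketched for the Bell pair (the helper names are ours):

```python
def trace_out_alice(a):
    """a[i][j] = amplitude of |i>_Alice |j>_Bob.
    Returns Bob's local density matrix (rho_Bob)[j][j'] = sum_i a_ij a_ij'*."""
    d = len(a[0])
    return [[sum(a[i][j] * complex(a[i][jp]).conjugate() for i in range(len(a)))
             for jp in range(d)] for j in range(d)]

r = 1 / 2**0.5
bell = [[r, 0],    # amplitudes of |00> and |01>
        [0, r]]    # amplitudes of |10> and |11>

rho_bob = trace_out_alice(bell)
print(rho_bob)     # I/2: Bob's half of a Bell pair is maximally mixed
```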
● 2 quantum states will lead to different measurement probabilities iff they have different density matrices
2) No-Communication Theorem
● If Alice and Bob share an entangled state, nothing Alice chooses to do will have any
effect on Bob’s density matrix.
○ In other words, there’s no observable effect on Bob’s end. Which is the
fundamental reason that quantum mechanics is compatible with the physical
limitations of reality.
Lecture 7: Tues Feb 7
The No Communication Theorem says that if Alice and Bob share an entangled state

|Ψ⟩ = ∑ αij |i⟩Alice |j⟩Bob   (summing over i, j = 1 to N)

there’s nothing that Alice can do to her subsystem that can affect Bob’s density matrix.
We have the tools to prove this: just apply a unitary to Alice’s side (tensored with the identity on
Bob’s), then see if Bob’s density matrix changes. Or have Alice measure her qubit, then see if Bob’s
density matrix changes, etc.
Note that if we condition on the outcome of Alice’s measurement (i.e. say that if Alice sees i then Bob
will see j), we may need to update Bob’s density matrix, but that’s also true in the classical world.
Bloch Sphere
is a geometric representation of all possible states of a qubit.
We’ve often drawn the state of qubits as a circle, which is already a little
awkward: half of the circle is going to waste since |0⟩ = -|0⟩ (both represent the
same density matrix).
We can see that |+⟩ and |-⟩ should be between |0⟩ and |1⟩. Then we can add
|i⟩ and |-i⟩ as a new dimension.
The mixture of any states |v⟩ and |w⟩ represented as points in or on the sphere can be said to be a point
between the two.
We can show geometrically that every mixed state can be written as a mixture of only two pure
states because you can always draw a line that connects any pure state you want to some point in the
sphere representing a mixed state, and then see which other pure state that the line intersects on the way
out. By some vector math, the point can be described as some linear combination of the vectors
representing pure states.
Experimentalists love the Bloch sphere, because it works almost identically to how spin works
with electrons.
With these things called Spin-½ Particles you can measure the electron spin relative to any axis
of the sphere. You see if the electron is spinning clockwise or counterclockwise relative to the axis. And
that behaves just like a qubit, in that the measurement collapses a more complex behavior into a binary
result.
The weird part about Spin-½ Particles is that you could have asked the direction of the spin
relative to any other axis. So what’s really going on: What’s the real spin direction? It turns out that it’s
some point on the Bloch Sphere. So if the state of the electron is that it’s spinning in the (1,0,0) direction,
we can say that it’s in the |0⟩ state, and if it’s spinning in the (0,1,0) direction, we can say that it’s in the
|+⟩ state, and so forth.
To clarify, a procedure that outputs some |Ψ⟩ can be rerun to get |Ψ⟩ repeatedly. What the No Cloning
Theorem says is that if the |Ψ⟩ is unknown, then you can’t make a copy.
In general, for any orthonormal basis you can clone basis vectors. For example, doing CNOT on
|+⟩|0⟩ produces the Bell Pair (|00⟩ + |11⟩)/√2, which sort of “copies” the first qubit, but only in the
|0⟩,|1⟩ basis.
Since the No Cloning Theorem is so important, we’ll present another proof of it:
A unitary transformation can be defined as a linear transformation that
preserves inner products, which is to say that the angle between U|v⟩ and U|w⟩ is the
same as the one between |v⟩ and |w⟩.
Thus ⟨w|U†U|v⟩ = ⟨w|v⟩.
Now suppose U were a cloner, so that U|v⟩|0⟩ = |v⟩|v⟩ and U|w⟩|0⟩ = |w⟩|w⟩. Preserving inner products
would then require ⟨w|v⟩ = ⟨w|v⟩². Writing c = ⟨w|v⟩, c only ever equals c² if c is 0 or 1: so cloning can
only work when |v⟩ and |w⟩ are identical or orthogonal, i.e. vectors of the same orthonormal basis.
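A quick numeric check of this (our own verification, assuming nothing beyond numpy): cNOT acts like a cloner on the basis states |0⟩ and |1⟩, but on |+⟩ it produces the Bell Pair instead of the true copy |+⟩⊗|+⟩.

```python
import numpy as np

# cNOT in the basis |00>, |01>, |10>, |11> (first qubit controls the second)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])
plus = (zero + one) / np.sqrt(2)

def try_clone(v):
    """Apply cNOT to |v>|0> and compare against the true copy |v>|v>."""
    return CNOT @ np.kron(v, zero), np.kron(v, v)

out, target = try_clone(zero)
assert np.allclose(out, target)         # basis states clone fine
out, target = try_clone(one)
assert np.allclose(out, target)
out, target = try_clone(plus)
assert not np.allclose(out, target)     # |+> does NOT get cloned...
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
assert np.allclose(out, bell)           # ...we get the Bell Pair instead
```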
There’s a problem in classical probability that’s a nice analog to the No Cloning Theorem.
If we have a coin with some probability of heads, can we produce another coin with the same
probability distribution? [Assuming the coin was given its bias through some process unknown
to us]
The No Cloning Theorem has all sorts of applications to science fiction, because you can’t make arbitrary
copies of a physical system (say, for teleporting yourself) if any of the relevant information (say, in
your brain) is encoded in quantum states.
Quantum Money
is an application of the No Cloning Theorem. In some sense it was the first idea in quantum
information, and was involved in the birth of the field. The original quantum money scheme was
proposed by Wiesner in 1969, though it was only published in the 80s.
Wiesner had left research by then. He eventually became a sheep herder.
Wiesner realized that uncloneability is useful for money to prevent counterfeiting. In practice,
mints use special ink, watermarks, etc., but that’s essentially just an arms race with the counterfeiters. So
Wiesner proposed using qubits to make physically uncounterfeitable money.
The immediate problem is that a money scheme needs both uncloneability and verifiability, and the two
pull against each other.
Wiesner’s Scheme
Have quantum bills (WLOG all are the same denomination). Each has:
● A classical serial number S ∈ {0,1}ⁿ
● A quantum state |Ψf(S)⟩ (of n qubits)
○ The qubits in this state are unentangled and will always be in one of four states:
■ |Ψ00⟩ = |0⟩   |Ψ01⟩ = |1⟩   |Ψ10⟩ = |+⟩   |Ψ11⟩ = |-⟩
In order to decide the state of a given bill, the bank maintains a giant database that stores for all bills in
circulation:
The classical serial number, and a function that takes the serial number as input and decides
which basis to measure each qubit in (and which basis vector it should be).
S1, f(S1)
S2, f(S2)
S3, f(S3)
⋮
Wiesner’s scheme has an important engineering problem though: you need to ensure that qubits don’t lose
their state (coherence). With current technology, qubits in a lab decohere in like an hour, tops.
There’s two basic things needed for a scheme like this: verifiability and uncloneability.
To verify a bill: bring it back to the bank. The bank verifies the bill by looking at the serial number
and looking up how each qubit in the bill was supposed to be prepared. If the qubit was supposed to be
prepared in |0⟩ or |1⟩, measure in that basis (and likewise for |+⟩,|-⟩), and check that the expected
outcome comes back.
Consider a counterfeiter that doesn’t know what basis each qubit is supposed to be in, and
encodes each qubit of a fake bill in a random allowable state. They only have a (½)ⁿ chance of guessing all the right bases.
The security of this scheme wasn’t considered when it was proposed. Professor Aaronson asked about it
on Stack Exchange a few years ago which prompted someone to write a paper on it.
Lecture 8: Thurs Feb 9
Guest Lecture by Supartha Podder
So the bank will check each quantum state. Suppose the counterfeiter measures every qubit in the
|0⟩,|1⟩ basis and prepares two bills from what they see: the qubits that should be in the |0⟩,|1⟩ basis pass
every time, while the ones that should be |+⟩ or |-⟩ collapse, and both copies then pass with probability
only ¼.
The odds of the counterfeiter succeeding (the bank reading all states correctly) are then
(½·1 + ½·¼)ⁿ = (⅝)ⁿ.
Attacks of this sort on a single bill have an upper success bound of (¾)ⁿ.
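A Monte Carlo sketch of the measure-and-copy attack described above (the simulation setup is ours): per qubit, both forged bills pass with probability ½·1 + ½·¼ = ⅝.

```python
import random

random.seed(0)

def qubit_passes_twice():
    if random.random() < 0.5:       # bill qubit was |0> or |1>: exact copy,
        return True                 # so both bills always pass
    # Bill qubit was |+> or |->: it collapses under the counterfeiter's
    # computational-basis measurement, and each bill then passes the bank's
    # |+>,|-> check independently with probability 1/2.
    return random.random() < 0.5 and random.random() < 0.5

trials = 200_000
rate = sum(qubit_passes_twice() for _ in range(trials)) / trials
print(rate)    # close to 5/8 = 0.625
```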
Interactive Attack
There’s an attack on this scheme based around the fact that verification involves giving the bank
the bill, then the bank returns the bill and whether or not it’s valid.
We can repeatedly go to the bank, ask them to verify the bill.
For some qubit that we set to |0⟩
if the bank measured it correctly, we know it’s not |1⟩
if the bank measured it incorrectly, we know it’s not |0⟩
We can similarly distinguish between |+⟩ and |-⟩.
So running the verification scheme over each possibility for that quantum state allows us to get a strong
picture of what state the bank is verifying it against.
Running this procedure O(log n) times lets you copy the note with probability 1 − O(1/n²).
Attack Based on the Elitzur Vaidman Bomb
Set |c⟩ to |0⟩
Repeat π/(2ε) times:
    Apply Rε to |c⟩
    Apply cNOT to |c⟩|Ψ1⟩ (with |c⟩ as the control)
    Give the bill to the bank to verify (this is the step where we risk getting caught)
Then measure |c⟩ and send the bill back to the bank.
Each time we apply cNOT given |Ψ1⟩ = |0⟩, we get (cos ε|0⟩ + sin ε|1⟩)|0⟩ → cos ε|00⟩ + sin ε|11⟩.
Most of the time |c⟩ will stay at |0⟩,
which means at each step the probability of getting caught is sin²ε.
Thus Prob[getting caught at all] is bounded by (π/2ε)·sin²ε = O(ε).
The same holds for |1⟩ and |-⟩.
But if |Ψ1⟩ = |+⟩, cNOT doesn’t have the same effect: |+⟩ is unchanged by NOT, so
(cos ε|0⟩ + sin ε|1⟩) ⊗ (|0⟩ + |1⟩)/√2
stays unentangled, and the repeated rotations will eventually rotate the control qubit to |1⟩.
So when we measure at the end, we can distinguish |+⟩ from the other states, because it’s the only one that
will be measured at |1⟩.
We can similarly distinguish each of the other three states by adapting the procedure to the other three possibilities.
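The attack can be simulated directly with statevectors. One modeling assumption in this sketch: the bill is handed to the bank for verification after every cNOT, which is what makes the per-step catch probability sin²ε; we post-select on never being caught and look at the control qubit |c⟩ at the end.

```python
import numpy as np

eps = 0.01
steps = int(np.pi / (2 * eps))
R = np.array([[np.cos(eps), -np.sin(eps)],
              [np.sin(eps),  np.cos(eps)]])           # the eps-rotation R_eps
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

def run(money):
    state = np.kron([1, 0], money)            # |c> = |0>, times the money qubit
    P = np.outer(money, money.conj())         # bank projects onto the correct
    passed = 1.0                              # preparation at each check
    for _ in range(steps):
        state = CNOT @ (np.kron(R, np.eye(2)) @ state)
        state = np.kron(np.eye(2), P) @ state # post-select: not caught
        norm = np.linalg.norm(state)
        passed *= norm ** 2
        state /= norm
    prob_c1 = np.linalg.norm(state.reshape(2, 2)[1]) ** 2
    return passed, prob_c1                    # P[never caught], P[|c> reads 1]

p0, c1_zero = run(np.array([1, 0], dtype=complex))                    # |0>
p_plus, c1_plus = run(np.array([1, 1], dtype=complex) / np.sqrt(2))   # |+>
# For |0> the Zeno effect pins |c> at |0> (and we almost never get caught);
# for |+> the cNOT does nothing, so |c> rotates all the way to |1>.
```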
This scheme still has a fundamental problem, which is that to make a transaction, you need to go to the
bank. If you have to go to the bank, you might as well do an account transfer instead. The point of
currency is that anyone should be able to verify it. Which brings us to...
The One-Time Pad
As the name implies, this technique can only be used once securely, and it requires Alice and Bob
to share some initial knowledge. In fact, it’s been proven that Alice and Bob either need initial secret
information in common or you must make computational assumptions on an eavesdropper Eve.
So we want a scheme with no assumptions on Eve in which to share a secret (presuming we have
a classical authenticated channel: it can be read by Eve, but not tampered with).
In cryptography we want secrecy and authentication.
This protocol is only going to deal with secrecy.
BB84
This quantum key distribution scheme was already implicit in Wiesner’s paper and was later formalized
by Bennett and Brassard in 1984 (hence the name). It circumvents the issues we’ve seen in maintaining a qubit, because it only requires coherence
for the time it takes for communication between Alice and Bob.
There are companies that are currently already doing quantum key distribution through fiber optic
cables over up to 10 miles. There are people trying to make it work from ground to satellite, which would get
around the limitations of fiber optics, basically letting you do quantum key distribution over arbitrary
distances. China actually has a satellite up for this express purpose.
Here’s a diagram from the original paper that shows how BB84 works.
The basic idea is that you’re trying to establish some shared secret knowledge and you want to
know for certain that no eavesdroppers on the channel can uncover it. You’ve got a channel in which to
transmit quantum information, and a channel in which to transmit classical information. In both, no one
can impersonate Alice or Bob (authenticity), but eavesdroppers may be able to listen in (no secrecy).
● So Alice chooses a string x of random bits, x ∈ {0,1}ⁿ
● And another string of random bits y ∈ {0,1}ⁿ, which she uses to decide which basis to encode
each bit of x in.
● She then encodes each qubit in the |0⟩,|1⟩ basis (in the diagram it’s R) or the |+⟩,|-⟩ basis (D).
● Then she sends the qubits over to Bob.
● Bob picks his own random string y’ ∈ {0,1}ⁿ and uses y’i to decide which basis
to measure the iᵗʰ qubit in (picking again between D and R).
Now Alice and Bob publicly compare which bases they picked to encode and measure in (the y’s) and discard any
positions where they didn’t pick the same one (about half of them).
At this point we consider an eavesdropper Eve who was listening in to the qubits that were sent over. The
whole magic of using qubits is that if Eve listened in on the transmission she inherently changed the
qubits that Bob received. Sure, if she measured a |0⟩,|1⟩ qubit in that basis, the qubit didn’t change, but
what if she measured a |+⟩,|-⟩ qubit in the |0⟩,|1⟩ basis?
If Alice sent |+⟩, then Eve measured |0⟩ or |1⟩ and passed that along to Bob. Then Bob, measuring
in the |+⟩,|-⟩ basis, gets |+⟩ or |-⟩ each with 50% probability, so a mismatch shows up half the time.
So Alice and Bob can verify that no one listened in to their qubit transmission by making sure
that some portion of their qubits that they believe match do match. Of course these qubits aren’t going to
be secret anymore, but they’ve still got all the others.
If any of the qubits didn’t match, then Eve eavesdropped and they can just try again and again
until they can get an instance where no one listened in.
The idea is that now Alice and Bob share some secret information and can thus use some classical
encryption scheme, like a One-Time Pad.
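The whole protocol fits in a short toy simulation (the code structure and names are ours, not from the BB84 paper), including an intercept-resend Eve: with Eve listening, about ¼ of the sifted positions disagree, which is exactly what Alice and Bob's spot check catches.

```python
import random

random.seed(1)
n = 4000

def run_bb84(eve_listens):
    x  = [random.randint(0, 1) for _ in range(n)]    # Alice's bits
    y  = [random.randint(0, 1) for _ in range(n)]    # Alice's bases (0=R, 1=D)
    yp = [random.randint(0, 1) for _ in range(n)]    # Bob's bases
    bob = []
    for i in range(n):
        bit, basis = x[i], y[i]
        if eve_listens:
            eve_basis = random.randint(0, 1)
            if eve_basis != basis:           # wrong basis: Eve's outcome is
                bit = random.randint(0, 1)   # random, and the state gets
            basis = eve_basis                # re-prepared in Eve's basis
        if yp[i] != basis:                   # Bob measures in the wrong basis:
            bit = random.randint(0, 1)       # random outcome
        bob.append(bit)
    sifted = [(x[i], bob[i]) for i in range(n) if y[i] == yp[i]]
    return sum(a != b for a, b in sifted) / len(sifted)

no_eve = run_bb84(eve_listens=False)
with_eve = run_bb84(eve_listens=True)
print(no_eve)     # 0.0: an undisturbed channel never shows errors
print(with_eve)   # about 0.25
```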
Recitation Session
(Patrick)
Applying the gates X, Y, Z, or H is the same as doing a half turn around the respective axis.
S corresponds to a quarter turn around Z [in the |+⟩ to |i⟩ direction].
T² = S, so T corresponds to an eighth turn around Z.
Rπ/4 (rotation by π/4 in the plane of |0⟩ and |1⟩) corresponds to a quarter turn around Y, since angles get doubled on the Bloch Sphere.
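These recitation facts are easy to check numerically (our own verification, not from the notes): T² = S; S is a quarter turn around Z, sending |+⟩ to |i⟩; Rπ/4 sends |0⟩ to |+⟩, a quarter turn around Y; and the half turns square to the identity.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
S = np.array([[1, 0], [0, 1j]])
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])
R = np.array([[np.cos(np.pi / 4), -np.sin(np.pi / 4)],
              [np.sin(np.pi / 4),  np.cos(np.pi / 4)]])   # R_{pi/4}

plus = np.array([1, 1]) / np.sqrt(2)
plus_i = np.array([1, 1j]) / np.sqrt(2)                   # |i> state

assert np.allclose(T @ T, S)                    # an eighth turn twice
assert np.allclose(S @ plus, plus_i)            # S: |+> -> |i>
assert np.allclose(R @ np.array([1, 0]), plus)  # R_{pi/4}: |0> -> |+>
for U in (H, X, Z):
    assert np.allclose(U @ U, np.eye(2))        # half turns square to I
```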
Lecture 9: Tues Feb 14
To review: We’ve seen 3 different types of states in play:
● Basis States
○ exist in a computational basis |i⟩
● Pure States
○ superpositions of basis states |Ψ⟩ = Σᵢ αᵢ|i⟩
● Mixed States
○ classical probabilities over pure states ρ = Σᵢ pᵢ|Ψᵢ⟩⟨Ψᵢ|
Wiesner’s Scheme, as we’ve seen it, requires the bank to hold a lot of information. The paper
(BBBW 82) circumvents this by basically saying: let f be a pseudorandom function, so that for any serial
number Sₖ the bank can compute f(Sₖ) on the fly instead of storing it.
Why is this secure?
We use a reduction argument. Suppose that the counterfeiter can copy money by some means.
What does that say about f? The counterfeiter would let us distinguish f from a truly random function,
so f wasn’t very good at being pseudorandom in the first place.
Superdense Coding
is the first protocol we’ll see that involves entanglement. Basic information theory (Shannon) tells
us that “with n bits you can’t send more than n bits of information.”
Now we’ll see how Alice can send Bob two classical bits by sending one qubit, though there is a
catch: Alice and Bob must share entanglement ahead of time.
In the scenario with no prior entanglement, you can’t send more than one bit per qubit.
If Alice sends |Ψ⟩ = α|0⟩ + β|1⟩ to Bob, he can only measure it once in some basis and then the
rest of the information in |Ψ⟩ is lost.
Instead, let’s suppose that Alice and Bob share a Bell Pair: (|00⟩ + |11⟩)/√2
We claim that Alice can manipulate her half, then send her qubit to Bob, then Bob can measure both
qubits and get two bits of information.
The key is to realize that Alice can get a state orthogonal to the Bell Pair by applying the following gates
to her qubit:
● NOT
( 0 1 )
( 1 0 )
which gives us (|01⟩ + |10⟩)/√2
● A phase change
( 1  0 )
( 0 -1 )
which gives us (|00⟩ − |11⟩)/√2
● And applying both NOT and the phase change, which gives us (|01⟩ − |10⟩)/√2
More specifically, any pair of these four states is orthogonal.
For Bob to decode this transformation, he’ll want to apply the unitary
     ( 1  0  0  1 )
1/√2 ( 1  0  0 -1 )
     ( 0  1  1  0 )
     ( 0 -1  1  0 )
which corresponds to the gates:
cNOT (2nd qubit controls the 1st)
then Hadamard (on the 2nd qubit)
The idea is that Alice transforms the Bell Pair into one of the four entangled states above, then Bob
decodes that two-qubit state into one of the four possible combinations of |0⟩ and |1⟩ which correspond to
the variables X and Y.
So if Bob receives (|01⟩ − |10⟩)/√2,
applying cNOT gets him |1⟩⊗|-⟩ (up to a global sign), and Hadamard gets him |1⟩⊗|1⟩.
If Bob receives (|00⟩ − |11⟩)/√2,
applying cNOT gets him |0⟩⊗|-⟩, and Hadamard gets him |0⟩⊗|1⟩.
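The full superdense coding round trip can be sketched in a few lines of numpy: Alice applies I, X, Z, or XZ to her half of the Bell Pair, and Bob decodes with cNOT (2nd qubit controlling the 1st) followed by Hadamard on the 2nd qubit.

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=float)
Z = np.array([[1, 0], [0, -1]], dtype=float)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)       # (|00> + |11>)/sqrt(2)

# cNOT with the 2nd qubit controlling the 1st, basis order |00>,|01>,|10>,|11>
CNOT21 = np.array([[1, 0, 0, 0], [0, 0, 0, 1],
                   [0, 0, 1, 0], [0, 1, 0, 0]], dtype=float)

def decode(state):
    """Bob's decoding: cNOT (2nd controls 1st), then H on the 2nd qubit."""
    out = np.kron(I, H) @ (CNOT21 @ state)
    idx = int(np.argmax(np.abs(out) ** 2))       # outcome is deterministic
    return idx >> 1, idx & 1                     # (first bit, second bit)

# Alice encodes (0,0)->I, (1,0)->NOT, (0,1)->phase, (1,1)->both:
for bits, gate in {(0, 0): I, (1, 0): X, (0, 1): Z, (1, 1): X @ Z}.items():
    state = np.kron(gate, I) @ bell              # Alice acts on her qubit only
    assert decode(state) == bits                 # Bob recovers both bits
```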
Naturally, we may want to ask: if Alice and Bob had even more preshared entanglement, could Alice send
an arbitrarily large amount of information through one qubit?
There’s a theorem which answers: No.
It turns out that no matter how many entangled qubits (ebits) Alice and Bob share, each qubit sent can
carry at most two bits of classical information. We summarize what superdense coding achieves through
the resource inequality:
1 qubit + 1 ebit ≥ 2 bits
As far as quantum speed-ups go, this isn’t particularly impressive, but it is pretty cool that it goes against
the most basic rules of information theory established by Shannon himself.
Quantum Teleportation
is a result from 1993 that came as a great surprise. You’ll still see it in the news sometimes given
its irresistible name. In this lecture we’ll go over what it can and can’t do.
The inequality here is almost the converse of the one for superdense coding:
1 ebit + 2 bits ≥ 1 qubit
Which is to say, you need one pair of entangled qubits plus two classical bits in order to transmit
one qubit.
A more in depth explanation is given in the next lecture, but the gist of it is:
Alice has |Ψ⟩ = α|0⟩ + β|1⟩.
Alice applies some transformation to |Ψ⟩, then measures it.
Alice tells Bob some classical information on the phone.
Bob does some transformations (to his qubit of the entangled pair).
Bob now has |Ψ⟩
At the end, will Alice also have |Ψ⟩?
No. A logical consequence of the No Cloning Theorem is that there can only be one copy of the qubit.
Lecture 10: Thurs Feb 16
Quantum Teleportation (Continued)
So let’s say Alice wants to get a qubit |Ψ⟩ = α|0⟩ + β|1⟩ over to Bob, but they do not share a quantum
communication channel. They do, however, have a classical channel and preshared entanglement.
Alice applies cNOT from |Ψ⟩ onto her half of the entangled pair, then a Hadamard to |Ψ⟩.
At which point Alice measures both her qubits in the |0⟩,|1⟩ basis.
This leads to four possible outcomes:
If Alice Sees:          00             01             10             11
Then Bob’s qubit is:  α|0⟩ + β|1⟩   α|1⟩ + β|0⟩   α|0⟩ − β|1⟩   α|1⟩ − β|0⟩
We’re deducing information about Bob’s state using the partial measurement rule. If Alice
sees 00, then we narrow down the state of the entire system to the components that fit, i.e. |000⟩ and |001⟩.
What is Bob’s state, if he knows that Alice measured, but not knowing the measurement?
It’s an even mixture of all four possibilities, which is the Maximally Mixed State.
This makes sense given the No Communication Theorem. Until Alice sends
information over, Bob’s qubit doesn’t depend on |Ψ⟩.
Now, Alice sends Bob her measurements via a classical channel.
If the first bit is 1, he applies ( 1 0 )
( 0 -1 )
If the second bit is 1, he applies ( 0 1 )
                                   ( 1 0 )
These transformations will bring Bob’s qubit to the state α|0⟩ + β|1⟩ = |Ψ⟩.
That means they’ve successfully sent over a qubit without a quantum channel!
This protocol works even if Alice doesn’t know what |Ψ⟩ is.
For this protocol to work, Alice had to measure her syndrome bits. These measurements were
destructive (since we can’t ensure that they’re made in a basis containing |Ψ⟩), and thus Alice
doesn’t have |Ψ⟩ at the end.
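The whole protocol can be verified end to end on a random qubit (our own verification code, not from the notes). Qubit order below: [Alice's |Ψ⟩, Alice's half of the Bell pair, Bob's half]; the corrections follow the table above (second bit 1 → NOT, first bit 1 → phase change).

```python
import numpy as np

rng = np.random.default_rng(42)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)                        # a random unknown qubit

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
state = np.kron(psi, bell).reshape(2, 2, 2)       # 3-qubit amplitude tensor

# Alice: cNOT (qubit 1 controls qubit 2), then Hadamard on qubit 1.
state = np.stack([state[0], state[1][::-1]])      # flip qubit 2 where q1 = 1
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
state = np.einsum('ab,bcd->acd', H, state)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

fidelities = []
for a in (0, 1):                                  # every measurement outcome
    for b in (0, 1):
        bob = state[a, b] / np.linalg.norm(state[a, b])
        if b:
            bob = X @ bob                         # second bit 1 -> apply NOT
        if a:
            bob = Z @ bob                         # first bit 1 -> apply phase
        fidelities.append(abs(np.vdot(bob, psi))) # 1 means Bob holds |psi>
assert min(fidelities) > 1 - 1e-9                 # works in all four branches
```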
Something to think about: Where is |Ψ⟩ after Alice’s
measurement, but before Bob does his operations?
How do people come up with this stuff? I can’t picture how anyone trying to solve this problem would
even begin their search…
Well it’s worth pointing out that quantum mechanics was discovered in 1926 and that quantum
teleportation was only discovered in the 90’s. These sorts of properties can be hard to find. Oftentimes
someone tries to prove that something is impossible, and in doing so eventually figures out a way to get it
done.
Aren't we fundamentally sending infinitely more information than two classical bits if we’ve sent over
enough information to perfectly describe an arbitrary qubit, since the qubit’s amplitudes can be encoded
in an arbitrarily complex way?
I suppose, but you only really obtain the information that you can measure, which is significantly
less. Amplitudes may exist physically, but they’re different from other physical properties like length, in
that they seem to act a lot more like probabilities.
For some α|0⟩ + β|1⟩ you could say that β is a binary expansion that encodes the complete works
of Shakespeare: the rules of quantum mechanics don’t put a limit on the amount of information that it takes
to describe a qubit. With that said, you could also encode the complete works into the bias of a classical coin.
If we can teleport one qubit, the next question we may want to ask is: can we teleport bigger, entangled states?
First, we can notice that the qubit that’s transmitted doesn’t have to be unentangled.
Teleporting a qubit that’s itself entangled with a fourth qubit would entangle that fourth qubit with Bob’s
qubit (you can check this via calculation). That’s not a particularly interesting operation, since it lands you
where you started, with one qubit of entanglement between Alice and Bob, but it does have an interesting
implication.
It suggests that it should be possible to transmit an n-qubit entangled state, by sending each over
at a time, thus using n ebits of preshared entanglement.
One further crazy consequence of this is that two qubits don’t need to interact directly to become
entangled.
A simple example: what does it take for Alice and Bob to get entangled in the first place anyways?
The obvious way is for Alice to create a Bell Pair and send one of the qubits to Bob.
In most practical experiments, the entangled qubits are created somewhere between
Alice and Bob and are then sent off to them.
We’ve seen the Bell Pair, and what it’s good for. There’s an analogue of it for three-party
entanglement called The GHZ State: (|000⟩ + |111⟩)/√2. We’ll see applications of it later in
the course, but for now we’ll use it to show an interesting conceptual point.
Let’s say that Alice, Bob, and Charlie share a GHZ state. If all three of
them get together, they can see that their qubits are correlated; the same can
be said if only two of them are together, though two alone see only classical correlation.
But now suppose that Charlie is gone. Can Alice and Bob use the entanglement between
them to do quantum teleportation?
No. The trick here is that Charlie could measure without Alice and Bob knowing, which would
remove their qubits from superposition, and thus would make the quantum teleportation protocol fail.
A different way to see this is to look at the density matrix shared by Alice and Bob (tracing out Charlie):
ρAB = ( ½ 0 0 0 )
      ( 0 0 0 0 )
      ( 0 0 0 0 )
      ( 0 0 0 ½ )
And notice that it’s different from the density matrix of a Bell Pair shared by Alice and Bob:
ρAB = ( ½ 0 0 ½ )
      ( 0 0 0 0 )     Remember: this gets derived as |Ψ⟩⟨Ψ|
      ( 0 0 0 0 )
      ( ½ 0 0 ½ )
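Tracing Charlie out of the GHZ state can be done numerically, reproducing the two matrices above (the partial-trace code is our own sketch):

```python
import numpy as np

ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)                    # (|000> + |111>)/sqrt(2)

# Density matrix as a tensor with indices a,b,c,a',b',c'; summing over c = c'
# traces out Charlie's qubit.
rho = np.outer(ghz, ghz).reshape(2, 2, 2, 2, 2, 2)
rho_ab = np.einsum('abcdec->abde', rho).reshape(4, 4)

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(bell, bell)                     # |Psi><Psi| for a Bell Pair

assert np.allclose(rho_ab, np.diag([0.5, 0, 0, 0.5]))  # classical correlation
assert not np.allclose(rho_ab, rho_bell)            # corner 1/2's are missing
```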
With GHZ, you can only see the entanglement if you have all three
together. This is often analogized to the Borromean Rings (right), a grouping
of three rings in a way that all three are linked together, without any two being
linked together.
There are other 3-qubit states which aren’t like that…
In the W State, (|100⟩ + |010⟩ + |001⟩)/√3, there’s some entanglement between Alice and Bob, some
between Alice and Charlie, and some between Bob and Charlie, but no pair is maximally entangled.
So how do you quantify how much entanglement exists between the two halves of a state?
It’s worth noting that we sort of get to decide what we think a measure of entanglement ought to
mean. We’ve seen how it can be useful to think of quantities of entanglement as a resource, so we can
phrase the question as “How many ‘Bell Pairs of entanglement’ is this?”
It’s not immediately obvious whether different kinds of entanglement would be good for
different things. That’s actually the case for large multi-party states, but with just Alice and
Bob, it turns out that you can just measure in ‘number of Bell Pairs of entanglement’.
Our first observation here should be that given any bipartite state, you can always find a
representation of it in which Alice’s and Bob’s qubits are written in orthonormal bases. So we can
write the state as
Σᵢ λᵢ|vᵢ⟩|wᵢ⟩
such that all the |vᵢ⟩’s are orthonormal,
and all the |wᵢ⟩’s are orthonormal.
Schmidt Decomposition
Given the matrix of amplitudes representing the entire quantum state,
A = ( α11 … α1n )
    (  ⋮  ⋱  ⋮  )
    ( αn1 … αnn )
we can multiply by two unitary matrices to get a diagonal matrix:
UAV = Λ      (U and V can be found efficiently using linear algebra)
Essentially this means that we’re rotating Alice’s and Bob’s states into orthogonal bases.
We then have the vector of values
( |λ1|² )
(   ⋮   )
( |λn|² )
and we can just ask for its Shannon entropy to figure out how many Bell Pairs that’s equal to.
Lecture 11: Tues Feb 21
For a classical probability distribution D = (P1, ..., Pn), we say its Shannon Entropy is
H(D) = Σᵢ₌₁ⁿ Pᵢ log₂(1/Pᵢ)
Von Neumann Entropy is a generalization of Shannon Entropy from distributions to mixed states.
We say that the Von Neumann Entropy of a mixed state ρ with eigenvalues λ1, ..., λn is
S(ρ) = Σᵢ₌₁ⁿ λᵢ log₂(1/λᵢ)
You could say that Von Neumann Entropy is the Shannon Entropy of the vector of eigenvalues of
the density matrix of ρ. If you diagonalize the density matrix, it represents a probability distribution over
n orthogonal outcomes, and taking the Shannon Entropy of that gives you the Von Neumann Entropy of
your quantum state.
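The definition is one line of numpy, checked here on two easy cases (our own sanity check): a pure state has entropy 0, and the maximally mixed qubit has entropy 1.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Shannon entropy of the eigenvalues of the density matrix rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]            # 0 * log2(1/0) contributes nothing
    return float(np.sum(lam * np.log2(1 / lam)))

pure = np.outer([1, 0], [1, 0]).astype(float)   # |0><0|
maximally_mixed = np.eye(2) / 2

assert abs(von_neumann_entropy(pure)) < 1e-9
assert abs(von_neumann_entropy(maximally_mixed) - 1) < 1e-9
```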
We can now talk about how much entropy is in a bipartite pure state.
Entanglement Entropy
Given Alice and Bob share a bipartite, pure state |Ψ⟩ = Σᵢ,ⱼ αᵢⱼ|i⟩A|j⟩B
To quantify the entanglement entropy, we’ll trace out Bob’s part, and look at the Von Neumann Entropy
of Alice’s side, S(ρA), by asking: If Alice made an optimal measurement, how much could she learn about
Bob’s state?
A sample calculation...
|Ψ⟩ = ⅗|0⟩A|+⟩B + ⅘|1⟩A|-⟩B     This is in Schmidt form: Alice’s states are in the Z basis, Bob’s in the X basis.
E = (⅗)² log₂((5/3)²) + (⅘)² log₂((5/4)²)
  ≈ 0.942
That means that if Alice and Bob share 1000 instances of |Ψ⟩, they’d be able to teleport about 942 qubits.
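The sample calculation can be reproduced numerically via the SVD route described above: write |Ψ⟩ as a 2x2 amplitude matrix, read the Schmidt coefficients off its singular values, and take the Shannon entropy of their squares.

```python
import numpy as np

plus = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)

# Amplitude matrix of |psi> = (3/5)|0>|+> + (4/5)|1>|->
A = (3 / 5) * np.outer([1, 0], plus) + (4 / 5) * np.outer([0, 1], minus)

lam = np.linalg.svd(A, compute_uv=False)   # Schmidt coefficients: 4/5, 3/5
p = lam ** 2                               # a probability distribution
E = float(np.sum(p * np.log2(1 / p)))      # its Shannon entropy
print(E)   # about 0.9427
```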
So for any bipartite state we may want to know how many ebits of entanglement it corresponds to.
There are two values to consider: the entanglement of formation E_F (how many Bell Pairs you need to
create the state) and the distillable entanglement E_D (how many Bell Pairs you can extract from it).
It turns out that there exist mixed states for which E_F >> E_D, which is to say states that take a lot of
entanglement to make, but from which you can only extract a fraction of the entanglement you put in.
We say that a mixed state ρAB is separable if it can be written as a mixture of product states,
i.e. ρAB = Σᵢ pᵢ |vᵢ⟩⟨vᵢ| ⊗ |wᵢ⟩⟨wᵢ|
philosophical debate, as its positions have often corresponded to breakthroughs in quantum mechanics
(we’ll see an example of this with the Bell Inequality).
Most discussions about the implications of quantum mechanics to our understanding of reality
center around The Measurement Problem.
In most physics texts (and in this class), measurement is introduced as an unanalysed primitive
that we don’t question. There’s a fundamental weirdness about it that stems from the fact that quantum
mechanics seems to follow both:
1. Unitary Evolution
when no one is watching: |Ψ⟩ → U|Ψ⟩
2. Measurement
which collapses states to a single possibility: |Ψ⟩ → |i⟩ with probability |⟨i|Ψ⟩|² = |αᵢ|²
In other words, quantum mechanics generally seems to work in a way that’s gradual, continuous,
and reversible most of the time (1), except for during (2), which is the only time we see it work in a way
that’s probabilistic, irreversible, and sudden. So we can alternatively phrase the question as:
“How does the universe know when to apply unitary evolution and when to apply measurement?”
People have argued about this for about 100 years, and the discussion is perhaps best compared to
the discussion surrounding the nature of consciousness (which has gone on for millennia) in that they both
devolve into people talking in circles about each other.
It’s worth discussing the three main schools of thought, starting with…
The Copenhagen Interpretation
The preferred interpretation of most of the founders of quantum mechanics. It was proposed by
Bohr (hence the name) and Heisenberg.
It basically says that there are two different worlds: the quantum world and the physical world.
We live in the physical world, which only has classical information, but in doing experiments we’ve
discovered that there also exists the quantum world “beneath” it, which has quantum information.
Measurement, in this view, is the operation that bridges the two worlds.
It lets us “peek under the hood” into the quantum world and see what’s going on.
Bohr wrote long tracts saying that just to make statements about the quantum world from within the
classical world is to suppose that there exists a boundary between them, and that it is an error to try to
conflate the two. His point of view essentially says “if you don’t understand this, then
you’re just stuck in the old way of thinking, and you need to change”.
You could say that the Copenhagen interpretation is basically just “Shut Up And Calculate” without the “Shut Up” part.
After seeing something weird, instead of shutting up, they’ll write volumes and volumes about how we
can’t find a deeper truth.
The popularity of this point of view corresponds to most researchers thinking, “yes, this is how
we do things in practice”. It seems likely that the popularity of this view isn’t going to last forever,
because at the end of the day, people will want to understand more about what physical states are truly
made of.
Schrödinger’s Cat
There were physicists in the 30s and 40s who never accepted the Copenhagen interpretation,
namely Einstein and Schrödinger, and they came up with plenty of examples to show just how untenable
it is to have a rigid boundary between worlds if you think hard about it.
The most famous of these is Schrödinger’s Cat, which first appears with Einstein saying that if
you think of a pile of gunpowder as being inherently unstable, you could model it as a quantum state
which looks like |unexploded⟩ + |exploded⟩ (suitably normalized).
Then Schrödinger comes along and adds some flair by asking, “What happens if we create a
quantum state that corresponds to a superposition of a state in which a cat is alive and one where the cat is
dead?” He allows for the assumption that the cat is isolated by putting it in a box: |alive⟩ + |dead⟩.
The point of the thought experiment is that the formal rules of quantum mechanics should apply
whenever you have distinguishable states, and thus you should also be able to have linear combinations of
such states. It seems patently obvious that at some point we’re implicitly crossing the boundary between
the worlds, and thus we should have to say something about the nature of what’s going on before
measurement. Otherwise we’d devolve into extreme solipsism in saying that the cat only exists once
we’ve opened the box to observe it.
Wigner’s Friend
Is similar thought experiment. It says that Wigner could be put in a superposition of thinking one
1
thought or another, modeled as √2 ( |Wigner0⟩ + |Wigner1⟩ ).
We can look at the state of him and a friend that’s not aware of his state.
1
|Friend⟩ ⊗ √2 ( |Wigner0⟩ + |Wigner1⟩ )
Whichever branch Wigner is in is what he believes (either one thought or the other) after the
experiment has been performed. But from his friend’s point of view, the experiment hasn’t been
performed. Then the two can talk, making the state
(|Friend0⟩|Wigner0⟩ + |Friend1⟩|Wigner1⟩)/√2
But then what happens if another friend comes along, and then another?
The point is to highlight the incompatibility of the perspectives of two observers: one ascribes a
pure state, the other a mixed state. We need some way of either regarding measurement as fictitious or
believing in only local truth.
Lecture 12: Thurs Feb 23
Last time we discussed a few interpretations of quantum mechanics and today we cover a few
more. The first of these isn’t so much an interpretation, but rather a proposal for a new physical theory.
Dynamic Collapse
Says that maybe quantum mechanics isn’t a complete theory. It does a good job of describing
microscopic systems, but maybe we’re not looking at all of the rules that govern reality.
The idea is that there exist some physics rules that we haven’t discovered, which say that qubits
evolve under unitary transformations, but that the bigger a system is, the likelier it is to collapse.
Thus, we can view this collapse as a physical process that turns pure states into mixed states.
Σᵢ αᵢ|i⟩ ⇝ |i⟩ with probability |⟨i|Ψ⟩|² = |αᵢ|²
So in the Schrödinger’s Cat example, Dynamic Collapse would say that it doesn’t matter how
isolated the box is. There exists some physical law that says that a system that big would eventually
evolve into a mixed state.
(|alive⟩ + |dead⟩)/√2  ⟶  ½(|alive⟩⟨alive| + |dead⟩⟨dead|)
So if you could measure in a basis containing (|alive⟩ ± |dead⟩)/√2, you should be able to distinguish
between these two states.
Theoretically you could implement a measurement in any basis of a multi-qubit system. What this
means for our cat is that there should exist a unitary transformation to get the “cat system” into a basis
where we can measure any of its qubits and get 0 if the cat is alive and 1 if the cat is dead.
Professor Aaronson is currently doing research into what other problems you’d have a solution
for if you solve this problem (are able to measure in an arbitrary basis). There’s already a theorem which
says that if you can distinguish between this and that state, then you must have the technological ability to
rotate between them.
Which means implementing the Schrödinger’s Cat experiment in real life need
not involve animal cruelty: if you were able to distinguish between the alive state
and dead state, you should be able to rotate the dead cat back into the alive state!
The idea of these Dynamic Collapse theories is that even if you had the technology to distinguish
between the two states, a system as big as a cat wouldn’t maintain itself in a pure state for a significant
amount of time.
The trouble with this is that it’s not really interpreting quantum mechanics, it’s just proposing
new laws of physics. Physicists have a high bar for such proposals, and the burden of proof is on you to
explain exactly how big a system needs to get to collapse. Fundamentally, there should be implications
which we’re able to measure the effects of.
The point is that if you propose a Dynamic Collapse theory, the burden is on you to clarify how it works
mathematically. Some suggestions include:
● Collapse happens when some number of atoms get involved
○ which is contradictory to our understanding of atoms, which relies on reductionism
● Collapse happens after a certain mass is reached
The trouble with these theories is that they need to keep adjusting their answers to questions like
“How much mass is enough to collapse it?” based on experimental evidence, which keeps producing
examples of bigger and bigger states in superposition.
Early on, we discussed the significance of the Double Slit Experiment as performed with photons.
People eventually tested it with protons, then molecules, and in 1999 Zeilinger performed it with
Buckyballs: molecules of 60 carbon atoms, enormous by the standards of quantum interference experiments.
To go even further...
Superconducting Qubits
If you take a coil, about 10mm across, and cool it to almost absolute zero, you’ll see a
current that’s in superposition of electrons rotating clockwise or counterclockwise about it.
This constitutes a superposition of billions of particles!
We’ll come back to these in time, as they’re an important technology for quantum computers.
Penrose has a specific prediction for the scale at which collapse happens, which may be testable
in our lifetime, but with GRW, the prediction retreats every time superposition is shown to be possible at
a new scale.
A popular position among people who want nature to be simulatable in a classical computer (and thus
don’t want quantum computers to work) says that:
A frog can be in a superposition of two states. However, a complex quantum computer wouldn’t
work because systems lose superposition after sufficient complexity.
This position is interesting because it could be falsified by building a quantum computer, and
reaching falsifiable theories is what moves these discussions from philosophy to science.
Everett’s Many Worlds Interpretation
What happens if we keep doing experiments and quantum mechanics keeps perfectly describing
everything we see?
i.e. we don’t want to add any new physical laws, but we insist on being realists (saying that there
exists a real state of the world), without treating unitary transformations and measurement as
separate processes.
You can think of measurement as a special case of entanglement. It’s just your brain becoming
entangled with the system that you’re measuring. A cNOT gate is applied from the system you’re
observing onto you.
((|0⟩ + |1⟩)/√2) ⊗ |You⟩  →  (|0⟩|You0⟩ + |1⟩|You1⟩)/√2
Essentially you’ve now branched into one of the two possibilities.
We perceive only one branch, but there exist countless other branches where
one month later every possible thing that could happen happens.
Some versions of this interpretation choose their words carefully to avoid sounding like there exist
several physical worlds, but they all imply it. When Everett came up with this as a grad student at
Princeton, his advisor told him to remove the references to the physical existence of several worlds,
because it wouldn’t chime with the physics establishment at the time, so he published without them.
Eventually Everett left physics for nuclear work. The only lecture he gave on the
topic was at UT decades later when people were finally coming around to the idea.
Deutsch, the biggest current advocate of the Many Worlds Interpretation, was there.
We don’t expect different branches to interfere with one another, because what has happened,
happened, and can’t be changed. |0⟩|Y ou0 ⟩ shouldn’t affect |1⟩|Y ou1 ⟩
This shouldn’t need to be a problem. To get the current world, you apply unitary transformations
representing every branching between the beginning of time and now. Interference would only happen if
two states are reached by applying different unitary transformations. Quantum mechanics says that this is
less likely to happen than an egg unscrambling itself (it’s thermodynamically disfavored).
We’ve said that measurement is the one irreversible part of quantum mechanics, but Many
Worlds says it’s not. In principle we could apply U⁻¹ to get a measurement to unhappen, though like
unscrambling an egg, thermodynamics isn’t going to make it easy.
Lecture 13: Tues Feb 28
Everett’s Many Worlds Interpretation (Continued)
Everett’s Many Worlds Interpretation raises many questions.
Today we’ll tackle two of the most important:
In practice we see probabilistic results to experiments. It’s the reason that we know that quantum
mechanics works in the first place. So people tend to be hesitant about the Everett Interpretation because
it’s not abundantly clear why these probabilities would arise.
Many Worlders say that there exists a “splitting of the worlds” in such a way that amplitudes of ⅗
and ⅘ would correspond to 9/25th “volume of worldness” going one way, and the other 16/25th going the
other.
Some philosophers don’t really buy this because if worlds are equal, why wouldn’t they just
occur with even probabilities? Why bother with amplitudes at all? Many Worlders say that probabilities
are just “baked into” how quantum mechanics works. They justify this by arguing that we already agree
that density matrices bake the Born Rule in (since the main diagonal represents Born Rule probabilities).
There’s all sorts of other technical arguments that come into play, which
boil down to “if nature is going to pick probabilities, they might as well
be these,” lest we get faster-than-light communication, cloning, etc.
Many Worlders say that the opponents of Galileo and Copernicus could also claim the same about
Copernican vs Ptolemaic versions of observations of the planets, since Copernican heliocentrism made no
difference to the predictions of celestial movement.
Today we might say that the Copernican view is better because you could fly outside of the solar
system and see the planets rotating around the sun; it’s only our parochial situation of living on Earth that
motivated geocentrism. On that note, it may be harder to think up a physically possible analog for the
Many Worlds interpretation, since we can’t really get outside of the universe to see the branching.
There is one neat way you could differentiate the two, though...
Last time we talked about increasing the scope of the Double Slit Experiment.
Bringing that thread to its logical conclusion, what if we could run the experiment with a
person?
It would then be necessary to say that observers can branch, and that a person is a
quantum system. That means it would no longer be enough to use the Copenhagen
interpretation.
If you talk to modern Copenhagenists about this they’ll take a quasi-solipsistic view, saying that
if this experiment were run, “the person behaving quantumly doesn’t count as an observer; only I,
the experimenter, do.”
Another place to consider the differences of interpretations is their relationships with special relativity.
Both the Copenhagen Interpretation and Dynamic Collapse appear to be in some tension with
special relativity.
If Alice and Bob share a Bell Pair, and Alice measures her qubit in some basis, Bob’s qubit
instantaneously collapses to that basis. Sure, Bob won’t immediately know the result of Alice’s
measurement, and thus describes his state as I/2, but that’s still a problem.
Simultaneousness for far away things isn’t well defined in special relativity, so people argue that
Alice’s measurement immediately causing a change in Bob’s qubit conflicts with it.
You can see this more clearly by taking a frame of reference where Bob’s change happens first.
How can we say that Alice’s measurement caused it?
The Many Worlds Interpretation doesn’t have to deal with this snag because it doesn’t assert that
collapse actually happens in the first place. It’s ok to view Bob’s change as happening first because
Alice’s measurement didn’t cause it, it was just a branching of the universe.
The second question we want to tackle is the Preferred Basis Problem. It says:
“Let’s say I buy into the argument that the universe keeps branching, well then…”
There’s a whole field of physics that tries to answer questions like these, called...
Decoherence Theory
which says that there are certain bases that tend to be robust to interactions with the environment,
but that most aren’t.
So for the example above, decoherence theory would say that an alive cat doesn’t easily decohere
if you poke it, but that a cat in the (1/√2)(|Alive⟩ + |Dead⟩) state does, because the laws of physics pick out
certain bases as being special.
From the point of view of decoherence theory we say that an event has definitely happened only
if there exist several records of it scattered all over the place (where it’s not possible to collect them all).
This is perhaps best compared to putting an embarrassing picture on Facebook. If only a few
friends share it, you can still take it down. On the other hand, if the picture goes viral, then the cat is out
of the bag, and deleting all copies becomes an intractable problem.
You may think that all the options we’ve seen so far are bizarre and incomprehensible (Einstein certainly
did), and wonder if we could come up with a theory that avoids all of the craziness. This leads us to…
Hidden Variable Theories
which try to supplement quantum state vectors with some sort of hidden ingredients. The idea is
to have α|0⟩ + β|1⟩ represent a calculation to make a prediction on what the universe has already set the
qubit to be: either |0⟩ or |1⟩.
The big selling point of Bohmian Mechanics is that there’s only one random decision that has to
be made. “God needs to use a RNG to place the hidden variables” at the beginning of time, but afterwards
we’re just following the Born Rule.
Bohm and others noticed lots of weird consequences of Bohmian Mechanics. It looks nice with
just one particle, but problems start to arise when you look at a second. Bohmian Mechanics says that you
need to give a definite position for both particles, but people noticed that you can only get that with
faster-than-light influence in hidden variables (since Alice’s local transformation moves Bob’s qubit).
This wouldn’t be useful for faster-than-light communication or the like,
since hidden variables are explicitly defined as unmeasurable.
When Bohm proposed this, he was super eager for Einstein to accept the interpretation, but
Einstein didn’t really go for it, because of the sort of things listed above.
What Einstein really wanted (in modern terms), is a…
Local Hidden Variable Theory
where hidden variables can be localized to specific points in space and time.
The idea is that when entanglement is created, the qubits flip a coin and decide, “if anyone asks, let’s both
be 0,” coming up with such answers for all questions that could be asked (infinite bases and whatnot), and
that each qubit carries a copy around independently.
This is not Bohmian Mechanics: in 1963 John Bell actually wrote a paper that points out the
non-locality of Bohmian Mechanics. Bell says that it would be interesting to show that all hidden variable
theories must be non-local, and in fact the paper has a footnote that says that since publication, a proof of
this has been found.
The idea is that Alice and Bob are placed in separate rooms, and are each given a challenge bit (x
and y, respectively) by a referee. Then Alice sends back bit a, and Bob bit b.
The Bell Inequality is just the statement that the maximum classical win probability for this is 75%.
Bell noticed an additional fact though. If Alice and Bob had a pre-shared Bell Pair, there’s a better
strategy. In fact, the maximum win probability for a quantum strategy is cos²(π/8) ≈ 85%.
The strategy involves Alice and Bob measuring their entangled qubit based on whether x and y are 0 or 1.
This strategy has the amazing property of making Alice and Bob win with probability cos²(π/8) for all
possible values of x and y.
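Both win rates can be made concrete with a small Python sketch. It assumes the usual uniform question distribution, and one conventional choice of measurement angles (Alice: 0 or π/4; Bob: π/8 or −π/8) that the notes don’t spell out:

```python
import math
from itertools import product

# Classical: brute-force all deterministic strategies a(x), b(y).
# Win condition: a XOR b == x AND y.
best = 0.0
for a0, a1, b0, b1 in product([0, 1], repeat=4):
    wins = sum(((a0, a1)[x] ^ (b0, b1)[y]) == (x & y)
               for x in (0, 1) for y in (0, 1))
    best = max(best, wins / 4)
print(best)  # 0.75 -- the Bell inequality bound

# Quantum: measurement angles from one standard strategy (an assumption of
# this sketch). The win probability per question pair is cos^2 of the angle
# difference, except sin^2 when x = y = 1.
alice = [0.0, math.pi / 4]
bob = [math.pi / 8, -math.pi / 8]
p = 0.0
for x in (0, 1):
    for y in (0, 1):
        d = alice[x] - bob[y]
        p += (math.sin(d) ** 2 if x & y else math.cos(d) ** 2) / 4
print(p)  # ~0.8536 = cos^2(pi/8)
```

The brute force confirms that no classical strategy beats ¾, while the angle choice above wins with probability cos²(π/8) on every question pair.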
Lecture 14: Thurs March 2
The Bell/CHSH Game (Continued)
Last time we talked about the CHSH Game and how we can use entanglement to create a better
strategy than the classical one.
The interesting case is where x and y are both set to 1.
This case requires that Alice measured in the |+⟩,|-⟩ basis and got |-⟩.
So what is Bob’s probability of getting |1⟩?
Still cos²(π/8), because the angle between |-⟩ and -|π/8⟩ is π/8, and
global phase doesn’t matter.
The reason this game relates to hidden variable theories is that if all correlation between particles
could be explained as “if anyone asks, we’re both 0,” you’d predict that Alice and Bob would win only
¾ of the time (because that’s how good they can do by pre-sharing arbitrary amounts of classical
information). So you could refute local realism by running this experiment repeatedly—without having to
presuppose that quantum mechanics is true.
Does Alice and Bob’s ability to succeed more than ¾ of the time mean that they are communicating?
No, we know that’s not possible (No Communication Theorem). We can more explicitly work out
what Alice and Bob’s density matrixes look like over time to check this.
Bob’s initial density matrix is (½ 0; 0 ½) = I/2, and after Alice measures it’s still I/2.
So in that sense, no signal has been communicated from Alice to Bob. Nevertheless, if you know
Alice’s measurement and outcome you can predict Bob’s measurement to update his density matrix. That
shouldn’t worry us though, since even classically if you condition on what Alice sees you can change
your predictions.
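We can check this numerically. A small sketch (real amplitudes only; the helper name `bob_density` is ours):

```python
import math

s = 1 / math.sqrt(2)
# Bell pair amplitudes: index = 2 * alice_bit + bob_bit.
psi = [s, 0.0, 0.0, s]

def bob_density(state):
    # Partial trace over Alice: rho[i][j] = sum over Alice's bit a of
    # psi[2a + i] * psi[2a + j] (real amplitudes, so no conjugation needed).
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for a in (0, 1):
        for i in (0, 1):
            for j in (0, 1):
                rho[i][j] += state[2 * a + i] * state[2 * a + j]
    return rho

print(bob_density(psi))  # ~ [[0.5, 0], [0, 0.5]] = I/2

# After Alice's standard-basis measurement, Bob holds |0> or |1> with
# probability 1/2 each; averaging the two outcomes gives the same I/2.
after = [[0.5 * (i == j == 0) + 0.5 * (i == j == 1) for j in (0, 1)]
         for i in (0, 1)]
print(after)
```

Before and after Alice’s measurement, Bob’s density matrix is I/2, so nothing he can measure locally changes.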
Imagine a hierarchy of possibilities within physics of what the universe allows. You’d have
Classical Local Realism at the bottom, where you can determine all outcomes of all measurements you
make, and you only need to use probability when you have incomplete information about local objects.
At the top of the hierarchy is a Faster-Than-Light Science-Fiction Utopia where Alice and Bob
can communicate instantaneously, you can travel faster than
light, and so forth.
If we ran the experiment and Alice and Bob were winning CHSH more than 75% of the time, and
we kept the assumption that the world is classical, then we would have to suppose that faster-than-light
communication is occurring. Instead we suppose the likelier alternative: quantum mechanics is at play.
Let’s say that Alice has two angles: θ0, the angle she outputs if she receives a 0, and θ1, the one
she outputs if she receives a 1. Similarly, Bob has τ0 and τ1.
The same rules apply from the solution we constructed earlier for the CHSH game.
All we’re doing here is changing the chosen vectors into variables to try and
show that there are no better vectors to choose than the ones we did.
We can then say that the probability of success for Alice and Bob is:
P[success] = ¼ [cos²(θ0-τ0) + cos²(θ0-τ1) + cos²(θ1-τ0) + sin²(θ1-τ1)]
(where [1] is the ¼ factor, [2] the cos² terms, and [3] the sin² term)
Why?
1. We assume each outcome has an equal chance of occurring.
2. Alice and Bob win (in most cases) if they output the same bit, so we measure the cosine between
their output angles.
3. Unless both receive a 1. In this case we measure the chance of their angles being different, which
is their sine.
And we can abstract out the 2’s on the cosines by understanding that we could adjust our original vectors
to account for them.
We can also think of these cosines as the inner product of two vectors.
= ½ + ⅛ [U0·V0 + U0·V1 + U1·V0 - U1·V1]
= ½ + ⅛ [U0·(V0 + V1) + U1·(V0 - V1)]
Since these are all unit vectors, they’re bounded by the norms
≤ ½ + ⅛ [||V0 + V1|| + ||V0 - V1||]
And from here, we can use the parallelogram inequality to bound it further
≤ ½ + ⅛ √(2(||V0 + V1||² + ||V0 - V1||²)) = ½ + ⅛ √(2·4)
Which equals
= ½ + √2/4 = ¼ (2 + √2)
Which wouldn’t you know it, brings us to
= cos²(π/8) ≈ 85%
So cos²(π/8) really is the maximum winning percentage for the CHSH game.
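A quick numeric check of the bound, grid-searching the success-probability formula for the CHSH game over Alice’s and Bob’s angles (fixing Alice’s first angle at 0, since only differences of angles matter):

```python
import math

def p_success(t0, t1, u0, u1):
    # P[success] = 1/4 [cos^2(t0-u0) + cos^2(t0-u1) + cos^2(t1-u0) + sin^2(t1-u1)]
    return 0.25 * (math.cos(t0 - u0) ** 2 + math.cos(t0 - u1) ** 2
                   + math.cos(t1 - u0) ** 2 + math.sin(t1 - u1) ** 2)

# Grid the remaining three angles over multiples of pi/8; this coarse grid
# happens to contain the optimal strategy's angles.
grid = [k * math.pi / 8 for k in range(16)]
best = max(p_success(0.0, t1, u0, u1)
           for t1 in grid for u0 in grid for u1 in grid)
bound = math.cos(math.pi / 8) ** 2
print(round(best, 4), round(bound, 4))  # both ~0.8536
```

No grid point beats cos²(π/8), and the optimal angles hit it exactly, consistent with Tsirelson’s bound.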
There’s been a trend in the last 10-15 years to study theories that would go past
quantum mechanics (past Tsirelson's Inequality), but that would still avoid
faster-than-light travel. In such a reality, it’s been proven that if Alice and Bob
want to schedule something on a calendar, they could agree on a date with only
one bit of communication. That’s better than can be done under
the rules of quantum mechanics!
Testing the Bell Inequality
When Bell proposed his inequality, it was meant only as a conceptual point about quantum
mechanics, but by the 1980s it was on its way to becoming a feasible experiment. Alain Aspect (and
others) ran the experiment, and his results were consistent with quantum mechanics.
He didn’t quite get to 85% given the usual difficulties that affect quantum
experiments, but he was able to reach a high statistical confidence
that he was producing wins greater than 80% of the time.
This showed that you can use entanglement to win the CHSH game. Perhaps more impressive is
that winning the CHSH game at > ¾ probability provides evidence that entanglement is there.
Most physicists shrugged, already sold on quantum mechanics (and the existence of
entanglement), but others looked for holes in the experiment, because it refutes the classical view of the
world.
They pointed out two loopholes in the existing experiment, essentially saying “if you squint
enough, classical local realism might still be possible”:
1. Detector Inefficiency
Sometimes detectors fail to detect a photon or they detect non-existent photons. Enough noise in
the experiments could skew the data.
2. The Locality Issue
Taking the measurement and storing it on a computer takes microseconds, which by physics
standards isn’t negligible. Unless Alice and Bob and the referee are very far away from each other, there
could be a sort of “local hidden variable conspiracy” going on, where as soon as Alice measures, some
particle (unknown to physicists) flies over to Bob and says “hey, Alice’s qubit measured to 0. You should
measure to 0 too.”
Aspect was able to close [2], but only in experiments still subject to [1].
By the 2000s, others were able to close [1], but only in experiments still subject to [2].
In 2015 and 2016, several teams were finally able to close both loopholes simultaneously.
There are still people who deny the existence of entanglement, but through increasingly
solipsistic arguments. For example…
Superdeterminism
is a theory that says classical local realism is still the law of the land.
It explains the results of CHSH experiments by saying “We only think Alice and Bob can choose
bases randomly,” and that there’s a grand cosmic conspiracy involving all of our minds, our computers,
and our random number generators with the purpose of ensuring that Alice and Bob win the CHSH game
at > ¾ probability by rigging the measurement bases. That’s all it does.
Nobel Laureate Gerard ‘t Hooft advocates superdeterminism, so it’s not like the idea lacks serious
supporters, but Professor Aaronson is on board with entanglement.
Now we’ll look at other non-local games to see what other tasks the Bell Inequality can help with.
First, we have…
The Odd Cycle Game
in which a referee asks Alice and Bob each about a vertex of an odd cycle (either the same vertex or two
adjacent ones); they win if they give the same color for the same vertex, and different colors for adjacent
vertices.
We take one run of the game to mean the referee asking a question once, and getting a response.
Without loss of generality, answers are always RED or BLUE, and the cycle has size n.
What strategy provides the best probability that Alice and Bob will pass the test and win the game?
We know that the classical strategy has Pr[win] < 1, because for Alice and Bob to agree on a
perfect solution ahead of time, they’d have to find a two-coloring (impossible). The best they can do is
agree on a coloring for all but one of the vertices, which gets them Pr[win] ≤ 1 - 1/(2n).
We claim that the quantum strategy has Pr[win] ≈ 1 - 1/n².
First, Alice and Bob share a Bell pair, (1/√2)(|00⟩ + |11⟩).
Alice and Bob each measure their qubit on a basis depending on the vertex they’re asked about.
The measurement bases each differ by 2π/n, so they’re
evenly spaced between |0⟩ and |1⟩.
The first basis has 0 map to answering BLUE and 1 to
answering RED. The second has 0 mapped to RED, and 1 to
BLUE. They continue alternating.
So when Alice and Bob are asked about the same vertex, they both measure in the same basis,
and thus both answer the same color.
When Alice and Bob are asked about adjacent vertices, we get a similar situation to the CHSH
game, where the probability of Bob measuring his qubit to the same value as Alice’s depends on the angle
between the two vectors. So they answer incorrectly with probability sin²θ = sin²(1/n) ≈ 1/n².
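The classical bound can be brute-forced for a small case. A sketch for n = 3, assuming the natural question distribution (2n equally likely questions: a repeated vertex, where answers must match, or an adjacent pair, where they must differ):

```python
from itertools import product

# Odd cycle game for n = 3. A deterministic strategy is a 2-coloring of the
# cycle for each player (0 = BLUE, 1 = RED, say).
n = 3
questions = ([(v, v) for v in range(n)]            # same vertex: must agree
             + [(v, (v + 1) % n) for v in range(n)])  # adjacent: must differ

best = 0
for alice, bob in product(product((0, 1), repeat=n), repeat=2):
    # Win iff (answers equal) exactly when (questions equal).
    wins = sum((alice[qa] == bob[qb]) == (qa == qb) for qa, qb in questions)
    best = max(best, wins)

print(best, len(questions))  # 5 of the 6 questions: Pr[win] = 1 - 1/(2n)
```

No pair of colorings satisfies all 2n constraints (that would be a two-coloring of an odd cycle), so the best classical strategy wins 5/6 of the time for n = 3, matching the 1 − 1/(2n) bound.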
Next is the Magic Square Game. You can see that the required grid can’t actually be created by examining
the total sum of the grid. The first rule requires it to be even, the second requires it to be odd. That means
there’s no classical strategy where Alice and Bob always win.
Mermin (the author of our textbook) discovered a quantum strategy where Alice and Bob can
always win with only 2 ebits.
We won’t write out this strategy.
Lecture 15: Thurs March 9
Until recently, the Bell Inequality was taught exclusively for being historically important, without
having any practical applications. Sure, it establishes that you can’t get away with a local hidden variable
theory, but practically speaking, no one actually wants to play the CHSH game. In the last 10 years,
however, it’s found applications in…
Generating Guaranteed Random Numbers
Cryptographers want to base their random number generation on the most minimal set of
assumptions possible. They want systems that are guaranteed to be truly random, and to be sure that no
one had added predictability to the number generation through some sort of backdoor.
You might think that, logically, one can never prove that numbers are truly random, and that the
best one can say is that “I can’t find any patterns here.” After all, you can’t prove a negative, and if not
the NSA, who’s to say that God himself didn’t insert a pseudo-random function into the workings of quantum
mechanics?
Though presumably, if God wanted to read our emails he could do it some other way.
Interestingly, the Bell Inequality lets you certify that numbers are truly random under very weak
assumptions, which basically boil down to “No faster-than-light travel is possible.” Here’s how:
You have two boxes that share quantum entanglement, which presumably were designed by your worst
enemy. We’ll assume they can’t send signals back and forth (say you put them in Faraday Cages).
A referee sends them numbers.
They each return numbers.
If the returned numbers pass a test, we can say that they are
truly random.
The usual way to present the CHSH game is that Alice and Bob prove that they share
entanglement, and thus the universe is quantum mechanical. However, winning the game (better than 75%
of the time) also establishes that a and b have some randomness, that there was some amount of entropy
generated.
If a and b were deterministic functions—which is to say that they could be written as a(x, r) and
b(y, r), in terms of their input and shared randomness r—then you’d have a local hidden variable theory.
Since winning the game rules that out, if x and y were random, then there must exist some randomness in the outputs.
To put it another way: If Alice has a non-deterministic outcome and Alice’s state isn’t affected by
Bob’s, then some randomness must be in play.
What is the random result from Alice and Bob? What do you get out?
You can just take the stream of all b’s. The measure of entropy is just the Shannon Entropy:
if string x occurs with probability px, the total is Σx px log₂(1/px).
But each output b doesn’t represent an entire bit of randomness. You’d take these bits and run
them through a randomness extractor which would crunch them down from many sort-of-random bits to
fewer very random bits.
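The entropy formula above in code (a small sketch; the example distributions are made up for illustration):

```python
import math

def shannon_entropy(probs):
    # H = sum_x p_x * log2(1 / p_x), skipping zero-probability outcomes.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))  # 1.0 -- a fair bit
print(shannon_entropy([0.9, 0.1]))  # ~0.469 -- under half a bit per output
```

A stream of outputs like the biased one carries under half a bit of entropy each, which is why an extractor is needed to crunch many sort-of-random bits into fewer very random ones.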
David Zuckerman (here at UT) is an expert on this.
What else could you certify about the boxes playing the CHSH game?
It turns out: an enormous amount of things.
You can certify that Alice and Bob did a specific sequence of local quantum transformations (up
to a change in bases). So just by making them play the CHSH game, you can guarantee they do any
unitary transformation of your choice. Reichardt and Vazirani describe this as a “classical leash for a
quantum system.”
One of the main current ideas for how a classical skeptic could verify a quantum solution also
appears here. For prime factoring, we can easily verify the solution of a quantum algorithm, but this isn’t
the case for all problems. Sometimes the only way to verify the solution to a quantum algorithm is by
testing the solution on a quantum computer. With this application of a CHSH game, you can guarantee
that the quantum computer is behaving as expected.
The other, less crazy, path to the same destination came from Feynman, who gave a famous
lecture in 1982 concerned with the question, “how do you simulate quantum mechanics on a classical
computer?”
Chemists and Physicists had known for decades that this is hard, because the number of things
you need to keep track of increases exponentially with the number of particles. This is the case because,
as we know, an n-qubit state can be maximally entangled.
The state |Ψ⟩ = Σx∈{0,1}ⁿ αx |x⟩ must be described by the vector (α00…0, α00…1, …, α11…1) of length 2ⁿ.
Even to solve for the energy of the system, or for the state of some particular qubit, there’s no
shortcut for reasoning with this enormous vector. So he raised the question, “Why don’t we build
computers out of qubits to simulate qubits?” No one knew if this would be useful for classical tasks as
well.
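The exponential bookkeeping is easy to see directly: even writing down the uniform superposition on n qubits takes 2ⁿ amplitudes. A toy sketch:

```python
import math

def n_qubit_plus_state(n):
    # Full state vector for H|0> applied to each of n qubits: 2**n equal
    # amplitudes of 1/sqrt(2**n).
    amp = 1 / math.sqrt(2 ** n)
    return [amp] * (2 ** n)

for n in (2, 10, 20):
    print(n, len(n_qubit_plus_state(n)))  # 4, 1024, 1048576 amplitudes
```

At n = 20 we are already tracking about a million numbers for one of the simplest possible states; a general entangled state offers no shortcut.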
1. Can we solve anything on a quantum computer that can’t be solved on a classical computer?
No. Anything that can be done on a quantum computer can be done on a classical computer too
by storing the exponential number of variables that arise when working with qubits.
Quantum computing may “only” violate the Extended Church-Turing Thesis.
2. Why does each gate act only on a few qubits? Where is this assumption coming from?
It’s similar to how classical computers don’t have gates act on arbitrarily large quantities of bits,
and instead use small gates like AND, NOT to build up complex circuitry.
For the quantum case, you could imagine a giant unitary U which takes qubits encoding an instance of
the decision version of Travelling Salesman and cNOTs the answer onto another qubit.
But given such a definition, how would you go about building U?
Difficulty arises because there exists a staggeringly large amount of possible unitary matrices.
You can decompose any U, but it might result in an exponential number of small gates (just like
deconstructing an arbitrary Boolean string may require an exponential number of classical gates). We can
sort of circumvent this with the…
Accounting Argument
which says that we don’t need to consider all unitary matrices, just all of the diagonal ones where
the diagonal entries are either 1 or -1. And that’s great, because it means you don’t need to keep track of
2^(2^n) variables to keep track of U.
Shannon proved that the number of bits it takes to describe a circuit is roughly linear in the
number of gates. So almost every unitary matrix would take an exponential number of gates to build.
Interestingly enough, we don’t know any examples of such unitary matrices.
But we do know that they’re out there!
This tells us something important. In quantum computing, we’re not interested in all unitary
matrices, only the ones that can be encoded in small circuits requiring a polynomial number of gates.
Lecture 16: Tues March 21
Guest Lecture by Tom Wong
Last time we addressed a few conceptual points about quantum computing. Today we cover two more:
With 300 qubits, you’d need to store more amplitudes (2³⁰⁰ of them) than there are atoms in the observable universe.
So arbitrarily entangled states can’t be simulated well classically. This task requires a quantum computer.
In order to start talking about the construction of quantum computers through quantum gates, we need to
cover…
Universal Gate Sets
Classically, you’re probably familiar with all of the standard gates (AND, OR, NOT, NAND,
etc). A (classical) universal gate set is a grouping of such gates from which
you can construct all of the others.
For example, NAND by itself is universal. The diagram on the right
shows how you’d construct an OR gate out of NANDs, and the others can all
be worked out too.
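For example, here is the OR-from-NAND construction sketched as tiny Python functions on 0/1 ints:

```python
def nand(a, b):
    # NAND: 0 only when both inputs are 1.
    return 1 - (a & b)

# OR from NANDs: OR(a, b) = NAND(NOT a, NOT b), where NOT x = NAND(x, x).
def or_gate(a, b):
    return nand(nand(a, a), nand(b, b))

for a in (0, 1):
    for b in (0, 1):
        assert or_gate(a, b) == (a | b)
print("OR built from NAND alone")
```

The same trick (De Morgan plus NAND-as-NOT) yields AND and the rest of the standard gates.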
Similarly, the Toffoli Gate is universal. The Toffoli Gate, also known as the
controlled-controlled-NOT, is a three-bit gate where if A and B are 1, you flip C. To show that Toffoli is
universal, we construct a NAND gate out of one (in the diagram on the right). Since a Toffoli can construct
a gate from a universal gate set, it must, too, be universal.
It’s worth noting that Toffoli is reversible—given the outputs A, B, AB⨁C we can recover
inputs A, B, C—which means we can use it as a quantum gate too. Thus you can see that a quantum
computer can do anything a classical computer can do, because one can implement a classical universal
gate set.
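The NAND-from-Toffoli construction, sketched the same way (fixing the target bit to 1 is the standard trick; reversibility is checked by applying the gate twice):

```python
def toffoli(a, b, c):
    # Controlled-controlled-NOT: flip c iff a and b are both 1.
    return a, b, c ^ (a & b)

# NAND from Toffoli: with the target fixed to 1, the output is 1 XOR ab = NAND(a, b).
for a in (0, 1):
    for b in (0, 1):
        _, _, out = toffoli(a, b, 1)
        assert out == 1 - (a & b)

# Reversibility: applying Toffoli twice recovers the original inputs.
assert toffoli(*toffoli(1, 1, 0)) == (1, 1, 0)
print("Toffoli is reversible and simulates NAND")
```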
There are plenty of ways that a gate set can fail to be universal.
1. Your gate set doesn’t create interference/superposition
Ex: {cNOT} can only flip between |0⟩ and |1⟩. It can maintain superposition, but it can’t create
any.
2. Your gate set has superposition, but is missing entanglement
Ex: {Hadamard} can create superposition, but it should be obvious that the gate can’t create
entanglement since it only acts on one qubit.
3. Your gate set only has real gates
Ex: {cNOT, Hadamard} is getting closer, but neither gate can produce states with non-real amplitudes.
4. Your gate set is “only a stabilizer set”
We’re not going to go in depth with the concept of stabilizer sets. What’s important to know is
that a set like {cNOT, Hadamard, P = (1 0; 0 i)} fails because it’s efficiently simulated by a classical
computer (by the Gottesman-Knill Theorem). This property prevents it from getting speedups relative to
a classical computer.
Quantum Complexity
There are two major ways we look at the complexity of quantum algorithms.
The circuit complexity of a unitary is the size of the smallest circuit that implements it. We like
unitaries with polynomial circuit complexity. This can be difficult to find: it’s a gate-set-dependent
measure. At best we usually only get upper/lower bounds, so instead we tend to use…
Query complexity, the number of calls the algorithm makes to an oracle (or black box function).
The idea is that your oracle takes a bit and outputs a bit f : {0, 1} → {0, 1} . Classically you’d have a bit
go x → f (x) , but we replace this with quantum states |x⟩ → |f (x)⟩ .
Or rather, we want to replace it with quantum states, but we run into a bit of trouble because such
a transformation is not unitary. What we have to do instead is use an extra answer/target qubit.
So we give the black box two qubits: x,
which stays the same, and y, which receives the
answer.
|x, y ⟩ → |x, y ⊕ f (x)⟩
We start with |x,-⟩ = (1/√2)(|x,0⟩ - |x,1⟩)
Applying Uf gets us (1/√2)(|x, 0⨁f(x)⟩ - |x, 1⨁f(x)⟩)
Which equals (1/√2)(|x,0⟩ - |x,1⟩) if f(x) = 0
or (1/√2)(|x,1⟩ - |x,0⟩) if f(x) = 1
Which we can rewrite as (-1)^f(x) |x,-⟩
This lets us avoid dealing with the answer qubit and just use the “phase oracle”:
|x⟩ → (-1)^f(x) |x⟩
Classically, computing the parity f(0) ⨁ f(1) would take two queries, since we need to know both bits.
Quantumly, Deutsch’s Algorithm can do it in one.
Start with a qubit at |0⟩, Hadamard it, then do a query which applies a phase change to each part
depending on the value of the function.
|0⟩ → (1/√2)(|0⟩ + |1⟩) → (1/√2)((-1)^f(0) |0⟩ + (-1)^f(1) |1⟩)
We can substitute in the bits b0 = f(0) and b1 = f(1):
= (1/√2)((-1)^b0 |0⟩ + (-1)^b1 |1⟩)
Then drag out b0:
= (1/√2)(-1)^b0 (|0⟩ + (-1)^(b1-b0) |1⟩)
So now if we have b0 = b1 we get (-1)^b0 (1/√2)(|0⟩ + |1⟩)
and if b0 ≠ b1 we get (-1)^b0 (1/√2)(|0⟩ - |1⟩)
We can ignore the phase out front since global phase doesn’t affect measurement, and then
Hadamard again to get our quantum states back in the |0⟩, |1⟩ basis.
Now the b0 = b1 case becomes |0⟩
and the b0 ≠ b1 case becomes |1⟩
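The whole derivation can be run as a tiny classical simulation of Deutsch’s algorithm on one qubit (amplitude lists again; function and variable names are ours):

```python
import math

def deutsch(b0, b1):
    # One-qubit state as [amp0, amp1]; a minimal sketch of Deutsch's
    # algorithm using the phase oracle |x> -> (-1)^f(x) |x>, with f(0) = b0
    # and f(1) = b1.
    s = 1 / math.sqrt(2)
    state = [s, s]                                          # H|0>
    state = [(-1) ** b0 * state[0], (-1) ** b1 * state[1]]  # one oracle query
    # Final Hadamard maps back to the |0>, |1> basis.
    state = [s * (state[0] + state[1]), s * (state[0] - state[1])]
    return 0 if abs(state[0]) > 0.5 else 1  # measured bit = b0 XOR b1

for b0 in (0, 1):
    for b1 in (0, 1):
        assert deutsch(b0, b1) == b0 ^ b1
print("parity of f(0) and f(1) from a single query")
```

One query suffices because the phases of both branches are interrogated in superposition, then interfered by the final Hadamard.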
Lecture 17: Thurs March 23
People often want to know where the true power of quantum computing comes from.
● Is it the ability of amplitudes to interfere with one another?
● Is it that entanglement gives us 2n amplitudes to work with?
But that’s sort of like dropping your keys and asking “what made them fall?”
● Is it their proximity to the Earth?
● Is it the curvature in space-time?
You could come up with all sorts of answers that are perfectly valid.
It seems like our rules for universal gate sets are just avoiding certain bad cases. Do we have formal proof
that they work?
Yes. There’s a paper from the 90s by Yaoyun Shi on the subject, but it’s out of scope for this
class.
In designing quantum algorithms, we’re ultimately looking to minimize the number of gates
required to implement them. That problem turns out to be insanely hard for reasons that have nothing to
do with quantum mechanics.
“What’s the smallest circuit that solves Boolean satisfiability?”
is a similarly hard problem, for reasons related to P vs NP.
So people design quantum algorithms that center around query complexity. This abstracts away
part of the problem by saying:
“There’s some Boolean function f : {0,1}ⁿ → {0,1} and we’re trying to learn something about f.”
You might want to learn:
Is there some input x where f(x) = 0?
Is there some symmetry in the solution?
Etc.
More importantly, we want to know how many queries it takes to solve such a problem.
In this model we abstract out the cost of gates that don’t do queries.
To be precise, we map queries as
|x, a⟩ → |x, a ⊕ f (x)⟩ since the transformation must be unitary.
But it can also be thought of as
|x⟩ → (-1)^f(x) |x⟩
Before we jump into a few quantum algorithms, it’s worth asking,
“Why do we care about this model? You’re debating how you’d phrase your wishes if you found a genie.
Who cares?”
You can think of a black box as basically a huge input. Querying f(i) means looking up the ith
number in the string.
That allows us to break a problem down to “if you want
to do an unordered (or ordered) search, how many queries do
you need?”
This is much more reasonable to compute than the alternative.
Tom showed that a reversible circuit can always simulate a non-reversible circuit, since Toffoli
can simulate NAND. However, in reversible computing erasing is expensive.
Imagine a classical circuit (without loss of generality, let’s say it’s a cluster of
NAND gates).
Suppose you have a circuit to compute f. How do we get a circuit that maps Σx αx |x, 0⟩ → Σx αx |x, f(x)⟩
without all the garbage? In the 70s, Bennett invented a trick for this called…
Uncomputing
Let’s say I have some circuit that maps
C |x, 0, … , 0⟩ = |x, gar(x) , f (x)⟩.
First, run the circuit, C.
Then cNOT x. (make a copy of it in a safe place)
Then run the inverse circuit, C⁻¹.
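The compute/copy/uncompute pattern can be sketched classically with XOR-style reversible updates, using AND as the f being computed and one garbage bit (both made up for illustration):

```python
# Toy reversible computation on (x0, x1, garbage, answer): a minimal sketch
# of Bennett's uncomputing trick.

def forward(x0, x1, g, f):
    # "C": compute f = x0 AND x1, leaving garbage g = x0 XOR x1 behind.
    return x0, x1, g ^ x0 ^ x1, f ^ (x0 & x1)

def inverse(x0, x1, g, f):
    # This C is its own inverse: XOR-ing the same values twice cancels out.
    return forward(x0, x1, g, f)

for x0 in (0, 1):
    for x1 in (0, 1):
        a, b, g, f = forward(x0, x1, 0, 0)
        safe = f                       # cNOT the answer into a safe register
        a, b, g, f = inverse(a, b, g, f)
        assert (g, f) == (0, 0)        # garbage uncomputed back to zero
        assert safe == (x0 & x1)       # but the copied answer survives
print("compute, copy, uncompute")
```

The garbage registers return to 0, so in the quantum version they no longer spoil interference, while the copied answer remains.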
With that out of the way, we’re ready to talk about some quantum algorithms.
Deutsch’s Algorithm
computes the parity f(0) ⊕ f(1) of two bits with one query. (more generally, the parity of n bits requires only n/2 quantum queries, versus n classically)
It basically involves making the state (1/√2)( (−1)^f(0) |0⟩ + (−1)^f(1) |1⟩ ) and measuring it in the |+⟩,|−⟩ basis.
It uses the phase kickback trick to measure phase change.
The basic idea of the phase kickback trick is that we have a quantum oracle that does
∑_x α_x |x, y⟩ → ∑_x α_x |x, y ⊕ f(x)⟩ but we’d rather get a final state in the form ∑_x α_x (−1)^f(x) |x⟩. To accomplish
this we put |−⟩ = (|0⟩ − |1⟩)/√2 in the second register. Whenever f(x) = 1, U_f interchanges |0⟩ and |1⟩ in that register, turning |−⟩ into −|−⟩, so U_f |x⟩|−⟩ = (−1)^f(x) |x⟩|−⟩.
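A hedged NumPy simulation of the whole algorithm (the oracle construction and final Hadamard follow the standard presentation; the encoding of basis states as indices is mine):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def deutsch(f):
    """True iff f(0) != f(1), using exactly one oracle application."""
    # Oracle U_f |x, y> = |x, y XOR f(x)> as a 4x4 permutation matrix
    U = np.zeros((4, 4))
    for x in (0, 1):
        for y in (0, 1):
            U[2 * x + (y ^ f(x)), 2 * x + y] = 1
    state = np.kron(H, H) @ np.array([0, 1, 0, 0])  # |0>|1> -> |+>|->
    state = U @ state                               # phase kickback
    state = np.kron(H, np.eye(2)) @ state           # Hadamard the first qubit
    # The first qubit reads 1 with certainty iff f is balanced
    return abs(state[2]) ** 2 + abs(state[3]) ** 2 > 0.5

assert deutsch(lambda x: 0) == False   # constant
assert deutsch(lambda x: x) == True    # balanced
```

Measuring in the |+⟩,|−⟩ basis is implemented here, as usual, by a Hadamard followed by a computational-basis measurement.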
The next problem is Deutsch-Jozsa: given a function f on n bits, promised to be either constant or balanced, the problem is to decide which.
Classically, you could need to look at 2^(n−1) + 1 values of the function in the worst case: if all those outputs match, then the function is
constant. You can improve this through random sampling. On average, you’d need about 5 or 6 queries to
get an answer with a sufficiently small probability of error.
We’ll see how a quantum algorithm can solve this perfectly with only one query.
Truth is, this isn’t a hard classical problem, and so we won’t get that big of
a speed up. This is why, initially, people didn’t care about quantum
computing. They figured all advantages would be in the same vein.
Here’s the quantum circuit for it: Hadamard every qubit, query the oracle once, then Hadamard every qubit again. The final Hadamards map
H^⊗n |x⟩ = (1/√2^n) ∑_{y ∈ {0,1}^n} (−1)^{x·y} |y⟩
Note: x·y is their inner product mod 2. You pick up a −1 for each position where x_i = y_i = 1.
A shortcut to simplify the analysis is to ask, “What is the amplitude of |00…0⟩ at the end?” It’s ±1 if f is constant, and 0 if f is balanced.
A related problem is Bernstein-Vazirani: given f(x) = s·x (mod 2) for a secret string s, find s. Classically this takes n queries; the Bernstein-Vazirani Algorithm, however, can solve it quantumly with only one query.
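A sketch of Bernstein-Vazirani in NumPy (assuming the standard setup just described; the final Hadamards concentrate all amplitude on |s⟩):

```python
import numpy as np

def bernstein_vazirani(s, n):
    """Recover the secret n-bit string s from one query to f(x) = s.x mod 2."""
    N = 2 ** n
    state = np.ones(N) / np.sqrt(N)          # H^n |00...0>: uniform superposition
    for x in range(N):                       # one phase query: |x> -> (-1)^(s.x)|x>
        if bin(s & x).count("1") % 2:
            state[x] *= -1
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    Hn = np.array([[1.0]])
    for _ in range(n):                       # build H^n
        Hn = np.kron(Hn, H)
    out = Hn @ state                         # all amplitude lands on |s>
    return int(np.argmax(np.abs(out)))

assert bernstein_vazirani(0b101, 3) == 0b101
```

The algebra behind the last step: the amplitude of |y⟩ is (1/2^n) ∑_x (−1)^{s·x}(−1)^{x·y}, which is 1 when y = s and 0 otherwise.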
Lecture 28: Tues May 2
Today we’ll see a beautiful formalism for quantum error correction that has many roles in quantum
computation.
Last time we discussed the Quantum Fault Tolerance Theorem, which says that even if all
qubits in a system have some rate of noise, by:
● doing a bunch of gates in parallel
● applying measurement
● discarding bad qubits and replacing them
● And doing this all hierarchically (i.e. having layers of error atop one another)
we’ll still be able to do quantum computation, and the cost will be asymptotically reasonable
T → O(T log^c T)
This theorem set the research agenda for a lot of experimentalists, who began focusing on
attempts to minimize error. Once we can decrease error past a certain threshold, we’ll be able to push it
arbitrarily small by repeatedly applying our error correction techniques.
The best gauge of how research in quantum computing is going is the reliability of qubits.
Journalists often ask about things like the number of qubits, or “can you factor 15 into 3 and 5?” but more
important is crossing the threshold which would allow us to get arbitrarily small error.
We’re not there yet, but lots of progress is being made in two fronts:
1. Making qubits more reliable
Initially, ε (each qubit’s probability of failing at each time step) was close to 1, and the quantum
state would barely hold at all. The decoherence rates of IBM’s Quantum Experience, for example,
wouldn’t have been possible ten years ago.
John Martinis’s group, working with Google, has been able to get ε down to 1/1000 with a small number of qubits.
That’s already past the threshold, but adding more qubits creates more error, so the trick is to find a way
to add qubits while keeping error down.
We’re likely to soon see quantum error correction used to keep a logical qubit alive for longer
than the physical qubits below it. People are close to figuring this out, but it’s not quite there yet.
These came up when we discussed universal quantum gates: the stabilizer states are the states reachable from |00…0⟩ using only the gates cNOT, Hadamard, and Phase. It’s not obvious that this definition
wouldn’t cover every quantum state. The Bell Pair is such a state, as are the states arising in Superdense
Coding or Quantum Teleportation.
If you play around with these gates, you’ll notice that they tend to reach only a discrete set of states,
and never anything between them. You’ll also notice that for an arbitrary number of qubits n, when these
states form superpositions over a set S of basis states, it follows that |S| = 2^k for some k, and S is always an
affine subspace of F_2^n.
The Pauli Matrices satisfy several beautiful identities.
X² = Y² = Z² = I
XY = iZ    YX = –iZ
YZ = iX    ZY = –iX
ZX = iY    XZ = –iY
If you’ve seen the quaternions, you may notice that they satisfy the same kinds of relations.
This is also not a coincidence! Nothing is a coincidence in math!
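These identities are quick to verify numerically (a small sketch, not from the lecture):

```python
import numpy as np

# The three Pauli matrices, plus the identity
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

for M in (X, Y, Z):
    assert np.allclose(M @ M, I)                       # X^2 = Y^2 = Z^2 = I
assert np.allclose(X @ Y, 1j * Z) and np.allclose(Y @ X, -1j * Z)
assert np.allclose(Y @ Z, 1j * X) and np.allclose(Z @ Y, -1j * X)
assert np.allclose(Z @ X, 1j * Y) and np.allclose(X @ Z, -1j * Y)
```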
For a given n-qubit pure state |Ψ⟩, we define |Ψ⟩’s stabilizer group as:
The group of all tensor products of Pauli Matrices that stabilize |Ψ⟩.
We know this is a group, since both being a (signed) tensor product of Paulis and stabilizing |Ψ⟩ are closed under multiplication.
Additionally, this group is abelian.
For the simplest example, the stabilizer group of the single-qubit state |0⟩ is { I, Z }.
For a slightly more interesting example, what’s the stabilizer group of a Bell Pair?
We know XX is in it because
XX (|00⟩ + |11⟩)/√2 = (X|0⟩⊗X|0⟩ + X|1⟩⊗X|1⟩)/√2 = (|11⟩ + |00⟩)/√2 = (|00⟩ + |11⟩)/√2.
The same argument can be made for –YY.
We can get the last element by doing component-wise multiplication: XX · (–YY) = –(iZ)(iZ) = ZZ
So the stabilizer group of (|00⟩ + |11⟩)/√2 is { II, XX, –YY, ZZ }.
You can likewise find the stabilizer group of (|00⟩ − |11⟩)/√2 to be { II, –XX, YY, ZZ }.
The claim is that an n-qubit stabilizer state’s stabilizer group has exactly 2^n elements (ignoring signs). We won’t see a proof of this, only an intuition for why it’s true.
So the 1-qubit stabilizer states are those with 2 elements in their stabilizer group.
The 2-qubit stabilizer states are those with 4 elements in their stabilizer group.
And so forth.
This is a completely different characterization of stabilizer states, a structural one. It tells us what
invariant is being preserved without any mention of quantum mechanics.
The one time that Professor Aaronson (being a theorist) ever wrote code that people
actually went out and used, was a project in grad school for a Computer Architecture course.
He made a fast simulator for stabilizer sets called CHP, letting a normal computer handle
thousands of qubits (limited only by their RAM). He was only trying to pass the class,
but incidentally published a paper with Gottesman for a better algorithm to implement this.
Truth be told, it had nothing to do with Computer Architecture.
He’s not sure why the professor accepted it.
So for a series of qubits starting at |00…0⟩, how do we write down generators for its stabilizer group?
We know it contains II…I but we won’t put that in the generator set. It’s implied.
We’ll also need:
{ ZIII…I,
  IZII…I,
  IIZI…I,
  ⋮
  IIII…Z }
But this is starting to get messy.
For Gottesman-Knill, it’s useful to have another representation of qubits.
Tableau Representation
which keeps track of two matrices of 1’s and 0’s.
Instead of representing each Pauli by a single symbol, each is specified by two bits, one in the X matrix and one in the Z matrix: I = (0|0), X = (1|0), Z = (0|1), Y = (1|1).
So a tableau with all 0’s in X and the identity in Z represents { ZIII, IZII, IIZI, IIIZ }.
We’re going to provide the rules for Tableau Representation without any formal proof that they
work, but you can go through each rule and reason through why it makes sense.
We’re also going to cheat a little. Keeping track of the +’s and –’s is tricky and not particularly
illuminating, so we’ll just ignore them. If we only want to know if measuring a qubit will give a definite
answer or not (without figuring out if it’s a |0⟩ or |1⟩), we can ignore the signs.
● To apply cNOT from the ith qubit to the jth:
○ Take the bitwise XOR of the ith column of X into the jth column of X
That seems reasonable enough, but... remember from the homework how a
cNOT from i→j in the Hadamard basis is equivalent to a cNOT from j→i?
That means we also have to…
○ Take the bitwise XOR of the jth column of Z into the ith column of Z
These rules are enough to establish that measuring the ith qubit in the |0⟩,|1⟩ basis has a determinate
outcome iff the ith column of the X matrix is all 0’s.
Let’s test this out, keeping track of the tableau for the following circuit: Hadamard the first qubit, cNOT from the first qubit to the second, then a phase gate on the first. (Hadamard on the ith qubit simply swaps the ith columns of X and Z.)
We start with
(00|10)
(00|01)
After the Hadamard on the first qubit:
(10|00)
(00|01)
After the cNOT (in X: XOR 1st column into 2nd; in Z: XOR 2nd column into 1st):
(11|00)
(00|11)
This is the stabilizer generator set { XX, ZZ } for the Bell Pair.
After the phase gate (XOR 1st column of X into 1st column of Z):
(11|10)
(00|11)
A phase gate signifies the introduction of i’s. This corresponds to (|00⟩ + i|11⟩)/√2.
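The update rules take only a few lines of Python to implement (a sign-free sketch; the Hadamard rule, swapping a qubit’s X and Z columns, is the standard one even though it wasn’t spelled out above):

```python
# A minimal sign-free tableau simulator.
# Each stabilizer generator is one row: n X-bits followed by n Z-bits.

def cnot(tab, i, j):              # cNOT from qubit i to qubit j
    n = len(tab[0]) // 2
    for row in tab:
        row[j] ^= row[i]          # X: XOR column i into column j
        row[n + i] ^= row[n + j]  # Z: XOR column j into column i

def phase(tab, i):                # Phase gate on qubit i
    n = len(tab[0]) // 2
    for row in tab:
        row[n + i] ^= row[i]      # XOR X column i into Z column i

def hadamard(tab, i):             # Hadamard on qubit i
    n = len(tab[0]) // 2
    for row in tab:
        row[i], row[n + i] = row[n + i], row[i]   # swap X and Z columns

# |00>: generators ZI = (00|10) and IZ = (00|01)
tab = [[0, 0, 1, 0],
       [0, 0, 0, 1]]
hadamard(tab, 0)
cnot(tab, 0, 1)
print(tab)    # [[1, 1, 0, 0], [0, 0, 1, 1]] -- XX and ZZ: the Bell Pair
phase(tab, 0)
print(tab)    # [[1, 1, 1, 0], [0, 0, 1, 1]]
```

Each gate touches only two columns, which is why a classical computer can push such a simulation to thousands of qubits.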
Most quantum error correction codes are done with stabilizer circuits, making them easy to
compute. As a result, the real importance of the stabilizer formalism is letting us keep track of them in a
more elegant way.
For example, with Shor’s 9-qubit code, we were dealing with the logical |0⟩ state ((|000⟩ + |111⟩)/√2)^⊗3. Since phase-flipping any two qubits within a grouping (a pair of Z’s) leaves the state unchanged, as does bit-flipping two entire groupings (six X’s), we can write this state’s generator set as:
{ Z Z I I I I I I I,
  I Z Z I I I I I I,
  I I I Z Z I I I I,
  I I I I Z Z I I I,
  I I I I I I Z Z I,
  I I I I I I I Z Z,
  X X X X X X I I I,
  I I I X X X X X X,
  ± X X X X X X X X X }
The last line can have either a + or –, encoding |0⟩ or |1⟩ respectively
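As a sanity check (not from the lecture), we can verify numerically that each generator, with + signs throughout, stabilizes the logical |0⟩:

```python
import numpy as np

I = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])
P = {"I": I, "X": X, "Z": Z}

def op(word):                         # tensor product of Paulis, e.g. "ZZIIIIIII"
    M = np.array([[1.0]])
    for c in word:
        M = np.kron(M, P[c])
    return M

ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)      # (|000> + |111>)/sqrt(2)
shor0 = np.kron(np.kron(ghz, ghz), ghz)   # logical |0> of Shor's code

gens = ["ZZIIIIIII", "IZZIIIIII", "IIIZZIIII", "IIIIZZIII",
        "IIIIIIZZI", "IIIIIIIZZ", "XXXXXXIII", "IIIXXXXXX",
        "XXXXXXXXX"]
for g in gens:
    assert np.allclose(op(g) @ shor0, shor0)
```

With the logical |1⟩ state, ((|000⟩ − |111⟩)/√2)^⊗3, the same check would pass only with a − sign on the last generator.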
Lecture 29: Thurs May 4
For a given quantum error correction code, applying a gate usually entails:
decoding the qubit => applying the gate => re-encoding the qubit
That’s why in practice people prefer quantum error correction codes with transversality.
We say that the Hadamard gate is transversal for a qubit if you can Hadamard the logical qubit
by applying the Hadamard gate to each physical qubit separately.
You can work out that Hadamard is transversal for Shor’s 9-qubit code.
There are quantum error correction codes where cNOT, H, and P are all transversal.
Unfortunately, there’s a theorem that says that no quantum error correction code can make a universal set of gates transversal.
That means non-stabilizer gates like Toffoli or Rπ/8 must be implemented through sequences of gates that are much more
expensive.
So in practical quantum computing, stabilizer operations cost almost nothing; it’s the non-stabilizer gates that dominate the cost. And since circuits of stabilizer gates alone can be simulated classically (that’s the Gottesman-Knill Theorem), any quantum speedup requires non-stabilizer gates too.
There are various tricks to produce them, like Magic State Distillation.
The basic idea is that applying stabilizer gates to certain non-stabilizer
states called ‘magic states’, such as cos(π/8)|0⟩ + sin(π/8)|1⟩, lets you
escape Gottesman-Knill and reach a universal quantum computer.
We’ve seen a high-level overview of quantum computing, so it behooves us to take a lecture to discuss
practical implementations.
Professor Aaronson has visited quantum computing labs all over the world.
They all have one strict rule for theorists: “don’t touch anything!”
But there are also important computational speedups to keep in mind; the top five applications (in order) are:
1. Quantum Simulation
Take a Hamiltonian of a real system, trotterize it, then run.
This would let you compute the effect of any chemical reaction without physically running it,
which would be amazing for chemists. We tend not to talk much about this, since it’s pretty
straightforward, but it’s easily the best application of quantum computing.
Even imperfect implementations of quantum computing are enough to see advantages for this.
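The Trotterization step can be illustrated with a toy example (my own, not from the lecture): split a Hamiltonian into non-commuting pieces, here X and Z, and approximate the full evolution by alternating many small steps of each.

```python
import numpy as np

X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])

def evolve(H, t):
    """e^{-iHt} for a Hermitian H, via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

t, n = 1.0, 1000
exact = evolve(X + Z, t)                    # the true evolution
step = evolve(X, t / n) @ evolve(Z, t / n)  # one small Trotter step
trotter = np.linalg.matrix_power(step, n)   # alternate n times

# First-order Trotter error shrinks like O(t^2 / n)
print(np.linalg.norm(trotter - exact))      # small, on the order of 1e-3
```

The point for quantum simulation is that each small step involves only a simple, locally-implementable piece of the Hamiltonian, while the composition approximates the full dynamics.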
2. Code Breaking
The sexiest application of quantum computing.
This would be very important for intelligence agencies, nefarious actors, and intelligence agencies
who are themselves nefarious actors.
It would completely change how e-commerce is run, requiring everybody to move to private-key
crypto, lattice cryptography, etc. However, advantages here would require a fully fault-tolerant quantum
computer.
3. Grover
As we’ve seen, Grover’s Algorithm can only provide a polynomial speedup. However, it would apply across a
broad range of applications.
It would essentially just “give a little more juice to Moore’s Law.”
4. Adiabatic Optimization
Might produce speedups better than Grover, but we’ll only know once we try.
5. Machine Learning
Very hot in recent years.
It’s a good match for quantum computing because many problems in the field (classifying data,
creating recommendation systems, etc) boil down to performing linear algebra on large sets of data. Even
better, you typically only need an approximate answer.
Over the past ten years, many papers have been published claiming that quantum computing can
give up to exponential speedups on such problems. This started in 2007 with…
Journalists often ask Professor Aaronson, “When will we all have personal quantum
computers and qPhones in our pocket?” It’s hard to imagine that’ll ever happen though,
because most things we do on our PCs can be done quickly on classical computers. At most
we’ll likely see cloud quantum computing, like the IBM Quantum Experience, where a
central location deals with the issues of maintaining quantum states while we reap the benefits.
Maybe this’ll seem myopic in a hundred years, like a guy from the 70s saying, “I only see
a market for five computers in the world, tops.” But you could argue that such people
were simply ahead of their time. We are moving to a world where most computation is done
on the cloud in a few centralized locations. Though this might also be shortsighted because
our current list of applications of quantum computing may be woefully incomplete.
● Long-Lived Qubits
It’s self-evident that you need some system that can maintain quantum states over long periods of time.
As we’ve said before, “the first requirement of quantum computing is the ability to perform I.”
● Universal Gates
You must be able to apply some universal set of gates.
Implicit here is the requirement that qubits can interact with one another.
● Initialization
You must be able to get qubits to |00…0⟩.
● Measurement
You’re familiar with measurement.
Different architectures have achieved different combinations of these. There are architectures
where initialization is hard or measurement is hard.
The first approach we’ll cover is Trapped Ions. Such a lab will have ions, a magnet, and a classical computer that lets them see images of the
ions. This method isn’t totally reliable, and it takes work to keep the ions penned in.
It’s a bit like herding cattle.
The ions have a spin state that can be clockwise, counterclockwise, or a superposition of
both, so we treat the spin as the quantum state. If you bring two ions close to each other, the
Coulomb interaction between them can be used to create something resembling a cNOT gate.
You can manipulate qubits by using a laser to pick them up and move them around. This may
sound like it would be a tough balancing act, moving nuclei via laser while keeping them all floating
magnetically…
Yes. Yes it is.
But it has been demonstrated with up to 10-20 qubits. After that it becomes hard to interact with a
single qubit at a time.
In the last few years, several such ventures that were originally academic became
start-ups—which tend to give out much less information about how they’re doing.
That being said, this is currently the most popular approach.
Google bought out almost the whole Santa Barbara lab (Martinis’s group), and have publicly
announced that they expect to have a 50-qubit system in a year. IBM also has a superconducting group
with similar claims. In addition there are several startups working with superconducting qubits, like
Rigetti—made up of several people who left IBM.
The next approach is Photonics. The basic idea is that you generate photons and send them
through fiber optic cables. When two photons come together, you use a
Beam Splitter, which acts as a 2×2 unitary on the
qubit, taking the state into or out of superposition.
2-qubit gates are harder to do. There’s a set of operations that are easy to implement in such a system;
what’s not obvious is whether those operations are sufficient to produce a universal quantum computer.
The KLM Theorem shows that (with adaptive measurements) they are, opening the possibility of a new way to build a quantum computer, where
qubits are photons traveling at the speed of light.
Photons can maintain superposition indefinitely by flying in a vacuum. The trouble is that they’re
flying at the speed of light, which makes it hard for them to interact with one another. This may require
stalling photons, but that may introduce decoherence.
The last approach we’ll cover is fairly esoteric: Non-abelian Anyons.
In three dimensions there are two types of fundamental particles: Bosons and Fermions. But in a two-dimensional
system, you can have particles that behave as neither Bosons nor Fermions.
Lots of physicists won Nobels in the 80s for this stuff.
If you can make such “quasiparticles” in a two-dimensional surface, just moving them around
would be sufficient to create a universal quantum computer. This setup may be naturally resistant to
decoherence. The caveat is that we’re only now starting to understand how to create the simplest
quasiparticles.
Microsoft is the current leader in this approach, and has hired several experts in this field
recently.
As a path forward…
Professor Aaronson thinks about what to expect from Quantum Supremacy in terms of three steps.
Lots of people dislike this term for obvious reasons, but it has stuck for now.