The Mathematics of Entanglement - Summer 2013

27 May, 2013

Quantum Entropy
Lecturer: Aram Harrow

Lecture 3

Shannon Entropy

In this part, we want to understand quantum information in a quantitative way. One of the important concepts is entropy. But let us first look at classical entropy.

Given a probability distribution $p \in \mathbb{R}_+^d$ with $\sum_i p_i = 1$, the Shannon entropy of $p$ is
$$H(p) = -\sum_i p_i \log p_i$$
(log is always to base two, as we are talking about bits as units; we use the convention $\lim_{x \to 0} x \log x = 0$).
Entropy quantifies uncertainty. We have maximal certainty for a deterministic distribution, e.g. $p = (1, 0, \ldots, 0)$, for which $H(p) = 0$. The distribution with maximal uncertainty is $p = (\tfrac{1}{d}, \ldots, \tfrac{1}{d})$, for which $H(p) = \log d$.
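As a quick numerical illustration (a minimal Python sketch; the function name shannon_entropy and the example distributions are my own choices, not from the lecture), the two extreme cases can be checked directly:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits, using the convention 0 * log 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                       # drop zero entries
    return float(-np.sum(nz * np.log2(nz)))

d = 4
print(shannon_entropy([1, 0, 0, 0]))    # deterministic distribution: 0.0
print(shannon_entropy(np.ones(d) / d))  # uniform distribution: log2(d) = 2.0
```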
In the following we want to give Shannon entropy an operational meaning with the help of the problem of data compression. For this, imagine you have a binary alphabet ($d = 2$) and you sample $n$ times independently and identically from the distribution $p = (\theta, 1 - \theta)$; we write $X_1, \ldots, X_n$ i.i.d. $\in \{0, 1\}$ with $\mathrm{Prob}[X_i = 0] = \theta$ and $\mathrm{Prob}[X_i = 1] = 1 - \theta$.

Typically, the number of 0s in the string is $\theta n \pm O(\sqrt{n})$ and the number of 1s is $(1 - \theta) n \pm O(\sqrt{n})$. In order to see why this is the case, consider the sum $S = X_1 + \cdots + X_n$ (this equals the number of 1s in the string). The expectation of this random variable is $\mathbb{E}[S] = \mathbb{E}[X_1] + \cdots + \mathbb{E}[X_n] = n(1 - \theta)$, where we used the linearity of the expectation value. Furthermore, the variance of $S$ is $\mathrm{Var}[S] = \mathrm{Var}[X_1] + \cdots + \mathrm{Var}[X_n] = n \, \mathrm{Var}[X_1] = n\theta(1 - \theta) \leq \tfrac{n}{4}$. Here we used the independence of the random variables in the first equation and $\mathrm{Var}[X_1] = \mathbb{E}[X_1^2] - \mathbb{E}[X_1]^2 = (1 - \theta) - (1 - \theta)^2 = \theta(1 - \theta)$ in the third. This implies that the standard deviation of $S$ is at most $\sqrt{n}/2$.
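A small simulation makes this concentration visible (a sketch; the value $\theta = 0.3$ and the sample sizes are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, trials = 0.3, 10_000, 5_000   # theta = Prob[X_i = 0]; arbitrary illustration values

# S = X_1 + ... + X_n counts the 1s in the string; it is Binomial(n, 1 - theta) distributed.
S = rng.binomial(n, 1 - theta, size=trials)

print("E[S]  =", n * (1 - theta), " empirical:", S.mean())
print("SD[S] =", (n * theta * (1 - theta)) ** 0.5,
      " (at most sqrt(n)/2 =", n ** 0.5 / 2, ") empirical:", S.std())
```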
What does this have to do with compression? The total number of possible $n$-bit strings is $|\{0,1\}^n| = 2^n$. The number of strings with $\theta n$ 0s is
$$\binom{n}{\theta n} = \frac{n!}{(\theta n)! \, ((1-\theta)n)!} \approx \frac{(n/e)^n}{(\theta n / e)^{\theta n} \, ((1-\theta)n/e)^{(1-\theta)n}} = (1/\theta)^{\theta n} \, (1/(1-\theta))^{(1-\theta)n},$$
where we used Stirling's approximation. We can rewrite this as $\exp\!\big(n [\theta \log 1/\theta + (1-\theta) \log 1/(1-\theta)]\big) = \exp(n H(p))$, where $\exp$ denotes base-two exponentiation since $\log$ is to base two. Hence, we only need to store around $\exp(n H(p))$ possible strings, which we can do in a memory of $nH(p)$ bits. (Note we ignored the fluctuations. If we took them into account, we would only need an additional $O(\sqrt{n})$ bits.) This analysis easily generalises to arbitrary alphabets (not only binary).
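One can check numerically how fast $\log_2 \binom{n}{\theta n}$ approaches $nH(p)$ (a sketch; the two agree only up to the $O(\log n)$ corrections that Stirling's approximation ignores):

```python
import math

def binary_entropy(theta):
    """H(p) in bits for p = (theta, 1 - theta)."""
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)

theta = 0.2
for n in (10, 100, 1000):
    k = round(theta * n)                    # number of 0s in a typical string
    exact = math.log2(math.comb(n, k))      # log2 of the number of such strings
    print(n, round(exact, 2), round(n * binary_entropy(theta), 2))
```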

Typical sets

I now want to give you a different way of looking at this problem, a way that is both more rigorous and will more easily generalise to the quantum case. This we will do with the help of typical sets.
Again let $X_1, \ldots, X_n$ be i.i.d. distributed with distribution $p$. The probability of a string is then given by
$$\mathrm{Prob}[X_1 \ldots X_n = x_1, \ldots, x_n] = p(x_1) p(x_2) \cdots p(x_n) = p^{\otimes n}(x^n),$$
where we used the notation (random variables are in capital letters and values in small letters)
$$p^{\otimes n} = p \otimes p \otimes \cdots \otimes p \quad (n \text{ times}), \qquad x^n = (x_1, \ldots, x_n) \in \Sigma^n,$$
where $\Sigma = \{1, \ldots, d\}$ is the alphabet and $\Sigma^n$ denotes strings of length $n$ over that alphabet.
Note that
$$\log p^{\otimes n}(x^n) = \sum_{i=1}^{n} \log p(x_i) \approx n \, \mathbb{E}[\log p(X_i)] \pm O\Big(\sqrt{n \, \mathrm{Var}[\log p(X_i)]}\Big) = -n H(p) \pm O(\sqrt{n}),$$
where we used
$$\mathbb{E}[\log p(X_i)] = \sum_{x_i} p(x_i) \log p(x_i) = -H(p).$$

Let us now define the typical set as the set of strings whose probability is close to this typical value:
$$T_{p,n,\epsilon} = \{x^n : |\log p^{\otimes n}(x^n) + n H(p)| \leq n \epsilon\}.$$
Then, for all $\epsilon > 0$,
$$\lim_{n \to \infty} p^{\otimes n}(T_{p,n,\epsilon}) = 1.$$
Our compression algorithm simply keeps all the strings in the typical set and throws away all others. Hence, all we need to know is the size of the typical set. This is easy. Note that
$$x^n \in T_{p,n,\epsilon} \implies \exp(-n H(p) - n\epsilon) \leq p^{\otimes n}(x^n) \leq \exp(-n H(p) + n\epsilon).$$
Note that
$$1 \geq p^{\otimes n}(T_{p,n,\epsilon}) \geq |T_{p,n,\epsilon}| \cdot \min_{x^n} p^{\otimes n}(x^n),$$
where the minimum is over all strings in the typical set. This implies
$$1 \geq |T_{p,n,\epsilon}| \exp(-n H(p) - n\epsilon),$$
which is equivalent to
$$\log |T_{p,n,\epsilon}| \leq n H(p) + n\epsilon.$$
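For small $n$ the typical set can be enumerated by brute force, which verifies both claims: it carries most of the probability mass, and its size respects the bound above (a sketch with arbitrarily chosen parameters; exponentials are base two, matching the base-two logarithms of the notes):

```python
import itertools
import math

theta, n, eps = 0.2, 16, 0.25               # small n so all 2^n strings can be enumerated
p = {0: theta, 1: 1 - theta}
H = -sum(q * math.log2(q) for q in p.values())

mass, size = 0.0, 0
for xn in itertools.product((0, 1), repeat=n):
    prob = math.prod(p[x] for x in xn)      # p^{otimes n}(x^n) = p(x_1) * ... * p(x_n)
    if abs(math.log2(prob) + n * H) <= n * eps:   # x^n lies in the typical set
        mass += prob
        size += 1

print("p^n(T) =", mass)                     # tends to 1 as n grows
print("|T| =", size, "<= 2^(n(H + eps)) =", 2 ** (n * (H + eps)))
```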
Exercise: Show this is optimal. More precisely, show that any compression to $nR$ bits with $R < H(p)$ must have an error that does not go to zero. Hint: Use Chebyshev's inequality: for a random variable $Z$, $\mathrm{Prob}[|Z - \mathbb{E}[Z]| \geq k \, \mathrm{SD}[Z]] \leq 1/k^2$. Possible simplifications: 1) pretend all strings are typical; 2) use exactly $nR$ bits.


Quantum compression

Probability distributions are replaced by density matrices $\rho^{\otimes n} = \rho \otimes \cdots \otimes \rho$ ($n$ times). If $\rho$ is the state of a qubit, then this state lives on a $2^n$-dimensional space. The goal of quantum data compression is to represent this state on a smaller dimensional subspace. Just as before we measured the size in bits, we now measure the size in terms of the number of qubits needed to represent that subspace, i.e. the log of its dimension. It turns out to be possible (and optimal) to do this in $n S(\rho) \pm o(n)$ qubits, where $S$ is the von Neumann entropy
$$S(\rho) = -\sum_i \lambda_i \log \lambda_i = H(\lambda) = -\mathrm{tr}\, \rho \log \rho,$$
where the $\lambda_i$ are the eigenvalues of the density operator.
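As a last sketch (again illustrative code, not from the lecture), the von Neumann entropy can be computed directly from the eigenvalues of a density matrix; it is 0 for a pure qubit state and 1 (i.e. $\log d$ for $d = 2$) for the maximally mixed qubit:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -tr(rho log2 rho), computed from the eigenvalues of rho."""
    eigvals = np.linalg.eigvalsh(rho)       # rho is Hermitian
    eigvals = eigvals[eigvals > 1e-12]      # convention 0 * log 0 = 0
    return float(-np.sum(eigvals * np.log2(eigvals)))

pure = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|, a pure qubit state
mixed = np.eye(2, dtype=complex) / 2               # maximally mixed qubit, I/2
print(von_neumann_entropy(pure))    # 0.0
print(von_neumann_entropy(mixed))   # 1.0
```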

