Lecture 1
Prof. Jeff Bilmes, EE514a / Fall 2023 / Info. Theory I, Sep 27th, 2023
Logistics Review

Religious Accommodations
Prerequisites
Homework
Exams
Grading
Review
Information Theory

Inspirational Quote

"The moral life of man forms part of the subject-matter of the artist, but the morality of art consists in the perfect use of an imperfect medium." – Oscar Wilde
Communication Theory
Source Possibilities

Voice
Words
Pictures
Music, art
Galileo space probe orbiting Jupiter
Human cells about to reproduce
Human parents about to reproduce
Sensory input of a biological organism
Or any signal at all (any binary data).
Channel Possibilities

Telephone line
High-frequency radio link
Space communication link
Storage (disk, tape, internet, TCP/IP, social media): transmission through time rather than space, where degradation could be due to decay
Biological organism (send a message from brain to foot, or from ear to brain, or a genetic message from parent to child)
Receiver Possibilities

Noise

Encoder

Decoder
DNA (the chromosomes within each cell) encodes all the information about each body.
Source = two parents
Encoder = your imagination
Channel = biological combination: meiosis (creation of haploid gametes), mutation, and so on
Noise = random mutation
Decoder = further mitosis, creating the new child
[Figure: an axis of rates R, starting at 0, with the channel capacity C marked.]
What is information?

OED says:
1. facts provided or learned about something or someone.
2. what is conveyed or represented by a particular arrangement or sequence of things.
What is information?

Wikipedia says:
Information in its most restricted technical sense is a message (utterance or expression) or collection of messages in an ordered sequence that consists of symbols, or it is the meaning that can be interpreted from such a message or collection of messages. Information can be recorded or transmitted. It can be recorded as signs, or conveyed as signals. Information is any kind of event that affects the state of a dynamic system. The concept has numerous other meanings in different contexts. Moreover, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, representation, and especially entropy.
Information

Oranges are 99¢/pound.
It is cloudy in Seattle today.
You are taking an information theory course right now.
It is a balmy tropical climate in Seattle. As in other places in the Pacific North-West, warm, sunny days are the norm.
Richard Dawkins will win the U.S. Presidential Election in November, 2024.
Poetry: "I heard an echo in a hollow place. No sound of blowing wind or drifting sand, some ancient voice was this, a captive trace of gone-by speech, of argument, demand," – Tiel Aisha Ansari
A Painting (– Dali)
Music
Communication Theory

Original model of communication
[Figure: the original communication model, with noise entering the channel.]
On Source Coding
What is entropy?

Events E_k each occur with probability p_k; p_k indicates the likelihood of the event E_k happening.
The Shannon/Hartley information of event E_k is I(E_k) = log(1/p_k), indicating:
1. A measure of surprise in finding out E_k: p_k = 1 ⇒ no surprise in finding out that E_k occurred, while p_k → 0 ⇒ infinite surprise in finding out E_k.
2. A measure of information gained in finding out E_k (information gained equals surprise): p_k = 1 ⇒ no information is gained, while p_k → 0 ⇒ infinite information is gained.
3. A measure of the "uncertainty" of E_k, but really unexpectedness. Unexpectedness is the thing that determines interest, or information (see next slide).
4. I(E_k) = −log p(E_k) is the self-information of that event, or that message. Why is it called self-information? We'll soon see.
All logs are base 2 (by default), so log ≡ log_2 unless otherwise stated; ln denotes the natural log.
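As a quick numerical illustration (not from the slides), here is a minimal Python sketch of self-information under the base-2 convention above; the function name and example probabilities are our own.

import math

def self_information(p: float, base: float = 2.0) -> float:
    # Self-information I(E) = log(1/p) of an event with probability p.
    if p <= 0.0:
        return math.inf  # vanishing probability: infinite surprise
    return -math.log(p, base)

print(self_information(1.0))       # 0.0 bits: a certain event carries no information
print(self_information(0.5))       # 1.0 bit
print(self_information(1 / 1024))  # 10.0 bits: rare events carry a lot of information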
Uses of entropy in IT

Entropy uses:
Measure information in the communication theory model.
Surprise of an event {X = x} is measured as log(1/p(x)), and there are reasons for using log. Entropy is the average surprise.
The lower bound on the minimum number of guesses (on average) needed to guess the value of a random variable.
The minimum number of bits to compress a source.
The optimal coding "length" of a random source.
The minimum description length (MDL) of a random source that can be achieved without probability of error.
With g(x) = log(1/p(x)) the surprise of outcome x, the average surprise (the entropy) is
$$\sum_x p(x)\, g(x) = \sum_x p(x) \log \frac{1}{p(x)} \qquad (1.1)$$
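To make (1.1) concrete, a minimal Python sketch (helper name and test distributions are our own) that computes entropy as the average surprise of a finite distribution:

import math

def entropy(p, base=2.0):
    # H = sum_x p(x) log(1/p(x)); terms with p(x) = 0 contribute nothing.
    return sum(px * math.log(1.0 / px, base) for px in p if px > 0.0)

print(entropy([0.5, 0.5]))    # 1.0 bit
print(entropy([0.25] * 4))    # 2.0 bits (uniform over four outcomes)
print(entropy([1.0, 0.0]))    # 0.0 bits (no uncertainty)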
Entropy of Distributions

[Figure: three example distributions p(x) over x, labeled Low Entropy, High Entropy, and In Between.]
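A small numerical companion to the figure (the distributions below are our own choices): a peaked distribution has low entropy, a uniform one has the highest, and a moderately spread one sits in between.

import math

def H(p):
    # Shannon entropy in bits of a finite distribution p.
    return sum(-px * math.log2(px) for px in p if px > 0)

peaked  = [0.97, 0.01, 0.01, 0.01]   # mass concentrated on one outcome
uniform = [0.25, 0.25, 0.25, 0.25]   # mass spread evenly
mixed   = [0.50, 0.25, 0.15, 0.10]

print(round(H(peaked), 2))   # 0.24 bits (low entropy)
print(round(H(uniform), 2))  # 2.0 bits (high entropy: log2(4))
print(round(H(mixed), 2))    # 1.74 bits (in between)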
Binary Entropy

Binary alphabet, X ∈ {0, 1} say.
p(X = 1) = p = 1 − p(X = 0).
H(X) = −p log p − (1 − p) log(1 − p) = H(p).
As a function of p, we get:
[Figure: the binary entropy function H(p) plotted against p ∈ [0, 1].]
Note: greatest uncertainty (value 1) when p = 0.5, and least uncertainty (value 0) when p = 0 or p = 1.
Note also: H(p) is concave in p.
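A minimal sketch of the binary entropy function (the function name is our own), reproducing the shape just described:

import math

def binary_entropy(p: float) -> float:
    # H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, round(binary_entropy(p), 4))
# Maximum (1 bit) at p = 0.5; symmetric, and zero at p = 0 and p = 1.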
Joint Entropy

$$H(X, Y) = -\sum_x \sum_y p(x, y) \log p(x, y) = E \log \frac{1}{p(X, Y)} \qquad (1.3)$$

$$H(X_1, \dots, X_N) = \sum_{x_1, x_2, \dots, x_N} p(x_1, \dots, x_N) \log \frac{1}{p(x_1, \dots, x_N)} \qquad (1.4)$$
$$= E \log \frac{1}{p(X_1, \dots, X_N)} \qquad (1.5)$$
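A short Python check of (1.3) on a small joint distribution (the pmf values here are hypothetical):

import math

def joint_entropy(pxy):
    # H(X, Y) = -sum_{x,y} p(x, y) log2 p(x, y), pmf given as {(x, y): prob}.
    return sum(-p * math.log2(p) for p in pxy.values() if p > 0)

pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
print(joint_entropy(pxy))  # 1.75 bits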
$$H(X) \triangleq E \log \frac{1}{p(X)} = \sum_x p(x) \log \frac{1}{p(x)} = -\sum_x p(x) \log p(x)$$

Discrete, since X is a discrete random variable (i.e., x ∈ 𝒳 where 𝒳 is countable).
Note that lim_{α→0} α log α = 0, hence outcomes with p(x) = 0 leave the entropy uninfluenced.
Also, since p(x) ≥ 0 and log(1/p(x)) ≥ 0, discrete entropy is always non-negative: H(X) ≥ 0.
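A tiny check of these two properties (zero-probability outcomes contribute nothing, and entropy is never negative), using a helper of our own:

import math

def H(p):
    # -sum p(x) log2 p(x); skipping p(x) = 0 matches lim_{a -> 0} a log a = 0.
    return sum(-px * math.log2(px) for px in p if px > 0)

print(H([0.5, 0.5, 0.0]))  # 1.0 bit: the zero-probability outcome has no effect
print(H([1.0]))            # 0.0: non-negative, and exactly 0 for a deterministic X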
Conditional Entropy

For two random variables X, Y related via p(x, y), knowing the event X = x can change the entropy of Y.
Event-conditional entropy H(Y | X = x):
$$H(Y \mid X = x) = E \log \frac{1}{p(Y \mid X = x)} \qquad (1.6)$$
$$= -\sum_y p(y \mid x) \log p(y \mid x) \qquad (1.7)$$
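A minimal sketch of the event-conditional entropy H(Y | X = x) computed from a joint pmf (hypothetical values, helper name our own):

import math

def cond_entropy_given_x(pxy, x):
    # H(Y | X = x) = -sum_y p(y|x) log2 p(y|x), joint pmf given as {(x, y): prob}.
    px = sum(p for (xx, _), p in pxy.items() if xx == x)
    return sum(-(p / px) * math.log2(p / px)
               for (xx, _), p in pxy.items() if xx == x and p > 0)

pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
print(cond_entropy_given_x(pxy, 0))  # p(y | X=0) = (2/3, 1/3): about 0.9183 bits
print(cond_entropy_given_x(pxy, 1))  # p(y | X=1) = (1/2, 1/2): exactly 1.0 bit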
Corollary 1.5.3
If X ⊥⊥ Y, then H(X, Y) = H(X) + H(Y).

Proof.
Immediate from the chain rule H(X, Y) = H(X) + H(Y | X), since independence gives H(Y | X) = H(Y).
Chain rule of entropy:
$$H(X_1, X_2, \dots, X_N) = \sum_{i=1}^{N} H(X_i \mid X_1, X_2, \dots, X_{i-1}) \qquad (1.13)$$

Proof.
Use the chain rule of conditional probability, i.e., that
$$p(x_1, x_2, \dots, x_N) = \prod_{i=1}^{N} p(x_i \mid x_1, \dots, x_{i-1}) \qquad (1.14)$$
then
$$-\log p(x_1, x_2, \dots, x_N) = -\sum_{i=1}^{N} \log p(x_i \mid x_1, x_2, \dots, x_{i-1}) \qquad (1.15)$$
and taking expectations of both sides under p(x_1, ..., x_N) gives (1.13).
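For intuition, a small numerical check of the two-variable case H(X, Y) = H(X) + H(Y | X), using a hypothetical joint pmf:

import math

pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}  # hypothetical joint pmf

def H(dist):
    # Entropy in bits of a pmf given as {outcome: prob}.
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

# Marginal p(x).
px = {}
for (x, y), p in pxy.items():
    px[x] = px.get(x, 0.0) + p

# H(Y | X) = -sum_{x,y} p(x, y) log2 p(y | x).
HY_given_X = sum(-p * math.log2(p / px[x]) for (x, y), p in pxy.items() if p > 0)

print(H(pxy))              # H(X, Y) = 1.75 bits
print(H(px) + HY_given_X)  # H(X) + H(Y | X) = 1.75 bits, matching the chain rule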
[Figure: ln(x) plotted together with curves for λ = 0.5, 1, 2, 3, 5, 10, 20.]
Theorem: H(X) ≤ log n, where n = |𝒳| is the alphabet size.

Proof.
Approach: show that H(X) − log n ≤ 0.
$$H(X) - \log n = -\sum_x p(x) \log p(x) - \sum_x p(x) \log n \qquad (1.19)$$
$$= \log_2 e \sum_x p(x) \ln \frac{1}{p(x)\, n} \qquad (1.20)$$
$$\leq \log_2 e \sum_x p(x) \left( \frac{1}{p(x)\, n} - 1 \right) \qquad (1.21)$$
$$= \log_2 e \left[ \sum_x \frac{1}{n} - \sum_x p(x) \right] = 0 \qquad (1.22)$$
where (1.21) uses ln α ≤ α − 1.
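A quick empirical check of the bound H(X) ≤ log2 n (the random test distributions are of our own making):

import math, random

def H(p):
    # Entropy in bits of a finite distribution p.
    return sum(-px * math.log2(px) for px in p if px > 0)

n = 8
print(H([1.0 / n] * n), math.log2(n))  # 3.0 3.0: the uniform distribution attains the bound

for _ in range(5):
    w = [random.random() for _ in range(n)]
    p = [wi / sum(w) for wi in w]
    assert H(p) <= math.log2(n) + 1e-12  # never exceeds log2(n)
    print(round(H(p), 4), "<=", math.log2(n))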
Permutations
Summary so far

$$H(X) = E\, I(X) = -\sum_x p(x) \log p(x) \qquad (1.23)$$
$$H(X, Y) = -\sum_{x, y} p(x, y) \log p(x, y) \qquad (1.24)$$
$$H(Y \mid X) = -\sum_{x, y} p(x, y) \log p(y \mid x) \qquad (1.25)$$

and