Lecture 2
Entropy
Mutual Information
It is a function of $p_1, \dots, p_M$:
$H(p_1, \dots, p_M)$
$f(M) < f(M')$ for $M < M'$, where $f(M)$ denotes the uncertainty of $M$ equally likely outcomes.
Picking one person at random from the classroom should involve less uncertainty than picking a person at random from the entire US.
$f(ML) = f(M) + f(L)$
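For instance, $f(M) = \log_2 M$ satisfies both requirements; checking additivity with $M = 4$ and $L = 2$:
$$f(4 \cdot 2) = \log_2 8 = 3 = 2 + 1 = \log_2 4 + \log_2 2 = f(4) + f(2)$$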
Grouping rule (Problem 2.27 in the text): if we divide the outcomes into groups, first choose a group at random, and then pick an element within the chosen group, the total uncertainty is unchanged.
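In symbols (one standard statement of the rule, merging the first two outcomes into a single group):
$$H(p_1, \dots, p_M) = H(p_1 + p_2, p_3, \dots, p_M) + (p_1 + p_2)\, H\!\left(\frac{p_1}{p_1 + p_2}, \frac{p_2}{p_1 + p_2}\right)$$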
The only function that satisfies these requirements is the entropy function
$$H(p_1, \dots, p_M) = -\sum_{i=1}^{M} p_i \log_2 p_i$$
with the convention $0 \log 0 = 0$.
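A minimal sketch of this definition in Python (the helper name `entropy` is our own, not from the lecture):

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H(p1, ..., pM), in bits for base 2.

    Terms with p = 0 are skipped, implementing the convention 0 log 0 = 0.
    """
    return sum(-p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit: a fair coin flip
print(entropy([1/8] * 8))    # 3.0 bits: a fair 8-sided die
print(entropy([1.0, 0.0]))   # 0.0 bits: no uncertainty at all
```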
Properties of the binary entropy $H(p) = H(p, 1 - p)$:
– Concave in $p$
– Maximized at $p = 1/2$
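A quick numerical check of both properties (a sketch; `H2` is our own name for the binary entropy):

```python
import math

def H2(p):
    """Binary entropy H(p) in bits, with the convention 0 log 0 = 0."""
    return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"H({p:.2f}) = {H2(p):.4f}")
# The values rise from 0 to 1.0 at p = 0.5 and fall symmetrically,
# consistent with concavity and the maximum at p = 1/2.
```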
Example: how to ask questions?
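One way to make the question-asking example concrete (a sketch under our own framing, not taken from the lecture): each yes/no question corresponds to one bit of a Huffman codeword, so a Huffman code over the outcomes yields a questioning strategy whose expected number of questions $L$ satisfies $H(X) \le L < H(X) + 1$.

```python
import heapq
import math

def question_counts(probs):
    """Number of yes/no questions per outcome under a Huffman-style
    strategy: repeatedly merge the two least likely groups; every merge
    adds one distinguishing question to each outcome inside it."""
    counts = [0] * len(probs)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]  # (prob, tiebreak, members)
    heapq.heapify(heap)
    uid = len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        for i in a + b:
            counts[i] += 1          # one more question splits group a from b
        heapq.heappush(heap, (p1 + p2, uid, a + b))
        uid += 1
    return counts

probs = [0.5, 0.25, 0.125, 0.125]
counts = question_counts(probs)
expected = sum(p * c for p, c in zip(probs, counts))
H = sum(-p * math.log2(p) for p in probs)
print(counts, expected, H)  # [1, 2, 3, 3] 1.75 1.75
```

For this dyadic distribution the expected number of questions equals the entropy exactly; in general it can exceed $H(X)$ by up to one question.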
Proof:
[Figure: Venn diagram relating $H(X)$, $H(Y)$, $H(X|Y)$, $H(Y|X)$, and $I(X;Y)$: the overlap of the $H(X)$ and $H(Y)$ regions is $I(X;Y)$, and the non-overlapping parts are $H(X|Y)$ and $H(Y|X)$]
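The diagram encodes $I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)$, which can be checked numerically; a sketch using a made-up $2 \times 2$ joint distribution (the numbers are illustrative only):

```python
import math

def H(probs):
    """Entropy in bits, skipping zero terms (0 log 0 = 0)."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y): rows are x, columns are y.
pxy = [[0.25, 0.25],
       [0.40, 0.10]]

px = [sum(row) for row in pxy]             # marginal of X
py = [sum(col) for col in zip(*pxy)]       # marginal of Y
H_XY = H([p for row in pxy for p in row])  # joint entropy H(X, Y)

# Chain rule gives H(X|Y) = H(X,Y) - H(Y) and H(Y|X) = H(X,Y) - H(X).
print(H(px) + H(py) - H_XY)                # I(X;Y)
print(H(px) - (H_XY - H(py)))              # H(X) - H(X|Y), same value
print(H(py) - (H_XY - H(px)))              # H(Y) - H(Y|X), same value
```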
[Table: numerical example (entries unreadable in source)]
Entropy and Mutual Information
[Figure: sequences $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$, annotated with $H(X_1, \dots, X_n)$, $H(Y_1, \dots, Y_m \mid X_1, \dots, X_n)$, and $H(Y_1, \dots, Y_m)$]