
Chapter 8

Differential Entropy

Peng-Hua Wang

Graduate Inst. of Comm. Engineering

National Taipei University


Chapter Outline

Chap. 8 Differential Entropy


8.1 Definitions
8.2 AEP for Continuous Random Variables
8.3 Relation of Differential Entropy to Discrete Entropy
8.4 Joint and Conditional Differential Entropy
8.5 Relative Entropy and Mutual Information
8.6 Properties of Differential Entropy and Related Quantities



8.1 Definitions



Definitions

Definition 1 (Differential entropy) The differential entropy h(X) of a
continuous random variable X with pdf f(x) is defined as

h(X) = -\int_S f(x) \log f(x) \, dx,

where S is the support region of the random variable.

Example. If X \sim U(0, a), then

h(X) = -\int_0^a \frac{1}{a} \log \frac{1}{a} \, dx = \log a.

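Numerical check (an addition, not part of the original slides; NumPy and SciPy assumed). The sketch below integrates -f(x) log f(x) over (0, a) and compares the result with log a, working in nats:

import numpy as np
from scipy.integrate import quad

a = 3.0

def f(x):
    return 1.0 / a          # pdf of U(0, a) on its support (0, a)

# h(X) = -integral of f(x) log f(x) over (0, a), evaluated numerically
h, _ = quad(lambda x: -f(x) * np.log(f(x)), 0.0, a)
print(h, np.log(a))         # both approximately 1.0986 nats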


Differential Entropy of Gaussian

Example. If X \sim N(0, \sigma^2) with pdf \phi(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-x^2/(2\sigma^2)}, then

h_a(X) = -\int \phi(x) \log_a \phi(x) \, dx
       = -\int \phi(x) \left( \log_a \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{x^2}{2\sigma^2} \log_a e \right) dx
       = \frac{1}{2} \log_a (2\pi\sigma^2) + \frac{\log_a e}{2\sigma^2} E_\phi[X^2]
       = \frac{1}{2} \log_a (2\pi e \sigma^2).

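Added sanity check (not in the slides): SciPy's norm.entropy() returns the differential entropy in nats, which should agree with (1/2) log(2*pi*e*sigma^2):

import numpy as np
from scipy.stats import norm

sigma = 2.0
closed_form = 0.5 * np.log(2 * np.pi * np.e * sigma**2)   # (1/2) log(2*pi*e*sigma^2)
print(closed_form, norm(scale=sigma).entropy())           # both approximately 2.112 nats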


Differential Entropy of Gaussian

Remark. If a random variable with pdf f(x) has zero mean and
variance \sigma^2, then

-\int f(x) \log_a \phi(x) \, dx
       = -\int f(x) \left( \log_a \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{x^2}{2\sigma^2} \log_a e \right) dx
       = \frac{1}{2} \log_a (2\pi\sigma^2) + \frac{\log_a e}{2\sigma^2} E_f[X^2]
       = \frac{1}{2} \log_a (2\pi e \sigma^2).



Gaussian has Maximal Differential Entropy

Suppose that a random variable X with pdf f(x) has zero mean and
variance \sigma^2. What is its maximal differential entropy?
Let \phi(x) be the pdf of N(0, \sigma^2). Then

h(X) + \int f(x) \log \phi(x) \, dx = \int f(x) \log \frac{\phi(x)}{f(x)} \, dx
       \le \log \int f(x) \frac{\phi(x)}{f(x)} \, dx   (Jensen's inequality; the logarithm is concave)
       = \log \int \phi(x) \, dx = 0.

That is,

h(X) \le -\int f(x) \log \phi(x) \, dx = \frac{1}{2} \log(2\pi e \sigma^2),

where the last equality follows from the preceding remark, and equality holds iff f(x) = \phi(x).

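To illustrate the theorem numerically (an added sketch, not from the slides), compare the Gaussian with a Laplace density of the same variance; both entropies are available in closed form through SciPy, in nats:

import numpy as np
from scipy.stats import laplace, norm

sigma = 1.0
b = sigma / np.sqrt(2)                   # Laplace(scale=b) has variance 2*b^2 = sigma^2

h_gauss = norm(scale=sigma).entropy()    # 0.5*log(2*pi*e*sigma^2), approximately 1.419 nats
h_laplace = laplace(scale=b).entropy()   # 1 + log(2*b), approximately 1.347 nats
print(h_laplace, "<=", h_gauss)          # the Gaussian attains the maximum for this variance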


8.2 AEP for Continuous Random Variables



AEP

Theorem 1 (AEP) Let X_1, X_2, \ldots, X_n be a sequence of i.i.d. random
variables with common pdf f(x). Then

-\frac{1}{n} \log f(X_1, X_2, \ldots, X_n) \to E[-\log f(X)] = h(X)

in probability.

Definition 2 (Typical set) For \epsilon > 0, the typical set A_\epsilon^{(n)} with respect
to f(x) is defined as

A_\epsilon^{(n)} = \left\{ (x_1, x_2, \ldots, x_n) \in S^n :
       \left| -\frac{1}{n} \log f(x_1, x_2, \ldots, x_n) - h(X) \right| \le \epsilon \right\}.

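A quick empirical illustration of the AEP (added, not in the slides): for i.i.d. Gaussian samples, -(1/n) log f(X_1, ..., X_n) is a sample average of i.i.d. terms and concentrates around h(X).

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
sigma, n = 1.0, 2000
h = norm(scale=sigma).entropy()          # h(X) = 0.5*log(2*pi*e*sigma^2)

for _ in range(5):
    x = rng.normal(0.0, sigma, size=n)
    # -(1/n) log f(x_1, ..., x_n) equals the sample mean of -log f(x_i) for i.i.d. samples
    print(-np.mean(norm(scale=sigma).logpdf(x)), "vs h(X) =", h)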


AEP

Definition 3 (Volume) The volume Vol(A) of a set A \subset \mathbb{R}^n is defined as

Vol(A) = \int_A dx_1 \, dx_2 \cdots dx_n.

Theorem 2 (Properties of the typical set)
1. Pr(A_\epsilon^{(n)}) > 1 - \epsilon for n sufficiently large.
2. Vol(A_\epsilon^{(n)}) \le 2^{n(h(X)+\epsilon)} for all n.
3. Vol(A_\epsilon^{(n)}) \ge (1 - \epsilon) 2^{n(h(X)-\epsilon)} for n sufficiently large.



8.4 Joint and Conditional Differential Entropy



Definitions

Definition 4 (Joint differential entropy) The differential entropy of jointly
distributed random variables X_1, X_2, \ldots, X_n is defined as

h(X_1, X_2, \ldots, X_n) = -\int f(x^n) \log f(x^n) \, dx^n,

where f(x^n) = f(x_1, x_2, \ldots, x_n) is the joint pdf.

Definition 5 (Conditional differential entropy) The conditional
differential entropy of jointly distributed random variables X, Y with joint
pdf f(x, y) is defined as, if it exists,

h(X|Y) = -\int f(x, y) \log f(x|y) \, dx \, dy = h(X, Y) - h(Y).

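Added numerical check of h(X|Y) = h(X, Y) - h(Y) for a bivariate Gaussian (a sketch, not from the slides). The comparison value 0.5*log(2*pi*e*sigma^2*(1 - rho^2)) uses the fact that the conditional variance of X given Y is sigma^2*(1 - rho^2):

import numpy as np
from scipy.stats import multivariate_normal, norm

sigma, rho = 1.0, 0.6
K = np.array([[sigma**2, rho * sigma**2],
              [rho * sigma**2, sigma**2]])

h_xy = multivariate_normal(mean=[0.0, 0.0], cov=K).entropy()   # h(X, Y) in nats
h_y = norm(scale=sigma).entropy()                              # h(Y)
h_x_given_y = h_xy - h_y                                       # h(X|Y) = h(X, Y) - h(Y)
print(h_x_given_y, 0.5 * np.log(2 * np.pi * np.e * sigma**2 * (1 - rho**2)))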


Multivariate Normal Distribution

Theorem 3 (Entropy of a multivariate normal) Let X_1, X_2, \ldots, X_n
have a multivariate normal distribution with mean vector \mu and
covariance matrix K. Then

h(X_1, X_2, \ldots, X_n) = \frac{1}{2} \log \left( (2\pi e)^n |K| \right).

Proof. The joint pdf of a multivariate normal distribution is

\phi(x) = \frac{1}{(2\pi)^{n/2} |K|^{1/2}} e^{-\frac{1}{2} (x-\mu)^t K^{-1} (x-\mu)}.



Multivariate Normal Distribution

Therefore,

h(X_1, X_2, \ldots, X_n) = -\int \phi(x) \log_a \phi(x) \, dx
       = \int \phi(x) \left[ \frac{1}{2} \log_a \left( (2\pi)^n |K| \right)
              + \frac{1}{2} (x-\mu)^t K^{-1} (x-\mu) \log_a e \right] dx
       = \frac{1}{2} \log_a \left( (2\pi)^n |K| \right)
              + \frac{1}{2} (\log_a e) \, \underbrace{E\left[ (X-\mu)^t K^{-1} (X-\mu) \right]}_{=\,n}
       = \frac{1}{2} \log_a \left( (2\pi)^n |K| \right) + \frac{n}{2} \log_a e
       = \frac{1}{2} \log_a \left( (2\pi e)^n |K| \right).

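Added verification of Theorem 3 (not in the slides): SciPy's multivariate_normal.entropy() works in nats and can be compared with (1/2) log((2*pi*e)^n |K|):

import numpy as np
from scipy.stats import multivariate_normal

K = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.5]])   # a positive-definite covariance matrix
n = K.shape[0]

closed_form = 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))
print(closed_form, multivariate_normal(mean=np.zeros(n), cov=K).entropy())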


Multivariate Normal Distribution

Lemma. Let Y = (Y_1, Y_2, \ldots, Y_n)^t be a random vector. If K = E[YY^t],
then E[Y^t K^{-1} Y] = n.

Proof. Denote the columns of K and the rows of K^{-1} by

K = E[YY^t] = \begin{bmatrix} k_1 & k_2 & \cdots & k_n \end{bmatrix},
\qquad
K^{-1} = \begin{bmatrix} a_1^t \\ a_2^t \\ \vdots \\ a_n^t \end{bmatrix}.

We have k_i = E[Y_i Y] and, since K^{-1} K = I, a_j^t k_i = \delta_{ij}.



Multivariate Normal Distribution

Now,

Y^t K^{-1} Y = Y^t \begin{bmatrix} a_1^t \\ a_2^t \\ \vdots \\ a_n^t \end{bmatrix} Y
       = (Y_1, Y_2, \ldots, Y_n) \begin{bmatrix} a_1^t Y \\ a_2^t Y \\ \vdots \\ a_n^t Y \end{bmatrix}
       = Y_1 a_1^t Y + Y_2 a_2^t Y + \cdots + Y_n a_n^t Y,

and

E[Y^t K^{-1} Y] = a_1^t E[Y_1 Y] + a_2^t E[Y_2 Y] + \cdots + a_n^t E[Y_n Y]
       = a_1^t k_1 + a_2^t k_2 + \cdots + a_n^t k_n = n.

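Monte Carlo check of the identity E[Y^t K^{-1} Y] = n (an added sketch, NumPy assumed); equivalently, the expectation equals trace(K^{-1} E[YY^t]) = trace(I) = n:

import numpy as np

rng = np.random.default_rng(1)
K = np.array([[2.0, 0.5],
              [0.5, 1.0]])
n = K.shape[0]

Y = rng.multivariate_normal(np.zeros(n), K, size=200_000)   # zero-mean samples with E[YY^t] = K
quad_forms = np.einsum('ij,jk,ik->i', Y, np.linalg.inv(K), Y)
print(quad_forms.mean(), "vs", n)                           # sample average of Y^t K^{-1} Y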


8.5 Relative Entropy and Mutual Information



Definitions

Definition 6 (Relative entropy) The relative entropy (or
Kullback-Leibler distance) D(f||g) between two densities f(x) and
g(x) is defined as

D(f||g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx.

Definition 7 (Mutual information) The mutual information I(X;Y)
between two random variables with joint density f(x, y) is defined as

I(X;Y) = \int f(x, y) \log \frac{f(x, y)}{f(x) f(y)} \, dx \, dy.

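As an added illustration, D(f||g) for two univariate Gaussian densities can be evaluated directly from Definition 6 by numerical integration; the value is nonnegative, consistent with Theorem 4 in Section 8.6:

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

f = norm(loc=0.0, scale=1.0)    # f = N(0, 1)
g = norm(loc=1.0, scale=2.0)    # g = N(1, 4)

# D(f||g) = integral of f(x) log(f(x)/g(x)) dx
kl, _ = quad(lambda x: f.pdf(x) * (f.logpdf(x) - g.logpdf(x)), -np.inf, np.inf)
print(kl)                       # strictly positive since f != g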


Example

Let (X, Y) \sim N(0, K) where

K = \begin{bmatrix} \sigma^2 & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 \end{bmatrix}.

Then h(X) = h(Y) = \frac{1}{2} \log(2\pi e \sigma^2) and

h(X, Y) = \frac{1}{2} \log \left( (2\pi e)^2 |K| \right) = \frac{1}{2} \log \left( (2\pi e)^2 \sigma^4 (1 - \rho^2) \right).

Therefore,

I(X;Y) = h(X) + h(Y) - h(X, Y) = -\frac{1}{2} \log(1 - \rho^2).

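The mutual information formula can be checked numerically from the entropies (an added sketch; entropies in nats via SciPy):

import numpy as np
from scipy.stats import multivariate_normal, norm

sigma, rho = 1.5, 0.8
K = np.array([[sigma**2, rho * sigma**2],
              [rho * sigma**2, sigma**2]])

h_x = h_y = norm(scale=sigma).entropy()
h_xy = multivariate_normal(mean=[0.0, 0.0], cov=K).entropy()
mi = h_x + h_y - h_xy                    # I(X;Y) = h(X) + h(Y) - h(X, Y)
print(mi, -0.5 * np.log(1 - rho**2))     # both approximately 0.511 nats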


8.6 Properties of Differential Entropy and Related Quantities



Properties

Theorem 4 (Relative entropy) D(f||g) \ge 0, with equality iff f = g
almost everywhere.

Corollary 1
1. I(X;Y) \ge 0, with equality iff X and Y are independent.
2. h(X|Y) \le h(X), with equality iff X and Y are independent.



Properties

Theorem 5 (Chain rule for differential entropy)

h(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} h(X_i | X_1, X_2, \ldots, X_{i-1}).

Corollary 2

h(X_1, X_2, \ldots, X_n) \le \sum_{i=1}^{n} h(X_i).

Corollary 3 (Hadamard's inequality) If K is the covariance matrix of a
multivariate normal distribution, then

|K| \le \prod_{i=1}^{n} K_{ii}.

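Hadamard's inequality is easy to check numerically on a random positive-definite matrix (an added sketch, NumPy assumed):

import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
K = A @ A.T + 1e-3 * np.eye(4)    # random symmetric positive-definite "covariance" matrix

print(np.linalg.det(K), "<=", np.prod(np.diag(K)))   # |K| <= product of diagonal entries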


Properties

Theorem 6
1. h(X + c) = h(X).
2. h(aX) = h(X) + log |a|.
3. h(AX) = h(X) + log |det(A)|.

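For a Gaussian, property 2 can be checked directly from the closed-form entropy (an added check, not from the slides):

import numpy as np
from scipy.stats import norm

sigma, a = 1.0, 3.0
h_x = norm(scale=sigma).entropy()              # h(X) for X ~ N(0, sigma^2)
h_ax = norm(scale=abs(a) * sigma).entropy()    # aX ~ N(0, a^2 sigma^2)
print(h_ax, h_x + np.log(abs(a)))              # h(aX) = h(X) + log|a|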


Gaussian has Maximal Entropy

Theorem 7 Let the random vector X \in \mathbb{R}^n have zero mean and
covariance K = E[XX^t]. Then h(X) \le \frac{1}{2} \log \left( (2\pi e)^n |K| \right), with
equality iff X \sim N(0, K).

Proof. Let g(x) be any density satisfying \int x_i x_j g(x) \, dx = K_{ij}, and let
\phi(x) be the density of N(0, K). Then

0 \le D(g||\phi) = \int g \log (g/\phi) = -h(g) - \int g \log \phi
       = -h(g) - \int \phi \log \phi = -h(g) + h(\phi),

where \int g \log \phi = \int \phi \log \phi because \log \phi(x) is a quadratic form in x
and g and \phi have the same second moments. That is, h(g) \le h(\phi), and
equality holds iff g = \phi.

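An added multivariate illustration of Theorem 7: a vector of independent, unit-variance Laplace coordinates has covariance I_n but strictly smaller differential entropy than N(0, I_n):

import numpy as np
from scipy.stats import laplace, multivariate_normal

n = 3
b = 1.0 / np.sqrt(2)                             # Laplace(scale=b) has variance 2*b^2 = 1

h_laplace_vec = n * laplace(scale=b).entropy()   # entropy adds over independent coordinates
h_gauss_vec = multivariate_normal(mean=np.zeros(n), cov=np.eye(n)).entropy()
print(h_laplace_vec, "<=", h_gauss_vec)          # the Gaussian maximizes h(X) for covariance I_n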
