
Information Theory

Presented By:
Er. Amit Mahajan
What is information theory

 Information theory is needed to enable a communication system to
carry information (signals) from a sender to a receiver over a
communication channel.
 It deals with the mathematical modeling and analysis of
communication systems.
 Its major task is to answer questions about signal compression and
achievable transfer rates.
 Those answers are provided by two key quantities: entropy and
channel capacity.
Entropy
 Entropy is defined in terms of the probabilistic behaviour of a
source of information.
 In information theory the source output is modeled as a discrete
random variable taking values in a fixed finite alphabet with known
probabilities.
 Entropy is the average information content per source symbol.
Mutual Information
• Mutual information is defined through the conditional entropy of the
channel input X, which is selected from a known alphabet
– conditional entropy is the uncertainty remaining
about the channel input after the channel output has
been observed
• Mutual information has several properties:
– it is symmetric: I(X;Y) = I(Y;X)
– it is always nonnegative
– it is related to the joint entropy of the channel input and
channel output
Mutual Information
• The mutual information of two random variables is
a quantity that measures the mutual dependence of
the two variables. The most common unit of
measurement of mutual information is the bit, when
logarithms to the base 2 are used.
• The mutual information between two discrete
random variables X and Y is denoted by I(X;Y) and defined
as I(X;Y) = H(X) − H(X|Y)
• Mutual information is a useful concept for measuring
the amount of information shared between the input and
output of a noisy channel.
Mutual Information
• Mutual information measures the information that X
and Y share: it measures how much knowing one of
these variables reduces our uncertainty about the
other. For example, if X and Y are independent, then
knowing X does not give any information about Y and
vice versa, so their mutual information is zero. At the
other extreme, if X and Y are identical then all
information conveyed by X is shared with Y: knowing
X determines the value of Y and vice versa. As a
result, the mutual information is the same as the
uncertainty contained in Y (or X) alone, namely the
entropy of Y (or X: clearly if X and Y are identical they
have equal entropy).
Mutual Information
 The mutual information of two discrete random variables X and
Y can be defined as:

   I(X;Y) = Σx Σy p(x,y) log2 [ p(x,y) / (p1(x) p2(y)) ]

where p(x,y) is the joint probability distribution
function of X and Y, and p1(x) and p2(y) are the
marginal probability distribution functions of X and Y
respectively.
 Mutual information tells how much information
one random variable carries about another one.
 Mutual information quantifies how far the joint
distribution of X and Y is from what the joint
distribution would be if X and Y were independent.
 Mutual information is a measure of dependence:
I(X;Y) = 0 if and only if X and Y are independent random
variables.
 If X and Y are independent, then p(x,y) = p(x) p(y), and
therefore:

   log2 [ p(x,y) / (p(x) p(y)) ] = log2(1) = 0,  so I(X;Y) = 0.
• Mutual information can be equivalently expressed as

   I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X,Y)

where H(X) and H(Y) are the marginal entropies, H(X|Y)
and H(Y|X) are the conditional entropies, and
H(X,Y) is the joint entropy of X and Y.
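The relations above can be checked numerically. The following short Python sketch (added for illustration; the 2x2 joint distribution is an arbitrary made-up example, not from the slides) computes I(X;Y) directly from the joint distribution and verifies that it equals H(X) + H(Y) − H(X,Y):

import math

# Hypothetical 2x2 joint distribution p(x, y), chosen only for illustration.
p_xy = [[0.30, 0.20],
        [0.10, 0.40]]

p_x = [sum(row) for row in p_xy]        # marginal p1(x)
p_y = [sum(col) for col in zip(*p_xy)]  # marginal p2(y)

def entropy(dist):
    """H = -sum p log2 p, with 0 log 0 taken as 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# I(X;Y) = sum over x, y of p(x,y) log2( p(x,y) / (p1(x) p2(y)) )
mi = sum(p_xy[i][j] * math.log2(p_xy[i][j] / (p_x[i] * p_y[j]))
         for i in range(2) for j in range(2) if p_xy[i][j] > 0)

# Equivalent form: I(X;Y) = H(X) + H(Y) - H(X,Y)
h_x = entropy(p_x)
h_y = entropy(p_y)
h_xy = entropy([p for row in p_xy for p in row])
print(round(mi, 4), round(h_x + h_y - h_xy, 4))   # both values are about 0.1245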
Entropy for discrete ensembles

The entropy H of a discrete random variable X with possible
values {x1, …, xn} is

   H(X) = E[I(X)]

where E is the expected value operator, and I(X) is the information
content or self-information of X.
I(X) is itself a random variable. If p denotes the probability
mass function of X then the entropy can explicitly be written
as

   H(X) = Σi p(xi) I(xi) = − Σi p(xi) logb p(xi)
Entropy for discrete ensembles
 For a random variable X with n outcomes {x1, …, xn}, the Shannon
entropy, a measure of uncertainty denoted by H(X), is
defined as

   H(X) = − Σi p(xi) logb p(xi)

where p(xi) is the probability mass function of outcome xi.
 Consider a set of n possible outcomes (events) with equal
probability p(xi) = 1/n.
The uncertainty for such a set of n outcomes is then

   H(X) = − Σi (1/n) logb (1/n) = logb(n)

b is the base of the logarithm used. Common values of b
are 2, Euler's number e and 10, and the unit of entropy is
the bit for b = 2, the nat for b = e, and the dit (or digit) for b = 10.
 In the case of p(xi) = 0 for some i, the value of the
corresponding summand 0 logb 0 is taken to be 0, which
is consistent with the limit

   lim p→0+  p logb p = 0
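The definition above can be sketched in a few lines of Python (added for illustration, not part of the original slides); the 0 logb 0 convention is handled by skipping zero-probability outcomes, and the base b is left as a parameter:

import math

def shannon_entropy(probs, b=2.0):
    """H(X) = -sum_i p(x_i) log_b p(x_i); summands with p = 0 contribute 0."""
    return -sum(p * math.log(p, b) for p in probs if p > 0)

# Uniform distribution over n outcomes gives H = log_b(n)
n = 8
print(shannon_entropy([1.0 / n] * n))            # 3.0 bits for b = 2
print(shannon_entropy([1.0 / n] * n, b=math.e))  # log(8), about 2.079 nats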

The logarithm is used to provide the additivity property for
independent sources of uncertainty. For example, consider appending to each
of the n values of a first die the value of a second die, which has m possible
outcomes.

There are thus mn possible outcomes.

The uncertainty for such a set of mn outcomes is then

   logb(mn) = logb(m) + logb(n)

i.e. the uncertainty of playing with two dice is obtained by adding the
uncertainty of the second die logb(m) to the uncertainty of the first
die logb(n).
 In the uniform case the probability of each event is 1/n, so the
uncertainty of any single outcome is logb(n) = − logb(1/n).
 In the case of a non-uniform probability mass function
(or density in the case of continuous random variables),
the self-information of outcome xi is

   I(xi) = − logb p(xi)

which is also called a surprisal; the lower the
probability p(xi), the higher the uncertainty or the
surprise for the outcome xi.
 The average uncertainty, with ⟨·⟩ being the average
operator, is obtained by

   H(X) = ⟨I(X)⟩ = Σi p(xi) I(xi) = − Σi p(xi) logb p(xi)

and is used as the definition of the entropy
H(X). The above also explains why information
entropy and information uncertainty can be used
interchangeably.
EXAMPLE

 Entropy of a coin toss as a function of the probability of it coming up
heads.
 Consider tossing a coin with known, not necessarily fair, probabilities of
coming up heads or tails.
 The entropy of the unknown result of the next toss of the coin is maximised
if the coin is fair (that is, if heads and tails both have equal probability 1/2).
This is the situation of maximum uncertainty as it is most difficult to predict
the outcome of the next toss; the result of each toss of the coin delivers a
full 1 bit of information.
 However, if we know the coin is not fair, but comes up
heads or tails with probabilities p and q, then there is less
uncertainty. Every time, one side is more likely to come
up than the other. The reduced uncertainty is quantified
in a lower entropy: on average each toss of the coin
delivers less than a full 1 bit of information.

The extreme case is that of a double-headed coin which
never comes up tails. Then there is no uncertainty. The
entropy is zero: each toss of the coin delivers no
information.
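A short sketch (added for illustration) of the coin-toss entropy H(p) = −p log2 p − (1−p) log2(1−p), showing 1 bit for a fair coin, less than 1 bit for a biased coin, and 0 bits for a double-headed coin:

import math

def coin_entropy(p_heads):
    """Entropy of one toss of a coin that shows heads with probability p_heads."""
    h = 0.0
    for p in (p_heads, 1.0 - p_heads):
        if p > 0:
            h -= p * math.log2(p)
    return h

for p in (0.5, 0.9, 1.0):
    print(p, round(coin_entropy(p), 3))
# 0.5 -> 1.0 bit   (fair coin, maximum uncertainty)
# 0.9 -> 0.469 bit (biased coin, less than a full bit per toss)
# 1.0 -> 0.0 bit   (double-headed coin, no information)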
Shannon Noiseless Coding Theorem

 Source coding is a mapping from (a sequence of) symbols from an
information source to a sequence of alphabet symbols (usually bits)
such that the source symbols can be exactly recovered from the
binary bits (lossless source coding) or recovered within some
distortion (lossy source coding). This is the concept behind data
compression.
 In information theory, the source coding theorem informally states
that:
“N i.i.d. random variables each with entropy H(X) can be compressed
into more than N H(X) bits with negligible risk of information loss,
as N tends to infinity; but conversely, if they are compressed into
fewer than N H(X) bits it is virtually certain that information will be
lost.”
Shannon's statement
Let X be a random variable taking values in some finite alphabet Σ1 and let f be a
decipherable code from Σ1 to Σ2, where |Σ2| = a. Let S denote the resulting word
length of f(X).
If f is optimal in the sense that it has the minimal expected word length for X, then

   H(X) / log2(a)  <=  E[S]  <  H(X) / log2(a) + 1

Proof
Let si denote the word length of each possible xi (1 <= i <= n). Define
qi = a^(-si) / C, where C is chosen so that Σi qi = 1. Then

   H(X) = − Σi pi log2(pi)
       <= − Σi pi log2(qi)
       <= − Σi pi log2(a^(-si))
        = Σi si pi log2(a) = E[S] log2(a)

where the second line follows from Gibbs' inequality and the third line follows from
Kraft's inequality: C = Σi a^(-si) <= 1, so log2(C) <= 0.

so

   E[S] >= H(X) / log2(a)

For the second inequality we may set si = ceil(−log_a(pi)), so that

   −log_a(pi) <= si < −log_a(pi) + 1

and so

   a^(-si) <= pi

and

   Σi a^(-si) <= Σi pi = 1

and so by Kraft's inequality there exists a prefix-free code having those word lengths.
Thus the minimal S satisfies

   E[S] = Σi pi si < Σi pi (−log_a(pi) + 1) = H(X) / log2(a) + 1
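The second half of the proof suggests concrete word lengths si = ceil(−log_a pi). The sketch below (added for illustration; it assumes a binary code alphabet, a = 2, and uses the source probabilities of the Shannon-Fano example on the next slides) checks that these lengths satisfy Kraft's inequality and that their expected length lies within the stated bound:

import math

def shannon_code_lengths(probs, a=2):
    """Word lengths s_i = ceil(-log_a p_i), as in the proof above."""
    return [math.ceil(-math.log(p, a)) for p in probs]

probs = [0.4, 0.2, 0.12, 0.08, 0.08, 0.08, 0.04]  # source used in the next example
a = 2
lengths = shannon_code_lengths(probs, a)

H = -sum(p * math.log2(p) for p in probs)
expected_S = sum(p * s for p, s in zip(probs, lengths))

assert sum(a ** (-s) for s in lengths) <= 1                     # Kraft's inequality holds
assert H / math.log2(a) <= expected_S < H / math.log2(a) + 1    # the stated bound holds
print(round(H, 3), round(expected_S, 3))                        # about 2.421 and 3.04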
Shannon-Fano Algorithm
 List the source symbols in order
of decreasing probability.
 Partition the set into two subsets that
are as close to equiprobable as
possible, and assign 0 to the upper
subset and 1 to the lower subset.
 Continue this process, each time
partitioning the subsets into parts with
probabilities as nearly equal as possible,
until further partitioning is not
possible.
For M = 2,

Message   Probability   Encoded Message   Length
x1        0.4           00                2
x2        0.2           01                2
x3        0.12          100               3
x4        0.08          101               3
x5        0.08          110               3
x6        0.08          1110              4
x7        0.04          1111              4

Average length L = 0.4*2 + 0.2*2 + 0.12*3 + 0.08*3 + 0.08*3 + 0.08*4 + 0.04*4
                 = 2.52 letters/message

H(X) = − Σ p(xi) log2 p(xi) = 2.42 bits/message

Efficiency = H(X) / (L log2 M) = 2.42 / (2.52 × 1) ≈ 96%


For M = 3,

Message   Probability   Encoded Message   Length
x1        0.4           -1                1
x2        0.2           0 -1              2
x3        0.12          0 0               2
x4        0.08          1 -1              2
x5        0.08          1 0               2
x6        0.08          1 1 -1            3
x7        0.04          1 1 0             3

Average length L = 0.4*1 + 0.2*2 + 0.12*2 + 0.08*2 + 0.08*2 + 0.08*3 + 0.04*3
                 = 1.72 letters/message

H(X) = − Σ p(xi) log2 p(xi) = 2.42 bits/message

Efficiency = H(X) / (L log2 M) = 2.42 / (1.72 × log2 3) ≈ 88.7%
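The following Python sketch (added for illustration) is one way to implement the binary (M = 2) Shannon-Fano partitioning described above. The splitting rule and the integer weights are implementation choices made for this sketch; with ties in the split broken toward the later cut point, it reproduces the code lengths of the M = 2 table and hence the 2.52 letters/message average.

def shannon_fano(symbols):
    """symbols: list of (name, weight) pairs sorted by decreasing weight.
    Returns a dict mapping each name to its binary Shannon-Fano codeword."""
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(w for _, w in group)
        best_diff, cut, running = None, 1, 0
        # choose the cut that makes the two parts as close to equiprobable as
        # possible; ties go to the later cut, which matches the table above
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if best_diff is None or diff <= best_diff:
                best_diff, cut = diff, i
        upper, lower = group[:cut], group[cut:]
        for name, _ in upper:
            codes[name] += "0"     # 0 for the upper subset
        for name, _ in lower:
            codes[name] += "1"     # 1 for the lower subset
        split(upper)
        split(lower)

    split(symbols)
    return codes

# probabilities scaled by 100 to integer weights so the tie at the first split is exact
source = [("x1", 40), ("x2", 20), ("x3", 12), ("x4", 8),
          ("x5", 8), ("x6", 8), ("x7", 4)]
codes = shannon_fano(source)
avg_len = sum(w * len(codes[name]) for name, w in source) / 100
print(codes)      # x1: 00, x2: 01, x3: 100, x4: 101, x5: 110, x6: 1110, x7: 1111
print(avg_len)    # 2.52 letters/message, as in the M = 2 table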
Huffman Coding Algorithm
 Encoding algorithm
 Order the symbols by decreasing probabilities
 Starting from the bottom, assign 0 to the least probable
symbol and 1 to the next least probable symbol
 Combine the two least probable symbols into one composite
symbol
 Reorder the list with the composite symbol
 Repeat Step 2 until only two symbols remain in the list
 Huffman tree
 Nodes: symbols or composite symbols
 Branches: from each node, 0 defines one branch while 1
defines the other
 [Figure: Huffman tree with root node, branches labelled 1 and 0, and leaves]
 Decoding algorithm
 Start at the root, follow the branches based on the bits
received
 When a leaf is reached, a symbol is decoded
Huffman Coding Example

Merging steps (list after each merge of the two least probable symbols):
  Step 1:  A 0.35   B 0.17   C 0.17   D 0.16   E 0.15
  Step 2:  A 0.35   DE 0.31  B 0.17   C 0.17
  Step 3:  A 0.35   BC 0.34  DE 0.31
  Step 4:  BCDE 0.65   A 0.35

Huffman Codes:
  A   0
  B   111
  C   110
  D   101
  E   100

[Figure: Huffman tree — the root branches to BCDE (1) and A (0); BCDE branches to
BC (1) and DE (0); BC branches to B (1) and C (0); DE branches to D (1) and E (0)]

Average code-word length = 0.35 x 1 + 0.65 x 3 = 2.30 bits per symbol
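The merging procedure above can be sketched in a few lines of Python (added for illustration; the exact 0/1 labels may differ from the tree shown, but the code lengths and the 2.30 bits/symbol average match):

import heapq
from itertools import count

def huffman_codes(probs):
    """probs: dict of symbol -> probability. Returns dict of symbol -> codeword."""
    tiebreak = count()  # unique ids keep the heap from ever comparing the dicts
    # each heap entry: (probability, tie-break id, {symbol: partial codeword})
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable group
        p1, _, group1 = heapq.heappop(heap)   # next least probable group
        # assign 0 to the least probable group, 1 to the other, then merge them
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"A": 0.35, "B": 0.17, "C": 0.17, "D": 0.16, "E": 0.15}
codes = huffman_codes(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes)
print(f"{avg:.2f}")   # 2.30 bits per symbol, matching the example above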