Source Coding
EC 303
Recap
• Uncertainty = Information
• Uncertainty is measured in terms of probability of occurrence (determined
experimentally using a sufficiently large number of trials).
• Information is inversely related to the probability of occurrence: the less
likely an outcome, the more information its occurrence conveys.
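• In symbols (a standard way of writing the last point; the formula is not
given explicitly on the slide), the information conveyed by an outcome x_i of
probability p_i is

I(x_i) = \log_2\frac{1}{p_i} = -\log_2 p_i \quad \text{bits}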
Uncertainty, Information and Entropy
Entropy
• The expected information may be quantified over a set of possible outcomes,
or mutually exclusive events.
• Specifically, if event i occurs with probability p_i, 1 <= i <= N, out of a
set of N mutually exclusive events, then the average or expected information
is given by
H(p_1, p_2, \ldots, p_N) = \sum_{i=1}^{N} p_i \log_2\!\left(\frac{1}{p_i}\right)
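• A minimal sketch of this computation in Python (the function name
entropy_bits and the example distributions are illustrative, not from the
slides):

import math

def entropy_bits(probs):
    # Expected information H = sum of p_i * log2(1/p_i), in bits.
    # Outcomes with p_i == 0 contribute nothing (p*log2(1/p) -> 0).
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))   # fair coin: 1.0 bit
print(entropy_bits([1/6] * 6))    # fair die: log2(6) = 2.585 bits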
Source Coding
• Source coding is taking a set of messages that need to be sent from a sender
and encoding them in a way that is efficient. The notions of information and
entropy will be fundamentally important in this effort.
• E.g. ASCII coding, the .wav format for audio, and digital images storing
three color values using 8 bits each. All these encodings involve a sequence of
fixed-length symbols, each of which can be easily manipulated independently.
For example, to find the 42nd character in a file, one just looks at the 42nd
byte and interprets those 8 bits as an ASCII character (illustrated in the
snippet after this list).
• Do all the characters (ASCII, 128 characters) occur with the same probability?
• Should they all be encoded using the same number of bits?
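• A small illustration of this byte-offset arithmetic for fixed-length (8-bit
ASCII) text; the file name is hypothetical:

# With a fixed-length code the k-th symbol starts at a known offset,
# so no earlier symbols need to be decoded.
with open("message.txt", "rb") as f:   # hypothetical ASCII text file
    f.seek(41)                         # the 42nd character sits at byte offset 41
    ch = f.read(1).decode("ascii")
print(ch)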
Compression
• Specifically, the entropy, defined earlier, tells us
the expected amount of information in a
message, when the message is drawn from a set
of possible messages, each occurring with some
probability.
Lossy Data Compression
• Shannon also developed the theory of lossy data
compression. This is better known as rate-distortion theory. In
lossy data compression, the decompressed data does not
have to be exactly the same as the original data.
Rate-Distortion Theory
– We trade off rate (number of bits per symbol) against distortion; this
trade-off is represented by the rate-distortion function R(D).
Example
• Consider a fair die (each outcome is equally likely). Compute the entropy
of this source.
• Now suppose the die is loaded such that outcomes 5 and 6 are more likely than
the others, with p(X=5) = 1/2 and p(X=6) = 1/3 (the remaining four outcomes
are equally likely). Compute the entropy in this scenario. What can you
conclude from these two results?
Example
• The set of outcomes for the die (the alphabet of the DMS behind it) is
{1, 2, 3, 4, 5, 6}. In the first case, each outcome has the same probability,
1/6.
• Entropy = 6 × (1/6) × log2(6) = log2(6) = 2.585 bits
• Loaded die:
• Entropy:
H = \frac{1}{2}\log_2(2) + \frac{1}{3}\log_2(3) + 4\cdot\frac{1}{24}\log_2(24)
H = 1.7925
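• A quick numerical check of both values (a sketch; the variable names are
illustrative):

import math

fair = [1/6] * 6
loaded = [1/2, 1/3] + [1/24] * 4   # p(5) = 1/2, p(6) = 1/3, remaining outcomes 1/24 each

H = lambda probs: sum(p * math.log2(1 / p) for p in probs)
print(H(fair))     # 2.585 bits
print(H(loaded))   # 1.7925 bits -- the loaded die is the less uncertain source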
Source Coding Theorem
• The source coding theorem establishes a fundamental limit on the rate at
which the output of an information source can be compressed without causing a
large error probability at the receiver. This is one of the fundamental
theorems of information theory.
Source Coding Theorem
• The theorem, first proved by Shannon, only gives a theoretical bound on the
performance of encoders. It does not provide any algorithm for the design of
such optimum codes.
Huffman Coding
• In Huffman coding, fixed length blocks of the source output are mapped to
variable length binary blocks. The idea here is to map the more frequently
occurring fixed length sequences to shorter binary sequences and the less
frequently occurring ones to longer binary sequences.
Example
• Consider the following case where an alphabet of size 5 is
mapped using variable length coding.
Letter   Probability   Code 1   Code 2   Code 3   Code 4
a1       1/2           1        1        0        00
a2       1/4           01       10       10       01
Example
• The code must be uniquely decodable.
• The code should be instantaneous.
• Prefix condition: a code is said to satisfy the prefix condition if no code
word is a prefix of another code word.
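• A small sketch of checking the prefix condition (the function name is
illustrative):

def is_prefix_free(codewords):
    # True if no codeword is a prefix of another (prefix condition).
    for i, a in enumerate(codewords):
        for j, b in enumerate(codewords):
            if i != j and b.startswith(a):
                return False
    return True

print(is_prefix_free(["0", "10", "110", "111"]))   # True  -> instantaneous code
print(is_prefix_free(["1", "01", "00", "000"]))    # False -> "00" is a prefix of "000"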
Compression
• The average length achieved by a code is computed as the sum of the binary
codeword lengths n_i weighted by the probabilities of the corresponding source
symbols.
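• Written out (standard notation, consistent with the entropy expression
earlier; the efficiency and compression-ratio figures quoted on later slides
follow these definitions):

\bar{n} = \sum_{i=1}^{N} p_i n_i, \qquad
\text{efficiency } \eta = \frac{H(X)}{\bar{n}}, \qquad
\text{compression ratio} = \frac{\text{fixed-length bits per symbol}}{\bar{n}}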
Huffman Encoding Algorithm
• Sort the source outputs in the decreasing order of their probability.
• Merge the two least probable outputs into a single output whose
probability is the sum of the corresponding probabilities.
• Continue this till the number of remaining outputs is equal to two.
• Arbitrarily assign 0 and 1 as code words for the two remaining outputs.
• If an output is the result of the merger of two outputs, append 0 and 1 to
the current code word to obtain the next code words. Repeat this step till
there are no mergers (a sketch of the procedure follows below).
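• A minimal sketch of this procedure in Python (the function name huffman_code
is illustrative; ties are broken arbitrarily, so the "bubbling" refinement on
the next slide is not included):

import heapq

def huffman_code(probs):
    # probs: dict symbol -> probability. Returns dict symbol -> binary codeword.
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # least probable group
        p1, _, c1 = heapq.heappop(heap)   # second least probable group
        # Prepend 0 to one group's codewords and 1 to the other's, then merge.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        count += 1
        heapq.heappush(heap, (p0 + p1, count, merged))
    return heap[0][2]

code = huffman_code({"a1": 1/3, "a2": 1/4, "a3": 1/6, "a4": 1/8, "a5": 1/8})
print(code)   # codeword lengths 2, 2, 2, 3, 3 -> average length 2.25 bits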
Huffman Encoding
• The two entries with the lowest relative frequency are merged to form a new
branch with their combined probability. After every merger, the new branch and
the remaining branches are reordered so that the reduced table preserves the
descending order of probability of occurrence.
• During this reordering, the new branch rises until it can rise no more. This
bubbling results in a code with the lowest code-length variance.
• If bubbling is not done, the average code length stays the same, but the
code-length variance is higher.
Example
• Determine the Huffman code for a source with alphabet A = {a1, a2, a3, a4,
a5} with probabilities 1/3, 1/4, 1/6, 1/8 and 1/8.
• Determine the Huffman code for the outcomes of a fair die, and for the loaded
die of the earlier example (p(X=5) = 1/2, p(X=6) = 1/3).
• How does the length of the Huffman code compare with the entropy of the
source in each case?
Solution
• Problem 1:
– Entropy = 2.2091
– Average length = 27/12 = 2.25
– Efficiency = 98.18%, compression ratio = 3/2.25 = 1.33
• Problem 2:
– Entropy = 2.585
– Average length = 16/6 = 2.667
– Efficiency = 96.94%, compression ratio = 3/2.667 = 1.125
• Problem 3:
– Entropy = 1.7925
– Average length = 11/6 = 1.833
– Efficiency = 97.77%, compression ratio = 3/1.833 = 1.636
Huffman Code
• Determine the Huffman code for a loaded coin with p(X=head) = 0.9. Compare
this with the entropy and determine the efficiency of your code.
Code Extension
• In the first case, the outcome heads is most likely, but it is not possible
to encode an outcome using less than one bit. Although the entropy is 0.4690,
the number of bits needed is 1, so the efficiency is only 46.9%.
• In the second case, the codes are a: 1, b: 01, c: 00. The average length is
1.31 vs. the entropy, which is 0.9443. This code provides good compression,
but its efficiency is low.
• Compression ratio = 2/1.31 = 1.53
• Efficiency = 0.9443/1.31 = 72%
Code Extension
• To improve the code efficiency, we have to redefine the source alphabet.
With a larger source alphabet there is greater variation in the probabilities
of occurrence, which in turn is what allows a reduction in the average code
length.
• How do we increase the source alphabet? Consider using two (or more)
outcomes of the coin flipping experiment at a time. Now the source
alphabet is of size 4.
• p(HH) = 0.81, p(HT)=0.09, p(TH) = 0.09, p(TT)=0.01
• Determine the entropy and average code length in this case and when the
source alphabet is of size 8.
Code Extension
• Alphabet size = 4
• Entropy = 0.9380
• Code: HH: 1, HT: 01, TH: 000, TT: 001
• Average length = 1.29
• Efficiency: 0.9380/1.29 = 72%
• Alphabet size = 8
• Entropy: 1.4070
• Average length: 1.5970
• Efficiency = 1.4070/1.5970=88%
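• A quick check of the size-4 extension numbers (a sketch; the codeword
labeling above is one valid Huffman assignment):

import math

pairs = {"HH": 0.81, "HT": 0.09, "TH": 0.09, "TT": 0.01}
code = {"HH": "1", "HT": "01", "TH": "000", "TT": "001"}

H = sum(p * math.log2(1 / p) for p in pairs.values())   # ~0.938 bits per pair
L = sum(p * len(code[s]) for s, p in pairs.items())     # ~1.29 bits per pair
print(H, L, H / L)                                      # efficiency ~0.727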
Block Coding
• As the block length increases, the code efficiency seems to improve.
This result is known as the "noiseless coding theorem" of Shannon,
which says that as we use the Huffman coding algorithm over longer
and longer blocks of symbols, the average number of bits required to
encode each symbol approaches the entropy of the source.
H(X) \le L(X) \le H(X) + 1
H(X^n) \le L(X^n) \le H(X^n) + 1
H(X^n) = n\,H(X)
H(X) \le L(X^n)/n \le H(X) + 1/n
\lim_{n \to \infty} L(X^n)/n = H(X)
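• For the biased coin above (H(X) ≈ 0.469 bits), the per-symbol averages
already computed illustrate this convergence: 1 bit per symbol for n = 1,
1.29/2 ≈ 0.65 for n = 2, and 1.597/3 ≈ 0.53 for n = 3.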
Extension Codes
• Thus, code extension offers a powerful technique to improve the
efficiency of the code. In the cases discussed here, we assumed that
the individual outcomes were independent. That is not always the case.
• For example, when encoding English text, certain combinations are much more
likely to occur than others. Consider qu, ch, sh, ng and so on. If
probabilities are assigned based on this information, a more efficient
code can be obtained.
• Read: Huffman coding for Fax transmission pages 866-868 from ref [1].
Frequency of Occurrence
Example
• Given a long sequence of ternary symbols "a", "b" and "c", it was observed
that "a", "b" and "c" were equally likely. 50% of the time, a symbol is
followed by the consecutive symbol; a symbol is followed by itself only 20%
of the time (so the remaining symbol follows 30% of the time).
• You are asked to design a binary Huffman code for this source. First
determine the entropy of the source and the efficiency of the generated
Huffman code.
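• A sketch of how the source model and its per-symbol entropy could be set up
(assuming "consecutive" wraps cyclically, c -> a, and that the remaining 30%
goes to the third symbol; these assumptions are not spelled out on the slide):

import math

symbols = "abc"
nxt = {"a": "b", "b": "c", "c": "a"}   # assumed cyclic "consecutive" symbol

def p_next(cur, n):
    # Conditional probability of the next symbol given the current one.
    if n == nxt[cur]:
        return 0.5    # followed by the consecutive symbol 50% of the time
    if n == cur:
        return 0.2    # followed by itself 20% of the time
    return 0.3        # the remaining symbol gets the rest

# Entropy per symbol of the source with memory, H(next | current); the current
# symbol is uniform over {a, b, c}, consistent with "equally likely".
H_cond = sum((1/3) * p_next(c, n) * math.log2(1 / p_next(c, n))
             for c in symbols for n in symbols)
print(H_cond)   # ~1.485 bits/symbol, vs log2(3) ~ 1.585 for a memoryless uniform source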