Lec3 Source Coding Annotated Day4
• Given an information source and a noisy channel, information theory provides limits on the maximum rate at which reliable communication can take place over the noisy channel (the job of channel coding).
What is Information?
• What does “information” mean?
• There is no single exact definition; however:
• Information carries specific knowledge that is definitely new to its recipient;
• Information is always carried by some specific carrier and may take different forms (letters, digits, specific symbols, sequences of digits, letters, and symbols, etc.);
• Information is meaningful only if the recipient is able to interpret it.
Information
• Information, when materialized, becomes a message.
• Information is always about something (size of a parameter, occurrence of an event, etc.).
• Viewed in this manner, information does not have to be accurate; it
may be a truth or a lie.
• Even a disruptive noise used to inhibit the flow of communication and
create misunderstanding would in this view be a form of information.
• However, generally speaking, the greater the amount of information in the received message, the more accurate the message is.
Information Theory Related Questions
• How can we measure the amount of information?
• How can we ensure the correctness of information?
• What to do if information gets corrupted by errors?
• How much memory does it require to store/transmit information?
• Can we reduce the time taken to transfer the information
(compression)?
Information Content
• What is the information content of any message?
• Shannon's answer is: the information content of a message consists simply of the number of 1s and 0s it takes to transmit it. In other words, information can be measured in bits.
• Hence, the elementary unit of information is a binary unit: a bit, which
can be either 1 or 0; “true” or “false”; “yes” or “no”, etc.
Information and Uncertainty
• Zero information
• Sachin Tendulkar retired from Professional Cricket. (celebrity, known fact)
• Narendra Modi is the Prime Minister of India. (Known fact)
• Little information
• It will rain in Bangalore in the month of August. (not much uncertainty since
Aug. is monsoon time)
• Large information
• An earthquake is going to hit Bangalore tomorrow. (are you sure? an unlikely
event)
• Someone solved the world hunger problem. (Seriously?)
Mathematical Model for a Discrete-Time Information Source
• The source output is modeled as symbols $s_0, s_1, \ldots, s_{K-1}$ from a finite alphabet $A$, with probabilities of occurrence $p_0, p_1, \ldots, p_{K-1}$, such that
$$P(S = s_k) = p_k, \quad k = 0, 1, \ldots, K-1, \qquad \text{and} \qquad \sum_{k=0}^{K-1} p_k = 1.$$
Measure of Information
Definition of Information
• Using this intuition, Hartley proposed the following definition of information:
• The amount of information gained after observing the event $s_k$, which occurs with probability $p_k$, is the logarithmic function
$$I(s_k) = \log_2\!\left(\frac{1}{p_k}\right) \ \text{bits.}$$
• Properties:
• $I(s_k) = 0$ for $p_k = 1$.
• No information is gained if we are absolutely certain about the outcome of an event.
• $I(s_k) \ge 0$ for $0 \le p_k \le 1$.
• The occurrence of an event provides some or no information, but never brings about a loss of information.
• $I(s_k) > I(s_i)$ for $p_k < p_i$.
• The less probable an event is, the more information we gain when it occurs.
• $I(s_k s_i) = I(s_k) + I(s_i)$,
• if $s_k$ and $s_i$ are statistically independent.
• Normally, the base of the logarithm is taken as 2 and the unit of information is the bit (binary digit).
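These properties are easy to check numerically. Below is a minimal Python sketch (the helper name self_information is ours, not from the lecture) that computes $I = \log_2(1/p)$:

```python
import math

def self_information(p: float) -> float:
    """Information gained by observing an event of probability p, in bits."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(1 / p)

print(self_information(1.0))     # 0.0   -> a certain event carries no information
print(self_information(0.5))     # 1.0   -> one fair-coin flip is worth 1 bit
print(self_information(1/36))    # ~5.17 -> roll of two dice (see the next example)
# Additivity for independent events: I(p1 * p2) == I(p1) + I(p2)
print(self_information(0.5 * 0.25), self_information(0.5) + self_information(0.25))
```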
Example of Amount of Information
• One flip of a fair coin:
• Before the flip, there are two equally probable choices: heads or tails.
P(H)=P(T)=1/2. Amount of information = log2(2/1) = 1 bit
• Roll of two dice:
• Each die has six faces, so in the roll of two dice there are 36 possible
combinations for the outcome. Amount of information = log2(36/1) = 5.2 bits.
• A randomly chosen decimal digit is even:
• There are ten decimal digits; five of them are even (0, 2, 4, 6, 8). Amount of
information = log2(10/5) = 1 bit.
Entropy (Average information per Message)
• Clearly, $I(s_k)$ is a discrete RV that takes the values $I(s_0), I(s_1), \ldots, I(s_{K-1})$ with probabilities $p_0, p_1, \ldots, p_{K-1}$, respectively.
• The mean value of $I(s_k)$ over the source alphabet $A$ is given by
$$H(A) = E[I(s_k)] = \sum_{k=0}^{K-1} p_k I(s_k) = \sum_{k=0}^{K-1} p_k \log_2\!\left(\frac{1}{p_k}\right)$$
• This is called the entropy of a DMS with source alphabet $A$.
• It is the measure of the average information content per source symbol/message.
• It depends only on the probabilities of the source symbols.
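As a quick illustration (not part of the lecture), a small Python sketch of the entropy formula; the helper name entropy is hypothetical:

```python
import math

def entropy(probs):
    """H(A) = sum_k p_k * log2(1/p_k), in bits per symbol."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))        # 1.0 bit    (fair coin)
print(entropy([1/6] * 6))         # ~2.585 bits (fair die)
print(entropy([1.0, 0.0, 0.0]))   # 0.0 bits   (no uncertainty)
```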
Some Properties of Entropy
• $H(A) = 0$ if and only if $p_k = 1$ for some $k$ and the remaining probabilities in the set are all zero → no uncertainty.
• $H(A) = \log_2 K$ (its maximum value) if and only if $p_k = 1/K$ for all $k$ (all the symbols in the set are equiprobable) → maximum uncertainty.
Example: Entropy of Binary Memoryless Source
$$H(A) = -p_0 \log_2 p_0 - p_1 \log_2 p_1 = -p_0 \log_2 p_0 - (1 - p_0)\log_2(1 - p_0) \quad \text{(bits)}$$
[Plot: the binary entropy function $H(p_0)$ versus $p_0$, equal to 0 at $p_0 = 0$ and $p_0 = 1$ and peaking at 1.0 bit at $p_0 = 1/2$.]
• When $p_0 = 0.5$, the source is called a binary symmetric source.
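A quick numerical check of the binary entropy function (a sketch, not from the slides):

```python
import math

def binary_entropy(p0: float) -> float:
    """H(p0) = -p0*log2(p0) - (1 - p0)*log2(1 - p0), in bits."""
    if p0 in (0.0, 1.0):
        return 0.0
    return -p0 * math.log2(p0) - (1 - p0) * math.log2(1 - p0)

for p0 in (0.0, 0.1, 0.25, 0.5, 0.9, 1.0):
    print(f"p0={p0:.2f}  H={binary_entropy(p0):.4f} bits")
# The maximum, 1 bit, occurs at p0 = 0.5 (the binary symmetric source).
```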
Extension of a Discrete Memoryless Source
$$H(A) = \sum_{i=0}^{K-1} p_i \log_2\!\left(\frac{1}{p_i}\right)$$
$$H(A^2) = \sum_{i=0}^{K-1}\sum_{j=0}^{K-1} p_i p_j \log_2\!\left(\frac{1}{p_i p_j}\right) \qquad \text{(since the source is memoryless)}$$
$$\phantom{H(A^2)} = \sum_{i=0}^{K-1}\sum_{j=0}^{K-1} p_i p_j \left[\log_2\!\left(\frac{1}{p_i}\right) + \log_2\!\left(\frac{1}{p_j}\right)\right] = H(A) + H(A) = 2H(A)$$
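A numerical sanity check of $H(A^2) = 2H(A)$ (a sketch; the three-symbol probabilities below are chosen arbitrarily, not taken from the exercise that follows):

```python
import math
from itertools import product

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Example probabilities for a 3-symbol DMS (chosen arbitrarily for illustration).
p = [0.5, 0.25, 0.25]

# Second-order extension: 9 symbol pairs, each with probability p_i * p_j (memoryless source).
p2 = [pi * pj for pi, pj in product(p, p)]

print(entropy(p))    # 1.5 bits
print(entropy(p2))   # 3.0 bits == 2 * H(A)
```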
2. Next consider the 2nd-order extension of the source. Since $A$ has three symbols, it follows that the source alphabet of the extended source has $K^2 = 9$ symbols. Now, find the entropy of the extended source.
Verify that $H(A^2) = 2H(A)$.
3. A DMS has an alphabet of size $K$ and the source outputs are equally likely. Find the entropy of that source.
Ans. $\log_2(K)$
• The original source sequence must be perfectly reconstructed from the encoded
binary sequence (Lossless encoding).
Example
Source Encoder
• Do all the symbols in the source alphabet occur with the same probability?
• Each alphanumeric symbol is mapped to a sequence of dots and dashes.
[Block diagram: discrete memoryless source → $s_k$ → source encoder → $b_k$ → binary sequence]
Average Code-word Length
$$\bar{L} = \sum_{k=0}^{K-1} p_k l_k,$$
where $l_k$ is the length (in bits) of the codeword assigned to symbol $s_k$.
Coding Efficiency
• Let $\bar{L}_{\min}$ denote the minimum possible value of $\bar{L}$. Then, the coding efficiency $\eta$ is defined as
$$\eta = \frac{\bar{L}_{\min}}{\bar{L}}, \qquad \text{where } \bar{L} = \sum_{k=0}^{K-1} p_k l_k.$$
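For instance, a short sketch (the source and code lengths are illustrative, not from the slides) that computes $\bar{L}$ and $\eta$, taking $\bar{L}_{\min} = H(A)$ as justified on the next slide:

```python
import math

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def average_length(probs, lengths):
    """L_bar = sum_k p_k * l_k, in bits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))

# Illustrative source and prefix-free code with codeword lengths 1, 2, 3, 3 (not from the slides).
p = [0.5, 0.25, 0.125, 0.125]
l = [1, 2, 3, 3]

L_bar = average_length(p, l)
eta = entropy(p) / L_bar        # taking L_min = H(A)
print(L_bar, eta)               # 1.75, 1.0 -> this particular code is optimal for this source
```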
Intuition behind $\bar{L}_{\min} = H(A)$
• Since the source is memoryless, the probability of a typical sequence $\mathbf{s}$ of length $n$ is
$$P(\mathbf{S} = \mathbf{s}) = \prod_{i=1}^{N} p_i^{\,n p_i} = \prod_{i=1}^{N} 2^{\,n p_i \log_2 p_i} = 2^{\,n \sum_{i=1}^{N} p_i \log_2 p_i} = 2^{-nH(A)}$$
• This means that for large $n$, almost all the output sequences of length $n$ of the source are equally probable, each with probability $\approx 2^{-nH(A)}$.
• Hence there are roughly $2^{nH(A)}$ such typical sequences, and indexing them requires about $nH(A)$ bits, i.e., about $H(A)$ bits per source symbol.
• For a fair die, each outcome has probability $p = 1/N$, so the entropy is $\log_2 6 \approx 2.585$ bits.
• If this die was loaded such that outcomes 6 and 5 are more likely than the others, with $p(X=5) = 0.5$ and $p(X=6) = 1/3$, and the rest of the outcomes occurring with equal probability, compute the entropy in this scenario.
• Loaded die:
• Entropy:
$$H = \frac{1}{2}\log_2(2) + \frac{1}{3}\log_2(3) + 4 \cdot \frac{1}{24}\log_2(24) \approx 1.7925 \ \text{bits}$$
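A quick check of this arithmetic (a sketch):

```python
import math

# Loaded die: p(5) = 1/2, p(6) = 1/3, remaining four faces share the rest equally (1/24 each).
probs = [1/2, 1/3] + [1/24] * 4

H = sum(p * math.log2(1 / p) for p in probs)
print(H)                # ~1.7925 bits
print(math.log2(6))     # ~2.585 bits for the fair die, for comparison
```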
When the Source has Memory
• For a source having memory (example: printed English text has a lot of dependency between letters and words), the outputs of the source are not independent, and previous outputs reveal some information about the future ones.
• This dependency reduces uncertainty, and the average information content per source symbol is lower for such a source.
Source Coding Theorem - Shannon
• Source coding theorem establishes a fundamental limit on the rate at
which the output of an information source can be compressed
without causing large error probability at the receiver.
• This is one of the fundamental theorems of information theory.
Source Coding Algorithms
• The theorem, first proved by Shannon, only gives the theoretical bound for the
performance of the encoders. It does not provide any algorithm for design of
such optimum codes.
• As a result, several algorithms have been developed that try to compress the
information at the source in a fashion such that the data is recoverable at the
receiver without any losses.
Classification of codes
Symbol   Code 1   Code 2   Code 3
x1       0        10       0
x2       010      00       10
x3       01       11       110
x4       10       110      111
Classification of codes
• Non-singular (distinct) codes: each codeword is distinguishable from every other codeword.
• Uniquely decodable codes: a set of codewords that can be decoded in only one way.
• Code 1: if you receive 01010, it can be decoded as x2 x4 or as x1 x4 x4 (not uniquely decodable).
• Instantaneously decodable: a uniquely decodable code is called an instantaneous code if the end of any codeword is recognizable without examining subsequent code symbols.
• Prefix-free code: a code is said to be prefix-free if no codeword is a prefix of another codeword.
• The prefix-free condition is a sufficient but not a necessary condition for a code to be uniquely decodable:
• prefix-free ⇒ uniquely decodable;
• the reverse is not always true.
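These definitions can be tested mechanically. A minimal sketch, using the code assignments from the table above (the helper name is_prefix_free is ours):

```python
def is_prefix_free(codewords) -> bool:
    """True if no codeword is a prefix of another codeword."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

code1 = ["0", "010", "01", "10"]
code2 = ["10", "00", "11", "110"]
code3 = ["0", "10", "110", "111"]

print(is_prefix_free(code1), is_prefix_free(code2), is_prefix_free(code3))
# False False True -> only Code 3 is prefix-free (and hence instantaneously decodable)
```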
Kraft Inequality
• A prefix-free (instantaneous) code with codeword lengths $l_0, l_1, \ldots, l_{K-1}$ exists if and only if
$$\sum_{k=0}^{K-1} 2^{-l_k} \le 1.$$
• Note: however, this does not guarantee that any code satisfying the inequality is automatically uniquely decodable.
• You need to check the U.D. condition separately.
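For the three codes in the table above, the Kraft sums can be checked as follows (a sketch):

```python
def kraft_sum(codewords) -> float:
    """Sum of 2^(-l_k) over the codeword lengths l_k."""
    return sum(2 ** -len(c) for c in codewords)

code1 = ["0", "010", "01", "10"]
code2 = ["10", "00", "11", "110"]
code3 = ["0", "10", "110", "111"]

for name, code in (("Code 1", code1), ("Code 2", code2), ("Code 3", code3)):
    print(name, kraft_sum(code))
# Code 1: 1.125 (> 1, violates the inequality)
# Code 2: 0.875 and Code 3: 1.0 satisfy it, but only Code 3 is actually prefix-free
```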
What do we want?
• Huffman codes are prefix-free codes with minimum average codeword length. In this sense, they are optimal.
Steps
• List the source symbols in order of decreasing probability.
• Combine the two symbols of lowest probability into a single node whose probability is their sum, assigning 0 and 1 to the two branches.
• Repeat until a single node of probability 1 remains; each codeword is read off the path from the root to its symbol.
Huffman Encoding Example 1
• The average codeword length for the Huffman code is $\bar{L} = 2.45$ bits/symbol.
• The performance of the Huffman code is close to the optimum. It is measured by the coding efficiency, which is $2.418/2.45 \approx 0.99$.
• Huffman codes are uniquely decodable: you can verify that the encoded message sequence can be decoded in only one way.
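Below is a compact Python sketch of the Huffman procedure (not the slides' worked example; the probabilities used are those of Problem 1 on the next slide). It repeatedly merges the two least probable nodes and then reads the codewords back:

```python
import heapq
import itertools
import math

def huffman_code(probs):
    """Return {symbol: codeword} for a dict mapping symbol -> probability."""
    counter = itertools.count()                    # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)            # the two least probable nodes...
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}   # ...are merged, prefixing 0 and 1
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]

probs = {"a1": 1/3, "a2": 1/4, "a3": 1/6, "a4": 1/8, "a5": 1/8}
code = huffman_code(probs)
L_bar = sum(probs[s] * len(w) for s, w in code.items())
H = sum(p * math.log2(1 / p) for p in probs.values())
print(code)
print(L_bar, H, H / L_bar)      # 2.25, ~2.2091, ~0.98 -- matches the solution on the next slides
```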
More Examples on Huffman Coding
1. Determine the Huffman code for a source with alphabet A = {a1, a2, a3, a4, a5} and probabilities 1/3, 1/4, 1/6, 1/8, and 1/8. How does the average length of the Huffman code compare with the entropy of the source in each case?
Solution
• Problem 1:
• The entropy = 2.2091 bits
• Average length = 27/12 = 2.25 bits
• Coding efficiency = 2.2091/2.25 ≈ 98.18%
• Problem 2:
• The entropy = 2.585 bits
• Average length = 16/6 ≈ 2.667 bits
• Coding efficiency = 2.585/2.667 ≈ 96.93%
More Examples on Huffman Coding
• Determine the Huffman code for a loaded coin with p(X=head) = 0.9. Compare this with the entropy and determine the efficiency of your code.
Solution
• In the first case, the outcome heads is most likely, but it is not possible to encode the outcome using less than one bit. Although the entropy is ≈ 0.469 bits, the number of bits needed is 1, so the efficiency is only ≈ 0.47.
• In the second case, the codes are a: 0, b: 10, c: 11. The average length is 1.27 bits vs. the entropy, which is 0.9443 bits. Efficiency: 0.9443/1.27 ≈ 0.74.
Example: Extended Source
Solution
Block Coding
• As we use the Huffman coding algorithm over
longer and longer blocks of symbols, the average
number of bits required to encode each symbol
approaches the entropy of the source. (See the
previous example)
• If the Huffman code is designed for sequences of source letters of length n (the nth-order extension of the source), we have
$$H(A^n) \le \bar{L}_n < H(A^n) + 1,$$
where $\bar{L}_n$ is the average codeword length for the extended source; thus the codeword length per message is, on average,
$$\bar{L} = \bar{L}_n / n.$$
• If the source is memoryless, $H(A^n) = nH(A)$.
• Therefore,
$$H(A) \le \bar{L} < H(A) + \frac{1}{n}.$$
• Thus, $\bar{L} \to H(A)$ as $n \to \infty$.
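As an illustration of this limit (a sketch; the loaded-coin probabilities echo the earlier example, and huffman_code is the same hypothetical helper as before), Huffman coding blocks of n coin flips drives the per-symbol length toward the entropy:

```python
import heapq
import itertools
import math

def huffman_code(probs):
    """Huffman codewords for a dict of symbol -> probability (same sketch as earlier)."""
    counter = itertools.count()
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]

base = {"H": 0.9, "T": 0.1}                                   # loaded coin
H_A = sum(p * math.log2(1 / p) for p in base.values())        # ~0.469 bits per flip

for n in (1, 2, 3, 4):
    # n-th order extension: blocks of n flips; block probability is the product (memoryless)
    blocks = {"".join(b): math.prod(base[s] for s in b)
              for b in itertools.product(base, repeat=n)}
    code = huffman_code(blocks)
    L_per_flip = sum(blocks[b] * len(w) for b, w in code.items()) / n
    print(n, round(L_per_flip, 4), round(H_A, 4))
# The per-flip average length decreases toward H(A) as the block length n grows.
```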
Shannon-Fano Encoding
In Shannon-Fano encoding, ambiguity may arise in the choice of approximately equiprobable sets.
Drawback of Huffman & Shannon-Fano Coding
• Huffman codes are optimal in the sense that, for a given source, they provide a prefix code with the minimum number of bits per message.
• However, they are not a good choice for a practical source whose statistics are not known in advance.
• The Lempel-Ziv algorithm belongs to the class of universal source coding algorithms, i.e., algorithms that are independent of the source statistics. It is a variable-to-fixed length coding scheme.
Lempel-Ziv (L-Z) Coding
Example 1
• Consider the following sequence: 101011011010101010
• After parsing, the phrases are: 1, 0, 10, 11, 01, 101, 010, 1010
• Each codeword below is a pair: the first entry is the dictionary location in which the longest prefix of the phrase appeared before, and the second entry is the last symbol of the phrase.
Dictionary location Contents Codeword
1 1 (0,1)
2 0 (0,0)
3 10 (1,0)
4 11 (1,1)
5 01 (2,1)
6 101 (3,1)
7 010 (5,0)
8 1010 (6,0)
Example 2
• Consider the following sequence
ABBAABBAABBABAABAA
After Parsing:
A, B, BA, AB, BAA, BB, ABA, ABAA
Dictionary location Contents Codeword
1 A (0,A)
2 B (0,B)
3 BA (2, A)
4 AB (1,B)
5 BAA (3,A)
6 BB (2,B)
7 ABA (4,A)
8 ABAA (7,A)
• Let the number of phrases obtained by parsing a sequence of length n be C(n).
• Total number of symbols in the alphabet = K.
• Then L-Z encoding yields a fixed-length code sequence of length
$$C(n)\left[\log_2 C(n) + \log_2 K\right] \ \text{bits.}$$
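A minimal LZ78-style parser matching the two examples above (a sketch; dictionary locations are 1-based as in the tables, location 0 means the empty prefix, and the final phrase is assumed to complete exactly as it does in both examples):

```python
import math

def lz_parse(sequence):
    """Parse a sequence into LZ78 phrases; return a list of (dictionary location, last symbol)."""
    dictionary = {}      # phrase -> dictionary location (1-based)
    codewords = []
    phrase = ""
    for symbol in sequence:
        if phrase + symbol in dictionary:
            phrase += symbol                          # keep extending the current phrase
        else:
            codewords.append((dictionary.get(phrase, 0), symbol))
            dictionary[phrase + symbol] = len(dictionary) + 1
            phrase = ""
    return codewords

codes = lz_parse("101011011010101010")
print(codes)   # [(0,'1'), (0,'0'), (1,'0'), (1,'1'), (2,'1'), (3,'1'), (5,'0'), (6,'0')]

# Code length from the formula above: C(n) * (log2 C(n) + log2 K)
C, K = len(codes), 2
print(C * (math.log2(C) + math.log2(K)))   # 8 * (3 + 1) = 32 bits
```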