IAT-I Solution of 15EC54 Information Theory and Coding September 2017 by Rahul Nyamangoudar


15EC54 (ITC) IAT 1 - Scheme and Solution

Faculty : Mr. Rahul Nyamangoudar Class : ECE 5th Sem - A & B


Subject : Information Theory and Coding Subject Code : 15EC54

Q 1.
a. Define source entropy and average source information rate
Answer:
Source Entropy:
The entropy of a source indicates the minimum average number of bits required to
represent a symbol. The entropy of a source emitting q possible symbols s_1, s_2, ..., s_q with
probabilities p_1, p_2, ..., p_q in a statistically independent sequence is given by

H(S) = Σ_{i=1}^{q} p_i log2(1/p_i)   bits/symbol

1M for Definition, 1M for Equation = 2M


Average source information rate:
The average source information rate R is defined as the product of the average
information content per symbol, H(S), and the symbol rate r_s. It represents the rate at which
the source generates information, in bits/sec.

R = H(S) * r_s   bits/sec
1M for Definition, 1M for Equation = 2M

Total – 4 Marks

b. A discrete memoryless source emits one of five symbols once every millisecond. The
symbol probabilities are 1/2, 1/4, 1/8, 1/16 and 1/16 respectively. Find the source entropy and
information rate.
Solution
Given: Discrete Memoryless Source
No. of symbols = 5 (let s_1, s_2, s_3, s_4, s_5 be the symbols)
Probabilities: P(s_1) = 1/2, P(s_2) = 1/4, P(s_3) = 1/8, P(s_4) = 1/16, P(s_5) = 1/16

Symbol rate (r_s) = 1 symbol per millisecond = 10^3 symbols/second


Source Entropy:

H(S) = Σ_{i=1}^{5} p_i log2(1/p_i)   bits/symbol

H(S) = (1/2) log2(2) + (1/4) log2(4) + (1/8) log2(8) + 2 * (1/16) log2(16)

H(S) = 1.875 bits/symbol
1M for Entropy formula, 2M for Calculation = 3 Marks
Information rate:

R = H(S) * r_s = 1.875 bits/symbol * 10^3 symbols/second

R = 1875 bits/sec
1M for information rate, 1M for symbol rate, 1M for Calculation = 3M

Total – 6 Marks
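As a quick cross-check of the arithmetic above, the entropy and information rate can be
computed with a short Python sketch (the helper names entropy and information_rate are
illustrative only, not part of the scheme):

```python
from math import log2

def entropy(probs):
    """Entropy H(S) in bits/symbol for a list of symbol probabilities."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

def information_rate(probs, symbol_rate):
    """Average information rate R = H(S) * r_s in bits/second."""
    return entropy(probs) * symbol_rate

probs = [1/2, 1/4, 1/8, 1/16, 1/16]
r_s = 1e3  # one symbol every millisecond

print(entropy(probs))                # 1.875 bits/symbol
print(information_rate(probs, r_s))  # 1875.0 bits/sec
```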

Q 2.
a. How do you measure information? Justify your answer.
Answer:
The amount of information in a message depends only on the uncertainty of the
underlying event rather than its actual content and is measured as:
I(m_k) = log2(1/p_k)
Proof:
Consider an information source that emits one of q possible messages
m_1, m_2, ..., m_q with probabilities p_1, p_2, ..., p_q respectively, such that
p_1 + p_2 + ... + p_q = 1.
An event that is very likely to happen (high probability) carries less
information, while an event that is less likely to occur (low probability) carries more
information. Based on this, the information of an event can be related to its probability
as

I(m_k) ∝ 1/p_k        (2.1)
Information content Ik must approach ′0′ as pk approaches ′1′, that is
I(mk ) → 0 as pk → 1 (2.2)
Information content is non-negative since each message will have some
information and in worst case, it can be equal to zero,
I(mk ) ≥ 0 for 0 ≤ pk ≤ 1 (2.3)


Information content is zero for an event that is certain to occur (p_k = 1), since its
outcome is already known.
For two messages mk and mj ,

I(mk ) < I(mj ) for pk > pj (2.4)


For two independent messages mk and mj , the total information conveyed is sum
of information conveyed by each message individually,
I(mk and mj ) ≜ I(mk mj ) = I(mk ) + I(mj ) (2.5)
A continuous function of p_k that satisfies the constraints specified in equations
(2.1) to (2.5) is the logarithmic function, and thus the measure of information can be defined as

I(m_k) = log2(1/p_k)        (2.6)
1M for How? & 5M for Justification = 6M

Total – 6 Marks
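The defining properties used in the justification above can also be checked numerically; the
following minimal Python sketch (with an illustrative helper self_information) verifies
equations (2.2), (2.4) and (2.5) for the logarithmic measure:

```python
from math import isclose, log2

def self_information(p):
    """Self-information I(m) = log2(1/p) in bits for an event of probability p."""
    return log2(1 / p)

# I -> 0 as p -> 1, and rarer events carry more information (eqs. 2.2 and 2.4)
assert self_information(1.0) == 0.0
assert self_information(1/8) > self_information(1/2)

# Independent events: information adds (eq. 2.5)
p_k, p_j = 1/4, 1/8
assert isclose(self_information(p_k * p_j),
               self_information(p_k) + self_information(p_j))
```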

b. A black and white TV picture consists of 640 lines of picture information. Assume that
each line consists of 480 picture elements (pixels) and that each pixel has 256 brightness
levels. The picture is repeated at the rate of 30 frames/sec. Calculate the average rate of
information conveyed by a TV set to a viewer.
Solution:
Given: No. of lines - 640
No. of pixels per line – 480
No. of brightness levels per pixel – 256
No. of frames per second – 30
Average information per pixel, assuming all brightness levels to be equiprobable, is

H(S) = log2(256) = 8 bits per pixel

No. of pixels per frame (image) is

N_p = 640 * 480 = 307.2 * 10^3 pixels per frame

Thus the average rate of information conveyed by a TV set to the viewer is

R = H(S) * N_p * r_s (in frames per second)

R = 8 bits/pixel * 307.2 * 10^3 pixels/frame * 30 frames/second

R = 73.728 Mbps
1M for H(S), 1M for Np, 2M for calculation = 4M

Total – 4 Marks
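A one-line check of this product in Python (purely illustrative):

```python
from math import log2

bits_per_pixel = log2(256)       # 8 bits, all 256 levels equiprobable
pixels_per_frame = 640 * 480     # 307,200 pixels
frames_per_second = 30

rate_bps = bits_per_pixel * pixels_per_frame * frames_per_second
print(rate_bps / 1e6)            # 73.728 Mbps
```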


Q 3.
For the Markov source shown in Figure Q3, find G_1 and G_2, and verify that G_1 > G_2 > H(S).

Figure Q3: Two-state Markov source. From state 1, symbol A is emitted with probability 3/4
(remaining in state 1) and symbol C with probability 1/4 (moving to state 2); from state 2,
symbol B is emitted with probability 3/4 (remaining in state 2) and symbol C with probability
1/4 (moving to state 1). The state probabilities are P(1) = 1/2 and P(2) = 1/2.

Solution:
Given S = {A, B, C}
States = {1, 2}
P(1) = 1/2, P(2) = 1/2
Firstly, finding the entropy of each state:

H_i = Σ_{j} p_ij log2(1/p_ij)

H_1 = (3/4) log2(4/3) + (1/4) log2(4) = 0.8113 bits/symbol

H_2 = (3/4) log2(4/3) + (1/4) log2(4) = 0.8113 bits/symbol

Thus the entropy of the source is

H(S) = Σ_{i=1}^{2} P(i) * H_i = (1/2) * 0.8113 + (1/2) * 0.8113 = 0.8113 bits/symbol
Then to find 𝐺1 and 𝐺2 , the tree structure is represented as in Figure 3.1. Correspondingly
messages of length 1 & 2 with probabilities are listed in Table 3.1.


[Figure 3.1: Tree diagram representation. Starting from states 1 and 2 (each with probability
1/2), the branches of probability 3/4 and 1/4 generate the messages of length 1 (A, B, C) and
length 2 (AA, AC, CC, CB, CA, BC, BB); their probabilities are listed in Table 3.1.]


Table 3.1: Messages of length 1 & 2 with their probabilities

Messages of length 1:
P(A) = 3/8
P(B) = 3/8
P(C) = 1/8 + 1/8 = 2/8

Messages of length 2:
P(AA) = 9/32
P(AC) = 3/32
P(CC) = 1/32 + 1/32 = 2/32
P(CB) = 3/32
P(CA) = 3/32
P(BC) = 3/32
P(BB) = 9/32

G_N = (1/N) Σ_{S^N} P(m_i) log2(1/P(m_i))

where the sum runs over all messages m_i of length N.

Thus, the entropy per symbol of sequences of length 1, i.e. G_1, is

G_1 = Σ_{i=A}^{C} P(i) log2(1/P(i))

G_1 = 2 * (3/8) log2(8/3) + (2/8) log2(8/2)

G_1 = 1.5613 bits per symbol

and the entropy per symbol of sequences of length 2, i.e. G_2, is

G_2 = (1/2) Σ_{i=AA}^{CC} P(i) log2(1/P(i))

G_2 = (1/2) {2 * (9/32) log2(32/9) + 4 * (3/32) log2(32/3) + (2/32) log2(32/2)}

G_2 = 1.28 bits per symbol
Hence G_1 > G_2 > H(S), since 1.5613 > 1.28 > 0.8113.
1M each for H1, H2, H(S), G1, G2, 3M for tree, 2M for Table = 6M

Total – 6 Marks
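The figures above can be reproduced programmatically. The sketch below (a minimal
illustration, assuming the state diagram described in Figure Q3; the helper names are
arbitrary) enumerates the messages of length 1 and 2 and computes H(S), G_1 and G_2:

```python
from math import log2

# Transitions from each state: (emitted symbol, next state, probability)
transitions = {1: [("A", 1, 3/4), ("C", 2, 1/4)],
               2: [("B", 2, 3/4), ("C", 1, 1/4)]}
state_prob = {1: 1/2, 2: 1/2}

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

# State entropies and the source entropy H(S)
H_state = {s: entropy([p for _, _, p in trans]) for s, trans in transitions.items()}
H_S = sum(state_prob[s] * H_state[s] for s in state_prob)

def message_probs(length):
    """Probability of every message of the given length, starting from either state."""
    probs = {}
    for start in state_prob:
        paths = [("", start, state_prob[start])]
        for _ in range(length):
            paths = [(msg + sym, nxt, p * q)
                     for msg, state, p in paths
                     for sym, nxt, q in transitions[state]]
        for msg, _, p in paths:
            probs[msg] = probs.get(msg, 0.0) + p
    return probs

G1 = entropy(message_probs(1).values())
G2 = entropy(message_probs(2).values()) / 2

print(round(H_S, 4), round(G1, 4), round(G2, 4))  # 0.8113 1.5613 1.28 -> G1 > G2 > H(S)
```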


Q 4.
a. State the properties of entropy.
Let 𝑆 ≡ {𝑠1 , 𝑠2 , 𝑠3 , … , 𝑠𝑞 } be the set of symbols emitted from a zero-memory
source with probabilities {𝑝1 , 𝑝2 , 𝑝3 , … , 𝑝𝑞 } respectively. Let the entropy of zero-memory
source be 𝐻(𝑆), then
(I) 𝐻(𝑆) is a continuous function of {𝑝1 , 𝑝2 , 𝑝3 , … , 𝑝𝑞 }.
(II) Extremal Property:
Lower Bound on Entropy:
Entropy attains its minimum value when one of the probabilities p_i, 1 ≤ i ≤ q,
is equal to 1 and all the other probabilities are zero. Thus
H(S) ≥ 0
Upper Bound on Entropy:
Entropy attains its maximum value when all the individual probabilities are
equal, that is p_1 = p_2 = ... = p_q = 1/q, where q is the number of symbols. Thus
H(S) ≤ log2 q
Thus
0 ≤ 𝐻(𝑆) ≤ log 2 𝑞
(III) Additive Property
Let S̄ ≡ {s_11, s_12, ..., s_1N, s_2, s_3, ..., s_q} be a source obtained from S by
splitting the symbol s_1 into multiple symbols {s_11, s_12, ..., s_1N} with probabilities
{r_11, r_12, ..., r_1N} such that r_11 + r_12 + ... + r_1N = p_1, and let its entropy be
H(S̄). Then
H(S̄) ≥ H(S)
(IV) Entropy function is a symmetrical function of all variables, that is
𝐻(𝑝1 , 𝑝2 , 𝑝3 , … , 𝑝𝑞 ) = 𝐻(𝑝𝜎(1) , 𝑝𝜎(2) , 𝑝𝜎(3) , … , 𝑝𝜎(𝑞) )
Where, 𝜎 denotes a permutation of (1, … , 𝑞).
1M for I, III, IV, 2M for II = 5M

Total – 5 Marks
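A small numerical check of the extremal property (a sketch only, not part of the scheme):
the entropy is zero for a degenerate distribution and maximal, log2 q, for the uniform one.

```python
from math import isclose, log2

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

q = 4
degenerate = [1.0, 0.0, 0.0, 0.0]  # one symbol certain -> lower bound H(S) = 0
uniform = [1.0 / q] * q            # all symbols equiprobable -> upper bound H(S) = log2(q)

assert entropy(degenerate) == 0.0
assert isclose(entropy(uniform), log2(q))            # = 2 bits for q = 4
assert 0.0 <= entropy([0.5, 0.25, 0.15, 0.1]) <= log2(q)
```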


b. The international Morse code uses a sequence of dots and dashes to transmit letters
of the English alphabet. The dash is represented by a current pulse that has a duration
of 3 units and the dot has a duration of 1 unit. The probability of occurrence of a
dash is 1/3 of the probability of occurrence of a dot.
i. Calculate the information content of a dot and a dash.
ii. Calculate the average information in the dot-dash code.
iii. Assume that the dot lasts 2 msec, which is the same time interval as the pause
between symbols. Find the average rate of information transmission.
Solution:
Given S = {dot, dash}
Duration of dash = 3 units and Duration of dot = 1 unit
P(dash) = (1/3) P(dot)        (4.1)

We know that P(dot) + P(dash) = 1        (4.2)

Using equation (4.1) in equation (4.2), we have


P(dot) + (1/3) P(dot) = 1

∴ P(dot) = 3/4 and P(dash) = 1/4

(i) Information content of a dot and a dash:

I(dot) = log2(1/P(dot)) = log2(4/3) = 0.415 bits

I(dash) = log2(1/P(dash)) = log2(4) = 2 bits

(ii) Average information in the dot-dash code:

H(S) = P(dot) log2(1/P(dot)) + P(dash) log2(1/P(dash))

H(S) = (3/4) log2(4/3) + (1/4) log2(4)

H(S) = 0.8113 bits/symbol
(iii) duration(dot) = 1 unit = 2 msec, duration(dash) = 3 units = 6 msec,
duration(pause) = 2 msec.

Considering a message of N symbols:

Number of dots   = (3/4) * N = 0.75 N dots
Number of dashes = (1/4) * N = 0.25 N dashes
Number of pauses = N - 1 ≅ N pauses (for large N)

Average duration per symbol = (0.75 N * 2 * 10^-3 + 0.25 N * 6 * 10^-3 + N * 2 * 10^-3) / N
                            = (1.5 + 1.5 + 2) * 10^-3 = 5 msec/symbol

∴ Symbol rate r_s = 1 symbol / 5 msec = 200 symbols/sec
Hence the average information rate is

R = H(S) * r_s = 0.8113 bits/symbol * 200 symbols/sec = 162.26 bits/sec
1M for (i), 1M for (ii), 3M for (iii) = 5M

Total – 5 Marks
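The arithmetic in parts (i)–(iii) can be verified with a short Python sketch (illustrative
only):

```python
from math import log2

p_dot = 3 / 4                    # from P(dash) = P(dot)/3 and P(dot) + P(dash) = 1
p_dash = 1 / 4

i_dot = log2(1 / p_dot)          # 0.415 bits
i_dash = log2(1 / p_dash)        # 2 bits
H = p_dot * i_dot + p_dash * i_dash   # 0.8113 bits/symbol

# Average duration: dot 2 ms, dash 6 ms, plus a 2 ms pause after every symbol
avg_duration = p_dot * 2e-3 + p_dash * 6e-3 + 2e-3   # 5 ms/symbol
rate = H / avg_duration                              # ~162.26 bits/sec
print(i_dot, i_dash, H, rate)
```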
Q 5.
a. Show that entropy of nth extension of a zero-memory source ‘S’ is H(Sn ) = nH(S),
where H(S) is entropy of zero-memory source.
Proof:
Let the q^n symbols of the n-th extension S^n of source S be {σ_1, σ_2, σ_3, ..., σ_{q^n}},
where each σ_i corresponds to some sequence of n symbols of S. Let P(σ_i) represent the
probability of σ_i, where σ_i corresponds to the sequence
σ_i = s_{i1} s_{i2} ... s_{in}        (5.1)
Since the occurrence of each individual symbol in σ_i is independent of the others,
P(σ_i) = p_{i1} p_{i2} ... p_{in}        (5.2)
with p_{i1} = P(s_{i1}), and so on.


The entropy of the n-th extension of the source can be written as

H(S^n) = Σ_{S^n} P(σ_i) log2(1/P(σ_i))        (5.3)

where

Σ_{S^n} P(σ_i) = Σ_{S^n} p_{i1} p_{i2} ... p_{in}
               = Σ_{i1=1}^{q} Σ_{i2=1}^{q} ... Σ_{in=1}^{q} p_{i1} p_{i2} ... p_{in}
               = Σ_{i1=1}^{q} p_{i1} Σ_{i2=1}^{q} p_{i2} ... Σ_{in=1}^{q} p_{in} = 1

Equation (5.3) can be written as

H(S^n) = Σ_{S^n} P(σ_i) log2(1/(p_{i1} p_{i2} ... p_{in}))

       = Σ_{S^n} P(σ_i) log2(1/p_{i1}) + Σ_{S^n} P(σ_i) log2(1/p_{i2}) + ... + Σ_{S^n} P(σ_i) log2(1/p_{in})        (5.4)

Taking just the first term of the sum of summations in the above equation,

Σ_{S^n} P(σ_i) log2(1/p_{i1}) = Σ_{S^n} p_{i1} p_{i2} ... p_{in} log2(1/p_{i1})
                              = Σ_{i1=1}^{q} p_{i1} log2(1/p_{i1}) Σ_{i2=1}^{q} p_{i2} ... Σ_{in=1}^{q} p_{in}
                              = Σ_{i1=1}^{q} p_{i1} log2(1/p_{i1}) = H(S)

Thus,

Σ_{S^n} P(σ_i) log2(1/p_{i1}) = H(S)        (5.5)

Using equation (5.5) to evaluate other terms in equation (5.4), we obtain


𝐻(𝑆 𝑛 ) = 𝑛𝐻(𝑆)
2M till Equation (5.3), 3M for Equation (5) & (6) = 5M

Total – 5 Marks
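For a zero-memory source, the result H(S^n) = nH(S) is easy to verify numerically; the sketch
below (illustrative, using an arbitrary three-symbol source) builds the n-th extension
explicitly with itertools.product:

```python
from itertools import product
from math import isclose, log2, prod

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

source = [0.5, 0.3, 0.2]  # an arbitrary zero-memory source

for n in (1, 2, 3):
    # Each symbol of S^n is a length-n sequence; its probability is the product
    # of the individual symbol probabilities (statistical independence).
    extension = [prod(seq) for seq in product(source, repeat=n)]
    assert isclose(entropy(extension), n * entropy(source))

print("H(S^n) = n * H(S) verified for n = 1, 2, 3")
```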

b. For a zero-memory source S with source alphabet {A, C, D, I, M, R, T} having
probabilities {1/27, 1/3, 1/9, 1/9, 1/27, 1/27, 1/3}, construct a ternary compact code using
the Huffman encoding procedure. Using the code representations obtained, encode the message
"CMRITADDA".
Solution:
Given No. of symbols (q) = 7
No. of code (representation) symbols (r) = 3 (ternary)
∴ No. of stages is

α = (q - r)/(r - 1) = (7 - 3)/(3 - 1) = 2


(i) Considering the composite symbol being placed as low as possible:

Symbol   Code   Probability   Stage 1     Stage 2
C        0      1/3           1/3         1/3  (0)
T        1      1/3           1/3         1/3  (1)
D        20     1/9           1/9  (0)    1/3* (2)
I        21     1/9           1/9  (1)
A        220    1/27 (0)      1/9* (2)
M        221    1/27 (1)
R        222    1/27 (2)

(* marks the composite symbol formed by combining the three lowest-probability entries of the
previous column; the digit in parentheses is the code digit assigned at that stage.)

For this case, "CMRITADDA" is encoded as

0, 221, 222, 21, 1, 220, 20, 20, 220
(ii) Considering the composite symbol being placed as high as possible:

Symbol   Code   Probability   Stage 1     Stage 2
C        1      1/3           1/3         1/3* (0)
T        2      1/3           1/3         1/3  (1)
D        01     1/9           1/9* (0)    1/3  (2)
I        02     1/9           1/9  (1)
A        000    1/27 (0)      1/9  (2)
M        001    1/27 (1)
R        002    1/27 (2)

For this case, "CMRITADDA" is encoded as

1, 001, 002, 02, 2, 000, 01, 01, 000
4M for obtaining bit representations, 1M for encoding “CMRITADDA”= 5M

Total – 5 Marks
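An r-ary Huffman code can also be built programmatically. The sketch below is illustrative
only (the helper huffman_code is not part of the scheme); ties are broken by insertion order,
so the code words it prints may match either of the two tables above or neither, but the
average code-word length is the same.

```python
import heapq
from itertools import count

def huffman_code(probs, r=2):
    """r-ary Huffman code: {symbol: code word over the digits 0..r-1}. Minimal sketch."""
    items = list(probs.items())
    # Pad with zero-probability dummy symbols so that (q - 1) % (r - 1) == 0
    while (len(items) - 1) % (r - 1) != 0:
        items.append(("_dummy%d" % len(items), 0.0))

    order = count()
    heap = [(p, next(order), [sym]) for sym, p in items]
    heapq.heapify(heap)
    codes = {sym: "" for sym, _ in items}

    while len(heap) > 1:
        group = [heapq.heappop(heap) for _ in range(r)]   # r least-probable entries
        merged = []
        for digit, (_, _, members) in enumerate(group):
            for sym in members:
                codes[sym] = str(digit) + codes[sym]      # prepend this stage's digit
            merged.extend(members)
        heapq.heappush(heap, (sum(p for p, _, _ in group), next(order), merged))

    return {sym: cw for sym, cw in codes.items() if not str(sym).startswith("_dummy")}

probs = {"A": 1/27, "C": 1/3, "D": 1/9, "I": 1/9, "M": 1/27, "R": 1/27, "T": 1/3}
code = huffman_code(probs, r=3)
print(code)                                              # e.g. C->0, T->1, D->20, ..., R->222
print(",".join(code[ch] for ch in "CMRITADDA"))
print(sum(p * len(code[s]) for s, p in probs.items()))   # 13/9 ~ 1.444 ternary digits/symbol
```

For this particular source the average length equals the ternary entropy
H_3(S) = 13/9 ≈ 1.444 ternary digits/symbol, so the compact code is 100 % efficient.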


Q 6.
State and prove Kraft – McMillan Inequality for instantaneous code.
Answer:
Consider an instantaneous code with source alphabet S = {s_1, s_2, ..., s_q} and code
alphabet X = {x_1, x_2, ..., x_r}. The source symbols s_1, s_2, ..., s_q are represented by
code words X_1, X_2, ..., X_q with lengths l_1, l_2, ..., l_q respectively, where each X_i is
a sequence of symbols from the code alphabet X.

Then a necessary and sufficient condition for the existence of an instantaneous code
with word lengths l_1, l_2, ..., l_q is that

Σ_{i=1}^{q} r^{-l_i} ≤ 1        (6.1)

where r is the number of different symbols in the code alphabet X.


Proof:
Consider the quantity

(Σ_{i=1}^{q} r^{-l_i})^n = (r^{-l_1} + r^{-l_2} + ... + r^{-l_q})^n        (6.2)

where n is a positive integer. Equation (6.2) is the n-th power of the left-hand side of
equation (6.1); it corresponds to sequences of n code words.
Expanding equation (6.2) results in q^n terms, each having the form

r^{-l_{i1} - l_{i2} - l_{i3} - ... - l_{in}} = r^{-k}        (6.3)

where
l_{i1} + l_{i2} + l_{i3} + ... + l_{in} = k        (6.4)

For such a code:
Let the smallest possible code-word length be unity, i.e. l_i = 1. Then the minimum value
of k is
k_min = n        (6.5)
Let the largest code-word length be l, i.e. l_i = l. Then the maximum value of k is
k_max = nl        (6.6)
If N_k denotes the number of terms of the form r^{-k}, i.e. the number of sequences of n
code words whose total length is k code symbols, then equation (6.2) can be rewritten as

(Σ_{i=1}^{q} r^{-l_i})^n = Σ_{k=n}^{nl} N_k r^{-k}        (6.7)

If the code is uniquely decodable, then every such sequence of total length k must be
distinct; since there are only r^k distinct r-ary sequences of length k,

N_k ≤ r^k        (6.8)
Using the inequality of equation (6.8) in equation (6.7), we have

(Σ_{i=1}^{q} r^{-l_i})^n ≤ Σ_{k=n}^{nl} r^k r^{-k} = Σ_{k=n}^{nl} 1 = nl - n + 1

Since nl - n + 1 ≤ nl for n ≥ 1,

(Σ_{i=1}^{q} r^{-l_i})^n ≤ nl        (6.9)

Taking the n-th root on both sides of the above inequality gives

Σ_{i=1}^{q} r^{-l_i} ≤ (nl)^{1/n}   for all n

For large n, as n → ∞,

lim_{n→∞} (nl)^{1/n} = 1        (6.10)

Thus

Σ_{i=1}^{q} r^{-l_i} ≤ 1

which is the inequality given in equation (6.1).


2M for Statement, 8M for Derivation = 10M

Total – 10 Marks
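The inequality itself is easy to test for a proposed set of word lengths; the helper below
(an illustrative sketch, not part of the scheme) reports whether an instantaneous r-ary code
with those lengths can exist:

```python
def kraft_mcmillan_ok(lengths, r=2):
    """True if the word lengths can belong to an instantaneous r-ary code."""
    return sum(r ** (-l) for l in lengths) <= 1

print(kraft_mcmillan_ok([1, 2, 2], r=2))               # True  (e.g. 0, 10, 11)
print(kraft_mcmillan_ok([1, 1, 2], r=2))               # False (sum = 1.25 > 1)
print(kraft_mcmillan_ok([1, 1, 2, 2, 3, 3, 3], r=3))   # True  (ternary code of Q5b, sum = 1)
```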


Q 7.
State and prove Shannon’s noiseless coding theorem. What do you infer from it?
Answer:
Let a block code with source symbols s_1, s_2, ..., s_q be represented by code
words X_1, X_2, ..., X_q. Let the probabilities of the source symbols be p_1, p_2, ..., p_q
and the lengths of the code words be l_1, l_2, ..., l_q. Let H_r(S) represent the entropy of
the source measured in r-ary units. Then the average code-word length per source symbol can
be made as close to the entropy H_r(S) as desired by coding the n-th extension of the source
S, i.e. S^n, rather than S itself.
Proof:
The length of a code word can be related to the symbol probability as

l_i ≈ log_r(1/p_i)        (7.1)
Thus:
• If log_r(1/p_i) is an integer, we choose the word length l_i equal to this integer.
• If log_r(1/p_i) is not an integer, a code can still be found by selecting l_i as the first
  integer greater than this value.
Then we may select an integer l_i such that

log_r(1/p_i) ≤ l_i ≤ log_r(1/p_i) + 1        (7.2)

Checking whether these lengths satisfy the Kraft–McMillan inequality: since

log_r(1/p_i) ≤ l_i        (7.3)

we have 1/p_i ≤ r^{l_i}, or

p_i ≥ r^{-l_i}

Summing over all symbols,

Σ_{i=1}^{q} p_i ≥ Σ_{i=1}^{q} r^{-l_i}

Thus

Σ_{i=1}^{q} r^{-l_i} ≤ 1

Hence the lengths satisfy the Kraft–McMillan inequality.


Now, multiplying equation (7.2) by p_i, we have

p_i log_r(1/p_i) ≤ p_i l_i ≤ p_i log_r(1/p_i) + p_i        (7.4)

Equation (7.4) holds for every symbol; summing over all symbols gives

Σ_{i=1}^{q} p_i log_r(1/p_i) ≤ Σ_{i=1}^{q} p_i l_i ≤ Σ_{i=1}^{q} p_i log_r(1/p_i) + Σ_{i=1}^{q} p_i        (7.5)

where

Σ_{i=1}^{q} p_i log_r(1/p_i) = H_r(S)   and   Σ_{i=1}^{q} p_i l_i = L

Using above relations in equation (7.5), we have


𝐻𝑟 (𝑆) ≤ 𝐿 ≤ 𝐻𝑟 (𝑆) + 1 (7.6)
Since equation (7.6) is valid for zero-memory source, we can apply it to 𝑛𝑡ℎ extension
as well, thus
𝐻𝑟 (𝑆 𝑛 ) ≤ 𝐿𝑛 ≤ 𝐻𝑟 (𝑆 𝑛 ) + 1 (7.7)
where, 𝐿𝑛 indicates the average length on 𝑛𝑡ℎ extension of zero-memory source, we
also know that
𝐻𝑟 (𝑆 𝑛 ) = 𝑛𝐻𝑟 (𝑆)
Thus equation (7.7) can be rewritten as
𝑛𝐻𝑟 (𝑆) ≤ 𝐿𝑛 ≤ 𝑛𝐻𝑟 (𝑆) + 1 (7.8)
Dividing equation (7.8) by n, we have

H_r(S) ≤ L_n / n ≤ H_r(S) + 1/n        (7.9)

where L_n / n is the average number of code symbols per source symbol when the n-th
extension is encoded. So it is possible to make L_n / n as close to H_r(S) as we wish by
coding the n-th extension of S rather than S itself. Thus

lim_{n→∞} L_n / n = H_r(S)        (7.10)

Equation (7.9) is known as Shannon’s First Theorem or Shannon’s Noiseless Coding Theorem.


Inference:
• Equation (7.9) tells us that the average number of r-ary code symbols per source symbol
  can be made as small as, but no smaller than, the entropy of the source measured in
  r-ary units.
• Equation (7.10) tells us that the average length can be made as close to the entropy H_r(S)
  as desired by coding the n-th extension of source S rather than S.
• Equation (7.10) also tells us that the price we pay for decreasing L_n / n is the increased
  coding complexity caused by the large number (q^n) of source symbols.
2M for Statement, 6M for Derivation, 2M for Inference = 10M

Total – 10 Marks
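The convergence in equation (7.10) can be illustrated numerically. The sketch below
(illustrative only) assigns the code lengths l_i = ceil(log2(1/p_i)) of equation (7.2) to
every symbol of the n-th extension and shows L_n/n approaching H(S); by (7.9) the ratio
always lies between H(S) and H(S) + 1/n.

```python
from itertools import product
from math import ceil, log2, prod

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

source = [0.1, 0.5, 0.4]   # the binary (r = 2) source of Q8
H = entropy(source)        # 1.361 bits/symbol

for n in (1, 2, 3, 4):
    ext = [prod(seq) for seq in product(source, repeat=n)]
    # Code lengths chosen as in equation (7.2): l_i = ceil(log2(1/p_i))
    L_n = sum(p * ceil(log2(1 / p)) for p in ext)
    print(n, L_n / n)      # approaches H(S) = 1.361 as n grows

print(H)
```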

Q 8.
For a discrete memoryless source with source alphabet S = {A, B, C} and with
probabilities P = {0.1, 0.5, 0.4}:
i. Construct binary–Huffman code. Find its efficiency.
ii. Construct binary–Huffman code for second extension of discrete memoryless
source. Find its efficiency.
iii. Verify Shannon’s noiseless coding theorem using result of 8(i) and 8(ii)
Solution:
Given Source alphabet 𝑆 = {𝐴, 𝐵, 𝐶}
Probability of symbols 𝑃 = {0.1,0.5,0.4}
Thus, the entropy of the source is

H(S) = Σ_{i=1}^{3} p_i log2(1/p_i)

H(S) = 0.1 log2(1/0.1) + 0.5 log2(1/0.5) + 0.4 log2(1/0.4)

∴ H(S) = 0.33219 + 0.5 + 0.52877 = 1.361 bits/symbol

i. The binary Huffman code for source S is obtained in Table 8.1.

Table 8.1: Binary representation for symbols of source S

Symbol (s_i)   Code   Probability (p_i)   Stage 1     Length (l_i)
B              0      0.5                 0.5  (0)    1
C              10     0.4 (0)             0.5* (1)    2
A              11     0.1 (1)                         2

(Here and in Table 8.3, * marks the composite symbol formed at the previous stage; the digit
in parentheses is the code digit assigned at that stage.)


Average length of the above code is

L = Σ_{i=1}^{3} p_i l_i = 0.5*1 + 0.4*2 + 0.1*2 = 1.5 bits/symbol

Hence the efficiency of the coding is

η_c = H(S)/L = 1.361/1.5 = 0.9073

Hence the code for source S = {A, B, C} is 90.73 % efficient.
0.5M each for H(S) & L; 1M each for Table 8.1 & ηc = 3M

ii. The second extension source symbols are obtained in Table 8.2
Table 8.2: Probability of each symbol of second extension source
Second Extension Symbol Probability
AA 0.01
AB 0.05
AC 0.04
BA 0.05
BB 0.25
BC 0.2
CA 0.04
CB 0.2
CC 0.16

Entropy of the second extension source is

H(S^2) = 2 H(S) = 2.722 bits/symbol

For the binary Huffman code of the second extension, the number of stages is

α = (9 - 2)/(2 - 1) = 7 stages

Binary representations of the symbols of the second extension of source S (Table 8.2) are
obtained using the Huffman encoding procedure in Table 8.3.
Average length of the second extension of the source is

L_2 = Σ_{i=1}^{9} P(σ_i) l_i

L_2 = 0.25*2 + 2*0.2*2 + 0.16*3 + 2*0.05*5 + 0.04*5 + 0.04*6 + 0.01*6

L_2 = 2.78 bits/symbol


Table 8.3: Binary representation for symbols of the second extension of source S

σ_i   Code     P(σ_i)    Stage 1    Stage 2    Stage 3    Stage 4     Stage 5    Stage 6     Stage 7     l_i
BB    01       0.25      0.25       0.25       0.25       0.25        0.35       0.4*        0.6* (0)    2
BC    10       0.2       0.2        0.2        0.2        0.2         0.25       0.35 (0)    0.4  (1)    2
CB    11       0.2       0.2        0.2        0.2        0.2         0.2  (0)   0.25 (1)                2
CC    001      0.16      0.16       0.16       0.16       0.19* (0)   0.2  (1)                           3
AB    00000    0.05      0.05       0.09*      0.1* (0)   0.16 (1)                                       5
BA    00001    0.05      0.05       0.05 (0)   0.09 (1)                                                  5
AC    00011    0.04      0.05* (0)  0.05 (1)                                                             5
CA    000100   0.04 (0)  0.04 (1)                                                                        6
AA    000101   0.01 (1)                                                                                  6


Hence the efficiency of the second-extension code is

η_c2 = H(S^2)/L_2 = 2.722/2.78 = 0.97914

Thus the coding of the second extension of the source is 97.914 % efficient.
0.5M each for H(S2) & L2; 1M each for Table 8.2 & ηc2; 3M for Table 8.3 = 6M

iii. According to Shannon’s noiseless coding theorem, the average code length approaches
the source entropy when coding is carried out on the n-th extension of the source rather
than on the source itself.
Here,
for the source S the efficiency is 90.73 %,
while for the second extension S^2 the efficiency is 97.914 %, i.e. closer to 100 %, which
verifies the theorem.

1M for inference = 1M

Total – 10 Marks
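The stated efficiencies can be cross-checked with a heapq-based binary Huffman sketch
(illustrative only; tie-breaking may yield different code words than Tables 8.1 and 8.3, but
the average lengths, and hence the efficiencies, are the same):

```python
import heapq
from itertools import count, product
from math import log2, prod

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

def huffman_avg_length(probs):
    """Average code-word length of a binary Huffman code for the given probabilities."""
    order = count()
    heap = [(p, next(order)) for p in probs]
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, _ = heapq.heappop(heap)
        total += p1 + p2            # each merge adds its weight to the average length
        heapq.heappush(heap, (p1 + p2, next(order)))
    return total

source = [0.1, 0.5, 0.4]
second_ext = [prod(pair) for pair in product(source, repeat=2)]

L1 = huffman_avg_length(source)        # 1.5 bits/symbol
L2 = huffman_avg_length(second_ext)    # 2.78 bits per pair of source symbols

print(entropy(source) / L1)            # ~0.9073
print(2 * entropy(source) / L2)        # ~0.9791
```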
