
Optimal source code

We have shown that any codeword set satisfying the prefix condition must satisfy the Kraft inequality, and that the Kraft inequality is a sufficient condition for the existence of a codeword set with the specified set of codeword lengths.

We now consider the problem of finding the prefix code with the minimum expected length. This is equivalent to finding the set of lengths l_1, l_2, ..., l_m satisfying the Kraft inequality and whose expected length L = Σ p_i l_i is less than the expected length of any other prefix code.

This is a standard optimization problem: minimize L over all integers l_1, l_2, ..., l_m satisfying the Kraft inequality.
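As an illustration, the following is a minimal sketch (the example distribution and the length cap are assumptions, not from the notes) that solves this integer program by brute force for a small binary-code alphabet:

import itertools

def best_lengths(probs, r=2, max_len=6):
    # search all integer length vectors (l_1, ..., l_m) up to max_len and
    # keep the Kraft-feasible one with minimum expected length
    best = None
    for ls in itertools.product(range(1, max_len + 1), repeat=len(probs)):
        if sum(r ** -l for l in ls) <= 1.0:           # Kraft inequality
            L = sum(p * l for p, l in zip(probs, ls))
            if best is None or L < best[0]:
                best = (L, ls)
    return best

print(best_lengths([0.5, 0.3, 0.1, 0.1]))             # (1.7, (1, 2, 3, 3))

Brute force is of course exponential in the alphabet size; Huffman's algorithm, discussed later, finds the same optimum greedily.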
Bounds on Optimal Code Length

We now demonstrate a code that achieves an expected length L within 1 bit of the lower bound; that is, H(X) ≤ L < H(X) + 1.

We wish to minimize L = Σ p_i l_i subject to the constraint that l_1, l_2, ..., l_m are integers and Σ r^(-l_i) ≤ 1.

We proved that the optimal codeword lengths can be found by finding the probability distribution closest to the distribution of X in relative entropy, namely Q_i = r^(-l_i) / Σ_j r^(-l_j); ignoring the integer constraint, this gives l_i = log_r(1/p_i).

Since log_r(1/p_i) may not be an integer, we round it up to give integer word-length assignments,

l_i = ⌈log_r(1/p_i)⌉

These lengths satisfy the Kraft inequality, since

Σ r^(-⌈log_r(1/p_i)⌉) ≤ Σ r^(-log_r(1/p_i)) = Σ p_i = 1

This choice of codeword lengths satisfies

log_r(1/p_i) ≤ l_i < log_r(1/p_i) + 1

Multiplying by p_i and summing over i, we obtain H_r(X) ≤ L < H_r(X) + 1.

Since an optimal code can only be better than this code, we have the following theorem.

Theorem: Let l_1, l_2, ..., l_m be optimal codeword lengths for a source distribution p and an r-ary alphabet, and let L be the associated expected length of an optimal code (L = Σ p_i l_i).

Then H_r(X) ≤ L < H_r(X) + 1.
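The following minimal sketch (example distribution assumed, not from the notes) computes the lengths l_i = ⌈log_r(1/p_i)⌉ for r = 2 and prints the quantities needed to check the Kraft inequality and the bound of the theorem numerically:

import math

def shannon_lengths(probs, r=2):
    # l_i = ceil(log_r(1/p_i))
    return [math.ceil(math.log(1.0 / p, r)) for p in probs]

probs = [0.5, 0.3, 0.1, 0.1]                       # hypothetical source
lengths = shannon_lengths(probs)
kraft = sum(2.0 ** -l for l in lengths)            # should be <= 1
L = sum(p * l for p, l in zip(probs, lengths))     # expected length
H = sum(p * math.log2(1.0 / p) for p in probs)     # H_2(X)
print(lengths, kraft)
print(H, L, H + 1)                                 # H <= L < H + 1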


Shannon's First Theorem

For a single source symbol, H_r(S) ≤ L < H_r(S) + 1.

Applying this to the n-th extension, H_r(S^n) ≤ L_n < H_r(S^n) + 1; dividing by n and using H_r(S^n) = n H_r(S) for a zero-memory source,

H_r(S) ≤ L_n/n < H_r(S) + 1/n

In the limit, lim_{n→∞} L_n/n = H_r(S).

This is the noiseless coding theorem: by coding the n-th extension of S, one can make the average number of r-ary code symbols per source symbol as small as desired, but not smaller than the entropy of the source. L_n/n is this per-symbol average length, and it approximates H_r(S) more closely as n grows.
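A minimal sketch of this effect (the binary source below is an assumed example, not from the notes): code the n-th extension of a memoryless source with the lengths ⌈log_2(1/P)⌉ and watch L_n/n approach the entropy, staying within 1/n of it.

import itertools
import math

probs = [0.8, 0.2]                                  # hypothetical source S
H = sum(p * math.log2(1.0 / p) for p in probs)      # H_2(S)

for n in (1, 2, 4, 8):
    Ln = 0.0
    for block in itertools.product(probs, repeat=n):
        P = math.prod(block)                        # block probability in S^n
        Ln += P * math.ceil(math.log2(1.0 / P))
    print(n, Ln / n, H)                             # L_n/n tends to H(S)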

For a Markov source, the adjoint source obeys the bound on L, i.e., H_r(S̄) ≤ L.

Combining this with the previous results,

H_r(S) ≤ H_r(S̄) ≤ L, and H_r(S^n) ≤ H_r(S̄^n) ≤ L_n

Now select l_i as the unique integer satisfying log_r(1/P_i) ≤ l_i < log_r(1/P_i) + 1, so that (using H_r(S̄^n) = H_r(S̄) + (n − 1) H_r(S) for a first-order Markov source)

H_r(S) + (H_r(S̄) − H_r(S))/n ≤ L_n/n < H_r(S) + (H_r(S̄) − H_r(S) + 1)/n

Here also, L_n/n can be made as close to H_r(S) as desired by taking n large enough.
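As a numerical check, the sketch below (a hypothetical two-state Markov source; not from the notes) codes the n-th extension with Shannon-Fano lengths and compares L_n/n with the bounds just derived:

import itertools
import math

P = [[0.9, 0.1],                          # hypothetical transition matrix,
     [0.4, 0.6]]                          # P[i][j] = Pr(next = j | current = i)
pi = [P[1][0] / (P[0][1] + P[1][0])]      # stationary distribution (2 states)
pi.append(1.0 - pi[0])

H_adj = sum(p * math.log2(1.0 / p) for p in pi)                 # H(S_bar)
H = sum(pi[i] * P[i][j] * math.log2(1.0 / P[i][j])
        for i in range(2) for j in range(2))                    # H(S)

for n in (1, 2, 4, 8):
    Ln = 0.0
    for block in itertools.product(range(2), repeat=n):
        prob = pi[block[0]]
        for a, b in zip(block, block[1:]):
            prob *= P[a][b]                                     # block probability
        Ln += prob * math.ceil(math.log2(1.0 / prob))
    low = H + (H_adj - H) / n
    print(n, low, Ln / n, low + 1.0 / n)         # low <= L_n/n < low + 1/n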
Shannon-Fano coding scheme

The length assignment described above gives rise to the Shannon-Fano code. So the lengths are chosen as log_r(1/P_i) ≤ l_i < log_r(1/P_i) + 1.

The problem with this scheme is that it does


not consider the relative positions of symbols
with respect to one another.

It only considers absolute probabilities. For example, p_A = 1/2^10 and p_B = 1 − 1/2^10 gives l_A = 10 and l_B = 1.

Common sense suggests that l_A and l_B should both be 1.

However, in the long run the Shannon-Fano assignment would not increase the average length by much, as the check below shows. Nevertheless, we should do better.
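A quick numerical check of this example (binary code assumed): with p_A = 1/2^10 the Shannon-Fano rule indeed gives l_A = 10 and l_B = 1, yet the average length is only about 1.009 bits per symbol, barely above the 1.0 that the common-sense choice l_A = l_B = 1 would give.

import math

pA = 1.0 / 2 ** 10
pB = 1.0 - pA
lA = math.ceil(math.log2(1.0 / pA))     # 10
lB = math.ceil(math.log2(1.0 / pB))     # 1
L = pA * lA + pB * lB                   # about 1.009 bits/symbol
H = pA * math.log2(1.0 / pA) + pB * math.log2(1.0 / pB)   # about 0.011
print(lA, lB, L, H)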
Huffman coding scheme

The efficiency of a code is given by η = H_r(S)/L.

The redundancy of a code is given by 1 − η = (L − H_r(S))/L.

An optimal (shortest expected length) prefix


code for a given distribution can be constructed
by a simple algorithm discovered by Huffman.

First, sort the probabilities P1 ≥ P2 ≥ . . . ≥ Pq

Reduce the source to q − 1 symbols and reorder (sort again); repeat until the number of symbols is 2.

The reduced source S_j has a symbol s_α that corresponds to s_α0 and s_α1 in S_{j−1}.

Then P_α = P_α0 + P_α1.

At every level, if the two symbols having small-
est probabilities are collapsed into a compound
symbol and we go on constructing the coding
tree as a heap, we get the optimal code.

The overall average length then increases according to L_{j−1} = L_j + P_α0 + P_α1, since only these two symbols have their lengths increased.

So, the overhead due to expansion of a com-


pound symbol is minimum if the smallest prob-
ability symbols are chosen for the purpose.

Any other choice of sα0 and sα1 would not be


optimal.

This exhibits a greedy choice and therefore at


every stage of heap formation, we should sort
the symbol probabilities and collapse the two
symbols with smallest probabilities as we go up
towards the root of the coding tree.
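The construction above can be written directly with a binary heap. The sketch below (binary code, r = 2; the helper names and example distribution are illustrative, not from the notes) merges the two least probable symbols at every stage and prepends one bit to each codeword in the merged subtree:

import heapq

def huffman_code(probs):
    # probs: dict symbol -> probability; returns dict symbol -> codeword
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)        # smallest probability
        p1, _, code1 = heapq.heappop(heap)        # second smallest
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1                                # unique tie-breaker for the heap
    return heap[0][2]

probs = {"A": 0.5, "B": 0.3, "C": 0.1, "D": 0.1}
code = huffman_code(probs)
L = sum(probs[s] * len(w) for s, w in code.items())
print(code, L)                                    # expected length 1.7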
Sensitivity of Huffman coding scheme

Suppose the probability assignment used for


code compression is different from what occurs
in real life.

p′_i = p_i + e_i, such that

(1/q) Σ_{i=1}^{q} e_i = 0 and var(e_i) = σ² = (1/q) Σ e_i²

Hence L′ = Σ l_i p′_i = L + Σ l_i e_i.

We should examine (1/q) Σ l_i e_i to get a better feel for how the length is affected by this noise.

Using Lagrange multipliers λ and µ, consider

J = (1/q) Σ l_i e_i − λ (1/q) Σ e_i − µ ((1/q) Σ e_i² − σ²)

The stationarity condition is ∂J/∂e_i = (1/q)(l_i − λ − 2µ e_i) = 0.

Summing over i (and using Σ e_i = 0) gives (1/q) Σ_i l_i − λ = 0, i.e., λ = (1/q) Σ_i l_i.

Multiplying ∂J/∂e_i = 0 by e_i and summing gives (1/q) Σ l_i e_i − (2µ/q) Σ e_i² = 0, so µ = Σ e_i l_i / (2qσ²).

Now multiplying ∂J/∂e_i = 0 by l_i, summing over i, and substituting λ and µ:

(1/q) Σ l_i² − λ (1/q) Σ l_i − 2µ (1/q) Σ e_i l_i = 0

Then we can write ((1/q) Σ e_i l_i)² = ((1/q) Σ l_i² − ((1/q) Σ l_i)²) σ², i.e., ((1/q) Σ e_i l_i)² = Var(l_i) · Var(e_i).

This implies that a high variance of codeword lengths makes the average length of the code more prone to variation with noise.
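The conclusion can also be seen empirically. A minimal sketch (the two length assignments and the noise model are assumptions chosen for illustration): perturb the probabilities with zero-mean noise and compare the spread of the resulting average length for a zero-variance (fixed-length) code and a high-variance one.

import random
import statistics

probs = [0.5, 0.3, 0.1, 0.1]
low_var_lengths = [2, 2, 2, 2]        # fixed-length code: Var(l_i) = 0
high_var_lengths = [1, 3, 4, 4]       # spread-out codeword lengths

def spread_of_avg_length(lengths, sigma=0.01, trials=2000):
    samples = []
    for _ in range(trials):
        e = [random.gauss(0.0, sigma) for _ in probs]
        shift = sum(e) / len(e)
        e = [x - shift for x in e]                     # enforce sum(e_i) = 0
        samples.append(sum((p + ei) * l
                           for p, ei, l in zip(probs, e, lengths)))
    return statistics.pstdev(samples)

print(spread_of_avg_length(low_var_lengths))     # essentially zero
print(spread_of_avg_length(high_var_lengths))    # noticeably larger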
Comparing some coding schemes

Symbol   Prob   Code-I   Code-II   Code-III
A        .5     0        00        0111
B        .3     10       01        011
C        .1     110      10        01
D        .1     111      11        0

Expected length of Code-I = 1.7 (uniquely decodable and instantaneous).

For Code-II it is 2.0 (fixed length, easy to decode).

For Code-III it is 3.2 (uniquely decodable but not instantaneous, and not efficient either).
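These figures are easy to reproduce; a minimal sketch:

probs = {"A": 0.5, "B": 0.3, "C": 0.1, "D": 0.1}
codes = {
    "Code-I":   {"A": "0",    "B": "10",  "C": "110", "D": "111"},
    "Code-II":  {"A": "00",   "B": "01",  "C": "10",  "D": "11"},
    "Code-III": {"A": "0111", "B": "011", "C": "01",  "D": "0"},
}
for name, code in codes.items():
    L = sum(probs[s] * len(w) for s, w in code.items())
    print(name, L)                     # 1.7, 2.0, 3.2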
