Source Coding

1. Source coding is the conversion of the output of a discrete memoryless source into a binary sequence with the goal of minimizing the average bit rate.
2. Source coding theory states that the average code length L must be greater than or equal to the entropy H(X) of the source; L can be made arbitrarily close to H(X) with a suitable code.
3. Entropy coding techniques like Shannon-Fano and Huffman coding aim to design variable-length codes with average code lengths close to the source entropy.


SOURCE CODING

1. Definition of Source Coding Terms


(a) Code Length
(b) Code efficiency
(c) Code redundancy
2. Source Coding Theory
3. Classification of Codes
4. Entropy Encoding
(a) Shannon-Fano
(b) Huffman

ECE 416 – Digital Communication


Friday, 23 March 2018
SYLLABUS

DIGITAL COMMUNICATION

[Block diagram] A digital communication system: Signal Source & Transducer (e.g. microphone, TV camera, flow sensor; produces the baseband signal) → Source Encoder (sampling, companding, encrypting) → Channel Encoder (error control) → Modulator → Channel (free space, co-axial cable, water, fibre) → Demodulator → Channel Decoder → Message/Signal Recovery (baseband signal output).
SOURCE CODING DEFINITION & OBJECTIVE

1. Source Coding is the conversion of the output of a Discrete Memoryless Source (DMS) into a binary sequence.
2. The objective of Source Coding is to minimize the average bit rate required to represent the signal, by reducing the redundancy of the information source.
CLASSIFICATION OF INFORMATION SOURCES

Information sources fall into two categories:

a) Memory sources, where the current symbol depends on previous symbols;

b) Memoryless sources, where the current symbol is independent of previous symbols.
CODE LENGTH DEFINITION

• Assume X is a DMS with finite entropy H(X), alphabet {x1, x2, ..., xm} and corresponding probabilities P(xi) for i = 1, 2, ..., m.
• If the binary code word assigned to symbol xi by the source encoder has length ni, then the codeword length ni is the number of binary digits in that code word.
OTHER DEFINITIONS

1. The average codeword length is given by:

   L = Σ P(xi) · ni   (sum over i = 1, ..., m)

2. Code efficiency is defined as:

   η = Lmin / L

   where Lmin is the minimum possible value of the average code length L.

3. The code is said to be efficient when the code efficiency η tends to 1.

4. Code redundancy is defined as:

   γ = 1 − η
SOURCE CODING THEOREM

• The Source Coding Theorem states that for a DMS X with entropy H(X), the average code length L per symbol is bounded by

   L ≥ H(X)

• Further, L can be made arbitrarily close to H(X) by choosing a suitable code.
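These definitions translate directly into a few lines of code. Below is a minimal Python sketch (not from the slides) that computes H(X), the average code length L, the efficiency η = H(X)/L (taking Lmin ≈ H(X), per the theorem) and the redundancy γ = 1 − η; the probabilities and lengths in the demo call are the ones used in the Shannon-Fano worked example later in these notes.

from math import log2

def source_stats(probabilities, code_lengths):
    """Return (H, L, efficiency, redundancy) for a DMS and a given code."""
    H = -sum(p * log2(p) for p in probabilities if p > 0)        # entropy, bits/symbol
    L = sum(p * n for p, n in zip(probabilities, code_lengths))  # average codeword length
    eta = H / L                                                  # efficiency (Lmin taken as H(X))
    return H, L, eta, 1 - eta                                    # last value is the redundancy

# Demo with the probabilities/lengths of the later Shannon-Fano worked example:
print(source_stats([0.5, 0.25, 0.125, 0.125], [1, 2, 3, 3]))
# -> (1.75, 1.75, 1.0, 0.0): a 100% efficient code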
CLASSIFICATION OF CODES

1. Fixed-Length Code: a code in which every codeword has the same length.

2. Variable-Length Code: a code in which the codeword length varies from symbol to symbol.

3. Distinct Code: a code in which each codeword is distinguishable from every other codeword.

4. Prefix-Free Code: a code in which no codeword can be formed by adding code symbols to another codeword.

5. Uniquely Decodable Code: a code in which the original source sequence can be reconstructed perfectly from the encoded binary sequence.

6. Instantaneous Code:
   a) A code in which the end of any codeword is recognizable without examining subsequent code symbols.
   b) Instantaneous codes have the property that no codeword is a prefix of another codeword.

7. Optimal Code: a code that is instantaneous and has the minimum average length Lmin.
WORKED EXAMPLE - 1

1. A Discrete Memoryless Source X has alphabet {x1, x2} and associated probabilities P(x1) = 0.9, P(x2) = 0.1, where the symbols are encoded as:

Find the efficiency and redundancy of the code.
SOLUTION

Entropy:

   H(X) = −0.9·log2(0.9) − 0.1·log2(0.1) ≈ 0.469 bits/symbol

Code efficiency:

   η = H(X) / L

Code redundancy:

   γ = 1 − η
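Since the code table itself did not survive extraction, the sketch below assumes the common single-bit assignment x1 → 0, x2 → 1 (an assumption, not taken from the slide) purely to illustrate the calculation.

from math import log2

p = [0.9, 0.1]
n = [1, 1]                                    # assumed codeword lengths (x1 -> 0, x2 -> 1)
H = -sum(pi * log2(pi) for pi in p)           # ~0.469 bits/symbol
L = sum(pi * ni for pi, ni in zip(p, n))      # 1.0 bit/symbol
print(H / L, 1 - H / L)                       # efficiency ~0.469, redundancy ~0.531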
EXAMPLE 2

2. A Discrete Memoryless Source X has alphabet {x1, x2, x3, x4} and a source code as shown below.

   xi    P(xi)   Code
   x1    0.81    0
   x2    0.09    10
   x3    0.09    110
   x4    0.01    111

Determine the efficiency and redundancy of the code.
SOLUTION

• The average codeword length L is:

   L = 0.81(1) + 0.09(2) + 0.09(3) + 0.01(3) = 1.29 bits/symbol

• The entropy H(X) is given by:

   H(X) = −0.81·log2(0.81) − 0.09·log2(0.09) − 0.09·log2(0.09) − 0.01·log2(0.01) ≈ 0.938 bits/symbol

• Code efficiency is therefore:

   η = H(X) / L = 0.938 / 1.29 ≈ 0.727

• Code redundancy is therefore:

   γ = 1 − η = 1 − 0.727 = 0.273 ≈ 27%
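The same numbers fall out of the source_stats() sketch given earlier (a cross-check, not part of the original solution):

print(source_stats([0.81, 0.09, 0.09, 0.01], [1, 2, 3, 3]))
# -> approximately (0.938, 1.29, 0.727, 0.273)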
ENTROPY CODING

1. Entropy coding refers to the design of a variable-length code whose average codeword length approaches the entropy of the source. There are two main types of entropy coding:

(a) Shannon-Fano Coding
(b) Huffman Coding
SHANNON-FANO CODING

• Named after Claude Shannon and Robert Fano, Shannon-Fano coding is a technique for constructing a prefix code based on a set of symbols and their probabilities (estimated or measured).
• Shannon-Fano coding is suboptimal in the sense that, unlike Huffman coding, it does not always achieve the lowest possible expected codeword length.
• The technique was first proposed in Shannon's "A Mathematical Theory of Communication", his 1948 article introducing the field of information theory.
SHANNON-FANO CODING

The Shannon-Fano code is generated by the following procedure (a code sketch follows below):

1. List the source symbols in order of decreasing probability.
2. Partition the set into two subsets with as nearly equal total probabilities as possible, and assign 0 to the upper subset and 1 to the lower subset.
3. Repeat the process, each time partitioning the subsets with as nearly equal probabilities as possible, until further partitioning is not possible.
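A minimal recursive Python implementation of this procedure, offered as an illustrative sketch rather than anything taken from the slides:

def shannon_fano(symbols):
    """symbols: list of (symbol, probability) pairs. Returns {symbol: codeword}."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        # Choose the split point that makes the two sub-groups' probabilities most nearly equal.
        best_cut, best_diff = 1, float("inf")
        for i in range(1, len(group)):
            upper_sum = sum(p for _, p in group[:i])
            diff = abs(total - 2 * upper_sum)
            if diff < best_diff:
                best_cut, best_diff = i, diff
        upper, lower = group[:best_cut], group[best_cut:]
        for s, _ in upper:
            codes[s] += "0"      # 0 to the upper set
        for s, _ in lower:
            codes[s] += "1"      # 1 to the lower set
        split(upper)
        split(lower)

    split(sorted(symbols, key=lambda sp: sp[1], reverse=True))
    return codes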
SHANNON-FANO CODING

1. Original symbol list and probabilities

   x(i)   P(x(i))
   x1     0.05
   x2     0.30
   x3     0.08
   x4     0.25
   x5     0.20
   x6     0.12

2. Sort in order of decreasing probability

   x(i)   P(x(i))
   x2     0.30
   x4     0.25
   x5     0.20
   x6     0.12
   x3     0.08
   x1     0.05

3. Partition into two sets of approximately equal probability (about 0.5 each); assign 0 to the upper set and 1 to the lower set

   x(i)   P(x(i))   Step 1
   x2     0.30      0
   x4     0.25      0
   x5     0.20      1
   x6     0.12      1
   x3     0.08      1
   x1     0.05      1

4. Partition each set again about its mid-point

   x(i)   P(x(i))   Step 1   Step 2
   x2     0.30      0        0
   x4     0.25      0        1
   x5     0.20      1        0
   x6     0.12      1        1
   x3     0.08      1        1
   x1     0.05      1        1      (remaining)

5. Partition the remaining set again

   x(i)   P(x(i))   Step 1   Step 2   Step 3
   x2     0.30      0        0
   x4     0.25      0        1
   x5     0.20      1        0
   x6     0.12      1        1        0
   x3     0.08      1        1        1
   x1     0.05      1        1        1      (remaining)

6. Partition the remaining set once more and read off the codewords

   x(i)   P(x(i))   Step 1   Step 2   Step 3   Step 4   Code
   x2     0.30      0        0                          00
   x4     0.25      0        1                          01
   x5     0.20      1        0                          10
   x6     0.12      1        1        0                 110
   x3     0.08      1        1        1        0        1110
   x1     0.05      1        1        1        1        1111
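Running the shannon_fano() sketch from above on the same six symbols reproduces this table (again, just a cross-check):

print(shannon_fano([("x1", 0.05), ("x2", 0.30), ("x3", 0.08),
                    ("x4", 0.25), ("x5", 0.20), ("x6", 0.12)]))
# -> {'x2': '00', 'x4': '01', 'x5': '10', 'x6': '110', 'x3': '1110', 'x1': '1111'}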
SHANNON-FANO CODING EXAMPLE 1

A DMS has four symbols x1, x2, x3 and x4 with probabilities P(x1) = 1/2, P(x2) = 1/4, P(x3) = P(x4) = 1/8.

Construct the Shannon-Fano code and determine the code efficiency.
SHANNON-FANO CODE-EXAMPLE 1 - SOLUTION

1. Shannon-Fano code

   x(i)   P(x(i))   Step 1   Step 2   Step 3   Code
   x1     0.500     0                          0
   x2     0.250     1        0                 10
   x3     0.125     1        1        0        110
   x4     0.125     1        1        1        111

2. Information content (and hence codeword length) of each symbol:

   I(x1) = log2(2) = 1 = n1
   I(x2) = log2(4) = 2 = n2
   I(x3) = log2(8) = 3 = n3
   I(x4) = log2(8) = 3 = n4

3. Entropy and average code length:

   H(X) = Σ P(xi)·I(xi) = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/8)(3) = 1.75 bits/symbol

   L = Σ P(xi)·ni = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/8)(3) = 1.75 bits/symbol

4. Efficiency:

   η = H(X) / L = 1, or 100%
HUFFMAN CODE

1. Huffman coding is a lossless data compression algorithm using variable-length codes.
2. The lengths of the assigned codes are based on the frequencies of the corresponding characters:
   a) the most frequent character gets the shortest code;
   b) the least frequent character gets the longest code.
3. The variable-length codes assigned to input characters are prefix codes, i.e. the codes are assigned in such a manner that the code assigned to one character is not a prefix of the code assigned to any other character.
HUFFMAN CODE

1. The Huffman procedure results in a code that is optimal, and therefore a code with the highest efficiency.
2. The Huffman procedure is based on the following observations regarding optimum prefix codes:
   a) symbols that occur more frequently (have a higher probability of occurrence) have shorter codewords;
   b) the two symbols that occur least frequently have codewords of the same length;
   c) the codewords corresponding to the two lowest-probability symbols differ only in the last bit.
STEPS IN HUFFMAN CODING

There are two major steps in Huffman coding:

1. Build a Huffman tree from the input characters.
2. Traverse the Huffman tree and assign codes to the characters.

Codes obtained for the example characters (the tree diagram itself is not reproduced):

   Character   Code
   a           0
   b           111
   c           1011
   d           100
   r           110
   !           1010
ASSIGN CODES

1. Start encoding from the last reduction in the table.
2. Assign 0 to the first digit of the codewords for all symbols associated with the first probability, and assign 1 to the second probability.
3. Assign 0 and 1 to the next digit of the two probabilities that were combined in the previous reduction step, while retaining all assignments made in the previous stage.
4. Repeat the process until the first column is reached.
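The slides build the code by tabular source reduction; the Python sketch below (an illustration, not the slides' own method) builds the same kind of optimal prefix code with a binary heap, merging the two least-frequent subtrees at each step.

import heapq
from itertools import count

def huffman(freqs):
    """freqs: {symbol: frequency or probability}. Returns {symbol: codeword}."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    tie = count()                              # tie-breaker so heap entries always compare
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)        # the two least-frequent subtrees...
        f2, _, c2 = heapq.heappop(heap)
        for s in c1:
            c1[s] = "0" + c1[s]                # ...get 0 / 1 prepended to their codewords
        for s in c2:
            c2[s] = "1" + c2[s]
        heapq.heappush(heap, (f1 + f2, next(tie), {**c1, **c2}))
    return heap[0][2]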
USING THE HUFFMAN CODE IN PRACTICE
• Assume that you have a character file that you would like to compress. By parsing through the file, a computer establishes that there are 100,000 characters with the frequencies of occurrence shown below.

   Character   Frequency
   A           45,000
   B           13,000
   C           12,000
   D           16,000
   E           9,000
   F           5,000
   Total       100,000

• Determine a code that encodes the file using as few bits as possible.

SOLUTION 1: USING A HAMMER

A fixed-length code would require 3 bits per character, since we need v bits with 2^v ≥ 6, i.e. v = 3.
Using this code we would store 3 × 100,000 = 300 kbits.
If, on the other hand, we used a byte to store each character, we would require a file of size 8 × 100,000 = 800 kbits.
HUFFMAN CODE FOR THE EXAMPLE

1. Average code length: L = 2.24 bits per character.
2. Bits required = L × 100,000 = 224,000 bits.
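As a cross-check (illustrative only; the slide's tree diagram is not reproduced here), feeding these frequencies to the huffman() sketch above and totalling the resulting codeword lengths reproduces the 2.24 figure:

freq = {"A": 45_000, "B": 13_000, "C": 12_000, "D": 16_000, "E": 9_000, "F": 5_000}
codes = huffman(freq)                                  # heap-based sketch from the previous section
bits = sum(freq[s] * len(codes[s]) for s in freq)      # total encoded size
print(bits, bits / sum(freq.values()))                 # -> 224000 bits, 2.24 bits per character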
HOMEWORK

• Determine the Huffman code for the following characters and their corresponding probabilities (the source-reduction steps are worked on the next slide).

   Character   Probability
   A           0.05
   B           0.15
   C           0.2
   D           0.05
   E           0.15
   F           0.3
   G           0.1
FIRST, CREATE THE TREE

Sort the symbols by decreasing probability and repeatedly combine the two smallest probabilities (successive source reduction):

   Character   Probability   Reduction 1   Reduction 2   Reduction 3   Reduction 4   Reduction 5
   F           0.3           0.3           0.3           0.3           0.4           0.6
   C           0.2           0.2           0.2           0.3           0.3
   B           0.15          0.15          0.2           0.2           0.3
   E           0.15          0.15          0.15          0.2
   G           0.1           0.1           0.15
   A           0.05          0.1
   D           0.05
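To cross-check the homework answer (for example against the online calculator mentioned on the next slide), the earlier huffman() sketch can be applied to these probabilities; individual codewords may differ between valid answers because of ties, but the average length will not.

from math import log2

probs = {"A": 0.05, "B": 0.15, "C": 0.2, "D": 0.05, "E": 0.15, "F": 0.3, "G": 0.1}
codes = huffman(probs)                                   # heap-based sketch from earlier
L = sum(p * len(codes[s]) for s, p in probs.items())     # average codeword length
H = -sum(p * log2(p) for p in probs.values())            # source entropy
print(codes, L, H, H / L)                                # code, L, H(X), efficiency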
USE ONLINE CALCULATOR TO CROSS-CHECK
