Lecture 7 Source Coding 2024
University of Science
Faculty of Electronics & Telecommunications
Chapter 7: Source Coding
Dang Le Khoa
Email: [email protected]
Outline
⚫ Basic concepts
⚫ Information
⚫ Entropy
⚫ Source Encoding
Basic concepts
⚫ Block diagram of digital communication system
[Figure: block diagram of a digital communication system, including a noisy channel]
What is Information Theory?
Information measure
Information
Information
Entropy and Information rate
⚫ Consider an information source emitting a sequence of symbols from
the set X = {x1, x2, ..., xM}
⚫ Each symbol xi is treated as a message with probability p(xi) and self-information I(xi)
⚫ This source has an average rate of r symbols/sec
⚫ Discrete memoryless source
⚫ The amount of information produced by the source during an arbitrary
symbol interval is a discrete random variable X.
⚫ The average information per symbol is then given by:
H(X) = E\{I(x_j)\} = -\sum_{j=1}^{M} p(x_j)\,\log_2 p(x_j) \quad \text{bits/symbol} \qquad (2)
Entropy
0 \le H(X) \le \log_2 M \qquad (3)
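As a quick check of equations (2) and (3), a minimal Python sketch (the four-symbol distribution below is only an illustration, not a source from the lecture):

import math

def entropy(probs):
    # H(X) = -sum p(x) log2 p(x), per equation (2); terms with p = 0 contribute nothing
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = [0.5, 0.25, 0.125, 0.125]        # hypothetical 4-symbol source
H = entropy(p)                        # 1.75 bits/symbol
assert 0 <= H <= math.log2(len(p))    # bound (3): 0 <= H(X) <= log2 M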
Example
Source coding theorem
[Figure: a discrete binary source (symbol rate r = 3.5 symbols/s) feeds a source encoder whose output (S = 2 symbols/s) is sent over a binary channel with C = 1 bit/symbol, so SC = 2 bits/s.]
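Reading the rates off this diagram, the condition for reliable encoding is the usual statement of the source coding theorem; the specific bound below is derived from the numbers shown, not stated on the slide:

r\,H(X) \le SC
\quad\Rightarrow\quad
H(X) \le \frac{SC}{r} = \frac{2\ \text{bits/s}}{3.5\ \text{symbols/s}} \approx 0.57\ \text{bit/symbol}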
Example of Source encoding
Example of Source encoding (cont’d)
First-Order extension
Second-Order extension
Grouping 2 source symbols at a time
The codewords are based on Shannon-Fano coding
(explained a few slides later)
Second-Order extension
\frac{L}{n} = \frac{1.29}{2} = 0.645 \ \text{code symbols/source symbol}
Third-Order extension
Grouping 3 source symbols at a time:
\frac{L}{n} = \frac{1.598}{3} = 0.533 \ \text{code symbols/source symbol}
Efficiency of a source code
L = \sum_{i} p(x_i)\, l_i \qquad \text{(average codeword length)}

\text{eff} = \frac{n\,H(X)}{L}
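A small sketch of the efficiency formula above (the function name is mine). For a memoryless source the n-th extension has entropy n·H(X), so computing the entropy directly over the coded alphabet gives the same value as n·H(X):

import math

def code_efficiency(probs, lengths):
    # eff = H / L computed over the coded alphabet; for the n-th extension of a
    # memoryless source this equals n*H(X)/L, matching the formula above.
    H = -sum(p * math.log2(p) for p in probs if p > 0)
    L = sum(p * l for p, l in zip(probs, lengths))
    return H / L

# Shannon-Fano example from the following slides (lengths read off the codeword table)
p = [0.34, 0.23, 0.19, 0.10, 0.07, 0.06, 0.01]
l = [2, 2, 2, 3, 4, 5, 5]
print(round(code_efficiency(p, l), 2))   # ~0.97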
Behavior of L/n
Shannon-Fano Coding
⚫ Procedure: 3 steps
1. List the source symbols in order of decreasing probability.
2. Partition the set into two subsets that are as close to equiprobable
as possible; assign 0’s to the upper subset and 1’s to the lower subset.
3. Continue partitioning the subsets until no further partitioning is
possible.
⚫ Example:
Example of Shannon-Fano Coding
Ui    pi     1  2  3  4  5    Codeword
U1    0.34   0  0             00
U2    0.23   0  1             01
U3    0.19   1  0             10
U4    0.10   1  1  0          110
U5    0.07   1  1  1  0       1110
U6    0.06   1  1  1  1  0    11110
U7    0.01   1  1  1  1  1    11111
Shannon-Fano coding
L = \sum_{i=1}^{7} p_i\, l_i = 2.45

H(U) = -\sum_{i=1}^{7} p_i \log_2 p_i \approx 2.38 \ \text{bits/symbol}

\text{eff} = \frac{H(U)}{L} = \frac{2.38}{2.45} \approx 0.97
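A minimal Python sketch of the three-step procedure above (function names are mine); with the probabilities from the table it reproduces the codewords shown:

def shannon_fano(symbols):
    # symbols: list of (name, probability), sorted by decreasing probability (step 1)
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        acc, k, best = 0.0, 1, float("inf")
        for i in range(1, len(group)):          # step 2: near-equiprobable split
            acc += group[i - 1][1]
            if abs(2 * acc - total) < best:
                best, k = abs(2 * acc - total), i
        for name, _ in group[:k]:
            codes[name] += "0"                  # 0's to the upper subset
        for name, _ in group[k:]:
            codes[name] += "1"                  # 1's to the lower subset
        split(group[:k])                        # step 3: repeat on each subset
        split(group[k:])

    split(symbols)
    return codes

table = [("U1", .34), ("U2", .23), ("U3", .19), ("U4", .10),
         ("U5", .07), ("U6", .06), ("U7", .01)]
print(shannon_fano(table))
# {'U1': '00', 'U2': '01', 'U3': '10', 'U4': '110',
#  'U5': '1110', 'U6': '11110', 'U7': '11111'}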
Huffman Coding [1][2][3]
⚫ Procedure: 3 steps
1. List the source symbols in order of decreasing probability. Assign
a 0 and a 1 to the two source symbols of lowest probability.
2. Combine these two source symbols into a new symbol whose
probability is the sum of the two original probabilities, and place
the new probability in the list according to its value.
3. Repeat until the final combined symbol has probability 1.0.
⚫ Example:
Examples of Huffman Coding
[Figure: Huffman tree; successive combinations produce the probabilities .07, .14, .24, .42, .58, and 1.0]

Ui    pi     Codeword
U1    0.34   00
U2    0.23   10
U3    0.19   11
U4    0.10   011
U5    0.07   0100
U6    0.06   01010
U7    0.01   01011
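A minimal Python sketch of the three-step Huffman procedure (names are mine). Tie-breaking and the 0/1 assignment are conventions, so the exact codewords may differ from the figure, but the codeword lengths match:

import heapq, itertools

def huffman_code(symbols):
    # symbols: list of (name, probability). Repeatedly combine the two least
    # probable entries, prefixing '0'/'1', until one symbol of probability 1.0 remains.
    tick = itertools.count()                     # tie-breaker for equal probabilities
    heap = [(p, next(tick), {name: ""}) for name, p in symbols]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)          # lowest probability  -> leading '0'
        p1, _, c1 = heapq.heappop(heap)          # next lowest         -> leading '1'
        merged = {**{s: "0" + w for s, w in c0.items()},
                  **{s: "1" + w for s, w in c1.items()}}
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

table = [("U1", .34), ("U2", .23), ("U3", .19), ("U4", .10),
         ("U5", .07), ("U6", .06), ("U7", .01)]
print(huffman_code(table))   # codeword lengths 2, 2, 2, 3, 4, 5, 5 as in the table above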
Huffman Coding: disadvantages
⚫ When the source has many symbols (outputs/messages), the code
becomes bulky → combine the Huffman code with a fixed-length code.
⚫ Some redundancy remains, and it is large for a small set of
messages → group multiple independent messages.
⚫ Grouping makes the redundancy small, but the number of codewords
grows exponentially, the code becomes more complex, and delay is
introduced.