
电子科技大学格拉斯哥学院 Glasgow College at UESTC

Information Theory

Lossless Source Coding


WU, Gang (武刚), Ph. D., Professor

National Key Laboratory of Science and Technology on Communications
(通信抗干扰技术国家级重点实验室)
[email protected];
[email protected]
Office: B3-607, Main building
Bitmap versus Vector Graph

2
Bitmap versus Vector Graph

3
Source Encoder and Decoder

• Source Encoder
– In digital communication we convert the signal from the source into a digital signal.
– The point to remember is that we would like to use as few binary digits as possible to represent the signal. Such an efficient representation of the source output contains little or no redundancy. This sequence of binary digits is called the information sequence.
– Source Encoding or Data Compression: the process of efficiently converting the output of either an analog or a digital source into a sequence of binary digits is known as source encoding.
• Source Decoder
– At the receiving end, if an analog signal is desired, the source decoder reconstructs the sequence from its knowledge of the encoding algorithm, which results in an approximate replica of the input at the transmitter end.

4
Typical Image Coding (Lossy)
• JPEG/JPEG2000
– Recommendation T.81, International Telecommunication Union (ITU) Std., Sept. 1992, Joint Photographic Experts Group (JPEG). [Online]. Available: http://www.w3.org/Graphics/JPEG/itu-t81.pdf
– A. Skodras, C. Christopoulos, and T. Ebrahimi, "The JPEG 2000 still image compression standard," IEEE Signal Processing Magazine, vol. 18, no. 5, pp. 36–58, 2001. https://www.ece.uvic.ca/~frodo/software.html
• WebP
– Google Developers, "A new image format for the web," https://developers.google.com/speed/webp/, 2010.
– Google Inc., "VP8 data format and decoding guide," https://datatracker.ietf.org/doc/html/rfc6386, 2011.
• FLIF
– J. Sneyers and P. Wuille, "FLIF: Free Lossless Image Format based on MANIAC compression," in Proc. IEEE International Conference on Image Processing (ICIP), 2016, pp. 66–70.
• PixelCNN
– A. van den Oord, N. Kalchbrenner, and K. Kavukcuoglu, "Pixel recurrent neural networks," in Proc. International Conference on Machine Learning (ICML), PMLR, 2016, pp. 1747–1756.
– T. Salimans, A. Karpathy, X. Chen, and D. P. Kingma, "PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications," arXiv preprint arXiv:1701.05517, 2017.

5
Data Compression in Video Chatting

❑ In video chatting, we do not need to transmit every detail of the bodies and the background in a frame; we only need to transmit the image of the head.
❑ Furthermore, we only need to transmit the offset (motion) vector rather than the details of all the pixels.
6
Video Coding Standardization
• ITU-T (International Telecommunication Union – Telecommunication Standardization Sector)
– VCEG (Video Coding Experts Group)
• H.26x
• ISO/IEC (International Organization for Standardization, International Electrotechnical Commission)
– MPEG (Moving Picture Experts Group)
• MPEG-x
• In Jan. 2013, ISO/IEC and ITU-T jointly issued the new HEVC standard (High Efficiency Video Coding)
– H.265 (MPEG-H Part 2)
– 3GPP Release 12 (LTE-Advanced) includes support for HEVC.

7
The updated specifications bringing support
for HEVC into 3GPP (2014)
Specification #   Title of 3GPP Specification
TS 26.114   IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction
TS 26.140   Multimedia Messaging Service (MMS); Media formats and codecs
TS 26.141   IP Multimedia System (IMS) Messaging and Presence; Media formats and codecs
TS 26.234   Transparent end-to-end Packet-switched Streaming Service (PSS); Protocols and codecs
TS 26.244   Transparent end-to-end packet switched streaming service (PSS); 3GPP file format (3GP)
TS 26.247   Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dynamic Adaptive Streaming over HTTP (3GP-DASH)
TS 26.346   Multimedia Broadcast/Multicast Service (MBMS); Protocols and codecs
TR 26.906*  Evaluation of High Efficiency Video Coding (HEVC) for 3GPP services

*https://www.3gpp.org/ftp/Specs/html-info/26906.htm

8
Some Abbreviations
• DASH:
– Dynamic Adaptive Streaming over HTTP
• HEVC:
– High Efficiency Video Coding
• MBMS:
– Multimedia Broadcast/Multicast Service
• MTSI:
– Multimedia Telephony Services over IMS
• PSS:
– Packet-switched Streaming Service

9
Video Streaming Codec in 5G
• 3GPP started developing functionalities for
multimedia services and applications as part
of the Rel-16 and Rel-17 specifications.
• These include enablers for 5G Media
Streaming and extensions such as
– edge processing, analytics and event exposure;
– improvements to LTE-based 5G Broadcast and
hybrid services;
– 5G Multicast Broadcast Services (MBS);
– eXtended Reality (XR) and Augmented Reality (AR)
experiences.

10
Main Content

1 Fundamentals of Source Coding

2 Lossless Source Coding Theorems

3 Classic Source Coding Schemes

11
Communication System

[Block diagram: Information Source → Source Encoder → Encryptor → Channel Encoder → Modulator → Channel → Demodulator → Channel Decoder → Decryptor → Source Decoder → Information Destination; the upper chain forms the Transmitter and the lower chain the Receiver.]

12
Fundamental Concepts

❑ Source coding: reduce the redundancy so that more information is transmitted at a lower rate, improving the transmission efficiency.
❑ Channel coding: increase the redundancy (at the cost of a higher transmitted bit rate or a wider bandwidth) to improve the transmission reliability.
❑ Cryptography: encryption and decryption aim at increasing and reducing the entropy, respectively, to improve the transmission security.
❑ Source coding theorems: the lossless source coding theorem (discrete info source) and the lossy source coding theorem (continuous info source).
❑ Lossless source coding: every symbol generated by the info source is mapped onto a specific output codeword of the encoder.
❑ Lossy source coding: as long as the communication requirement is satisfied, the source coding allows some distortion (rate-distortion theory).

13
Definition of Source Coding
❑ Source Coding: representing the original messages of the info source
by the code sequences, which are more suitable to be transmitted on a
specific medium. The source coding is completed by the encoder.
❑ The random message 𝑿 generated by the info source consists of 𝐿 discrete symbols and is expressed as
𝑿 = 𝑋1 𝑋2 ⋯ 𝑋𝑙 ⋯ 𝑋𝐿 , where 𝑋𝑙 ∈ {𝑎1 , 𝑎2 , ⋯ , 𝑎𝑖 , ⋯ , 𝑎𝑛 }.
Source coding transforms the original random symbol sequence 𝑿 into the following code sequence 𝒀:
𝒀 = 𝑌1 𝑌2 ⋯ 𝑌𝑘 ⋯ 𝑌𝐾 , where 𝑌𝑘 ∈ {𝑏1 , 𝑏2 , ⋯ , 𝑏𝑗 , ⋯ , 𝑏𝑚 }.

❑ Lossless coding: every message of the info source is encoded into a specific code sequence (codeword), and every codeword can be decoded into only one specific message.
14
Source Coding:Example of Weather

• Implementation: Encoder (mapping)
• Mathematical Model
[Diagram: Source {s1 (sunny), s2 (cloudy), s3 (rainy), s4 (snowy)} → Source Encoder → Channel. After encoding, S = {00, 01, 10, 11}; the binary code-symbol set is {0, 1}; the codebook is the set of codewords. Is this code optimum in terms of efficiency and unique decodability?]
• Essence:
– A transformation of the original symbols of the source according to certain mathematical rules
– Single source symbol → code symbol
– Source sequence → code sequence

15
Fixed Length Coding Theorem
❑ The entropy rate of a stationary memoryless info source 𝑿 is 𝐻(𝑋).
Fixed-length coding: 𝑿 = (𝑋1 𝑋2 ⋯ 𝑋𝑙 ⋯ 𝑋𝐿 ), 𝑋𝑙 ∈ {𝑎1 , 𝑎2 , ⋯ , 𝑎𝑖 , ⋯ , 𝑎𝑛 } → 𝒀 = (𝑌1 𝑌2 ⋯ 𝑌𝑘 ⋯ 𝑌𝐾 ), 𝑌𝑘 ∈ {𝑏1 , 𝑏2 , ⋯ , 𝑏𝑗 , ⋯ , 𝑏𝑚 }
❑ Fixed-length coding theorem: for arbitrary 𝜀 > 0 and 𝛿 > 0, if the information rate (bit/symbol) satisfies
𝑅 = (𝐾/𝐿) log2 𝑚 ≥ 𝐻(𝑋) + 𝜀   (Positive Theorem)
then, when 𝐿 is large enough, the decoding error probability can be made lower than 𝛿. Otherwise, if
𝑅 = (𝐾/𝐿) log2 𝑚 ≤ 𝐻(𝑋) − 2𝜀   (Negative Theorem)
then, when 𝐿 is large, the decoding error cannot be avoided.
16
See p. 43 in the Textbook

17
Coding Efficiency and Error Rate
❑ We have a stationary memoryless info source 𝑿, which sequentially sends symbols obeying the following probability distribution:
𝑋:    𝑎1   𝑎2   𝑎3   𝑎4
𝑝(𝑋): 1/2  1/4  1/8  1/8
Coding efficiency: 𝜂 = 𝐻(𝑋) / ((𝐾/𝐿) log2 𝑚).  What is the decoding error rate?
Please calculate the coding efficiency and the decoding error rate for the following cases (a worked sketch follows below):
(1) Encoding the one-symbol messages sent by 𝑿 with a binary code having fixed length 2.
(2) Encoding the two-symbol messages sent by 𝑿 with a binary code having fixed length 3.
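The following is a minimal brute-force sketch of this exercise (Python 3.8+; the helper name fixed_length_case is illustrative, not from the slides). It computes H(X), the rate R = (K/L)·log2 m, the efficiency H(X)/R, and the decoding error rate left after the 2^K available codewords are assigned to the most probable L-symbol messages.

```python
from itertools import product
from math import log2, prod

p = {"a1": 1/2, "a2": 1/4, "a3": 1/8, "a4": 1/8}   # symbol probabilities
H = -sum(q * log2(q) for q in p.values())          # entropy H(X) in bit/symbol

def fixed_length_case(L, K, m=2):
    """L source symbols per message, K m-ary code symbols per codeword."""
    R = (K / L) * log2(m)                          # information rate, bit/symbol
    efficiency = H / R
    # Only m**K codewords exist: assign them to the most probable L-symbol
    # messages; every other message causes a decoding error.
    probs = sorted((prod(p[s] for s in seq) for seq in product(p, repeat=L)),
                   reverse=True)
    p_error = max(0.0, 1.0 - sum(probs[:m ** K]))
    return efficiency, p_error

print("H(X) = %.3f bit/symbol" % H)
print("case (1): efficiency = %.3f, Pe = %.4f" % fixed_length_case(L=1, K=2))
print("case (2): efficiency = %.3f, Pe = %.4f" % fixed_length_case(L=2, K=3))
```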

18
Ideal Encoder
❑ Fixed-length coding theorem (positive): when the information rate is slightly higher than the entropy, we may realise error-free decoding. The length 𝐿 of the message generated by the info source has to satisfy
𝐿 ≥ Var[𝐼(𝑎𝑖 )] / (𝜀² 𝛿).
The decoding error rate is then lower than an arbitrary positive number 𝛿.
❑ Fixed-length coding theorem:
❑ when the information rate 𝑅 is higher than the single-symbol entropy 𝐻(𝑋) by 𝜀, the decoding error rate does not exceed 𝛿;
❑ if 𝑅 is lower than 𝐻(𝑋) by 2𝜀, the decoding error rate must be higher than 𝛿.
❑ When we encode an original message of infinite length (𝐿 → ∞), an ideal encoder whose coding efficiency tends to unity exists, namely
lim_{𝐿→∞} 𝐻(𝑋) / ((𝐾/𝐿) log2 𝑚) = 1,
which is impossible to realise in practice.
19
Discussion on the Ideal Encoder
❑ We have the following single-symbol info source:
𝑋:    𝑎1   𝑎2    𝑎3   𝑎4   𝑎5    𝑎6    𝑎7    𝑎8
𝑝(𝑋): 0.4  0.18  0.1  0.1  0.07  0.06  0.05  0.04
Please encode this info source with a fixed-length code. The required coding efficiency is 90% and the decoding error rate is 10⁻⁶.
➢ Calculate the length of the symbol sequence that needs to be encoded together (a sketch of this calculation follows below).
➢ If we use a binary fixed-length code to encode every single symbol generated by the info source, find the coding efficiency.
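A minimal sketch (Python) of the first calculation, assuming the rate is set to R = H(X) + ε so that the required efficiency η = H(X)/(H(X) + ε) fixes ε; the bound L ≥ Var[I(a_i)]/(ε²δ) from the previous slide then gives the block length.

```python
from math import log2

p = [0.4, 0.18, 0.1, 0.1, 0.07, 0.06, 0.05, 0.04]      # p(a1)..p(a8)
I = [-log2(q) for q in p]                              # self-information of each symbol
H = sum(q * i for q, i in zip(p, I))                   # entropy H(X)
var_I = sum(q * (i - H) ** 2 for q, i in zip(p, I))    # Var[I(a_i)]

eta, delta = 0.90, 1e-6
eps = H * (1 - eta) / eta             # from eta = H / (H + eps)
L_min = var_I / (eps ** 2 * delta)    # fixed-length coding theorem bound on L

print("H(X)   = %.4f bit/symbol" % H)
print("Var[I] = %.4f" % var_I)
print("L must be at least %.3e symbols" % L_min)

# For comparison, encoding single symbols with a binary fixed-length code
# needs 3 bits (8 symbols), giving efficiency H(X)/3.
print("single-symbol efficiency = %.3f" % (H / 3))
```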

20
How an ideal encoder operates?
❑ An ideal encoder operates by obeying the following steps:
(1) Given 𝜀 and 𝛿, calculate the length 𝐿 of the symbol sequence generated by the info source that needs to be encoded together, using 𝐿 ≥ Var[𝐼(𝑎𝑖 )] / (𝜀² 𝛿).

(2) Given 𝐿, calculate the probabilities of all the possible symbol sequences of length 𝐿.

(3) Sort all the possible symbol sequences of length 𝐿 in descending order of their probabilities.

(4) Encode the sorted symbol sequences one by one; the encoded sequences constitute the set 𝐴𝜀 . The encoding process continues until 𝑝(𝐴𝜀 ) ≥ 1 − 𝛿. (A sketch of steps (2)–(4) follows below.)
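A minimal sketch (Python 3.8+; the function name build_A_eps is illustrative) of steps (2)–(4): enumerate all L-symbol sequences, sort them by probability, and collect them into A_ε until their total probability reaches 1 − δ; the size of A_ε then dictates the fixed codeword length K = ⌈log2 |A_ε|⌉.

```python
from itertools import product
from math import ceil, log2, prod

def build_A_eps(p, L, delta):
    """Collect the most probable L-symbol sequences until their total probability >= 1 - delta."""
    seqs = sorted(product(range(len(p)), repeat=L),
                  key=lambda seq: prod(p[s] for s in seq), reverse=True)
    A_eps, total = [], 0.0
    for seq in seqs:
        A_eps.append(seq)
        total += prod(p[s] for s in seq)
        if total >= 1 - delta:
            break
    return A_eps, total

# toy usage: the four-symbol source used earlier, blocks of L = 4, delta = 0.05
p = [1/2, 1/4, 1/8, 1/8]
A_eps, covered = build_A_eps(p, L=4, delta=0.05)
K = ceil(log2(len(A_eps)))     # fixed codeword length needed for the encoded set
print(len(A_eps), "sequences kept, covering %.3f of the probability" % covered)
print("K =", K, "bits for L = 4 symbols ->", K / 4, "bit/symbol")
```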

21
What is a code?
• Definition:
– A code is a mapping from a discrete set of symbols {0, · · · , M − 1} to finite binary sequences
• For each symbol m there is a corresponding finite binary sequence σm
• |σm| is the length of the binary sequence

• Expected number of bits per symbol (bit rate): Σm p(m)·|σm|

– Example for M = 4
• Message from {0, 1, 2, 3}
• Encoded bit stream: (0, 2, 1, 3, 2) → (01|0|10|100100|0)
– Fixed-Length Code: |σm| is constant for all m
– Variable-Length Code: |σm| varies with m
22
Review on Source Coding

• Implementation: Encoder (mapping)
• Mathematical Model
[Diagram: Source {s1 (sunny), s2 (cloudy), s3 (rainy), s4 (snowy)} → Source Encoder → Channel. After encoding, S = {00, 01, 10, 11}; the binary code-symbol set is {0, 1}; the codebook is the set of codewords. Is this code optimum in terms of efficiency and unique decodability?]
• Essence:
– A transformation of the original symbols of the source according to certain mathematical rules
– Single source symbol → code symbol
– Source sequence → code sequence

23
Unique Decodability
• Definition:
– For every string of source letters {𝑥1 , 𝑥2 , … , 𝑥𝑛 }, the encoded output C(𝑥1 )C(𝑥2 ) ⋯ C(𝑥𝑛 ) must be distinct, i.e., it must differ from C(𝑥1′ )C(𝑥2′ ) ⋯ C(𝑥𝑚′ ) for any other source string {𝑥1′ , 𝑥2′ , … , 𝑥𝑚′ }.

• Decoding error:
– If C(𝑥1 )C(𝑥2 ) ⋯ C(𝑥𝑛 ) = C(𝑥1′ )C(𝑥2′ ) ⋯ C(𝑥𝑚′ ), the decoder must fail on one of these inputs.

• Example (see the sketch below):
– Consider a → 0, b → 01, c → 10
– Then ac → 010 and ba → 010
– Not uniquely decodable.
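A tiny sketch (Python) that finds the collision above by brute force: it encodes every source string up to a given length and reports two distinct strings that map to the same bit string, if any exist.

```python
from itertools import product

C = {"a": "0", "b": "01", "c": "10"}    # the code from the example above

def find_collision(code, max_len=3):
    """Return two different source strings whose encodings coincide, if any."""
    seen = {}
    for n in range(1, max_len + 1):
        for word in product(code, repeat=n):
            bits = "".join(code[s] for s in word)
            if bits in seen and seen[bits] != word:
                return seen[bits], word, bits
            seen.setdefault(bits, word)
    return None

print(find_collision(C))    # e.g. (('a', 'c'), ('b', 'a'), '010')
```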

24
Example of Source coding
• Fixed length coding versus variable length coding

Source   Prob. of          Codebook for source coding
symbol   source symbol     Code 1   Code 2   Code 3   Code 4   Code 5
s1       p(s1) = 1/2       00       00       0        1        1
s2       p(s2) = 1/4       01       11       01       10       01
s3       p(s3) = 1/8       10       00       001      100      001
s4       p(s4) = 1/8       11       11       0001     1000     0001

• Is each code sequence decodable uniquely?


• Which has highest efficiency?
– the average number of coded bits per source symbol
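A short sketch (Python) that answers the two questions numerically: it computes the average codeword length of each codebook and its Kraft sum (a necessary condition for a prefix code is a sum ≤ 1), and flags codebooks with repeated codewords, which can never be uniquely decodable.

```python
from fractions import Fraction as F

p = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]               # p(s1)..p(s4)
codebooks = {
    "Code 1": ["00", "01", "10", "11"],
    "Code 2": ["00", "11", "00", "11"],
    "Code 3": ["0", "01", "001", "0001"],
    "Code 4": ["1", "10", "100", "1000"],
    "Code 5": ["1", "01", "001", "0001"],
}

for name, cw in codebooks.items():
    avg_len = sum(q * len(c) for q, c in zip(p, cw))   # average bits per source symbol
    kraft = sum(F(1, 2 ** len(c)) for c in cw)         # Kraft sum
    distinct = len(set(cw)) == len(cw)                 # repeated codewords are never decodable
    print(f"{name}: average length = {avg_len}, Kraft sum = {kraft}, distinct codewords = {distinct}")
```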

25
Review on Fixed-length Coding Theorem

• What is the coding efficiency for source coding?


Fixed-length coding: 𝑿 = (𝑋1 𝑋2 ⋯ 𝑋𝑙 ⋯ 𝑋𝐿 ), 𝑋𝑙 ∈ {𝑎1 , 𝑎2 , ⋯ , 𝑎𝑛 } → 𝒀 = (𝑌1 𝑌2 ⋯ 𝑌𝑘 ⋯ 𝑌𝐾 ), 𝑌𝑘 ∈ {𝑏1 , 𝑏2 , ⋯ , 𝑏𝑚 }

Coding efficiency: 𝜂 = 𝐻(𝑋) / ((𝐾/𝐿) log2 𝑚). What is the meaning of the denominator, (𝐾/𝐿) log2 𝑚?

❑ We have the following single-symbol info source:
𝑋:    𝑎1   𝑎2    𝑎3   𝑎4   𝑎5    𝑎6    𝑎7    𝑎8
𝑝(𝑋): 0.4  0.18  0.1  0.1  0.07  0.06  0.05  0.04
Please encode this info source by using the fixed-length code.

26
Fixed Length Coding Theorem
❑ The entropy rate of a stationary memoryless info source 𝑿 is 𝐻(𝑋).
Fixed-length coding: 𝑿 = (𝑋1 𝑋2 ⋯ 𝑋𝑙 ⋯ 𝑋𝐿 ), 𝑋𝑙 ∈ {𝑎1 , 𝑎2 , ⋯ , 𝑎𝑖 , ⋯ , 𝑎𝑛 } → 𝒀 = (𝑌1 𝑌2 ⋯ 𝑌𝑘 ⋯ 𝑌𝐾 ), 𝑌𝑘 ∈ {𝑏1 , 𝑏2 , ⋯ , 𝑏𝑗 , ⋯ , 𝑏𝑚 }
❑ Fixed-length coding theorem: for arbitrary 𝜀 > 0 and 𝛿 > 0, if the information rate (bit/symbol) satisfies
𝑅 = (𝐾/𝐿) log2 𝑚 ≥ 𝐻(𝑋) + 𝜀   (Positive Theorem)
then, when 𝐿 is large enough, the decoding error probability can be made lower than 𝛿. Otherwise, if
𝑅 = (𝐾/𝐿) log2 𝑚 ≤ 𝐻(𝑋) − 2𝜀   (Negative Theorem)
then, when 𝐿 is large, the decoding error cannot be avoided.
27
How to do fixed-length encoding?
• Segment source symbols into n-tuples.
• Map each n-tuple into a binary L-tuple, where 2^L ≥ M^n (M is the size of the source alphabet), i.e. L ≥ n log2 M.
• Let L̄ = L/n be the number of bits per source symbol (a small sketch follows below).
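A small sketch (Python) of the relation filled in above: the smallest binary block length L for a given source alphabet size M and tuple length n, and the resulting number of bits per source symbol (the values of M and n below are only illustrative).

```python
from math import ceil, log2

def bits_per_symbol(M, n):
    """Smallest L with 2**L >= M**n, and the per-symbol rate L/n."""
    L = ceil(n * log2(M))
    return L, L / n

for n in (1, 2, 3, 10):
    print("n =", n, "->", bits_per_symbol(M=5, n=n))   # e.g. a 5-symbol source
```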

28
Coding efficiency
❑ We have the following single-symbol info source:
𝑋:    𝑎1   𝑎2    𝑎3   𝑎4   𝑎5    𝑎6    𝑎7    𝑎8
𝑝(𝑋): 0.4  0.18  0.1  0.1  0.07  0.06  0.05  0.04
Please encode this info source by using the fixed-length code.
• Fixed-length codes
– 3 bits for each symbol (L = 3)
– H(X) = 2.55 bit/symbol
– 𝜂 = H(X)/L = 85%
– Could we enhance the coding efficiency?
• Solution
– Variable-length source codes
29
Variable-length source codes
• Motivation
– Probable symbols should have shorter codewords than improbable ones, to reduce the number of bits per source symbol.
• Definition
– A variable-length source code C encodes each symbol x in the source alphabet X to a binary codeword C(x) of length l(x).
• For example, for X = {a, b, c} (see the sketch below)
– C(a) = 0
– C(b) = 10
– C(c) = 11
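A minimal sketch (Python) of an encoder and a decoder for this example code; because no codeword is a prefix of another, the decoder can emit a symbol as soon as the buffered bits match a codeword.

```python
C = {"a": "0", "b": "10", "c": "11"}            # the prefix code above
inv = {bits: sym for sym, bits in C.items()}    # codeword -> symbol

def encode(symbols):
    return "".join(C[s] for s in symbols)

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inv:           # a complete codeword has been read
            out.append(inv[buf])
            buf = ""
    assert buf == "", "bit string ended in the middle of a codeword"
    return "".join(out)

msg = "abacab"
print(encode(msg))                 # '010011010'
print(decode(encode(msg)) == msg)  # True
```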
30
Decoder for Variable-length source codes

• Prefix-free codes are always uniquely decodable and can be decoded instantaneously; some non-prefix-free codes are also uniquely decodable, but the decoder may need to look ahead.
• Example
– Consider a → 0, b → 01, c → 11
– Then
• accc → 0111111 (a 0 followed by six 1s)
• bccc → 01111111 (a 0 followed by seven 1s)
– This code can be shown to be uniquely decodable: the parity of each run of 1s following a 0 tells the decoder whether that 0 was an a or the start of a b.
31
Example: Not uniquely decodable

32
Varied Length Coding Theorem
❑ For a memoryless info source 𝑋 having an entropy of 𝐻(𝑋), when encoding its symbols with an 𝑚-ary variable-length code, a lossless encoding scheme exists whose average codeword length K̄ satisfies
𝐻(𝑋)/log2 𝑚 ≤ K̄ < 1 + 𝐻(𝑋)/log2 𝑚.
The average information rate of the encoder satisfies
𝐻(𝑋) ≤ R̄ = (K̄/𝐿) log2 𝑚 < 𝐻(𝑋) + 𝜀,
where 𝐿 = 1 and 𝜀 is an arbitrary positive number.
❑ When the info source sends a symbol sequence of length 𝐿, the average code length satisfies
𝐿𝐻(𝑋)/log2 𝑚 ≤ K̄ < 1 + 𝐿𝐻(𝑋)/log2 𝑚.
33
Proof

34
Proof (cont.)

35
Coding Efficiency
❑ The length 𝐿 of the symbol sequence to be encoded together by a variable-length encoder is far lower than for a fixed-length counterpart. The coding efficiency is lower-bounded by
𝜂 = 𝐻(𝑋)/R̄ > 𝐻(𝑋) / (𝐻(𝑋) + (log2 𝑚)/𝐿).
❑ Encode the following single-symbol source by using a binary variable-length code:
𝑋:    𝑎1   𝑎2    𝑎3   𝑎4   𝑎5    𝑎6    𝑎7    𝑎8
𝑝(𝑋): 0.4  0.18  0.1  0.1  0.07  0.06  0.05  0.04
When the coding efficiency is required to be higher than 90%, calculate the length of the symbol sequences that need to be processed together (a sketch follows below).
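A minimal sketch (Python) for this exercise, assuming the lower bound η > H(X)/(H(X) + (log2 m)/L) from this slide: it solves for the smallest block length L that guarantees the required efficiency with a binary code (m = 2).

```python
from math import floor, log2

p = [0.4, 0.18, 0.1, 0.1, 0.07, 0.06, 0.05, 0.04]
H = -sum(q * log2(q) for q in p)     # entropy of the single-symbol source

def min_block_length(eta, m=2):
    """Smallest integer L with H / (H + log2(m)/L) > eta."""
    # rearranged: L > eta * log2(m) / (H * (1 - eta))
    return floor(eta * log2(m) / (H * (1 - eta))) + 1

print("H(X) = %.3f bit/symbol" % H)
print("L must be at least", min_block_length(0.9))
```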
36
Pros and Cons of Varied-Length Code
❑ Varied-length coding is capable of compressing the information.
❑ Basic principle of varied-length coding: the messages having high sending probabilities are encoded into short codewords, while the messages having low sending probabilities are encoded into long codewords. Therefore, the average codeword length can be minimised in order to increase the communication efficiency.
❑ Varied-length coding increases the complexity of decoding:
❑ The decoder should correctly identify the beginning of the codewords having different lengths (synchronous decoding); since the codewords have various lengths, when a specific symbol is received, the decoder does not always know whether it is the end of a codeword.
❑ It may have to wait for the following symbols before it can decode correctly (decoding delay).

37
Examples of Varied Length Coding

Message of     Sending
info source    probability   Code A   Code B   Code C   Code D
𝑎1             0.5           0        0        0        0
𝑎2             0.25          0        1        01       10
𝑎3             0.125         1        00       011      110
𝑎4             0.125         10       11       0111     1110
Decodability:                Undecodable   Undecodable   Uniquely decodable (decoding delay)   Uniquely decodable (no delay)
❑ A code is called a prefix-free code if no codeword is a prefix of any
other codewords. For brevity, a prefix-free code will be referred to as
a prefix code, which is also regarded as an instantaneous code.
38
Tree Representing code words of Code D

39
Tree Graph of Prefix Code

[Figure: two ternary code trees with branches labelled 0, 1, 2 — a full tree (left) and a non-full tree (right)]

❑ Prefix codes can be constructed from a tree graph. The starting point is the root, and every segment between a pair of nodes is a branch.
❑ A full tree constructs a fixed-length code, while a non-full tree constructs a variable-length code.
40
Necessary and Sufficient Conditions
❑ The necessary and sufficient condition for the existence of an 𝑚-ary prefix-free code with codeword lengths 𝑘1 , 𝑘2 , ⋯ , 𝑘𝑛 is
∑_{i=1}^{n} 𝑚^(−𝑘𝑖) ≤ 1.   (Kraft Inequality)
𝑘𝑖 is the length of the i-th codeword.
Proof: prove the necessity and the sufficiency of the inequality; see Theorem 3.21, page 47, Textbook by Prof. Gallager.
❑ Prefix codes are uniquely decodable.
❑ Given an info source with symbols 𝑎1 , ⋯ , 𝑎𝑖 , ⋯ , 𝑎𝑛 and probabilities 𝑝(𝑎1 ), ⋯ , 𝑝(𝑎𝑖 ), ⋯ , 𝑝(𝑎𝑛 ), if the length 𝑘𝑖 of the codeword of message 𝑎𝑖 satisfies
−log2 𝑝(𝑎𝑖 ) / log2 𝑚 ≤ 𝑘𝑖 < 1 − log2 𝑝(𝑎𝑖 ) / log2 𝑚,
then a prefix code with these lengths exists (see the sketch below).
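A short sketch (Python) tying the two statements together: it chooses the lengths k_i = ⌈−log_m p(a_i)⌉ allowed by the condition above and verifies that their Kraft sum does not exceed 1, so a prefix code with those lengths exists.

```python
from math import ceil, log

def shannon_lengths(probs, m=2):
    """Codeword lengths k_i = ceil(-log_m p_i), which satisfy the condition above."""
    return [ceil(-log(q, m)) for q in probs]

def kraft_sum(lengths, m=2):
    return sum(m ** (-k) for k in lengths)

p = [0.4, 0.18, 0.1, 0.1, 0.07, 0.06, 0.05, 0.04]
for m in (2, 3):
    k = shannon_lengths(p, m)
    print(f"m = {m}: lengths = {k}, Kraft sum = {kraft_sum(k, m):.4f} (<= 1)")
```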
41
Proof on Kraft Inequality
• We prove this by associating
codewords with base 2 expansions
i.e., ‘decimals’ in base 2.

42
Proof on Kraft Inequality (2)

43
Proof on Kraft Inequality (3)

44
Proof on Kraft Inequality (4)

45
Proof on Kraft Inequality (5)

46
Quiz

47
The 2nd Way for Proof of Kraft Inequality

48
49
50
Shannon Code
❑ The binary Shannon code has the following coding steps:
(1) Sort the symbols of the info source in descending order of their sending probabilities, e.g. 𝑝(𝑎1 ) ≥ ⋯ ≥ 𝑝(𝑎𝑛 ).

(2) Let 𝑝(𝑎0 ) = 0 and let 𝑃𝑎 (𝑎𝑗 ) = ∑_{i=0}^{j−1} 𝑝(𝑎𝑖 ), 𝑗 = 1, 2, ⋯ , 𝑛, denote the cumulative probability of all symbols before the 𝑗-th symbol.

(3) Choose the length 𝑘𝑗 of the 𝑗-th codeword to satisfy −log2 𝑝(𝑎𝑗 ) ≤ 𝑘𝑗 < 1 − log2 𝑝(𝑎𝑗 ), i.e. 𝑘𝑗 = ⌈−log2 𝑝(𝑎𝑗 )⌉.

(4) Write 𝑃𝑎 (𝑎𝑗 ) in binary form and take the first 𝑘𝑗 digits after the binary point as the codeword of the symbol 𝑎𝑗 . (A sketch follows below.)
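A minimal sketch (Python) of the four steps above: it sorts the probabilities, accumulates P_a(a_j), expands it in binary, and truncates the expansion to k_j = ⌈−log2 p(a_j)⌉ digits.

```python
from math import ceil, log2

def shannon_code(probs):
    probs = sorted(probs, reverse=True)     # step (1): descending probabilities
    codewords = []
    cumulative = 0.0                        # step (2): P_a(a_1) = 0
    for p in probs:
        k = ceil(-log2(p))                  # step (3): codeword length
        # step (4): binary expansion of the cumulative probability, k digits
        frac, bits = cumulative, ""
        for _ in range(k):
            frac *= 2
            bits += str(int(frac))
            frac -= int(frac)
        codewords.append(bits)
        cumulative += p
    return codewords

p = [0.25, 0.25, 0.2, 0.15, 0.1, 0.05]      # the example on the next slide
codes = shannon_code(p)
avg_len = sum(q * len(c) for q, c in zip(p, codes))
H = -sum(q * log2(q) for q in p)
print(codes)                                 # ['00', '01', '100', '101', '1101', '11110']
print("average length = %.2f, efficiency = %.3f" % (avg_len, H / avg_len))
```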
51
Example of Shannon Code

❑ Please encode the following single symbol info source by the binary
Shannon code and calculate its coding efficiency.
𝑋:    𝑎1   𝑎2   𝑎3   𝑎4   𝑎5   𝑎6
𝑃(𝑋): 0.25 0.25 0.2  0.15 0.1  0.05
• Please compute expected length and coding efficiency.
To convert a fraction to binary, repeatedly multiply the fractional part by 2 until it becomes 0 (or enough digits are obtained) and collect the integer parts in order.

Symbol  𝑝(𝑎𝑖)  −log2 𝑝(𝑎𝑖)  𝑘𝑖  Cumulative 𝑃𝑎(𝑎𝑖)  Binary form → codeword
𝑎1      0.25   2             2   0.00               (0.00)2 → 00
𝑎2      0.25   2             2   0.25               (0.01)2 → 01
𝑎3      0.2    2.3219        3   0.50               (0.100)2 → 100
𝑎4      0.15   2.7370        3   0.70               (0.101…)2 → 101
𝑎5      0.1    3.3219        4   0.85               (0.1101…)2 → 1101
𝑎6      0.05   4.3219        5   0.95               (0.11110…)2 → 11110

52
53
Fano Code
❑ The M-ary Fano code has the following coding steps:
(1) Sort the symbols of the info source in descending order of their sending probabilities, e.g. 𝑝(𝑎1 ) ≥ ⋯ ≥ 𝑝(𝑎𝑛 ).

(2) Divide the symbols into M groups whose sum probabilities are as close to each other as possible.

(3) Assign each group a different code element chosen from the M possible values.

(4) Treat each group as a unit and repeat steps (2) and (3) until every group contains a single symbol. (A sketch follows below.)
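A minimal sketch (Python, binary case M = 2) of the recursive procedure above: each call splits the sorted probabilities at the point that best balances the two halves, appends 0 to one half and 1 to the other, and recurses on both halves.

```python
def fano_code(probs):
    """Binary Fano code for probabilities already sorted in descending order."""
    n = len(probs)
    codes = [""] * n

    def split(lo, hi):                       # work on symbols probs[lo:hi]
        if hi - lo <= 1:
            return
        total = sum(probs[lo:hi])
        acc, best_k, best_diff = 0.0, lo + 1, float("inf")
        for k in range(lo + 1, hi):          # choose the split with closest halves
            acc += probs[k - 1]
            diff = abs(2 * acc - total)
            if diff < best_diff:
                best_diff, best_k = diff, k
        for i in range(lo, hi):
            codes[i] += "0" if i < best_k else "1"
        split(lo, best_k)
        split(best_k, hi)

    split(0, n)
    return codes

p = [0.25, 0.25, 0.2, 0.15, 0.1, 0.05]
print(fano_code(p))      # one possible Fano code for the example below
```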
54
Example of Fano Code

❑ Please encode the following single symbol info source by the binary
Fano code and calculate its coding efficiency.
𝑋:    𝑎1   𝑎2   𝑎3   𝑎4   𝑎5   𝑎6
𝑃(𝑋): 0.25 0.25 0.2  0.15 0.1  0.05
Coding efficiency:
❑ We have a binary discrete info source 𝑋, whose probability distribution
is P(𝑋) = [0.7, 0.3], please encode its expanded source of order three
by the binary Fano code and calculate its coding efficiency.
Coding efficiency: xxxx

55
Huffman Code
❑ The binary Huffman code has the following coding steps:
(1) Sort the symbols of the info source in descending order of their sending probabilities, e.g. 𝑝(𝑎1 ) ≥ ⋯ ≥ 𝑝(𝑎𝑛 ).

(2) Assign 0 and 1 to the symbols 𝑎𝑛−1 and 𝑎𝑛 having the lowest probabilities, respectively. Combine 𝑎𝑛−1 and 𝑎𝑛 into a new symbol whose probability is the sum of the probabilities of 𝑎𝑛−1 and 𝑎𝑛 . The new source 𝑆1 has (𝑛 − 1) symbols.

(3) Sort the symbols of the new source 𝑆1 in descending order of their probabilities. Repeat step (2) to obtain a new reduced source 𝑆2 having (𝑛 − 2) symbols.

(4) Repeat the above steps until the info source is reduced to two symbols. Trace back from these two symbols to each original symbol along the combining path to obtain the codewords. (A sketch follows below.)
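A minimal sketch (Python) of binary Huffman coding using a priority queue instead of explicit re-sorting; repeatedly popping and combining the two least probable entries is equivalent to the steps above.

```python
import heapq
from math import log2

def huffman_code(prob_map):
    """Return {symbol: codeword} for a dict of symbol probabilities."""
    # each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(prob_map.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)     # the two least probable groups
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}   # bits closer to the root come first
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

p = {"a1": 0.25, "a2": 0.25, "a3": 0.2, "a4": 0.15, "a5": 0.1, "a6": 0.05}
code = huffman_code(p)
avg = sum(p[s] * len(w) for s, w in code.items())
H = -sum(q * log2(q) for q in p.values())
print(code)
print("average length = %.2f bit/symbol, efficiency = %.3f" % (avg, H / avg))
```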

56
Examples of Huffman Coding (1)
• X={a1, a2, a3, a4, a5},Pr(ai)={0.3,0.25,0.25,0.1,0.1}

57
Examples of Huffman Coding (2)
• Using ternary code symbols, {0, 1, 2}

58
Examples of Huffman Coding (3)

❑ Please encode the following single symbol info source by the binary
Huffman code and calculate its coding efficiency.
𝑋:    𝑎1   𝑎2   𝑎3   𝑎4   𝑎5   𝑎6
𝑃(𝑋): 0.25 0.25 0.2  0.15 0.1  0.05

Coding efficiency: xxxx


❑ A ternary discrete info source 𝑋 has a probability distribution
𝑝 0 = 0.5, 𝑝 1 = 0.3, 𝑝 2 = 0.2 , please encode its expanded
source of order two by the binary Huffman code and calculate its
coding efficiency.
Coding efficiency: xxxx

59
Two methods for Huffman Coding
❑ Method 1: The combined new symbol is put behind the other
symbols having the same probabilities.
❑ Method 2: The combined new symbol is put ahead of the other
symbols having the same probabilities.
❑ Example: encode the following single-symbol source by the binary Huffman code with these two methods and calculate the variance of the codeword length:
𝑋:    𝑎1   𝑎2   𝑎3   𝑎4   𝑎5
𝑃(𝑋): 0.4  0.2  0.2  0.1  0.1
❑ Conclusion: when sorting the symbols of the reduced source, putting the combined new symbol ahead of the other equally probable symbols makes better use of the short codewords and yields a smaller variance of the codeword length.

60
The paper published by Huffman
• D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, 1952.

61
Example in the paper

62
Derived Coding Requirement
• No message shall be coded in such a
way that its code is a prefix of any other
message, or that any of its prefixes are
used elsewhere as a message code.

63
𝑚-ary Huffman Code
❑ If the codewords of an 𝑚-ary code constitute a tree graph, the number of separable codewords is 𝑚 + 𝑘(𝑚 − 1), where 𝑘 is a positive integer.
❑ To minimise the average codeword length, the last reduced source should have exactly 𝑚 symbols when using the 𝑚-ary Huffman code.
❑ Hence, in the first step of assigning code elements to the symbols having the lowest probabilities, the number of symbols combined may be fewer than 𝑚.
❑ If a source has 𝑛 symbols, let 𝑘 be the minimum integer satisfying 𝑚 + 𝑘(𝑚 − 1) ≥ 𝑛; then 𝑠 = 𝑚 + 𝑘(𝑚 − 1) − 𝑛 codeword slots are left unused, and we combine only (𝑚 − 𝑠) symbols in the first step (see the sketch below).
❑ Encode the source
𝑋:    𝑎1   𝑎2    𝑎3   𝑎4   𝑎5    𝑎6    𝑎7    𝑎8
𝑝(𝑋): 0.4  0.18  0.1  0.1  0.07  0.06  0.05  0.04
by the ternary Huffman code and calculate its coding efficiency.
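A small sketch (Python) of the rule above: for a source alphabet size n and code alphabet size m it computes s and the number (m − s) of symbols to combine in the first merging step.

```python
def first_merge_size(n, m):
    """How many symbols to combine in the first step of m-ary Huffman coding."""
    k = 0
    while m + k * (m - 1) < n:      # smallest k with m + k(m-1) >= n
        k += 1
    s = m + k * (m - 1) - n         # unused codeword slots
    return m - s

print(first_merge_size(n=8, m=3))   # ternary code for the 8-symbol source above
print(first_merge_size(n=5, m=3))
```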
64
𝑚-ary Huffman Code (2)
• It will be noted that the terminating auxiliary ensemble always has one unity-probability message.
• Each preceding ensemble is increased in number by m − 1 until the first auxiliary ensemble is reached.
• Therefore, if N1 is the number of messages in the first auxiliary ensemble, then (N1 − 1)/(m − 1) must be an integer.
• However, N1 = N − n0 + 1, where n0 is the number of the least probable messages combined in a bracket in the original ensemble. Therefore, n0 (which, of course, is at least two and no more than m) must be of such a value that (N − n0)/(m − 1) is an integer.
65
Comparison

❑ The Shannon code, the Fano code and the Huffman code all exploit the statistics of the info source: the symbols frequently sent are encoded with short codewords, while those infrequently sent are encoded with long codewords, in order to reduce the average codeword length.
❑ The Shannon code has a unique coding scheme, but its coding efficiency is not very high. The Fano code and the Huffman code both admit multiple coding schemes.
❑ The Fano code is suitable for info sources whose groups have nearly equal probabilities after each division.
❑ The Huffman code does not have any specific requirement on the info source. It has a high coding efficiency and the complexity of its encoder is low.

66
Optimality of Huffman Codes

67
Assignment

68
Assignment(cont.)

69
Assignment(cont.)

70
Project
Please encode the 26 English letters and the ‘space’ in the English
novel – Game of Thrones by using an arbitrary source coding
technique.
Requirements:
(1) Please choose a source coding technique from Shannon coding,
Fano coding, Huffman coding or any other source coding technique.
(2) Please use at least the first two chapters “Prologue” and “Bran”
as the information source.
(3) Please freely choose a programming platform to complete this project, such as C++, Java, Python, Matlab, etc.
(4) Please freely make a two-person group to complete this project.
(5) Please submit a compressed package named after the name of
the group members, which includes the executable file, the source
file and the English REPORT.
(6) Both Class 1 and Class 2, please submit the packages to me and, at the same time, to my two teaching assistants.
(7) Deadline: November 30, 2023

71
