Lecture 3
LIU Liang
Assistant Professor
Department of Electronic and Information Engineering
The Hong Kong Polytechnic University
Lecture 2 Review
An Example for A/D and D/A Conversion
q Sampling rate: 5 Hz
q Number of quantization bits per sample: 3
Sampling at Transmitter
[Figure: a 5 Hz sampled waveform quantized to the 3-bit levels 000 (0) through 111 (7)]
Lecture 3: Source Coding
Source Coding
q Source coding is also called data compression
q Wikipedia: “data compression, or source coding, is the process of encoding information
using fewer bits (or other information-bearing units) than an unencoded representation
would use through use of specific encoding schemes.”
q Applications
Ø General data compression: .zip, .gz …
Ø Image over network: telephone/internet/wireless/etc
Ø Slow device: 1xCD-ROM 150KB/s, bluetooth v1.2 up to ~0.25MB/s
Ø Large multimedia databases
Digital Images
Aspect Ratio
q The aspect ratio is the ratio between the number of pixels on the horizontal axis
(W) and the number of pixels on the vertical axis (H).
Color Depth
Image Size
Digital Video
Why We Need Compression
q Image: 6.0 million pixel camera, 3000x2000
Ø RGB (color depth 24 bits or 3 bytes)
§ What is the size of each image (MB)? 18 MB
§ Suppose the storage size is 1 GB. How many images can be stored? 56 images
q Video: DVD Disc 4.7 GB
Ø Video 720x480, RGB, 30 frames/sec
§ Each second of video takes about 31.1 MB, so the maximum duration per DVD disc is about 2.5 minutes
q Watch online video using cellphone:
Ø 352x240, RGB, 15 frames/sec
§ The minimum network speed: 30.4 Mbps
§ Can you afford it? (How about 4K video)
q How to compress data with fewer bits?
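The raw-size arithmetic above can be checked with a short script; this is a sketch assuming 3 bytes per RGB pixel and decimal units (1 MB = 10^6 bytes), which matches the figures on the slide:

```python
# Raw (uncompressed) sizes for the examples above.
# Assumes 3 bytes per RGB pixel and decimal units (1 MB = 10**6 B).
MB, GB = 10**6, 10**9

# Image: 3000x2000 pixels, 24-bit RGB
image_bytes = 3000 * 2000 * 3
print(f"image: {image_bytes / MB:.1f} MB")                 # 18.0 MB

# DVD video: 720x480, RGB, 30 frames/s on a 4.7 GB disc
video_rate = 720 * 480 * 3 * 30                            # bytes per second
print(f"video: {video_rate / MB:.1f} MB per second")       # 31.1 MB/s
print(f"disc holds {4.7 * GB / video_rate / 60:.1f} min")  # ~2.5 minutes

# Streaming: 352x240, RGB, 15 frames/s, converted to bits per second
stream_bps = 352 * 240 * 3 * 15 * 8
print(f"stream: {stream_bps / 10**6:.1f} Mbps")            # 30.4 Mbps
```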
Lossless Compression and Lossy Compression
q Lossless compression: original data is perfectly reconstructed from the compressed data
q Lossy compression: permits reconstruction of only an approximation of the original data,
but uses fewer encoded bits to store the compressed data
Part I: Lossless Compression
Examples of Codes
q For any sample x in a sample set X
Ø Codeword assigned to x: C(x)
Ø Length of codeword assigned to x: l(x)
Ø Expected length of all codewords: L(C) = Σ p(x)·l(x), summed over all x in X
q Example 1: encode 1, 2, 3, 4
Uniquely Decodable Code
q A code is said to be uniquely decodable if every sequence of samples is
mapped to a different sequence of codewords
Uniquely Decodable Code
q Consider the following code
Ø Assigning codes to letters a, b, and c
Prefix Code
q A code is said to be a prefix code or instantaneous code if no codeword is a prefix of any
other codeword
q Consider the following code
Ø Assigning codes to letters a, b, and c
Relation between Various Codes
Bound for Prefix Code
q Encode samples x drawn from X
Ø PDF: p(x)
q Theorem: The expected length L of any prefix code to encode X is greater than
or equal to the entropy of X, i.e., L ≥ H(X)
Bound for Prefix Code
q Example:
Ø Encode
Ø PDF:
Ø How to design prefix code with the minimum expected length?
Bound for Prefix Code
q Core idea: use fewer bits to encode samples that occur with higher frequency
Practical Issue for Prefix Code
q Recall that to achieve the minimum expected code length, the length of each codeword is
l(x) = log2(1/p(x)), which is generally not an integer
Practical Issue for Prefix Code
q Solution: round up, so the length of each codeword is l(x) = ⌈log2(1/p(x))⌉
q Example:
Ø Encode
Ø PDF:
Practical Issue for Prefix Code
q Bound on expected code length: with l(x) = ⌈log2(1/p(x))⌉, the expected length satisfies
H(X) ≤ L < H(X) + 1
Revisit of Morse Code
Huffman Code
q If n is small, we can easily construct a prefix code based on the PDF by hand (e.g., n = 3)
q What if n is very large?
q Example:
Ø Encode:
Ø PDF:
Encoding
q Example:
[Figure: Huffman tree — root, edges, and leaves; leaf probabilities include a:0.42, c:0.05, d:0.1; internal node weights 0.23, 0.35, 0.58]
Encoding
q Encoding rule: from root to each leaf; on each edge
Ø Turn left: 0
Ø Turn right: 1
[Figure: the same Huffman tree with each left edge labeled 0 and each right edge labeled 1]
Encoding
q Encoding rule: concatenate all bits on the edges from the root down to a leaf
[Figure: the labeled Huffman tree, with each leaf's codeword read off along its root-to-leaf path]
Summary of Huffman Code
q Greedy algorithm to assign fewer bits to encode samples with higher frequency
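The greedy rule above can be sketched in a few lines of Python. The function name and the four-symbol distribution below are illustrative (chosen to be dyadic so the code lengths come out cleanly), not the slide's example:

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code from {symbol: probability}.

    Greedy rule: repeatedly merge the two least-probable nodes, so
    rarer symbols end up deeper in the tree (longer codewords).
    """
    tick = count()  # tie-breaker so heapq never compares the dicts
    heap = [(p, next(tick), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)   # smallest probability
        p1, _, right = heapq.heappop(heap)  # second smallest
        merged = {s: "0" + c for s, c in left.items()}   # left subtree: prepend 0
        merged.update({s: "1" + c for s, c in right.items()})  # right: prepend 1
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

# Hypothetical dyadic distribution: lengths come out as 1, 2, 3, 3 bits
code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(code)
```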
Encoding and Decoding
q Example:
q Exercise (5 minutes)
Ø Construct Huffman Tree and encode each letter by yourself
Ø How to decode the letters based on codeword 01111100
[Figure: workspace for the Huffman tree — leaves include a:0.42, c:0.05, d:0.1]
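Decoding a prefix code walks the bit string left to right; because no codeword is a prefix of another, the first match is always unambiguous. The code table below is hypothetical (the slide's actual assignment is not reproduced here), so the decoded string is only illustrative:

```python
def decode(bits, code):
    """Decode a bit string with a prefix code, bit by bit.

    The prefix property guarantees that as soon as the buffer matches
    a codeword, that match is the correct one.
    """
    inverse = {w: s for s, w in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:      # unambiguous match, emit and reset
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# Hypothetical prefix code table
print(decode("01111100", {"a": "0", "b": "10", "c": "110", "d": "111"}))  # -> adca
```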
Huffman Code
q Example:
q Entropy of x: H(X) = −Σ p(x) log2 p(x), summed over all x in X
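For concreteness, here is a short entropy computation on a hypothetical distribution (the slide's actual PDF values are not reproduced here); since the probabilities are powers of 1/2, a Huffman code meets H(X) exactly:

```python
from math import log2

def entropy(probs):
    # H(X) = -sum over x of p(x) * log2 p(x), skipping zero-probability symbols
    return -sum(p * log2(p) for p in probs.values() if p > 0)

# Hypothetical dyadic distribution
H = entropy({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(H)  # 1.75 bits per sample
```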
Morse Code vs. Huffman Code
Morse Code vs. Huffman Code
Verify this
Part II: Lossy Compression
Lossy Compression
q Lossy compression: permits reconstruction of only an approximation of the original data,
but uses fewer encoded bits to store the compressed data
What Can We Compress
q Redundancy: exceeding what is necessary or normal
Ø Spatial redundancy: adjacent pixels are highly correlated
What Can We Compress
q Irrelevance or perceptual redundancy
Ø Not all visual information is perceived by the eye/brain, so we can discard what is not
[Figure: color diagram — Red, Magenta, Yellow, White, Blue, Green, Cyan]
What Can We Compress
How Much Can We Compress