MM-Lecture 5 Image Compression
Introduction
Compression: the process of coding that will
effectively reduce the total number of bits needed to
represent certain information.
The Need for Compression?
Information Vs Data
Data = Information + Redundant Data
Key idea in compression: keep only the information and discard the redundant data.
Data Redundancy
CODING: Fewer bits to represent frequent symbols.
INTERPIXEL / INTERFRAME: Neighboring pixels
have similar values.
PSYCHOVISUAL: The human visual system cannot
distinguish all colors simultaneously.
Coding Redundancy
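Use fewer bits to represent frequently occurring symbols.
Let $p_r(r_k) = n_k / n$ for $k = 0, 1, 2, \ldots, L-1$, where $L$ is the number of gray
levels, $n_k$ is the number of pixels with gray level $r_k$, and $n$ is the total number of pixels.
Let $r_k$ be represented by $l(r_k)$ bits.
The average number of bits required to represent each pixel is therefore
$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k) \qquad \text{(A)}$$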
Coding Redundancy
Consider equation (A): It makes sense to assign fewer
bits to those rk for which pr(rk) are large in order to
reduce the sum.
This achieves data compression and results in a
variable length code.
More probable gray levels are assigned fewer bits.
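The effect of a variable-length code on the average number of bits per pixel can be checked numerically. The sketch below is only an illustration: the function name `average_code_length`, the ten-pixel image, and both code-length tables are hypothetical and not taken from the lecture.

```python
from collections import Counter

def average_code_length(pixels, code_lengths):
    """Return the sum over gray levels r_k of l(r_k) * p_r(r_k), in bits per pixel."""
    n = len(pixels)
    counts = Counter(pixels)  # n_k for each gray level r_k
    return sum(code_lengths[r] * n_k / n for r, n_k in counts.items())

# Hypothetical 4-level image in which gray level 0 dominates.
pixels = [0] * 6 + [1] * 2 + [2] * 1 + [3] * 1

fixed = {0: 2, 1: 2, 2: 2, 3: 2}     # 2-bit fixed-length code
variable = {0: 1, 1: 2, 2: 3, 3: 3}  # lengths of the codewords 0, 10, 110, 111

fixed_avg = average_code_length(pixels, fixed)
variable_avg = average_code_length(pixels, variable)
print(fixed_avg, variable_avg)
```

With these made-up numbers the fixed-length code needs 2 bits per pixel, while the variable-length code needs about 1.6, because the shortest codeword goes to the most probable level.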
Psychovisual Redundancy
Question: Which one looks
more different from the original?
[Figure: Original Image]
Types of Compression
Lossless compression
used for legal and medical documents and computer programs,
where no loss can be tolerated
exploits only data redundancy
Lossy compression
used for digital audio, image, and video, where some errors or loss
can be tolerated
exploits both data redundancy and properties of human
perception
Lossless Compression
Common methods to remove redundancy
Run Length Coding
Huffman Coding
Dictionary-Based Coding
Arithmetic Coding, etc.
Run Length Coding (RLC)
Run-length coding is a simple and very widely used
compression technique.
In this method, we replace each run of identical symbols with a pair
(run-length, symbol).
Example:
Input symbols: 7,7,7,7,7,90,9,9,9,1,1,1
requires 12 bytes (one byte per symbol)
Using RLC: (5,7),(1,90),(3,9),(3,1) = 8 bytes
Compression ratio: 12/8 = 1.5
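The pairing rule can be sketched in a few lines of Python. This is a minimal illustration that assumes strict (run-length, symbol) pairs, so a lone symbol such as 90 is emitted as (1, 90). Storing a lone symbol without its count would need an extra convention for the decoder to tell counts from symbols. The function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby

def rle_encode(symbols):
    """Encode a sequence as a list of (run_length, symbol) pairs."""
    return [(len(list(group)), symbol) for symbol, group in groupby(symbols)]

def rle_decode(pairs):
    """Expand (run_length, symbol) pairs back into the original sequence."""
    out = []
    for run_length, symbol in pairs:
        out.extend([symbol] * run_length)
    return out

data = [7, 7, 7, 7, 7, 90, 9, 9, 9, 1, 1, 1]
pairs = rle_encode(data)
print(pairs)                      # [(5, 7), (1, 90), (3, 9), (3, 1)]
print(rle_decode(pairs) == data)  # True
```

Under this strict pairing, the 12 input values become 4 pairs, that is, 8 stored values.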
Huffman Coding
Assigns fewer bits to symbols that appear more
often and more bits to the symbols that appear less
often
Efficient when occurrence probabilities vary
widely
It constructs a binary tree in a bottom-up manner,
then uses the tree to find the codeword for each
symbol.
Huffman Coding-Algorithm
1. Put all symbols on a list sorted according to their
frequency counts.
2. Repeat until the list has only one node left:
a. From the list, pick the two nodes with the lowest frequency counts.
Form a Huffman subtree that has these two nodes as children
of a new parent node.
b. Assign the sum of the children's frequency counts to the parent
and insert it into the list such that the order is maintained.
c. Delete the children from the list.
3. Assign a codeword for each leaf based on the path
from the root.
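The steps above can be sketched in Python as follows. This is one possible implementation, not the lecture's own: it swaps the sorted list for a min-heap (`heapq`), which yields the two lowest counts just as the sorted list would. The function name `huffman_codes`, the tie-breaking counter, and the four-symbol frequency table are assumptions made for illustration. Symbols are assumed to be strings.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman code from a {symbol: frequency count} mapping."""
    tiebreak = count()  # unique sequence numbers so equal counts never compare trees
    # Heap entry: (count, sequence number, tree); a tree is a symbol or a (left, right) pair.
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Step 2a: take the two nodes with the lowest counts.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        # Steps 2b and 2c: the parent replaces its two children.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))
    codes = {}

    def walk(tree, prefix):
        # Step 3: the path from the root (0 = left, 1 = right) is the codeword.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # a lone symbol still needs one bit

    walk(heap[0][2], "")
    return codes

# Hypothetical frequency counts, not taken from the lecture.
codes = huffman_codes({"A": 5, "B": 2, "C": 1, "D": 1})
print(codes)
```

The exact bit patterns depend on how ties are broken and on which child is labeled 0, so different correct implementations may print different codewords with the same lengths.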
Huffman Coding-Example
Source symbol   Number of occurrences   Codeword assigned   Length of codeword
S1              30                      00                  2
S2              10                      101                 3
S3              20                      11                  2
S4              5                       1001                4
S5              10                      1000                4
S6              25                      01                  2
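As a quick check of this example, the short Python sketch below recomputes the total coded size and the average codeword length from the table, and verifies that no codeword is a prefix of another. The counts and codewords are copied from the table. The helper name `is_prefix_free` and the comparison against a 3-bit fixed-length code are additions for illustration.

```python
# Counts and codewords copied from the example table.
table = {
    "S1": (30, "00"),
    "S2": (10, "101"),
    "S3": (20, "11"),
    "S4": (5, "1001"),
    "S5": (10, "1000"),
    "S6": (25, "01"),
}

total = sum(n for n, _ in table.values())
coded_bits = sum(n * len(code) for n, code in table.values())
average_length = coded_bits / total  # bits per symbol
fixed_bits = 3 * total               # 3 bits are enough to index 6 symbols

def is_prefix_free(codewords):
    """True if no codeword is a prefix of another (checked on sorted neighbors)."""
    words = sorted(codewords)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

print(coded_bits, average_length, fixed_bits)
print(is_prefix_free(code for _, code in table.values()))
```

For these 100 symbols the Huffman code uses 240 bits (2.4 bits per symbol on average), compared with 300 bits for a 3-bit fixed-length code.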
LZW Compression-Algorithm
LZW Compression-Example
We will compress the string
"ABABBABCABABBA".
Initially, the dictionary is as follows:
Code String
1 A
2 B
3 C
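A minimal Python sketch of standard LZW compression, applied to this string, is given below. It assumes the usual formulation: extend the current string while it is in the dictionary, otherwise output its code and add the extended string as a new entry. Codes are 1-based to match the initial dictionary. The function name `lzw_compress` and its `alphabet` parameter are hypothetical.

```python
def lzw_compress(text, alphabet):
    """Return the list of LZW codes for text, using 1-based codes."""
    # Initial dictionary: one entry per symbol of the alphabet (A=1, B=2, C=3 here).
    dictionary = {ch: i for i, ch in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the match
        else:
            output.append(dictionary[current])  # emit the longest known string
            dictionary[candidate] = next_code   # learn the new string
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])  # flush the final match
    return output

codes = lzw_compress("ABABBABCABABBA", "ABC")
print(codes)  # [1, 2, 4, 5, 2, 3, 4, 6, 1]
```

The 14 input symbols are represented by 9 codes. Along the way the dictionary gains the entries 4 = AB, 5 = BA, 6 = ABB, 7 = BAB, 8 = BC, 9 = CA, 10 = ABA and 11 = ABBA.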
LZW Compression-Example
LZW Decompression-Algorithm
LZW Decompression-Example
Decompress the code sequence 1 2 4 5 2 3 4 6 1 using the LZW method.