
Image Compression

Goal of Image Compression


 The goal of image compression is to reduce the amount of data
required to represent a digital image.
Data ≠ Information
 Data and information are not synonymous terms!
 Data is the means by which information is conveyed.
 Data compression aims to reduce the amount of data
required to represent a given quantity of information while
preserving as much information as possible.
Data vs Information (cont’d)
 The same amount of information can be represented by varying amounts of data, e.g.:

Ex1: Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night.

Ex2: Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night.

Ex3: Helen will meet you at Logan at 6:00 pm tomorrow night.
Definitions: Compression Ratio

 Compression ratio: $C_R = \frac{n_1}{n_2}$, where $n_1$ and $n_2$ are the numbers of information-carrying units (e.g., bits) in the original and compressed representations of the same information.
Definitions: Data Redundancy

 Relative data redundancy: $R_D = 1 - \frac{1}{C_R}$

 Example: if $C_R = 10$, then $R_D = 0.9$, i.e., 90% of the data in the original representation is redundant.
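A minimal sketch of these two definitions in Python (the byte counts below are hypothetical, chosen to give roughly $C_R = 10$):

def compression_ratio(n1, n2):
    # C_R = n1 / n2: sizes (e.g., in bits or bytes) of the original
    # and compressed representations of the same information.
    return n1 / n2

def relative_redundancy(cr):
    # R_D = 1 - 1/C_R
    return 1.0 - 1.0 / cr

cr = compression_ratio(65536, 6554)   # hypothetical: 64 KB image -> ~6.4 KB file
print(cr, relative_redundancy(cr))    # ~10.0, ~0.9 (90% of the data is redundant)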
Types of Data Redundancy
(1) Coding Redundancy
(2) Interpixel Redundancy
(3) Psychovisual Redundancy

 Compression attempts to reduce one or more of these


redundancy types.
Coding Redundancy
 Code: a list of symbols (letters, numbers, bits, etc.)
 Code word: a sequence of symbols used to represent a piece of
information or an event (e.g., gray levels).
 Code word length: number of symbols in each code word
Coding Redundancy
 Average number of bits required to represent each pixel is given by:

$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$

where $l(r_k)$ is the length (in bits) of the code word for gray level $r_k$ and $p_r(r_k)$ is the probability of that level.

Coding Redundancy (cont'd)
 For a fixed 3-bit code, $l(r_k) = 3$ for every level, so the average number of bits is $L_{avg} = 3 \sum_{k} p_r(r_k) = 3$ bits/pixel.
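A sketch comparing the two averages; the probabilities and the variable-length code word lengths below are illustrative assumptions, not values from these slides:

probs   = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]  # assumed p_r(r_k)
lengths = [2, 2, 2, 3, 4, 5, 6, 6]                          # assumed variable l(r_k)

L_avg_variable = sum(p * l for p, l in zip(probs, lengths))  # sum of l(r_k)*p_r(r_k)
L_avg_fixed    = sum(p * 3 for p in probs)                   # every level costs 3 bits

print(L_avg_variable)                  # 2.7 bits/pixel
print(L_avg_fixed)                     # 3.0 bits/pixel
print(L_avg_fixed / L_avg_variable)    # C_R ~ 1.11: ~10% coding redundancy removed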
Interpixel Redundancy

 Adjacent pixel values are highly correlated, so much of a pixel's contribution can be predicted from its neighbors; this correlation is called interpixel redundancy.

[Figure: original image, thresholded image, and intensity profile along line 100]
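Run-length coding is one simple mapping that exploits this correlation; a minimal sketch for a binary (thresholded) scan line:

def run_length_encode(line):
    # Map a scan line to [value, run-length] pairs; long constant runs, as in
    # a thresholded image, compress well because neighbors are correlated.
    runs = []
    for pixel in line:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return runs

print(run_length_encode([0, 0, 0, 1, 1, 0, 0, 0, 0]))   # [[0, 3], [1, 2], [0, 4]]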
Psychovisual Redundancy
 Certain information has relatively less importance for the
quality of image perception. This information is said to be
psychovisually redundant.
 Unlike coding and interpixel redundancies, psychovisual redundancy is associated with real (quantifiable) visual information. Its elimination results in a loss of quantitative information; psychovisually, however, the loss is negligible.
 Removing this type of redundancy is a lossy process and the
lost information cannot be recovered.
 The method used to remove this type of redundancy is called quantization: the mapping of a broad range of input values to a limited number of output values.
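A minimal sketch of uniform quantization, mapping 256 gray levels onto 16 (NumPy assumed):

import numpy as np

def uniform_quantize(img, levels=16):
    # Map 8-bit gray levels (0..255) onto `levels` output values by assigning
    # each input to the midpoint of its bin -- the discarded detail is lost
    # irreversibly, but the perceived degradation is small.
    step = 256 // levels
    return ((img // step) * step + step // 2).astype(np.uint8)

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # toy "image"
print(np.unique(uniform_quantize(img)).size)           # 16 distinct levels remain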
Image Compression Model (cont’d)

 Mapper: transforms input data in a way that facilitates reduction


of interpixel redundancies.
Image Compression Model (cont’d)

 Quantizer: reduces the accuracy of the mapper’s output in


accordance with some pre-established fidelity criteria.
Image Compression Model (cont’d)

 Symbol encoder: assigns the shortest code to the most


frequently occurring output values.
Image Compression Models (cont’d)

 Inverse steps are performed.

 Note that quantization is irreversible in general.


Fidelity Criteria

 How close is the reconstructed image $\hat{f}(x,y)$ to the original $f(x,y)$?

 Criteria
Subjective: based on human observers
Objective: mathematically defined criteria
Subjective Fidelity Criteria
Quality Measure of a Compressed Image (Fidelity Criteria):
• The quality of such images can be evaluated by objective and subjective methods.
• Common objective quality measures are the root-mean-square (RMS) error and the (peak) signal-to-noise ratio between the original and reconstructed images.
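A sketch of these two objective measures for 8-bit images (NumPy assumed):

import numpy as np

def rms_error(f, f_hat):
    # Root-mean-square error between original f and reconstruction f_hat.
    e = f.astype(float) - f_hat.astype(float)
    return float(np.sqrt(np.mean(e ** 2)))

def psnr(f, f_hat, max_val=255.0):
    # Peak signal-to-noise ratio in dB; max_val = 255 for 8-bit images.
    mse = np.mean((f.astype(float) - f_hat.astype(float)) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

# Higher PSNR means a closer match; lossy image codecs typically land
# in the 30-50 dB range.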
Compression Methods
Lossless vs. Lossy
Entropy Coding
 Average information content (entropy) of an image:

$E = \sum_{k=0}^{L-1} I(r_k)\, P(r_k)$

using $I(r_k) = -\log_2 P(r_k)$:

$E = -\sum_{k=0}^{L-1} P(r_k)\, \log_2 P(r_k)$

Entropy is measured in units/pixel (e.g., bits/pixel).
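A sketch of this first-order entropy estimate computed from an 8-bit image's gray-level histogram (NumPy assumed):

import numpy as np

def entropy_bits_per_pixel(img):
    # E = -sum_k P(r_k) * log2 P(r_k), estimated from the histogram.
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                        # empty bins contribute 0 (0*log 0 -> 0)
    return float(-np.sum(p * np.log2(p)))

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(entropy_bits_per_pixel(img))      # close to 8 bits/pixel for uniform noise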
Huffman Encoding
 A = 0
B = 100
C = 1010
D = 1011
R = 11
 ABRACADABRA = 01001101010010110100110
 This is eleven letters in 23 bits
 A fixed-width encoding would require 3 bits for five different
letters, or 33 bits for 11 letters
 Notice that the encoded bit string can be decoded!
Why it works
 In this example, A was the most common letter
 In ABRACADABRA:
 5 As → code for A is 1 bit long
 2 Rs → code for R is 2 bits long
 2 Bs → code for B is 3 bits long
 1 C → code for C is 4 bits long
 1 D → code for D is 4 bits long
Creating a Huffman encoding
 For each encoding unit (letter, in this example), associate a
frequency (number of times it occurs)
 You can also use a percentage or a probability
 Create a binary tree node whose children are the two encoding units with the smallest frequencies
 The frequency of the root is the sum of the frequencies of the
leaves
 Repeat this procedure until all the encoding units are in the
binary tree
Example, step I
 Assume that relative frequencies are:
 A: 40
 B: 20
 C: 10
 D: 10
 R: 20
 (I chose simpler numbers than the real frequencies)
 The smallest numbers are 10 and 10 (C and D), so connect those
Example, step II
 C and D have already been used, and the new node above
them (call it C+D) has value 20
 The smallest values are B, C+D, and R, all of which have
value 20
 Connect any two of these
Example, step III
 The smallest value is R (20), while A and B+C+D both have value 40
 Connect R to either of the others
Example, step IV
 Connect the final two nodes
Example, step V
 Assign 0 to left branches, 1 to right branches
 Each encoding is a path from the root
 A = 0
B = 100
C = 1010
D = 1011
R = 11
 Each path terminates at a leaf
 Do you see why encoded strings are decodable?
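A sketch of this construction using a binary heap; ties between equal frequencies are broken arbitrarily, so the resulting code words may differ from those above while remaining equally optimal for the given frequencies:

import heapq
from itertools import count

def huffman_code(freqs):
    # Repeatedly merge the two lowest-frequency trees; each tree is stored
    # as {symbol: partial code word}, extended by one bit on every merge.
    tiebreak = count()                  # keeps heap entries comparable on ties
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}         # left branch: 0
        merged.update({s: "1" + c for s, c in right.items()})  # right branch: 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

print(huffman_code({"A": 40, "B": 20, "C": 10, "D": 10, "R": 20}))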
Practical considerations
 It is not practical to create a Huffman encoding for a single
short string, such as ABRACADABRA
 To decode it, you would need the code table
 If you include the code table in the entire message, the whole
thing is bigger than just the ASCII message
 Huffman encoding is practical if:
 The encoded string is large relative to the code table, OR
 We agree on the code table beforehand
 For example, it’s easy to find a table of letter frequencies for English (or
any other alphabet-based language)
Shannon-Fano Coding:
 Sort the source symbols with their probabilities in a
decreasing order.
 Divide the full set of symbols into 2 parts such that each part
has an equal or approximately equal probability.
 Code the symbols in the first part with bit 0, and the symbols
in the second part with bit 1.
 Continue the process recursively until each block has only
one symbol in it.
Example:

Symbol   Codeword
C        00
B        01
E        100
A        101
D        1100
H        1101
G        1110
F        1111
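A recursive sketch of the procedure above; the probabilities are hypothetical dyadic values (not given in the slides) chosen so that the output reproduces the example table:

def shannon_fano(symbols):
    # symbols: list of (symbol, probability), sorted by decreasing probability.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    # Split where the two parts have (approximately) equal total probability.
    best_diff, split = float("inf"), 1
    for i in range(1, len(symbols)):
        head = sum(p for _, p in symbols[:i])
        if abs(2 * head - total) < best_diff:
            best_diff, split = abs(2 * head - total), i
    codes = {s: "0" + c for s, c in shannon_fano(symbols[:split]).items()}
    codes.update({s: "1" + c for s, c in shannon_fano(symbols[split:]).items()})
    return codes

table = [("C", 0.25), ("B", 0.25), ("E", 0.125), ("A", 0.125),
         ("D", 0.0625), ("H", 0.0625), ("G", 0.0625), ("F", 0.0625)]
print(shannon_fano(table))
# {'C': '00', 'B': '01', 'E': '100', 'A': '101', 'D': '1100', 'H': '1101', ...}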
Arithmetic Coding
LZW Compression
 LZW compression compresses a file into a smaller one using a table-based lookup algorithm developed by Abraham Lempel, Jacob Ziv, and Terry Welch.
 When the LZW program starts to encode a file, the code table
contains only the first 256 entries, with the remainder of the table
being blank.
 This means that the first codes going into the compressed file are
simply the single bytes from the input file being converted to 12 bits.
 As the encoding continues, the LZW algorithm identifies repeated
sequences in the data, and adds them to the code table.
 Compression starts the second time a sequence is encountered.
 The key point is that a sequence from the input file is not added to the code table until it has already been placed in the compressed file as individual characters (codes 0 to 255). This is important because it allows the decompression program to reconstruct the code table directly from the compressed data, without having to transmit the code table separately.
LZW: Algorithm

LZW compression of the example string “ABABBABCABABBA” (initial dictionary: A = 1, B = 2, C = 3):

The output codes are: 1 2 4 5 2 3 4 6 1. Instead of sending 14 characters, only 9 codes need to be sent (compression ratio = 14/9 ≈ 1.56).
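A sketch of the encoder, using the example's 3-symbol starting dictionary rather than the 256 single-byte codes a real implementation would start with:

def lzw_compress(text, alphabet=("A", "B", "C")):
    table = {ch: i + 1 for i, ch in enumerate(alphabet)}   # A=1, B=2, C=3
    next_code = len(table) + 1
    current, output = "", []
    for ch in text:
        if current + ch in table:
            current += ch                     # keep growing the longest match
        else:
            output.append(table[current])     # emit code for the match so far
            table[current + ch] = next_code   # learn the new sequence
            next_code += 1
            current = ch
    output.append(table[current])             # flush the final match
    return output

print(lzw_compress("ABABBABCABABBA"))         # [1, 2, 4, 5, 2, 3, 4, 6, 1]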
LZW: Decompression

Input codes: 1 2 4 5 2 3 4 6 1
Decoded sequence: A B AB BA B C AB ABB A

Final result: ABABBABCABABBA
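A sketch of the matching decoder; it rebuilds the same code table on the fly, which is why the table never needs to be transmitted:

def lzw_decompress(codes, alphabet=("A", "B", "C")):
    table = {i + 1: ch for i, ch in enumerate(alphabet)}   # 1=A, 2=B, 3=C
    next_code = len(table) + 1
    prev = table[codes[0]]
    result = [prev]
    for code in codes[1:]:
        # A code can reference an entry still being defined; in that case
        # the entry must be prev + prev's first character.
        entry = table[code] if code in table else prev + prev[0]
        result.append(entry)
        table[next_code] = prev + entry[0]    # complete the pending entry
        next_code += 1
        prev = entry
    return "".join(result)

print(lzw_decompress([1, 2, 4, 5, 2, 3, 4, 6, 1]))   # "ABABBABCABABBA"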


Example: the 45-byte ASCII text string: the/rain/in/Spain/falls/mainly/on/the/plain.
