Updated - MT390 - Tutorial 6 - Spring 2020 - 21
Chapter 8:
Image Compression
Applications
Preview:
• Image Compression is the science of reducing
the amount of data required to represent an
image.
The following image is often used as a reference image in image compression studies:
8.1 Fundamentals
Compression Ratio:
▪ C = b/b'
▪ Here b and b' are the numbers of bits in two representations of the same information.
▪ If C = 10 (or 10:1), the larger representation has 10 bits of data for every 1 bit of data in the smaller representation.
▪ The relative data redundancy is R = 1 - 1/C.
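As a quick numerical check of these two definitions, the sketch below computes C and R in Python; the bit counts are illustrative assumptions, not values from the text.

```python
# Compression ratio C = b / b' and relative redundancy R = 1 - 1/C.
# The bit counts below are illustrative assumptions.

def compression_ratio(bits_original, bits_compressed):
    return bits_original / bits_compressed

def relative_redundancy(c):
    return 1 - 1 / c

b = 256 * 256 * 8        # an 8-bit 256x256 image
b_prime = b // 10        # assume the compressed representation is 10x smaller
c = compression_ratio(b, b_prime)
r = relative_redundancy(c)
print(f"C = {c:.1f}:1, R = {r:.2f}")   # C = 10.0:1, R = 0.90
```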
3 Types of Redundancy in Images:
1. Coding Redundancy: The 8-bit codes used
to represent image intensities contain more
bits than needed.
Let us assume, once again, that a discrete random variable r_k represents the gray levels of an image and that each r_k occurs with probability
p_r(r_k) = n_k / n,   k = 0, 1, 2, …, L-1
where L is the number of gray levels, n_k is the number of times that the kth gray level appears in the image, and n is the total number of pixels in the image.
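A minimal NumPy sketch of this definition: p_r(r_k) is estimated from the image histogram. The random test image is only a placeholder so the snippet is self-contained.

```python
import numpy as np

# Estimate p_r(r_k) = n_k / n for an 8-bit image (L = 256 gray levels).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)  # placeholder image

L = 256
n_k = np.bincount(image.ravel(), minlength=L)   # number of pixels at each gray level
p_r = n_k / image.size                          # probability of each gray level
assert np.isclose(p_r.sum(), 1.0)
```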
Example 8.1: A simple illustration of variable-length coding
The computer-generated image in Fig. 8.1(a) has the intensity distribution shown in the second column of Table 8.1. If a natural 8-bit binary code (denoted as Code 1 in Table 8.1) is used to represent its 4 possible intensities, L_avg, the average number of bits for Code 1, is 8 bits, because l_1(r_k) = 8 bits for all r_k.
Example of Variable-Length Coding (continued):
➢ On the other hand, if the scheme designated as Code 2 in Table 8.1 is used, the average length of the encoded pixels, in accordance with Eq. (8.1-4), is:
➢ L_avg = 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81 bits/pixel.
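The same calculation can be written directly in code. The probabilities and code lengths below are the Code 2 values quoted above; the comparison against the 8-bit code is simple arithmetic.

```python
# Average code length L_avg = sum_k l(r_k) * p_r(r_k) for Code 2 of Table 8.1.
probs   = [0.25, 0.47, 0.25, 0.03]   # p_r(r_k) from the table
lengths = [2, 1, 3, 3]               # l_2(r_k), bits per symbol in Code 2

l_avg = sum(l * p for l, p in zip(lengths, probs))
print(l_avg)        # 1.81 bits/pixel
print(8 / l_avg)    # ~4.42:1 compression versus the natural 8-bit code
```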
Spatial and Temporal Redundancy:
▪ Nearby pixels are spatially correlated. In video, successive frames are temporally correlated.
▪ Consider Fig. 8.1(b):
1) Its histogram is uniform: all 256 intensities are equally probable.
2) Vertical direction: pixels are independent of one another.
3) Horizontal direction: pixels are dependent (the pixels along each line are identical).
Run-Length Encoding:
▪ Points 2 and 3 above show spatial redundancy.
▪ Run-length encoding represents each line by a sequence of run-length pairs.
▪ A run-length based representation compresses the original 2-D, 8-bit intensity array by
▪ C = (256×256×8) / [(256+256)×8] = 128:1
▪ Run-length coding is an example of a mapping.
▪ A mapping is said to be reversible if the pixels of the original 2-D intensity array can be reconstructed without error.
▪ Otherwise the mapping is said to be irreversible.
Irrelevant Information
The fundamental premise of information theory is that the generation
of information can be modeled as a probabilistic process that can be
measured in a manner that agrees with intuition.
In accordance with this supposition, a random event E that occurs with
probability P(E) is said to contain
I(E) = log(1/P(E)) = -log P(E)
units of information.
- The quantity I(E) is often called the self-information of E. If the base-2 logarithm is used, the unit of information is the bit (so, averaged over an image, information is measured in bits/pixel).
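A small sketch of self-information in code. The entropy helper at the end is a standard extension (the average self-information of an image, in bits/pixel) that is not spelled out on this slide.

```python
import numpy as np

def self_information(p, base=2):
    """I(E) = -log(P(E)); with base 2 the result is in bits."""
    return -np.log(p) / np.log(base)

print(self_information(0.5))    # 1.0 bit: a 50/50 event carries one bit
print(self_information(0.99))   # ~0.014 bits: a near-certain event carries little information

# Standard extension: first-order entropy estimate of an 8-bit image,
# i.e. the average self-information in bits/pixel.
def entropy_bits_per_pixel(image):
    p = np.bincount(image.ravel(), minlength=256) / image.size
    p = p[p > 0]                          # ignore gray levels that never occur
    return float(-(p * np.log2(p)).sum())
```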
The removal of "irrelevant visual" information involves a loss of real or quantitative image information. Because information is lost, a means of quantifying the nature of the loss is needed. Two types of criteria can be used for such an assessment:
Objective Fidelity Criteria:
- When the level of information loss can be expressed as a function of the original or input image and the compressed and subsequently decompressed output image, it is said to be based on an objective fidelity criterion.
- A good example is the root-mean-square (rms) error between an input and output image.
- Let f(x, y) represent an input image and let f̂(x, y) denote an estimate or approximation of f(x, y) that results from compressing and subsequently decompressing the input.
- For any value of x and y, the error e(x, y) between f(x, y) and f̂(x, y) can be defined as
e(x, y) = f̂(x, y) - f(x, y)
- so that the total error between the two M×N images is
Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f̂(x, y) - f(x, y)]
- The root-mean-square error, e_rms, between f(x, y) and f̂(x, y) is then the square root of the squared error averaged over the M×N array, or
e_rms = [ (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f̂(x, y) - f(x, y)]^2 ]^{1/2}
- A closely related objective fidelity criterion is the mean-square signal-to-noise ratio of the compressed-decompressed image.
- If f̂(x, y) is considered [by a simple rearrangement of the terms] to be the sum of the original image f(x, y) and a noise signal e(x, y), the mean-square signal-to-noise ratio of the output image, denoted SNR_ms, is
SNR_ms = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f̂(x, y)^2 / Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f̂(x, y) - f(x, y)]^2
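Both criteria translate directly into NumPy; the sketch below assumes the two images are arrays of the same M×N size.

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between an original image f and its approximation f_hat."""
    e = f_hat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(e ** 2))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the compressed-decompressed image f_hat."""
    f = f.astype(np.float64)
    f_hat = f_hat.astype(np.float64)
    noise_power = np.sum((f_hat - f) ** 2)
    return np.sum(f_hat ** 2) / noise_power
```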
Subjective Fidelity Criteria:
• Done by people.
• Viewers evaluate images either side-by-side
or using a rating scale.
Can you subjectively rank these?
Image Compression Models
More about the Block Diagram:
Mapper:
The mapper transforms the input image into a (usually nonvisual) format designed to reduce interpixel redundancies.
Quantizer:
It reduces the accuracy of the mapper's output according to a predefined fidelity criterion, attempting to eliminate only psychovisually redundant data. This operation is irreversible and must be omitted if error-free compression is desired.
Symbol Coder:
This creates a code (that reduces coding
redundancy) for the Quantizer output and maps
the output in accordance with the code.
Decoder:
It has only 2 components:
Symbol decoder:
It performs the inverse operation of the symbol
coder.
Inverse Mapper:
It performs the inverse operation of the Mapper.
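To make the data flow of this model concrete, here is a toy sketch of the three encoder stages. The particular choices (row differencing as the mapper, coarse rounding as the quantizer, and a placeholder symbol coder) are illustrative assumptions, not the components of any specific standard.

```python
import numpy as np

def mapper(image):
    """Reduce interpixel redundancy: keep the first column, difference the rest row-wise."""
    out = image.astype(np.int16).copy()
    out[:, 1:] -= image[:, :-1].astype(np.int16)
    return out

def quantizer(mapped, step=4):
    """Irreversible stage: coarser values mean fewer symbols to code (omit for error-free compression)."""
    return (mapped // step) * step

def symbol_coder(quantized):
    """Placeholder symbol coder; a real system would assign variable-length (e.g. Huffman) codes."""
    return quantized.ravel().tolist()

def encode(image):
    # Mapper -> quantizer -> symbol coder, matching the block diagram above.
    return symbol_coder(quantizer(mapper(image)))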
Coding Redundancy
Let the discrete random variable r_k, for k = 1, 2, …, L, with associated probabilities p_r(r_k), represent the gray levels of an L-gray-level image.
r_1 corresponds to gray level 0.
The probability p_r(r_k) is given by:
p_r(r_k) = n_k / n,   k = 1, 2, …, L
In the above equation, n_k is the number of times the kth gray level appears in the image and n is the total number of pixels.
If the number of bits used to represent each value of r_k is l(r_k), then the average number of bits required to represent each pixel is:
L_avg = Σ_{k=1}^{L} l(r_k) p_r(r_k)
The above equation shows that the average length of the code words assigned to the various gray-level values is found by summing the products of the number of bits used to represent each gray level and the probability that the gray level occurs.
Thus the total number of bits required to code an M×N image is MN·L_avg.
Consider Table below:
▪ The previous table shows that coding redundancy is almost always present when the gray levels of an image are coded using a natural binary code.
▪ In this table, both the fixed- and the variable-length encodings of a 4-level image are given.
Code 1:
Code 2:
The average number of bits required by Code 2 is:
3(0.1875) + 1(0.5) + 3(0.125) + 2(0.1875) = 1.8125
Compression Ratio = 2/1.8125 = 1.103
Why Code 2 performed better than Code 1: it assigns the shortest code words to the most probable gray levels, so the average code length is lower than with the fixed-length code.
Types of Compression: lossless (error-free) and lossy.
8.2 Some Basic Compression Methods
▪ Error-free
Huffman Coding
Huffman Coding: Step 1 (source reduction): order the symbol probabilities and repeatedly combine the two least probable symbols into a single symbol, until a reduced source with only two symbols remains.
Huffman Coding: Step 2 (code assignment): start with the two-symbol reduced source and assign the bits 0 and 1; then work backwards, appending 0 and 1 to the code of each pair of symbols that was combined at each reduction step.
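Since the worked source-reduction and code-assignment tables are not reproduced here, the sketch below builds a Huffman code for the Example 8.1 probabilities using Python's heapq. The exact bit patterns may differ from Code 2 in Table 8.1, but the average length comes out to the same 1.81 bits/symbol.

```python
import heapq
import itertools

def huffman_code(probabilities):
    """Build a Huffman code {symbol: bitstring} from {symbol: probability}."""
    counter = itertools.count()   # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(counter), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)                 # two least probable groups...
        p2, _, codes2 = heapq.heappop(heap)
        codes1 = {s: "0" + c for s, c in codes1.items()}    # ...get prefixed with 0 and 1
        codes2 = {s: "1" + c for s, c in codes2.items()}
        heapq.heappush(heap, (p1 + p2, next(counter), {**codes1, **codes2}))
    return heap[0][2]

probs = {"a1": 0.25, "a2": 0.47, "a3": 0.25, "a4": 0.03}    # Example 8.1 distribution
codes = huffman_code(probs)
l_avg = sum(len(codes[s]) * p for s, p in probs.items())
print(codes, l_avg)    # average length ~1.81 bits/symbol
```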
Run-Length Coding
The basic concept is to code each contiguous group of 0's and 1's encountered in a left-to-right scan of a row by its length, and to establish a convention for determining the value of each run.
Example
11111111111111111111111111111111111111111111111111111
11111111111111111111111111100000000000000000000000000
0011111111111111111111
-- in other words, the sequence
80 ones
28 zeros
20 ones
So, we might transmit or store a binary coded
representation of
80(1)28(0)20(1)
Or header, 80, 28, 20
Example 2:
What if we have the following image ?
Solution:
64(01)
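A minimal run-length encoder/decoder for a binary row, following the length/value convention used in the examples above. The mapping is reversible, which is what makes it usable for error-free compression.

```python
from itertools import groupby

def rle_encode(row):
    """Encode a 1-D binary sequence as (run_length, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]

def rle_decode(pairs):
    """Reversible mapping: rebuild the original sequence from the pairs."""
    return [value for length, value in pairs for _ in range(length)]

row = [1] * 80 + [0] * 28 + [1] * 20      # the example sequence above
pairs = rle_encode(row)
print(pairs)                               # [(80, 1), (28, 0), (20, 1)]
assert rle_decode(pairs) == row            # lossless reconstruction
```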
Block Transform Coding:
Approximations of Fig. 8.9(a) using the (a) Fourier, (b) Walsh-
Hadamard, and (c) cosine transforms, together with the
corresponding scaled error images in (d)–(f).
Transform Selection:
It is given by (note: N = n for an n×n block):
Notes about DCT:
It is:
1.
2.
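To make block transform coding concrete, the sketch below applies an 8×8 DCT to each block of an image, keeps only the largest-magnitude 25% of the coefficients in every block, and inverts the transform. The DCT matrix is built directly in NumPy, so no transform library is assumed, and the image dimensions are assumed to be multiples of the block size.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so the block transform is C @ block @ C.T."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def block_dct_approx(image, n=8, keep_fraction=0.25):
    """Transform each n x n block, zero all but the largest-magnitude coefficients, invert."""
    C = dct_matrix(n)
    out = np.zeros(image.shape, dtype=np.float64)
    h, w = image.shape                      # assumed to be multiples of n
    for i in range(0, h, n):
        for j in range(0, w, n):
            block = image[i:i+n, j:j+n].astype(np.float64)
            coeffs = C @ block @ C.T
            thresh = np.quantile(np.abs(coeffs), 1 - keep_fraction)
            coeffs[np.abs(coeffs) < thresh] = 0.0   # discard the small coefficients
            out[i:i+n, j:j+n] = C.T @ coeffs @ C    # inverse transform
    return out
```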
Sub-images:
Reconstruction error versus subimage size.
Approximations of Fig. 8.24(a) using 25% of the DCT coefficients and (b) 2×2 subimages, (c) 4×4 subimages, and (d) 8×8 subimages. The original image in (a) is a zoomed section of Fig. 8.9(a).
Original image and compressed image
85% of Coefficients Discarded:
Two JPEG (based on DCT) approximations of Fig. 8.9(a). Each row
contains a result after compression and reconstruction, the
scaled difference between the result and the original image, and
a zoomed portion of the reconstructed image.
1. Lossless Predictive Coding:
Prediction Error Calculation:
A lossless predictive coding model: (a) encoder; (b) decoder.
(a) A view of the Earth from an orbiting space shuttle. (b) The
intensity histogram of (a). (c) The prediction error image
resulting from Eq. (8-33). (d) A histogram of the prediction
error. (Original image courtesy of NASA.)
▪ The prediction is formed as a linear combination of m previous samples:
f̂(n) = round[ Σ_{i=1}^{m} α_i f(n-i) ]
where the α_i are the prediction coefficients.
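A sketch of the simplest case, m = 1 with α₁ = 1 (previous-pixel prediction along each row). The encoder keeps the first column unchanged and codes prediction errors for the rest; the decoder reverses the mapping exactly, so the scheme is lossless.

```python
import numpy as np

def predictive_encode(image):
    """First-order predictor f_hat(x, y) = f(x, y-1): keep column 0, store row-wise prediction errors."""
    f = image.astype(np.int16)
    errors = f.copy()
    errors[:, 1:] = f[:, 1:] - f[:, :-1]     # e = f - f_hat
    return errors

def predictive_decode(errors):
    """Exact inverse of the mapping (lossless): cumulative sum along each row."""
    return np.cumsum(errors, axis=1).astype(np.uint8)
```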
A lossy predictive coding model: (a) encoder; (b) decoder.
Notes:
8.11 Wavelet Coding:
Three-scale wavelet transforms of Fig. 8.9(a) with
respect to (a) Haar wavelets, (b) Daubechies wavelets,
(c) symlets, and (d) Cohen-Daubechies-Feauveau
biorthogonal wavelets.
Decomposition level impact on wavelet coding the
image of Fig. 8.9(a).
❑ JPEG: Based on the DCT
❑ JPEG 2000: Based on the DWT
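To connect the DWT to something runnable, here is a single-level 2-D Haar decomposition and its exact inverse written in plain NumPy (no wavelet library is assumed, and the image dimensions are assumed to be even). A wavelet coder such as JPEG 2000 would then quantize and entropy-code the resulting subbands.

```python
import numpy as np

def haar_decompose(image):
    """One-level 2-D Haar transform: returns the approximation LL and detail subbands LH, HL, HH."""
    f = image.astype(np.float64)
    # 1-D Haar along rows, then along columns (orthonormal scaling 1/sqrt(2) each pass).
    lo = (f[:, 0::2] + f[:, 1::2]) / np.sqrt(2)
    hi = (f[:, 0::2] - f[:, 1::2]) / np.sqrt(2)
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh

def haar_reconstruct(ll, lh, hl, hh):
    """Invert the one-level Haar transform exactly."""
    lo = np.zeros((ll.shape[0] * 2, ll.shape[1]))
    hi = np.zeros_like(lo)
    lo[0::2, :], lo[1::2, :] = (ll + lh) / np.sqrt(2), (ll - lh) / np.sqrt(2)
    hi[0::2, :], hi[1::2, :] = (hl + hh) / np.sqrt(2), (hl - hh) / np.sqrt(2)
    out = np.zeros((lo.shape[0], lo.shape[1] * 2))
    out[:, 0::2], out[:, 1::2] = (lo + hi) / np.sqrt(2), (lo - hi) / np.sqrt(2)
    return out
```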
Figure 8.46 shows four JPEG-2000 approximations of the
monochrome image in Figure 8.9(a).
Successive rows of the figure illustrate increasing levels of
compression, including C = 25, 52, 75, and 105.
The images in column 1 are decompressed JPEG-2000
encodings.
The differences between these images and the original image
[see Fig. 8.9(a)] are shown in the second column, and the
third column contains a zoomed portion of the
reconstructions in column 1.
Because the compression ratios for the first two rows are
virtually identical to the compression ratios in Example 8.18,
these results can be compared (both qualitatively and
quantitatively) to the JPEG transform-based results in Figs.
8.29(a) through (f).
Four JPEG-2000 approximations of Fig. 8.9(a). Each row contains a result after
compression and reconstruction, the scaled difference between the result and the
original image, and a zoomed portion of the reconstructed image. (Compare the results
in rows 1 and 2 with the JPEG results in Fig. 8.29.).
A visual comparison of the error images in rows 1 and 2 of Fig.
8.46 with the corresponding images in Figs. 8.29(b) and (e)
reveals a noticeable decrease of error in the JPEG-2000 results—
3.86 and 5.77 intensity levels, as opposed to 5.4 and 10.7
intensity levels for the JPEG results.
Some of the applications of DIP techniques include: