
MT390 (DIP): Tutorial 6

Chapter 8:
Image Compression

Applications

Preview:
• Image Compression is the science of reducing
the amount of data required to represent an
image.

• A 2-hour SD TV movie stored as a 720x480x24-bit pixel array (at 30 frames/s) requires about 224 GB of storage.

• Web page images and camera photos also need to be compressed to save storage.
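As a quick check of that figure, a short Python calculation (the 30 frames/s rate is a typical SD assumption, not stated on the slide):

```python
# Rough storage estimate for a 2-hour SD movie stored as raw 720x480x24-bit frames.
# Assumes 30 frames per second (typical SD rate; an assumption, not from the slide).
width, height, bytes_per_pixel = 720, 480, 3   # 24 bits = 3 bytes
frames_per_second = 30
duration_seconds = 2 * 60 * 60

bytes_per_frame = width * height * bytes_per_pixel
total_bytes = bytes_per_frame * frames_per_second * duration_seconds
print(f"{total_bytes / 1e9:.1f} GB")           # about 224 GB
```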

The following image is commonly used as a reference image in image compression studies:

8.1 Fundamentals

Data and Information:

• Various amounts of data can represent the same amount of information.
• Hence, data and information are not the same thing.

Compression Ratio:
▪ C=b/b’
▪ Here b and b’ are the number of bits in two
representations of the same information.
▪ If C=10 (or 10:1) then the larger
representation has 10 bits of data for every 1
bit of data in the smaller representation.

Relative Data Redundancy:

▪ R=1-1/C

3 Types of Redundancy in Images:
1. Coding Redundancy: The 8-bit codes used to represent image intensities contain more bits than needed.

2. Spatial and Temporal Redundancy: Nearby pixels are correlated spatially. In videos, frames are correlated temporally.

3. Irrelevant Information: Most images contain information that humans ignore.

Let us assume, once again, that a discrete random variable rk represents the gray levels of an image and that each rk occurs with probability

pr(rk) = nk / n,   k = 0, 1, 2, …, L−1

where L is the number of gray levels, nk is the number of times that the kth gray level appears in the image, and n is the total number of pixels in the image.

If the number of bits used to represent each value of rk is l(rk), then the average number of bits required to represent each pixel is

Lavg = Σ (k = 0 to L−1) l(rk) pr(rk)

Example 8.1: A simple illustration of variable-length coding

The computer-generated image in Fig. 8.1(a) has the intensity distribution shown in the second column of Table 8.1. If a natural 8-bit binary code (denoted as code 1 in Table 8.1) is used to represent its 4 possible intensities, Lavg, the average number of bits for code 1, is 8 bits, because l1(rk) = 8 bits for all rk.

Table 8.1: Example of variable-length encoding

Example of Variable-Length Coding (continued):
➢ On the other hand, if the scheme designated as code 2 in Table 8.1 is used, the average length of the encoded pixels, in accordance with Eq. (8.1-4), is:
➢ Lavg = 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81 bits.

➢ The total number of bits needed to represent the entire image is MNLavg = 256x256x1.81 = 118,621.

➢ Compression ratio = C = 256x256x8 / 118,621 ≈ 4.42.

➢ Relative redundancy = R = 1 − 1/C = 1 − 1/4.42 = 0.774.

➢ The compression achieved by code 2 results from assigning fewer bits to the more probable intensity values than to the less probable ones. For example, r128 (the image's most probable intensity) is assigned the 1-bit code word 1 [of length l2(r128) = 1], while r255 (its least probable intensity) is assigned the 3-bit code word 001 [of length l2(r255) = 3].
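These figures can be checked with a short Python sketch (probabilities and code-word lengths as quoted from Table 8.1 above; variable names are illustrative):

```python
# Code-2 probabilities and code-word lengths for the 4 intensities of Fig. 8.1(a).
probs   = [0.25, 0.47, 0.25, 0.03]
lengths = [2, 1, 3, 3]

l_avg = sum(p * l for p, l in zip(probs, lengths))   # 1.81 bits/pixel
total_bits = 256 * 256 * l_avg                       # ~118,621 bits
C = (256 * 256 * 8) / total_bits                     # ~4.42
R = 1 - 1 / C                                        # ~0.774
print(f"Lavg={l_avg:.2f}, bits={total_bits:.0f}, C={C:.2f}, R={R:.3f}")
```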

Spatial and Temporal Redundancy:
▪ Nearby pixels are correlated spatially. In videos, frames are correlated temporally.
▪ Consider Fig. 8.1(b):
1) Its histogram is uniform; all 256 intensities are equally probable.
2) Vertical direction: pixels are independent.
3) Horizontal direction: pixels are dependent.

Run-Length Encoding:
▪ Points 2 and 3 above show spatial redundancy.
▪ Run-length encoding represents each line as a sequence of run-length pairs.
▪ A run-length based representation compresses the original 2-D, 8-bit intensity array by
▪ C = 256x256x8 / [(256+256)x8] = 128:1
▪ Run-length coding is an example of a mapping.
▪ A mapping is said to be reversible if the pixels of the original 2-D intensity array can be reconstructed from it without error.
▪ Otherwise, the mapping is said to be irreversible.

Irrelevant Information

▪ Remove information that is ignored by the human visual system.
▪ Consider Fig. 8.1(c):
▪ It consists largely of a homogeneous field of gray.
▪ It can be represented by its average intensity alone, a single 8-bit value.
▪ Hence C = 256x256x8/8 = 65,536:1.
▪ Unlike the redundancies discussed before, this elimination is possible because the information itself is not essential for normal visual processing.
▪ This omission results in a loss of quantitative information, referred to as quantization.
▪ Because information is lost, quantization is an irreversible operation.

The fundamental premise of information theory is that the generation of information can be modeled as a probabilistic process that can be measured in a manner that agrees with intuition.
In accordance with this supposition, a random event E that occurs with probability P(E) is said to contain

I(E) = log(1/P(E)) = −log P(E)

units of information.
- The quantity I(E) is often called the self-information of E.
- The amount of self-information attributed to event E is inversely related to the probability of E.
- The average information per source output, called the entropy, is defined as:

H = −Σ (k = 0 to L−1) pr(rk) log2 pr(rk)   bits/pixel
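A minimal Python sketch of this entropy estimate, computed from an image's normalized gray-level histogram (NumPy assumed; the toy image is illustrative):

```python
import numpy as np

def entropy_bits_per_pixel(image):
    """First-order entropy estimate from the normalized gray-level histogram."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                        # ignore gray levels that never occur
    return float(-np.sum(p * np.log2(p)))

img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)   # toy image
print(entropy_bits_per_pixel(img))      # close to 8 bits/pixel for uniform noise
```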

The removal of “irrelevant visual” information involves a loss of real or quantitative image information. Because information is lost, a means of quantifying the nature of the loss is needed. Two types of criteria can be used for such an assessment:

▪ Objective Fidelity Criteria
▪ Subjective Fidelity Criteria

Objective Fidelity Criteria:
- When the level of information loss can be expressed as a function of the original or input image and the compressed and subsequently decompressed output image, it is said to be based on an objective fidelity criterion.
- A good example is the root-mean-square (rms) error between an input and output image.
- Let f(x, y) represent an input image and let f̂(x, y) denote an estimate or approximation of f(x, y) that results from compressing and subsequently decompressing the input.
- For any value of x and y, the error e(x, y) between f(x, y) and f̂(x, y) can be defined as

e(x, y) = f̂(x, y) − f(x, y)

- so that the total error between the two images is

Σ (x = 0 to M−1) Σ (y = 0 to N−1) [f̂(x, y) − f(x, y)]


- The root-mean-square error, erms, between f(x, y) and f̂(x, y) is then the square root of the squared error averaged over the M x N array, or

erms = [ (1/MN) Σ (x = 0 to M−1) Σ (y = 0 to N−1) [f̂(x, y) − f(x, y)]² ]^(1/2)

- A closely related objective fidelity criterion is the mean-square signal-to-noise ratio of the compressed-decompressed image.
- If f̂(x, y) is considered [by a simple rearrangement of the terms] to be the sum of the original image f(x, y) and a noise signal e(x, y), the mean-square signal-to-noise ratio of the output image, denoted SNRms, is

SNRms = [ Σ (x = 0 to M−1) Σ (y = 0 to N−1) f̂(x, y)² ] / [ Σ (x = 0 to M−1) Σ (y = 0 to N−1) [f̂(x, y) − f(x, y)]² ]

- The rms value of the signal-to-noise ratio, denoted SNRrms, is obtained by taking the square root of SNRms.
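A minimal NumPy sketch of both measures (f is the original image and f_hat the compressed-then-decompressed approximation; both assumed to be same-shape arrays):

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between the original f and the decompressed f_hat."""
    e = f_hat.astype(float) - f.astype(float)
    return float(np.sqrt(np.mean(e ** 2)))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the output image."""
    f = f.astype(float)
    f_hat = f_hat.astype(float)
    return float(np.sum(f_hat ** 2) / np.sum((f_hat - f) ** 2))

# SNR_rms is simply the square root of snr_ms(f, f_hat).
```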

Subjective Fidelity Criteria:

• Evaluations are made by human viewers.
• Viewers rate images either side-by-side or on an absolute rating scale (e.g., a mean opinion score).

Can you subjectively rank these?

Image Compression Models

An image compression system is composed of two distinct functional components: an encoder and a decoder. The encoder performs compression and the decoder performs the complementary operation of decompression. Both operations can be performed in software, as in a codec program.

More about the Block Diagram:
Mapper:
The mapper transforms the input image into a (usually nonvisual) format designed to reduce interpixel redundancies.
Quantizer:
It reduces the accuracy of the mapper's output in accordance with a predefined fidelity criterion, attempting to eliminate only psychovisually redundant data. This operation is irreversible and must be omitted when error-free compression is desired.

Symbol Coder:
This creates a code (that reduces coding
redundancy) for the Quantizer output and maps
the output in accordance with the code.

Decoder:
It has only 2 components:

Symbol decoder:
It performs the inverse operation of the symbol
coder.

Inverse Mapper:
It performs the inverse operation of the Mapper.
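As a toy end-to-end illustration of these components (all function names illustrative; a sketch rather than the book's algorithm), the mapper below takes horizontal pixel differences, the quantizer coarsens them, and the symbol coder/decoder stages are omitted for brevity:

```python
import numpy as np

def mapper(img):
    """Toy mapper: horizontal pixel differences to reduce spatial redundancy (reversible)."""
    d = img.astype(int).copy()
    d[:, 1:] = np.diff(img.astype(int), axis=1)
    return d

def quantizer(d, step=4):
    """Toy quantizer: coarsens the mapper output (irreversible; omit for lossless coding)."""
    return np.round(d / step).astype(int) * step

def inverse_mapper(d):
    """Inverse mapper: undo the horizontal differencing with a cumulative sum."""
    return np.cumsum(d, axis=1)

img = np.random.randint(0, 256, (4, 8))
lossless = inverse_mapper(mapper(img))            # exact when the quantizer is omitted
lossy = inverse_mapper(quantizer(mapper(img)))    # approximate reconstruction
print(np.array_equal(lossless, img), np.abs(lossy - img).max())
```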

Coding Redundancy
▪ Let the discrete random variable rk, for k = 1, 2, …, L, with associated probabilities pr(rk), represent the gray levels of an L-gray-level image.
▪ r1 corresponds to gray level 0.
▪ The probability pr(rk) is given by:
pr(rk) = nk / n,   k = 1, 2, …, L
▪ In the above equation, nk is the number of times the kth gray level appears in the image and n is the total number of pixels.

If the number of bits used to represent each value of rk is l(rk), then the average number of bits required to represent each pixel is:

Lavg = Σ (k = 1 to L) l(rk) pr(rk)

The above equation shows that the average length of the code words assigned to the various gray-level values is found by summing the products of the number of bits used to represent each gray level and the probability that the gray level occurs.
Thus the total number of bits required to code an M x N image is MNLavg.

Consider the table below:

▪ The previous table shows that coding redundancy is almost always present when the gray levels of an image are coded using a natural binary code.
▪ In this table, both the fixed- and the variable-length encodings of a 4-level image are given.

Code 1:
Code 1 uses a 2-bit binary encoding and has an average length of 2 bits.

Code 2:
The average number of bits required by code 2 is:

Lavg = Σ (k = 1 to L) l(rk) pr(rk) = 3(0.1875) + 1(0.5) + 3(0.125) + 2(0.1875) = 1.8125 bits

Compression ratio = 2 / 1.8125 ≈ 1.103

Why Code 2 performed better than Code 1:

Its code words are of varying length, allowing the shortest code words to be assigned to the gray levels that occur more frequently in the image.

Types of Compression:

Lossless Compression (Error-free): Exact reconstruction.

Lossy Compression: Inexact reconstruction.

8.2 Some Basic Compression Methods
▪ Error-free

Huffman Coding

Huffman Coding: Step 1

Huffman Coding: Step 2

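The source-reduction and code-assignment steps illustrated in the (omitted) figures can be sketched in Python with the standard heap-based construction; the symbol probabilities below are illustrative, not taken from the slides:

```python
import heapq

def huffman_codes(probabilities):
    """Build a Huffman code {symbol: bit string} from {symbol: probability}."""
    # Each heap entry: (group probability, tie-breaker, {symbol: partial code}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)    # two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Illustrative source (probabilities sum to 1).
probs = {"a1": 0.4, "a2": 0.3, "a3": 0.1, "a4": 0.1, "a5": 0.06, "a6": 0.04}
codes = huffman_codes(probs)
avg_len = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, f"Lavg = {avg_len:.2f} bits/symbol")
```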
Run-Length Coding
The basic concept is to code each contiguous group of 0's and 1's encountered in a left-to-right scan of a row by its length, and to establish a convention for determining the value of the run.
Example

11111111111111111111111111111111111111111111111111111
11111111111111111111111111100000000000000000000000000
0011111111111111111111
-- in other words, the sequence
80 ones
28 zeros
20 ones
So, we might transmit or store a binary coded
representation of
80(1)28(0)20(1)
Or header, 80, 28, 20
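A minimal Python sketch of this run-length mapping, reproducing the 80/28/20 counts above (function names are illustrative):

```python
from itertools import groupby

def run_length_encode(bits):
    """Encode a binary string as (value, run length) pairs, scanning left to right."""
    return [(value, len(list(group))) for value, group in groupby(bits)]

def run_length_decode(pairs):
    """Reverse mapping: expand (value, run length) pairs back to the bit string."""
    return "".join(value * length for value, length in pairs)

bits = "1" * 80 + "0" * 28 + "1" * 20
pairs = run_length_encode(bits)
print(pairs)                              # [('1', 80), ('0', 28), ('1', 20)]
print(run_length_decode(pairs) == bits)   # True: the mapping is reversible
```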

▪ Example 2:
▪ What if we have the following image?

Solution:

64(01)

Block Transform Coding:

Approximations of Fig. 8.9(a) using the (a) Fourier, (b) Walsh-
Hadamard, and (c) cosine transforms, together with the
corresponding scaled error images in (d)–(f).
Transform Selection:

The values T(u, v) produced by the forward transform are called transform coefficients.

For the DCT, the basis functions are given by (note: N = n for an n x n block):

r(x, y, u, v) = s(x, y, u, v) = α(u) α(v) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]

where α(u) = √(1/n) for u = 0 and α(u) = √(2/n) for u = 1, 2, …, n−1.
Notes about DCT:
It is:
1.

2.

Sub-images:

Reconstruction error versus subimage size.
Approximations of Fig. 8.24(a) using 25% of the DCT coefficients and (b) 2×2 subimages, (c) 4×4 subimages, and (d) 8×8 subimages. The original image in (a) is a zoomed section of Fig. 8.9(a).
Original image and compressed image

85% of Coefficients Discarded:

Although there is some loss of quality in the reconstructed image, it is clearly recognizable, even though almost 85% of the DCT coefficients were discarded.
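As a rough sketch of the underlying mechanism (not the exact zonal/threshold masks used for the figures), the following NumPy code applies a hand-rolled 2-D DCT to an 8x8 block, zeroes all but the largest ~15% of coefficients, and inverts:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix for an n x n block."""
    k, x = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.cos((2 * x + 1) * k * np.pi / (2 * n)) * np.sqrt(2.0 / n)
    C[0, :] = np.sqrt(1.0 / n)
    return C

def compress_block(block, keep=0.15, C=dct_matrix()):
    """DCT the block, zero all but the largest 'keep' fraction of coefficients, invert."""
    T = C @ block @ C.T                        # forward 2-D DCT
    thresh = np.quantile(np.abs(T), 1 - keep)  # keep roughly 15% of the coefficients
    T[np.abs(T) < thresh] = 0
    return C.T @ T @ C                         # inverse 2-D DCT

block = np.random.randint(0, 256, (8, 8)).astype(float)
approx = compress_block(block)
print(np.abs(approx - block).mean())           # nonzero: the approximation is lossy
```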

Two JPEG (based on DCT) approximations of Fig. 8.9(a). Each row
contains a result after compression and reconstruction, the
scaled difference between the result and the original image, and
a zoomed portion of the reconstructed image.
1. Lossless Predictive Coding:

Code only the new info:

Prediction Error Calculation:

A lossless predictive coding model: (a) encoder; (b) decoder.
(a) A view of the Earth from an orbiting space shuttle. (b) The
intensity histogram of (a). (c) The prediction error image
resulting from Eq. (8-33). (d) A histogram of the prediction
error. (Original image courtesy of NASA.)
▪ The prediction is formed as a linear combination of m previous samples:
▪ 1-D: samples come from the current scan line (see the sketch below)
▪ 2-D: samples come from the current & previous scan lines
▪ 3-D: samples come from the current image and previous images
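A minimal 1-D sketch (first-order predictor with m = 1 and α = 1, NumPy assumed; an illustration, not the exact predictor used for the figures):

```python
import numpy as np

def predictive_encode(row, alpha=1.0):
    """Lossless 1-D predictive coding: keep the first sample, then prediction errors."""
    row = row.astype(int)
    pred = np.round(alpha * row[:-1]).astype(int)   # predict each sample from its left neighbor
    return row[0], row[1:] - pred                   # (seed, prediction-error sequence)

def predictive_decode(seed, errors, alpha=1.0):
    """Rebuild the row by adding each error to the prediction from the previous output."""
    out = [int(seed)]
    for e in errors:
        out.append(int(round(alpha * out[-1])) + int(e))
    return np.array(out)

row = np.random.randint(0, 256, 16)
seed, err = predictive_encode(row)
print(np.array_equal(predictive_decode(seed, err), row))   # True: lossless
```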

A lossy predictive coding model: (a) encoder; (b) decoder.
Notes:

8.11 Wavelet Coding:

Three-scale wavelet transforms of Fig. 8.9(a) with respect to (a) Haar wavelets, (b) Daubechies wavelets, (c) symlets, and (d) Cohen-Daubechies-Feauveau biorthogonal wavelets.
Decomposition level impact on wavelet coding the
image of Fig. 8.9(a).
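Decompositions of this kind can be reproduced with the third-party PyWavelets package; a minimal sketch (assuming pywt is installed) of a three-scale Haar decomposition with hard thresholding of the detail subbands:

```python
import numpy as np
import pywt

img = np.random.rand(256, 256)                        # stand-in for Fig. 8.9(a)

# Three-scale 2-D wavelet decomposition with Haar wavelets.
coeffs = pywt.wavedec2(img, wavelet="haar", level=3)

# Crude wavelet coding: hard-threshold the detail subbands, keep the approximation.
thresholded = [coeffs[0]] + [
    tuple(pywt.threshold(d, value=0.1, mode="hard") for d in detail)
    for detail in coeffs[1:]
]

recon = pywt.waverec2(thresholded, wavelet="haar")
print(float(np.sqrt(np.mean((recon - img) ** 2))))    # rms error of the wavelet-coded image
```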
❑ JPEG: Based on DCT

❑ JPEG2000: Based on DWT

▪ Figure 8.46 shows four JPEG-2000 approximations of the monochrome image in Figure 8.9(a).
▪ Successive rows of the figure illustrate increasing levels of compression, including C = 25, 52, 75, and 105.
▪ The images in column 1 are decompressed JPEG-2000 encodings.
▪ The differences between these images and the original image [see Fig. 8.9(a)] are shown in the second column, and the third column contains a zoomed portion of the reconstructions in column 1.
▪ Because the compression ratios for the first two rows are virtually identical to the compression ratios in Example 8.18, these results can be compared (both qualitatively and quantitatively) to the JPEG transform-based results in Figs. 8.29(a) through (f).

Four JPEG-2000 approximations of Fig. 8.9(a). Each row contains a result after compression and reconstruction, the scaled difference between the result and the original image, and a zoomed portion of the reconstructed image. (Compare the results in rows 1 and 2 with the JPEG results in Fig. 8.29.)
A visual comparison of the error images in rows 1 and 2 of Fig. 8.46 with the corresponding images in Figs. 8.29(b) and (e) reveals a noticeable decrease of error in the JPEG-2000 results: 3.86 and 5.77 intensity levels, as opposed to 5.4 and 10.7 intensity levels for the JPEG results.

The computed errors favor the wavelet-based results at both compression levels.

Besides decreasing reconstruction error, wavelet coding dramatically increases (in a subjective sense) image quality. Note that the blocking artifact that dominated the JPEG results [see Figs. 8.29(c) and (f)] is not present in Fig. 8.46. Finally, we note that the compression achieved in rows 3 and 4 of Fig. 8.46 is not practical with JPEG. JPEG-2000 provides usable images that are compressed by more than 100:1, with the most objectionable degradation being increased image blur.

Some of the applications of DIP techniques include:

▪ Medical field: image classification, tumor detection, breast cancer detection, etc.
▪ Agriculture field: crop monitoring, defect detection, etc.
▪ Robotics field: DIP for computer vision
▪ Satellites: resource monitoring, building monitoring, etc.
▪ Security: biometrics, surveillance, etc.
▪ Object recognition: optical character recognition, etc.
▪ Autonomous vehicles: driverless cars, etc.
