
21CSE251T – Digital Image Processing

Unit IV

Concepts on Image Segmentation


Topics
• Region of interest (ROI) selection
• Feature extraction: Histogram-based features - Intensity
Features - Colour - Shape Features - Local Binary Patterns (LBP) -
Texture descriptors - Grey Level Co-occurrence Matrix (GLCM)
• Fundamentals of Image Compression models – Error-Free
Compression – Variable Length Coding – Bit-Plane Coding –
Lossless Predictive Coding – Lossy Compression – Lossy
Predictive Coding
Region of Interest (ROI) Selection
Region of interest (ROI) selection
• Region of Interest (ROI) selection is a process where a specific
area or subset of an image is identified for further processing.
• The entire image may contain irrelevant or redundant
information. Focusing on a smaller region instead can improve
efficiency and accuracy.
• ROI selection is essential in tasks like object detection, tracking,
recognition, and analysis
• By focusing computational resources on relevant image regions,
ROI selection helps improve efficiency, reduce processing time,
and enhance accuracy.
ROI Selection methods
• Manual Selection
• Automatic Detection
• Feature-Based Selection
• ROI Tracking
Manual Selection
• A user may manually define the ROI by specifying its
boundaries interactively using tools like a mouse or touch
interface
• Example: In image editing software, a user might draw a
bounding box around the object of interest
Automatic Detection:
• Algorithms are used to automatically detect regions of interest
based on predefined criteria or features
• Example: In medical imaging, algorithms may automatically
detect and segment specific organs or abnormalities within an
image
Feature-Based Selection
• ROI selection can also be guided by specific features or
characteristics of the image
• Example: In face recognition systems, the ROI might be
determined based on facial features such as eyes, nose, and
mouth.
ROI Tracking
• In applications involving video or sequential images, the ROI
may need to be tracked over time as it moves or changes shape.
• Tracking algorithms are used to follow the ROI's movement and
adjust its position or size accordingly
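In practice, a manually specified rectangular ROI reduces to simple array indexing. A minimal NumPy sketch (the image contents and coordinates below are hypothetical):

```python
import numpy as np

def crop_roi(image, top, left, height, width):
    """Return the rectangular region of interest as a view of the image."""
    return image[top:top + height, left:left + width]

# Hypothetical 100x100 grayscale image with a bright 20x20 patch
img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 30:50] = 255

roi = crop_roi(img, 40, 30, 20, 20)
print(roi.shape)       # (20, 20)
print(int(roi.min()))  # 255 -- every pixel in the ROI belongs to the patch
```

Because the slice is a view, later processing of `roi` touches only that subset of the image, which is exactly the efficiency gain described above.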
Feature Extraction
Feature Extraction
• Feature extraction is an essential step in many image processing
and computer vision tasks, including image classification, object
detection, segmentation, and recognition.
• The purpose of an image feature is to describe the object in a
meaningful manner so as to aid the recognition process and to
help in the discrimination of these objects.
Essential Characteristics of Good features
are:
1. Robustness : The property of a feature’s invariance to
operations such as translation, rotation, scaling, illumination,
and noise
2. Discrimination : Feature values of different classes should not overlap
3. Reliability
4. Independence : The features should be uncorrelated to each
other
5. Resistance to noise
6. Compactness : The features should be small in number so that
they can be represented compactly
Histogram / Brightness Features
(Intensity Feature Descriptors)
Histogram features
• Histogram features are also known as brightness features and are
described in terms of the luminance (or intensity) features.
• The histogram shows the brightness distribution found in the object.
• The first-order histogram of an image can be approximated by

  p(b) = N(b) / m

Here,
N(b) is the number of pixels having grey level b,
L is the total number of grey levels, and
m is the total number of pixels in the image.
First order moments.
First order moments are calculated based on individual
pixel values. Some of the features that are helpful include
the following.
1. Mean
2. Standard deviation
3. Skewness
4. Kurtosis
5. Energy
6. Entropy
Mean
Mean of the histogram is given by

  Mean = b̄ = Σ_{b=0}^{L−1} b · p(b)

The mean indicates the overall brightness of the image.

  L    -> total grey levels in the image
  b    -> intensity value of the bth bin
  p(b) -> probability of occurrence of b in the image
Standard Deviation
Standard deviation indicates the contrast / the spread of the data

  SD = σ_b = [ Σ_{b=0}^{L−1} (b − b̄)² · p(b) ]^{1/2}
Skewness
Skewness indicates the asymmetry about the mean in the grey
level distribution.

  Skewness = (1/σ_b³) Σ_{b=0}^{L−1} (b − b̄)³ · p(b)
Kurtosis
Kurtosis is a measure of the "tailedness" of the intensity
distribution.
High kurtosis indicates a sharper peak and heavier tails, while low
kurtosis indicates a flatter peak and lighter tails.

  Kurtosis = (1/σ_b⁴) Σ_{b=0}^{L−1} (b − b̄)⁴ · p(b) − 3
Energy
Energy is the sum of the squared histogram probabilities; it
measures how concentrated the grey-level distribution is.

  Energy = Σ_{b=0}^{L−1} [p(b)]²

The energy is 1 for an image having a constant intensity and
reduces as the grey-level values become more evenly distributed.
Entropy
The entropy indicates the minimum average number of bits per
pixel required to code the image.

  Entropy = − Σ_{b=0}^{L−1} p(b) · log₂ p(b)
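The first-order moments above can all be computed from the normalized histogram. A minimal NumPy sketch (the two-level test image is hypothetical):

```python
import numpy as np

def histogram_features(image, L=256):
    """First-order histogram features of a grayscale image."""
    counts = np.bincount(image.ravel(), minlength=L)
    p = counts / image.size                       # p(b) = N(b) / m
    b = np.arange(L)
    mean = np.sum(b * p)
    sd = np.sqrt(np.sum((b - mean) ** 2 * p))
    skew = np.sum((b - mean) ** 3 * p) / sd ** 3
    kurt = np.sum((b - mean) ** 4 * p) / sd ** 4 - 3
    energy = np.sum(p ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return mean, sd, skew, kurt, energy, entropy

# Image with only two grey levels, 0 and 255, each with probability 0.5
img = np.array([[0, 0, 255, 255]], dtype=np.uint8)
mean, sd, skew, kurt, energy, entropy = histogram_features(img)
print(mean)     # 127.5
print(energy)   # 0.5
print(entropy)  # 1.0 bit per pixel
```

Note that `sd` is zero for a constant-intensity image, so skewness and kurtosis are undefined in that degenerate case.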
Second Order Moments / Texture Analysis
• These are based on the joint probability distribution of pixel
pairs with grey levels a and b. The second-order histogram is
approximated by

  p(a, b) ≈ N(a, b) / m

  where m is the total number of pixels.

These measures are useful in characterizing the texture of images
and include autocorrelation, covariance, inertia, absolute value,
inverse difference, energy and entropy.
Colour Features
Colour Features
Colour features are very useful in object characterization.
Colour histograms can be obtained for colour images. Some of
the features are:

1. Colour histogram
2. Histogram Intersection
3. Colour Coherence Vector
Colour histogram
• Let the histograms of two images A and B be given as follows:

  A = (h_1^A, h_2^A, ..., h_k^A) and B = (h_1^B, h_2^B, ..., h_k^B)

• The similarity between two images is given by the colour
distance, measured as

  d = Σ_{j=1}^{k} | h_j^A − h_j^B |
Histogram Intersection
• The number of pixels common to two images is given by

  I(H^A, H^B) = Σ_{j=1}^{k} min( h_j^A, h_j^B )
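The colour distance and histogram intersection measures can be sketched as follows; the 3-bin histograms below are hypothetical:

```python
import numpy as np

def colour_distance(hA, hB):
    """L1 colour distance: d = sum_j |h_jA - h_jB|."""
    return int(np.sum(np.abs(hA - hB)))

def histogram_intersection(hA, hB):
    """Pixels common to both histograms: sum_j min(h_jA, h_jB)."""
    return int(np.sum(np.minimum(hA, hB)))

hA = np.array([4, 0, 6])   # hypothetical 3-bin colour histograms
hB = np.array([2, 3, 5])
print(colour_distance(hA, hB))         # |4-2| + |0-3| + |6-5| = 6
print(histogram_intersection(hA, hB))  # min pairs: 2 + 0 + 5 = 7
```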
Colour Coherence Vector
For finding colour coherence vector, the image can be
partitioned into two parts – one part consisting of a count of
pixels that belong to a large uniform region and the other
consisting of the pixels that belong to a sparse region.

  Colour coherence vector = tuple (P, Q)

Where,
P is the number of coherent pixels
Q is the number of incoherent pixels
Note:
• Similar to non-colour images, the mean, the variance, skewness,
kurtosis and other moments can be identified and detected for
feature extraction from colour images
• These image features are mostly used with content-based
retrieval systems.
Shape features
Shape Feature
• Shape features in image processing are descriptors that capture
the geometric properties and spatial arrangement of objects or
regions within an image
• These features provide valuable information about the shape,
size, orientation, and compactness of objects, and they are
widely used in tasks such as object recognition, classification,
segmentation, and tracking.
Shape region features are
1. Area
2. Perimeter
3. Shape factor (Compactness)
4. Area to Perimeter ratio
5. Object Length
6. Object width
7. Elongatedness
8. Aspect Ratio
9. Rectangularity
Area
The area of an object is the number of pixels in the object. It is
shift invariant. The area of a binary image B is given by

  A = Σ_{i=1}^{n} Σ_{j=1}^{m} B(i, j)

Area can also be calculated for a polygon. Let Q be the
polygon. Then the area of Q is given by

  A(Q) = N_1 + N_B / 2 − 1

where,
N_B is the number of points that lie on the boundary
N_1 is the number of interior points, not on the boundary
Perimeter
• Perimeter of an object is the number of pixels present in the
boundary of the object.
• Perimeter is shift invariant and rotation invariant.
Compactness (Shape Factor)
• It’s a measure of how closely an object's shape resembles a
compact form

  Compactness = Perimeter² / Area = P² / A
Circularity (Area to Perimeter ratio)
• A measure of how closely an object's shape resembles a circle

  Circularity = 4πA / Perimeter²
Object Length (major axis)
The longest line that can be drawn through the object
connecting the two farthest points on the boundary is called its
major axis. If (x1, y1) and (x2, y2) are the end points of the major
axis, then the major axis length is given by

  √[ (x2 − x1)² + (y2 − y1)² ]
Object Width (minor axis)
It is the longest line that can be drawn through the object
while remaining perpendicular to the major axis. If (x1, y1) and
(x2, y2) are the end points of the minor axis, its length is given by
the same formula:

  √[ (x2 − x1)² + (y2 − y1)² ]


Bounding box Area
• The area of the box that completely surrounds an object is
called its bounding box area. This is calculated by the formula

  Bounding box area = Major axis length × Minor axis length

Features Derived from the Bounding Box Rectangle

  Elongatedness = Length of the major axis / Perimeter

  Aspect Ratio = Length of region bounding rectangle / Width of region bounding rectangle

  Rectangularity = Region area / Area of minimum bounding rectangle
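Several of these shape features can be computed from a binary object mask. In this sketch the perimeter is approximated as the count of object pixels that have at least one 4-connected background neighbour, which is one common discrete convention (other definitions of discrete perimeter exist):

```python
import numpy as np

def shape_features(mask):
    """Area, perimeter, compactness and circularity of a binary mask (1 = object)."""
    area = int(mask.sum())
    # Interior pixels: all four 4-connected neighbours are also object pixels
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    # Boundary pixels = object pixels that are not interior
    perimeter = int(mask.sum() - (mask & interior).sum())
    compactness = perimeter ** 2 / area          # P^2 / A
    circularity = 4 * np.pi * area / perimeter ** 2
    return area, perimeter, compactness, circularity

# Hypothetical 4x4 solid square inside a 6x6 image
mask = np.zeros((6, 6), dtype=np.uint8)
mask[1:5, 1:5] = 1
area, perimeter, compactness, circularity = shape_features(mask)
print(area)       # 16
print(perimeter)  # 12 boundary pixels (the inner 2x2 block is interior)
```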
Local Binary Patterns (LBP)
• Local Binary Patterns (LBP) have emerged as a
powerful technique for texture description and feature
extraction.
• First introduced by Ojala, Pietikäinen, and Maenpaa in
1994, LBP has found widespread use in various
applications such as facial recognition, texture
classification, and object detection due to its simplicity
and effectiveness.
What are Local Binary Patterns?
• LBP is a texture descriptor that characterizes the
local structure and appearance of an image. It
operates by analyzing the relationship between a
pixel and its neighboring pixels within a defined
neighborhood. This process enables the encoding
of texture information by comparing pixel
intensities.
What are Local Binary Patterns?-Cont.
• The fundamental idea behind LBP revolves around
thresholding. To compute the LBP value of a pixel, a
threshold is applied to the intensity values of its surrounding
pixels.
• The resulting binary pattern is constructed by comparing the
intensity of each neighbor with the center pixel. If the
neighbor's intensity is greater than or equal to that of the
center pixel, it is assigned a value of 1; otherwise, it receives
a value of 0.
• This sequence of binary values is then converted to a decimal
number, creating the LBP code for that pixel.
LBP Work Process
• Selecting a Neighborhood: A key aspect of LBP
is the choice of neighborhood for each pixel.
Typically, a square neighborhood of a certain
radius is considered, encompassing a set of
neighboring pixels around the central pixel.
LBP Work Process
• Thresholding and Binary Comparison: Once
the neighborhood is defined, the LBP operator
computes the binary pattern. It compares the
grayscale intensity of the center pixel with that of
its neighbors, setting the bit to 1 if the neighbor's
intensity is greater than or equal to the center
pixel's intensity; otherwise, it assigns a 0.
LBP Work Process
•Building the LBP Code: After
performing the comparisons for all the
neighbors, the resulting binary sequence
is assembled clockwise or
counterclockwise to form the LBP code.
LBP Work Process
•Histogram Generation: The LBP codes
are then used to construct a histogram,
where the frequency of occurrence of
different LBP patterns within the image
is recorded. This histogram serves as a
feature vector representing the texture
characteristics of the image.
Algorithm to Calculate Local Binary Pattern
• Steps:
• 1- Divide the examined window into cells (e.g. 16x16 pixels for each
cell).
• 2- For each pixel in a cell, compare the pixel to each of its 8 neighbors
(on its left-top, left-middle, left-bottom, right-top, etc.). Follow the
pixels along a circle, i.e. clockwise or counter-clockwise.
• 3- Where the neighbor's value is greater than or equal to the center
pixel's value, write "1"; otherwise, write "0" (matching the convention
described earlier). This gives an 8-digit binary number
(which is usually converted to decimal for convenience).
• 4- Compute the histogram, over the cell, of the frequency of each
"number" occurring (i.e.,each combination of which pixels are smaller
and which are greater than the center)
• 5-Optionally normalize the histogram.
• 6- Concatenate (normalized) histograms of all cells. This gives the
feature vector for the window.
• Formula/Expression:

  LBP(x_c, y_c) = Σ_{p=0}^{P−1} s(i_p − i_c) · 2^p,  where s(x) = 1 if x ≥ 0 and 0 otherwise

  Here i_c is the intensity of the centre pixel and i_p that of the pth of its P neighbours.
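The per-pixel computation can be sketched for a single 3x3 patch. The neighbour ordering here (clockwise from the top-left, most significant bit first) is one possible convention; other orderings produce different but equally valid codes:

```python
import numpy as np

def lbp_code(patch):
    """LBP code of the centre pixel of a 3x3 patch.

    A bit is 1 when the neighbour's intensity is >= the centre pixel's.
    Neighbours are read clockwise starting at the top-left corner.
    """
    c = patch[1, 1]
    # Clockwise order: TL, T, TR, R, BR, B, BL, L
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    bits = [1 if n >= c else 0 for n in neighbours]
    # Assemble MSB-first into a decimal code in [0, 255]
    return sum(bit << (7 - i) for i, bit in enumerate(bits))

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])   # hypothetical grey levels, centre = 6
print(lbp_code(patch))  # bits 10001111 -> 143
```

Repeating this for every pixel in a cell and histogramming the resulting codes gives the feature vector described in the steps above.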
Applications of LBP

LBP has showcased its versatility in various image


analysis tasks:
• Texture Classification: LBP has been particularly
effective in classifying textures in images. Its ability to
capture local texture patterns makes it well-suited for
tasks like distinguishing between different surface
textures in materials.
Applications of LBP

•Facial Recognition: LBP has been widely


used in facial recognition systems due to its
robustness against illumination changes and
facial expressions. By extracting
discriminative features from facial images,
LBP contributes significantly to accurate
face recognition.
Applications of LBP

•Object Detection: In combination with


other techniques, LBP aids in object
detection by extracting texture features
that contribute to identifying objects
within images.
Challenges and Advancements
• While LBP presents several advantages, including
computational simplicity and robustness, it also faces certain
limitations. One of the main challenges lies in its sensitivity
to noise and variations in illumination.
• To address these challenges, researchers have proposed
various improvements and extensions to the basic LBP
algorithm. These include uniform patterns, rotation-invariant
LBP, and the use of different neighborhood structures to
enhance its robustness and discriminative power.
GREY LEVEL CO-OCCURRENCE
MATRIX (GLCM)
FUNDAMENTALS OF IMAGE
COMPRESSION MODELS
Image Compression
• Module – 5 Image Compression:

• Introduction, coding Redundancy ,


• Inter-pixel redundancy, image compression model,
• Lossy and Lossless compression,
• Huffman Coding, Arithmetic Coding,
• LZW coding, Transform Coding,
• Sub-image size selection, blocking,
• DCT implementation using FFT,
• Run length coding.
Image Compression
• Image compression is the art and science of reducing the amount of data required to
represent an image.
• Need for image compression?
• To reduce the storage space
• To reduce the transmission time while web surfing
• Fundamentals:
• Data compression refers to the process of reducing the amount of data required
to represent the given information
• Data and information are not the same
• Data is a means to convey information
• We know that the same information can be represented using various amounts of data
• Some representations may contain irrelevant or repeated information – redundant
data
Image Compression
• Let there be two representations of the same information,
and let a and b be the number of bits used to represent them,
respectively.
• The relative data redundancy RD of the first set is given by

  RD = 1 − 1/CR

• where CR is the compression ratio, given by CR = a/b

• Based on the a and b values there can be three different cases
• i. If b = a, CR = 1 and RD = 0: no redundant data is present in the first set
• ii. If b << a, CR is large and RD approaches 1: highly redundant data is present in the first set
• iii. If b >> a, CR < 1 and RD is negative: the second set contains far more data than the first
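The two quantities above can be computed directly; the bit counts in this sketch are hypothetical:

```python
def compression_metrics(a_bits, b_bits):
    """Compression ratio CR = a/b and relative data redundancy RD = 1 - 1/CR."""
    cr = a_bits / b_bits
    rd = 1 - 1 / cr
    return cr, rd

# Hypothetical: an image stored at 8 bits/pixel is recoded at 2 bits/pixel
cr, rd = compression_metrics(8, 2)
print(cr)  # 4.0
print(rd)  # 0.75 -- 75% of the original data was redundant
```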
Image Compression
• In the image compression context, a in the equation for CR is usually the number of
bits needed to represent an image as a 2-D array of intensities
• Typically 2-D array of intensities suffer from three types of redundancies
• Coding Redundancy
• Spatial or temporal redundancy
• Irrelevant information
• Coding Redundancy:
• Code- a system of symbols to represent body of information
• Each piece of info is assigned with sequence of code symbols called as code
words
• Number of symbols in each code word is – length
• The 8-bit codes that are used to represent the intensities in most of the 2-D
intensity arrays contain more bits than needed to represent the intensities
Image Compression
• Spatial or temporal redundancy
• Most of the pixels in 2-D intensity arrays are correlated spatially (each pixel's value is
similar to, or dependent on, its neighbours'), so information is unnecessarily replicated
in the representation of the correlated pixels
• Irrelevant information
• Most 2-D intensity arrays contain information that is ignored by the human visual
system or is extraneous to the use of the image.
• Such information can be called redundant, as it is not used

• Image compression means eliminating one or more redundancy in the image


Image Compression
• Coding redundancy
• Let us assume that a discrete random variable rk in the interval [0, L-1] is used to
represent the intensities of an MxN image, and that each rk occurs with probability
pr(rk):

  pr(rk) = nk / MN,  k = 0, 1, ..., L−1   ...(1)

• Where L is the number of intensity values and nk is the number of occurrences of the kth intensity
• If the number of bits used to represent each value of rk is l(rk), then the average number of bits needed to
represent each pixel is given by

  Lavg = Σ_{k=0}^{L−1} l(rk) · pr(rk)   ...(2)

• That is, the average length of the code words assigned to the various intensity values is the sum of the
products of the number of bits used to represent each intensity and its probability of occurrence
• The total number of bits needed to represent an MxN image is MN·Lavg
Image Compression
• If the intensities are represented using an m-bit fixed-length code, substituting l(rk) = m
reduces the right-hand side of equation (2) to m
• Taking the constant m out of the summation leaves only the summation of pr(rk), which
is always 1, so Lavg = m
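Equation (2) can be evaluated directly. In this sketch the probabilities and code lengths are hypothetical; they illustrate how a variable-length code beats a fixed-length code when the intensities are not equally likely:

```python
import numpy as np

def average_code_length(probabilities, lengths):
    """L_avg = sum_k l(r_k) * p_r(r_k)."""
    return float(np.sum(np.asarray(lengths) * np.asarray(probabilities)))

# Hypothetical 4-level image and a variable-length code for it
p = [0.5, 0.25, 0.125, 0.125]     # p_r(r_k) for the four intensities
l_variable = [1, 2, 3, 3]         # bits assigned per intensity
print(average_code_length(p, l_variable))    # 1.75 bits/pixel
print(average_code_length(p, [2, 2, 2, 2]))  # 2.0 -- fixed 2-bit code
```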
• (A worked example based on a computer-generated 256 x 256 image is not reproduced here.)
Image Compression
• The error between two images f(x,y) and f^(x,y) is given by

  e(x, y) = f^(x, y) − f(x, y)

• The total error between two MxN size images is

  Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [ f^(x, y) − f(x, y) ]

• The root-mean-square error erms between f^(x,y) and f(x,y) is obtained by

  erms = [ (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} ( f^(x, y) − f(x, y) )² ]^{1/2}


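The root-mean-square error can be sketched as follows; the reconstruction values are hypothetical:

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between an image and its reconstruction."""
    e = f_hat.astype(float) - f.astype(float)
    return float(np.sqrt(np.mean(e ** 2)))

f = np.array([[10, 20], [30, 40]])
f_hat = np.array([[12, 20], [30, 38]])  # hypothetical lossy reconstruction
print(rms_error(f, f_hat))  # sqrt((4 + 0 + 0 + 4) / 4) = sqrt(2)
```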
Image Compression
• Advantage:
• Simple and convenient technique to evaluate information loss
• Subjective Fidelity Criteria
• Since human perception is based on subjective quality, it is more
suitable to use subjective fidelity criteria to evaluate the information
loss.
• Concept:
• This method is implemented by showing a ‘typical’ decompressed
image to a number of viewers.
• Then, their evaluation is averaged to take a decision.
• These evaluations may be done using a rating scale or side-by-side
comparisons such as excellent, better, same, worse, etc.
Image Compression
• Input image f(x, y) is fed to the encoder, which gives the compressed version of
image.
• This will be stored for later use or transmitted for storage
• When the compressed image is given to decoder, a reconstructed output f^(x,y)
will be generated
• If this is exact replica of f(x, y) then such a compression system is called lossless
or error free system.
• If not, reconstructed image will be distorted and such a system is called lossy
system
• Encoding or compression process:

Image Compression
• The encoder is designed to remove the redundancies through three operations one
by one
• The first stage is mapper stage, which transforms f(x, y) into a format designed to
reduce the spatial and temporal redundancy
• This operation is reversible and may or may not reduce the amount of data
required to represent the image
• Run-length coding is an example of mapping function
• The second stage is quantizer.
• This reduces the accuracy of the mapper’s output in accordance with the pre-established fidelity criterion
• Goal is to keep irrelevant information out of the compression
• Note that this is irreversible operation and hence to be skipped if error-free
compression is needed
Image Compression
• Third stage is symbol coder.
• This generates fixed or variable length code to represent the quantizer
output and maps the output in accordance with the code.
• In most of the cases, variable length code is used
• Shortest code words are assigned to most frequently occurring quantizer
output values.
• This reduces the coding redundancy
Image Compression
• Decoding process:
• Decoder consists of two components – symbol decoder and inverse
mapper

• They perform inverse operations of symbol coder and mapper


• Since quantization is irreversible process, reverse of this is not shown in
decoding process
Image Compression
• Error Free Compression / Lossless Compression:
• Error-free compression is the acceptable data reduction method since there is no data
loss.
• This method is applicable to both binary and gray-scale images.
• It provides compression ratios ranging from 2 to 10.
• Operations:
• (i) Forming an alternative representation for the given image by which its interpixel redundancies are
reduced
• (ii) Coding the representation to remove its coding redundancies.
• E.g.:
• Variable-Length Coding
• Bit-Plane coding
• LZW Coding
• Lossless Predictive Coding
Image Compression

• Types of Compression
• Two broad categories
• 1. Lossless algorithms.
• 2. Lossy algorithms.
Variable Length Coding – Bit-Plane Coding – Lossless
Predictive Coding – Lossy Compression – Lossy
Predictive Coding
Introduction and Overview

The field of image compression continues to


grow at a rapid pace

As we look to the future, the need to store and


transmit images will only continue to increase
faster than the available capability to process all
the data

(c) Scott E Umbaugh, SIUE 2005 85


 Applications that require image compression
are many and varied such as:

1. Internet,
2. Businesses,
3. Multimedia,
4. Satellite imaging,
5. Medical imaging



Compression algorithm development starts with
applications to two-dimensional (2-D) still images

After the 2-D methods are developed, they are


often extended to video (motion imaging)

However, we will focus on image compression of


single frames of image data



Image compression involves reducing the size of
image data files, while retaining necessary
information

Retaining necessary information depends upon


the application

Image segmentation methods, which are


primarily a data reduction process, can be used
for compression



The reduced file created by the compression
process is called the compressed file and is used
to reconstruct the image, resulting in the
decompressed image
The original image, before any compression is
performed, is called the uncompressed image file
The ratio of the original, uncompressed image
file and the compressed file is referred to as the
compression ratio



The compression ratio is denoted by:

  Compression Ratio = Uncompressed file size / Compressed file size


The reduction in file size is necessary to meet
the bandwidth requirements for many
transmission systems, and for the storage
requirements in computer databases

Also, the amount of data required for digital


images is enormous



This number is based on the actual transmission
rate being the maximum, which is typically not
the case due to Internet traffic, overhead bits and
transmission errors



Additionally, considering that a web page might
contain more than one of these images, the time
it takes is simply too long

For high quality images the required resolution


can be much higher than the previous example



Example 10.1.5 applies maximum data rate to Example 10.1.4



Now, consider the transmission of video images,
where we need multiple frames per second
If we consider just one second of video data that
has been digitized at 640x480 pixels per frame,
and requiring 15 frames per second for interlaced
video, then:



Waiting 35 seconds for one second’s worth of
video is not exactly real time!

Even attempting to transmit uncompressed video


over the highest speed Internet connection is
impractical

For example: The Japanese Advanced Earth


Observing Satellite (ADEOS) transmits image
data at the rate of 120 Mbps



Applications requiring high speed connections
such as high definition television, real-time
teleconferencing, and transmission of multiband
high resolution satellite images, leads us to the
conclusion that image compression is not only
desirable but necessary

Key to a successful compression scheme is


retaining necessary information



 To understand “retaining necessary
information”, we must differentiate between
data and information

1. Data:
• For digital images, data refers to the pixel gray
level values that correspond to the brightness
of a pixel at a point in space
• Data are used to convey information, much like
the way the alphabet is used to convey
information via words



2. Information:

• Information is an interpretation of the data in a


meaningful way

• Information is an elusive concept; it can be


application specific



 There are two primary types of image
compression methods:

1. Lossless compression methods:


• Allows for the exact recreation of the original
image data, and can compress complex
images to a maximum 1/2 to 1/3 the original
size – 2:1 to 3:1 compression ratios
• Preserves the data exactly



2. Lossy compression methods:

• Data loss, original image cannot be re-created


exactly

• Can compress complex images 10:1 to 50:1


and retain high quality, and 100 to 200 times
for lower quality, but acceptable, images



LOSSLESS COMPRESSION
METHODS

No loss of data, decompressed image exactly


same as uncompressed image
Medical images or any images used in courts
Lossless compression methods typically provide
about a 10% reduction in file size for complex
images



Lossless compression methods can provide
substantial compression for simple images

However, lossless compression techniques may


be used for both preprocessing and
postprocessing in image compression algorithms
to obtain the extra 10% compression



The underlying theory for lossless compression
(also called data compaction) comes from the
area of communications and information theory,
with a mathematical basis in probability theory

One of the most important concepts used is the


idea of information content and randomness in
data



Information theory defines information based on
the probability of an event, knowledge of an
unlikely event has more information than
knowledge of a likely event
For example:
• The earth will continue to revolve around the sun;
little information, 100% probability
• An earthquake will occur tomorrow; more info.
Less than 100% probability
• A matter transporter will be invented in the next
10 years; highly unlikely – low probability, high
information content



This perspective on information is the information
theoretic definition and should not be confused
with our working definition that requires
information in images to be useful, not simply
novel

Entropy is the measurement of the average


information in an image



The entropy for an N x N image can be
calculated by this equation:

  Entropy = − Σ_{i=0}^{L−1} p_i · log₂ p_i  (bits per pixel)

  where p_i is the probability of the ith gray level occurring in the image
  and L is the number of gray levels.


This measure provides us with a theoretical
minimum for the average number of bits per pixel
that could be used to code the image

It can also be used as a metric for judging the


success of a coding scheme, as it is theoretically
optimal



The two preceding examples (10.2.1 and 10.2.2)
illustrate the range of the entropy:

  0 ≤ entropy ≤ log₂(L) bits per pixel

The examples also illustrate the information


theory perspective regarding information and
randomness
The more randomness that exists in an image,
the more evenly distributed the gray levels, and
more bits per pixel are required to represent the
data



Figure 10.2-1 Entropy

a) Original image, entropy = 7.032 bpp
b) Image after local histogram equalization, block size 4, entropy = 4.348 bpp
c) Image after binary threshold, entropy = 0.976 bpp
d) Circle with a radius of 32, entropy = 0.283 bpp
e) Circle with a radius of 64, entropy = 0.716 bpp
f) Circle with a radius of 32 and a linear blur radius of 64, entropy = 2.030 bpp


Figure 10.2.1 depicts that a minimum overall file
size will be achieved if a smaller number of bits is
used to code the most frequent gray levels
Average number of bits per pixel (length) in a
coder can be measured by the following
equation:

  L = Σ_{i=0}^{L−1} l_i · p_i

  where l_i is the length, in bits, of the code word for the ith gray level
  and p_i its probability of occurrence.


Huffman Coding

• The Huffman code, developed by D. Huffman in


1952, is a minimum length code
• This means that given the statistical distribution
of the gray levels (the histogram), the Huffman
algorithm will generate a code that is as close as
possible to the minimum bound, the entropy



• The method results in an unequal (or variable)
length code, where the size of the code words
can vary

• For complex images, Huffman coding alone will


typically reduce the file by 10% to 50% (1.1:1 to
1.5:1), but this ratio can be improved to 2:1 or 3:1
by preprocessing for irrelevant information
removal



• The Huffman algorithm can be described in five
steps:

1. Find the gray level probabilities for the image


by finding the histogram
2. Order the input probabilities (histogram
magnitudes) from smallest to largest
3. Combine the smallest two by addition
4. GOTO step 2, until only two probabilities are
left
5. By working backward along the tree, generate
code by alternating assignment of 0 and 1
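The five steps above can be sketched with a priority queue: combining the two smallest probabilities corresponds to popping the two smallest heap entries, and the 0/1 assignment of step 5 is done here by prefixing code strings as nodes are merged. The probabilities are hypothetical:

```python
import heapq

def huffman_code(probabilities):
    """Generate a Huffman code table from a {symbol: probability} mapping."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial code})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # two smallest probabilities
        p2, _, codes2 = heapq.heappop(heap)
        # Walking back up the tree: prefix one branch with 0, the other with 1
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical 4-level gray-level histogram
probs = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}
codes = huffman_code(probs)
lavg = sum(probs[s] * len(c) for s, c in codes.items())
print(codes)
print(lavg)  # 1.75 bits/pixel -- equal to the entropy of this distribution
```

For this distribution the entropy bound is met exactly; in general Huffman codes only get as close to the entropy as integer code-word lengths allow.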



• In the example, we observe a 2.0 : 1.9
compression, which is about a 1.05 compression
ratio, providing about 5% compression

• From the example we can see that the Huffman


code is highly dependent on the histogram, so
any preprocessing to simplify the histogram will
help improve the compression ratio



Run-Length Coding

• Run-length coding (RLC) works by counting


adjacent pixels with the same gray level value
called the run-length, which is then encoded and
stored

• RLC works best for binary, two-valued, images



• RLC can also work with complex images that
have been preprocessed by thresholding to
reduce the number of gray levels to two
• RLC can be implemented in various ways, but
the first step is to define the required parameters
• Horizontal RLC (counting along the rows) or
vertical RLC (counting along the columns) can be
used

• In basic horizontal RLC, the number of bits used
for the encoding depends on the number of
pixels in a row

• If the row has 2^n pixels, then the required number
of bits is n, so that a run that is the length of the
entire row can be encoded

• The next step is to define a convention for the
first RLC number in a row – does it represent a
run of 0's or 1's?
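A sketch of basic horizontal RLC for one binary row, using the convention that the first count is always a run of 0's (so a row starting with a 1 is encoded with a leading 0); the function name is illustrative:

```python
def rlc_encode_row(row):
    """Run-length code one binary row; the first count is a run of 0's."""
    runs = []
    current, count = 0, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)   # close the current run
            current, count = pixel, 1
    runs.append(count)           # close the final run
    return runs
```

For example, the row 0 0 0 1 1 0 1 1 1 encodes as 3, 2, 1, 3, and a row beginning with a 1 gets a leading count of 0.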

• Bitplane-RLC : A technique which involves
extension of basic RLC method to gray level
images, by applying basic RLC to each bit-plane
independently

• For each binary digit in the gray level value, an
image plane is created, and this image plane (a
string of 0's and 1's) is then encoded using RLC
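A hedged sketch of bitplane-RLC: each bit of the gray level value produces one binary plane, and each plane is run-length coded independently (first count is a run of 0's; the function name is illustrative):

```python
def bitplane_rlc(pixels, bits=8):
    """Apply basic RLC to each bit plane of a gray level sequence."""
    encoded = []
    for b in range(bits):
        plane = [(p >> b) & 1 for p in pixels]  # binary digit b of each pixel
        runs, current, count = [], 0, 0
        for bit in plane:
            if bit == current:
                count += 1
            else:
                runs.append(count)
                current, count = bit, 1
        runs.append(count)
        encoded.append(runs)
    return encoded
```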

• Typical compression ratios of 0.5 to 1.2 are
achieved with complex 8-bit monochrome images

• Thus without further processing, this is not a
good compression technique for complex images

• Bitplane-RLC is most useful for simple images,
such as graphics files, where much higher
compression ratios are achieved

• The compression results using this method can
be improved by preprocessing to reduce the
number of gray levels, but then the compression
is not lossless

• With lossless bitplane RLC we can improve the
compression results by taking our original pixel
data (in natural code) and mapping it to a Gray
code (named after Frank Gray), where adjacent
numbers differ in only one bit
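The natural-to-Gray-code mapping is a one-line bit operation; for example, 127 (01111111) and 128 (10000000) differ in all 8 bits in natural code but in only one bit after the mapping:

```python
def to_gray(n):
    # Natural binary -> Gray code: adjacent integers differ in one bit
    return n ^ (n >> 1)
```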

• Because adjacent pixels are highly correlated,
their gray level values tend to be close; in natural
binary code, however, two close values (e.g., 127
and 128) can differ in every bit, which breaks up
runs in the bit planes – Gray code avoids this

Arithmetic Coding

• Arithmetic coding transforms input data into a
single floating point number between 0 and 1

• There is not a direct correspondence between
the code and the individual pixel values

• As each input symbol (pixel value) is read the
precision required for the number becomes
greater

• As the images are very large and the precision of
digital computers is finite, the entire image must
be divided into small subimages to be encoded

• Arithmetic coding uses the probability distribution
of the data (histogram), so it can theoretically
achieve the maximum compression specified by
the entropy

• It works by successively subdividing the interval
between 0 and 1, based on the placement of the
current pixel value in the probability distribution
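A toy sketch of this interval subdivision (floating point only, so it is limited to short symbol sequences, as noted above; names are illustrative):

```python
def arithmetic_encode(symbols, probs):
    """Encode a symbol sequence as one number in [0, 1)."""
    # Assign each symbol a sub-interval of [0, 1) from its probability
    intervals, cum = {}, 0.0
    for s, p in probs.items():
        intervals[s] = (cum, cum + p)
        cum += p
    low, high = 0.0, 1.0
    for s in symbols:  # successively subdivide the current interval
        span = high - low
        s_low, s_high = intervals[s]
        high = low + span * s_high
        low = low + span * s_low
    return (low + high) / 2  # any value in [low, high) identifies the input
```

With probs = {'a': 0.5, 'b': 0.5}, the sequence 'ab' narrows the interval to [0.25, 0.5) and encodes as 0.375; the decoder reverses the subdivision (and needs the symbol count or a terminator symbol to know when to stop).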

• In practice, this technique may be used as part of
an image compression scheme, but is impractical
to use alone

• It is one of the options available in the JPEG
standard

Lossy Compression Methods

Lossy compression methods are required to
achieve high compression ratios with complex
images

They provide tradeoffs between image quality
and degree of compression, which allows the
compression algorithm to be customized to the
application

With more advanced methods, images can be
compressed 10 to 20 times with virtually no
visible information loss, and 30 to 50 times with
minimal degradation
Newer techniques, such as JPEG2000, can
achieve reasonably good image quality with
compression ratios as high as 100 to 200
Image enhancement and restoration techniques
can be combined with lossy compression
schemes to improve the appearance of the
decompressed image

In general, a higher compression ratio results in
a poorer image, but the results are highly image
dependent – application specific

Lossy compression can be performed in both the
spatial and transform domains. Hybrid methods
use both domains.

Gray-Level Run Length Coding

• The RLC technique can also be used for lossy
image compression, by reducing the number of
gray levels, and then applying standard RLC
techniques

• As with the lossless techniques, preprocessing
by Gray code mapping will improve the
compression ratio
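The gray level reduction itself is just a right shift that discards low-order bits; a hedged sketch (parameter names are illustrative):

```python
def reduce_gray_levels(pixels, bits_kept, total_bits=8):
    """Keep only the top `bits_kept` bits of each pixel, reducing
    2**total_bits gray levels to 2**bits_kept (lossy)."""
    shift = total_bits - bits_kept
    return [p >> shift for p in pixels]
```

Nearby values collapse to the same level – e.g. 100, 101, 102, 103 all map to 25 at 6 bits/pixel – so runs get longer and RLC compresses better.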

Figure 10.3-2 Lossy Bitplane Run Length Coding

Note: No compression occurs until reduction to 5 bits/pixel

a) Original image, 8 bits/pixel, 256 gray levels
b) Image after reduction to 7 bits/pixel, 128 gray levels,
compression ratio 0.55, with Gray code preprocessing 0.66

Figure 10.3-2 Lossy Bitplane Run Length Coding (contd)

c) Image after reduction to 6 bits/pixel, 64 gray levels,
compression ratio 0.77, with Gray code preprocessing 0.97
d) Image after reduction to 5 bits/pixel, 32 gray levels,
compression ratio 1.20, with Gray code preprocessing 1.60

Figure 10.3-2 Lossy Bitplane Run Length Coding (contd)

e) Image after reduction to 4 bits/pixel, 16 gray levels,
compression ratio 2.17, with Gray code preprocessing 2.79
f) Image after reduction to 3 bits/pixel, 8 gray levels,
compression ratio 4.86, with Gray code preprocessing 5.82

Figure 10.3-2 Lossy Bitplane Run Length Coding (contd)

g) Image after reduction to 2 bits/pixel, 4 gray levels,
compression ratio 13.18, with Gray code preprocessing 15.44
h) Image after reduction to 1 bit/pixel, 2 gray levels,
compression ratio 44.46, with Gray code preprocessing 44.46

• A more sophisticated method is dynamic window-
based RLC
• This algorithm relaxes the criterion of the runs
being the same value and allows for the runs to
fall within a gray level range, called the dynamic
window range
• This range is dynamic because it starts out larger
than the actual gray level window range, and
maximum and minimum values are narrowed
down to the actual range as each pixel value is
encountered

• This process continues until a pixel is found out
of the actual range

• The image is encoded with two values, one for
the run length and one to approximate the gray
level value of the run

• This approximation can simply be the average of
all the gray level values in the run
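One plausible reading of this algorithm in Python (the original's exact window bookkeeping is not reproduced here, so treat this as a sketch): a run continues while the spread between the minimum and maximum values seen stays within the window range, and each run is stored as (length, average value):

```python
def dynamic_window_rlc(pixels, window):
    """Runs may vary in value, as long as max - min <= window.
    Each run is encoded as (run length, average gray level)."""
    encoded, start = [], 0
    lo = hi = pixels[0]
    for i in range(1, len(pixels)):
        p = pixels[i]
        new_lo, new_hi = min(lo, p), max(hi, p)
        if new_hi - new_lo > window:      # pixel falls outside the range
            run = pixels[start:i]
            encoded.append((len(run), sum(run) // len(run)))
            start, lo, hi = i, p, p       # begin a new run
        else:
            lo, hi = new_lo, new_hi       # narrow toward the actual range
    run = pixels[start:]
    encoded.append((len(run), sum(run) // len(run)))
    return encoded
```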

• This particular algorithm also uses some
preprocessing to allow for the run-length
mapping to be coded so that a run can be any
length and is not constrained by the length of a
row

Lossy predictive coding
(Differential Predictive Coding)

• Differential predictive coding (DPC) predicts the
next pixel value based on previous values, and
encodes the difference between predicted and
actual value – the error signal
• This technique takes advantage of the fact that
adjacent pixels are highly correlated, except at
object boundaries

• Typically the difference, or error, will be small,
which minimizes the number of bits required for
the compressed file

• This error is then quantized, to further reduce the
data and to optimize visual results, and can then
be coded

• From the block diagram, we have the following:

• The prediction equation is typically a function of
the previous pixel(s), and can also include global
or application-specific information

• This quantized error can be encoded using a
lossless encoder, such as a Huffman coder

• It should be noted that it is important that the
predictor uses the same values during both
compression and decompression; specifically the
reconstructed values and not the original values
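A one-row sketch of the encode/decode loop makes this point concrete: the quantized error is what gets transmitted, and the predictor is fed the reconstructed values (the uniform quantizer step and the function name are assumptions for illustration):

```python
def dpc_row(row, step=4):
    """Lossy predictive coding of one row with a 1-D predictor
    (previous reconstructed pixel) and a uniform quantizer."""
    errors, recon = [], []
    prev = 0  # coder and decoder must start from the same value
    for p in row:
        e = p - prev                        # error signal
        q = int(round(e / step)) * step     # quantized error (the data sent)
        errors.append(q)
        r = prev + q                        # decoder's reconstructed value
        recon.append(r)
        prev = r                            # predict from the reconstruction,
                                            # never from the original pixel
    return errors, recon
```

Because each prediction starts from the value the decoder actually has, the reconstruction error never grows beyond about half the quantizer step.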

• The prediction equation can be one-dimensional
or two-dimensional, that is, it can be based on
previous values in the current row only, or on
previous rows also
• The following prediction equations are typical
examples of those used in practice, with the first
being one-dimensional and the next two being
two-dimensional:
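The slide's own equations appear in the accompanying figure; the following representative forms (an assumption, not necessarily the exact equations used there) illustrate the two cases, where recon holds previously reconstructed values:

```python
def predict_1d(recon, r, c):
    # One-dimensional: previous reconstructed pixel in the same row
    return recon[r][c - 1]

def predict_2d_avg(recon, r, c):
    # Two-dimensional: average of the left and upper neighbors
    return (recon[r][c - 1] + recon[r - 1][c]) / 2

def predict_2d_planar(recon, r, c):
    # Two-dimensional: left + upper - upper-left (planar predictor)
    return recon[r][c - 1] + recon[r - 1][c] - recon[r - 1][c - 1]
```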
