INTRODUCTION TO IMAGE-
PROCESSING SYSTEM
Unit structure
1.0 Objectives
1.1 Introduction
1.2 Image Sampling
1.2.1 Theory of 2D Sampling
1.2.2 Retrieving Image From Its Sample
1.2.3 Violation of Sampling criterion
1.3 Quantization
1.4 Resolution
1.5 Human Visual Systems
1.5.1 Anatomy of HVS
1.5.2 Scotopic and Photopic Vision
1.5.3 Brightness and contrast
1.0 OBJECTIVES
As the name indicates digital image processing helps:
To study two-dimensional Signals and Systems.
To understand image fundamentals and transforms necessary for image
processing.
To study the image enhancement techniques in spatial and frequency
domain.
To study image segmentation and image compression techniques
1.1 INTRODUCTION
In the current era, digital images play an important role in several areas such as geography, information technology and medicine.
Digital image processing (DIP) refers to the manipulation of digital images by means of a digital computer. It is a subarea of signals and systems, but one that focusses particularly on images. DIP aims at using a computer system that is able to perform several kinds of processing on an image. The input to such a system is a digital image; the system then processes that image using effective algorithms and gives an image as its output. The most common examples are MATLAB and Adobe Photoshop, which are widely used for processing digital images.
Before proceeding let us define some terms as follows :
1. IMAGE : An image is defined as a two-dimensional signal. It is described by a mathematical function f(x, y), where x and y are the two coordinates, horizontal and vertical, called spatial or plane coordinates. The value of f(x, y) at any point gives the pixel value at that point of the image and represents a measure of some characteristic, such as brightness or colour, of the viewed scene. An image is a projection of a 3D scene onto a 2D projection plane.
2. ANALOG IMAGE : An analog image can be represented mathematically using a continuous range of values which represent position and intensity.
3. DIGITAL IMAGE : A digital image is composed of a finite number of discrete picture elements called pixels. For operating with such images, there are various software tools and algorithms. Some examples of digital image processing tasks are colour processing, image recognition, video processing, etc. Analog signals can be converted into digital signals.
Figure 1
Advantages of Digital Image Processing
5. Neighbours of a Pixel : A pixel will have four neighbours, if they exist: NORTH, SOUTH, EAST and WEST.
North
West P East
South
Figure 2
3. Then take the inverse Fourier transform, after which we get
An alternate approach for the above procedure is to multiply the analog image by a 2D comb function. The comb function is a rectangular grid of points as shown in figure 3; the spacings between the grid points are Δx and Δy respectively.
Figure 3 Figure 4
After multiplying the comb function and the analog function f(x, y), we get the discrete version of the analog image, given by
We can see that the final equations from both the methods are similar.
1.2.2 RETRIEVING IMAGE FROM ITS SAMPLE
Since discreteness in one domain leads to periodicity in other domain,
sampling in spatial domain also leads to a periodic spectrum in the
frequency domain as in figure 5
That is, the 2D sampling criterion requires
f_sx ≥ 2 f_x,max  and  f_sy ≥ 2 f_y,max    (*)
where f_sx = 1/Δx and f_sy = 1/Δy are the sampling frequencies and f_x,max, f_y,max are the highest frequencies present in the image along the two directions.
A low pass filter is used to extract the desired spectrum, whose transfer function is given by :
The original spectrum is recovered by multiplying the periodic spectrum with the low pass filter as given below :
1.2.3 VIOLATION OF SAMPLING CRITERION
Violation of the sampling criterion given in (*) in both directions simultaneously leads to overlapping (aliasing) of the spectrum in both directions, as shown in figure 8.
1.3 QUANTIZATION
Quantization is a process of transforming a real valued sampled image to
one taking only a finite number of distinct values. Under quantization
procedure the amplitude values of the image are digitized. In naïve
words, when you are quantizing an image, you are actually partitioning a
signal into quanta(partitions).
Quantiser design includes :
Number of levels.
Quantisers are classified in two types :
1. Scalar quantisers
2. Vector quantisers
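As a rough illustration of scalar quantisation, the sketch below maps a real-valued image to a small number of uniformly spaced levels. It is only an assumed example: the function name, the use of NumPy and the mid-point reconstruction rule are choices made here, not material from this text.

```python
import numpy as np

def uniform_quantize(image, levels=8, vmin=0.0, vmax=255.0):
    """Map real-valued samples to `levels` uniformly spaced output values."""
    step = (vmax - vmin) / levels                 # width of each quantum (partition)
    idx = np.floor((image - vmin) / step)         # index of the decision interval
    idx = np.clip(idx, 0, levels - 1)             # keep indices inside the valid range
    return vmin + (idx + 0.5) * step              # reconstruction level = cell mid-point

if __name__ == "__main__":
    img = np.random.uniform(0, 255, size=(4, 4))  # stand-in for a sampled image
    print(uniform_quantize(img, levels=4))
```

With only 4 levels the output takes just four distinct values, which is exactly the "finite number of distinct values" described above.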
1.4 RESOLUTION
Resolution refers to the degree of distinguishable detail. It is classified
as :
1. Spatial resolution : Spatial resolution is defined as the smallest discernible detail in an image. The clarity of an image is not determined by the pixel count alone; what matters is the number of independent pixel values per inch. Sampling is the principal factor determining spatial resolution.
2. Gray-level resolution : Gray level resolution refers to the smallest discernible change in the shades or levels of gray in an image. In short, gray level resolution is equal to the number of bits per pixel.
1.5 HUMAN VISUAL SYSTEM
The Human Visual System (HVS) is one of the most complex systems we deal with. Many image processing applications are intended to produce images that are to be viewed by human observers. In HVS, the eyes act as the
sensor or camera, neurons act as the connecting cable and the brain acts
as the processor. It is hence important to realize the characteristics and
limitations of the human visual system to understand the “receiver” of the
2D signals.
Figure 10 : Vertical section of the Human Eye
1.5.1 ANATOMY OF HVS
The first part of the visual system is the eye. This is shown in figure 10. Its
form is nearly spherical and its diameter is approximately 20 mm. Its outer
cover consists of the ‘cornea' and ‘sclera' The cornea is a tough transparent
tissue in the front part of the eye. The sclera is an opaque membrane,
which is continuous with cornea and covers the remainder of the eye.
Directly below the sclera lies the “choroids”, which has many blood
vessels. At its anterior extreme lies the iris diaphragm. The light enters in
the eye through the central opening of the iris, whose diameter varies from
2mm to 8mm, according to the illumination conditions. Behind the iris is
the “lens” which consists of concentric layers of fibrous cells and contains
up to 60 to 70% of water. Its operation is similar to that of the man made
optical lenses. It focuses the light on the “retina” which is the innermost
membrane of the eye. Retina has two kinds of photoreceptors: cones and
rods. The cones are highly sensitive to color. Their number is 6-7 million
and they are mainly located at the central part of the retina. Each cone is
connected to one nerve end. Cone vision is the photopic or bright light
vision. Rods serve to view the general picture of the vision field. They are
sensitive to low levels of illumination and cannot discriminate colors. This
is the scotopic or dim-light vision. Their number is 75 to 150 million and
they are distributed over the retinal surface.
Several rods are connected to a single nerve end. This fact and their large
spatial distribution explain their low resolution. Both cones and rods
transform light to electric stimulus, which is carried through the optical
nerve to the human brain for high level image processing and
perception.
1.5.3 BRIGHTNESS AND CONTRAST
The Contrast and Brightness function enhances the appearance of raster
data by modifying the brightness and contrast within the image.
Brightness increases the overall lightness of the image—for example,
making dark colors lighter and light colors whiter—while contrast adjusts
the difference between the darkest and lightest colors.
Several laws describe how the human visual system perceives brightness :
1. Weber's Law : If I is the stimulus intensity and ΔI the just-noticeable change in intensity, Weber's law states that
ΔI / I = k
where ΔI (Delta I) represents the difference threshold, I represents the initial stimulus intensity, and k signifies that the proportion on the left side of the equation remains constant despite variations in I.
2. Steven's Law : If L is the physical stimulus and I the perceived sensation, then the mathematical form of Steven's law is
I = L^n
where L is the intensity against a black background; the value of n is 0.5.
3. Bloch's Law : It is given by
1.5.5 MACH BAND EFFECT
Another characteristic of HVS is that it tends to “overshoot” around image
edges (boundaries of regions having different intensity). As a result,
regions of constant intensity, which are close to edges, appear to have
varying intensity. Such an example is shown in Figure11 . The stripes
appear to have varying intensity along the horizontal dimension, whereas
their intensity is constant. This effect is called Mach band effect. It
indicates that the human eye is sensitive to edge information and that it
has high-pass characteristics.
Figure 12 : Zooming Of Raster Image
2. Vector Image : These images consist of anchored dots connected by lines and curves, similar to the connect-the-dots activities you may have done as a kid. Because these graphics are not based on pixels, they are known as resolution independent.
1.7 IMAGE TYPES
Images are classified into the following categories:
1. Binary image : Also called a black and white image, the binary image, as its name says, contains only two pixel values – 0 and 1. Here 0 refers to black colour and 1 refers to white colour. It is also known as a monochrome image. A gray scale image can be converted to a black and white image by using a threshold operation.
2. Gray scale Image : Grayscale images, a kind of black-and-white or gray monochrome image, are composed exclusively of shades of gray. They contain only brightness information. The contrast ranges from black at the weakest intensity to white at the strongest. In a gray scale image a pixel is typically represented by 8 bits, where a value of 0 represents black and 255 represents white.
3. Colour Image : It has three values per pixel, which measure the chrominance and intensity of light. Each pixel is a vector of colour components. Common colour spaces are RGB (Red, Green, Blue), CMYK (Cyan, Magenta, Yellow, Black) and HSV (Hue, Saturation, Value).
5. Range Image : These are a special class of digital images. Here each pixel represents a distance between a reference frame and a visible point in the scene. They are also called depth images.
a. CCD (Charge-Coupled Device): A light-sensitive array of silicon
cells that is commonly used for digital camera image sensors. It
generates electrical current in proportion to light input and allows the
simultaneous capture of many pixels with one brief exposure.
b. Dark current : the charge or signal collected by a pixel in the absence of light.
d. Pixel : It is the discrete light-sensitive cell that collects and holds a photo charge.
e. Fill factor : The fill factor of an imaging sensor is defined as the ratio of a pixel's light-sensitive area to its total area.
1.9 APPLICATIONS OF DIGITAL IMAGE PROCESSING
Some of the most important fields in which digital image processing is
widely used are as given below :
Image sharpening and restoration
Medical field
Remote sensing
Transmission and encoding
Machine/Robot vision
Color processing
Pattern recognition
Video processing
Microscopic Imaging
1.10 SUMMARY
In this chapter we learnt about basics of digital image processing like its
structure components and conversion.
1.11 EXERCISES
1.12 REFERENCES
1) Digital Image Processing, S. Jayaraman, S. Esakkirajan, T. Veerakumar, Tata McGraw-Hill Education Pvt. Ltd., 2009
2) Digital Image Processing, 3rd Edition, Rafael C. Gonzalez, Richard E. Woods, Pearson, 2008
3) Scilab Textbook Companion for Digital Image Processing, S. Jayaraman, S. Esakkirajan and T. Veerakumar, 2016
2
CONVOLUTION AND CORRELATION
Unit structure
2.0 Objectives
2.1 Introduction
2.2 2D Signals
2.3 2D Systems
2.4 2D convolution
2.4.1 Graphical Method
2.4.2 Z Transform
2.4.3 Matrix analysis
2.4.4 Circular convolution
2.5 2D Correlation
2.6 Summary
2.7 Exercises
2.8 References
2.0 OBJECTIVES
In this chapter we are going to learn about theory of signals and some
mathematical concepts required in signal processing .
2.1 INTRODUCTION
Signals convey information. Systems transform signals. A signal can be,
for example, a sequence of commands or a list of names. Mathematically,
we model both signals and systems as functions. A signal is a function that
maps a domain, often time or space, into a range, often a physical measure
such as air pressure or light intensity. A system is a function that maps
signals from its domain—its input signals—into signals in its range—its
output signals. Both the domain and the range are sets of signals (signal
spaces). Thus, systems are functions that operate on functions.
An example of a 1D signal is an ECG; an example of a 2D signal is a still image.
2.2 2D SIGNALS
2D discrete signals are represented as x(n1, n2), where x is a real or complex value and n1, n2 are a pair of integers.
2D unit impulse sequence : It is given by x(n1, n2) = δ(n1, n2), which is defined as
δ(n1, n2) = 1 for n1 = 0 and n2 = 0, and 0 otherwise.
Figure 1 : 2D impulse sequence, vertical line impulse and horizontal line impulse
Periodic Sequence : A 2D sequence is periodic if it repeats itself at regularly spaced intervals. Hence a sequence is periodic if
x(n1, n2) = x(n1 + N1, n2) = x(n1, n2 + N2)
where N1 and N2 are positive integers.
2.3 2D SYSTEMS
A 2D system is a device or algorithm that carries out some operation on a 2D signal. If x(n1, n2) is the input signal to the system and y(n1, n2) is the output of the system, then the relation is y(n1, n2) = T[x(n1, n2)], where T is the transformation applied by the system to produce the output from the input.
Classification of 2D systems : It can be classified as :
1. Linear Vs Non-Linear systems : A linear system is one that follows the law of superposition. This law is a necessary and sufficient condition for the linearity of the system. Linearity is defined as
T[a x1(n1, n2) + b x2(n1, n2)] = a T[x1(n1, n2)] + b T[x2(n1, n2)]
where a and b are scalar constants.
4. Stable system : A 2D linear shift invariant system is stable if and only if its impulse response is absolutely summable, that is
Σ Σ | h(n1, n2) | < ∞ , the sum being taken over all n1 and n2.
2.4 2D CONVOLUTION
Convolution and correlation are used to extract information from
images.They are linear and shift-invariant operations.Convolution is the
relationship between a system's input signal, output signal, and impulse
response.Shift-invariant means that we perform the same operation at
every point in the image.Linear means that this operation is linear, that is,
we replace every pixel with a linear
combination of its neighbors. These two properties make these operations
very simple; it is simpler if we do the same thing everywhere, and linear operations are always the simplest ones. Correlation is a way to identify a known waveform in a noisy background. Correlation is a mathematical operation very similar to convolution. Just as with convolution,
correlation uses two signals to produce a third signal. This third signal is
called the cross-correlation of the two input signals. If a signal is
correlated with itself, the resulting signal is instead called the
autocorrelation.
Convolution has a wide range of applications such as image filtering, enhancement, restoration, feature extraction and template matching.
2.4.1 GRAPHICAL METHOD
The graphical method of 2D convolution involves the following steps :
Folding
Shifting and
Addition
The process is performed as follows :
1. Given two matrices as input, x[n, m] and h[n, m], first determine the dimension of the resultant matrix, i.e. if x[n, m] is a matrix of order 2x3 and h[n, m] is of order 3x1, then the resultant matrix is of order 4x3.
3. Determine the values y(0,0), y(0,1), etc., where y(0,0) is obtained as :
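As a cross-check of the graphical procedure described above, the following sketch performs direct 2D linear convolution (fold, shift, multiply and add). It is a hedged illustration only: the function name conv2d and the use of NumPy are assumptions, not material from the original text.

```python
import numpy as np

def conv2d(x, h):
    """Direct 2D linear convolution: fold h, shift, multiply and add."""
    M1, N1 = x.shape
    M2, N2 = h.shape
    M, N = M1 + M2 - 1, N1 + N2 - 1                 # size of the resultant matrix
    y = np.zeros((M, N))
    for m in range(M):
        for n in range(N):
            s = 0.0
            for k in range(M2):
                for l in range(N2):
                    if 0 <= m - k < M1 and 0 <= n - l < N1:
                        s += x[m - k, n - l] * h[k, l]   # y(m,n) = sum x(m-k, n-l) h(k, l)
            y[m, n] = s
    return y

if __name__ == "__main__":
    x = np.array([[1, 2, 3], [4, 5, 6]])            # 2 x 3 input
    h = np.array([[1], [1], [1]])                    # 3 x 1 kernel
    print(conv2d(x, h))                              # 4 x 3 result, as stated in the text
```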
Circular correlation is widely used in zooming operation of digital
cameras.
2.5 2D CORRELATION
Matrix method : 2D correlation can be carried out with the help of the block Toeplitz matrix and the block circulant matrix, expressed in terms of convolution.
The flowchart for the same is given below :
Circular correlation can be performed through the matrix method as follows :
Circular correlation can be performed through the transform method as follows :
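A minimal sketch of the transform method, assuming NumPy's FFT routines: circular correlation is obtained by multiplying the DFT of one array by the complex conjugate of the DFT of the other and inverse transforming. The function name and the test arrays are illustrative only.

```python
import numpy as np

def circular_correlation(x, h):
    """Circular cross-correlation via the 2D DFT: IDFT{ X . conj(H) }."""
    X = np.fft.fft2(x)
    H = np.fft.fft2(h, s=x.shape)          # zero-pad h to the size of x if needed
    return np.real(np.fft.ifft2(X * np.conj(H)))

if __name__ == "__main__":
    x = np.arange(16.0).reshape(4, 4)
    h = np.zeros((4, 4)); h[0, 1] = 1.0    # an impulse shifted by one column
    print(circular_correlation(x, h))      # correlating with a shifted impulse yields a circularly shifted copy of x
```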
2.6 SUMMARY
In this chapter we learnt about convolution, which can be carried out for filtering, its types and the methods used to perform it. We also learnt about correlation, which is used to find the similarity between images or parts of images.
2.7 EXERCISES
1. Write a note on 2d signals.
2. Explain 2D systems and its classifications.
3. Define convolution . Explain its types of representation.
4. Write a note on circular convolution.
5. Discuss applications of circular convolution.
6. Write a note on Correlation.
2.8 REFERENCES
1) Digital Image Processing, S. Jayaraman, S. Esakkirajan, T. Veerakumar, Tata McGraw-Hill Education Pvt. Ltd., 2009
2) Digital Image Processing, 3rd Edition, Rafael C. Gonzalez, Richard E. Woods, Pearson, 2008
3) Scilab Textbook Companion for Digital Image Processing, S. Jayaraman, S. Esakkirajan and T. Veerakumar, 2016
3
IMAGE TRANSFORMS
Unit structure
3.0 Objectives
3.1 Introduction
3.2 Need for transform
3.3 Image transforms
3.4 Fourier transform
3.5 2D Discrete Fourier transform
3.6 Properties of 2D DFT
3.7 Other transforms
3.8 Summary
3.9 Exercises
3.10 References
3.0 OBJECTIVES
In this chapter we are going to learn about image transforms, which are widely used in image analysis and image processing. A transform is a mathematical tool basically used to move from one domain to another, i.e. from the time (or spatial) domain to the frequency domain. Transforms are also useful for fast computation of correlation and convolution.
3.1 INTRODUCTION
An image transform can be applied to an image to convert it from one
domain to another. Viewing an image in domains such as frequency
or Hough space enables the identification of features that may not be as
easily detected in the spatial domain. Common image transforms include:
Hough Transform, used to find lines in an image
Radon Transform, used to reconstruct images from fan-beam and
parallel-beam projection data
Discrete Cosine Transform, used in image and video compression
Discrete Fourier Transform, used in filtering and frequency analysis
Wavelet Transform, used to perform discrete wavelet analysis, denoise,
and fuse images
3.2 NEED FOR TRANSFORM
The main purpose of the transform can be divided into following 5
groups:
1. Visualization: Objects that are not directly visible are made observable.
2. Image sharpening and restoration: Used to obtain a better-quality image.
3. Image retrieval: An image of interest can be searched for and retrieved.
4. Measurement of pattern: The objects in an image can be measured.
5. Image Recognition: Each object in an image can be distinguished.
A transform is a mathematical tool used to transform a signal. Hence it is required for :
Mathematical convenience
To extract more information.
3.3 IMAGE TRANSFORMS
An image transform is an alternative representation of an image. It is used for the following two reasons :
1. To isolate critical component of image so that they are accessible
directly.
2. It will make the image data compact so that they can be efficiently
stored and transmitted.
The different types of image transforms that will be discussed are :
Fourier transform
Walsh transform
Hadamard transform
Wavelet transform, etc.
Classification of image transforms : Image transforms can be classified on the basis of the nature of their basis functions as follows :
3.4 FOURIER TRANSFORM
The Fourier transform states that non-periodic signals whose area under the curve is finite can also be represented as integrals of sines and cosines after being multiplied by a certain weighting function.
The Fourier transform is a representation of an image as a sum of complex
exponentials of varying magnitudes, frequencies, and phases. The Fourier
transform plays a critical role in a broad range of image processing
applications, including enhancement, analysis, restoration, and
compression.
The Fourier transform also has many wide-ranging applications, including image compression (e.g. JPEG compression), filtering and image analysis.
A continuous time signal x(t) is converted to a discrete time signal x(nT) using the sampling process, where T is the sampling interval.
After a series of transformations :
There is a fast algorithm for computing the DFT known as the fast
Fourier transform (FFT).
where R(k, l) represents the real part of the spectrum and I(k, l) the imaginary part.
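As a hedged illustration of the 2D DFT and its real/imaginary decomposition, the snippet below uses NumPy's FFT implementation; the variable names mirror R(k, l) and I(k, l) from the text, but the test image and everything else here are assumptions.

```python
import numpy as np

f = np.random.rand(8, 8)                 # a small test image
F = np.fft.fft2(f)                       # 2D DFT computed with the FFT algorithm

R = F.real                               # R(k, l): real part of the spectrum
I = F.imag                               # I(k, l): imaginary part of the spectrum
magnitude = np.sqrt(R**2 + I**2)         # |F(k, l)|
phase = np.arctan2(I, R)                 # phase angle of F(k, l)

f_back = np.real(np.fft.ifft2(F))        # the inverse DFT recovers the image
assert np.allclose(f, f_back)
```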
3.6 PROPERTIES OF 2D-DFT
There are many properties of the transformation that give insight into the content of the frequency domain representation of a signal and allow us to manipulate signals in one domain or the other.
Following are some of the properties :
1. Separable property : It is computed in two steps by successive 1D
operations on rows and columns of an image .
Figure 3 : Computation of 2D-DFT by Separable Property
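A small sketch of the separable property, assuming NumPy: the 2D DFT is obtained by 1D DFTs over the rows followed by 1D DFTs over the columns, and the result matches the direct 2D FFT.

```python
import numpy as np

f = np.random.rand(8, 8)

# Step 1: 1D DFT of every row; Step 2: 1D DFT of every column of that result.
rows = np.fft.fft(f, axis=1)
F_sep = np.fft.fft(rows, axis=0)

# The separable computation agrees with the direct 2D DFT.
assert np.allclose(F_sep, np.fft.fft2(f))
```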
2. Spatial Shift property: Since both the space and frequency domains
are considered periodic for the purposes of the transforms, shifting
means rotating around the boundaries.
ot
un
m
3. Periodicity Property :
5. Correlation property : It is used to find the relative similarity between two signals. When we find the similarity of a signal with itself, it is called autocorrelation. On the other hand, when we find the similarity between two different signals, it is called cross correlation. The cross correlation between two signals is the same as performing convolution of one sequence with the folded version of the other sequence. Thus, this property says that correlation of two sequences in the spatial domain corresponds to multiplication of the DFT of one sequence with the complex conjugate (folded version) of the DFT of the other sequence in the frequency domain.
6. Scaling property : Just as in one dimension, shrinking in one domain causes expansion in the other for the 2D DFT. This means that as an object grows in an image, the corresponding features in the frequency domain will contract. Scaling is used mainly to shrink or expand the size of an image.
7. Conjugate symmetry :
8. Orthogonality property :
10. Multiplication by Exponent :
3. HAAR transform : Haar wavelet compression is an efficient way to perform both lossless and lossy image compression. It relies on averaging and differencing values in an image matrix to produce a matrix which is sparse or nearly sparse. A sparse matrix is a matrix in which a large portion of the entries are 0. The Haar transform is based on a class of orthogonal matrices whose elements are either +1, -1 or 0, multiplied by powers of √2. The algorithm to generate the Haar basis is :
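The following is a hedged sketch of one level of Haar averaging and differencing; the scaling by 1/√2, the function names and the test matrix are choices made here, not taken from the text's algorithm.

```python
import numpy as np

def haar_step(x):
    """One level of the 1D Haar transform: pairwise averages, then pairwise differences."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # averages (low-pass part)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # differences (high-pass / detail part)
    return np.concatenate([a, d])

def haar2d_one_level(img):
    """Apply the Haar step to every row, then to every column."""
    rows = np.apply_along_axis(haar_step, 1, img)
    return np.apply_along_axis(haar_step, 0, rows)

if __name__ == "__main__":
    img = np.array([[4.0, 4, 8, 8], [4, 4, 8, 8], [2, 2, 6, 6], [2, 2, 6, 6]])
    print(haar2d_one_level(img))           # most detail coefficients are 0, i.e. nearly sparse
```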
KL (Karhunen–Loève) transform : It is used in clustering analysis and image compression. There are four major steps in order to find the KL transform :
a. Find the mean vector and covariance matrix of the given image x.
b. Find the eigenvalues and then the eigenvectors of the covariance matrix.
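A minimal sketch of steps (a) and (b) of the KL transform using NumPy, under the assumption that the image data has already been arranged as a set of sample vectors; the final projection step is included here only for completeness.

```python
import numpy as np

# Each row of `data` is treated as one sample vector drawn from the image.
data = np.random.rand(100, 4)

# Step a: mean vector and covariance matrix of the data.
mean_vec = data.mean(axis=0)
cov = np.cov(data - mean_vec, rowvar=False)

# Step b: eigenvalues and eigenvectors of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)          # eigh: the covariance matrix is symmetric
order = np.argsort(eigvals)[::-1]               # sort by decreasing eigenvalue
eigvecs = eigvecs[:, order]

# The KL transform projects the mean-removed data onto the eigenvectors.
y = (data - mean_vec) @ eigvecs
```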
3.8 SUMMARY
In this chapter we studied why image transforms are required in digital image processing and the different types available.
3.9 EXERCISES
1. Why do we need image transforms?
2. What are different types of transforms?
3. Explain properties of DFT.
4. What is main difference between Walsh and Hadamard transform?
5. Explain Fourier transform and its properties .
6. What are advantages of Walsh over Fourier transform?
3.10 REFERENCES
1) Digital Image Processing, S Jayaraman, S Esakkirajan, T Veerakumar,
Tata McGraw-Hill Education Pvt. Ltd., 2009
2) Digital Image Processing, 3rd Edition, Rafael C. Gonzalez, Richard E. Woods, Pearson, 2008
3) Scilab Textbook Companion for Digital Image Processing, S. Jayaraman, S. Esakkirajan and T. Veerakumar, 2016
4
IMAGE ENHANCEMENT
Unit Structure
4.0 Objectives
4.1 Introduction
4.2 Image Enhancement in spatial domain
4.2.1 Point operation
4.2.2 Mask operation
4.2.3 Global operation
4.3 Enhancement through Point operations
4.3.1 Types of point operations
4.4 Histogram manipulation
4.5 LinearGray Level Transformation
4.6 Nonlinear Gray Level Transformation
4.6.1 Thresholding
4.6.2 Gray-level slicing
4.9.1 High boost filtering
4.9.2 Unsharp masking
4.10 Bit-plane slicing
4.11 Image Enhancement in frequency domain
4.12 Homomorphic filter
4.13 Zooming operation
4.14 Image Arithmetic
4.14.1 Image addition
4.14.2 Image subtraction
4.14.3 Image multiplication
4.14.4 Image division
4.15 Summary
4.16 List of References
4.17 Unit End Exercises
4.0 OBJECTIVES
4.1 INTRODUCTION
The goal of image enhancement is to make it easier for viewers to
understand the information contained in images. A better-quality image is
produced by an enhancement algorithm when it is used for a certain
application. This can be accomplished by either reducing noise or boosting
image contrast.
For presentation and analysis, image-enhancement techniques are used to
highlight, sharpen, or smooth visual characteristics. Application-specific
enhancement techniques are frequently created via empirical research.
Image enhancement techniques highlight particular image elements to
enhance how an image is perceived visually.
Image-enhancement methods can be divided into two groups in general,
including
(1) A method in the spatial domain; and (2) A method in the transform
domain.
Whereas the transform domain approach works with an image's Fourier
transform before transforming it back into the spatial domain, the spatial
domain method operates directly on pixels. Because they are quick and
easy to use, histogram-based basic improvement approaches can produce
results that are acceptable for some applications. By removing a portion of
a filtered component from the original image, unsharp masking sharpens
the edges. Unsharp masking is a technique that has gained popularity as a
diagnostic aid.
Figure 1
The point operation is represented by g (m, n) = T [ f (m, n) ].
es
In point operation, T operates on one pixel, or there exists a one-to-one
mapping between the input image f (m, n) and the output image g (m, n).
The output value at a pixel in a point operation depends only on the image value at that pixel. The point operation, which is shown in Figure 3, converts the input picture f (m, n) into the output image g (m, n). It is clear from the figure that every f (m, n) pixel with the same grey level maps to a single grey value in the output image.
1] Brightness modification
The value assigned to each pixel in an image determines how bright it is.
A constant is either added to or subtracted from the luminance of each
sample value to alter the brightness of the image. By adding a constant
value to each and every pixel in the image, the brightness can be
raised. The brightness can also be reduced by subtracting a fixed amount from each and every pixel in the image.
The brightness can be decreased by subtracting a constant k from all the pixels of the input image f [m, n]. This is represented by
input image f [m,n]. This is represented by
g [m, n] = f [m, n] – k
2] Contrast manipulation
The contrast of an image can be changed by multiplying all pixel values by a constant k. It is given by
g [m, n] = f [m, n] ∗ k
Changing the contrast of an image changes the range of luminance values present in the image.
Specifying a value above 1 will increase the contrast by making bright
samples brighter and dark samples darker, thus expanding on the range
used. A value below 1 will do the opposite and reduce a smaller range of
sample values. An original image and its contrast-manipulated images are
illustrated in Figure 6.
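A short, assumed illustration of the two point operations above (brightness by adding a constant, contrast by multiplying by a constant); the clipping to the 8-bit range is a practical choice made here, not specified in the text.

```python
import numpy as np

def adjust_brightness(f, k):
    """g[m, n] = f[m, n] + k, clipped to the valid 8-bit range."""
    return np.clip(f.astype(int) + k, 0, 255).astype(np.uint8)

def adjust_contrast(f, k):
    """g[m, n] = f[m, n] * k; k > 1 expands the range, k < 1 shrinks it."""
    return np.clip(f.astype(float) * k, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    f = np.array([[10, 100], [150, 240]], dtype=np.uint8)
    print(adjust_brightness(f, 50))   # a brighter image
    print(adjust_contrast(f, 1.5))    # a higher-contrast image
```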
4.4 HISTOGRAM MANIPULATION
A] Histogram: The gray-level values of an image are plotted against the
number of instances of each type of grey in the image to create the
histogram. A useful summary of an image's intensities is provided by the
histogram, but it is unable to transmit any knowledge of the spatial
connections between pixels. Further information on image contrast and
brightness is given by the histogram.
4. The histogram will have an equal spread in the grey level for a high-
contrast image.
Image brightness may be improved by modifying the histogram of the
image.
B] Histogram equalisation: In order to distribute the grey levels of an
image uniformly across its range, a procedure called equalisation is used.
Based on the image histogram, histogram equalisation reassigns the
brightness values of pixels. The goal of histogram equalisation is to
produce a picture with a histogram that is as flat as feasible. More
aesthetically acceptable outcomes are produced by histogram equalisation
across a wider spectrum of photos.
C] Procedure to perform Histogram equalization : Histogram
equalisation is done by performing the following steps:
1. Find the running sum of the histogram values.
2. Normalise the values from Step (1) by dividing by the total number of
pixels.
3. Multiply the values from Step (2) by the maximum gray-level value and
round.
4. Map the gray level values to the results from Step (3) using a one-to-
one correspondence.
Example: Perform histogram equalisation on the following image:
The histogram of the input image is given below:
Step 1: Compute the running sum of the histogram values. The running sum of the histogram values is shown in the table below.
Step 2: Divide the running sum obtained in Step 1 by the total number of
pixels. In this case, the total number of pixels is 25.
The result is then rounded to the closest integer to get the following table:
Step 4: Mapping of gray level by a one-to-one correspondence:
The original image and the histogram equalised image are shown side by
side.
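The four-step procedure above can be sketched in code as follows. This is an illustrative NumPy version (the function name and the 256-level assumption are mine), not the worked example's exact data.

```python
import numpy as np

def histogram_equalize(img, levels=256):
    """Histogram equalisation following the four steps described in the text."""
    hist = np.bincount(img.ravel(), minlength=levels)               # histogram of the image
    running_sum = np.cumsum(hist)                                    # Step 1: running sum
    normalised = running_sum / img.size                              # Step 2: divide by total pixels
    mapping = np.round(normalised * (levels - 1)).astype(np.uint8)   # Step 3: scale and round
    return mapping[img]                                              # Step 4: one-to-one mapping

if __name__ == "__main__":
    img = np.random.randint(0, 64, size=(5, 5), dtype=np.uint8)      # a low-contrast input
    print(histogram_equalize(img))
```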
4.6.1 Thresholding
To extract a portion of an image that contains all the information,
thresholding is necessary. The problem of segmentation in general
includes thresholding. Hard thresholding and soft thresholding are two
major categories for thresholding.The procedure of hard thresholding
entails reducing coefficients with absolute values below the threshold to
zero. Another technique is soft thresholding, which shrinks the nonzero
coefficients towards zero by first setting to zero coefficients whose
absolute values are below the threshold.
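A hedged sketch of the two thresholding rules described above, applied to a 1D array of coefficients; the function names and test values are assumptions.

```python
import numpy as np

def hard_threshold(x, t):
    """Set coefficients with absolute value below t to zero; keep the rest unchanged."""
    return np.where(np.abs(x) < t, 0, x)

def soft_threshold(x, t):
    """Zero small coefficients and shrink the remaining ones towards zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0)

if __name__ == "__main__":
    x = np.array([-3.0, -0.5, 0.2, 1.5, 4.0])
    print(hard_threshold(x, 1.0))   # [-3.   0.   0.   1.5  4. ]
    print(soft_threshold(x, 1.0))   # [-2.   0.   0.   0.5  3. ]
```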
4.6.2 Gray-level slicing
The purpose of gray-level slicing is to highlight a specific range of gray
values. Two different approaches can be adopted for gray-level slicing
(a) Gray-level Slicing without Preserving Background This displays
high values for a range of interest and low values in other areas. The main
drawback of this approach is that the background information is discarded.
(b) Gray-level Slicing with Background In gray-level slicing with
background, the objective is to display high values for the range of interest
and original gray level values in other areas. This approach preserves the
background of the image.
4.6.3 Logarithmic transformation
This type of mapping spreads out the lower gray levels. For an 8-bit image, the lowest gray level is zero and the highest gray level is 255. It is desirable to map 0 to 0 and 255 to 255. The function g (m, n) = c log( f (m, n) + 1) spreads out the lower gray levels.
4.6.4 Exponential transformation
In multiplicative filtering processes, exponential transformation has
A physical device like a CRT does not produce light that is linearly related to the input signal. The intensity generated at the display's surface is approximately the applied voltage raised to a power of about 2.5. Gamma is the name given to the numerical value of this power function's exponent. To obtain accurate intensity reproduction, this non-linearity must be corrected. The formula for the power law transformation is
g (m, n) = c [ f (m, n) ]^γ
where f(m, n) is the input image and g(m, n) is the output image. Gamma
(γ) can take either integer or fraction values.
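An assumed illustration of the power law (gamma) transformation, normalising intensities to [0, 1] before applying the exponent; the normalisation step and the function name are choices made here.

```python
import numpy as np

def power_law_transform(f, gamma, c=1.0):
    """g(m, n) = c * f(m, n) ** gamma, computed on intensities normalised to [0, 1]."""
    norm = f.astype(float) / 255.0
    g = c * np.power(norm, gamma)
    return np.clip(g * 255.0, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    f = np.array([[0, 64], [128, 255]], dtype=np.uint8)
    print(power_law_transform(f, gamma=1 / 2.5))   # compensates a CRT-like display gamma of 2.5
    print(power_law_transform(f, gamma=2.5))       # simulates the display non-linearity
```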
4.7.3 Mean filter/ Average filter/ Low pass filter
The average of all the values in the immediate area is used to replace each
pixel with the mean filter. The degree of filtration is determined by the
size of the neighbourhood. Each pixel in a spatial averaging operation is
replaced by the weighted average of the pixels in its immediate vicinity.
The low-pass filter removes the sharp changes that cause blurring and
keeps the smooth area of the image. The following is the 3 by 3 spatial
mask that may execute the averaging operation:
        1/9  1/9  1/9
        1/9  1/9  1/9
        1/9  1/9  1/9
It should be noted that for a low-pass spatial mask, the elements' sum is
equal to 1. The larger the mask becomes, the more blurring there will be.
The mask size is usually odd so that the centre pixel can be located precisely.
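A simple sketch of the averaging operation described above, assuming edge padding at the borders (a choice made here, not specified in the text).

```python
import numpy as np

def mean_filter(img, size=3):
    """Replace each pixel by the average of its size x size neighbourhood (edge padding)."""
    pad = size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + size, j:j + size].mean()
    return out

if __name__ == "__main__":
    img = np.zeros((5, 5)); img[2, 2] = 9.0       # a single bright pixel
    print(mean_filter(img))                        # the spike is spread out (blurred)
```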
4.7.4 Weighted average filter
The mask of a weighted average filter is given by
The weighted average filter gets its name because it is clear from the mask
that the closest pixels to the centre are weighted more heavily than the
farthest pixels.The pixel that has to be updated is changed by multiplying
the value of the neighbouring pixel by the matrix's weights and dividing
the result by the sum of the matrix's coefficients.
Median Filter is a simple and powerful non-linear filter.
The sum of all the weights is zero; this implies that the resulting signal
will have zero dc value.
4.9.1 High boost filtering
High-frequency emphasis filter is another name for a high-boost filter. To
help with visual interpretation, a high-boost filter is utilized to keep some
of the low-frequency components. In high-boost filtering, the input image f(m, n) is multiplied by an amplification factor A before the low-pass image is subtracted.
The expression for the high-boost filter then becomes
High boost = A x f(m,n) - low pass
Adding and subtracting 1 with the gain factor, we get
Thus,
One of the methods frequently used for edge enhancement is unsharp masking. This method tips the balance of the image towards the sharper content by subtracting a smoothed (blurred) version of the image from the original image.
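A hedged sketch of high-boost filtering and unsharp masking built on a 3 x 3 averaging low-pass filter; the helper name box_blur3 and the edge padding are assumptions, not material from the text.

```python
import numpy as np

def box_blur3(img):
    """3 x 3 averaging (low-pass) filter with edge padding."""
    pad = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = pad[i:i + 3, j:j + 3].mean()
    return out

def high_boost(f, A=1.5):
    """High boost = A * f(m, n) - low pass, as given in the text (A >= 1)."""
    return np.clip(A * f.astype(float) - box_blur3(f), 0, 255)

def unsharp_mask(f, amount=1.0):
    """Add back a fraction of (original - smoothed) to sharpen edges."""
    return np.clip(f.astype(float) + amount * (f.astype(float) - box_blur3(f)), 0, 255)
```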
4.10 BIT-PLANE SLICING
Bit plane slicing has three major purposes:
i) converting a grayscale image to a binary image;
es
ii) representing an image with fewer bits and compressing it to a smaller
size; and
iii) enhancing the image by sharpening certain areas.
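A minimal sketch of bit-plane slicing for an 8-bit image, illustrating purposes (i) and (ii) above; the function name and the test values are assumptions.

```python
import numpy as np

def bit_planes(img):
    """Return the eight bit planes of an 8-bit grayscale image, plane 0 = least significant bit."""
    return [((img >> b) & 1).astype(np.uint8) for b in range(8)]

if __name__ == "__main__":
    img = np.array([[0, 85], [170, 255]], dtype=np.uint8)
    planes = bit_planes(img)
    msb = planes[7]                        # most significant bit plane: a binary image
    approx = sum(planes[b].astype(np.uint16) << b for b in range(4, 8)).astype(np.uint8)
    print(msb)                             # grayscale-to-binary conversion
    print(approx)                          # the image represented with only the top 4 bits
```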
4.11 IMAGE ENHANCEMENT IN FREQUENCY DOMAIN
An image can be broken down into a collection of sine waves with varying frequency and
phase using the Fourier transform. Because a Fourier transform is entirely
reversible, if we compute a picture's Fourier transform and then
immediately inverse convert the result, we can recover the original image.
On the other hand, we can emphasize some frequency components and
attenuate others by multiplying each element of the Fourier coefficient by
an appropriately selected weighting function.After computing an inverse
transform, the corresponding changes in the spatial form can be observed.
Frequency domain filtering, often known as Fourier filtering, is the
process of selectively enhancing or suppressing frequency components.
The adjacency relationship between the pixels is described by the spatial
representation of image data. The frequency domain representation, on the
other hand, groups the visual data based on the frequency distribution. In
frequency domain filtering, the picture data is divided into different
spectral bands, each of which represents a particular range of image
information. Frequency domain filtering is the method of selective
frequency inclusion or exclusion.
It is possible to perform filtering in the frequency domain by specifying
the frequencies that should be kept and the frequencies that should be
discarded. Spatial domain filtering is accomplished by convolving the
image with a filter kernel. We know that
Convolution in spatial domain = Multiplication in the frequency domain
If filtering in the spatial domain is done by convolving the input image f (m, n) with the filter kernel h(m, n), then the equivalent frequency-domain relation is
G (k, l) = F (k, l) H (k, l)
Here, F (k, l) is the spectrum of the input image and H (k, l) is the
spectrum of the filter kernel. Thus, frequency domain filtering is
accomplished by taking the Fourier transform of the image and the Fourier
transform of the kernel, multiplying the two Fourier transforms, and taking
the inverse Fourier transform of the result. The multiplication of the
es
Fourier transforms needs to be carried out point by point. This point-by-
point multiplication requires that the Fourier transforms of the image and
the kernel themselves have the same dimensions. As convolution kernels
are commonly much smaller than the images they are used to filter, it is
required to zero pad out the kernel to the size of the image to accomplish
this process.
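The point-by-point multiplication described above can be sketched as follows, assuming NumPy's FFT and an ideal low-pass transfer function chosen purely for illustration.

```python
import numpy as np

def frequency_filter(f, H):
    """Filter image f with a frequency response H of the same shape: G = F . H."""
    F = np.fft.fft2(f)
    G = F * H                                  # point-by-point multiplication of the spectra
    return np.real(np.fft.ifft2(G))

def ideal_lowpass(shape, cutoff):
    """Ideal low-pass transfer function: 1 inside a circle of radius `cutoff`, 0 outside."""
    M, N = shape
    u = np.fft.fftfreq(M)[:, None]
    v = np.fft.fftfreq(N)[None, :]
    return (np.sqrt(u**2 + v**2) <= cutoff).astype(float)

if __name__ == "__main__":
    f = np.random.rand(64, 64)
    g = frequency_filter(f, ideal_lowpass(f.shape, cutoff=0.1))   # a smoothed image
```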
4.12 HOMOMORPHIC FILTER
An image f (n1, n2) can be modelled as the product of an illumination component i (n1, n2) and a reflectance component r (n1, n2) at every point. Based on this fact, the simple model for an image is given by
f (n1, n2) = i (n1, n2) r (n1, n2)
Because f (n1, n2) is the product of i (n1, n2) with r (n1, n2), the log of f (n1, n2)
separates the two components as illustrated below:
where FI (k, l) and FR (k, l) are the Fourier transform of the illumination
and reflectance components respectively. Then, the desired filter function
H(k,l) can be applied separately to the illumination and the reflectance
component separately as shown below:
The desired enhanced image is obtained by taking the exponential
es
operation as given below:
as shown in figure 9
If multiple images of a given region are available for approximately the
same date and if a part of one of the images has some noise, then that part
can be compensated from other images available through image addition.
The concept of image addition is illustrated in figure 11.
Image subtraction can be used to remove certain features in the image. The image subtraction is
illustrated in the figure 12.
In image multiplication, if we are interested in a part of an image, then extracting that area can be done by
multiplying the area by one and the rest by zero. The operation is depicted
in the figure 13.
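A short, assumed illustration of the arithmetic operations discussed in this section: addition by averaging, subtraction, multiplication by a binary mask, and division with a small offset to avoid dividing by zero.

```python
import numpy as np

a = np.random.randint(0, 256, (4, 4)).astype(np.int16)   # two images of the same scene
b = np.random.randint(0, 256, (4, 4)).astype(np.int16)

addition = np.clip((a + b) // 2, 0, 255)      # averaging frames suppresses noise
subtraction = np.clip(a - b, 0, 255)          # removes features common to both images
mask = np.zeros_like(a); mask[1:3, 1:3] = 1   # region of interest (ones), rest zeros
multiplication = a * mask                     # keeps only the region of interest
division = a.astype(float) / (b + 1)          # +1 avoids division by zero (e.g. shading correction)
```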
4.15 SUMMARY
By modifying intensity functions, picture spectral content, or a
combination of these functions, image enhancement aims to alter how
the image is perceived.
Either the spatial domain or the frequency domain can be used for
image enhancement. Application determines which image-
enhancement method should be used.
The gray-level changes used in the spatial-domain picture
enhancement technique include image negative, image slicing, image
4.17 UNIT END EXERCISES
12) State and explain various arithmetic operations performed on an
image.
5
BINARY IMAGE PROCESSING
Unit Structure
5.0 Objectives
5.1 Introduction
5.2 Mathematical morphology
5.3 Structuring elements
5.4 Morphological image processing
5.5 Logical operations
5.6 Morphological operations
5.6.1 Dilation
5.6.2 Erosion
5.6.3 Dilation and Erosion-based operations
5.7 Distance Transform
5.8 Summary
5.9 List of References
5.10 Unit End Exercises
5.0 OBJECTIVES
To get familiar with the concept of binary image processing and
various morphological operations involved
5.1 INTRODUCTION
A highly fruitful area of image processing is mathematical morphology. A
method for removing visual elements that are helpful for representation
and description is mathematical morphology. The technique was originally
developed by Matheron and Serra at the Ecole des Mines in Paris. The
collection of structural data on the visual domain serves as the inspiration.
Set theory serves as the sole foundation for the subject matter of
mathematical morphology. Several useful operators described in
Figure 1: Example of structuring element reference
position in the output image, the binary outcome of that logical operation
is saved. The structuring element's size, content, and type of logical
operation all influence the impact that is produced. The binary picture and
structural element sets could be defined in 1, 2, or even higher dimensions
rather than being limited to sets on the 2D plane. Do the logical operation
if the structuring element is precisely positioned on the binary image; else,
leave the generated binary image pixel alone.
the OR operation result of images A and B is shown in figure 4.
5.6.1 Dilation
The binary picture is expanded from its original shape during the dilation
process. The structural element determines how the binary picture is
enlarged. Compared to the image itself, this structuring element is smaller
in size; typically, the structuring element is 3 × 3. The structuring element
is reflected and shifted from left to right and from top to bottom
throughout the dilation process, similar to the convolution process. During
each shift, the procedure searches for any overlapping identical pixels
between the structuring element and that of the binary picture. The pixels
under the centre location of the structuring element will be set to 1 or
black if there is an overlap.
X ⊕ B = { z | (B̂)z ∩ X ≠ ∅ }
where B̂ is the image B rotated about the origin. This equation states that
when the image X is dilated by the structuring element B, the outcome
element z would be that there will be at least one element in B that
intersects with an element in X. If this is the case, the position where the
structuring element is being centred on the image will be ‘ON’. This
process is illustrated in figure 7. The black square represents 1 and the
white square represents 0.
Initially, the centre of the structuring element is aligned at position *. At
this point, there is no overlapping between the black squares of B and the
black squares of X; hence at position * the square will remain white. This
structuring element will then be shifted towards right. At position **, we
find that one of the black squares of B is overlapping or intersecting with
the black square of X. Thus, at position ** the square will be changed to
black. Similarly, the structuring element B is shifted from left to right and
from top to bottom on the image X to yield the dilated image as shown in
figure 7.
The dilation is an expansion operator that enlarges binary objects. Dilation
has many uses, but the major one is bridging gaps in an image, due to the
fact that B is expanding the features of X.
5.6.2 Erosion
Dilation's opposite process is erosion. Erosion decreases an image if
dilation enlarges it. The structural element determines how the image is
reduced. With a 3x3 size, the structural element is typically smaller than
the image. When compared to greater structuring-element sizes, this will
guarantee faster computation times. The erosion process will shift the
structural element from left to right and top to bottom, almost identical to
the dilatation process. The method will check whether there is a complete
overlap with the structuring element at the centre location, denoted by the
centre of the structuring element, or not.If there is no complete
overlapping then the centre pixel indicated by the centre of the structuring
element will be set white or 0.
Let us define X as the reference binary image and B as the structuring
element. Erosion is defined by the equation
X ⊖ B = { z | (B)z ⊆ X }
The above equation states that the outcome element z is considered only
when the structuring element is a subset or equal to the binary image X.
This process is depicted in figure 8. Again, the white square indicates 0
and the black square indicates 1.
The erosion process starts at position *. Here, there is no complete
overlapping, and so the pixel at the position * will remain white. The
structuring element is then shifted to the right and the same condition is
observed. At position **, complete overlapping is not present; thus, the
black square marked with ** will be turned to white. The structuring
element is then shifted further until its centre reaches the position marked
by ***. Here, we see that the overlapping is complete, that is, all the black
squares in the structuring element overlap with the black squares in the
image. Hence, the centre of the structuring element corresponding to the
image will be black.
Erosion is a thinning operator that shrinks an image. By applying erosion
to an image, narrow regions can be eliminated, while wider ones are
thinned.
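A hedged sketch of binary dilation and erosion following the set definitions above; the padding behaviour and the function names are choices made here, and opening is shown as erosion followed by dilation.

```python
import numpy as np

def dilate(X, B):
    """Binary dilation: output is 1 wherever the reflected B, centred there, overlaps X."""
    m, n = B.shape
    pad = np.pad(X, ((m // 2, m // 2), (n // 2, n // 2)), mode="constant")
    Bf = np.flip(B)                                          # reflected structuring element
    out = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            out[i, j] = int(np.any(pad[i:i + m, j:j + n] & Bf))
    return out

def erode(X, B):
    """Binary erosion: output is 1 only where B, centred there, fits completely inside X."""
    m, n = B.shape
    pad = np.pad(X, ((m // 2, m // 2), (n // 2, n // 2)), mode="constant")
    out = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            out[i, j] = int(np.all(pad[i:i + m, j:j + n][B == 1]))
    return out

if __name__ == "__main__":
    X = np.zeros((7, 7), dtype=int); X[2:5, 2:5] = 1         # a 3x3 square object
    B = np.ones((3, 3), dtype=int)                            # structuring element
    print(dilate(X, B))     # the square grows by one pixel on every side
    print(erode(X, B))      # only the centre pixel of the square survives
    opened = dilate(erode(X, B), B)                           # opening = erosion then dilation
```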
Erosion and dilation can be combined to solve specific filtering tasks. The
most widely used combinations are opening and closing.
1] Opening : Opening is based on the morphological operations, erosion
and dilation. Opening breaks up thin strips, smoothes the interior of object
contours, and removes thin image areas. It involves applying erosion and
dilation procedures to the image in that order.
The opening procedure is used to clean up CCD flaws and image noise.
By rounding corners from inside the object where the kernel fits, opening smoothes the object contours.
5.7 DISTANCE TRANSFORM
5.7.1 Euclidean distance
The value at a pixel in the Euclidean distance transform is directly
proportional to the distance in Euclidean terms between that pixel and the
object pixel that is closest to it. The procedure is noise-sensitive because it
determines the value at the pixel of interest using the value at a single
object pixel.
The input image and its Euclidean distance transform are shown in the
figure 9.
5.7.2 City block distance
This metric measures the path between the pixels based on a four-connected neighbourhood, as shown in figure 10. For a pixel 'p' with the coordinates (x, y), the set of pixels given by
N4(p) = { (x+1, y), (x−1, y), (x, y+1), (x, y−1) }
is called its four neighbours, as shown in Fig. 10. The input image and its city block distance transform are shown in Fig. 11.
Figure 11: City block distance transform
5.7.3 Chessboard distance
The chessboard distance of two pixels is given by
D8(p, q) = max( |x1 − x2| , |y1 − y2| )
The chessboard distance metric measures the path between the pixels based on an eight-connected neighbourhood.
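The three distance measures discussed in this section can be sketched as follows; the pixel coordinates are passed as (row, column) tuples and the function names are assumptions.

```python
import numpy as np

def city_block(p, q):
    """D4 distance: |x1 - x2| + |y1 - y2| (4-connected path length)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def chessboard(p, q):
    """D8 distance: max(|x1 - x2|, |y1 - y2|) (8-connected path length)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def euclidean(p, q):
    """Straight-line distance between the two pixels."""
    return float(np.hypot(p[0] - q[0], p[1] - q[1]))

if __name__ == "__main__":
    p, q = (0, 0), (3, 4)
    print(city_block(p, q), chessboard(p, q), euclidean(p, q))   # 7, 4, 5.0
```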
The quasi-Euclidean distance transform of the image is shown in Figure
14
5.8 SUMMARY
The principle behind morphological image processing is to probe an
image with a tiny form or pattern called a structuring element. To
carry out a certain activity, structural elements of various shapes may
be used.
The erosion procedure involves probing the image with the structuring
element by successively moving the structuring element's origin to all
potential pixel locations. If the structuring element is entirely
contained in the image, the result is 1, otherwise it is 0.
1] Explain the mathematical morphology.
2] What do you mean by structuring elements?
3] Write a note on Morphological image processing.
4] Explain the logical operations.
6] What is Dilation?
7] What is Erosion?
6
COLOUR IMAGE PROCESSING
Unit Structure
6.0 Objectives
6.1 Introduction
6.2 Formation of colour
6.3 Human perception of colour
6.4 Colour Model
6.4.1 RGB colour model
6.4.2 CMY colour model
6.4.3 HSI colour model
6.4.4 YIQ colour model
6.4.5 YCbCr colour coordinate
6.5 Colour image quantization
6.0 OBJECTIVES
To understand the human perception and concepts related to colour
image processing
To learn different colour models
To get familiar with colour histogram equalization
6.1 INTRODUCTION
The perception of colour is the result of incoming visible light on the
retina. The most visually arresting aspect of every image is its colour,
which also has a big impact on how scenically beautiful it is.
Understanding the nature of light is required in order to comprehend
colour. Light has two distinct natures. It exhibits wave and particle
characteristics. After striking the retina, photons cause electric impulses
that, once they reach the brain, are interpreted into colour. Light waves
with various wavelengths are seen as having various colours.The human
eye cannot, however, see every wavelength. The visible spectrum is made
up of wavelengths between 380 and 780 nanometers.
1] Additive colour formation
In additive colour creation, the total amount of photons in the same range that are present in all of the component colours makes up the final colour. TV monitors use the additive colour-formation principle.
2] Subtractive colour formation
When light is transmitted through a light filter, subtractive colour creation
happens. A light filter partially transmits and partially absorbs the light
that it receives. For instance, a green filter allows radiation from the green
portion of the spectrum to pass while blocking radiation from other
wavelengths. The resultant colour is made up of the wavelengths that can
pass through each filter when the filters are used in sequence. The projection of colour slides onto a screen results in subtractive colour generation.
The strength of the colours one receives for each of the wavelengths in the
visual spectrum is determined by the sensitivity curves of the rho, gamma,
and beta sensors in the human eye. The visual system can create the
perception of colour by selective sensing of various light wavelengths.The
6.4.2 CMY colour model
Figure 4: CMY colour cube
CMY is a subtractive colour model. From Fig. 5, it is obvious that
C = 1 − R,  M = 1 − G,  Y = 1 − B
6.4.3 HSI colour model
Hue, Saturation, and Intensity is referred to as HSI. The observer's
perceived main colour is represented by hue. It is a characteristic
connected to the predominant wavelength. The term "saturation" describes
the degree of purity or the quantity of white light combined with a hue.
Brightness is reflected in intensity. As hue and saturation correspond to
how colours are seen by humans, HSI decouples colour information from
intensity information, making this representation particularly helpful for
creating image-processing algorithms. The HSI colour space is popular because it is based on how people perceive colour.
The conversion from RGB space to HSI space is given below:
The HSI colour model can be considered as a cylinder, where the
6.5 COLOUR IMAGE QUANTIZATION
The process of dividing the original colour space into larger cells in order
to reduce the size of a colour space is known as colour quantization. The
method of color-image quantization involves lowering the number of
colours in a digital colour image. A lossy image compression operation is
essentially what color-image quantization is. There are two main processes
in the process of quantizing colour images:
colour image.
The basic goal of colour image quantization is to map the original colour image's set of colours to the significantly smaller set of colours of the quantized image.
1] Uniform colour quantisation
The formula for the uniform colour quantization scheme is
Figure 7: Representation of uniform quantisation in two dimensions
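A minimal sketch of uniform colour quantisation, assuming 8-bit RGB input and an equal number of cells along each axis; the cell-centre reconstruction rule and the function name are choices made here.

```python
import numpy as np

def uniform_colour_quantize(rgb, levels_per_channel=4):
    """Divide each RGB axis into equal cells and map every colour to its cell centre."""
    step = 256 // levels_per_channel              # width of one cell along each axis
    cell = (rgb // step).astype(int)              # cell index for R, G and B
    return (cell * step + step // 2).astype(np.uint8)   # representative (centre) colour

if __name__ == "__main__":
    img = np.random.randint(0, 256, (2, 2, 3), dtype=np.uint8)   # a tiny RGB image
    print(uniform_colour_quantize(img))          # at most 4^3 = 64 distinct colours remain
```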
2] Non-uniform colour quantisation
By utilizing non-uniform quantization, different thresholds are used on
quantization.
Non-uniform quantization over various thresholds is mathematically
represented as follows
A colour histogram records how many times each specific colour appears in a colour image.
6.7 SUMMARY
The range of visible light is between 400 nm (blue) and 700 nm (red).
Red, green, and blue are replaced by the three primaries X, Y, and Z in
CIE.
The different colour models are RGB, CMY, YCbCr, YIQ, etc.
The RGB model is the most often used additive model. In an RGB model, adding a colour and its complement creates white.
Each pixel of the colour image will have three values, one each for the
red, green and blue components.
Woods, Pearson, 2008.
3) Scilab Textbook Companion for Digital Image Processing, S.
Jayaraman, S. Esakkirajan and T. Veerakumar, 2016
(https://scilab.in/textbook_companion/generate_book/125).
7
IMAGE SEGMENTATION
Unit Structure
7.0 Objectives
7.1 Introduction
7.2 Image segmentation techniques
7.3 Region approach
7.3.1 Region growing
7.3.2 Region splitting
7.3.3 Region splitting and merging
7.4 Clustering techniques
7.4.1 Hierarchical clustering
7.4.2 Partitional clustering
7.4.3 k-means clustering
7.7.8 Laplacian of Gaussian (LoG)
7.7.9 Difference of Gaussian filters (DoG)
7.7.10 Canny edge detectors
7.8 Edge Linking
7.9 Hough Transform
7.10 Summary
7.11 List of References
7.12 Unit End Exercises
7.0 OBJECTIVES
To separate or combine different image elements that can be used to
develop objects of interest that can be the subject of investigation and
interpretation.
To get familiar with different segmentation techniques along with their
applicability
7.1 INTRODUCTION
Image segmentation is the division of a picture into groups of pixels that
7.3 REGION APPROACH
7.3.1 Region growing
In region growing, a region is formed from the seed. After that, the resulting region is dropped from
the procedure. The remaining pixels are used to choose a new seed. This
keeps happening until every pixel has been assigned to a segment. The
settings for each segment must be updated when pixels are gathered. The
initial seed selected and the sequence in which neighbouring pixels are
evaluated may have a significant impact on the segmentation that results.
The choice of homogeneity criteria in image growth depends not only on
the issue at hand but also on the kind of image that needs to be divided
into segments. The criteria used to determine whether a pixel should be
included in the area or not, the connectivity type used to identify
neighbours, and the approach taken to visit neighbouring pixels all affect
how regions evolve.
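A hedged sketch of region growing with a simple homogeneity criterion (absolute difference from the seed intensity); the 4-connectivity, the tolerance parameter and the function name are assumptions, not the text's exact algorithm.

```python
import numpy as np

def region_grow(img, seed, tol=10):
    """Grow a region from `seed`, adding 4-connected neighbours whose intensity
    differs from the seed value by at most `tol` (the homogeneity criterion)."""
    grown = np.zeros(img.shape, dtype=bool)
    stack = [seed]
    seed_val = float(img[seed])
    while stack:
        y, x = stack.pop()
        if grown[y, x]:
            continue
        grown[y, x] = True
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):      # N, S, W, E neighbours
            ny, nx = y + dy, x + dx
            if (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]
                    and not grown[ny, nx]
                    and abs(float(img[ny, nx]) - seed_val) <= tol):
                stack.append((ny, nx))
    return grown

if __name__ == "__main__":
    img = np.zeros((6, 6), dtype=np.uint8); img[1:4, 1:4] = 200   # a bright square
    print(region_grow(img, seed=(2, 2), tol=10).astype(int))       # segments only the square
```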
Comparing region-growing to traditional segmentation methods has
Splitting
1] Let R stand for the full image.
2] Split or partition the image into successively smaller quadrant portions
using the predicate P.
As seen in figure 1, a quadtree structure provides a practical way to
visualize the splitting method. The root of a quadtree represents the
complete image, and each node represents a subdivision.
Figure 1: (a) Splitting of an image (b) Representation by a quadtree
It is probable that contiguous zones with the same characteristics will be
found in the final partition. Applying merging and only merging nearby
regions whose combined pixels fulfil the predicate P will address this
flaw.
Merging
Any adjacent regions that are similar enough should be merged. The split-
and-merge algorithm's process is described below:
1. Commence with the entire picture.
7.4 CLUSTERING TECHNIQUES
7.4.1 Hierarchical clustering
Agglomerative clustering successively merges data samples into larger and larger clusters. There are three categories into which the
algorithm can be subdivided: (1) single-link algorithm, (2) complete-link
algorithm, and (3) minimum-variance algorithm. According to the shortest
distance between data samples from two clusters, the single-link algorithm
combines two clusters. In light of this, the algorithm permits a propensity
to create clusters with elongated shapes. The complete-link approach,
however, consistently yields compact clusters. By incorporating the new cluster with the least increase in the cost function, the minimum-variance algorithm joins two clusters. This approach, known as the
pairwise-nearest-neighbor algorithm, has generated a lot of attention in
vector quantization.
Divisive (divide-and-conquer) clustering starts with the entire dataset in one cluster and splits it repeatedly until single-point clusters are found at the leaf nodes. In contrast to agglomerative clustering, it employs a reverse clustering approach. The divisive algorithm performs an exhaustive search over all pairs of clusters of the data samples on each node.
Hierarchical algorithms include COBWEB, CURE, and CHAMELEON.
7.4.3 k-means clustering
The k-means algorithm proceeds as follows :
1] Choose K initial cluster centres z1(1), z2(1), ………. zK(1).
2] Distribute the samples x among the K clusters at the kth iterative step using the relation
3] Determine the new cluster centres zj(k + 1), j = 1, 2, ..., K, with the goal of minimizing the sum of the squared distances between all points in Cj(k) and the new cluster centre. The sample mean of Cj(k) is the quantity that minimizes this. Consequently, the new cluster centre is given by
zj(k + 1) = (1 / Nj) Σ x,  summed over x ∈ Cj(k)
where Nj is the number of samples in Cj(k).
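The three steps above can be sketched as a plain k-means routine; the initialisation by random sampling and the fixed iteration count are simplifying assumptions made here.

```python
import numpy as np

def kmeans(x, K=2, iters=20, rng=np.random.default_rng(0)):
    """Plain k-means: assign samples to the nearest centre, then recompute the means."""
    centres = x[rng.choice(len(x), K, replace=False)]        # step 1: initial cluster centres
    for _ in range(iters):
        d = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)                         # step 2: nearest-centre assignment
        for j in range(K):                                    # step 3: new centres = sample means
            if np.any(labels == j):
                centres[j] = x[labels == j].mean(axis=0)
    return centres, labels

if __name__ == "__main__":
    pts = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
    centres, labels = kmeans(pts, K=2)
    print(centres)        # close to (0, 0) and (5, 5)
```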
Dissimilarity function
The dissimilarity function is a crucial parameter for determining how
different data patterns are from one another. The diversity and size of the
characteristics contained in patterns necessitates careful selection of the
dissimilarity functions. Different clustering outcomes are produced by
using various clustering dissimilarities. The Minkowski metric is a popular way to express how different two data vectors are, and it is given by
d(x, y) = ( Σi |xi − yi|^p )^(1/p)
p =1 means the L1 distance or Manhattan distance
p =2 means the Euclidean distance or L2 distance
Below are the prerequisites that a distance function must fulfil in order to be a metric.
The cosine similarity is given by
s(x, y) = (x · y) / ( ||x|| ||y|| )
Here, the dot indicates the inner product of two vectors and || · || is the L2 norm in Euclidean space. The Mahalanobis distance is another
method for generalizing the Euclidean distance to the scale invariant
situation and is expressed as
.in
In certain clustering algorithms, the Mahalanobis distance is used as the measure of dissimilarity.
When used within the context of K-means clustering, the application of the Mahalanobis distance is hampered by the singularity problem and a high computational cost.
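The dissimilarity measures discussed above can be written in a few lines with numpy. This is a minimal sketch; the function names are illustrative, and the Mahalanobis version assumes the covariance matrix is invertible.

import numpy as np

def minkowski(x, y, p=2):
    # p = 1 gives the Manhattan (L1) distance, p = 2 the Euclidean (L2) distance.
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

def cosine_similarity(x, y):
    # Inner product of the vectors divided by the product of their L2 norms.
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def mahalanobis(x, y, cov):
    # Scale-invariant generalization of the Euclidean distance.
    diff = x - y
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 1.0])
print(minkowski(x, y, p=1), minkowski(x, y, p=2), cosine_similarity(x, y))
print(mahalanobis(x, y, cov=np.eye(3)))   # with identity covariance this equals the L2 distance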
7.5 THRESHOLDING
Segments made using thresholding techniques have pixels whose grey values lie within a chosen range. Applying a single, global threshold to the whole image is the most straightforward thresholding.
classification error. Variable thresholding could be utilized if the overlap is brought on by differences in illumination across the image. One way to see this is as a type of local segmentation.
When an image consists of an object on a contrasting background, its grey-level histogram is bimodal: the pixels inside and outside the object are represented by the two peaks. The comparatively few points along the object's edge correspond to the dip between the peaks. It is common practice to determine the threshold grey level from this dip.
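Locating the dip between the two peaks can be done programmatically. The sketch below simply smooths the grey-level histogram and takes the minimum between the two most populated grey levels; the smoothing width and the peak-finding strategy are illustrative assumptions, not prescribed by the text.

import numpy as np

def dip_threshold(image):
    # Grey-level histogram of an 8-bit image.
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    # Light smoothing so small fluctuations do not create false peaks.
    smooth = np.convolve(hist, np.ones(5) / 5.0, mode='same')
    # Take the two most populated grey levels as the peaks and
    # place the threshold at the minimum of the dip between them.
    p1 = int(np.argmax(smooth))
    masked = smooth.copy()
    masked[max(0, p1 - 20):min(256, p1 + 20)] = -1    # suppress the first peak's neighbourhood
    p2 = int(np.argmax(masked))
    a, b = sorted((p1, p2))
    return a + int(np.argmin(smooth[a:b + 1]))

img = np.concatenate([np.full(500, 60), np.full(500, 180)])
img = img + np.random.randint(-10, 10, size=img.shape)   # two modes with a little noise
print(dip_threshold(img))                                # threshold lies between the modes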
For an object whose boundary is determined via thresholding, the histogram is the (negative) derivative of the area function:

H(D) = − dA(D) / dD

where D is the gray level, A(D) is the area of the object obtained by thresholding at gray level D, and H(D) is the histogram.
If the operator ∇ is applied to the function f, then

∇f = ( ∂f/∂x , ∂f/∂y )

which is the gradient vector of f.
The gradient magnitude and gradient orientation are two functions that can be expressed in terms of the directional derivatives; it is therefore feasible to determine the gradient's magnitude ||∇f|| and its direction. The strength of the edge is determined by the gradient's magnitude, which reveals how much neighbouring pixels differ from one another. The gradient magnitude is given by

||∇f|| = sqrt( (∂f/∂x)^2 + (∂f/∂y)^2 )
The gradient magnitude ||∇f|| gives the greatest rate of increase of f(x, y) per unit distance, measured in the direction of the gradient.
The gradient orientation indicates the direction of the largest change, which is likely to be the direction across the edge. The gradient's direction is determined by

θ = tan^(−1)( (∂f/∂y) / (∂f/∂x) )
A function of two variables, f (x, y), is an image. Only the partial
derivative along the x-axis is mentioned in the previous equation. It is
possible to identify pixel discontinuity in eight different directions,
including up, down, left, right, and along the four diagonals.
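Using simple finite differences for the partial derivatives, the gradient magnitude and orientation can be computed as below. This is a minimal sketch; the use of np.gradient for the central differences is an implementation choice, not taken from the text.

import numpy as np

def gradient_magnitude_orientation(image):
    f = image.astype(float)
    # np.gradient returns the derivative along the rows (y) first, then the columns (x).
    dfdy, dfdx = np.gradient(f)
    magnitude = np.sqrt(dfdx ** 2 + dfdy ** 2)    # edge strength
    orientation = np.arctan2(dfdy, dfdx)          # direction of the largest change, in radians
    return magnitude, orientation

# Example: a vertical step edge
img = np.zeros((8, 8))
img[:, 4:] = 255
mag, ori = gradient_magnitude_orientation(img)
print(mag.max(), ori[4, 4])    # strong response along the vertical edge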
locate edges when there is noise. The Roberts cross-gradient operator is the simplest way to implement the first-order partial derivatives.
These filters have the shortest support, which results in more precise positioning of the edges, but the short support of the filters has the drawback of making them susceptible to noise.
The support is longer on the Prewitt masks. The edge detector is less
susceptible to noise because the Prewitt mask differentiates in one
direction and averages in the other.
The Sobel kernels are named after Irwin Sobel. The Sobel kernel is based on central differences, but when averaging it gives more weight to the central pixels. The Sobel kernels can be viewed as 3x3 approximations to the first derivatives of Gaussian kernels. The Sobel operator's partial derivatives are calculated with the kernels

Gx:  −1  0  +1        Gy:  −1  −2  −1
     −2  0  +2               0   0   0
     −1  0  +1              +1  +2  +1
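A sketch applying Roberts, Prewitt, and Sobel kernels by 2-D convolution is given below. The kernels follow the common sign conventions (sign conventions differ between texts, so treat the exact signs as an assumption); scipy is used only for the convolution.

import numpy as np
from scipy.signal import convolve2d

ROBERTS_X = np.array([[1, 0], [0, -1]])
ROBERTS_Y = np.array([[0, 1], [-1, 0]])
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
PREWITT_Y = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]])
SOBEL_X   = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y   = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

def edge_magnitude(image, kx, ky):
    f = image.astype(float)
    gx = convolve2d(f, kx, mode='same', boundary='symm')   # response in the x direction
    gy = convolve2d(f, ky, mode='same', boundary='symm')   # response in the y direction
    return np.sqrt(gx ** 2 + gy ** 2)

img = np.zeros((16, 16)); img[:, 8:] = 255                 # vertical step edge
for name, kx, ky in [('Roberts', ROBERTS_X, ROBERTS_Y),
                     ('Prewitt', PREWITT_X, PREWITT_Y),
                     ('Sobel',   SOBEL_X,   SOBEL_Y)]:
    print(name, edge_magnitude(img, kx, ky).max())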
7.7.6 Frei-Chen edge detector
Frei-Chen mask edge detection is performed by mapping the intensity vector with a linear transformation and then identifying edges based on the angle between the intensity vector and its projection onto the edge subspace. Frei-Chen edge detection is realized using normalized weights. What is unique about the Frei-Chen masks is that they include all of the basis vectors. This implies that a 3x3 image area is represented as a weighted sum of the nine Frei-Chen masks. The image is first convolved with each of the nine masks. The convolution results of the masks are then combined to produce an inner product. The nine Frei-Chen masks are the following:
The first four Frei-Chen masks are used to represent edges, the next four
to represent lines, and the last mask to represent averages. The image is
operation:
The fact that a Laplacian zero-crossing merely combines the principal
curvatures together makes it difficult to use as an edge detector. In other
words, it doesn't actually determine the gradient's maximum magnitude.
Edges are defined by the Canny edge detector as second-derivative zero-crossings in the direction of the largest first derivative. The Canny operator works in stages. The image is first smoothed with a Gaussian filter.
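One common way to realize these stages is Gaussian smoothing, gradient computation, non-maximum suppression, and double thresholding. The sketch below is a simplified illustration, not a full implementation: the gradient direction is quantized to four sectors, the hysteresis-tracking step is omitted, and the sigma and threshold values are arbitrary assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.signal import convolve2d

def canny_sketch(image, sigma=1.4, low=20, high=60):
    # Stage 1: smooth the image with a Gaussian filter.
    f = gaussian_filter(image.astype(float), sigma)
    # Stage 2: gradient magnitude and orientation (Sobel kernels).
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    gx = convolve2d(f, kx, mode='same', boundary='symm')
    gy = convolve2d(f, kx.T, mode='same', boundary='symm')
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180
    # Stage 3: non-maximum suppression along the quantized gradient direction.
    nms = np.zeros_like(mag)
    for i in range(1, mag.shape[0] - 1):
        for j in range(1, mag.shape[1] - 1):
            a = ang[i, j]
            if a < 22.5 or a >= 157.5:        # horizontal gradient -> compare left/right
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                    # diagonal
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                   # vertical gradient -> compare up/down
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                             # other diagonal
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                nms[i, j] = mag[i, j]
    # Stage 4: double thresholding (hysteresis tracking omitted for brevity).
    strong = nms >= high
    weak = (nms >= low) & ~strong
    return strong, weak

img = np.zeros((32, 32)); img[:, 16:] = 255
strong, weak = canny_sketch(img)
print(strong.sum(), weak.sum())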
7.8 EDGE LINKING
One can threshold an edge-magnitude image and thin the resulting binary image down to single-pixel-wide, closed, linked boundaries if the edges are reasonably strong and the noise level is low. Under less-than-ideal circumstances, however, such an edge image will contain gaps that need to be filled. Small gaps can be filled by searching a neighbourhood of 5 by 5 pixels or larger, centred on an end point, for other end points, and then filling in boundary pixels as necessary to join them. This, however, has the potential to oversegment images with several edge points.
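The simple gap-filling idea just described — searching a small neighbourhood around each end point for another end point and bridging the two — can be sketched as follows. The end-point test and the straight-line bridging via np.linspace are illustrative choices.

import numpy as np

def endpoints(edges):
    # An edge pixel is an end point if it has at most one 8-connected edge neighbour.
    pts = []
    h, w = edges.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            if edges[i, j] and edges[i - 1:i + 2, j - 1:j + 2].sum() <= 2:
                pts.append((i, j))
    return pts

def link_gaps(edges, radius=2):
    # Bridge pairs of end points that lie within a (2*radius+1)^2 neighbourhood.
    out = edges.copy()
    pts = endpoints(edges)
    for a in range(len(pts)):
        for b in range(a + 1, len(pts)):
            (i1, j1), (i2, j2) = pts[a], pts[b]
            if abs(i1 - i2) <= radius and abs(j1 - j2) <= radius:
                n = max(abs(i1 - i2), abs(j1 - j2)) + 1
                for i, j in zip(np.linspace(i1, i2, n), np.linspace(j1, j2, n)):
                    out[int(round(i)), int(round(j))] = True   # fill pixels along the gap
    return out

edges = np.zeros((10, 10), dtype=bool)
edges[5, 1:4] = True
edges[5, 6:9] = True          # two segments with a small gap between them
print(link_gaps(edges, radius=3).sum())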
2. For each of the points from Step 1, compute the edge quality function for the path from P to that point.
3. Choose the point that most improves the edge quality function from P.
4. Begin the next iteration from the point selected in Step 3.
Suppose we have a set of edge points (xi, yi) that lie along a straight line having parameters ρ0, θ0 in the normal form ρ = x cos θ + y sin θ. Each edge point maps to a sinusoidal curve in the (ρ, θ) space, but these curves must all intersect at the point (ρ0, θ0), since this is the one line that they all have in common.
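A minimal accumulator-based sketch of the Hough transform for straight lines is shown below: each edge point votes for all (ρ, θ) pairs on its sinusoid, and the most-voted cell recovers the common line. The quantization steps and accumulator size are illustrative assumptions.

import numpy as np

def hough_lines(points, rho_res=1.0, n_theta=180, rho_max=200):
    thetas = np.deg2rad(np.arange(n_theta))            # theta in [0, pi)
    n_rho = int(2 * rho_max / rho_res) + 1
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in points:
        # Each edge point (x, y) traces the sinusoid rho = x cos(theta) + y sin(theta).
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rhos + rho_max) / rho_res).astype(int)
        acc[idx, np.arange(n_theta)] += 1
    r, t = np.unravel_index(np.argmax(acc), acc.shape)
    return r * rho_res - rho_max, np.rad2deg(thetas[t])  # (rho0, theta0) of the strongest line

# Points lying on the line y = x  (rho0 = 0, theta0 = 135 degrees in this parameterization)
pts = [(i, i) for i in range(20)]
print(hough_lines(pts))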
7.10 SUMMARY
By grouping the patterns into clusters or groups so that patterns within a cluster are more similar to one another than to patterns belonging to different clusters, the clustering technique aims to assess the relationships between the patterns of the data set.
Edges are essentially discontinuities in the image intensity brought about by changes in the image structure.
7.12 UNIT END EXERCISES
1) What is content-based image retrieval?
2) What is an ‘edge’ in an image? On what mathematical operation are the two basic approaches to edge detection based?
3) What are the three stages of the Canny edge detector? Briefly explain
each phase.
4) Compare the Canny edge detector with the Laplacian of Gaussian edge
detector
8
IMAGE COMPRESSION
Unit Structure
8.0 Objectives
8.1 Introduction
8.2 Need for image compression
8.3 Redundancy in images
8.4 Image-compression scheme
8.5 Fundamentals of Information Theory
8.5.1 Entropy and mutual information
8.5.2 Shannon’s source coding theorem
8.5.3 Rate-distortion theory
8.6 Run-length coding
8.6.1 1-D Run-length coding
8.0 OBJECTIVES
To understand and get familiar with the following concepts:
need for image compression along with their metrics and techniques
lossless and lossy image compression
spatial domain and frequency domain
8.1 INTRODUCTION
The demand for digital information has rapidly increased as a result of the
development of multimedia technologies over the past few decades. The
widespread use of digital photographs is largely due to technological
advancements. Applications like medical and satellite imagery frequently
use still photos. Digital photos contain a tremendous quantity of data. As
digital images find more applications, image data size reduction for both storage and transmission is becoming more and more crucial. Image compression is a mapping from a higher-dimensional space to a lower-dimensional space. Many multimedia applications, including image storage and transmission, rely heavily on image compression. The
fundamental objective of image compression is to represent an image with
the fewest possible bits while maintaining acceptable image quality. All
image-compression methods aim to reduce the amount of data as much as
possible while removing statistical redundancy and taking advantage of
perceptual irrelevancy.
8.2 NEED FOR IMAGE COMPRESSION
The amount of information that computers handle has increased tremendously over the past few decades thanks to advancements in Internet, teleconferencing, multimedia, and high-definition television technology. Consequently, a significant issue in multimedia systems is the storage and transmission of the digital image component. To display photographs with an acceptable level of quality, a tremendous amount of data is required.
For instance, the amount of storage space needed to hold a 1600 x 1200
colour photograph is
1200 ×1600 × 8 × 3 = 46,080,000 bits
= 5,760,000 bytes
= 5.76 Mbytes
A floppy disc can hold at most 1.44 Mbytes. With four floppies, the maximum space available is 1.44 × 4 = 5.76 Mbytes. In other words, a 1600 × 1200 RGB image requires at least four floppies to store.
Every year, the amount of data carried across the Internet doubles, and a
sizable chunk of that data is made up of photographs. Any given device
will experience significant cost savings and become more accessible to use
if its bandwidth requirements are reduced. Photos can be represented more compactly through the use of image compression, allowing for speedier transmission and storage.
8.3 REDUNDANCY IN IMAGES
Because images are typically highly coherent and contain redundant
information, image compression is achievable. Redundancy and
irrelevancy reduction help in compression. Duplication is referred to as
redundancy, while irrelevancy is the portion of an image's information that
the human visual system will not pick up on. Figure 1 depicts the
classification of redundancy
Figure 1: Classification of redundancy
Statistical redundancy
As previously mentioned, there are two kinds of statistical redundancy:
interpixel redundancy and coding redundancy.
When adjacent pixels in an image are correlated, interpixel redundancy
results. The implication is that the pixels next to each other are not
statistically independent. Interpixel redundancy is the term used to
describe the interpixel correlation.
Spatial redundancy
In an image, the statistical relationship between adjacent pixels is
represented by spatial redundancy. The term "spatial redundancy" refers to
the relationship between adjacent pixels in an image. Each pixel in an
image need not be individually represented. Instead, a pixel's neighbours
can be used to predict it. The fundamental idea behind differential coding, which is commonly used in image and video compression, is to eliminate spatial redundancy by prediction.
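The prediction idea can be illustrated with a one-line horizontal predictor: each pixel is predicted from its left neighbour and only the prediction error is kept. This is a minimal sketch of differential coding, not a complete codec.

import numpy as np

def horizontal_prediction_errors(row):
    # Predict each pixel from its left neighbour; keep the first pixel as-is.
    row = row.astype(int)
    errors = np.empty_like(row)
    errors[0] = row[0]
    errors[1:] = row[1:] - row[:-1]        # small values where neighbouring pixels are correlated
    return errors

def reconstruct(errors):
    return np.cumsum(errors)               # exact inverse of the prediction step

row = np.array([100, 101, 101, 103, 104, 180, 181, 181], dtype=np.uint8)
e = horizontal_prediction_errors(row)
print(e)                                   # mostly small numbers plus one jump at the edge
print(np.array_equal(reconstruct(e), row)) # lossless: True

Because the prediction errors cluster around zero for correlated pixels, they can be entropy-coded with far fewer bits than the original pixel values.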
Temporal redundancy
The statistical relationship between pixels from subsequent frames in a
video clip is known as temporal redundancy. Interframe redundancy is another name for temporal redundancy. To lessen temporal
redundancy, motion compensated predictive coding is used. Effective
video compression is achieved by removing a significant amount of
temporal redundancy.
Psychovisual redundancy
The traits of the human visual system (HVS) are connected to
psychovisual redundancy. Visual information is not perceived similarly in
the HVS. It's possible that certain information is more crucial than other
information. Perception won't be impacted if less data is used to represent
less significant visual information. This suggests that visual information is
redundant in a psychovisual sense. Effective compression results from
removing the psychovisual redundant information.
8.4 IMAGE-COMPRESSION SCHEME
Shannon's well-known separation theorem has been used to justify the independent design of the two coding modules, the source coder and the channel coder.
Channel Coding: The channel encoder adds controlled redundancy to the compressed output of the source encoder. The channel encoder's job is to shield the communication system from channel noise and other transmission errors. To reconstruct the compressed bits, the channel decoder makes use of the redundant bits in the bit sequence.
8.5 FUNDAMENTALS OF INFORMATION THEORY
Both lossless and lossy compression are mathematically based on
information theory. Entropy, average information, and rate-distortion
principles are covered in this section.
The entropy H(X) is the lower bound of the bit rate that can represent the source without distortion. The conditional entropy of a random variable X given a random variable Y with a finite alphabet set is defined as

H(X | Y) = − Σx Σy pXY(x, y) log pX|Y(x | y)

where pXY(x, y) is the joint probability density function of X and Y, and pX|Y(x | y) is the conditional probability density function of X given Y. The mutual information I(X; Y) between two random variables X and Y is a measure of the reduction in uncertainty due to conditioning on Y, which is defined as

I(X; Y) = H(X) − H(X | Y)
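As a small numerical illustration of these quantities, the entropy of a discrete source can be computed directly from its symbol probabilities; the probabilities below are arbitrary example values.

import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # 0 * log 0 is taken as 0
    return -np.sum(p * np.log2(p))     # entropy in bits per symbol

# Example source with four symbols
print(entropy([0.2, 0.2, 0.4, 0.2]))   # about 1.92 bits/symbol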
8.5.3 Rate-distortion theory
The rate-distortion function gives the minimum bit rate needed to represent the source with a distortion not exceeding D:

R(D) = min I(X; Y), the minimum being taken over all PY|X with MSE ≤ D

where PY|X denotes the source-coding rule and MSE is the mean squared error of the encoding rule.
8.6 RUN-LENGTH CODING
A very simple lossless compression technique is run-length coding (RLC). By encoding the total number of symbols in a run,
run-length coding takes advantage of the spatial redundancy. The word
"run" denotes the repetition of a symbol, while the word "run-length"
denotes the quantity of repeated symbols. A series of numbers is converted
into a series of symbol pairs (run, value) using run-length coding. For this
type of compression, images with sizable regions of consistent shade make
suitable candidates. The bitmap (BMP) file format used by Windows employs a form of run-length coding.
8.6.1 1-D Run-length coding
One-dimensional run-length coding uses the horizontal correlation between pixels on the same scan line.
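A minimal (run, value) run-length coder for one scan line is sketched below in Python; it is an illustration only, and real file formats use their own byte-level packing.

def rle_encode(line):
    # Convert a sequence of symbols into (run_length, value) pairs.
    pairs = []
    run, current = 1, line[0]
    for symbol in line[1:]:
        if symbol == current:
            run += 1
        else:
            pairs.append((run, current))
            run, current = 1, symbol
    pairs.append((run, current))
    return pairs

def rle_decode(pairs):
    out = []
    for run, value in pairs:
        out.extend([value] * run)
    return out

line = [0, 0, 0, 0, 255, 255, 0, 0, 0]
encoded = rle_encode(line)
print(encoded)                         # [(4, 0), (2, 255), (3, 0)]
print(rle_decode(encoded) == line)     # lossless: True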
Shannon-Fano coding is a top-down approach that uses binary trees. The Shannon-Fano coding algorithm is provided below:
1. The collection of source symbols is arranged in order of decreasing probability.
2. Each symbol is interpreted as the root of a tree.
3. The list is split into two groups with roughly equal overall probabilities.
4. A 0 is appended to the code words of the first group and a 1 to the code words of the second group.
5. Steps 3 and 4 are repeated recursively in each subset until every subset contains a single symbol.
Example: Construct the Shannon–Fano code for the word MUMMY.
Solution: The given word is MUMMY. The number of letters in the word MUMMY is N = 5.
Step 1: Determine the probability of occurrence of each character. The probability of occurrence of each character in MUMMY is
P(M) = 3/5 = 0.6, P(U) = 1/5 = 0.2, P(Y) = 1/5 = 0.2
Step 5: Computation of average length
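The Shannon–Fano construction and the average-length computation for MUMMY can also be reproduced with a short recursive sketch; the rule used to pick the split point is an implementation choice, not prescribed by the text.

def shannon_fano(symbols):
    # symbols: list of (symbol, probability); returns a dict symbol -> code string.
    codes = {s: '' for s, _ in symbols}

    def build(group):
        if len(group) <= 1:
            return
        total, acc = sum(p for _, p in group), 0.0
        # Find the split point where the two halves have roughly equal probability.
        split, best = 1, float('inf')
        for i in range(1, len(group)):
            acc += group[i - 1][1]
            if abs(2 * acc - total) < best:
                best, split = abs(2 * acc - total), i
        for s, _ in group[:split]:
            codes[s] += '0'            # first group gets a 0
        for s, _ in group[split:]:
            codes[s] += '1'            # second group gets a 1
        build(group[:split])
        build(group[split:])

    build(sorted(symbols, key=lambda sp: -sp[1]))
    return codes

# The MUMMY example: P(M) = 0.6, P(U) = 0.2, P(Y) = 0.2
probs = [('M', 0.6), ('U', 0.2), ('Y', 0.2)]
codes = shannon_fano(probs)
print(codes)                                         # e.g. M -> 0, U -> 10, Y -> 11
print(sum(p * len(codes[s]) for s, p in probs))      # average code length: 1.4 bits/symbol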
symbols.
subsequent symbols, it is instantaneous. It can be decoded in only one way for each given string of symbols, making it uniquely decodable. As a result, any string of Huffman-encoded symbols may be decoded by examining each symbol in the string from left to right. For the binary code of Fig. 5, a left-to-right scan of the encoded string 010100111100 reveals that the first valid code word is 01010, which is the code for symbol a3. The next valid code is 011, which corresponds to symbol a1. Continuing
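A compact Huffman code construction using a priority queue is sketched below. The exact code words depend on how ties are broken and may differ from those of Fig. 5, but the code lengths and the prefix property are equivalent; the probabilities used in the usage example are arbitrary.

import heapq

def huffman_codes(probabilities):
    # probabilities: dict symbol -> probability; returns dict symbol -> code string.
    # Each heap entry is (probability, tie_breaker, {symbol: partial_code}).
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)      # the two least probable nodes are merged,
        p2, _, c2 = heapq.heappop(heap)      # prefixing 0 to one branch and 1 to the other
        merged = {s: '0' + c for s, c in c1.items()}
        merged.update({s: '1' + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {'a1': 0.4, 'a2': 0.3, 'a3': 0.1, 'a4': 0.1, 'a5': 0.1}
codes = huffman_codes(probs)
print(codes)
print(sum(probs[s] * len(codes[s]) for s in probs))   # average length, close to the entropy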
(or a message). The code word itself designates a range of real values between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units (say, bits) required to represent the interval becomes larger. The size of the interval is reduced for each symbol in the message in accordance with the probability of its occurrence. The technique achieves (but only in theory) the bound set by Shannon's first theorem, since it does not require, as Huffman's approach does, that each source symbol translate into an integral number of code symbols (i.e., that the symbols be coded one at a time).
The procedure of basic arithmetic coding is shown in Figure 6. Here a five-symbol sequence or message, a1a2a3a3a4, from a four-symbol source is coded. Before the coding process begins, the message is assumed to occupy the entire half-open interval [0, 1). As shown in Table 1, this interval is initially partitioned into four subintervals according to the probabilities of the source symbols. For instance, the symbol a1 is associated with the subinterval [0, 0.2). Because a1 is the first symbol of the message being coded, the message interval is initially narrowed to [0, 0.2).
As a result, in Fig. 6, the range [0, 0.2) is expanded to fill the entire height of the figure, with the values of the narrowed range labelling its end points. The next message symbol then narrows this range further, and the procedure is repeated in accordance with the original source symbol probabilities. In this manner, symbol a2 narrows the subinterval to [0.04, 0.08), a3 further narrows it to [0.056, 0.072), and so on. The last message symbol, which must be reserved as a unique end-of-message indicator, reduces the range to [0.06752, 0.0688). Of course, the message can be represented by any value falling inside this subinterval, such as 0.068. The five symbols of the arithmetically coded message shown in Fig. 6 are thus represented by three decimal digits. This equates to 0.6 decimal digits per source symbol, compared with the source's entropy of 0.58 decimal digits per source symbol. The resulting arithmetic code approaches the bound defined by Shannon's first theorem as the length of the sequence being coded grows. The use of finite-precision arithmetic and the addition of the end-of-message indicator, which is required to distinguish one message from another, are the two factors that cause coding performance to fall short of the bound in practice. The latter issue is
addressed by scaling and rounding strategies introduced in practical
arithmetic coding systems. The scaling approach divides each subinterval
in accordance with the symbol probabilities after renormalizing it to the
[0, 1) range. The rounding technique ensures that the coding subintervals
are faithfully represented despite the truncations brought on by finite
precision arithmetic.
Table 1: Arithmetic coding example

Source symbol    Probability    Initial subinterval
a1               0.2            [0.0, 0.2)
a2               0.2            [0.2, 0.4)
a3               0.4            [0.4, 0.8)
a4               0.2            [0.8, 1.0)
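The interval-narrowing procedure of the example can be reproduced with a few lines of code. The subinterval assignments below are the ones inferred from the worked narrowing and Table 1; this sketch only computes the final interval and a tag value, it is not a practical finite-precision coder.

def arithmetic_encode(message, intervals):
    low, high = 0.0, 1.0
    for symbol in message:
        s_low, s_high = intervals[symbol]
        width = high - low
        # Narrow the current interval in proportion to the symbol's subinterval.
        low, high = low + width * s_low, low + width * s_high
    return low, high

intervals = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4), 'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}
low, high = arithmetic_encode(['a1', 'a2', 'a3', 'a3', 'a4'], intervals)
print(round(low, 5), round(high, 5))   # [0.06752, 0.0688), matching the narrowing in the text
print((low + high) / 2)                # any value in the interval, e.g. about 0.068, encodes the message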
8.10 TRANSFORM-BASED COMPRESSION
In transform coding, the image is divided into a set of non-overlapping blocks, and each block is expressed as a weighted sum of
discrete basis functions. The purpose of transform coding is to decompose
the correlated signal samples into a set of uncorrelated transform
coefficients, such that the energy is concentrated into as few coefficients
as possible. The block diagram of transform-based image coding scheme
is shown in Fig. 7.
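A minimal illustration of the transform coding idea on one 8x8 block, using the 2-D DCT: the block is transformed, the small high-frequency coefficients are discarded, and the block is reconstructed from the few retained coefficients. The block size, the keep-largest-coefficients rule, and the use of scipy's DCT are illustrative assumptions, not a description of any particular standard.

import numpy as np
from scipy.fft import dctn, idctn

def compress_block(block, keep=8):
    # Forward 2-D DCT concentrates the block energy into a few coefficients.
    coeffs = dctn(block.astype(float), norm='ortho')
    # Keep only the 'keep' largest-magnitude coefficients, zero the rest.
    thresh = np.sort(np.abs(coeffs).ravel())[-keep]
    coeffs[np.abs(coeffs) < thresh] = 0.0
    # Inverse transform reconstructs an approximation of the block.
    return idctn(coeffs, norm='ortho'), coeffs

block = np.tile(np.linspace(0, 255, 8), (8, 1))      # a smooth 8x8 block (horizontal ramp)
recon, coeffs = compress_block(block, keep=8)
print(np.count_nonzero(coeffs), 'coefficients kept out of', coeffs.size)
print(float(np.abs(recon - block).max()))            # small reconstruction error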
8.11 SUMMARY
Compression means compact representation. Representing a picture with the fewest possible bits is known as image compression.
9) Describe Huffman Coding.
10) Explain Arithmetic Coding.
11) Explain Transform-based compression.