Applied Signal and Image Processing
M.Sc.
(COMPUTER SCIENCE)
SEMESTER - I
(REVISED SYLLABUS
AS PER NEP 2020)
: Punam Sindhu
Assistant Professor,
Hindi Vidya Prachar Samiti’s Ramniranjan
Jhunjhunwala College Of Arts, Science &
Commerce (Empowered Autonomous)
: Sonali Sambare
Assistant Professor,
SIES College of Arts, Science and Commerce
(Autonomous)
Published by : Director,
Centre for Distance and Online Education,
University of Mumbai,
Vidyanagari, Mumbai - 400 098.
DTP Composed : Mumbai University Press,
Vidyanagari, Santacruz (E), Mumbai - 400 098
Printed by : ipin Enterprises,
Tantia Jogani Industrial Estate, Unit No. 2,
Ground Floor, Sitaram Mill Compound,
J.R. Boricha Marg, Mumbai - 400 011
CONTENTS
Unit No. Title Page No.
Semester- I
Programme Name: M.Sc. Computer Science (Semester I)
Course Name: Applied Signal and Image Processing
Total Credits: 04 Total Marks: 100
College assessment: 50 University assessment: 50
Course Outcome:
● Understand and apply the fundamentals of digital signal processing and frequency
domain operations for image analysis.
● Gain proficiency in image processing techniques such as intensity transformations,
histogram processing, and smoothing.
● Develop skills in edge detection and image segmentation using various algorithms and
approaches.
● Utilize morphological operations for image enhancement, feature extraction, and noise
reduction.
● Apply advanced image processing techniques including feature detection, descriptors,
and segmentation algorithms for complex image analysis and understanding.
Course Code    Course Title    Total Credits
PSCS501 Applied Signal and Image Processing 04
MODULE - I
Unit 1: Fundamentals of Digital Signals Processing
Periodic signals, Spectral decomposition, Signals, Reading and writing Waves, Spectrums, Wave objects, Signal objects; Noise: Uncorrelated noise, Integrated spectrum, Brownian noise, Pink Noise, Gaussian noise; Autocorrelation: Correlation, Serial correlation, Autocorrelation, Autocorrelation of periodic signals, Correlation as a dot product; Frequency domain Operations: Representing Image as Signals, Sampling and Fourier Transforms, Discrete Fourier Transform, Convolution and Frequency Domain Filtering, Smoothing using lowpass filters, Sharpening using high-pass filters, Fast Fourier Transforms.
Credits: 02
MODULE - II
Unit 3: Structural and Morphological Operations
Edge Detection: Sobel, Canny, Prewitt, Roberts edge detection techniques, LoG and DoG filters; Image Pyramids: Gaussian Pyramid, Laplacian Pyramid; Morphological Image Processing: Erosion, Dilation, Opening and closing, Hit-or-Miss Transformation, Skeletonizing, Computing the convex hull, removing small objects, White and black top-hats, Extracting the boundary, Grayscale operations.
Credits: 02
Unit 4: Advanced Image Processing Operations
Extracting Image Features and Descriptors: Feature detectors versus descriptors, Boundary Processing and feature descriptors, Principal Components, Harris Corner Detector, Blob detector, Histogram of Oriented Gradients, Scale-invariant feature transform, Haar-like features; Image Segmentation: Hough Transform for detecting lines and circles, Thresholding and Otsu's segmentation, Edge-based/region-based segmentation, Region growing, Region splitting and merging, Watershed algorithm, Active Contours, morphological snakes, and GrabCut algorithms.
Text Books:
1. Digital Image Processing by Rafael Gonzalez & Richard Woods, Pearson; 4th edition,
2018.
2. Think DSP: Digital Signal Processing in Python by Allen Downey, O'Reilly Media; 1st
edition (August 16, 2016).
Reference Books:
1. Understanding Digital Image Processing, Vipin Tyagi, CRC Press, 2018.
2. Digital Signal and Image Processing by Tamal Bose, John Wiley 2010.
3. Hands-On Image Processing with Python by Sandipan Dey, Packt Publishing, 2018.
4. Fundamentals of Digital Image Processing by A K Jain, Pearson, 2010
Unit 1
1
FUNDAMENTALS OF DIGITAL
SIGNALS PROCESSING
Unit Structure :
1.0 Objective
1.1 Fundamentals of Digital Signals Processing
1.1.1 Periodic signals
1.1.2 Spectral decomposition
1.1.3 Signals
1.1.4 Reading and writing Waves
1.1.5 Spectrums
1.1.6 Wave objects
1.1.7 Signal objects
1.2 Noise
1.2.1 Uncorrelated noise
1.2.2 Integrated spectrum
1.2.3 Brownian noise
1.2.4 Pink Noise
1.2.5 Gaussian noise
1.3 Autocorrelation
1.3.1 Correlation
1.3.2 Serial correlation
1.3.3 Autocorrelation
1.3.4 Autocorrelation of periodic signals
1.3.5 Correlation as a dot product
1.4 Summary
1.5 Exercise
1.6 References
1.0 OBJECTIVE
In today's digital world we are connected to everything around us through signals.
Digital Signal Processing (DSP) is a powerful tool that uses mathematics and computers to analyse and improve signals.
In this chapter, we will discuss the fundamentals of DSP with a programming-based approach, following the reference book Think DSP: Digital Signal Processing in Python.
The examples and supporting code for this book are in Python.
The reader is assumed to know basic mathematics, including complex numbers.
The objective of this chapter is:
● To understand the basic concepts and techniques for processing signals
and digital signal processing fundamentals.
● To understand the processes of analog-to-digital and digital-to-analog conversion and the relation between continuous-time and discrete-time signals and systems.
● To introduce a few real-world signal processing applications by analysing sound recordings and other signals, and generating new sounds.
● General-purpose computers can be used to illustrate DSP theory and applications.
Advantages :
● DSP hardware is flexible and programmable
● DSP chips are relatively cheap (easily mass-produced)
● Digital storage is cheap
● Digital information can be encrypted, coded, and compressed
Disadvantages:
● Sampling leads to loss of information
● High-resolution ultra-fast A/D and D/A may be expensive
● Digital processing cannot always be done in real-time
1.1.1 Periodic signals
● The figure below shows a segment from a recording of a bell.
● Signal is periodic, but the shape of the signal is more complex.
● The shape of a periodic signal is called the waveform.
● Most musical instruments produce waveforms more complex than a
sinusoid.
● A sinusoid is a signal that has the same shape as the trigonometric sine function.
● The shape of the waveform determines the musical timbre, which is
our perception of the quality of the sound.
● People usually perceive complex waveforms as rich, warm and more
interesting than sinusoids.
● Each full repetition of the signal is known as a cycle.
● The duration of each cycle, called the period, is about 2.3 ms.
● The frequency of a signal is the number of cycles per second, which is
the inverse of the period.
● The units of frequency are cycles per second, or Hertz, abbreviated
“Hz”.
1.1.2 Spectral decomposition
● Spectral decomposition is the idea that any signal can be expressed as the sum of sinusoids with different frequencies.
● The discrete Fourier transform (DFT ) takes a signal and produces its
spectrum.
● The spectrum is the set of sinusoids that add up to produce the signal.
● The Fast Fourier transform (FFT) is an efficient way to compute the
DFT.
● In the following figure :
○ The x-axis is the range of frequencies that make up the signal.
○ The y-axis shows the strength or amplitude of each frequency
component.
1.1.3 Signals
● Signal is anything that carries information.
● It can also be defined as a physical quantity that varies with time,
temperature, pressure or with any independent variables such as
speech signal or video signal.
● Signal processing: the process of operating on a signal so that its characteristics (amplitude, shape, phase, frequency, etc.) undergo a change.
● The Python module thinkdsp.py contains classes and functions for working with signals and spectrums.
● To represent signals, thinkdsp provides a class called Signal, which is
the parent class for several signal types, including Sinusoid, which
represents both sine and cosine signals.
● Signals have an __add__ method, so you can use the + operator to add
them:
mix = sin_sig + cos_sig
● The result is a SumSignal, which represents the sum of two or more
signals.
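● For instance, a minimal end-to-end sketch (the frequencies and amplitudes below are arbitrary illustrative choices, and the thinkdsp module from the book's repository is assumed to be importable):
import thinkdsp

sin_sig = thinkdsp.SinSignal(freq=440, amp=1.0, offset=0)   # 440 Hz sine signal
cos_sig = thinkdsp.CosSignal(freq=880, amp=0.5, offset=0)   # 880 Hz cosine signal

mix = sin_sig + cos_sig                                # a SumSignal
wave = mix.make_wave(duration=0.5, framerate=11025)    # evaluate it as a Wave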
1.1.5 Spectrums
The signal spectrum describes a signal's magnitude and phase
characteristics as a function of frequency.
The system spectrum describes how the system changes signal magnitude
and phase as a function of frequency.
● Wave provides make_spectrum, which returns a Spectrum:
spectrum = wave.make_spectrum()
● And Spectrum provides plot:
spectrum.plot()
● Spectrum provides three methods that modify the spectrum:
○ low_pass applies a low-pass filter, which means that components
above a given cutoff frequency are attenuated (that is, reduced in
magnitude) by a factor.
○ high_pass applies a high-pass filter, which means that it attenuates
components below the cutoff.
○ band_stop attenuates components in the band of frequencies between
two cutoffs.
● This example attenuates all frequencies above 600 by 99%:
spectrum.low_pass(cutoff=600, factor=0.01)
● A low pass filter removes bright, high-frequency sounds, so the result sounds muffled and darker.
● To hear what it sounds like, you can convert the Spectrum back to a
Wave, and then play it.
wave = spectrum.make_wave()
wave.play('temp.wav')
● The play method writes the wave to a file and then plays it.
● If you use Jupyter notebooks, you can use make_audio, which makes
an Audio widget that plays the sound.
1.1.6 Wave objects
● A Wave object contains an array of samples, wave.ys, and the corresponding sample times, wave.ts.
● For example:
wave.ys *= 2
wave.ts += 1
○ The first line scales the wave by a factor of 2, making it louder.
○ The second line shifts the wave in time, making it start 1 second later.
● Wave provides methods that perform many common operations.
● For example, the same two transformations could be written:
wave.scale(2)
wave.shift(1)
● You can read the documentation of these methods and others at
http://greenteapress.com/thinkdsp.html
1.1.7 Signal objects
○ func: a Python function used to evaluate the signal at a particular point in time.
● It is usually either np.sin or np.cos, yielding a sine or cosine signal.
● Like many init methods, this one just tucks the parameters away for
future use.
● Signal provides make_wave, which looks like this:
def make_wave(self, duration=1, start=0, framerate=11025):
n = round(duration * framerate)
ts = start + np.arange(n) / framerate
ys = self.evaluate(ts)
return Wave(ys, ts, framerate=framerate)
Where,
○ start and duration are the start time and duration in seconds.
○ framerate is the number of frames (samples) per second.
○ n is the number of samples, and ts is a NumPy array of sample times.
● To compute the ys, make_wave invokes evaluate, which is provided
by Sinusoid:
def evaluate(self, ts):
phases = PI2 * self.freq * ts + self.offset
ys = self.amp * self.func(phases)
return ys
● Let’s unwind this function one step at time:
1. self.freq is frequency in cycles per second, and each element of ts is a
time in seconds, so their product is the number of cycles since the start
time.
2. PI2 is a constant that stores 2π. Multiplying by PI2 converts from
cycles to phase. You can think of phase as “cycles since the start time”
expressed in radians. Each cycle is 2π radians.
3. self.offset is the phase when t is ts[0]. It has the effect of shifting the
signal left or right in time.
4. If self.func is np.sin or np.cos, the result is a value between −1 and +1.
5. Multiplying by self.amp yields a signal that ranges from -self.amp to
+self.amp.
● In math notation, evaluate is written like this:
y = A cos(2πft + φ0)
where
○ A is amplitude,
○ f is frequency,
○ t is time, and
○ φ0 is the phase offset.
● It may seem like I wrote a lot of code to evaluate one simple
expression, but as we’ll see, this code provides a framework for
dealing with all kinds of signals, not just sinusoids.
1.2 NOISE
In signal processing, noise is a general term for unwanted (and, in general,
unknown) modifications that a signal may suffer during capture, storage,
transmission, processing, or conversion.
1.2.1 Uncorrelated noise
● The simplest kind of noise to generate is uncorrelated uniform noise (UU noise).
● The UncorrelatedUniformNoise class generates its values with np.random.uniform, which draws from a uniform distribution.
● In this example, the values are in the range from -amp to amp.
● The following example generates UU noise with duration 0.5 seconds
at 11,025 samples per second.
signal = thinkdsp.UncorrelatedUniformNoise()
wave = signal.make_wave(duration=0.5, framerate=11025)
● If you play this wave, it sounds like the static you hear if you tune a
radio between channels.
● Following Figure shows what the waveform looks like.
● As expected, it looks pretty random.
1.2.2 Integrated spectrum
● The integrated spectrum is a function of frequency, f, that shows the cumulative power in the spectrum up to f.
● Spectrum provides a method that computes the IntegratedSpectrum:
def make_integrated_spectrum(self):
cs = np.cumsum(self.power)
cs /= cs[-1]
return IntegratedSpectrum(cs, self.fs)
where,
self.power is a NumPy array containing power for each frequency.
np.cumsum computes the cumulative sum of the powers.
● Dividing through by the last element normalizes the integrated
spectrum so it runs from 0 to 1.
● The result is an IntegratedSpectrum.
● Here is the class definition:
class IntegratedSpectrum(object):
def __init__(self, cs, fs):
self.cs = cs
self.fs = fs
● Like Spectrum, IntegratedSpectrum provides plot_power, so we can
compute and plot the integrated spectrum like this:
integ = spectrum.make_integrated_spectrum()
integ.plot_power()
● The result, shown in the following Figure, is a straight line, which
indicates that power at all frequencies is constant, on average.
1.2.3 Brownian noise
● Brownian noise (also called red noise) is generated by adding up a sequence of uncorrelated random steps.
● thinkdsp provides a BrownianNoise class; the following example generates a Brownian noise wave and plots it:
signal = thinkdsp.BrownianNoise()
wave = signal.make_wave(duration=0.5, framerate=11025)
wave.plot()
● Following Figure shows the result.
1.2.4 Pink Noise
● The PinkNoise signal provides make_wave with the usual parameters.
Where,
○ duration is the length of the wave in seconds.
○ start is the start time of the wave; it is included so that make_wave has
the same interface for all types of signal, but for random noise, start
time is irrelevant.
○ And framerate is the number of samples per second.
● make_wave creates a white noise wave, computes its spectrum, applies
a filter with the desired exponent, and then converts the filtered
spectrum back to a wave.
● Then it unbiases and normalizes the wave.
● Spectrum provides pink_filter:
def pink_filter(self, beta=1.0):
    denom = self.fs ** (beta/2.0)
    denom[0] = 1
    self.hs /= denom
Where,
pink_filter divides each element of the spectrum by f^(β/2).
Since power is the square of amplitude, this operation divides the power at each component by f^β.
It treats the component at f = 0 as a special case, partly to avoid dividing
by 0, and partly because this element represents the bias of the signal,
which we are going to set to 0 anyway.
Following Figure shows the resulting waveform.
Following Figure shows the resulting waveform.
1.2.5 Gaussian noise
● Uncorrelated Gaussian (UG) noise draws its values from a Gaussian distribution instead of a uniform one.
● It has one other interesting property: the spectrum of UG noise is also UG noise.
Figure : Normal probability plot for the real and imaginary parts of the
spectrum of Gaussian noise.
● The gray lines show a linear model fit to the data; the dark lines show
the data.
● A straight line on a normal probability plot indicates that the data
come from a Gaussian distribution.
● Except for some random variation at the extremes, these lines are
straight, which indicates that the spectrum of UG noise is UG noise.
● The spectrum of UU noise is also UG noise, at least approximately.
● By the Central Limit Theorem, the spectrum of almost any uncorrelated noise is approximately Gaussian, as long as the
distribution has finite mean and standard deviation, and the number of
samples is large.
1.3 AUTOCORRELATION
Autocorrelation, sometimes known as serial correlation in the discrete
time case, is the correlation of a signal with a delayed copy of itself as a
function of delay. Informally, it is the similarity between observations of a
random variable as a function of the time lag between them.
1.3.1 Correlation
● Correlation between variables means that if you know the value of
one, you have some information about the other.
● There are several ways to quantify correlation, but the most common is
the Pearson product-moment correlation coefficient, usually denoted ρ.
● For two variables, x and y, that each contain N values:
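In its standard form (dividing by N, consistent with the ddof=0 option used below), the coefficient is
ρ = (1/N) Σ_i (x_i − µx)(y_i − µy) / (σx σy)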
Where,
µx and µy are the means of x and y, and
σx and σy are their standard deviations.
● Pearson’s correlation is always between -1 and +1 (including both).
● If ρ is positive, we say that the correlation is positive, which means
that when one variable is high, the other tends to be high.
● If ρ is negative, the correlation is negative, so when one variable is
high, the other tends to be low.
● The magnitude of ρ indicates the strength of the correlation.
● If ρ is 1 or -1, the variables are perfectly correlated, which means that
if you know one, you can make a perfect prediction about the other.
● If ρ is near zero, the correlation is probably weak, so knowing one value doesn't tell you much about the other.
● I say “probably weak” because it is also possible that there is a
nonlinear relationship that is not captured by the coefficient of
correlation.
● Nonlinear relationships are often important in statistics, but less often
relevant for signal processing, so I won’t say more about them here.
● Python provides several ways to compute correlations.
● np.corrcoef takes any number of variables and computes a correlation
matrix that includes correlations between each pair of variables.
● I’ll present an example with only two variables.
● First, I define a function that constructs sine waves with different
phase offsets:
def make_sine(offset):
signal = thinkdsp.SinSignal(freq=440, offset=offset)
wave = signal.make_wave(duration=0.5, framerate=10000)
return wave
● Next I instantiate two waves with different offsets:
wave1 = make_sine(offset=0)
wave2 = make_sine(offset=1)
● Following Figure shows what the first few periods of these waves look
like.
Figure: Two sine waves that differ by a phase offset of 1 radian; their
coefficient of correlation is 0.54.
● When one wave is high, the other is usually high, so we expect them to
be correlated.
>>> corr_matrix = np.corrcoef(wave1.ys, wave2.ys, ddof=0)
[[ 1.    0.54]
 [ 0.54  1.  ]]
● The option ddof=0 indicates that corrcoef should divide by N, as in the
equation above, rather than use the default, N − 1.
● The result is a correlation matrix:
○ The first element is the correlation of wave1 with itself, which is
always 1.
○ The last element is the correlation of wave2 with itself.
● The off-diagonal elements contain the value we’re interested in, the
correlation of wave1 and wave2.
● The value 0.54 indicates that the strength of the correlation is
moderate.
● As the phase offset increases, this correlation decreases until the waves
are 180 degrees out of phase, which yields correlation -1.
● Then it increases until the offset differs by 360 degrees.
● At that point we have come full circle and the correlation is 1.
● Following Figure shows the relationship between correlation and
phase offset for a sine wave.
Figure: The correlation of two sine waves as a function of the phase offset
between them. The result is a cosine.
● The shape of that curve should look familiar; it is a cosine.
● thinkdsp provides a simple interface for computing the correlation
between waves:
>>> wave1.corr(wave2)
0.54
1.3.3 Autocorrelation
● In the previous section we computed the correlation between each
value and the next, so we shifted the elements of the array by 1.
● But we can easily compute serial correlations with different lags.
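● serial_corr, used below, computes the correlation between a wave and a copy of itself shifted by a given lag; a minimal sketch of such a function (the name and details are assumed here, consistent with how it is called below) is:
import numpy as np

def serial_corr(wave, lag=1):
    # Correlate the wave with a copy of itself shifted by `lag` samples.
    n = len(wave.ys)
    y1 = wave.ys[lag:]
    y2 = wave.ys[:n - lag]
    return np.corrcoef(y1, y2)[0, 1]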
● You can think of serial_corr as a function that maps from each value of lag to the corresponding correlation, and we can evaluate that function by looping through values of lag:
def autocorr(wave):
    lags = range(len(wave.ys)//2)
    corrs = [serial_corr(wave, lag) for lag in lags]
    return lags, corrs
○ autocorr takes a Wave object and returns the autocorrelation function
as a pair of sequences: lags is a sequence of integers from 0 to half the
length of the wave;
○ corrs is the sequence of serial correlations for each lag.
● Following Figure shows autocorrelation functions for pink noise with
three values of β.
1.3.4 Autocorrelation of periodic signals
● For this example, I estimated the period by trial and error.
● To automate the process, we can use the autocorrelation function.
lags, corrs = autocorr(segment)
plt.plot(lags, corrs)
● Following Figure shows the autocorrelation function for the segment
starting at t = 0.2 seconds.
● That's consistent with how these terms are used in statistics, but in the context of signal processing, the definitions are a little different.
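● For reference, the dot-product forms referred to here are the standard ones: the unstandardized correlation of two signals is their dot product, r = Σ_i x_i y_i, and Pearson's coefficient is the normalized version ρ = (1/N) Σ_i (x_i − µx)(y_i − µy) / (σx σy).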
1.4 SUMMARY
In this chapter we have discussed:
Fundamental concepts and techniques for processing signals and digital
signal processing, Process of conversion of analog to digital as well as
digital to analog conversion, Digital Signal Processing (DSP): Digital
Signal Processing (DSP) is an amazing field that uses advanced math and computers to understand and improve signals. Signals are everywhere in our digital world, like music, pictures, and wireless communication.
Applications to real-world signal processing problems.
Periodic signals: Signals that repeat themselves after some period of time
are called periodic signals. For example, if you strike a bell, it vibrates and
generates sound.
Noise: In signal processing, noise is a general term for unwanted (and, in
general, unknown) modifications that a signal may suffer during capture,
storage, transmission, processing, or conversion. The simplest way to
understand noise is to generate it, and the simplest kind to generate is
uncorrelated uniform noise (UU noise).
Spectrum: The spectrum is the set of sinusoids that add up to produce the
signal.
Autocorrelation: sometimes known as serial correlation in the discrete
time case, is the correlation of a signal with a delayed copy of itself as a
function of delay. Informally, it is the similarity between observations of a
random variable as a function of the time lag between them.
Correlation: Correlation between variables means that if you know the
value of one, you have some information about the other.
1.5 EXERCISE
Answer the following:
1. What is Digital Signal Processing (DSP)? Explain.
2. Define Digital Signal Processing (DSP). Draw basic DSP system. Note
down the advantages and disadvantages of DSP.
3. Write a short note on Periodic signals.
4. Explain following terms about Periodic signals:
a. Periodic signals
b. Waveform
c. Sinusoid
d. Cycles
e. Period
f. Frequency
5. What is Spectral decomposition? Explain.
6. What is Spectral decomposition? Explain Harmonics in detail.
7. Write a short note on the signal.
8. What is signal and signal processing? Explain the term SumSignal.
9. Explain the following terms for a signal:
a. Signal
b. Signal processing
c. SumSignal
d. Frame
e. Sample
f. wave
10. What is spectrum? Explain three methods that modify the spectrum.
11. Write a short note on a wave object.
12. Write a short note on : Signal objects. Explain The parameters of
__init__.
13. What is noise? Explain Uncorrelated noise.
14. What is the spectrum in noise ? Explain following three things about a
noise signal or its spectrum :
a. Distribution
b. Correlation
c. Relationship between power and frequency
15. Write a short note on: Integrated spectrum.
16. What is Brownian noise? Explain in detail.
17. What is Pink Noise? Explain in detail.
18. What is Gaussian noise? Explain.
19. Write a short note on :Correlation.
20. Write a short note on :Serial correlation.
21. Write a short note on: Autocorrelation.
22. Explain: Autocorrelation of periodic signals.
23. Explain: Correlation as a dot product
1.6 REFERENCES
Text Books/Reference Books:
1. Think DSP: Digital Signal Processing in Python by Allen Downey,
O'Reilly Media; 1st edition (August 16, 2016).
https://greenteapress.com/thinkdsp/thinkdsp.pdf
2. The code and sound samples used in this chapter are available from https://github.com/AllenDowney/ThinkDSP
Unit 1
2
FUNDAMENTALS OF DIGITAL
SIGNALS PROCESSING - II
Unit Structure :
2.0 Objective
2.1 Frequency domain Operations:
2.1.1 Introduction to Frequency domain
2.1.2 Representing Image as Signals
2.1.3 Sampling and Fourier Transforms
2.2 Discrete Fourier Transform
2.2.1 Convolution and Frequency Domain Filtering
2.2.2 Smoothing using lowpass filters
2.2.3 Sharpening using high-pass filters
2.3 Fast Fourier Transforms
2.4 Summary
2.5 Exercise
2.6 References
2.0 OBJECTIVE
The objective of this chapter is :
● To Understand the meaning of frequency domain filtering, and how it
differs from filtering in the spatial domain.
● Be familiar with the concepts of sampling, function reconstruction,
and aliasing.
● Understand convolution in the frequency domain, and how it is related
to filtering.
● Know how to obtain frequency domain filter functions from spatial
kernels, and vice versa.
● Know the steps required to perform filtering in the frequency domain.
● Understand the mechanics of the fast Fourier transform, and how to use it effectively in image processing.
2.1 FREQUENCY DOMAIN OPERATIONS
2.1.1 Introduction to Frequency domain
● The frequency domain is a space which is defined by the Fourier transform.
● Fourier transform has a very wide application in image processing.
● Frequency domain analysis is used to indicate how signal energy can
be distributed in a range of frequency.
● The basic principle of frequency domain analysis in image filtering is to compute the 2D discrete Fourier transform of the image.
Image acquisition
● It is the first process in digital image processing.
● Acquisition could be as simple as being given an image that is already
in digital form.
● Generally, the image acquisition stage involves preprocessing, such as
scaling.
Image enhancement
● It is the process of manipulating an image so the result is more suitable
than the original for a specific application.
● It establishes at the outset that enhancement techniques are problem
oriented.
● Thus, for example, a method that is quite useful for enhancing X-ray
images may not be the best approach for enhancing satellite images
taken in the infrared band of the electromagnetic spectrum.
Image restoration
● It is an area that also deals with improving the appearance of an image.
● However, unlike enhancement, which is subjective, image restoration
is objective, in the sense that restoration techniques tend to be based
on mathematical or probabilistic models of image degradation.
● Enhancement, on the other hand, is based on human subjective
preferences regarding what constitutes a “good” enhancement result.
Wavelets
● Wavelets are the foundation for representing images in various degrees
of resolution.
Compression
● As the name implies, it deals with techniques for reducing the storage
required to save an image, or the bandwidth required to transmit it.
● Although storage technology has improved significantly over the past
decade, the same cannot be said for transmission capacity.
● This is true particularly in uses of the internet, which are characterized by significant pictorial content.
Image compression
● It is familiar (perhaps inadvertently) to most users of computers in the form of image file extensions, such as the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
Morphological processing
● It deals with tools for extracting image components that are useful in
the representation and description of shape.
Segmentation
● Segmentation partitions an image into its constituent parts or objects.
● In general, autonomous segmentation is one of the most difficult tasks
in digital image processing.
● A rugged segmentation procedure brings the process a long way
toward successful solution of imaging problems that require objects to
be identified individually.
● On the other hand, weak or erratic segmentation algorithms almost
always guarantee eventual failure. In general, the more accurate the
segmentation, the more likely automated object classification is to
succeed.
Feature extraction
● It almost always follows the output of a segmentation stage, which
usually is raw pixel data, constituting either the boundary of a region
(i.e., the set of pixels separating one image region from another) or all
the points in the region itself.
● Feature extraction consists of feature detection and feature description.
● Feature detection refers to finding the features in an image, region, or
boundary.
● Feature description assigns quantitative attributes to the detected
features.
● For example, we might detect corners in a region, and describe those
corners by their orientation and location; both of these descriptors are
quantitative attributes.
● Feature processing methods discussed in this chapter are subdivided into three principal categories, depending on whether they are applicable to boundaries, regions, or whole images.
● Some features are applicable to more than one category. Feature
descriptors should be as insensitive as possible to variations in
parameters such as scale, translation, rotation, illumination, and
viewpoint.
2.1.3 Sampling and Fourier Transforms
● The Fourier transform of the sampled function f̃(t) is
F̃(μ) = F(μ) ⋆ S(μ) = (1/∆T) Σn F(μ − n/∆T)
where
S(μ) = (1/∆T) Σn δ(μ − n/∆T)
is the Fourier transform of the impulse train s∆T(t).
● where the final step follows from the sifting property of the impulse.
● The summation in the last line of the equation shows that the Fourier transform F̃(μ) of the sampled function f̃(t) is an infinite, periodic sequence of copies of the transform of the original, continuous function.
● The separation between copies is determined by the value of 1/∆T.
● Observe that although f̃(t) is a sampled function, its transform, F̃(μ), is continuous because it consists of copies of F(μ), which is a continuous function.
● The following figure is a graphical summary of the preceding results.
2.2 DISCRETE FOURIER TRANSFORM
One of the principal goals of this chapter is the derivation of the discrete
Fourier transform (DFT) starting from basic principles.
The material up to this point may be viewed as the foundation of those
basic principles, so now we have in place the necessary tools to derive the
DFT.
The DFT and Image Processing
To filter an image in the frequency domain:
1. Compute F(u,v) the DFT of the image
2. Multiply F(u,v) by a filter function H(u,v)
3. Compute the inverse DFT of the result
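A minimal NumPy sketch of these three steps, using a Gaussian lowpass transfer function (the function name, image-shape handling and cutoff value are illustrative assumptions, not code from the textbook):
import numpy as np

def filter_in_frequency_domain(image, d0=30):
    # 1. Compute the DFT of the image and centre it.
    F = np.fft.fftshift(np.fft.fft2(image))
    # 2. Multiply by a Gaussian lowpass transfer function H(u, v).
    M, N = image.shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    V, U = np.meshgrid(v, u)
    H = np.exp(-(U**2 + V**2) / (2.0 * d0**2))
    G = H * F
    # 3. Compute the inverse DFT of the result and keep the real part.
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))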
2.2.1 Convolution and Frequency Domain Filtering
● The 1-D convolution of two continuous functions is
(f ⋆ h)(t) = ∫ f(τ) h(t − τ) dτ
● where the minus sign accounts for the flipping just mentioned, t is the displacement needed to slide one function past the other, and τ is a dummy variable that is integrated out.
(f ⋆ h)(t) ⇔ (H ∙ F)(μ)
The double arrow indicates that the expression on the right is obtained by
taking the forward Fourier transform of the expression on the left,
while the expression on the left is obtained by taking the inverse Fourier
transform of the expression on the right.
● Following a similar development would result in the other half of the convolution theorem:
(f ∙ h)(t) ⇔ (H ⋆ F)(μ)
● Filtering in the frequency domain is based on the expression
g(x,y) = ℑ⁻¹[ H(u,v) F(u,v) ]
where ℑ⁻¹ is the IDFT, F(u,v) is the DFT of the input image f(x,y), H(u,v) is a filter transfer function, and g(x,y) is the filtered (i.e., output) image.
● Function F,H, and g are arrays of size P x Q, the same as the padded
input image.
● The product H(u,v) F(u,v) is formed using element wise
multiplication.
● The filter transfer function modifies the transform of the input image
to yield the processed output, g(x,y).
● The task of specifying H(u,v) is simplified considerably by using
functions that are symmetric about their center, which requires that
F(u,v) be centered also.
● STEPS FOR FILTERING IN THE FREQUENCY DOMAIN:
The process of filtering in the frequency domain can be summarized as
follows:
1. Given an input image f(x,y) of size M x N, obtain the padding sizes P and Q; that is, P = 2M and Q = 2N.
2. Form a padded image fp(x,y) of size P x Q using zero-, mirror-, or replicate padding.
3. Multiply fp(x,y) by (-1)^(x+y) to center the Fourier transform on the P x Q frequency rectangle.
4. Compute the DFT, F(u,v), of the image from step 3.
5. Construct a real, symmetric filter transfer function, H(u,v), of size P x Q with center at (P/2, Q/2).
6. Form the product G(u,v) = H(u,v)F(u,v) using elementwise multiplication; that is, G(i,k) = H(i,k)F(i,k) for i = 0,1,2,...,M-1 and k = 0,1,2,...,N-1.
7. Obtain the filtered image of size P x Q by computing the IDFT of G(u,v):
gp(x,y) = { real[ ℑ⁻¹( G(u,v) ) ] } (-1)^(x+y)
8. Obtain the final filtered result, g(x,y), of the same size as the input image by extracting the M x N region from the top, left quadrant of gp(x,y).
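An illustrative NumPy sketch of these steps (zero padding and an ideal lowpass transfer function are assumed for concreteness; this is not the textbook's code):
import numpy as np

def filter_padded(f, d0=60):
    M, N = f.shape
    P, Q = 2 * M, 2 * N                              # step 1: padding sizes
    fp = np.zeros((P, Q))
    fp[:M, :N] = f                                   # step 2: zero-padded image
    x = np.arange(P).reshape(-1, 1)
    y = np.arange(Q).reshape(1, -1)
    fp = fp * ((-1.0) ** (x + y))                    # step 3: centre the transform
    F = np.fft.fft2(fp)                              # step 4: DFT
    u = np.arange(P).reshape(-1, 1) - P / 2
    v = np.arange(Q).reshape(1, -1) - Q / 2
    H = (np.sqrt(u**2 + v**2) <= d0).astype(float)   # step 5: ideal lowpass H(u,v)
    G = H * F                                        # step 6: elementwise product
    gp = np.real(np.fft.ifft2(G)) * ((-1.0) ** (x + y))   # step 7: IDFT, undo centring
    return gp[:M, :N]                                # step 8: extract the M x N region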
2.2.2 Smoothing using lowpass filters
Ideal lowpass filters (ILPF)
○ The name ideal indicates that all frequencies on or inside a circle of radius D0 are passed without attenuation, whereas all frequencies outside the circle are completely attenuated (filtered out).
○ The ideal lowpass filter transfer function is radially symmetric about
the origin.
○ This means that it is defined completely by a radial cross section, as
above Fig.(c) shows.
○ A 2-D representation of the filter is obtained by rotating the cross
section 360°.
○ The transfer function for the ideal lowpass filter can be given as:
H(u,v) = 1 if D(u,v) ≤ D0, and 0 if D(u,v) > D0
where D(u,v) is the distance from the point (u,v) to the center of the frequency rectangle.
○ For an ILPF cross section, the point of transition between the values
H( u, v) = 1 and H( u, v) = 0 is called the cutoff frequency.
○ In above Fig.the cutoff frequency is D0.
○ The sharp cutoff frequency of an ILPF cannot be realized with
electronic components, although they certainly can be simulated in a
computer (subject to the constrain that the fastest possible transition is
limited by the distance between pixels).
○ The lowpass filters in this chapter are compared by studying their
behavior as a function of the same cutoff frequencies.
○ One way to establish a set of standard cutoff frequency loci is to use circles that enclose specified amounts of total image power PT, which we obtain by summing the components of the power spectrum of the padded image at each point (u,v), for u = 0,1,2,...,P-1 and v = 0,1,2,...,Q-1; that is,
PT = Σu Σv P(u,v)
○ If the DFT has been centered, a circle of radius D0 with origin at the center of the frequency rectangle encloses α percent of the power, where
α = 100 [ Σu Σv P(u,v) / PT ]
○ and the summation is over values of (u,v) that lie inside the circle or on its boundary.
○ The figures given below, (a) and (b), show a test pattern image and its spectrum.
○ The circles superimposed on the spectrum have radii of 10, 30, 60,
160, and 460 pixels, respectively, and enclosed the percentages of total
power listed in the figure caption.
○ The spectrum falls off rapidly, with close to 87% of the total power
being enclosed by a relatively small circle of radius 10.
○ The significance of this will become evident in the following example.
Gaussian lowpass filters (GLPF)
○ As before, σ is a measure of spread about the center.
○ By letting σ = D0, we can express the Gaussian transfer function in the same notation as other functions in this section:
H(u,v) = e^(−D²(u,v) / 2D0²)
○ Although humans fill these gaps visually without difficulty, machine recognition systems have real difficulties reading broken characters.
○ One approach for handling this problem is to bridge small gaps in the
input image by blurring it.
○ Figure (b) shows how well characters can be “repaired” by this simple
process using a Gaussian lowpass filter with D0 = 120.
○ It is typical to follow the type of “repair” just described with additional
processing, such as thresholding and thinning, to yield cleaner
characters.
○ Following figure shows an application of lowpass filtering for
producing a smoother, softer-looking result from a sharp original.
○ For human faces, the typical objective is to reduce the sharpness of
fine skin lines and small blemishes.
○ The magnified sections in Figs.(b) and (c) clearly show a significant
reduction in fine skin lines around the subject’s eyes.
○ In fact, the smoothed images look quite soft and pleasing.
FIGURE 4.49
(a) Original 785x 732 image.
(b) Result of filtering using a GLPF with D0 = 150.
(c) Result of filtering using a GLPF with D0 = 130. Note the reduction in
fine skin lines in the magnified sections in (b) and (c).
● Following shows two applications of lowpass filtering on the same
image, but with totally different objectives.
2.2.3 Sharpening using high-pass filters
○ Highpass filtering sharpens an image by attenuating the low frequencies without disturbing the high-frequency information; a highpass transfer function can be obtained from a given lowpass function as H_HP(u,v) = 1 − H_LP(u,v).
FIGURE
Top row: Perspective plot, image, and radial cross section of an IHPF transfer function.
Middle and bottom rows: The same sequence for GHPF and BHPF
transfer functions.
(The thin image borders were added for clarity. They are not part of the
data.)
Summary Table:
We get,
F̃(μ) = ∫ f̃(t) e^(−j2πμt) dt
● By substituting the equation
f̃(t) = f(t) s∆T(t) = Σn f(t) δ(t − n∆T)
for f̃(t), we obtain,
● The last step follows from the equation
μ = m / (M∆T),   m = 0, 1, 2, ..., M − 1
● Substituting this result for μ, we obtain
fn = (1/M) Σ_{m=0}^{M−1} Fm e^(j2πmn/M),   n = 0, 1, 2, ...  ------------------- Eq. (B)
● substituting Eq. (B) for fn into Eq. (A) gives the identity Fm ≡ Fm .
● Similarly, substituting Eq. (A) into Eq. (B) for Fm yields fn ≡ fn .
● This implies that Eqs. (A) and (B) constitute a discrete Fourier
transform pair.
● Furthermore, these identities indicate that the forward and inverse
Fourier transforms exist for any set of samples whose values are finite.
● Note that neither expression depends explicitly on the sampling interval ∆T, nor on the frequency intervals of the equation μ = m/(M∆T), m = 0, 1, 2, ..., M − 1.
● Therefore, the DFT pair is applicable to any finite set of discrete
samples taken uniformly.
● We used m and n in the preceding development to denote discrete
variables because it is typical to do so for derivations.
● However, it is more intuitive, especially in two dimensions, to use the
notation x and y for image coordinate variables and u and v for
frequency variables, where these are understood to be integers.
● Then, Eqs. (A) and (B) become
F(u) = Σ_{x=0}^{M−1} f(x) e^(−j2πux/M),   u = 0, 1, 2, ...  ----------------------------------- Eq. (C)
and
f(x) = (1/M) Σ_{u=0}^{M−1} F(u) e^(j2πux/M),   x = 0, 1, 2, ...  ----------------------------------- Eq. (D)
● where we used functional notation instead of subscripts for simplicity.
● Comparing Eqs. (A) through (D), you can see that F(u) ≡ Fm and f(x) ≡ fn.
● From this point on, we use Eqs. (C) and (D) to denote the 1-D DFT
pair.
● As in the continuous case, we often refer to Eq. (C) as the forward
DFT of f (x), and to Eq. (D) as the inverse DFT of F (u ).
● As before, we use the notation f(x) ⇔ F(u) to denote a Fourier
transform pair.
● Sometimes you will encounter in the literature the 1/M term in front of
Eq. (C) instead.
● That does not affect the proof that the two equations form a Fourier
transform pair.
● Knowledge that f(x) and F(u) are a transform pair is useful in proving
relationships between functions and their transforms.
● For example, you are asked in Problem 4.17 to show that
● The discrete equivalent of the 1-D convolution is
f(x) ⋆ h(x) = Σ_{m=0}^{M−1} f(m) h(x − m),   x = 0, 1, 2, ..., M − 1  --------------- Eq. (F)
● Because in the preceding formulations the functions are periodic, their
convolution also is periodic. Above Equation F gives one period of the
periodic convolution.
● For this reason, this equation often is referred to as circular
convolution.
● This is a direct result of the periodicity of the DFT and its inverse.
● This is in contrast with the convolution in which values of the
displacement, x, were determined by the requirement of sliding one
function completely past the other, and were not fixed to the range
[0,M-1 ] as in circular convolution.
● Finally, we point out that the convolution theorem is applicable also to
discrete variables.
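● A quick NumPy check of this discrete (circular) convolution theorem, with small illustrative arrays:
import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 0.0, -1.0, 0.0])

# Circular convolution via the convolution theorem: DFT, multiply, inverse DFT.
conv_via_dft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(h)))

# Direct evaluation of Eq. (F), with indices taken modulo M (periodicity).
M = len(f)
conv_direct = np.array([sum(f[m] * h[(x - m) % M] for m in range(M))
                        for x in range(M)])

print(np.allclose(conv_via_dft, conv_direct))   # True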
RELATIONSHIP BETWEEN THE SAMPLING AND FREQUENCY
INTERVALS
● If f(x) consists of M samples of a function f(t) taken ∆T units apart, the length of the record comprising the set {f(x)}, x = 0, 1, 2, ..., M − 1, is
T = M∆T
● The corresponding spacing, ∆u, in the frequency domain follows from the equation μ = m/(M∆T), m = 0, 1, 2, ..., M − 1:
∆u = 1 / (M∆T) = 1 / T
● The entire frequency range spanned by the M components of the DFT is then
R = M∆u = 1 / ∆T
● Thus, we see from the above equations that the resolution in frequency, ∆u, of the DFT depends inversely on the length (duration, if t is time) of the record, T, over which the continuous function, f(t), is sampled; and the range of frequencies spanned by the DFT depends on the sampling interval ∆T.
● Keep in mind these inverse relationships between ∆u and ∆T.
2.3 FAST FOURIER TRANSFORMS
It is important to develop a basic understanding of methods by which
Fourier transform computations can be simplified and speeded up.
● The 2-D DFT can be written as
F(u,v) = Σ_{x=0}^{M−1} F(x,v) e^(−j2πux/M)
where
F(x,v) = Σ_{y=0}^{N−1} f(x,y) e^(−j2πvy/N)
● For one value of x, and for v = 0,1,2,...,N-1, we see that F (x ,v) is the
1-D DFT of one row of f(x ,y ).
● We conclude that the 2-D DFT of f (x,y ) can be obtained by
computing the 1-D transform of each row of f (x,y) and then
computing the 1-D transform along each column of the result.
● This is an important simplification because we have to deal only with
one variable at a time.
● A similar development applies to computing the 2-D IDFT using the
1-D IDFT.
● However, as we show in the following section, we can compute the
IDFT using an algorithm designed to compute the forward DFT, so all
2-D Fourier transform computations are reduced to multiple passes of
a 1-D algorithm designed for computing the 1-D DFT.
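● This row/column decomposition is easy to verify numerically (a small sketch; the random test image is purely illustrative):
import numpy as np

f = np.random.rand(4, 6)                   # small test image
F_rows = np.fft.fft(f, axis=1)             # 1-D DFT of each row
F_2d = np.fft.fft(F_rows, axis=0)          # 1-D DFT along each column of the result
print(np.allclose(F_2d, np.fft.fft2(f)))   # True: matches the 2-D DFT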
W_M = e^(−j2π/M)
● With K being a positive integer, substituting M = 2K gives
F(u) = Σ_{x=0}^{2K−1} f(x) W_{2K}^{ux}
     = Σ_{x=0}^{K−1} f(2x) W_{2K}^{u(2x)} + Σ_{x=0}^{K−1} f(2x+1) W_{2K}^{u(2x+1)}
● However, it can be shown that W_{2K}^{2ux} = W_K^{ux}, so the equation can be written as
F(u) = Σ_{x=0}^{K−1} f(2x) W_K^{ux} + Σ_{x=0}^{K−1} f(2x+1) W_K^{ux} W_{2K}^{u}
● Defining
F_even(u) = Σ_{x=0}^{K−1} f(2x) W_K^{ux}   for u = 0, 1, 2, ..., K−1, and
F_odd(u) = Σ_{x=0}^{K−1} f(2x+1) W_K^{ux}   for u = 0, 1, 2, ..., K−1,
reduces the equation to
F(u) = F_even(u) + F_odd(u) W_{2K}^{u}
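● A compact recursive sketch of this even/odd splitting (illustrative code, not the textbook's; the input length is assumed to be a power of two):
import numpy as np

def fft_recursive(f):
    M = len(f)
    if M == 1:
        return np.asarray(f, dtype=complex)
    F_even = fft_recursive(f[0::2])                   # DFT of even-indexed samples
    F_odd = fft_recursive(f[1::2])                    # DFT of odd-indexed samples
    W = np.exp(-2j * np.pi * np.arange(M // 2) / M)   # twiddle factors W_M^u
    return np.concatenate([F_even + W * F_odd,        # F(u) for u = 0..M/2-1
                           F_even - W * F_odd])       # F(u + M/2)

print(np.allclose(fft_recursive(np.arange(8.0)), np.fft.fft(np.arange(8.0))))   # True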
There are two different approaches to computing the DFT efficiently:
1. Divide and Conquer approach
2. DFT as Linear filtering
● In DFT methods, we have seen that the computational part is too long.
● We can reduce that through FFT or fast Fourier transform.
● So, we can say FFT is nothing but computation of discrete Fourier
transform in an algorithmic format, where the computational part will
be reduced.
● The main advantage of having FFT is that through it, we can design
the FIR filters.
● Mathematically, the FFT computes the same quantity as the DFT:
X(k) = Σ_{n=0}^{N−1} x(n) W_N^{nk},   where W_N = e^(−j2π/N)
Example 4
Consider eight points named from x0 to x7 . We will choose the even terms
in one group and the odd terms in the other. Diagrammatic view of the
above said has been shown below:
Here, points x0, x2, x4 and x6 have been grouped into one category and
Similarly, points x1, x3, x5 and x7 have been put into another category.
Now, we can further make them in a group of two and can proceed with
the computation.
Now, let us see how these breaking into further two is helping in
computation.
Initially, we took an eight-point sequence, but later we broke that one into
two parts G[k] and H[k]. G[k] stands for the even part whereas H[k]
stands for the odd part.
If we want to realize it through a diagram, then it can be shown as below:
Similarly, the final values can be written as follows −
Example 5
Consider the sequence x[n]={ 2,1,-1,-3,0,1,2,1}. Calculate the FFT.
Solution − The given sequence is x[n]={ 2,1,-1,-3,0,1,2,1}
Arrange the terms as shown below;
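The result of this example can also be checked numerically (an illustrative check with NumPy, separate from the worked butterfly computation):
import numpy as np

x = np.array([2, 1, -1, -3, 0, 1, 2, 1])
X = np.fft.fft(x)        # 8-point FFT of the given sequence
print(np.round(X, 3))    # X[0] equals the sum of the samples, i.e. 3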
Example 6
Find the DFT of a sequence x(n)= {1,1,0,0} and find the IDFT of Y(K)=
{1,0,1,0}
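These can likewise be verified numerically (illustrative check; the expected values follow directly from the 4-point DFT and IDFT definitions):
import numpy as np

x = np.array([1, 1, 0, 0])
X = np.fft.fft(x)            # DFT: expected {2, 1-j, 0, 1+j}
Y = np.array([1, 0, 1, 0])
y = np.fft.ifft(Y)           # IDFT: expected {0.5, 0, 0.5, 0}
print(np.round(X, 3), np.round(y, 3))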
Example 7
Find the DFT of a sequence
x(n) = 1 for 0 ≤n ≤ 2
= 0 otherwise
For (i) N = 4 (ii) N = 8. Plot |X(K)| and ∠X(K).
Solution: i) N= 4
Fig a) Sequence given in problem
b) Periodic extension of the sequence for N = 4
For N = 4:
X(k) = Σ_{n=0}^{3} x(n) e^(−jπnk/2),   k = 0, 1, 2, 3
For k = 0:
X(0) = Σ_{n=0}^{3} x(n) = x(0) + x(1) + x(2) + x(3) = 3
For k = 3:
X(3) = Σ_{n=0}^{3} x(n) e^(−j3πn/2)
ii) N = 8
For N = 8:
X(k) = Σ_{n=0}^{7} x(n) e^(−jπnk/4)
For k = 0:
X(0) = Σ_{n=0}^{7} x(n) = 3
Therefore, |X(1)| = 2.414, ∠X(1) = −π/4
For k = 2:
X(2) = Σ_{n=0}^{7} x(n) e^(−jπn/2)
Therefore, |X(3)| = 0.414, ∠X(3) = π/4
For k = 4:
X(4) = Σ_{n=0}^{7} x(n) e^(−jπn)
For k = 7:
X(7) = Σ_{n=0}^{7} x(n) e^(−j7πn/4)
     = 1 + e^(−j7π/4) + e^(−j7π/2)
     = 1 + cos(7π/4) − j sin(7π/4) + cos(7π/2) − j sin(7π/2)
     = 1 + 0.707 + j0.707 + j
     = 1.707 + j1.707
Therefore, |X(7)| = 2.414, ∠X(7) = π/4
|X(k)| = {3, 2.414, 1, 0.414, 1, 0.414, 1, 2.414}
∠X(k) = {0, −π/4, −π/2, π/4, 0, −π/4, π/2, π/4}
2.4 SUMMARY
● In this chapter we have seen a progression from sampling to the
Fourier transform, and then to filtering in the frequency domain.
● The sampling theorem was explained in the context of the frequency domain.
● The same is true of effects such as aliasing.
● The material starts with basic principles, so that any reader with a
modest mathematical background would be in a position not only to
absorb the material, but also to apply it.
● Summary of DFT definitions and corresponding expressions.
2.5 EXERCISE
Answer the following:
1. Explain the process of obtaining the Discrete Fourier transform from
the continuous transform of a sampled function.
2. Explain the properties of the 2D Discrete Fourier transform.
3. Explain the following with relevant equations
a. The 2D discrete Fourier transform and its inverse.
b. The 2D continuous Fourier transform pair
4. Explain Image smoothing and Image sharpening in frequency domain.
5. Explain the steps for filtering in frequency domain in detail.
6. Write a short note on Sampling and the Fourier Transform of Sampled
Functions.
7. Explain Sharpening in the Frequency Domain Filters using highpass
filter.
8. Explain convolution.
9. Explain the following smoothing lowpass filters:
a. IDEAL,
b. BUTTERWORTH, and
c. GAUSSIAN.
10. Write a short note on FFT.
2.6 REFERENCES
Digital Image Processing by Rafael Gonzalez & Richard Woods, Pearson; 4th edition.
https://sbme-tutorials.github.io/2018/cv/notes/3_week3.html
https://www.cis.rit.edu/class/simg782/lectures/lecture_14/lec782_05_14.pdf
https://universe.bits-pilani.ac.in/uploads/JNKDUBAI/ImageProcessing7-FrequencyFiltering.pdf
https://www.tutorialspoint.com/digital_signal_processing/dsp_discrete_fourier_transform_solved_examples.htm
https://faculty.nps.edu/rcristi/EC3400online/homework/solutions/Solutions_Chapter3.pdf
https://www.sathyabama.ac.in/sites/default/files/course-material/2020-10/UNIT3_1.pdf
3
IMAGE PROCESSING FUNDAMENTALS
AND PIXEL TRANSFORMATION-I
Unit Structure:
3.0 Objectives
3.1 Definition
3.2 Overlapping Fields with Image Processing
3.3 Components of an Image Processing System
3.4 Fundamental Steps in Digital Image Processing
3.5 Application of Image Processing
3.6 Image Processing Pipeline
3.7 Tools and Libraries for Image Processing
3.8 Image Types
3.9 Image File Formats
3.10 Intensity Transformations
3.11 Some Basic Intensity Transformation Functions
3.12 Piecewise-Linear Transformation Functions
3.13 Summary
3.14 Exercise Questions
3.15 References
3.0 OBJECTIVES
This chapter provides an overview of image processing fundamentals, components of an image processing system, fundamental steps in digital image processing, tools and libraries available for image processing, image types and file formats, various application domains where image processing can be highly useful, basic intensity transformation techniques in the spatial domain (which include image negatives, log transformation and power-law transformation), and the contrast stretching technique used to increase the range of intensity levels in low-contrast images.
3.1 DEFINITION
What is Image Processing?
Image processing refers to the manipulation and analysis of digital images
using various algorithms and techniques to extract useful information or
enhance certain aspects of the image. It involves acquiring, processing,
analyzing, and interpreting images to improve their quality, extract
relevant features, or perform specific tasks such as object detection, image
restoration, image segmentation and compression.
There are an infinite number of points in both directions, and the intensity value is continuous between 0 and 1 at every point. It is not possible to store these continuous values in digital form. Instead of storing the intensity values at all possible points in the image, we take samples of the image. Each of these sample values is quantized, and the quantization is done using 8 bits (0 to 255) for grayscale images and 24 bits for colored images (8 bits for each of the R, G and B channels).
3. High-Level Processes:
Image analysis and computer vision
3.3 COMPONENTS OF AN IMAGE PROCESSING SYSTEM
Image Sensors:
The first is the physical device that is sensitive to the energy radiated by the object we wish to image (the sensor).
Computer:
The computer in an image processing system is a general-purpose
computer and can range from a PC to a supercomputer. In dedicated
applications, sometimes specially designed computers are used to achieve
a required level of performance.
3.4 FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING
2. Image Enhancement:
Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image, such as changing brightness and contrast.
3. Image Restoration:
Unlike enhancement, which is subjective, image restoration is objective.
Restoration techniques tend to be based on mathematical or probabilistic
models of image degradation.
4. Color Image Processing:
It deals with pseudo color and full color image processing color models
are applicable to digital image processing.
7. Morphological Processing:
Morphological processing deals with tools for extracting image
components that are useful in the representation and description of shape.
8. Segmentation:
Segmentation procedures partition an image into its constituent parts or
objects.
3.5 APPLICATION OF IMAGE PROCESSING
Medical Imaging:
The left-hand image is an original CT scan image of a human brain. The right-hand images are the processed images; the regions of yellow and red indicate the presence of a tumour in the brain.
A slice from an MRI scan of a canine heart, where the task is to find boundaries between types of tissue. A suitable filter is used to highlight edges.
Image Reconstruction
Image processing can be used to recover and fill in the missing or corrupt
parts of an image. This involves using image processing systems that have
been trained extensively with existing photo datasets to create newer
versions of old and damaged photos.
Machine Vision
One of the most interesting and useful applications of Image Processing is
in Computer Vision. Computer Vision is used to make the computer see,
identify things, and process the whole environment as a whole.
An important use of computer vision is in self-driving cars, drones, etc. Computer Vision helps in obstacle detection, path recognition, and understanding the environment.
Remote Sensing:
Image processing techniques are used to enhance, analyze, and interpret remote sensing data to derive insights about the Earth's surface, its features, and changes over time.
Industrial Inspection:
Industrial Vision systems are used in all kinds of industries for quality
check and inspection.
For example, whether a bottle is filled up to the specified level or not. Machine inspection is used to determine that all components are present and that all solder joints are acceptable.
Biometrics:
Image processing plays a crucial role in biometric systems for tasks such
as face recognition, fingerprint recognition, iris recognition, and hand
geometry analysis.
Security and Surveillance:
Image processing techniques are employed in security and surveillance
systems for tasks such as video analytics, motion detection, object
recognition, and tracking of suspicious activities.
Color images can be modelled as three-band monochrome image data, where each band of data corresponds to a different color. The actual information stored in the digital image data is the brightness information in each spectral band.
information stored in the digital image data is the brightness information
in each spectral band.
Common color spaces are RGB (Red, Green, Blue), HSI (Hue Saturation
and Intensity) CMYK (Cyan, Magenta, Yellow, Black).
3.9 IMAGE FILE FORMATS
A digital image is often encoded in the form of binary files for the purpose of storage and transmission. Different file formats compress the image data by different amounts.
GIF Files -
- The GIF file format uses a lossless compression scheme. As a result, the quality of the image is preserved.
- GIF interlaced images can be displayed as low-resolution images
initially and then develop clarity and detail gradually.
- GIF images can be used to create simple animations.
- GIF 89a images allow for one transparent colour.
Advantages of GIF File Format - GIF uses a lossless compression algorithm that provides up to 4:1 compression of images, and it is the most widely supported graphics format on the Web. GIF supports transparency and interlacing.
Limitations of GIF File Format - GIF file format supports a maximum
of 256 colours. Due to this particular limitation, complex images may lose
some detail when translated into GIF
JPEG Files -
- JPEG is not actually a file type. JPEG is the most important current
standard for image compression. JPEG standard was created by a
working group of the International Organisation for Standardisation
(ISO). This format provides the most dramatic compression option for
photographic images. JPEG compression is used within the JFIF file
format that uses the file extension (jpg).
- This format is useful when the storage space is at a premium. JPEG
pictures store a single raster image in 24-bit colour.
JPEG is a platform-independent format that supports the highest levels of
compression; however, this compression is lossy. JPEG images are not
interlaced; however, progressive JPEG images support interlacing.
Advantage of JPEG File Format- The strength of JPEG file format is its
ability to compress larger image files. Due to this compression, the image
data can be stored effectively and transmitted efficiently from one place to
another
Limitations of JPEG File Format- JPEG in its base version does not
support multiple layers, high dynamic range. Hence JPEG will not be a
wise choice if one is interested in maintaining high quality pictures.
PNG Files -
- PNG stands for Portable Network Graphics. PNG is a bitmapped
image format that employs lossless data compression. PNG was
created to improve and replace the GIF format.
- The PNG file format is regarded and was made as a free and open-
source successor to the GIF file format. The PNG file format supports
true colour (16 million colours), whereas the GIF file format only
allows 256 colours.
- The lossless PNG format is best suited for editing pictures, whereas
the lossy formats like JPG are best for the final distribution of
photographic-type images due to smaller file size.
- Many earlier browsers did not support the PNG file format; however,
since the release of Internet Explorer 7, all popular modern browsers
fully support PNG. Special features of PNG files include support for
up to 48 bits of colour information.
TIFF Files -
- TIFF stands for Tagged Image File Format and was developed by the
Aldus Corporation in the 1980s. It was later supported by Microsoft.
- TIFF files are often used with scanned images. Since a TIFF file does
not compress an image by default, files are often large, but the
quality is preserved.
- It uses a filename extension of TIFF or TIF. The TIFF format is often
used to exchange files between applications and computer platforms.
- Within TIFF, a lossless compression routine known as LZW is
available. This reduces the size of the stored file without perceptible
loss in quality.
- The goals of the TIFF specification include extensibility, portability,
and revisability.
The advantage of the TIFF file format is that it can support any range of
image resolution, size, and colour depth, and different compression
techniques. The disadvantage of the TIFF file format is its large file size,
which limits its usage in web applications.
Figure: A 3×3 neighborhood about a point (x, y) in an image in the spatial
domain. The neighborhood is moved from pixel to pixel in the image to
generate an output image.
s = L − 1 − r
Reversing the intensity levels of an image in this manner produces the
equivalent of a photographic negative.
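A minimal Python sketch of this transformation (assuming an 8-bit grayscale image, so L = 256; the filename is only a placeholder):

import cv2

# Load a grayscale image ('input.png' is a placeholder filename)
img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)

L = 256                    # number of intensity levels for an 8-bit image
negative = (L - 1) - img   # s = L - 1 - r, applied to every pixel

cv2.imwrite('negative.png', negative)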
s = c log(1 + r)
where s and r are the pixel values of the output and input image, c is a
constant, and it is assumed that r ≥ 0. The value 1 is added to each pixel
value of the input image because, if there is a pixel intensity of 0 in the
image, log(0) is undefined (it tends to negative infinity).
The shape of the log curve shows that this transformation maps a narrow
range of low intensity values in the input into a wider range of output
levels. The opposite is true of higher values of input levels.
We use a transformation of this type to expand the values of dark pixels in
an image while compressing the higher-level values. The opposite is true
of the inverse log transformation.
The log function has the important characteristic that it compresses the
dynamic range of images with large variations in pixel values.
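A short sketch of the log transformation in Python (assuming an 8-bit grayscale input; here the constant c is chosen so that the brightest input value maps to 255):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE).astype(np.float64)

# c scales the result so that the maximum input maps to 255
c = 255.0 / np.log(1.0 + img.max())
log_img = c * np.log(1.0 + img)        # s = c * log(1 + r)

cv2.imwrite('log_transformed.png', log_img.astype(np.uint8))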
s = c r^γ
where c and γ are positive constants.
Power-law curves with fractional values of γ map a narrow range of dark
input values into a wider range of output values, with the opposite being
true for higher values of input levels.
Curves generated with values of γ > 1 have exactly the opposite effect as
those generated with values of γ < 1. The transformation reduces to the
identity transformation when c = γ = 1.
A variety of devices used for image capture, printing, and display respond
according to a power law. By convention, the exponent in the power-law
equation is referred to as gamma. The process used to correct these power-
law response phenomena is called gamma correction.
For example, cathode ray tube (CRT) devices have an intensity-to-voltage
response that is a power function, with exponents varying from
approximately 1.8 to 2.5.
With reference to the curve for γ=2.2, we see that such display systems
would tend to produce images that are darker than intended.
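A gamma-correction sketch in Python (the gamma value of 2.2 and the filename are illustrative assumptions; using 1/2.2 instead would brighten the image):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)

gamma = 2.2                     # example exponent (assumption)
r = img / 255.0                 # normalize intensities to [0, 1]
s = np.power(r, gamma)          # s = c * r^gamma with c = 1
corrected = np.uint8(s * 255)

cv2.imwrite('gamma_corrected.png', corrected)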
Contrast stretching.
(a) Form of transformation
function. (b) A low-contrast
image.
(c) Result of contrast stretching.
(d) Result of thresholding.
(Original image courtesy of Dr.
Roger Heady, Research School
ofBiological Sciences, Australian
National University, Canberra,
Australia.)
Thresholding function –
• In the limiting case, T(r) produces a two-level (binary) image.
• A mapping of this form is called a thresholding function.
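Both operations can be sketched in a few lines of Python (the min–max stretching limits and the threshold value of 128 are assumptions chosen for illustration):

import cv2
import numpy as np

img = cv2.imread('low_contrast.png', cv2.IMREAD_GRAYSCALE)

# Contrast stretching: map [r_min, r_max] linearly onto [0, 255]
r_min, r_max = img.min(), img.max()
stretched = ((img - r_min) / float(r_max - r_min) * 255).astype(np.uint8)

# Thresholding: limiting case producing a two-level (binary) image
_, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)

cv2.imwrite('stretched.png', stretched)
cv2.imwrite('binary.png', binary)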
3.13 SUMMARY
In this chapter, we learnt about the fundamentals of image processing.
Before understanding image processing, we discussed the representation
of images in digital form (matrix form). Different types of images and
image file formats were discussed. Intensity levels of binary images,
gray-scale images and coloured images were briefly explained in the
chapter. The wide area of applications of image processing was explained.
Components and steps of a digital image processing system were discussed
with diagrams.
Basic intensity transformation functions like the log transformation and
power-law transformation, contrast stretching and thresholding are
demonstrated using the transformed output images and function curves.
3.15 REFERENCES
1. https://www.researchgate.net/publication.
2. Digital Image Processing by Rafael Gonzalez & Richard Woods,
Pearson; 4th edition, pdf.
3. Digital Image Processing by S. Jayaraman, Tata McGraw Hill
Publication, pdf.
4. Images used are processed using the tools- OpenCV (Python) and
GNU Octave (compatible with MATLAB).
4
IMAGE PROCESSING FUNDAMENTALS
AND PIXEL TRANSFORMATION-II
Unit Structure:
4.0 Objective
4.1 Definitions
4.2 Histogram Processing
4.3 Histogram Equalization
4.4 Histogram Matching
4.5 Mechanics of Spatial Filtering
4.6 Image Smoothing (Low Pass) Filter
4.7 Smoothing Order-Statistic (Non-Linear) Filters
4.8 Sharpening Filters (High Pass Filters)
4.9 Illustration of The First and Second Derivatives of A 1-D Digital
Function- Example
4.10 Image Sharpening Filter -The Laplacian
4.11 Using First - Order Derivatives for (Edge Detection) Image
Sharpening - The Gradient
4.12 Summary
4.13 Exercise Questions
4.14 References
4.0 OBJECTIVE
This chapter provides an overview of image histogram processing. The
objective behind histogram processing is to improve the contrast of the
image by stretching the image intensity range. This chapter also includes
the mechanics of linear and non-linear low pass (smoothing/blur) filters.
First order and second order derivatives are discussed to understand the
high pass (sharpening) filters. Various smoothing and sharpening filters are
explained with examples.
4.1 DEFINITIONS
Image Histogram:
An image histogram shows the graphical representation of the intensity
distribution of all the pixels in an image. We can consider the histogram as
a graph or plot which gives an overall idea about the intensity distribution
of an image.
It is a plot with pixel values (ranging from 0 to 255 in gray-scale images)
on the X-axis and the corresponding number of pixels in the image on the
Y-axis.
The horizontal axis of the histogram shows the values of rk and the
vertical axis shows the values of h(rk) = nk, or p(rk) = nk/MN if the values
are normalized, where M and N are the row and column dimensions of the
image.
We see that the components of the histogram in the high-contrast image
cover a wide range of the intensity scale and, further, that the distribution
of pixels is not too far from uniform. Intuitively, it is reasonable to
conclude that an image whose pixels tend to occupy the entire range of
possible intensity levels and, in addition, tend to be distributed uniformly,
will have an appearance of high contrast and will exhibit a large variety of
gray tones.
It is possible to develop a transformation function that can achieve this
effect automatically, using only the histogram of an input image.
This method usually increases the global contrast of images and allows for
areas of lower local contrast to gain a higher contrast.
It is common practice to normalize a histogram by dividing each of its
components by the total number of pixels in the image, denoted by the
product MN, where, M and N are the row and column dimensions of the
image.
Thus, a normalized histogram is given by p(rk) = nk/MN, for k = 0, 1, 2,
… , L - 1.
p(rk) is an estimate of the probability of occurrence of intensity level rk in
an image. The sum of all components of a normalized histogram is equal
to 1.
Assuming initially continuous intensity values, let the variable r denote
the intensities of an image to be processed. As usual, we assume that r is
in the range [0, L − 1], with r = 0 representing black and r = L − 1
representing white.
The discrete form of the Histogram Equalization transformation is
sk = T(rk) = (L − 1) Σ (j = 0 to k) p(rj), for k = 0, 1, 2, … , L − 1.
These are the values of the equalized histogram. Observe that there are
only five distinct intensity levels. Because r0 = 0 was mapped to s0 = 1,
there are 790 pixels in the histogram equalized image with this value.
Also, there are 1023 pixels with a value of s1 = 3 and 850 pixels with a
value of s2 = 5.
However, both r3 and r4 were mapped to the same value, 6, so there are
(656 + 329) = 985 pixels in the equalized image with this value.
Similarly, there are (245 + 122 + 81) = 448 pixels with a value of 7 in the
histogram equalized image. Dividing these numbers by MN = 4096
yielded the equalized histogram.
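The same discrete mapping can be sketched in NumPy (assuming an 8-bit image, so L = 256; OpenCV's cv2.equalizeHist performs the equivalent operation):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)
L = 256
M, N = img.shape

# Normalized histogram p(r_k) = n_k / MN
hist = np.bincount(img.ravel(), minlength=L)
p = hist / float(M * N)

# s_k = (L - 1) * sum_{j=0..k} p(r_j), rounded to the nearest integer
cdf = np.cumsum(p)
s = np.round((L - 1) * cdf).astype(np.uint8)

equalized = s[img]                   # map every pixel r_k -> s_k
# equivalently: equalized = cv2.equalizeHist(img)
cv2.imwrite('equalized.png', equalized)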
Observe that the center coefficient of the filter, w(0,0), aligns with the
pixel at location (x, y).
For a mask of size m x n we assume that m=2a+1 and n=2b+1 where a
and b are positive integers.
- For a filter of size m x n, we pad the image with a minimum of m−1
rows of 0's at the top and bottom and n−1 columns of 0's on the left and
right.
- Here m and n are equal to 3, so we pad f with two rows of 0's above and
below and two columns of 0's to the left and right.
Spatial Correlation
Spatial Convolution
4.6 IMAGE SMOOTHING (LOW PASS) FILTER
Smoothing filters are used for blurring and for noise reduction. Blurring is
used in preprocessing tasks, such as removal of small details from an
image prior to (large) object extraction.
An Example of Average Filter -
• Linear filters blur all image structures (points, edges and lines), reducing
image quality.
• Linear filters are thus not used much for removing noise.
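A quick sketch of a 3×3 average (box) filter with OpenCV (the kernel size and filename are assumptions chosen for illustration):

import cv2
import numpy as np

img = cv2.imread('noisy.png', cv2.IMREAD_GRAYSCALE)

# 3x3 averaging: each output pixel is the mean of its 3x3 neighbourhood
blurred = cv2.blur(img, (3, 3))

# Equivalent formulation using an explicit kernel and correlation
kernel = np.ones((3, 3), np.float32) / 9.0
blurred2 = cv2.filter2D(img, -1, kernel)

cv2.imwrite('average_filtered.png', blurred)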
4.7 SMOOTHING ORDER-STATISTIC (NON-LINEAR) FILTERS
Order-statistic filters are nonlinear spatial filters whose response is based
on ordering (ranking) the pixels contained in the image area encompassed
by the filter, and then replacing the value of the center pixel with the value
determined by the ranking result.
The best-known filter in this category is the median filter, which, as its
name implies, replaces the value of a pixel by the median of the
intensity values in the neighborhood of that pixel (the original value of
the pixel is included in the computation of the median).
Median filters are quite popular because, for certain types of random
noise, they provide excellent noise-reduction capabilities, with
considerably less blurring than linear smoothing filters of similar size.
Using the 100th percentile results in the so-called max filter, which is
useful for finding the brightest points in an image.
The 0th percentile filter is the min filter, used for the opposite purpose.
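A brief sketch of these order-statistic filters in Python (the 3×3 neighbourhood and the filename are assumptions; the max and min filters are obtained here as grayscale dilation and erosion with a flat kernel):

import cv2
import numpy as np

img = cv2.imread('noisy.png', cv2.IMREAD_GRAYSCALE)

# Median filter: replace each pixel by the median of its 3x3 neighbourhood
median = cv2.medianBlur(img, 3)

# Max (100th percentile) and min (0th percentile) filters with a flat 3x3 kernel
kernel = np.ones((3, 3), np.uint8)
max_filtered = cv2.dilate(img, kernel)   # grayscale dilation = max filter
min_filtered = cv2.erode(img, kernel)    # grayscale erosion = min filter

cv2.imwrite('median.png', median)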
To avoid a situation in which the previous or next points are outside the
range of the scan line, we show derivative computations in the figure from
the second through the penultimate points in the sequence.
Note that the sign of the second derivative changes at the onset and end of
a step or ramp.
The 2nd derivative of an image highlights regions of rapid intensity
change and is therefore often used for edge detection (zero-crossing edge
detectors).
4.10 IMAGE SHARPENING FILTER - THE LAPLACIAN
This vector has the important geometrical property that it points in the
direction of the greatest rate of change of f at location (x, y).
Z1 Z2 Z3
Z4 Z5 Z6
Z7 Z8 Z9
Using absolute values, the gradient is given by
M(x, y) ≈ |gx| + |gy|
Difference in first and third row in the left mask gives partial
derivative in the vertical direction.
Difference in first and third column in the right mask gives partial
derivative in the horizontal direction.
4.12 SUMMARY
In this chapter we learned about the image histogram and its
transformation. Histogram equalization can be used to improve the
contrast of the image by stretching out the intensity range of the image. We
discussed the low pass and high pass filters in spatial domain of the image.
Smoothing Linear (Average filter) and non-linear filters (Median Filters)
are discussed and their impact was shown for noise reduction. Mechanics
of Sharpening filters is illustrated using first order and second order
derivatives of a 1-D function using a line profile. Mathematics of Laplacian
filter and gradient operators are discussed.
4.13 EXERCISE QUESTIONS
1. What is an image histogram? Discuss the histogram patterns of low
contrast and high contrast images.
Grey Level 0 1 2 3 4 5 6 7
3. What is median filter used for? Apply median filter on the below
image keeping border values unchanged-
7. Write a note on high pass filters and formula for calculating 1st and
second order derivatives.
8. Calculate the 1st and 2nd order derivatives for the below given profile
of a line.
6 6 6 6 5 4 3 2 1 1 1 1 1 1 6 6 6 6 6
11. Using first-order derivatives, derive the gradient filter for edge
detection.
4.14 REFERENCES
1. https://towardsdatascience.com/histogram-equalization-5d1013626e64
2. https://docs.opencv.org/
3. Digital Image Processing by Rafael Gonzalez & Richard Woods,
Pearson; 4th edition, pdf.
4. Digital Image Processing by S. Jayaraman, Tata Mc-Graw Hill
Publication, pdf.
5. Images used are processed using the tools- OpenCV (Python) and
GNU Octave (compatible with MATLAB).
5
STRUCTURAL AND MORPHOLOGICAL
OPERATIONS
Unit Structure :
5.0 Objectives
5.1 Edge Detection
5.2 Edge properties
5.3 Simple edge model
5.3.1 Step Edge Model
5.3.2 Ramp Edge Model
5.4 Edge detection techniques
5.4.1 Sobel
5.4.2 Canny
5.4.3 Prewitt
5.4.4 Robert edge detection techniques
5.4.5 LoG filters
5.4.6 DoG filters
5.5 Image Pyramids
5.5.1 Gaussian Pyramid
5.5.2 Laplacian Pyramid
5.5.3 Morphological pyramid
5.6 Summary
5.7 Reference for further reading
5.0 OBJECTIVES
After going through this unit, you will be able to:
Learn dilation, erosion, opening, and closing for morphological
operations.
Master edge detection with Sobel, Prewitt, Robert, and Canny
techniques.
Explore LoG and DoG filters for edge detection at different scales.
Understand Gaussian and Laplacian pyramids for multi-scale image
representation.
Apply morphological operations like dilation, erosion, opening, and
closing for image manipulation and feature extraction.
5.1 EDGE DETECTION
Our eyes naturally focus on edges in an image, as they often reveal
important details about the scene. In image processing, edge detection
plays a similar role, identifying sharp changes in brightness that can tell us
a lot about the world captured in the image.
Why are edges important?
These sudden brightness shifts often correspond to:
Differences in depth (nearby vs. faraway objects)
Changes in surface orientation (flat vs. curved surfaces)
Variations in materials (wood vs. metal)
Shifts in lighting conditions (bright sun vs. shadow)
Ideally, edge detection would result in a clean map highlighting object
boundaries, surface markings, and other significant changes in the image.
This filtered image would contain less data, focusing on the essential
structure while filtering out less relevant details.
2. Orientation:
Indicates the direction of the edge, typically measured in degrees (0-
180) or radians (0-π). This helps differentiate between horizontal,
vertical, or diagonal edges.
3. Location:
Specifies the coordinates (x,y) of each edge pixel in the image. This
allows for precise localization of edges and potential object
boundaries.
4. Type:
Depending on the edge detection algorithm used, the edge type might
be categorized as a step edge (sharp transition), a ramp edge (gradual
change), or a roof edge (double intensity change).
5. Connectivity:
Describes how edge pixels are connected. Edges can be isolated
points, short segments, or continuous curves outlining objects.
6. Curvature:
Represents the degree to which the edge bends or curves. This can be
helpful in identifying shapes and distinguishing between smooth and
sharp corners.
7. Color:
In color images, the edge may have a specific color profile that can be
informative. For example, an edge between a red object and a blue
background might have a combined color representing the transition.
By analyzing these edge properties, we can gain a deeper understanding of
the image content. For example, strong edges with specific orientations
might indicate object boundaries, while weak, fragmented edges could be
noise or irrelevant details. Additionally, edge properties can be used for
tasks like:
Motion detection: Tracking changes in edge properties over time to
detect moving objects.
Understanding edge properties allows us to exploit edge detection for a
wider range of image processing and computer vision applications.
5.4 EDGE DETECTION TECHNIQUES
Edge detection techniques play a crucial role in identifying boundaries and
transitions within images, enabling various image processing tasks. Here,
we delve into several prominent edge detection methods, including Sobel,
Canny, Prewitt, and Robert operators, as well as LoG (Laplacian of
Gaussian) and DoG (Difference of Gaussians) filters.
5.4.1. Sobel
The Sobel operator is a widely-used method for detecting edges in digital
images. It is based on the computation of image gradients to identify areas
of rapid intensity change, which typically correspond to edges or
boundaries between objects. The Sobel operator is particularly effective
due to its simplicity and computational efficiency. Here, we delve into the
principles behind the Sobel operator and its application in edge detection.
The Sobel edge detection technique is commonly used in digital image
processing. The Sobel operator helps identify edges by estimating the
gradient magnitude and direction at each pixel in a grayscale image.
The Sobel algorithm is explained in detail below:
1. Converting the Image into Grayscale:
Before applying edge detection, we convert the original color image
into grayscale. Grayscale images have a single intensity channel,
making them suitable for edge analysis.
2. Sobel Filters:
The Sobel operator involves convolving the image with two filters:
one in the x-direction and another in the y-direction.
These filters are derivative filters, meaning they compute the first
derivative of the image intensity along the x and y axes.
The gradient magnitude is G = √(Gx² + Gy²), where Gx and Gy represent
the gradients in the x and y directions, respectively.
6. Edge Emphasis:
Pixels with high gradient magnitude form edges in the image.
The Sobel operator emphasizes these edges, making them stand out.
Applications:
Edge Detection: The Sobel operator is primarily used for edge
detection in various image processing applications, such as object
recognition, image segmentation, and feature extraction.
Image Analysis: It is also employed in image analysis tasks where the
detection of boundaries and transitions between regions is essential for
understanding image content.
Real-Time Processing: Due to its computational efficiency, the Sobel
operator is suitable for real-time processing applications, including
video processing and computer vision tasks.
Limitations:
Sensitive to noise: Noise in the image can lead to spurious edges in
the output. Pre-processing with noise reduction techniques might be
necessary.
Not scale-specific: The Sobel operator might struggle with edges at
different scales in the image. Techniques like Difference of Gaussians
(DoG) can address this limitation.
Example
Let's say we're working with a basic 3x3 grayscale image where the pixel
intensities vary:
image:
50  100 150
75  120 175
100 150 200
We'll apply the Sobel operator to detect edges in this image.
Horizontal Sobel Kernel (Gx):
-1 0 1
-2 0 2
-1 0 1
Vertical Sobel Kernel (Gy):
-1 -2 -1
 0  0  0
 1  2  1
Convolution:
We apply the horizontal and vertical Sobel kernels separately to the image
using convolution:
For the horizontal gradient (gx):
gx = (-1*50) + (0*100) + (1*150) + (-2*75) + (0*120) + (2*175) +
(-1*100) + (0*150) + (1*200)
= -50 + 0 + 150 - 150 + 0 + 350 - 100 + 0 + 200
= 400
For the vertical gradient (gy):
gy = (-1*50) + (-2*100) + (-1*150) + (0*75) + (0*120) + (0*175)
+ (1*100) + (2*150) + (1*200)
= -50 - 200 - 150 + 0 + 0 + 0 + 100 + 300 + 200
= 200
Gradient Magnitude:
The gradient magnitude is calculated using the formula:
magnitude = sqrt (gx^2 + gy^2)
= sqrt (400^2 + 200^2)
= sqrt (200000)
≈ 447.2
Thresholding:
We compare the gradient magnitude (≈ 447) to a predefined threshold
(let's say 200). Since the magnitude exceeds the threshold, we consider
this pixel as part of an edge.
Output:
We mark the corresponding pixel in the output image as an edge pixel.
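The same procedure can be sketched with OpenCV's built-in Sobel function (the kernel size and the threshold of 200 are illustrative assumptions):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)

# Gradients in the x and y directions (first derivatives), 3x3 Sobel kernels
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# Gradient magnitude and a simple threshold to keep strong edges
magnitude = np.sqrt(gx ** 2 + gy ** 2)
edges = np.uint8(magnitude > 200) * 255   # threshold of 200 is an assumption

cv2.imwrite('sobel_edges.png', edges)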
5.4.2. Canny
The Canny edge detector is a widely recognized and powerful technique
for identifying significant edges in images. It builds upon the foundation
of simpler methods like Sobel or Prewitt but incorporates additional steps
to achieve more robust and well-defined edges. The Canny algorithm
provides accurate results by considering multiple stages of processing.
The Canny edge detector involves several key steps:
1. Gaussian Smoothing:
Before detecting edges, the Canny algorithm applies Gaussian
blurring to the image. This helps reduce noise and smooths out
pixel intensity variations.
The Gaussian filter is defined by the following equation:
G(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))
Here, x and y represent the pixel coordinates, and σ controls the
blurring strength.
2. Gradient Calculation:
Compute the gradient magnitude and direction using Sobel
operators. These operators estimate the intensity changes along the
x and y axes.
The gradient magnitude (G) is given by
G = √(Gx² + Gy²)
where Gx and Gy are the gradients in the x and y directions,
respectively.
3. Non-Maximum Suppression:
Suppress non-maximum gradient values to keep only the local
maxima. This step ensures that only thin edges remain.
Compare the gradient magnitude with its neighbours along the
gradient direction. If it’s the maximum, retain it; otherwise, set it to
zero.
4. Double Thresholding:
Set two thresholds: a high threshold (HT) and a low threshold (LT).
Pixels with gradient magnitude above (HT) are considered strong
edges.
Pixels with gradient magnitude between (LT) and (HT) are
considered weak edges.
Pixels below (LT) are suppressed (set to zero).
Advantages:
Robust Edge Detection: Canny effectively suppresses noise and
identifies well-defined edges.
Good Localization: The edges are accurately positioned relative to
the actual intensity changes in the image.
Single Edge Response: Each edge point is detected only once,
avoiding multiple detections for the same edge.
Limitations:
Computational Cost: Compared to simpler methods like Sobel,
Canny involves more steps and can be computationally more
expensive.
Parameter Tuning: The choice of thresholds (HT and LT) can
impact the results. Selecting appropriate thresholds depends on the
specific image and desired edge characteristics.
Example:
1. Grayscale Conversion: Convert the image to black and white.
Imagine we have a simple 5x5 image with varying shades of gray:
100 100 100 150 200
100 100 100 150 200
100 100 100 150 200
100 100 100 150 200
100 100 100 150 200
2. Blur the Image: Smooth out the image to reduce noise.
After blurring, the image remains the same:
100 100 100 150 200
100 100 100 150 200
100 100 100 150 200
100 100 100 150 200
100 100 100 150 200
3. Find Edges: Look for areas of rapid intensity change.
The algorithm detects edges where there's a sudden change in intensity:
0 0 50 100 100
0 0 50 100 100
0 0 50 100 100
0 0 50 100 100
0 0 50 100 100
4. Thinning: Ensure edges are only one pixel wide.
Thin the edges to keep only the strongest ones:
0 0 0 100 100
0 0 0 100 100
0 0 0 100 100
0 0 0 100 100
0 0 0 100 100
5. Thresholding: Determine which edges are significant.
Set a threshold to distinguish strong and weak edges:
0 0 0 255 255
0 0 0 255 255
0 0 0 255 255
0 0 0 255 255
0 0 0 255 255
6. Edge Tracking: Connect weak edges to strong ones.
Ensure weak edges that are part of a strong edge are retained:
0 0 0 255 255
0 0 0 255 255
0 0 0 255 255
0 0 0 255 255
0 0 0 255 255
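In practice, these stages are wrapped up in OpenCV's Canny function; a minimal sketch follows (the blur parameters and the two thresholds are illustrative assumptions):

import cv2

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)

# Step 1: Gaussian smoothing to reduce noise
blurred = cv2.GaussianBlur(img, (5, 5), 1.4)

# Gradient computation, non-maximum suppression, double thresholding
# and edge tracking by hysteresis are all performed inside cv2.Canny.
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)   # LT = 50, HT = 150

cv2.imwrite('canny_edges.png', edges)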
5.4.3. Prewitt
The Prewitt operator is a fundamental technique in image processing used
for edge detection. It's a simple yet effective way to identify areas in an
image where there's a sudden change in pixel intensity, which often
corresponds to the edges of objects.
Here are the steps involved:
4. Convolution:
Convolve the image with both the Prewitt masks separately. This
involves sliding the masks over the image and computing the weighted
sum of pixel intensities.
The result of convolution along the x-axis (kx) and y-axis (ky)
represents the gradient components.
5. Edge Magnitude:
Calculate the edge magnitude (ked) by combining the gradient
components:
Limitations
Noise: Prone to false edges due to image noise.
Accuracy: Less precise in pinpointing exact edge location and
direction, especially for diagonals.
Weak Edges: May miss subtle changes in intensity.
Example
Imagine you have a grayscale image of a black cat sitting on a white chair.
The Prewitt operator would be helpful in identifying the edges where the
black fur meets the white chair.
Here's how it works:
1. Masks: The Prewitt operator uses two 3x3 masks, one for horizontal
edges and one for vertical edges.
Horizontal Mask: Emphasizes changes in intensity from left to
right.
3. Edge Detection:
Magnitude: We can calculate the overall edge strength by taking the
absolute value of the results from both horizontal and vertical
convolutions. This highlights strong edges (areas with high intensity
change).
Direction (Optional): We can also calculate the direction of the edge
(horizontal, vertical, or diagonal) based on the signs of the mask
values in the convolution.
The final result would be a new image highlighting the edges of the cat
(where fur meets chair) with varying intensity based on how strong the
change in grayscale value is at that point.
5.4.4. Robert edge detection techniques
The Roberts operator is a simple and quick-to-compute method for
measuring a 2-D spatial gradient on an image. It highlights strong spatial
gradient regions, which often correspond to edges.
1. Roberts Edge Detection:
The Roberts operator calculates the gradient intensity in discrete
differentiation. It approximates the gradient by computing the sum of
squares of differences between diagonally adjacent pixels.
The operator uses a pair of 2x2 convolution masks, one of which is
merely the other rotated by 90 degrees. This is similar to the Sobel
operator.
The masks are designed to respond maximally to edges running at a
45° angle to the pixel grid.
The masks can be applied to the input grayscale image independently
to produce separate gradient component measurements in each
orientation.
These components can then be combined to determine the absolute
magnitude and orientation of the gradient at each location.
2. Roberts Masks:
The two Roberts masks are as follows:
Gx = [[1, 0], [0, -1]]
Gy = [[0, 1], [-1, 0]]
Advantages:
Simple and Fast: The Roberts operator is very easy to understand and
implement due to its small kernel size (2x2) and simple calculations.
Limitations:
Less Precise: While good at detecting strong edges, it may not be as
accurate in pinpointing the exact location of the edge compared to
more sophisticated methods.
Diagonal Focus: The Roberts operator is primarily sensitive to edges
along diagonal directions (because it checks diagonals).
Example
Consider a tiny 2x2 grayscale image:
[100, 150]
[120, 180]
We'll apply the Robert operators to compute the horizontal and vertical
gradients, then calculate the gradient magnitude.
1. Horizontal gradient (first Roberts mask): gx = 100 − 180 = −80
2. Vertical gradient (second Roberts mask): gy = 150 − 120 = 30
3. Gradient magnitude: |G| = sqrt(gx^2 + gy^2) = sqrt(6400 + 900) ≈ 85.4
Laplacian of Gaussian (LoG) in Image Processing: The LoG filter
is used in image processing for edge detection. It involves applying a
Gaussian blur to an image before calculating the Laplacian. This helps
to reduce noise, which can be amplified when calculating the second
derivative, which is what the Laplacian does. The LoG operator takes
a single grayscale image as input and produces another grayscale
image as output. The 2-D LoG function centered on zero and with
Gaussian standard deviation σ has the form:
LoG(x, y) = −(1 / (πσ⁴)) [1 − (x² + y²) / (2σ²)] exp(−(x² + y²) / (2σ²))
This filter highlights regions of rapid intensity change and is often used for
edge detection.
Limitations:
Can be computationally intensive.
Sensitive to parameter choices.
Challenges in precise edge localization.
Susceptible to noise amplification.
Complexity in selecting the appropriate scale.
Example
Consider a small 3x3 grayscale image:
[100, 150, 200]
[120, 180, 220]
[90, 140, 190]
We'll apply the LoG filter with a Gaussian kernel of size 3x3 and sigma
(standard deviation) of 1.
1. Apply Gaussian Smoothing:
We'll convolve the image with a 3x3 Gaussian kernel.
Gaussian Kernel:
1/16 2/16 1/16
2/16 4/16 2/16
1/16 2/16 1/16
Smoothed Image:
[147.8125, 183.5, 198.1875]
[163.75, 190.25, 206.25]
[140.1875, 167.75, 183.8125]
Subtract the blurred images: You then subtract the wider-blurred image
from the narrower-blurred image. This subtraction process results in a new
image that emphasizes the edges that are present in the spatial range
between the two levels of blurring.
Mathematical Representation: The DoG function D(x, y, σ) can be
represented as the difference between two Gaussian blurred images
G(x, y, kσ) and G(x, y, σ):
D(x, y, σ) = G(x, y, kσ) − G(x, y, σ)
Application:
Limitations:
Parameter sensitivity.
Limited edge localization.
Noise sensitivity.
Computational overhead (though less than LoG).
Example:
Suppose we have a 3x3 grayscale image:
[100, 150, 200]
[120, 180, 220]
[90, 140, 190]
We'll apply two Gaussian filters with different standard deviations (σ) and
compute their difference.
1. Gaussian with σ1 = 1:
[0.0751, 0.1238, 0.0751]
[0.1238, 0.2042, 0.1238]
[0.0751, 0.1238, 0.0751]
2. Gaussian with σ2 = 2:
[0.028, 0.05, 0.028]
[0.05, 0.091, 0.05]
[0.028, 0.05, 0.028]
3. Difference (DoG):
[0.0471, 0.0738, 0.0471]
[0.0738, 0.1132, 0.0738]
[0.0471, 0.0738, 0.0471]
In this example, we perform the following steps:
Purpose:
Gaussian pyramids are used in various applications such as image
blending, texture analysis, and scale-invariant feature detection.
They provide an efficient and compact representation of images at
multiple resolutions, enabling computationally efficient processing at
different scales.
Example
Consider an original image of size 256x256 pixels. We can construct a
Gaussian pyramid with, for example, 4 levels. At each level, we
successively blur the image using Gaussian filtering and then downsample
it by a factor of 2. This results in a pyramid with images at decreasing
resolutions, capturing the image content at different scales. Here's an
example:
Level 0: Original Image (256x256)
Level 1: Blurred & Downsampled (128x128)
Level 2: Further Blurred & Downsampled (64x64)
Level 3: Further Blurred & Downsampled (32x32)
Level 4: Further Blurred & Downsampled (16x16)
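A compact sketch of this construction using OpenCV's pyrDown (the filename and the number of levels are assumptions):

import cv2

img = cv2.imread('input.png')        # e.g. a 256x256 image

# Build a 4-level Gaussian pyramid: blur + downsample by 2 at each step
pyramid = [img]
for level in range(4):
    img = cv2.pyrDown(img)           # Gaussian blur followed by 2x downsampling
    pyramid.append(img)

for i, p in enumerate(pyramid):
    print('Level', i, p.shape)       # 256x256, 128x128, 64x64, 32x32, 16x16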
Construction:
Purpose:
Benefits:
Example
Using the Gaussian pyramid constructed above, we can derive the
Laplacian pyramid. For each level of the Gaussian pyramid (except the
last level), we upsample the next level and subtract it from the current
level. This results in a pyramid capturing the details at different scales.
Here's an example:
Level 0: Original Image (256x256)
Level 1: Details (128x128)
Level 2: Details (64x64)
Level 3: Details (32x32)
Construction:
Like the Laplacian pyramid, it is built by taking the difference between
levels of the Gaussian pyramid.
However, instead of the difference operation, morphological
operations such as erosion and dilation are applied.
Purpose:
Example
For a morphological pyramid, suppose we have an image containing
binary objects (e.g., shapes). We can use morphological operations such as
erosion and dilation to create a pyramid representing the morphological
features at different scales. Here's an example:
Level 0: Original Binary Image (256x256)
Level 1: Eroded & Dilated (128x128)
Level 2: Further Eroded & Dilated (64x64)
Level 3: Further Eroded & Dilated (32x32)
5.6 SUMMARY
In image processing, edge detection is crucial for identifying boundaries
between objects or regions. Various techniques, such as Sobel, Canny, and
Prewitt operators, utilize gradient information to detect edges accurately.
Simple edge models like step and ramp edges offer idealized
representations for analysis. Advanced methods like LoG and DoG filters
combine Gaussian blurring with differentiating operators to detect edges at
multiple scales. Additionally, image pyramids, including Gaussian,
Laplacian, and morphological pyramids, facilitate multi-scale analysis,
aiding tasks such as image compression and feature extraction. These
techniques are fundamental in computer vision applications, offering
insights into image structures and facilitating further processing.
Jain, A. K. (1989). Fundamentals of Digital Image Processing.
Prentice Hall.
Websites
https://docs.opencv.org/
https://www.pyimagesearch.com/
https://towardsdatascience.com/tagged/image-processing
https://www.mathworks.com/help/images/
6
IMAGE PROCESSING
Unit Structure :
6.0 Objectives
6.1 Erosion
6.2 Dilation
6.3 Opening and closing
6.4 Hit-or-Miss Transformation
6.5 Skeletonizing
6.6 Computing the convex hull
6.7 Removing Small Objects
6.8 White and black top- hats
6.9 Extracting the boundary
6.10 Grayscale operations
6.11 Summary
6.12 Reference for further reading
6.0 OBJECTIVES
After completing this unit, you will be proficient in a range of
fundamental image processing techniques. You will be able to effectively
utilize erosion and dilation to enhance boundaries and fill gaps, implement
opening and closing operations for noise reduction and object
preservation, employ Hit-or-Miss Transformation for pattern detection,
and skeletonize images to thin object structures while maintaining
connectivity. Additionally, you will have the skills to compute convex
hulls for understanding object shapes, remove small objects while
preserving essential features, and utilize white and black top-hat
transformations for feature extraction. You will also be adept at extracting
boundaries for further analysis and applying grayscale operations for
contrast enhancement and filtering, thereby enhancing your capabilities in
image processing.
Set theory notation is a powerful tool in image processing, particularly for
defining fundamental morphological operations like erosion and dilation.
Here's a breakdown of how it's used:
Basics of set theory
Image as a Set:
We consider a digital image as a set of pixels. Each pixel is
represented by its coordinates (x, y) and has a specific intensity value.
In binary images, the intensity values are typically 0 (black) and 1
(white). So, the image becomes a set of coordinates representing
foreground pixels (usually white).
Union (U): This combines pixels that belong to either set A or set B or
both. In image processing, it could represent combining two separate
objects in an image.
Intersection (∩): This includes pixels that belong to both set A and set
B. It's useful for finding overlapping regions between objects.
Difference (\): This includes pixels that are in set A but not in set B. In
image processing, it can be used to isolate objects by removing
overlapping areas.
Complement (A^c): This represents all the pixels that are not in set
A. In a binary image, it would be all the black pixels if A represents
the white foreground objects.
Morphological Operations:
6.1 EROSION
Erosion is a fundamental morphological operation in image processing
that is used to shrink or thin objects in a binary image. It involves a
process where the boundaries of objects are eroded away based on a
predefined structuring element. This structuring element, which can be of
various shapes like a disk or a square, moves over the image, and at each
position, it compares its shape with the object’s pixels in the image. If the
structuring element fits within the object, the original pixel value is
retained; otherwise, it is set to zero, effectively eroding the boundary of
the object.
Applications of erosion:
Noise reduction: Erosion can help remove small isolated pixels that
might be caused by noise in the image.
Object separation: By selectively eroding objects, you can isolate
them from touching or overlapping objects.
Shape analysis: Erosion can be used to analyze the shape of objects in
an image by measuring how much they shrink under erosion.
Example
Imagine a binary image where "1" represents the object and "0" represents
the background:
00000
01110
01110
01110
00000
Let's use a simple 3x3 square structuring element:
111
111
111
To create erosion, we slide the structural element over the image. If all of
the pixels under the structural element are "1", the center pixel remains
"1"; otherwise, it is "0".
Here is the outcome of erosion:
00000
00000
00100
00000
00000
In this simplified example, the objects in the binary image have shrunk, or
eroded, because we've eliminated pixels from their edges.
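The same example can be reproduced with OpenCV; a minimal sketch using the array and structuring element shown above:

import cv2
import numpy as np

# Binary image from the example ("1" = object, "0" = background)
A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]], dtype=np.uint8)

kernel = np.ones((3, 3), np.uint8)   # 3x3 square structuring element
eroded = cv2.erode(A, kernel)

print(eroded)   # only the centre pixel of the 3x3 block survives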
6.2 DILATION
Dilation, alongside erosion, is a fundamental operation in image
processing, particularly within the realm of mathematical morphology. In
contrast to erosion, which shrinks objects, dilation expands the boundaries
of objects in a binary image.
The dilation of A by a structuring element B is defined as
A ⊕ B = { z | (B̂)z ∩ A ≠ ∅ }
where B̂ denotes the reflection of B about its origin and (B̂)z its
translation by z.
Applications of dilation:
Example
Consider a binary image where "1" represents the object and "0"
represents the background:
00000
01010
00000
Let's use a simple 3x3 square structuring element:
111
111
111
To perform dilation, we slide this structuring element over the image. If
any part of the structuring element overlaps with a foreground pixel
(marked as "1") in the image, we set the output pixel at the centre
position of the structuring element to "1".
Here's the result of dilation:
11111
11111
11111
In this simplified example, the objects in the binary image have been
expanded or thickened as we've added pixels to their boundaries using the
3x3 square structuring element.
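A corresponding sketch of dilation with OpenCV, using the same example array:

import cv2
import numpy as np

# Binary image from the example ("1" = object, "0" = background)
A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 0, 0, 0]], dtype=np.uint8)

kernel = np.ones((3, 3), np.uint8)   # 3x3 square structuring element
dilated = cv2.dilate(A, kernel)

print(dilated)   # every pixel becomes 1, as in the worked example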
Duality
Erosion and dilation are duals of each other with respect to set
complementation and reflection. This duality is a powerful concept that
helps in understanding and applying these operations effectively in image
processing and analysis.
The duality principle can be expressed as:
(A ⊖ B)ᶜ = Aᶜ ⊕ B̂ and (A ⊕ B)ᶜ = Aᶜ ⊖ B̂
These operations are used in various image processing tasks such as noise
reduction, shape analysis, and feature extraction.
It's useful for removing small objects or noise from the foreground of
an image while preserving the larger structures.
Closing:
Differences:
Order: Opening is erosion followed by dilation, whereas closing is
dilation followed by erosion.
Combined Usage:
Example
Imagine a 1D "image" with values representing foreground (1) and
background (0):
Original Image (A): [1, 0, 1, 1, 0, 1, 0, 1, 0]
Structuring Element (B): [1, 1] (assuming a small rectangular structuring
element)
Opening (Erosion followed by Dilation):
1. Erosion: Slide B over A (origin at the left element of B; positions
outside the image are treated as background). A position stays 1 only if
both elements of B land on foreground 1s.
Eroded Image: [0, 0, 1, 0, 0, 0, 0, 0, 0] (Notice isolated foreground
elements are removed; only the run of two 1s survives)
2. Dilation: Each remaining 1 spreads back over the extent of B.
Opened Image: [0, 0, 1, 1, 0, 0, 0, 0, 0] (Small foreground elements are
gone, larger ones remain)
Closing (Dilation followed by Erosion):
1. Dilation: Following the same principle as above, every 1 in A spreads
over the extent of B.
Dilated Image: [1, 1, 1, 1, 1, 1, 1, 1, 1] (Small gaps are filled)
2. Erosion: Similar to the erosion step in opening.
Closed Image: [1, 1, 1, 1, 1, 1, 1, 1, 0] (The single-pixel gaps are filled;
the last position drops out only because of the image border)
This simplified example demonstrates how opening removes small
foreground elements (noise) and closing fills small gaps/holes. Remember,
in real images, these operations would be applied to a 2D array of pixels,
and the size and shape of the structuring element would significantly
impact the results.
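On real 2-D images these compound operations are available directly in OpenCV; a minimal sketch (the filename and kernel size are assumptions):

import cv2
import numpy as np

img = cv2.imread('binary_mask.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((3, 3), np.uint8)

# Opening = erosion followed by dilation: removes small bright specks
opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

# Closing = dilation followed by erosion: fills small gaps and holes
closed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

cv2.imwrite('opened.png', opened)
cv2.imwrite('closed.png', closed)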
Morphological opening and closing have the following properties:
1. Idempotence:
Performing opening or closing on an image multiple times with the
same structuring element yields the same result as performing it once.
Mathematically, (A ∘ B) ∘ B = A ∘ B and (A ∙ B) ∙ B = A ∙ B.
2. Extensivity:
Morphological closing is an extensive operation, meaning the resulting
image always contains the input image, whereas opening is anti-extensive,
meaning the result is always contained in the input image.
Mathematically, A ∘ B ⊆ A ⊆ A ∙ B.
3. Non-Increasing:
Morphological opening is non-increasing, meaning the size of objects
in the resulting image is never greater than the size of objects in the
input image.
Conversely, morphological closing never shrinks objects; it can only fill
in gaps and holes.
5. Associativity:
Opening and closing are associative operations, meaning the order of
applying these operations does not affect the final result.
Mathematically, (A ∘ B) ∘ C = A ∘ (B ∘ C) and (A ∙ B) ∙ C = A ∙ (B ∙ C).
The hit-or-miss transform of image A by the structuring element pair
B = (C, D) is defined as A ⊛ B = (A ⊖ C) ∩ (Aᶜ ⊖ D).
Here, (A ⊖ C) represents the erosion of image A by structuring element C,
and (Aᶜ ⊖ D) represents the erosion of the complement of image A by
structuring element D.
Applications:
Example
Imagine a small 3x3 binary image (A):
A = [[0, 1, 0],[0, 1, 1],[0, 0, 0]]
find all locations with a vertical line segment (two foreground pixels
stacked on top of each other).
Structuring Elements:
B1 (Hit Element): A vertical line segment with two foreground
pixels (1s).
B1 = [[1, 0],[1, 0]]
B2 (Miss Element): A single foreground pixel (1) in the center,
representing a position that should not have a foreground pixel in
the original image for a hit.
B2 = [[0, 1, 0],[0, 1, 0]]
Process:
We'll iterate over each pixel in the image (A) and apply the following
logic at each position (i, j):
3. Output Image:
If both hit conditions are satisfied (vertical line segment found), set
the corresponding pixel (i, j) in the output image to 1 (foreground).
Otherwise, leave the output pixel (i, j) as 0 (background).
Output Image:
After checking all positions, the output image will have a 1 (foreground)
only at location (1, 1) because that's the only place where the vertical line
segment is found (two foreground pixels stacked). All other output pixels
will be 0 (background).
6.5 SKELETONIZING
Skeletonizing in image processing refers to the process of reducing a
shape to its basic form, which is its skeleton. This is done by successively
eroding the boundary of the shape until only the ‘skeleton’ remains. The
skeleton consists of a subset of the medial axis points of the shape and
represents the general form of the shape in a simplified manner.
In the context of binary images, skeletonizing is used to thin objects to
their minimal representation without changing the essential structure or
connectivity. This can be particularly useful for feature extraction, pattern
recognition, and image analysis tasks.
Mathematically, the skeleton S(A) of a shape A in a binary image can be
expressed in terms of erosions and openings as
S(A) = ∪ (k = 0 to K) Sk(A), with Sk(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B
where (A ⊖ kB) denotes k successive erosions of A by the structuring
element B, and K is the last erosion step before A erodes to the empty set.
What skeletonizing does:
Reduces object thickness: It iteratively removes pixels from the
boundaries of foreground objects (usually white pixels) until a one-
pixel-wide skeleton remains.
Example
Imagine a small 5x5 binary image (A) representing a simple shape:
A = [[0, 1, 1, 1, 0],
[0, 1, 1, 1, 0],
[0, 1, 1, 1, 0],
[0, 1, 1, 1, 0],
[0, 0, 0, 0, 0]]
This represents a solid rectangular block of foreground pixels, three
pixels wide.
Skeletonization Process (Simplified):
We'll assume a basic iterative approach where we remove pixels from the
object's boundaries that are "safe" to remove (won't break connectivity).
Iteration 1:
1. Identify pixels on the object's boundary (all foreground pixels touching
a background pixel).
2. Remove pixels that have exactly two foreground neighbors (these are
safe to remove as they won't disconnect the object).
Final Skeleton:
Applications:
3. QuickHull:
Divide and Conquer: QuickHull follows a divide and conquer
strategy similar to QuickSort. It recursively divides the set of points
into subsets based on their relationship with a line connecting the two
outermost points (extreme points).
Finding Extreme Points: It first finds the extreme points (leftmost
and rightmost points) of the set.
Partitioning: It partitions the points into two subsets based on which
side of the line they lie. Points lying on the left side are included in the
left subset, and points lying on the right side are included in the right
subset.
Recursive Hull Construction: It recursively constructs the convex
hull for each subset until no further subdivision is possible.
148
Merging: Finally, it merges the convex hulls of the two subsets to Image Processing
form the convex hull of the entire set of points.
Efficiency: QuickHull is efficient and has an average-case time
complexity of O(n log n) and a worst-case time complexity of O(n²). It
is suitable for both small and large datasets.
Example
Consider an image:
Points: [(1, 2), (3, 5), (6, 4), (7, 2), (4, 1), (2, 1)]
Now, let's apply each of the convex hull algorithms mentioned:
Graham Scan:
It sorts the points based on their angles and constructs the convex hull.
Result: [(1, 2), (2, 1), (4, 1), (7, 2), (6, 4), (3, 5)]
Jarvis’s March (Gift Wrapping):
It iteratively selects the next point with the smallest angle to enclose the
points.
Result: [(1, 2), (2, 1), (4, 1), (7, 2), (6, 4), (3, 5)]
QuickHull:
It divides the points into subsets and recursively finds the convex hull.
Result: [(1, 2), (2, 1), (4, 1), (7, 2), (6, 4), (3, 5)]
pixels from the edges of objects. In this case, we erode the foreground
objects (black pixels) in the image.
4. Size Threshold : We define a size threshold for objects. Objects with
fewer pixels than the threshold after erosion are considered "small"
and unwanted.
5. Removing Small Objects : There are two main approaches to remove
small objects:
Keeping only the eroded image: This approach discards the original
image and keeps only the eroded version, where small objects have
vanished.
White Top-Hat:
The white top-hat operation is the difference between the original image
and its opening.
Opening is an erosion followed by a dilation operation. It removes bright
regions smaller than the structuring element. White top-hat highlights
small bright structures or details in the image that are smaller than the
structuring element used in the opening operation. It is useful for
enhancing features such as small objects, bright spots, or textures against a
relatively bright background.
Black Top-Hat:
The black top-hat operation is the difference between the closing of the
original image and the original image itself. Closing is a dilation followed
by an erosion operation. It removes dark regions smaller than the
structuring element. Black top-hat highlights small dark structures or
details in the image that are smaller than the structuring element used in
the closing operation. It is useful for enhancing features such as small dark
objects, dark spots, or textures against a relatively dark background.
Both white top-hat and black top-hat operations are valuable tools in
image processing for extracting and enhancing subtle features that may be
difficult to discern in the original image. They are commonly used in tasks
such as image enhancement, feature extraction, and texture analysis.
The equations for the white top-hat and black top-hat operations can
be expressed as follows:
White Top-Hat:
WTH(A)=A−(A∘B)
where A is the original image, ∘ represents the opening operation, and B
is the structuring element.
Black Top-Hat:
BTH(A)=(A∙B)−A
where A is the original image, ∙ represents the closing operation, and B is
the structuring element.
In these equations, WTH(A) represents the white top-hat of image A and
BTH(A) represents the black top-hat of image A.
Example
Suppose we have the following binary image A, where "1" represents the
object and "0" represents the background:
Now, let's compute the white top-hat (WTH) and black top-hat (BTH) of
the given binary image.
Given the binary image A and the structuring element B:
We'll perform erosion and dilation operations to compute the white top-hat
and black top-hat. Let's start by computing erosion A⊖B and dilation
A⊕B.
Let's perform erosion and dilation on the given binary image A with the
structuring element B:
Erosion (A ⊖ B):
The result after erosion will remove one layer of pixels from the boundary
of the objects in the image.
Dilation (A ⊕ B):
The result after dilation will add one layer of pixels to the boundary of the
objects in the image.
Now, let's compute the white top-hat and black top-hat using these results.
Given the results of erosion (A⊖B) and dilation (A⊕B), we can now
compute the white top-hat (WTH) and black top-hat (BTH) of the image
A.
White Top-Hat (WTH):
The white top-hat operation is obtained by subtracting the result of erosion
(A⊖B) from the original image (A)
WTH(A)=A−(A⊖B)
Subtracting the erosion result from the original image:
A is the same as the original image since erosion did not remove any
pixels.
Black Top-Hat (BTH):
The black top-hat operation is obtained by subtracting the original image
(A) from the result of dilation (A⊕B)
BTH(A)=(A⊕B)−A
Subtracting the original image from the dilation result:
The black top-hat (BTH) of the image highlights the boundary of the
object, which was added by the dilation operation.
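Both top-hats can be sketched directly with OpenCV's morphologyEx (the 9×9 structuring element and the filename are assumptions):

import cv2

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))

# White top-hat: A - (A opened by B), keeps small bright details
wth = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)

# Black top-hat: (A closed by B) - A, keeps small dark details
bth = cv2.morphologyEx(img, cv2.MORPH_BLACKHAT, kernel)

cv2.imwrite('white_tophat.png', wth)
cv2.imwrite('black_tophat.png', bth)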
Boundary(A) = A − (A ⊖ B)
where A represents the original binary image, B is the structuring element,
and ⊖ denotes erosion.
Example
Consider binary image A, where "1" represents the object and "0"
represents the background:
1. Erosion (A ⊖ B):
Perform erosion on the original binary image A using the structuring
element B.
2. Difference:
Compute the difference between the original image A and the eroded
image (A ⊖ B).
Let's apply these steps to our example:
1. Erosion (A⊖B):
Erosion removes pixels from the boundary of objects in the image. For
each pixel, if all the pixels in the 3x3 neighbourhood around it are 1
(object), the pixel remains 1; otherwise, it becomes 0.
2. Difference:
Compute the difference between the original image A and the eroded
image (A ⊖ B) to obtain the boundary.
Boundary(A)=A−(A⊖B)
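A small sketch of boundary extraction in Python, using an illustrative 5×5 binary array:

import cv2
import numpy as np

# Binary image A ("1" = object, "0" = background); a small illustrative array
A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]], dtype=np.uint8)

B = np.ones((3, 3), np.uint8)     # 3x3 structuring element

eroded = cv2.erode(A, B)
boundary = A - eroded             # Boundary(A) = A - (A eroded by B)

print(boundary)   # the interior pixel is removed, only the outline remains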
Erosion:
In grayscale erosion, the minimum pixel value within the neighborhood
defined by the structuring element is assigned to the center pixel. Erosion
reduces the intensity of bright regions and is useful for removing small
bright spots or thinning object boundaries.
Dilation:
In grayscale dilation, the maximum pixel value within the neighborhood
defined by the structuring element is assigned to the center pixel. Dilation
increases the intensity of bright regions and is useful for filling gaps or
thickening object boundaries.
Opening:
Opening is the combination of erosion followed by dilation. It helps in
removing small bright regions while preserving the larger structures and is
useful for smoothing or filtering images.
Closing:
Closing is the combination of dilation followed by erosion. It helps in
filling small dark gaps or holes while preserving the larger structures and
is useful for image restoration or enhancing object boundaries.
Gradient:
The gradient of a grayscale image is computed as the difference between
dilation and erosion. It highlights the boundaries or edges of objects in the
image and is useful for edge detection or feature extraction.
Top-Hat Transform:
The top-hat transform is the difference between the original image and its
opening. It extracts small-scale features or details from the image and is
useful for image enhancement or background subtraction.
These grayscale morphology operations are fundamental tools in image
processing for analyzing and manipulating grayscale images to extract
meaningful information or enhance image quality.
Some common grayscale morphology operations:
1. Grayscale Erosion: (f ⊖ b)(x, y) = minimum of f over the neighbourhood defined by b
2. Grayscale Dilation: (f ⊕ b)(x, y) = maximum of f over the neighbourhood defined by b
3. Grayscale Opening: f ∘ b = (f ⊖ b) ⊕ b
4. Grayscale Closing: f ∙ b = (f ⊕ b) ⊖ b
6.11 SUMMARY
Morphological operations are fundamental techniques in image
processing, each serving distinct purposes. Erosion and dilation are basic
operations for shrinking and expanding object boundaries, while opening
and closing are compound operations used for noise reduction and gap
filling, respectively. Hit-or-Miss transformation facilitates pattern
matching, and skeletonizing reduces object thickness to topological
skeletons. Computing the convex hull encloses objects within the smallest
convex shape, while removing small objects enhances segmentation
results. White and black top-hats highlight specific features, boundary
extraction isolates object contours, and grayscale operations are essential
for processing grayscale images. These operations find extensive
application in various fields like image analysis, pattern recognition, and
feature extraction, offering a powerful toolkit for image enhancement and
understanding.
Soille, P. (2003). Morphological Image Analysis: Principles and
Applications. Springer.
Haralick, R. M., & Shapiro, L. G. (1992). Computer and Robot Vision:
Volume I. Addison-Wesley.
Dougherty, G. (2003). Digital Image Processing for Medical
Applications. Cambridge University Press.
Web references
7
ADVANCED IMAGE PROCESSING
OPERATIONS
Unit Structure :
7.0 Objectives
7.1 Introduction
7.2 Extracting Image Features and Descriptors: Feature detector versus
descriptors
7.3 Boundary Processing and feature descriptor
7.4 Principal Components
7.4.1 Introduction to Principal Component Analysis (PCA):
7.4.2 Key Concepts
7.4.3 PCA Example
7.4.4 What Are Principal Components?
7.4.5 How Principal Component Analysis (PCA) work?
7.4.6 Applications of PCA
7.5 Harris Corner Detector
7.6 Blob detector
7.7 Histogram of Oriented Gradients
7.8 Scale-invariant feature transforms
7.9 Haar-like features
7.10 Summary
7.11 List of References
7.12 Unit End Exercises
7.0 OBJECTIVES
To get familiar with various advance image processing operations
To study the analogy between feature detector and descriptors
To gain detailed insights about the principal components along with
different detectors associated with image processing techniques
7.1 INTRODUCTION
In the vast realm of computer vision, unlocking the intricacies of visual
data is paramount for machines to perceive and comprehend the world
around them. Central to this endeavor is the extraction and description of
image features, a fundamental process that underpins numerous
applications ranging from object recognition to image matching and
beyond. This chapter delves into the intricacies of this critical domain,
exploring various techniques and methodologies employed in the
extraction and characterization of image features.
From fundamental concepts to advanced techniques, each section sheds
light on a specific aspect of this multifaceted domain, offering insights and
practical knowledge to readers keen on mastering the art of image
analysis. Additionally, this chapter also explores the role of feature
descriptors in capturing the salient characteristics of image boundaries,
paving the way for robust and efficient feature representation.
This chapter elucidates the principles of PCA and its applications in
extracting informative features from high-dimensional data. Through
insightful discussions and illustrative examples, readers gain a deeper
understanding of how PCA facilitates the extraction of essential visual
features while mitigating the curse of dimensionality.
1. Feature Detectors:
Feature detectors are algorithms designed to identify distinctive points
or regions in an image that are salient or informative. These points are
often referred to as keypoints or interest points.
They locate areas with unique characteristics such as corners, edges, or
blobs that are likely to be recognizable across different views of an
object or scene.
Common feature detectors include Harris corner detector, FAST
(Features from Accelerated Segment Test), and SIFT (Scale-Invariant
Feature Transform).
2. Descriptors:
Descriptors are complementary to feature detectors. Once keypoints
are detected, descriptors are used to describe the local appearance of
these keypoints in a way that is invariant to changes in illumination,
viewpoint, and scale.
3. Variance Maximization: PCA seeks to maximize the variance of the
data along each principal component. By capturing the directions of
maximum variance, PCA ensures that the transformed data retains as
much information as possible from the original dataset.
4. Eigenvalues and Eigenvectors: PCA is based on the eigendecomposition
of the covariance matrix of the original data. The eigenvalues
represent the amount of variance explained by each principal component,
while the eigenvectors represent the directions (or axes) of maximum
variance.
5. Projection : After determining the principal components, PCA projects
the original data onto these components to obtain the transformed dataset.
This projection preserves the most significant information in the data
while reducing its dimensionality.
The principal component can be written as:
Z₁ = φ₁₁X₁ + φ₂₁X₂ + φ₃₁X₃ + ... + φₚ₁Xₚ
where,
Z₁ is the first principal component.
φ₁₁, φ₂₁, ..., φₚ₁ are the loadings of the first principal component; together
they form the loading vector. The loadings are constrained so that their
sum of squares equals 1, because loadings of large magnitude could
otherwise inflate the variance arbitrarily. The loading vector defines the
direction of the first principal component (Z₁), the direction along which
the data vary the most. It corresponds to a line in p-dimensional space
that is closest to the n observations, where closeness is measured by the
average squared Euclidean distance.
X₁, ..., Xₚ are normalized predictors. Normalized predictors have mean
values equal to zero and standard deviations equal to one.
First Principal Component:
The first principal component is a linear combination of the original
predictor variables that captures the maximum variance in the data set.
It determines the direction of highest variability in the data. The larger
the variability captured by the first component, the more information it
carries; no other component can have variability higher than the first
principal component.
The first principal component results in a line that is closest to the
data, i.e., it minimizes the sum of squared distances between the data
points and the line.
Similarly, we can compute the second principal component.
Second Principal Component (Z₂):
The second principal component is also a linear combination of the
original predictors; it captures the remaining variance in the data set
and is uncorrelated with Z₁. In other words, the correlation between the
first and second components should be zero. It can be represented as:
Z₂ = φ₁₂X₁ + φ₂₂X₂ + φ₃₂X₃ + ... + φₚ₂Xₚ
If the two components are uncorrelated, their directions must be
orthogonal. For simulated data with two predictors, plotting the
components shows that their directions are indeed orthogonal, which
confirms that the correlation between them is zero.
5. Choose Principal Components:
Select the top k eigenvectors (principal components) where k is the
desired dimensionality of the reduced dataset.
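A minimal NumPy sketch of the PCA steps outlined above follows; the random data and the choice of k = 2 are assumptions made purely for illustration.

import numpy as np

X = np.random.rand(100, 5)                      # placeholder data: 100 samples, 5 predictors
X_std = (X - X.mean(axis=0)) / X.std(axis=0)    # normalized predictors (mean 0, std 1)

cov = np.cov(X_std, rowvar=False)               # covariance matrix of the predictors
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues/eigenvectors (symmetric matrix)

order = np.argsort(eigvals)[::-1]               # sort by explained variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2                                           # desired reduced dimensionality
Z = X_std @ eigvecs[:, :k]                      # projection onto the top-k components
explained = eigvals[:k] / eigvals.sum()         # fraction of variance explained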
7.5 HARRIS CORNER DETECTOR
Here's a brief overview of how the Harris corner detector works:
1. Gradient Calculation : The first step involves computing the gradient
of the image. This is typically done using a gradient operator like the
Sobel operator.
2. Structure Tensor : For each pixel in the image, a structure tensor is
calculated. The structure tensor represents the local image structure
around the pixel. It is essentially a covariance matrix of the image
gradients within a local neighborhood of the pixel.
3. Corner Response Function : The corner response function is
computed using the eigenvalues of the structure tensor. This function
measures the likelihood of a point being a corner. High values of the
corner response function indicate strong corners.
4. Thresholding and Non-maximum Suppression : After computing
the corner response function, a threshold is applied to identify strong
corners. Additionally, non-maximum suppression is often performed
to select only the local maxima as corner points.
We will now discuss a corner detection method. Fig. 2 depicts the
concept of the Harris-Stephens (HS) corner detector. The fundamental
strategy is as follows: to find the corners of an image, a small window
is dragged over it, and the detector computes the variation in intensity
seen through the window. Three situations are of interest: (1) areas of
zero (or small) intensity change in all directions; (2) areas of change in
one direction but no (or small) change in the orthogonal direction, which
occur when the window spans a boundary between two regions, as at
location B; and (3) areas of significant change in all directions, which
occur when the window contains a corner or isolated points, as at
location C. The HS corner detector is a mathematical model that aims to
distinguish between these three conditions.
same size that has been shifted by (x, y). Next, the weighted sum of
squared differences between the two patches is obtained as follows:
C(x, y) = Σ(s,t) w(s, t) [f(s + x, t + y) − f(s, t)]²
where w(s, t) is a weighting function over the window (Figure 3).
Because a square matrix's determinant equals the product of its
eigenvalues and its trace equals the sum of its eigenvalues, the HS
detector uses a measure of corner response based on these two quantities.
The measure is defined as
R = det(M) − k · [trace(M)]²
where M is the matrix of summed gradient products within the window
(the structure tensor) and k is a small constant, typically chosen in the
range 0.04 to 0.06. Large values of R indicate the presence of a corner.
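For illustration, a hedged sketch of Harris corner detection using scikit-image is shown below; the sensitivity constant k, the minimum peak distance, and the test image are arbitrary example choices, not prescriptions.

import numpy as np
from skimage import data
from skimage.feature import corner_harris, corner_peaks

image = data.checkerboard()                     # sample image shipped with scikit-image

response = corner_harris(image, k=0.05)         # corner response R = det(M) - k * trace(M)^2
corners = corner_peaks(response, min_distance=5, threshold_rel=0.1)  # non-maximum suppression

print(len(corners), "corners detected")         # corners holds (row, col) coordinates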
However, it's important to consider HOG's limitations:
Complexity: While good for basic shapes, HOG might struggle with
objects undergoing significant pose variations or complex
appearances.
HOG has been widely used due to its simplicity and effectiveness,
especially when combined with machine learning algorithms like Support
Vector Machines (SVMs) or neural networks. However, it's worth noting
that HOG alone might not be robust to complex variations in appearance
and pose, so it's often used as part of a larger pipeline in modern computer
vision systems.
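A minimal sketch of computing a HOG descriptor with scikit-image follows; the orientation, cell, and block parameters are commonly used example values, not tuned settings.

from skimage import color, data
from skimage.feature import hog

image = color.rgb2gray(data.astronaut())        # sample RGB image converted to grayscale

features, hog_image = hog(
    image,
    orientations=9,                             # number of gradient-orientation bins
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    visualize=True,                             # also return a visualization of the HOG
)
print(features.shape)                           # flattened descriptor vector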
7.8 SCALE-INVARIANT FEATURE TRANSFORMS (SIFT)
2. Keypoint Localization:
3. Orientation Assignment:
To achieve rotation invariance, each keypoint is assigned a dominant
orientation based on local image gradient directions.
4. Descriptor Generation:
The resulting histograms across all bins are concatenated to form the
keypoint descriptor, which encodes information about the local image
gradients in the keypoint's neighborhood.
5. Keypoint Matching:
Strengths of SIFT:
Scale and Rotation Invariance: SIFT's key advantage is its ability to
find and match features regardless of image scale or rotation.
Robustness: It's relatively robust to illumination changes and some
geometric distortions.
Limitations of SIFT:
Computational Cost: Compared to HOG, SIFT is computationally
more expensive due to the scale-space representation and descriptor
generation.
Sensitivity to Complex Deformations: While robust to some
distortions, SIFT might struggle with highly deformed objects.
Overall, SIFT is a cornerstone technique in computer vision for feature
matching and object recognition. Its ability to handle scale and rotation
variations makes it valuable in tasks like image stitching, 3D modeling,
and robot navigation.
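As a hedged sketch, SIFT keypoints and descriptors can be computed and matched with OpenCV (SIFT is included in opencv-python 4.4 and later); the image file names below are hypothetical placeholders.

import cv2

img1 = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical image file names
img2 = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()                                  # SIFT detector + descriptor
kp1, des1 = sift.detectAndCompute(img1, None)             # keypoints and 128-D descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with Lowe's ratio test to keep distinctive matches
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(len(good), "good matches")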
7.9 HAAR-LIKE FEATURES
1. Definition:
Four-rectangle Features: These features divide a rectangular region
into four smaller rectangles and compute the difference between the
sums of pixel intensities in the two diagonally opposite pairs of
rectangles.
3. Integral Image:
Each pixel in the integral image contains the sum of all pixel
intensities above and to the left of it in the original image.
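A minimal NumPy sketch of the integral image and of constant-time rectangle sums follows; the tiny 4×4 image and the rect_sum helper are illustrative inventions, not part of any library.

import numpy as np

image = np.arange(16, dtype=np.int64).reshape(4, 4)     # tiny placeholder image
integral = image.cumsum(axis=0).cumsum(axis=1)          # sum of pixels above and to the left

# Sum of any rectangle (r0..r1, c0..c1) in constant time using four lookups
def rect_sum(ii, r0, c0, r1, c1):
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

print(rect_sum(integral, 1, 1, 2, 2))                   # equals image[1:3, 1:3].sum()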
4. Feature Evaluation:
5. Object Detection:
Regions that pass all stages of the cascade are considered positive
detections and are further refined using additional techniques.
Advantages of Haar-like features:
Simplicity: They are easy to compute and understand, making them
efficient for real-time applications.
Effectiveness: They can effectively capture basic geometric properties
relevant for object detection, particularly for objects with well-defined
shapes like faces.
7.10 SUMMARY
In this chapter, we embark on a journey into the heart of image feature
extraction and description, exploring the diverse landscape of
methodologies and algorithms that enable machines to discern and
interpret visual information. From fundamental concepts to advanced
techniques, each section sheds light on a specific aspect of this
multifaceted domain, offering insights and practical knowledge to readers
keen on mastering the art of image analysis.
At the boundary between raw pixel data and meaningful visual
representations lies the crucial step of boundary processing. This chapter
also discussed the intricacies of extracting features from image
boundaries, unraveling the underlying principles and methodologies that
guide this process. Additionally, it explored the role of feature descriptors
in capturing the salient characteristics of image boundaries, paving the
way for robust and efficient feature representation. We also studied the
principles of PCA and its applications in extracting informative features
from high-dimensional data.
Corner detection lies at the heart of many computer vision applications,
enabling the identification of salient keypoints in images. We explored the
Harris Corner Detector, a seminal algorithm that revolutionized the field
of feature detection. From its mathematical foundations to practical
implementation strategies, readers are equipped with the knowledge to
harness the power of corner detection in various visual tasks. We
discussed how the Histogram of Oriented Gradients (HOG) emerges as a
powerful technique for capturing local image structure and texture along
with a comprehensive overview of SIFT, from its underlying principles to
practical implementation strategies. Through intuitive explanations and
illustrative examples, readers discover the versatility of Haar-like features
and their applications across diverse domains.
By delving into the depths of image feature extraction and description, this
chapter equips readers with the knowledge and tools to unravel the
mysteries of visual data, paving the way for groundbreaking advancements
in the field of computer vision. Whether you're a seasoned practitioner or
an aspiring enthusiast, this chapter serves as a guiding beacon in your
quest to master the art of image analysis.
8
IMAGE SEGMENTATION
Unit Structure :
8.0 Objectives
8.1 Introduction
8.2 Hough Transform for detecting lines and circles
8.3 Thresholding and Otsu’s segmentation
8.4 Edge-based/region-based segmentation
8.5 Region growing
8.6 Region splitting and Merging
8.7 Watershed algorithm
8.8 Active Contours
8.9 Morphological snakes
8.10 GrabCut algorithms
8.11 Summary
8.12 List of References
8.13 Unit End Exercises
8.0 OBJECTIVES
To get insights into image segmentation
To understand different techniques and algorithms associated with
segmentation
8.1 INTRODUCTION
Image segmentation is a fundamental task in computer vision that involves
partitioning an image into multiple segments or regions based on certain
characteristics such as color, intensity, texture, or semantic meaning. The
goal of image segmentation is to simplify and/or change the representation
of an image into something more meaningful and easier to analyze.
Objective: The primary objective of image segmentation is to simplify the
representation of an image into more meaningful and easy-to-analyze
parts. It aims to partition an image into distinct regions or objects based on
certain features or criteria.
Types of Segmentation:
Semantic Segmentation : This type of segmentation assigns a class
label to each pixel in the image, effectively dividing the image into
regions corresponding to different objects or regions of interest.
Region Growing : This method starts with seed points and grows
regions by adding neighboring pixels that meet certain similarity
criteria.
Challenges:
Applications:
In essence, image segmentation is a crucial preprocessing step in many
computer vision tasks, enabling the extraction of meaningful information
from images for further analysis and decision-making.
8.2 HOUGH TRANSFORM FOR DETECTING LINES AND CIRCLES
In the Hough Transform for lines, each pixel in the image space
corresponds to a parameter space representation. Instead of
representing lines as slopes and intercepts (as in Cartesian space), lines
are represented as points in a parameter space known as the Hough
space.
2. Voting :
For each edge pixel in the binary image (typically obtained through
edge detection techniques like Canny edge detector), we compute the
possible lines that could pass through that pixel in parameter space.
Each edge pixel votes for the possible lines it could belong to in the
Hough space. This is done by incrementing the corresponding cells in
the Hough space for each possible line.
3. Accumulator Thresholding:
After all edge pixels have voted, we examine the Hough space to
identify cells with high vote counts. These cells correspond to lines in
the original image.
4. Line Extraction:
After thresholding, lines are extracted from the parameter space
representation by mapping back to the Cartesian space using the
parameters represented by the selected cells.
2. Voting:
Similar to line detection, for each edge pixel in the binary image, we
compute the possible circles that could pass through that pixel in
parameter space.
Each edge pixel votes for the possible circles it could belong to in the
Hough space by incrementing the corresponding cells in the
accumulator array.
3. Accumulator Thresholding:
After all edge pixels have voted, we examine the accumulator array to
identify cells with high vote counts, indicating potential circle
detections.
4. Circle Extraction:
Summary
It operates by transforming the spatial domain representation of the
image into a parameter space where geometric shapes are represented
explicitly.
Voting and accumulator thresholding are key steps in both line and
circle detection variants of the Hough Transform.
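A minimal sketch of Hough line detection with scikit-image follows; the Canny smoothing parameter, the angular resolution, and the test image are illustrative assumptions.

import numpy as np
from skimage import data
from skimage.feature import canny
from skimage.transform import hough_line, hough_line_peaks

image = data.camera()                                    # sample grayscale image
edges = canny(image, sigma=2)                            # binary edge map: these pixels vote

# Accumulate votes over the (angle, distance) parameter space
angles = np.linspace(-np.pi / 2, np.pi / 2, 360, endpoint=False)
accumulator, thetas, rhos = hough_line(edges, theta=angles)

# Accumulator thresholding / peak picking gives the detected lines
for _, angle, dist in zip(*hough_line_peaks(accumulator, thetas, rhos)):
    print(f"line at angle {np.degrees(angle):.1f} deg, distance {dist:.1f} px")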
8.3 THRESHOLDING AND OTSU'S SEGMENTATION
Thresholding:
1. Simple Thresholding:
2. Adaptive Thresholding:
Adaptive thresholding adjusts the threshold value for each pixel based
on the local neighborhood of that pixel. This is useful when the
illumination varies across the image.
3. Otsu's Thresholding:
Otsu's method works by iterating through all possible threshold values
and selecting the one that minimizes the intra-class variance (variance
within each class) or maximizes the inter-class variance (variance
between classes).
Applications:
Otsu’s Segmentation:
Steps:
Iterate through all possible threshold values and calculate the between-
class variance for each threshold.
Apply the selected threshold to the image to obtain the binary result.
Otsu's method, proposed by Nobuyuki Otsu, is used to obtain an optimal
threshold. It is a variance-based method that evaluates the least weighted
variance between the foreground and background pixels. The essential
idea is to measure the distribution of background and foreground pixels
while iterating over all potential threshold values, and then to locate the
threshold at which the dispersion is smallest [C. Huang et al., 37].
Algorithm
The algorithm repeatedly evaluates candidate thresholds and selects the
one that minimizes the within-class variance, i.e., the weighted sum of
the spreads of the two classes. Grayscale intensities typically range from
0 to 255 (0 to 1 for floating-point images).
The following equation is used to calculate the weighted within-class
variance at threshold t:
σ²(t) = ωbg(t) · σ²bg(t) + ωfg(t) · σ²fg(t)
where ωbg(t) and ωfg(t) are the probabilities (weights) of the background
and foreground pixels at threshold t, and σ²bg(t), σ²fg(t) are the variances
of their intensity values.
Let,
Pall : total pixel count,
PBG(t) and PFG(t) : background and foreground pixel counts at threshold t.
The weights are then given by
ωbg(t) = PBG(t) / Pall and ωfg(t) = PFG(t) / Pall,
and the variance of each class is computed as
σ² = Σi (xi − x̄)² / N
where,
xi and x̄ : pixel value and the mean of pixel values at index i in the group (bg or fg)
N : number of pixels in that group.
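A minimal sketch of the threshold search described above follows (scikit-image's threshold_otsu performs an equivalent computation); the random test image is a placeholder.

import numpy as np

def otsu_threshold(gray):
    best_t, best_var = 0, np.inf
    for t in range(1, 256):
        bg, fg = gray[gray < t], gray[gray >= t]
        if bg.size == 0 or fg.size == 0:
            continue
        w_bg = bg.size / gray.size                      # background weight  w_bg(t)
        w_fg = fg.size / gray.size                      # foreground weight  w_fg(t)
        within = w_bg * bg.var() + w_fg * fg.var()      # weighted within-class variance
        if within < best_var:
            best_t, best_var = t, within
    return best_t

gray = np.random.randint(0, 256, (64, 64))              # placeholder grayscale image
print(otsu_threshold(gray))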
Advantages:
Limitations:
8.4 EDGE-BASED/REGION-BASED SEGMENTATION
Edge-based Segmentation:
Edge-based segmentation relies on detecting significant changes in
intensity or color, which often correspond to object boundaries or edges in
the image.
1. Edge Detection:
2. Gradient Magnitude:
The gradient magnitude at pixel (x, y) is computed as
G(x, y) = sqrt(Gx(x, y)² + Gy(x, y)²)
where Gx(x, y) and Gy(x, y) are the horizontal and vertical components
of the gradient, respectively.
3. Thresholding:
Region-based Segmentation:
Region-based segmentation aims to partition an image into regions or
objects based on certain criteria such as color similarity, texture, or
intensity homogeneity.
1. Region Growing:
Region growing algorithms start with seed points and iteratively grow
regions by adding neighboring pixels that meet certain similarity
criteria.
Let Ri represent a region in the image, and I(x, y) be the intensity of the
pixel at coordinates (x, y) within the region. The region homogeneity
criterion can be expressed as:
|I(x, y) − μi| ≤ T
where μi is the mean intensity of region Ri and T is a predefined
similarity threshold.
Given a seed point (xs, ys), the region growing algorithm iteratively
adds neighboring pixels (x, y) to the region if they satisfy the
homogeneity criterion.
In practice, a combination of these techniques is employed for more
robust segmentation results.
8.5 REGION GROWING
1. Seed Selection:
Select seed points within the image. These seed points can be chosen
manually or automatically based on certain criteria.
2. Homogeneity Criterion:
Let I(x, y) represent the intensity (or color) of the pixel at coordinates
(x, y).
For each seed point (xs, ys) in S, initialize a region R containing only
the seed point.
Continue this process until no more pixels can be added to the region.
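A minimal sketch of region growing follows; as a simplification, the homogeneity test compares each candidate pixel against the seed intensity rather than the running region mean, and the seed location and threshold T are arbitrary choices.

import numpy as np
from collections import deque

def region_grow(image, seed, T=10):
    h, w = image.shape
    region = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    region[seed] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):      # 4-connected neighbours
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not region[nr, nc]:
                if abs(float(image[nr, nc]) - seed_val) <= T:  # homogeneity criterion
                    region[nr, nc] = True
                    queue.append((nr, nc))
    return region

img = np.random.randint(0, 256, (64, 64))
mask = region_grow(img, seed=(32, 32), T=15)
print(mask.sum(), "pixels in the grown region")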
Mathematical Equations:
1. Homogeneity Criterion:
|I(x, y) − μR| ≤ T
where μR is the mean intensity of the growing region R and T is the
similarity threshold.
2. Region Update:
If a neighboring pixel (x, y) satisfies the criterion, it is added to the region:
R = R ∪ {(x, y)}
Advantages:
Limitations:
8.6 REGION SPLITTING AND MERGING
Region splitting and merging is another important technique for image
segmentation, particularly useful when dealing with complex images
containing objects with varying characteristics or cluttered backgrounds.
This method involves dividing regions into smaller segments based on
certain criteria (splitting) and then merging adjacent segments that satisfy
specific similarity conditions (merging). Here's a detailed explanation of
region splitting and merging along with mathematical equations:
Region Splitting:
1. Homogeneity Criterion for Splitting:
2. Splitting Process:
For each region that satisfies the splitting criterion, divide it into
smaller segments.
Region Merging:
1. Homogeneity Criterion for Merging:
Common criteria include comparing the mean intensity (or color) of
neighboring regions and checking for color or intensity similarity.
For example, two regions may be merged if their mean intensities are
similar or if their color histograms exhibit significant overlap.
2. Merging Process:
For each pair of adjacent regions that satisfy the merging criterion,
merge them into a single region.
Let R1 and R2 represent two adjacent regions in the image, with mean
intensities μ1 and μ2, respectively. The merging criterion can then be
expressed as |μ1 − μ2| ≤ T, where T is a similarity threshold.
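A toy sketch of this mean-intensity merging test follows; the two rectangular regions and the threshold T are invented purely for illustration.

import numpy as np

def should_merge(image, mask1, mask2, T=12):
    mu1 = image[mask1].mean()                            # mean intensity of region R1
    mu2 = image[mask2].mean()                            # mean intensity of region R2
    return abs(mu1 - mu2) <= T                           # merge when |mu1 - mu2| <= T

img = np.random.randint(0, 256, (32, 32))
r1 = np.zeros_like(img, dtype=bool); r1[:, :16] = True   # hypothetical region R1 (left half)
r2 = np.zeros_like(img, dtype=bool); r2[:, 16:] = True   # hypothetical region R2 (right half)
print(should_merge(img, r1, r2))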
Advantages:
Region splitting and merging can handle complex images with varying
characteristics.
Limitations:
8.7 WATERSHED ALGORITHM
The Watershed algorithm is a powerful method for image segmentation,
particularly in scenarios where objects of interest are touching or
overlapping. It views the grayscale image as a topographic surface, where
pixel intensities represent elevations, and the goal is to partition the
surface into catchment basins (regions) corresponding to distinct objects.
Here's a detailed explanation of the Watershed algorithm along with
mathematical equations and a diagram:
Watershed Algorithm:
1. Gradient Computation:
2. Marker Selection:
4. Boundary Identification:
Mathematical Equations:
1. Gradient Magnitude Calculation:
Let I(x, y) represent the intensity of the pixel at coordinates (x, y), and
G(x, y) denote the gradient magnitude:
G(x, y) = ∣∇I(x,y)∣
where |∇I(x, y)| represents the magnitude of the gradient vector ∇I(x, y).
2. Marker Selection:
Markers can be manually defined or obtained automatically. Let M
represent the set of markers:
M ={m1, m2, ..., mn}
Each marker mi can be represented as a point in the image domain (xi, yi)
along with an associated label.
4. Boundary Identification:
Boundaries between adjacent regions are formed where water from
different markers meets during the flooding process.
These boundaries can be represented as watershed lines or contours.
Diagram (not reproduced): Step 1 shows the gradient magnitude of the
image viewed as a topographic surface; Step 3 depicts the flood-fill
simulation, in which labels propagate outward from the markers until
neighbouring regions meet.
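A minimal sketch of marker-based watershed segmentation with scikit-image follows; using distance-transform maxima as markers and the coins test image are example choices, not the only way to select markers.

import numpy as np
from scipy import ndimage
from skimage import data, filters
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

image = data.coins()                                     # sample image of touching objects
binary = image > filters.threshold_otsu(image)           # rough foreground mask

# Distance transform acts as the topographic surface to flood
distance = ndimage.distance_transform_edt(binary)

# Marker selection: one marker per local maximum of the distance map
coords = peak_local_max(distance, labels=binary, min_distance=15)
markers = np.zeros(image.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)

# Flooding: labels propagate from the markers; boundaries form where they meet
labels = watershed(-distance, markers, mask=binary)
print(labels.max(), "regions found")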
Advantages:
Limitations:
8.8 ACTIVE CONTOURS
1. Initialization:
Initialize a curve (or contour) within the image domain, typically close
to the boundary of the object to be segmented.
2. Energy Minimization:
3. Curve Evolution:
4. Convergence:
Iterate until the curve converges to the desired object boundary or until
a termination criterion is met (e.g., maximum number of iterations,
negligible change in energy).
Mathematical Equations:
1. Energy Functional:
E = Einternal + α · Eexternal
where α is a weighting parameter controlling the influence of the external
energy.
2. Internal Energy:
3. External Energy:
4. Curve Evolution:
Advantages:
Limitations:
The algorithm may require significant computational resources,
especially for large images or complex objects.
In summary, Active Contours are versatile techniques for image
segmentation, widely used in medical imaging, remote sensing, and
computer vision applications. By minimizing an energy functional that
balances curve smoothness and image compatibility, Active Contours can
accurately delineate object boundaries and provide precise segmentation
results.
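A minimal sketch of an active contour (snake) using scikit-image follows; the circular initial contour and the weights alpha, beta, and gamma are illustrative values only.

import numpy as np
from skimage import data, filters
from skimage.color import rgb2gray
from skimage.segmentation import active_contour

image = rgb2gray(data.astronaut())                       # sample grayscale test image
smoothed = filters.gaussian(image, sigma=3)              # smoothing stabilizes the external energy

# Initial contour: a circle placed roughly around the object of interest
theta = np.linspace(0, 2 * np.pi, 200)
init = np.column_stack([100 + 70 * np.sin(theta),        # row coordinates
                        220 + 70 * np.cos(theta)])       # column coordinates

# Evolve the snake by minimizing internal (alpha, beta) plus external energy
snake = active_contour(smoothed, init, alpha=0.015, beta=10, gamma=0.001)
print(snake.shape)                                       # evolved contour points (row, col)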
8.9 MORPHOLOGICAL SNAKES
1. Initialization:
Initialize a curve (or contour) within the image domain, typically close
to the boundary of the object to be segmented.
2. Energy Minimization:
3. Curve Evolution:
Mathematical Equations:
1. Energy Functional:
2. Internal Energy:
3. External Energy:
4. Morphological Energy:
Advantages:
They can handle objects with irregular shapes and varying contrast
levels.
Limitations:
8.10 GRABCUT ALGORITHM
GrabCut Algorithm:
1. Initialization:
3. Graph Construction:
Construct a graph where each pixel in the image is a node, and the
edges between nodes represent the pairwise relationships between
pixels.
The smoothness term penalizes abrupt changes in the segmentation,
promoting smooth object boundaries.
6. Convergence:
Mathematical Equations:
Advantages:
It does not require extensive user input and can adapt to complex
object shapes and backgrounds.
The algorithm provides accurate segmentation results, particularly for
images with well-defined object boundaries.
Limitations:
The algorithm may not perform well on images with low contrast or
ambiguous object boundaries.
In summary, the GrabCut algorithm offers a versatile and effective
approach to foreground object segmentation in images, combining
Gaussian mixture modeling with graph cuts to achieve accurate and
efficient results. With its interactive nature and robust performance,
GrabCut has become a popular choice for various computer vision and
image editing applications.
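A minimal sketch of GrabCut with OpenCV follows; the image path, the initial rectangle around the object, and the iteration count are hypothetical placeholders.

import numpy as np
import cv2

image = cv2.imread("photo.jpg")                          # hypothetical input image (BGR)
mask = np.zeros(image.shape[:2], np.uint8)               # per-pixel GrabCut labels
bgd_model = np.zeros((1, 65), np.float64)                # internal background GMM state
fgd_model = np.zeros((1, 65), np.float64)                # internal foreground GMM state

rect = (50, 50, 300, 400)                                # assumed (x, y, w, h) box around the object
cv2.grabCut(image, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Keep pixels labelled definite or probable foreground
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)
segmented = image * fg[:, :, None]
cv2.imwrite("segmented.jpg", segmented)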
8.11 SUMMARY
We have seen that the main goal of image segmentation is to simplify the
representation of an image into more meaningful and easy-to-analyze
parts. It aims to partition an image into distinct regions or objects based
on certain features or criteria. We have also observed that the Hough
Transform provides a robust method for detecting lines and circles in
images, even in the presence of noise and occlusion. It operates by
transforming the spatial domain representation of the image into a
parameter space where geometric shapes are represented explicitly.
Thresholding is a fundamental image processing technique used for image
segmentation, and Otsu's method provides an automatic way to select an
optimal threshold value based on the image's intensity distribution.
We also explored region splitting and merging, which is a flexible
technique for image segmentation, suitable for a wide range of
applications such as medical imaging, remote sensing, and scene analysis.
By iteratively dividing and merging regions based on predefined criteria,
this method can effectively partition images into meaningful segments or
objects.