IMAGE Processing Lab Manual
MCA
(TWO YEARS PATTERN)
SEMESTER - II (CBCS)
Experiment I
1. Image Enhancement
Experiment II
2. Discrete Fourier Transform
Experiment III
3. Discrete Cosine Transform
Experiment IV
4. Image Segmentation and Image Restoration
Experiment V
5. Image Data Compression
Experiment VI
6. Morphological Operation
Experiment – I
1. IMAGE ENHANCEMENT
Spatial Domain and Frequency Domain Techniques
1.1.0 Objectives
The aim of image enhancement is to improve the interpretability or
perception of information in images for human viewers, or to provide
better input for other automated image processing techniques. Image
enhancement techniques can be divided into two broad categories:
1. Spatial domain methods, which operate directly on pixels, and
2. Frequency domain methods, which operate on the Fourier transform of
an image.
R(x,y) = Σ_{i=−d}^{d} Σ_{j=−d}^{d} w(i,j) · f(x+i, y+j)
where (2d+1)×(2d+1) is the mask size, the w(i,j)'s are the weights of the
mask, f(x,y) is the input pixel at coordinates (x,y), and R(x,y) is the output
value at (x,y).
If the center of the mask is at location (x,y) in the image, the gray level of
the pixel located at (x,y) is replaced by R, the mask is then moved to the
next location in the image and the process is repeated. This continues until
all pixel locations have been covered.
a. Low pass filtering:
● The key requirement is that all coefficients are positive.
● Neighborhood averaging is a special case of LPF where all
coefficients are equal.
● It blurs edges and other sharp details in the image.
b. Median filtering:
If the objective is to achieve noise reduction instead of blurring, this
method should be used. This method is particularly effective when the
noise pattern consists of strong, spike-like components and the
characteristic to be preserved is edge sharpness. It is a nonlinear operation.
For each input pixel f(x,y), we sort the values of the pixel and its
neighbors to determine their median and assign its value to the output
pixel g(x,y).
c. Basic high pass spatial filter:
The shape of the impulse response needed to implement a high pass spatial
filter indicates that the filter should have positive coefficients near its
center, and negative coefficients in the outer periphery.
Example: the filter mask of a 3x3 sharpening filter (the 9-at-center mask
used later in this experiment):
-1 -1 -1
-1  9 -1
-1 -1 -1
Experiment 01
Aim :- Program for image enhancement (Smoothing & Sharpening )
using spatial domain filters.
Objective :-
The purpose of this assignment is to study image filtering in the spatial
domain. Spatial filtering is performed by convolving the image with a
mask or a kernel. Spatial filters include sharpening, smoothing, edge
detection, noise removal, etc. This experiment consists of two parts: the
first discusses the spatial filtering of an image using a spatial mask (3x3
or 5x5), which is then used as a blurring filter; the second studies the
order statistics filters, especially the median filter.
Part I Smoothing spatial filter: The output of a smoothing spatial filter is
simply the average of the pixels contained in the neighborhood of the filter
mask. These filters are sometimes called averaging filters or low pass
filters. Two types of masks (3x3 and 5x5) are used for the spatial filter.
Steps :-
1) Read input Image.
2) Add noise using the “imnoise()” function.
3) Define a (3 x 3) filter.
4) Use the convolution function conv2() for filtering.
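The steps above map onto MATLAB's imnoise() and conv2(). As a
minimal sketch of the same pipeline in Python (assuming SciPy and
scikit-image are installed; the built-in 'camera' test image stands in for
the lab's input file):

import numpy as np
from scipy.signal import convolve2d
from skimage import data, util

img = data.camera().astype(float) / 255.0        # 1) read an input image
noisy = util.random_noise(img, mode='s&p')       # 2) add salt & pepper noise
mask = np.ones((3, 3)) / 9.0                     # 3) define a (3 x 3) averaging filter
smoothed = convolve2d(noisy, mask, mode='same')  # 4) convolve to filter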
• Order statistics filters are nonlinear spatial filters whose response is
based on ordering (ranking) the pixels contained in an area covered by the
filter
• The best known example in this category is the median filter
• Median filter - Median filters replace the value of the pixel by the
median of the gray levels in the neighborhood of that pixel
Part II Sharpening spatial filter:
Spatial domain sharpening filters are also called High Pass Filters.
Laplacian Filters
Gradient Filter
Description:
h = fspecial(type) creates a two-dimensional filter h of the specified type.
fspecial returns h as a correlation kernel.

Value        Description
'average'    Averaging filter
'gaussian'   Gaussian lowpass filter
h = fspecial(type, parameters)
accepts the filter specified by type plus additional modifying parameters
particular to the type of filter chosen. If you omit these arguments, fspecial
uses default values for the parameters. The following list shows the
syntax for each filter type; where applicable, defaults are noted.
h = fspecial('average', hsize)
returns an averaging filter h of size hsize. The argument hsize can be a
vector specifying the number of rows and columns in h, or it can be a
scalar, in which case h is a square matrix. The default value for hsize is
[3 3].
h = fspecial('gaussian', hsize, sigma)
returns a rotationally symmetric Gaussian lowpass filter of size hsize with
standard deviation sigma (positive). hsize can be a vector specifying the
number of rows and columns in h, or it can be a scalar, in which case h is a
square matrix. The default value for hsize is [3 3]; the default value for
sigma is 0.5.
h = fspecial('laplacian', alpha)
returns a 3-by-3 filter approximating the shape of the two-dimensional
Laplacian operator. The parameter alpha controls the shape of the
Laplacian and must be in the range 0.0 to 1.0. The default value for alpha
is 0.2.
h = fspecial('log', hsize, sigma)
returns a rotationally symmetric Laplacian of Gaussian filter of size hsize
with standard deviation sigma (positive). hsize can be a vector specifying
the number of rows and columns in h, or it can be a scalar, in which case h
is a square matrix. The default value for hsize is [5 5] and 0.5 for sigma.
h = fspecial('prewitt')
returns the 3-by-3 Prewitt filter that emphasizes horizontal edges by
approximating a vertical gradient.
h = fspecial('sobel')
returns the 3-by-3 Sobel filter that emphasizes horizontal edges using the
smoothing effect of approximating a vertical gradient.
Median Filter :- Median filters replace the value of the pixel by the
median of the gray levels in the neighborhood of that pixel
1) Open/read an image into a matrix.
2) Create a 3x3 matrix B called a mask.
3) Read the first 3x3 pixel grid of the input image into B.
4) Sort the matrix B in ascending order.
5) Select the middle value and put that as the first pixel value in the
output image matrix.
6) Repeat the procedure for the entire input image by reading the next 3x3
values from the input image and sorting using mask B. In this way the
output image values are calculated.
7) Display the input image and Output Image.
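A minimal Python sketch of steps 1-7 (the 2-D grayscale input array and
the simple double loop are illustrative assumptions, not the lab's required
MATLAB solution):

import numpy as np

def median_filter_3x3(img):
    # pad the border so every pixel has a full 3 x 3 neighborhood
    padded = np.pad(img, 1, mode='edge')
    out = np.empty_like(img)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            block = padded[i:i + 3, j:j + 3]   # step 3: read the 3 x 3 grid
            out[i, j] = np.median(block)       # steps 4-5: sort, take middle
    return out                                 # steps 6-7: output image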
subplot(2,2,4),
imshow(uint8(c2)),
title('5 x 5 Averaging filter'),
% ..................
figure,
subplot(2,2,1),
imshow(a),
title('Original Image'),
subplot(2,2,2),
imshow(d),
title('Speckle noise'),
subplot(2,2,3),
imshow(uint8(d1)),
title('3 x 3 Averaging filter'),
subplot(2,2,4),
imshow(uint8(d2)),
title('5 x 5 Averaging filter'),
figure,
subplot(3,2,1),
imshow(a),
title('Original image')
subplot(3,2,2),
imshow(b),
title('Salt & pepper noise')
subplot(3,2,3),
imshow(uint8(c1)),
title('3 x 3 smoothing')
subplot(3,2,4),
imshow(uint8(c2)),
title('5 x 5 smoothing')
subplot(3,2,5),
imshow(uint8(c3)),
title('3x 3 Median filter')
subplot(3,2,6),
imshow(uint8(c4)),
title('5 x 5 Median filter')
% this program is for sharpening spatial domain filter
a=imread('D:\horse.jpg');
a=rgb2gray(a); % convert to grayscale if the image is RGB
%Defining the laplacian filters
h1=[0 -1 0;-1 4 -1;0 -1 0];
h2=[-1 -1 -1;-1 8 -1; -1 -1 -1];
h3=[-1 -1 -1;-1 9 -1; -1 -1 -1];
c1=conv2(double(a),h1,'same');
c2=conv2(double(a),h2,'same');
c3=conv2(double(a),h3,'same');
subplot(2,2,1),imshow(a),
title('Original image')
subplot(2,2,2),
imshow(uint8(c1)),
title('Laplacian sharpening 4 at center')
subplot(2,2,3),imshow(uint8(c2)),
title('Laplacian sharpening 8 at center ')
subplot(2,2,4),
imshow(uint8(c3)),
title(' Laplacian sharpening 9 at center')
%Averaging Filter
A=ones(200,200);
A(30:60,30:60)=0;
A(70:150,50:170)=0;
figure(1)
subplot(1,2,1)
imshow(A)
AM=1/9.*[1 1 1;1 1 1;1 1 1];
B=conv2(A,AM);
subplot(1,2,2)
imshow(B)
Experiment 02
AIM: To Implement smoothing or averaging filters in spatial domain.
OBJECTIVE: To Implement smoothing or averaging filters in spatial
domain.
TOOLS REQUIRED: MATLAB
THEORY:
Filtering is a technique for modifying or enhancing an image. Masks or
filters will be defined. The general process of convolution and correlation
will be introduced via an example. Also smoothing linear filters such as
box and weighted average filters will be introduced. In statistics and image
processing, to smooth a data set is to create an approximating function that
attempts to capture important patterns in the data, while leaving out noise
or other fine-scale structures/rapid phenomena. In smoothing, the data
points of a signal are modified so that individual points higher than their
neighbors (presumably because of noise) are reduced, and points that are
lower than the adjacent points are increased, leading to a smoother signal.
Smoothing can aid data analysis in two important ways: by extracting more
information from the data, as long as the assumption of smoothing is
reasonable, and by providing analyses that are both flexible and robust.
Different algorithms are used in smoothing.
% Program for implementation of smoothing or averaging filter in
% spatial domain
I=imread('trees.tif');
subplot(2,2,1);
imshow(I);
title('original image');
f=ones(3,3)/9;
h=imfilter(I,f,'circular');
subplot(2,2,2);
imshow(h);
title('averaged image');
Result:
The concept behind the Fourier transform is that any waveform can be
constructed using a sum of sine and cosine waves of different frequencies.
The discrete Fourier transform of an M x N image f(x,y) is:

F(u,v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x,y) · e^{−i2π(ux/M + vy/N)}

The exponential in the above formula can be expanded into sines and
cosines, with the variables u and v determining these frequencies.
The inverse of the above discrete Fourier transform is given by the
following equation:

f(x,y) = (1/MN) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} F(u,v) · e^{i2π(ux/M + vy/N)}
● The values of the Fourier transform are complex, meaning they have
real and imaginary parts. The imaginary parts are represented by i,
which is defined solely by the property that its square is −1, i.e.:
i² = −1
● The fast Fourier transform (FFT) is a fast algorithm for computing the
discrete Fourier transform.
● MATLAB has three functions to compute the DFT:
1. fft -for one dimension (useful for audio)
2. fft2 -for two dimensions (useful for images)
3. fftn -for n dimensions
● MATLAB has three related functions that compute the inverse DFT:
1. ifft
2. ifft2
3. ifftn
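While the lab uses MATLAB, the same functions exist in NumPy
(np.fft.fft, np.fft.fft2, np.fft.fftn and their inverses). A minimal
round-trip sketch, with a random array standing in for an image:

import numpy as np

img = np.random.rand(64, 64)    # stand-in for a grayscale image
F = np.fft.fft2(img)            # forward 2-D DFT
back = np.fft.ifft2(F).real     # inverse 2-D DFT
print(np.allclose(img, back))   # True: the transforms invert each other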
1.1.4 References
● Digital Image Processing, Using MATLAB, by Rafael C. Gonzalez,
Richard E. Woods, and Steven L. Eddins
● Image Processing Toolbox, For Use with MATLAB (MATLAB's
documentation)--available through MATLAB's help menu or
online at:
http://www.mathworks.com/access/helpdesk/help/toolbox/images/
● Frequency Domain Processing: www.cs.uregina.ca/Links/class-
info/425/Lab5/index.html
Experiment – II
2. DISCRETE FOURIER TRANSFORMATION
Aim: To find DFT/FFT forward and inverse transform of image.
Theory:
FFT: fast Fourier transform.
IFFT: Inverse fast Fourier transform.
Using the DFT, we can decompose the above signal into a series of
sinusoids, each with a different frequency. The following 3D figure
shows the idea behind the DFT: the above signal is actually the result of
the sum of 3 different sine waves. The time domain signal can be
transformed into a figure in the frequency domain called the DFT
amplitude spectrum, where the signal frequencies appear as vertical bars.
The height of a bar after normalization is the amplitude of the signal in
the time domain. You can see that the 3 vertical bars correspond to the 3
frequencies of the sine waves, which are also plotted in the figure.
In this section, we will learn how to use DFT to compute and plot the DFT
amplitude spectrum.
DFT
The DFT can transform a sequence of evenly spaced samples into
information about the frequencies of all the sine waves needed to sum
to the time domain signal. It is defined as:
X_k = Σ_{n=0}^{N−1} x_n · e^{−i2πkn/N} = Σ_{n=0}^{N−1} x_n [cos(2πkn/N) − i·sin(2πkn/N)]
where
● N = number of samples
● n = current sample
● k = current frequency, where k ∈ [0, N−1]
● x_n = the sine value at sample n
● X_k = the DFT output, which includes information on both amplitude
and phase
Also, the last expression in the above equation is derived from Euler's
formula, which links the trigonometric functions to the complex
exponential function: e^{ix} = cos(x) + i·sin(x)
Due to the nature of the transform, X_0 = Σ_{n=0}^{N−1} x_n.
If N is an odd number, the elements X_1, X_2, ..., X_{(N−1)/2} contain
the positive frequency terms and the elements X_{(N+1)/2}, ..., X_{N−1}
contain the negative frequency terms, in order of decreasingly negative
frequency. If N is even, the elements X_1, X_2, ..., X_{N/2−1} contain
the positive frequency terms, and the elements X_{N/2}, ..., X_{N−1}
contain the negative frequency terms, in order of decreasingly negative
frequency. In the case that our input signal x is a real-valued sequence,
the DFT output X_n for positive frequencies is the conjugate of the
values X_n for negative frequencies, and the spectrum will be symmetric.
Therefore, we usually only plot the DFT corresponding to the positive
frequencies.
Note that X_k is a complex number that encodes both the amplitude and
phase information of a complex sinusoidal component e^{i2πkn/N} of
the function x_n. The amplitude and phase of the signal can be
calculated as:

amp = |X_k| / N = sqrt(Re(X_k)² + Im(X_k)²) / N
phase = atan2(Im(X_k), Re(X_k))

where Im(X_k) and Re(X_k) are the imaginary and real parts of the
complex number, and atan2 is the two-argument form of the arctan
function.
The amplitudes returned by the DFT equal the amplitudes of the signals
fed into the DFT if we normalize by the number of sample points. Note
that doing this divides the power between the positive and negative
sides; if the input signal is a real-valued sequence, as described above,
the spectrum of the positive and negative frequencies will be symmetric.
Therefore, we will only look at one side of the DFT result, and instead of
dividing by N, we divide by N/2 to get the amplitude corresponding to
the time domain signal.
Now that we have the basic knowledge of DFT, let’s see how we can use
it.
TRY IT! Generate 3 sine waves with frequencies 1 Hz, 4 Hz, and 7 Hz,
amplitudes 3, 1 and 0.5, and phase all zeros. Add these 3 sine waves
together with a sampling rate of 100 Hz; you will see that the result is
the same signal we showed at the beginning of the section.
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('seaborn-poster')
%matplotlib inline
# sampling rate
sr = 100
# sampling interval
ts = 1.0/sr
t = np.arange(0,1,ts)
freq = 1.
x = 3*np.sin(2*np.pi*freq*t)
freq = 4
x += np.sin(2*np.pi*freq*t)
freq = 7
x += 0.5* np.sin(2*np.pi*freq*t)
plt.figure(figsize = (8, 6))
plt.plot(t, x, 'r')
plt.ylabel('Amplitude')
plt.show()
Output:
TRY IT! Write a function DFT(x) which takes in one argument, x - the input
1 dimensional real-valued signal. The function will calculate the DFT of
the signal and return the DFT values. Apply this function to the signal we
generated above and plot the result.
def DFT(x):
    """
    Function to calculate the
    discrete Fourier Transform
    of a 1D real-valued signal x
    """
    N = len(x)
    n = np.arange(N)
    k = n.reshape((N, 1))
    e = np.exp(-2j * np.pi * k * n / N)
    X = np.dot(e, x)
    return X
X = DFT(x)
# calculate the frequency
N = len(X)
n = np.arange(N)
T = N/sr
freq = n/T
plt.figure(figsize = (8, 6))
plt.stem(freq, abs(X), 'b', \
markerfmt=" ", basefmt="-b")
plt.xlabel('Freq (Hz)')
plt.ylabel('DFT Amplitude |X(freq)|')
plt.show()
Output:
We can see from here that the output of the DFT is symmetric at half of
the sampling rate (you can try different sampling rates to test). This half
of the sampling rate is called the Nyquist frequency or the folding
frequency; it is named after the electronic engineer Harry Nyquist. He
and Claude Shannon are the namesakes of the Nyquist-Shannon sampling
theorem, which states that a signal sampled at a given rate can be fully
reconstructed if it contains only frequency components below half that
sampling rate; thus the highest frequency output from the DFT is half
the sampling rate.
n_oneside = N//2
# get the one side frequency
f_oneside = freq[:n_oneside]
# normalize the amplitude
X_oneside =X[:n_oneside]/n_oneside
plt.figure(figsize = (12, 6))
plt.subplot(121)
plt.stem(f_oneside, abs(X_oneside), 'b', \
markerfmt=" ", basefmt="-b")
plt.xlabel('Freq (Hz)')
plt.ylabel('DFT Amplitude |X(freq)|')
plt.subplot(122)
plt.stem(f_oneside, abs(X_oneside), 'b', \
markerfmt=" ", basefmt="-b")
plt.xlabel('Freq (Hz)')
plt.xlim(0, 10)
plt.tight_layout()
plt.show()
Output:
Plotting the first half of the DFT results, we can see 3 clear peaks at
frequencies 1 Hz, 4 Hz, and 7 Hz, with amplitudes 3, 1, and 0.5, as
expected. This is how we can use the DFT to analyze an arbitrary signal
by decomposing it into simple sine waves.
The inverse DFT
Of course, we can do the inverse transform of the DFT easily.
x_n = (1/N) Σ_{k=0}^{N−1} X_k · e^{i2πkn/N}
We will leave this as an exercise for you to write a function.
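As a hint, here is one possible solution sketch that mirrors the DFT(x)
function above (the positive exponent and the 1/N factor are the only
changes):

def IDFT(X):
    """
    Inverse discrete Fourier transform of a 1D
    spectrum X (one possible solution to the
    exercise above)
    """
    N = len(X)
    n = np.arange(N)
    k = n.reshape((N, 1))
    e = np.exp(2j * np.pi * k * n / N)   # positive exponent
    return np.dot(e, X) / N              # 1/N normalization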
The limit of DFT
The main issue with the above DFT implementation is that it is not
efficient if we have a signal with many data points. It may take a long time
to compute the DFT if the signal is large.
TRY IT! Write a function to generate a simple signal with different
sampling rates, and see the difference in computing time by varying the
sampling rate.
def gen_sig(sr):
    '''
    function to generate
    a simple 1D signal with
    different sampling rate
    '''
    ts = 1.0/sr
    t = np.arange(0,1,ts)
    freq = 1.
    x = 3*np.sin(2*np.pi*freq*t)
    return x
# sampling rate =2000
sr = 2000
%timeit DFT(gen_sig(sr))
Output:
120 ms ± 8.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# sampling rate 20000
sr = 20000
%timeit DFT(gen_sig(sr))
Output:
15.9 s ± 1.51 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
Example 1 :
# import sympy
from sympy import fft
# sequence
seq = [15, 21, 13, 44]
# fft
transform = fft(seq)
print ("FFT : ", transform)
Output :
FFT : [93, 2 - 23*I, -37, 2 + 23*I]
Example 2 :
# import sympy
from sympy import fft
# sequence
seq = [15, 21, 13, 44]
decimal_point = 4
# fft
transform = fft(seq, decimal_point )
print ("FFT : ", transform)
Output :
FFT : [93, 2.0 - 23.0*I, -37, 2.0 + 23.0*I]
def FFT(x):
    """
    A recursive implementation of
    the 1D Cooley-Tukey FFT; the
    input should have a length that
    is a power of 2.
    """
    N = len(x)
    if N == 1:
        return x
    else:
        X_even = FFT(x[::2])
        X_odd = FFT(x[1::2])
        factor = \
            np.exp(-2j*np.pi*np.arange(N)/ N)
        X = np.concatenate(\
            [X_even+factor[:int(N/2)]*X_odd,
             X_even+factor[int(N/2):]*X_odd])
        return X
# sampling rate
sr = 128
# sampling interval
ts = 1.0/sr
t = np.arange(0,1,ts)
freq = 1.
x = 3*np.sin(2*np.pi*freq*t)
freq = 4
x += np.sin(2*np.pi*freq*t)
freq = 7
x += 0.5* np.sin(2*np.pi*freq*t)
plt.figure(figsize = (8, 6))
plt.plot(t, x, 'r')
plt.ylabel('Amplitude')
plt.show()
TRY IT! Use the FFT function to calculate the Fourier transform of the
above signal. Plot the amplitude spectrum for both the two-sided and
one-sided frequencies.
X=FFT(x)
# calculate the frequency
N = len(X)
n = np.arange(N)
T = N/sr
freq = n/T
plt.figure(figsize = (12, 6))
plt.subplot(121)
plt.stem(freq, abs(X), 'b', \
markerfmt=" ", basefmt="-b")
plt.xlabel('Freq (Hz)')
plt.ylabel('FFT Amplitude |X(freq)|')
# Get the one-sided spectrum
n_oneside = N//2
# get the one side frequency
f_oneside = freq[:n_oneside]
# normalize the amplitude
X_oneside =X[:n_oneside]/n_oneside
plt.subplot(122)
plt.stem(f_oneside, abs(X_oneside), 'b', \
markerfmt=" ", basefmt="-b")
plt.xlabel('Freq (Hz)')
plt.ylabel('Normalized FFT Amplitude |X(freq)|')
plt.tight_layout()
plt.show()
TRY IT! Generate a simple signal of length 2048, and time how long it
takes to run the FFT; compare the speed with the DFT.
def gen_sig(sr):
    '''
    function to generate
    a simple 1D signal with
    different sampling rate
    '''
    ts = 1.0/sr
    t = np.arange(0,1,ts)
    freq = 1.
    x = 3*np.sin(2*np.pi*freq*t)
    return x
# sampling rate =2048
sr = 2048
%timeit FFT(gen_sig(sr))
16.9 ms ± 1.3 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
We can see that, for a signal of length 2048 (about 2000), this
implementation of the FFT runs in about 16.9 ms, versus 120 ms for the
DFT implementation above at a comparable length.
Example 1:
# import sympy
from sympy import ifft
# sequence
seq = [15, 21, 13, 44]
# fft
transform = ifft(seq)
print ("Inverse FFT : ", transform)
Output:
Inverse FFT : [93/4, 1/2 + 23*I/4, -37/4, 1/2 - 23*I/4]
Example 2:
# import sympy
from sympy import ifft
# sequence
seq = [15, 21, 13, 44]
decimal_point = 4
# fft
transform = ifft(seq, decimal_point )
print ("Inverse FFT : ", transform)
Output:
Inverse FFT : [23.25, 0.5 + 5.75*I, -9.250, 0.5 - 5.75*I]
Experiment – III
3. DISCRETE COSINE TRANSFORM
Aim: To find DCT forward and inverse transform of image.
Theory:
DCT: Discrete cosine transform.
IDCT: Inverse discrete cosine transform.
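For reference, the orthonormal 1D DCT basis built in the code below can
be written as (a standard form; the notation here is ours, not the original
notebook's):

D_{u,x} = α_u · sqrt(2/N) · cos( π·u·(x + 0.5) / N ),  with α_0 = 1/sqrt(2) and α_u = 1 for u > 0

The forward transform of a signal f is then X = D·f, and since D is
orthonormal the inverse is simply f = Dᵀ·X.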
import numpy as np
from numpy import *
The outer products of the N=2 DCT basis vectors form four 2 x 2 basis
images; for example, the (1, 1) basis image is
[[ 0.5 -0.5]
 [-0.5  0.5]]
In [3]:
# The 8 x 8 DCT matrix thus looks like this.
N=8
dct = np.zeros((N, N))
for x in range(N):
    dct[0,x] = sqrt(2.0/N) / sqrt(2.0)
for u in range(1,N):
    for x in range(N):
        dct[u,x] = sqrt(2.0/N) * cos((pi/N) * u * (x + 0.5) )
np.set_printoptions(precision=3)
dct
Out[3]:
array([[ 0.354, 0.354, 0.354, 0.354, 0.354, 0.354, 0.354, 0.354],
[ 0.49 , 0.416, 0.278, 0.098, -0.098, -0.278, -0.416, -0.49 ],
[ 0.462, 0.191, -0.191, -0.462, -0.462, -0.191, 0.191, 0.462],
[ 0.416, -0.098, -0.49 , -0.278, 0.278, 0.49 , 0.098, -0.416],
[ 0.354, -0.354, -0.354, 0.354, 0.354, -0.354, -0.354, 0.354],
[ 0.278, -0.49 , 0.098, 0.416, -0.416, -0.098, 0.49 , -0.278],
[ 0.191, -0.462, 0.462, -0.191, -0.191, 0.462, -0.462, 0.191],
[ 0.098, -0.278, 0.416, -0.49 , 0.49 , -0.416, 0.278, -0.098]])
The corresponding eight 1D basis vectors (the matrix rows) oscillate with
successively higher spatial frequencies.
In [4]:
# Here's what they look like.
figure(figsize=(9,12))
for u in range(N):
    subplot(4, 2, u+1)
    ylim((-1, 1))
    title(str(u))
    plot(dct[u, :])
    plot(dct[u, :],'ro')
Like the N=2 case, the vectors are orthonormal. In other words, their dot
products are 0, and each has length 1. Here are a few illustrative products.
In [5]:
def rowdot(i,j):
    return dot(dct[i, :], dct[j, :])
rowdot(0,0), rowdot(3,3), rowdot(0,3), rowdot(1, 7), rowdot(1,5)
Out[5]:
(0.9999999999999998,
0.9999999999999998,
6.938893903907228e-17,
1.942890293094024e-16,
-2.498001805406602e-16)
This also implies the inverse of this matrix is just its transpose.
In [6]:
dct_transpose = dct.transpose()
dct_transpose
Out[6]:
array([[ 0.354, 0.49 , 0.462, 0.416, 0.354, 0.278, 0.191, 0.098],
[ 0.354, 0.416, 0.191, -0.098, -0.354, -0.49 , -0.462, -0.278],
[ 0.354, 0.278, -0.191, -0.49 , -0.354, 0.098, 0.462, 0.416],
[ 0.354, 0.098, -0.462, -0.278, 0.354, 0.416, -0.191, -0.49 ],
[ 0.354, -0.098, -0.462, 0.278, 0.354, -0.416, -0.191, 0.49 ],
[ 0.354, -0.278, -0.191, 0.49 , -0.354, -0.098, 0.462, -0.416],
[ 0.354, -0.416, 0.191, 0.098, -0.354, 0.49 , -0.462, 0.278],
[ 0.354, -0.49 , 0.462, -0.416, 0.354, -0.278, 0.191, -0.098]])
In [7]:
# Is the dot product of dct and its transpose the identity?
maybe_identity = dot(dct, dct_transpose)
# Since there are many nearly-zero values like 3.2334e-17 in this numerical result,
# the output will look much nicer if we round them all off to (say) 6 places.
roundoff = vectorize(lambda m: round(m, 6))
roundoff(maybe_identity)
Out[7]:
array([[ 1., 0., -0., 0., 0., 0., -0., -0.],
[ 0., 1., 0., -0., 0., -0., 0., 0.],
[-0., 0., 1., 0., -0., 0., 0., 0.],
[ 0., -0., 0., 1., 0., 0., -0., 0.],
[ 0., 0., -0., 0., 1., 0., -0., -0.],
[ 0., -0., 0., 0., 0., 1., 0., -0.],
[-0., 0., 0., -0., -0., 0., 1., 0.],
[-0., 0., 0., 0., -0., -0., 0., 1.]])
In [11]:
# The image itself contains 3 dimensions: rows, columns, and colors
img.shape
Out[11]:
(112, 112, 3)
All three of the R,G,B color values in the greyscale image are the same for
each pixel.
Let's just look at values from one tiny 8 x 8 block (which is what's used
in JPEG compression) near his nose.
(The next images use a false color spectrum to display pixel intensity.)
In [12]:
tiny = img[40:48, 40:48, 0]  # a tiny 8 x 8 block, in the color=0 (Red) channel
def show_image(img):
    plt.imshow(img)
    plt.colorbar()
show_image(tiny)
In [13]:
# And here are the numbers.
tiny
Out[13]:
array([[179, 140, 138, 101, 110, 135, 143, 144],
[ 76, 64, 91, 110, 113, 109, 104, 118],
[ 78, 68, 40, 34, 33, 66, 90, 105],
[209, 204, 168, 163, 132, 100, 73, 57],
[219, 231, 221, 227, 226, 205, 172, 130],
[215, 213, 217, 223, 232, 224, 217, 203],
[181, 202, 233, 214, 207, 226, 235, 235],
[ 69, 44, 62, 66, 83, 129, 153, 182]], dtype=uint8)
Now we define the 2D version of the N=8 DCT described above.
The trick is to apply the 1D DCT to every column, and then also apply it
to every row, i.e.
G = DCT · f · DCTᵀ
In [14]:
def doDCT(grid):
    return dot(dot(dct, grid), dct_transpose)
def undoDCT(grid):
    return dot(dot(dct_transpose, grid), dct)
# test : do DCT, then undo DCT; should get back the same image.
tiny_do_undo = undoDCT(doDCT(tiny))
show_image(tiny_do_undo)  # Yup, looks the same.
In [15]:
# And the numbers are the same.
tiny_do_undo
Out[15]:
array([[179., 140., 138., 101., 110., 135., 143., 144.],
[ 76., 64., 91., 110., 113., 109., 104., 118.],
[ 78., 68., 40., 34., 33., 66., 90., 105.],
[209., 204., 168., 163., 132., 100., 73., 57.],
[219., 231., 221., 227., 226., 205., 172., 130.],
[215., 213., 217., 223., 232., 224., 217., 203.],
[181., 202., 233., 214., 207., 226., 235., 235.],
[ 69., 44., 62., 66., 83., 129., 153., 182.]])
The DCT transform looks like this. Note that most of the intensity is at the
top left, in the lowest frequencies.
In [16]:
tinyDCT = doDCT(tiny)
show_image(tinyDCT)
In [17]:
set_printoptions(linewidth=100) # output line width (default is 75)
round6 = vectorize(lambda m: '{:6.1f}'.format(m))
round6(tinyDCT)
Out[17]:
array([['1173.9', ' 3.6', ' 19.8', ' 12.3', ' -5.4', ' 8.2', ' 10.3', ' -0.0'],
['-225.9', ' 64.1', ' 24.2', ' 12.2', ' 9.9', ' -0.2', ' 0.0', ' 0.1'],
['-122.7', '-161.8', ' 63.2', ' -15.0', ' 0.3', ' 11.1', ' 28.5', ' 10.7'],
[' 341.9', ' 50.8', ' -48.4', ' 12.0', ' -10.2', ' -0.4', ' 0.1', ' 12.1'],
[' -20.1', ' 80.2', ' 6.9', ' 22.1', ' 0.1', ' -0.1', ' -0.0', ' -0.3'],
[' 74.4', ' 69.9', ' 32.9', ' -13.0', ' -16.3', ' -0.4', ' -0.2', ' -0.0'],
['-100.6', ' -38.9', ' 64.3', ' 17.2', ' -0.3', ' 0.5', ' -0.2', ' -0.1'],
[' 13.8', ' -36.5', ' 18.5', ' -0.4', ' -21.6', ' 0.1', ' 0.3', ' 0.2']],
dtype='<U6')
In [19]:
# First make a copy to work on.
tinyDCT_chopped = tinyDCT.copy()
# Then zero the entries below the x + u = 8 anti-diagonal.
for x in range(N):
    for u in range(N):
        if x + u > 8:
            tinyDCT_chopped[x,u] = 0.0
show_image(tinyDCT_chopped)
In [20]:
round6(tinyDCT_chopped)
# Notice all the zeros at the bottom right - those are the chopped high frequencies.
# We've essentially done a "low pass filter" on the spatial frequencies.
Out[20]:
array([['1173.9', ' 3.6', ' 19.8', ' 12.3', ' -5.4', ' 8.2', ' 10.3', ' -0.0'],
['-225.9', ' 64.1', ' 24.2', ' 12.2', ' 9.9', ' -0.2', ' 0.0', ' 0.1'],
['-122.7', '-161.8', ' 63.2', ' -15.0', ' 0.3', ' 11.1', ' 28.5', ' 0.0'],
[' 341.9', ' 50.8', ' -48.4', ' 12.0', ' -10.2', ' -0.4', ' 0.0', ' 0.0'],
[' -20.1', ' 80.2', ' 6.9', ' 22.1', ' 0.1', ' 0.0', ' 0.0', ' 0.0'],
[' 74.4', ' 69.9', ' 32.9', ' -13.0', ' 0.0', ' 0.0', ' 0.0', ' 0.0'],
['-100.6', ' -38.9', ' 64.3', ' 0.0', ' 0.0', ' 0.0', ' 0.0', ' 0.0'],
[' 13.8', ' -36.5', ' 0.0', ' 0.0', ' 0.0', ' 0.0', ' 0.0', ' 0.0']],
dtype='<U6')
To see what this did to the original, we just transform it back.
In [21]:
tiny_chopped_float = undoDCT(tinyDCT_chopped)
# Also convert the floats back to uint8, which was the original format
tiny_chopped = vectorize(lambda x: uint8(x))(tiny_chopped_float)
show_image(tiny_chopped)
In [22]:
tiny_chopped
Out[22]:
array([[178, 140, 133, 109, 107, 135, 137, 147],
[ 76, 69, 90, 100, 107, 117, 110, 112],
[ 75, 61, 44, 39, 42, 56, 86, 107],
[214, 204, 169, 152, 131, 97, 78, 57],
[217, 227, 220, 230, 233, 206, 169, 125],
[211, 220, 221, 219, 220, 223, 220, 206],
[186, 196, 223, 220, 214, 227, 229, 234],
[ 66, 46, 65, 63, 79, 129, 155, 181]], dtype=uint8)
And we have something close to the original back again - even though
close to half of the transformed image was set to zero.
Conclusions
The procedure here isn't what happens in JPEG compression, but it does
illustrate one of the central concepts - throwing away some of the higher
spatial frequency information after a DCT transform.
In the real JPEG lossy compression algorithm, the steps are
● the color space is transformed from R,G,B to Y,Cb,Cr to take
advantage of human visual prejudices
● the values are shifted so that they center around zero
● the values after the DCT are "quantized" (i.e. rounded off) by different
amounts at different spots in the grid (this is the lossy step, and how
lossy depends on the JPEG quality)
● a zigzag pattern (keeping similar frequencies together) turns this into
a 1D stream of 64 values
● which are then Huffman encoded, typically by a pre-chosen code
(part of the JPEG standard)
Experiment – IV
4. IMAGE SEGMENTATION AND IMAGE RESTORATION
Aim: The detection of discontinuities - Point, Line and Edge detection;
Hough transform; Thresholding; Region-based segmentation; chain codes.
Theory:
• This is usually accomplished by applying a suitable mask to the image.
Detection of lines
• This is used to detect lines in an image.
• It can be done using the following four masks (horizontal, +45°,
vertical, and −45°):

Horizontal:
-1 -1 -1
 2  2  2
-1 -1 -1

+45°:
-1 -1  2
-1  2 -1
 2 -1 -1

Vertical:
-1  2 -1
-1  2 -1
-1  2 -1

−45°:
 2 -1 -1
-1  2 -1
-1 -1  2
Edge Detection
• Isolated points and thin lines do not occur frequently in most practical
applications.
• For image segmentation, we are mostly interested in detecting the
boundary between two regions with relatively distinct gray-level
properties.
• We assume that the regions in question are sufficiently homogeneous so
that the transition between two regions can be determined on the basis of
gray-level discontinuities alone.
• An edge in an image may be defined as a discontinuity or abrupt change
in gray level.
• From the example above, it is clear that the magnitude of the first
derivative can be used to detect the presence of an edge in an image.
• The sign of the second derivative can be used to determine whether
an edge pixel lies on the dark or light side of an edge.
• The zero crossings of the second derivative provide a powerful way
of locating edges in an image.
• We would like to have small-sized masks in order to detect fine
variation in gray-level distribution (i.e., micro-edges).
• On the other hand, we would like to employ large-sized masks in
order to detect coarse variation in graylevel distribution (i.e., macro-
edges) and filter-out noise and other irregularities.
• We therefore need to find a mask size which is a compromise
between these two opposing requirements, or determine edge content
by using different mask sizes.
• The most common differentiation operator is the gradient.
• Other discrete approximations to the gradient (more precisely, the
appropriate partial derivatives) have been proposed (Roberts, Prewitt).
• Because derivatives enhance noise, the previous operators may not
give good results if the input image is very noisy.
• One way to combat the effect of noise is by applying a smoothing
mask. The Sobel edge detector combines this smoothing operation
with the derivative operation to give the following masks:
Since the gradient edge detection methodology depends only on the
relative magnitudes within an image, scalar multiplication by factors such
as 1/2 or 1/8 plays no essential role. The same is true for the signs of the
mask entries. Therefore, such scaled or sign-flipped masks correspond to
the same detector, namely the Sobel edge detector.
Example:
The code below runs in a Linux environment. Make sure that OpenCV is
installed on your system before you run the program.
# Python program for edge detection
# using OpenCV in Python
# using the Sobel edge detection
# and Laplacian method
import cv2
import numpy as np

# capture frames from a camera (these two lines are assumed; the
# extracted snippet used 'frame' without defining it)
cap = cv2.VideoCapture(0)

while(1):
    # read a frame from the camera
    ret, frame = cap.read()
    # Calculation of Sobelx
    sobelx = cv2.Sobel(frame,cv2.CV_64F,1,0,ksize=5)
    # Calculation of Sobely
    sobely = cv2.Sobel(frame,cv2.CV_64F,0,1,ksize=5)
    # Calculation of Laplacian
    laplacian = cv2.Laplacian(frame,cv2.CV_64F)
    cv2.imshow('sobelx',sobelx)
    cv2.imshow('sobely',sobely)
    cv2.imshow('laplacian',laplacian)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break

cv2.destroyAllWindows()
Calculation of the derivative of an image
A digital image is represented by a matrix that stores the
RGB/BGR/HSV(whichever color space the image belongs to) value of
each pixel in rows and columns.
● Factor = 11 − 2 − 2 − 2 − 2 = 3
● Offset = 0
● Weighted Sum = 124*0 + 19*(-2) + 110*(-2) + 53*11 + 44*(-2) +
19*0 + 60*(-2) + 100*0 = 117
● O[4,2] = (117/3) + 0 = 39
Now take the second point on the line. Do the same as above. Increment
the values in the cells corresponding to the (r, θ) pairs you got. This time,
the cell (50, 90) = 2. We are actually voting up the (r, θ) values. You
continue this process for every point on the line. At each point, the cell
(50, 90) will be incremented or voted up, while other cells may or may
not be voted up. This way, at the end, the cell (50, 90) will have the
maximum votes. So if you search the accumulator for maximum votes,
you get the value (50, 90), which says there is a line in this image at
distance 50 from the origin and at angle 90 degrees.
Elaboration of the function cv2.HoughLines(edges, 1, np.pi/180, 200):
here edges is the binary edge map, 1 is the distance resolution of the
accumulator in pixels, np.pi/180 is the angle resolution in radians, and
200 is the minimum number of accumulator votes required to report a
line.
Alternate simpler method for directly extracting points:
import cv2
import numpy as np
# Read image
image = cv2.imread('path/to/image.png')
# Convert image to grayscale
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
# Use canny edge detection
edges = cv2.Canny(gray,50,150,apertureSize=3)
# Apply HoughLinesP method
# to directly obtain line end points
lines_list = []   # lookup list for points (initialization was missing)
lines = cv2.HoughLinesP(
    edges, # Input edge image
    1, # Distance resolution in pixels
    np.pi/180, # Angle resolution in radians
    threshold=100, # Min number of votes for valid line
    minLineLength=5, # Min allowed length of line
    maxLineGap=10 # Max allowed gap between line for joining them
    )
# Iterate over points
for points in lines:
    # Extracted points nested in the list
    x1,y1,x2,y2=points[0]
    # Draw the lines joining the points
    # On the original image
    cv2.line(image,(x1,y1),(x2,y2),(0,255,0),2)
    # Maintain a simple lookup list for points
    lines_list.append([(x1,y1),(x2,y2)])
Summarizing the process
In an image analysis context, the coordinates of the point(s) of edge
segments (i.e., (x, y)) in the image are known and therefore serve as
constants in the parametric line equation, while r (rho) and θ (theta) are
the unknown variables we seek.
● If we plot the possible r values defined by each θ, points in
Cartesian image space map to curves (i.e. sinusoids) in the polar Hough
parameter space. This point-to-curve transformation is the Hough
transformation for straight lines.
● The transform is implemented by quantizing the Hough parameter
space into finite intervals or accumulator cells. As the algorithm runs,
each (x, y) is transformed into a discretized (r, θ) curve, and the
accumulator (2D array) cells which lie along this curve are
incremented.
● Resulting peaks in the accumulator array represent strong evidence
that a corresponding straight line exists in the image.
Applications of Hough transform:
1. It is used to isolate features of a particular shape within an image.
2. Tolerant of gaps in feature boundary descriptions and is relatively
unaffected by image noise.
3. Used extensively in barcode scanning, verification and recognition.
If f(x, y) < T
then f(x, y) = 0
else
f(x, y) = 255
where
f(x, y) = the pixel value at coordinates (x, y)
T = the threshold value.
In OpenCV with Python, the function cv2.threshold is used for
thresholding.
Simple Thresholding
The basic Thresholding technique is Binary Thresholding. For every pixel,
the same threshold value is applied. If the pixel value is smaller than the
threshold, it is set to 0, otherwise, it is set to a maximum value.
The different Simple Thresholding Techniques are cv2.THRESH_BINARY,
cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO,
and cv2.THRESH_TOZERO_INV.
# Python program to illustrate
# simple thresholding type on an image

# organizing imports
import cv2
import numpy as np

# path to input image is specified and
# image is loaded with imread command
image1 = cv2.imread('input1.jpg')

# cv2.cvtColor is applied over the
# image input with applied parameters
# to convert the image in grayscale
img = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)

# apply binary thresholding (the threshold value 127 and the lines below
# are assumed; the extracted snippet ended at the grayscale conversion)
ret, thresh1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
cv2.imshow('Binary Threshold', thresh1)
cv2.waitKey(0)
Segmentation
Sobel vertical operator:
 1  0 -1
 2  0 -2
 1  0 -1

Sobel horizontal operator:
 1  2  1
 0  0  0
-1 -2 -1

Negative Laplace operator:
 0 -1  0
-1  4 -1
 0 -1  0
● Pros:
● This approach is similar to how the human brain approaches the
segmentation task.
● Works well in images with good contrast between object and
background.
● Limitations:
● Does not work well on images with smooth transitions and low
contrast.
● Sensitive to noise.
● Robust edge linking is not trivial to perform.
Region-Based Segmentation
● Top-down approach
● First, we need to define the seed pixels. Either we can define all
pixels as seed pixels or randomly chosen pixels. Grow regions
until all pixels in the image belong to a region.
● Bottom-Up approach
● Select seed only from objects of interest. Grow regions only if the
similarity criterion is fulfilled.
● Similarity Measures:
● Similarity measures can be of different types: for a grayscale
image the similarity measure can be texture and other spatial
properties, intensity difference within a region, or the distance
between the mean values of the regions.
● Region merging techniques:
● In the region merging technique, we try to combine the regions
that contain the single object and separate it from the
background. There are many region merging techniques, such
as the Watershed algorithm, the Split and merge algorithm, etc.
● Pros:
● Since it performs simple threshold calculation, it is faster to
perform.
● Region-based segmentation works better when the object and
background have high contrast.
● Limitations:
● It does not produce accurate segmentation results when there
is no significant difference between the pixel values of the object
and the background.
Implementation:
# code
import numpy as np
import matplotlib.pyplot as plt
from skimage.feature import canny
from skimage import data,morphology
from skimage.color import rgb2gray
from skimage.filters import sobel   # needed below; missing in the extracted listing
import scipy.ndimage as nd
plt.rcParams["figure.figsize"] = (12,8)
%matplotlib inline
# load images and convert grayscale
rocket = data.rocket()
rocket_wh = rgb2gray(rocket)
# apply edge segmentation
# plot canny edge detection
edges = canny(rocket_wh)
plt.imshow(edges, interpolation='gaussian')
plt.title('Canny detector')
# fill regions to perform edge segmentation
fill_im = nd.binary_fill_holes(edges)
plt.imshow(fill_im)
plt.title('Region Filling')
# Region Segmentation
# First we print the elevation map
elevation_map = sobel(rocket_wh)
plt.imshow(elevation_map)
# define markers from intensity thresholds (the marker-creation lines were
# lost in extraction; these threshold values are assumptions)
markers = np.zeros_like(rocket_wh)
markers[rocket_wh < 0.3] = 1
markers[rocket_wh > 0.6] = 2
plt.imshow(markers)
plt.title('markers')
# Perform watershed region segmentation
segmentation = morphology.watershed(elevation_map, markers)
plt.imshow(segmentation)
plt.title('Watershed segmentation')
plt.tight_layout()
Output:
(Figures: Canny detector, Region Filling, Elevation maps, markers, and
Watershed segmentation.)
Experiment – V
5. IMAGE DATA COMPRESSION
Aim: Fundamentals of compression, Basic compression methods.
Theory:
In the field of image processing, the compression of images is an
important step before we start the processing of larger images or videos.
The compression of images is carried out by an encoder, which outputs a
compressed form of the image. In the process of compression,
mathematical transforms play a vital role. A flow chart of the process of
image compression can be represented as:
What is a transformation (mathematically)?
It is a function that maps from one domain (vector space) to another
domain (another vector space). If T is a transform and f(t): X -> X' is a
function, then T(f(t)) is called the transform of the function.
We generally carry out the transformation of a function from one vector
space to another because, in the newly projected vector space, we can
infer more information about the function.
A real-life example of a transform:
The convolution of two signals in the time domain becomes a simple
product in the frequency domain. So we can see that the computation
cost is reduced when we switch to the frequency domain: in the time
domain the convolution is an integration operator, but in the frequency
domain it becomes a simple product of terms. In this way, the cost of
computation reduces, and carrying out spatial filtering operations
becomes easier after transforming the image from one domain to the
other.
Quantization
The process of quantization is a vital step in which the various levels of
intensity are grouped into particular levels based on a mathematical
function defined on the pixels. Generally, the new level is determined by
dividing each pixel value by a fixed step size "m", rounding to the closest
integer, and multiplying again by "m".
Basic quantization function: [pixelvalue/m] * m
So, close pixel values approximate to a single level, and hence the
number of distinct levels involved in the image becomes smaller. We
thereby reduce the redundancy in the levels of intensity, so quantization
helps in reducing the number of distinct levels.
Eg: (m=9) pixel values 16 and 20 both map to round(16/9)*9 =
round(20/9)*9 = 18.
Thus, we see in the above example that both intensity values round to
18; in this way we reduce the number of distinct levels (characters
involved) in the image specification.
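A short NumPy sketch of this quantization function (m = 9; the pixel
values are chosen for illustration):

import numpy as np

m = 9
pixels = np.array([16, 20, 35, 200])
quantized = np.round(pixels / m).astype(int) * m
print(quantized)   # [ 18  18  36 198]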
Symbol Encoding
The symbol encoding stage is where the distinct characters involved in
the image are encoded such that the number of bits required to represent
a character is optimal, based on the frequency of the character's
occurrence. In simple terms, in this stage codewords are generated for
the different characters present. By doing so we aim to reduce the
number of bits required to represent the intensity levels, representing
them in an optimal number of bits.
There are many encoding algorithms. Some of the popular ones are:
Huffman variable-length encoding.
Run-length encoding.
In the Huffman coding scheme, we try to find codes such that none of
the codes is a prefix of another. The length of each code is determined
by the probability of the occurrence of its character. In order to obtain
an optimal solution, the most probable character gets the smallest-length
code.
Example:
We see the actual 8-bit representation as well as the new smaller length
codes. The mechanism of generation of codes is:
So we see how the storage requirement (number of bits) decreases:
Initial representation - average code length: 8 bits per intensity level.
After encoding - average code length:
(0.6*1)+(0.3*2)+(0.06*3)+(0.02*4)+(0.01*5)+(0.01*5) = 1.56 bits per
intensity level.
Thus the number of bits required to represent the pixel intensities is
drastically reduced.
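A short Python sketch, using heapq, that reproduces this calculation; it
computes Huffman code lengths for the example probabilities above:

import heapq
import itertools

def huffman_code_lengths(probs):
    # each heap entry: (probability, tie-breaker, list of symbol indices)
    counter = itertools.count()
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, ids1 = heapq.heappop(heap)
        p2, _, ids2 = heapq.heappop(heap)
        for i in ids1 + ids2:        # each merge adds one bit to these codes
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), ids1 + ids2))
    return lengths

probs = [0.6, 0.3, 0.06, 0.02, 0.01, 0.01]
lengths = huffman_code_lengths(probs)
print(lengths)                                      # [1, 2, 3, 4, 5, 5]
print(sum(p * l for p, l in zip(probs, lengths)))   # ~1.56 bits per level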
Thus, in this way, the mechanisms of quantization and encoding help in
compression. Once the images are compressed, it is easy to store them
on a device or to transfer them. Based on the type of transform used, the
type of quantization, and the encoding scheme, decoders are designed on
the reversed logic of the compression, so that the original image can be
rebuilt from the data obtained from the compressed images.
There are organizations that receive data from lakhs of persons, mostly
in the form of text, with a few images. Most of you know that the text
part is stored in databases in the form of tables, but what about the
images? The images are small compared to the textual data but occupy
much more space in terms of storage. Hence, to save on space and keep
processes running smoothly, such organizations ask users to submit
compressed images. As most readers have a bit of a CS background
(either from school or college), they understand that using free online
tools to compress images is not a good practice.
Till Windows 7, Microsoft used to ship MS Office Picture Manager,
which could be used to compress images to an extent, but it also had
some limitations.
Those who know a bit of Python can install Python and run pip install
Pillow in the command prompt (terminal for Linux users) to install
Pillow (a fork of PIL).
Assemble all the files in a folder and keep the file Compress.py in the
same folder.
Run the Python file with python.
Below is the source code of the file:
# run this in any directory
# add -v for verbose
# get Pillow (fork of PIL) from
# pip before running -->
# pip install Pillow
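The body of Compress.py is not reproduced in this copy. A minimal
sketch of what such a script could look like (the file selection, output
naming, and quality value are assumptions, not the original script):

import os
from PIL import Image

for name in os.listdir('.'):
    if name.lower().endswith(('.jpg', '.jpeg', '.png')):
        img = Image.open(name).convert('RGB')
        out = os.path.splitext(name)[0] + '_compressed.jpg'
        # re-encode as JPEG with reduced quality (the lossy step)
        img.save(out, 'JPEG', optimize=True, quality=60)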
Folder Before Compression:
Experiment – VI
6. MORPHOLOGICAL OPERATION
Aim: Morphological operations: Dilation, Erosion, Opening, Closing.
Theory:
EROSION AND DILATION IN MORPHOLOGICAL
PROCESSING.
These operations are fundamental to morphological processing.
Erosion:
With A and B as sets in Z², the erosion of A by B, denoted A ⊖ B, is
defined as
A ⊖ B = { z | (B)_z ⊆ A }
In words, this equation indicates that the erosion of A by B is the set of
all points z such that B, translated by z, is contained in A. In the
following discussion, set B is assumed to be a structuring element. The
statement that B has to be contained in A is equivalent to B not sharing
any common elements with the background; we can express erosion in
the following equivalent form:
A ⊖ B = { z | (B)_z ∩ Aᶜ = ∅ }
where Aᶜ is the complement of A.
(a) Set A. (b) Square structuring element B. (c) Erosion of A by B, shown
shaded. (d) Elongated structuring element. (e) Erosion of A by B using
this element. The dotted border in (c) and (e) is the boundary of set A,
shown only for reference.
The elements of A and B are shown shaded and the background is white.
The solid boundary in Fig. (c) is the limit beyond which further
displacements of the origin of B would cause the structuring element to
cease being completely contained in A. Thus, the locus of points
(locations of the origin of B) within (and including) this boundary,
constitutes the erosion of A by B. We show the erosion shaded in Fig.
(c).The boundary of set A is shown dashed in Figs. (c) and (e) only as a
reference; it is not part of the erosion operation. Figure (d) shows an
elongated structuring element, and Fig. (e) shows the erosion of A by this
element. Note that the original set was eroded to a line. However, these
equations have the distinct advantage over other formulations in that they
are more intuitive when the structuring element B is viewed as a spatial
mask.
Thus, erosion shrinks or thins objects in a binary image. In fact, we can
view erosion as a morphological filtering operation in which image details
smaller than the structuring element are filtered (re-moved) from the
image
(i) Dilation
With A and B as sets in Z², the dilation of A by B, denoted A ⊕ B, is
defined as
A ⊕ B = { z | (B̂)_z ∩ A ≠ ∅ }
where B̂ is the reflection of B about its origin. This definition has a
distinct advantage over other formulations in that it is more intuitive
when the structuring element B is viewed as a convolution mask. The
basic process of flipping
(rotating) B about its origin and then successively displacing it so that it
slides over set (image) A is analogous to spatial convolution. Keep in
mind, however, that dilation is based on set operations and therefore is a
nonlinear operation, whereas convolution is a linear operation. Unlike
erosion, which is a shrinking or thinning operation, dilation "grows" or
"thickens" objects in a binary image. The specific manner and extent of
this thickening is controlled by the shape of the structuring element used.
In the following figure, (b) shows a structuring element (in this case
B̂ = B because the SE is symmetric about its origin). The dashed line in Fig. (c)
shows the original set for reference, and the solid line shows the limit
beyond which any further displacements of the origin of B by z would
cause the intersection of B and A to be empty. Therefore, all points on and
inside this boundary constitute the dilation of A by B. Figure (d) shows a
structuring element designed to achieve more dilation vertically than
horizontally, and Fig. (e) shows the dilation achieved with this element
FIG: 4.1.5 (a) Set A. (b) Square structuring element B (the dot denotes
the origin). (c) Dilation of A by B, shown shaded. (d) Elongated
structuring element. (e) Dilation of A using this element. The dotted
border in (c) and (e) is the boundary of set A, shown only for reference.
import cv2
import numpy as np

# start reading frames from the webcam (these setup lines are assumed;
# the extracted listing began inside the loop)
screenRead = cv2.VideoCapture(0)

while True:
    _, image = screenRead.read()
    # Converts to HSV color space, OCV reads colors as BGR
    # frame is converted to hsv
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    # defining the range of masking
    blue1 = np.array([110, 50, 50])
    blue2 = np.array([130, 255, 255])
    # initializing the mask to be
    # convoluted over input image
    mask = cv2.inRange(hsv, blue1, blue2)
    # passing the bitwise_and over
    # each pixel convoluted
    res = cv2.bitwise_and(image, image, mask = mask)
    # defining the kernel i.e. Structuring element
    kernel = np.ones((5, 5), np.uint8)
    # defining the opening function
    # over the image and structuring element
    opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # The mask and opening operation
    # is shown in the window
    cv2.imshow('Mask', mask)
    cv2.imshow('Opening', opening)
    # Wait for 'a' key to stop the program
    if cv2.waitKey(1) & 0xFF == ord('a'):
        break

# De-allocate any associated memory usage
cv2.destroyAllWindows()
# Close the window / Release webcam
screenRead.release()
Input Frame:
Mask:
Output Frame:
The system recognizes the defined blue book as the input, and removes
and simplifies the internal noise in the region of interest with the help of
the opening function.
Morphological Operations in Image Processing (Closing)
Closing is similar to the opening operation. The basic premise is that
closing is opening performed in reverse: it is defined simply as a dilation
followed by an erosion, using the same structuring element as in the
opening operation.
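A minimal sketch of closing with OpenCV, assuming a binary mask and
the same 5 x 5 kernel as in the opening example (the input file name is
hypothetical):

import cv2
import numpy as np

mask = cv2.imread('mask.png', 0)    # assumed binary input image
kernel = np.ones((5, 5), np.uint8)
# closing = dilation followed by erosion with the same structuring element
closing = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
cv2.imshow('Closing', closing)
cv2.waitKey(0)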
Mask:
Output:
3. It increases the white region in the image, i.e., the size of the
foreground object increases.
# Python program to demonstrate erosion and
# dilation of images.
import cv2
import numpy as np
# Reading the input image
img = cv2.imread('input.png', 0)
# Taking a matrix of size 5 as the kernel
kernel = np.ones((5,5), np.uint8)
# The first parameter is the original image,
# kernel is the matrix with which image is
# convolved and third parameter is the number
# of iterations, which will determine how much
# you want to erode/dilate a given image.
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)
cv2.imshow('Input', img)
cv2.imshow('Erosion', img_erosion)
cv2.imshow('Dilation', img_dilation)
cv2.waitKey(0)
Uses of Erosion and Dilation:
1. Erosion:
● It is useful for removing small white noises.
● Used to detach two connected objects etc.
2. Dilation:
● In cases like noise removal, erosion is followed by dilation, because
erosion removes white noise but also shrinks our object, so we dilate
it afterwards. Since the noise is gone, it won't come back, but the
object area increases.
● It is also useful in joining broken parts of an object.