Cv Unit 3 Feature Detection
What is a Feature?
In computer vision and image processing, a feature is a piece of information about the content of an
image, typically that a certain region of the image has certain properties.
Features may be specific structures in the image such as points, edges or objects. Features may also
be the result of a general neighborhood operation or feature detection applied to the image.
Feature detection
● In computer vision and image processing, feature detection refers to methods that compute
abstractions of image information and make a local decision at every image point as to whether an
image feature of a given type is present at that point or not.
● The resulting features will be subsets of the image domain, often in the form of isolated
points, continuous curves or connected regions.
Feature extraction
Feature detection = how to find interesting points (features) in the image (for example, find a
corner, find a template, and so on).
Feature extraction = how to represent the interesting points we found, so that we can compare them
with other interesting points (features) in the image.
Practical example: you can find a corner with the Harris corner method, but you can describe it with
any method you want (histograms, HOG, local orientation in the 8-adjacency neighbourhood, for instance).
Edge Detection
● Edges are significant local changes of intensity in a digital image. An edge can be defined as
a set of connected pixels that forms a boundary between two disjoint regions. There are
three types of edges:
○ Horizontal edges
○ Vertical edges
○ Diagonal edges
● Edge detection is widely used in image morphology and feature extraction.
● Edge detection allows users to observe the features of an image where there is a significant change
in the grey level. Such a change indicates the end of one region in the image and the beginning of
another.
● It reduces the amount of data in an image and preserves the structural properties of an
image.
Sobel Operator:
● It is a discrete differentiation operator.
● The Sobel edge detection operator extracts all the edges of an image, without worrying
about the directions. The main advantage of the Sobel operator is that it provides a
differencing and smoothing effect.
● The Sobel edge detection operator is implemented as the sum of two directional edge operations, and the
resulting image is a unidirectional outline of the original image.
● Sobel Edge detection operator consists of 3x3 convolution kernels. Gx is a simple kernel
and Gy is rotated by 90°
● These kernels are applied separately to the input image so that separate gradient measurements can
be produced in each orientation, i.e. Gx and Gy.
● Advantages:
○ Simple and time efficient computation
○ Very easy at searching for smooth edges
● Limitations:
○ Diagonal direction points are not preserved always
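A minimal Python/OpenCV sketch of Sobel edge detection (the image path 'input.png' is a placeholder):

import cv2
import numpy as np

# Read a grayscale test image ('input.png' is a placeholder path).
img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)

# Gx = [[-1,0,1],[-2,0,2],[-1,0,1]]; Gy is Gx rotated by 90 degrees.
# cv2.Sobel applies each 3x3 kernel separately.
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)   # gradient in the x direction
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)   # gradient in the y direction

# Combine the two directional edge images into a gradient magnitude.
magnitude = np.sqrt(gx ** 2 + gy ** 2)
magnitude = np.uint8(255 * magnitude / magnitude.max())
cv2.imwrite('sobel_edges.png', magnitude)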
Prewitt Operator:
● This operator is similar to the Sobel operator.
● It also detects vertical and horizontal edges of an image. It is one of the best ways to detect
the orientation and magnitude of edges in an image.
● It uses 3x3 kernels or masks whose coefficients are all 0 or ±1 (see the sketch after this list).
● Advantages:
○ Good performance on detecting vertical and horizontal edges
○ Best operator to detect the orientation of an image
● Limitations:
○ The magnitude of coefficient is fixed and cannot be changed
○ Diagonal direction points are not preserved always
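OpenCV has no built-in Prewitt function, so a minimal sketch applies the standard Prewitt masks with filter2D (the image path is a placeholder):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # placeholder path

# Standard 3x3 Prewitt masks: all coefficients are 0 or +/-1 (fixed magnitude).
kx = np.array([[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]], dtype=np.float64)   # responds to vertical edges
ky = kx.T                                       # rotated mask for horizontal edges

gx = cv2.filter2D(img, cv2.CV_64F, kx)
gy = cv2.filter2D(img, cv2.CV_64F, ky)
magnitude = np.sqrt(gx ** 2 + gy ** 2)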
Roberts Operator:
● This gradient-based operator computes the sum of squares of the differences between
diagonally adjacent pixels in an image through discrete differentiation.
● The Roberts cross operator is used to perform 2-D spatial gradient measurement on an image and is
simple and quick to compute. In the Roberts cross operator, each output pixel value represents the
estimated absolute magnitude of the spatial gradient of the input image at that point.
● The Roberts cross operator consists of 2x2 convolution kernels. Gx is a simple kernel and Gy is
rotated by 90°.
● Advantages:
○ Detection of edges and orientation are very easy
○ Diagonal direction points are preserved
● Limitations:
○ Very sensitive to noise
○ Not very accurate in edge detection
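A minimal sketch of the Roberts cross operator using its 2x2 kernels (filter2D is used since OpenCV has no dedicated Roberts function; the image path is a placeholder):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE).astype(np.float64)   # placeholder path

# 2x2 Roberts cross kernels; Gy is Gx rotated by 90 degrees.
kx = np.array([[1, 0],
               [0, -1]], dtype=np.float64)
ky = np.array([[0, 1],
               [-1, 0]], dtype=np.float64)

gx = cv2.filter2D(img, -1, kx)
gy = cv2.filter2D(img, -1, ky)

# Approximate absolute gradient magnitude at each pixel.
magnitude = np.abs(gx) + np.abs(gy)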
Laplacian of Gaussian Operator:
● It is a Gaussian-based operator which uses the Laplacian to take the second derivative of an
image. It works well when the transition of the grey level is abrupt.
● It works on the zero-crossing method, i.e. where the second-order derivative crosses zero, that
location corresponds to a maximum of the first derivative and is marked as an edge location.
● Here the Gaussian operator reduces the noise and the Laplacian operator detects the sharp
edges.
Since the input image is represented as a set of discrete pixels, a discrete convolution kernel that
approximates the Laplacian is used. Three commonly used kernels are as follows:
● Advantages:
○ Easy to detect edges and their various orientations
○ There are fixed characteristics in all directions
● Limitations:
○ Very sensitive to noise
○ The localization error may be severe at curved edges
○ It generates noisy responses that do not correspond to edges, so-called “false edges”
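A minimal sketch of the Laplacian of Gaussian approach with OpenCV (kernel size and sigma are illustrative choices, and the zero-crossing test is simplified):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # placeholder path

# The Gaussian reduces noise, then the Laplacian takes the second derivative.
blurred = cv2.GaussianBlur(img, (5, 5), 1.4)
log = cv2.Laplacian(blurred, cv2.CV_64F, ksize=3)

# Edge locations correspond to zero-crossings of the response;
# here a simple sign-change test against the right and bottom neighbours.
sign = np.sign(log)
zero_cross = ((sign[:-1, :-1] * sign[1:, :-1] < 0) |
              (sign[:-1, :-1] * sign[:-1, 1:] < 0))
edges = np.uint8(zero_cross) * 255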
Canny Operator:
https://towardsdatascience.com/canny-edge-detection-step-by-step-in-python-computer-vision-b49c3a2d8123
1. The input image is first smoothed with a Gaussian filter to reduce noise.
2. The gradient magnitude and direction are computed at every pixel (for example with the Sobel operator).
3. The edge points at the output of step 2 result in wide ridges. The algorithm thins those
ridges (non-maximum suppression), leaving only the pixels at the top of each ridge.
4. The ridge pixels are then thresholded using two thresholds Tlow and Thigh: ridge pixels
with values greater than Thigh are considered strong edge pixels; ridge pixels with values
between Tlow and Thigh are said to be weak pixels. This process is known as hysteresis
thresholding.
5. The algorithm performs edge linking, aggregating weak pixels that are 8-connected to the
strong pixels.
● Advantages:
○ It has good localization
○ It extracts image features without altering the features
○ Less Sensitive to noise
● Limitations:
○ There is false zero crossing
○ Complex computation and time consuming
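A minimal sketch using OpenCV's Canny implementation (the threshold values are illustrative; cv2.Canny expects the image to be smoothed beforehand):

import cv2

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # placeholder path

# Smooth first to reduce noise responses.
blurred = cv2.GaussianBlur(img, (5, 5), 1.4)

# Gradient computation, ridge thinning (non-maximum suppression) and
# hysteresis thresholding with Tlow=50 and Thigh=150 are done internally.
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)
cv2.imwrite('canny_edges.png', edges)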
Corner Detection
A corner can be defined as the intersection of two edges. A corner can also be defined as a point for
which there are two dominant and different edge directions in a local neighbourhood of the point.
Template based corner detection methods use different representative templates to match the image.
Correlations between the templates and the image are used to detect corners. The detection performance
highly depends on the choice of appropriate templates.
After the correlations between the templates and the image are determined, an appropriate
threshold should be carefully chosen to determine the existence of corners.
Contour based corner detection methods are based on edge detection. In this category of methods,
edges in the image are detected first. Then, the corner is detected along the contour.
Direct corner detection methods use mathematical computations to detect the corner. This category
of methods usually applies some statistical operations to the image first. Then, corners are detected
based on statistical information.
Moravec detector
The principle of this detector is to observe whether a sub-image, shifted by one pixel in all directions,
changes significantly. If this is the case, then the considered pixel is a corner.
Fig. Principle of the Moravec detector. From left to right: on a flat area, small shifts of the sub-image
(in red) do not cause any change; on a contour, we observe changes in only one direction; around a
corner there are significant changes in all directions.
● Mathematically, the change is characterized in each pixel (m,n) of the image by Em,n(x,y)
which represents the difference between the sub-images for an offset (x,y):
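In the standard formulation this is the sum of squared differences over the window:
Em,n(x,y) = Σ(u,v)∈wm,n [ f(u+x, v+y) − f(u,v) ]²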
where:
x and y represent the offsets in the four directions: (x,y)∈{(1,0),(1,1),(0,1),(−1,1)},
wm,n is a rectangular window around pixel (m,n),
f(u+x,v+y)−f(u,v) is the difference between the sub-image f(u,v) and the offset patch
f(u+x,v+y),
● In each pixel (m,n), the minimum of Em,n(x,y) in the four directions is kept and denoted
Fm,n. Finally, the detected corners correspond to the local maxima of Fm,n, that is, at pixels
(m,n) where the smallest value of Em,n(x,y) is large.
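A minimal NumPy sketch of this procedure (the window size and threshold are illustrative; a full implementation would also keep only the local maxima of F):

import numpy as np

def moravec(img, window=3, threshold=500.0):
    """Corner map from the Moravec measure (simplified: threshold instead of local maxima)."""
    img = img.astype(np.float64)
    h, w = img.shape
    r = window // 2
    offsets = [(1, 0), (1, 1), (0, 1), (-1, 1)]     # the four offsets (x, y)
    F = np.zeros_like(img)
    for m in range(r + 1, h - r - 1):
        for n in range(r + 1, w - r - 1):
            patch = img[m - r:m + r + 1, n - r:n + r + 1]
            E = []
            for x, y in offsets:
                shifted = img[m - r + y:m + r + 1 + y, n - r + x:n + r + 1 + x]
                E.append(np.sum((shifted - patch) ** 2))   # E_{m,n}(x,y)
            F[m, n] = min(E)                               # minimum over the four directions
    return F > threshold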
Harris corner detector
The Harris corner detection algorithm, also called the Harris & Stephens corner detector, is one of
the simplest corner detectors available.
The idea is to locate interest points where the surrounding neighbourhood shows edges in more than
one direction. The basic idea of the algorithm is to find the difference in intensity for a displacement of
(u,v) in all directions, which is expressed as below:
The window function is either a rectangular window or a Gaussian window which gives weights to the
pixels at (x,y). The above equation can be further approximated using a Taylor expansion, which gives
us the final formula as:
where Ix and Iy are the image derivatives in the x and y directions respectively. One can compute the
derivatives using the Sobel kernel.
Here A, B and C are the window-weighted sums of the derivative products, computed over the window
defined by w, that form the matrix M. The lambdas are the eigenvalues of M.
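A minimal OpenCV sketch (blockSize, ksize and k are illustrative values; OpenCV computes the response R = det(M) − k·trace(M)² at every pixel):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # placeholder path

# Ix and Iy are computed internally with a Sobel kernel of size ksize,
# and M is accumulated over a blockSize x blockSize window.
R = cv2.cornerHarris(np.float32(img), blockSize=2, ksize=3, k=0.04)

# Keep points whose response exceeds 1% of the maximum response.
corners = R > 0.01 * R.max()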
Hough Transform
• For a line y = ax + b, a single point (x,y) defines the line b = −xa + y in (a,b) space, parameterized
by x and y. So a single point in xy-space gives a line in (a,b) space.
• All points on the line defined by (x,y) and (z,k) in (x,y) space will parameterize lines that
intersect at (a′,b′) in (a,b) space.
• Points that lie on a line will form a “cluster of crossings” in the (a,b) space.
Accumulator Space:
● Quantize the parameter space (a,b), that is, divide it into cells. This quantized space is often
referred to as the accumulator cells.
● amax is the maximum value of a and amin is the minimum value of a, etc. Count the number of
times a line intersects a given cell.
● For each point (x,y) with value 1 in the binary image, find the values of (a,b) in the range
[[amin,amax],[bmin,bmax]] defining the line corresponding to this point. Increase the value of the
accumulator for these (a′,b′) points. Then proceed with the next point in the image.
● Cells receiving a minimum number of “votes” are assumed to correspond to lines in (x,y)
space. Lines can be found as peaks in this accumulator space.
● The polar (also called normal) representation of straight lines is x cosθ + y sinθ = ρ
● Each point (xi ,yi ) in the xy-plane gives a sinusoid in the ρ-θ plane.
● M collinear points lying on the line xi cosθ + yi sinθ = ρ will give M curves that intersect at
(ρi, θj) in the parameter plane.
● Local maxima give significant lines.
● The intersection point (ρ0,θ0) corresponds to the line that passes through two points
(x1,y1) and (x2,y2).
● A horizontal line will have θ = 90° and ρ equal to the intercept with the y axis.
● A vertical line will have θ = 0° and ρ equal to the intercept with the x axis.
Step 4: For each foreground point (xk,yk) in the thresholded edge image,
let θj take each of the possible θ-values in turn, compute ρ = xk cos θj + yk sin θj, find the closest
quantized value ρi, and increment the accumulator cell A(i,j).
Output: After this procedure, A(i,j) = P means that P points in xy-space lie on the line
x cos θj + y sin θj = ρi.
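A minimal OpenCV sketch of the polar-form Hough transform (the Canny thresholds and the 150-vote accumulator threshold are illustrative):

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # placeholder path
edges = cv2.Canny(img, 50, 150)                        # binary edge image

# Accumulate votes in (rho, theta) cells: 1-pixel rho resolution,
# 1-degree theta resolution; keep cells with at least 150 votes.
lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=150)

if lines is not None:
    for rho, theta in lines[:, 0]:
        print(f"x cos({theta:.2f}) + y sin({theta:.2f}) = {rho:.1f}")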
Advantages:
● Conceptually simple.
● Easy implementation.
● Handles missing and occluded data very gracefully.
● Can be adapted to many types of forms, not just lines.
Disadvantages:
● Computationally complex for objects with many parameters.
● Looks for only one single type of object.
● Can be “fooled” by “apparent lines”.
● The length and the position of a line segment cannot be determined.
● Collinear line segments cannot be separated
Active Contours
● Active contour is a type of segmentation technique which can be defined as the use of energy
forces and constraints to segregate the pixels of interest from the image.
● Contours are boundaries designed for the area of interest required in an image. A contour is a
collection of points that undergoes an interpolation process. The interpolation can be
linear, spline-based or polynomial, and it describes the curve in the image.
● The main application of active contours in image processing is to define a smooth shape in the
image and form a closed contour for the region.
● An energy functional is always associated with the curve defined in the image.
○ External energy is defined as the combination of forces used to control the
positioning of the contour onto the image.
○ Internal energy is used to control the deformable changes.
● The desired contour is obtained by finding the minimum of the energy functional. The
deformation of the contour is described by the collection of points that defines it.
1. Snake model
● The model mainly works to identify and outline the target object considered for
segmentation. It uses a certain amount of prior knowledge about the target object contour
especially for complex objects.
● The active snake model is configured by a spline and is focused on minimising energy, guided by
various forces governing the image. A spline is a mathematical expression of a set of polynomials
used to derive geometric figures such as curves.
● The energy-minimising spline is guided by constraint forces and by internal and external image
forces based on appropriate contour features.
● Snake works efficiently with complex target objects by breaking down the figure into
various smaller targets.
● The parametric form of the curve is exploited in the snake model, which has more advantages
than using implicit or explicit curve forms. The curve is written as v(s, t) = (x(s, t), y(s, t)),
where x and y are the coordinates of the two-dimensional curve, v is the spline, s is the linear
parameter ∈ [0,1] and t is the time parameter ∈ [0, ∞).
Step 2: The contour moves. For moving the contour there are two common philosophies:
● Energy minimization
○ “Ad-hoc” energy equation describes how good the curve looks, and how well it
matches the image
○ “Visible” image boundaries represent a low energy state for the active contour
○ The curve is (typically) represented as a set of sequentially connected points. Each
point is connected to its 2 neighboring points.
Two terms: Internal Energy + External Energy
○ External energy: also called image energy; designed to capture desired image features.
○ Internal energy: also called shape energy; designed to reduce extreme curvature and prevent
outlier points.
○ The total energy of active snake model is a summation of three types of energy
namely
(i) internal energy (Ei) which depends on the degree of the spline relating to the
shape of the target image;
(ii) external energy (Ee) which includes the external forces given by the user and
also energy from various other factors;
(iii) energy of the image under consideration (EI) which conveys valuable data on
the illumination of the spline representing the target object.
○ The total energy defined for the contour formation in the snake model is given by
○ Einternal describes the internal energy which defines piecewise smoothness constraints
in the contour, where α decides on how far the snake will be extended and the
capacity of elasticity possible for the snake. β decides on the rigidity level for the
snake.
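In the standard snake formulation (reconstructed here to match the description above), the total energy and the internal term read:
Esnake = ∫ (from s = 0 to 1) [ Einternal(v(s)) + Eimage(v(s)) + Eexternal(v(s)) ] ds
Einternal = ½ [ α(s) |v′(s)|² + β(s) |v″(s)|² ]
where v′(s) and v″(s) are the first and second derivatives of the contour v(s); α weights elasticity and β weights rigidity.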
Eexternal energy constraints are mainly used to define the snake near the required local minimum. It
may be described using high level interpretation and interaction.
The contour of the target object is shown in the figure above, where w1 is called the line coefficient and w2
is called the edge coefficient. For higher values of w1 and w2, the snake aligns itself with
darker pixel regions when the value is positive and progresses towards
the brighter pixels when the value is negative.
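A minimal sketch using scikit-image's snake implementation, where alpha/beta play the role of the elasticity/rigidity weights and w_line/w_edge correspond to the line and edge coefficients above (the circular initial contour and parameter values are illustrative):

import numpy as np
from skimage import data, filters
from skimage.segmentation import active_contour

img = filters.gaussian(data.astronaut()[..., 0], sigma=3)   # sample image, smoothed

# Initial contour: a circle of radius 100 in (row, col) coordinates.
s = np.linspace(0, 2 * np.pi, 400)
init = np.column_stack([100 + 100 * np.sin(s), 220 + 100 * np.cos(s)])

snake = active_contour(img, init,
                       alpha=0.015,   # elasticity: how much the snake may stretch
                       beta=10.0,     # rigidity: resistance to bending
                       w_line=0,      # attraction to intensity (line term)
                       w_edge=1,      # attraction to edges
                       gamma=0.001)   # time step of the iteration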
2. Gradient vector flow model
● The gradient vector flow model is an extended and well-defined technique of the snake or active
contour models.
● The traditional snake model possesses two limitations: poor convergence of the contour to
concave boundaries, and poor convergence when the snake curve is initialised at a long
distance from the minimum.
● Gradient vector flow (GVF) field is determined based on the following steps.
Step 1 : The primary step is to detect the edge mapping function f(x, y) from the image I(x, y). Edge
mapping function for binary images is described by
Step 2 : Gradient vector flow field is the equilibrium solution that reduces the functional energy.
● The functional energy possesses two different terms such as
○ smoothing term and
○ data term which depends on the parameter μ.
● The parameter value is based on the noise level in the image, that is if the noise level is high
then the parameter has to be increased.
● Limitation is an increase in the value of μ that reduces the rounding of edges but weakens
the smoothing condition of the contour to a certain extent.
● The gradient vector flow is defined by the energy functional
● In this equation, g describes the gradient vector flow which can be derived based on the
Euler equations.
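For reference, the energy functional in the standard GVF formulation (Xu and Prince) is
ε = ∫∫ μ (ux² + uy² + vx² + vy²) + |∇f|² |g − ∇f|² dx dy,
where g(x,y) = (u(x,y), v(x,y)) is the gradient vector flow field, f is the edge map and the subscripts denote partial derivatives. The first (smoothing) term dominates where |∇f| is small, and the second (data) term keeps g close to ∇f near edges.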
3. Balloon model
If the initial snake is smaller than the minimum-energy contour, it will not find the minimum and will
continue to shrink. To overcome this limitation of the snake model, the balloon model was introduced,
in which an inflation term is added to the forces acting on the snake.
The additional inflation force is given by
Here k1 should possess similar magnitude as that of the image normalisation vector k
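In Cohen's balloon formulation this inflation force can be written as F = k1 · n(s), where n(s) is the unit vector normal to the curve at the point v(s); a positive k1 inflates the contour outward and a negative k1 deflates it.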
Algorithm:
● It will locate an area in the volume, then place an icosahedron in that area such that it
contains no points. Expand (or) subdivide the icosahedron according to force.
● Starts with a small icosahedron inside the object.
4. Geometric contour models
● Geometric contours can be obtained based on regions and edges in the curvature of the
image. Edge-based geometric active contours define a geometric flow curve evolution
depending on the gradients of edges or boundaries in the image that undergoes contour
segmentation.
● Edge-based geometric models possess fast computation speed and can simultaneously
segment different regions of different intensities. In some regions, penetration of the gap
in-between the curvature occurs due to large gradient magnitudes.
● Region-based geometric contour models are based on either the variance inside and outside
contour or the squared difference between average intensities inside and outside the contours
along with the total contour length.
● Geometric active contours are mainly employed in medical image computing in
image-based segmentation.
● In general, active contour models possess different extended versions with change either in
the form of energy constraints or forces. New contour models are designed for the
segmentation of absolute details of the image.
SIFT descriptors
● The scale-invariant feature transform (SIFT) is an algorithm used to detect and describe
local features in digital images.
○ It locates certain key points
○ then furnishes them with quantitative information (so-called descriptors)
○ These can then be used for object recognition.
○ Gaussian blur has a particular expression or “operator” that is applied to each pixel,
producing the blurred image L.
G is the Gaussian blur operator and I is an image, while x,y are the location coordinates and
σ is the “scale” parameter: the greater its value, the greater the blur.
● Blurred images are used to generate another set of images, the Difference of
Gaussians (DoG). These DoG images are great for finding out interesting key points
in the image.
● The difference of Gaussians is obtained as the difference of the Gaussian blurrings of an
image with two different values of σ, say σ and kσ.
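For reference, the standard expressions from Lowe's SIFT formulation for the blurred image, the Gaussian operator and the difference of Gaussians are:
L(x, y, σ) = G(x, y, σ) * I(x, y)
G(x, y, σ) = (1 / 2πσ²) e^(−(x² + y²) / 2σ²)
D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)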
● Finding keypoints
○ One pixel in an image is compared with its 8 neighbours as well as 9 pixels in the next
scale and 9 pixels in the previous scale. This way, a total of 26 checks are made. If it is
a local extremum, it is a potential keypoint. It basically means that the keypoint is best
represented in that scale.
● The keypoints generated in the previous step are numerous, and some of them are not
as useful as features.
● A Taylor series expansion of the scale space is used to get a more accurate location of each
extremum, and if the intensity at this extremum is less than a threshold value (0.03), it is rejected.
● A 2x2 Hessian matrix (H) is used to compute the principal curvature.
○ The “amount” that is added to the bin is proportional to the magnitude of the
gradient at that point.
● From the bins, a histogram is generated. The highest peak in the histogram is taken, and
any peak above 80% of it is also considered when calculating the orientation. This creates keypoints
with the same location and scale, but different directions.
● At this point, each keypoint has a location, scale, orientation. Next is to compute a
descriptor for the local image region about each keypoint that is highly distinctive and
invariant as possible to variations such as changes in viewpoint and illumination.
● To do this, a 16x16 window around the keypoint is taken and divided into 16 sub-blocks of
4x4 size; an 8-bin orientation histogram is computed for each sub-block, giving a 16 × 8 = 128-value
feature vector.
1. Rotation dependence: the feature vector uses gradient orientations. Clearly, if you rotate
the image, everything changes. To achieve rotation independence, the keypoint’s rotation is
subtracted from each orientation, so that each gradient orientation is relative to the keypoint’s
orientation.
2. Illumination dependence: if we threshold numbers that are big, we can achieve illumination
independence. So, any number (of the 128) greater than 0.2 is changed to 0.2, and the resulting
feature vector is normalized again. This gives an illumination-independent feature vector.
Keypoints between two images are matched by identifying their nearest neighbours. In some
cases, the second-closest match may be very near to the first. In that case, the ratio of the
closest distance to the second-closest distance is taken; if it is greater than 0.8, the match is rejected.
This eliminates around 90% of false matches.
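A minimal OpenCV sketch of SIFT detection, description and ratio-test matching (the two image paths are placeholders; SIFT is available in the main OpenCV package from version 4.4):

import cv2

img1 = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)   # placeholder paths
img2 = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)   # keypoints + 128-D descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# Match each descriptor to its two nearest neighbours and apply the ratio test.
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.8 * n.distance]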
HOG Descriptors
HOG, or Histogram of Oriented Gradients, is a feature descriptor that is often used to extract
features from image data.
● The HOG descriptor focuses on the structure or the shape of an object. In the case of edge
features, we only identify if the pixel is an edge or not. HOG is able to provide the edge
direction as well. This is done by extracting the gradient and orientation (magnitude and
direction) of the edges
● These orientations are calculated in ‘localized’ portions. This means that the complete
image is broken down into smaller regions and for each region, the gradients and orientation
are calculated.
● Finally the HOG would generate a Histogram for each of these regions separately. The
histograms are created using the gradients and orientations of the pixel values, hence the
name ‘Histogram of Oriented Gradients’
The HOG feature descriptor counts the occurrences of gradient orientation in localized
portions of an image.
Process of Calculating the Histogram of Oriented Gradients (HOG)
● We need to preprocess the image and bring down the width to height ratio to 1:2. The image
size should preferably be 64 x 128.
● This is because we will be dividing the image into 8*8 and 16*16 patches to extract the
features.
The next step is to calculate the gradient for every pixel in the image. Gradients are the small
changes in the x and y directions.
● Get the pixel values for this patch (the matrix shown here is only used as an example).
● Consider the pixel with value 85. To determine the gradient in the x-direction, subtract the value on the
left from the pixel value on the right. To calculate the gradient in the y-direction, subtract the
pixel value below from the pixel value above the selected pixel.
● Hence the resultant gradients in the x and y direction for this pixel are:
○ Change in X direction(Gx) = 89 – 78 = 11
○ Change in Y direction(G y) = 68 – 56 = 8
● This process will give us two new matrices – one storing gradients in the x-direction and the
other storing gradients in the y direction.
● The next step would be to find the magnitude and orientation using these values.
Using the gradients we determine the magnitude and direction for each pixel value. For this step, we
will use the Pythagoras theorem .
The gradients are basically the base and perpendicular of a right triangle here. So, for example, with Gx
and Gy as 11 and 8, the total gradient magnitude is √(Gx² + Gy²) = √(11² + 8²) = √185 ≈ 13.6.
Next, calculate the orientation (or direction) for the same pixel.
tan(Φ) = Gy / Gx
The orientation comes out to be approximately 36° when we plug in the values (arctan(8/11) ≈ 36°).
This way, for every pixel value we have the total gradient (magnitude) and the orientation (direction).
We then generate the histogram using these gradients and orientations.
We take the angle or orientation on the x-axis and the frequency on the y-axis.
Method 1: We take each pixel value, find the orientation of the pixel and update the frequency
table. For example, consider the highlighted pixel (85). Since the orientation for this pixel is 36, we
add a count against the angle value 36, denoting the frequency:
This frequency table can be used to generate a histogram with angle values on the x-axis and the
frequency on the y-axis.
Method 2: Here we have a bin size of 20°, so the number of buckets we get is 9.
For each pixel, the orientation is added to the count of the corresponding bin, giving a 9 x 1
matrix. Plotting this would give us the histogram:
Method 3: Here is another way in which we can generate the histogram – instead of using the
frequency, we can use the gradient magnitude to fill the values in the matrix.
For example, a pixel with orientation 30 would update the bin 20 only; additionally, we should
give some weight to the neighbouring bin as well.
Method 4: Let’s make a small modification to the above method. Here, we add the contribution
of a pixel’s gradient to the bins on either side of the pixel’s orientation. Remember, the higher
contribution should go to the bin that is closer to the orientation.
The histograms created in the HOG feature descriptor are not generated for the whole image.
Instead, the image is divided into 8×8 cells, and the histogram of oriented gradients is computed for
each cell. If we divide the image into 8×8 cells and generate the histograms, we get a 9 x 1
matrix for each cell. This matrix is generated using Method 4 described above.
Once we have generated the HOG for the 8×8 patches in the image, the next step is to normalize the
histogram.
The gradients of the image are sensitive to the overall lighting. This means that for a particular
picture, some portion of the image would be very bright as compared to the other portions.
But we can reduce this lighting variation by normalizing the gradients by taking 16×16 blocks. Here
is an example that can explain how 16×16 blocks are created:
Here, we will be combining four 8×8 cells to create a 16×16 block. And we already know that each
8×8 cell has a 9×1 histogram, so a block has four 9×1 matrices, i.e. a single 36×1 vector. To normalize
this vector V = [a1, a2, a3, …, a36], we compute k = √(a1² + a2² + … + a36²) and divide all the values
in the vector V by this value k.
Now, we combine all these 16×16 blocks to get the features for the final image.
We would have 105 (7×15) blocks of 16×16. Each of these 105 blocks has a vector of 36×1 as
features. Hence, the total features for the image would be 105 x 36×1 = 3780 features.
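A minimal sketch using scikit-image's hog function with the parameters described above (64x128 input, 9 bins, 8x8 cells, 2x2-cell blocks); the image path is a placeholder:

import cv2
from skimage.feature import hog

img = cv2.imread('person.png', cv2.IMREAD_GRAYSCALE)   # placeholder path
img = cv2.resize(img, (64, 128))                        # width:height = 1:2

# 7 x 15 = 105 blocks, each contributing a 36-value vector -> 3780 features.
features = hog(img,
               orientations=9,
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               block_norm='L2-Hys')
print(features.shape)   # (3780,)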
Shape Context Descriptors
The shape context at a reference point captures the distribution of the remaining contour points
relative to it in a coarse log-polar histogram.
● The bins are normally taken to be uniform in log-polar space. The shape contexts of two
different versions of the letter "A" are shown:
○ (a) and (b) are the sampled edge points of the two shapes.
○ (c) is the diagram of the log-polar bins used to compute the shape context.
○ (d) is the shape context for the point marked with a circle in (a), (e) is that for the
point marked as a diamond in (b), and (f) is that for the triangle. As can be seen,
since (d) and (e) are the shape contexts for two closely related points, they are quite
similar, while the shape context in (f) is very different.
● For a feature descriptor to be useful, it needs to have certain invariances.
○ Translational invariance comes naturally to shape context.
○ Scale invariance is obtained by normalizing all radial distances by the mean distance
α between all the point pairs in the shape .
● The shape of an object is essentially captured by a finite subset of the points on the internal
or external contours on the object. These can be simply obtained using the Canny edge
detector and picking a random set of points from the edges.
● For each point pi on the shape, consider the n − 1 vectors obtained by connecting pi to all
other points. The set of all these vectors is a rich description of the shape localized at that
point.
● Consider two points p and q that have normalized K-bin histograms (i.e. shape contexts)
g(k) and h(k). As shape contexts are represented as histograms, it is natural to use the χ2 test
statistic as the "shape context cost" of matching the two points:
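In its standard form this cost is CS(pi, qj) = ½ Σk [ g(k) − h(k) ]² / [ g(k) + h(k) ], where the sum runs over the K histogram bins.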
● In addition to the shape context cost, an extra cost based on the appearance can be added.
Now the total cost of matching the two points could be a weighted-sum of the two costs:
● Now for each point pi on the first shape and a point qj on the second shape, calculate the
cost as described and call it Ci,j. This is the cost matrix.
● Now, a one-to-one matching π is sought that matches each point pi on shape 1 to a point qπ(i) on
shape 2 and minimizes the total cost of matching.
● Given the set of correspondences between a finite set of points on the two shapes, a
transformation can be estimated to map any point from one shape to the other.
● There are several choices for this transformation, described below.
Affine
The affine model is T(p) = A p + o, where A is the affine transformation matrix and o is a translation
offset vector.
● Now, measure a shape distance between two shapes P and Q. This distance is going to be a
weighted sum of three potential terms:
○ Shape context distance: this is the symmetric sum of shape context matching costs
over best matching points:
○ Transformation cost: The final cost measures how much transformation is necessary
to bring the two images into alignment.
Morphological operations
All morphological processing operations are based on the following terms.
Structuring Element: It is a matrix or a small-sized template that is used to traverse an image. The
structuring element is positioned at all possible locations in the image, and it is compared with the
connected pixels. It can be of any shape.
● Fit: When all the pixels in the structuring element cover the pixels of the object, we call it
Fit.
● Hit: When at least one of the pixels in the structuring element covers a pixel of the object,
we call it Hit.
● Miss: When no pixel in the structuring element covers a pixel of the object, we call it
Miss.
Morphological Operations
The structuring element is moved across every pixel in the original image to give a pixel in a new
processed image. The value of this new pixel depends on the morphological operation performed.
1. Erosion
Erosion shrinks the image pixels, or erosion removes pixels on object boundaries. First, we traverse
the structuring element over the image object to perform the erosion operation, as shown in the figure.
The output pixel values are calculated using the erosion equation f − s (a pixel is kept only where the
structuring element fits the object).
An example of erosion is shown in the figure: (a) represents the original image, while (b) and (c)
show the processed images after erosion using 3x3 and 5x5 structuring elements respectively.
Properties:
2. Dilation
Dilation expands the image pixels, or it adds pixels on object boundaries. First, we traverse the
structuring element over the image object to perform a dilation operation, as shown in the figure. The
output pixel values are calculated using the dilation equation f + s (also written f ∪ s for binary images).
An example of dilation is shown in the figure: (a) represents the original image, while (b) and (c)
show the processed images after dilation using 3x3 and 5x5 structuring elements respectively.
Properties:
Compound Operations
Most morphological operations are not performed using dilation or erosion alone; instead, they are
performed by using both. The two most widely used compound operations are:
(a) Closing (computed by first performing dilation and then erosion), and
(b) Opening, denoted by f * s, computed by first performing erosion and then dilation:
f * s = (f − s) + s
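A minimal OpenCV sketch of erosion, dilation, opening and closing on a binary image (the path and the 3x3 structuring element are illustrative choices):

import cv2
import numpy as np

img = cv2.imread('binary_shape.png', cv2.IMREAD_GRAYSCALE)   # placeholder path
_, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

kernel = np.ones((3, 3), np.uint8)                # 3x3 structuring element

eroded = cv2.erode(img, kernel, iterations=1)     # shrinks object boundaries
dilated = cv2.dilate(img, kernel, iterations=1)   # expands object boundaries

opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)    # erosion then dilation
closed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)   # dilation then erosion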
Boundary Extraction
Extracting the boundary is an important process for gaining information about and understanding the
features of an image. It is the first process in preprocessing to present the image’s characteristics. This
process can help the researcher acquire data from the image. We can perform boundary extraction
of an object by following the steps below.
Step 1. Create an image (E) by erosion process; this will shrink the image slightly. The kernel size
of the structuring element can be varied accordingly.
Step 2. Subtract image E from the original image. By performing this step, we get the boundary of
our object.
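A minimal OpenCV sketch of these two steps (the path and kernel size are illustrative):

import cv2
import numpy as np

img = cv2.imread('binary_shape.png', cv2.IMREAD_GRAYSCALE)   # placeholder path
_, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

kernel = np.ones((3, 3), np.uint8)

# Step 1: erode the object slightly.
eroded = cv2.erode(img, kernel, iterations=1)

# Step 2: subtract the eroded image from the original to keep only the boundary.
boundary = cv2.subtract(img, eroded)
cv2.imwrite('boundary.png', boundary)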