3. INTRODUCTION
3.1 MORPHOLOGY
Morphology has been widely used in image processing, especially for noise removal. An adaptive algorithm is developed for determining, from a given class of grayscale morphological filters, the filter that minimizes the mean square error between its output and a desired process. The adaptation, using the conventional least-mean-square algorithm, optimizes the grayscale structuring element over a given search area. The noise-removal performance is compared with that of another class of nonlinear filters, i.e., adaptive and non-adaptive stack-based filters.
Morphology is a technique of image processing based on shapes. Morphological operations apply a structuring element to an input image, creating an output image of the same size. The value of each pixel in the output image is based on a comparison of the corresponding pixel in the input image with its neighbors. By choosing the size and shape of the neighborhood, you can construct a morphological operation that is sensitive to specific shapes in the input image. Morphological operations are first defined on grayscale images, where the source image is planar (single-channel); the definition can then be extended to full-color images.
Mathematical morphology (MM) is a theory and technique for the analysis and processing of
geometrical structures, based on set theory, lattice theory, topology, and random functions.
MM is most commonly applied to digital images, but it can be employed as well
on graphs, surface meshes, solids, and many other spatial structures.
Topological and geometrical continuous-space concepts such as size, shape, convexity, connectivity, and geodesic distance can be characterized by MM on both continuous and discrete spaces. MM is also the foundation of morphological image processing, which
consists of a set of operators that transform images according to the above characterizations.
Structuring element
The basic idea in binary morphology is to probe an image with a simple, pre-defined
shape, drawing conclusions on how this shape fits or misses the shapes in the image. This
simple "probe" is called structuring element, and is itself a binary image (i.e., a subset of the
space or grid).
Here are some examples of widely used structuring elements (denoted by B):
Let E = Z²; B is the "cross" given by: B = {(−1, 0), (0, −1), (0, 0), (0, 1), (1, 0)}.
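As a quick illustration (a sketch only, assuming Python with NumPy, which is not part of the thesis implementation), this cross can be written as a small binary mask whose center cell is the origin:

    import numpy as np

    # The "cross" structuring element B: rows and columns correspond to
    # offsets -1, 0, +1 around the origin at the center cell.
    B_cross = np.array([[0, 1, 0],
                        [1, 1, 1],
                        [0, 1, 0]], dtype=bool)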
Most software systems also provide some degree of manual editing tools that allow the user to draw lines or paint pixels to “touch up” the original binary image. This is often the
fastest way to correct a minor defect, but it should be used with caution and only as a last
resort because of the dangers of inconsistent results or accidentally impressing the user’s
hopes and desires onto the actual data. These tools are very similar in use to most computer
drawing or painting programs and their use should be obvious with a little practice. In
Photoshop, the tools include the paintbrush, eraser, line drawing tool, and the ability to
select a region of interest and outline, fill or erase it. The notes here concentrate on the
computer-processing algorithms.
In Photoshop, any selection region (marquee tool, lasso, wand) can be converted to an "alpha channel", an 8-bit grayscale image plane. This will appear in the Channels list. If an alpha channel is selected for display along with the image, it will appear as a colored overlay that is clear on the selected features and tinted on the background (the opposite of the convention used in the thresholding previews in Part …). If the alpha channel is selected for display by itself, it appears as a black and white image, and should be inverted to make the selection be the foreground and the unselected region be the background.
The convention adopted in the Image Processing Tool Kit and in Fovea Pro is that
features are black on a white background. This is consistent with Macintosh programs
which treat white as the background color for both screen and printouts, with many
modern Windows programs, and also with the convention used in the measurement
routines discussed in Part 7. However, some Windows programs that began their
existence in a DOS environment still treat the background as black and features as bright,
corresponding to monochrome CRT displays; this may create confusion if you go back
and forth between multiple programs. Binary image operations discussed in this section,
and measurement operations in Parts 7 and 8, consider white pixels to be background and
all other pixels (colored, grey, or black) to be parts of features.
The basic operations are shift-invariant (translation-invariant) operators strongly related to Minkowski addition.
Dilation is one of the two basic operators in the area of mathematical morphology, the other being erosion. It is typically applied to binary images, but there are versions that work on grayscale images. The basic effect of the operator on a binary image is to gradually enlarge the boundaries of regions of foreground pixels (i.e., white pixels, typically). Thus areas of foreground pixels grow in size while holes within those regions become smaller. The dilation operator takes two pieces of data as inputs. The first is the image which is to be dilated. The second is a (usually small) set of coordinate points known as a structuring element. It is this structuring element that determines the precise effect of the dilation on the input image. Dilation fills in valleys and enlarges the width of maximum regions, so it can remove negative impulse noise but does little to positive impulse noise.
Any pixel in the output image touched by the origin (·) of the structuring element is set to ON when any point of the structuring element touches an ON pixel in the original image. This
tends to close up holes in an image by expanding the ON regions. It also makes objects
larger. Note that the result depends upon both the shape of the structuring element and the
location of its origin.
3.3.1 Dilation
The dilation of the dark-blue square by a disk, resulting in the light-blue square with rounded
corners.
If B has a center on the origin, as before, then the dilation of A by B can be understood
as the locus of the points covered by B when the center of B moves inside A. In the above
example, the dilation of the square of side 10 by the disk of radius 2 is a square of side 14, with rounded corners, centered at the origin. The radius of the rounded corners is 2.
Example application: Dilation is the opposite of erosion. Figures that are very lightly drawn become thick when "dilated". The easiest way to describe it is to imagine the same fax/text written with a thicker pen.
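A minimal sketch of binary dilation (again assuming NumPy and SciPy): a single ON pixel grows into the shape of the structuring element.

    import numpy as np
    from scipy import ndimage

    # Cross structuring element with its origin at the center.
    B_cross = np.array([[0, 1, 0],
                        [1, 1, 1],
                        [0, 1, 0]], dtype=bool)

    img = np.zeros((7, 7), dtype=bool)
    img[3, 3] = True                            # a single ON pixel

    dilated = ndimage.binary_dilation(img, structure=B_cross)
    print(dilated.astype(int))                  # the pixel grows into a cross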
3.3.2 EROSION
Erosion is one of the two basic operators in the area of mathematical morphology, the
other being dilation. It is typically applied to binary images, but there are versions that work
on grey scale images. The basic effect of the operator on a binary image is to erode away the
boundaries of regions of foreground pixels (i.e. white pixels, typically). Thus areas of
foreground pixels shrink in size, and holes within those areas become larger (Serra J, 1982).
It is known that erosion reduces the peaks and enlarges the widths of minimum regions, so it can remove positive impulse noise but has little effect on negative impulse noise (Paul T. Jackway et al., 1996).
Let (B̂)s be the reflection of B about its origin, followed by a shift by s. Dilation, written A ⊕ B, is the set of all shifts s that satisfy:

A ⊕ B = { s | (B̂)s ∩ A ≠ ∅ }

Equivalently, A ⊕ B = { a + b | a ∈ A, b ∈ B }, the Minkowski sum of A and B.
The erosion of the dark-blue square by a disk, resulting in the light-blue square.
When the structuring element B has a center (e.g., B is a disk or a square), and this
center is located on the origin of E, then the erosion of A by B can be understood as
the locus of points reached by the center of B when B moves inside A. For example, the
erosion of a square of side 10, centered at the origin, by a disc of radius 2, also centered at the
origin, is a square of side 6 centered at the origin.
Example application: Assume we have received a fax of a dark photocopy. Everything looks as if it was written with a pen that is bleeding. The erosion process will allow the thick lines to become thin, and the hole inside the letter "o" to be detected.
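The companion sketch for erosion (NumPy/SciPy assumed): eroding a solid 5 × 5 square with the cross strips off its one-pixel border, leaving only the 3 × 3 interior.

    import numpy as np
    from scipy import ndimage

    B_cross = np.array([[0, 1, 0],
                        [1, 1, 1],
                        [0, 1, 0]], dtype=bool)

    img = np.zeros((9, 9), dtype=bool)
    img[2:7, 2:7] = True                        # solid 5x5 foreground square

    eroded = ndimage.binary_erosion(img, structure=B_cross)
    print(eroded.astype(int))                   # only the 3x3 interior survives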
3.3.3 Opening
The opening of the dark-blue square by a disk, resulting in the light-blue square with
round corners.
The opening of A by B is obtained by the erosion of A by B, followed by dilation of the resulting image by B: A ∘ B = (A ⊖ B) ⊕ B, which means that it is the locus of translations of the structuring element B inside the image A. In the case of the square of side 10, and a disk of radius 2 as the structuring element, the opening is a square of side 10 with rounded corners, where the corner radius is 2.
3.3.4 Closing
The closing of the dark-blue shape (union of two squares) by a disk, resulting in the union of the dark-blue shape and the light-blue areas. The closing of A by B is obtained by the dilation of A by B, followed by erosion of the resulting structure by B: A • B = (A ⊕ B) ⊖ B.
Erosion, Dilation and their combined uses are ways to add or remove pixels from the boundaries of features in order to smooth them, to join separated portions of features or separate touching features, and to remove isolated pixel noise from the image. Dilation turns pixels “on” according to rules based on the number or arrangement of neighboring pixels, and Erosion turns pixels “off” according to similar rules, while Opening (an erosion followed by a dilation) and Closing (the reverse sequence) attempt to restore the original area of features but with some rearrangement of the boundary pixels.
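Opening and closing can be written directly as these compositions. The sketch below (NumPy/SciPy assumed) mirrors the definitions above; SciPy's binary_opening and binary_closing implement the same sequences.

    from scipy import ndimage

    def opening(img, se):
        # Erosion followed by dilation: removes small protrusions and noise.
        return ndimage.binary_dilation(ndimage.binary_erosion(img, se), se)

    def closing(img, se):
        # Dilation followed by erosion: fills small holes and gaps.
        return ndimage.binary_erosion(ndimage.binary_dilation(img, se), se)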
Fig 3.6 Original image.
The plug-in routines are actually capable of more general use, and allow selection of the neighbor coefficient and number of iterations; these values are remembered (in a disk file) for future use by Erosion, Dilation, Opening and Closing. The Coefficient is a threshold on the number of neighbors (out of the total of 8) that must be of the opposite color (black vs. white) for the central pixel to change. Classic dilation corresponds to a coefficient of zero, since a white pixel becomes black if ANY (more than zero) of its neighbors is black. The number of iterations (called the Depth) is the number of repetitions of erosion and/or dilation. In an opening (or closing) the iterations of erosion (or dilation) are performed first, followed by the same number of dilations (or erosions). You select each of these specific morphological operations as Erode, Dilate, Open, or Close.
The second method proposed is block analysis, where the entire image is split into a number of blocks and each block is enhanced individually. The next proposed method is the erosion-dilation method, which is similar to block analysis but uses morphological operations (erosion and dilation) on the entire image rather than splitting it into blocks. All these methods were initially applied to gray-level images and later extended to colour images by splitting the colour image into its respective R, G and B components, enhancing them individually and concatenating them to yield the enhanced image. All the above-mentioned techniques operate on the image in the spatial domain. The final method is the DCT, where the frequency domain is used. Here we scale the DC coefficients of the image after the DCT has been taken. The DC coefficient is adjusted because it contains the maximum information. We move from the RGB domain to the YCbCr domain for processing and, in YCbCr, adjust (scale) the DC coefficient, i.e., Y(0,0). The image is converted from RGB to YCbCr because if the image is enhanced without converting, there is a good chance that it may yield an undesired output image. The enhancement of images is done using the log operator [1]. This is chosen because it avoids abrupt changes in lighting. For example, if two adjacent pixel values are 10 and 100, their difference on the normal scale is 90. But on the logarithmic (base-10) scale, this difference reduces to just 1, thus providing a good platform for image enhancement.
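The arithmetic behind that example, and a generic log mapping, look like this in sketch form (NumPy assumed; the exact operator of [1] may differ in its scaling constants):

    import numpy as np

    print(100 - 10)                         # 90 on the linear scale
    print(np.log10(100) - np.log10(10))     # 1.0 on the log10 scale

    def log_enhance(img):
        # Map intensities with log(1 + I), then rescale to [0, 255].
        out = np.log1p(img.astype(np.float64))
        return np.round(255.0 * out / out.max()).astype(np.uint8)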
There are also techniques based on statistical data analysis, such as global and local histogram equalization. In the histogram equalization process, gray-level intensities are redistributed over the entire range to obtain a uniformly spread histogram, keeping all the distributed values nearly the same. The enhancement level is not significant: it provides good results only for certain images and fails to provide good results for most images, especially those taken under poor lighting. In other words, it does not provide good performance for detail preservation. Many algorithms have been proposed for the enhancement of images taken under poor lighting, but some methods prove better than others.
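For reference, a compact global histogram-equalization sketch for 8-bit images (NumPy assumed), showing the uniform spreading described above:

    import numpy as np

    def hist_equalize(img):
        # Histogram of the 8-bit image and its normalized cumulative sum.
        hist = np.bincount(img.ravel(), minlength=256)
        cdf = hist.cumsum().astype(np.float64)
        cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())
        lut = np.round(255.0 * cdf).astype(np.uint8)    # intensity mapping
        return lut[img]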
Mathematical morphology is a tool for extracting image components that are useful
for representation and description. The content of mathematical morphology is completely
based on set theory. By using set operations, there are many useful operators defined in
mathematical morphology. They are dilation, erosion, opening and closing. Morphological
operations apply structuring elements to an input image, creating an output image of the same
size. The structuring element determines exactly how the object will be dilated or eroded. Irrespective of the size of the structuring element, the origin is located at its center.
Block Analysis:
For Gray level images:
Let f be the original image, which is subdivided into a number of blocks, each block being a sub-image of the original image.
It is clear that the background parameter depends entirely on the background criterion value τi. For f = τi, the background parameter takes the maximum intensity value Mi within the analysed block, and the minimum intensity value mi otherwise. In order to avoid an indeterminate condition, unity was added to the logarithmic function.
The greater the number of blocks, the better the quality of the enhanced image. In the enhanced images, it can be seen that objects that are not clearly visible in the original image are revealed. As the size of the structuring element increases it becomes hard to preserve the image, as blurring and contouring effects become severe. The best results are obtained by keeping the size of the structuring element at 1 (µ = 1). A sample input (left half of the image) and output image (right half) for block analysis is shown below:
Erosion-Dilation method:
For Gray level images:
This method is similar to block analysis in many ways, apart from the fact that the manipulation is done on the image as a whole rather than partitioning it into blocks. First, the minimum intensity Imin(x) and maximum intensity Imax(x) contained in a structuring element (B) of elemental size 3 × 3 are calculated.
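In sketch form (SciPy assumed), grayscale erosion and dilation over a 3 × 3 window yield exactly these Imin(x) and Imax(x) images:

    from scipy import ndimage

    def local_min_max(img, size=3):
        i_min = ndimage.grey_erosion(img, size=(size, size))    # Imin(x)
        i_max = ndimage.grey_dilation(img, size=(size, size))   # Imax(x)
        return i_min, i_max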
By employing the erosion-dilation method we obtain a better local analysis of the image for detecting the background criterion than with the previously used method of blocks. This is because the structuring element µB permits the analysis of the eight neighboring pixels at each point in the image. By increasing the size of the structuring element, more pixels are taken into account when finding the background criterion. It can easily be seen that several characteristics that are not visible at first sight appear in the enhanced images.
The trouble with this method is that when morphological erosion or dilation is used with a large size of µ to reveal the background, undesired values may be generated.
In general it is desirable to filter an image without generating any new components. The transformation that eliminates unnecessary parts without affecting other regions of the image is defined in mathematical morphology as transformation by reconstruction.
We choose opening by reconstruction because it restores the original shape of the objects in the image that remain after erosion, as it touches the regional minima and merges the regional maxima (as shown in Fig opr).
Here, maxint refers to the maximum gray-level intensity, which is equal to 255. If the intensity of the background increases, the image becomes lighter because of the additive effect of the whiteness (i.e., maximum intensity) of the background. It is to be remembered that the objective of opening by reconstruction is to preserve the shape of the image components that remain after erosion.
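A sketch of the operation (using scikit-image's reconstruction routine as a stand-in; the thesis work itself was done in MATLAB): erode the image to obtain a marker, then reconstruct by dilation under the original image as the mask.

    from scipy import ndimage
    from skimage.morphology import reconstruction

    def opening_by_reconstruction(img, size=3):
        seed = ndimage.grey_erosion(img, size=(size, size))   # marker image
        # Reconstruction by dilation restores the shapes of the components
        # that survive the erosion, without creating new ones.
        return reconstruction(seed, img, method='dilation')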
The methods of block analysis, erosion-dilation and opening by reconstruction face a common problem of over-enhancement in the resultant image. To be precise, these methods provide no proper way to control the enhancement. Another issue is that the best results are obtained only for high-resolution images. To overcome these problems we turn to transform-domain, i.e., frequency-domain, techniques.
3.4 CANNY EDGE DETECTION
1) The first step is to filter out noise in the original image before locating any edges. This is done by smoothing the image with a Gaussian filter (a discrete approximation with σ = 1.4 is shown in Figure 3.18).
Figure 3.18 Discrete approximation to Gaussian function with σ = 1.4.
2) After smoothing the image and eliminating the noise, the next step is to find the edge strength by taking the gradient of the image. The Sobel operator performs a 2-D spatial gradient measurement on an image. Then, the approximate absolute gradient magnitude (edge strength) at each point can be found. The Sobel operator uses a pair of 3x3 convolution masks, one estimating the gradient in the x-direction (columns) and the other estimating the gradient in the y-direction (rows). The standard masks are:

Gx:  −1  0  +1        Gy:  +1  +2  +1
     −2  0  +2               0   0   0
     −1  0  +1              −1  −2  −1
The magnitude, or edge strength, of the gradient is then approximated using the
formula:
|G| = |Gx| + |Gy| (3.1)
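Steps 1 and 2 in sketch form (NumPy/SciPy assumed): Gaussian smoothing, Sobel gradients, and the magnitude approximation of equation (3.1).

    import numpy as np
    from scipy import ndimage

    def gradient_magnitude(img, sigma=1.4):
        smoothed = ndimage.gaussian_filter(img.astype(np.float64), sigma)
        gx = ndimage.sobel(smoothed, axis=1)     # x-direction (columns)
        gy = ndimage.sobel(smoothed, axis=0)     # y-direction (rows)
        return np.abs(gx) + np.abs(gy), gx, gy   # |G| = |Gx| + |Gy|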
3) Finding the edge direction is trivial once the gradients in the x and y directions are known. However, the formula will generate an error whenever sumX (the x-gradient) is equal to zero, so the code has to set a restriction for when this takes place. Whenever the gradient in the x-direction is equal to zero, the edge direction has to be equal to 90 degrees or 0 degrees, depending on the value of the gradient in the y-direction: if Gy has a value of zero, the edge direction equals 0 degrees; otherwise it equals 90 degrees. The formula for finding the edge direction is simply:
θ = tan⁻¹(Gy / Gx) (3.2)
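In code, np.arctan2 handles the Gx = 0 cases described above automatically, returning 90 degrees when Gy is nonzero and 0 degrees when both gradients are zero (a sketch, NumPy assumed):

    import numpy as np

    def edge_direction(gx, gy):
        theta = np.degrees(np.arctan2(gy, gx))   # range (-180, 180]
        return np.mod(theta, 180.0)              # fold into [0, 180)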
4) Once the edge direction is known, the next step is to relate the edge direction to a direction
that can be traced in an image. So if the pixels of a 5x5 image are aligned as follows:
x x x x x
x x x x x
x x a x x
x x x x x
x x x x x
Then, looking at pixel "a", it can be seen that there are only four possible directions when describing the surrounding pixels: 0 degrees (in the horizontal direction), 45 degrees (along the positive diagonal), 90 degrees (in the vertical direction), or 135 degrees (along the negative diagonal). So now the edge orientation has to be resolved into one of these four directions depending on which direction it is closest to (e.g. if the orientation angle is found to be 4 degrees, make it zero degrees). Think of this as taking a semicircle and dividing it into 5 regions.
Therefore, any edge direction falling within the yellow range (0 to 22.5 and 157.5 to 180 degrees) is set to 0 degrees. Any edge direction falling in the green range (22.5 to 67.5 degrees) is set to 45 degrees. Any edge direction falling in the blue range (67.5 to 112.5 degrees) is set to 90 degrees. And finally, any edge direction falling within the red range (112.5 to 157.5 degrees) is set to 135 degrees.
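The same quantization in sketch form (NumPy assumed): snap every direction in [0, 180) to the nearest of 0, 45, 90 and 135 degrees, with both ends of the semicircle mapping to 0.

    import numpy as np

    def quantize_direction(theta):
        # Round to the nearest multiple of 45 degrees; 180 wraps back to 0.
        return (np.round(np.asarray(theta) / 45.0) % 4) * 45.0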
5) After the edge directions are known, nonmaximum suppression has to be applied. Nonmaximum suppression traces along the edge in the edge direction and suppresses any pixel value (setting it equal to 0) that is not considered to be an edge. This gives a thin line in the output image.
6) Finally, hysteresis is used as a means of eliminating streaking. Streaking is the breaking up
of an edge contour caused by the operator output fluctuating above and below the threshold.
If a single threshold, T1 is applied to an image, and an edge has an average strength equal to
T1, then due to noise, there will be instances where the edge dips below the threshold.
Equally it will also extend above the threshold making an edge look like a dashed line. To
avoid this, hysteresis uses 2 thresholds, a high and a low. Any pixel in the image that has a
value greater than T1 is presumed to be an edge pixel, and is marked as such immediately.
Then, any pixels that are connected to this edge pixel and that have a value greater than T2
are also selected as edge pixels. If you think of following an edge, you need a gradient of T1 to start, but you don't stop till you hit a gradient below T2.
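A sketch of hysteresis thresholding (SciPy assumed), with T1 the high threshold and T2 the low one; weak candidate pixels are kept only if they are connected to a strong pixel.

    from scipy import ndimage

    def hysteresis(mag, t1, t2):
        strong = mag > t1    # definite edge pixels (above the high threshold)
        weak = mag > t2      # candidate pixels (above the low threshold)
        # Grow the strong set through the weak set: a weak pixel survives
        # only if a chain of weak pixels links it to a strong one.
        return ndimage.binary_propagation(strong, mask=weak)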
The Hough transform is a standard computer vision algorithm that can be used to
determine the parameters of simple geometric objects, such as lines and circles, present in an
image. The circular Hough transform can be employed to deduce the radius and centre
coordinates of the pupil and iris regions. An automatic segmentation algorithm based on the
circular Hough transform is employed by Wildes et al. Firstly, an edge map is generated by
calculating the first derivatives of intensity values in an eye image and then thresholding the
result. From the edge map, votes are cast in Hough space for the parameters of circles passing
through each edge point.
A maximum point in the Hough space will correspond to the radius and centre coordinates of the circle best defined by the edge points. (In the circle parameterization, θ is the angle of rotation relative to the x-axis.) In performing the preceding edge detection step, Wildes et al. bias the derivatives in
the horizontal direction for detecting the eyelids, and in the vertical direction for detecting the
outer circular boundary of the iris. The motivation for this is that the eyelids are usually
horizontally aligned, and also the eyelid edge map will corrupt the circular iris boundary edge
map if using all gradient data. Taking only the vertical gradients for locating the iris boundary
will reduce the influence of the eyelids when performing the circular Hough transform, and not all of the edge pixels defining the circle are required for successful localization. Not only does this make circle localization more accurate, it also makes it more efficient, since there are fewer edge points to cast votes in the Hough space.
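A minimal circular-Hough voting sketch (NumPy assumed; the number of angular samples and the radius range are illustrative choices, not those of the thesis). Each edge point votes for the centres of all circles of each candidate radius passing through it, and the accumulator maximum gives the best (radius, centre) combination.

    import numpy as np

    def circular_hough(edge_map, radii):
        h, w = edge_map.shape
        acc = np.zeros((len(radii), h, w))
        ys, xs = np.nonzero(edge_map)
        angles = np.linspace(0, 2 * np.pi, 100, endpoint=False)
        for i, r in enumerate(radii):
            dx = np.round(r * np.cos(angles)).astype(int)
            dy = np.round(r * np.sin(angles)).astype(int)
            for y, x in zip(ys, xs):
                cx, cy = x + dx, y + dy
                ok = (cx >= 0) & (cx < w) & (cy >= 0) & (cy < h)
                np.add.at(acc[i], (cy[ok], cx[ok]), 1)   # cast centre votes
        # Index of the maximum vote: (radius index, centre row, centre col).
        return np.unravel_index(np.argmax(acc), acc.shape)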
3.5 Daugman’s Rubber Sheet Model
The homogeneous rubber sheet model devised by Daugman remaps each point within the iris region to a pair of polar coordinates (r, θ), where r is on the interval [0, 1] and θ is an angle in [0, 2π]. The remapping of the iris region from (x, y) Cartesian coordinates to the normalized non-concentric polar representation is modelled as:
I( x(r, θ), y(r, θ) ) → I(r, θ)

with

x(r, θ) = (1 − r) xp(θ) + r xl(θ)
y(r, θ) = (1 − r) yp(θ) + r yl(θ)
where I(x, y) is the iris region image, (x, y) are the original Cartesian coordinates, (r, θ) are the corresponding normalized polar coordinates, and xp, yp and xl, yl are the coordinates of the pupil and iris boundaries along the θ direction. The rubber sheet model takes into account pupil dilation and size inconsistencies in order to produce a normalized representation with constant dimensions. In this way the iris region is modeled as a flexible rubber sheet anchored at the iris boundary with the pupil centre as the reference point.
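Under the simplifying assumption of circular pupil and iris boundaries with known centres and radii (a sketch, NumPy assumed, with nearest-neighbour sampling; the non-concentric case uses the remapping formula given below), the normalization can be written as:

    import numpy as np

    def rubber_sheet(img, cx, cy, r_pupil, r_iris,
                     radial_res=60, angular_res=250):
        out = np.zeros((radial_res, angular_res), dtype=img.dtype)
        thetas = np.linspace(0, 2 * np.pi, angular_res, endpoint=False)
        rs = np.linspace(0.0, 1.0, radial_res)
        for j, t in enumerate(thetas):
            # Pupil and iris boundary points along direction theta.
            xp, yp = cx + r_pupil * np.cos(t), cy + r_pupil * np.sin(t)
            xl, yl = cx + r_iris * np.cos(t), cy + r_iris * np.sin(t)
            for i, r in enumerate(rs):
                x = (1 - r) * xp + r * xl        # x(r, theta)
                y = (1 - r) * yp + r * yl        # y(r, theta)
                out[i, j] = img[int(round(y)), int(round(x))]
        return out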
Even though the homogeneous rubber sheet model accounts for pupil dilation, imaging distance and non-concentric pupil displacement, it does not compensate for rotational inconsistencies. In the Daugman system, rotation is accounted for during matching by shifting the iris templates in the θ direction until the two iris templates are aligned. For normalization of iris regions a technique based on Daugman's rubber sheet model was employed. The centre of the pupil was considered as the reference point, and radial vectors pass through the iris region, as shown in Figure 4.4. A number of data points are selected along each radial line; this is defined as the radial resolution. The number of radial lines going around the iris region is defined as the angular resolution. Since the pupil can be non-concentric with the iris, a remapping formula is needed to rescale points depending on the angle around the circle. This is given by:
r′ = √α β ± √(αβ² − α + rI²)

with

α = ox² + oy²
β = cos( π − arctan( oy / ox ) − θ )
where the displacement of the centre of the pupil relative to the centre of the iris is given by ox, oy, and r′ is the distance between the edge of the pupil and the edge of the iris at an angle θ around the region, and rI is the radius of the iris. The remapping formula first gives the radius of the iris region 'doughnut' as a function of the angle θ.
A constant number of points are chosen along each radial line, so that a constant
number of radial data points are taken, irrespective of how narrow or wide the radius is at a
particular angle. The normalized pattern was created by backtracking to find the Cartesian
coordinates of data points from the radial and angular position in the normalized pattern.
From the ‘doughnut’ iris region, normalization produces a 2D array with horizontal
dimensions of angular resolution and vertical dimensions of radial resolution. Another 2D
array was created for marking reflections, eyelashes, and eyelids detected in the segmentation
stage. In order to prevent non-iris region data from corrupting the normalized representation,
data points which occur along the pupil border or the iris border are discarded.
3.6 INFUSION OF EDGE AND REGION INFORMATION
To address the many problems outlined above, this work describes a new corner and edge detector developed from the phase congruency model of feature detection. The new
operator uses the principal moments of the phase congruency information to determine corner
and edge information. Phase congruency is a dimensionless quantity and provides
information that is invariant to image contrast. This allows the magnitudes of the principal
moments of phase congruency to be used directly to determine the edge and corner strength.
The minimum and maximum moments provide feature information in their own right; one does not have to look at their ratios. If the maximum moment of phase congruency at a point is large, then that point should be marked as an edge. If the minimum moment of phase congruency is also large, then that point should also be marked as a 'corner'. The hypothesis is that a large minimum moment of phase congruency indicates there is significant phase congruency in more than one orientation, making it a corner.
The resulting corner and edge operator is highly localized and the invariance of the
response to image contrast results in reliable feature detection under varying illumination
conditions with fixed thresholds. An additional feature of the operator is that the corner map
is a strict subset of the edge map. This facilitates the cooperative use of corner and edge
information.
Rather than assume a feature is a point of maximal intensity gradient, the local energy model postulates that features are perceived at points in an image where the Fourier components are maximally in phase, as shown in Figure 3.21.
Figure 3.21 Fourier series of a square wave and the sum of the first four terms.
Note how the Fourier components are all in phase at the point of the step in the square wave. Congruency of phase at any angle produces a clearly perceived feature. The angle at which the congruency occurs dictates the feature type, for example, step or delta. The measurement of phase congruency at a point in a signal can be seen geometrically in Figure 4.6. The local, complex-valued Fourier components at a location x in the signal will each have an amplitude An(x) and a phase angle Φn(x). Figure 4.6 plots these local Fourier components as complex vectors adding head to tail. The magnitude of the vector from the origin to the end point is the local energy, |E(x)|. Phase congruency is then the ratio of the local energy to the total path length of the vectors, PC(x) = |E(x)| / Σn An(x), which equals 1 when all components are in phase and falls toward 0 as their phases disperse.
MATLAB 7.9
MATLAB is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where problems
and solutions are expressed in familiar mathematical notation.
3.9.1 Flowchart:
3.9.2 ALGORITHM
1) The iris is first roughly localized by edge detection and the Hough transform.
2) Normalize the segmented iris image from Cartesian coordinates to the normalized non-concentric polar representation of size 60 × 250 using Daugman's rubber sheet model.
3) Apply the infusion of edge and region information.
4) Extract phase information from the infused image and generate the iris code for the enrolling image.
5) Extract phase information from the infused image and generate the iris code for the input image.
6) Find the Hamming distance between the binarized feature vectors obtained in steps 4 and 5 and the corresponding feature vector in the database.
7) If HD < 0.4, the subject is accepted as genuine; otherwise the subject is rejected.
Canny edge detection detects edges at the zero-crossings of the second directional derivative of the smoothed image, in the direction of the gradient, where the gradient magnitude of the smoothed image is greater than some threshold that depends on image statistics. Canny zero-crossings correspond to maxima and minima of the first directional derivative in the direction of the gradient. Maxima in magnitude are a reasonable choice for locating edges.
Extensive observations on the CASIA iris dataset show that the pupil and eyelash regions in iris images have lower intensity values and the reflection and eyelid regions have higher intensity values. Intuitively, good segmentation results could be obtained by a simple threshold. However, this result would inevitably be sensitive to changes of illumination and not robust for the subsequent recognition process. To overcome this problem, the boundary of the probable noise regions is first localized using the edge information from a Log-Gabor radial filter, and the pupil and eyelash regions are then marked as noise. In general, iris imaging cameras use infrared light for illumination, so the reflection regions can be stably characterized by high intensity values close to 255. A simple threshold can therefore be used to successfully remove the reflection noise. After the pupil, reflection and eyelash noise has been detected, the remaining edge information is the boundary between the iris and the eyelids, which gives the restricted regions where eyelid noise exists. The Hough transform is used within these restricted regions to accurately localize the eyelid noise, which speeds up the whole process.
The Hamming distance gives a measure of how many bits are the same between two bit patterns. Using the Hamming distance of two bit patterns, a decision can be made as to whether the two patterns were generated from the same iris.
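A sketch of the comparison (NumPy assumed; a complete matcher would also shift one template in the θ direction to handle rotation, as noted in Section 3.5):

    import numpy as np

    def hamming_distance(code_a, code_b, mask_a=None, mask_b=None):
        # Fraction of disagreeing bits, ignoring masked (noise) bits.
        valid = np.ones_like(code_a, dtype=bool)
        if mask_a is not None and mask_b is not None:
            valid = mask_a & mask_b
        return np.sum((code_a != code_b) & valid) / np.sum(valid)

    # Decision rule from the algorithm above: genuine if HD < 0.4.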
3.10 ADVANTAGES