DIGITAL IMAGE PROCESSING
2023-2024
PART – B (2 x 5 = 10 Marks)
(Answer ANY TWO questions)
(Open Choice – 2 out of 5 questions)
UNIT–I
Introduction: What is Digital Image Processing – The origins of DIP – Examples of fields that use
DIP – Fundamental steps in DIP – Components of an image processing system. Digital Image
Fundamentals: Elements of Visual perception – Light and the electromagnetic spectrum – Image
sensing and acquisition – Image sampling and Quantization – Some basic relationships between
Pixels – Linear & Nonlinear operations.
UNIT–II
Image Enhancement in the spatial domain:- Background – some basic Gray level
Transformations – Histogram Processing – Enhancement using Arithmetic / Logic operations –
Basics of spatial filtering – Smoothing spatial filters – Sharpening spatial filters – Combining
spatial enhancement methods.
UNIT–III
Image Restoration: A model of the Image Degradation / Restoration Process – Noise models –
Restoration in the presence of noise only – Spatial Filtering – Periodic Noise reduction by
frequency domain filtering – Linear, Position-Invariant Degradations – Estimating the
degradation function – Inverse filtering – Minimum mean square Error Filtering – Constrained
least squares filtering – Geometric mean filter – Geometric Transformations.
UNIT–IV
Image Compression: Fundamentals – Image compression models – Elements of Information
Theory – Error Free compression – Lossy compression – Image compression standards.
UNIT–V
Image Segmentation: Detection of Discontinuities – Edge Linking and Boundary detection –
Thresholding – Region-Based segmentation – Segmentation by Morphological watersheds – The
use of motion in segmentation.
REFERENCES
1. B. Chanda, D. Dutta Majumder, "Digital Image Processing and Analysis", PHI, 2003.
2. Nick Efford, "Digital Image Processing: A Practical Introduction Using Java", Pearson Education, 2004.
3. Todd R. Reed, "Digital Image Sequence Processing, Compression, and Analysis", CRC Press, 2015.
4. L. Prasad, S. S. Iyengar, "Wavelet Analysis with Applications to Image Processing", CRC Press, 2015.
OUTCOMES:
At the end of this course, students should be able to:
• Review the fundamental concepts of a digital image processing system and analyze images in the frequency domain using various transforms.
• Evaluate the techniques for image enhancement and image restoration, and categorize various compression techniques.
• Interpret image compression standards, and interpret image segmentation and representation techniques.
• Gain the ability to process images used in fields such as weather forecasting and the diagnosis of diseases such as tumors and cancers from medical images.
INTRODUCTION
o The field of digital image processing refers to processing digital images by means
of a digital computer.
What is an Image
o An image is nothing more than a two dimensional signal. It is defined by the mathematical
function f(x,y) where x and y are the two co-ordinates horizontally and vertically.
o The value of f(x,y) at any point gives the pixel value at that point of an image.
o An image displayed on a computer screen is, in fact, nothing but a two dimensional array of numbers ranging between 0 and 255.
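As a minimal illustration of this idea (a sketch using NumPy and a small synthetic array, which are assumptions not taken from the text), an 8-bit grayscale image is just a 2-D array f whose entry f[x, y] is the pixel value at that point:

import numpy as np

# A tiny synthetic 4 x 4 grayscale "image": values lie between 0 and 255.
f = np.array([[  0,  64, 128, 255],
              [ 32,  96, 160, 224],
              [ 16,  80, 144, 208],
              [  8,  72, 136, 200]], dtype=np.uint8)

x, y = 1, 2                      # a pair of (row, column) coordinates
print(f.shape)                   # (4, 4): M rows and N columns
print(f[x, y])                   # the pixel value f(x, y) -> 160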
Three types of computerized processes in this field:
• Low-level processes: primitive operations such as noise reduction, contrast enhancement and image sharpening; both inputs and outputs are images.
• Mid-level processes: tasks such as segmentation and the description of objects; inputs generally are images, but outputs are attributes extracted from those images.
• High-level processes: "making sense" of an ensemble of recognized objects, as in image analysis and computer vision.
Since digital image processing has very wide applications and almost all of the technical
fields are impacted by DIP, we will just discuss some of the major applications of DIP.
o In this electromagnetic spectrum, we are only able to see the visible spectrum. The visible spectrum mainly includes seven different colors, commonly termed VIBGYOR: violet, indigo, blue, green, yellow, orange and red.
o But that does not nullify the existence of other radiation in the spectrum. The human eye can see only the visible portion, in which we see all the objects, but a camera can see other things that the naked eye is unable to see, for example X rays, gamma rays, etc.
Hence the analysis of all that radiation is also done in digital image processing.
❖ Applications of Digital Image Processing
Some of the major fields in which digital image processing is widely used are mentioned below
• Medical field
• Remote sensing
• Machine/Robot vision
• Color processing
• Pattern recognition
• Video processing
• Microscopic Imaging
• Others
❖ Image sharpening and restoration
o Image sharpening and restoration refers to processing images in order to obtain a better result or to manipulate them in a desired way; it covers much of what tools such as Photoshop usually do.
o This includes zooming, blurring, sharpening, gray scale to color conversion and vice versa, detecting edges, image retrieval and image recognition.
❖ Medical field
The common applications of DIP in the medical field include:
• PET scan
• X Ray Imaging
• Medical CT
• UV imaging
❖ Remote sensing
o In the field of remote sensing , the area of the earth is scanned by a satellite or from a very
high ground and then it is analyzed to obtain information about it. One particular
application of digital image processing in the field of remote sensing is to detect
infrastructure damages caused by an earthquake.
o It can take a long time to assess the damage, even when the most serious damage is focused on, since the area affected by the earthquake is sometimes so wide that it is not possible to examine it with the human eye in order to estimate damages. Even when it is possible, it is a very hectic and time consuming procedure. A solution to this is found in digital image processing.
o An image of the affected area is captured from above the ground and then analyzed to detect the various types of damage done by the earthquake.
❖ Transmission and encoding
o In the early days of image transmission, the picture that was sent took three hours to reach from one place to another.
o Now just imagine that today we are able to see live video feeds or live CCTV footage from one continent to another with a delay of only seconds. It means that a lot of work has been done in this field too. This field does not focus only on transmission but also on encoding. Many different formats have been developed for high or low bandwidth to encode photos and then stream them over the internet.
❖ Hurdle detection
Hurdle detection is one of the common tasks performed through image processing, by identifying the different types of objects in the image and then calculating the distance between the robot and the hurdles.
❖ Line follower robots
Most of the robots today work by following a line and are thus called line follower robots. This helps a robot to move along its path and perform its tasks. This has also been achieved through image processing.
❖ Pattern recognition
Pattern recognition involves study from image processing and from various other fields, including machine learning (a branch of artificial intelligence). In pattern recognition, image processing is used for identifying the objects in an image, and machine learning is then used to train the system for changes in pattern. Pattern recognition is used in computer aided diagnosis, recognition of handwriting, recognition of images, etc.
❖ Video processing
A video is nothing but a very fast sequence of pictures. The quality of the video depends on the number of frames (pictures) per second and the quality of each frame being used. Video processing involves noise reduction, detail enhancement, motion detection, frame rate conversion, aspect ratio conversion, color space conversion, etc.
Fundamental Steps in Digital Image Processing:
o Image acquisition is the first process. Acquisition could be as simple as being given an image that is already in digital form. Generally, the image acquisition stage involves preprocessing, such as scaling.
o Image enhancement is the process of manipulating an image so that the result is
more suitable than the original for a specific application. The word specific is
important here, because it establishes at the outset that enhancement techniques are
problem oriented.
o Image restoration is an area that also deals with improving the appearance of an image. However, unlike enhancement, which is subjective, restoration is objective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result.
Figure: Fundamental steps in digital image processing – image acquisition, filtering and enhancement, restoration, color image processing, wavelets and multiresolution processing, compression, morphological processing, segmentation, representation & description, and object recognition, all supported by a knowledge base about the problem domain.
o Color image processing is an area that has been gaining in importance because of
the significant increase in the use of digital images over the Internet. Color is used
also in later chapters as the basis for extracting features of interest in an image.
o Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape. It marks a transition from processes that output images to processes that output image attributes.
o Segmentation procedures partition an image into its constituent parts or objects. In
general, autonomous segmentation is one of the most difficult tasks in digital image
processing. A rugged segmentation procedure brings the process a long way toward
successful solution of imaging problems that require objects to be identified
individually.
o Representation and description almost always follow the output of a segmentation stage, which usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary. The first decision that must be made is whether the data should be represented as a boundary or as a complete region.
Boundary representation is appropriate when the focus is on external shape
characteristics, such as corners and inflections. Regional representation is
appropriate when the focus is on internal properties, such as texture or skeletal
shape.
o Description, also called feature selection, deals with extracting attributes that result
in some quantitative information of interest or are basic for differentiating one class
of objects from another.
Components of an Image Processing System:
o Figure shows the basic components comprising a typical general-purpose system used for digital image processing.
Figure: Components of a general-purpose image processing system – image sensors, specialized image processing hardware, a computer with image processing software, mass storage, image displays, hardcopy devices and a network, all driven by the problem domain.
With reference to sensing, two elements are required to acquire digital images.
o The first is a physical device that is sensitive to the energy radiated by the object
we wish to image.
o The second, called a digitizer, is a device for converting the output of the physical sensing device into digital form.
Elements of Visual Perception:
Although digital image processing is built on a foundation of mathematical and
probabilistic formulations, human intuition and analysis play a central role in the choice of one
technique versus another, and this choice often is made based on subjective, visual judgments.
Structure of the Human Eye:
Figure shows a simplified horizontal cross section of the human eye. The eye is nearly a
sphere, with an average diameter of approximately 20 mm.
o The front of the iris contains the visible pigment of the eye, whereas the back
contains a black pigment. The lens is made up of concentric layers of fibrous cells
and is suspended by fibers that attach to the ciliary body. It contains 60 to 70%
water, about 6% fat, and more protein than any other tissue in the eye.
o The lens is colored by a slightly yellow pigmentation that increases with age. In
extreme cases, excessive clouding of the lens, caused by the affliction commonly
referred to as cataracts, can lead to poor color discrimination and loss of clear
vision. The lens absorbs approximately 8% of the visible light spectrum, with
relatively higher absorption at shorter wavelengths.
o Both infrared and ultraviolet light are absorbed appreciably by proteins within the
lens structure and, in excessive amounts, can damage the eye. The innermost
membrane of the eye is the retina, which lines the inside of the wall’s entire
posterior portion. When the eye is properly focused, light from an object outside
the eye is imaged on the retina. Pattern vision is afforded by the distribution of
discrete light receptors over the surface of the retina.
o In the human eye, the distance between the lens and the imaging region (the retina) is fixed, and the
focal length needed to achieve proper focus is obtained by varying the shape of the
lens. The fibers in the ciliary body accomplish this, flattening or thickening the lens
for distant or near objects, respectively.
o The distance between the center of the lens and the retina along the visual axis is
approximately 17 mm. The range of focal lengths is approximately 14 mm to 17
mm, the latter taking place when the eye is relaxed and focused at distances greater
than about 3 m. The geometry in Figure illustrates how to obtain the dimensions of
an image formed on the retina.
o For example, suppose that a person is looking at a tree 15 m high at a distance of
100 m. Letting h denote the height of that object in the retinal image, the geometry
of the figure yields 15/100 = h/17, so h = 2.55 mm.
o Radio waves have photons with low energies; microwaves have more energy
than radio waves, infrared still more, then visible, ultraviolet, X-rays, and finally
gamma rays, the most energetic of all. This is the reason why gamma rays are so
dangerous to living organisms.
o Light is a particular type of electromagnetic radiation that can be sensed by the
human eye. The visible band of the electromagnetic spectrum spans the range from
approximately 0.43 µm (violet) to about 0.79 µm (red).
o For convenience, the color spectrum is divided into six broad regions: violet, blue,
green, yellow, orange, and red. No color (or other component of the
electromagnetic spectrum) ends abruptly, but rather each range blends smoothly
into the next, as shown in the above Figure.
o The colors that humans perceive in an object are determined by the nature of the
light reflected from the object. A body that reflects light relatively balanced in all
visible wavelengths appears white to the observer. However, a body that favors
reflectance in a limited range of the visible spectrum exhibits some shades of color.
o For example, green objects reflect light with wavelengths primarily in the 500 to
570 nm range while absorbing most of the energy at other wavelengths. Light that
is void of color is called monochromatic (or achromatic) light.
o Luminance, measured in lumens (lm), gives a measure of the amount of energy an
observer perceives from a light source. For example, light emitted from a source
operating in the far infrared region of the spectrum could have significant energy
(radiance), but an observer would hardly perceive it; its luminance would be almost
zero.
o Brightness is a subjective descriptor of light perception that is practically
impossible to measure. It embodies the achromatic notion of intensity and is one of
the key factors in describing color sensation. Gamma radiation is important for
medical and astronomical imaging, and for imaging radiation in nuclear
environments. Hard (high-energy) X-rays are used in industrial applications. Chest
and dental X-rays are in the lower energy (soft) end of the X-ray band. The soft X-
ray band transitions into the far ultraviolet light region, which in turn blends with
the visible spectrum at longer wavelengths.
For example, a green (pass) filter in front of a light sensor favors light in the green band of
the color spectrum. As a consequence, the sensor output will be stronger for green light than for
other components in the visible spectrum. In order to generate a 2-D image using a single sensor,
there has to be relative displacements in both the x- and y-directions between the sensor and the
area to be imaged.
Mechanical digitizers sometimes are referred to as microdensitometers. Another example of
imaging with a single sensor places a laser source coincident with the sensor.
Moving mirrors are used to control the outgoing beam in a scanning pattern and to
direct the reflected laser signal onto the sensor.
o In-line sensor strips that respond to various bands of the
electromagnetic spectrum are mounted perpendicular to the direction of flight. The
imaging strip gives one line of an image at a time, and the motion of the strip
completes the other dimension of a two-dimensional image.
o Lenses or other focusing schemes are used to project the area to be scanned onto
the sensors. Sensor strips mounted in a ring configuration are used in medical and
industrial imaging to obtain cross-sectional (“slice”) images of 3-D objects, as
Fig. (b) shows.
o Images are not obtained directly from the sensors by motion alone;
they require extensive processing. A 3-D digital volume consisting of stacked
images is generated as the object is moved in a direction perpendicular to the sensor
ring. Other modalities of imaging based on the CAT principle include magnetic
resonance imaging (MRI) and positron emission tomography (PET).The
illumination sources, sensors, and types of images are different, but conceptually
they are very similar to the basic imaging approach shown in Fig.(b).
Image Sampling and Quantization:
o When an image is generated from a physical process, its intensity values are proportional
to the energy radiated by a physical source (e.g., electromagnetic waves), so f(x, y) must be nonzero and finite.
o It is implied in the Figure that, in addition to the number of discrete levels used, the
accuracy achieved in quantization is highly dependent on the noise content of the
sampled signal. Sampling in the manner just described assumes that we have a
continuous image in both coordinate directions as well as in amplitude.
o Let f(s, t) represent a continuous image function of two continuous variables, s and t. We convert this function into a digital image by sampling and quantization, as
explained in the previous section. Suppose that we sample the continuous image
into a 2-D array, f(x, y), containing M rows and N columns, where (x, y) are discrete
coordinates.
o For notational clarity and convenience, we use integer values for these discrete
coordinates: x = 0, 1, 2,………..,M-1 and y = 0, 1,2,………..,N-1 Thus, for
example, the value of the digital image at the origin is f(0,0), and the next
coordinate value along the first row is f(0,1) . Here, the notation (0, 1) is used to
signify the second sample along the first row.
o It does not mean that these are the values of the physical coordinates when the
image was sampled. In general, the value of the image at any coordinates (x, y) is
denoted f(x, y), where x and y are integers. The section of the real plane spanned
by the coordinates of an image is called the spatial domain, with x and y being
referred to as spatial variables or spatial coordinates.
o For example, a digital camera with a 20-megapixel CCD imaging chip can be expected to have a higher capability
to resolve detail than an 8-megapixel camera, assuming that both cameras are
equipped with comparable lenses and the comparison images are taken at the same
distance.
o Suppose, for example, that we want to zoom a 500 X 500 image 1.5 times to 750 X 750 pixels. Conceptually, one way to visualize zooming is to create an imaginary 750 X 750 grid with
the same pixel spacing as the original, and then shrink it so that it fits exactly over
the original image. Obviously, the pixel spacing in the shrunken 750 X 750 grid
will be less than the pixel spacing in the original image.
o To perform intensity-level assignment for any point in the overlay, we look for its
closest pixel in the original image and assign the intensity of that pixel
to the new pixel in the grid. When we are finished assigning intensities to all the
points in the overlay grid, we expand it to the original specified size to obtain the
zoomed image.
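A minimal sketch of this grid-overlay idea, assuming NumPy and a synthetic array (not part of the original text): each pixel of the zoomed image simply takes the intensity of its nearest pixel in the original image.

import numpy as np

def zoom_nearest(img, factor):
    # Zoom an image by nearest-neighbour assignment (pixel replication).
    M, N = img.shape
    out_M, out_N = int(M * factor), int(N * factor)
    # For every point of the overlay grid, find the closest original pixel.
    rows = (np.arange(out_M) / factor).astype(int).clip(0, M - 1)
    cols = (np.arange(out_N) / factor).astype(int).clip(0, N - 1)
    return img[np.ix_(rows, cols)]

small = np.arange(16, dtype=np.uint8).reshape(4, 4)
print(zoom_nearest(small, 1.5).shape)   # (6, 6)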
(a) 4-adjacency. Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(b) 8-adjacency. Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(c) m-adjacency (mixed adjacency). Two pixels p and q with values from V are m-adjacent if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
The D4 distance (city-block distance) between p(x, y) and q(s, t) is defined as
D4 (p, q) = | x - s | + | y - t |
In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y). For example, the pixels with D4 distance ≤ 2 from (x, y) (the center point) form the following contours of constant distance:
        2
    2   1   2
2   1   0   1   2
    2   1   2
        2
The pixels with D4 = 1 are the 4-neighbors of (x, y). The D8 distance (called the chessboard distance) between p and q is defined as
D8 (p, q) = max (| x - s |, | y - t |)
In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y). For example, the pixels with D8 distance ≤ 2 from (x, y) (the center point) form the following contours of constant distance:
2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2
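A small sketch (an assumption, not from the text) that evaluates the D4 (city-block) and D8 (chessboard) distances between two pixels, matching the definitions above:

def d4(p, q):
    # City-block distance: |x - s| + |y - t|
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    # Chessboard distance: max(|x - s|, |y - t|)
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (3, 4), (5, 1)
print(d4(p, q))   # 5 -> contours of constant D4 form diamonds
print(d8(p, q))   # 3 -> contours of constant D8 form squares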
QUESTION BANK
1 MARKS
b. Vertex
c. Contour
d. Random
2) ________ represents the transition between image function's continuous values and its digital equivalent.
a. Rasterization
b. Quantization
c. Sampling
d. None of the above
Answer: b) Quantization
3) Which of the following correctly describes the slightest visible change in the level of intensity?
a. Contour
b. Saturation
c. Contrast
d. Intensity Resolution
Answer: d) Intensity Resolution
Explanation: Intensity resolution can be defined as the total number of bits required to quantize an image.
4) What is the name of the tool that helps in zooming, shrinking, rotating, etc.?
a. Filters
b. Interpolation
c. Sampling
d. None of the above
Answer: b) Interpolation
Explanation: Interpolation is one such basic tool that is used to zoom, shrink, rotate, etc.
5) The dynamic range of the imaging system is a quantitative relation where the upper limit can be determined by
a. Brightness
b. Contrast
c. Saturation
d. Noise
Answer: c) Saturation
Explanation: Saturation is taken as the upper limit (the numerator) of the dynamic range, and noise as the lower limit.
Answer: d) Noise
Answer: a) Photodiode
Explanation: The photodiode is a p-n junction semiconductor device that transmutes the light into
an electric current.
Explanation: Computerized Axial Tomography is based on image acquisition that uses sensor
strips.
9) What is meant by the section of the real plane that the image coordinates have spanned?
a. Coordinate Axis
b. Plane of Symmetry
c. Spatial Domain
d. None of the above
Answer: c) Spatial Domain
Explanation: Spatial Domain refers to the section of the real plane that has been spanned by the coordinates of an image, where x and y coordinates are called Spatial coordinates.
10) Which of the following is the effect of using an inadequate amount of intensity levels in a digital image's smooth areas?
Answer: False contouring
Explanation: False contouring is caused when the grey-level resolution of a digital image gets decreased.
11) What is the name of the process in which the known data is utilized to evaluate the value at an unknown location?
a. Interpolation
b. Acquisition
c. Pixelation
d. None of the above
Answer: a) Interpolation
13) Name the procedure in which individual pixel values of the digital image get altered.
a. Neighborhood Operations
b. Image Registration
c. Geometric Spatial Transformation
d. Single Pixel Operation
Answer: d) Single Pixel Operation
15) Which of the following color possess the longest wavelength in the visible spectrum?
a. Yellow
b. Red
c. Blue
d. Violet
Answer: b) Red
Explanation: In the visible spectrum, red has the longest wavelength. The visible colors are
ranged from shortest to longest wavelength, i.e., Violet, Blue, Green, Yellow, Orange, and Red.
5 MARKS:
1. Explain Types of computerized processes in detail.
2. Explain Components of an image processing system in detail (April 2012).
3. Explain Light and the electromagnetic spectrum in detail.
4. Explain some basic relationships between pixels in detail (April 2012).
10 MARKS:
1. Explain Examples of fields that use digital image processing in detail.
2. Explain Fundamental steps in digital image processing in detail (April 2012).
3. Explain Image sensing and acquisition in detail (April 2012).
4. Explain Image sampling and quantization in detail.
1. BACKGROUND:
Image enhancement approaches fall into two broad categories: spatial domain methods
and frequency domain methods. The term spatial domain refers to the image plane itself, and
approaches in this category are based on direct manipulation of pixels in an image. Frequency
domain processing techniques are based on modifying the Fourier transform of an image.
The term spatial domain refers to the aggregate of pixels composing an image. Spatial
domain methods are procedures that operate directly on these pixels. Spatial domain processes
will be denoted by the expression
g (x, y) = T [f (x, y)]
where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f,
defined over some neighborhood of (x, y). In addition, T can operate on a set of input images,
such as performing the pixel-by-pixel sum of K images for noise reduction.
The principal approach in defining a neighborhood about a point (x, y) is to use a square
or rectangular sub image area centered at (x, y), as shown in below figure 1.
Figure 1: A square 3 x 3 neighborhood about a point (x, y) in an image.
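The pixel-by-pixel sum (average) of K images mentioned above can be sketched as follows; the noisy synthetic images, the parameter values and the NumPy usage are assumptions made only for illustration:

import numpy as np

rng = np.random.default_rng(0)
clean = np.full((64, 64), 120.0)                 # a constant "scene"
K = 16
noisy = [clean + rng.normal(0, 20, clean.shape) for _ in range(K)]

# T operates on the set of K input images: their pixel-by-pixel average.
g = np.mean(noisy, axis=0)

print(np.std(noisy[0] - clean))   # noise level of a single frame (about 20)
print(np.std(g - clean))          # reduced roughly by sqrt(K) (about 5)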
The simplest form of T is when the neighborhood is of size 1 x 1 (a single pixel). In this case, g depends only on the value of f at (x, y), and T becomes a gray-level transformation function of the form s = T(r), where r and s are the gray levels of f(x, y) and g(x, y) at any point (x, y). For example, if T(r) has the form shown in Fig. 2(a), the effect of this transformation
would be to produce an image of higher contrast than the original by darkening the levels below
m and brightening the levels above m in the original image. In this technique, known as contrast
stretching, the values of r below m are compressed by the transformation function into a narrow
range of s, toward black. The opposite effect takes place for values of r above m. In the limiting
case shown in Fig. 2(b), T(r) produces a two-level (binary) image. A mapping of this form is
called a thresholding function.
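A minimal sketch of this limiting case, i.e. thresholding about a level m (the array values, m and the NumPy usage are assumptions for illustration):

import numpy as np

def threshold(f, m, L=256):
    # Levels below m are darkened to 0, levels above m are brightened to L-1,
    # producing a two-level (binary) image.
    return np.where(f < m, 0, L - 1).astype(np.uint8)

f = np.array([[ 10,  90, 130],
              [200,  60, 250],
              [120, 128, 140]], dtype=np.uint8)
print(threshold(f, m=128))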
Figure 3: Some basic gray-level transformation functions used for image enhancement.
1. Image Negatives:
The negative of an image with gray levels in the range [0, L-1] is obtained by using the negative transformation s = L - 1 - r. Reversing the intensity levels of an image in this manner produces the equivalent of a photographic negative. This type of processing is particularly suited for enhancing white or gray detail embedded in dark regions of an image, especially when the black areas are dominant in size.
2. Log Transformations:
The general form of the log transformation
s = c log (1 + r)
where c is a constant, and it is assumed that r ≥ 0.
The shape of the log curve in Fig. 3 shows that this transformation maps a narrow range of low gray-level values in the input image into a wider range of output levels. The opposite is true of higher values of input levels. We would use a transformation of this type to expand the values of dark pixels in an image while compressing the higher-level values; the inverse log transformation has the opposite effect.
3. Power-Law (Gamma) Transformations:
Power-law transformations have the basic form
s = c r^γ
where c and γ are positive constants.
Figure 4: Plots of the equation s = c r^γ for various values of γ (c = 1 in all cases).
Plots of s versus r for various values of γ are shown in Fig. 4. As in the case of the log
transformation, power-law curves with fractional values of γ map a narrow range of dark input
values into a wider range of output values, with the opposite being true for higher values of input
levels.
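A short sketch of the power-law transformation s = c r^γ applied to a normalized 8-bit image (the test values and γ = 0.5 are assumptions):

import numpy as np

def gamma_transform(f, gamma, c=1.0, L=256):
    r = f.astype(np.float64) / (L - 1)          # normalize r to [0, 1]
    s = c * np.power(r, gamma)                  # s = c * r^gamma
    return np.clip(s * (L - 1), 0, L - 1).astype(np.uint8)

f = np.array([[0, 32, 64], [96, 128, 255]], dtype=np.uint8)
print(gamma_transform(f, gamma=0.5))   # gamma < 1 expands dark values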
Bit-Plane Slicing:
Figure 5: Bit-plane representation of an 8-bit image.
Decomposing an image into its bit planes is useful for analyzing the relative importance of
each bit in the image, a process that aids in determining the adequacy of the number of bits used
to quantize the image. It is useful in image compression.
• The reconstruction is done by using only a few of the planes.
• Each pixel of the nth plane is multiplied by the constant 2^(n-1).
• All the generated planes (only a few of the 8) are then added together.
• For example, if we use bit planes 7 and 8, we multiply bit plane 8 by 128 and bit plane 7 by 64 and then add them together.
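A sketch of bit-plane decomposition and partial reconstruction from planes 8 and 7 only (weights 128 and 64, as stated above); the sample array is an assumption:

import numpy as np

f = np.array([[200, 13], [255, 100]], dtype=np.uint8)

# Plane n (n = 1..8) holds bit n-1 of every pixel.
planes = [(f >> (n - 1)) & 1 for n in range(1, 9)]

# Reconstruction from the two most significant planes only:
approx = planes[7] * 128 + planes[6] * 64
print(approx)          # [[192, 0], [192, 64]] -- a coarse version of f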
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete
function h(rk)=nk, where rk is the kth gray level and nk is the number of pixels in the image
having gray level rk. Histograms are the basis for numerous spatial domain processing
techniques.
Histogram manipulation can be used effectively for image enhancement. Histograms are
simple to calculate in software and also lend themselves to economic hardware implementations,
thus making them a popular tool for real-time image processing.
The horizontal axis of each histogram plot corresponds to gray level values, rk. The
vertical axis corresponds to values of
h(rk)=nk or p(rk)=nk/n if the values are normalized.
Thus, as indicated previously, these histogram plots are simply plots of h(rk)=nk versus
rk or p(rk)=nk/n versus rk.
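A sketch that computes h(rk) = nk and the normalized histogram p(rk) = nk/n for an 8-bit image (the synthetic data and NumPy usage are assumptions):

import numpy as np

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)

L = 256
h = np.bincount(img.ravel(), minlength=L)   # h(r_k) = n_k
p = h / img.size                            # p(r_k) = n_k / n
print(h.sum() == img.size, p.sum())         # True, 1.0 (up to rounding)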
Histogram Equalization:
Consider for a moment continuous functions, and let the variable r represent the gray
levels of the image to be enhanced. In the initial part of our discussion we assume that r has been
normalized to the interval [0, 1], with r = 0 representing black and r = 1 representing white. Later,
we consider a discrete formulation and allow pixel values to be in the interval [0, L-1]. For any r
satisfying the aforementioned conditions, we focus attention on transformations of the form
s=T(r) 0 ≤ r ≤ 1 ......(1)
that produce a level s for every pixel value r in the original image. For reasons that will become
obvious shortly, we assume that the transformation function T(r) satisfies the following
conditions:
a. T(r) is a single-valued, monotonically increasing function in the interval 0 ≤ r ≤ L-1:
The requirement that T(r) be single valued is needed to guarantee that the inverse transformation will exist, and
the monotonicity condition preserves the increasing order from black to white in the
output image.
b. 0 ≤ T(r) ≤ L-1 for 0 ≤ r ≤ L-1:
It guarantees that the output gray levels will be in the same range as the input levels.
Figure 6 gives an example of a transformation function that satisfies these two conditions.
Figure 6 A gray-level transformation functions that is both single valued and monotonically increasing.
The inverse transformation from s back to r is denoted
r = T-1(s) 0 ≤ s ≤1 ......(2)
The gray levels in an image may be viewed as random variables in the interval [0, 1]. One
of the most fundamental descriptors of a random variable is its probability density function
(PDF). Let pr(r) and ps(s) denote the probability density functions of random variables r and s,
respectively,
where the subscripts on p are used to denote that pr and ps are different functions. A basic
result from elementary probability theory is that, if pr(r) and T(r) are known and T^(-1)(s)
satisfies condition (a), then the probability density function ps(s) of the transformed variable s can be obtained from
ps(s) = pr(r) |dr/ds| .........(3)
A transformation function of particular importance in image processing has the form
s = T(r) = ∫[0, r] pr(w) dw ..........(4)
where w is a dummy variable of integration. From Leibniz's rule in calculus, ds/dr = pr(r); substituting this into Eq. (3) gives ps(s) = 1 for 0 ≤ s ≤ 1, i.e., a uniform density.
For discrete values we deal with probabilities and summations instead of probability
density functions and integrals. The probability of occurrence of gray level rk in an image is
approximated by
pr(rk) = nk / n, k = 0, 1, 2, ..., L-1
where n is the total number of pixels in the image and nk is the number of pixels that have gray level rk. The discrete form of the transformation is
sk = T(rk) = Σ (j = 0 to k) pr(rj) = Σ (j = 0 to k) nj / n, k = 0, 1, 2, ..., L-1
Mapping each pixel with level rk in the input image into a corresponding pixel with level sk in the output image is called histogram equalization.
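A minimal NumPy implementation of the discrete transformation sk above (the synthetic low-contrast test image is an assumption):

import numpy as np

def equalize(img, L=256):
    h = np.bincount(img.ravel(), minlength=L)      # n_k
    p = h / img.size                               # p_r(r_k) = n_k / n
    cdf = np.cumsum(p)                             # sum over j <= k of p_r(r_j)
    s = np.round((L - 1) * cdf).astype(np.uint8)   # s_k, scaled to [0, L-1]
    return s[img]                                  # map r_k -> s_k per pixel

rng = np.random.default_rng(2)
dark = rng.integers(0, 64, size=(64, 64), dtype=np.uint8)   # low-contrast image
print(dark.max(), equalize(dark).max())                     # roughly 63 and 255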
Histogram Matching:
As indicated in the preceding discussion, histogram equalization automatically determines
a transformation function that seeks to produce an output image that has a uniform histogram.
When automatic enhancement is desired, this is a good approach because the results from this
technique are predictable and the method is simple to implement. But in some applications it is useful to specify the shape of the histogram that we wish the processed image to have; the method used to generate a processed image that
has a specified histogram is called histogram matching or histogram specification.
Let us return for a moment to continuous gray levels r and z and let pr(r) and pz(z) denote
their corresponding continuous probability density functions. In this notation, r and z denote the
gray levels of the input and output (processed) images, respectively. We can estimate pr(r) from
the given input image, while pz(z) is the specified probability density function that we wish the
output image to have.
Let s be a random variable with the property
s = T(r) = ∫[0, r] pr(w) dw ......(10)
where w is a dummy variable of integration. Next, define a random variable z with the property
G(z) = ∫[0, z] pz(t) dt = s ......(11)
where t is a dummy variable of integration. It then follows from these two equations that
G(z) = T(r) and, therefore, that z must satisfy the condition
z = G^(-1)(s) = G^(-1)[T(r)] ......(12)
The transformation T(r) can be obtained from Eq. (10) once pr(r) has been estimated from the
input image. Similarly, the transformation function G(z) can be obtained using Eq. (11) because
pz(z) is given.
The discrete formulation of Eq. (10) is the histogram-equalization transformation given earlier, which we repeat here for
convenience:
sk = T(rk) = Σ (j = 0 to k) nj / n, k = 0, 1, 2, ..., L-1 ......(13)
where n is the total number of pixels in the image, nj is the number of pixels with gray level rj,
and L is the number of discrete gray levels. Similarly, the discrete formulation of Eq. (11) is
obtained from the given histogram pz(zi), i = 0, 1, 2, ..., L-1, and has the form
G(zq) = Σ (i = 0 to q) pz(zi) = sk ......(14)
from which the desired value zq = G^(-1)(sk) is obtained.
Figure 7: (a) Graphical interpretation of mapping from rk to sk via T(r). (b) Mapping of zq to its
corresponding value vq via G(z). (c) Inverse mapping from sk to its corresponding value of zk.
The procedure we have just developed for histogram matching may be summarized as follows:
1. Obtain the histogram of the given image.
2. Use Eq. (13) to pre-compute a mapped level sk for each level rk.
3. Obtain the transformation function G from the given pz(z) using Eq. (14).
4. Pre-compute zk for each value of sk using the iterative scheme defined in connection with Eq. (14).
5. For each pixel in the original image, if the value of that pixel is rk, map this value to its
corresponding level sk; then map level sk into the final level zk. Use the pre-computed values
from Steps (2) and (4) for these mappings
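One way to sketch these five steps with NumPy (an assumption, not the textbook's code) is to equalize the input via its cumulative distribution and then invert the specified cumulative distribution G by monotone interpolation:

import numpy as np

def match_histogram(img, p_z, L=256):
    # Map img so that its histogram approximates the specified pdf p_z (length L).
    p_r = np.bincount(img.ravel(), minlength=L) / img.size
    T = np.cumsum(p_r) * (L - 1)          # s_k = T(r_k)   (steps 1-2)
    G = np.cumsum(p_z) * (L - 1)          # G(z_q)         (step 3)
    # Step 4: z_k = G^{-1}(s_k), approximated here by interpolation
    # (assumes G is non-decreasing, which a cumulative sum always is).
    z = np.interp(T, G, np.arange(L))
    return np.round(z).astype(np.uint8)[img]   # step 5: r_k -> s_k -> z_k

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
p_spec = np.linspace(1, 0, 256); p_spec /= p_spec.sum()   # favour dark levels
out = match_histogram(img, p_spec)
print(img.mean(), out.mean())      # the output mean shifts toward dark values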
Image Subtraction:
The difference between two images f(x, y) and h(x, y), expressed as
g(x, y) = f(x, y) - h(x, y),
is obtained by computing the difference between all pairs of corresponding pixels from f
and h. The key usefulness of subtraction is the enhancement of differences between images.
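A small sketch of image subtraction with the result rescaled back to the 8-bit range for display (the rescaling choice and the synthetic arrays are assumptions):

import numpy as np

def subtract(f, h):
    d = f.astype(np.int16) - h.astype(np.int16)      # differences in [-255, 255]
    d = (d - d.min()) / max(d.max() - d.min(), 1)    # rescale for display
    return (d * 255).astype(np.uint8)

f = np.array([[100, 150], [200, 250]], dtype=np.uint8)
h = np.array([[100, 140], [180, 250]], dtype=np.uint8)
print(subtract(f, h))    # areas where f and h differ stand out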
One of the most commercially successful and beneficial uses of image subtraction is in
the area of medical imaging called mask mode radiography. In this case h(x, y), the mask, is an
X-ray image of a region of a patient’s body captured by an intensified TV camera (instead of
traditional X-ray film) located opposite an X-ray source. The procedure consists of injecting a
contrast medium into the patient’s bloodstream, taking a series of images of the same anatomical
region as h(x, y), and subtracting this mask from the series of incoming images after injection of
the contrast medium. The net effect of subtracting the mask from each sample in the incoming
stream of TV images is that the areas that are different between f(x, y) and h(x, y) appear in the output image as enhanced detail.
Basics of Spatial Filtering:
Figure 8 The mechanics of spatial filtering. The magnified drawing shows a 3*3 mask and
the image section directly under it; the image section is shown displaced out from under the
mask for ease of readability.
For the 3*3 mask shown in Fig. 8, the result (or response), R, of linear filtering with the
filter mask at a point (x, y) in the image is
g(x, y) = Σ (s = -a to a) Σ (t = -b to b) w(s, t) f(x + s, y + t)
where, from the previous paragraph, a = (m-1)/2 and b = (n-1)/2. To generate a complete filtered
image this equation must be applied for x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1.
When interest lies on the response, R, of an m*n mask at any point (x, y), and not on the
mechanics of implementing mask convolution, it is common practice to simplify the notation by
using the following expression:
R = w1 z1 + w2 z2 + ... + wmn zmn = Σ (i = 1 to mn) wi zi
where the w's are the mask coefficients and the z's are the values of the image gray levels corresponding to those coefficients.
Smoothing Spatial Filters:
Smoothing filters are used for blurring and for noise reduction. The response of a linear smoothing (averaging) filter is simply the average of the pixels contained in the neighborhood of the filter mask. For the standard 3*3 averaging mask, the response at a point is
R = (1/9) Σ (i = 1 to 9) zi
This is the average of the gray levels of the pixels in the 3*3 neighborhood defined by the
mask. Note that, instead of being 1/9, the coefficients of the filter are all 1's. An m*n mask
would have a normalizing constant equal to 1/mn. A spatial averaging filter in which all
coefficients are equal is sometimes called a box filter.
Figure 9: Two 3*3 smoothing (averaging) filter masks. The constant multiplier in front of
each mask is equal to 1 divided by the sum of the values of its coefficients, as is required to compute an
average.
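A sketch of 3*3 box filtering written directly from the averaging equation above (pure NumPy; replicate padding at the borders is an implementation choice, not something specified in the text):

import numpy as np

def box_filter(img, k=3):
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode='edge')
    out = np.zeros_like(img, dtype=np.float64)
    for s in range(k):
        for t in range(k):
            out += padded[s:s + img.shape[0], t:t + img.shape[1]]
    return (out / (k * k)).astype(np.uint8)    # normalizing constant 1/(m*n)

img = np.zeros((5, 5), dtype=np.uint8); img[2, 2] = 255   # a single bright dot
print(box_filter(img))     # the dot is spread (blurred) over its neighbourhood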
The second mask shown in Fig. 9 is a little more interesting. This mask yields a so-called
weighted average, terminology used to indicate that pixels are multiplied by different
coefficients, thus giving more importance (weight) to some pixels at the expense of others.
Order-Statistics (Median) Filters:
Order-statistics filters are nonlinear spatial filters whose response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter. The best-known example in this category is the median filter, which replaces the value of a pixel by the median of the gray levels in its neighborhood. Median filters are quite popular because, for certain types of random noise, they provide
excellent noise-reduction capabilities, with considerably less blurring than linear smoothing
filters of similar size. Median filters are particularly effective in the presence of impulse noise,
also called salt-and-pepper noise because of its appearance as white and black dots superimposed
on an image.
The median, j, of a set of values is such that half the values in the set are less than or equal
to j, and half are greater than or equal to j. In order to perform median filtering at a point in an
image, we first sort the values of the pixel in question and its neighbors, determine their median,
and assign this value to that pixel. For example, in a 3*3 neighborhood the median is the 5th
largest value, in a 5*5 neighborhood the 13th largest value, and so on. When several values in a
neighborhood are the same, all equal values are grouped. For example, suppose that a 3*3
neighborhood has values (10, 20, 20, 20, 15, 20, 20, 25, 100). These values are sorted as (10, 15,
20, 20, 20, 20, 20, 25, 100), which results in a median of 20. Thus, the principal function of median filters is to force points with distinct gray levels to be more like their neighbors.
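A direct sketch of 3*3 median filtering (NumPy assumed; border pixels are handled by replicate padding, which is an implementation choice not taken from the text):

import numpy as np

def median_filter3(img):
    p = np.pad(img, 1, mode='edge')
    out = np.empty_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            out[x, y] = np.median(p[x:x + 3, y:y + 3])   # the 5th largest of 9
    return out

img = np.full((5, 5), 20, dtype=np.uint8)
img[2, 2] = 255                      # an isolated "salt" noise point
print(median_filter3(img)[2, 2])     # 20 -- the impulse is removed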
Sharpening Spatial Filters:
The principal objective of sharpening is to highlight fine detail in an image or to enhance
detail that has been blurred, either in error or as a natural effect of a particular method of image
acquisition. Uses for image sharpening vary and include applications ranging from electronic printing
and medical imaging to industrial inspection and autonomous guidance in military systems.
Sharpening can be accomplished with spatial filters based on first- and second-order derivatives.
The derivatives of a digital function are defined in terms of differences. There are various
ways to define these differences. However, we require that any definition we use for a first
derivative (1) must be zero in flat segments (areas of constant gray-level values); (2) must be
nonzero at the onset of a gray- level step or ramp; and (3) must be nonzero along ramps.
Similarly, any definition of a second derivative (1) must be zero in flat areas; (2) must be
nonzero at the onset and end of a gray-level step or ramp; and (3) must be zero along ramps of
constant slope.
A basic definition of the first-order derivative of a one-dimensional function f(x) is the
difference
∂f/∂x = f(x + 1) - f(x)
and a basic definition of the second-order derivative is the difference
∂²f/∂x² = f(x + 1) + f(x - 1) - 2f(x)
It is easily verified that these two definitions satisfy the conditions stated previously
regarding derivatives of the first and second order.
Figure 10 Simplified profile (the points are joined by dashed lines to simplify
interpretation).
C
S SW
Figure 10 shows a simplification of the profile, with just enough numbers to make it
possible for us to analyze how the first- and second-order derivatives behave as they encounter a
noise point, a line, and then the edge of an object. In our simplified diagram the transition in the
ramp spans four pixels, the noise point is a single pixel, the line is three pixels thick, and the
transition in the gray-level step takes place between adjacent pixels. The number of gray levels
was simplified to only eight levels.
Let us consider the properties of the first and second derivatives as we traverse the profile
from left to right.
1. The first-order derivative is nonzero along the entire ramp,
2. The second-order derivative is nonzero only at the onset and end of the ramp.
We conclude that first-order derivatives produce “thick” edges and second-order
derivatives, much finer ones.
3. We encounter the isolated noise point. Here, the response at and around the point is
much stronger for the second- than for the first-order derivative.
4. A second-order derivative is much more aggressive than a first-order derivative in
enhancing sharp changes. Thus, we can expect a second-order derivative to enhance
fine detail (including noise) much more than a first-order derivative.
5. The thin line is a fine detail, and we see essentially the same difference between the
two derivatives. If the maximum gray level of the line had been the same as the
isolated point, the response of the second derivative would have been stronger for the
latter.
6. Finally, in this case, the response of the two derivatives is the same at the gray-level
step. We also note that the second derivative has a transition from positive back to
negative.
Comparison between first- and second-order derivative responses:
1. First-order derivatives generally produce thicker edges in an image.
2. Second-order derivatives have a stronger response to fine detail, such as thin lines and
isolated points.
3. First-order derivatives generally have a stronger response to a gray-level step.
4. Second-order derivatives produce a double response at step changes in gray level.
Use of Second Derivatives for Enhancement – The Laplacian
The approach basically consists of defining a discrete formulation of the second-order
derivative and then constructing a filter mask based on that formulation. We are interested in
isotropic filters, whose response is independent of the direction of the discontinuities in the
image to which the filter is applied. In other words, isotropic filters are rotation invariant, in the
sense that rotating the image and then applying the filter gives the same result as applying the
filter to the image first and then rotating the result.
It can be shown that the simplest isotropic derivative operator is the Laplacian, which,
for a function (image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²
Because derivatives of any order are linear operations, the Laplacian is a linear operator.
The definition of the digital second derivative given in the previous section is one of the most used.
Taking into account that we now have two variables, the partial second-order derivatives in the x- and y-directions are
∂²f/∂x² = f(x + 1, y) + f(x - 1, y) - 2f(x, y)
∂²f/∂y² = f(x, y + 1) + f(x, y - 1) - 2f(x, y)
so that the discrete Laplacian of two variables is
∇²f = f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4f(x, y)
Figure: (a) Filter mask used to implement the discrete Laplacian. (b) Mask used to implement an extension of this equation that includes the diagonal neighbors. (c) and (d) Two other implementations of the Laplacian.
Because the Laplacian is a derivative operator, its use highlights gray-level discontinuities
in an image and deemphasizes regions with slowly varying gray levels. This will tend to
produce images that have grayish edge lines and other discontinuities, all superimposed on a
dark, featureless background. Background features can be “recovered” while still preserving the
sharpening effect of the Laplacian operation simply by adding the original and Laplacian
images.
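A sketch of Laplacian sharpening using the discrete Laplacian given above (which corresponds to a mask with a negative centre coefficient, so the Laplacian image is subtracted from the original); NumPy, the replicate padding and the toy step-edge image are assumptions:

import numpy as np

def laplacian_sharpen(img):
    f = img.astype(np.float64)
    p = np.pad(f, 1, mode='edge')
    # Discrete Laplacian: f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)
    lap = p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2] - 4 * f
    g = f - lap          # negative-centre mask: subtract the Laplacian to sharpen
    return np.clip(g, 0, 255).astype(np.uint8)

img = np.tile(np.repeat([50, 200], 4), (8, 1)).astype(np.uint8)  # a step edge
print(laplacian_sharpen(img)[0])   # the edge is overshot, i.e. enhanced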
As noted in the previous paragraph, it is important to keep in mind which definition of the Laplacian is used: if the mask has a negative center coefficient, the Laplacian image must be subtracted from, rather than added to, the original to obtain a sharpened result.
Combining spatial enhancement methods (example): consider an image whose narrow dynamic range of gray levels and high
noise content make it difficult to enhance. The strategy we will follow is to utilize the
Laplacian to highlight fine detail, and the gradient to enhance prominent edges.
Figure 3.43(b) shows the Laplacian of the original image, obtained using the mask in Fig.
3.39(d). This image was scaled (for display only) using the same technique as in Fig. 3.40.We
can obtain a sharpened image at this point simply by adding Figs. 3.43(a) and (b), which are an
implementation of the second line in Eq. (3.7-5) (we used a mask with a positive center
coefficient). Just by looking at the noise level in (b), we would expect a rather noisy sharpened
image if we added Figs. 3.43(a) and (b), a fact that is confirmed by the result shown in Fig.
3.43(c).
One way that comes immediately to mind to reduce the noise is to use a median filter.
However, median filtering is a nonlinear process capable of removing image features. This is
unacceptable in medical image processing.
Figure 3.43(d) shows the Sobel gradient of the original image. The gradient image was smoothed with an averaging filter, and the Laplacian image was multiplied by this smoothed gradient image, producing a product that emphasizes strong edges while reducing noise in relatively flat areas. Adding the product image to the original resulted in the sharpened image shown in Fig. 3.43(g).
QUESTION BANK
1 MARKS
1. A pixel p at coordinates (x, y) has neighbors whose coordinates are given by:
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels is called ____________
a) 4-neighbors of p
b) Diagonal neighbors
c) 8-neighbors
d) None of the mentioned
Answer: a
Explanation: The given set of neighbor pixels lies at unit distance to the right, left, above and below pixel p(x, y), respectively; they are therefore called the 4-neighbors of p.
2. A pixel p at coordinates (x, y) has neighbors whose coordinates are given by:
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
This set of pixels is called ____________
a) 4-neighbors of p
b) Diagonal neighbors
c) 8-neighbors
d) None of the mentioned
Answer: b
Explanation: The given set of neighbor pixels lies at unit distance along the right-up, right-down, left-up and left-down diagonals from pixel p(x, y), respectively; they are therefore called the diagonal neighbors of p.
3. Two pixels p and q having gray values from V, the set of gray-level values used to define
adjacency, are m-adjacent if:
Answer: q is in N4(p), or q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Explanation: For a subset of pixels S in an image, the set of pixels connected to a pixel p in S is called a connected component of S.
5. The domain that refers to image plane itself and the domain that refers to Fourier transform of
an image is/are :
a) Spatial domain in both
b) Frequency domain in both
c) Spatial domain and Frequency domain respectively
d) Frequency domain and Spatial domain respectively
Answer: c
Explanation: Spatial domain itself refers to the image plane, and approaches in this category are
based on direct manipulation of pixels in an image.
6. What is the technique for a gray-level transformation function called, if the transformation
would be to produce an image of higher contrast than the original by darkening the levels below
some gray-level m and brightening the levels above m in the original image.
a) Contouring
b) Contrast stretching
c) Mask processing
d) Point processing
Answer: b
Explanation: For a gray-level transformation function “s=T(r)”, where r and s are the gray-level
of f(x, y) (input image) and g(x, y) (output image) respectively at any point (x, y).
Then the technique, contrast stretching compresses the value of r below m by transformation
function into a narrow range of s, towards black and brightens the value of r above m.
7. For pixels p(x, y), q(s, t), and z(v, w), D is a distance function or metric if:
a) D(p, q) ≥ 0
b) D(p, q) = D(q, p)
c) D(p, z) ≤ D(p, q) + D(q, z)
d) All of the mentioned
Answer: d
Explanation: For pixels p(x, y), q(s, t), and z(v, w), D is a distance function or metric if:
(i) D(p, q) ≥ 0, (D(p, q) = 0 if p=q),
(ii) D(p, q) = D(q, p), and
(iii) D(p, z) ≤ D(p, q) + D(q, z).
a. 1048576
b. 1148576
c. 1248576
d. 1348576
Answer a
10. The lens is made up of concentric layers of
a. strong cells
b. inner cells
c. fibrous cells
d. outer cells
Answer c
11. Digital images are displayed as a discrete set of
a. values
b. numbers
c. frequencies
d. intensities
Answer d
12. Each element of the matrix is called
a. dots
b. coordinate
c. pixels
d. value
Answer c
13. DPI stands for
a. dots per image
b. dots per inch
Answer: b
5 MARKS
1. Explain the background of Image Enhancement.
10 MARKS
Image Restoration
Image restoration is the process of recovering an image that has been degraded, given the degraded image g(x, y), some knowledge of the degradation function H and some knowledge of the additive noise term η(x, y). Thus in restoration, the degradation is modelled and its inverse process is applied to recover the original image.
Terminology:
● g(x, y) = degraded image
● f(x, y) = original (undegraded) image
● h(x, y) = degradation function
● η(x, y) = additive noise term
In the spatial domain:
g(x, y) = h(x, y) * f(x, y) + η(x, y)
In the frequency domain:
G(u, v) = H(u, v) F(u, v) + N(u, v)
Given g(x, y), H and η(x, y), the objective is to obtain an estimate of the original image (this is the restoration problem).
2. Noise Models:
S
Generally, a mathematical model of image degradation and its restoration is used for
processing. The figure below shows the presence of a degradation function h(x,y) and an external
noise n(x,y) component coming into the original image signal f(x,y) thereby producing a final
degraded image g(x,y). This part composes the degradation model. Mathematically we can write the following:
g(x, y) = h(x, y) * f(x, y) + n(x, y)
where * denotes convolution.
The external noise is probabilistic in nature and there are several noise models used frequently in
the field of digital image processing. We have several probability density functions of the noise.
Noise Models
Gaussian Noise:
Because of its mathematical simplicity, the Gaussian noise model is often used in practice, even in situations where it is marginally applicable at best. Its probability density function is
p(z) = (1 / (√(2π) σ)) exp( -(z - m)² / (2σ²) )
where z represents intensity, m is the mean and σ² is the variance.
Gaussian noise arises in an image due to factors such as electronic circuit noise and sensor noise
due to poor illumination or high temperature.
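A sketch of degrading an image with additive Gaussian noise of mean m and variance σ² (NumPy, the flat test image and the parameter values are assumptions):

import numpy as np

rng = np.random.default_rng(4)
f = np.full((128, 128), 100.0)                     # original image
m, sigma = 0.0, 15.0
noise = rng.normal(m, sigma, f.shape)              # Gaussian noise N(m, sigma^2)
g = np.clip(f + noise, 0, 255).astype(np.uint8)    # degraded image g = f + n
print(g.mean(), g.std())                           # about 100 and about 15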
Rayleigh Noise:
Rayleigh noise is usually used to characterize noise phenomena in range imaging.
Exponential Noise
The pdf of exponential noise is given by
p(z) = a e^(-a z) for z ≥ 0, and p(z) = 0 for z < 0
where a > 0. The mean and variance of this noise pdf are
m = 1/a and σ² = 1/a²
This density function is a special case of the Erlang (gamma) density with b = 1.
Exponential noise is also commonly present in cases of laser imaging.
Uniform Noise
Uniform noise is not practically present but is often used in numerical simulations to analyze
systems.
Impulse Noise
The pdf of (bipolar) impulse noise is given by
p(z) = Pa for z = a, p(z) = Pb for z = b, and p(z) = 0 otherwise.
If b > a, intensity b will appear as a light dot in the image. Conversely, level a will appear as a black dot in the image. Because the resulting white and black dots resemble salt-and-pepper granules, this is also called salt-and-pepper noise. When either Pa or Pb is zero, it is called unipolar noise. The origin of impulse noise is quick transients such as faulty switching in cameras or other such cases.
3. Restoration in the Presence of Noise Only – Spatial Filtering:
When the only degradation present in an image is noise, the model becomes
g(x, y) = f(x, y) + η(x, y) or G(u, v) = F(u, v) + N(u, v)
The noise terms are unknown so subtracting them from g(x,y) or G(u,v) is not a realistic
approach. In the case of periodic noise it is possible to estimate N(u,v) from the spectrumG(u,v).
So N(u,v) can be subtracted from G(u,v) to obtain an estimate of original image. Spatial filtering
can be done when only additive noise is present. The following techniques can be used to reduce
the noise effect:
i) Mean Filter:
(a) Arithmetic mean filter: computes the average value of the corrupted image in the area defined by the filter mask, smoothing local variations and reducing noise as a result of blurring.
(b) Min filter: useful for finding the darkest point in an image; it also reduces salt noise as a result of the min operation.
(c) Midpoint filter:
The midpoint filter simply computes the midpoint between the maximum and minimum values in the area encompassed by the filter:
f^(x, y) = (1/2) [ max g(s, t) + min g(s, t) ]
It combines order statistics and averaging. This filter works best for randomly distributed noise like Gaussian or uniform noise.
(d) Harmonic Mean filter:
The harmonic mean filtering operation is given by the expression
f^(x, y) = m n / Σ ( 1 / g(s, t) )
where the sum is taken over the m x n neighborhood. It works well for salt noise (but fails for pepper noise), and it also does well with Gaussian noise.
Order statistics filters are spatial filters whose response is based on ordering the pixel
contained in the image area encompassed by the filter. The response of the filter at any point is
determined by the ranking result.
Median filter:
It is the best order statistic filter; it replaces the value of a pixel by the median of gray levels
in the Neighborhood of the pixel.
The original value of the pixel is included in the computation of the median. Median filters are quite popular because, for certain types of random noise, they provide excellent noise reduction capabilities with considerably less blurring than linear smoothing filters of similar size. They are effective for both bipolar and unipolar impulse noise.
Max and min filters:
Using the 100th percentile of a ranked set of numbers gives the max filter,
f^(x, y) = max { g(s, t) }
It is used for finding the brightest point in an image. Pepper noise, which has very low values, is reduced by the max filter through the max selection process in the subimage area. The 0th percentile filter is the min filter, which finds the darkest point in an image and reduces salt noise.
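A small sketch of the max and min (100th and 0th percentile) filters over a 3*3 neighbourhood; NumPy and replicate padding at the borders are assumptions:

import numpy as np

def order_filter3(img, func):
    p = np.pad(img, 1, mode='edge')
    out = np.empty_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            out[x, y] = func(p[x:x + 3, y:y + 3])
    return out

img = np.full((5, 5), 100, dtype=np.uint8)
img[1, 1], img[3, 3] = 0, 255            # one pepper dot and one salt dot
print(order_filter3(img, np.max)[1, 1])  # 100: the max filter removes pepper noise
print(order_filter3(img, np.min)[3, 3])  # 100: the min filter removes salt noise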
4. Periodic Noise Reduction by Frequency Domain Filtering:
(Source: D. E. Dudgeon and R. M. Mersereau, "Multidimensional Digital Signal Processing", Prentice Hall Professional Technical Reference, 1990, p. 312.)
The function of a band pass filter is opposite to that of a band reject filter: it allows a specific frequency band of the image to be passed and blocks the rest of the frequencies. The transfer function of a band pass filter can be obtained from a corresponding band reject filter with transfer function Hbr(u,v) by using the equation
Hbp(u, v) = 1 - Hbr(u, v)
A notch filter rejects (or passes) frequencies in predefined neighborhoods about a center frequency. Due to the symmetry of the Fourier transform, notch filters must appear in symmetric pairs about the origin. The transfer function of an ideal notch reject filter of radius D0, with centers at (u0, v0) and, by symmetry, at (-u0, -v0), is
H(u, v) = 0 if D1(u, v) ≤ D0 or D2(u, v) ≤ D0, and H(u, v) = 1 otherwise,
where D1(u, v) and D2(u, v) are the distances from (u, v) to (u0, v0) and to (-u0, -v0), respectively. Butterworth and Gaussian notch reject filters are defined analogously.
5. Inverse Filtering:
Inverse filtering is the simplest approach to restoration when the degradation function H(u, v) is known: an estimate of the transform of the original image is obtained by dividing the transform of the degraded image by the degradation function,
F^(u, v) = G(u, v) / H(u, v)
Applications:
a. Image Deblurring: Inverse filtering is commonly used to restore images that have been degraded by blurring, for example, due to a defocused camera or motion blur. By applying the inverse of the blurring filter, one can attempt to recover the original sharp image.
b. Signal Deconvolution: In the realm of signal processing, inverse filtering is used to
deconvolve signals. This can help recover the original signal from a distorted or noisy version.
For example, in communication systems, it can be used to mitigate channel-induced distortion.
c. Astronomy: In astronomy, inverse filtering is used to enhance the quality of astronomical
images by compensating for atmospheric distortions, telescope imperfections, or other forms of
degradation.
d. Medical Imaging: Inverse filtering can be applied in medical imaging to improve the quality
of MRI or CT scan images by correcting for motion artifacts or other sources of degradation.
e. Audio Processing: In audio processing, inverse filtering can be used to reduce the impact of
room reverberation or to remove the effects of a known acoustic filter, improving speech or
audio quality.
f. Seismic Imaging: In the field of geophysics, inverse filtering is applied to seismic data to
correct for the effects of the subsurface, enabling a clearer interpretation of subsurface structures.
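A frequency-domain sketch of inverse filtering with a known blur H(u, v); the box blur kernel, the small-|H| safeguard and the noise-free synthetic setting are all assumptions made for illustration:

import numpy as np

rng = np.random.default_rng(5)
f = rng.random((64, 64))                              # "original" image
h = np.zeros_like(f); h[:5, :5] = 1.0 / 25.0          # a known 5x5 box blur
H = np.fft.fft2(h)
G = np.fft.fft2(f) * H                                # degraded spectrum (no noise added)

eps = 1e-8                                            # guard against tiny |H|
H_safe = np.where(np.abs(H) < eps, eps, H)
f_hat = np.real(np.fft.ifft2(G / H_safe))             # F_hat = G / H

# Essentially zero here; with noise present, small |H| values would amplify
# the noise term, which is the main weakness of plain inverse filtering.
print(np.max(np.abs(f - f_hat)))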
3. Challenges:
The main difficulty with inverse filtering is its sensitivity to noise: wherever H(u, v) is small, the noise term in the degraded image is greatly amplified, so the method is useful only when the degradation function is known accurately and the noise level is low.
Minimum Mean Square Error Filtering:
The goal is to estimate an image from a statistical model of the image and noise and the available noisy or distorted image. When the filter is linear, minimum mean squared error (MMSE) filters may be designed using closed form matrix expressions. Simplicity of
design is an important advantage of optimal linear filters. Suppose we are given a noisy or
distorted image x and we want to estimate the image y by applying a linear filter to x. The
estimate ˆys at lattice location s can then be written as
yˆs = zsθ
where zs = [xs, xs+r1 , . . . , xs+rp−1 ] is a row vector of pixels from a window surrounding xs,
and θ is a column vector of filter coefficients. In MMSE filtering, the goal is to find the vector θ
that will minimize the expected mean square prediction error
MSE = E[|ys − yˆs| 2 ]
θ ∗ = arg min θ E[|ys − zsθ| 2 ] .
The solution for θ that minimizes the MSE can be shown to be
θ* = Rzz^(-1) rzy
where Rzz = E[ zs^T zs ] is the autocorrelation matrix of the window vectors and rzy = E[ zs^T ys ] is the cross-correlation vector between the window and the desired pixel.
In practice, the values of Rzz and rzy may not be known, so that they must be estimated from
examples of the image pairs X and Y . The coefficients for the filter may then be estimated in a
training procedure known as least squares estimation. Least squares estimation determines that
values of the filter coefficients which actually minimize the total squared error for a specific set
of training images. To do this, let Y = [y1, y2, . . . , yN ] T be a column vector of pixels from
the desired image. For reasons that will be discussed later, this vector Y may not contain all the
pixels in y. For each ys there is an associated set of pixels zs = [xs, xs+r1 , . . . , xs+rp−1 ] in a
window surrounding xs. We can then express the column vector of prediction errors as
e = Y - Z θ
where Z is an N × p matrix in which each row zs contains the p pixels from a window surrounding the
corrupted pixel xs. The total squared error is then given by
||e||² = (Y - Z θ)^T (Y - Z θ)
and the filter coefficients that minimize it are
θ̂ = (Z^T Z)^(-1) Z^T Y
In practice the training pairs are subsampled: a subset of pixel locations π(i) is chosen from y, and the matrix Z
is then formed by the associated windows in x centered about the locations xπ(i). Notice that
the original images x and y are left at their original resolution, but that the pairs (zs, ys) are
sparsely sampled.
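A compact NumPy sketch of this training procedure on synthetic data (the additive-noise corruption, the 3x3 window and the sampling step are assumptions, not the lab's Matlab setup):

import numpy as np

rng = np.random.default_rng(6)
y = rng.random((60, 60))                      # desired image
x = y + 0.1 * rng.normal(size=y.shape)        # corrupted observation

# Build Z (N x p) and Y from 3x3 windows, sparsely sampled every 4th pixel.
p = np.pad(x, 1, mode='edge')
Z, Yv = [], []
for r in range(0, 60, 4):
    for c in range(0, 60, 4):
        Z.append(p[r:r + 3, c:c + 3].ravel())   # z_s: the 9 pixels around x[r, c]
        Yv.append(y[r, c])
Z, Yv = np.array(Z), np.array(Yv)

# Least squares estimate of the filter coefficients: theta = (Z^T Z)^-1 Z^T Y
theta, *_ = np.linalg.lstsq(Z, Yv, rcond=None)
print(theta.round(3))     # 9 filter coefficients (the centre weight is largest)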
1. Download the zip file images.zip from the lab home page. This file contains an image,
img14g.tif, a blurred version img14bl.tif, and two noisy versions, img14gn.tif and img14sp.tif.
2. Use Matlab to compute estimates of the covariance matrix Rˆ zz and the cross correlation
rˆzy for a 7×7 prediction window. Use the original img14g.tif for Y and use img14bl.tif for X.
Only sample the pairs (zs, ys) at (1/400)th of the pixel locations in the image. You can do this
by taking a sample at every 20th column and every 20th row. The Matlab reshape command
may be useful in this exercise.
3. Using your estimates Rˆzz and rˆzy, compute the corresponding filter coefficients θ*.
4. Apply the optimal filter to the image img14bl.tif. Print or export the filtered image for your report.
5. Repeat this procedure using img14gn.tif for X. Then repeat the procedure using img14sp.tif for X.
6. GEOMETRIC TRANSFORMATIONS:
• Geometric transforms permit the elimination of geometric distortion that occurs when an
image is captured.
• An example is an attempt to match remotely sensed images of the same area taken after
one year, when the more recent image was probably not taken from precisely the same
position.
• To inspect changes over the year, it is necessary first to execute a geometric transformation,
and then subtract one image from the other.
• A geometric transform is a vector function T that maps the pixel (x,y) to a new position
(x',y').
• The transformation equations are either known in advance or can be determined from
known original and transformed images.
• Several pixels in both images with known correspondence are used to derive the unknown
transformation.
• The general bilinear transform has the form
x' = a0 + a1 x + a2 y + a3 x y, y' = b0 + b1 x + b2 y + b3 x y
➢ 4 pairs of corresponding points are sufficient to find the transformation coefficients.
• Even simpler is the affine transformation,
x' = a0 + a1 x + a2 y, y' = b0 + b1 x + b2 y,
for which three pairs of corresponding points are sufficient to find the coefficients.
• A transformation is characterized by its Jacobian determinant J. If the transformation is singular (has no inverse) then J = 0. If the area of the image is invariant under the transformation then J = 1.
• The Jacobian for the general bilinear transform is
J = (a1 + a3 y)(b2 + b3 x) - (a2 + a3 x)(b1 + b3 y)
Nearest neighborhood interpolation:
• The right side of the Figure shows how the new brightness is assigned.
• Dashed lines show how the inverse planar transformation maps the raster of the output image into the input image; full lines show the raster of the input image.
• The position error of the nearest neighborhood interpolation is at most half a pixel.
• This error is perceptible on objects with straight line boundaries that may appear step-like
after the transformation.
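A sketch of applying a geometric transform by inverse mapping with nearest neighborhood interpolation; the small rotation used as the transform and the NumPy usage are assumptions:

import numpy as np

def warp_nearest(img, T_inv):
    # Fill each output pixel from the input pixel nearest to T_inv(x', y').
    M, N = img.shape
    out = np.zeros_like(img)
    for xp in range(M):
        for yp in range(N):
            x, y = T_inv(xp, yp)                 # inverse planar transformation
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < M and 0 <= yi < N:
                out[xp, yp] = img[xi, yi]        # nearest neighbour assignment
    return out

theta = np.deg2rad(10.0)
c, s = np.cos(theta), np.sin(theta)
T_inv = lambda xp, yp: (c * xp + s * yp, -s * xp + c * yp)   # rotate back by 10 degrees
img = np.eye(32, dtype=np.uint8) * 255
print(warp_nearest(img, T_inv).sum() > 0)        # True: the diagonal line is rotated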
Linear interpolation:
• Explores four points neighboring the point (x, y), and assumes that the brightness function is linear in this neighborhood.
• Linear interpolation can cause a small decrease in resolution and blurring due to its averaging nature.
• The problem of step-like straight boundaries seen with nearest neighborhood interpolation is reduced.
Bicubic Interpolation:
• Improves the model of the brightness function by approximating it locally by a bicubic
polynomial surface; sixteen neighboring points are used for interpolation.
• The interpolation kernel (`Mexican hat') is given by
h(x) = 1 - 2|x|² + |x|³ for 0 <= |x| < 1
h(x) = 4 - 8|x| + 5|x|² - |x|³ for 1 <= |x| < 2
h(x) = 0 otherwise
1 MARKS
13. An ___ is a place where there is a rapid change in the brightness (or other property) of an image.
a) point
b) edge
c) line
d) None of the above
Answer: b
c) does not place any emphasis on the pixels that are closer to the centre of the mask.
d) detects vertical and horizontal edges of an image
Answer: c & d
5 MARKS
1. Explain about Restoration Process.
2. Explain about Noise Models.
3. Explain about Spatial Filtering.
4. Explain about Linear, Position-Invariant Degradations.
10 MARKS.
1. Explain about Periodic Noise reduction by Frequency Domain Filtering.
2. Explain about Minimum Mean Square Error Filtering.
3. Explain about Constrained Least Square Filtering.
4. Explain about Geometric Transformation.
compression transformation of the image; the more correlated the image data are, the more
data items can be removed.
• The requirements are weaker, but the image data compression must not cause significant changes in an image.
• Data compression success is usually measured in the reconstructed image by the mean squared error (MSE), signal-to-noise ratio, etc., although these global error measures do not always reflect subjective image quality.
• Image data compression design - 2 parts.
• 1) Image data properties determination
o gray level histogram
o image entropy
o various correlation functions
o etc.
• 2) Appropriate compression technique design.
• where b is the smallest number of bits with which the image quantization levels can be
represented.
• A good estimate of entropy is usually not available.
• Image data entropy can however be estimated from a gray-level histogram.
• Let h(k) be the frequency of gray level k in an image f, 0 <= k <= 2^b - 1, and let the image size be M x N.
• The probability of occurrence of gray level k can be estimated as P(k) = h(k) / (M N).
• Example
o the entropy of satellite remote sensing data may be between 4 and 5, considering 8
bits per pixel
o the information redundancy is between 3 and 4 bits
o these data can be represented by an average data volume of 4 to 5 bits per pixel
with no loss of information, and the compression ratio would be between 1.6 and
2.
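A minimal Matlab sketch of this entropy estimate, assuming an 8-bit gray-level image; the file name is only an example.

% Hypothetical sketch: estimate image entropy from the gray-level histogram,
% using P(k) = h(k)/(M*N) and H = -sum P(k) log2 P(k).
f = imread('img14g.tif');            % 8-bit gray-level image, so b = 8
b = 8;
[M, N] = size(f);
h = zeros(1, 2^b);
for k = 0 : 2^b - 1
    h(k+1) = sum(f(:) == k);         % frequency of gray level k
end
P = h / (M * N);                     % estimated probabilities
P = P(P > 0);                        % skip empty bins (0*log 0 is taken as 0)
H = -sum(P .* log2(P));              % entropy estimate in bits per pixel
redundancy = b - H;                  % information redundancy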
Discrete image transforms in image data compression
• Basic idea: image data representation by coefficients of discrete image transforms.
• The transform coefficients are ordered according to their importance
o the least important (low contribution) coefficients are omitted.
• To remove correlated image data, the Karhunen-Loeve transform is the most important.
• This transform builds a set of non-correlated variables with decreasing variance.
• Linear predictor of the third order is sufficient for estimation in a wide variety of images.
• The estimate ~f can be computed as
• and the solution, assuming f is a stationary random process with a zero mean, using a
predictor of the third order, is
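A minimal Matlab sketch of prediction with a third-order linear predictor over the common causal neighborhood (left, upper-left and upper pixels); the coefficients below are hand-picked for illustration, whereas in practice they would come from the solution described above.

% Hypothetical sketch: DPCM-style prediction with a third-order linear predictor.
f = double(imread('img14g.tif'));
[M, N] = size(f);
a = [0.75, -0.5, 0.75];              % illustrative predictor coefficients only
e = zeros(M, N);                     % prediction error (the values that get coded)
for i = 2:M
    for j = 2:N
        fhat = a(1)*f(i, j-1) + a(2)*f(i-1, j-1) + a(3)*f(i-1, j);
        e(i, j) = f(i, j) - fhat;    % small, largely decorrelated values
    end
end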
o Vector quantization: map k-dimensional vectors in the vector space R^k into a finite set of vectors.
o Zones are defined in the space, and the set of points contained in each zone is projected onto a representative vector (centroid).
o Example: 2-dimensional spaces.
Hierarchical and progressive compression techniques
• A substantial reduction in bit volume can be obtained by merely representing a source as a pyramid.
• Approaches exist for which the entire pyramid requires data volume equal to that of the full resolution image.
• Even more significant reduction can be achieved for images with large areas of the same
gray level if a quadtree coding scheme is applied.
• Transform-based methods better preserve subjective image quality, and are less sensitive
to statistical image property changes both inside a single image and between images.
• Prediction methods, on the other hand, can achieve larger compression ratios in a much
less expensive way, tend to be much faster than transform-based or vector quantization
compression schemes, and can easily be realized in hardware.
• If compressed images are transmitted, an important property is insensitivity to transmission
channel noise. Transform-based techniques are significantly less sensitive to the channel
noise - if a transform coefficient is corrupted during transmission, the resulting image
distortion is homogeneously spread through the image or image part and is not too
disturbing.
• Erroneous transmission of a difference value in prediction compressions causes not only an error in a particular pixel; it also influences values in the neighborhood through the predictor, which has a considerable visual effect in a reconstructed image.
• Pyramid-based techniques have a natural compression ability and show a potential for further improvement of compression ratios. They are suitable for dynamic image compression and for progressive and smart transmission approaches.
Coding
o GIF (Graphics Interchange Format) stores RGB images (and the appropriate palette) with pixel depths between 1 and 8 bits.
▪ Blocks of data are encoded using the LZW algorithm.
▪ GIF has two versions - 87a and 89a, the latter supporting the storing of text and graphics in the same file.
o TIFF (Tagged Image File Format) was first defined by the Aldus Corporation in
1986, and has gone through a number of versions to incorporate RGB color,
compressed color (LZW), other color formats and ultimately (in Version 6), JPEG
compression (below) -- these versions all have backward compatibility.
• Decompression consists of entropy decoding, dequantizing and inverse DCT.
• In the compression stage, the unsigned image values from the interval [0,(2^b)-1] are first shifted to cover the interval [-2^(b-1),2^(b-1)-1].
• The image is then divided into 8x8 blocks and each block is independently transformed into the frequency domain using the DCT-II transform.
• Many of the 64 DCT coefficients have zero or near-zero values in typical 8x8 blocks which
forms the basis for compression.
• The 64 coefficients are quantized using a quantization table Q(u,v) of integers from 0 to
255 that is specified by the application to reduce the storage/transmission requirements of
coefficients that contribute little or nothing to the image content.
• After quantization, the dc-coefficient F(0,0) is followed by the 63 ac-coefficients, which are ordered by scanning the 2D block in a zig-zag fashion according to their increasing spatial frequency.
• The dc-coefficients are then encoded using predictive coding, the rationale being that
average gray levels of adjacent 8x8 blocks (dc-coefficients) tend to be similar.
• The last step of the sequential JPEG compression algorithm is entropy encoding.
o Two approaches are specified by the JPEG standard.
o The baseline system uses simple Huffman coding while the extended system uses
arithmetic coding and is suitable for a wider range of applications.
• Sequential JPEG decompression uses all the steps described above in the reverse order.
After entropy decoding (Huffman or arithmetic), the symbols are converted into DCT
coefficients and dequantized
• where again, the Q(u,v) are quantization coefficients from the quantization table that is transmitted together with the image data.
• Finally, the inverse DCT transform is performed and the image gray values are shifted back
to the interval [0,(2^b)-1].
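A minimal Matlab sketch of the per-block steps described above (level shift, 8x8 DCT-II, quantization, dequantization and inverse DCT); the quantization table Q here is an arbitrary constant table for illustration, not the standard JPEG one.

% Hypothetical sketch: JPEG-style processing of a single 8x8 block.
b = 8;
f = double(imread('img14g.tif')) - 2^(b-1);   % shift values to [-128, 127]
block = f(1:8, 1:8);                          % one 8x8 block
n = 8;
T = zeros(n, n);                              % orthonormal DCT-II matrix
for k = 0:n-1
    for m = 0:n-1
        if k == 0
            a = sqrt(1/n);
        else
            a = sqrt(2/n);
        end
        T(k+1, m+1) = a * cos(pi*(2*m+1)*k/(2*n));
    end
end
F = T * block * T';                           % forward 2D DCT of the block
Q = 16 * ones(n);                             % illustrative quantization table
Fq = round(F ./ Q);                           % quantized coefficients (many zeros)
Fhat = Fq .* Q;                               % dequantization at the decoder
blockhat = T' * Fhat * T + 2^(b-1);           % inverse DCT and shift back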
QUESTION BANK
1 MARKS:
Explanation: The median filter belongs to the order-statistic filters, which substitute the pixel value by the median of the gray levels that exist in the neighborhood of the pixel.
Answer: b) s = c log10(1 + r)
Explanation: The power-law transformation can be mathematically written as s = c*r^γ, where c and γ represent positive constants. However, we can write the same equation in another way, such as s = c*(r + ε)^γ, which accounts for an offset.
4. Which of the following requires to specify the information at the time of input?
a. Power transformation
b. Log transformation
c. Linear transformation
d. Piece-wise transformation
Answer: d) Piece-wise transformation
Explanation: Piece-wise transformation plays a vital role while formulating some other transformations. Its only disadvantage is that it requires a considerable number of inputs.
Answer: a) Laplacian
Explanation: Laplacian is the second-order derivative operator.
6. Which of the following is used to resolve the dark features in the image?
a. Gaussian Transform
b. Laplacian Transform
c. Power-law Transformation
d. Histogram Specification
Answer: b) 0
9. What is the name of the process that moves a filter mask over the image, followed by calculating the sum of products?
a. Correlation
b. Convolution
c. Linear spatial filtering
d. Non-linear spatial filtering
Answer: a) Correlation
Explanation: Correlation can be defined as the process of moving a filter, which is often denoted
as a kernel over the image to compute the sum of products at each distinct location.
11. Which of the following operations is used in homomorphic filtering for converting the input image to a discrete Fourier transformed function?
a. Exponential Function
b. Logarithmic Function
c. Negative Function
d. None of the above
Answer: b) Logarithmic Function
Explanation: A homomorphic system is the only class of systems that helps achieve the separation of the illumination and reflectance components of an image.
14. Given an intensity level [0, L-1] with "r" and "s" as positive values, how is the negative of an image obtained?
a. s = L - 1 - r
b. s = L - 1 + r
c. s = L + 1 - r
d. s = L + 1 + r
Answer: a) s = L - 1 - r
15. In general, the log transformation can be represented by _________
a. s = c.log (1 - r)
b. s = c - log (1 - r)
c. s = c.log (1 + r)
d. s = c + log (1 + r)
Answer: c) s = c.log (1 + r)
10 MARKS:
1. Explain the image data compression in detail.
2. Explain the discrete image transforms in image data compression in detail (April 2012).
3. Explain the hierarchical and progressive compression techniques in detail.
4. Explain the JPEG and MPEG image compression in detail (April 2012).
• Because of the different natures of the various edge- and region-based algorithms, they may be expected to give somewhat different results and consequently different information.
• The segmentation results of these two approaches can therefore be combined in a single
description structure, e.g., a relational graph.
THRESHOLDING:
Gray level thresholding is the simplest segmentation process. Many objects or image
regions are characterized by constant reflectivity or light absorption of their surface.
Thresholding is computationally inexpensive and fast. Thresholding can easily be done in
real time using specialized hardware. Complete segmentation can result from thresholding in
simple scenes.
1. Thresholding algorithm:
Basic thresholding:
➢ Search all the pixels f(i,j) of the image f. An image element g(i,j) of the segmented image is an object pixel if f(i,j) >= T, and is a background pixel otherwise.
• Correct threshold selection is crucial for successful threshold segmentation.
• Threshold selection can be interactive or can be the result of some threshold detection method.
• A single global threshold is successful only under very unusual circumstances:
➢ Gray level variations are likely - due to non-uniform lighting, non-uniform input device parameters or a number of other factors.
• Variable thresholding (also adaptive thresholding), in which the threshold value varies
over the image as a function of local image characteristics, can produce the solution in
these cases.
➢ image f is divided into subimages fc
➢ a threshold is determined independently in each subimage
➢ If a threshold cannot be determined in some subimage, it can be interpolated
from thresholds determined in neighboring subimages.
➢ Each subimage is then processed with respect to its local threshold.
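A minimal Matlab sketch of basic global thresholding and of variable thresholding over subimages; the global threshold and the use of the local mean as the subimage threshold are illustrative choices, not prescribed by the text.

% Hypothetical sketch: global thresholding and simple subimage-based
% (variable) thresholding.
f = double(imread('img14g.tif'));
T = 128;                                  % hand-picked global threshold
g = f >= T;                               % object pixels = 1, background = 0
[M, N] = size(f);
bs = 64;                                  % subimage size (assumed to divide M and N)
gv = false(M, N);
for i = 1 : bs : M - bs + 1
    for j = 1 : bs : N - bs + 1
        sub = f(i:i+bs-1, j:j+bs-1);
        Tl = mean(sub(:));                % local threshold for this subimage
        gv(i:i+bs-1, j:j+bs-1) = sub >= Tl;
    end
end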
2. Thresholding modifications:
➢ Band-thresholding:
• segment an image into regions of pixels with gray levels from a set D and into
background otherwise
➢ In text segmentation, prior information about the ratio between the sheet area
and character area can be used.
➢ If such a priori information is not available - another property, for example
the average width of lines in drawings, etc. can be used - the threshold can
be determined to provide the required line width in the segmented image.
b. More complex methods of threshold detection:
❖ Based on histogram shape analysis.
❖ Bimodal histogram - if objects have approximately the same gray level
that differs from the gray level of the background.
• Bimodal histogram threshold detection algorithms
5. Multi-spectral thresholding:
For multispectral or color images, one segmentation approach determines thresholds independently in each spectral band and combines them into a single segmented image.
EDGE-BASED SEGMENTATION:
o Edge-based segmentation represents a large group of methods based on information
about edges in the image.
o Edge-based segmentations rely on edges found in an image by edge detecting operators
-- these edges mark image locations of discontinuities in gray level, color, texture, etc.
o The image resulting from edge detection cannot be used directly as a segmentation result. Supplementary processing steps must follow to combine edges into edge chains that correspond better with borders in the image.
• Edge context is considered at both ends of an edge, giving the minimal edge neighborhood.
• The central edge e has a vertex at each of its ends and three possible border
continuations can be found from both of these vertices.
• Vertex type -- number of edges emanating from the vertex, not counting the edge
e.
• The type of edge e can then be represented using a number pair i-j describing
edge patterns at each vertex, where i and j are the vertex types of the edge e.
• The main steps of the above Algorithm are evaluation of vertex types followed by
evaluation of edge types, and the manner in which the edge confidences are
modified.
• A vertex is considered to be of type i if
• Edge relaxation, as described above, rapidly improves the initial edge labeling in the first few iterations.
• Unfortunately, it often slowly drifts, giving worse results than expected after larger numbers of iterations.
• The reason for this strange behavior is the search for the global maximum of the edge consistency criterion over the whole image, which may not give locally optimal results.
• A solution is found in setting edge confidences to zero under a certain threshold,
and to one over another threshold which increases the influence of original image
data.
• Therefore, one additional step must be added to the edge confidence computation,
where T_1 and T_2 are parameters controlling the edge relaxation convergence speed and resulting border accuracy.
• This method makes multiple labeling possible; the existence of two edges at
different directions in one pixel may occur in corners, crosses, etc.
• The relaxation method can easily be implemented in parallel.
C. BORDER TRACING:
• If a region border is not known but regions have been defined in the image, borders
can be uniquely detected.
• First, let us assume that the image with regions is either binary or that regions have
been labeled.
• An inner region border is a subset of the region.
• The inner border tracing algorithm works for all regions larger than one pixel.
• Looking for the border of a single-pixel region is a trivial problem.
• This algorithm is able to find region borders but does not find borders of region
holes.
• To search for hole borders as well, the border must be traced starting in each
region or hole border element if this element has never been a member of any
border previously traced.
• Note that if objects are of unit width, more conditions must be added.
• If the goal is to detect an outer region border, the given algorithm may still
be used based on 4-connectivity.
• Note that some border elements may be repeated in the outer border up to three
times.
• Using an intuitive definition of RIGHT, LEFT, UPPER, and LOWER outer boundary points, the extended boundary may be obtained by shifting all the UPPER outer boundary points one pixel down and right, shifting all the LEFT outer boundary points one pixel to the right, and shifting all the RIGHT outer boundary points one pixel down. The LOWER outer boundary point positions remain unchanged.
• The look-up table approach makes the tracing more efficient than
conventional methods and makes parallel implementation possible.
• In addition to extended boundary tracing, it provides a description of each
boundary segment in chain code form together with information about
vertices.
• This method is very suitable for representing borders in higher level
segmentation approaches including methods that integrate edge-based and
region-based segmentation results.
• Resulting regions of the segmented image must be both homogeneous and maximal.
A. REGION MERGING:
C
• S SW
The data structure called supergrid carries all the necessary information for region
merging in 4-adjacency using crack edges.
➢ Merging heuristics:
❖ Two adjacent regions are merged if a significant part of their common boundary
consists of weak edges
❖ Two adjacent regions are also merged if a significant part of their common
boundary consists of weak edges, but in this case not considering the total length
of the region borders.
• Of the two given heuristics, the first is more general and the second cannot be used alone
because it does not consider the influence of different region sizes.
• Edge significance can be evaluated according to the formula v_ij = 0 if s_ij < T_1 (a weak edge), v_ij = 1 otherwise (a significant edge), where T_1 is a preset threshold and s_ij is the crack edge value, s_ij = |f(x_i) - f(x_j)|.
• The supergrid data structure allows precise work with edges and borders, but a big disadvantage of this data structure is that it is not suitable for the representation of regions.
• A good data structure to use can be a planar region adjacency graph.
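A minimal Matlab sketch of the crack-edge significance test above; the image, the threshold T_1 and the variable names are illustrative.

% Hypothetical sketch: crack edge values s_ij = |f(x_i) - f(x_j)| between
% 4-adjacent pixels, thresholded into weak (0) and significant (1) edges.
f = double(imread('img14g.tif'));
T1 = 20;                                         % preset threshold
s_h = abs(f(:, 1:end-1) - f(:, 2:end));          % crack edges between horizontal neighbours
s_v = abs(f(1:end-1, :) - f(2:end, :));          % crack edges between vertical neighbours
v_h = s_h >= T1;                                 % 1 = significant, 0 = weak
v_v = s_v >= T1;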
B. REGION SPLITTING
• Region splitting is the opposite of region merging.
• It begins with the whole image represented as a single region which does not usually
satisfy the condition of homogeneity.
• The existing image regions are sequentially split to satisfy all the above given
conditions of homogeneity.
• Region splitting does not result in the same segmentation as region merging, even if the same homogeneity criteria are used.
• If any region in any pyramid level is not homogeneous (excluding the lowest level), it is split into four subregions -- these are elements of higher resolution at the level below.
• If four regions exist at any pyramid level with approximately the same value of homogeneity measure, they are merged into a single region in an upper pyramid level.
• The segmentation process can be understood as the construction of a segmentation
quadtree where each leaf node represents a homogeneous region.
• Splitting and merging corresponds to removing or building parts of the segmentation
quadtree.
• Split-and-merge methods usually store the adjacency information in region adjacency
graphs (or similar data structures).
• Using segmentation trees, in which regions do not have to be contiguous, is both
implementationally and computationally easier.
• An unpleasant drawback of segmentation quadtrees is the square region shape
assumption
➢ merging of regions which are not part of the same branch of the segmentation
tree
• Because both split-and-merge processing options are available, the starting
segmentation does not have to satisfy any of the homogeneity conditions.
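A minimal Matlab sketch of the splitting half of this process, assuming a square image whose side is a power of two and a simple max-min homogeneity criterion; the merging step and the region adjacency bookkeeping are omitted.

% Hypothetical sketch (file split_region.m): recursively split a square block
% into four quadrants until each block satisfies the homogeneity criterion
% max - min <= T, labelling each homogeneous leaf block as its own region.
function L = split_region(f, T)
    if (max(f(:)) - min(f(:)) <= T) || numel(f) == 1
        L = ones(size(f));                      % homogeneous: a single region
    else
        n = size(f, 1); h = n/2;
        L1 = split_region(f(1:h,   1:h),   T);  % split the four quadrants recursively
        L2 = split_region(f(1:h,   h+1:n), T);
        L3 = split_region(f(h+1:n, 1:h),   T);
        L4 = split_region(f(h+1:n, h+1:n), T);
        L2 = L2 + max(L1(:));                   % keep region labels unique
        L3 = L3 + max(L2(:));
        L4 = L4 + max(L3(:));
        L  = [L1 L2; L3 L4];
    end
end
% Example call (assumes a square, power-of-two sized image):
% L = split_region(double(imread('img14g.tif')), 30);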
• Regions may already have been assigned a label in previous search locations, but these labels do not necessarily match the splitting pattern found in the processed block.
• If a mismatch is detected in step 3 of the algorithm, it is necessary to resolve
possibilities of merging regions that were considered separate so far - to assign the
same label to two regions previously labeled differently.
• Two regions R_1 and R_2 are merged into a region R_3 if |m_1 - m_2| < T, where m_1 and m_2 are the mean gray level values in regions R_1 and R_2, and T is some appropriate threshold.
• If region merging is not allowed, regions keep their previous labels.
• If larger blocks are used, more complex image properties can be included in the
homogeneity criteria (even if these larger blocks are divided into 2x2 sub-blocks to
determine the splitting pattern).
D. WATERSHED SEGMENTATION:
• The concepts of watersheds and catchment basins are well known in topography.
• Watershed lines divide individual catchment basins.
• The North American Continental Divide is a textbook example of a watershed line with
catchment basins formed by the Atlantic and Pacific Oceans.
• Image data may be interpreted as a topographic surface where the gradient image gray-
levels represent altitudes.
• Region edges correspond to high watersheds and low-gradient region interiors
correspond to catchment basins.
• Catchment basins of the topographic surface are homogeneous in the sense that all
pixels belonging to the same catchment basin are connected with the basin's region of
minimum altitude (gray-level) by a simple path of pixels that have monotonically
decreasing altitude (gray-level) along the path.
• Such catchment basins then represent the regions of the segmented image.
• The concept of watersheds and catchment basins is quite straightforward.
• Early watershed methods resulted in either slow or inaccurate execution.
• Most of the existing algorithms start with extraction of potential watershed line pixels using a local 3 x 3 operation, which are then connected into geomorphological networks in subsequent steps. Due to the local character of the first step, these approaches are often inaccurate.
• A watershed transformation was also introduced in the context of mathematical
morphology - computationally demanding and therefore time consuming.
• Two basic approaches to watershed image segmentation.
➢ The first one starts with finding a downstream path from each pixel of the
image to a local minimum of image surface altitude.
➢ A catchment basin is then defined as the set of pixels for which their respective
downstream paths all end up in the same altitude minimum.
➢ While the downstream paths are easy to determine for continuous altitude
surfaces by calculating the local gradients, no rules exist to define the
downstream paths uniquely for digital surfaces.
➢ The second approach is essentially dual to the first one; instead of identifying
the downstream paths, the catchment basins fill from the bottom.
➢ Imagine that there is a hole in each local minimum, and that the topographic
surface is immersed in water - water starts filling all catchment basins, minima
• All pixels with gray-level k+1 that belong to the influence zone of a catchment basin
labeled l are also labeled with the label l, thus causing the catchment basin to grow.
• The pixels from the queue are processed sequentially, and all pixels from the queue
that cannot be assigned an existing label represent newly discovered catchment basins
and are marked with new and unique labels.
• Example of watershed segmentation.
• While this method would work well in the continuous space with the watershed
lines accurately dividing the adjacent catchment basins, the watersheds in images
with large plateaus may be quite thick in discrete spaces:
• This algorithm will execute much faster if all regions smaller than a preselected size
are merged with their neighbors without having to order them by size.
MATCHING:
• Matching is another basic approach to segmentation that can be used to locate known
objects in an image, to search for specific patterns, etc.
• The best match is based on some criterion of optimality which depends on object properties
and object relations.
• Matched patterns can be very small, or they can represent whole objects of interest.
• While matching is often based on directly comparing gray-level properties of image
subregions, it can be equally well performed using image-derived features or higher-level
image descriptors.
• In such cases, the matching may become invariant to image transforms.
• Criteria of optimality can compute anything from simple correlations up to complex
approaches of graph matching.
A. MATCHING CRITERIA:
➢ ... step, and then it is not necessary to test all possible pattern locations.
➢ Another speed improvement can be realized if a mismatch can be detected before all the corresponding pixels have been tested.
➢ The correlation changes slowly around the best matching location ... matching can be tested at lower resolution first, looking for an exact match in the neighborhood of good low-resolution matches only.
• The mismatch must be detected as soon as possible since mismatches are found much more
often than matches.
• Considering the matching formulae given above, testing in a specified position must stop
when the value in the denominator (measure of mismatch) exceeds some preset threshold.
• This implies that it is better to begin the correlation test in pixels with a high probability of
mismatch in order to get a steep growth in the mismatch criterion.
• This criterion growth will be faster than that produced by an arbitrary pixel order
computation.
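A minimal Matlab sketch of matching by a mismatch (sum of squared differences) criterion with early termination once the mismatch exceeds a preset threshold, as described above; the pattern and the threshold are illustrative.

% Hypothetical sketch: locate a small pattern h in image f, stopping the test
% at a given position as soon as the accumulated mismatch is too large.
f = double(imread('img14g.tif'));
h = f(101:116, 201:216);                  % illustrative 16x16 pattern cut from f
[M, N] = size(f); [m, n] = size(h);
Tmax = 1e4;                               % mismatch threshold for early termination
best = inf; best_pos = [1 1];
for i = 1 : M-m+1
    for j = 1 : N-n+1
        mismatch = 0;
        for u = 1:m                       % accumulate the mismatch row by row
            d = f(i+u-1, j:j+n-1) - h(u, :);
            mismatch = mismatch + sum(d.^2);
            if mismatch > Tmax, break; end    % this position cannot match well
        end
        if mismatch < best
            best = mismatch; best_pos = [i j];
        end
    end
end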
• Construct a matrix that repeats an initial mean, and subtract "dataPts" from it. Then square every component of the resulting matrix and sum each column individually to get the vector "SqDistToAll". For example, with the initial mean [4; 1; 2] repeated over six columns (3.4):

( [4 4 4 4 4 4; 1 1 1 1 1 1; 2 2 2 2 2 2] - dataPts ) .^ 2 =
     9  4  1  0  1  4
     4 16  9  0 36 64
     4  9  1  0 16 25

• Summing every column then gives SqDistToAll = [17 29 11 0 53 93]. (3.5)
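A minimal Matlab sketch of the same computation; the dataPts values below are one possible set of points consistent with the numbers in the example, chosen only for illustration.

% Hypothetical sketch: squared distances from one mean to every column of dataPts.
initMean = [4; 1; 2];
dataPts  = [1 2 3 4 3 6; -1 5 4 1 7 9; 0 5 1 2 6 7];      % illustrative data only
D = repmat(initMean, 1, size(dataPts, 2)) - dataPts;      % repeat the mean and subtract
SqDistToAll = sum(D.^2, 1);                               % gives [17 29 11 0 53 93]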
• This interaction must specify an approximate shape and starting position for the snake
somewhere near the desired contour.
• A priori information is then used to push the snake toward an appropriate solution.
• The line-based functional may be very simple, E_line = f(x,y), where f(x,y) denotes image gray levels at image location (x,y).
• The sign of w_line specifies whether the snake is attracted to light or dark lines.
• The edge-based functional attracts the snake to contours with large image gradients - that
is, to locations of strong edges:
• Line terminations and corners may influence the snake using a weighted energy functional E_term.
• The snake behavior may be controlled by adjusting the weights w_line, w_edge, w_term.
• ... energy), allowing more a priori knowledge to be incorporated.
• Difficulties with the numerical instability of the original method were overcome by Berger by incorporating an idea of snake growing.
• A single primary snake may begin which later divides itself into pieces.
• The pieces of very low energy are allowed to grow in directions of their tangents while
higher energy pieces are eliminated.
• After each growing step, the energy of each snake piece is minimized (the ends are pulled to the true contour) and the snake growing process is repeated.
• Further, the snake growing method may overcome the initialization problem.
• The primary snake may fall into an unlikely local minimum but parts of the snake may still
lie on salient features.
• The very low energy parts (the probable pieces) of the primary snake are used to initialize
the snake growing in later steps.
• This iterative snake growing always converges and the numerical solution is therefore
stable - the robustness of the method is paid for by an increase in the processing cost of the
algorithm.
• ... solution of the finite element method and has the advantage of greater numerical stability and better efficiency.
• This approach is especially useful in the case of closed or nearly closed contours.
• An additional pressure force is added to the contour interior by considering the curve as a balloon which is inflated.
• This allows the snake to overcome isolated energy valleys resulting from spurious edge points, giving better results.
o ... specified seed points and user-specified functions, called (fuzzy) affinities, which map each pair of image elements to the strength of their local relation.
o Fuzzy algorithms take into consideration various uncertainties such as noise, uneven
illumination/brightness/contrast differences, etc.
o Example: if two regions have about the same gray-scale and if they are relatively close to each other in space, then they are likely to belong to the same object.
o The effectiveness of the FC algorithm depends on the choice of the affinity function, and the general setup can be divided into three components (for any voxels p and q): adjacency, homogeneity, and object feature.
o All voxels are assessed via defined affinity functions for labelling.
o Affinity: local relation between every two image elements u and v
– If u and v are far apart spatially, affinity should be small (or zero)
– If u and v are close spatially, affinity should be large
o Its strength is α(a, b).
o A local fuzzy relation κ indicates how voxels a and b hang together locally in the scene S = (C, f).
o Fuzzy affinity (κ): the local hanging-togetherness between two spels (i.e., space elements).
FC Algorithm
1. Define properties of fuzzy adjacency α and fuzzy affinity κ
2. Determine the affinity values for all pairs of fuzzy adjacent voxels
3. Determine the segmentation seed element c
4. Determine all possible paths between the seed c and all other voxels di in the image domain
considering the fuzzy adjacency relation
5. For each path, determine its strength using minimum affinity along the path
6. For each voxel di , determine its fuzzy connectedness to the seed point c as the maximum
strength of all possible paths < c, …, di > and form connectedness map.
➢ Robustness to occlusion: objects should still be recognized from the description if objects are occluded and only partial shape information is available.
➢ Local/global description character: Global descriptors can only be used if complete object data are available for analysis. Local descriptors describe local object properties using partial information about the objects. Thus, local descriptors can be used for description of occluded objects.
➢ Mathematical and heuristic techniques: A typical mathematical technique is shape
description based on the Fourier transform. A representative heuristic method may
be elongatedness.
➢ Robustness of the description to translation, rotation, and scale transformations, and the behavior of shape description properties in different resolutions.
• Sensitivity to scale is even more serious if a shape description is derived, because shape
may change substantially with image resolution.
• Despite the fact that we are dealing with two-dimensional shape and its description, our world is three-dimensional and the same objects, if seen from different angles (or changing position/orientation in space), may form very different 2D projections.
• The ideal case would be to have a universal shape descriptor capable of overcoming these
changes -- to design projection-invariant descriptors.
• Consider an object with planar faces and imagine how many very different 2D shapes may
result from a given face if the position and 3D orientation of this simple object changes
with respect to an observer. In some special cases, like circles which transform to ellipses,
or planar polygons, projectively invariant features (invariants) can be found.
• Object occlusion is another hard problem in shape recognition. However, the situation is
easier here (if pure occlusion is considered, not combined with orientation variations
yielding changes in 2D projections as discussed above), since visible parts of objects may
be used for description.
A. CHAIN CODES:
• Chain codes describe an object by a sequence of unit-size line segments with a given
orientation.
• If the chain code is used for matching it must be independent of the choice of the first
border pixel in the sequence. One possibility for normalizing the chain code is to find
the pixel in the border sequence which results in the minimum integer number if the
description chain is interpreted as a base four number -- that pixel is then used as the
starting pixel.
➢ Mod 4 or mod 8 differences of successive chain code elements are called a chain code derivative.
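A minimal Matlab sketch of chain code normalization (choosing the cyclic rotation that gives the minimum base-four integer, as described above) and of the mod 4 derivative; the code itself is an illustrative 4-directional border description.

% Hypothetical sketch: normalize a 4-directional chain code with respect to
% the starting pixel, then compute its mod 4 derivative.
code = [0 0 3 3 2 2 1 1];                    % illustrative border chain code
n = numel(code);
best = code; bestval = inf;
for s = 0:n-1
    rot = code([s+1:n, 1:s]);                % cyclic rotation of the code
    val = sum(rot .* 4.^(n-1:-1:0));         % interpret the rotation as a base-4 integer
    if val < bestval
        bestval = val; best = rot;
    end
end
deriv = mod(best([2:n, 1]) - best, 4);       % mod 4 differences of successive elements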
B. SIMPLE GEOMETRIC BORDER REPRESENTATION:
• The following descriptors are mostly based on geometric properties of described regions.
Because of the discrete character of digital images, all of them are sensitive to image
resolution.
✓ Boundary length
✓ Curvature
• Bending energy
• Chord: a chord is a line joining any two points of the region boundary.
• Let b(x,y) = 1 represent the contour points, and b(x,y) = 0 represent all other points.
• Procedures based on region decomposition into smaller and simpler subregions must be applied to describe more complicated regions; the subregions can then be described separately using heuristic approaches.
A. SIMPLE SCALAR REGION DESCRIPTORS:
• Area is given by the number of pixels of which the region consists.
• The real area of each pixel may be taken into consideration to get the real size of a
region.
• If an image is represented as a rectangular raster, simple counting of region pixels will
provide its area.
• If the image is represented by a quadtree, then:
Euler's number:
Projections:
• Horizontal and vertical region projections
Eccentricity:
• The simplest is the ratio of major and minor axes of an object.
Elongatedness:
• A ratio between the length and width of the region bounding rectangle.
• This criterion cannot succeed in curved regions, for which the evaluation of elongatedness must be based on the maximum region thickness.
Rectangularity:
Direction:
• Direction is a property which makes sense in elongated regions only.
• If the region is elongated, direction is the direction of the longer side of a minimum
bounding rectangle.
• If the shape moments are known, the direction θ can be computed as θ = (1/2) arctan( 2 mu_11 / (mu_20 - mu_02) ).
Compactness:
• Compactness is independent of linear transformations and is defined as the ratio of the squared region border length to the region area.
• The most compact region in a Euclidean space is a circle.
• Compactness assumes values in the interval [1, infty) in digital images if the boundary is defined as an inner boundary, while using the outer boundary, compactness assumes values in the interval [16, infty).
• Independence from linear transformations is gained only if an outer boundary representation is used.
Moments:
• Region moment representations interpret a normalized gray level image function as a
probability density of a 2D random variable.
• Properties of this random variable can be described using statistical characteristics -
moments.
• where x, y are the region point co-ordinates (pixel co-ordinates in digitized images) and i, j are the moment orders.
• Translation invariance can be achieved if we use the central moments; in digitized images
mu_ij = sum_x sum_y (x - x_c)^i (y - y_c)^j f(x, y),
where x_c, y_c are the co-ordinates of the region's centroid.
• In the binary case, m_00 represents the region area.
✓ Scale invariant features can also be found in scaled central moments
• Rotation invariance can be achieved if the co-ordinate system is chosen such that mu_11 =
0.
• A less general form of invariance is given by seven rotation, translation, and scale invariant
moment characteristics
• All moment characteristics are dependent on the linear gray level transformations of regions; to describe region shape properties, we work with binary image data (f(i,j) = 1 in region pixels) and the dependence on the linear gray level transform disappears.
• Moment characteristics can be used in shape description even if the region is represented by its boundary.
• Less noise-sensitive results can be obtained from the following shape descriptors
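A minimal Matlab sketch of the basic moment computations above for a binary region (area, centroid, central moments mu_ij, scaled central moments and the direction descriptor); the region itself is illustrative.

% Hypothetical sketch: moment-based descriptors of a binary region
% (f = 1 in region pixels, 0 elsewhere).
f = false(64, 64); f(20:40, 15:50) = true;           % illustrative rectangular region
[X, Y] = meshgrid(1:size(f,2), 1:size(f,1));         % x = column, y = row co-ordinates
m00 = sum(f(:));                                     % m_00 = region area
xc = sum(X(f)) / m00;  yc = sum(Y(f)) / m00;         % region centroid
mu = @(i,j) sum( (X(f) - xc).^i .* (Y(f) - yc).^j ); % central moment mu_ij
eta = @(i,j) mu(i,j) / m00^((i+j)/2 + 1);            % scaled (normalized) central moment
theta = 0.5 * atan2(2*mu(1,1), mu(2,0) - mu(0,2));   % direction of the region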
QUESTION BANK
1 MARKS
a. sharpening
b. blurring
c. smoothing
d. contrast
Answer: (c).smoothing
a. thin
b. thick
c. sharp
d. blur
Answer: (a).thin
d. Both a and b
a. area pixels
b. line pixels
c. point pixels
d. edge pixels
a. 0
b. 1
c. 11
d. x
d. Both a and b
a. morphology
b. set theory
c. extraction
d. recognition
Answer: (a).morphology
a. 0
b. 30
c. 45
d. 90
Answer: (d).90
a. R
b. R'
c. Ri
d. Rn
Answer: (a).R
a. sum to zero
b. subtraction to zero
c. division to zero
d. multiplication to zero
a. first derivative
b. second derivative
c. third derivative
d. Both a and b
14. If the standard deviation of the pixels is positive, then sub image is labeled as
a. black
b. green
c. white
d. red
Answer: (c).white
a. large image
d. binary image
5 MARKS:
1. Explain the segmentation in detail.
2. Explain the thresholding modifications in detail.
3. Explain the multi-spectral thresholding in detail.
4. Explain the hierarchical thresholding in detail.
5. Explain the edge relaxation in detail.
6. Explain the border tracing in detail (April 2012).
7. Explain the region merging in detail.
8. Explain the water shade segmentation in detail.
9. Explain the matching in detail (April 2012).
11. Explain the region neighborhood graphs in detail.
12. Explain the convex hull in detail.