IT6005
UNIT I DIGITAL IMAGE FUNDAMENTALS 8
Introduction – Origin – Steps in Digital Image Processing – Components – Elements of Visual Perception – Image Sensing and Acquisition – Image Sampling and Quantization – Relationships between pixels – color models.
UNIT II IMAGE ENHANCEMENT
Spatial Domain: Gray level transformations – Histogram processing – Basics of Spatial Filtering – Smoothing and Sharpening Spatial Filtering – Frequency Domain: Introduction to Fourier Transform – Smoothing and Sharpening frequency domain filters – Ideal, Butterworth and Gaussian filters.
UNIT III IMAGE RESTORATION AND SEGMENTATION 9
Noise models – Mean Filters – Order Statistics – Adaptive filters – Band reject Filters – Band pass Filters – Notch Filters – Optimum Notch Filtering – Inverse Filtering – Wiener filtering. Segmentation: Detection of Discontinuities – Edge Linking and Boundary detection – Region based segmentation – Morphological processing – erosion and dilation.
OUTCOMES:
Upon successful completion of this course, students will be able to:
Discuss digital image fundamentals.
Apply image enhancement and restoration techniques.
Use image compression and segmentation techniques.
Represent features of images.
TEXT BOOK:
1. Rafael C. Gonzalez, Richard E. Woods, "Digital Image Processing", Third Edition, Pearson Education, 2010.
REFERENCES:
1. Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins, "Digital Image Processing Using MATLAB", Third Edition, Tata McGraw Hill Pvt. Ltd., 2011.
2. Anil K. Jain, "Fundamentals of Digital Image Processing", PHI Learning Pvt. Ltd., 2011.
3. William K. Pratt, "Digital Image Processing", John Wiley, 2002.
4. Malay K. Pakhira, "Digital Image Processing and Pattern Recognition", First Edition, PHI Learning Pvt. Ltd., 2011.
5. http://eeweb.poly.edu/~onur/lectures/lectures.html
6. http://www.caen.uiowa.edu/~dip/LECTURE/lecture.html
OBJECTIVES:
Learn digital image fundamentals.
Be familiar with image compression and segmentation techniques.
Learn to represent images in the form of features.
2. Need and Importance for Study of the Subject
Need for Study of the Subject:
Makes it possible to design simple image processing concepts and experiments such as image compression, image enhancement and image restoration.
Allows students to upgrade their knowledge in the Digital Image Processing field for their research and projects.
Helps students/engineers keep in touch with the latest signal processing technologies (3G, 4G, 5G, smartphones, Wi-Fi, encoding and decoding concepts, signal transmission, etc.).
Latest Developments:
Li-Fi for fast data transfer in communication and signal networking.
New revolution in 5G, 6G in all smart phones, smart televisions, etc.
Satellite signal transmission and restoration.
4. Industrial Visit (Planned if any): -NO-
Lect. No. | Unit | Topics to be Covered | Text/Reference | Pages | Week No.
DIGITAL IMAGE FUNDAMENTALS
1 | I | Introduction about the Digital Image Processing | T1, R1 | 15-17, 10-12 | 1
2 | I | Origin of Digital Image Processing | T1 | 17-21 | 1
3 | I | Steps in Digital Image Processing – Components | T1 | 39-42 | 1
4 | I | Elements of Visual Perception | T1 | 34-38 | 1
13 | II | Basics of Spatial Filtering | T1 | 116-119 |
14, 15 | II | Smoothing and Sharpening Spatial Filtering | T1 | 119-128 |
16, 17 | II | Frequency Domain: Introduction to Fourier Transform | T1 | 149-156 |
18 | II | Smoothing and Sharpening frequency domain filters | T1 | 167-178 |
19 | II | Ideal filters | T1 | 182 | 4
39 | IV | Error Free Compression | T1 | 440-442 |
40 | IV | Variable Length Coding – Bit-Plane Coding | T1 | 442-448 |
41 | IV | Lossless Predictive Coding | T1 | 456-459 |
42, 43 | IV | Lossy Compression, Lossy Predictive Coding | T2 | 459-486 | 8
44 | IV | Compression Standards | T1 | 492-510 |
IMAGE REPRESENTATION AND RECOGNITION
45 | V | Introduction about Image Representation and Recognition | T1 | 643-644 |
46 | V | Boundary representation | T1 | 644-645 |
47 | V | Chain Code | T1 | 644 | 9
48 | V | Polygonal approximation, signature, boundary segments | T1 | 646 |
49 | V | Boundary description – Shape number | T1 | 653-655 |
50 | V | Fourier Descriptor, moments – Regional Descriptors | T1 | 655-659 |
51 | V | Topological feature | T1 | 661-663 | 10
52, 53 | V | Texture – Patterns and Pattern classes | T1 | 665-672 |
54 | V | Recognition based on matching | T1 | 698-704 |
UNIT – I
DIGITAL IMAGE FUNDAMENTALS
Introduction – Origin – Steps in Digital Image Processing – Components
– Elements of Visual Perception – Image Sensing and Acquisition – Image
Sampling and Quantization – Relationships between pixels - color models.
PART -A (2 Marks)
1. Define an image.
An image may be defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates. The amplitude of f at any point (x, y) is called the intensity or gray scale or brightness of the image at that point.
2. Define Brightness? [AUC NOV2011]
Brightness of an object is its perceived luminance, which depends on the luminance of its surround. Two objects with different surroundings may have identical luminance but different brightness.
3. What do you mean by Color model? [AUC APR 2013]
A color model is a specification of a 3D coordinate system and a subspace within that system where each color is represented by a single point.
4. List the hardware oriented color models. [AUC APR 2012]
RGB model
CMY model
YIQ model
HSI model
5. What are Hue and Saturation? [AUC NOV 2012, APR 2013]
Hue is a color attribute that describes a pure color where saturation
gives a measure of the degree to which a pure color is diluted by white light.
6. List the applications of color models? [AUC NOV 2011]
1. RGB model--- used for color monitor & color video camera
2. CMY model---used for color printing
9. Define Digital image? [AUC NOV 2010]
When x, y and the amplitude values of f are all finite, discrete quantities, we call the image a digital image.
10. What are the steps involved in DIP? [AUC NOV 2013]
1. Image Acquisition
2. Preprocessing
3. Segmentation
4. Representation and Description
5. Recognition and Interpretation
11. What is recognition and Interpretation? [AUC NOV 2010]
Recognition is a process that assigns a label to an object based on the information provided by its descriptors.
Interpretation means assigning meaning to a recognized object.
12. Specify the elements of DIP system? [AUC APR 2011]
1. Image Acquisition
2. Storage
3. Processing
4. Display
13. Define sampling and quantization [AUC NOV 2013, MAY 2017]
Sampling means digitizing the co-ordinate value (x, y).
Quantization means digitizing the amplitude value.
PART B
16 Marks
1. Explain the fundamental steps involved in Digital Image Processing.
Image Acquisition is the first step; images are acquired using imaging sensors such as cameras. It also involves steps like preprocessing and scaling.
Image Enhancement is the process of highlighting certain features of interest in an image.
Image Restoration deals with improving the appearance of an image.
Color image processing involves the processing of images which are in color rather than in binary or gray. It finds applications in the use of digital images on the internet.
Wavelets are the foundation for representing images in various degrees of resolution.
Image compression deals with techniques for reducing the size of the image for storage and reducing the bandwidth for transmitting it.
Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape.
Segmentation partitions an image into its constituent parts or objects.
Representation transforms raw data into a form suitable for subsequent computer processing. Description deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.
Fig: Fundamental steps in Digital Image Processing
2. What are the components of an Image Processing System?
[AUC NOV 2012 , APR 2013]
Briefly discuss about the elements of a Digital Image Processing
System. [MAY/JUNE 2014, MAY/JUNE 2017]
1. Image Sensors
Image sensing or image acquisition is used to acquire, i.e., to get digital images. It requires two elements:
a) A physical device which is used to sense the object.
b) The digitizer, which is used to convert the output of the physical sensing device into digital form.
2. Specialized image processing hardware
This hardware usually consists of a digitizer plus an ALU that performs primitive operations such as arithmetic and logic.
3. Computer
The computer in an image processing system is a general purpose
computer and can range from PC to a supercomputer.
4. Software
The software for an image processing system consists of specialized modules that can perform specific tasks. Some software packages have the facility for the user to write code using specialized modules.
5. Mass storage
Mass storage capability is needed if the image is not compressed. There are three principal categories.
Short term storage for use during processing, example: computer memory, frame buffers. Frame buffers are specialized boards that can store one or more images and can be accessed rapidly at video rates. This method allows instantaneous image zoom, scroll (vertical shifts) and pan (horizontal shifts) also.
On-line storage for relatively fast recall, example: magnetic disk or optical media. This type of storage gives frequent access to the stored data.
Archival storage characterized by infrequent access, example: magnetic tapes and optical disks. It requires a large amount of storage space and the stored data is accessed infrequently.
6. Image displays
Image displays are color TV monitors. These monitors are driven by the
output of image and graphics display cards which are a part of the computer
system.
7. Hard copy
Hard copy devices are used for recording images. These devices
include laser printers, film cameras, heat sensitive devices, inkjet printers and
digital units such as optical and CD ROM disks.
8. Networking
Networking is useful for transmitting images. It includes optical fiber and
other broad band technologies.
Fig: Components of a digital image processing system
3. Explain about elements of visual perception. [AUC NOV 2013]
OR
Describe the elements of visual perception with suitable diagram.
[MAY/JUNE 2016]
EYE Characteristics
Nearly spherical
Approximately 20 mm in diameter
Three membranes
Fig: Structure of the human eye
Iris diaphragm
It contracts and expands to control the amount of light entering the eye. The central opening of the iris is known as the pupil, whose diameter varies from 2 to 8 mm.
Lens
It is made up of many layers of fibrous cells. It is suspended and attached to the ciliary body. It contains 60% to 70% water, about 6% fat, and more protein than any other tissue in the eye. The lens is colored by a slightly yellow pigmentation.
The retina contains two classes of light receptors, cones and rods, and cones and rods operate differently.
Cones
Cones are highly sensitive to color and are located in the fovea. There are 6 to 7 million cones. Each cone is connected to its own nerve end. Therefore humans can resolve fine details with the use of cones. Cones respond to higher levels of illumination; their response is called photopic vision or bright-light vision.
Rods
Rods are more sensitive to low illumination than cones. There are about 75 to 150 million rods. Several rods are connected to a common, single nerve, so the amount of detail recognizable is less. Therefore rods provide only a general, overall picture of the field of view. Due to stimulation of rods, objects that appear colored in daylight will appear colorless in moonlight. This phenomenon is called scotopic vision or dim-light vision.
There are three basic types of cones in the retina
These cones have different absorption characteristics as a function of
wavelength with peak absorptions in the red, green, and blue regions of
the optical spectrum. Most of the cones are at the fovea. Rods are
spread just about everywhere except the fovea
There is a relatively low sensitivity to blue light, and there is a lot of overlap between the absorption curves.
To focus on distant objects (greater than 3 m), the lens is flattened by the controlling muscles and it then has its lowest refractive power.
asy
En
gin
eer
ing
Fig: Sensitivity of rods and cones .ne
t
17
Brightness adaptation and discrimination
The range of light intensity levels to which the human visual system can adapt is enormous, from the scotopic threshold to the glare limit. Subjective brightness is a logarithmic function of the light intensity incident on the eye. The visual system cannot operate over such a range simultaneously; rather, it accomplishes this large variation by changes in its overall sensitivity. This phenomenon is known as brightness adaptation. For any given set of conditions, the current sensitivity level of the visual system is called the brightness adaptation level. The figure shows the plot of light intensity versus subjective brightness.
Fig: Weber ratio as a function of intensity
Perceived brightness and intensity
In practice, the perceived brightness is not simply a function of intensity. This can be explained with the use of two phenomena, namely
1. Simultaneous contrast
2. Mach band effect
1. Simultaneous contrast
• The small squares in each image are the same intensity.
• Because of the different background intensities, the small squares do not appear equally bright.
• Perceiving the two squares on different backgrounds as different, even though they are in fact identical, is called the simultaneous contrast effect.
• Psychophysically, we say this effect is caused by the difference in the backgrounds.
Fig: Example of the Mach band effect
Optical illusion
This is one in which the eye fills in nonexisting information or wrongly perceives geometrical properties of objects.
Image Sensing and Acquisition:
Most of the images in which we are interested are generated by the combination of an "illumination" source and the reflection or absorption of energy from that source by the elements of the "scene" being imaged.
We enclose illumination and scene in quotes to emphasize the fact that they are considerably more general than the familiar situation in which a visible light source illuminates a common everyday 3-D (three-dimensional) scene.
For example, the illumination may originate from a source of electromagnetic energy such as radar, infrared, or X-ray energy. But, as noted earlier, it could originate from less traditional sources, such as ultrasound or even a computer-generated illumination pattern.
Similarly, the scene elements could be familiar objects, but they can just as easily be molecules, buried rock formations, or a human brain. We could even image a source, such as acquiring images of the sun.
Depending on the nature of the source, illumination energy is reflected from, or transmitted through, objects. An example in the first category is light reflected from a planar surface.
An example in the second category is when X-rays pass through a patient's body for the purpose of generating a diagnostic X-ray film.
In some applications, the reflected or transmitted energy is focused onto
a photo converter (e.g., a phosphor screen), which converts the energy into
visible light. Electron microscopy and some applications of gamma imaging
use this approach.
Figure 4.1 shows the three principal sensor arrangements used to
transform illumination energy into digital images.
The idea is simple: Incoming energy is transformed into a voltage by the
combination of input electrical power and sensor material that is responsive to
the particular type of energy being detected.
The output voltage waveform is the response of the sensor(s), and a
digital quantity is obtained from each sensor by digitizing its response.
Fig. 4.1 (a) Single imaging sensor (b) Line sensor (c) Array sensor
(1) Image Acquisition Using a Single Sensor:
In order to generate a 2-D image using a single sensor, there has to be relative displacement in both the x- and y-directions between the sensor and the area to be imaged. Figure 4.2 shows an arrangement used in high-precision scanning, where a film negative is mounted onto a drum whose mechanical rotation provides displacement in one dimension. The single sensor is mounted on a lead screw that provides motion in the perpendicular direction. Since mechanical motion can be controlled with high precision, this method is an inexpensive (but slow) way to obtain high-resolution images. Other similar mechanical arrangements use a flat bed, with the sensor moving in two linear directions. These types of mechanical digitizers sometimes are referred to as micro densitometers.
Fig.4.2. Combining a single sensor with motion to generate a 2-D image
(2) Image Acquisition Using Sensor Strips:
A geometry that is used much more frequently than single sensors
consists of an in-line arrangement of sensors in the form of a sensor strip, as
Fig. 4.1 (b) shows. The strip provides imaging elements in one direction.
Motion perpendicular to the strip provides imaging in the other direction, as
shown in Fig. 4.3 (a).This is the type of arrangement used in most flat bed
scanners. Sensing devices with 4000 or more in-line sensors are possible. In-
line sensors are used routinely in airborne imaging applications, in which the
imaging system is mounted on an aircraft that flies at a constant altitude and
speed over the geographical area to be imaged. One-dimensional imaging
sensor strips that respond to various bands of the electromagnetic spectrum
are mounted perpendicular to the direction of flight. The imaging strip gives
one line of an image at a time, and the motion of the strip completes the other
dimension of a two-dimensional image. Lenses or other focusing schemes are
used to project the area to be scanned onto the sensors.
Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain cross-sectional ("slice") images of 3-D objects, as Fig. 4.3 (b) shows. A rotating X-ray source provides illumination and the portion of the sensors opposite the source collects the X-ray energy that passes through the object (the sensors obviously have to be sensitive to X-ray energy). This is the basis for medical and industrial computerized axial tomography (CAT). It is important to note that the output of the sensors must be processed by reconstruction algorithms whose objective is to transform the sensed data into meaningful cross-sectional images.
Fig.4.3 (a) Image acquisition using a linear sensor strip (b) Image acquisition
using a circular sensor strip.
In other words, images are not obtained directly from the sensors by
motion alone; they require extensive processing. A 3-D digital volume
consisting of stacked images is generated as the object is moved in a
direction perpendicular to the sensor ring. Other modalities of imaging based
on the CAT principle include magnetic resonance imaging (MRI) and positron
emission tomography (PET).The illumination sources, sensors, and types of
images are different, but conceptually they are very similar to the basic
imaging approach shown in Fig. 4.3 (b).
(3) Image Acquisition Using Sensor Arrays:
Figure 4.1 (c) shows individual sensors arranged in the form of a 2-D
array. Numerous electromagnetic and some ultrasonic sensing devices frequently are arranged in an array format.
This is also the predominant arrangement found in digital cameras. A typical sensor for these cameras is a CCD array, which can be manufactured with a broad range of sensing properties and can be packaged in rugged arrays of 4000 x 4000 elements or more.
CCD sensors are used widely in digital cameras and other light sensing instruments.
The response of each sensor is proportional to the integral of the light energy projected onto the surface of the sensor, a property that is used in astronomical and other applications requiring low noise images.
Noise reduction is achieved by letting the sensor integrate the input light signal over minutes or even hours.
Since the sensor array shown in Fig. 4.4 (c) is two dimensional, its key advantage is that a complete image can be obtained by focusing the energy pattern onto the surface of the array.
The principal manner in which array sensors are used is shown in Fig. 4.4. This figure shows the energy from an illumination source being reflected from a scene element, but, as mentioned at the beginning of this section, the energy also could be transmitted through the scene elements.
The first function performed by the imaging system shown in Fig. 4.4 (c) is to collect the incoming energy and focus it onto an image plane.
Fig. 4.4 An example of the digital image acquisition process: (a) Energy ("illumination") source (b) An element of a scene (c) Imaging system (d) Projection of the scene onto the image plane (e) Digitized image
Explain image sampling and quantization in detail. Also explain about the checker board effect and false contouring with a neat sketch. [APRIL/MAY 2015, APRIL/MAY 2017]
To be suitable for computer processing an image, f(x, y) must be
digitized both spatially and in amplitude. Digitizing the spatial coordinates is
called image sampling. Amplitude digitization is called gray-level quantization.
Consider the one-dimensional function shown in fig. 5.1(a). Fig. 5.1(b) is a plot of amplitude values of the continuous image along the line segment AB in fig. 5.1(a); the random variations are due to image noise. To sample this function, we take equally spaced samples along line AB, as shown in fig. 5.1(c).
The location of each sample is given by a vertical tick mark in the bottom part of the figure. The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function.
In order to form a digital function, the gray values must be quantized into discrete values. The right side of fig. 5.1(c) shows the gray level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight gray levels. The continuous values are quantized simply by assigning one of the eight gray levels to each sample. The digital samples resulting from sampling and quantization are shown in fig. 5.1(d). Starting from the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.
Fig. 5.1 Generating a digital image: (a) continuous image (b) a scan line from A to B in the continuous image (c) sampling and quantization (d) digital scan line
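The sampling and quantization procedure described above can be illustrated with a minimal NumPy sketch (the function names, the level count and the sampling step below are illustrative choices, not part of the text):

```python
import numpy as np

def sample(f, step=4):
    """Spatially sample an image by keeping every 'step'-th pixel in each direction."""
    return f[::step, ::step]

def quantize(f, levels=8):
    """Uniformly quantize a [0, 1] image into the given number of gray levels."""
    # Scale to [0, levels-1], round to the nearest level, then map back to [0, 1].
    return np.round(f * (levels - 1)) / (levels - 1)

# Example: a synthetic continuous ramp digitized with 8 gray levels.
f = np.linspace(0.0, 1.0, 256).reshape(1, -1).repeat(256, axis=0)
digital = quantize(sample(f, step=2), levels=8)
```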
Representing digital images
The result of sampling and quantization is a matrix of real numbers. Assume that f(x, y) is sampled so that the resulting digital image has M rows and N columns. The values of the coordinates (x, y) now become discrete quantities. The values of the coordinates at the origin are (x, y) = (0, 0). Fig. 5.3 shows the coordinate convention.
Fig. 5.3 Coordinate convention to represent digital images
The complete digital image in matrix form can be represented as
f(x, y) = [ f(0, 0)      f(0, 1)      ...  f(0, N-1)
            f(1, 0)      f(1, 1)      ...  f(1, N-1)
            ...          ...               ...
            f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1) ]
The number of bits required to store a digitized image of size M x N with 2^k gray levels is b = M x N x k. For example, a 128 x 128 image with 64 gray levels would require 128 x 128 x 6 = 98,304 bits of storage.
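As a quick check of the storage figure above, a short sketch (the function name is illustrative):

```python
import math

def storage_bits(rows, cols, gray_levels):
    """Bits needed to store an uncompressed image with the given number of gray levels."""
    bits_per_pixel = math.ceil(math.log2(gray_levels))  # 64 levels -> 6 bits per pixel
    return rows * cols * bits_per_pixel

print(storage_bits(128, 128, 64))  # 98304 bits, as stated above
```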
Spatial and Gray level resolution:
Spatial resolution:
It is the smallest discernible detail in an image. The more pixels in a fixed range, the higher the resolution.
Aliasing and Moiré patterns:
The Shannon sampling theorem tells us that, if a function is sampled at a rate equal to or greater than twice its highest frequency, it is possible to recover the original function completely from the samples. If the function is under-sampled, then a phenomenon called aliasing corrupts the sampled image; the corruption is in the form of additional frequency components being introduced into the sampled function. These are called aliased frequencies. The effect of aliased frequencies can be seen under the right conditions in the form of Moiré patterns. The aliasing effect can be decreased by reducing the high frequency components, which is done by blurring or smoothing the image before sampling.
Color Models
A color model defines a coordinate system and a subspace within that system; the subspace contains the set of constructible colors within a particular model. Any color that can be specified using a model will correspond to a single point within the subspace it defines. Each color model is oriented towards either specific hardware (RGB, CMY, YIQ) or image processing applications (HSI).
Hardware oriented models:
RGB (red, green, blue): used for color monitors and video cameras.
CMY (cyan, magenta, yellow), CMYK (CMY, black): model for color printing.
HSI model, which corresponds closely with the way humans describe and interpret color.
Application oriented models
These models are used in applications where color manipulation is a goal. One example is the creation of color graphics for animation.
The RGB Model
In the RGB model, an image consists of three independent image planes, one in each of the primary colors: red, green and blue. (The standard wavelengths for the three primaries are as shown in the figure.) A particular color is specified by the amount of each of the primary components present. Figure 6.1 shows the geometry of the RGB color model for specifying colors using a Cartesian coordinate system. The grayscale spectrum, i.e. those colors made from equal amounts of each primary, lies on the line joining the black and white vertices.
Fig.6.1 The RGB color cube. The gray scale spectrum lies on the line
joining the black and white vertices.
This is an additive model, i.e. the colors present in the light add to form
new colors, and is appropriate for the mixing of colored light, for example. The image on the left of figure 6.3 shows the additive mixing of red, green and blue primaries to form the three secondary colors yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue). The
RGB model is used for color monitors and most video cameras.
Fig. 6.2 RGB 24-bit color cube
Fig. 6.3 The figure on the left shows the additive mixing of red, green and blue primaries to form the three secondary colors yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue). The figure on the right shows the three subtractive primaries, and their pairwise combinations to form red, green and blue, and finally black by subtracting all three primaries from white.
Pixel Depth:
The number of bits used to represent each pixel in the RGB space is
called the pixel depth. If the image is represented by 8 bits then the pixel
depth of each RGB color pixel = 3*number of bits/plane=3*8=24
A full color image is a 24-bit RGB color image. Therefore the total number of colors in a full color image = (2^8)^3 = 16,777,216.
Safe RGB (web safe) colors
Many of the 256 colors in a standard palette are processed differently by different operating systems. Therefore the remaining 216 colors are accepted to be the standard safe colors.
Component values of safe colors:
Each of the 216 safe colors can be formed from three RGB component values. But each component value should be selected only from the set of values {0, 51, 102, 153, 204, 255}, in which the successive numbers are obtained by adding 51 and are divisible by 3; therefore the total number of possible values = 6 x 6 x 6 = 216.
Hexadecimal representation
The component values in the RGB model can be represented using the hexadecimal number system. The decimal numbers 0, 1, 2, ..., 14, 15 correspond to the hex numbers 0, 1, 2, ..., 9, A, B, C, D, E, F. The equivalent representation of the component values is given in table 1.1.
Table 1.1 Valid values of RGB components
Decimal Hexadecimal
0 00
51 33
102 66
153 99
204 CC
255 FF
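A small Python sketch of the hexadecimal representation of RGB component values described above (the function name is illustrative):

```python
def rgb_to_hex(r, g, b):
    """Format an RGB triplet (0-255 per component) as a hexadecimal color string."""
    return "#{:02X}{:02X}{:02X}".format(r, g, b)

# A few of the 216 safe colors, built from the component set {0, 51, 102, 153, 204, 255}.
print(rgb_to_hex(255, 102, 0))   # "#FF6600"
print(rgb_to_hex(51, 51, 51))    # "#333333"
```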
Applications:
Color monitors, Color video cameras
Advantages:
Image color generation
Changing to other models such as CMY is straightforward.
It is suitable for hardware implementation
It is based on the strong perception of human vision to red, green and
blue primaries.
Disadvantages:
It is not acceptable that a color image is formed by combining three primary colors.
This model is not suitable for describing colors in a way which is practical for human interpretation.
The CMY Model
The CMY (cyan-magenta-yellow) model is a subtractive model appropriate to the absorption of colors, for example due to pigments in paints. Whereas the RGB model asks what is added to black to get a particular color, the CMY model asks what is subtracted from white. In this case, the primaries are cyan, magenta and yellow, with red, green and blue as secondary colors.
When a surface coated with cyan pigment is illuminated by white light, no red light is reflected, and similarly for magenta and green, and yellow and blue. The relationship between the RGB and CMY models (with all values normalized to [0, 1]) is given by:
C = 1 − R,  M = 1 − G,  Y = 1 − B
The CMY model is used by printing devices and filters.
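A minimal NumPy sketch of the RGB/CMY relationship given above, assuming color values normalized to [0, 1] (function names are illustrative):

```python
import numpy as np

def rgb_to_cmy(rgb):
    """Convert RGB values (normalized to [0, 1]) to CMY."""
    return 1.0 - np.asarray(rgb, dtype=float)

def cmy_to_rgb(cmy):
    """CMY back to RGB; the transform is its own inverse."""
    return 1.0 - np.asarray(cmy, dtype=float)

pixel = np.array([1.0, 0.0, 0.0])   # pure red
print(rgb_to_cmy(pixel))            # [0. 1. 1.] -> no cyan, full magenta and yellow
```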
The HSI Model
As mentioned above, color may be specified by the three quantities hue,
saturation and intensity which is similar to the way of human interpretation.
Hue: It is a color attribute that describes a pure color.
Saturation: It is a measure of the degree to which a pure color is diluted by
white light.
Intensity:
It is a measureable and interpretable descriptor of monochromatic
images, which is also called the gray level.
(1) To find Intensity:
The intensity can be extracted from an RGB image because an RGB
color image is viewed as three monochrome intensity images.
Intensity Axis:
A vertical line joining the black vertex (0, 0, 0) and white vertex(1,1, 1)
is called the intensity axis. The intensity axis represents the gray scale.
Determining intensity component
The intensity component of any color is determined as follows:
A plane which is perpendicular to the intensity axis and containing the color point is passed through the cube.
The point at which the plane intersects the intensity axis gives the intensity value.
The intensity value will be in the range [0, 1].
Fig. 6.4 The HSI Model
(2) To find saturation:
All points on the intensity axis are gray which means that the saturation
i.e., purity of points on the axis is zero.
When the distance of a color from the intensity axis increases, the
saturation of that color also increases.
(3) To Find Hue:
The hue of a color can also be determined from the RGB color cube
because, it is clear from the RGB color model that, if three points namely
black, white and any one color are joined, a triangle is formed. All the points
inside the triangle will have the same hue. This is due to the fact that black
and white components cannot change the hue. But, intensity and saturation of
points inside the triangle will be different.
(4) HSI color space
The HSI color space is represented by
A vertical intensity axis and
The locus of color points that lie on planes perpendicular to the axis.
The shape of the space is defined by the intersecting points of these planes with the faces of the cube. As the planes move up and down along the intensity axis, the cross-section can either be a triangle or a hexagon. The hexagon shaped plane and the triangle shaped plane are shown in fig. 6.5.
Fig. 6.5 HSI color space
In HSI space,
The primary colors are separated by 120°.
t
The secondary colors are also separated by 120°.
The angle between the secondaries and the primaries is 60°.
Representation of Hue:
The hue of a color point is determined by an angle from some reference
point.
Almost in all cases, the red axis is selected as the reference.
Therefore, if the angle between the point and the red axis is 0°, it
represents zero hue.
The hue increases if the angle from red axis increases in the counter
clock wise direction
Representation of saturation
The saturation is described as the length from the vertical axis.
In the HSI space, it is represented by the length of the vector from the
origin to the color point.
If the length is more the saturation is high and vice versa.
Fig.6.6 The HSI model, showing the saturation and Hue calculation.
Fig. 6.6 shows the HSI model, with the HSI solid on the left, and the HSI triangle on the right, formed by taking a horizontal slice through the HSI solid at a particular intensity. Hue is measured from red, and saturation is given by the distance from the axis. Colors on the surface of the solid are fully saturated, i.e. pure colors, and the grayscale spectrum is on the axis of the solid. For these colors, hue is undefined.
Components of HSI color space
The vertical intensity axis
The length of the vector S from the origin to a color point.
The angle H between the vector and the red axis.
Advantages of HSI model:
It describes colors in terms that are suitable for human interpretation.
The model allows independent control over the color describing
quantities namely hue saturation and intensity.
It can be used as an ideal tool for developing image processing
algorithms based on color descriptions.
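The geometric description of the HSI model above corresponds to the usual RGB-to-HSI conversion formulas; a per-pixel sketch is shown below, assuming RGB components in [0, 1] (the function name and the small epsilon guard are illustrative assumptions):

```python
import numpy as np

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H in degrees, S, I)."""
    i = (r + g + b) / 3.0                       # intensity: height along the intensity axis
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i   # saturation: distance from the gray axis
    # Hue: angle measured from the red axis (undefined when S == 0).
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red: hue 0 degrees, fully saturated
```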
UNIT-II
IMAGE ENHANCEMENT
Spatial Domain: Gray level transformations – Histogram processing –
Basics of Spatial Filtering– Smoothing and Sharpening Spatial Filtering –
Frequency Domain: Introduction to Fourier Transform – Smoothing and
Sharpening frequency domain filters – Ideal, Butterworth and Gaussian filters.
PART A
2 Marks
1. Specify the objective of image enhancement technique. [APR/ MAY
2017]
The objective of the enhancement technique is to process an image so that the result is more suitable than the original image for a particular application.
asy
2. Name the different types of derivative filters?
En
1. Prewitt operators
2. Roberts cross gradient operators
3. Sobel operators gin
3. What is contrast stretching?
Contrast stretching produces an image of higher contrast than the original by darkening the levels below m and brightening the levels above m in the image.
4. What is grey level slicing?
Highlighting a specific range of grey levels in an image often is
t
desired. Applications include enhancing features such as masses of water in
satellite imagery and enhancing flaws in x-ray images.
5. Define image subtraction.
The difference between 2 images f(x,y) and h(x,y) expressed as,
g(x,y)=f(x,y)-h(x,y)
is obtained by computing the difference between all pairs of
corresponding pixels from f and h
8. Give the formulas for negative and log transformation.
Negative transformation: s = (L − 1) − r
Log transformation: s = c log(1 + r), where c is a constant and r ≥ 0.
9. What is meant by bit plane slicing? [NOV DEC 2016]
Instead of highlighting gray level ranges, highlighting the contribution made to the total image appearance by specific bits might be desired. Suppose that each pixel in an image is represented by 8 bits. Imagine that the image is composed of eight 1-bit planes, ranging from bit plane 0 for the LSB to bit plane 7 for the MSB.
10. Define histogram.
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function h(r_k) = n_k, where r_k is the kth gray level and n_k is the number of pixels in the image having gray level r_k.
PART B
16 MARKS
1. What is a histogram? Explain histogram equalization in detail.
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function h(r_k) = n_k. The normalized histogram is given by p(r_k) = n_k / n, where n is the total number of pixels in the image.
The horizontal axis of each histogram corresponds to the gray level values r_k.
The vertical axis corresponds to values of h(r_k) = n_k or p(r_k) = n_k / n.
By processing (modifying) the histogram of an image we can create a new image with specific desired properties.
For a dark image, the components of the histogram are concentrated on the low or dark side of the gray scale.
Fig: Low contrast image and its histogram
For a high contrast image, the histogram extends over a broad range of the gray scale and is fairly uniform throughout.
Fig: High contrast image and its histogram
Global histogram equalization
In this section we will assume that the image to be processed has a continuous intensity that lies within the interval [0, L-1]. Suppose we divide the image intensity by its maximum value L-1. Let the variable r represent the new grey levels (image intensity) in the image, where now 0 ≤ r ≤ 1, and let p_r(r) denote the probability density function (pdf) of the variable r. We now apply the following transformation function to the intensity:
s = T(r) = ∫ (0 to r) p_r(w) dw,  0 ≤ r ≤ 1    ... (1)
Fig: Transformation function
By observing the transformation of equation (1) we immediately see that it
possesses the following properties:
t
(i) T(r) should be single valued and monotonically increasing in the
interval
0 ≤ r ≤ 1 . Single valued function guarantees that the inverse
transformation will also exist. Monotonicity condition preserves the
increasing order of gray levels from black to white in the output image.
(ii) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1, i.e., the output gray levels span the same range as the input gray levels.
Inverse transformation:
The inverse transformation is r = T^(-1)(s), 0 ≤ s ≤ 1.
From probability theory, the pdf of the transformed variable s is p_s(s) = p_r(r) |dr/ds|.
Differentiating equation (1) gives ds/dr = p_r(r), where w is a dummy variable for integration.
To find p_s(s), substituting this derivative into the expression above gives
p_s(s) = p_r(r) |1 / p_r(r)| = 1, for 0 ≤ s ≤ 1.
From the above analysis it is obvious that the transformation of equation (1) converts the original image into a new image with a uniform probability density function. Unfortunately, in a real life scenario we must deal with digital images. The discrete form of histogram equalization is given by the relation
s_k = T(r_k) = Σ (j = 0 to k) n_j / n = Σ (j = 0 to k) p_r(r_j),  k = 0, 1, ..., L-1    ... (7)
Thus mapping each pixel with level r_k in the input image into a corresponding pixel with level s_k in the output image using eqn. (7) will produce an enhanced image.
Advantages:
It is easy to implement
The information it needs can be obtained directly from the given image
and no additional parameter specifications are required.
The results from this technique are predictable and it is fully automatic.
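A compact NumPy sketch of global histogram equalization following the discrete relation of eqn. (7), assuming an 8-bit grayscale image stored as an integer array (the function name is illustrative):

```python
import numpy as np

def histogram_equalize(img, levels=256):
    """Global histogram equalization of an 8-bit grayscale image (integer NumPy array)."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size                    # s_k = sum_{j<=k} n_j / n
    mapping = np.round((levels - 1) * cdf).astype(np.uint8)
    return mapping[img]                                  # map each pixel r_k to s_k
```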
Local histogram equalization
Global histogram equalization is suitable for overall enhancement. It is often necessary to enhance details over small areas. The number of pixels in these areas may have negligible influence on the computation of a global transformation, so the use of this type of transformation does not necessarily guarantee the desired local enhancement. The solution is to devise transformation functions based on the grey level distribution – or other properties – in the neighborhood of every pixel in the image. The histogram processing technique previously described is easily adaptable to local enhancement. The procedure is to define a square or rectangular neighborhood and move the centre of this area from pixel to pixel. At each location the histogram of the points in the neighborhood is computed and a histogram equalization transformation function is obtained. This function is finally used to map the grey level of the pixel centered in the neighborhood. The centre of the neighborhood region is then moved to an adjacent pixel location and the procedure is repeated. Since only one new row or column of the neighborhood changes during a pixel-to-pixel translation of the region, updating the histogram obtained in the previous location with the new data introduced at each motion step is possible quite easily. This approach has obvious advantages over repeatedly computing the histogram over all pixels in the neighborhood region each time the region is moved one pixel location. Another approach often used to reduce computation is to utilize non-overlapping regions, but this usually produces an undesirable checkerboard effect.
In histogram matching (histogram specification), p_z(z) denotes the desired probability density function of the output image.
Three basic types of gray level transformation functions are: image negatives, log transformations, and power-law (nth power and nth root) transformations.
(i) Image Negative
The negative of an image with intensity levels in the range [0, L-1] can be described by: s = (L − 1) − r
Fig: Image Negative
(ii) Log Transformations
General form: s = c log(1 + r), where c is a constant and r ≥ 0. This maps a narrow range of low intensity values in the input to a wider output range; the opposite is true for high intensity input values. It compresses the dynamic range of images with large variations in pixel values.
Fig: Basic gray level transformation functions
(iii) Power-Law (Gamma) Transformations
General form: s = c r^γ, where c and γ are positive constants. Power-law curves with fractional values of γ map a narrow range of dark input values to a wider range of output values; the opposite is true for higher values of input levels. By varying γ, a family of possible transformation curves is obtained.
Gamma Correction
Many devices used for image capture, display and printing respond according to a power law. The exponent in the power-law equation is referred to as gamma. The process of correcting for the power-law response is referred to as gamma correction.
Example:
CRT devices have an intensity-to-voltage response that is a power function (exponents typically range from 1.8 to 2.5). Gamma correction in this case could be achieved by applying the transformation s = r^(1/2.5) = r^0.4.
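The negative, log and power-law (gamma) transformations discussed above can be sketched in NumPy as follows, assuming an 8-bit image with L = 256 levels (the function names and default parameters are illustrative):

```python
import numpy as np

def negative(img, L=256):
    """s = (L - 1) - r"""
    return (L - 1) - img

def log_transform(img, c=1.0):
    """s = c * log(1 + r); expands dark values and compresses bright ones."""
    return c * np.log1p(img.astype(float))

def gamma_correct(img, gamma=2.5, L=256):
    """s = (L - 1) * (r / (L - 1)) ** (1 / gamma); 1/2.5 = 0.4 as in the CRT example."""
    r = img.astype(float) / (L - 1)
    return (L - 1) * np.power(r, 1.0 / gamma)
```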
(iv)Piecewise-Linear Transformations
Piecewise functions can be arbitrarily complex. A disadvantage is that
their specification requires significant user input.
46
ww
w.E
asy Fig. Contrast Stretching
En
gin
eer
ing
Low Contrast Image Contrast stretched image .ne
Intensity-level Slicing
This highlights a specific range of intensities in an image. One approach sets all pixel values in the range of interest to one value (white) and all others to another value (black), which produces a binary image. Another approach brightens (or darkens) pixel values in the range of interest and leaves all others unchanged.
Bit plane Slicing
Instead of highlighting gray level ranges, highlighting the contribution made to the total image appearance by specific bits might be desired. Suppose that each pixel in an image is represented by 8 bits. Imagine the image is composed of eight 1-bit planes, ranging from bit plane 0 (LSB) to bit plane 7 (MSB). In terms of 8-bit bytes, plane 0 contains all the lowest order bits in the bytes comprising the pixels in the image and plane 7 contains all the high order bits. Separating a digital image into its bit planes is useful for analyzing the relative importance played by each bit of the image; it determines the adequacy of the number of bits used to quantize each pixel and is useful for image compression.
Fig: Bit plane representation
In terms of bit-plane extraction for an 8-bit image, the binary image for bit plane 7 is obtained by processing the input image with a thresholding gray-level transformation function that maps all levels between 0 and 127 to one level (e.g. 0) and maps all levels from 128 to 255 to another (e.g. 255).
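A one-line NumPy sketch of bit-plane extraction for an 8-bit image, consistent with the thresholding interpretation of plane 7 given above (the function name is illustrative):

```python
import numpy as np

def bit_plane(img, plane):
    """Extract one bit plane (0 = LSB, 7 = MSB) of an 8-bit grayscale image as a binary image."""
    return (img >> plane) & 1

# Plane 7 is equivalent to thresholding at 128: levels 0-127 map to 0, levels 128-255 map to 1.
```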
3) What is meant by Spatial Filtering? Discuss in detail about Smoothing and Sharpening Spatial Filtering.
Spatial filters are classified as linear filters and non-linear filters based on the operation performed on the image. Filtering means accepting (passing) or rejecting some frequency components.
Mechanics of spatial filtering
A 3 x 3 filter mask w is moved over the image f(x, y); the mask coefficients are applied to the neighborhood centered at (x, y):

Image neighborhood about (x, y)              Filter mask w
f(x-1, y-1)  f(x-1, y)  f(x-1, y+1)          w(-1,-1)  w(-1, 0)  w(-1, 1)
f(x,   y-1)  f(x,   y)  f(x,   y+1)          w(0, -1)  w(0,  0)  w(0,  1)
f(x+1, y-1)  f(x+1, y)  f(x+1, y+1)          w(1, -1)  w(1,  0)  w(1,  1)

At any point (x, y) in the image, the response g(x, y) of the filter is the sum of products of the filter coefficients and the image pixels encompassed by the filter. Observe that the filter coefficient w(0,0) aligns with the pixel at location (x, y):
g(x, y) = w(-1,-1) f(x-1, y-1) + w(-1, 0) f(x-1, y) + ... + w(0, 0) f(x, y) + ... + w(1, 1) f(x+1, y+1)
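A direct (unoptimized) NumPy sketch of this sum-of-products operation, assuming an odd-sized mask and leaving border pixels unfiltered (the function name and border handling are illustrative choices):

```python
import numpy as np

def spatial_filter(f, w):
    """Apply an odd-sized spatial mask w to image f by direct sum of products."""
    m, n = w.shape
    a, b = m // 2, n // 2
    g = np.zeros_like(f, dtype=float)
    for x in range(a, f.shape[0] - a):
        for y in range(b, f.shape[1] - b):
            # Sum of products of mask coefficients and the pixels they overlap.
            g[x, y] = np.sum(w * f[x - a:x + a + 1, y - b:y + b + 1])
    return g
```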
1) Smoothing Spatial Filters
Smoothing filters are used for blurring and for noise reduction.
a) Smoothing Linear Filters
Fig: The mechanism of spatial filtering with a 3 x 3 mask
Fig: Two 3 x 3 smoothing filter masks. The constant multiplier in front of each mask is equal to 1 divided by the sum of the values of its coefficients, as is required to compute an average.
An important application of spatial averaging is to blur an image for the purpose of getting a gross representation of objects of interest, such that the intensity of smaller objects blends with the background; larger objects of interest can then be detected easily after filtering and thresholding.
Fig: (a) Image from the Hubble Space Telescope (b) Image processed by a 15 x 15 averaging mask (c) Result of thresholding
Smoothing linear filters have all their coefficients positive and equal to each other, as for example the mask shown below. Moreover, the coefficients (together with the multiplier) sum up to 1 in order to maintain the mean of the image.

          1  1  1
(1/9) x   1  1  1
          1  1  1

b) Order-Statistics (Non-linear) Filters
Order-statistics filters are non-linear spatial filters whose response is based on ranking the pixels contained in the neighborhood; the best-known example is the median filter. You
will recall from basic statistics that ranking lends itself to many other
possibilities. For example, using the 100th percentile results in the so-
called max filter, which is useful in finding the brightest points in an
image. The response of a 3 x 3 max filter is given by R = max { z_k | k = 1, 2, ..., 9 }.
The 0th percentile filter is the min filter, used for the opposite purpose.
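A minimal NumPy sketch of order-statistics filtering covering the median, max (100th percentile) and min (0th percentile) cases mentioned above (the function name and neighborhood handling are illustrative):

```python
import numpy as np

def order_statistic_filter(f, size=3, stat=np.median):
    """Replace each pixel by an order statistic of its size x size neighborhood."""
    a = size // 2
    g = f.astype(float)
    for x in range(a, f.shape[0] - a):
        for y in range(a, f.shape[1] - a):
            g[x, y] = stat(f[x - a:x + a + 1, y - a:y + a + 1])
    return g

# median filter:  order_statistic_filter(f, 3, np.median)
# max filter:     order_statistic_filter(f, 3, np.max)   (100th percentile)
# min filter:     order_statistic_filter(f, 3, np.min)   (0th percentile)
```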
2) Sharpening Spatial Filters
In the last section, we saw that image blurring could be
accomplished in the spatial domain by pixel averaging in a neighborhood.
Since averaging is analogous to integration, it is logical to conclude
that sharpening could be accomplished by spatial differentiation. This, in fact, is the case, and the discussion in this section deals with various ways of defining and implementing operators for sharpening by digital differentiation.
Fundamentally, the strength of the response of a derivative operator is proportional to the degree of discontinuity of the image at the point at which the operator is applied. Thus, image differentiation enhances edges and other discontinuities (such as noise) and deemphasizes areas with slowly varying gray-level values.
Any definition of a first derivative (1) must be zero in flat segments (areas of constant gray-level values); (2) must be nonzero at the onset of a gray-level step or ramp; and (3) must be nonzero along ramps. Similarly, any definition of a second derivative (1) must be zero in flat areas; (2) must be nonzero at the onset and end of a gray-level step or ramp; and (3) must be zero along ramps of constant slope.
A basic definition of the first-order derivative of a one-dimensional function f(x) is the difference
∂f/∂x = f(x + 1) − f(x).
A basic definition of the second-order derivative is the difference ∂²f/∂x² = f(x + 1) + f(x − 1) − 2f(x). Comparing the two, a second-order derivative enhances fine detail much more than a first-order derivative.
(1) First-order derivatives generally produce thicker edges in an image.
(2) Second-order derivatives have a stronger response to fine detail, such as thin lines and isolated points.
(3) First-order derivatives generally have a stronger response to a gray-level step.
(4) Second-order derivatives produce a double response at step changes in gray level.
Use of Second Derivatives for Enhancement – The Laplacian
We are interested in isotropic filters, whose response is independent of the direction of the discontinuities in the image to which the filter is applied. In other words, isotropic filters are rotation invariant, in the sense that rotating the image and then applying the filter gives the same result as applying the filter to the image first and then rotating the result.
Development of the method
It can be shown (Rosenfeld and Kak [1982]) that the simplest isotropic derivative operator is the Laplacian, which, for a function (image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²
Because derivatives of any order are linear operations, the Laplacian is a linear operator. In order to be useful for digital image processing, this equation needs to be expressed in discrete form. We use the following notation for the partial second-order derivative in the x-direction:
∂²f/∂x² = f(x + 1, y) + f(x − 1, y) − 2f(x, y)
and similarly in the y-direction:
∂²f/∂y² = f(x, y + 1) + f(x, y − 1) − 2f(x, y)
The digital implementation of the two-dimensional Laplacian is obtained by summing these two components:
∇²f = [f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1)] − 4f(x, y)
This equation can be implemented using the mask shown in (a) below, which gives an isotropic result for rotations in increments of 90°. The diagonal directions can be incorporated in the definition of the digital Laplacian by adding two more terms, one for each of the two diagonal directions. Since each diagonal term also contains a −2f(x, y) term, the total subtracted from the difference terms now would be −8f(x, y).
Laplacian operator
The main disadvantage of the Laplacian operator is that it produces double edges. Because the Laplacian is a derivative operator, its use highlights gray-level discontinuities in an image and deemphasizes regions with slowly varying gray levels. This will tend to produce images that have grayish edge lines and other discontinuities, all superimposed on a dark, featureless background. Background features can be "recovered" while still preserving the sharpening effect of the Laplacian operation simply by adding the original and Laplacian images. As noted in the previous paragraph, it is important to keep in mind which definition of the Laplacian is used: if the definition with a negative center coefficient is used, the Laplacian image is subtracted, rather than added, to obtain a sharpened result.
(a) Filter mask used to implement the digital Laplacian. (b) Mask used to implement an extension that includes the diagonal neighbors.
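A short sketch of Laplacian sharpening using the 4-neighbor mask in (a), assuming SciPy is available for the convolution (the function name is illustrative; with this negative-center mask the Laplacian is subtracted from the original, as discussed above):

```python
import numpy as np
from scipy.signal import convolve2d

# 4-neighbor Laplacian mask (isotropic for 90-degree rotations); the 8-neighbor
# variant adds the diagonal terms and uses -8 at the center.
LAPLACIAN_4 = np.array([[0,  1, 0],
                        [1, -4, 1],
                        [0,  1, 0]], dtype=float)

def laplacian_sharpen(f):
    """Sharpen an 8-bit image by subtracting its Laplacian (negative center coefficient)."""
    lap = convolve2d(f.astype(float), LAPLACIAN_4, mode="same", boundary="symm")
    return np.clip(f - lap, 0, 255).astype(np.uint8)
```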
Unsharp Masking and High-Boost Filtering
Subtracting a blurred version of an image from the image itself is called unsharp masking: f_s(x, y) = f(x, y) − b(x, y), where b is a blurred version of f. A slight generalization of this idea is high-boost filtering. A high-boost filtered image, f_hb, is defined at any point (x, y) as
f_hb(x, y) = A f(x, y) − b(x, y)
where A ≥ 1 and, as before, b is a blurred version of f. This equation may be written as
f_hb(x, y) = (A − 1) f(x, y) + f(x, y) − b(x, y)
f_hb(x, y) = (A − 1) f(x, y) + f_s(x, y)
as the expression for computing a high-boost-filtered image.
One of the principal applications of boost filtering is when the input image is darker than desired. By varying the boost coefficient, it generally is possible to obtain an overall increase in the average gray level of the image, thus helping to brighten the final result.
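A minimal sketch of high-boost filtering, using a simple box blur for b(x, y) (the blur choice, function name and default A are illustrative assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def high_boost(f, A=1.5, size=3):
    """f_hb = A*f - b, where b is a box-blurred version of f; A = 1 gives unsharp masking."""
    blurred = uniform_filter(f.astype(float), size=size)
    return np.clip(A * f - blurred, 0, 255)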
4) What is meant by Frequency Domain Filtering? Discuss in detail about Smoothing and Sharpening frequency domain filters.
OR
Compare the various filters available in the frequency domain for image enhancement.
Edges and other sharp transitions (such as noise) in the gray levels of an image contribute significantly to the high-frequency content of its Fourier transform. Hence smoothing (blurring) is achieved in the frequency domain by attenuating a specified range of high-frequency components in the transform of a given image.
Our basic "model" for filtering in the frequency domain is
G (u, v) =H (u, v) F (u, v)
where F(u, v) is the Fourier transform of the image to be smoothed. The
objective is to select a filter transfer function H(u, v) that yields G(u, v) by
attenuating the high-frequency components of F(u, v).
Ideal Lowpass Filters
The simplest lowpass filter we can envision is filter that “cuts off” all
high-frequency components of the Fourier transform that are at a distance
greater than a specified distance D0 from the origin of the (centered)
transform. Such a filter is called a two-dimensional (2-D) ideal lowpass filter
(ILPF), with transfer function
H(u, v) = 1  if D(u, v) ≤ D0
H(u, v) = 0  if D(u, v) > D0
where D0 is a stated nonnegative quantity (the cutoff frequency) and D(u, v) is the distance from the point (u, v) to the center of the frequency plane. If the image in question is of size M x N, we know that its transform also is of this size, so the center of the frequency rectangle is at (u, v) = (M/2, N/2) due to the fact that the transform has been centered. In this case, the distance from any point (u, v) to the center (origin) of the Fourier transform is given by
D(u, v) = [(u − M/2)² + (v − N/2)²]^(1/2)
For an ideal lowpass filter cross section, the point of transition between H (u,
v) = 1 and H (u, v) = 0 is called the cutoff frequency.
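The filtering model G(u, v) = H(u, v) F(u, v) with an ideal lowpass transfer function can be sketched in NumPy as follows (function names are illustrative; fftshift is used in place of the (−1)^(x+y) premultiplication, which is equivalent):

```python
import numpy as np

def ideal_lowpass(shape, D0):
    """Transfer function H(u, v) of an ideal lowpass filter with cutoff D0 (centered)."""
    M, N = shape
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return (D <= D0).astype(float)

def filter_frequency_domain(f, H):
    """G(u, v) = H(u, v) F(u, v): filter an image by multiplication in the frequency domain."""
    F = np.fft.fftshift(np.fft.fft2(f))          # centered spectrum
    G = H * F
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))
```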
One way to establish a set of standard cutoff frequency loci is to
compute circles that enclose specified amounts of total image power PT .This
quantity is obtained by summing the components of the power spectrum at
each point (u, v), for u = 0,1,2,…,M – 1 and v = 0,1,2,…,N -1; that is,
P_T = Σ(u) Σ(v) P(u, v), where P(u, v) = |F(u, v)|² is the power spectrum. A circle of radius D0 encloses α percent of the power, where α = 100 [ Σ Σ P(u, v) / P_T ] and the summation is taken over the values of (u, v) that lie inside the circle or on its boundary. The blurring and ringing properties of the ILPF can be
explained by reference to the convolution theorem. The Fourier transforms of
the original image f(x, y) and the blurred image g(x, y) are related in the
frequency domain by the equation.
G(u, v) = H(u, v) F(u, v)
where, as before, H(u, v) is the filter function and F and G are the Fourier transforms of the two images just mentioned. The convolution theorem tells us that the corresponding process in the spatial domain is
g(x, y) = h(x, y) * f(x, y)
where h(x, y) is the inverse Fourier transform of the filter transfer function H(u, v).
The spatial filter function h(x, y) was obtained in the standard way: (1) H(u, v) was multiplied by (−1)^(u+v) for centering; (2) this was followed by the inverse DFT; and (3) the real part of the inverse DFT was multiplied by (−1)^(x+y).
Butterworth Lowpass Filters
The transfer function of a Butterworth lowpass filter (BLPF) of order n, and with cutoff frequency at a distance D0 from the origin, is defined as
H(u, v) = 1 / [1 + (D(u, v) / D0)^(2n)]
Unlike the ILPF, the BLPF transfer function does not have a sharp
discontinuity that establishes a clear cutoff between passed and filtered
frequencies. A Butterworth filter of order 1 has no ringing. Ringing generally is
imperceptible in filters of order 2, but can become a significant factor in filters
of higher order. The filter of order 2 does show mild ringing and small negative
values, but they certainly are less pronounced than in the ILPF. A Butterworth
filter of order 20 already exhibits the characteristics of the ILPF. In general,
BLPFs of order 2 are a good compromise between effective lowpass filtering
and acceptable ringing characteristics.
Gaussian Lowpass Filters
The form of these filters in two dimensions is given by
H(u, v) = e^(−D²(u, v) / 2σ²)
where D(u, v) is the distance from the origin of the (centered) Fourier transform and σ is a measure of the spread of the Gaussian curve. By letting σ = D0,
H(u, v) = e^(−D²(u, v) / 2D0²)
where D0 is the cutoff frequency. When D(u, v) = D0, the filter is down to 0.607 of its maximum value. The inverse Fourier transform of the Gaussian lowpass filter also is Gaussian.
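For comparison with the ILPF sketch earlier, minimal NumPy implementations of the Butterworth and Gaussian lowpass transfer functions defined in this section (function names are illustrative; the corresponding highpass filters are simply 1 minus these):

```python
import numpy as np

def distance_grid(shape):
    """D(u, v): distance of each frequency sample from the center of the M x N rectangle."""
    M, N = shape
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    return np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)

def butterworth_lowpass(shape, D0, n=2):
    """H(u, v) = 1 / (1 + (D/D0)^(2n)); order 2 is a good ringing/blurring compromise."""
    D = distance_grid(shape)
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

def gaussian_lowpass(shape, D0):
    """H(u, v) = exp(-D^2 / (2 D0^2)); drops to 0.607 of its maximum at D = D0."""
    D = distance_grid(shape)
    return np.exp(-(D ** 2) / (2.0 * D0 ** 2))
```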
Sharpening Frequency Domain Filters
The high-frequency components of an image correspond to edges and sharp transitions such as noise. Sharpening can therefore be achieved by highpass filtering, which attenuates the low-frequency components without disturbing the high-frequency information.
Fig: Top row – perspective plot, image representation and cross section of a typical Ideal highpass filter; second row – the same for a typical Butterworth highpass filter; third row – the same for a typical Gaussian highpass filter.
Ideal Highpass Filters
A 2-D ideal highpass filter (IHPF) is defined as
H(u, v) = 0  if D(u, v) ≤ D0
H(u, v) = 1  if D(u, v) > D0
where D0 is the cutoff distance measured from the origin of the frequency rectangle. This filter is the opposite of the ideal lowpass filter in the sense that it sets to zero all frequencies inside a circle of radius D0 while passing, without attenuation, all frequencies outside the circle. As in the case of the ideal lowpass filter, the IHPF is not physically realizable with electronic components.
Gaussian Highpass Filters
The transfer function of the Gaussian highpass filter (GHPF) with cutoff frequency locus at a distance D0 from the origin is given by
H(u, v) = 1 − e^(−D²(u, v) / 2D0²)
Fig: Spatial representation of (a) Ideal, (b) Butterworth and (c) Gaussian frequency domain highpass filters
UNIT III
IMAGE RESTORATION AND SEGMENTATION
PART A (2 Marks)
What is meant by image restoration?
Restoration involves modelling the degradation and applying the inverse process in order to recover the image.
Degradation: the gray values are altered.
Restoration attempts to reconstruct or recover an image that has been degraded by using a clear knowledge of the degrading phenomenon.
3. Explain the additivity property of a linear operator.
H[f1(x, y) + f2(x, y)] = H[f1(x, y)] + H[f2(x, y)]
The additivity property says that if H is a linear operator, the response to a sum of two inputs is equal to the sum of the two responses.
4. What are the two methods of the algebraic approach? [APR/MAY 2012]
Unconstrained restoration approach
Constrained restoration approach
5. What is meant by Noise probability density function?
The spatial noise descriptor is the statistical behavior of gray level
values in the noise component of the model.
6. Why the restoration is called as unconstrained restoration? [APR
MAY 2017]
In the absence of any knowledge about the noise n, a meaningful criterion function is to seek an estimate f̂ such that H f̂ approximates g in a least squares sense, by assuming the noise term is as small as possible.
The limitation of the inverse and pseudo-inverse filters is that they are very sensitive to noise. Wiener filtering is a method of restoring images in the presence of blur as well as noise.
9. What is a pseudo-inverse filter? (Dec '13)
It is the stabilized version of the inverse filter. For a linear shift-invariant system with frequency response H(u, v), the pseudo-inverse filter is defined as
H⁻(u, v) = 1 / H(u, v)   if H(u, v) ≠ 0
H⁻(u, v) = 0             if H(u, v) = 0
13. State the problems in region splitting and merging based image segmentation. (Dec '14)
Initial seed points – different sets of initial seed points cause different segmentation results.
It is a time consuming process.
This method is not suitable for color images and sometimes produces faulty colors.
Region growing may stop at any time when no more pixels satisfy the criteria.
PART B 16 Marks
1. Explain the various noise models in detail.
Noise values are assumed to be uncorrelated with the image and to have no spatial dependences from image to image.
The term noise has the following meanings:
1. An undesired disturbance within the frequency band of interest; the summation of unwanted or disturbing energy introduced into a communications system from man-made and natural sources.
2. A disturbance that affects a signal and that may distort the information
carried by the signal.
3. Random variations of one or more characteristics of any entity such as
voltage, current, or data.
4. A random signal of known statistical properties of amplitude, distribution,
and spectral density.
Noise has long been studied. Its properties, types, influence, and possible remedies have been analyzed extensively; most of this research is mathematical and closely related to probability theory.
Types of mixing noise with signal
In many applications it is assumed that noise is additive and
statistically independent of the signal
g(x, y) = f (x, y) + η(x, y)
The additive model holds, for example, for thermal noise. Often, however, noise is signal-dependent; examples are speckle and photon noise. Many such noise sources can be modeled by a multiplicative model:
g(x, y) = f(x, y) · η(x, y)
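The following short sketch (illustrative only; the noise parameters are arbitrary) contrasts the additive model g = f + η with a multiplicative, speckle-like model g = f·η:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)                       # constant test image

# Additive model: g(x, y) = f(x, y) + eta(x, y), eta ~ N(0, sigma^2)
eta_add = rng.normal(loc=0.0, scale=10.0, size=f.shape)
g_additive = f + eta_add

# Multiplicative (speckle-like) model: g(x, y) = f(x, y) * eta(x, y), eta with unit mean
eta_mul = rng.normal(loc=1.0, scale=0.1, size=f.shape)
g_multiplicative = f * eta_mul

print(g_additive.mean(), g_multiplicative.mean())  # both stay near 100
```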
Advanced de-noising filters are based on an adaptive strategy, i.e. the procedure tries to adapt the application of the filter as it progresses.
Frequency-domain filters provide powerful de-noising methods.
Noise in colour images may have different characteristics in different colour channels, but removing it uses the same strategy.
1. Gaussian (normal) noise is very attractive from a mathematical point of
view since its DFT is another Gaussian process. eer
Gaussian noise arises in an image due to factors such as electronic circuit noise and sensor noise caused by poor illumination and/or high temperature.
Radar range and velocity images typically contain noise that can be modeled by the Rayleigh distribution.
In uniform noise, the gray level values of the noise are evenly distributed across a specific range.
• Quantization noise has an approximately uniform distribution.
In most images, adjacent pixels are highly correlated, and the autocorrelation of the gray levels decreases away from the origin.
The power spectrum of an image is the Fourier transform of its autocorrelation function; therefore, we can argue that the power spectrum of an image generally decreases with frequency.
Typical noise sources have either a flat power spectrum or one that decreases with frequency more slowly than typical image power spectra.
Therefore, the expected situation is for the signal to dominate the spectrum at low frequencies while the noise dominates at high frequencies.
Until now our focus was the calculation of the degradation function H(u,v). Having H(u,v) calculated or estimated, the next step is the restoration of the degraded image. There are different filtering techniques for restoring the original image from a degraded one; the simplest approach to restoration is direct inverse filtering.
Inverse filtering:
The concept of inverse filtering is very simple. Our expression is
that G(u, v), the Fourier transform of the degraded image, is given by G(u, v) = H(u, v) F(u, v), where H(u, v) is the degradation function in the frequency domain and F(u, v) is the Fourier transform of the original image. The inverse-filter estimate is F'(u, v) = G(u, v) / H(u, v), where the division is an array (element-wise) operation.
Noise is enhanced when H(u,v) is small.
To avoid the side effect of enhancing noise, we can apply this formulation only to frequency components (u, v) within a radius D0 from the center of H(u,v).
So, this expression says that even if H(u, v) is known exactly, perfect reconstruction may not be possible because N(u, v) is not known. Moreover, if H(u, v) is near zero, N(u, v)/H(u, v) will dominate the estimate F'(u, v).
Now, H(u, v) F(u, v) is a point-by-point multiplication: for every value of u and v, the corresponding F component and H component are multiplied together to give the final matrix, which is again in the frequency domain. This problem can be reduced by limiting the analysis to frequencies near the origin.
The solution is again to carry out the restoration process in a limited neighborhood about the origin where H(u,v) is not very small. This procedure is called pseudo-inverse filtering.
In inverse filtering, we keep only those components of H(u,v) for which the noise does not dominate the result. This is achieved by including only the low-frequency components around the origin. Note that the origin of the centered transform, H(M/2, N/2), corresponds to the highest-amplitude component.
Consider the degradation function for atmospheric turbulence, centered at the origin of the frequency spectrum: H(u, v) = e^(-k(u² + v²)^(5/6)).
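A hedged sketch of pseudo-inverse filtering under these ideas, assuming the degradation H(u,v) is known; the turbulence-style H, the cutoff radius D0 and the threshold eps below are example choices, not values from the notes:

```python
import numpy as np

def pseudo_inverse_filter(g, H, D0=40.0, eps=1e-3):
    """Direct inverse filtering F_hat = G / H, restricted to a radius D0 around the
    centre of the (centred) spectrum and to |H| > eps, i.e. pseudo-inverse filtering."""
    M, N = g.shape
    G = np.fft.fftshift(np.fft.fft2(g))
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    mask = (D <= D0) & (np.abs(H) > eps)            # keep only low frequencies with usable H
    F_hat = np.zeros_like(G)
    F_hat[mask] = G[mask] / H[mask]
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))

if __name__ == "__main__":
    M = N = 128
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = np.exp(-0.0025 * D2 ** (5.0 / 6.0))         # atmospheric-turbulence-style degradation
    f = np.zeros((M, N)); f[48:80, 48:80] = 1.0
    g = np.real(np.fft.ifft2(np.fft.ifftshift(H * np.fft.fftshift(np.fft.fft2(f)))))
    f_hat = pseudo_inverse_filter(g, H, D0=40)
    print(np.abs(f - f_hat).mean())
```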
3. Wiener filter (constrained) Direct Method (Stochastic Regularization)
or
Explain the use of Wiener filtering in image restoration. [MAY/JUNE 2016, APR/MAY 2017]
or
How is the Wiener filter helpful in reducing the mean square error when an image is corrupted by motion blur and additive noise? [APRIL/MAY 2015]
The inverse filter considers the degradation function only and does not account for the noise. The Wiener filtering approach tries to reconstruct the degraded image by minimizing a mean-square error function.
Restoration: Wiener filter
Degradation model:
g(x, y) = h(x, y) * f (x, y) + η(x, y)
The Wiener filter estimate is
F'(u,v) = [ (1/H(u,v)) · |H(u,v)|² / ( |H(u,v)|² + Sη(u,v)/Sf(u,v) ) ] G(u,v)
where Sη(u, v) = |N(u, v)|² is the power spectrum of the noise, Sf(u, v) = |F(u, v)|² is the power spectrum of the undegraded image, and G(u, v) is the transform of the degraded image.
Now, in this case, you might notice that if the image does not contain any noise, then Sη(u,v), the power spectrum of the noise, will be equal to 0 and the Wiener filter becomes identical to the inverse filter. But if the degraded image also contains additive noise in addition to the blurring, the Wiener filter and the inverse filter are different.
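The sketch below illustrates Wiener deconvolution with the common simplification of replacing the unknown ratio Sη/Sf by a constant K; the blur kernel, noise level and K are assumed example values:

```python
import numpy as np

def wiener_filter(g, H, K=0.01):
    """Wiener deconvolution with the usual approximation S_eta/S_f ~ K (a constant):
    F_hat = [conj(H) / (|H|^2 + K)] * G."""
    G = np.fft.fft2(g)
    W = np.conj(H) / (np.abs(H) ** 2 + K)
    return np.real(np.fft.ifft2(W * G))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    f = np.zeros((128, 128)); f[40:90, 40:90] = 1.0
    h = np.zeros_like(f); h[:5, :5] = 1.0 / 25.0     # simple averaging blur kernel
    H = np.fft.fft2(h)
    g = np.real(np.fft.ifft2(H * np.fft.fft2(f))) + rng.normal(0, 0.01, f.shape)
    f_hat = wiener_filter(g, H, K=0.01)
    # compare the error of the blurred/noisy image with that of the restored image
    print(np.abs(f - g).mean(), np.abs(f - f_hat).mean())
```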
Adaptive filters are filters whose behavior changes based on the statistical characteristics of the image inside the filter region defined by an m x n rectangular window Sxy.
Adaptive, local noise reduction filter:
The simplest statistical measures of a random variable are its mean and variance. These are reasonable parameters on which to base an adaptive filter because they are quantities closely related to the appearance of an image. The mean gives a measure of the average gray level in the region over which the mean is computed, and the variance gives a measure of the average contrast in that region.
This filter operates on a local region Sxy. The response of the filter at any point (x, y) on which the region is centered is based on four quantities: (a) g(x, y), the value of the noisy image at (x, y); (b) σ²η, the variance of the noise corrupting f(x, y) to form g(x, y); (c) mL, the local mean of the pixels in Sxy; and (d) σ²L, the local variance of the pixels in Sxy.
The behavior of the filter is to be as follows:
1. If σ²η is zero, the filter should return simply the value of g(x, y). This is the trivial, zero-noise case in which g(x, y) is equal to f(x, y).
2. If the local variance is high relative to σ²η, the filter should return a value close to g(x, y).
3. If the two variances are equal, we want the filter to return the arithmetic mean value of the pixels in Sxy. This condition occurs when the local area has the same properties as the overall image, and local noise is to be reduced simply by averaging.
The adaptive local noise reduction filter is given by
f'(x,y) = g(x,y) – (σ²η / σ²L) [ g(x,y) – mL ]
The only quantity that needs to be known or estimated is the variance of the overall noise, σ²η. The other parameters are computed from the pixels in Sxy at each location (x, y) on which the filter window is centered.
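A direct, unoptimised sketch of this adaptive, local noise reduction filter (window size and noise variance below are example parameters):

```python
import numpy as np

def adaptive_local_noise_filter(g, noise_var, win=7):
    """f_hat(x,y) = g(x,y) - (sigma_eta^2 / sigma_L^2) * (g(x,y) - m_L), computed over a
    win x win window S_xy; the ratio is clipped to 1 so the filter never over-corrects."""
    pad = win // 2
    gp = np.pad(g, pad, mode='reflect')
    out = np.empty_like(g, dtype=float)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            region = gp[x:x + win, y:y + win]
            m_L = region.mean()
            var_L = region.var()
            ratio = 1.0 if var_L == 0 else min(noise_var / var_L, 1.0)
            out[x, y] = g[x, y] - ratio * (g[x, y] - m_L)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    f = np.full((64, 64), 120.0)
    g = f + rng.normal(0, 15.0, f.shape)                 # known noise variance 15^2
    f_hat = adaptive_local_noise_filter(g, noise_var=15.0 ** 2)
    print(round(float(g.var()), 1), round(float(f_hat.var()), 1))   # variance drops
```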
Adaptive median filter:
The median filter performs well as long as the spatial density of the impulse noise is not large (as a rule of thumb, Pa and Pb less than 0.2). Adaptive median filtering can handle impulse noise with probabilities even larger than these. An additional benefit of the adaptive median filter is that it seeks to preserve detail while smoothing non-impulse noise, something that the "traditional" median filter does not do. The adaptive median filter also works in a rectangular window area Sxy. Unlike those filters, however, the adaptive median filter changes (increases) the size of Sxy during filter operation, depending on certain conditions. The output of the filter is a single value used to replace the value of the pixel at (x, y), the particular point on which the window Sxy is centered at a given time.
Consider the following notation:
zmin = minimum gray level value in Sxy
zmax = maximum gray level value in Sxy
zmed = median of gray levels in Sxy
zxy = gray level at coordinates (x, y)
Smax = maximum allowed size of Sxy.
The adaptive median filtering algorithm works in two levels, denoted level
A and level B, as follows:
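The level A / level B steps themselves are not reproduced on this page; the sketch below encodes the standard formulation (level A tests whether the median itself is an impulse, level B whether the centre pixel is) as code comments, with example window sizes:

```python
import numpy as np

def adaptive_median_filter(g, s_init=3, s_max=9):
    """Adaptive median filtering following the usual two-level (A/B) formulation."""
    pad = s_max // 2
    gp = np.pad(g, pad, mode='reflect')
    out = g.copy()
    rows, cols = g.shape
    for x in range(rows):
        for y in range(cols):
            s = s_init
            while True:
                r = s // 2
                win = gp[x + pad - r:x + pad + r + 1, y + pad - r:y + pad + r + 1]
                zmin, zmax, zmed = win.min(), win.max(), np.median(win)
                zxy = g[x, y]
                # Level A: is the median itself an impulse (zmin < zmed < zmax)?
                if zmin < zmed < zmax:
                    # Level B: keep zxy if it is not an impulse, otherwise output zmed.
                    out[x, y] = zxy if zmin < zxy < zmax else zmed
                    break
                s += 2                                   # enlarge S_xy and repeat level A
                if s > s_max:
                    out[x, y] = zmed                     # window size limit Smax reached
                    break
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    f = np.full((64, 64), 128, dtype=np.uint8)
    g = f.copy()
    g[rng.random(f.shape) < 0.2] = 255                  # salt
    g[rng.random(f.shape) < 0.2] = 0                    # pepper
    print(np.abs(adaptive_median_filter(g).astype(int) - f.astype(int)).mean())
```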
Edge Linking and Boundary Detection
Ideally, the methods discussed in the previous section should yield pixels lying only on edges. In practice, this set of pixels seldom characterizes an edge completely because of noise, breaks in the edge from non-uniform illumination, and other effects that introduce spurious intensity discontinuities. Thus edge detection algorithms typically are followed by linking procedures to assemble edge pixels into meaningful edges. Several basic approaches are suited to this purpose.
Local Processing
One of the simplest approaches for linking edge points is to analyze the characteristics of pixels in a small neighborhood (say, 3 x 3 or 5 x 5) about every point (x, y) in an image that has been labeled an edge point. All points that are similar according to a set of predefined criteria are linked, forming an edge of pixels that share those criteria.
The two principal properties used for establishing similarity of edge pixels in this kind of analysis are (1) the strength of the response of the gradient operator used to produce the edge pixel, and (2) the direction of the gradient vector. The first property is given by the value of ∇f. Thus an edge pixel with coordinates (x0, y0) in a predefined neighborhood of (x, y) is similar in magnitude to the pixel at (x, y) if
|∇f(x,y) - ∇f(x0, y0)| ≤ E, where E is a nonnegative threshold.
An edge pixel at (x0, y0) in the predefined neighborhood of (x, y) has an angle similar to the pixel at (x, y) if
|α(x,y) - α(x0, y0)| < A, where A is a nonnegative angle threshold.
Global Processing
Given n points in an image, suppose that we want to find subsets of these points that lie on straight lines. One possible solution is to first find all lines determined by every pair of points and then find all subsets of points that are close to particular lines. The problem with this procedure is that it involves finding n(n – 1)/2 ≈ n² lines and then performing n · n(n – 1)/2 ≈ n³ comparisons of every point to all lines. This approach is computationally prohibitive in all but the most trivial applications.
Global Processing via Graph-Theoretic Techniques
A global approach for edge detection and linking is based on representing edge segments in the form of a graph and searching the graph for low-cost paths that correspond to significant edges. This representation provides a rugged approach that performs well in the presence of noise.
6. Explain about Region based segmentation [APR/MAY 2017, NOV DEC
2016]
Region based segmentation
The objective of segmentation is to partition an image into regions.
Region Growing
As its name implies, region growing is a procedure that groups pixels or subregions into larger regions based on predefined criteria. The basic approach is to start with a set of "seed" points and from these grow regions by appending to each seed those neighboring pixels that have properties similar to the seed (such as specific ranges of gray level or color).
When a priori information is not available, the procedure is to compute
at every pixel the same set of properties that ultimately will be used to assign
pixels to regions during the growing process. If the result of these
computations shows clusters of values , the pixels whose properties
place them near the centroid of these clusters can be used as seeds.
The selection of similarity criteria depends not only on the problem under consideration, but also on the type of image data available. Additional stopping criteria are based on the assumption that a model of expected results is at least partially available.
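A minimal sketch of seeded region growing with a single seed and a simple gray-level similarity criterion (the threshold and 4-connectivity are assumed choices):

```python
import numpy as np
from collections import deque

def region_grow(img, seed, thresh=10):
    """Grow a region from `seed` by appending 4-connected neighbours whose gray level
    differs from the seed value by at most `thresh` (a simple similarity criterion)."""
    rows, cols = img.shape
    seed_val = float(img[seed])
    region = np.zeros((rows, cols), dtype=bool)
    region[seed] = True
    q = deque([seed])
    while q:
        x, y = q.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < rows and 0 <= ny < cols and not region[nx, ny]:
                if abs(float(img[nx, ny]) - seed_val) <= thresh:
                    region[nx, ny] = True
                    q.append((nx, ny))
    return region

if __name__ == "__main__":
    img = np.zeros((64, 64), dtype=np.uint8)
    img[20:40, 20:40] = 200                         # bright square to be segmented
    mask = region_grow(img, seed=(30, 30), thresh=10)
    print(mask.sum())                               # 400 pixels in the grown region
```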
Region Splitting and Merging
An alternative is to subdivide an image initially into a set of arbitrary, disjoint regions and then merge and/or split the regions in an attempt to satisfy the conditions.
Let R represent the entire image region and select a predicate P. One approach for segmenting R is to subdivide it successively into smaller and smaller quadrant regions so that, for any region Ri, P(Ri) = TRUE. We start with the entire region. If P(R) = FALSE, we divide the image into quadrants. If P is FALSE for any quadrant, we subdivide that quadrant into subquadrants, and so on. This particular splitting technique has a convenient representation in the form of a so-called quadtree.
If only splitting were used, the final partition would likely contain adjacent regions with identical properties. This drawback may be remedied by allowing merging as well as splitting: merge only adjacent regions whose combined pixels satisfy the predicate P. That is, two adjacent regions Rj and Rk are merged only if P(Rj U Rk) = TRUE.
1. Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE.
2. Merge any adjacent regions Rj and Rk for which P(Rj U Rk) = TRUE.
3. Stop when no further merging or splitting is possible.
Morphological operations rely only on the relative ordering of pixel values, not on their numerical values, and therefore are especially suited to the processing of binary images. Morphological operations can also be applied to greyscale images such that their light transfer functions are unknown and therefore their absolute pixel values are of no or minor interest.
Morphological techniques probe an image with a small shape or template called a structuring element. The structuring element is positioned at all possible locations in the image and it is compared with the corresponding neighbourhood of pixels. Some operations test whether the element "fits" within the neighbourhood, while others test whether it "hits" or intersects the neighbourhood.
A morphological operation on a binary image creates a new binary image in which a pixel has a non-zero value only if the test is successful at that location in the input image.
The structuring element is a small binary image, i.e. a small matrix of
pixels, each with a value of zero or one:
• The matrix dimensions specify the size of the structuring element.
• The pattern of ones and zeros specifies the shape of the structuring
element.
• An origin of the structuring element is usually one of its pixels,
although generally the origin can be outside the structuring
element.
Structuring elements play in morphological image processing the same role as convolution kernels in linear image filtering.
When a structuring element is placed in a binary image, each of its pixels is associated with the corresponding pixel of the neighbourhood under the structuring element. The structuring element is said to fit the image if, for each of its pixels set to 1, the corresponding image pixel is also 1. Similarly, a structuring element is said to hit, or intersect, an image if, at least for one of its pixels set to 1, the corresponding image pixel is also 1.
Fig: Fitting and Hitting of Binary images
Zero-valued pixels of the structuring element are ignored, i.e. indicate points
where the corresponding image value is irrelevant.
Fundamental operations
Erosion and Dilation
Compound operations
Many morphological operations are represented as combinations of
erosion, dilation, and simple set-theoretic operations such as the
complement of a binary image:
In the topographic interpretation of an image, we consider three types of points: (a) points belonging to a regional minimum; (b) points at which a drop of water would fall with certainty to a single such minimum; and (c) points at which water would be equally likely to fall to more than one such minimum. For a particular regional minimum, the set of points satisfying condition (b) is called the catchment basin or watershed of that minimum. The points satisfying condition (c) form crest lines on the topographic surface and are termed divide lines or watershed lines.
The principal objective of segmentation algorithms based on these concepts is to find the watershed lines. The basic idea is simple: Suppose that a hole is punched in each regional minimum and that the entire topography is flooded from below by letting water rise through the holes at a uniform rate. When the rising water in distinct catchment basins is about to merge, a dam is built to prevent the merging. The flooding will eventually reach a stage when only the tops of the dams are visible above the water line. These dam boundaries correspond to the divide lines of the watersheds. Therefore, they are the (continuous) boundaries extracted by a watershed segmentation algorithm.
Erosion and Dilation
Two very common morphology operators are dilation and erosion. For this purpose, we can use the OpenCV functions erode and dilate.
Morphological Operations:
In short: A set of operations that process images based on shapes.
Morphological operations apply a structuring element to an input image
and generate an output image.
The most basic morphological operations are two: erosion and dilation.
They have a wide array of uses, e.g.:
Removing noise
Isolating individual elements and joining disparate elements in an image
Finding intensity bumps or holes in an image
1. Dilation:
This operation consists of convolving an image A with some kernel B, which can have any shape or size, usually a square or circle.
The kernel B has a defined anchor point, usually the center of the kernel.
As the kernel B is scanned over the image, we compute the maximal pixel value overlapped by B and replace the image pixel in the anchor point position with that maximal value. As you can deduce, this maximizing operation causes bright regions within an image to "grow" (hence the name dilation). Taking the image above as an example and applying dilation, we get:
The background (bright) dilates around the black regions of the letter.
2. Erosion:
This operation is the sister of dilation. It computes a local minimum over the area of the kernel.
As the kernel B is scanned over the image, we compute the minimal pixel value overlapped by B and replace the image pixel under the anchor point with that minimal value.
Analogously to the dilation example, we can apply the erosion operator to the original image. In the result, the bright areas of the image (the background) get thinner, whereas the dark zones (the "writing") get bigger.
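A short usage sketch of the OpenCV erode and dilate functions mentioned above, applied to a synthetic binary image with an assumed 5 x 5 rectangular structuring element:

```python
import numpy as np
import cv2

# Binary test image: a white rectangle on a black background.
img = np.zeros((100, 100), dtype=np.uint8)
cv2.rectangle(img, (30, 30), (70, 70), 255, thickness=-1)

# 5 x 5 rectangular structuring element (kernel B); the anchor defaults to its centre.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

eroded = cv2.erode(img, kernel, iterations=1)    # local minimum: bright region shrinks
dilated = cv2.dilate(img, kernel, iterations=1)  # local maximum: bright region grows

print((img > 0).sum(), (eroded > 0).sum(), (dilated > 0).sum())
```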
UNIT IV
WAVELETS AND IMAGE COMPRESSION 9
Wavelets – Sub band coding - Multi resolution expansions -
Compression: Fundamentals – Image Compression models – Error Free
Compression – Variable Length Coding – Bit-Plane Coding – Lossless
Predictive Coding – Lossy Compression – Lossy Predictive Coding –
Compression Standards.
PART A
1. What is image, Data Compression and its type?
Image compression
Image compression refers to the process of reducing the amount of data required to represent a digital image. The basis of the reduction process is the removal of redundant data.
Data Compression
Data compression requires the identification and extraction of source redundancy. In other words, data compression seeks to reduce the number of bits used to store or transmit information.
Types
Lossless compression
Lossy compression
2. What is the need for Compression? (May’14) (May’13)
In terms of storage, the capacity of a storage device can be effectively increased with methods that compress a body of data on its way to a storage
increased with methods that compress a body of data on its way to a storage
device and decompress it when it is retrieved.
In terms of communications, the bandwidth of a digital communication
link can be effectively increased by compressing data at the sending end and
decompressing data at the receiving end.
At any given time, the ability of the Internet to transfer data is fixed.
Thus, if data can effectively be compressed wherever possible, significant
improvements of data throughput can be achieved.
This type of redundancy is called spatial redundancy, geometric redundancy, or inter pixel redundancy. Eg: Run length coding.
4. What is run length coding? (May'14, APR/MAY 2017)
Run-length Encoding, or RLE, is a technique used to reduce the size of a repeating string of characters. This repeating string is called a run; typically RLE encodes a run of symbols into two bytes, a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of the data to be compressed affects the compression ratio. Compression is normally measured with the compression ratio:
5. Define compression ratio. (June'12) (Dec'14)
Compression Ratio = original size / compressed size
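An illustrative sketch of run-length encoding and the compression ratio for a single image row with long runs (the two-bytes-per-(count, symbol)-pair accounting follows the description above):

```python
def rle_encode(data):
    """Encode a sequence of symbols as (count, symbol) pairs."""
    encoded = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        encoded.append((j - i, data[i]))
        i = j
    return encoded

def rle_decode(pairs):
    out = []
    for count, symbol in pairs:
        out.extend([symbol] * count)
    return out

if __name__ == "__main__":
    row = [255] * 12 + [0] * 4 + [255] * 8           # one image row with long runs
    code = rle_encode(row)
    assert rle_decode(code) == row
    # Compression ratio = original size / compressed size (2 bytes per pair assumed)
    print(code, len(row) / (2 * len(code)))
```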
6. Define encoder and source encoder?
Encoder
The encoder creates a set of symbols from the input data. It has two components: A) Source Encoder B) Channel Encoder.
Source encoder
The source encoder is responsible for removing the coding, inter pixel and psycho visual redundancy. It performs three operations:
Mapper - this transforms the input data into a (usually non-visual) format. It reduces the inter pixel redundancy.
Quantizer - it reduces the psycho visual redundancy of the input images. This step is omitted if the system is error free.
Symbol encoder - This reduces the coding redundancy. This is the final
stage of encoding process.
7. Define channel encoder and types of decoder?
Channel encoder
The channel encoder reduces the impact of the channel noise by
inserting redundant bits into the source encoded data. Eg: Hamming code
types of decoder
Sourced decoder has two components
a)Symbol decoder - This performs inverse operation of symbol encoder.
b)Inverse mapping - This performs inverse operation of map per.
Channel decoder-this is omitted if the system is error free.
8. What are the operations performed by error free compression and Variable Length Coding? [APR/MAY 2017]
Error free compression
Devising an alternative representation of the image in which its inter pixel redundancies are reduced.
Coding the representation to eliminate coding redundancy.
Variable Length Coding
Variable Length Coding is the simplest approach to error free compression. It reduces only the coding redundancy. It assigns the shortest possible code words to the most probable gray levels.
9. Define Huffman coding and mention its limitation. (June'12, Dec'13)
1. Huffman coding is a popular technique for removing coding redundancy.
2. When coding the symbols of an information source individually, the Huffman code yields the smallest possible number of code symbols per source symbol.
Limitation: For equiprobable symbols, Huffman coding produces variable-length code words.
10. Define Block code, instantaneous code and B2 code?
Block code
Each source symbol is mapped into a fixed sequence of code symbols or code words; hence it is called a block code.
B2 code: arrange the source symbols in order of decreasing probability and divide the total number of symbols into blocks of equal size. Sum the probabilities of all the source symbols outside the reference block. Now apply the coding procedure to the reference block, including the prefix source symbol. The code words for the remaining symbols can be constructed by means of one or more prefix codes followed by the reference block, as in the case of the binary shift code.
11. What is bit plane decomposition? (Dec'13)
An effective technique for reducing an image's inter pixel redundancies is to process the image's bit planes individually. This technique is based on the concept of decomposing multilevel images into a series of binary images and compressing each binary image via one of several well-known binary compression methods.
12. What are the coding systems in JPEG? (Dec’12)
A lossy baseline coding system, which is based on the DCT and is adequate for most compression applications.
An extended coding system for greater compression, higher precision or
progressive reconstruction applications.
A lossless independent coding system for reversible compression.
13. What is JPEG and basic steps in JPEG?
JPEG
The acronym is expanded as "Joint Photographic Experts Group". It became an international standard in 1992. It works with color and gray scale images.
The two variable length codes (binary shift, Huffman shift) are referred to as shift codes.
A shift code is generated by:
Arranging the probabilities of the source symbols so that they are monotonically decreasing.
Dividing the total number of symbols into symbol blocks of equal size.
Coding the individual elements within all blocks identically.
Adding special shift up/down symbols to identify each block.
PART B
1. What is image compression? Explain any four variable length coding compression schemes. (Dec'13, APR/MAY 2017, NOV DEC 2016)
OR
Explain the schematics of image compression standard JPEG. (May'14)
Image File Formats:
Image file formats are standardized means of organizing and storing
digital images. Image files are composed of digital data in one of these
formats that can be for use on a computer display or printer. An image file
format may store data in uncompressed, compressed, or vector formats.
1. Image file sizes:
Images with the same number of pixels and colour depth can result in very different file sizes after compression due to the nature of the compression algorithms. With some compression formats, images that are
less complex may result in smaller compressed file sizes. This characteristic sometimes results in a smaller file size for some lossless formats than lossy formats. For example, graphically simple images (i.e. images with large continuous regions like line art or animation sequences) may be losslessly compressed into a GIF or PNG format and result in a smaller file size than a lossy JPEG format. Vector images, unlike raster images, can be of any dimension independent of file size. File size increases only with the addition of more vectors.
2. Image file compression
There are two types of image file compression algorithms: lossless and lossy.
a. Lossless compression algorithms reduce file size while preserving
a perfect copy of the original uncompressed image. Lossless compression
generally, but not exclusively, results in larger files than lossy compression.
Lossless compression should be used to avoid accumulating stages of re-
compression when editing images.
b. Lossy compression algorithms preserve a representation of the
original uncompressed image that may appear to be a perfect copy, but it is
not a perfect copy. Often lossy compression is able to achieve smaller file
sizes than lossless compression. Most lossy compression algorithms allow for
variable compression that trades image quality for file size.
Major graphic file formats
The two main families of graphics Raster and Vector.
Raster formats
1.JPEG/JFIF
JPEG (Joint Photographic Experts Group) is a compression method;
JPEG-compressed images are usually stored in the JFIF (JPEG File
Interchange Format) file format. JPEG compression is (in most cases) lossy
compression. The JPEG/JFIF filename extension is JPG or JPEG. Nearly
every digital camera can save images in the JPEG/JFIF format, which supports 8-bit gray scale images and 24-bit color images (8 bits each for red, green, and blue). JPEG applies lossy compression to images, which can result in a significant reduction of the file size.
2. JPEG 2000
JPEG 2000 is a compression standard enabling both lossless and lossy storage. The compression methods used are different from the ones in standard JFIF/JPEG; they improve quality and compression ratios, but also require more computational power to process. JPEG 2000 also adds features that are missing in JPEG. It is not nearly as common as JPEG, but it is used currently in professional movie editing and distribution (some digital cinemas, for example, use JPEG 2000 for individual movie frames).
3. EXIF
The EXIF (Exchangeable image file format) format is a file standard
similar to the JFIF format with TIFF extensions; it is incorporated in the JPEG-
writing software used in most cameras. Its purpose is to record and to
standardize the exchange of images with image metadata between digital
cameras and editing and viewing software. The metadata are recorded for
individual images and include such things as camera settings, time and date,
shutter speed, exposure, image size, compression, name of camera, color
information. When images are viewed or edited by image editing software, all
of this image information can be displayed.
4. TIFF
The TIFF (Tagged Image File Format) format is a flexible format that
normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-
bit totals, respectively, usually using either the TIFF or TIF filename extension.
TIFFs can be lossy and lossless; some offer relatively good lossless
compression for bi-level (black & white) images. Some digital cameras can
save in TIFF format, using the LZW compression algorithm for lossless
storage. TIFF image format is not widely supported by web browsers. TIFF
remains widely accepted as a photograph file standard in the printing
business. TIFF can handle device-specific color spaces, such as the CMYK
defined by a particular set of printing press inks. OCR (Optical Character
Recognition) software packages commonly generate some (often monochromatic) form of TIFF image for scanned text pages.
5. RAW
asy
En
RAW refers to raw image formats that are available on some digital
cameras, rather than to a specific format. These formats usually use a
gin
lossless or nearly lossless compression, and produce file sizes smaller than
eer
the TIFF formats. Although there is a standard raw image format, (ISO 12234-
ing
2, TIFF/EP), the raw formats used by most cameras are not standardized or
documented, and differ among camera manufacturers.
Most camera manufacturers have their own software for decoding or
.ne
developing their raw file format, but there are also many third-party raw file
converter applications available that accept raw files from most digital
t
cameras. Some graphic programs and image editors may not accept some or
all raw file formats, and some older ones have been effectively orphaned
already.
6. GIF
GIF (Graphics Interchange Format) is limited to an 8-bit palette or 256
colors. This makes the GIF format suitable for storing graphics with relatively
few colors such as simple diagrams, shapes, logos and cartoon style images.
The GIF format supports animation and is still widely used to provide image
animation effects. It also uses a lossless compression that is more effective
when large areas have a single color, and ineffective for detailed images or
dithered images.
7. BMP
The BMP file format (Windows bitmap) handles graphics files within the
Microsoft Windows OS. Typically, BMP files are uncompressed, hence they
are large; the advantage is their simplicity and wide acceptance in Windows
programs.
8. PNG
The PNG (Portable Network Graphics) file format was created as the
free, open-source successor to GIF. The PNG file format supports 8 bit palette
ww
images (with optional transparency for all palette colors) and 24 bit true color
w.E
(16 million colors) or 48 bit true color with and without alpha channel - while
GIF supports only 256 colors and a single transparent color. Compared to
asy
JPEG, PNG excels when the image has large, uniformly colored areas. Thus
En
lossless PNG format is best suited for pictures still under edition - and the
lossy formats, like JPEG, are best for the final distribution of photographic
gin
images, because in this case JPG files are usually smaller than PNG files.
eer
The Adam7-interlacing allows an early preview, even when only a small
percentage of the image data has been transmitted.
9. HDR Raster formats ing
Most typical raster formats cannot store HDR data (32 bit floating point
values per pixel component), which is why some relatively old or complex
formats are still predominant here, and worth mentioning separately. Newer
t
alternatives are showing up, though.
10. Other image file formats of the raster type
Other image file formats of raster type include:
JPEG XR (New JPEG standard based on Microsoft HD Photo)
TGA (TARGA)
ILBM (IFF-style format for up to 32 bit in planar representation, plus
optional 64 bit extensions
DEEP (IFF-style format used by TV Paint
OR
How an image is compressed using JPEG Image compression with an
image matrix? MAY/JUNE 2015
asy
A compression system consists of two distinct structural blocks: an
encoder and a decoder.' An input image f (x, y) is fed into the encoder, which
En
creates a set of symbols from the input data. After transmission over the
gin
channel, the encoded representation is fed to the decoder, where a
eer
reconstructed output image f(x, y) is generated. In general, f'(x, y) may or may
not be an exact replica of f(x, y). If it is, the system is error free or information
ing
preserving; if not, some level of distortion is present in the reconstructed
image.
.ne
Both the encoder and decoder shown in Fig. 4.1 consist of two relatively independent functions or sub blocks. The encoder is made up of a source encoder, which removes input redundancies, and a channel encoder, which increases the noise immunity of the source encoder's output. As would be expected, the decoder includes a channel decoder followed by a source decoder. If the channel between the encoder and decoder is noise free (not prone to error), the channel encoder and decoder are omitted, and the general encoder and decoder become the source encoder and decoder, respectively.
ww
w.E
asy
Figure. 4.2 (a) Source encoder and (b) source decoder model.
ing
The specific application and associated fidelity requirements dictate the best encoding approach to use in any given situation. Normally, the approach can be modelled by a series of three independent operations. As Fig. 4.2(a) shows, each operation is designed to reduce one of the three redundancies, and Figure 4.2(b) depicts the corresponding source decoder.
In the first stage of the source encoding process, the mapper
t
transforms the input data into a (usually non visual) format designed to reduce
inter pixel redundancies in the input image. This operation generally is
reversible and may or may not reduce directly the amount of data required to
represent the image.
The second stage, or quantizer block in fig. 4.2(a), reduces the
accuracy of the mapper's output in accordance with some pre-established
fidelity criterion. This stage reduces the psycho visual redundancies of the
input image.
96
In the third and final stage of the source encoding process, the symbol
coder creates a fixed- or variable-length code to represent the quantizer
output and maps the output in accordance with the code. The term symbol
coder distinguishes this coding operation from the overall source encoding
process. In most cases: a variable-length code is used to represent the
mapped and quantized data set. It assigns the shortest code words to the
most frequently occurring output values and thus reduces coding redundant.
Figure 4.2(a) shows the source encoding process as three successive
operations, but all three operations are not necessarily included in every
compression system. Recall, for example, that the quantizer must be omitted
ww
when error-free compression is desired.
w.E The source decoder shown in Fig. 4.2(b) contains only two
components; a symbol decoder and an inverse mapper. These blocks
asy
perform, in reverse order, the inverse operations of the source encoder's
En
symbol encoder and mapper blocks. Because quantization results in
irreversible information loss, an inverse quantizer block is not included in the
gin
general source decoder model shown in Fig. 4.2(b).
The Channel Encoder and Decoder
eer
ing
The channel encoder and decoder play an important role in the overall
encoding-decoding process when the channel of Fig. 4.1 is noisy or prone to
.ne
error. They are designed to reduce the impact of channel noise by inserting a
con- trolled form of redundancy into the source encoded data. As the output of
the source encoder contains little redundancy, it would be highly sensitive to
t
trans- mission noise without the addition of this "controlled redundancy."
One of the most useful channel encoding techniques was devised by R.
W. Hamming (Hamming [1950]). It is based on appending enough bits to the
data being encoded to ensure that some minimum number of bits must
change between valid code words. Hamming showed, for example, that if 3
bits of redundancy are added to a 4-bit word, so that the distanced between
any two valid code words is 3, all single-bit errors can be detected and
corrected. (By ap- pending additional bits of redundancy, multiple-bit errors
97
can be detected and corrected.) The 7-bit Hamming (7,4) code word h1, h2, ..., h7 associated with a 4-bit binary number b3, b2, b1, b0 is given by h1 = b3 ⊕ b2 ⊕ b0, h2 = b3 ⊕ b1 ⊕ b0, h3 = b3, h4 = b2 ⊕ b1 ⊕ b0, h5 = b2, h6 = b1, h7 = b0,
where ⊕ denotes the exclusive OR operation. Note that bits h1, h2 and h4 are even-parity bits for the bit fields b3b2b0, b3b1b0 and b2b1b0, respectively.
(Recall that a string of binary bits has even parity if the number of bits with a
value of 1 is even).
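A small sketch of the Hamming (7,4) encoder/decoder implied by the parity assignment above (the syndrome-based correction step is the standard one, added here for completeness):

```python
def hamming74_encode(b3, b2, b1, b0):
    """Hamming (7,4): h1, h2, h4 are even-parity bits over the fields
    b3 b2 b0, b3 b1 b0 and b2 b1 b0; the data bits occupy h3, h5, h6, h7."""
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]             # (h1 .. h7)

def hamming74_decode(h):
    """Detect and correct a single-bit error, then return the 4 data bits."""
    h1, h2, h3, h4, h5, h6, h7 = h
    c1 = h1 ^ h3 ^ h5 ^ h7                           # parity checks (syndrome bits)
    c2 = h2 ^ h3 ^ h6 ^ h7
    c4 = h4 ^ h5 ^ h6 ^ h7
    pos = c1 + 2 * c2 + 4 * c4                       # position of the erroneous bit (0 = none)
    if pos:
        h = h[:]
        h[pos - 1] ^= 1
    return [h[2], h[4], h[5], h[6]]                  # b3, b2, b1, b0

if __name__ == "__main__":
    word = hamming74_encode(1, 0, 1, 1)
    word[4] ^= 1                                     # flip one bit during "transmission"
    print(hamming74_decode(word))                    # -> [1, 0, 1, 1]
```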
3. Explain in full detail about error free compression. APR 2017
OR
Describe run length encoding with examples. APRIL/MAY 2015
OR
With an example, explain Huffman coding scheme results for image compression. NOV DEC 2016
The principal error-free compression strategies currently in use are discussed here. They normally provide compression ratios of 2 to 10. Moreover, they are equally applicable to both binary and gray-scale images.
The error-free compression techniques generally are composed of two relatively independent operations:
devising an alternative representation of the image in which its inter pixel redundancies are reduced.
coding the representation to eliminate coding redundancies.
Variable-Length Coding
The simplest approach to error-free image compression is to reduce
only coding redundancy. Coding redundancy normally is present in any
natural binary encoding of the gray levels in an image. To do so requires
construction of a variable-length code that assigns the shortest possible code
words to the most probable gray levels. Here, we examine several optimal
and near optimal techniques for constructing such a code. These techniques
are formulated in the language of information theory. In practice, the source
symbols may be either the gray levels of an image or the output of a gray-
level mapping operation (pixel differences, run lengths, and so on).
Huffman coding
The most popular technique for removing coding redundancy is due to
Huffman (Huffman [1952]). When coding the symbols of an information source
individually, Huffman coding yields the smallest possible number of code
symbols per source symbol. In terms of the noiseless coding theorem (see
Section 8.3.3), the resulting code is optimal for a fixed value of n, subject to
the constraint that the source symbols be coded one at a time.
The first step in Huffman's approach is to create a series of source reductions by ordering the probabilities of the symbols under consideration and combining the lowest probability symbols into a single symbol that
replaces them in the next source reduction. Figure 4.3 illustrates this process for binary coding (K-ary Huffman codes can also be constructed). At the far left, a hypothetical set of source symbols and their probabilities are ordered from top to bottom in terms of decreasing probability values. To form the first source reduction, the bottom two probabilities, 0.06 and 0.04, are combined to form a "compound symbol" with probability 0.1. This compound symbol and its associated probability are placed in the first source reduction column so that the probabilities of the reduced source are also ordered from the most to the least probable. This process is then repeated until a reduced source with two symbols (at the far right) is reached.
The second step in Huffman's procedure is to code each reduced
source, starting with the smallest source and working back to the original
source. The minimal length binary code for a two-symbol source, of course, is
the symbols 0 and 1. As Fig. 4.4 shows, these symbols are assigned to the
two symbols on the right (the assignment is arbitrary; reversing the order of
the 0 and 1 would work just as well). As the reduced source symbol with
probability 0.6 was generated by combining two symbols in the reduced
source to its left, the 0 used to code it is now assigned to both of these
Figure. 4.3 Huffman source reductions.
Figure. 4.4 Huffman code assignment procedure.
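A compact sketch of the source-reduction/code-assignment procedure using a priority queue; the symbol probabilities below are assumed to match the kind of example shown in Figures 4.3 and 4.4:

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code for {symbol: probability} by repeatedly
    combining the two least probable entries (the source-reduction step)."""
    heap = [[p, i, {s: ""}] for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)              # two lowest probabilities
        p2, _, c2 = heapq.heappop(heap)
        for s in c1: c1[s] = "0" + c1[s]             # assign 0/1 and merge the groups
        for s in c2: c2[s] = "1" + c2[s]
        c1.update(c2)
        heapq.heappush(heap, [p1 + p2, tie, c1])
        tie += 1
    return heap[0][2]

if __name__ == "__main__":
    # Probabilities similar to the source-reduction example above (assumed values).
    probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
    code = huffman_code(probs)
    avg_len = sum(probs[s] * len(code[s]) for s in probs)
    print(code, round(avg_len, 2))                   # average length 2.2 bits/symbol
```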
Arithmetic coding
Figure 4.5 illustrates the basic arithmetic coding process. Here, a five-
symbol sequence or message, a1, a2, a3, a3, a5 from a four-symbol source is
coded. At the start of the coding process, the message is assumed to occupy
the entire half- open interval [0, 1]. As Table 4.1 shows, this interval is initially
subdivided into four regions based on the probabilities of each source symbol.
Symbol a1 for example, is associated with subinterval [0,0.2). Because it is the
first symbol of the message being coded, the message interval is initially
narrowed to [0, 0.2).Thus in Fig. 4.5 [0, 0.2) is expanded to the full height of
the figure and its end points labelled by the values of the narrowed range. The
narrowed range is then subdivided in accordance with the original source symbol probabilities and the process continues with the next message symbol. In this manner, symbol a2 narrows the subinterval to [0.04, 0.08), a3 further narrows it to [0.056, 0.072), and so on. The final message symbol, which must be reserved as a special end-of-message indicator, narrows the range to [0.06752, 0.0688). Of course, any number within this subinterval - for example, 0.068 - can be used to represent the message.
Table 4.1 Arithmetic coding example.
LZW Coding
LZW coding is conceptually very simple (Welch [1984]). At the onset of
the coding process, a codebook or "dictionary" containing the source symbols
to be coded is constructed. For 8-bit monochrome images, the first 256 words
of the dictionary are assigned to the gray values 0, 1, 2,. . . , 255. As the
encoder sequentially examines the image's pixels, gray-level sequences that
are not in the dictionary are placed in algorithmically determined (e.g., the
next unused) locations. If the first two pixels of the image are white, for
instance, sequence "255-255" might be assigned to location 256, the address
following the locations reserved for gray levels 0 through 255.The next time
that two consecutive white pixels are encountered, code word 256, the
address of the location containing sequence 255-255, is used to represent
them. If a 9-bit, 512-word dictionary is employed in the coding process, the
original (8 + 8) bits that were used to represent the row pixels are replaced by
a single 9-bit code word. Cleary, the size of the dictionary is an important
system parameter. If it is too small, the detection of matching gray-level
sequences will be less likely; if it is too large, the size of the code words will
adversely affect compression performance.
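A minimal sketch of the LZW encoder described above, for a 1-D sequence of 8-bit gray levels (the dictionary starts with the 256 single gray levels and grows as new pixel sequences appear):

```python
def lzw_encode(pixels, bits=8):
    """LZW coding of a 1-D gray-level sequence: the dictionary is initialised with the
    2^bits single gray levels and grows as new pixel sequences are encountered."""
    dictionary = {(i,): i for i in range(2 ** bits)}
    next_code = 2 ** bits
    current = ()
    out = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                      # keep extending the recognised sequence
        else:
            out.append(dictionary[current])
            dictionary[candidate] = next_code        # e.g. "255-255" -> 256 for the first pair
            next_code += 1
            current = (p,)
    if current:
        out.append(dictionary[current])
    return out

if __name__ == "__main__":
    row = [255, 255, 255, 255, 0, 0, 255, 255]
    print(lzw_encode(row))       # repeated sequences are replaced by dictionary addresses
```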
Bit-Plane Coding
Another effective technique for reducing an image's inter pixel redundancies is to process the image's bit planes individually. The technique, called bit-plane coding, is based on the concept of decomposing a multilevel (monochrome or color) image into a series of binary images and compressing each binary image via one of several well-known binary compression methods. In this section, we describe the most popular decomposition approaches and review several of the more commonly used compression methods.
Bit-plane decomposition
The gray levels of an m-bit gray-scale image can be represented in the form of the base 2 polynomial
a(m-1) 2^(m-1) + a(m-2) 2^(m-2) + ... + a1 2^1 + a0 2^0.
Based on this property, a simple method of decomposing the image into a collection of binary images is to separate the m coefficients of the polynomial into m 1-bit planes.
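A short sketch of bit-plane decomposition with NumPy; each plane k holds the coefficient a_k of the base-2 polynomial, and reassembling the planes recovers the image exactly:

```python
import numpy as np

def bit_planes(img, m=8):
    """Decompose an m-bit gray-scale image into m binary images (bit planes)."""
    return [((img >> k) & 1).astype(np.uint8) for k in range(m)]

if __name__ == "__main__":
    img = np.array([[131, 0], [255, 64]], dtype=np.uint8)
    planes = bit_planes(img)
    # Reassemble to confirm that the planes carry all of the information.
    rebuilt = sum((planes[k].astype(np.uint16) << k) for k in range(8)).astype(np.uint8)
    print(np.array_equal(rebuilt, img))             # True
```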
Lossless Predictive Coding
Lossless predictive coding is based on eliminating the inter pixel redundancies of closely spaced pixels by extracting and coding only the new information in each pixel. The new
information of a pixel is defined as the difference between the actual and
predicted value of that pixel.
Figure 4.6 shows the basic components of a lossless predictive coding
system. The system consists of an encoder and a decoder, each containing
an identical predictor. As each successive pixel of the input image, denoted fn,
is introduced to the encoder, the predictor generates the anticipated value of
that pixel based on some number of past inputs. The output of the predictor is
then rounded to the nearest integer, denoted f'n and used to form the
difference or prediction error
en = fn – f'n
which is coded using a variable-length code (by the symbol encoder) to generate the next element of the compressed data stream. The decoder of Fig. 4.6(b) reconstructs en from the received variable-length code words and performs the inverse operation
fn = en + f'n
Various local, global, and adaptive methods can be used to generate f'n. In most cases, however, the prediction is formed by a linear combination of m previous pixels. That is,
f'n = round [ Σ (i = 1 to m) αi fn-i ]
where m is the order of the linear predictor, round is a function used to denote the rounding or nearest integer operation, and the αi, for i = 1, 2, ..., m, are prediction coefficients. In raster scan applications, the subscript n indexes the predictor outputs in accordance with their time of occurrence. In 1-D linear predictive coding, for example, the prediction can be written
Figure. 4.6 A lossless predictive coding model: (a) encoder; (b) decoder.
5. Explain in full detail about Lossy Compression and Lossy Predictive Coding.
Lossy Compression .ne
The lossy encoding is based on the concept of compromising the
accuracy of the reconstructed image in exchange for increased compression.
t
If the resulting distortion (which may or may not be visually apparent) can be
tolerated, the increase in compression can be significant. In fact, many lossy
encoding techniques are capable of reproducing recognizable monochrome
images from data that have been compressed by more than 100: 1 and
images that are virtually indistinguishable from the originals at 10: 1 to 50: 1.
Error-free encoding of monochrome images, however, seldom results in more
than a 3: 1 reduction in data.
Figure. 4.7 A lossy predictive coding model: (a) encoder and (b) decoder.
As Fig. 4.7(a) shows, this is accomplished by placing the lossy encoder's predictor within a feedback loop, where its input, denoted f'n, is generated as a function of past predictions and the corresponding quantized errors. That is,
and
ww
fixed-length code. The resulting DM code rate is 1 bit/pixel.
w.E
asy
En
gin
eer
Figure. 4.8 Delta modulation (DM)
Transform Coding
.ne
In transform coding, a reversible, linear transform (such as the Fourier
transform) is used to map the image into a set of transform coefficients, which
are then quantized and coded. For most natural images, a significant number
of the coefficients have small magnitudes and can be coarsely quantized (or
t
discarded entirely) with little image distortion.
coefficients that carry the least information. These coefficients have the
w.E
smallest impact on reconstructed sub image quality. The encoding process
terminates by coding (normally using a variable- length code) the quantized
asy
coefficients. Any or all of the transform encoding steps can be adapted to local
En
image content, called adaptive transform coding, or fixed for all sub images,
called non adaptive transform coding.
Wavelet Coding
eer
ing
.ne
Figure. 4.10 wavelet coding system (a) Encoder, (b) Decoder
Figure 4.10 shows a typical wavelet coding system. To encode a 2J X 2J
t
image, an analyzing wavelet, ψ and minimum decomposition level, J - P, are
selected and used to compute the image's discrete wavelet transform. Since
many of the computed coefficients carry little visual information, they can be
quantized and coded to minimize inter coefficient and coding redundancy.
Moreover, the quantization can be adapted to exploit any positional correlation
across the P decomposition levels.
by the International Standardization Organization (ISO) and the Consultative Committee of the International Telephone and Telegraph (CCITT).
Binary Image Compression Standards
asy
Two of the most widely used image compression standards are the
En
CCITT Group 3 and 4 standards for binary image compression. Although they
are currently utilized in a wide variety of computer applications, they were
gin
originally designed as facsimile (FAX) coding methods for transmitting
eer
documents over telephone networks. The Group 3 standard applies a non
ing
adaptive, 1-D run- length coding technique in which the last K - 1 lines of each
group of K lines (For K = 2 or 4) are optionally code in a 2-D manner. The
Group 4 standard is a simplified or streamlined version of the Group 3
.ne
standard in which only 2-D coding is allowed. Both standards use the same
non adaptive 2-D coding approach.
t
During the development of the CCTTT standards, eight representative
"test" documents were selected and used as a baseline for evaluating various
binary compression alternatives. The existing Group 3 and 4 standards compress these documents, which include both typed and handwritten text (in several languages) as well as a few line drawings, by about 15:1.
the Group 3 and 4 standards are based on non adaptive techniques, however,
they sometimes result in data expansion (e.g., with half-tone images). To
overcome this and related problems, the Joint Bilevel Imaging Group (JBIG) - a joint committee of the CCITT and ISO - has adopted and/or proposed several
image compression. To develop the standards, CCITT and ISO committees solicited algorithm recommendations from a large number of companies, universities, and research laboratories. The best of those submitted were selected on the basis of image quality and compression performance. The resulting standards, which include the original DCT-based JPEG standard, the wavelet-based JPEG 2000 standard, and the JPEG-LS standard, a lossless to near-lossless adaptive prediction scheme that includes a mechanism for flat region detection and run-length coding (ISO/IEC [1999]), represent the state of the art in continuous tone image compression.
JPEG
One of the most popular and comprehensive continuous tone, still
.ne
frame compression standards is the JPEG standard. It defines three different
coding systems: (1) a lossy baseline coding system, which is based on the
t
DCT and is adequate for most compression applications; (2) an mended
coding system for greater compression, higher precision, or progressive
reconstruction applications; and (3) a lossless independent coding system for
reversible compression. To be JPEG compatible, a product or system must
include support for the baseline system. No particular file format, spatial
resolution, or color space model is specified.
In the baseline system, often called the sequential baseline system, the
input and output data precision is limited to 8 bits, whereas the quantized DCT
values are restricted to 11 bits. The compression itself is performed in three
storage, display, and/or editing. Coefficient quantization is adapted to individual scales and sub bands, and the quantized coefficients are arithmetically coded on a bit-plane basis. Using the notation of the standard, an image is encoded as follows (ISO/IEC [2000]).
Video Compression Standards
En
Video compression standards extend the transform-based, still image
gin
compression techniques of the previous section to include methods for
eer
reducing temporal or frame-to-frame redundancies. Although there are a
ing
variety of video coding standards in use today, most rely on similar video
compression techniques. Depending on the intended application, the
standards can be grouped into two broad categories:
.ne(1) video
teleconferencing standards and (2) multi- media standards.
A number of video teleconferencing standards, including H.261 (also
t
referred to as PX64), H.262, H.263, & H.320, have been defined by the
International Telecommunications Union (ITU), the successor to the CCITT.
H.261 is intended for operation at affordable telecom bit rates and to support
full motion video transmission over T1 lines with delays of less than 150 ms.
Delays exceeding 150 ms do not provide viewers the "feeling" of direct visual
feedback. H.263, on the other hand, is designed for very low bit rate video, in
the range of 10 to 30 kbit/s, and H.320, a superset of H.261, is constructed for
Integrated Services Digital Network (ISDN) bandwidths. Each standard uses
a motion- compensated, DCT-based coding scheme. Since motion estimation
coding standard for the storage and retrieval of video on digital media like
w.E
compact disk read-only memories (CD-ROMs).
Figure 4.11 shows a typical MPEG encoder. It exploits redundancies
asy
within and between adjacent video frames, motion uniformity between frames,
En
and the psycho visual properties of the human visual system. The input of the
encoder is an 8 x 8 array of pixels, called an image block. The standards
gin
define a macro block as a 2 x 2 array of image blocks (i.e., a 16 X I6 array of
eer
image elements) and a slice as a row of non overlapping macro blocks.
ing
.ne
t
Figure. 4.11A basic DPCM/ DCT encoder for motion compensated video
compression
UNIT V
IMAGE REPRESENTATION AND RECOGNITION
Chain codes are used to represent a boundary by a connected sequence of straight-line segments of specified length and direction. Typically, this representation is based on 4 or 8 connectivity of the segments. The direction of each segment is coded by using a numbering scheme.
2. What are the demerits of chain code?
The demerits of chain code are:
The resulting chain code tends to be quite long
Any small disturbance along the boundary due to noise causes changes
in the code that may not be related to the shape of the boundary.
3. What is thinning or skeletonising algorithm? NOV DEC 2016
An important approach to represent the structural shape of a plane region is to reduce it to a graph. This reduction may be accomplished by obtaining the skeleton of the region via a thinning (skeletonising) algorithm. It plays a central role in a broad range of problems in image processing, ranging from automated inspection of printed circuit boards to counting of asbestos fibres in air filters.
4. What is polygonal approximation method?
Polygonal approximation is a image representation approach in which a
digital boundary can be approximated with arbitrary accuracy by a polygon.
For a closed curve the approximation is exact when the number of segments
in polygon is equal to the number of points in the boundary so that each pair
of adjacent points defines a segment in the polygon.
5. Define Signature?
A signature is a 1-D representation of a boundary (which is a 2-D thing):
it should be easier to describe. E.g.: distance from the centroid vs angle.
6. Describe Fourier descriptors?
This is a way of using the Fourier transform to analyze the shape of a
boundary. The x-y coordinates of the boundary are treated as the real and
imaginary parts of a complex number. Then the list of coordinates is Fourier
transformed using the DFT . The Fourier coefficients are called the Fourier
descriptors.
ww
w.E
The complex coefficients a(u) are called Fourier descriptor of a boundary.
The inverse Fourier descriptor is given by:
asy
En
7. Define Texture and list the approaches to describe texture of a region.
NOV DEC 2016 gin
eer
Texture is one of the regional descriptors. It provides the measure of
properties such as smoothness, coarseness and regularity.
The approaches to describe the texture of a region are:
i) Statistical approach
ii) Structural approach
iii) Spectral approach
8. What are the features of Fourier spectrum?
Peaks give principal directions of the patterns
Location of the peaks gives the fundamental period(s)
Periodic components can be removed via filtering; the remaining non-
periodic image can be analyzed using statistical techniques
9. Define Pattern recognition?
of a shape number is the number of digits in its representation.
w.E
12. Name a few measures used as simple descriptors in regional
descriptors.
i) Area
asy
ii) Perimeter
En
iii) Mean and median gray levels.
gin
iv) Minimum and maximum of gray levels
eer
v) Number of pixels with values above and below mean.
PART B ing
1. Define Boundary representation with an algorithm and also briefly .ne
explain about the Chain codes. NOV DEC 2016
The result of segmentation is a set of regions. Regions have then to be
t
represented and described.
Two main ways of representing a region:
external characteristics (its boundary): focus on shape
internal characteristics (its internal pixels): focus on color,
textures…
The next step: description
E.g.: a region may be represented by its boundary, and its boundary
described by some features such as length and regularity. Features should be as insensitive as possible to translation, rotation, and scale changes.
Moore Boundary Tracking Algorithm:
1. Let the starting point b0 be the uppermost, leftmost foreground point in the image, and denote by c0 the west neighbour of b0. Examine the 8-neighbours of b0, starting at c0 and proceeding in a clockwise direction. Let b1 be the first foreground neighbour encountered and c1 the background point immediately preceding b1 in the sequence. Store the locations of b0 and b1.
2. Let b = b1 and c = c1.
3. Let the 8-neighbours of b, starting at c and proceeding in a clockwise direction, be n1, n2, ..., nk, where nk is the first foreground point encountered.
4. Let b = nk and c = nk−1.
5. Repeat steps 3 and 4 until b = b0 and the next boundary point found is b1.
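A minimal Python sketch of this tracking procedure is given below. It is an illustration only: it assumes a single 8-connected object of more than one pixel that does not touch the image border, and it uses the simplified "return to the start pixel" stopping rule rather than the full b = b0 and next-point = b1 criterion stated in step 5.

import numpy as np

def moore_boundary_trace(img):
    # 8-neighbour offsets in clockwise order, starting from the west neighbour.
    offsets = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
               (0, 1), (1, 1), (1, 0), (1, -1)]
    ys, xs = np.nonzero(img)
    b0 = min(zip(ys, xs))            # uppermost, leftmost foreground point
    b, c = b0, (b0[0], b0[1] - 1)    # c0 is the west neighbour of b0
    boundary = [b0]
    while True:
        # Examine the 8-neighbours of b clockwise, starting at c.
        start = offsets.index((c[0] - b[0], c[1] - b[1]))
        for k in range(1, 9):
            dy, dx = offsets[(start + k) % 8]
            nb = (b[0] + dy, b[1] + dx)
            if img[nb]:                       # first foreground neighbour nk
                pdy, pdx = offsets[(start + k - 1) % 8]
                c = (b[0] + pdy, b[1] + pdx)  # nk-1 becomes the new c
                b = nb                        # nk becomes the new b
                break
        if b == b0:                           # simplified stopping rule
            return boundary
        boundary.append(b)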
Chain codes
Chain codes are used to represent the boundary of a connected region as a list of segments of defined length and direction. They are based on 4-directional or 8-directional codes. A boundary code formed as a sequence of such directional numbers is referred to as a Freeman chain code.
The chain code depends on the starting point. Since the code represents a closed path, normalization with respect to the starting point can be achieved by circularly shifting the code so that the list of numbers forms the smallest possible integer.
To normalize with respect to rotation, the first difference of the chain code is used. The difference is obtained by counting the number of direction changes (in a counter-clockwise direction) between consecutive code elements. For example, the first difference of the 4-direction chain code 10103322 is 3133030. Size normalization can be achieved by adjusting the size of the resampling grid.
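A small Python sketch of these two normalizations (first difference for rotation, circular shift to the smallest integer for the starting point), using the 4-direction example from the text:

def first_difference(code, directions=4):
    # Counter-clockwise direction change between consecutive code elements.
    return [(code[i + 1] - code[i]) % directions for i in range(len(code) - 1)]

def normalize_starting_point(code):
    # Circularly shift the code so the digit sequence forms the smallest integer.
    return min(code[i:] + code[:i] for i in range(len(code)))

code = [1, 0, 1, 0, 3, 3, 2, 2]
print(first_difference(code))          # [3, 1, 3, 3, 0, 3, 0]
print(normalize_starting_point(code))  # [0, 1, 0, 3, 3, 2, 2, 1]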
Minimum perimeter polygons (MPP); merging and splitting
Merging and splitting are often used together to ensure that vertices appear where they would naturally occur in the boundary. A least-squares fit of a straight line is used to decide when to stop merging points into the current segment.
For the MPP, a cellular complex is the set of cells enclosing the digital boundary. Imagine the boundary as a "rubber band" and allow it to shrink; the maximum error per grid cell is √2 d, where d is the dimension of a grid cell.
Fig. a) An object boundary b) Boundary enclosed by cells c) MPP obtained by
allowing the boundary to shrink
MPP Observations:
The MPP bounded by a simply connected cellular complex is not self-
intersecting.
Every convex vertex of the MPP is a W vertex, but not every W vertex of a boundary is a vertex of the MPP.
Every mirrored concave vertex of the MPP is a B vertex, but not every B vertex of a boundary is a vertex of the MPP.
All B vertices are on or outside the MPP, and all W vertices are on or inside the MPP.
The uppermost, leftmost vertex in a sequence of vertices contained in a cellular complex is always a W vertex of the MPP.
Let a = (x1, y1), b = (x2, y2), and c = (x3, y3), and define
sgn(a, b, c) = det [ x1 y1 1 ; x2 y2 1 ; x3 y3 1 ]
so that sgn(a, b, c) > 0 when c lies on the positive side of the line through a and b, sgn(a, b, c) < 0 when c lies on the negative side, and sgn(a, b, c) = 0 when the three points are collinear.
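As a quick illustration, this sign test can be computed directly as the determinant above (a small sketch, with example points chosen only for demonstration):

import numpy as np

def sgn(a, b, c):
    # Positive if c is on the positive side of the line through a and b,
    # negative on the other side, zero if the three points are collinear.
    (x1, y1), (x2, y2), (x3, y3) = a, b, c
    return np.linalg.det(np.array([[x1, y1, 1.0],
                                   [x2, y2, 1.0],
                                   [x3, y3, 1.0]]))

print(sgn((0, 0), (1, 0), (0, 1)))    # > 0
print(sgn((0, 0), (1, 0), (0, -1)))   # < 0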
Form a list whose rows are the coordinates of each vertex together with a label saying whether that vertex is W or B. The concave vertices must be mirrored, the vertices must be in sequential order, and the first (uppermost, leftmost) vertex V0 is a W vertex. Two crawler points are maintained: a white crawler (WC) that crawls along the convex W vertices, and a black crawler (BC) that crawls along the mirrored concave B vertices.
MPP Algorithm:
1. Set WC = BC = V0.
2. Examine the next vertex VK against the last MPP vertex found, VL:
(a) VK is on the positive side of the line (VL, WC), i.e. sgn(VL, WC, VK) > 0.
(b) VK is on the negative side of the line (VL, WC) or is collinear with it, i.e. sgn(VL, WC, VK) ≤ 0, and VK is on the positive side of the line (VL, BC) or is collinear with it, i.e. sgn(VL, BC, VK) ≥ 0.
(c) VK is on the negative side of the line (VL, BC), i.e. sgn(VL, BC, VK) < 0.
If condition (a) holds, the next MPP vertex is WC and VL = WC; set WC = BC = VL and continue with the next vertex.
If condition (b) holds, VK becomes a candidate MPP vertex. Set WC = VK if VK is convex, otherwise set BC = VK, and continue with the next vertex.
If condition (c) holds, the next MPP vertex is BC and VL = BC. Re-initialize the algorithm by setting WC = BC = VL and continue with the next vertex after VL.
Merging techniques
(1) Walk along the boundary, adding successive points to the current line segment.
(2) Fit a straight line to the points added so far using a least-squares criterion.
(3) Compute the least-squares fitting error E.
(4) Repeat until E > T, a chosen threshold.
(5) Store a and b of y = ax + b, and set E = 0.
(6) Find the following line and repeat until all the edge pixels have been considered.
(7) Calculate the vertices of the polygon, that is, the points where the lines intersect.
Splitting techniques
• Join the two farthest points on the boundary with a line ab.
• Obtain a point c on the upper segment and a point d on the lower segment such that the perpendicular distance from these points to ab is as large as possible.
• Now obtain a polygon by joining c and d with a and b.
• Repeat until the perpendicular distance is less than some predefined fraction of the length of ab.
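The recursive subdivision behind this splitting technique can be sketched as follows. This is only an illustration under simplifying assumptions: the boundary is given as an ordered, open sequence of points, and the stopping test uses an absolute perpendicular-distance threshold rather than a fraction of the chord length; for a closed boundary, the sequence would first be cut at the two farthest-apart points as described above.

import numpy as np

def split_boundary(points, threshold):
    # Returns the indices of the vertices kept for the polygonal approximation.
    pts = np.asarray(points, dtype=float)

    def recurse(i, j):
        if j <= i + 1:
            return [i, j]
        a, b = pts[i], pts[j]
        chord = max(np.linalg.norm(b - a), 1e-12)
        seg = pts[i + 1:j] - a
        # Perpendicular distance of every intermediate point to the chord a-b.
        d = np.abs((b - a)[0] * seg[:, 1] - (b - a)[1] * seg[:, 0]) / chord
        k = int(np.argmax(d))
        if d[k] > threshold:
            mid = i + 1 + k                      # farthest point becomes a vertex
            return recurse(i, mid)[:-1] + recurse(mid, j)
        return [i, j]

    return recurse(0, len(pts) - 1)

pts = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (3.1, 1), (2.9, 2), (3, 3)]
print(split_boundary(pts, threshold=0.5))        # -> [0, 3, 6]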
• Signatures are invariant to translation.
• Invariance to rotation depends on the starting point; the starting point could, for example, be the point farthest from the centroid.
• Scaling changes the amplitude of the signature; invariance can be obtained by normalizing the signature between 0 and 1, or by dividing each sample by the variance of the signature.
Normalization for rotation:
(1) Choose the starting point as the farthest point from the centroid, OR
(2) Choose the starting point as the point on the major axis that is farthest from the centroid.
Normalization for scale (note that increasing the scale increases the amplitude of the signature):
(1) Scale the signature between 0 and 1 (problem: sensitive to noise), OR
(2) Divide each sample by the variance of the signature, assuming it is not zero.
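A minimal sketch of the distance-versus-angle signature with the scale normalization mentioned above. It assumes the region is roughly star-shaped about its centroid, so that the distance is single-valued in the angle.

import numpy as np

def centroid_distance_signature(boundary, n_samples=360):
    pts = np.asarray(boundary, dtype=float)
    d = pts - pts.mean(axis=0)                 # coordinates relative to centroid
    r = np.hypot(d[:, 0], d[:, 1])             # distance from the centroid
    theta = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
    order = np.argsort(theta)
    grid = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)
    sig = np.interp(grid, theta[order], r[order], period=2 * np.pi)
    # Scale normalization: map the signature to the range [0, 1].
    return (sig - sig.min()) / (sig.max() - sig.min() + 1e-12)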
The convex hull of the region enclosed by the boundary is a powerful tool for robust decomposition of the boundary.
Boundary segments are usually easier to describe than the boundary as a whole, so a robust decomposition is needed: the convex hull.
A convex set (region) is a set (region) in which any two points A and B can be joined by a line segment AB such that every point on AB is part of the set (region). The convex hull H of an arbitrary set (region) S is the smallest convex set (region) containing S.
Convex deficiency: D = H − S
The region boundary is partitioned by following the contour of S and marking the points at which a transition is made into or out of a component of the convex deficiency. The partitioning of irregular boundaries (resulting from the digitization process or noise) usually leads to small, meaningless components. We therefore smooth the boundary prior to partitioning:
(1) Use an averaging mask on the coordinates of the boundary pixels, OR
(2) Apply a polygonal approximation prior to computation of the convex deficiency.
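For illustration, the convex hull H of a set of boundary points can be obtained with the standard monotone-chain method below (a sketch, not the partitioning procedure itself; once H is available, the convex deficiency D = H − S can be examined to partition the boundary):

def convex_hull(points):
    # Andrew's monotone chain: returns hull vertices in counter-clockwise order.
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]))
# [(0, 0), (2, 0), (2, 2), (0, 2)]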
Skeleton
One way to represent a shape is to reduce it to a graph by obtaining its skeleton via thinning (skeletonization). Skeletons produce a one-pixel-wide graph that has the same basic shape as the region, like a stick figure of a human. The skeleton can be used to analyze the geometric structure of a region that has bumps and "arms".

The thinning algorithm assumes that region points are valued 1 and background points are valued 0. Let N(p1) be the number of nonzero neighbours of p1, that is, N(p1) = p2 + p3 + ... + p9, and let T(p1) be the number of 0-1 transitions in the ordered sequence p2, p3, ..., p9, p2.

Fig. Medial axes of three simple regions

Fig. Neighbourhood arrangement used by the thinning algorithm.

Step 1: Flag a contour point p1 for deletion if the following conditions are satisfied: a) 2 ≤ N(p1) ≤ 6, b) T(p1) = 1, c) p2·p4·p6 = 0, d) p4·p6·p8 = 0.
Step 2: Flag a contour point p1 for deletion again. Conditions (a) and (b) remain the same, but conditions (c) and (d) are changed to p2·p6·p8 = 0 and p2·p4·p8 = 0.
A thinning algorithm then consists of:
(1) applying step 1 to flag border points for deletion,
(2) deleting the flagged points,
(3) applying step 2 to flag the remaining border points for deletion, and
(4) deleting the flagged points.
This procedure is applied iteratively until no further points are deleted.
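A compact Python sketch of this iterative two-step thinning scheme (Zhang-Suen style) is shown below. It assumes a binary numpy array with foreground = 1 and at least a one-pixel background border around the object.

import numpy as np

def thin(binary):
    img = binary.astype(np.uint8).copy()

    def neighbours(y, x):
        # p2..p9, clockwise starting from the pixel directly above p1.
        return [img[y-1, x], img[y-1, x+1], img[y, x+1], img[y+1, x+1],
                img[y+1, x], img[y+1, x-1], img[y, x-1], img[y-1, x-1]]

    changed = True
    while changed:
        changed = False
        for step in (1, 2):
            flagged = []
            ys, xs = np.nonzero(img[1:-1, 1:-1])
            for y, x in zip(ys + 1, xs + 1):
                p = neighbours(y, x)
                N = int(sum(p))                                   # N(p1)
                T = sum(p[k] == 0 and p[(k+1) % 8] == 1 for k in range(8))
                if step == 1:
                    cd = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                else:
                    cd = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                if 2 <= N <= 6 and T == 1 and cd:
                    flagged.append((y, x))
            for y, x in flagged:
                img[y, x] = 0
            changed = changed or bool(flagged)
    return img

# Usage: skeleton = thin(binary_image)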
Fourier descriptors
This is a way of using the Fourier transform to analyze the shape of a boundary. The x-y coordinates of the boundary are treated as the real and imaginary parts of a complex number. The list of coordinates is then Fourier transformed using the DFT, and the resulting Fourier coefficients are called the Fourier descriptors.
The basic shape of the region is determined by the first several coefficients, which represent the lower frequencies. Higher-frequency terms provide information on the fine detail of the boundary.
– The boundary description becomes a 1-D descriptor.
– Fourier descriptors are not insensitive to translation, but the effects on the transform coefficients are known.
Suppose that a boundary is represented by K coordinate pairs in the xy plane, (x0, y0), (x1, y1), (x2, y2), ..., (xK−1, yK−1). When we traverse this boundary in an anti-clockwise direction it can be represented as the sequence of coordinates s(k) = [x(k), y(k)] for k = 0, 1, 2, ..., K − 1.
1. Represent each point on the digital boundary as s(k) = x(k) + j y(k).
2. Compute the DFT of the set of boundary points.
3. The coefficients a(u) are the Fourier descriptors of the boundary. Since K can be large, the boundary is usually approximated by keeping only the first P coefficients (setting the remaining descriptors to zero) and applying the inverse DFT to this truncated set.
Rotation, scale, and translation of a boundary have simple effects on the Fourier description of that boundary.
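A small numpy sketch of steps 1-3, together with a reconstruction from a reduced set of descriptors. Here the P lowest-frequency coefficients (counting positive and negative frequencies) are retained, which is the usual practical variant of keeping only the "first P" descriptors.

import numpy as np

def fourier_descriptors(boundary):
    pts = np.asarray(boundary, dtype=float)
    s = pts[:, 0] + 1j * pts[:, 1]     # step 1: s(k) = x(k) + j y(k)
    return np.fft.fft(s)               # steps 2-3: descriptors a(u)

def reconstruct(a, P):
    # Keep the P lowest-frequency descriptors, zero the rest, invert the DFT.
    K = len(a)
    keep = np.argsort(np.abs(np.fft.fftfreq(K)))[:P]
    truncated = np.zeros(K, dtype=complex)
    truncated[keep] = a[keep]
    s_hat = np.fft.ifft(truncated)
    return np.column_stack([s_hat.real, s_hat.imag])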
A boundary segment can be represented by a 1-D discrete function g(r). The amplitude of g can then be treated as a discrete random variable v, and a histogram p(vi), i = 0, 1, ..., A − 1 is formed, where A is the number of amplitude increments.
Moments are statistical measures of data and come in integer orders:
Order 0 is just the number of points in the data.
Order 1 is the sum and is used to find the average.
Order 2 is related to the variance.
Order 3 is related to the skew of the data.
Higher orders can also be used, but they do not have simple meanings.
Once a boundary is described as a 1-D function, statistical moments (mean, variance, and a few higher-order central moments) can be used to describe it:
The nth moment of v about its mean m is
μn(v) = Σ_{i=0}^{A−1} (vi − m)^n p(vi), where m = Σ_{i=0}^{A−1} vi p(vi).
These moments are usually sufficient to discriminate between signatures of clearly distinct shapes.
Alternatively, treat the values of g(ri) themselves as the probability of ri occurring, so that the moments are
μn(r) = Σ_{i=0}^{K−1} (ri − m)^n g(ri), where m = Σ_{i=0}^{K−1} ri g(ri).
Here K is the number of points on the boundary, and μn(r) is directly related to the shape of g(r):
Spread of the curve: μ2(r)
Symmetry with respect to the mean: μ3(r)
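A short sketch of the second formulation, with g(r) normalized so that it behaves like a probability distribution over the sample positions (μ2 measures the spread of the signature, μ3 its asymmetry):

import numpy as np

def boundary_moments(g, orders=(2, 3)):
    g = np.asarray(g, dtype=float)
    p = g / g.sum()                    # normalize g(r) to act like p(r_i)
    r = np.arange(len(g))
    m = np.sum(r * p)                  # mean position
    return {n: float(np.sum((r - m) ** n * p)) for n in orders}

sig = np.array([1, 2, 4, 7, 4, 2, 1], dtype=float)
print(boundary_moments(sig))           # symmetric signature: mu3 is close to 0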
5. Explain in detail about the Patterns and pattern classes.
Pattern : an arrangement of descriptors.
Pattern classes: a pattern class is a family of patterns that share some
common properties.
Pattern recognition: to assign patterns to their respective classes.
Three common pattern arrangements used in practice are:
* Vectors – quantitative description
* Strings – structural description
* Trees – structural description
Fig. Three types of iris flowers described by two measurements.
Here is another example of pattern vector generation. In this case, we are interested in different types of noisy shapes. Sample the signatures at some specified interval values of θ.

Fig. A noisy object and its corresponding signature
String descriptions adequately generate patterns of objects and other
entities whose structure is based on relatively simple connectivity of
primitives, usually associated with boundary shape.
6. Define matching. How is matching performed for recognition?
Decision-theoretic approaches to recognition are based on the use of decision functions.
Let x = (x1, x2, ..., xn)^T represent an n-dimensional pattern vector. For W pattern classes ω1, ω2, ..., ωW, we want to find W decision functions d1(x), d2(x), ..., dW(x) with the property that, if a pattern x belongs to class ωi, then
di(x) > dj(x), j = 1, 2, ..., W; j ≠ i
The decision boundary separating classes ωi and ωj is given by di(x) = dj(x).
Matching
Recognition techniques based on matching represent each class by a prototype pattern vector. An unknown pattern is assigned to the class to which it is closest in terms of a predefined metric. The simplest approach is the minimum distance classifier, which computes the Euclidean distance between the unknown pattern and each of the prototype vectors and chooses the smallest distance to make the decision.
Minimum distance classifier
Each class ωj is represented by the mean vector of its patterns, mj = (1/Nj) Σ_{x∈ωj} x, where Nj is the number of training patterns in ωj. An unknown pattern x is assigned to the class whose mean is nearest, i.e. the class that minimizes Dj(x) = ||x − mj||.
The minimum distance classifier works well when the distance between the class means is large compared to the spread or randomness of each class with respect to its mean.
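A minimal sketch of the minimum distance classifier on hypothetical 2-D pattern vectors (the class labels and data below are illustrative only):

import numpy as np

class MinimumDistanceClassifier:
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Prototype of each class: the mean vector of its training patterns.
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Euclidean distance from every pattern to every class mean.
        d = np.linalg.norm(X[:, None, :] - self.means_[None, :, :], axis=2)
        return self.classes_[np.argmin(d, axis=1)]

X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 0, 1, 1, 1]
clf = MinimumDistanceClassifier().fit(X, y)
print(clf.predict([[0.5, 0.5], [5.5, 5.2]]))   # -> [0 1]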
Matching by correlation
The correlation of a template (mask) w(x, y) of size m x n with an image f(x, y) may be expressed in the form
c(x, y) = Σ_s Σ_t w(s, t) f(x + s, y + t)
where the limits of summation are taken over the region shared by w and f. Spatial correlation is related to the transforms of the functions via the correlation theorem: the correlation of f and w corresponds to F*(u, v) W(u, v) in the frequency domain.
The figure shows a template of size m x n whose center is at an arbitrary location (x, y). The correlation at this point is obtained by applying the normalized correlation coefficient. The center of the template is then moved to an adjacent location and the procedure is repeated. The complete correlation coefficient γ(x, y) is obtained by moving the center of the template so that it visits every pixel in f. At the end, the maximum of γ(x, y) indicates where the best match occurred. It is possible to have multiple locations in γ(x, y) with the maximum value, indicating several matches between w and f.
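The sliding-window computation of the normalized correlation coefficient γ(x, y) can be sketched as below (plain loops, no padding at the image borders; the image and template names are illustrative):

import numpy as np

def match_template(f, w):
    f, w = np.asarray(f, dtype=float), np.asarray(w, dtype=float)
    m, n = w.shape
    wz = w - w.mean()
    w_norm = np.sqrt((wz ** 2).sum())
    gamma = np.zeros((f.shape[0] - m + 1, f.shape[1] - n + 1))
    for y in range(gamma.shape[0]):
        for x in range(gamma.shape[1]):
            patch = f[y:y + m, x:x + n]
            pz = patch - patch.mean()
            denom = w_norm * np.sqrt((pz ** 2).sum())
            gamma[y, x] = (wz * pz).sum() / denom if denom > 0 else 0.0
    return gamma

# The best match is where gamma is maximum:
# gamma = match_template(f, w)
# y, x = np.unravel_index(np.argmax(gamma), gamma.shape)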
Time: Three Hours Maximum: 100 Marks
1. When is fine sampling and coarse sampling used?
2. What is the function of an image sensor? page no. 12
3. Differentiate between image enhancement and restoration. page no. 11
4. If all the pixels in an image are shuffled, will there be any change in the histogram? Justify your answer.
5. Why is the restoration called unconstrained restoration? page no. 80
6. Define region growing.
7. What is run length coding?
8. What are the operations performed by error free compression?
9. Does the use of chain code compress the description information of an object contour?
10. What is meant by pattern classes? page no. 126
PART B (5 x 16 = 80 Marks)
11. a. Explain the components of image processing systems? (16)
(OR)
(b) (i) Discuss the effects of non-uniform sampling and quantization. (8)
(ii) Explain how color images are represented using the HSI color space model. (8)
(b) Derive a Wiener filter for image restoration and specify its advantages over the inverse filter. (16)
14. (a) Explain the region splitting and merging technique for image segmentation with suitable examples. (16)
(OR)
(b) Encode the sentence 'I LOVE IMAGE PROCESSING' using the arithmetic coding procedure. (16)
15. (a) Explain in detail the object recognition techniques based on matching. (16)
(OR)
3. What is meant by bit plane slicing?
4. What is unsharp masking?
5. State the causes of degradation in an image.
6. What do you understand by the Mexican hat function?
7. What is an image pyramid?
8. State whether the given Huffman code 0, 10, 01, 011 for the symbols a1, a2, a3, a4 is uniquely decodable or not.
9. What is skeletonizing?
10. Define texture.
PART B (5 x 16 = 80 Marks)
11. a. (i) With necessary diagrams explain how an analog image is converted into a digital image. (8)
(ii) What is meant by image sensing? Explain in detail the construction (8)
3 5 5 5 3
3 4 5 4 3
4 4 4 4 4
(OR)
(b) (i) Explain in detail the method for smoothing an image in the frequency domain. (10)
(ii) Explain gradient operators for image enhancement. (6)
13. (a) (i) Apply order statistics filters on the selected pixels in the image. (8)
(ii) Explain how the Wiener filter is used for image restoration. (8)
1 2 3
0 1 1
4 2 5
(OR)
(b) (i) Explain the process of edge linking using the Hough transform. (8)
(ii) Explain region based segmentation techniques. (8)
14. (a) (i) Explain the two dimensional Discrete Wavelet Transform. (8)
(ii) Encode the word a1 a2 a3 a4 using arithmetic coding and generate the tag for the given symbols with probabilities a1 → 0.2, a2 → 0.2, a3 → 0.4, a4 → 0.2. (8)
(OR)
(b) What is the need for image compression? Explain image compression standards in detail. (16)
15. (a) Explain in detail any two boundary representation schemes and illustrate with examples. (16)
(OR)
(b) Explain image recognition based on matching. (16)
1. Distinguish between a monochrome and a gray scale image.
2. What is the goal of an image transform?
3. What is image filtering?
4. Specify the need for image enhancement.
5. When will a constrained least squares (CLS) filter reduce to an inverse filter?
6. What are the advantages of homomorphic filtering?
7. Compare the Canny and Gaussian edge detectors.
8. Give two applications of image segmentation.
9. Determine whether the code (0, 01, 11) is uniquely decodable or not.
10. Differentiate scalar and vector quantization.
PART B (5 x 16 = 80 Marks)
11. a. (i) Explain the fundamental blocks in a digital image processing system. (7)
(ii) Compute the DCT for the sub-image of size 5x5, where the image is given as (9)
20 30 40 50 40
20 35 45 45 40
30 70 70 70 40
60 65 60 65 40
20 25 49 45 40
(OR)
(b) (i) Describe the elements of visual perception with suitable diagram. (8)
(ii) Discuss the properties and applications of the KL transform. (8)
12. (a) (i) Explain the histogram equalization method of image enhancement.
(ii) Compare the various filters available in the frequency domain for image enhancement. (6)
(OR)
(b) (i) Describe the filters used for removing noise distributions from images. (8)
(ii) Discuss the techniques applicable for color image enhancement. (8)
13. (a) (i) Draw the block diagram of the image degradation model and explain it. (8)
(ii) Explain the use of Wiener filtering in image restoration. (8)
(OR)
(b) (i) Discuss the concept of inverse and pseudo-inverse filters for image restoration. (8)
(ii) What are the spatial transformation techniques used for image restoration? Explain them in detail. (8)
14. (a) (i) Explain the thresholding approach to segmenting an image. (8)
(ii) Discuss the use of the morphological watershed segmentation process. (8)
(OR)
(b) (i) Discuss in detail any two region based image segmentation techniques. (8)
(ii) With an algorithm, explain the watershed segmentation process. (8)
15. (a) (i) With a block diagram, explain the shift coding approach for image compression. (8)
(ii) Describe the stages in the MPEG image compression standard. (8)
(OR)
(b) (i) With an example, explain how the Huffman coding scheme results in image compression. (8)
(ii) Explain the parts of the JPEG compression block diagram. (8)
1. Define simultaneous contrast and the mach band effect.
2. Define brightness and contrast.
3. Give the PDF of uniform noise and sketch it.
4. Define and give the transfer functions of the mean and geometric mean filters.
5. Define the image degradation model and sketch it.
6. Define geometric transformation.
7. Write the properties of the first order and second order derivatives.
8. Define local thresholding for edge detection.
9. State the need for data compression and compare lossy and lossless compression techniques.
10. List the advantages of transform coding.
PART B --- (5 x 16 = 80 Marks)
11. (a) (i) Describe how an image is digitized by sampling and quantization, and explain the checkerboard effect and false contouring with a neat sketch. (8)
(ii) Find the Discrete Cosine Transform and its inverse for the following image data: the 2 x 2 matrix [0 255; 255 0]. (8)
(OR)
(b) Obtain the Discrete Fourier Transform for the given input image matrix [0 0; 255 255] (2 x 2). Also analyse how the Fourier transform behaves if the image is rotated or translated. (16)
processing?
13. (a) Describe inverse filtering for the removal of blur caused by motion and describe how it restores the image. (16)
(OR)
(b) How is the Wiener filter helpful in reducing the mean square error when an image is corrupted by motion blur and additive noise? (16)
14. (a) (i) How do you link edge pixels through the Hough transform? (8)
(ii) Describe the watershed segmentation algorithm. (8)
(OR)
(b) (i) Explain region based segmentation and region growing with an example.
(ii) Discuss how to construct dams using morphological operations. (8)
15. (a) (i) Describe vector quantization with a neat sketch. (8)
(ii) A source emits letters from an alphabet A = (a1, a2, a3, a4, a5) with probabilities P(a1) = 0.3, P(a2) = 0.4, P(a3) = 0.15, P(a4) = 0.05 and P(a5) = 0.1. Find a Huffman code for this source. Find the average length of the code and its redundancy. (8)
(OR)
(b) (i) Describe run length encoding with examples. (8)
(ii) How is an image compressed using JPEG image compression, with an image matrix? (8)
1. Compare the RGB and HSI color image models.
2. Write the kernel for the 2-D DCT and explain how it leads to data compression.
3. What are the possible ways of adding noise to images?
4. For the following image region, obtain the median filtered output.
72
15
55
20 En
33
3
65
5
32
18
30
21
21
65
12
30
35 40 34 gin
255 200 17 51 87
0 255 20 100 101
eer
87 59 4
65
30
32
11
18
8
78
97
86
108
50
ing
129
21
151
11
2
68 72 19 37 14 27 50 64
36 202 111 18
26 192 23 63
5. What is a Lagrange multiplier? Where is it used?
6. Why should blur be removed from images?
7. How are edges linked through the Hough transform?
8. State the problems in "region splitting and merging" based image segmentation.
9. What is shift code? How is it used in image analysis?
10. Write the performance metrics for image compression.
12. (a) (i) Write the salient features of an image histogram. What do you infer? (8)
(ii) Explain any two techniques for color image enhancement. (8)
(OR)
(b) (i) How do you perform directional smoothing in images? Why is it required? (8)
(ii) What are the geometric mean and harmonic mean with reference to an image? What purpose do they serve for image analysis? Discuss.
13. (a) (i) Describe how image restoration can be performed for black (8)
14. (a) (i) How is edge detection performed in digital images using
(1) the Laplacian operator, (2)
(2) the Sobel operator, and (2)
(3) the Prewitt operator? Compare their outcomes. (2+2)
R0 = 1008, R1 = 320, R2 = 456, R3 = 686, R4 = 105, R5 = 803, R6 = 417, R7 = 301
(ii) What is arithmetic coding? Illustrate. (8)
(OR)
(b) (i) Explain the procedure for obtaining Run Length Coding (RLC). What are the advantages, if any? (8)
(ii) Write short notes on
1. Define hue and saturation.
2. What do you mean by the mach band effect?
3. Define spatial averaging.
4. Define the operation of a harmonic mean filter.
5. Compare constrained and unconstrained restoration.
6. What is the principle of inverse filtering?
7. State the conditions for the region splitting and merging processes.
8. What are the factors affecting the accuracy of region growing?
9. What is the need for image compression?
10. What is Run Length Encoding?
PART B – (5 x 16 = 80 Marks)
11. (a) (i) Briefly discuss the elements of a digital image processing system. (8)
(ii) Explain about sampling. (4)
(iii) Write the kernel matrix for the SVD transform. (4)
(OR)
(b) Explain in detail the Vidicon and digital camera working principles. (16)
12. (a) Briefly discuss the histogram equalization and specification techniques. (16)
(OR)
Morphological Watershed. (16)
15. (a) (i) What is the need for data compression? (6)
(ii) Explain in detail about arithmetic coding. (10)
(OR)
(b) Write short notes on the following image codings:
(i) JPEG standard (4)
(ii) MPEG (4)
(iii) Transform coding