Image Processing - Notes
MCA
(TWO YEARS PATTERN)
SEMESTER - II (CBCS)
IMAGE PROCESSING
: Dr. D.S.Rao
Professor (CSE) & Associate Dean
(Student Affairs)
Koneru Lakshmaiah Education Foundation
(Deemed to be University), Hyderabad Campus,
Hyderabad, Telangana
: Ms.Akshata Raut
Assistant Professor
Module I
1. Digital Image Processing
Module II
2. Spatial Domain Methods
3. Image Averaging and Spatial Filtering
Module III
4. Discrete Fourier Transform - I
5. Discrete Fourier Transform - II
Module IV
6. Image Degradation
7. Image Restoration Techniques
Module V
8. Image Data Compression and Morphological Operations
9. Image Compression Standards
Module VI
10. Applications of Image Processing
11. Human Body Tracking Based on Discrete Wavelet Transform
F.Y. MCA
(TWO YEARS PATTERN)
SEMESTER - II (CBCS)
IMAGE PROCESSING
Module 3: Discrete Fourier Transform (8 hours)
Discrete Fourier Transform: Introduction, DFT and its properties, FFT algorithms - direct and divide-and-conquer approaches, 2-D DFT & FFT.
Image Transforms: Introduction to Unitary Transform, DFT, Properties of 2-D DFT, FFT, IFFT, Walsh Transform, Hadamard Transform, Discrete Cosine Transform, Discrete Wavelet Transform: Haar Transform, KL Transform.
Self-Learning Topics: Signals, Fourier Transform, Color Space and Transformation.

Module 6: Applications of Image Processing (4 hours)
Case Study on Digital Watermarking, Biometric Authentication (Face, Fingerprint, Signature Recognition), Vehicle Number Plate Detection and Recognition, Object Detection using Correlation Principle, Person Tracking using DWT, Handwritten and Printed Character Recognition, Content-Based Image Retrieval, Text Compression.
Self-Learning Topics: Industrial applications.
Module I
Introduction to Image Processing Systems:
1
DIGITAL IMAGE PROCESSING
Unit Structure
1.0 Objectives
1.1 Introduction
1.2 An Overview
1.2.1 What is an Image?
1.2.2 What is a Digital image?
1.2.3 Types of Image
1.2.4 Digital Image Processing
1.3 Image Representation
1.4 Basic relationship between Pixels
1.4.1 Neighbors of a Pixel
1.4.2 Adjacency, Connectivity, Regions and Boundaries
1.4.3 Distance Measures
1.4.4 Image operations on a Pixel Basis
1.5 Elements of Digital Image Processing system
1.6 Elements of Visual Perception
1.6.1 Structure of Human Eye
1.6.2 Image Formation in the Eye
1.6.3 Brightness
1.6.4 Contrast
1.6.5 Hue
1.6.6 Saturation
1.6.7 Mach band effect
1.7 Simple Image Formation Model
1.8 Vidicon and Digital Camera working Principle
1.8.1 Vidicon
1.8.2 Digital Camera
1.9 Colour Image Fundamentals
1.9.1 RGB
1.9.2 CMY
1.9.3 HSI Models
1.9.4 2D Sampling
1.9.5 Quantization
1.10 Summary
1.11 References
1.12 Unit End Exercises
1.0 OBJECTIVES
After going through this unit, you will be able to:
❖ Gain knowledge of the evolution of digital image processing
❖ Analyse the limits of digital images
❖ Derive the representation of pixels and the relationships between them
❖ Describe the functioning of a digital image processing system
❖ Specify the color models used in image processing, such as RGB, CMY
and HSI
1.1 INTRODUCTION
Digital images play a main role in day-to-day life. The visual effect plays a greater role than any other medium: when we see an image, we understand the concept without anything being said or explained.
1921
The Bartlane cable picture transmission system used specialized printing equipment to code pictures, which were then reproduced on a telegraph printer fitted with typefaces simulating a halftone pattern. This technology reduced the time required to transmit a picture across the Atlantic to less than 3 hours. Images were coded using 5 levels. Figure 1.1 shows a picture transmitted in this way.
1922
Visual quality was improved through the selection of printing procedures and the distribution of intensity levels, using a technique based on photographic reproduction made from tapes perforated at the telegraph receiving terminal. The level of coding remained 5. Figure 1.2 shows a picture transmitted in this way.
1929
The number of intensity levels was increased to 15. Figure 1.3 shows a picture transmitted in this way.
1964
Digital images processed by digital computers, together with advanced processing techniques, led to digital image processing. The U.S. Ranger 7 spacecraft took the first image of the moon, shown in Figure 1.4. The enhanced methods and lessons learned from this imaging served as the basis for the Surveyor missions to the moon, the Mariner series missions to Mars, the Apollo manned flights to the moon, and others.
Figure 1.1: Picture transmitted by the early Bartlane cable system.
1970
In parallel with space applications, digital image processing was applied to medical imaging, remote Earth-resources sensing and astronomy. For example, CAT (Computerized Axial Tomography) and X-ray imaging use DIP.
1992
Berners-Lee uploaded the first image to the internet in 1992. It was of Les Horribles Cernettes, a parody pop band founded by CERN employees.
1997
Fractals: computer-generated images based on the iterative reproduction of a basic pattern according to mathematical rules were introduced.
1.2 AN OVERVIEW
1.2.1. What is an Image?
A visual representation of an object is called an image. An image is a two-dimensional function that represents a measure of some characteristic, such as brightness or color, of a viewed scene.
i) Analog Image
An image in which the physical quantity varies continuously over the spatial coordinates (x, y) is known as an analog image. An analog image can be mathematically represented as a continuous range of values representing position and intensity. The images produced on the screen of a CRT monitor or television, and many medical images, are analog images.
ii) Digital Image
A digital image is composed of picture elements called pixels, which carry discrete data. Pixels are the smallest samples of an image; a pixel represents the brightness at one point. Common formats of digital images are TIFF, GIF, JPEG, PNG and PostScript.
1.3 IMAGE REPRESENTATION
A digital image can be represented as a matrix A whose element at row x and column y is the grey level f(x, y) at that point.

1.4 BASIC RELATIONSHIP BETWEEN PIXELS
1.4.1 Neighbors of a Pixel
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors, at (x+1, y), (x−1, y), (x, y+1) and (x, y−1); this set is called the 4-neighbors of p, denoted N4(p). The four diagonal neighbors, at (x+1, y+1), (x+1, y−1), (x−1, y+1) and (x−1, y−1), form the set ND(p). Together these eight pixels are the 8-neighbors of p, denoted N8(p).
7
1.4.2 Adjacency, Connectivity, Regions and Boundaries
● To define adjacency, the set of grey-level values V is considered.
● In a binary image, adjacency of pixels with value 1 is referred to as V = {1}.
● In a grey-scale image the idea is the same, but V typically contains more elements, for example V = {100, 101, …, 150}, a subset of the 256 values 0-255.
Types of Adjacency:
(i) 4-adjacency: two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(ii) 8-adjacency: two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(iii) m-adjacency: two pixels p and q with values from V are m-adjacent if
a) q is in N4(p), or
b) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Mixed (m-) adjacency is a modification of 8-adjacency. It is introduced to eliminate the ambiguities (multiple paths between the same pixels) that often arise when 8-adjacency is used, as in the arrangement below:

0 1 1
0 1 0
0 0 1
Connectivity:
Let S represent a subset of pixels in an image, two pixels p and q are said
to be connected in S if there exists a path between them consisting entirely
of pixels in S.
For any pixel p in S, the set of pixels that are connected to it in S is called
a connected component of S. If it only has one connected component, then
set S is called a connected set.
1.4.3 Distance Measures
For pixels p = (x, y) and q = (s, t), the Euclidean distance is
De(p, q) = [(x − s)² + (y − t)²]^(1/2)
Pixels having a Euclidean distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).
The D4 distance (also called city-block distance) between p and q is defined as:
D4(p, q) = | x − s | + | y − t |
Pixels having a D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y).
Example:
The pixels with distance D4≤ 2 from (x,y) form the following contours of
constant distance.
The pixels with D4 = 1 are the 4-neighbors of (x, y).

        2
    2   1   2
2   1   0   1   2
    2   1   2
        2
The D8 distance (also called chessboard distance) between p and q is defined as:
D8(p, q) = max(| x − s |, | y − t |)
Pixels having a D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y).
Example:
The pixels with D8 distance ≤ 2 from (x, y) form the following contours of constant distance:

2 2 2 2 2
2 1 1 1 2
2 1 0 1 2
2 1 1 1 2
2 2 2 2 2
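To make these definitions concrete, both distance measures can be evaluated over a small grid. A minimal MATLAB sketch (the grid size and centre pixel are arbitrary choices, not from the text):

[cols, rows] = meshgrid(1:5, 1:5);        % s (column) and t (row) coordinates
x = 3; y = 3;                             % centre pixel (x, y)
D4 = abs(rows - y) + abs(cols - x);       % D4(p,q) = |x - s| + |y - t|
D8 = max(abs(rows - y), abs(cols - x));   % D8(p,q) = max(|x - s|, |y - t|)
disp(D4)    % entries with D4 <= 2 trace out the diamond shown above
disp(D8)    % entries with D8 <= 2 trace out the square shown above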
Dm Distance:
Dm is the length of the shortest m-path between two points. In this case, the distance between two pixels depends on the values of the pixels along the path, as well as the values of their neighbors.
Example:
Consider the following arrangement of pixels:

      p3  p4
  p1  p2
  p

Assume that p, p2 and p4 have value 1, and that p1 and p3 can each have a value of 0 or 1. Considering adjacency of pixels with value 1, i.e. V = {1}, compute the Dm distance between points p and p4.
There are 4 cases:
Case 1: If p1 = 0 and p3 = 0
The shortest m-path is p - p2 - p4, so the Dm distance is 2.
Case 2: If p1 = 1 and p3 = 0
p and p2 are no longer m-adjacent (the set N4(p) ∩ N4(p2) now contains p1, whose value is in V), so the shortest m-path becomes p - p1 - p2 - p4, of length 3.
Case 3: If p1 = 0 and p3 = 1
Similarly, p2 and p4 are no longer m-adjacent, so the shortest m-path is p - p2 - p3 - p4, of length 3.
Case 4: If p1 = 1 and p3 = 1
The shortest m-path is p - p1 - p2 - p3 - p4, of length 4.
1.5 ELEMENTS OF DIGITAL IMAGE PROCESSING SYSTEM
The basic elements of a digital image processing system are: image acquisition devices (CCD sensor, CMOS sensor, image scanners), an image processing computer, image storage devices (computer memory, frame buffers, magnetic tapes, optical disks) and image display devices (CRT, computer monitor, TV monitor, printer, projector).

i) Image acquisition devices
Devices such as CCD sensors, CMOS sensors and image scanners capture the image and convert it into a form suitable for processing.
ii) Image storage devices
If an image is not compressed, an enormous volume of storage is required. There are three categories of storage devices:
a) Short-term storage b) Online storage c) Archival storage
Short-term storage: Used at the time of processing. Example: computer memory, frame buffers. A frame buffer stores more than one image and can be accessed rapidly at video rates; image zoom, scrolling and pan shifts are done through frame buffers.
Online storage: Used when the stored data must be accessed often; it allows fast recall. Example: magnetic disks or optical media.
Archival storage: Characterized by infrequent access. Example: magnetic tapes and optical disks. It provides a large amount of storage space, and the stored data is accessed infrequently.
1.6 ELEMENTS OF VISUAL PERCEPTION
1.6.1 Structure of Human Eye
i) Cornea; Sclera
The cornea is a tough, transparent tissue that covers the anterior (front) surface of the eye. The sclera is an opaque membrane that is continuous with the cornea and encloses the remaining portion of the eye.
ii) Choroid
The choroid is located directly below the sclera. It contains a network of blood vessels which provides nutrition to the eye. The outer cover of the choroid is heavily pigmented to reduce the amount of extraneous light entering the eye. It also contains the iris diaphragm and the ciliary body.
Iris diaphragm
It contracts and expands to control the amount of light entering into the
eye. The central opening of the iris which appears black is known as pupil
whose diameter varies from 2mm to 8mm.
Lens
The lens is made up of many layers of fibrous cells. It is suspended and attached to the ciliary body. It contains 60% to 70% water, about 6% fat, and more protein than any other tissue in the eye. The lens is colored by a slightly yellow pigmentation that increases with age, which can lead to clouding of the lens. Excessive clouding of the lens, which happens in extreme cases, is known as a cataract; it leads to poor color discrimination and loss of clear vision.
The lens absorbs approximately 8% of the visible light spectrum, with relatively higher absorption at shorter wavelengths. Both infrared and ultraviolet light are absorbed appreciably by proteins within the lens structure and, in excessive amounts, can damage the eye.
iii) Retina
The retina is the innermost membrane; objects are imaged on its surface. The central portion of the retina is called the fovea. The two types of receptors in the retina are rods and cones: rods are long, slender receptors, while cones are shorter and thicker in structure. The rods and cones are not distributed evenly around the retina.
Cones
Cones are highly sensitive to color and are located mainly in the fovea. There are 6 to 7 million cones, and each cone is connected to its own nerve end; therefore humans can resolve fine detail with the use of cones. Cones respond to higher levels of illumination; their response is called photopic vision or bright-light vision.
Rods
Rods are more sensitive to low illumination than cones. There are about 75 to 150 million rods, and many rods are connected to a single common nerve, so the amount of detail recognizable is less; rods provide only a general overall picture of the field of view. Because dim-light vision relies on the rods, objects that appear colored in daylight appear colorless in moonlight. This phenomenon is called scotopic vision or dim-light vision.
The area where there is absence of receptors is called the blind spot
Fig. 1.14: Graphical representation of the eye. Point C is the optical center of the lens.
To focus on nearer objects, the muscles allow the lens to become thicker, making its refractive power strongest. The distance between the centre of the lens and the retina is called the focal length. It ranges from 14 mm to 17 mm as the refractive power decreases from its maximum to its minimum.
1.6.3 Brightness
The following terms are used to define color light:
i)Brightness or Luminance: This is the amount of light received by the eye
regardless of color.
ii) Hue: This is the predominant spectral color in the light.
iii)Saturation: This indicates the spectral purity of the color in the light
1.6.4 Contrast
The response of the eye to changes in the intensity of illumination is non-linear; over a wide range it is approximately logarithmic. This does not hold at very low or very high intensities, and it is dependent on the intensity of the surround.
Perceived brightness and intensity
Perceived brightness is not a simple function of intensity. This can be demonstrated by simultaneous contrast and the Mach band effect.
Simultaneous contrast
The small squares in each image have the same intensity. Because of the different background intensities, however, the small squares do not appear equally bright. Perceiving the two squares on different backgrounds as different, even though they are in fact identical, is called the simultaneous contrast effect. Psychophysically, the effect is caused by the difference in the backgrounds.
The term contrast is used to emphasise the difference in luminance of objects. The perceived brightness of a surface depends upon the local background, as illustrated in Fig. 1.16: the small square on the right-hand side appears brighter than the square on the left-hand side, even though the grey levels of both squares are the same. This phenomenon is termed 'simultaneous contrast'. It is to be noted that simultaneous contrast can make the same colours look different.
1.6.5 Hue
Hue refers to the dominant color family, such as yellow, orange, red, violet, blue or green; tertiary colors would also be considered hues. Tertiary colors are mixed colors in which neither component color is dominant.
On the color circle the pure hues lie around the perimeter; the closer a color is to the center of the circle, the more desaturated it is, with white at the center. Fig. 1.17 shows hue, saturation and lightness.
1.6.6 Saturation
Saturation is how "pure" the color is. For example, if a color's hue is cyan, its saturation is how purely cyan it is; less saturated means more whitish or grayish. If a color has greater-than-zero values for all three of its red, green and blue primaries, then it is somewhat desaturated.
1.6.7 Mach band effect
The visual appearance of each strip is darker at its left side than at its right. The spatial interaction of luminance from an object and its surround creates the Mach band effect, which shows that brightness is not a monotonic function of luminance.
The Mach band effect is caused by lateral inhibition of the receptors in the eye. When receptors receive light, they draw on a light-sensitive chemical compound. Receptors directly on the lighter side of a boundary can pull in unused compound from the darker side and so produce a stronger response, while receptors on the darker side of the boundary give a weaker response.
Although the luminance within each block is constant, the apparent lightness of each strip varies across its length: close to the left edge of a strip it appears lighter than at the centre, and close to the right edge it appears darker than at the centre. The visual system exaggerates the difference in luminance (contrast) at each edge in order to detect it. This shows that the human visual system tends to undershoot or overshoot around the boundary regions of different intensities.
Digital camera
A digital camera is a camera that captures images and turns them into digital form. A digital camera shares the optical system of a film camera: a lens with a variable diaphragm focuses light onto an image pickup device, and the diaphragm and shutter admit the correct amount of light to the imager.
A digital camera contains an image sensor that captures the incoming light rays and turns them into electrical signals. The image sensor can be one of two types: i) a charge-coupled device (CCD), or ii) a CMOS image sensor.
Light from the object enters the camera lens. The incoming light hits the image sensor, which breaks it up into millions of pixels. The sensor measures the color and brightness of each pixel and stores it as a number. The output digital photograph is effectively a long string of numbers describing the exact details of each pixel it contains.
1.9 COLOUR IMAGE FUNDAMENTALS
1.9.1 RGB
In the RGB model, an image consists of three independent image planes,
one in each of the primary colors: red, green and blue. (The standard
wavelengths for the three primaries are as shown in the figure.) A particular color is specified by the amount of each of the primary components present. Figure 1.21 shows the geometry of the RGB color
model for specifying colors using a Cartesian coordinate system. The
grayscale spectrum, i.e. those colors made from equal amounts of each
primary, lies on the line joining the black and white vertices.
Fig.1.21 The RGB color cube. The gray scale spectrum lies on the line
joining the black and white vertices.
This is an additive model, i.e. the colors present in the light add to form
new colors, and is appropriate for the mixing of colored light for example.
The image on the left of figure 1.22 shows the additive mixing of red, green and blue primaries to form the three secondary colors: yellow (red + green), cyan (blue + green) and magenta (red + blue), with white (red + green + blue). The RGB model is used for color monitors and most video cameras.
Fig. 1.22: RGB 24-bit color cube.
Fig. 1.23: Generating the RGB image of the cross-sectional gray color plane. Fig. 1.24: The 216 safe RGB colors in the 256-color RGB system.
Pixel Depth:
The number of bits used to represent each pixel in the RGB space is called the pixel depth. If each image plane is represented by 8 bits, then the pixel depth of each RGB color pixel = 3 × (number of bits/plane) = 3 × 8 = 24. A full-color image is a 24-bit RGB color image, so the total number of colors in a full-color image is (2^8)^3 = 16,777,216.
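The three planes of a 24-bit RGB image can be examined directly in MATLAB. A minimal sketch, assuming the demo image peppers.png that ships with MATLAB:

rgb = imread('peppers.png');    % 24-bit RGB image: M x N x 3, uint8
R = rgb(:, :, 1);               % red plane, 8 bits
G = rgb(:, :, 2);               % green plane, 8 bits
B = rgb(:, :, 3);               % blue plane, 8 bits
numColors = (2^8)^3             % 3 planes x 8 bits/plane -> 16,777,216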
Safe RGB colors:
Many systems in use are limited to 256 colors. A subset of colors that is reproduced faithfully without depending on the hardware capabilities of the system is called the set of safe RGB colors, or the set of all-systems-safe colors.
Hexadecimal representation
The component values in the RGB model can be represented using the hexadecimal number system. The decimal numbers 0, 1, 2, …, 14, 15 correspond to the hex numbers 0, 1, 2, …, 9, A, B, C, D, E, F. The equivalent representation of the component values is given in the table:
Hex 00 33 66 99 CC FF
Decimal 0 51 102 153 204 255
Applications:
Color monitors, Color video cameras
Advantages:
● Image color generation
● Changing to other models such as CMY is straight forward
● It is suitable for hardware implementation
● It is based on the strong perception of human vision to red, green and blue primaries.
Disadvantages:
● It is not intuitive to think of a color image as being formed by combining three primary-color images.
● This model is not suitable for describing colors in a way which is practical for human interpretation.
1.9.2 CMY
The CMY (Cyan, Magenta, Yellow) model is a subtractive model appropriate to the absorption of colors; the CMY model asks what is subtracted from white. The primary colors are cyan, magenta and yellow, and the secondary colors are red, green and blue.
When a surface coated with cyan pigment is illuminated by white light, no red light is reflected; similarly magenta absorbs green, and yellow absorbs blue. The relationship between the RGB and CMY models is given by:
[C]   [1]   [R]
[M] = [1] − [G]
[Y]   [1]   [B]

i.e. C = 1 − R, M = 1 − G, Y = 1 − B.
The CMY model is used by printing devices and filters.
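Because the relationship is a simple subtraction from white, each direction of the conversion takes one line. A minimal sketch, assuming the image is first normalised to [0, 1]:

rgb = im2double(imread('peppers.png'));   % RGB values in [0, 1]
cmy = 1 - rgb;                            % C = 1 - R, M = 1 - G, Y = 1 - B
rgb2 = 1 - cmy;                           % the inverse conversion back to RGB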
1.9.3 HSI Model
i) Hue
Representation of Hue:
The hue of a color point is determined by an angle from some reference point. An angle of 0° from the red axis designates zero hue, and the hue increases as the angle from the red axis increases in the counterclockwise direction.
Fig. 1.25 Conceptual relationship between the RGB and HSI color models
ii) Intensity:
The intensity can be extracted from an RGB image because an RGB color
image is viewed as three monochrome intensity images.
Intensity Axis:
A vertical line joining the black vertex (0, 0, 0) and white vertex(1,1, 1) is
called intensity axis. The intensity axis represents the gray scale.
iii) Saturation:
All points on the intensity axis are gray which means that the saturation
i.e., purity of points on the axis is zero.
When the distance of a color from the intensity axis increases, the
saturation of that color also increases.
Representation of saturation
The saturation is described as the length from the vertical axis.
In the HSI space, it is represented by the length of the vector from the
origin to the color point.
If the length is more the saturation is high and vice versa.
Converting colors from RGB to HSI, the hue component is obtained as
H = θ if B ≤ G, and H = 360° − θ if B > G,
where
θ = cos⁻¹ { ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }
Converting colors from HSI to RGB, GB (Green, Blue) Sector (120° ≤ H < 240°): when H is in this sector, first set H = H − 120°; the RGB components are then given by the equations:
R = I(1 − S)
G = I[1 + S cos H / cos(60° − H)]
B = 3I − (R + G)
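A direct transcription of the RGB-to-HSI formulas can serve as a check. The sketch below is illustrative only: it assumes an RGB image normalised to [0, 1] (peppers.png, shipped with MATLAB, is used here), and it clamps the acos argument to guard against rounding error:

rgb = im2double(imread('peppers.png'));        % RGB in [0, 1]
R = rgb(:,:,1); G = rgb(:,:,2); B = rgb(:,:,3);
num = 0.5 * ((R - G) + (R - B));
den = sqrt((R - G).^2 + (R - B).*(G - B)) + eps;
theta = acosd(min(max(num ./ den, -1), 1));    % angle theta in degrees
H = theta;
H(B > G) = 360 - theta(B > G);                 % H = 360 - theta when B > G
I = (R + G + B) / 3;                           % intensity
S = 1 - min(min(R, G), B) ./ (I + eps);        % saturation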
1.9.4 2D SAMPLING
To create a digital image, convert the continuous sensed data into digital
form. This involves two processes.
i) Sampling
ii) Quantization
An image f(x, y) may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, the function has to be sampled in both coordinates and in amplitude.
Digitizing the coordinate values is called Sampling.
The one-dimensional function in Fig. 1.27(b) is a plot of amplitude
(intensity level) values of the continuous image along the line segment AB
in Fig. 1.27 (a).
To sample this function, equally spaced samples are taken along line AB, as depicted in Fig. 1.27(c). The spatial location of each sample is indicated by a vertical tick mark.
The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function. However, the values of the samples still span (vertically) a continuous range of intensity values. The intensity values must be quantized to form a digital function.
The right side of Fig. 1.27 (c) shows the intensity scale divided into eight
discrete intervals, ranging from black to white. The vertical tick marks
indicate the specific value assigned to each of the eight intensity intervals.
The continuous intensity levels are quantized by assigning one of the eight
values to each sample. The assignment is made depending on the vertical
proximity of a sample to a vertical tick mark. The digital samples resulting
from both sampling and quantization are shown in Fig. 1.27(d). Starting at
the top of the image and carrying out this procedure line by line produces
a two-dimensional digital image.
Fig. 1.27 Generating a digital image. (a) Continuous image. (b) A scan
line from A to B in the continuous image, used to illustrate the concepts
of sampling and quantization. (c) Sampling and quantization.
(d) Digital scan line.
1.9.5 QUANTIZATION
Digitizing the amplitude values is called Quantization.
Quantisation involves representing the sampled data by a finite number of
levels based on some criteria such as minimisation of quantiser distortion.
Quantisers can be classified into two types, namely i) scalar quantisers and ii) vector quantisers. The classification of quantisers is shown in Fig. 1.29.
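Both steps can be imitated in a few lines: sampling by keeping every k-th pixel, and quantisation by rounding intensities to a fixed number of levels. A minimal sketch, assuming the 8-bit demo image cameraman.tif that ships with the Image Processing Toolbox:

f = imread('cameraman.tif');      % 8-bit greyscale image
fs = f(1:4:end, 1:4:end);         % sampling: keep every 4th pixel in x and y
levels = 8;                       % quantise to 8 intensity intervals
step = 256 / levels;
fq = uint8(floor(double(fs) / step) * step);   % quantisation
figure, imshow(fs), figure, imshow(fq)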
1.10 SUMMARY
Digital images began their evolution in 1921, when the Bartlane cable picture transmission system was introduced. In 1964 computers were first used to process digital images, and digital image processing in its modern sense began.
A digital image is composed of elements called pixels. Digital images are used because they allow immediate display, fast processing and compact storage.
The positions of the pixels that represent a digital image are described through the neighbors of a pixel, and through adjacency, boundaries and connectivity of pixels.
Image acquisition devices, image storage devices, image processing elements and image display devices are the basic elements of the digital image processing system used to process digital images. The structure of the human eye helps humans understand and sense the colors and structure of images.
RGB and CMY are useful in representing images with different colors, brightness and contrast.
1.11 REFERENCES
1. R.C. Gonzalez & R.E. Woods, Digital Image Processing, Pearson Education, 3rd edition, ISBN-13: 978-0131687288.
2. S. Jayaraman, Digital Image Processing, TMH (McGraw Hill) publication, ISBN-13: 978-0-07-0144798.
3. William K. Pratt, Digital Image Processing, John Wiley, NJ, 4th edition, 2007.
4. Mukul, Sajjansingh, and Nishi, The Origins of Digital Image Processing & Application Areas in Digital Image Processing: Medical Images.
Module II
Image Enhancement in the spatial domain
2
SPATIAL DOMAIN METHODS
Unit Structure
2.0 Objectives
2.1 Introduction
2.2 An Overview
2.3 Spatial Domain Methods
2.3.1 Point Processing
2.3.2 Intensity transformations
2.3.3 Histogram Processing
2.3.4 Image Subtraction
2.4 Let us Sum Up
2.5 List of References
2.6 Bibliography
2.7 Unit End Exercises
2.0 OBJECTIVES
The main goal of enhancement is to improve the quality of an image so that it may be used in a certain process.
● Image enhancement approaches fall into two categories: enhancement in the spatial domain and enhancement in the frequency domain.
● The term spatial domain refers to the image plane itself; spatial-domain methods are based on DIRECT manipulation of the pixels.
● Frequency-domain processing approaches work by altering an image's Fourier transform.
2.1 INTRODUCTION
Figure 1 depicts further greyscale adjustments.
● EQUALIZATION OF HISTOGRAMS
Equalization of histograms is a typical approach for improving the
appearance of photographs. Assume that we have a largely dark image.
The visual detail is compressed towards the dark end of the histogram, and
the histogram is skewed towards the lower end of the greyscale. The
image would be much clearer if we could stretch out the grey levels at the
dark end to obtain a more uniformly distributed histogram.
Figure 2 shows the original image, histogram, and equalised versions. Both
photos have been quantized to a total of 64 grey levels.
The goal of histogram equalisation is to find a grey-scale translation function that produces an output image with a uniform (or nearly uniform) histogram.
What is the procedure for determining the grey scale transformation
function? Assume that our grey levels are continuous and that they have
been normalised to a range of 0 to 1.
We need to identify a transformation T that converts the grey values r in
the input image F to grey values s = T(r) in the converted image.
The assumptions are that:
● T is single-valued and monotonically increasing, and
● 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
The inverse transformation from s to r is given by
r = T⁻¹(s).
If we take the histogram of the input image and normalise it so that the area under it is 1, we obtain the probability distribution of grey levels in the input image, Pr(r).
What is the probability distribution Ps(s) if we transform the input image to s = T(r)? It turns out that, according to probability theory,
Ps(s) = Pr(r) |dr/ds|,
where r = T⁻¹(s).
Consider the transformation
s = T(r) = ∫₀ʳ Pr(w) dw,
the cumulative distribution function of r. Then ds/dr = Pr(r), and so
Ps(s) = Pr(r) · (1 / Pr(r)) = 1 for all 0 ≤ s ≤ 1.
Thus, Ps(s) is now a uniform distribution function, which is what we want.
● DISCRETE FORMULATION
The probability distribution of grey levels in the input image must first be determined. Now
pr(k) = nk / N,
where nk is the number of pixels having grey level k, and N is the total number of pixels in the image. The transformation now becomes
sk = T(k) = Σ (j = 0 to k) nj / N
Note that 0 ≤ sk ≤ 1, so the output values can be rescaled to the original grey-level range.
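The discrete transformation above takes only a few lines to implement. A minimal sketch, assuming the low-contrast demo image pout.tif that ships with the Image Processing Toolbox (the toolbox function histeq performs the same operation in a single call):

I = imread('pout.tif');          % a low-contrast 8-bit greyscale image
counts = imhist(I, 256);         % n_k: number of pixels at each grey level k
p = counts / numel(I);           % p_r(k) = n_k / N
s = cumsum(p);                   % s_k = sum over j = 0..k of n_j / N
map = uint8(255 * s);            % rescale the transformation to [0, 255]
J = map(double(I) + 1);          % apply the mapping to every pixel
figure, imshow(I), figure, imshow(J)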
● SMOOTHING AN IMAGE
Image smoothing is used to reduce the impact of camera noise, erroneous
pixel values, missing pixel values, and other factors. Image smoothing can
be done in a variety of ways; we'll look at neighborhood averaging and
edge-preserving smoothing.
● NEIGHBOURHOOD AVERAGING
Each pixel is replaced by an average of its neighbours. In the simplest case the mask is:

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

Each pixel value is multiplied by 1/9 and then totalled before being placed in the resulting image. This mask is moved across the image in steps until every pixel is covered. The image is convolved with this smoothing mask (also known as a spatial filter or kernel).
The value of a pixel, on the other hand, is normally expected to be more
strongly related to the values of pixels nearby than to those further away.
This is because most points in a picture are spatially coherent with their
neighbours; in fact, this hypothesis is only false at edge or feature points.
As a result, the pixels towards the mask's center are usually given a higher
weight than those on the edges.
The rectangular weighting function (which just takes the average over the
window), a triangular weighting function, and a Gaussian are all typical
weighting functions.
Although Gaussian smoothing is the most widely utilized, there isn't much
of a difference between alternative weighting functions in practice.
Gaussian smoothing is characterized by the smooth modification of the
image's frequency components.
Smoothing decreases or attenuates the image's higher frequencies. Other
mask shapes can cause strange things to happen to the frequency
spectrum, but we normally don't notice much in terms of image
appearance.
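As a sketch of neighbourhood averaging in MATLAB (conv2 is base MATLAB; fspecial needs the Image Processing Toolbox; the test image and kernel sizes are arbitrary assumptions):

f = im2double(imread('cameraman.tif'));
h = ones(3, 3) / 9;                 % 3x3 averaging mask: every weight is 1/9
g = conv2(f, h, 'same');            % slide the mask across every pixel
hg = fspecial('gaussian', 5, 1.0);  % Gaussian mask: centre weighted highest
gg = conv2(f, hg, 'same');
figure, imshow(g), figure, imshow(gg)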
● EDGE-PRESERVING SMOOTHING
Because the image's high frequencies are suppressed, neighborhood
averaging or Gaussian smoothing will tend to blur edges. Using median
filtering as an alternative is a viable option. The grey level is set to the
median of the pixel values in the pixel's immediate vicinity.
The median m of a set of values is the value at which half of the values are
less than m and the other half are greater. Assume that the pixel values in a given 3 x 3 neighborhood are (10, 20, 20, 15, 20, 20, 20, 25, 100). If we order the values we obtain (10, 15, 20, 20, |20|, 20, 20, 25, 100), and the median is 20.
The result of median filtering is that pixels with outlying values are forced
to become more like their neighbors while maintaining edges. Median
filters, by definition, are non-linear.
Median filtering is related to morphological operations. Pixel values are replaced with the smallest value in the neighborhood when we erode an image; when dilating an image, the greatest value in the neighborhood is used to replace pixel values. Median filtering replaces pixels with the neighborhood's median value. The type of morphological operation is determined by the rank of the value chosen from the neighborhood.
Figure 3: Image of Genevieve with salt and pepper noise, averaging result, and
median filtering result.
2.2 AN OVERVIEW
The spatial domain technique is a well-known denoising technique. It's a
noise-reduction approach that uses spatial filters to apply directly to digital
photos. Linear and nonlinear spatial filters are the two types of spatial
filtering algorithms (Sanches et al., 2008). Filtering is a method used in
image processing to do several preprocessing and other tasks such as
interpolation, resampling, denoising, and so on. The type of task
performed by the filter method and the type of digital image determine the
filter method to be used. Filter methods are used in digital image
processing to remove undesirable noise from digital photographs while
maintaining the original image (Priya et al., 2018; Agostinelli et al., 2013).
Nonlinear filters are used in a variety of ways, the most common of which
is to remove a certain sort of unwanted noise from digital photographs.
There is no built-in way for detecting noise in the digital image with this
method. Nonlinear filters often eliminate noise to a certain point while
blurring images and hiding edges. Several academics have created various
sorts of median (nonlinear) filters to solve this challenge throughout the
previous decade. The median filter, partial differential equations, nonlocal
mean, and total variation are the most used nonlinear filters. A linear filter
is a denoising technique in which the image's output results vary in a
linear fashion. Denoising outcomes are influenced by the image's input.
As the image's input changes, the image's output changes linearly. The
processing time of linear filters for picture denoising is determined by the
input signals and the output signals. The mean linear filter is the most
effective filter for removing Gaussian noise from digital medical pictures.
This approach is a simple way to denoise digital photos (Wieclawek and
Pietka, 2019). The average or mean pixels values of the neighbour pixels
are calculated first, and then replaced with every pixel of the digital image
in the mean filter. To reduce noise from a digital image, it's a very useful
linear filtering approach. Wiener filtering is another linear filtering
technique. This technique requires all additive noise, noise spectra, and
digital picture inputs, and it works best if all of the input signals are in
good working order. This strategy reduces the mean square error of the
intended and estimated random processes by removing noise.
2.3.1 Point Processing
Let f(x, y) be the input image and g(x, y) the processed (output) image, and let an operator on f be defined over some neighbourhood N of (x, y); we usually employ a rectangular subimage centred at (x, y). The grey levels of f(x, y) and g(x, y) are denoted by r and s respectively. We can produce some interesting effects with this technique, such as contrast stretching and bi-level mapping (here an image is converted so that it contains only black and white). The challenge is to define T in such a way that it darkens grey levels below a particular threshold k and brightens grey levels above it. When the darkening and brightening are both total, a black-and-white image is created. This technique is known as 'point processing' since the output value s depends only on the value (i.e. the grey level) of f at a single pixel.
A simple point operation is adding a constant b to every pixel: g(x, y) = f(x, y) + b (Eq. 2.1). The value of b is increased every time you press the '+' brightness button, and vice versa. As b is increased, a higher and higher value is added to each pixel in the input image, making the image brighter. The image becomes brighter if b > 0, and darker if b < 0. Figure 2.2 depicts the effect of altering the brightness.
Figure 2.1: The point-processing principle. A pixel in the input image is processed, and the result is saved in the output image at the same location.
Figure 2.2: The resulting image will be identical to the input image if b in Eq. 2.1 is zero. If b is negative, the resulting image will be darker; if b is positive, the brightness of the resulting image will be increased.
The use of a graph, as shown in Fig. 2.3, is often a more convenient manner of illustrating the brightness operation. The graph depicts the mapping of pixel values in the input image (horizontal axis) to pixel values in the output image (vertical axis). Such a graph is called a gray-level mapping. The mapping in the first graph does nothing, i.e. g(x, y) = f(x, y).
In the following graph, all pixel values are increased (b > 0), resulting in a brighter image. This has two effects: i) no pixel in the output image will be fully dark, and ii) some pixels in the output image would get a value greater than 255. The latter is impossible due to an 8-bit image's upper limit, hence all pixels above 255 are set to 255, as shown in the graph's horizontal section. When b < 0, some pixels will get negative values and are set to zero in the output, as shown in the last graph.
You can adjust the contrast in the same way that you can adjust the
brightness on your TV. The gray-level values that make up an image's
contrast are how distinct they are. When we look at two pixels with values
112 and 114 adjacent to each other, the human eye has trouble
distinguishing them, and we remark there is a low contrast. If the pixels
are 112 and 212, on the other hand, we can readily differentiate them and
claim the contrast is great.
Figure 2.3: Three instances of gray-level mapping. The input is shown at the top; the three additional images are the result of the three gray-level mappings being applied to the input. Eq. 2.1 is used in all three gray-level mappings.
Figure 2.4: If the value of a in Eq. 2.2 is one, the output image will be the same as the input image. If a is less than one, the resulting image will have less contrast; if a is greater than one, the resulting image will have more contrast.
Changing the slope a of the gray-level mapping (Eq. 2.2: g(x, y) = a · f(x, y)) changes the contrast of an image: if a is more than one, the contrast is raised; if it is less than one, the contrast is diminished. When a = 2, the pixels 112 and 114, for example, will have the values 224 and 228, respectively. Because the difference between them is increased by a factor of two, the contrast is raised by a factor of two. The effect of adjusting the contrast may be observed in Fig. 2.4.
When the equations for brightness (Eq. 2.1) and contrast (Eq. 2.2) are combined, we get
g(x, y) = a · f(x, y) + b. (Eq. 2.3)
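Eq. 2.3 can be applied directly to an image; a minimal sketch (the test image and the values of a and b are arbitrary assumptions):

f = imread('cameraman.tif');      % uint8 image, values 0..255
a = 1.5;                          % contrast gain (the slope of the mapping)
b = 20;                           % brightness offset
g = a * double(f) + b;            % Eq. 2.3 applied to every pixel
g = uint8(min(max(g, 0), 255));   % clamp to [0, 255], as described above
imshow(g)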
Figure 2.6: Gamma mapping with a value of 0.45 (left) and a value of 2.22 (right). The original image is in the middle.
A related point operation is the logarithmic mapping g(x, y) = c · log(1 + f(x, y)), with c = 255 / log(1 + umax), where umax is the input image's maximum pixel value. Changing the pixel values of the input image using a linear mapping before the logarithmic mapping can alter the behavior of the logarithmic mapping. Figure 2.7 shows the logarithmic mapping from [0, 255] to [0, 255]; this mapping stretches the contrast of low-intensity pixels while suppressing that of high-intensity pixels.
In general, intensity transformation functions are used to adjust the intensity. The four main intensity transformation functions are discussed in the following sections:
1. photographic negative (using imcomplement)
2. gamma transformation (using imadjust)
3. logarithmic transformations (using c*log(1+f))
4. contrast-stretching transformations
(using 1./(1+(m./(double(f)+eps)).^E))
● PHOTOGRAPHIC NEGATIVE
The Photographic Negative is the most straightforward of the intensity
conversions. Assume we're dealing with grayscale double arrays with
black equal to 0 and white equal to 1. The notion is that 0s become 1s, and
1s become 0s, with any gradients in between reversed as well. This means
that genuine black becomes true white and vice versa in terms of intensity.
MATLAB provides the function imcomplement(f) for producing photographic negatives. The graph below displays the mapping between the original values (x-axis) and the imcomplement function, with a = 0:.01:1.
(Figure: the original image and its photographic negative.)
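A minimal example of the negative (any 8-bit greyscale image will do; cameraman.tif ships with the Image Processing Toolbox):

I = imread('cameraman.tif');
neg = imcomplement(I);            % for uint8: 255 - I, so black <-> white
figure, imshow(I), figure, imshow(neg)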
● GAMMA TRANSFORMATIONS
Gamma transformations allow you to curve the grayscale components to brighten the intensity (when gamma is less than one) or darken it (when gamma is greater than one). These gamma conversions are created using the MATLAB function:
imadjust(f, [low_in high_in], [low_out high_out], gamma)
The input image is f, the curve is gamma, and [low_in high_in] and [low_out high_out] control the clipping: values below low_in and above high_in are clipped to low_out and high_out, respectively. In this lab, both [low_in high_in] and [low_out high_out] are given as [], which indicates that the input's full range is mapped to the output's full range. The plots below show the effect of varying gamma with a = 0:.01:1. Notice that the red line has gamma = 0.4, which creates an upward curve and will brighten the image.
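A minimal gamma example along these lines (the gamma values and test image are arbitrary assumptions):

I = imread('pout.tif');
bright = imadjust(I, [], [], 0.4);   % gamma < 1: upward curve, brightens
dark   = imadjust(I, [], [], 2.2);   % gamma > 1: downward curve, darkens
figure, imshow(bright), figure, imshow(dark)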
The gamma transformation is a crucial step in the image display process, and it is worth finding out more about it. Charles Poynton, a digital video systems expert who previously worked for NASA, has a great gamma FAQ that I recommend you read, especially if you plan to handle CGI. He also dispels several common misunderstandings concerning gamma.
● LOGARITHMIC TRANSFORMATIONS
Logarithmic transformations (like the gamma transformation with gamma < 1) can be used to brighten an image's intensities. They are most commonly used to boost the detail (or contrast) of low-intensity values, and are particularly good at bringing out detail in Fourier transforms (covered in a later lab). The equation for obtaining the logarithmic transform of an image f in MATLAB is:
g = c*log(1 + double(f))
The constant c is typically used to scale the range of the log function to fit the input domain: c = 255/log(1+255) for a uint8 image, or c = 1/log(1+1) (about 1.44) for a double image. It can also be used to boost contrast: the higher the c value, the brighter the image appears. The log function, when used in this manner, can produce results that are too bright to display. The graphic below shows the result for various values of c with a = 0:.01:1. For the plots of c = 2 and c = 5 (teal and purple lines, respectively), the min function clamps the y-values at 1.
● CONTRAST-STRETCHING TRANSFORMATIONS
The original image and the outcomes of applying the three transformations from above are shown below. The m value used in the first set of examples is the average of the image intensities (0.2104). For very high E values, the function behaves like a thresholding function with threshold m: the resulting image is more black-and-white than grayscale.
The following shows the original image and the results of applying the three transformations from above with m values of 0.2, 0.5, and 0.7. Notice that 0.7 produces a darker image with fewer details for this tire image.
The MATLAB code that created these images is:

I = imread('tire.tif');
I2 = im2double(I);                           % convert to double in [0, 1]
contrast1 = 1./(1 + (0.2./(I2 + eps)).^4);   % m = 0.2, E = 4
contrast2 = 1./(1 + (0.5./(I2 + eps)).^4);   % m = 0.5
contrast3 = 1./(1 + (0.7./(I2 + eps)).^4);   % m = 0.7
imshow(I2)
figure, imshow(contrast1)
figure, imshow(contrast2)
figure, imshow(contrast3)
Intensity Transformation | Transformation Function | Corresponding intrans Call
photographic negative | neg = imcomplement(I); | neg = intrans(I, 'neg');
logarithmic | I2 = im2double(I); log = 5*log(1 + I2); | log = intrans(I, 'log', 5);
gamma | gamma = imadjust(I, [], [], 0.4); | gamma = intrans(I, 'gamma', 0.4);
contrast-stretching | I2 = im2double(I); contrast = 1./(1 + (0.2./(I2 + eps)).^5); | contrast = intrans(I, 'stretch', 0.2, 5);
2.3.3 HISTOGRAM PROCESSING
● HISTOGRAMS INTRODUCTION
In digital image processing, the histogram is a graphical representation of a digital image: a graph showing, for each tonal value, the number of pixels having that value. The image histogram is available in today's digital cameras; photographers use it to see the distribution of captured tones.
The horizontal axis of the graph represents the tonal variations, whereas the vertical axis represents the number of pixels with each particular tone. The left side of the horizontal axis depicts the black and dark parts, the middle represents medium grey, and the right side represents the light and pure white parts.
2. It's a tool for analyzing images. A careful examination of the
histogram can be used to predict image properties.
3. The image's brightness can be modified by looking at the histogram's
features.
4. Having information on the x-axis of a histogram allows you to
modify the image's contrast according to your needs.
5. It is used to equalize images. To create a high contrast image, the
grey level intensities are extended along the x-axis.
6. Histograms are utilized in thresholding because they improve the
image's appearance.
7. We can figure out which type of transformation is used in the method
if we have the input and output histograms of an image.
● HISTOGRAM STRETCHING
The contrast of an image is boosted through histogram stretching. The
contrast of an image is defined as the difference between the maximum
and minimum pixel intensity values.
If we wish to increase the contrast of an image, we stretch its histogram until it covers the entire dynamic range: each grey level r is mapped to s = (r − rmin) × 255 / (rmax − rmin). We may determine whether an image has low or high contrast by looking at its histogram.
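In MATLAB, stretchlim finds the actual minimum and maximum intensities (as fractions) and imadjust applies the stretch. A minimal sketch, assuming the low-contrast demo image pout.tif:

I = imread('pout.tif');               % a low-contrast image
low_high = stretchlim(I, 0);          % exact min/max intensities, as fractions
J = imadjust(I, low_high, []);        % stretch to the full dynamic range
figure, imhist(I), figure, imhist(J)  % compare the two histograms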
● HISTOGRAM EQUALIZATION
Equalizing all of an image's pixel values is done through histogram
equalization. The transformation is carried out in such a way that the
histogram is uniformly flattened.
Histogram equalization broadens the dynamic range of pixel values and aims to give each level an approximately equal number of pixels, resulting in a nearly flat histogram and high contrast.
When stretching a histogram, its shape remains the same; when equalizing a histogram, its shape changes.
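For comparison, the toolbox function histeq performs equalization directly; the image name and number of levels below are assumptions:

I = imread('pout.tif');
J = histeq(I, 64);                    % equalize to 64 discrete grey levels
figure, imshow(J), figure, imhist(J)  % the histogram is approximately flat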
● IMAGE SUBTRACTION
The difference of two images f(x, y) and h(x, y) is computed as g(x, y) = f(x, y) − h(x, y) for every pixel. A fascinating application is in medicine, where h(x, y) is called a mask and
subtracted from a succession of images fi(x, y), yielding some fascinating results. It is possible to watch a dye propagate through a person's brain arteries, for example, by doing so. The portions of the images that look the same are darkened each time the difference is calculated, while the differences become more highlighted (they are not subtracted out of the resulting image).
● IMAGE AVERAGING
Consider a noisy image g(x, y), created by adding noise n(x, y) to an original image f(x, y):
g(x, y) = f(x, y) + n(x, y)
If the noise is uncorrelated with zero mean, then averaging K different noisy images of the same scene,
ḡ(x, y) = (1/K) Σ (i = 1 to K) gi(x, y),
reduces the noise variance at each pixel by a factor of K, so ḡ(x, y) approaches f(x, y) as K increases.
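A quick way to see the effect is to synthesise the noisy images. A minimal sketch (the noise level, K and test image are arbitrary assumptions):

f = im2double(imread('cameraman.tif'));
K = 32;                                    % number of noisy observations
gbar = zeros(size(f));
for i = 1:K
    gbar = gbar + f + 0.05*randn(size(f)); % accumulate g_i = f + n_i
end
gbar = gbar / K;                           % noise variance is reduced by 1/K
figure, imshow(f + 0.05*randn(size(f)))    % a single noisy image
figure, imshow(gbar)                       % the much cleaner average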
2.5 LIST OF REFERENCES
1. https://www.mv.helsinki.fi/home/khoramsh/4Image%20Enhancement%20in%20Spatial%20Domain.pdf
2. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node3.html
3. https://www.sciencedirect.com/topics/engineering/spatial-domain
4. http://www.faadooengineers.com/online-study/post/cse/digital-imge-processing/674/spatial-domain-methods
5. https://www.google.com/search?q=point+processing+in+image+processing&rlz=1C1CHZN_enIN974IN974&oq=POINT+PROCESSING&aqs=chrome.1.0i512l10.1767j0j15&sourceid=chrome&ie=UTF-8
6. https://www.cs.uregina.ca/Links/class-info/425/Lab3/
7. https://www.javatpoint.com/dip-histograms#:~:text=In%20digital%20image%20processing%2C%20histograms,the%20details%20of%20its%20histogram.
8. http://www.faadooengineers.com/online-study/post/ece/digital-image-processing/1123/image-subtraction-and-image-averaging
2.6 BIBLIOGRAPHY
1. https://www.mv.helsinki.fi/home/khoramsh/4Image%20Enhancement%20in%20Spatial%20Domain.pdf
2. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node3.html
3. https://www.sciencedirect.com/topics/engineering/spatial-domain
4. http://www.faadooengineers.com/online-study/post/cse/digital-imge-processing/674/spatial-domain-methods
5. https://www.google.com/search?q=point+processing+in+image+processing&rlz=1C1CHZN_enIN974IN974&oq=POINT+PROCESSING&aqs=chrome.1.0i512l10.1767j0j15&sourceid=chrome&ie=UTF-8
6. https://www.cs.uregina.ca/Links/class-info/425/Lab3/
7. https://www.javatpoint.com/dip-histograms#:~:text=In%20digital%20image%20processing%2C%20histograms,the%20details%20of%20its%20histogram.
8. http://www.faadooengineers.com/online-study/post/ece/digital-image-processing/1123/image-subtraction-and-image-averaging
3
IMAGE AVERAGING AND SPATIAL FILTERING
Unit Structure
3.0 Objectives
3.1 Introduction
3.2 An Overview
3.3 Image Averaging Spatial Filtering
3.3.1 Smoothing Filters
3.3.2 Sharpening Filters
3.4 Frequency Domain Methods
3.4.1 Low Pass Filtering
3.4.2 High Pass Filtering
3.4.3 Homomorphic Filter
3.5 Let us Sum Up
3.6 List of References
3.7 Bibliography
3.8 Unit End Exercises
3.0 OBJECTIVES
● The spatial filtering technique is applied to individual pixels in an image. A mask is typically chosen to be odd in size so that it has a distinct center pixel. This mask is positioned on the image so that the mask's center traverses all of the image's pixels.
● Spatial filtering is frequently used to "clean up" laser output,
reducing aberrations in the beam caused by poor, unclean, or damaged
optics, or fluctuations in the laser gain medium itself.
3.1 INTRODUCTION
Spatial filtering is a method of modifying the features of an image by selectively removing certain spatial frequencies, for example in video data received from satellites and space probes, or for raster removal from a television broadcast or scanned image.
Average (or mean) filtering is a technique for smoothing images by
lowering the intensity fluctuation between adjacent pixels. The average
filter replaces each value with the average value of neighboring pixels,
including itself, as it moves through the image pixel by pixel.
Filtering is a method of altering or improving an image. In a spatial domain operation or filtering, the processed value for the current pixel depends on both the pixel itself and the adjacent pixels. Filters, or masks, will be defined for this purpose.
3.2 AN OVERVIEW
IMAGE ENHANCEMENT OVERVIEW
By working with noisy photos we can filter signals from noise in two
dimensions. Two types of noise: binary and Gaussian.
In the binary case, the user specifies a percentage value (a number between 0 and 100) giving the percentage of pixels in the image whose values will be completely lost; these pixels are randomly set equal to the maximum grey level (corresponding to a white pixel).
The value of the pixel x(k,l) is changed in the Gaussian case by additive
white gaussian noise x(k,l)+n, with noise n~N(0,v) being normally
distributed and variance v set by the user (a number between 0 and 2 in
this exercise).
The image is the same in binary noise, except for a set of points where the
image's pixels are set to white. The noisy image seems blurred in the case
of Gaussian noise.
Image with Gaussian noise
1. Median filtering
In median filtering, a pixel is replaced by the median of the pixels in a window around it. That is to say,
y(m, n) = median{ x(i, j) : (i, j) ∈ W(m, n) },
where W(m, n) is a suitable window surrounding the pixel at (m, n). The median filtering algorithm entails sorting the pixel values in the window in ascending or descending order and selecting the middle value. In most cases, a square window with an odd side length is chosen.
2. Spatial averaging
In spatial averaging, each pixel is replaced by an average of its nearby pixels. That is to say,
y(m, n) = (1 / NW) Σ over (i, j) in W of x(i, j),
where NW is the number of pixels in the window W.
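Both formulas translate directly into a simple (deliberately loop-based) MATLAB sketch; border pixels are left unfiltered, and the noisy test image is an assumption:

x = im2double(imread('cameraman.tif'));
x = imnoise(x, 'salt & pepper', 0.05);   % corrupt a known-good image
[M, N] = size(x);
ymed = x; yavg = x;                      % border pixels stay unfiltered
for k = 2:M-1
    for l = 2:N-1
        W = x(k-1:k+1, l-1:l+1);         % 3x3 window around (k, l)
        ymed(k, l) = median(W(:));       % median filtering
        yavg(k, l) = mean(W(:));         % spatial averaging
    end
end
figure, imshow(ymed), figure, imshow(yavg)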
3.3 IMAGE AVERAGING AND SPATIAL FILTERING
SPATIAL FILTERING AND ITS TYPES
The spatial filtering technique is applied to individual pixels in an image. A mask is typically chosen to be odd in size so that it has a distinct centre pixel. This mask is positioned on the image so that the mask's centre traverses all of the image's pixels.
Classification in General:
Smoothing Spatial Filter: A smoothing spatial filter is used to blur and
reduce noise in an image. Blurring is a pre-processing technique for
removing minor details, and it is used to achieve Noise Reduction.
(ii) Maximum filter: The maximum filter is the 100th percentile filter. The largest value in the window replaces the value at the center.
(iii) Median filter: Every pixel in the image is taken into account: the surrounding pixels are first sorted, and the original value of the pixel is replaced by the median of the list.
● GAUSSIAN
● MEAN
● MEAN SHIFT
● MEDIAN
● NON-LOCAL MEANS
● GAUSSIAN
When you apply the Gaussian filter to an image, it blurs it and removes
information and noise. It's comparable to the Mean filter in this regard. It
does, however, use a kernel that represents a Gaussian or bell-shaped
hump. Unlike the Mean filter, which produces an evenly weighted
average, the Gaussian filter produces a weighted average of each pixel's
neighborhood, with the average weighted more towards the center pixels'
value. As a result, the Gaussian filter smoothes the image more gently and
maintains the edges better than a Mean filter of comparable size.
The frequency response of the Gaussian filter is one of the main
justifications for adopting it for smoothing. Lowpass frequency filters are
used by the majority of convolution-based smoothing filters. As a result,
they have the effect of removing high spatial frequency components from
an image. You can be quite certain about what range of spatial frequencies
will be present in the image after filtering by selecting an adequately big
Gaussian, which is not the case with the Mean filter. Computational
biologists are also interested in the Gaussian filter since it has been
associated with some biological plausibility. For example, some cells in
the brain's visual circuits often respond in a Gaussian fashion.
Because many edge-detection filters are susceptible to noise, Gaussian
smoothing is typically utilised before edge detection.
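A minimal Gaussian smoothing sketch using fspecial and imfilter from the Image Processing Toolbox (the kernel size, sigma and test image are arbitrary assumptions):

I = im2double(imread('cameraman.tif'));
h = fspecial('gaussian', 7, 1.5);     % 7x7 Gaussian kernel, sigma = 1.5
J = imfilter(I, h, 'replicate');      % replicate border pixels at the edges
figure, imshow(I), figure, imshow(J)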
MEAN
Mean filtering is a straightforward technique for smoothing and reducing
noise in photographs by removing pixel values that aren't indicative of
their surroundings. Mean filtering is a technique that replaces each pixel
value in an image with the mean or average of its neighbors, including
itself.
The Mean filter, like other convolution filters, is based on a kernel, which
describes the shape and size of the sampled neighborhood for calculating
the mean. The most common kernel size is 3x3, but larger kernels might
be utilized for more severe smoothing. It's worth noting that a small kernel
can be applied multiple times to achieve a similar, but not identical, result
to a single pass with a large kernel.
Although noise is reduced after mean filtering, the image has been
softened or blurred, and high-frequency detail has been lost. This is
mainly caused by the filter's limits, which are as follows:
• A single pixel with a very atypical value can have a considerable impact
on the mean value of all the pixels in its vicinity.
• The filter will interpolate new values for pixels on the edge when the filter neighborhood straddles an edge. If crisp edges are required in the output, this could be a problem.
The Median filter, which is more commonly employed for noise reduction
than the Mean filter, can solve both of these concerns. Smoothing is often
done with other convolution filters that do not calculate the mean of a
neighborhood. The Gaussian filter is one of the most popular.
MEAN SHIFT
Mean shift filtering is based on a data clustering algorithm extensively
used in image processing and can be utilized for edge-preserving
smoothing. The collection of surrounding pixels is determined for each
pixel in an image with a spatial location and a specific grayscale value.
The new spatial center (spatial mean) and the new mean grayscale value are calculated for this set of adjacent pixels, and the calculated means determine the center for the following iteration. The procedure is iterated until the spatial and grayscale means stop changing; at the end of the iteration, the final mean value is assigned to the pixel at the iteration's starting point.
MEDIAN
The Median filter is typically used to minimise image noise, and it can
often preserve image clarity and edges better than the Mean filter. This
filter, like the Mean filter, examines each pixel in the image individually
and compares it to its neighbours to determine whether it is typical of its
surroundings. Instead of merely replacing the pixel value with the mean of
nearby pixel values, the median of those values is used instead. Median
filters are especially good for reducing random intensity spikes that
commonly appear in microscope images.
This filter's operation is depicted in the diagram below. The median is
derived by numerically ordering all of the pixel values in the surrounding
neighborhood, in this case, a 3x3 square, and then replacing the pixel in
question with the middle pixel value.
Median filter
The center pixel value of 150, as seen in the picture, is not typical of the
surrounding pixels and is substituted with the median value of 124. It's
worth noting that larger neighborhoods will result in more severe
smoothing.
The Median filter has two key advantages over the Mean filter since it
calculates the median value of a neighborhood rather than the mean:
● The median is more robust than the mean, so a single very
unrepresentative pixel in a neighborhood will not have a substantial
impact on the median value. This matters, for example, in datasets
contaminated with salt-and-pepper noise (scattered dots).
● Since the median value must be the value of one of the pixels in the
neighborhood, the Median filter does not create unrealistic pixel values
when the filter straddles an edge. For this reason, it is much better at
preserving sharp edges than the Mean filter.
● However, the Median filter is sometimes not as subjectively good at
dealing with large amounts of Gaussian noise as the Mean filter. It is
also relatively complex to compute.
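A minimal sketch of median filtering with OpenCV (cv2.medianBlur
takes the side length of the square neighborhood, here the 3x3 window
used in the example above):

import cv2

img = cv2.imread("noisy.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
median3 = cv2.medianBlur(img, 3)   # each pixel replaced by the median of its 3x3 neighborhood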
NON-LOCAL MEANS
Unlike the Mean filter, which smooths a picture by taking the mean of a
set of pixels surrounding a target pixel, the Non-Local Means filter takes
the mean of all pixels in the image, weighted by their similarity to the
target pixel. When compared to mean filtering, this filter can result in
improved post-filtering clarity with minimum information loss. When
smoothing noisy images, the Non-Local Means or Bilateral filter should
be your first choice in many circumstances.
It's worth noting that non-local means filtering works best when the noise
in the data is white noise, in which case most visual characteristics,
including small and thin ones, will be maintained.
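A minimal sketch using OpenCV's non-local means implementation for
grayscale images (h controls the filter strength; the window sizes below
are common illustrative choices):

import cv2

img = cv2.imread("noisy.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
denoised = cv2.fastNlMeansDenoising(img, None, h=10,
                                    templateWindowSize=7, searchWindowSize=21)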
When compared to their neighbors, the brighter pixels are rendered
brighter (boosted).
Sharpening or blurring an image can be reduced to a series of matrix
arithmetic operations.
When we apply a filter to our image, we're performing a convolution
operation on it with a given kernel. A kernel is a square matrix with n×n
dimensions. The kernel is what determines the type of operation we're
doing, such as sharpening, blurring, edge detection, Gaussian blurring,
and so on.
The following is an example of a sharpening kernel:
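The kernel itself did not survive reproduction here; a commonly used
3x3 sharpening kernel of this kind (an assumption, since the original
figure is missing) is:

 0  -1   0
-1   5  -1
 0  -1   0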
SHARPENING
• Sharpening is a technique for enhancing the transition between features
and details by sharpening and highlighting the edges. Sharpening, on the
other hand, does not consider whether it is enhancing the image's original
features or the noise associated with it. It improves both.
Blurring vs Sharpening
● Blurring: Blurring/smoothing is accomplished in the spatial domain by
averaging the pixels of its neighbors, resulting in a blurring effect. It's
an integration procedure.
The sharpened image is obtained as: original image + k × mask, where
the mask is the difference between the original and its blurred version.
The value k determines how much weight is given to the mask being
added.
1. Unsharp Masking is represented by k = 1.
2. High Boost Filtering is represented by k > 1 since we are boosting high-
frequency components by adding higher weights to the image's mask
(edge features).
This approach, like most other sharpening filters, will not yield adequate
results if the image contains noise.
We may get the mask without subtracting the blurred image from the
original by using a negative Laplacian filter.
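A minimal numerical sketch of unsharp masking / high-boost filtering
under these definitions (assuming cv2 and numpy; sharpened = original
+ k x mask):

import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # hypothetical input
blurred = cv2.GaussianBlur(img, (5, 5), 0)
mask = img - blurred                  # the unsharp mask (edge features)
k = 1.0                               # k = 1: unsharp masking; k > 1: high boost filtering
sharp = np.clip(img + k * mask, 0, 255).astype(np.uint8)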
2) Laplacian Filters
Resultant Laplacian matrix
• kernel -> a 3×3 matrix that we define according to the operation we
want to perform as it slides across the picture during convolution.
• cv2.filter2D -> to convolve a kernel with an image, OpenCV provides
a function called filter2D.
It accepts three parameters as input (see the usage sketch below):
1. img -> the input image
2. ddepth -> the depth of the output image
3. kernel -> the convolution kernel
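Putting the three parameters together, a minimal usage sketch (the
kernel values are the illustrative sharpening kernel shown earlier;
ddepth = -1 keeps the depth of the source image):

import cv2
import numpy as np

img = cv2.imread("input.png")   # hypothetical input image
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=np.float32)
sharpened = cv2.filter2D(img, -1, kernel)   # ddepth = -1: same depth as img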
Original Image
• ImageFilter has a number of pre-defined filters, such as sharpen
and blur, that may be used with the filter() method.
• We sharpen our image twice and save the results in the sharp1
and sharp2 variables.
Image after 1st sharp operation
Sharpening effects can be seen, with the features becoming brighter and
more distinguishable.
3.4 FREQUENCY DOMAIN METHOD
Filtering
Low pass filtering is the process of removing high-frequency components
from an image. The image is blurred as a result of this (and thus a
reduction in sharp transitions associated with noise). All low-frequency
components would be retained while all high-frequency components
would be eliminated in an ideal low pass filter. Ideal filters, on the other
hand, have two flaws: blurring and ringing. These issues stem from the
shape of the corresponding spatial domain filter, which contains a large
number of undulations (ripples). Smoother frequency-domain filter
transitions, such as the Butterworth filter, produce substantially better
results.
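A minimal numpy sketch of lowpass filtering in the frequency domain
that contrasts the ideal cutoff with the smoother Butterworth transfer
function H(u,v) = 1 / [1 + (D/r0)^(2n)] (the standard form of order n is
assumed here):

import numpy as np

def lowpass(img, r0=30.0, butterworth_order=None):
    M, N = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))            # move the zero frequency to the center
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)   # distance from the origin
    if butterworth_order is None:
        H = (D <= r0).astype(float)                  # ideal LPF: 1 inside r0, 0 outside
    else:
        H = 1.0 / (1.0 + (D / r0) ** (2 * butterworth_order))   # Butterworth LPF
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))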
For simplicity, we'll just discuss real and radially symmetric filters.
• An ideal lowpass filter with r0 as the cutoff frequency has the transfer
function H(u, v) = 1 if D(u, v) ≤ r0 and 0 otherwise, where D(u, v) is
the distance from the origin of the frequency plane.
The origin (0, 0) is in the image's center, not its corner (remember the
"fftshift" operation).
• Using electrical components, the sudden shift from 1 to 0 of the transfer
function H (u,v) is impossible to achieve in practice. It can, however, be
simulated on a computer.
Butterworth LPF example: False contouring
The ideal highpass filter is the complement of the ideal lowpass filter:
H(u, v) = 0 for D(u, v) ≤ r0 and 1 otherwise. As before, the origin (0, 0)
is in the image's center, not its corner (remember the "fftshift"
operation), and the sudden transition of the transfer function H(u, v)
from 0 to 1 is impossible to achieve in practice using electrical
components. It can, however, be simulated on a computer.
• Note how the output images have a strong ringing effect, which is a
hallmark of ideal filters. The discontinuity in the filter transfer function
is to blame.
• A two-dimensional Butterworth highpass filter of order n has the
transfer function H(u, v) = 1 / [1 + (r0 / D(u, v))^(2n)].
• Because the frequency response does not have a sharp transition like the
ideal HPF, it is better for image sharpening because it does not introduce
ringing.
Butterworth HPF example
3.5 HOMOMORPHIC FILTERING
An image f(x, y) can be modeled as the product of an illumination
component and a reflectance component:
f(x, y) = i(x, y) r(x, y),
where 0 < i(x, y) < ∞ and 0 < r(x, y) < 1. We cannot easily use the above
product to operate separately on the frequency components of
illumination and reflection because the Fourier transform of the product
of two functions is not separable; that is,
F{f(x, y)} ≠ F{i(x, y)} F{r(x, y)}.
Suppose, however, that we define
z(x, y) = ln f(x, y) = ln i(x, y) + ln r(x, y).
Then
F{z(x, y)} = F{ln i(x, y)} + F{ln r(x, y)}
or
Z(u, v) = Fi(u, v) + Fr(u, v).
If Z(u, v) is processed with a filter transfer function H(u, v), then
S(u, v) = H(u, v) Z(u, v) = H(u, v) Fi(u, v) + H(u, v) Fr(u, v),
where S(u, v) is the result's Fourier transform. In the spatial domain,
s(x, y) = F⁻¹{S(u, v)} = F⁻¹{H(u, v) Fi(u, v)} + F⁻¹{H(u, v) Fr(u, v)}.
By letting
i'(x, y) = F⁻¹{H(u, v) Fi(u, v)}
and
r'(x, y) = F⁻¹{H(u, v) Fr(u, v)},
we get
s(x, y) = i'(x, y) + r'(x, y).
Finally, because z was calculated by taking the logarithm of the original
image f, the inverse operation (exponentiation) produces the desired
enhanced image:
g(x, y) = exp(s(x, y)).
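A minimal numpy sketch of this homomorphic pipeline, using a
highpass-emphasis H(u, v) built from a Butterworth shape; the gains
gamma_l and gamma_h and the cutoff are illustrative assumptions, not
values from the text:

import numpy as np

def homomorphic(img, r0=30.0, gamma_l=0.5, gamma_h=2.0, n=1):
    img = img.astype(np.float64) + 1.0                 # offset to avoid log(0)
    Z = np.fft.fftshift(np.fft.fft2(np.log(img)))      # z = ln f = ln i + ln r
    M, N = img.shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    # attenuate low frequencies (illumination), boost high frequencies (reflectance)
    H = gamma_l + (gamma_h - gamma_l) * (1.0 - 1.0 / (1.0 + (D / r0) ** (2 * n)))
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z))) # s = i' + r'
    return np.exp(s) - 1.0                             # g = exp(s), undoing the offset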
3.6 LIST OF REFERENCES
1. https://www.geeksforgeeks.org/spatial-filtering-and-its-types/
2. http://www.seas.ucla.edu/dsplab/ie/over.html
3. https://www.theobjects.com/dragonfly/dfhelp/3-5/Content/05_Image%20Processing/Smoothing%20Filters.htm
4. http://saravananthirumuruganathan.wordpress.com/2010/04/01/introduction-tomean-shift-algorithm/
5. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node4.html
3.8 UNIT END EXERCISES
Q1. Why does the averaging filter cause the image to blur?
Q2. How does applying an average filter to a digital image affect it?
Q3. What does it mean to sharpen spatial filters?
Q4. What is the primary purpose of image sharpening?
Q5. What is the best way to sharpen an image?
Q6. How do you figure out what a low-pass filter's cutoff frequency is?
Q7. What is the purpose of a low-pass filter?
Q8. What is the effect of high pass filtering on an image?
Q9. In homomorphic filtering, which filter is used?
Q10. In homomorphic filtering, which high-pass filter is used?
Module III
4
DISCRETE FOURIER TRANSFORM-I
Unit Structure
4.1 Objectives
4.2 Introduction
4.3 Properties of DFT
4.4 FFT algorithms – direct, divide and conquer approach
4.4.1 Direct Computation of the DFT
4.4.2 Divide-and-Conquer Approach to Computation of the DFT
4.5 2D Discrete Fourier Transform (DFT) and Fast Fourier Transform
(FFT)
4.5.1 2D Discrete Fourier Transform (DFT)
4.5.2 Computational speed of FFT
4.5.3 Practical considerations
4.6 Summary
4.7 References
4.8 Unit End Exercises
4.1 OBJECTIVES
After going through this unit, you will be able to:
● Understand the fundamental concepts of digital image processing
● Discuss mathematical transforms
● Describe the DCT and DFT techniques
● Classify different types of image transforms
● Examine the use of Fourier transforms for image processing in the
frequency domain
4.2 INTRODUCTION
The Fourier transform of a discrete sequence is a continuous function of
the frequency ω. To limit the infinite number of frequency values to a
finite number, the frequency axis is sampled at ω = 2πk/N and the
equation is modified as follows.
The Discrete Fourier Transform (DFT) of a finite duration sequence x(n)
is defined as
X(k) = Σ x(n) e^(−j2πnk/N), the sum running over n = 0 to N − 1,
where k = 0, 1, ..., N − 1
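A direct numpy transcription of this definition, checked against the
library FFT (a sketch for illustration, not an efficient implementation):

import numpy as np

def dft(x):
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(N, 1)
    W = np.exp(-2j * np.pi * k * n / N)   # W[k, n] = e^(-j 2 pi k n / N)
    return W @ x                          # X(k) = sum over n of x(n) e^(-j 2 pi n k / N)

x = np.random.rand(8)
print(np.allclose(dft(x), np.fft.fft(x)))   # True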
The DFT thus maps a discrete signal onto a basis of complex sinusoids.
4.3 PROPERTIES OF DFT
i) Linearity:
The DFT of a x1(n) + b x2(n) is a X1(k) + b X2(k), where a and b are
constants.
ii) Periodicity:
If a sequence x(n) is periodic with period N, then its N-point DFT X(k)
is also periodic with period N.
iii) Circular Time Shift:
It states that if a discrete-time signal is circularly shifted in time by m
units, its DFT is multiplied by a phase factor; that is,
DFT{x((n − m))N} = X(k) e^(−j2πkm/N).
The DFT can also be written as X(k) = Σ x(n) W_N^(kn), summed over
n = 0 to N − 1, where W_N = e^(−j2π/N) is the phase (twiddle) factor.
Similarly, the IDFT is given as
x(n) = (1/N) Σ X(k) W_N^(−kn), summed over k = 0 to N − 1.
Property of symmetry: W_N^(k + N/2) = −W_N^k
Property of periodicity: W_N^(k + N) = W_N^k
These two essential features of the phase factor are used by the
computationally efficient algorithms presented in this section, commonly
known as fast Fourier transform (FFT) algorithms.
Fig. 1 Two-dimensional data array for storing the sequence x(n),
0 ≤ n ≤ N − 1
The sequence x(n) can be stored in this array in a variety of ways, each
of which depends on the mapping of the index n to the pair of indexes
(l, m).
For example, suppose that we select the mapping
n = Ml + m
This leads to an arrangement in which the first row consists of the first M
elements of x(n), the second row consists of the next M elements of x(n),
and so on, as illustrated in Fig. 2(a). On the other hand, the mapping
n = l + mL
stores the first L elements of x(n) in the first column, the next L elements
in the second column, and so on, as illustrated in Fig.2(b).
Substituting these index mappings into the DFT expression decomposes
the N-point DFT into smaller M-point and L-point DFTs, leading to the
following procedure (a code sketch is given after the steps).
Algorithm 1
1. Store the signal column-wise.
2. Compute the M-point DFT of each row.
3. Multiply the resulting array by the phase factors W_N^(lq).
4. Compute the L-point DFT of each column.
5. Read the resulting array row-wise.
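A minimal numpy sketch of Algorithm 1 for N = LM, with the index
maps n = l + mL and k = Mp + q; the row and column DFTs are
delegated to the library FFT for brevity:

import numpy as np

def dft_divide_and_conquer(x, L, M):
    N = L * M
    A = x.reshape(M, L).T                      # step 1: store column-wise, A[l, m] = x(l + mL)
    A = np.fft.fft(A, axis=1)                  # step 2: M-point DFT of each row
    l = np.arange(L).reshape(L, 1)
    q = np.arange(M).reshape(1, M)
    A = A * np.exp(-2j * np.pi * l * q / N)    # step 3: phase factors W_N^(lq)
    A = np.fft.fft(A, axis=0)                  # step 4: L-point DFT of each column
    return A.reshape(N)                        # step 5: read row-wise, X(Mp + q) = A[p, q]

x = np.random.rand(15)
print(np.allclose(dft_divide_and_conquer(x, 3, 5), np.fft.fft(x)))   # True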
An additional algorithm with a similar computational structure can be
obtained if the input signal is stored row-wise and the resulting
transformation is read column-wise. In this case we select
n = Ml + m
k = qL + p
This choice of indices leads to the formula for the DFT in the form,
Algorithm 2
1. Store the signal row-wise.
2. Compute the L-point DFT of each column.
3. Multiply the resulting array by the phase factors W_N^(pm).
4. Compute the M-point DFT of each row.
5. Read the resulting array column-wise.
4.5 2D DISCRETE FOURIER TRANSFORM (DFT) AND FAST
FOURIER TRANSFORM (FFT)
The 2D DFT of an M × N image f(m, n) is
F(k, l) = Σ Σ f(m, n) e^(−j2π(km/M + ln/N)),
where |F(k, l)| = [R²{F(k, l)} + I²{F(k, l)}]^(1/2) is called the magnitude
spectrum of the Fourier transform and
φ(k, l) = tan⁻¹[ I{F(k, l)} / R{F(k, l)} ]
is the phase angle or phase spectrum. Here, R{F(k, l)} and I{F(k, l)} are
the real and imaginary parts of F(k, l), respectively.
The Fast Fourier Transform (FFT) is the most computationally efficient
way of computing the DFT.
The FFT of an image can be represented in one of two ways: (a)
conventional representation or (b) optical representation.
High frequencies are collected at the centre of the image in the standard
form, whereas low frequencies are distributed at the edges, as seen in Fig.
1. The null frequency can be seen in the upper-left corner of the graph.
The frequency range is [0, N] X [0, M], where M is the image's horizontal
resolution and N is the image's vertical resolution.
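A minimal numpy sketch relating the two representations and
computing the magnitude and phase spectra (fftshift moves the null
frequency from the upper-left corner to the center):

import numpy as np

img = np.random.rand(240, 320)       # stand-in for an N x M image
F = np.fft.fft2(img)                 # standard representation: F[0, 0] is the null frequency
F_optical = np.fft.fftshift(F)       # optical representation: null frequency at the center
magnitude = np.abs(F_optical)        # |F| = sqrt(R^2 + I^2)
phase = np.angle(F_optical)          # phase spectrum
log_spectrum = np.log1p(magnitude)   # log scaling, commonly used for display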
4.6 SUMMARY
Applying the DFT to finite images of M×N pixels introduces artifacts
such as frequency smoothing and frequency leakage. The DFT operates
on discretely sampled images (pixels), which can suffer from aliasing,
and it assumes periodic boundary conditions, which affects centering,
edge effects, and convolution. Because images have borders and are
truncated (finite), frequency smoothing and leakage result. The FFT
overcomes the computational drawbacks of the direct DFT.
4.7 REFERENCES
1] S. Jayaraman, Digital Image Processing, TMH (McGraw Hill)
publication, ISBN-13: 978-0-07-0144798
2] John G. Proakis, Digital Signal Processing: Principles, Algorithms,
And Applications, 4/E
3] Gonzalez, Woods & Steven, Digital Image Processing using MATLAB,
Pearson Education, ISBN-13:978-0130085191
4] https://www.robots.ox.ac.uk/~sjrob/Teaching/SP/l7.pdf
4.8 UNIT END EXERCISES
4. Which of the following is true regarding the number of computations
required to compute DFT at any one value of ‘k’?
a) 4N-2 real multiplications and 4N real additions
b) 4N real multiplications and 4N-4 real additions
c) 4N-2 real multiplications and 4N+2 real additions
d) 4N real multiplications and 4N-2 real additions
Answer : d
5. Divide-and-conquer approach is based on the decomposition of an N-
point DFT into successively smaller DFTs. This basic approach leads to
FFT algorithms.
a) True
b) False
Answer : a
6. How many complex multiplications are performed in computing the N-
point DFT of a sequence using divide-and-conquer method if N=LM?
a) N(L+M+2)
b) N(L+M-2)
c) N(L+M-1)
d) N(L+M+1)
Answer : d
7. Define discrete Fourier transform and its inverse.
8. State and prove the translation property.
9. Give the drawbacks of DFT.
10. Give the property of symmetry and Periodicity of Direct DFT.
5
DISCRETE FOURIER TRANSFORM-II
Unit Structure
5.1 Objectives
5.2 Introduction
5.2.1 Image Transforms
5.2.2 Unitary Transform
5.3 Properties of 2-D DFT
5.4 Classification of Image transforms
5.4.1 Walsh Transform
5.4.2 Hadamard Transform
5.4.3 Discrete cosine transform
5.4.4 Discrete Wavelet Transform
5.4.4.1 Haar Transform
5.4.4.2 KL Transform
5.5 Summary
5.6 References
5.7 Unit End Exercises
5.1 OBJECTIVES
After going through this unit, you will be able to:
● Understand the fundamental concepts of digital image processing
● Discuss mathematical transforms
● Describe the DCT and DFT techniques
● Classify different types of image transforms
● Examine the use of Fourier transforms for image processing in the
frequency domain
5.2 INTRODUCTION
Example: Compute the 4-point DFT X(k) of a given sequence x(n),
where k = 0, 1, ..., 3
1. Finding X(0)
2. Finding X(1)
3. Finding X(2)
106
For the 4-point DFT, the conjugate transpose of the unitary transform
matrix A is
(A*)ᵀ = Aᴴ =
1   1   1   1
1   j  -1  -j
1  -1   1  -1
1  -j  -1   j
Multiplying A by Aᴴ gives the identity matrix, which shows that the
Fourier transform satisfies the unitary condition.
Sequency - It refers to the number of sign changes. The sequency for a
DFT matrix of order 4 is given below.
5.3 PROPERTIES OF 2-D DFT [1]:
The properties of 2D DFT are shown in table 1.
Example) Find the 1D Walsh basis for the fourth-order system (N = 4).
The value of N is given as four. From N, the value of m is calculated as
follows. With N = 4,
m = log2 N = log2 4 = log2 2² = 2 log2 2 = 2.
In this case N = 4, so n and k take the values 0, 1, 2 and 3, and i varies
from 0 to m − 1. From the above computation m = 2, so i takes the
values 0 and 1. The construction of the Walsh basis for N = 4 is given in
Table 1.
When k or n is equal to zero, the basis value will be 1/N.
Sequency: The Walsh functions may be ordered by the number of zero
crossings or sequency, and the coefficients of the representation may be
called sequency components. The sequency of the Walsh basis function
for N = 4 is shown in Table 2.
Likewise, all the values of the Walsh transform can be calculated. After
the calculation of all values, the basis for N = 4 is given below [1].
Note: When looking at the Walsh basis, every entity has the same
magnitude (1/N), with the only difference being the sign (whether it is
positive or negative). As a result, the following is a shortcut approach for
locating the sign:
Step 1 Write the binary representation of n.
Step 2 Write the binary representation of k in the reverse order.
Step 3 Check for the number of overlaps of 1 between n and k.
Step 4 If the number of overlaps of 1 is
i) zero then the sign is positive
ii) even then the sign is positive
iii) odd then the sign is negative
The Hadamard matrix of order 2N can be generated by the Kronecker
product operation:
H_2N = [ H_N   H_N
         H_N  -H_N ]
For example,
H_4 = [ 1   1   1   1
        1  -1   1  -1
        1   1  -1  -1
        1  -1  -1   1 ]
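A minimal numpy sketch of this Kronecker-product construction:

import numpy as np

H = np.array([[1]])
for _ in range(2):                    # 1 -> H2 -> H4; add iterations for larger orders
    H = np.block([[H, H], [H, -H]])
print(H)                              # the 4x4 Hadamard matrix shown above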
The process of reconstructing a set of spatial domain samples from the
DCT coefficients is called the inverse discrete cosine transform (IDCT).
The 1-D inverse discrete cosine transformation is given by
f(x) = Σ α(u) C(u) cos[(2x + 1)uπ / 2N], summed over u = 0 to N − 1,
where α(0) = √(1/N) and α(u) = √(2/N) for u ≠ 0.
5.4.4.2 KL Transform (Karhunen–Loeve Transform):
Harold Hotelling was the first to study the discrete formulation of the KL
transform, which is why it is also known as the Hotelling transform. The
KL transform is a reversible linear transform that takes advantage of a
vector representation's statistical features.
The orthogonal eigenvectors of a data set's covariance matrix are the basic
functions of the KL transform. The input data is optimally decorrelated
using a KL transform. The majority of the 'energy' of the transform
coefficients is concentrated in the first few components after a KL
transform; this is the energy compaction property of the KL transform.
Drawbacks of KL transform :
i. A KL transform is input-dependent, and the fundamental function for
each signal model on which it acts must be determined. There is no unique
mathematical structure in the KL bases that allows for quick
implementation.
ii. The KL transform necessitates multiply/add operations of the order of
O(m²), whereas fast algorithms for the DFT and DCT require only about
O(m log2 m) operations.
In the formula for the covariance matrix, x̄ denotes the mean of the input
vectors, computed as x̄ = (1/M) Σ x_k over the M input vectors. The
covariance matrix is then
C = E[x xT] − x̄ x̄T,
where E[x xT] is estimated as (1/M) Σ x_k x_kT.
Step 3 Determination of the eigenvalues of the covariance matrix
To find the eigenvalues λ, we solve the characteristic equation
|C − λI| = 0, which here gives λ² − λ − 4 = 0.
From this equation we find the eigenvalues λ0 and λ1 by solving the
quadratic.
From the normalized eigenvectors, we then form the transformation
matrix.
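A minimal numpy sketch of the whole procedure (mean, covariance,
eigenvectors, transformation matrix) for sample vectors stored as the
columns of X; the data values are illustrative only:

import numpy as np

X = np.array([[0.0, 1.0, 1.0, 2.0],
              [0.0, 0.0, 1.0, 1.0]])       # illustrative 2-D sample vectors as columns
m = X.mean(axis=1, keepdims=True)          # mean vector
C = (X - m) @ (X - m).T / X.shape[1]       # covariance matrix E[xxT] - m mT
eigvals, eigvecs = np.linalg.eigh(C)       # eigenvalues ascending, eigenvectors normalized
A = eigvecs[:, ::-1].T                     # rows = eigenvectors, largest eigenvalue first
Y = A @ (X - m)                            # the KL (Hotelling) transform of the data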
5.5 SUMMARY
5.6 REFERENCES
1] S. Jayaraman, Digital Image Processing, TMH (McGraw Hill)
publication, ISBN-13: 978-0-07-0144798
2] John G. Proakis, Digital Signal Processing: Principles, Algorithms,
And Applications, 4/E
3] Gonzalez, Woods & Steven, Digital Image Processing using MATLAB,
Pearson Education, ISBN-13:978-0130085191
4] https://www.robots.ox.ac.uk/~sjrob/Teaching/SP/l7.pdf
5.7 UNIT END EXERCISES
9. The Walsh and Hadamard transforms are ___________ in nature
(a) sinusoidal
(b) cosine
(c) non-sinusoidal
(d) cosine and sine
Answer : c
10. Upsampling is a process of ____________ the spatial resolution of
the image
(a) decreasing
(b) increasing
(c) averaging
(d) doubling
Answer : b
Module IV
Image Restoration and Image Segmentation:
6
IMAGE DEGRADATION
Unit Structure
6.0 Image degradation
6.1 Classification of Image restoration Techniques
6.2 Image restoration model
6.3 Image blur
6.4 Noise model
6.4.1 Exponential
6.4.2 Uniform
6.4.3 Salt and Pepper
6.3 IMAGE BLUR
6.4.1 Exponential
Exponential noise is a special case of gamma (Erlang) noise in which the
parameter b is equal to 1.
6.4.2 Uniform
Uniform noise caused by quantizing the pixels of an image to a number
of distinct levels is known as quantization noise; the gray values of the
noise are uniformly distributed across a specified range. Uniform noise
can be used to generate almost any other type of noise distribution.
For uniform noise over a range [a, b], the mean and variance are
μ = (a + b)/2 and σ² = (b − a)²/12.
7
IMAGE RESTORATION TECHNIQUES
Unit Structure
7.1 Image restoration techniques
7.1.1 Inverse filtering
7.1.2 Average filtering
7.1.3 Median filtering
7.2 The detection of discontinuities
7.2.1 Point detection
7.2.2 Line detection
7.2.3 Edge detections
7.3 Various methods used for edge detection
7.3.1 Prewitt Filter or Prewitt Operator
7.3.2 Sobel Filter or Sobel Operator
7.3.3 Frei-Chen Filter, Hough Transform
7.4 Thresholding Region based segmentation Chain codes
7.4.1 Region-based segmentation
7.4.2 Region-based segmentation Chain codes
7.5 Polygon approximation
7.5.1 Shape numbers
7.6 References
7.7 Moocs
7.8 Video links
7.9 Quiz
7.1 IMAGE RESTORATION TECHNIQUES
7.1.1 Inverse filtering
It is the process of recovering the input of a system from its output, and
it is the simplest approach to restoring the original image when the
degradation function is known. In direct inverse filtering we compute an
estimate, F̂(u,v), of the transform of the original image by dividing the
transform of the degraded image, G(u,v), by the degradation transfer
function:
F̂(u,v) = G(u,v) / H(u,v)
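A minimal numpy sketch of direct inverse filtering (the small constant
eps guarding against division by near-zero values of H is a practical
safeguard, not part of the formula above):

import numpy as np

def inverse_filter(g, H, eps=1e-3):
    # g: degraded image; H: degradation transfer function sampled on the same grid
    G = np.fft.fft2(g)
    F_hat = G / np.where(np.abs(H) < eps, eps, H)   # estimate = G(u,v) / H(u,v)
    return np.real(np.fft.ifft2(F_hat))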
where Sxy is a subimage (neighborhood window) centered on point (x, y).
7.2 THE DETECTION OF DISCONTINUITIES
The partitioning or sub-division of an image is based on abrupt changes
in its intensity levels and is used for detecting three basic types of gray-
level discontinuities in a digital image: points, lines and edges. To
identify these, a 3×3 mask operation is used.
The response of the mask at any point in the image is given by:
R = w1 z1 + w2 z2 + ... + w9 z9 = Σ wi zi (i = 1 to 9)
where zi is the gray level of the pixel associated with mask coefficient
wi. An isolated point is detected at the corresponding location (x, y) if
|R| ≥ T,
where R is the response of the mask at that point and T is a non-negative
threshold value.
The result of the point detection mask is shown in Fig 4.
Suppose that the four masks are run individually through an image. If, at
a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be
more likely associated with a line in the direction of mask i.
7.3 VARIOUS METHODS USED FOR EDGE DETECTION
Detection of edges
Most of the shape information of an image is enclosed in its edges. We
first detect these edges using edge filters; then, by enhancing those areas
of the image that contain edges, the sharpness of the image increases and
the image becomes clearer. Commonly used operators include:
Prewitt Operator
Sobel Operator
Robinson Compass Masks
Krisch Compass Masks
Laplacian Operator.
All the filters mentioned above are Linear filters.
7.3.2 Sobel Filter or Sobel Operator
The Sobel filter looks similar to the Prewitt operator; it is a derivative
mask used for edge detection. The Sobel operator is also used to detect
two kinds of edges in an image:
Horizontal direction.
Vertical direction.
The major difference is that in the Sobel operator the coefficients of the
masks are not fixed; they can be adjusted to our requirements as long as
they do not violate any property of derivative masks.
This mask works exactly like the Prewitt vertical mask; the only
difference is that it has "2" and "-2" values in the center of the first and
third columns. Applied to an image, this mask will highlight the vertical
edges.
Sample Image
Following is a sample picture on which we will apply the above two
masks one at a time.
Comparison
As you can see, in the first image to which the vertical mask is applied, all
vertical edges are easier to see than the original image. Similarly, in the
second image, all horizontal edges are shown as a result of applying the
horizontal mask.
In this way, you can see that both horizontal and vertical edges of the
image can be detected. Also, if you compare the result of the Sobel
operator with the Prewitt operator, you can see that the Sobel operator
finds more edges and makes the edges easier to see than the Prewitt
operator.
This is because the Sobel operator assigns more weight to the pixels
near the edges.
Applying more weight to the mask
The more weight we apply to the mask, the more edges it will detect for us.
-1 0 1
-5 0 5
-1 0 1
Comparing the result of this mask with that of the Prewitt vertical mask,
it is apparent that this mask will bring out more edges, simply because
we have allotted more weight in the mask.
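A minimal OpenCV sketch applying the Sobel operator in both
directions (cv2.Sobel builds the derivative masks internally; ksize=3
gives the standard 3x3 masks):

import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)        # vertical edges (horizontal gradient)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)        # horizontal edges (vertical gradient)
magnitude = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))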
7.3.3 Frei-Chen Filter, Hough Transform
The Frei-Chen edge detector is a first-order operator like the Prewitt and
Sobel operators. Frei-Chen masks are unique masks containing all of the
basis vectors. This means that a 3×3 image area is represented as the
weighted sum of the nine Frei-Chen masks that can be seen below:
7.5.1 Shape numbers
As shown in the figure below, the shape number of a Freeman chain-
coded boundary based on the 4-directional code is defined as the first
difference of smallest magnitude. The order n of a shape number is
defined as the number of digits in its representation. Moreover, for
closed boundaries n is even, and its value limits the number of different
shapes that are possible. The first difference of a 4-directional chain
code is independent of rotation (in 90° increments), but the coded
boundary in general depends on the orientation of the grid.
Depending on how the grid spacing is selected, the resulting shape
number is usually of order n, but boundaries with indentations
comparable to this spacing may produce shape numbers of order greater
than n. In this case, we specify a rectangle of order less than n and repeat
the process until the resulting shape number is of order n. Because we
use 4-connectivity and the boundary is closed, the order of a shape
number starts at 4 and is always even.
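A minimal Python sketch of these definitions for a 4-directional
Freeman chain code (the example boundary is hypothetical):

def first_difference(chain):
    # direction changes between consecutive codes; a closed boundary wraps around
    return [(chain[i] - chain[i - 1]) % 4 for i in range(len(chain))]

def shape_number(chain):
    # the circular rotation of the first difference with the minimum magnitude
    d = first_difference(chain)
    return min(d[i:] + d[:i] for i in range(len(d)))

chain = [0, 0, 1, 1, 2, 2, 3, 3]     # hypothetical 4-directional code of a small square
print(shape_number(chain))           # order n = 8: even, as required for a closed boundary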
7.6 REFERENCES
1. Pratt WK. Introduction to digital image processing. CRC press; 2013
Sep 13.
2. Niblack W. An introduction to digital image processing. Strandberg
Publishing Company; 1985 Oct 1.
3. Burger W, Burge MJ. Principles of digital image processing.
London: Springer; 2009.
4. Jain AK. Fundamentals of digital image processing. Prentice-Hall,
Inc.; 1989 Jan 1.
5. Dougherty ER. Digital image processing methods. CRC Press; 2020
Aug 26.
6. Gonzalez RC. Digital image processing. Pearson education india;
2009.
7. Marchand-Maillet S, Sharaiha YM. Binary digital image processing:
a discrete approach. Elsevier; 1999 Dec 1.
8. Andrews HC, Hunt BR. Digital image restoration.
9. Lagendijk RL, Biemond J. Basic methods for image restoration and
identification. InThe essential guide to image processing 2009 Jan 1
(pp. 323-348). Academic Press.
10. Banham MR, Katsaggelos AK. Digital image restoration. IEEE
signal processing magazine. 1997 Mar;14(2):24-41.
11. Hunt BR. Bayesian Methods in Nonlinear Digital Image Restoration.
IEEE Transactions on Computers. 1977 Mar 1;26(3):219-29.
12. Figueiredo MA, Nowak RD. An EM algorithm for wavelet-based
image restoration. IEEE Transactions on Image Processing. 2003
Aug 4;12(8):906-16.
13. Digital Image Processing – Tutorialspoint.
https://www.tutorialspoint.com/dip/index.htm.
14. Types of Restoration Filters. https://www.geeksforgeeks.org/types-
of-restoration-filters/.
7.7 MOOCS
1. Digital Image Processing.
https://onlinecourses.nptel.ac.in/noc19_ee55/preview.
2. Digital Image Processing.
https://www.mygreatlearning.com/academy/learn-for-
free/courses/digital-image-processing.
3. Fundamentals of Digital Image Processing.
https://alison.com/course/fundamentals-of-digital-image-processing
12. Image Restoration Techniques – I.
https://www.youtube.com/watch?v=MrNafUqh860.
13. Image degradation and restoration | Digital Image Processing.
https://www.youtube.com/watch?v=ScBBAHHxepY.
14. Degradation function.
https://www.youtube.com/watch?v=dIC53nDnwgk.
7.9 QUIZ
5. What are the categories of digital image processing?
a) Image Enhancement
b) Image Classification and Analysis
c) Image Transformation
d) All of the mentioned
ANSWER: D
6. How does picture formation in the eye vary from image formation in a
camera?
a) Fixed focal length
b) Varying distance between lens and imaging plane
c) No difference
d) Variable focal length
ANSWER: D
7. What are the names of the various colour image processing categories?
a) Pseudo-color and Multi-color processing
b) Half-color and pseudo-color processing
c) Full-color and pseudo-color processing
d) Half-color and full-color processing
ANSWER: C
10. The aliasing effect on an image can be reduced using which of the
following methods?
a) By reducing the high-frequency components of image by clarifying the
image
b) By increasing the high-frequency components of image by clarifying
the image
c) By increasing the high-frequency components of image by blurring
the image
d) By reducing the high-frequency components of image by blurring the
image
ANSWER: D
11. Which of the following is the first and foremost step in Image
Processing?
a) Image acquisition
b) Segmentation
c) Image enhancement
d) Image restoration
ANSWER: A
13. Which of the following is the next step in image processing after
compression?
a) Representation and description
b) Morphological processing
c) Segmentation
d) Wavelets
ANSWER: B
18. The digitization process, in which the digital image comprises M rows
and N columns, necessitates choices for M, N, and the number of grey
levels per pixel, L. M and N must have which of the following values?
a) M have to be positive and N have to be negative integer
b) M have to be negative and N have to be positive integer
c) M and N have to be negative integer
d) M and N have to be positive integer
ANSWER: D
22. Points whose locations are known exactly in the input and reference
images are used in Geometric Spacial Transformation.
a) Known points
b) Key-points
c) Réseau points
d) Tie points
ANSWER: D
26. Region of Interest (ROI) operations is generally known as _______
a) Masking
b) Dilation
c) Shading correction
d) None of the Mentioned
ANSWER: A
29. Which of the following illustrates three main types of image enhancing
functions?
a) Linear, logarithmic and power law
b) Linear, logarithmic and inverse law
c) Linear, exponential and inverse law
d) Power law, logarithmic and inverse law
ANSWER: D
31. Which of the following operations is done on the pixels in
sharpening the image, in the spatial domain?
a) Differentiation
b) Median
c) Integration
d) Average
ANSWER: A
36. Which of the following makes an image difficult to enhance?
a) Dynamic range of intensity levels
b) High noise
c) Narrow range of intensity levels
d) All of the mentioned
ANSWER: D
37. _________ is the process of moving a filter mask over the image and
computing the sum of products at each location.
a) Nonlinear spatial filtering
b) Convolution
c) Correlation
d) Linear spatial filtering
ANSWER: C
43. The response for linear spatial filtering is given by the relationship
__________
a) Difference of filter coefficient’s product and corresponding image pixel
under filter mask
b) Product of filter coefficient’s product and corresponding image pixel
under filter mask
c) Sum of filter coefficient’s product and corresponding image pixel under
filter mask
d) None of the mentioned
ANSWER: C
47. Gamma Correction is defined as __________
a) Light brightness variation
b) A Power-law response phenomenon
c) Inverted Intensity curve
d) None of the Mentioned
ANSWER: B
Module V
8
IMAGE DATA COMPRESSION AND
MORPHOLOGICAL OPERATION
Unit Structure
8.1 Need for compression
8.2 Redundancy in image
8.3 Classification of Image compression schemes
8.4 Huffman coding
8.5 Arithmetic coding
8.6 Dictionary based compression
8.7 Lempel-Ziv-Welch (LZW) algorithm
8.8 Transform based compression
8.2 REDUNDANCY IN IMAGE
Redundancy refers to storing additional information to represent a set of
information. Computers store images as pixel values, and these pixel
values may be duplicated; even if some pixel values are deleted, the
information in the actual image may not be affected. There are three
types of image redundancy:
a) Coding redundancy: -
Symbols such as letters, numbers, bits, and so on are used to represent a
set of data or events, and a collection of these symbols is known as a
code. Each code word's length is determined by the number of symbols it
contains. In most 2-D intensity arrays, the 8-bit codes used to represent the
intensities contain more bits than are required to express the intensities.
b) Spatial and temporal redundancy: -
Because most 2-D intensity array pixels are spatially interconnected (i.e.,
each pixel is similar to or dependent on surrounding pixels), information is
duplicated in the representations of the correlated pixels unnecessarily.
Temporally interconnected pixels (those that are similar to or dependent
on pixels in surrounding frames) in a video series also duplicate
information.
c) Irrelevant information: -
The human visual system ignores much of the data contained in 2-D
intensity arrays. Data that goes unused in this way is considered
redundant.
155
Image Compression
● Lossy Compression
  - Transform-based:
    DCT-based (Fast DCT, Integer DCT, Binary DCT, Signed DCT,
    Zonal DCT, Fast zonal DCT)
    DWT-based (SPHIT, EZW, EBCOT, SHPS, Strip-based,
    Two-line-based, Single-line-based)
    Fractals
  - Non-transform-based: Vector Quantization
● Lossless Compression
  - Decorrelation
  - Entropy Coding (RLC, LZW, Huffman Coding, Golomb Code,
    Golomb-Rice Code, MQ Coder)
Steps and example: -
Figure 1 illustrates the basic arithmetic coding process. A five-symbol
sequence or message, a1a2a3a3a4, is coded here from a four-symbol
source. At the start of the coding procedure, the message is assumed to
occupy the entire half-open interval [0, 1). As shown in the table below,
this interval is initially partitioned into four subintervals based on the
probabilities of the source symbols; for example, symbol a1 is associated
with the subinterval [0, 0.2). Because a1 is the first symbol of the
message being coded, the message interval is initially narrowed to
[0, 0.2).
The range [0, 0.2) is then expanded to the full height of the figure, with
the values of the narrowed range labeling its end points. The narrower
range is subdivided according to the probabilities of the source symbols,
and the process is repeated for the next message symbol. Symbols a2
and a3 narrow the subinterval to [0.04, 0.08), then [0.056, 0.072), and so
on. The last message symbol, reserved as a special end-of-message
indicator, narrows the range to [0.06752, 0.0688). In fact, the message
can be represented by any number within this subinterval, such as 0.068.
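A minimal Python sketch reproducing this interval narrowing (the
source probabilities behind the subintervals, a1 = 0.2, a2 = 0.2, a3 = 0.4
and a4 = 0.2, are inferred from the numbers quoted in the example):

# half-open subinterval assigned to each source symbol
intervals = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4), 'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}

low, high = 0.0, 1.0
for symbol in ['a1', 'a2', 'a3', 'a3', 'a4']:   # a4 acts as the end-of-message indicator
    s_low, s_high = intervals[symbol]
    span = high - low
    low, high = low + span * s_low, low + span * s_high

print(low, high)   # about [0.06752, 0.0688); any number inside, e.g. 0.068, codes the message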
For Example:
The grey values 0, 1, 2,..., and 255 are assigned to the first 256 words in
the dictionary for 8-bit monochrome images. As the encoder sequentially
examines the image's pixels, gray-level sequences that are not in the
dictionary are placed in algorithmically determined (e.g., the next
unused) locations. If the first two pixels of the image are white, for
instance, the sequence 255-255 might be assigned to location 256, the
address
following the locations reserved for gray levels 0 through 255. The next
time that two consecutive white pixels are encountered, code word 256,
the address of the location containing sequence 255-255, is used to
represent them. If a 9-bit, 512-word dictionary is employed in the coding
process, the original (8 + 8) bits that were used to represent the two pixels
are replaced by a single 9-bit code word.
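A minimal Python sketch of LZW encoding under this scheme (a
512-entry dictionary limit and an initial single-byte alphabet; the byte
string stands in for the pixel sequence):

def lzw_encode(data, max_entries=512):
    dictionary = {bytes([i]): i for i in range(256)}    # entries 0..255: the gray levels
    current, out = b"", []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate                         # keep growing the match
        else:
            out.append(dictionary[current])             # emit code of the longest match
            if len(dictionary) < max_entries:           # e.g. sequence 255-255 -> code 256
                dictionary[candidate] = len(dictionary)
            current = bytes([byte])
    if current:
        out.append(dictionary[current])
    return out

print(lzw_encode(bytes([255, 255, 255, 255, 0, 0])))    # [255, 256, 255, 0, 0]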
Consider the following 4 x 4, 8-bit image of a vertical edge: -
9
IMAGE COMPRESSION STANDARDS
Unit Structure
9.1 JPEG (Joint Photograph Expert Group)
9.2 MPEG (Moving Picture Expert Group)
9.3 Vector Quantization
9.4 Wavelet based image compression
9.5 Morphological Operation
9.6 References
9.7 Moocs
9.8 Video links
9.9 Quiz
The range of the pixel intensities is 0 to 255. To change the range to
−128 to 127, we subtract 128 from each pixel value, which gives the
following results.
The result of the transform is stored in, say, a matrix A(j, k). This matrix
is given below:
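A minimal scipy sketch of the level shift and the 2-D DCT of a single
8x8 block, as in the JPEG pipeline (the block contents are random
placeholders):

import numpy as np
from scipy.fftpack import dct

block = np.random.randint(0, 256, (8, 8)).astype(np.float64)   # placeholder 8x8 pixel block
shifted = block - 128.0                                        # level shift to [-128, 127]
# 2-D DCT-II with orthonormal scaling, applied along columns and then rows
A = dct(dct(shifted, axis=0, norm="ortho"), axis=1, norm="ortho")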
vii. Figure 2 shows how I-, P-, and B-frames are constructed from a
series of seven frames.
The entire movie is designated a video sequence as per the MPEG
standard, and each picture has three components: one luminance
component and two chrominance components (y, u & v).
The luminance component contains the gray scale picture & the
chrominance components provide the color, hue & saturation.
The MPEG decoder has three parts, audio layer, video layer, system
layer.
The basic building block of an MPEG picture is the macro block as
shown:
Where,
Q = Quantization
DCT = Discrete cosine transform
The quantized numbers Q_DCT are encoded using the non-adaptive
Huffman method.
Dilation: -
Dilation expands the image pixels: a given element A is dilated by
applying a structuring element B. The operator is defined as
A ⊕ B = { z | (B̂)z ∩ A ≠ ∅ },
where A is the object to be dilated and B is the structuring element.
Steps to perform
a) Fully match = 1
b) Some match = 1
c) No match = 0
Example
Given image A
Structuring element B
Output
Erosion
Erosion shrinks the image pixels: an element A is eroded by applying a
structuring element B. The operator is defined as
A ⊖ B = { z | (B)z ⊆ A }.
Steps to perform
a) Fully match = 1
b) Some match = 0
c) No match = 0
For Example
Given image A
Structuring element B
Output
Opening
Opening generally smoothes the contour of an object, breaks narrow
isthmuses, and eliminates thin protrusions.
The opening of set A by structuring element B, denoted by A ∘ B, is
defined as:
A ∘ B = (A ⊖ B) ⊕ B
So opening is an erosion followed by a dilation.
For Example
Set A
Structuring Element B
Output
Closing
Closing also tends to smooth sections of contours but, in contrast to
opening, it fuses narrow breaks and long thin gulfs, eliminates small
holes, and fills gaps in the contour.
The closing of set A by structuring element B, denoted by A • B, is
defined as:
A • B = (A ⊕ B) ⊖ B
For Example
Set A
Structuring Element B
Output
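A minimal OpenCV sketch of all four operations on a binary image (the
3x3 square structuring element is an illustrative choice):

import cv2
import numpy as np

A = cv2.imread("binary.png", cv2.IMREAD_GRAYSCALE)   # hypothetical binary image (0 / 255)
B = np.ones((3, 3), np.uint8)                        # structuring element

dilated = cv2.dilate(A, B)
eroded = cv2.erode(A, B)
opened = cv2.morphologyEx(A, cv2.MORPH_OPEN, B)      # erosion followed by dilation
closed = cv2.morphologyEx(A, cv2.MORPH_CLOSE, B)     # dilation followed by erosion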
9.6 REFERENCES
1. Pratt WK. Introduction to digital image processing. CRC press; 2013
Sep 13.
2. Niblack W. An introduction to digital image processing. Strandberg
Publishing Company; 1985 Oct 1.
3. Burger W, Burge MJ. Principles of digital image processing.
London: Springer; 2009.
4. Jain AK. Fundamentals of digital image processing. Prentice-Hall,
Inc.; 1989 Jan 1.
5. Dougherty ER. Digital image processing methods. CRC Press; 2020
Aug 26.
6. Gonzalez RC. Digital image processing. Pearson education india;
2009.
7. Marchand-Maillet S, Sharaiha YM. Binary digital image processing: a
discrete approach. Elsevier; 1999 Dec 1.
8. Andrews HC, Hunt BR. Digital image restoration.
9. Lagendijk RL, Biemond J. Basic methods for image restoration and
identification. InThe essential guide to image processing 2009 Jan 1
(pp. 323-348). Academic Press.
10. Banham MR, Katsaggelos AK. Digital image restoration. IEEE signal
processing magazine. 1997 Mar;14(2):24-41.
11. Hunt BR. Bayesian Methods in Nonlinear Digital Image
Restoration. IEEE Transactions on Computers. 1977 Mar
1;26(3):219-29.
12. Figueiredo MA, Nowak RD. An EM algorithm for wavelet-based
image restoration. IEEE Transactions on Image Processing. 2003 Aug
4;12(8):906-16.
13. Digital Image Processing – Tutorialspoint.
https://www.tutorialspoint.com/dip/index.htm.
14. Types of Restoration Filters. https://www.geeksforgeeks.org/types-of-
restoration-filters/.
9.7 MOOCS
1. Fundamentals of Digital Image and Video Processing. Coursera.
https://www.coursera.org/lecture/digital/mpeg-4-qYxK2
2. Moving Pictures Expert Group (MPEG) Video. SCTE.
https://www.scte.org/education/course-offerings/course-
catalog/moving-pictures-expert-group-mpeg-video/
3. Huffman Coding. Coursera.
https://www.coursera.org/lecture/digital/huffman-coding-0CZoy
4. Morphology. Udemy. https://www.udemy.com/course/morphology/
9.9 QUIZ
10. Which bitmap file format support the Run length encoding ?
(A) BMP
(B) PCX
(C) TIF
(D) All of the above
Answer: D
Module VI
10
APPLICATIONS OF IMAGE PROCESSING
Unit Structure
10.1 Case Study on Digital Watermarking
10.2 Digital watermarking techniques: A case study in fingerprints
and faces
10.3 Vehicle Registration Number Plate Detection and Recognition
using Image Processing Techniques
10.4 Object Detection using Correlation Principle
A digital watermarking system involves three stages:
1. Embed
2. Attack
3. Protection
Embed: Embedded with the digital watermark.
Attack: Any change in the transmitted content becomes a threat to the
watermarking system and is called an attack.
Protection: The detection of the watermark from the noisy signal, whose
media might have been altered, is called protection.
Types of Watermarks [3]:
1. Visible Watermarks
2. Invisible Watermarks
3. Public Watermarks
4. Fragile Watermarks
Visible Watermarks: These are visible in nature.
Invisible Watermarks: These are invisible but are embedded in the
media and use steganography technique.
Public Watermarks: These can be modified using certain algorithms by
anyone and are not secure.
Fragile Watermarks: These are destroyed when data manipulation
occurs; if fragile watermarks are used, a system is needed to detect the
changes that have occurred to the data.
Digital watermarking is used for numerous purposes including [1-2]:
Broadcast Monitoring
Ownership Assertion
Transaction Tracking
Content Authentication
Copy Control and Fingerprinting
separately for each image segment; it is easier to remove this type of
watermark.
Proposed Technique
Conclusion
Watermarking biometric information is still a comparatively new issue,
but it is of growing importance as more robust methods of verification
and authentication come into use. Biometrics provide the necessary
distinctive characteristics, but their validity must be ensured. A receiver
cannot always confirm whether or not she has received the right data
unless the sender gives her access to critical data such as the watermark.
The key proposed here is one of many possible methods. The local-
average scheme creates a semi-unique key for every data set transmitted
and is therefore harder to tamper with. It also has the ability to pinpoint
where tampering has occurred down to a small pixel window, and data
security can be assured in databases as well as in transmission. However,
it is only a semi-unique key: it is possible to change the image yet retain
the same key, because the average is not always the best tool for
characterizing data. A non-linear mechanism might be more sensitive to
small changes and is something that could be investigated. One major
flaw in our methodology is its inability to detect whether alterations in
the image are due to channel distortions and noise or to actual tampering
by an individual. Sometimes the transmission noise is a function of the
encryption scheme employed, and at other times it is a function of the
channel itself. Having a way to determine whether the "tampering" is the
result of noise or a malicious attack would be useful. For noise to be
seen as tampering, it must be strong enough to start disrupting the
image, and from that point on it may be interpreted as an accidental
attack. Another potential drawback is the "disgruntled worker" attack. If
a disgruntled employee has access to the executable, it is
straightforward to make the executable always report that the received
image has not been tampered with even when it has. Similarly, the
executable could be altered so that it gives consistently negative
responses. One way to address this would be to introduce a random
function that operates in conjunction with the executable, so that, for
example, a 5x5 local-average key is not the only possibility.
Methodology
PRE-PROCESSING
The input can be an image or a video. A video is treated as a series of
frames; before starting license plate detection, the image source must be
prepared for further processing. Figure 7(a) is the example input image
used to illustrate the process. The image processing techniques are
applied in the following order (see the sketch after this list):
Image Under-Sampling
RGB to HSV Conversion
Grayscale extraction
Morphological transformations
Gaussian Smoothing
Inverted Adaptive Gaussian Thresholding
The last pre-processing stage, Inverted Adaptive Gaussian Thresholding,
returns a binarized image with values of either 0 or 255.
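A minimal OpenCV sketch of this pre-processing chain (parameter
values such as the resize factor and the threshold block size are
illustrative assumptions, not taken from the paper):

import cv2

frame = cv2.imread("vehicle.jpg")                             # hypothetical input frame
small = cv2.resize(frame, None, fx=0.5, fy=0.5)               # image under-sampling
hsv = cv2.cvtColor(small, cv2.COLOR_BGR2HSV)                  # RGB(BGR) to HSV conversion
gray = hsv[:, :, 2]                                           # grayscale: the value channel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
morphed = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, kernel)    # a morphological transformation
blurred = cv2.GaussianBlur(morphed, (5, 5), 0)                # Gaussian smoothing
binary = cv2.adaptiveThreshold(blurred, 255,                  # inverted adaptive Gaussian threshold
                               cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 19, 9)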
Fig. 7. (a) Fonts used for training; (b) Extracted images for the letter
'P'
Results and discussion
The experiments were conducted on a Windows 10 machine with 8 GB of RAM
and an i5 processor running at 2.4 GHz frequency. The OpenCV Python
library is used to implement image processing tools. System testing was
performed with photos and videos. All of the above cases, such as
irregularly illuminated plates, stylized fonts, close-up plates and far-
away plates, are considered part of the testing, including images under different
environmental conditions. Figure 8 (a) shows an image for testing the
case of irregular and small number plates. Figure 8 (b) shows a case of
partially worn-out and a standard number plate.
CONCLUSION
The work involves detecting and recognizing the number plates of
Indian vehicles. The main contributions of this work include taking into
account difficult situations such as changes in illumination, blur,
asymmetric, noisy and non-standard images, and partially worn plates.
First, image processing techniques such as morphological
transformations, Gaussian smoothing and Gaussian thresholding are
applied in the pre-processing stage. Then, for segmentation of the
number plate, contours are obtained by border following and are filtered
according to character size and spatial localization. Finally, after
filtering and transforming the regions of interest, nearest-neighbor
algorithms are used to recognize the characters.
CONCLUSION
Due to its powerful learning ability and its advantages in dealing with
occlusion, scale transformation and background switching, deep-
learning-based object detection has been a research hotspot in recent
years. This review started with generic object detection pipelines, which
provide base architectures for other related tasks. Then three other
common tasks, namely salient object detection, face detection and
pedestrian detection, were briefly reviewed. Finally, several promising
future directions were identified to give a thorough understanding of the
object detection landscape. This section is relevant to developments in
neural networks and related learning systems, providing valuable
insights and guidelines for future progress.
11
HUMAN BODY TRACKING BASED ON
DISCRETE WAVELET TRANSFORM
Unit Structure
Fig 12: (a) Original image (b) first-level DWT (c) second-level DWT
(d) sub-bands of second-level DWT
Experimental results
The proposed tracking system is implemented on a Windows XP PC
with a Pentium 2.0 GHz CPU and 1024 MB RAM, using Borland C++
Builder 5 as the implementation platform. The resolution of each color
image is 320x240 pixels. Real-time processing is achieved at about 25
frames per second, and satisfactory performance has been demonstrated
with the proposed approach.
Conclusions
Aiming at single human-body tracking, a novel real-time color-image
human body tracking system based on the discrete wavelet transform is
proposed, identifying the target based on color and spatial information.
To improve tracking performance, the discrete wavelet transform is used
to pre-process the image, reducing the computations required and
achieving real-time tracking. The experimental results show that the
proposed tracking system is capable of tracking human objects in real
time at about 25 frames per second.
CEDAR
CEDAR, was developed by the researchers at the University of Buffalo in
2002 and is considered among the first few large databases of handwritten
characters. In CEDAR, the images were scanned at 300 dpi as shown in
Figure 14.
CONCLUSION
1) Optical character recognition has been around for the last eight (8)
decades. Development of machine learning and deep learning has
enabled individual researchers to develop algorithms and
techniques, which can recognize handwritten manuscripts with
greater accuracy.
2) We systematically extracted and analyzed research publications on
six widely spoken languages. We found that some techniques
perform better on one script than on another; e.g., the multilayer
perceptron classifier gave better accuracy on Devanagari and
Bangla numerals and gave average results for other languages.
3) Most of the published research studies propose a solution for one
language or even a subset of a language.
4) It is observed that researchers are increasingly using Convolution
Neural Networks (CNN) for the recognition of handwritten and
machine-printed characters. This is due to the fact that CNN based
architectures are well suited for recognition tasks where input is an
image.
Step 5. Recursively apply steps 3 and 4 to each of the two halves,
subdividing groups and adding bits to the codes until each symbol has
become a corresponding leaf on the tree.
b) HUFFMAN CODING
The Huffman algorithm is simple and can be described in terms of
creating a Huffman code tree; a code sketch follows the steps below.
The procedure for building this tree is:
Step 1. Start with a list of free nodes, where each node corresponds to a
symbol in the alphabet.
Step 2. Select two free nodes with the lowest weight from the list.
Step 3. Create a parent node for the two selected nodes, whose weight
equals the sum of the weights of the two child nodes.
Step 4. Remove the two child nodes from the list and the parent node is
added to the list of free nodes.
Step 5. Repeat the process starting from step-2 until only a single tree
remains.
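A minimal Python sketch of this procedure, keeping the list of free
nodes in a heap (the symbol weights are illustrative):

import heapq

def huffman_codes(weights):
    # each heap entry: (weight, tiebreaker, {symbol: code-so-far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)     # the two free nodes with the lowest weight
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, count, merged))   # parent replaces its two children
        count += 1
    return heap[0][2]

print(huffman_codes({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}))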
d) ARITHMETIC CODING
Huffman and Shannon-Fano coding techniques suffer from the fact that an
integral value of bits is needed to code a character. Arithmetic coding
completely bypasses the idea of replacing every input symbol with a
codeword. Instead it replaces a stream of input symbols with a single
floating point number as output. The basic concept of arithmetic coding
was developed by Elias in the early 1960’s and further developed largely
by Pasco, Rissanen and Langdon. The main aim of Arithmetic coding is to
assign an interval to each potential symbol; then a decimal number is
assigned to this interval. The algorithm starts with the interval [0.0, 1.0).
After each input symbol from the alphabet is read, the interval is
subdivided into a smaller interval in proportion to the input symbol's
probability. This subinterval then becomes the new interval and is
divided into parts according to the probabilities of the symbols of the
input alphabet.
This is repeated for each and every input symbol. And, at the end, any
floating point number from the final interval uniquely determines the input
data.
f) LZ78
In 1978 Jacob Ziv and Abraham Lempel presented their dictionary based
scheme, which is known as LZ78. This dictionary has to be built both at
the encoding and decoding side and they must follow the same rules to
ensure that they use an identical dictionary. The codewords output by the
algorithm consists of two elements where ‘i’ is an index referring to the
longest matching dictionary entry and the first non-matching symbol.
When a symbol that is not yet found in the dictionary, the codeword has
the index value 0 and it is added to the dictionary as well. The algorithm
gradually builds up a dictionary with this method. The algorithm for LZ78
is given below:
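The original listing did not survive reproduction; a minimal Python
sketch consistent with the description above (codewords are (index,
symbol) pairs, with index 0 for the empty match) is:

def lz78_encode(data):
    dictionary = {}                     # phrase -> index (1-based)
    out, phrase = [], ""
    for ch in data:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate          # keep extending the longest match
        else:
            out.append((dictionary.get(phrase, 0), ch))   # (index of match, new symbol)
            dictionary[candidate] = len(dictionary) + 1   # grow the dictionary
            phrase = ""
    if phrase:
        out.append((dictionary[phrase], ""))              # flush a final complete match
    return out

print(lz78_encode("ababcbababaa"))
# [(0, 'a'), (0, 'b'), (1, 'b'), (0, 'c'), (2, 'a'), (5, 'b'), (1, 'a')]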
LZ78 algorithm has the ability to capture patterns and hold them
indefinitely but it also has a serious drawback. There are various methods
to limit dictionary size, the easiest being to stop adding entries and
continue like a static dictionary coder or to throw the dictionary away and
start from scratch after a certain number of entries has been reached. The
encoding done by LZ78 is fast, compared to LZ77, and that is the main
advantage of dictionary based compression. The decompression in LZ78 is
faster compared to the process of compression.
EXPERIMENTAL RESULTS
In this section we compare the performance of various Statistical
compression techniques (Run Length Encoding, Shannon-Fano coding,
Huffman coding, Adaptive Huffman coding and Arithmetic coding),
LZ77 family algorithms (LZ77, LZSS, LZH and LZB) and LZ78 family
algorithms (LZ78, LZW and LZFG). Studies evaluating the efficiency of
a compression algorithm are carried out using two important parameters.
We tested the practical performance of the above-mentioned techniques
several times on files of the Canterbury corpus and obtained the results
for the various statistical coding techniques and Lempel-Ziv techniques
selected for this study.
CONCLUSION
Statistical compression techniques and Lempel Ziv algorithms were taken
up to examine the performance in compression. In the Statistical
compression techniques, Arithmetic coding technique outperforms the rest
with an improvement of 1.15% over Adaptive Huffman coding, 2.28%
over Huffman coding, 6.36% over Shannon-Fano coding and 35.06% over
Run Length Encoding technique. LZB outperforms LZ77, LZSS and LZH
to show a marked compression, which is 19.85% improvement over LZ77,
6.33% improvement over LZSS and 3.42% improvement over LZH,
amongst the LZ77 family. LZFG shows a significant result in the average
BPC compared to LZ78 and LZW. From the result it is evident that LZFG
has outperformed the other two with an improvement of 32.16% over
LZ78 and 41.02% over LZW.
Introduction
Early reports of the performance of CBIR systems were often restricted
simply to printing the results of one or more example queries. This is
easily tailored to give a positive impression, since developers can choose
queries which give good results. It is neither an objective performance
measure, nor a means of comparing different systems. Many of the
measures used in CBIR have long been used in IR. Several other standard
IR tools have recently been imported into CBIR.
In the 1950s IR researchers were already discussing performance evaluation, and the first concrete steps were taken with the development of the SMART system in 1961. Other important steps towards common performance measures were made with the Cranfield tests. Finally, the TREC series started in 1992, combining many efforts to provide common performance tests. The TREC project provides a focus for these activities and is the worldwide standard in IR; new evaluation tasks are included in TREC regularly.
Information Retrieval
Although performance evaluation in IR started in the 1950s, here we focus
on newer results and especially on TREC and its achievements in the IR
community. Not only did TREC provide an evaluation scheme accepted
worldwide, but it also brought academic and commercial developers
together and thus created a new dynamic for the field.
Data Collections
The TREC collection is the main collection used in IR. Co-sponsored by
the National Institute of Standards and Technology and the Defense
Advanced Research Projects Agency, TREC has been held annually since
its inception. A large amount of training data is also provided before the
conference. Special evaluations exist for interactive systems, spoken
language, high-precision and cross-language retrieval. The collections can
grow as computing power increases, and as new research areas are added.
Relevance judgments
The determination of relevant and non-relevant documents for a given
query is one of the most important and time-consuming tasks. TREC uses
the following working definition of relevance: If you were writing a report
on the subject of the topic and would use the information contained in the
document in the report, then the document is relevant. Only binary
judgments are made, and a document is judged relevant if any piece of it
is.
Performance measures
The most common evaluation measures used in IR are precision and
recall, usually presented as a precision vs. recall graph. Researchers are
familiar with PR graphs and can extract information from them without
interpretation problems.
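For reference, the two measures are defined over the set of retrieved images and the set of images relevant to the query:

$$\text{Precision} = \frac{|\,\text{relevant} \cap \text{retrieved}\,|}{|\,\text{retrieved}\,|}, \qquad \text{Recall} = \frac{|\,\text{relevant} \cap \text{retrieved}\,|}{|\,\text{relevant}\,|}.$$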
Image grouping
An alternative approach is for the collection creator or a domain expert to group images according to some criteria. Domain expert knowledge is very often used in medical CBIR. This can be seen as real ground truth, because the images have a diagnosis certified by at least one medical doctor. These groups can then be used like the subsets discussed above.
Simulating users
Some studies simulate a user, by assuming that users' image similarity
judgments are modeled by the metric used by the CBIR system, plus
noise. Real users are very hard to model: Tversky (1977) has shown that
human similarity judgments seem not to obey the requirements of a
metric, and they are certainly user- and task-dependent. Such simulations
cannot replace real user studies.
Single-valued measures
Rank of the best match: Berman & Shapiro (1999) measure whether the "most relevant" image is in either the first 50 or the first 500 images retrieved. 50 represents the number of images returned on screen and 500 is an estimate of the maximum number of images a user might look at when browsing.
Error rate: Hwang et al. (1999) use this measure, which is common in object or face recognition. It is in fact a single precision value, so it is important to know where the value is measured.
Retrieval efficiency
Müller & Rigoll (1999) define retrieval efficiency as follows: if the number of images retrieved is lower than or equal to the number of relevant images, the value is the precision of the query; otherwise it is its recall. This definition can be misleading since it mixes two standard measures.
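In symbols (our notation, reconstructed from the verbal definition above), with $R$ relevant images in the collection, $n$ images retrieved, and $r$ relevant images among those retrieved:

$$\text{Retrieval efficiency} \;=\; \begin{cases} r/n & \text{if } n \le R \quad (\text{precision}),\\ r/R & \text{otherwise} \quad (\text{recall}). \end{cases}$$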
Correct and incorrect detection
Ozer et al. (1999) use these measures in an object recognition context. The numbers of correct and incorrect classifications are counted. When divided by the number of retrieved images, these measures are equivalent to error rate and precision.
Graphical representations
Precision vs. recall graphs
PR graphs are a standard evaluation method in IR and are increasingly used by the CBIR community. PR graphs contain a lot of information, and their long use means that they can be read easily by many researchers. It is also common to present a partial PR graph (e.g. He (1997)). This can be useful in showing a region in more detail, but it can also be misleading since areas of poor performance can be omitted. Interpretation is also harder, since the scaling has to be watched carefully. A partial graph should always be used in conjunction with the complete graph.
Fig 17: PR graphs for four different queries both without and with
feedback.
Correctly retrieved vs. all retrieved graphs contain the same information as
recall graphs, but differently scaled. Fraction correct vs. No. images
retrieved graphs are equivalent to precision graphs. Average recognition
rate vs. No. images retrieved graphs show the average percentage of
relevant images among the first N retrievals. This is equivalent to the
recall graph.
Fig 18: Recall vs. No. of images graph and partial precision vs. No. of
images graph
CONCLUSIONS
This section gave an overview of existing performance evaluation measures in CBIR. The need for standardized evaluation measures is clear, since several measures are slight variations of the same definition. This makes it very hard to compare the performance of systems objectively. To overcome this problem, a set of standard performance measures and a standard image database are needed. We have proposed such a set of measures, similar to those used in TREC. A frequently updated shared image database and the regular comparison of system performances would be of great benefit to the CBIR community.
11.5 REFERENCES
1. Digital Watermarking.
https://www.techopedia.com/definition/24927/digital-watermarking
2. Rashid A. Digital watermarking applications and techniques: a brief
review. International Journal of Computer Applications Technology and
Research. 2016;5(3):147-50.
3. Digital Watermarking and its Types.
https://www.geeksforgeeks.org/digital-watermarking-and-its-types/.
4. Jain S. Digital watermarking techniques: a case study in fingerprints &
faces. InProc. Indian Conf. Computer Vision, Graphics, and Image
Processing 2000 Dec (pp. 139-144).
5. Joshi, Manjunath & Joshi, Vaibhav & Raval, Mehul. (2013). Multilevel
Semi-fragile Watermarking Technique for Improving Biometric
Fingerprint System Security. Communications in Computer and
Information Science. 276. 10.1007/978-3-642-37463-0_25.
6. Ganta S, Svsrk P. A novel method for Indian vehicle registration
number plate detection and recognition using image processing
techniques. Procedia Computer Science. 2020 Jan 1;167:2623-33.
7. Zhao ZQ, Zheng P, Xu ST, Wu X. Object detection with deep learning: A review. IEEE Transactions on Neural Networks and Learning Systems. 2019 Jan 28;30(11):3212-32.
8. Chang SL, Hsu CC, Lu TC, Wang TH. Human body tracking based on
discrete wavelet transform. InProceedings of the 2007 WSEAS
International Conference on Circuits, Systems, Signal and
Telecommunications 2007 Jan 17 (pp. 113-122).
9. Shanmugasundaram S, Lourdusamy R. A comparative study of text
compression algorithms. International Journal of Wisdom Based
Computing. 2011 Dec;1(3):68-76.
10. Müller H, Müller W, Squire DM, Marchand-Maillet S, Pun T.
Performance evaluation in content-based image retrieval: overview and
proposals. Pattern recognition letters. 2001 Apr 1;22(5):593-601.
11.6 MOOCS
1. Watermarking Basics. https://www.coursera.org/lecture/hardware-
security/watermarking-basics-UHE3w.
2. Biometric Authentication. https://www.coursera.org/lecture/usable-
security/biometric-authentication-RXVog.
3. Biometrics. https://www.udemy.com/course/biometrics/.
4. YOLO: Automatic License Plate Detection & Extract text App.
https://www.udemy.com/course/deep-learning-web-app-project-
number-plate-detection-ocr/
5. Object Detection. https://www.coursera.org/lecture/convolutional-
neural-networks/object-detection-VgyWR.
6. Introduction to Optical Character Recognition.
https://www.coursera.org/lecture/python-project/introduction-to-
optical-character-recognition-n8be7.
7. Introduction to Data Compression.
https://www.coursera.org/lecture/algorithms-part2/introduction-to-
data-compression-OtmHU.
4. Biometric authentication and its types and methods, information security. https://www.youtube.com/watch?v=tTnkq6Y3Hdg.
security. https://www.youtube.com/watch?v=tTnkq6Y3Hdg.
5. Vehicle License Plate Recognition.
https://www.youtube.com/watch?v=CVDTtRiIXME
6. Vehicle Number Plate Recognition using MATLAB.
https://www.youtube.com/watch?v=p_g-g7C3uHw.
7. Handwritten and Printed Text Recognition.
https://www.youtube.com/watch?v=H64vHn_R0vg
8. OCR Explained...Handwriting Recognition!!!.
https://www.youtube.com/watch?v=i_XJa165_9I.
Abstract—In this paper, at first, a color image of a car is taken. Then the image is transformed into a grayscale image. After that, the motion blurring effect is applied to that image according to the image degradation model described in equation 3. The blurring effect can be controlled by the a and b components of the model. Then random noise is added to the image via Matlab programming. Many methods can restore the noisy and motion blurred image; in this paper, inverse filtering as well as Wiener filtering are implemented for the restoration purpose. Consequently, both motion blurred and noisy motion blurred images are restored via inverse filtering as well as Wiener filtering techniques, and a comparison is made among them.

Keywords—Color image, grayscale image, motion blurring, random noise, inverse filtering, Wiener filtering, restoration of an image.

I. INTRODUCTION

In digital image processing, image restoration is an essential approach used for the retrieval of the uncorrupted, original image from a blurred and noisy image [1, 2] degraded by motion blur, noise, etc. caused by environmental effects [3] and camera misfocus. Image blur may occur for many reasons, such as motion blur, which is due to sluggish camera shutter speed relative to the instantaneous motion of the targeted object [4]. The image may also be subject to several forms of noise, such as Poisson noise, Gaussian noise, etc. Poisson noise is signal-dependent and is associated with low light sources owing to photon counting statistics [4]. In contrast, Gaussian noise arises from electronic components and broadcast transmission effects [4]. In short, image restoration is an inverse process [5] by which the uncorrupted, original image can be recovered from the degraded form of the actual image [6]. There are many useful applications of digital image restoration in several fields, including astronomical imaging, medical imaging, media and filmography, security and surveillance videotapes, law enforcement and forensic science, image and video coding, centralized aviation assessment procedures [7], restoration of uniformly blurred television pictures [8], etc. Several algorithmic techniques such as Artificial Neural Networks [9], Convolutional Neural Networks [10], and K-nearest Neighbors [11] can also be applied in image processing tasks such as segmentation, thresholding and filtering. The technique used in image restoration is known as filtering, which suppresses or removes unwanted components or features from the images. The most popular filtering techniques used in image restoration in recent times are inverse filtering and Wiener filtering [12].

The inverse filter is a handy technique for image restoration if a proper degradation function can be modeled for the corrupted image. The performance of the inverse filter is quite good when noise does not corrupt the images, but in the presence of noise, performance degrades significantly, as high pass inverse filtration cannot eliminate noise properly because noise tends to be high frequency. The Wiener filter incorporates a low pass filter together with a high pass filter; as a result, it works effectively in the presence of additive noise within the image. It performs a deconvolution operation (high pass filtering) to invert the motion blurring and also performs a compression operation (low pass filtering) to eliminate the additive noise. Furthermore, in the process of inverting motion blurring and eliminating noise, the Wiener filter minimizes the overall mean square error between the original image and the output image of the filtration.

In this paper, the implementation of inverse filtering and Wiener filtering is analyzed for image restoration. Inverse filtering is applied to a motion blurred car image at first, and then Wiener filtering is also applied to the same image. After that, inverse and Wiener filtering are performed on the same motion blurred car image with additive noise. Finally, a comparison is made between inverse and Wiener filtering regarding their performance in restoring motion blurred images with and without additive noise.

II. LITERATURE REVIEW

Over the past two decades, the technique of image processing has taken its place in every aspect of today's technological society. In digital image processing, there are a variety of essential steps involved, such as image enhancement, pre-processing of images, image segmentation, image restoration and reconstruction of images, etc. Among them, image restoration plays a vital role in today's world. It has several fields of application in the areas of astronomy, remote sensing, microscopy, medical imaging, satellite imaging, molecular spectroscopy, law enforcement, and digital media restoration. Image restoration is very challenging as there is a lot of interference and noise in the environment (Gaussian noise, multiplicative noise, and impulse noise), effects of the camera such as wide angle lenses, long exposure times, wind speed and degradation, and blurring such as uniform blur, atmospheric blur, motion blur, and Gaussian blur. However, there are various methods of image restoration in the domain of image processing, for instance, the Median filter, Wiener filtering, inverse filtering, the Harmonic mean filter, the Arithmetic mean filter, the Max filter, and the Maximum Likelihood (ML) method. Among these restoration methods, Wiener and inverse filtering are the most popular.
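To make the two filters concrete, here is a minimal NumPy sketch of the frequency-domain restoration pipeline described above. The motion-blur transfer function below is the standard linear-motion model (an assumed form of the paper's "equation 3"), and the constants a, b, T, K and the random test image are purely illustrative:

```python
import numpy as np

def motion_blur_H(shape, a=0.1, b=0.1, T=1.0):
    """Standard linear-motion degradation model H(u, v) (assumed form of 'equation 3')."""
    M, N = shape
    u = np.arange(M).reshape(-1, 1) - M // 2
    v = np.arange(N).reshape(1, -1) - N // 2
    s = np.pi * (u * a + v * b)
    s[s == 0] = 1e-12                            # avoid division by zero at the origin
    return (T / s) * np.sin(s) * np.exp(-1j * s)

def restore(g, H, K=0.01):
    """Wiener filter; with K = 0 this reduces to the pure inverse filter 1/H."""
    G = np.fft.fftshift(np.fft.fft2(g))
    W = np.conj(H) / (np.abs(H) ** 2 + K)        # = (1/H) * |H|^2 / (|H|^2 + K)
    return np.real(np.fft.ifft2(np.fft.ifftshift(W * G)))

img = np.random.rand(64, 64)                     # placeholder for the grayscale car image
H = motion_blur_H(img.shape)
blurred = np.real(np.fft.ifft2(np.fft.ifftshift(H * np.fft.fftshift(np.fft.fft2(img)))))
restored = restore(blurred, H, K=0.0001)
```

With K = 0 this is pure inverse filtering, which amplifies noise wherever |H| is small; a small positive K tames that amplification, which is exactly the trade-off the comparison in this paper examines.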
Fig. 10. Restoration of noisy motion blurred car image by Wiener filtering
Abstract—We study the problem of restoring severely degraded face images such as images scanned from passport photos or images subjected to fax compression, downscaling, and printing. The purpose of this paper is to illustrate the complexity of face recognition in such realistic scenarios and to provide a viable solution to it. The contributions of this work are two-fold. First, a database of face images is assembled and used to illustrate the challenges associated with matching severely degraded face images. Second, a preprocessing scheme with low computational complexity is developed in order to eliminate the noise present in degraded images and restore their quality. An extensive experimental study is performed to establish that the proposed restoration scheme improves the quality of the ensuing face images while simultaneously improving the performance of face matching.

Index Terms—Face recognition, faxed face images, image quality measures, image restoration, scanned face images.

I. INTRODUCTION

A. Motivation

THE past decade has seen significant progress in the field of automated face recognition, as is borne out by results of the 2006 Face Recognition Vendor Test (FRVT) organized by NIST [2]. For example, at a false accept rate (FAR) of 0.1%, the false reject rate (FRR) of the best performing face recognition system has decreased from 79% in 1993 to 1% in 2006. However, the problem of matching facial images that are severely degraded remains a challenge. Typical sources of image degradation include harsh ambient illumination conditions [3], low quality imaging devices, image compression, downsampling, out-of-focus acquisition, device or transmission noise, and motion blur [Fig. 1(a)–(f)]. Other types of degradation that have received very little attention in the face recognition literature include halftoning [Fig. 1(e)], dithering [Fig. 1(f)], and the presence of security watermarks on documents [Fig. 1(g)–(j)]. These types of degradation are observed in face images that are digitally acquired from printed or faxed documents. Thus, successful face recognition in the presence of such low quality probe images is an open research issue.

This work concerns itself with an automated face recognition scenario that involves comparing degraded facial photographs of subjects against their high-resolution counterparts (Fig. 2). The degradation considered in this work is a consequence of scanning, printing, or faxing face photos. The three types of degradation considered here are: 1) fax image compression, 2) fax compression, followed by printing and scanning, and 3) fax compression, followed by actual fax transmission and scanning. These scenarios are encountered in situations where there is a need, for example, to identify legacy face photos acquired by a government agency that have been faxed to another agency. Other examples include matching scanned face images present in driver's licenses, refugee documents, and visas for the purpose of establishing or verifying a subject's identity.

The factors impacting the quality of degraded face photos can be 1) person-related, e.g., variations in hairstyle, expression, and pose of the individual; 2) document-related, e.g., lamination and security watermarks that are often embedded on passport photos, variations in image quality, tonality across the face, and color cast of the photographs; and 3) device-related, e.g., the foibles of the scanner used to capture face images from documents, camera resolution, image file format, fax compression type, lighting artifacts, document photo size, and operator variability.

Fig. 1. Degraded face images: Low-resolution probe face images due to various degradation factors. (a) Original. (b) Additive Gaussian noise. (c) JPEG compressed (medium quality). (d) Resized to 10% and up-scaled to the original spatial resolution. (e) Half-toning. (f) Floyd–Steinberg dithering [4]. Mug-shots of face images taken from passports issued by different countries: (g) Greece (issued in 2006). (h) China (issued in 2008). (i) U.S. (issued in 2008). (j) Egypt (issued in 2005) [1].
cause image blurring. Therefore, these filters are efficient in denoising smooth images but not images with several discontinuities.

2) Thresholding-Based Nonlinear Denoising: When wavelets are used to deal with the problem of image denoising [24], the necessary steps involved are the following: 1) Apply the discrete wavelet transform (DWT) to the noisy image by using a wavelet function (e.g., Daubechies, Symlet, etc.). 2) Apply a thresholding estimator to the resulting coefficients, thereby suppressing those coefficients smaller than a certain amplitude. 3) Reconstruct the denoised image from the estimated wavelet coefficients by applying the inverse discrete wavelet transform (IDWT).

The idea of using a thresholding estimator for denoising was systematically explored for the first time in [26]. An important consideration here is the choice of the thresholding estimator and the threshold value used, since they impact the effectiveness of denoising. Different estimators exist that are based on different threshold value quantization methods, viz., hard, soft, or semisoft thresholding. Each estimator removes redundant coefficients using a nonlinear thresholding rule (the paper's equation (4)), defined in terms of the noisy observation, the mother wavelet function with its scale and position indices, the thresholding estimator, the thresholding type, and the threshold used. For an input signal, the hard, soft, and semisoft estimators used in the paper are defined by its equations (5)–(7), which involve a parameter greater than 1.

In nonlinear thresholding-based denoising methods [see (4)], translation invariance means that the basis used is invariant under translations of the signal. While the Fourier basis is translation invariant, the orthogonal wavelet basis is not (in either the continuous or discrete settings). Averaging the denoising results over a set of translated versions of the signal is called cycle spinning denoising: for an N-sample signal, pixel-precision translation invariance is achieved by having N translation transforms (vectors). Similar to cycle spinning denoising, thresholding-based translation invariant denoising can be defined analogously (the paper's equation (9)). The benefit of translation invariance over orthogonal thresholding is the SNR improvement afforded by the former. The problem with orthogonal thresholding is that it introduces oscillating artifacts that occur at random locations when the signal is shifted. However, translation invariance significantly reduces these artifacts by the averaging process. A further improvement in SNR can be obtained by proper selection of the thresholding estimator.
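A minimal PyWavelets sketch of the three-step pipeline above; the wavelet, decomposition level, and threshold value are illustrative choices, and pywt.threshold implements the standard hard and soft rules:

```python
import numpy as np
import pywt  # PyWavelets

def denoise(img, wavelet="db4", level=2, T=0.1, mode="soft"):
    """1) DWT, 2) threshold detail coefficients, 3) inverse DWT."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    out = [coeffs[0]]                              # keep approximation coefficients
    for details in coeffs[1:]:
        out.append(tuple(pywt.threshold(d, T, mode=mode) for d in details))
    return pywt.waverec2(out, wavelet)

noisy = np.random.rand(128, 128)                   # placeholder noisy image
clean_soft = denoise(noisy, mode="soft")           # soft: shrink coefficients toward zero by T
clean_hard = denoise(noisy, mode="hard")           # hard: zero out |coeff| <= T, keep the rest
```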
A. Offline Process

The offline process searches a parameter space consisting of the set of linear denoising parameters (i.e., filter type and window size) and the set of nonlinear parameters (i.e., wavelet type, thresholding type, and level). Given a dataset of degraded images, a finite domain that represents the parameters employed (discrete or real numbers), and a quality metric function, the proposed reconstruction method works by finding the parameter set in that domain which maximizes the quality metric, where the terms involved correspond to filtering, nonlinear denoising, and their combination.

This procedure is iterated until convergence (i.e., stability of the maximum quality) by altering the constrained parameters (window/wavelet/thresholding type) and updating the window size and threshold level in an incremental way. The maximum number of iterations is empirically set. For instance, a threshold value of more than 60 results in removing too much information content. The application of this process to a degraded training dataset results in an estimated parameter set for each image. The optimum meta-parameter set for each degraded training dataset is obtained by averaging. The derived meta-parameter sets are utilized in the online restoration process.

B. Online Process

In the online process (see Fig. 4), the degradation type of each input image is recognized by using a texture- and quality-based classification algorithm. First, the classifier utilizes the gray-tone spatial-dependence matrix, or cooccurrence matrix (COM) [28], which captures the statistical relationship of a pixel's intensity to the intensity of its neighboring pixels. The COM measures the probability that a pixel of a particular gray level occurs at a specified direction and distance from its neighboring pixels. In this study, the main textural features extracted are inertia, correlation, energy, and homogeneity:
• Inertia is a measure of local variation in an image. A high inertia value indicates a high degree of local variation.
• Correlation measures the joint probability occurrence of the specified pixel pairs.
• Energy provides the sum of squared elements in the COM.
• Homogeneity measures the closeness of the distribution of elements in the COM to the COM diagonal.
These features are calculated from the cooccurrence matrix where pairs of pixels separated by a distance ranging from 1 to 40 in the horizontal direction are considered, resulting in a total of 160 features per image (4 main textural features at 40 different offsets).

Apart from these textural features, image graininess is used as an additional image quality feature. Graininess is measured by the percentage change in image contrast of the original image before and after blurring is applied (see http://www.clear.rice.edu/elec301/Projects02/artSpy/graininess.html). The identification of the degradation type of an input image is done by using the k-Nearest Neighbor (k-NN) method [29], [30]. The online process restores the input image by employing the associated meta-parameter set (deduced in the offline process).

C. Computation Time

The online restoration process, when using MATLAB on a Windows Vista 32-bit system with 4-GB RAM and an Intel Core Duo CPU T9300 at 2.5 GHz, requires about 0.08 s for the k-NN classification and about 2.5 s for image denoising, i.e., a total time of less than 3 s per image.
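As a concrete illustration, the cooccurrence features can be computed with scikit-image, whose graycoprops names the inertia feature "contrast"; the random image and offset range below are illustrative:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def texture_features(img, max_distance=40):
    """4 cooccurrence features at horizontal offsets 1..max_distance (160 values)."""
    glcm = graycomatrix(img, distances=list(range(1, max_distance + 1)),
                        angles=[0], levels=256, symmetric=True, normed=True)
    feats = []
    for prop in ("contrast", "correlation", "energy", "homogeneity"):  # contrast == inertia
        feats.extend(graycoprops(glcm, prop).ravel())
    return np.array(feats)

img = (np.random.rand(64, 64) * 255).astype(np.uint8)  # placeholder gray image
print(texture_features(img).shape)                     # (160,)
```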
V. DEGRADED FACE IMAGE DATABASES

In this section, we will describe the hardware used for 1) the acquisition of the high-quality face images, and 2) printing, scanning, and faxing the face images (along with the associated software). We will also describe the live subject-capture setup used during the data collection process and the three degraded face image databases used in this paper.

1) Hardware and Subject-Capture Setup: A NIKON Coolpix P-80 digital camera was used for the acquisition of the high-quality face images (3648 × 2736) and an HP Officejet Pro L7780 system was used for printing and scanning images. The fax machine used was a Konica Minolta bizhub 501, in which the fax resolution was set to 600 × 600 dpi, the data compression method was MH/MR/MMR/JBIG, and the transmission standard used for the fax communication line was super G3. The Essential Fax software was used to convert the scanned document of the initial nondegraded face photos into a PDF document with the fax resolution set to 203 × 196 dpi.

Our live subject-capture setup was based on the one suggested by the U.S. State Department, Bureau of Consular Affairs [31]. For the passport-capture setup we used the P-80 camera and the L7780 system. We acquired data from 28 subjects bearing passports from different countries, i.e., 4 from Europe, 14 from the United States, 5 from India, 2 from the Middle East, and 3 from China; the age distribution of these participants was as follows: 20–25 (12 subjects), 25–35 (10 subjects), and over 35 (6 subjects). The database was collected over 2 sessions spanning approximately 10 days. In the beginning of the first session, the subjects were briefed about the data collection process after which they signed a consent document. During data collection, each subject was asked to sit 4 feet away from the camera. The data collection process resulted in the generation of three datasets, i.e., the NIKON Face Dataset (NFaceD) containing high-resolution face photographs from live subjects, the NIKON Passport Face Dataset (NPassFaceD) containing images of passport photos, and the HP Scanned Passport Face Dataset (HPassFaceD) containing face images scanned from the photo page of passports (see Fig. 5).

2) Experimental Protocol: Three databases were used in this paper (Fig. 6).

Passport Database: As stated above, the data collection process resulted in the generation of the Passport Database (PassportDB) composed of three datasets: 1) the NFaceD dataset that contains high-resolution face photographs from live subjects, 2) the NPassFaceD dataset that contains passport face images of the subjects acquired by using the P-80 camera, and 3) the HPassFaceD dataset that contains the passport face images of the subjects acquired by using the scanning mode of the L7780 machine. In the case of NPassFaceD, three samples of the photo page of the passport were acquired for each subject. In the case of HPassFaceD, one scan (per subject) was sufficient to capture a reasonable quality mug-shot from the passport (Fig. 5).

Passport-Fax Database: This database was created from the Passport Database (Fig. 7). First, images in the Passport database were passed through four fax-related degradation scenarios. This resulted in the generation of four fax-passport datasets that demonstrate the different degradation stages of the faxing process when applied to the original passport photos:
– Dataset 1: Each face image in the NPassFaceD/HPassFaceD datasets was placed in a Microsoft PowerPoint document. This document was then processed by the fax software, producing a multipage PDF document with fax compressed face images. Each page of the document was then resized to 150%. Then, each face image was captured at a resolution of 600 × 600 dpi by using a screen capture utility software (SnagIt v8.2.3).
– Dataset 2: Same as Dataset 1, but this time each page of the PowerPoint document was resized to 100%. Then each face image was captured at a resolution of 400 × 400 dpi. The purpose of employing this scenario was to study the effect of lower resolution of the passport face images on system performance.
– Dataset 3: Following the same initial steps of Dataset 1, a multipage PDF document was produced with degraded images due to fax compression. The document was then printed and scanned at a resolution of 600 × 600 dpi.
– Dataset 4: Again, we followed the same initial steps of Dataset 1. In this case, the PDF document produced was sent via an actual fax machine and each of the resulting faxed pages was then scanned at a resolution of 600 × 600 dpi.

FRGC2-Passport FAX Database: The primary goal of the Face Recognition Grand Challenge (FRGC) Database project was to evaluate face recognition technology. In this work, we combined the FRGC dataset that has 380 subjects with our NFacePass dataset that consists of another 28 subjects. The extended dataset is composed of 408 subjects with eight samples per subject, i.e., 3264 high-quality facial images. The purpose was to create a larger dataset of high-quality face images that can be used to evaluate the restoration efficiency of our methodology in terms of identification performance, i.e., to investigate whether the restored face images can be matched with the correct identity in the augmented database. Following the process described for the previous database, four datasets were created and used in our experiments.

A. Face Image Matching Methodology

The salient stages of the proposed method are described below:
1) Face Detection: The Viola & Jones face detection algorithm [32] is used to localize the spatial extent of the face and determine its boundary.
2) Channel Selection: The images are acquired in the RGB color domain. Empirically, it was determined that in the majority of passports, the Green channel (RGB color space) and the Value channel (HSV color space) are less sensitive to the effects of watermarking and reflections from the lamination. These two channels are selected and then added, resulting in a new single-channel image. This step is beneficial when using the Passport data. With the fax data this step is not employed since the color images are converted to grayscale by the faxing process.
3) Normalization: In the next step, a geometric normalization scheme is applied to the original and degraded images after detection. The normalization scheme compensates for slight perturbations in the frontal pose. Geometric normalization is composed of two main steps: eye detection and affine transformation. Eye detection is based on a template matching algorithm. Initially, the algorithm creates a global eye from all subjects in the training set and then uses it for eye detection based on a cross correlation score between the global and the test image. Based on the eye coordinates obtained by eye detection, the canonical faces are constructed by applying an affine transformation as shown in Fig. 4. These faces are warped to a size of 300 × 300. The photometric normalization applied to the passport images before restoration is a combination of homomorphic filtering and histogram equalization. The same process is used for the fax compressed images before they are sent to the fax machine.
4) Image Restoration: The methodology discussed in Section IV is used. By employing this algorithm, we process the datasets described in Section V and create their reconstructed versions that are later used for quality evaluation and identity authentication. Fig. 8 illustrates the effect of applying the restoration algorithm on some of the Passport Datasets (1, 3, and 4), i.e., passport faces a) subjected to T.6 compression (FAX SW) and restored; b) subjected to T.6 compression, printed, scanned, and restored; and c) subjected to T.6 compression, sent via fax machine, then scanned and finally restored. Note that in Fig. 8, the degraded faces in the left column are the images obtained after face detection and before normalization.
5) Face Recognition Systems: Both commercial and academic software were employed to perform the face recognition experiments: 1) the commercial software Identity Tools G8 provided by L1 Systems (www.l1id.com); 2) standard face recognition methods provided by the CSU Face Identification Evaluation System [8], including Principal Components Analysis (PCA) [33]–[35], a combined Principal Components Analysis and Linear Discriminant Analysis algorithm (PCA+LDA) [36], the Bayesian Intrapersonal/Extra-personal Classifier (BIC) using either the Maximum likelihood (ML) or the Maximum a posteriori (MAP) hypothesis [37], and the Elastic Bunch Graph Matching (EBGM) method [38]; and 3) the Local Binary Pattern (LBP) method [39].

Fig. 8. Illustration of the effect of the proposed restoration algorithm. The input consists of (a) images subjected to fax compression and then captured at 600 × 600 dpi resolution; (b) images subjected to fax compression and then captured at 400 × 400 dpi resolution; (c) images subjected to fax compression then printed and scanned.

VI. EMPIRICAL EVALUATION

The experimental scenarios investigated in this paper are the following: 1) evaluation of image restoration in terms of image quality metrics; 2) evaluation of the texture and quality based classification scheme; and 3) identification performance before and after image restoration.
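A hedged OpenCV sketch of stages 1 and 2 above; the file name is a placeholder, the Haar cascade ships with OpenCV (it is not the paper's trained detector), and the simple add-and-rescale blend is our reading of the channel-addition step:

```python
import cv2
import numpy as np

img = cv2.imread("passport_photo.jpg")                 # assumed input image (BGR)

# 1) Face detection (Viola & Jones) with OpenCV's bundled frontal-face cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# 2) Channel selection: Green (RGB) + Value (HSV), added into one channel.
green = img[:, :, 1].astype(np.float32)
value = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:, :, 2].astype(np.float32)
single = cv2.normalize(green + value, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
```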
Fig. 9. Improvement in image quality as assessed by the PSNR and UIQ metrics. These metrics are computed by using the high-quality counterpart of each image as the "clean image."

Fig. 10. Comparison of degraded images and their reconstructed counterparts after employing the proposed restoration method using PSNR/UIQ as quality metrics. UIQ appears to result in, at least visually, better images.
Fig. 12. Box plot of degradation classification performance results when using a combination of features (Inertia; Homogeneity; Energy; Contrast (Image Graininess); and no usage of Contrast). Note that the central mark (red line) is the median classification result over 10 runs, the edges of the box (blue) are the 25th and 75th percentiles, the whiskers (black lines) extend to the most extreme data points not considered outliers, and outliers (red crosses) are plotted individually.
TABLE I. Classification results when using the textural- and quality-based classification algorithm. BFAX = fax compression (not sent via fax machine); LRes = low resolution; HRes = high resolution; AFAX = sent via fax machine; CL = classification; EV = error variance.
TABLE II. Partitioning the FRGC2-Passport FAX database prior to applying the CSU FR algorithms.
Fig. 13. Face identification results: High-quality versus high-quality face image comparison.
To test our classification algorithm, we used a dataset of sample images (27 subjects × 4) for training and the four samples of the remaining (28th) subject for testing (1 subject × 4), where 4 in both cases represents one sample from each of the four classes involved. Thus, we performed a total of 28 experiments where the training and test datasets were resampled, i.e., in each experiment the data of a different subject (out of the 28) was used for testing. Each experiment was performed before and after fusing the textural and image graininess features. The results are summarized in Table I. Note that if an image is misclassified, it will be subjected to the set of meta-parameters pertaining to the incorrect class.

We also applied our feature extraction algorithm on the original training set of the FRGC2 subset. Then, we randomly selected 100 samples from the original test set (see Table II) 10 times, and applied feature extraction on each generated test subset. We performed the above process on the three degraded datasets that are generated from the original FRGC2 training/test sets, and performed 26,400 classification experiments in total. The outcome of these experiments is summarized in Fig. 12 (box-plot results).

C. Face Identification Experiments

The third experimental scenario is a series of face identification tests which compare system performance resulting from the baseline (FRGC2-Passport FAX Database), degraded, and reconstructed face datasets. The goal here is to illustrate that the face matching performance improves with image restoration. For this purpose, we perform a two-stage investigation that involves 1) high-quality versus high-quality face image comparison (baseline), and 2) high-quality versus degraded face image comparison. In the high-quality versus high-quality tests, we seek to establish the baseline performance of each of the face recognition methods (academic and commercial) employed. In the high-quality versus degraded tests, we investigate the matching performance of degraded face images against their high-resolution counterparts.

Table II illustrates the way we split the FRGC2-Passport FAX Database to apply the CSU FR algorithms. For the G8 and LBP algorithms we used 4 samples of all the 408 subjects, and ran a 5-fold cross-validation where one sample per subject was used as the gallery image and the rest were used as probes. The identification performance of the system is evaluated through the cumulative match characteristic (CMC) curve, which measures identification performance and judges the ranking capability of the identification system.

All the results before and after restoration are presented in Figs. 13–16. We can now evaluate the consistency of the results and the significant benefits of our restoration methodology in terms of face identification performance. For high-quality face images with no photometric normalization, the average rank 1 score of all the FR algorithms is 93.43% (see Fig. 13). This average performance drops to 80.4% when fax compression images are used before restoration. After restoration, the average rank 1 score increases to 90.8% (12.94% performance improvement). When the fax compressed images are also printed, the performance drops further to 70.8% before restoration, but increases to 89.3% after restoration (26.13% performance improvement). Finally, when the most degraded images were used (images sent via a fax machine), the average rank 1 score across all the algorithms drops to 58.7% before restoration, while after restoration it goes up to 81.2%. It is interesting to note that the identification performance of the high-quality images is comparable to that of the restored degraded images. Note that each face identification algorithm performs differently, and in some cases (e.g., G8) the performance is optimal for both raw and restored images (in the case of fax compression), achieving a 100% identification rate at rank 1. The consistency in improving recognition performance indicates the significance of the proposed face image restoration methodology.

Fig. 14. Face identification results: High-quality versus fax compressed images.

Fig. 15. Face identification results: High-quality versus fax compressed images which have been printed and scanned.

Fig. 16. Face identification results: High-quality versus images that are sent via a fax machine and then scanned. Note that the EBGM method is illustrated separately because it results in very poor matching performance. This could be implementation-specific and may be due to errors in detecting landmark points.

VII. CONCLUSIONS AND FUTURE WORK

We have studied the problem of image restoration of severely degraded face images. The proposed image restoration algorithm compensates for some of the common degradations encountered in a law-enforcement scenario. The proposed restoration method consists of an offline mode (image restoration is applied iteratively, resulting in the optimum meta-parameter sets), where the objective function is based on two different image quality metrics. The online restoration mode uses a classification algorithm to determine the nature of the degradation in the input image, and then uses the meta-parameter set identified in the offline mode to restore the degraded image. Experimental results show that the restored face images not only have higher image quality, but they also lead to higher recognition performance than their original degraded counterparts.

Commercial face recognition software may have its own internal normalization schemes (geometric and photometric) that cannot be controlled by the end-user, and this can result in inferior performance when compared to some academic algorithms (i.e., LDA) when restoration is employed. For example, when G8 was used on fax compressed data, the identification performance was 79.2% while LDA resulted in a 91.4% matching accuracy. In both cases, the restoration helped, yet LDA (97.9%) performed better than G8 (93.6%). Since the preprocessing stage of the noncommercial algorithms can be better controlled than that of commercial ones, several academic algorithms were found to be comparable in performance to the commercial one after restoration.

The proposed image restoration approach can potentially discard important textural information from the face image. One possible improvement could be the use of super-resolution algorithms that learn a prior on the spatial distribution of the image
gradient for frontal images of faces [19]. Another future direction is to extend the proposed approach to real surveillance scenarios in order to restore low quality images. Finally, another area that merits further investigation is the better classification of degraded images. Such an effort will improve the integrity of the overall restoration approach.

ACKNOWLEDGMENT

The authors would like to thank researchers at Colorado State University for their excellent support in using the Face Evaluation Toolkit. They are grateful to Z. Jafri, C. Whitelam, and A. Jagannathan at West Virginia University for their valuable assistance with the experiments.

REFERENCES

[1] T. Bourlai, A. Ross, and A. Jain, "On matching digital face images against scanned passport photos," in Proc. First IEEE Int. Conf. Biometrics, Identity and Security (BIDS), Tampa, FL, Sep. 2009.
[2] P. J. Phillips, W. T. Scruggs, A. J. O'Toole, P. J. Flynn, K. W. Bowyer, C. L. Schott, and M. Sharpe, "FRVT 2006 and ICE 2006 large-scale experimental results," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 5, pp. 831–846, May 2010.
[3] S. K. Zhou, R. Chellappa, and W. Zhao, Unconstrained Face Recognition. New York: Springer, 2006.
[4] R. Floyd and L. Steinberg, "An adaptive algorithm for spatial grey scale," in Proc. Society of Information Display, 1976, vol. 17, pp. 75–77.
[5] Z. Wang and A. C. Bovik, "A universal image quality index," IEEE Signal Process. Lett., vol. 9, no. 3, pp. 81–84, Mar. 2002.
[6] P. J. Phillips, P. J. Flynn, T. Scruggs, K. W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, "Overview of the face recognition grand challenge," in Proc. Computer Vision and Pattern Recognition Conf., Jun. 2005, vol. 1, pp. 947–954.
[7] T. Ahonen, A. Hadid, and M. Pietikinen, "Face recognition with local binary patterns: Application to face recognition," in Proc. Eur. Conf. Computer Vision (ECCV), Jun. 2004, vol. 8, pp. 469–481.
[8] D. S. Bolme, J. R. Beveridge, M. L. Teixeira, and B. A. Draper, "The CSU face identification evaluation system: Its purpose, features and structure," in Proc. Int. Conf. Computer Vision Systems, Apr. 2003, pp. 304–311.
[9] H. C. Andrews and B. R. Hunt, Digital Image Restoration. Englewood Cliffs, NJ: Prentice-Hall, 1977.
[10] M. R. Banham and A. K. Katsaggelos, "Digital image restoration," IEEE Signal Process. Mag., vol. 14, no. 2, pp. 24–41, Aug. 2002. Available: http://dx.doi.org/10.1109/79.581363
[11] J. G. Nagy and D. P. O'Leary, "Restoring images degraded by spatially-variant blur," SIAM J. Sci. Comput., vol. 19, pp. 1063–1082, 1996.
[12] M. Figueiredo and R. Nowak, "An EM algorithm for wavelet-based image restoration," IEEE Trans. Image Process., vol. 12, no. 8, pp. 906–916, Aug. 2003.
[13] J. Bioucas Dias and M. Figueiredo, "A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration," IEEE Trans. Image Process., vol. 16, no. 12, pp. 2992–3004, Dec. 2007.
[14] A. M. Thompson, J. C. Brown, J. W. Kay, and D. M. Titterington, "A study of methods of choosing the smoothing parameter in image restoration by regularization," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 4, pp. 326–339, Apr. 1991.
[15] M. I. Sezan and A. M. Tekalp, "Survey of recent developments in digital image restoration," Opt. Eng., vol. 29, no. 5, pp. 393–404, 1990. Available: http://link.aip.org/link/?JOE/29/393/1
[16] P. H. Hennings-Yeomans, S. Baker, and B. V. Kumar, "Simultaneous super-resolution and feature extraction for recognition of low-resolution faces," in Proc. Computer Vision and Pattern Recognition (CVPR), Jun. 2008, pp. 1–8.
[17] P. H. Hennings-Yeomans, B. V. K. V. Kumar, and S. Baker, "Robust low-resolution face identification and verification using high-resolution features," in Proc. Int. Conf. Image Processing (ICIP), Nov. 2009, pp. 33–36.
[18] W. T. Freeman, T. R. Jones, and E. C. Pasztor, "Example based super-resolution," IEEE Comput. Graph. Applicat., vol. 22, no. 2, pp. 56–65, Mar./Apr. 2002.
[19] S. Baker and T. Kanade, "Hallucinating faces," in Proc. Fourth Int. Conf. Auth. Face and Gesture Rec., Grenoble, France, 2000.
[20] M. Elad and A. Feuer, "Super-resolution reconstruction of image sequences," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 9, pp. 817–834, Sep. 1999.
[21] N. Ramanathan and R. Chellappa, "Face verification across age progression," IEEE Trans. Image Process., vol. 15, no. 11, pp. 3349–3362, Nov. 2006.
[22] V. V. Starovoitov, D. Samal, and B. Sankur, "Matching of faces in camera images and document photographs," in Proc. Int. Conf. Acoustic, Speech, and Signal Processing, Jun. 2000, vol. IV, pp. 2349–2352.
[23] V. V. Starovoitov, D. I. Samal, and D. V. Briliuk, "Three approaches for face recognition," in Proc. Int. Conf. Pattern Recognition and Image Analysis, Oct. 2002, pp. 707–711.
[24] S. K. Mohideen, S. A. Perumal, and M. M. Sathik, "Image de-noising using discrete wavelet transform," Int. J. Comput. Sci. Netw. Security, vol. 8, no. 1, pp. 213–216, Jan. 2008.
[25] Q. Huynh-Thu and M. Ghanbari, "Scope of validity of PSNR in image/video quality assessment," Electron. Lett., vol. 44, no. 13, pp. 800–801, 2008.
[26] D. Donoho and I. Johnstone, "Ideal spatial adaptation via wavelet shrinkage," Biometrika, vol. 81, pp. 425–455, 1994.
[27] R. R. Coifman and D. L. Donoho, "Translation-invariant de-noising," in Wavelets and Statistics. New York: Springer-Verlag, 1994, vol. 103, Springer Lecture Notes, pp. 125–150.
[28] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Trans. Syst., Man, Cybern., vol. SMC-3, no. 6, pp. 610–621, Nov. 1973.
[29] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Trans. Inform. Theory, vol. 13, no. 1, pp. 21–27, Jan. 1967.
[30] E. Fix and J. L. Hodges, Discriminatory Analysis, Nonparametric Discrimination: Consistency Properties, USAF School of Aviation Medicine, Randolph Field, TX, Tech. Rep. 4, 1951.
[31] Setup and Production Guidelines for Passport and Visa Photographs, U.S. Department of State, 2009. Available: http://travel.state.gov/passport/get/get_873.html
[32] P. A. Viola and M. J. Jones, "Robust real-time face detection," Int. J. Comput. Vis., vol. 57, no. 2, pp. 137–154, 2004.
[33] L. Sirovich and M. Kirby, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 1, pp. 103–108, Jan. 1990.
[34] M. Turk and A. Pentland, "Eigenfaces for recognition," J. Cognitive Neurosci., vol. 3, no. 1, pp. 71–86, 1991.
[35] A. P. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. Englewood Cliffs, NJ: Prentice-Hall, 1982.
[36] P. Belhumeur, J. Hespanha, and D. J. Kriegman, "Eigenfaces vs. fisherfaces: Recognition using class specific linear projection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711–720, Jul. 1997.
[37] M. Teixeira, "The Bayesian Intrapersonal/Extrapersonal Classifier," Master's thesis, Colorado State University, Fort Collins, CO, 2003.
[38] L. Wiskott, J.-M. Fellous, N. Kruger, and C. V. D. Malsburg, "Face recognition by elastic bunch graph matching," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 775–779, Jul. 1997.
[39] M. Pietikinen, "Image analysis with local binary patterns," in Proc. Scandinavian Conf. Image Analysis, Jun. 2005, pp. 115–118.