2
Image Enhancement
Spatial Domain: Gray level transformations - Histogram processing - Basics of Spatial Filtering - Smoothing and Sharpening Spatial Filtering. Frequency Domain: Introduction to Fourier Transform - Smoothing and Sharpening frequency domain filters - Ideal, Butterworth and Gaussian filters, Homomorphic filtering, Color image enhancement.

2.1. INTRODUCTION

The principal objective of enhancement is to process an image so that the result is more suitable than the original image for a specific application. Image enhancement approaches fall into two broad categories.
1. Spatial Domain methods
2. Frequency Domain methods

2.2. SPATIAL DOMAIN METHODS BASICS

The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of the pixels in an image.
Spatial domain processes are denoted by the expression
g(x, y) = T[f(x, y)]   ... (2.1)
where
f(x, y) - input image
g(x, y) - processed image
T - an operator on f, defined over some neighborhood of (x, y)
The neighborhood of a point (x, y) can be explained by using a square or rectangular sub-image area centered at (x, y).
Fig. 2.1. 3 x 3 neighborhood about (x, y) in an image


The center of the sub-image is moved from pixel to pixel, starting at the top left corner. The operator T is applied at each location (x, y) to find the output g at that location. The process utilizes only the pixels in the area of the image spanned by the neighborhood.

2.3. BASIC GRAY LEVEL TRANSFORMATION FUNCTION


It is the simplest form of the transformation, occurring when the neighborhood is of size 1 x 1. In this case, g depends only on the value of f at a single point (x, y) and T becomes a gray level transformation function of the form
S = T(r)   ... (2.2)
where
r - denotes the gray level of f(x, y)
S - denotes the gray level of g(x, y) at any point (x, y)
Based on the shape of T(r), there are two categories of techniques used for contrast enhancement. They are,
1. Contrast stretching
2. Thresholding

2.3.1. CONTRAST STRETCHING


It produces an image of higher contrast than the original one.
The operation is
performed by darkening the levels below m and brightening
the levels above m in the
original image.
Fig. 2.2. Intensity Transformation Function for Contrast Stretching


In this technique the values of r below m are compressed by the transformation function into a narrow range of S, towards black. The opposite effect takes place for the values of r above m.

2.3.2. THRESHOLDING FUNCTION


It is a limiting case in which T(r) produces a two-level (binary) image. The values below m are transformed to black and the values above m to white.

Fig. 2.3. Intensity Transformation Function for Thresholding

2.3.3. POINT PROCESSING


Approaches whose results depend only on the intensity at a point are sometimes called point processing techniques. The contrast stretching and thresholding processes are called point processing.



2.4. BASIC GRAY LEVEL TRANSFORMATION


Gray level transformations are the simplest image processing techniques. The value of a pixel before processing is represented as r and after processing as S. These values are related by the expression
S = T(r)
where T is a transformation that maps a pixel value r into a pixel value S.
The basic types of functions used frequently for image enhancement are:
1. Linear (negative and identity transformations)
2. Logarithmic (log and inverse log transformations)
3. Power law (nth power and nth root transformations)
4. Piecewise linear transformation functions.
Fig. 2.4. Basic Intensity (Gray) Level Transformation Functions (negative, log, nth root, identity, inverse log, nth power)

2.4.1. IMAGE NEGATIVES.


The negatiye of an image with gray level in the range [0,
L= 1] is obtained by
using the negative transformation. The expression of the transformation is
S = L-1-r


Reversing the gray levels of an image in this manner produces the equivalent of a photographic negative. This type of processing is particularly suited for enhancing white or gray detail embedded in dark regions of an image, especially when the black areas are dominant in size.
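As an illustration, here is a minimal NumPy sketch of the negative transformation for an 8-bit image (L = 256); the array name img is only an assumed placeholder for the input image.

```python
import numpy as np

def negative(img, L=256):
    """Apply the negative transformation S = L - 1 - r to a gray-level image."""
    img = np.asarray(img, dtype=np.int32)    # avoid unsigned underflow
    return (L - 1 - img).astype(np.uint8)

# Example: invert a small 8-bit test image
img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(negative(img))   # [[255 191] [127   0]]
```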

2.4.2. LOG TRANSFORMATIONS


The general form of the log transformation is
S = c log(1 + r)
where c is a constant and r >= 0.
This transformation maps a narrow range of gray level values in the input image
into a wider range of output gray levels. The opposite is true for higher values of
input levels.
Log transformation can be used to expand the values of dark pixels in an image
while compressing the higher level values. The opposite is true for inverse log
transformation.
The log transformation function has an important characteristic: it compresses the dynamic range of images with large variations in pixel values. Example: the Fourier spectrum.
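A minimal sketch of the log transformation follows; choosing the constant c so that the maximum input maps to the full 8-bit range is a common convention assumed here, not something stated in the text.

```python
import numpy as np

def log_transform(img, L=256):
    """S = c*log(1 + r), with c scaled so the maximum input maps to L-1."""
    img = img.astype(np.float64)             # assumes img.max() > 0
    c = (L - 1) / np.log(1 + img.max())
    return (c * np.log(1 + img)).astype(np.uint8)
```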

2.4.3. POWER LAW (GAMMA) TRANSFORMATIONS

Power law transformations have the basic form
S = c r^γ
where c and γ are positive constants.


Power law curves with fractional values of γ map a narrow range of dark input values into a wider range of output values, with the opposite being true for higher values of input levels. We may get various curves by varying the value of γ.

Gamma Correction
A variety of devices used for image capture, printing, and display respond according to a power law. By convention, the exponent in the power law equation is referred to as gamma. The process used to correct this power-law response phenomenon is called gamma correction.
Fig. 2.5. Plots of the equation S = c r^γ for various values of γ (c = 1 in all cases)
Example
CRT devices have an intensity-to-voltage response that is a power function. Gamma correction is important if displaying an image accurately on a computer screen is of concern. Images that are not corrected properly can look either bleached out or too dark. Color display also uses this concept of gamma correction.
The gamma correction concept is becoming more popular due to the use of images over the internet. It is also important in general purpose contrast manipulation. To make an image darker we use γ > 1, and γ < 1 to make it brighter.

Uses of Gamma Correction


1. It is important to display an image accurately on a computer screen.

2. It is used to reproduce colors accurately.
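A minimal sketch of the power-law (gamma) transformation is given below; normalizing intensities to [0, 1] before applying the exponent is an assumption made for convenience.

```python
import numpy as np

def gamma_transform(img, gamma, c=1.0, L=256):
    """S = c * r**gamma, applied on intensities normalized to [0, 1]."""
    r = img.astype(np.float64) / (L - 1)
    s = c * np.power(r, gamma)
    return np.clip(s * (L - 1), 0, L - 1).astype(np.uint8)

# gamma > 1 darkens the image, gamma < 1 brightens dark regions
```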




2.4.4. PIECEWISE LINEAR TRANSFORMATION FUNCTIONS
It is a complementary approach to image negatives, log transformations and power law transformations. These functions can be arbitrarily complex.

Disadvantage
Their specification requires considerably more user input.
Types
1. Contrast stretching
2. Gray-level slicing
3. Bit-plane slicing

2.4.4.1. Contrast Stretching
It is the simplest piecewise linear transformation function. We can have low contrast images because of poor illumination, problems in the imaging sensor, or a wrong setting of the lens aperture during image acquisition.
Fig. 2.6. Transformation Function for Contrast Stretching, with control points (r1, s1) and (r2, s2)


Contrast stretching is a process that expands the range of intensity levels in an image so that it spans the full intensity range of the recording medium or display device.

The idea behind contrast stretching is to increase the dynamic range of gray levels
in the image being processed.
The locations of the points (r1, s1) and (r2, s2) control the shape of the curve.
a. If r1 = s1 and r2 = s2, the transformation is a linear function that produces no change in gray levels.
b. If r1 = r2, s1 = 0, and s2 = L - 1, the transformation becomes a thresholding function that creates a binary image.
c. Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the gray values of the output image, thus affecting its contrast.
Generally r1 <= r2 and s1 <= s2, so that the function is single valued and monotonically increasing.
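A minimal sketch of piecewise linear contrast stretching with control points (r1, s1) and (r2, s2); np.interp handles the three linear segments, and the assumption r1 <= r2 from the text is required for a valid mapping.

```python
import numpy as np

def contrast_stretch(img, r1, s1, r2, s2, L=256):
    """Piecewise linear mapping through (0,0), (r1,s1), (r2,s2), (L-1,L-1)."""
    r = img.astype(np.float64)
    s = np.interp(r, [0, r1, r2, L - 1], [0, s1, s2, L - 1])
    return s.astype(np.uint8)
```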

2.4.4.2. Gray Level Slicing


Gray level slicing is the process of highlighting a specific range of gray levels in an image, for example when enhancing features such as masses of water in satellite images or enhancing flaws in X-ray images.
There are two ways of doing this:
1. The first method is to display a high value for all gray levels in the range of interest and a low value for all other gray levels.
Fig. 2.7. Gray Level Slicing - 1st approach


2. The second method is to brighten the desired range of gray levels but preserve the background and the other gray levels unchanged in the image.
Fig. 2.8. Gray Level Slicing - 2nd approach
2.4.4.3. Bit Plane Slicing
Sometimes it is important to highlight the contribution made to the total image appearance by specific bits. For example, consider the case where each pixel is represented by 8 bits.
Imagine that the image is composed of eight 1-bit planes, ranging from bit plane 0 for the least significant bit to bit plane 7 for the most significant bit. In terms of 8-bit bytes, plane 0 contains all the lowest order bits in the image and plane 7 contains all the high order bits.
Fig. 2.9. Bit-plane representation of an 8-bit image (bit plane 7 most significant, bit plane 0 least significant)



Higher order bits contain the majority of visually significant data, while lower order bits contribute to more subtle details in the image.
Separating a digital image into its bit planes is useful for analyzing the relative importance played by each bit of the image.
It helps in determining the adequacy of the number of bits used to quantize each pixel. It is also useful for image compression.
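A minimal sketch of extracting a single bit plane from an 8-bit image with bitwise operations:

```python
import numpy as np

def bit_plane(img, k):
    """Return bit plane k (0 = LSB, 7 = MSB) of an 8-bit image as a 0/1 array."""
    return (img >> k) & 1

# Keeping only the top planes preserves most of the visually significant data:
# reconstructed = sum(bit_plane(img, k) << k for k in range(4, 8))
```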

2.5. HISTOGRAM PROCESSING


Histograms are the basis for numerous spatial domain processing techniques. Histogram manipulation can be used for image enhancement.
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function of the form
h(r_k) = n_k
where
r_k is the kth intensity value
n_k is the number of pixels in the image with intensity r_k
Normalized Histogram
A normalized histogram can be obtained by dividing each of its components by the total number of pixels in the image. A normalized histogram is given by the equation
P(r_k) = n_k / MN,   for k = 0, 1, 2, ..., L-1
where MN - total number of pixels in the image
M - row dimension of the image
N - column dimension of the image
P(r_k) gives an estimate of the probability of occurrence of gray level r_k. The sum of all components of a normalized histogram is equal to 1.

Histograms may be viewed graphically simply as plots of h(r_k) = n_k versus r_k, or of P(r_k) = n_k/MN versus r_k.
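A minimal sketch of computing a normalized histogram with NumPy; np.bincount counts the pixels at each intensity level.

```python
import numpy as np

def normalized_histogram(img, L=256):
    """Return P(r_k) = n_k / MN for k = 0 .. L-1."""
    counts = np.bincount(img.ravel(), minlength=L)
    return counts / img.size   # components sum to 1
```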


In a dark image the components of the histogram are concentrated on the low (dark) side of the gray scale. In the case of a bright image the histogram components are biased towards the high side of the gray scale.
The histogram of a low contrast image will be narrow and will be centered towards the middle of the gray scale.
The components of the histogram of a high contrast image cover a broad range of the gray scale.

2.5.1. HISTOGRAM EQUALIZATION


Histogram equalization is a common technique for enhancing the appearance of images. Suppose we have an image which is predominantly dark. Then its histogram would be skewed towards the lower end of the gray scale and all the image details would be compressed into the dark end of the histogram.
If we could stretch out the gray levels at the dark end to produce a more uniformly distributed histogram, then the image would become much clearer.
Let r represent the gray levels of the image to be enhanced, treated as a continuous quantity. The range of r is [0, 1], with r = 0 representing black and r = 1 representing white.
The transformation function is of the form
S = T(r),   0 <= r <= L-1
It produces an output gray level S for every pixel in the input image having intensity r. The transformation function is assumed to fulfill two conditions:
(a) T(r) is a monotonically increasing function in the interval 0 <= r <= L-1
(b) 0 <= T(r) <= L-1 for 0 <= r <= L-1

A gray level transformation function that satisfies conditions (a) and (b) is shown in figure 2.11 below.

Fig. 2.10. Four basic image types: dark, light, low contrast, high contrast, and their corresponding histograms.


Fig. 2.11. A gray level transformation function satisfying conditions (a) and (b)

The inverse transformation from S back to r is denoted by
r = T^{-1}(S),   0 <= S <= L-1
A monotonic transformation function performs a one-to-one or many-to-one mapping. The transformation function should be single valued so that the inverse transformation exists. The monotonically increasing condition preserves the increasing order from black to white in the output image. The second condition guarantees that the output gray levels will be in the same range as the input levels.

Probability Density Function (PDF)


The gray levels of the image may be viewed as random variables in the interval [0, 1]. The most fundamental descriptor of a random variable is its probability density function (PDF).
Let P_r(r) and P_s(s) denote the probability density functions of the random variables r and s respectively. A basic result from elementary probability theory states that if P_r(r) and T(r) are known and T(r) satisfies condition (a), then the probability density function P_s(s) of the transformed variable s is given by the formula
P_s(s) = P_r(r) |dr/ds|   ... (2.3)

Thus the PDF of the transformed variable s is determined by the gray level PDF of the input image and by the chosen transformation function.
A transformation function of particular importance in image processing has the form

S = T(r) = (L-1) ∫_0^r P_r(w) dw   ... (2.4)
where w is a dummy variable of integration.
The right side of the above equation is recognized as the cumulative distribution function (CDF) of the random variable r.
Using this definition of T, we see that the derivative of S with respect to r is
dS/dr = (L-1) P_r(r)   ... (2.5)
Substituting this back into the expression for P_s, we get
P_s(S) = P_r(r) |dr/dS| = P_r(r) · 1/[(L-1) P_r(r)] = 1/(L-1)   ... (2.6)
An important point here is that although T(r) depends on P_r(r), the resulting P_s(S) is always uniform, independent of the form of P_r(r).
The probability of occurrence of gray level r_k in an image is approximated by
P_r(r_k) = n_k / MN,   k = 0, 1, 2, ..., L-1   ... (2.7)
where
MN is the total number of pixels in the image
n_k is the number of pixels that have intensity r_k
L is the number of possible intensity levels in the image.
The discrete transformation function is given by
S_k = T(r_k) = (L-1) Σ_{j=0}^{k} n_j / MN = (L-1) Σ_{j=0}^{k} P_r(r_j)   ... (2.8)

Thus a processed (output) image is obtained by mapping each pixel with level r_k in the input image into a corresponding pixel with level S_k in the output image.
A plot of P_r(r_k) versus r_k is called a histogram. The transformation function given by the above equation is called histogram equalization or histogram linearization. Equalization automatically determines a transformation function that seeks to produce

an output image that has a uniform histogram. It is a good approach when automatic
enhancement is needed.
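A minimal sketch of discrete histogram equalization following equation (2.8); each level r_k is mapped to S_k = (L-1) times the cumulative histogram at r_k.

```python
import numpy as np

def histogram_equalization(img, L=256):
    """Map each gray level through the scaled cumulative histogram (eq. 2.8)."""
    counts = np.bincount(img.ravel(), minlength=L)
    cdf = np.cumsum(counts) / img.size             # cumulative sum of P_r(r_j)
    s = np.round((L - 1) * cdf).astype(np.uint8)   # S_k = (L-1) * CDF(r_k)
    return s[img]                                  # apply the mapping per pixel
```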

2.5.2. HISTOGRAM MATCHING (SPECIFICATION)


In some cases it may be desirable to specify the shape of the histogram that we wish the processed image to have.
Histogram equalization does not allow interactive image enhancement and generates only one result, an approximation to a uniform histogram. Sometimes we need to be able to specify particular histogram shapes capable of highlighting certain gray-level ranges. The method used to generate a processed image that has a specified histogram is called histogram matching or histogram specification.

Algorithm
Step 1: Compute S_k = P_f(k), k = 0, 1, ..., L-1, the cumulative normalized histogram of f.
Step 2: Compute G(k), k = 0, 1, ..., L-1, the transformation function, from the given (specified) histogram h_z.
Step 3: Compute G^{-1}(S_k) for each k = 0, 1, ..., L-1 using an iterative method, or in effect directly compute G^{-1}(P_f(k)).
Step 4: Transform f using G^{-1}(P_f(k)).
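A minimal NumPy sketch of histogram specification: both cumulative histograms are computed and each input level is mapped to the specified level with the closest CDF value, a common discrete approximation of the G^{-1} step above. The argument specified_hist (a length-L array of desired counts or probabilities) is an assumed name.

```python
import numpy as np

def histogram_match(img, specified_hist, L=256):
    """Map gray levels of img so its histogram approximates specified_hist."""
    cdf_f = np.cumsum(np.bincount(img.ravel(), minlength=L)) / img.size
    cdf_g = np.cumsum(specified_hist) / np.sum(specified_hist)
    # For each level k, find the specified level whose CDF is closest to cdf_f[k]
    mapping = np.array([np.argmin(np.abs(cdf_g - c)) for c in cdf_f],
                       dtype=np.uint8)
    return mapping[img]
```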

2.5.3. LOCAL HISTOGRAM PROCESSING (LOCAL ENHANCEMENT)


In the earlier methods, pixels were modified by a transformation function based on the gray level distribution of the entire image. This is not suitable when enhancement is to be done in some small areas of the image. This problem can be solved by local histogram processing, where a transformation function is applied only in the neighborhood of pixels in the region of interest.
Define a square or rectangular neighborhood (mask) and move its center from pixel to pixel. For each neighborhood:

1. Calculate the histogram of the points in the neighborhood

2. Obtain histogram equalization/specification function


3. Map gray level of pixel centered in neighborhood.

The center of the neighborhood region is then moved to an adjacent pixel location and the procedure is repeated.
2.5.4. USING HISTOGRAM STATISTICS FOR IMAGE ENHANCEMENT
Statistics can be obtained directly from an image histogram and can be used for image enhancement.
Let r denote a discrete random variable representing intensity values in the range [0, L-1] and let P(r_i) denote the normalized histogram component corresponding to value r_i. The nth moment of r about its mean is defined as
μ_n(r) = Σ_{i=0}^{L-1} (r_i - m)^n P(r_i)   ... (2.9)
where m is the mean (average intensity) value of r:
m = Σ_{i=0}^{L-1} r_i P(r_i)   ... (2.10)

The second moment is particularly important:
μ_2(r) = Σ_{i=0}^{L-1} (r_i - m)² P(r_i)   ... (2.11)
Equation 2.11 is recognized as the intensity variance and is denoted by σ²; the standard deviation is the square root of the variance.
σ² = Σ_{i=0}^{L-1} (r_i - m)² P(r_i)   ... (2.12)
The mean is a measure of average intensity, and the variance (or standard deviation) is a measure of contrast in an image.
When working with only the mean and variance, it is easy to estimate them directly from the sample values, without computing the histogram. These estimates are called the sample mean and sample variance, shown in the equations below.
m = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y)   ... (2.13)

and
σ² = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f(x, y) - m]²   ... (2.14)
for x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1
The global mean and variance are computed over an entire image and are useful for gross adjustments in overall intensity and contrast.
The local mean and variance are used as the basis for making changes that depend on image characteristics in a neighborhood about each pixel in an image.

Let (x, y) denote the coordinates of any pixel in a given image, and let S_xy denote a neighborhood (sub-image) of specified size, centered on (x, y). The mean value of the pixels in this neighborhood is defined as
m_Sxy = Σ_{i=0}^{L-1} r_i P_Sxy(r_i)   ... (2.15)
where P_Sxy is the histogram of the pixels in region S_xy. The variance of the pixels in the neighborhood is given by
σ²_Sxy = Σ_{i=0}^{L-1} (r_i - m_Sxy)² P_Sxy(r_i)   ... (2.16)
The local mean is a measure of average intensity in neighborhood S_xy, and the local variance is a measure of intensity contrast in that neighborhood.
Let f(x, y) represent the value of an image at any image coordinates (x, y), and let g(x, y) represent the corresponding enhanced value at those coordinates. Then
g(x, y) = E · f(x, y)   if m_Sxy <= k0 · m_G  and  k1 · σ_G <= σ_Sxy <= k2 · σ_G
g(x, y) = f(x, y)        otherwise   ... (2.17)
for x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1
where
E, k0, k1 and k2 are specified parameters
m_G - global mean of the input image

σ_G - global standard deviation of the input image
m_Sxy and σ_Sxy are the local mean and standard deviation.
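A minimal sketch of enhancement based on local statistics (equation 2.17), using uniform filters to estimate the local mean and variance; the parameter values E, k0, k1, k2 and the window size are illustrative assumptions only.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_stats_enhance(img, E=4.0, k0=0.4, k1=0.02, k2=0.4, size=3):
    """Multiply dark, low-contrast neighborhoods by E (eq. 2.17)."""
    f = img.astype(np.float64)
    m_g, s_g = f.mean(), f.std()                    # global mean and std
    m_loc = uniform_filter(f, size)                 # local mean
    s_loc = np.sqrt(np.maximum(uniform_filter(f**2, size) - m_loc**2, 0))
    cond = (m_loc <= k0 * m_g) & (k1 * s_g <= s_loc) & (s_loc <= k2 * s_g)
    return np.where(cond, np.clip(E * f, 0, 255), f).astype(np.uint8)
```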

2.6. BASICS OF SPATIAL FILTERING


Spatial filtering is one of the principal tools used in a broad spectrum of applications, so understanding the basics of spatial filtering is important. Filtering refers to accepting or rejecting certain frequency components. A filter that passes low frequencies is called a low pass filter. The net effect produced by a low pass filter is to smooth an image.

2.6.1. THE MECHANICS OF SPATIAL FILTERING


Spatial filtering is an example of a neighborhood operation; the operations are performed on the values of the image pixels in the neighborhood and the corresponding values of a sub-image that has the same dimensions as the neighborhood.
This sub-image is called a filter, mask, kernel, template or window. The values in the filter sub-image are referred to as coefficients rather than pixels. Spatial filtering operations are performed directly on the pixel values (amplitude/gray scale) of the image.

Linear Spatial Filter

If the operation performed on the image pixels is linear then the filter is called as
linear spatial filter, otherwise the filter is nonlinear.
Fig. 2.12. The mechanics of linear spatial filtering using a 3 x 3 filter mask, showing the image f(x, y), the mask coefficients w(-1,-1) ... w(1,1) with their coordinate arrangement, and the pixels of the image section under the mask
Figure 2.12 above shows the mechanics of linear spatial filtering using a 3 x 3 neighborhood. The process consists of moving the filter mask from point to point in the image. At each point (x, y) the response is calculated using a predefined relationship.
For linear spatial filtering the response is given by a sum of products of the filter coefficients and the corresponding image pixels in the area spanned by the filter mask.

The result R of linear filtering with the filter mask at point (x, y) in the image is
R = w(-1,-1) f(x-1, y-1) + w(-1, 0) f(x-1, y) + ... + w(0, 0) f(x, y) + ... + w(1, 0) f(x+1, y) + w(1, 1) f(x+1, y+1)   ... (2.18)
The response is the sum of products of the mask coefficients with the corresponding pixels directly under the mask. The coefficient w(0, 0) coincides with the image value f(x, y), indicating that the mask is centered at (x, y) when the computation of the sum of products takes place.
For a mask of size m x n we assume m = 2a + 1 and n = 2b + 1, where a and b are non-negative integers. This means that all masks are of odd size.
In general, linear filtering of an image f of size M x N with a filter mask of size m x n is given by the expression

g(x, y) = Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x+s, y+t)   ... (2.19)
where x and y are varied so that each pixel in w visits every pixel in f, and
a = (m-1)/2,   b = (n-1)/2
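A minimal sketch of equation (2.19): an m x n mask slides over a zero-padded image and the sum of products is computed at each location.

```python
import numpy as np

def linear_filter(f, w):
    """Correlate mask w (odd m x n) with image f, zero padding the border (eq. 2.19)."""
    m, n = w.shape
    a, b = (m - 1) // 2, (n - 1) // 2
    fp = np.pad(f.astype(np.float64), ((a, a), (b, b)))
    g = np.zeros_like(f, dtype=np.float64)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])
    return g
```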
2.6.2. SPATIAL CORRELATION AND CONVOLUTION
we must have basic knowledge about
To perform Linear Spatial Filtering
correlation and convolution.
over the image and computing
Correlation is the process of moving a filter mask
the sum of products at each location.
concept called
The process of linear filtering is similar to frequency domain
as convoluting a
convolution. For this reason, linear spatial filtering is referred to
mask with an image. Filter mask are sometimes called convolution mark.
Convolution also performs the same mechanism of correlation except that, n
convolution the filter is first rotated by 180°.
Fig. 2.13. Illustration of 1-D correlation and convolution of a filter with a discrete unit impulse, showing the starting position alignment, zero padding, the positions after one and four shifts, the final position, and the full and cropped correlation and convolution results
An important point in implementing correlation and convolution is that there are parts of the functions that do not overlap. The solution to this problem is:
1. Pad f with enough 0s on either side to allow each pixel in w to visit every pixel in f.
2. If the filter is of size m, we need m - 1 0s on either side of f.
Fig. 2.14. Correlation (middle row) and convolution (last row) of a 2-D filter with a 2-D discrete unit impulse, showing the padded f, the initial position for w, the rotated w, and the full and cropped results. The 0s are shown in gray to simplify visual analysis.
The correlation of a filter w(x, y) of size m x n with an image f(x, y) is given by
Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x+s, y+t)   ... (2.20)
In a similar manner, the convolution of w(x, y) and f(x, y) is given by
Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x-s, y-t)   ... (2.21)
where the minus signs on the right flip f (i.e., rotate it by 180°).

2.6.3. VECTOR REPRESENTATION OF LINEAR FILTERING
When interest lies in the characteristic response R of a mask, either for correlation or convolution, we can write the sum of products as
R = w_1 z_1 + w_2 z_2 + ... + w_mn z_mn = Σ_{k=1}^{mn} w_k z_k = w^T z   ... (2.22)
where the w's are the mask coefficients, the z's are the values of the image gray levels corresponding to those coefficients, and mn is the total number of coefficients in the mask.
w1 w2 w3
w4 w5 w6
w7 w8 w9

Fig. 2.15. Representation of a general 3 x 3 filter mask


For the 3 x 3 general mask shown in figure 2.15 above, the response at any point (x, y) in the image is given by
R = w_1 z_1 + w_2 z_2 + ... + w_9 z_9 = Σ_{k=1}^{9} w_k z_k = w^T z   ... (2.23)
where w and z are 9-dimensional vectors formed from the coefficients of the mask and the image intensities encompassed by the mask, respectively.

2.6.4. GENERATING SPATIAL FILTER MASKS


Generating an m x n linear spatial filter requires mn mask coefficients. These
coefficients are selected based on what the filter is supposed to do.
The average value at any location (x, y) in the image is the sum of the nine intensity values in the 3 x 3 neighborhood centered on (x, y), divided by 9. Letting z_i, i = 1, 2, ..., 9, denote these intensities, the average is
R = (1/9) Σ_{i=1}^{9} z_i   ... (2.24)
This is the same as equation 2.23 with coefficient values w_i = 1/9.

Gaussian Function
A Gaussian function of two variables has the basic form
h(x, y) = e^{-(x² + y²)/(2σ²)}
where σ is the standard deviation and x and y are coordinates.

Non-Linear Filter
Generating a nonlinear filter requires the following information:
1. The size of the neighborhood
2. The operation(s) to be performed on the image pixels
Non-linear filters are quite powerful, and in some applications they can perform functions that are beyond the capabilities of linear filters.
An important point in implementing neighborhood operations for spatial filtering is the issue of what happens when the center of the filter approaches the border of the image. There are several ways to handle this situation:
1. Limit the excursion of the center of the mask to be at a distance of no less than (n-1)/2 pixels from the border. The resulting filtered image will be smaller than the original, but all the pixels will be processed with the full mask.
2. Filter all pixels only with the section of the mask that is fully contained in the image. This creates bands of pixels near the border that are processed with a partial mask.
3. Pad the image by adding rows and columns of 0s, or by replicating rows and columns. The padding is removed at the end of the process.

2.7. SMOOTHING SPATIAL FILTERING


Smoothing filters are used for blurring and for noise reduction.

Blurring
It is used in preprocessing tasks, such as the removal of small details from an image prior to (large) object extraction, and the bridging of small gaps in lines or curves.

Noise Reduction
It can be accomplished by blurring with a linear filter and also by nonlinear filtering.
2.7.1. SMOOTHING BY LINEAR FILTERS
The output of a smoothing linear spatial filter is simply the average of the pixels contained in the neighborhood of the filter mask. These filters are also called averaging filters or low pass filters.

Operation
The operation is performed by replacing the value of every pixel in the image by the average of the gray levels in the neighborhood defined by the filter mask. This process reduces sharp transitions in gray levels in the image.

Types
1. Box Filter
2. Weighted average filter

2.7.1.1. Box Filter


A spatial averaging filter in which all coefficients are equal is called a box filter. An example of such a filter is shown in figure 2.16 below.
1 1 1
1 1 1
1 1 1
Sum of all the coefficients = 1+1+1+1+1+1+1+1+1 = 9
Fig. 2.16. Box Filter
Figure 2.16 above shows a 3 x 3 smoothing filter; use of this box filter yields the standard average of the pixels under the mask. The mask response for the box filter is

R = (1/9) Σ_{i=1}^{9} z_i   ... (2.25)
The average value needs to be computed, and a normalizing constant (1/9 in this case) is multiplied with the filter mask.
The denominator of this constant is equal to the sum of all coefficient values of the mask. An m x n mask would have a normalizing constant equal to 1/mn.
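A minimal sketch of smoothing with a normalized n x n box mask using SciPy's convolution; the kernel values follow figure 2.16.

```python
import numpy as np
from scipy.ndimage import convolve

def box_smooth(img, n=3):
    """Smooth with an n x n box filter: all coefficients 1, normalized by 1/n**2."""
    w = np.ones((n, n)) / (n * n)
    return convolve(img.astype(np.float64), w, mode='nearest').astype(np.uint8)
```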
2.7.1.2. Weighted Average Filter
A weighted average filter is one in which pixels are multiplied by different coefficients.
1 2 1
2 4 2
1 2 1
Sum of all the coefficients = 1+2+1+2+4+2+1+2+1 = 16
Fig. 2.17. Weighted Average Filter (used with the normalizing constant 1/16)
Here, the pixel at the center of the mask is multiplied by a higher value than any other, thus giving this pixel more importance in the calculation of the average. The other pixels are inversely weighted as a function of their distance from the center of the mask.
The diagonal terms are further away from the center than the orthogonal neighbors (by a factor of √2) and thus are weighted less than the immediate neighbors of the center pixel.
The general implementation for filtering an M x N image with a weighted averaging filter of size m x n is given by the expression
g(x, y) = [Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x+s, y+t)] / [Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t)]   ... (2.26)


The denominator of above equation is a sum of the mask coefficients and therefore
it is a constant that needs to be computed only once.

Applications
1. Noise Reduction
2. Smoothing of false contours i.e., outlines
3. Irrelevant details can also be removed by these kinds of filters; irrelevant means details that are not of interest.

2.7.2. SMOOTHING BY NON-LINEAR FILTERS


These types of non-linear spatial filters are also known as order statistic filters, because their response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter and then replacing the value of the center pixel with the value determined by the ranking result.
The best example of this category is the median filter. In this filter the value of the center pixel is replaced by the median of the gray levels in the neighborhood of that pixel. Median filters are quite popular because, for certain types of random noise, they provide excellent noise-reduction capabilities, with considerably less blurring than linear smoothing filters.
These filters are particularly effective in the case of impulse or salt-and-pepper noise, called so because of its appearance as white and black dots superimposed on an image. The median ξ of a set of values is such that half the values in the set are less than or equal to ξ and half are greater than or equal to ξ.
In order to perform median filtering at a point in an image, we first sort the values of the pixel in question and its neighbors, determine their median, and assign this value to that pixel.
Order-statistics filters are spatial filters whose response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter. The response of the filter at any point is determined by the ranking result.

2.7.2.1. Median Filter


The best known order statistics filter is the median filter, which, as its name implies, replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel:
f̂(x, y) = median_{(s,t) ∈ S_xy} {g(s, t)}
The original value of the pixel is included in the computation of the median.
Median filters are quite popular because, for certain types of random noise, they provide excellent noise reduction capabilities, with considerably less blurring than linear smoothing filters of similar size.
Median filters are particularly effective in the presence of both bipolar and unipolar impulse noise. In fact, the median filter yields excellent results for images corrupted by this type of noise.
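A minimal sketch of a 3 x 3 median filter using SciPy; each pixel is replaced by the median of its neighborhood, which is effective against salt-and-pepper noise.

```python
import numpy as np
from scipy.ndimage import median_filter

def median_smooth(img, size=3):
    """Replace each pixel by the median of its size x size neighborhood."""
    return median_filter(img, size=size)

# Equivalent by hand for one interior pixel: np.median(img[x-1:x+2, y-1:y+2])
```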

2.7.2.2. Max and Min filter


Although the median filter is by far the order statistics filter most used in image processing, it is by no means the only one. The median represents the 50th percentile of a ranked set of numbers, but recall from basic statistics that ranking lends itself to many other possibilities. For example, using the 100th percentile results in the so-called max filter, given by
f̂(x, y) = max_{(s,t) ∈ S_xy} {g(s, t)}
This filter is useful for finding the brightest points in an image. Also, because pepper noise has very low values, it is reduced by this filter as a result of the max selection process in the sub-image area. The 0th percentile filter is the min filter.

2.8. SHARPENING SPATIAL FILTERS


The principal objective of sharpening is to highlight fine details in an image or to enhance details that have been blurred, either in error or as a natural effect of a particular method of image acquisition.
The applications of image sharpening range from electronic printing and medical imaging to industrial inspection and autonomous guidance in military systems.
As smoothing can be achieved by integration, sharpening can be achieved by spatial differentiation. The strength of the response of a derivative operator is proportional to the degree of discontinuity of the image at the point at which the operator is applied. Thus, image differentiation enhances edges and other discontinuities and deemphasizes areas with slowly varying gray levels.

It is common practice to approximate the magnitude of the gradient by using absolute values instead of squares and square roots.

2.8.1. FIRST ORDER DERIVATIVE


The requirements of a first order derivative are:
1. Must be zero in areas of constant intensity
2. Must be non-zero at the onset of an intensity step or ramp
3. Must be non-zero along ramps.
A basic definition of the first order derivative of a one dimensional function f(x) is the difference
∂f/∂x = f(x+1) - f(x)   ... (2.27)
2.8.2. SECOND ORDER DERIVATIVE
The requirements of a second order derivative are:
1. Must be zero in areas of constant intensity
2. Must be non-zero at the onset and end of an intensity step or ramp
3. Must be zero along ramps of constant slope.
We define the second order derivative of f(x) as the difference
∂²f/∂x² = f(x+1) + f(x-1) - 2f(x)   ... (2.28)

2.8.3. USING THE SECOND DERIVATIVE FOR IMAGE SHARPENING


- THE LAPLACIAN
Isotropic filters are the most commonly used filters. They are rotation invariant because their response is independent of the direction of the discontinuities in the image to which the filter is applied.
The simplest isotropic derivative operator is the Laplacian, and the second order derivative is calculated using it. The Laplacian of a two dimensional function f(x, y) is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²   ... (2.29)
The partial second order derivative in the x-direction is



∂²f/∂x² = f(x+1, y) + f(x-1, y) - 2f(x, y)   ... (2.30)
and similarly in the y-direction
∂²f/∂y² = f(x, y+1) + f(x, y-1) - 2f(x, y)   ... (2.31)
The digital implementation of the two-dimensional Laplacian is obtained by summing the two components:
∇²f = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y)   ... (2.32)
The equation can be implemented using any one of the following masks.

 0  1  0        1  1  1
 1 -4  1        1 -8  1
 0  1  0        1  1  1

 0 -1  0       -1 -1 -1
-1  4 -1       -1  8 -1
 0 -1  0       -1 -1 -1

Fig. 2.18. Filter masks used to implement equation (2.32)


The Laplacian highlights gray-level discontinuities in an image and deemphasizes regions with slowly varying gray levels. This makes the background of the Laplacian image black. The background texture can be recovered by adding the original and Laplacian images.
The basic way in which we use the Laplacian for image sharpening is
g(x, y) = f(x, y) + c [∇²f(x, y)]   ... (2.33)
where f(x, y) and g(x, y) are the input and sharpened images respectively.
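A minimal sketch of Laplacian sharpening (equation 2.33) using the center -4 mask of figure 2.18; with that mask the constant c is negative (c = -1 here), an assumption consistent with the standard formulation.

```python
import numpy as np
from scipy.ndimage import convolve

def laplacian_sharpen(img, c=-1.0):
    """g = f + c * Laplacian(f), using the 3 x 3 mask with -4 at the center."""
    lap_mask = np.array([[0, 1, 0],
                         [1, -4, 1],
                         [0, 1, 0]], dtype=np.float64)
    f = img.astype(np.float64)
    lap = convolve(f, lap_mask, mode='nearest')
    return np.clip(f + c * lap, 0, 255).astype(np.uint8)
```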


2.8.4. UNSHARP MASKING AND HIGH BOOST FILTERING
Unsharp masking means subtracting a blurred version of an image from the original image. Unsharp masking consists of the following steps:
1. Blur the original image.
2. Subtract the blurred image from the original (the difference is called the mask).
3. Add the mask to the original.
Let f̄(x, y) denote the blurred image. Unsharp masking is expressed in equation form as follows. First we obtain the mask
S_mask(x, y) = f(x, y) - f̄(x, y)   ... (2.34)
Then we add a weighted portion of the mask back to the original image:
g(x, y) = f(x, y) + k · S_mask(x, y)   ... (2.35)
When k = 1 we have unsharp masking; when k > 1 the process is referred to as high boost filtering.

Example

Fig. 2.19. Illustration of the mechanics of unsharp masking: (a) original signal, (b) blurred signal with the original shown dashed for reference, (c) unsharp mask, (d) sharpened signal obtained by adding (c) to (a)
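A minimal sketch of unsharp masking and high boost filtering (equations 2.34 and 2.35); using a Gaussian blur for the smoothing step is an assumption, since the text does not fix the blurring method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img, k=1.0, sigma=2.0):
    """g = f + k * (f - blurred); k = 1 is unsharp masking, k > 1 is high boost."""
    f = img.astype(np.float64)
    blurred = gaussian_filter(f, sigma=sigma)
    mask = f - blurred                                       # eq. (2.34)
    return np.clip(f + k * mask, 0, 255).astype(np.uint8)   # eq. (2.35)
```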
2.8.5. USING FIRST ORDER DERIVATIVES FOR (NONLINEAR) IMAGE SHARPENING - THE GRADIENT
First order derivatives in image processing are implemented using the magnitude of the gradient. For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the two dimensional column vector

∇f = grad(f) = [g_x, g_y]^T = [∂f/∂x, ∂f/∂y]^T   ... (2.36)
This vector has the important geometrical property that it points in the direction of the greatest rate of change of f at location (x, y).
The magnitude (length) of vector ∇f is denoted as M(x, y), where
M(x, y) = mag(∇f) = [g_x² + g_y²]^{1/2}   ... (2.37)
is the value at (x, y) of the rate of change in the direction of the gradient vector. M(x, y) is an image of the same size as the original, created when x and y are allowed to vary over all pixel locations in f. This image is referred to as the gradient image. The magnitude is commonly approximated with absolute values:
M(x, y) ≈ |g_x| + |g_y|   ... (2.38)
First derivatives are implemented using the magnitude of the gradient in image processing. Consider the intensities of image points in a 3 x 3 region:

f(x-1, y-1) : z1     f(x-1, y) : z2     f(x-1, y+1) : z3
f(x, y-1)   : z4     f(x, y)   : z5     f(x, y+1)   : z6
f(x+1, y-1) : z7     f(x+1, y) : z8     f(x+1, y+1) : z9

First order derivatives that satisfy the stated conditions are
g_x = (z8 - z5)   and   g_y = (z6 - z5)   ... (2.39)
Two other definitions, proposed by Roberts in the early development of digital image processing, use cross differences:
g_x = (z9 - z5)   and   g_y = (z8 - z6)   ... (2.40)


There are two important operators used to compute the gradient. They are
1. Roberts cross gradient operators
2. Sobel operators

Roberts Cross Gradient Operators
The cross differences of equation (2.40) above can be implemented using the two linear filter masks shown in the figure below. These masks are referred to as the Roberts cross gradient operators.
-1  0        0 -1
 0  1        1  0

Fig. 2.20. Roberts cross gradient operators


The Roberts cross gradient operators are
g_x = (z9 - z5)   and   g_y = (z8 - z6)
If we use equations (2.37) and (2.40) we get
M(x, y) = [(z9 - z5)² + (z8 - z6)²]^{1/2}   ... (2.41)
If we use equations (2.38) and (2.40), we get
M(x, y) ≈ |z9 - z5| + |z8 - z6|
Sobel Operators
The smallest filter masks in which we are interested are of size 3 x 3. Approximations to g_x and g_y using a 3 x 3 neighborhood centered on z5 are as follows:
g_x = ∂f/∂x = (z7 + 2z8 + z9) - (z1 + 2z2 + z3)

and
g_y = ∂f/∂y = (z3 + 2z6 + z9) - (z1 + 2z4 + z7)   ... (2.42)
These equations can be implemented using the masks shown in the figure below.
-1 -2 -1        -1  0  1
 0  0  0        -2  0  2
 1  2  1        -1  0  1
Fig. 2.21. Sobel operators
After computing the partial derivatives with these masks, we obtain the magnitude of the gradient. We know that
M(x, y) ≈ |g_x| + |g_y|
Substituting the g_x and g_y values in the above equation,
M(x, y) ≈ |(z7 + 2z8 + z9) - (z1 + 2z2 + z3)| + |(z3 + 2z6 + z9) - (z1 + 2z4 + z7)|
The masks in figure 2.21 above are called the Sobel operators.
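A minimal sketch of computing the Sobel gradient magnitude with the masks of figure 2.21, approximating M(x, y) by |g_x| + |g_y| as in equation (2.38).

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_magnitude(img):
    """Approximate the gradient magnitude M(x, y) = |gx| + |gy| with Sobel masks."""
    sx = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)
    sy = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    f = img.astype(np.float64)
    gx = convolve(f, sx, mode='nearest')
    gy = convolve(f, sy, mode='nearest')
    return np.abs(gx) + np.abs(gy)
```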

2.9. IMAGE ENHANCEMENT IN THE FREQUENCY DOMAIN
Enhancement in the frequency domain is straightforward. We simply compute the Fourier transform of the image to be enhanced, multiply the result by a filter transfer function, and take the inverse transform to produce the enhanced image.

2.9.1. FOURIER TRANSFORM AND THE FREQUENCY DOMAIN
Any function that periodically repeats itself can be expressed as a sum of sines and cosines of different frequencies, each multiplied by a different coefficient; this sum is called a Fourier series. Even functions which are non-periodic, but whose area under the curve is finite, can be represented in such a form; this is called the Fourier transform.
A function represented in either of these forms can be completely reconstructed via an inverse process with no loss of information.
2.9.1.1. 1-D Fourier Transformation and its Inverse
If f(x) is a continuous function of a single variable x, then its Fourier transform F(u) is given by
F{f(x)} = F(u) = ∫_{-∞}^{∞} f(x) exp(-j2πux) dx,   where j = √-1   ... (2.43)
and the reverse process to recover f(x) from F(u) is
F^{-1}{F(u)} = f(x) = ∫_{-∞}^{∞} F(u) exp(j2πux) du   ... (2.44)
Equations 2.43 and 2.44 comprise the Fourier transform pair.
The Fourier transform of a discrete function of one variable f(x), x = 0, 1, 2, ..., N-1, is given by
F(u) = (1/N) Σ_{x=0}^{N-1} f(x) exp(-j2πux/N),   for u = 0, 1, ..., N-1   ... (2.45)
and to obtain f(x) from F(u),
f(x) = Σ_{u=0}^{N-1} F(u) exp(j2πux/N),   for x = 0, 1, 2, ..., N-1   ... (2.46)
The above two equations, 2.45 and 2.46, comprise a discrete Fourier transform pair. According to Euler's formula,
e^{jθ} = cos θ + j sin θ   ... (2.47)
Substituting this into equation 2.45,
F(u) = (1/N) Σ_{x=0}^{N-1} f(x) [cos(2πux/N) - j sin(2πux/N)],   for u = 0, 1, ..., N-1   ... (2.48)
The Fourier transform separates a function into various components based on frequency. These components are complex quantities:
F(u) = R(u) + jI(u)
In polar coordinates,
|F(u)| = [R²(u) + I²(u)]^{1/2},   φ(u) = tan^{-1}[I(u)/R(u)],   F(u) = |F(u)| e^{jφ(u)}   ... (2.49)

2.9.1.2. 2-D Fourier Transformation and its inverse


The Fourier transform of a two dimensional continuous function f(x, y) (an image) is given by
F{f(x, y)} = F(u, v) = ∫∫ f(x, y) exp[-j2π(ux + vy)] dx dy   ... (2.50)
The inverse Fourier transform is given by
F^{-1}{F(u, v)} = f(x, y) = ∫∫ F(u, v) exp[j2π(ux + vy)] du dv   ... (2.51)
where (u, v) are the frequency variables.
Preprocessing is done to shift the origin of F(u, v) to the frequency coordinate (M/2, N/2), which is the center of the M x N area occupied by the 2-D Fourier transform. This area is known as the frequency rectangle.

2.9.1.3. Discrete Fourier Transform


The two-dimensional DFT of an image f(x, y) of size M x N is defined by
F(u, v) = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y) e^{-j2π(ux/M + vy/N)},
u = 0, 1, ..., M-1,   v = 0, 1, 2, ..., N-1   ... (2.52)
and the inverse DFT is
f(x, y) = (1/MN) Σ_{u=0}^{M-1} Σ_{v=0}^{N-1} F(u, v) e^{j2π(ux/M + vy/N)},
x = 0, 1, ..., M-1,   y = 0, 1, 2, ..., N-1   ... (2.53)
Here, x, y are spatial or image variables and u, v are transform or frequency variables. Equations 2.52 and 2.53 are called the discrete Fourier transform pair.
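A minimal sketch of the 2-D DFT pair (equations 2.52 and 2.53) using NumPy's FFT routines, which compute the same sums efficiently; the random array stands in for an image.

```python
import numpy as np

img = np.random.rand(64, 64)        # stand-in for f(x, y)

F = np.fft.fft2(img)                # forward 2-D DFT, eq. (2.52)
recovered = np.fft.ifft2(F).real    # inverse 2-D DFT, eq. (2.53)

assert np.allclose(img, recovered)  # the pair reconstructs f with no loss
```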

2.9.2. BASIS OF FILTERING IN THE FREQUENCY DOMAIN
Filtering techniques in the frequency domain are based on modifying the Fourier transform to achieve a specific objective and then computing the inverse DFT to get back to the image domain.
H(u, v) is called a filter because it suppresses certain frequencies of the image while leaving others unchanged.


Fig. 2.22. Basic steps of frequency domain filtering: the input image f(x, y) is pre-processed and Fourier transformed to F(u, v), multiplied by the filter function H(u, v), inverse Fourier transformed, and post-processed to give the enhanced image g(x, y)

Basic Steps of Filtering in the Frequency Domain
1. Multiply the input image f(x, y) by (-1)^(x+y) to center the transform.
2. Compute F(u, v), the Fourier transform of the image.
3. Multiply F(u, v) by a filter function H(u, v).
4. Compute the inverse DFT of the result of step 3.
5. Obtain the real part of the result of step 4.
6. Multiply the result in (5) by (-1)^(x+y).
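A minimal NumPy sketch of the six steps above; H is assumed to be a centered transfer function of the same shape as the image.

```python
import numpy as np

def frequency_filter(f, H):
    """Apply filter H(u, v) (same shape as f) following the six steps above."""
    x = np.arange(f.shape[0])[:, None]
    y = np.arange(f.shape[1])[None, :]
    centered = f * (-1.0) ** (x + y)          # step 1: center the transform
    F = np.fft.fft2(centered)                 # step 2
    G = H * F                                 # step 3
    g = np.fft.ifft2(G)                       # step 4
    g = np.real(g)                            # step 5
    return g * (-1.0) ** (x + y)              # step 6
```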

2.10. SMOOTHING BY FREQUENCY DOMAIN FILTERS


Edges and other sharp transitions of the gray levels of an image contribute significantly to the high frequency content of its Fourier transform. Hence smoothing is achieved in the frequency domain by attenuating a specified range of high frequency components in the transform of a given image.
The basic model of filtering in the frequency domain is
G(u, v) = H(u, v) F(u, v)
where F(u, v) is the Fourier transform of the image to be smoothed.
The objective of this method is to find a filter function H(u, v) that yields G(u, v) by attenuating the high frequency components of F(u, v).
There are three types of low pass filters:
1. Ideal
2. Butterworth
3. Gaussian
These three categories cover the range from very sharp (ideal) to very smooth (Gaussian) filtering. The Butterworth filter has a parameter called the filter order.

2.10.1. IDEAL LOW PASS FILTER


all the three filters. It cuts of all high frequency component of
It is the simplest of doing
Transform that are at a distance greater than a specified distance
the Fourier
form the origin of the transform.
as dimensional ideal low pass filter (ILPF) and has the transfer
It is called two
function.
1
if D(u, v) s Do
H(u, v) = 0 if D(u, v) > Do
where D, is a specified non-negative quantity
distance from point (u, v) to the center of frequency Rectangle, i.e.,
D(u, v) is the
1
v) = [(u-PI2) (- + Q2)j2
D(u,
also be of the same size. So center of
the
If the size of image is M *N, filter will
=
frequency Rectangle (u, v) (MI2, N/2)
because of center transform.
D
(u, v) = (u2+ y2)?
are passed without any
Because it is ideal case. So all frequency inside the circle
attenuation whereas all frequency outside the circles is
completely attenuated.

For an ideal low pass filter cross section, the point of transition between H(u, v) = 1 and H(u, v) = 0 is called the cutoff frequency.
One way to establish a set of standard cutoff frequency loci is to compute circles that enclose specified amounts of the total image power P_T. This quantity is obtained by summing the components of the power spectrum of the padded image at each point (u, v), for u = 0, 1, ..., P-1 and v = 0, 1, ..., Q-1, that is
P_T = Σ_{u=0}^{P-1} Σ_{v=0}^{Q-1} P(u, v)

A circle of radius D0 with origin at the center of the frequency rectangle encloses α percent of the power, where
α = 100 [Σ P(u, v) / P_T]
and the summation is taken over values of (u, v) that lie inside the circle or on its boundary.
Fig. 2.23. (a) Perspective plot of an ideal low-pass filter transfer function. (b) Filter displayed as an image. (c) Filter radial cross section
The ideal low pass filter is not suitable for practical usage, but it can be implemented in any computer system.

2.10.2. BUTTERWORTH LOW PASS FILTERS


The transfer function of a Butterworth low pass filter (BLPF) of order n, with cutoff frequency at a distance D0 from the origin, is defined as
H(u, v) = 1 / [1 + [D(u, v)/D0]^(2n)]
The most appropriate value of n is 2.
It does not have a sharp discontinuity, unlike the ILPF, which establishes a clear cutoff between passed and filtered frequencies.
Defining a cutoff frequency is a main concern in these filters. This filter gives a smooth transition in blurring as a function of increasing cutoff frequency.
A Butterworth filter of order 1 has no ringing; ringing increases as a function of filter order (higher order leads to negative values).



Fig. 2.24. (a) Perspective plot of a Butterworth low-pass filter transfer function. (b) Filter displayed as an image. (c) Filter cross sections of orders 1 through 4

2.10.3. GAUSSIAN LOW PASS FILTERS
The transfer function of a Gaussian low pass filter (GLPF) is
H(u, v) = e^{-D²(u, v)/2σ²}
where D(u, v) is the distance of the point (u, v) from the center of the transform and σ = D0 is the specified cutoff frequency.
The filter has the important characteristic that its inverse is also Gaussian.
Fig. 2.25. (a) Perspective plot of a GLPF transfer function. (b) Filter displayed as an image. (c) Filter radial cross sections for various values of D0
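A minimal sketch that builds the three low pass transfer functions (ideal, Butterworth of order n, Gaussian) on a centered frequency grid; the resulting H can be multiplied with a centered spectrum as in the filtering steps of section 2.9.2.

```python
import numpy as np

def lowpass_transfer(shape, d0, kind='ideal', n=2):
    """Return H(u, v) for an ideal, Butterworth, or Gaussian low pass filter."""
    M, N = shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)   # distance from the center
    if kind == 'ideal':
        return (D <= d0).astype(np.float64)
    if kind == 'butterworth':
        return 1.0 / (1.0 + (D / d0) ** (2 * n))
    return np.exp(-(D ** 2) / (2 * d0 ** 2))          # Gaussian, sigma = D0
```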

2.11. SHARPENING BY FREQUENCY DOMAIN FILTERS

Image sharpening can be achieved by a high pass filtering process, which attenuates the low frequency components without disturbing the high frequency information. These filters are radially symmetric and completely specified by a radial cross section.


If we have the transfer function of a low pass filter, the corresponding high pass filter can be obtained using the equation
H_hp(u, v) = 1 - H_lp(u, v)
where H_lp(u, v) is the transfer function of the low pass filter. That is, when the low pass filter attenuates frequencies, the high pass filter passes them, and vice versa.
Fig. 2.26. Top row: perspective plot, image representation, and cross section of a typical ideal highpass filter. Middle and bottom rows: the same sequence for typical Butterworth and Gaussian highpass filters.

2.11.1. IDEAL HIGH PASS FILTERS


This filter is the opposite of the ideal low pass filter and has the transfer function
H(u, v) = 0   if D(u, v) <= D0
H(u, v) = 1   if D(u, v) > D0
where D0 is the cutoff frequency.
It sets to zero all frequencies inside a circle of radius D0 while passing, without attenuation, all frequencies outside the circle.

2.11.2. BUTTERWORTH HIGH PASS FILTERS (BHPF)


The transfer function of a Butterworth high pass filter of order n is given by the equation
H(u, v) = 1 / [1 + [D0/D(u, v)]^(2n)]
The transition into higher values of cutoff frequencies is much smoother with the BHPF.
2.11.3. GAUSSIAN HIGH PASS FILTERS (GHPF)


The transfer function of the Gaussian high pass filter with cutoff frequency locus at a distance D0 from the center of the frequency rectangle is given by
H(u, v) = 1 - e^{-D²(u, v)/2 D0²}
or, equivalently,
H(u, v) = 1 - e^{-D²(u, v)/2σ²}

2.12. HOMOMORPHIC FILTERING


Homomorphic filters are widely used in image processing for compensating the effect of non-uniform illumination in an image. Pixel intensities in an image represent the light reflected from the corresponding points in the objects.
As per the image model, an image f(x, y) may be characterized by two components:
1. The amount of source light incident on the scene being viewed, and
2. The amount of light reflected by the objects in the scene.
These portions of light are called the illumination and reflectance components, and are denoted as i(x, y) and r(x, y) respectively. The functions i(x, y) and r(x, y) combine multiplicatively to give the image function f(x, y):
f(x, y) = i(x, y) r(x, y)   ... (2.54)
where 0 < i(x, y) < ∞ and 0 < r(x, y) < 1


Homomorphic filters are used in situations where the image is subject to the multiplicative interference or noise depicted in equation 2.54.
We cannot easily use the above product to operate separately on the frequency components of illumination and reflectance, because the Fourier transform of f(x, y) is not separable; that is
F[f(x, y)] ≠ F[i(x, y)] · F[r(x, y)]   ... (2.55)
We can separate the two components by taking the logarithm of the two sides:
ln f(x, y) = ln i(x, y) + ln r(x, y)   ... (2.56)
Taking Fourier transforms on both sides we get
F[ln f(x, y)] = F[ln i(x, y)] + F[ln r(x, y)]   ... (2.57)
that is,
F(x, y) = I(x, y) + R(x, y)   ... (2.58)
where F, I and R are the Fourier transforms of ln f(x, y), ln i(x, y) and ln r(x, y) respectively. The function F represents the Fourier transform of the sum of two images:
1. A low frequency illumination image
2. A high frequency reflectance image
If we now apply a filter with a transfer function that suppresses low frequency components and enhances high frequency components, then we can suppress the illumination component and enhance the reflectance component. Taking the inverse transform of F(x, y) and then the anti-logarithm, we get
f'(x, y) = i'(x, y) + r'(x, y)   ... (2.59)
Fig. 2.27. Block diagram of homomorphic filtering: input image f(x, y) → ln → DFT → H(u, v) → (DFT)^{-1} → exp → enhanced image g(x, y)
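A minimal sketch of the homomorphic filtering chain of figure 2.27 (ln → DFT → H(u, v) → inverse DFT → exp), using a Gaussian-based high-frequency-emphasis transfer function; the gains gamma_l and gamma_h and the cutoff d0 are illustrative assumptions, not values from the text.

```python
import numpy as np

def homomorphic_filter(img, d0=30.0, gamma_l=0.5, gamma_h=2.0):
    """Suppress illumination (low frequencies) and boost reflectance (high)."""
    f = np.log1p(img.astype(np.float64))            # ln(1 + f) avoids log(0)
    F = np.fft.fftshift(np.fft.fft2(f))             # centered spectrum
    M, N = img.shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = gamma_l + (gamma_h - gamma_l) * (1 - np.exp(-D2 / (2 * d0 ** 2)))
    g = np.fft.ifft2(np.fft.ifftshift(H * F)).real  # back to the log domain
    return np.expm1(g)                              # exp undoes the log
```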


2.13. COLOR IMAGE ENHANCEMENT
Enhancement techniques can be used to process an image so that the final result is more suitable than the original image for a specific application. Image enhancement techniques fall into two broad categories:
1. Spatial domain techniques
2. Frequency domain techniques
The spatial domain refers to the image itself, and spatial domain approaches are based on the direct manipulation of pixels in the image. On the other hand, frequency domain techniques are based on modifying the Fourier transform of the image.
In addition to the requirements of monochrome image enhancement, color image enhancement may require improvement of color balance or color contrast in a color image. Enhancement of color images is a more difficult task, not only because of the added dimension of the data but also due to the added complexity of color perception.

Fig. 2.28. Color Image Enhancement: the R, G, B input image is converted to coordinates T1, T2, T3, each coordinate is enhanced by a monochrome image enhancement algorithm, and the result is inverse coordinate transformed back to R, G, B for display


A practical approach to developing color image enhancement algorithms is shown in the figure above. The input color coordinates of each pixel are independently transformed into another set of color coordinates, where the image in each coordinate is enhanced by its own image enhancement algorithm, which could be chosen suitably from the foregoing set of algorithms.
The enhanced image coordinates T1', T2', T3' are inverse transformed to R, G, B for display. Since each image plane Tk'(m, n), k = 1, 2, 3 is enhanced independently, care has to be taken so that the enhanced coordinates Tk' are within the color gamut of the R-G-B system.

The choice of the color coordinate system Tk, k = 1, 2, 3 in which the enhancement algorithms are implemented may be problem dependent.
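A minimal sketch of the per-plane scheme of Fig. 2.28 is given below, assuming NumPy. For brevity the coordinate conversion T1, T2, T3 is taken to be the identity (the R, G, B planes themselves), and the `enhance` callback and the final gamut clipping step are illustrative choices, not part of the text above.

```python
import numpy as np

def enhance_color_planes(rgb, enhance):
    """Sketch of the per-plane scheme in Fig. 2.28.

    `rgb` is an (H, W, 3) float array in [0, 1]; `enhance` is any
    monochrome enhancement function (e.g. a gamma correction).
    """
    out = np.empty_like(rgb)
    for k in range(3):                      # enhance each plane independently
        out[..., k] = enhance(rgb[..., k])
    # Keep the enhanced coordinates inside the R-G-B gamut.
    return np.clip(out, 0.0, 1.0)

# Example usage with a simple gamma correction as the monochrome step:
# enhanced = enhance_color_planes(rgb, lambda p: p ** 0.8)
```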
TWO MARKS QUESTIONS AND ANSWERS
1. What is meant by Image enhancement?
It is among the simplest and most appealing areas
of digital image processing.
The idea behind this is to bring out details that are
obscured or simply to
highlight certain features of interest in an image. Image enhancement is a very subjective area of image processing.
2. Mention the basic approaches of image enhancement.
The two basic approaches of image enhancement are
1. Spatial Domain Methods
2. Frequency Domain Methods
3. What is Spatial Domain Methods?
The term Spatial Domain refers to the image plane itself, and approaches in this category are based on direct manipulation of pixels in an image.
4. Define the concept of inage negative and its application. [Apr/May- 2010]
The negative of an image with gray level in the range [0, L -1] is obtained by
using the negative transformation. The expression of the transformation is
S= L-1-r
There are a number of applications in which negative of the digital images are
quite useful. Displaying of medical images and photographing a screen with
monochrome positive film with the idea of using the resulting negatives as
normal slides.
S. What is pointprocessing? [Apr/May- 2011]
Approaches whose results depend only on the intensity at a point are sometimes called point processing techniques. Contrast stretching and thresholding are examples of point processing.
6. What is contrast stretching? [May/June - 2009, Apr/May - 2011]
It produces an image of higher contrast than the original one. The operation is performed by darkening the levels below m and brightening the levels above m in the original image.
Contrast stretching is a process that expands the range of intensity levels in an image so that it spans the full intensity range of the recording medium or display device.
7. What is Gray Level Transformation function?
It is the simplest form of the transformation, when the neighborhood is of size 1 x 1. In this case, g depends only on the value of f at a single point (x, y) and T becomes a gray level transformation function of the form S = T(r).

8. Define Bit Plane Slicing?


In Bit-Plane Slicing, each pixel in an image is represented by 8 bits, giving eight 1-bit planes ranging from bit plane 0 for the least significant bit to bit plane 7 for the most significant bit.
By modifying these bit planes, bit plane slicing method highlights the

contribution made to total image enhancement by specific bits and provides


image enhancement.

9. What is Gray Level Slicing?


Gray Level Slicing is the process of highlighting a specific range of gray levels in an image, for example when enhancing features such as masses of water in satellite images or enhancing flaws in X-ray images.

10. What is Gamma Correction?


A variety of devices used for image capture, printing and display respond according to a power law. By convention, the exponent in the power law equation is referred to as gamma. The process used to correct this power law response phenomenon is called gamma correction.
11. What is mask processing? [Apr/May - 2010]
A mask is a small sub image which is used in local processing to modify the pixel values of the original image for the purpose of image enhancement.
The center of this mask is moved from pixel to pixel, starting at the top left corner of an image, and at each point the response of an operator T is calculated. Processing an image in this way is known as mask processing.
12. What is a histogram? [Nov/Dec - 2009]
The histogram of a digital image is the probability of occurrence associated with the gray levels in the range 0 to 255. A histogram can be expressed using a discrete function of the form
h(rk) = nk
where
rk - is the kth intensity value
nk - is the number of pixels in the image with intensity rk
13. What is histogram equalization?
The technique used for obtaining a uniform histogram is known as histogram
equalization or histogram linearization.
14. What is histogram matching?
The method used to generate a processed image that has a specified histogram is
called histogram matching or histogram specification.
15. What is global histogram processing?

In global histogram processing the pixels are modified by a transformation


function based on the intensity distribution of an entire image.
16. What is local histogram processing?
In local histogram processing, the procedure is to define a neighborhood and
move its center from pixel to pixel. At each location, the histogram of the points
in the neighborhood is computed and either a histogram equalization or
histogram specification transformation function is obtained.
17. Define Spatial Filtering. -
[Apr/May 2011]
Spatial Filtering is the process of moving the filter mask from point to point in an
image.

18: What is Linear Spatial Filter?


In linear spatial filter the response is given by a sum of products of the filter
coefficients and the corresponding image pixels in the area spanned by the filter
mask.

19. Define Averaging Filters?


The output of a smoothing, linear spatial filter is the average of the pixels contained in the neighborhood of the filter mask. These filters are called averaging filters.

20. What is a Median Filter?


The Median Filter replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel.

21. Name the different types of derivative filters.
1. Prewitt Operators
2. Roberts Cross Gradient Operators
3. Sobel Operators
22. Write the applications of sharpening filters.
1. Electronic printing, medical imaging and industrial applications.
2. Autonomous target detection in smart weapons.

23. What is Maxinmum filter and Minimum filter?


The 100th percentile filter is the maximum filter, used for finding the brightest points in an image. The 0th percentile filter is the minimum filter, used for finding the darkest points in an image.

24. Write the steps involved in frequency domain


filtering.
1. Multiply the input image f(x, y) by (-1)^(x+y) to centre the transform
2. Compute F(u, v), the Fourier transform of the image
3. Multiply F(u, v) by a filter function H(u, v)
4. Compute the inverse DFT of the result of step 3
5. Obtain the real part of the result of step 4
6. Multiply the result in (5) by (-1)^(x+y)

25. What is Frequency Domain Methods?


Frequency domain methods are based on modifying the Fourier transform of the image.

26. Give the formula for negative and log transformation.
Negative transformation: S = L - 1 - r
Log transformation: S = C log(1 + r), where C is a constant and r ≥ 0
27. Mention the uses of Gamma Correction.
1. It is important for displaying an image accurately on a computer screen.

2. It is used to reproduce colors accurately.

28. Define spatial correlation and convolution.


Correlation is the process of moving a filter mask over the image and computing
the sum of products at each location.
The process of linear filtering is similar to frequency domain concept called
convolution.

29. What is Gaussian Function?


A Gaussian function of two variables has the basic form
h(x, y) = e^(-(x² + y²)/2σ²)
where σ is the standard deviation and x and y are the coordinates.

30. What is blurring?


It is used in preprocessing tasks, such as removal of small details from an image
prior to (large) object extraction and bridging of small gaps in lines or curves.
31. How noise reduction can be performed in spatial filtering?
Noise Reduction can be accomplished by blurring with a linear filter and also by
non-linear filtering.
32. What is box filter?
A spatial averaging filter in which all coefficients are equal is called a box filter. An example of such a filter is shown below.

(1/9) ×
1 1 1
1 1 1
1 1 1

Sum of all the coefficients = 1+1+1+1+1+1+1+1+1 = 9
33. Define Weighted Average Filter.
A weighted average filter is one in which pixels are multiplied by different coefficients.

(1/16) ×
1 2 1
2 4 2
1 2 1

Sum of all the coefficients = 1+2+1+2+4+2+1+2+1 = 16
34. Whut is the objective of sharpening?
The principal objective of sharpening is to highlight fine details in an image or to enhance details that have been blurred, either in error or as a natural effect of a particular method of image acquisition.

35. What are the basicRequirements offirst order derivative?


The requirements of a first order derivative are
1. Must be zero in areas of constant intensity
2. Must be non-zero at the onset of an intensity step or ramp
3. Must be non-zero along ramps.

36. What are the basic requirements of


second order derivative?
The requirements ofa second order derivative are
1. Must be zero in constant area
2. Must be non-zero at the onset of and end of an intensity step or Ramp
3. Must be zero along ramps of constant slope.

37. Define unsharp masking.


Unsharp masking means subtracting a blurred version of an image from
the

original image. Unsharp masking consists of the following steps.


1. Blur the original image
2. Subtract the blurred image from the original
3. Add the mask to the original.

38. Give the relation for the 1-D Fourier transformation pair.
The Fourier transform of a 1-D function f(x) is defined by
F(u) = ∫ f(x) exp(-j2πux) dx    ... (1)
The inverse Fourier transform is defined by
f(x) = ∫ F(u) exp(j2πux) du    ... (2)
where j = √-1. Equations (1) and (2) comprise the Fourier transformation pair.

39. Mention difference between first and second order derivatives.

S.No. | First Order Derivatives | Second Order Derivatives
1. | Its value is non-zero along the entire ramp. | Its value is non-zero only at the onset and end of the ramp.
2. | They produce thick edges. | They produce very thin edges.
3. | Implementation is difficult. | Implementation is simpler than the other.

40. Define the transfer function of the ideal LPF.
It cuts off all high frequency components of the Fourier transform that are at a distance greater than a specified distance D0 from the origin of the transform.
H(u, v) = 1 if D(u, v) ≤ D0
H(u, v) = 0 if D(u, v) > D0

41. Define the transfer functions of Ideal and Butterworth LPF and HPF.
[Nov/Dec - 2011, Apr/May - 2011, May/June - 2009]

Ideal low pass filter:
H(u, v) = 1 if D(u, v) ≤ D0
H(u, v) = 0 if D(u, v) > D0

Butterworth low pass filter:
H(u, v) = 1 / (1 + [D(u, v)/D0]^2n)

Ideal high pass filter:
H(u, v) = 0 if D(u, v) ≤ D0
H(u, v) = 1 if D(u, v) > D0

Butterworth high pass filter:
H(u, v) = 1 / (1 + [D0/D(u, v)]^2n)

42. Define the transfer function of the Gaussian filter in low pass and high pass. [Nov/Dec - 2012]
Gaussian low pass filter: H(u, v) = e^(-D²(u, v)/2σ²)
Gaussian high pass filter: H(u, v) = 1 - e^(-D²(u, v)/2σ²)

43. Define Homomorphic filtering.


Homomorphic Filtering is the process of improving the appearance of an image

by simultaneous gray level range compression and contrast enhancement.


44. Write down the equation used to obtain the enhanced image using Laplacian
filters.
g(x, y) = f(x, y) - [f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4f(x, y)]
45. Mention some applications of weighted averaging filter.
1. Noise reduction
2. Smoothing of false contours i.e., outlines
3. Irrelevant details can also be removed by these kinds of filters; irrelevant means details which are not of interest.

46. How are smoothing filters used in image processing? Give any two smoothing filters. [Nov/Dec - 2010]
Smoothing filters are used for blurring and noise reduction. Smoothing can be performed in two ways.
1. Linear
2. Non-Linear
Linear smoothing filters: box filter, weighted average filter.
Non-linear smoothing filters: median filter, max and min filter.
47. Name the different types of derivative filters. [Nov/Dec - 2009]
The filters based on the second derivative are
i. Laplacian filters
The filters based on the first derivative are
i. Roberts cross gradient operators
ii. Sobel operators
48. Define averaging filters. [Nov/Dec - 2009]
The output of a smoothing linear spatial filter is simply the average of the pixels contained in the neighborhood of the filter mask. These filters are also called averaging filters or low pass filters.
Example: 1. Box filter
2. Weighted average filter
49. Compare Spatial and Frequency Domain Methods. [Nov/Dec- 2008]

S.No. | Spatial Domain Methods | Frequency Domain Methods
1. | Spatial domain refers to the image plane itself, and these methods are based on direct manipulation of pixels in an image. | Frequency domain methods are based on modifying the Fourier transform of the image.
2. | Sub images (masks) are used. | Low pass and high pass filters are used.
50. What are the effects of applying a Butterworth low pass filter to a noisy image? [Nov/Dec - 2008]
1. Butterworth filter gives a smooth transition in blurring as a function

of increasing cut off frequency.


2. A Butterworth filter of order
1
has no ringing.

51. What are the applications of sharpening filters? [Apr/May - 2011]


1. Electronic Printing
2. Medical imaging
3. Autonomous guidance in military systems
4. Industrial Inspection
52. Write down a 3 x 3 mask for the smoothing and sharpening filters. (Apr/May - 2011)
For smoothing filters we can consider the box filter:
1 1 1
1 1 1
1 1 1
For sharpening filters we can consider the Sobel operator:
-1 -2 -1
 0  0  0
 1  2  1

53. What is order statistics filter?


If the response of a filter is based on the ordering or ranking of the pixels in the neighborhood, it is called an order-statistics filter. It is a type of non-linear spatial filter. Three categories of order statistics filters are

1. Min filter
2. Median filter
3. Max filter

REVIEW QUESTIONS

1. Write short notes on: [Apr/May - 2011]
i. Contrast stretching
ii. Gray level slicing
Ans. Refer Section 2.4.4.1 Page.no: 2.7 and Refer Section 2.4.4.2 Page.no: 2.8

2. Explain the Homomorphic filtering approach in image enhancement.
[Apr/May - 2011, Nov/Dec - 2010, May/June - 2009]
Ans. Refer Section 2.12 Page.no: 2.43
3. Write down the basic concepts of spatial filtering. (Apr/May- 2011]
Ans. Refer Section 2.6 Page.no: 2.18
4. Explain about histogram equalization technique. [Apr/May - 2011]
Ans. Refer Section 2.5.1 Page.no: 2.11

5. Discuss the Histogram Processing of a digital image.
[May/June - 2012, Nov/Dec - 2011, Nov/Dec - 2008, Nov/Dec - 2009, Nov/Dec - 2010]
Ans. Refer Section 2.5 Page.no: 2.10

6. Discuss the three types of Low Pass filters in detail.
[Nov/Dec - 2008, Nov/Dec - 2010]
Ans. Refer Section 2.10 Page.no: 2.38
7. Explain the various sharpening filters in spatial domain. [May/June - 2009]
Ans. Refer Section 2.6Page.no: 2.18
8. Describe the principle of the image enhancement using Nov/Dec- 2012]
i. Gray Level Slicing

ii. Histogram equalization


Ans, Refer Section 2.4.4.2 Page.no: 2.8 and
Refer Section 2.5.1 Page.no: 2.11
9. Discuss in detail how image sharpening is achieved using [Nov/Dec - 2012]
i. HPF
ii. Unsharp masking
iii. Gradient Operators
Ans.
i. Refer Section 2.11.1 Page.no: 2.43
ii. Refer Section 2.8.4 Page.no: 2.31 and

i. Refer Section 2.8.5 Page.no: 2.32


10. Discuss the image smoothing filters with its model in the spatial domain.
NovDec– 2011]
Ans. Refer Section 2.7 Page.no: 2.25
11. Describe the frequency domain methods for image enhancement.
[Nov/Dec - 2011]
Ans. Refer Section 2.9 Page.no: 2.35

12. Describe the following filters. [Nov/Dec - 2011]
i. Laplacian filters
ii. Sharpening filters
Ans. i. Refer Section 2.8.3 Page.no: 2.30 and ii. Refer Section 2.8 Page.no: 2.29

3
Image Restoration
Image Restoration - degradation model, Properties, Noise models - Mean Filters -
Order Statistics - Adaptive filters - Band reject Filters - Band pass Filters - Notch
Filters- Optimum Notch Filtering - Inverse Filtering - Wiener filtering

3.1. IMAGE RESTORATION

Restoration improves an image in some predefined sense. It is an objective process. Restoration attempts to reconstruct an image that has been degraded by using a priori knowledge of the degradation phenomenon. These techniques are oriented toward modeling the degradation and then applying the inverse process in order to recover the original image.

Image Restoration refers to a class of methods that aim to remove or reduce the degradations that have occurred while the digital image was being obtained. Restoration techniques are oriented toward modeling the degradation and applying the inverse process in order to recover the original image.
gone through some sort of degradation.
All natural images when displayed have
a. During Display Mode
b. Acquisition mode or
C. Processing mode

The degradations may be due to


a. Sensor noise
b. Blur due to camera mis-focus
C. Relative object camera motion
d. Random atmospheric turbulence
e. Others

Types
The restoration techniques are classified into two types.
1. Spatial domain techniques
2. Frequency domain techniques

3.2. IMAGE RESTORATION/ DEGRADATION MODEL


The degradation process operates with a degradation function that, together with an additive noise term, operates on an input image.

The input image is represented using the notation f(x, y) and the noise term can be represented as n(x, y). These two terms, when combined, give the degraded result g(x, y).

If we are given g(x, y), some knowledge about the degradation function H and some knowledge about the additive noise term n(x, y), the objective of restoration is to obtain an estimate f̂(x, y) of the original image. We want the estimate to be as close as possible to the original image. The more we know about H and n, the closer f̂(x, y) will be to f(x, y).

If it is a linear, position-invariant process, then the degraded image is given in the spatial domain by

g(x, y) = h(x, y) * f(x, y) + n(x, y)    ... (3.1)

where h(x, y) is the spatial representation of the degradation function and the * symbol represents convolution.

In the frequency domain we may write this equation as

G(u, v) = H(u, v) F(u, v) + N(u, v)    ... (3.2)

The terms in capital letters are the Fourier transforms of the corresponding terms in the spatial domain.
[Block diagram: f(x, y) → degradation function H → sum with noise n(x, y) → g(x, y) → restoration filter(s) → f̂(x, y); the left half is the degradation stage and the right half the restoration stage.]

Fig. 3.1. A model of the image degradation/restoration process

The image restoration process can be achieved by inverting the image degradation process, i.e.,

F̂(u, v) = [G(u, v) - N(u, v)] / H(u, v)

where 1/H(u, v) is the inverse filter and F̂(u, v) is the recovered image.

Although the concept is relatively simple, the actual implementation is difficult to achieve, as one requires prior knowledge or identification of the unknown degradation function h(x, y) and the unknown noise source n(x, y).

3.3. NOISE MODELS


The principal sources of noise in digital images arise during image acquisition and/or transmission. The performance of imaging sensors is affected by a variety of factors, such as environmental conditions during image acquisition and the quality of the sensing elements themselves.

Images are corrupted during transmission principally due to interference in the channels used for transmission. Since the main sources of noise present in digital images result from atmospheric disturbance and image sensor circuitry, the following assumptions can be made.
• The noise model is spatially invariant, i.e., independent of spatial location.
• The noise model is uncorrelated with the object function.

3.3.1. SPATIAL AND FREQUENCY PROPERTIES OF NOISE

The frequency property refers to the frequency content of noise in the Fourier sense, i.e., as opposed to frequencies of the electromagnetic spectrum.
For example, when the Fourier spectrum of noise is constant, the noise is called white noise. This terminology is a carryover from the physical properties of white light, which contains nearly all frequencies in the visible spectrum in equal proportions.

With the exception of spatially periodic noise, the spatial property assumed is that noise is independent of the spatial coordinates and that it is uncorrelated with respect to the image itself.

3.3.2. PROBABILITY DENSITY FUNCTIONS


The most common probability density functions found in image processing
applications are as follows.
Gaussian noise
Rayleigh noise
Erlang (gamma noise)
Exponential noise
Uniform noise
Impulse (salt and pepper) noise

3.3.2.1. Gaussian noise


This noise model is used frequently in practice because of its tractability in both the spatial and frequency domains. It is also called the normal noise model.

The PDF of a Gaussian random variable z is given by

p(z) = (1/(√(2π) σ)) e^(-(z - z̄)²/2σ²)    ... (3.3)

where
z = gray level
z̄ = mean or average value of z
σ = standard deviation
σ² = variance of z
[Plot: Gaussian noise PDF p(z), peaking at z̄ and falling to about 0.607 of the peak at z̄ ± σ.]

Fig. 3.2. Gaussian noise PDF
3.3.2.2. Rayleigh Noise
Unlike the Gaussian distribution, the Rayleigh distribution is not symmetric. It is given by the formula

p(z) = (2/b)(z - a) e^(-(z - a)²/b)  for z ≥ a    ... (3.4)
p(z) = 0  for z < a

The mean and variance of this density are given by

z̄ = a + √(πb/4)    ... (3.5)

and σ² = b(4 - π)/4    ... (3.6)

[Plot: Rayleigh noise PDF, displaced from the origin and skewed to the right.]

Fig. 3.3. Rayleigh noise PDF

It is displaced from the origin and skewed towards the right. The Rayleigh density can be quite useful for approximating skewed histograms.
3.3.2.3. Erlang (Gamma) Noise
The PDF of Erlang noise is given by

p(z) = [a^b z^(b-1) / (b - 1)!] e^(-az)  for z ≥ 0    ... (3.7)
p(z) = 0  for z < 0

The mean and variance of this density are given by

z̄ = b/a    ... (3.8)

and σ² = b/a²    ... (3.9)

[Plot: Gamma noise PDF, with peak value K = a(b-1)^(b-1) e^(-(b-1)) / (b-1)! occurring at z = (b-1)/a.]

Fig. 3.4. Gamma noise PDF

3.3.2.4. Exponential Noise
The exponential distribution has an exponential shape. The PDF of exponential noise is given as

p(z) = a e^(-az)  for z ≥ 0    ... (3.10)
p(z) = 0  for z < 0

where a > 0. The mean and variance of this density function are

z̄ = 1/a    ... (3.11)

and σ² = 1/a²    ... (3.12)

[Plot: exponential noise PDF, a decaying exponential starting at p(0) = a.]

Fig. 3.5. Exponential noise PDF


3.3.2.5. Uniform Noise
The PDF of uniform noise is given by

p(z) = 1/(b - a)  if a ≤ z ≤ b    ... (3.13)
p(z) = 0  otherwise

The mean of this density function is given by

z̄ = (a + b)/2    ... (3.14)

and its variance by

σ² = (b - a)²/12    ... (3.15)

[Plot: uniform noise PDF, a constant value 1/(b - a) between z = a and z = b.]

Fig. 3.6. Uniform noise PDF


3.3.2.6. Impulse Noise
In this case, the noise is signal dependent. The PDF of bipolar (impulse) noise is given by

p(z) = Pa  for z = a    ... (3.16)
p(z) = Pb  for z = b
p(z) = 0  otherwise

If b > a, gray level b will appear as a light dot in the image and level a will appear like a dark dot. If either Pa or Pb is zero, then the impulse noise is called unipolar. Bipolar impulse noise is also called salt and pepper noise, data-drop-out noise and spike noise.
[Plot: impulse noise PDF consisting of two spikes, of height Pa at z = a and Pb at z = b.]

Fig. 3.7. Impulse noise PDF

3.3.3. COMPARISON BETWEEN VARIOUS NOISE


Periodic Noise
Periodic noise in an image arises from electrical or electromechanical interference during image acquisition. Periodic noise can be reduced significantly via frequency domain filtering.

Estimation of Noise Parameters


The parameters of periodic noise typically are estimated by inspection of the
Fourier spectrum of the image. The parameters of noise PDFs may be known partially
from sensor specification, but it is necessary to estimate them for particular imaging
arrangement.

Table 3.1. Comparison between various noise


S.No | Noise Model | PDF | Mean (z̄) | Variance (σ²)
1. | Gaussian noise | p(z) = (1/(√(2π)σ)) e^(-(z - z̄)²/2σ²) | z̄ is the average value of z | σ²
2. | Rayleigh noise | p(z) = (2/b)(z - a) e^(-(z - a)²/b) for z ≥ a; 0 for z < a | z̄ = a + √(πb/4) | σ² = b(4 - π)/4
3. | Erlang noise | p(z) = [a^b z^(b-1)/(b - 1)!] e^(-az) for z ≥ 0; 0 for z < 0 | z̄ = b/a | σ² = b/a²
4. | Exponential noise | p(z) = a e^(-az) for z ≥ 0; 0 for z < 0 | z̄ = 1/a | σ² = 1/a²
5. | Uniform noise | p(z) = 1/(b - a) if a ≤ z ≤ b; 0 otherwise | z̄ = (a + b)/2 | σ² = (b - a)²/12
6. | Impulse noise | p(z) = Pa for z = a; Pb for z = b; 0 otherwise | - | -

3.4. RESTORATION IN THE PRESENCE OF NOISE ONLY - SPATIAL FILTERING

Spatial filtering is the method of choice in situations when only additive random noise is present. When the only degradation present in an image is noise, the model

g(x, y) = h(x, y) * f(x, y) + n(x, y)

reduces to

g(x, y) = f(x, y) + n(x, y)    ... (3.17)

and

G(u, v) = H(u, v) F(u, v) + N(u, v)

reduces to

G(u, v) = F(u, v) + N(u, v)    ... (3.18)

The noise terms are unknown, so subtracting them from g(x, y) or G(u, v) is not a realistic option. In the case of periodic noise, it is usually possible to estimate N(u, v) from the spectrum of G(u, v). In this case, N(u, v) can be subtracted from G(u, v) to obtain an estimate of the original image.

3.4.1. MEAN FILTERS


Mean Filters are the spatial filters which are used for noise reduction. Some of thbe
Important mean filters are
1. Arithmetic Mean Filter
2. Geometric Mean Filter
3. Harmonic Mean Filter
4. Contra Harmonic Mean Filter
3.4.1.1. Arithmetic Mean Filter

It is the simplest of the mean filters. Let Sxy represent the set of coordinates in the sub image window of size m × n, centered at point (x, y). The arithmetic mean filter computes the average value of the corrupted image g(x, y) in the area defined by Sxy.

The value of the restored image f̂ at point (x, y) is the arithmetic mean computed using the pixels in the region defined by Sxy:

f̂(x, y) = (1/mn) Σ g(s, t), the sum taken over (s, t) ∈ Sxy    ... (3.19)

This operation can be implemented using a spatial filter of size m × n in which all coefficients have value 1/mn. A mean filter smooths local variations in an image, and noise is reduced as a result of blurring.

For every pixel in the image, the pixel value is replaced by the mean value of its neighbouring pixels (m × n) with a weight w = 1/mn. This results in a smoothing effect in the image.
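A minimal NumPy sketch of Eq. (3.19) follows; the 3 x 3 default window and the edge padding mode are illustrative assumptions.

```python
import numpy as np

def arithmetic_mean_filter(g, m=3, n=3):
    """Sketch of the arithmetic mean filter of Eq. (3.19)."""
    g = g.astype(np.float64)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros_like(g)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            window = padded[x:x + m, y:y + n]   # the neighbourhood S_xy
            out[x, y] = window.mean()           # weight 1/(mn) per pixel
    return out
```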

3.4.1.2. Geometric Mean Filter


An image restored using a geometric mean filter is given by the expression

f̂(x, y) = [ Π g(s, t) ]^(1/mn), the product taken over (s, t) ∈ Sxy    ... (3.20)

Here, each restored pixel is given by the product of the pixels in the sub image window, raised to the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but it tends to lose less image detail in the process.

3.4.1.3. Harmonic Mean Filter


The harmonic mean filtering operation is given by the expression

f̂(x, y) = mn / Σ [1 / g(s, t)], the sum taken over (s, t) ∈ Sxy    ... (3.21)
The harmonic mean filter works well for salt noise, but fails for pepper noise. It does well also with other types of noise like Gaussian noise.

3.4.1.4. Contra Harmonic Mean Filter

The contra harmonic mean filter yields a restored image based on the expression

f̂(x, y) = Σ g(s, t)^(Q+1) / Σ g(s, t)^Q, the sums taken over (s, t) ∈ Sxy    ... (3.22)

where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating the effects of salt and pepper noise.

If Q is positive, then pepper noise is eliminated.
If Q is negative, then salt noise is eliminated.

Also,
If Q = 0, the filter becomes the arithmetic mean filter.
If Q = -1, the filter becomes the harmonic mean filter.
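The sketch below implements Eq. (3.22) directly; the small epsilon guard against division by zero and the default order Q are assumptions added for the example.

```python
import numpy as np

def contraharmonic_mean_filter(g, m=3, n=3, Q=1.5):
    """Sketch of the contra harmonic mean filter of Eq. (3.22).

    Q > 0 removes pepper noise, Q < 0 removes salt noise; Q = 0 gives
    the arithmetic mean and Q = -1 the harmonic mean.
    """
    g = g.astype(np.float64)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros_like(g)
    eps = 1e-12                                  # avoid division by zero
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            w = padded[x:x + m, y:y + n] + eps
            out[x, y] = np.sum(w ** (Q + 1)) / np.sum(w ** Q)
    return out
```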

3.4.2. ORDER-STATISTIC FILTERS


Order-Statistics Filters are spatial filters whose response is based on ordering the
pixels contained in the image area encompassed by the filter. The response of the filter
at any point is determined by the ranking result. Some of the important
order-statistic filters are
1. Median filter
2. Max and min filter
3. Midpoint filter

4. Alpha trimmed mean filter


3.4.2.1. Median Filter
It is the best order statistic filter; it replaces the value of a pixel by the median of
gray levels in the neighborhood of that pixel.
f̂(x, y) = median of {g(s, t)}, taken over (s, t) ∈ Sxy    ... (3.23)

The value of the pixel is included in the computation of the median. Median filters are quite popular because, for certain types of random noise, they provide excellent noise reduction capabilities with considerably less blurring than linear smoothing filters of similar size. These are effective for bipolar and unipolar impulse noise.
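A direct sketch of Eq. (3.23) is shown below; the window size and padding mode are illustrative choices.

```python
import numpy as np

def median_filter(g, m=3, n=3):
    """Sketch of the median filter of Eq. (3.23): each pixel is replaced
    by the median of the gray levels in its m x n neighbourhood S_xy."""
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.empty(g.shape, dtype=np.float64)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            out[x, y] = np.median(padded[x:x + m, y:y + n])
    return out
```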
3.4.2.2. Max and Min Filters
Using the 100th percentile of a ranked set of numbers is called the max filter, and it is given by the equation

f̂(x, y) = max of {g(s, t)}, taken over (s, t) ∈ Sxy    ... (3.24)

It is useful for finding the brightest points in an image. Pepper noise in the image has very low values; it is reduced by the max filter through the max selection process in the sub image area Sxy.

The 0th percentile filter is the min filter,

f̂(x, y) = min of {g(s, t)}, taken over (s, t) ∈ Sxy    ... (3.25)

This filter is useful for finding the darkest points in an image. Also, it reduces salt noise as a result of the min operation.

3.4.2.3. Midpoint Filter


The midpoint filter computes the midpoint between the maximum and minimum values in the area encompassed by the filter:

f̂(x, y) = (1/2) [ max{g(s, t)} + min{g(s, t)} ], taken over (s, t) ∈ Sxy    ... (3.26)

This filter combines order statistics and averaging. It works best for randomly distributed noise, like Gaussian or uniform noise.
3.4.2.4. Alpha-trimmed Mean Filter

Suppose that we delete the d/2 lowest and the d/2 highest intensity values of g(s, t) in the neighbourhood Sxy. Let gr(s, t) represent the remaining mn - d pixels. A filter formed by averaging these remaining pixels is called an alpha-trimmed mean filter:

f̂(x, y) = [1/(mn - d)] Σ gr(s, t), the sum taken over (s, t) ∈ Sxy    ... (3.27)

where the value of d can range from 0 to mn - 1. When d = 0, the alpha-trimmed filter reduces to the arithmetic mean filter. If d = mn - 1, the filter becomes a median filter.

3.4.3. ADAPTIVE FILTERS


Adaptive filters are capable of superior performance; their behaviour changes based on the statistical characteristics of the image inside the filter region defined by the m × n rectangular window Sxy. There are the following types.

1. Adaptive, local noise reduction filter


2. Adaptive, median filter.
3.4.3.1. Adaptive, local noise reduction filter
The mean and variance are the simplest statistical measures of a random variable.
These are reasonable parameters on which to base an adaptive filter because they are
quantities closely related to the appearance of an image.
The mean gives a measure of average intensity in the region over which the mean
iscomputed and the variance gives a measure of contrast in that region.
These filters operate on a local region Sxy. The response of the filter at any point (x, y) on which the region is centered is based on four quantities:
(a) g(x, y), the value of the noisy image at (x, y)
(b) σ²η, the variance of the noise corrupting f(x, y) to form g(x, y)
(c) mL, the local mean of the pixels in Sxy
(d) σ²L, the local variance of the pixels in Sxy

The behavior of the filter is to be as follows.

1. If σ²η is zero, the filter should return the value of g(x, y). This is the trivial, zero-noise case in which g(x, y) is equal to f(x, y).
2. If the local variance is high relative to σ²η, the filter should return a value close to g(x, y). A high local variance is associated with edges, and these should be preserved.
should be preserved.
3. If the two variances are equal, we want the filter to return the arithmetic mean value of the pixels in Sxy. This condition occurs when the local area has the same properties as the overall image, and local noise is to be reduced simply by averaging.

An adaptive expression for obtaining f̂(x, y) based on these assumptions may be written as

f̂(x, y) = g(x, y) - (σ²η / σ²L) [g(x, y) - mL]    ... (3.28)

3.4.3.2. Adaptive Median Filter


Adaptive median filtering can handle impulse noise with probabilities larger than the ordinary median filter can. It preserves detail while smoothing non-impulse noise, something the traditional median filter does not do.

The adaptive median filter also works in a rectangular window area Sxy, but it changes the size of Sxy during filter operation, depending on certain conditions.

Consider the following notation:
Zmin = minimum intensity value in Sxy
Zmax = maximum intensity value in Sxy
Zmed = median of intensity values in Sxy
Zxy = intensity value at coordinates (x, y)
Smax = maximum allowed size of Sxy

Algorithm
The adaptive median-filtering algorithm works in two stages, denoted stage A and stage B, as follows.

Stage A: A1 = Zmed - Zmin
         A2 = Zmed - Zmax
         If A1 > 0 AND A2 < 0, go to Stage B
         Else increase the window size
         If window size ≤ Smax, repeat Stage A
         Else output Zmed

Stage B: B1 = Zxy - Zmin
         B2 = Zxy - Zmax
         If B1 > 0 AND B2 < 0, output Zxy
         Else output Zmed

The algorithm can be used for the following three main purposes.
1. To remove salt-and-pepper (impulse) noise
2. To provide smoothing of other noise that may not be impulsive
3. To reduce distortion.
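A sketch of the two-stage procedure above follows; the initial 3 x 3 window, the growth step of 2 and the default Smax = 7 are illustrative assumptions.

```python
import numpy as np

def adaptive_median_filter(g, s_max=7):
    """Sketch of the two-stage adaptive median algorithm above."""
    g = g.astype(np.float64)
    pad = s_max // 2
    padded = np.pad(g, pad, mode='edge')
    out = g.copy()
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            size = 3
            while True:
                half = size // 2
                cx, cy = x + pad, y + pad
                w = padded[cx - half:cx + half + 1, cy - half:cy + half + 1]
                z_min, z_max, z_med = w.min(), w.max(), np.median(w)
                z_xy = padded[cx, cy]
                # Stage A: is the median itself an impulse?
                if z_min < z_med < z_max:
                    # Stage B: keep z_xy unless it is an impulse.
                    out[x, y] = z_xy if z_min < z_xy < z_max else z_med
                    break
                size += 2                      # grow the window and retry
                if size > s_max:
                    out[x, y] = z_med
                    break
    return out
```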

3.5. PERIODIC NOISE REDUCTION BY FREQUENCY DOMAINFILTERING

Periodic noise can be analyzed and filtered quite effectively using frequency
domain techniques. Periodic noise appears as concentrated bursts of energy in the
Fourier transform, at locations corresponding to the frequencies of the periodic
interference. The approach is to use a selective filter to isolate the noise.
The three types of selective filters are
1. Band reject filter
2. Bandpass filter
.
3. Notch filter

3.5.1, BAND REJECT FILTER


It removes a band of frequencies about the origin of the Fourier transform. The
principal application of band reject filtering is for noise removal in applications where
the general location of the noise component(s) in the frequency domain is
approximately known. Sinusoidal noise can be easily removed by using these kinds of
filters because it shows two impulses that are mirror images of each other about the
origin. It has three types.
1. Ideal band reject filter
2. Butterworth band reject filter
3. Gaussian band reject filter.
3.5.1.1. Ideal Band Reject Filter

An ideal band reject filter is given by the expression

H(u, v) = 1  if D(u, v) < D0 - W/2
H(u, v) = 0  if D0 - W/2 ≤ D(u, v) ≤ D0 + W/2    ... (3.29)
H(u, v) = 1  if D(u, v) > D0 + W/2

where
D(u, v) is the distance from the origin of the centered frequency rectangle,
W is the width of the band,
D0 is the radial center of the band.
3.5.1.2. Butterworth Band Reject Filter
A Butterworth band reject filter of order n is given by the expression

H(u, v) = 1 / { 1 + [ D(u, v) W / (D²(u, v) - D0²) ]^2n }

3.5.1.3. Gaussian Band Reject Filter
A Gaussian band reject filter is given by the expression

H(u, v) = 1 - exp{ -[ (D²(u, v) - D0²) / (D(u, v) W) ]² }

3.5.2. BANDPASS FILTER


The function of a bandpass filter is opposite to that of a bandreject filter. It allows a specific frequency band of the image to be passed and blocks the rest of the frequencies.

The transfer function of a bandpass filter HBP(u, v) can be obtained from the transfer function of a corresponding band reject filter HBR(u, v) by using the equation

HBP(u, v) = 1 - HBR(u, v)

These filters cannot be applied directly on an image because they may remove too much image detail, but they are effective in isolating the effects on an image of selected frequency bands.
3.5.3. NOTCH FILTERS

A notch filter rejects frequencies in pre-defined neighborhoods about a center frequency. These filters are symmetric about the origin in the Fourier transform. The transfer function of an ideal notch reject filter of radius D0, with center at (u0, v0) and, by symmetry, at (-u0, -v0), is

H(u, v) = 0  if D1(u, v) ≤ D0 or D2(u, v) ≤ D0
H(u, v) = 1  otherwise

where

D1(u, v) = [(u - M/2 - u0)² + (v - N/2 - v0)²]^(1/2)

D2(u, v) = [(u - M/2 + u0)² + (v - N/2 + v0)²]^(1/2)

A Butterworth notch reject filter of order n is given by

H(u, v) = 1 / { 1 + [ D0² / (D1(u, v) D2(u, v)) ]^n }

A Gaussian notch reject filter has the formula

H(u, v) = 1 - exp{ -(1/2) [ D1(u, v) D2(u, v) / D0² ] }

Notch pass filters pass, rather than suppress, the frequencies contained in the notch areas. These filters perform exactly the opposite function as the notch reject filters. The transfer function of a notch pass filter may be given as

HNP(u, v) = 1 - HNR(u, v)

where
HNP(u, v) - transfer function of the notch pass filter
HNR(u, v) - transfer function of a notch reject filter
3.5.4. OPTIMUM NOTCH FILTERING

Optimum notch filtering is used to minimize local variances of the restored estimate f̂(x, y). These kinds of filters follow the same set of procedures:

• The first step is to extract the principal frequency components of the interference pattern. As before, this can be done by placing a notch pass filter HNP(u, v) at the location of each spike.

• If the filter is constructed to pass only components associated with the interference pattern, then the Fourier transform of the interference noise pattern is given by the expression

N(u, v) = HNP(u, v) G(u, v)

where G(u, v) is the Fourier transform of the corrupted image.

• After a particular filter has been selected, the corresponding pattern in the spatial domain is obtained from the expression

η(x, y) = F⁻¹{HNP(u, v) G(u, v)}

• The corrupted image is assumed to be formed by the addition of the uncorrupted image f(x, y) and the interference.

• The effect of components not present in the estimate of η(x, y) can be minimized by subtracting from g(x, y) a weighted portion of η(x, y) to obtain an estimate of f(x, y):

f̂(x, y) = g(x, y) - w(x, y) η(x, y)

where w(x, y) is called a weighting or modulation function.
2
3.6. INVERSE FILTERING

Inverse filtering is a process of restoring an image degraded by a degradation function H. This function can be obtained by any method. The simplest approach to restoration is direct inverse filtering. Inverse filtering provides an estimate F̂(u, v) of the transform of the original image simply by dividing the transform of the degraded image G(u, v) by the degradation function:

G(u, v) = H(u, v) F(u, v) + N(u, v)

F̂(u, v) = G(u, v) / H(u, v)

Substituting G(u, v),

F̂(u, v) = [H(u, v) F(u, v) + N(u, v)] / H(u, v)

F̂(u, v) = F(u, v) + N(u, v) / H(u, v)

This shows an interesting result: even if we know the degradation function, we cannot recover the undegraded image exactly because N(u, v) is not known.

If the degradation function has zero or very small values, then the ratio N(u, v) / H(u, v) could easily dominate the estimate F̂(u, v).
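The following sketch applies direct inverse filtering with a simple guard against near-zero values of H(u, v), since N(u, v)/H(u, v) would otherwise dominate the estimate; the cutoff value is an assumption, not part of the method itself.

```python
import numpy as np

def inverse_filter(g, H, threshold=0.1):
    """Sketch of direct inverse filtering with a small-value safeguard.

    `g` is the degraded image and `H` the degradation transfer function
    sampled on the same (unshifted) frequency grid.
    """
    G = np.fft.fft2(g)
    H_safe = np.where(np.abs(H) < threshold, threshold, H)
    F_hat = G / H_safe                     # F_hat = G / H
    return np.fft.ifft2(F_hat).real
```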
3.7. MINIMUM MEAN SQUARE ERROR (OR) WIENER FILTERING

This filter incorporates both the degradation function and the statistical behavior of noise into the restoration process. The main concept behind this approach is that the images and noise are considered as random variables, and the objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error between them is minimized. In one dimension the estimate can be written as the convolution

f̂(x) = Σ h(x - s) g(s), the sum taken over all s

This error measure is given by

e² = E{(f - f̂)²}

where E{·} is the expected value of the argument.

It is assumed that the noise and the image are uncorrelated and that one or the other has a zero mean value.

The minimum of the error function above is given in the frequency domain by the expression

F̂(u, v) = [ H*(u, v) Sf(u, v) / ( Sf(u, v) |H(u, v)|² + Sη(u, v) ) ] G(u, v)

= [ H*(u, v) / ( |H(u, v)|² + Sη(u, v)/Sf(u, v) ) ] G(u, v)

= [ (1 / H(u, v)) · |H(u, v)|² / ( |H(u, v)|² + Sη(u, v)/Sf(u, v) ) ] G(u, v)
The product of a complex quantity with its conjugate is equal to the magnitude of the complex quantity squared. This result is known as the Wiener filter. The filter is named after its inventor, N. Wiener. The term inside the brackets is known as the minimum mean square error filter or the least square error filter.

H(u, v) = Fourier transform of the degradation function
H*(u, v) = complex conjugate of H(u, v)
|H(u, v)|² = H*(u, v) H(u, v)
Sη(u, v) = |N(u, v)|² = power spectrum of the noise
Sf(u, v) = |F(u, v)|² = power spectrum of the undegraded image
G(u, v) = Fourier transform of the degraded image

The restored image in the spatial domain is given by the inverse Fourier transform of the frequency domain estimate F̂(u, v). When these power spectra are unknown, the expression can be approximated by the function

F̂(u, v) = [ (1 / H(u, v)) · |H(u, v)|² / ( |H(u, v)|² + K ) ] G(u, v)

where K is a specified constant that is added to all terms of |H(u, v)|².
A number of useful measures are based on the power spectra of noise and of the undegraded image. One of the most important is the signal-to-noise ratio, approximated using frequency domain quantities such as

SNR = Σ_{u=0..M-1} Σ_{v=0..N-1} |F(u, v)|² / Σ_{u=0..M-1} Σ_{v=0..N-1} |N(u, v)|²

This ratio gives a measure of the level of information-bearing signal power to the level of noise power.

The mean square error is also represented in terms of a summation involving the original and restored images:

MSE = (1/MN) Σ_{x=0..M-1} Σ_{y=0..N-1} [f(x, y) - f̂(x, y)]²

We can define a signal-to-noise ratio in the spatial domain as

SNR = Σ_{x=0..M-1} Σ_{y=0..N-1} f̂(x, y)² / Σ_{x=0..M-1} Σ_{y=0..N-1} [f(x, y) - f̂(x, y)]²
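A sketch of the constant-K approximation above is given below; H is assumed to be the degradation transfer function sampled on the same frequency grid as the image, and K = 0.01 is an illustrative value standing in for Sη/Sf.

```python
import numpy as np

def wiener_filter(g, H, K=0.01):
    """Sketch of the Wiener filter approximation with constant K.

    Implements F_hat = [H* / (|H|^2 + K)] G, i.e. the bracketed term of
    the approximation above, applied to the degraded image `g`.
    """
    G = np.fft.fft2(g)
    H_conj = np.conj(H)
    F_hat = (H_conj / (np.abs(H) ** 2 + K)) * G
    return np.fft.ifft2(F_hat).real
```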

TWO MARKS QUESTIONS


AND ANSWERS
1.
What is meant by Image Restoration?
Restoration attempts to reconstruct or recover an image that has been degraded, by using a priori knowledge of the degrading phenomenon.
2. What are the two properties
in Linear Operator?
1. Additive
2. Homogeneity.
3. How a degradation process is modeled?
(May'13)
[Diagram: f(x, y) → H → (+ n(x, y)) → g(x, y)]

A system operator H, which together with an additive white noise term n(x, y), operates on an input image f(x, y) to produce a degraded image g(x, y).

4. Define Gray-level interpolation.

Gray-level interpolation deals with the assignment of gray levels to pixels in the
spatially Transformed image
5. What
is meant by Noise probability density function?
The spatial noise descriptor is the statistical behavior of gray level values in the
noise component of the model.
6. What is pseudo inverse filter? (Dec'13)
It is the stabilized version of the inverse filter. For a linear shift invariant system with frequency response H(u, v), the pseudo inverse filter is defined as
H⁻(u, v) = 1/H(u, v) for H(u, v) ≠ 0
H⁻(u, v) = 0 for H(u, v) = 0
7. What is meant by least mean square filter or wiener filter? (Dec'12)
The limitation of the inverse and pseudo inverse filters is that they are very sensitive to noise. Wiener filtering is a method of restoring images in the presence of blur as well as noise.

8 What is meant by blind image restoration?

Information about the degradation must be extracted from the observed image
either explicitly or implicitly. This task is called as blind image
restoration.
9. What are the two approaches for blind image
restoration?
1. Direct measurement
2. Indirect estimation

10. What is meant by Direct


meaSurement?
In direct measurement, the blur impulse response and noise levels are first estimated from an observed image, and these parameters are then utilized in the restoration.
11. What is blur impulse response
and noise levels?
Blur impulse response: This parameter is measured by isolating the image of a suspected object within a picture.
Noise levels: The noise level of an observed image can be estimated by measuring the image covariance over a region of constant background luminance.

12. What is meant by indirect estimation?
Indirect estimation methods employ temporal or spatial averaging to either obtain a restoration or to obtain key elements of an image restoration algorithm.

13. Give the difference between Enhancement and Restoration.
Enhancement technique is based primarily on the pleasing aspects it might present to the viewer, for example contrast stretching. Whereas, removal of image blur by applying a deblurring function is considered a restoration technique.

14. What do you mean by Point processing?
Image enhancement at any point in an image that depends only on the gray level at that point is often referred to as point processing.
15. What is Image Negatives?
The negative of an image with gray levels in the range [0, L - 1] is obtained by using the negative transformation, which is given by the expression

s = L - 1 - r
where s is the output pixel and r is the input pixel.
16. Give the formula for negative and log transformation.
Negative: S = L - 1 - r
Log: S = c log(1 + r)
where c is a constant and r ≥ 0
17. What is meant by bit plane slicing? (Dec'13)
Instead of highlighting gray level ranges, highlighting the contribution made to total image appearance by specific bits might be desired. Suppose that each pixel in an image is represented by 8 bits. Imagine that the image is composed of eight 1-bit planes, ranging from bit plane 0 for the LSB to bit plane 7 for the MSB.
for LSB to bit plane-7 for MSB.
18. Why blur is to be removed from
images?
(Dec'14)
Blur is caused by an improperly focused lens, relative motion between the camera and the scene, and atmospheric turbulence. It introduces bandwidth reduction and makes image analysis complex. To prevent these issues, blur is removed from the images.

19. What is Lagrange multiplier? Where it is used?


(Dec'14)
The Lagrange multiplier is a strategy for finding the local minima and maxima
of
a function subject to equality
constraints. This is mainly used in the image
restoration process like image acquisition, image storage and transmission.
20. What is theprinciple of inverse filtering? (Mayl4)
Inverse filtering is given by
F̂(u, v) = G(u, v) / H(u, v)
F̂(u, v) - restored image
G(u, v) - degraded image
H(u, v) - filter transfer function
21. What is maximum filter and minimum filter?
The 100th percentile filter is the maximum filter, used for finding the brightest points in an image. The 0th percentile filter is the minimum filter, used for finding the darkest points in an image.
22. Name the different types of derivative filters
1. Prewitt operators
2. Roberts cross gradient operators
3. Sobel operators

23. What are possible ways foradding noise in inages? (Dec'14)


Image sensors, scanners, image acquisition, modify the pixel values, changing
the background or foreground of an image, addition of two images, arithmetic
operations between two images and image processing algorithms are the possible
ways for adding noise in images.

24. Define spatial averaging. (May'14)


Spatial averaging is the process of finding out average of a center pixel and its
neighbors. For linear spatial averaging, the response is given by a sum of
products of the average filter mask, and the corresponding image pixels in the
area spanned by the filter mask.

25. Define harmonic mean filter.


(May'14)
The harmonic mean filter operation is given by
f̂(x, y) = mn / Σ [1 / g(s, t)], the sum taken over (s, t) ∈ Sxy
This filter works well for salt noise but fails for pepper noise.
26. Define and give the transfer function of the contra harmonic filter. (May'13)
The contra harmonic filter is used to reduce salt and pepper noise. Contra harmonic filtering results in a restored image expressed as
f̂(x, y) = Σ g(s, t)^(Q+1) / Σ g(s, t)^Q, the sums taken over (s, t) ∈ Sxy
27. Give the relation/PDF for Gaussian noise. (May'15)
Gaussian noise: The PDF of a Gaussian random variable z is given by
p(z) = (1/(√(2π) σ)) e^(-(z - z̄)²/2σ²)
where
z = gray level
z̄ = mean or average value of z
σ = standard deviation
σ² = variance of z

28. Give the relation for Rayleigh noise.
Rayleigh noise: The PDF is
p(z) = (2/b)(z - a) e^(-(z - a)²/b) for z ≥ a
p(z) = 0 for z < a
Mean: z̄ = a + √(πb/4)
Variance: σ² = b(4 - π)/4
29. Give the relation for Gamma noise.
Gamma noise: The PDF is
p(z) = [a^b z^(b-1)/(b - 1)!] e^(-az) for z ≥ 0
p(z) = 0 for z < 0
Mean: z̄ = b/a; Variance: σ² = b/a²
30. Give the relation for Exponential noise.
p(z) = a e^(-az) for z ≥ 0
p(z) = 0 for z < 0
where a > 0. The mean and variance of this density function are
z̄ = 1/a and σ² = 1/a²
31. Give the relation for uniform noise.
p(z) = 1/(b - a) if a ≤ z ≤ b
p(z) = 0 otherwise
The mean of this density function is given by
z̄ = (a + b)/2
and its variance by
σ² = (b - a)²/12
32. Give the relation for Impulse noise.
Impulse noise:
p(z) = Pa for z = a
p(z) = Pb for z = b
p(z) = 0 otherwise
33. Write Sobel horizontal and vertical edge detection masks. (May'13)
Horizontal mask:
-1 -2 -1
 0  0  0
 1  2  1
Vertical mask:
-1 0 1
-2 0 2
-1 0 1

REVIEW QUESTIONS
1. What is the use of the Wiener filter or least mean square filter in image restoration? Explain.
(Nov/Dec 13, Nov/Dec 14, May/June 13 and May/June 14)
Refer Section 3.7, Page No. 3.19
2. What is meant by Inverse filtering? Explain. (Nov/Dec 13, May/June 14)
Refer Section 3.6, Page No. 3.19
3. Explain about various filters involved in noise models.
Refer Section 3.3, Page No. 3.3
4. Describe the image restoration technique of inverse filtering. Why does the inverse filtering approach fail in the presence of noise? [Nov/Dec 2017]
Refer Section 3.6, Page No. 3.18

5. Apply order statistics filters on the selected pixels in the image. [Nov/Dec 2016]
Refer Section 3.4.2, Page No. 3.12

6. Explain about image restoration model.
Refer Section 3.2, Page No. 3.2
4
Image Segmentation

Edge detection, Edge linking via Hough transform - Thresholding - Region based segmentation - Region growing - Region splitting and merging - Morphological processing - erosion and dilation, Segmentation by morphological watersheds - basic concepts - Dam construction - Watershed segmentation algorithm.

4.1. INTRODUCTION

If an image has been preprocessed appropriately to remove noise and artifacts,


segmentation is often the key step in interpreting the image. Image segmentation is a
process in which regions or features sharing similar characteristics are identified and
grouped together.
Image segmentation may use statistical classification thresholding, edge detection,
region detection or any combination of these techniques. The output of the
segmentation step is usually a set of classified elements, most segmentation
techniques are either region based or edge based.
* Region based techniques rely on common patterns in intensity values within a
cluster of neighboring pixels. The cluster is referred to as the region and the
goal of the segmentation algorithm is to group regions according to their
anatomical or functional roles.
Edge based techniques rely on discontinuities in image values between
distinct regions, and the goal of the segmentation algorithm is to accurately
demarcate the boundar'y separating these regions.
Segmentation is a process of extracting and representing information from an image so as to group pixels together into regions of similarity.

Region based segmentation methods attempt to partition or group regions according to common image properties. These image properties consist of
i) Intensity values from original images or computed values based on an image operator.
ii) Textures or patterns that are unique to each type of region.
iii) Spectral profiles that provide multi-dimensional image data.

Elaborate systems may use a combination of these properties to segment images, while simpler systems may be restricted to a minimal set of properties depending on the type of data available.

4.2. DETECTION OF DISCONTINUITIES


There are three basic types
of gray-level discontinuities, known as
i) Points
ii) Lines and
iii) Edges

To identify these discontinuities, mask processing is performed, where the response of the mask is identified with respect to its center location.
4.2.1. POINT DETECTION
An isolated point is a point whose gray level or intensity is entirely different from its background and which is located in a homogeneous or nearly homogeneous area, i.e., an area having all the pixels with the same intensity level.

-1 -1 -1
-1  8 -1
-1 -1 -1

Fig. 4.1. Point Detection Mask

This mask is used to detect isolated points due to noise or interference. It consists of coefficients -1 everywhere except at the center (the center coefficient is 8). The sum of all the coefficients is 0. The response of the mask at the center point of the region is

R = w1 z1 + w2 z2 + ..... + w9 z9 = Σ_{k=1..9} wk zk

where zk is the intensity of the kth pixel.
We know that the sum of all the coefficients is zero, indicating that the mask response will be zero in areas of constant intensity.

A point has been detected at the location (x, y) on which the mask is centered if the absolute value of the response of the mask at that point exceeds a specified threshold. Such points are labeled 1 in the output image and all others are labeled 0. The output is obtained using the following expression:

g(x, y) = 1 if |R(x, y)| ≥ T
g(x, y) = 0 otherwise

where
g is the output image,
T is a non-negative threshold.

This formula measures the weighted differences between a pixel and its 8 neighbors. The intensity of an isolated point will be quite different from its surroundings and thus will be easily detectable by this type of mask.
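A direct sketch of this point-detection procedure follows; skipping the one-pixel border of the image is an illustrative simplification.

```python
import numpy as np

def detect_points(image, T):
    """Sketch of isolated-point detection with the 3 x 3 mask of Fig. 4.1.

    A pixel is labeled 1 when |R|, the absolute mask response, exceeds
    the non-negative threshold T.
    """
    mask = np.array([[-1, -1, -1],
                     [-1,  8, -1],
                     [-1, -1, -1]], dtype=np.float64)
    img = image.astype(np.float64)
    out = np.zeros(img.shape, dtype=np.uint8)
    for x in range(1, img.shape[0] - 1):
        for y in range(1, img.shape[1] - 1):
            region = img[x - 1:x + 2, y - 1:y + 2]
            R = np.sum(mask * region)          # R = sum of w_k * z_k
            out[x, y] = 1 if abs(R) >= T else 0
    return out
```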

4.2.2. LINE DETECTION


While edges (i.e., boundaries between regions with relatively distinct gray levels)
are the most common type of discontinuity in an image, instances of thin lines in an
image occur frequently enough that it is useful to have a separate mechanism for
detecting them.
A convolution based technique can be used which produces an image description of the thin lines in an input image. The Hough transform can also be used to detect lines; however, in that case, the output is a parametric description of the lines in an image.

(a) Horizontal:
-1 -1 -1
 2  2  2
-1 -1 -1

(b) Vertical:
-1 2 -1
-1 2 -1
-1 2 -1

(c) +45 degree:
-1 -1  2
-1  2 -1
 2 -1 -1

(d) -45 degree:
 2 -1 -1
-1  2 -1
-1 -1  2

Fig. 4.2. Four line detection kernels which respond maximally to
(a) horizontal (b) vertical (c) +45 degree (d) -45 degree lines

The line detection operator consists of a convolution kernel tuned to detect the presence of lines of a particular width n, at a particular orientation. Fig. 4.2 shows a collection of four such kernels, each of which responds to lines of single pixel width at the particular orientation shown.
These masks above are tuned for light lines against a dark background, and would give a big negative response to dark lines against a light background. If we are interested in detecting dark lines against a light background, then we should negate the mask values. Alternatively, we might be interested in either kind of line, in which case we could take the absolute value of the convolution output.

Let R1, R2, R3 and R4 denote the responses of the masks in the above figure, from left to right. Suppose that an image is filtered (individually) with the four masks. If, at a given point in the image, |Rk| > |Rj| for all j ≠ k, that point is said to be more likely associated with a line in the direction of mask k.

For example, if at a point in the image |R1| > |Rj| for j = 2, 3, 4, that particular point is said to be more likely associated with a horizontal line.

If we are interested in detecting all the lines in an image in the direction defined by a given mask, we simply run the mask through the image and threshold the absolute value of the result.

4.2.3. EDGE DETECTION


Edges are places in the image with strong intensity contrast. Since edges often occur at image locations representing object boundaries, edge detection is extensively used in image segmentation when we want to divide the image into areas corresponding to different objects. Representing an image by its edges has the further advantage that the amount of data is reduced significantly while retaining most of the image information.

Edge detection is a process of identifying edges in an image, to be used as a fundamental asset in image analysis.

An edge in an image is a significant local change in the image intensity, usually associated with a discontinuity in either the image intensity or the first derivative of the image intensity. Discontinuities in the image intensity can be
1. Step discontinuities
2. Line discontinuities


Step discontinuities
In step discontinuities, the image intensity abruptly changes from one value on one side of the discontinuity to a different value on the opposite side.

Line discontinuities
In line discontinuities, the image intensity abruptly changes value but then returns to the starting value within some short distance.

Step and line edges are rare in real images, because of low-frequency components or the smoothing introduced by most sensing devices; sharp discontinuities rarely exist in real signals. Step edges become ramp edges and line edges become roof edges, where intensity changes are not instantaneous but occur over a finite distance.

Step Edges
A step edge involves a transition between two intensity levels occurring ideally over the distance of 1 pixel.

Ideal Edges
Ideal edges can occur over the distance of 1 pixel, provided that no additional processing is used to make them look "real".

Roof Edges
A roof edge is really nothing more than a 1 pixel thick line running through a region in an image.

Fig. 4.3. From left to right, models of a step, a ramp and a roof edge and their corresponding intensity profiles.

The second derivative is positive at the beginning of the ramp, negative at the end of the ramp, zero at points on the ramp and zero at points of constant intensity.

The intersection between the zero intensity axis and a line extending between the extremes of the second derivative marks a point called the zero crossing of the second derivative.
[Plot: a horizontal intensity profile near the edge, with its first derivative, second derivative and the zero crossing marked.]

Fig. 4.4. (a) Two regions of constant intensity separated by an ideal vertical ramp edge. (b) Detail near the edge, showing a horizontal intensity profile, together with its first and second derivatives.
The first derivative can be used to detect the presence of an edge at a point in an
image. The second derivative can be used to determine whether an edge pixel lies on the
dark or light side of an edge.

4.2.3.1. Image Gradient and its Properties

The image gradient is a tool used to find edge strength and direction at location
(x, y) of an image f. It is denoted by ∇f and defined as the vector

∇f = grad(f) = [gx, gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ

This vector has the important geometrical property that it points in the direction of
the greatest rate of change of f at location (x, y).

The magnitude (length) of vector ∇f is denoted as M(x, y), where

M(x, y) = mag(∇f) = √(gx² + gy²)

is the value of the rate of change in the direction of the gradient vector. Note that
gx, gy and M(x, y) are images of the same size as the original, created when x and y
are allowed to vary over all pixel locations in f.

The direction of the gradient vector is given by the angle

α(x, y) = tan⁻¹(gy / gx)

measured with respect to the x-axis.

Roberts Operators
When diagonal edge direction is of interest, we need a 2-D mask. The Roberts
cross-gradient operators are one of the earliest attempts to use 2-D masks with a
diagonal preference. Consider the 3 × 3 region shown below, with intensity values
z1, ..., z9; the Roberts operators are based on implementing the diagonal differences

gx = ∂f/∂x = (z9 − z5)   and   gy = ∂f/∂y = (z8 − z6)

z1 z2 z3          -1  0          0 -1
z4 z5 z6           0  1          1  0
z7 z8 z9

3 × 3 image        Mask           Mask

Fig. 4.5. Roberts Operators

Prewitt Operators
This operator can be used to find gx and gy. The masks used in this method are
given below.

z1 z2 z3        -1 -1 -1        -1  0  1
z4 z5 z6         0  0  0        -1  0  1
z7 z8 z9         1  1  1        -1  0  1

3 × 3 image    Prewitt mask for    Prewitt mask for
               horizontal detection  vertical detection

Fig. 4.6. Prewitt Operators

gx = ∂f/∂x = (z7 + z8 + z9) − (z1 + z2 + z3)

and

gy = ∂f/∂y = (z3 + z6 + z9) − (z1 + z4 + z7)

Sobel Operators
A slight variation of the Prewitt operators gives the Sobel operators, which use a
weight of 2 in the center coefficient.

z1 z2 z3        -1 -2 -1        -1  0  1
z4 z5 z6         0  0  0        -2  0  2
z7 z8 z9         1  2  1        -1  0  1

3 × 3 image    Sobel mask for      Sobel mask for
               horizontal detection  vertical detection

Fig. 4.7. Sobel Operators

gx = ∂f/∂x = (z7 + 2z8 + z9) − (z1 + 2z2 + z3)

and

gy = ∂f/∂y = (z3 + 2z6 + z9) − (z1 + 2z4 + z7)
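A minimal sketch of gradient-based edge detection with the Sobel masks, assuming NumPy and SciPy are available (function and variable names are illustrative):

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_gradient(image):
    """Return gradient magnitude M(x, y) and direction alpha(x, y) using Sobel masks."""
    ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)  # horizontal edges
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # vertical edges
    f = image.astype(float)
    gx = convolve(f, kx, mode='reflect')
    gy = convolve(f, ky, mode='reflect')
    magnitude = np.hypot(gx, gy)        # M(x, y) = sqrt(gx^2 + gy^2)
    angle = np.arctan2(gy, gx)          # alpha(x, y), measured from the x-axis
    return magnitude, angle

# Example: M, alpha = sobel_gradient(img); edges = M > 0.2 * M.max()
```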

4.3. EDGE LINKING AND BOUNDARY DETECTION

Ideally, edge detection techniques yield pixels lying only on the boundaries
between regions. In practice, this pixel set seldom characterizes a boundary
completely because of
i) noise
ii) breaks in the boundary due to non-uniform illumination
iii) other effects that introduce spurious discontinuities in intensity values.
Thus, edge detection algorithms are usually followed by linking and other
boundary detection procedures designed to assemble edge pixels into meaningful
boundaries. The following techniques are used for edge linking and boundary
detection.
1. Local processing
2. Global processing using the Hough transform
3. Regional processing

4.3.1. LOCAL PROCESSING


Basic Idea: Analyze the characteristics of pixels in a small neighborhood
(3 × 3, 5 × 5, etc.) for every point (x, y) that has undergone edge detection.
All points that are similar according to predefined criteria are linked, forming an
edge of pixels that share common properties according to the specified criteria.

Principal Properties
The two principal properties for establishing similarity of edge pixels in this
kind of analysis are
1. The strength of the response of the gradient operator used to produce the edge
pixels.
2. The direction of the gradient.

α(x, y) = tan⁻¹(gy / gx)    ... (4.1)

Let Sxy denote the set of coordinates of a neighborhood centered at point (x, y) in
an image. An edge pixel with coordinates (s, t) in Sxy is similar in magnitude to the
pixel at (x, y) if

|M(s, t) − M(x, y)| ≤ E    ... (4.2)

where E is a positive threshold.
The direction angle of the gradient is given by equation (4.1). An edge
pixel with coordinates (s, t) in Sxy has an angle similar to the pixel at (x, y) if

|α(s, t) − α(x, y)| ≤ A

where A is a positive angle threshold.
The direction of the edge at (x, y) is perpendicular to the direction of the gradient
vector at that point.
A pixel with coordinates (s, t) in Sxy is linked to the pixel at (x, y) if both
magnitude and direction criteria are satisfied. This process is repeated at every
location in the image.
A record must be kept of linked points as the center of the neighborhood is moved
from pixel to pixel. A simple bookkeeping procedure is to assign a different intensity
value to each set of linked edge pixels.
A simplification particularly well suited for real-time applications consists of the
following steps.
1. Compute the gradient magnitude and angle arrays, M(x, y) and α(x, y), of the
input image f(x, y).
2. Form a binary image, g, whose value at any pair of coordinates (x, y) is given by

g(x, y) = 1   if M(x, y) > TM AND α(x, y) = A ± TA
          0   otherwise

where
TM is a threshold
A is a specified angle direction
±TA defines a band of acceptable directions about A
3. Scan the rows of g and fill (set to 1) all gaps (sets of 0s) in each row that
do not exceed a specified length, K. A gap is bounded at both ends by one or
more 1s. The rows are processed individually, with no memory between them.
4. To detect gaps in any other direction, θ, rotate g by this angle and apply the
horizontal scanning procedure in step 3. Rotate the result back by −θ.
When interest lies in horizontal and vertical edge linking, step 4 becomes a simple
procedure in which g is rotated ninety degrees, the rows are scanned and the result is
rotated back.
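A rough sketch of this simplified procedure, assuming a gradient routine such as the `sobel_gradient` sketch above and illustrative threshold values:

```python
import numpy as np

def link_edges_horizontal(magnitude, angle, t_m, a, t_a, max_gap):
    """Binary edge image for direction `a`, with row gaps up to `max_gap` pixels filled."""
    band = np.abs(angle - a) <= t_a                   # acceptable directions about A
    g = ((magnitude > t_m) & band).astype(np.uint8)
    for row in g:                                      # fill short gaps row by row
        ones = np.flatnonzero(row)
        for start, end in zip(ones[:-1], ones[1:]):
            if 1 < end - start <= max_gap + 1:         # gap of 1..max_gap zeros
                row[start:end] = 1
    return g

# For vertical linking, apply the same routine to np.rot90(g) and rotate the result back.
```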

4.3.2. REGIONAL PROCESSING


The locations of regions of interest in an image are known or can be determined.
This implies that knowledge is available regarding the regional membership of pixels
in the corresponding edge image.

For these kinds of situations, we can use techniques for linking pixels on a regional
basis, with the desired result being an approximation to the boundary of the region.
For example, polygonal approximations are particularly attractive because they can
capture the essential shape features of a region while keeping the representation of the
boundary relatively simple.

Two important requirements must be considered for a polygonal fit algorithm:
1. Two starting points must be specified.
2. All the points must be ordered.

Fig. 4.8. Illustration of the Iterative Polygonal Fit Algorithm
Algorithm

An algorithm for finding a polygonal fit to open and closed curves will have the
following steps.
Step 1 : Let P be a sequence of ordered, distinct, 1-valued points of a binary
image. Specify two starting points, A and B. These are the two
starting vertices of the polygon.
Step 2 : Specify a threshold T and two empty stacks, OPEN and CLOSED.
Step 3 : If the points in P correspond to a closed curve, put A into OPEN,
and put B into OPEN and into CLOSED. If the points correspond
to an open curve, put A into OPEN and B into CLOSED.
Step 4 : Compute the parameters of the line passing from the last vertex in
CLOSED to the last vertex in OPEN.
Step 5 : Compute the distances from the line in step 4 to all the points in P
whose sequence places them between the two vertices from step 4.
Select the point, Vmax, with the maximum distance, Dmax.
Step 6 : If Dmax > T, place Vmax at the end of the OPEN stack as a new
vertex. Go to step 4.
Step 7 : Else, remove the last vertex from OPEN and insert it as the last
vertex of CLOSED.
Step 8 : If OPEN is not empty, go to step 4.
Step 9 : Else, exit. The vertices in CLOSED are the vertices of the
polygonal fit to the points in P.
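The heart of this algorithm is the point-to-line distance test of steps 4-7. A compact recursive variant, equivalent in spirit (a sketch assuming an ordered list of (x, y) points and NumPy; names are illustrative):

```python
import numpy as np

def poly_fit(points, threshold):
    """Recursively keep the vertex farthest from the chord A-B when it exceeds `threshold`."""
    pts = np.asarray(points, dtype=float)
    a, b = pts[0], pts[-1]
    ab = b - a
    norm = np.hypot(*ab) or 1.0
    # Perpendicular distance of every point to the line through A and B.
    dist = np.abs(ab[0] * (pts[:, 1] - a[1]) - ab[1] * (pts[:, 0] - a[0])) / norm
    i = int(np.argmax(dist))
    if dist[i] > threshold:
        left = poly_fit(pts[:i + 1], threshold)
        right = poly_fit(pts[i:], threshold)
        return left[:-1] + right          # avoid duplicating the shared vertex
    return [tuple(a), tuple(b)]

# Example: vertices = poly_fit(boundary_points, threshold=2.0)
```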

4.3.3. GLOBAL PROCESSING USING THE HOUGH TRANSFORM


In this method, global relationships between pixels are considered and the points
are linked by first determining whether they lie on a curve of specified shape.

Finding Straight Line Points


Given n points in an image, suppose that we want to find subsets of these points
that lie on straight lines. There are two possible solutions to finding straight line
points.

Method 1
First find all lines determined by every pair of points.
Next find all subsets of points that are close to particular lines.

Drawback
This approach involves finding n(n − 1)/2 ≈ n² lines and then performing
n · n(n − 1)/2 ≈ n³ comparisons of every point to all lines. Therefore, it is not a
preferred method.

Method 2
Hough Transform is an alternative approach to
method 1.
4.3.3.1. Hough Transform
Consider a point (xi, yi) in the xy-plane and the general equation of a straight line
in slope-intercept form, yi = a xi + b.
Infinitely many lines pass through (xi, yi), but they all satisfy the equation
yi = a xi + b for varying values of a and b.
Writing this equation as b = −xi a + yi and considering the ab-plane (the parameter
space) yields the equation of a single line for a fixed pair (xi, yi):

b = −xi a + yi

Consider a second point (xj, yj); it also has a line in parameter space associated with
it. Unless they are parallel, this line intersects the line associated with (xi, yi) at
some point (a′, b′), where a′ is the slope and b′ is the intercept of the line
containing both (xi, yi) and (xj, yj) in the xy-plane.

Fig. 4.9. (a) xy-plane, (b) parameter space (ab-plane)

Accumulator Cells
In practice, the normal representation of a line, x cos θ + y sin θ = ρ, is used
instead of the slope-intercept form to avoid the problem of an unbounded slope, so the
parameter space becomes the ρθ-plane. An important property of the Hough transform
is that this parameter space can be subdivided into cells called accumulator cells,
as shown in the figure below.

Fig. 4.10. Division of the ρθ-plane into accumulator cells

where (ρmin, ρmax) and (θmin, θmax) are the expected ranges of the parameter values:
−90° ≤ θ ≤ 90° and −D ≤ ρ ≤ D, where D is the maximum distance between opposite
corners in an image.
The cell at coordinates (i, j), with accumulator value A(i, j), corresponds to the
square associated with parameter-space coordinates (ρi, θj).
Initially, these cells are set to zero. Then, for every non-background point
(xk, yk) in the xy-plane, the parameter θ is allowed to take each of the subdivision
values on the θ-axis.

We then solve for the corresponding ρ using the equation ρ = xk cos θ + yk sin θ.
The resulting ρ values are rounded off to the nearest allowed cell value along the
ρ-axis.

If a choice of θq results in solution ρp, then we let A(p, q) = A(p, q) + 1. The
number of subdivisions in the ρθ-plane determines the accuracy of the colinearity of
these points.
The Hough transform is applicable to any function of the form g(v, c) = 0,
where v is a vector of coordinates and c is a vector of coefficients.
The Hough transform depends on the number of coordinates and coefficients in a
given functional representation.
An approach based on the Hough transform is as follows.
i) Obtain a binary edge image.
ii) Specify subdivisions in the ρθ-plane.
iii) Examine the counts of the accumulator cells for high pixel concentrations.
iv) Examine the relationship (for example, continuity) between pixels in a
chosen cell.
Continuity in this case usually is based on computing the distance between
disconnected pixels corresponding to a given accumulator cell.
A gap in a line associated with a given cell is bridged if the length of the gap is
less than a specified threshold.
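A minimal accumulator implementation for straight lines in the (ρ, θ) parameterization, assuming a binary edge image stored as a NumPy array (the resolution choices are illustrative):

```python
import numpy as np

def hough_lines(edge_image, n_theta=180):
    """Accumulate votes A(p, q) over all theta subdivisions for every edge pixel."""
    h, w = edge_image.shape
    d = int(np.ceil(np.hypot(h, w)))                 # max distance between opposite corners
    thetas = np.deg2rad(np.arange(-90, 90, 180 / n_theta))
    rhos = np.arange(-d, d + 1)                      # one accumulator cell per unit of rho
    acc = np.zeros((len(rhos), len(thetas)), dtype=np.int64)
    ys, xs = np.nonzero(edge_image)
    for x, y in zip(xs, ys):
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        p = np.round(rho).astype(int) + d            # shift so rho = -d maps to row 0
        acc[p, np.arange(len(thetas))] += 1
    return acc, rhos, thetas

# Peaks in `acc` (e.g. np.argwhere(acc > threshold)) correspond to dominant lines.
```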


4.4. THRESHOLDING
Image thresholding is a simple form of image segmentation. It is a way to create a
binary image from a grayscale or full-color image. This is typically done in order to
separate "object" or foreground pixels from background pixels to aid in image
processing.
Thresholding is a technique for partitioning images directly into regions based
on intensity values and/or properties of these values.
Because of its intuitive properties, simplicity of implementation and
computational speed, image thresholding enjoys a central place in image segmentation.

Fig. 4.11. Intensity histograms that can be partitioned by (a) a single threshold T,
and (b) dual thresholds T1 and T2

4.4.1. Thresholding-Foundation
an image, fx, ),
Suppose that the grey-level histogram coIresponds to
a way that object and
composed of dark objects in a light background, in such
two dominant modes.
background pixels have gray levels groupcd into
background is to select a
One obvious way to extract the objects from the
any point (r, y) for which f, y)
threshold T'that separates these modes. Then
a background point.
>T is called an object point, otherwise, the point is called
and labelling
Segrnentation is accomplished by scanning the image pixel by pixel
on whether the grey level is greater or
cach pixel as object or background, depending
less than the value of T.
0f,y)<T
fr, >T
y)

Thresholding works well when the gray-level histogram of the image separates the
pixels of the object and the background into two dominant modes; then a threshold T
can easily be chosen between the modes. When the image contains more than one type
of object, the histogram has to be partitioned by multiple thresholds.

Multilevel thresholding classifies a point (x, y) as belonging to one object class
if T1 < f(x, y) ≤ T2, to the other object class if f(x, y) > T2, and to the background
if f(x, y) ≤ T1.
That is, the segmented image is given by

g(x, y) = a   if f(x, y) > T2
          b   if T1 < f(x, y) ≤ T2
          c   if f(x, y) ≤ T1
In turn, the key factors affecting the properties of the valley(s) are
1. The separation between peaks
2. The noise content in the image
3. The relative sizes of objects and background.
4. The uniformity of the illumination source and
5. The uniformity of the reflectance properties of the image.

4.4.2. Basic Global and Local Thresholding


Thresholding may be viewed as an operation that involves tests against a
function T of the form

T = T[x, y, p(x, y), f(x, y)]

where f(x, y) is the gray level and p(x, y) is some local property.
Simple thresholding schemes compare each pixel's gray level with a single
global threshold. This is referred to as Global Thresholding.
If T depends on both f(x, y) and p(x, y), then this is referred to as Local
Thresholding.
The following iterative algorithm can be used for this purpose.

1. Select an initial estimate for T.

2. Segment the image using T. This will produce two groups of pixels: G1,
consisting of all pixels with gray-level values > T, and G2, consisting of pixels
with values ≤ T.
3. Compute the average gray-level values μ1 and μ2 for the pixels in regions G1
and G2.
4. Compute a new threshold value: T = ½(μ1 + μ2).
5. Repeat steps 2 through 4 until the difference in T in successive iterations is
smaller than a predefined parameter, T0.
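A direct translation of this iterative scheme (a sketch assuming a grayscale NumPy array and that both groups remain non-empty):

```python
import numpy as np

def iterative_threshold(image, t0=0.5):
    """Basic global thresholding: iterate T = (mean(G1) + mean(G2)) / 2 until stable."""
    f = image.astype(float)
    t = f.mean()                                  # initial estimate for T
    while True:
        g1 = f[f > t]                             # pixels above the threshold
        g2 = f[f <= t]                            # pixels at or below the threshold
        t_new = 0.5 * (g1.mean() + g2.mean())
        if abs(t_new - t) < t0:
            return t_new
        t = t_new

# Example: T = iterative_threshold(img); binary = img > T
```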

Fig. 4.12. (a) Noisy fingerprint, (b) Histogram,
(c) Segmented result using a global threshold

4.4.3. Optimum Global Thresholding using Otsu's Method


Thresholding may be viewed as a statistical decision theory problem whose
objective is to minimize the average error incurred in assigning pixels to two or more
groups called classes. Otsu's method is optimum in the sense that it maximizes the
between-class variance.
Let {0, 1, 2, ..., L−1} denote the L distinct intensity levels in a digital image of
size M × N pixels, and let ni denote the number of pixels with intensity i.

The total number, MN, of pixels in the image is

MN = n0 + n1 + n2 + ... + n(L−1)

The normalized histogram has components pi = ni / MN, from which it follows
that

Σ (i = 0 to L−1) pi = 1,   pi ≥ 0    ... (4.3)

Suppose we select a threshold k, 0 < k < L−1, and use it to classify the pixels into
two classes: C1, containing the pixels with intensities in the range [0, k], and C2,
containing the pixels with intensities in the range [k+1, L−1]. The probability that a
pixel is assigned to class C1 is given by the cumulative sum

P1(k) = Σ (i = 0 to k) pi    ... (4.4)

Viewed another way, this is the probability of class C1 occurring. For example, if
we set k = 0, the probability of class C1 having any pixels assigned to it is zero.
Similarly, the probability of class C2 occurring is

P2(k) = Σ (i = k+1 to L−1) pi = 1 − P1(k)    ... (4.5)

The mean intensity value of the pixels assigned to class C1 is

m1(k) = Σ (i = 0 to k) i P(i | C1)
      = Σ (i = 0 to k) i P(C1 | i) P(i) / P(C1)
      = (1 / P1(k)) Σ (i = 0 to k) i pi    ... (4.6)

where P1(k) is given in Eq. 4.4. The term P(i | C1) in the first line of Eq. 4.6 is the
probability of value i, given that i comes from class C1.
The second line in the equation follows from Bayes' formula:

P(A | B) = P(B | A) P(A) / P(B)

The third line follows from the fact that P(C1 | i), the probability of C1 given i, is 1
because we are dealing only with values of i from class C1.
Also, P(i) is the probability of the ith value, which is simply the ith component of
the histogram, pi. Finally, P(C1) is the probability of class C1, which we know from
Eq. 4.4 is equal to P1(k).


Similarly, the mean intensity value of the pixels assigned to class C2 is

m2(k) = (1 / P2(k)) Σ (i = k+1 to L−1) i pi    ... (4.7)

The cumulative mean (average intensity) up to level k is given by

m(k) = Σ (i = 0 to k) i pi    ... (4.8)

and the average intensity of the entire image (i.e., the global mean) is given by

mG = Σ (i = 0 to L−1) i pi    ... (4.9)

The validity of the following two equations can be verified by direct substitution
of the preceding results:

P1 m1 + P2 m2 = mG    ... (4.10)

and

P1 + P2 = 1    ... (4.11)

where we have omitted the ks temporarily in favor of notational clarity.

In order to evaluate the "goodness" of the threshold at level k, we use the
normalized, dimensionless metric

η = σB² / σG²    ... (4.12)

where σG² is the global variance (i.e., the intensity variance of all the pixels in the
image),

σG² = Σ (i = 0 to L−1) (i − mG)² pi    ... (4.13)

and σB² is the between-class variance, defined as

σB² = P1 (m1 − mG)² + P2 (m2 − mG)²    ... (4.14)

This expression can also be written as

σB² = P1 P2 (m1 − m2)²

    = (mG P1 − m)² / [P1 (1 − P1)]    ... (4.15)

Reintroducing k, we have the final results:

η(k) = σB²(k) / σG²    ... (4.16)

and

σB²(k) = [mG P1(k) − m(k)]² / {P1(k) [1 − P1(k)]}    ... (4.17)

Then, the optimum threshold is the value, k*, that maximizes σB²(k):

σB²(k*) = max (0 ≤ k ≤ L−1) σB²(k)    ... (4.18)
Otsu's algorithm may be summarized as follows:
1. Compute the normalized histogram of the input image. Denote the
components of the histogram by pi, i = 0, 1, 2, ..., L−1.
2. Compute the cumulative sums, P1(k), for k = 0, 1, 2, ..., L−1, using Eq. 4.4.
3. Compute the cumulative means, m(k), for k = 0, 1, 2, ..., L−1, using Eq. 4.8.
4. Compute the global intensity mean, mG, using equation 4.9.
5. Compute the between-class variance, σB²(k), for k = 0, 1, 2, ..., L−1, using
equation 4.17.
6. Obtain the Otsu threshold, k*, as the value of k for which σB²(k) is maximum.
If the maximum is not unique, obtain k* by averaging the values of k
corresponding to the various maxima detected.
7. Obtain the separability measure, η*, by evaluating Eq. 4.16 at k = k*.
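A compact NumPy sketch of these steps (illustrative, assuming an 8-bit image so that L = 256):

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return k* maximizing the between-class variance sigma_B^2(k)."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()                      # normalized histogram p_i
    P1 = np.cumsum(p)                          # cumulative sums P1(k)
    m = np.cumsum(np.arange(levels) * p)       # cumulative means m(k)
    mG = m[-1]                                 # global mean
    denom = P1 * (1.0 - P1)
    denom[denom == 0] = np.finfo(float).eps    # avoid division by zero at the ends
    sigma_b2 = (mG * P1 - m) ** 2 / denom      # Eq. 4.17
    return int(np.argmax(sigma_b2))            # ties could be averaged, per step 6

# Example: k = otsu_threshold(img); binary = img > k
```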
4.4.4. MULTIPLE THRESHOLDS
Thresholding can be extended to an arbitrary number of thresholds, because the
separability measure on which it is based also extends to an arbitrary number of
classes. In the case of K classes, C1, C2, ..., CK, the between-class variance
generalizes to the expression

σB² = Σ (k = 1 to K) Pk (mk − mG)²    ... (4.19)
where

Pk = Σ (i ∈ Ck) pi    ... (4.20)

and

mk = (1 / Pk) Σ (i ∈ Ck) i pi    ... (4.21)

and mG is the global mean. The K classes are separated by K − 1 thresholds whose
values, k1*, k2*, ..., k(K−1)*, are the values that maximize equation 4.19:

σB²(k1*, k2*, ..., k(K−1)*) = max (0 < k1 < k2 < ... < k(K−1) < L−1) σB²(k1, k2, ..., k(K−1))    ... (4.22)
For three classes consisting of three intensity intervals, the between-class variance is
given by

σB² = P1 (m1 − mG)² + P2 (m2 − mG)² + P3 (m3 − mG)²    ... (4.23)

where

P1 = Σ (i = 0 to k1) pi,   P2 = Σ (i = k1+1 to k2) pi,   P3 = Σ (i = k2+1 to L−1) pi    ... (4.24)

The following relationships hold:

P1 m1 + P2 m2 + P3 m3 = mG    ... (4.25)

and

P1 + P2 + P3 = 1    ... (4.26)


The procedure starts by selecting the first value of k1 (that value is 1, because
looking for a threshold at 0 intensity makes no sense; also, keep in mind that the
increment values are integers because we are dealing with intensities).
Next, k2 is incremented through all its values greater than k1 and less than L − 1
(i.e., k2 = k1 + 1, ..., L − 2). Then k1 is incremented to its next value and k2 is
incremented again through all its values greater than k1.
This procedure is repeated until k1 = L − 3. The result of this process is a 2-D
array, σB²(k1, k2), and the last step is to look for the maximum value in this array. The
values of k1 and k2 corresponding to that maximum are the optimum thresholds, k1*
and k2*. If there are several maxima, the corresponding values of k1 and k2 are
averaged to obtain the final thresholds. The thresholded image is then given by

g(x, y) = a   if f(x, y) ≤ k1*
          b   if k1* < f(x, y) ≤ k2*    ... (4.27)
          c   if f(x, y) > k2*

where a, b, and c are any three valid intensity values.
Finally, we note that the separability measure for one threshold extends directly to
multiple thresholds:

η(k1*, k2*) = σB²(k1*, k2*) / σG²    ... (4.28)

where σG² is the total image variance.
4.4.5. Variable Thresholding
Factors such as noise and non-uniform illumination play a major role in the
performance of a thresholding algorithm. One of the simplest approaches to variable
thresholding is to subdivide an image into non-overlapping rectangles.
This approach is used to compensate for non-uniformities in illumination and/or
reflectance. The rectangles are chosen small enough so that the illumination of each
is approximately uniform.

Image Segmentation 4.23|

Image subdivision generally works well when the objects of interest and the
background occupy regions of reasonably comparable size. When this is not the
case, the method typically fails because of the likelihood of subdivisions containing
only object or background pixels. Although this situation can be addressed by using
additional techniques to determine when a subdivision contains both types of pixels,
the logic required to handle different scenarios can get complicated.
We illustrate the basic approach to local thresholding using the standard deviation
and mean of the pixels in a neighbourhood of every point in an image. These two
quantities are quite useful for determining local thresholds because they are
descriptors of local contrast and average intensity. Let σxy and mxy denote the standard
deviation and mean value of the set of pixels contained in a neighbourhood,
Sxy, centred at coordinates (x, y) in an image. The following are common forms of
variable, local thresholds:

Txy = a σxy + b mxy    ... (4.29)

where a and b are nonnegative constants, and

Txy = a σxy + b mG    ... (4.30)

where mG is the global image mean. The segmented image is computed as

g(x, y) = 1   if f(x, y) > Txy    ... (4.31)
          0   if f(x, y) ≤ Txy

where f(x, y) is the input image. This equation is evaluated for all pixel locations
in the image, and a different threshold is computed at each location (x, y) using the
pixels in the neighbourhood Sxy.

Significant power (with a modest increase in computation) can be added to local
thresholding by using predicates based on the parameters computed in the
neighbourhoods of (x, y):

g(x, y) = 1   if Q(local parameters) is true    ... (4.32)
          0   if Q(local parameters) is false

where Q is a predicate based on parameters computed using the pixels in the
neighbourhood Sxy. For example, consider the following predicate, Q(σxy, mxy), based
on the local mean and standard deviation:

Q(σxy, mxy) = true    if f(x, y) > a σxy AND f(x, y) > b mxy    ... (4.33)
              false   otherwise
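A sketch of variable thresholding with Txy = a·σxy + b·mxy, using uniform filters to obtain the local mean and standard deviation (the neighbourhood size and constants are illustrative):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_threshold(image, size=15, a=3.0, b=1.0):
    """Segment with T(x, y) = a * local_std + b * local_mean over a size x size window."""
    f = image.astype(float)
    mean = uniform_filter(f, size=size, mode='reflect')          # m_xy
    mean_sq = uniform_filter(f * f, size=size, mode='reflect')
    std = np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))          # sigma_xy
    t = a * std + b * mean                                        # Eq. 4.29
    return (f > t).astype(np.uint8)

# Example: g = local_threshold(img, size=21, a=2.0, b=0.9)
```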
Local Thresholding using Moving Averages

This is a special case of the local thresholding method.
* A moving average is computed along the scan lines of an image.
* Scanning is carried out line by line in a zigzag pattern to reduce illumination bias.
* Let z(k+1) denote the intensity of the point encountered at step k + 1 in the
scanning sequence. The moving average (mean) at this point is given by

m(k+1) = (1/n) Σ (i = k+2−n to k+1) zi = m(k) + (1/n) [z(k+1) − z(k−n)]

where n is the number of points used in computing the average and m(1) = z1/n.
* This algorithm is initialized only once, not at every row, because the moving
average is computed for every point in the image.
* Segmentation is implemented using

g(x, y) = 1   if f(x, y) > Txy
          0   if f(x, y) ≤ Txy

with Txy = b mxy, where b is a constant and mxy is the moving average from the
equation above at point (x, y) in the input image.
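A minimal sketch of thresholding with moving averages, scanning rows in a zigzag pattern (the names, the window length n and the constant b are illustrative):

```python
import numpy as np

def moving_average_threshold(image, n=20, b=0.5):
    """Threshold each pixel against b times a running average computed in zigzag scan order."""
    f = image.astype(float)
    rows, cols = f.shape
    scan = f.copy()
    scan[1::2] = scan[1::2, ::-1]          # reverse every other row: zigzag order
    z = scan.ravel()
    csum = np.cumsum(z)
    m = csum.copy()
    m[n:] = csum[n:] - csum[:-n]           # running sum over the last n samples
    m /= n                                 # m(1) = z1 / n, as in the text
    m = m.reshape(rows, cols)
    m[1::2] = m[1::2, ::-1]                # undo the zigzag to realign with the image
    return (f > b * m).astype(np.uint8)

# Example: g = moving_average_threshold(img, n=img.shape[1] // 5, b=0.5)
```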
Fig. 4.13. (a) Text image corrupted by spot shading. (b) Result of global thresholding
using Otsu's method. (c) Result of local thresholding using moving averages.

Thresholding based on moving averages works well when the objects of interest
are small with respect to the image size. Example: images of typed or handwritten text.

Fig. 4.14. (a) Text image corrupted by spot shading. (b) Result of global thresholding
using Otsu's method. (c) Result of local thresholding using moving averages.

4.4.6. Applications of Thresholding


* Analyze and recognize fingerprints.
* Recovering, analyzing and recognizing photographed or scanned letters.
* Reducing the amount of information (e.g., for image transfer or content
recognition).
* Real-time adaptive thresholding (e.g., face detection).
* Traffic control and wireless mesh networks.
* Motion detection using dynamic thresholding.
* Background subtraction (e.g., real-time subtraction for biometric face
detection).

4.5. REGION BASED SEGMENTATION

Segmentation is the process of partitioning an image into multiple regions.
Regions are groups of connected pixels with similar properties. A region in an image
can be defined by its border (edge) or its interior. If we know the interior,
we can always define the border, and vice versa.
In most cases, segmentation should provide a set of regions having the following
properties.
1. Connectivity and compactness
2. Regularity of boundaries
3. Homogeneity in terms of color or texture
4. Differentiation from neighbor regions.

Let R represent the entire image region. We want to partition R into
n subregions, R1, R2, ..., Rn, such that
a. R1 ∪ R2 ∪ ... ∪ Rn = R (all pixels must belong to a region).
b. Ri is a connected region for i = 1, 2, ..., n (pixels in a region must be connected).
c. Ri ∩ Rj = ∅ for all i and j, i ≠ j (regions must be disjoint).
d. P(Ri) = TRUE for all i (pixels in a region must all share the same property).
e. P(Ri ∪ Rj) = FALSE for all i and j, i ≠ j, where Ri and Rj are adjacent.

P(Ri) is a logical predicate defined over all points in Ri. It must be true for all
pixels inside the region and false for pixels in other regions. Regions Ri and Rj are
neighbors if their union forms a connected component.
Types
There are two types
of region based segmentation, namely.
1. Region Growing
2. Region Splitting and merging

4.5.1. REGION GROWING


Region growing is a procedure that groups pixels or subregions into larger regions
based on predefined criteria for growth. It has the following steps.
* An initial set of small areas is iteratively merged according to similarity
constraints.
* Start by choosing an arbitrary seed pixel and compare it with neighboring
pixels.
* The region is grown from the seed pixel by adding neighboring pixels that are
similar, increasing the size of the region.
* When the growth of one region stops, we simply choose another seed pixel
which does not yet belong to any region and start again.
* This whole process is continued until all pixels belong to some region. It is a
bottom-up method.
Region Growing algorithm
Let f(x, y) denote an input array, S(x, y) denote a seed array containing 1s at the
locations of seed points and 0s elsewhere, and Q denote a predicate to be applied at
each location (x, y). Arrays f and S are assumed to be of the same size. A basic
region growing algorithm based on 8-connectivity may be stated as follows.
Step 1 : Find all connected components in S(x, y) and erode each connected
component to one pixel; label all such pixels found as 1. All other
pixels in S are labeled 0.
Step 2 : Form an image fQ such that, at a pair of coordinates (x, y),
fQ(x, y) = 1 if the input image satisfies the given predicate, Q, at
those coordinates; otherwise fQ(x, y) = 0.
Step 3 : Let g be an image formed by appending to each seed point in S all
the 1-valued points in fQ that are 8-connected to that seed point.
Step 4 : Label each connected component in g with a different region label.
This is the segmented image obtained by region growing.
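A simple queue-based sketch of region growing from a single seed, using an absolute intensity difference as the predicate Q (the predicate, the 8-connectivity loop and the names are illustrative):

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed` (row, col): add 8-connected pixels within `tol` of the seed value."""
    f = image.astype(float)
    grown = np.zeros(f.shape, dtype=bool)
    grown[seed] = True
    seed_value = f[seed]
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < f.shape[0] and 0 <= cc < f.shape[1]
                        and not grown[rr, cc]
                        and abs(f[rr, cc] - seed_value) <= tol):
                    grown[rr, cc] = True
                    queue.append((rr, cc))
    return grown

# Example: mask = region_grow(img, seed=(120, 200), tol=15)
```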
However, starting with a particular seed pixel and letting this region grow
completely before trying other seeds biases the segmentation in favour of the regions
which are segmented first.
This can have several undesirable effects.
* The current region dominates the growth process; ambiguities around edges
of adjacent regions may not be resolved correctly.
* Different choices of seeds may give different segmentation results.
* Problems can occur if the (arbitrarily chosen) seed point lies on an edge.

Simultaneous Region Growing
To counter the above problems, simultaneous region growing techniques have
been developed.
* Similarities of neighboring regions are taken into account in the growing
process.
* No single region is allowed to completely dominate the proceedings.
* A number of regions are allowed to grow at the same time.
* Similar regions will gradually coalesce into expanding regions.

Advantages
* Easy and efficient to implement on parallel computers.

4.5.2. REGION SPLITTING AND MERGING


Region splitting and merging is a segmentation process in which an image is
initially subdivided into a set of arbitrary, disjoint regions, and then the regions are
merged and/or split to satisfy the basic conditions. This is an alternative approach to
the region growing method.
To illustrate the basic principle of split and merge methods, let us consider an
imaginary image.
* Let R represent the whole image shown in fig 4.15 (a).
* Not all the pixels in fig 4.15 (a) are similar, so the region is split as in
fig 4.15 (b), i.e., Q(R) = FALSE.
* Assume that all pixels within each of the regions R1, R2 and R3 are similar,
but those in R4 are not.
* Therefore R4 is split next, as shown in fig 4.15 (c).

Fig. 4.15. (a) Whole image, (b) First split, (c) Second split
Quadtree
We can also describe the splitting of the image using a tree structure called a
quadtree, that is, a tree in which each node has exactly four descendants.
The images corresponding to the nodes of a quadtree are sometimes called
quadregions or quadimages.

Fig. 4.16. Quadtree structure of the partitioned image

If the process is stopped only with splitting, the result may have adjacent regions
with identical properties. Therefore, further merging as well as splitting is needed.
Merging is done when the combined pixels of two adjacent regions satisfy the
predicate Q; that is, two adjacent regions Ri and Rj are merged only if

Q(Ri ∪ Rj) = TRUE
Split and Merge Algorithm

Step 1 : Split into four disjoint quadrants any region Ri for which
Q(Ri) = FALSE.
Step 2 : When no further splitting is possible, merge any adjacent regions Rj
and Rk for which Q(Rj ∪ Rk) = TRUE.
Step 3 : Stop when no further merging is possible.
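A minimal recursive sketch of the splitting phase only (the merging pass is omitted for brevity), using the standard deviation of a block as the predicate Q (the threshold and minimum block size are illustrative):

```python
import numpy as np

def quadtree_split(image, r0, c0, size, min_size=8, std_tol=10.0, regions=None):
    """Recursively split square blocks whose intensity spread exceeds `std_tol`."""
    if regions is None:
        regions = []
    block = image[r0:r0 + size, c0:c0 + size]
    if size <= min_size or block.std() <= std_tol:    # Q(R) is TRUE: keep as one region
        regions.append((r0, c0, size))
        return regions
    half = size // 2                                   # Q(R) is FALSE: split into quadrants
    for dr in (0, half):
        for dc in (0, half):
            quadtree_split(image, r0 + dr, c0 + dc, half, min_size, std_tol, regions)
    return regions

# Example (square image with power-of-two side): leaves = quadtree_split(img, 0, 0, img.shape[0])
```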
4.6. MORPHOLOGICAL PROCESSING
The word morphology refers to the scientific branch that deals with the forms and
structures of animals and plants. Morphology in image processing is a tool for
extracting image components that are useful in the representation and description of
region shape, such as boundaries and skeletons.

Furthermore, the morphological operations can be used for filtering, thinning and
pruning. This is a middle-level image processing technique in which the input is an
image but the output is attributes extracted from that image.

The language of morphology comes from set theory, where image objects can
be represented by sets. For example, an image object containing black pixels can be
considered a set of black pixels in the 2-D space Z², where each element of the set is a
tuple (2-D vector) whose coordinates are the (x, y) coordinates of a black pixel in the image.
4.6.1. BASICS OF SET THEORY

Let A be a set in Z² and a = (a1, a2); then a is an element of A, written a ∈ A.
If a is not an element of A, then a ∉ A.
If every element of set A is also an element of set B, then A is said to be a subset
of B, written as

A ⊆ B

The union of A and B is the collection of all elements that are in either set or in
both. It is represented as

C = A ∪ B

The intersection of the sets A and B is the set of elements belonging to both A and B,
represented as

D = A ∩ B

If there are no common elements in A and B, then the sets are called disjoint sets,
represented as

A ∩ B = ∅

where ∅ is the name of the set with no members.
The complement of a set A is the set of elements in the image not contained in A:

Aᶜ = {w | w ∉ A}

The difference of two sets A and B is denoted by

A − B = {w | w ∈ A, w ∉ B}

The reflection of a set B, denoted B̂, is defined as

B̂ = {w | w = −b, for b ∈ B}

If B is the set of pixels (2-D points) representing an object in an image, then B̂ is
simply the set of points in B whose (x, y) coordinates have been replaced
by (−x, −y).
The translation of a set B by point z = (z1, z2), denoted (B)z, is defined as

(B)z = {c | c = b + z, for b ∈ B}

If B is the set of pixels representing an object in an image, then (B)z is the set of
points in B whose (x, y) coordinates have been replaced by (x + z1, y + z2). This is
shown in the figure below.

Fig. 4.17. (a) A set, (b) its reflection and (c) its translation by z

4.6.2. EROSION AND DILATION


Erosion and dilation are the two fundamental operations used in morphological
image processing. Almost all morphological algorithms depend on these
two operations.

Erosion
Erosion shrinks an image object. The basic effect of erosion is to erode away the
boundaries of foreground pixels; thus areas of foreground pixels shrink in size and holes
within those areas become larger.
Mathematically, the erosion of set A by set B, denoted A ⊖ B, is defined as

A ⊖ B = {z | (B)z ⊆ A}

This equation indicates that the erosion of A by B is the set of all points z
such that B, translated by z, is contained in A.

Characteristics
* It generally decreases the size of objects and removes small anomalies by
subtracting objects with a radius smaller than the structuring element.
* With gray-scale images, erosion reduces the brightness of bright objects on a
dark background by taking the neighborhood minimum when passing the
structuring element over the image.
* With binary images, erosion completely removes objects smaller than the
structuring element and removes perimeter pixels from larger image objects.
Example

Fig. 4.18. (a) Set A, (b) Square structuring element, B, (c) Erosion of A by B, shown shaded,
(d) Elongated structuring element, (e) Erosion of A by B using this element

Above figure shows an example of erosion.


The elements of A and B are shown
shaded and the background is white. The
solid boundary in Figure (c) is the limit
beyond which further displacements
of the origin of B could cause the structuring
element to cease being completely
contained in A.
Structuring Elements
These are also called kernels. A structuring element consists of a pattern specified as
the coordinates of a number of discrete points relative to some origin. It generally
consists of a matrix of 0's and 1's. Typically it is much smaller than the image being
processed.
The center pixel of the structuring element is called the origin, and it identifies the
pixel of interest, i.e., the pixel being processed. The pixels containing 1's in the
structuring element define the neighborhood of the structuring element.
The structuring element is positioned at all possible positions in the image and
compared with the corresponding neighborhood of pixels. Two main characteristics
are directly related to structuring elements:
i) Shape
ii) Size

Shape
The element may be a ball or a line, convex or a ring. By choosing a particular
structuring element, one gets a way of differentiating some objects from others
according to their shape or spatial orientation.

Size
The structuring element can be a 3 × 3 or a 21 × 21 square.

Dilation
With A and B as sets in Z², the dilation of A by B, denoted A ⊕ B, is defined as

A ⊕ B = {z | (B̂)z ∩ A ≠ ∅}

This equation is based on reflecting B about its origin and shifting this reflection
by z. The dilation is then the set of all displacements z such that B̂ and
A overlap by at least one element. The above equation may be rewritten as

A ⊕ B = {z | [(B̂)z ∩ A] ⊆ A}

The set B is referred to as the structuring element in the dilation. This structuring
element may be thought of as a convolution mask, because the basic operation of
flipping B about its origin and then successively displacing it so that it slides over
the image A is analogous to the convolution process.
Fig. 4.19. (a) Set A, (b) Square structuring element, B, (c) Dilation of A by B, shown shaded,
(d) Elongated structuring element, (e) Dilation of A using this element


The structuring element and its reflection are equal because it is symmetric with
respect to its origin. The dashed line shows the boundary beyond which
any further displacement by z would cause the intersection of B̂ and A to be empty.

Therefore, all the points inside this boundary constitute the dilation of A by B.
Dilation has an advantage over low-pass filtering: the morphological method results
directly in a binary image, whereas low-pass filtering would convert it into a gray-scale
image and require a pass with a thresholding function to convert it back to binary form.
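A small sketch of binary erosion and dilation on NumPy arrays, using SciPy's morphology routines (the structuring element shown is illustrative):

```python
import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

# A 3 x 3 square structuring element (all ones); the origin is its center.
selem = np.ones((3, 3), dtype=bool)

def erode_then_dilate(binary_image):
    """Erosion removes objects smaller than `selem`; dilation grows the survivors back."""
    eroded = binary_erosion(binary_image, structure=selem)
    dilated = binary_dilation(eroded, structure=selem)
    return eroded, dilated

# Example: er, di = erode_then_dilate(img > 128)
```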

4.6.3. DUALITY

Erosion and dilation are duals of each other with respect to set complementation
and reflection. That is,

(A ⊖ B)ᶜ = Aᶜ ⊕ B̂    ... (4.34)

and

(A ⊕ B)ᶜ = Aᶜ ⊖ B̂    ... (4.35)

Equation (4.34) indicates that the erosion of A by B is the complement of the dilation
of Aᶜ by B̂, and vice versa.
The duality property is useful particularly when the structuring element is
symmetric with respect to its origin, so that B̂ = B.

Then, we can obtain the erosion of an image by B simply by dilating its
background (i.e., dilating Aᶜ) with the same structuring element and complementing
the result. Similar comments apply to equation (4.35).

Proof
Starting with the definition of erosion,

(A ⊖ B)ᶜ = {z | (B)z ⊆ A}ᶜ

If set (B)z is contained in A, then (B)z ∩ Aᶜ = ∅, in which case the preceding
expression becomes

(A ⊖ B)ᶜ = {z | (B)z ∩ Aᶜ = ∅}ᶜ

But the complement of the set of z's that satisfy (B)z ∩ Aᶜ = ∅ is the set of z's
such that (B)z ∩ Aᶜ ≠ ∅. Therefore,

(A ⊖ B)ᶜ = {z | (B)z ∩ Aᶜ ≠ ∅} = Aᶜ ⊕ B̂
4.7. SEGMENTATION BY MORPHOLOGICAL WATERSHEDS
So far we have discussed segmentation based on three principal concepts:
(a) Edge detection
(b) Thresholding and
(c) Region growing

Each of these approaches was found to have advantages (for example, speed in the
case of global thresholding) and disadvantages (for example, the need for post-
processing, such as edge linking, in edge-based segmentation).
In this section, we discuss an approach based on the concept of so-called
morphological watersheds. Segmentation by watersheds embodies many of the
concepts of the other three approaches and often produces more stable segmentation
results, including connected segmentation boundaries. This approach also provides a
simple framework for incorporating knowledge-based constraints in the segmentation
process.

4.7.1. BASIC CONCEPTS


The concept of watersheds is based on visualizing an image in three dimensions:
two spatial coordinates versus intensity. In such a "topographic" interpretation, we
consider three types of points.
(a) Points belonging to a regional minimum.
(b) Points at which a drop of water, if placed at the location of any of those
points, would fall with certainty to a single minimum.
(c) Points at which water would be equally likely to fall to more than one such
minimum.
For a particular regional minimum, the set of points satisfying condition (b) is
called the catchment basin or watershed of that minimum. The points satisfying
condition (c) form crest lines on the topographic surface and are termed divide lines
or watershed lines.
The principal objective of segmentation algorithms based on these concepts is to
find the watershed lines. The basic idea is as follows:

* A hole is punched in each regional minimum and the entire topography is
flooded from below by letting water rise through the holes at a uniform rate.
* When the rising water in distinct catchment basins is about to merge, a dam is
built to prevent the merging.
* The flooding will eventually reach a stage when only the tops of the dams are
visible above the water line.
* These dam boundaries correspond to the divide lines of the watersheds.
Therefore, they are the connected boundaries extracted by a watershed
segmentation algorithm.

4.7.2. DAM CONSTRUCTION


Dam construction is based on binary images, which are members of the 2-D integer
space Z². The simplest way to construct dams separating sets of binary points is to
use morphological dilation.
The basics of how to construct dams using dilation are illustrated in the following
figure 4.20.
Figure 4.20(a) shows portions of two catchment basins at flooding step n−1, and
figure 4.20(b) shows the result at the next flooding step, n.
The water has spilled from one basin to the other, and therefore a dam must be
built to keep this from happening. Let M1 and M2 denote the sets of coordinates of
points in two regional minima. The catchment basins associated with these two
minima at stage n−1 of flooding are denoted by Cn−1(M1) and Cn−1(M2),
respectively.

Let C[n−1] denote the union of these two sets. There are two connected
components in figure 4.20(a) and only one connected component in figure 4.20(b).
This connected component, q, encompasses the earlier two components, shown dashed.
Suppose each of the connected components in figure 4.20(a) is dilated by the
structuring element shown in figure 4.20(c), subject to two conditions:
1. The dilation has to be constrained to q.
2. The dilation cannot be performed on points that would cause the sets being
dilated to merge.
Construction of the dam at this level of flooding is completed by setting all the
points in the path just determined to a value greater than the maximum intensity value
of the image. The height of all dams is generally set at 1 plus the maximum allowed
value in the image. This will prevent water from crossing over the part of the completed
dam as the level of flooding is increased.
Fig. 4.20. Dam Construction

4.7.3. WATERSHEDS SEGMENTATION ALGORITHM


Let M1, M2, ..., MR be sets denoting the coordinates of the points in the regional
minima of an image g(x, y). Let C(Mi) be a set denoting the coordinates of the points
in the catchment basin associated with regional minimum Mi.
Let T[n] represent the set of coordinates (s, t) for which g(s, t) < n. That is,

T[n] = {(s, t) | g(s, t) < n}

Geometrically, T[n] is the set of coordinates of points in g(x, y) lying below the
plane g(x, y) = n.
Let Cn(Mi) denote the set of coordinates of points in the catchment basin associated
with minimum Mi that are flooded at stage n:

Cn(Mi) = C(Mi) ∩ T[n]

In other words, Cn(Mi) = 1 at location (x, y) if (x, y) ∈ C(Mi) AND (x, y) ∈ T[n];
otherwise Cn(Mi) = 0.
Next, we let C[n] denote the union of the flooded catchment basins at stage n:

C[n] = ∪ (i = 1 to R) Cn(Mi)

Then C[max + 1] is the union of all catchment basins:

C[max + 1] = ∪ (i = 1 to R) C(Mi)
The algorithm for finding the watershed lines is initialized with
C[min + 1] = T[min + 1]. The algorithm then proceeds recursively, computing C[n]
from C[n − 1].
Let Q denote the set of connected components in T[n]. Then, for each connected
component q ∈ Q[n], there are three possibilities:
1. q ∩ C[n − 1] is empty.
2. q ∩ C[n − 1] contains one connected component of C[n − 1].
3. q ∩ C[n − 1] contains more than one connected component of C[n − 1].
Condition 1 occurs when a new minimum is encountered, in which case
connected component q is incorporated into C[n − 1] to form C[n].

Condition 2 occurs when q lies within the catchment basin of some regional
minimum, in which case q is incorporated into C[n − 1] to form C[n].
Condition 3 occurs when all or part of a ridge separating two or more
catchment basins is encountered; further flooding would cause the water levels in
these catchment basins to merge, so a dam must be built within q to prevent overflow
between the catchment basins.

4.7.4. THE USE OF MARKERS


An approach used to control over-segmentation is based on the concept of markers.
A marker is a connected component belonging to an image. We have internal markers,
associated with objects of interest, and external markers, associated with the
background.
A procedure for marker selection typically consists of two principal steps.
1. Preprocessing
2. Definition of a set of criteria that markers must satisfy
The marker selection can range from simple procedures based on intensity values
and connectivity.
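As a rough sketch of marker-controlled watershed segmentation (assuming scikit-image is available; the gradient image and the choice of marker masks are illustrative):

```python
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def marker_watershed(image, internal_mask, external_mask):
    """Flood the gradient surface from labelled internal/external markers."""
    gradient = sobel(image.astype(float))          # topographic surface g(x, y)
    markers = np.zeros(image.shape, dtype=np.int32)
    markers[external_mask] = 1                     # background marker
    markers[internal_mask] = 2                     # object marker
    labels = watershed(gradient, markers)          # watershed lines separate the labels
    return labels

# Example: labels = marker_watershed(img, img > 200, img < 30)
```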

TWO MARKS QUESTIONS AND ANSWERS


1. What is segmentation?
Segmentation is the process of partitioning an image into its constituent regions or
objects based on certain criteria. Image segmentation algorithms are based on
either the discontinuity principle or the similarity principle.
2. Write the applications of segmentation. (Dec'13)
Detection of isolated points.

Detection of lines and edges in an image.


What are the three types of discontinuity in digital image?
Points, lines and edges.

4. How are the derivatives obtained in edge detection during formulation?

The first derivative at any point in an image is obtained by using the magnitude of the
gradient at that point. Similarly, the second derivatives are obtained by using the
Laplacian.

5. Write about linking edge points.

The approach for linking edge points is to analyze the characteristics of pixels in a
small neighborhood (3×3 or 5×5) about every point (x, y) in an image that has
undergone edge detection. All points that are similar are linked, forming a
boundary of pixels that share some common properties.

6. What are the two properties used for establishing


similarity of edge pixels?
(1) The strength of the response
of the gradient operator used to produce the
edge pixel.
(2) The direction of the gradient.

7. What is an edge? (Dec'13)
An edge is a set of connected pixels that lie on the boundary between two regions.
Edges are more closely modeled as having a ramp-like profile. The slope of the
ramp is inversely proportional to the degree of blurring in the edge.
8. Give the properties of the second derivative around an edge.
* The sign of the second derivative can be used to determine whether an edge
pixel lies on the dark or light side of an edge.
* It produces two values for every edge in an image.
* An imaginary straight line joining the extreme positive and negative values
of the second derivative would cross zero near the midpoint of the edge.
9. Define Gradient Operator.
First-order derivatives of a digital image are based on various approximations of
the 2-D gradient. The gradient of an image f(x, y) at location (x, y) is defined as the
vector ∇f = [gx, gy]ᵀ. The magnitude (length) of vector ∇f, denoted M(x, y), is

M(x, y) = mag(∇f) = √(gx² + gy²)
10. What is meant by object point and background point?
To extract the objects from the background, select a threshold T that separates the
two modes of the histogram. Then any point (x, y) for which f(x, y) > T is called an
object point; otherwise, the point is called a background point.
11. What is global threshold?
When the threshold T depends only on f(x, y), the threshold is called global.
12. Define region growing.
Region growing is a procedure that groups pixels or subregions into larger
regions based on predefined criteria. The basic approach is to start with a set of
seed points and from these grow regions by appending to each seed those
neighboring pixels that have properties similar to the seed.
13. Specify the steps involved in splitting and merging. (May'14)
Split into 4 disjoint quadrants any region Ri for which P(Ri) = FALSE. Merge
any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE. Stop when no
further merging or splitting is possible.

14. What is local threshold?

If the threshold T depends on both f(x, y) and p(x, y), it is called local.
15. State the problems in region splitting and merging based image segmentation.
(Dec'14)
* Initial seed points: different sets of initial seed points cause different
segmentation results.
* It is a time-consuming method.
* This method is not suitable for color images and may produce faulty colors
sometimes.
* Region growth may stop at any time when no more pixels satisfy the criteria.

16. What are the factors affecting the accuracy of region growing? (May'14)
The factors affecting the accuracy of region growing are lighting variations and the
intensity values of the pixels.
17. Define region splitting and merging.
Region splitting and merging is a segmentation process in which an image is
initially subdivided into a set of arbitrary, disjoint regions, and then the regions are
merged and/or split to satisfy the basic conditions.
REVIEW QUESTIONS

1. Explain about thresholding.
Refer Section 4.4, Page No. 4.15
2. Explain edge detection and edge linking in detail. (May/June 14)
Refer Section 4.3, Page No. 4.8
3. Discuss about region based segmentation techniques; compare threshold and region
based techniques. (May/June 13)
Refer Section 4.5, Page No. 4.25
4. Explain the two techniques of region segmentation. (May/June 14)
Refer Section 4.5, Page No. 4.25
5. Explain about erosion and dilation process.
Refer Section 4.6.2, Page No. 4.31

5
Image Compression
and Recognition
Need for data compression, Huffman, Run Length Encoding, Shift codes, Arithmetic
coding, JPEG standard, MPEG. Boundary representation, Boundary description,
Fourier Descriptor, Regional Descriptors - Topological feature, Texture - Patterns
and Pattern classes - Recognition based on matching.

5.1. INTRODUCTION

Image compression is the art and science of reducing the amount of data required
to represent an image. It is one of the most useful and commercially successful
technologies in the field of digital image processing.
The number of images that are compressed and decompressed daily is staggering,
and the compressions and decompressions themselves are virtually invisible to the
user.

Types

Two types of digital image compression are

1. Lossless (or error-free, or information-preserving) compression.
2. Lossy compression.
Applications
Image compression has a wide range of applications including
* Televideo conferencing
* Remote sensing
* Medical imaging
* Facsimile transmission (FAX)
* Control of remotely piloted vehicles in space and military applications

5.2. NEED FOR DATA COMPRESSION


Data compression can dramatically decrease the amount of storage a file takes up.

For example, at a 2:1 compression ratio, a 20 megabyte (MB) file takes up only
10 MB of space. As a result of compression, administrators spend less money and less
time on storage.
Compression optimizes backup storage performance and has recently shown up in
primary storage data reduction. Compression will be an important method of data
reduction as data continues to grow exponentially.

5.3. FUNDAMENTALS

The term data compression refers to the process of reducing the amount of data
required to represent a given quantity of information.
Data and information are not the same thing; data are the means by which
information is conveyed, because various amounts of data can be used to represent the
same amount of information.
Some representations may contain irrelevant or repeated information and are said to
contain data redundancy or redundant data. If this redundancy is removed, then
compression can be achieved.

Relative Data Redundancy

Let b and b′ denote the number of bits (or information-carrying units) in two
representations of the same information. The relative data redundancy R of the
representation with b bits is

R = 1 − 1/C    ... (5.1)

where C, commonly called the compression ratio, is defined as

C = b / b′    ... (5.2)
Based on the values of b and b′ we have the following three cases.

Case 1

If b = b′ then C = 1 and R = 0; there is no redundant data.

Case 2

If b′ << b then C → ∞ and R → 1; the data is highly redundant.

Case 3
If b′ >> b then C → 0 and R → −∞; the second set has more data than the original
set. This case is undesirable.
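A quick check of these definitions (a hypothetical 20 MB file compressed to 10 MB):

```python
def redundancy(b, b_prime):
    """Compression ratio C = b / b' and relative redundancy R = 1 - 1/C."""
    c = b / b_prime
    return c, 1 - 1 / c

# A 20 MB file compressed to 10 MB gives C = 2.0 and R = 0.5,
# i.e. half of the original representation was redundant.
print(redundancy(20 * 2**20 * 8, 10 * 2**20 * 8))   # -> (2.0, 0.5)
```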

Three types of data redundancy can be found in image compression. They are
1. Coding redundancy
2. Spatial and temporal redundancy
3. Irrelevant information
5.3.1. CODING REDUNDANCY
A code is a system of symbols used to represent a body of information or set
of events. Each piece of information or event is assigned a sequence of code symbols
called a code word. Thus, the code length is defined as the number of symbols in
each code word.
The 8-bit codes that are used to represent the intensities in most 2-D intensity arrays
contain more bits than are needed to represent the intensities.
Assume that a discrete random variable rk in the interval [0, L − 1] is used to
represent the intensities of an M × N image and that each rk occurs with
probability pr(rk):

pr(rk) = nk / MN,   k = 0, 1, 2, ..., L − 1    ... (5.3)

where
L is the number of intensity values
nk is the number of times that the kth intensity appears in the image
Average Length of Code Words

If the number of bits used to represent each value of rk is l(rk), then the average
number of bits required to represent each pixel is

Lavg = Σ (k = 0 to L − 1) l(rk) pr(rk)    ... (5.4)

where
l(rk) is the number of bits used to represent intensity level rk
pr(rk) is the probability of occurrence of intensity level rk

The average length of the code words assigned to the various intensity values is
found by summing the products of the number of bits used to represent each intensity
and the probability that the intensity occurs.
The total number of bits required to represent an M × N image is MN·Lavg.
Coding redundancy can be avoided, and compression achieved, by the variable-length
coding method. This method assigns fewer bits to more probable intensity values and
more bits to less probable intensity values.
A natural binary encoding assigns the same number of bits to both the most and
least probable values, failing to minimize equation 5.4 and resulting in coding
redundancy.
5.3.2. SPATIAL AND TEMPORAL REDUNDANCY


The pixels of most 2-D intensity arrays are correlated spatially and
information is
unnecessarily replicated in the representations of the correlated pixels.

In a video sequence, temporally correlated pixels also duplicate information.

In most images, pixels are correlated spatially (in both x


and y) and in time,
because most pixel intensities can be predicted reasonably
well from neighboring
intensities and information carried by a single pixel is
small.
To reduce the redundancy associated with spatially and temporally correlated pixels,
a 2-D intensity array must be transformed into a more efficient but usually non-visual
representation.
For example, run-lengths or the differences between adjacent pixels can be used to
reduce the redundancy. This type of transformation is called a mapping.
A mapping is said to be reversible if the pixels of the original 2-D intensity array
can be reconstructed without error from the transformed data set. Otherwise the
mapping is said to be irreversible.

5.3.3. IRRELEVANT INFORMATION


Most 2-D intensity arrays contain information
that is ignored by the human visual
system and/or extraneous to the intended use
of the image. It is redundant in the sense
that it is not used.


5.4. IMAGE COMPRESSION MODELS

An image compression system is composed of two distinct functional components
called
1. Encoder
2. Decoder

The encoder performs compression and the decoder performs the complementary
operation of decompression. A codec is a device or program that is capable of both
encoding and decoding.

Fig. 5.1. Functional block diagram of a general image compression system


The input image f(x, ...) is fed into the encoder, which creates a compressed
representation of the input. This representation is stored for later use, or transmitted
for storage and use at a remote location.
When the compressed representation is presented to its complementary decoder,
a reconstructed output image f̂(x, ...) is generated.
In still-image applications, the encoded input can be represented as f(x, y) and the
decoder output as f̂(x, y).
In video applications, the encoded input can be represented as f(x, y, t) and the
decoder output as f̂(x, y, t), where the discrete parameter t specifies time.
If f̂(x, ...) is an exact replica of f(x, ...), then the compression system is
called error-free, lossless or information-preserving.
If f̂(x, ...) is not an exact replica of f(x, ...), then the compression system is
called lossy.

5.4.1. ENCODING OR COMPRESSION PROCESS


The encoder is designed to remove three kinds of redundancies: coding
redundancy, spatial and temporal redundancy, and irrelevant information.
These three kinds of redundancies are removed with the help of the following series
of three independent operations:
1. Mapper
2. Quantizer
3. Symbol Coder

5.4.1.1. Mapper
A Mapper will transform f(x, ..) into a format designed to reduce spatial and
temporal redundancy. This operation is reversible and may or may not reduce directly
the amount of data required to represent the image.
Run-length coding is an example of a mapping that normally yields compression
in the first step of the encoding process.

In video applications, the mapper uses previous video frames to facilitate the
removal of temporal redundancy.

5.4.1.2. Quantizer

The quantizer reduces the accuracy of the mapper's output in accordance with a
pre-established fidelity criterion. The main goal of the quantizer is to keep irrelevant
information out of the compressed representation.

In video applications, the bit rate of the encoded output is measured and used to
adjust the operation of the quantizer, so that a predetermined average output rate is
maintained. Thus, the visual quality of the output can vary from frame to frame as a
function of image content.

5.4.1.3. SymbolCoder

The symbol coder will generate a fixed or


variable length code to represent the
quantizer output and maps the output in accordance with
the code.


In many cases, a variable length code is used. The shortest code words are
assigned to the most frequently occurring quantizer output values because this
minimizes coding redundancy.
5.4.2. DECODING OR DECOMPRESSION PROCESS
The decoder contains two kinds of components:
1. Symbol decoder
2. Inverse mapper
The symbol decoder performs the inverse operation of the symbol encoder, and the
inverse mapper performs the inverse operation of the mapper.

Quantization results in irreversible information loss so an inverse quantizer block


is not included in the general decoder model.
In video applications, decoded output frames are maintained in an internal frame
store and used to reinsert the temporal redundancy that was removed at the encoder.

5.5. ERROR FREE COMPRESSION

Error free compression or lossless data compression is a class of data compression


algorithm that allows the exact original data to be reconstructed from the compressed
data.

[Figure: f(x, y) → Compression → Decompression → f̂(x, y)]

Fig. 5.2. Lossless Compression Model

e(x, y) = f(x, y) - f̂(x, y) = 0

Lossless compression is used in cases where it is important that the original and
the decompressed data to be identical or where deviations from the original data
could be deleterious.

Applications
1. Digital radiography
2. Satellite imaging

3. Archive of medical or business documents


Methods to achieve lossless or error free compression
There are four important methods:
1. Variable length coding - reduces coding redundancy
2. Bit-plane coding
3. LZW coding
4. Lossless predictive coding
Methods 2 to 4 reduce interpixel redundancies.

5.6. VARIABLE LENGTH CODING


Variable length coding is used to reduce only coding redundancy. It is present in
any natural binary encoding of gray level in an image.
Variable length coding assigns shortest possible code words to the most probable
intensity levels and vice versa.

5.6.1. HUFFMAN CODING


Huffman coding was introduced by Huffman in 1952. It is one of the most
popular techniques for removing coding redundancy.
Huffman coding provides the smallest possible number of code symbols per source
symbol when coding the symbols of an information source individually.
In practice, the source symbols may be either the intensities of an image or the
output of an intensity mapping operation.

Steps in Huffman Coding

Step 1

Create a series of source reductions by ordering the probabilities of the symbols under
consideration and combining the lowest probability symbols into a single symbol that
replaces them in the next source reduction.

The table below illustrates this process for binary coding. On the left side, a
hypothetical set of source symbols and their probabilities are listed in decreasing
order of probability.

{a1, a2, a3, a4, a5, a6} = {0.1, 0.4, 0.06, 0.1, 0.04, 0.3}

Table 5.1. Huffman Source Reduction

Original Source              Source Reduction
Symbol   Probability     1      2      3      4
a2       0.4            0.4    0.4    0.4    0.6
a6       0.3            0.3    0.3    0.3    0.4
a1       0.1            0.1    0.2    0.3
a4       0.1            0.1    0.1
a3       0.06           0.1
a5       0.04

The first source reduction is formed by combining 0.06 and 0.04,
i.e., 0.06 + 0.04 = 0.1.
So, in source reduction column I, this 0.1 is written instead of 0.06 and 0.04.
In source reduction column II (0.1 + 0.1 = 0.2), the 0.2 is written as the third number. In
column IV we get only 2 values, so we can stop the reduction.

Step 2
In this step, each reduced source is coded. It starts from the smallest source
obtained in the last step and goes back to the original source. The minimal length
binary codes used are 0 and 1.
Table 5.2. Huffman Code Assignment Procedure

Original source                        Source reduction
Symbol  Probability  Code      1          2          3          4
a2      0.4          1      0.4  1     0.4  1     0.4  1     0.6  0
a6      0.3          00     0.3  00    0.3  00    0.3  00    0.4  1
a1      0.1          011    0.1  011   0.2  010   0.3  01
a4      0.1          0100   0.1  0100  0.1  011
a3      0.06         01010  0.1  0101
a5      0.04         01011
The reduced symbols 0.6 and 0.4 in the last column are assigned 0 and 1 first.
Since 0.6 was generated by combining two symbols in the reduced source to its left,
a 0 and 1 are appended to its code to differentiate them from each other, which
produces the codes 00 and 01.
Then, a 0 and 1 are arbitrarily appended to 01, since its symbol 0.3 was
generated by adding 0.2 and 0.1 in the second column. This produces the codes 010
and 011.
This operation is repeated for each reduced source until the original source is
reached. The average length of this code is
Lavg (0.4) (1)+ (0.3) (2) + (0.1) (3) + (0.1) (4)+ (0.06) (5) + (0.04) (5)
= 2.2 bits/pixel
and the entropy of the source is 2.14 bits/symbol.
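The source-reduction idea can be sketched in a few lines with Python's heapq; the tuple
bookkeeping below is my own. The tree shape (and hence the individual code words) may
differ from Table 5.2, but the average length still works out to 2.2 bits/symbol.

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code book {symbol: codeword} from symbol probabilities."""
    # each heap entry: (probability, tie-breaker, {symbol: partial code})
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)            # two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in c1.items()}
        merged.update({s: '1' + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {'a1': 0.1, 'a2': 0.4, 'a3': 0.06, 'a4': 0.1, 'a5': 0.04, 'a6': 0.3}
code = huffman_code(probs)
l_avg = sum(probs[s] * len(code[s]) for s in probs)   # -> 2.2 bits/symbol
print(code, l_avg)
```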

Decoding

Huffman's procedure creates the optimal code for a set of symbols and
probabilities subject to the constraint that the symbols to be coded one at a time.
After the code has been created, coding and/or error free decoding is accomplished
in a simple lookup table method.
In Huffman coding each source symbol is mapped into a fixed sequence of code
symbols. Also, each code symbol can be instantaneously decoded in a unique way
without referring to the succeeding symbols. Therefore, it is called an instantaneous,
uniquely decodable block code.

Example
For the binary code of the above table, a left to right scan of the encoded string
010100111100 reveals that the first valid code word is 01010, which is the code for
symbol a3. The next valid code is 011, which corresponds to symbol a1. Continuing in
this manner reveals the completely decoded message to be a3 a1 a2 a2 a6.
Advantages
1
It creates an optimal code for a set of symbols and probability
2. Coding/decoding process can be done in a simple lookup table manner.
3. Implementation is very simple.

Drawbacks

1. When a large number of symbols is to be coded, the construction of an


optimal Huffman code is a non-trivial task.

2. For J source symbols, it requires J-2 source reductions and J-2 code


assignments. Therefore, the computational complexity is more.

5.6.2. RUN LENGTH ENCODING


Run-length encoding (RLE) is a technique used to reduce the size of a repeating
string of characters. This repeating string is called a Run.

Typically RLE encodes a run of symbols into two bytes: a count and a symbol.

RLE can compress any type of data regardless of its information content, but the
content of the data to be compressed affects the compression ratio.

Consider the example in which we represented an M x N image whose top
half is totally white and bottom half is totally black. That example was a primitive
attempt to encode the image using RLE.
The principle of RLE is to exploit the repeating values in a source. The algorithm
counts the consecutive repetitions of a symbol and uses that value to represent
the run. This simple principle works best on certain source types in which repeated
data values are significant.
Black-white document images, cartoon images, etc. are quite suitable for RLE.
Actually, RLE may be used on any kind of source regardless of its content, but its
compression efficiency changes significantly depending on whether the above types of data
are used or not. As another application suitable for RLE, we can mention text files,
which contain multiple spaces for indentation and formatting of paragraphs, tables and
charts.
Principle: As indicated above, the basic RLE principle is that a run of characters is
replaced with the number of the same characters and a single character. Examples may
be helpful to understand it better.
Example

Consider a text source: RTAAAASDEEEEE

The RLE representation is: RT*4ASD*5E


This example also shows how to distinguish whether a symbol corresponds to a
data value or to its repetition count (called a run). Each repeating bunch of characters is
replaced with three symbols: an indicator (*), the number of characters, and the character
itself.
We need the indicator character * to separate the encoding of a repeating cluster
from a single character.
In the above example, if there is no repetition around a character, it is encoded as
itself. It is also important to realize that the encoding process is
effective only if there are sequences of 4 or more repeating characters, because three
characters are used to conduct RLE.
For example, coding two repeating characters would lead to expansion, and coding
three repeating characters wouldn't cause compression or expansion, since we
represent a repetitive cluster with at least three symbols.
The decoding process is easy: if there is no control character (*), the coded
symbol just corresponds to the original symbol; if a control character occurs, then
it must be replaced with the character repeated the indicated number of times. It can be
noticed that decoding the control characters doesn't require any special procedures.
The encoder applies a three symbol encoding strategy to represent a
repetitive cluster:

CTRL | COUNT | CHAR
Here, the following terminology is defined:
CTRL  - control character which is used to indicate compression
COUNT - number of counted characters in a stream of the same characters
CHAR  - the repeating character
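A minimal sketch of the CTRL COUNT CHAR scheme just described, assuming the control
character and count digits never occur in the source text; the function names and the
min_run parameter are my own.

```python
def rle_encode(text, ctrl='*', min_run=4):
    """Encode runs of >= min_run identical characters as CTRL COUNT CHAR."""
    out, i = [], 0
    while i < len(text):
        run = 1
        while i + run < len(text) and text[i + run] == text[i]:
            run += 1
        if run >= min_run:
            out.append(f"{ctrl}{run}{text[i]}")    # e.g. AAAA -> *4A
        else:
            out.append(text[i] * run)              # short runs are copied as-is
        i += run
    return ''.join(out)

def rle_decode(coded, ctrl='*'):
    """Expand CTRL COUNT CHAR triplets back to the original runs."""
    out, i = [], 0
    while i < len(coded):
        if coded[i] == ctrl:
            j = i + 1
            while coded[j].isdigit():              # read the (possibly multi-digit) count
                j += 1
            out.append(coded[j] * int(coded[i + 1:j]))
            i = j + 1
        else:
            out.append(coded[i])
            i += 1
    return ''.join(out)

assert rle_encode("RTAAAASDEEEEE") == "RT*4ASD*5E"
assert rle_decode("RT*4ASD*5E") == "RTAAAASDEEEEE"
```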

5.6.3. BINARY SHIFT CODE


The most popular technique for removing coding
redundancy is Huffman coding.
When coding the symbols of an information source individually, Huffman coding yields
the smallest possible number of code symbols per source symbol. Huffman coding
therefore has the highest coding efficiency.
When a large number of symbols is to be coded, however, the construction of an optimal
binary Huffman code is a nontrivial task. For the general case of J source symbols, J-2
source reductions and J-2 code assignments are required, so the complexity increases.
Sacrificing coding efficiency for simplicity in code construction therefore becomes necessary.
In the binary shift code, the symbols are divided into blocks of equal size. Usually,
the block size is 2^k - 1 symbols, where k is a positive integer. If k = 1 (the size of each
block is 1), then the shift coding is equivalent to the standard Huffman coding.
The individual source symbols within the first block are coded identically with a
standard binary code. While coding the symbols of the first block, a hypothetical symbol
with probability equal to the sum of the probabilities of all symbols belonging to the
other blocks is coded with them, thus affecting the codeword assignment. The code words
for the remaining symbols are constructed by means of one or more prefix codes followed
by the code used within the reference block, as in the case of the binary shift code.

Binary Shift Algorithm


The Binary shift code is generated by the following
procedure:
1. The source symbols are arranged so that their probabilities are monotonically
decreasing.
2. The total number of source symbols is divided into symbol blocks of equal
size.
3. The individual source symbols within all blocks are coded identically with a
natural binary code. Let's designate the code of the hypothetical (shift)
symbol as C.
4. A unique prefix sequence C is chosen, and C^(k-1) is concatenated with the symbols of
block k to identify the symbols within this block.
Example
A source of information generates the symbols shown below. Encoding the
symbols with the binary encoder and the truncated Huffman (binary shift) encoder gives:

Source Symbol P; Binary Code


A0 0.3 0000
A1 0.2 0001
A2 0.15 0010
A3 0.1 0011
A4 0.08 0100
A5 0.06 0101
A6 0.05 0110
A7 0.04 0111
A8 0.02 1000
               H = 2.778      L_avg = 4

The entropy of the source is

H = - Σ (i = 0 to 8) P_i log2 P_i = 2.778 bit/symbol
Since we have 9 symbols (9 < 16 = 2^4), we need at least 4 bits to represent each
symbol in binary (fixed-length code). Hence the average length of the binary code is

L_avg = Σ (i = 0 to 8) P_i l_i = 4 Σ (i = 0 to 8) P_i = 4 * 1 = 4 bit/symbol

Thus the efficiency of the binary code is

η = H / L_avg = 2.778 / 4 = 69.5%

Let's divide the source symbols into blocks of 3 (= 2^2 - 1) symbols to construct a
shift code. Let's introduce a hypothetical symbol Ax whose probability is
0.35 (equal to the sum of the probabilities of the last six symbols, from A3 to A8). The
new set of symbols is shown in the table below.

Source Symbol    P_i     Binary Code
Ax               0.35    00
A0               0.3     01
A1               0.2     10
A2               0.15    11

Then, the resultant code is

Source Symbol    P_i     Shift Code
A0               0.3     01
A1               0.2     10
A2               0.15    11
A3               0.1     00 01
A4               0.08    00 10
A5               0.06    00 11
A6               0.05    00 00 01
A7               0.04    00 00 10
A8               0.02    00 00 11
                 H = 2.778      L_avg = 2.92

The 6 less probable source symbols are assigned the code of the hypothetical
symbol Ax (00), used once or twice as a prefix, concatenated with the 2-bit code of
their position within the block.
The average length of the truncated Huffman (shift) code is

L_avg = Σ (i = 0 to 8) P_i l_i = 0.65 * 2 + 0.24 * 4 + 0.11 * 6
      = 2.92 bit/symbol
Thus the efficiency of the binary shift code is

η = H / L_avg = 2.778 / 2.92 = 95.14%
This example demonstrates that the efficiency of the Binary shift encoder is higher
than that of the binary encoder.
Applying the Huffman code to the same source, we get the following code words:

Source Symbol P; Binary Code


A0 0.3 00

A1 0.2 01

A2 0.15 100

A3 0.1 110

A4 0.08 1010

A5 0.06 1011

A6 0.05 1110
A7 0.04 11110
A8 0.02 11111
              H = 2.778      L_avg = 2.81

The average length of the Huffman code is

L_avg = Σ (i = 0 to 8) P_i l_i = 0.5 * 2 + 0.25 * 3 + 0.19 * 4 + 0.06 * 5
      = 2.81 bit/symbol

Thus the efficiency of the Huffman code is

η = H / L_avg = 2.778 / 2.81 = 98.86%

This example demonstrates that the efficiency of the truncated Huffman (binary shift) encoder is
a bit lower than that of the standard Huffman encoder. However, the code construction
time is reduced by using the binary shift.


5.6.4. ARITHMETIC CODING
Arithmetic coding was developed by Elias in 1963. It is one of the variable
length coding methods used to reduce the coding redundancy present in an
image.
In arithmetic coding, a one-to-one correspondence between source symbols and code
words does not exist, because it generates non-block codes.
An entire sequence of source symbols is assigned a single arithmetic code word.
The code word itself defines an interval of real numbers between 0 and 1.
When the number of symbols in the message increases, two changes can happen.
1. The interval used to represent the message becomes smaller according to the

probability of each symbol.


2. The number of information units required to represent the interval becomes
larger.

Procedure
Consider a source with four symbols a1, a2, a3 and a4. Now, the sequence or
message of five symbols a1 a2 a3 a3 a4 is required to be coded.

Step 1: The message is assumed to occupy the entire half-open interval [0, 1]
Step 2: This interval is subdivided initially into four regions based on the
probabilities of each source symbol.
Table 5.3. Arithmetic Coding Example

Source Symbol   Probability   Initial Subinterval
a1              0.2           [0.0, 0.2)
a2              0.2           [0.2, 0.4)
a3              0.4           [0.4, 0.8)
a4              0.2           [0.8, 1.0)

Step 3: The first symbol a1 of the message narrows the range to the initial sub interval
[0, 0.2).
Step 4: The interval [0, 0.2) is subdivided according to the probability of the
next symbol a2,

i.e., Minimum value + [Difference x Subinterval] = New interval

0 + [(0.2 - 0) x 0.2] = 0.04
0 + [(0.2 - 0) x 0.4] = 0.08
Thus, a2 narrows the subinterval to [0.04, 0.08).
Step 5: The interval [0.04, 0.08) is subdivided according to the probability of a3,
i.e., 0.04 + [(0.08 - 0.04) x 0.4] = 0.056
     0.04 + [(0.08 - 0.04) x 0.8] = 0.072
Thus, a3 narrows the interval to [0.056, 0.072).
Step 6: The fourth message symbol a3 narrows the interval to [0.0624, 0.0688),
i.e., 0.056 + [(0.072 - 0.056) x 0.4] = 0.0624
     0.056 + [(0.072 - 0.056) x 0.8] = 0.0688
Step 7: Finally, the last message symbol a4, which is reserved as a special end-of-message
indicator, narrows the range to [0.06752, 0.0688):
     0.0624 + [(0.0688 - 0.0624) x 0.8] = 0.06752
     0.0624 + [(0.0688 - 0.0624) x 1] = 0.0688

Encoding sequence

[Figure: successive interval narrowing - [0, 1) → [0, 0.2) → [0.04, 0.08) → [0.056, 0.072) → [0.0624, 0.0688) → [0.06752, 0.0688)]

Fig. 5.3. Arithmetic Coding Procedure
Procedure
In the arithmetically coded message of the above figure, three decimal digits are used
to represent the five-symbol message a1 a2 a3 a3 a4.

Here, the number of source symbols is 5 and the number of decimal digits used to
represent the message is 3.
Number of decimal digits / source symbol = 3/5 = 0.6 digits/symbol
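A short sketch of the interval-narrowing step just worked through; the function name and
the model dictionary are my own, and a practical coder would also emit bits incrementally
rather than keep full-precision floats.

```python
def arithmetic_interval(message, model):
    """Narrow the [0, 1) interval for a message, as in Table 5.3 (encoding only)."""
    low, high = 0.0, 1.0
    for symbol in message:
        span = high - low
        sub_low, sub_high = model[symbol]          # initial subinterval of the symbol
        low, high = low + span * sub_low, low + span * sub_high
    return low, high                               # any number in [low, high) codes the message

model = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4), 'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}
print(arithmetic_interval(['a1', 'a2', 'a3', 'a3', 'a4'], model))
# -> approximately (0.06752, 0.0688)
```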
Limitations
There are two practical factors which affect the performance efficiency of
arithmetic coding. They are
1. The addition of the end-of-message indicator that is needed to separate one
message from another.
2. The use of finite precision arithmetic.
Practical implementations of arithmetic coding address the latter problem by
introducing a scaling strategy and a rounding strategy.

Scaling Strategy
The scaling strategy renormalizes each subinterval to the [0, 1) range before
subdividing it in accordance with the symbol probabilities.

Rounding Strategy
The rounding strategy is used to represent the coding subintervals accurately by
preventing the truncation effects of finite precision arithmetic.

5.6.5. LZW CODING


an image can be obtained by
An error free compression and spatial redundancies in
a technique called Lempel-Ziv-Welch (LZW) coding. It assigns fixed length code
words to variable length sequences of source symbols.
This technique is based on Shannon's first theorem, which states that the nth
extension of a zero-memory source can be coded with fewer average bits per source
symbol than the non-extended source itself.
Knowledge of the probability of occurrence of the source symbols is not required
before encoding them. The encoding procedure of the LZW method is very simple, as
given below.
First we need to construct a codebook or dictionary containing the source
symbols to be coded. For 8 bit monochrome images, the first 256 words of the
dictionary are assigned to the intensities 0, 1, 2, ..., 255.

The encoder examines the pixels of the image one by one; the intensity sequences that
are not in the dictionary are placed in algorithmically determined locations, which may
be the next unused locations.
For example, if the first two pixels of the image are white which corresponds to
the intensity level 255, the sequence 255-255 may be placed in the location 256 i.e..
the next unused location.
The size of the dictionary is an important system parameter. If it is too small,
the detection of matching intensity level sequences will be less likely, and if it is
too large, the size of the code words will adversely affect compression performance.
The decoding of the image is performed by reconstructing
the code book or
dictionary. A unique feature of the LZW coding is
that the coding dictionary or code
book is created while the data are being encoded.
An LZW decoder builds an identical decompression dictionary as it simultaneously
decodes the encoded data stream. In most practical applications, however, dictionary
overflow is the most critical problem. There are three ways to handle this problem:
1. Flush or reinitialize the dictionary when it becomes full and continue coding
with a newly initialized dictionary.
2. Monitor compression performance and flush the dictionary when it becomes
poor or unacceptable.
3. The least used dictionary entries can be tracked and replaced when necessary.
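A greedy LZW encoding sketch for a 1-D pixel sequence, following the description above;
the function name is my own, and a real codec would additionally bound the code-word
width and apply one of the overflow strategies listed.

```python
def lzw_encode(pixels, symbol_size=256):
    """Greedy LZW encoding: emit the code of the longest dictionary match."""
    dictionary = {(i,): i for i in range(symbol_size)}   # codes 0..255 = intensities
    next_code = symbol_size
    current, output = (), []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                    # keep extending the match
        else:
            output.append(dictionary[current])     # emit code for the longest match
            dictionary[candidate] = next_code      # add new sequence, e.g. 255-255 -> 256
            next_code += 1
            current = (p,)
    if current:
        output.append(dictionary[current])
    return output

print(lzw_encode([255, 255, 255, 255, 0, 0]))      # -> [255, 256, 255, 0, 0]
```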
Advantages
LZW technique is simple
LZW compression has been integrated
into a variety of mainstream imaging
file formats, including
1. Graphic Interchange Format
(GIF)
2. Tagged Image
File Format (TIFF)
3. Portable Document Format (PDF)
The PNG format was created to get
around LZW Licensing Requirements.

5.7. BIT PLANE CODING


Bit plane coding is based on the concept of decomposing a multilevel image into a
series of binary images and compressing each binary image via one of several well
known binary compression methods. The two most popular decomposition
approaches are described in this section:
1. Base-2 polynomial
2. m-bit Gray code

5.7.1. BASE-2 POLYNOMIAL


The intensities of an m-bit monochrome image can be represented in the form of
the base-2 polynomial

a_{m-1} 2^{m-1} + a_{m-2} 2^{m-2} + ... + a_1 2^1 + a_0 2^0

Decomposing the image into a collection of binary images is done by separating the
m coefficients of the polynomial into m 1-bit bit planes.
The lowest order bit plane is generated by collecting the a_0 bits of each pixel, while
the highest order bit plane contains the a_{m-1} bits.

In general, each bit plane is constructed by setting its pixels equal to the values of
the appropriate bits, or polynomial coefficients, from each pixel in the original image.

Disadvantages
Small changes in intensity can have a significant impact on the complexity of the bit
planes.

5.7.2. M-BIT GRAY CODE


The m-bit Gray code is an alternative decomposition approach which reduces the
effect of small intensity variations. The m-bit Gray code g_{m-1} ... g_2 g_1 g_0 that
corresponds to the polynomial in the above equation can be computed from

g_i = a_i ⊕ a_{i+1},    0 ≤ i ≤ m-2
g_{m-1} = a_{m-1}                                       ... (5.5)

Here ⊕ denotes the exclusive OR operation. This code has the unique property that
successive code words differ in only one bit position.

Advantages
Small changes in intensity are less likely to affect all m bit planes.
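A small sketch of Gray-coded bit-plane decomposition per equation 5.5; the function name
and the example intensities are my own.

```python
import numpy as np

def gray_code_planes(image, m=8):
    """Decompose an m-bit image into m Gray-coded bit planes."""
    img = image.astype(np.uint16)
    gray = img ^ (img >> 1)                       # g_i = a_i XOR a_{i+1}, g_{m-1} = a_{m-1}
    return [(gray >> i) & 1 for i in range(m)]    # planes[0] is the lowest order plane

img = np.array([[127, 128], [128, 129]], dtype=np.uint8)
planes = gray_code_planes(img)
# 127 -> Gray 64 and 128 -> Gray 192: only one bit changes between them,
# whereas in straight binary all 8 bits flip between 127 and 128.
```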

5.8. LOSSLESS PREDICTIVE CODING


Predictive coding is based on eliminating the redundancies of closely spaced
pixels, in space and/or time, by extracting and coding only the new information in each
pixel.
The new information of a pixel is defined as the difference between the actual and
predicted value of the pixel.
[Figure: (a) Encoder - input sequence f(n) minus predictor output (rounded to the nearest integer) gives the prediction error e(n), which passes through the symbol encoder to the compressed sequence; (b) Decoder - compressed sequence → symbol decoder → e(n) added to the predictor output f̂(n) → decompressed sequence f(n)]

Fig. 5.4. Lossless Predictive Coding Model (a) Encoder (b) Decoder
5.8.1. ENCODER
An encoder contains three different kinds of components:
1. Predictor
2. Nearest Integer
3. Symbol encoder

Predictor
Successive samples of a discrete time input signal f(n) are given to the predictor.
The predictor generates the anticipated value of each sample based on a
specified number of past samples.

Nearest Integer
The output of the predictor is rounded to the nearest integer, denoted f̂(n), and used to
form the difference or prediction error

e(n) = f(n) - f̂(n)                                      ... (5.6)

Symbol Encoder
The difference between f(n) and f̂(n), that is e(n), is encoded using a
variable length code by the symbol encoder.
The symbol encoder generates the next element of the compressed data stream,
which is the output of the encoder.

5.8.2. DECODER
The decoder contains the following two components:
1. Symbol decoder
2. Predictor

Symbol Decoder.
It performs the opposite action of the symbol encoder. It reconstructs e(n) from the
received variable length code words and performs the inverse operation

f(n) = e(n) + f̂(n)                                      ... (5.7)

to decompress or recreate the original input sequence.

Various local, global and adaptive methods can be used to generate f̂(n).

Predictor
The prediction is formed as a linear combination of m previous samples,

f̂(n) = round [ Σ (i = 1 to m) α_i f(n - i) ]            ... (5.8)

where
round is a function used to denote the rounding or nearest integer operation
m is the order of the linear predictor
α_i, for i = 1, 2, ..., m, are prediction coefficients.
If the input sequence of the encoder is considered to be samples of an image, the f(n)
in the above three equations are pixels, and the m samples used to predict the value of
each pixel may come from one of the following three places.
1. If the samples come from the current scan line, it is called 1-D linear
predictive coding.
2. If they come from the current and previous scan lines, it is called 2-D linear
predictive coding.
3. If they come from the current image and previous images in a sequence
of images, it is called 3-D linear predictive coding.
Thus, for 1-D linear predictive image coding, f̂(n) can be written as

f̂(x, y) = round [ Σ (i = 1 to m) α_i f(x, y - i) ]       ... (5.9)

where each sample is now expressed as a function of the input image's spatial
coordinates x and y.
The above equation indicates that the 1-D linear prediction is a function of the
previous pixels on the current line alone.
In 2-D predictive coding, the prediction is a function of the previous pixels in a
left-to-right, top-to-bottom scan of an image.
In 3-D predictive coding, it is based on these pixels and the previous pixels of
preceding frames.
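A minimal 1-D lossless predictive coding sketch with a single-tap predictor
f̂(n) = round(α f(n-1)); the first pixel of the line is stored as-is (it cannot be
predicted), and the function names and example values are my own.

```python
import numpy as np

def predictive_encode(line, alpha=1.0):
    """Prediction error sequence e(n) = f(n) - round(alpha * f(n-1))  (eq. 5.6)."""
    f = line.astype(np.int32)
    e = f.copy()
    e[1:] = f[1:] - np.round(alpha * f[:-1]).astype(np.int32)
    return e

def predictive_decode(e, alpha=1.0):
    """Exact inverse: f(n) = e(n) + round(alpha * f(n-1))  (eq. 5.7)."""
    f = e.astype(np.int32).copy()
    for n in range(1, len(f)):
        f[n] = e[n] + int(round(alpha * f[n - 1]))
    return f

line = np.array([100, 102, 103, 103, 101], dtype=np.uint8)
err = predictive_encode(line)          # [100, 2, 1, 0, -2]: small values, peaked at zero
assert np.array_equal(predictive_decode(err), line.astype(np.int32))
```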

Overhead of the Predictive Coding Process


The above equation (5.9) cannot be evaluated for the first m pixels of each line, so those
pixels must be coded by other means, such as a Huffman code. Similar comments
apply to the higher dimensional cases.
The compression achieved in predictive coding
is related directly to the entropy
reduction that results from mapping an
input image into a prediction error sequence
called a prediction residual.
Spatial redundancy is removed by
the prediction and differencing process.
Therefore, the probability density function (PDF) of the prediction error is highly
peaked at zero and characterized by a small variance.
This PDF can be modeled by a zero mean
uncorrelated Laplacian PDF.
p(e) = (1 / (√2 σ_e)) e^(-√2 |e| / σ_e)                  ... (5.10)

where σ_e is the standard deviation of e.

5.9, LOSSY COMPRESSION

Lossy compression is a compression technique that does not decompress digital data back
to 100% of the original. Lossy methods can provide a high degree of compression and result
in smaller compressed files, but some of the original pixels, sound waves or
video frames are removed forever. Examples are the widely used JPEG image, MPEG
video and MP3 audio formats.
Lossy compression is never used for business data and text, which demand
perfect restoration.
Table 5.4. Comparison between Lossless and Lossy compression
S.No   Lossless Compression                 Lossy Compression
1.     No information is lost               Some information is lost
2.     Used for text and data               Used for audio and video
3.     Compression ratio is less            High compression ratio
4.     Completely reversible                It is not reversible
5.     Compression is independent of        Compression depends upon the sensitivity
       human response                       of human eyes, ears, etc.

5.10. BLOCK TRANSFORM CODING

Block transform coding is a technique of lossy compression that divides an image
into small non-overlapping blocks of equal size and processes the blocks
independently using a 2-D transform.
In block transform coding, a reversible linear transform is used to map each block
or sub image into a set of transform coefficients, which are then quantized and coded.
For most images, a significant number of the coefficients have small magnitudes
and can be coarsely quantized with little image distortion.
[Figure: (a) Encoder - input image (M x N) → construct n x n subimages → forward transform → quantizer → symbol encoder → compressed image; (b) Decoder - compressed image → symbol decoder → inverse transform → merge n x n subimages → decompressed image]

Fig. 5.5. A Block Transform coding system (a) encoder (b) decoder

Encoder
The encoder performs four relatively straightforward operations:
1. Sub image decomposition
2. Transformation
3. Quantization and
4. Coding
First, an M x N input image is subdivided into sub images of size n x n, which are
transformed to generate MN/n² sub image transform arrays, each of size n x n.
The main goal of the transformation process is to decorrelate the pixels of each sub
image, or to pack as much information as possible into smallest number of transform
coefficients.
The quantization stage selectively eliminates or more coarsely quantizes the
coefficients that carry the least amount of information in a predefined sense.
These coefficients have the smallest impact on reconstructed sub image quality.
The encoding process terminates by coding the quantized coefficients.
If any or all of the transform encoding steps are adapted to local image content, it is
called adaptive transform coding; if they are fixed for all sub images, it is called
nonadaptive transform coding.

Decoder
The decoder performs the inverse operations of the encoder. The only difference
is that there is no quantization step.

There are three main issues to be taken care of during transform coding of an image:
1. Transform selection
2. Sub image size selection and
3. Bit allocation

5.10.1. TRANSFORM SELECTION


Block transform coding systems based on a variety of discrete 2-D transforms
have been constructed. The choice of a particular transform in a given application
depends on
(i) The amount of reconstruction error that can be tolerated.
(ii) The computational resources available.
Compression is achieved during the quantization of the transformed coefficients,
not during the transformation step.
Consider a sub image g(x, y) of size n x n whose forward discrete transform
T(u, v) can be expressed in terms of the general relation

T(u, v) = Σ (x = 0 to n-1) Σ (y = 0 to n-1) g(x, y) r(x, y, u, v)

where r(x, y, u, v) is the forward transformation kernel.

Similarly, the generalized inverse discrete transform is defined as

g(x, y) = Σ (u = 0 to n-1) Σ (v = 0 to n-1) T(u, v) s(x, y, u, v)

where s(x, y, u, v) is the inverse transformation kernel.

In these equations, the T(u, v) for u, v = 0, 1, 2, ..., n-1 are called transform
coefficients.

Transformation Kernels

The forward and inverse transformation kernels are also known as basis functions
or basis images. These kernels determine the following:

1. The type of transform that is computed
2. The overall computational complexity
3. Reconstruction errors of the block transform coding system in which they are
   employed.

The forward and reverse transformation kernels are said to be separable if

r(x, y, u, v) = r1(x, u) r2(y, v)
s(x, y, u, v) = s1(x, u) s2(y, v)                        ... (5.11)
The forward and reverse transformation kernels are said to be symmetric if r1 is
functionally equal to r2 (and s1 to s2), so that

r(x, y, u, v) = r1(x, u) r1(y, v)
s(x, y, u, v) = s1(x, u) s1(y, v)                        ... (5.12)

Transformation Kernel Pairs

Three transformation kernel pairs are commonly used:
1. Discrete Fourier transform (DFT) kernel pair
2. Walsh - Hadamard transform (WHT) kernel pair
3. Discrete Cosine Transform (DCT) kernel pair

DFT kernel pair

The best known transformation kernel pair is the Discrete Fourier Transform (DFT)
kernel pair. It is expressed as

r(x, y, u, v) = e^(-j2π(ux + vy)/n)
and
s(x, y, u, v) = (1/n²) e^(j2π(ux + vy)/n)                ... (5.13)

where j = √-1.
WHT Kernel Pair

A computationally simpler transformation that is also useful in transform coding,
called the Walsh-Hadamard Transform (WHT), is derived from the functionally
identical kernels

r(x, y, u, v) = s(x, y, u, v) = (1/n) (-1)^[ Σ (i = 0 to m-1) ( b_i(x) p_i(u) + b_i(y) p_i(v) ) ]

where n = 2^m.

The summation in the exponent of this expression is performed in modulo 2
arithmetic, and b_k(z) is the kth bit (from right to left) in the binary representation
of z.
If m = 3 and z = 6, for example, b_0(z) = 0, b_1(z) = 1 and b_2(z) = 1. The p_i(u) in the
above equation are computed using

p_0(u) = b_{m-1}(u)
p_1(u) = b_{m-1}(u) + b_{m-2}(u)

p_2(u) = b_{m-2}(u) + b_{m-3}(u)
...
p_{m-1}(u) = b_1(u) + b_0(u)                             ... (5.14)
DCT Kernel Pair
One of the transformations used most frequently for image compression is the
Discrete Cosine Transform (DCT). It is obtained as

r(x, y, u, v) = s(x, y, u, v)
             = α(u) α(v) cos[ (2x + 1)uπ / 2n ] cos[ (2y + 1)vπ / 2n ]      ... (5.15)

where

α(u) = √(1/n)   for u = 0
     = √(2/n)   for u = 1, 2, ..., n-1

and similarly for α(v).
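A small sketch of the separable form of equation 5.15: building the 1-D DCT basis
matrix and transforming one 8 x 8 block as T = C G Cᵀ. The function name and the
random test block are my own, not part of the text.

```python
import numpy as np

def dct_basis(n=8):
    """n x n DCT basis matrix with C[u, x] = alpha(u) cos((2x+1)u*pi / 2n)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    C = np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] *= np.sqrt(1 / n)          # alpha(0) = sqrt(1/n)
    C[1:, :] *= np.sqrt(2 / n)         # alpha(u) = sqrt(2/n), u >= 1
    return C

C = dct_basis(8)
block = np.random.default_rng(1).integers(0, 256, (8, 8)).astype(float) - 128
T = C @ block @ C.T                    # forward 2-D DCT of one 8 x 8 sub image
back = C.T @ T @ C                     # inverse transform recovers the block
assert np.allclose(back, block)
```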
Transform Matrix, G

An n x n sub image g(x, y) can be expressed as a function of its 2-D transform
T(u, v):

g(x, y) = Σ (u = 0 to n-1) Σ (v = 0 to n-1) T(u, v) s(x, y, u, v),  x, y = 0, 1, 2, ..., n-1   ... (5.16)

The inverse kernel s(x, y, u, v) in the above equation depends only on the indices x,
y, u, v and not on the values of g(x, y) or T(u, v).

This can be written in matrix form as

G = Σ (u = 0 to n-1) Σ (v = 0 to n-1) T(u, v) S_uv                  ... (5.17)

where
G is an n x n matrix containing the pixels of g(x, y), and

S_uv = [ s(0, 0, u, v)      s(0, 1, u, v)      ...   s(0, n-1, u, v)
         s(1, 0, u, v)      s(1, 1, u, v)      ...   s(1, n-1, u, v)
         ...                ...                      ...
         s(n-1, 0, u, v)    s(n-1, 1, u, v)    ...   s(n-1, n-1, u, v) ]

Then G, the matrix containing the pixels of the input sub image, is explicitly
defined as a linear combination of n² matrices S_uv of size n x n.

Approximation of G
If we define a transform coefficient masking function as
0 if T(u, v) satisfies a specified truncation criterion
xlu, v) otherwise
For u, v
=
0, 1, 2...n -
1, an approximation of G can be obtained from the

truncated expansion.
n-1n-1 ... (5.18)
G =X E x (4, v) T(u, v) S,p
u=0y =0
Where,
x (u, v) is constructed to eliminate the basis images that make the smallest
contribution to the total image.

Mean Square Error

The mean square error between sub image G and approximation G then is

em
E{IG– Gr}
(n-1 n-1 n z (u, v) T(u, v)
T(4, v)
S,- S,?
u=0 V=0 u=0 v=0
[n-1 n-1

u=0 v=0
n-ln-1 oTa)
2 x ... (5.19)
= E [1- 4,
v)]
u=0V=0
where
JIG– G|is the norm of matrix (G-G)
2
CT(u) is the variance of the coefficient at transform location (u, v).
The mean square errors of the MN/n² sub images of an M x N image are identical.
Thus the mean square error of the M x N image equals the mean square error of a
single sub image.

This mean square error can be minimized using the Karhunen-Loeve transform (KLT). But
the KLT is data dependent; obtaining the KLT basis images for each sub image is a
non-trivial computational task. For this reason, the KLT is used infrequently in practice
for image compression.

A transform such as the DFT, WHT or DCT, whose basis images are fixed
(input independent), is normally used instead.

5.10.2. SUB IMAGE SIZE SELECTION


Another significant factor affecting transforms coding error
and computational
complexity is sub image size.

In most applications, images are subdivided so that the correlation between


adjacent sub images is reduced to some acceptable level.

Typically, n, the sub image dimension, is chosen to be an integer power of 2.


As the sub image size increases:
1. Reconstruction error decreases (depending on the transform used)
2. The level of compression increases
3. Computational complexity increases
The most popular sub image sizes are 8 x 8 and 16 x 16. The effect of sub image
size on the reconstruction error, i.e., the root-mean-square error, for the Fourier,
Walsh-Hadamard and Cosine transforms is shown in the figure below.

All three curves intersect when 2 x 2 sub images are used. In this case, only one of
the four coefficients (25%) of each transformed array was retained. The coefficient in
all cases was the dc component, so the inverse transform simply replaced the four sub
image pixels by their average value.

[Figure: root-mean-square reconstruction error versus sub image size (2 to 256) for the FFT, WHT and DCT; the DCT gives the lowest error and the FFT the highest]

Fig. 5.6. Reconstruction error versus subimage size

5.10.3. BIT ALLOCATION


The overall process of truncating, quantizing and coding the coefficients of a

transformed sub image is called as bit allocation.

In most transform coding systems, the retained coefficients are selected on the
basis of maximum variance called Zonal Coding or on the basis of maximum
magnitude called Threshold Coding.

5.10.3.1. Zonal Coding


Zonal coding is based on the information theory concept of viewing information as
uncertainty. Therefore the transform coefficients of maximum variance carry the most
image information and should be retained in the coding process.

The variances of the transformed coefficients can be calculated directly


from the
ensemble of MN/n2 transformed sub image arrays.

[Figure: binary zonal mask with 1s clustered around the top-left (low frequency) corner and 0s elsewhere]

Fig. 5.7. Zonal Mask (shading highlights the coefficients that are retained)
A zonal mask can be constructed by placing a 1 in the locations of maximum
variance and a 0 in all other locations. Coefficients of maximum variance are located
around the origin of an image transform, resulting in the typical zonal mask shown in
figure 5.7.
The retained coefficients of the zonal sampling process must be quantized and
coded. So zonal masks are sometimes depicted showing the number of bits used to
code each coefficient which is shown in below figure.
[Figure: 8 x 8 zonal bit allocation array; the dc coefficient receives the most bits (8), and the allocation decreases toward the high frequency corner (0 bits)]

Fig. 5.8. Zonal bit allocation


This coding operation can be performed in two ways:

1. The coefficients are normalized by their standard deviations and uniformly
quantized.
2. A quantizer, such as an optimal Lloyd-Max quantizer, is designed for each
coefficient.
or dc coefficient normally i
To construct the required quantizer, the zeroth
coefficients are
modeled by a Ray Leigh density function, whereas the remaining
modeled by a Laplacian or Gaussian density.
The number of quantization levels allotted to each quantizer is made proportional
o
to log, T(u, v)
Thus the retained coefficients will be. selected on the basis of
maximum variance. The bits assigned to the coefficients also be proportional to the
logarithm of the coefficient variance.

5.10.3.2. Threshold Coding a

Zonal coding is implemented by using a single fixed mask for all sub images,
whereas threshold coding is inherently adaptive: the locations of the transform
coefficients retained for each sub image vary from one sub image to another.
Threshold coding is the adaptive transform coding approach most often used in
practice because of its computational simplicity.
The concept of threshold coding is that for any sub image, the transform
coefficients of largest magnitude make the most significant contribution to
reconstructed sub image quality.
The locations of the maximum coefficients vary from one sub image to another, so
the elements of x (u, v) T(u, v) normally are reordered to form a 1-D, run-length
coded sequence.

[Figure: binary threshold mask for one sub image; the retained (1) locations are scattered rather than confined to the low frequency corner]

Fig. 5.9. Threshold mask (shading highlights the coefficients that are retained)
The above figure shows a typical threshold mask for one sub image of a hypothetical
image. This mask provides a convenient way to visualize the threshold coding

process for the corresponding sub image, as well as a way to mathematically describe the
process of approximating Ĝ.
When the mask is applied to the sub image for which it was derived, the resulting
n x n array is reordered to form an n² element coefficient sequence in accordance
with the zigzag ordering pattern shown in the figure below.

0    1    5    6    14   15   27   28
2    4    7    13   16   26   29   42
3    8    12   17   25   30   41   43
9    11   18   24   31   40   44   53
10   19   23   32   39   45   52   54
20   22   33   38   46   51   55   60
21   34   37   47   50   56   59   61
35   36   48   49   57   58   62   63

Fig. 5.10. Threshold coefficient ordering (zigzag) sequence


The reordered 1-D sequence contains several long runs of 0's representing the
discarded coefficients. These are run-length coded.
The non-zero or retained coefficients, corresponding to the mask locations that
contain a 1, are represented using a variable length code.
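A small sketch of the zigzag reordering of Fig. 5.10 followed by run-length coding of the
zero runs; the helper names and the example block are my own, and the final pair stands
in for an end-of-block marker.

```python
import numpy as np

def zigzag_indices(n=8):
    """(row, col) visit order of the zigzag pattern, built by sorting anti-diagonals."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],                       # anti-diagonal index
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_run_length(masked_coeffs):
    """Zigzag-scan a masked coefficient block and emit (zero run, value) pairs."""
    seq = [masked_coeffs[r, c] for r, c in zigzag_indices(masked_coeffs.shape[0])]
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1                       # extend the current run of discarded zeros
        else:
            pairs.append((run, v))         # zero run followed by a retained value
            run = 0
    pairs.append((run, 0))                 # trailing zeros (end of block)
    return pairs

block = np.zeros((8, 8))
block[0, 0], block[1, 0], block[2, 3] = 52, -7, 3
print(zigzag_run_length(block))            # -> [(0, 52.0), (1, -7.0), (14, 3.0), (46, 0)]
```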
There are three basic ways to apply a threshold to a transformed sub image, or to
create a sub image threshold masking function. They are:
1. A single global threshold can be applied to all sub images. Here, the level of
compression differs from image to image, depending on the number of
coefficients that exceed the global threshold.
2. A different threshold can be used for each sub image; this method is called
N-largest coding. In this method, the same number of coefficients is discarded
for each sub image. As a result, the code rate is constant and known in
advance.


3. The threshold can be varied as a
function of the location of each coefficient
within the sub image.
Advantage of this method is that thresholding and.quantization. can
be combinea
by replacing x(u, v) T(u, v) in approximation of G with
where

T(u, v) is a threshold and


quantized approximation of T(u, v)
and Z(u, v) is an
element of the transform normalization array.

Z = [ Z(0, 0)      Z(0, 1)     ...   Z(0, n-1)
      Z(1, 0)      Z(1, 1)     ...   Z(1, n-1)
      ...          ...               ...
      Z(n-1, 0)    Z(n-1, 1)   ...   Z(n-1, n-1) ]
Before T̂(u, v) is inverse transformed to obtain an approximation of the sub image
g(x, y), it must be multiplied by Z(u, v):

T'(u, v) = T̂(u, v) Z(u, v)                               ... (5.20)

The resulting array T'(u, v) is called the denormalized approximation, and its inverse
transform yields the decompressed sub image.
The figure below depicts that, if Z(u, v) is assigned a particular value c, then T̂(u, v)
assumes integer value k if and only if

kc - c/2 ≤ T(u, v) < kc + c/2                             ... (5.21)

If Z(u, v) > 2T(u, v), then T̂(u, v) = 0 and the transform coefficient is completely
truncated or discarded.

When T̂(u, v) is represented with a variable length code that increases in length as the
magnitude of k increases, the number of bits used to represent T(u, v) is controlled
by the value of c.
[Figure: staircase plot of T̂(u, v) versus T(u, v); the step width is c and the flat zone around the origin discards small coefficients]

Fig. 5.11. A threshold coding quantization curve

16   11   10   16   24   40   51   61
12   12   14   19   26   58   60   55
14   13   16   24   40   57   69   56
14   17   22   29   51   87   80   62
18   22   37   56   68   109  103  77
24   35   55   64   81   104  113  92
49   64   78   87   103  121  120  101
72   92   95   98   112  100  103  99

Fig. 5.12. Typical normalization matrix


The elements of Z can be scaled to achieve a variety of compression levels. The above
figure shows a typical normalization array.

5.11. WAVELET CODING


Wavelet coding is based on the idea that the coefficients of a transform that decorrelates
the pixels of an image can be coded more efficiently than the original pixels themselves.

The wavelet transform decomposes an image into a set of basis functions called
wavelets.

Wavelets pack most of the important visual information into a small number of
coefficients, and the remaining coefficients can be quantized coarsely or truncated to
zero with little image distortion.

5.11.1. WAVELET CODING SYSTEM


as
Wavelet coding system consists of two major parts such
1. Encoder
2. Decoder

5.11.1.1. Encoding
The figure below shows the basic functions of the encoding process.

[Figure: input image → wavelet transform → quantizer → symbol encoder → compressed image]

Fig. 5.13. Wavelet Encoder


To encode a 2^J x 2^J image, an analyzing wavelet ψ and a minimum decomposition
level J - P are selected and used to compute the discrete wavelet transform of the
image.

If the wavelet has a complementary scaling function φ, then the fast wavelet
transform can be used.
The computed transform converts a large portion of the original image to
horizontal, vertical and diagonal decomposition coefficients with zero mean and
Laplacian-like probabilities.
Many of the computed coefficients carry little visual information, and they can be
quantized and coded to minimize intercoefficient and coding redundancy.
The quantization can be adapted to exploit any positional correlation across the P
decomposition levels.
One or more of the following lossless coding methods can be incorporated into the
final symbol coding step:
1. Run-length
coding
2. Huffman coding
3. Arithmetic coding and
4. Bit plane coding.


5.11.1.2. Decoding
The figure below shows the basic functions of the decoding process.

[Figure: compressed image → symbol decoder → inverse wavelet transform → decompressed image]

Fig. 5.14. Wavelet Decoder

The decoding process is an inverse operation of the encoding process, except that it
does not have a quantization step, because quantization cannot be reversed exactly.
A wavelet based system has the following advantages compared with a block transform
coding system:
1. Wavelet transforms are both computationally efficient and inherently local.
2. Subdivision of the original image is unnecessary.

5.11.2. WAVELET SELECTION


The wavelets chosen as the basis of the forward and inverse transforms in wavelet
coding system will affect all aspects of wavelet coding system
design and
performance.

When the transforming wavelet has a companion scaling


function then the
transformation can be implemented as a sequence
of
digital filtering operations, with
the number of filter taps equal to the number
of non-zero vavelet and scaling vector
coefficient.

The ability of the wavelet to pack information into a small number of transform
coefficients determines its compression and reconstruction performance.
The most widely used expansion functions for wavelet based compression are
1. Daubechies wavelets
2. Biorthogonal wavelets
5.11.3.
DECOMPOSITION LEVEL SELECTION
The number of decomposition or transform levels is another important factor,
which affects
1. Wavelet coding computational complexity
2. Reconstruction error
A P-scale fast wavelet transform involves P filter bank iterations, so the number of
operations in the computation of the forward and inverse transforms increases with
the number of decomposition levels.

Also, more decomposition levels affect increasingly larger areas of the
reconstructed image.
In many applications, like searching image databases or transmitting images for

progressive reconstruction, the number of transform levels is selected based on


1. Resolution of
the stored or transmitted image
2. Scale of the lowest useful approximations.

5.11.4. QUANTIZER DESIGN


Coefficient quantization is the most important factor affecting wavelet coding
compression and reconstruction error, so the quantizer used for encoding should be
designed with care.
The most widely used quantizer is uniform and the effectiveness of the

quantization can be improved by


1. Introducing a larger quantization interval around zero, called a dead zone.
2. Adapting the size of the quantization interval from scale to scale.
In either case, the selected quantization intervals must be transmitted to the
decoder with the encoded image bit stream.
These quantization intervals can be determined manually or computed
automatically based on the image being compressed.
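A minimal sketch of a uniform quantizer with a dead zone around zero, of the kind
mentioned above; the function name, step size and dead-zone width are my own choices.

```python
import numpy as np

def dead_zone_quantize(coeffs, step, dead_zone=None):
    """Uniform quantizer with an enlarged zero bin; returns indices and reconstructions."""
    dead_zone = 2 * step if dead_zone is None else dead_zone
    idx = np.where(np.abs(coeffs) < dead_zone / 2,
                   0,                                     # small coefficients truncated to zero
                   np.sign(coeffs) * np.floor((np.abs(coeffs) - dead_zone / 2) / step + 1))
    recon = np.sign(idx) * (dead_zone / 2 + (np.abs(idx) - 0.5) * step)
    return idx.astype(int), np.where(idx == 0, 0.0, recon)

c = np.array([-7.3, -0.4, 0.2, 1.1, 5.9])
print(dead_zone_quantize(c, step=2.0))    # -> indices [-3, 0, 0, 0, 2], values [-7, 0, 0, 0, 5]
```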

5.12. LOSSY PREDICTIVE CODING


Lossy predictive coding is a spatial domain method because it operates directly on
image pixels. A lossy predictive coding system has
1. Encoder
2.
Decoder

5.12.1. ENCODER
The figure below shows the functional block diagram of the encoder.
[Figure: input sequence f(n) → prediction error e(n) → quantizer → ê(n) → symbol encoder → compressed sequence; the predictor sits inside a feedback loop fed by the quantized errors and past predictions]

Fig. 5.15. Lossy Predictive Encoder


The main difference between lossless predictive encoder and Jossy predictive
encoder is that the nearest integer block is replaced by a quantizer and predictor
has
feedback.

Quantizer
The quantizer is inserted between the symbol encoder and the point at which the
prediction error is formed.

It maps the prediction error into a limited range of outputs, denoted ê(n), which
establish the amount of compression and distortion that occurs.

Predictor
The predictions generated by the encoder and decoder must be equivalent. This can be
accomplished by placing the lossy encoder's predictor within a feedback loop.

Its input, denoted ḟ(n), is generated as a function of past predictions and the
corresponding quantized errors,

i.e.,  ḟ(n) = ê(n) + f̂(n)                                ... (5.22)
Below figure shows the functional block diagram of decoder.

e(n) (n) Decompressed


.Compressed Symbol
sequence decoder sequence

Predictor
f(n)

decoder
Fig. 5.16. Lossy predictive
ne function and block diagram of this decoder are exactly
same as the in
figure 5,4
(b).

Error buildup at the output of the decoder is avoided because the predictor of the
encoder is placed in a closed loop. The output of the decoder is given by the same
relation as equation 5.22:

ḟ(n) = ê(n) + f̂(n)                                       ... (5.23)
Delta Modulation (DM)
Delta modulation is a simple but well known form of lossy predictive coding in
which the predictor and quantizer are defined as

f̂(n) = α ḟ(n - 1)                                        ... (5.24)

and

ê(n) = +ζ   for e(n) > 0
     = -ζ   otherwise

where
α is a prediction coefficient
ζ is a positive constant.

The output of the quantizer, ê(n), can be represented by a single bit, so the symbol
encoder of figure 5.15 can utilize a 1-bit fixed length code. The resulting DM code
rate is 1 bit/pixel.
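A short sketch of the delta modulation rule in equation 5.24; alpha and zeta here are my
own illustrative choices, as is the sample sequence.

```python
import numpy as np

def delta_modulate(signal, alpha=1.0, zeta=4.0):
    """Delta modulation: each sample is coded with one bit, meaning +zeta or -zeta."""
    f_dot = 0.0                       # reconstructed value fed back to the predictor
    bits, recon = [], []
    for f in signal.astype(float):
        f_hat = alpha * f_dot         # prediction from the previous reconstruction
        e = f - f_hat
        e_hat = zeta if e > 0 else -zeta
        f_dot = f_hat + e_hat         # decoder reproduces exactly this value
        bits.append(1 if e > 0 else 0)
        recon.append(f_dot)
    return bits, recon

sig = np.array([14, 15, 14, 15, 20, 26, 27, 28, 27, 27, 29, 37, 47, 62, 75, 77, 78])
bits, recon = delta_modulate(sig)
# the reconstruction tracks slow changes but shows slope overload on the fast ramp
```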

5.13. IMAGE COMPRESSION STANDARDS

An image file format is a standard way to organize and store image data. It defines
how the data is arranged and the type of compression, if any, that is used. An image
container is similar to a file format but handles multiple types of image data.
Image compression standards define procedures for compressing and
decompressing images, i.e., for reducing the amount of data needed to represent an
image.
of data needed to represent an
image. These standards are the underpinning
of the widespread acceptance of image
compression technology.

Image compression standards are sanctioned by the International Standards
Organization (ISO), the International Electrotechnical Commission (IEC), and/or the
International Telecommunications Union (ITU-T), a United Nations (UN)
organization that was once called the Consultative Committee of the International
Telephone and Telegraph (CCITT).
Image Compression Standards

Still image - binary image compression:
1. CCITT Group 3
2. CCITT Group 4
3. JBIG or JBIG1
4. JBIG2

Still image - continuous tone compression:
1. JPEG
2. JPEG-LS
3. JPEG-2000

Video:
1. DV
2. H.261
3. H.262
4. H.263
5. H.264
6. MPEG-1
7. MPEG-2
8. MPEG-4
9. MPEG-4 AVC

Two further video compression standards, VC-1 by the Society of Motion Picture and
Television Engineers (SMPTE) and AVS by the Chinese Ministry of Information
Industry (MII), are also included.
5.13.1. BINARY IMAGE COMPRESSION STANDARDS
Two of the oldest and most widely used image compression standards are the
CCITT Group 3 and Group 4 standards for binary image compression.
They have been used in a variety of computer applications and they were
originally designed as facsimile (FAX) coding methods for transmitting documents
Over telephone networks.
The Group 3 standard uses a 1-D run-length coding technique in which the last K-1
lines of each group of K lines (for K = 2 or 4) can optionally be coded in a 2-D
manner.
of the Group 3
The Group 4 standard is a simplified or streamlined version
standard in which only 2-D coding is allowed.

Both the Group 3 and Group 4 standards use the same 2-D coding approach, which is
two dimensional in the sense that information from the previous line is used to encode
the current line.
5.13.1.1. One dimensional CCITT compression
The 1-D compression approach is adopted only for the CCITT Group 3 standard.
Here, each line of an image is encoded as a series of variable length Huffman code
words.
These code words represent the run lengths of alternating white and black runs in a
left-to-right scan of the line. The compression method employed is commonly
referred to as Modified Huffman (MH) coding. It has two types of codes:
1. Terminating codes
2. Makeup codes
Depending on the run length value, two types of variable length code words are used:

1. If the run length is at most 63, a modified Huffman code is used as a terminating
code.

2. If the run length is greater than 63, two codes are used: a makeup code for the
quotient ⌊r/64⌋ and a terminating code for the remainder r mod 64.

The main requirements of 1-D compresion standard is that each line begin with a
white run-length code word. For zero run length, the white code word is 00110101.

End-of-Line (EOL) code word


The unique End-of-Line (EOL) code word is the 12 bit code 000000000001. It is
used:
1. To terminate each line
2. To indicate the first line of each new image
3. To indicate the end of a sequence of images, which is signalled by six consecutive EOLs.
5.13.1.2. Two Dimensional CCITT
compression
The 2-D compression approach is adopted
for both the CCITT Group 3 and 4
standards is a line-by-line method
in which the position of each black-to-white or
white-to-black run transition
is coded with respéct to the
element a, i.e., situated on the current position of a reference
coding line.
The previously coded line is
called the Reference Line
the first line of each new image an and the reference line 10
is imaginary white line.
The 2-D coding technique
that is used is called Relative
Designate (READ) coding. Element Addres
In Group 3 standard, one or three
allowed between successive READ coded lines
MH coded lines and
READ (MR) coding. the technigue is called Mod1

[Figure: flowchart of the CCITT 2-D READ coding procedure for one coding line - place a0 before the first pixel; detect a1, b1 and b2; test whether b2 lies to the left of a1 and whether |a1 b1| ≤ 3; select pass mode, vertical mode or horizontal mode coding accordingly; reposition a0 and repeat until the end of the coding line]

Fig. 5.17. CCITT 2-D READ coding procedure




In the Group 4 standard, a greater number of READ coded lines are allowed, and the
method is called Modified Modified READ (MMR) coding.
Figure 5.17 shows the basic 2-D coding process for a single scan line.
The initial steps of the procedure are directed at locating several key changing
elements a0, a1, a2, b1 and b2.
on
A changing element is a pixel whose value is different from that of the previous pixel on
the same line. The reference changing element a0 is the most important changing element.
It is either set to the location of an imaginary white changing element to the left of the
first pixel in each new coding line or determined from the previous coding mode.
After a0 is located, a1 is identified as the next changing element to the right of a0 on
the current coding line, and a2 as the next changing element to the right of a1 on the
current line.
b1 is the changing element of value opposite to a0 located to the right of a0 on the
reference (previous) line, and b2 is the next changing element to the right of b1 on the
reference line. If any of these changing elements is not detected, it is set to the location
of an imaginary pixel to the right of the last pixel on the appropriate line. The figure
below shows the general relationships between the various changing elements.
[Figure: (a) pass mode - b1 and b2 on the reference line lie to the left of a1, and a0 is moved to below b2; (b) vertical and horizontal mode - coding parameters a0, a1, a2 on the coding line and b1, b2 on the reference line]

Fig. 5.18. CCITT (a) pass mode and (b) horizontal and vertical mode coding parameters
After identification of the current reference element and associated changing
elements, two simple tests are performed to select one of three possible coding
modes:
1. PasS mode

2. Vertical mode
3. Horizontal mode
Test 1: Compare the location of b2 with respect to a1.
Test 2: Compute the distance between the a1 and b1 locations and compare that
distance against 3.
Depending on the outcome of these tests, one of the three outlined coding blocks is
selected, and a new reference element is established for the next coding iteration.

Table 5.5. CCITT two dimensional code table


Mode Code Word
Pass 0001
Horizontal 001+ M(aga) + M(aja)
Vertical
a, below b 1

a one to the right of b 011


a two to the right of b 000011
a, three to the right of b 0000011
a, one to the left of b 010
a, two to the left of b 000010
a, three to the left of by 0000010
.
Extension 0000001xXX

Above table defines the specific codes utilized for each of the three possible
coding modes.
Pass Mode
In pass mode,-which specifically excludes the case in which b, is directly above a,

onlythe pass mode code word 0001 needed.


is
This mode identifies white or black reference line runs that do not overlap the
Current white or
black coding line runs.
Horizontal
Mode
In horizontal coding mode, the distances from a, to a, and a, to a, must be coded in
accordance
with the termination and makeup codes.
Page 173 of 304
Digital Image Processing
5.48
Vertical Mode: length codes is assignedtothe
one of six special variable
In vertical coding mode,
distance between a, and b used
word at the bottom of above
table is to enter an
The extension mode code

optional facsimile coding mode.

JBIG Or JBIG standard


1
lossless
Bi-level Image Experts Group standard for progressive,
A joint
compression of bi-level (binary) images.
can be coded on a bit plane basis.
Continuous tone images of upto 6 bits/pixel
an initial low resolution version of
Context sensitive arithmetic coding is used and
the image can be gradually enhanced with additional compressed data.

JBIG 2 standard
JBIG is an international standard for bi-level image compression. By
2 standard

segmenting an image into overlapping and/or non-overlapping regions of text


halftone and generic content, compression techniques that are specifically optimized
for each type of content are employed.

5.13.2. CONTINUOUS TONE STILL IMAGE COMPRESSION


STANDARDS
5.13.2,1. JPEG standard
Oneof the most popular and comprehensive
continuous tone, still frame
compression standards is the JPEG standard.
It defines three different coding syst
A
1. lossy baseline coding system
which is based on the DCT and is used
in almost allcompression application
2. An extended coding system
It is used for greater compression,
higher precision or
progressiVe
Reconstruction applications
3. A lossless independent
coding It is used for Reyersible compressiO
system
Page 174 of 304
Compression and Recognition
Amage
5.49
Baseline coding system
t is also known as sequential baseline system and it is based on the Discrete
Cosine Transform (DCT). The input and output data precision
is limited to bitsand &

the quantized
DCT vaues are restricted to 11 bits.

Compression Procedure
Belowdiagram shows basic functions of JPEG encoder.

Source
DCT Quantizer Entropy Compressed
image Encoder image
data
.

Quantable Huffman
table

Fig. 5.19. JPEG Encoder


The compression can be performed in three sequential steps
1. DCT computation
2. Quantization
3. Variable length code assignment

DCT computation
size 8x8.
First, the image is subdivide into pixel blocks of
top to bottom.
All the sub images are processed left to right and
= are level shifted by subtracting
Each sub image will have 8x8 64 pxels
intensity levels.
the quantity 2k-1, where 2 is the maximum number of
blocks is
Then the 2-D discrete cosine Transform (DCT) of the pixel
computed.
Quantization

The pixels are quantized using


T
(4, V)
Z (u, v)
Page 175 of 304
DigitalImage Processing
5.50|

are reordered in the Zig-zag pattern to


form a 1-D
The quantized, coefficients
sequence of quantized coefficients.otzVi i

Variables Length code Assignment


one dimensionally Reordered array generated under the zig-zag patterm i,
The
frequency.
arranged qualitatively according to increasing spatial
a that defines
The Non-zero AC coefficients are coded using variable length code
the coefficient values and number of preceding zeros.
the
The DC coefficient is difference coded Relative to the DC coefficient of
previous sub images.

Decoding
Below figure showsa basic functional block of JPEGdecoder.

Compressed Entropy Reconstructed


Decoder Dequantizer IDCT image
image
data

Quantable Huffman
table

Fig. 5.20. JPEG Decoder


The JPEG Decoder will perform the opposite function of encoder explained
"
above.

Advantages
1. The reordering of quantized coefficients may result
in long runs of zeroS.
2. Instead of default coding tables and quantized arrays, to
the user is allowed
construct custom tables
and/or arrays according characteristics.
tothe image
5.13.2.2. JPEG - LS Standard
is a lossless to near lossless standard for continuous
It
no images based
on

adaptive prediction, context modeling and to


Golomb coding.
Page 176 of 304
Image Compression and Recognition
5.51
613.2.3. JPEG- 2000Standard
JPEG-2000 is an extension of the Initial JPEG standard to provide increased
flexibility in
1
Compression of continuoustone still images
2. Access to the compressed data
For example, portions of a JPEG-2000 compressed image can be extracted for
retransmission, storage, and display and/or editing.
The standard is based on the wavelet coding techniques.
Coefficient quantization
is adapted to individual scales and sub bands and the quantized coefficients are
arithmetically coded on a bit-plane basis.

JPEG-2000 Encoding

The encoding process of the JPEG 2000 standard has the following steps.
Step 1: Level Shifting
First step of the encoding process is to DC level shift the samples of the n-bit
unsigned image to be coded by subtracting 2n-1

a
If the image has more than one component like Red, Green and Blue planes of
color image then each component is shifted individually.

Step 2: Optional Decorrelation


means they may be optionally decorrelated
If there are exactly three components
components. It is used
uSing a reversible or non-reversible linear combination of the
to improve compression efficiency.

Step 3: Tiling
optionally decorrelated, its components
After the image has been level shifted and
can be divided into Tiles.
an
are processed independently. Because
1les are rectangular arrays of pixels that
process creates tile components.
image can have more than one component and tiling
component can be reconstructed independently and providing a simple
tach tile a
manipulating a limited region of coded image.
nechanism for accessing and/or
Page 177 of 304
g
Digital Image Processing
5.52
Step 4: Transformation
of the rows and columns of each
LThe 1-D discrete wavelet transform
component is computed.
is based on a biorhtoconsl
For error-free compression, the transform
5-3 coefficient scaling ànd wavelet vector.
coefficients.
A rounding procedure is defined for non-integer valued transform
In lossy applications, a 9-7 coefficient scaling wavelet vector is employed.
case, the transform is computed using complementary lifting based
In either
approach. The complementary lifting based implementation involves six sequential
"lifting" and "scaling" operations.
Y(2n + 1) = X(2n + 1) + a [X(2n) + X(2n + 2)],
i-3s2n +1<i,t3
Y(2n) = X(2n) +B [Y(2n- 1)+ Y(2n + 1)],i,-2s 2n+ 1 <i, +2
Y(2n + 1)= Y(2n+1) +y [Y(2) + Y(2n +2) ], i,-1< 2n +1<ij, +1
Y(2n) = X(2n) + 8 [Y(2n-1) + Y(2n + 1) ], i, S
2n <iclodad et
lid-s Y(2n+ 1) -K. Y(2n + 1), i, S 2n +1<ij
Y(2n) = Y(2n)/K, i, s 2n <i
Here, X is the tile.component being transformed
Y is the resulting transform
i, and i, define the position of the tile component within a
component.
lsbi a., B, y andS are lifting parameter.
Total number of Total number of samples
transform coefficients in the original image
Step 5: Quantization
Quantization is needed because
the important visual information concentrated in
a few coefficients.!. is
Toreduce the number of bits needed use
coefficient quantization.
to represent the transform we haye to
128320
aOst:0e 2
Page 178 of 304
Image Compression and Recognition
5.53
Coefficient ag(u, V) of sub band b is quantized
A

to value g A(u, v) using

tap ti t

984, ) sign |a;(u, v)]


floorSsiibs .. (5.25)
where, the quantization step size A, is

25
A, = 2 1 +

R, is the nominal dynamic range of sub band b

6, and , are the number of bits allotted to the exponent and


mantissa of sub band
coefficients.

The nominal dynamic range of sub band b is the sum of the number
of bits used to
represent theoriginal image and the analysis gain bits for sub band
b.ot
Step 6: Coefficient Bit Modeling
brid
The coefficient of each transformed tile component's sub bands are arranged into
rectangular blocks called code blocks, which are coded individually and one bit plane
at a time.

Step 7: Arithmetic Coding

Starting from the most significant bit plane with a non-zero element,each bit plane
1S
processed in three passes.
Each bit in a bit plane is coded in only one of the three passes, such as
1. Significant Propagation
2. Magnitude Refinement and
3. Cleanup
coded.
The outputs of these passes are arithmetically
Step 8: Bit-Stream
Layering
output of arithmetic coding bit plane is grouped with similar passes from other
The
Code blocks to
form layers.
parses from each code
layer is an arbitrary number of groupings of coding
block.
Page 179 of 304

5.54 Digital Image Processing

Step 9: Packetizing
The layers obtained from above step are divided into packets and these
packets are
a the total
providing an additional method of extracting spatial region of interest from
code stream. Packets are the fundamentalunit of the encoded
code stream.

JPEG-2000 Decoding
the
JPEG-2000 decoders simply invert the operations of the encoder, It has
following process.

Step 1: Reconstruction of Tile Components


After reconstructing the sub bands of the.tile components from the arithmetically
coded JPEG-2000packets, a user can select number of the sub bands is decoded.

Step 2: Dequantization
Even though the encoder may have encoded M, bit planes for a particular sub
band, user may choose to decode only N, bit planes.

This amounts to quantizing the coefficients of the code block using a step size of

Any non-decoded bits are set to zero and the resulting coefficients, denoted as
qs(u, v) are inverse quantized using
, v)
(Gs(u,
tr. Ms-Nglu,
2 Ms
Th(u, v) >0
R (u, v) = (G(4, v) -r •2 G,u, v) <0.
... (5.27)

Gsl4, v) =0
where
R
(u, v) is an inverse quantized transform coefficient

N, (u, v) is the number of decoded bit planes for a.(u. v)


r- Reconstruction parameter
Resonstruction parameter r is chosen by the decoder to produce
the best visual or
objective quality of reconstruction.
Page 180 of 304
Compression and Recognition
Inage
5.55
Step 3: Inverse Transformation
transform of the dequantized
The inverse
coefficients is now computcd
row using a
fast Walsh
by column
and by Transform (FWT) filter
bank. This can also be
performedlby lifting basedloperations.
X(2n) =
K. Y(2n) il i-3s 2n <i +3 o

K(2n+ 1) =(-1/K) Y(2n+ 1),ok tas i -2< 2n -1<i,+2


X(2n) =X(2n)- [X(2n -1) + X(2n + 1)] io-3s2n <i, +3t
X(2n+1)= X(2n+1)-Y[X(2n) + X(2n +
2)] 2n +
in-2s 1
<i,+2
X(2n)= X(2n)-p [X(2n -1)+ X(2n + 1)|
-1s2n<i,+1
X(2n+1) = X(2n+1) - a[X(2n) + X(2n+ 2)] i,s 2n + 1<i,
Step 4: Reconstruction for Tiles
The tile components are assembled to a reconstruct the tiles.

Step 5: Inverse Component Transformation

This is performed only if component transformation was done during the encoding
process.

Step 6: DC Level Shifting


- Thus the
Finally, the transformed pixels are level shifted by adding +2"
1,

reconstructed image is obtained.

5•13.3. VIDEO COMPRESSION STANDARD


Video compressionstandards are developed by extending the transform based
still
mage compression techniques. These standards provide methods for reducing
Temporal or frame to frame redundancies.

1. Digital Video (DV)


a video standard tailored to home and semiprofessional video
Digital yideo is
news gathering and
Production applications and equipment like electronic
camcorders.
using a DCT
,rameS are compressed independently tor uncomplicated editing
based
approach similar to JPEG.
Page 181 of 304

otitge Digtal Image Processing


5.56|

H.261

r. Itis a two-way video conferencing standard for SDN(ntegrated Services Diotal


Network) lines.
It supports non-interlaced 352 x 288 and 176% 144 resolution images called CE
(Common Intermediate Formats) and QCIF (Quarter CIF).
A DCT-based compression approach similar to JPEG is used and with frame-to
frame prediction differencing to reduce temporal redundancy.
A block based technique is used tocompensaie for motion between frames.
3. H.263

It is an enhanced version of H.261 designed for ordinary telephone modems (i.e.,


28.8 kb/s) with additional resolutions.

4
H.264

It is an extension of H.261 – H.263 for video conferencing, internet streaming and


television broad casting.

grlt supports prediction differences within frame, variable block size integer
transforms and context adaptive arithmetic coding.

5. MPEG-1

sdA mntion pictures expert group standard for CD-ROM applications with non
interlaced video at up to 1.5 Mb/s.
It is similar to H.261, but frame predictions can be
based on the previous frame,
next frame or an interpolation of both. It is
supported by almost all computers and
DVD players.! h
irst
6. MPEG-2
It is an extension of MPEG 1 -
designed for DVDs with transfer rates to
15 Mb/s. It supports interlaced
video and HDTV. It is the most successful video
standard to date ie bt Gat bou

7. MPEG-4

It is an extension of MPEG-2 that supports variable


block sizes and prediction
differencing within frames.
Page 182 of 304

Image Compression and Recognition 5.57


It is used for small frame full motion compression and its bit rate 5 to 64 kbit/s
msed for mobile and Public Switched Telephone Network (PSTN) applications and
upto 4 Mbits/s for TV and film applications.

It can provide improved video compression efficiency and ability to add or drove
F
audio and video objects.f bgs insif: St abob obu
5.13.3.4. MPEG Encoder
The MPEG standards are based on a DPCMDCT coding scheme. Below figure
showsa typical motion compensated videoencoder. It exploitsiths odi 12 its:no3
1. o
Redundancies within and between adjacent yideo framessr ti boz
2. Motion uniformity between frames and
3. The psycho visual properties of the human visual system.

Rate
controller
Difference
macroblock
Image Mapper Vara5elength Encoded
ouantizer Buffer
macroblock macroblock

Inverse
quantizer

Inverse
Mapper
(e.g.. DCT-)ats 2

Prediction macroblock

Variable-length Encoded
coding motion
vector
Motion estimator and Decoded
compensator w/frame delay| macroblock

Fig. 5.21.
A
Typical Motion compensated video encoder
The input to the encoder as sequential macro blocks of video. The path having the
DCT quantizer and variable length coding block in above figure is known as the
primary input-to-output
path.
The encoder will perform simultaneous transformation, quantization and varieble
length coding use of these blocks.
operations on the input withthe
Page 183 of 304

Digital Image Processinp


s.58
to encoder such as
Two kinds of inputs can be given the
1. Conventional macro block of image data:
S4 i
o

2. Difference between a conventional image block and its


prediction based
previous and/or future video frames.
The encoder includes an inverse quantizer and inverse mapper So that its
predictions match those of the complementary decoder.
A
rate controller is used to adjust the quantization parameters in order to match the
generated bit stream and video channel. This rate controller function is based on the
content of the output buffer.
Based on the input, three types of encoded output frames are produced. They are
1. I-frame
2. P-frame
3. B-frame

Decoding
The decoder accesses the areas of the reference frames that were used in the
encoder to form the prediction residuals.
The encoder frames are reordered before transmission, so that the decoder will be
able to reconstruct and display them in the proper sequence.

1. Intraframeor Independent Frame (|-frame)


Frames compressed without a prediction residual are called Intra frames or
Independent Frames (I-frames).
They can be decoded without access to other frames in the video to
which they
belong.s
I-frames usually resemble JPEG encoded images and are
ideal starting points for
the generation of prediction residuals. It provide
1. Highest degree of Random access
2. Ease of editing and
3. High resistance to the propagation
of transmission error.
As a result, all standards require the periodic insertion
of I-frames into tne
compressed video code stream.
Page 184 of 304

Image Compression and Recognition 5.59


2. (P-frame)awo0
Predictive Frame (P-frame)
bip-frame is the compressed difference between the current frame and its prediction
hased on the previous Ior P frame.

Bi-Directional Frame (B-frame)


An encoded frame that is based on the subsequent frame is called a Bidirectional
frame (B-frame).
B-frames require the compressed code stream to be reordered so that frames are
presented tothe decoder in the proper coding sequence rather than the natural display
order.

5.14. INTRODUCTION
After an image is segmented into regions, the resulting aggregate
n

of segmented
a
pixels isrepresented and described for further computer processing. Representing
region involves two choices,
1, In terms of its external characteristics (its boundary)
2. In termsof its internal characteristics (the pixels comprising the region)
computer. The
Above tWo schemes are only parts of task of making data useful to
describe the region based on representation.
next task is to

For example, a region may be represented by


its boundary and its boundary is
straight-line joining its
described by features such as its length, the orientation of
extreme points and number of concavities in the boundary.

Which to chose when?


when the primary focus is on shape
An external representation is chosen on
Characteristics, An internal representation is selected
when the primary focus is
texture.
egional properties such as colour and case.
may necessary to use both types of representation. In either
Sometimes it be
as possible to variations in
de teatures selected
as descriptors should be insensitive
Size, translation and rotation.

D.15, REPRESENTATION
segmented data into representations that
epresentation deals with computation of
facilitate descriptors.
the computation of
Page 185 of 304

5.60 Rotinsz03514 Digital Image Processirng

5.15.1. BOUNDARY (BORDER) FOLLOWING


that points in the boundary, of the region be ordered
noiMost.of the algorithms require tha
in aclockwise (or counter clockwise) direction.
Boundary Following Algorithm
1. We blnaymages
with wIth object and background represented
LRnOiba 1
a
and 0 respectively.
2. That images are padded with borders of Os to eliminate the possibility of
object merging with image border.
Given a binary region R or its boundary, an algorithm for following the border of
R, or the given boundary, conists the following steps.
of
1. Let the starting point, b be the uppermost, leftmost point in the image that is
botnomlabelled 1: Co the west neighbour of b:Examine 8 neighbours bo
s griinsCo
of starting at
andproceed in a clockwise direction [see figure (b)] botnsestq
Let b, denote first neighbour encountered with value 1
and c be background
point immediately preceding b, in the sequence. Store the locations bo and
of
b, for use in step5.
2.. Let
b= b, and c=c [See figure(c)]
3. Let the 8 neighbours of,b,
starting at c and the preceding in a clockWse
direction, be denoted by n j, n2, .....ng.
Find First n, Labeled 1.
4. Let b= and c=
5. Repeat step 3 and 4 until b = b0,
and the next boundary point found is
sequence of b points found b. 1n9
when the algorithm stops constitutes the set
ordered boundary points.
sqThis algorithm is also called as Moore boundary
tracking algorithm.

1 1

1
01 sidiz200 28 9V1:a 1
b1 iei323b esbotuua2
1 1

1
io1JRG1 bas noiclen
1 1

MOITAT4eEAA3
Page 1865.61
of 304
.th
Col
Dol1 1

1
1
1
1

1 1

4 1

(b)
(c)

1
1 1 1

(d)
Fig. 5.22. Illustration of the First Few Steps
inthe Boundary-Following Algorithm
5.15.2. CHAIN
CODESeot o:
Chain codes are used to represent a
boundary by a coninected sequence
line segments of specified length of straight
and direction. Typically, this representation
on 4 or 8 connectivity is based
of the segments. The direction of each segment is coded by
using anumbering scheme, as in below figure 5.23.

abrA boundary code formed as a sequence of such directional numbers is


referred to
as a Freeman
chain code. Digital images usually are acquired and processed agrid
TOTmat with equal spacing in the X
and Y directions.
1
o2
3 1

0
2+ 4+

5 7

Fig. 5.23. Direction Numbers for (a) 4 directional chain code,heand


(b) &-directional chain code ,t fhiiIt
Page 187 of 304

Digital Image Processing


5.62
a in, (say, a clockwise
So chain code can be generated by following boundary
a

direction) and assigning a direction to the segments connecting every pair of pixele

t6
2 t6
t6
2
}6

354
(a) (b) (c)

Fig. 5.24. (a) Digital boundary with resampling grid superimposed


(6) Result of resampling. (c) 8-directional chain-coded boundary

Thismethod is generally unacceptable îor two principal reasons.


1. Resulting chain tends to be quite long.
2. Any small disturbances along the boundary due to noise or imperfect
isegmentation can cause changes in code.

A solution toabove problem is to resample the boundary by selecting a larger gid


spacing. Then, as the boundary is traversed, a boundary point is assigned to each nod
of the large grid, depending upon the proximity of original boundary to that node.
The resampled boundary obtained in this way then can be represented Dy
4 or 8 code.
Figure (c) shows the coarser boundary points represented by an
8-directional chain code. The accuracy of the resulting code reepresentation depends
on the spacing of the sampling grid.
The chain code a code
of
boundary depends on the starting point. However, the
can be normalised withrespect to the starting point procedur.
by a straightforward
1 We treat
the chain code as a circular numbers and
Sequence of direction of
redefine starting point so that the integer
resulting sequence forms an
minimum magnitude.
Page 188 of 304
Image Compression and. Recognition
5.63|
2. We can also normalise for
rotation by the first difference
instead of the code itself. of the chain code
Example: First difference
of 4-directional chain code 10103322
is 3133030.
If we treat the code as a circular sequence
to normalise with respect to
noint. then the first element the starting
of the difference iscomputed by using the transition
hetween the last and first components
of the chain. Here, the result is 33133030.
These normalisations are exact only
if the boundarjes are invariant to rotation and
scale change.

5.15.3. POLYGONAL APPROXIMATION USING


MINIMUM
PERIMETER POLYGONS
A digital boundary can be approximated with arbitrary accuracy
by a polygon.
For a closed boundary, approximation becomes exact the number of segments
of
the polygon is equal to the number of points in the boundary so that each pair of
adjacent points defines asegment
of the polygon.

The goal of a polygonal approximation is to capture the essence of the shape in a


given boundary using the fewest possible number of segments.bto

Approximation techniques of modest complexity are well-suited for image


processing tasks. Amongthese, one of the most powerful is representing a boundary
by a Minimum Perimeter Polygon (MPP),

Minimum Perimeter Polygon(MPP)


to enclose a
An approach for generating an algorithm to compute MPPs is
as figure (b).
boundary by aset of concatenated cells, shown in below
Think of the boundary as a rubber band.
As allowed to shrink, the rubber band will be constrained
by the inner and
outer walls the bounding regionss osit
p
of
Page 189 of 304

5.64 Digital Image Processing

sb ni (

oiz

baois
(a) (b) (c)
Fig. 5.25. (a) An object boundary (black curve). (b) Boundary enclosed by cells (in gray).
(c) Minimun Perimeter Polygon obtained by allowing the boundary to
shrink.
This shrinking produces the shape of a polygon of minimum perimeter
that
circumscribes the region enclosed by the cell strip as figure (c)
to 2ir:31 shows.
The size of cells determines the accuracy of the polygonal
approximation. In the
limit, if size of each cellcorresponds to a pixel in the
boundary, the error in each cell
between the boundary and the MPP approximation at most
would be 2d ,where d
min possible pixel distance.
This error can be reduced to half by forcing
each cell in polygonal approximation
to be centred on its corresponding
pixel in the original boundary.
vThe objective is to use
the largest possible cell size acceptable
application. Thus, MPPs produced with in a given
the fewest number of vertices.
The cellular approach reduces
the shape of the object enclosed
boundary to the area circumscribed by by the original
the grey wall.
In the above figure 5.25, its boundary
consists of 4-connected straight-line
segments. Suppose that we traverse
this boundary in a counter clockwise
direction.
Every turn encountered in
the traversal will be either a convex or a concave
with the angle of a vertex being an vertex,
interior angle of the 4-connected
boundary.
Convex and concave vertices are
shown, respectively as white
in below figure (b). and black dots
ImageC Kecognition Page 190 of 304

5.65|

(a)
(b)
Eie.5.26.
(a) Region (c)
(dark gray) resulting
(6) Convex from
(white dots)
and concave
enclosing the original boundary .anoije
the boundary of (black dots) vertices by cells
the dark gray region inwoonterclóckwise obtained by following
(c) Concave vertices the
adt diagonal,mirror (black dots) displaced direction
locations in the outer to thetr iil satl i0T13
wall of
The vertices thebounding region
of the MPP coincide either g9d!
(white dots) or with convex vertices
with the mirrors of the concave in the inner wall
Only convex vertices (black dots) in the outer
vertices of the inner wall.
wall and concave vertices
verticès of
the MPP.I of the outer'wall can be
u3sbs to artoita9i sit sub93010 9di
MPP Algorithm io bns orit A
1. The MPP bounded
by a simply connected cellular
Intersecting.swis 201 of: roiEmixOKES complex is notsself
L, iiEvery convex
grilirest sd: i 23oiroV
vertex of the MPP is a W vertex,
but not every W.ivertex a
boundary is a vertex of
of the MPP.
. Every
mirrored concave vertex of the MPP a
is B vertex; butinot every
Bvertex of a boundary is a vertex of
the MPP.: art ja e31i0q to 19dmua
4. All B
vertices are on or outside the MPP, and all vertices are onor
W

the MPP, inside


5. The 23U0IMHOET DMITTII4e.S.3,a
Uppermost, leftmost vertex in a sequence of vertices contained in a
cellular
Complex is always a vertex of the MPP.
W

Inthis algorithm,
Concave White (W) represents convex and Black (B) denotes, mirrored
vertices.
Page 191 of 304

5.66| Digital Image Processing

5.16. OTHERPOLYGONAL APPROXIMATION APPROACHES


As like minimum perimeter polygons, there are two more important types of
polygonal approximations available. These methods are preferred for many image
processing applications. They are
1. Merging Techniques
2. Splitting Techniques

5.16.1. MERGING TECHNIQUES


Merging techniques based on average error or other criteria have been applied to
the problem of polygonal approximation. It will perform
the following sequence of
actions.
1. Setathresholdfor least square error.
2. Merge some points along the boundary. This is done until the least square
error line fit of the points merged so
far exceeds a preset threshold.
3. When this condition occurs, store the parameters of the line
e
and set the errOr
to 0,
4. Repeat the steps (2) and (3) by emerging new points
until the error again
texceeds the threshold.
5. At the end of the procedure the intersections
of adjacent line segments form
the vertices of the polygon.
Drawbacks
1. Vertices in the resulting approximation
do not always correspond to
inflections in the original boundary, because a new
line is not started until the
error threshold is exceeded.
2. For instance, a long straight line was being
tracked and it turned a corner, a
number of points past the corner would be absorbed
o
before the threshold was

exceeded.

5.16.2. SPLITTING TECHNIQUES


Above drawbacks can be overcome
by using splitting techniques along wiut
merging. One approach to boundary segment
splitting is to subdivide a segment
successively into two parts until a specified criterion
is satisfied.
Page 192 of 304
Image Compression and Recognition 5.67

The maximum perpendicular distance from a boundary segment to


the line joining
a
its two endpoints should not exceed preset threshold.

Based on this, the boundary is divided into two. segments by a line ab as shown in
below figure (b). Here, cd and efare the perpendicular distances from the line.
Below figure (c) shows the result of using the splitting procedure with a threshold
equal to 0.25 times the length of line ab.

Thus, the obtained vertices are a, c, b and e. Joining these vertices result in the
polygon which represents the given boundary.

(a) (b)

(c) (d)

Fig. S.27. (9) Original boundary. (6) Boundary divided into segments based
on extreme points. (c) Joining of vertices. (d) Resulting polygon
Advantage
It seeks prominent reflection points. So the resulting vertices can produce a
good polygonal approximation for the boundary.

5.17. BOUNDARY SEGMENTS


Decomposing a boundary into segments is often useful. Decomposition reduces
the boundary's complexity and thus simplifies the description process.

This approach is particularly attractive when the boundary contains one or more
Significant concavities that carry shape information.
The use of the convex hull of the region enclosed by the boundary is a powerful
tool for robust decomposition of the bounday• ala doigtii, ot oh
Page 193 of 304
Digital Image Procesthng
5.68|

the smallest convex set containing S. Te


an arbitrary set S is
The convex hall H of
called the convex deficiency D of the set S. Consider the belo
set difference H-S is
an object S shown in figure (a).
figure. Consider the boundary of
the set i.e., objects S is defined, which is the
1. First, the convex deficiency of
blne shaded region in figure (b),g2 d: 9nia io slhie cetit )

S is followed and the points at which


there is
2. Next, the contour i.e., outline of
convex deficiency are marked.3iio eih
1
1atransition into or out of the
3. These points are the partitioning points that give the
segmented boundary. The
result obtained is shown in figure (c).

(a) (b) (c)


Fig. 5.28. Boundary Segmentation
Advantages
1. Segmenting a boundary using convex deficiency is independent
of the size
and the orientation of a given region.
2. The convex hull and its deficiency can also
be used for describing a region
and its boundary.
-

Drawback
Digital boundaries tend to be irregular
because
of digitization, noise and variations
in segmentation. Such boundaries produce convex
meaningless components scattered randomly
deficiencies with the small,
an inefficient decomposition process. throughout the boundary. This results in

Solution
These irregularities can be removed
by smoothing before partitioning.
to do smoothing which
replace the coordinates One way 15
coordinates of K of its neighbours of each pixel by the average
along the boundary. But
Compression Page 194 of 304
Image and Recognition
Large values of K result in
5.69
excessive smoothing.
Small values of K
result in inefficient
smoothing.
Therefore, polygonal approximation method can
be used before finding convex
deficiency.

5.18. BOUNDARY
Boundary descriptors
described the boundary
of a region using the features
houndary. It can be classified into two types such as of the
1.
Simple descriptors
2. Fourier descriptors
There are two quantities
which are used to describe the above kinds
such as
of descriptors
1.Shape numbers
andib etwooola at et tbnsod odi n1W
2. Statistical moments

5.18.1. SIMPLE DESCRIPTORS5e ht unais

The length of a boundary is one


of the simplest descriptors. The number of pixels
along a boundary gives a rough approximation
of its length.
For a chain coded curve with unit spacing in both directions, the length is obtained
by

Number of vertical and number of diagonal


|Length = horizontalcomponents +2 Components
The diameter of a boundary Bis defined as 2uteO9 429:]
max
Diam (B)=
i,j
[DP» P)l|
Where
D - is a
distance measureiheabsntest J h00Up92
PoPj are points on the boundary. the2ii .90si
a boundary is defined as the line segment connecting the
he major axis of
me points of its diameter.
two extreme
perpendicular to the major
The minor axis of a boundary is defined as the line
axis.
Page 195 of 304

5.70 Digital Image Processing

The ratio of the major axis to the minor axis.is Called the efficiency of a
boundary.
Major Axis
=
XOVag3 30ib b i.e., Bfficiency Minor Axis
The major axis intersects with the boundary at two points and the minor axis also
intersects with the boundary at two points. A box which completely encloses the
boundary by passing through these four outer points is called the basic rectangle.
Curvature is defined as the rate of change of slope. But, obtaining reliable
measures of curvature at a point in a digital boundary is difficult because these
boundaries tend to be locally ragged.
Therefore, the difference between the slopes of adjacent boundary segments can be
used as a descriptor for the curvature at the point of intersection of the two segments.
When the boundary is traversed in the clockwise direction, the following can be
defined.
1. If the change in slope at P is nonnegative, then a vertex point P is said to bea
part ofa convex segment.
2. If the change in slope at P is negative, then a vertex point P is said to a part
be
of a concave segment.
3. If the change is less than 10°, then vertex point P is a part a
of nearly straight
segment.
4. If the change is greater than 90°, then vertex point
P is a corner point.
These descriptors must be used with care,
because their interpretation depends on
the length of the individual segments relative to
the overall length of the boundary.
5.18.2. FOURIER DESCRIPTORS
Furrier descriptors describe a digital
boundary by considering it as a complex
sequence. Consider
the digital boundary shown in below
K number of points in the XY-plane. can figure 5.29. which has
It reduce a 2-D to a 1-D problenm.
Page 196 of 304
Image Compression and Recognition
5.71

axis

Imaginary

Real axis.
Fig. 5.29. A digital boundary
as a complex sequence
ot Starting. at arbitrary point (xn.V).and
an
following boundary
counterclockwise direction, in the
its coordinate points are e
iant r (9t)

Bzl),.
These coordinates can
be expressed in the form of
x(K) =
And

The sequence of c0ordinates of the boundary are


s(K)=x(K), y(K)], for =0, 1,2, ...... ,K-1. K

Moreover, each coordinate pair can be treated as a complex


number in the form

+jy(K) for K = 0, 1, 2 ......


S(K) =x(K) K-1
Where x-axis treated as the real axis
y-axis treated as the imaginary axis.
The Discrete
Fourier Transform (DFT) of S(K))is
K-1
bo a(u) =S s
(K)eJZmkK t0st 1iuo3
rold
k=0
for
=0, 1,2,
Here.,
....K-1.tss 'ioda
the coefficients au) are called the fourier descriptors of the
boundary.TheComplex
Inverse Fourier Transform
of these coefficients restores S(K).S
Page 197 of 304
Digital Image Processing
5.72

K-1
S(K) =
K
al)ej2muk/K
=0
..... K-1
for k=0, 1, 2,
p t
Instead of all the Fourier Coefficients, we can use only the first coefficients
reduce the number of terms used for reconstruction. This is equivalent to setting.
a(u) =0 for u> P-1in above equation.
The result is the following approximation to s(K)
= P-1
$ (K) au) ej2muk/P for k= 0, 1,2, .....K-1

afi Only P terms are used to obtain each component ofs (K), K still ranges from 0 to
K-1.(ie) the same number of pointsexists in the approximate boundary, but not as
many terms are used in the reconstruction of each point.
Low-frequency components determine global shape, thus, the smaller P becomes,
then lost on the boundary.
Basic Properties
Fourier descriptors can be used as the basis for differentiating between distinct
boundary shapes.
Descriptors should be as insensitive as possible to translation, rotation and scale
changes. An addition to this, the descriptors must be insensitive to the starting point.
Therefore the Fourier descriptors should not be sensitive to
1. Translation
2. Rotation
3. Scale Changes and
4. Starting Point
But, the changes in these parameters produce some
simple transformations on u
Fourier descriptors. Above four properties can
be explained below.
1. Rotation
Rotation of a
point by an angle is
accomplished by multiplying the point
about the origin of
the complex plane
by e®, to every point of entie
sequence about the origin. S(K) rotates the
9er o ian: igtt 3vnt ii
Page 198 of 304
Image Compression, and Recognition

The rotated, sequence 5.73


is S(K) e, whose Fourier descriptors
4, (u) = K-1 S(K) ejo
a,
areANEE3I.8
2 ej2ukK
k=0
a

=0, 1, 2, ......
u
(u) ejo
for
K-1
Thus rotation simply
affects all coefficients
equally by a multiplicative
term ejo constant
2. Translation
Translation consists
of adding a constant displacement to
boundary. It is obtained as all coordinates in the

S,(K) = S(K) + A,y


where

A,y = A, tjA,
Translation has no effect on
the descriptors, except for u =
impulse S(u). 0, which has the

3. Starting Point
eThe starting point of the sequence can be changed from
the expression.a
K=0to K= K, by using
l hand
s(K) = s(K- K)
s(K) = x (K-K) +jy (K-K)r st
The basic properties of Fourier descriptors are summarized in below table.izaog
Table 5.6. Some basic properties of
Fourier descriptors tr
S. No. Translation Boundary Fourier Descriptor
1
Identify s (K) a(u)
2.
Rotation s, (K) = s(K) eje a, (u) =
a(u) ejousUod
3. Translation s,(K) =s(K) + A,y4, (u) =
a(u) + A,, 8(u)
4. Scaling s,(K) = as(K) a, (u) = aa(u)
5. Starting point
s, (K) =
s(K-k)a,(u)=a(u)e-j2riKouK e
Page 199 of 304

5.74 .ieD* Digital Image Processing

5.18.3. SHAPE NUMBERS oiu dw t2:s


Order 4 Order 6

1
Chain code: 03 2 1
003 22
Difference: 3 3 33 30 3Sh
330
Shape no.: 3 3 3 3 033 0 3 3

Order 8

Chain code: 0 033 22 1 1


030 3
22 1 1
00 03.2 2 2 1

Difference: 3 030 30 30 3 3 1 3 30 3 0 3 0 0 3 3 0 0 3
Shape no.: 0 3 0 30 3 0 3 030 3
3 1
3 3 003 3 0 0 3 3

Fig. 5.30. AU Shapes of order 4, 6 and &


The shape number of a boundary is defined as the first difference of smallest
magnitude. The order n ofa shape number is defined as the number of digits in its
representation.
More ever, n is even for a closed boundary, and its value limits the number of
possible differnt shapes. Figure 5.30 shows all the shapes of order 4. and 8 along
6
with their chain code representation, first differences
and corresponding shape
numbers.
The first difference of a chain code is independent
of rotation; in general the coded
boundary depends on the orientation of the grid.

5.18.4. STATISTICAL MOMENTS


The shape of boundary segments can be described by using
statistical momens
such moments are mean, variance and higher order moments, can
It be used to redueo
the 2-D description problem to 1-D problem.
Page 200 of 304
Compression and Recognition
Tnage 5.75
Consider the segment of a boundary shown in below figure. In that fig. 5.31 (a)
segment of'a boundary and figure (b) shows as a
shows the the segment represented
an
1-D.function gr) )
of arbitrary variable r.

g(r)

(4) Boundary Segment (b) Representation as a 1-D Function


a
Fig. 5.31. 2D and ID representation of
boundary segment

is obtained by connecting the two end points of the segment and


This function

rotating the line segment until it is horizontal. The coordinates of the points are
rotated by the same angle.
v and form an amplitude
The amplitude of g as a discrete random variable
histogram
=
1, 2,
......, A - 1
P(v), i 0,
Where,
increments.
A is the number of discrete amplitude
The nth moment of V about its mean is
A-1
H,() = Z (v;-m)"P (v)
=0
where
A-1
X y,P(v)

quantity m is recognized
m=
=0
as the mean or average value of
V
and , as its
variance.
area and treat it as a
An
altermative approach to normalize gr) to unit
histogram. is value r occurring.
lhe g(r) is treated
as the probability of
Page 201 of 304
5.76 Digital Image Processino

is treated as the random variable and the moments


are
nis case, r
= K-1
H,
() (-m)"g (r) os C

i=0

where
K-1
m=
i=0
In this notation, K is the number of points on the boundary and
,() is directly
related to the shape of g(r).

Advantages
The advantages of using statistical moments when
comparing to other techniques
of boundary description are
1. Implementation of moments is straight forward
2. It carries a physical interpretation boundary
of shape.
3. This approach is not sensitive to rotation.
4. Size normalization can be achieved
by scaling the range
g and r, of values of

5.19. REGIONAL DESCRIPTORS


Regional descriptors are used
todescribe image region.In
and boundary descriptors practice, both regional
used combinable. Some important
1. Simple descriptors
regional descriptors are
2. Topological descriptors
3. Texture
4. Moment Invariants

5.19.1., SIMPLE DESCRIPTORS


The area of a region
is defined as the number
of a region is the length of pixels in the region. The perime
of its boundary.
Area and perimeter are
used as descriptors, when
the size of the region is invaries
Page 202 of 304

mage Compression
and Recognition
These two descriptors
are used to
5.77
measuring compactness
defined as of a region, it can be
Compactness (Perimeter)²
Area
Circularity ratio
is slightly different descriptor
defined as of compactness and it can
be
4TA
Rç =
P2
where
A is the area of the region
Pis the length of its perimeter
The value of this measure is 1 for a circular region and for a square.
Compactness is a dimensionless measure and thus is
insensitive to uniform scale
changes. It is also insensitive to orientation
and thus the error introduced by rotation
of a digital region can be avoided.

5.19.2. TOPOLOGICAL DESCRIPTORS


lopological properties are useful for global descriptions of regions in the image
Pane. T'opology is defined as the study or properties of an image that are unaffected
oy any
deformation such as stretching, rotation etc.
As long as there is no tearing or joining of the figure or image sometimes these, are
called rubber sheet
distortions.
as
ere are two topological properties useful for region description, such
1. Number of holes in the region

Z. Number of connected component of the region.


1f a LOpological descriptor is defined by the number of holes in the region, this
property or transformation.
will not be affected by a stretching rotation
Page 203 of 304
Digital Image Processing
5.78

Fig. 5.32. A region with two holes


The number of holes will change if the region is torn or folded. Topological
properties do not depend on the notion of distance or any properties implicitly based
on the concept of a distance measure.

Another topological property useful for region description is the number of


connected components. Below figure shows a region with three connected
components.

Fig. 5.33. A region with three connected components


The number of holes H and conneçted components
C in a figure can be used t0
define the Euler number.

E =
C- H|

where,
C- connected component i2
H- holes

AB
Fig. 5.34.Regions with
Euler Numbers equal to 0 and respectively
Page 204 of 304
Compression and Recognition
Inage 5.79

The Euler number also a topological property.


is The regions shown in above
Euler numbers
figure have equal to and -1 respectively,
0
because the “A" has one
connected component and one hole and
the B" has one connected component but
twoholes.

ForA
=
E C-H
C= 1 and

H = 1

Therefore

E = 0 for A
For B

E = (-H
C = 1
and
H = 2
Therefore E = 1-2
E=-1 for B

Regions represented by straight line segments also referred to as polygonal


networks.

These networks have a simple interpretation in terms of the Euler number. Below
igure shows a polygonal network.

Vertex (V)

Face (F)

Hole (H)

Edge (Q)

Fig. 5.35. Region containing


a Polygonal Network
Page 205 of 304

5.80|
Digital mage Processing

and the numher


Denoting the number of vertices by V, the number of edges by
of faces by F gives the following relationship called the Euler
Formula.

We know that the Euler Number is


E =
H
C-
Then above equation can be rewritten as
V-Q +F = E
For the above figure we have the following values.
V.= 7

Q= 11

F= 2
C = 1
H = 3
Thus the Euler number is -2
7-11 +2 = 1-3
=
-2
Topological descriptors provide an additional feature that is useful in
characterizing regions in a scene.

5.19.3. TEXTURE
Texture content is an important quantity used to describe a
region. This descriptor
provides measuresof properties such as
1. Smoothness
2. Coarseness
3. Regularity
The three principal approaches used image of
in processing to describe the texture
region are
a

1. Statistical approaches
2. Structural approaches
3. Spectral approaches
Page 206 of 304
Image Compression and Recognition
5.81|
Statistical approaches are used to characterize
coarse, grainy and soon.
the texture of a region as smooth,

Structural techniques deal


with the arrangement of image
description of texture based on primitives, such as the
regularity spaced parallel lines.
Spectral techniques are based on
properties of the Fourier spectrum
primarily to detect global and are used
periodicity in an image by identifying
peaks in the spectrum. high-energy, narrow

5.19.3.1. Statistical
approaches
One of the simplest approaches for
describing texture is tó use statistical moments
of the intensity histogram an
of image or region.
1. Statistical Moments
Let Z be a random variable denoting
intensity
P (Z)- corresponding histogram for i - 0, 1,
2,..., L- 1
L-number of diferent intensity levels.
The nth moment of Z about the mean is
L-1
H(Z) = (Z-m)" P(Z) .s. (5.28)
i=0
where,
m is the mean value
of Z (the average intensity)
L-1
j=0
(5.29)
Note from equation 5.1 that Ho = 1
and u = 0. These are called as zeroth
and first
moment.

2. Second Moment
p
The second moment is importance in texture description.
It is a measure of
Lntensity contrast that can be used to establish descriptors
of relative smoothness.
For example, the measure
1
R (Z) = ...
l-; 1+ G² (Z) (5.30)
Above equation gives 0 value for areas of constant intensity and 1 for large values
ofG²(Z).
Page 207 of 304

5.82
Digital Image Processing

Thus, variance is a very important measure in the texture description. It is given


hu.

the second moment.


... (5.31)
i.e.,

8
The third moment is defined as the measure of the skewness. tsi 0
vlih
ie., = L-1E (Zi-m)³ P(Zi)
(Z) (S.32)
j=0

Fourth Moment
The fourth moment is defined as the measure of the relative flatness of the
histogram.
L-1
i.e., H (Z)
=Z i =0
(Zi-m) P(Z) ...6.33)

Fifth and Higher moments


The fifth and higher moments are not so easily related to histogram shape
but they
can provide further quantitative discrimination texture content.
of
Uniformity
Uniformity is another measure of texture which is based on
the histogram. It is
defined as
.L-1=
U(Z)
P2(Z;) .. (5.34)
i=0
An image is said to be maximally uniform when measure
0 is maximum for an
image in which all intensity levels are equal and decreases
from there.
Entropy
Entropyis also one of the texture measures
and it can be defined as a measure ol
an
variability of image. It can be expressed as

e (Z) = - L-1 P(Z) log, P(Z) ..(5.35)


i=0
2.2.Gray Level Co-0ccurrence Matric, G
Let O be an operator that defines the position two
of pixels relative to each otne
and consider an imagefwhich has an L possible intensity
level.
Page 208 of 304
Compression and Recogiic:
Image
5.83
G be a matrix whose element
Let gi is the number of times that points with gray
level Z,
and 7.occur inf in the position specified by ), where 1 sI,jsL.
A matrix formed in this way
is referred to as a gray level c0-0ccurrence matrix and
as G.
tcan be denoted
123 4
6 7 8
20 0 1 1

3 2
4 75
7 20 0 1

5 1 6 2 5 3|0 1 0 10 0

8 8 6 8 1 2 ojv 4 0 0 10 1 0
1 0 1
4 3 4 5 5 1
52 2
1

87 8 7 6
2 -64+3 0
0 1 1 0 2
7 8 6
2 1
8| 1 002|
Co-occurrence matrix G
Image f
of a Co-occurrence Matrix
Fig. 5.36. Generation
how to construct a co-0ccurrence matrix
Above figure 5.36 shows
an example of
as "One pixel immediately to the right
operator Q defined
using L = 8 and a position immediately to its right.
apixel is defined as the pixel
1.e., the neighbor of
1, because there is only one
G is
figure we can see that element (1, 1) of immediately to its right.
In that
1 having pixel valued 1
a
OCCurrence in f of a pixel valued occurrences in f of a
are three
(6, 2) of G is 3, because there
Similarly, element to tis right.
valued 2immediatly
a pixel
pixel with a value of 6 having right and one pixel
as "One pixel to the
Q is defined of a 1
If the position operator, are no instances inf
G have been
0, becausethere
above, then position (1, 1)in by Q.
with another 1 in the position
specified matrix
determines the size of
image
intensity levels in the x 256.
The number of possible G will be
of size 256
possible levels)
G.
For an 8-bit image (256
Page 209 of 304
Digital lmage Processing
5.84
size Ky
characterizing co-occurrence matrices
of

Table 5.7. Descriptors used for


termn p, the ijth term of G divided by the sum of the elements ofG.
The is
Explanation ot E Formulaiss k
Descriptor
Measures the strongest response of max(Pj)
Maximum
probability G.The range of values is [0, 1].
Correlation measure of how correlated a
A
K
(i - m,)(j – mJP
pixel is to its neighbor over the
i=1 j=1
entire image. Range of values is
1to - 1, corresponding to perfect

positive and perfect negative


correlations. This measure is not
defined if either standard deviation
is zero.
Contrast A measure of intensity contrast K
between a pixel and its neighbor over
the entire image. The range of values =1 j=1
is 0 (whenG is constant) to (K - 1.
Uniformity A measure of uniformity in the range K
(also called [0, 1]. Uniformity is 1 for a constant
Energy) image. i=1 j=1 SoDI
Homogeneity Measures the spatial closeness of the
distribution of elements in G to the K K
Pj
diagonal. Therange of values is (0, 1]. 2
vodA
with the maximum being achieved i=1 i1 1
+|i -il
when G is a diagonal matrix.
Entropy Measures the randomness of the K K
elements ofG. The entropy is 0 when
all p,'s are 0and is maximum when
2i=1 Z P; log2 Pi
i=1
Siiz3 o
t. all p's are equal. The maximum
value is 2 logzK.

The total number (n) of pixel pairs that satisfy Q is equal to


the sum of the
elements of G. Then, the quantity

=
Py n
Is an estimate of the probability that a pair points
of satisfying Q will have valus
*(Z,Z). These probabilities are in the range [0, 1] and their sum is 1.
K K
2
i=1 j=1
P, = 1
Compression. and Recognition Page 210 of 304
mage
|5.85
Where K.is the row (or column) dimension
of square matrix G.
A
of descriptors useful for characterizing the contents
set
of G are
table. The
quantities used
in the correlation descriptor (second row listed in above
defined as follows. in thetable) are

m, =
i p a
i=1 j=1 bad syxoiotgsest
K K

Py
j=1 i=1
and
K K

i=1 j=1
K K

j=1 i=1
Pi
Let

K
=
P) Py
j=1
And

=
PÚ) 2 PyC
lhen the preceding equations can be written as
K
m, = i Pi)
i=1
K
ln
mc = j PÖ)
j=1

2 ((-m) P)
K
S (-mJP PG)
The m, a mean computed along rowS of the normalized
G
and
1S a
1S
in the form of
mean
computed along columns the of normalized o, in
Similarly, o, andG.
are
heform Each
ofthese
of
standard deviations computed along rows and columns respectively.
G.
terms is a scalar, independently of the size of
Page 211 of 304

5.86|
Digital Image Processing

5.19.3.2. Structural Approaches


texture
As mentioned at the beginning of this section, a second category of
description is based on structural concepts.
Structural techniques deal with the arrangement of image primitives, such as the
description of texture based on regularly spaced parallel lines.

(a)

(b)

(c)

Fig. 5.37. (a) Texture primitive. (b) Pattern generated by the rule S aS
(c) 2-D texture pattern generated by this and other rules.
Let a texture primitive is denoted as a and it is represented as a
circle shown in
figure.
Now, we have a rule of the form S as, which indicates that
the symbol S may be
rewritten as aS. For example, if this rule is applied three times,
then it results in the
form of string as aaaS.
If represents a circle figure (a) and the meaning
'a' of “circle to the right" is
a
assigned to string of the form áaa the ruleS aS allows generation of the
texture pattern shown in figure (l).
Suppose we add some new rules to this scheme such as.
Page 212 of 304
Image Compression and
and Recognition
|5.87
A bs
S a
Where the pressure of b means Circle
down and the presence of C
means circle to
the left.
Now, we can generate a string of the form aaabccbaa
that corresponds to a 3 x 3
metriX of circles.
Larger texture pattern, such as in figure (c), can
be generated easily
in the same way.

Special Approaches
5:19.3.3.
The Fourier spectrum is ideally suited
for describing the directionality of periodic
or almost
periodic 2-D patterns in an image.
These global texture patterns are easily distinguishable as
concentrations of high
energy bursts in the spectrum. Here, we consider three features of the Fourier
spectrum that are useful for texture description.
Such as,
1. Prominent peaks in the spectrum give the principal direction of the texture

patterns.
2. The location of the peaks in the frequency plane gives the fundamental spatial
period of the patterns.
3. Eliminating any periodic components via filtering leaves non-periodic image

elements.
Detection and interpretation of the spectrum feature are simplified by the
expressing the spectrumn in polar coordinates.

S(r,0)
Where,

S - spectrum function
frequency variable
direction variable
a 1-D function S. ().
rOr each direction 0. the S(r, 0) may be considered
Similarly, a 1-D function.
for each frequency the S, (0) is
Analyzing S, (r) fora fixed 0 value provides the behavior ofthe spectrum along a
Analyzing
Radial
direction from the origin.
Analyzing r gives the behavior of the spectrum along
S,(0) for a fixed value of
a circle
with center on the origin.
Page 213 of 304

Digital Image Processing


5.88|
A
global description is obtained by integrating (summing for discrete variahle
these functions.
=
S(r) S)
G=0

Where,
Ris the radius of a circle centered at the origin. The results of abovetwo equation
gives a pair of values [S(r), S(©)]for each pair of coordinates (r, 0).
By varying these coordinates, we can generate two 1-D functions S(r) and S(0).
This gives the spectral energy description of texture for an entire image or region.
Descriptors of these functions can be computed in order to characterize their
behavior quantitatively. Descriptors typically used for following purpose.
1 The location of the highest value
2. The mean and variance of both the amplitude and axial variations.
3. The distance between the mean and the highest value
of the function.
5.19.4. MOMENT INARIANTS
The 2-D moment of order (p + ) of a digital image f(x, y)
of size MxN is
defined as
M - 1
N-1
xPy9
Mpq
X=0 y=0
f, y)

where
p=0, 1, 2....
q=0, 1, 2..... are integers
The corresponding central moment of order (p + g) is defined as
-1N-1
M

*=0 y=0
for p = 0,
1,2
...... and =0.1.2
a

where

Moo
Ognition Page 214 of 304
5.89
and
Moo
The normalized central moments,
denoted Npg We defined as

Where,

2 +1
..
Forp +q =2, 3,
A set
of seven invariant moments can be
derived from the second
moments. and third

i = m20 t no2

= (n30 -3112)2 + (3n21-To)


3
= (n30 +
4
n2+ (n21 t no:)
+ (3121-1o:) (21 + Mo:) 3(130 + N-(M21 + os)2]lhs e.

(20 - No2)
+
=
6 t
(N30 ni2-(N21Nos)+4n11 (30 1)
(21+Mo3)
0, = (3n21-nos) (n30 + ni)(130
+t
nip-3(21 t lo3)2
No3) ]
+(3n21-No3) (n21to3) 3 (30 t 112-(n21+
to translation, scale change, mirroring and
tus set of moments is invariant
rotation.

5.20.
PATTERNS AND PATTERN CLASSES a family of
a pattern class is
descriptors and
pattern an arrangement of
A
are denoted W1,W2..Ww
patterns
is properties. Pattern classes
common
that share Some
Where w
1s the number of classes. patterns to their
involves techniques for assigning
Pattern machine as possible.
recognition by as little human intervention
Tespective and with
classes automatically
Page 215 of 304

5.90 Digital Image Processing

There are three common pattern arrangements are used such as


1. Vectors

3. Trees
Among these, vectors are used for quantitative escriptions and strings and trees
are used for structural descriptions.

5.20.1. PATTERN VECTORS

Pattern vectors are represented by bold letters, such as X, Y and Z, and take the form

X = [x1, x2, ..., xn]^T

where xi represents the i-th descriptor and n is the total number of such descriptors associated with the pattern.
A pattern vector can be expressed either as a column matrix or in the equivalent form X = (x1, x2, ..., xn)^T, where T indicates transposition.

Example 1
Fisher (1936) reported the use of what then was a new technique called discriminant analysis to recognize three types of iris flowers by measuring the widths and lengths of their petals. The three iris flowers are
1. Iris setosa
2. Iris virginica
3. Iris versicolor
Fig. 5.38. Three types of iris flowers (setosa, versicolor, virginica) described by two measurements: petal width (cm) versus petal length (cm)

Each flower is described by two measurements, which leads to a 2-D pattern vector of the form

X = [x1, x2]^T

where x1 and x2 correspond to petal length and width respectively. The three pattern classes are denoted w1, w2 and w3, where
w1 represents setosa
w2 represents virginica
w3 represents versicolor
The petals of the flowers vary in width and length. Figure 5.38 above shows length and width measurements for several samples of each type of iris.
After a set of measurements has been selected, the components of a pattern vector become the entire description of each physical sample. In this case each flower becomes a point in 2-D Euclidean space.

The measurements of petal width and length can be used to separate the class of Iris setosa from the other two, but they do not separate the virginica and versicolor types from each other.
The degree of class separability depends strongly on the choice of descriptors selected for an application.
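A small sketch of how such pattern vectors might be assembled (the numeric petal measurements below are hypothetical, chosen only to mimic figure 5.38; NumPy is assumed):

import numpy as np

# Hypothetical (petal length, petal width) measurements in cm, one row per flower sample
setosa     = np.array([[1.4, 0.2], [1.5, 0.3]])
versicolor = np.array([[4.5, 1.4], [4.2, 1.3]])
virginica  = np.array([[5.8, 2.2], [6.1, 2.3]])

# Each row is a 2-D pattern vector X = [x1, x2]^T; stacking them gives the
# cloud of points in 2-D Euclidean space that figure 5.38 depicts
patterns = np.vstack([setosa, versicolor, virginica])
labels   = ["w1"] * 2 + ["w3"] * 2 + ["w2"] * 2   # w1 = setosa, w3 = versicolor, w2 = virginica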

5.20.2. PATTERN STRINGS


String descriptions are best suited for applications in which connectivity of
primitives can be expressed in a head to tail or other continuous manner.

Fig. 5.39. (a) Staircase structure (b) Structure coded in terms of the primitives a and b to yield the string description ... ababab ...
Figure (a) shows a simple staircase pattern. This pattern could be sampled and expressed in terms of a pattern vector.
Assume that this structure has been segmented out of an image. By defining the two primitive elements a and b shown, we may code figure (a) in the form shown in figure (b).
The most obvious property of the coded structure is the repetitiveness of the elements a and b. Therefore, a simple description approach is to formulate a recursive relationship involving these primitive elements.
String descriptions adequately generate patterns of objects and other entities whose structure is based on simple connectivity of primitives.
Since strings are 1-D structures, their application to image description requires establishing an appropriate method for reducing 2-D positional relations to 1-D form.
Most applications of strings to image description are based on the idea of
extracting connected line segments from
the objects of interest.

5.20.3. PATTERN TREE

A more powerful approach for many applications is the use of tree descriptions. A tree T is a finite set of one or more nodes for which
(a) there is a unique node $ designated the root, and
(b) the remaining nodes are partitioned into m disjoint sets T1, ..., Tm, each of which in turn is a tree called a subtree of T.
The tree frontier is the set of nodes at the bottom of the tree (the leaves), taken in order from left to right. For example, the tree shown in the figure below has root $ and frontier xy.

Fig. 5.40. Simple tree with root $ and frontier xy


Two types of information in a tree are important:
1. Information about a node, stored as a set of words describing the node.
2. Information relating a node to its neighbors, stored as a set of pointers to those neighbors.
As used in image description, the first type of information identifies an image substructure (for example, a region or boundary segment), while the second type defines the physical relationship of that substructure to other substructures.
The region in figure (a) below can be represented by a tree using the relationship "inside of", if the root of the tree is denoted $. Figure (a) shows that the first level of complexity involves a and c inside $, which produces two branches emanating from the root, as shown in figure (b). The next level involves b inside a, and d and e inside c. Finally, f inside e completes the tree.
Fig. 5.41. (a) A simple composite region (b) Tree representation obtained by using the relationship "inside of"
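Purely as an illustration (the nested dictionaries and the frontier helper are a hypothetical representation, not from the text), the "inside of" tree of figure 5.41(b) could be coded as:

# The "inside of" tree: each key is a node label, its value holds the regions directly inside it;
# a and c are the regions directly inside the root $
inside_of_tree = {
    "a": {"b": {}},                      # b is inside a
    "c": {"d": {}, "e": {"f": {}}},      # d and e are inside c, f is inside e
}

def frontier(tree):
    # leaves of the tree taken left to right (the tree frontier)
    leaves = []
    for label, subtree in tree.items():
        leaves.extend(frontier(subtree) if subtree else [label])
    return leaves

print(frontier(inside_of_tree))   # ['b', 'd', 'f']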

5.21. RECOGNITION BASED ON MATCHING


Recognition techniques based on matching represent each class by a prototype
pattern vector.
An unknown pattern is assigned to the class to which it is closest in terms of a predefined metric. Two common matching approaches are
1. Minimum distance classifier
2. Matching by correlation

5.21.1. MINIMUM DISTANCE CLASSIFIER

The minimum distance classifier works well when the distance between means is large compared to the spread or randomness of each class with respect to its mean.
The prototype of each pattern class is taken to be the mean vector of the patterns of that class:

m_j = (1/N_j) Σ (X ∈ w_j) X,    j = 1, 2, ..., W

where N_j is the number of pattern vectors from class w_j and W is the number of pattern classes.
One way to determine the class membership of an unknown pattern vector X is to assign it to the class of its closest prototype. Using the Euclidean distance to determine closeness reduces the problem to computing the distance measures
D_j(X) = ||X - m_j||,    j = 1, 2, ..., W

If D_i(X) is the smallest distance, we assign X to class w_i. Selecting the smallest distance is equivalent to evaluating the functions

d_j(X) = X^T m_j - (1/2) m_j^T m_j,    j = 1, 2, ..., W

and assigning X to class w_i if d_i(X) is the largest numerical value.
The decision boundary between classes w_i and w_j for a minimum distance classifier is

d_ij(X) = d_i(X) - d_j(X) = X^T (m_i - m_j) - (1/2)(m_i - m_j)^T (m_i + m_j) = 0

The surface given by the above equation is the perpendicular bisector of the line segment joining m_i and m_j. For n = 2 the perpendicular bisector is a line, for n = 3 it is a plane, and for n > 3 it is called a hyperplane.
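A minimal sketch of the decision function above (assuming NumPy; the prototype values are hypothetical iris class means, not taken from the text):

import numpy as np

def min_distance_classify(x, prototypes):
    # assign x to the class whose mean m_j maximizes d_j(x) = x^T m_j - 0.5 * m_j^T m_j,
    # which is equivalent to choosing the prototype at minimum Euclidean distance
    scores = {label: x @ m - 0.5 * (m @ m) for label, m in prototypes.items()}
    return max(scores, key=scores.get)

prototypes = {
    "w1 (setosa)":     np.array([1.5, 0.3]),
    "w2 (virginica)":  np.array([5.9, 2.2]),
    "w3 (versicolor)": np.array([4.3, 1.3]),
}
print(min_distance_classify(np.array([4.5, 1.4]), prototypes))   # -> w3 (versicolor)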
5.21.2. MATCHING BY CORRELATION

Just as spatial convolution is related to the Fourier transforms of the functions via the convolution theorem, spatial correlation is related to the transforms of the functions via the correlation theorem:

f(x, y) ☆ w(x, y) ⇔ F*(u, v) W(u, v)

where ☆ indicates spatial correlation and F* is the complex conjugate of F.
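A rough sketch of using this theorem for template matching (assuming NumPy; the padding choice and function name are assumptions, and the exact peak coordinates depend on the correlation convention used):

import numpy as np

def correlate_via_fft(f, w):
    # spatial correlation of image f with template w, computed in the frequency
    # domain as the inverse transform of conj(F) * W, per the correlation theorem
    shape = (f.shape[0] + w.shape[0] - 1, f.shape[1] + w.shape[1] - 1)
    F = np.fft.fft2(f, shape)
    W = np.fft.fft2(w, shape)
    c = np.real(np.fft.ifft2(np.conj(F) * W))
    return c   # large values of c indicate positions where w agrees well with f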

TWO MARKS QUESTIONS AND ANSWERS


1. What is image compression?
Image compression refers to the process of reducing the amount of data required to represent a given quantity of information in a digital image. The basis of the reduction process is the removal of redundant data.
2. What is Data Compression?
Data compression requires the identification and extraction of source redundancy. In other words, data compression seeks to reduce the number of bits used to store or transmit information.

3. What are two main types of Data compression?


1. Lossless compression can recover the exact original data after compression. It is used mainly for compressing database records, spreadsheets or word processing files, where exact replication of the original is essential.
2. Lossy compression will result in a certain loss of accuracy in exchange for a substantial increase in compression. Lossy compression is more effective when used to compress graphic images and digitised voice, where losses outside visual or aural perception can be tolerated.
4. What is the need for Compression? (May'14)(May'13)
In terms of storage, the capacity of a storage device can be effectively increased
with methods that compress a body of data on its way to a storage device and
decompress it when it is retrieved.
1. In terms of communications, the bandwidth of a digital communication link
can be effectively increased by compressing data at the sending end and
decompressing data at the receiving end.
2. At any given time, the ability of the Internet to transfer data is fixed. Thus, if
data can effectively be compressed wherever possible, significant
improvements of data throughput can be achieved. Many files can be
combined into one compressed document making sending easier.

5. What are different Compression Methods?


Run Length Encoding (RLE)
Arithmetic coding
Huffman coding and
Transform coding
6. Define coding redundancy.
If the gray level of an image is coded in a way that uses more code words than necessary to represent each gray level, then the resulting image is said to contain coding redundancy.
7. Define interpixel redundancy.
The value of any given pixel can be predicted from the values of its neighbors. The information carried by an individual pixel is therefore small, and the visual contribution of a single pixel to an image is redundant. It is otherwise called spatial redundancy, geometric redundancy or interpixel redundancy. Eg: Run length coding.

8.
What is run length coding?
(May'14)
Run-length Encoding or RLE is a technique used to reduce the size of a repeating string of characters. This repeating string is called a run; typically RLE encodes a run of symbols into two bytes, a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of the data to be compressed affects the compression ratio. Compression is normally measured with the compression ratio.
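A tiny sketch of the idea (a hypothetical helper, not part of any standard):

def run_length_encode(data):
    # each run of identical symbols becomes a (count, symbol) pair
    runs = []
    i = 0
    while i < len(data):
        count = 1
        while i + count < len(data) and data[i + count] == data[i]:
            count += 1
        runs.append((count, data[i]))
        i += count
    return runs

print(run_length_encode("AAAABBBCCD"))   # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]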

9. Define compression ratio. (June'12)


Compression Ratio = original size/ compressed size
10. Define psycho visual redundancy.
In normal visual processing certain information has less importance than other
information. So this information is said to be psycho visual redundant.

II. Define encoder.


Encoder is responsible for removing the coding and interpixel redundancy and
psycho visual redundancy. There are two components A) Source Encoder B)
Channel Encoder
12. Define source encoder.
Source encoder performs three operations:
1) Mapper - this transforms the input data into a non-visual format. It reduces the interpixel redundancy.
2) Quantizer - it reduces the psycho visual redundancy of the input images. This step is omitted if the system is error free.
3) Symbol encoder - this reduces the coding redundancy. This is the final stage of the encoding process.

13. Define channel encoder.
The channel encoder reduces the impact of channel noise by inserting redundant bits into the source encoded data. Eg: Hamming code.

14. What are the types of decoder?
Source decoder - it has two components:
a) Symbol decoder - this performs the inverse operation of the symbol encoder.
b) Inverse mapper - this performs the inverse operation of the mapper.
Channel decoder - this is omitted if the system is error free.

15. What are the operations performed by error free compression?


1) Devising an alternative representation of the image in which its interpixel redundancies are reduced.
2) Coding the representation to eliminate coding redundancy

16. What is Variable Length Coding?


Variable Length Coding is the simplest approach to error free compression. It
reduces only the coding redundancy. It assigns the shortest possible code word to
the most probable gray levels.

17. Define Huffman coding and mention its limitation.


(June'12 & (Dec'13)
1. Huffman coding is a popular technique for removing coding redundancy.
2. When coding the symbols of an information source, the Huffman code yields the smallest possible number of code symbols per source symbol.
Limitation: For equiprobable symbols, Huffman coding produces variable-length code words.
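For illustration only (a hypothetical sketch using Python's heapq; the symbol probabilities below are made up), a Huffman code can be built by repeatedly merging the two least probable nodes:

import heapq

def huffman_code(probabilities):
    # heap entries: (probability, tie_breaker, {symbol: code_so_far})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

print(huffman_code({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}))   # e.g. {'a': '0', 'b': '10', 'd': '110', 'c': '111'}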

18. Define Block code.


Each source symbol is mapped into a fixed sequence of code symbols or code words. So it is called a block code.
19. Define instantaneous code.
A code word that is not a prefix of any other code word is called an instantaneous or prefix code word.

20. Define uniquely decodable code.
A code word that is not a combination of any other code words is said to be uniquely decodable.

21. Define B2 code.


Each code word is made up of continuation bit c and information bit which are
binary numbers. This is called B2 code or B code. This is called B2 code because
two information bits are used for continuation bits

22. Define the procedure for Huffman shift coding. (Dec'12) (May'13)
List all the source symbols along with their probabilities in descending order. Divide the total number of symbols into blocks of equal size. Sum the probabilities of all the source symbols outside the reference block. Now apply the Huffman procedure to the reference block, including the prefix source symbol.
The code words for the remaining symbols can be constructed by means of one or more prefix codes followed by the reference block, as in the case of the binary shift code.

23. Define arithmetic coding.
In arithmetic coding, a one-to-one correspondence between source symbols and code words does not exist; instead, a single arithmetic code word is assigned to a sequence of source symbols. A code word defines an interval of numbers between 0 and 1.
24. What is bit plane decomposition? (Dec'13)
An effective technique for reducing an image's interpixel redundancies is to process the image's bit planes individually. This technique is based on the concept of decomposing a multilevel image into a series of binary images and compressing each binary image via one of several well-known binary compression methods.
25. Draw the block diagram of transform coding system.
Encoder: Input Image → Wavelet Transform → Quantizer → Symbol Encoder → Compressed Image
Decoder: Compressed Image → Symbol Decoder → Inverse Wavelet Transform → Decompressed Image

26. How can the effectiveness of quantization be improved?
1. Introducing an enlarged quantization interval around zero, called a dead zone.
2. Adapting the size of the quantization intervals from scale to scale. In either case, the selected quantization intervals must be transmitted to the decoder with the encoded image bit stream.
27. Whatare the coding systems in JPEG?
(Dec'12)
1. A lossy baseline coding system, which is based on
the DCT and is adequate
for most compression application.
2. An extended coding system for greater
compression, higher precision or
progressive reconstruction applications.
3. A lossless independent coding system
for reversible compression.
28. What is JPEG?
The acronym is expanded as "Joint Photographic
Expert Group", It.is an
international standard in 1992. It
perfectly Works with color and gray
scale
images, Many applications e.g., satellite, medical.
29. What are the basic steps in JPEG?
The Major Steps in JPEG Coding involve:
1. DCT (Discrete Cosine Transformation)

2. Quantization
3. Zigzag Scan
4.DPCM on DC component
5. RLE on AC Components
6. Entropy Coding
30. What is MPEG?
The acronym is expanded as "Moving Picture Expert Group".
It is an
international standard in 1992. It perfectly Works with video and also used in
teleconferencing
31. Define I-frame.
I-frame is Intra frame or Independent frame. An I-frame is compressed independently of all other frames. It resembles a JPEG encoded image. It is the reference point for the motion estimation needed to generate subsequent P- and B-frames.
32. Define P-frame.
P-frame is called a predictive frame. A P-frame is the compressed difference between the current frame and a prediction of it based on the previous I- or P-frame.
33. Define B-frame.
B-frame is the bidirectional frame. A B-frame is the compressed difference between the current frame and a prediction of it based on the previous I- or P-frame and the next P-frame. Accordingly, the decoder must have access to both past and future reference frames.

34. What is shift code? (Dec'14)
The two variable length codes (Binary shift, Huffman shift) are referred to as shift codes. A shift code is generated by
i) Arranging the probabilities of the source symbols in monotonically decreasing order.
ii) Dividing the total number of symbols into symbol blocks of equal size.
iii) Coding the individual elements within all blocks identically.
iv) Adding special shift up/down symbols to identify each block.

35. Write the performance metrics for image compression. (Dec'14)
Compression ratio = message size before compression / code size after compression
Bit error rate



36. How is arithmetic coding advantageous over Huffman coding for text?
1. Huffman coding: codes are derived for the characters. Arithmetic coding: coding is done for messages of short lengths.
2. Huffman coding: Shannon's rate is achieved only if the character probabilities are all integer powers of 1/2. Arithmetic coding: Shannon's rate is always achieved irrespective of the probabilities of the characters.
3. Huffman coding: precision of the computer does not affect coding. Arithmetic coding: precision of the computer determines the length of the character string that can be encoded.
4. Huffman coding is a simple technique; arithmetic coding is complicated.

37. What is JPEG standard?
JPEG stands for Joint Photographic Experts Group. This group has developed a standard for compression of monochrome/color still photographs and images. This compression standard is known as the JPEG standard. It is also known as ISO standard 10918.
38. State the main application of Graphics Interchange
Format (GIF).
The GIF format is used mainly on the internet to represent and compress graphical images. GIF images can be transmitted and stored over the network in interlaced mode, which is very useful when images are transmitted over low bit rate channels.
39. What is Lempel-Ziv-Welch code (LZW)?
Lempel-Ziv-Welch is a dictionary based compression method. It maps a variable number of symbols to a fixed length code.
LZW is a good example of compression or communication schemes that transmit the model rather than transmit the data.
40. Compare Huffman Coding and Lempel-Ziv-Welch Coding.
1. Huffman coding is an entropy encoding algorithm used for lossless data compression. LZW is the most widely used technique for lossless file compression.
2. A Huffman encoder takes a block of input characters of fixed length and produces a block of output bits of variable length. LZW is a variable-to-fixed length code.
41. Define chain coding. (Apr/May-11, Nov/Dec-2010)
Chain coding is defined as the process of representing a boundary between two regions by using a connected sequence of straight line segments having specified length and direction.

42. Define the chain code derivative in 4 and 8 connectivity. (Nov/Dec-2009)
In 4-connectivity the chain code uses four direction numbers: 0 (right), 1 (up), 2 (left) and 3 (down).
In 8-connectivity the chain code uses eight direction numbers 0 to 7, assigned counterclockwise starting from 0 (right), so that the odd numbers correspond to the diagonal directions.
43. What are the three principal approaches used in image processing to describe the texture of a region? (May/June-2009, Nov/Dec-2008)
The three approaches used to describe texture are
i) Statistical approaches - characterize the texture as smooth, coarse, etc.
ii) Structural approaches - based on the arrangement of image primitives.
iii) Spectral approaches - based on the properties of the Fourier spectrum.


44. What is texture? [Nov/Dec-2007]
Texture content is an important quality used to describe a region. This descriptor provides measures of properties such as
1. Smoothness
2. Coarseness
3. Regularity
45. Define pattern.
A pattern is a quantitative or structural description of an object or some other entity of interest in an image.
46. Define pattern class.
A pattern class is a family of patterns that share some common properties. Pattern classes are denoted w1, w2, ..., wW, where W is the number of classes.
47. What are the demerits
of chain code?
The resulting chain code tends to be quite long.
Any small disturbance along the boundary due to noise causes changes in the code that may not be related to the shape of the boundary.
48. Specify the various image representation
approaches
Chain codes
& Polygonal approximation
Boundary segments.
49. What is polygonal approximation
method ?
Polygonal approximation is an image representation approach
in which a digital
boundary can be approximated with arbitrary accuracy
by a polygon. For a
closed curve the approximation is exact when the
number of segments in polygon
is equal to the number of points in the boundary so
that each pair of adjacent
points defines a segment in the polygon.
50. Specify the various polygonal approximation
methods
Minimum perimeter polygons
$ Merging techniques
$ Splitting techniques
51. Name few boundary descriptors
Simple descriptors
Shape numbers
Fourier descriptors
52. Give the formula for the diameter of a boundary.
The diameter of a boundary B is defined as

Diam(B) = max over i, j of [D(P_i, P_j)]

where D is a distance measure and P_i, P_j are points on the boundary.
53. Define length of a boundary.
The length of a boundary is the number of pixels along the boundary. Eg: For a chain coded curve with unit spacing in both directions, the number of vertical and horizontal components plus √2 times the number of diagonal components gives its exact length.
54. Define shape numbers.
Shape number is defined as the first difference of smallest magnitude. The order n of a shape number is the number of digits in its representation.
55. Give the Fourier descriptors for the following transformations:
(1) Identity
(2) Rotation
(3) Translation
(4) Scaling
(5) Starting point
56. Specify the types of regional descriptors
Simple descriptors
Topological descriptors
Texture
Moment invariants
57. Name few measures used as simple descriptors in region description
Area
Perimeter
Compactness
Mean and median of gray levels
Minimum and maximum of gray levels
Number of pixels with values above and below the mean
58. Describe texture.
Texture is one of the regional descriptors. It provides measures of properties such as smoothness, coarseness and regularity.
There are three approaches used to describe the texture of a region. They are:
Statistical
Structural
Spectral
59. Describe statistical approach.
Statistical approaches describe smooth, coarse and grainy characteristics of texture. This is the simplest one compared to the others. It describes texture using statistical moments of the gray level histogram of an image or region.
60. Define gray level co-occurrence matrix.
A matrix C is formed by dividing every element of A by n, where A is a k x k matrix and n is the total number of point pairs in the image satisfying the position operator P. The matrix C is called the gray level co-occurrence matrix. Since C depends on P, the presence of given texture patterns may be detected by choosing an appropriate position operator.
61. What is pattern recognition?
It involves the techniques for assigning patterns to their respective classes automatically and with as little human intervention as possible.
62. Define Translation.
Translation consists of adding a constant displacement to all coordinates in the boundary. It is obtained as

S_t(k) = S(k) + Δxy

where Δxy = Δx + jΔy.

63. Define statistical moments.
The shape of boundary segments can be described by using statistical moments. Such moments are the mean, variance and higher order moments. They can be used to reduce the 2-D description problem to a 1-D problem.
64. Mention some advantages of statistical moments.
1. Implementation of moments is straightforward.
2. They carry a physical interpretation of boundary shape.
3. This approach is insensitive to rotation.
4. Size normalization can be achieved by scaling the range of values of g and r.
65. What are the topological properties used for region description?
The two topological properties useful for region description are:
1. Number of holes in the region.
2. Number of connected components of the region.
66. What is meant by uniformity?
Uniformity is one of the measures of texture which is based on the histogram. It is defined as

U(z) = Σ (i = 0 to L-1) p²(z_i)

This measure is maximum for an image in which all intensity levels are equal (maximally uniform) and decreases from there.
67. Define Entropy.
Entropy is also one of the texture measures and it can be defined as a measure of the variability of an image. It can be expressed as

e(z) = - Σ (i = 0 to L-1) p(z_i) log₂ p(z_i)
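As a small sketch (assuming NumPy; 256 gray levels are assumed only for illustration), both texture measures can be computed from the normalized histogram p(z_i):

import numpy as np

def histogram_texture_measures(image, levels=256):
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()                     # normalized histogram p(z_i)
    uniformity = np.sum(p ** 2)               # U(z) = sum of p^2(z_i)
    nonzero = p[p > 0]                        # skip empty bins to avoid log2(0)
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return uniformity, entropy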
68. Define Pattern Tree.
A more powerful approach for many applications is the use of tree descriptions. A tree T is a finite set of one or more nodes for which
a. there is a unique node $ designated the root, and
b. the remaining nodes are partitioned into m disjoint sets T1, ..., Tm, each of which in turn is a tree called a subtree of T.
69. Define Minimum Distance Classifier.
The minimum distance classifier is used to classify unknown image data into classes which minimize the distance between the image data and the class prototype in multi-feature space.
70. Define string descriptions.
String descriptions adequately generate patterns of objects and other entities whose structure is based on relatively simple connectivity of primitives, usually associated with boundary shape.

REVIEW QUESTIONS

1. Explain three basic data redundancies. What is the need for data compression?
Refer Section 5.3, Page No. 5.2
2. What is image compression? Explain any four variable length coding compression schemes.
Refer Sections 5.4 and 5.6, Page Nos. 5.5 and 5.8
3. Explain about the image compression model.
Refer Section 5.4, Page No. 5.5
4. Explain about error free compression.
Refer Section 5.5, Page No. 5.7
5. Explain arithmetic coding?
May 14, Dec 13
Refer Section 5.6.4., Page No. 5.17
6. Explain about Lossless Predictive Coding?
Refer Section 5.8, Page No.5.2.2
7. What is the need of block transform
coding? Explain.
Refer Section 5.10, Page No. 5.25

Digital Image Fundamentals

Steps in Digital Image Processing - Components - Elements of Visual Perception - Image Sensing and Acquisition - Image Sampling and Quantization - Relationships between pixels - Color image fundamentals - RGB, HSI models, Two-dimensional mathematical preliminaries, 2D transforms - DFT, DCT

1.1. INTRODUCTION

Pictures are the most common and convenient means of conveying or transmitting information. A picture is worth a thousand words. Pictures concisely convey information about positions, sizes and inter-relationships between objects. They portray spatial information that we can recognize as objects. Human beings are good at deriving information from such images, because of our innate visual and mental abilities. About 75% of the information received by humans is in pictorial form.
Modern digital technology has made it possible to manipulate multi-dimensional signals with systems that range from simple digital circuits to advanced parallel computers. The goal of this manipulation can be divided into three categories.

Technology Outcomes
Image Processing image in image out
Image Analysis image in measurements out
Image understanding image in high level description out

Image processing
The digital image processing deals with developing a digital system that performs
operations on a digital image.

An image is nothing more than a two-dimensional signal. It is defined by the mathematical function f(x, y), where x and y are the two coordinates, horizontally and vertically, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels and pixels.

1.1.1. MOTIVATION AND PERSPECTIVE


Digital image processing deals with manipulation of digital images through a digital computer. It is a subfield of signals and systems but focuses particularly on images. DIP focuses on developing a computer system that is able to perform processing on an image. The input of the system is a digital image, and the system processes that image using efficient algorithms and gives an image as an output. The most common example is Adobe Photoshop, one of the widely used applications for processing digital images.

1.1.2. APPLICATIONS
Some of the major fields
in which digital image processing
is widely used are
mentioned below.
1. Office automation
Optical character recognition, document processing, cursive script recognition, logo and icon recognition, identification of the address area on envelopes, etc.
2. Industrial automation
Automatic inspection systems, non-destructive testing, automatic processes related to assembly, VLSI manufacturing, PCB checking, robotics, oil and natural gas exploration, process control applications, etc.

3. Bio-Medical
ECG, EEG, EMG Analysis,
cytological, histological and stereological
applications, automated radiology
and pathology, x-ray image
analysis, etc.
4. Remote Sensing
Natural resources survey and management, estimation related to agriculture,
hydrology, forestry, mineralogy, urban planning, environment and pollution control,
monitoring traffic along Roads, docks and airfields, etc.
5. Scientific Applications
High energy physics, bubble chamber and other forms of track Analysis etc.
6. Criminology
Finger print identification, human face registration and matching, forensic
investigation,etc.
7. Astronomy and Space applications
Restoration of. images suffering from geometric and photometric distortions,
computing close-up picture of planetary surfaces etc.
8. Meteorology
Short term weather forecasting, long-term climatic change detection from satellite
and other remote sensing data, cloud pattern Analysis, etc.

9. Information Technology
Facsimile image transmission, videotex, video conferencing and videophones, etc.

10. Entertainment and Customer electronics


HDTV, multimedia and video editing, etc.
11. Printing and graphic arts
Colour fidelity in desktop publishing, art conservation and
dissemination etc.

12. Military application


Missile guidance and detection, target identification, navigation of pilotless vehicles, reconnaissance and range finding, etc.
In addition to the above mentioned areas, another important application of image
processing techniques is improvement of the quality or appearance of a given image.

T.Z. THE ORIGINS OF DIGITAL IMAGE PROCESSING

across a large evolution.


The origin of Digital Image Processing came
Page 238 of 304

First applications of digital images were in the newspaper industry in the 1920s, when pictures were sent by submarine cable between London and New York. In the same period, the Bartlane cable picture transmission system was introduced to reduce the time required to transport a picture across the Atlantic from more than a week to less than three hours.
In 1921, a printing method based on photographic reproduction made from tapes perforated at the telegraph receiving terminal was adopted in favour of earlier techniques.
In the early 1920s, the Bartlane systems were capable of coding images in five distinct levels of gray, and this capability was increased to 15 levels in 1929.
During this period, the introduction of a system for developing a film plate via light beams that were modulated by the coded picture tape improved the reproduction process considerably.
The above examples involve only digital images; they are not considered as digital image processing because computers were not involved in the creation of the digital images. Thus, the history of digital image processing is intimately tied to the development of the digital computer.
In the 1940s, the modern digital computer was introduced by John Von Neumann with two key concepts:
i) A memory to hold a stored program and data
ii) Conditional branching
These two ideas are the foundation of a Central Processing Unit (CPU).
Later, in the early 1960s, the first computers powerful enough to carry out meaningful image processing tasks were introduced.
In 1964, work on improving digital image processing began at the Jet Propulsion Laboratory, when pictures of the moon transmitted by Ranger 7 were processed by a computer to correct various types of image distortion inherent in the on-board television camera.
In parallel with space applications, digital image processing techniques began in the late 1960s and early 1970s to be used in medical imaging, remote Earth resources observations and astronomy.
In the early 1970s, Computerized Axial Tomography (CAT), also called Computerized Tomography (CT), was introduced. In this process a ring of detectors encircles an object (patient), and an x-ray source, concentric with the detector ring, rotates about the object.
From the 1960s until the present, the field of image processing has grown vigorously. In addition to applications in medicine and the space program, digital image processing techniques now are used in a broad range of applications.
Computer procedures are used to enhance the contrast or code the intensity levels into color for easier interpretation of x-rays and other images used in industry, medicine and the biological sciences.
Geographers use the same or similar techniques to study pollution patterns from aerial and satellite imagery.
Image enhancement and restoration procedures are used to process degraded images of unrecoverable objects.
In archeology, image processing methods have successfully restored blurred pictures that were the only available records of rare artifacts lost or damaged after being photographed.
In physics and related fields, computer techniques routinely enhance images of experiments in areas such as high-energy plasmas and electron microscopy.
Similarly, successful applications of image processing can be found in astronomy, biology, nuclear medicine, law enforcement, defense and industry.

1.3. FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING


The various steps required for any digital image processing application are given below (Fig. 1.2).
1. Image acquisition
2. Image enhancement
3. Image restoration
4. Color image processing
5. Wavelets
6. Compression

7. Morphological processing
8. Segmentation
9. Representation and description
10. Recognition
There are two categories of the steps involved in image processing:
1. Methods whose input and output are images.
2. Methods whose inputs may be images but whose outputs are attributes extracted from those images.
This organization is summarized in the figure below. The diagram does not imply that every process is applied to an image. Instead, the intention is to convey an idea of all the methodologies that can be applied to images for different purposes and possibly with different objectives.
1. Image Acquisition
First step in image processing is image capturing i.e., to capture a digital image. It
could be as simple as being given an image that is already in digital form. Generally
the image acquisition stage involves processing such as scaling.

2. Image enhancement
The principal objective of enhancement technique is to process an image so that
the result is more suitable than the original image for a specific application

Fig. 1.1. Image enhancement (original image and modified image)



[Image acquisition → Image enhancement → Image restoration → Color image processing → Wavelets → Compression → Morphological processing → Segmentation → Representation and description → Recognition]

Fig. 1.2. Fundamental Steps in Digital Image Processing



3. Image Restoration

Fig. 1.3. Image Restoration (input: image; output: image)


The ultimate goal of restoration technique is to improving the appearance of an
image. It is a process that attempts to reconstruct or recover an image that has been
degraded by using some prior knowledge of the degradation concept.
Image restoration is an objective approach, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result.
4. Color Image Processing
Color Image Processing deals with basically color models and their
implementation in image processing applications.
5. Wavelets and Multi resolution Processing
These are the foundation for representing image in various degrees of Resolution.
6. Compression
It deals with techniques reducing the storage required to save an image, or the
bandwidth required to transmit it over the network. It has two major approaches.
a) Lossless compression
b) Lossy compression
Fig. 1.4. Compression (output: bit-stream data, e.g. "010100101100110101001...")


a) Lossless compression
Lossless compression is a class of data compression algorithms that allows the original data to be perfectly reconstructed from the compressed data.
b) Lossy Compression
It is a compression technique that does not decompress digital data back to 100%
of the original.
7. Morphological Processing
It deals with tools for extracting image components that are
useful in the
representation and description of shape and boundary of objects. It is majorly used in
automated inspection application.
8. Segmentation
Process of segmentation is partition of input image into constituent parts. The key
role of segmentation is to extract the boundary of the object from the background.
The output of the segmentation stage usually consists of either the boundary of the region or all the points in the region itself.
9. Representation and description
It always follows the output of the segmentation step, which is raw pixel data constituting either the boundary of a region or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary.
10. Recognition
It is the process that assigns a label to an object based on its descriptors. It is the last step of image processing, and it makes use of artificial intelligence software.
11. Knowledge base
Knowledge about the problem domain is coded into the image processing system in the form of a knowledge base. This knowledge may be as simple as describing the regions of the image where the information of interest is located. Each module will interact with the knowledge base to decide the appropriate technique for the right application.
1.4. COMPONENTS OF AN IMAGE PROCESSING SYSTEM

Image Sensors

With reference to sensing, two elements are required to acquire a digital image. The first is a physical device that is sensitive to the energy radiated by the object we wish to image. The second, called a digitizer, is a device for converting the output of the physical sensing device into digital form.

[Network - Image displays - Computer - Mass storage - Hardcopy - Specialized image processing hardware - Image processing software - Image sensors - Problem domain]

Fig. 1.5. Components of a General Purpose Image Processing System



1. Specialized Image Processing Hardware
It consists of the digitizer plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which performs arithmetic and logical operations in parallel on images. This type of hardware is also known as a front-end subsystem.

2. Computer
It is a general purpose computer and can range from a PC to a super computer
depending on the application. In dedicated application, sometimes specially designed
computer are used to achieve a required level of performance.

3. Software
It consists of specialized modules that perform specific tasks. A well designed
package also includes capability for the user to write code, as a minimum, utilizes the
specialized module. More sophisticated software packages allow the integration of
these modules.

4. Mass storage
Mass storage capability is a must in image processing application. An image of
size 1024 x 1024 pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if the image is not compressed. Image processing applications fall into three principal categories of storage:
(i) Short term storage for use during processing
(ii) On-line storage for relatively fast retrieval
(iii) Archival storage such as magnetic tapes and disks.
(i). Short term storage for use during processing
This can be provided by using computer memory or frame buffers.
Frame buffers are specialized boards that store one or more images and can be accessed rapidly at video rates. This method allows instantaneous image zoom, scroll (vertical shifts) and pan (horizontal shifts).

(ii). On-line storage for fast retrieval
On-line storage generally takes the form of magnetic disks or optical media storage. This type of storage gives frequent access to the stored data.

(iii). Archival Storage


It requires a large amount of storage space, and the stored data is accessed infrequently. Magnetic tapes and optical disks packed in juke boxes provide this type of storage.

5. Image displays
Commonly used displays are color TV monitors. These monitors are driven by the
outputs of image and graphics displays cards that are an integral part of computer
system.

6. Hardcopy devices
The devices for recording images include laser printers, film cameras, heat sensitive devices, inkjet units and digital units such as optical and CD-ROM disks. Films provide the highest possible resolution, but paper is the obvious medium of choice for written applications.

7. Networking
any computer system in use today.
Networking is almost a default function in
processing applications, the
Because of the large amount of data inherent in image
bandwidth.
key consideration in image transmission is

1.5. ELEMENTS OF VISUAL PERCEPTION


It is important for designers and users of image processing techniques to understand the characteristics of the human vision system and the underlying perceptual processes. Since the aim of image processing and analysis techniques is to build a system that has capabilities similar to the human vision system, one must know about the formation of the image in the eye, brightness adaptation and discrimination, image quality and perceptual mechanisms, and should incorporate such knowledge in the processing algorithms.

1.5.1. STRUCTURE OF THE HUMAN EYE


Human vision system consist of four major elements
1. The eyes
2. The lateral geniculate body
3. The visual cortex in the optical lobe of the brain
4. The neural communication pathway from eyes to visual cortex

[Labels: cornea, iris, anterior chamber, ciliary body, ciliary muscle, lens, ciliary fibers, visual axis, vitreous humor, retina, blind spot, sclera, fovea, choroid, nerve and sheath]

Fig. 1.6. Simplified diagram of a cross section of the human eye
The eye is nearly a sphere with an average diameter of approximately 20 mm. The eye is enclosed by three membranes:
(1) Cornea and Sclera
(ii) Choroid
(iii) Retina
0). The Cornea
and Sclera
The cornea is a tough, transparent tissue that covers the anterior surface of the eye. The rest of the optic globe is covered by the sclera.
(ii). Choroid

The choroid lies directly below the sclera. It contains a network of blood vessels that serve as the major source of nutrition to the eye. It helps to reduce the extraneous light entering the eye.
It has two parts:
1. Iris diaphragm - it contracts or expands to control the amount of light that enters the eye.
2. Ciliary body
Lens
The lens is made up of concentric layers of fibrous cells and is suspended by fibers that attach to the ciliary body. Infrared and ultraviolet light are absorbed appreciably by proteins within the lens structure and in excessive amounts can damage the eye.
(iii). Retina
It is the innermost membrane of the eye, which lines the inside of the wall's entire
posterior portion. When the eye is properly focused, light from an object outside the
eye is imaged on the retina. There are various light Receptors over the surface of the
retina.
The two major classes of the receptors are
1. Cones
They number about 6 to 7 million. These are located in the central portion of
the retina called the Fovea. These are highly sensitive to color. Human can resolve
fine details with these cones because each one is connected to its own nerve end.
Cone vision is called photopic or bright light vision.
2. Rods
These are much larger in number, from 75 to 150 million, and are distributed over the
entire retinal surface. The large area of distribution and the fact that several rods are
connected to a single nerve give a general overall picture of the field to view. They
are not involved in the color vision and are sensitive to low level of illumination. Rod
vision is called as, scotopic or dim light vision.
The absence of receptors in this region is called the blind spot.
1.5.2. IMAGE FORMATION IN THE EYE
The major difference between the lens of the eye and an ordinary optical lens is that the former is flexible. The shape of the lens of the eye is controlled by tension in the fibers of the ciliary body. To focus on distant objects, the controlling muscles cause the lens to be relatively flattened; to focus on objects near the eye, these muscles allow the lens to become thicker.
The distance between the center of the lens and retina
is called the focal length and
it varies from 17 mm to 14 mm as the refractive power
of the lens increases from its
minimum to its maximum.
When the eye focuses on an object further away than about 3m, the lens
exhibits
its lowest refractive power. When the eye focuses on a nearby object, the lens is most strongly refractive.
The retinal image is reflected primarily in the area of the fovea. Perception takes
place by the relative excitation of light receptors, which transform Radiant energy
into electrical impulses that are ultimately decoded by the brain.

Fig. 1.7. Graphical representation of the eye looking at a palm tree (object height 15 m at a distance of 100 m; image distance 17 mm; point C is the optical center of the lens)

1.5.3. BRIGHTNESS ADAPTATION AND DISCRIMINATION


Digital images are displayed as a discrete set of intensities, the eye's ability to
discriminate between different intensity Levels is an important consideration in
presenting image processing results.
The range of light intensity levels to which the human visual system can adapt is enormous, on the order of 10^10, from the scotopic threshold to the glare limit. Experimental evidence indicates that subjective brightness is a logarithmic function of the light intensity incident on the eye.

The curve represents the range of intensities to which the visual system can adapt. But the visual system cannot operate over such a dynamic range simultaneously. Rather, it is accomplished by changes in its overall sensitivity, a phenomenon called brightness adaptation.

Fig. 1.8. Range of subjective brightness sensations showing a particular adaptation level (subjective brightness versus log of intensity (mL), from the scotopic threshold to the glare limit)
For any given set of conditions, the current sensitivity level of the visual system is called the brightness adaptation level, B_a in the curve. The small intersecting curve represents the range of subjective brightness that the eye can perceive when adapted to this level. It is restricted at level B_b, at and below which all stimuli are perceived as indistinguishable blacks. The upper portion of the curve is not actually restricted; higher intensities would simply raise the adaptation level higher than B_a.
The ability of the eye to discriminate between changes in light intensity at any specific adaptation level is also of considerable interest.
Take a flat, uniformly illuminated area large enough to occupy the entire field of view of the subject. It may be a diffuser, such as an opaque glass, that is illuminated from behind by a light source whose intensity, I, can be varied. To this field is added an increment of illumination ΔI in the form of a short duration flash that appears as a circle in the center of the uniformly illuminated field.
If ΔI is not bright enough, the subject cannot see any perceivable change.

Fig. 1.9. Basic experimental setup used to characterize brightness discrimination (a short-duration flash of intensity I + ΔI in a uniform field of intensity I)


As ΔI gets stronger, the subject may indicate a perceived change. ΔI_c is the increment of illumination discernible 50% of the time with background illumination I. Now, ΔI_c / I is called the Weber ratio.
A small value of the Weber ratio means that a small percentage change in intensity is discriminable, representing "good" brightness discrimination.

Fig. 1.10. Illustration of the Mach band effect (perceived brightness versus actual illumination)


A large value of the Weber ratio means that a large percentage change in intensity is required, representing "poor" brightness discrimination.
Two phenomena clearly demonstrate that perceived brightness is not a simple function of intensity. The first is based on the fact
that the visual system tends to
undershoot or overshoot around the boundary of regions of different intensities.
The intensity of the stripes is constant; we actually perceive a
brightness pattern
that is strongly scalloped near the boundaries. These scalloped bands are called Mach
bands after Ernst Mach, who first described the Phenomenon in 1865.
The second phenomenon called simultaneous contrast is related to the fact that a
region's perceived brightness does not depend simply on its intensity, as shown in
the figure below. All the center squares have exactly the same intensity, but they appear
to the eye to become darker as the background gets lighter.

Fig. 1.11. Examples of Simultaneous contrast


Optical illusion
In optical illusions, the eye fills in non-existing information or wrongly perceives the geometrical properties of objects.
In figure (a) the outline of a square is seen clearly, despite the fact that no lines defining such a figure are part of the image.
In figure (b), a few lines are sufficient to give the illusion of a complete circle.
In figure (c), the two horizontal line segments are of the same length, but one appears shorter than the other.
In figure (d), all lines oriented at 45° are equidistant and parallel, yet the cross hatching creates the illusion that those lines are far from being parallel. Optical illusions are a characteristic of the human visual system that is not fully understood.

Fig. 1.12. Some well-known optical illusions (a)-(d)

1.6. IMAGE SENSING AND ACQUISITION


Most of the images are generated by the combination of an illumination source and the reflection or absorption of energy. The scene elements could be familiar objects, but they can just as easily be molecules, buried rock formations or a human brain. Depending on the nature of the source, illumination energy is reflected from, or transmitted through, objects.
There are three principal sensor arrangements used
to transform illumination
energy into digital images, such as
1. Single imaging sensor

2. Line sensor
3. Array sensor
Sensors Working Method
All the above three kinds of sensors perform the following steps to generate a digital image.
a voltage by the combination of input
1. Incoming energy is transformed into
electrical power.

2. Sensor material is responsive to the particular type of energy being detected.


3. The output voltage waveform is the response of the sensor, and a digital quantity is obtained from each sensor by digitizing its response.

1.6.1. IMAGE ACQUISITION USING A SINGLE SENSOR

[Components: incoming energy, filter, sensing material, housing, power in, voltage waveform out]

Fig. 1.13. Single imaging sensor
The above diagram shows the components of a single sensor. A photodiode is the best example of this type. The use of a filter in front of a sensor improves selectivity.
In order to generate a 2-D image using a single sensor, there has to be relative
displacements in both the x and y directions between the sensor and the area to the
imaged.
Fig. 1.14. Combining a single sensor with motion to generate a 2-D image (film on a rotating drum with linear motion of the sensor; one image line out per increment of rotation and full linear displacement of the sensor from left to right)
The above figure shows an arrangement used in high precision scanning. A film negative is mounted onto a drum whose mechanical rotation provides displacement in one dimension.

The single sensor is mounted on a lead screw that provides motion in the
perpendicular direction, because mechanical motion can be controlled with high
precision.
Microdensitometers
In above figure other similar mechanical arrangements use a flat bed, with the
sensor moving in two linear directions. These mechanical digitizers are also known as
microdensitometers.

1.6.2. IMAGE ACQUISITION USING SENSOR STRIPS

Fig. 1.15. Line Sensor


The strip provides imaging elements in one direction. Motion perpendicular to the
strip provides imaging in the other direction.

Fig. 1.16. (a) Image acquisition using a linear sensor strip (one image line out per increment of linear motion of the strip over the imaged area)


This kind of arrangement is used in most flatbed scanners; sensing devices with 4000 or more in-line sensors are possible.
In-line sensors are used in airborne imaging applications, in which the imaging system is mounted on an aircraft that flies at a constant altitude and speed over the geographical area to be imaged.
One dimensional imaging sensor strips respond to various bands of the
electromagnetic spectrum and are mounted perpendicular to the direction of flight.

The imaging strip gives one line of an image at a time, and the motion of the strip completes the other dimension of a two-dimensional image. Lenses or other focusing schemes are used to project the area to be scanned onto the sensors.
Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain cross sectional images of 3-D objects.
A rotating x-ray source provides illumination, and the sensors opposite the source collect the x-ray energy that passes through the object. This is the basis for medical and industrial computerized axial tomography (CAT) imaging.

Fig. 1.16. (b) Image acquisition using a circular sensor strip (an x-ray source and sensor ring surround a 3-D object; linear motion and image reconstruction yield cross-sectional images of the object)


Images are not obtained directly from the sensors by motion alone; they require extensive processing. Other modalities of imaging based on the CAT principle include Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET).


1.6.3. IMAGE ACQUISITION USING SENSOR ARRAYS

Fig. 1.17. Array Sensor


Above figure shows individual sensors arranged in the form of 2-D array. In that
numerous electromagnetic and some ultrasonic sensing devices frequently are
arranged in an array format. It will be used in digital cameras.
In a digital camera, a CCD array of sensors is used, which can be manufactured with a broad range of sensing properties and can be arranged in arrays of 4000 x 4000 elements or more.
CCD sensors are widely used in digital cameras and other light sensing
Instruments. The response of each sensor is proportional to the integral of the light
energy projected onto the surface of the sensor.
Advantages
1. Noise reduction

2. Complete image can be obtained by focusing the energy pattern onto the
surface of the array.
a scene element.
lhe energy from an illumination source is being reflected from
ne imaging system will collect the incoming energy and focus it onto an image
plane.

II the illumination is light, the front end of the imaging system is an optical lens
hat projects the viewed scene onto the lens focal plane.

[Figure: illumination (energy) source, scene element, imaging system, (internal) image plane and output (digitized) image.]

Fig. 1.18. An example of the digital image acquisition process

The sensor array, which is coincident with the focal plane, produces outputs proportional to the integral of the light received at each sensor. Digital and analog circuitry sweep these outputs and convert them to an analog signal, which is then digitized by another section of the imaging system. The output is a digital image.

1.6.4. A SIMPLE IMAGE FORMATION MODEL


An image is denoted by a two-dimensional function of the form f(x, y). The value or amplitude of f at spatial coordinates (x, y) is a positive scalar quantity whose physical meaning is determined by the source of the image.

When an image is generated by a physical process, its values are proportional to the energy radiated by a physical source. As a consequence, f(x, y) must be non-zero and finite, that is

    0 < f(x, y) < ∞

The function f(x, y) may be characterized by two components:
1. The amount of the source illumination incident on the scene being viewed.
2. The amount of the source illumination reflected back by the objects in the scene.

These are called the illumination and reflectance components and are denoted by i(x, y) and r(x, y) respectively.

The two functions combine as a product to form f(x, y):

    f(x, y) = i(x, y) r(x, y)

We call the intensity of a monochrome image at any coordinates (x, y) the gray level (l) of the image at that point:

    l = f(x, y),   Lmin ≤ l ≤ Lmax

Lmin must be positive and Lmax must be finite, where

    Lmin = imin rmin   and   Lmax = imax rmax

The interval [Lmin, Lmax] is called the gray scale. Common practice is to shift this interval numerically to the interval [0, L-1], where l = 0 is considered black and l = L-1 is considered white on the gray scale. All intermediate values are shades of gray varying from black to white.

1.7. IMAGE SAMPLING AND QUANTIZATION

To create a digital image, we need to convert the continuous sensed data into digital form. This involves two processes: sampling and quantization. An image may be continuous with respect to the x and y coordinates and also in amplitude. To convert it into digital form, we have to sample the function in both coordinates and in amplitude.

1.7.1. BASIC CONCEPTS IN SAMPLING AND QUANTIZATION


Figure 1.19 (a) shows a continuous image f that we want to convert to digital form. To convert it, we have to sample the function in both coordinates and in amplitude.

1. Digitizing the coordinate values is called sampling.
2. Digitizing the amplitude values is called quantization.

The one dimensional function in Figure 1.19 (b) is a plot of amplitude values of
the continuous image along the line segment AB in Figure 1.19 (a). The random
variations are due to image noise.

Fig. 1.19. Generating a digital image


To sample this function, we take equally spaced samples along line AB, as shown in figure 1.19 (c). The spatial location of each sample is indicated by a vertical tick mark in the bottom part of the figure. The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function.

The values of the samples still span a continuous range of intensity values. In order to form a digital function, the intensity values also must be converted into discrete quantities. So the intensity scale is divided into eight discrete intervals, ranging from black to white, and the vertical tick marks indicate the specific value assigned to each of the eight intensity levels.
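As a small illustration of this quantization step, the following Python sketch assigns a row of hypothetical continuous sample values to eight equally spaced intensity levels (the sample values are arbitrary and used only for illustration):

    import numpy as np

    # Hypothetical continuous sample values taken along a scan line, normalized to [0, 1).
    samples = np.array([0.05, 0.32, 0.48, 0.51, 0.77, 0.93, 0.61, 0.12])

    levels = 8   # divide the intensity scale into eight discrete intervals
    # Each sample falls into one of eight equal-width intervals (0 = black, 7 = white).
    quantized = np.minimum((samples * levels).astype(int), levels - 1)

    print(quantized)   # -> [0 2 3 4 6 7 4 0]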

Fig. 1.20. (a) Continuous image projected onto a sensor array. (b) Result of image sampling and quantization

The digital samples resulting from both sampling and quantization are shown in figure 1.19 (d). Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.

When a sensing array is used for image acquisition, there is no motion, and the number of sensors in the array establishes the limits of sampling in both directions. Quantization of the sensor outputs is as before; figure 1.20 above shows this concept. Figure 1.20 (a) shows the continuous image projected onto the plane of an array sensor, and figure 1.20 (b) shows the image after sampling and quantization.

1.7.2. REPRESENTING DIGITAL IMAGES

The result of sampling and quantization is a matrix of real numbers. Assume that an image f(x, y) is sampled so that the resulting digital image has M rows and N columns. The values of the coordinates (x, y) now become discrete quantities; thus, the value of the coordinates at the origin becomes (x, y) = (0, 0), and the next coordinate value is along the first row. This does not mean that these are the values of the physical coordinates when the image was sampled.

Spatial Domain
The section of the real plane spanned by the coordinates of an image is called the
spatial domain, withx and y is referred as spatial variables or spatial coordinates.

              f(0, 0)      f(0, 1)      ...   f(0, N-1)
    f(x, y) = f(1, 0)      f(1, 1)      ...   f(1, N-1)
              ...          ...                ...
              f(M-1, 0)    f(M-1, 1)    ...   f(M-1, N-1)


Both sides of this equation are equivalent ways of expressing a digital image quantitatively. The right side is a matrix of real numbers. Each element of this matrix is called an image element, pixel, or pel.

The sampling process may be viewed as partitioning the xy-plane into a grid, with the coordinates of the center of each grid cell being a pair of elements from the Cartesian product Z², which is the set of all ordered pairs of elements (zi, zj) with zi and zj being integers from Z.

Hence, f(x, y) is a digital image if it assigns a gray level (that is, a real number from the set of real numbers, R) to each distinct pair of coordinates (x, y). This functional assignment is the quantization process. If the gray levels are also integers, Z replaces R, and a digital image becomes a 2-D function whose coordinates and amplitude values are integers.

Due to storage and quantizing hardware considerations, the number of intensity levels typically is an integer power of 2:

    L = 2^k

Then, the number b of bits required to store a digital image is

    b = M × N × k

When M = N this equation becomes

    b = N² k

When an image can have 2^k gray levels, it is referred to as a k-bit image. An image with 256 possible gray levels is called an 8-bit image (256 = 2^8).
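As a quick numeric check of these formulas (the image size and bit depth below are arbitrary example values):

    # Number of bits needed to store an M x N image with 2**k gray levels.
    M, N, k = 1024, 1024, 8                     # example: a 1024 x 1024, 8-bit image
    b = M * N * k                               # b = M * N * k
    print(b, "bits =", b // 8, "bytes")         # 8388608 bits = 1048576 bytes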

1.7.3. SPATIAL AND INTENSITY RESOLUTION


Spatial resolution is a measure of the
smallest discernible detail in an image.
Suppose we construct a chart with alternating black and white vertical lines, each of width W units. The width of a line pair is 2W, and there are 1/(2W) line pairs per unit distance.

Fig. 1.21. (a) 1024×1024, 8-bit image. (b) 512×512 image resampled into 1024×1024 pixels by row and column duplication. (c) through (f) 256×256, 128×128, 64×64, and 32×32 images resampled into 1024×1024 pixels.

Intensity resolution refers to the smallest discernible change in intensity level. Based on hardware considerations, the number of intensity levels usually is an integer power of two. The most common number is 8 bits, with 16 bits being used in some applications.
Fig. 1.22. (a) 452×374, 256-level image. (b)-(d) Image displayed in 128, 64, and 32 intensity levels, while keeping the spatial resolution constant. (e)-(h) Image displayed in 16, 8, 4, and 2 intensity levels.

1.7.3.1. ISO Preference Curves
To see the effect of varying N and k simultaneously, three pictures were taken having low, mid and high levels of detail.

Fig. 1.23. (a) Image with a low level of detail. (b) Image with a medium level of detail. (c) Image with a relatively large amount of detail

Different images were generated by varying N and k, and observers were then asked to rank the results according to their subjective quality. The results were summarized in the form of ISO preference curves in the N-k plane.

[Figure: ISO preference curves in the N-k plane for the Face, Cameraman and Crowd images; horizontal axis N = 32, 64, 128, 256.]

Fig. 1.24. Typical ISO preference curves for the three types of images in Fig. 1.23
The ISO preference curves tend to shift right and upward, but their shapes in each of the three image categories are shown in the figure. A shift up and to the right in the curve simply means larger values for N and k, which implies better picture quality.

The results show that the ISO preference curves tend to become more vertical as the detail in the image increases. This suggests that for images with a large amount of detail only a few gray levels may be needed. For a fixed value of N, the perceived quality for this type of image is nearly independent of the number of gray levels used.

1.7.4. IMAGE INTERPOLATION


Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotation and geometric corrections. Fundamentally, interpolation is the process of using known data to estimate values at unknown locations.

Zooming may be viewed as oversampling and shrinking as undersampling; these techniques are applied to a digital image.

There are two steps in zooming:
1. Creation of new pixel locations
2. Assignment of gray levels to those new locations

In order to perform gray level assignment for any point in the overlay, we look for the closest pixel in the original image and assign its gray level to the new pixel in the grid. This method is known as nearest neighbor interpolation.

Pixel Replication

It is a special case of nearest neighbor interpolation and it is applicable when we want to increase the size of an image an integer number of times.

For example, to double the size of an image, we can duplicate each column. This doubles the size of the image in the horizontal direction. To double the size in the vertical direction, we can duplicate each row. The gray level assignment of each pixel is determined by the fact that the new locations are exact duplicates of old locations.

Drawbacks
Nearest neighbor interpolation is fast, but it has the undesirable feature that it produces a checkerboard effect that is not desirable.
Bilinear Interpolation
It uses the four nearest neighbors to estimate the intensity at a given location. Let (x, y) denote the coordinates of the location to which we want to assign an intensity value and let v(x, y) denote that intensity value. For bilinear interpolation the assigned gray level is given by

    v(x, y) = ax + by + cxy + d

where the four coefficients are determined from the four equations in four unknowns that can be written using the four nearest neighbors of point (x, y).
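As an illustrative sketch, the following Python fragment solves for the four coefficients from the four nearest neighbors and then evaluates v(x, y); the neighbor coordinates and intensities are hypothetical values chosen only for this example:

    import numpy as np

    # Hypothetical four nearest integer-grid neighbors (x, y) and their intensities.
    pts = [(10, 20), (11, 20), (10, 21), (11, 21)]
    vals = [100.0, 120.0, 110.0, 140.0]

    # v(x, y) = a*x + b*y + c*x*y + d : solve the 4 x 4 linear system for (a, b, c, d).
    A = np.array([[x, y, x * y, 1.0] for (x, y) in pts])
    a, b, c, d = np.linalg.solve(A, np.array(vals))

    x, y = 10.4, 20.7                  # location between the four neighbors
    v = a * x + b * y + c * x * y + d
    print(round(v, 2))                 # -> 117.8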

Bicubic Interpolation
It involves the sixteen nearest neighbors of a point. The intensity value assigned to point (x, y) is obtained using the equation

    v(x, y) = Σ_{i=0}^{3} Σ_{j=0}^{3} a_ij x^i y^j

where the sixteen coefficients are determined from the sixteen equations in sixteen unknowns that can be written using the sixteen nearest neighbors of point (x, y).

Shrinking is done in a similar manner. The process equivalent to pixel replication is row and column deletion. Shrinking leads to the problem of aliasing.

1.8. RELATIONSHIPS BETWEEN PIXELS


In image processing we must know the relationships between pixels in order to process any image in an easy way. The relationships between the pixels of an image f(x, y) are explained as follows.

1.8.1. NEIGHBORS OF A PIXEL


A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by

    (x+1, y), (x-1, y), (x, y+1), (x, y-1)

This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is a unit distance from (x, y), and some of the neighbor locations of p lie outside the digital image if (x, y) is on the border of the image.

The four diagonal neighbors of p have coordinates

    (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8(p).
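A small sketch of these neighbor sets follows; the border check against an M × N image size is included only to illustrate the remark about pixels on the image border:

    def n4(x, y):
        # 4-neighbors: horizontal and vertical neighbors of (x, y)
        return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

    def nd(x, y):
        # diagonal neighbors of (x, y)
        return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

    def n8(x, y):
        # 8-neighbors: union of the 4-neighbors and the diagonal neighbors
        return n4(x, y) + nd(x, y)

    def inside(neighbors, M, N):
        # keep only neighbor locations that fall inside an M x N image
        return [(x, y) for (x, y) in neighbors if 0 <= x < M and 0 <= y < N]

    print(inside(n8(0, 0), 4, 4))   # corner pixel: only 3 of its 8 neighbors lie inside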
1.8.2. ADJACENCY
Let V be the set of gray level values used to define adjacency; in a binary image, V = {1} if we are referring to adjacency of pixels with value 1.

Three types of adjacency
1. 4-adjacency: two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
2. 8-adjacency: two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
3. m-adjacency: two pixels p and q with values from V are m-adjacent if
   (i) q is in N4(p), or
   (ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

1.8.3. PATH
A path is also known as a digital path or curve. A path from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates

    (x0, y0), (x1, y1), ..., (xn, yn)

where (x0, y0) = (x, y), (xn, yn) = (s, t), and (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.

Fig. 1.25. (a)-(c) Example binary pixel arrangements

1.8.4. CONNECTIVITY
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S.

For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it only has one connected component, then the set S is called a connected set.

REGION
Let R be a subset of pixels in an image. If R is a connected set, it is called a
Region of the image.

BOUNDARY
Boundary is also called border or contour. The boundary of a region R is the set of points that are adjacent to points in the complement of R.

It is also defined as the set of pixels in the region that have at least one background neighbor. The boundary of a finite region forms a closed path and is thus a global concept.

EDGE
Edges are formed by pixels with derivative values that are higher than a preset threshold. Thus, edges are considered as gray level or intensity discontinuities. Therefore an edge is a local concept.

1.8.5. DISTANCE MEASURES

For pixels p, q and z with coordinates (x, y), (s, t) and (v, w) respectively, D is a distance function or metric if

(a) D(p, q) ≥ 0 (D(p, q) = 0 iff p = q),
(b) D(p, q) = D(q, p), and
(c) D(p, z) ≤ D(p, q) + D(q, z)
Euclidean Distance
The Euclidean distance between p and q is defined as

    De(p, q) = [(x - s)² + (y - t)²]^(1/2)

For this distance measure, the pixels having a distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).

City Block Distance

The city block distance (or D4 distance) between p and q is defined as

    D4(p, q) = |x - s| + |y - t|

In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y). For example, the pixels with D4 distance ≤ 2 from (x, y) form the following contours of constant distance:

        2
      2 1 2
    2 1 0 1 2
      2 1 2
        2

The pixels with D4 = 1 are the 4-neighbors of (x, y).

Chessboard Distance
The D8 distance, also called the chessboard distance, between p and q is defined as

    D8(p, q) = max(|x - s|, |y - t|)

In this case, the pixels with a D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y). For example, the pixels with D8 distance ≤ 2 form the following contours of constant distance:

    2 2 2 2 2
    2 1 1 1 2
    2 1 0 1 2
    2 1 1 1 2
    2 2 2 2 2

The pixels with D8 = 1 are the 8-neighbors of (x, y).
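A brief sketch computing the three distance measures between two pixel coordinates (the example coordinates are arbitrary):

    import math

    def d_e(p, q):   # Euclidean distance
        return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

    def d_4(p, q):   # city block distance
        return abs(p[0] - q[0]) + abs(p[1] - q[1])

    def d_8(p, q):   # chessboard distance
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    p, q = (2, 3), (5, 7)
    print(d_e(p, q), d_4(p, q), d_8(p, q))   # -> 5.0 7 4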
Dm Distance
The Dm distance between two points is defined as the shortest m-path between the points. It considers m-adjacency. In this case, the distance between two pixels will depend on
1. The values of the pixels along the path, and
2. The values of their neighbors.

Example
Consider the pixel arrangement given below and assume that p, p2 and p4 have value 1 and that p1 and p3 can have a value of 0 or 1.

    p3  p4
    p1  p2
    p

Case 1
If p1 and p3 are 0, the length of the shortest m-path between p and p4 is 2.

Case 2
If p1 is 1, then p2 and p will no longer be m-adjacent and the length of the shortest m-path becomes 3.

Case 3
If p3 is 1, then the length of the shortest m-path is also 3.

Case 4
If both p1 and p3 are 1, the length of the shortest m-path between p and p4 is 4.
1.9. COLOR IMAGE PROCESSING FUNDAMENTALS
The use of color in image processing is motivated by two principal factors.
1. Color is a powerful descriptor that simplifies object identification and
extraction from a scene.
2. Humans can discern thousands of color shades and intensities.

Color image processing is divided into two major areas namely.


1. Full-color processing
2. Pseudocolor processing
Full-color Processing
In this category, the images are acquired using a full-color sensor, such as a color TV camera or color scanner. This technique is used in a broad range of applications, including publishing, visualization and the internet.

Pseudocolor Processing
A color is assigned to a particular monochrome intensity or range of intensities. The color spectrum may be divided into six broad regions: violet, blue, green, yellow, orange and red. The colors perceived in an object are determined by the nature of the light reflected from the object.

1.9.1. CHARACTERIZATION OF LIGHT

Characterization of light is central to the science of color. The various quantities describing different types of light are as follows.

1. Achromatic light
Achromatic light is the light seen on a black and white television set, and it has been an implicit component for image processing so far. Its only attribute is its intensity or amount of gray level. The term gray level refers to a scalar measure of intensity that ranges from black, to grays, and finally to white.

2. Chromatic light
Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm. Three basic quantities are used to describe the quality of a chromatic light source:
1. Radiance
2. Luminance
3. Brightness

Radiance
It is the total amount of energy that flows from the light source and is measured in watts (W).
Luminance
It is a measure of the amount of energy an observer perceives from a light source and is measured in lumens (lm).

Brightness
It is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors for describing color sensation.

Primary Colors
Red (R), Green (G) and Blue (B) are called primary colors because, when mixing these colors in various intensity proportions, they can produce all visible colors.

Secondary Colors
These primary colors can be added in different proportions to produce the secondary colors. The secondary colors are also called the primary colors of pigments and are given below.
Magenta (Red + Blue)
Cyan (Green + Blue)
Yellow (Red + Green)
Mixture of Light

[Figure: additive mixture of red, green and blue light producing yellow, cyan, magenta and white.]

Fig. 1.26. Mixture of Light

Mixing the three primaries, or a secondary with its opposite primary color, in the right intensities produces white light. Because the primary colors of light are added to produce these mixtures, they are called additive primaries.
Mixture of Pigments
When the primary colors of pigments are added, the primary colors of light are produced; so in this case the primary colors of light become the secondary colors. A proper combination of the three pigment primaries, or a secondary with its opposite primary, produces black.

[Figure: subtractive mixture of the pigment primaries cyan, magenta and yellow producing red, green, blue and black.]

Fig. 1.27. Mixture of Pigments

Because the primary colors of light are subtracted out from the pigment primaries, the pigment primaries are called subtractive primaries.

1.9.2. CHARACTERISTICS
The following three characteristics are used to differentiate one color from another:
1. Brightness
2. Hue
3. Saturation
Brightness
It gives the achromatic notion of intensity.
Hue
It represents the dominant color as perceived by an observer.

Saturation
It refers to the relative purity, or the amount of white light mixed with a hue. The pure spectrum colors are fully saturated.


Hue and saturation taken together are called chromaticity. Therefore, a color is
characterized by its brightness and chromaticity.

1.9.3. TRICHROMATIC Co-EFFICIENTS


The amounts of red, green and blue needed to form any particular color are called the tristimulus values and are denoted by R, G and B respectively. A color is then specified by its trichromatic coefficients, defined as

    r = R / (R + G + B)
    g = G / (R + G + B)
    b = B / (R + G + B)

From the above three equations, it can be shown that

    r + g + b = 1

Thus, the tristimulus values needed to form any color can be obtained from the above equations.
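For instance, a short sketch of these normalizations (the R, G, B values below are arbitrary tristimulus values):

    # Trichromatic coefficients r, g, b from tristimulus values R, G, B.
    R, G, B = 100.0, 200.0, 100.0
    total = R + G + B
    r, g, b = R / total, G / total, B / total
    print(r, g, b, r + g + b)   # -> 0.25 0.5 0.25 1.0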

1.9.4. CHROMATICITY DIAGRAM


The chromaticity diagram can be used for specifying colors; it is shown in figure 1.28 below.

The chromaticity diagram shows color composition as a function of r (red) and g (green). For any value of r and g, the corresponding value of b (blue) is obtained from the equation r + g + b = 1 as

    b = 1 - (r + g)

From figure 1.28, we have the following observations.
1. Any point located on the boundary of the chromaticity chart is fully saturated.
2. As a point leaves the boundary and approaches the point of equal energy,
more white light is added to the color and it becomes less saturated.
3. The saturation at the point of equal energy is zero.

[Figure: chromaticity diagram showing the spectrum colors from 380 nm (blue) to 780 nm (red) along the boundary, with the point of equal energy in the interior.]

Fig. 1.28. Chromaticity Diagram

Uses
1. It is useful for color mixing because a straight line segment joining any two points in the diagram defines all the different color variations that can be obtained by combining the two colors in different ratios.
2. A line drawn from the point of equal energy to any point on the boundary will
define all the shades of that particular spectrum color.

1.10. COLOR MODELS

A color model is also known as a color space or color system. The main purpose of a color model is to facilitate the specification of colors in some standard way. A color model is a specification of a coordinate system and a subspace within that system where each color is represented by a single point.

Color models are classified into two types according to their use:
1. Hardware oriented color models
2. Application oriented color models

1. Hardware Oriented Color Models
The most commonly used hardware oriented color models are:
1. The RGB (Red, Green, Blue) model for color monitors and color video cameras.
2. The CMY (Cyan, Magenta, Yellow) model and the CMYK (Cyan, Magenta, Yellow, Black) model for color printing.
3. The HSI (Hue, Saturation and Intensity) model.

2. Application Oriented Color Models
These color models are used in the creation of color graphics for animation and in the manipulation of colors.

1.10.1. RGBCOLOR MODEL


In the RGB model, each color appears in its primary spectral components
of Red,
Green and Blue. This model is based on a Cartesian coordinate system.

[Figure: RGB color cube with black at the origin (0, 0, 0), red at (1, 0, 0), green at (0, 1, 0) and blue at (0, 0, 1); yellow, cyan, magenta and white occupy the remaining corners, and the gray scale lies along the diagonal from black to white.]

Fig. 1.29. RGB Color Cube


In the above figure, the RGB primary values are at three corners; the secondary colors cyan, magenta and yellow are at three other corners; black is at the origin; and white is at the corner farthest from the origin.

In this model, the gray scale extends from black to white along the line joining these two points, and all values of R, G and B are assumed to be in the range [0, 1].
Pixel Depth
The number of bits used to represent each pixel in RGB space is called the pixel depth.

Example
In an RGB image, each of the red, green, and blue images is an 8-bit image. So the pixel depth of an RGB color pixel = 3 × number of bits/plane = 3 × 8 = 24.

Full Color Image
The term full-color image is used to denote a 24-bit RGB color image. The total number of colors in a 24-bit RGB image is (2⁸)³ = 16,777,216.

Safe RGB Colors
Many applications use only a few hundred or fewer colors, so many systems in use today are limited to 256 colors.

Many systems in current use have a subset of colors that are likely to be reproduced faithfully, reasonably independently of viewer hardware capabilities. This subset of colors is called the set of safe RGB colors or the set of all-systems-safe colors. In internet applications, they are called safe web colors or safe browser colors.

Applications
The RGB color model has the following applications:
1. Color monitors
2. Color video cameras, etc.

Advantages
1. It is suitable for hardware implementation.
2. Changing to other models such as CMY is straightforward.
3. Creating colors in this model is an easy process; therefore it is an ideal tool for image color generation.

Disadvantages
1. It is not acceptable that a color image is formed by combining three primary images.


2. This model is not suitable for describing colors in a way which is practical for
human interpretation.

1.10.2. CMY AND CMYK COLOR MODELS


Cyan, magenta and yellow are the secondary colors of light, or the primary colors of pigments. For example, when a surface coated with cyan pigment is illuminated with white light, no red light is reflected from the surface; i.e., cyan subtracts red light from reflected white light.

To convert RGB to CMY, the following simple equation is used:

    [C]   [1]   [R]
    [M] = [1] - [G]
    [Y]   [1]   [B]

All color values are assumed to have been normalized to the range [0, 1].

Pure cyan does not reflect red; similarly, pure magenta does not reflect green and pure yellow does not reflect blue. RGB values can be obtained easily from a set of CMY values by subtracting the individual CMY values from 1.

We know that equal amounts of the pigment primaries cyan, magenta and yellow should produce black. In practice, to produce true black, a fourth color, black, is added, and this gives a new color model called the CMYK color model.
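A minimal sketch of this conversion for a single normalized pixel (the sample RGB triple is arbitrary):

    # Convert a normalized RGB triple to CMY and back (all values in [0, 1]).
    def rgb_to_cmy(r, g, b):
        return 1.0 - r, 1.0 - g, 1.0 - b

    def cmy_to_rgb(c, m, y):
        return 1.0 - c, 1.0 - m, 1.0 - y

    c, m, y = rgb_to_cmy(0.75, 0.25, 0.5)
    print(c, m, y)               # -> 0.25 0.75 0.5
    print(cmy_to_rgb(c, m, y))   # -> (0.75, 0.25, 0.5), the original triple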

1.10.3. HSI COLOR MODEL


In the HSI color model, an image is described in terms of hue, saturation and intensity.

Hue: It is a color attribute that describes a pure color.

Saturation: It gives a measure of the degree to which a pure color is diluted by white light.

Intensity: It is the most useful descriptor of monochromatic images. This quantity is definitely measurable and easily interpretable. It is also called gray level.
The HSI color model decouples the intensity component from the color carrying
information in a color image.
So the HSI model is an ideal tool for developing image processing algorithms based on color descriptions that are natural and intuitive to humans.

To find Intensity
An RGB color image can be viewed as three monochrome intensity images, so the intensity can be extracted from an RGB image. The intensity axis is the vertical line joining the black and white vertices. The black vertex is (0, 0, 0) and the white vertex is (1, 1, 1).

[Figure: RGB cube with the intensity axis joining the black (0, 0, 0) and white (1, 1, 1) vertices.]

Fig. 1.30. To find Intensity

Determining the Intensity Component
The following method is used to find the intensity component of any color. A plane which is perpendicular to the intensity axis and contains the color point is passed through the cube. The intersection of the plane with the intensity axis gives the intensity value, in the range [0, 1].

To find Saturation
The saturation of a color increases as a function of distance from the intensity axis. The saturation of points on the intensity axis is zero, because all points along this axis are gray.
To find Hue
The hue of a color can also be determined from the RGB color cube. The figure below shows a plane defined by three points (black, white and cyan). All points contained in the plane segment defined by the intensity axis and the boundaries of the cube have the same hue.

If two of the points are black and white and the third is a color point, then all points on the triangle have the same hue, because the black and white components cannot change the hue.

[Figure: plane in the RGB cube defined by the black, white and cyan points.]

Fig. 1.31. To find Hue

By rotating the shaded plane about the vertical intensity axis, we can obtain different hues. So the hue, saturation and intensity values required to form the HSI space can be obtained from the RGB color cube. This implies that any point in the RGB color cube can be converted into the HSI color model.

HSI color space
HSI color space is represented by a vertical intensity axis and the locus of color points that lie on planes perpendicular to this axis.

As the planes move up and down along the intensity axis, the cross-sectional shape can be either a triangle or a hexagon. The hexagon-shaped plane is shown in the figure below.
[Figure: hexagonal cross-section of the HSI color space with the primary and secondary colors at the vertices and white at the center.]

Fig. 1.32. Hexagonal HSI color space

In the above plane, the primary colors are separated by 120°. The secondary colors are also separated by 120°, and the angle between a secondary and a primary is 60°.
Representation of Hue and Saturation
The hue of a point is determined by an angle from some reference point. The saturation is the length of the vector from the origin to the point. The origin is defined by the intersection of the color plane with the vertical intensity axis. These are the important components of the HSI color space.

The different shapes of HSI planes representing the hue and saturation of an arbitrary color point are shown below; in each diagram the dot is an arbitrary color point.

Fig. 1.33. Hue and saturation in hexagonal, triangular and circular shaped HSI models

Converting Colors from RGB to HSI


To convert the RGB color model into the HSI model, we first make the following assumptions:
1. The RGB values have been normalized to the range [0, 1].
2. The angle θ is measured with respect to the red axis of the HSI space.

The hue of each RGB pixel is obtained from the equation

    H = θ,           if B ≤ G
    H = 360° - θ,    if B > G                                          ... (1.1)

where

    θ = cos⁻¹ { (1/2)[(R - G) + (R - B)] / [(R - G)² + (R - B)(G - B)]^(1/2) }     ... (1.2)


The saturation component is given by

    S = 1 - [3 / (R + G + B)] min(R, G, B)                             ... (1.3)

The intensity component is given by

    I = (R + G + B) / 3                                                ... (1.4)
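A sketch of equations (1.1) to (1.4) for a single normalized RGB pixel follows; the sample pixel is arbitrary, and a small epsilon is added only to guard the divisions when R = G = B:

    import math

    def rgb_to_hsi(r, g, b):
        # Hue (eqs. 1.1-1.2): angle measured from the red axis, in degrees.
        num = 0.5 * ((r - g) + (r - b))
        den = math.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
        theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        h = theta if b <= g else 360.0 - theta
        # Saturation (eq. 1.3) and intensity (eq. 1.4).
        s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + 1e-12)
        i = (r + g + b) / 3.0
        return h, s, i

    print(rgb_to_hsi(0.6, 0.3, 0.1))   # a reddish-orange pixel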
Converting Colors from HSI to RGB
When the HSI values of an image are given in the range [0, 1], the equivalent RGB values are determined separately for three sectors of hue in the HSI color space (see the hexagonal color space figure):
1. Red-Green (RG) sector, 0° ≤ H < 120°
2. Green-Blue (GB) sector, 120° ≤ H < 240°
3. Blue-Red (BR) sector, 240° ≤ H ≤ 360°

RG sector (0° ≤ H < 120°)
When H is in this sector, the RGB components are given by the equations

    B = I(1 - S)                                                       ... (1.5)
    R = I[1 + S cos H / cos(60° - H)]                                  ... (1.6)
    G = 3I - (R + B)                                                   ... (1.7)

GB sector (120° ≤ H < 240°)
If the given value of H is in this sector, we first subtract 120° from it:

    H = H - 120°

Then the RGB components are

    R = I(1 - S)                                                       ... (1.8)
    G = I[1 + S cos H / cos(60° - H)]                                  ... (1.9)
    B = 3I - (R + G)                                                   ... (1.10)

BR sector (240° ≤ H ≤ 360°)
If H is in this range, we subtract 240° from it:

    H = H - 240°

Then the RGB components are

    G = I(1 - S)                                                       ... (1.11)
    B = I[1 + S cos H / cos(60° - H)]                                  ... (1.12)
    R = 3I - (G + B)                                                   ... (1.13)
Manipulating HSI Component Images

There are three procedures used to create a new color image by modifying a portion of an RGB image. To perform these operations we need the HSI component images:

    H = θ,           if B ≤ G
    H = 360° - θ,    if B > G

    S = 1 - [3 / (R + G + B)] min(R, G, B)

    I = (R + G + B) / 3

Procedure 1
To change the individual color of any region in the RGB image, we change the values of the corresponding region in the hue image of figure 1.34 (b). We then convert the new H image, along with the unchanged S and I images, back to RGB using equations (1.1) to (1.13).

Procedure 2
To change the saturation or purity of the color in any region, we follow the same procedure, except that we make the changes in the saturation image in HSI space, which is shown in figure 1.34 (c).

Procedure 3
To change the average intensity of any region, we follow the same procedure, except that we make the changes in the intensity image, which is shown in figure 1.34 (d).

Advantages of the HSI model
1. This model allows independent control over the color-describing quantities, namely hue, saturation and intensity.
2. It can be used as an ideal tool for developing image processing algorithms based on color descriptions.
Fig. 1.34. (a) RGB image and the components of its corresponding HSI image: (b) hue, (c) saturation, and (d) intensity

1.11. TWO DIMENSIONAL MATHEMATICAL PRELIMINARIES


Mathematical Preliminaries are unavoidable in any system theory discipline,
especially image processing. In this section we consider important mathematical
preliminaries for Digital Image Processing.

1.11.1. ARRAY VERSUS MATRIX OPERATIONS


An array operation is carried out on a pixel-by-pixel basis, whereas a matrix operation is based on matrix theory.

The array product of two 2 × 2 images

    [a11  a12]        [b11  b12]
    [a21  a22]  and   [b21  b22]

is obtained by multiplying corresponding elements:

    [a11 b11    a12 b12]
    [a21 b21    a22 b22]

The matrix product of the same two images is

    [a11 b11 + a12 b21    a11 b12 + a12 b22]
    [a21 b11 + a22 b21    a21 b12 + a22 b22]
1.11.2. LINEAR VERSUS NONLINEAR OPERATIONS
Let an operator H produce an output image g(x, y) for an input image f(x, y):

    H[f(x, y)] = g(x, y)

H is said to be a linear operator if it satisfies the properties of additivity and homogeneity; otherwise it is a nonlinear operator:

    H[a1 f1(x, y) + a2 f2(x, y)] = a1 H[f1(x, y)] + a2 H[f2(x, y)]
                                 = a1 g1(x, y) + a2 g2(x, y)

Additivity Property
The output of the sum of two inputs is the same as performing the operation on the inputs individually and then summing the results:

    H[f1(x, y) + f2(x, y)] = H[f1(x, y)] + H[f2(x, y)]

Homogeneity Property
The output of a constant times an input is the same as the output of the original input multiplied by that constant:

    H[a f(x, y)] = a H[f(x, y)] = a g(x, y)
Example of Linear Operator
Consider the sum operator Σ, with

    f1(x, y) = [0  2]        f2(x, y) = [6  5]
               [2  3]                   [4  7]

and a1 = 1, a2 = -1. Then

    Σ[a1 f1(x, y) + a2 f2(x, y)] = Σ [-6  -3]  = -15
                                     [-2  -4]

    a1 Σ f1(x, y) + a2 Σ f2(x, y) = 7 - 22 = -15

where Σ f1(x, y) = 7 and Σ f2(x, y) = 22. The two results are equal, so the sum operator is linear.

Example of Nonlinear Operator
Consider the max operator Max{ }, which returns the maximum element of its argument. With the same images and constants,

    Max[a1 f1(x, y) + a2 f2(x, y)] = Max [-6  -3]  = -2
                                         [-2  -4]

    a1 Max[f1(x, y)] + a2 Max[f2(x, y)] = 3 - 7 = -4

The two results differ, so the max operator is nonlinear.
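The same check can be run numerically; the two 2 × 2 images and constants below follow the worked example above:

    import numpy as np

    f1 = np.array([[0, 2], [2, 3]])
    f2 = np.array([[6, 5], [4, 7]])
    a1, a2 = 1, -1

    # Sum operator: both sides of the linearity test give -15, so it is linear.
    print(np.sum(a1 * f1 + a2 * f2), a1 * np.sum(f1) + a2 * np.sum(f2))

    # Max operator: the two sides differ (-2 versus -4), so it is nonlinear.
    print(np.max(a1 * f1 + a2 * f2), a1 * np.max(f1) + a2 * np.max(f2))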
1.11.3. ARITHMETIC OPERATIONS
Arithmetic operations are carried out between corresponding pixel pairs. The basic arithmetic operations are

    s(x, y) = f(x, y) + g(x, y)
    d(x, y) = f(x, y) - g(x, y)
    p(x, y) = f(x, y) × g(x, y)
    q(x, y) = f(x, y) ÷ g(x, y)

1.11.4. LOGICAL OPERATIONS

The basic logical operations are

    C = (A) OR (B)    - Union
    D = (A) AND (B)   - Intersection
    E = NOT (A)       - Complement

Logical operations are applied to binary images only.
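A minimal sketch of these operations on two small binary images (the arrays are arbitrary examples):

    import numpy as np

    A = np.array([[1, 1, 0],
                  [0, 1, 0]], dtype=bool)
    B = np.array([[1, 0, 0],
                  [1, 1, 0]], dtype=bool)

    C = A | B    # OR  - union
    D = A & B    # AND - intersection
    E = ~A       # NOT - complement

    print(C.astype(int))
    print(D.astype(int))
    print(E.astype(int))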


Examples of Logical NOT Operation

Fig. 1.35. Examples of the Logical NOT Operation

Examples of Logical OR Operation

Fig. 1.36. Examples of the Logical OR Operation

Examples of Logical AND Operation

Fig. 1.37. Examples of the Logical AND Operation
1.12. IMAGE TRANSFORMS

All the image processing approaches discussed so far operate directly on the pixels of the input image, i.e., they work directly in the spatial domain. In some cases, image processing tasks are best formulated by transforming the input images, carrying out the specified task in a transform domain, and applying the inverse transform to return to the spatial domain.

A particularly important class of 2-D linear transforms, denoted T(u, v), can be expressed in the general form

    T(u, v) = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y) r(x, y, u, v)

where
    f(x, y) is the input image,
    r(x, y, u, v) is called the forward transformation kernel,
    the variables u and v are called the transform variables, and
    T(u, v) is called the forward transform of f(x, y).
[Figure: f(x, y) in the spatial domain → forward transform → T(u, v) → operation R → R[T(u, v)] → inverse transform → g(x, y) in the spatial domain; the middle stages operate in the transform domain.]

Fig. 1.38. General Approach for Operating in the Linear Transform Domain

Figure 1.38 shows the basic steps for performing image processing in the linear transform domain. First, the input image is transformed; the transform is then modified by a predefined operation; and finally, the output image is obtained by computing the inverse of the modified transform.
1.12.1. TWO DIMENSIONAL DFT

The two-dimensional DFT of an N × N image {u(m, n)} is a separable transform defined as

    v(k, l) = Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} u(m, n) W_N^(km) W_N^(ln),   0 ≤ k, l ≤ N-1      ... (1.14)

where W_N = exp(-j2π/N).

and the inverse transform is

    u(m, n) = (1/N²) Σ_{k=0}^{N-1} Σ_{l=0}^{N-1} v(k, l) W_N^-(km) W_N^-(ln),   0 ≤ m, n ≤ N-1     ... (1.15)

The two-dimensional unitary DFT pair is defined as

    v(k, l) = (1/N) Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} u(m, n) W_N^(km) W_N^(ln),   0 ≤ k, l ≤ N-1        ... (1.16)

    u(m, n) = (1/N) Σ_{k=0}^{N-1} Σ_{l=0}^{N-1} v(k, l) W_N^-(km) W_N^-(ln),   0 ≤ m, n ≤ N-1      ... (1.17)
In matrix notation this becomes

    V = F U F                                                          ... (1.18)
    U = F* V F*                                                        ... (1.19)

If U and V are mapped into row-ordered vectors u and v respectively, then

    v = ℱ u,   u = ℱ* v,   where ℱ = F ⊗ F                              ... (1.20)

The N² × N² matrix ℱ represents the N × N two-dimensional unitary DFT.
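A short numeric sketch of the separable 2-D DFT of equation (1.14) (without the unitary 1/N scaling), checked against numpy's built-in 2-D FFT on a small random image:

    import numpy as np

    N = 4
    u = np.random.rand(N, N)                 # small N x N test image
    W = np.exp(-2j * np.pi / N)              # W_N = exp(-j*2*pi/N)

    # Direct evaluation of v(k, l) = sum_m sum_n u(m, n) W^(km) W^(ln).
    F = np.array([[W ** (k * m) for m in range(N)] for k in range(N)])
    v = F @ u @ F                            # separability: row transform, then column transform

    print(np.allclose(v, np.fft.fft2(u)))    # -> True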

Properties of the Two-Dimensional DFT

The properties of the two-dimensional unitary DFT are quite similar to the one-dimensional case and are summarized next.

Symmetric and unitary:

    Fᵀ = F,   F⁻¹ = F*                                                 ... (1.21)

Periodic extensions:

    v(k + N, l + N) = v(k, l),   for all k, l
    u(m + N, n + N) = u(m, n),   for all m, n                           ... (1.22)

Sampled Fourier spectrum: If ũ(m, n) = u(m, n) for 0 ≤ m, n ≤ N-1, and ũ(m, n) = 0 otherwise, then

    Ũ(2πk/N, 2πl/N) = DFT{u(m, n)} = v(k, l)                            ... (1.23)

where Ũ(ω1, ω2) is the Fourier transform of ũ(m, n).

Fast transform: Since the two-dimensional DFT is separable, the transformation is equivalent to 2N one-dimensional unitary DFTs, each of which can be performed in O(N log₂ N) operations via the FFT. Hence the total number of operations is O(N² log₂ N).

Conjugate symmetry: The DFT and unitary DFT of real images exhibit conjugate symmetry, that is,

    v(N/2 + k, N/2 + l) = v*(N/2 - k, N/2 - l),   0 ≤ k, l ≤ N/2 - 1     ... (1.24)

or

    v(k, l) = v*(N - k, N - l),   0 ≤ k, l ≤ N - 1

From this, it can be shown that v(k, l) has only N² independent real elements. For example, the samples in the shaded region of figure 1.39 determine the complete DFT or unitary DFT.

Basis images: The basis images are given by

    (1/N) W_N^-(km + ln),   0 ≤ m, n ≤ N-1,   for 0 ≤ k, l ≤ N-1         ... (1.25)
Two-dimensional circular convolution theorem: The DFT of the two-dimensional circular convolution of two arrays is the product of their DFTs.

The two-dimensional circular convolution of two N × N arrays h(m, n) and u1(m, n) is defined as

    u2(m, n) = Σ_{m'=0}^{N-1} Σ_{n'=0}^{N-1} h(m - m', n - n')_c u1(m', n'),   0 ≤ m, n ≤ N-1     ... (1.26)

where

    h(m, n)_c = h(m modulo N, n modulo N)                               ... (1.27)

Fig. 1.39. Discrete Fourier transform coefficients v(k, l) in the shaded area determine the remaining coefficients

Fig. 1.40. Two-dimensional circular convolution: (a) array h(m, n); (b) circular convolution of h(m, n) with u(m, n) over an N × N region


Figure 1.40 shows the meaning of circular convolution. It is the same as when a periodic extension of h(m, n) is convolved over an N × N region with u1(m, n). The two-dimensional DFT of h(m - m', n - n')_c for fixed m', n' is given by

    Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} h(m - m', n - n')_c W_N^(mk + nl)
        = W_N^(m'k + n'l) Σ_{i=-m'}^{N-1-m'} Σ_{j=-n'}^{N-1-n'} h(i, j)_c W_N^(ik + jl)
        = W_N^(m'k + n'l) Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} h(m, n) W_N^(mk + nl)              ... (1.28)
        = W_N^(m'k + n'l) DFT{h(m, n)}_N

where we have used equation (1.27). Taking the DFT of both sides of (1.26) and using the preceding result, we obtain

    DFT{u2(m, n)}_N = DFT{h(m, n)}_N DFT{u1(m, n)}_N                    ... (1.29)

From this and the fast transform property, it follows that an N × N circular convolution can be performed in O(N² log₂ N) operations. This property is also useful in calculating two-dimensional linear convolutions such as
    x3(m, n) = Σ_{m'=0}^{M-1} Σ_{n'=0}^{M-1} x1(m - m', n - n') x2(m', n')               ... (1.30)

where x1(m, n) and x2(m, n) are assumed to be zero for m, n outside [0, M-1]. The region of support for the result x3(m, n) is {0 ≤ m, n ≤ 2M - 2}. Let N ≥ 2M - 1 and define the N × N arrays
    h(m, n)  ≜  x1(m, n)  for 0 ≤ m, n ≤ M-1,  and 0 otherwise            ... (1.31)
    u1(m, n) ≜  x2(m, n)  for 0 ≤ m, n ≤ M-1,  and 0 otherwise            ... (1.32)

We denote DFT{x(m, n)}_N as the two-dimensional DFT of an N × N array x(m, n), 0 ≤ m, n ≤ N-1.

Evaluating the circular convolution of h(m, n) and u1(m, n) according to equation (1.26), it can be seen with the aid of figure 1.40 that

    x3(m, n) = u2(m, n),   0 ≤ m, n ≤ 2M - 2                              ... (1.33)

This means the two-dimensional linear convolution of (1.30) can be performed in O(N² log₂ N) operations.
Block circulant operations: Dividing both sides of equation (1.28) by N and using the definition of the Kronecker product, we obtain

    (F ⊗ F) H = D (F ⊗ F)                                               ... (1.34)

where H is doubly block circulant and D is a diagonal matrix whose elements are given by

    [D]_(kN+l, kN+l) ≜ d(k, l) = DFT{h(m, n)}_N,   0 ≤ k, l ≤ N-1        ... (1.35)

Writing ℱ = F ⊗ F, equation (1.34) can be written as

    ℱ H = D ℱ   or   ℱ H ℱ* = D                                          ... (1.36)

that is, a doubly block circulant matrix is diagonalized by the two-dimensional unitary DFT. From equation (1.35) and the fast transform property, we conclude that a doubly block circulant matrix can be diagonalized in O(N² log₂ N) operations.

The eigenvalues of H, given by the two-dimensional DFT of h(m, n), are the same as operating Nℱ on the first column of H. This is because the elements of the first column of H are the elements h(m, n) mapped by lexicographic ordering.

Block Toeplitz operations: Our discussion of linear convolution implies that any doubly block Toeplitz matrix operation can be imbedded into a doubly block circulant operation, which, in turn, can be implemented using the two-dimensional unitary DFT.

TWO MARKS QUESTIONS AND ANSWERS

1. What is meant by digital image processing?

Digital image processing deals with developing a digital system that performs operations on a digital image.

2. What is meant by digital image?
A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels and pixels.
3. Define the gray level of an image.
An image is nothing more than a two-dimensional signal. It is defined by the mathematical function f(x, y), where x and y are the two coordinates (horizontal and vertical), and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.

4. Give any five application areas of digital image processing. [Nov/Dec '07]
1. Industrial automation
2. Bio- Medical
3. Remote- Sensing

4. Scientific Applications
5. Military Applications
5. State the steps involved in digital image processing.
The various steps required for any digital image processing applications are
1. Image acquisition

2. Image enhancement
3. Image Restoration
4. Color Image processing
5. Compression & wavelets
6. Morphological processing
7. Segmentation
8. Representation &
description
9. Recognition
6. Define sampling and quantization. [Apr/May 2011; Nov/Dec 2010]
To create a digital image, we need to convert the continuous sensed data into
digital form. This involves two processes sampling and quantization.
Digitizing the coordinate values is called sampling
Digitizing the amplitude values is called quantization.
7. Define compression.
Compression is a technique used to reduce the storage required to save an image or the bandwidth required to transmit it over the network. It has two major approaches.
1. Lossless compression
2. Lossy compression
8. What is meant by lossless compression?
Lossless compression is a class of data compression algorithms that allow the original data to be perfectly reconstructed from the compressed data.
9. What is meant by Lossy compression?
It is a compression technique that does not decompress digital
data back to 100%

of the original.

10. What do you meant by resolution of an image?


Resolution of an
image specifies the amount of detail needed to represent an
image. If the resolution is high, the depth, clarity and minute details of the picture
will be better.
11. What is the need of segmentation?

The process of segmentation is the partitioning of an input image into its constituent parts. The
key role of segmentation is to extract the boundary
of the object from the
background. The output of the segmentation stage usually
consists of either
boundary of the region or all the points in the region
itself.
12. Mention some types
of Mass storage.
1. Short term storage for use during processing.
2. On-line storage for relatively fast retrieval.
3. Archival storage such as magnetic tapes and disks.
13. List the membranes of a
human eye.
The eye is enclosed with three membranes.
1. Cornea and Sclera
2. Choroid
3. Retina
14. What is cornea
and sclera?
The cornea is a tough, transparent tissue that covers the anterior surface of the eye. The rest of the optic globe is covered by the sclera.
15. What is the choroid?
The choroid lies directly below the sclera. It contains a network of blood vessels that serve as the major source of nutrition to the eye. It helps to reduce extraneous light entering the eye. It has two parts
1. Iris

2. Ciliary body

16. What is meant by photopic?


The cones in the eye are located in the central portion of the retina, called the fovea. Muscles controlling the eye rotate the eyeball until the image of an object
23. What is spatial and intensity level resolution? [Apr/May 2011]
Spatial resolution is a measure of the smallest discernible detail in an image. Intensity resolution refers to the smallest discernible change in intensity level; it is also known as gray level resolution.

24. What is image interpolation?

Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotation and geometric corrections. Interpolation is the process of using known data to estimate values at unknown locations.
25. Define bilinear interpolation.
Bilinear interpolation uses the four nearest neighbors to estimate the intensity at a given location.

26. Define bicubic interpolation.
Bicubic interpolation involves the sixteen nearest neighbors of a point. The intensity value assigned to point (x, y) is obtained by using the equation

    v(x, y) = Σ_{i=0}^{3} Σ_{j=0}^{3} a_ij x^i y^j

27. Define Spatial Domain.


The section of the real plane spanned by the coordinates of an image is called the spatial domain, with x and y referred to as spatial variables or spatial coordinates.
28. Define the four neighbors of a pixel p [N4(p)]. [Nov/Dec 2009]
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by

    (x+1, y), (x-1, y), (x, y+1), (x, y-1)

This set of pixels, called the 4-neighbors of p, is denoted by N4(p).

29. Define the diagonal neighbors of a pixel p [ND(p)].
The coordinates of the four diagonal neighbors of p are given by

    (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

and are denoted by ND(p).


30. Define the 8-neighbors of a pixel p [N8(p)]. [Nov/Dec 2009]
The diagonal neighbors together with the 4-neighbors are called the 8-neighbors of p, denoted by N8(p).

31. Define Path.
A path is also known as a digital path or curve. A path from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates (x0, y0), (x1, y1), ..., (xn, yn), where (x0, y0) = (x, y), (xn, yn) = (s, t), and (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.
32. Define Connectivity.

Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S.

33. Define connected component and connected set.

For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it only has one connected component, then the set S is called a connected set.

34. State the Euclidean distance.

The Euclidean distance between p and q is defined as

    De(p, q) = [(x - s)² + (y - t)²]^(1/2)

For this distance measure, the pixels having a distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).

35. What is the D4 distance? [City block distance]
The city block distance or D4 distance between p and q is defined as

    D4(p, q) = |x - s| + |y - t|

|1.66 Digital Image Processing


a
The pixels having D, distance from (x, y) less than Or equal to some value
from a diamond centered at
(, y).h 2 gt hogil
Example.
2
2 1 2
1 1 1 2

2 2

2
36. What is the D8 distance? [Chessboard distance]
The D8 distance, also called the chessboard distance, between p and q is defined as

    D8(p, q) = max(|x - s|, |y - t|)

The pixels with a D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y).

Example:

    2 2 2 2 2
    2 1 1 1 2
    2 1 0 1 2
    2 1 1 1 2
    2 2 2 2 2
37. Define the Dm distance.
The Dm distance between two points is defined as the shortest m-path between the points. It considers m-adjacency. In this case, the distance between two pixels will depend on
1. The values of the pixels along the path, and
2. The values of their neighbors.
38. Define full-color processing.
In full-color processing, the images are acquired using a full-color sensor, such as a color TV camera or color scanner. These techniques are used in a broad range of applications, including publishing, visualization and the internet.

39. Define pseudocolor processing.
In pseudocolor processing, a color is assigned to a particular monochrome intensity or range of intensities. The color spectrum may be divided into six broad regions: violet, blue, green, yellow, orange and red. The colors perceived in an object are determined by the nature of the light reflected from the object.

40. What is Achromatic light?


Achromatic light is the light seen on black and white television set and it has
been an implicit componentfor image processing.

41. What are the basic quantities are used to describe the quality of a
chromatic
light source?
Three basic quantities are used to describe the quality of a chromatic light source
such as
1. Radiance
2. Luminance
3. Brightness
42. Define Radiance, Luminance, Brightness
Radiance: It is the total amount of energy that flows from the light source and
measured in watts (w).
Luminance: It is a measure of the amount of energy an observer perceives from
a light source and measured in lumens (lm).
measure.
Brightness: It is a subjective descriptor that is practically impossible to
It embodies the achromatic notion of intensity and
one of the key factors for
describing color sensation.
43. Why are RGB colors called primary colors?
Red (R), Green (G) and Blue (B) are called primary colors because, when mixing these colors in various intensity proportions, they can produce all visible colors.

44. What are secondary colors?
Cyan (C), Magenta (M) and Yellow (Y) are called secondary colors. They are also called the primary colors of pigments and are given below.
Cyan (Green + Blue)
Magenta (Red + Blue)
Yellow (Red + Green)
45. What are the characteristics used to differentiate one color from another?
There are three characteristics used to differentiate one color from another:
1. Brightness
2. Hue
3. Saturation
46. Define Brightness, hue and saturation.
[ NovDec 2010)
Hue: It represents the dominant color as perceived by an observer.
Saturation: It refers to the relative purity or the amount of white light mixed with a hue. The pure spectrum colors are fully saturated.
Brightness: It gives the achromatic notion of intensity.
47. Define Trichromatic coefficients.
The amounts of red, green and blue needed to form any particular color are called the tristimulus values and are denoted by R, G and B respectively. A color is specified by its trichromatic coefficients, such as

    r = R / (R + G + B)
    g = G / (R + G + B)
    b = B / (R + G + B)
48. What are the basic types of color model?
Color models are classified into two types
according to their uses.
1. Hardware oriented color models.
2. Application oriented color models.
49. Define pixel depth.lan
The number of bits used to represent each pixel in RGB space is called the pixel depth.
50. Define HSI color space.
HSI color space is represented by a vertical intensity axis and the locus of color points that lie on planes perpendicular to this axis.
51. Mention some advantages of the HSI model.
1. This model allows independent control over the color-describing quantities, namely hue, saturation and intensity.
2. It can be used as an ideal tool for developing image processing algorithms based on color descriptions.
52. Define color imnage processing.
Color image processing deals with color models and
their implementation in
image processing applications.

53. How many sensor arrangements are used to transform illumination energy into digital images?
There are three principal sensor arrangements used to transform illumination energy into digital images:
1. Single imaging sensor
2. Line sensor
3. Array sensor

54. What is meant by gray level?

The intensity of a monochrome image f(x, y) is called the gray level (l) of the image at that point:

    Lmin ≤ l ≤ Lmax

where l is the gray level of the image and the interval [Lmin, Lmax] is called the gray scale.

S5. What is meant by focal length of eye?


The distance between the center of the lens and the retina is called the focal
length of the human eye. It ranges from 14 mm to 17 mm.

56. List the hardware oriented color models. [Apr/May 2010]

The color models which are oriented toward hardware are:
1. The RGB (Red, Green, Blue) model for color monitors and color video cameras.
2. The CMY (Cyan, Magenta, Yellow) model and the CMYK (Cyan, Magenta, Yellow, Black) model for color printing.
3. The HSI (Hue, Saturation and Intensity) model.

57. Draw the block diagram of the elements of an image processing system. [Nov/Dec 2010]

[Block diagram: network; image displays; computer; mass storage; hardcopy; specialized image processing hardware; image processing software; image sensors; problem domain.]

58. What are the types of light receptors? [Nov/Dec 2010, Apr/May 2011]
The two types of light receptors that are distributed over the surface of the retina of the eye are:
1. Cones - These are used to resolve fine details, and hence cone vision is called photopic or bright-light vision.
2. Rods - They provide only a general, overall picture of the field of view. Hence rod vision is called scotopic or dim-light vision.

59. What is meant by illumination and reflectance? [Apr/May 2011]
Illumination is defined as the amount of source illumination incident on the scene being viewed. Reflectance is defined as the amount of illumination reflected by the objects in the scene.

60. What is image averaging? [Nov/Dec 2011]
Image averaging is the process of replacing each pixel in an image by a weighted average of its neighborhood pixels. This process is used to reduce the noise content in an image.
REVIEW QUESTIONS
1. Discuss the basic relationships between pixels. [Apr/May 2011]
   Ans. Refer Section 1.8, Page no: 1.33
2. Explain the components of an image processing system. [Nov/Dec 2010; A.U. Nov/Dec 2008]
   Ans. Refer Section 1.4, Page no: 1.9
3. Explain the three types of adjacency relationships between pixels. [Nov/Dec 2010]
   Ans. Refer Section 1.8.2, Page no: 1.34
4. Describe the principle of sampling and quantization. Discuss the effect of increasing (1) the sampling frequency and (2) the quantization levels on an image. [May/June 2012]
5. How is an RGB image represented using the HSI format? Describe the transformation. [May/June 2012]
   Ans. Refer Section 1.10.1, Page no: 1.43
6. Explain the RGB color model.
   Ans. Refer Section 1.10, Page no: 1.42
7. Explain the fundamental steps in digital image processing.
   Ans. Refer Section 1.3, Page no: 1.5
8. Explain the elements of visual perception with a neat diagram. [Nov/Dec 2010]
   Ans. Refer Section 1.5, Page no: 1.12

You might also like