2
Image Enhancement
Spatial Domain: Gray level transformations - Histogram processing - Basics of Spatial Filtering - Smoothing and Sharpening Spatial Filtering, Frequency Domain: Introduction to Fourier Transform - Smoothing and Sharpening frequency domain filters - Ideal, Butterworth and Gaussian filters, Homomorphic filtering, Color image enhancement.
2.1. INTRODUCTION
The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of the pixels in an image.

Spatial domain processes are denoted by the expression

g(x, y) = T[f(x, y)]    ... (2.1)

where
f(x, y) - input image
g(x, y) - processed image
T - operator on f, defined over some neighborhood of (x, y)

The neighborhood of a point (x, y) is usually a square or rectangular sub-image area centered at (x, y).
[Figure: A 3 x 3 neighborhood of a point (x, y) in an image f(x, y) in the spatial domain, with the origin at the top-left corner.]
s = T(r)    ... (2.2)

where
r - denotes the gray level of f(x, y) at any point (x, y)
s - denotes the gray level of g(x, y) at any point (x, y)

Based on the shape of T(r), there are two categories of techniques which are used for contrast enhancement. They are,
1. Contrast stretching
2. Thresholding
[Figure: Gray-level transformations s = T(r) for contrast enhancement - (a) contrast stretching, which maps input levels below m toward dark and levels above m toward light, and (b) thresholding, which produces a binary (dark/light) output about m.]

[Figure: Some basic intensity transformation functions (negative, log, nth root, nth power, identity), plotted as output intensity level s versus input intensity level r over the range 0 to L-1.]
Gamma Correction
A variety of devices used for image capture, printing, and display respond according to a power law. By convention, the exponent in the power-law equation is referred to as gamma. The process used to correct this power-law response phenomenon is called gamma correction.
Fig. 2.5. Plots of the equation s = c r^γ for various values of γ (γ = 0.04, 0.10, 0.20, 0.40, 0.67, 1, 1.5, 2.5, 5.0, 10.0, 25.0), shown as output intensity level s versus input intensity level r, with c = 1 in all cases.
Example
CRT devices have an intensity-to-voltage response that is a power function. Gamma correction is important if displaying an image accurately on a computer screen is of concern. Images that are not corrected properly can look either bleached out or too dark. Color imaging also uses the concept of gamma correction.

The gamma correction concept is becoming more popular due to the use of images over the internet. It is important in general-purpose contrast manipulation. To darken an image we use γ > 1, and γ < 1 to lighten (whiten) an image.
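As a rough illustration of the power-law transformation s = c r^γ, the following minimal sketch applies gamma correction to an 8-bit image using NumPy; the function name and the normalization to [0, 1] are our own choices, not prescribed by the text.

import numpy as np

def gamma_correct(image, gamma, c=1.0):
    """Power-law (gamma) transformation s = c * r**gamma for an 8-bit image."""
    r = image.astype(np.float64) / 255.0      # normalize input intensities to [0, 1]
    s = c * np.power(r, gamma)                # apply the power-law transformation
    return np.clip(s * 255.0, 0, 255).astype(np.uint8)

# gamma > 1 darkens the image, gamma < 1 lightens it:
# dark = gamma_correct(img, 2.5)
# light = gamma_correct(img, 0.4)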
2.4.4.1. Contrast Stretching

It is the simplest piecewise linear transformation function. We can have low contrast images because of lack of illumination, problems in the imaging sensor, or wrong setting of the lens aperture during image acquisition.

[Figure: A contrast stretching transformation, output intensity level s versus input intensity level r, with control points (r1, s1) and (r2, s2).]
The idea behind contrast stretching is to increase the dynamic range of gray levels
in the image being processed.
The location of the points (r1, s1) and (r2, s2) controls the shape of the curve.

a. If r1 = r2 and s1 = s2, the transformation is a linear function that produces no change in gray levels.

b. If r1 = r2, s1 = 0, and s2 = L - 1, then the transformation becomes a thresholding function that creates a binary image.

c. Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the gray values of the output image, thus affecting its contrast.

Generally r1 ≤ r2 and s1 ≤ s2, so that the function is single valued and monotonically increasing.
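A minimal sketch of the piecewise linear contrast stretching described above, implemented with NumPy; the three linear segments joining (0, 0), (r1, s1), (r2, s2) and (L-1, L-1) are interpolated, and the function name is illustrative only.

import numpy as np

def contrast_stretch(image, r1, s1, r2, s2, L=256):
    """Piecewise linear contrast stretching defined by (r1, s1) and (r2, s2)."""
    r = image.astype(np.float64)
    # interpolate over the three linear segments of the transformation
    s = np.interp(r, [0, r1, r2, L - 1], [0, s1, s2, L - 1])
    return s.astype(np.uint8)

# Example: stretch the mid-range intensities of an 8-bit image
# out = contrast_stretch(img, r1=70, s1=20, r2=180, s2=230)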
Fig. 2.8. Gray Level Slicing (2nd approach): a transformation T(r) that highlights the range [A, B] of intensities while preserving the remaining gray levels.
2.4.4.3. Bit plane Slicing
Sometimes it is important to highlight the contribution made to the total image
appearance by specific bits. For example consider if each pixel is represented by 8
bits.
Imagine that an image is composed of eight 1-bit planes, ranging from bit plane 0 for the least significant bit to bit plane 7 for the most significant bit. In terms of 8-bit bytes, plane 0 contains all the lowest-order bits in the image and plane 7 contains all the high-order bits.

[Figure: One 8-bit byte decomposed into bit plane 7 (most significant) through bit plane 0 (least significant).]
Higher order bits contain the majority of visually significant data and low order
bits contribute to more subtle details in the image.
Separating a digital image into its bit planes is useful for analyzing the relative importance played by each bit of the image.

It helps in determining the adequacy of the number of bits used to quantize each pixel. It is also useful for image compression.
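A minimal sketch of bit-plane slicing for an 8-bit image, assuming a NumPy array of dtype uint8; the helper name is illustrative.

import numpy as np

def bit_planes(image):
    """Split an 8-bit image into its eight 1-bit planes.
    Index 0 is the least significant plane, index 7 the most significant."""
    return [(image >> k) & 1 for k in range(8)]

# Example: reconstruct an approximation from only the two highest-order planes
# planes = bit_planes(img)
# approx = (planes[7] << 7) + (planes[6] << 6)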
[Figure: Plot of the normalized histogram p(r_k) = n_k/MN versus r_k.]

The gray level transformation function that satisfies conditions (a) and (b) is shown in figure 2.11 below.

Fig. 2.10. Four basic image types: dark, light, low contrast, high contrast, and their corresponding histograms.
Fig. 2.11. A gray level transformation function T(r) that is single valued and monotonically increasing. The inverse transformation from s to r is denoted by r = T⁻¹(s), 0 ≤ s ≤ L - 1.

A monotonic transformation function performs a one-to-one or many-to-one mapping.

The transformation function should be single valued so that the inverse transformation exists. The monotonically increasing condition preserves the increasing order from black to white in the output image. The second condition guarantees that the output gray levels will be in the same range as the input levels.

p_s(s) = p_r(r) |dr/ds|    ... (2.3)
Thus the PDF of the transformed variable s is determined by the gray levels PDF
of the input image and by the chosen transformations function.
A transformation function of particular importance in image processing has the
form.
s = T(r) = (L - 1) ∫₀ʳ p_r(w) dw    ... (2.4)

p_r(r_k) = n_k / MN,    k = 0, 1, 2, ......, L - 1    ... (2.7)

where,
MN is the total number of pixels in the image
n_k is the number of pixels that have intensity r_k
L is the number of possible intensity levels in the image.

The discrete transformation function is given by

s_k = T(r_k) = (L - 1) Σ_{i=0}^{k} p_r(r_i) = (L - 1)/MN Σ_{i=0}^{k} n_i    ... (2.8)

This mapping is called histogram equalization; it yields an output image that has a uniform histogram. It is a good approach when automatic enhancement is needed.
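A compact sketch of discrete histogram equalization (Eq. 2.8) for an 8-bit grayscale image; implementing the mapping as a lookup table is one common choice and is not prescribed by the text.

import numpy as np

def histogram_equalize(image, L=256):
    """Histogram equalization: s_k = (L-1) * sum_{i<=k} n_i / MN (Eq. 2.8)."""
    hist = np.bincount(image.ravel(), minlength=L)   # n_k for each intensity level
    cdf = np.cumsum(hist) / image.size               # cumulative normalized histogram
    lut = np.round((L - 1) * cdf).astype(np.uint8)   # transformation s_k = T(r_k)
    return lut[image]

# equalized = histogram_equalize(img)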
Algorithm
Step 1 : Compute s_k = P_r(k), where k = 0, 1, ......, L - 1, the cumulative normalized histogram of f.

Step 2 : Compute G(k), k = 0, 1, ......, L - 1, the transformation function, from the specified histogram.

The center of the neighborhood region is then moved to an adjacent pixel location and the procedure is repeated.
2.5.4. USING HISTOGRAM STATISTICS FOR IMAGE ENHANCEMENT

Statistics can be obtained directly from an image histogram, and they can be used for image enhancement.

Let r denote a discrete random variable representing intensity values in the range [0, L - 1], and let p(r_i) denote the normalized histogram component corresponding to value r_i. The nth moment of r about its mean is defined as

μ_n(r) = Σ_{i=0}^{L-1} (r_i - m)ⁿ p(r_i)    ... (2.9)

where m is the mean (average intensity) value of r

m = Σ_{i=0}^{L-1} r_i p(r_i)    ... (2.10)

The variance is given by

σ² = Σ_{i=0}^{L-1} (r_i - m)² p(r_i)    ... (2.12)

The mean is a measure of average intensity, and the variance (or standard deviation) is a measure of contrast in an image.
When working with only the mean and variance, it is easy to estimate them directly from the sample values, without computing the histogram. These estimates are called the sample mean and sample variance, as shown in the equations below.

m = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y)    ... (2.13)

and

σ² = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [f(x, y) - m]²    ... (2.14)

for x = 0, 1, 2, ....., M - 1 and y = 0, 1, 2, ....., N - 1.
The global mean and variance are computed over an entire image and are useful for gross adjustments in overall intensity and contrast.

The local mean and variance are used as the basis for making changes that depend on image characteristics in a neighborhood about each pixel in an image.

Let (x, y) denote the coordinates of any pixel in a given image, and let S_xy denote a neighborhood (sub-image) of specified size, centered on (x, y). The mean value of the pixels in this neighborhood is defined as

m_Sxy = Σ_{i=0}^{L-1} r_i p_Sxy(r_i)    ... (2.15)

where p_Sxy is the histogram of the pixels in region S_xy. The variance of the pixels in the neighborhood is given by

σ²_Sxy = Σ_{i=0}^{L-1} (r_i - m_Sxy)² p_Sxy(r_i)    ... (2.16)
The local mean is a measure of average intensity in neighborhood S,xy and the local
variance is a measure of intensity contrast in that neighborhood.
Let f(x, y) represent the value of an image at any image coordinates (x, y), and let g(x, y) represent the corresponding enhanced value at those coordinates, where E, k0, k1 and k2 are specified parameters.

If the operation performed on the image pixels is linear, then the filter is called a linear spatial filter; otherwise the filter is nonlinear.
Fig. 2.12. The mechanics of linear spatial filtering using a 3 x 3 filter mask

The above figure 2.12 shows the mechanics of linear spatial filtering using a 3 x 3 neighborhood. The process consists of moving the filter mask from point to point in the image. At each point (x, y) the response is calculated using a predefined relationship.

For linear spatial filtering, the response is given by a sum of products of the filter coefficients and the corresponding image pixels in the area spanned by the filter mask.
The result of linear filtering with the filter mask at point (x, y) in the image is

g(x, y) = Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x + s, y + t)    ... (2.19)

where x and y are varied so that each pixel in w visits every pixel in f, with a = (m - 1)/2 and b = (n - 1)/2.
2.6.2. SPATIAL CORRELATION AND CONVOLUTION

To perform linear spatial filtering we must have basic knowledge about correlation and convolution.

Correlation is the process of moving a filter mask over the image and computing the sum of products at each location.

The process of linear filtering is similar to a frequency domain concept called convolution. For this reason, linear spatial filtering is often referred to as convolving a mask with an image. Filter masks are sometimes called convolution masks.

Convolution performs the same mechanism as correlation except that, in convolution, the filter is first rotated by 180°.
[Figure: One-dimensional illustration of correlation and convolution of a filter w = 1 2 3 2 8 with a discrete unit impulse f. Zero padding is added on both sides, the filter (rotated by 180° for convolution) is aligned at the starting position, shifted across f one element at a time, and the final position gives the full result.]

If the filter is of size m, we need m - 1 zeros (zero padding) on either side of f so that the filter can visit every pixel in f.
Fig. 2.14. Correlation (middle row) and convolution (last row) of a 2-D filter with a 2-D discrete unit impulse, showing the initial position for w, the full and cropped correlation results, the rotated w, and the full and cropped convolution results. The 0s are shown in gray to simplify visual analysis.
The correlation of a filter w(x, y) of size m x n with an image f(x, y) is denoted and defined as

Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x + s, y + t)    ... (2.20)

In a similar manner, the convolution of w(x, y) and f(x, y) is denoted and defined as

Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x - s, y - t)    ... (2.21)

where the minus signs on the right flip f (i.e., rotate it by 180°).

Here the w's are mask coefficients, the z's are the values of the image gray levels corresponding to those coefficients, and mn is the total number of coefficients in the mask. For a 3 x 3 mask with coefficients

w1 w2 w3
w4 w5 w6
w7 w8 w9

the response at any point is

R = w1 z1 + w2 z2 + ...... + w9 z9 = Σ_{k=1}^{9} w_k z_k = wᵀz    ... (2.23)

where w and z are 9-dimensional vectors formed from the coefficients of the mask and the image intensities encompassed by the mask, respectively.

R = (1/9) Σ_{i=1}^{9} z_i    ... (2.24)

This is the same as equation 2.23 with coefficient values w_i = 1/9.
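The following sketch illustrates Eqs. 2.20 and 2.21 directly with explicit loops; zero padding and a same-size (cropped) output are simplifying assumptions made here, not requirements of the text.

import numpy as np

def correlate2d(f, w):
    """2-D correlation of image f with mask w (Eq. 2.20), zero padded, same-size output."""
    m, n = w.shape
    a, b = m // 2, n // 2
    fp = np.pad(f.astype(np.float64), ((a, a), (b, b)))     # zero padding around f
    g = np.zeros(f.shape, dtype=np.float64)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])      # sum of products under the mask
    return g

def convolve2d(f, w):
    """2-D convolution (Eq. 2.21): correlate with the mask rotated by 180 degrees."""
    return correlate2d(f, np.rot90(w, 2))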
Gaussian Function
A Gaussian Function of two variables has the basic form
h(x, y) = e^{-(x² + y²)/2σ²}

where
σ is the standard deviation
x and y are coordinates.

Non-Linear Filter

Generating a nonlinear filter requires the following information.

1. The size of a neighborhood

Blurring

It is used in preprocessing tasks, such as removal of small details from an image prior to (large) object extraction and bridging of small gaps in lines or curves.

Noise Reduction

It can be accomplished by blurring with a linear filter and also by nonlinear filtering.
2.7.1. SMOOTHING BY LINEAR FILTERS

The output of a smoothing linear spatial filter is simply the average of the pixels contained in the neighborhood of the filter mask. These filters are also called averaging filters or low-pass filters.
Operation
The operation is performed by replacing the value of every pixel in the image by
the average of the gray levels in the neighborhood defined by the filter mask.
This
process reduces sharp transitions in gray levels in the image.
Types
1. Box Filter
2. Weighted average filter
A 3 x 3 box filter is

      1 1 1
1/9 x 1 1 1
      1 1 1

R = (1/9) Σ_{i=1}^{9} z_i    ... (2.25)

The average value is computed and the normalizing constant (1/9 in this case) is multiplied with the filter mask.

The denominator of this constant is equal to the sum of all coefficient values of the mask. An m x n mask would have a normalizing constant equal to 1/mn.
2.7.1.2. Weighted Average Filter
A weighted average filter is the one in which pixels are multiplied by different
coefficients.
A 3 x 3 weighted average filter is

       1 2 1
1/16 x 2 4 2
       1 2 1

where the normalizing constant is the sum of the mask coefficients, Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t).
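The box and weighted average masks above can be applied with any correlation routine; a minimal sketch using SciPy's ndimage.correlate, where the border-handling mode is an arbitrary illustrative choice.

import numpy as np
from scipy import ndimage

# 3 x 3 box filter and weighted average filter masks from the text
box = np.ones((3, 3)) / 9.0
weighted = np.array([[1, 2, 1],
                     [2, 4, 2],
                     [1, 2, 1]], dtype=float) / 16.0

# smoothed = ndimage.correlate(img.astype(float), weighted, mode='reflect')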
Applications
1. Noise Reduction
2. Smoothing of false contours i.e., outlines
3. Irrelevant details can also be removed by these kinds of filters; irrelevant means details which are not of interest.

The median filter replaces each pixel by the median of the gray levels in its neighborhood:

f̂(x, y) = median {g(s, t)},  (s, t) ∈ S_xy

The original value of the pixel is included in the computation of the median.

Median filters are quite popular because, for certain types of random noise, they provide excellent noise reduction capabilities, with considerably less blurring than linear smoothing filters of similar size.

Median filters are particularly effective in the presence of both bipolar and unipolar impulse noise. In fact, the median filter yields excellent results for images corrupted by this type of noise.

The max filter is given by

f̂(x, y) = max {g(s, t)},  (s, t) ∈ S_xy
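Relating to the median filter described above, a small sketch over a k x k neighborhood S_xy; border replication is an assumption made for illustration.

import numpy as np

def median_filter(g, k=3):
    """Median filter over a k x k neighborhood (effective against impulse noise)."""
    a = k // 2
    gp = np.pad(g, a, mode='edge')                        # replicate border pixels
    out = np.empty_like(g)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            out[x, y] = np.median(gp[x:x + k, y:y + k])   # median of the neighborhood
    return out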
∂²f/∂x² = f(x + 1) + f(x - 1) - 2f(x)    ... (2.28)

The Laplacian combines the second partial derivatives:

∇²f = ∂²f/∂x² + ∂²f/∂y²    ... (2.29)

∂²f/∂x² = f(x + 1, y) + f(x - 1, y) - 2f(x, y)    ... (2.30)

Example

(a) 0  1  0    (b) 1  1  1    (c)  0 -1  0    (d) -1 -1 -1
    1 -4  1        1 -8  1        -1  4 -1        -1  8 -1
    0  1  0        1  1  1         0 -1  0        -1 -1 -1

[Figure: Laplacian filter masks used for image sharpening - (a) and (b) with negative centers, and (c) and (d) their counterparts with positive centers.]

∇f = grad(f) = [∂f/∂x, ∂f/∂y]ᵀ = [g_x, g_y]ᵀ    ... (2.36)
The intensities of the pixels in a 3 x 3 neighborhood of (x, y) are denoted

f(x - 1, y - 1) : z1    f(x - 1, y) : z2    f(x - 1, y + 1) : z3
f(x, y - 1)     : z4    f(x, y)     : z5    f(x, y + 1)     : z6
f(x + 1, y - 1) : z7    f(x + 1, y) : z8    f(x + 1, y + 1) : z9

The simplest approximations to the partial derivatives are

g_x = (z8 - z5)  and  g_y = (z6 - z5)    ... (2.39)

Two other definitions proposed by Roberts in the early development of digital image processing use cross differences:

g_x = (z9 - z5)  and  g_y = (z8 - z6)    ... (2.42)

Using 3 x 3 masks, the differences become

g_x = (z7 + 2z8 + z9) - (z1 + 2z2 + z3)  and  g_y = (z3 + 2z6 + z9) - (z1 + 2z4 + z7)

These equations can be implemented using the masks shown in the figure below.

-1 -2 -1        -1  0  1
 0  0  0        -2  0  2
 1  2  1        -1  0  1

Fig. 2.21. Sobel Operators

After computing the partial derivatives with these masks, we obtain the magnitude of the gradient.

We know that

M(x, y) = √(g_x² + g_y²)

Then, substituting the g_x and g_y values in the above equation,

M(x, y) ≈ √{[(z7 + 2z8 + z9) - (z1 + 2z2 + z3)]² + [(z3 + 2z6 + z9) - (z1 + 2z4 + z7)]²}

The masks in the above figure 2.21 are called the Sobel operators.
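A short sketch computing the Sobel gradient magnitude M(x, y) = sqrt(g_x² + g_y²) with the masks of Fig. 2.21; SciPy is used here only for the correlation step, which is an implementation choice.

import numpy as np
from scipy import ndimage

def sobel_magnitude(f):
    """Gradient magnitude using the Sobel masks of Fig. 2.21."""
    kx = np.array([[-1, -2, -1],
                   [ 0,  0,  0],
                   [ 1,  2,  1]], dtype=float)   # approximates df/dx
    ky = kx.T                                    # approximates df/dy
    gx = ndimage.correlate(f.astype(float), kx)
    gy = ndimage.correlate(f.astype(float), ky)
    return np.hypot(gx, gy)                      # sqrt(gx^2 + gy^2)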
2.9. IMAGE ENHANCEMENT IN FREQUENCY DOMAIN

Enhancement in the frequency domain is straightforward. We simply compute the Fourier transform of the image to be enhanced, multiply the result by a filter transfer function, and take the inverse transform to produce the enhanced image.

The one-dimensional inverse Fourier transform is

f(x) = ∫ F(u) exp(j2πux) du    ... (2.44)

The Fourier transform can also be written in polar form as

F(u) = |F(u)| e^{jφ(u)},  where  φ(u) = tan⁻¹[I(u)/R(u)]    ... (2.49)

The two-dimensional Fourier transform is

F{f(x, y)} = F(u, v) = ∬ f(x, y) exp[-j2π(ux + vy)] dx dy    ... (2.50)

The inverse Fourier transformation is given by

F⁻¹{F(u, v)} = f(x, y) = ∬ F(u, v) exp[j2π(ux + vy)] du dv    ... (2.51)

where (u, v) are frequency variables.

Preprocessing is done to shift the origin of F(u, v) to the frequency coordinate (M/2, N/2), which is the center of the M x N area occupied by the 2-D FT. This area is known as the frequency rectangle.
2.9.2. BASIS OF FILTERING IN FREQUENCY DOMAIN

Filtering techniques in the frequency domain are based on modifying the Fourier transform to achieve a specific objective and then computing the inverse DFT to get us back to the image domain.

H(u, v) is called a filter because it suppresses certain frequencies from the image while leaving others unchanged.
[Diagram: Frequency domain filtering - the input image f(x, y) is transformed, multiplied by the filter H(u, v), and inverse transformed to give the enhanced image g(x, y).]

A circle of radius D0 with origin at the center of the frequency rectangle encloses α percent of the power, where

α = 100 [ Σ P(u, v) / P_T ]

and the summation is taken over values of (u, v) that lie inside the circle or on its boundary.

Fig. 2.23. (a) Perspective plot of an ideal low-pass filter transfer function (b) Filter displayed as an image (c) Filter radial cross section

The Ideal Low Pass Filter is not suitable for practical usage, but it can be implemented in any computer system.
Fig. 2.24. (a) Perspective plot of a Butterworth low-pass filter transfer function (b) Filter displayed as an image (c) Filter cross sections of orders 1 through 4

2.10.3. GAUSSIAN LOW PASS FILTERS

The transfer function of a Gaussian Low Pass Filter is

H(u, v) = e^{-D²(u, v)/2σ²}

where D(u, v) is the distance of point (u, v) from the center of the transform and σ = D0, the specified cut-off frequency.
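A minimal sketch of frequency-domain smoothing with the Gaussian low-pass transfer function above; the FFT-shift convention used to center the frequency rectangle is the usual one, and the function names are our own.

import numpy as np

def gaussian_lowpass(shape, D0):
    """Gaussian low-pass transfer function H(u, v) = exp(-D^2 / (2*D0^2)),
    centered on the frequency rectangle."""
    M, N = shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    V, U = np.meshgrid(v, u)
    D2 = U ** 2 + V ** 2                       # squared distance from the center
    return np.exp(-D2 / (2.0 * D0 ** 2))

def filter_frequency_domain(f, H):
    """Multiply the centered DFT of f by H and return the real inverse transform."""
    F = np.fft.fftshift(np.fft.fft2(f))        # shift the origin to the center
    return np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))

# smooth = filter_frequency_domain(img, gaussian_lowpass(img.shape, D0=40))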
The filter has an important characteristic that the
inverse of it is also Gaussian.
[Figure: Cross sections of Gaussian low-pass filter transfer functions for D0 = 10, 20, 40 and 100.]

[Figure: Top row - perspective plot, image representation and cross section of a typical ideal highpass filter. Middle and bottom rows - the same sequence for typical Butterworth and Gaussian highpass filters.]

The ideal highpass filter has the form

H(u, v) = 0  if D(u, v) ≤ D0
          1  if D(u, v) > D0

and the Butterworth highpass filter of order n has the form

H(u, v) = 1 / (1 + [D0/D(u, v)]^{2n})
[Diagram: Color image enhancement - the input R, G, B image undergoes coordinate conversion into planes T1, T2, T3, each plane receives monochrome image enhancement, and an inverse coordinate transformation produces the R, G, B output for display.]

Since each image plane T_k(m, n), k = 1, 2, 3 is enhanced independently for display, care has to be taken so that the enhanced coordinates T_k are within the color gamut of the R-G-B system.
display device.
7. What is Gray Level Transformation function?
It is the simplest form of the transformations, obtained when the neighborhood is of size 1 x 1. In this case, g depends only on the value of f at a single point (x, y) and T becomes a gray level transformation function of the form s = T(r).

h(x, y) = e^{-(x² + y²)/2σ²}

where
σ is the standard deviation
x and y are coordinates.

      1 1 1
1/9 x 1 1 1
      1 1 1

Sum of all the coefficients = 1+1+1+1+1+1+1+1+1 = 9

       1 2 1
1/16 x 2 4 2
       1 2 1

Sum of all the coefficients = 1+2+1+2+4+2+1+2+1 = 16
34. What is the objective of sharpening?

The principal objective of sharpening is to highlight fine details in an image or to enhance details that have been blurred either in error or as a natural effect of a particular method of image acquisition.

F(u) = ∫ f(x) exp(-j2πux) dx,  where j = √-1    ... (1)

41. Define the transfer function of Ideal and Butterworth LPF and HPF.
[Nov/Dec-2011, Apr/May-2011, May/June-2009]

                  Ideal Filter                          Butterworth Filter

Low pass filter   H(u, v) = 1 if D(u, v) ≤ D0           H(u, v) = 1 / (1 + [D(u, v)/D0]^{2n})
                           0 if D(u, v) > D0

High pass filter  H(u, v) = 0 if D(u, v) ≤ D0           H(u, v) = 1 / (1 + [D0/D(u, v)]^{2n})
                           1 if D(u, v) > D0
40. How are smoothing filters used in image processing? Give any two smoothing filters. [Nov/Dec-2010]

Smoothing filters are used for blurring and noise reduction. Smoothing can be performed in two ways.

1. Linear
2. Non-Linear

The box filter and the weighted average filter are used as linear smoothing filters.

52. Write down a 3 x 3 mask for the smoothing and sharpening filters. (Apr/May-2011)

For smoothing filters we can consider the box filter:

      1 1 1
1/9 x 1 1 1
      1 1 1
1. Min filter
2. Median filter
3. Max filter
REVIEW QUESTIONS
11. Describe the frequency domain methods for image enhancement. [Nov/Dec-2011]
Ans. Refer Section 2.9, Page no: 2.35

12. Describe the following filters. [Nov/Dec-2011]
i. Laplacian filters
ii. Sharpening filters
Ans.
i. Refer Section 2.8.3, Page no: 2.30 and Refer Section 2.8, Page no: 2.29
3
Image Restoration
Image Restoration - degradation model, Properties, Noise models - Mean Filters -
Order Statistics - Adaptive filters - Band reject Filters - Band pass Filters - Notch
Filters- Optimum Notch Filtering - Inverse Filtering - Wiener filtering
Types

The restoration techniques are classified into two types.

1. Spatial domain techniques
2. Frequency domain techniques

[Diagram: A model of the image degradation/restoration process - the input f(x, y) passes through the degradation function H, additive noise η(x, y) is added to give g(x, y), and the restoration filter(s) produce the estimate f̂(x, y).]

The term white noise is a carryover from the physical properties of white light, which contains nearly all frequencies in the visible spectrum in equal proportions.

With the exception of spatially periodic noise, we assume that noise is independent of spatial coordinates, and that it is uncorrelated with respect to the image itself.
p(z) = (1/(√(2π)σ)) e^{-(z - z̄)²/2σ²}    ... (3.3)

Fig. 3.2. Gaussian noise PDF

3.3.2.2. Rayleigh Noise

Unlike the Gaussian distribution, the Rayleigh distribution is not symmetric. It is given by the formula,

p(z) = (2/b)(z - a) e^{-(z - a)²/b}   for z ≥ a
     = 0                              for z < a    ... (3.4)

The mean and variance of this density are given by

z̄ = a + √(πb/4)    ... (3.5)

and

σ² = b(4 - π)/4    ... (3.6)

[Figure 3.3: Rayleigh noise PDF.]

[Figure 3.4: Gamma (Erlang) noise PDF, with peak value K = a(b-1)^{b-1} e^{-(b-1)} / (b-1)! occurring at z = (b-1)/a.]

The mean and variance of this density are given by

z̄ = b/a    ... (3.11)

and

σ² = b/a²    ... (3.12)
Uniform Noise

The uniform noise PDF is given by

p(z) = 1/(b - a)   if a ≤ z ≤ b
     = 0           otherwise    ... (3.13)

The mean of this density function is given by

z̄ = (a + b)/2    ... (3.14)

and its variance by

σ² = (b - a)²/12

[Figure: Uniform and impulse (salt-and-pepper) noise PDFs, the latter with probabilities Pa and Pb at levels a and b.]

Summary of the noise models:

1. Gaussian noise:
   p(z) = (1/(√(2π)σ)) e^{-(z - z̄)²/2σ²},  where z̄ is the mean (average) value of z and σ² its variance.

2. Rayleigh noise:
   p(z) = (2/b)(z - a) e^{-(z - a)²/b} for z ≥ a, and 0 for z < a;
   z̄ = a + √(πb/4),  σ² = b(4 - π)/4

4. Exponential noise:
   p(z) = a e^{-az} for z ≥ 0, and 0 for z < 0;
   z̄ = 1/a,  σ² = 1/a²

5. Uniform noise:
   p(z) = 1/(b - a) if a ≤ z ≤ b, and 0 otherwise;
   z̄ = (a + b)/2,  σ² = (b - a)²/12

6. Impulse noise:
   p(z) = Pa for z = a, Pb for z = b, and 0 otherwise.
Spatial filtering is the method of choice in situations when only additive random noise is present. When the only degradation present in an image is noise, we have

g(x, y) = f(x, y) + η(x, y)  and  G(u, v) = F(u, v) + N(u, v)

The noise terms are unknown, so subtracting them from g(x, y) or G(u, v) is not a realistic option. In the case of periodic noise, it is usually possible to estimate N(u, v) from the spectrum of G(u, v). In this case, N(u, v) can be subtracted from G(u, v) to obtain an estimate of the original image.

The value of the restored image f̂ at point (x, y) is the arithmetic mean computed using the pixels in the region defined by S_xy:

f̂(x, y) = (1/mn) Σ_{(s,t) ∈ S_xy} g(s, t)    ... (3.19)

This operation can be implemented using a spatial filter of size m x n in which all coefficients have value 1/mn. A mean filter smooths local variations in an image, and noise is reduced as a result of blurring.
The contraharmonic mean filter yields a restored image based on the expression

f̂(x, y) = [Σ_{(s,t) ∈ S_xy} g(s, t)^{Q+1}] / [Σ_{(s,t) ∈ S_xy} g(s, t)^{Q}]    ... (3.22)

where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating the effects of salt and pepper noise.

If Q is positive, then pepper noise is eliminated.
If Q is negative, then salt noise is eliminated.
Also
should be preserved.
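An illustrative implementation of the contraharmonic mean filter of Eq. 3.22; the window size, border replication and the small offset guarding zero-valued pixels for negative Q are assumptions made here, not part of the text.

import numpy as np

def contraharmonic_mean(g, Q, k=3):
    """Contraharmonic mean filter of order Q over a k x k window (Eq. 3.22).
    Q > 0 removes pepper noise, Q < 0 removes salt noise, Q = 0 is the arithmetic mean."""
    a = k // 2
    gp = np.pad(g.astype(np.float64), a, mode='edge') + 1e-10   # offset avoids 0**Q issues
    out = np.empty(g.shape, dtype=np.float64)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            win = gp[x:x + k, y:y + k]
            out[x, y] = np.sum(win ** (Q + 1)) / np.sum(win ** Q)
    return out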
3. If the two variances are equal, we want the filter to return the arithmetic mean value of the pixels in S_xy. This condition occurs when the local area has the same properties as the overall image, and local noise is to be reduced by averaging.

The adaptive, local noise reduction filter is given by

f̂(x, y) = g(x, y) - (σ²_η / σ²_Sxy)[g(x, y) - m_Sxy]    ... (3.28)

Algorithm

The adaptive median-filtering algorithm works in two stages, denoted stage A and stage B, as follows.

Stage A:
A1 = z_med - z_min
A2 = z_med - z_max
If A1 > 0 AND A2 < 0, go to Stage B
Periodic noise can be analyzed and filtered quite effectively using frequency
domain techniques. Periodic noise appears as concentrated bursts of energy in the
Fourier transform, at locations corresponding to the frequencies of the periodic
interference. The approach is to use a selective filter to isolate the noise.
The three types of selective filters are
1. Band reject filter
2. Bandpass filter
3. Notch filter

D1(u, v) = [(u - M/2 - u0)² + (v - N/2 - v0)²]^{1/2}

D2(u, v) = [(u - M/2 + u0)² + (v - N/2 + v0)²]^{1/2}

The Butterworth notch reject filter of order n is given by

H(u, v) = 1 / {1 + [D0² / (D1(u, v) D2(u, v))]^n}

and the Gaussian notch reject filter by

H(u, v) = 1 - exp{-(1/2)[D1(u, v) D2(u, v) / D0²]}

Notch pass filters pass, rather than suppress, the frequencies contained in the notch areas. These filters perform exactly the opposite function as the notch reject filter. The transfer function of this filter may be given as

H_NP(u, v) = 1 - H_NR(u, v)

where,
H_NP(u, v) - transfer function of the notch pass filter
H_NR(u, v) - transfer function of the notch reject filter

3.5.4. OPTIMUM NOTCH FILTERING

Optimum notch filtering is used to minimize local variances of the restored estimate f̂(x, y). These kinds of filters follow the same set of procedures such as
3.6. INVERSE FILTERING

Inverse filtering is a process of restoring an image degraded by a degradation function H. This function can be obtained by any method.

The simplest approach to restoration is direct inverse filtering. Inverse filtering provides an estimate F̂(u, v) of the transform of the original image simply by dividing the transform of the degraded image G(u, v) by the degradation function:

G(u, v) = H(u, v) F(u, v) + N(u, v)

F̂(u, v) = G(u, v) / H(u, v)
        = [H(u, v) F(u, v) + N(u, v)] / H(u, v)

F̂(u, v) = F(u, v) + N(u, v) / H(u, v)

It shows an interesting result: even if we know the degradation function, we cannot recover the undegraded image exactly because N(u, v) is not known.

If the degradation function has zero or very small values, then the ratio N(u, v)/H(u, v) could easily dominate the estimate F̂(u, v).
3.7. MINIMUM MEAN SQUARE ERROR (OR) WIENER FILTERING

This filter incorporates both the degradation function and the statistical behavior of noise into the restoration process. The main concept behind this approach is that the images and noise are considered as random variables, and the objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error between them is minimized.

f̂(x) = Σ_{s=-∞}^{∞} h_w(x - s) g(s)

This error measure is given by

e² = E{(f - f̂)²}

where E{·} is the expected value of the argument.

It is assumed that the noise and the image are uncorrelated, and that one or the other has zero mean value.

The minimum of the error function above is given in the frequency domain by the Wiener filter expression, in which |H(u, v)|² is the complex quantity squared. This result is known as the Wiener filter. The filter is named after its inventor, N. Wiener. The term inside the brackets is known as the minimum mean square error filter or the least square error filter.

The mean square error can also be written as the summation Σ_{x} Σ_{y} [f(x, y) - f̂(x, y)]².

[Diagram: f(x, y) → H → g(x, y)]

H is a system operator which, together with an additive white noise term η(x, y), operates on the input image f(x, y) to produce the degraded image g(x, y).
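Although the full Wiener expression is not reproduced above, its standard frequency-domain form can be sketched as follows, assuming the degradation transfer function H is known on the same grid as the image and approximating the noise-to-signal power ratio by a constant K (a common simplification when the spectra are unknown).

import numpy as np

def wiener_deconvolve(g, H, K=0.01):
    """Minimal Wiener filtering sketch: F_hat = conj(H)/(|H|^2 + K) * G,
    which is algebraically equal to (1/H) * |H|^2 / (|H|^2 + K) * G."""
    G = np.fft.fft2(g)
    W = np.conj(H) / (np.abs(H) ** 2 + K)   # Wiener filter with constant noise ratio K
    return np.real(np.fft.ifft2(W * G))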
Gray-level interpolation deals with the assignment of gray levels to pixels in the spatially transformed image.

5. What is meant by Noise probability density function?

The spatial noise descriptor is the statistical behavior of gray level values in the noise component of the model.

6. What is pseudo inverse filter? (Dec'13)

It is the stabilized version of the inverse filter. For a linear shift invariant system with frequency response H(u, v), the pseudo inverse filter is defined as

H⁻(u, v) = 1/H(u, v)  if H(u, v) ≠ 0
         = 0          otherwise
7. What is meant by least mean square filter or Wiener filter? (Dec'12)

The limitation of the inverse and pseudo inverse filters is that they are very sensitive to noise. Wiener filtering is a method of restoring images in the presence of blur as well as noise.

Information about the degradation must be extracted from the observed image either explicitly or implicitly. This task is called blind image restoration.

9. What are the two approaches for blind image restoration?

1. Direct measurement
2. Indirect estimation

F̂(u, v) - restored image
G(u, v) - degraded image
H(u, v) - filter transfer function
21. What is maximum filter and minimum filter?

The 100th percentile filter is the maximum filter, used for finding the brightest points in an image. The 0th percentile filter is the minimum filter, used for finding the darkest points in an image.

22. Name the different types of derivative filters

1. Prewitt operators
2. Roberts cross gradient operators
3. Sobel operators
f̂(x, y) = mn / [Σ_{(s,t) ∈ S_xy} 1/g(s, t)]

This filter works well for salt noise but fails for pepper noise.

26. Define and give the transfer function of the contraharmonic filter. (May'13)

The contraharmonic filter is used to reduce salt and pepper noise. Contraharmonic filtering results in a restored image expressed as

f̂(x, y) = [Σ_{(s,t) ∈ S_xy} g(s, t)^{Q+1}] / [Σ_{(s,t) ∈ S_xy} g(s, t)^{Q}]

z = gray level
z̄ = mean or average value of z
σ = standard deviation
σ² = variance of z

and its variance is σ² = (b - a)²/12

32. Give the relation for Impulse noise

Impulse noise:

p(z) = Pa  for z = a
     = Pb  for z = b
     = 0   otherwise

33. Write Sobel horizontal and vertical edge detection masks. (May'13)

Horizontal masking:      Vertical masking:
-1 -2 -1                 -1  0  1
 0  0  0                 -2  0  2
 1  2  1                 -1  0  1
REVIEW QUESTIONS
1. What is the use of Wiener filter or least mean square filter in image restoration? Explain. (Nov/Dec 13, Nov/Dec 14, May/June 13 and May/June 14)
Refer Section 3.7, Page No. 3.19

2. What is meant by Inverse filtering? Explain. (Nov/Dec 13, May/June 14)
Refer Section 3.6, Page No. 3.19

3. Explain about the various filters involved in noise models.
Refer Section 3.3, Page No. 3.3

4. Describe the image restoration technique of inverse filtering. Why does the inverse filtering approach fail in the presence of noise? (Nov/Dec 2017)
Refer Section 3.4, Page No. 3.9
4
Image Segmentation

Edge detection, Edge linking via Hough transform - Thresholding - Region based segmentation - Region growing - Region splitting and merging - Morphological processing - erosion and dilation, Segmentation by morphological watersheds - basic concepts - Dam construction - Watershed segmentation algorithm.
4.1. INTRODUCTION
R = w1 z1 + w2 z2 + ..... + w9 z9 = Σ_{k=1}^{9} w_k z_k

where z_k is the intensity of the pixel associated with mask coefficient w_k.

We know that the sum of all coefficients is zero, indicating that the mask response will be zero in areas of constant intensity.

A point has been detected at the location (x, y) on which the mask is centered if the absolute value of the response of the mask at that point exceeds a specified threshold. Such points are labeled 1 in the output image and all others are labeled 0.

The output is obtained using the following expression:

g(x, y) = 1  if |R(x, y)| ≥ T
        = 0  otherwise

where,
g is the output image
T is a non-negative threshold.

This formula measures the weighted differences between a pixel and its 8 neighbors. The intensity of an isolated point will be quite different from its surroundings and thus will be easily detectable by this type of mask.
-1 -1 -1        -1  2 -1
 2  2  2        -1  2 -1
-1 -1 -1        -1  2 -1
(horizontal)    (vertical)

together with two diagonal kernels whose coefficients of 2 run along the +45 degree and -45 degree directions respectively (all other coefficients are -1).

Fig. 4.2. Four Line Detection Kernels which respond maximally to (a) horizontal (b) vertical (c) +45 degree (d) -45 degree lines

Figure 4.2 shows a collection of four such kernels, which each respond to lines of single pixel width at the particular orientation shown.

These masks are tuned for light lines against a dark background, and would give a big negative response to dark lines against a light background. If we are interested in detecting dark lines against a light background, then we should negate the mask values. Alternatively, we might be interested in either kind of line, in which case we could take the absolute value of the mask response.
Step discontinuities

In step discontinuities, the image intensity abruptly changes from one value on one side of the discontinuity to a different value on the opposite side.

Line discontinuities

In line discontinuities, the image intensity abruptly changes value but then returns to the starting value within some short distance.

Step and line edges are rare in real images, because of low-frequency components or the smoothing introduced by most sensing devices; sharp discontinuities rarely exist in real signals. Step edges become ramp edges and line edges become roof edges, where intensity changes are not instantaneous but occur over a finite distance.
Step Edges
A step edge involves a transition between two intensity levels occurring ideally over the distance of 1 pixel.

Ideal Edges

Ideal edges can occur over the distance of 1 pixel, provided that no additional processing is used to make them look "real".

Roof Edges

A roof edge is really nothing more than a 1-pixel-thick line running through a region in an image.

Fig. 4.3. From left to right, models of a step, a ramp and a roof edge and their corresponding intensity profiles.

The second derivative is positive at the beginning of the ramp, negative at the end of the ramp, zero at points on the ramp and zero at points of constant intensity.

The intersection between the zero intensity axis and a line extending between the extremes of the second derivative marks a point called the zero crossing of the second derivative.
[Figure: Horizontal intensity profile of a ramp edge, with its first derivative, second derivative, and the zero crossing of the second derivative.]

∇f = grad(f) = [g_x, g_y]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ

This vector has the important geometrical property that it points in the direction of the greatest rate of change of f at location (x, y).

The magnitude (length) of vector ∇f is denoted as M(x, y), where

M(x, y) = mag(∇f) = √(g_x² + g_y²)

is the value of the rate of change in the direction of the gradient vector. Note that g_x, g_y and M(x, y) are images of the same size as the original, created when x and y are allowed to vary over all pixel locations in f.

The direction of the gradient vector is given by the angle

α(x, y) = tan⁻¹(g_y / g_x)

measured with respect to the x-axis.
Roberts Operators
When diagonal edge direction is of interest, we need a 2-D mask. The Roberts cross gradient operators are one of the earliest attempts to use 2-D masks with a diagonal preference. Consider the 3 x 3 region shown below; the Roberts operators are based on implementing the diagonal differences

g_x = (z9 - z5)  and  g_y = (z8 - z6)

which are implemented with the masks

-1  0        0 -1
 0  1        1  0

Prewitt Operators

This operator can be used to find g_x and g_y. The masks used in this method are given below.

z1 z2 z3      -1 -1 -1      -1  0  1
z4 z5 z6       0  0  0      -1  0  1
z7 z8 z9       1  1  1      -1  0  1

g_x = ∂f/∂x = (z7 + z8 + z9) - (z1 + z2 + z3)

and

g_y = ∂f/∂y = (z3 + z6 + z9) - (z1 + z4 + z7)

Sobel Operators

A slight variation in the Prewitt operators gives the Sobel operators, which use a weight of 2 in the center coefficient.

z1 z2 z3      -1 -2 -1      -1  0  1
z4 z5 z6       0  0  0      -2  0  2
z7 z8 z9       1  2  1      -1  0  1
Ideally, edge detection techniques yield pixels lying only on the boundaries between regions. In practice, this pixel set seldom characterizes a boundary completely because of

i) noise
ii) breaks in the boundary due to non-uniform illumination
iii) other effects that introduce spurious discontinuities in intensity values.

Thus, edge detection algorithms are usually followed by linking and other boundary detection procedures designed to assemble edge pixels into meaningful boundaries. The following techniques are used for edge linking and boundary detection.

1. Local processing
2. Global processing using Hough transform
3. Regional processing
Principal Properties
The two principal properties for establishing similarity of edge pixels in this kind of analysis are

1. The strength of the response of the gradient operator used to produce the edge pixels.
2. The direction of the gradient.

α(x, y) = tan⁻¹(g_y / g_x)    ... (4.1)

An edge pixel with coordinates (s, t) in a neighborhood S_xy is similar in magnitude to the pixel at (x, y) if

|M(s, t) - M(x, y)| ≤ E    ... (4.2)

where E is a positive threshold.

The direction angle of the gradient is given by equation (4.1) above. An edge pixel with coordinates (s, t) in S_xy has an angle similar to the pixel at (x, y) if

|α(s, t) - A| ≤ T_A

where,
T_A is an angle threshold
A is a specified angle direction
±T_A defines a band of acceptable directions about A

3. Scan the rows of g and fill (set to 1) all gaps (sets of 0s) in each row that do not exceed a specified length, K. A gap is bounded at both ends by one or more 1s. The rows are processed individually, with no memory between them.

4. To detect gaps in any other direction θ, rotate g by this angle and apply the horizontal scanning procedure in step 3. Rotate the result back by -θ.

When interest lies in horizontal and vertical edge linking, step 4 becomes a simple procedure in which g is rotated ninety degrees, the rows are scanned, and the result is rotated back.
Fig. 4.8. Illustration of the Iterative Polygonal Fit Algorithm ((a)-(d) show successive stages of fitting a polygon with vertices A through F to a curve)
Algorithm
An algorithm for finding a polygonal fit to open and closed curves will have the
following steps.
Step 1 : Let P be a sequence of ordered, distinct, 1-valued points of a binary image. Specify two starting points, A and B. These are the two starting vertices of the polygon.

Step 2 : Specify a threshold T and two empty stacks, OPEN and CLOSED.

Step 3 : If the points in P correspond to a closed curve, put A into OPEN and put B into OPEN and into CLOSED. If the points correspond to an open curve, put A into OPEN and B into CLOSED.

Step 4 : Compute the parameters of the line passing from the last vertex in CLOSED to the last vertex in OPEN.

Step 5 : Compute the distances from the line in step 4 to all the points in P whose sequence places them between the vertices from step 4. Select the point V_max with the maximum distance, D_max.
Method 1

First find all lines determined by every pair of points. Next find all subsets of points that are close to particular lines.

Drawback

This approach involves finding n(n - 1)/2 ~ n² lines and then performing (n)(n(n - 1))/2 ~ n³ comparisons of every point to all lines. Therefore, it is not a preferred method.

Method 2

The Hough Transform is an alternative approach to method 1.

4.3.3.1. Hough Transform

Consider a point (x_i, y_i) in the xy-plane and the general equation of a straight line in slope-intercept form, y_i = a x_i + b.

Infinitely many lines pass through (x_i, y_i), but they all satisfy the equation y_i = a x_i + b for varying values of a and b.

A single line for a fixed pair (x_i, y_i) in the parameter space, or ab-plane, can be written as

b = -x_i a + y_i
Consider a second point (x_j, y_j), which also has a line in parameter space associated with it. Unless they are parallel, this line intersects the line associated with (x_i, y_i) at some point (a', b'), where a' is the slope and b' is the intercept of the line containing both (x_i, y_i) and (x_j, y_j) in the xy-plane.

[Figure: Two lines in the ab parameter space, b = -x_i a + y_i and b = -x_j a + y_j, intersecting at (a', b').]

Accumulator Cells

An important property of the Hough Transform is that the parameter space can be subdivided into cells called accumulator cells. This is shown in the figure below.

[Figure: Subdivision of the (ρ, θ) parameter plane into accumulator cells, spanning θ_min to θ_max and ρ_min to ρ_max.]

The cell at coordinates (i, j) with accumulator value A(i, j) corresponds to the square associated with parameter space coordinates (ρ_i, θ_j).

Initially, these cells are set to zero. Then, for every non-background point (x_k, y_k) in the xy-plane, the parameter θ is allowed to take each of the allowed subdivision values. We then solve for the corresponding ρ using the equation ρ = x_k cos θ + y_k sin θ.

The resulting ρ values are rounded off to the nearest allowed cell value along the ρ-axis.

If a choice of θ_q results in solution ρ_p, then we let A(p, q) = A(p, q) + 1. The number of subdivisions in the ρθ-plane determines the accuracy of the colinearity of these points.
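A compact sketch of the accumulator-cell voting described above, using the normal parameterization ρ = x cos θ + y sin θ; the numbers of ρ and θ subdivisions are arbitrary illustrative choices.

import numpy as np

def hough_lines(edge_img, n_theta=180, n_rho=200):
    """Accumulate votes A(rho, theta) over a binary edge image."""
    h, w = edge_img.shape
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta)
    rho_max = np.hypot(h, w)
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    A = np.zeros((n_rho, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edge_img)                 # non-background points
    for x, y in zip(xs, ys):
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        A[idx, np.arange(n_theta)] += 1           # one vote per theta cell
    return A, rhos, thetas

# Peaks in A correspond to sets of collinear edge pixels.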
Hough transform is applicable to any function
of the form g (v, c) = 0,
where v is a vector of coordinates and
c is a vector of coefficients.
The Hough transform depends on the
number of coordinates and coefficients in a
given functional representation.
An approach based on the Hough Transform is as follows.

i) Obtain a binary edge image.
ii) Specify subdivisions in the ρθ-plane.
iii) Examine the counts of the accumulator cells for high pixel concentrations.
iv) Examine the relationship (for example, continuity) between pixels in a chosen cell.

Continuity in this case usually is based on computing the distance between disconnected pixels corresponding to a given accumulator cell. A gap in a line associated with a given cell is bridged if the length of the gap is less than a specified threshold.
Fig. 4.11. Intensity histograms that can be partitioned by (a) a single threshold T, and (b) dual thresholds T1 and T2

4.4.1. Thresholding - Foundation

Suppose that the grey-level histogram corresponds to an image f(x, y), composed of dark objects on a light background, in such a way that object and background pixels have gray levels grouped into two dominant modes.

One obvious way to extract the objects from the background is to select a threshold T that separates these modes. Then any point (x, y) for which f(x, y) > T is called an object point; otherwise, the point is called a background point.

Segmentation is accomplished by scanning the image pixel by pixel and labelling each pixel as object or background, depending on whether the grey level is greater or less than the value of T.

g(x, y) = 1  if f(x, y) > T
        = 0  if f(x, y) ≤ T

Thresholding works well when the grey level histogram of the image separates the pixels of the object and the background into two dominant modes. Then a threshold T can be easily chosen between the modes.

When there are more than two dominant modes, the histogram has to be partitioned by multiple thresholds.

g(x, y) = a  if f(x, y) > T2
        = b  if T1 < f(x, y) ≤ T2
        = c  if f(x, y) ≤ T1
In turn, the key factors affecting the properties of the valley(s) are
1. The separation between peaks
2. The noise content in the image
3. The relative sizes of objects and background.
4. The uniformity of the illumination source and
5. The uniformity of the reflectance properties of the image.
If T depends on both f(x, y) and a local property p(x, y), then this is referred to as local thresholding.

The following iterative algorithm can be used for this purpose.

The mean intensity value of the pixels assigned to class C1 is

m1(k) = Σ_{i=0}^{k} i P(C1/i) P(i)/P(C1)    ... (4.6)

where P1(k) is given in Eq. 4.4. The term P(i/C1) in the first line of Eq. 4.6 is the probability of value i, given that i comes from class C1.

The second line in the equation follows from Bayes' formula:

P(A/B) = P(B/A) P(A) / P(B)

The third line follows from the fact that P(C1/i), the probability of C1 given i, is 1, because we are dealing only with values of i from class C1.

Also, P(i) is the probability of the ith value, which is simply the ith component of the histogram, p_i. Finally, P(C1) is the probability of C1, which we know from Eq. 4.4 is equal to P1(k).
The cumulative mean (average intensity) up to level k is given by

m(k) = Σ_{i=0}^{k} i p_i    ... (4.8)

and the average intensity of the entire image (i.e., the global mean) is given by

m_G = Σ_{i=0}^{L-1} i p_i    ... (4.9)

The validity of the following two equations can be verified by direct substitution of the preceding results:

P1 m1 + P2 m2 = m_G    ... (4.10)

and

P1 + P2 = 1    ... (4.11)

The between-class variance is

σ_B² = P1 (m1 - m_G)² + P2 (m2 - m_G)²    ... (4.14)

This expression can also be written as

σ_B² = P1 P2 (m1 - m2)² = (m_G P1 - m)² / [P1 (1 - P1)]    ... (4.15)

and

σ_B²(k) = [m_G P1(k) - m(k)]² / {P1(k) [1 - P1(k)]}    ... (4.17)

Then, the optimum threshold is the value, k*, that maximizes σ_B²(k):

σ_B²(k*) = max_{0 ≤ k ≤ L-1} σ_B²(k)    ... (4.18)

Otsu's algorithm may be summarized as follows:

1. Compute the normalized histogram of the input image. Denote the components of the histogram by p_i, i = 0, 1, 2, ..., L-1.

2. Compute the cumulative sums, P1(k), for k = 0, 1, 2, ..., L-1, using Eq. 4.4.

3. Compute the cumulative means, m(k), for k = 0, 1, 2, ..., L-1, using Eq. 4.8.

4. Compute the global intensity mean, m_G, using equation 4.9.

5. Compute the between-class variance, σ_B²(k), for k = 0, 1, 2, ..., L-1, using equation 4.17.

6. Obtain the Otsu threshold, k*, as the value of k for which σ_B²(k) is maximum. If the maximum is not unique, obtain k* by averaging the values of k corresponding to the various maxima detected.

7. Obtain the separability measure, η*, by evaluating Eq. 4.16 at k = k*.
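Otsu's algorithm above translates almost directly into array operations; a minimal sketch for an 8-bit image, where taking the first maximum is a simplification of the tie-breaking rule in step 6.

import numpy as np

def otsu_threshold(image, L=256):
    """Return the k* maximizing the between-class variance sigma_B^2(k) (Eq. 4.17)."""
    p = np.bincount(image.ravel(), minlength=L) / image.size   # normalized histogram p_i
    P1 = np.cumsum(p)                                          # cumulative sums P1(k)
    m = np.cumsum(np.arange(L) * p)                            # cumulative means m(k)
    mG = m[-1]                                                 # global mean
    with np.errstate(divide='ignore', invalid='ignore'):       # guard P1 = 0 or 1
        sigma_b2 = (mG * P1 - m) ** 2 / (P1 * (1.0 - P1))
    return int(np.argmax(np.nan_to_num(sigma_b2)))

# binary = (img > otsu_threshold(img)).astype(np.uint8)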
4.4.4. MULTIPLE THRESHOLDS
The thresholding method can be extended to an arbitrary number of thresholds, because the separability measure on which it is based also extends to an arbitrary number of classes. In the case of K classes, C1, C2, ..., C_K, the between-class variance generalizes to

σ_B² = Σ_{k=1}^{K} P_k (m_k - m_G)²    ... (4.19)

where

P_k = Σ_{i ∈ C_k} p_i    ... (4.20)

m_k = (1/P_k) Σ_{i ∈ C_k} i p_i    ... (4.21)

and m_G is the global mean. The K - 1 optimum thresholds are the values k1*, k2*, ..., k_{K-1}* that maximize the between-class variance:

σ_B²(k1*, k2*, ..., k_{K-1}*) = max_{0 < k1 < k2 < ... < k_{K-1} < L-1} σ_B²(k1, k2, ..., k_{K-1})    ... (4.22)

For three classes consisting of three intensity intervals, the between-class variance is given by

σ_B² = P1 (m1 - m_G)² + P2 (m2 - m_G)² + P3 (m3 - m_G)²    ... (4.23)

where

P1 = Σ_{i=0}^{k1} p_i,    P2 = Σ_{i=k1+1}^{k2} p_i,    P3 = Σ_{i=k2+1}^{L-1} p_i    ... (4.24)

The following relationships hold:

P1 m1 + P2 m2 + P3 m3 = m_G

and P1 + P2 + P3 = 1.

The procedure starts by selecting the first value of k1 (that value is 1 because looking for a threshold at 0 intensity makes no sense; also, keep in mind that the increment values are integers because we are dealing with intensities).

Next, k2 is incremented through all its values greater than k1 and less than L-1 (i.e., k2 = k1 + 1, ..., L - 2). Then k1 is incremented to its next value and k2 is again incremented through its range. The separability measure is

η(k1*, k2*) = σ_B²(k1*, k2*) / σ_G²    ... (4.28)

where σ_G² is the total image variance.
4.4.5. Variable Thresholding
Factors such as noise and nonuniform illumination play a major role in the performance of a thresholding algorithm. One of the simplest approaches to variable thresholding is to subdivide an image into nonoverlapping rectangles.

This approach is used to compensate for non-uniformities in illumination and/or reflectance. The rectangles are chosen small enough so that the illumination of each is approximately uniform.

Image subdivision generally works well when the objects of interest and the background occupy regions of reasonably comparable size. When this is not the case, the method typically fails, because of the likelihood of subdivisions containing only object or background pixels. Although this situation can be addressed by using additional techniques to determine when a subdivision contains both types of pixels, the logic required to address different scenarios can get complicated.

We illustrate the basic approach to local thresholding using the standard deviation and mean of the pixels in a neighbourhood of every point in an image. These two quantities are quite useful for determining local thresholds because they are descriptors of local contrast and average intensity. Let σ_xy and m_xy denote the standard deviation and mean value of the set of pixels contained in a neighbourhood, S_xy, centred at coordinates (x, y) in an image. The following are common forms of variable, local thresholds:

T_xy = a σ_xy + b m_xy    ... (4.29)

where a and b are nonnegative constants.

Fig. 4.14. (a) Text image corrupted by spot shading. (b) Result of global thresholding using Otsu's method. (c) Result of local thresholding using moving averages.
each location (x, y). Arrays f and S are assumed to be of the same size. A basic region growing algorithm based on 8-connectivity may be stated as follows.

Step 1 : Find all connected components in S(x, y) and erode each connected component to one pixel; label all such pixels found as 1. All other pixels in S are labeled 0.

Step 2 : Form an image f_Q such that, at a pair of coordinates (x, y), f_Q(x, y) = 1 if the input image satisfies the given predicate Q at those coordinates; otherwise f_Q(x, y) = 0.

Step 3 : Let g be an image formed by appending to each seed point in S all the 1-valued points in f_Q that are 8-connected to that seed point.

Step 4 : Label each connected component in g with a different region label. This is the segmented image obtained by region growing.

However, starting with a particular seed pixel and letting this region grow completely before trying other seeds biases the segmentation in favour of the regions which are segmented first.

This can have several undesirable effects.

* The current region dominates the growth process - ambiguities around edges of adjacent regions may not be resolved correctly.
* Different choices of seeds may give different segmentation results.
* Problems can occur if the (arbitrarily chosen) seed point lies on an edge.

Simultaneous Region Growing

To counter the above problems, simultaneous region growing techniques have been developed.

* Similarities of neighboring regions are taken into account in the growing process.
* No single region is allowed to completely dominate the proceedings.
* A number of regions are allowed to grow at the same time.
* Similar regions will gradually coalesce into expanding regions.

Advantages

* Easy and efficient to implement on parallel computers.
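A small sketch of seeded region growing with 8-connectivity; the breadth-first traversal and the example predicate Q are illustrative choices, not the text's algorithm verbatim.

import numpy as np
from collections import deque

def region_grow(f, seeds, predicate):
    """Grow 8-connected regions from seed points; predicate(f, x, y) decides
    whether a pixel may join a region (e.g. an absolute-difference test)."""
    labels = np.zeros(f.shape, dtype=np.int32)
    for label, (sx, sy) in enumerate(seeds, start=1):
        q = deque([(sx, sy)])
        labels[sx, sy] = label
        while q:
            x, y = q.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nx, ny = x + dx, y + dy
                    if (0 <= nx < f.shape[0] and 0 <= ny < f.shape[1]
                            and labels[nx, ny] == 0 and predicate(f, nx, ny)):
                        labels[nx, ny] = label
                        q.append((nx, ny))
    return labels

# Example predicate Q: pixel intensity within 20 levels of the seed intensity
# labels = region_grow(img, [(50, 60)], lambda f, x, y: abs(int(f[x, y]) - int(f[50, 60])) < 20)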
Fig. 4.15. (a) Whole image R, (b) first split into R1, R2, R3, R4, (c) second split of R4 into R41, R42, R43, R44.

Quadtree

We can also describe the splitting of the image using a tree structure called a quadtree, that is, a tree in which each node has exactly four descendants.

The images corresponding to the nodes of a quadtree are sometimes called quadregions or quadimages.

[Figure: Quadtree corresponding to the split of Fig. 4.15 - root R with children R1, R2, R3, R4, and R4 with children R41, R42, R43, R44.]

If the process is stopped only with splitting, the result may have adjacent regions with identical properties. Therefore, further merging as well as splitting is needed. Two adjacent regions R_i and R_j are merged only when their combined pixels satisfy the predicate Q, that is, only if

Q(R_i ∪ R_j) = TRUE
Furthermore, the morphological operations can be used for filtering, thinning and pruning. This is a middle-level image processing technique in which the input is an image but the output is attributes extracted from those images.

The language of morphology comes from set theory, where image objects can be represented by sets. For example, an image object containing black pixels can be considered a set of black pixels in the 2-D space Z², where each element of the set is a tuple (2-D vector) whose coordinates are the (x, y) coordinates of a pixel in the image.

4.6.1. BASICS OF SET THEORY

If A is a subset of B, we write A ⊆ B. The union of two sets A and B is represented as

C = A ∪ B

The intersection of the sets A and B is the set of elements belonging to both A and B, and is represented as

D = A ∩ B

If there are no common elements in A and B, then the sets are called disjoint sets, represented as

A ∩ B = ∅

where ∅ is the name of the set with no members (the empty set).

The complement of a set A is the set of elements in the image not contained in A:

Aᶜ = {w | w ∉ A}

The difference of two sets A and B is denoted by

A - B = {w | w ∈ A, w ∉ B}

The reflection of a set B, denoted by B̂, is defined as

B̂ = {w | w = -b, for b ∈ B}

The translation of a set B by point z, denoted (B)_z, is also used.

Fig. 4.17. (a) A set, (b) its reflection and (c) its translation by z
Erosion
Erosion shrinks an image object. The basic effect of erosion is to erode away the boundaries of foreground pixels; thus the area of foreground pixels shrinks in size and holes within those areas become larger.

Mathematically, the erosion of set A by B, denoted A ⊖ B, is defined as

A ⊖ B = {z | (B)_z ⊆ A}

This equation indicates that the erosion of A by B is the set of all points z such that B, translated by z, is contained in A.

Characteristics

* It generally decreases the size of objects and removes small anomalies by subtracting objects with a radius smaller than the structuring element.
* With gray scale images, erosion reduces the brightness of bright objects on a dark background by taking the neighborhood minimum when passing the structuring element over the image.

Example

Fig. 4.18. (a) Set A (b) Square structuring element B (c) Erosion of A by B, shown shaded (d) Elongated structuring element (e) Erosion of A by B using this element

Size

The structuring element can be, for example, a 3 x 3 or a 21 x 21 square.
Dilation

The dilation of A by B, denoted A ⊕ B, is based on reflecting B about its origin and shifting this reflection by z. The dilation is then the set of all displacements z such that B̂ and A overlap by at least one element. The set B is referred to as the structuring element in the dilation.

Fig. 4.19. (a) Set A, (b) Square structuring element, (c) Dilation of A by B, shown shaded.

Therefore, all the points inside this boundary constitute the dilation of A by B. Dilation has an advantage over low-pass filtering in that the morphological method results directly in a binary image, whereas low-pass filtering converts the image into a gray scale one, which would require a pass with a thresholding function to convert it back to binary form.
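Binary erosion and dilation can be sketched directly from the set definitions above; dilation is obtained here through the duality proved in the next subsection, and the border handling is a simplifying assumption.

import numpy as np

def erode(A, B):
    """Binary erosion A ⊖ B: output is 1 where B, translated to that position,
    fits entirely inside the foreground of A."""
    m, n = B.shape
    a, b = m // 2, n // 2
    Ap = np.pad(A.astype(bool), ((a, a), (b, b)), constant_values=False)
    out = np.zeros(A.shape, dtype=bool)
    for x in range(A.shape[0]):
        for y in range(A.shape[1]):
            out[x, y] = np.all(Ap[x:x + m, y:y + n][B.astype(bool)])
    return out

def dilate(A, B):
    """Binary dilation A ⊕ B via duality: complement of the erosion of the
    complement of A by the reflected structuring element."""
    return ~erode(~A.astype(bool), np.rot90(B, 2))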
Proof

The erosion is

(A ⊖ B)ᶜ = {z | (B)_z ⊆ A}ᶜ

If set (B)_z is contained in A, then (B)_z ∩ Aᶜ = ∅, in which case the preceding expression becomes

(A ⊖ B)ᶜ = {z | (B)_z ∩ Aᶜ = ∅}ᶜ

But the complement of the set of z's that satisfy (B)_z ∩ Aᶜ = ∅ is the set of z's such that (B)_z ∩ Aᶜ ≠ ∅. Therefore,

(A ⊖ B)ᶜ = {z | (B)_z ∩ Aᶜ ≠ ∅} = Aᶜ ⊕ B̂
4.7. SEGMENTATION BY MORPHOLOGICAL WATERSHEDS

So far we have discussed segmentation based on three principal concepts:

(a) Edge detection
(b) Thresholding and
(c) Region growing

Each of these approaches was found to have advantages (for example, speed in the case of global thresholding) and disadvantages (for example, the need for post-processing, such as edge linking, in edge-based segmentation).

In this section, we discuss an approach based on the concept of so-called morphological watersheds. Segmentation by watersheds embodies many of the concepts of the other three approaches and often produces more stable segmentation results, including connected segmentation boundaries. This approach also provides a simple framework for incorporating knowledge-based constraints in the segmentation process.

* A hole is punched in each regional minimum and the entire topography is flooded from below by letting water rise through the holes at a uniform rate.
* When the rising water in distinct catchment basins is about to merge, a dam is built to prevent the merging.
* The flooding will eventually reach a stage when only the tops of the dams are visible above the water line.
* These dam boundaries correspond to the divide lines of the watersheds. Therefore, they are the connected boundaries extracted by a watershed segmentation algorithm.
[Figure: Dam construction by dilation - (a) a 3 x 3 structuring element of 1s with its origin, (b) the result of the first dilation, (c) the second dilation and the dam points.]

C_n(M_i) = C(M_i) ∩ T[n]

In other words, C_n(M_i) = 1 at location (x, y) if (x, y) ∈ C(M_i) AND (x, y) ∈ T[n]; otherwise C_n(M_i) = 0.

Next, we let C[n] denote the union of the flooded catchment basins at stage n:

C[n] = ∪_{i=1}^{R} C_n(M_i)

Then C[max + 1] is the union of all catchment basins:

C[max + 1] = ∪_{i=1}^{R} C(M_i)

The algorithm for finding the watershed lines is initialized with C[min + 1] = T[min + 1]. The algorithm then proceeds recursively, computing C[n] from C[n - 1].

Let Q denote the set of connected components in T[n]. Then, for each connected component q ∈ Q[n], there are three possibilities:

1. q ∩ C[n - 1] is empty.
2. q ∩ C[n - 1] contains one connected component of C[n - 1].
3. q ∩ C[n - 1] contains more than one connected component of C[n - 1].

Condition 1 occurs when a new minimum is encountered, in which case the connected component q is incorporated into C[n - 1] to form C[n].
7. What is edge? (Dec'13)

An edge is a set of connected pixels that lie on the boundary between two regions. Edges are more closely modeled as having a ramp-like profile. The slope of the ramp is inversely proportional to the degree of blurring in the edge.

8. Give the properties of the second derivative around an edge

* The sign of the second derivative can be used to determine whether an edge pixel lies on the dark or light side of an edge.
* It produces two values for every edge in an image.
* An imaginary straight line joining the extreme positive and negative values of the second derivative would cross zero near the midpoint of the edge.

9. Define Gradient Operator

First order derivatives of a digital image are based on various approximations of the 2-D gradient. The gradient of an image f(x, y) at location (x, y) is defined as the vector ∇f = [g_x, g_y]ᵀ. The magnitude (length) of vector ∇f is denoted as M(x, y), where

M(x, y) = mag(∇f) = √(g_x² + g_y²)

10. What is meant by object point and background point?

To extract the objects from the background, we select a threshold T that separates the intensity modes. Then any point (x, y) for which f(x, y) > T is called an object point; otherwise, the point is called a background point.
11. What is global threshold?

When the threshold T depends only on f(x, y), then the threshold is called global.

12. Define region growing.

Region growing is a procedure that groups pixels or sub-regions into larger regions based on predefined criteria. The basic approach is to start with a set of seed points and from these grow regions by appending to each seed those neighboring pixels that have properties similar to the seed.

13. Specify the steps involved in splitting and merging. (May'14)

Split into four disjoint quadrants any region R_i for which P(R_i) = FALSE. Merge any adjacent regions R_j and R_k for which P(R_j ∪ R_k) = TRUE. Stop when no further merging or splitting is possible.
5
Image Compression and Recognition
Need for data compression, Huffman, Run Length Encoding, Shift codes, Arithmetic coding, JPEG standard, MPEG. Boundary representation, Boundary description, Fourier Descriptor, Regional Descriptors - Topological feature, Texture - Patterns and Pattern classes - Recognition based on matching.
5.1. INTRODUCTION
Image compression is the art and science of reducing the amount of data required to represent an image. It is one of the most useful and commercially successful technologies in the field of digital image processing.
The number of images that are compressed and decompressed daily is staggering, and the compressions and decompressions themselves are virtually invisible to the user.
Types
* Televideo conferencing
* Remote sensing
* Medical imaging
* Facsimile transmission (FAX)
* Control of remotely piloted vehicles in space and military applications
The term data compression refers to the process of reducing the amount of data required to represent a given quantity of information.
Data and information are not the same thing; data are the means by which information is conveyed, because various amounts of data can be used to represent the same amount of information.
Representations that contain irrelevant or repeated information are said to contain data redundancy or redundant data. If this redundancy is removed, then compression can be achieved.
Case 3
If b' >> b, then C → 0 and R → −∞; the second set contains far more data than the original set. This case is undesirable.
where
L is the number of intensity values,
n_k is the number of times that the kth intensity appears in the image.
The average length of the code words assigned to the various intensity values is found by summing the products of the number of bits used to represent each intensity and the probability that the intensity occurs.
The total number of bits required to represent an M × N image is MNL_avg.
Coding redundancy can be avoided, and compression achieved, by a variable-length coding method. This method assigns fewer bits to more probable intensity values and more bits to less probable intensity values.
A natural binary encoding assigns the same number of bits to both the most and least probable values, failing to minimize L_avg and resulting in coding redundancy.
The encoder performs compression and the decoder performs the complementary operation of decompression. A codec is a device or program that is capable of both encoding and decoding.
Fig. Functional block diagram of a general compression system: the encoder passes f(x, y) (or f(x, y, t)) through a mapper, quantizer and symbol coder to produce the compressed data for storage and transmission; the decoder applies a symbol decoder and inverse mapper to recover the reconstructed image.
5.4.1.1. Mapper
A mapper transforms f(x, y) into a format designed to reduce spatial and temporal redundancy. This operation is reversible and may or may not directly reduce the amount of data required to represent the image.
Run-length coding is an example of a mapping that normally yields compression in the first step of the encoding process.
In video applications, the mapper uses previous video frames to facilitate the removal of temporal redundancy.
5.4.1.2. Quantizer
5.4.1.3. Symbol Coder
Lossless compression is used in cases where it is important that the original and the decompressed data be identical, or where deviations from the original data could be deleterious.
Applications
1. Digital radiography
2. Satellite imaging
Step 1
The table below illustrates this process for binary coding. On the left, a hypothetical set of source symbols and their probabilities are listed in decreasing order of probability:

    {a1, a2, a3, a4, a5, a6} with probabilities {0.1, 0.4, 0.06, 0.1, 0.04, 0.3}

Table 5.1. Huffman source reductions (at each stage the two least probable symbols are combined into one).
Step 2
In this step, each reduced source is coded. Coding starts from the smallest source obtained in the last step and works back to the original source. The minimal-length binary codes used are 0 and 1.
Table 5.2. Huffman code assignment procedure (for example, a2 with probability 0.4 receives the single-bit code 1, while the least probable symbol, with probability 0.04, receives the five-bit code 01011).
The reduced symbols 0.6 and 0.4 in the last column are assigned 0 and 1 first. Since 0.6 was generated by combining two symbols in the reduced source to its left, a 0 and a 1 are appended to its code to differentiate them from each other, which produces the codes 00 and 01.
Then, a 0 and a 1 are arbitrarily appended to 01, since its symbol 0.3 was generated by adding 0.2 and 0.1 in the second column. This produces the codes 010 and 011.
This operation is repeated for each reduced source until the original source is reached. The average length of this code is

    L_avg = (0.4)(1) + (0.3)(2) + (0.1)(3) + (0.1)(4) + (0.06)(5) + (0.04)(5) = 2.2 bits/pixel

and the entropy of the source is 2.14 bits/symbol.
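As a cross-check of the assignment procedure above, the following Python sketch (an assumption, not part of the original text) builds a Huffman code for the same six probabilities with the standard heap-based algorithm and reports the resulting average length. Individual code words may differ from Table 5.2, since ties and 0/1 assignments are arbitrary, but the average length is still 2.2 bits/symbol.

    import heapq

    def huffman_code(probs):
        """Build a Huffman code for {symbol: probability}; return {symbol: codeword}."""
        # Each heap entry: (probability, unique tie-breaker, {symbol: partial codeword})
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)   # two least probable groups
            p2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + w for s, w in c1.items()}   # prepend 0 to one group
            merged.update({s: "1" + w for s, w in c2.items()})  # and 1 to the other
            heapq.heappush(heap, (p1 + p2, count, merged))
            count += 1
        return heap[0][2]

    probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
    code = huffman_code(probs)
    l_avg = sum(probs[s] * len(w) for s, w in code.items())
    print(code)
    print("average length =", l_avg, "bits/symbol")   # 2.2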
Decoding
Huffman's procedure creates the optimal code for a set of symbols and probabilities, subject to the constraint that the symbols be coded one at a time. After the code has been created, coding and/or error-free decoding is accomplished with a simple lookup table.
In Huffman coding, each source symbol is mapped into a fixed sequence of code symbols. Also, each code word can be instantaneously decoded in a unique way without referring to succeeding symbols. Therefore, it is called an instantaneous, uniquely decodable block code.
Example
For the binary code of the above table, a left-to-right scan of the encoded string 01010011110 reveals that the first valid code word is 01010, which is the code for symbol a3. The next valid code is 011, which corresponds to symbol a1. Continuing in this manner reveals the completely decoded message to be a3 a1 a2 a2 a6.
Advantages
1. It creates an optimal code for a set of symbols and probabilities.
2. Coding/decoding can be done in a simple lookup table manner.
3. Implementation is very simple.
Drawbacks
Consider a text source: RTAAA ASDEEEEE
Coding its symbols with the binary encoder and the truncated Huffman encoder gives:

    Symbol   Probability   Code
    Ax       0.35          00
    A0       0.3           01
    A2       0.15          0010
    A3       0.1           0011
    A4       0.08          0100
    A5       0.06          0101
    A6       0.05          0110
    A7       0.04          0111
    A8       0.02          1000

    H = 2.778 bits/symbol (the fixed-length binary encoder needs 4 bits/symbol)
The 6 less probable source symbols are assigned the Huffman code of the hypothetical symbol Ax concatenated with a natural binary code of length 3.
The average length of the truncated Huffman code is

    L_avg = Σ p_i l_i = 0.65 × 2 + 0.24 × 4 + 0.11 × 6 = 2.92 bits/symbol

Thus the efficiency of this code is

    η = H / L_avg = 2.778 / 2.92 = 95.14%
This example demonstrates that the efficiency of the Binary shift encoder is higher
than that of the binary encoder.
Applying the Huffman code to the same source, we get the following code words:

    Symbol   Probability   Code
    A1       0.2           01
    A2       0.15          100
    A3       0.1           110
    A4       0.08          1010
    A5       0.06          1011
    A6       0.05          1110
    A7       0.04          11110
    A8       0.02          11111

    H = 2.778,  L_avg = 2.81

    η = H / L_avg = 2.778 / 2.81 = 98.86%
Procedure
Consider a source with four symbols, a1, a2, a3 and a4. Now, a message with five symbols, a1 a2 a3 a3 a4, is required to be coded.
Step 1: The message is assumed to occupy the entire half-open interval [0, 1).
Step 2: This interval is subdivided initially into four regions based on the probabilities of each source symbol.
Table 5.3. Arithmetic Coding Example
Step 3: The first symbol, a1, narrows the message interval to the initial subinterval [0, 0.2).
Step 4: The interval [0, 0.2) is subdivided according to the probabilities of the source symbols, and the next symbol, a2, selects its subinterval.
Step 6: The next message symbol, a3, narrows the interval to [0.0624, 0.0688):

    0.056 + [(0.072 − 0.056) × 0.8] = 0.0688
    0.056 + [(0.072 − 0.056) × 0.4] = 0.0624

Step 7: Finally, the last message symbol, a4, which is reserved as a special end-of-message indicator, narrows the range to [0.06752, 0.0688):

    0.0624 + [(0.0688 − 0.0624) × 1]   = 0.0688
    0.0624 + [(0.0688 − 0.0624) × 0.8] = 0.06752

Fig. 5.3. Arithmetic coding procedure: the message interval shrinks successively from [0, 1) to [0, 0.2), [0.04, 0.08), [0.056, 0.072), [0.0624, 0.0688) and finally to a range containing 0.06752.
In the arithmetically coded message of the above figure, three decimal digits are used to represent the five-symbol message a1 a2 a3 a3 a4.
Here, the number of source symbols is 5 and the number of decimal digits used to represent the message is 3.

    ∴ Number of decimal digits per source symbol = 3/5 = 0.6 digits/symbol
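The interval narrowing of Steps 1-7 can be reproduced with a few lines of Python. The sketch below is illustrative only; it assumes the probability model a1 = 0.2, a2 = 0.2, a3 = 0.4, a4 = 0.2 (subintervals [0, 0.2), [0.2, 0.4), [0.4, 0.8), [0.8, 1.0)), which is consistent with the numbers above but is not reproduced in Table 5.3 here.

    # Cumulative ranges for each symbol, assumed from the worked numbers above.
    ranges = {
        "a1": (0.0, 0.2),
        "a2": (0.2, 0.4),
        "a3": (0.4, 0.8),
        "a4": (0.8, 1.0),   # also used as the end-of-message indicator
    }

    def arithmetic_encode(message):
        """Return the final [low, high) interval for the message."""
        low, high = 0.0, 1.0
        for sym in message:
            width = high - low
            s_low, s_high = ranges[sym]
            # Narrow the current interval to the symbol's subinterval.
            low, high = low + width * s_low, low + width * s_high
        return low, high

    low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"])
    print(low, high)   # about 0.06752 and 0.0688; any number in this range encodes the message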
Limitations
There are two practical factors which affect the performance efficiency of
arithmetic coding. They are
1. The addition of the end-of-message indicator that is needed to separate one message from another.
2. The use of finite-precision arithmetic.
Practical implementations of arithmetic coding address the latter problem by introducing a scaling strategy and a rounding strategy.
Scaling Strategy
The scaling strategy renormalizes each subinterval to the [0, 1] range before subdividing it in accordance with the symbol probabilities.
Rounding Strategy
The rounding strategy is used to represent the coding subintervals accurately by preventing the truncation effects of finite-precision arithmetic.
The encoder examines the intensity values of the image one pixel at a time; intensity sequences that are not in the dictionary are placed in algorithmically determined locations, which may be the next unused locations.
For example, if the first two pixels of the image are white, which corresponds to the intensity level 255, the sequence 255-255 may be placed in location 256, i.e., the next unused location.
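A compact way to see this dictionary growth is the classic LZW encoding loop. The Python sketch below is a simplified illustration (it assumes 8-bit intensities, so the dictionary is seeded with the single-symbol entries 0-255 and new sequences are appended at code 256 onwards); it is not the exact variant used by GIF or TIFF.

    def lzw_encode(pixels):
        """Encode a 1-D sequence of 8-bit intensities with a basic LZW scheme."""
        dictionary = {(i,): i for i in range(256)}   # single intensities occupy codes 0..255
        next_code = 256                              # first unused dictionary location
        current = ()
        output = []
        for p in pixels:
            candidate = current + (p,)
            if candidate in dictionary:
                current = candidate                  # keep growing the recognized sequence
            else:
                output.append(dictionary[current])   # emit code for the recognized sequence
                dictionary[candidate] = next_code    # e.g. (255, 255) is stored at 256
                next_code += 1
                current = (p,)
        if current:
            output.append(dictionary[current])
        return output

    print(lzw_encode([255, 255, 255, 255, 39, 39, 126, 126]))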
The size of the dictionary is an important system parameter. If it is too small, the detection of matching intensity-level sequences will be less likely, and if it is too large the size of the code words will adversely affect compression performance.
The decoding of the image is performed by reconstructing the code book or dictionary. A unique feature of LZW coding is that the coding dictionary or code book is created while the data are being encoded.
An LZW decoder builds an identical decompression dictionary as it simultaneously decodes the encoded data stream. In most practical applications, dictionary overflow is the most critical problem. There are three ways to handle it:
1. Flush or reinitialize the dictionary when it becomes full and continue coding with a new initialized dictionary.
2. Monitor compression performance and flush the dictionary when it becomes poor or unacceptable.
3. Track the least used dictionary entries and replace them when necessary.
Advantages
LZW technique is simple
LZW compression has been integrated
into a variety of mainstream imaging
file formats, including
1. Graphics Interchange Format (GIF)
2. Tagged Image File Format (TIFF)
3. Portable Document Format (PDF)
The PNG format was created to get around LZW licensing requirements.
Disadvantages
Small changes in intensity can have a significant impact on the complexity of the bit planes.
    g_i = a_i ⊕ a_{i+1},   0 ≤ i ≤ m − 2
    g_{m−1} = a_{m−1}                                  ... (5.5)

Here ⊕ denotes the exclusive OR operation. This code has the unique property that successive code words differ in only one bit position.
Advantages
Small changes in intensity are less likely to affect all m bit planes.
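As a small illustration of equation (5.5), the following Python function (a hypothetical helper, not from the text) converts an m-bit binary code word into the corresponding Gray code word, so that numerically adjacent intensities differ in a single bit plane.

    def binary_to_gray(value, m=8):
        """Convert an m-bit intensity to its Gray code: g = a XOR (a >> 1)."""
        gray = value ^ (value >> 1)      # implements g_i = a_i XOR a_(i+1), g_(m-1) = a_(m-1)
        return format(gray, f"0{m}b")

    # Intensities 127 and 128 differ in all 8 bit planes in natural binary,
    # but their Gray codes differ in only one bit position.
    print(format(127, "08b"), format(128, "08b"))    # 01111111 10000000
    print(binary_to_gray(127), binary_to_gray(128))  # 01000000 11000000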
Fig. 5.4. Lossless Predictive Coding Model (a) Encoder (b) Decoder
5.8.1. ENCODER
An encoder contains three different kinds of components:
1. Predictor
2. Nearest integer (rounder)
3. Symbol encoder
Predictor
Successive samples of the discrete-time input signal f(n) are fed to the predictor, which generates the anticipated value of each sample based on a specified number of past samples.
Nearest Integer
The output of the predictor is rounded to the nearest integer, denoted f̂(n), and used to form the difference or prediction error.
Symbol Encoder
The difference between f(n) and f̂(n), that is e(n) = f(n) − f̂(n), can be encoded using a variable-length code by the symbol encoder. The symbol encoder generates the next element of the compressed data stream, which is the output of the encoder.
5.8.2. DECODER
The decoder contains the following two components:
1. Symbol decoder
2. Predictor
Symbol Decoder
It performs the opposite action of the symbol encoder. It reconstructs e(n) from the received variable-length code words and performs the inverse operation

    f(n) = e(n) + f̂(n)        ... (5.7)

to decompress or recreate the original input sequence.
Various local, global and adaptive methods can be used to generate f̂(n).
Predictor
The prediction is formed as a linear combination of m previous samples:

    f̂(n) = round [ Σ (i = 1 to m) α_i f(n − i) ]        ... (5.8)

where
round denotes the rounding or nearest-integer operation,
m is the order of the linear predictor,
α_i, for i = 1, 2, ..., m, are prediction coefficients.
If the input sequence in the encoder is considered to be samples of an image, the f(n) in the above three equations are pixels, and the m samples used to predict the value of each pixel can be taken in one of the following three ways.
1. If each pixel comes from the current scan line, it is called 1-D linear predictive coding.
2. If each pixel comes from the current and previous scan lines, it is called 2-D linear predictive coding.
3. If each pixel comes from the current image and previous images in a sequence of images, it is called 3-D linear predictive coding.
Thus, for 1-D linear predictive image coding, f̂(n) can be written as

    f̂(x, y) = round [ Σ (i = 1 to m) α_i f(x, y − i) ]        ... (5.9)

where each sample is now expressed as a function of the input image's spatial coordinates x and y.
The above equation indicates that the 1-D linear prediction is a function of the previous pixels on the current line alone.
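To make equation (5.9) concrete, the short Python sketch below (an illustrative assumption using numpy, with a first-order predictor m = 1 and α_1 = 1, i.e. previous-pixel prediction) computes the prediction error for one scan line; the narrow, peaked distribution of e(n) is what makes variable-length coding of the error efficient.

    import numpy as np

    def predictive_encode_line(line, alpha=(1.0,)):
        """1-D lossless predictive coding of one scan line (equation 5.9, m = len(alpha))."""
        m = len(alpha)
        line = line.astype(int)
        pred = np.zeros_like(line)
        for x in range(m, len(line)):
            # f_hat(x) = round(sum_i alpha_i * f(x - i))
            pred[x] = int(round(sum(a * line[x - i - 1] for i, a in enumerate(alpha))))
        error = line - pred            # e(n) = f(n) - f_hat(n), entropy-coded in practice
        error[:m] = line[:m]           # first m samples are sent unpredicted
        return error

    def predictive_decode_line(error, alpha=(1.0,)):
        """Inverse operation: f(n) = e(n) + f_hat(n)."""
        m = len(alpha)
        line = np.array(error, dtype=int)
        for x in range(m, len(line)):
            line[x] = error[x] + int(round(sum(a * line[x - i - 1] for i, a in enumerate(alpha))))
        return line

    row = np.array([100, 102, 103, 103, 104, 120, 121, 121])
    e = predictive_encode_line(row)
    print(e)                                                   # mostly small values near zero
    print(np.array_equal(predictive_decode_line(e), row))      # True: lossless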
In 2-D predictive coding, the prediction is a function of the previous pixels in a left-to-right, top-to-bottom scan of an image. In 3-D predictive coding, it is based on these pixels and the previous pixels of preceding frames.
Lossy compression is a compression technique that does not decompress digital data back to 100% of the original. Lossy methods can provide a high degree of compression and result in smaller compressed files, but some of the original pixels, sound waves or video frames are removed forever. Examples are the widely used JPEG image, MPEG video and MP3 audio formats.
Lossy compression is never used for business data and text, which demand perfect restoration.
Table 5.4. Comparison between Lossless and Lossy compression

    S.No   Lossless Compression                           Lossy Compression
    1.     Used for text and data                         Used for audio and video
    2.     Compression ratio is less                      High compression ratio
    3.     Completely reversible                          It is not reversible
    4.     Compression is independent of human response   Compression depends upon the sensitivity of human eyes, ears, etc.

Fig. 5.5. A block transform coding system (a) encoder (b) decoder (symbol decoder → inverse transform → merge n × n subimages → decompressed image)
Encoder
The encoder performs four relatively straightforward operations:
1. Subimage decomposition
2. Transformation
3. Quantization
4. Coding
First, an M × N input image is subdivided into subimages of size n × n, which are transformed to generate MN/n² subimage transform arrays, each of size n × n.
The main goal of the transformation process is to decorrelate the pixels of each subimage, or to pack as much information as possible into the smallest number of transform coefficients.
The quantization stage selectively eliminates or more coarsely quantizes the coefficients that carry the least amount of information in a predefined sense. These coefficients have the smallest impact on reconstructed subimage quality.
The encoding process terminates by coding the quantized coefficients.
If any or all of the transform encoding steps are adapted to local image content, the method is called adaptive transform coding; if they are fixed for all subimages, it is called nonadaptive transform coding.
Decoder
The decoder performs the inverse operations of the encoder. The only difference is that there is no need for quantization.
There are three main issues to be taken care of during transform coding of an image:
1. Transform selection
2. Subimage size selection
3. Bit allocation
Transformation Kernels
The forward and inverse transformation kernels, also known as basis functions or basis images, determine the type of transform that is employed.
The forward and inverse transformation kernels are said to be separable if

    r(x, y, u, v) = r1(x, u) r2(y, v)
    s(x, y, u, v) = s1(x, u) s2(y, v)        ... (5.11)
The forward and inverse transformation kernels are said to be symmetric if r1 is functionally equal to r2:

    r(x, y, u, v) = r1(x, u) r1(y, v)
    s(x, y, u, v) = s1(x, u) s1(y, v)        ... (5.12)

The best-known transformation kernel pair is the Discrete Fourier Transform (DFT) kernel pair.
For the Walsh-Hadamard transform, n = 2^m. The summation in the exponent of this expression is performed in modulo-2 arithmetic, and b_k(z) is the kth bit (from right to left) in the binary representation of z.
If m = 3 and z = 6, for example, b_0(z) = 0, b_1(z) = 1 and b_2(z) = 1. The p_i(u) in the above equation are computed using

    p_0(u) = b_{m−1}(u)
    p_1(u) = b_{m−1}(u) + b_{m−2}(u)
    r(x, y, u, v) = s(x, y, u, v)
                  = α(u) α(v) cos[(2x + 1)uπ / 2n] cos[(2y + 1)vπ / 2n]        ... (5.15)

where

    α(u) = sqrt(1/n)   for u = 0
         = sqrt(2/n)   for u = 1, 2, ..., n − 1

and similarly for α(v).
Transform Matrix, G
An n × n subimage g(x, y) can be expressed as a function of its 2-D transform T(u, v):

    g(x, y) = Σ (u = 0 to n−1) Σ (v = 0 to n−1) T(u, v) s(x, y, u, v),   x, y = 0, 1, 2, ..., n − 1        ... (5.16)

The inverse kernel s(x, y, u, v) in the above equation depends only on the indices x, y, u, v and not on the values of g(x, y) or T(u, v).
This can be written in matrix form as

    G = Σ (u = 0 to n−1) Σ (v = 0 to n−1) T(u, v) S_uv        ... (5.17)

where
G is an n × n matrix containing the pixels of g(x, y), and S_uv is the n × n matrix whose (x, y) element is s(x, y, u, v), running from s(0, 0, u, v) ... s(0, n−1, u, v) in the first row down to s(n−1, 0, u, v) ... s(n−1, n−1, u, v) in the last.
Then G, the matrix containing the pixels of the input subimage, is explicitly defined as a linear combination of n² matrices of size n × n.
Approximation of Ĝ
If we define a transform coefficient masking function

    χ(u, v) = 0   if T(u, v) satisfies a specified truncation criterion
            = 1   otherwise

for u, v = 0, 1, 2, ..., n − 1, an approximation of G can be obtained from the truncated expansion

    Ĝ = Σ (u = 0 to n−1) Σ (v = 0 to n−1) χ(u, v) T(u, v) S_uv        ... (5.18)

where χ(u, v) is constructed to eliminate the basis images that make the smallest contribution to the total sum.
The mean-square error between subimage G and approximation Ĝ is then

    e_ms = E{ ||G − Ĝ||² }
         = E{ || Σ_u Σ_v T(u, v) S_uv − Σ_u Σ_v χ(u, v) T(u, v) S_uv ||² }
         = E{ || Σ_u Σ_v T(u, v) S_uv [1 − χ(u, v)] ||² }
         = Σ (u = 0 to n−1) Σ (v = 0 to n−1) σ²_{T(u,v)} [1 − χ(u, v)]        ... (5.19)

where
||G − Ĝ|| is the norm of the matrix (G − Ĝ), and
σ²_{T(u,v)} is the variance of the coefficient at transform location (u, v).
The mean-square errors of the MN/n² subimages of an M × N image are identical, so the mean-square error of the M × N image equals the mean-square error of a single subimage.
A transform such as the DFT, WHT or DCT, whose basis images are fixed, is normally used.
The figure below shows the impact of subimage size on the reconstruction error, i.e., the root-mean-square error, for the Fourier, Walsh-Hadamard and cosine transforms. All three curves intersect when 2 × 2 subimages are used. In this case, only one of the four coefficients (25%) of each transformed array was retained. The coefficient in all cases was the dc component, so the inverse transform simply replaced the four subimage pixels by their average value.

Fig. Root-mean-square reconstruction error versus subimage size for the FFT, WHT and DCT.
In most transform coding systems, the retained coefficients are selected on the
basis of maximum variance called Zonal Coding or on the basis of maximum
magnitude called Threshold Coding.
Fig. A typical zonal mask (1s marking the retained low-frequency coefficients) and the corresponding zonal bit allocation, in which the retained coefficients are allocated progressively fewer bits with increasing distance from the dc coefficient.
Zonal coding is implemented by using a single fixed mask for all subimages, whereas threshold coding is inherently adaptive, meaning that the locations of the transform coefficients retained for each subimage vary from one subimage to another.
Threshold coding is the adaptive transform coding approach most often used in practice because of its computational simplicity.
The concept of threshold coding is that, for any subimage, the transform coefficients of largest magnitude make the most significant contribution to reconstructed subimage quality.
The locations of the maximum coefficients vary from one subimage to another, so the elements of χ(u, v) T(u, v) normally are reordered to form a 1-D, run-length coded sequence.

Fig. A typical threshold mask for one subimage (1s mark the retained coefficients).

The mask summarizes the thresholding process for the corresponding subimage and mathematically describes the process of approximating Ĝ. When the mask is applied to the subimage for which it was derived, the resulting n × n array is reordered to form an n²-element coefficient sequence in accordance with the zigzag ordering pattern shown below.
     0   1   5   6  14  15  27  28
     2   4   7  13  16  26  29  42
     3   8  12  17  25  30  41  43
     9  11  18  24  31  40  44  53
    10  19  23  32  39  45  52  54
    20  22  33  38  46  51  55  60
    21  34  37  47  50  56  59  61
    35  36  48  49  57  58  62  63
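As an illustration of threshold coding followed by zigzag reordering (a sketch assuming an 8 × 8 block and a simple magnitude threshold; not the exact procedure of any particular codec), the Python fragment below keeps only the coefficients whose magnitude exceeds a threshold and then reads the masked block out in the zigzag order above, producing the long zero runs that run-length coding exploits.

    import numpy as np

    def zigzag_indices(n=8):
        """Return (x, y) positions of an n x n block in zigzag-scan order."""
        return sorted(((x, y) for x in range(n) for y in range(n)),
                      key=lambda p: (p[0] + p[1],
                                     p[0] if (p[0] + p[1]) % 2 else p[1]))

    def threshold_and_scan(T, thresh):
        """Zero coefficients below the magnitude threshold, then zigzag-scan the block."""
        chi = (np.abs(T) >= thresh).astype(int)     # masking function chi(u, v)
        masked = chi * T
        return [masked[x, y] for (x, y) in zigzag_indices(T.shape[0])]

    # Toy "transform" block whose energy decays with frequency, as a real DCT block would.
    decay = np.array([[50.0 / (1 + u + v) for v in range(8)] for u in range(8)])
    T = np.round(np.random.default_rng(0).normal(size=(8, 8)) * decay)
    print(threshold_and_scan(T, thresh=10)[:16])    # large values first, then mostly zero runs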
There are three basic ways to threshold a transformed subimage. They are:
1. A single global threshold can be applied to all subimages. Here, the level of compression differs from image to image, depending on the number of coefficients that exceed the global threshold.
The quantized coefficient T̂(u, v) assumes the integer value k if and only if

    kc − c/2 ≤ T(u, v) < kc + c/2        ... (5.21)

If Z(u, v) > 2T(u, v), then T̂(u, v) = 0 and the transform coefficient is completely truncated or discarded.
When T̂(u, v) is represented with a variable-length code whose length increases as the magnitude of k increases, the number of bits used to represent T(u, v) is controlled by the value of c.
Fig. 5.11. A threshold coding quantization curve: the quantized value T̂(u, v) plotted against T(u, v), with steps at multiples of c.
Table. JPEG normalization (quantization) array Z(u, v):

    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    14  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99
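A minimal sketch of how such a normalization array is used, assuming the standard JPEG-style procedure of dividing each DCT coefficient by the corresponding array entry and rounding (scipy's DCT routines are used here for brevity; this is an illustration, not a full codec):

    import numpy as np
    from scipy.fft import dctn, idctn

    Z = np.array([
        [16, 11, 10, 16, 24, 40, 51, 61],
        [12, 12, 14, 19, 26, 58, 60, 55],
        [14, 13, 16, 24, 40, 57, 69, 56],
        [14, 17, 22, 29, 51, 87, 80, 62],
        [18, 22, 37, 56, 68, 109, 103, 77],
        [24, 35, 55, 64, 81, 104, 113, 92],
        [49, 64, 78, 87, 103, 121, 120, 101],
        [72, 92, 95, 98, 112, 100, 103, 99],
    ])

    def jpeg_like_quantize(block):
        """Level-shift an 8x8 block, take its 2-D DCT and quantize with the array Z."""
        shifted = block.astype(float) - 128           # subtract 2^(k-1) for 8-bit data
        T = dctn(shifted, norm="ortho")               # 2-D DCT of the block
        return np.round(T / Z).astype(int)            # T_hat(u, v) = round(T(u, v) / Z(u, v))

    def jpeg_like_dequantize(That):
        """Approximate inverse: denormalize, inverse DCT and undo the level shift."""
        return idctn(That * Z, norm="ortho") + 128

    block = np.tile(np.arange(100, 180, 10), (8, 1))   # smooth toy 8x8 block
    That = jpeg_like_quantize(block)
    print(That)                                        # only a few nonzero low-frequency terms
    print(np.round(jpeg_like_dequantize(That)))        # close, but not identical, to the input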
5.11.1.1. Encoding
The figure below shows the basic functions of the encoding process.
The ability of the wavelet to pack information into a small number of transform coefficients determines its compression and reconstruction performance.
The most widely used expansion functions for wavelet-based compression are
1. Daubechies wavelets
2. Biorthogonal wavelets
5.11.3. DECOMPOSITION LEVEL SELECTION
The number of decomposition or transform levels is another important factor, which affects
1. Wavelet coding computational complexity
2. Reconstruction error
A P-scale fast wavelet transform involves P filter-bank iterations, so the number of operations in the computation of the forward and inverse transforms increases with the number of decomposition levels.
5.12.1. ENCODER
The figure below shows the functional block diagram of the lossy predictive coding encoder.

Fig. Lossy predictive coding encoder: the input sequence f(n) and the prediction f̂(n) form the error e(n), which is quantized to ė(n) and passed to the symbol encoder to produce the compressed sequence; the predictor sits in a feedback loop.

Quantizer
The quantizer is inserted between the symbol encoder and the point at which the prediction error is formed. It maps the prediction error into a limited range of outputs, denoted ė(n), which establishes the amount of compression and distortion that occurs.
Predictor
The predictions generated by the encoder and decoder must be equivalent. This is accomplished by placing the lossy encoder's predictor within a feedback loop. Its input is generated as a function of past predictions and the corresponding quantized errors.

Fig. 5.16. Lossy predictive coding decoder

The function and block diagram of this decoder are exactly the same as in figure 5.4 (b).
The error produced at the output of the decoder is avoided when the predictor uses

    f̂(n) = α ḟ(n − 1)        ... (5.24)

and
    Binary Still Image      Continuous Tone Still Image      Video
    1. CCITT Group 3        1. JPEG                          1. DV
    2. CCITT Group 4        2. JPEG-LS                       2. H.261
    3. JBIG or JBIG1        3. JPEG-2000                     3. H.262
    4. JBIG2                                                 4. H.263
                                                             5. H.264
                                                             6. MPEG-1
                                                             7. MPEG-2
                                                             8. MPEG-4
                                                             9. MPEG-4 AVC

Two video compression standards, VC-1 by the Society of Motion Picture and Television Engineers (SMPTE) and AVS by the Chinese Ministry of Information Industry (MII), are also included.
5.13.1. BINARY IMAGE COMPRESSION STANDARDS
Two of the oldest and most widely used image compression standards are the
CCITT Group 3 and Group 4 standards for binary image compression.
They have been used in a variety of computer applications and they were
originally designed as facsimile (FAX) coding methods for transmitting documents
Over telephone networks.
The Group 3 standard uses a 1-D run-length coding technique in which the last K − 1 lines of each group of K lines (for K = 2 or 4) can be optionally coded in a 2-D manner.
The Group 4 standard is a simplified or streamlined version of the Group 3 standard in which only 2-D coding is allowed.
Both Group 3 and Group 4 standards use the same 2-D coding approach, which is two-dimensional in the sense that information from the previous line is used to encode the current line.
5.13.1.1. One-dimensional CCITT compression
The 1-D compression approach is adopted only for the CCITT Group 3 standard. Here each line of an image is encoded as a series of variable-length Huffman code words.
These code words represent the run lengths of alternating white and black runs in a left-to-right scan of the line. The compression method employed is commonly referred to as Modified Huffman (MH) coding. It has two types of codes:
1. Terminating codes
2. Makeup codes
Depending on the run-length value, the two types of variable-length code words are used as follows (a small sketch follows this list):
1. If run length < 63, a modified Huffman code is used as a terminating code.
2. If run length > 63, two codes are used: a makeup code for the quotient [r/64] and a terminating code for the remainder r mod 64.
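A small illustrative helper (hypothetical, not from the standard's text) showing how a run length is split into a makeup part and a terminating part before the MH code words are looked up:

    def split_run_length(r):
        """Split a run length into (makeup multiple of 64, terminating remainder)."""
        if r < 64:
            return None, r              # short runs need only a terminating code
        return 64 * (r // 64), r % 64   # e.g. 200 -> makeup 192, terminating 8

    print(split_run_length(20))    # (None, 20)
    print(split_run_length(200))   # (192, 8)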
Fig. CCITT 2-D coding flowchart: a new coding line is started and a0 is placed before the first pixel; the changing elements b1, b2 and a1 are detected, and the tests "is b2 to the left of a1?" and "|a1b1| ≤ 3?" select the pass, vertical or horizontal coding mode, repeating until the end of the coding line is reached.
Fig. 5.18. CCITT (a) pass mode and (b) horizontal and vertical mode coding parameters (reference line, coding line and the changing elements a0, a1, a2, b1, b2).
After identification of the current and reference changing elements, two simple tests are performed to select one of three possible coding modes:
1. Pass mode
2. Vertical mode
3. Horizontal mode
Test 1: Compare the location of b2 with respect to a1.
Test 2: Compute the distance between the a1 and b1 locations and compare that distance against 3.
Depending on the outcome of these tests, one of the three outlined coding blocks is selected, and a new reference element is established for the next coding iteration.
The table above defines the specific codes utilized for each of the three possible coding modes.
Pass Mode
Pass mode specifically excludes the case in which b2 is directly above a1.
JBIG2 standard
JBIG2 is an international standard for bi-level image compression.
In the baseline JPEG standard, the quantized DCT values are restricted to 11 bits.
Compression Procedure
The diagram below shows the basic functions of the JPEG encoder.

Fig. JPEG encoder: source image → DCT → quantizer (quantization table) → entropy encoder (Huffman table) → compressed image data.

DCT computation
First, the image is subdivided into pixel blocks of size 8 × 8. All the subimages are processed left to right and top to bottom.
Each subimage has 8 × 8 = 64 pixels, which are level-shifted by subtracting the quantity 2^(k−1), where 2^k is the maximum number of intensity levels.
Then the 2-D discrete cosine transform (DCT) of each block is computed.
Quantization
Decoding
The figure below shows the basic functional blocks of the JPEG decoder, which uses the same quantization and Huffman tables to invert the encoding steps.

Fig. JPEG decoder (quantization table, Huffman table).
Advantages
1. The reordering of quantized coefficients may result in long runs of zeros.
2. Instead of the default coding tables and quantization arrays, the user is allowed to construct custom tables and/or arrays according to the image characteristics.
5.13.2.2. JPEG-LS Standard
It is a lossless to near-lossless standard for continuous-tone images, based on adaptive prediction, context modeling and Golomb coding.
JPEG-2000 Encoding
The encoding process of the JPEG 2000 standard has the following steps.
Step 1: Level Shifting
The first step of the encoding process is to DC level shift the samples of the n-bit unsigned image to be coded by subtracting 2^(n−1).
If the image has more than one component, like the Red, Green and Blue planes of a color image, then each component is shifted individually.
Step 3: Tiling
After the image has been level shifted and optionally decorrelated, its components can be divided into tiles.
Tiles are rectangular arrays of pixels that are processed independently. Because an image can have more than one component, the tiling process creates tile components.
Each tile component can be reconstructed independently, providing a simple mechanism for accessing and/or manipulating a limited region of a coded image.
Step 4: Transformation
The 1-D discrete wavelet transform of the rows and columns of each tile component is computed.
For error-free compression, the transform is based on a biorthogonal 5-3 coefficient scaling and wavelet vector. A rounding procedure is defined for non-integer-valued transform coefficients.
In lossy applications, a 9-7 coefficient scaling and wavelet vector is employed.
In either case, the transform is computed using a complementary lifting-based approach, which involves six sequential "lifting" and "scaling" operations:

    Y(2n + 1) = X(2n + 1) + α [X(2n) + X(2n + 2)],        i0 − 3 ≤ 2n + 1 < i1 + 3
    Y(2n)     = X(2n) + β [Y(2n − 1) + Y(2n + 1)],        i0 − 2 ≤ 2n < i1 + 2
    Y(2n + 1) = Y(2n + 1) + γ [Y(2n) + Y(2n + 2)],        i0 − 1 ≤ 2n + 1 < i1 + 1
    Y(2n)     = Y(2n) + δ [Y(2n − 1) + Y(2n + 1)],        i0 ≤ 2n < i1
    Y(2n + 1) = −K · Y(2n + 1),                           i0 ≤ 2n + 1 < i1
    Y(2n)     = Y(2n) / K,                                i0 ≤ 2n < i1

Here, X is the tile component being transformed, Y is the resulting transform, i0 and i1 define the position of the tile component within a component, and α, β, γ and δ are lifting parameters.
The total number of transform coefficients equals the total number of samples in the original image.
Step 5: Quantization
Quantization is needed because the important visual information is concentrated in a few coefficients. To reduce the number of bits needed to represent the transform, coefficient quantization is used.
Coefficient a_b(u, v) of subband b is quantized using the step size

    Δ_b = 2^(R_b − ε_b) (1 + μ_b / 2^11)

The nominal dynamic range R_b of subband b is the sum of the number of bits used to represent the original image and the analysis gain bits for subband b.
Step 6: Coefficient Bit Modeling
The coefficients of each transformed tile component's subbands are arranged into rectangular blocks called code blocks, which are coded individually, one bit plane at a time.
Starting from the most significant bit plane with a nonzero element, each bit plane is processed in three passes. Each bit in a bit plane is coded in only one of the three passes:
1. Significance propagation
2. Magnitude refinement
3. Cleanup
The outputs of these passes are then arithmetically coded.
Step 8: Bit-Stream Layering
The output of the arithmetic coding of each bit plane is grouped with similar passes from other code blocks to form layers. A layer is an arbitrary number of groupings of coding passes from each code block.
Step 9: Packetizing
The layers obtained from the above step are divided into packets, providing an additional method of extracting a spatial region of interest from the total code stream. Packets are the fundamental unit of the encoded code stream.
JPEG-2000 Decoding
JPEG-2000 decoders simply invert the operations of the encoder. Decoding includes the following steps.
Step 2: Dequantization
Even though the encoder may have encoded M_b bit planes for a particular subband, the user may choose to decode only N_b bit planes. This amounts to quantizing the coefficients of the code block using a step size of 2^(M_b − N_b) · Δ_b.
Any non-decoded bits are set to zero, and the resulting coefficients, denoted q̄_b(u, v), are inverse quantized using

    R_q(u, v) = (q̄_b(u, v) + r · 2^(M_b − N_b(u, v))) Δ_b     for q̄_b(u, v) > 0
              = (q̄_b(u, v) − r · 2^(M_b − N_b(u, v))) Δ_b     for q̄_b(u, v) < 0
              = 0                                              for q̄_b(u, v) = 0        ... (5.27)

where R_q(u, v) is an inverse-quantized transform coefficient.
This step is performed only if a component transformation was done during the encoding process.
H.261
4
H.264
It supports prediction differences within frames, variable block-size integer transforms and context-adaptive arithmetic coding.
5. MPEG-1
A Motion Pictures Expert Group standard for CD-ROM applications with non-interlaced video at up to 1.5 Mb/s.
It is similar to H.261, but frame predictions can be based on the previous frame, the next frame or an interpolation of both. It is supported by almost all computers and DVD players.
6. MPEG-2
It is an extension of MPEG-1 designed for DVDs with transfer rates up to 15 Mb/s. It supports interlaced video and HDTV, and it is the most successful video standard to date.
7. MPEG-4
It provides improved video compression efficiency and the ability to add or remove audio and video objects.
5.13.3.4. MPEG Encoder
The MPEG standards are based on a DPCM/DCT coding scheme. The figure below shows a typical motion-compensated video encoder. It exploits
1. redundancies within and between adjacent video frames,
2. motion uniformity between frames, and
3. the psychovisual properties of the human visual system.
Fig. 5.21. A typical motion-compensated video encoder: the difference macroblock passes through a mapper (e.g. DCT), quantizer, variable-length coder and buffer (regulated by a rate controller); an inverse quantizer and inverse mapper feed the motion estimator and compensator (with frame delay), whose prediction macroblock and variable-length-coded motion vector complete the loop.

The input to the encoder is sequential macroblocks of video. The path containing the DCT, quantizer and variable-length coding blocks in the above figure is known as the primary input-to-output path.
The encoder performs simultaneous transformation, quantization and variable-length coding operations on the input with the use of these blocks.
Decoding
The decoder accesses the areas of the reference frames that were used in the
encoder to form the prediction residuals.
The encoded frames are reordered before transmission, so that the decoder will be able to reconstruct and display them in the proper sequence.
5.14. INTRODUCTION
After an image is segmented into regions, the resulting aggregate of segmented pixels is represented and described for further computer processing. Representing a region involves two choices:
1. In terms of its external characteristics (its boundary)
2. In terms of its internal characteristics (the pixels comprising the region)
The above two schemes are only part of the task of making the data useful to a computer. The next task is to describe the region based on the chosen representation.
5.15. REPRESENTATION
Representation deals with the compaction of segmented data into representations that facilitate the computation of descriptors.
Page 185 of 304
1 1
1
01 sidiz200 28 9V1:a 1
b1 iei323b esbotuua2
1 1
1
io1JRG1 bas noiclen
1 1
MOITAT4eEAA3
Page 1865.61
of 304
.th
Col
Dol1 1
1
1
1
1
1 1
4 1
(b)
(c)
1
1 1 1
(d)
Fig. 5.22. Illustration of the First Few Steps
inthe Boundary-Following Algorithm
5.15.2. CHAIN CODES
Chain codes are used to represent a boundary by a connected sequence of straight-line segments of specified length and direction. Typically, this representation is based on 4- or 8-connectivity of the segments. The direction of each segment is coded by using a numbering scheme, as in figure 5.23 below.
Fig. 5.23. Direction numbers for (left) a 4-directional chain code and (right) an 8-directional chain code.
Fig. 5.24. A digital boundary with a resampling grid superimposed, and the result of resampling; the chain code is obtained by traversing the resampled boundary (in, say, a clockwise direction) and assigning a direction to the segments connecting every pair of pixels.
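A small sketch of 8-directional chain coding (an illustration only; the direction numbering assumes the usual counterclockwise convention 0 = east, 1 = northeast, ..., 7 = southeast, rather than being taken from figure 5.23):

    # Map (dx, dy) steps between consecutive boundary points to 8-direction codes.
    # Convention assumed here: x to the right, y upward, 0 = east, counterclockwise.
    DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
                  (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

    def chain_code(points):
        """8-directional chain code of a closed boundary given as ordered (x, y) points."""
        codes = []
        for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
            codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
        return codes

    # A unit square traversed counterclockwise.
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    codes = chain_code(square)
    print(codes)                # [0, 2, 4, 6]

    # The first difference (direction changes counted counterclockwise) gives a
    # rotation-insensitive description of the same boundary.
    first_diff = [(codes[(i + 1) % len(codes)] - codes[i]) % 8 for i in range(len(codes))]
    print(first_diff)           # [2, 2, 2, 2]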
Fig. 5.25. (a) An object boundary (black curve). (b) Boundary enclosed by cells (in gray). (c) Minimum-perimeter polygon obtained by allowing the boundary to shrink.
This shrinking produces the shape of a polygon of minimum perimeter that circumscribes the region enclosed by the cell strip, as figure (c) shows.
The size of the cells determines the accuracy of the polygonal approximation. In the limit, if the size of each cell corresponds to a pixel in the boundary, the error in each cell between the boundary and the MPP approximation would be at most √2 d, where d is the minimum possible pixel distance.
This error can be reduced by half by forcing each cell in the polygonal approximation to be centered on its corresponding pixel in the original boundary.
The objective is to use the largest possible cell size acceptable in a given application, thus producing MPPs with the fewest number of vertices.
The cellular approach reduces the shape of the object enclosed by the original boundary to the area circumscribed by the gray wall.
In figure 5.25 above, the boundary consists of 4-connected straight-line segments. Suppose that we traverse this boundary in a counterclockwise direction. Every turn encountered in the traversal will be either a convex or a concave vertex, with the angle of a vertex being an interior angle of the 4-connected boundary. Convex and concave vertices are shown, respectively, as white and black dots in figure (b) below.

Fig. 5.26. (a) Region (dark gray) resulting from enclosing the original boundary by cells. (b) Convex (white dots) and concave (black dots) vertices obtained by following the boundary of the dark gray region in a counterclockwise direction. (c) Concave vertices (black dots) displaced to their diagonal mirror locations in the outer wall of the bounding region.

The vertices of the MPP coincide either with convex vertices in the inner wall (white dots) or with the mirrors of the concave vertices (black dots) in the outer wall. Only convex vertices of the inner wall and concave vertices of the outer wall can be vertices of the MPP.
MPP Algorithm
1. The MPP bounded by a simply connected cellular complex is not self-intersecting.
2. Every convex vertex of the MPP is a W vertex, but not every W vertex of a boundary is a vertex of the MPP.
3. Every mirrored concave vertex of the MPP is a B vertex, but not every B vertex of a boundary is a vertex of the MPP.
4. All B vertices are on or outside the MPP, and all W vertices are on or inside the MPP.
In this algorithm, White (W) represents convex vertices and Black (B) denotes mirrored concave vertices.
Based on this, the boundary is divided into two segments by a line ab, as shown in figure (b) below. Here, cd and ef are the perpendicular distances from the line.
Figure (c) below shows the result of using the splitting procedure with a threshold equal to 0.25 times the length of line ab.
Thus, the obtained vertices are a, c, b and e. Joining these vertices results in the polygon which represents the given boundary.
Fig. 5.27. (a) Original boundary. (b) Boundary divided into segments based on extreme points. (c) Joining of vertices. (d) Resulting polygon.
Advantages
It seeks prominent inflection points, so the resulting vertices can produce a good polygonal approximation of the boundary. This approach is particularly attractive when the boundary contains one or more significant concavities that carry shape information.
The use of the convex hull of the region enclosed by the boundary is a powerful tool for robust decomposition of the boundary.
Drawback
Digital boundaries tend to be irregular because of digitization, noise and variations in segmentation. Such boundaries produce convex deficiencies with small, meaningless components scattered randomly throughout the boundary. This results in an inefficient decomposition process.
Solution
These irregularities can be removed by smoothing before partitioning. One way to do smoothing is to replace the coordinates of each pixel by the average coordinates of k of its neighbors along the boundary. However, large values of k result in excessive smoothing, while small values of k result in insufficient smoothing.
Therefore, a polygonal approximation method can be used before finding the convex deficiency.
5.18. BOUNDARY DESCRIPTORS
Boundary descriptors describe the boundary of a region using the features of the boundary. They can be classified into two types:
1. Simple descriptors
2. Fourier descriptors
Two further quantities are also used to describe a boundary:
1. Shape numbers
2. Statistical moments
The ratio of the major axis to the minor axis is called the eccentricity of the boundary:

    Eccentricity = Major Axis / Minor Axis

The major axis intersects the boundary at two points, and the minor axis also intersects the boundary at two points. A box that completely encloses the boundary by passing through these four outer points is called the basic rectangle.
Curvature is defined as the rate of change of slope. However, obtaining reliable measures of curvature at a point on a digital boundary is difficult, because these boundaries tend to be locally ragged.
Therefore, the difference between the slopes of adjacent boundary segments can be used as a descriptor for the curvature at the point of intersection of the two segments. When the boundary is traversed in the clockwise direction, the following can be defined:
1. If the change in slope at P is nonnegative, then the vertex point P is said to be part of a convex segment.
2. If the change in slope at P is negative, then the vertex point P is said to be part of a concave segment.
3. If the change is less than 10°, then the vertex point P is part of a nearly straight segment.
4. If the change is greater than 90°, then the vertex point P is a corner point.
These descriptors must be used with care, because their interpretation depends on the length of the individual segments relative to the overall length of the boundary.
5.18.2. FOURIER DESCRIPTORS
Fourier descriptors describe a digital boundary by considering it as a complex sequence. Consider the digital boundary shown in figure 5.29 below, which has K points in the xy-plane. This treatment reduces a 2-D problem to a 1-D problem.

Fig. 5.29. A digital boundary represented as a complex sequence (real axis = x, imaginary axis = y).
Starting at an arbitrary point (x0, y0) and following the boundary in a counterclockwise direction, its coordinate points are (x0, y0), (x1, y1), ..., (x_{K−1}, y_{K−1}).
These coordinates can be expressed in the complex form

    s(k) = x(k) + j y(k)     for k = 0, 1, 2, ..., K − 1

The discrete Fourier transform of s(k) is

    a(u) = Σ (k = 0 to K−1) s(k) e^{−j2πuk/K}     for u = 0, 1, 2, ..., K − 1

The complex coefficients a(u) are called the Fourier descriptors of the boundary. The inverse transform restores s(k):

    s(k) = (1/K) Σ (u = 0 to K−1) a(u) e^{j2πuk/K}     for k = 0, 1, 2, ..., K − 1

Instead of all the Fourier coefficients, we can use only the first P coefficients to reduce the number of terms used for reconstruction. This is equivalent to setting a(u) = 0 for u > P − 1 in the above equation.
The result is the following approximation to s(k):

    ŝ(k) = Σ (u = 0 to P−1) a(u) e^{j2πuk/P}     for k = 0, 1, 2, ..., K − 1

Although only P terms are used to obtain each component of ŝ(k), k still ranges from 0 to K − 1; i.e., the same number of points exists in the approximate boundary, but not as many terms are used in the reconstruction of each point.
Low-frequency components determine global shape; thus, the smaller P becomes, the more boundary detail is lost.
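The following numpy sketch (an illustrative assumption) computes Fourier descriptors of a boundary and rebuilds it from only the first P coefficients; it simply zeroes the coefficients beyond P and applies the standard inverse FFT, a slight simplification of the ŝ(k) expression above.

    import numpy as np

    def fourier_descriptors(xs, ys):
        """Fourier descriptors a(u) of a boundary given by its ordered coordinates."""
        s = np.asarray(xs, dtype=float) + 1j * np.asarray(ys, dtype=float)  # s(k) = x + j y
        return np.fft.fft(s)

    def reconstruct(a, P):
        """Approximate the boundary using only the first P descriptors (a(u) = 0 for u > P - 1)."""
        a_trunc = np.array(a, dtype=complex)
        a_trunc[P:] = 0
        s_hat = np.fft.ifft(a_trunc)
        return s_hat.real, s_hat.imag

    # Toy boundary: a 16 x 8 rectangle traced counterclockwise (K = 44 points).
    pts = ([(x, 0) for x in range(16)] + [(15, y) for y in range(1, 8)]
           + [(x, 7) for x in range(14, -1, -1)] + [(0, y) for y in range(6, 0, -1)])
    xs, ys = zip(*pts)
    a = fourier_descriptors(xs, ys)
    for P in (len(pts), 16, 4):
        xr, yr = reconstruct(a, P)
        err = float(np.max(np.hypot(xr - np.array(xs), yr - np.array(ys))))
        print(P, round(err, 2))   # P = K reproduces the boundary exactly; small P keeps only coarse shape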
Basic Properties
Fourier descriptors can be used as the basis for differentiating between distinct boundary shapes.
Descriptors should be as insensitive as possible to translation, rotation and scale changes. In addition, the descriptors must be insensitive to the starting point. Therefore the Fourier descriptors should not be sensitive to:
1. Translation
2. Rotation
3. Scale changes
4. Starting point
However, changes in these parameters produce only simple transformations of the Fourier descriptors. The above four properties are explained below.
1. Rotation
Rotation of a point about the origin of the complex plane by an angle θ is accomplished by multiplying the point by e^{jθ}. Applying this to every point of s(k) rotates the entire sequence about the origin. The rotated sequence is s_r(k) = s(k) e^{jθ}, whose Fourier descriptors are

    a_r(u) = a(u) e^{jθ}     for u = 0, 1, 2, ..., K − 1

Thus rotation simply affects all coefficients equally by a multiplicative constant term e^{jθ}.
2. Translation
Translation consists of adding a constant displacement to all coordinates in the boundary:

    Δ_xy = Δx + jΔy

Translation has no effect on the descriptors, except for u = 0, which has the impulse Δ_xy δ(u).
3. Starting Point
The starting point of the sequence can be changed from k = 0 to k = k0 by using the expression

    s_p(k) = s(k − k0) = x(k − k0) + j y(k − k0)

The basic properties of Fourier descriptors are summarized in the table below.
Table 5.6. Some basic properties of Fourier descriptors

    Transformation    Boundary                    Fourier Descriptor
    Identity          s(k)                        a(u)
    Rotation          s_r(k) = s(k) e^{jθ}        a_r(u) = a(u) e^{jθ}
    Translation       s_t(k) = s(k) + Δ_xy        a_t(u) = a(u) + Δ_xy δ(u)
    Scaling           s_s(k) = α s(k)             a_s(u) = α a(u)
    Starting point    s_p(k) = s(k − k0)          a_p(u) = a(u) e^{−j2πk0u/K}
    Chain code:   0 3 2 1          0 0 3 2 2 1
    Difference:   3 3 3 3          3 0 3 3 0 3
    Shape no.:    3 3 3 3          0 3 3 0 3 3

Order 8
    Difference:   3 0 3 0 3 0 3 0 3 3 1 3 3 0 3 0 3 0 0 3 3 0 0 3
    Shape no.:    0 3 0 3 0 3 0 3 0 3 0 3 3 1 3 3 0 0 3 3 0 0 3 3
A boundary segment can be described by a 1-D function g(r), obtained by connecting the two end points of the segment and rotating the line segment until it is horizontal. The coordinates of the points are rotated by the same angle.
The amplitude of g is treated as a discrete random variable v, forming an amplitude histogram

    p(v_i),   i = 0, 1, 2, ..., A − 1

where A is the number of discrete amplitude increments.
The nth moment of v about its mean is

    μ_n(v) = Σ (i = 0 to A−1) (v_i − m)^n p(v_i)

where

    m = Σ (i = 0 to A−1) v_i p(v_i)

The quantity m is recognized as the mean or average value of v, and μ_2 as its variance.
An alternative approach is to normalize g(r) to unit area and treat it as a histogram; g(r_i) is then treated as the probability of value r_i occurring.
In this case, r is treated as the random variable and the moments are

    μ_n(r) = Σ (i = 0 to K−1) (r_i − m)^n g(r_i)

where

    m = Σ (i = 0 to K−1) r_i g(r_i)

In this notation, K is the number of points on the boundary, and μ_n(r) is directly related to the shape of g(r).
Advantages
The advantages of using statistical moments, compared to other techniques of boundary description, are:
1. Implementation of moments is straightforward.
2. They carry a physical interpretation of boundary shape.
3. This approach is insensitive to rotation.
4. Size normalization can be achieved by scaling the range of values of g and r.
These two descriptors are used for measuring the compactness of a region. Compactness can be defined as

    Compactness = (Perimeter)² / Area

The circularity ratio is a slightly different descriptor of compactness, defined as

    R_c = 4πA / P²

where
A is the area of the region,
P is the length of its perimeter.
The value of this measure is 1 for a circular region and π/4 for a square.
Compactness is a dimensionless measure and thus is
insensitive to uniform scale
changes. It is also insensitive to orientation
and thus the error introduced by rotation
of a digital region can be avoided.
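A tiny illustration of both measures (hypothetical helper functions, not from the text), evaluated for an ideal circle and square:

    import math

    def compactness(perimeter, area):
        return perimeter ** 2 / area                     # (Perimeter)^2 / Area

    def circularity_ratio(perimeter, area):
        return 4 * math.pi * area / perimeter ** 2       # R_c = 4*pi*A / P^2

    r, s = 5.0, 4.0
    circle = (2 * math.pi * r, math.pi * r ** 2)         # (perimeter, area) of a circle
    square = (4 * s, s ** 2)                             # (perimeter, area) of a square
    print(circularity_ratio(*circle))                    # 1.0
    print(circularity_ratio(*square))                    # pi/4, about 0.785
    print(compactness(*circle), compactness(*square))    # 4*pi (about 12.57) and 16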
The Euler number E of a region is defined as

    E = C − H

where
C is the number of connected components,
H is the number of holes.

Fig. 5.34. Regions A and B with Euler numbers equal to 0 and −1, respectively.

For A: E = C − H with C = 1 and H = 1, therefore E = 0.
For B: E = C − H with C = 1 and H = 2, therefore E = 1 − 2 = −1.
Regions represented by straight-line segments (polygonal networks) have a simple interpretation in terms of the Euler number. The figure below shows a polygonal network.

Fig. A polygonal network with its vertices (V), edges (Q), faces (F) and holes (H) labeled.

For a polygonal network with V vertices, Q edges and F faces, the Euler formula states V − Q + F = C − H = E. For the network shown, V = 7, Q = 11, F = 2, C = 1 and H = 3, so the Euler number is

    7 − 11 + 2 = 1 − 3 = −2
Topological descriptors provide an additional feature that is useful in
characterizing regions in a scene.
5.19.3. TEXTURE
Texture content is an important quantity used to describe a region. This descriptor provides measures of properties such as
1. Smoothness
2. Coarseness
3. Regularity
The three principal approaches used in image processing to describe the texture of a region are
1. Statistical approaches
2. Structural approaches
3. Spectral approaches
Statistical approaches are used to characterize the texture of a region as smooth, coarse, grainy and so on.
5.19.3.1. Statistical approaches
One of the simplest approaches for describing texture is to use statistical moments of the intensity histogram of an image or region.
1. Statistical Moments
Let z be a random variable denoting intensity and p(z_i), i = 0, 1, 2, ..., L − 1, be the corresponding histogram, where L is the number of different intensity levels.
The nth moment of z about the mean is

    μ_n(z) = Σ (i = 0 to L−1) (z_i − m)^n p(z_i)        ... (5.28)

where m is the mean value of z (the average intensity):

    m = Σ (i = 0 to L−1) z_i p(z_i)        ... (5.29)

Note from equation (5.28) that μ_0 = 1 and μ_1 = 0. These are called the zeroth and first moments.
2. Second Moment
The second moment is of particular importance in texture description. It is a measure of intensity contrast that can be used to establish descriptors of relative smoothness. For example, the measure

    R(z) = 1 − 1 / (1 + σ²(z))        ... (5.30)

is 0 for areas of constant intensity and approaches 1 for large values of σ²(z).
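The following short numpy sketch (illustrative only) computes the histogram-based texture descriptors above, the mean, variance, smoothness measure R(z) and third moment, for a gray-level region:

    import numpy as np

    def texture_moments(region, L=256):
        """Histogram moments of an intensity region: mean, variance, R(z), third moment."""
        hist, _ = np.histogram(region, bins=L, range=(0, L))
        p = hist / hist.sum()                 # p(z_i), the normalized histogram
        z = np.arange(L)
        m = np.sum(z * p)                     # mean intensity (equation 5.29)
        mu2 = np.sum((z - m) ** 2 * p)        # variance, second moment (equation 5.28, n = 2)
        mu3 = np.sum((z - m) ** 3 * p)        # third moment (skewness of the histogram)
        R = 1 - 1 / (1 + mu2)                 # smoothness measure (equation 5.30)
        return m, mu2, R, mu3

    flat = np.full((32, 32), 120)
    noisy = np.clip(120 + 30 * np.random.default_rng(1).standard_normal((32, 32)), 0, 255)
    print(texture_moments(flat))     # variance 0, so R = 0 (perfectly smooth region)
    print(texture_moments(noisy))    # larger variance, R much closer to 1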
3. Third Moment
The third moment is a measure of the skewness of the histogram:

    μ_3(z) = Σ (i = 0 to L−1) (z_i − m)³ p(z_i)        ... (5.32)

4. Fourth Moment
The fourth moment is a measure of the relative flatness of the histogram:

    μ_4(z) = Σ (i = 0 to L−1) (z_i − m)⁴ p(z_i)        ... (5.33)
Fig. 5.36. Generation of a co-occurrence matrix: an image f with L = 8 intensity levels (left) and the co-occurrence matrix G produced by the position operator Q = "one pixel immediately to the right" (right).
Figure 5.36 above shows an example of how to construct a co-occurrence matrix using L = 8 and a position operator Q defined as "one pixel immediately to the right"; i.e., the neighbor of a pixel is defined as the pixel immediately to its right.
In that figure we can see that element (1, 1) of G is 1, because there is only one occurrence in f of a pixel valued 1 having a pixel valued 1 immediately to its right. Similarly, element (6, 2) of G is 3, because there are three occurrences in f of a pixel with a value of 6 having a pixel valued 2 immediately to its right.
If the position operator Q is defined instead as "one pixel to the right and one pixel above", then position (1, 1) in G would have been 0, because there are no instances in f of a 1 with another 1 in the position specified by Q.
The number of possible intensity levels in the image determines the size of matrix G. For an 8-bit image (256 possible levels) G will be of size 256 × 256.
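A minimal sketch of this construction (assuming the "one pixel immediately to the right" operator and 1-based intensity levels, as in the figure; the test image below is arbitrary, not the one in the figure):

    import numpy as np

    def cooccurrence_right(f, L):
        """Co-occurrence matrix G for Q = 'one pixel immediately to the right' (levels 1..L)."""
        G = np.zeros((L, L), dtype=int)
        for row in f:
            for a, b in zip(row[:-1], row[1:]):
                G[a - 1, b - 1] += 1     # count pixel value a with value b to its right
        return G

    f = np.array([[1, 1, 7, 5, 3, 2],
                  [5, 1, 6, 1, 2, 5],
                  [8, 8, 6, 8, 1, 2],
                  [4, 3, 4, 5, 5, 1],
                  [8, 7, 8, 7, 6, 2],
                  [7, 8, 6, 2, 6, 2]])
    G = cooccurrence_right(f, L=8)
    print(G[0, 0])          # element (1, 1): number of 1s followed immediately by a 1
    print(G / G.sum())      # normalizing gives the probabilities p_ij used by the descriptors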
If n is the total number of pixel pairs in the image that satisfy Q, then the quantity

    p_ij = g_ij / n

is an estimate of the probability that a pair of points satisfying Q will have values (z_i, z_j). These probabilities are in the range [0, 1] and their sum is 1:

    Σ (i = 1 to K) Σ (j = 1 to K) p_ij = 1
where K is the row (or column) dimension of the square matrix G.
A set of descriptors useful for characterizing the contents of G is listed in the table. The quantities used in the correlation descriptor (second row listed in the table) are defined as follows:

    m_r = Σ (i = 1 to K) i Σ (j = 1 to K) p_ij
    m_c = Σ (j = 1 to K) j Σ (i = 1 to K) p_ij

and

    σ_r² = Σ (i = 1 to K) (i − m_r)² Σ (j = 1 to K) p_ij
    σ_c² = Σ (j = 1 to K) (j − m_c)² Σ (i = 1 to K) p_ij

If we let

    P(i) = Σ (j = 1 to K) p_ij    and    P(j) = Σ (i = 1 to K) p_ij

then the preceding equations can be written as

    m_r = Σ (i = 1 to K) i P(i)
    m_c = Σ (j = 1 to K) j P(j)
    σ_r² = Σ (i = 1 to K) (i − m_r)² P(i)
    σ_c² = Σ (j = 1 to K) (j − m_c)² P(j)
Here m_r is a mean computed along the rows of the normalized G and m_c is a mean computed along the columns of the normalized G. Similarly, σ_r and σ_c are standard deviations computed along the rows and columns, respectively. Each of these terms is a scalar, independent of the size of G.
Fig. 5.37. (a) Texture primitive. (b) Pattern generated by the rule S → aS. (c) 2-D texture pattern generated by this and other rules.
Let a texture primitive be denoted as a and represented as a circle, as shown in figure (a).
Now, suppose we have a rule of the form S → aS, which indicates that the symbol S may be rewritten as aS. For example, if this rule is applied three times, it results in the string aaaS.
If 'a' represents a circle (figure (a)) and the meaning "circle to the right" is assigned to a string of the form aaa..., the rule S → aS allows generation of the texture pattern shown in figure (b).
Suppose we add some new rules to this scheme, such as:
    A → bS,    S → a
where the presence of b means "circle down" and the presence of c means "circle to the left".
Now we can generate a string of the form aaabccbaa that corresponds to a 3 × 3 matrix of circles. Larger texture patterns, such as the one in figure (c), can be generated easily in the same way.
5.19.3.3. Spectral Approaches
The Fourier spectrum is ideally suited for describing the directionality of periodic or almost periodic 2-D patterns in an image. These global texture patterns are easily distinguishable as concentrations of high-energy bursts in the spectrum. Here, we consider three features of the Fourier spectrum that are useful for texture description:
1. Prominent peaks in the spectrum give the principal direction of the texture patterns.
2. The location of the peaks in the frequency plane gives the fundamental spatial period of the patterns.
3. Eliminating any periodic components via filtering leaves non-periodic image elements.
Detection and interpretation of the spectrum features are simplified by expressing the spectrum in polar coordinates as a function S(r, θ), where S is the spectrum function, r is a frequency variable and θ is a direction variable.
For each direction θ, S(r, θ) may be considered a 1-D function S_θ(r). Similarly, for each frequency r, S_r(θ) is a 1-D function.
Analyzing S_θ(r) for a fixed value of θ gives the behavior of the spectrum along a radial direction from the origin, while analyzing S_r(θ) for a fixed value of r gives the behavior along a circle centered on the origin.
A more global description is obtained by summing these functions:

    S(r) = Σ (θ = 0 to π) S_θ(r)
    S(θ) = Σ (r = 1 to R) S_r(θ)

where R is the radius of a circle centered at the origin. The results of these two equations constitute a pair of values [S(r), S(θ)] for each pair of coordinates (r, θ). By varying these coordinates, we can generate two 1-D functions, S(r) and S(θ), which give the spectral-energy description of texture for an entire image or region.
Descriptors of these functions can be computed in order to characterize their behavior quantitatively. Descriptors typically used are:
1. The location of the highest value.
2. The mean and variance of both the amplitude and axial variations.
3. The distance between the mean and the highest value of the function.
5.19.4. MOMENT INVARIANTS
The 2-D moment of order (p + q) of a digital image f(x, y) of size M × N is defined as

    m_pq = Σ (x = 0 to M−1) Σ (y = 0 to N−1) x^p y^q f(x, y)

where p = 0, 1, 2, ... and q = 0, 1, 2, ... are integers.
The corresponding central moment of order (p + q) is defined as

    μ_pq = Σ (x = 0 to M−1) Σ (y = 0 to N−1) (x − x̄)^p (y − ȳ)^q f(x, y)

for p = 0, 1, 2, ... and q = 0, 1, 2, ..., where

    x̄ = m10 / m00    and    ȳ = m01 / m00
The normalized central moments, denoted η_pq, are defined as

    η_pq = μ_pq / μ00^γ

where

    γ = (p + q)/2 + 1    for p + q = 2, 3, ...

A set of seven invariant moments can be derived from the second and third normalized central moments:

    φ1 = η20 + η02
    φ2 = (η20 − η02)² + 4η11²
    φ3 = (η30 − 3η12)² + (3η21 − η03)²
    φ4 = (η30 + η12)² + (η21 + η03)²
    φ5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²]
         + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
    φ6 = (η20 − η02)[(η30 + η12)² − (η21 + η03)²] + 4η11(η30 + η12)(η21 + η03)
    φ7 = (3η21 − η03)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²]
         − (η30 − 3η12)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]

This set of moments is invariant to translation, scale change, mirroring and rotation.
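A short numpy sketch (illustrative only; it computes the raw, central and normalized moments directly from the definitions above and only the first two invariants φ1 and φ2, then checks that they do not change when the image is rotated by 90°):

    import numpy as np

    def normalized_central_moment(f, p, q):
        """eta_pq of image f, computed from the raw and central moment definitions."""
        y, x = np.mgrid[0:f.shape[0], 0:f.shape[1]]
        m00 = f.sum()
        xbar, ybar = (x * f).sum() / m00, (y * f).sum() / m00
        mu_pq = ((x - xbar) ** p * (y - ybar) ** q * f).sum()
        gamma = (p + q) / 2 + 1
        return mu_pq / m00 ** gamma

    def hu_first_two(f):
        eta = {(p, q): normalized_central_moment(f, p, q) for p in range(3) for q in range(3)}
        phi1 = eta[(2, 0)] + eta[(0, 2)]
        phi2 = (eta[(2, 0)] - eta[(0, 2)]) ** 2 + 4 * eta[(1, 1)] ** 2
        return phi1, phi2

    img = np.zeros((64, 64))
    img[10:30, 20:55] = 1.0                  # a simple rectangular object
    print(hu_first_two(img))
    print(hu_first_two(np.rot90(img)))       # identical values: invariant to rotation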
5.20. PATTERNS AND PATTERN CLASSES
A pattern is an arrangement of descriptors, and a pattern class is a family of patterns that share some common properties. Pattern classes are denoted w1, w2, ..., wW, where W is the number of classes.
Pattern recognition by machine involves techniques for assigning patterns to their respective classes automatically and with as little human intervention as possible.
Three common pattern arrangements used in practice are:
1. Vectors
2. Strings
3. Trees
Among these, vectors are used for quantitative descriptions, and strings and trees are used for structural descriptions.
A pattern vector is represented as

    x = [x1, x2, ..., xn]^T

where
x_i represents the ith descriptor,
n is the total number of such descriptors associated with the pattern.
A pattern vector can be expressed either as a column vector or in the form x = (x1, x2, ..., xn)^T, where T indicates transposition.
Example 1
Fisher (1936) reported the use of what was then a new technique called discriminant analysis to recognize three types of iris flowers by measuring the widths and lengths of their petals. The three iris flowers are
1. Iris setosa
2. Iris virginica
3. Iris versicolor
Fig. 5.38. Three types of iris flowers described by two measurements (petal width in cm versus petal length in cm)
where x1 and x2 correspond to petal length and width respectively. The three pattern classes are denoted w1, w2 and w3, where
w1 represents setosa
w2 represents virginica
w3 represents versicolor
The petals of the flowers vary in width and length. Figure 5.38 above shows length and width measurements for several samples of each type of iris.
After a set of measurements has been selected, the components of a pattern vector become the entire description of each physical sample. In this case each flower becomes a point in 2-D Euclidean space.
The measurements of petal width and length can be used to separate the class of Iris setosa from the other two, but they do not separate the virginica and versicolor types from each other.
The degree of class separability depends strongly on the choice of descriptors selected for an application.
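As an illustration of how such 2-D pattern vectors might be assigned to classes, the following sketch uses a simple minimum-distance rule; the class means are made-up values, and this is not Fisher's discriminant analysis itself:

```python
import numpy as np

# Hypothetical mean pattern vectors (petal length, petal width, in cm) for the
# three classes; the numbers are illustrative, not Fisher's measurements.
class_means = {
    "setosa":     np.array([1.5, 0.3]),
    "versicolor": np.array([4.3, 1.3]),
    "virginica":  np.array([5.5, 2.0]),
}

def classify(x):
    """Assign pattern vector x to the class whose mean vector is closest (Euclidean)."""
    return min(class_means, key=lambda w: np.linalg.norm(x - class_means[w]))

print(classify(np.array([1.4, 0.2])))   # -> 'setosa'
print(classify(np.array([5.0, 1.8])))   # -> 'virginica'
```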
Fig. 5.39. (a) Staircase structure (b) Structure coded in terms of the primitives a and b to yield the string description ... ababab ...
Figure (a) shows a simple staircase pattern. This pattern could be sampled and expressed in terms of a pattern vector.
Assume that this structure has been segmented out of an image. By defining the two primitive elements a and b shown, we may code figure (a) in the form shown in figure (b).
The most obvious property of the coded structure is the repetitiveness of the elements a and b. Therefore, a simple description approach is to formulate a recursive relationship involving these primitive elements, as in the sketch below.
String descriptions adequately generate patterns of objects and other entities whose structure is based on simple connectivity of primitives. Because strings are 1-D structures, their application to image description requires establishing an appropriate method for reducing 2-D positional relations to 1-D form.
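A minimal sketch of the recursive idea, assuming the primitives a and b and a rewriting rule of the form S -> abS (the rule and the function name are illustrative, not taken from the text):

```python
def staircase_string(steps):
    """Generate the string description of a staircase with the given number of
    steps from the primitives 'a' (vertical segment) and 'b' (horizontal segment),
    which is equivalent to repeatedly applying the rewriting rule S -> abS."""
    return "ab" * steps

print(staircase_string(3))   # -> 'ababab'
```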
Most applications of strings to image description are based on the idea of
extracting connected line segments from
the objects of interest.
Fig. 5.41. (a) A simple composite region (b) Tree representation obtained by using the relationship "inside of"
where
f(x, y) ⋆ w(x, y) ⇔ F*(u, v) W(u, v)
⋆ indicates spatial convolution and F* is the complex conjugate of F.
8.
What is run length coding?
(May'14)
Run-length Encoding or RLE is a technique used to reduce the size of a repeating string of characters. This repeating string is called a run; typically RLE encodes a run of symbols into two bytes: a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of the data to be compressed affects the compression ratio. Compression is normally measured with the compression ratio, i.e., the ratio of the size of the original data to the size of the output of the encoding process.
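A small, self-contained sketch of the (count, symbol) form of RLE described above (the function names are arbitrary):

```python
def rle_encode(data):
    """Encode a sequence as (count, symbol) pairs: basic run-length encoding."""
    runs = []
    for symbol in data:
        if runs and runs[-1][1] == symbol:
            runs[-1][0] += 1
        else:
            runs.append([1, symbol])
    return [(count, symbol) for count, symbol in runs]

def rle_decode(runs):
    """Rebuild the original sequence from (count, symbol) pairs."""
    return "".join(symbol * count for count, symbol in runs)

encoded = rle_encode("WWWWBBWWW")
print(encoded)              # [(4, 'W'), (2, 'B'), (3, 'W')]
print(rle_decode(encoded))  # 'WWWWBBWWW'
```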
22. Define the procedure for Huffman shift coding. (Dec'12) (May'13)
List all the source symbols along with their probabilities in descending order. Divide the total number of symbols into blocks of equal size. Sum the probabilities of all the source symbols outside the reference block. Now apply the Huffman procedure to the reference block, including the prefix source symbol. The code words for the remaining symbols can be constructed by means of one or more prefix codes followed by the reference block, as in the case of the binary shift code.
2. Quantization
3. Zigzag Scan
4.DPCM on DC component
5. RLE on AC Components
6. Entropy Coding
36. How is arithmetic coding advantageous over Huffman coding for text?
42. Define the chain code derivative in 4 and 8 connectivity. (Nov/Dec 2009)
The chain code in 4-connectivity uses the four direction numbers 0 (right), 1 (up), 2 (left) and 3 (down). The chain code in 8-connectivity uses the eight direction numbers 0 through 7, assigned counter-clockwise to the eight neighbors starting from the right. The derivative (first difference) counts the number of direction changes, in the counter-clockwise sense, between successive code elements.
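A short sketch of computing the chain code first difference, assuming the usual counter-clockwise counting of direction changes and treating the code as circular (the example code sequence is hypothetical):

```python
def chain_code_first_difference(code, connectivity=4):
    """First difference of a chain code: direction changes, counted counter-clockwise,
    between successive elements (modulo 4 or 8). Treating the code as circular
    makes the result independent of rotation."""
    n = len(code)
    return [(code[(i + 1) % n] - code[i]) % connectivity for i in range(n)]

# 4-directional chain code of a small square traversed counter-clockwise.
print(chain_code_first_difference([0, 1, 2, 3], connectivity=4))  # [1, 1, 1, 1]
```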
43. What are the three principal approaches used in image processing to describe the texture of a region? (May/June 2009, Nov/Dec 2008)
The three approaches used to describe texture are
i) Statistical approaches - characterize the texture as smooth, coarse, etc.
ii) Structural approaches - based on the arrangement of image primitives.
iii) Spectral approaches - based on properties of the Fourier spectrum.
Diam(B) = max_{i,j} [D(p_i, p_j)]
where D is a distance measure and p_i, p_j are points on the boundary B.
53. Define the length of a boundary.
The length of a boundary is the number of pixels along the boundary. For example, for a chain-coded curve with unit spacing in both directions, the number of vertical and horizontal components plus √2 times the number of diagonal components gives its exact length.
54. Define shape numbers.
The shape number is defined as the first difference of smallest magnitude. The order n of a shape number is the number of digits in its representation.
55. Give the Fourier descriptors for the following transformations:
(1) Identity
(2) Rotation
(3) Translation
(4) Scaling
(5) Starting point
56. Specify the types of regional descriptors.
Simple descriptors
Topological descriptors
Texture
Moment invariants
57. Name a few measures used as simple descriptors in region description.
Area
Perimeter
Compactness
Mean and median of gray levels
2. It carries a physical interpretation of boundary shape.
This approach is not sensitive to rotation. Size normalization can be achieved by scaling the range of values of g and r.
65. What are the topological properties used for region description?
Two topological properties useful for region description are
1. Number of holes in the region.
2. Number of connected components of the region.
66. What is meant by uniformity?
Uniformity is one of the measures of texture based on the histogram. It is defined as
U(z) = Σ_{i=0}^{L-1} p²(z_i)
This measure is maximum for an image in which all intensity levels are equal (maximally uniform) and decreases from there.
67. Define Entropy.
Entropy is also one of the texture measures and can be defined as a measure of the variability of an image. It can be expressed as
e = -Σ_{i=0}^{L-1} p(z_i) log₂ p(z_i)
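Both histogram-based measures can be computed directly from the normalized histogram; the sketch below assumes 8-bit images, and the function name is invented:

```python
import numpy as np

def texture_measures(image, levels=256):
    """Histogram-based texture descriptors: uniformity U = sum p(z_i)^2 and
    entropy e = -sum p(z_i) log2 p(z_i), from the normalized histogram."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    uniformity = np.sum(p ** 2)
    nonzero = p[p > 0]                       # avoid log2(0)
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return uniformity, entropy

constant = np.full((8, 8), 128, dtype=np.uint8)            # one intensity: U = 1, e = 0
noisy = np.random.randint(0, 256, (8, 8), dtype=np.uint8)  # many intensities: low U, high e
print(texture_measures(constant))
print(texture_measures(noisy))
```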
REVIEW QUESTIONS
1.1. INTRODUCTION
Pictures are the most common and convenient means of conveying or transmitting information. A picture is worth a thousand words. Pictures concisely convey information about positions, sizes and inter-relationships between objects. They portray spatial information that we can recognize as objects. Human beings are good at deriving information from such images because of our innate visual and mental abilities. About 75% of the information received by humans is in pictorial form.
Modern digital technology has made it possible to manipulate multi-dimensional signals with systems that range from simple digital circuits to advanced parallel computers. The goal of this manipulation can be divided into three categories.
Technology and outcome:
Image Processing: image in, image out
Image Analysis: image in, measurements out
Image Understanding: image in, high-level description out
Image processing
Digital image processing deals with developing a digital system that performs operations on a digital image.
An image is nothing more than a two-dimensional signal. It is defined by the mathematical function f(x, y), where x and y are the two coordinates, horizontally and vertically, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
1.1.2. APPLICATIONS
Some of the major fields
in which digital image processing
is widely used are
mentioned below.
1. Office automation
Optical character recognition, document processing, cursive script recognition, logo and icon recognition, identification of the address area on envelopes, etc.
2. Industrial automation
Automatic inspection systems, non-destructive testing, automatic processes related to assemblies, VLSI manufacturing, PCB checking, robotics, oil and natural gas exploration, process control applications, etc.
3. Bio-Medical
ECG, EEG, EMG analysis, cytological, histological and stereological applications, automated radiology and pathology, X-ray image analysis, etc.
9. Information Technology
Facsimile image transmission, videotex, video conferencing and videophones, etc.
One of the first applications of digital images was in the newspaper industry in the 1920s, when pictures were first sent between London and New York by the Bartlane submarine cable picture transmission system. However, these were not considered results of digital image processing because computers were not involved in the creation of the digital images. Thus, the history of digital image processing is intimately tied to the development of the digital computer.
In the 1940s, the modern digital computer was introduced by John von Neumann with two key concepts:
i) A memory to hold a stored program and data
ii) Conditional branching
These two ideas are the foundation of a Central Processing Unit (CPU).
Later, in the early 1960s, the first powerful digital computers able to carry out meaningful image processing tasks were introduced.
In 1964, work on improving digital image processing began at the Jet Propulsion Laboratory, when pictures of the moon transmitted by Ranger 7 were processed by a computer to correct the various types of image distortion inherent in the on-board television camera.
In parallel with space applications, in the late 1960s and early 1970s digital image processing techniques began to be used in medical imaging, remote Earth resources observations and astronomy.
7. Morphological processing
8. Segmentation
9. Representation and description
10. Recognition
There are two categories of the steps involved in image processing:
1. Methods whose inputs and outputs are images.
2. Methods whose inputs may be images but whose outputs are attributes extracted from those images.
This organization is summarized in the figure below. The diagram does not imply that every process is applied to an image. Instead, the intention is to convey an idea of all the methodologies that can be applied to images for different purposes and possibly with different objectives.
1. Image Acquisition
The first step in image processing is image acquisition, i.e., capturing a digital image. It could be as simple as being given an image that is already in digital form. Generally the image acquisition stage involves processing such as scaling.
2. Image enhancement
The principal objective of enhancement techniques is to process an image so that the result is more suitable than the original image for a specific application.
Fig.: Fundamental steps in digital image processing (original image in, modified image or description out): image acquisition, image enhancement, image restoration, wavelets, compression, morphological processing, segmentation, representation and description, recognition.
3. Image Restoration
"010100101100110101001.."
With reference to sensing, two elements are required to acquire a digital image. The first is a physical device that is sensitive to the energy radiated by the object we wish to image. The second, called a digitizer, is a device for converting the output of the physical sensing device into digital form.
Fig.: Components of a general-purpose image processing system: problem domain, image sensors, computer, mass storage, image displays and network.
1. Specialized Image Processing Hardware
It consists of the digitizer plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which performs arithmetic and logical operations in parallel on entire images. This type of hardware is also known as a front-end subsystem.
2. Computer
It is a general-purpose computer and can range from a PC to a supercomputer depending on the application. In dedicated applications, sometimes specially designed computers are used to achieve a required level of performance.
3. Software
It consists of specialized modules that perform specific tasks. A well designed package also includes the capability for the user to write code that, as a minimum, utilizes the specialized modules. More sophisticated software packages allow the integration of these modules.
4. Mass storage
Mass storage capability is a must in image processing applications. An image of size 1024 x 1024 pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if the image is not compressed. Image processing applications fall into three principal categories of storage:
(i) Short term storage for use during processing
(ii) On-line storage for relatively fast retrieval
(iii) Archival storage, such as magnetic tapes and disks.
(i) Short term storage for use during processing
This can be provided by using computer memory or frame buffers. Frame buffers are specialized boards that store one or more images and can be accessed rapidly at video rates. This method allows instantaneous image zoom, scroll (vertical shifts) and pan (horizontal shifts).
5. Image displays
Commonly used displays are color TV monitors. These monitors are driven by the outputs of image and graphics display cards that are an integral part of the computer system.
6. Hardcopy devices
The devices for recording images include laser printers, film cameras, heat-sensitive devices, inkjet units and digital units such as optical and CD-ROM disks. Film provides the highest possible resolution, but paper is the obvious medium of choice for written applications.
7. Networking
Networking is almost a default function in any computer system in use today. Because of the large amount of data inherent in image processing applications, the key consideration in image transmission is bandwidth.
Fig.: Simplified cross section of the human eye: cornea, sclera, iris, ciliary body, ciliary muscle and fibers, lens, anterior chamber, vitreous humor, visual axis, retina, fovea, blind spot, choroid, and optic nerve and sheath (diameter approximately 17 mm).
Fig.: Range of subjective brightness sensations, showing the glare limit, the brightness adaptation range, photopic and scotopic vision, and the scotopic threshold, plotted against the log of intensity (mL).
If ΔI is not bright enough, the subject cannot perceive any change.
Fig.: Perceived brightness versus actual illumination.
2. Line sensor
3. Array sensor
Sensor Working Method
All the above three kinds of sensors perform the following steps to generate a digital image:
1. Incoming energy is transformed into a voltage by the combination of input electrical power and the sensing material.
Fig.: A single imaging sensor (filter, sensing material, power in, voltage out), scanned over the image by combining rotation and linear motion.
Linear motion
The single sensor is mounted on a lead screw that provides motion in the
nerpendicular direction, because mechanical motion can be controlled with high
precision.
Microdensitometers
In above figure other similar mechanical arrangements use a flat bed, with the
sensor moving in two linear directions. These mechanical digitizers are also known as
microdensitometers.
Fig.: Image acquisition using a linear sensor strip moved across the imaged area.
The imaging strip gives one line of an image at a time, and the motion of the strip completes the other dimension of a two-dimensional image. Lenses or other focusing schemes are used to project the area to be scanned onto the sensors.
Sensor strips mounted in a ring configuration are used in medical and industrial imaging to obtain cross-sectional images of 3-D objects.
A rotating X-ray source provides illumination and the sensors opposite the source collect the X-ray energy that passes through the object. This is the basis for medical and industrial computerized axial tomography (CAT) imaging.
Fig.: Image acquisition using a circular sensor ring: a rotating X-ray source, linear motion of the 3-D object, and image reconstruction of its cross-sectional images.
2. A complete image can be obtained by focusing the energy pattern onto the surface of the array.
The energy from an illumination source is reflected from a scene element. The imaging system collects the incoming energy and focuses it onto an image plane. If the illumination is light, the front end of the imaging system is an optical lens that projects the viewed scene onto the lens focal plane.
Fig.: The digital image acquisition process: an illumination (energy) source, a scene element, and the imaging system that projects the scene onto the image plane.
The acquired image is denoted l = f(x, y), where L_min ≤ l ≤ L_max; L_min is required to be positive and L_max must be finite.
To.create a digital image, we need to convert the continuous sensed data into
digital form. This involves two processes-sampling and quantization. An image may
be continuous with respect to the X and Y coordinates and also in amplitude. To
convert it into digital form we have to sample the function in both coordinates and in
amplitudes.
The one dimensional function in Figure 1.19 (b) is a plot of amplitude values of
the continuous image along the line segment AB in Figure 1.19 (a). The random
variations are due to image noise.
Fig. 1.19. (a) Continuous image with line segment AB (b) Amplitude values along AB (c) Sampling (d) Quantization
Fig. 1.20. (a) Continuous image projected onto a sensor array (b) Result of image sampling and quantization
The digital samples resulting from both sampling and quantization are shown in figure 1.19 (d). Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.
When a sensing array is used for image acquisition, there is no motion and the number of sensors in the array establishes the limits of sampling in both directions. Quantization of the sensor outputs is as before; figure 1.20 above shows this concept.
Figure 1.20 (a) shows the continuous image projected onto the plane of an array sensor, and figure 1.20 (b) shows the image after sampling and quantization. The sample values are indicated along the first row; this does not mean that these are the values of the physical coordinates when the image was sampled.
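A toy sketch of the two digitization steps described above (the sampling step and the number of gray levels are arbitrary choices, and the random input array merely stands in for sensed data):

```python
import numpy as np

def sample_and_quantize(f, step, k):
    """Very simple digitization sketch: spatial sampling by keeping every
    'step'-th pixel in each direction, then quantizing the amplitudes to
    2**k gray levels (assumes f holds values in the range 0..255)."""
    sampled = f[::step, ::step]                                 # sampling of the coordinates
    levels = 2 ** k
    quantized = (sampled // (256 // levels)) * (256 // levels)  # amplitude quantization
    return quantized

f = np.random.randint(0, 256, (512, 512), dtype=np.uint8)   # stand-in for a sensed image
g = sample_and_quantize(f, step=2, k=4)                      # 256x256 image, 16 gray levels
print(g.shape, np.unique(g).size)
```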
Spatial Domain
The section of the real plane spanned by the coordinates of an image is called the spatial domain, with x and y referred to as spatial variables or spatial coordinates.
f(x, y) =
[ f(0, 0)     f(0, 1)     ...  f(0, N-1)
  f(1, 0)     f(1, 1)     ...  f(1, N-1)
  ...
  f(M-1, 0)   f(M-1, 1)   ...  f(M-1, N-1) ]
Let L = 2^k. Then the number b of bits required to store a digital image is
b = M x N x k
When M = N this equation becomes
b = N² k
When an image can have 2^k gray levels, it is referred to as a k-bit image. An image with 256 possible gray levels is called an 8-bit image (256 = 2^8).
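As a quick check of this formula: for the 1024 x 1024, 8-bit image mentioned under mass storage above, b = 1024 x 1024 x 8 = 8,388,608 bits = 1,048,576 bytes, i.e., about one megabyte when the image is not compressed.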
Fig. 1.21. (a) 1024 x 1024, 8-bit image. (b) 512 x 512 image resampled into 1024 x 1024 pixels by row and column duplication. (c) through (f) 256 x 256, 128 x 128, 64 x 64, and 32 x 32 images resampled into 1024 x 1024 pixels.
Intensity resolution refers to the smallest discernible change in intensity level. Based on hardware considerations, the number of intensity levels usually is an integer power of two. The most common number is 8 bits, with 16 bits being used in some applications.
Fig. 1.22. (e)-(h) Image displayed in 16, 8, 4, and 2 intensity levels.
1.7.3.1. ISO Preference Curves
To see the effect of varying N and k simultaneously, three pictures are taken having a low, a medium and a high level of detail.
Fig. 1.23. (a) Image with a low level of detail. (b) Image with a medium level of detail. (c) Image with a relatively large amount of detail.
Different images were generated by varying N and k, and observers were then asked to rank the results according to their subjective quality. The results were summarized in the form of ISO preference curves in the N-k plane.
Fig.: Typical ISO preference curves for the face, cameraman and crowd images in the N-k plane (N ranging from 32 to 256).
The results show that the ISO preference curves tend to become more vertical as the detail in the image increases. This suggests that for images with a large amount of detail only a few gray levels may be needed: for a fixed value of N, the perceived quality of this type of image is nearly independent of the number of gray levels used.
Pixel replication
For example, to double the size of an image we can duplicate each column; this doubles the size of the image in the horizontal direction. To increase the size in the vertical direction we can duplicate each row. The gray level assignment of each pixel is determined by the fact that new locations are exact duplicates of old locations.
Drawbacks
Nearest neighbor interpolation is fast, but it has the undesirable feature that it produces a checkerboard effect, which is not desirable.
Bilinear Interpolation
It uses the four nearest neighbors to estimate the intensity at a given location. Let (x, y) denote the coordinates of the location to which we want to assign an intensity value and let v(x, y) denote that intensity value. For bilinear interpolation the assigned gray level is given by
v(x, y) = ax + by + cxy + d
where the four coefficients are determined from the four equations in four unknowns that can be written using the four nearest neighbors of point (x, y).
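A minimal sketch of bilinear interpolation at a single non-integer location, assuming rows index x and columns index y (the function name and test array are invented):

```python
import numpy as np

def bilinear(f, x, y):
    """Estimate the intensity at the non-integer location (x, y) from the four
    nearest neighbors of the image array f (rows = x, cols = y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, f.shape[0] - 1)
    y1 = min(y0 + 1, f.shape[1] - 1)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * f[x0, y0] + dx * (1 - dy) * f[x1, y0]
            + (1 - dx) * dy * f[x0, y1] + dx * dy * f[x1, y1])

f = np.array([[10, 20], [30, 40]], dtype=float)
print(bilinear(f, 0.5, 0.5))   # 25.0, the average of the four neighbors
```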
Bicubic Interpolation
It involves the sixteen nearest neighbors of a point. The intensity value assigned to point (x, y) is obtained using the equation
v(x, y) = Σ_{i=0}^{3} Σ_{j=0}^{3} a_{ij} x^i y^j
where the sixteen coefficients are determined from the sixteen equations in sixteen unknowns that can be written using the sixteen nearest neighbors of point (x, y).
Shrinking is done in a similar manner. The process equivalent to pixel replication is row and column deletion. Shrinking leads to the problem of aliasing.
1.8.2. ADJACENCY
Let V be the set of gray level values used to define adjacency. In a binary image, V = {1} if we are referring to adjacency of pixels with value 1. For m-adjacency, two pixels p and q with values from V are m-adjacent if
i) q is in N4(p), or
ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
1.8.3. PATH
It is also known as a digital path or curve. A path from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates
(x0, y0), (x1, y1), ..., (xn, yn)
where (x0, y0) = (x, y), (xn, yn) = (s, t), and (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.
Fig. 1.25. A small binary pixel arrangement used to illustrate adjacency and paths.
1.8.4. CONNECTIVITY
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S.
For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it only has one connected component, then the set S is called a connected set.
REGION
Let R be a subset of pixels in an image. If R is a connected set, it is called a
Region of the image.
BOUNDARY
Boundary is also called border or contour. The boundary of a region R is the set of points that are adjacent to points in the complement of R.
It is also defined as follows: the border of a region is the set of pixels in the region that have at least one background neighbor. The boundary of a finite region forms a closed path, and thus boundary is a global concept.
EDGE
Edges are formed by pixels with a derivative value higher than a preset threshold. Thus, edges are considered gray level or intensity discontinuities, and therefore edge is a local concept.
a. D(p, q) ≥ 0 (D(p, q) = 0 iff p = q)
b. D(p, q) = D(q, p), and
c. D(p, z) ≤ D(p, q) + D(q, z)
Euclidean Distance
The Euclidean distance between p and q is defined as
D_e(p, q) = √[(x - s)² + (y - t)²]
For example, the pixels with D4 distance ≤ 2 from (x, y) form the following contours of constant distance:
        2
    2   1   2
2   1   0   1   2
    2   1   2
        2
Chessboard Distance
The D8 distance, also called the chessboard distance, between p and q is defined as
D8(p, q) = max(|x - s|, |y - t|)
In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y). For example, the pixels with D8 distance ≤ 2 form the following contours of constant distance:
2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2
The pixels with D8 = 1 are the 8-neighbors of (x, y).
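The three distance measures can be written directly from their definitions; the points used below are only an example:

```python
def euclidean(p, q):
    """D_e(p, q) = sqrt((x - s)^2 + (y - t)^2)"""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def city_block(p, q):
    """D_4(p, q) = |x - s| + |y - t|"""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def chessboard(p, q):
    """D_8(p, q) = max(|x - s|, |y - t|)"""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(euclidean(p, q), city_block(p, q), chessboard(p, q))   # 5.0 7 4
```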
Dm Distance
The Dm distance between two points is defined as the shortest m-path between the points; it considers m-adjacency. In this case, the distance between two pixels will depend on
1. The values of the pixels along the path, and
2. The values of their neighbors.
Example
Consider the pixel arrangement given below and assume that p, p2 and p4 have value 1 and that p1 and p3 can have a value of 0 or 1.
p3  p4
p1  p2
p
Case 1: If p1 and p3 are 0, the length of the shortest m-path between p and p4 is 2.
Case 2: If p1 is 1, then p2 and p will no longer be m-adjacent and the length of the shortest m-path becomes 3.
Case 3: If p3 is 1, the length of the shortest m-path is also 3.
Case 4: If both p1 and p3 are 1, the length of the shortest m-path between p and p4 is 4.
1.9. COLOR IMAGE PROCESSING FUNDAMENTALS
The use of color in image processing is motivated by two principal factors.
1. Color is a powerful descriptor that simplifies object identification and
extraction from a scene.
2. Humans can discern thousands of color shades and intensities.
Pseudocolor processing
A color is assigned to a particular monochrome intensity or range of intensities.
The color spectrum may be divided into six broad regions: violet, blue, green, yellow, orange and red. The colors perceived in an object are determined by the nature of the light reflected from the object.
1. Achromatic light
Achromatic light is the light seen on a black and white television set, and it has been an implicit component of image processing so far. Its only attribute is its intensity or amount. The term gray level refers to a scalar measure of intensity that ranges from black, to grays, and finally to white.
2. Chromatic light
Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm. Three basic quantities are used to describe the quality of a chromatic light source, such as
1. Radiance
2. Luminance
3. Brightness
Radiance
It is the total amount of energy that flows from the light source, and it is measured in watts (W).
Luminance
It is a measure of the amount of energy an observer perceives from a light source, and it is measured in lumens (lm).
Brightness
It is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors for describing color sensation.
Primary Colors
Red (R), Green (G) and Blue (B) are called primary colors because, when mixing these colors in various intensity proportions, they can produce all visible colors.
Secondary Colors
These primary colors can be added in different proportions to produce the secondary colors. They are also called the primary colors of pigments, and they are shown below.
Fig.: Additive mixing of the primary colors of light: red, green and blue combine to give yellow, cyan and magenta, and all three together give white.
Mixing the three primaries, or a secondary with its opposite primary color, in the right intensities produces white light. Because the primary colors of light are added to produce the secondary (pigment primary) colors, they are called additive primaries.
Mixture of pigments
When the primary colors of pigments are added, the primary colors of light are produced, so in this case the primary colors of light become the secondary colors. A proper combination of the three pigment primaries, or a secondary with its opposite primary, produces black.
Fig.: Subtractive mixing of the primary colors of pigments: cyan, magenta and yellow combine to give red, green and blue, and all three together give black.
Because the primary colors of light are subtracted out by the pigment primaries, the pigment primaries are called subtractive primaries.
1.9.2. CHARACTERISTICS
The following three characteristics are used to differentiate one color from other
such as
1. Brightness
2. Hue
3. Saturation
Brightness
It gives the achromatic notion of intensity.
Hue
It represents the dominant color as perceived by an observer.
Saturation
It refers to the relative purity, or the amount of white light mixed with a hue. The pure spectrum colors are fully saturated.
r = R / (R + G + B)
g = G / (R + G + B)
b = B / (R + G + B)
From the above three equations, it can be shown that
r + g + b = 1
Thus, the tristimulus values needed to form any color can be obtained from the above equations.
Fig.: CIE chromaticity diagram: spectrum colors from deep blue near 380 nm to red near 780 nm lie along the boundary (green near 520 nm), with the point of equal energy in the interior.
Uses
1. It is useful for color mixing, because a straight line segment joining any two points in the diagram defines all the different color variations that can be obtained by combining the two colors in different ratios.
2. A line drawn from the point of equal energy to any point on the boundary will define all the shades of that particular spectrum color.
A color model is also known as a color space or color system. The main purpose of a color model is to facilitate the specification of colors in some standard way. A color model is a specification of a coordinate system and a subspace within that system where each color is represented by a single point.
Color models are classified into two types according to their use:
1. Hardware oriented color models
2. Application oriented color models
Application oriented color models are used in the creation of color graphics for animation and the manipulation of colors.
Fig.: RGB color cube: black at the origin, red at (1, 0, 0), green at (0, 1, 0), blue at (0, 0, 1), white at (1, 1, 1), with cyan, magenta and yellow at the remaining corners and the gray scale along the diagonal joining black and white.
Example
In an RGB image, each of the red, green, and blue images is an 8-bit image. So the pixel depth of an RGB color pixel = 3 x number of bits/plane = 3 x 8 = 24.
Full color Image
The term full-color image is used to denote a 24-bit RGB color image. The total number of colors in a 24-bit RGB image is (2^8)^3 = 16,777,216.
Safe RGB Colors
Many applications use only a few hundred or fewer colors. Consequently, many systems in use today are limited to 256 colors.
Advantages
1. It is suitable for hardware implementation.
2. Changing to other models such as CMY is straightforward.
3. Creating colors in this model is an easy process; therefore it is an ideal tool for image color generation.
Disadvantages
1. It is not intuitive that a color image is formed by combining three primary images.
Page 279 of 304
We know that equal amounts of the pigment primaries, cyan, magenta and yellow
should produce black. In order to produce true black, a fourth color black is added
and it gives a new color model called CMYK color model.
So HSI model is an ideal tool for developing image processing algorithms based
on color descriptions that are natural and intuitive to humans.
To find Intensity
An RGB color image can be viewed as three monochrome intensity images, so the intensity can be extracted from an RGB image.
The intensity axis is a vertical line joining the black
and white vertices. Black
vertex is (0, 0, 0) and white vertex is (1, 1, 1).
Fig. 1.33. Hue and saturation in hexagonal, triangular and circular shaped HSI models (the vertical axis joins black and white; the primaries red, green and blue and the secondaries cyan, magenta and yellow lie around the perimeter).
H = θ             if B ≤ G
H = 360° - θ      if B > G                                         ... (1.1)
where
θ = cos⁻¹ { ½[(R - G) + (R - B)] / [(R - G)² + (R - B)(G - B)]^(1/2) }   ... (1.2)
The saturation component is given by
S = 1 - 3 min(R, G, B) / (R + G + B)                               ... (1.3)
and the intensity component by
I = (R + G + B) / 3                                                ... (1.4)
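A sketch of the RGB-to-HSI conversion following equations (1.1)-(1.4); the saturation formula in (1.3) is the standard one assumed here, and the small epsilon guard and function name are implementation choices, not from the text:

```python
import numpy as np

def rgb_to_hsi(R, G, B):
    """Convert a single RGB triple (each in [0, 1]) to HSI using equations
    (1.1)-(1.4): H in degrees, S and I in [0, 1]."""
    eps = 1e-10                                    # guards against division by zero
    num = 0.5 * ((R - G) + (R - B))
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    H = theta if B <= G else 360.0 - theta
    S = 1.0 - 3.0 * min(R, G, B) / (R + G + B + eps)
    I = (R + G + B) / 3.0
    return H, S, I

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> H = 0, S = 1, I = 1/3
```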
Converting Colors from HSI to RGB
When the HSI values of an image are given in the range [0, 1], the equivalent RGB values are determined separately for three sectors of hue in the HSI color space, as shown in the hexagonal color space figure:
1. Red-Green (RG) sector, 0° ≤ H < 120°
2. Green-Blue (GB) sector, 120° ≤ H < 240°
3. Blue-Red (BR) sector, 240° ≤ H ≤ 360°
RG sector (0° ≤ H < 120°)
When H is in this sector, the RGB components are given by the equations
B = I(1 - S)                                   ... (1.5)
R = I [1 + S cos H / cos(60° - H)]             ... (1.6)
G = 3I - (R + B)                               ... (1.7)
GB sector (120° ≤ H < 240°)
If the given value of H is in this sector, we first subtract 120° from it:
H = H - 120°
Then the RGB components are obtained from equations analogous to (1.5)-(1.7), with the roles of the components rotated:
R = I(1 - S), G = I [1 + S cos H / cos(60° - H)], B = 3I - (R + G)
Procedure 1
To change the individual color of any region in the RGB image, we change the values of the corresponding region in the hue image of figure (b) below. Then we convert the new H image, along with the unchanged S and I images, back to RGB using the above equations 1.1 to 1.13.
Procedure 2
To change the saturation or purity of the color in any region, we follow the same procedure, except that we make the changes in the saturation image in HSI space, which is shown in figure (c).
Procedure 3
To change the average intensity of any region, we follow the same procedure, except that we make the changes in the intensity image, which is shown in figure 1.34.
Advantages of the HSI model
1. This model allows independent control over the color-describing quantities, namely hue, saturation and intensity.
2. It can be used as an ideal tool for developing image processing algorithms based on color descriptions.
Fig. 1.34. (a) RGB image and the components of its corresponding HSI image: (b) hue, (c) saturation, and (d) intensity.
[a11 a12] [b11 b12]   =   [a11 b11 + a12 b21    a11 b12 + a12 b22]
[a21 a22] [b21 b22]       [a21 b11 + a22 b21    a21 b12 + a22 b22]
1.11.2. LINEAR VERSUS NONLINEAR OPERATIONS
Let an operator H produce an output image g(x, y) for an input image f(x, y):
H[f(x, y)] = g(x, y)
H is said to be a linear operator if it satisfies the properties of additivity and homogeneity; otherwise it is a nonlinear operator.
Example of a Linear Operator (the sum operator Σ)
Let
f1(x, y) = [0 2; 2 3],   f2(x, y) = [6 5; 4 7],   a1 = 1 and a2 = -1
Then
Σ[a1 f1(x, y) + a2 f2(x, y)] = Σ[f1 - f2] = -15
a1 Σ[f1(x, y)] + a2 Σ[f2(x, y)] = 7 - 22 = -15
where Σ f1(x, y) = 7 and Σ f2(x, y) = 22. Both sides are equal, so the sum operator is linear.
Example of a Nonlinear Operator (the max operator Max{·})
Max[a1 f1(x, y) + a2 f2(x, y)] = Max[-6 -3; -2 -4] = -2
a1 Max[f1(x, y)] + a2 Max[f2(x, y)] = 3 - 7 = -4
Since -2 ≠ -4, the max operator is nonlinear.
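The same check can be run numerically with the arrays used in the example above (a small sketch only):

```python
import numpy as np

f1 = np.array([[0, 2], [2, 3]])
f2 = np.array([[6, 5], [4, 7]])
a1, a2 = 1, -1

# Sum operator: applying it to a1*f1 + a2*f2 equals the weighted sum of the
# individual results, so it satisfies additivity and homogeneity (linear).
print(np.sum(a1 * f1 + a2 * f2), a1 * np.sum(f1) + a2 * np.sum(f2))   # -15 -15

# Max operator: the two sides differ, so it is nonlinear.
print(np.max(a1 * f1 + a2 * f2), a1 * np.max(f1) + a2 * np.max(f2))   # -2 -4
```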
1.11.3. ARITHMETICOPERATIONS
Arithmetic operations are carried out between corresponding pixel pairs. The basic
arithmetic operations are,
s(x, y) = f(x, y) + g(x, y)
d(x, y) = f(x, y) - g(x, y)
p(x, y) = f(x, y) x g(x, y)
q(x, y) = f(x, y) ÷ g(x, y)
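A small sketch of these pixel-wise operations, together with the noise-reducing averaging use mentioned later in this chapter (all array sizes and noise levels are arbitrary choices):

```python
import numpy as np

f = np.random.rand(64, 64)                                    # stand-in reference image
noisy = [f + np.random.normal(0.0, 0.1, f.shape) for _ in range(50)]

s = noisy[0] + noisy[1]                    # addition
d = noisy[0] - noisy[1]                    # subtraction
p = noisy[0] * noisy[1]                    # element-wise multiplication
q = noisy[0] / (noisy[1] + 1e-10)          # division, guarded against zero

# Averaging K noisy observations reduces the noise standard deviation by sqrt(K).
g_bar = np.mean(noisy, axis=0)
print(np.std(noisy[0] - f), np.std(g_bar - f))   # the second value is much smaller
```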
D = (A) AND (B) - Intersection
E = NOT(A) - Complement
Fig. 1.35. Example of the logical NOT operation: A and NOT(A)
Fig. 1.36. Example of the logical OR operation: (A) OR (B)
Fig. 1.37. Example of the logical AND operation: (A) AND (B)
Fig. 1.38. General approach for operating in the linear transform domain: spatial domain input, transform, operation in the transform domain, inverse transform, spatial domain output.
Figure 1.38 shows the basic steps for performing image processing in the linear transform domain. First, the input image is transformed; the transform is then modified by a predefined operation; and finally, the output image is obtained by computing the inverse of the modified transform.
1.12.1. TWO DIMENSIONAL DFT
The two-dimensional DFT of an N x N image {u(m, n)} is a separable transform defined as
v(k, l) = Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} u(m, n) W_N^{km} W_N^{ln},   0 ≤ k, l ≤ N - 1   ... (1.14)
The two-dimensional unitary DFT pair is defined as
v(k, l) = (1/N) Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} u(m, n) W_N^{km} W_N^{ln},   0 ≤ k, l ≤ N - 1   ... (1.16)
u(m, n) = (1/N) Σ_{k=0}^{N-1} Σ_{l=0}^{N-1} v(k, l) W_N^{-km} W_N^{-ln},   0 ≤ m, n ≤ N - 1   ... (1.17)
where W_N = exp(-j 2π / N).
In matrix notation this becomes
V = F U F,   U = F* V F*
If U and V are mapped into row-ordered vectors u and v respectively, then
v = ℱ u,   u = ℱ* v,   ℱ = F ⊗ F   ... (1.20)
Here U(ω1, ω2) denotes the Fourier transform of u(m, n).
Fast transform: Since the two-dimensional DFT is separable, the transformation is equivalent to 2N one-dimensional unitary DFTs, each of which can be performed in O(N log₂ N) operations via the FFT. Hence the total number of operations is O(N² log₂ N).
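A sketch of this separability property using NumPy's FFT (note that NumPy omits the 1/N normalization of the unitary form; the function name is invented):

```python
import numpy as np

def dft2_separable(u):
    """2-D DFT computed as N 1-D DFTs of the rows followed by N 1-D DFTs of the
    columns, illustrating the separability / fast-transform property."""
    rows = np.fft.fft(u, axis=1)       # 1-D DFT of each row
    return np.fft.fft(rows, axis=0)    # 1-D DFT of each resulting column

u = np.random.rand(8, 8)
print(np.allclose(dft2_separable(u), np.fft.fft2(u)))   # True
```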
Conjugate symmetry: The DFT and unitary DFT of real images exhibit conjugate symmetry, that is,
v(N/2 ± k, N/2 ± l) = v*(N/2 ∓ k, N/2 ∓ l),   0 ≤ k, l ≤ N/2 - 1   ... (1.24)
Basis images: The basis images of the unitary DFT are given by
A*_{k,l}(m, n) = (1/N) W_N^{-(km + ln)},   0 ≤ m, n ≤ N - 1, 0 ≤ k, l ≤ N - 1   ... (1.25)
Circular convolution: The 2-D circular convolution of h(m, n) with u(m, n) over an N x N region is
u2(m, n) = Σ_{m'=0}^{N-1} Σ_{n'=0}^{N-1} h(m - m', n - n')_c u(m', n')   ... (1.26)
where
h(m, n)_c = h(m modulo N, n modulo N)   ... (1.27)
Fig. 1.39. Discrete Fourier transform coefficients v(k, l) in the shaded area (roughly half of the N x N frequency grid) determine the remaining coefficients by conjugate symmetry.
Fig.: (a) The array h(m, n), nonzero over an M x M region. (b) Circular convolution of h(m, n) with u(m, n) over an N x N region, u2(m, n) = Σ h(m - m', n - n')_c u(m', n').
Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} h(m - m', n - n')_c W_N^{mk + nl} = W_N^{m'k + n'l} DFT{h(m, n)}_N   ... (1.28)
where we have used equation (1.27). Taking the DFT of both sides of (1.26) and using the preceding result, we obtain
DFT{u2(m, n)}_N = DFT{h(m, n)}_N DFT{u(m, n)}_N   ... (1.29)
From this and the fast transform property, it follows that an N x N circular convolution can be performed in O(N² log₂ N) operations. This property is also useful in calculating two-dimensional convolutions such as
x3(m, n) = Σ_{m'=0}^{M-1} Σ_{n'=0}^{M-1} x1(m - m', n - n') x2(m', n')   ... (1.30)
where x1(m, n) and x2(m, n) are assumed to be zero outside the interval [0, M - 1].
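A sketch of performing the N x N circular convolution through the DFT, checked against direct evaluation of equation (1.26) (the function names are invented):

```python
import numpy as np

def circular_convolve2d(h, u):
    """N x N circular convolution via the DFT, using DFT{u2} = DFT{h} * DFT{u};
    costs O(N^2 log N) instead of O(N^4) for the direct sum."""
    return np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(u)))

def circular_convolve2d_direct(h, u):
    """Direct evaluation of u2(m, n) = sum h((m - m') mod N, (n - n') mod N) u(m', n')."""
    N = u.shape[0]
    u2 = np.zeros_like(u, dtype=float)
    for m in range(N):
        for n in range(N):
            for mp in range(N):
                for nq in range(N):
                    u2[m, n] += h[(m - mp) % N, (n - nq) % N] * u[mp, nq]
    return u2

h = np.random.rand(4, 4)
u = np.random.rand(4, 4)
print(np.allclose(circular_convolve2d(h, u), circular_convolve2d_direct(h, u)))   # True
```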
4. Scientific Applications
5. Military Applications
5. State the steps involved in digital image processing.
The various steps required for any digital image processing applications are
1. Image acquisition
2. Image enhancement
3. Image Restoration
4. Color Image processing
5. Compression & wavelets
6. Morphological processing
7. Segmentation
8. Representation &
description
9. Recognition
6. Define sampling and quantization. [Apr/May 2011; Nov/Dec 2010]
To create a digital image, we need to convert the continuous sensed data into
digital form. This involves two processes sampling and quantization.
Digitizing the coordinate values is called sampling
Digitizing the amplitude values is called quantization.
7. Define compression.
Compression is a technique used to reduce the storage required to save an image or the bandwidth required to transmit it over the network. It has two major approaches:
1. Lossless compression
2. Lossy compression
8. What is meant by Lossless compression?
allow the
Lossless compression is a class of data compression algorithms that
original data to be perfectly reconstructed from the compressed data.
9. What is meant by Lossy compression?
It is a compression technique that does not decompress digital
data back to 100%
of the original.
The process of segmentation is the partitioning of the input image into its constituent parts. The key role of segmentation is to extract the boundary of the object from the background. The output of the segmentation stage usually consists of either the boundary of the region or all the points in the region itself.
12. Mention some types of mass storage.
1. Short term storage for use during processing.
2. On-line storage for relatively fast retrieval.
3. Archival storage, such as magnetic tapes and disks.
13. List the membranes of a human eye.
The eye is enclosed by three membranes:
1. Cornea and sclera
2. Choroid
3. Retina
14. What are the cornea and sclera?
The cornea is a tough, transparent tissue that covers the anterior surface of the eye. The rest of the optic globe is covered by the sclera.
15. What is the choroid?
The choroid lies directly below the sclera. It contains a network of blood vessels that serve as the major source of nutrition to the eye. It also helps to reduce the amount of extraneous light entering the eye. It has two parts:
1. Iris
2. Ciliary body
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S.
For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it only has one connected component, then the set S is called a connected set.
D_e(p, q) = √[(x - s)² + (y - t)²]
For this distance measure, the pixels having a distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).
35. What is D4 distance? [City block distance]
The city block distance or D4 distance between p and q is defined as
D4(p, q) = |x - s| + |y - t|
The pixels with D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y).
36. What is D8 distance? [Chessboard distance]
The D8 distance, also called the chessboard distance, between p and q is defined as
D8(p, q) = max(|x - s|, |y - t|)
The pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y).
Example:
2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2
37. Define Dm distance.
The Dm distance between two points is defined as the shortest m-path between the points; it considers m-adjacency. In this case, the distance between two pixels will depend on
1. The values of the pixels along the path, and
2. The values of their neighbors.
38. Define full color processing.
In full-color processing, the images are acquired using full-color sensors, such as a color TV camera or a color scanner. These techniques are used in a broad range of applications, including publishing, visualization and the internet.
41. What are the basic quantities used to describe the quality of a chromatic light source?
Three basic quantities are used to describe the quality of a chromatic light source
such as
1. Radiance
2. Luminance
3. Brightness
42. Define Radiance, Luminance, Brightness.
Radiance: It is the total amount of energy that flows from the light source, measured in watts (W).
Luminance: It is a measure of the amount of energy an observer perceives from a light source, measured in lumens (lm).
Brightness: It is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors for describing color sensation.
43. Why are RGB colors called primary colors?
Red (R), Green (G), and Blue (B) are called primary colors because, when mixing these colors in various intensity proportions, they can produce all visible colors.
Magenta (Red + Blue)
Yellow (Red + Green)
45. What are the characteristics used to differentiate one color from another color?
There are three characteristics used to differentiate one color from another, such as
1. Brightness
2. Hue
3. Saturation
46. Define brightness, hue and saturation. [Nov/Dec 2010]
Hue: It represents the dominant color as perceived by an observer.
Saturation: It refers to the relative purity or the amount of white light mixed with a hue. The pure spectrum colors are fully saturated.
Brightness: It gives the achromatic notion of intensity.
47. Define trichromatic coefficients.
The amounts of red, green and blue needed to form any particular color are called the tristimulus values and are denoted by R, G and B respectively. A color is specified by its trichromatic coefficients, such as
r = R / (R + G + B)
g = G / (R + G + B)
b = B / (R + G + B)
48. What are the basic types of color model?
Color models are classified into two types
according to their uses.
1. Hardware oriented color models.
2. Application oriented color models.
49. Define pixel depth.
The number of bits used to represent each pixel in RGB space is called the pixel depth.
50. Define HSI color space.
The HSI color space is represented by a vertical intensity axis and the locus of color points that lie on planes perpendicular to this axis.
The intensity of a monochrome image f(x, y) is called the gray level l of the image at that point:
L_min ≤ l ≤ L_max
where l is the gray level of the image and the interval [L_min, L_max] is called the gray scale.
Fig.: Components of a general-purpose image processing system (problem domain, image sensors, computer, mass storage, image displays).
58. What are the types of light receptors? [Nov/Dec 2010, Apr/May 2011]
The two types of light receptors distributed over the surface of the retina of the eye are:
1. Cones - These are used to resolve fine details, and hence cone vision is called photopic or bright-light vision.
2. Rods - They provide only a general, overall picture of the field of view. Hence rod vision is called scotopic or dim-light vision.
59. What is meant by illumination and reflectance? [Apr/May 2011]
Illumination is defined as the amount of source illumination incident on the scene being viewed.
Reflectance is defined as the amount of illumination reflected by the objects in the scene.
60. What is Image Averaging? [Nov/Dec 2011]
Image averaging is the process of replacing each pixel in an image by a weighted average of its neighborhood pixels. This process is used to reduce the noise content in an image.
REVIEW QUESTIONS
1. Discuss about the basic relationship between pixels. [Apr/May 2011]
Ans. Refer Section 1.8, Page no: 1.33
2. Explain the components of an image processing system. [Nov/Dec 2010, A.U., Nov/Dec 2008]
Ans. Refer Section 1.4, Page no: 1.9
3. Explain the three types of adjacency relationships between pixels. [Nov/Dec 2010]
Ans. Refer Section 1.8.2, Page no: 1.34
4. Describe the principle of sampling and quantization. Discuss the effect of increasing the
1. Sampling frequency
2. Quantization levels on the image. [May/June 2012]
5. How is an RGB image represented using the HSI format? Describe the transformation. [May/June 2012]
Ans. Refer Section 1.10.1, Page no: 1.43
6. Explain about the RGB color model.
Ans. Refer Section 1.10, Page no: 1.42
7. Explain the fundamental steps in digital image processing.
Ans. Refer Section 1.3, Page no: 1.5
8. Explain the elements of visual perception with a neat diagram. [Nov/Dec 2010]
Ans. Refer Section 1.5, Page no: 1.12