
MIT First Grade College

Manandavadi Road, Mysore-08


Affiliated to University of Mysore

VISION OF THE INSTITUTE


Empower individuals and society at large through educational excellence; sensitize them to a
life dedicated to the service of fellow human beings and the motherland.

MISSION OF THE INSTITUTE


To impart holistic education that enables students to become socially responsive and useful,
with roots firm in traditional and cultural values, and to hone their skills to accept challenges
and respond to opportunities in a global scenario.

Lecture Notes on: Digital Image Processing

Prepared by: Chaitra K N

Department: Department of Computer Science

Syllabus
Unit I

Digital image fundamentals: Light and the electromagnetic spectrum, Components of an image
processing system, Image formation and digitization concepts, Neighbors of a pixel, adjacency,
connectivity, regions and boundaries, Distance measures, Applications.

Unit II

Image Enhancement: In the spatial domain: Basic gray level transformations, Histogram processing,
Enhancement using arithmetic/logic operations, Smoothing spatial filters, Sharpening
spatial filters.
In Frequency domain: Introduction to the Fourier transform and frequency domain concepts,
smoothing frequency-domain filters, Sharpening frequency domain filters.

Unit III

Image Restoration and Colour Image processing: Various noise models, image restoration using
spatial domain filtering, image restoration using frequency domain filtering, Estimating the
degradation function, Inverse filtering.

Colour fundamentals, Colour models, Colour transformation, Smoothing and Sharpening, Colour
segmentation

Unit IV

Image compression and Image segmentation:

Introduction, Image compression model, Error-free compression, Lossy compression, Detection
of discontinuities, Edge linking and boundary detection, Thresholding.

Text Books:

1. Principles of digital image processing, by Burger, Wilhelm, Burge, Mark J.


2. Fundamentals of Digital Image Processing, by Anil K Jain.
3. Fundamentals of Digital Image Processing, by Annadurai, R. Shanmugalakshmi.

Table of Contents

Sl. No. Topic

Unit I
1 Light and Electromagnetic spectrum
2 Components of Image processing system
3 Image formation and digitization concepts
4 Neighbors of a pixel, adjacency and connectivity
5 Regions and boundaries
6 Distance measures
7 Applications
Unit II
8 Basic gray level transformations
9 Histogram processing
10 Enhancement using arithmetic/logic operations
11 Smoothing spatial filters
12 Sharpening spatial filters
13 Introduction to the Fourier transform and frequency domain concepts
14 Smoothing frequency-domain filters
15 Sharpening frequency-domain filters
Unit III
16 Estimating the degradation function
17 Various noise models
18 Restoration in the presence of noise only - spatial filtering
19 Image restoration using frequency domain filtering
20 Inverse filtering
21 Colour fundamentals
22 Colour models
23 Colour transformation
24 Smoothing and sharpening
25 Colour segmentation
Unit IV
26 Introduction
27 Image compression model
28 Error-free (lossless) compression
29 Lossy compression
30 Detection of discontinuities
31 Edge linking and boundary detection
32 Thresholding

Notes (Hand Written or Typed)


Typed



Unit I
Digital Image Fundamentals

1.1 Introduction
The field of digital image processing refers to processing digital images by means of a digital
computer. A digital image is composed of a finite number of elements, each of which has a
particular location and value. These elements are called picture elements, image elements, pels
or pixels. Pixel is the term used most widely to denote the elements of a digital image.
An image is a two-dimensional function that represents a measure of some characteristic such
as brightness or color of a viewed scene. An image is a projection of a 3-D scene into a 2D
projection plane.
An image may be defined as a two-dimensional function f (x, y), where x and y are spatial
(plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the
intensity of the image at that point.
The term gray level is used often to refer to the intensity of monochrome images. Color images
are formed by a combination of individual 2-D images.
For example, in the RGB color system a color image consists of three individual component
images (red, green and blue). For this reason, many of the techniques developed for
monochrome images can be extended to color images by processing the three component
images individually.
An image may be continuous with respect to the x- and y- coordinates and also in amplitude.
Converting such an image to digital form requires that the coordinates, as well as the amplitude,
be digitized.
1.2 Light and Electromagnetic Spectrum
Light, or Visible Light, commonly refers to electromagnetic radiation that can be detected by
the human eye. The entire electromagnetic spectrum is extremely broad, ranging from low
energy radio waves with wavelengths that are measured in meters, to high energy gamma rays
with wavelengths that are less than 1 × 10⁻¹¹ meters. Electromagnetic radiation, as the name
suggests, describes fluctuations of electric and magnetic fields, transporting energy at the
Speed of Light (which is ~ 300,000 km/sec through a vacuum). Light can also be described in
terms of a stream of photons, massless packets of energy, each travelling with wavelike
properties at the speed of light.

Visible light is not inherently different from the other parts of the electromagnetic spectrum,
with the exception that the human eye can detect visible waves. This in fact corresponds to
only a very narrow window of the electromagnetic spectrum, ranging from about 400nm for
violet light through to 700 nm for red light. Radiation with wavelengths shorter than 400 nm is
referred to as ultraviolet (UV) and radiation with wavelengths longer than 700 nm is referred to
as infrared (IR); neither can be detected by the human eye.
Gamma rays: Gamma rays are the highest-frequency (shortest-wavelength) electromagnetic
radiation and therefore carry a lot of energy. They are used in radiotherapy to kill cancerous cells.
X-rays: Electromagnetic radiation of extremely short wavelength and high frequency.
Used to detect bone fractures, cavities and impacted wisdom teeth.
Ultraviolet rays: Electromagnetic radiation that comes from the sun and is transmitted in waves or
particles at different wavelengths and frequencies. Uses of UV light include getting a sun tan,
detecting forged bank notes in shops, and hardening some types of dental filling.
Visible light: Visible light is a very narrow band of frequencies of electromagnetic waves that
are perceptible by the human eye. The eye contains specialized cells called rods and cones that
are sensitive to the visible spectrum. As mentioned previously, most of us see visible light
every day. For example, the sun produces visible light. Incandescent light bulbs, fluorescent,
and neon lights are other examples of visible light that we may see on a regular basis.
Infrared rays: Infrared radiation (IR), sometimes known as infrared light, is electromagnetic
radiation (EMR) with wavelengths longer than those of visible light. It is used in sensors and
TV remote controls.
Microwaves: Electromagnetic waves with frequencies in the range of about 100 megahertz to 30
gigahertz (lower than infrared but higher than other radio waves). Microwaves are used in
radar, radio transmission, cooking, and other applications.
Radio waves: Waves from the portion of the electromagnetic spectrum at lower
frequencies than microwaves. They are used in standard broadcast radio and television.

1.3 Components of an Image Processing System

Image Sensors: With reference to sensing, two elements are required to acquire a digital image.
The first is a physical device that is sensitive to the energy radiated by the object we wish to
image; the second is a digitizer, a device for converting the output of the sensing device into digital form.
Specialized image processing hardware: It consists of the digitizer just mentioned, plus
hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which
performs arithmetic operations such as addition and subtraction and logical operations in parallel on images.
Computer: It is a general-purpose computer and can range from a PC to a supercomputer
depending on the application. In dedicated applications, sometimes specially designed
computers are used to achieve a required level of performance.
Software: It consists of specialized modules that perform specific tasks. A well-designed
package also includes the capability for the user to write code that, as a minimum, utilizes the
specialized modules. More sophisticated software packages allow the integration of these
modules.
Mass storage: This capability is a must in image processing applications. An image of size
1024 x 1024 pixels, in which the intensity of each pixel is an 8-bit quantity, requires one
megabyte of storage space if the image is not compressed. Storage in image processing
applications falls into three principal categories:
i) Short term storage for use during processing
ii) On line storage for relatively fast retrieval
iii) Archival storage such as magnetic tapes and disks
Image display: Image displays in use today are mainly color TV monitors. These monitors are
driven by the outputs of image and graphics display cards that are an integral part of the
computer system.

Hardcopy devices: The devices for recording images include laser printers, film cameras, heat-
sensitive devices, inkjet units and digital units such as optical and CD-ROM disks. Film provides
the highest possible resolution, but paper is the obvious medium of choice for written
applications.
Networking: It is almost a default function in any computer system in use today because of
the large amount of data inherent in image processing applications. The key consideration in
image transmission is bandwidth.
1.4 Image Formation and Digitization
In order to become suitable for digital processing, an image function f(x, y) must be digitized
both spatially and in amplitude. Typically, a frame grabber or digitizer is used to sample and
quantize the analogue video signal. Hence, in order to create a digital image, we need
to convert continuous data into digital form. There are two steps in which this is done:
 Sampling
 Quantization
Sampling
Mathematically, sampling can be defined as mapping the signal's domain from the continuous space R
to the discrete space N. Sampling of a signal is the process of creating a discrete signal from the
continuous one, in such a way that the values (samples) are taken only at certain places (or with
certain time steps) from the original continuous signal. Thus the sampled image can be seen as a matrix.

Quantization
Quantization corresponds to a discretization of the intensity values, that is, of the co-domain
of the function. After sampling and quantization, we get f : [1, …, N] × [1, …, M] → [0, …, L].
Typically, 256 levels (8 bits/pixel) suffice to represent the intensity. For color images,
256 levels are usually used for each color intensity.
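To make the two steps concrete, here is a minimal NumPy sketch; the grid size, sampling step, number of levels and the synthetic input are arbitrary assumptions for illustration, not values from the text.

```python
import numpy as np

# Assume the scene has already been captured as a dense 256x256 grid of
# floating-point intensities in [0, 1]; here we synthesize such an image.
x = np.linspace(0.0, 1.0, 256)
f = np.outer(x, x)                      # smooth test image, values in [0, 1]

# Sampling: keep every 4th sample in each spatial direction.
sampled = f[::4, ::4]                   # 64x64 grid of samples

# Quantization: map the continuous amplitudes onto L discrete levels.
L = 8                                   # 3 bits/pixel, for illustration only
quantized = np.round(sampled * (L - 1)).astype(np.uint8)

print(sampled.shape, quantized.min(), quantized.max())   # (64, 64) 0 7
```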

1.5 Neighbors of pixels, adjacency and connectivity
1.5.1 Neighbors of pixels
a. N4 (p): 4-neighbors of p:
 Any pixel p at (x, y) has two vertical and two horizontal neighbors, given by (x+1,
y), (x-1, y), (x, y+1), (x, y-1).
 This set of pixels is called the 4-neighbors of p, and is denoted by N4(p).
 Each of them is at a unit distance from p.

b. ND(p): Diagonal neighbors of p:

 This set of pixels is called the diagonal neighbors of p, and is denoted by ND(p).
 ND(p): the four diagonal neighbors of p have coordinates (x+1, y+1), (x+1, y-1), (x-
1, y+1), (x-1, y-1)

c. N8(p): 8-neighbors of p:
 N4(p) and ND(p) together are called the 8-neighbors of p, denoted by N8(p).
 N8 = N4 ∪ ND
 Some of the points in N4, ND and N8 may fall outside the image when p lies on
the border of the image.
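A small helper like the following (a sketch; the (row, column) coordinate convention and the image size are assumptions) returns the three neighbor sets and silently drops neighbors that fall outside the image, as happens for border pixels.

```python
def neighbors(x, y, rows, cols):
    """Return (N4, ND, N8) of pixel (x, y) inside a rows x cols image."""
    n4 = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    nd = [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]
    inside = lambda p: 0 <= p[0] < rows and 0 <= p[1] < cols
    n4 = [p for p in n4 if inside(p)]
    nd = [p for p in nd if inside(p)]
    return n4, nd, n4 + nd              # N8 = N4 U ND

print(neighbors(0, 0, 5, 5))            # corner pixel: only 2 + 1 neighbors remain
```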

1.5.2 Adjacency of Pixels

 Two pixels are connected if they are neighbors and their gray levels satisfy some
specified criterion of similarity.
 For example, in a binary image two pixels are connected if they are 4-neighbors and
have the same value (0 or 1).
 Let V be a set of intensity values used to define adjacency and connectivity.
 In a binary image, V = {1} if we are referring to adjacency of pixels with value 1.
 In a gray-scale image the idea is the same, but V typically contains more elements,
for example V = {180, 181, 182, ..., 200}.
 If the possible intensity values are 0 to 255, V can be any subset of these 256 values.
Types of adjacency
1. 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
2. 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
3. m-adjacency (mixed): Two pixels p and q with values from V are m-adjacent if:
q is in N4(p), or
q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V (no
intersection).
 Mixed adjacency is a modification of 8-adjacency introduced to eliminate the
ambiguities (multiple-path connections) that often arise when 8-adjacency is used.
 Pixel arrangement as shown in the figure for V = {1}

1.5.3 Connectivity of Pixels


In a binary (black and white) image, two neighboring pixels are connected if their values are
the same, i.e., both equal to 0 (black) or 1 (white). In a gray level image, two neighboring pixels
are connected if their values are close to each other, i.e., they both belong to the same subset
of similar gray levels: p ∈ V and q ∈ V, where V is a subset of all gray levels in the image.
Specifically, connectivity can be defined as one of the following:
4-connected: Two pixels p and q are 4-connected if they are 4-neighbors and p ∈ V and q ∈ V.
8-connected: Two pixels p and q are 8-connected if they are 8-neighbors and p ∈ V and q ∈ V.
mixed-connected: Two pixels p and q are m-connected if
 p ∈ V, q ∈ V and they are 4-connected, or
 p ∈ V, q ∈ V, they are diagonally connected, and N4(p) ∩ N4(q) contains no pixels in V.

1.6 Region and Boundaries

 Let R be a subset of pixels in an image. We call R a region of the image if R is a
connected set.
 Regions that are not adjacent are said to be disjoint.
 Example: the two regions (of 1s) in the figure are adjacent only if 8-adjacency is used.

 A 4-path between the two regions does not exist, so their union is not a connected set.
 Suppose an image contains K disjoint regions Rk, k = 1, 2, ..., K, none of which
touches the image border.

 Let Ru denote the union of all K regions and (Ru)c denote its complement
(the complement of a set S is the set of points that are not in S). Ru is called the foreground and
(Ru)c is called the background of the image.
 The boundary (border or contour) of a region R is the set of points that are adjacent to
points in the complement of R.

1.7 Distance Measures


For pixels p, q and z with coordinates (x, y), (s, t) and (v, w) respectively, D is a distance function
or metric if
(a) D(p, q) ≥ 0, with D(p, q) = 0 iff p = q,
(b) D(p, q) = D(q, p), and
(c) D(p, z) ≤ D(p, q) + D(q, z).

 The Euclidean distance between p and q is defined as:
De(p, q) = [(x − s)² + (y − t)²]^(1/2)
Pixels having a distance less than or equal to some value r from (x, y) are the points
contained in a disk of radius r centered at (x, y).

 The D4 distance (also called city-block distance) between p and q is defined as:
D4(p, q) = |x − s| + |y − t|
Pixels having a D4 distance from (x, y) less than or equal to some value r form a
diamond centered at (x, y).

Example:
The pixels with distance D4 ≤ 2 from (x, y) form the following contours of constant
distance.
The pixels with D4 = 1 are the 4-neighbors of (x, y).

 The D8 distance (also called chessboard distance) between p and q is defined as:
D8(p, q) = max(|x − s|, |y − t|)
Pixels having a D8 distance from (x, y) less than or equal to some value r form a square
centered at (x, y).

Example:
The pixels with D8 distance ≤ 2 from (x, y) form the following contours of constant distance.

 Dm distance:
It is defined as the shortest m-path between the points.

In this case, the distance between two pixels will depend on the values of the
pixels along the path, as well as the values of their neighbors.
• Example:

Consider the following arrangement of pixels and assume that p, p2 and p4
have value 1 and that p1 and p3 can have a value of 0 or 1. Suppose
that we consider adjacency of pixels with value 1 (i.e. V = {1}).

Now, to compute the Dm distance between points p and p4:


Here we have 4 cases:
Case 1: If p1 = 0 and p3 = 0
The length of the shortest m-path (the Dm distance) is 2 (p, p2, p4).

Case 2: If p1 = 1 and p3 = 0
Now p and p2 are no longer m-adjacent (see the m-adjacency definition), so the length
of the shortest path becomes 3 (p, p1, p2, p4).

Case 3: If p1 = 0 and p3 = 1
The same applies here, and the shortest m-path will be 3 (p, p2, p3, p4).

Case4: If p1 =1 and p3 = 1
The length of the shortest m-path will be 4 (p, p1, p2, p3, p4)
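The three basic distance measures are easy to compare in code; the sketch below uses plain Python/NumPy and two arbitrary example pixels.

```python
import numpy as np

def d_euclidean(p, q):
    return np.hypot(p[0] - q[0], p[1] - q[1])        # De

def d4(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])       # city-block distance

def d8(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))   # chessboard distance

p, q = (2, 3), (5, 7)
print(d_euclidean(p, q), d4(p, q), d8(p, q))         # 5.0 7 4
```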

1.8 Applications

Since digital image processing has very wide applications and almost all of the technical
fields are impacted by DIP, we will just discuss some of the major applications of DIP.
Digital image processing has a broad spectrum of applications, such as
 Remote sensing via satellites and other spacecrafts
 Image transmission and storage for business applications
 Medical processing,
 RADAR (Radio Detection and Ranging)
 SONAR(Sound Navigation and Ranging) and

 Acoustic image processing (The study of underwater sound is known as
underwater acoustics or hydro acoustics.)
 Robotics and automated inspection of industrial parts.
Images acquired by satellites are useful in tracking of
 Earth resources;
 Geographical mapping;
 Prediction of agricultural crops,
 Urban growth and weather monitoring
 Flood and fire control and many other environmental applications.
Space image applications include:
 Recognition and analysis of objects contained in images obtained from deep space-
probe missions.
 Image transmission and storage applications occur in broadcast television
 Teleconferencing
 Transmission of facsimile images (Printed documents and graphics) for office
automation
Communication over computer networks
 Closed-circuit television based security monitoring systems and
 In military communications.
Medical applications:
 Processing of chest X- rays
 Cineangiograms
 Projection images of transaxial tomography
 Medical images that occur in radiology and nuclear magnetic resonance (NMR)
 Ultrasonic scanning

UNIT II
IMAGE ENHANCEMENTS

1. Image Enhancement in Spatial Domain


1.1 Introduction
Image enhancement approaches fall into two broad categories: spatial domain methods and
frequency domain methods. The term spatial domain refers to the image plane itself, and
approaches in this category are based on direct manipulation of pixels in an image.
Frequency domain processing techniques are based on modifying the Fourier transform of an
image. Enhancing an image provides better contrast and a more detailed image compared to the
non-enhanced image. Image enhancement has many useful applications: it is used to enhance
medical images, images captured in remote sensing, images from satellites, etc. As indicated
previously, the term spatial domain refers to the aggregate of pixels composing an image.
Spatial domain methods are procedures that operate directly on these pixels. Spatial domain
processes will be denoted by the expression

g(x,y) = T[f(x,y)]

where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f,
defined over some neighborhood of (x, y). The principal approach in defining a neighborhood
about a point (x, y) is to use a square or rectangular subimage area centered at (x, y), as Fig.
shows. The center of the subimage is moved from pixel to pixel starting, say, at the top left
corner. The operator T is applied at each location (x, y) to yield the output, g, at that location.
The process utilizes only the pixels in the area of the image spanned by the neighborhood.

Fig.: 3x3 neighborhood about a point (x, y) in an image.

The simplest form of T is when the neighbourhood is of size 1×1 (that is, a single pixel). In
this case, g depends only on the value of f at (x, y), and T becomes a gray-level (also called an
intensity or mapping) transformation function of the form

s = T(r)
where r is the pixel value of the input image and s is the pixel value of the output image. T is a
transformation function that maps each value of r to a value of s.

For example, if T(r) has the form shown in Fig. 2.2(a), the effect of this transformation would
be to produce an image of higher contrast than the original by darkening the levels below m
and brightening the levels above m in the original image. In this technique, known as contrast
stretching, the values of r below m are compressed by the transformation function into a narrow
range of s, toward black. The opposite effect takes place for values of r above m.

In the limiting case shown in Fig. 2.2(b), T(r) produces a two-level (binary) image. A mapping
of this form is called a thresholding function.

One of the principal approaches in this formulation is based on the use of so-called masks (also
referred to as filters, kernels, templates, or windows). Basically, a mask is a small (say, 3×3)
2-D array, such as the one shown in Fig. 2.1, in which the values of the mask coefficients
determine the nature of the process, such as image sharpening. Enhancement techniques based
on this type of approach often are referred to as mask processing or filtering.

Fig. 2.2 Gray level transformation functions for contrast enhancement.

Image enhancement can be done through gray level transformations which are discussed
below.
1. 2. Basic gray level transformations:
 Linear transformation
 Log transformations
 Power law transformations
 Piecewise-Linear transformation functions

(a) Linear Transformation:
Linear transformation includes the simple identity and negative transformations. The identity
transformation is shown by a straight line: each value of the input image is directly mapped
to the same value of the output image, so the output image is identical to the input image.
Hence it is called the identity transformation. It has been shown below:

Image negative or negative transformation: The image negative, for gray level values in the
range [0, L-1], is obtained by the negative transformation s = T(r) = L − 1 − r,
where r is the gray level value at pixel (x, y) and L is the number of gray levels in the image.
The result is a photographic negative. It is useful for enhancing white or gray details embedded
in dark regions of an image. The overall graph of these transitions has been shown below.

Fig. Some basic gray-level transformation functions used for image enhancement.

In this case the following transformation has been applied:

s = (L − 1) − r
Since the example input image (of Einstein) is an 8-bpp image, the number of levels in this image
is 256. Putting L = 256 in the equation, we get
s = 255 − r
So each pixel value is subtracted from 255, producing the result image shown above. What happens
is that the lighter pixels become dark and the darker pixels become light, resulting in the
image negative. It has been shown in the graph below.

Fig. Negative transformations.
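A minimal sketch of the negative transformation, assuming an 8-bit unsigned image (so L = 256); the tiny test array is an arbitrary example.

```python
import numpy as np

def negative(img, L=256):
    """s = (L - 1) - r for every pixel of an unsigned-integer image."""
    return (L - 1 - img.astype(np.int32)).astype(img.dtype)

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(negative(img))      # [[255 191] [127   0]]
```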
(b) Log transformations:
The log transformations can be defined by this formula
s = c log (r + 1).
Where s and r are the pixel values of the output and input images and c is a constant. The
value 1 is added to each pixel value of the input image because if there is a pixel intensity
of 0 in the image, then log(0) is undefined. So 1 is added to make the minimum value
at least 1.
During log transformation, the dark pixels in an image are expanded compared to the higher
pixel values; the higher pixel values are compressed. This results in the image enhancement
shown below.

Fig. log transformation curve input vs output
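A sketch of the log transformation; the choice of c that maps the output back onto [0, 255] is an assumption made for convenience, not something fixed by the formula itself.

```python
import numpy as np

def log_transform(img, L=256):
    """s = c * log(1 + r), with c chosen so that r = L-1 maps to s = L-1."""
    c = (L - 1) / np.log(L)
    s = c * np.log1p(img.astype(np.float64))
    return np.clip(s, 0, L - 1).astype(np.uint8)

img = np.array([[0, 10], [100, 255]], dtype=np.uint8)
print(log_transform(img))   # dark values are expanded, bright values compressed
```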

(c) Power –law transformations:

There are two further transformations in the power-law family: the nth power and
nth root transformations. These transformations can be given by the expression:

s = c·r^γ      (6)

The symbol γ is called gamma, due to which this transformation is also known as the gamma
transformation.

Varying the value of γ varies the enhancement of the image. Different display devices /
monitors have their own gamma correction, which is why they display the same image at
different intensities.

where c and γ are positive constants. Sometimes Eq. (6) is written as s = c·(r + ε)^γ

to account for an offset (that is, a measurable output when the input is zero). Plots of s versus
r for various values of γ are shown in Fig. As in the case of the log transformation, power-law
curves with fractional values of γ map a narrow range of dark input values into a wider range
of output values, with the opposite being true for higher values of input levels. Unlike the log
function, however, we notice here a family of possible transformation curves obtained simply
by varying γ.

In the figure, curves generated with values of γ > 1 have exactly the opposite effect of those
generated with values of γ < 1. Finally, note that Eq. (6) reduces to the identity
transformation when c = γ = 1.

Fig. Plot of the equation s = c·r^γ for various values of γ (c = 1 in all cases).

This type of transformation is used for enhancing images for different types of display devices.
The gamma of different display devices is different. For example, the gamma of a CRT lies
between 1.8 and 2.5, which means the image displayed on a CRT is dark.

Varying gamma (γ) gives a family of possible transformation curves s = c·r^γ, where c and γ
are positive constants:
γ > 1: compresses dark values, expands bright values.
γ < 1: expands dark values, compresses bright values (similar to the log transformation).
c = γ = 1: reduces to the identity transformation.
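A sketch of the power-law transformation on a normalized image; normalizing r to [0, 1] so that c can be taken as 1 is an assumption made for simplicity.

```python
import numpy as np

def gamma_transform(img, gamma, L=256):
    """s = c * r**gamma with r normalized to [0, 1] and c = 1."""
    r = img.astype(np.float64) / (L - 1)
    s = np.power(r, gamma)
    return (s * (L - 1)).astype(np.uint8)

img = np.arange(0, 256, 51, dtype=np.uint8)      # 0, 51, 102, 153, 204, 255
print(gamma_transform(img, 0.4))                 # gamma < 1: expands dark values
print(gamma_transform(img, 2.5))                 # gamma > 1: compresses dark values
```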

(d) Piecewise-Linear Transformation Functions:

A complementary approach to the methods discussed in the previous three sections is to use
piecewise linear functions. The principal advantage of piecewise linear functions over the types
of functions we have discussed thus far is that the form of piecewise functions can be arbitrarily
complex.

The principal disadvantage of piecewise functions is that their specification requires


considerably more user input.

Contrast stretching: One of the simplest piecewise linear functions is a contrast-stretching

transformation. Low-contrast images can result from poor illumination, lack of dynamic range
in the imaging sensor, or even wrong setting of a lens aperture during image acquisition.

s = T(r)

Figure x(a) shows a typical transformation used for contrast stretching. The locations of points
(r1, s1) and (r2, s2) control the shape of the transformation function. If r1 = s1 and r2 = s2, the
transformation is a linear function that produces no changes in gray levels. If r1 = r2, s1 = 0 and
s2 = L-1, the transformation becomes a thresholding function that creates a binary image, as
illustrated in the figure.

Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the gray levels
of the output image, thus affecting its contrast. In general, r1 ≤ r2 and s1 ≤ s2 is assumed so that
the function is single valued and monotonically increasing.

Fig. x Contrast stretching. (a) Form of transformation function. (b) A low-contrast image.
(c) Result of contrast stretching. (d) Result of thresholding. (Original image courtesy of
Dr. Roger Heady, Research School of Biological Sciences, Australian National University,
Canberra, Australia.)

Figure x(b) shows an 8-bit image with low contrast. Fig. x(c) shows the result of contrast
stretching, obtained by setting (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1) where rmin and rmax
denote the minimum and maximum gray levels in the image, respectively. Thus, the
transformation function stretched the levels linearly from their original range to the full range
[0, L-1]. Finally, Fig. x(d) shows the result of using the thresholding function defined
previously, with r1=r2=m, the mean gray level in the image. The original image on which these
results are based is a scanning electron microscope image of pollen, magnified approximately
700 times.
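The simple (rmin, 0)-(rmax, L-1) stretch described above can be sketched as follows, assuming the image is not already constant (so rmax > rmin); the tiny low-contrast patch is an arbitrary example.

```python
import numpy as np

def contrast_stretch(img, L=256):
    """Linearly map [r_min, r_max] of the input onto the full range [0, L-1]."""
    r = img.astype(np.float64)
    r_min, r_max = r.min(), r.max()
    s = (r - r_min) / (r_max - r_min) * (L - 1)
    return s.astype(np.uint8)

img = np.array([[90, 100], [110, 120]], dtype=np.uint8)   # low-contrast patch
print(contrast_stretch(img))                               # spreads over 0..255
```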

Gray-level slicing

Highlighting a specific range of gray levels in an image often is desired. Applications


include enhancing features such as masses of water in satellite imagery and enhancing flaws in
X-ray images.
There are several ways of doing level slicing, but most of them are variations of two
basic themes. One approach is to display a high value for all gray levels in the range of interest
and a low value for all other gray levels.
This transformation, shown in Fig. y(a), produces a binary image. The second approach,
based on the transformation shown in Fig. y (b), brightens the desired range of gray levels but
preserves the background and gray-level tonalities in the image. Figure y (c) shows a gray-
scale image, and Fig. y(d) shows the result of using the transformation in Fig. y(a). Variations
of the two transformations shown in Fig. are easy to formulate.

Fig. y (a)This transformation highlights range [A, B] of gray levels and reduces all others to a
constant level (b) This transformation highlights range [A, B] but preserves all other levels.
(c) An image. (d) Result of using the transformation in (a).

Bit-plane slicing:

Instead of highlighting gray-level ranges, highlighting the contribution made to total image
appearance by specific bits might be desired. Suppose that each pixel in an image is represented
by 8 bits. Imagine that the image is composed of eight 1-bit planes, ranging from bit-plane 0
for the least significant bit to bit plane 7 for the most significant bit. In terms of 8-bit bytes,
plane 0 contains all the lowest order bits in the bytes comprising the pixels in the image and
plane 7 contains all the high-order bits.

Figure 3.12 illustrates these ideas, and Fig. 3.14 shows the various bit planes for the image
shown in Fig. 3.13. Note that the higher-order bits (especially the top four) contain the majority
of the visually significant data. The other bit planes contribute to subtler details in the image.
Separating a digital image into its bit planes is useful for analysing the relative importance
played by each bit of the image, a process that aids in determining the adequacy
of the number of bits used to quantize each pixel.

In terms of bit-plane extraction for an 8-bit image, it is not difficult to show that the (binary)
image for bit-plane 7 can be obtained by processing the input image with a thresholding gray-
level transformation function that (1) maps all levels in the image between 0 and 127 to one
level (for example, 0); and (2) maps all levels between 128 and 255 to another (for example,
255). The binary image for bit-plane 7 in the figure was obtained in just this manner.
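Bit-plane extraction is a one-line bit operation; the sketch below shows plane 7 agreeing with the thresholding description above (the tiny test image is an arbitrary example).

```python
import numpy as np

def bit_plane(img, k):
    """Extract bit-plane k (0 = least significant, 7 = most significant) as 0/1."""
    return (img >> k) & 1

img = np.array([[0, 127], [128, 255]], dtype=np.uint8)
print(bit_plane(img, 7))     # [[0 0] [1 1]] -- same as thresholding at level 128
print(bit_plane(img, 0))     # [[0 1] [0 1]] -- lowest-order bits
```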

1. 3. Histogram Processing:

The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function of
the form
h(rk) = nk

where rk is the kth gray level and nk is the number of pixels in the image having the level rk.
A normalized histogram is given by the equation

p(rk) = nk / n    for k = 0, 1, 2, …, L-1

p(rk) gives an estimate of the probability of occurrence of gray level rk. The sum of all
components of a normalized histogram is equal to 1. Histogram plots are simple plots of
h(rk) = nk versus rk.

In the dark image the components of the histogram are concentrated on the low (dark) side of
the gray scale. In case of bright image, the histogram components are biased towards the high
side of the gray scale. The histogram of a low contrast image will be narrow and will be centred
towards the middle of the gray scale.

The components of the histogram in the high contrast image cover a broad range of the gray
scale. The net effect of this will be an image that shows a great deal of gray levels details and
has high dynamic range.

Histogram Equalization:

Histogram equalization is a common technique for enhancing the appearance of images.


Suppose we have an image which is predominantly dark. Then its histogram would be skewed
towards the lower end of the grey scale and all the image detail would be compressed into the dark
end of the histogram. If we could 'stretch out' the grey levels at the dark end to produce a more
uniformly distributed histogram, then the image would become much clearer.

Let r represent the gray levels of the image to be enhanced, treated as a continuous variable. The
range of r is [0, 1], with r = 0 representing black and r = 1 representing white. The transformation
function is of the form
s = T(r), where 0 ≤ r ≤ 1

It produces a level s for every pixel value r in the original image.

The transformation function is assumed to fulfil two conditions: (a) T(r) is single-valued and
monotonically increasing in the interval 0 ≤ r ≤ 1; and (b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1. The transformation
function should be single-valued so that the inverse transformation exists. The monotonically
increasing condition preserves the increasing order from black to white in the output image.
The second condition guarantees that the output gray levels will be in the same range as the
input levels. The gray levels of the image may be viewed as random variables in the interval
[0, 1]. The most fundamental descriptor of a random variable is its probability density function
(PDF). Let pr(r) and ps(s) denote the probability density functions of the random variables r and s
respectively. A basic result from elementary probability theory states that if pr(r) and T(r) are
known and T⁻¹(s) satisfies condition (a), then the probability density function ps(s) of the
transformed variable is given by the formula

ps(s) = pr(r) |dr/ds|

Thus the PDF of the transformed variable s is determined by the gray-level PDF of the
input image and by the chosen transformation function. A transformation function of
particular importance in image processing is

s = T(r) = ∫0^r pr(w) dw

This is the cumulative distribution function of r.

L is the total number of possible gray levels in the image.
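For the discrete case, histogram equalization reduces to building a look-up table from the cumulative histogram. The sketch below assumes an 8-bit image and uses the usual discrete mapping s_k = (L-1) times the cumulative sum of p(r_j) for j ≤ k; the dark-skewed random test image is an assumption for illustration.

```python
import numpy as np

def histogram_equalize(img, L=256):
    hist = np.bincount(img.ravel(), minlength=L)       # n_k
    p = hist / img.size                                 # p(r_k) = n_k / n
    cdf = np.cumsum(p)                                  # cumulative distribution
    lut = np.round((L - 1) * cdf).astype(np.uint8)      # mapping r_k -> s_k
    return lut[img]

img = (np.random.rand(64, 64) ** 3 * 255).astype(np.uint8)   # dark-skewed image
print(img.mean(), histogram_equalize(img).mean())            # mean moves toward mid-gray
```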

1. 4. Using Arithmetic/Logic Operations:

1.4.1 Arithmetic Operations

Image Addition:

 Adding a constant: H(x,y) = I(x,y) + C


 Adding two images: H(x,y) = I(x,y) + J(x,y)

 Blending two images: H(x,y) = α I(x,y) + (1-α) J(x,y)
 Applications: Brightening an image, Image Compositing, (Additive) Dissolves

Image Subtraction:

 Subtracting two images: H(x,y) = I(x,y) - J(x,y)


 Applications: Motion Detection, Frame Differencing for Object Detection, Digital
Subtraction Angiography

Image Multiplication / Division

 Used to adjust the brightness of an image.
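The arithmetic operations above are simple element-wise NumPy expressions; the sketch below shows blending and frame differencing on two tiny constant images (the sizes and values are arbitrary).

```python
import numpy as np

def blend(i, j, alpha):
    """H = alpha*I + (1-alpha)*J, clipped back to 8 bits (compositing / dissolve)."""
    h = alpha * i.astype(np.float64) + (1 - alpha) * j.astype(np.float64)
    return np.clip(h, 0, 255).astype(np.uint8)

def difference(i, j):
    """Absolute difference, e.g. for simple frame differencing / motion detection."""
    return np.abs(i.astype(np.int16) - j.astype(np.int16)).astype(np.uint8)

a = np.full((2, 2), 200, dtype=np.uint8)
b = np.full((2, 2), 100, dtype=np.uint8)
print(blend(a, b, 0.25))    # 0.25*200 + 0.75*100 = 125
print(difference(a, b))     # 100
```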

1.4.2 Logical operations

Standard logical operations can be performed between images, such as NOT, OR, XOR and
AND. In general, a logical operation is performed between each corresponding bit of the image
pixel representations (i.e. it is a bit-wise operator).

 NOT (inversion): This inverts the image representation. In the simplest case of a
binary image, the (black) background pixels become (white) and vice versa.
 OR/XOR: are useful for processing binary-valued images (0 or 1) to detect objects
which have moved between frames. Binary objects are typically produced through
application of thresholding to a grey-scale image.
 Logical AND: is commonly used for detecting differences in images, highlighting
target regions with a binary mask or producing bit-planes through an image.

1. 5 Smoothing Spatial filters:

 Smoothing filters are used for blurring and for noise reduction.

 Blurring is used in pre-processing steps, such as removal of small details from an
image prior to object extraction, and bridging of small gaps in lines or curves.
 Noise reduction can be accomplished by blurring with a linear filter and also by
nonlinear filtering.

(a) Low-pass filtering

 The key requirement is that all coefficients are positive.


 Neighbourhood averaging is a special case of LPF where all coefficients are equal.
 It blurs edges and other sharp details in the image.

 Example:

(b) Median filtering


 If the objective is to achieve noise reduction instead of blurring, this method should be
used.
 This method is particularly effective when the noise pattern consists of strong, spike-
like components and the characteristic to be preserved is edge sharpness.
 It is a nonlinear operation.
 For each input pixel f(x,y), we sort the values of the pixel and its neighbors to determine
their median and assign its value to output pixel g(x,y).
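Both neighbourhood averaging and median filtering can be sketched with one generic loop; this is a slow but explicit version, and the window size and the test image containing a single spike are illustrative assumptions.

```python
import numpy as np

def neighborhood_filter(img, size, reducer):
    """Apply a square neighborhood filter (np.mean or np.median) with edge replication."""
    k = size // 2
    padded = np.pad(img.astype(np.float64), k, mode="edge")
    out = np.empty(img.shape)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            out[x, y] = reducer(padded[x:x + size, y:y + size])
    return out.astype(img.dtype)

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 255                                   # isolated spike (impulse noise)
print(neighborhood_filter(img, 3, np.mean))       # averaging only blurs the spike
print(neighborhood_filter(img, 3, np.median))     # median removes it completely
```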

1. 6 Sharpening Spatial filters:

 To highlight fine detail in an image or to enhance detail that has been blurred, either in
error or as a natural effect of a particular method of image acquisition.
 Uses of image sharpening vary and include applications ranging from electronic
printing and medical imaging to industrial inspection and autonomous target detection
in smart weapons.

(a) Basic high pass spatial filters

 The shape of the impulse response needed to implement a high pass spatial filter
indicates that the filter should have positive coefficients near its centre, and negative
coefficients in the outer periphery.
 Example: filter mask of a 3x3 sharpening filter

 The filtered output pixels may have gray levels outside the range [0, L-1].
 The results of high-pass filtering therefore involve some form of scaling and/or clipping to
make sure that the gray levels of the final result are within [0, L-1].

(b) Derivative filters

 Differentiation can be expected to have the opposite effect of averaging, which tends
to blur detail in an image, and thus sharpen an image and be able to detect edges.
 The most common method of differentiation in image processing applications is the
gradient.
 For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the vector
∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ

 Its magnitude can be approximated in a number of ways, which result in a number of


operators such as Roberts, Prewitt and Sobel operators for computing its value.
 Example: masks of various operators
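As a rough illustration of gradient-based edge detection, the sketch below applies 3x3 Sobel masks and approximates the gradient magnitude by |Gx| + |Gy|; the test image with a vertical step edge is an assumption.

```python
import numpy as np

def sobel_magnitude(img):
    gx_mask = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    gy_mask = gx_mask.T
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    gx = np.zeros(img.shape)
    gy = np.zeros(img.shape)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            window = padded[x:x + 3, y:y + 3]
            gx[x, y] = np.sum(window * gx_mask)
            gy[x, y] = np.sum(window * gy_mask)
    return np.abs(gx) + np.abs(gy)                 # approximate gradient magnitude

img = np.zeros((6, 6), dtype=np.uint8)
img[:, 3:] = 255                                   # vertical step edge
print(sobel_magnitude(img)[2])                     # strong response near column 3
```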

2. Image enhancement in frequency domain

2. 1. Introduction

In frequency-domain processing, a digital image is first converted from the spatial domain to the
frequency domain. In the frequency domain, image filtering is used to enhance the image for a specific
application. The fast Fourier transform (FFT) is the tool used to convert
the spatial domain to the frequency domain. For smoothing an image, a low-pass filter is applied,
and for sharpening an image, a high-pass filter is applied. Each kind of filter is
examined for the ideal, Butterworth and Gaussian cases.

2.2 Introduction to the Fourier transformation and frequency domain concepts:

Introduction: The frequency domain is a space which is defined by the Fourier transform. The Fourier
transform has a very wide application in image processing. Frequency domain analysis is used
to indicate how signal energy is distributed over a range of frequencies.

The basic principle of frequency domain analysis in image filtering is to compute the 2-D discrete
Fourier transform of the image.

2.3 Fourier transformation: The Fourier transformation is a tool for image processing. It is used
for decomposing an image into sine and cosine components. The input image is in the spatial
domain and the output is represented in the Fourier or frequency domain. The Fourier
transformation is used in a wide range of applications such as image filtering, image
compression, image analysis and image reconstruction.

The formula for the 2-D discrete Fourier transformation of an M×N image f(x, y) is

F(u, v) = Σx=0..M−1 Σy=0..N−1 f(x, y) e^(−j2π(ux/M + vy/N))

Example:

2. 4. Smoothing frequency domain filters:

 Ideal low-pass filter


 Butterworth low-pass filter
 Gaussian low-pass filters

Ideal low-pass filter:

Cuts off all high-frequency components at a distance greater than a certain distance
from origin (cut-off frequency).

H(u, v) = 1, if D(u, v) ≤ D0      (6)
H(u, v) = 0, if D(u, v) > D0

Where D0 is a positive constant and D(u, v) is the distance between a point (u, v) in the
frequency domain and the centre of the frequency rectangle; that is

D(u, v) = [(u − P/2)² + (v − Q/2)²]^(1/2)      (3)

where P and Q are the padded sizes of the images involved. Wraparound error in their
circular convolution is avoided by padding these functions with zeros.

VISUALIZATION: IDEAL LOW PASS FILTER:

As shown in the figure below.

Fig: ideal low pass filter 3-D view and 2-D view and line graph.

EFFECT OF DIFFERENT CUTOFF FREQUENCIES:

Fig. below(a) Test pattern of size 688x688 pixels, and (b) its Fourier spectrum. The spectrum
is double the image size due to padding but is shown in half size so that it fits in the page. The
superimposed circles have radii equal to 10, 30, 60, 160 and 460 with respect to the full-size
spectrum image. These radii enclose 87.0, 93.1, 95.7, 97.8 and 99.2% of the padded image
power respectively.

Fig: (a) Test pattern of size 688x688 pixels (b) its Fourier spectrum

Fig: (a) original image, (b)-(f) Results of filtering using ILPFs with cut-off frequencies set at
radii values 10, 30, 60, 160 and 460, as shown in fig.2.2.2(b). The power removed by these
filters was 13, 6.9, 4.3, 2.2 and 0.8% of the total, respectively.

As the cut-off frequency decreases,

 the image becomes more blurred

 more of the image power (detail and noise) is removed
 the effect is analogous to using larger spatial filter sizes

The severe blurring in this image is a clear indication that most of the sharp detail information
in the picture is contained in the 13% of the power removed by the filter. As the filter radius
increases, less and less power is removed, resulting in less blurring. Figs. (c) through (e) are
characterized by "ringing", which becomes finer in texture as the amount of high-frequency
content removed decreases.

WHY IS THERE RINGING?

Ideal low-pass filter function is a rectangular function. The inverse Fourier transform of a
rectangular function is a sinc function.

Fig. Spatial representation of ILPFs of order 1 and 20 and corresponding intensity

profiles through the centre of the filters (the size of all cases is 1000x1000 and the cut-off
frequency is 5), observe how ringing increases as a function of filter order.
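The whole ILPF smoothing pipeline (forward DFT, multiply by H, inverse DFT) can be sketched with NumPy's FFT routines; for brevity the sketch skips zero-padding, so a little wraparound error is tolerated, and the random test image and cut-off radius are arbitrary assumptions.

```python
import numpy as np

def ideal_lowpass_filter(img, d0):
    """Smooth img with an ideal low-pass filter of cut-off radius d0."""
    F = np.fft.fftshift(np.fft.fft2(img))                 # centred spectrum
    P, Q = img.shape
    u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing="ij")
    D = np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)      # distance from centre
    H = (D <= d0).astype(np.float64)                      # 1 inside the circle, 0 outside
    g = np.fft.ifft2(np.fft.ifftshift(H * F)).real        # back to the spatial domain
    return g

img = np.random.rand(64, 64)
print(ideal_lowpass_filter(img, 10).shape)                # (64, 64), visibly blurred
```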

Butterworth low-pass filter:

The transfer function of a Butterworth low-pass filter (BLPF) of order n, with cut-off
frequency at a distance D0 from the origin, is defined as

H(u, v) = 1 / [1 + (D(u, v)/D0)^(2n)]

The transfer function does not have a sharp discontinuity establishing the cut-off between passed
and filtered frequencies.

The cut-off frequency D0 defines the point at which H(u, v) = 0.5.

Fig. (a) perspective plot of a Butterworth low pass filter transfer function. (b) Filter displayed
as an image. (c)Filter radial cross sections of order 1 through 4.

Unlike the ILPF, the BLPF transfer function does not have a sharp discontinuity that gives a
clear cut-off between passed and filtered frequencies.

Gaussian low-pass filters: The form of these filters in two dimensions is given by
H(u, v) = e^(−D²(u, v) / (2·D0²))

 This transfer function is smooth, like Butterworth filter.


 Gaussian in frequency domain remains a Gaussian in spatial domain
 Advantage: No ringing artifacts.

Where D0 is the cut off frequency. When D(u,v) = D0, the GLPF is down to 0.607 of its
maximum value. This means that a spatial Gaussian filter, obtained by computing the IDFT of
above equation., will have no ringing. Fig. shows a perspective plot, image display and radial
cross sections of a GLPF function.

Fig. (a) Perspective plot of a GLPF transfer function. (b) Filter displayed as an image. (c) Filter
radial cross sections for various values of D0

Fig.(a) Original image. (b)-(f) Results of filtering using GLPFs with cut off frequencies at the
radii shown in fig.2.2.2. compare with fig.2.2.3 and fig.2.2.6

Fig. (a) Original image (784x 732 pixels). (b) Result of filtering using a GLPF with D0 = 100.
(c) Result of filtering using a GLPF with D0 = 80. Note the reduction in fine skin lines in the

magnified sections in (b) and (c).
Fig. shows an application of low-pass filtering for producing a smoother, softer-looking result
from a sharp original. For human faces, the typical objective is to reduce the sharpness of fine
skin lines and small blemishes.
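The Butterworth and Gaussian transfer functions can be generated directly from D(u, v); the sketch below builds both on an assumed 64x64 frequency rectangle with an arbitrary cut-off.

```python
import numpy as np

def dist_grid(P, Q):
    """D(u, v): distance of each frequency sample from the centre of the rectangle."""
    u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing="ij")
    return np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)

def butterworth_lowpass(P, Q, d0, n=2):
    return 1.0 / (1.0 + (dist_grid(P, Q) / d0) ** (2 * n))   # H = 0.5 at D = D0

def gaussian_lowpass(P, Q, d0):
    return np.exp(-dist_grid(P, Q) ** 2 / (2.0 * d0 ** 2))    # H = 0.607 at D = D0

H = gaussian_lowpass(64, 64, 16)
print(round(H[32, 32], 3), round(H[32, 48], 3))               # ~1.0 at centre, ~0.607 at D0
```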

2. 5. Sharpening frequency domain filters:

An image can be smoothed by attenuating the high-frequency components of its Fourier


transform. Because edges and other abrupt changes in intensities are associated with high-
frequency components, image sharpening can be achieved in the frequency domain by high
pass filtering, which attenuates the low-frequency components without disturbing high-
frequency information in the Fourier transform.

The filter functions H(u, v) are understood to be discrete functions of size P×Q; that is,
the discrete frequency variables are in the ranges u = 0, 1, 2, …, P-1 and v = 0, 1, 2, …, Q-1.

The meaning of sharpening is

 Edges and fine detail characterized by sharp transitions in image intensity


 Such transitions contribute significantly to high frequency components of Fourier
transform
 Intuitively, attenuating certain low frequency components and preserving high
frequency components result in sharpening.

Intended goal is to do the reverse operation of low-pass filters

 When low-pass filter attenuated frequencies, high-pass filter passes them

A high-pass filter is obtained from a given low-pass filter using the equation

Hhp(u, v) = 1 − Hlp(u, v)

where Hlp(u, v) is the transfer function of the low-pass filter. That is, when the low-pass filter
attenuates frequencies, the high-pass filter passes them, and vice versa.
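A sketch of obtaining a high-pass transfer function in exactly this way, using a Gaussian low-pass function as the starting point; the 64x64 size and D0 = 16 are arbitrary assumptions.

```python
import numpy as np

def gaussian_highpass(P, Q, d0):
    """H_hp(u, v) = 1 - H_lp(u, v) with a Gaussian low-pass transfer function."""
    u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing="ij")
    D2 = (u - P / 2) ** 2 + (v - Q / 2) ** 2
    return 1.0 - np.exp(-D2 / (2.0 * d0 ** 2))

H = gaussian_highpass(64, 64, 16)
print(round(H[32, 32], 3), round(H[0, 0], 3))   # ~0 at the centre, ~1 far from it
```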

We consider ideal, Butterworth, and Gaussian high-pass filters. As in the previous section, we
illustrate the characteristics of these filters in both the frequency and spatial domains. Fig.
shows typical 3-D plots, image representations and cross sections for these filters. As before,
we see that the Butterworth filter represents a transition between the sharpness of the ideal
filter and the broad smoothness of the Gaussian filter. Fig., discussed in the sections that follow,
illustrates what these filters look like in the spatial domain. The spatial filters were obtained
and displayed by using the procedure described earlier.

Fig: Top row: Perspective plot, image representation, and cross section of a typical ideal high-
pass filter. Middle and bottom rows: The same sequence for typical butter-worth and Gaussian
high-pass filters.

Ideal high-pass filter:

A 2-D ideal high-pass filter (IHPF) is defined as

H(u, v) = 0, if D(u, v) ≤ D0
H(u, v) = 1, if D(u, v) > D0

Where D0 is the cut-off frequency and D(u, v) is given by eq. As intended, the IHPF is the
opposite of the ILPF in the sense that it sets to zero all frequencies inside a circle of radius D0
while passing, without attenuation, all frequencies outside the circle. As in case of the ILPF,
the IHPF is not physically realizable.

SPATIAL REPRESENTATION OF HIGHPASS FILTERS:

Fig. Spatial representation of typical (a) ideal (b) Butter-worth and (c) Gaussian frequency
domain high-pass filters, and corresponding intensity profiles through their centres. We can
expect IHPFs to have the same ringing properties as ILPFs. This is demonstrated clearly in Fig.
which consists of various IHPF results using the original image in Fig.(a) with D0 set to 30, 60
and 160 pixels, respectively. The ringing in Fig. (a) is so severe that it produced distorted,
thickened object boundaries (e.g., look at the large letter “a”). Edges of the top three circles do
not show well because they are not as strong as the other edges in the image (the intensity of
these three objects is much closer to the background intensity, giving discontinuities of smaller
magnitude).

FILTERED RESULTS: IHPF:

Fig. Results of high-pass filtering the image in Fig.(a) using an IHPF with D0 = 30, 60, and
160.

The situation improved somewhat with D0 = 60. Edge distortion is quite evident still, but now
we begin to see filtering on the smaller objects. Due to the now familiar inverse relationship
between the frequency and spatial domains, we know that the spot size of this filter is smaller
than the spot of the filter with D0 = 30. The result for D0 = 160 is closer to what a high-pass
filtered image should look like. Here, the edges are much cleaner and less distorted, and the
smaller objects have been filtered properly.

Of course, the constant background in all images is zero in these high-pass filtered images
because high pass filtering is analogous to differentiation in the spatial domain.

Butter-worth high-pass filters:

A 2-D Butterworth high-pass filter (BHPF) of order n and cut-off frequency D0 is defined as

H(u, v) = 1 / [1 + (D0/D(u, v))^(2n)]

where D(u, v) is given by Eq. (3). This expression follows directly from the BLPF transfer
function and Hhp = 1 − Hlp. The middle row of Fig. 2.2.11 shows an image and cross section of
the BHPF function. Butterworth high-pass filters behave more smoothly than IHPFs. Fig. 2.2.14
shows the performance of a BHPF of order 2 and with D0 set to the same values as in Fig. 2.2.13.
The boundaries are much less distorted than in Fig. 2.2.13, even for the smallest value of cut-off frequency.

FILTERED RESULTS: BHPF:

Fig. Results of high-pass filtering the image in Fig.2.2.2(a) using a BHPF of order 2 with D0
= 30, 60, and 160 corresponding to the circles in Fig.2.2.2(b). These results are much smoother
than those obtained with an IHPF.

Gaussian high-pass filters:

The transfer function of the Gaussian high-pass filter (GHPF) with cut-off frequency locus at a
distance D0 from the centre of the frequency rectangle is given by

H(u, v) = 1 − e^(−D²(u, v) / (2·D0²))

where D(u, v) is given by Eq. (3). This expression follows directly from the GLPF transfer
function and Hhp = 1 − Hlp. The third row in Fig. 2.2.11 shows a perspective plot, image and
cross section of the GHPF function. Following the same format as for the BHPF, we show in
Fig. 2.2.15 comparable results using GHPFs. As expected, the results obtained are more gradual
than with the previous two filters.

FILTERED RESULTS: GHPF:

Fig. Results of high-pass filtering the image in fig.(a) using a GHPF with D0 = 30, 60 and 160,
corresponding to the circles in Fig.(b)

UNIT III
IMAGE RESTORATION AND COLOUR IMAGE PROCESSING

1. Image Restoration:

1.1 Introduction

Restoration improves an image in some predefined sense. It is an objective process. Restoration


attempts to reconstruct an image that has been degraded by using a priori knowledge of the
degradation phenomenon. These techniques are oriented toward modeling the degradation and
then applying the inverse process in order to recover the original image. Restoration techniques
are based on mathematical or probabilistic models of image processing. Enhancement, on the
other hand is based on human subjective preferences regarding what constitutes a “good”
enhancement result. Image Restoration refers to a class of methods that aim to remove or reduce
the degradations that have occurred while the digital image was being obtained. All natural
images when displayed have gone through some sort of degradation:

 During display mode


 Acquisition mode, or
 Processing mode
 Sensor noise
 Blur due to camera misfocus
 Relative object-camera motion
 Random atmospheric turbulence
 Other

1.2 Degradation Model:

The degradation process is modeled as a degradation function that, together with an additive
noise term, operates on an input image. The input image is represented by the notation f(x,y) and
the noise term can be represented as η(x,y). These two terms, when combined, give the result g(x,y).
If we are given g(x,y), some knowledge about the degradation function H and some knowledge
about the additive noise term η(x,y), the objective of restoration is to obtain an estimate f'(x,y)
of the original image. We want the estimate to be as close as possible to the original image.
The more we know about H and η, the closer f'(x,y) will be to f(x,y). If it is a linear, position-
invariant process, then the degraded image is given in the spatial domain by

g(x,y)=f(x,y)*h(x,y)+η(x,y)

h(x,y) is spatial representation of degradation function and symbol * represents convolution.


In frequency domain we may write this equation as

G(u,v)=F(u,v)H(u,v)+N(u,v)

The terms in the capital letters are the Fourier Transform of the corresponding terms in
the spatial domain.

Fig: A model of the image Degradation / Restoration process

1. 3 Various Noise Models

The principal source of noise in digital images arises during image acquisition and /or
transmission. The performance of imaging sensors is affected by a variety of factors, such as
environmental conditions during image acquisition and by the quality of the sensing elements
themselves. Images are corrupted during transmission principally due to interference in the
channels used for transmission. Since the main sources of noise in digital images result from
atmospheric disturbance and image sensor circuitry, the following assumptions can be made: the
noise model is spatially invariant (independent of spatial location), and the noise
model is uncorrelated with the object function.

Gaussian Noise:

This noise model is used frequently in practice because of its tractability in both the spatial
and frequency domains. The PDF of a Gaussian random variable z is

p(z) = (1 / (√(2π)·σ)) · e^(−(z − μ)² / (2σ²))

where z represents the gray level, μ is the mean (average) value of z, and σ is the standard deviation.

Rayleigh Noise:

Unlike the Gaussian distribution, the Rayleigh distribution is not symmetric. It is given by
the following formula.

The mean and variance of this density is

(iii) Gamma Noise:

The PDF of Erlang noise is given by

The mean and variance of this density are given by

Its shape is similar to the Rayleigh distribution. This equation is referred to as the gamma
density; strictly, it is correct only when the denominator is the gamma function.
(iv) Exponential Noise:
Exponential distribution has an exponential shape. The PDF of exponential noise is given as

Where a>0. The mean and variance of this density are given by

(v) Uniform Noise:


The PDF of uniform noise is given by

The mean and variance of this noise is


(vi) Impulse (salt & pepper) Noise:

In this case the noise appears as randomly scattered pixels forced to very dark or very bright
(extreme) values. The PDF of bipolar (impulse) noise is

p(z) = Pa for z = a, Pb for z = b, and 0 otherwise.
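For experimentation, the two most common noise models above can be simulated directly; the sketch below assumes an 8-bit image and arbitrary noise parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, mean=0.0, sigma=10.0):
    """Add zero-mean Gaussian noise and clip back to the 8-bit range."""
    noisy = img.astype(np.float64) + rng.normal(mean, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper(img, pa=0.05, pb=0.05):
    """Set pixels to 0 (pepper) with probability pa and to 255 (salt) with probability pb."""
    out = img.copy()
    r = rng.random(img.shape)
    out[r < pa] = 0
    out[(r >= pa) & (r < pa + pb)] = 255
    return out

img = np.full((64, 64), 128, dtype=np.uint8)
print(add_gaussian_noise(img).std(), (add_salt_pepper(img) != 128).mean())
```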

1.4 Restoration in the presence of Noise only - Spatial filtering


When the only degradation present in an image is noise, i.e.

g(x,y)=f(x,y)+η(x,y)
or
G(u,v)= F(u,v)+ N(u,v)
The noise terms are unknown so subtracting them from g(x,y) or G(u,v) is not a realistic
approach. In the case of periodic noise it is possible to estimate N(u,v) from the spectrum
G(u,v). So N(u,v) can be subtracted from G(u,v) to obtain an estimate of original image. Spatial

filtering can be done when only additive noise is present. The following techniques can be used
to reduce the noise effect:
i) Mean Filter:
(a)Arithmetic Mean filter:
It is the simplest mean filter. Let Sxy represent the set of coordinates in a subimage window of size
m×n centered at point (x, y). The arithmetic mean filter computes the average value of the
corrupted image g(x, y) in the area defined by Sxy. The value of the restored image f' at any
point (x, y) is the arithmetic mean computed using the pixels in the region defined by Sxy.

This operation can be implemented using a convolution mask in which all coefficients have value
1/mn. A mean filter smoothes local variations in an image, and noise is reduced as a result of
blurring. For every pixel in the image, the pixel value is replaced by the (possibly weighted)
mean value of its neighboring pixels, which results in a smoothing effect in the image.

(b) Geometric Mean filter:


An image restored using a geometric mean filter is given by the expression

Here, each restored pixel is given by the product of the pixels in the subimage window, raised
to the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic
mean filter, but it tends to lose less image detail in the process.
(c) Harmonic Mean filter:
The harmonic mean filtering operation is given by the expression

The harmonic mean filter works well for salt noise but fails for pepper noise. It does well with
Gaussian noise also.
(d) Order statistics filter:
Order statistics filters are spatial filters whose response is based on ordering (ranking) the pixels
contained in the image area encompassed by the filter. The response of the filter at any point is
determined by the ranking result.
(e) Median filter:
It is the best-known order-statistics filter; it replaces the value of a pixel by the median of the gray
levels in the neighbourhood of that pixel:

f̂(x, y) = median over (s,t)∈Sxy of g(s, t)

The original value of the pixel is included in the computation of the median. Median filters are quite
popular because, for certain types of random noise, they provide excellent noise-reduction
capabilities with considerably less blurring than linear smoothing filters of similar size. They are especially
effective for bipolar and unipolar impulse noise.
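A small illustrative sketch of median filtering follows, assuming a NumPy gray-level array; the window size and padding mode are arbitrary choices.

# Median filter: replace each pixel with the median gray level of its neighbourhood.
import numpy as np

def median_filter(g, size=3):
    k = size // 2
    padded = np.pad(g, k, mode="edge")
    out = np.empty_like(g)
    for i in range(g.shape[0]):
        for j in range(g.shape[1]):
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out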
(f) Max and Min filter:
Using the 100th percentile of a ranked set of numbers gives the max filter:

f̂(x, y) = max over (s,t)∈Sxy of g(s, t)

It is useful for finding the brightest points in an image. Because pepper noise has very low
values, it is reduced by the max filter as a result of the max selection process in the sub-image area Sxy.
The 0th percentile filter is the min filter:

f̂(x, y) = min over (s,t)∈Sxy of g(s, t)

This filter is useful for finding the darkest points in an image. Also, it reduces salt noise as a result of the min
operation.
(g) Midpoint filter:
The midpoint filter simply computes the midpoint between the maximum and minimum values
in the area encompassed by the filter:

f̂(x, y) = ½ [ max over (s,t)∈Sxy of g(s, t) + min over (s,t)∈Sxy of g(s, t) ]

It combines order statistics and averaging. This filter works best for randomly distributed
noise like Gaussian or uniform noise.

1.4 Image restoration using frequency domain filtering:


Frequency-domain filters are used for this purpose.
Band Reject Filters:
A band reject filter removes a band of frequencies about the origin of the Fourier transform.
Ideal Band Reject Filter:
An ideal band reject filter is given by the expression

H(u, v) = 1 if D(u, v) < D0 − W/2
H(u, v) = 0 if D0 − W/2 ≤ D(u, v) ≤ D0 + W/2
H(u, v) = 1 if D(u, v) > D0 + W/2

where
D(u, v) - the distance from the origin of the centred frequency rectangle
W - the width of the band
D0 - the radial centre of the band
Butterworth Band Reject Filter:

Gaussian Band Reject Filter:

These filters are mostly used when the location of the noise components in the frequency domain is
approximately known. Sinusoidal noise, for example, can be easily removed with these filters because it
appears as two impulses that are mirror images of each other about the origin of the frequency transform.

Band pass Filter:


The function of a band pass filter is opposite to that of a band reject filter: it allows a specific
frequency band of the image to pass and blocks the rest of the frequencies. The transfer
function of a band pass filter can be obtained from the corresponding band reject filter with
transfer function Hbr(u,v) by using the equation

Hbp(u, v) = 1 − Hbr(u, v)

These filters usually cannot be applied directly to an image because they may remove too much image
detail, but they are effective in isolating the effect on an image of selected frequency bands.
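The sketch below illustrates one possible implementation of an ideal band reject mask and its application through the Fourier transform; the band-pass counterpart then follows as 1 − Hbr. The image, D0 and W values are assumptions for illustration only.

# Ideal band-reject filtering in the frequency domain; the band-pass mask is 1 - H_br.
import numpy as np

def ideal_band_reject(shape, d0, w):
    rows, cols = shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    V, U = np.meshgrid(v, u)                  # centred frequency coordinates
    D = np.sqrt(U ** 2 + V ** 2)
    H = np.ones(shape)
    H[(D >= d0 - w / 2) & (D <= d0 + w / 2)] = 0.0
    return H

def apply_frequency_filter(img, H):
    F = np.fft.fftshift(np.fft.fft2(img))     # centre the spectrum
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

# band-pass counterpart: H_bp(u, v) = 1 - H_br(u, v)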

Notch Filters:
A notch filter rejects (or passes) frequencies in predefined neighbourhoods about a centre
frequency. Due to the symmetry of the Fourier transform, notch filters must appear in symmetric
pairs about the origin. The transfer function of an ideal notch reject filter of radius D0 with
centres at (u0, v0) and, by symmetry, at (−u0, −v0) is

H(u, v) = 0 if D1(u, v) ≤ D0 or D2(u, v) ≤ D0, and H(u, v) = 1 otherwise

where D1(u, v) and D2(u, v) are the distances from (u, v) to the two notch centres (u0, v0) and (−u0, −v0).
Ideal, Butterworth, Gaussian notch filters

1.5 Inverse Filtering
The simplest approach to restoration is direct inverse filtering, where we compute an estimate
F̂(u, v) of the transform of the original image simply by dividing the transform of the degraded
image G(u, v) by the degradation function H(u, v):

F̂(u, v) = G(u, v) / H(u, v)

Since G(u, v) = F(u, v) H(u, v) + N(u, v), this gives

F̂(u, v) = F(u, v) + N(u, v) / H(u, v)

From the above equation we observe that we cannot recover the undegraded image exactly,
because N(u, v) is a random function whose Fourier transform is not known. Moreover, if the degradation
function has zero or very small values, the term N(u, v)/H(u, v) can easily dominate the estimate. One approach to
get around the zero or small-value problem is to limit the filter frequencies to values near the
origin. We know that H(0,0) is equal to the average value of h(x, y). By limiting the analysis
to frequencies near the origin, we reduce the probability of encountering zero values.
Minimum Mean Square Error (Wiener) filtering:
The inverse filtering approach has poor performance in the presence of noise. The Wiener filtering approach
incorporates both the degradation function and the statistical characteristics of the noise into the restoration
process. The objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error
between them is minimized. The error measure is given by

e² = E{ (f − f̂)² }

where E{·} is the expected value of the argument.

We assume that the noise and the image are uncorrelated, and that one or the other has zero mean. We also
assume that the gray levels in the estimate are a linear function of the levels in the degraded image. Under
these conditions, the minimum of the error function is given in the frequency domain by

F̂(u, v) = [ (1 / H(u, v)) · |H(u, v)|² / ( |H(u, v)|² + Sn(u, v)/Sf(u, v) ) ] G(u, v)

where H(u,v) = degradation function
H*(u,v) = complex conjugate of H(u,v)
|H(u,v)|² = H*(u,v) H(u,v)
Sn(u,v) = |N(u,v)|² = power spectrum of the noise
Sf(u,v) = |F(u,v)|² = power spectrum of the undegraded image
The power spectrum of the undegraded image is rarely known. An approach used frequently
when these quantities are not known or cannot be estimated is to replace the ratio Sn/Sf by a constant,
giving the expression

F̂(u, v) = [ (1 / H(u, v)) · |H(u, v)|² / ( |H(u, v)|² + K ) ] G(u, v)

where K is a specified constant.
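As an illustration, a minimal sketch of this approximate Wiener filter is given below; it assumes the degradation transfer function H is already available as an array of the same size as the image, and the constant K is a hypothetical value to be tuned.

# Approximate Wiener filtering: F_hat = [ H* / (|H|^2 + K) ] G, with K a tunable constant.
import numpy as np

def wiener_restore(g, H, K=0.01):
    G = np.fft.fft2(g)                          # spectrum of the degraded image
    F_hat = np.conj(H) / (np.abs(H) ** 2 + K) * G
    return np.real(np.fft.ifft2(F_hat))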


Constrained least squares filtering:
The Wiener filter has the disadvantage that we need to know the power spectra of the undegraded
image and of the noise. Constrained least squares filtering requires knowledge only of the
mean and variance of the noise. These parameters can usually be calculated from a given
degraded image, which is the advantage of this method; in addition, it yields an optimal result
in the sense defined by its criterion for each image to which it is applied. The method is based on an
explicit optimality criterion.

The optimality criterion for restoration is based on a measure of smoothness, such as the second
derivative of the image (the Laplacian). We seek the minimum of a criterion function C defined as

C = Σx Σy [ ∇²f̂(x, y) ]²

subject to the constraint

|| g − H f̂ ||² = || η ||²

where ||w||² ≜ wᵀw is the Euclidean vector norm, f̂ is the estimate of the undegraded image, and ∇² is the
Laplacian operator.
The frequency-domain solution to this optimization problem is given by

F̂(u, v) = [ H*(u, v) / ( |H(u, v)|² + γ |P(u, v)|² ) ] G(u, v)
where γ is a parameter that must be adjusted so that the constraint is satisfied, and P(u,v) is the
Fourier transform of the Laplacian operator

p(x, y) = [ 0 −1 0 ; −1 4 −1 ; 0 −1 0 ]
1.6 Colour Fundamentals


Colours are seen as variable combinations of the primary colours of light: red (R), green (G)
and blue (B). The primary colours can be mixed to produce the secondary colours: magenta
(red + blue), cyan (green + blue), and yellow (red + green). Mixing the three primaries, or a
secondary with its opposite primary colour, in the right proportions produces white light.

Fig. Primary and secondary colours of light


RGB colours are used for colour TV, monitors, and video cameras. However, the primary
colours of pigments are cyan (C), magenta (M), and yellow (Y), and the secondary colours are
red, green and blue. A proper combination of the three pigment primaries, or a secondary with
its opposite primary, produces black.

Fig. Primary and secondary colours of pigments


CMY colours are used for colour printing.
Colour characteristics
The characteristics used to distinguish one colour from another are:
 Brightness: means the amount of intensity (i.e. colour level).
 Hue: represents dominant colour as perceived by an observer.
 Saturation: refers to the amount of white light mixed with a hue.
1.7 Colour Model
The purpose of a colour model is to facilitate the specification of colours in some standard way.
A colour model is a specification of a coordinate system and a subspace within that system
where each colour is represented by a single point. Colour models most commonly used in
image processing are:
 RGB model for colour monitors and video cameras
 CMY and CMYK (cyan, magenta, yellow, black) models for colour printing
 HSI (hue, saturation, intensity) model
The RGB colour model
In this model, each colour appears in its primary colours red, green and blue. This model is
based on a Cartesian coordinate system. The colour subspace is the cube shown in the figure
below. The different colours in this model are points on or inside the cube and are defined by
vectors extending from the origin.

All colour values R, G and B have been normalized to the range [0, 1]; however, each of R, G and B
can also be represented with values from 0 to 255. Each RGB colour image consists of three component
images, one for each primary colour, as shown in the figure below. These three images are
combined on the screen to produce a colour image.

The total number of bits used to represent each pixel in an RGB image is called the pixel depth. For
example, if each of the red, green and blue component images in an RGB image is an 8-bit image, the pixel
depth of the RGB image is 24 bits. The figure below shows the component images of an RGB
image.

The CMY and CMYK colour model


Cyan, magenta and yellow are the primary colours of pigments. Most printing devices such as
colour printers and copiers require CMY data input or perform an RGB to CMY conversion
internally. This conversion is performed using the equation

C = 1 − R,  M = 1 − G,  Y = 1 − B
where all colour values have been normalized to the range [0, 1]. In printing, combining equal
amounts of cyan, magenta and yellow produces a muddy-looking black. In order to produce true
black, a fourth colour, black, is added, giving rise to the CMYK colour model.
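A small sketch of this conversion follows, with values normalised to [0, 1]; the additional CMY-to-CMYK step uses one common convention for extracting the black component and is an illustrative assumption, not part of the notes.

# RGB to CMY: each pigment primary is the complement of a light primary.
import numpy as np

def rgb_to_cmy(rgb):
    return 1.0 - np.asarray(rgb, dtype=float)   # C = 1-R, M = 1-G, Y = 1-B

def cmy_to_cmyk(cmy):
    # one common convention: pull out the black component K as the smallest of C, M, Y
    k = cmy.min(axis=-1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        cmy_adj = np.where(k < 1.0, (cmy - k) / (1.0 - k), 0.0)
    return np.concatenate([cmy_adj, k], axis=-1)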
The figure below shows the CMYK component images of an RGB image.

The HSI colour model


The RGB and CMY colour models are not well suited for describing colours in terms of human
interpretation. When we view a colour object, we describe it by its hue, saturation and
brightness (intensity). Hence the HSI colour model has been developed. The HSI model
decouples the intensity component from the colour-carrying information (hue and saturation)
in a colour image. As a result, this model is an ideal tool for developing colour image
processing algorithms.
The hue, saturation and intensity values can be obtained from the RGB colour cube. That is,
we can convert any RGB point to a corresponding point in the HSI colour model by working
out the geometric formulas.
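One standard per-pixel form of these geometric formulas is sketched below for illustration; r, g and b are assumed to be normalised to [0, 1], and the small epsilon terms are added only to avoid division by zero.

# RGB to HSI conversion (one standard formulation), r, g, b in [0, 1].
import numpy as np

def rgb_to_hsi(r, g, b):
    i = (r + g + b) / 3.0                                   # intensity
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + 1e-12)      # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta                  # hue in degrees
    return h, s, i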

Converting colours from HSI to RGB

Fig. A full-colour image and its HSI component images
1.8 Colour Transformation
As with gray-level transformations, we model colour transformations using the expression

g(x, y) = T[ f(x, y) ]

where f(x, y) is a colour input image, g(x, y) is the transformed colour output image, and T is
the colour transform (operator).
This colour transform can also be written componentwise as

si = Ti(r1, r2, …, rn),  i = 1, 2, …, n

where ri and si are the colour components of f(x, y) and g(x, y) at a point, and {T1, T2, …, Tn} is the set of transformation functions.
For example, suppose we wish to modify the intensity of the image shown in the figure below using g(x, y) = k f(x, y), with 0 < k < 1.
 In the RGB colour space, three components must be transformed:
si = k ri,  i = 1, 2, 3
 In the CMY(K) space, three components must also be transformed:
si = k ri + (1 − k),  i = 1, 2, 3
 In HSI space, only the intensity component r3 is transformed:
s3 = k r3
Fig. (a) Original image. (b) Result of decreasing its intensity
 Colour Complement

 Colour complement replaces each colour with its opposite colour in the colour
circle of the Hue component. This operation is analogous to image negative in a
gray scale image

 Can be used to enhance details buried in dark regions of an image


 The figure below shows an example in which the saturation component is left unaltered

 Colour Slicing

 Used to highlight a specific range of colours in an image, to separate objects from their surroundings
 Display just the colours of interest, or use the regions defined by specified
colours for further processing
 More complex than gray-level slicing, due to multiple dimensions for each
pixel
 Dependent on the colour space chosen; the HSI space is often preferred
 Using a cube of width W to enclose the reference colour with components (a1,
a2, . . . , an), the transformation is given by

si = 0.5 if |rj − aj| > W/2 for any 1 ≤ j ≤ n, otherwise si = ri,  i = 1, 2, …, n

 If the colour of interest is specified by a sphere of radius R0 centred at (a1, a2, . . . , an), the transformation is

si = 0.5 if Σj (rj − aj)² > R0², otherwise si = ri,  i = 1, 2, …, n
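A minimal sketch of the cube-based slicing rule follows, assuming an RGB image normalised to [0, 1]; the reference colour a and width W are hypothetical inputs, and non-selected pixels are pushed to a neutral 0.5.

# Colour slicing with a cube of width W centred on a reference colour a.
import numpy as np

def colour_slice_cube(img, a, W):
    # img: rows x cols x 3 array in [0, 1]; a: reference colour (3,); W: cube width
    outside = np.any(np.abs(img - np.asarray(a)) > W / 2.0, axis=-1)
    out = img.copy()
    out[outside] = 0.5            # push non-selected colours to a neutral gray
    return out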

Example

1.9 Smoothing and Sharpening


 Colour image smoothing
 Colour image smoothing is a pre-processing technique intended for
removing possible image perturbations without losing information
 Extend spatial filtering mask to colour smoothing, dealing with component
vectors
 Let Sxy be the neighbourhood centred at (x, y)
 Average of RGB components in the neighbourhood is given by

 Same effect as smoothing each channel separately


Example:

 Colour image sharpening

 Colour image sharpening is the set of techniques whose purpose is to
improve the visual appearance of the image and to highlight or recover certain
details of the image for suitable analysis by a human or a machine
 Use the Laplacian

Example:

1.10 Colour Segmentation


 Segmentation in HSI colour space

 Colour is conveniently represented in hue image


 Saturation is used as a masking image to isolate regions of interest in the hue
image
 Intensity image used less frequently as it has no colour information
 The binary mask is generated by thresholding the saturation plane with T = 0.1 ×
(maximum value in the saturation plane).
Example

 Segmentation in RGB vector space

 Create an estimate of the average color to be segmented as vector a


 Let z be an arbitrary point in the RGB color space
 z is similar to a if the Euclidean distance between them is less than a specified
threshold D0, i.e. if

D(z, a) = ||z − a|| = [ (zR − aR)² + (zG − aG)² + (zB − aB)² ]^(1/2) ≤ D0
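A minimal sketch of this distance test is given below, assuming a NumPy RGB image; the average colour vector a and threshold D0 are hypothetical inputs.

# RGB vector-space segmentation: keep pixels whose Euclidean distance to the
# average colour vector a is below a threshold D0.
import numpy as np

def segment_rgb(img, a, d0):
    dist = np.linalg.norm(img.astype(float) - np.asarray(a, dtype=float), axis=-1)
    return dist <= d0             # boolean mask of the segmented region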

Example:

UNIT IV
IMAGE COMPRESSION AND IMAGE SEGMENTATION

1.1 Introduction
Image compression is an application of data compression that encodes the original image with
fewer bits. The objective of image compression is to reduce the redundancy of the image and to
store or transmit the data in an efficient form. The main goal of such a system is to reduce the storage
requirement as much as possible while keeping the decoded image displayed on the monitor as similar
to the original image as possible.
1. 2 Image Compression Model
Compression is of two types, i.e. lossy and lossless. A typical image compression
system comprises two main blocks: an Encoder (Compressor) and a Decoder (Decompressor).
The image f(x, y) is fed to the encoder, which encodes the image so as to make it suitable for
transmission. The decoder receives the transmitted signal and reconstructs the output image
f̂(x, y). If the system is error free, f̂(x, y) will be a replica of f(x, y).

The encoder and the decoder are made up of two blocks each. The encoder is made up of a
Source encoder and a Channel encoder. The source encoder removes the input redundancies
while the channel encoder increases the noise immunity of the source encoders. The decoder
consists of a channel decoder and a source decoder. The function of the channel decoder is to
ensure that the system is immune to noise. Hence if the channel between the encoder and the
decoder is noise free, the channel encoder and the channel decoder are omitted.
Source encoder and decoder: The three basic types of redundancy in an image are
interpixel redundancy, coding redundancy and psychovisual redundancy. Run-length coding is used to
eliminate or reduce interpixel redundancy, Huffman encoding is used to eliminate or reduce
coding redundancy, while I.G.S. quantization is used to reduce psychovisual redundancy. The job of the
source decoder is to get back the original signal. Run-length coding,
Huffman encoding and I.G.S. coding are examples of techniques used in source encoders and decoders.

The input image is passed through a mapper. The mapper reduces the interpixel redundancies.
The mapping stage is a lossless technique and hence is a reversible operation. The output of
the mapper is passed through a quantizer block. The quantizer block reduces the psychovisual
redundancies. It compresses the data by eliminating some information and hence is an
irreversible operation; it is this quantization that makes schemes such as JPEG lossy. Hence,
in the case of lossless compression, the quantizer block is eliminated. The
final block of the source encoder is that of a symbol encoder. This block creates a variable
length code to represent the output of the quantizer. The Huffman code is a typical example of
the symbol encoder. The symbol encoder reduces coding redundancies.

The source decoder block performs exactly the reverse operations of the symbol encoder and
the mapper blocks. It is important to note that the source decoder has only two blocks: since
quantization is irreversible, an inverse quantizer block does not exist. Here the channel is assumed to be
noise free, and hence the channel encoder and channel decoder have been ignored.
Channel encoder and decoder: The channel encoder is used to make the system immune to
transmission noise. Since the output of the source encoder has very little redundancy, it is
highly susceptible to noise. The channel encoder inserts a controlled form of redundancy to the
source encoder output making it more noise resistant.
1. 3 Error-free compression or loss less compression
It is also known as entropy coding, as it uses decomposition techniques to minimize redundancy.
In lossless compression techniques the original image can be perfectly recovered from the
compressed image, and these techniques do not add noise to the signal.
Following techniques are included in lossless compression:
 Variable Length Coding (VLC)
 Run length encoding
 Differential coding
 Predictive coding
 Dictionary-based coding
Variable Length Coding (VLC): Most entropy-based encoding techniques rely on assigning
variable-length code words to each symbol, with the most likely symbols assigned the
shorter code words. In the case of image coding, the symbols may be raw pixel values or the
numerical values obtained at the output of the mapper stage (e.g., differences between
consecutive pixels, run-lengths, etc.). The most popular entropy-based encoding technique is
the Huffman code. It provides the least amount of information units (bits) per source symbol.
It is described in more detail in a separate short article
Run length encoding (RLE): RLE is one of the simplest data compression techniques. It
consists of replacing a sequence (run) of identical symbols by a pair containing the symbol and

the run length. It is used as the primary compression technique in the 1-D CCITT Group 3 fax
standard and in conjunction with other techniques in the JPEG image compression standard
(described in a separate short article).
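A toy run-length encoder/decoder for a 1-D sequence of symbols is sketched below; it illustrates only the symbol/run-length pairing and is not the CCITT or JPEG variant.

# Minimal run-length encoder/decoder for a 1-D sequence of symbols.
def rle_encode(seq):
    out = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1
        out.append((seq[i], j - i))        # (symbol, run length)
        i = j
    return out

def rle_decode(pairs):
    return [sym for sym, count in pairs for _ in range(count)]

# e.g. rle_encode([255, 255, 255, 0, 0, 7]) -> [(255, 3), (0, 2), (7, 1)]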

Differential coding: Differential coding techniques explore the inter pixel redundancy in
digital images. The basic idea consists of applying a simple difference operator to neighbouring
pixels to calculate a difference image, whose values are likely to fall within a much narrower
range than the original gray-level range. As a consequence of this narrower distribution – and
consequently reduced entropy – Huffman coding or other VLC schemes will produce shorter
code words for the difference image.
Predictive coding:
Predictive coding techniques constitute another example of exploration of inter pixel
redundancy, in which the basic idea is to encode only the new information in each pixel. This
new information is usually defined as the difference between the actual and the predicted value
of that pixel. The key component is the predictor; whose function is to generate an estimated
(predicted) value for each pixel from the input image based on previous pixel values. The
predictor’s output is rounded to the nearest integer and compared with the actual pixel value:
the difference between the two – called the prediction error – is then encoded by a VLC encoder.
Since prediction errors are likely to be smaller than the original pixel values, the VLC encoder
will likely generate shorter code words. There are several local, global, and adaptive prediction
algorithms in the literature. In most cases, the predicted pixel value is a linear combination of
previous pixels.
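A minimal sketch of previous-pixel prediction on a single image row follows; the predictor and the integer-valued row are illustrative assumptions (no quantization is applied, so the scheme shown is lossless).

# Previous-pixel predictive coding of one image row: transmit only prediction errors.
import numpy as np

def predict_encode(row):
    row = np.asarray(row, dtype=int)
    errors = np.empty_like(row)
    errors[0] = row[0]                      # first pixel sent as-is
    errors[1:] = row[1:] - row[:-1]         # error = actual - predicted (previous pixel)
    return errors

def predict_decode(errors):
    return np.cumsum(errors)                # running sum reconstructs the row exactly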
Dictionary-based coding:
Dictionary-based coding techniques are based on the idea of incrementally building a
dictionary (table) while receiving the data. Unlike VLC techniques, dictionary-based
techniques use fixed-length code words to represent variable-length strings of symbols that
commonly occur together. Consequently, there is no need to calculate, store, or transmit the
probability distribution of the source, which makes these algorithms extremely convenient and
popular. The best-known variant of dictionary-based coding algorithms is the LZW (Lempel-
Ziv-Welch) encoding scheme, used in popular multimedia file formats such as GIF, TIFF, and
PDF.
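A compact sketch of LZW encoding for a byte string is given below; the code assignment shown (new entries appended to a 256-entry initial table) is the usual textbook form and is included only for illustration.

# Compact LZW encoder for a byte string (dictionary built incrementally).
def lzw_encode(data: bytes):
    dictionary = {bytes([i]): i for i in range(256)}
    w, codes = b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
        else:
            codes.append(dictionary[w])
            dictionary[wc] = len(dictionary)   # add the new string to the table
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes

# e.g. lzw_encode(b"ABABABA") -> [65, 66, 256, 258]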
1. 4 Lossy Compression
Lossy compression methods have larger compression ratios than lossless compression
techniques, and they are used for most applications. With these methods the reconstructed output
image is not an exact copy of the original but closely resembles it over most of the image.

As shown in the figure, the prediction/transformation/decomposition process itself is completely
reversible; the loss of information is due to the quantization step. The entropy coding that follows
quantization is lossless. When the decoder receives the data, entropy decoding is applied to the
compressed signal values to get the quantized signal values; de-quantization is then applied
and the recovered image resembles the original.
Lossy compression includes following methods:
 Block truncation coding
 Code Vector quantization
 Fractal coding
 Transform coding
 Sub-band coding
Block Truncation Coding: In this method the image is divided into blocks, as in fractal coding. A
window of N by N pixels of the image is considered as a block, and the mean of all the pixel values
in that window is computed. The threshold is normally this mean value of the pixel values in the block.
A bitmap of the block is then generated by replacing all pixels whose values are greater than or equal
to the threshold by a 1 (and the remaining pixels by a 0). Then, for each segment in the
bitmap, a reconstruction value is determined as the average of the values of the corresponding pixels in
the original block.
Code Vector Quantization: The basic idea in vector quantization is to create a dictionary of
vectors of constant size, called code vectors; each code vector is a block of pixel values. A given
image is then partitioned into non-overlapping vectors called image vectors. The dictionary is built
from this information and indexed, and it is then used for encoding the original image. Thus, the
image is entropy coded with the help of these indices.
Fractal Compression: The basic idea behind this coding is to divide the image into segments by
using standard features like colour difference, edges, frequency and texture. Parts of an image
usually resemble other parts of the same image. A dictionary, used as a look-up table of fractal
segments, is built; this library contains codes which are compact sets of numbers. By applying an
iterative algorithmic operation on these fractals, the image is encoded. This scheme is far more
effective for compressing images that are natural and textured.

Transform Coding: In this coding, transforms like the Discrete Fourier Transform (DFT), the
Discrete Cosine Transform (DCT) and the Discrete Sine Transform are used to convert the pixel
values from the spatial domain into the frequency domain. These transforms have an energy
compaction property: only a few coefficients carry most of the energy of the original image signal
and can be used to reproduce it. Only those few significant coefficients are retained and the
remaining ones are discarded; the retained coefficients are then quantized and encoded. DCT coding
has been the most commonly used transform for image data.
Subband Coding: In this scheme, quantization and coding are applied separately to each of the
analyzed sub-bands of the frequency spectrum. This coding is very useful because quantization
and coding can be applied more accurately to each individual sub-band.
1. 5 Detection of Discontinuities
There are three basic types of discontinuities: points, lines and edges.
Their detection is based on convolving the image with a spatial mask.

 A general 3x3 mask with coefficients w1, w2, …, w9:

w1 w2 w3
w4 w5 w6
w7 w8 w9
 The response of the mask at any point (x,y) in the image is

R = w1 z1 + w2 z2 + … + w9 z9 = Σ (i = 1 to 9) wi zi

where zi is the gray level of the pixel associated with mask coefficient wi.
Point Detection:
 A point has been detected at the location p(i,j) on which the mask is centred if |R| > T,
where T is a nonnegative threshold, and R is obtained with the following point-detection mask:

-1 -1 -1
-1  8 -1
-1 -1 -1
 The idea is that the gray level of an isolated point will be quite different from the gray
level of its neighbours.

Line Mask

 Horizontal line

 45° line

 Vertical line

 -45° line

 If, at a certain point in the image, |Ri|>|Rj| for all j ≠ i, that point is said to be more likely
associated with a line in the direction of mask i.
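The four masks listed above can be written out explicitly; the arrays below are the commonly used 3x3 line-detection masks, reproduced here only for reference.

# Standard 3x3 line-detection masks: strongest response along the named direction.
import numpy as np

horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]])
plus_45    = np.array([[-1, -1,  2],
                       [-1,  2, -1],
                       [ 2, -1, -1]])
vertical   = np.array([[-1,  2, -1],
                       [-1,  2, -1],
                       [-1,  2, -1]])
minus_45   = np.array([[ 2, -1, -1],
                       [-1,  2, -1],
                       [-1, -1,  2]])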

Edge Detection
 It locates sharp changes in the intensity function.
 Edges are pixels where brightness changes abruptly.
 A change of the image function can be described by a gradient that points in the
direction of the largest growth of the image function.
 An edge is a property attached to an individual pixel and is calculated from the image
function behaviour in a neighbourhood of the pixel.
 Magnitude of the first derivative detects the presence of the edge
 Sign of the second derivative determines whether the edge pixel lies on the dark side
or the light side.

(a) Gradient operator:

 For a function f(x,y), the gradient of f at coordinates (x',y') is defined as the vector

∇f = [ Gx, Gy ]ᵀ = [ ∂f/∂x, ∂f/∂y ]ᵀ

 Magnitude of the vector ∇f(x', y'):

|∇f| = √(Gx² + Gy²) ≈ |Gx| + |Gy|

 Direction of the vector ∇f(x', y'):

α(x', y') = tan⁻¹( Gy / Gx )
 Its magnitude can be approximated in the digital domain in a number of ways, which
result in a number of operators such as Roberts, Prewitt and Sobel operators for
computing its value.

(b) Sobel operator:

 It provides both a differentiating and a smoothing effect, which is particularly


attractive as derivatives typically enhance noise.
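A minimal sketch of Sobel gradient computation follows, assuming a NumPy gray-level array; the loop-based convolution and the |Gx| + |Gy| magnitude approximation are illustrative choices.

# Sobel gradient: approximate the gradient magnitude with the two 3x3 Sobel kernels.
import numpy as np

Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal derivative
Ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])   # vertical derivative

def sobel_magnitude(img):
    img = img.astype(float)
    pad = np.pad(img, 1, mode="edge")
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = pad[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(win * Kx)
            gy[i, j] = np.sum(win * Ky)
    return np.abs(gx) + np.abs(gy)    # common approximation of the gradient magnitude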

(c) Laplacian Operator:

 The Laplacian of a 2D function f(x,y) is a 2nd-order derivative defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²
 The Laplacian has the same properties in all directions and is therefore invariant to
rotation in the image.
 It can also be implemented in digital form in various ways.
 For a 3x3 region, the mask is commonly given as

 0 -1  0
-1  4 -1
 0 -1  0
 It is seldom used in practice for edge detection for the following reasons:
1. As a 2nd-order derivative, it is unacceptably sensitive to noise.
2. It produces double edges and is unable to detect edge direction.
 The Laplacian usually plays the secondary role of detector for establishing whether a
pixel is on the dark or light side of an edge.
1. 6 Edge linking and boundary detection
 The techniques for detecting intensity discontinuities yield pixels lying only on the
boundary between regions.
 In practice, this set of pixels seldom characterizes a boundary completely because of
noise, breaks in the boundary from non-uniform illumination, and other effects that introduce
spurious intensity discontinuities.
 Edge detection algorithms are typically followed by linking and other boundary
detection procedures designed to assemble edge pixels into meaningful boundaries.
(a) Local Processing
 Two principal properties used for establishing similarity of edge pixels in this kind
of analysis are:
1. The strength of the response of the gradient operator used to produce the edge pixel.
2. The direction of the gradient vector.
 In a small neighbourhood, e.g. 3x3 or 5x5, all points with common properties are
linked.
 A point (x',y') in the neighbourhood of (x,y) is linked to the pixel at (x,y) if both the
following magnitude and direction criteria are satisfied:

|∇f(x, y) − ∇f(x', y')| ≤ E   (magnitude criterion, E a nonnegative threshold)
|α(x, y) − α(x', y')| < A     (direction criterion, A a nonnegative angle threshold)
1. 7 Thresholding
 Thresholding is one of the most important approaches to image segmentation.
 If background and object pixels have gray levels grouped into 2 dominant modes, they
can be separated with a threshold easily.

 Thresholding may be viewed as an operation that involves tests against a function T of
the form T = T[x, y, p(x, y), f(x, y)], where f(x,y) is the gray level of point (x,y), and p(x,y)
denotes some local property of this point such as the average gray level of a
neighbourhood centred on (x,y). The thresholded image g(x, y) is defined by g(x, y) = 1 if
f(x, y) > T and g(x, y) = 0 if f(x, y) ≤ T, so that pixels labelled 1 correspond to objects and
pixels labelled 0 correspond to the background.
 Special cases:
If T depends on
1. f(x,y) only - global threshold
2. Both f(x,y) & p(x,y) - local threshold
3. The spatial coordinates (x,y) as well - dynamic (adaptive) threshold
 Multilevel thresholding is in general less reliable as it is difficult to establish effective
thresholds to isolate the regions of interest.
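A minimal sketch of the basic iterative global-thresholding procedure is given below, assuming a NumPy gray-level array with two reasonably separated modes; the convergence tolerance eps is an arbitrary choice.

# Basic global thresholding: iterate T until it stabilises, then segment.
import numpy as np

def global_threshold(img, eps=0.5):
    t = img.mean()                                    # initial estimate of the threshold
    while True:
        g1, g2 = img[img > t], img[img <= t]
        if g1.size == 0 or g2.size == 0:              # degenerate split, stop early
            return img > t, t
        t_new = 0.5 * (g1.mean() + g2.mean())         # midpoint of the two group means
        if abs(t_new - t) < eps:
            return img > t_new, t_new                 # binary mask and final threshold
        t = t_new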

Adaptive thresholding
 The threshold value varies over the image as a function of local image characteristics.
 Image f is divided into sub images.
 A threshold is determined independently in each sub image.
 If a threshold can't be determined in a sub image, it can be interpolated with thresholds
obtained in neighbouring sub images.
 Each sub image is then processed with respect to its local threshold.

Threshold selection based on boundary characteristics
 A reliable threshold must be selected to identify the mode peaks of a given histogram.
 This capability is very important for automatic threshold selection in situations where
image characteristics can change over a broad range of intensity distributions.
 We can consider only those pixels that lie on or near the boundary between objects and
the background such that the associated histogram is well-shaped to provide a good
chance for us to select a good threshold.
 The gradient can indicate if a pixel is on an edge or not.
 The Laplacian can tell if a given pixel lies on the dark or light (background or object)
side of an edge.
 The gradient and Laplacian can be combined to produce a three-level image s(x, y):
s(x, y) = 0 if ∇f < T (pixel not on an edge)
s(x, y) = + if ∇f ≥ T and ∇²f ≥ 0 (pixel on the dark side of an edge)
s(x, y) = − if ∇f ≥ T and ∇²f < 0 (pixel on the light side of an edge)
where T is a threshold.
