
Digital Image Processing

Prepared by:
Sri. T. Aravinda Babu
Asst. Prof.,
Dept. of ECE, CBIT
Contents:
 Introduction

 Need for digital image processing

 Applications

 Course Orientation

Sri. T. Aravinda babu, Asst. Prof., Dept. of


ECE, CBIT 2
Digital Image Processing
 An image is defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates.

 The amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.

 When x, y and the intensity values of f are all finite, discrete quantities, the image is called a digital image.

 Digital Image Processing means the processing of images that are digital in nature by a digital computer.

 A digital image is composed of a finite number of elements, called picture elements, image elements, pels or pixels.
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 3
Need for Image processing

It is motivated by two major applications:

 Improvement of pictorial information for human perception.

 Processing of image data for efficient storage, transmission and representation in autonomous machine applications.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 4


Human Perception
 Employ methods capable of enhancing pictorial information
for human interpretation and analysis.

Typical Applications:

 Noise filtering

 Content enhancement

 Contrast enhancement

 Deblurring

 Remote sensing

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 5


Noise Filtering

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 6


Image Enhancement

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 7


Image Enhancement

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 8


Image Deblurring

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 9


Medical Imaging

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 10


Medical Imaging

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 11


Remote Sensing

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 12


Machine Vision Applications
 Here the interest is in procedures for extracting image information suitable for computer processing.

Typical Applications:
 Industrial machine vision for product assembly and inspection

 Automated Target detection and Tracking

 Finger print recognition

 Machine processing of aerial and satellite imagery for weather


prediction and crop assessment etc.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 13


Automated Inspection

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 14


Automated Inspection

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 15


Automated Inspection

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 16


Movement Detection

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 17


Automated Target detection and Tracking

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 18


Image Compression
 An image usually contains a lot of redundancy that can be exploited to achieve compression:
1. Pixel redundancy

2. Coding redundancy

3. Psychovisual redundancy

Applications:

1. Reduced storage

2. Reduction in bandwidth

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 19


Image Compression

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 20


DIGITAL IMAGE PROCESSING
(Program Elective-V)

Subject Code: 18EC E21

Instruction: 3 L Hours per Week

Duration of SEE: 3 Hours

SEE: 70 Marks

CIE: 30 Marks

Credits: 3

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 21


UNIT – I

Elements of Digital Image Processing Systems, Digital


image representation, elements of visual perception,
Image sampling and Quantization, Basic Relationships
between pixels.

CO1: Describe basic concepts of image processing system.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 22


UNIT – II

Properties and Applications of Fourier Transform: FFT,


Discrete cosine transform, Hadamard transform, Haar
transform, Slant transform, DWT and Hotelling transform.

CO-2: Summarize and compare various digital image


transform techniques.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 23


UNIT – III
Spatial Enhancement Techniques: Histogram equalization,
direct histogram specification, Local enhancement.
Frequency domain techniques: Low pass, High pass and
Homomorphic Filtering, Image Zooming Techniques.

CO-3: Demonstrate and survey digital image enhancement in


practical applications.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 24


UNIT – IV
Image Degradation model, Algebraic approach to restoration,
inverse filtering, Least mean square filter, Constrained least
square restoration and interactive restoration. Speckle noise
and its removal techniques.

CO-4: Analyze the case study related to various techniques of


image restoration.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT


25
UNIT – V
Redundancies for image compression, Huffman Coding,
Arithmetic coding, Bit- plane coding, loss less and lossy
predictive coding. Transform coding techniques: Zonal coding
and Threshold coding.

CO-5: Apply compression techniques on digital image.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 26


Text Books
1. Gonzalez R.C. and Woods R.E., “Digital Image Processing” 2/e, PHI,

2005.

2. A. K. Jain, “Fundamentals of Digital Image processing”, PHI, 1989.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 27


Origins of Digital Image Processing
 The first application of digital images was in the newspaper industry, when pictures were first sent by submarine cable between London and New York.

 The Bartlane cable picture transmission system was introduced in the early 1920s.

 Specialized printing equipment coded pictures for cable transmission; the pictures were transmitted in this way and reproduced on a telegraph printer fitted with typefaces simulating a halftone pattern.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 30


 The initial problems in improving the visual quality of these early digital pictures were related to the selection of printing procedures and the distribution of intensity levels.
 A printing technique based on photographic reproduction made from tapes perforated at the telegraph receiving terminal was introduced in 1921.

 Figure shows an image obtained using this method.

 The improvements over the earlier method are evident in tonal quality and in resolution.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 31


 The early Bartlane systems were capable of coding images in five distinct levels of gray.
 This capability was increased to 15 levels in 1929.

 Figure is typical of the type of images that could be obtained using the 15-tone equipment.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 32


Electro magnetic spectrum

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 33


Fundamental Steps in Digital Image Processing

34
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Step 1: Image Acquisition
The image is captured by a sensor (e.g., a camera) and, if the sensor output is not already in digital form, digitized using an analogue-to-digital converter.
Step 2: Image Enhancement
 The process of manipulating an image so that the result is more suitable than the original for a specific application.
 The idea behind enhancement techniques is to bring out details that are hidden, or simply to highlight certain features of interest in an image.
Step 3: Image Restoration
 Restoration also deals with improving the appearance of an image.
 Restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a “good” enhancement result.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 35


Step 4: Colour Image Processing
Use the colour of the image to extract features of interest in an image
Step 5: Wavelets
Are the foundation of representing images in various degrees of resolution. It
is used for image data compression.
Step 6: Compression
Techniques for reducing the storage required to save an image or the
bandwidth required to transmit it.
Step 7: Morphological Processing
Tools for extracting image components that are useful in the representation
and description of shape. In this step, there would be a transition from
processes that output images, to processes that output image attributes.
Step 8: Image Segmentation
Segmentation procedures partition an image into its constituent parts or
objects.
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 36
Step 9: Representation and Description
Representation: Decide whether the data should be represented as a boundary or as a complete region. This step almost always follows the output of a segmentation stage.
Boundary representation: Focuses on external shape characteristics, such as corners and inflections.
Region representation: Focuses on internal properties, such as texture or skeletal shape. Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing (mainly recognition).
Description (or feature selection): Deals with extracting attributes that result in some information of interest.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 37


Step 10: Object Recognition

Recognition is the process that assigns a label to an object based on the information provided by its description.

Step 11: Knowledge Base

Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 38


Components/Elements of a Digital Image Processing System

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 39


Image Sensors
1. Image Sensors
Two elements are required to acquire digital images. The first is the physical
device that is sensitive to the energy radiated by the object we wish to image
(Sensor). The second, called a digitizer, is a device for converting the output of
the physical sensing device into digital form.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 40


Single Sensor Element

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT


41
Linear Sensor

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 42


Array Sensor

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT


43
2. Specialized Image Processing Hardware
 Usually consists of the digitizer, mentioned before, plus hardware that
performs other primitive operations, such as an arithmetic logic unit
(ALU), which performs arithmetic and logical operations in parallel on
entire images.
 This type of hardware sometimes is called a frontend subsystem, and its
most distinguishing characteristic is speed. In other words, this unit
performs functions that require fast data throughputs that the typical
main computer cannot handle.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 44


3. Computer
The computer in an image processing system is a general-purpose computer
and can range from a PC to a supercomputer. In dedicated applications,
sometimes specially designed computers are used to achieve a required level
of performance.
4. Image Processing Software
Software for image processing consists of specialized modules that perform
specific tasks. A well-designed package also includes the capability for the user
to write code that, as a minimum, utilizes the specialized modules.
5. Mass Storage Capability
Mass storage capability is a must in image processing applications. An image of size 1024 * 1024 pixels, in which each pixel is an 8-bit quantity, requires one megabyte of storage space if the image is not compressed.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 45


Digital storage for image processing applications falls into three principal categories:
1. Short-term storage for use during processing.
2. Online storage for relatively fast recall.
3. Archival storage, characterized by infrequent access.

6. Image Displays

The displays in use today are mainly color (preferably flat screen) TV
monitors. Monitors are driven by the outputs of the image and graphics
display cards that are an integral part of a computer system.

7. Hardcopy Devices

Devices used for recording images include laser printers, film cameras, heat-sensitive devices, inkjet units and digital units such as optical and CD-ROM disks.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 46


8. Networking

 Networking is almost a default function in any computer system in use today. Because of the large amount of data inherent in image processing applications, the key consideration in image transmission is bandwidth.

 In dedicated networks this typically is not a problem, but communications with remote sites via the internet are not always as efficient. This situation is improving quickly as a result of optical fiber and other broadband technologies.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 47


SIMPLE IMAGE FORMATION MODEL
 Images are denoted by two-dimensional functions of the form f(x, y).

 The value of f at spatial coordinates (x, y) is a scalar quantity whose physical meaning is determined by the source of the image, and whose values are proportional to the energy radiated by a physical source (e.g., electromagnetic waves).

 f(x, y) must therefore be nonnegative and finite.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 48


 Function f(x, y) is characterized by two components:

(1) the amount of source illumination incident on the scene being viewed,

(2) the amount of illumination reflected by the objects in the scene.

These are called the illumination and reflectance components, and are denoted by i(x, y) and r(x, y), respectively.

The two functions combine as a product to form f(x, y):

f(x, y) = i(x, y) r(x, y), with 0 < i(x, y) < ∞ and 0 ≤ r(x, y) ≤ 1.

 Thus, reflectance is bounded by 0 (total absorption) and 1 (total reflectance).

The nature of i(x, y) is determined by the illumination source, and r(x, y) is determined by the characteristics of the imaged objects.


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 49
 Let the intensity (gray level) of a monochrome image at any coordinates (x, y) be denoted by L = f(x, y).

 L lies in the range Lmin ≤ L ≤ Lmax.

 Theoretically, the only requirement on Lmin is that it be nonnegative, and on Lmax that it be finite.

 Lmin = imin * rmin and Lmax = imax * rmax.

 The interval [Lmin, Lmax] is called the intensity (or gray) scale. In practice this interval is often shifted to [0, 1], where L = 0 is considered black and L = 1 is considered white on the scale.

 All intermediate values are shades of gray varying from black to white.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 50


Human Visual System
 In many image processing applications, the objective is to help a
human observer perceive the visual information in an image.
Therefore, it is important to understand the human visual system.

 The human visual system consists mainly of the eye (image sensor or
camera), optic nerve (transmission path), and brain (image
information processing unit or computer).

 It is one of the most sophisticated image processing and analysis


systems.

 Its understanding would also help in the design of efficient, accurate


and effective computer/machine vision systems.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 51


Elements of Visual Perception
 The field of digital image processing is built on a foundation of mathematics, but human intuition and analysis often play a role in the choice of one technique versus another, and this choice often is made based on subjective, visual judgments.

 In particular, our interest is in the elementary mechanics of how images


are formed and perceived by humans.

 We are interested in learning the physical limitations of human vision in


terms of factors that also are used in our work with digital images.

 Factors such as how human and electronic imaging devices compare in


terms of resolution and ability to adapt to changes in illumination are not
only interesting, they are also important from a practical point of view

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 52


Structure of the Human eye
Figure shows a simplified cross section of the human eye. The eye is nearly a sphere (with a diameter of about 20 mm) enclosed by three membranes:
1. the cornea and sclera (outer cover);
2. the choroid;
3. the retina.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 53


 Nearly spherical with a diameter of 20 mm (approx.).

 Cornea : Outer tough transparent membrane, covers anterior surface of the

eye.

 Sclera : Outer tough opaque membrane, covers rest of the optic globe.

 Choroid : Contains blood vessels, provides nutrition to the eye.

 Iris: Anterior portion of choroid, pigmented, gives color to the eye.

 Retina: Innermost membrane of the eye. When the eye is properly focused, light from an object outside the eye is imaged on the retina. Pattern vision is afforded by the distribution of discrete light receptors over the surface of the retina.
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Receptors

 Two types of receptors: rods and cones (light sensors).

 Cones - 6-7 million, located in central portion of retina (fovea), responsible


for photopic vision (bright-light vision) and color perception, can resolve
fine details.

 Rods - 75-150 million, distributed over the entire retina, responsible for
scotopic vision (dim-light vision), not color sensitive, gives general overall
picture (not details).

 Fovea - Circular indentation in center of retina, about 1.5mm diameter,


dense with cones.

 Blind spot - Point on retina where optic nerve emerges, devoid of


photoreceptors.
55
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Distribution of Rods and Cones on Retina

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 56


Image formation in the eye

 In an ordinary photographic camera, the lens has a fixed focal length. Focusing
at various distances is achieved by varying the distance between the lens and
the imaging plane, where the film (or imaging chip in the case of a digital
camera) is located.
In the human eye, the distance between the center of the lens and the imaging
sensor (the retina) is fixed, and the focal length needed to achieve proper focus
is obtained by varying the shape of the lens.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 57


 The distance between the center of the lens and the retina varies from 14 mm to 17 mm.

 The farther the object, the smaller the refractive power of the lens and the larger the focal length.

 For example, suppose that a person is looking at a tree 15 m high at a distance of 100 m. Letting h denote the height of that object in the retinal image, the geometry of the figure yields

15/100 = h/17, or h = 2.55 mm.

 Perception then takes place by the relative excitation of light receptors, which
transform radiant energy into electrical impulses that ultimately are decoded
by the brain.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 58


Brightness Adaptation
 The human eye can adapt to an enormous range of light intensity levels, on the order of 10^10.
 Perceived brightness (subjective brightness) is a logarithmic function of light intensity.
 The visual system cannot operate over such a range simultaneously.
 Compared with this total range, the eye can simultaneously discriminate only a small number of distinct intensity levels. For a given viewing condition, the current sensitivity level of the visual system is called the brightness adaptation level.
 At this adaptation level, the eye can perceive brightness in the range Bb (below which everything is perceived as black) to Ba (above which the eye adapts to a different sensitivity).

59
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Brightness Discrimination
The ability of the eye to discriminate between changes
in brightness levels is called brightness discrimination.
 The increment of intensity ∆Ic that is discriminable
over a background intensity of I is measured.
 Weber ratio --- it is the ratio ∆Ic/I.
 Small value of Weber ratio --- good brightness
discrimination, a small percentage change in intensity is
discriminable.
 Large value of Weber ratio --- poor brightness
discrimination, a large percentage change in intensity is
required.
 Brightness discrimination is better (smaller Weber ratio) at high intensities than at low intensities.
60
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Perceived Brightness is not a Simple Function of Light Intensity

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 61


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 62
Image Sampling and Quantization
 Digital images are generated from sensed data.

 The output of most sensors is a continuous voltage waveform whose amplitude and spatial behavior are related to the physical phenomenon being sensed.

 To create a digital image, we need to convert the continuous sensed data into digital form.

 Let f be a continuous image that is to be converted to digital form.

 The image may be continuous with respect to the x- and y-coordinates, and also in amplitude.

 To convert it to digital form, we have to sample the function in both coordinates and amplitude.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 63


 This involves two processes:

1. Sampling 2. Quantization

 Digitizing the coordinate values is called sampling.

 Digitizing the amplitude values is called quantization.

 Sampling the analog signal means instantaneously measuring the voltage of the signal at fixed intervals in time.

 The value of the voltage at each instant is converted into a number.

 The number represents the brightness of the image at that point.

 The grabbed image is now a digital image and can be accessed as a two-dimensional array of data.

 Each data point is called a pixel (picture element).

 The notation used to express a digital image is F(x, y).

 F(x, y) is the brightness of the image at point (x, y), where x is the row and y is the column.


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 64
 Consider a plot of the amplitude (intensity level) values of the continuous image along a line segment AB.

 To sample this function, we take equally spaced samples along line AB.

 The spatial location of each sample is indicated by a vertical tick mark in the bottom part of the figure.

 The samples are shown as small white squares superimposed on the function.

 The set of these discrete locations gives the sampled function.

 However, the values of the samples still span (vertically) a continuous range of intensity values.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 65


 The figure shows the intensity values divided into eight discrete intervals ranging from black to white.

 The vertical tick marks indicate the specific values assigned to each of the eight intensity levels.

 The continuous intensity levels are quantized by assigning one of the eight values to each sample.

 The digital samples resulting from both sampling and quantization are shown.

 Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.

 In addition to the number of discrete levels used, the accuracy achieved in quantization is highly dependent on the noise content of the sampled signal.
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 66
 In practice, the method of sampling is determined by the sensor arrangement used to generate the image.

 The quality of a digital image is determined to a large degree by the number of samples and discrete intensity levels used in sampling and quantization.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 67


Representing Digital Image
 Let f(x,y) represent a continuous image function of two continuous
variable, x and y.

 We convert this function into a digital image by sampling and quantization.

 Suppose that we sample the continuous image into a 2-D array, f(x,y)
containing M-rows and N- columns where ( x, y) are discrete coordinates.

 The discrete coordinates take integer values x = 0, 1, …, M-1 and y = 0, 1, …, N-1.

 The value of the image at any coordinates (x, y) is denoted f(x, y), where x and y are integers. The section of the real plane spanned by the coordinates of an image is called the spatial domain, with x and y being referred to as spatial variables or spatial coordinates.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 68


 The intensity of each point in the display is proportional to the value of f at
that point. In below figure, there are only three equally spaced intensity
values. If the intensity is normalized to the interval [0,1] then each point in
the image has the value 0, 0.5, or 1.

 A monitor or printer converts these three values to black, gray, or white,


respectively, as in Figure

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 69


Representation of a Digital Image as a Matrix
 f(x, y) is a digital image. Each element of this array is called an image element, picture element, pixel or pel.

 If ∆x and ∆y are the separations of grid points in the x and y directions, respectively, we have

 The sampling process requires specification of ∆x and ∆y, or equivalently M and N (for given image dimensions).

70
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
 Image digitization requires that decisions be made regarding the values for M, N, and the number L of discrete intensity levels.

 Digital storage and quantizing considerations usually lead to the number of intensity levels L being an integer power of two,

L = 2^k, where k is an integer.

 The discrete levels are equally spaced and are integers in the range [0, L-1].

 The range of values spanned by the gray scale is referred to as the dynamic range, a term used in different ways in different fields.

 We define the dynamic range of an imaging system to be the ratio of the maximum measurable intensity to the minimum detectable intensity level in the system.

 Dynamic range establishes the lowest and highest intensity levels that a system can represent in an image.
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 71
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 72
 Image contrast is defined as the difference in intensity between the highest and lowest intensity levels in an image.

 When an appreciable number of pixels in an image have a high dynamic range, we can expect the image to have high contrast. Conversely, an image with a low dynamic range typically looks dull.

 Q: Suppose a pixel has 1 bit; how many gray levels can it represent?

Answer: 2 intensity levels only, black and white: 0 = black, 1 = white.

Q: Suppose a pixel has 2 bits; how many gray levels can it represent?

A: 4 gray intensity levels (00, 01, 10, 11).

Q: If we want to represent 256 intensities of grayscale, how many bits do we need?

Answer: 8 bits, since 2^8 = 256.


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 73
L: number of grayscale levels a pixel can represent, L = 2^k

M * N: the number of pixels in the image

k: number of bits in each pixel

Number of bits required to store the digital image: b = M * N * k

If M = N, then b = N^2 * k
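A small worked example of these storage formulas is sketched below in Python. The 1024 x 1024, 8-bit values are chosen to match the earlier mass-storage slide; the variable names simply mirror the symbols above.

import numpy as np  # not strictly needed here; shown for consistency with later sketches

M, N, k = 1024, 1024, 8      # image size and bits per pixel (assumed example values)
L = 2 ** k                   # number of gray levels, L = 2^k
b = M * N * k                # bits required to store the image, b = M*N*k
print(L, b, b // 8)          # 256 levels, 8388608 bits = 1048576 bytes (one megabyte)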

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 74


Spatial Resolution
 Spatial resolution is a measure of the smallest discernible detail in an image.

 It can be measured in a number of ways:

 Dots or pixels per unit distance

 Line pairs per unit distance

 In the US, this measure is usually expressed as dots per inch (dpi).

 Newspapers are printed with a resolution of about 75 dpi

 Magazines – 133 dpi

 Glossy brochures – 175 dpi

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 75


Intensity Resolution
 Intensity resolution is the smallest discernible change in intensity level.

 The number of intensity levels is usually an integer power of two.

 The most common numbers are 8 bits and 16 bits; 32 bits is used only rarely.

 The number of bits used to quantize intensity is referred to as the intensity resolution.

 Discernible changes in intensity are influenced not only by noise and saturation values but also by the capabilities of human perception.

 Using an insufficient number of intensity levels in smooth areas of a digital image produces a set of very fine, ridge-like structures in areas of constant or nearly constant intensity. This effect is called false contouring.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 79


Basic relationships between pixels
 In image analysis, image boundaries and regions need to be extracted.

 To draw an object boundary in an image, we need to join connected pixels.

Various relationship between pixels are the following


 Neighbours

 Adjacency

 Path

 Connectivity

 Region

 Boundary

 Distance

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 82


Neighbours of a pixel
 4-Neighbours of pixel

 Diagonal-Neighbours of pixel

 8-Neighbours of pixel

 A pixel p at (x,y) has two horizontal and two vertical neighbours at (x+1,y), (x-1,y), (x,y+1) and (x,y-1). These are called the 4-neighbours of p: N4(p).

 A pixel p at (x,y) has 4 diagonal neighbours at (x+1,y+1), (x+1,y-1), (x-1,y+1) and (x-1,y-1). These are called the diagonal neighbours of p: ND(p).

83
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
 The 4-neighbours and the diagonal neighbours of p are called 8-
neighbours of p : N8(p).

N8(p)= N4(p ) ∪ ND(p)
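A minimal Python sketch of these neighbour sets follows; the function names n4, nd and n8 are illustrative only, and neighbours falling outside the image border would have to be discarded in practice.

def n4(x, y):
    # the 4-neighbours N4(p) of p = (x, y)
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(x, y):
    # the diagonal neighbours ND(p)
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(x, y):
    # N8(p) = N4(p) U ND(p)
    return n4(x, y) | nd(x, y)

print(sorted(n8(2, 2)))   # the 8 neighbours of the pixel at (2, 2)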

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 84


Adjacency between pixels
 Let V be the set of intensity values used to define adjacency.

 In a binary image, V ={1} if we are referring to adjacency of pixels with


value 1. In a gray-scale image, the idea is the same, but set V typically
contains more elements.

 For example, in the adjacency of pixels with a range of possible intensity


values 0 to 255, set V could be any subset of these 256 values.

Three types of adjacency:

 4-adjacency

 8-adjacency

 M-adjacency:

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 85


 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is
in the set N4(p).

 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is


in the set N8(p).

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 86


 M-adjacency (also called mixed adjacency): Two pixels p and q with values from V are m-adjacent if

(a) q is in N4(p),

or

(b) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
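The definition above can be checked directly in code. The sketch below (plain Python, illustrative names, a small made-up binary image) tests conditions (a) and (b) for a 2-D array img and a value set V.

def n4(p):
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def in_V(img, p, V):
    x, y = p
    return 0 <= x < len(img) and 0 <= y < len(img[0]) and img[x][y] in V

def m_adjacent(img, p, q, V):
    if not (in_V(img, p, V) and in_V(img, q, V)):
        return False
    if q in n4(p):                     # condition (a)
        return True
    if q in nd(p):                     # condition (b)
        common = n4(p) & n4(q)
        return not any(in_V(img, r, V) for r in common)
    return False

img = [[0, 1, 1],
       [0, 1, 0],
       [0, 0, 1]]
print(m_adjacent(img, (1, 1), (0, 1), (1,)))  # True: 4-adjacent
print(m_adjacent(img, (1, 1), (0, 2), (1,)))  # False: (0, 1) is a common 4-neighbour in V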

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 87


M-adjacency (also called mixed adjacency) problem:

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 90


Path
 A path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates

(x0, y0), (x1, y1), …, (xn, yn)

where (x0, y0) = (x, y), (xn, yn) = (s, t), and pixels (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n. In this case, n is the length of the path. If (x0, y0) = (xn, yn), the path is a closed path.

 Paths are called 4-, 8- or m-paths depending on the type of adjacency specified.

91
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Connectivity between pixels
 It is an important concept in digital image processing.

 It is used for establishing boundaries of objects and components of


regions in an image.

Two pixels are said to be connected:

 If they are adjacent in some sense(neighbour pixels,4/8/m-adjacency)

 If their gray levels satisfy a specified criterion of similarity(equal


intensity level)

Three types of connectivity on the basis of adjacency. They are

 4-connectivity: Two or more pixels are said to be 4-connected if they are 4-adjacent with each other.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 92


 8-connectivity: Two or more pixels are said to be 8-connected if they are 8-adjacent with each other.

 m-connectivity: Two or more pixels are said to be m-connected if they are m-adjacent with each other.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT


93
Example: For V={0,1}, find the length of shortest 4,8 and M-paths between p and
q. repeat for V={1,2} for the given image.

94
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Example: For V={2,3,4}, compute the lengths of shortest 4,8 and M-paths between
p and q in the following image.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 95


Region
 Let R represent a subset of pixels in an image. R is a region of the image if R
is a connected set.

 Two regions, Ri and Rj are said to be adjacent if their union forms a


connected set.

 Regions that are not adjacent are said to be disjoint.

 4- and 8-adjacency when referring to regions.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 96


Boundary
 The boundary (also called the border or contour) of a region R is the set of pixels in R that are adjacent to pixels in the complement of R.

 Put another way, the border of a region is the set of pixels in the region that have at least one background neighbor.

 That is, the boundary of a region is the set of pixels in the region that have one or more neighbors that are not in R. The boundary is the edge of the region.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 97


DISTANCE MEASURES
 For pixels p, q, and s, with coordinates (x, y), (u, v), and (w, z), respectively, D is a distance function or metric if
(a) D(p, q) ≥ 0 (D(p, q) = 0 iff p = q),
(b) D(p, q) = D(q, p), and
(c) D(p, s) ≤ D(p, q) + D(q, s).

 The Euclidean distance between p and q is defined as
De(p, q) = [(x - u)^2 + (y - v)^2]^(1/2)

 The D4 distance (called the city-block distance) between p and q is defined as
D4(p, q) = |x - u| + |y - v|

 The D8 distance (called the chessboard distance) between p and q is defined as
D8(p, q) = max(|x - u|, |y - v|)

 Dm distance: It is defined as the shortest m-path between the points. In this case, the distance between two pixels will depend on the values of the pixels along the path, as well as the values of their neighbors.
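The three closed-form distances above are easy to compute directly; a short Python sketch (illustrative function names, arbitrary example points) is shown here.

def d_euclidean(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def d4(p, q):      # city-block distance
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):      # chessboard distance
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_euclidean(p, q), d4(p, q), d8(p, q))   # 5.0, 7, 4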

Example: Consider the following arrangement of pixels and assume that p, p2, and p4 have value 1 and that p1 and p3 can each have a value of 0 or 1. Suppose that we consider the adjacency of pixels with value 1 (i.e., V = {1}). Now, we compute the Dm distance between points p and p4.

99
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Here we have 4 cases:

 Case1: If p1 =0 and p3 = 0 The length of the shortest m-path (the Dm distance)


is 2 (p, p2, p4)

 Case2: If p1 =1 and p3 = 0, now, p1 and p will no longer be adjacent ( m-


adjacency ) then, the length of the shortest path will be 3 (p, p1, p2, p4)

 Case3: If p1 =0 and p3 = 1, The same applies here, and the shortest –m-path
will be 3 (p, p2, p3, p4) .

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 100


 Case4: If p1 =1 and p3 = 1 The length of the shortest m-path will be 4 (p, p1,
p2, p3, p4).

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 101


 Example:

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 102


Types of Digital Images

 Binary Images

 Grey scale images

 Color or RGB images

 Multispectral images

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 103


Binary Image
 A binary image is also called a black-and-white image.

 There are two possible values for each pixel: black (0) and white (1).

 Each pixel needs only 1 bit in a binary image.

 Applications: text, fingerprints

 A binary image can be obtained from a gray-level (or color) image f(x, y) by thresholding, as sketched below.
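A minimal NumPy sketch of thresholding follows; the threshold value 128 and the random stand-in image are arbitrary choices for illustration, not part of the original material.

import numpy as np

gray = np.random.default_rng(0).integers(0, 256, size=(4, 4))   # stand-in 8-bit gray image
binary = (gray >= 128).astype(np.uint8)   # 1 (white) where bright enough, 0 (black) otherwise
print(binary)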

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 104


Grey scale image
 Gray-scale images are referred to as monochrome (one-color) images.

 They contain gray-level information, no color information.

 Each pixel is a shade of grey between black and white.

 For an 8-bit image, black is represented by 0 and white by 255.

 For an 8-bit image, the number of quantization levels is 256.

 In applications like medical imaging and astronomy, 12 or 16 bits/pixel images are used.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 105


Colour or RGB image
 Color images can be modeled as three band monochrome image data,
where each band of data corresponds to a different color.

 The actual information stored in the digital image data is the gray level
information in each spectral band.

 Typical color images are represented as red, green and blue (RGB image).

 Using the 8-bit monochrome standard as a model, the corresponding color image would have 24 bits per pixel (8 bits for each of the three color bands red, green and blue).

 Each of the color components has a range from 0 to 255.

 A color image consists of a stack of three matrices, representing the red, green and blue values for each pixel.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 106


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 107
Multi spectral image
 Multispectral images typically contain information outside the normal human perceptual range.

 This may include infrared, ultraviolet, X-ray, acoustic or radar data.

 These are not images in the usual sense because the information represented is not directly visible to the human visual system.

 This information is often represented in digital form by mapping the different spectral bands to RGB components.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 108


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 109
Image formats
 An image format is a standardized means of organizing and storing an image.

 Image files are composed of digital data in one of these formats and can be rasterized for use on a computer display or printer.

 An image file format may store data in uncompressed, compressed or vector form.

 Different image format are

1. GIF

2. JPEG and JPEG2000

3. PNG

4. TIFF

5. BMP

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 110


GIF Format
 GIF stands for Graphics Interchange Format.

 This format compresses images losslessly.

 It has an extremely limited color range, suitable for the web but not for printing or photography.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 111


JPEG format
 JPEG stands for Joint Photographic Experts Group.

 JPEG files are images that have been compressed to store a lot of information in a small file.

 A JPEG loses some of the image detail during compression in order to make the file small.

 JPEG files are usually used for photographs on the web, because they create a small file that is easily loaded on a web page and also looks good.

 JPEG is a poor choice for line drawings, logos or graphics.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 112


JPEG2000
 JPEG2000 is a compression standard enabling both lossy and lossless storage.

 It improves quality and compression ratio, but also requires more computational power to process.

 JPEG2000 also adds features that are missing in JPEG.

 It is used in professional movie editing and distribution.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 113


PNG
 PNG stands for Portable Network Graphics.

 It was created to replace GIF.

 It allows for a full range of color and better compression.

 For photographs, PNG is not as good as JPEG because it creates a larger file, but for images with text or line art it is better.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 114


TIFF
 TIFF stands for Tagged Image File Format.

 TIFF images create very large file sizes

 TIFF images are uncompressed and thus contain a lot of image


data (which is why the files are so big).

 Most common file type used in photo software.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 115


BMP
 BMP is called the Windows bitmap format.

 It handles graphics files within the Microsoft Windows OS.

 BMP files are uncompressed and therefore large and lossless.

 Its advantages are a simple structure and wide acceptance by Windows programs.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 116


Image Transforms

Prepared by:
Sri. T. Aravinda Babu
Asst. Prof.,
Dept. of ECE, CBIT
Image Transform
 Transforms are mathematical tools that allow us to move from one domain to another (e.g., from the time domain to the frequency domain and vice versa).

 A transformation does not change the information content present in the signal; it only changes the way it is represented.

Image transform (block diagram): an N X N image is mapped by the forward transform to an N X N coefficient matrix (another "image"); the inverse transform maps the coefficient matrix back to the original image.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 2


 An image transform is a class of unitary matrices used for representing images.

 A 1-D signal can be represented by an orthogonal series of basis functions.

 An image can also be expanded in terms of a discrete set of basis arrays called basis images. These basis images are generated from unitary matrices.

Applications:
 Pre-processing (filtering, enhancement)
 Data compression
 Feature extraction (edge detection, corner detection)

Need for transforms:
1. Fast computation (convolution, correlation)
2. Efficient storage and transmission (compression)
3. Better image processing (e.g., denoising, enhancement, restoration)

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 3


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 4
Unitary Transform
 Consider a transformation matrix A applied to a 1-D vector consisting of the elements {a(0), a(1), …, a(N-1)}.

Orthogonal matrix:

 A is said to be an orthogonal matrix if A^-1 = A^T; thus A A^-1 = I = A A^T.

Unitary matrix:

 A is said to be a unitary matrix if A^-1 = A*^T; thus A A^-1 = I = A A*^T.

 In the case of a real matrix, A = A*; thus a real orthogonal matrix is also unitary.

Example: Check whether A1, A2, A3 are orthogonal or not, and unitary or not.

 Forming A1^T and computing the product A1 A1^T gives the identity matrix I.

 Since A1 A1^T = I, A1 is orthogonal.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 5


 Similarly, forming A1*^T and computing A1 A1*^T also gives the identity matrix I.

 Since A1 A1*^T = I, A1 is unitary as well.
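These checks are easy to reproduce numerically. The NumPy sketch below uses an illustrative 2 x 2 matrix (not the A1 from the slide, whose entries are not reproduced here) to test the orthogonality and unitarity conditions defined above.

import numpy as np

A = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # an example real matrix

print(np.allclose(A @ A.T, np.eye(2)))           # A A^T = I  -> orthogonal
print(np.allclose(A @ A.conj().T, np.eye(2)))    # A A*^T = I -> unitary (real case)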

1D Unitary Transform:

For a one-dimensional sequence u(n), 0 ≤ n ≤ N-1, also represented as a vector U of size N, the unitary transform is defined as

v(k) = Σ a(k, n) u(n),  (sum over n = 0, 1, …, N-1),  0 ≤ k ≤ N-1,

where v(k) is the transformed sequence and a(k, n) is called the forward transform kernel.

The original sequence can be recovered by the inverse transform

u(n) = Σ a*(k, n) v(k),  (sum over k = 0, 1, …, N-1),  0 ≤ n ≤ N-1,

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 6


 where a* is the conjugate of a and is the inverse transformation kernel.

 In matrix notation,

Forward transform: V = A U

Inverse transform: U = A*^T V

 since A^-1 = A*^T (A is unitary).

Example:

Find the forward transform V and recover U by the inverse transform.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 7


2D Unitary Transform
 For an N X N image u(m , n), the forward and inverse transform are given by

where a k ,l (m, n) is forward transformation kernel

a* k ,l (m, n) is reverse transformation kernel

u(m , n) is original image

v(k ,l ) is transformed image

satisfying orthonormality condition

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 8


2D separable unitary Transform
 The forward kernel is said to be separable, if

 A=a(k , m), B=b(l , n) are unitary matrices AA* = I, BB*=I

 The forward kernel is said to be symmetric if A=B such that

 If the kernel A of an image transform is separable and symmetric, then the


forward transform is given by

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 9


 Inverse transform is given by

 In matrix notation of forward transform is given by

V=AUAT

 Reverse or inverse transform is

U=A*TVA*

 A is the transformation kernel and it will keep changing based on the


transform, e.g DFT, DCT

 Separable transform can be completed by two steps:

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 10


Advantage of a separable transform: it is computationally efficient.
The number of multiplications and additions required to compute v(k,l) is
O(N^4) without a separable transform, and
O(N^3) with a separable transform, as illustrated below.
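The two-step computation V = A U A^T (columns first, then rows) and its inverse U = A*^T V A* can be checked numerically. The NumPy sketch below uses the unitary DFT matrix purely as an example of a unitary kernel A; any unitary A would behave the same way.

import numpy as np

N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
A = np.exp(-2j * np.pi * k * n / N) / np.sqrt(N)   # example unitary kernel (unitary DFT)

U = np.random.default_rng(1).random((N, N))        # stand-in N x N image
step1 = A @ U                                      # 1-D transform applied to the columns
V = step1 @ A.T                                    # then applied to the rows: V = A U A^T
U_back = A.conj().T @ V @ A.conj()                 # inverse: U = A*^T V A*
print(np.allclose(U, U_back))                      # True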

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 11


1D Discrete Fourier Transform(1D-DFT)
 The DFT of a sequence u(n), n = 0, 1, …, N-1, is defined as

v(k) = Σ u(n) W_N^(kn),  (sum over n = 0, …, N-1),  k = 0, 1, …, N-1,  where W_N = e^(-j2π/N).

 The inverse transform is given by

u(n) = (1/N) Σ v(k) W_N^(-kn),  (sum over k = 0, …, N-1).

 This pair of equations is not scaled properly to be a unitary transformation. In image processing it is convenient to consider the unitary DFT, defined as

v(k) = (1/√N) Σ u(n) W_N^(kn),   u(n) = (1/√N) Σ v(k) W_N^(-kn).

 The N X N unitary DFT matrix F has elements F(k, n) = (1/√N) W_N^(kn).
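A small NumPy sketch of the unitary DFT matrix, under the convention stated above, is shown below; it also checks unitarity and the relation to NumPy's (unscaled) FFT.

import numpy as np

N = 4
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F = np.exp(-2j * np.pi * k * n / N) / np.sqrt(N)        # F[k, n] = (1/sqrt(N)) W_N^(kn)

print(np.allclose(F @ F.conj().T, np.eye(N)))           # True: F is unitary
u = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(F @ u, np.fft.fft(u) / np.sqrt(N)))   # matches NumPy's FFT up to 1/sqrt(N)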

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 12


 For N=4,

 In the matrix form

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 13


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 14
2D Discrete Fourier Transform(2D-DFT)
 The 2D-DFT of N X N image is defined as

 The inverse transform is given by

 In the matrix form

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 15


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 16
Find DFT of V?

17
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Properties
1. The elements of F are complex valued; thus F is complex.

2. F^T = F, so F is symmetric.

3. F is unitary, so F^-1 = F*^T.

4. Separable property: the 2-D transform can be computed using successive 1-D operations on the rows and columns.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 18


5. Periodicity: if u(m,n) and v(k,l) are a 2D-DFT pair, then both are periodic with period N in each dimension.

6. Circular convolution: if u(m,n) and h(m,n) are N X N images, then

DFT[u(m,n) ⊛ h(m,n)] = DFT[u(m,n)] DFT[h(m,n)], where ⊛ denotes 2-D circular convolution.
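The circular-convolution property can be verified numerically; the NumPy sketch below compares a brute-force 2-D circular convolution in the spatial domain with the element-wise product of the 2-D DFTs (small random 4 x 4 arrays are used purely for illustration).

import numpy as np

rng = np.random.default_rng(0)
u = rng.random((4, 4))
h = rng.random((4, 4))

N = u.shape[0]
circ = np.zeros_like(u)                 # 2-D circular convolution, computed directly
for m in range(N):
    for n in range(N):
        for i in range(N):
            for j in range(N):
                circ[m, n] += u[i, j] * h[(m - i) % N, (n - j) % N]

print(np.allclose(np.fft.fft2(circ), np.fft.fft2(u) * np.fft.fft2(h)))   # True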

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 19


Discrete Cosine Transform
 The DCT of a sequence u(n), 0 ≤ n ≤ N-1, is defined as

v(k) = α(k) Σ u(n) cos[(2n+1)πk / 2N],  (sum over n = 0, …, N-1),  0 ≤ k ≤ N-1.

 The inverse DCT is defined as

u(n) = Σ α(k) v(k) cos[(2n+1)πk / 2N],  (sum over k = 0, …, N-1),  0 ≤ n ≤ N-1,

where α(0) = √(1/N) and α(k) = √(2/N) for k = 1, …, N-1.

 The N X N cosine transformation matrix C = {c(k,n)} has elements c(k,n) = α(k) cos[(2n+1)πk / 2N].

 In matrix notation:

Forward transformation: v = C u

Reverse transformation: u = C^T v

2D Discrete Cosine Transform(2D-DCT)
 For N X N image u(m,n), 2D forward DCT is defined as

 Inverse DCT is defined as

23
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 24
Properties of DCT
1. The DCT matrix C is real, not symmetric, and unitary (orthogonal).

2. As the DCT is unitary, energy is conserved.

3. The DCT has excellent energy compaction for highly correlated data; hence it is used in image compression.

4. The cosine transform is not the real part of the DFT.

5. The DCT is simpler than the DFT as it does not involve complex arithmetic.

6. The cosine transform is a fast transform; the 1D DCT of N elements can be computed using O(N log2 N) operations via an N-point FFT.
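Properties 1 and 2 can be checked with a short NumPy sketch, assuming the standard DCT (DCT-II) definition reconstructed above.

import numpy as np

def dct_matrix(N):
    # c(k, n) = alpha(k) * cos((2n + 1) * pi * k / (2N))
    C = np.zeros((N, N))
    for k in range(N):
        alpha = np.sqrt(1.0 / N) if k == 0 else np.sqrt(2.0 / N)
        for n in range(N):
            C[k, n] = alpha * np.cos((2 * n + 1) * np.pi * k / (2 * N))
    return C

C = dct_matrix(8)
print(np.allclose(C @ C.T, np.eye(8)))            # True: C is real and orthogonal (unitary)

u = np.arange(8, dtype=float)
v = C @ u                                         # forward 1-D DCT
print(np.allclose(np.sum(u**2), np.sum(v**2)))    # True: energy is conserved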

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 25


Hadamard Transform
 The basis functions of the Hadamard transform are non-sinusoidal.

 The Hadamard transform matrix has only ±1 entries in its basis functions, which gives it a significant computational advantage over other transforms.

 The Hadamard transform can be computed by additions and subtractions of the input values.

 It does not require any multiplications.

 An N x N Hadamard transform matrix is generated by the iterative rule H_2N = (1/√2) [ H_N  H_N ; H_N  -H_N ], starting from H_1 = [1].
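A NumPy sketch of this iterative construction follows; it assumes the normalized (unitary) form of the rule stated above and that N is a power of two.

import numpy as np

def hadamard(N):
    # iteratively build H_N from H_1 = [1] using H_2N = (1/sqrt(2)) [[H, H], [H, -H]]
    H = np.array([[1.0]])
    while H.shape[0] < N:
        H = np.block([[H, H], [H, -H]]) / np.sqrt(2)
    return H

H8 = hadamard(8)                            # entries are +/- 1/sqrt(8)
print(np.allclose(H8 @ H8.T, np.eye(8)))    # True: real, symmetric and unitary
v = H8 @ np.arange(8.0)                     # forward transform uses only +/- additions (scaled)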

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 26


 1D forward hadamard Transform:

 1D inverse hadamard transform:

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 28


2D Hadamard Transforms
 The forward Hadamard transform for a 2D image u(m,n), 0 ≤ m,n ≤ N-1, is defined as

where bi(n) represents the ith bit (from the LSB) of the binary representation of n.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 29


 Inverse Hadamard transform:

where bi(n) represents the ith bit (from the LSB) of the binary representation of n.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 30


Properties
1. H is real; a real transform allows an efficient implementation with fewer computations.

2. H is symmetric; a symmetric transform ensures that the forward and inverse transforms are the same.

3. H is unitary, so energy is conserved.

4. It has very good energy compaction for highly correlated data.

5. It is a fast transform; it does not involve multiplication, and only additions/subtractions are required to implement it.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 33


Applications
 Data encryption

 Data compression

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 34


N=2n

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 35


Haar Transform
 The basis functions of the Haar transform are non-sinusoidal.

 The Haar transform basis can be found from the Haar functions h_k(x).

 There are several steps to find the basis matrix.

Step 1: For each value of k, find the values of p and q.

k is defined as k = 2^p + q - 1, where k = 0, 1, …, N-1, n = log2 N, and 0 ≤ p ≤ n-1;

for p = 0, q = 0 or 1;

for p ≠ 0, 1 ≤ q ≤ 2^p.

For a particular value of k, there is a unique pair of values p and q.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 47


 Step 2: For x є [0,1], evaluate at the points x = m/N, 0 ≤ m ≤ N-1.

 Step 3: The Haar function h_k(x) is then defined piecewise on sub-intervals determined by p and q.
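The slide's definition of the Haar function is not reproduced here, so the NumPy sketch below assumes the standard piecewise definition: for k ≥ 1 the function equals +2^(p/2)/√N on [(q-1)/2^p, (q-0.5)/2^p), -2^(p/2)/√N on [(q-0.5)/2^p, q/2^p), and 0 elsewhere, with the k = 0 row constant. It builds the N x N Haar matrix using the k = 2^p + q - 1 indexing from Step 1 and checks orthogonality.

import numpy as np

def haar_matrix(N):
    H = np.zeros((N, N))
    x = np.arange(N) / N                 # sample points x = m/N
    H[0, :] = 1.0 / np.sqrt(N)           # k = 0 row: constant
    for k in range(1, N):
        p = int(np.floor(np.log2(k)))    # k = 2^p + q - 1  =>  p, q
        q = k - 2**p + 1
        lo, mid, hi = (q - 1) / 2**p, (q - 0.5) / 2**p, q / 2**p
        row = np.zeros(N)
        row[(x >= lo) & (x < mid)] = 2**(p / 2)
        row[(x >= mid) & (x < hi)] = -2**(p / 2)
        H[k, :] = row / np.sqrt(N)
    return H

H = haar_matrix(8)
print(np.allclose(H @ H.T, np.eye(8)))   # True: the Haar matrix is orthogonal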

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 48


Image Enhancement

Prepared by:
Sri. T. Aravinda Babu
Asst. Prof.,
Dept. of ECE, CBIT
UNIT – III
Spatial Enhancement Techniques: Introduction, Histogram
equalization, direct histogram specification, Local
enhancement.

Frequency domain techniques: Low pass, High pass and


Homomorphic Filtering, Image Zooming Techniques.

CO-3: Demonstrate and survey digital image enhancement in


practical applications.
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 2
Image Enhancement
 The objective of enhancement techniques is to process an image so that the result is more suitable than the original image for a specific application.

 Image enhancement techniques improve the quality of an image as received by a human observer or a machine vision system.

 Image quality can degrade because of poor illumination, an improper acquisition device, coarse quantization, noise during the acquisition process, etc.

 The recorded images after acquisition can exhibit problems such as being:

 Too dark

 Too light

 Not enough contrast

 Noisy

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 3


 Enhancement aims to improve visual quality of any image so that it is more
suitable to a particular application.

 Image enhancement techniques are application specific and produces a


better image. They are broadly classified into two categories
 Spatial domain methods

 Frequency domain methods

 In spatial domain methods, pixel values are manipulated directly to get an enhanced image.

 In frequency domain methods, the Fourier transform of the image is first taken to convert the image into the frequency domain. The Fourier transform coefficients are then manipulated, and the modified spectrum is transformed back to the spatial domain to view the enhanced image.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 4


Spatial domain methods
 Spatial domain refers to the aggregate of pixels composing an image and
spatial domain methods are procedures that operate directly on these pixels.

 Image processing functions in the spatial domain can be expressed as

s = T[r]

where r is the input image,

s is the processed image

T is intensity transformation function that maps a pixel value r into a

pixel value s.

 There are three basic types of intensity transformation functions used in image enhancement: linear (negative and identity transformations), logarithmic (log and inverse-log transformations) and power-law (nth power and nth root transformations).
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 5
Point Processing
 The grey value of an individual pixel in the input image is manipulated to generate the grey level of the corresponding pixel in the output image using the transformation

s = T[r]

Different point-processing transformation techniques:

 Image negative

 Log transformation

 Power-law transformation

 Contrast stretching

 Bit plane slicing

 Histogram equalization

 Histogram specification
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 6
Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 7
Image negative
 The digital negative of an image with intensity levels in the range [0, L-1] is given by

s = (L-1) - r

 In this transformation, the highest grey level is mapped to the lowest and vice versa. For an 8-bit image, the transformation is s = 255 - r.

 Middle grey levels change little, whereas dark grey levels become bright and vice versa.

 This transformation is generally used to enhance white details embedded in the dark regions of an image where black is dominant.

 It is used in displaying medical images.
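A one-line NumPy sketch of the negative transformation for an 8-bit image (L = 256) follows; the small random array simply stands in for an input image.

import numpy as np

r = np.random.default_rng(0).integers(0, 256, size=(3, 3), dtype=np.uint8)  # stand-in image
s = 255 - r                        # s = (L - 1) - r with L = 256
print(r, s, sep="\n")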

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 8


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 9
Log Transformations
 The log transformation is given by

s = c log(1 + r)

where c is a constant and it is assumed that r ≥ 0.

 This transformation maps a narrow range of low intensity values in the input into a wider range of output levels. The opposite is true of higher values of input levels.

 We use a transformation of this type to expand the values of dark pixels in an image while compressing the higher-level values. The opposite is true of the inverse log transformation.

Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 10


Sri. T. Aravinda babu, Asst. Prof., Dept. of ECE, CBIT 11
Power-Law(Gamma) Transformation
 Power law transformation is given by

S=crγ

where c and γ are positive constants

 As in case of the log transformation, power law curves with fractional values
of γ map a narrow range of dark input values into a wider range of output
values, with the opposite being true for higher values of input levels.

 Unlike the log function, a family of possible transformation curves obtained


simply by varying γ.

 A variety of devices used for image capture, printing and display respond
according to a power law. The process used to correct these power-law
response phenomena is called gamma correction.

 For example, CRT devices have an intensity to voltage response that is a


power function with exponent varying from approximately 1.8 to 2.5.
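A hedged sketch of gamma correction in Python/NumPy (illustrative names; the input is assumed to be an 8-bit grayscale array):

import numpy as np

def gamma_correct(img, gamma, c=1.0, L=256):
    r = img.astype(np.float64) / (L - 1)      # normalize to [0, 1]
    s = c * np.power(r, gamma)                # s = c * r^gamma
    return np.clip(s * (L - 1), 0, L - 1).astype(np.uint8)

# gamma < 1 brightens dark regions; gamma > 1 darkens the image.
# To pre-compensate a display with gamma 2.2: gamma_correct(img, 1.0 / 2.2)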
(Figure: original image and the results of gamma correction with gamma = 0.45 and gamma = 2.20.)



Piecewise-Linear Transformation Functions
 Principal Advantage

Some important transformations can be formulated only as a

piecewise function.

 Principal Disadvantage

Their specification requires more user input

 Types of Piecewise transformations are

 Contrast Stretching

 Gray-level Slicing

 Bit-plane slicing



Contrast Stretching
 One of the simplest piecewise linear functions is a contrast stretching
transformation, which is used to enhance the low contrast images.

 Low contrast images can result from poor illumination, lack of dynamic
range in the image sensor, or even the wrong setting of a lens aperture
during image acquisition.

 Contrast stretching is a process that expands the range of intensity levels in


an image so that it spans the full intensity range of the recording medium
or display device.

 Figure shows a typical transformation used for contrast stretching. The


locations of points (r1, s1) and (r2, s2) control the shape of the
transformation function.

 If r1=s1 and r2=s2, the transformation is a linear function that produces no


changes in intensity levels.
 If r1=r2, s1=0 and s2=L-1, the transformation becomes a thresholding function

that creates a binary image.

 Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in

the gray levels of the output image, thus affecting its contrast.

 In general, r1 ≤ r2 and s1 ≤ s2 is assumed, so the function is always increasing.

 In figure(b) shows an 8-bit image with low contrast.

 Figure(c) shows the result of contrast stretching, obtained by setting

(r1,s1)=(rmin,0) and (r2,s2)=(rmax,L-1), where rmin and rmax denote the minimum

and maximum intensity levels in the image respectively.

 The transformation function stretched the levels linearly from their original range

to the full range[0,L-1].

 Figure(d) shows the result of using the thresholding function with (r1,s1)=(m,0)

and (r2,s2) = (m,L-1) where m is mean intensity level in an image.
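A possible implementation sketch of the piecewise-linear stretching function through (r1, s1) and (r2, s2) is shown below (Python/NumPy; the names and the small guard terms are illustrative, and r1 ≤ r2, s1 ≤ s2 is assumed):

import numpy as np

def contrast_stretch(img, r1, s1, r2, s2, L=256):
    r = img.astype(np.float64)
    out = np.piecewise(
        r,
        [r < r1, (r >= r1) & (r <= r2), r > r2],
        [lambda x: s1 / max(r1, 1) * x,
         lambda x: s1 + (s2 - s1) / max(r2 - r1, 1) * (x - r1),
         lambda x: s2 + (L - 1 - s2) / max(L - 1 - r2, 1) * (x - r2)])
    return np.clip(out, 0, L - 1).astype(np.uint8)

# Full-range stretch as in Figure (c): (r1, s1) = (rmin, 0), (r2, s2) = (rmax, L-1)
# stretched = contrast_stretch(img, img.min(), 0, img.max(), 255)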



Gray-level Slicing
 This technique is used to highlight a specific range of gray levels in a given

image. It can be implemented in several ways, but the two basic themes are:

 One approach is to display a high value for all gray levels in the range of

interest and a low value for all other gray levels. This transformation,

shown in Figure(a), produces a binary image.

 The second approach, based on the transformation shown in Figure(b),
brightens the desired range of gray levels but preserves all other gray levels
unchanged.

 Figure (c) shows a gray scale image, and Figure (d) shows the result of

using the transformation in Figure (a).



Bit-plane Slicing
 Pixels are digital numbers, each one composed of bits. Instead of
highlighting gray-level range, we could highlight the contribution made by
each bit.
 This method is useful and used in image compression.

 Most significant bits contain the majority of visually significant data.



 The (binary) image for bit-plane 7 can be obtained by processing the input
image with a thresholding gray-level transformation:

 Map all levels between 0 and 127 to 0

 Map all levels between 128 and 255 to 1

(Illustrated with an 8-bit fractal image.)
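A short sketch (Python/NumPy, illustrative names, 8-bit image assumed) of bit-plane slicing:

import numpy as np

def bit_planes(img):
    # plane k contains bit k of every pixel (k = 0 is the least significant bit)
    return [((img >> k) & 1).astype(np.uint8) for k in range(8)]

# Bit-plane 7 is equivalent to the thresholding transformation described above:
# plane7 = (img >= 128).astype(np.uint8)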



(Figure: the eight bit-planes of the image, from bit-plane 7, the most significant, down to bit-plane 0, the least significant.)



Histogram
 Histogram of an image represents the number of times a particular grey level
has occurred in an image.

 It is a graph between various grey levels on x-axis and the number of times a
grey level has occurred in an image on y-axis.

h(rk) = nk

 rk : the kth gray level

 nk : the number of pixels in the image having gray level rk

 h(rk) : histogram of a digital image with gray levels rk



Normalized histogram
 Dividing each histogram count at gray level rk by the total number of pixels in
the image, n, gives

p(rk) = nk / n for k = 0,1,…,L-1

 p(rk) gives an estimate of the probability of occurrence of gray level rk

 The sum of all components of a normalized histogram is equal to 1
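A minimal sketch (Python/NumPy; names illustrative) of computing h(rk) and p(rk) for an 8-bit image:

import numpy as np

def histogram(img, L=256):
    # h(rk) = nk, the number of pixels having gray level rk
    return np.bincount(img.ravel(), minlength=L)

def normalized_histogram(img, L=256):
    return histogram(img, L) / img.size     # p(rk) = nk / n, sums to 1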

Histogram Processing
 Basis for numerous spatial domain processing techniques

 Used effectively for image enhancement

 Information inherent in histograms also is useful in image compression


and segmentation



Dark image

Components of
histogram are
concentrated on the low
side of the gray scale.

Bright image

Components of
histogram are
concentrated on the
high side of the gray
scale.



Low-contrast image

histogram is narrow and


centered toward the
middle of the gray scale

High-contrast image

histogram covers broad


range of the gray scale and
the distribution of pixels is
not too far from uniform,
with very few vertical lines
being much higher than
the others



Histogram transformation
s = T(r), where 0 ≤ r ≤ 1, and T(r) satisfies:

 (a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1

 (b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1

(Figure: a transformation curve s = T(r), mapping a level rk to sk = T(rk).)
Conditions of T(r)
 Single-valued (one-to-one relationship) guarantees that the inverse
transformation will exist.

 The monotonicity condition preserves the increasing order from black to white
in the output image; thus it won’t cause a negative image.

 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1 guarantees that the output gray levels will be in the
same range as the input levels.

 The inverse transformation from s back to r is

r = T⁻¹(s), 0 ≤ s ≤ 1

Probability Density Function
 The gray levels in an image may be viewed as random variables in the
interval [0,1]

 PDF is one of the fundamental descriptors of a random variable

 Let

 pr(r) denote the PDF of random variable r

 ps (s) denote the PDF of random variable s

 If pr(r) and T(r) are known and T-1(s) satisfies condition (a) then ps(s) can
be obtained using a formula :

ps(s) = pr(r) |dr/ds|

 The PDF of the transformed variable s is determined by the gray-level PDF of
the input image and by the chosen transformation function

 A transformation function is a cumulative distribution function (CDF) of


random variable r has the form

s = T(r) = ∫_0^r pr(w) dw

where w is a dummy variable of integration.

Note: T(r) depends on pr(r).

 CDF is an integral of a probability function (always positive) is the area


under the function

 Thus, CDF is always single valued and monotonically increasing

 Thus, CDF satisfies the conditions of the transformation function.

 Hence, use CDF as a transformation function


Finding ps(s) from a given T(r)

ds/dr = dT(r)/dr = d/dr [ ∫_0^r pr(w) dw ] = pr(r)

Substituting into ps(s) = pr(r) |dr/ds| yields

ps(s) = pr(r) · | 1 / pr(r) | = 1, where 0 ≤ s ≤ 1

 As ps(s) is a probability density function, it must be zero outside the interval
[0,1], in this case because its integral over all values of s must equal 1.
 ps(s) is therefore a uniform probability density function.
 ps(s) is always uniform, independent of the form of pr(r).
 The probability of occurrence of gray level rk in an image is approximated by

pr(rk) = nk / n, where k = 0, 1, ..., L-1

 The discrete version of the transformation is given by

sk = T(rk) = Σ_{j=0}^{k} pr(rj) = Σ_{j=0}^{k} nj / n, where k = 0, 1, ..., L-1
 Thus, an output image is obtained by mapping each pixel with level rk in
the input image into a corresponding pixel with level sk in the output image

 A plot of pr(rk) versus rk is called a histogram. The transformation (mapping)
given above is called histogram equalization or histogram linearization.
Example: images before and after histogram equalization, together with their histograms.

In the second example the quality is not improved much, because the original image
already has a broad gray-level scale.
Example

4x4 input image, gray scale = [0, 9]:

2 3 3 2
4 2 4 3
3 2 3 5
2 4 2 4

Histogram: gray level 2 occurs 6 times, 3 occurs 5 times, 4 occurs 4 times and
5 occurs once (n = 16 pixels).

Gray level (j)      : 0  1  2     3      4      5  6  7  8  9
No. of pixels (nj)  : 0  0  6     5      4      1  0  0  0  0
Running sum of nj   : 0  0  6     11     15     16 16 16 16 16
sk = (Σ nj) / n     : 0  0  6/16  11/16  15/16  1  1  1  1  1
sk x 9              : 0  0  3.3   6.1    8.4    9  9  9  9  9
Assigned level      : 0  0  3     6      8      9  9  9  9  9

Applying the mapping 2→3, 3→6, 4→8, 5→9 gives the histogram-equalized
output image:

3 6 6 3
8 3 8 6
6 3 6 9
3 8 3 8
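The same computation can be sketched in Python/NumPy (illustrative names; any integer gray scale [0, L-1] is assumed). Applied with L = 10 to the 4x4 image above it reproduces the mapping 2→3, 3→6, 4→8, 5→9:

import numpy as np

def histogram_equalize(img, L=256):
    h = np.bincount(img.ravel(), minlength=L)
    cdf = np.cumsum(h) / img.size                 # T(rk) = sum of pr(rj), j = 0..k
    mapping = np.round((L - 1) * cdf).astype(np.uint8)
    return mapping[img]                           # apply sk = T(rk) to every pixel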
Example: for the given 8 X 8 image having grey levels between [0,7] get
histogram equalized image

Histogram specifications
 Histogram equalization is capable of generating an approximation of a
uniform histogram.
 Sometimes the ability of specify particular histogram shapes capable of
highlighting certain gray levels in an image is desirable.
 Let pr(r) and pz(z) be the original and desired probability density
functions respectively; then histogram equalization of the original image is

s = T(r) = ∫_0^r pr(w) dw

 If the desired image were available, its levels could also be equalized by
using the transformation function

v = G(z) = ∫_0^z pz(w) dw


 The inverse process z = G⁻¹(v) will give us back the levels, z, of the desired
image.

 ps(s) and pv(v) would be identical uniform densities, because the final
result is independent of the density inside the integral.

 Thus, if instead of using v in the inverse process we use the uniform levels s
obtained from the original image, the resulting levels z = G⁻¹(s) would
have the desired probability density function.

 Procedure to calculate G-1(s) can be as follows

1. histogram equalization of the original image

2. Specify the desired density function and obtain the transformation

function G(z) = ∫_0^z pz(w) dw = s

3. Apply the inverse transformation function z = G⁻¹(s); in this way the

input grey levels r are mapped to the output grey levels z.
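A hedged sketch of this three-step procedure (Python/NumPy; names illustrative; target_hist is any desired histogram supplied as a length-L array):

import numpy as np

def histogram_specify(img, target_hist, L=256):
    # Step 1: equalization transform of the input image, s = T(r)
    src_cdf = np.cumsum(np.bincount(img.ravel(), minlength=L)) / img.size
    # Step 2: equalization transform of the desired histogram, G(z)
    tgt_cdf = np.cumsum(target_hist) / np.sum(target_hist)
    # Step 3: z = G^(-1)(s), realized by finding the closest level of the target CDF
    mapping = np.searchsorted(tgt_cdf, src_cdf).clip(0, L - 1).astype(np.uint8)
    return mapping[img]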

Histogram specification is a trial-and-error process
There are no rules for specifying histograms, and one must resort to
analysis on a case-by-case basis for any given enhancement task.

Local Enhancement
 Histogram processing methods are global processing, in the sense that pixels
are modified by a transformation function based on the gray-level content of
an entire image.

 This global approach is suitable for overall enhancement, but generally fails
when the objective is to enhance details over small areas in an image.

 The solution is to devise transformation functions based on the intensity
distribution of pixel neighborhoods. Histogram processing techniques used in
this way to enhance details over small areas in an image are called local
enhancement.

 The procedure is to define a neighborhood and move its center from pixel to
pixel in a horizontal or vertical direction.



 At each location, the histogram of the points in the neighborhood is computed,
and either a histogram equalization or histogram specification transformation
function is obtained. This function is used to map the intensity of the pixel
centered in the neighborhood.

 The center of the neighborhood is then moved to an adjacent pixel location


and the procedure is repeated.

 This approach has obvious advantages over repeatedly computing the


histogram of all pixels in the neighborhood region each time the region is
moved one pixel location.

Spatial filters



Spatial filtering
 Filtering refers to passing, modifying, or rejecting specified frequency
components of an image.

 For example, a filter that passes low frequencies is called a lowpass filter.

 Lowpass filter is used to smooth an image by blurring it.

 Smoothing directly on the image itself by using spatial filters.

 Spatial filtering modifies an image by replacing the value of each pixel by a


function of the values of the pixel and its neighbors.

 If the operation performed on the image pixels is linear, then the filter is
called a linear spatial filter.

 If the operation performed on the image pixels is nonlinear, then the filter is
called a nonlinear spatial filter.



Linear Spatial Filtering
 A linear spatial filter performs a sum-of-products operation between an
image f and a filter kernel, w. The kernel is an array whose size defines the
neighborhood of operation, and whose coefficients determine the nature of
the filter.

 Other terms used for a spatial filter kernel are mask, template, and window. We
use the term filter kernel or simply kernel.

 Figure illustrates the mechanics of linear spatial filtering using a 3 x 3


kernel. At any point ( x, y) in the image, the response, g( x, y) of the filter is
the sum of products of the kernel coefficients and the image pixels
encompassed by the kernel:



 As coordinates x and y are varied, the center of the kernel moves from pixel
to pixel, generating the filtered image g, in the process.

 For a kernel of size m x n, we assume that m = 2a+1 and n = 2b+1, where a


and b are nonnegative integers.

 Linear filtering of an image f of size MxN with a filter mask of size mxn is given
by the expression

g(x, y) = Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x+s, y+t)

 To generate a complete filtered image this equation must be applied for x =
0, 1, 2, … , M-1 and y = 0, 1, 2, … , N-1

 The above equation implements the sum of products introduced earlier, but for a
kernel of arbitrary odd size.
 Simply move the filter mask from point to point in an image; at each point
(x,y), the response of the filter at that point is calculated using a
predefined relationship:

R = w1 z1 + w2 z2 + ... + wmn zmn = Σ_{i=1}^{mn} wi zi
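A straightforward (unoptimized) sketch of this sum-of-products operation in Python/NumPy, using zero padding at the borders (the names are illustrative):

import numpy as np

def linear_filter(f, w):
    # w is an m x n kernel with m = 2a+1, n = 2b+1
    m, n = w.shape
    a, b = m // 2, n // 2
    fp = np.pad(f.astype(np.float64), ((a, a), (b, b)), mode='constant')
    g = np.zeros(f.shape, dtype=np.float64)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])   # sum of products
    return g

# Example: 3x3 averaging (box) filter
# g = linear_filter(img, np.ones((3, 3)) / 9.0)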



Smoothing Spatial Filters
 Smoothing filters are used for blurring and for noise reduction

 Blurring is used in preprocessing steps, such as removal of small details from
an image prior to object extraction, and bridging of small gaps in lines or
curves.

 noise reduction can be accomplished by blurring with a linear filter and also
by a nonlinear filter.

 The output is simply the average of the pixels contained in the neighborhood
of the filter mask. This is called averaging filters or lowpass filters.

 Replacing the value of every pixel in an image by the average of the gray levels
in the neighborhood will reduce the “sharp” transitions in gray levels.

 The low pass filter preserves the smooth region in the image and it removes
the sharp variations leading to blurring effect.

 The blurring effect will be more with the increase in the size of the mask.
 Sharp transitions in gray level include random noise in the image and the
edges of objects in the image.
 Smoothing can reduce noise (desirable) but also blurs edges (undesirable).
 In a smoothing spatial mask, the sum of the coefficients is equal to 1.

(Masks shown: 3x3 box filter and 3x3 weighted average.)



Weighted average filter
 the basic strategy behind weighting the center point the highest and then
reducing the value of the coefficients as a function of increasing distance
from the origin is simply an attempt to reduce blurring in the smoothing
process.

General form : smoothing mask:

filter of size mxn (m and n odd)


g(x, y) = [ Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x+s, y+t) ] / [ Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) ]

where the denominator is the summation of all coefficients of the mask.



Example
 a). original image 500x500 pixel
 b). - f). results of smoothing with
square averaging filter masks of size
n = 3, 5, 9, 15 and 35, respectively.
 Note:
 big mask is used to eliminate small
objects from an image.
 the size of the mask establishes the
relative size of the objects that will be
blended with the background.



Median Filters
 Median filters are statistical non-linear filters.

 Median filter replaces the value of a pixel by the median of the gray levels
in the neighborhood of that pixel (the original value of the pixel is
included in the computation of the median).

 All pixels in the neighborhood of the pixel in the original image which are
identified by the mask are sorted in the ascending or descending order.

 This filtering technique is popular because for certain types of random noise
(impulse noise, i.e. salt-and-pepper noise) it provides excellent noise-
reduction capabilities, with considerably less blurring than linear
smoothing filters of similar size.
Example : Median Filters

Sharpening Spatial Filters

 The objective of sharpening is to highlight fine details in an image


or to enhance detail that has been blurred.

 Image sharpening include applications ranging from electronic


printing and medical imaging to industrial inspection and
autonomous guidance in military systems.

 Image blurring could be accomplished in the spatial domain by


pixel averaging in a neighborhood. Hence, it is logical to conclude
that sharpening could be accomplished by spatial differentiation.

Derivative operator
 The strength of the response of a derivative operator is
proportional to the degree of discontinuity of the image at the point
at which the operator is applied.

 Thus, image differentiation

 Enhances edges and other discontinuities (noise)

 deemphasizes area with slowly varying gray-level values.

First-order derivative
 a basic definition of the first-order derivative of a one-dimensional
function f(x) is the difference

∂f/∂x = f(x+1) − f(x)
Second-order derivative
 similarly, we define the second-order derivative of a one-
dimensional function f(x) is the difference

∂²f/∂x² = f(x+1) + f(x−1) − 2 f(x)

First and second order derivative
 First-order derivatives produce thicker edges in an images,
generally have a stronger response to a gray-level step.

 Second order derivatives have a stronger response to fine detail,


such as thin lines and isolated points, produce a double response
at step change in gray level.

First and Second-order derivative of f(x,y)
 when we consider an image function of two variables, f(x,y), at
which time we will dealing with partial derivatives along the two
spatial axes.

f ( x, y) f ( x, y) f ( x, y)
Gradient operator f   
xy x y
Laplacian operator
 2
f ( x , y )  2
f ( x, y )
(linear operator)  f 
2

x 2
y 2

Discrete Form of Laplacian

From

∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2 f(x, y)

∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2 f(x, y)

this yields

∇²f = [ f(x+1, y) + f(x−1, y) + f(x, y+1) + f(x, y−1) − 4 f(x, y) ]
Laplacian mask

Laplacian mask, and a mask implemented as an extension that includes the diagonal neighbors

Other implementation of Laplacian masks

Correct the effect of background
 The background effect is corrected easily by adding the original and Laplacian
images; be careful with the sign convention of the Laplacian filter used:

g(x, y) = f(x, y) − ∇²f(x, y)   if the center coefficient of the Laplacian mask is negative

g(x, y) = f(x, y) + ∇²f(x, y)   if the center coefficient of the Laplacian mask is positive

Example
 a). image of the North pole of the
moon

 b). Laplacian-filtered image with

1 1 1
1 -8 1
1 1 1
 c). Laplacian image scaled for
display purposes

 d). image enhanced by addition


with original image

Mask of Laplacian + addition
 To simply the computation, we can create a mask which do both
operations, Laplacian Filter and Addition the original image.

g(x, y) = f(x, y) − [ f(x+1, y) + f(x−1, y) + f(x, y+1) + f(x, y−1) − 4 f(x, y) ]
        = 5 f(x, y) − [ f(x+1, y) + f(x−1, y) + f(x, y+1) + f(x, y−1) ]

0 -1 0
-1 5 -1
0 -1 0
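As an illustration (Python/NumPy sketch, illustrative names, 8-bit image assumed), sharpening with this composite mask in a single pass:

import numpy as np

def laplacian_sharpen(img):
    # g(x,y) = 5 f(x,y) - [f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1)]
    w = np.array([[0, -1, 0],
                  [-1, 5, -1],
                  [0, -1, 0]], dtype=np.float64)
    fp = np.pad(img.astype(np.float64), 1, mode='edge')
    g = np.zeros(img.shape, dtype=np.float64)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + 3, y:y + 3])
    return np.clip(g, 0, 255).astype(np.uint8)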
Note:  g(x, y) = f(x, y) − ∇²f(x, y)  or  g(x, y) = f(x, y) + ∇²f(x, y), which leads to the
composite masks below:

0 -1 0 0 0 0 0 -1 0
-1 5 -1 = 0 1 0 + -1 4 -1
0 -1 0 0 0 0 0 -1 0

0 -1 0 0 0 0 0 -1 0
-1 9 -1 = 0 1 0 + -1 8 -1
0 -1 0 0 0 0 0 -1 0

Unsharp masking
 Unsharp masking is one of the technique used for edge enhancement.

 Smoothened version of the image is subtracted from the original


image produces sharpening output image.

fs(x, y) = f(x, y) − f̄(x, y)

i.e. sharpened image = original image − blurred image, where f̄(x, y) denotes a smoothed
(blurred) version of f(x, y).

High-boost filtering
 A high boost filter is also known as a high frequency emphasis filter.

 It is a generalized form of unsharp masking, with A ≥ 1:

fhb(x, y) = A f(x, y) − f̄(x, y)
          = (A − 1) f(x, y) + [ f(x, y) − f̄(x, y) ]
          = (A − 1) f(x, y) + fs(x, y)

 If we use the Laplacian filter to create the sharpened image fs(x, y) (with
addition of the original image), then

fs(x, y) = f(x, y) − ∇²f(x, y)   if the center coefficient of the Laplacian mask is negative

fs(x, y) = f(x, y) + ∇²f(x, y)   if the center coefficient of the Laplacian mask is positive
High-boost filtering

This yields

fhb(x, y) = A f(x, y) − ∇²f(x, y)   if the center coefficient of the Laplacian mask is negative

fhb(x, y) = A f(x, y) + ∇²f(x, y)   if the center coefficient of the Laplacian mask is positive
High-boost Masks

 A1

 if A = 1, it becomes “standard” Laplacian sharpening
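A hedged sketch of high-boost filtering using a simple box blur for the smoothed image (Python/NumPy; the names, the choice of blur and the default A are illustrative):

import numpy as np

def high_boost(img, A=1.5, size=3):
    # fhb = A*f - blurred(f) = (A-1)*f + unsharp mask
    a = size // 2
    fp = np.pad(img.astype(np.float64), a, mode='edge')
    blurred = np.zeros(img.shape, dtype=np.float64)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            blurred[x, y] = fp[x:x + size, y:y + size].mean()
    fhb = A * img.astype(np.float64) - blurred
    return np.clip(fhb, 0, 255).astype(np.uint8)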

Gradient Operator
 Derivative operative are used for edge detection techniques.

 The magnitude of the first derivative, calculated within a neighborhood around
a pixel, is used to detect the presence of an edge in an image.

 First derivatives are implemented using the magnitude of the gradient.

∇f = [Gx, Gy] = [∂f/∂x, ∂f/∂y]   (the gradient vector)

∇f = mag(∇f) = [Gx² + Gy²]^(1/2) = [ (∂f/∂x)² + (∂f/∂y)² ]^(1/2)

commonly approximated as

∇f ≈ |Gx| + |Gy|

(with this approximation the magnitude becomes nonlinear)
Gradient Mask

 Simplest approximation, 2x2, using the 3x3 neighborhood

z1 z2 z3
z4 z5 z6
z7 z8 z9

Gx = (z8 − z5) and Gy = (z6 − z5)

∇f = [Gx² + Gy²]^(1/2) = [ (z8 − z5)² + (z6 − z5)² ]^(1/2)

∇f ≈ |z8 − z5| + |z6 − z5|
Gradient Mask

 Roberts cross-gradient operators, 2x2 (same z1 … z9 neighborhood)

Gx = (z9 − z5) and Gy = (z8 − z6)

∇f = [Gx² + Gy²]^(1/2) = [ (z9 − z5)² + (z8 − z6)² ]^(1/2)

∇f ≈ |z9 − z5| + |z8 − z6|
Gradient Mask

 Sobel operators, 3x3

Gx = (z7 + 2 z8 + z9) − (z1 + 2 z2 + z3)
Gy = (z3 + 2 z6 + z9) − (z1 + 2 z4 + z7)

∇f ≈ |Gx| + |Gy|

The weight value 2 is used to achieve some smoothing by giving more
importance to the center point.
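A sketch of the Sobel gradient magnitude using the |Gx| + |Gy| approximation (Python/NumPy; names illustrative):

import numpy as np

def sobel_gradient(img):
    kx = np.array([[-1, -2, -1],
                   [ 0,  0,  0],
                   [ 1,  2,  1]], dtype=np.float64)   # Gx = (z7+2z8+z9)-(z1+2z2+z3)
    ky = kx.T                                          # Gy = (z3+2z6+z9)-(z1+2z4+z7)
    fp = np.pad(img.astype(np.float64), 1, mode='edge')
    grad = np.zeros(img.shape, dtype=np.float64)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            win = fp[x:x + 3, y:y + 3]
            grad[x, y] = abs(np.sum(kx * win)) + abs(np.sum(ky * win))
    return grad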

Filtering in frequency domain



Basic steps in frequency domain filtering



Shifting the centre of the spectrum



Steps in filtering



Smoothing using low pass filter
 Edges and other sharp intensity transitions (such as noise) in an image contribute
significantly to the high frequency content of its Fourier transform.

 Smoothing (blurring) is achieved in the frequency domain by high-frequency


attenuation; that is, by low pass filtering.

 Three types of low pass filters: ideal, Butterworth, and Gaussian.

 These three categories cover the range from very sharp (ideal) to very smooth
(Gaussian) filtering.

 The shape of a Butterworth filter is controlled by a parameter called the filter order.

 For large values of this parameter, the Butterworth filter approaches the ideal filter.

 For lower values, the Butterworth filter is more like a Gaussian filter.

 The Butterworth filter provides a transition between these two “extremes”.
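A minimal frequency-domain smoothing sketch with a Gaussian lowpass transfer function (Python/NumPy; the cutoff D0 and the function name are illustrative; the spectrum is centered with fftshift as described in the filtering steps above):

import numpy as np

def gaussian_lowpass(img, D0=30.0):
    M, N = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))             # centered spectrum
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    V, U = np.meshgrid(v, u)
    D2 = U**2 + V**2                                   # D(u,v)^2 measured from the center
    H = np.exp(-D2 / (2.0 * D0**2))                    # Gaussian LPF
    g = np.fft.ifft2(np.fft.ifftshift(H * F)).real     # back to the spatial domain
    return np.clip(g, 0, 255).astype(np.uint8)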



Ideal LPF

Gaussian LPF



Butterworth LPF



Image sharpening using HPF
 An image can be smoothed by attenuating the high-frequency
components of its Fourier transform. Because edges and other abrupt
changes in intensities are associated with high-frequency components.

 Image sharpening can be achieved in the frequency domain by highpass
filtering, which attenuates the low-frequency components without disturbing
the high-frequency components of the Fourier transform.



Ideal HPF

Butterworth HPF



Gaussian HPF



Homomorphic filtering
 Homomorphic filtering is a generalized technique for signal and image processing,
involving a nonlinear mapping to a different domain in which linear filter
techniques are applied, followed by mapping back to the original domain.

 Homomorphic filtering is sometimes used for image enhancement.

 Homomorphic filtering is a frequency domain procedure to improve the


appearance of an image by
 Grey level range compression

 Contrast enhancement

 Homomorphic filtering is used to remove multiplicative noise.



 In some cases the scene is not illuminated properly or the camera angle is not
correct, and some part of the image appears very dark.

 In order to improve these types of images, reflectance and illumination have
to be treated independently.

 The illumination component of an image generally is characterized by slow


spatial variations, while the reflectance component tends to vary abruptly,
particularly at the junctions of dissimilar objects.

 These characteristics lead to associating the low frequencies of the Fourier


transform of the logarithm of an image with illumination, and the high
frequencies with reflectance.

 A good deal of control can be gained over the illumination and reflectance
components with a homomorphic filter.



This control requires specification of a filter transfer function H(u, v) that
affects the low- and high-frequency components of the Fourier transform in
different, controllable ways.
 If the parameters γL and γH are chosen so that γL < 1 and γH > 1, the filter
function will attenuate the contribution made by the low frequencies
(illumination) and amplify the contribution made by high frequencies
(reflectance). The net result is simultaneous dynamic range compression and
contrast enhancement.

Image zooming

 Zooming is enlarging a picture so that the details in the image become
more visible and clear.



Digital Image Processing Question & Answers

UNIT-III

IMAGE RESTORATION


1. Explain about gray level interpolation.

The distortion correction equations yield non integer values for x' and y'. Because the
distorted image g is digital, its pixel values are defined only at integer coordinates. Thus using
non integer values for x' and y' causes a mapping into locations of g for which no gray levels are
defined. Inferring what the gray-level values at those locations should be, based only on the pixel
values at integer coordinate locations, then becomes necessary. The technique used to
accomplish this is called gray-level interpolation.

The simplest scheme for gray-level interpolation is based on a nearest neighbor approach.
This method, also called zero-order interpolation, is illustrated in Fig. 6.1. This figure shows

(A) The mapping of integer (x, y) coordinates into fractional coordinates (x', y') by means of
following equations

x' = c1x + c2y + c3xy + c4


and

y' = c5x + c6y + c7xy + c8


(B) The selection of the closest integer coordinate neighbor to (x', y');

and

(C) The assignment of the gray level of this nearest neighbor to the pixel located at (x, y).

Fig. 6.1 Gray-level interpolation based on the nearest neighbor concept.

Although nearest neighbor interpolation is simple to implement, this method often has the
drawback of producing undesirable artifacts, such as distortion of straight edges in images of
high resolution. Smoother results can be obtained by using more sophisticated techniques, such
as cubic convolution interpolation, which fits a surface of the sin(z)/z type through a much larger


number of neighbors (say, 16) in order to obtain a smooth estimate of the gray level at any


desired point. Typical areas in which smoother approximations generally are required include 3-
D graphics and medical imaging. The price paid for smoother approximations is additional
computational burden. For general-purpose image processing a bilinear interpolation approach
that uses the gray levels of the four nearest neighbors usually is adequate. This approach is
straightforward. Because the gray level of each of the four integral nearest neighbors of a non
integral pair of coordinates (x', y') is known, the gray-level value at these coordinates, denoted
v(x', y'), can be interpolated from the values of its neighbors by using the relationship

v (x', y') = ax' + by' + c x' y' + d


where the four coefficients are easily determined from the four equations in four unknowns that
can be written using the four known neighbors of (x', y'). When these coefficients have been
determined, v(x', y') is computed and this value is assigned to the location in f(x, y) that yielded
the spatial mapping into location (x', y'). It is easy to visualize this procedure with the aid of Fig.
6.1. The exception is that, instead of using the gray-level value of the nearest neighbor to (x', y'),
we actually interpolate a value at location (x', y') and use this value for the gray-level assignment
at (x, y).

2. Explain about Wiener filter used for image restoration.

The inverse filtering approach makes no explicit provision for handling noise. This approach
incorporates both the degradation function and statistical characteristics of noise into the
restoration process. The method is founded on considering images and noise as random
processes, and the objective is to find an estimate f of the uncorrupted image f such that the mean
square error between them is minimized. This error measure is given by

e2 = E {(f- f )2}
where E{•} is the expected value of the argument. It is assumed that the noise and the image are
uncorrelated; that one or the other has zero mean; and that the gray levels in the estimate are a
linear function of the levels in the degraded image. Based on these conditions, the minimum of
the error function is given in the frequency domain by the expression


where we used the fact that the product of a complex quantity with its conjugate is equal to the
magnitude of the complex quantity squared. This result is known as the Wiener filter, after N.
Wiener [1942], who first proposed the concept in the year shown. The filter, which consists of
the terms inside the brackets, also is commonly referred to as the minimum mean square error
filter or the least square error filter. The Wiener filter does not have the same problem as the
inverse filter with zeros in the degradation function, unless both H(u, v) and S η(u, v) are zero for
the same value(s) of u and v.

The terms in above equation are as follows:

H (u, v) = degradation function

H*(u, v) = complex conjugate of H (u, v)

|H(u, v)|² = H*(u, v) H(u, v)

Sη(u, v) = |N(u, v)|² = power spectrum of the noise

Sf(u, v) = |F(u, v)|² = power spectrum of the undegraded image.

As before, H (u, v) is the transform of the degradation function and G (u, v) is the
transform of the degraded image. The restored image in the spatial domain is given by the
inverse Fourier transform of the frequency-domain estimate F (u, v). Note that if the noise is
zero, then the noise power spectrum vanishes and the Wiener filter reduces to the inverse filter.

When we are dealing with spectrally white noise, the spectrum |N(u, v)|² is a constant,
which simplifies things considerably. However, the power spectrum of the undegraded image
seldom is known. An approach used frequently when these quantities are not known or cannot be
estimated is to approximate the equation as


where K is a specified constant.
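A hedged sketch of this approximation (Python/NumPy; names illustrative; H is assumed to be the degradation transfer function sampled on the same grid and with the same frequency ordering as np.fft.fft2 of the degraded image):

import numpy as np

def wiener_filter(g, H, K=0.01):
    # F_hat = [ |H|^2 / (H (|H|^2 + K)) ] * G, with Sn/Sf replaced by the constant K
    G = np.fft.fft2(g)
    H = H.astype(np.complex128)
    H2 = np.abs(H) ** 2
    F_hat = (H2 / (H * (H2 + K) + 1e-12)) * G      # small epsilon avoids division by zero
    return np.fft.ifft2(F_hat).real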

3. Explain a Model of the Image Degradation/Restoration Process.

The Fig. 6.3 shows, the degradation process is modeled as a degradation function that,
together with an additive noise term, operates on an input image f(x, y) to produce a degraded
image g(x, y). Given g(x, y), some knowledge about the degradation function H, and some
knowledge about the additive noise term η(x, y), the objective of restoration is to obtain an
estimate f̂(x, y) of the original image. The estimate should be as close as possible to the original
input image and, in general, the more we know about H and η, the closer f̂(x, y) will be to f(x, y).

The degraded image is given in the spatial domain by

g (x, y) = h (x, y) * f (x, y) + η (x, y)


where h (x, y) is the spatial representation of the degradation function and, the symbol *
indicates convolution. Convolution in the spatial domain is equal to multiplication in the
frequency domain, hence

G (u, v) = H (u, v) F (u, v) + N (u, v)


where the terms in capital letters are the Fourier transforms of the corresponding terms in above
equation.

Fig. 6.3 model of the image degradation/restoration process.


4. Explain about the restoration filters used when the image degradation is due to noise
only.

If the degradation present in an image is only due to noise, then,

g (x, y) = f (x, y) + η (x, y)

G (u, v) = F (u, v) + N (u, v)


The restoration filters used in this case are,

1. Mean filters
2. Order-statistics filters and
3. Adaptive filters

Also read 5, 6, 7 answers.

5. Explain Mean filters.

There are four types of mean filters. They are

(i) Arithmetic mean filter

This is the simplest of the mean filters. Let Sxy represent the set of coordinates in a
rectangular subimage window of size m X n, centered at point (x, y). The arithmetic mean
filtering process computes the average value of the corrupted image g(x, y) in the area defined by
Sxy. The value of the restored image f̂ at any point (x, y) is simply the arithmetic mean computed
using the pixels in the region defined by Sxy. In other words

f̂(x, y) = (1 / mn) Σ_{(s, t) ∈ Sxy} g(s, t)

This operation can be implemented using a convolution mask in which all coefficients have
value 1/mn

(ii) Geometric mean filter

An image restored using a geometric mean filter is given by the expression


Here, each restored pixel is given by the product of the pixels in the subimage window, raised to
the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean
filter, but it tends to lose less image detail in the process.

(iii) Harmonic mean filter

The harmonic mean filtering operation is given by the expression

The harmonic mean filter works well for salt noise, but fails for pepper noise. It does well also
with other types of noise like Gaussian noise.

(iv) Contra harmonic mean filter

The contra harmonic mean filtering operation yields a restored image based on the expression

where Q is called the order of the filter. This filter is well suited for reducing or virtually
eliminating the effects of salt-and-pepper noise. For positive values of Q, the filter eliminates
pepper noise. For negative values of Q it eliminates salt noise. It cannot do both simultaneously.
Note that the contra harmonic filter reduces to the arithmetic mean filter if Q = 0, and to the
harmonic mean filter if Q = -1.

6. Explain the order-statistics filters.

(i) Median filter

The best-known order-statistics filter is the median filter, which, as its name implies,
replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel:

The original value of the pixel is included in the computation of the median. Median filters are
quite popular because, for certain types of random noise, they provide excellent noise-reduction
capabilities, with considerably less blurring than linear smoothing filters of similar size. Median
filters are particularly effective in the presence of both bipolar and unipolar impulse noise.

(ii) Max and min filters

Although the median filter is by far the order-statistics filter most used in image
processing, it is by no means the only one. The median represents the 50th percentile of a ranked
set of numbers, but the reader will recall from basic statistics that ranking lends itself to many
other possibilities. For example, using the 100th percentile results in the so-called max filter,
given by

This filter is useful for finding the brightest points in an image. Also, because pepper noise has
very low values, it is reduced by this filter as a result of the max selection process in the
subimage area S xy.

The 0th percentile filter is the min filter.

This filter is useful for finding the darkest points in an image. Also, it reduces salt noise as a
result of the min operation.

(iii) Midpoint filter

The midpoint filter simply computes the midpoint between the maximum and minimum
values in the area encompassed by the filter:


Note that this filter combines order statistics and averaging. This filter works best for randomly
distributed noise, like Gaussian or uniform noise.

(iv) Alpha - trimmed mean filter

It is a filter formed by deleting the d/2 lowest and the d/2 highest gray-level values of g(s,
t) in the neighborhood Sxy. Let gr (s, t) represent the remaining mn - d pixels. A filter formed by
averaging these remaining pixels is called an alpha-trimmed mean filter:

where the value of d can range from 0 to mn - 1. When d = 0, the alpha- trimmed filter reduces to
the arithmetic mean filter. If d = (mn - l)/2, the filter becomes a median filter. For other values
of d, the alpha-trimmed filter is useful in situations involving multiple types of noise, such as a
combination of salt-and-pepper and Gaussian noise.

7. Explain the Adaptive Filters.

Adaptive filters are filters whose behavior changes based on statistical characteristics of
the image inside the filter region defined by the m X n rectangular window Sxy.

Adaptive, local noise reduction filter:

The simplest statistical measures of a random variable are its mean and variance. These
are reasonable parameters on which to base an adaptive filler because they are quantities closely
related to the appearance of an image. The mean gives a measure of average gray level in the
region over which the mean is computed, and the variance gives a measure of average contrast in
that region.

This filter is to operate on a local region, Sxy. The response of the filter at any point (x, y)
on which the region is centered is to be based on four quantities: (a) g(x, y), the value of the
noisy image at (x, y); (b) σ²η, the variance of the noise corrupting f(x, y) to form g(x, y); (c) mL,
the local mean of the pixels in Sxy; and (d) σ²L, the local variance of the pixels in Sxy.


1. If σ²η is zero, the filter should return simply the value of g(x, y). This is the trivial, zero-noise
case in which g(x, y) is equal to f(x, y).

2. If the local variance is high relative to σ2η the filter should return a value close to g (x, y). A
high local variance typically is associated with edges, and these should be preserved.

3. If the two variances are equal, we want the filter to return the arithmetic mean value of the
pixels in S xy. This condition occurs when the local area has the same properties as the overall
image, and local noise is to be reduced simply by averaging.

The adaptive, local noise reduction filter is given by

f̂(x, y) = g(x, y) − (σ²η / σ²L) [ g(x, y) − mL ]

The only quantity that needs to be known or estimated is the variance of the overall noise, σ²η.
The other parameters are computed from the pixels in Sxy at each location (x, y) on which the
filter window is centered.

Adaptive median filter:

The median filter performs well as long as the spatial density of the impulse noise is not large (as
a rule of thumb, Pa and Pb less than 0.2). The adaptive median filtering can handle impulse noise
with probabilities even larger than these. An additional benefit of the adaptive median filter is
that it seeks to preserve detail while smoothing nonimpulse noise, something that the
"traditional" median filter does not do. The adaptive median filter also works in a rectangular
window area Sxy. Unlike those filters, however, the adaptive median filter changes (increases) the
size of Sxy during filter operation, depending on certain conditions. The output of the filter is a
single value used to replace the value of the pixel at (x, y), the particular point on which the
window Sxy is centered at a given time.

Consider the following notation:

zmin = minimum gray level value in Sxy

zmax = maximum gray level value in Sxy

zmed = median of gray levels in Sxy

zxy = gray level at coordinates (x, y)

Smax = maximum allowed size of Sxy.



The adaptive median filtering algorithm works in two levels, denoted level A and level B, as
follows:

Level A: A1 = zmed - zmin

A2 = zmed - zmax

If A1 > 0 AND A2 < 0, Go to level B

Else increase the window size

If window size ≤ S max repeat level A

Else output zxy

Level B: B1 = zxy - zmin

B2 = zxy - zmax
If B1> 0 AND B2 < 0, output zxy

Else output zmed
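The two-level procedure can be sketched as follows (Python/NumPy; the names and the default Smax are illustrative; Smax is assumed odd):

import numpy as np

def adaptive_median(img, S_max=7):
    pad = S_max // 2
    fp = np.pad(img, pad, mode='edge')
    out = img.copy()
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            size = 3
            cx, cy = x + pad, y + pad
            zxy = fp[cx, cy]
            while True:
                a = size // 2
                win = fp[cx - a:cx + a + 1, cy - a:cy + a + 1]
                zmin, zmax, zmed = win.min(), win.max(), np.median(win)
                if zmin < zmed < zmax:                                  # level A satisfied
                    out[x, y] = zxy if zmin < zxy < zmax else zmed      # level B
                    break
                size += 2                                               # increase the window
                if size > S_max:
                    out[x, y] = zxy
                    break
    return out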

8. Explain a simple Image Formation Model.

An image is represented by two-dimensional functions of the form f(x, y). The value or
amplitude of f at spatial coordinates (x, y) is a positive scalar quantity whose physical meaning is
determined by the source of the image. When an image is generated from a physical process, its
values are proportional to energy radiated by a physical source (e.g., electromagnetic waves). As
a consequence, f(x, y) must be nonzero and finite; that is,

0 < f (x, y) < ∞ …. (1)

The function f(x, y) may be characterized by two components:

A) The amount of source illumination incident on the scene being viewed.

B) The amount of illumination reflected by the objects in the scene.

Appropriately, these are called the illumination and reflectance components and are denoted by
i (x, y) and r (x, y), respectively. The two functions combine as a product to form f (x, y).

f (x, y) = i (x, y) r (x, y) …. (2)


where

0 < i (x, y) < ∞ …. (3)

and

0 < r (x, y) < 1 …. (4)

Equation (4) indicates that reflectance is bounded by 0 (total absorption) and 1 (total
reflectance).The nature of i (x, y) is determined by the illumination source, and r (x, y) is
determined by the characteristics of the imaged objects. It is noted that these expressions also are
applicable to images formed via transmission of the illumination through a medium, such as a
chest X-ray.

9. Write brief notes on inverse filtering.

The simplest approach to restoration is direct inverse filtering, where an estimate F̂(u, v) of the
transform of the original image is computed simply by dividing the transform of the degraded
image, G(u, v), by the degradation function:

F̂(u, v) = G(u, v) / H(u, v)

The divisions are between individual elements of the functions.

But G(u, v) is given by

G(u, v) = H(u, v) F(u, v) + N(u, v)

Hence

F̂(u, v) = F(u, v) + N(u, v) / H(u, v)

This tells us that even if the degradation function is known, the undegraded image cannot be
recovered exactly [via the inverse Fourier transform of F̂(u, v)], because N(u, v) is a random
function whose Fourier transform is not known.

If the degradation function has zero or very small values, then the ratio N(u, v)/H(u, v) could
easily dominate the estimate F̂(u, v).

One approach to get around the zero or small-value problem is to limit the filter to frequencies
near the origin. The value H(0, 0) is equal to the average value of h(x, y) and
is usually the highest value of H (u, v) in the frequency domain. Thus, by limiting the analysis to
frequencies near the origin, the probability of encountering zero values is reduced.

10. Write about Noise Probability Density Functions.

The following are among the most common PDFs found in image processing applications.

Gaussian noise

Because of its mathematical tractability in both the spatial and frequency domains,
Gaussian (also called normal) noise models are used frequently in practice. In fact, this
tractability is so convenient that it often results in Gaussian models being used in situations in
which they are marginally applicable at best.

The PDF of a Gaussian random variable, z, is given by

p(z) = (1 / (√(2π) σ)) exp( −(z − µ)² / (2σ²) )          … (1)

where z represents gray level, µ is the mean of average value of z, and a σ is its standard
deviation. The standard deviation squared, σ2, is called the variance of z. A plot of this function
is shown in Fig. 5.10. When z is described by Eq. (1), approximately 70% of its values will be in
the range [(µ - σ), (µ +σ)], and about 95% will be in the range [(µ - 2σ), (µ + 2σ)].

Rayleigh noise

The PDF of Rayleigh noise is given by

The mean and variance of this density are given by


µ = a + √(πb/4)

σ² = b(4 − π)/4


Figure 5.10 shows a plot of the Rayleigh density. Note the displacement from the origin and the
fact that the basic shape of this density is skewed to the right. The Rayleigh density can be quite
useful for approximating skewed histograms.

Erlang (Gamma) noise

The PDF of Erlang noise is given by

where the parameters are such that a > 0, b is a positive integer, and "!" indicates factorial. The
mean and variance of this density are given by

µ=b/a

σ2 = b / a2
Exponential noise

The PDF of exponential noise is given by

The mean of this density function is given by

µ=1/a

σ2 = 1 / a2

This PDF is a special case of the Erlang PDF, with b = 1.

Uniform noise

The PDF of uniform noise is given by


The mean of this density function is given by

µ = (a + b) / 2

σ² = (b − a)² / 12
Impulse (salt-and-pepper) noise

The PDF of (bipolar) impulse noise is given by

If b > a, gray-level b will appear as a light dot in the image. Conversely, level a will
appear like a dark dot. If either Pa or Pb is zero, the impulse noise is called unipolar. If neither
probability is zero, and especially if they are approximately equal, impulse noise values will
resemble salt-and-pepper granules randomly distributed over the image. For this reason, bipolar
impulse noise also is called salt-and-pepper noise. Shot and spike noise also are terms used to
refer to this type of noise.


11. Enumerate the differences between the image enhancement and image restoration.

(i) Image enhancement techniques are heuristic procedures designed to manipulate an image in
order to take advantage of the psychophysical aspects of the human system. Whereas image
restoration techniques are basically reconstruction techniques by which a degraded image is
reconstructed by using some of the prior knowledge of the degradation phenomenon.

(ii) Image enhancement can be implemented by spatial and frequency domain technique,
whereas image restoration can be implement by frequency domain and algebraic techniques.

(iii) The computational complexity for image enhancement is relatively low when compared to
the computational complexity for image restoration, since algebraic methods require
manipulation of a large number of simultaneous equations. But, under some conditions, the
computational complexity can be reduced to the same level as that required by traditional
frequency domain techniques.

(iv) Image enhancement techniques are problem oriented, whereas image restoration techniques
are general and are oriented towards modeling the degradation and applying the reverse process
in order to reconstruct the original image.

(v) Masks are used in spatial domain methods for image enhancement, whereas masks are not
used for image restoration techniques.

(vi) Contrast stretching is considered an image enhancement technique because it is based on the
pleasing aspects of the result to the viewer, whereas removal of image blur by applying a deblurring
function is considered an image restoration technique.

12. Explain about iterative nonlinear restoration using the Lucy–Richardson algorithm.

Lucy-Richardson algorithm is a nonlinear restoration method used to recover a latent


image which is blurred by a Point Spread Function (psf). It is also known as Richardson-Lucy
deconvolution.

With Pij as the point spread function, the pixels in the observed image are expressed as,

Here,

uj = Pixel value at location j in the image


The L-R algorithm cannot be used in application in which the psf (Pij) is dependent on one or
more unknown variables.

The L-R algorithm is based on maximum-likelihood formulation, in this formulation Poisson statistics are
used to model the image. If the likelihood of model is increased, then the result is an equation which
satisfies when the following iteration converges.

Here,

f = Estimation of undegraded image.

The factor f which is present in the right-side denominator leads to non-linearity. Since the
algorithm is a type of nonlinear restoration, it is stopped when a satisfactory result is
obtained.

The basic syntax of function deconvlucy with the L-R algorithm is implemented is given
below.

fr = Deconvlucy (g, psf, NUMIT, DAMPAR, WEIGHT)

Here the parameters are,

g = Degraded image

fr = Restored image

psf = Point spread function

NUMIT = Total number of iterations.

The remaining two parameters are,

DAMPAR

The DAMPAR parameter is a scalar which specifies the threshold deviation of the resultant
image from the degraded image (g). For pixels that deviate from their original value by less
than DAMPAR, iterations are suppressed so as to reduce noise.

WEIGHT

The WEIGHT parameter gives a weight to each and every pixel. It is an array of the same size as
the degraded image (g). In applications where a pixel degrades the image, it can be excluded by
assigning it a weight of 0. Pixels may also be given weights depending upon the flat-field
correction, which is essential according to the image array. Weights are used in applications such as
blurring with a specified psf; they are used to exclude pixels which are present at the boundary
of the image and are blurred differently by the psf.

If the array size of the psf is n x n, then the width of the border of zeros used in WEIGHT is
ceil(n / 2).

Digital Image Processing

UNIT-V

IMAGE COMPRESSION


1. Define image compression. Explain about the redundancies in a digital


image.
The term data compression refers to the process of reducing the amount of data required to
represent a given quantity of information. A clear distinction must be made between data and
information. They are not synonymous. In fact, data are the means by which information is
conveyed. Various amounts of data may be used to represent the same amount of information.
Such might be the case, for example, if a long-winded individual and someone who is short and
to the point were to relate the same story. Here, the information of interest is the story; words are
the data used to relate the information. If the two individuals use a different number of words to
tell the same basic story, two different versions of the story are created, and at least one includes
nonessential data. That is, it contains data (or words) that either provide no relevant information
or simply restate that which is already known. It is thus said to contain data redundancy.

Data redundancy is a central issue in digital image compression. It is not an abstract concept but
a mathematically quantifiable entity. If n1 and n2 denote the number of information-carrying
units in two data sets that represent the same information, the relative data redundancy RD of the
first data set (the one characterized by n1) can be defined as

RD = 1 − 1/CR

where CR, commonly called the compression ratio, is

CR = n1 / n2
For the case n2 = n1, CR = 1 and RD = 0, indicating that (relative to the second data set) the first
representation of the information contains no redundant data. When n2 << n1, CR → ∞ and
RD → 1, implying significant compression and highly redundant data. Finally, when n2 >> n1,
CR → 0 and RD → −∞, indicating that the second data set contains much more data
than the original representation. This, of course, is the normally undesirable case of data
expansion. In general, CR and RD lie in the open intervals (0, ∞) and (- ∞, 1), respectively.
A practical compression ratio, such as 10 (or 10:1), means that the first data set has 10
information carrying
units (say, bits) for every 1 unit in the second or compressed data set. The corresponding
redundancy of 0.9 implies that 90% of the data in the first data set is redundant.
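For instance (a small illustrative calculation, not from the original text):

# Compression ratio and relative redundancy for the 10:1 example above
n1, n2 = 10.0, 1.0              # information-carrying units before / after compression
CR = n1 / n2                    # CR = n1 / n2 = 10
RD = 1.0 - 1.0 / CR             # RD = 1 - 1/CR = 0.9, i.e. 90% redundant data
print(CR, RD)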

In digital image compression, three basic data redundancies can be identified and exploited:
coding redundancy, interpixel redundancy, and psychovisual redundancy. Data compression
is achieved when one or more of these redundancies are reduced or eliminated.

Coding Redundancy:

In this, we utilize formulation to show how the gray-level histogram of an image also can
provide a great deal of insight into the construction of codes to reduce the amount of data used to
represent it.

Let us assume, once again, that a discrete random variable rk in the interval [0, 1] represents the
gray levels of an image and that each rk occurs with probability pr(rk):

pr(rk) = nk / n,    k = 0, 1, 2, ..., L - 1

where L is the number of gray levels, nk is the number of times that the kth gray level appears in
the image, and n is the total number of pixels in the image. If the number of bits used to represent
each value of rk is l(rk), then the average number of bits required to represent each pixel is

Lavg = Σ (k = 0 to L - 1) l(rk) pr(rk)

That is, the average length of the code words assigned to the various gray-level values is found
by summing the product of the number of bits used to represent each gray level and the
probability that the gray level occurs. Thus the total number of bits required to code an M X N
image is MNLavg.
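
To make the above concrete, the short Python sketch below computes pr(rk), Lavg, CR, and RD for a hypothetical 3-bit image; the histogram counts and the variable-length code-word lengths are invented purely for illustration.

import numpy as np

# Hypothetical histogram of an 8-level (3-bit) image and an assumed
# variable-length code in which frequent gray levels get short code words.
n_k = np.array([5000, 3000, 1000, 500, 300, 120, 50, 30])   # n_k: occurrences of gray level r_k
l_rk = np.array([1, 2, 3, 4, 5, 6, 7, 7])                    # assumed code-word lengths (a valid prefix code)

n = n_k.sum()                      # total number of pixels
p_rk = n_k / n                     # p_r(r_k) = n_k / n
L_avg = np.sum(l_rk * p_rk)        # average code-word length in bits/pixel

n1 = 3 * n                         # bits needed by the natural 3-bit binary code
n2 = L_avg * n                     # bits needed by the variable-length code
CR = n1 / n2                       # compression ratio CR = n1 / n2
RD = 1 - 1 / CR                    # relative data redundancy RD = 1 - 1/CR
print(f"Lavg = {L_avg:.2f} bits/pixel, CR = {CR:.2f}, RD = {RD:.2f}")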

Interpixel Redundancy:

Consider the images shown in Figs. 1.1(a) and (b). As Figs. 1.1(c) and (d) show, these images
have virtually identical histograms. Note also that both histograms are trimodal, indicating the
presence of three dominant ranges of gray-level values. Because the gray levels in these images
are not equally probable, variable-length coding can be used to reduce the coding redundancy
that would result from a straight or natural binary encoding of their pixels. The coding process,
however, would not alter the level of correlation between the pixels within the images. In other
words, the codes used to represent the gray levels of each image have nothing to do with the
correlation between pixels. These correlations result from the structural or geometric
relationships between the objects in the image.


Fig.1.1 Two images and their gray-level histograms and normalized autocorrelation
coefficients along one line.

Figures 1.1(e) and (f) show the respective autocorrelation coefficients computed along one line
of each image,

γ(Δn) = A(Δn) / A(0)

where

A(Δn) = [1 / (N - Δn)] Σ (y = 0 to N - 1 - Δn) f(x, y) f(x, y + Δn)

The scaling factor in Eq. above accounts for the varying number of sum terms that arise for each
integer value of Δn. Of course, Δn must be strictly less than N, the number of pixels on a line.
The variable x is the coordinate of the line used in the computation. Note the dramatic difference
between the shape of the functions shown in Figs. 1.1(e) and (f). Their shapes can be
qualitatively related to the structure in the images in Figs. 1.1(a) and (b).This relationship is
particularly noticeable in Fig. 1.1 (f), where the high correlation between pixels separated by 45
and 90 samples can be directly related to the spacing between the vertically oriented matches of
Fig. 1.1(b). In addition, the adjacent pixels of both images are highly correlated. When Δn is 1, γ
is 0.9922 and 0.9928 for the images of Figs. 1.1 (a) and (b), respectively. These values are
typical of most properly sampled television images.

These illustrations reflect another important form of data redundancy—one directly related to the interpixel correlations within an image. Because the
value of any given pixel can be reasonably predicted from the value of its neighbors, the
information carried by individual pixels is relatively small. Much of the visual contribution of a
single pixel to an image is redundant; it could have been guessed on the basis of the values of its
neighbors. A variety of names, including spatial redundancy, geometric redundancy, and
interframe redundancy, have been coined to refer to these interpixel dependencies. We use the
term interpixel redundancy to encompass them all.

In order to reduce the interpixel redundancies in an image, the 2-D pixel array normally used for human viewing and interpretation must be transformed into a more
efficient (but usually "nonvisual") format. For example, the differences between adjacent pixels
can be used to represent an image. Transformations of this type (that is, those that remove
interpixel redundancy) are referred to as mappings. They are called reversible mappings if the
original image elements can be reconstructed from the transformed data set.


Psychovisual Redundancy:

The brightness of a region, as perceived by the eye, depends on factors other than simply the
light reflected by the region. For example, intensity variations (Mach bands) can be perceived in
an area of constant intensity. Such phenomena result from the fact that the eye does not respond
with equal sensitivity to all visual information. Certain information simply has less relative
importance than other information in normal visual processing. This information is said to be
psychovisually redundant. It can be eliminated without significantly impairing the quality of
image perception.

That psychovisual redundancies exist should not come as a surprise, because human perception of the information in an image normally does not involve quantitative analysis
of every pixel value in the image. In general, an observer searches for distinguishing features
such as edges or textural regions and mentally combines them into recognizable groupings. The
brain then correlates these groupings with prior knowledge in order to complete the image
interpretation process. Psychovisual redundancy is fundamentally different from the
redundancies discussed earlier. Unlike coding and interpixel redundancy, psychovisual
redundancy is associated with real or quantifiable visual information. Its elimination is possible
only because the information itself is not essential for normal visual processing. Since the
elimination of psychovisually redundant data results in a loss of quantitative information, it is
commonly referred to as quantization.

This terminology is consistent with normal usage of the word, which generally
means the mapping of a broad range of input values to a limited number of output values. As it is
an irreversible operation (visual information is lost), quantization results in lossy data
compression.

2. Explain about fidelity criterion.


The removal of psychovisually redundant data results in a loss of real or quantitative visual
information. Because information of interest may be lost, a repeatable or reproducible means of
quantifying the nature and extent of information loss is highly desirable. Two general classes of
criteria are used as the basis for such an assessment:

A) Objective fidelity criteria and

B) Subjective fidelity criteria.


When the level of information loss can be expressed as a function of the original or input image
and the compressed and subsequently decompressed output image, it is said to be based on an
objective fidelity criterion. A good example is the root-mean-square (rms) error between an input
and output image. Let f(x, y) represent an input image and let f^(x, y) denote an estimate or
approximation of f(x, y) that results from compressing and subsequently decompressing the
input. For any value of x and y, the error e(x, y) between f(x, y) and f^(x, y) can be defined as

e(x, y) = f^(x, y) - f(x, y)

so that the total error between the two images is

Σ (x = 0 to M - 1) Σ (y = 0 to N - 1) [f^(x, y) - f(x, y)]

where the images are of size M X N. The root-mean-square error, erms, between f(x, y) and f^(x, y)
then is the square root of the squared error averaged over the M X N array, or

erms = [ (1 / MN) Σ (x = 0 to M - 1) Σ (y = 0 to N - 1) [f^(x, y) - f(x, y)]^2 ]^(1/2)

A closely related objective fidelity criterion is the mean-square signal-to-noise ratio of the
compressed-decompressed image. If f^(x, y) is considered to be the sum of the original image
f(x, y) and a noise signal e(x, y), the mean-square signal-to-noise ratio of the output image,
denoted SNRms, is

SNRms = Σ Σ f^(x, y)^2 / Σ Σ [f^(x, y) - f(x, y)]^2

The rms value of the signal-to-noise ratio, denoted SNRrms, is obtained by taking the square root
of the expression above.
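
Both objective criteria translate directly into a few lines of code. The Python sketch below is illustrative only; the test images are random stand-ins.

import numpy as np

def e_rms(f, f_hat):
    # Root-mean-square error between the input image f and its decompressed approximation f_hat.
    e = f_hat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(e ** 2))

def snr_ms(f, f_hat):
    # Mean-square signal-to-noise ratio of the compressed-decompressed image.
    f = f.astype(np.float64)
    f_hat = f_hat.astype(np.float64)
    return np.sum(f_hat ** 2) / np.sum((f_hat - f) ** 2)

f = np.random.randint(0, 256, (64, 64))                          # stand-in "original" image
f_hat = np.clip(f + np.random.randint(-3, 4, f.shape), 0, 255)   # mildly distorted copy
print(e_rms(f, f_hat), snr_ms(f, f_hat))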

Although objective fidelity criteria offer a simple and convenient mechanism for
evaluating information loss, most decompressed images ultimately are viewed by humans.
Consequently, measuring image quality by the subjective evaluations of a human observer often
is more appropriate. This can be accomplished by showing a "typical" decompressed image to an
appropriate cross section of viewers and averaging their evaluations. The evaluations may be
made using an absolute rating scale or by means of side-by-side comparisons of f(x, y) and f^(x, y).

3. Explain about image compression models.

Fig. 3.1 shows, a compression system consists of two distinct structural blocks: an encoder and a
decoder. An input image f(x, y) is fed into the encoder, which creates a set of symbols from the
input data. After transmission over the channel, the encoded representation is fed to the decoder,
where a reconstructed output image f^(x, y) is generated. In general, f^(x, y) may or may not be
an exact replica of f(x, y). If it is, the system is error free or information preserving; if not, some
level of distortion is present in the reconstructed image. Both the encoder and decoder shown in
Fig. 3.1 consist of two relatively independent functions or subblocks. The encoder is made up of
a source encoder, which removes input redundancies, and a channel encoder, which increases the
noise immunity of the source encoder's output. As would be expected, the decoder includes a
channel decoder followed by a source decoder. If the channel between the encoder and decoder
is noise free (not prone to error), the channel encoder and decoder are omitted, and the general
encoder and decoder become the source encoder and decoder, respectively.

Fig.3.1 A general compression system model

The Source Encoder and Decoder:

The source encoder is responsible for reducing or eliminating any coding, interpixel, or
psychovisual redundancies in the input image. The specific application and associated fidelity
requirements dictate the best encoding approach to use in any given situation. Normally, the
approach can be modeled by a series of three independent operations. As Fig. 3.2 (a) shows, each
operation is designed to reduce one of the three redundancies. Figure 3.2 (b) depicts the
corresponding source decoder. In the first stage of the source encoding process, the mapper
transforms the input data into a (usually nonvisual) format designed to reduce interpixel
redundancies in the input image. This operation generally is reversible and may or may not
reduce directly the amount of data required to represent the image.


Fig.3.2 (a) Source encoder and (b) source decoder model

Run-length coding is an example of a mapping that directly results in data compression in this
initial stage of the overall source encoding process. The representation of an image by a set of
transform coefficients is an example of the opposite case. Here, the mapper transforms the image
into an array of coefficients, making its interpixel redundancies more accessible for compression
in later stages of the encoding process.

The second stage, or quantizer block in Fig. 3.2 (a), reduces the
accuracy of the mapper's output in accordance with some preestablished fidelity criterion. This
stage reduces the psychovisual redundancies of the input image. This operation is irreversible.
Thus it must be omitted when error-free compression is desired.

In the third and final stage of the source encoding process, the symbol
coder creates a fixed- or variable-length code to represent the quantizer output and maps the
output in accordance with the code. The term symbol coder distinguishes this coding operation
from the overall source encoding process. In most cases, a variable-length code is used to
represent the mapped and quantized data set. It assigns the shortest code words to the most
frequently occurring output values and thus reduces coding redundancy. The operation, of
course, is reversible. Upon completion of the symbol coding step, the input image has been
processed to remove each of the three redundancies.

Figure 3.2(a) shows the source encoding process as three successive operations, but all three
operations are not necessarily included in every compression system. Recall, for example, that
the quantizer must be omitted when error-free compression is desired. In addition, some
compression techniques normally are modeled by merging blocks that are physically separate in Fig. 3.2(a). In the predictive compression systems, for instance, the mapper and quantizer are
often represented by a single block, which simultaneously performs both operations.

The source decoder shown in Fig. 3.2(b) contains only two components: a symbol
decoder and an inverse mapper. These blocks perform, in reverse order, the inverse operations of
the source encoder's symbol encoder and mapper blocks. Because quantization results in
irreversible information loss, an inverse quantizer block is not included in the general source
decoder model shown in Fig. 3.2(b).

The Channel Encoder and Decoder:

The channel encoder and decoder play an important role in the overall encoding-decoding
process when the channel of Fig. 3.1 is noisy or prone to error. They are designed to reduce the
impact of channel noise by inserting a controlled form of redundancy into the source encoded
data. As the output of the source encoder contains little redundancy, it would be highly sensitive
to transmission noise without the addition of this "controlled redundancy." One of the most
useful channel encoding techniques was devised by R. W. Hamming (Hamming [1950]). It is
based on appending enough bits to the data being encoded to ensure that some minimum number
of bits must change between valid code words. Hamming showed, for example, that if 3 bits of
redundancy are added to a 4-bit word, so that the distance between any two valid code words is
3, all single-bit errors can be detected and corrected. (By appending additional bits of
redundancy, multiple-bit errors can be detected and corrected.) The 7-bit Hamming (7, 4) code
word h1h2h3h4h5h6h7 associated with a 4-bit binary number b3b2b1b0 is

h1 = b3 Ⓧ b2 Ⓧ b0
h2 = b3 Ⓧ b1 Ⓧ b0
h3 = b3
h4 = b2 Ⓧ b1 Ⓧ b0
h5 = b2
h6 = b1
h7 = b0

where Ⓧ denotes the exclusive OR operation. Note that bits h1, h2, and h4 are even-parity bits for
the bit fields b3b2b0, b3b1b0, and b2b1b0, respectively. (Recall that a string of binary bits has
even parity if the number of bits with a value of 1 is even.) To decode a Hamming encoded
result, the channel decoder must check the encoded value for odd parity over the bit fields in
which even parity was previously established. A single-bit error is indicated by a nonzero parity
word c4c2c1, where

c1 = h1 Ⓧ h3 Ⓧ h5 Ⓧ h7
c2 = h2 Ⓧ h3 Ⓧ h6 Ⓧ h7
c4 = h4 Ⓧ h5 Ⓧ h6 Ⓧ h7

If a nonzero value is found, the decoder simply complements the code word bit position
indicated by the parity word. The decoded binary value is then extracted from the corrected code
word as h3h5h6h7.
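
The encoding and decoding rules above can be sketched in Python as follows (illustrative only; the bit layout follows the text, with the data bits carried in positions h3, h5, h6, and h7):

def hamming74_encode(b3, b2, b1, b0):
    # Even-parity bits over the fields b3b2b0, b3b1b0 and b2b1b0, as described above.
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]          # code word h1 h2 h3 h4 h5 h6 h7

def hamming74_decode(h):
    # Re-check parity, correct a single-bit error, and return the data bits h3 h5 h6 h7.
    h1, h2, h3, h4, h5, h6, h7 = h
    c1 = h1 ^ h3 ^ h5 ^ h7
    c2 = h2 ^ h3 ^ h6 ^ h7
    c4 = h4 ^ h5 ^ h6 ^ h7
    syndrome = 4 * c4 + 2 * c2 + c1              # nonzero parity word c4c2c1 gives the error position
    if syndrome:
        h[syndrome - 1] ^= 1                     # complement the indicated bit position
    return h[2], h[4], h[5], h[6]

code = hamming74_encode(1, 0, 1, 1)              # encode the 4-bit number 1011
code[4] ^= 1                                     # inject a single-bit error into h5
assert hamming74_decode(code) == (1, 0, 1, 1)    # the error is detected and corrected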

4. Explain a method of generating variable length codes with an example.

Variable-Length Coding:

The simplest approach to error-free image compression is to reduce only coding redundancy.
Coding redundancy normally is present in any natural binary encoding of the gray levels in an
image. It can be eliminated by coding the gray levels. To do so requires construction of a variable-
length code that assigns the shortest possible code words to the most probable gray levels. Here,
we examine several optimal and near optimal techniques for constructing such a code. These
techniques are formulated in the language of information theory. In practice, the source symbols
may be either the gray levels of an image or the output of a gray-level mapping operation (pixel
differences, run lengths, and so on).

Huffman coding:

The most popular technique for removing coding redundancy is due to Huffman (Huffman
[1952]). When coding the symbols of an information source individually, Huffman coding yields
the smallest possible number of code symbols per source symbol. In terms of the noiseless
coding theorem, the resulting code is optimal for a fixed value of n, subject to the constraint that
the source symbols be coded one at a time.

The first step in Huffman's approach is to create a series of source reductions by ordering the
probabilities of the symbols under consideration and combining the lowest probability symbols
into a single symbol that replaces them in the next source reduction. Figure 4.1 illustrates this
process for binary coding (K-ary Huffman codes can also be constructed). At the far left, a
hypothetical set of source symbols and their probabilities are ordered from top to bottom in terms
of decreasing probability values. To form the first source reduction, the bottom two probabilities,
0.06 and 0.04, are combined to form a "compound symbol" with probability 0.1. This compound
symbol and its associated probability are placed in the first source reduction column so that the probabilities of the reduced source are also ordered from the most to the least probable. This
process is then repeated until a reduced source with two symbols (at the far right) is reached.

The second step in Huffman's procedure is to code each reduced source, starting with the smallest source and working back to the original source. The minimal length
binary code for a two-symbol source, of course, is the symbols 0 and 1. As Fig. 4.2 shows, these
symbols are assigned to the two symbols on the right (the assignment is arbitrary; reversing the
order of the 0 and 1 would work just as well). As the reduced source symbol with probability 0.6
was generated by combining two symbols in the reduced source to its left, the 0 used to code it is
now assigned to both of these symbols, and a 0 and 1 are arbitrarily

Fig.4.1 Huffman source reductions.

Fig.4.2 Huffman code assignment procedure.

appended to each to distinguish them from each other. This operation is then repeated for each
reduced source until the original source is reached. The final code appears at the far left in Fig.
4.2. The average length of this code is Lavg = 2.2 bits/symbol, and the entropy of the source is
2.14 bits/symbol. The resulting Huffman code efficiency is therefore 2.14/2.2 ≈ 0.973.

Huffman's procedure creates the optimal code for a set of symbols and probabilities subject to
the constraint that the symbols be coded one at a time. After the code has been created, coding
and/or decoding is accomplished in a simple lookup table manner. The code itself is an
instantaneous uniquely decodable block code. It is called a block code because each source
symbol is mapped into a fixed sequence of code symbols. It is instantaneous, because each code
word in a string of code symbols can be decoded without referencing succeeding symbols. It is
uniquely decodable, because any string of code symbols can be decoded in only one way. Thus,
any string of Huffman encoded symbols can be decoded by examining the individual symbols of
the string in a left to right manner. For the binary code of Fig. 4.2, a left-to-right scan of the
encoded string 010100111100 reveals that the first valid code word is 01010, which is the code
for symbol a3 .The next valid code is 011, which corresponds to symbol a1. Continuing in this
manner reveals the completely decoded message to be a3a1a2a2a6.
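
For illustration, the following Python sketch performs the source reductions with a priority queue. The six symbol probabilities are assumed to match the example of Figs. 4.1 and 4.2 (which are not reproduced above); because Huffman codes are not unique, the individual code words it produces may differ from Fig. 4.2, but the average length Lavg is the same.

import heapq

def huffman_code(prob):
    # Binary Huffman code for a dict {symbol: probability}.
    # Each heap entry: (group probability, tie-breaker, {symbol: partial code word}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(prob.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, grp1 = heapq.heappop(heap)        # combine the two least probable groups
        p2, _, grp2 = heapq.heappop(heap)        # (one source reduction step)
        merged = {s: "0" + c for s, c in grp1.items()}
        merged.update({s: "1" + c for s, c in grp2.items()})
        count += 1
        heapq.heappush(heap, (p1 + p2, count, merged))
    return heap[0][2]

prob = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}   # assumed probabilities
code = huffman_code(prob)
L_avg = sum(prob[s] * len(code[s]) for s in prob)
print(code)        # one of several optimal codes for this source
print(L_avg)       # ≈ 2.2 bits/symbol, in agreement with the value quoted above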

5. Explain arithmetic encoding process with an example.

Arithmetic coding:

Unlike the variable-length codes described previously, arithmetic coding generates nonblock
codes. In arithmetic coding, which can be traced to the work of Elias, a one-to-one
correspondence between source symbols and code words does not exist. Instead, an entire
sequence of source symbols (or message) is assigned a single arithmetic code word. The code
word itself defines an interval of real numbers between 0 and 1. As the number of symbols in the
message increases, the interval used to represent it becomes smaller and the number of
information units (say, bits) required to represent the interval becomes larger. Each symbol of the
message reduces the size of the interval in accordance with its probability of occurrence.
Because the technique does not require, as does Huffman's approach, that each source symbol
translate into an integral number of code symbols (that is, that the symbols be coded one at a
time), it achieves (but only in theory) the bound established by the noiseless coding theorem.


Fig.5.1 Arithmetic coding procedure

Figure 5.1 illustrates the basic arithmetic coding process. Here, a five-symbol sequence or
message, a1a2a3a3a4, from a four-symbol source is coded. At the start of the coding process, the
message is assumed to occupy the entire half-open interval [0, 1). As Table 5.1 shows, this
interval is initially subdivided into four regions based on the probabilities of each source symbol.
Symbol a1, for example, is associated with subinterval [0, 0.2). Because it is the first symbol of
the message being coded, the message interval is initially narrowed to [0, 0.2). Thus in Fig. 5.1
[0, 0.2) is expanded to the full height of the figure and its end points labeled by the values of the
narrowed range. The narrowed range is then subdivided in accordance with the original source
symbol probabilities and the process continues with the next message symbol.

Table 5.1 Arithmetic coding example

In this manner, symbol a2 narrows the subinterval to [0.04, 0.08), a3 further narrows it to [0.056,
0.072), and so on. The final message symbol, which must be reserved as a special end-of-message indicator, narrows the range to [0.06752, 0.0688). Of course, any number within this
subinterval—for example, 0.068—can be used to represent the message.

In the arithmetically coded message of Fig. 5.1, three decimal digits are used
to represent the five-symbol message. This translates into 3/5 or 0.6 decimal digits per source
symbol and compares favorably with the entropy of the source, which is 0.58 decimal digits or 10-
ary units/symbol. As the length of the sequence being coded increases, the resulting arithmetic
code approaches the bound established by the noiseless coding theorem.

In practice, two factors cause coding performance to fall short of the bound: (1)
the addition of the end-of-message indicator that is needed to separate one message from an-
other; and (2) the use of finite precision arithmetic. Practical implementations of arithmetic
coding address the latter problem by introducing a scaling strategy and a rounding strategy
(Langdon and Rissanen [1981]). The scaling strategy renormalizes each subinterval to the [0, 1)
range before subdividing it in accordance with the symbol probabilities. The rounding strategy
guarantees that the truncations associated with finite precision arithmetic do not prevent the
coding subintervals from being represented accurately.
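
The interval-narrowing step itself is easy to reproduce. The Python sketch below ignores the finite-precision issues just mentioned and assumes the source model pr(a1) = 0.2, pr(a2) = 0.2, pr(a3) = 0.4, pr(a4) = 0.2, which is consistent with the subintervals quoted above.

def arithmetic_encode(message, prob):
    # Narrow the half-open interval [0, 1) once per symbol; any number in the
    # final interval can be used to represent the whole message.
    cum, edge = {}, 0.0
    for s, p in prob.items():                    # cumulative subinterval of each symbol
        cum[s] = (edge, edge + p)
        edge += p
    low, high = 0.0, 1.0
    for s in message:
        width = high - low
        c_lo, c_hi = cum[s]
        low, high = low + width * c_lo, low + width * c_hi
    return low, high

prob = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}          # assumed source probabilities
low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], prob)
print(low, high)      # approximately [0.06752, 0.0688); e.g. 0.068 encodes the message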

6. Explain LZW coding with an example.

LZW Coding:

The technique, called Lempel-Ziv-Welch (LZW) coding, assigns fixed-length code words to
variable length sequences of source symbols but requires no a priori knowledge of the
probability of occurrence of the symbols to be encoded. LZW compression has been integrated
into a variety of mainstream imaging file formats, including the graphic interchange format
(GIF), tagged image file format (TIFF), and the portable document format (PDF).

LZW coding is conceptually very simple (Welch [1984]). At the onset of the
coding process, a codebook or "dictionary" containing the source symbols to be coded is
constructed. For 8-bit monochrome images, the first 256 words of the dictionary are assigned to
the gray values 0, 1, 2..., and 255. As the encoder sequentially examines the image's pixels, gray-
level sequences that are not in the dictionary are placed in algorithmically determined (e.g., the
next unused) locations. If the first two pixels of the image are white, for instance, sequence "255-255" might be assigned to location 256, the address following the locations reserved for gray
levels 0 through 255. The next time that two consecutive white pixels are encountered, code
word 256, the address of the location containing sequence 255-255, is used to represent them. If
a 9-bit, 512-word dictionary is employed in the coding process, the original (8 + 8) bits that were
used to represent the two pixels are replaced by a single 9-bit code word. Clearly, the size of the
dictionary is an important system parameter. If it is too small, the detection of matching gray-
level sequences will be less likely; if it is too large, the size of the code words will adversely
affect compression performance.

Consider the following 4 x 4, 8-bit image of a vertical edge:

Table 6.1 details the steps involved in coding its 16 pixels. A 512-word dictionary with the
following starting content is assumed: dictionary locations 0 through 255 contain the gray-level
values 0 through 255, respectively, while locations 256 through 511 are initially unused. The
image is encoded by processing its pixels in
a left-to-right, top-to-bottom manner. Each successive gray-level value is concatenated with a
variable—column 1 of Table 6.1 —called the "currently recognized sequence." As can be seen,
this variable is initially null or empty. The dictionary is searched for each concatenated sequence
and if found, as was the case in the first row of the table, is replaced by the newly concatenated
and recognized (i.e., located in the dictionary) sequence. This was done in column 1 of row 2.


Table 6.1 LZW coding example

No output codes are generated, nor is the dictionary altered. If the concatenated sequence is not
found, however, the address of the currently recognized sequence is output as the next encoded
value, the concatenated but unrecognized sequence is added to the dictionary, and the currently
recognized sequence is initialized to the current pixel value. This occurred in row 2 of the table.
The last two columns detail the gray-level sequences that are added to the dictionary when
scanning the entire 4 x 4 image. Nine additional code words are defined. At the conclusion of
coding, the dictionary contains 265 code words and the LZW algorithm has successfully
identified several repeating gray-level sequences—leveraging them to reduce the original 128-bit
image to 90 bits (i.e., 10 9-bit codes). The encoded output is obtained by reading the third
column from top to bottom. The resulting compression ratio is 1.42:1.

A unique feature of the LZW coding just demonstrated is that the coding
dictionary or code book is created while the data are being encoded. Remarkably, an LZW
decoder builds an identical decompression dictionary as it decodes simultaneously the encoded
data stream. Although not needed in this example, most practical applications require a strategy
for handling dictionary overflow. A simple solution is to flush or reinitialize the dictionary when
it becomes full and continue coding with a new initialized dictionary. A more complex option is
to monitor compression performance and flush the dictionary when it becomes poor or
unacceptable. Alternately, the least used dictionary entries can be tracked and replaced when
necessary.
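
The encoding rule above fits in a few lines of Python. The two gray levels of the 4 x 4 vertical-edge image are not listed in the text, so the values 39 and 126 below are assumed for illustration; with them the sketch emits 10 output codes and adds 9 dictionary entries, in agreement with the counts quoted above.

def lzw_encode(pixels):
    # LZW-encode a 1-D sequence of 8-bit gray levels (row-major scan of the image).
    dictionary = {(g,): g for g in range(256)}     # locations 0..255 hold the individual gray levels
    next_code = 256                                # first unused dictionary location
    current = ()                                   # the "currently recognized sequence"
    out = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:                # keep growing the recognized sequence
            current = candidate
        else:
            out.append(dictionary[current])        # emit the code of the recognized sequence
            dictionary[candidate] = next_code      # add the unrecognized sequence
            next_code += 1
            current = (p,)                         # restart from the current pixel
    if current:
        out.append(dictionary[current])
    return out

image = [39, 39, 126, 126] * 4                     # assumed 4 x 4 vertical-edge image, row by row
codes = lzw_encode(image)
print(codes)           # 10 codes -> 10 x 9 = 90 bits, versus 16 x 8 = 128 bits originally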

7. Explain the concept of bit plane coding method.

Bit-Plane Coding:

An effective technique for reducing an image's interpixel redundancies is to process the image's
bit planes individually. The technique, called bit-plane coding, is based on the concept of
decomposing a multilevel (monochrome or color) image into a series of binary images and
compressing each binary image via one of several well-known binary compression methods.

Bit-plane decomposition:

The gray levels of an m-bit gray-scale image can be represented in the form of the base 2
polynomial

am-1 2^(m-1) + am-2 2^(m-2) + ... + a1 2^1 + a0 2^0

Based on this property, a simple method of decomposing the image into a collection of binary
images is to separate the m coefficients of the polynomial into m 1-bit bit planes. The zeroth-order
bit plane is generated by collecting the a0 bits of each pixel, while the (m-1)st-order bit
plane contains the am-1 bits or coefficients. In general, each bit plane is numbered from 0 to m-1
and is constructed by setting its pixels equal to the values of the appropriate bits or polynomial
coefficients from each pixel in the original image. The inherent disadvantage of this approach is
that small changes in gray level can have a significant impact on the complexity of the bit planes.
If a pixel of intensity 127 (01111111) is adjacent to a pixel of intensity 128 (10000000), for
instance, every bit plane will contain a corresponding 0 to 1 (or 1 to 0) transition. For example,
as the most significant bits of the two binary codes for 127 and 128 are different, bit plane 7 will
contain a zero-valued pixel next to a pixel of value 1, creating a 0 to 1 (or 1 to 0) transition at
that point.

An alternative decomposition approach (which reduces the effect of small gray-level variations)
is to first represent the image by an m-bit Gray code. The m-bit Gray code gm-1...g2g1g0 that
corresponds to the polynomial in Eq. above can be computed from

gi = ai Ⓧ ai+1,    0 ≤ i ≤ m - 2
gm-1 = am-1

Here, Ⓧ denotes the exclusive OR operation. This code has the unique property that successive
code words differ in only one bit position. Thus, small changes in gray level are less likely to
affect all m bit planes. For instance, when gray levels 127 and 128 are adjacent, only the 7th bit
plane will contain a 0 to 1 transition, because the Gray codes that correspond to 127 and 128 are
01000000 and 11000000, respectively.
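
A short Python sketch of both decompositions (illustrative only):

import numpy as np

def to_gray(img):
    # Binary-reflected Gray code of an 8-bit image: g = b XOR (b >> 1).
    return img ^ (img >> 1)

def bit_planes(img, bits=8):
    # Return the bit planes as binary images, from plane 0 (a0) to plane bits-1 (am-1).
    return [(img >> k) & 1 for k in range(bits)]

a = np.array([127, 128], dtype=np.uint8)           # adjacent gray levels
print([f"{int(v):08b}" for v in a])                # ['01111111', '10000000'] -> all 8 planes change
print([f"{int(v):08b}" for v in to_gray(a)])       # ['01000000', '11000000'] -> only plane 7 changes
planes = bit_planes(to_gray(a))                    # binary images to be compressed individually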

8. Explain about lossless predictive coding.

Lossless Predictive Coding:

The error-free compression approach does not require decomposition of an image into a
collection of bit planes. The approach, commonly referred to as lossless predictive coding, is
based on eliminating the interpixel redundancies of closely spaced pixels by extracting and
coding only the new information in each pixel. The new information of a pixel is defined as the
difference between the actual and predicted value of that pixel.

Figure 8.1 shows the basic components of a lossless predictive coding system. The system
consists of an encoder and a decoder, each containing an identical predictor. As each successive
pixel of the input image, denoted fn, is introduced to the encoder, the predictor generates the
anticipated value of that pixel based on some number of past inputs. The output of the predictor
is then rounded to the nearest integer, denoted f^n, and used to form the difference or prediction
error

en = fn - f^n

which is coded using a variable-length code (by the symbol encoder) to generate the next element
of the compressed data stream.


Fig.8.1 A lossless predictive coding model: (a) encoder; (b) decoder

The decoder of Fig. 8.1(b) reconstructs en from the received variable-length code words and
performs the inverse operation

fn = en + f^n

Various local, global, and adaptive methods can be used to generate f^n. In most cases, however,
the prediction is formed by a linear combination of m previous pixels. That is,

f^n = round [ Σ (i = 1 to m) αi fn-i ]

where m is the order of the linear predictor, round is a function used to denote the rounding or
nearest integer operation, and the αi, for i = 1,2,..., m are prediction coefficients. In raster scan
applications, the subscript n indexes the predictor outputs in accordance with their time of
occurrence. That is, fn, f^n and en in Eqns. above could be replaced with the more explicit
notation f (t), f^(t), and e (t), where t represents time. In other cases, n is used as an index on the
spatial coordinates and/or frame number (in a time sequence of images) of an image. In 1-D
linear predictive coding, for example, Eq. above can be written as

f^(x, y) = round [ Σ (i = 1 to m) αi f(x, y - i) ]

where each subscripted variable is now expressed explicitly as a function of spatial coordinates x
and y. The Eq. indicates that the 1-D linear prediction f^(x, y) is a function of the previous pixels
on the current line alone. In 2-D predictive coding, the prediction is a function of the previous
pixels in a left-to-right, top-to-bottom scan of an image. In the 3-D case, it is based on these
pixels and the previous pixels of preceding frames. Equation above cannot be evaluated for the
first m pixels of each line, so these pixels must be coded by using other means (such as a
Huffman code) and considered as an overhead of the predictive coding process. A similar
comment applies to the higher-dimensional cases.
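
A first-order (m = 1, α1 = 1) version of this scheme, i.e., previous-pixel prediction along a row, can be sketched in Python as follows; the first pixel of each row is carried as overhead, as noted above.

import numpy as np

def predictive_encode_row(row, alpha=1.0):
    # Lossless predictive coding of one image row with a first-order linear predictor.
    row = row.astype(np.int32)
    errors = np.empty_like(row)
    errors[0] = row[0]                                  # overhead: first pixel sent as-is
    for n in range(1, len(row)):
        prediction = int(round(alpha * row[n - 1]))     # f^_n
        errors[n] = row[n] - prediction                 # e_n = f_n - f^_n
    return errors

def predictive_decode_row(errors, alpha=1.0):
    row = np.empty_like(errors)
    row[0] = errors[0]
    for n in range(1, len(errors)):
        prediction = int(round(alpha * row[n - 1]))
        row[n] = errors[n] + prediction                 # f_n = e_n + f^_n
    return row

row = np.array([100, 102, 103, 103, 180, 181], dtype=np.uint8)
e = predictive_encode_row(row)
assert np.array_equal(predictive_decode_row(e), row)    # exactly reversible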

9. Explain about lossy predictive coding.

Lossy Predictive Coding:

In this type of coding, we add a quantizer to the lossless predictive model and examine the
resulting trade-off between reconstruction accuracy and compression performance. As Fig.9
shows, the quantizer, which absorbs the nearest integer function of the error-free encoder, is
inserted between the symbol encoder and the point at which the prediction error is formed. It
maps the prediction error into a limited range of outputs, denoted e^n, which establish the amount
of compression and distortion associated with lossy predictive coding.

Fig. 9 A lossy predictive coding model: (a) encoder and (b) decoder.

In order to accommodate the insertion of the quantization step, the error-free encoder of figure
must be altered so that the predictions generated by the encoder and decoder are equivalent. As
Fig.9 (a) shows, this is accomplished by placing the lossy encoder's predictor within a feedback
loop, where its input, denoted f˙n, is generated as a function of past predictions and the
corresponding quantized errors. That is,

f˙n = e˙n + f^n

This closed loop configuration prevents error buildup at the decoder's output. Note from Fig. 9
(b) that the output of the decoder also is given by the above Eqn.
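
A sketch of this closed loop in Python (illustrative only; a uniform quantizer of step size 8 stands in for whatever quantizer an actual system would use):

import numpy as np

def dpcm_encode_decode(row, step=8, alpha=1.0):
    # Lossy DPCM on one row with the quantizer inside the prediction feedback loop,
    # so encoder and decoder form predictions from the same reconstructed values.
    row = row.astype(np.float64)
    recon = np.empty_like(row)                 # f'_n, the reconstructed pixels
    q_err = np.empty_like(row)                 # e^_n, the quantized prediction errors
    prev = 0.0
    for n in range(len(row)):
        prediction = alpha * prev              # f^_n from the past reconstruction
        e = row[n] - prediction                # prediction error e_n
        e_q = step * np.round(e / step)        # quantized error (uniform quantizer)
        q_err[n] = e_q
        recon[n] = prediction + e_q            # f'_n = e^_n + f^_n
        prev = recon[n]
    return q_err, recon

row = np.array([100, 102, 103, 103, 180, 181], dtype=np.uint8)
q_err, recon = dpcm_encode_decode(row)
print(np.abs(recon - row).max())               # per-pixel distortion stays within half the quantizer step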

Optimal predictors:

The optimal predictor used in most predictive coding applications minimizes the encoder's mean-square
prediction error

E{en^2} = E{[fn - f^n]^2}

subject to the constraints that

f˙n = e˙n + f^n ≈ en + f^n = fn

and

f^n = Σ (i = 1 to m) αi fn-i

That is, the optimization criterion is chosen to minimize the mean-square prediction error, the
quantization error is assumed to be negligible (e˙n ≈ en), and the prediction is constrained to a
linear combination of m previous pixels. These restrictions are not essential, but they simplify
the analysis considerably and, at the same time, decrease the computational complexity of the
predictor. The resulting predictive coding approach is referred to as differential pulse code
modulation (DPCM).


10. Explain with a block diagram about transform coding system.

Transform Coding:

All the predictive coding techniques operate directly on the pixels of an image and thus are
spatial domain methods. In this coding, we consider compression techniques that are based on
modifying the transform of an image. In transform coding, a reversible, linear transform (such as
the Fourier transform) is used to map the image into a set of transform coefficients, which are
then quantized and coded. For most natural images, a significant number of the coefficients have
small magnitudes and can be coarsely quantized (or discarded entirely) with little image
distortion. A variety of transformations, including the discrete Fourier transform (DFT), can be
used to transform the image data.

Fig. 10 A transform coding system: (a) encoder; (b) decoder.

Figure 10 shows a typical transform coding system. The decoder implements the inverse
sequence of steps (with the exception of the quantization function) of the encoder, which
performs four relatively straightforward operations: subimage decomposition, transformation,
quantization, and coding. An N X N input image first is subdivided into subimages of size n X n,
which are then transformed to generate (N/n)^2 subimage transform arrays, each of size n X n.
The goal of the transformation process is to decorrelate the pixels of each subimage, or to pack
as much information as possible into the smallest number of transform coefficients. The
quantization stage then selectively eliminates or more coarsely quantizes the coefficients that
carry the least information. These coefficients have the smallest impact on reconstructed
subimage quality. The encoding process terminates by coding (normally using a variable-length
code) the quantized coefficients. Any or all of the transform encoding steps can be adapted to local image content, called adaptive transform coding, or fixed for all subimages, called
nonadaptive transform coding.
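
As a toy illustration of the encoder's transform and quantization stages, the Python sketch below uses an 8 x 8 discrete cosine transform (a common choice of reversible linear transform) and simply zeroes all but the largest-magnitude coefficients of each subimage; SciPy is assumed to be available.

import numpy as np
from scipy.fft import dctn, idctn

def transform_code_block(block, keep=8):
    # Toy coder for one n x n subimage: forward 2-D DCT, keep only the 'keep'
    # largest-magnitude coefficients (the rest are truncated to zero), inverse DCT.
    coeffs = dctn(block.astype(np.float64), norm="ortho")
    threshold = np.sort(np.abs(coeffs).ravel())[-keep]     # magnitude of the keep-th largest coefficient
    mask = np.abs(coeffs) >= threshold
    return idctn(coeffs * mask, norm="ortho")              # reconstructed subimage

block = np.tile(np.arange(8, dtype=np.float64), (8, 1))    # smooth 8 x 8 test subimage
approx = transform_code_block(block, keep=4)
print(np.max(np.abs(block - approx)))                      # residual error from the discarded coefficients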

11. Explain about wavelet coding.


Wavelet Coding:

The wavelet coding is based on the idea that the coefficients of a transform that decorrelates the
pixels of an image can be coded more efficiently than the original pixels themselves. If the
transform's basis functions—in this case wavelets—pack most of the important visual
information into a small number of coefficients, the remaining coefficients can be quantized
coarsely or truncated to zero with little image distortion.

Figure 11 shows a typical wavelet coding system. To encode a 2^J X 2^J image, an analyzing
wavelet, Ψ, and minimum decomposition level, J - P, are selected and used to compute the
image's discrete wavelet transform. If the wavelet has a complementary scaling function φ, the
fast wavelet transform can be used. In either case, the computed transform converts a large
portion of the original image to horizontal, vertical, and diagonal decomposition coefficients
with zero mean and Laplacian-like distributions.

Fig.11 A wavelet coding system: (a) encoder; (b) decoder.

Since many of the computed coefficients carry little visual information, they can be quantized
and coded to minimize intercoefficient and coding redundancy. Moreover, the quantization can
be adapted to exploit any positional correlation across the P decomposition levels. One or more
of the lossless coding methods, including run-length, Huffman, arithmetic, and bit-plane coding,
can be incorporated into the final symbol coding step. Decoding is accomplished by inverting the
encoding operations—with the exception of quantization, which cannot be reversed exactly.

The principal difference between the wavelet-based system and the transform coding system
described earlier is the omission of the transform coder's subimage processing stages.
Because wavelet transforms are both computationally efficient and inherently local (i.e., their
basis functions are limited in duration), subdivision of the original image is unnecessary.
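
A minimal sketch of this pipeline, assuming the PyWavelets package is available; hard thresholding of the detail coefficients stands in for the quantizer.

import numpy as np
import pywt                                    # PyWavelets (assumed to be installed)

def wavelet_code(image, wavelet="haar", level=2, threshold=10.0):
    # Toy wavelet coder: multi-level 2-D DWT, discard small detail coefficients
    # (a crude stand-in for quantization), then invert the transform.
    coeffs = pywt.wavedec2(image.astype(np.float64), wavelet, level=level)
    kept = [coeffs[0]]                         # approximation coefficients are left untouched
    for cH, cV, cD in coeffs[1:]:              # horizontal, vertical, diagonal details per level
        kept.append(tuple(np.where(np.abs(c) > threshold, c, 0.0) for c in (cH, cV, cD)))
    return pywt.waverec2(kept, wavelet)

img = np.random.randint(0, 256, (64, 64)).astype(np.float64)
rec = wavelet_code(img)
print(np.abs(rec - img).max())                 # distortion introduced by discarding small coefficients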
