
IMAGE PROCESSING ADS234051

Module 1
Digital Image Fundamentals

1 Digital Image
An image may be defined as a two-dimensional function, f (x, y), where x and y are spatial (plane) coordi-
nates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity of the image at that
point. When x, y, and the intensity values of f are all finite, discrete quantities, we call the image a digital
image.

Figure 1: A Digital Image - 2D function.

A digital image is composed of a finite number of elements, each of which has a particular location and
value. These elements are called picture elements, image elements, pels, and pixels.
A pixel is the smallest resolvable unit of a digital image or display.

1.1 Digital Image Processing


Digital image processing refers to processing digital images by means of a digital computer.
However, unlike humans, who are limited to the visual band of the electromagnetic (EM) spectrum, imag-
ing machines cover almost the entire EM spectrum, ranging from gamma to radio waves, ultrasound, electron
microscopy, and computer-generated images. Thus, digital image processing encompasses a wide and varied
field of applications.

There are no clear-cut boundaries in the continuum from image processing at one end to computer vision
at the other. However, one useful paradigm is to consider three types of computerized processes in this
continuum: low-, mid-, and high-level processes.

Low-level processes involve primitive operations such as image preprocessing to reduce noise, contrast
enhancement, and image sharpening. A low-level process is characterized by the fact that both its inputs
and outputs are images.
Mid-level processing of images involves tasks such as segmentation (partitioning an image into regions
or objects), description of those objects to reduce them to a form suitable for computer processing, and
classification (recognition) of individual objects. A mid-level process is characterized by the fact that its

inputs generally are images, but its outputs are attributes extracted from those images (e.g., edges, contours,
and the identity of individual objects)
High-level processing involves making sense of an ensemble of recognized objects, as in image analysis,
and, at the far end of the continuum, performing the cognitive functions normally associated with human
vision.
On the other hand, there are fields such as computer vision whose ultimate goal is to use computers
to emulate human vision, including learning and being able to make inferences and take actions based on
visual inputs. This area itself is a branch of artificial intelligence (AI) whose objective is to emulate human
intelligence. The field of AI is still at an early stage of development, and progress has been much slower than originally anticipated. The area of image analysis (also called image understanding)
is in between image processing and computer vision.

1.2 Fundamental Steps in Digital Image Processing


The diagram does not imply that every process is applied to an image. Rather, the intention is to convey
an idea of all the methodologies that can be applied to images for different purposes

Figure 2: Fundamental steps in digital image processing.

1. Image acquisition Acquisition could be as simple as being given an image that is already in digital
form. Generally, the image acquisition stage involves preprocessing, such as scaling.
2. Image enhancement is the process of manipulating an image so the result is more suitable than the
original for a specific application. (The word specific is important; for example, a method that is quite
useful for enhancing X-ray images may not be the best approach for enhancing satellite images taken
in the infrared band of the electromagnetic spectrum.)
There is no general theory of image enhancement. When an image is processed for visual interpretation,
the viewer is the ultimate judge of how well a particular method works.
3. Image restoration is an area that also deals with improving the appearance of an image. However,
unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration

techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a good enhancement result.
4. Color image processing covers the various color models and basic color processing in the digital domain. Color is also used as the basis for extracting features of interest in an image.
5. Wavelets are the foundation for representing images in various degrees of resolution and can be
used for various applications such as image data compression, pyramidal representation and other
transformations
6. Compression deals with techniques for reducing the storage required to save an image, or the band-
width required to transmit it. Although storage technology has improved significantly over the past
decade, the same cannot be said for transmission capacity. This is true particularly in uses of the
internet, which are characterized by significant pictorial content.
7. Morphological processing deals with tools for extracting image components that are useful in the
representation and description of shape.
8. Segmentation partitions an image into its constituent parts or objects. In general, autonomous
segmentation is one of the most difficult tasks in digital image processing. Weak or erratic segmentation
algorithms almost always guarantee eventual failure of imaging problems that require objects to be
identified individually.
9. Feature extraction almost always follows the output of a segmentation stage. Feature extraction
consists of feature detection and feature description. Feature detection refers to finding the features
in an image, region, or boundary. Feature description assigns quantitative attributes to the detected
features. For example, we might detect corners in a region, and describe those corners by their
orientation and location; both of these descriptors are quantitative attributes.

10. Image pattern classification is the process that assigns a label (e.g., vehicle) to an object based on its feature descriptors.
Methods of image pattern classification range from classical approaches such as minimum-distance,
correlation, and Bayes classifiers, to more modern approaches implemented using deep neural networks
such as convolutional neural networks.

11. Knowledge Prior knowledge about a problem domain is coded into an image processing system in the form of a knowledge database.
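
As a hedged illustration of how a few of these steps can be chained in practice, the minimal sketch below uses OpenCV (mentioned later in these notes). The file name "coins.png" and the parameter choices are assumptions made only for the example, not part of the original material.

import cv2

# 1. Image acquisition: read an image that is already in digital form (grayscale).
img = cv2.imread("coins.png", cv2.IMREAD_GRAYSCALE)

# 2. Image enhancement: histogram equalization to improve contrast.
enhanced = cv2.equalizeHist(img)

# 8. Segmentation: Otsu thresholding partitions the image into foreground and background.
_, mask = cv2.threshold(enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 9. Feature extraction: detect object contours and describe each one by area and centroid.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    m = cv2.moments(c)
    if m["m00"] > 0:
        cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
        print(f"object at ({cx:.1f}, {cy:.1f}), area = {m['m00']:.0f}")

Each numbered comment refers to the corresponding step above; the remaining steps (restoration, wavelets, compression, morphology, classification) would slot into the same pipeline as the application requires.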

1.3 Components of an Image Processing System


In the mid-1980s, image processing systems were sold as rather substantial peripheral devices that attached to host computers. In the late 1990s and early 2000s, a new class of add-on boards, called graphics processing units (GPUs), was introduced for work on 3-D applications, such as games and other 3-D graphics applications. It was not long before GPUs found their way into image processing applications involving large-scale matrix implementations, such as training deep convolutional neural networks.

The trend continues toward miniaturizing and blending of general-purpose small computers with special-
ized image processing hardware and software.

Figure 3 shows the basic components comprising a typical general-purpose system used for digital image
processing.

1. Image Sensor
Two subsystems are required to acquire digital images.

(a) Sensor: that responds to the energy radiated by the object we wish to image.
(b) Digitizer: a device for converting the output of the physical sensing device into digital form

2. Specialized image processing hardware


usually consists of the digitizer just mentioned, plus hardware for fast processing of primitive operations

Figure 3: Components of an Image Processing System

(a) Digitizer: converts the output of the physical sensing device into digital form. For instance, in a digital video camera, the sensors (CCD chips) produce an electrical output proportional to light intensity, and the digitizer converts these outputs to digital data.
(b) Hardware: performs other primitive operations, such as an arithmetic logic unit (ALU). An ALU is used, for example, in averaging images as quickly as they are digitized, for the purpose of noise reduction. This type of hardware sometimes is called a front-end subsystem, and its most distinguishing characteristic is speed. In other words, this unit performs functions that require fast data throughputs (e.g., digitizing and averaging video images at 30 frames/s) that the typical main computer cannot handle.

3. The computer
can range from a PC to a supercomputer

• In dedicated applications, sometimes custom computers are used to achieve a required level of
performance

4. Software
for image processing consists of specialized modules that perform specific tasks

• Sophisticated software packages allow the integration of those modules with general-purpose software commands, e.g., OpenCV, scikit-image, Matplotlib, and more.

5. Mass Storage
An image of size 1024 x 1024 pixels, in which the intensity of each pixel is an 8-bit quantity, requires
one megabyte of storage space if the image is not compressed. When dealing with image databases
that contain thousands, or even millions, of images, providing adequate storage in an image processing
system can be a challenge.
Digital storage for image processing applications falls into three principal categories:

(a) short-term storage: during processing


(b) on-line storage: for relatively fast recall


(c) archival storage: characterized by infrequent access.

6. Image Displays
in the form of
(a) Monitors: driven by the outputs of image and graphics display cards that are an integral part
of the computer system
(b) Stereo displays: implemented in the form of headgear containing two small displays embedded
in goggles worn by the user.
7. Hardcopy devices
for recording images
• laser printers, film cameras, heat-sensitive devices, ink-jet units, and digital units, such as optical and CD-ROM disks.
8. Networking and cloud communication
for transmitting images and related data to remote sites
• Because of the large amount of data inherent in image processing applications, the key consider-
ation in image transmission is bandwidth. In dedicated networks, this typically is not a problem
• but communications with remote sites via the internet are not always as efficient.

2 Elements of visual perception


Although the field of digital image processing is built on a foundation of mathematics, human intuition and
analysis often play a role in the choice of one technique versus another, and this choice often is made based
on subjective, visual judgments. Thus, developing an understanding of the basic characteristics of human visual perception is an important first step in image processing.

2.1 Structure of the Human Eye

Figure 4: Simplified diagram of a cross section of the human eye.


The eye is nearly a sphere (with a diameter of about 20 mm) enclosed by three membranes
1. The cornea and sclera (the outer cover)

• The cornea is a tough, transparent tissue that covers the front surface of the eye
• Continuous with the cornea, the sclera is an opaque membrane that encloses the remainder of the
optic globe.
2. The choroid, the ciliary body, and the iris, along with the lens

• The choroid lies directly below the sclera. This membrane contains a network of blood vessels
that serve as the major source of nutrition to the eye.
• The choroid coat is heavily pigmented, which helps reduce extraneous light entering the eye and the optic globe.
• At its front end, the choroid is divided into the ciliary body and the iris
• The iris contracts or expands to control the amount of light that enters the eye.
• The lens consists of concentric layers of fibrous cells and is suspended by fibers that attach to the
ciliary body. It is composed of 60% to 70% water
3. The retina is the innermost membrane

• The innermost membrane of the eye is the retina.


• When the eye is focused, light from an object is imaged on the retina. Pattern vision is afforded by
discrete light receptors distributed over the surface of the retina. There are two types of receptors:
– cones:
There are between 6 and 7 million cones in each eye, located primarily in the central portion of
the retina, called the fovea, and are highly sensitive to color. Cone vision is called photopic
or bright-light vision.
– rods:
Some 75 to 150 million rods are distributed over the retinal surface. Several rods are connected to a single nerve ending, which reduces the amount of detail discernible by these receptors.
Rods capture an overall image of the field of view. They are not involved in color vision, and
are sensitive to low levels of illumination.
Rod Vision is known as scotopic or dim-light vision.

Figure 5: Distribution of rods and cones in the retina.

Figure 5 shows the density of rods and cones for a cross section of the right eye. The absence of receptors in the region where the optic nerve exits the eye causes the blind spot. Except for this region, the distribution of receptors is radially symmetric about the fovea.


2.2 Image formation in the eye

Figure 6: Graphical representation of the eye looking at a palm tree. Point C is the focal center of the lens.

In an ordinary photographic camera, the lens has a fixed focal length. Focusing at various distances is
achieved by varying the distance between the lens and the imaging plane, where the film (or imaging chip
in the case of a digital camera) is located.

In the human eye, the converse is true; the distance between the center of the lens and the imaging sensor
(the retina) is fixed [approximately 17 mm], and the focal length needed to achieve proper focus is obtained
by varying the shape of the lens.

The fibers in the ciliary body accomplish this by flattening or thickening the lens for distant or near
objects, respectively. The range of focal lengths is approximately 14 mm to 17 mm, the latter taking place
when the eye is relaxed and focused at distances greater than about 3 m.

For example, suppose that a person is looking at a tree 15 m high at a distance of 100 m. Letting h
denote the height of that object in the retinal image, the geometry of Fig. 6 yields
15/100 = h/17, which gives h = 15 × 17/100 = 2.55 mm.

2.3 Brightness Adaptation


Because digital images are displayed as sets of discrete intensities, the eye's ability to discriminate between different intensity levels is an important consideration in presenting image processing results.
The range of light intensity levels to which the human visual system can adapt is enormous, on the order of 10^10, from the scotopic threshold to the glare limit.
Figure 7 shows a plot of light intensity versus subjective brightness (intensity as perceived by the
human visual system).
• Subjective brightness is a logarithmic function of the light intensity incident on the eye.
• The long solid curve represents the range of intensities to which the visual system can adapt.
• In photopic vision alone, the range is about 10^6.
• The visual system cannot operate over such a range simultaneously. Rather, it accomplishes this large
variation by changing its overall sensitivity, a phenomenon known as brightness adaptation.
• The total range of intensities the eye can discriminate simultaneously is rather small. For a given set of conditions, the current sensitivity level of the visual system is called the brightness adaptation level.
• For example, if Ba is the current brightness adaptation level, the short intersecting curve represents the range of subjective brightness that the eye can perceive at this adaptation level.
• This range is restricted at the lower end by the level Bb, below which all stimuli are perceived as indistinguishable blacks.


Figure 7: Range of subjective brightness for a particular adaptation level, Ba.

2.4 Brightness Discrimination

Figure 8: Basic experimental setup used to characterize brightness discrimination.

• The ability of the eye to discriminate between changes in light intensity at any specific adaptation level can be determined by a classic experiment of having a subject look at a flat, uniformly illuminated area large enough to occupy the entire field of view.

• This area typically is a diffuser, such as opaque glass, illuminated from behind by a light source of intensity I.
• To this field an increment of illumination, ∆I, is added in the form of a short-duration flash that appears as a circle in the center of the field.
• If ∆I is not bright enough, there will be no perceivable change.
• When ∆I is strong enough, the flash is perceived.


• The quantity ∆Ic /I, where ∆Ic is the increment of illumination discriminable 50% of the time with background illumination I, is called the Weber ratio.
• A small value of ∆Ic /I means that a small percentage change in intensity is discriminable. This represents good brightness discrimination.
• Conversely, a large value of ∆Ic /I means that a large percentage change in intensity is required for the eye to detect the change. This represents poor brightness discrimination.
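
A quick worked example (the numbers are assumed for illustration only): if the background illumination is I = 100 units and the smallest flash increment discriminable 50% of the time is ∆Ic = 2 units, the Weber ratio is ∆Ic /I = 2/100 = 0.02; that is, a 2% change in intensity is detectable, which indicates good brightness discrimination at that adaptation level.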


3 A simple image formation model


An image is denoted as a two-dimensional function of the form f (x, y).

Figure 9: An example of digital image acquisition. (a) Illumination (energy) source. (b) A scene. (c) Imaging
system. (d) Projection of the scene onto the image plane. (e) Digitized image.

Since the values of an image generated are proportional to energy radiated by a physical source (e.g.,
electromagnetic waves), f (x, y) must be nonnegative and finite; that is,

0 ≤ f (x, y) < ∞
The function f (x, y) is characterized by two components:

1. The amount of source illumination incident on the scene: the illumination component i(x, y)
2. The amount of illumination reflected by the objects in the scene: the reflectance component r(x, y)

The two functions combine as a product to form f (x, y):

f (x, y) = i(x, y) r(x, y) (1)


where

0 ≤ i(x, y) < ∞ (2)


and

0 ≤ r(x, y) < 1 (3)


where i(x, y) depends on the illumination source and r(x, y) on the reflectance characteristics of the imaged objects, bounded by 0 (total absorption) and 1 (total reflectance).
Let the intensity level of a monochrome (black and white or grayscale) image at any coordinates (x, y)
be denoted by

l = f (x, y) (4)
it is evident that l lies in the range

Lmin ≤ l ≤ Lmax


In Practice,

Lmin = imin rmin , Lmax = imax rmax

The interval [Lmin , Lmax ] is called the intensity (or gray) scale.


This interval is shifted numerically to [0, L − 1], where l = 0 is considered black and l = L − 1 is considered white on the gray scale.
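
A minimal sketch of the image formation model in Eq. (1), using NumPy; the illumination and reflectance arrays below are synthetic values chosen only for illustration.

import numpy as np

# Illumination component i(x, y): nonnegative, here a smooth gradient of source intensity.
i = np.linspace(50.0, 200.0, 16).reshape(4, 4)

# Reflectance component r(x, y): values between 0 (total absorption) and 1 (total reflectance).
r = np.array([[0.1, 0.3, 0.5, 0.9]] * 4)

# Image formation model, Eq. (1): f(x, y) = i(x, y) * r(x, y)
f = i * r

# Shift and scale the result to the discrete range [0, L-1] with L = 256 gray levels.
L = 256
f_scaled = np.round((f - f.min()) / (f.max() - f.min()) * (L - 1)).astype(np.uint8)
print(f_scaled)   # l = 0 is black, l = L-1 = 255 is white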

4 Image Sampling and Quantization


To create a digital image, we need to convert the continuous sensed data into a digital format. This requires
two processes: sampling and quantization.

Figure 10: (a) Continuous image. (b) A scan line showing intensity variations along line AB in the continuous
image. (c) Sampling and quantization. (d) Digital scan line. (The black border in (a) is included for clarity.
It is not part of the image).

Figure 10 shows a continuous image f that we want to convert to digital form. An image is continuous
with respect to the x- and y-coordinates, and also in amplitude. To digitize it, we have to sample the function
in both coordinates and also in amplitude.
Digitizing the coordinate values is called sampling.
Digitizing the amplitude values is called quantization.

The basic idea behind sampling and quantization:


• The one-dimensional function in Figure 10b is a plot of amplitude (intensity level) values of the con-
tinuous image along the line segment AB
• To sample this function, we take equally spaced samples along line AB, as shown in Figure 10c. The
samples are shown as small dark squares superimposed on the function


• However, the values of the samples still span (vertically) a continuous range of intensity values.
• In order to form a digital function, the intensity values also must be converted (quantized) into discrete
quantities.
• The vertical gray bar in Figure 10c depicts the intensity scale divided into eight discrete intervals,
ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the
eight intensity intervals.
• The continuous intensity levels are quantized by assigning one of the eight values to each sample, depending on the vertical proximity of the sample to a tick mark.
• The digital samples resulting from both sampling and quantization are shown as white squares in
Figure 10d
• Starting at the top of the continuous image and carrying out this procedure downward, line by line,
produces a two-dimensional digital image.
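
The procedure described above can be sketched in a few lines of NumPy; the 1-D profile below is a synthetic stand-in for the intensity values along line AB, and the choice of 32 samples and 8 levels is an assumption made for illustration.

import numpy as np

# Synthetic "continuous" intensity profile along a scan line (stand-in for line AB).
t = np.linspace(0.0, 1.0, 1000)
profile = 0.5 + 0.4 * np.sin(6 * np.pi * t) * np.exp(-2 * t)   # values stay within (0, 1)

# Sampling: take N equally spaced samples along the line.
N = 32
samples = profile[:: len(profile) // N][:N]

# Quantization: map each sample to one of L = 8 discrete intensity levels.
L = 8
digital_line = np.clip(np.round(samples * (L - 1)), 0, L - 1).astype(int)
print(digital_line)   # the digital scan line: integers in the range [0, 7]

Repeating this for every scan line of the continuous image yields the two-dimensional digital image.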

4.1 Representing Digital Images


Let f (s, t) represent a continuous image function of two continuous variables, s and t. We convert this
function into a digital image by sampling and quantization, as explained in the previous section.

Thus the digital image f (x, y), resulting from sampling and quantization is a 2D matrix of real numbers,
which has M rows and N columns.
For notational clarity and convenience, we use integer values for these discrete coordinates: x = 0, 1,
2,. . . ,M -1 and y = 0, 1, 2,. . . , N -1.
The value of the digital image at the origin is f (0, 0), and its value at the next coordinate along the first row is f (0, 1). The value of a digital image at any coordinates (x, y) is denoted f (x, y). The section of the plane spanned by the coordinates of an image is called the spatial domain, with x and y being referred to as spatial variables or spatial coordinates.
Using this notation, the digital image is represented in matrix form as

f(x, y) = [ f(0, 0)       f(0, 1)       ...   f(0, N-1)
            f(1, 0)       f(1, 1)       ...   f(1, N-1)
            ...           ...                 ...
            f(M-1, 0)     f(M-1, 1)     ...   f(M-1, N-1) ]

Here, the right-hand side is the digital image; each element of the matrix is a pixel (pel).

Image digitization requires that decisions be made regarding the values for M, N, and for the number, L,
of discrete intensity levels.
There are no restrictions placed on M and N, other than that they must be positive integers. However, digital storage and quantizing hardware considerations usually lead to the number of intensity levels, L, being an integer power of two; that is,

L = 2^k    (5)

where k is an integer. We assume that the discrete levels are equally spaced and that they are integers in the range [0, L − 1]. Sometimes, the range of values spanned by the gray scale is referred to as the dynamic range.
The number, b, of bits required to store a digital image is

b = M × N × k    (6)

When M = N, this equation becomes


b = N^2 × k    (7)
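
A quick worked check of Eqs. (5)-(7), using the 1024 × 1024, 8-bit example quoted in the previous section:

M = N = 1024          # image dimensions from the example above
k = 8                 # bits per pixel
L = 2 ** k            # 256 gray levels, Eq. (5)
b = N ** 2 * k        # storage in bits, Eq. (7): 8,388,608 bits
print(L, b, b // 8)   # 256 8388608 1048576  -> 1,048,576 bytes = 1 MB uncompressed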
Figure 11 shows three ways of representing f (x, y). Figure 11(a) is a plot of the function, with two axes determining spatial location and the third axis being the values of f as a function of x and y. This


Figure 11: (a) Image plotted as a surface. (b) Image displayed as a visual intensity array. (c) Image shown
as a 2-D numerical array. (The numbers 0, .5, and 1 represent black, gray, and white, respectively.)

representation is useful when working with grayscale sets whose elements are expressed as triplets of the
form (x, y,z), where x and y are spatial coordinates and z is the value of f at coordinates (x, y).

The representation in Figure 11b is more common, and it shows f (x, y) as it would appear on a computer
display or photograph.

Figure 11c shows, the third representation as an array (matrix) composed of the numerical values of f
(x, y).


5 Relationships between Pixels


In this section, we discuss several important relationships between pixels in a digital image. When referring
in the following discussion to particular pixels, we use lowercase letters, such as p and q.

5.1 Neighbors of a Pixel


A pixel p at coordinates (x, y) has two horizontal and two vertical neighbors with coordinates

(x + 1, y), (x − 1, y), (x, y + 1), (x, y − 1)

This set of pixels, called the 4-neighbors of p, is denoted N4 (P ).

The four diagonal neighbors of p have coordinates


(x + 1, y + 1), (x + 1, y − 1), (x − 1, y + 1), (x − 1, y − 1)
and are denoted ND (P ).

These neighbors, together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8 (P )
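
A minimal sketch (not part of the notes) that returns the neighbor coordinate sets defined above; image border effects are ignored for simplicity.

def n4(x, y):
    """4-neighbors N4(p) of the pixel p at (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """Diagonal neighbors ND(p)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """8-neighbors N8(p): the union of N4(p) and ND(p)."""
    return n4(x, y) + nd(x, y)

print(n8(2, 3))   # the eight coordinate pairs surrounding (2, 3)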

5.2 Adjacency
Let V be the set of intensity values used to define adjacency. For example, if we are dealing with the
adjacency of pixels whose values are in the range 0 to 255, set V could be any subset of these 256 values.
We consider three types of adjacency:

1. 4-adjacency:
Two pixels p and q with values from V are 4-adjacent if q is in the set N4 (P )
2. 8-adjacency:
Two pixels p and q with values from V are 8-adjacent if q is in the set N8 (P )
3. m-adjacency:
(also called mixed adjacency). Two pixels p and q with values from V are m-adjacent if

(a) q is in N4 (p), or
(b) q is in ND (p) and the set N4 (p) ∩ N4 (q) has no pixels whose values are from V.
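
A minimal sketch of 4- and 8-adjacency tests on a NumPy image; the binary value set V = {1} is an assumption made only for the example.

import numpy as np

def is_4_adjacent(img, p, q, V=frozenset({1})):
    """True if p and q both have values in V and q is in N4(p)."""
    (x, y), (u, v) = p, q
    if img[x, y] not in V or img[u, v] not in V:
        return False
    return abs(x - u) + abs(y - v) == 1            # horizontal or vertical neighbor

def is_8_adjacent(img, p, q, V=frozenset({1})):
    """True if p and q both have values in V and q is in N8(p)."""
    (x, y), (u, v) = p, q
    if img[x, y] not in V or img[u, v] not in V:
        return False
    return max(abs(x - u), abs(y - v)) == 1        # includes the diagonal neighbors

img = np.array([[1, 1, 0],
                [0, 1, 0],
                [0, 0, 1]])
print(is_4_adjacent(img, (0, 0), (0, 1)))   # True
print(is_8_adjacent(img, (1, 1), (2, 2)))   # True (diagonal neighbor)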

5.3 Path
A digital path (or curve) from pixel p with coordinates (x0, y0) to pixel q with coordinates (xn, yn) is a sequence of distinct pixels with coordinates (x0, y0), (x1, y1), . . . , (xn, yn), where points (xi, yi) and (xi−1, yi−1) are adjacent. We can define 4-, 8-, or m-paths, depending on the type of adjacency specified.

5.4 Connectivity
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there
exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of pixels that are
connected to it in S is called a connected component of S. If S has only one connected component, then S is called a connected set.
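
A minimal sketch (not from the notes) that extracts connected components from a binary image; scipy.ndimage.label is one readily available implementation, and the 3 × 3 structuring element below selects 8-connectivity.

import numpy as np
from scipy import ndimage

img = np.array([[1, 1, 0, 0],
                [0, 1, 0, 1],
                [0, 0, 0, 1],
                [1, 0, 0, 0]])

# 8-connectivity: a 3x3 structuring element of ones treats diagonal pixels as adjacent.
labels, num = ndimage.label(img, structure=np.ones((3, 3), dtype=int))
print(num)      # number of connected components among the foreground pixels (here 3)
print(labels)   # each foreground pixel carries the label of its connected component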


5.5 Region
Let R represent a subset of pixels in an image. We call R a region of the image if R is a connected set.
Two regions, Ri and Rj are said to be adjacent if their union forms a connected set. Regions that are not
adjacent are said to be disjoint.

5.6 Boundary
The boundary (also called the border or contour) of a region R is the set of pixels in R that are adjacent to
pixels in the complement of R. Stated another way, the border of a region is the set of pixels in the region
that have at least one background neighbor.

5.7 Edge
Edges in an image are a measure of gray-level discontinuity at a point.

5.8 Distance Measures


For pixels p, q, and s, with coordinates (x, y), (u,v), and (w,z), respectively, D is a distance function or
metric if

1. D(p, q) ≥ 0 (D(p, q) = 0 iff p = q),


2. D(p,q) = D(q, p), and
3. D(p, s) ≤ D(p, q) + D(q, s)

5.8.1 Euclidean distance


The Euclidean distance between p and q is defined as
De (p, q) = [(x − u)^2 + (y − v)^2 ]^(1/2)    (8)
For this distance measure, the pixels having a distance less than or equal to some value r from (x, y) are
the points contained in a disk of radius r centered at (x, y).

5.8.2 D4 distance
The D4 distance (called the city-block distance) between p and q is defined as

D4 (p, q) = |x − u| + |y − v| (9)
In this case, pixels having a D4 distance from (x, y) that is less than or equal to some value d form a diamond
centered at (x, y). For example, the pixels with D4 distance ≤ 2 from (x, y) (the center point) form the
following contours of constant distance
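
The corresponding array of constant-distance contours did not survive in these notes; the standard diamond for D4 ≤ 2 is reproduced below.

        2
      2 1 2
    2 1 0 1 2
      2 1 2
        2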


5.8.3 D8 distance
The D8 distance (called the chessboard distance) between p and q is defined as

D8 (p, q) = max(|x − u|, |y − v|) (10)


In this case, the pixels with D8 distance from (x, y) less than or equal to some value d form a square centered
at (x, y). For example, the pixels with D8 distance ≤ 2 form the following contours of constant distance:
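
The array for D8 ≤ 2 likewise did not survive extraction; the standard square of constant-distance contours is:

    2 2 2 2 2
    2 1 1 1 2
    2 1 0 1 2
    2 1 1 1 2
    2 2 2 2 2

A minimal sketch (not from the notes) computing the three distance measures of Eqs. (8)-(10) for a pair of example coordinates; the coordinate values are assumptions chosen for the example.

import math

def d_e(p, q):
    """Euclidean distance, Eq. (8)."""
    (x, y), (u, v) = p, q
    return math.sqrt((x - u) ** 2 + (y - v) ** 2)

def d4(p, q):
    """City-block distance, Eq. (9)."""
    (x, y), (u, v) = p, q
    return abs(x - u) + abs(y - v)

def d8(p, q):
    """Chessboard distance, Eq. (10)."""
    (x, y), (u, v) = p, q
    return max(abs(x - u), abs(y - v))

p, q = (2, 3), (5, 7)
print(d_e(p, q), d4(p, q), d8(p, q))   # 5.0 7 4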
