
Lecture 1: Introduction

Dr. Pratishtha Verma


Department of Computer Engineering
NIT Kurukshetra
Table of Contents
● What is Digital Image Processing (DIP)?
● The Origins of DIP
● DIP Applications
● Fundamental Steps in DIP
● Components of a DIP System
● Elements of Visual Perception
● Light and the Electromagnetic Spectrum
● Image Sensing and Acquisition
● Image Sampling and Quantization
What is Digital Image Processing (DIP)?

● What is an image?
- A two-dimensional function, f(x, y), with x and y as spatial coordinates, and the amplitude
of f at any pair of coordinates (x, y) as the intensity or gray level of the image at that point.
● When x, y, and the intensity values of f are all finite, discrete quantities, the image is a digital image.
● Digital Image Processing (DIP) is the processing of digital images by means of a digital computer.
● Each digital image is composed of a finite number of elements called picture elements,
image elements, pels, or pixels.
● Human Vision and Image Machines?
● How is DIP different from image analysis and computer vision?
- One common view: DIP is when both the input and output of an algorithm are images. Maybe!
What is Digital Image Processing (DIP)?
Better categorization:

-Low-level processes: both input and output are images

● Noise reduction
● Contrast enhancement
● Image sharpening

- Mid-level processes: inputs are images, outputs are image attributes

● Segmentation
● Classification of objects

- High-level processes: "making sense" of the ensemble of recognized objects, for performing
cognitive functions.

DIP consists of processes whose inputs and outputs are images, and of processes that extract attributes
from images.
The Origins of DIP
● The earliest applications of digital images were in the newspaper industry, when pictures
were sent by submarine cable across the Atlantic Ocean (between London and New York).
● The Bartlane cable picture transmission system, introduced in the 1920s, reduced the time needed to send an
image across the Atlantic from more than a week to about three hours.
● Specialized printing equipment coded pictures for transmission; the pictures were then reconstructed at
the receiving end.
● Early Bartlane systems coded images in five distinct levels of gray. By the
end of the 1920s this capability had increased to 15 levels.

Figure 1.1. A digital picture produced in 1921 from a coded tape by a telegraph printer with special typefaces. (McFarlane.)
The Origins of DIP
● The basis for modern digital computers dates back to the 1940s, with the introduction of the concepts of
memories that could hold programs and data, and of conditional branching.
● These two concepts are the foundation of CPU development. A series of key advances made
computers powerful enough for DIP:
- Invention of the transistor at Bell Labs, 1948;
- The COBOL (Common Business-Oriented Language) and FORTRAN (Formula Translator) programming languages, 1950s and 1960s;
- Invention of the Integrated Circuit (IC) at Texas Instruments, 1958;
- Operating systems, 1960s;
- Microprocessors by Intel, 1970s;
- The personal computer by IBM, 1981;
- Large-scale integration (LSI) and very-large-scale integration (VLSI) in the 1970s and 1980s;
- Ultra-large-scale integration (ULSI), present.
The Origins of DIP
● Early examples of DIP date back to the Jet Propulsion Laboratory in 1964, where
images of the Moon captured by Ranger 7 were processed.

● The invention of Computerized Axial Tomography (CAT), or Computerized
Tomography (CT), in the 1970s is one of the most important events in the
application of DIP to medical diagnosis.
DIP Applications

● The areas of application of DIP are numerous.


● To categorize them, it is convenient to group images according to their sources:
- Electromagnetic;
- Acoustic;
- Ultrasonic;
- Electronic;
- Synthetic.
DIP Applications

Electromagnetic (EM) waves can be viewed as propagating sinusoidal waves of varying wavelength, or as a stream
of massless particles travelling in a wavelike pattern at the speed of light.
* Each massless particle contains a certain amount, or bundle, of energy called a photon.
* Energy of a photon:
E = hf = hc/λ
h: Planck constant (6.63 × 10⁻³⁴ J·s)
f: frequency
c: speed of light (3 × 10⁸ m/s)
λ: wavelength of the photon
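As a quick check of this formula, the short Python sketch below computes the energy of a single photon; the 550 nm wavelength is an assumed example value, not from the slides.

# Photon energy: E = h*f = h*c / lambda (minimal sketch)
h = 6.63e-34        # Planck constant, J*s
c = 3.0e8           # speed of light, m/s

def photon_energy(wavelength_m):
    """Energy in joules of one photon of the given wavelength."""
    return h * c / wavelength_m

print(photon_energy(550e-9))   # ~3.6e-19 J for green light at 550 nm (assumed example)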
Fundamental Steps in DIP
● Acquisition could be as simple as being given an image that is already in digital
form. Generally, the image acquisition stage involves preprocessing, such as
scaling.
● Image enhancement is the process of manipulating an image so the result is
more suitable than the original for a specific application.
● Image restoration is an area that also deals with improving the appearance of an
image. However, unlike enhancement, which is subjective, image restoration is
objective, in the sense that restoration techniques tend to be based on
mathematical or probabilistic models of image degradation. Enhancement, on the
other hand, is based on human subjective preferences regarding what constitutes
a “good” enhancement result.
● Color image processing is an area that has been gaining in importance because
of the significant increase in the use of digital images over the internet. This covers
a number of fundamental concepts in color models and basic color processing in a
digital domain. Color is used also as the basis for extracting features of interest in
an image.
● Wavelets are the foundation for representing images in various degrees of
resolution.
● Compression, as the name implies, deals with techniques for reducing the
storage required to save an image, or the bandwidth required to transmit it.
● Morphological processing deals with tools for extracting image components
that are useful in the representation and description of shape.
● Segmentation partitions an image into its constituent parts or objects.

Components of a DIP System
Digital Imaging Systems

● Components of digital Imaging system


● Overview of the human visual system
● Overview of digital imaging components
● Digital Halftone process
● Overview of file formats
Overview of Digital imaging systems
A digital imaging system is a set of components used for storing, manipulating, and
transmitting images. This system includes hardware components for image acquisition,
processors, and input/output devices.

Imaging systems are classified as follows:
- Direct imaging systems
  - Serial acquisition systems
  - Parallel acquisition systems
- Indirect imaging systems
● In direct imaging systems, the acquired images are in a recognizable form. In indirect imaging
systems, data processing or reconstruction is required before the images can be produced for observation.

● Direct imaging systems are further divided into serial and parallel acquisition systems. In serial acquisition
systems, images are obtained by scanning. Serial acquisition imaging systems include
the scanning microdensitometer and the scanning confocal microscope. A scanning microdensitometer, for example,
scans the processed negatives of electron micrographs to provide a digital image.

● The human eye and digital cameras are examples of parallel direct imaging systems, in which parallel rays are
captured simultaneously.
The basic components of an imaging system are:

● Image sensors
● Image storage
● Image processors
● Image display devices
● Image processing software
Image sensors:

❏ The input images originating from a source are captured, stored, and processed by the digital imaging system.
❏ These images originate in different energy forms such as light, X-ray, infrared, radar, and acoustic energy.
❏ Normally, cameras or scanners are used to acquire images. These devices use image sensors that convert light
energy into an analog signal, which is then sampled, quantized, and stored.

Image storage:

❏ Imaging systems store images temporarily in the working memory using random access memory (RAM), or
permanently in magnetic media, such as floppy disks, hard disks, etc.

Image processors:

❏ Many image processing applications, such as airport baggage screening systems, are real-time applications with
deadlines. This means that the task should be completed with great precision and within the deadline. Therefore,
image processing programs should run fast; in addition, power requirements should be low and the overall
hardware should be maintainable.
❏ These demanding requirements call for specialized and dedicated hardware components. Imaging processors may
be general-purpose or dedicated processors. Dedicated processors are faster, more effective, and more efficient, as they
have parallel architectures that speed up processing.
Output Devices: Imaging systems often produce results through films, cathode ray tubes (CRTs), and printers.
Medical imaging systems still use films to store images, as films produce higher resolution.
CRTs and liquid crystal displays (LCDs) are commonly used to visualize the results of the image processing
workstation. Hard-copy devices such as printers and plotters are required to produce the results on a mass scale.

Image Processing Software: Image processing software requirements are complex, as imaging software is
expected to execute tasks quickly. Many software systems are available for implementing imaging applications. Image
processing software may use a general-purpose programming language such as Java or a specialized programming
environment such as MATLAB (MATrix LABoratory).
Physical Aspects of image acquisition
● Human beings perceive objects because of light. Light sources are of two types:
- Primary
- Secondary
Nature of Light
Light waves are a part of the electromagnetic spectrum and do not need a medium to travel
through. Light waves can travel through a vacuum with a velocity of 2.998 × 10⁸ m/s.
Light waves consist of both electric and magnetic fields. These fields vibrate at right angles to the
direction of movement of the wave, and also at right angles to each other.
Therefore, they can be described as transverse waves. A transverse wave is a moving wave whose
oscillations are perpendicular to the direction of energy transfer; that is, if a transverse
wave is moving in the x-direction, its oscillations are in the y-z plane.
Light can be interpreted as both a wave and a particle. The representation of light as a wave is often more
convenient because it helps us study the properties of light effectively. A sample light
wave is shown in Fig. 2.3.
Wavelength (λ): The distance between two successive crests or troughs in the direction of wave propagation.

Amplitude: The maximum distance an oscillation travels away from its horizontal axis.

Frequency (ν): The number of cycles per second, expressed in s⁻¹ or hertz (Hz).

Velocity of light: c = νλ

● Light exhibits a dual nature, that is, it shows the properties of waves as well as
particles. If light is treated as particles, it can be explained on the basis of
photons. Photons are packets of energy.

Energy: E = hν
Simple Image Model
Model construction requires parameters that are derived from the knowledge of the following factors:

1. Amount of light generated by the light source

2. Nature of the surface

3. Amount of light that is captured by the sensor

Similar to the power of a bulb, which is expressed in watts, the power of the light generated by a
light source is measured as luminous flux. Luminous flux refers to the amount of light energy generated by the
light source per unit time. The amount of luminous flux falling on a given area of a surface is called illuminance.
The other name for illuminance is luminous flux density, and the unit of illuminance is the lux.

Human beings can perceive only a very narrow range of wavelengths. The visible light range is 400–700 nm. This
range is called a spectral power distribution (SPD) or simply a spectrum. Hence, the energy
distribution of the light passing through a plane can be represented as L(x, y, t, λ), where x and y are
spatial coordinates, λ is the wavelength, and t is the time. If x, y, and t are assumed to be constant, this function
can be written as L(λ).

Light falling on an object may be reflected or refracted. Reflection is the phenomenon of bouncing back of the
incident light ray after hitting a surface. In refraction, the light ray bends as it passes into another medium; that is,
the wavelength and speed of the light ray change.
The light emitted by an object at a particular wavelength λ can be written as follows:

I(x, y, λ) = σ(x, y, λ) × L(λ)

For a perfectly absorbing (black) object, σ(x, y, λ) = 0, and for a perfect reflector it is 1.

Figure: Image formation model. Light from a source is reflected by an object as L(x, y, λ) and is captured either by the eye and interpreted by the brain, or by a video camera and processed by a computer.


Colour Fundamentals
● Colour is a complex phenomenon involving the physical properties of light and the
phenomenon of visual perception by the human brain. It can be explained using the
tristimulus theory.
● There are three types of cone cells in the retina, classified on the basis of their spectral
sensitivities as short (S), medium (M), and long (L) cones. The tristimulus theory states that
the retinal cones are sensitive to blue, green, and red light. For example, it has been
observed that red light strongly stimulates L cones, stimulates M cones less, and has virtually
no effect on S cones. This leads to the perception of the colour red.
● In addition, human beings possess the ability to perceive an object in the same colour
irrespective of different lighting conditions. This is called colour constancy.
● However, this is not possible in digital cameras where the colour of the object is influenced by
lighting conditions. This requires additional colour correction by means of white balance to
maintain the same colour. This phenomenon is called chromatic adaptation.
Thus, a colour image can be considered as a set of three monochrome images. Let the
sensitivity functions of the three primary colours red, green, and blue be Sr(λ), Sg(λ), and Sb(λ),
respectively.

Then the colour image can be described as a set of three responses, each obtained by integrating the incoming
spectral distribution against the corresponding sensitivity function:

R = ∫ Sr(λ) I(λ) dλ,  G = ∫ Sg(λ) I(λ) dλ,  B = ∫ Sb(λ) I(λ) dλ

The combination of these three RGB responses can yield any colour. This is Grassmann's axiom: a mixture of no
more than three primary colours is needed to create any colour. When light is reflected by an object uniformly
across wavelengths, the object is perceived as white. Spectral densities of a few colours are shown in the figure below.

The x-axis denotes wavelength (λ) whose unit is nm and the y-axis denotes the perception S(λ). The area
under the density curve indicates the total power of the light.

Figure: Spectral densities for some selected colours (white, grey, green, and black).


Simple Image Formation Process

● The simplest camera is a pinhole camera. A pinhole camera demonstrates basic image
formation but takes time to form an image. Modern cameras use lenses to speed up and
enhance this process.
● Refraction: When light moves from one medium (e.g., air) to another (e.g., glass), it bends.
● This bending is crucial for focusing light onto an image plane.
● Convex Lens: Converges light at a specific point (focal point).
● Concave Lens: Causes light rays to spread out.
● Focal Point: Where light rays converge.
● Focal Length (f): Distance between the focal point and the lens center.
● Principal Axis: Imaginary line passing through the lens center and focal point.
a. v: Distance between the image plane and the lens.
b. u: Distance between the object and the lens.
c. f: Focal length.
● The thin-lens equation, 1/f = 1/u + 1/v, explains the relationship between object distance, image distance, and lens properties.
Another important measure is the magnification factor (M), defined as the ratio of the size
of the image to the size of the object. (The focal length itself is a measure of how strongly a
lens converges light.) The focal length can be expressed in terms of M as follows:

f = uM / (M + 1)

Q. An object is 15 cm wide and is imaged with a sensor of size 8.8 × 6.6 mm from a
distance of 0.7 m. What should be the required focal length?
Solution
M = size of the image / size of the object = 8.8/150 ≈ 0.0587

Therefore, f = (700 × 0.0587)/(1 + 0.0587) ≈ 38.81 mm. This means that a typical lens
of about 38.81 mm focal length is required.
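The same arithmetic can be scripted; this is a minimal Python sketch that reproduces the worked example above (object width 150 mm, sensor width 8.8 mm, object distance 700 mm).

def focal_length_mm(object_size_mm, image_size_mm, object_distance_mm):
    # f = u*M / (M + 1), where M = image size / object size (dimensionless)
    M = image_size_mm / object_size_mm
    return object_distance_mm * M / (M + 1)

print(round(focal_length_mm(150, 8.8, 700), 2))   # ~38.8 mm, matching the example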
Biological Aspects of image acquisition
Human Visual System:
The human eye is the most complex organ of the human body. It is spherical in shape and its diameter is
approximately 20 mm. The human eye can be compared to a camera system.

Like a camera, the eye converts the visual information into neural signals. These signals are then carried
by the optical nerves to the human brain. The human brain processes these neural signals and produces
object perception. The components of the human eye are shown in Fig.. Broadly speaking, the human
visual system has three components: the optical component that produces an image, the neural system
that converts the image into electrical signals, and the human brain that converts the electrical signals into
sensations.

The components of the human eye and their functions are as follows:

1. The cornea is the protective covering of the eye. This also acts as a focusing system and focuses light.
This is the first level of focusing that the light entering the eye undergoes. The light then travels through the
aqueous and the vitreous humour, which are liquid media that facilitate the travelling of the light.

2. The iris lies behind the cornea, and its central point is a hole called the pupil.

3. The pupil is the black spot in the middle of the iris, which is actually a hole through which light passes.
4. The iris acts as a variable aperture, controlling the amount of light that is allowed to pass through the
lens. The ciliary muscles adjust the curvature, and hence the degree of refraction, of the lens so that a sharp
image can be formed.

5. The retina is the eye sensor system. The distance between the lens and the retina varies from 14 to 17
mm. The retina is composed of photoreceptors such as cones and rods. When the light strikes the cones
or rods, it creates electrochemical reactions that generate neural impulses.

Two distinct visual systems are created by rods and cones. Rods cannot perceive colour but are highly sensitive
to light; the vision created by rods is called scotopic vision or dim-light vision. Photopic vision,
or bright-light vision, is created by cones. Together, the scotopic and photopic systems give the human visual
system a large dynamic range. A third type of vision, called mesopic vision, occurs when both cones
and rods are active.

6. Neural signals are sent by the retina via the optical nerve to the visual cortex region in the brain, where
an image is perceived.
Rods vs cones:

Rods:
- Number: around 75–150 million
- Respond to a broad spectrum of light but are colour blind
- Useful in low-light vision
- Absent in the region called the fovea
- Vision is best in the periphery

Cones:
- Number: around 6–7 million
- Able to perceive colour
- Useful in bright-light vision
- Concentrated in and around the fovea; colour perception is best when the object is viewed directly
- Minimal perception in the periphery


Q. Assume that a 10 m high structure is observed from a distance of 50 m.
What is the height of the retinal image?
Solution
Let the height of the retinal image be x. As shown in the previous example,

height of object / distance of object = height of retinal image / distance between lens and retina

10 / 50 = x / 17 mm

It is assumed that the distance between the centre of the lens and the retina along the visual axis is 17 mm.
Therefore,

x = (17 × 10) / 50 = 3.4 mm
Review of digital cameras
Similar to the human eye, a digital camera also captures optical signals and converts them into an image.

The essential components of a digital camera are as follows:

1. A subsystem of sensors to capture the image. This subsystem uses photodiodes to convert light energy into
electrical signals.

2. A subsystem that converts analog signals to digital data.

3. A storage subsystem for storing the captured images.

Digital cameras can be connected to computers through a cable, to transfer images to the computer system.

Digital cameras have an aperture, a hole through which light passes. The aperture determines how much
light is focused onto the image plane. The lens is specified by its f-number, the ratio of the focal
length to the diameter of the aperture. A preset aperture opening is called an f-stop. A digital camera has a
series of lenses that focus light from the subject. However, instead of focusing the light onto film, a digital
camera focuses the light onto sensors.
Figure: Digital camera pipeline. Light passes through the lens onto CCD sensors; camera circuits compress and store the image, which can then be transferred to a computer over a cable.
Sampling and Quantization
● The output of a sensor is a continuous voltage waveform.
● To create a digital image, this continuous sensed data must be converted into digital form.
● An image is continuous with respect to the x- and y-coordinates, and also in amplitude.
● Digitizing it therefore requires sampling in both coordinates and in amplitude:
- Digitizing the coordinate values is called SAMPLING.
- Digitizing the amplitude values is called QUANTIZATION.


The size of the pixel is important for image quality. If the size is very large, there will be
fewer pixels. Hence, less detail is captured, which makes the image blurred and
meaningless. This is called pixelization error. This is illustrated in the figure, where grey-level
discontinuities at the edges of the pixels become visible.
Then what should be the ideal size of the pixel? Should it be big or small?

The answer is given by the Shannon-Nyquist theorem.

As per this theorem, the sampling frequency should be greater than or equal to 2 × f_max,
where f_max is the highest frequency present in the image. Otherwise, the original signal
cannot be reconstructed. In other words, the number of samples required is dictated by this
theorem. The sampling theorem can also be stated in terms of distance. The sampling distance (d)
must satisfy

d ≤ 1 / (2 f_max)
Q1. An image is 2400 pixels wide and 2400 pixels high. The image was
scanned at 300 dpi. What is the physical size of the image?
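A minimal Python sketch of the arithmetic behind Q1 (physical size in inches = pixels / dots per inch), using the figures given in the question:

def physical_size_inches(width_px, height_px, dpi):
    # Physical dimension = pixel count divided by scanning resolution (dpi).
    return width_px / dpi, height_px / dpi

print(physical_size_inches(2400, 2400, 300))   # (8.0, 8.0) -> an 8 x 8 inch image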
Resampling
During the scaling up of an image, the number of pixels must be increased to retain image quality. Similarly,
sometimes the number of pixels is reduced for better compression. This increase or decrease of the number of pixels is
called resampling. Resampling takes the following two forms:

1. Downsampling

2. Upsampling

Downsampling (or subsampling) is a spatial resolution technique in which the image is scaled down (for example, by half) by
reducing the sampling rate. This is done by choosing alternate samples. Subsampling or downsampling is also
known as image reduction. It is performed by replacing a group of pixels with a single chosen pixel
value. Either the pixel value can be chosen randomly, or the top-left pixel within the neighbourhood can be
chosen. This method is computationally simple. However, for larger neighbourhoods, this technique does not
yield good results. Consider the following image:

F =
3 3 3 3
9 9 9 9
3 3 3 3
9 9 9 9

Subsampling can be done by choosing the upper-left pixel of each 2 × 2 neighbourhood and replacing the
neighbourhood with that value, that is,

F' =
3 3
3 3

This method is called single pixel selection.

Alternatively, a statistical sample can be chosen, such as the mean of the pixels in each neighbourhood; the mean
then replaces the neighbourhood.
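A minimal NumPy sketch of the two subsampling approaches described above (single pixel selection and replacement by the neighbourhood mean), assuming 2 × 2 neighbourhoods:

import numpy as np

def subsample_single_pixel(img, k=2):
    # Single pixel selection: keep the top-left pixel of every k x k neighbourhood.
    return img[::k, ::k]

def subsample_mean(img, k=2):
    # Replace every k x k neighbourhood by its mean value.
    h, w = img.shape
    return img[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

f = np.array([[3, 3, 3, 3],
              [9, 9, 9, 9],
              [3, 3, 3, 3],
              [9, 9, 9, 9]])
print(subsample_single_pixel(f))   # [[3 3] [3 3]]
print(subsample_mean(f))           # [[6. 6.] [6. 6.]]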
Upsampling can be done using replication or interpolation. Replication is a zero-order hold process,
in which each pixel along the scan line is repeated once, and then the scan line itself is repeated. The aim is to
increase the number of pixels, thereby increasing the dimensions of the image.

Linear interpolation is equivalent to fitting a straight line, by taking averages along the rows and
columns. The process is as follows:

1. Zero-interlace the image (insert zeros between the original samples).

2. Interpolate the rows. This is achieved by taking the average of the neighbouring samples along each column.

3. Interpolate the columns. This is achieved by taking the average of the neighbouring samples along each row.
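A minimal NumPy sketch of both upsampling approaches (replication/zero-order hold and linear interpolation by averaging); the 2 × 2 input is an assumed example, since the original example matrices are not reproduced here.

import numpy as np

def upsample_replicate(img):
    # Zero-order hold: repeat each pixel along the scan line, then repeat each scan line.
    return np.repeat(np.repeat(img, 2, axis=1), 2, axis=0)

def upsample_linear(img):
    # Zero-interlace, then interpolate rows and columns by averaging neighbours.
    h, w = img.shape
    z = np.zeros((2 * h, 2 * w))
    z[::2, ::2] = img                                 # zero-interlaced image
    z[1:-1:2, :] = (z[:-2:2, :] + z[2::2, :]) / 2     # interpolate the rows
    z[-1, :] = z[-2, :]                               # replicate the last row
    z[:, 1:-1:2] = (z[:, :-2:2] + z[:, 2::2]) / 2     # interpolate the columns
    z[:, -1] = z[:, -2]                               # replicate the last column
    return z

g = np.array([[1, 2],
              [3, 4]], dtype=float)   # assumed example image
print(upsample_replicate(g))
print(upsample_linear(g))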
Image processing operations
Image coordinate system:

Images can be easily represented as a two-dimensional array or matrix.

Discrete images are usually represented in the fourth quadrant of the Cartesian coordinate system. A
discrete image f(x, y) of dimension 3 × 3 is shown in Fig. 3.2(a).

Many programming environments, including MATLAB, start with an index of (1, 1). The equivalent
representation of the given matrix is shown in Fig. 3.2(b).

The coordinate system used for discrete images is, by default, the fourth quadrant of the Cartesian system.
Image topology
Image topology is a branch of image processing that deals with the fundamental properties of the
image such as image neighbourhood, paths among pixels, boundary, and connected
components. It characterizes the image with topological properties such as neighbourhood,
adjacency, and connectivity. Neighbourhood is fundamental to understanding image topology. In
the simplest case, the neighbours of a given reference pixel are those pixels with which the given
reference pixel shares its edges and corners.
In the 4-neighbourhood N4(p), the reference pixel p(x, y) at the coordinate position (x, y) has two
horizontal and two vertical pixels as neighbours. This is shown graphically in the figure below.

0 X 0
X p(x,y) X
0 X 0
The set of pixels {(x + 1, y), (x − 1, y), (x, y + 1), (x, y − 1)}, called the 4-neighbours of p, is denoted N4(p). Thus the
4-neighbourhood includes the four direct neighbours of the pixel p(x, y): the pixels that share a common
edge with the reference pixel p(x, y).

Similarly, the pixel has four diagonal neighbours: (x − 1, y − 1), (x + 1, y + 1), (x − 1, y + 1), and (x + 1, y − 1).
The diagonal pixels of the reference pixel p(x, y) are shown graphically in the figure below.

The diagonal neighbours of pixel p(x, y) are denoted ND(p). The 4-neighbours and the diagonal neighbours together form
the 8-neighbourhood: all the pixels that share a common edge or corner with the reference pixel p(x, y).
The set of pixels N8(p) = ND(p) ∪ N4(p).

Diagonal neighbours ND(p):

X 0 X
0 p(x,y) 0
X 0 X

8-neighbourhood N8(p):

X X X
X p(x,y) X
X X X
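These neighbourhood definitions can be sketched directly in Python (the function names here are illustrative, not from the text):

def n4(x, y):
    # 4-neighbours of p(x, y): the pixels that share an edge with p.
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(x, y):
    # Diagonal neighbours of p(x, y): the pixels that share only a corner with p.
    return {(x - 1, y - 1), (x + 1, y + 1), (x - 1, y + 1), (x + 1, y - 1)}

def n8(x, y):
    # 8-neighbourhood: N8(p) = N4(p) U ND(p).
    return n4(x, y) | nd(x, y)

print(sorted(n8(1, 1)))   # the eight pixels surrounding (1, 1)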
CONNECTIVITY
The relationship between two or more pixels is defined by pixel connectivity. Connectivity information is used to establish
the boundaries of the objects. The pixels p and q are said to be connected if certain conditions on pixel brightness specified
by the set V and spatial adjacency are satisfied. For a binary image, this set V will be {0, 1} and for grey scale images, V
might be any range of grey levels.

4-Connectivity: The pixels p and q are said to be in 4-connectivity when both have values from the set
V and q is in the set N4(p). A 4-connected path from p to q is one in which every pixel is 4-connected to the
next pixel on the path.

8-Connectivity: It is assumed that the pixels p and q share a common grey scale value (from V). The pixels p and q are said
to be in 8-connectivity if q is in the set N8(p).

Mixed connectivity: Mixed connectivity is also known as m-connectivity. Two pixels p and q are said to be in m-connectivity
when

1. q is in N4(p), or

2. q is in ND(p) and the intersection N4(p) ∩ N4(q) contains no pixels whose values are from V.


For example, the figure shows 8-connectivity when V = {1}.

8-Connectivity is shown as lines. Here, a multiple path or loop is present. In m-connectivity, there are no such multiple
paths; it can be observed that the multiple paths have been removed.
Distance Measures
Distance between the pixels p and q in an image can be given by distance measures such as the Euclidean distance, the D4 distance,
and the D8 distance. Consider three pixels p, q, and z.

If the coordinates of the pixels are p(x, y), q(s, t), and z(u, w) as shown in the figure, the distances between the pixels can be
calculated.

A distance function D is called a metric if the following properties are satisfied:

1. D(p, q) is well defined and finite for all p and q.

2. D(p, q) ≥ 0, and D(p, q) = 0 if and only if p = q.

3. D(p, q) = D(q, p).

4. D(p, q) + D(q, z) ≥ D(p, z). This is called the property of triangular inequality.

The Euclidean distance between the pixels p and q, with coordinates (x, y) and (s, t), respectively, is defined as

De(p, q) = sqrt((x − s)² + (y − t)²)


The distances can be checked against Figs (a)–(d). The transition from one element to another is a hop, and it can be
verified that the calculated distances match the hops in the figures. The distance Dm depends on the values in the set V:
the path must be constructed using only elements of V, so if the values of the set are changed, the path also changes.
In the example, the simplest Dm path runs along the diagonal and the distance is 3. If the set is changed to V = {1},
the path that gives Dm also changes, as shown in the figure.
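The distance measures named above can be sketched as follows; the D4 (city-block) and D8 (chessboard) formulas are the standard definitions, and the sample coordinates are assumed for illustration.

import math

def d_euclidean(p, q):
    (x, y), (s, t) = p, q
    return math.sqrt((x - s) ** 2 + (y - t) ** 2)

def d4(p, q):
    # City-block distance: |x - s| + |y - t|
    (x, y), (s, t) = p, q
    return abs(x - s) + abs(y - t)

def d8(p, q):
    # Chessboard distance: max(|x - s|, |y - t|)
    (x, y), (s, t) = p, q
    return max(abs(x - s), abs(y - t))

p, q = (0, 0), (3, 4)
print(d_euclidean(p, q), d4(p, q), d8(p, q))   # 5.0 7 4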
Classification of image processing operations
There are various ways to classify image operations. The reason for categorizing the operations is to gain an insight into the nature of
the operations, the expected results, and the kind of computational burden that is associated with them.

One way of categorizing the operations based on neighbourhood is as follows:

1. Point operations
2. Local operations
3. Global operations
● Point operations are those whose output value at a specific coordinate is dependent only on the input value.
● Local operations are those whose output value at a specific coordinate is dependent on the input values in the neighbourhood
of that pixel.
● Global operations are those whose output value at a specific coordinate is dependent on all the values in the input image.
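A rough NumPy illustration of the three categories; the specific operations chosen (negation, a 3 × 3 mean, and subtraction of the global mean) are assumed examples, not taken from the slides.

import numpy as np

img = np.random.randint(0, 256, (5, 5)).astype(float)   # assumed test image

# Point operation: each output pixel depends only on the input pixel at the same coordinate.
negative = 255 - img

# Local operation: each output pixel depends on a neighbourhood of the input pixel
# (3 x 3 mean over the interior pixels; borders left unchanged).
local_mean = img.copy()
for i in range(1, img.shape[0] - 1):
    for j in range(1, img.shape[1] - 1):
        local_mean[i, j] = img[i - 1:i + 2, j - 1:j + 2].mean()

# Global operation: each output pixel depends on all the pixels of the input image.
global_adjusted = img - img.mean()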

Another way of categorizing operations is as follows:

1. Linear operations
2. Non-linear operations

An operator H is called a linear operator if it obeys the rules of additivity, H(f1 + f2) = H(f1) + H(f2), and homogeneity, H(a·f) = a·H(f).
Arithmetic Operations
Arithmetic operations include image addition, subtraction, multiplication, division, and blending. The following
sections discuss the usage of these operations.

3.2.1.1 Image Addition

Two images can be added in a direct manner, as given by:

g(x,y)=f1(x,y)+f2(x,y)

The pixels of the input images f1(x,y) and f2(x,y) are added to obtain the resultant image g(x,y). Figure shows the
effect of adding a noise pattern to an image. However, during the image addition process, care should be taken to
ensure that the sum does not cross the allowed range. For example, in a grayscale image, the allowed range is
0–255, using eight bits. If the sum is above the allowed range, the pixel value is set to the maximum allowed
value.
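A minimal NumPy sketch of the addition just described, including the clipping of the sum to the allowed 0–255 range:

import numpy as np

def add_images(f1, f2):
    # Add two 8-bit images and clip the sum to the allowed range 0-255.
    g = f1.astype(np.int32) + f2.astype(np.int32)
    return np.clip(g, 0, 255).astype(np.uint8)

def add_constant(f, k):
    # Adding a constant k to every pixel raises (k > 0) or lowers (k < 0) the brightness.
    return np.clip(f.astype(np.int32) + k, 0, 255).astype(np.uint8)

f1 = np.full((2, 2), 200, dtype=np.uint8)   # assumed example images
f2 = np.full((2, 2), 100, dtype=np.uint8)
print(add_images(f1, f2))                   # sums above 255 are clipped to 255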
If the value of k is larger than 0, the overall brightness is increased. Figure (d)
illustrates that the addition of the constant 50 increases the brightness of the
image. Why?

The brightness of an image is the average pixel intensity of an image.

If a positive or negative constant is added to all the pixels of an image, the


average pixel intensity of the image increases or decreases, respectively.

The practical applications of image addition are as follows:

1. To create double exposure:


Double exposure is the technique of superimposing an image on another
image to produce the resultant image. This gives a scenario equivalent to
exposing a film to two pictures. This is illustrated in Figs (a)–(c).
2. To increase the brightness of an image:
Adding a constant value to all pixels increases the brightness of the
image.
Image subtraction:
If there is no difference between the frames, the subtraction process yields zero, and if there is any difference, it indicates the
change. Figures (a)–(d) show the difference between the images. In addition, it illustrates that the subtraction of a constant
results in a decrease of the brightness.

Results of the image subtraction operation (a) Image 1 (b) Image 2 (c) Subtraction of images 1 and 2 (d) Subtraction of
constant 50 from image 1
Image Multiplication
● Multiplying an image by a constant greater than 1 increases its contrast, while multiplying by a fraction less than 1
decreases the contrast. The figure shows that multiplying the original image by a factor of 1.25 increases the contrast
of the image.
● It is useful for designing filter masks.
● It is useful for creating a mask to highlight the area of interest.
Image Division

Figures (b)–(e) show the multiplication and division operations used to create a mask. It can be observed
that image 2 is used as a mask. The multiplication of image 1 with image 2 highlights certain
portions of image 1 while suppressing the other portions. It can also be observed that division recovers the
original image.

Image division operation


(a) Result of the image division
operation (image/1.25)
(b) Image 1
(c) Image 2 used as a mask
(d) Image 3 = image 1 x image
2
(e) Image 4 = image 3/image 1
Application of Arithmetic operations
A set of noisy images gi(x, y) of the same scene can be averaged to form

ḡ(x, y) = (1/M) Σ gi(x, y),  i = 1, …, M,

where M is the number of noisy images. As M increases, the averaging process reduces the intensity of
the noise until it becomes so low that it is effectively removed. As M becomes large, the
expectation E{ḡ(x, y)} = f(x, y).
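A minimal NumPy sketch of this averaging effect, with synthetic Gaussian noise standing in for the noisy observations (the image, noise level, and M are assumed example values):

import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)                  # assumed noise-free image

M = 100                                       # number of noisy observations
noisy = [f + rng.normal(0, 20, f.shape) for _ in range(M)]
g_bar = np.mean(noisy, axis=0)                # averaged image

# The residual noise shrinks roughly as 1/sqrt(M).
print(np.std(noisy[0] - f), np.std(g_bar - f))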
Set & Logical Operations
Logical Operations
These operations deal with true and false values, which means they can be applied directly to
binary images. Bitwise logical operations can also be applied to image pixels. The resultant pixel is defined
by the rules of the particular operation. Some of the logical operations that are widely used in
image processing are as follows:

1. AND/NAND
2. OR/NOR
3. EXOR/EXNOR
4. Invert/Logical NOT
1. AND/NAND
The operators AND and NAND take two images as input and produce one output image.

The output image pixels are the output of the logical AND/NAND of the individual pixels.

Some of the practical applications of the AND and NAND operators are as follows:

1. Computation of the intersection of images

2. Design of filter masks

3. Slicing of grey scale images; for example, a pixel value in a grey scale image may be 1100 0000. The first
(most significant) bits of the pixels of an image constitute one slice. To extract this slice, a mask of value 1000 0000 can be
designed. The AND operation of each image pixel with the mask extracts the first bit, and hence the first slice, of the
image (see the sketch below).
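A small Python sketch of the bit-plane slicing use of AND described in item 3; the mask 1000 0000 (decimal 128) extracts the most significant bit plane.

import numpy as np

def extract_bit_plane(img, bit):
    # AND each pixel with a single-bit mask; bit 7 corresponds to the mask 1000 0000.
    mask = 1 << bit
    return (img & mask) >> bit          # a 0/1 image for that bit plane

pixel = np.array([[0b11000000]], dtype=np.uint8)   # example pixel value 1100 0000
print(extract_bit_plane(pixel, 7))                 # [[1]] -> the first (most significant) slice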

Figures (a)–(d) show the effects of the AND and OR logical operators. They illustrate that the AND operator shows the
overlapping regions of the two input images, while the OR operator shows both input images together with their
overlap.
Results of the AND and OR logical operators
(a) Image 1
(b) Image 2
(c) Result of image 1 OR image 2
(d) Result of image 1 AND image 2
Geometrical Operations
Translation: The transformation can be represented as F′ = F + T, where T is the translation vector.
Extensions to Other Transformations:

● Rotation: The transformation can be represented as F′=RF, where R is the rotation matrix.
● Scaling: The transformation can be represented as F′=SF , where S is the scaling matrix.
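A brief sketch of these transformations in homogeneous coordinates, so that translation can also be written as a matrix product; the point [2, 2], the 45° angle, and the scale factor 3 follow the exercise below, while the homogeneous form itself is an assumption of this sketch.

import numpy as np

theta = np.deg2rad(45)
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])   # rotation about the origin

S = np.diag([3.0, 3.0, 1.0])                         # scaling by 3 in x and y

T = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])                      # translation right by 3 units

p = np.array([2.0, 2.0, 1.0])                        # the point [2, 2] in homogeneous form
print(T @ p)   # [5. 2. 1.]
print(S @ p)   # [6. 6. 1.]
print(R @ p)   # [0. 2.828 1.] approximately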
Mirror or reflection operation
Consider an image point [2,2]. Perform the following operations and show the
results of these transformations:

1. Translate the image point right by 3 units.


2. Perform a scaling operation in both x-axis and y-axis by 3 units.
3. Rotate the image point about the origin by 45∘.
4. Perform horizontal skewing by 45∘.
5. Perform mirroring about the x-axis.
6. Perform shearing in the y-direction by 30 units.
3D Transforms
Statistical operations
Convolution and correlation
Let f(x, y) and g(x, y) represent the input and output images, respectively. Then
the operation can be written as g(x, y) = t ∗ f(x, y), where t is the mask (kernel). Convolution is a group process; that is,
unlike point operations, it operates on a group of input pixels to
yield each result. Spatial convolution is a method of taking a group of pixels in the
input image and computing the resultant output image; such a filter is also known as a
finite impulse response (FIR) filter. Spatial convolution moves across the image pixel by pixel
and produces the output image. Each pixel of the resultant image depends on
a group of pixels covered by the kernel.
Convolution is the process of shifting the mask over the image and, at each position, summing the
products of the mask coefficients and the underlying image pixels to give the value at the centre.
Correlation

Correlation is similar to the convolution operation and is very useful for recognizing basic
shapes in an image. Correlation reduces to convolution if the kernel is symmetric. The
difference between the correlation and convolution processes is that in correlation the mask or template is
applied directly, without the prior 180° rotation used in convolution. The correlation of the same
sequences can be carried out to observe the difference between the two processes. The correlation
process also involves zero padding, as shown in the table.
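A minimal NumPy sketch of the difference: correlation slides the mask directly over the zero-padded image, while convolution first rotates the mask by 180°. The impulse image and the 3 × 3 kernel are assumed examples.

import numpy as np

def correlate2d(img, kernel):
    # Slide the mask directly over the zero-padded image (no rotation).
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(img, kernel):
    # Convolution = correlation with the mask rotated by 180 degrees.
    return correlate2d(img, np.flip(kernel))

img = np.array([[0, 0, 0],
                [0, 1, 0],
                [0, 0, 0]], dtype=float)   # impulse image
k = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)
print(correlate2d(img, k))   # the mask appears rotated by 180 degrees
print(convolve2d(img, k))    # the mask is reproduced as-is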
Morphology
Classroom Assessment
1.
