
DIP: Module 1, Bondita Paul, Asst. Prof., GLA

Digital Image Processing


Books:
1. "Digital Image Processing", Rafael C. Gonzalez & Richard E. Woods, Addison-Wesley, 2002.
*Much of the material that follows is taken from this book.
2. "Digital Image Processing", Second Edition, S. Sridhar.

Module 1
• Introduction and Fundamentals:
– Motivation and Perspective,
Applications, Components of
Image Processing System,
– Elements of Visual Perception, A Simple
Image Model,
– Sampling and Quantization, Some Basic
Relationships between Pixels

Introduction and Fundamentals:


What is an Image?
An image is a 2D representation of visual information or a 3D scene that can be captured, displayed, or processed.
In the context of digital technology, images consist of a grid of pixels (picture elements), where each pixel represents a specific color or intensity level.

Type of image:
Analog image and Digital image:

Analog: A continuous representation of an image using analog signals. Analog images are the type of images that we, as humans, look at. They include such things as photographs, paintings, old TV images, and all of our medical images recorded on film.

Digital: A digital image is a binary representation (0s and 1s) of visual information that is stored and processed using digital systems (e.g., computers, smartphones, smartwatches).
It consists of a finite set of discrete values called picture elements or pixels.
Discrete values represent the brightness or color of pixels in a two-dimensional grid.

The key differences between analog images and digital images are given in the table below:

Aspect | Analog Image | Digital Image
Definition | A continuous representation of an image, where intensity varies smoothly over the image. | A discrete representation of an image, where intensity is sampled and quantized.
Form | Exists in continuous signals, such as film photographs or real-world scenes. | Stored as a finite set of pixels, each with specific intensity values.
Representation | Represented by varying physical properties like light, voltage, or magnetic fields. | Represented using numerical values (bits) in a computer system.
Storage | Requires physical space like film rolls or analog media. | Stored digitally on devices like hard drives, memory cards, etc.
Manipulation | Hard to manipulate without specialized equipment. | Easily edited, enhanced, and processed using algorithms or software.
Noise Susceptibility | Highly susceptible to noise and degradation over time. | Less susceptible to noise, with the potential for error correction.
Transmission | Requires continuous signals for transmission, e.g., in analog TV. | Transmitted in discrete signals, e.g., in digital TV or computer networks.
Resolution | Limited by the recording medium and equipment precision. | Defined by the number of pixels (spatial resolution) and bit depth (intensity resolution).
Examples | X-ray film images, photographic negatives, analog TV broadcasts. | JPEG, PNG images, digital photos, and satellite images.

Analog images are continuous and physical, while digital images are discrete and numerical.
Digital images are more practical for modern applications due to their ease of processing,
storage, and transmission.

Image Acquisition Using Sensor Arrays:



What is Digital Image Processing?


Digital image processing focuses on two major tasks:
◦ Improvement of pictorial information for human interpretation
◦ Processing of image data for storage, transmission, and representation for autonomous machine perception

How does it work?

Image Processing Overview

• Step 1: A real-world scene is captured by a camera.
• Step 2: The image is sent to a digital system for processing.
• Step 3: All unnecessary details are removed from the image.
• Step 4: The focus is shifted to the water drop in the image.
• Step 5: The image is zoomed in to highlight the water drop.
• Step 6: The zooming ensures the image quality remains unchanged.

Motivation for Digital Image Processing

Why Digital Image Processing Matters?

• Human Strengths: Humans excel at 2D image understanding but are limited in quantification, 3D reconstruction, and handling high-dimensional data.
• Digital Advantages: Enables precise measurements, 3D modelling, and automated image analysis, overcoming human limitations.

Perspective of Digital Image Processing


• Manipulates digital images using a computer.
• A specialized subfield of signals and systems focusing on images.
• Aims to enhance and analyze visual data efficiently.
• Develops systems to process images using efficient algorithms.
• Efficient storage & transmission

Applications:

DIP has a broad spectrum of applications. Some of the major fields in which digital image processing is widely used are:

1. Medical field
2. Geographic Information System (GIS)
3. Remote Sensing
4. Object Tracking
5. Classification
6. Change Detection
7. Disaster Monitoring

Digital Image Representation

• An image may be defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
• When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. If x, y, and f are continuous, it is called an analog image.

What is a pixel?

• A digital image is a matrix of many small finite elements called pixels (picture elements, image elements, or pels).
• Each pixel is represented by a numerical value.

• A digital image can be considered as a matrix, where the matrix element value identifies the gray level at that point.
• A gray-shade (grayscale) image of L gray levels, consisting of M × N picture elements (resolution), is then given by

            | f(0, 0)      f(0, 1)      ...  f(0, N-1)    |
f(x, y) =   | f(1, 0)      f(1, 1)      ...  f(1, N-1)    |
            | ...          ...          ...  ...          |
            | f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1)  |

Where,
M: number of rows and
N: number of columns

What is Gray level?

• Gray levels refer to the range of intensity values in a grayscale image (0–255).
• Each pixel in a grayscale image is assigned a specific value i = f(x, y) that represents its brightness or intensity, typically ranging from black (minimum intensity) to white (maximum intensity). Intermediate intensity values represent different shades of gray.

i.e.,

0 ≤ i ≤ L − 1, where L = 256, so the maximum intensity is L − 1 = 255.

• The number of intensity (gray) levels (L) typically is an integer
power of 2:
• L=2^k

where,
• k: number of bits used for representation
• Assume that the discrete levels are equally spaced and that they
are integers in the interval [0, L-1].
• The total number of bits ‘b’ required to represent an image is
given by
• b=M × N × k
• For M=N
• b=M^2×k

• When an image can have 2^k intensity levels, it is called a k-bit image. For example, an image with 256 possible discrete intensity values is called an 8-bit image.
• The range of values spanned by the gray scale is referred to as the dynamic range.
Detailed Conversion List:

Unit | Bits | Bytes | Equivalent
1 Byte | 8 bits | 1 Byte | –
1 KB | 8×1024 = 8,192 bits | 1024 Bytes | 1 Kilobyte
1 MB | 8×1024^2 = 8,388,608 bits | 1024 KB | 1 Megabyte
1 GB | 8×1024^3 = 8,589,934,592 bits | 1024 MB | 1 Gigabyte
1 TB | 8×1024^4 = 8,796,093,022,208 bits | 1024 GB | 1 Terabyte

Q. For an image of 512 by 512 pixels, with 8 bits per pixel, what will be the total bits required to store the image?
Ans:
Size = 512 × 512 × 8 bits
= 2^9 × 2^9 × 2^3 bits = 2^21 bits (= 2,097,152 bits)
= 2^21 / 2^3 bytes = 2^18 bytes
= 2^18 / 2^10 KB = 2^8 KB
= 256 KB
Q. For an image of 64 by 64 pixels with 3 bands, with 4 bits per
pixel. Calculate the total bits present in the image.
Ans: Given:
1. Image Dimensions: 64×64 pixels
2. Number of Bands (Channels): 3 bands
3. Bits per Pixel: 4 bits

Formula:
Total Bits=Width×Height×Bands×Bits per Pixel
Total Bits=64×64×3×4 =49152 bits
= 6144 bytes (6 KB)
Q. For an image of 32 by 32 pixels with 4 bands/channels, in
which the smallest unit of an image consists of 11 shades.
Determine the total number of bits required to store this image.
Ans: Given:
Image Dimensions: 32×32 pixels
Number of Bands: 4
Shades (Levels): 11
The number of bits required to represent N shades is:
Bits per Pixel (bpp), k = ⌈log2(N)⌉
For N = 11:
bpp = ⌈log2(11)⌉ = ⌈3.46⌉ = 4 bits per pixel
Total Bits = Width × Height × Bands × bpp
= 32×32×4×4 = 16,384 bits = 2,048 bytes = 2 KB

Q. For an image of 25 by 30 pixels with 5 spectra (bands), the pixels have 17 levels. Determine the total number of bits required to store this image.
Ans: Bits per pixel = ⌈log2(17)⌉ = 5, so
Total Bits = 25×30×5×5 = 18,750 bits = 2,343.75 bytes ≈ 2.34 KB.
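These storage calculations follow a single pattern: total bits = rows × columns × bands × ⌈log2(levels)⌉. Below is a minimal Python sketch (illustrative only, not part of the original notes) that reproduces the worked examples above:

```python
import math

def image_storage_bits(rows, cols, levels, bands=1):
    """Total bits to store an image: rows x cols x bands x ceil(log2(levels))."""
    bits_per_pixel = math.ceil(math.log2(levels))
    return rows * cols * bands * bits_per_pixel

print(image_storage_bits(512, 512, 256))           # 2097152 bits = 256 KB
print(image_storage_bits(64, 64, 16, bands=3))     # 49152 bits  = 6 KB
print(image_storage_bits(32, 32, 11, bands=4))     # 16384 bits  = 2 KB
print(image_storage_bits(25, 30, 17, bands=5))     # 18750 bits  ~ 2.34 KB
```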

We know that a grayscale image has 256 levels and 8 bits per pixel. Find out how many bits per pixel are present in a binary image, and how many levels are present in a color image.

Key differences among Binary, Grayscale, Black and White, and Color images:

What is a pseudo-color image? (Find out the answer.)

Feature | Binary Image | Grayscale Image | Black and White Image | Color Image
Definition | Simplest form with two intensity levels: black (0) and white (1). | Contains multiple shades of gray between black and white. | A subset of grayscale, typically only black and white (high contrast). | Contains color information using combinations of Red, Green, and Blue (RGB).
Intensity Levels | 2 levels (black and white). | 256 levels (typically 8 bits per pixel). | 256 levels (only black (0) and white (255) are highlighted). | Millions of colors (e.g., 24-bit color supports 16.7 million colors).
Bit Depth | 1 bit per pixel. | 8 bits or higher per pixel. | 8-bit (pure black and white). | 24 bits (8 bits per channel: RGB).
Purpose | Used for segmentation, thresholding, or document scanning. | Used for detailed intensity representation (e.g., medical imaging, photography). | High-contrast applications, such as printing or artistic purposes. | Used for representing real-world images with natural colors.
Applications | OCR, edge detection, mask creation. | X-rays, photographs, intensity analysis. | Printing, logos, line art. | Photography, video, gaming, and digital displays.
Example | A scanned text document or a thresholded object mask. | A grayscale portrait image with various shades of gray. | A black and white logo with no intermediate shades. | A colorful landscape photograph or painting.

Image Resolutions:
There are four types of resolution
• Spatial resolution
• Spectral resolution
• Temporal resolution
• Radiometric resolution

Spatial resolution:
The smallest possible feature that can be detected in an image.

What is Spatial Resolution?

o It indicates the clarity or detail present in an image.
o Higher spatial resolution means more detailed and sharper images, while lower resolution results in blurry or less detailed images.

Measurement:

o Typically measured in pixels per unit area (e.g., pixels per inch).
o In satellite imagery, it may be measured in meters (e.g., 10 m resolution means each pixel represents a 10 × 10 m area).

Factors Influencing Spatial Resolution:
o Sensor quality.
o Distance between the object and the imaging device.
o Optics of the imaging system.

Applications of Spatial Resolution:

1. Remote Sensing:
o High spatial resolution is used for mapping urban areas,
while low spatial resolution is used for large-scale
environmental monitoring.
2. Medical Imaging:
o Important in modalities like X-rays, CT scans, and MRIs
to detect small abnormalities.
3. Digital Photography:
o Determines the sharpness and quality of captured images.
4. Microscopy:
o Used in biology and materials science to observe fine
details at micro or nanoscales.

Example:
• High Resolution: A satellite image with 1 m spatial resolution can identify individual cars.
• Low Resolution: A satellite image with 30 m spatial resolution can only identify larger objects, like buildings or fields.

Various Sensor images:



Electromagnetic Spectrum:

The electromagnetic spectrum is the range of all types of electromagnetic radiation: waves of energy that travel through space at the speed of light. These waves differ in wavelength, frequency, and energy, but they all share the common properties of electromagnetic waves, such as traveling at the speed of light and having electric and magnetic field components.

Regions of the Electromagnetic Spectrum

Region | Wavelength Range | Frequency Range | Examples/Applications
Radio Waves | > 1 m | < 3×10^9 Hz | Communication (radio, TV, mobile phones)
Microwaves | 1 mm – 1 m | 10^9 – 10^12 Hz | Microwave ovens, radar, satellite communication
Infrared (IR) | 700 nm – 1 mm | 10^12 – 10^14 Hz | Heat sensors, remote controls, thermal imaging
Visible Light | 400 – 700 nm | 4.3×10^14 – 7.5×10^14 Hz | Human vision, photography
Ultraviolet (UV) | 10 – 400 nm | 10^15 – 10^17 Hz | Sterilization, fluorescent lights, tanning
X-rays | 0.01 – 10 nm | 10^17 – 10^19 Hz | Medical imaging, material analysis
Gamma Rays | < 0.01 nm | > 10^19 Hz | Cancer treatment, nuclear processes

Spectral Resolution:

• Spectral resolution describes the ability of a sensor to capture and distinguish information across various wavelength bands with fine intervals.
• Higher spectral resolution means the sensor can record data in narrower and more specific wavelength ranges.

For example:

• A multispectral sensor captures data in a limited number of broad bands (e.g., red, green, blue, near-infrared).
• A hyperspectral sensor, with much higher spectral resolution, captures data in hundreds of narrow bands, allowing precise identification of materials or substances based on their spectral signatures.

Example Application:
• In agriculture, a hyperspectral sensor can differentiate between healthy and stressed vegetation by analyzing subtle changes in reflectance within specific wavelength bands.

Temporal Resolution in Remote Sensing:

• Temporal resolution refers to the time interval between two successive observations of the same area by a sensor.
• It is crucial for tracking dynamic changes in a region over time, such as environmental changes, urbanization, or natural disasters.
Example:

• Landsat Satellite:
o Temporal resolution: 16 days
o The satellite revisits the same area once every 16 days.
o Useful for monitoring changes like deforestation, crop growth, or
urban expansion over time.

Radiometric Resolution in Remote Sensing

• Radiometric resolution refers to the ability of a sensor to capture and represent subtle variations in the intensity of light reflected or emitted from an object.
• It enhances the pixel characteristics, allowing the identification of very slight differences in an image.

• Dependence on Sensor Ability:
o The sensor's ability to detect small variations in intensity determines its radiometric resolution.
• Greater Sensitivity:
o Sensors with higher radiometric resolution are more sensitive and can detect finer changes in the scene.
• Smallest Visible Change:
o Radiometric resolution determines the smallest visible change in the gray levels of an image.
Example:

• Landsat Sensor:
o Landsat's radiometric resolution is typically 8 bits per pixel, which allows it to differentiate 256 levels of intensity.
o This capability enables it to detect subtle differences in land surface properties, such as distinguishing between different vegetation types or identifying slight variations in soil moisture.
A table highlighting the differences among Spatial, Spectral, Temporal, and
Radiometric Resolution:

Aspect | Spatial Resolution | Spectral Resolution | Temporal Resolution | Radiometric Resolution
Definition | Refers to the size of the smallest object that can be detected in an image. | Refers to the ability to distinguish between different wavelengths of light. | Refers to how frequently a sensor can capture data for the same area. | Refers to the ability to distinguish slight differences in intensity levels (shades).
Key Metric | Pixel size on the ground (e.g., 10 m, 30 m, 1 km). | Number of spectral bands and their wavelength range (e.g., visible, NIR). | Revisit time (e.g., daily, weekly, or yearly). | Number of intensity levels (e.g., 8-bit = 256 levels, 12-bit = 4096 levels).
Purpose | Determines how detailed the spatial features of an object appear. | Helps identify materials based on their spectral signatures. | Tracks changes or events over time in a specific area. | Detects fine differences in brightness or reflectance.
Example | A satellite image with 10 m spatial resolution can distinguish objects at least 10 m apart. | Multispectral imaging (e.g., 3–10 bands) or hyperspectral imaging (hundreds of bands). | Monitoring crop growth, disaster management, or urban expansion. | Differentiating subtle variations in soil moisture or vegetation health.
Applications | Urban planning, forestry, and land use mapping. | Mineral exploration, vegetation classification, and water quality analysis. | Weather monitoring, disaster tracking, and seasonal analysis. | Climate studies, precision agriculture, and medical imaging.

End term 2024:

Solution:

A. Monitoring Wheat Crop Growth

• Suitable Resolution: Temporal Resolution
o The user needs to monitor changes over time (from sowing to harvest), which requires frequent revisit capability of the sensor.
o A sensor with high temporal resolution (e.g., daily or weekly revisit) is suitable for tracking crop growth stages.
B. Tracking a Tropical Cyclone

• Suitable Resolution: Temporal Resolution
o Cyclone tracking requires real-time or near-real-time updates, emphasizing the need for a sensor with high temporal resolution.
o Geostationary satellites (e.g., GOES) with continuous observation capabilities are ideal for this scenario.

C. More Shades in the Image

• Suitable Resolution: Radiometric Resolution
o The user is focused on capturing more shades or intensity levels in the image, which depends on the sensor's radiometric resolution.
o A sensor with high radiometric resolution (e.g., 12-bit or 16-bit) is suitable for capturing finer differences in intensity levels.

False Contouring Effect: (see Fig. below)

This phenomenon occurs when the number of gray levels in an image is significantly low. The foreground details of the image blend with the background details, creating ridge-like structures. This degradation is referred to as false contouring.

Fig.: The variation of the image at different bit depths.

Checkerboard Effect:
This occurs when the number of pixels in an image is reduced while keeping the number of gray levels constant. Fine checkerboard patterns appear at the edges of the image as a result of this reduction. Eg.:

Sampling and Quantization:

For an image to be processed by a computer, it must be represented as a discrete data structure (e.g., a matrix).
Two key processes convert an analog image to a digital one:
1. Sampling: the process of converting the continuous spatial data from the sensor into discrete values by sampling the image at specific intervals.
2. Quantization: the process of assigning discrete numerical values to the sampled points, converting the continuous amplitude (brightness levels) into digital numbers.

Example of sampling:

• A continuous image is sampled on a 256 × 256 grid to create a digital image with 256 pixels in each dimension.

Example of quantization:

• An 8-bit quantization assigns 256 intensity levels (2^8), while a 4-bit quantization assigns only 16 levels (2^4).

Comparison Table

Aspect | Sampling | Quantization
What it does | Converts spatial coordinates into discrete pixels. | Converts intensity values into discrete levels.
Resolution | Determines spatial resolution (pixel count). | Determines intensity resolution (gray levels).
Focus | Focuses on space (x, y dimensions). | Focuses on intensity (brightness).
Impact | Affects the sharpness or clarity of the image. | Affects the smoothness and detail of tones.

A digital image is represented in matrix form, where sampling determines the coordinates (spatial positions) within the matrix (image grid), and quantization assigns the pixel values (intensity levels) corresponding to those positions, effectively defining the elements of the matrix.
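As a hedged illustration of the two steps (the synthetic scene and grid size below are arbitrary choices, not from the notes), sampling picks the grid positions and quantization maps each sampled amplitude to one of L = 2^k levels:

```python
import numpy as np

def quantize(image, k):
    """Quantize a float image with values in [0, 1] to L = 2**k gray levels."""
    L = 2 ** k
    return np.clip(np.round(image * (L - 1)), 0, L - 1).astype(np.uint8)

# Sampling: evaluate a "continuous" scene on a discrete 256 x 256 grid.
x = np.linspace(0.0, 1.0, 256)
scene = np.outer(0.5 + 0.5 * np.sin(4 * np.pi * x), x)  # synthetic scene in [0, 1]

img_8bit = quantize(scene, 8)   # 256 intensity levels
img_4bit = quantize(scene, 4)   # only 16 levels: false contouring becomes visible
```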

Mid-Term 2024:

Fundamental steps in digital image processing (block diagram):

Image Acquisition:
• The image is captured by a sensor (eg. Camera) and digitized if
the output of the camera or sensor is not already in digital form,
using an analog-to-digital converter.
• Involves image preprocessing such as scaling, resizing, etc.
Image Enhancement:
• Process of image manipulation to make it more suitable for
specific use
• Different images require different enhancement methods.
• The idea behind enhancement techniques is to bring out details that are hidden, or simply to highlight certain features of interest in an image.

Image Restoration:
• Process of reconstructing or recovering the original image from
the degraded image.
• Degradation is caused by factors like motion blur, noise, or poor
lighting.
• Mathematical and probabilistic models are used to reverse the
damage.

Color Image Processing:


• This part handles the image processing of colored images either as
indexed images or RGB images.

Wavelets and Multiresolution Processing:
• They are the foundation for representing images in various degrees of resolution.
• Used mainly for image data compression and pyramidal representation, where images are divided into smaller regions.

Compression:
• Techniques for reducing the storage space required to save an image, or the bandwidth required to transmit it. Common formats include:
• JPEG (Joint Photographic Experts Group)
• TIF (Tagged Image File) or TIFF (Tagged Image File Format)
• PNG (Portable Network Graphics)
• GIF (Graphics Interchange Format)
• BMP (Bitmap image file)

Morphological Processing:
• It deals with extracting image components that are useful in the representation and description of shapes.
• It is used to fill gaps and remove small particles from a binary image.


Segmentation:
• It is the process of partitioning a digital image into multiple segments. It is generally used to locate objects and their boundaries in images.


Representation & Description:
• Representation deals with converting the data into a form suitable for computer processing.
• Boundary representation: used when the focus is on external shape characteristics, e.g., corners.
• Regional representation: used when the focus is on internal properties, e.g., texture.
• Description deals with extracting attributes that result in some quantitative information of interest, or that can be used for differentiating one class of objects from another.

Object Recognition:
• The process of assigning a label to an object (e.g., "vehicle") based on its description.

Knowledge Base:
• Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database.

Different Image types:

Aspect Ratio:
The aspect ratio of an image is the proportional
relationship between its width (column) and height
(row), expressed as a ratio. It defines the shape of the
image or frame and is written as:

Aspect Ratio = width/height

For example, an aspect ratio of 16:9 means the width is 16 units and the height is 9 units.

Significance:
• Determines the overall shape of the image.

• Maintains consistency when resizing or cropping images

to prevent distortion.
Some Common Aspect Ratios:
• 1:1: Square images (e.g., profile pictures).
• 4:3: Older TV screens and some digital cameras.
• 16:9: Widescreen format for modern TVs and monitors.
• 21:9: Ultra-wide displays or cinematic content.

Q. If we want to resize a 1024×768 (rows × columns) image to one that is 600 pixels wide with the same aspect ratio as the original image, what should be the height of the resized image?
Ans: Aspect Ratio = width/height = columns/rows
= 768/1024
= 0.75
Height of the resized image = 600/0.75 = 800.
Therefore, the new image dimension is 800×600.
Q. Suppose, we have an image that has 16 rows and
has an aspect ratio of 1 or 1:1. Find the total number
of pixels in the image.
Ans: 1 = width/16
width=16
Total number of pixels in image = 16 ×16=256 pixels
Q. Suppose, an image has an aspect ratio of 3:2 with a width
of 36 mm. Find the height of the image.
Ans: Height=36×2/3=24 mm

Q. An image with dimensions 800×1200 pixels has an aspect ratio of 3:2. If the image is stretched to an aspect ratio of 16:9 while keeping the width constant, what will be the new height of the image?
Ans: Original dimensions: 800 (height)×1200 (width)
Original aspect ratio: 3:2
New aspect ratio: 16:9
Width remains constant: 1200 pixels
New height=1200×9/16=675 pixels.
Q. A photographer wants to crop a 4000×3000-pixel image
to fit into a 3:2 aspect ratio for printing. What will be the
new dimensions of the cropped image while maximizing
the usable area?
Ans: For a 3:2 aspect ratio:

Width/Height=3/2
Let the new dimensions be w (width) and h (height). These
must satisfy: w/h=3/2
The cropped dimensions must fit within the original image
dimensions: w≤3000, h≤4000
To maximize the usable area, we need to take the largest
possible w and h that fit the aspect ratio of 3:2 while staying
within the original dimensions.
Case 1: If w = 3000, then
h = (2/3)w = (2/3) × 3000 = 2000
Thus, the cropped dimensions are 2000×3000 pixels.
Case 2: If h = 4000, calculate w:
w = (3/2)h = (3/2) × 4000 = 6000
This case is invalid, since w = 6000 exceeds the original width of 3000 pixels.
Therefore, the cropped image dimensions are 2000×3000 pixels, maintaining a 3:2 aspect ratio while maximizing the usable area.
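A small Python sketch (function names are my own, for illustration) captures both aspect-ratio tasks worked above: resizing with a preserved ratio, and cropping to a target ratio.

```python
def resize_height(new_width, old_width, old_height):
    """Height that keeps the original width/height ratio for a new width."""
    return round(new_width * old_height / old_width)

# 1024 (rows) x 768 (columns) resized to 600 px wide -> 800 px high
print(resize_height(600, 768, 1024))        # 800

def crop_to_ratio(height, width, ratio_w, ratio_h):
    """Largest (height, width) crop matching ratio_w : ratio_h."""
    if width / height > ratio_w / ratio_h:   # too wide -> trim the width
        width = int(height * ratio_w / ratio_h)
    else:                                    # too tall -> trim the height
        height = int(width * ratio_h / ratio_w)
    return height, width

# 4000 (rows) x 3000 (columns) cropped to 3:2 -> 2000 x 3000
print(crop_to_ratio(4000, 3000, 3, 2))      # (2000, 3000)
```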

Baud Rate:
Baud rate refers to the number of signal changes or symbols
transmitted per second in a communication channel. It is a
measure of how quickly information is transmitted.
• Generally, transmission is accomplished in packets
consisting of a start bit, a byte of information & a stop bit
• It is measured in baud (1 baud = 1 symbol per second).
• Relation to Bit Rate:
– The bit rate is the number of bits transmitted per
second, while the baud rate is the number of symbols
transmitted per second.
• If each symbol represents 1 bit, then:
Bit Rate=Baud Rate
• If each symbol represents multiple bits (e.g., in advanced
modulation schemes like QAM), then:
• Bit Rate=Baud Rate×Bits per Symbol

• Example:
– A communication system with a baud rate of 1000
baud transmits 1000 symbols per second.
• If each symbol carries 2 bits, the bit rate would be:
• Bit Rate=1000 baud×2 bits per symbol=2000 bps
(bits per second)

1. Applications:
o Used in data communication systems like modems, telecommunication systems, and digital signaling.
o Higher baud rates allow for faster transmission of symbols but may require more bandwidth.
2. Bandwidth Requirement:
o The baud rate is directly related to the required bandwidth of the communication channel. Higher baud rates demand more bandwidth.
Flag Bit, Start Bit, and Stop Bit in the Context of Baud Rate:

Term | Purpose | Typical Value
Flag Bit | Marks the start or end of a frame with a data sequence (e.g., 01111110), usually 8 bits long. | Protocol-dependent
Start Bit | Indicates the start of a data frame. | 1 bit, usually 0
Stop Bit | Indicates the end of a data frame. | 1 or 2 bits, usually 1

Q. How many minutes would it take to transmit a 1024 × 1024 image with 256 gray levels using a 56K baud rate?
Ans: Given:
Ans: Given:
1. Image dimensions: 1024×1024 pixels
2. Number of gray levels: 256 gray levels (which means 8 bits
per pixel, as 2^8 = 256)
3. Baud rate: 56K baud = 56,000 bauds per second
1. Calculate the total number of pixels in the image:
Total Pixels=1024×1024=1,048,576 pixels
2. Calculate the total number of bits required to transmit the
image:
Since each pixel requires 8 bits (for 256 gray levels):
Total Bits = Total Pixels × 8 = 1,048,576 × 8 = 8,388,608 bits
3. Calculate the transmission time in seconds:
Transmission Time = Total Bits / Baud Rate
= 8,388,608 / 56,000
≈ 149.79 seconds ≈ 2.5 minutes
Q. Compute the time required to transmit an image of 64×64
with 32 gray levels using a 20 b/s.
Ans: Given:
Image dimensions: 64×64
Number of Gray levels: 32
No. of bits per pixel=log_2(32) = 5 bits per pixel.
Transmission speed: 20 bits/second
Total Pixels=64×64=4096 pixels
Total Bits=Total Pixels×Bits per Pixel=4096×5=20480 bits
Time =Total Bits/Transmission Speed
=20480/20=1024 seconds = 17min 4sec.
Q. Compute the time required to transmit an image of 200 × 200
consisting of 4 bands with 9 gray levels using a 10 b/s.
Ans: Given:
Image dimensions: 200×200
Number of bands: 4
Number of gray levels: 9
No. of bits per pixel = ⌈log2(9)⌉ = ⌈3.17⌉ = 4 bits per pixel (3 bits can represent only 8 levels)
Transmission speed: 10 bits/second
Total Pixels = 200×200 = 40,000 pixels (per band)
Total Pixels for 4 Bands = 40,000×4 = 1,60,000 pixels
Total Bits = 1,60,000×4 = 6,40,000 bits
Time = 6,40,000/10 = 64,000 sec = 17 hr, 46 min, 40 sec.
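All of these transmission questions reduce to total bits divided by the channel rate. Here is a minimal sketch (assuming, as in the worked examples, that bits per pixel is rounded up to a whole number):

```python
import math

def transmission_time(rows, cols, levels, bands, rate_bps):
    """Seconds to transmit an image at rate_bps with ceil(log2(levels)) bits/pixel."""
    bits_per_pixel = math.ceil(math.log2(levels))
    total_bits = rows * cols * bands * bits_per_pixel
    return total_bits / rate_bps

print(transmission_time(1024, 1024, 256, 1, 56_000))  # ~149.8 s (~2.5 min)
print(transmission_time(64, 64, 32, 1, 20))           # 1024 s   (17 min 4 s)
print(transmission_time(200, 200, 9, 4, 10))          # 64000 s  (17 h 46 min 40 s)
```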

Q. Compute the time required to transmit an image of 32×32 consisting of 7 bands with 11 levels using 7 b/s.
Ans: Bits per pixel = ⌈log2(11)⌉ = 4; Total Bits = 32×32×7×4 = 28,672 bits;
Time = 28,672/7 = 4,096 sec = 68 min, 16 sec.

Q. Compute the time required to transmit an image having a width of 200 and an aspect ratio of 2:5 using 7 b/s. It consists of 4 bands with 4 bits per pixel.
Ans: Given:
Ans: Given:
Width of the image: 200 pixels
Aspect ratio: 2:5
Transmission speed: 7 b/s
Number of bands: 4
Bits per pixel: 4 bits/pixel
Width : Height = 2 : 5
Height = (5/2) × Width = (5/2) × 200 = 500 pixels
Total Pixels (for 1 band) = Width × Height = 200×500 = 1,00,000 pixels
Total Pixels (for 4 bands) = 1,00,000×4 = 4,00,000 pixels
Each pixel requires 4 bits, thus:
Total Bits = 4,00,000×4 = 16,00,000 bits
Time = 16,00,000/7 ≈ 2,28,571.43 sec ≈ 63.49 hours
End Term 2024:

Ans: Given:
1. Image I1:
o Dimensions: 10×20

o Aspect Ratio = columns/rows=20/10=2:1

2. Image I2:
o Dimensions: 100×X, maintaining the same aspect

ratio as I1
o X=100×2= 200

o Therefore, I2 has dimensions 100×200.

o Total pixels = 20,000.

3. Pixel Information:
o Maximum value among pixels: 16 (requires 4 bits to represent, as 2^4 = 16)
4. Transfer Information:
o Transfer Speed: 20 bits/sec
o Packet Size: 10 bits (i.e., one packet transfers 10 bits)
o Flag Bits: Half of the packet size (i.e., 10/2 = 5 bits per packet)
Each pixel requires 4 bits, so:
Total Bits to Represent the Image I2
=20,000×4=80,000 bits
5. Packets Required for Transmission:
Each packet can carry 10−5 = 5 data bits (as 5 bits are used
for flags).
Number of Packets=Total Bits/Bits per Packet=80,000/5
=16000 packets
6. Total Bits to be Transferred (Including Flags):
Each packet has 10 bits (i.e., 5 data + 5 flag bits), so:
Total Bits Transferred = 16,000×10 = 1,60,000 bits
7. Transfer Time:
Transfer speed = 20 bits/sec (given)
Transfer Time = 1,60,000/20 = 8,000 sec.
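The packetized case only changes the bookkeeping: each packet carries (packet size − flag bits) data bits, and the flag bits must also cross the wire. A sketch reproducing the numbers above (the function name is my own):

```python
def packet_transfer_time(data_bits, packet_bits, flag_bits, rate_bps):
    """Transfer time when each packet carries packet_bits - flag_bits data bits."""
    payload = packet_bits - flag_bits
    packets = -(-data_bits // payload)            # ceiling division
    total_bits_on_wire = packets * packet_bits    # data + flag bits
    return total_bits_on_wire / rate_bps

# Image I2: 100 x 200 pixels, 4 bits/pixel -> 80,000 data bits;
# 10-bit packets, 5 flag bits each, 20 bits/sec transfer speed.
print(packet_transfer_time(100 * 200 * 4, 10, 5, 20))   # 8000.0 seconds
```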

Mid Term 2024



Ans: Given Data:


1. Image size: 1280 × 960 pixels.
2. Intensity levels: 256 (requires 8 bits per pixel as
log_2(256) = 8).
3. Number of images: 200.
4. Packet structure:
o 14 start bits + 8 information bits + 14 stop bits per
byte = 36 bits per byte.
5. Modem speeds:
o 3 M-baud=3×10^6 bits/sec
o 30 G-baud=30×10^9 bits/sec
6. Total number of bits per image
• Each pixel requires 8 bits.

• Total number of pixels = 1280×960=1,228,800 pixels


• Total bits per image = 1,228,800×36=44,236,800 bits
• Total bits for 200 images = 44,236,800×200 =
8,847,360,000 bits
• For a 3 M-baud:
Time = Total bits/Baud rate = 8,847,360,000/ (3×10^6)
=2,949.12 seconds
• For a 30 G-baud:
Time=8,847,360,000/ (30×10^9) = 0.295 seconds

Components of Digital Image Processing System:

Image Sensors:
Image sensors sense the intensity, amplitude, coordinates, and
other features of the images and pass the result to the image
processing hardware. It includes the problem domain.

Image Processing Hardware:


Image processing hardware is the dedicated hardware that is
used to process the instructions obtained from the image
sensors. It passes the results from the microcomputer to a
general-purpose computer.
Computer: Computer used in the image processing system is
the general-purpose computer that is used by us in our daily
life.

Image Processing Software: Image processing software is the software that includes all the mechanisms and algorithms that are used in the image processing system.
Mass Storage: Mass storage stores the pixels of the images during processing.

An 8-bit image of 1024 × 1024 pixels requires 1 megabyte of storage!

Hard Copy Device: Once the image is processed, it is stored in a hard copy device, e.g., a CD-ROM disk.
Image Display: Includes the monitor or display screen that displays the processed images, e.g., a TV screen.
Network: The connection of all the above elements of the image processing system.

Basic Relationships between Pixels

• An image is denoted by a function f(x, y).
• Each element f(x, y) at location (x, y) is called a pixel.
• There exist some basic and important relationships between pixels:

➢ Neighbourhood
➢ Adjacency
➢ Connectivity
➢ Paths
➢ Regions and Boundaries

Neighborhood of a pixel:
• A pixel p at (x, y) has 4 horizontal/vertical neighbors at (x+1, y), (x−1, y), (x, y+1), and (x, y−1). These are called the 4-neighbors of p: N4(p).
• Each of them is at a unit distance from p.



• A pixel p at (x, y) has 4 diagonal neighbors at (x+1, y+1), (x+1, y−1), (x−1, y+1), and (x−1, y−1). These are called the diagonal neighbors of p: ND(p).
• The union of N4(p) and ND(p) gives the 8-neighbors of p: N8(p) = N4(p) ∪ ND(p).
• Some of the points in N4(p), ND(p), and N8(p) may fall outside the image when p lies on the border of the image.
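As a quick illustration (the helper names are my own), the neighbor sets can be generated and clipped at the image border in a few lines of Python:

```python
def n4(x, y):
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    return n4(x, y) + nd(x, y)

def inside(coords, rows, cols):
    """Discard neighbors that fall outside an image of size rows x cols."""
    return [(x, y) for x, y in coords if 0 <= x < rows and 0 <= y < cols]

# A corner pixel keeps only 3 of its 8 neighbors:
print(inside(n8(0, 0), rows=3, cols=3))   # [(1, 0), (0, 1), (1, 1)]
```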

Adjacency:

• Two pixels are said to be connected if they are adjacent in some sense:
➢ they are neighbors (N4, ND, or N8), and
➢ their intensity values (gray levels) are similar.
• For example, in a binary image two pixels are connected if they are 4-neighbors and have the same value (0 or 1).
• Let V be a set of gray-level values used to define adjacency.
• We consider three types of adjacencies:
a) 4-adjacency: Two pixels p and q with values from V are
4-adjacent if q is in the set N4(p).
Eg: V = {0, 1}

1 1 0
1 1 0
1 0 1
b) 8-adjacency: Two pixels p and q with values from V are
8-adjacent if q is in the set N8(p).
Eg: V = {1, 2}

0 1 1
0 2 0
0 0 1

c) m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-adjacent if
➢ q is in N4(p), or
➢ q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
➢ Compute m-adjacency for the given image for each pixel when V = {1}:

0 1 1
0 1 0
0 0 1

(The figure shows this matrix with its 8-adjacency paths and the corresponding unique m-adjacency paths marked.)
• Mixed adjacency is a modification of 8-adjacency. It is introduced to eliminate the ambiguities that often arise when 8-adjacency is used, as in the example above.

Q. Find the 4-adjacency and 8-adjacency of the center pixel, with V = {1}:

0 1 1
0 1 0
0 0 1

Ans: For the center pixel p at (1,1), the only 4-adjacent pixel with a value in V is the 1 at (0,1). The 8-adjacent pixels additionally include the diagonal 1s at (0,2) and (2,2).
Q. Consider the following image I and vector V = {1, 2, 3}. Find the 4-adjacency, 8-adjacency, and m-adjacency of the pixel at (1,1).

Ans: 4-adjacency of pixel 0 at (1,1) is
{(0,1), (1,2), (1,0)}
8-adjacency of pixel 0 at (1,1) is
{(0,1), (1,2), (1,0), (0,0), (0,2), (2,2)}
m-adjacency of pixel 0 at (1,1) is calculated as follows:
N4(p) = {(0,1), (1,2), (1,0)} = {1,2,3}
ND(p) = {(0,0), (0,2), (2,2)} = {2,2,1}
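The m-adjacency rule can be checked mechanically. Here is a minimal Python/NumPy sketch (my own helper, written directly from the definitions above) applied to the 3×3 example matrix:

```python
import numpy as np

def m_adjacent(img, p, q, V):
    """True if pixels p and q (row, col tuples with values in V) are m-adjacent."""
    rows, cols = img.shape

    def valid(coords):
        return {(x, y) for x, y in coords if 0 <= x < rows and 0 <= y < cols}

    def n4(x, y):
        return valid({(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)})

    def nd(x, y):
        return valid({(x + 1, y + 1), (x + 1, y - 1),
                      (x - 1, y + 1), (x - 1, y - 1)})

    if img[p] not in V or img[q] not in V:
        return False
    if q in n4(*p):                      # condition 1: q is in N4(p)
        return True
    if q in nd(*p):                      # condition 2: diagonal with no shared V-pixel
        common = n4(*p) & n4(*q)
        return all(img[c] not in V for c in common)
    return False

img = np.array([[0, 1, 1],
                [0, 1, 0],
                [0, 0, 1]])
print(m_adjacent(img, (0, 1), (0, 2), V={1}))  # True: they are 4-neighbors
print(m_adjacent(img, (1, 1), (0, 2), V={1}))  # False: blocked by the 1 at (0, 1)
```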

Connectivity:

It is used for establishing boundaries of objects and components of regions in an image.

Two pixels are said to be connected:
• if they are adjacent in some sense (neighboring pixels; 4/8/m-adjacency), and
• if their gray levels satisfy a specified criterion of similarity (e.g., equal intensity levels).

There are three types of connectivity based on adjacency:

a) 4-connectivity: Two or more pixels are said to be 4-connected if they are 4-adjacent to each other.
b) 8-connectivity: Two or more pixels are said to be 8-connected if they are 8-adjacent to each other.
c) m-connectivity: Two or more pixels are said to be m-connected if they are m-adjacent to each other.

To determine whether the pixels are adjacent in some sense, let V be the set of gray-level values used to define connectivity. Then two pixels p and q that have values from the set V are:
➢ 4-connected, if q is in the set N4(p)
➢ 8-connected, if q is in the set N8(p)
➢ m-connected, if
❖ q is in N4(p), or
❖ q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

Let S represent a subset of pixels in an image.
• Two pixels p with coordinates (x0, y0) and q with coordinates (xn, yn) are said to be connected in S if there exists a path between them: (x0, y0), (x1, y1), ..., (xn, yn).

Example: Consider the following sample image pixels and vector V = {1, 2}.

Paths and Length

A path from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates
(x0, y0), (x1, y1), (x2, y2), ..., (xn, yn),
where (x0, y0) = (x, y) and (xn, yn) = (s, t), and (xi, yi) is adjacent to (xi−1, yi−1).
➢ Here n is the length of the path.
➢ If (x0, y0) = (xn, yn), the path is a closed path.
➢ We can define 4-, 8-, and m-paths based on the type of adjacency used.

• Eg: Compute the length of the shortest 4-, 8-, and m-path between pixels p(3, 0) and q(0, 3), where V = {1, 2}:

4 2 3 2
3 3 1 3
2 3 2 2
2 3

(q is the top-right pixel; p is the bottom-left pixel.)
Ans: Shortest 4-path: since there is no 4-connected path between the 1 and the 2, the shortest 4-connected path between points p and q does not exist.
The shortest 8-path exists; its path length is 4.
The shortest m-path exists; its path length is 5.
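Shortest 4- and 8-paths can be found with breadth-first search over the chosen neighborhood. A hedged sketch follows (it uses my own toy matrix with V = {1}; m-paths would need the extra m-adjacency test and are omitted here):

```python
from collections import deque

N4_MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8_MOVES = N4_MOVES + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def shortest_path_len(img, p, q, V, moves):
    """BFS: number of steps on the shortest path from p to q through V-pixels."""
    if img[p[0]][p[1]] not in V or img[q[0]][q[1]] not in V:
        return None
    rows, cols = len(img), len(img[0])
    seen, queue = {p}, deque([(p, 0)])
    while queue:
        (x, y), d = queue.popleft()
        if (x, y) == q:
            return d
        for dx, dy in moves:
            nx, ny = x + dx, y + dy
            if (0 <= nx < rows and 0 <= ny < cols
                    and (nx, ny) not in seen and img[nx][ny] in V):
                seen.add((nx, ny))
                queue.append(((nx, ny), d + 1))
    return None   # no path exists

img = [[0, 1, 1],
       [0, 1, 0],
       [0, 0, 1]]
print(shortest_path_len(img, (2, 2), (0, 1), {1}, N4_MOVES))  # None: no 4-path
print(shortest_path_len(img, (2, 2), (0, 1), {1}, N8_MOVES))  # 2: via (1, 1)
```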

Regions and Boundaries


➢ A subset R of pixels in an image is called a region of the
image if R is a connected set.
➢ A connected set is also called a Region.
➢ Let R be a subset of pixels in an image, two regions Ri and
Rj are said to be adjacent if their union forms a connected
set.
➢ Regions that are not connected are said to be disjoint.

• We consider 4- and 8-adjacency when referring to regions.
• Eg:

V = {1}
Ri Rj
1 1 0 0 1 1
1 0 1 0 1 1
1 1 0 1 1 1
When discussing a particular region, the type of adjacency must
be specified. In the above example, the two regions are adjacent
only if 8-adjacency is considered.

Foreground and Background

• Suppose an image contains K disjoint regions Rk, k = 1, 2, ..., K, none of which touches the image border.
• Let Ru denote the union of all the K regions.
• Let (Ru)^c denote its complement.
• We call all the points in Ru the foreground, and all the points in (Ru)^c the background.

Boundary
• The boundary (also referred to as the border or contour)
of a region R is defined as the set of points in R that are
adjacent to points in the complement of R.
• It consists of pixels within the region that have at least one
neighbouring pixel belonging to the background.
• In other words, the boundary of a region R includes all
pixels within R that have one or more neighbours not
included in R.
Types of Boundaries:
• Inner Border: The border of the foreground.
• Outer Border: The border of the background.

0 0 0 0 0
0 1 1 0 0
0 1 1 0 0
0 1 1 1 0
0 1 1 1 0
0 0 0 0 0

0 0 0
0 1 0
0 1 0
0 1 0
0 1 0
0 0 0
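For a binary region like the matrices above, the inner border can be computed directly from the definition: keep the region pixels that have at least one background 4-neighbor. A minimal NumPy sketch (my own illustration, not from the notes):

```python
import numpy as np

def inner_border(region):
    """Region pixels with at least one 4-neighbor in the background."""
    r = region.astype(bool)
    p = np.pad(r, 1, constant_values=False)       # background outside the image
    all_fg = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return r & ~all_fg                            # foreground minus the interior

R = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]])
print(inner_border(R).astype(int))   # only the pixel at (3, 2) is interior
```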

What if R happens to be the entire image?

If R represents the entire image, the boundary conditions are affected as follows:
as follows:

1. Inner Border:
Since R encompasses the entire image, there are no background
pixels within the image. Therefore, the inner border does not
exist because all pixels in R are surrounded only by other pixels
in R.
2. Outer Border:
Similarly, the outer border does not exist within the image
because there are no background pixels surrounding R from
outside.

Example:

Image Matrix (Entire Image as Region R):

1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1

• Inner Border: None, as every pixel is part of R with no background pixels in the image.
• Outer Border: Does not exist within the image, but conceptually would lie along the edges of the image, adjacent to the hypothetical background outside.

When R is the entire image, boundary definitions lose relevance because there is no distinction between foreground and background within the image.

Distance measures between pixels:

Given pixels p, q, and z with coordinates (x, y), (s, t), and (u, v) respectively, the distance function D has the following properties:
a. D(p, q) ≥ 0 [D(p, q) = 0 if p = q]
b. D(p, q) = D(q, p)
c. D(p, z) ≤ D(p, q) + D(q, z)

The following are the different distance measures:
– Euclidean Distance:
• De(p, q) = [(s−x)^2 + (t−y)^2]^(1/2)
– City Block Distance:
• D4(p, q) = |s−x| + |t−y|
– Chessboard Distance:
• D8(p, q) = max(|s−x|, |t−y|)
1. For the Euclidean distance measure, the pixels with a distance less than or equal to a specified value r from p(x, y) are the points that lie within a disk of radius r, centered at p(x, y).
2. The pixels having a City Block (D4) distance from p(x, y) less than or equal to a specified value form a diamond-shaped region centered at p(x, y).
3. The pixels with D4 = 1 are the 4-neighbors of p(x, y).

        2
      2 1 2
    2 1 0 1 2
      2 1 2
        2

4. The pixels having a Chessboard (D8) distance from p(x, y) less than or equal to a specified value form a square-shaped region centered at p(x, y).
5. The pixels with D8 = 1 are the 8-neighbors of p(x, y).

2 2 2 2 2
2 1 1 1 2
2 1 0 1 2
2 1 1 1 2
2 2 2 2 2

Mid Term 2024

Ans:
3 1 2 1
2 2 0 2
1 2 1 1
1 0 1 2

The shortest 4-path does not exist, because the path connecting the 0 and the 1 is not a 4-path; therefore we cannot connect p and q using a 4-path.
3 1 2 1
2 2 0 2
1 2 1 1
1 0 1 2

The shortest 8-path exists. The path length is 4.


3 1 2 1
2 2 0 2
1 2 1 1
1 0 1 2

The shortest m-path exists. The path length is 5.



Q. Calculate the 3 types of distance between p and q for the above matrix.
Ans: Indexing from (0,0), p = (3,0) and q = (0,3):
De(p, q) = sqrt[(0−3)^2 + (3−0)^2] = 3√2 ≈ 4.24
D4(p, q) = |0−3| + |3−0| = 6
D8(p, q) = max(|0−3|, |3−0|) = max(3, 3) = 3
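The three metrics are one-liners in Python; a short sketch verifying the values just computed:

```python
import math

def d_e(p, q):   # Euclidean distance
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d_4(p, q):   # city-block distance
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d_8(p, q):   # chessboard distance
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (3, 0), (0, 3)
print(d_e(p, q))   # 4.2426... (= 3*sqrt(2))
print(d_4(p, q))   # 6
print(d_8(p, q))   # 3
```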
Some mathematical tools for images:
When we multiply two images, we usually carry out array (element-wise) multiplication.

Example of image multiplication (in the 2nd image, black = 0 and white = 1).
Arithmetic operations:
• Arithmetic operations are performed on the pixels of two or more images.
• Let p and q be the pixel values at location (x, y) in the first and second images respectively:
– Addition: p + q
– Subtraction: p − q
– Multiplication: p·q
– Division: p/q
Set operation:

Fig: (a) Two sets of coordinates, A and B, in 2-D space; (b) the union of A and B; (c) the intersection of A and B; (d) the complement of A; (e) the difference between A and B. In (b)–(e), the shaded areas represent the elements of the sets as indicated by the respective set operations.

Logical operations:
1. AND/NAND
2. OR/NOR
3. EXOR/EXNOR
4. INVERT/LOGICAL NOT

Practical applications of AND and NAND:
1. Computation of the intersection of images
2. Design of filter masks
3. Slicing of grayscale images
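With NumPy, the logical operations map directly onto boolean-array operators, and an AND mask performs the slicing mentioned in item 3. A small sketch (the arrays are made-up examples):

```python
import numpy as np

a = np.array([[0, 1, 1],
              [1, 1, 0]], dtype=bool)
b = np.array([[1, 1, 0],
              [1, 0, 0]], dtype=bool)

intersection = a & b     # AND: overlap of the two masks
union        = a | b     # OR
exclusive    = a ^ b     # EXOR
inverted     = ~a        # INVERT / logical NOT

# ANDing a mask with a grayscale image slices it:
gray = np.arange(6, dtype=np.uint8).reshape(2, 3) * 40
sliced = np.where(b, gray, 0)   # keep intensities only where the mask is set
print(sliced)
```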

Module 1: 2nd part


• Intensity Transformations and Spatial
Filtering:
– Introduction, Some Basic Intensity
Transformation Functions,
– Histogram Processing, Histogram Equalization,
Histogram Specification,
– Local Enhancement, Enhancement using
Arithmetic/Logic Operations
– Basics of Spatial Filtering, Smoothing,
Sharpening

Image Enhancement
Image enhancement involves improving the quality of
images to make them more useful and visually appealing.
The main objectives of image enhancement include:
• Enhancing the visual appeal of images.

• Highlighting significant details within the image.


• Reducing or removing noise to improve clarity.

Common Image Degradations

Images often suffer from various degradations that image enhancement techniques aim to address, such as:
• Poor Contrast: Caused by inadequate illumination or lighting conditions.
• Noise: Introduced by electronic sensors or atmospheric disturbances.
• Aliasing: Resulting from insufficient sampling of the image.
• Blur Effects: Due to finite aperture size or motion of the subject or camera.
• Applications of transformations
– Contrast enhancement
– Gray scale transformation
– Photometric calibration
– Display calibration
– Contour lines

There are two broad categories of image enhancement approaches:
• Spatial domain methods
– Spatial domain refers to the image plane itself.
– In this domain, image processing methods involve
directly manipulating the pixel values of the image.
• Frequency domain methods
– Based on modifying the Fourier transform of an image.
Types of Spatial Domain Methods:
Intensity Transformations (Point processing)
– Operate on single pixels of an image
– Eg: image averaging; logic operation; contrast stretching
Spatial Filtering (Mask processing)
– Working in a neighborhood of every pixel in an image
– Eg: blurring, median
Some basic intensity transformation functions (or) point
operations:
▪ Linear Transformation:
– Image negatives and Identity transform
▪ Non-linear Transformation
– Log transformations and inverse-log transformation
– Power law (Gamma) transformations
▪ Piecewise-linear transformations
– Contrast stretching
– Intensity-level slicing or gray-level slicing
– Bit-plane slicing
▪ Histogram

The differences between linear, non-linear, and piecewise linear transformations in intensity transformation methods:

Aspect | Linear Transformation | Non-Linear Transformation | Piecewise Linear Transformation
Definition | Establishes a straight-line relationship between input and output intensities. | Follows a curved function for the relationship between input and output intensities. | Divides the transformation into multiple linear segments for specific intensity ranges.
Formula/Function | s = c·r + b | Logarithmic: s = c·log(1+r); Power-law: s = c·r^γ | Combination of linear equations for different intensity ranges.
Complexity | Simple | Medium | Moderate
Flexibility | Low | High | Moderate
Applications | Contrast stretching and brightness adjustment. | Enhancing dark areas, dynamic range compression. | Thresholding, localized enhancement, contrast stretching.
Use Case | Global adjustments across the entire image. | Adjustments targeting specific intensity ranges. | Enhancements based on localized ranges or thresholds.
Characteristics | Proportional and straightforward. | Flexible and dynamic. | Adaptable to specific intensity regions.

▪ Spatial domain methods are procedures that operate directly on these pixels.
▪ They are computationally efficient and require fewer processing resources to implement.
▪ Spatial domain processes are denoted by the expression:

g(x, y) = T[f(x, y)]

where f(x, y) is the input image, g(x, y) is the processed (output) image, and T is an operator on f, defined over some neighborhood of (x, y).
• s = T(r)
• r = gray level of f(x, y)
• s = gray level of g(x, y)
• T = gray-level (or intensity, or mapping) transformation function

Image negative and identity transformation:
• The negative of an image with gray levels in the range [0, L−1] is obtained by using the image negative transformation shown in the figure, which is given by the expression
s = (L−1) − r

• s is the pixel value of the output image and r is the pixel value of the input image.
• It reverses the intensity levels of an image in this manner, which produces the equivalent of a photographic negative.
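The negative is a single vectorized expression in NumPy; a minimal sketch for an 8-bit image (L = 256):

```python
import numpy as np

def negative(img, L=256):
    """s = (L - 1) - r, applied to every pixel."""
    return ((L - 1) - img.astype(np.int32)).astype(np.uint8)

img = np.array([[0, 64, 128],
                [192, 255, 30]], dtype=np.uint8)
print(negative(img))   # [[255 191 127]
                       #  [ 63   0 225]]
```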

Here is the identity transformation graph, where the input intensity r is equal to the output intensity s. This represents no change in the intensity values, maintaining the original characteristics of the image.

Eg. Negative Transformation



Mid Term 2024

Ans:

Log Transformations

For an image having intensities in the range [0, L−1], the log transformation is given by
s = c log(1 + r)
• c is a constant.
• It expands the narrow range of low-intensity input values to a wider range of output levels.
• It compresses the higher range of high-intensity input values into a narrower range of output levels.


• Used to expand the values of dark pixels
in an image while compressing the higher-level
values.
• Often used to process X-ray images to improve the
visibility of details in darker regions.
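A hedged sketch of the log transformation; choosing c = (L−1)/log(L) is one common choice (assumed here) that keeps the output within [0, L−1]:

```python
import numpy as np

def log_transform(img, L=256):
    """s = c log(1 + r), with c scaled so that r = L-1 maps to s = L-1."""
    c = (L - 1) / np.log(L)
    return (c * np.log1p(img.astype(np.float64))).astype(np.uint8)

img = np.array([[0, 10, 100, 255]], dtype=np.uint8)
print(log_transform(img))   # [[  0 110 212 255]]: dark values expand
```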
This figure shows the graphs of the three transformation methods and the variations in their intensity levels. The inverse-log transformation is also called exponential.

Power-law transformations
• Power-law transformation, also known as gamma
correction, is a type of intensity transformation used in
image processing. It enhances or adjusts the pixel intensity
levels in an image according to the power-law equation:
s = c r^γ
Where,
• s: Output intensity (transformed pixel value).
• r: Input intensity (original pixel value).
• c: A scaling constant, used to adjust the overall
brightness.
• γ (gamma): A positive real number called the gamma
value controls the degree of transformation.
• For various values of γ, different levels of enhancement can be obtained (c = 1 for all cases).

Effect of Gamma (γ) Values

For γ < 1:
• Enhances darker pixels, spreading them across a wider range.
• Compresses bright regions.
• Reveals details in dark areas.
For γ>1:
• Higher gamma values compress lower intensity values into
a narrower range.
• Emphasizes bright regions.
• Enhances high-intensity details.
For γ=1:
• The transformation becomes an identity transformation,
where the input and output pixel values remain the same
(s=r).

Applications of Power-Law Transformation


• Gamma Correction for Display Devices: Adjusts input
pixel intensities to match the device's non-linear response
(e.g., γ =2.2 for monitors).
• Image Enhancement: Improves visibility of details in
specific intensity ranges.
• Medical Imaging: Highlights specific structures by
adjusting intensity distribution (e.g., X-rays).
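A sketch of gamma correction in Python. Note one assumption: intensities are normalized to [0, 1] before the power law is applied (the usual convention for display gamma), whereas the worked example that follows applies the power to raw pixel values:

```python
import numpy as np

def gamma_correct(img, gamma, c=1.0, L=256):
    """s = c * r**gamma on intensities normalized to [0, 1]."""
    r = img.astype(np.float64) / (L - 1)
    s = c * np.power(r, gamma)
    return np.clip(s * (L - 1), 0, L - 1).astype(np.uint8)

img = np.array([[10, 100, 200]], dtype=np.uint8)
print(gamma_correct(img, 0.5))   # gamma < 1: dark pixels are lifted
print(gamma_correct(img, 2.2))   # gamma > 1: output skews darker
```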

Fig.: a) the image has a washed-out appearance; b) γ = 3.0 (suitable); c) γ = 4.0 (suitable); d) γ = 5.0 (some detail is lost). For all cases c = 1.

Q. a) Consider an image with input intensity values ranging from 0 to 255, using c = 1 and γ = 0.5:
1. If the input pixel intensity r is 36, calculate the
transformed pixel value s.
2. What happens to the intensity levels of lower input
values (e.g., 9 or 25) when this transformation is applied?
b) Consider an image with input intensity values ranging from
0 to 255. Using c=1 and γ=2:
3. If the input pixel intensity r is 12, determine the
transformed pixel value s.
4. Explain how the transformation impacts the bright regions
of the image when applied to a pixel intensity of 200.
Ans:
a) Dark Image Enhancement (γ < 1):
1. For r = 36: s = r^0.5 = 36^0.5 = 6
2. Effect on lower input values:
o For r = 9: s = 9^0.5 = 3.
o For r = 25: s = 25^0.5 = 5.
Lower intensity values are mapped to wider output ranges, making dark regions more visible.
b) Bright Image Compression (γ > 1):
3. For r = 12: s = 12^2 = 144
4. Effect on bright regions:
o For r = 200: s = 200^2 = 40,000.
o For r = 255: s = 255^2 = 65,025.

Bright regions are compressed into a narrower range, making them less overwhelming.
Thus, γ < 1 emphasizes the lower input intensities by spreading them over a wider output range (dark regions), while γ > 1 compresses the brighter regions by mapping high intensities to a smaller output range.

Mid Term 2024:

Piecewise Linear Transformation:


– Definition:
o An image enhancement technique where the transformation
function is divided into linear segments over distinct pixel
intensity ranges.
– How it Works:
o The image’s pixel intensity values are divided into specific
intervals.
o Each interval has its own linear transformation.
o This results in a piecewise function that allows for
independent enhancement of different image regions (e.g.,
dark, mid, and bright areas)

• Principal Advantage:
– Provides greater control over the enhancement process.
– Allows for targeted adjustments to specific pixel intensity ranges.
• Principal Disadvantage: Their specification requires more user input than the previous transformations.
Types of Piecewise Linear Transformation:

Contrast Stretching
Gray-level Slicing
Bit-plane slicing
Contrast Stretching:
• Contrast stretching is a simple piecewise linear transformation used to enhance the dynamic range of pixel intensities in an image.
• It is particularly useful for improving low-contrast images caused by poor illumination, limited dynamic range of the imaging sensor, or improper camera settings.
• In this plot, (r1, s1) and (r2, s2) control the shape of the transformation function T(r).


How it Works:
◦ The process maps the input intensity range to a wider or more useful output range to improve contrast.
◦ The transformation function is defined by two points (r1, s1) and (r2, s2):
o If r1 = s1 and r2 = s2, the transformation is linear (the identity) and produces no change in intensity.
o If r1 = r2, s1 = 0, and s2 = L−1 (so that L−1 is the maximum intensity value), the transformation becomes a thresholding function:
s = 0, if r <= r1
s = L−1, if r > r1
◦ Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the intensity levels.
Example:
1. Input Image:
o Consider an image with pixel intensities ranging
from 50 to 150 (low contrast).
2. Contrast Stretching:
o Let (r1, s1) = (50,0) and (r2, s2) = (150, 255)
o The transformation function stretches the input
range (50–150) to the full output range (0–255).
3. Result:
o The darker areas (close to 50) become even darker
(near 0).
o The brighter areas (close to 150) become even
brighter (near 255), enhancing contrast in the image.
This process is commonly used to improve the visual quality
of an image by making features more distinguishable.
The formula for contrast stretching is defined as a piecewise
linear function:
s = (s1/r1)·r, for 0 <= r < r1
s = s1 + [(s2 − s1)/(r2 − r1)]·(r − r1), for r1 <= r <= r2
s = s2 + [(L−1 − s2)/(L−1 − r2)]·(r − r2), for r2 < r <= L−1
Where:
• r: Input pixel intensity.
• s: Output pixel intensity.
• r1, r2: Intensity values defining the input range for stretching.
• s1, s2: Corresponding output intensity values for r1 and r2.
• L−1: Maximum intensity level (e.g., 255 for an 8-bit image).
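A minimal NumPy sketch of this three-segment function (not from the text; the names and test values are illustrative, and it assumes 0 < r1 < r2 < L−1):

```python
import numpy as np

def contrast_stretch(image, r1, s1, r2, s2, L=256):
    """Piecewise linear contrast stretching with control points
    (r1, s1) and (r2, s2); assumes 0 < r1 < r2 < L-1."""
    r = image.astype(np.float64)
    s = np.empty_like(r)

    low = r < r1                       # segment [0, r1)
    s[low] = (s1 / r1) * r[low]

    mid = (r >= r1) & (r <= r2)        # segment [r1, r2]
    s[mid] = s1 + (s2 - s1) / (r2 - r1) * (r[mid] - r1)

    high = r > r2                      # segment (r2, L-1]
    s[high] = s2 + (L - 1 - s2) / (L - 1 - r2) * (r[high] - r2)

    return np.clip(s, 0, L - 1).astype(np.uint8)

# Stretch the low-contrast range [50, 150] onto the full [0, 255]
img = np.array([[50, 100, 150]], dtype=np.uint8)
print(contrast_stretch(img, r1=50, s1=0, r2=150, s2=255))  # [[  0 127 255]]
```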
Mid Term 2024:
Q. Apply contrast stretching to the input gray levels 10, 15, 20, 50 so that the input range [rmin, rmax] = [10, 60] is mapped onto the output range [smin, smax] = [120, 180]. Find the output gray levels.
Ans: Contrast stretching of this form uses the linear mapping
s = smin + (r − rmin)·(smax − smin)/(rmax − rmin)
Where:
• rmin, rmax: Minimum and maximum intensity values in the input image (rmin = 10, rmax = 60).
• smin, smax: Minimum and maximum intensity values for the output image (smin = 120, smax = 180).
Given input intensities: 10, 15, 20, 50.
For each input intensity r, substitute into the formula to calculate the output intensity s.
1. For r=10:
s=120+(10−10)⋅(180−120)/(60−10)= 120
2. For r=15:
s=120+(15−10)⋅(180−120)/(60−10) = 126
3. For r=20: s=132
4. For r=50: s=168
Results:
The output gray levels after applying contrast stretching are:
{120, 126, 132, 168}
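The same computation can be scripted directly; a small illustrative sketch of the min-max mapping above:

```python
import numpy as np

r_min, r_max = 10, 60     # input intensity range
s_min, s_max = 120, 180   # desired output range

r = np.array([10, 15, 20, 50])
s = s_min + (r - r_min) * (s_max - s_min) / (r_max - r_min)
print(s.astype(int))      # [120 126 132 168]
```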
Thresholding
• It is a limiting case of contrast stretching: it produces a two-level (binary) image.
• Thresholding is used to extract the part of an image that contains the information of interest.
• Thresholding is a special case of the more general segmentation problem.
• In thresholding, pixels with an intensity lower than the threshold are set to 0 (dark area), and those with an intensity greater than or equal to the threshold are set to L−1 (e.g., 255, bright area). This produces a binary image.
• s = T(r) = 0, if r < m
         = L−1, if r >= m
where m is the threshold value and r is the input intensity, given by the pixels of the input image f(x, y).
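In code this rule is a single comparison; a minimal sketch (the threshold and test values are illustrative):

```python
import numpy as np

def threshold(image, m, L=256):
    """Binarize: pixels with intensity below m -> 0, the rest -> L-1."""
    return np.where(image < m, 0, L - 1).astype(np.uint8)

img = np.array([[30, 120, 200]], dtype=np.uint8)
print(threshold(img, m=128))   # [[  0   0 255]]
```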
Intensity Level/Gray level slicing
Gray Level Slicing is a technique in image processing used to
enhance specific ranges of gray levels in an image while
suppressing others. This is particularly useful in applications
where certain intensity ranges are of interest, such as in
medical imaging (to highlight tissues) or satellite imaging (to
emphasize certain land features).
Fig: (a) This transformation highlights the range [A, B] of gray levels and reduces all others to a constant level. (b) This transformation highlights the range [A, B] but preserves all other levels.
Approaches to Gray Level Slicing
Gray-level slicing can be implemented in two primary ways:
1. Highlighting Specific Gray Levels (Binary Slicing)
In this approach:
• Gray levels within a specified range (r1 to r2) are
assigned a high value (e.g., L−1 for an 8-bit image,
which equals 255).
• Gray levels outside this range are assigned a low value (e.g., 0).
Transformation Function:
s = L−1, if r1 <= r <= r2
s = 0, otherwise
In the graph above, this corresponds to the first plot (Fig. (a)), where r1 = A and r2 = B; r is the input pixel value and s is the transformed (output) pixel value.
Example:
• Input image: An 8-bit grayscale image with pixel values
ranging from 0 to 255.
• Desired range: r1=100, r2=150.
For a pixel with:
• Intensity f(x,y)=120: It falls within the range, so s=255.
• Intensity f(x,y)=80: It falls outside the range, so s=0.
Result: The resulting image will have bright areas corresponding to the range 100–150, and the rest of the image will appear black.
2. Brightening Specific Gray Levels (Preserving Remaining Values)
In this approach:
• Gray levels within the desired range are brightened (here, raised to the maximum value; scaling them or adding an offset is also possible).
• Gray levels outside the range remain unchanged.
Transformation Function:
s = L−1, if r1 <= r <= r2
s = r, otherwise
In the graph above, this corresponds to the second plot (Fig. (b)), where r1 = A and r2 = B; r is the input pixel value and s is the transformed (output) pixel value.
Example:
• Input image: Same as before, with intensity values from
0 to 255.
• Desired range: r1=100, r2 = 150.
For a pixel with:
• Intensity f(x,y)=120: It falls within the range, so s=255.
• Intensity f(x,y)=80: It falls outside the range, so s=80.
Result: The output image will have the desired range (100–
150) brightened, while other areas retain their original
intensity values.
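Both slicing approaches reduce to simple masked assignments; below is a minimal sketch (function and variable names are illustrative) covering the two transformation functions above:

```python
import numpy as np

def slice_binary(image, r1, r2, L=256):
    """Approach 1: levels in [r1, r2] -> L-1, all others -> 0."""
    in_range = (image >= r1) & (image <= r2)
    return np.where(in_range, L - 1, 0).astype(np.uint8)

def slice_preserve(image, r1, r2, L=256):
    """Approach 2: levels in [r1, r2] -> L-1, all others unchanged."""
    in_range = (image >= r1) & (image <= r2)
    return np.where(in_range, L - 1, image).astype(np.uint8)

img = np.array([[80, 120, 200]], dtype=np.uint8)
print(slice_binary(img, 100, 150))    # [[  0 255   0]]
print(slice_preserve(img, 100, 150))  # [[ 80 255 200]]
```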
Applications of Gray-Level Slicing
1. Medical Imaging: Highlight specific tissue types based on intensity.
2. Satellite Imaging: Emphasize features like vegetation or water bodies.
3. Industrial Imaging: Detect defects or specific regions of interest in materials.
Bit Plane Slicing
Bit Plane Slicing is a technique in image processing where a
digital image is separated into its individual bit planes to
analyze the contribution of each bit in representing the image's
details and structure.
In an 8-bit grayscale image, each pixel's intensity is represented
by a byte (8 bits), ranging from 0 to 255. Each bit in this byte
contributes to the pixel's intensity, with the most significant bit
(MSB) contributing the most, and the least significant bit (LSB)
contributing the least.
Fig: Bit-plane representation of an 8-bit image.
Concept of Bit Planes
• A grayscale image can be visualized as being composed
of 8 bit-planes, from bit-plane 0 (LSB) to bit-plane 7
(MSB).
o Bit-plane 0 (LSB): Contains the least significant bit
of each pixel's intensity, contributing fine details or
noise.
o Bit-plane 7 (MSB): Contains the most significant
bit, contributing the majority of the visually
significant information.
By isolating each bit-plane, we can better understand the role
of individual bits in forming the image.
Fig: An 8-bit fractal image.
Bit Plane Slicing: Process
1. A bit-plane is created by extracting the ith bit from all
pixels in the image.
o For example, bit-plane 3 contains only the 4th bits
(b3) of all pixels.
2. Each bit-plane is a binary image (0 or 1) that can be
scaled to 0–255 for visualization.
Example
Original Image
Suppose the grayscale image is an 8-bit image with pixel values such as:
231  15 128
 65 199  90
 45 120 240
Binary Representation
Each pixel is represented in binary:
11100111  00001111  10000000
01000001  11000111  01011010
00101101  01111000  11110000
Extracting Bit-Planes
• Bit-plane 0 (LSB), the least significant bit of each pixel:
1 1 0
1 1 0
1 0 0
• Bit-plane 7 (MSB), the most significant bit of each pixel:
1 0 1
0 1 0
0 0 1
Similarly, extract the 3rd and 5th bit planes. For the 3rd bit plane (b3), count bit positions from 0, starting at the right (LSB).
First Row
1. 11100111: 4th bit from the right (b3) is 0.
2. 00001111: 4th bit from the right (b3) is 1.
3. 10000000: 4th bit from the right (b3) is 0.
Second Row
4. 01000001: 4th bit from the right (b3) is 0.
5. 11000111: 4th bit from the right (b3) is 0.
6. 01011010: 4th bit from the right (b3) is 1.
Third Row
7. 00101101: 4th bit from the right (b3) is 1.
8. 01111000: 4th bit from the right (b3) is 1.
9. 11110000: 4th bit from the right (b3) is 0.
Final result (bit-plane 3):
0 1 0
0 0 1
1 1 0
Similarly, for the 5th bit plane (b5):
1 0 0
0 0 0
1 1 1
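Bit-plane extraction is a shift-and-mask operation. The sketch below (illustrative, not from the text) reproduces the planes worked out above:

```python
import numpy as np

def bit_plane(image, k):
    """Extract bit-plane k (k = 0 is the LSB, k = 7 the MSB)."""
    return (image >> k) & 1

img = np.array([[231,  15, 128],
                [ 65, 199,  90],
                [ 45, 120, 240]], dtype=np.uint8)

print(bit_plane(img, 3))        # the b3 plane computed above
print(bit_plane(img, 5))        # the b5 plane
print(bit_plane(img, 7) * 255)  # MSB plane scaled to 0-255 for display
```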
Observation
• Higher-order bit planes (e.g., Bit-plane 7, 6): Contain
most of the visually significant information of the image.
• Lower-order bit planes (e.g., Bit-plane 0, 1): Contain
finer details and noise, contributing less to the overall
image appearance.
Applications
1. Compression: Analyzing bit-plane significance can help
in data compression by discarding less important planes.
2. Encryption: Selective encryption of specific bit planes
enhances security.
3. Image Analysis: Highlighting specific planes aids in
identifying features and patterns.
Mid Term 2024
Ans:
Histogram Processing
• A histogram in digital image processing is a graphical
representation of the distribution of pixel intensity
values (gray levels) in an image. It helps in visualizing
the frequency of occurrence of different intensity
levels across the entire image.
• Gray Levels:
o The gray levels represent the different intensities of light in an image. In an 8-bit image, gray levels are typically represented in the range [0, 255], where 0 represents black (the darkest pixel) and 255 represents white (the brightest pixel). This range is denoted as [0, L−1], where L is the total number of possible gray levels in the image (for an 8-bit image, L = 256).
• Histogram Representation:
o The horizontal axis of the histogram represents
the gray levels or pixel intensities (ranging from
0 to 255 in an 8-bit image).
o The vertical axis represents the frequency or
count of pixels in the image that have a specific
intensity (i.e., the number of pixels with a given
gray level).
• Discrete Function:
o The histogram is a discrete function because the
pixel intensities are discrete values, not
continuous. The histogram of a digital image with
intensity levels in the range [0,L-1] is a discrete
function:
o h(rk)=nk
o where:
▪ rk: The kth gray level or intensity in the range
[0, L-1].
▪ nk is the number of pixels in the image with
the gray level rk.
▪ h(rk): The histogram of a digital image for the
gray level rk
For example:
h(2)=6   h(5)=1
h(3)=5   h(6)=0
h(4)=4   h(7)=0
This histogram is normalized by dividing each component by the total number of pixels (n) in the image. Thus, the normalized histogram is given by
p(rk) = nk/n,  k = 0, 1, 2, …, L−1
– p(rk) gives an estimate of the probability of occurrence of gray level rk.
– The sum of all components of a normalized histogram is equal to 1, i.e.
∑ p(rk) = 1, where the sum runs over k = 0 to L−1.
Q. An image with gray levels between 0 and 7 is given below. Find the histogram of the image and normalize it.
1 6 2 2
1 3 3 3
4 6 4 0
1 6 4 7
Ans: Here, n=total number of pixels=16
rk nk P(rk)=nk/n
0 1 1/16
1 3 3/16
2 2 2/16
3 3 3/16
4 3 3/16
5 0 0
6 3 3/16
7 1 1/16
Check: ∑ P(rk) over k = 0 to L−1 equals (1+3+2+3+3+0+3+1)/16 = 1.
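The same table can be produced with np.bincount; a short illustrative sketch:

```python
import numpy as np

img = np.array([[1, 6, 2, 2],
                [1, 3, 3, 3],
                [4, 6, 4, 0],
                [1, 6, 4, 7]])

h = np.bincount(img.ravel(), minlength=8)  # h(rk) = nk for k = 0..7
p = h / img.size                           # normalized histogram p(rk)

print(h)        # [1 3 2 3 3 0 3 1]
print(p.sum())  # 1.0
```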
• In dark images, the components of the histogram are concentrated on the lower (dark) side of the gray scale.
• In bright images, the histogram is biased towards the higher side of the gray scale.
• In low-contrast images, the histogram is narrow and centered towards the middle of the gray scale.
• In high-contrast images, a large variety of gray tones occupies the entire range of possible gray levels.
• Histogram-based Operations:
o Histogram Equalization: The process of adjusting the image histogram to enhance its contrast by spreading out the intensity values more uniformly across the available range.
o Histogram Matching: A technique where the image histogram is adjusted to match a predefined or target histogram.
• Histogram Analysis:
o The analysis of the histogram can provide insight
into image characteristics such as:
▪ Brightness: Whether the image is mostly
dark or bright.
▪ Contrast: The range and distribution of pixel
intensities.
▪ Dynamic Range: How widely the pixel
values span the entire intensity range.
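As a concrete sketch of the first operation above, the standard CDF-based histogram equalization for an 8-bit image can be written as follows (a minimal illustrative implementation, not code from the text):

```python
import numpy as np

def equalize(image, L=256):
    """Histogram equalization: map each gray level through the
    scaled cumulative distribution function (CDF)."""
    hist = np.bincount(image.ravel(), minlength=L)
    cdf = np.cumsum(hist) / image.size            # cumulative distribution
    lut = np.round((L - 1) * cdf).astype(np.uint8)
    return lut[image]                             # apply as a lookup table
```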