

ITECH WORLD AKTU


Image Processing (BCS057)

Unit 1: Digital Image Fundamentals

• Steps in Digital Image Processing

• Components

• Elements of Visual Perception

• Image Sensing and Acquisition

• Image Sampling and Quantization

• Relationships between Pixels

• Color Image Fundamentals: RGB, HSI Models

• Two-dimensional Mathematical Preliminaries

• 2D Transforms: DFT, DCT

What is an Image?
An image is a visual representation of an object or a scene. It is formed by capturing
light reflected or emitted from the object onto a two-dimensional surface, such as a
camera sensor or photographic film. In the context of digital image processing, an image
is represented as a matrix of pixel values, where each pixel value corresponds to the
intensity or color information at a particular point.
Example: A digital photograph taken by a camera is an image, where each pixel
value indicates the brightness and color at that point in the photograph.

Types of Images
Images can be categorized into several types based on their characteristics:

1. Binary Images
Binary images contain only two pixel values, typically 0 (black) and 1 (white). They are
used for representing simple shapes and structures.
Example: A scanned document where text is represented in black on a white back-
ground.


2. Grayscale Images
Grayscale images represent various shades of gray, ranging from black (0) to white (255
in 8-bit images). They contain only intensity information without color.
Example: A black and white photograph.

3. Color Images
Color images use multiple color channels, such as RGB (Red, Green, Blue), to represent
colors at each pixel. Each channel has its own intensity value, and the combination of
these values determines the final color.
Example: A digital photograph taken in color.

4. Indexed Images
Indexed images use a colormap to map pixel values to specific colors. Each pixel value is
an index into a table of colors.
Example: A GIF image with a limited palette of 256 colors.

5. Multispectral Images
Multispectral images capture data across multiple wavelengths of light, such as infrared,
visible, and ultraviolet. They are used in remote sensing and satellite imagery.
Example: Satellite images used for land cover classification.

What is Digital Image Processing?


Digital Image Processing refers to the manipulation of an image using digital techniques to
extract meaningful information or to enhance its visual appearance. It involves applying
mathematical and algorithmic operations on an image to achieve the desired outcomes
such as noise reduction, image enhancement, and feature extraction.

Steps in Digital Image Processing


Digital Image Processing involves several steps to enhance and extract useful information
from an image. The primary steps are:

1. Image Acquisition: Capturing the image using a sensor, such as a camera.

2. Preprocessing: Enhancing the quality of the image by removing noise and ad-
justing contrast.

3. Segmentation: Dividing the image into meaningful parts or regions.

4. Representation and Description: Representing the segmented image in a form suitable for further analysis, such as boundary descriptors.

5. Recognition and Interpretation: Identifying and interpreting objects within the image.


6. Knowledge Base: Utilizing prior information about the problem domain to assist
in processing.

Example: A digital camera captures an image, which is then preprocessed to reduce noise. The regions of interest are segmented, and features such as edges and textures are described. Finally, the image is analyzed to recognize objects such as faces or text.

Advantages and Disadvantages of Digital Images


Advantages
• Ease of Storage and Retrieval: Digital images can be easily stored in various
formats and retrieved for further processing or analysis.

• Image Manipulation: Digital images can be enhanced, modified, and analyzed using various image processing algorithms.

• Transmission: Digital images can be transmitted over networks with minimal loss
of quality.

• Integration with Other Systems: Digital images can be easily integrated with
other data types, such as text and audio, for multimedia applications.

Disadvantages
• Storage Requirements: High-resolution digital images require significant storage
space.

• Loss of Information: Compression techniques can lead to loss of image information, affecting the quality of the image.

• Processing Time: Large digital images may require significant processing time
and computational resources for analysis.


Components
Digital Image Processing involves various components that work together to achieve the
desired image analysis.

• Image Sensors: Devices that capture the image, such as CCD or CMOS sensors
in cameras.

• Image Processing Algorithms: Techniques to process and analyze the image data, such as filtering and enhancement algorithms.

• Hardware: Computing devices that execute image processing algorithms efficiently.

• Software: Programs and tools that provide an interface for implementing image
processing techniques.

Example: A smartphone camera includes an image sensor and uses software to enhance the image quality, such as applying filters to improve contrast and sharpness.

Elements of Visual Perception


Human visual perception plays a crucial role in how images are processed and understood.
Key elements include:


• 1. Light and Color Perception: The way humans perceive colors and brightness, depending on the wavelength of light.

• 2. Spatial Resolution: The ability to distinguish fine details in an image, influenced by the density of photoreceptors in the retina.

• 3. Contrast Sensitivity: The ability to detect differences in brightness, which helps in distinguishing objects from the background.

• 4. Depth Perception: The ability to perceive the world in three dimensions and judge the distance of objects.

• 5. Motion Perception: The ability to detect and interpret movement in the visual field.

• 6. Visual Acuity: The clarity or sharpness of vision, allowing the recognition of small or detailed objects.

• 7. Adaptation: The ability of the human visual system to adjust to varying levels of light, ensuring clear vision in different lighting conditions.

Example: The human eye is more sensitive to changes in brightness than to changes
in color, which is why grayscale images often reveal more detail than colored images.

Image Sensing and Acquisition


Image sensing involves capturing visual information using sensors, while acquisition refers
to converting this information into a digital form that can be processed and stored.

• Image Sensors:

– CCD (Charge-Coupled Device) Sensors: CCD sensors are highly sensitive to light and provide high-quality images with low noise levels. They are commonly used in scientific and medical imaging applications due to their superior performance in low-light conditions.
– CMOS (Complementary Metal-Oxide-Semiconductor) Sensors: CMOS
sensors are more power-efficient and faster than CCD sensors. They allow for
on-chip processing and are widely used in consumer electronics, such as smart-
phones and digital cameras.

• Image Formation: The process begins with light from the scene entering through
the lens and focusing onto the sensor array. The lens plays a crucial role in deter-
mining the field of view and the focus of the captured image.

• Conversion to Electrical Signals: The image sensor, which consists of an array of photosensitive elements (pixels), converts the incident light into electrical signals. Each pixel generates a signal proportional to the intensity of light falling on it.


• Digitization: The analog electrical signals from the image sensor are converted
into digital values using an Analog-to-Digital Converter (ADC). This process in-
volves sampling the analog signal at discrete intervals and quantizing the sampled
values into digital numbers, typically represented as a binary code.
• Image Acquisition System: In addition to the sensor and ADC, an image ac-
quisition system may include components like amplifiers, filters, and timing circuits
that ensure accurate signal processing and conversion.
• Image Storage: The digitized image data is stored in memory or transmitted to a
processing unit for further analysis. The format and resolution of the stored image
depend on the application requirements and sensor capabilities.
• Calibration and Correction: Calibration processes like white balance, gamma
correction, and lens distortion correction are applied to the raw image data to ensure
accurate color reproduction and image quality.

Example: In a digital camera, light enters through the lens and strikes the image
sensor, which could be either a CCD or CMOS sensor. The sensor converts the light
into electrical signals, which are then digitized by an ADC. The resulting digital image
is stored in the camera’s memory card, ready for viewing or editing.


Image Sampling and Quantization


Image sampling and quantization are fundamental steps in converting an analog image
into its digital form. These processes determine the resolution and quality of the digital
image.

• Sampling: Sampling refers to measuring the intensity of the image at discrete points, both horizontally and vertically. It defines the spatial resolution of the image, which is the number of pixels used to represent the image. Higher sampling rates result in better resolution, as more details of the image are captured. However, it also increases the file size and computational requirements.

• Quantization: Quantization involves mapping the continuous range of intensity values of the sampled image to a finite set of discrete levels. Each pixel value is assigned to the nearest quantization level. The number of quantization levels determines the bit depth of the image. Higher quantization levels provide a more accurate representation of the image with smoother transitions between shades, but they also require more storage space.

Example: Consider a 10-megapixel camera that samples the image at 10 million discrete points. Each sample (pixel) is then quantized into 256 levels of brightness, corresponding to an 8-bit image depth.
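Code sketch: a minimal NumPy illustration of sampling and quantization. The 512x512 synthetic scene, the sampling step of 8, and the 256 levels are illustrative assumptions, not values from the syllabus.

```python
import numpy as np

# Stand-in for a continuous scene: a fine 512x512 grid of smooth intensities.
x = np.linspace(0, 1, 512)
scene = (np.sin(2 * np.pi * 4 * x)[None, :] * np.cos(2 * np.pi * 4 * x)[:, None] + 1) / 2

# Sampling: keep every 8th point in each direction (lower spatial resolution).
sampled = scene[::8, ::8]                     # 64x64 samples

# Quantization: map each sample to one of 256 discrete levels (8-bit depth).
levels = 256
quantized = np.round(sampled * (levels - 1)).astype(np.uint8)

print(sampled.shape, quantized.dtype, int(quantized.min()), int(quantized.max()))
```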


Differences between Sampling and Quantization

S.No. | Sampling | Quantization
1 | Refers to measuring the image at discrete intervals. | Refers to assigning a discrete value to each sampled intensity.
2 | Determines the spatial resolution of the image. | Determines the bit depth or number of intensity levels of the image.
3 | Affects the number of pixels in the digital image. | Affects the range of grayscale or color values each pixel can represent.
4 | Higher sampling rate captures more details and results in a larger image size. | Higher quantization levels result in smoother image representation and larger file sizes.
5 | Related to the x and y dimensions of the image matrix. | Related to the z dimension, representing intensity or color levels.
6 | Dependent on the sampling frequency or rate. | Dependent on the number of quantization levels (e.g., 256 levels for 8-bit depth).
7 | Aliasing occurs if the sampling rate is too low. | Quantization error occurs if the number of levels is insufficient to represent the image accurately.

Table 1: Differences between Sampling and Quantization


Relationships between Pixels


Understanding the relationships between pixels is crucial for various image processing tasks such as edge detection, region segmentation, and image analysis. These relationships help define how pixels interact with each other and contribute to the overall image structure. Some common relationships include:

– Neighbors:
  ∗ Neighbors are the pixels that are directly adjacent to a given pixel. There are three main types of neighborhoods:
· 4-neighborhood (N4): Consists of the four pixels that share a com-
mon edge with the given pixel. For a pixel at coordinates (x, y), the
4-neighbors are located at (x−1, y), (x+1, y), (x, y −1), and (x, y +1).
· 8-neighborhood (N8): Includes the 4-neighbors as well as the four
diagonal neighbors. The 8-neighbors are located at (x−1, y), (x+1, y),
(x, y − 1), (x, y + 1), (x − 1, y − 1), (x − 1, y + 1), (x + 1, y − 1), and
(x + 1, y + 1).
· m-neighborhood: This is a mixture of the 4-neighbors and diagonal
neighbors, considering specific conditions for connectivity.
– Connectivity:
∗ Connectivity defines how pixels are connected based on their intensity
values and spatial relationships. It is crucial for identifying distinct regions
in an image.
· 4-connectivity: Two pixels are 4-connected if they share an edge
and have the same intensity value.
· 8-connectivity: Two pixels are 8-connected if they share either an
edge or a corner and have the same intensity value.
· Mixed-connectivity (m-connectivity): Two pixels are connected
using a combination of 4-connectivity and 8-connectivity rules, avoid-
ing the ambiguity of diagonal connectivity.

– Adjacency:
∗ Adjacency describes the relationship between pixels that share a common
side or corner. There are different types of adjacency:
· 4-adjacency: Two pixels are 4-adjacent if they share a common side.
· 8-adjacency: Two pixels are 8-adjacent if they share a common side
or a common corner.


· m-adjacency: A combination of 4-adjacency and 8-adjacency used


to avoid multiple path connections in a binary image.
– Distance Measures:
∗ Distance measures quantify the closeness between pixels. Common dis-
tance measures include:
· Euclidean Distance: The straight-line distance between two pixels (x1, y1) and (x2, y2), calculated as d = √((x2 − x1)² + (y2 − y1)²).
· City Block Distance (Manhattan Distance): The distance be-
tween two pixels measured along the grid lines, calculated as d =
|x2 − x1 | + |y2 − y1 |.
· Chessboard Distance: The maximum of the horizontal and vertical
distances, calculated as d = max(|x2 − x1 |, |y2 − y1 |).
– Path:
∗ A path is a sequence of pixels where each consecutive pixel is adjacent
to the next. Paths are used to trace connectivity and boundaries in an
image.
· 4-path: A path that connects 4-adjacent pixels.
· 8-path: A path that connects 8-adjacent pixels.
· m-path: A path that uses mixed-adjacency rules to avoid diagonal
ambiguity.
– Region:
∗ A region is a group of connected pixels with similar properties. Regions
are used in segmentation to separate different parts of an image.
· Connected Components: Pixels that are connected and share the
same intensity value form a connected component or region.
· Region Adjacency Graph (RAG): A graph representation where
nodes represent regions and edges represent the adjacency between
regions.
– Boundary:
∗ The boundary is the set of pixels that separate different regions in an
image. Identifying boundaries is important for edge detection and shape
analysis.
· External Boundary: The boundary between a region and the back-
ground.
· Internal Boundary: The boundary between two adjacent regions.

Example: In a binary image, two adjacent pixels with the same value are consid-
ered connected. For instance, if both pixels have a value of 1 and share a common
edge, they are 4-connected. This concept is used in connected component labeling
to identify distinct objects in an image.
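Code sketch: the neighborhood and distance definitions above written out in Python (the coordinates used in the demo call are arbitrary).

```python
import math

def neighbors_4(x, y):
    # 4-neighborhood N4: pixels sharing an edge with (x, y).
    return [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]

def neighbors_8(x, y):
    # 8-neighborhood N8: edge neighbors plus the four diagonal neighbors.
    return [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def euclidean(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])

def city_block(p, q):          # Manhattan distance along the grid lines
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def chessboard(p, q):          # maximum of horizontal and vertical distances
    return max(abs(q[0] - p[0]), abs(q[1] - p[1]))

p, q = (2, 3), (5, 7)
print(neighbors_4(*p))
print(euclidean(p, q), city_block(p, q), chessboard(p, q))   # 5.0, 7, 4
```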


Color Image Fundamentals


Color image processing involves understanding and manipulating the color informa-
tion of images. Several color models are used to represent colors in different ways
for various applications. Key models include:

– RGB Model:
∗ The RGB model represents colors as combinations of the primary colors
Red, Green, and Blue. Each color is defined by its intensity values of R,
G, and B, ranging from 0 to 255 in an 8-bit representation.
∗ It is widely used in digital displays and imaging devices such as cameras,
monitors, and scanners.
∗ Colors are additive in nature, meaning they are formed by adding the
values of R, G, and B.
– HSI Model:
∗ The HSI model represents colors using three components: Hue (color
type), Saturation (color purity), and Intensity (brightness).
∗ It is more intuitive for human interpretation because it separates color
information (Hue) from brightness (Intensity).
∗ HSI is commonly used in image analysis, computer vision, and color-based
object recognition.
Example: The RGB model is widely used in digital displays and imaging
devices due to its straightforward representation of colors. In contrast, the
HSI model is preferred for image analysis and object recognition because it
separates color information from intensity, making it easier to analyze the
color features of objects independently from their brightness.
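Code sketch: a NumPy implementation of the usual RGB-to-HSI conversion formulas (H from the arccos expression, S from the scaled minimum, I as the channel average). Input values are assumed to be floats in [0, 1]; the small epsilon guard is an implementation detail added here, not part of the definition.

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Convert an RGB image (floats in [0, 1], shape HxWx3) to HSI.

    H is returned in degrees [0, 360), S and I in [0, 1].
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-8

    # Intensity: average of the three channels.
    i = (r + g + b) / 3.0

    # Saturation: 1 minus the scaled minimum component.
    s = 1.0 - 3.0 * np.minimum(np.minimum(r, g), b) / (r + g + b + eps)

    # Hue: angle derived from the RGB components.
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = np.where(b <= g, theta, 360.0 - theta)

    return np.stack([h, s, i], axis=-1)

# Example: a pure red pixel maps to H = 0, S = 1, I = 1/3.
print(rgb_to_hsi(np.array([[[1.0, 0.0, 0.0]]])))
```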


Two-dimensional Mathematical Preliminaries


Understanding and manipulating digital images require fundamental mathe-
matical concepts. Key concepts include:
∗ Matrices:
· An image can be represented as a matrix, where each element of the
matrix corresponds to the intensity or color value of a pixel.
· For a grayscale image, each element contains a single value represent-
ing brightness, whereas, for a color image, each element might contain
multiple values (e.g., RGB components).
∗ Linear Algebra:
· Operations such as matrix multiplication, addition, and scalar multi-
plication are essential in image transformations like rotation, scaling,
and translation.
· Eigenvalues and eigenvectors are used in Principal Component Anal-
ysis (PCA) for image compression and recognition.

1 2D Transforms
2D transforms are crucial in image processing, facilitating various appli-
cations such as compression, filtering, and feature extraction. Key trans-
forms include:
∗ Discrete Fourier Transform (DFT):
  · The DFT converts an image from the spatial domain to the frequency domain, allowing for the analysis of frequency components.
  · It helps identify periodic patterns and frequencies in images, which is essential for tasks like image filtering and noise reduction.
  · The transformation reveals how different frequency components contribute to the overall image, aiding in various processing techniques.
∗ Discrete Cosine Transform (DCT):
  · The DCT decomposes an image into a sum of cosine functions, emphasizing lower frequencies while minimizing high-frequency components.
  · It is widely used in JPEG compression, where images are divided into blocks, and the DCT is applied to each block to reduce data storage requirements.
  · By concentrating on significant low-frequency information, the DCT allows for effective compression while preserving visual quality.
Example: In JPEG compression, the DCT is applied to each 8x8 block of
pixels. High-frequency components, which typically carry less perceptible
detail, are quantized more coarsely. This enables substantial data reduc-
tion while maintaining acceptable image quality during reconstruction.
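Code sketch: computing the 2D DFT of an image block and a crude DCT-based compression of an 8x8 block, loosely in the spirit of JPEG. The random block and the 75% coefficient threshold are illustrative choices.

```python
import numpy as np
from scipy.fft import dctn, idctn

img = np.random.rand(8, 8)          # stand-in 8x8 block

# 2D DFT: spatial domain -> frequency domain; fftshift puts the DC term at the centre.
F = np.fft.fftshift(np.fft.fft2(img))
magnitude = np.abs(F)

# 2D DCT (type II, orthonormal), as applied per 8x8 block in JPEG.
C = dctn(img, norm='ortho')

# Crude "compression": zero out the smallest 75% of DCT coefficients and reconstruct;
# most visual energy survives in the low frequencies.
thresh = np.quantile(np.abs(C), 0.75)
C_kept = np.where(np.abs(C) >= thresh, C, 0.0)
recon = idctn(C_kept, norm='ortho')

print(magnitude.shape, np.abs(img - recon).mean())
```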

ITECH WORLD AKTU

IMAGE PROCESSING (BCS057)

Unit 2: Image Enhancement

Syllabus:
• Spatial Domain:
– Gray level transformations
– Histogram processing

– Basics of Spatial Filtering



– Smoothing and Sharpening Spatial Filtering


• Frequency Domain:

– Introduction to Fourier Transform


– Smoothing and Sharpening frequency domain filters

– Ideal, Butterworth, and Gaussian filters


– Homomorphic filtering

– Color image enhancement



1 What is Image Enhancement?


Image enhancement refers to the process of improving the visual appearance of an image
or transforming it into a form that is better suited for analysis by human or machine
perception. The goal is to enhance certain features of the image, such as edges, contrast,
or details, which may not be clearly visible in the original image. It is an essential step
in many image processing applications like medical imaging, satellite image analysis, and
object recognition.

1.1 Types of Image Enhancement Techniques
There are two primary domains for image enhancement:
• Spatial Domain Techniques: These techniques directly operate on the pixel
values of an image. Some common spatial domain techniques include:
– Gray Level Transformations: This includes operations like image nega-
tives, contrast stretching, and thresholding, where the pixel values are modified
to enhance the visual quality.
– Histogram Processing: Techniques like histogram equalization and his-
togram matching are used to improve the contrast of the image.

– Spatial Filtering: This includes operations like smoothing and sharpening
using various filters (e.g., mean filter, median filter, and Laplacian filter).

• Frequency Domain Techniques: These techniques modify the Fourier transform of the image to enhance its appearance. Some common frequency domain techniques include:

– Fourier Transform: By transforming an image into the frequency domain,
filtering operations can be applied more effectively to target specific frequency
components.
– Frequency Domain Filtering: This includes operations like low-pass filter-
ing (smoothing) and high-pass filtering (sharpening).

1.2 Applications of Image Enhancement


Image enhancement has a wide range of applications, including:

• Medical Imaging: Enhancing features like tissues, bones, and other structures in medical images (e.g., X-rays, MRI scans) to aid in diagnosis.

• Satellite Imaging: Enhancing satellite images to detect and analyze landforms, water bodies, and vegetation.

• Object Recognition: Improving image quality for better object detection and recognition in computer vision applications.

• Photography: Enhancing photos to improve their visual appeal, such as adjusting brightness, contrast, and sharpness.

1.3 Challenges in Image Enhancement


While image enhancement can significantly improve visual quality, it also poses some
challenges:
• Noise Amplification: Some enhancement techniques may also amplify noise
present in the image, leading to a degraded visual quality.
• Over-Enhancement: Excessive enhancement may result in unnatural-looking im-
ages with loss of important details.
• Subjectivity: The definition of a ’good’ enhancement is subjective and may vary
depending on the application and viewer.

1.4 Conclusion
Image enhancement plays a crucial role in various fields by improving image quality and
visibility of features. The choice of enhancement techniques depends on the nature of the
image and the desired outcome. Proper care must be taken to balance enhancement and
the preservation of essential image details.

2 1. Spatial Domain

The spatial domain refers to the space in which images are defined in terms of their pixel values. In this domain, image processing techniques operate directly on the pixels of an image. This is in contrast to the frequency domain, where transformations like the Fourier Transform are applied to analyze and modify the frequency components of an image.
Example: Consider a simple 3x3 image matrix with pixel intensity values ranging from 0 to 255 (for grayscale images):

    [ 12  45  78 ]
    [ 34  56  90 ]
    [ 78 123 150 ]

Any modification in this matrix directly alters the spatial domain representation of the image. For instance, increasing each pixel value by 10 would brighten the image in the spatial domain.

2.1 1.1 Gray Level Transformations


Gray level transformations are point processing operations where each pixel’s intensity
is transformed independently of other pixels. They are used for image enhancement to
improve contrast, brightness, or to apply specific effects.
Types of Gray Level Transformations:
• Image Negatives: Inverts the intensity of the pixels. Useful for enhancing white
or gray information in dark regions.

• Log Transformation: Enhances details in dark regions of an image.

• Power-Law Transformation: Used for contrast adjustments, also known as
gamma correction.

• Contrast Stretching: Expands the range of intensity values to enhance contrast.

• Thresholding: Converts a grayscale image into a binary image by assigning pixels


below a certain value to 0 and above to 1.

Detailed Example: Image Negative


The image negative transformation reverses the intensity levels of an image. In a grayscale image, pixel values typically range from 0 (black) to 255 (white). The negative of an image is obtained by subtracting each pixel value from the maximum intensity value (255).

• Formula: If I is the input image, the negative image I′ is given by:

      I′ = 255 − I

• Explanation: For a pixel with an intensity value of 50, its negative would be:

      I′ = 255 − 50 = 205

  This transformation darkens bright areas and lightens dark areas, resulting in a visually inverted image.

• Practical Example: Consider a 2x2 image matrix:

      [  30 100 ]
      [ 150 200 ]

  The negative of this image would be:

      [ 255 − 30   255 − 100 ]   [ 225 155 ]
      [ 255 − 150  255 − 200 ] = [ 105  55 ]

  Here, dark areas have been lightened and bright areas darkened.

Applications of Gray Level Transformations:

• Medical Imaging: Enhances the contrast of medical images to highlight specific structures.

• Remote Sensing: Enhances satellite images to detect features such as vegetation or water bodies.

• Photography: Used to correct brightness and contrast in digital images.
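Code sketch: the main gray level transformations applied to a small random image (NumPy only; the gamma value, threshold, and image size are illustrative).

```python
import numpy as np

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)   # toy grayscale image
f = img.astype(np.float64)

negative = 255 - img                                # image negative: I' = 255 - I

c = 255 / np.log(1 + 255)
log_tf = (c * np.log(1 + f)).astype(np.uint8)       # log transform brightens dark regions

gamma = 0.5
power_law = (255 * (f / 255) ** gamma).astype(np.uint8)   # power-law (gamma) correction

lo, hi = f.min(), f.max()
stretched = ((f - lo) / (hi - lo + 1e-8) * 255).astype(np.uint8)  # contrast stretching

binary = (img >= 128).astype(np.uint8)              # thresholding at T = 128

print(negative, binary, sep="\n")
```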

2.2 1.2 Histogram Processing
Histogram processing involves the analysis and modification of an image's histogram to enhance its contrast. The histogram of an image represents the distribution of pixel intensities, and by modifying this distribution, we can improve the visibility of image features.
Types of Histogram Processing:

• Histogram Equalization: A technique that redistributes the image's pixel intensities to achieve a uniform histogram, thereby enhancing the image contrast.

  – Formula: The cumulative distribution function (CDF) of the histogram is used to remap the image pixel values:

        s_k = (L − 1) · Σ_{j=0}^{k} p_r(r_j)

    where s_k is the new pixel value, L is the total number of intensity levels, and p_r(r_j) is the probability of occurrence of intensity level r_j.

  – Example: Consider an image with a histogram concentrated in the lower range of intensity values. After applying histogram equalization, the pixel values are spread over the entire intensity range, resulting in a higher contrast image.
• Histogram Matching (Specification): This technique adjusts the pixel inten-


sities of an image such that its histogram matches a specified histogram.

– Example: If we have an image with low contrast and we want to match its
histogram to that of a high-contrast reference image, histogram matching can
achieve this transformation.

• Local Histogram Processing: Applies histogram processing to small regions of


the image to enhance local contrast.

5
– Example: Adaptive Histogram Equalization (AHE) enhances the contrast
locally in small regions, making it effective for images with varying contrast.

2.3 1.3 Basics of Spatial Filtering


Spatial filtering involves applying a filter mask to the image pixels in the spatial domain
to achieve desired effects like smoothing, sharpening, or edge detection.
Types of Spatial Filters:

• Linear Filters: These filters use a linear combination of pixel values within a
neighborhood defined by the filter mask.

  – Example: Smoothing with a Box Filter - The box filter smooths an image by averaging the pixel values in the neighborhood defined by the filter size. The formula for a box filter is:

        h(x, y) = (1/n²) Σ_{i=−a}^{a} Σ_{j=−b}^{b} f(x + i, y + j)

    where n is the size of the filter, and a and b are the dimensions of the filter.
• Non-Linear Filters: These filters do not use a linear combination of pixel values.
Examples include median filters, which replace each pixel with the median value of
the neighborhood.

  – Example: Median Filter - This filter is particularly effective in removing salt-and-pepper noise from an image. For a 3x3 neighborhood:

        f′(x, y) = median{ f(x + i, y + j) }

    where i, j ∈ {−1, 0, 1}.


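Code sketch: a 3x3 box (mean) filter versus a 3x3 median filter on an image corrupted with salt-and-pepper noise, using SciPy's ndimage filters (the 5% noise fraction is an arbitrary choice for the demo).

```python
import numpy as np
from scipy.ndimage import uniform_filter, median_filter

img = np.random.rand(64, 64)

# Add salt-and-pepper (impulse) noise to roughly 5% of the pixels.
noisy = img.copy()
mask = np.random.rand(*img.shape) < 0.05
noisy[mask] = np.random.choice([0.0, 1.0], size=mask.sum())

box = uniform_filter(noisy, size=3)       # 3x3 box (mean) filter: linear smoothing
med = median_filter(noisy, size=3)        # 3x3 median filter: robust to impulse noise

err = lambda a: np.abs(a - img).mean()
print(f"noisy {err(noisy):.3f}  box {err(box):.3f}  median {err(med):.3f}")
```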

2.4 1.4 Smoothing and Sharpening Spatial Filtering


Spatial filtering can be used for both smoothing and sharpening an image, depending on
the desired effect.

• Smoothing Filters: These filters are used to reduce noise and smooth out varia-
tions in an image.

  – Example: Gaussian Filter - This filter uses a Gaussian function to give more weight to the central pixel and its neighbors, thereby reducing noise. The formula is:

        h(x, y) = (1 / (2πσ²)) · e^(−(x² + y²) / (2σ²))

    where σ is the standard deviation, controlling the degree of smoothing.

• Sharpening Filters: These filters highlight the edges in an image, making it appear sharper.

  – Example: Laplacian Filter - This filter calculates the second derivative of the image, highlighting areas of rapid intensity change (edges). The formula is:

        h(x, y) = ∂²f/∂x² + ∂²f/∂y²

    The Laplacian filter can be applied using a mask like:

        [  0 −1  0 ]
        [ −1  4 −1 ]
        [  0 −1  0 ]
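Code sketch: Gaussian smoothing and Laplacian-based sharpening with the 3x3 mask shown above (sigma and the test image are illustrative).

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve

img = np.random.rand(64, 64)

smooth = gaussian_filter(img, sigma=1.5)      # Gaussian smoothing; sigma controls the blur

lap_mask = np.array([[ 0, -1,  0],
                     [-1,  4, -1],
                     [ 0, -1,  0]], dtype=float)
edges = convolve(img, lap_mask)               # Laplacian response (second derivative)

sharpened = img + edges                       # add the edge response back to sharpen
print(smooth.shape, sharpened.min(), sharpened.max())
```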

2.5 Frequency Domain



Frequency domain refers to the representation of an image or signal in terms of its frequency components rather than its spatial or time domain values. This transformation allows us to analyze and process the different frequencies present in the image, such as high-frequency components (edges, fine details) and low-frequency components (smooth regions, overall shape).


In the frequency domain, an image is represented as a sum of sinusoidal functions with varying frequencies and amplitudes. By transforming an image into the frequency domain, we can easily manipulate these components for various applications like filtering, compression, and enhancement.


Explanation of the Points:

Fourier Transform:
The Fourier Transform is a mathematical tool used to convert an image from the
spatial domain (where each pixel value corresponds to a specific location) to the fre-
quency domain (where each value corresponds to a specific frequency component). It
helps to separate the image into its constituent frequencies, making it easier to process
high-frequency components (like edges and textures) and low-frequency components (like
smooth areas) separately.
Formula:
The 2D Discrete Fourier Transform (DFT) of an image f(x, y) is:

    F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) · e^(−j2π(ux/M + vy/N))

It transforms the image from its spatial representation to its frequency representation.
Explanation:

• F (u, v): Represents the frequency component of the image at coordinates (u, v).

• f (x, y): The pixel intensity at the spatial coordinates (x, y).

• M, N : The dimensions of the image.

• The exponential term e^(−j2π(ux/M + vy/N)) represents the basis functions that oscillate at different frequencies depending on the values of u and v.

• The double summation sums over all pixel values in the image, weighting them by
the complex exponential term to compute the frequency component F (u, v).


2.6 2.2 Smoothing and Sharpening Frequency Domain Filters


Frequency domain filtering is applied by manipulating the Fourier transform of an image.

• Smoothing Filter:

– Removes high-frequency noise by reducing the amplitude of high-frequency components.

– Useful for blurring and reducing fine details or texture in an image.
– Examples include Gaussian Low-Pass Filter (GLPF) and Ideal Low-Pass Filter
(ILPF).
– Smoothing can reduce sharp transitions, making the image appear softer.

• Sharpening Filter:

– Emphasizes high-frequency components to enhance edges and fine details.


– Useful for highlighting edges and small structures in an image.
– Examples include Gaussian High-Pass Filter (GHPF) and Ideal High-Pass Filter (IHPF).
– Increases the contrast between neighboring pixels, making the image appear sharper.

• Practical Applications:

– Smoothing filters are used in image preprocessing to reduce noise before further analysis.
– Sharpening filters are used in medical imaging to enhance anatomical structures.
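Code sketch: frequency-domain smoothing with a Gaussian low-pass transfer function, and its sharpening counterpart as 1 − H (the cutoff D0 = 20 is an arbitrary illustrative value).

```python
import numpy as np

def gaussian_lowpass(shape, d0):
    """Gaussian low-pass transfer function H(u, v) = exp(-D^2 / (2 * D0^2))."""
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    return np.exp(-D2 / (2.0 * d0 ** 2))

img = np.random.rand(128, 128)

F = np.fft.fftshift(np.fft.fft2(img))       # to frequency domain, DC term centred
H = gaussian_lowpass(img.shape, d0=20)
smoothed = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))          # low-pass: smoothing
sharp_part = np.real(np.fft.ifft2(np.fft.ifftshift(F * (1 - H))))  # high-pass: edges

print(smoothed.shape, sharp_part.shape)
```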

2.7 2.3 Ideal, Butterworth, and Gaussian Filters



• Ideal Filter:

  – An Ideal Filter has a sharp and distinct boundary between the pass band (where frequencies are allowed to pass) and the stop band (where frequencies are completely attenuated).

  – It is characterized by an abrupt cutoff, meaning that all frequencies below a certain threshold (cutoff frequency) are passed with full strength, while those above are completely removed.

  – There are two main types of ideal filters:

    ∗ Ideal Low-Pass Filter (ILPF): Allows all frequencies below the cutoff frequency to pass while blocking all higher frequencies. It is used for smoothing the image by removing high-frequency noise.

    ∗ Ideal High-Pass Filter (IHPF): Passes all frequencies above the cutoff frequency and attenuates those below. It is used for enhancing the edges and fine details of the image.

  – Limitations: Due to the sharp cutoff, ideal filters can cause ringing artifacts in the spatial domain, known as the Gibbs phenomenon, making them less practical for real-world applications.

• Butterworth Filter:

– A Butterworth Filter provides a gradual transition between the pass band and
the stop band, making it more practical and avoiding the harsh cutoffs seen
in ideal filters.

– The degree of smoothness of this transition is controlled by the order of the
filter. A higher-order filter has a sharper transition, while a lower-order filter
has a more gradual roll-off.
– Types of Butterworth filters include:
∗ Butterworth Low-Pass Filter (BLPF): Reduces high-frequency com-
ponents more gently compared to an ideal low-pass filter, minimizing ar-
tifacts and preserving important image features.
∗ Butterworth High-Pass Filter (BHPF): Enhances high-frequency
components but with a smooth transition, avoiding the abrupt changes
caused by ideal high-pass filters.

  – Advantages: The smoother transition reduces ringing artifacts, making Butterworth filters suitable for applications where preserving the integrity of the image is crucial.
• Gaussian Filter:

  – A Gaussian Filter has a bell-shaped curve both in the spatial and frequency domains, providing a very smooth transition without any abrupt changes.
  – It is defined by the standard deviation parameter, σ, which controls the width of the Gaussian curve. A larger σ results in a wider curve, which in turn results in a stronger smoothing effect.
  – Types of Gaussian filters include:

    ∗ Gaussian Low-Pass Filter (GLPF): Smooths the image by reducing high-frequency noise, providing an even more gradual attenuation of high frequencies than the Butterworth low-pass filter.
    ∗ Gaussian High-Pass Filter (GHPF): Highlights edges and fine details with minimal distortion, avoiding the ringing artifacts and the abrupt changes seen with ideal and Butterworth high-pass filters.

  – Advantages: Gaussian filters do not introduce ringing artifacts and are widely used for applications requiring smooth and artifact-free filtering. They are optimal for applications like image blurring, noise reduction, and feature detection.

• Comparison and Applications:

  – Ideal Filters: Suitable for theoretical and controlled applications where a precise cutoff is required, but not ideal for real-world applications due to ringing artifacts.
  – Butterworth Filters: Often used in image processing tasks requiring a balance between sharp cutoff and minimal artifacts, such as image enhancement and feature extraction.
  – Gaussian Filters: Preferred for applications needing smooth results without any artifacts, such as noise reduction, image blurring, and pre-processing for edge detection.
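Code sketch: building the three low-pass transfer functions discussed above on a centred frequency grid; the corresponding high-pass filters are simply 1 − H. The grid size, cutoff D0, and Butterworth order are illustrative.

```python
import numpy as np

def distance_grid(shape):
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    return np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)

def ideal_lp(shape, d0):
    return (distance_grid(shape) <= d0).astype(float)     # abrupt cutoff at D0

def butterworth_lp(shape, d0, n=2):
    D = distance_grid(shape)
    return 1.0 / (1.0 + (D / d0) ** (2 * n))               # order n controls the roll-off

def gaussian_lp(shape, d0):
    D = distance_grid(shape)
    return np.exp(-(D ** 2) / (2.0 * d0 ** 2))              # smooth, no ringing

shape, d0 = (128, 128), 30
for name, H in [("ILPF", ideal_lp(shape, d0)),
                ("BLPF", butterworth_lp(shape, d0)),
                ("GLPF", gaussian_lp(shape, d0))]:
    print(name, H.min(), H.max())        # the high-pass versions are 1 - H
```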

2.8 2.4 Homomorphic Filtering

Homomorphic filtering is a technique that combines and manipulates the illumination and reflectance components of an image in the frequency domain to enhance its overall appearance. This method is particularly useful for improving the contrast and brightness of images with non-uniform illumination, such as photographs taken in poor lighting conditions.

• Concept:

  – An image can be modeled as the product of two components:

    ∗ Illumination Component (i): Represents the varying lighting conditions in the image, usually containing low-frequency information.
    ∗ Reflectance Component (r): Represents the intrinsic properties of the objects in the image, such as texture and color, usually containing high-frequency information.

  – Homomorphic filtering aims to separate and manipulate these components to enhance the image. The goal is to suppress the illumination component while emphasizing the reflectance component.

• Process (a short code sketch of this pipeline is given at the end of this section):

  – Logarithmic Transformation: The first step is to apply a logarithmic transformation to the image, converting the multiplicative model of illumination and reflectance into an additive model:

        log(f(x, y)) = log(i(x, y) · r(x, y)) = log(i(x, y)) + log(r(x, y))

  – Fourier Transform: Apply the Fourier Transform to the log-transformed image to separate the low-frequency illumination component from the high-frequency reflectance component.
  – Filtering: Use a high-pass filter to reduce the influence of the low-frequency illumination component while preserving or enhancing the high-frequency reflectance component.
  – Inverse Fourier Transform: Apply the inverse Fourier Transform to convert the filtered image back to the spatial domain.
  – Exponential Transformation: Apply an exponential transformation to revert the image back to its original scale:

        f′(x, y) = exp(Filtered Image)

• Advantages:

– Enhances the visibility of features in images with poor lighting conditions by equalizing uneven illumination.
– Improves contrast and highlights details, making it easier to analyze and in-
terpret the image.
– Suppresses the low-frequency illumination component, which often causes glare
or shadow artifacts.

• Applications:

  – Medical Imaging: Enhances the visibility of tissues and organs in medical images, improving diagnostic accuracy.
  – Document Processing: Improves the legibility of scanned documents by correcting uneven lighting.
  – Satellite Imaging: Enhances the contrast and detail in satellite images affected by varying lighting conditions.

• Limitations:

  – The choice of filter parameters is crucial and may require experimentation to achieve optimal results.
  – Over-enhancement can lead to the amplification of noise and artifacts in the image.
  – May not perform well on images with complex lighting conditions, where the distinction between illumination and reflectance is not clear.
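Code sketch of the homomorphic pipeline described above (log -> FFT -> high-emphasis filter -> inverse FFT -> exp). The Gaussian-shaped high-emphasis filter and the parameter values gamma_l, gamma_h, c, and d0 are common illustrative choices, not values prescribed by the syllabus.

```python
import numpy as np

def homomorphic(img, d0=30.0, gamma_l=0.5, gamma_h=2.0, c=1.0):
    """Sketch of homomorphic filtering on a grayscale image with positive values.

    gamma_l < 1 suppresses low-frequency illumination, gamma_h > 1 boosts
    high-frequency reflectance.
    """
    rows, cols = img.shape
    z = np.log1p(img.astype(np.float64))              # multiplicative -> additive model

    Z = np.fft.fftshift(np.fft.fft2(z))
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = (gamma_h - gamma_l) * (1 - np.exp(-c * D2 / (d0 ** 2))) + gamma_l

    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
    return np.expm1(s)                                # back to the original scale

# Unevenly lit synthetic image: a ramp of illumination multiplied by a texture.
illum = np.linspace(0.2, 1.0, 128)[None, :].repeat(128, axis=0)
img = illum * (0.5 + 0.5 * np.random.rand(128, 128))
print(homomorphic(img).shape)
```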


2.9 2.5 Color Image Enhancement

Color image enhancement involves adjusting the intensity and color channels of an image to improve its visual appearance, contrast, and color balance. This process is essential for enhancing the perceptual quality of images in various applications such as photography, remote sensing, and medical imaging.

• Concept:
– Unlike grayscale images, color images contain multiple channels, usually rep-
resented in the RGB (Red, Green, Blue) color space. Enhancement techniques
must consider all channels to avoid color distortion.

– The goal is to improve the visibility of details, correct color imbalances, and
enhance the contrast of the image while preserving its natural appearance.

• Techniques (a short code sketch of gamma correction and saturation adjustment is given at the end of this section):

– Histogram Equalization:
∗ Enhances the contrast of the image by redistributing the intensity values
of each color channel.
∗ Can be applied separately to each channel (RGB) or to the intensity com-
H

ponent in a different color space like HSV (Hue, Saturation, Value).


∗ Limitation: Applying it to individual channels in the RGB space can
C

lead to unnatural color shifts, so it’s often preferred in the HSV or YCbCr
E

(Luminance, Chrominance) space.


– Contrast Stretching:

∗ Involves stretching the range of intensity values in the image to cover the
full available range, enhancing the contrast.
∗ Often used to improve the visibility of features in low-contrast images.
∗ Limitation: May cause clipping of bright or dark regions if not carefully
adjusted.
– Color Balance Adjustment:
∗ Adjusts the relative intensities of the RGB channels to correct color im-
balances, such as removing color casts due to incorrect white balance.
∗ Often used to make an image appear more natural or to match a desired
aesthetic.

– Saturation Enhancement:
∗ Increases the saturation of colors, making the image appear more vibrant
and lively.
∗ Performed in the HSV or HSL (Hue, Saturation, Lightness) color spaces
to avoid affecting the brightness of the image.
∗ Limitation: Over-enhancement can lead to unnatural and oversaturated
colors.
– Color Space Conversion:
∗ Converts the image from one color space (e.g., RGB) to another (e.g., HSV, Lab) to simplify the enhancement process.
∗ Specific enhancements like contrast or saturation adjustments can be more effectively applied in certain color spaces.
∗ After enhancement, the image is converted back to the original color space.

– Gamma Correction:
∗ Adjusts the brightness of the image by applying a power-law transforma-

tion to the pixel values.
∗ Useful for correcting lighting issues, such as underexposed or overexposed
images.
LD
∗ Limitation: Incorrect gamma settings can lead to loss of details in shad-
ows or highlights.
• Applications:

– Photography: Enhances the visual appeal of photographs by adjusting brightness, contrast, and color balance.


– Remote Sensing: Improves the clarity and interpretability of satellite images
W

for better analysis of land use and environmental changes.


– Medical Imaging: Enhances the visibility of features in medical images such
as MRIs and X-rays, aiding in diagnosis and analysis.

– Multimedia: Used in video and image editing to achieve desired visual effects
and color corrections.

• Challenges:

– Color Distortion: Applying enhancements without considering the interdependence of color channels can lead to unnatural color shifts.


– Noise Amplification: Certain enhancement techniques, like contrast stretch-
ing, can amplify noise in the image, reducing overall quality.
– Maintaining Natural Appearance: Balancing enhancement while preserv-
ing the natural look of the image can be difficult, especially in diverse lighting
conditions.
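Code sketch: two of the techniques above, gamma correction and a simple saturation boost, applied to an RGB image with values in [0, 1]. The saturation step here pushes each pixel away from its gray (mean-channel) value as a lightweight stand-in for a full HSV round trip; the gain and gamma values are illustrative.

```python
import numpy as np

rgb = np.random.rand(64, 64, 3)               # stand-in color image, values in [0, 1]

# Gamma correction: power-law transformation per channel (gamma < 1 brightens).
gamma = 0.8
corrected = np.clip(rgb ** gamma, 0.0, 1.0)

# Simple saturation boost: push each pixel away from its gray value.
gray = corrected.mean(axis=-1, keepdims=True)
saturation_gain = 1.3
enhanced = np.clip(gray + saturation_gain * (corrected - gray), 0.0, 1.0)

print(enhanced.shape, enhanced.min(), enhanced.max())
```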

ITECH WORLD AKTU

SUBJECT NAME: IMAGE PROCESSING (IP)


SUBJECT CODE: BCS057

UNIT 3: IMAGE RESTORATION

Syllabus (Unit 3):

• Image Restoration – degradation model

• Properties

• Noise models

• Mean Filters

• Order Statistics

• Adaptive Filters

• Band Reject Filters

• Band Pass Filters

• Notch Filters

• Optimum Notch Filtering

• Inverse Filtering

• Wiener Filtering

Image Restoration
Image restoration refers to the process of recovering an original image from a degraded
version using mathematical models. The degradation may occur due to various factors
such as noise, motion blur, or any environmental condition.


1 Properties of Image Restoration


1. Restoration vs. Enhancement: Restoration focuses on recovering the original image,
while enhancement improves visual appearance for better interpretation.

2. Noise Reduction: Effective image restoration reduces various types of noise, en-
hancing clarity and detail in the processed image.

3. Edge Preservation: Quality restoration techniques maintain edge integrity, preventing blurring of important features and ensuring structural fidelity.

4. Spatial and Frequency Domain: Restoration methods can be applied in both spatial and frequency domains, allowing for versatile approaches based on specific needs.

5. Computational Efficiency: Efficient algorithms are crucial for real-time applications, balancing quality and processing speed to meet user demands.

6. User Control: Restoration techniques often allow user input for adjusting parameters, enabling tailored processing based on individual requirements.

7. Adaptability: Good restoration methods adapt to different types of degradation, such as blurring, noise, or distortions, providing flexible solutions across applications.

Degradation Model
A degradation model in image processing defines the relationship between the original
image and the degraded (or observed) image. This model is crucial in image restoration,
as it helps in understanding how the image has been altered by external factors such as
noise, blur, or any other distortions.
Mathematical Representation:

The degradation model is commonly represented by the following equation:


g(x, y) = h(x, y) ∗ f (x, y) + η(x, y)
Where:
• g(x, y) represents the degraded image (the image we observe after degradation).
• f (x, y) is the original, uncorrupted image (which we aim to recover).
• h(x, y) is the degradation function or point spread function (PSF) that describes
how the image was degraded. This could be caused by motion blur, out-of-focus
effects, or lens imperfections.
• ∗ denotes the convolution operation, which means each pixel in the degraded image
is influenced by its neighboring pixels due to the degradation function.
• η(x, y) is the noise introduced during image acquisition, typically modeled as addi-
tive noise.
Key Concepts:
1. Degradation Function h(x, y): This function models the physical process causing
the image degradation. For example, if the image is blurred due to camera motion,
the degradation function might represent the motion blur kernel. A Gaussian kernel
can represent out-of-focus blur.
2. Noise η(x, y): Noise refers to random disturbances that affect the pixel values of the
image. Common types of noise include Gaussian noise (due to sensor imperfections),
salt-and-pepper noise (impulsive noise), and Poisson noise (photon noise).
3. Convolution Operation ∗: The convolution operation expresses how the degra-
dation function affects the entire image. Each pixel in the degraded image is a
weighted sum of the original image’s neighboring pixels, depending on the degra-
dation kernel.
Restoration Goal:
The primary objective in image restoration is to estimate the original image f (x, y)
from the observed image g(x, y) by reversing the degradation process. This typically
involves applying filtering techniques (such as inverse filtering or Wiener filtering) to
remove the effects of h(x, y) and reduce the noise η(x, y).
Types of Degradations:
• Blur: Often caused by relative motion between the camera and object or by out-of-
focus capture. This results in a smoothing effect, where edges and details become
less distinct.
• Noise: Random variation in pixel values that reduces image quality. It may result
from electronic interference, poor lighting conditions, or sensor imperfections.
Example:
Consider an image blurred due to motion. The degradation function h(x, y) in this
case can be represented by a motion blur kernel, which simulates the effect of the camera
moving during exposure. The convolution of this kernel with the original image produces
the blurred image g(x, y).

• If the camera moves horizontally during the image capture, the degradation func-
tion could be a horizontal line kernel. Each pixel value in the degraded image is
influenced by the neighboring pixel values along the direction of the motion.
To restore the original image, we apply an inverse filtering technique where we estimate
the original image by deconvolving the blurred image with the degradation function.
However, noise η(x, y) complicates this process, as blindly applying inverse filtering can
amplify the noise.
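Code sketch: simulating the degradation model g(x, y) = h(x, y) ∗ f(x, y) + η(x, y) with a horizontal motion-blur kernel and additive Gaussian noise (the kernel length, noise level, and random test image are illustrative).

```python
import numpy as np
from scipy.signal import convolve2d

f = np.random.rand(128, 128)                      # stand-in original image f(x, y)

# Degradation function h(x, y): horizontal motion blur over 9 pixels.
h = np.full((1, 9), 1.0 / 9.0)

# Additive Gaussian noise eta(x, y) with a small standard deviation.
rng = np.random.default_rng(0)
eta = rng.normal(0.0, 0.01, f.shape)

# g(x, y) = h(x, y) * f(x, y) + eta(x, y)
g = convolve2d(f, h, mode='same', boundary='wrap') + eta
print(g.shape, g.mean())
```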
Conclusion:
The degradation model forms the foundation for many image restoration techniques.
By understanding the nature of the degradation, including the type of noise and the
degradation function, we can apply appropriate restoration techniques to recover the
original image as accurately as possible.

Noise Models
Noise models describe the different types of noise that can affect an image, often de-
pending on the acquisition method and environmental conditions. Some common noise
models are:
• Rayleigh Noise: This noise follows a Rayleigh distribution and is usually observed
in situations where the overall signal involves scattering or multiple reflections, such
as radar or ultrasonic imaging. The probability density function (PDF) is given by:
      p(z) = (2/b)(z − a) e^(−(z − a)²/b)   for z ≥ a
      p(z) = 0                              for z < a
This distribution has a long tail, which means that it is asymmetric. It can often
be modeled for cases where noise is not symmetrically distributed.
Example: Rayleigh noise is common in radar signal processing.
• Gamma (Erlang) Noise: Gamma or Erlang noise is used to model noise where
the data follows a Gamma distribution, commonly in imaging systems where the
variance changes depending on the signal’s amplitude. The PDF for Gamma noise
is:
      p(z) = (a^b z^(b−1) / (b − 1)!) e^(−az)   for z ≥ 0
Where a and b control the shape and scale of the distribution. This type of noise
often occurs in signals that involve waiting times or processes that sum several
independent variables.
Example: Gamma noise is used to model noise in telecommunications systems.

• Exponential Noise: This type of noise follows an exponential distribution, and it is frequently encountered in areas such as wireless communications and some biological imaging applications. The PDF is:

      p(z) = a e^(−az)   for z ≥ 0

The exponential distribution is suitable for modeling noise where lower intensities
are more likely to occur, with the likelihood of higher values dropping off exponen-
tially.
Example: Exponential noise is often modeled in signal detection systems, such as
sonar.

These noise models help in selecting appropriate restoration techniques to improve image
quality in various fields of application.
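Code sketch: drawing Rayleigh, Gamma (Erlang), and exponential noise fields with NumPy's random generator. Note that NumPy's parameterizations differ slightly from the textbook PDFs above, so the scale and shape values here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
dims = (256, 256)

rayleigh_noise = rng.rayleigh(scale=10.0, size=dims)          # long-tailed, asymmetric
gamma_noise = rng.gamma(shape=2.0, scale=5.0, size=dims)      # Erlang when shape is an integer
exp_noise = rng.exponential(scale=8.0, size=dims)             # small values most likely

flat = np.full(dims, 100.0)                                   # constant test image
degraded = flat + rayleigh_noise                              # additive noise model

for name, n in [("Rayleigh", rayleigh_noise), ("Gamma", gamma_noise),
                ("Exponential", exp_noise)]:
    print(f"{name:12s} mean={n.mean():6.2f} std={n.std():6.2f}")
```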

Mean Filters
Mean filters are simple averaging filters used to reduce noise. They operate by averaging
the pixel values in a neighborhood.
      f_mean(x, y) = (1/(mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} g(x + i, y + j)
Example: A 3 × 3 mean filter can reduce Gaussian noise in an image.

Mean Filters
Mean filters are used to reduce noise by averaging the pixel values in a local neighborhood.
This is an effective technique for smoothing images.

1. Mean filters operate by calculating the average of neighboring pixel values.

2. They are typically applied in a sliding window manner over the image.

3. The size of the filter window can vary, commonly 3 × 3, 5 × 5, etc.

4. These filters are effective in reducing Gaussian noise.

5. Mean filters can blur edges as they smooth all pixel values indiscriminately.

6. The formula for a mean filter is:


      f_mean(x, y) = (1/(mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} g(x + i, y + j)

7. Example: A 3 × 3 mean filter is commonly used for image smoothing to reduce


noise.

Order Statistics Filters


Order statistics filters are a class of non-linear filters that manipulate pixel values based on
their rank or order within a defined neighborhood. These filters are particularly effective
in handling noise without distorting image features, especially edges.

1. Sorting Pixel Values:

• Order statistics filters work by sorting the pixel values within a defined neigh-
borhood (usually a window around the target pixel).
• For each pixel, the surrounding values are arranged in ascending or descending
order.
• The filter then selects a specific pixel value from the sorted list based on the
desired statistical measure (median, minimum, maximum, etc.).
• The process of sorting helps identify outliers and central tendencies, which
play a crucial role in removing noise.

2. Median Filter:

• The most widely used order statistics filter is the Median Filter.
• It replaces the value of a pixel with the median of the pixel values in a defined
neighborhood.
• The median is the middle value in a sorted list, making this filter particularly
robust to outliers such as salt and pepper noise.

3. Edge Preservation:

• Unlike mean filters that blur edges, median filters are known for their ability
to preserve edges while removing noise.
• This characteristic makes them suitable for applications where edge clarity is
important, such as medical imaging or satellite image processing.

4. Salt and Pepper Noise Removal:

• Median filters are highly effective in removing impulse noise, also known as
salt and pepper noise, which manifests as random occurrences of black and
white pixels.

5. Handling Outliers:

• Median filters handle outliers better than mean filters by focusing on the cen-
tral value in the sorted neighborhood, making them less sensitive to extreme
pixel values.
• This makes them particularly effective in images where a small percentage of
pixels are significantly different from their neighbors.

6. Better for Non-Gaussian Noise:

• Median filters outperform mean filters in cases of non-Gaussian noise, such


as impulsive noise, because they do not average pixel values but choose the
central pixel value instead.

7. Example:

• For an image corrupted by salt and pepper noise, a 3 × 3 or 5 × 5 median filter


can be applied to reduce noise while preserving important details like edges.


Median Filter Handling Edge Cases


In median filtering, the computational block often overlaps with the edges of the image.
Below are the common scenarios for handling edge cases:

1. Full Overlap within Image:

• When the computational block fully overlaps with the pixels inside the image,
the median is calculated using all the neighborhood pixel values.
• Example: The center pixel and all its neighbors are part of the image, so the
median value is computed normally.

2. Partial Overlap Outside the Image:

• In this scenario, part of the filter window extends beyond the image boundary.
Several strategies can handle this case:
– Padding with zeros: Values outside the image are assumed to be zero.
– Padding with edge values: The nearest image boundary values are
repeated to fill the missing pixels.
– Cyclic padding: The image is treated as wrapping around, so missing
values are taken from the opposite side of the image.
• Example: If the computational block overlaps the edge, values outside the
image can be padded with zeros.

Figure 1: Different cases of median filter window overlap

3. Edge Pixels (No Overlap):


• For edge pixels where the center of the filter window is outside the image,
the pixels beyond the image can be handled with mirroring or repeating edge
values.
• Example: For pixels at the image boundary, padding strategies such as mir-
roring ensure that the filter can still be applied effectively.
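Code sketch: how the border-handling strategies above map onto SciPy's median_filter modes (zero padding, edge replication, cyclic wrap, mirroring); the 5x5 test image is illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

img = np.arange(25, dtype=float).reshape(5, 5)

# Different strategies for the window overlapping the image border:
zero_pad = median_filter(img, size=3, mode='constant', cval=0.0)  # pad with zeros
edge_pad = median_filter(img, size=3, mode='nearest')             # repeat edge values
cyclic   = median_filter(img, size=3, mode='wrap')                # wrap around the image
mirrored = median_filter(img, size=3, mode='reflect')             # mirror at the border

print(zero_pad[0], edge_pad[0], cyclic[0], mirrored[0], sep="\n")
```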

Adaptive Filters
Adaptive filters are dynamic tools that adjust their parameters based on local image
statistics, making them particularly effective in varying noise environments.

1. Flexibility: Adaptive filters dynamically alter their behavior according to local


variations in the image, allowing for effective noise reduction while preserving im-
portant features.
2. Effectiveness: They excel at handling non-stationary noise, where noise charac-
teristics may change significantly across different regions of the image.
3. Local Computation: These filters rely on local computations of image char-
acteristics, such as mean and variance, to inform their adjustments and improve
performance.

4. Popular Example: The Wiener Filter is a widely used adaptive filter known
for its ability to minimize mean square error, adjusting itself based on local noise
statistics.

5. Error Minimization: Adaptive filters effectively minimize error by modifying the


filter shape or size in response to the intensity and characteristics of the noise in
the local neighborhood.

6. Rapid Changes: They perform particularly well in regions where noise charac-
teristics exhibit rapid changes, ensuring that details are preserved while noise is
reduced.

7. Application: An adaptive Wiener filter can be utilized in various image processing


tasks, such as denoising images captured in fluctuating light conditions or environ-
ments with unpredictable noise patterns.
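As a rough illustration of the idea, the sketch below implements the classical adaptive local noise-reduction filter, which behaves like a locally tuned Wiener filter: where the local variance is close to the noise variance the pixel is pulled toward the local mean, and where the local variance is much larger (edges and detail) the pixel is left almost unchanged. It assumes the noise variance is known or has been estimated; the function name adaptive_wiener is illustrative.

import numpy as np

def adaptive_wiener(image, noise_var, size=3):
    # Output pixel: g - (noise_var / local_var) * (g - local_mean),
    # with the ratio clipped to 1 so the filter never over-corrects.
    pad = size // 2
    img = image.astype(np.float64)
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            window = padded[i:i + size, j:j + size]
            local_mean = window.mean()
            local_var = window.var()
            ratio = 1.0 if local_var == 0 else min(noise_var / local_var, 1.0)
            out[i, j] = img[i, j] - ratio * (img[i, j] - local_mean)
    return out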

Band Reject Filters


Band reject filters are designed to eliminate specific frequency ranges in an image, making them particularly effective for removing periodic noise; notch filters are a closely related special case that rejects frequencies only in small neighbourhoods around selected points.

1. Frequency Blocking: These filters selectively block a defined range of frequencies


while allowing other frequencies to pass through unaffected, ensuring the preserva-
tion of desired image details.

2. Noise Removal: Band reject filters are specifically utilized to eliminate periodic
interference or noise patterns, such as hum or buzz, that often occur in captured
images.

3. Cutoff Frequencies: The frequencies targeted for removal are bounded by two
cutoff frequencies, D1 and D2 , defining the band that the filter will suppress.

4. Mathematical Expression: The filter response can be expressed mathematically


as:

H(u, v) = 0   if D1 ≤ D(u, v) ≤ D2
H(u, v) = 1   otherwise

where D(u, v) is the distance from the origin in the frequency domain.

5. Implementation: Band reject filters can be effectively implemented using both


spatial and frequency domain techniques, allowing for flexibility based on the ap-
plication and requirements.

6. Application Areas: They are particularly beneficial in scenarios where noise


manifests as concentrated patterns in specific frequency bands, such as electrical
interference.

7. Example: A common application is removing grid-like patterns caused by electrical


interference in scanned images, enhancing the overall quality and clarity of the
image.
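A minimal sketch of an ideal band-reject transfer function built directly from the expression above; the mask is meant to multiply an fftshift-ed spectrum, and the cutoff values in the usage comment are arbitrary.

import numpy as np

def ideal_band_reject(shape, d1, d2):
    # H(u, v) = 0 for D1 <= D(u, v) <= D2, and 1 elsewhere (centred mask).
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    V, U = np.meshgrid(v, u)
    D = np.sqrt(U**2 + V**2)          # distance from the centre of the spectrum
    H = np.ones(shape)
    H[(D >= d1) & (D <= d2)] = 0.0
    return H

# Usage sketch: suppress a ring of frequencies in image f
# F = np.fft.fftshift(np.fft.fft2(f))
# g = np.real(np.fft.ifft2(np.fft.ifftshift(F * ideal_band_reject(f.shape, 30, 50))))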

Band Pass and Notch Filters


Band pass and notch filters serve distinct but complementary roles in frequency analysis
and noise removal in images.

1. Band Pass Filters: Allow frequencies within a specific range to pass through while
attenuating those outside.

2. They effectively isolate certain image features by enhancing desired frequency com-
ponents.

3. Notch Filters: Remove a narrow band of frequencies, targeting specific noise or


interference.

4. They are particularly effective for suppressing periodic noise in images.

5. Both filter types enhance image quality: band pass filters minimize background
noise, while notch filters provide precise noise removal.

6. The choice of cutoff frequencies in band pass filters influences feature emphasis,
while notch filters focus on eliminating particular interference.

7. Examples: Band pass filters are used in medical imaging to highlight edges, while
notch filters are effective in removing repetitive noise patterns from old photographs.

Inverse Filtering
Inverse filtering attempts to recover the original image by applying the inverse of the
degradation process.

1. The method assumes knowledge of the degradation function.

2. Inverse filtering is prone to noise amplification, particularly when the degradation


function has small values.

3. The formula for inverse filtering is:

F̂(u, v) = G(u, v) / H(u, v)

4. Inverse filtering works well when the noise levels are low.

5. It is a simple and direct approach for image restoration.

6. However, it often fails if the noise is significant or the degradation is severe.

7. Example: Restoring an image that has been blurred due to motion.
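A minimal frequency-domain sketch of the idea, assuming the degradation function H is already available as an array sampled on the same (unshifted) frequency grid as the image's FFT; clipping small values of H with eps is a common practical guard against the noise amplification mentioned above, not part of the ideal inverse filter.

import numpy as np

def inverse_filter(g, H, eps=1e-3):
    # F_hat = G / H, with |H| values below eps replaced to avoid dividing by ~0.
    G = np.fft.fft2(g)
    H_safe = np.where(np.abs(H) < eps, eps, H)
    F_hat = G / H_safe
    return np.real(np.fft.ifft2(F_hat))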



Wiener Filtering
Wiener filtering minimizes the mean square error between the original and the degraded
image, taking into account both the noise and image properties.

1. Wiener filtering is optimal when both the noise and the image signal have known
power spectra.

2. The Wiener filter smoothens the image while retaining important features.

3. The formula is:


F̂(u, v) = [ H*(u, v) / ( |H(u, v)|² + Sη(u, v)/Sf(u, v) ) ] · G(u, v)

where H*(u, v) is the complex conjugate of the degradation function, and Sη(u, v) and Sf(u, v) are the power spectra of the noise and the undegraded image, respectively.

4. Wiener filters are effective in reducing Gaussian noise.

5. The filter balances noise reduction with image sharpness.

6. It is widely used in both image and signal processing applications.

7. Example: Wiener filtering is applied in medical imaging to reduce noise while


maintaining the clarity of important features.
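A minimal sketch of Wiener deconvolution, assuming H is sampled on the image's FFT grid and approximating the ratio Sη/Sf by a single constant K, as is often done when the power spectra are not known exactly; the value of K is illustrative.

import numpy as np

def wiener_filter(g, H, K=0.01):
    # F_hat = conj(H) / (|H|^2 + K) * G, with K standing in for S_eta / S_f.
    G = np.fft.fft2(g)
    F_hat = (np.conj(H) / (np.abs(H)**2 + K)) * G
    return np.real(np.fft.ifft2(F_hat))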

— END OF UNIT 3 —

ITECH WORLD AKTU


Subject Name: Image Processing (IP)
Subject Code: BCS057

Unit 4: Image Segmentation

Syllabus
1. Edge detection
2. Edge linking via Hough transform
3. Thresholding
4. Region-based segmentation
5. Region growing
6. Region splitting and merging
7. Morphological processing: erosion and dilation
8. Segmentation by morphological watersheds: basic concepts
9. Dam construction and Watershed segmentation algorithm


Image Segmentation
Image segmentation is the process of dividing an image into distinct regions or segments, where
each segment corresponds to objects or parts of the image. It simplifies the image, making it
easier to analyze by grouping pixels with similar characteristics.

1 Edge Detection
Edge detection is a fundamental tool in image processing and computer vision, primarily used
to identify points in a digital image where the brightness changes sharply or discontinuously.
These sharp changes are known as edges, which typically represent the boundaries between
different objects or regions within the image. Detecting these edges is essential for tasks like
image segmentation, object recognition, and scene understanding.

1.1 Methods of Edge Detection


Several algorithms can be applied for edge detection, with each providing different advantages
in terms of accuracy and computational complexity. Some of the most commonly used methods
are:

• Sobel Operator: This operator is used to compute the gradient of image intensity at
each pixel, identifying the direction of the largest possible increase in intensity and the
rate of change in that direction. It works by applying a convolution using two 3x3 kernels
(one for the horizontal and one for the vertical direction).

• Canny Edge Detection: A more advanced multi-step algorithm that provides superior
edge detection by reducing noise and false edges. The steps include Gaussian filtering,
gradient computation, non-maximum suppression, and hysteresis thresholding.

• Prewitt Operator: This is similar to the Sobel operator, but it uses a simpler kernel.
It is less sensitive to noise compared to Sobel and typically used when noise reduction is
less critical.


1.2 Sobel Operator Example


The Sobel operator is based on convolutions using two kernels, one for detecting edges in the
horizontal direction and one for detecting edges in the vertical direction. The two kernels are:

       [ -1  0  1 ]            [ -1 -2 -1 ]
Gx =   [ -2  0  2 ]     Gy =   [  0  0  0 ]
       [ -1  0  1 ]            [  1  2  1 ]

Here, Gx detects changes in the horizontal direction, and Gy detects changes in the vertical
direction. The magnitude of the gradient at each pixel is then computed by combining these
two results as follows:
G = √(Gx² + Gy²)

The direction of the edge at each pixel is computed using:

θ = tan⁻¹(Gy / Gx)

1.3 Steps of Sobel Edge Detection


1. Convert Image to Grayscale: If the input image is in color, convert it to grayscale since
edge detection typically works on single-channel intensity values.
2. Apply Sobel Kernels: Convolve the grayscale image with both Sobel kernels Gx and Gy
to compute the horizontal and vertical gradients.
3. Compute Gradient Magnitude and Direction: Use the formulas provided to calculate the
magnitude and direction of the edges at each pixel.
4. Thresholding: Apply a threshold to the gradient magnitude to keep only the significant
edges, discarding weak edges that may be due to noise.
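The four steps can be condensed into a short sketch using NumPy and SciPy; the threshold value is arbitrary and the function name sobel_edges is illustrative.

import numpy as np
from scipy.ndimage import convolve

def sobel_edges(gray, threshold=100):
    # Sobel kernels for horizontal and vertical gradients
    Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    Ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
    gx = convolve(gray.astype(float), Kx, mode="nearest")
    gy = convolve(gray.astype(float), Ky, mode="nearest")
    magnitude = np.hypot(gx, gy)        # sqrt(gx^2 + gy^2)
    direction = np.arctan2(gy, gx)      # edge direction in radians
    edges = magnitude >= threshold      # keep only the significant edges
    return magnitude, direction, edges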


2 Edge Linking via Hough Transform


The Hough transform is a powerful technique used in image processing to detect and link edges
into parametric shapes, such as lines, circles, or ellipses, within an image. It is particularly
useful for detecting shapes in the presence of noise, occlusion, or incomplete edges.

2.1 Principle of Hough Transform


The basic principle behind the Hough transform is to transform points in the image space (edge
pixels) into a parameter space, where lines or curves can be represented in a more convenient
way. For example, a line in 2D space can be described by the equation:

y = mx + c
However, to avoid the difficulty of representing vertical lines, the Hough transform often
uses the polar coordinate form of a line:

ρ = x cos(θ) + y sin(θ)
Where:

• ρ is the perpendicular distance from the origin to the line.

• θ is the angle formed by the line with respect to the x-axis.

Each point in the image space corresponds to a sinusoidal curve in the parameter space
(ρ, θ).

2.2 Steps of Hough Transform for Line Detection


1. Edge Detection: First, apply an edge detection algorithm (such as Sobel or Canny) to
identify edge pixels in the image.

2. Transform to Hough Space: For each edge pixel (x, y), compute the values of ρ and
θ for a range of angles (usually 0◦ to 180◦ ) and plot the sinusoidal curves in the Hough
parameter space. Each point in the image space corresponds to a curve in Hough space.

3. Accumulator Array: Maintain an accumulator array where each cell corresponds to a


particular (ρ, θ) value. The value in each cell is incremented every time a curve passes
through that cell. High values in the accumulator array indicate potential lines in the
image.

4. Identify Peaks in the Accumulator Array: Once all points are transformed into
Hough space, peaks in the accumulator array represent potential lines in the original
image. These peaks correspond to the parameters (ρ, θ) of lines that pass through multiple
edge points.

5. Line Drawing: Finally, map the detected peaks back into the image space to draw the
detected lines.
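A minimal sketch of the voting stage described above, assuming a binary edge map as input; finding the peaks in the returned accumulator and mapping them back to lines is left to the caller.

import numpy as np

def hough_lines(edge_map, num_thetas=180):
    # Accumulate votes in (rho, theta) space for every edge pixel.
    rows, cols = edge_map.shape
    diag = int(np.ceil(np.hypot(rows, cols)))
    thetas = np.deg2rad(np.arange(0, num_thetas))            # 0 .. 179 degrees
    accumulator = np.zeros((2 * diag + 1, num_thetas), dtype=np.int64)
    ys, xs = np.nonzero(edge_map)                            # edge pixel coordinates
    for x, y in zip(xs, ys):
        for t_idx, theta in enumerate(thetas):
            rho = int(round(x * np.cos(theta) + y * np.sin(theta))) + diag
            accumulator[rho, t_idx] += 1                     # one vote for this (rho, theta)
    return accumulator, thetas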


2.3 Advantages of the Hough Transform


• Robust to noise: The Hough transform can detect lines even if the edges are noisy or
incomplete.

• Detects multiple shapes: It can be extended to detect different shapes like circles, ellipses,
or other parametric curves by adjusting the parameterization.

• Can handle occlusion: Even when parts of the shape are missing or occluded, the Hough
transform can still link edges and detect the complete shape.

2.4 Limitations of the Hough Transform


• High computational cost: The Hough transform can be computationally expensive, espe-
cially for detecting complex shapes like circles or ellipses.

• Discretization errors: The accuracy of the detected lines depends on the resolution of the
parameter space, which may lead to discretization errors.

• Memory usage: The accumulator array can become large, especially when detecting mul-
tiple shapes in an image.

2.5 Applications of Hough Transform


The Hough transform is widely used in various fields, such as:

• Road lane detection: Used in autonomous driving to detect lanes on roads by identi-
fying straight lines in the image.

• Medical image analysis: Helps in detecting structures like blood vessels or bone frac-
tures in medical images.

• Object recognition: The transform is used to detect predefined shapes in images, such
as circular objects or regular geometric patterns.


3 Thresholding
Thresholding is a fundamental technique in image processing used to convert a grayscale image
into a binary image by classifying pixel values into two categories: foreground and background.
The goal of thresholding is to segment an image by selecting a proper threshold value, which
differentiates between the object and its background.

3.1 Working Principle


In thresholding, each pixel in a grayscale image is compared to a specific threshold value. If
the pixel value is greater than or equal to the threshold, it is assigned a value representing
the foreground (typically white, or 255). Otherwise, it is assigned a value representing the
background (typically black, or 0). This results in a binary image where objects are separated
from the background.
I(x, y) = 255   if I(x, y) ≥ T
I(x, y) = 0     if I(x, y) < T
Where I(x, y) is the intensity of pixel (x, y) and T is the threshold value.
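A short NumPy sketch of this rule; the threshold value in the signature is just an example.

import numpy as np

def global_threshold(gray, T=128):
    # Pixels >= T become 255 (foreground), the rest become 0 (background).
    return np.where(gray >= T, 255, 0).astype(np.uint8)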

3.2 Types of Thresholding


There are two main types of thresholding techniques, depending on how the threshold value is
selected:

• Global Thresholding: A single threshold value T is chosen for the entire image. This
method works well when the image has uniform lighting and clear contrast between the
object and the background. However, it may fail when the lighting conditions are uneven.

• Adaptive Thresholding: In this method, different threshold values are applied to


different regions of the image based on local pixel intensities. This technique is useful
when the image has varying lighting conditions or when different parts of the image need
different thresholds for accurate segmentation. Two common approaches to adaptive
thresholding are:

– Mean Thresholding: The threshold for each pixel is set to the mean of the inten-
sities in the local neighborhood of that pixel.
– Gaussian Thresholding: The threshold for each pixel is calculated based on a
weighted sum of local intensities, giving more importance to the pixels closer to the
center.

3.3 Thresholding Example


For an image with pixel intensity values ranging from 0 to 255, let’s apply a global threshold
value of 128:


• Before Thresholding: Suppose a grayscale image has the following intensity values for
a small 3x3 region:

[ 100  150  120 ]
[ 200   90  140 ]
[  60  180  130 ]

• After Thresholding: If a threshold value T = 128 is used, all pixel values greater than
or equal to 128 will be set to 255 (white), and those less than 128 will be set to 0 (black).
The result is:

[   0  255    0 ]
[ 255    0  255 ]
[   0  255  255 ]

In this example, the thresholding operation helps in separating brighter regions (object)
from the darker background.

3.4 Applications of Thresholding


Thresholding is widely used in many image processing tasks, including:

• Document Image Binarization: Thresholding is commonly applied to digitize text


documents, where the goal is to separate the text (foreground) from the paper (back-
ground).

• Medical Imaging: It is used to segment regions of interest, such as tumors or other


anatomical structures, from medical images like X-rays or MRIs.

• Object Detection: In industrial vision systems, thresholding is often employed to detect


objects by separating them from the background in production lines.

3.5 Limitations of Thresholding


• Global Thresholding Sensitivity: Global thresholding can be highly sensitive to light-
ing conditions and may not work well in images with uneven illumination.

• Noise Susceptibility: Thresholding may incorrectly classify pixels as foreground or


background due to noise, particularly in low-quality images.


4 Region-Based Segmentation
Region-based segmentation is a technique in image processing that divides an image into regions
based on similarities in pixel properties such as intensity, color, or texture. The goal is to group
together pixels that share similar characteristics, effectively separating different objects or areas
within the image.

4.1 Basic Principle


The idea behind region-based segmentation is that neighboring pixels within the same region
have similar properties, while pixels in different regions are significantly different. These prop-
erties can include pixel intensity, color, or texture patterns.
This method differs from edge-based techniques as it focuses on the homogeneity of pixel

values rather than identifying sharp discontinuities between regions.

4.2 Types of Region-Based Segmentation


Two common approaches to region-based segmentation are:
• Region Growing: Starts with a set of seed points (initial pixels) and expands the region
by adding neighboring pixels that meet specific similarity criteria.

• Region Splitting and Merging: This method begins by considering the entire image
as a single region, then splits it into smaller regions. If adjacent regions are found to be
similar, they are merged back together.

4.3 Region Growing


Region growing is a simple and effective technique for image segmentation. It starts from one
or more seed points in the image, and the region expands by including neighboring pixels that
have similar intensity values or other properties. The process continues until no more pixels
meet the similarity condition.
Steps of Region Growing:
1. Select Seed Points: Choose one or more initial seed points, typically based on user
input or some predefined criteria (such as the brightest pixel in an area).

2. Grow the Region: For each seed point, examine its neighboring pixels. If a neighboring
pixel has a similar intensity (or another property), it is added to the region.

3. Repeat: The process continues recursively, growing the region by checking the neighbors
of the newly added pixels.

4. Stop Condition: The region-growing process stops when no more neighboring pixels
satisfy the similarity condition.
Criteria for Region Growing:
• Pixel intensity difference: Neighboring pixels are added if their intensity is within a certain
range of the seed point’s intensity.

• Texture: Regions may be grown based on texture similarity rather than intensity.

• Color: For colored images, pixels may be added based on the similarity in RGB or other
color space values.

4.4 Example of Region Growing


Consider a medical scan (e.g., an MRI) where different tissues exhibit different intensity ranges.
Region growing can be used to segment tissues such as the brain, liver, or heart. For example:
• The process starts by selecting seed points in areas corresponding to different tissue types.

• The algorithm then grows regions around these seed points, including neighboring pixels
that have similar intensity values to the seed pixel (indicating they belong to the same
tissue type).


• This results in a segmented image where different tissues are clearly separated based on
their intensity.

Region Growing Example:

                  [ 85  86   88 ]
Initial Image:    [ 84  90  150 ]
                  [ 85  89  151 ]

If the seed point is at (2, 2) with intensity 90, the region will grow by including pixels with
intensities close to 90 (e.g., 85, 86, 88) but stop before reaching pixels with much higher
values (e.g., 150, 151). The segmented region might look like:

[ 1  1  1 ]
[ 1  1  0 ]
[ 1  1  0 ]
Where ’1’ indicates pixels that belong to the region, and ’0’ represents excluded pixels.
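A minimal breadth-first sketch of region growing on the example above, using a 4-connected neighbourhood and an intensity-difference tolerance; the seed (2, 2) in the text corresponds to index (1, 1) in zero-based NumPy indexing, and the tolerance value is illustrative.

import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    # Grow a region from `seed` (row, col): add 4-connected neighbours whose
    # intensity differs from the seed intensity by at most `tol`.
    rows, cols = image.shape
    region = np.zeros((rows, cols), dtype=np.uint8)
    seed_val = int(image[seed])
    queue = deque([seed])
    region[seed] = 1
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-neighbourhood
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not region[nr, nc]:
                if abs(int(image[nr, nc]) - seed_val) <= tol:
                    region[nr, nc] = 1
                    queue.append((nr, nc))
    return region

img = np.array([[85, 86, 88],
                [84, 90, 150],
                [85, 89, 151]])
print(region_grow(img, seed=(1, 1), tol=10))   # reproduces the mask shown above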

4.5 Advantages of Region Growing


• Simple and Intuitive: The algorithm is easy to implement and understand.

• Effective in Homogeneous Regions: Region growing works well when there is a clear
difference between objects and background.

• Good for Smooth Boundaries: Produces regions with smooth boundaries, making it
useful for medical image segmentation.

4.6 Limitations of Region Growing


• Sensitivity to Seed Points: The outcome of the algorithm highly depends on the choice
of seed points.

• Noise Sensitivity: If the image is noisy, the region-growing process may include irrele-
vant pixels, leading to inaccurate segmentation.

• Computational Cost: As the algorithm checks many neighboring pixels, it can be


computationally expensive for large images.


5 Region Splitting and Merging


Region splitting and merging is an image segmentation technique that begins by considering
the entire image as a single region, then recursively splits the region into smaller subregions.
After splitting, adjacent regions that are similar in terms of a predefined criterion, such as pixel
intensity, are merged to form larger homogeneous regions.

5.1 Basic Principle


The primary idea behind region splitting and merging is to iteratively divide the image into
smaller regions (splitting) where the pixels are not uniform, and to combine adjacent regions
(merging) that have similar properties. This approach helps in achieving a more accurate
segmentation when compared to simply splitting or region growing.
The process can be summarized as:

• If a region is not homogeneous, it is split into smaller subregions.

• If adjacent subregions are homogeneous, they are merged back together.


5.2 Steps of Region Splitting and Merging


1. Initialize the Region: Treat the entire image as a single region.
2. Splitting: Recursively divide the region into four quadrants. The process continues until
each subregion becomes homogeneous based on a given similarity criterion, such as pixel
intensity or texture.
3. Merging: Check adjacent regions. If neighboring regions meet the similarity condition,
merge them to form a larger, homogeneous region.
4. Stop Condition: The process terminates when no further regions can be split or merged.

5.3 Example of Region Splitting and Merging


Consider an image with pixel intensities:

[ 200  202  198  210 ]
[ 203  200  199  211 ]
[ 150  152  148  155 ]
[ 149  151  147  156 ]

1. Initial Region (Whole Image): The whole image is treated as one region. However,
since the intensity values vary greatly between the top two rows and the bottom two rows, the
image is not homogeneous, and splitting occurs.

2. Splitting: The image is split into four quadrants:

Quadrant 1: [ 200  202 ]     Quadrant 2: [ 198  210 ]
            [ 203  200 ]                 [ 199  211 ]

Quadrant 3: [ 150  152 ]     Quadrant 4: [ 148  155 ]
            [ 149  151 ]                 [ 147  156 ]

3. Merging: After splitting, Quadrants 1 and 2 are merged because their intensity
values are similar. Similarly, Quadrants 3 and 4 are merged as they exhibit close intensity
values. The final segmentation may look like two merged regions:

[ 1  1  1  1 ]
[ 1  1  1  1 ]
[ 2  2  2  2 ]
[ 2  2  2  2 ]
Where Region 1 corresponds to the top half (originally Quadrants 1 and 2) and Region 2
corresponds to the bottom half (originally Quadrants 3 and 4).
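A minimal sketch of the splitting phase on the example above, using a quadtree-style recursion and the intensity range (max minus min) as the homogeneity test; the merging pass is only indicated in a comment, and the tolerance value is illustrative.

import numpy as np

def split_region(image, labels, r0, r1, c0, c1, tol, next_label):
    # Label the block [r0:r1, c0:c1] if it is homogeneous (or too small to split),
    # otherwise split it into four quadrants and recurse.
    block = image[r0:r1, c0:c1]
    if block.max() - block.min() <= tol or r1 - r0 <= 1 or c1 - c0 <= 1:
        labels[r0:r1, c0:c1] = next_label[0]
        next_label[0] += 1
        return
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    split_region(image, labels, r0, rm, c0, cm, tol, next_label)   # top-left
    split_region(image, labels, r0, rm, cm, c1, tol, next_label)   # top-right
    split_region(image, labels, rm, r1, c0, cm, tol, next_label)   # bottom-left
    split_region(image, labels, rm, r1, cm, c1, tol, next_label)   # bottom-right

img = np.array([[200, 202, 198, 210],
                [203, 200, 199, 211],
                [150, 152, 148, 155],
                [149, 151, 147, 156]])
labels = np.zeros(img.shape, dtype=int)
split_region(img, labels, 0, 4, 0, 4, tol=20, next_label=[1])
# labels now holds four homogeneous 2x2 blocks (1..4); a merging pass would then
# join adjacent blocks whose mean intensities are close (here blocks 1-2 and 3-4).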

5.4 Advantages of Region Splitting and Merging


• Accurate Segmentation: Combines both splitting and merging techniques, leading to
more precise segmentation results.
• Flexibility: Handles both over-segmentation (from splitting) and under-segmentation
(from merging), balancing the segmentation process.
• Homogeneity: Ensures that the final segmented regions are homogeneous, leading to
meaningful partitions in the image.


5.5 Limitations of Region Splitting and Merging


• Computation Complexity: The process of recursively splitting and merging regions
can be computationally expensive, especially for large images.

• Selection of Homogeneity Criterion: The success of this method heavily depends on


the homogeneity criterion used. Poor selection can lead to inaccurate segmentation.

• Boundary Handling: The method may struggle with accurately segmenting complex
shapes and boundaries.


6 Morphological Processing: Erosion and Dilation


Morphological processing is a collection of non-linear image processing techniques based on the
shape and structure of objects in an image. These operations are particularly useful in binary
images, where pixels are either part of an object (foreground) or the background. The two
basic morphological operations are erosion and dilation, and they are typically applied using a
structuring element.

6.1 Erosion
Erosion is a morphological operation that removes pixels on the boundaries of objects. It works
by shrinking the boundaries of the foreground object based on the shape of the structuring
element.
How Erosion Works:
• A structuring element (e.g., a small square or cross) is placed over each pixel in the image.

• If every pixel under the structuring element matches the shape of the structuring element,
the central pixel remains unchanged. Otherwise, it is removed (set to background).
Effect: Erosion causes objects in the image to shrink and can be useful for removing small
noise, separating connected objects, or shrinking object boundaries.
Erosion Example: Consider a binary image where '1' represents foreground pixels (object)
and '0' represents background:

                  [ 0  1  1  0 ]
Original Image:   [ 1  1  1  1 ]
                  [ 0  1  1  0 ]
                  [ 0  0  1  0 ]

After applying erosion with a 3 × 3 structuring element, the result might look like:

                [ 0  0  0  0 ]
Eroded Image:   [ 0  1  1  0 ]
                [ 0  1  1  0 ]
                [ 0  0  0  0 ]

Here, pixels on the boundary of the object have been removed.

6.2 Dilation
Dilation is the opposite of erosion. It adds pixels to the boundaries of objects, effectively
enlarging the object. The structuring element defines the shape of this expansion.
How Dilation Works:
• A structuring element is placed over each pixel in the image.

• If at least one pixel under the structuring element is a foreground pixel, the central pixel
is set to the foreground (expanded).


Effect: Dilation increases the size of objects, fills in small holes, and can connect nearby
objects that are close together.
Dilation Example: Given the same binary image:

                  [ 0  1  1  0 ]
Original Image:   [ 1  1  1  1 ]
                  [ 0  1  1  0 ]
                  [ 0  0  1  0 ]

After applying dilation with a 3 × 3 structuring element, the result might be:

                 [ 1  1  1  1 ]
Dilated Image:   [ 1  1  1  1 ]
                 [ 1  1  1  1 ]
                 [ 0  1  1  1 ]

Here, the object has expanded, filling in the gaps and connecting nearby foreground pixels.
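In practice these operations are rarely coded by hand; the sketch below applies SciPy's binary morphology routines to the example image. Note that the exact output near the border depends on how pixels outside the image are treated (SciPy assumes background there by default), so the printed results may differ slightly from the hand-worked matrices above.

import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

A = np.array([[0, 1, 1, 0],
              [1, 1, 1, 1],
              [0, 1, 1, 0],
              [0, 0, 1, 0]], dtype=bool)
se = np.ones((3, 3), dtype=bool)                        # 3 x 3 square structuring element

eroded  = binary_erosion(A, se)                         # shrinks the object
dilated = binary_dilation(A, se)                        # grows the object
opened  = binary_dilation(binary_erosion(A, se), se)    # opening: erosion then dilation
closed  = binary_erosion(binary_dilation(A, se), se)    # closing: dilation then erosion
print(eroded.astype(int))
print(dilated.astype(int))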

6.3 Applications of Erosion and Dilation


• Noise Removal: Erosion can eliminate small noise or stray pixels in binary images.

• Object Separation: Erosion helps separate objects that are connected in a binary
image.

• Hole Filling: Dilation can fill small holes or gaps inside objects.

• Edge Detection: Combining erosion and dilation can be used for edge detection by
subtracting the eroded image from the dilated image.

6.4 Combined Morphological Operations


Erosion and dilation are often used in combination to achieve more complex image processing
tasks:

• Opening: Erosion followed by dilation. This operation removes small objects or noise
while preserving the shape and size of larger objects.

• Closing: Dilation followed by erosion. This operation is useful for closing small holes
inside objects and connecting close objects.

6.5 Structuring Element


A structuring element is a small matrix used to probe and modify the image. The shape and
size of the structuring element define how erosion and dilation affect the image. Common
structuring elements include:

• Square: A 3 × 3 or larger square.

• Cross: A cross-shaped element for selective dilation or erosion.


7 Segmentation by Morphological Watersheds


Morphological watershed segmentation is a region-based method that treats the grayscale im-
age as a topographic surface, where the intensity values of the pixels represent the height in
the topography. High-intensity pixels correspond to ridges (peaks), while low-intensity pixels
represent valleys (basins). The goal of the watershed algorithm is to identify and delineate
these basins, which correspond to different regions in the image.

7.1 Watershed Segmentation Algorithm


The watershed algorithm is inspired by the process of flooding a topographic relief. The key
steps of the algorithm are:

1. Topographic Interpretation: Treat the image as a 3D landscape where pixel intensities


define height. High-intensity regions form peaks, and low-intensity regions form valleys.

2. Flooding Process: The image is conceptually ”flooded” from the lowest intensity points
(basins). As water rises, basins start filling up.

3. Dam Construction: When two basins meet, a dam (ridge) is built to prevent further
merging. These dams represent the segmentation boundaries.

4. Output Segmentation: The final result is the set of ridges or watersheds that separate
different regions (basins) in the image, forming the segmented regions.


7.2 Mathematical Representation


Let I(x, y) be the grayscale image where (x, y) denotes pixel coordinates. The watershed
segmentation function defines regions R1 , R2 , . . . , Rn such that:

Ri = {(x, y) | I(x, y) ∈ Basini }

The boundaries between these regions form the watershed lines.

7.3 Example
Example of Watershed Algorithm: Consider an image where objects are overlapping or
touching, such as coins or cells. These objects may be difficult to separate using traditional
thresholding or region-growing techniques.
Steps in Watershed Segmentation:

• First, apply a preprocessing step such as noise removal and gradient computation to
emphasize the object boundaries.

• Then, mark the background and foreground regions (e.g., using markers for known ob-
jects).

• The watershed algorithm will flood the valleys, and the boundaries where the water from
different regions meets will be identified as segmentation lines.

Application Example: In medical imaging, watershed segmentation is used to separate


touching cells or tissues. For instance, in a 2D scan of cells, the watershed algorithm helps
distinguish individual cells that appear connected due to overlapping intensities.

7.4 Advantages and Disadvantages


• Advantages:

– Can accurately separate objects based on intensity gradients.


– Suitable for segmenting objects that touch or overlap.

• Disadvantages:

– Sensitive to noise and over-segmentation if applied without preprocessing.


– Requires careful selection of markers for meaningful segmentation.

7.5 Watershed Algorithm Variants


To address the sensitivity of the watershed algorithm, several variants and improvements are
used:

• Marker-Controlled Watershed: In this approach, foreground and background markers are


explicitly defined to guide the segmentation process and reduce over-segmentation.

• Gradient-Based Watershed: The gradient of the image is computed, and the watershed
algorithm is applied on the gradient, which sharpens the object boundaries.


8 Watershed Algorithm
The watershed algorithm is a region-based segmentation technique that visualizes an image as
a topographic surface. In this topography, pixel intensity values represent the height of the
surface: high-intensity areas correspond to peaks or ridges, and low-intensity areas correspond
to valleys or basins. The goal of the watershed algorithm is to divide the image into distinct
regions by identifying the boundaries between basins.


8.1 Concept of Watershed


The term ”watershed” refers to the ridge that separates water flow into different basins. In the
context of image segmentation, the algorithm simulates the flooding of an image where water
starts filling basins from the lowest intensity points. As the water rises, different regions are
segmented by the ”watershed lines” that separate the flooded basins.

8.2 Steps of the Watershed Algorithm


The key steps of the watershed algorithm are as follows:

1. Compute the Gradient:

• The gradient of the image is calculated, which highlights the areas of rapid intensity
change, often corresponding to edges between objects.

2. Identify Markers:

• Markers are placed in the image to identify foreground objects (the regions of inter-
est) and background regions. These markers help guide the segmentation process.

3. Flooding Process:

• Starting from the markers, the image is conceptually ”flooded” from the lowest
intensity to the highest intensity points. Water rises simultaneously from all basins.

4. Dam Construction:

• When water from two different basins meets, a dam (watershed line) is constructed
to prevent the merging of regions. These watershed lines define the boundaries
between regions.

5. Segmentation Output:

• The final result is the segmented image, where regions are divided by the watershed
lines.

8.3 Mathematical Explanation


Let f (x, y) be the intensity function of an image, where (x, y) represents the coordinates of
each pixel. The goal of the watershed algorithm is to find the set of regions R1 , R2 , . . . , Rn such
that:
Ri = {(x, y) | f (x, y) ∈ Basini }
The boundaries between these regions are the watershed lines, which correspond to the ridges
in the intensity landscape.


8.4 Example
Consider an image containing overlapping circular objects, such as coins. The watershed algo-
rithm can be used to segment these objects, even when they are touching or overlapping.
Steps:
• Preprocessing: Apply a smoothing filter or a morphological operation (e.g., opening)
to remove noise.
• Gradient Computation: Compute the gradient of the image to highlight object bound-
aries.
• Marker Placement: Mark the foreground (coins) and background (spaces between
coins).
• Apply Watershed: The algorithm floods the regions from the markers, and watershed
lines are formed where the regions meet.
Result: The touching coins are successfully separated into distinct segments, with water-
shed lines marking the boundaries between them.
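A compact marker-controlled sketch of this pipeline, assuming scikit-image and SciPy are available and that the input is already a binary mask of the touching objects; the marker choice (thresholding the distance transform at 60% of its maximum) is deliberately crude and purely illustrative.

import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def separate_objects(binary):
    # Distance transform: peaks sit near the centres of the individual objects.
    distance = ndi.distance_transform_edt(binary)
    # Crude foreground markers: threshold the distance map and label its blobs.
    markers, _ = ndi.label(distance > 0.6 * distance.max())
    # Flood the negated distance map from the markers; watershed lines separate objects.
    return watershed(-distance, markers, mask=binary)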

8.5 Advantages and Disadvantages


• Advantages:
– Accurately segments touching or overlapping objects.
– Can be combined with other segmentation techniques (e.g., marker-controlled wa-
tershed) for better results.
• Disadvantages:
– Sensitive to noise and can lead to over-segmentation.
– Requires preprocessing and marker selection to reduce over-segmentation.

8.6 Marker-Controlled Watershed


One common solution to over-segmentation is the marker-controlled watershed algorithm. In
this variant:
• Markers are placed manually or automatically to identify objects of interest (foreground
markers) and the background.
• The watershed algorithm is applied with these markers to control the flooding process,
reducing the likelihood of over-segmentation.

8.7 Applications
• Medical Imaging: Watershed segmentation is used to separate overlapping structures
such as cells, tissues, or anatomical features in medical images.
• Object Detection: Helps in detecting and separating closely connected objects in im-
ages, such as leaves, coins, or industrial components.


ITECH WORLD AKTU


SUBJECT NAME: IMAGE PROCESSING
(IP)
SUBJECT CODE: BCS057

UNIT 5: Image Compression and Recognition


Syllabus
• Need for Data Compression

• Huffman Coding

• Run Length Encoding

• Shift Codes

• Arithmetic Coding

• JPEG Standard

• MPEG Standard

• Boundary Representation

• Boundary Description

• Fourier Descriptor

• Regional Descriptors: Topological Feature

• Texture: Patterns and Pattern Classes

• Recognition Based on Matching


IMAGE COMPRESSION
What is Image Compression?

1. Image compression refers to the process of reducing the amount of data required to
represent a digital image.

2. The goal is to minimize the file size of the image while maintaining acceptable image
quality.

3. Compression can be classified into lossless and lossy methods. Lossless preserves original
data, while lossy sacrifices some data for higher compression rates.

4. Image compression reduces storage space, making it easier to manage and store large
image datasets.

5. It decreases the bandwidth requirement during transmission, speeding up image transfer


over networks.

6. Compression is often used to lower the costs of storing and transmitting large images in
sectors like healthcare, satellites, and multimedia.

7. Efficient image compression is vital for applications like video streaming, where bandwidth
and storage costs are critical factors.

What is Data Compression and Why Do We Need It? Data compression involves
encoding information using fewer bits than the original representation. It helps in reducing the
size of data for storage, transmission, and efficient use of bandwidth.
Why Do We Need Data Compression?

• To save storage space, especially when dealing with large datasets.

• To reduce the time and bandwidth required for data transmission over networks.

• Compression helps in making data more portable and shareable across different systems.

• Reducing data size enables faster processing, reducing both time and cost for large data-
driven applications.

Compression and Reconstruction: Compression involves transforming data into a com-


pressed form, while reconstruction restores the original or an approximate version of the data.
In lossy compression, reconstruction leads to a slight loss in quality.


Application Areas:

• Medical Imaging: In medical diagnostics, large images from MRI, CT scans, or X-rays
are compressed to reduce storage and enable quick transmission for remote analysis.

• Satellite Imaging: Remote sensing images captured by satellites require compression


to handle the massive data collected over large areas.

• Multimedia Storage and Streaming: Videos, images, and audio files are compressed
to provide real-time streaming services and reduce storage demands for platforms like
YouTube and Netflix.

• Security and Surveillance: Security cameras generate large amounts of image data
that need to be compressed for storage and monitoring purposes.

Image Compression Algorithms: There are several algorithms used to compress images,
and each employs different techniques for efficient compression. Below are some of the most
popular approaches:

1. Transform Coding:

• Transform coding is a widely used technique in image compression that works by


converting the image’s spatial pixel representation into frequency components.
• The most common transformation used is the Discrete Cosine Transform (DCT),
which converts the image into a sum of cosine functions oscillating at different fre-
quencies.
• After transformation, the frequency components are quantized—this is where data
is reduced by removing less important frequencies, which are typically the high-
frequency components.
• The remaining data is then encoded using entropy coding techniques like Huffman
coding.
• Transform coding provides high compression efficiency and is used in formats like
JPEG.

2. Entropy Coding:


• Entropy coding is a lossless data compression technique that compresses data based
on the probability distribution of the symbols.
• It works by assigning shorter codes to more frequent symbols and longer codes to
less frequent ones, optimizing the overall size of the encoded data.
• Huffman coding and arithmetic coding are the most common types of entropy coding.
• This technique is often used in conjunction with other compression methods to op-
timize the final data size, such as after transform coding or predictive coding.

3. Predictive Coding:

• Predictive coding is a technique where the current pixel value is predicted from
neighboring pixel values, and only the difference (error) between the actual and
predicted values is encoded.
• By encoding the prediction errors, which typically have smaller values than the
original pixel intensities, the data can be more efficiently compressed.
• Lossless predictive coding involves exact predictions with no loss of data, whereas
lossy predictive coding allows some inaccuracies to improve compression.
• Predictive coding is widely used in lossless image formats like PNG and lossless
JPEG.

4. Layered Coding:

• Layered coding, also known as scalable coding, involves compressing an image at


multiple layers or levels of detail.
• The base layer provides a coarse representation of the image, and additional en-
hancement layers add progressively more detail.
• This method allows for scalable transmission, where the image quality can adapt to
different network conditions or display requirements.
• Layered coding is particularly useful in applications such as streaming, where differ-
ent devices or network speeds can request varying levels of image detail.

Huffman Coding: Huffman coding is a lossless data compression algorithm based on the
frequency of occurrence of a data item. The principle behind Huffman coding is to use shorter
codes for more frequent items and longer codes for less frequent items, resulting in efficient
compression.

• Huffman coding is based on constructing a binary tree where each leaf node represents a
data symbol.

• The most frequent symbols are given shorter binary codes, and the less frequent symbols
are given longer codes.

• The algorithm first calculates the frequency of each symbol in the dataset, then constructs
the binary tree by merging the two lowest-frequency nodes repeatedly until only one node
remains.

• Once the tree is built, each symbol is assigned a unique prefix-free binary code, which
ensures that no code is a prefix of another, making decoding unambiguous.

• Huffman coding is widely used in image compression algorithms like JPEG and PNG.
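A minimal sketch of the construction using Python's heapq: the two lowest-frequency nodes are merged repeatedly, and every merge prepends one bit to the codes of all symbols in the merged subtree. The code table shown for the sample string is one valid assignment; ties may be broken differently in other implementations.

import heapq
from collections import Counter

def huffman_codes(data):
    # Each heap entry: [total_frequency, tie_breaker, [(symbol, code), ...]]
    freq = Counter(data)
    heap = [[f, i, [(sym, "")]] for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                                   # degenerate case: one distinct symbol
        return {heap[0][2][0][0]: "0"}
    while len(heap) > 1:
        lo = heapq.heappop(heap)                         # lowest-frequency node
        hi = heapq.heappop(heap)                         # second lowest
        lo[2] = [(sym, "0" + code) for sym, code in lo[2]]   # left branch gets bit 0
        hi[2] = [(sym, "1" + code) for sym, code in hi[2]]   # right branch gets bit 1
        heapq.heappush(heap, [lo[0] + hi[0], counter, lo[2] + hi[2]])
        counter += 1
    return dict(heap[0][2])

print(huffman_codes("AAAABBBCCD"))   # e.g. {'A': '0', 'B': '10', 'D': '110', 'C': '111'}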


Lossless Compression:

• Preserves all original data without any loss.

• Decompressed data is identical to the original.

• Typically results in lower compression ratios.

• Used when data integrity is crucial, such as in text files, medical imaging, or archival storage.

• Common algorithms include PNG, GIF, and ZIP.

• Slower decompression due to the complexity of maintaining accuracy.

• Ideal for precise applications where quality loss is not acceptable.

Lossy Compression:

• Sacrifices some data for higher compression rates.

• Decompressed data is not identical but is approximately close to the original.

• Achieves much higher compression ratios compared to lossless.

• Commonly used in applications where some quality loss is acceptable, such as video streaming or multimedia storage.

• Common algorithms include JPEG, MP3, and MPEG.

• Faster decompression due to simpler processing.

• Suitable for applications where storage and bandwidth efficiency is prioritized over perfect quality.

Table 1: Comparison between Lossless and Lossy Compression


Run Length Encoding (RLE)


Run Length Encoding compresses data by encoding consecutive repeated values as a single
data value and a count. It is particularly useful for images with large areas of uniform color.

1. RLE works by replacing consecutive repeated values (or runs) with the value and the
number of times it is repeated.

2. It is highly efficient for data with long runs of the same value, such as simple images or
text files with large blocks of identical characters.

3. RLE is lossless, meaning no data is lost in the compression process.

4. The compression ratio depends on the nature of the data—better performance with repet-
itive data and poor performance with random data.

5. In an image, long horizontal lines of the same color can be significantly compressed using
RLE.

6. RLE is simple to implement, making it popular for tasks like image compression in formats
like BMP or TIFF.

7. A drawback of RLE is that it can increase file size for data with few runs, such as noisy
or highly varied images.

Example: In a binary image with a black background and white text, RLE can efficiently
compress the long black sequences.
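A minimal sketch of RLE on one row of a binary image; the (value, count) pair representation is just one of several common ways to store the runs.

def rle_encode(seq):
    # Collapse runs of identical values into (value, count) pairs.
    runs = []
    for value in seq:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(r) for r in runs]

def rle_decode(runs):
    # Expand (value, count) pairs back into the original sequence.
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]      # one row of a binary image
encoded = rle_encode(row)                  # [(0, 5), (1, 2), (0, 3)]
assert rle_decode(encoded) == row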

Arithmetic Coding
Arithmetic coding is a lossless coding technique that represents a sequence of symbols as a
single number in a continuous range of real numbers between 0 and 1.

1. Unlike Huffman coding, which assigns each symbol its own code made up of a whole number of bits, arithmetic coding encodes an entire message as a single fractional number.


2. Arithmetic coding is more efficient in cases where the symbol probabilities are not powers
of two, allowing for more precise compression.

3. It works by subdividing the interval between 0 and 1 according to the probabilities of the
symbols and then successively narrowing this interval based on the input sequence.

4. The final output is a number within the last narrowed range, representing the entire
sequence of symbols.

5. Arithmetic coding is widely used in multimedia compression standards like JPEG2000


and H.264.

6. Its main advantage over Huffman coding is its ability to handle sources with fractional
probabilities more efficiently.

7. Arithmetic coding is computationally more complex than Huffman coding, but it offers
better compression for highly skewed probability distributions.

Applications of Arithmetic Coding:

• JPEG2000 image compression standard.

• H.264/MPEG-4 AVC video compression.

• Compression in file archivers like 7-Zip.

• Speech and audio compression, where symbol probabilities may vary significantly.

Difference Between Huffman and Arithmetic Coding

Huffman Coding:

• Assigns each symbol its own prefix-free binary code with a whole number of bits, based on symbol probabilities.

• Efficient when symbol probabilities are (negative) powers of two, e.g. 1/2, 1/4, 1/8.

• Simpler to implement and faster to encode and decode.

• Less efficient for skewed or fractional probability distributions.

Arithmetic Coding:

• Represents the entire message as a fractional number in the interval [0, 1).

• More efficient for symbols with probabilities that are not powers of two.

• More complex and slower to encode and decode due to its iterative interval-narrowing process.

• Offers better compression for highly skewed distributions.

Table 2: Comparison between Huffman and Arithmetic Coding


JPEG Standard
JPEG (Joint Photographic Experts Group) is a widely used method of lossy compression for
digital images, particularly photographs.

1. JPEG transforms the image into the frequency domain using the Discrete Cosine Trans-
form (DCT).

2. The image is divided into 8x8 blocks of pixels, and each block is transformed into frequency
components.

3. The DCT coefficients are quantized, reducing precision and leading to data loss.

4. High-frequency components, which contribute less to image quality, are more heavily
quantized.

5. The quantized coefficients are encoded using entropy coding such as Huffman or arithmetic
coding.

6. JPEG provides high compression ratios while maintaining acceptable visual quality, mak-
ing it ideal for photographs and web images.

7. One limitation is that multiple compression cycles degrade image quality due to cumula-
tive data loss.
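The core of steps 1 to 4 for a single 8x8 block can be sketched as follows, assuming SciPy is available; the quantisation table shown is a standard-style luminance table used here purely for illustration, and the entropy coding of the quantised coefficients is omitted.

import numpy as np
from scipy.fftpack import dct, idct

# A standard-style 8x8 luminance quantisation table (values are illustrative).
Q = np.array([[16, 11, 10, 16,  24,  40,  51,  61],
              [12, 12, 14, 19,  26,  58,  60,  55],
              [14, 13, 16, 24,  40,  57,  69,  56],
              [14, 17, 22, 29,  51,  87,  80,  62],
              [18, 22, 37, 56,  68, 109, 103,  77],
              [24, 35, 55, 64,  81, 104, 113,  92],
              [49, 64, 78, 87, 103, 121, 120, 101],
              [72, 92, 95, 98, 112, 100, 103,  99]])

def dct2(block):
    # 2-D DCT applied row-wise then column-wise
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(block):
    return idct(idct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def jpeg_block(block):
    # DCT -> quantise -> dequantise -> inverse DCT for one 8x8 block.
    coeffs = dct2(block.astype(float) - 128)   # level shift then transform
    quantised = np.round(coeffs / Q)           # most high frequencies become 0
    return idct2(quantised * Q) + 128          # lossy reconstruction of the block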

MPEG Standard
MPEG (Moving Picture Experts Group) is a standard for compressing video files, commonly
used in streaming and storage applications.

1. MPEG exploits both spatial redundancy (within frames) and temporal redundancy (be-
tween frames) to achieve compression.

2. Instead of storing every frame completely, MPEG stores a reference frame and then only
the differences between successive frames.

3. It uses techniques like motion estimation to predict frame differences and reduce the data
required.

4. MPEG compression is divided into three types of frames: I-frames (intra-coded), P-frames
(predictive-coded), and B-frames (bi-directional coded).

5. I-frames are compressed independently, while P and B frames are encoded based on
differences with neighboring frames.

6. The use of temporal redundancy allows MPEG to achieve high compression ratios, making
it ideal for streaming video over the internet.

7. MPEG compression standards include MPEG-1, MPEG-2 (used in DVDs), and MPEG-4
(used in online video streaming platforms like YouTube).


Boundary Representation
Boundary representation involves describing the shape of an object by its edges or boundaries.
This is particularly useful for object recognition in images where the focus is on the shape of
the objects.

Approaches Used in Boundary Representation


There are several approaches to represent the boundary of a region. These include:

1. Minimum Perimeter Polygons: This method approximates the object’s boundary us-
ing the smallest possible polygon, helping to simplify the shape while preserving essential
features.

2. Merging Technique: This approach involves merging neighboring boundary segments


based on criteria such as proximity and similarity in orientation.

3. Splitting: In contrast to merging, splitting breaks a complex boundary into simpler


segments, often based on significant changes in the boundary curvature.

4. Signature: A boundary signature is a one-dimensional representation of the bound-


ary, often based on the distance of boundary points from a reference point (such as the
centroid).

5. Boundary Segments: The boundary can be segmented into parts, each representing a
specific curve or straight-line segment for further analysis.

6. Skeletons: Skeletonization reduces a shape to its central structure, capturing the general
form of the object with minimal data.

Boundary Description
Boundary description refers to the detailed outline of the object’s shape, which can be repre-
sented using different mathematical descriptors. Two important descriptors include:


• Eccentricity: This measures how much the shape deviates from being circular. It is
calculated as the ratio of the major axis to the minor axis of the boundary’s fitted ellipse.

• Curvature: Curvature describes how much the boundary deviates from being straight.
It is the rate of change of direction along the boundary.

Fourier Descriptor
Fourier descriptors provide a compact and efficient representation of a boundary by transform-
ing its coordinates into the frequency domain using the Fourier transform.

1. Fourier descriptors are computed by applying the Fourier transform to the boundary
points, treating them as a sequence of complex numbers.

2. The resulting coefficients represent the boundary at different frequency components, with
low-frequency components capturing the general shape and high-frequency components
capturing finer details.

3. Truncating high-frequency components allows for smoothing the boundary, useful for
noise reduction and shape approximation.

4. Fourier descriptors are invariant to translation, rotation, and scaling, making them ideal
for object recognition tasks.

5. The descriptor is highly efficient in representing complex shapes with fewer coefficients
than other methods.

6. Fourier descriptors are widely used in image analysis for pattern recognition and shape
classification.

7. They also allow for shape reconstruction from a limited number of frequency components.
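A minimal sketch, assuming the boundary is given as an N x 2 array of (x, y) points ordered along the contour; keeping only the lowest-frequency coefficients reconstructs a smoothed version of the boundary, as described in points 2 and 3 above.

import numpy as np

def fourier_descriptors(boundary, keep=16):
    # Treat each boundary point (x, y) as the complex number x + jy and take its FFT.
    z = boundary[:, 0] + 1j * boundary[:, 1]
    coeffs = np.fft.fft(z)
    # Keep only the `keep` lowest-frequency coefficients (positive and negative).
    truncated = np.zeros_like(coeffs)
    truncated[:keep // 2] = coeffs[:keep // 2]
    truncated[-keep // 2:] = coeffs[-keep // 2:]
    approx = np.fft.ifft(truncated)            # smoothed boundary approximation
    return coeffs, np.column_stack([approx.real, approx.imag])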


Topological Features
Topological features describe the structural properties of an image region, independent of geo-
metric distortions. Important topological features include:

1. Connected Components: A connected component is a maximal set of pixels that are


adjacent to each other, forming a distinct object or region.

2. Euler Number: The Euler number is a topological invariant that measures the difference
between the number of connected components and the number of holes in a region.

3. Number of Holes: A hole is a region of background pixels enclosed by a connected


component. The number of holes gives information about the complexity of the object.

4. Total Hole Area: The total hole area is the sum of the areas of all the holes inside a
connected region, providing insight into the shape and structure of the object.

Approaches for Region Description


There are various approaches used to describe a region, which can be categorized into:

1. Statistical Approach: This approach uses statistical properties such as mean, variance,
and standard deviation of pixel values within the region to describe the texture and
appearance.

2. Structural Approach: Structural methods focus on the arrangement and relationship


between basic elements, such as edges, lines, or repetitive patterns within the region.

3. Spectral Approach: This method uses frequency-domain features derived from trans-
formations like the Fourier transform or wavelet transform to capture the texture and
spatial frequency components of the region.

Relational Descriptor
A relational descriptor describes the spatial relationship between objects or regions in an image.
It captures how different parts of an image are positioned relative to each other, providing
additional context beyond individual object properties. Common relational descriptors include:

• Adjacency: Whether two regions share a boundary.

• Distance: The spatial distance between regions.

• Orientation: The relative angle between regions.

Binary and Its Sequence


In image processing, binary refers to a representation where pixel values are either 0 or 1.
Typically, 0 represents the background (black) and 1 represents the foreground (white). Binary
images are used for tasks like object detection, edge detection, and morphological operations.
A binary sequence is an ordered series of binary values, representing either the pixel values
along a boundary or the pixel patterns in a particular region.


Example: Consider a binary image of size 4 × 4:

[ 1  1  0  0 ]
[ 1  1  0  0 ]
[ 0  0  1  1 ]
[ 0  0  1  1 ]

This image contains two regions: the top-left and bottom-right squares of ”1”s. A boundary
around the top-left square could be described by the binary sequence 11001100. The sequence
helps describe the pattern of pixel transitions along the boundary.
Binary sequences are also used in run-length encoding, where consecutive pixels of the same
value are compressed by storing the length of each run of identical values. For example, the
binary sequence 11001100 could be encoded as (2, 2, 2, 2), representing two 1’s, two 0’s, two 1’s,
and two 0’s.

Tags in Arithmetic Coding


In arithmetic coding, a tag is a number within the interval [0, 1) that represents the entire
input sequence. The process of arithmetic coding involves successively narrowing the interval
based on the probabilities of the symbols in the sequence. The final tag is a fractional value
within the narrowed interval, which uniquely encodes the entire message.
Mathematical Process:
1. Symbol probabilities: Consider a source with three symbols: A, B, and C with
probabilities P (A) = 0.5, P (B) = 0.3, and P (C) = 0.2. These probabilities define the ranges
for each symbol within the interval [0, 1):

A : [0, 0.5), B : [0.5, 0.8), C : [0.8, 1)

2. Encoding a message: Let’s encode the message ”ABC”. Start with the interval [0, 1).
For each symbol, narrow the interval based on its probability range:

• For A, the new interval is [0, 0.5).

• For B, the range within [0, 0.5) is reduced to [0.25, 0.4) (since B occupies [0.5, 0.8) of the
current interval).

• For C, the final range within [0.25, 0.4) is [0.37, 0.4) (since C occupies [0.8, 1) of the
current interval).

3. Final tag: The tag representing the message ”ABC” is any number within the final
interval [0.37, 0.4). For example, we could choose 0.375 as the tag.
4. Decoding: To decode the message, start with the tag 0.375 and reverse the process.
Find which symbol’s range contains the tag:

• 0.375 lies in [0, 0.5), so the first symbol is A.

• Narrow the interval to [0, 0.5) and continue. 0.375 lies in [0.25, 0.4) within this range, so
the second symbol is B.

• Finally, 0.375 lies in [0.37, 0.4), so the third symbol is C.

Thus, the tag 0.375 decodes to ”ABC”.
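The interval narrowing above can be reproduced with a few lines of Python; the symbol ranges are those of the example, and the function simply returns the midpoint of the final interval as the tag (the value 0.375 chosen in the text is equally valid, since it also lies in [0.37, 0.4)).

def arithmetic_tag(message, ranges):
    # Narrow [low, high) once per symbol; any number in the final interval is a valid tag.
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        sym_low, sym_high = ranges[sym]
        low, high = low + span * sym_low, low + span * sym_high
    return (low + high) / 2, (low, high)

ranges = {"A": (0.0, 0.5), "B": (0.5, 0.8), "C": (0.8, 1.0)}
tag, interval = arithmetic_tag("ABC", ranges)
print(interval)   # approximately (0.37, 0.4)
print(tag)        # 0.385, a valid tag for "ABC"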


Advantages of Arithmetic Coding:


• It provides better compression than fixed-length or variable-length coding schemes (e.g.,


Huffman coding) for certain symbol probability distributions.

• Arithmetic coding is particularly useful when symbol probabilities are not powers of two,
leading to more precise compression.

• The final tag represents the entire sequence, allowing for efficient encoding of long mes-
sages.
