
UNIT-4

Introduction to Segmentation

Image segmentation refers to the task of assigning a class label to different groups of pixels. While the
input is an image, the output is a mask that outlines the regions of the shapes in that image. Image segmentation
has wide applications in domains such as medical image analysis, self-driving cars, satellite image analysis,
etc. There are different types of image segmentation techniques, such as semantic segmentation and instance
segmentation. To summarize, the key goal of image segmentation is to recognize and understand what's
in an image at the pixel level.

Image segmentation is the technique of subdividing an image into constituent sub-regions or distinct objects.
The level of detail to which subdivision is carried out depends on the problem being solved. That is,
segmentation should stop when the objects or the regions of interest in an application have been detected.

Segmentation of non-trivial images is one of the most difficult tasks in image processing. Segmentation
accuracy determines the eventual success or failure of computerized analysis procedures. Segmentation
procedures are usually done using two approaches – detecting discontinuity in images and linking edges to
form the region (known as edge-based segmenting), and detecting similarity among pixels based on intensity
levels (known as threshold-based segmenting).

Image segmentation is a fundamental technique in digital image processing and computer vision. It involves
partitioning a digital image into multiple segments (regions or objects) to simplify and analyze an image by
separating it into meaningful components, which makes image processing more efficient by focusing on specific
regions of interest. A typical image segmentation task goes through the following steps:

Groups pixels in an image based on shared characteristics like colour, intensity, or texture.

Assigns a label to each pixel, indicating its belonging to a specific segment or object.

The resulting output is a segmented image, often visualized as a mask or overlay highlighting the different segments.

Image segmentation is crucial in computer vision tasks because it breaks down complex images into manageable
pieces. It's like separating ingredients in a dish. By isolating objects (things) and backgrounds (stuff), image analysis
becomes more efficient and accurate. This is essential for tasks like self-driving cars identifying objects or medical
imaging analyzing tumours. Understanding the image's content at this granular level unlocks a wider range of
applications in computer vision.
Semantic Classes in Image Segmentation: Things and Stuff.

In semantic image segmentation, we categorize image pixels based on their semantic meaning, not just their visual
properties. This classification system often uses two main categories: Things and Stuff.

Things: Things refer to countable objects or distinct entities in an image with clear boundaries, like people, flowers,
cars, animals, etc. The segmentation of "Things" aims to label individual pixels in the image with specific classes by
delineating the boundaries of individual objects within the image.

Stuff: Stuff refers to regions or areas in an image, such as the background or repeating patterns of similar materials,
that cannot be counted, like road, sky, and grass. These may not have clear boundaries but play a crucial role in
understanding the overall context of an image. The segmentation of "Stuff" involves grouping pixels into clearly
identifiable regions based on common properties like colour, texture, or context.

Semantic segmentation

Semantic segmentation is one of the different types of image segmentation, in which a class label is assigned to image
pixels using deep learning (DL) algorithms. In semantic segmentation, collections of pixels in an image are identified
and classified by assigning a class label based on characteristics such as colour, texture, and shape. This provides
a pixel-wise map of the image (a segmentation map) that enables more detailed and accurate image analysis.

For example, all pixels belonging to a 'tree' would be labelled with the same object name, without distinguishing
between individual trees. As another example, a group of people in an image would be labelled as a single object,
'persons', instead of identifying individual people.

Instance segmentation

Instance segmentation is a more sophisticated computer vision task that involves identifying and delineating each
individual object within an image. Instance segmentation therefore goes beyond just identifying objects in an image;
it also delineates the exact boundaries of each individual instance of an object.

The key focus of instance segmentation is thus to differentiate between separate objects of the same class. For
example, if there are many cats in an image, instance segmentation would identify and outline each specific cat. A
segmentation map is created at the level of individual pixels, and separate labels are assigned to specific object
instances, for example by using different coloured labels to represent the different cats in the image.

Instance segmentation is useful in autonomous vehicles to identify individual objects like pedestrians, other vehicles,
and any objects along the navigation route. In medical imaging, analysing scan images for detection of specific
abnormalities is useful for early detection of cancer and other organ conditions.

Traditional image segmentation techniques

The traditional image segmentation techniques, which formed the foundation of modern deep-learning-based
segmentation methods, use thresholding, edge detection, region-based segmentation, clustering algorithms, and
watershed segmentation. These techniques rely on principles of image processing, mathematical operations, and
heuristics to separate an image into meaningful regions.

Thresholding: This method involves selecting a threshold value and classifying image pixels as foreground or
background based on their intensity values.

Edge Detection: Edge detection methods identify abrupt changes in intensity or discontinuities in the image, using
algorithms like the Sobel, Canny, or Laplacian edge detectors.

Region-based segmentation: This method segments the image into smaller regions and iteratively merges them
based on predefined attributes such as colour, intensity, and texture, to handle noise and irregularities in the image.

Clustering Algorithms: This method uses algorithms like K-means or Gaussian mixture models to group pixels in an
image into clusters based on similar features like colour or texture.

Watershed Segmentation: Watershed segmentation treats the image like a topographic map, where watershed
lines are identified based on pixel intensity and connectivity, like water flowing down into different valleys.

Applications of Image segmentation

Below is a list of different use cases of image segmentation in image processing:

Autonomous Vehicles: Image segmentation helps autonomous vehicles identify and segment objects such as road
lanes, vehicles, pedestrians, and traffic signs in real time for safe navigation.

Medical Imaging Analysis: Image segmentation is used for segmenting organs, tumours, and other anatomical
structures from medical images like X-rays, MRIs, and CT scans, helping in diagnosis and treatment planning.

Satellite Image Analysis: Used in analysing satellite images for landcover classification, urban planning, and
environmental changes.

Object Detection and Tracking: Segmenting different objects in images or video for tasks like person detection,
anomaly detection, and activity detection in security systems.

Content Moderation: Used in monitoring and segmenting inappropriate content from images or videos for social
media platforms.

Smart Agriculture: Image segmentation methods are used by farmers and agronomists for monitoring crop health,
estimating yield, and detecting plant diseases from images and videos.

Industrial Inspection: Image segmentation helps manufacturing processes with quality control, such as detecting
defects in products.
4.1 Thresholding

Thresholding is one of the segmentation techniques that generates a binary image (a binary image is one
whose pixels have only two values – 0 and 1 and thus requires only one bit to store pixel intensity) from a
given grayscale image by separating it into two regions based on a threshold value. Hence pixels having
intensity values greater than the said threshold will be treated as white or 1 in the output image and the
others will be black or 0.

4.1.1 Global Thresholding
When the intensity distributions of objects and background are sufficiently distinct, it is possible to use a
single, global threshold applicable over the entire image. The basic global thresholding algorithm
iteratively finds the best threshold value for segmenting.

The algorithm is explained below.

Select an initial estimate of the threshold T.

Segment the image using T to form two groups G1 and G2: G1 consists of all pixels with intensity values >
T, and G2 consists of all pixels with intensity values ≤ T.

Compute the average intensity values m1 and m2 for groups G1 and G2.

Compute the new value of the threshold T as T = (m1 + m2)/2

Repeat steps 2 through 4 until the difference between successive values of T is smaller than a pre-defined value
δ.

Segment the image as g(x,y) = 1 if f(x,y) > T and g(x,y) = 0 if f(x,y) ≤ T.

This algorithm works well for images that have a clear valley in their histogram. The larger the value of δ,
the smaller will be the number of iterations. The initial estimate of T can be made equal to the average pixel
intensity of the entire image.
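As a minimal NumPy sketch of this iterative algorithm (the function name and the stopping parameter delta are illustrative, not from the text):

```python
import numpy as np

def iterative_global_threshold(image, delta=0.5):
    """Iteratively refine a global threshold T = (m1 + m2) / 2."""
    T = image.mean()                       # initial estimate: average intensity
    while True:
        g1 = image[image > T]              # group G1: pixels with intensity > T
        g2 = image[image <= T]             # group G2: pixels with intensity <= T
        m1 = g1.mean() if g1.size else 0.0
        m2 = g2.mean() if g2.size else 0.0
        T_new = (m1 + m2) / 2.0            # midpoint of the two class means
        if abs(T_new - T) < delta:         # stop once T changes by less than delta
            return T_new
        T = T_new

# final segmentation: g(x, y) = 1 if f(x, y) > T, else 0
# binary = (image > iterative_global_threshold(image)).astype(np.uint8)
```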
4.1.2 Otsu Thresholding

In its simplest form, the algorithm returns a single intensity threshold that separates pixels into two classes,
foreground and background. This threshold is determined by minimizing the intra-class intensity variance, or
equivalently, by maximizing the inter-class variance.
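In OpenCV, Otsu's method is available directly through cv2.threshold; a brief sketch, assuming an 8-bit grayscale input image (the file name is illustrative):

```python
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
# with THRESH_OTSU the supplied threshold (0) is ignored and the
# optimal value is computed from the image histogram
T, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", T)
```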
4.1.3 Variable/Adaptive Thresholding

Variable (adaptive) thresholding is a method where the threshold value is calculated for smaller regions,
leading to different threshold values for different regions with respect to changes in lighting. There are
broadly two approaches to local thresholding. One approach is to partition the image into non-overlapping
rectangles and then apply global thresholding or Otsu's method to each of the sub-images. In this image
partitioning technique, the methods of global thresholding are applied to each sub-image rectangle by
assuming that each such rectangle is a separate image in itself.

Ex: Bernsen's, Niblack's, and Sauvola thresholding.
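For illustration, scikit-image ships Niblack and Sauvola local thresholding; a minimal sketch (the window size and k values are typical choices, not taken from the text):

```python
from skimage import io
from skimage.filters import threshold_niblack, threshold_sauvola

gray = io.imread("input.png", as_gray=True)
# each pixel is compared against a threshold computed from
# the local window around it
t_niblack = threshold_niblack(gray, window_size=25, k=0.8)
t_sauvola = threshold_sauvola(gray, window_size=25)
binary_niblack = gray > t_niblack
binary_sauvola = gray > t_sauvola
```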




4.1.4 Gaussian Thresholding

For each pixel, the threshold value is calculated by using a weighted sum of the pixel values in a local neighborhood.
The weights are a Gaussian window, which means that pixels closer to the center of the region have a greater
influence.

In 2-D, an isotropic (i.e. circularly symmetric) Gaussian has the form:

G(x, y) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right)

This distribution is shown in Figure 2.

Figure 2: 2-D Gaussian distribution with mean (0, 0) and σ = 1

Using the above function, a Gaussian kernel of any size can be calculated by providing it with appropriate values. A
3×3 Gaussian kernel approximation (two-dimensional) with standard deviation σ = 1 appears as follows.

0.0751  0.1238  0.0751
0.1238  0.2042  0.1238
0.0751  0.1238  0.0751

(sampled at integer offsets and normalized so that the kernel sums to 1)
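OpenCV exposes this as Gaussian adaptive thresholding; a minimal sketch (the block size 11 and constant C = 2 are illustrative choices):

```python
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
# the threshold for each pixel is the Gaussian-weighted mean of its
# 11x11 neighbourhood minus the constant C (here 2)
binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 11, 2)
```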

4.2 Edge-Based Segmentation

Edge-based segmentation techniques work by identifying areas in an image where there is a rapid
change in intensity or color. These changes often mark the edges of objects or regions within the image.

Techniques such as gradient-based methods (like Sobel or Prewitt operators) detect changes in intensity,
while other methods like Canny edge detection apply more sophisticated filtering to get clearer, more
defined edges.

So, when you apply edge-based segmentation to an image, you’re looking for the points where there’s a
sudden jump in brightness or color, marking a transition from one region to another.

The core of edge detection revolves around the concept of gradients. A gradient measures how quickly
image intensity changes at a given pixel. The greater the change, the more likely the pixel is on an edge.

1. Image Gradient Calculation


The first step in any edge detection algorithm is calculating the gradient of the image. The gradient at a pixel
is a vector pointing in the direction of the greatest intensity change. Mathematically, this is calculated using
partial derivatives:

G_x = \frac{\partial I}{\partial x}, \quad G_y = \frac{\partial I}{\partial y}

where I is the intensity of the image,

• G_x is the gradient in the x (horizontal) direction, and

• G_y is the gradient in the y (vertical) direction.

These gradients are typically calculated using filters (or kernels) like Sobel or Prewitt.

2. Edge Magnitude Calculation


The next step is to calculate the magnitude of the gradient at each pixel. This tells us how strong the edge is.
The magnitude M can be calculated using the Pythagorean theorem:

M = \sqrt{G_x^2 + G_y^2}

This gives the strength of the edge at each pixel, with larger values indicating stronger edges.

3. Edge Direction
Once the magnitude is calculated, the direction of the edge can also be determined using:

\theta = \arctan\left( \frac{G_y}{G_x} \right)

This angle helps understand the orientation of the edge.

4. Thresholding
After calculating the gradient magnitude and direction, the next step is to apply thresholding. This step helps
in identifying only the strong edges by filtering out weak gradient values.

A simple thresholding rule might look like:

edge(x, y) = \begin{cases} 1 & \text{if } M(x, y) > T \\ 0 & \text{otherwise} \end{cases}

where T is a predefined threshold value. This creates a binary edge map in which pixels above the threshold
are classified as edges.

5. Non-Maximum Suppression (Optional)


To further refine the edges, non-maximum suppression is applied. This step ensures that only the local
maxima are retained as edges by looking at neighboring pixels and suppressing non-edge pixels. In simpler
terms, it thins out the edges to give a cleaner result.
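The steps above can be sketched with OpenCV's Sobel filters; a minimal example (the threshold value is illustrative and should be tuned per image):

```python
import cv2
import numpy as np

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# 1. gradients G_x and G_y via 3x3 Sobel kernels
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

# 2. edge magnitude M = sqrt(G_x^2 + G_y^2)
magnitude = np.sqrt(gx ** 2 + gy ** 2)

# 3. edge direction theta = arctan2(G_y, G_x)
theta = np.arctan2(gy, gx)

# 4. threshold the magnitude to obtain a binary edge map
T = 100
edges = (magnitude > T).astype(np.uint8) * 255

# 5. Canny performs smoothing, non-maximum suppression and hysteresis
# thresholding internally, yielding thinner, cleaner edges
canny_edges = cv2.Canny(gray, 100, 200)
```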

4.3 Region Based Segmentation


This process involves dividing the image into smaller segments according to a certain set of rules. The
technique employs an algorithm that divides the image into several components with common pixel
characteristics. The process looks for chunks of segments within the image: small segments can absorb
similar pixels from neighbouring ones and subsequently grow in size. The algorithm can pick up the gray
level from surrounding pixels.
Region growing − This method recursively grows segments by including neighbouring pixels with similar
characteristics. It uses the difference in gray levels for gray regions and the difference in textures for
textured images (a minimal code sketch is shown after this list).
Region splitting − In this method, the whole image is initially considered a single region. To divide it into
segments, the method checks whether the pixels in the initial region follow a predefined set of criteria;
pixels that follow similar rules are grouped into one segment.
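A minimal region-growing sketch in NumPy (the seed coordinate and intensity tolerance are illustrative assumptions):

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed` (row, col), adding 4-connected
    neighbours whose gray level is within `tol` of the seed's."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and abs(float(image[ny, nx]) - seed_val) <= tol):
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask
```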
4.4 Mean Shift Clustering

Mean shift clustering is used for image segmentation because it is a non-parametric, unsupervised
method that effectively identifies the modes, or high-density regions, of data, which correspond to the
different segments of an image. By iteratively shifting data points towards the average of the points within a
given window, mean shift clustering can discover arbitrarily shaped clusters without assuming a specific
number of clusters beforehand. This makes it particularly suitable for image segmentation, where the goal is
to partition the image into meaningful regions based on pixel intensity and color.
The method's ability to handle complex structures and its robustness to noise further enhance its
effectiveness in accurately segmenting images.
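OpenCV's pyramid mean shift filtering offers a quick way to try this in practice; a minimal sketch (the spatial and colour window radii are illustrative):

```python
import cv2

bgr = cv2.imread("input.png")  # 8-bit, 3-channel colour image
# sp = spatial window radius, sr = colour window radius: pixels are
# iteratively shifted toward the local mode in joint (position, colour)
# space, flattening the colours within each segment
segmented = cv2.pyrMeanShiftFiltering(bgr, sp=21, sr=51)
```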

4.5 Markov Random Fields

The Markov random field (MRF) is a powerful model in vision: it serves both as a tool for modeling
image data and, using recently developed algorithms, as a means of making inferences about images.
These inferences concern underlying image and scene structure as well as solutions to problems such as
image reconstruction, image segmentation, 3D vision, and object labeling.
4.6 Graph Cut Segmentation

Graph cut is a semiautomatic segmentation technique that you can use to segment an image into
foreground and background elements. Graph cut segmentation does not require good initialization: you
draw lines on the image, called scribbles, to identify what you want in the foreground and what you want in
the background. The graph cut technique applies graph theory to image processing to achieve fast
segmentation. The technique creates a graph of the image where each pixel is a node connected by weighted
edges; the higher the probability that pixels are related, the higher the weight. The algorithm cuts along
weak edges, achieving the segmentation of objects in the image. Graph cuts divide an image into
background and foreground segments. The framework consists of two parts: first, a network flow graph is
built based on the input image; then a max-flow algorithm is run on the graph in order to find the min-cut,
which produces the optimal segmentation.

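As a concrete example, OpenCV's GrabCut implements this graph-cut framework; instead of scribbles it can be initialized with a bounding rectangle around the foreground (the rectangle coordinates below are illustrative):

```python
import cv2
import numpy as np

bgr = cv2.imread("input.png")
mask = np.zeros(bgr.shape[:2], dtype=np.uint8)
bgd_model = np.zeros((1, 65), dtype=np.float64)  # internal background GMM state
fgd_model = np.zeros((1, 65), dtype=np.float64)  # internal foreground GMM state
rect = (50, 50, 300, 200)  # (x, y, width, height) around the object

# builds the network flow graph and runs min-cut for 5 iterations
cv2.grabCut(bgr, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# keep pixels marked definite or probable foreground
foreground = np.where(
    (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
```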
4.7 The Gabor filter

The Gabor filter, named after Dennis Gabor, is a linear filter used in myriad image processing applications
for edge detection, texture analysis, feature extraction, etc. These filters have been shown to possess optimal
localization properties in both spatial and frequency domains and thus are well-suited for texture
segmentation problems. Gabor filters are special classes of bandpass filters, i.e., they allow a certain ‘band’ of
frequencies and reject the others. A Gabor filter can be viewed as a sinusoidal signal of particular frequency
and orientation, modulated by a Gaussian wave. In practice, to analyze texture or obtain features from an
image, a bank of Gabor filters with a number of different orientations is used.

The filter has a real and an imaginary component representing orthogonal directions. The two components
may be formed into a complex number or used individually. The complex form of the filter is:

g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left( -\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2} \right) \exp\left( i \left( 2\pi \frac{x'}{\lambda} + \psi \right) \right)

with x' = x\cos\theta + y\sin\theta and y' = -x\sin\theta + y\cos\theta. In the above equation,
λ — The wavelength of the sinusoidal component. It controls the width of the stripes of the Gabor function;
decreasing the wavelength produces thinner stripes.

θ — The orientation of the normal to the parallel stripes of the Gabor function.

ψ — The phase offset of the sinusoidal function.

σ — The sigma/standard deviation of the Gaussian envelope. It controls the overall size of the Gabor
envelope: for larger bandwidth the envelope increases, allowing more stripes, while with smaller bandwidth
the envelope tightens.

γ — The spatial aspect ratio, which specifies the ellipticity of the support of the Gabor function and controls
the height of the Gabor filter. If the gamma value is large, the height of the Gabor reduces; if the gamma
value is small, the height increases.
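A bank of Gabor filters at several orientations can be built with OpenCV's getGaborKernel; a minimal sketch (all parameter values are illustrative):

```python
import cv2
import numpy as np

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

responses = []
for theta in np.arange(0, np.pi, np.pi / 4):   # 4 orientations: 0, 45, 90, 135 deg
    kernel = cv2.getGaborKernel(ksize=(31, 31), sigma=4.0, theta=theta,
                                lambd=10.0, gamma=0.5, psi=0)
    responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))

# per-pixel texture features: the response of each oriented filter
features = np.stack(responses, axis=-1)
```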

4.8 DWT − Wavelets are functions that are concentrated in time and frequency around a certain point. This
transformation technique is used to overcome the drawbacks of the Fourier method: the Fourier transform,
although it deals with frequencies, does not provide temporal details. According to Heisenberg's Uncertainty
Principle, we can either have high frequency resolution and poor temporal resolution, or vice versa. The
wavelet transform finds its most appropriate use in non-stationary signals. It achieves good frequency
resolution for low-frequency components and high temporal resolution for high-frequency components.

This method starts with a mother wavelet such as Haar, Morlet, or Daubechies. The signal is then essentially
translated into scaled and shifted versions of the mother wavelet.

Wavelet analysis is used to divide information present on an image (signals) into two discrete components —
approximations and details (sub-signals).

A signal is passed through two filters: a high-pass and a low-pass filter. The image is then decomposed into
high-frequency (detail) and low-frequency (approximation) components. At every level, we get 4 sub-signals:
the approximation shows the overall trend of pixel values, while the details capture the horizontal, vertical,
and diagonal components.

If these details are insignificant, they can be set to zero without significant impact on the image, thereby
achieving filtering and compression.
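With the PyWavelets library, a single-level 2-D DWT returns the approximation and the three detail sub-bands; a minimal sketch using the Haar mother wavelet (the zeroing threshold is illustrative):

```python
import pywt
from skimage import io

gray = io.imread("input.png", as_gray=True)  # float image in [0, 1]

# one decomposition level: approximation + (horizontal, vertical, diagonal)
cA, (cH, cV, cD) = pywt.dwt2(gray, "haar")

# simple compression idea: zero out insignificant detail coefficients,
# then reconstruct with the inverse transform
threshold = 0.05
for band in (cH, cV, cD):
    band[abs(band) < threshold] = 0
reconstructed = pywt.idwt2((cA, (cH, cV, cD)), "haar")
```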
