
Image Segmentation in Deep Learning: Methods and Applications

Modern computer vision technology, based on AI and deep learning methods, has evolved dramatically in the past decade. Today it is used for applications like image classification, face recognition, identifying objects in images, video analysis and classification, and image processing in robots and autonomous vehicles.

Many computer vision tasks require intelligent segmentation of an image to understand what is in it and enable easier analysis of each part. Today's image segmentation techniques use deep learning models for computer vision to understand, at a level unimaginable only a decade ago, exactly which real-world object is represented by each pixel of an image.

Deep learning can learn patterns in visual inputs in order to predict object
classes that make up an image. The main deep learning architecture used
for image processing is a Convolutional Neural Network (CNN), or specific
CNN frameworks like AlexNet, VGG, Inception, and ResNet. Models of
deep learning for computer vision are typically trained and executed on
specialized graphics processing units (GPUs) to reduce computation time.

In this article you will learn:

 What is image segmentation?

 Old-school image segmentation methods

 Image segmentation methods in deep learning

 Image segmentation applications

What is Image Segmentation?

Image segmentation is a critical process in computer vision. It
involves dividing a visual input into segments to simplify image
analysis. Segments represent objects or parts of objects, and
comprise sets of pixels, or “super-pixels”. Image segmentation
sorts pixels into larger components, eliminating the need to
consider individual pixels as units of observation. There are three
levels of image analysis:

 Classification – categorizing the entire image into a class such as "people", "animals", or "outdoors".

 Object detection – detecting objects within an image and drawing a rectangle around them, for example, a person or a sheep.

 Segmentation – identifying parts of the image and understanding what object they belong to. Segmentation lays the basis for performing object detection and classification.
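
To make the three levels concrete, here is a minimal pure-Python sketch (toy data, not a trained model). It also shows why segmentation lays the basis for detection: given a pixel mask, the image-level label and the bounding box both fall out of it.

```python
# Toy illustration of the three levels of analysis, applied to a
# 4x4 binary mask where 1 marks hypothetical "sheep" pixels.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

def classify(mask):
    """Image-level classification: is there any 'sheep' pixel at all?"""
    return "sheep" if any(v for row in mask for v in row) else "background"

def detect(mask):
    """Object detection: tightest box (row0, col0, row1, col1) around the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return (rows[0], cols[0], rows[-1], cols[-1])

print(classify(mask))  # sheep
print(detect(mask))    # (1, 1, 2, 2)
```

The mask itself is the segmentation output; the other two answers are coarser summaries of it.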
Semantic Segmentation vs. Instance Segmentation

Within the segmentation process itself, there are two levels of
granularity:

 Semantic segmentation—classifies all the pixels of an image into meaningful classes of objects. These classes are "semantically interpretable" and correspond to real-world categories. For instance, you could isolate all the pixels associated with a cat and color them green. This is also known as dense prediction because it predicts the meaning of each pixel.
 Instance segmentation—identifies each instance of each object in an image. It differs from semantic segmentation in that it does not merge objects of the same class: if there are three cars in an image, semantic segmentation labels all car pixels with the single class "car", while instance segmentation identifies each individual car.
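
The difference can be shown with a toy label map (hypothetical data): collapsing instance ids into one class recovers the semantic view.

```python
# Hypothetical label maps for a tiny image containing three "cars".
# Instance segmentation keeps distinct ids 1, 2, 3; semantic
# segmentation collapses them all into the single class "car".
instance_map = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
    [3, 3, 3, 0],
]

semantic_map = [["car" if v > 0 else "background" for v in row]
                for row in instance_map]

# Count distinct object instances (ignoring background id 0).
num_instances = len({v for row in instance_map for v in row} - {0})
print(num_instances)    # 3
print(semantic_map[0])  # ['car', 'car', 'background', 'car']
```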

Old-School Image Segmentation Methods

Several image segmentation techniques were commonly used before deep learning but are less effective than their deep learning counterparts, because they rely on rigid algorithms and require human intervention and expertise. These include:

 Thresholding—divides an image into a foreground and background. A specified threshold value separates pixels into one of two levels to isolate objects. Thresholding converts grayscale images into binary images or distinguishes the lighter and darker pixels of a color image.
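
As a minimal sketch of the idea (toy grid, threshold chosen by hand):

```python
# Thresholding sketch: grayscale values above the threshold become
# foreground (1), the rest become background (0).
def threshold(image, t):
    return [[1 if px > t else 0 for px in row] for row in image]

gray = [
    [ 12,  40, 200],
    [ 30, 220, 210],
    [ 15,  35,  25],
]
binary = threshold(gray, 128)
print(binary)  # [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
```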

 K-means clustering—an algorithm identifies groups in the data, with the variable K representing the number of groups. The algorithm assigns each data point (or pixel) to one of the groups based on feature similarity. Rather than analyzing predefined groups, clustering works iteratively to organically form groups.
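
A minimal one-dimensional sketch of the iteration on pixel intensities (toy data; the min/max initialization is a simplification that assumes K = 2):

```python
# 1-D K-means sketch on pixel intensities (K = 2). Centroids are
# refined iteratively; each pixel joins its nearest centroid's group.
def kmeans_1d(pixels, k=2, iters=10):
    centroids = [min(pixels), max(pixels)]  # naive init, valid for k=2
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            groups[nearest].append(p)
        centroids = [sum(g) / len(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids, groups

pixels = [10, 12, 11, 200, 205, 198]
centroids, groups = kmeans_1d(pixels)
print(sorted(round(c) for c in centroids))  # [11, 201]
```

Real implementations cluster on richer features (color, texture, position), but the assign-then-recompute loop is the same.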

 Histogram-based image segmentation—uses a histogram to group pixels based on "gray levels". Simple images consist of an object and a background. The background is usually one gray level and is the larger entity, so a large peak in the histogram represents the background gray level. A smaller peak represents the object, which is another gray level.
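
A toy sketch of the peak-finding idea (placing the threshold midway between the two peaks is one simple choice, not the only one):

```python
# Histogram sketch: count gray levels, then place the threshold midway
# between the two most frequent levels (background peak vs object peak).
from collections import Counter

gray = [10, 10, 10, 10, 10, 12, 200, 200, 198]  # flattened tiny image
hist = Counter(gray)
peak_bg, peak_obj = [level for level, _ in hist.most_common(2)]
t = (peak_bg + peak_obj) / 2
print(peak_bg, peak_obj, t)  # 10 200 105.0
```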

 Edge detection—identifies sharp changes or discontinuities in brightness, for example, the border between a block of red and a block of blue. Edge detection usually involves arranging points of discontinuity into curved line segments, or edges.
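
The "sharp change in brightness" idea reduces to comparing neighbouring pixels. A crude one-direction sketch (real detectors such as Sobel or Canny use 2-D gradients and smoothing):

```python
# Edge-detection sketch: mark a pixel as an edge when the brightness
# jump to its right-hand neighbour exceeds a threshold.
def horizontal_edges(image, t=100):
    return [[1 if abs(row[c + 1] - row[c]) > t else 0
             for c in range(len(row) - 1)]
            for row in image]

# A block of dark pixels (0) meeting a block of bright pixels (255):
image = [[0, 0, 255, 255],
         [0, 0, 255, 255]]
print(horizontal_edges(image))  # [[0, 1, 0], [0, 1, 0]]
```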

How Deep Learning Powers Image Segmentation Methods

Modern image segmentation techniques are powered by deep
learning technology. Here are several deep learning architectures
used for segmentation:

Convolutional Neural Networks (CNNs)
Image segmentation with a CNN involves feeding segments of an image as input to a convolutional neural network, which labels the pixels. The CNN cannot process the whole image at once; it scans the image, looking at a small "filter" of several pixels at a time, until it has mapped the entire image. To learn more, see our in-depth guide about Convolutional Neural Networks.
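
The scanning-filter operation is a convolution. A minimal pure-Python sketch (valid convolution only, no padding, stride, or learned weights):

```python
# A small kernel slides over the image and produces one response
# value per position (valid convolution, stride 1).
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        out.append([
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ])
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]  # sums each pixel with its lower-right neighbour
print(conv2d(image, kernel))  # [[6, 8], [12, 14]]
```

In a real CNN the kernel weights are learned, and many such filters run per layer.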

Fully Convolutional Networks (FCNs)
Traditional CNNs have fully-connected layers, which can't manage different input sizes. FCNs use only convolutional layers, so they can process varying input sizes and work faster. The final output layer has a large receptive field and corresponds to the height and width of the image, while the number of channels corresponds to the number of classes. The convolutional layers classify every pixel to determine the context of the image, including the location of objects.
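
The "channels correspond to classes" idea means the label map is a per-pixel argmax over the class score maps. A toy sketch with hypothetical class names and hand-written scores:

```python
# FCN output sketch: one score map per class (channels = classes);
# the predicted label map is the per-pixel argmax over the channels.
scores = {
    "road":   [[0.9, 0.8], [0.2, 0.1]],
    "person": [[0.1, 0.2], [0.8, 0.9]],
}

h, w = 2, 2
label_map = [[max(scores, key=lambda cls: scores[cls][r][c])
              for c in range(w)]
             for r in range(h)]
print(label_map)  # [['road', 'road'], ['person', 'person']]
```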

Ensemble learning
Ensemble learning synthesizes the results of two or more related analytical models into a single prediction. It can improve prediction accuracy and reduce generalization error, enabling accurate classification and segmentation of images. Segmentation via ensemble learning generates a set of weak base-learners, each of which classifies parts of the image, and combines their output, instead of trying to create one single optimal learner.
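
One simple way to combine base-learners is a per-pixel majority vote, sketched here with hypothetical predictions from three weak learners:

```python
# Ensemble sketch: three weak learners each predict a per-pixel label;
# the ensemble takes a majority vote at every pixel.
from collections import Counter

predictions = [
    [["cat", "cat"], ["bg",  "cat"]],   # learner 1
    [["cat", "bg" ], ["bg",  "cat"]],   # learner 2
    [["cat", "cat"], ["cat", "bg" ]],   # learner 3
]

h, w = 2, 2
vote = [[Counter(p[r][c] for p in predictions).most_common(1)[0][0]
         for c in range(w)]
        for r in range(h)]
print(vote)  # [['cat', 'cat'], ['bg', 'cat']]
```

Averaging per-class probabilities instead of voting on hard labels is an equally common combination rule.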

DeepLab
One main motivation for DeepLab is to perform image segmentation while helping control signal decimation—reducing the number of samples and the amount of data that the network must process. Another motivation is to enable multi-scale contextual feature learning—aggregating features from images at different scales. DeepLab uses an ImageNet pre-trained residual neural network (ResNet) for feature extraction, and uses atrous (dilated) convolutions instead of regular convolutions. The varying dilation rates of these convolutions enable the ResNet block to capture multi-scale contextual information. DeepLab comprises three components:

 Atrous convolutions—convolutions with a rate factor that expands or contracts the filter's field of view.

 ResNet—a deep convolutional neural network (DCNN) from Microsoft. It provides a framework that enables training networks with thousands of layers while maintaining performance. The powerful representational ability of ResNet boosts computer vision applications like object detection and face recognition.

 Atrous spatial pyramid pooling (ASPP)—provides multi-scale information. It uses a set of atrous convolutions with varying dilation rates to capture long-range context. ASPP also uses global average pooling (GAP) to incorporate image-level features and add global context information.
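
The atrous convolution described above can be sketched in one dimension: the kernel taps are spaced `dilation` samples apart, widening the receptive field without adding weights (toy signal, unit kernel; real atrous convolutions are 2-D with learned weights and padding):

```python
# Atrous (dilated) convolution sketch in 1-D. dilation=1 is a
# regular convolution; larger rates see a wider span of the input.
def dilated_conv1d(signal, kernel, dilation):
    span = (len(kernel) - 1) * dilation  # receptive field minus one
    return [sum(kernel[k] * signal[i + k * dilation]
                for k in range(len(kernel)))
            for i in range(len(signal) - span)]

signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 1, 1]
print(dilated_conv1d(signal, kernel, 1))  # [6, 9, 12, 15]  (span 3)
print(dilated_conv1d(signal, kernel, 2))  # [9, 12]         (span 5)
```

ASPP runs several such convolutions in parallel at different rates and concatenates their outputs.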

SegNet neural network
An architecture based on deep encoders and decoders, also known as semantic pixel-wise segmentation. It encodes the input image into a low-dimensional representation and then recovers it with orientation-invariance capabilities in the decoder, generating a segmented image at the decoder end.
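
SegNet's decoder upsamples using the max-pooling indices remembered by the encoder. A 1-D toy sketch of that pool/unpool pairing (real SegNet does this in 2-D between convolutional layers):

```python
# Encoder max-pools and keeps the index of each winning value; the
# decoder "unpools" by writing each value back at its remembered
# position, as SegNet does with pooling indices.
def maxpool_with_indices(x, size=2):
    pooled, indices = [], []
    for i in range(0, len(x), size):
        window = x[i:i + size]
        j = max(range(len(window)), key=lambda k: window[k])
        pooled.append(window[j])
        indices.append(i + j)
    return pooled, indices

def unpool(pooled, indices, length):
    out = [0] * length
    for v, i in zip(pooled, indices):
        out[i] = v
    return out

x = [3, 1, 2, 7]
pooled, idx = maxpool_with_indices(x)
print(pooled, idx)                  # [3, 7] [0, 3]
print(unpool(pooled, idx, len(x)))  # [3, 0, 0, 7]
```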

Image Segmentation Applications

Image segmentation helps determine the relations between
objects, as well as the context of objects in an image. Applications
include face recognition, number plate identification, and satellite
image analysis. Industries like retail and fashion use image
segmentation, for example, in image-based searches. Autonomous
vehicles use it to understand their surroundings.

Object Detection and Face Detection

These applications involve identifying object instances of a
specific class in a digital image. Semantic objects can be
classified into classes like human faces, cars, buildings, or cats.

 Face detection—a type of object-class detection with many applications, including biometrics and autofocus features in digital cameras. Algorithms detect and verify the presence of facial features; for example, eyes appear as valleys in a gray-level image.

 Medical imaging—extracts clinically relevant information from medical images. For example, radiologists may use machine learning to augment analysis by segmenting an image into different organs, tissue types, or disease symptoms. This can reduce the time it takes to run diagnostic tests.

 Machine vision—applications that capture and process images to provide operational guidance to devices, in both industrial and non-industrial settings. Machine vision systems use digital sensors in specialized cameras that allow computer hardware and software to measure, process, and analyze images. For example, an inspection system photographs soda bottles and then analyzes the images according to pass-fail criteria to determine if the bottles are properly filled.

Video Surveillance—video tracking and moving object tracking

This involves locating a moving object in video footage. Uses
include security and surveillance, traffic control, human-computer
interaction, and video editing.

 Self-driving vehicles—autonomous cars must be able to perceive and understand their environment in order to drive safely. Relevant classes of objects include other vehicles, buildings, and pedestrians. Semantic segmentation enables self-driving cars to recognize which areas in an image are safe to drive on.

 Iris recognition—a form of biometric identification that recognizes the complex patterns of an iris, using automated pattern recognition to analyze video images of a person's eye.

 Face recognition—identifies an individual in a frame from a video source, comparing selected facial features from the input image with faces in a database.

Retail Image Recognition

This application provides retailers with an understanding of the
layout of goods on the shelf. Algorithms process product data in
real time to detect whether goods are present or absent on the
shelf. If a product is absent, they can identify the cause, alert the
merchandiser, and recommend solutions for the corresponding
part of the supply chain.

Image Segmentation with Deep Learning in the Real World

In this article we explained the basics of modern image segmentation, which is powered by deep learning architectures like CNNs and FCNs. When you start working on computer vision projects and using deep learning frameworks like TensorFlow, Keras, and PyTorch to run and fine-tune these models, you'll run into some practical challenges:

 Tracking experiment source code, configuration and hyperparameters—convolutional networks have many variations that can impact performance. You'll run many experiments to discover the hyperparameters that provide the best performance for your problem. Organizing, tracking and sharing experiment data can be a challenge.

 Scaling experiments on-premise or in the cloud—CNNs require a lot of computing power, so to run large numbers of experiments you'll need to scale up across multiple machines. Provisioning machines and setting them up to run deep learning projects is time-consuming, and manually running experiments results in idle time and wasted resources.

 Managing training data—computer vision projects use training sets of rich media like images or video. A dataset can range anywhere from gigabytes to petabytes. You need to copy and re-copy this data to each training machine, which takes time and hurts productivity.

MissingLink is a deep learning platform that can help you automate these operational aspects of CNNs and computer vision, so you can concentrate on building winning image recognition experiments. Learn more to see how easy it is.
