23/9/24, 18:23 Image Segmentation — A Beginner’s Guide | Medium

Image Segmentation — A
Beginner’s Guide
The essentials of Image Segmentation + implementation in
TensorFlow

Raj Pulapakura

6 min read · Feb 4, 2024

Image segmentation is a computer vision technique that assigns a label to
every pixel in an image such that pixels with the same label share certain
characteristics.
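As a toy illustration of this definition, the sketch below stores one integer label per pixel and maps each label to a color. The two labels, the 4x6 size, and the colors are made up for the example, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical labels for a toy image (not from the article).
BACKGROUND, OBJECT = 0, 1

# A 4x6 "image" where each entry is the label of that pixel.
label_mask = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
])

# One RGB color per label, so the mask can be shown as an image.
palette = np.array([
    [0, 0, 255],   # background -> blue
    [255, 0, 0],   # object     -> red
], dtype=np.uint8)

color_image = palette[label_mask]                    # shape (4, 6, 3)
object_pixel_count = int((label_mask == OBJECT).sum())
print(color_image.shape, object_pixel_count)
```

Indexing the palette with the label mask is all it takes to turn labels into the kind of colored overlay shown in segmentation figures.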

For example, in a street scene, all pixels belonging to cars might be labeled
with one color, while those belonging to the road might be labeled with
another.

https://medium.com/@raj.pulapakura/image-segmentation-a-beginners-guide-0ede91052db7

But to understand image segmentation and why it is useful, let’s go back to
basics…

Boring Classifiers

Cute doggo. Source

Is there a cute little doggo in this picture? Of course there is.

This is a classification task. It tells us if there is a dog in the image.

But what if we want to know exactly where the dog is?

One approach is to draw a bounding box around the dog, which is called
Object Detection.

Cute doggo + bounding box. Source + Author

If that’s all you want, then you’re done! But if you want to know exactly where
the dog is, on the pixel level, then you’ll need something better. That’s where
image segmentation comes into play.
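To make the difference between a bounding box and a pixel-level answer concrete, here is a small sketch that derives a box from a mask. The 6x6 "dog" mask and the helper name `bounding_box` are invented for illustration, and NumPy is assumed.

```python
import numpy as np

def bounding_box(mask):
    """Return (row_min, col_min, row_max, col_max) around the True pixels."""
    rows = np.where(np.any(mask, axis=1))[0]
    cols = np.where(np.any(mask, axis=0))[0]
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

# Hypothetical pixel-level mask of the dog (True = dog pixel).
dog_mask = np.zeros((6, 6), dtype=bool)
dog_mask[1:4, 2:5] = True   # body
dog_mask[4, 2] = True       # one leg

box = bounding_box(dog_mask)
box_area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
mask_area = int(dog_mask.sum())
print(box, box_area, mask_area)
```

The box can always be computed from the mask, but not the other way around: the box covers 12 pixels here, while only 10 of them actually belong to the dog.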

Image Segmentation

Street segmentation. Source

The core task of image segmentation is to classify each pixel in an image. In
the above street scene, there are 5 classes: road (pink), vehicles (red),
buildings (yellow), nature (green), sky (blue). Each pixel is assigned one of
these classes.

But sometimes you want to be able to differentiate between different cars, or
different trees. To this end, there are 3 main types of image segmentation,
each providing a different level of detail and information.

Semantic vs. Instance vs. Panoptic

Semantic vs. Instance vs. Panoptic segmentation. Source

Semantic segmentation classifies each pixel based on its semantic class.
All the birds belong to the same class.

Instance segmentation assigns unique labels to different instances, even
if they are of the same semantic class. Each bird gets its own instance
label.

Panoptic segmentation combines the two, providing both class-level and
instance-level labels. Each bird has its own instance label, and all of
them are also identified as a “bird”.
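One way to picture the three variants is as label maps over the same image. The sketch below is only an illustration with a made-up 3x6 scene containing two birds; storing the panoptic result as a (class, instance) pair per pixel is one possible encoding, not a standard the article prescribes.

```python
import numpy as np

BACKGROUND, BIRD = 0, 1  # hypothetical class ids

# Semantic map: every bird pixel carries the same class id.
semantic = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
])

# Instance map: each bird gets its own id (0 = no instance).
instance = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
])

# Panoptic map: a (class id, instance id) pair for every pixel.
panoptic = np.stack([semantic, instance], axis=-1)   # shape (3, 6, 2)

n_birds = len(np.unique(instance[semantic == BIRD]))
print(panoptic.shape, n_birds)
```

The semantic map alone cannot say how many birds there are; the instance ids are what make the count recoverable.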

Cool, but how do we actually implement image segmentation?

There are a couple of ways, such as thresholding and clustering, but deep
learning (my fav) really takes the spotlight when it comes to image
segmentation.
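For comparison with the deep learning approach, thresholding can be sketched in a few lines. This is a minimal example on an invented grayscale array with a hand-picked threshold of 128; real images usually need a threshold chosen from the data.

```python
import numpy as np

def threshold_segment(gray, threshold):
    """Label a pixel 1 if it is brighter than the threshold, else 0."""
    return (gray > threshold).astype(np.uint8)

# Hypothetical grayscale image: bright object on a dark background.
gray = np.array([
    [10,  20, 200, 210],
    [15, 220, 230,  25],
    [12,  18,  22, 240],
], dtype=np.uint8)

mask = threshold_segment(gray, threshold=128)
print(mask)
```

This works when classes differ mainly in brightness, which is rarely the case for street scenes or animals, and that gap is where learned models come in.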

Real-time body part panoptic segmentation. GIF from TensorFlow Blog

U-Net
The U-Net architecture was initially designed for medical image
segmentation, but it has since been adapted for many other use cases.

U-Net. Image by author.

The U-Net has an encoder-decoder structure.

The encoder is used to compress the input image into a latent space
representation through convolutions and downsampling.

The decoder is used to expand the latent representation into a
segmented image through convolutions and upsampling.

The long gray arrows running across the “U” are skip connections, and they
serve two main purposes:

1. During the forward pass, they enable the decoder to access information
from the encoder.

2. During the backward pass, they act as a “gradient superhighway” for
gradients from the decoder to flow to the encoder.

The output of the model has the same width and height as the input, but
its number of channels equals the number of classes we are segmenting.
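The shape bookkeeping described above can be traced without any deep learning library. The helper below is a hypothetical sketch that assumes 'same' padding, 2x2 pooling in each encoder stage, 2x upsampling in each decoder stage, and a bottleneck with twice the filters of the last encoder stage.

```python
def unet_shapes(height, width, n_classes, filters=(64, 128, 256, 512)):
    """Trace (height, width, channels) through an encoder-decoder U-Net."""
    factor = 2 ** len(filters)
    if height % factor or width % factor:
        raise ValueError(f"height and width must be divisible by {factor}")

    shapes = []
    h, w = height, width
    for f in filters:                      # encoder: convolve, then halve H and W
        shapes.append(("encoder", (h, w, f)))
        h, w = h // 2, w // 2
    shapes.append(("bottleneck", (h, w, 2 * filters[-1])))
    for f in reversed(filters):            # decoder: double H and W, then convolve
        h, w = 2 * h, 2 * w
        shapes.append(("decoder", (h, w, f)))
    shapes.append(("output", (h, w, n_classes)))
    return shapes

shapes = unet_shapes(512, 512, n_classes=5)
for stage, shape in shapes:
    print(f"{stage:10s} {shape}")
```

Each decoder stage ends at the same height and width as the matching encoder stage, which is what allows the skip connection to concatenate the two feature maps along the channel axis.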

Code it up
If you’re keen to code, let’s implement the U-Net architecture for semantic
segmentation in TensorFlow.

U-Net Architecture
Defining the model architecture is rather straightforward.

from tensorflow.keras.layers import (
    Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
)
from tensorflow.keras.models import Model

def conv_block(x, n_filters):
    """two convolutions"""
    x = Conv2D(n_filters, (3, 3), padding='same', activation='relu')(x)
    x = Conv2D(n_filters, (3, 3), padding='same', activation='relu')(x)
    return x

def encoder_block(x, n_filters):
    """conv block and max pooling"""
    x = conv_block(x, n_filters)
    p = MaxPooling2D((2, 2))(x)
    return x, p  # we will need x for the skip connections later

def decoder_block(x, skip, n_filters):
    """upsample, skip connection, and conv block"""
    x = Conv2DTranspose(n_filters, (2, 2), strides=(2, 2), padding='same')(x)
    # the skip must be the pre-pooling encoder output so the spatial sizes match
    x = concatenate([x, skip])  # concatenate = skip connection
    x = conv_block(x, n_filters)
    return x

def unet_model(n_classes, img_height, img_width, img_channels):
    inputs = Input((img_height, img_width, img_channels))  # 512x512x3

    # Contraction path, encoder
    c1, p1 = encoder_block(inputs, n_filters=64)   # c1=512x512x64  p1=256x256x64
    c2, p2 = encoder_block(p1, n_filters=128)      # c2=256x256x128 p2=128x128x128
    c3, p3 = encoder_block(p2, n_filters=256)      # c3=128x128x256 p3=64x64x256
    c4, p4 = encoder_block(p3, n_filters=512)      # c4=64x64x512   p4=32x32x512

    # Bottleneck
    bridge = conv_block(p4, n_filters=1024)        # bridge=32x32x1024

    # Expansive path, decoder
    u4 = decoder_block(bridge, c4, n_filters=512)  # 64x64x512
    u3 = decoder_block(u4, c3, n_filters=256)      # 128x128x256
    u2 = decoder_block(u3, c2, n_filters=128)      # 256x256x128
    u1 = decoder_block(u2, c1, n_filters=64)       # 512x512x64

    # notice the softmax activation in the final layer
    outputs = Conv2D(n_classes, (1, 1), activation='softmax')(u1)  # 512x512xn_classes

    model = Model(inputs=[inputs], outputs=[outputs])
    return model

# example classes: [road, vehicles, buildings, nature, background]
IMG_HEIGHT, IMG_WIDTH = 512, 512

# instantiate model to predict 5 classes
model = unet_model(
    n_classes=5,
    img_height=IMG_HEIGHT,
    img_width=IMG_WIDTH,
    img_channels=3
)
# input: 512x512x3
# output: 512x512x5

Loss Function: Categorical Cross Entropy

How do we optimize this model? Well, since image segmentation is really
just classification on the pixel level, we can use the standard classification
loss function, which is Categorical Cross Entropy.

model.compile(
    loss="categorical_crossentropy",
)

We can interpret each pixel of the resulting (512x512x5) volume as a vector
of length 5. Since the last layer uses a softmax activation across the last
dimension, each pixel vector contains the probabilities of that pixel
belonging to each class.
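The per-pixel softmax can be reproduced in NumPy to see what the output volume means. This sketch uses a tiny 4x4x5 array of random scores as a stand-in for the 512x512x5 model output; the random scores and the seed are assumptions for illustration only.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4, 5))   # stand-in for raw per-pixel class scores

probs = softmax(logits)               # each pixel vector now sums to 1
class_map = probs.argmax(axis=-1)     # most likely class per pixel, shape (4, 4)
print(class_map)
```

Taking the argmax over the last axis collapses the probability volume back into a single label per pixel, which is the segmentation map you would display.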

Intuition for model output

Before we can train the model, we need a dataset. The dataset should
contain (image, mask) pairs, where the image (x) is of shape (512x512x3) and
the mask (y) is of shape (512x512x5).

Here is an example ground truth mask:

Image by Prince Canuma

Each pixel can only belong to one class, so it contains a “1” in one of the class
channels, and a “0” in the other channels. You can think of each pixel as a
one-hot vector (because that’s what it is).
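If your masks are stored as integer class ids rather than one-hot channels, the conversion is short. The helper name `one_hot_mask` and the 2x2 example are hypothetical; the sketch assumes the ids run from 0 to n_classes - 1.

```python
import numpy as np

def one_hot_mask(label_mask, n_classes):
    """Turn an (H, W) integer mask into an (H, W, n_classes) one-hot mask."""
    return np.eye(n_classes, dtype=np.float32)[label_mask]

label_mask = np.array([
    [0, 1],
    [4, 2],
])
mask = one_hot_mask(label_mask, n_classes=5)   # shape (2, 2, 5)
print(mask[1, 0])
```

Indexing an identity matrix with the label mask picks, for each pixel, the row that has a single 1 in that pixel's class channel.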

Once you have your dataset prepared, you’re ready to train:

model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10,
)

Of course, this code would not be enough to run a successful model. If you
actually want to implement this, you need to consider preprocessing,
rescaling, batching etc.
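As a rough idea of what those steps involve, here is a NumPy-only sketch of rescaling, one-hot encoding, and batching. The function names, the random stand-in data, and the batch size are all assumptions; in a TensorFlow pipeline the same steps would more likely be expressed with `tf.data.Dataset` `map` and `batch` calls.

```python
import numpy as np

def preprocess(image, label_mask, n_classes):
    """Rescale pixels to [0, 1] and one-hot encode the integer mask."""
    x = image.astype(np.float32) / 255.0
    y = np.eye(n_classes, dtype=np.float32)[label_mask]
    return x, y

def batches(images, label_masks, n_classes, batch_size):
    """Yield (x, y) batches; the last batch may be smaller."""
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        pairs = [preprocess(img, m, n_classes)
                 for img, m in zip(images[start:stop], label_masks[start:stop])]
        xs, ys = zip(*pairs)
        yield np.stack(xs), np.stack(ys)

# Random stand-in data: 5 images of 8x8x3 with integer masks over 5 classes.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 8, 8, 3), dtype=np.uint8)
label_masks = rng.integers(0, 5, size=(5, 8, 8))

batch_shapes = [(x.shape, y.shape) for x, y in batches(images, label_masks, 5, 2)]
print(batch_shapes)
```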

I’ve prepared a Kaggle notebook which tackles car segmentation
(segmenting different parts of a car). It contains the complete code to run an
image segmentation model, so check it out here.

Final Notes
Class Imbalance: Often in image segmentation, there is severe class
imbalance. For example, in an average street view image, cars and
buildings take up a lot of pixels, but stop signs take up very few pixels.
The model has less data on stop signs, so it will perform poorly in
segmenting stop signs. To solve this, you can use Focal Categorical Cross
Entropy and class weights, which place emphasis on minority classes.
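To show how the focal term and class weights change the loss, here is a NumPy sketch of a focal categorical cross entropy. It is an illustrative formulation with assumed values (gamma of 2, hand-picked weights, two pixels and two classes), not the exact loss used in the article's notebook; recent Keras versions also provide a built-in categorical focal loss.

```python
import numpy as np

def focal_categorical_crossentropy(y_true, y_pred, class_weights,
                                   gamma=2.0, eps=1e-7):
    """Mean per-pixel focal cross entropy for one-hot y_true."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    cross_entropy = -y_true * np.log(y_pred)
    focal_factor = (1.0 - y_pred) ** gamma      # shrinks well-classified pixels
    per_pixel = (class_weights * focal_factor * cross_entropy).sum(axis=-1)
    return float(per_pixel.mean())

# Two pixels: an easy one (class 0, p=0.9) and a hard one (class 1, p=0.3).
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.7, 0.3]])

plain_loss = focal_categorical_crossentropy(
    y_true, y_pred, class_weights=np.ones(2), gamma=0.0)
focal_loss = focal_categorical_crossentropy(
    y_true, y_pred, class_weights=np.ones(2), gamma=2.0)
weighted_loss = focal_categorical_crossentropy(
    y_true, y_pred, class_weights=np.array([1.0, 5.0]), gamma=2.0)
print(plain_loss, focal_loss, weighted_loss)
```

With gamma set to 0 and all weights equal to 1, the function reduces to ordinary categorical cross entropy. Raising gamma shrinks the contribution of the easy pixel far more than the hard one, and the class weight scales up the pixels of the minority class.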

Other Architectures: U-Net is not the only image segmentation
architecture, although it is conceptually the simplest. Others include
SegNet, Mask R-CNN, and PSPNet.

Binary Segmentation: If there is only one class you’re segmenting (e.g.
segmenting a brain tumor in an MRI scan), then the output of the model
only needs to be (512x512). For the mask, each pixel will contain a “1” if
that pixel belongs to a tumor, or “0” if that pixel does not belong to a
tumor. Make sure to also change “softmax” to “sigmoid” in the final
activation of the model, and use the (Focal) Binary Cross Entropy loss
function.
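The binary case can be sketched the same way: a sigmoid per pixel, binary cross entropy against the 0/1 mask, and a cutoff to get the final prediction. The 2x2 scores, the mask, and the 0.5 cutoff are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map raw scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy over all pixels."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(losses.mean())

# Hypothetical raw model scores and ground truth for a 2x2 scan.
logits = np.array([[3.0, -2.0], [-4.0, 1.5]])
tumor_mask = np.array([[1.0, 0.0], [0.0, 1.0]])

probs = sigmoid(logits)                       # one probability per pixel
pred_mask = (probs > 0.5).astype(np.uint8)    # 1 = tumor, 0 = not tumor
loss = binary_crossentropy(tumor_mask, probs)
print(pred_mask, loss)
```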

Thanks for reading!

Follow me for more great content:

📃 Medium
🌐 LinkedIn
📽️ YouTube