Computer Vision Experiential Learning Report

The document discusses image segmentation techniques applied to the Cityscapes dataset. It provides context on image segmentation and its applications. It also describes unsupervised clustering algorithms and supervised deep learning models like U-Net that were used for segmentation. Quantitative metrics to evaluate the different methods are mentioned.
Aditya Pande
21070126001
AIML A1

Image Segmentation of Cityscapes Data with U-Net in PyTorch

Introduction to Image Segmentation:


Image segmentation is a fundamental task in computer vision, playing
a pivotal role in extracting meaningful information from images by
dividing them into semantically coherent regions. Unlike object
detection, which identifies and localizes objects within an image,
segmentation goes a step further by precisely outlining the boundaries
of individual objects or regions. This process is critical for various
applications, ranging from medical imaging and autonomous vehicles
to augmented reality and content-based image retrieval.

Significance in Computer Vision Applications:

1. Object Recognition and Tracking:
- Image segmentation gives a pixel-level outline of each object in a
scene, which supports object recognition and frame-to-frame tracking in
real-time video streams.

2. Medical Imaging:
- In medical fields, segmentation aids in the accurate delineation of
structures and organs, assisting in diagnosis, treatment planning, and
monitoring of diseases.

3. Autonomous Vehicles:
- For autonomous vehicles, accurate segmentation is crucial for
understanding the surrounding environment, identifying road lanes,
pedestrians, and other vehicles.

4. Augmented Reality:
- In augmented reality applications, segmentation helps distinguish
between the foreground and background, allowing virtual elements to
seamlessly interact with the real world.

Explanation of the Project:

In our project, we focus on image segmentation using the Cityscapes dataset,
which contains labeled urban scenes captured from vehicles in Germany. The
dataset provides a challenging yet realistic environment for testing and
evaluating segmentation techniques. Our project involves implementing various
image segmentation methods, encompassing traditional techniques such as
thresholding and clustering algorithms, as well as deep learning models like
U-Net and Mask R-CNN.
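To make the thresholding baseline concrete, here is a minimal sketch. It assumes Otsu's method as the thresholding variant (the report does not say which one was used) and a grayscale uint8 image as input. The function name otsu_threshold is hypothetical and is not taken from the project repository.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold that maximises between-class variance (Otsu's method).

    Assumes `gray` is a 2-D uint8 array; this is an illustrative sketch only.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    omega = np.cumsum(prob)           # probability of the class with value <= t
    mu = np.cumsum(prob * levels)     # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                 # skip thresholds that leave one class empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Usage: a toy image with a dark left half and a bright right half.
gray = np.full((4, 4), 50, dtype=np.uint8)
gray[:, 2:] = 200
t = otsu_threshold(gray)
mask = gray > t                       # True for the bright region
```

A single global threshold separates only two intensity classes, which is one reason it is treated here as a baseline for multi-class urban scenes.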

One aspect of our project involves the application of clustering algorithms such
as K-means and DBSCAN to segment images. These algorithms group pixels
based on similarities in color, allowing us to explore their effectiveness in
extracting meaningful regions from the dataset. We will compare the results of
clustering algorithms with those of thresholding and deep learning methods to
understand their respective advantages and limitations.
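The color-based grouping described above can be sketched with a small K-means loop. This is a minimal illustration in plain NumPy, not the code from the project repository. The function name kmeans_segment, the default parameters, and the choice to seed centroids from distinct colors are all assumptions.

```python
import numpy as np

def kmeans_segment(image, k=3, n_iters=10, seed=0):
    """Cluster the pixels of an H x W x 3 image by colour.

    Returns an H x W label map and the k centroid colours.
    Assumes the image contains at least k distinct colours.
    """
    pixels = image.reshape(-1, 3).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Seed centroids from distinct colours so no two centroids start identical.
    colours = np.unique(pixels, axis=0)
    centroids = colours[rng.choice(len(colours), size=k, replace=False)]
    for _ in range(n_iters):
        # Squared distance of every pixel to every centroid: shape (N, k).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):          # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels.reshape(image.shape[:2]), centroids

# Usage: a toy image that is red on the left and blue on the right.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :2] = (255, 0, 0)
img[:, 2:] = (0, 0, 255)
labels, centroids = kmeans_segment(img, k=2)
```

Because the clusters are formed from color alone, two objects with similar colors (for example road and sidewalk) can end up in the same cluster, which is the kind of limitation the comparison is meant to expose.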

Our evaluation will not only focus on visual comparisons but will also include
quantitative assessments using metrics such as Intersection over Union (IoU)
and Dice Coefficient. Both metrics measure how much a predicted mask overlaps
the ground-truth mask, aiding in a comprehensive analysis of the performance
of the segmentation methods.
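For a predicted mask A and a ground-truth mask B, IoU is |A ∩ B| / |A ∪ B| and the Dice Coefficient is 2|A ∩ B| / (|A| + |B|). The sketch below computes both for a single binary mask pair. The function name iou_and_dice and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union > 0 else 1.0
    dice = 2 * intersection / total if total > 0 else 1.0
    return float(iou), float(dice)

# Usage: the prediction marks two pixels, one of which is correct.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
iou, dice = iou_and_dice(pred, target)   # IoU = 0.5, Dice = 2/3
```

For a multi-class label map, the same function can be applied once per class to the masks `pred == c` and `target == c`, and the per-class scores averaged.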

Additionally, our project aims to explore the trade-offs between traditional and
deep learning approaches, taking into consideration factors such as
computational efficiency, robustness to variations, and interpretability. By
conducting this analysis, we seek to contribute insights into the effectiveness of
different segmentation techniques, offering a holistic understanding of the
challenges associated with image segmentation in complex urban environments.

Literature Review on Image Segmentation:

| Author | Title | Result |
|---|---|---|
| Olaf Ronneberger, Philipp Fischer, Thomas Brox | U-Net: Convolutional Networks for Biomedical Image Segmentation | In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. |
| Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla | SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation | The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. |
| Fausto Milletari, Nassir Navab, Seyed-Ahmad Ahmadi | V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation | In this work we propose an approach to 3D image segmentation based on a volumetric, fully convolutional, neural network. |
| S. Prabu, J.M. Gnanasekar | A Study on Image Segmentation Method for Image Processing | In this paper different algorithms of segmentation can be reviewed, analyzed and finally list out the comparison for all the algorithms. This comparison study is useful for increasing accuracy and performance of segmentation methods in various image processing domains. |
| Refik Samet, Şahin Emrah Amrahov, Ali Hikmet Ziroğlu | Fuzzy Rule-Based Image Segmentation technique for rock thin section images | In this paper, we propose Fuzzy Rule-Based Image Segmentation technique to segment rock thin section images. |
| Ashwani Kumar Yadav, Ratnadeep Roy, Rajkumar, Vaishali, Devendra Somwanshi | Thresholding and morphological based segmentation techniques for medical images | The main objective of this work is to segment the medical image under various conditions and different backgrounds. |
| Sharifah Lailee Syed Abdullah, Hamirul'Aini Hambali, Nursuriati Jamil | An accurate thresholding-based segmentation technique for natural images | The traditional thresholding and clustering segmentation techniques that were widely used are Otsu and K-means. |
| Annegreet van Opbroek, M. Arfan Ikram, Meike W. Vernooij, Marleen de Bruijne | Transfer Learning Improves Supervised Image Segmentation Across Imaging Protocols | The variation between images obtained with different scanners or different imaging protocols presents a major challenge in automatic segmentation of biomedical images. |

About the Dataset:

Context
Cityscapes data (cityscapes-dataset.com) contains labeled videos taken from
vehicles driven in Germany. This version is a processed subsample
created as part of the Pix2Pix paper. The dataset has still images from
the original videos, and the semantic segmentation labels are shown in
images alongside the original image. Cityscapes is a widely used benchmark
for semantic segmentation tasks.

Content
This dataset has 2975 training image files and 500 validation image
files. Each image file is 256x512 pixels (256 high, 512 wide), and each
file is a composite with the original photo on the left half of the image,
alongside the labeled image (output of semantic segmentation) on the
right half.
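Since each file stores the photo and its label side by side, a loader has to split the two halves before training. The sketch below shows one way to do this on an array that has already been read into memory. The function name split_composite is hypothetical, and the array is assumed to be laid out as height x width x channels.

```python
import numpy as np

def split_composite(composite):
    """Split an H x 2W x C composite into (photo, label) halves of width W."""
    half = composite.shape[1] // 2
    photo = composite[:, :half]    # left half: original photo
    label = composite[:, half:]    # right half: colour-coded segmentation
    return photo, label

# Usage: a dummy 256 x 512 composite standing in for one dataset file.
composite = np.zeros((256, 512, 3), dtype=np.uint8)
composite[:, 256:] = 255
photo, label = split_composite(composite)   # each half is 256 x 256 x 3
```

The label half is an RGB rendering of the classes, not a class-index map, so a supervised model still needs a step that maps label colors to class IDs.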

Acknowledgements
This dataset is the same as the one made available by the Berkeley AI
Research group.

License
The Cityscapes data available from cityscapes-dataset.com has the
following license:

This dataset is made freely available to academic and non-academic
entities for non-commercial purposes such as academic research,
teaching, scientific publications, or personal experimentation. Permission
is granted to use the data given that you agree:

- That the dataset comes "AS IS", without express or implied warranty.
Although every effort has been made to ensure accuracy, we (Daimler
AG, MPI Informatics, TU Darmstadt) do not accept any responsibility for
errors or omissions.
- That you include a reference to the Cityscapes Dataset in any work that
makes use of the dataset. For research papers, cite our preferred
publication as listed on our website; for other media cite our preferred
publication as listed on our website or link to the Cityscapes website.
- That you do not distribute this dataset or modified versions. It is
permissible to distribute derivative works in as far as they are abstract
representations of this dataset (such as models trained on it or additional
annotations that do not directly include any of our data) and do not allow
to recover the dataset or something similar in character.
- That you may not use the dataset or any derivative work for commercial
purposes as, for example, licensing or selling the data, or using the data
with a purpose to procure a commercial gain.
- That all rights not expressly granted to you are reserved by (Daimler AG,
MPI Informatics, TU Darmstadt).

Code Snippets:
GitHub Repo:
https://github.com/adityapande403/CV_segmentation_UNET_EXPL/tree/main
