Computer Vision Experiential Learning Report
Aditya Pande
21070126001
AIML A1
2. Medical Imaging:
- In medicine, segmentation accurately delineates anatomical structures
and organs, supporting diagnosis, treatment planning, and disease
monitoring.
3. Autonomous Vehicles:
- For autonomous vehicles, accurate segmentation is crucial for
understanding the surrounding environment, identifying road lanes,
pedestrians, and other vehicles.
4. Augmented Reality:
- In augmented reality applications, segmentation helps distinguish
between the foreground and background, allowing virtual elements to
seamlessly interact with the real world.
One part of our project applies clustering algorithms such as K-means and
DBSCAN to image segmentation. These algorithms group pixels by color
similarity, which lets us test how well they extract meaningful regions
from the dataset. We will compare the clustering results with those of
traditional and deep learning methods to understand their respective
advantages and limitations.
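As an illustration of the color-based grouping described above, the following is a minimal NumPy sketch of K-means applied to pixel colors. It is not taken from the project repository: the function name `kmeans_segment`, the brightness-based initialisation, and the default parameters are assumptions made for this example, and DBSCAN is not covered.

```python
import numpy as np


def kmeans_segment(image, k=3, n_iters=10):
    """Cluster the pixels of an H x W x C image by color.

    Returns (labels, centers): an H x W array of cluster indices and a
    k x C array of cluster mean colors.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)

    # Assumed initialisation: sort pixels by brightness and take k evenly
    # spaced ones, so the starting centers are deterministic.
    order = np.argsort(pixels.sum(axis=1))
    picks = np.linspace(0, len(pixels) - 1, k).astype(int)
    centers = pixels[order[picks]]

    for _ in range(n_iters):
        # Squared Euclidean distance from every pixel to every center.
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean color of the pixels assigned to it.
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

    return labels.reshape(h, w), centers


# Toy image: dark left half, bright right half.
toy = np.zeros((4, 4, 3), dtype=np.uint8)
toy[:, :2] = 10
toy[:, 2:] = 200
toy_labels, toy_centers = kmeans_segment(toy, k=2)
```

On the toy image the two halves end up in separate clusters. On real street scenes, color alone often merges distinct objects with similar colors, which is one of the limitations the comparison is meant to expose.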
Our evaluation will combine visual comparison with quantitative assessment
using Intersection over Union (IoU) and the Dice coefficient. Both metrics
measure the overlap between a predicted segmentation and the ground-truth
labels, which gives a consistent basis for comparing the performance of the
methods.
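For reference, both metrics can be computed from the overlap of two binary masks. The sketch below uses the standard definitions, IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). The function name `iou_and_dice` and the handling of two empty masks are assumptions made for this example, not the project's own code.

```python
import numpy as np


def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union > 0 else 1.0
    dice = 2 * intersection / total if total > 0 else 1.0
    return float(iou), float(dice)


pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0]])
target = np.array([[0, 1, 1, 0],
                   [0, 1, 1, 0]])
iou, dice = iou_and_dice(pred, target)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")  # IoU = 0.333, Dice = 0.500
```

For a multi-class problem, one common approach is to call such a function once per class on the masks `pred == c` and `target == c` and then average the results.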
Additionally, our project explores the trade-offs between traditional and
deep learning approaches in terms of computational efficiency, robustness
to variations, and interpretability. Through this analysis, we aim to show
how effective different segmentation techniques are and which challenges
image segmentation faces in complex urban environments.
Context
Cityscapes data (dataset home page) contains labelled videos taken from
vehicles driven in Germany. This version is a processed subsample
created as part of the Pix2Pix paper. The dataset has still images from
the original videos, and the semantic segmentation labels are shown in
images alongside the original image. This is one of the best datasets
around for semantic segmentation tasks.
Content
This dataset has 2975 training image files and 500 validation image
files. Each image file is 256x512 pixels, and each file is a composite with
the original photo on the left half of the image, alongside the labeled
image (output of semantic segmentation) on the right half.
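Because each file holds the photo and its label side by side, the two halves have to be separated before training or evaluation. The sketch below assumes the file has already been loaded into a NumPy array that is 256 pixels high and 512 pixels wide. The function name `split_composite` is hypothetical, and the zero-filled array stands in for a real image.

```python
import numpy as np


def split_composite(composite):
    """Split a side-by-side composite into (photo, label) halves."""
    half = composite.shape[1] // 2   # split point along the width axis
    photo = composite[:, :half]      # left half: original photo
    label = composite[:, half:]      # right half: labeled image
    return photo, label


# Stand-in for one loaded file: 256 rows x 512 columns x 3 channels.
composite = np.zeros((256, 512, 3), dtype=np.uint8)
composite[:, 256:] = 255  # mark the right (label) half
photo, label = split_composite(composite)
print(photo.shape, label.shape)  # (256, 256, 3) (256, 256, 3)
```

The slices are views into the original array, so call `.copy()` on them if the halves will be modified independently.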
Acknowledgements
This dataset is the same as the one made available by the Berkeley AI
Research group.
License
The Cityscapes data available from cityscapes-dataset.com has the
following license:
That the dataset comes "AS IS", without express or implied warranty.
Although every effort has been made to ensure accuracy, we (Daimler
AG, MPI Informatics, TU Darmstadt) do not accept any responsibility for
errors or omissions.
That you include a reference to the Cityscapes Dataset in any work that
makes use of the dataset. For research papers, cite our preferred
publication as listed on our website; for other media cite our preferred
publication as listed on our website or link to the Cityscapes website.
That you do not distribute this dataset or modified versions. It is
permissible to distribute derivative works in as far as they are abstract
representations of this dataset (such as models trained on it or additional
annotations that do not directly include any of our data) and do not allow
to recover the dataset or something similar in character.
That you may not use the dataset or any derivative work for commercial
purposes as, for example, licensing or selling the data, or using the data
with a purpose to procure a commercial gain.
That all rights not expressly granted to you are reserved by (Daimler AG,
MPI Informatics, TU Darmstadt).
Code Snippets:
GitHub Repo:
https://github.com/adityapande403/CV_segmentation_UNET_EXPL/tree/main