Assignment 4
Deadline: 11:59PM, 27th April 2024
1 Background
Breast cancer is among the most prevalent cancers, and screening mammography plays a crucial role
in early detection. Mammograms, which are essentially X-rays of the breast, serve as the primary
tool for radiologists to identify malignant tissues. Typically, mammograms are of dimensions 4k×4k,
comprising X-rays captured from two perspectives: side-view (MLO-view) and top-view (CC-view).
While Deep Neural Networks (DNNs) have shown promise in classifying malignancy in cancers,
their integration into medical practice faces challenges, notably in terms of model explainability.
Despite achieving commendable results in classification tasks, these models often cannot identify
the regions contributing to their predictions. To bridge this gap and foster trust among clinicians,
it becomes imperative for models to not only classify but also localize cancerous regions within the
mammograms. Object detection emerges as a viable solution to this problem, allowing models to
draw bounding boxes around areas indicative of tumors.
In this assignment, your objective is to train a deep neural network for object detection in
breast cancer screening. You will work with a mammogram detection dataset containing images and
corresponding bounding boxes outlining malignant regions.
2 Dataset
The dataset is divided into three splits: training, validation, and testing. For this assignment, you
will be provided with the training and validation sets. The dataset is available at this link; kindly
use your IIT-Delhi credentials to access it.
• annotations: Contains JSON files storing bounding box annotations for both the training and
validation sets (instances_train.json and instances_val.json).
• train2017: Contains training images in JPEG format.
• val2017: Contains validation images in PNG format.
• labels: Contains text files with bounding box annotations for malignant images. Each line in
an annotation file represents an object's class label and normalized bounding box coordinates
(see the YOLO format description below).
The "instances_train.json" file follows a hierarchical structure to store bounding boxes, with the
following key components (a minimal loading sketch is shown after this list):
• Images: This section contains information about each image in the training set, including its
unique identifier, file name, height, width, and any other relevant metadata.
• Annotations: This section provides details about the annotations for each object instance
present in the training images. Each annotation entry includes fields such as the annotation
ID, the ID of the image it belongs to, the category label, and the bounding box coordinates.
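Purely as a reference for getting started, here is a minimal sketch of loading the annotation file in Python; it assumes the JSON follows the standard COCO layout (top-level "images" and "annotations" lists, with boxes stored as [x, y, width, height] in pixels), so verify the exact field names and the file path against the provided files.

import json
from collections import defaultdict

# Path is illustrative; adjust to where the annotation file actually lives.
with open("annotations/instances_train.json") as f:
    coco = json.load(f)

# Map image id -> image metadata, and image id -> list of bounding boxes.
images = {img["id"]: img for img in coco["images"]}
boxes_per_image = defaultdict(list)
for ann in coco["annotations"]:
    # COCO-style boxes are [x_min, y_min, width, height] in pixels.
    boxes_per_image[ann["image_id"]].append(ann["bbox"])

img_id, img = next(iter(images.items()))
print(img["file_name"], img["width"], img["height"], boxes_per_image[img_id])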
YOLO format of bounding box: Each line in the annotation file has 5 values, for example: "0 0.25 0.3
0.5 0.4". Here, "0" is the object class label, and the remaining numbers are the normalized coordinates
of the bounding box: (x, y, w, h). These coordinates are normalized to the dimensions of the image,
with (x, y) representing the center of the bounding box and (w, h) representing its width and height.
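To make the format concrete, here is a small sketch of converting one YOLO-format line into pixel-space corner coordinates; the example image size is a placeholder.

def yolo_to_corners(line, img_w, img_h):
    """Convert a YOLO line "cls xc yc w h" (all normalized) to
    (class_id, x_min, y_min, x_max, y_max) in pixel coordinates."""
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    return int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

# The example line from the text, on a hypothetical 1024x1024 image:
print(yolo_to_corners("0 0.25 0.3 0.5 0.4", 1024, 1024))
# -> class 0, box roughly (0, 102, 512, 512)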
3 Training Guidelines
• You will train two object detection models: one convolution-based and one transformer-based.
For the convolution-based model you may choose, for example, Faster R-CNN or a YOLO-based
model; for the transformer-based model you may use DETR-family models such as DAB-DETR,
Deformable DETR, DINO, etc. (a minimal setup sketch for one possible convolution-based
choice is given after this list).
• Utilize any image pre-processing techniques on both the training and validation sets. Ensure
your code can perform the same pre-processing during testing.
• You are permitted to use platforms like Kaggle or Google Colab for model training, or any
other accessible platform.
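To illustrate the transfer-learning route (also mentioned in the tips at the end), below is a minimal sketch of adapting a COCO-pretrained torchvision Faster R-CNN to a single malignant class; this is only one possible convolution-based choice, not a required one.

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Start from COCO-pretrained weights for faster convergence.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box predictor head: 2 classes = background + malignant.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# During training, the model expects a list of image tensors and a list of
# target dicts with "boxes" (x_min, y_min, x_max, y_max) and "labels".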
4 Submission Guidelines
The assignment must be completed in groups of up to two members. Submit the following files on
the portal:
• requirements.txt: File listing the Python libraries used. Submit two files if libraries differ
for each model.
• test.py: Test script for both models, taking an image folder and a trained model as input.
Upon execution, the script runs the model on all images in the folder, saving predictions in
text files.
• train.py: Train script containing dataloader and training loop for both models.
• visualize_heatmaps.py: Heatmap visualization script for both models, taking an image folder
and a trained model as input. Upon execution, the script runs the model on all images in the
folder, saving a GradCAM/attention-map visualization for the highest-confidence bounding
box in each image (a generic Grad-CAM sketch is given after this list).
• Report.pdf: Detailed report covering all experiments and techniques employed in the as-
signment. Include visualizations before, during, and after training the detection models, along
with the comparison between the two models described in the grading guidelines below.
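The choice of visualization library is left open (open-source GradCAM/attention-map libraries are explicitly allowed, see the grading notes below); purely as a starting point, here is a generic, model-agnostic Grad-CAM sketch based on forward/backward hooks. The score_fn argument is a placeholder that must return the confidence of the highest-scoring box for your particular detector, the target layer is assumed to output a single feature-map tensor, and DETR-style models would instead visualize decoder attention weights.

import torch

def grad_cam(model, target_layer, input_tensor, score_fn):
    """Minimal Grad-CAM sketch: hook activations and gradients on `target_layer`,
    then backpropagate the scalar returned by `score_fn(model_output)`
    (e.g. the confidence of the highest-scoring predicted box)."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))

    model.eval()
    out = model(input_tensor)          # run with gradients enabled (no torch.no_grad())
    score = score_fn(out)              # scalar whose evidence we want to localize
    model.zero_grad()
    score.backward()
    h1.remove()
    h2.remove()

    a, g = acts[0], grads[0]                        # both (1, C, H, W)
    weights = g.mean(dim=(2, 3), keepdim=True)      # global-average-pooled gradients
    cam = torch.relu((weights * a).sum(dim=1))[0]   # (H, W) heatmap
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam.detach().cpu().numpy()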
5 Grading Guidelines
Your performance will be evaluated based on competitive performance among your peers and the
depth of analysis presented in your report. Marks will be assigned according to the rank and
performance of your model, assessed on two metrics. One metric will be kept hidden, while the
other is described below:
FROC (Free-response Receiver Operating Characteristic) Curve: Similar to the ROC
curve, the FROC curve plots sensitivity on the y-axis against the average number of false positives
per image (FPI) on the x-axis. A prediction is considered a true positive if the center of the predicted
bounding box lies within the ground truth box. False positives are counted over all predicted bounding
boxes in an image. The final rank will be determined by a weighted average of both metrics.
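The provided FROC.py is the reference implementation; purely to make the definitions above concrete, here is a simplified sketch of computing sensitivity and FPI for one set of predictions (the centre-inside-box criterion and the data layout are assumptions based on the description above).

def sensitivity_and_fpi(preds_per_image, gts_per_image):
    """preds_per_image / gts_per_image: lists (one entry per image) of boxes
    given as (x_min, y_min, x_max, y_max)."""
    detected, total_gt, fp = 0, 0, 0
    for preds, gts in zip(preds_per_image, gts_per_image):
        total_gt += len(gts)
        centers = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in preds]
        # A ground-truth lesion counts as detected if any predicted center lies inside it.
        for gx1, gy1, gx2, gy2 in gts:
            if any(gx1 <= cx <= gx2 and gy1 <= cy <= gy2 for cx, cy in centers):
                detected += 1
        # A prediction is a false positive if its center lies in no ground-truth box.
        for cx, cy in centers:
            if not any(gx1 <= cx <= gx2 and gy1 <= cy <= gy2 for gx1, gy1, gx2, gy2 in gts):
                fp += 1
    sensitivity = detected / max(total_gt, 1)
    fpi = fp / max(len(preds_per_image), 1)
    return sensitivity, fpi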
Figure 1: The illustration depicts the effect of varying the Non-Maximum Suppression (NMS)
threshold on the predictions generated by a detection model. The blue bounding box is the ground
truth for a malignant tumor, while the red bounding boxes are predictions made by the detection
model. The figure explores five Intersection-over-Union (IOU) thresholds: 1, 0.5, 0.25, 0.1, and 0.
A threshold of 1 keeps a bounding box regardless of overlap, while a threshold of 0 removes bounding
boxes even when the overlap is slight. Please refer to this image when visualizing your bounding
boxes for the assignment.
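If you want to reproduce a comparison like Figure 1, torchvision's built-in NMS can be used (open-source NMS implementations are also allowed, see below); a short sketch over the same IOU thresholds, with placeholder boxes and scores, is:

import torch
from torchvision.ops import nms

# Placeholder predictions: boxes as (x_min, y_min, x_max, y_max) with confidence scores.
boxes = torch.tensor([[100., 100., 300., 300.],
                      [120., 110., 310., 305.],
                      [400., 400., 500., 500.]])
scores = torch.tensor([0.9, 0.8, 0.6])

for iou_thr in (1.0, 0.5, 0.25, 0.1, 0.0):
    keep = nms(boxes, scores, iou_threshold=iou_thr)
    print(f"IOU threshold {iou_thr}: kept boxes {keep.tolist()}")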
A sample test set is provided in the shared repository, comprising three folders: images, labels,
and predictions. The FROC.py script in the data folder takes inputs from labels and predictions
folders to evaluate your model on the validation set. It calculates sensitivity values at different FPI
thresholds and prints corresponding threshold values.
In your report, compare the performance of the two trained models. Set a threshold corresponding
to the same FPI and compare which model performs better. For instance, if the two models achieve
sensitivities of 0.6 and 0.7 respectively at FPI = 0.3, identify the additional cases detected by the
second model.
• You may refer to open-source libraries for generating GradCAMs/Attention maps and applying
NMS on model predictions.
• Adhere strictly to the institute’s plagiarism policy.
• Ensure code reproducibility.
7 Tips and Tricks
• Resize image data to lower resolutions for faster training, but be mindful of potential loss of
information.
• Employ transfer learning by initializing models with pre-trained weights for better convergence
and performance.
• Consider freezing some layers of the detection model to accelerate training and enhance per-
formance (a short sketch is given at the end of this section).
• Start the assignment early if training on the cloud, as there may be weekly limits on training
time.
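For the layer-freezing tip, a minimal sketch, assuming a torchvision-style detector with a backbone attribute (the attribute name differs across libraries and model choices):

import torch

# `model` is your detector; freeze the pretrained backbone and train only the heads.
for param in model.backbone.parameters():
    param.requires_grad = False

# Give the optimizer only the parameters that are still trainable.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.005, momentum=0.9, weight_decay=5e-4,
)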