Second Progress Report UID - 17BCS2127
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE & ENGINEERING
Submitted to:
Gagandeep Kaur
Submitted By:
ACKNOWLEDGEMENT
I have taken efforts in this project. However, it would not have been possible without the kind support and help of many individuals and organizations. I would like to extend my sincere thanks to all of them.
I am highly indebted to Gagandeep Kaur Ma'am for her guidance and constant supervision, as well as for providing the necessary information regarding the project and for her support in completing it.
I would like to express my gratitude towards my parents and the members of Chandigarh University for their kind cooperation and encouragement, which helped me in the completion of this project.
My thanks and appreciation also go to my colleagues who helped in developing the project and to the people who have willingly helped me out with their abilities.
Thanking you.
Yours Sincerely,
ABSTRACT
Efficient and accurate object detection has been an important topic in the advancement of
computer vision systems. With the advent of deep learning techniques, the accuracy for
object detection has increased drastically. The project aims to incorporate state-of-the-art techniques for object detection, with the goal of achieving high accuracy and real-time performance. A major challenge in many object detection systems is the dependency
on other computer vision techniques for helping the deep learning-based approach, which
leads to slow and non-optimal performance. In this project, we use a completely deep
learning-based approach to solve the problem of object detection in an end-to-end fashion.
The network is trained on the most challenging publicly available dataset (PASCAL VOC), on which an object detection challenge is conducted annually. The resulting system is fast and accurate, and thus useful for applications that require object detection.
Table of Contents
Chapter 1: Introduction
Chapter 2: Related Work
Chapter 3: Approach
Chapter 4: Conclusion
References
1 Introduction
1.1 Problem Statement
Many problems in computer vision had saturated in accuracy until about a decade ago. However, with the rise of deep learning techniques, the accuracy on these problems improved drastically. One of the major problems was image classification, which is defined as predicting the class of the image. A slightly more complicated problem is image localization, where the image contains a single object and the system should predict the class of the object along with its location in the image (a bounding box around the object). The more complicated problem addressed in this project, object detection, involves both classification and localization. In this case, the input to the system is an image, and the output is a bounding box for every object in the image, along with the class of the object in each box. An overview of all these problems is depicted in Fig. 1.
1.2 Applications
A well known application of object detection is face detection, that is used in almost all
the mobile cameras. A more generalized (multi-class) application can be used in
autonomous driving where a variety of objects need to be detected. Also it has a
important role to play in surveillance systems. These systems can be integrated with
other tasks such as pose estimation where the first stage in the pipeline is to detect the
object, and then the second stage will be to estimate pose in the detected region. It can
be used for tracking objects and thus can be used in robotics and medical applications.
Thus this problem serves a multitude of applications.
Figure 2: Applications of object detection: (a) surveillance, (b) autonomous vehicles.
1.3 Challenges
The major challenge in this problem is the variable dimension of the output, which is caused by the variable number of objects that can be present in any given input image. Any general machine learning task requires a fixed dimension of input and output for the model to be trained. Another important obstacle to the widespread adoption of object detection systems is the requirement of real-time performance (>30 fps) while remaining accurate in detection. The more complex the model, the more time it requires for inference; the less complex the model, the lower the accuracy. This trade-off between accuracy and performance needs to be chosen as per the application. The problem also involves classification as well as regression, and the model has to learn both simultaneously. This adds to the complexity of the problem.
2 Related Work
There has been a lot of work in object detection using traditional computer vision techniques (sliding windows, deformable part models). However, they lack the accuracy of deep learning based techniques. Among the deep learning based techniques, two broad classes of methods are prevalent: two-stage detection (RCNN [1], Fast RCNN [2], Faster RCNN [3]) and unified detection (YOLO [4], SSD [5]). The major concepts involved in these techniques are explained below.
Figure: (a) Stage 1, (b) Stage 2.
2. Gather activations from later layers to infer classification and location with fully connected or convolutional layers.
3. During training, use the Jaccard index (IoU) to match predictions with the ground truth.
4. During inference, use non-maximum suppression to filter multiple boxes predicted around the same object (a sketch of both of these steps follows this list).
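As an illustration of steps 3 and 4 above, the sketch below computes the Jaccard index (IoU) of two boxes and applies a simple non-maximum suppression pass. The [x1, y1, x2, y2] box format and the 0.45 suppression threshold are assumptions made for this example, not details taken from the project code.

import numpy as np

def iou(box_a, box_b):
    """Jaccard index (IoU) of two boxes given as [x1, y1, x2, y2]."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-maximum suppression: keep the highest-scoring box, drop overlapping ones."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep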
The major techniques that follow this strategy are SSD (which uses different activation maps at multiple scales to predict classes and bounding boxes) and YOLO (which uses a single activation map to predict classes and bounding boxes). Using multiple scales helps achieve a higher mAP (mean average precision) by better detecting objects of different sizes in the image. Thus, the technique used in this project is SSD.
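Since mAP is the evaluation metric referred to here, the short sketch below computes the 11-point interpolated average precision of the PASCAL VOC protocol for a single class; it assumes the precision and recall arrays have already been derived from the detector's ranked outputs.

import numpy as np

def average_precision(recalls, precisions):
    """11-point interpolated AP as defined in the PASCAL VOC evaluation."""
    recalls = np.asarray(recalls)
    precisions = np.asarray(precisions)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recalls >= t
        # Interpolated precision: best precision achieved at recall >= t.
        ap += (precisions[mask].max() if mask.any() else 0.0) / 11.0
    return ap

mAP would then be the mean of average_precision over all object classes.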
3 Approach
The network used in this project is based on the Single Shot Detector (SSD) [5]. The architecture is shown in Fig. 7.
SSD normally starts with a VGG [6] model, which is converted to a fully convolutional network. Some extra convolutional layers are then attached, which help to handle bigger objects. The output of the VGG network is a 38x38 feature map (conv4_3). The added layers produce 19x19, 10x10, 5x5, 3x3, and 1x1 feature maps. All these feature maps are used to predict bounding boxes at various scales (the later layers are responsible for larger objects).
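A minimal PyTorch-style sketch of this multi-scale prediction idea is shown below. The channel counts, the choice of four anchors per cell, and the class count of 21 (the 20 PASCAL VOC classes plus background) are illustrative assumptions and not taken from the project's actual SSD implementation.

import torch
import torch.nn as nn

class MultiScaleHead(nn.Module):
    """Toy SSD-style head: each feature map gets its own class and box predictors."""
    def __init__(self, in_channels, num_classes=21, anchors_per_cell=4):
        super().__init__()
        self.cls_layers = nn.ModuleList(
            [nn.Conv2d(c, anchors_per_cell * num_classes, kernel_size=3, padding=1)
             for c in in_channels])
        self.box_layers = nn.ModuleList(
            [nn.Conv2d(c, anchors_per_cell * 4, kernel_size=3, padding=1)
             for c in in_channels])

    def forward(self, feature_maps):
        # feature_maps: list of backbone tensors, e.g. with 38x38, 19x19,
        # 10x10, 5x5, 3x3 and 1x1 spatial sizes.
        cls_preds, box_preds = [], []
        for feat, cls_layer, box_layer in zip(feature_maps, self.cls_layers, self.box_layers):
            cls_preds.append(cls_layer(feat))   # per-anchor class scores
            box_preds.append(box_layer(feat))   # per-anchor box offsets
        return cls_preds, box_preds

In a full SSD, in_channels would match the channel counts of conv4_3 and the added layers; here they are placeholders supplied by the caller.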
Thus the overall idea of SSD is shown in Fig. 8. Some of the activations are passed to
the sub-network that acts as a classifier and a localizer.
For each anchor, the sub-network makes two predictions:
• A discrete class prediction
• A continuous offset by which the anchor needs to be shifted to fit the ground-truth bounding box
Figure 9: Anchors
During training, SSD matches ground-truth annotations with anchors. Each element (cell) of a feature map has a number of anchors associated with it. Any anchor with an IoU (Jaccard index) greater than 0.5 with a ground-truth box is considered a match. Consider the case shown in Fig. 10, where the cat has two anchors matched and the dog has one anchor matched. Note that they have been matched on different feature maps.
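A simplified sketch of this matching rule is given below, reusing the iou helper from the earlier sketch; the 0.5 threshold follows the text, while the plain-list data layout is an assumption made for illustration.

def match_anchors(anchors, gt_boxes, iou_thresh=0.5):
    """For each anchor, return the index of the best-overlapping ground-truth box,
    or -1 if the best IoU is below the matching threshold."""
    matches = []
    for anchor in anchors:
        if not gt_boxes:
            matches.append(-1)
            continue
        overlaps = [iou(anchor, gt) for gt in gt_boxes]  # iou() from the NMS sketch above
        best = max(range(len(gt_boxes)), key=lambda i: overlaps[i])
        matches.append(best if overlaps[best] >= iou_thresh else -1)
    return matches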
The loss function used is the multibox classification and regression loss. The classification loss is the softmax cross-entropy, and for regression the smooth L1 loss is used.
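A hedged sketch of this combined objective is given below, using PyTorch's built-in softmax cross-entropy and smooth L1 losses. The tensor shapes and the convention that label 0 marks a background anchor are assumptions for illustration, and refinements of the original SSD loss such as hard negative mining are omitted.

import torch
import torch.nn.functional as F

def multibox_loss(cls_logits, box_preds, cls_targets, box_targets):
    """
    cls_logits:  (N, num_anchors, num_classes) raw class scores
    box_preds:   (N, num_anchors, 4) predicted box offsets
    cls_targets: (N, num_anchors) integer class labels, 0 = background
    box_targets: (N, num_anchors, 4) ground-truth offsets for matched anchors
    """
    num_classes = cls_logits.size(-1)
    # Softmax cross-entropy over all anchors.
    cls_loss = F.cross_entropy(cls_logits.reshape(-1, num_classes),
                               cls_targets.reshape(-1))
    # Smooth L1 regression loss, computed only on anchors matched to an object.
    positive = cls_targets > 0
    if positive.any():
        reg_loss = F.smooth_l1_loss(box_preds[positive], box_targets[positive])
    else:
        reg_loss = box_preds.sum() * 0.0  # no positive anchors in this batch
    return cls_loss + reg_loss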
4 Conclusion
An accurate and efficient object detection system has been developed which achieves metrics comparable to the existing state-of-the-art systems. This project uses recent techniques in the fields of computer vision and deep learning. A custom dataset was created using labelImg, and the evaluation was consistent. The system can be used in real-time applications which require object detection as a pre-processing step in their pipeline.
An important future scope would be to train the system on a video sequence for use in tracking applications. Adding a temporally consistent network would enable smoother detection that is more optimal than per-frame detection.
References
[1] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies
for accurate object detection and semantic segmentation. In The IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 2014.
[2] Ross Girshick. Fast R-CNN. In International Conference on Computer Vision (ICCV),
2015.
[3] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards
real-time object detection with region proposal networks. In Advances in Neural
Information Processing Systems (NIPS), 2015.
[4] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look
once: Unified, real-time object detection. In The IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), 2016.
[5] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-
Yang Fu, and Alexander C. Berg. SSD: Single shot multibox detector. In ECCV, 2016.
[6] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-
scale image recognition. arXiv preprint arXiv:1409.1556, 2014.