
OBJECT DETECTION USING

MACHINE LEARNING
A
Thesis Submitted
In partial fulfillment of the requirements for
the award of the degree of

BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING

BY

G. MANVITHA 18281A0551
E. VEDA SRI 18281A0503
P. SAI KRISHNA 18281A0502

PROJECT GUIDE
Dr. Lt. RAVINDRABABU KALLAM
HEAD OF DEPT, CSE
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
KAMALA INSTITUTE OF TECHNOLOGY & SCIENCE
(Approved by AICTE, New Delhi, Affiliated to JNTU, Hyderabad, T.S., Accredited by NAAC with ‘B++’ Grade)
Singapur, Huzurabad, Karimnagar, Telangana-505468
(2021-2022)
KAMALA INSTITUTE OF TECHNOLOGY & SCIENCE
Sponsored by VODITHALA EDUCATION SOCIETY, Approved by AICTE-New Delhi and Affiliated to JNTUH,
Accredited with NAAC B++ Grade & NBA
SINGAPUR, HUZURABAD, KARIMNAGAR, TELANGANA, INDIA- 505468

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

CERTIFICATE

This is to certify that G. MANVITHA (18281A0551), E. VEDA SRI (18281A0503), and P. SAI KRISHNA (18281A0502) of IV B. Tech (CSE) have satisfactorily completed the dissertation work for the project entitled "OBJECT DETECTION USING MACHINE LEARNING", towards the partial fulfillment of the B. Tech degree in the academic year 2021-2022.

Signature of the Guide Signature of the Head of the Department


Dr. Lt. RAVINDRABABU KALLAM Dr. Lt. RAVINDRABABU KALLAM
PROFESSOR, HEAD OF DEPT OF CSE PROFESSOR,HEAD OF DEPT OF CSE

Signature of the External Supervisor


INDEX

ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
Chapter 1: Introduction
1.1 About Project
1.2 Existing System with Drawbacks
1.3 Proposed System with Features
Chapter 2: Literature Survey
Chapter 3: Analysis
3.1 Hardware & Software Requirements
3.2 Module Description
Chapter 4: Design
4.1 System Design
4.2 Architectural Design
4.3 Block Diagram
4.4 UML Diagrams
4.4.1 Use Case Diagram
4.4.2 Class Diagram
4.4.3 Sequence Diagram
4.4.4 Activity Diagram
Chapter 5: Implementation
5.1 Algorithm Used
Chapter 6: Testing
6.1 Training
6.2 Test Results
Chapter 7: Results
Chapter 8: Conclusion
Chapter 9: Future Scope & Enhancements
Chapter 10: References
ACKNOWLEDGEMENT

The success of any course depends mainly on the teachers who teach us; only good teaching can interpret the syllabus and produce desirable changes and competent citizens. This project was a team effort, and many people whose names do not appear on the cover also deserve credit. First, we thank God Almighty for His manifold mercies in carrying out our project successfully.
We would like to pay our respects to our principal Dr. K. SHANKER and the management for providing all the facilities required for completing this project work.
We sincerely thank our Head of the Department and Professor of Computer Science & Engineering and our guide Dr. Lt. RAVINDRA BABU KALLAM for encouraging us in doing this project and for his guidance.
We would like to thank our project coordinator Ms. PAVANKUMAR and the Assistant Professors in CSE for their valuable guidance and constant encouragement at each stage of this project. We thank the other teaching and non-teaching staff of Kamala Institute of Technology & Science for supporting us in every stage of our Major Project, entitled “OBJECT DETECTION USING MACHINE LEARNING”.
We thank our Parents and Friends for their moral support throughout the project work, which helped to strengthen our will.

G. MANVITHA 18281A0551
E. VEDA SRI 18281A0503
P. SAI KRISHNA 18281A0502

ABSTRACT

The objective of this project is to detect objects using the You Only Look Once (YOLO) approach. This approach has several advantages over other object detection algorithms. Other algorithms, such as Convolutional Neural Network, Single Shot Detector, and Fast R-CNN, do not take in the complete image but only the important parts of it. The YOLO approach, in contrast, looks at the image as a whole: a single convolutional network predicts the bounding boxes and the class probabilities of those boxes, and so detects objects faster and more accurately than the other algorithms.

LIST OF FIGURES

1.1 General YOLO Model
1.2 SSD Architecture
4.1 Architectural Design
4.2 Proposed Method and Algorithm
4.3 Block Diagram
4.4 Use Case Diagram
4.5 Class Diagram
4.6 Sequence Diagram
4.7 Activity Diagram
5.1 Screenshot of cfg
5.2 Screenshot of cfg
5.3 Screenshot of cfg
5.4 Screenshot of cfg
5.5 Screenshot of cfg
5.6 Screenshot of cfg
5.7 Screenshot of cfg
5.8 Screenshot of cfg
5.9 Screenshot of cfg
5.10 Screenshot of cfg
5.11 Screenshot of cfg
5.12 Screenshot of cfg
5.13 Screenshot of cfg
5.14 Screenshot of cfg
5.15 Screenshot of cfg
5.16 Screenshot of cfg
5.17 Screenshot of cfg
5.18 Screenshot of cfg
5.19 Screenshot of coco names
5.20 Screenshot of coco names
5.21 Screenshot of coco names
5.22 Screenshot of coco names
5.23 Screenshot of code
5.24 Screenshot of code
5.25 Screenshot of code
5.26 Screenshot of code
7.1 Screenshot of execution
7.2 Screenshot of Results
7.3 Screenshot of Results
7.4 Screenshot of Results
7.5 Screenshot of Results

LIST OF ABBREVIATIONS

YOLO : You Only Look Once


CNN : Convolutional Neural Network
SSD : Single Shot MultiBox Detector

CHAPTER 1
INTRODUCTION
1.1 About the project

YOLO (You Only Look Once) [1] is an object detection algorithm. It detects multiple objects present in an image and creates a bounding box around each of them. YOLO brings a unified neural network architecture to the table: a single architecture that performs bounding box prediction and also gives out class probabilities.

Figure 1.1: General Yolo Model

Figure 1.1 describes the general YOLO model. In YOLO [1], a single ConvNet [2] simultaneously predicts multiple bounding boxes and the class probabilities for those boxes. This allows YOLO to be optimized end-to-end. YOLO is fast, and it reasons about the image globally while making predictions; for example, it makes less than half the number of background errors compared to Fast R-CNN.
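As a concrete illustration of this unified design, the size of YOLO's output follows directly from the grid layout: for an S × S grid with B boxes per cell and C classes, the network emits S × S × (B·5 + C) values. The numbers below are the classic YOLOv1 settings, used here purely as an assumed example:

# Illustrative only: output size of a YOLO-style detector
S, B, C = 7, 2, 20              # grid size, boxes per cell, classes (YOLOv1 paper values)
values_per_cell = B * 5 + C     # each box carries x, y, w, h, confidence; plus C class scores
print(S * S * values_per_cell)  # 7 * 7 * 30 = 1470 output values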
The goal of object detection is to recognize instances of a predefined set of object classes (e.g., people, cars, bikes, animals) and to describe the location of each detected object in the image using a bounding box. These days object detection helps many people with different kinds of tasks, one of which is self-driving cars. Self-driving cars are a major application of object detection: with it, they can detect objects in and around their surroundings and drive accordingly under a given set of rules. Taking this application into consideration, we decided to take real-time object detection as our project. As a result, we are highly motivated to develop a system that recognizes objects in real time.

1.2 EXISTING SYSTEM WITH DRAWBACKS

The Single Shot MultiBox Detector (SSD) network was proposed by Liu et al. in 2015. SSD introduces multi-reference and multi-resolution detection techniques. Multi-reference techniques define a set of anchor boxes of different sizes and aspect ratios at different locations of an image, and then predict the detection box based on these references. Multi-resolution techniques allow detecting objects at several scales and at different layers of the network. An SSD network implements an algorithm for detecting multiple object classes in images by generating confidence scores related to the presence of any object category in each default box. Moreover, it produces adjustments in the boxes to better match the object shapes. This network is suited for real-time applications since it does not re-sample features for bounding box hypotheses (unlike models such as Faster R-CNN). The SSD architecture is CNN-based, and for detecting the target classes of objects it follows two stages: extract the feature maps, then apply convolutional filters to detect the objects. SSD uses VGG16 to extract feature maps and detects objects using the Conv4_3 layer of VGG16. Each prediction is composed of a bounding box and 21 scores per box (one extra class for "no object"); the class with the highest score is selected for the bounded object. Conv4_3 makes a total of 38 × 38 × 4 predictions: four predictions per cell, independently of the depth of the feature maps. Many predictions contain no object, as expected, and the class '0' indicates that no object was detected. Figure 1.2 illustrates the typical layer structure of an SSD network.

Figure 1.2: SSD Architecture

DRAWBACKS:
• Poor extraction in shallow layers.
• Loss of features in deep layers.
• Detection happens in a single shot; multiple passes over the image are not possible.

1.3 PROPOSED SYSTEM WITH FEATURES:

The You Only Look Once (YOLO) [2] detector was proposed by Redmon et al. in 2016. YOLO was inspired by GoogLeNet, and the idea was to apply a single neural network to the full image: the network divides the image into regions and simultaneously predicts bounding boxes and probabilities for each region. YOLO splits an image into an N × N grid, where each cell predicts only one object. Several improvements on the YOLO architecture have been proposed (i.e., the YOLOv2 and YOLOv3 versions) which increased the detection accuracy while keeping a very high detection speed. YOLOv3 uses a variant of the Darknet architecture with 53 layers trained on the ImageNet dataset. For object detection tasks, an additional 53 layers were added, and this model was trained with the Pascal VOC dataset. YOLOv3 outperformed most of the detection algorithms for real-time applications. Using residual connections and upsampling, the architecture can perform detections at three different scales from specific layers of the structure. This makes the YOLOv3 model more efficient at detecting small objects but, on the other side, results in slower processing than the previous versions due to the complexity of the solution.
FEATURES:

• Fast. Good for real-time processing.

• Predictions (object locations and classes) are made from one single network, which can be trained end-to-end to improve accuracy.
• YOLO is generalized. It outperforms other methods when generalizing from natural images to other
domains like artwork.
• Region proposal methods limit the classifier to a specific region, whereas YOLO has access to the whole image when predicting boundaries. With this additional context, YOLO demonstrates fewer false positives in background areas.

• YOLO detects one object per grid cell, which enforces spatial diversity in making predictions.
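Because each object is assigned to the single grid cell containing its center, the mapping from a normalized box center to its responsible cell is a one-line computation. A minimal sketch, assuming center coordinates normalized to [0, 1] and an S × S grid (the values are hypothetical):

# Sketch: find the grid cell responsible for an object
S = 13                   # grid size, e.g. one of YOLOv3's detection scales at 416x416 input
cx, cy = 0.62, 0.31      # hypothetical box center, normalized to [0, 1]
col, row = int(cx * S), int(cy * S)
print(row, col)          # cell (4, 8) is responsible for predicting this object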

CHAPTER 2
LITERATURE SURVEY
• A. Singh has presented a comparative analysis of YOLO object detection with OpenCV [3]. Computer Vision is a field of study that helps develop techniques to identify images and displays. It has various features like image recognition, object detection, image creation, etc. Object detection is used for face detection, vehicle detection, web images, and safety systems. Its algorithms include Region-based Convolutional Neural Networks (RCNN), Faster-RCNN and the You Only Look Once method (YOLO), which have shown state-of-the-art performance. Of these, YOLO offers better speed, traded against some accuracy, and performs efficient object detection without compromising on performance.
• Ayush Jain has presented Real Time Object Detection and Tracking Using Deep Learning and OpenCV [4], which uses image processing/video streaming technology. Deep learning has gained a tremendous influence on how the world is adapting to Artificial Intelligence over the past few years. Some of the popular object detection algorithms are Region-based Convolutional Neural Networks (RCNN), Faster-RCNN, Single Shot Detector (SSD) and You Only Look Once (YOLO). Amongst these, Faster-RCNN and SSD have better accuracy, while YOLO performs better when speed is given preference over accuracy. The work combines SSD and MobileNets for an efficient implementation of detection and tracking, performing efficient object detection without compromising on performance.
• Ahmad Delforouzi has presented Training-Based Methods for Comparison of Object Detection Methods for Visual Object Tracking [5], which uses computer vision/CNN technology. Object tracking in challenging videos is a hot topic in machine vision. Recently, novel training-based detectors, especially those using powerful deep learning schemes, have been proposed to detect objects in still images. However, there is still a semantic gap between the object detectors and higher-level applications like object tracking in videos. This paper presents a comparative study of outstanding learning-based object detectors such as ACF, Region-Based Convolutional Neural Network (RCNN), Fast-RCNN, Faster-RCNN and You Only Look Once (YOLO) for object tracking. The authors use online and offline training methods for tracking. The online tracker trains the detectors with a generated synthetic set of images from the object of interest in the first frame. Then, the detectors detect the objects of interest in the next frames.
The detector is updated online by using the detected objects from the last frames of the video. The offline tracker uses the detector for object detection in still images, and then a tracker based on a Kalman filter associates the objects among video frames. The research is performed on the TLD dataset, which contains challenging situations for tracking. Source code and implementation details for the trackers are published to enable both reproduction of the reported results and re-use and further development of the trackers by other researchers. The results demonstrate that the ACF and YOLO trackers show more stability than the other trackers.
• Rubayat Ahmed Khan, Jia Uddin and Sonia Corraya have presented real-time fire detection using enhanced color segmentation and novel foreground extraction [6]. This paper proposes an effective real-time fire detection technique based on video processing. The proposed technique utilizes prominent features such as flame color information and spatiotemporal characteristics to identify fire areas. The initial stage of the work extracts fire-colored pixels using a set of enhanced rules on RGB. Fire pixels are dynamic, and a novel method is proposed in the paper to detect these moving pixels. The final verification is done by examining the area of the extracted regions: a harmful fire will grow over time, so if the area increases, the region under focus is declared as fire. Experimental results show that the proposed model outperforms other state-of-the-art models, yielding an accuracy of 97.7%.
• Xiao-Yan Zhang has presented automatic video object segmentation using wavelet transform and moving edge detection [7], which uses image processing/edge detection technology. A fast and automatic video object segmentation algorithm based on wavelet transform and moving edge detection is proposed in this paper. First, the wavelet transform is applied to two consecutive frames. A change detection method with different thresholds in the four wavelet sub-bands and Canny edge detection are used in the wavelet domain. After the inverse wavelet transform, a robust difference edge map can be obtained. Through combination with the current frame's edge map, the background edge map and the previous frame's moving edge, the current frame's moving edge can be detected and tracked. It is then used to extract the video object plane (VOP) by a simple filling technique. The proposed algorithm is robust to global motion and local deformation of objects. Experimental results and objective evaluation demonstrate the effectiveness of the algorithm.
• Dr. M. N. Vijayalakshmi has presented a performance evaluation of object detection techniques [8], which uses image processing/thresholding technology. Object detection plays a vital role in image processing for finding objects of interest, and the increase in image size and complexity has created a thrust for developing novel and robust object detection techniques. A number of methods exist for detecting the objects in a particular scene. The focus of this paper is to estimate the performance and efficiency of some existing object detection algorithms on sampled images, which may give a good indication of the suitability of those algorithms. Three existing algorithms are implemented for various images and compared under a variety of situations to find which detector is robust under different conditions.
• B N Krishna Sai has presented Object Detection and Count of Objects in Image using the TensorFlow Object Detection API [9], which uses computer vision/CNN technology to detect and count the objects of interest in an image.
• Lijun Yu has presented the design of a single moving object detection and recognition system based on OpenCV [10]. This paper proposes a new Frequency-tuned (FT) algorithm for extracting a target's dynamic saliency information from a mixture of Gaussian models, aiming at the inconspicuous saliency maps of the traditional Frequency-tuned (FT) algorithm and the significant "dilution" of feature map fusion. The algorithm makes innovative improvements to distance metrics and feature graphs. To address the large computational complexity of traditional identification algorithms, it uses a Haar cascade classifier with low computational complexity as the classification algorithm, and uses OpenCV and the Qt interface library to build an integrated multi-module system software platform that achieves single-target moving object detection and recognition. The experimental results show that the system detects and recognizes single-target motion with high accuracy and has good prospects for engineering application.
• Xiaohan Liu has presented multi-task fusion of object detection and semantic segmentation [11]. This paper proposes to exploit multiple related tasks for accurate multi-sensor 3D object detection. Towards this goal it presents an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion. The experiments show that all these tasks are complementary and help the network learn better representations by fusing information at various levels. Importantly, the approach leads the KITTI benchmark on 2D, 3D and bird's-eye-view object detection, while being real-time.

CHAPTER 3
ANALYSIS

The goal of system analysis is to determine where the problem is, in an attempt to fix the system. This step involves breaking the system down into different pieces to analyze the situation: analyzing project goals, breaking down what needs to be created, and attempting to engage users so that definite requirements can be defined.

3.1 Hardware and Software requirements

Hardware requirements: The following are the hardware requirements which we have used in our
project.

● Processor Needed : i3 or above.


● RAM : 1 GB or more.
● Hard disk : 40 GB or more.
● Monitor : Any Monitor.
● Keyboard : Standard Keyboard.
● Mouse : Two or Three Button Mouse.

Software Requirements: The following are software requirements.

● Operating System : Windows XP, 7, or higher Windows OS.

● Language : Python 3.6 (IDLE) or higher.
3.2 Module Description

The modules used in this system are:

1. Python Libraries
2. Open CV
3. NumPy
1. Python: Python is a programming language, which means it's a language both people and computers can understand. Python was developed by a Dutch software engineer named Guido van Rossum, who created the language to solve some problems he saw in the computer languages of the time.
Python is an interpreted, high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, and a syntax that allows programmers to express concepts in fewer lines of code, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.
Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library. Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open-source software and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation.

2. OpenCV: OpenCV-Python is a library of Python bindings designed to solve computer vision problems.
To display an image, we read it with the imread() function and then call the imshow() method of the cv2 module. The imshow() function displays the image in a window; it receives as input the name of the window and the image.
In Computer Vision applications, images are an integral part of the development process. Often
there would be a requirement to read images and display them if required.
To read and display image using OpenCV Python, you could use cv2.imread() for reading the image
to a variable and cv2.imshow() to display the image in a separate window.
Syntax: cv2.imread(path, flag)
Parameters:

path: A string representing the path of the image to be read.
flag: It specifies the way in which the image should be read. Its default value is cv2.IMREAD_COLOR.
Return Value: This method returns an image that is loaded from the specified file.
All three types of flags are described below:
cv2.IMREAD_COLOR: It specifies to load a color image. Any transparency of image will
be neglected. It is the default flag. Alternatively, we can pass integer value 1 for this flag.
cv2.IMREAD_GRAYSCALE: It specifies to load an image in grayscale mode.
Alternatively, we can pass integer value 0 for this flag.
cv2.IMREAD_UNCHANGED: It specifies to load an image as such including alpha channel.
Alternatively, we can pass integer value -1 for this flag.
Example #1: Using default flag
# Python program to explain cv2.imread() method
# importing cv2
import cv2
# path
path = r'C:\Users\Rajnish\Desktop\geeksforgeeks.png'
# Using cv2.imread() method
img = cv2.imread(path)
# Displaying the image
cv2.imshow('image', img)
# keep the window open until a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
Example #2:
Loading an image in grayscale mode
# Python program to explain cv2.imread() method
# importing cv2
import cv2
# path
path = r'C:\Users\Rajnish\Desktop\geeksforgeeks.png'
# Using cv2.imread() method
# Using 0 to read image in grayscale mode
img = cv2.imread(path, 0)
# Displaying the image
cv2.imshow('image', img)
# keep the window open until a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
Example:
# importing cv2
import cv2
# image path
path = './forest.jpg'
# Reading an image in default mode
image = cv2.imread(path)
# Window name in which image is displayed
window_name = 'image'
# Using cv2.imshow() method
# Displaying the image
cv2.imshow(window_name, image)
# waits for user to press any key
# (this is necessary to avoid the Python kernel from crashing)
cv2.waitKey(0)
# closing all open windows
cv2.destroyAllWindows()
• In this example, first, we import the cv2 module.
• In the next step, we define an image path.
• Then we read the image using the cv2.imread() function.
• Then we define a window_name, which we set to 'image'.
• Then we use the imshow() function, passing the two parameters, to open the image window and see the image.
• To keep the image window open, we use the waitKey() function.
• We passed 0 to the waitKey() method, which means the window will remain open until a key is pressed.
• That is it for the Python cv2.imshow() function.
You Can Use Python for Pretty Much Anything
One significant advantage of learning Python is that it’s a general-purpose language that can be
applied in a large variety of projects. Below are just some of the most common fields where Python
has found its use:
• Data science

• Scientific and mathematical computing
• Web development
• Computer graphics
• Basic game development
• Mapping and geography (GIS software)
Python Is Widely Used in Data Science
Python's ecosystem has been growing over the years, and it is more and more capable of statistical analysis. It is the best compromise between scale and sophistication (in terms of data processing). Python emphasizes productivity and readability. It is used by programmers who want to delve into data analysis or apply statistical techniques (and by developers who turn to data science).
There are plenty of Python scientific packages for data visualization, machine learning, natural
language processing, complex data analysis and more. All of these factors make Python a great tool
for scientific computing and a solid alternative for commercial packages such as MatLab. The most
popular libraries and tools for data science are:
Pandas: a library for data manipulation and analysis. The library provides data structures and
operations for manipulating numerical tables and time series.
3. NumPy: The fundamental package for scientific computing with Python, adding support for
large, multi-dimensional arrays and matrices, along with a large library of high-level
mathematical functions to operate on these arrays.
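In this project NumPy matters mainly because OpenCV represents every image as a NumPy array, so array shape and dtype are the vocabulary of all image handling. A minimal sketch:

# Sketch: OpenCV images are plain NumPy arrays
import numpy as np
img = np.zeros((416, 416, 3), dtype=np.uint8)  # blank image: height x width x BGR channels
print(img.shape, img.dtype)                    # (416, 416, 3) uint8
img[:, :, 2] = 255                             # fill the red channel (OpenCV uses BGR order)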
SciPy: a library used by scientists, analysts, and engineers doing scientific computing and technical
computing.
Being a free, cross-platform, general-purpose and high-level programming language, Python has
been widely adopted by the scientific community. Scientists value Python for its precise and
efficient syntax, relatively flat learning curve and the fact that it integrates well with other languages
(e.g. C/C++).
As a result of this popularity there are plenty of Python scientific packages for data visualization,
machine learning, natural language processing, complex data analysis and more. All of these factors
make Python a great tool for scientific computing and a solid alternative for commercial packages
such as Matlab.

Here’s our list of the most popular Python scientific libraries and tools
Astropy
The Astropy Project is a collection of packages designed for use in astronomy. The core astropy
package contains functionality aimed at professional astronomers and astrophysicists, but may be
useful to anyone developing astronomy software.
Biopython
Biopython is a collection of non-commercial Python tools for computational biology and
bioinformatics. It contains classes to represent biological sequences and sequence annotations, and it
is able to read and write to a variety of file formats.
Cubes
Cubes is a light-weight Python framework and set of tools for the development of reporting and
analytical applications, Online Analytical Processing (OLAP), multidimensional analysis and
browsing of aggregated data.
DEAP
DEAP is an evolutionary computation framework for rapid prototyping and testing of ideas. It
incorporates the data structures and tools required to implement most common evolutionary
computation techniques such as genetic algorithm, genetic programming, evolution strategies,
particle swarm optimization, differential evolution and estimation of distribution algorithm.
SCOOP
SCOOP is a Python module for distributing concurrent parallel tasks on various environments, from
heterogeneous grids of workstations to supercomputers.
PsychoPy
PsychoPy is a package for the generation of experiments for neuroscience and experimental
psychology. PsychoPy is designed to allow the presentation of stimuli and collection of data for a
wide range of neuroscience, psychology and psychophysics experiments.
Pandas
Pandas is a library for data manipulation and analysis. The library provides data structures and
operations for manipulating numerical tables and time series.
Mlpy
Mlpy is a machine learning library built on top of NumPy/SciPy, the GNU Scientific Libraries.
Mlpy provides a wide range of machine learning methods for supervised and unsupervised problems

and it is aimed at finding a reasonable compromise between modularity, maintainability,
reproducibility, usability and efficiency.
matplotlib
Matplotlib is a python 2D plotting library which produces publication quality figures in a variety of
hardcopy formats and interactive environments across platforms. Matplotlib allows you to generate
plots, histograms, power spectra, bar charts, errorcharts, scatterplots, and more.
NumPy
NumPy is the fundamental package for scientific computing with Python, adding support for large,
multi-dimensional arrays and matrices, along with a large library of high-level mathematical
functions to operate on these arrays.
NetworkX
NetworkX is a library for studying graphs which helps you create, manipulate, and study the structure, dynamics, and functions of complex networks.
TomoPy
TomoPy is an open-sourced Python toolbox to perform tomographic data processing and image
reconstruction tasks. TomoPy provides a collaborative framework for the analysis of synchrotron
tomographic data with the goal to unify the effort of different facilities and beamlines performing
similar tasks.
Theano
Theano is a numerical computation Python library. Theano allows you to define, optimize, and
evaluate mathematical expressions involving multi-dimensional arrays efficiently.
SymPy
SymPy is a library for symbolic computation and includes features ranging from basic symbolic
arithmetic to calculus, algebra, discrete mathematics and quantum physics. It provides computer
algebra capabilities either as a standalone application, as a library to other applications, or live on
the web.
SciPy
SciPy is a library used by scientists, analysts, and engineers doing scientific computing and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.

Scikit-learn
Scikit-learn is a machine learning library. It features various classification, regression and
clustering algorithms including support vector machines, random forests, gradient boosting, k-

means and DBSCAN, and is designed to interoperate with the Python numerical and scientific
libraries NumPy and SciPy.
Scikit-image
Scikit-image is an image processing library. It includes algorithms for segmentation, geometric
transformations, color space manipulation, analysis, filtering, morphology, feature detection, and
more.
ScientificPython
ScientificPython is a collection of modules for scientific computing. It contains support for
geometry, mathematical functions, statistics, physical units, IO, visualization, and parallelization.
SageMath
SageMath is mathematical software with features covering many aspects of mathematics,
including algebra, combinatorics, numerical mathematics, number theory, and calculus.
SageMath uses Python, supporting procedural, functional and object-oriented constructs.
Veusz
Veusz is a scientific plotting and graphing package designed to produce publication-quality plots
in popular vector formats, including PDF, PostScript and SVG.
Graph-tool
Graph-tool is a module for the manipulation and statistical analysis of graphs.
SunPy
SunPy is a data-analysis environment specializing in providing the software necessary to analyze
solar and heliospheric data in Python.
Opencv python:
Installation and Usage
• If you have previous/other manually installed (= not installed via pip) version of OpenCV
installed (e.g. cv2 module in the root of Python's site-packages), remove it before installation to
avoid conflicts.
• Make sure that your pip version is up-to-date (19.3 is the minimum supported version):
pip install --upgrade pip. Check version with pip -V. For example Linux distributions ship
usually with very old pip versions which cause a lot of unexpected problems especially with the
manylinux format.
• Select the correct package for your environment:
• There are four different packages (see options 1, 2, 3 and 4 below) and you should
SELECT ONLY ONE OF THEM. Do not install multiple different packages in the same
environment. There is no plugin architecture: all the packages use the same namespace (cv2). If
you installed multiple different packages in the same environment, uninstall them all with pip
uninstall and reinstall only one package.
• a. Packages for standard desktop environments (Windows, macOS, almost any
GNU/Linux distribution)
o Option 1 - Main modules package: pip install opencv-python

o Option 2 - Full package (contains both main modules and contrib/extra modules): pip
install opencv-contrib-python (check contrib/extra modules listing from OpenCV
documentation)
• b. Packages for server (headless) environments (such as Docker, cloud environments
etc.), no GUI library dependencies
• These packages are smaller than the two other packages above because they do not
contain any GUI functionality (not compiled with Qt / other GUI components). This means that
the packages avoid a heavy dependency chain to X11 libraries and you will have for example
smaller Docker images as a result. You should always use these packages if you do not use
cv2.imshow et al. or you are using some other package (such as PyQt) than OpenCV to create
your GUI.
o Option 3 - Headless main modules package: pip install opencv-python-headless
o Option 4 - Headless full package (contains both main modules and contrib/extra
modules): pip install opencv-contrib-python-headless (check contrib/extra modules listing from
OpenCV documentation)
• Import the package:
• import cv2
• All packages contain Haar cascade files. cv2.data.haarcascades can be used as a shortcut
to the data folder. For example:
• cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
• Read OpenCV documentation
• Before opening a new issue, read the FAQ below and have a look at the other issues
which are already open.
Bokeh
Bokeh is a Python interactive visualization library that targets modern web browsers for
presentation. Bokeh can help anyone who would like to quickly and easily create interactive
plots, dashboards, and data applications. Its goal is to provide elegant, concise construction of
novel graphics in the style of D3.js, but also deliver this capability with high-performance
interactivity over very large or streaming datasets.
TensorFlow
TensorFlow is an open source software library for machine learning across a range of tasks,
developed by Google to meet their needs for systems capable of building and training neural
networks to detect and decipher patterns and correlations, analogous to the learning and
reasoning which humans use. It is currently used for both research and production at Google
products, often replacing the role of its closed-source predecessor, DistBelief.
DataMelt
DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets. DMelt can be used

with several scripting languages including Python/Jython, BeanShell, Groovy, Ruby, as well as
with Java.
Python-weka-wrapper
Weka is a suite of machine learning software written in Java, developed at the University of
Waikato, New Zealand. It contains a collection of visualization tools and algorithms for data
analysis and predictive modeling, together with graphical user interfaces for easy access to these
functions. The python-weka-wrapper package makes it easy to run Weka algorithms and filters
from within Python.
Dask
Dask is a flexible parallel computing library for analytic computing, composed of two components: 1) dynamic task scheduling optimized for interactive computational workloads, and 2) "Big Data" collections like parallel arrays, dataframes, and lists that extend common interfaces like NumPy, Pandas, or Python iterators to larger-than-memory or distributed environments.
Python Saves Time
Even the classic “Hello, world” program illustrates this point:
print("Hello, world")
For comparison, this is what the same program looks like in Java:
public class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello, world");
}
}
Python Keywords and Identifier
Keywords are the reserved words in Python.
We cannot use a keyword as a variable name, function name or any other identifier. They are used to define the syntax and structure of the Python language.
In Python, keywords are case sensitive.
There are 33 keywords in Python 3.3. This number can vary slightly over time.
All the keywords except True, False and None are in lowercase, and they must be written as they are. The list of all the keywords is given below:
False, None, True, and, as, assert, break, class, continue, def, del, elif, else, except, finally, for, from, global, if, import, in, is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield

An identifier is the name given to entities like classes, functions and variables in Python. It helps differentiate one entity from another.
Rules for writing identifiers:
Identifiers can be a combination of lowercase letters (a to z), uppercase letters (A to Z), digits (0 to 9) and underscores (_). Names like myClass, var_1 and print_this_to_screen are all valid examples.
An identifier cannot start with a digit: 1variable is invalid, but variable1 is perfectly fine.
Keywords cannot be used as identifiers.

>>> global = 1
File "<interactive input>", line 1
    global = 1
           ^
SyntaxError: invalid syntax
We cannot use special symbols like !, @, #, $, % etc. in our identifier.

>>> a@ = 0
File "<interactive input>", line 1
    a@ = 0
       ^
SyntaxError: invalid syntax
Identifiers can be of any length.


CHAPTER 4
DESIGN

4.1 System Design:


Systems design is the process of defining the architecture, components, modules, interfaces, and
data for a system to satisfy specified requirements. One could see it as the application of systems
theory to product development. Object-oriented analysis and design methods are becoming the
most widely used methods for computer systems design.
VGG is a Visual Geometry Group network, consisting of a stack of convolution, max-pooling, fully connected and softmax layers.
The image is passed through a stack of convolutional layers, where we use filters with a very small receptive field: 3 × 3 (which is the smallest size that captures the notion of left/right, up/down and centre).
The convolution stride is fixed to 1 pixel; the spatial padding of each convolution layer's input is such that the spatial resolution is preserved after convolution. The padding is 1 pixel for the 3 × 3 convolution layers. Spatial pooling is carried out by five max-pooling layers, which follow some of the convolution layers.
A Convolutional Neural Network (ConvNet/CNN)[2] is a Deep Learning algorithm which can
take in an input image, assign importance (learnable weights and biases) to various
aspects/objects in the image and be able to differentiate one from the other. The pre-processing
required in a ConvNet is much lower as compared to other classification algorithms. While in
primitive methods filters are hand-engineered, with enough training, ConvNets have the ability
to learn these filters/characteristics.
The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons in the
Human Brain and was inspired by the organization of the Visual Cortex. Individual neurons
respond to stimuli only in a restricted region of the visual field known as the Receptive Field. A
collection of such fields overlap to cover the entire visual area.
A ConvNet is able to successfully capture the Spatial and Temporal dependencies in an image
through the application of relevant filters. The architecture performs a better fitting to the image
dataset due to the reduction in the number of parameters involved and reusability of weights. In
other words, the network can be trained to understand the sophistication of the image better.
Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the spatial size of the Convolved Feature. This decreases the computational power required to process the data through dimensionality reduction. Furthermore, it is useful for extracting dominant features which are rotationally and positionally invariant, thus maintaining the process of effectively training the model.

There are two types of Pooling: Max Pooling and Average Pooling. Max Pooling returns the
maximum value from the portion of the image covered by the Kernel. On the other hand,
Average Pooling returns the average of all the values from the portion of the image covered by
the Kernel.
Max Pooling also performs as a Noise Suppressant. It discards the noisy activations altogether
and also performs de-noising along with dimensionality reduction. On the other hand, Average
Pooling simply performs dimensionality reduction as a noise suppressing mechanism. Hence, we
can say that Max Pooling performs a lot better than Average Pooling.
The Convolutional Layer and the Pooling Layer together form the i-th layer of a Convolutional Neural Network. Depending on the complexity of the images, the number of such layers may be increased to capture low-level details even further, but at the cost of more computational power.
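A small NumPy sketch makes the max versus average pooling contrast concrete (2 × 2 pooling with stride 2 on a 4 × 4 input; the values are chosen arbitrarily for illustration):

# Sketch: 2x2 max pooling vs average pooling with stride 2
import numpy as np
x = np.array([[1, 3, 2, 0],
              [5, 6, 1, 2],
              [7, 2, 9, 4],
              [1, 0, 3, 8]], dtype=float)
windows = x.reshape(2, 2, 2, 2).swapaxes(1, 2)  # four non-overlapping 2x2 windows
print(windows.max(axis=(2, 3)))   # [[6. 2.] [7. 9.]]     -- max pooling keeps peaks
print(windows.mean(axis=(2, 3)))  # [[3.75 1.25] [2.5 6.]] -- average pooling smooths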
• After going through the above process, we have successfully enabled the model to
understand the features. Moving on, we are going to flatten the final output and feed it to
a regular Neural Network for classification purposes.
• Testing
• The new picture will be split into the same number of grids that is selected during the
training. The model predicts a 3 x 3 x 16 output for each grid. The 16 values in this
prediction are in the same format as the training label.
• Training
• The input for training the model will be images and their respective y labels. Fig 1 is
divided into 3 X 3 grid with two grid anchors, with 3 separate classes of objects. The
corresponding y labels has the shape of 3 X 3 X 16. Training takes the form of an image
and maps it into a target 3 X 3 X 16.
• Implementation of YOLO
o Darknet: The algorithm is implemented using an open-source neural network framework, Darknet, which was developed in the C language and CUDA technology to render the speedy calculations on a GPU necessary for real-time predictions.
o DNModel.py: the Darknet model file, which builds the model from the configuration file and appends each layer.
o Util.py: contains all the formulas used.
o imageprocees.py: performs the image processing tasks. It takes all the input images, resizes them, performs up-sampling, and also performs the transpose function.
o detect.py: the main code that is run to perform object detection. It uses all the above-mentioned files and performs all the functions according to the YOLO concept.

• COCO Dataset
COCO basically means that the dataset images are daily objects recorded in everyday scenes. The dataset provides labelling of multiple objects, annotations of segmentation masks, image captioning, key-point detection and panoptic segmentation annotations, with a total of 81 categories, making it a very flexible and polyvalent dataset.
(A flow diagram appeared here: input images → bounding box prediction → objectness > 0.5 → identifying the class confidence → applying non-max suppression; remaining bounding boxes are ignored.)

4.2 Architectural Design:

Figure 4.1: Architectural Design

The strategy followed by YOLO is as follows, and Figure 4.1 represents it:


1. It divides the given image into an S × S grid
2. Then, each grid cell is used to analyze whether an object falls into it or not.
3. Then each grid cell predicts B bounding boxes and confidence scores for those boxes.
4. These confidence scores reflect how confident the model is that the box contains an object
and also how accurate this prediction is.
5. Then they choose the most accurate result and put a bounding box around it.
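Step 5 amounts to keeping the highest-scoring box and suppressing overlapping duplicates, and the overlap test typically used for that is intersection-over-union (IoU). The helper below is our own illustrative sketch, not code from the project:

# Sketch: intersection-over-union between two boxes given as (x1, y1, x2, y2)
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143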
(Step-by-step illustrations, Step-1 to Step-3, showing the grid division, the per-cell box predictions, and the final detection, appeared here.)

PROCESS MODEL

Figure 4.2: Proposed method and Algorithm

• YOLOv3 is the algorithm we used for this project. Figure 4.2 represents the proposed method and algorithm of YOLO.
• YOLOv3 (You Only Look Once, version 3) is a real-time object detection algorithm that identifies specific objects in videos, live feeds, or images.
4.3 Block Diagram:
The block diagram is typically used for a higher-level, less detailed description, aimed more at understanding the overall concepts and less at understanding the details of implementation [5]. The following figure, Figure 4.3, represents the block diagram.

Figure 4.3: Block Diagram

4.4 UML Diagrams:

The Unified Modeling Language (UML) is a Standard language for specifying, visualizing,
constructing, and documenting the software system and its components [12]. The UML focuses on the
conceptual and physical representation of the system. It captures the decisions and understandings about
systems that must be constructed. Structural models represent the framework for the system and this
framework is the place where all other components exist. So the class diagram, component diagram and
deployment diagrams are the part of structural modeling.

They all represent the elements and the mechanism to assemble them. But the structural
model never describes the dynamic behavior of the system. Behavioral model describes the interaction
in the system. It represents the interaction among the structural diagrams [12]. Behavioral modeling
shows the dynamic nature of the system. Architectural model represents the overall framework of the
system. It contains both structural and behavioral elements of the system. Architectural model can be
defined as the blueprint of the entire system. Package diagram comes under architectural modeling.
The Unified Modeling Language encompasses a number of models
• Use Case Diagram

• Class Diagram

• Sequence Diagram

• Activity Diagram
4.4.1 Use Case Diagram:
Use case diagrams are one of the five diagrams in the UML for modeling the dynamic aspects of systems (activity diagrams, sequence diagrams, state chart diagrams and collaboration diagrams are the four other kinds of diagrams in the UML for modeling the dynamic aspects of systems). Use case diagrams are central to modeling the behavior of a system, a sub-system, or a class. Each one shows a set of use cases, actors and their relationships.
The key points are:
● The main purpose is to show the interaction between the use cases and the actor.
● To represent the system requirement from the user's perspective.
● Use cases are the functions that are to be performed in the module.

The following figure 4.4 represents the Use Case diagram.

Figure 4.4: Use Case Diagram

4.4.2 Class Diagram


A “Class Diagram” shows a set of classes, interfaces and collaborations and their relationships. These
diagrams are the most common diagrams in modeling object-oriented systems. The class diagram is a
static diagram. It represents the static view of an application. Class diagram is not only used for
visualizing, describing and documenting different aspects of a system but also for constructing
executable code of the software application.

The class diagram describes the attributes and operations of a class and also the constraints imposed on the system [13]. Class diagrams are widely used in the modeling of object-oriented systems because they are the only UML diagrams which can be mapped directly to object-oriented languages. Figure 4.5 represents the Class Diagram.

Figure 4.5: Class Diagram

4.4.3 Sequence Diagram

Sequence diagram is an interaction diagram which focuses on the time ordering of messages.
It shows a set of objects and messages exchanged between these objects. This diagram illustrates the
dynamic view of a system. Figure 4.6 represents sequence diagram.

The key points are:


1. The main purpose is to represent the logical flow of data with respect to a process
2. A sequence diagram displays the objects and not the classes.

Figure 4.6: Sequence Diagram

4.4.4 Activity Diagram:

An Activity Diagram is a behavioral diagram that shows the flow or sequence of activities through a system. The terms activity diagram and process flow are often used interchangeably. However, the term activity diagram is typically more restrictive, as it refers to one of the thirteen standard Unified Modeling Language (UML) diagrams. Activity Diagrams are among the most commonly used diagrams since their notation and origin are based on the widely known flowchart notation. Activity diagrams are similar to flowchart and data flow diagrams. Figure 4.7 represents the activity diagram.

Figure 4.7: Activity Diagram

CHAPTER 5

IMPLEMENTATION

The implementation stage of any project is a true display of the defining moments that make a project a
success or a failure. The implementation stage is defined as the system or system modifications being
installed and made operational in a production environment. The phase is initiated after the system has
been tested and accepted by the user. This phase continues until the system is operating in production in
accordance with the defined user requirements.

5.1 Algorithm used

YOLO is an algorithm that detects and recognizes various objects in a picture in real time. Object detection in YOLO is framed as a regression problem, and the algorithm provides the class probabilities of the detected objects. The YOLO algorithm employs convolutional neural networks (CNN) to detect objects in real time. As the name suggests, the algorithm requires only a single forward propagation through a neural network to detect objects: prediction over the entire image is done in a single algorithm run. The CNN is used to predict various class probabilities and bounding boxes simultaneously. The YOLO algorithm has various variants; some of the common ones include tiny YOLO and YOLOv3.
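To make the single-forward-pass idea concrete, here is a minimal, hedged sketch of loading such a network with OpenCV's DNN module; the file names yolov3.cfg and yolov3.weights are assumptions matching the kinds of files shown in the screenshots below:

# Sketch: the network is loaded once, then detection is a single forward pass
import cv2
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # assumed file names
names = net.getLayerNames()
# YOLOv3 emits detections at three scales, at the unconnected output layers
out_layers = [names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]
print(out_layers)  # typically ['yolo_82', 'yolo_94', 'yolo_106'] for YOLOv3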

Figure 5.1: screenshot of cfg

Figure 5.2: screenshot of cfg

Figure 5.3: screenshot of cfg

Figure 5.4: screenshot of cfg

Figure 5.5: screenshot of cfg

Figure 5.6: screenshot of cfg

Figure 5.7: screenshot of cfg

Figure 5.8: screenshot of cfg

Figure 5.9: screenshot of cfg

Figure 5.10: screenshot of cfg

Figure 5.11: screenshot of cfg

Figure 5.12: screenshot of cfg

Figure 5.13: screenshot of cfg

Figure 5.14: screenshot of cfg

Figure 5.15: screenshot of cfg

Figure 5.16: screenshot of cfg

Figure 5.17: screenshot of cfg

Figure 5.18: screenshot of cfg

The figures above, Figure 5.1 to Figure 5.18, are screenshots of the cfg file, which describes the configuration of the network layer by layer. The network architecture is defined in this cfg file, and the pre-trained weights of the neural network are stored in the YOLO weights file.
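For readers without access to the screenshots, a Darknet cfg file is a plain-text, INI-style description of the network. The fragment below is an illustrative sketch of its general shape, not a copy of the project's file:

[net]
# input resolution and channel count fed to the network
width=416
height=416
channels=3

[convolutional]
# one convolutional block: batch norm, 32 filters of size 3x3, leaky ReLU
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky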

Figure 5.19: Screenshot of coco names

Figure 5.20: Screenshot of coco names


Figure 5.21: Screenshot of coco names

Figure 5.22: Screenshot of coco names

The figures above, Figure 5.19 to Figure 5.22, are screenshots of the COCO names file. Based on the COCO dataset, YOLO can detect the 80 COCO object classes: person, bicycle, car, motorbike, aeroplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, and so on.
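The coco.names file holds one class label per line, so loading it is a one-liner. A sketch, assuming the file name used alongside the Darknet files:

# Sketch: reading the 80 COCO class labels (assumed file name)
with open("coco.names") as f:
    classes = [line.strip() for line in f if line.strip()]
print(len(classes), classes[:5])  # 80 ['person', 'bicycle', 'car', 'motorbike', 'aeroplane']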

Figure 5.23: Screenshot of code

Figure 5.24: Screenshot of code


Figure 5.25: Screenshot of code

Figure 5.26: Screenshot of code

The figures above, Figure 5.23 to Figure 5.26, are screenshots of the code that performs the YOLO algorithm. The YOLO algorithm uses a completely different approach: it applies a single neural network to the entire image. The network divides that image into regions, produces the bounding boxes, and predicts probabilities for each region.
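As a hedged approximation of what the code in those screenshots typically does (the file names yolov3.cfg, yolov3.weights, coco.names and test.jpg, and the 0.5/0.4 thresholds, are assumptions for illustration), a minimal end-to-end detection sketch with OpenCV's DNN module looks like this:

# Minimal YOLOv3 detection sketch (assumed file names and thresholds)
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
with open("coco.names") as f:
    classes = [line.strip() for line in f if line.strip()]

img = cv2.imread("test.jpg")
h, w = img.shape[:2]
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
names = net.getLayerNames()
outs = net.forward([names[i - 1] for i in net.getUnconnectedOutLayers().flatten()])

boxes, confidences, class_ids = [], [], []
for out in outs:
    for det in out:                     # det = [cx, cy, bw, bh, objectness, class scores...]
        scores = det[5:]
        cid = int(np.argmax(scores))
        conf = float(det[4] * scores[cid])
        if conf > 0.5:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)
            class_ids.append(cid)

# non-max suppression keeps only the best box among heavily overlapping ones
keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(keep).flatten():
    x, y, bw, bh = boxes[i]
    cv2.rectangle(img, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
    cv2.putText(img, classes[class_ids[i]], (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
cv2.imshow("detections", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Note that getLayerNames() is indexed from 1 by getUnconnectedOutLayers(), hence the i - 1 above.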

CHAPTER 6

TESTING

Testing is the process of verifying functionality: executing a program with the intent of finding errors. A good test case is one that has a high probability of finding an undiscovered error.

6.1 Training
Training is the process of teaching or being taught the skills for a particular job or activity. In this context, it refers to the process of teaching an algorithm the specific task for which it will be used; such algorithms also learn from experience without being explicitly programmed. In our experiment, we labelled ten thousand images, each with a varying number of URLs and some with no URL at all. We trained our algorithm using fifty thousand images with varying font sizes. Each grid cell is responsible for predicting K bounding boxes; the grid cell size was selected because we would be working with only text. An object is considered to lie in a specific cell only if the center coordinates of its anchor box lie in that cell. Due to this property, the center coordinates are always calculated relative to the cell, whereas the height and width are calculated relative to the whole image size. Using the equation below, YOLO determines the probability that a cell contains a certain class. The class with the maximum probability is chosen and assigned to that grid cell; this is repeated for all grid cells in the image. The probability that there is an object of a certain class c in cell i is:

score_{c,i} = p_c × c_i

At the end of the training process, we provided the trained algorithm with a new set of data (ten thousand images) to test our results. This new dataset is modeled like the training data.
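Plugging hypothetical numbers into the formula above (purely illustrative values, not measurements from the project):

# score_{c,i} = p_c * c_i for one grid cell, with made-up numbers
p_c = 0.9                # probability that the cell contains an object at all
c = [0.7, 0.2, 0.1]      # conditional class probabilities for classes 0, 1, 2
scores = [p_c * ci for ci in c]
print(scores)            # approximately [0.63, 0.18, 0.09] -> class 0 is assigned to this cell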

Test cases:

Testcase1:

The input resolution specified in this project is 416 × 416 pixels. If you give an image with a higher resolution than this value, you can encounter an error, so you need to resize or reduce the image resolution.
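One common way to handle this resizing is to let OpenCV build the 416 × 416 input blob directly, so images of arbitrary size can be fed in; this sketch uses OpenCV's real blobFromImage call with a hypothetical input file:

# Sketch: resizing any input image to the 416x416 network resolution
import cv2
img = cv2.imread("large_photo.jpg")  # hypothetical high-resolution input
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
print(blob.shape)  # (1, 3, 416, 416) regardless of the original image size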
Testcase2:
We have tested all kinds of objects, such as a water bottle, mobile phone, book, chair, etc. The test was successful.
Testcase3:
We have tested with multiple objects and it can detect the objects successfully.

6.2 TEST RESULTS


All the test cases were executed and passed successfully; no defects were encountered.

CHAPTER 7
RESULTS

The result screenshots are as follows

Figure 7.1: Screenshot of execution


The above Figure 7.1 represents successful execution of the code.

Figure 7.2: Screenshot of Results
From Figure 7.2 above, the result is displayed: the objects laptop and mouse are detected using the YOLO algorithm.

Figure 7.3: Screenshot of Results


From Figure 7.3 above, the result is displayed: the objects mobile phone, person and bottle are detected using the YOLO algorithm.

Figure 7.4: Screenshot of Results
From Figure 7.4 above, the result is displayed: multiple objects are detected.

Figure 7.5: Screenshot of Results

From Figure 7.5 above, the result is displayed: the objects book and person are detected.

CHAPTER 8

CONCLUSION

Object detection is a key ability for most computer and robot vision systems. Finally, we need to consider that we will need object detection systems for nano-robots, or for robots that will explore areas that have not been seen by humans, such as the deep parts of the sea or other planets, and the detection systems will have to learn new object classes as they are encountered. In such cases, a real-time open-world learning ability will be critical.

CHAPTER 9
FUTURE SCOPE AND ENHANCEMENTS

Potential future applications of this method include:


• Creating city guides
• Powering self-driving cars
• Boosting augmented reality applications and gaming
• Organizing one’s visual memory
• Empowering educators and students
• Improving iris recognition

CHAPTER 10

REFERENCES

[1] Geethapriya S, N. Duraimurugan, S.P. Chokkalingam, “Real-Time Object Detection with Yolo”,
International Journal of Engineering and Advanced Technology (IJEAT)

[2] Abdul Vahab, Maruti S Naik, Prasanna G Raikar, Prasad S R, “Applications of Object Detection System”, International Research Journal of Engineering and Technology (IRJET)

[3] H. Deshpande, A. Singh, H. Herunde, “Comparative analysis on YOLO object detection with
OpenCV”, International Journal of Research in Industrial Engineering, Vol. 9, No. 1 (2020) 46–64

[4] G Chandan, Ayush Jain, Harsh Jain, Mohana “Real Time Object Detection and Tracking Using
Deep Learning and OpenCV”, 2018 International Conference on Inventive Research in Computing
Applications (ICIRCA), Coimbatore, India. IEEE

[5] Ahmad Delforouzi, Bhargav Pamarthi, and Marcin Grzegorzek, “Training-Based Methods for Comparison of Object Detection Methods for Visual Object Tracking”, National Library of Medicine, 2018

[6] Rubayat Ahmed Khan; Jia Uddin; Sonia Corraya, “Real-time fire detection using enhanced color
segmentation and novel foreground extraction”, 2017 4th International Conference on Advances in
Electrical Engineering (ICAEE), ISSN: 2378-2692

[7] Xiao-yan Zhang; Rong-chun Zhao, “Automatic Video Object Segmentation using Wavelet
Transform and Moving Edge Detection”, 2006 International Conference on Machine Learning and
Cybernetics, ISBN:1-4244-0061-9

[8] M. N. Vijayalakshmi; M. Senthilvadivu, “Performance evaluation of object detection techniques for object detection”, 2017 International Conference on Inventive Computation Technologies (ICICT), IEEE

[9] B N Krishna Sai; T. Sasikala, “Object Detection and Count of Objects in Image using Tensor Flow
Object Detection API”, 2019 International Conference on Smart Systems and Inventive Technology
(ICSSIT), IEEE
[10] Lijun Yu; Weijie Sun; Hui Wang; Qiang Wang; Chaoda Liu, “The Design of Single Moving Object
Detection and Recognition System Based on OpenCV”, 2018 International Conference on Mechatronics
and Automation (ICMA), IEEE

[11] Xiaohan Liu; Heng Wang, “Multi-Task Fusion of Object Detection and Semantic Segmentation”,
2019 Chinese Automation Congress (CAC), IEEE

[12] Structured Systems Analysis and Design: Data Flow Approach, by V. B. Kaujalgi, 2nd Edition; Orient Blackswan; ISBN-10: 0863113230, ISBN-13: 978-0863113239

[13] Software Testing, by Ron Patton; 2nd Edition; Sams Publishing; ISBN-10: 0672327988, ISBN-13: 978-0672327988

[14] Software Testing Concepts and Tools, by Nageshwar Rao Pusuluri; 2nd Edition; Dreamtech Press; ISBN-10: 8177227122, ISBN-13: 978-8177227123

