Construction
Object detection lets you understand the details of an image or a video, as it allows for the recognition, localization, and classification of multiple objects within an image. It is commonly used in applications such as image retrieval, security, surveillance, and advanced driver assistance systems (ADAS). Object detection can be performed in many ways.
Digital image processing is an area characterized by the need for extensive experimental work to establish the feasibility of proposed solutions to a given problem. An important characteristic underlying the design of image processing systems is the significant level of testing and experimentation that is normally required before arriving at an acceptable solution. This implies that the ability to formulate approaches and quickly prototype candidate solutions plays a major role in reducing the cost and time required to arrive at a viable system implementation.
Processing on image:
Image processing can be carried out at three levels: low-level, mid-level, and high-level. A short OpenCV sketch of the first two levels follows the list.
Low-level Processing:
Contrast enhancement.
Image sharpening.
Mid-level Processing:
Segmentation.
Edge detection.
Object extraction.
High-level Processing:
Image analysis.
Scene interpretation.
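As a minimal illustration of the low- and mid-level operations above, the following OpenCV sketch applies contrast enhancement, edge detection, and a simple threshold-based segmentation; the input file name street.jpg is a placeholder.

import cv2

# Placeholder input; any test image will do.
img = cv2.imread("street.jpg", cv2.IMREAD_GRAYSCALE)

# Low-level: contrast enhancement via histogram equalization.
enhanced = cv2.equalizeHist(img)

# Mid-level: edge detection with the Canny operator.
edges = cv2.Canny(enhanced, 100, 200)

# Mid-level: a simple segmentation by binary thresholding.
_, segmented = cv2.threshold(enhanced, 127, 255, cv2.THRESH_BINARY)

cv2.imwrite("edges.png", edges)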
How It Works
Prior detection systems repurpose classifiers or localizers to perform detection. They apply the
model to an image at multiple locations and scales. High scoring regions of the image are considered
detections.
We use a totally different approach. We apply a single neural network to the full image. This
network divides the image into regions and predicts bounding boxes and probabilities for each
region. These bounding boxes are weighted by the predicted probabilities. Our model has several
advantages over classifier-based systems. It looks at the whole image at test time so its predictions
are informed by global context in the image. It also makes predictions with a single network
evaluation unlike systems like R-CNN which require thousands for a single image. This makes it
extremely fast, more than 1000x faster than R-CNN and 100x faster than Fast R-CNN. See our paper
for more details on the full system.
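The following sketch shows what a single network evaluation looks like in practice using OpenCV's dnn module: one forward pass over the full image, with each predicted box's class scores weighted by its objectness probability. The file names yolov3.cfg, yolov3.weights, and street.jpg are assumptions, not fixed by this document.

import cv2
import numpy as np

# Assumed standard YOLOv3 config and weights from the Darknet release.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")

img = cv2.imread("street.jpg")  # placeholder image
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

# One forward pass over the whole image; each output row is one predicted box.
outputs = net.forward(net.getUnconnectedOutLayersNames())
for out in outputs:
    for det in out:
        # det[0:4] = box (cx, cy, w, h), det[4] = objectness, det[5:] = class scores
        scores = det[5:] * det[4]  # weight class scores by the box probability
        class_id = int(np.argmax(scores))
        if scores[class_id] > 0.5:
            print(class_id, float(scores[class_id]))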
YOLOv3 uses a few tricks to improve training and increase performance, including: multi-scale
predictions, a better backbone classifier, and more. The full details are in our paper!
This post will guide you through detecting objects with the YOLO system using a pre-trained model.
If you don't already have Darknet installed, you should do that first.
Algorithm: YOLO
Step 1: Load the network configuration and the pre-trained weights.
Step 2: Read the input image and resize it to the network input size.
Step 3: Run a single forward pass of the network over the full image.
Step 4: For each predicted box, weight the class scores by the box probability.
Step 5: Discard boxes whose confidence falls below the threshold.
Step 6: Apply non-maximum suppression to remove overlapping boxes.
Step 7: Draw the remaining boxes and class labels on the image.
To run this demo, you will need to compile Darknet with CUDA and OpenCV. Then run the
command:
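./darknet detector demo cfg/coco.data cfg/yolov3.cfg yolov3.weights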
YOLO will display the current FPS and predicted classes as well as the image with bounding boxes
drawn on top of it.
You will need a webcam connected to the computer that OpenCV can connect to or it won't work. If
you have multiple webcams connected and want to select which one to use you can pass the flag -c
<num> to pick (OpenCV uses webcam 0 by default).
You can also run it on a video file if OpenCV can read the video:
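./darknet detector demo cfg/coco.data cfg/yolov3.cfg yolov3.weights <video file>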
Implemented Classes:
Many object classes are present in the weights of YOLOv3, but the main values our detection code works with are the following (see the sketch after this list):
ClassIndex: The index of the detected class within the array we created from the COCO dataset in our project.
Confidence: The confidence score returned for each detected object, used to label its box with the detection probability.
Bbox: The bounding box drawn around each object that YOLO detects. It can be green or any colour we want.
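A minimal sketch of how these three values appear in code, assuming OpenCV's dnn_DetectionModel API with the standard yolov3.cfg, yolov3.weights, and coco.names files (the file names and thresholds are assumptions):

import cv2

# Assumed files: YOLOv3 config/weights and the 80-entry coco.names class list.
net = cv2.dnn_DetectionModel("yolov3.cfg", "yolov3.weights")
net.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

with open("coco.names") as f:
    class_names = [line.strip() for line in f]

img = cv2.imread("street.jpg")  # placeholder input image

# detect() returns the three values described above:
# class indices, confidence scores, and bounding boxes.
class_ids, confidences, boxes = net.detect(img, confThreshold=0.5, nmsThreshold=0.4)
for class_id, conf, box in zip(class_ids.flatten(), confidences.flatten(), boxes):
    x, y, w, h = map(int, box)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)  # green frame
    cv2.putText(img, f"{class_names[int(class_id)]} {conf:.2f}", (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
cv2.imwrite("detections.png", img)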
Implemented Functions
Weighted Sum
Inputs to a neuron can either be features from a training set or outputs from the neurons of
a previous layer. Each connection between two neurons has a unique synapse with a unique
weight attached. If you want to get from one neuron to the next, you have to travel along
the synapse and pay the “toll” (weight). The neuron then applies an activation function to
the sum of the weighted inputs from each incoming synapse. It passes the result on to all the
neurons in the next layer. When we talk about updating weights in a network, we’re talking
about adjusting the weights on these synapses.
A neuron’s input is the sum of weighted outputs from all the neurons in the previous layer.
Each input is multiplied by the weight associated with the synapse connecting the input to
the current neuron. If there are 3 inputs or neurons in the previous layer, each neuron in the
current layer will have 3 distinct weights: one for each synapse.
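A minimal numeric sketch of this weighted sum, with arbitrary example values:

import numpy as np

def neuron_output(inputs, weights, activation):
    # Weighted sum: each input is multiplied by the weight on its synapse.
    z = np.dot(inputs, weights)
    return activation(z)

# Three inputs from the previous layer, three distinct weights (one per synapse).
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.7])
print(neuron_output(x, w, lambda z: max(0.0, z)))  # rectifier as the activation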
Activation function
In a nutshell, the activation function of a node defines the output of that node.
The activation function (or transfer function) translates the input signals to output signals. It maps the output values onto a range like 0 to 1 or -1 to 1. It is an abstraction that represents the rate of action potential firing in the cell: a number that represents the likelihood that the cell will fire. At its simplest, the function is binary: yes (the neuron fires) or no (the neuron doesn't fire). The output can be either 0 or 1 (on/off or yes/no), or it can be anywhere in a range. If you were using a function that maps a range between 0 and 1 to determine the likelihood that an image is a cat, for example, an output of 0.9 would indicate a 90% probability that your image is, in fact, a cat.
Threshold function
This is a step function. If the summed input is below a certain threshold, the function passes on 0; if it is equal to or greater than the threshold, it passes on 1. It is a very rigid, straightforward, yes-or-no function.
Sigmoid function
This function is used in logistic regression. Unlike the threshold function, it is a smooth, gradual progression from 0 to 1. It is useful in the output layer, particularly when the model must predict a probability.
Hyperbolic tangent function
This function is very similar to the sigmoid function. But unlike the sigmoid function, which goes from 0 to 1, the value goes below zero, ranging from -1 to 1. Even though this is not much like what happens in a brain, this function gives better results when it comes to training neural networks. Neural networks sometimes get "stuck" during training with the sigmoid function; this happens when there is a lot of strongly negative input that keeps the output near zero, which interferes with the learning process.
Rectifier function
This might be the most popular activation function in the universe of neural networks. It is efficient and biologically plausible. Even though it has a kink at zero, it is smooth and gradual above it. This means, for example, that your output would be either "no" or a percentage of "yes." This function doesn't require normalization or other complicated calculations.
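A compact NumPy sketch of the four activation functions discussed above, evaluated on a few sample inputs:

import numpy as np

def threshold(z):  # step function: 0 below zero, 1 at or above zero
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):  # smooth progression from 0 to 1
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):  # like the sigmoid, but ranging from -1 to 1
    return np.tanh(z)

def rectifier(z):  # kink at zero, linear above it
    return np.maximum(0.0, z)

z = np.linspace(-3, 3, 7)
for f in (threshold, sigmoid, tanh, rectifier):
    print(f.__name__, f(z).round(2))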
Object detection of this kind falls under machine learning, where machines acquire skills and learn from past experience without human involvement. Deep learning is the branch of machine learning in which artificial neural networks, algorithms inspired by the human brain, learn from large amounts of data.
Although object detection is a well-established task, it remains a challenging one. It plays an essential role in many applications, such as image identification, automatic image annotation, and scene understanding. To address the vision problem of visually impaired persons, the proposed work detects objects and their design patterns accurately, identifies each object individually among multiple objects in a captured input image with high accuracy, locates objects on the X-Y plane of the image while calculating their detection percentages, and supports converting the input images to speech for navigation. The object detection module also reports its results on multiple objects, and the various methodologies for discovering, identifying, and collating artefacts are compared at each step for their effectiveness.
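As a sketch of the image-to-speech step described above, the snippet below counts the detected classes and reads the result aloud. The pyttsx3 engine is an assumed choice of offline text-to-speech library, not necessarily the one used in the project:

from collections import Counter

import pyttsx3  # assumed offline text-to-speech library

def speak_detections(class_ids, class_names):
    # Count how many objects of each class were detected.
    counts = Counter(class_names[int(i)] for i in class_ids)
    parts = [f"{n} {name}{'s' if n > 1 else ''}" for name, n in counts.items()]
    sentence = "Hey! There are " + ", ".join(parts) + " before you."
    engine = pyttsx3.init()
    engine.say(sentence)
    engine.runAndWait()

# Example: class indices as returned by detect(), with the coco.names list:
# speak_detections([39, 39, 0], class_names)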
Figure 8 shows the loaded image of an outdoor environment on one side; on the other, the model has marked all the objects available in the picture with blue-coloured frames.
Figure 9: Accuracy values of available objects in the image.
Figure 9 shows all the available objects in the image with their accuracies. The detection module observed five bottles, one chair, and eight persons in the loaded image. By playing the audio, the module says, "Hey! There are five bottles, one chair, and eight persons before you."
Figure 10: Object detection in a traffic environment.
Figure 10 shows all the detected objects with labels in a traffic-signal environment.
In this case, the model detected seven cars, two trucks, one person, and one bicycle in front of the user, along with their accuracies.
Figure 11: Playing the audio output.
Figure 11 shows the accuracy of the objects available in the loaded image. By running the "play audio" module, visually impaired people can listen to the types of objects in the surrounding environment and their counts.