Group Number - 2 - MOVING OBJECT CLASSIFICATION USING YOLO Algorithm
Evaluation (Semester – 8)
Title -> Moving Object Classification Using Deep Learning
Team Members
1. Objective
2. Motivation
3. Literature Survey
4. Proposed Methodology
5. Result/Discussion
Objective
Efficient and accurate object detection in live video feeds for applications such as
surveillance, traffic monitoring, and robotics.
Note: We use the YOLOv3 model rather than YOLOv8 because, in our tests, YOLOv3
produced confidence scores between 0.6 and 0.9, whereas YOLOv8 reached a maximum
confidence of only 0.27.
Literature Survey
Research Paper 1:
(A Lightweight Moving Vehicle Classification System Through Attention-Based Method and
Deep Learning)
https://ieeexplore.ieee.org/abstract/document/8886464/authors#authors
This paper applies convolutional neural networks (CNNs) to vehicle classification in videos for intelligent
transportation systems, proposing an attention-based approach to overcome challenges such as camera jitter and
bad weather.
The methodology has two parts: attention-based detection and fine-grained classification using YOLOv3. The authors
customize YOLOv3 with their own dataset of 49,652 annotated frames covering four vehicle classes, and address class
imbalance with image augmentation. Evaluated on challenging outdoor scenes from CDNET2014, the proposed method
outperforms existing techniques in specificity, false-positive rate, and precision.
Disadvantage: The paper lacks details on computational complexity and scalability.
In summary, the paper introduces a promising approach for vehicle classification in complex scenarios, showing
potential for real-world applications in intelligent transportation systems, though more insight into computational
efficiency and scalability would aid practical implementation.
Research Paper 2:
(Moving Vehicle Detection and Classification Using Gaussian Mixture Model and Ensemble Deep
Learning Technique)
https://www.hindawi.com/journals/wcmc/2021/5590894/
The research paper focuses on improving automatic vehicle classification in visual traffic surveillance systems,
particularly during lockdowns to curb COVID-19 spread. It proposes a new technique for classifying vehicle types,
addressing issues with imbalanced data and real-time performance.
The methodology collects data from the Beijing Institute of Technology Vehicle Dataset and the
MIOvision Traffic Camera Dataset. Adaptive histogram equalization and a Gaussian mixture model are
used for image enhancement and vehicle detection, respectively. Features are extracted with the Steerable
Pyramid Transform and the Weber Local Descriptor, followed by classification using ensemble deep learning.
The proposed technique achieves high classification accuracy of 99.13% and 99.28% on the MIOvision Traffic
Camera Dataset and the Beijing Institute of Technology Vehicle Dataset, respectively. These results outperform
existing benchmark techniques.
Disadvantage: The paper does not discuss the computational efficiency or scalability of
the proposed method.
Proposed Methodology
1) Importing Necessary Packages:
a) numpy: For numerical operations on arrays.
b) argparse: For parsing command line arguments.
c) imutils: Provides convenience functions to work with OpenCV.
d) time: For timing operations.
e) cv2: OpenCV library for computer vision tasks.
f) os: For interacting with the operating system.
2) Argument Parsing:
a) Using argparse, command-line arguments like input video path, confidence
threshold, and non-maximum suppression threshold are parsed.
3) Loading Model and Labels:
a) Loads the YOLO object detection model from disk using
cv2.dnn.readNetFromDarknet() function, which reads the model architecture from
a .cfg file and weights from a .weights file.
b) Loads the class labels from a text file.
4) Initializing Video Stream:
a) Determines whether the input is a video file or a live camera feed.
b) If the input is a video file, it retrieves frame dimensions and total frames.
5) Frame Processing in a Loop:
a) Processes each frame of the video stream.
b) Constructs a blob (resizes the frame, converts it to RGB format, normalizes pixel
values) from the input frame to feed it to the neural network.
c) Performs a forward pass through the network to get bounding boxes, confidences,
and class IDs of detected objects.
d) Filters out weak detections based on confidence threshold.
e) Applies non-maximum suppression to remove redundant bounding boxes.
f) Draws bounding boxes and labels on the frame.
g) Displays the processed frame.
6) Exiting the Program:
a) Listens for the 'q' key press to exit the video.
7) Cleanup:
a) Releases video stream and closes OpenCV windows.
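Steps 1)–3) above can be sketched as follows. The flag names, default thresholds, and the one-class-per-line label file format are illustrative assumptions based on the common YOLOv3 setup, not the exact project code:

```python
import argparse

def build_argparser():
    # Command-line interface for the options described in step 2;
    # flag names and defaults here are illustrative, not the originals.
    ap = argparse.ArgumentParser(description="YOLOv3 moving-object classifier")
    ap.add_argument("-i", "--input", default="",
                    help="path to input video (empty = live webcam feed)")
    ap.add_argument("-c", "--confidence", type=float, default=0.5,
                    help="minimum confidence to keep a detection")
    ap.add_argument("-t", "--threshold", type=float, default=0.3,
                    help="non-maximum suppression overlap threshold")
    return ap

def load_labels(path):
    # Step 3b: read one class name per line (e.g. a COCO-style names file),
    # dropping blank lines.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```

The model itself would then be loaded with `cv2.dnn.readNetFromDarknet(cfg_path, weights_path)` as described in step 3a.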
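The confidence filtering and non-maximum suppression of steps 5c)–5e) can be illustrated in plain NumPy. The detection-row layout assumed here is YOLOv3's `[cx, cy, w, h, objectness, class scores...]` with coordinates relative to the frame size; in the actual pipeline `cv2.dnn.NMSBoxes` plays the role of the `nms` helper below:

```python
import numpy as np

def filter_detections(layer_outputs, W, H, conf_thresh=0.5):
    # Step 5c/5d: keep detections whose best class score clears the
    # threshold, converting relative centre/size to pixel [x, y, w, h].
    boxes, confidences, class_ids = [], [], []
    for output in layer_outputs:
        for det in output:
            scores = det[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence > conf_thresh:
                cx, cy, w, h = det[:4] * np.array([W, H, W, H])
                boxes.append([int(cx - w / 2), int(cy - h / 2), int(w), int(h)])
                confidences.append(confidence)
                class_ids.append(class_id)
    return boxes, confidences, class_ids

def iou(a, b):
    # Intersection-over-union of two [x, y, w, h] boxes.
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(boxes, confidences, iou_thresh=0.3):
    # Step 5e: greedy non-maximum suppression — keep the highest-confidence
    # box, suppress any remaining box overlapping it too much, repeat.
    order = sorted(range(len(boxes)), key=lambda i: confidences[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```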
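Putting steps 4)–7) together, a minimal end-to-end loop might look like the sketch below. The file names `yolov3.cfg`, `yolov3.weights`, and `coco.names`, and the 416×416 blob size, are assumptions from the standard YOLOv3 distribution rather than the exact project files:

```python
def run(video_path="traffic.mp4", conf_thresh=0.5, nms_thresh=0.3):
    # Illustrative main loop (assumed file names, not the exact project code).
    import cv2  # imported inside run() so the sketch can be read without OpenCV
    import numpy as np

    labels = open("coco.names").read().strip().split("\n")
    colors = np.random.randint(0, 255, size=(len(labels), 3), dtype="uint8")
    net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
    out_layers = net.getUnconnectedOutLayersNames()

    # Step 4: a file path opens that video; an empty path opens the webcam.
    vs = cv2.VideoCapture(video_path if video_path else 0)
    while True:
        grabbed, frame = vs.read()
        if not grabbed:
            break
        H, W = frame.shape[:2]
        # Step 5b: resize to 416x416, scale pixels to [0, 1], swap BGR -> RGB.
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                     swapRB=True, crop=False)
        net.setInput(blob)
        outputs = net.forward(out_layers)  # step 5c: forward pass

        boxes, confidences, class_ids = [], [], []
        for output in outputs:
            for det in output:
                scores = det[5:]
                cid = int(np.argmax(scores))
                conf = float(scores[cid])
                if conf > conf_thresh:  # step 5d: drop weak detections
                    cx, cy, w, h = det[:4] * np.array([W, H, W, H])
                    boxes.append([int(cx - w / 2), int(cy - h / 2),
                                  int(w), int(h)])
                    confidences.append(conf)
                    class_ids.append(cid)

        # Step 5e: suppress redundant boxes, then draw the survivors (5f/5g).
        idxs = cv2.dnn.NMSBoxes(boxes, confidences, conf_thresh, nms_thresh)
        for i in np.array(idxs).flatten():
            x, y, w, h = boxes[i]
            color = [int(c) for c in colors[class_ids[i]]]
            cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            text = f"{labels[class_ids[i]]}: {confidences[i]:.2f}"
            cv2.putText(frame, text, (x, y - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
        cv2.imshow("Frame", frame)

        # Step 6: press 'q' to exit.
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    # Step 7: release the stream and close all OpenCV windows.
    vs.release()
    cv2.destroyAllWindows()
```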
Result/Discussion:
[Demo videos: Vid 1 and Vid 2]
THANK YOU