Final Report

The document is a project report submitted by 4 students for their Bachelor of Engineering degree in Electronics and Communication Engineering. It discusses the development of a surveillance system using YOLO algorithm for multiple object tracking and trajectory prediction to control traffic congestion. The system detects and tracks multiple vehicles in video frames over time and predicts their future trajectories. It aims to address challenges like occlusions and scale changes. The report includes an introduction to MOT and trajectory prediction, literature review, conceptualization of YOLO algorithm, results and discussions. It provides insights into applications for autonomous driving and traffic management.

Uploaded by

Hemanth BS

VISVESVARAYA TECHNOLOGICAL UNIVERSITY

Jnana Sangama, Belagavi, Karnataka–590014

Project report
on
“SURVEILLANCE SYSTEM, MULTIPLE OBJECT
TRACKING AND PATH PREDICTION FOR
TRAFFIC CONGESTION CONTROL USING
YOLO ALGORITHM”
Submitted in partial fulfillment of the requirements for the award
of degree of
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by

HEMANTH B S 1BI19EC043
JAYANTH R 1BI19EC049
MOHAMMED TAHEER AHMED 1BI19EC078
RAKSHITH R J 1BI19EC111
Under the guidance of
Dr. S. L MUKTHI
Associate Professor
Dept. of ECE, BIT

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING


BANGALORE INSTITUTE OF TECHNOLOGY
K.R. Road, BANGALORE – 560004
2022-2023
BANGALORE INSTITUTE OF TECHNOLOGY
K.R. Road, V.V. Puram, Bangalore - 560004
www.bit-bangalore.edu.in

Department of Electronics and Communication Engineering

CERTIFICATE

Certified that the project work entitled “Surveillance System, Multiple Object
Tracking and Path Prediction for Traffic Congestion Control Using YOLO
Algorithm” was carried out by Mr. HEMANTH B S (1BI19EC043), Mr. JAYANTH R
(1BI19EC049), Mr. MOHAMMED TAHEER AHMED (1BI19EC078) and Mr.
RAKSHITH R J (1BI19EC111), bonafide students of Bangalore Institute of
Technology, in partial fulfillment for the award of Bachelor of Engineering / Bachelor
of Technology in Electronics and Communication Engineering of the Visvesvaraya
Technological University, Belgaum during the year 2022-23. It is certified that all
corrections/suggestions indicated for the Internal Assessment have been incorporated in
the Report deposited in the departmental library. The Project report has been approved
as it satisfies the academic requirements in respect of Project work prescribed for the
above said Degree.

Dr. Mukthi S.L, Associate Professor, Dept. of ECE, BIT
Dr. Hemanth Kumar A R, Professor & HOD, Dept. of ECE, BIT
Dr. Aswath M U, Principal, BIT

External Viva

Name of examiners Signature with date

1.

2.
ACKNOWLEDGEMENT

We would like to take this opportunity to thank all those who have been involved
directly or indirectly in the completion of our project. We would like to express our
sincere thanks to our internal guide Dr. Mukthi S.L, Associate Professor, Department
of Electronics and Communication Engineering, for providing constant support and
guidance. We also thank Dr. Byrareddy C R, Dr. Mukthi S.L, Mr. Gahan A. V and
Mrs. Hithaishi P, Project Coordinators, Department of Electronics and Communication
Engineering, for coordinating the project and extending their support and guidance so
that it could be completed in accordance with the guidelines. We express our sincere
regards and thanks to Dr. HEMANTH KUMAR A.R, Professor and HOD, Electronics &
Communication Engineering. We immensely thank Dr. Aswath M. U, Principal, BIT
Bangalore, for providing an excellent academic environment in the college.

We would also like to thank all the teaching and non-teaching staff of the Department of
Electronics and Communication and our parents for their support and cooperation
throughout the completion of the project. It is our privilege to express our heartfelt
gratitude and respect to the Bangalore Institute of Technology, which has given us an
opportunity to present this project report.

HEMANTH B S (1BI19EC043)
JAYANTH R (1BI19EC049)
MOHAMMED TAHEER AHMED (1BI19EC078)
RAKSHITH RJ (1BI19EC111)
ABSTRACT
Multiple Object Tracking (MOT) and Trajectory Prediction are two important tasks in
computer vision and robotics that enable machines to perceive and interact with their
environment. MOT involves detecting and tracking multiple objects in a video or
image sequence over time, while trajectory prediction aims to forecast the future
positions and movements of these objects. In recent years, deep learning techniques
have shown remarkable progress in addressing these challenges. One popular
approach is to use deep neural networks to extract features from raw sensor data and
predict object trajectories based on learned representations of object motion patterns.
This project provides a comprehensive review of the state-of-the-art methods for MOT
and trajectory prediction, including both classical and deep learning-based approaches.
This project discusses the key challenges involved in these tasks, such as occlusions,
scale changes, and motion blur, and how different methods address them. The project
also highlights the datasets and evaluation metrics commonly used in the field and
provides insights into future research directions. Overall, this project provides a
detailed understanding of the advances and limitations of MOT and trajectory
prediction methods, and highlights the potential impact of these technologies on
various applications such as autonomous driving, surveillance, and robotics.
TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION
1.1 Motivation For MOT And Trajectory Prediction
1.2 About MOT And Trajectory Prediction

CHAPTER 2: LITERATURE SURVEY

CHAPTER 3: CONCEPTUALISATION OF YOLO ALGORITHM
3.1 Software requirements for MOT and trajectory prediction
3.2 Implementation

CHAPTER 4: RESULTS AND DISCUSSIONS

CONCLUSION AND FUTURE SCOPE

REFERENCES

Appendix A: Source code
LIST OF FIGURES

Figure 1.1: Object Detection
Figure 1.2.1: System flow of object detection
Figure 1.2.2: Trajectory Prediction
Figure 1.2.3: Traffic congestion
Figure 1.2.4: Traffic prediction and forecasting
Figure 1.2.5: Vehicle tracking
Figure 3.1: Key stages in object detection
Figure 3.1.1: Working of YOLO algorithm
Figure 3.1.2: Object detection using YOLO algorithm
Figure 3.1.3: Flowchart of YOLO algorithm
Figure 3.1.4: OpenCV
Figure 3.1.5: OpenCV example
Figure 4.1: Vehicle tracking
Figure 4.2: Green blue polygons
Figure 4.3: Multiple object tracking
Figure 4.4: Dashboard
Figure 4.5: Processed video
Figure 4.6: Trajectory prediction


BANGALORE INSTITUTE OF TECHNOLOGY

VISION

To establish and develop the Institute as a center of higher learning, ever abreast
with expanding horizon of knowledge in the field of engineering and technology,
with entrepreneurial thinking, leadership excellence for life-long success and solve
societal problem.

MISSION
• Provide high quality education in the engineering disciplines from the
undergraduate through doctoral levels with creative academic and
professional programs.
• Develop the Institute as a leader in Science, Engineering,
Technology and management, Research and apply knowledge
for the benefit of society.
• Establish mutual beneficial partnerships with industry, alumni, local, state
and central governments by public service assistance and collaborative
research.
• Inculcate personality development through sports, cultural and
extracurricular activities and engage in the social, economic and
professional challenges.

LONG TERM GOALS

• To be among top 3 private engineering colleges in Karnataka and top 20 in India.
• To be the most preferred choice of students and faculty.
• To be the preferred partner of corporate.
• To provide knowledge through education and research in engineering.
• To develop in each student mastery of fundamentals, versatility of mind,
motivation for learning, intellectual discipline and self-reliance which
provide the best foundation for continuing professional achievement.
• To provide a liberal as well as a professional education so that each
student acquires a respect for moral values, a sense of their duties as a
citizen, a feeling for taste and style, and a better human understanding.
DEPARTMENT OF ELECTRONICS AND COMMUNICATION

VISION

Imparting Quality Education to achieve Academic Excellence in Electronics
and Communication Engineering for Global Competent Engineers.

MISSION

• Create state of art infrastructure for quality education.
• Nurture innovative concepts and problem solving skills.
• Delivering Professional Engineers to meet the societal needs.

PROGRAM EDUCATIONAL OBJECTIVES

• Prepare graduates to be professionals, Practicing engineers and
entrepreneurs in the field of Electronics and communication.
• To acquire sufficient knowledge base for innovative techniques in
design and development of systems.
• Capable of competing globally in multidisciplinary field.
• Achieve personal and professional success with awareness and
commitment to ethical and social responsibilities as an individual as
well as a team.
• Graduates will maintain and improve technical competence through
continuous learning process.

PROGRAM SPECIFIC OUTCOMES

PSO1: Core Engineering: The graduates will be able to apply the principles of
Electronics and Communication in core areas.

PSO2: Soft Skills: An ability to use latest hardware and software tools in Electronics and
Communication engineering.

PSO3: Successful Career: Preparing Graduates to satisfy industrial needs and
pursue higher studies with social-awareness and universal moral values.
MULTIPLE OBJECT TRACKING AND PATH PREDICTION FOR TRAFFIC CONGESTION CONTROL 2022-23

CHAPTER 1
INTRODUCTION

Multiple Object Tracking (MOT) and Trajectory Prediction are two important tasks in
computer vision and robotics that have gained significant attention in recent years. The ability
to detect and track multiple objects over time and predict their future positions and
movements is critical in many applications, including autonomous driving, surveillance,
robotics, and human-computer interaction.

MOT involves detecting and tracking multiple objects in a video or image sequence over time.
The goal is to associate object detections across frames to form object tracks, despite
variations in appearance, motion, and occlusions. The task is challenging due to the large
number of objects, their complex motion patterns, and the presence of occlusions and
interactions between objects.

Trajectory prediction aims to forecast the future positions and movements of objects based on
their past trajectory and motion patterns. The task is crucial for planning and decision-making
in various applications, such as autonomous driving and robotics. However, it is also
challenging due to the uncertainty and variability in the motion of objects and the presence of
obstacles and other environmental factors.

In recent years, deep learning techniques have shown remarkable progress in addressing these
challenges. One popular approach is to use deep neural networks to extract features from raw
sensor data and predict object trajectories based on learned representations of object motion
patterns. This has led to significant improvements in the accuracy and robustness of MOT and
trajectory prediction methods. However, there are still several challenges that need to be
addressed, such as handling occlusions and interactions between objects, dealing with scale
changes and motion blur, and ensuring real-time performance. Moreover, the development of
reliable evaluation metrics and benchmark datasets is essential for comparing and evaluating
different methods.

In this context, this project provides a comprehensive review of the state-of-the-art methods
for MOT and trajectory prediction, including both classical and deep learning-based
approaches. We discuss the key challenges involved in these tasks, the methods used to address
them, and the datasets and evaluation metrics commonly used in the field. We also provide
insights into future research directions and highlight the potential impact of these
technologies on various applications.

Department of Electronics and Communication Engineering ,BIT Page 1



In summary, multiple object tracking and trajectory prediction are important tasks in
computer vision and robotics that involve detecting and tracking multiple objects in a video or
image sequence over time, and predicting their future positions and movements. These tasks
are challenging due to various factors such as occlusions, object interactions, and
environmental factors, but deep learning techniques have shown remarkable progress in
addressing these challenges. The applications of MOT and trajectory prediction are numerous
and have the potential to revolutionize various fields, such as autonomous driving and robotics.

Figure 1.1: Object Detection

Traditional object detection (Fig 1.1) methods relied on handcrafted features and classifiers,
but with recent advancements in deep learning, especially convolutional neural networks
(CNNs), the field has undergone a significant transformation. Modern object detection
approaches predominantly employ deep learning techniques, such as region-based
convolutional neural networks (R-CNN), You Only Look Once (YOLO) [3], and
Single Shot MultiBox Detector (SSD), among others.

These deep learning-based approaches typically follow a two-stage or a single-stage
framework. In the two-stage framework, the algorithm first generates a set of region proposals
likely to contain objects, followed by the classification and refinement of these proposals. On
the other hand, single-stage frameworks directly predict the bounding boxes and class labels
without explicit region proposal generation.


Multiple object detection involves several key components, including:

1. Input Data: Images or videos serve as input data for the detection algorithms. The
algorithms analyze the visual information contained in these inputs.

2. Preprocessing: Prior to feeding the data into the detection model, preprocessing steps
such as resizing, normalization, and augmentation might be performed to improve the
detection accuracy.

3. Feature Extraction: Deep learning-based models extract hierarchical features from the
input data using convolutional layers. These features capture discriminative patterns
and semantic information necessary for object detection.

4. Object Proposal Generation: In two-stage approaches, a set of candidate regions or
proposals likely to contain objects are generated. These proposals aim to reduce the
search space for object detection and classification.

5. Object Classification: The detected regions or proposals are classified into various
object classes using classification networks. This step determines the type of object
present in each region.

6. Bounding Box Regression: After classifying the proposals, refinement is performed to
precisely localize the objects within the bounding boxes. Regression models adjust the
coordinates of the bounding boxes based on the predicted offsets.

7. Post-processing: The final step involves post-processing the output of the detection
model, which typically includes non-maximum suppression (NMS) to eliminate
redundant detections and filtering out low-confidence detections.
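
The post-processing step can be illustrated with a minimal sketch. It is not the implementation used in this project: the function names, the `(x1, y1, x2, y2)` box format, and the threshold values are assumptions chosen for illustration, and the suppression is class-agnostic, whereas practical detectors usually apply NMS per class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(detections, conf_threshold=0.5, iou_threshold=0.45):
    """Drop low-confidence detections, then apply greedy NMS.

    detections: list of (box, score) pairs, box = (x1, y1, x2, y2).
    Both thresholds are illustrative values, not tuned settings.
    """
    # Step 1: confidence filtering.
    candidates = [d for d in detections if d[1] >= conf_threshold]
    # Step 2: visit boxes from most to least confident.
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 50, 50), 0.90),
    ((12, 12, 52, 52), 0.80),      # near-duplicate of the first box
    ((100, 100, 140, 140), 0.70),
    ((200, 200, 220, 220), 0.20),  # below the confidence threshold
]
result = postprocess(detections)
```

In this example the second box is suppressed as a duplicate of the first and the fourth is removed by the confidence filter, so two detections remain.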

Multiple object detection techniques have advanced significantly in recent years, driven by
the availability of large-scale annotated datasets and the increasing computational power of
GPUs. These advancements have enabled more accurate and efficient detection of objects in
real-world scenarios, contributing to numerous practical applications across various
industries.


1.1 MOTIVATION FOR MOT AND TRAJECTORY PREDICTION

The motivation to implement multiple object tracking (MOT) and trajectory prediction stems
from their numerous applications in various fields, such as surveillance, autonomous driving,
robotics, and human-computer interaction. These tasks are critical for decision-making and
planning, and can help improve safety, efficiency, and effectiveness in various applications.

Surveillance - In surveillance, MOT can help detect and track multiple objects in real-time
video streams, such as monitoring the movement of people and vehicles in a public space. This
can be useful for detecting potential security threats or identifying criminal activities.
Similarly, trajectory prediction can help forecast the future movement of objects, such as
predicting the trajectory of a person walking in a crowded area, which can aid in crowd
management and planning.

Autonomous Driving - In autonomous driving, MOT and trajectory prediction are crucial for
detecting and tracking other vehicles, pedestrians, and obstacles in the environment. This can
help ensure the safety of passengers and other road users, as well as improve the efficiency of
the driving system. Moreover, predicting the future positions and movements of objects can
aid in planning and decision-making, such as predicting the future path of a pedestrian
crossing the road.

Robotics - In robotics, MOT and trajectory prediction can aid in object detection and
tracking, and can help robots navigate in dynamic and uncertain environments. For example,
robots can use MOT to detect and track objects in their workspace, such as parts on a
production line, and trajectory prediction can help forecast the future movement of these
objects, aiding in planning and decision-making.

Human-Computer Interaction - In human-computer interaction, MOT and trajectory
prediction can aid in tracking human movements and gestures, such as tracking hand
movements in sign language recognition or tracking facial expressions in emotion recognition.

In summary, the motivation to implement MOT and trajectory prediction arises from their
numerous applications in various fields, such as surveillance, autonomous driving, robotics,
and human-computer interaction. These tasks are crucial for decision-making and planning
and can help improve safety, efficiency, and effectiveness in various applications.


1.2 ABOUT MOT, TRAJECTORY PREDICTION AND TRAFFIC CONGESTION

MOT (Multiple Object Tracking) and trajectory prediction are closely related tasks in
computer vision that deal with estimating the motion paths of objects over time in videos or
sequences of images. These techniques have gained significant attention due to their
applications in surveillance, autonomous driving, behavior analysis, and video understanding.

Figure 1.2.1: System flow of Object Detection

Multiple Object Tracking (MOT):

MOT (Fig 1.2.1) involves identifying and tracking multiple objects across consecutive frames
of a video or image sequence. The goal is to maintain the identity of each object as it moves
throughout the scene. The typical MOT pipeline consists of the following steps:

1. Object Detection: In the initial frame or frames, objects of interest are detected using
object detection algorithms. This provides the starting point for tracking.

2. Object Association: The detected objects are associated across frames to establish
correspondences and track their identities. Various techniques, such as data
association algorithms, Kalman filters, particle filters, or graph-based methods, can be
employed for this purpose.

3. Motion Estimation: Once the associations are established, the motion of each object
is estimated by analyzing its position changes between frames. This can involve
techniques like optical flow estimation or using motion models like constant velocity
or constant acceleration.

4. State Update: The estimated object states, including position, velocity, and other
relevant attributes, are updated based on the motion estimation and association results.

5. Track Maintenance: The ongoing tracking process involves continuously updating
and maintaining the tracks as new frames are processed. This includes handling
challenges like occlusions, object appearances, and track fragmentation or merging.

MOT aims to provide accurate and consistent object tracks throughout the video, enabling
applications such as object behavior analysis, anomaly detection, and activity recognition.
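
The association and track maintenance steps above can be sketched as follows. This is a simplified illustration, not the tracker used in this project: the class name `IouTracker`, the greedy IoU matching rule, and the `iou_threshold` and `max_missed` values are assumptions. Practical trackers typically combine such matching with a motion model such as a Kalman filter.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


class IouTracker:
    """Toy tracker: greedy IoU matching of detections to existing tracks."""

    def __init__(self, iou_threshold=0.3, max_missed=5):
        self.iou_threshold = iou_threshold  # minimum overlap for a match
        self.max_missed = max_missed        # frames a track may go unseen
        self.tracks = {}                    # id -> {"box": ..., "missed": n}
        self._ids = count(1)

    def update(self, boxes):
        # Object association: score all (track, detection) pairs, best first.
        pairs = sorted(
            ((iou(t["box"], b), tid, j)
             for tid, t in self.tracks.items()
             for j, b in enumerate(boxes)),
            reverse=True,
        )
        matched_tracks, matched_dets = set(), set()
        for score, tid, j in pairs:
            if score < self.iou_threshold:
                break
            if tid in matched_tracks or j in matched_dets:
                continue
            # State update: the matched track takes the new box.
            self.tracks[tid] = {"box": boxes[j], "missed": 0}
            matched_tracks.add(tid)
            matched_dets.add(j)
        # Track maintenance: age unmatched tracks, drop stale ones.
        for tid in list(self.tracks):
            if tid not in matched_tracks:
                self.tracks[tid]["missed"] += 1
                if self.tracks[tid]["missed"] > self.max_missed:
                    del self.tracks[tid]
        # Unmatched detections start new tracks.
        for j, b in enumerate(boxes):
            if j not in matched_dets:
                self.tracks[next(self._ids)] = {"box": b, "missed": 0}
        # Report only the tracks seen in this frame.
        return {tid: t["box"] for tid, t in self.tracks.items()
                if t["missed"] == 0}


tracker = IouTracker()
frame1 = tracker.update([(10, 10, 50, 50), (100, 100, 140, 140)])
# Same two objects, slightly moved and listed in the opposite order.
frame2 = tracker.update([(104, 102, 144, 142), (13, 11, 53, 51)])
# Both objects leave the view and a new one appears.
frame3 = tracker.update([(300, 300, 340, 340)])
```

In the second frame each object keeps the identity it was given in the first frame even though the detections arrive in a different order, which is the property the association step is meant to provide.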

Trajectory Prediction:

Trajectory prediction (Fig 1.2.2) focuses on forecasting the future motion paths of objects
based on their past observed behavior. It aims to estimate the positions and movements of
objects beyond the current frame, facilitating anticipation and proactive decision-making in
various applications like autonomous navigation, human-computer interaction, and pedestrian
safety.

Figure 1.2.2: Trajectory prediction

Trajectory prediction methods typically rely on historical motion patterns and contextual
information to make predictions. Some common approaches include:

1. Data-driven Methods: These methods learn motion patterns from large-scale datasets
and use machine learning algorithms, such as recurrent neural networks (RNNs), long
short-term memory (LSTM) networks, or graph neural networks (GNNs), to predict
future trajectories based on past observations.

2. Physics-based Methods: These methods incorporate physical principles and
dynamics models to simulate the future motion of objects. Examples include Kalman
filters, particle filters, and models based on motion equations, such as constant
velocity or acceleration.

3. Interaction-aware Methods: These methods consider the interactions between
multiple objects and leverage social or crowd behavior patterns to predict trajectories.
They capture the influence of other objects in the scene on an individual object's
motion.

Trajectory prediction [1] can be performed at different time horizons, ranging from short-term
predictions (a few frames ahead) to long-term predictions (several seconds or more into the
future). Both MOT and trajectory prediction are active research areas in computer vision and
have numerous practical applications in surveillance, autonomous systems, robotics, and
human-computer interaction. They are essential for understanding and interpreting object
behaviors and enabling intelligent decision-making in dynamic environments.
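
As a simple illustration of a physics-based method and of the prediction horizon, the sketch below extrapolates a track under a constant-velocity assumption. The function name, the use of box centre points in pixel coordinates, and the averaging of velocity over the whole observed window are assumptions made for this example. It is not the prediction method evaluated in this project.

```python
def predict_constant_velocity(history, horizon):
    """Extrapolate future (x, y) positions assuming constant velocity.

    history: past centre points, one per frame, oldest first.
    horizon: number of future frames to predict.
    """
    if len(history) < 2:
        raise ValueError("need at least two past positions")
    steps = len(history) - 1
    # Average per-frame displacement over the observed window.
    vx = (history[-1][0] - history[0][0]) / steps
    vy = (history[-1][1] - history[0][1]) / steps
    x, y = history[-1]
    # Position k frames ahead: last position plus k times the velocity.
    return [(x + vx * k, y + vy * k) for k in range(1, horizon + 1)]


# A vehicle moving 4 px right and 2 px up per frame.
track = [(100.0, 200.0), (104.0, 198.0), (108.0, 196.0), (112.0, 194.0)]
future = predict_constant_velocity(track, horizon=3)
# future == [(116.0, 192.0), (120.0, 190.0), (124.0, 188.0)]
```

A short horizon keeps the constant-velocity assumption reasonable. Over longer horizons turns, braking, and interactions with other objects make the error grow, which is where data-driven and interaction-aware methods are expected to help.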

Traffic congestion:

Traffic congestion (Fig 1.2.3) is a condition in transport that is characterized by slower
speeds, longer trip times, and increased vehicular queueing. Traffic congestion on urban road
networks has increased substantially. When traffic demand is great enough that the interaction
between vehicles slows the speed of the traffic stream, this results in some congestion.

There are several main causes of traffic congestion in roadways:

Figure 1.2.3: Traffic congestion

• High volume of vehicles: The sheer number of vehicles on the road exceeding the
capacity of the roadway can lead to congestion. When there are more vehicles than the
road can handle, traffic slows down or comes to a standstill, causing congestion.

• Bottlenecks and chokepoints: Bottlenecks occur at locations where the road capacity
decreases, such as narrow lanes, merging lanes, or on-ramps and off-ramps.
Chokepoints can also occur at intersections or junctions with heavy traffic, where
multiple streams of vehicles converge, causing congestion.

• Traffic incidents and accidents: Traffic incidents, including accidents, breakdowns,
or vehicle malfunctions, disrupt the flow of traffic and can lead to congestion. When
lanes are blocked or vehicles are obstructing the road, it causes delays and congestion
as vehicles navigate around the incident.

• Inadequate infrastructure: Insufficient roadway infrastructure, including poorly
designed roads, inadequate lane capacity, lack of turning lanes, or poorly synchronized
traffic signals, can contribute to congestion. Inefficient infrastructure limits the road's
ability to handle traffic flow smoothly.

• Traffic management and control: Ineffective traffic management and control, such
as improper signaling, inadequate signage, or lack of traffic enforcement, can result in
congestion. Inconsistent or improper traffic control measures can lead to confusion
and disruption in traffic flow.

It's important to note that the causes and extent of traffic congestion can vary depending on
the location, time of day, and other specific factors. Effective traffic management strategies,
improved infrastructure, and promoting alternative transportation options can help alleviate
traffic congestion and improve overall traffic flow.

Image processing and machine learning techniques can also be used to reduce traffic
congestion. Here are some ways in which machine learning can be applied:

• Traffic prediction and forecasting: Machine learning algorithms can analyze
historical traffic data, weather conditions, and other relevant factors to predict traffic
patterns and congestion (Fig 1.2.4). These predictions can help drivers, traffic
management systems, and transportation authorities make informed decisions in
real time, such as adjusting signal timings, suggesting alternate routes, or providing
advance warnings to drivers.

• Intelligent traffic signal control: Machine learning algorithms can optimize traffic
signal timings based on real-time traffic conditions. By analyzing traffic flow data
from sensors or cameras, machine learning models can dynamically adjust signal
timings to reduce congestion and improve traffic flow at intersections.

Figure 1.2.4: Traffic prediction and forecasting

• Route optimization and navigation: Machine learning algorithms can analyze
real-time traffic data, historical patterns, and user preferences to suggest optimal routes for
drivers. By considering current traffic conditions and predicting congestion, machine
learning-based navigation systems can guide drivers to less congested routes, thereby
distributing traffic and reducing congestion on popular routes.

• Adaptive traffic management systems: Machine learning can be used to develop
adaptive traffic management systems that learn from real-time data and optimize
traffic flow. These systems can automatically adjust traffic control measures, such as
lane assignments, signal timings, or ramp metering, based on current traffic conditions
to mitigate congestion.

• Incident detection and response: Machine learning models can analyze traffic data,
video feeds, and sensor data to detect incidents such as accidents, breakdowns, or road
hazards. By detecting incidents early, authorities can respond quickly, divert traffic,
and minimize the impact on overall traffic flow, reducing congestion.

• Intelligent parking management: Machine learning algorithms can be used to
optimize parking management by analyzing parking demand, availability, and pricing
data. By providing real-time information on parking availability and suggesting
parking options to drivers, machine learning-based systems can reduce the time spent
searching for parking.

• Demand management and prediction: Machine learning models can analyze
historical data and user behavior to predict future demand for transportation services.
By anticipating peak periods and high-demand areas, transportation authorities can
proactively implement strategies such as adjusting public transportation schedules,
increasing capacity, or incentivizing alternative modes of transportation to alleviate
congestion.

It's important to note that the successful implementation of machine learning techniques for
traffic congestion prevention relies on accurate and timely data collection, reliable
communication infrastructure, and effective integration with existing transportation systems.
Additionally, machine learning models need to be continuously updated and improved as
traffic patterns evolve.
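
To make the idea of adjusting signal timings from real-time traffic data concrete, the sketch below splits a fixed signal cycle across the approaches of an intersection in proportion to the number of vehicles counted on each. It is a rule-based heuristic rather than a learned model, and it is not part of the system built in this project: the function name, the 120-second cycle, the 10-second minimum green time, and the per-approach counts are all hypothetical. In a camera-based system the counts could come from the tracked vehicles in each approach.

```python
def allocate_green_times(vehicle_counts, cycle_seconds=120, min_green=10):
    """Split a signal cycle across approaches in proportion to demand.

    vehicle_counts: dict mapping approach name -> vehicles counted.
    Returns a dict mapping approach name -> green time in seconds.
    """
    approaches = list(vehicle_counts)
    reserved = min_green * len(approaches)
    if reserved > cycle_seconds:
        raise ValueError("cycle too short for the minimum green times")
    spare = cycle_seconds - reserved  # time left to distribute by demand
    total = sum(vehicle_counts.values())
    plan = {}
    for name in approaches:
        # With no traffic at all, share the spare time equally.
        share = vehicle_counts[name] / total if total else 1 / len(approaches)
        plan[name] = min_green + spare * share
    return plan


counts = {"north": 30, "south": 10, "east": 15, "west": 5}
plan = allocate_green_times(counts)
# The busiest approach (north) receives the longest green phase,
# and the green times add up to the full cycle length.
```

The minimum green time keeps lightly used approaches from being starved, while the proportional share gives longer green phases to the approaches with more queued vehicles.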

Figure 1.2.5: Vehicle Tracking[4]

Contribution of the project to society:

The benefits of multiple object tracking (MOT) and trajectory prediction are numerous and
depend on the specific application. Here are some potential benefits of MOT and trajectory
prediction:

1. Enhanced safety: In applications such as autonomous driving and surveillance, MOT
and trajectory prediction can help detect and track objects (Fig 1.2.5), forecast their
future movements, and aid in decision-making and planning, leading to improved
safety for passengers, pedestrians, and other road users.

2. Increased efficiency: In applications such as production line management and crowd
management, MOT and trajectory prediction can aid in tracking objects and forecasting
their movements, leading to increased efficiency and reduced waiting times.

3. Improved decision-making: In applications such as robotics and human-computer
interaction, MOT and trajectory prediction can aid in tracking object movements and
human gestures, leading to improved decision-making and task execution.

4. Better resource allocation: In applications such as traffic management, MOT and
trajectory prediction can aid in predicting traffic patterns and optimizing resource
allocation, such as adjusting traffic signals or rerouting vehicles.

5. Enhanced situational awareness: In applications such as surveillance and robotics,
MOT and trajectory prediction can aid in detecting and tracking objects, leading to
enhanced situational awareness and improved response times.

Overall, the benefits of MOT and trajectory prediction depend on the specific application and
the extent to which these tasks are integrated into the system. However, in general, these tasks
can lead to improved safety, increased efficiency, better decision-making, and enhanced
situational awareness.


CHAPTER 2

LITERATURE SURVEY
[1] Young Yoon, Heesu Hwang, Yongjun Choi, Minbeom Joo, Hyeyoon Oh, et al., “Analyzing
Basketball Movements and Pass Relationships Using Realtime Object Tracking
Techniques Based on Deep Learning”, 2019, IEEE, Vol. 7
In this paper, we present techniques for automatically classifying players and tracking ball
movements in basketball game video clips under poor conditions, where the camera angle
dynamically shifts and changes. In the core of our system lies Yolo, a real time object detection
system. Given the ground truth boxes collected by our data specialists, Yolo is trained to detect
the presence of objects in every video frame. In addition, Yolo uses Darknet that implements
convolution neural networks to classify a detected object to a player and to recognize its jersey
numbers of specific movements. By identifying players and ball possessions, we can
automatically compute ball distributions that are reflected on complex networks. With original
Yolo system, player movement can be interrupted, when the players move out of the frame
due to camera shift and when players overlap each other on a two-dimensional frame. We have
adapted Yolo to keep track of players even under such poor conditions by considering
contextual information available from the frames preceding and/or succeeding
problematic video frames. In addition to the novel movement inference method, we provide a
framework for analyzing the pass networks in various perspectives to help the managing staff
to reveal critical determinants of team performance and to design better game strategies. We
assess the performance of our system in terms of accuracy by making a comparison with the
analytical reports generated by human experts.
[2] Mate Krišto, Marina Ivasic-Kos, and Miran Pobar, “Thermal Object Detection in
Difficult Weather Conditions Using YOLO”, 2020, IEEE, Vol. 8
Global terrorist threats and illegal migration have intensified concerns for the security of
citizens, and every effort is made to exploit all available technological advances to prevent
adverse events and protect people and their property. Due to the ability to use at night and in
weather conditions where RGB cameras do not perform well, thermal cameras have become
an important component of sophisticated video surveillance systems. In this paper, we
investigate the task of automatic person
detection in thermal images using convolutional neural network models originally intended
for detection in RGB images. We compare the performance of the standard state-of-the-art
object detectors such as Faster R-CNN, SSD, Cascade R-CNN, and YOLOv3, that were
retrained on a dataset of thermal images extracted from videos that simulate illegal
movements around the border and in protected areas. Videos are recorded at night in clear
weather, rain, and in the fog, at different ranges, and with different movement types. YOLOv3
was significantly faster than other detectors while achieving performance comparable with the best. We experimented
with different training dataset settings in order to determine the minimum number of images
needed to achieve good detection results on test datasets. We achieved excellent detection
results with respect to average accuracy for all test scenarios although a modest set of thermal
images was used for training. We test our trained model on different well known and widely
used thermal imaging datasets as well. In addition, we present the results of the recognition of
humans and animals in thermal images, which is particularly important in the case of sneaking
around objects and illegal border crossings. Also, we present our original thermal dataset used
for experimentation that contains surveillance videos recorded under different weather and
shooting conditions.

[3] Daniel S. Kaputa and Brian P. Landy, “YOLBO: You Only Look Back Once–A
Low Latency Object Tracker Based on YOLO and Optical Flow”, 2020, IEEE, Vol. 9
One common computer vision task is to track an object as it moves from frame to frame
within a video sequence. There are a myriad of applications for such capability and the
underlying technologies to achieve this tracking are very well understood. More recently,
deep convolutional neural networks have been employed to not only track, but also to classify
objects as they are tracked from frame to frame. These models can be used in a tracking
paradigm known as tracking by detection and can achieve very high tracking accuracy. The
major drawback to these deep neural networks is the large amount of mathematical operations
that must be performed for each inference which negatively impacts the number of tracked
frames per second. For edge applications residing on size, weight, and power limited
platforms, such as unmanned aerial vehicles, high frame rate and low latency real time tracking
can be an elusive target. To overcome the limited power and computational resources of an
edge compute device, various optimizations have been performed to trade off tracking speed,
accuracy, power, and latency. Previous works on motion based interpolation with neural
networks either do not take into account the latency accrued from camera image capture to
tracking result or they compensate for this latency but are bottlenecked by the motion
interpolation operation instead.
[4] Wei Fang, Wang Lin, Ren Peiming, “Tinier-YOLO: A Real-Time Object Detection
Method for Constrained Environments”, 2020, IEEE, Vol. 8
Deep neural networks (DNNs) have shown prominent performance in the field of object
detection. However, DNNs usually run on powerful devices with high computational ability
and sufficient memory, which have greatly limited their deployment for constrained
environments such as embedded devices. YOLO is one of the state-of-the-art DNN-based
object detection approaches with good performance both on speed and accuracy and Tiny-
YOLO-V3 is its latest variant with a small model that can run on embedded devices. In this
paper, Tinier-YOLO, which originates from Tiny-YOLO-V3, is proposed to further shrink
the model size while achieving improved detection accuracy and real-time performance.

In Tinier-YOLO, the fire module in SqueezeNet is appointed by investigating the number of
fire modules as well as their positions in the model in order to reduce the number of model
parameters and then reduce the model size. For further improving the proposed Tinier-YOLO
in terms of detection accuracy and real-time performance, the connectivity style between fire
modules in Tinier-YOLO differs from SqueezeNet in that dense connection is introduced and
fine designed to strengthen the feature propagation and ensure the maximum information flow
in the network. The object detection performance is enhanced in Tinier-YOLO by using the
passthrough layer that merges feature maps from the front layers to get fine-grained features,
which can counter the negative effect of reducing the model size. The resulting Tinier-YOLO
yields a model size of 8.9MB (almost 4× smaller than Tiny-YOLO-V3) while achieving 25
FPS real-time performance on Jetson TX1 and an mAP of 65.7% on PASCAL VOC and
34.0% on COCO. Tinier-YOLO also possesses comparable results in mAP and faster runtime
speed with smaller model size and BFLOP/s value compared with other lightweight models
like SqueezeNet SSD and MobileNet SSD.
[5] Yongjun Li, Li Shasha, Du Haohao, et al., “YOLO-ACN: Focusing on small target
and occluded object detection”, 2020, IEEE, Vol. 20
To further improve the speed and accuracy of object detection, especially small targets and
occluded objects, a novel and efficient detector named YOLO-ACN is presented. The detector
model is inspired by the high detection accuracy and speed of YOLOv3, and it is improved by
the addition of an attention mechanism, a CIoU (complete intersection over union) loss
function, Soft-NMS (non-maximum suppression), and depthwise separable convolution.
First, the attention mechanism is introduced in the channel and spatial dimensions in each
residual block to focus on small targets. Second, CIoU loss is adopted to achieve accurate
bounding box (BBox) regression. Besides, to filter out a more accurate BBox and avoid
deleting occluded objects in dense images, the CIoU is applied in the Soft-NMS, and the
Gaussian model in the Soft-NMS is employed to suppress the surrounding BBox. Third, to
significantly reduce the parameters and improve the detection speed, standard convolution is
replaced by depthwise separable convolution, and hard-swish activation function is utilized in
deeper layers. On the MS COCO dataset and infrared pedestrian dataset KAIST, the
quantitative experimental results show that compared with other state-of-the-art models, the
proposed YOLO-ACN has high accuracy and speed in detecting small targets and occluded
objects. YOLO-ACN reaches a mAP50 (mean average precision) of 53.8% and an APs
(average precision for small objects) of 18.2% at a real-time speed of 22 ms on the MS COCO
dataset, and the mAP for a single class on the KAIST dataset even reaches over 80% on an
NVIDIA Tesla K40.

[6] Zhuang-Zhuang Wang, Kai Xie, Xin-Yu Zhang, Hua-Quan Chen, et al., “Small-
Object Detection Based on YOLO and Dense Block via Image Super-Resolution”, 2021, IEEE,
Vol. 9
Small-object detection is a basic and challenging problem in computer vision tasks. It is
widely used in pedestrian detection, traffic sign detection, and other fields. This paper proposes
a deep learning small-object detection method based on image super-resolution to improve the
speed and accuracy of small-object detection. First, we add a feature texture transfer (FTT)
module at the input end to improve the image resolution at this end as well as to remove the
noise in the image. Then, in the backbone network, using the Darknet53 framework, we use
dense blocks to replace residual blocks to reduce the number of network structure parameters
to avoid unnecessary calculations. Then, to make full use of the features of small targets in the
image, the neck uses a combination of SPPnet and PANnet to complete this part of the multi-
scale feature fusion work. Finally, the problem of image background and foreground imbalance
is solved by adding the foreground and background balance loss function to the YOLOv4 loss
function part. The results of the experiment conducted using our self-built dataset show that
the proposed method has higher accuracy and speed compared with the currently available
small-target detection methods.
[7] Xuejun Chen, Da Li, Qixiang Zou, “Exploiting Acceleration of the Target for Visual
Object Tracking”, 2021, IEEE, Vol. 9
Discriminative Correlation Filters (DCF) based trackers have achieved remarkable performance
in visual object tracking in recent years. The trackers represent the target with hand-crafted
features or deep features. To reduce the computational cost and irrelevant information,
such trackers choose the region of interest (ROI) as a search window to search the target
rather than search exhaustively in each frame. The size of search window, which is fixed in
most trackers, can greatly affect the performance of tracker. In order to characterize the motion
state of the target, this paper presents the velocity and acceleration in the field of visual object
tracking. It is observed that the acceleration can change the search window dynamically and
achieve better performance. Experiments on the popular datasets OTB50 and OTB100
demonstrate that the DCF based trackers including the state-of-the-art (SOTA) trackers
improve their performance by exploiting acceleration.
[8] Qingze Yu, Bo Wang, Yumin Su, “Object Detection-Tracking Algorithm for
Unmanned Surface Vehicles Based on a Radar-Photoelectric System”, 2021, IEEE, Vol. 9
Object tracking is an important basis for the autonomous navigation of unmanned surface
vehicles. However, several problems still must be addressed for a wide application of object
tracking in unmanned surface vehicles. First, if multiple objects of the same classification
exist in the same field of view, then stable extraction of an object is difficult. Second, in an
environment with a complex background and large changes in object shape, the tracking
accuracy is low, and object tracking errors and tracking loss can easily occur. Third, much
time is required to detect a high-resolution real-time video stream, not meeting the delay
requirement of the photoelectric servo stable tracking. To resolve these problems, this paper
proposes an object detection-tracking algorithm based on a radar-photoelectric system. The
algorithm combines an object detection algorithm with an object tracking algorithm and
involves the following steps. First, a first-frame object extraction algorithm is used to extract
the tracking object from the first frame. Second, a region of interest (ROI)-prediction
algorithm is used to predict ROIs and detect objects in these ROIs. This algorithm can
effectively solve the above problems in marine tests. When multiple objects of the same
classification exist in the same field of view, the algorithm can extract the radar-guided object
stably. When faced with a complex background and a large change in object shape, the
algorithm substantially improves the accuracy and robustness of object tracking. Compared
with the conventional object detection algorithm, the time consumption of this algorithm is
reduced by 25.8%.


CHAPTER 3

CONCEPTUALISATION OF YOLO ALGORITHM

The primary objective of multiple object tracking (MOT) using the YOLO algorithm is to
detect and track multiple objects in real-time video streams. YOLO (You Only Look Once)
[3] is a popular object detection algorithm that can detect multiple objects in an image or video
frame and provide their locations and class labels with high accuracy and speed. The YOLO
algorithm achieves this by dividing the input image or video frame into a grid of cells (Fig
3.1) and predicting bounding boxes and class probabilities for each cell. The objective of
using the YOLO algorithm in MOT is to achieve accurate and efficient object detection and
tracking in real-time video streams. This has several applications, such as surveillance,
autonomous driving, and robotics, where reliable and efficient object tracking is crucial for
decision-making and planning.

Figure 3.1: Key stages in object detection [5]

3.1 SOFTWARE REQUIREMENTS FOR MOT AND TRAJECTORY PREDICTION
1. THE YOLO (You Only Look Once) ALGORITHM:

YOLO is an algorithm that uses neural networks to provide real-time object detection. This
algorithm is popular because of its speed and accuracy. It has been used in various applications
to detect traffic signals, people, parking meters, and animals. YOLO stands for
‘You Only Look Once’. The algorithm detects and recognizes various objects
in a picture in real time. Object detection in YOLO is framed as a regression problem that
provides the class probabilities of the detected objects. The YOLO algorithm employs
convolutional neural networks (CNN) to detect objects in real-time. As the name suggests, the
algorithm requires only a single forward propagation through a neural network to detect
objects. This means that prediction in the entire image is done in a single algorithm run. The
CNN is used to predict various class probabilities and bounding boxes simultaneously.

Figure 3.1.1: Working of YOLO Algorithm
Figure 3.1.2: Object Detection using YOLO Algorithm
YOLO (You Only Look Once) is a real-time object detection algorithm that can detect
multiple objects in an image or video frame simultaneously. It works by dividing the input
image into a grid of cells and predicting the bounding boxes, confidence scores, and class
probabilities for each cell. Here is how the YOLO algorithm works (Fig 3.1.1 & 3.1.2):
1. Input image: The first step is to input an image into the YOLO algorithm.
2. Grid cell division: YOLO divides the input image into a grid of cells. Each cell is
responsible for detecting objects that fall within it.
3. Bounding box prediction: For each cell, YOLO predicts one or more bounding boxes
that represent the locations of objects within the cell. Each bounding box is defined by
five parameters: the x and y coordinates of the center of the box, the width and height of
the box, and the confidence score of the box.
4. Class prediction: YOLO also predicts the class probabilities for each bounding box.
This means that for each box, YOLO predicts the probability that the object in the box
belongs to a certain class, such as "car", "person", "dog", etc.
5. Non-maximum suppression: YOLO uses non-maximum suppression to remove
redundant bounding boxes. This means that if multiple bounding boxes overlap each
other and detect the same object, only the box with the highest confidence score is kept.
6. Output: The final output of the YOLO algorithm is a set of bounding boxes that
represent the locations of objects in the input image, along with their class
probabilities.
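
The non-maximum suppression step listed above can be illustrated with a minimal sketch. It is not the project's implementation: the detection tuple layout, the function names, and the 0.5 overlap threshold are assumptions chosen for illustration. Boxes use the centre/width/height form described in step 3.

```python
from typing import List, Tuple

# Assumed layout: (x_center, y_center, width, height, confidence, class_name)
Detection = Tuple[float, float, float, float, float, str]


def to_corners(det: Detection) -> Tuple[float, float, float, float]:
    """Convert a centre/size box to (x1, y1, x2, y2) corners."""
    x, y, w, h = det[:4]
    return x - w / 2, y - h / 2, x + w / 2, y + h / 2


def iou(a: Detection, b: Detection) -> float:
    """Intersection over Union of two boxes."""
    ax1, ay1, ax2, ay2 = to_corners(a)
    bx1, by1, bx2, by2 = to_corners(b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(dets: List[Detection],
                        iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-confidence box among overlapping boxes of the same class."""
    kept: List[Detection] = []
    # Visit boxes from most to least confident.
    for cand in sorted(dets, key=lambda d: d[4], reverse=True):
        # A candidate survives if no already-kept box of the same class overlaps it too much.
        if all(cand[5] != k[5] or iou(cand, k) < iou_threshold for k in kept):
            kept.append(cand)
    return kept


detections = [
    (50, 50, 40, 40, 0.9, "car"),
    (52, 52, 40, 40, 0.6, "car"),      # near-duplicate of the first box
    (200, 200, 30, 30, 0.8, "person"),
]
result = non_max_suppression(detections)
```

In this example the second "car" box overlaps the first heavily and has a lower confidence score, so only two detections remain in `result`.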


The YOLO algorithm is known for its speed and accuracy, as it can process images in real-time
and achieve high accuracy in object detection. It is widely used in various applications, such
as autonomous driving, surveillance, and robotics. However, like all deep learning algorithms,
it requires a large amount of training data and computing power to achieve optimal
performance.

Figure 3.1.3: Flowchart of YOLO Algorithm

The YOLO (You Only Look Once) algorithm is a popular object detection algorithm that uses
a single convolutional neural network to split an image into grids, with each grid making
predictions on bounding boxes and confidence scores.
The algorithm takes an image as input and then uses a single deep convolutional neural
network to detect objects in the image.
The following is a brief flowchart (Fig 3.1.3) of the YOLO algorithm:
1. Input an image.
2. Divide the image into a grid of cells.
3. For each cell, predict the bounding boxes and confidence scores.
4. Apply non-max suppression to remove overlapping bounding boxes.
5. Output the final set of bounding boxes and their corresponding class probabilities.
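
As a rough illustration of step 2, the sketch below shows how an object centre can be assigned to a grid cell and expressed as an offset inside that cell. It assumes the convention of the original YOLO formulation, where the centre is stored relative to its cell. The 7 x 7 grid, the 448 x 448 image size, and all function names are assumptions made for this example, not values taken from this report.

```python
from typing import Tuple


def responsible_cell(cx: float, cy: float, img_w: float, img_h: float,
                     grid_size: int) -> Tuple[int, int]:
    """Return (row, col) of the grid cell that contains the box centre."""
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


def encode_center(cx: float, cy: float, img_w: float, img_h: float,
                  grid_size: int) -> Tuple[int, int, float, float]:
    """Express the centre as a cell index plus an offset in [0, 1) inside that cell."""
    row, col = responsible_cell(cx, cy, img_w, img_h, grid_size)
    x_off = cx / img_w * grid_size - col
    y_off = cy / img_h * grid_size - row
    return row, col, x_off, y_off


def decode_center(row: int, col: int, x_off: float, y_off: float,
                  img_w: float, img_h: float, grid_size: int) -> Tuple[float, float]:
    """Invert encode_center: recover the centre in pixel coordinates."""
    cx = (col + x_off) / grid_size * img_w
    cy = (row + y_off) / grid_size * img_h
    return cx, cy


# Hypothetical example: a 448x448 image divided into a 7x7 grid (64-pixel cells).
row, col, x_off, y_off = encode_center(100, 300, 448, 448, 7)
cx, cy = decode_center(row, col, x_off, y_off, 448, 448, 7)
```

Here the centre (100, 300) falls in row 4, column 1 of the grid, and decoding the cell index and offsets returns the original pixel coordinates.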
To improve the prediction accuracy, YOLO v2 introduced anchor boxes, which are used to
predict the shape of the bounding box.
• YOLO v3 further improved the accuracy and speed of the algorithm by using a feature
pyramid network and a new loss function.
• YOLO v4 introduced a number of new concepts, such as spatial pyramid pooling,
path aggregation network, and cross-stage partial connections, to achieve optimal
speed and accuracy of object detection.
• YOLO v5 builds upon the success of previous versions and adds several new features
and improvements.
In summary, the YOLO algorithm is a fast and efficient object detection algorithm that uses a
single convolutional neural network to split an image into grids and make predictions on
bounding boxes and confidence scores. The algorithm has evolved over time with new
versions introducing new concepts to improve accuracy and speed.

2. OpenCV

Figure 3.1.4: OpenCV

OpenCV (Open Source Computer Vision Library) (Fig 3.1.4 & 3.1.5) is a library of
programming functions mainly for real-time computer vision. Originally developed by Intel,
it was later supported by Willow Garage, then Itseez (which was later acquired by Intel). The
library is cross-platform and licensed as free and open-source software under the Apache
License 2.0. Starting in 2011, OpenCV has featured GPU acceleration for real-time operations. OpenCV
provides a wide range of computer vision functions and algorithms, including image and
video processing, object detection and tracking, feature detection and matching, machine
learning, and deep learning. It supports various programming languages such as C++, Python,
and Java, and can run on different platforms including Windows, Linux, macOS, iOS, and
Android.


Some of the key features and functions of OpenCV are:


1. Image and video processing: OpenCV provides various functions for image and video
processing, such as filtering, transformation, geometric operations, color conversion,
and histogram analysis.
2. Object detection and tracking: OpenCV supports various object detection and
tracking algorithms, such as Haar cascades, HOG (Histogram of Oriented Gradients),
and deep learning-based models like YOLO and SSD.
3. Feature detection and matching: OpenCV provides various functions for feature
detection and matching, such as SIFT (Scale-Invariant Feature Transform), SURF
(Speeded-Up Robust Features), and ORB (Oriented FAST and Rotated BRIEF).
4. Machine learning: OpenCV provides various machine learning algorithms, such as k-
nearest neighbors, support vector machines, decision trees, and random forests.
5. Deep learning: OpenCV provides a deep learning inference module (dnn) that can load
models trained in frameworks such as TensorFlow, Caffe, and PyTorch (via ONNX export).

Figure 3.1.5: OpenCV example

The working of the algorithm can be further optimized by incorporating deep learning-based
methods for object tracking and trajectory prediction. For instance, deep neural networks can
be trained to learn the appearance and motion patterns of the objects, which can improve the
accuracy of object tracking and trajectory prediction.
Overall, the Multiple Object Tracking and Trajectory Prediction using YOLO algorithm is a
powerful tool for tracking multiple objects in video frames and predicting their trajectories. It
has various applications in fields such as surveillance, autonomous driving, and robotics.


3.2 IMPLEMENTATION:
Implementing multiple object tracking (MOT) and trajectory prediction using YOLO
algorithm involves several steps:
1. Object detection: The first step is to detect objects in each frame of the video using
YOLO (You Only Look Once) algorithm. YOLO is a popular real-time object
detection algorithm that can detect multiple objects in an image simultaneously. It
works by dividing the input image into a grid of cells and predicting the bounding
boxes, confidence scores, and class probabilities for each cell.
2. Object tracking: The next step is to track the detected objects across frames. One
popular approach for object tracking is to use the Kalman filter algorithm, which can
predict the future position of each object based on its current position and velocity.
3. Data association: After object tracking, the next step is to associate the detected
objects in each frame with their corresponding tracks. This can be done using various
techniques, such as the Hungarian algorithm or the Intersection over Union (IoU)
method.
4. Trajectory prediction: Once the objects are associated with their corresponding
tracks, the next step is to predict their future trajectories. One popular approach for
trajectory prediction is to use the linear regression algorithm, which can fit a straight
line to the object's past positions and predict its future positions based on this line.

5. Post-processing: Finally, post-processing can be performed to refine the results and
remove any false positives or noise.

Overall, implementing MOT and trajectory prediction using the YOLO algorithm involves a
combination of object detection, object tracking, data association, trajectory prediction, and
post-processing. These tasks can be computationally intensive and require a robust hardware
setup to achieve real-time performance. However, with the advancement of deep learning and
computer vision algorithms, the implementation of MOT and trajectory prediction using the
YOLO algorithm has become more accessible and efficient.
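
Steps 3 and 4 above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the project's code: a greedy IoU matcher stands in for the Hungarian algorithm, the Kalman filter of step 2 is omitted, and the function names, box layout, and 0.3 threshold are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: Dict[int, Box], detections: List[Box],
              iou_threshold: float = 0.3) -> Dict[int, int]:
    """Greedily match track ids to detection indices in order of decreasing IoU."""
    pairs = sorted(((iou(tbox, dbox), tid, di)
                    for tid, tbox in tracks.items()
                    for di, dbox in enumerate(detections)), reverse=True)
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, di in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if tid not in matches and di not in used:
            matches[tid] = di
            used.add(di)
    return matches


def fit_line(ts: List[float], vs: List[float]) -> Tuple[float, float]:
    """Least-squares fit v = slope * t + intercept."""
    n = len(ts)
    mean_t, mean_v = sum(ts) / n, sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:
        return 0.0, mean_v
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
    return slope, mean_v - slope * mean_t


def predict_positions(history: List[Tuple[float, float]],
                      steps_ahead: int) -> List[Tuple[float, float]]:
    """Extrapolate future (x, y) centres from one past centre per frame."""
    ts = [float(i) for i in range(len(history))]
    sx, ix = fit_line(ts, [p[0] for p in history])
    sy, iy = fit_line(ts, [p[1] for p in history])
    last = len(history) - 1
    return [(sx * (last + k) + ix, sy * (last + k) + iy)
            for k in range(1, steps_ahead + 1)]


tracks = {1: (0, 0, 10, 10), 2: (100, 100, 120, 120)}
detections = [(101, 101, 121, 121), (1, 0, 11, 10)]
matches = associate(tracks, detections)
future = predict_positions([(0, 0), (2, 1), (4, 2), (6, 3)], steps_ahead=2)
```

Greedy matching is easier to read than the Hungarian algorithm but can give a worse overall assignment in crowded scenes, which is one reason the optimal method is usually preferred in practice.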

Department of Electronics and Communication Engineering ,BIT Page 22


MULTIPLE OBJECT TRACKING AND PATH PREDICTION FOR TRAFFIC CONGESTION CONTROL 2022-23

CHAPTER 4

RESULTS AND DISCUSSIONS


The primary purpose of this project is to find the optimal object detector-tracker combination.
Figures 4.1 and 4.2 depict all camera views with manually created green and blue polygons that
count the number of cars passing through in the northbound and southbound directions.

Figure 4.1: Vehicle tracking
Figure 4.2: Green blue polygons

The vehicle counts are grouped into four categories: overall vehicle counts, total
automobile counts, total truck counts, and overall vehicle counts under different conditions
(i.e., daylight, nighttime, rain). All of the cars are manually tallied (Fig 4.3) to establish
ground truth. The accuracy of the system is measured by comparing the automated counts
derived from various model combinations to the ground truth value, expressed as a
percentage.
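
The polygon-based counting and the comparison against ground truth described above can be sketched as follows. The report does not give the exact counting rule or accuracy formula, so both are assumptions here: a track is counted once if any of its centroids falls inside the polygon, and accuracy is taken as 100 minus the percentage counting error. All names and sample values are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a horizontal ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_tracks_in_zone(tracks: Dict[int, List[Point]],
                         polygon: List[Point]) -> int:
    """Count each tracked vehicle once if any of its centroids lies in the zone."""
    return sum(1 for pts in tracks.values()
               if any(point_in_polygon(p, polygon) for p in pts))


def counting_accuracy(automated: int, ground_truth: int) -> float:
    """Assumed metric: 100% minus the absolute percentage counting error."""
    if ground_truth <= 0:
        raise ValueError("ground_truth must be positive")
    return max(0.0, 100.0 * (1 - abs(automated - ground_truth) / ground_truth))


# Hypothetical counting zone and tracked centroids (pixel coordinates).
zone = [(0, 0), (10, 0), (10, 10), (0, 10)]
tracks = {
    1: [(-5, 5), (5, 5)],      # enters the zone
    2: [(20, 20), (25, 25)],   # never enters
    3: [(3, 3)],               # starts inside
}
count = count_tracks_in_zone(tracks, zone)
accuracy = counting_accuracy(automated=95, ground_truth=100)
```

Counting each track identifier only once is what keeps a vehicle from being tallied again on every frame it spends inside the polygon.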

Figure 4.3: Multiple Object Tracking

Misclassifications can also be caused by camera motion and by conditions such as raindrops on
the lens or complete darkness. Except for a few camera views where the cars were either too
far away or encountered low-light or nighttime situations, when only the vehicles' headlights
were visible, most object detection models produced correct true positives. Figure 4.4 below
illustrates the dashboard used for object detection. It comprises a configuration
menu where the user may select how to input the data, which can be either live
video from a webcam or other camera module or recorded video from security cameras.
Additionally, the user has the choice to provide a video clip's URL.

Figure 4.4: Dashboard

Figure 4.5: Processed video

The YOLO algorithm is used to process the video, and the user is then shown the outcome. The
dashboard lists the detected objects present in the frame as well as the processing rate in
frames per second (Fig 4.5).


To make the prediction, we utilized the basic laws of motion and assumed that the only force
acting on the basketball (Fig 4.6) is gravity, neglecting any external factors. By applying the
kinematic equations, we calculated the displacement in each coordinate direction (x, y, z)
using the given initial position, velocity, and acceleration values.

Figure 4.6: Trajectory Prediction

In the x-direction, where the basketball's initial velocity was x m/s and there was no
acceleration, the displacement was determined solely by the velocity and time. Consequently,
the basketball traveled d meters horizontally in the t-second interval.
In the y-direction, the basketball was subject to a constant acceleration of -9.8 m/s² due to
gravity. As a result, its vertical displacement was affected by both the initial velocity and the
gravitational acceleration. With an initial vertical velocity of 0 m/s, the basketball fell
downward.
Regarding the z-direction, the basketball had an initial velocity of z m/s and experienced no
acceleration. Therefore, it maintained a constant velocity throughout the t-second period.
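
The per-axis calculation described above follows the constant-acceleration kinematic equation s = v0*t + (1/2)*a*t^2. The sketch below applies it to each coordinate. The numeric values are hypothetical placeholders, since the report leaves the initial velocities and the time interval symbolic, and the function names are chosen for illustration.

```python
from typing import Tuple

Vector3 = Tuple[float, float, float]


def displacement(v0: float, a: float, t: float) -> float:
    """Displacement along one axis under constant acceleration."""
    return v0 * t + 0.5 * a * t * t


def predict_position(p0: Vector3, v0: Vector3, a: Vector3, t: float) -> Vector3:
    """Apply the kinematic equation independently to the x, y and z axes."""
    return tuple(p + displacement(v, acc, t) for p, v, acc in zip(p0, v0, a))


# Hypothetical values: start 10 m above the ground, 3 m/s along x, 2 m/s along z,
# no initial vertical velocity, and gravity acting along y only.
start = (0.0, 10.0, 0.0)
velocity = (3.0, 0.0, 2.0)
acceleration = (0.0, -9.8, 0.0)
position = predict_position(start, velocity, acceleration, t=1.0)
```

With these placeholder values, after one second the object has moved 3 m along x and 2 m along z at constant velocity, and has fallen 4.9 m along y.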
It is important to note that this trajectory prediction assumes ideal conditions and neglects
factors such as air resistance, spin, or external disturbances. In reality, these factors can
significantly influence the actual trajectory of a basketball. Therefore, more sophisticated
models or experimental data may be required to account for these variables accurately.
Trajectory prediction is not limited to basketball but can be applied to various scenarios
involving moving objects, such as projectiles, vehicles, or even celestial bodies. Accurate
trajectory predictions are vital for planning and decision-making processes, including sports
strategy, autonomous navigation systems, and impact analysis.


CONCLUSION AND FUTURE SCOPE


Multiple Object Tracking and Trajectory Prediction using YOLO algorithm is a powerful and
efficient approach for tracking multiple objects in video frames and predicting their future
movements. The YOLO algorithm enables fast and accurate object detection, while object
tracking and trajectory prediction algorithms help to track objects over time and forecast their
future positions. This technology has numerous practical applications in various fields, such as
surveillance, autonomous driving, and robotics. For example, it can be used for tracking
vehicles and pedestrians in traffic scenarios, identifying and tracking individuals in crowded
public places, or monitoring the movements of robots in manufacturing plants. In conclusion,
Multiple Object Tracking and Trajectory Prediction using YOLO algorithm is a valuable tool
for real-time object tracking and prediction. As deep learning methods continue to evolve, we
can expect even more accurate and efficient tracking and prediction techniques to emerge,
further advancing the field of computer vision and its applications.
Trajectory prediction is a valuable tool for understanding the future path of objects based on
their initial conditions. By employing fundamental principles of motion, we can estimate how
an object, in this case, a basketball, will move over a specified period. However, it is essential
to consider various factors and potential limitations to improve the accuracy of such
predictions in real-world scenarios.


REFERENCES
[1] Young Yoon, Heesu Hwang, Yongjun Choi, Minbeom Joo, Hyeyoon Oh, Insun
Park, Keon-Hee Lee, and Jin-Ha Hwang, “Analyzing Basketball Movements and Pass
Relationships Using Realtime Object Tracking Techniques Based on Deep Learning”,
2019, IEEE
[2] Mate Krišto, Marina Ivasic-Kos, and Miran Pobar, “Thermal Object Detection
in Difficult Weather Conditions Using YOLO”, 2020, IEEE
[3] Daniel S. Kaputa and Brian P. Landy, “YOLBO: You Only Look Back Once–A
Low Latency Object Tracker Based on YOLO and Optical Flow”, 2020, IEEE
[4] Wei Fang, Wang Lin, Ren Peiming, “Tinier-YOLO: A Real-Time Object
Detection Method for Constrained Environments”, 2020, IEEE
[5] Yongjun Li, Li Shasha, Du Haohao, Chen Lijia, Dong Ming Zhang, Yao Li,
“YOLO-ACN: Focusing on small target and occluded object detection”, 2020, IEEE
[6] Zhuang-Zhuang Wang, Kai Xie, Xin-Yu Zhang, Hua-Quan Chen, Chang Wen,
Jian-Biao He, “Small-Object Detection Based on YOLO and Dense Block via Image
Super-Resolution”, 2021, IEEE
[7] Xuejun Chen, Da Li, Qixiang Zou, “Exploiting Acceleration of the Target for
Visual Object Tracking”, 2021, IEEE
[8] Qingze Yu, Bo Wang, Yumin Su, “Object Detection-Tracking Algorithm for
Unmanned Surface Vehicles Based on a Radar-Photoelectric System”, 2021, IEEE


Appendix A
Source code:
import streamlit as st
from main import Yolov7
import torch
import os


def main():
    st.title("Dashboard")
    inference_msg = st.empty()
    st.sidebar.title("Configuration")

    input_source = st.sidebar.radio("Select input source",
                                    ('RTSP/HTTPS', 'Webcam', 'Local video'))

    # Detection confidence threshold, entered as text and parsed as a float
    conf_thres = float(st.sidebar.text_input("Detection confidence", "0.50"))

    save_output_video = st.sidebar.radio("Save output video?", ('Yes', 'No'))
    save_img = save_output_video == 'Yes'

    # ------------------------- LOCAL VIDEO ------------------------------
    if input_source == "Local video":
        video = st.sidebar.file_uploader("Select input video", type=["mp4", "avi"],
                                         accept_multiple_files=False)

        # save video temporarily to process it using cv2
        if video is not None:
            if not os.path.exists('./tempDir'):
                os.makedirs('./tempDir')
            with open(os.path.join(os.getcwd(), "tempDir", video.name), "wb") as file:
                file.write(video.getbuffer())

            video_filename = f'./tempDir/{video.name}'

            # Offer "Run" only after an upload, so video_filename is always defined
            if st.sidebar.button("Run"):
                stframe = st.empty()

                st.subheader("Inference Stats")
                if1, if2 = st.columns(2)

                st.subheader("System Stats")
                ss1, ss2, ss3 = st.columns(3)

                # Updating Inference results

                with if1:
                    st.markdown("*Frame Rate*")
                    if1_text = st.markdown("0")

                with if2:
                    st.markdown("*Detected objects in current frame*")
                    if2_text = st.markdown("0")

                # Updating System stats
                with ss1:
                    st.markdown("*Memory Usage*")
                    ss1_text = st.markdown("0")

                with ss2:
                    st.markdown("*CPU Usage*")
                    ss2_text = st.markdown("0")

                with ss3:
                    st.markdown("*GPU Memory Usage*")
                    ss3_text = st.markdown("0")

                # Run
                local_run = Yolov7(source=video_filename, save_img=save_img,
                                   conf_thres=conf_thres,
                                   stframe=stframe, if1_text=if1_text, if2_text=if2_text,
                                   ss1_text=ss1_text, ss2_text=ss2_text, ss3_text=ss3_text)

                local_run.detect()
                inference_msg.success("Inference Complete!")

                # delete the saved video
                if os.path.exists(video_filename):
                    os.remove(video_filename)

    # -------------------------- WEBCAM ----------------------------------
    if input_source == "Webcam":
        if st.sidebar.button("Run"):
            stframe = st.empty()

            st.subheader("Inference Stats")
            if1, if2 = st.columns(2)

            st.subheader("System Stats")
            ss1, ss2, ss3 = st.columns(3)

            # Updating Inference results
            with if1:

                st.markdown("*Frame Rate*")
                if1_text = st.markdown("0")

            with if2:
                st.markdown("*Detected objects in current frame*")
                if2_text = st.markdown("0")

            # Updating System stats
            with ss1:
                st.markdown("*Memory Usage*")
                ss1_text = st.markdown("0")

            with ss2:
                st.markdown("*CPU Usage*")
                ss2_text = st.markdown("0")

            with ss3:
                st.markdown("*GPU Memory Usage*")
                ss3_text = st.markdown("0")

            # Run (source '0' selects the default webcam)
            webcam_run = Yolov7(source='0', save_img=save_img, conf_thres=conf_thres,
                                stframe=stframe, if1_text=if1_text, if2_text=if2_text,
                                ss1_text=ss1_text, ss2_text=ss2_text, ss3_text=ss3_text)

            webcam_run.detect()

    # -------------------------- RTSP/HTTPS ------------------------------
    if input_source == "RTSP/HTTPS":
        rtsp_input = st.sidebar.text_input("Video link",
                                           "https://www.youtube.com/watch?v=zu6yUYEERwA")

        if st.sidebar.button("Run"):
            stframe = st.empty()

            st.subheader("Inference Stats")
            if1, if2 = st.columns(2)

            st.subheader("System Stats")
            ss1, ss2, ss3 = st.columns(3)

            # Updating Inference results
            with if1:
                st.markdown("*Frame Rate*")
                if1_text = st.markdown("0")

            with if2:
                st.markdown("*Detected objects in current frame*")
                if2_text = st.markdown("0")


            # Updating System stats
            with ss1:
                st.markdown("*Memory Usage*")
                ss1_text = st.markdown("0")

            with ss2:
                st.markdown("*CPU Usage*")
                ss2_text = st.markdown("0")

            with ss3:
                st.markdown("*GPU Memory Usage*")
                ss3_text = st.markdown("0")

            # Run
            stream_run = Yolov7(source=rtsp_input, save_img=save_img,
                                conf_thres=conf_thres,
                                stframe=stframe, if1_text=if1_text, if2_text=if2_text,
                                ss1_text=ss1_text, ss2_text=ss2_text, ss3_text=ss3_text)

            stream_run.detect()

    # Release cached GPU memory held by PyTorch
    torch.cuda.empty_cache()


if __name__ == "__main__":
    try:
        main()
    except SystemExit:
        pass
