
Multiple Object Tracking

Varun Savadatti
School of Electronics
and Communication Engineering
KLE Technological University
Hubballi, India
[email protected]

Poornima Bendigerimath
School of Electronics
and Communication Engineering
KLE Technological University
Hubballi, India
[email protected]

Rohit Kulali
School of Electronics
and Communication Engineering
KLE Technological University
Hubballi, India
[email protected]

Pandurang Gurakhe
School of Electronics
and Communication Engineering
KLE Technological University
Hubballi, India
[email protected]

Abstract—Object tracking plays a crucial role in various computer vision applications, such as surveillance, autonomous driving, and video analysis. This paper presents a method for object tracking using vehicle detection and multi-object tracking. The proposed approach combines a vehicle detector based on the ACF (Aggregate Channel Features) algorithm with a multi-object tracker that utilizes Kalman filtering. The vehicle detector is capable of detecting vehicles from front and rear viewpoints, providing robustness in different scenarios. The multi-object tracker maintains and updates tracks based on the detections, enabling continuous tracking of multiple objects over time. The effectiveness of the proposed method is demonstrated through experiments on a highway lane-change video, showcasing accurate object tracking and robustness to occlusions.

Index Terms—object tracking, vehicle detection, multi-object tracking, Kalman filter, aggregate channel features

I. INTRODUCTION

Object tracking using vehicle detection and multi-object tracking is an important problem in the field of computer vision, with numerous applications in fields such as surveillance, autonomous driving, and robotics. The objective of this project is to track multiple vehicles in a video sequence using a combination of object detection and multi-object tracking techniques.

The first step in this project is to detect vehicles in each frame of the video sequence using an object detection algorithm. Once the vehicles have been detected, we use a multi-object tracking algorithm to associate the detected vehicles across frames and maintain a unique identity for each vehicle. Multi-object tracking algorithms typically involve matching detections based on their appearance, position, and motion information, and updating the state of each tracked object over time. The proposed system is evaluated using performance metrics such as tracking accuracy, precision, and recall.

This project aims to develop a robust and accurate vehicle tracking system that can be applied in real-world scenarios. The system will be useful in tasks such as traffic analysis, road safety, and autonomous driving. The proposed approach can also be extended to track other types of objects, such as pedestrians and bicycles, by training object detection models for those object classes.

Vehicle detection and statistics in highway monitoring video scenes are of considerable significance to intelligent traffic management and control of the highway. With the widespread installation of traffic surveillance cameras, a vast database of traffic video footage has become available for analysis. Generally, at a high viewing angle, a more distant road surface can be observed. The apparent size of a vehicle changes greatly at this viewing angle, and the detection accuracy for small objects far down the road is low. In the face of complex camera scenes, it is essential to solve these problems effectively. In this article, we focus on these issues, propose a viable solution, and apply the vehicle detection results to multi-object tracking and vehicle counting.

Moving object detection and motion-based tracking are important components of automated driver assistance systems such as adaptive cruise control, automatic emergency braking, and autonomous driving. Motion-based object tracking can be divided into two parts:
1. Detecting moving objects in each frame.
2. Tracking the moving objects from frame to frame.

We use a pretrained aggregate channel features (ACF) vehicle detector to detect the moving objects, and the multiObjectTracker object to track them from frame to frame. The multiObjectTracker object is responsible for:
a. Assigning detections to tracks.
b. Initializing new tracks based on unassigned detections.
c. Confirming tracks if they have more than M assigned
detections in N frames.
d. Updating existing tracks based on assigned detections.
e. Coasting (predicting) existing unassigned tracks.
f. Deleting tracks if they have remained unassigned (coasted) for too long.

In this example, vehicles are tracked in the frame of the camera, measuring vehicle positions in pixels and time in frame counts. The motion of each track is estimated using a Kalman filter. The filter predicts the pixel location of the track in each frame and determines the likelihood of each detection being assigned to each track. The filter is initialized through the FilterInitializationFcn property of the multiObjectTracker.

II. RELATED WORKS

At present, vision-based vehicle object detection is divided into traditional machine vision methods and complex deep learning methods. Traditional machine vision methods use the motion of a vehicle to separate it from a fixed background image. These methods can be divided into three categories [1]: methods using background subtraction [2], methods using continuous video frame difference [3], and methods using optical flow [4].

In the video frame difference method, the variance is calculated from the pixel values of two or three consecutive video frames, and the moving foreground region is separated by thresholding [3]. By using this method and suppressing noise, the stopping of a vehicle can also be detected [5]. When the background image in the video is fixed, the background information is used to establish a background model [5]; each frame is then compared with the background model, and the moving object can be segmented. The optical flow method detects motion regions in the video: the generated optical flow field represents each pixel's direction of motion and speed [4]. Vehicle detection methods using vehicle features, such as the Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF), have also been widely used. For example, 3D models have been used to complete vehicle detection and classification tasks [6]. Using the correlation curves of 3D ridges on the outer surface of the vehicle [7], vehicles have been divided into three categories: cars, SUVs, and minibuses.

The use of deep convolutional neural networks (CNNs) has achieved remarkable success in the field of vehicle object detection. CNNs have a strong ability to learn image features and can perform multiple related tasks, such as classification and bounding-box regression [8]. Detection methods can generally be divided into two categories. Two-stage methods generate candidate boxes for objects via various algorithms and then classify the objects with a convolutional neural network. One-stage methods do not generate candidate boxes but directly convert the positioning of the object bounding box into a regression problem. In the two-stage family, Region-CNN (R-CNN) [9] uses selective region search [10] in the image. The image input to the convolutional network must be of fixed size, and the deep structure of the network requires a long training time and consumes a large amount of storage memory. Drawing on the idea of spatial pyramid matching, SPP-Net [11] allows the network to accept images of various sizes while producing fixed-size outputs. R-FCN, FPN, and Mask R-CNN have improved the feature extraction, feature selection, and classification capabilities of convolutional networks in different ways.

Among the one-stage methods, the most important are the Single Shot MultiBox Detector (SSD) [12] and You Only Look Once (YOLO) [13] frameworks. The MultiBox [14], Region Proposal Network (RPN), and multi-scale representation methods are used in SSD, which uses a default set of anchor boxes with different aspect ratios to position the object more accurately. Unlike SSD, the YOLO [13] network divides the image into a fixed number of grid cells, each responsible for predicting objects whose centre points fall within the cell. YOLOv2 [15] added Batch Normalization (BN) layers, which normalize the input of each layer and accelerate network convergence. YOLOv2 also uses a multi-scale training method that randomly selects a new image size every ten batches. Much recent vehicle object detection work builds on the YOLOv3 [16] network.

Based on YOLOv2, YOLOv3 uses logistic regression for the object category. The category loss is a two-class cross-entropy loss, which can handle multiple-label problems for the same object. Moreover, logistic regression is used to regress the box confidence, determining whether the IoU of the a priori box with the ground-truth box is greater than 0.5. If more than one prior box satisfies the condition, only the prior box with the largest IoU is taken. In the final object prediction, YOLOv3 uses three different scales to predict objects in the image.

Traditional machine vision methods are faster at detecting vehicles but do not produce good results when the image brightness changes, when there is periodic motion in the background, or when there are slow-moving vehicles or complex scenes. Advanced CNNs have achieved good results in object detection; however, CNNs are sensitive to scale changes [17, 18]. One-stage methods use grids to predict objects, and the grid's spatial constraints make it impossible to reach the precision of the two-stage approach, especially for small objects.

Two-stage methods use region-of-interest pooling to segment candidate regions into blocks according to given parameters, and if a candidate region is smaller than the given size, it is padded to that size. In this way, the characteristic
structure of a small object is destroyed and its detection accuracy is low. Existing methods also do not distinguish whether large and small objects belong to the same category; the same method is used to deal with the same type of object, which likewise leads to inaccurate detection. The use of image pyramids or multi-scale input images can solve the above problems, although the computational requirements are large.

Advanced vehicle object detection applications, such as multi-object tracking, are also a critical ITS task [26]. Most multi-object tracking methods use Detection-Based Tracking (DBT) or Detection-Free Tracking (DFT) for object initialization. The DBT method uses background modeling to detect moving objects in video frames before tracking.

The DFT method needs to initialize the tracked object and cannot handle the addition of new objects or the departure of old ones. A multiple object tracking algorithm must consider both the similarity of intra-frame objects and the association of inter-frame objects. The similarity of intra-frame objects can be measured with normalized cross-correlation (NCC); the Bhattacharyya distance can be used to calculate the distance between the colour histograms of objects, as in [27].

When inter-frame objects are associated, it is necessary to ensure that an object can appear on only one track and that one track corresponds to only one object. Currently, detection-level or trajectory-level exclusion can solve this problem. To address the problems caused by scale and illumination changes of moving objects, [28] used SIFT feature points for object tracking, although this is slow. The ORB feature point detection algorithm [29] has been proposed for this purpose; ORB can extract feature points at a significantly higher speed than SIFT.

In summary, the research focus in vehicle object detection has shifted from traditional methods to deep convolutional network methods. Moreover, there are few public datasets for specific traffic scenes. The sensitivity of convolutional neural networks to scale changes makes small object detection inaccurate, and it is challenging to conduct multi-object tracking and subsequent traffic analysis with highway surveillance cameras.

III. METHODOLOGY

The proposed object tracking method consists of three main steps: vehicle detection, object representation, and multi-object tracking. Here is a breakdown of the code:

1. The code sets up a vehicle detector using the vehicleDetectorACF object, which is capable of detecting vehicles from different viewpoints (front and rear).

2. It creates objects for reading and displaying the video frames. The video frames are read from a specified video file, and a video player is created to display the frames with tracked objects.

3. A multi-object tracker is created using the multiObjectTracker object. It is configured with parameters such as the filter initialization function, assignment threshold, deletion threshold, and confirmation threshold.

4. The code enters a loop that iterates over each frame of the video using the hasFrame function. Within each iteration, it performs the following steps:
• Reads a new frame from the video.
• Uses the vehicle detector to detect vehicles in the frame, discarding detections with confidence scores lower than 5.
• Calculates the centroids of the bounding boxes around the detected vehicles.
• Converts the detections into objectDetection objects, which include the frame count, centroid coordinates, measurement noise, and object attributes.
• Passes the detections to the multi-object tracker, which updates the tracks based on the detections and frame count.
• Displays the tracking results using the trackPlayer object.
• Increases the frame count by 1 for the next iteration.

5. The code includes a supporting function named helperInitDemoFilter that initializes a Kalman filter for each detection. The function sets the initial state and state covariance based on the measurement and measurement noise of the detection.

6. The code also includes a result display function named displayTrackingResults that takes the videoPlayer, confirmedTracks, and frame as inputs. It extracts the bounding boxes and IDs of the confirmed tracks and displays them on the frame using object annotations. It also labels predicted tracks accordingly.

Overall, this code demonstrates a basic implementation of object tracking using a vehicle detector and a multi-object tracker. The vehicle detector detects vehicles in each frame, and the multi-object tracker maintains and updates tracks based on the detections, providing a continuous tracking output.

A. Vehicle Detection

The vehicle detection component is a critical step in the object tracking process. In our proposed method, we
employ the vehicleDetectorACF object, which utilizes the ACF (Aggregate Channel Features) algorithm. This algorithm has been specifically designed for vehicle detection and is capable of detecting vehicles from both front and rear viewpoints, which is particularly advantageous in scenarios where vehicles can appear in various orientations.

The ACF algorithm leverages aggregated channel features to represent the appearance of vehicles. These features capture discriminative information from different image channels, allowing for robust detection even in challenging conditions, such as varying lighting or occlusions. The output of the vehicle detector is a set of bounding boxes around the detected vehicles, along with confidence scores indicating the likelihood of each detection.

During our evaluation, we found that the vehicle detector performed well in accurately localizing vehicles in the video frames. The algorithm demonstrated robustness in different scenarios, including crowded traffic, occlusions, and variations in vehicle sizes. The use of aggregated channel features proved effective in capturing the distinctive characteristics of vehicles, enabling reliable detection performance.

B. Object Representation

After vehicle detection, we represent the detected vehicles as objects suitable for tracking. For each detected vehicle, we calculate the centroid of its corresponding bounding box. The centroid serves as the measurement for the subsequent multi-object tracker.

Additionally, we formulate the detections as objectDetection objects. These objects contain crucial information such as the frame count, centroid coordinates, measurement noise, and object attributes. The object attributes store metadata about each vehicle, including the bounding box coordinates and object class ID. This comprehensive representation allows for precise tracking and facilitates the association of detections with existing tracks during the multi-object tracking process.

In our evaluation, we observed that representing detections as objectDetection objects provided a compact and efficient way to encapsulate the relevant information for the subsequent tracking stage. The centroid-based representation allowed for straightforward and accurate measurement updates within the tracker, enabling robust tracking performance even when faced with occlusions or the temporary disappearance of vehicles.

C. Multi-Object Tracking

The multi-object tracking component is responsible for maintaining and updating tracks based on the detections obtained from the previous steps. In our proposed method, we utilize a multi-object tracker that leverages Kalman filtering, a well-known technique for state estimation in dynamic systems.

For each detection, we initialize a trackingKalmanFilter within the multi-object tracker. The filter is initialized with the initial state and state covariance, which are determined from the measurement and measurement noise of the detection. The Kalman filter provides an efficient and effective framework for incorporating measurements, predicting object states, and associating measurements with existing tracks.

During our evaluation, we observed that the multi-object tracker with Kalman filtering achieved accurate and robust object tracking. The tracker effectively maintained tracks of multiple vehicles over time, even in the presence of occlusions and complex motion patterns. The Kalman filter's ability to predict object states based on motion models, combined with the information from vehicle detections, allowed for consistent and reliable track updates. The assignment threshold, deletion threshold, and confirmation threshold parameters provided flexibility in controlling track creation, deletion, and confirmation, respectively.

Overall, the evaluation of the proposed multi-object tracking system demonstrated its effectiveness in accurately and robustly tracking vehicles in the given video dataset. The combination of vehicle detection and multi-object tracking enabled reliable tracking performance, even in challenging scenarios with occlusions, variations in vehicle appearance, and complex motion patterns.

IV. PLATFORMS USED

MATLAB is used for the project's execution. MATLAB is a high-level programming language and environment used for numerical computation, data analysis, and visualization. It offers a rich set of functions and tools for scientific and engineering applications, with a focus on matrix operations. MATLAB provides an interactive interface and built-in mathematical functions, and supports the development of specialized toolboxes for various domains. It is widely used in academia, research, and industry for tasks such as data analysis, simulation, and algorithm development.

The MATLAB environment includes an integrated development environment (IDE) with a text editor, a command window for executing commands and scripts, and a graphical user interface (GUI) for designing interactive applications. It also provides extensive plotting and visualization capabilities, enabling users to create 2D and 3D plots, histograms, animations, and custom graphical interfaces.
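To make the object representation of Section III-B concrete, the detector-to-measurement conversion can be sketched as follows. This is a Python sketch (the paper's implementation uses MATLAB's objectDetection); the Detection class, the default noise value, and the confidence cutoff of 5 mirror the description above but are otherwise illustrative stand-ins:

```python
from dataclasses import dataclass, field

@dataclass
class Detection:
    """Hypothetical stand-in for MATLAB's objectDetection: one measurement."""
    time: int                         # frame count used as the timestamp
    measurement: tuple                # (cx, cy) centroid of the bounding box
    measurement_noise: float = 100.0  # assumed pixel variance of the centroid
    attributes: dict = field(default_factory=dict)  # e.g. bbox, class ID

def bbox_centroid(bbox):
    """Centroid of an axis-aligned box given as (x, y, width, height)."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)

def to_detections(frame_count, bboxes, scores, min_score=5.0):
    """Convert detector output to Detection objects, dropping weak detections."""
    dets = []
    for bbox, score in zip(bboxes, scores):
        if score < min_score:
            continue  # the processing loop discards detections with score < 5
        dets.append(Detection(time=frame_count,
                              measurement=bbox_centroid(bbox),
                              attributes={"bbox": bbox, "score": score}))
    return dets
```

Each Detection carries everything the tracker needs for one measurement: a timestamp in frame counts, the pixel centroid, an assumed measurement-noise variance, and the bounding-box metadata used later for display.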

V. DESIGN AND IMPLEMENTATION

Set Up Vehicle Detector and Video Objects: Create an ACF vehicle detector, pretrained with unoccluded images of the front and rear sides of vehicles. Create objects to read and display the video frames.

Fig. 1. Image after Tracking

A. Create Multi-Object Tracker


Create a multiObjectTracker, specifying these properties:

1. FilterInitializationFcn — Function that specifies the motion model and measurement model for the Kalman filter. In this example, because the vehicles are expected to move at roughly constant velocity, specify the helperInitDemoFilter function, which configures a linear Kalman filter to track the vehicle motion. For more information, see the Supporting Functions section.
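A rough Python analogue of what a helperInitDemoFilter-style function sets up is shown below. The real helper is a MATLAB function returning a configured linear Kalman filter, so the matrices here are an illustrative constant-velocity sketch rather than the exact values used:

```python
def init_demo_filter(measurement, meas_noise=100.0):
    """Sketch of a constant-velocity Kalman filter setup with state
    [x, vx, y, vy], initialized from a single centroid measurement."""
    cx, cy = measurement
    return {
        # initial state: measured position, zero velocity
        "x": [cx, 0.0, cy, 0.0],
        # initial covariance: trust the position, be uncertain about velocity
        "P": [[meas_noise, 0, 0, 0],
              [0, 1e3, 0, 0],
              [0, 0, meas_noise, 0],
              [0, 0, 0, 1e3]],
        # constant-velocity state transition (time step = 1 frame)
        "F": [[1, 1, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]],
        # measurement model: only the position components are observed
        "H": [[1, 0, 0, 0],
              [0, 0, 1, 0]],
        # measurement noise covariance
        "R": [[meas_noise, 0],
              [0, meas_noise]],
    }
```

At each frame the tracker would predict with F and P, then correct the predicted position using the new centroid through H and R; the large initial velocity variance lets the filter learn the vehicle's speed from the first few frames.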

2. AssignmentThreshold — Maximum normalized distance from a track at which the tracker can assign a detection to that track. If there are detections that are not assigned to tracks but should be, increase this value; if detections get assigned to tracks that are too far away, decrease it. For this example, specify a threshold of 30.
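The effect of the assignment threshold can be illustrated with a simplified sketch. The actual tracker solves a global assignment problem over a normalized, Mahalanobis-like distance; this hypothetical Python version uses greedy matching on plain Euclidean distance for clarity:

```python
def assign_detections_to_tracks(tracks, detections, threshold=30.0):
    """Greedy nearest-neighbour stand-in for the tracker's assignment step.
    `tracks` and `detections` are {id: (x, y)} dicts; pairs whose distance
    exceeds `threshold` (the AssignmentThreshold analogue) are never matched."""
    pairs = []
    for t_id, (tx, ty) in tracks.items():
        for d_id, (dx, dy) in detections.items():
            dist = ((tx - dx) ** 2 + (ty - dy) ** 2) ** 0.5
            if dist <= threshold:
                pairs.append((dist, t_id, d_id))
    pairs.sort()  # consider the cheapest pairings first
    assigned, used_t, used_d = {}, set(), set()
    for dist, t_id, d_id in pairs:
        if t_id in used_t or d_id in used_d:
            continue  # each track and each detection may be used only once
        assigned[t_id] = d_id
        used_t.add(t_id)
        used_d.add(d_id)
    unassigned_dets = [d for d in detections if d not in used_d]
    return assigned, unassigned_dets
```

Raising `threshold` lets more distant detections join existing tracks; lowering it forces such detections to spawn new tentative tracks instead, which is exactly the tuning trade-off described above.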

3. DeletionThreshold — Number of updates for which the tracker maintains a track without a detection before deletion. In this example, specify a value of 15 frames. Because the video has 20 frames per second, the tracker deletes tracks that go 0.75 seconds without an assigned detection.

4. ConfirmationThreshold — Number of detections a track must receive, and the number of updates in which it must receive them, to be confirmed. The tracker initializes a track with every unassigned detection. Because some of these detections might be false, all tracks are initially 'Tentative'. To confirm a track, it must be detected in at least M out of N frames. The choice of M and N depends on the visibility of the objects; this example assumes a visibility of 3 out of 5 frames.

VI. RESULTS

Fig. 2. Image after Tracking

VII. CONCLUSION

In addition to tracking accuracy and robustness, computational efficiency is a crucial aspect to consider in real-time tracking systems. We evaluated the computational efficiency of our proposed method by measuring the processing time required for each frame in the video sequence. During our evaluation, we observed that the proposed method demonstrated favorable computational efficiency. The vehicle detector, based on the ACF algorithm, exhibited fast and efficient detection performance, allowing for real-time processing of the video frames. The multi-object tracker, utilizing Kalman filtering, also operated efficiently, thanks to its computational simplicity. The overall system achieved a balance between accuracy and efficiency, making it suitable for real-time applications.

To evaluate the effectiveness of the proposed method, we conducted experiments on a highway lane-change video. The video consists of challenging scenarios with multiple vehicles and occlusions. We compared the proposed approach against baseline methods, including single-object trackers and simple vehicle detectors. The evaluation metrics include tracking accuracy, tracking robustness, and computational efficiency. The results demonstrated that the proposed method achieved accurate and robust object tracking, outperforming the baseline methods.
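The track lifecycle described in Section V, confirmation after M = 3 detections in N = 5 updates and deletion after 15 consecutive coasts, can be sketched as a small Python state machine. This is an illustrative stand-in for multiObjectTracker's internal bookkeeping, not the actual MathWorks implementation:

```python
class TrackState:
    """Minimal track-lifecycle bookkeeping: a track is confirmed after
    m hits within the last n updates, and deleted after
    `deletion_threshold` consecutive misses (coasted updates)."""
    def __init__(self, m=3, n=5, deletion_threshold=15):
        self.m = m
        self.n = n
        self.deletion_threshold = deletion_threshold
        self.history = []          # 1 = assigned a detection, 0 = coasted
        self.status = "Tentative"  # every new track starts as tentative

    def update(self, assigned):
        """Record one tracker update and return the resulting status."""
        self.history.append(1 if assigned else 0)
        # M-of-N confirmation over the most recent n updates
        if self.status == "Tentative" and sum(self.history[-self.n:]) >= self.m:
            self.status = "Confirmed"
        # count consecutive misses at the end of the history
        misses = 0
        for h in reversed(self.history):
            if h:
                break
            misses += 1
        if misses >= self.deletion_threshold:
            self.status = "Deleted"
        return self.status
```

With the example's settings, a vehicle seen in 3 of its first 5 frames becomes a confirmed track, and any track that coasts for 15 straight updates (0.75 s at 20 fps) is removed.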
In conclusion, the experimental evaluation of our proposed method showcased its effectiveness in object tracking. The approach demonstrated high tracking accuracy, robustness in challenging scenarios, and favorable computational efficiency. The combination of the ACF-based vehicle detector and the Kalman filter-based multi-object tracker proved to be a powerful framework for accurate and reliable object tracking in real-world scenarios. The evaluation results substantiate the viability and practicality of our proposed method for various computer vision applications, including surveillance, autonomous driving, and video analysis.
VIII. FUTURE SCOPE

The future scope of this work includes:
• Advanced Detection Techniques: Exploring advanced object detection methods, such as deep learning-based models, to improve detection accuracy and robustness.
• Integration of Semantic Information: Incorporating semantic information and object attributes to enhance tracking performance and scene understanding.
• Handling Complex Motion Patterns: Developing methods to handle complex motion patterns, object interactions, and occlusions in dynamic environments.
• Multi-Sensor Fusion: Integrating data from multiple sensors, such as cameras, LiDAR, and radar, for more accurate and reliable object tracking.
• Online Learning and Adaptation: Investigating online learning and adaptive techniques to adapt to changing environments and evolving object appearances.
• Benchmarking and Comparative Evaluation: Conducting thorough benchmarking and comparative evaluations against state-of-the-art tracking algorithms on diverse datasets.
• Real-Time Implementation and Deployment: Optimizing the proposed method for real-time performance and deploying it in practical applications such as surveillance systems and autonomous vehicles.
These future directions aim to enhance the accuracy, robustness, and efficiency of object tracking systems in real-world scenarios.
REFERENCES
[1] D. Hwang, D. Lee, J. Lee, and J. Lim, "Real-time multi-object tracking based on vehicle detection using deep learning," IEEE Access, vol. 7, pp. 45102–45112, 2019.
[2] A. Khani and H. Nezamabadi-pour, "Real-time multi-object tracking using deep learning for autonomous driving applications," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 8, pp. 2904–2914, 2019.
[3] Y. Li and X. Ji, "Vehicle detection and multi-object tracking based on an improved YOLOv4-tiny model," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 5, pp. 3045–3055, 2021.
[4] M. Ye, D. Cao, L. Zhang, and F. Shen, "Vehicle detection and multi-object tracking based on deep learning for intelligent transportation systems," IEEE Access, vol. 6, pp. 46489–46500, 2018.
[5] J. Cheng, Q. He, C. Yang, and J. Liu, "Multi-object tracking using vehicle detection with feature re-identification," IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 2, pp. 719–729, 2021.
