Design of An Effective Multiple Objects Tracking Framework For Dynamic Video Scenes
Corresponding Author:
Karanam Sunil Kumar
Department of Computer Science and Engineering, BMS College of Engineering
Bull Temple Rd, Basavanagudi, Bengaluru, Karnataka 560019, India
Email: [email protected]
1. INTRODUCTION
The growth of the global surveillance market has made dynamic object detection and tracking from
video scenes popular in recent years. The advancement of computer vision technology and image processing
makes this market grow even faster. The prime reasons behind this rapid development are urbanization and the
wide deployment of surveillance systems across large buildings, public places,
parks, roads, and airports. Monitoring and surveillance systems play a crucial role in various aspects, viz.,
traffic movement management, automotive safety, activity-based recognition for cyber-security applications,
and sports analysis [1], [2]. This gives rise to the requirement for reliable and accurate multiple-object
tracking (MOT) to address public safety concerns in interconnected smart cities. The prime motive of MOT is
to consistently localize and identify several objects in a video sequence, which facilitates the video analysis
applications of video surveillance systems. Most
conventional works on MOT follow the idea of a tracking-by-detection framework due to its simplicity and
effectiveness in fulfilling tracking requirements. Traditional MOT consists of two stages of
operation [3]–[8].
In the first stage of operations, the framework employs an object detector to detect objects of
interest in the current video frame, whereas, in the second stage of operations, the detected objects are
associated with the tracks from the previous frames to construct the trajectories further. Here the system
associates the detected objects between frames using features that could be either location or appearance [9]–
[11]. The recent progress in tracking-by-detection strategy has evolved towards solving the ambiguities
associated with object detection. It can also handle the constraints that result in object detection failures.
However, object detection is also closely studied with motion estimation, which is capable of identifying an
object's mobility between two consecutive frames [12].
Segmentation plays a significant role in developing applications and techniques for tracking objects
across the frame sequences of a video. Studies have also been carried out in this direction; a significant one
was conducted by the authors of [13], where the objective function for optimizing segmentation accuracy uses
two parameters: i) entropy and ii) clustering indices. Further, the method was validated against traditional
segmentation techniques that include: i) statistical region merging, ii) watershed, and iii) K-means. Although
they tested this method on four different datasets, all these datasets contain heterogeneous images, not video
sequences. Minhas et al. [14] propose a novel concept
of building a semantic segmentation network from skin features of high significance that fine-tunes the object
boundary information at different scales. The method is tested and validated on several human activity
databases. Cheng et al. [15] introduce a framework named ViTrack, which targets the efficient implementation
of multi-video tracking systems at the edge to facilitate video surveillance requirements. The problem
formulation in the study addresses the core research challenges in three prime areas of video tracking in
surveillance systems such as i) compressed sensing (CS) [16]–[18], ii) object recognition, and iii) object
tracking. Xing et al. [19] explored the evolution of intelligent transportation systems where vehicular
movement tracking is an important concern for traffic surveillance. The authors mostly emphasized designing
a real-time system for tracking vehicular movement in complex scenes from captured video feeds. They
introduce a tracking model named NoisyOTNet, which formulates the problem of object tracking in complex
video scenes as reinforcement learning in parameter space. The study explores traditional vehicle tracking
methods such as correlation filter-based methods [20]–[22] and deep learning-based methods [23], [24], and
finds that both adopt a static learning approach, unlike reinforcement learning [25], [26].
Abdelali et al. [27] also address the problem of vehicular traffic surveillance and road violations
and further attempt to design an approach to tackle this issue. In this regard, the study introduces a fully
automated methodical approach, namely multiple hypothesis detection and tracking (MHDT), to deal with
multi-object tracking in videos. The method jointly integrates the Kalman filter [28] and data
association-based tracking using YOLO detection [29] to robustly track vehicular objects in complex
video scenes.
Once the vehicle objects are detected, the system employs a Kalman filter-based tracking model,
which applies temporal correlation to track vehicles from one frame to the next. The Kalman filter [28] is
constructed such that, for each time instance t, it provides a first prediction 𝑦́t of the state yt from the previous
state yt−1 and the transition matrix T, as in (1).

𝑦́t = T × yt−1 (1)
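The prediction step in (1) can be sketched in a few lines; the constant-velocity state and the numeric values below are illustrative assumptions, not taken from the study.

```python
# A minimal sketch of the Kalman prediction step y'_t = T * y_{t-1} in (1).
# The constant-velocity state [position, velocity] is an assumed example.
def kalman_predict(y_prev, T):
    # Plain matrix-vector product.
    return [sum(T[i][j] * y_prev[j] for j in range(len(y_prev)))
            for i in range(len(T))]

T = [[1.0, 1.0],   # position advances by one velocity step per frame
     [0.0, 1.0]]   # velocity stays constant
y_prev = [10.0, 2.0]                 # previous state: position 10, velocity 2
y_pred = kalman_predict(y_prev, T)   # -> [12.0, 2.0]
```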
The Kalman filter also estimates the state prediction steps considering a covariance estimation
calculation. The study also analyses various related works and observes that most of the studied algorithms
use a convolutional neural network (CNN) as the classifier, which yields better accuracy, lying between 93%
and 97%. The computational complexity is evaluated with respect to the estimation of the bounding box
coordinates (b), which states that the overall computational cost of the model stands at O(b³ + b² + b).
It has been observed that the variation factor in illumination causes significant challenges in video
surveillance systems towards multiple object detection and tracking in the presence of motion factors. Even
though various schemes have evolved and been studied for several decades for different tasks, constraints
remain due to illumination variation, deformation of mobile objects, pose variation, motion blur, full or
partial occlusions, and camera view angle. These crucial aspects are still unsolved problems associated with
mobile object detection and tracking from dynamic video scenes. Also, the challenges with traditional tracking
systems include a lack of effectiveness in localizing the object of interest in the presence of dynamic
background transitions, and a lack of handling of variations in aspect ratio, intra-class object variation,
appropriate contextual information, and complex backgrounds [30], [31]. Apart from this,
the most significant challenge arises with higher accuracy of multiple object detection and tracking while
balancing considerable cost-effective computational performance, which is less likely explored in the
existing systems of MOT models.
After reviewing the existing studies on MOT, the identified research problems outline the fact that,
although various forms of work on MOT exist, the majority of tracking models accomplish higher detection
and tracking accuracy at the cost of computational complexity, which is also the case with
the existing machine learning (ML) based approaches. Secondly, most studies do not consider
contextual connectivity factors of an object with its background, which remains a challenge in the existing
works. The appropriate inclusion of feature engineering is also missing in the existing ML-based MOT
techniques for tracking dynamic mobile objects in the complex video scenes, where contextual scene
information also plays a crucial role.
The study's problem statement is "To design a cost-effective and highly accurate MOT framework
to perform object detection and tracking from complex video scenes considering contextual information is a
highly challenging task". This proposed study addresses this problem, and a novel computational contextual
framework is introduced for effective MOT. The novelty of this framework is that it can identify numerous
mobile objects from the dynamic scenes and also reduces the cost of computational effort with a simplified
tracking module. The contribution of the proposed system is that it applies cost-effective modelling to assign
object detections in the current frame to existing tracks with an optimal estimator. It also explores the scope of
improvement in mobile object detection considering the method of Gaussian mixture model (GMM) and
improves the tracking performance using Kalman filter-based approach. Here the strategy also explores the
association among the detected mobile objects from one frame to the next and overcomes the association
problem. Here the inclusion of the Kalman filter method predicts the state variables effectively, which
enhances the tracking performance with cost-effective trajectory formulation for the mobile objects even in
the presence of complex and dynamic scenes. It has to be noted that the identification of mobile objects in the
proposed study considers the contextual aspect of the object, which is also referred to as the line of
movement (LoM). Another novelty of the proposed approach is its simplified design execution, which makes
the entire system computationally efficient when compared with the existing baseline approaches.
This new concept of dynamic tracking of numerous mobile objects takes advantage of GMM in the
segmentation of objects. It also handles the constraints of traditional background subtraction methods
towards the appropriate detection of moving objects. The study further improves the tracking model by
considering the potential of the Kalman filter for predicting the centroid of each track for
motion-based tracking, through which it has also handled the track assignment problem. The experimental
outcome further justifies how the formulated concept of LoM considers the directionality of movement, which
cost-effectively performs association among identified moving objects and performs tracking through
trajectory formulation. It also shows better identification performance by the tracking module with
cost-effectiveness when compared with the baseline approaches. Unlike baseline studies, the proposed
strategy offers a much lower response time with considerable processing execution and iterations.
2. METHOD
This part of the study formulates the analytical design modeling of the proposed cost-efficient
dynamic tracking model which is capable of tracking multiple video objects with higher accuracy and
computational efficiency. The study formulates the flow of the design with analytical research modeling to
realise the working scenario of the proposed approach. It also involves a set of functional modules which
operate to fulfil the design requirements of the proposed system.
The block-based architecture of the proposed system in Figure 1 shows that it consists of a set of
operational modules. The first module handles video I/O initialization, where it constructs a video reader
object and reads the video file. Here the functionality constructs a reference object (Ov), which computes
different attributes that are discussed further in the consecutive sections. It also initializes two players, P1 and
P2, to visualize the computed foreground mask and the video file sequence (Vf) respectively. The system
further constructs explicit functionalities to initialize the operations corresponding to the Gaussian-based
foreground detector and the binary large object (Blob) analyzer, which also consider the reference object from
the video sequence. The study then employs a dynamic mobile object detection module, which constructs
system objects to read the video file input sequence and detect the foreground objects. Here the study also
enhances precise object detection by incorporating morphological operations, which pre-process the data and
make it suitable for video analysis by the Blob analyzer. The proposed strategy further applies GMM to
perform precise object segmentation from the complex video scenes. The approach also initializes the
tracking module, where it constructs structure array fields. Finally, the study applies a Kalman filter to
enhance the prediction of the new location of each track, where centroid calculation and bounding box
updating also take place. Finally, the proposed strategy handles the track assignment problem for detected
mobile objects, again using the Kalman filter approach to assign detections to tracks. It has to be noted that
the entire process also minimizes the cost of track allocation, where a track depicts the contextual LoM aspect
of a mobile object. The proposed strategy then performs the updating operations with respect to track
attributes and exhibits the final tracked mobile objects from the complex video scenes. The core strategy of
the proposed tracking module is to effectively locate one or multiple moving objects over progressive time
for a given Vf. The core strategy identifies the association problem and detects an object across multiple
frames of a video stream. It also follows the fundamental principle of baseline tracking models, whose core
philosophy is to initially detect the objects of interest in a video frame and then perform prediction to
construct the LoM of object trajectories over the next consecutive frames of the video sequence. The
proposed study handles the data association problem by estimating the predicted locations and then
associating the detections across frames to formulate the trajectories of the LoM for the respective objects.
the name of the video file nVf, which is associated with the object Ov. Here the duration (t) considers the total
length of the Vf. The computed reference object of the Vf also consists of other important information related
to video properties. In Table 1, the attribute bp refers to the number of bits per pixel in the respective Vf. The
attribute Fr refers to the frame rate of the Vf computed in frames/s. It also computes the height (h) of the ith
frame (framei) of Vf in pixels, along with the width (w) of framei in pixels. It also computes the number of
frames (framen) along with the video format type.
The structure of Ov is finally constructed considering its essential properties to understand the input
video data. Challenges arise in conventional systems in the detection of moving objects from dynamic video
scenes. In the problem of tracking moving objects from video sequences, segmentation of the dynamic region
in real-time synchronization is quite a challenging task for various reasons, which include complex and
moving backgrounds, occlusion, motion blur, illumination variations, and many other factors. Therefore, to
handle individual challenges, many custom background subtraction methods have evolved. Table 1 provides
some of the important information about the properties of the Vf, explored through the object Ov and its
associated methods.
In these methods, fast learning in dense environments is the main focus of research. The explicit
algorithm for the video input-output initialization is given in Algorithm 1. The numerical algorithm modeling
initially considers the video sequences through the video file (Vf) and creates two player objects, P1 and P2,
for the foreground mask and the original video sequence respectively. The study further employs the
initialization and creation of an explicit function: the foreground detector function (ffd) takes the input
parameter set {number of Gaussians (Ng), number of frames for training (NTf), minimum background ratio
percentage (MBr)} to construct the detector (D), taking advantage of the GMM [32], [33].
Idea of GMM: It has been observed that different background objects are likely to appear at the
same pixel location over a specific period of time, which poses a challenge to single-valued background
models. Several researchers discuss the design and modeling of multi-valued background models, which can
easily cope with multiple background objects appearing in video scenes [34], [35]. The model provides a
better description of both foreground and background values by describing the probability of observing a
certain pixel value (xt) at a specific time (t). The GMM method computes each pixel within a temporal
window (w) considering a mixture of k single- or multi-dimensional Gaussian distributions. A larger value
of k gives a stronger ability to deal with background disturbance. If the sequence observed for a given pixel
is 𝑥 = {𝑥1 , 𝑥2 , … , 𝑥𝑡 }, then the probability of observing the current pixel value at time t can be represented
as in (2).

P(xt ) = ∑ki=1 ωi,t η(xt , μi,t , Σi,t ) (2)
Here k represents the number of Gaussian distributions, each of which describes one of the
observable foreground or background objects. In practical instances, the value of k is likely to reside within
the range 3 ≤ k ≤ 5. The Gaussians remain multi-variate for the purpose of describing the red, green, and
blue values. Here μi,t refers to the mean value of the ith Gaussian in the mixture at instance t, and Σi,t denotes
the covariance matrix of the ith Gaussian at time t. It has to be noted that k is determined considering both
memory and computational power. The estimate ωi,t denotes the weight factor associated with the ith
Gaussian at time instance t. The principle here is that ∑ki=1 ωi,t = 1, and η(xt , μi,t , Σi,t ) is the Gaussian
probability density function given in (3).
η(xt , μi,t , Σi,t ) = (1 / ((2π)^(n/2) |Σi,t |^(1/2))) e^(−(1/2)(xt − μi,t )ᵀ Σi,t ⁻¹ (xt − μi,t )) (3)
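The mixture computation in (2) and (3) can be sketched for the one-dimensional case (n = 1, Σi,t = σ²i,t); the weights, means, and variances below are illustrative values only, not parameters taken from the study.

```python
import math

def gaussian_pdf(x, mu, var):
    # eta(x_t, mu, sigma^2) from (3), specialized to n = 1.
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_pixel_prob(x, weights, means, variances):
    # P(x_t) = sum_i w_i * eta(x_t, mu_i, sigma_i^2) from (2), with sum_i w_i = 1.
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# k = 3 Gaussians: two assumed background modes and one foreground mode.
p = gmm_pixel_prob(120.0,
                   weights=[0.5, 0.3, 0.2],
                   means=[118.0, 130.0, 200.0],
                   variances=[25.0, 36.0, 100.0])
```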
The system modeling also exploits the beneficial features of GMM. Background modeling of a
grayscale image takes n = 1 and Σi,t = 𝜎²𝑖,𝑡, whereas when the modeling is applied to RGB components it
takes n = 3 and Σi,t = 𝜎²𝑖,𝑡 𝐼, a form of covariance matrix which assumes that the red, green, and blue
components are independent with equal variance. Additionally, the system evaluates the incoming frames in
real time, and GMM modifies its parameters step by step in response to changing pixel values. The pixels are
mapped using a thresholding approach and the Gaussian model, and the system modifies the weights of the
Gaussian components if a match is identified. This is how the background model is estimated according to
the distributions, and background pixel categorization becomes possible. The functionalities defined in the
modeling of ffd(Ng, NTf, MBr) aim to form the foreground detector with effective segmentation for
background subtraction. The formation of the foreground detection object enables the potential features of
GMM, in which the color or grayscale video frame is compared with a background model as discussed in
(2) and (3).
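The matching and weight-update scheme described above can be sketched as follows; the 2.5σ match test and the learning rate α are common choices in online GMM background models and are assumptions here, not parameters taken from the study.

```python
# Assumed sketch of the per-pixel match test and weight update in an online
# GMM background model: a pixel matches a Gaussian if it lies within 2.5
# standard deviations, and weights are updated with rate alpha and renormalized.
def update_gmm_weights(x, weights, means, variances, alpha=0.05, k_sigma=2.5):
    matched = [abs(x - m) <= k_sigma * v ** 0.5
               for m, v in zip(means, variances)]
    new_w = [(1.0 - alpha) * w + alpha * (1.0 if hit else 0.0)
             for w, hit in zip(weights, matched)]
    total = sum(new_w)
    return [w / total for w in new_w], matched

weights, matched = update_gmm_weights(121.0,
                                      weights=[0.6, 0.4],
                                      means=[120.0, 200.0],
                                      variances=[25.0, 100.0])
# The first Gaussian matches (|121 - 120| <= 2.5 * 5), so its weight grows.
```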
This computational process enables a classification criterion to decide whether a certain pixel
belongs to the background or the foreground. It is essential for background subtraction algorithms, as this
data exploration and pre-processing stage helps eliminate redundant attributes from the data and makes it
suitable for further computational analysis with truthful, accurate, and complete information about the
foreground object. Here the foreground mask (Mf) associated with D is computed, and the background
subtraction algorithm efficiently computes the foreground objects (Of) from the frame sequence of the Vf.
Another explicit function for analyzing the properties of connected regions, the BlobAnalyser function (fba),
takes the parameter set {port for the bounding box (BOp), port for the output area (AOp), port for the output
centroid (COp), minimum blob area (MBa)} and yields the blob (B). The underlying idea of Blob analysis is
to explore the statistics of labelled regions in the binary frames of the video sequence; it basically helps
segment the objects from the video sequence. A description of Blob analysis can be seen in Figure 3.
The method of Blob analysis basically refers to analyzing the shape features associated with objects.
Its implications identify groups of connected pixels which are most likely related to a moving object. The
idea of Blob analysis is to explore pixel connectivity and construct the Blob through the function fba(x); the
connectivity among pixels is represented by the Blob. Firstly, the process computes the statistics associated
with the blob and further analyses the Blob information corresponding to geometric characteristics, which
include borderline points and perimeter. These ideas and standard methods are incorporated in designing the
object detection and tracking methodologies in the proposed system's context.
In the computation of blob statistics, the system analyses the output AOp, which represents a vector
of pixel counts for the labeled regions. Here COp refers to an N-by-2 matrix of centroid coordinates c(x,y),
which can be represented with the matrix in (4), where N represents the number of Blobs and [x,y] the
centroid coordinates. The rows [x1,y1] through [xN,yN] imply that, for N blobs, the row and column
coordinates of their centroids are [x1,y1], …, [xN,yN] respectively.
COp = [x1 y1; x2 y2; …; xN yN] (4)
The computation of the Blob (B) measure also analyses the parameter MBa; the bounding box
output is another N-by-4 matrix, where N again represents the number of blobs and [x,y] denotes the upper-
left corner of each bounding box. The statistical analysis of the blob returns a blob analysis system object
(B), which exposes the significant properties of centroid, bounding box, label matrix, and blob count in the
output. Finally, this computation process extracts the shape features of the objects of interest from the
video sequence.
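The Blob analyzer outputs described above (centroid, bounding box, and area per connected region) can be sketched with a plain connected-component pass over a binary mask; this is an illustrative pure-Python version under assumed conventions, not the system object the study uses.

```python
# Sketch of Blob analysis: label 4-connected foreground regions in a binary
# mask and return, per blob, the centroid (x, y), the bounding box
# (x, y, w, h) with upper-left corner (x, y), and the pixel area.
def blob_analysis(mask, min_area=1):
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:                      # flood fill one region
                    y, x = stack.pop()
                    pixels.append((x, y))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(pixels) >= min_area:       # MBa-style area filter
                    xs = [p[0] for p in pixels]
                    ys = [p[1] for p in pixels]
                    blobs.append({
                        "centroid": (sum(xs) / len(xs), sum(ys) / len(ys)),
                        "bbox": (min(xs), min(ys),
                                 max(xs) - min(xs) + 1, max(ys) - min(ys) + 1),
                        "area": len(pixels),
                    })
    return blobs

mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1]]
blobs = blob_analysis(mask)   # one 2x2 blob and one single-pixel blob
```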
The system also formulates a functionality to initialize the structure array of tracks. Here each
individual track 𝑇𝑖 ∈ 𝑇𝑚 represents the structure corresponding to a moving object appearing in the Vf. The
design requirement for the tracking module in the proposed moving object detection and tracking strategy is
to formulate the structure fields in such a way that the state of the tracked object (𝑇𝑂 ) can be maintained
appropriately. Here 𝐼𝐷 refers to the integer ID of the track, 𝐵𝑥 represents the current bounding box associated
with the object, and 𝐾𝐹 represents a Kalman filter object used for motion-based tracking. 𝑎 refers to the frame
count since the first detection of 𝑇, the consecutive visible count refers to the number of frames in which the
track was detected, and 𝑐𝐼𝐶 represents the count of consecutive frames in which the track was not detected.
The computed state corresponds to the information utilized for track allocation, track expiry, and display.
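The track structure fields described above can be sketched as a record type; the field names mirror the text (ID, Bx, KF, age, visible counts), while the concrete types and defaults are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

@dataclass
class Track:
    track_id: int                            # integer ID of the track
    bbox: Tuple[float, float, float, float]  # current bounding box Bx (x, y, w, h)
    kalman: Optional[Any] = None             # Kalman filter object KF for motion-based tracking
    age: int = 1                             # frame count since first detection (a)
    total_visible: int = 1                   # frames in which the track was detected
    consecutive_invisible: int = 0           # consecutive frames without detection (cIC)

t = Track(track_id=1, bbox=(10.0, 20.0, 50.0, 80.0))
```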
Here the computation of the centroid prediction determines the current location attributes of
the 𝑇𝑖 considering the Kalman filter object. The further computation shifts the Bx in such a way that
its center lies at the 𝑃𝑐𝑖 . This is achieved with (6).
𝑃𝑐𝑖 = 𝑃𝑐𝑖 − 𝐵𝑥(𝑘)⁄2 (6)
The function further updates the new location of the 𝑇𝑖 with respect to the LoM for 𝑃𝑐𝑖 . The
proposed system also explores the shape-based features of the target object which further assist in optimal
estimation of motion associated with the identified object on its LoM. The next computational process
performs LoM allocation to the identified objects of interest.
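The bounding-box shift implied by (6), moving Bx so that its center lies at Pci, can be sketched as follows; the (x, y, w, h) box convention with (x, y) as the upper-left corner is an assumption for illustration.

```python
# Sketch of the shift in (6): move the box so its center coincides with the
# predicted centroid, i.e. the upper-left corner becomes P_c - size / 2.
def recenter_bbox(bbox, predicted_centroid):
    x, y, w, h = bbox
    cx, cy = predicted_centroid
    return (cx - w / 2.0, cy - h / 2.0, w, h)

shifted = recenter_bbox((0.0, 0.0, 10.0, 20.0), (50.0, 50.0))
# -> (45.0, 40.0, 10.0, 20.0)
```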
Finally, the optimized estimator of this function solves the problem of allocating identified objects
to tracks, or LoM, for multi-object tracking. It also computes attributes such as the allocated LoM, the
non-allocated LoM, and the non-allocated identified objects. Algorithm 3 shows the design strategy of the
tracking module, which is influenced by [36], [37], for solving the problem of allocating detections to tracks
during multi-object tracking.
Once the cost evaluation metric is computed for solving the assignment problem, the process
executes the updating of the LoM allocation. Here the algorithm estimates the locations of the detected
objects using another KF-based approach, which corrects the moving object's location with respect to its
LoM. Fine-tuning of the LoM for a detected object also takes place, where the predicted Bx is replaced with
the detected Bx. Finally, the age corresponding to 𝑇𝑖 is updated along with its visibility. The proposed
algorithm then computes the updated allocated LoM and non-allocated LoM, eliminates the missed LoM, and
constructs new LoM prior to exhibiting the 𝐹𝑂𝑇 attribute. It can be seen that the design strategy of the
proposed MOT module is quite simple and less iterative, which enhances the computing speed of the
algorithm's analytical operation. The methods are computationally less complex in performing the tracking
operations for the implemented idea and also offer cost-effective MOT. The next section discusses the
experimental outcome obtained from the simulation of the proposed strategy for multi-object tracking over
complex video sequences.
3. RESULTS AND DISCUSSION
Figure 5. Tracking of a single test object: (a) no tracking of the white vehicle, (b) tracking of the white
vehicle in the middle of the roadway, (c) tracking of the white vehicle in the right lane, (d) tracking of the
black vehicle in the right lane, (e) in the left lane, and (f) continuing its journey in the left lane
Another test instance in the proposed study model is considered where identification and tracking of
multiple mobile objects are performed considering the proposed MOT framework. The Figure 6 clearly
shows that the multiple mobile objects are distinctly indexed initially in Figure 6(a) whereas in the sequence
of the other frames the detection and tracking are slightly affected due to occlusion. However, in
Figures 6(b)-6(d) most features are positively determined, and in the end the tracking accuracy also improves
irrespective of the presence of partial occlusion. It can also be seen that the proposed study model retains a
proper balance between tracking accuracy and computational complexity, which is further illustrated in the
comparative Table 2.
Figure 6. Tracking of multiple test objects in the presence of occlusions: (a) tracking of multiple objects
distinctly indexed, (b) occlusion between two running objects, (c) major occlusion between two running
objects, and (d) occlusion between the three running objects
The interpretation of the observational outcome in Table 2 shows that the proposed system offers
comparatively better tracking performance while balancing the cost factors, where it also obtains a
considerable response time with executional steps that do not involve very complex procedures. The cost
evaluation also shows how the proposed tracking model addresses the assignment of detections to tracks
effectively while minimizing the cost factors. The insights from the comparative study show that, compared
with the approaches in [15], [27], [30], [32], the proposed tracking model attains considerably better tracking
accuracy of approximately 96.22%, which is comparable with the existing baseline models. The critical
findings also show that the proposed model is better in terms of response time, iterativeness, complexity, and
computation cost. Another strength of the study model is that it is capable of providing better accuracy even
with low- or medium-sized video data.
4. CONCLUSION
The study introduces an effective computational framework for multi-object tracking where it
considers tracking a set of mobile objects from a given dynamic video scene. The study attempts to provide a
simplistic design schema for the proposed system. It aims to precisely detect moving objects in each frame
and precisely track the identified objects' movement over successive frames, even under partial occlusion.
The study also handles the problem of assigning detections to each track, considering an efficient distance
computation using the Kalman filter. The strategic modelling performs the detection of moving objects using
the GMM-based background subtraction method, and the Blob analysis further generates the groups of
connected pixels for the moving objects, which are further considered to determine the
association of detections of the moving objects for their LoM. The contributions of the proposed model are as
follows: i) unlike the existing systems, it offers a simplistic design modelling of the tracking model, which
attains better accuracy of LoM for moving objects without compromising computational performance; ii) it
enhances the computational operation with object-oriented design modelling of system objects and also
performs better foreground detection and Blob analysis; iii) the proposed system also performs contextual
attribute-based LoM analysis for the directionality of movement of an object, which assists in the effective
tracking of multiple objects over successive frame sequences; and iv) the inclusion of the optimal estimator in
the proposed system not only reduces the noise but also offers effective management of allocated and
non-allocated LoM to balance the cost factors, which also addresses the assignment problem in dynamic
tracking. Overall, it is
clear that the simplistic study model of the proposed system retains a better balance between accuracy and
computation cost while performing detection and tracking of a mobile object over dynamic video scenes. It has
to be noted that the study considered a specific form of dataset for the evaluation of the proposed tracking
model, and also a specific volume of data to study the effectiveness of the system; the model has not been
evaluated under an increasing number of samples. The future scope of the research aims to apply the study
model towards accomplishing better public safety and security by considering faster, more reliable, and
accurate object tracking among the interconnected smart cities.
REFERENCES
[1] M. H. Sedky, M. Moniri, and C. C. Chibelushi, “Classification of smart video surveillance systems for commercial applications,”
IEEE International Conference on Advanced Video and Signal Based Surveillance, vol. 2005, pp. 638–643, 2005, doi:
10.1109/AVSS.2005.1577343.
[2] Y. Wang, “Development of AtoN real-time video surveillance system based on the AIS collision warning,” ICTIS 2019 - 5th
International Conference on Transportation Information and Safety, pp. 393–398, 2019, doi: 10.1109/ICTIS.2019.8883727.
[3] T. Zhang, B. Ghanem, and N. Ahuja, “Robust multi-object tracking via cross-domain contextual information for sports video
analysis,” ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 985–988, 2012, doi:
10.1109/ICASSP.2012.6288050.
[4] F. Wu, S. Peng, J. Zhou, Q. Liu, and X. Xie, “Object tracking via online multiple instance learning with reliable components,”
Computer Vision and Image Understanding, vol. 172, pp. 25–36, 2018, doi: 10.1016/j.cviu.2018.03.008.
[5] J. Gwak, “Multi-object tracking through learning relational appearance features and motion patterns,” Computer Vision and Image
Understanding, vol. 162, pp. 103–115, 2017, doi: 10.1016/j.cviu.2017.05.010.
[6] M. Weber, M. Welling, and P. Perona, “Unsupervised learning of models for recognition,” Computer Vision - ECCV 2000, vol.
1842, pp. 18–32, 2000, doi: 10.1007/3-540-45054-8_2.
[7] M. A. Naiel, M. O. Ahmad, M. N. S. Swamy, J. Lim, and M. H. Yang, “Online multi-object tracking via robust collaborative
model and sample selection,” Computer Vision and Image Understanding, vol. 154, pp. 94–107, 2017, doi:
10.1016/j.cviu.2016.07.003.
[8] M. Han, W. Xu, H. Tao, and Y. Gong, “An algorithm for multiple object trajectory tracking,” Proceedings of the IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2004, doi: 10.1109/CVPR.2004.1315122.
[9] D. Riahi and G. A. Bilodeau, “Online multi-object tracking by detection based on generative appearance models,” Computer
Vision and Image Understanding, vol. 152, pp. 88–102, 2016, doi: 10.1016/j.cviu.2016.07.012.
[10] S. Huang, S. Jiang, and X. Zhu, “Multi-object tracking via discriminative appearance modeling,” Computer Vision and Image
Understanding, vol. 153, pp. 77–87, 2016, doi: 10.1016/j.cviu.2016.06.003.
[11] D. B. Reid, “An algorithm for tracking multiple targets,” IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854,
1979, doi: 10.1109/TAC.1979.1102177.
[12] J. Prokaj, M. Duchaineau, and G. Medioni, “Inferring tracklets for multi-object tracking,” IEEE Computer Society Conference on
Computer Vision and Pattern Recognition Workshops, pp. 37–44, 2011, doi: 10.1109/CVPRW.2011.5981753.
[13] J. D. H. Resendiz, H. M. M. Castro, and E. T. Leal, “A comparative study of clustering validation indices and maximum entropy
for sintonization of automatic segmentation techniques,” IEEE Latin America Transactions, vol. 17, no. 8, pp. 1229–1236, 2019,
doi: 10.1109/TLA.2019.8932330.
[14] K. Minhas et al., “Accurate pixel-wise skin segmentation using shallow fully convolutional neural network,” IEEE Access, vol. 8,
pp. 156314–156327, 2020, doi: 10.1109/ACCESS.2020.3019183.
[15] L. Cheng, J. Wang, and Y. Li, “ViTrack: efficient tracking on the edge for commodity video surveillance systems,” IEEE
Transactions on Parallel and Distributed Systems, vol. 33, no. 3, pp. 723–735, 2022, doi: 10.1109/TPDS.2021.3081254.
[16] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency
information,” IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, 2006, doi: 10.1109/TIT.2005.862083.
[17] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006, doi:
10.1109/TIT.2006.871582.
[18] E. J. Candes and T. Tao, “Near-optimal signal recovery from random projections: Universal encoding strategies?,” IEEE
Transactions on Information Theory, vol. 52, no. 12, pp. 5406–5425, 2006, doi: 10.1109/TIT.2006.885507.
[19] W. Xing, Y. Yang, S. Zhang, Q. Yu, and L. Wang, “NoisyOTNet: a robust real-time vehicle tracking model for traffic
surveillance,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 4, pp. 2107–2119, 2022, doi:
10.1109/TCSVT.2021.3086104.
[20] J. F. Henriques, R. Caseiro, P. Martins, and J. Batista, “High-speed tracking with kernelized correlation filters,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583–596, 2015, doi: 10.1109/TPAMI.2014.2345390.
[21] M. Danelljan, G. Bhat, F. Shahbaz Khan, and M. Felsberg, “ECO: Efficient convolution operators for tracking,” 30th IEEE
Conference on Computer Vision and Pattern Recognition, vol. 2017, pp. 6931–6939, 2017, doi: 10.1109/CVPR.2017.733.
[22] J. Valmadre, L. Bertinetto, J. Henriques, A. Vedaldi, and P. H. S. Torr, “End-to-end representation learning for correlation filter
based tracking,” 30th IEEE Conference on Computer Vision and Pattern Recognition, pp. 5000–5008, 2017, doi:
10.1109/CVPR.2017.531.
[23] B. Li, W. Wu, Q. Wang, F. Zhang, J. Xing, and J. Yan, “SiamRPN++: Evolution of siamese visual tracking with very deep
networks,” The IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 4277–4286, 2019, doi:
10.1109/CVPR.2019.00441.
[24] H. Fan and H. Ling, “Siamese cascaded region proposal networks for real-time visual tracking,” The IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, pp. 7944–7953, 2019, doi: 10.1109/CVPR.2019.00814.
[25] S. Yun, J. Choi, Y. Yoo, K. Yun, and J. Y. Choi, “Action-decision networks for visual tracking with deep reinforcement
learning,” 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp. 1349–1358, 2017, doi:
10.1109/CVPR.2017.148.
[26] D. Zhang and Z. Zheng, “High performance visual tracking with siamese actor-critic network,” Proceedings - International
Conference on Image Processing, ICIP, vol. 2020, pp. 2116–2120, 2020, doi: 10.1109/ICIP40778.2020.9191326.
[27] H. A. I. T. Abdelali, H. Derrouz, Y. Zennayi, R. O. H. Thami, and F. Bourzeix, “Multiple hypothesis detection and tracking using
deep learning for video traffic surveillance,” IEEE Access, vol. 9, pp. 164282–164291, 2021, doi:
10.1109/ACCESS.2021.3133529.
[28] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Journal of Fluids Engineering, Transactions of the
ASME, vol. 82, no. 1, pp. 35–45, 1960, doi: 10.1115/1.3662552.
[29] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” The IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, pp. 779–788, 2016, doi: 10.1109/CVPR.2016.91.
[30] J. Chen, Z. Xi, C. Wei, J. Lu, Y. Niu, and Z. Li, “Multiple object tracking using edge multi-channel gradient model with ORB
feature,” IEEE Access, vol. 9, pp. 2294–2309, 2021, doi: 10.1109/ACCESS.2020.3046763.
[31] L. Chen, H. Zheng, Z. Yan, and Y. Li, “Discriminative region mining for object detection,” IEEE Transactions on Multimedia,
vol. 23, pp. 4297–4310, 2021, doi: 10.1109/TMM.2020.3040539.
[32] N. Aslam and V. Sharma, “Foreground detection of moving object using Gaussian mixture model,” 2017 IEEE International
Conference on Communication and Signal Processing, ICCSP 2017, pp. 1071–1074, 2017, doi: 10.1109/ICCSP.2017.8286540.
[33] R. M. Haralick, S. R. Sternberg, and X. Zhuang, “Image analysis using mathematical morphology,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 9, no. 4, pp. 532–550, 1987, doi: 10.1109/TPAMI.1987.4767941.
[34] F. Wang, F. Liao, Y. Li, and H. Wang, “A new prediction strategy for dynamic multi-objective optimization using Gaussian
mixture model,” Information Sciences, vol. 580, pp. 331–351, 2021, doi: 10.1016/j.ins.2021.08.065.
[35] X. Lin, C. T. Li, V. Sanchez, and C. Maple, “On the detection-to-track association for online multi-object tracking,” Pattern
Recognition Letters, vol. 146, pp. 200–207, 2021, doi: 10.1016/j.patrec.2021.03.022.
[36] M. L. Miller, H. S. Stone, and I. J. Cox, “Optimizing Murty’s ranked assignment method,” IEEE Transactions on Aerospace and
Electronic Systems, vol. 33, no. 3, pp. 851–862, 1997, doi: 10.1109/7.599256.
[37] J. Munkres, “Algorithms for the assignment and transportation problems,” Journal of the Society for Industrial and Applied
Mathematics, vol. 5, no. 1, pp. 32–38, 1957, doi: 10.1137/0105003.
[38] L. Wen et al., “UA-DETRAC: A new benchmark and protocol for multi-object detection and tracking,” Computer Vision and
Image Understanding, vol. 193, 2020, doi: 10.1016/j.cviu.2020.102907.
[39] K. S. Kumar and N. P. Kavya, “An efficient unusual event tracking in video sequence using block shift feature algorithm,”
International Journal of Advanced Computer Science and Applications, vol. 13, no. 7, pp. 98–107, 2022, doi:
10.14569/IJACSA.2022.0130714.
[40] K. S. Kumar and N. P. Kavya, “Compact scrutiny of current video tracking system and its associated standard approaches,”
International Journal of Advanced Computer Science and Applications, vol. 11, no. 12, pp. 398–408, 2020, doi:
10.14569/IJACSA.2020.0111249.
BIOGRAPHIES OF AUTHORS