Review of Object Tracking Algorithms in Computer V
DOI: 10.54254/2755-2721/32/20230178
Xiao Luo
College of Computer and Network Security (Oxford Brookes College), Chengdu
University of Technology Chengdu Sichuan Province 610059 China
Abstract. This paper surveys deep-learning-based object tracking algorithms in computer vision.
It first introduces the importance and applications of computer vision within artificial
intelligence, describes its research background and definition, and outlines its broad role in
fields such as autonomous driving. It then discusses supporting techniques for computer vision,
including rectified linear unit (ReLU) nonlinearities, overlapping pooling, image recognition
based on semi-naive Bayesian classification (SNBC), human action recognition and tracking based
on the S-D model, and object tracking algorithms based on convolutional neural networks and
particle filters. It also addresses challenges such as building deeper convolutional neural
networks and handling large datasets, and discusses solutions including activation functions,
regularization, and data preprocessing. Finally, it discusses future directions of computer
vision, such as deep learning, reinforcement learning, 3D vision, and scene understanding.
Overall, the paper highlights the importance of computer vision in artificial intelligence and
its potential applications in various fields.
1. Introduction
Computer vision is an important research direction in the field of artificial intelligence. Its goal is
to enable computers to perceive and understand visual information, allowing automated analysis of
image and video data. With computer vision technology, computers can extract meaningful information
from images and videos and perform advanced visual tasks such as target detection, image
segmentation, and pose estimation. Computer vision has gone through multiple stages of research and
development, from edge detection and object recognition algorithms to deep learning and
convolutional neural networks. Its capabilities have improved significantly, enabling computers to
process image and video data automatically, which reduces labor costs and improves work efficiency.
Computer vision is widely used in many fields. For example, in autonomous driving, computer vision
can identify and track vehicles and pedestrians on the road to support intelligent driving systems.
It also plays an important role in face recognition and medical imaging.
© 2023 The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0
(https://creativecommons.org/licenses/by/4.0/).
Proceedings of the 2023 International Conference on Machine Learning and Automation
2. Supporting technology
2.4. Image human behavior recognition and tracking based on S-D model
To address the problem that SNBC is not accurate enough at recognizing human motion in moving images,
an S-D algorithm combining DT and SNBC is proposed, and the corresponding S-D model is established.
The basic structure of the S-D model is shown in the figure below.
The S-D algorithm calculates the optical flow information in the moving image and extracts the
movement trajectory of the athlete. After feature extraction and processing of the trajectory, SNBC
is used for training and classification to recognize and track human motion. For each
point in the image sequence, sampling points whose eigenvalues fall below the eigenvalue threshold T
are deleted. The motion coordinates in the next frame are calculated using a median filter and the
optical flow field. With the threshold defined by the following formula, the human movement
trajectory can be obtained, enabling motion recognition and tracking.
$$T = 0.001 \cdot \max_{i \in I} \min\left(\lambda_i^1, \lambda_i^2\right)$$ [2]
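As a minimal sketch of this filtering step, the Python snippet below computes the threshold T from per-point eigenvalue pairs and discards the points whose smaller eigenvalue falls below it. The function name `filter_sampling_points`, the array layout, and the sample values are illustrative assumptions and are not taken from [2]; the factor 0.001 is the only constant taken from the formula.

```python
import numpy as np

def filter_sampling_points(points, eigenvalues, scale=0.001):
    """Keep sampling points whose smaller eigenvalue reaches the threshold T.

    points:      (N, 2) array of (x, y) coordinates (hypothetical layout).
    eigenvalues: (N, 2) array holding the two eigenvalues of each point.
    """
    points = np.asarray(points, dtype=float)
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    # min(lambda_1, lambda_2) for every sampling point
    min_eig = eigenvalues.min(axis=1)
    # T = 0.001 * max over all points of the smaller eigenvalue
    threshold = scale * min_eig.max()
    keep = min_eig >= threshold
    return points[keep], threshold

# Illustrative values only (assumed, not from the paper)
points = np.array([[10, 12], [40, 8], [25, 30]])
eigenvalues = np.array([[500.0, 900.0], [0.2, 700.0], [300.0, 350.0]])
kept, T = filter_sampling_points(points, eigenvalues)
```

In this toy input the second point has a very small minimum eigenvalue, which typically indicates a low-texture region, so it is removed before tracking.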
2.6. Object tracking algorithm based on particle filter and convolutional neural network
Particle filtering is a recursive Bayesian filtering algorithm that uses the sequential Monte Carlo
importance sampling method to represent the posterior probability. The core idea is to approximate
the posterior probability distribution with a set of random particles. A particle filter has two main
components: a state transition model, which generates candidate samples from the previous particle
samples, and an observation model, which calculates the similarity between the candidate samples and
the target appearance model.
Given an observation sequence $y_{1:t} = [y_1, \ldots, y_t]$, the goal of the target tracking system is
to estimate the posterior probability density function of the target at time $t$, $p(x_t \mid y_{1:t})$.
According to Bayesian theory, the posterior probability density can be expressed as:
$$p(x_t \mid y_{1:t}) \propto p(y_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$$ [3]
In the above formula, $p(x_t \mid x_{t-1})$ and $p(y_t \mid x_t)$ are the dynamic model and the
observation model, respectively. The optimal target state $x_t^*$ at time $t$ can be obtained from the
maximum posterior probability:
$$x_t^* = \arg\max_{x_t} p(x_t \mid y_{1:t}) = \arg\max_{x_t^i} w_t^i$$ [3]
To improve computational efficiency, the algorithm tracks only the target position and size,
$x_t = (p_t^x, p_t^y, w_t, h_t)$, which denote the abscissa, ordinate, width, and height of the
target, respectively. The dynamic model between two consecutive frames is assumed to follow a
Gaussian distribution:
$$p(x_t \mid x_{t-1}) = \mathcal{N}(x_t;\, x_{t-1},\, \Sigma)$$ [3]
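The recursion above can be sketched as one predict-update step of a bootstrap particle filter. This is a minimal illustration under stated assumptions, not the implementation from [3]: the state is the four-element box (abscissa, ordinate, width, height), the dynamic model is a Gaussian random walk with a diagonal covariance, and `toy_likelihood` is a hypothetical stand-in for the CNN-based observation model. All function names, noise scales, and sample values are assumptions.

```python
import numpy as np

def predict(particles, sigma, rng):
    """Dynamic model: sample x_t ~ N(x_{t-1}, diag(sigma^2)) per particle."""
    return particles + rng.normal(0.0, sigma, size=particles.shape)

def update(particles, likelihood):
    """Observation model: weight each candidate by p(y_t | x_t), then normalize."""
    weights = np.array([likelihood(p) for p in particles], dtype=float)
    return weights / weights.sum()

def map_estimate(particles, weights):
    """Approximate the MAP state by the particle with the largest weight."""
    return particles[np.argmax(weights)]

def resample(particles, weights, rng):
    """Draw a new particle set in proportion to the weights."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

def toy_likelihood(state, observed=np.array([52.0, 40.0, 20.0, 30.0])):
    """Hypothetical observation model: Gaussian similarity to a measured box."""
    return np.exp(-0.5 * np.sum(((state - observed) / 5.0) ** 2))

rng = np.random.default_rng(0)
previous = np.array([50.0, 40.0, 20.0, 30.0])   # assumed state at time t-1
particles = np.tile(previous, (500, 1))          # 500 particles, 4 state values
sigma = np.array([4.0, 4.0, 1.0, 1.0])           # assumed per-dimension std. dev.

particles = predict(particles, sigma, rng)
weights = update(particles, toy_likelihood)
best = map_estimate(particles, weights)
resampled = resample(particles, weights, rng)    # particle set for the next frame
```

Restricting the state to four values keeps the per-frame cost proportional to the number of particles times the cost of one likelihood evaluation, which is presumably why the tracked state is limited to position and size.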
Convolutional neural networks), and the application of geometric deep learning in computer graphics,
robotics, video classification and other fields. Future research may focus on how to design more
efficient 3D CNNs so that deep learning can be applied to more complex 3D computer vision tasks.
Robustness and privacy protection are becoming increasingly important, and research will focus on
developing models and algorithms that can withstand adversarial attacks, as well as designing
privacy-preserving data processing and information transmission methods. However, differential
privacy mechanisms can reduce model accuracy on unusual data or long-tail distributions [9]. Future
research could examine the impact of differential privacy on individual samples and propose ways to
improve model accuracy on such data. Finally, lifelong learning will play an important role in the
directions above: models will continue to learn from new data and adapt to changing environments
[10]. Together, these advances will promote the application of computer vision in areas such as
autonomous driving, medical imaging, and intelligent security. Computer vision will therefore
contribute strong visual perception and understanding capabilities to the development of artificial
intelligence.
5. Conclusion
Computer vision is an important research direction in the field of artificial intelligence, which aims to
enable computers to perceive and understand visual information. Through the introduction of
technologies such as deep learning and neural networks, computer vision has made significant progress
in image and video processing, and is widely used in multiple fields such as autonomous driving, face
recognition, and medical imaging. In the future, the development of computer vision will focus on deep
learning, reinforcement learning, three-dimensional vision and scene understanding. These
developments will promote the application of computer vision technology in various fields, providing
more powerful visual perception and understanding capabilities for the development of artificial
intelligence.
References
[1] Krizhevsky, A., Sutskever, I. and Hinton, G. (2017) ImageNet Classification with Deep
Convolutional Neural Networks. Communications of the ACM. Accessed: July 16, 2023.
[2] Song, Y. (2021) Research on Sports Image Recognition and Tracking Based on Computer Vision
Technology. IEEE Conference Publication, IEEE Xplore. Accessed: July 16, 2023.
[3] Tian, Y. and Cao, D. (2022) Computer Vision Recognition and Tracking Algorithm Based on
Convolutional Neural Network. ResearchGate (researchgate.net). Accessed: July 16, 2023.
[4] He, K. et al. (2015) Deep Residual Learning for Image Recognition. arXiv:1512.03385.
Accessed: August 15, 2023.
[5] Simonyan, K. and Zisserman, A. (2015) Very Deep Convolutional Networks for Large-Scale
Image Recognition. arXiv:1409.1556. Accessed: July 16, 2023.
[6] Sun, S. et al. (2020) The Virtual Training Platform for Computer Vision. IEEE Conference
Publication, IEEE Xplore. Accessed: August 16, 2023.
[7] Zhang, Y., Wu, Y. and Chen, H. (2023) Research Progress of Visual Simultaneous Localization
and Mapping Based on Deep Learning. CNKI (cdut.edu.cn). Accessed: August 15, 2023.
[8] Chen, X and Guo, H (2023) A Futures Quantitative Trading Strategy Based on a Deep
Reinforcement Learning Algorithm Available at: A Futures Quantitative Trading Strategy