A Comprehensive Review of Computer Vision in Sports:
Open Issues, Future Trends and Research Directions
Banoth Thulasya Naik 1, Mohammad Farukh Hashmi 1 and Neeraj Dhanraj Bokde 2,*
Abstract: Recent developments in video analysis of sports and computer vision techniques have
achieved significant improvements to enable a variety of critical operations. To provide enhanced
information, such as detailed complex analysis in sports such as soccer, basketball, cricket, and
badminton, studies have focused mainly on computer vision techniques employed to carry out
different tasks. This paper presents a comprehensive review of sports video analysis for various
applications: high-level analysis such as detection and classification of players, tracking players or
balls in sports and predicting the trajectories of players or balls, recognizing the team’s strategies,
and classifying various events in sports. The paper further discusses published works on a variety of application-specific tasks in sports and presents the researchers' views regarding them. Since there is wide research scope for deploying computer vision techniques in various sports, some publicly available datasets related to particular sports are also discussed. The paper then reviews in detail some of the artificial intelligence (AI) applications, GPU-based workstations, and embedded platforms in sports vision. Finally, this review identifies research directions, probable challenges, and future trends in the area of visual recognition in sports.
This study presents a survey of detection, classification, tracking, trajectory prediction, and recognition of team strategies in various sports. In some sports, such as cycling and swimming, detection and tracking of the player is the only major requirement. As a result, as illustrated in Figure 1, this research classifies all sports into two categories, player-centered and ball-centered sports, with extensive analysis in Section 4.
[Figure 1: taxonomy of sports (individual vs. team; examples include unicycling, badminton, basketball, American football, futsal, cricket, and soccer) and computer-vision applications in sports: detection, jersey number recognition, ball possession, playfield extraction, highlight extraction, performance analysis of players, and shot/goal classification.]
• Kamble et al. [8] presented an exhaustive survey on ball tracking and categorically reviewed the techniques used, their performance, advantages, limitations, and their suitability for different sports.
• Shih [9] focused on the content analysis fundamentals (e.g., sports genre classification,
the overall status of sports video analytics). Additionally reviewed are SOTA studies
with prominent challenges observed in the literature.
• Beal et al. [10] explored AI techniques that have been applied to challenges within team
sports such as match outcome prediction, tactical decision making, player investments,
and injury prediction.
• Apostolidis et al. [11] suggested a taxonomy of the existing algorithms and presented
a systematic review of the relevant literature that shows the evolution of deep learning-
based video summarization technologies.
• Yewande et al. [12] presented a review to better understand the use of wearable technology in sports to improve performance and avoid injury.
• Rana et al. [13] offered a thorough overview of the literature on the use of wearable
inertial sensors for performance measurement in various sports.
The rest of this paper is organized as follows. Section 2 provides statistical details of research in sports. Section 3 presents playfield-extraction techniques for various sports, followed by a broader dimension that covers a wide range of sports, reviewed in Section 4. Some of the available datasets for various sports and embedded platforms are reviewed in Sections 5 and 6. Section 7 presents various application-specific tasks in the field of sports vision. Section 8 covers potential research directions, as well as different challenges to be overcome in sports studies. Finally, Section 9 concludes with the final considerations.
[Figure: distribution of reviewed studies across sports — Basketball 15%, Cricket 13%, Tennis 9%, Volleyball 6%, Badminton 4%, among others.]
[Figure: background subtraction — the current frame is compared with a background model and thresholded (T) to produce a foreground mask.]
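The thresholding step in this pipeline can be sketched in a few lines (a minimal sketch assuming a static background model; the default threshold T = 30 is illustrative, not a value from the cited studies):

```python
import numpy as np

def foreground_mask(current_frame, background, T=30):
    """Label pixels whose absolute difference from the background
    model exceeds the threshold T as foreground (255), else 0."""
    diff = np.abs(current_frame.astype(np.int16) - background.astype(np.int16))
    # a pixel is foreground if any channel deviates by more than T
    return (diff.max(axis=-1) > T).astype(np.uint8) * 255
```

In practice the background model is updated over time (e.g., a running average) rather than held fixed as here.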
Researchers have used a single dominant color for detecting the playfield. Accordingly, some studies have utilized image features that are insensitive to illumination by transforming the images from RGB space to HSI [15–17], YCbCr [18], or normalized RGB [19–21].
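As a concrete illustration, normalized RGB can be computed with a few lines of NumPy; the green-dominance threshold `g_min` below is an illustrative value for a grass playfield, not one taken from the cited studies:

```python
import numpy as np

def normalized_rgb(img):
    """Normalized RGB: divide each channel by the per-pixel channel sum,
    so that uniform brightness changes cancel out."""
    img = img.astype(np.float64)
    s = img.sum(axis=-1, keepdims=True)
    s[s == 0] = 1.0  # avoid division by zero on black pixels
    return img / s

def dominant_green_mask(img, g_min=0.4):
    """Playfield pixels are those whose normalized green component dominates."""
    n = normalized_rgb(img)
    return (n[..., 1] > g_min) & (n[..., 1] > n[..., 0]) & (n[..., 1] > n[..., 2])
```

Because each pixel is divided by its own channel sum, a shadowed patch of grass produces nearly the same normalized values as a sunlit one, which is exactly the illumination insensitivity the cited color spaces aim for.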
For a precise capture of the movements of the players, tracking the ball and actions of
referees on the field or court, it is necessary to calibrate the camera [5,8] and also to use an
appropriate number of cameras to cover the field. Though some algorithms are capable of
tracking the players, some other objects also need to be tracked in dynamically complex
situations of interest for detailed analysis of the events and extraction of the data of the
subject of interest. Reference [22] presented an approach to extract the playing field and
track the players and ball using multiple cameras in soccer video. In [23,24], an architecture
was presented, which uses single (Figure 6a) and multi-cameras (Figure 6c) to capture a
clear view of players and ball in various challenging and tricky situations such as severe
occlusions and the ball being missing from the frames. To estimate player trajectories and classify teams, a bird's-eye view of the field is presented in [25,26] to capture the players precisely, as shown in Figure 6b. Various camera positions for capturing the entire field are presented in [27,28] to detect and track the players/ball and estimate the positions of the players.
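Calibrating a single fixed camera against the field often reduces to estimating a homography between the image plane and the ground plane from a few known landmarks (corner flags, line intersections). A minimal Direct Linear Transform sketch follows; the pitch dimensions in the test are illustrative:

```python
import numpy as np

def fit_homography(src_pts, dst_pts):
    """Direct Linear Transform: fit the 3x3 homography H such that
    dst ~ H @ src from four or more point correspondences."""
    A = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # the homography is the null vector of A (smallest singular vector)
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def to_field(H, pt):
    """Project an image point to field coordinates (divide by scale w)."""
    u, v, w = H @ np.array([pt[0], pt[1], 1.0])
    return u / w, v / w
```

Once H is known, a tracked player's foot position in image coordinates can be mapped to metric field coordinates, which is what trajectory and formation analyses operate on.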
Appl. Sci. 2022, 12, 4429 8 of 49
Figure 6. Camera placements in the playfield. (a) Ceiling-mounted camera [23]. (b) Bird's-eye view of the field [25,26]. (c) Multiple cameras placed to cover the complete playfield [27].
Figure 7. Background-labeled samples from the dataset (a) playfield lines, (b) advertisements,
(c) non-playfield region.
4. Literature Review
This section discusses the traditional computer-vision methods implemented for major application areas in sports (such as detection, event classification/recognition, tracking, and trajectory prediction) investigated by researchers, along with their significant limitations.
4.1. Basketball
Basketball is a sport played between two teams of five players each. The objective is to score more points than the opponent. The sport involves several activities with the ball, such as passing, throwing, bouncing, batting, or rolling the ball from one player to another. Physical contact with an opponent may be a foul if the contact impedes the player's desired movement. Advances in computer vision techniques have enabled fully automated systems to replace the manual analysis of basketball. Recognizing players' actions and classifying events [29–31] in basketball videos helps to analyze player performance. Player/ball detection and tracking in
basketball videos are carried out in [32–37] but fail in assigning specific identification to
avoid identity switching among the players when they cross. By estimating the pose of the
player, the trajectory of the ball [38,39] is estimated from various distances to the basket.
By recognizing and classifying the referee’s signals [40], player behavior can be assessed
and highlights of the game can be extracted [41]. The behavior of a basketball team [42] can be characterized by the dynamics of space creation presented in [43–48], which are counteracted by the defensive plays presented in [49]. By detecting
the specific location of the player and ball in the basketball court, the player movement
can be predicted [50] and the ball trajectory [51–53] can be generated in three dimensions
which is a complicated task. It is also necessary to study the extraction of basketball players' shooting motion trajectories, combined with image-feature analysis of the shooting action, to reconstruct and quantitatively track those trajectories [54–57]. However, it is difficult to analyze the game data for each play, such as ball tracking or the motion of the players, because the game situation changes rapidly and the structure of the data is complicated. Therefore, real-time gameplay analysis is necessary [58]. Table 2 summarizes various proposed methodologies used to address challenging tasks in basketball, including their limitations.
Table 2. Studies in Basketball.

[31] Problem: Recognizing actions of basketball players using image recognition techniques. Method: Bi-LSTM Sequence2Sequence. Performance: Spearman rank-order correlation coefficient 0.921, Kendall rank-order correlation coefficient 0.803, Pearson linear correlation coefficient 0.932, and root mean squared error 1.03. Limitations/remarks: The methodology failed to recognize difficult actions, which reduces accuracy; a deep convolutional neural network could improve action-recognition accuracy.

[54] Problem: Multi-future trajectory prediction in basketball. Method: Conditional Variational Recurrent Neural Networks (RNN), TrajNet++. Performance: Tested on the Average Displacement Error and Final Displacement Error metrics; the methodology is robust when these remain below 7.01 and 10.61. Limitations/remarks: Fails to predict trajectories in uncertain and complex scenarios; since the behavior of the ball and players is dynamic, belief maps cannot steer future positions. Training the model with a dataset of different events can rectify the prediction failures.

[58] Problem: Predicting the line-up performance of basketball players by analyzing the situation on the field. Method: RNN + NN. Performance: At the point guard (pg) position 4 candidates were detected, and at the center (c) position 3 candidates; the total scores of the pg candidates are 13.67, 12.96, 13.42, and 10.39, and those of the c candidates are 10.21, 14.08, and 13.48.

[32] Problem: Multiplayer tracking in basketball videos. Method: YOLOv3 + Deep-SORT, Faster-RCNN + Deep-SORT, YOLOv3 + DeepMOT, Faster-RCNN + DeepMOT, JDE. Performance: Faster-RCNN provides better accuracy than YOLOv3 among the baseline detectors; the Joint Detection and Embedding (JDE) method performs best in tracking accuracy and computing speed among the multi-object tracking methods. Limitations/remarks: Handling severe occlusions and improving detection precision would improve accuracy and computation speed; adopting frame-extraction methods could achieve comprehensive performance in speed and accuracy and may be an alternative solution.

[40] Problem: Recognizing referee signals from real-time videos of a basketball game. Method: HOG + SVM, LBP + SVM. Performance: Achieved 95.6% accuracy for referee-signal recognition using local binary pattern features and SVM classification. Limitations/remarks: In a noisy environment, with significant occlusion, an unusual viewing angle, and/or variable gestures, performance is not consistent; detecting the jersey color and eliminating all other detected elements in the frame could improve recognition accuracy.

[30] Problem: Event recognition in basketball videos. Method: CNN. Performance: mAP for group activity recognition is 72.1%. Limitations/remarks: The model recognizes global movement in the video; also recognizing local movements could improve accuracy.

[59] Problem: Analyzing the behavior of the player. Method: CNN + RNN. Performance: Achieved 76.5% accuracy for four types of actions in basketball videos. Limitations/remarks: Gives lower accuracy for actions such as passing and fouling, and lower recognition and prediction accuracy on the test dataset than on the validation dataset.

[33] Problem: Tracking ball movements and classifying players in a basketball game. Method: YOLO + Joy2019. Performance: Jersey-number recognition precision of 74.3%; player recognition recall of 89.8%. Limitations/remarks: YOLO confuses an overlapped image for a single player; in the subsequent frame, the tracking ID of the overlapped player is exchanged, associating wrong player information with the identified box.

[29] Problem: Event classification in basketball videos. Method: CNN + LSTM. Performance: Average accuracy of 60.96% using a two-stage event classification scheme. Limitations/remarks: Performance could be improved by introducing information such as individual player pose detection and player location detection.

[49] Problem: Classification of different defensive strategies of basketball players, particularly when they deviate from their initial defensive action. Method: KNN, Decision Trees, SVM. Performance: Achieved 69% classification accuracy for automatic defensive-strategy identification. Limitations/remarks: Considered only the two defensive strategies 'switch' and 'trap'; alternative methods of labeling large spatio-temporal datasets would lead to better results. Future research may also consider other defensive strategies such as pick-and-roll and pick-and-pop.

[38] Problem: Basketball trajectory prediction based on real data and generation of new trajectory samples. Method: BLSTM + MDN. Performance: Performed well in convergence rate and final AUC (91%), showing that deep learning models outperform conventional models (e.g., GLM, GBM). Limitations/remarks: Time-series prediction needs to be considered to improve accuracy; accounting for factors such as player cooperation and defense when predicting NBA player positions could improve the model.

[51] Problem: Generating basketball trajectories. Method: GRU-CNN. Performance: Validated against a hierarchical policy network (HPN) with ground truth and 3 baselines. Limitations/remarks: Failed on the trajectory of a three-dimensional basketball match.

[41] Problem: Score detection and highlights-video generation in basketball videos. Method: BEI + CNN. Performance: Automatically analyzes the basketball match, detects scoring, and generates highlights; achieved accuracy, precision, recall, and F1-score of 94.59%, 96.55%, 92.31%, and 94.38%. Limitations/remarks: Computation speed of only 5 frames per second, so it cannot be deployed in a real-time basketball match.

[34] Problem: Multi-person event recognition in basketball videos. Method: BLSTM. Performance: Event classification and event detection achieved mean average precision of 51.6% and 43.5%, respectively. Limitations/remarks: A high-resolution dataset could improve the performance of the model.

[44] Problem: Player behavior analysis. Method: RNN. Performance: Achieved 80% accuracy over offensive strategies. Limitations/remarks: Fails on many factors, such as the complexity of interactions, the distinctiveness and diversity of target classes, and extrinsic factors such as reactions to defense, unexpected events (e.g., fouls), and consistency of execution.

[39] Problem: Prediction of the 3-point shot in a basketball game. Method: RNN. Performance: Evaluated in terms of AUC, achieving 84.30%. Limitations/remarks: Fails in the case of high ball velocity and noisy motion data.
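The identity switches reported for several trackers in Table 2 originate in the frame-to-frame association step. The toy sketch below shows plain greedy IoU association, i.e., the bare mechanism without the Kalman prediction and appearance embeddings that SORT and Deep SORT add precisely to reduce such switches; the box format and threshold are illustrative:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

class IouTracker:
    """Greedy IoU association: each new detection inherits the ID of the
    best-overlapping track from the previous frame, else gets a new ID."""
    def __init__(self, min_iou=0.3):
        self.min_iou = min_iou
        self.tracks = {}   # id -> last seen box
        self.next_id = 0

    def update(self, detections):
        assigned, free = {}, dict(self.tracks)
        for det in detections:
            best = max(free, key=lambda t: iou(free[t], det), default=None)
            if best is not None and iou(free[best], det) >= self.min_iou:
                assigned[best] = det   # continue an existing track
                del free[best]
            else:
                assigned[self.next_id] = det   # start a new track
                self.next_id += 1
        self.tracks = assigned
        return assigned
```

When two players cross, their boxes overlap both candidate tracks almost equally, and this greedy rule can pick the wrong one, which is exactly the ID-switching failure mode noted for [32,33].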
4.2. Soccer
Soccer is played with a football by two teams of eleven players, each competing to deliver the ball into the opposing team's goal. Players confuse opponents by changing their speed or direction unexpectedly. Because teammates wear the same jersey color, players look almost identical and frequently contest possession of the ball, which leads to severe occlusions and tracking ambiguities. In such a case, a jersey number must be detected to
recognize the player [60]. Accurate tracking [61–72] by detection [73–76] of multiple soccer
players as well as the ball in real-time is a major challenge to evaluate the performance of the
players, to find their relative positions at regular intervals, and to link spatiotemporal data
to extract trajectories. The systems which evaluate the player [77] or team performance [78]
have the potential to understand the game’s aspects, which are not obvious to the human
eye. These systems are able to evaluate the activities of players successfully [79] such as the
distance covered by players, shot detection [80,81], the number of sprints, player’s position,
and their movements [82,83], the player's relative position with respect to other players, possession [84] of the soccer ball and motion/gesture recognition of the referee [85], predicting
player trajectories for shot situations [86]. The generated data can be used to evaluate
individual player performance, occlusion handling [21] by detecting the position of the player [87], action recognition [88], predicting and classifying the passes [89–91], key event
extraction [92–101], tactical performance of the team [102–106], and analyzing the team’s
tactics based on the team formation [107–109], along with generating highlights [110–113].
Table 3 summarizes various proposed methodologies to resolve various challenging tasks
in soccer with their limitations.
Table 3. Studies in Soccer.

[71] Problem: Player and ball detection and tracking in soccer. Method: YOLOv3 and SORT. Performance: Tracking accuracy of 93.7% on the multiple-object tracking accuracy metric, with a detection speed of 23.7 FPS and a tracking speed of 11.3 FPS. Limitations/remarks: Effectively handles challenging situations such as partial occlusions and players or the ball reappearing after a few frames, but fails when players are severely occluded.

[72] Problem: Player, referee, and ball detection and tracking by jersey color recognition in soccer. Method: DeepPlayerTrack. Performance: Tracking accuracy of 96% and 60% on the MOTA and GMOTA metrics, respectively, with a detection speed of 23 FPS. Limitations/remarks: When a player with the same jersey color is occluded, the ID of the player is switched.

[77] Problem: Tracking soccer players to evaluate the number of goals scored by a player. Method: Machine learning and deep reinforcement learning. Performance: Player-tracking mAP of 74.6%. Limitations/remarks: Failed to track the ball at critical moments, such as passing at the beginning and shooting, and failed to overcome the identity-switching problem.

[94] Problem: Extracting ball events to classify the player's passing style. Method: Convolutional Auto-Encoder. Performance: Achieved 76.5% accuracy for 20 players. Limitations/remarks: Concatenating the auto-encoder with extreme learning machine techniques would improve event-classification performance.

[101] Problem: Detecting events in soccer. Method: Variational Auto-Encoder and EfficientNet. Performance: F1-score of 95.2% on event images and recall of 51.2% on images not related to soccer at a threshold value of 0.50. Limitations/remarks: A deep extreme learning machine employing the auto-encoder technique may enhance event-detection accuracy.

[82] Problem: Action spotting in soccer video. Method: YOLO-like encoder. Performance: mAP of 62.5%.

[78] Problem: Team performance analysis in soccer. Method: SVM. Performance: Prediction models achieved an overall accuracy of 75.2% in predicting the correct segment and the likelihood of the team making a successful attempt to score a goal on the dataset used. Limitations/remarks: Failed to identify the players that are more frequently involved in match events ending with a scoring attempt (a 'SHOT' at goal), which could assist sports analysts and team staff in developing strategies suited to an opponent's playing style.

[85] Problem: Motion recognition of assistant referees in soccer. Method: AlexNet, VGGNet-16, ResNet-18, and DenseNet-121. Performance: Achieved 97.56% accuracy with real-time operation. Limitations/remarks: Immune to illumination variations caused by weather conditions, but fails in the case of occlusions between referees and players.

[106] Problem: Predicting the attributes (loss or win) in soccer. Method: ANN. Performance: Predicts 83.3% for the winning case and 72.7% for the loss case.

[109] Problem: Team tactics estimation in soccer videos. Method: Deep Extreme Learning Machine (DELM). Performance: Precision, recall, and F1-score of 87.6%, 88%, and 87.8%, respectively. Limitations/remarks: Team tactics are estimated from the relationship between the two teams' tactics and ball possession; the method fails to estimate the team formation at the beginning of the game.

[88] Problem: Action recognition in soccer. Method: CNN-based Gaussian-weighted event-based action classifier architecture. Performance: F1-score of 52.8% for 6 classes. Limitations/remarks: Classifying the actions into subtypes could enhance action-recognition accuracy.

[62] Problem: Detection and tracking of the ball in soccer videos. Method: VGG-MCNN. Performance: Achieved an accuracy of 87.45%. Limitations/remarks: Could not detect the ball when it moved out of play, into the stands, under partial occlusion by players, or when the ball color matched a player's jersey.

[95] Problem: Automatic event extraction for soccer videos based on multiple cameras. Method: YOLO. Performance: The U-encoder designed for feature extraction performs better in accuracy than fixed feature extractors. Limitations/remarks: Player trajectories need to be analyzed to carry out a tactical analysis of the team.

[80] Problem: Shot detection in a football game. Method: MobileNetV2. Performance: MobileNetV2 performed better than other feature-extractor methods. Limitations/remarks: Extracting features with MobileNetV2 and then applying 3D convolution on the extracted per-frame features could improve detection performance.

[86] Problem: Predicting player trajectories for shot situations. Method: LSTM. Performance: F1-score of 53%. Limitations/remarks: Failed to predict player trajectories when players confuse each other by changing their speed or direction unexpectedly.

[108] Problem: Analyzing the team formation in soccer and formulating several design goals. Method: OpenCV used for back-end visualization. Performance: The formation-detection model achieved a maximum accuracy of 96.8%. Limitations/remarks: Limited scalability, as it cannot be used on high-resolution soccer videos; results are bounded to a particular match and cannot evaluate tactical schemes across different games; real-time team-formation visualization is another drawback, as it limits the visualization of non-trivial spatial information. Applying state-of-the-art tracking algorithms could substantially improve tactics analysis.

[60] Problem: Player recognition with jersey number recognition. Method: Spatial Constellation + CNN. Performance: Achieved 82% accuracy by combining the spatial constellation and CNN models. Limitations/remarks: Failed to handle players that are not visible for certain periods; predicting the positions of invisible players could improve the quality of the spatial-constellation features.

[89] Problem: Evaluating and classifying the passes in a football game. Method: SVM. Performance: Achieves 90.2% accuracy during a football match. Limitations/remarks: To determine the quality of each pass, factors such as execution of the pass in a particularly difficult situation, its strategic value, and its riskiness need to be included; rating passes in sequence requires considering the sequence of passes during which the player possesses the ball.

[84] Problem: Detecting dribbling actions and estimating positional data of players in soccer. Method: Random forest. Performance: Achieved an accuracy of 93.3%. Limitations/remarks: Fails to evaluate tactical strategies.

[103] Problem: Team tactics estimation in soccer videos. Method: SVM. Performance: Precision, recall, and F1-score of 98%, 97%, and 98%, respectively. Limitations/remarks: Fails when audiovisual features cannot capture quick changes in the team's tactics.

[93] Problem: Analyzing past events for non-obvious insights in soccer. Method: k-NN, SVM. Performance: Heatmap generation was used to extract pass-location features, achieving 87% accuracy in the classification task. Limitations/remarks: Incorporating temporal information could improve classification accuracy and offer specific insights into situations.

[61] Problem: Tracking the players in soccer videos. Method: HOG + SVM. Performance: Player detection accuracy of 97.7%; classification accuracy with k-NN of 93% for 15 classes.

[79] Problem: Action classification in soccer videos. Method: LSTM + RNN. Performance: Classification rate of 92% on four types of activities. Limitations/remarks: Extracting the features of more activities could improve the classification rate.
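Several of the pipelines above separate the two teams by jersey color before tracking or formation analysis. A minimal 2-means sketch over per-player mean RGB colors follows (the inputs are assumed to be mean colors computed from detector crops; initializing the centers with the two most distant colors is an illustrative choice):

```python
import numpy as np

def split_teams(jersey_colors, iters=20):
    """Cluster per-player mean jersey colors (N x 3, RGB) into two
    teams with a tiny 2-means loop; returns a 0/1 label per player."""
    colors = np.asarray(jersey_colors, dtype=float)
    # initialize centers with the two mutually most distant colors
    d0 = np.linalg.norm(colors[:, None] - colors[None], axis=-1)
    i, j = np.unravel_index(d0.argmax(), d0.shape)
    centers = colors[[i, j]]
    labels = np.zeros(len(colors), dtype=int)
    for _ in range(iters):
        # assign each player to the nearest center, then recompute centers
        d = np.linalg.norm(colors[:, None] - centers[None], axis=-1)
        labels = d.argmin(axis=1)
        for k in (0, 1):
            if (labels == k).any():
                centers[k] = colors[labels == k].mean(axis=0)
    return labels
```

This simple color split is also what breaks down in the same-jersey occlusion cases noted for [72]: when two players of one team overlap, color alone carries no identity information.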
4.3. Cricket
In many aspects of cricket as well, computer vision techniques can effectively replace manual analysis. A cricket match has many observable elements, including batting shots [114–121], bowling performance [122–127], the number of runs or the score depending on ball movement, detecting and estimating the trajectory of the ball [128], decision making on the placement of players' feet [129], outcome classification to generate commentary [130,131], and detecting umpire decisions [132,133]. Predicting an individual cricketer's performance [134,135] based upon his past record can be critical in team selection for international competitions. Such processes are highly subjective and usually require considerable expertise and negotiation in decision-making. By predicting the results of cricket
matches [136–140] from factors such as the toss decision, home ground, player fitness, player performance criteria [141], and other dynamic strategies, the winner can be estimated. The video
summarization process gives a compact version of the original video for ease in managing
the interesting video contents. Moreover, the video summarization methods capture the
interest of the viewer by capturing exciting events from the original video [142,143]. Table 4
summarizes various proposed methodologies with their limitations to resolve various
application issues in cricket.
Table 4. Studies in Cricket.

[117] Problem: Shot classification in cricket. Method: CNN + Gated Recurrent Unit. Performance: Precision, recall, and F1-score of 93.40%, 93.10%, and 93% for 10 types of shots. Limitations/remarks: Incorporating the unorthodox shots played in T20 into the dataset may improve testing accuracy.

[125] Problem: Detecting the action of the bowler in cricket. Method: VGG16-CNN. Performance: Evaluated in terms of precision, recall, and F1-score; the maximum average accuracy achieved is 98.6% for 13 classes (13 types of bowling actions). Limitations/remarks: Training the model with a dataset of wrong actions could improve detection accuracy.

[129] Problem: Movement detection of the batsman in cricket. Method: Deep-LSTM. Performance: Evaluated in terms of mean square error, achieving a minimum error of 1.107.

[142,143] Problem: Cricket video summarization. Method: Gated Recurrent Neural Network + hybrid Rotation Forest-Deep Belief Networks; YOLO. Performance: Precision, F1-score, and accuracy of 96.82%, 94.83%, and 96.32% for four classes; YOLO achieved precision, recall, and F1-score of 97.1%, 94.4%, and 95.7% for 8 classes. Limitations/remarks: Decision-tree classifier performance is low due to the huge number of trees, so a small change in the decision tree may improve prediction accuracy; extreme learning machines face overfitting, which can be overcome by removing duplicate data from the dataset.

[134] Problem: Prediction of individual player performance in cricket. Method: Efficient machine learning techniques. Performance: Classification accuracy of 93.73%, which is good compared with traditional classification algorithms. Limitations/remarks: Replacing machine learning with deep learning techniques may improve prediction performance even under different environmental conditions.

[114] Problem: Classification of different batting shots in cricket. Method: CNN. Performance: Average precision of 0.80, recall of 0.79, and F1-score of 0.79. Limitations/remarks: To improve classification accuracy, the deep learning algorithm should be replaced with a better neural network.

[130] Problem: Outcome classification to create automatic commentary generation in sports. Method: CNN + LSTM. Performance: Maximum of 85% training accuracy and 74% validation accuracy. Limitations/remarks: Due to the unavailability of a standard dataset for ball-by-ball outcome classification in cricket, the accuracy is not up to the mark; better accuracy would enable automatic commentary generation.

[132] Problem: Detecting the third-umpire decision and an automated scoring system in a cricket game. Method: Deep Convolutional Neural Network (DCNN) and CNN + Inception V3. Performance: 94% accuracy with the DCNN and 100% with Inception V3 for the classification of umpire signals to automate the scoring system of cricket. Limitations/remarks: The results obtained are sufficient to build an automated umpiring system based on computer vision and artificial intelligence.

[133] Problem: Classification of cricket bowlers based on their bowling actions. Method: CNN. Performance: Test-set accuracy of 93.3%, demonstrating its classification ability. Limitations/remarks: The model lacks data for detecting spin bowlers; as the dataset is confined to left-arm bowlers, it misclassifies right-arm bowlers.

[115] Problem: Recognition of various batting shots in a cricket game. Method: Deep-CNN. Performance: Recognizes a shot being played with 90% accuracy. Limitations/remarks: As the model depends on the frames per second of the video, it fails to recognize shots when the frame rate increases.

[131] Problem: Automatic highlight generation in the game of cricket. Method: CNN + SVM. Performance: Mean average precision of 72.31%. Limitations/remarks: The method lacks clear metrics to evaluate false positives in highlights.

[133] Problem: Umpire pose detection and classification in cricket. Method: SVM. Performance: VGG19-Fc2 testing accuracy of 78.21%. Limitations/remarks: Classification and summarization techniques can minimize false positives and false negatives.

[116] Problem: Activity recognition for quality assessment of batting shots. Method: Decision trees, k-nearest neighbours, and SVM. Performance: Identifies 20 classes of batting shots with an average F1-score of 88% based on the recorded movement data. Limitations/remarks: To assess a player's batting caliber, further aspects of batting need to be considered, i.e., the position of the batsman before playing a shot and the batting method for a particular bowling type could be modeled.

[136,137] Problem: Predicting the outcome of a cricket match. Method: k-NN, naïve Bayesian, SVM, and random forest. Performance: Achieved an accuracy of 71% on the statistics of 366 matches. Limitations/remarks: Imbalance in the dataset is one cause of the lower accuracy; deep learning methods trained on a dataset with added features may give promising results.

[124] Problem: Performance analysis of the bowler. Method: Multiple regression. Performance: Variation in ball speed has a feeble influence on bowling performance (p-value of 0.069); the variance ratio of the regression equation to that of the residuals (F-value) is 3.394, with a corresponding p-value of 0.015.

[135] Problem: Predicting the performance of the player. Method: Multilayer perceptron neural network. Performance: Accuracy of 77% on batting performance and 63% on bowling performance.
4.4. Tennis
Tennis has gained huge popularity worldwide. The game needs meticulous analysis to reduce human errors and to extract statistics from its visual feed. Automated ball and player tracking belongs to this class of systems and requires sophisticated algorithms. The primary data for tennis are obtained from ball and player tracking systems such as HawkEye [144,145] and TennisSense [28,146]. The data from these systems can be used for detecting and tracking the ball and players [147–150], visualizing an overall tennis match [151,152], predicting the trajectories of ball landing positions [153–155], player activity recognition [156–158], analyzing the movements of the player and ball [159], analyzing player behavior [160], predicting the next shot movement [161], and real-time tennis swing classification [162]. Table 5 summarizes various proposed methodologies for resolving challenging tasks in tennis, together with their limitations.
Appl. Sci. 2022, 12, 4429 16 of 49
Studies in Tennis. Each entry lists the reference, problem statement, proposed methodology, precision/performance characteristics, and limitations/remarks.
[145] Monitoring and analyzing the tactics of tennis players. Method: YOLOv3. Results: the model achieved an mAP of 90% at 13 FPS on high-resolution images. Remarks: using a lightweight backbone for the detection modules can improve processing speed.
[158] Player action recognition in tennis. Method: temporal deep belief network (unsupervised learning model). Results: the recognition rate is 94.72%. Remarks: if two different movements are similar, the model fails to recognize the current action.
[162] Tennis swing classification. Method: SVM, neural network, k-NN, random forest, decision tree. Results: a maximum classification accuracy of 99.72% was achieved using the neural network with a recall of 1; the second-highest accuracy of 99.44% was achieved using k-NN with a recall of 0.98. Remarks: if the play styles of the players are different but the patterns are the same, the models fail to classify the current swing direction.
[156] Player activity recognition in a tennis game. Method: long short-term memory (LSTM). Results: the average accuracy of player activity recognition based on the historical LSTM model was 0.95, and that of the typical LSTM model was 0.70. Remarks: the model lacks real-time and online learning ability and requires a large computing time in the training stage.
[147] Automatic detection and classification of change of direction from player tracking data in a tennis game. Method: random forest algorithm. Results: among all the proposed methods, model 1 had the highest F1-score of 0.801, as well as the smallest false-negative classification rate (3.4%) and an average accuracy of 80.2%. Remarks: in the case of non-linear regression analysis, the classification performance of the proposed model is not up to the mark.
[153] Prediction of shot location and type of shot in a tennis game. Method: generative adversarial network (GAN; semi-supervised model). Results: performance is measured by the minimum distance between the predicted and ground-truth shot locations. Remarks: the model's performance deviates for different play styles, as it is trained on a limited player dataset.
[159] Analyzing individual tennis matches by capturing spatio-temporal data for player and ball movements. Method: a player and ball tracking system such as HawkEye is used for data extraction. Results: generation of 1-D space charts of patterns and point outcomes for analyzing player activity. Remarks: the model's performance deviates across matches, as it was trained on only a limited number of tennis matches.
[157] Action recognition in tennis. Method: three-layered LSTM. Results: when trained on the entire dataset, classification accuracy improves from 84.10 to 88.16% for players of mixed abilities, from 81.23 to 84.33% for amateurs, and from 87.82 to 89.42% for professionals. Remarks: detection accuracy can be increased by incorporating spatio-temporal data and combining the action recognition data with statistical data.
[161] Shot prediction and player behavior analysis in tennis. Method: player and ball tracking systems such as HawkEye for data extraction, with a dynamic Bayesian network for shot prediction. Results: by combining the factors (outside, left top, right top, right bottom) with speed, start location, and player movement, the assessment achieved a better result of 74% AUC. Remarks: as the model is trained on limited data (only elite players), it cannot be applied to ordinary players across multiple tournaments.
[148] Ball tracking in tennis. Method: two-layered data association. Results: precision, recall, and F1-score are 84.39%, 75.81%, and 79.87% for Australian Open matches and 82.34%, 67.01%, and 73.89% for U.S. Open matches. Remarks: the proposed method cannot handle multi-object tracking; integrating audio information could facilitate high-level analysis of the game.
[160] Highlight extraction from racket sports videos based on human behavior analysis. Method: SVM. Results: the proposed algorithm achieved an accuracy of 90.7% for tennis videos and 87.6% for badminton videos. Remarks: the algorithm fails to recognize the player, as the player is a deformable object whose limbs move freely during actions.
4.5. Volleyball
In volleyball, two teams of six players each are placed on either side of a net. Each
team attempts to ground a ball on the opposite team’s court and to score points under the
defined rules. So, detecting and analyzing the player activities [163–165], detecting play
patterns and classifying tactical behaviors [166–169], predicting league standings [170],
detecting and classifying spiking skills [171,172], estimating the pose of the player [173],
tracking the player [174], tracking the ball [175], etc., are the major aspects of volleyball
analysis. The prediction of the ball trajectory [59] in a volleyball game by observing the motion of the setter has also been studied. Table 6 summarizes various proposed methodologies for resolving challenging tasks in volleyball, together with their limitations.
Studies in Volleyball. Each entry lists the reference, problem statement, proposed methodology, precision/performance characteristics, and limitations/remarks.
[173] Group activity recognition by tracking players. Method: CNN + Bi-LSTM. Results: the model achieved an accuracy of 93.9%. Remarks: the model fails to track players if the video is taken from a dynamic camera; temporal action localization can improve tracking accuracy under severe occlusion.
[174] Recognizing and classifying players' behavior. Method: SVM. Results: a recognition rate of 98% for 349 correct samples.
[167] Classification of tactical behaviors in beach volleyball. Method: RNN + GRU. Results: prediction accuracies range from 37% for forecasting the attack and its direction to 60% for predicting success. Remarks: by employing a state-of-the-art method and training on a proper dataset with continuous positional data, it is possible to predict tactical behavior and set/match outcomes.
[175] Motion estimation for volleyball. Method: machine vision and a classical particle filter. Results: tracking accuracy of 89%. Remarks: replacing these methods with deep learning algorithms gives better results.
[168] Assessing the use of inertial measurement units for recognizing different volleyball actions. Method: k-NN, naïve Bayes, SVM. Results: unweighted average recall of 86.87%. Remarks: incorporating frequency-domain features can improve performance.
[59] Predicting the ball trajectory in a volleyball game by observing the motion of the setter. Method: neural network. Results: the method predicts the trajectory of the volleyball 0.3 s in advance, based on the setter's motion. Remarks: the method records a large error when predicting 3D body position data; this can be overcome by training state-of-the-art methods on properly annotated large datasets.
[164] Activity recognition in beach volleyball. Method: deep convolutional LSTM. Results: classification accuracy of 83.2%, superior to other classification algorithms. Remarks: instead of wearable devices, computer vision architectures can be used to classify player activities.
[170] Volleyball skills and tactics analysis. Method: ANN. Results: evaluated in terms of average relative error over 10 samples, achieving 0.69%.
[165] Group activity recognition in a volleyball game. Method: LSTM. Results: group activity recognition accuracy of 51.1%. Remarks: performance is poor because of the lack of hierarchical consideration of the individual and group activity dataset.
player/hockey ball, recognizing the actions of the player [177–179], estimating the pose
of the player [180], classifying and tracking the players of the same team or different
teams [181], referee gesture analysis [182,183] and hockey ball trajectory estimation are the
major aspects of hockey sport.
Ice hockey is a game similar to field hockey, with two teams of six players each, wearing skates and competing on an ice rink. All players aim to propel a vulcanized rubber disk, the puck, past a goal line and into a net guarded by a goaltender. Ice hockey is gaining huge popularity on international platforms due to its speed and frequent physical contact. Thus, detecting and tracking the player [184–186], estimating the pose of the player [187], classifying and tracking players of the same or different teams with distinct identities, tracking the ice hockey puck [188], and classifying puck possession events [189] are the major aspects of ice hockey analysis. Table 7 summarizes various proposed methodologies for resolving challenging tasks in hockey and ice hockey, together with their limitations.
Studies in Hockey. Each entry lists the reference, problem statement, proposed methodology, precision/performance characteristics, and limitations/remarks.
[176] Detecting the player in hockey. Method: SVM, Faster R-CNN, SSD, YOLO. Results: HD + SVM achieved the best accuracy, recall, and F1-score, with values of 77.24%, 69.23%, and 73.02%. Remarks: the model failed to detect players under occlusion.
[188] Localizing puck position and event recognition. Method: Faster R-CNN. Results: evaluated in terms of AUC, achieving 73.1%. Remarks: replacing the detection method with the YOLO series can improve performance.
[181] Identification of players in hockey. Method: ResNet + LSTM. Results: player identification accuracy of over 87% on the split dataset. Remarks: some jersey number classes, such as 1 to 4, are incorrectly predicted; diagonal numbers from 1 to 100 are falsely classified due to the small number of training examples.
[177] Activity recognition in a hockey game. Method: LSTM. Results: the proposed model recognizes activities such as free hits, goals, penalty corners, and long corners with an accuracy of 98%. Remarks: as the model is focused on spatial features, it does not distinguish free hits from long corners, which appear as similar patterns; including temporal features and incorporating an LSTM makes the model robust.
[180] Pose estimation and temporal action recognition in hockey. Method: VGG19 + LiteFlowNet + CNN. Results: a novel approach achieving 85% accuracy for action recognition. Remarks: the architecture is not robust to abrupt changes in the video, e.g., it fails to predict hockey sticks; activities such as a goal being scored or the puck location are not recognized.
[187] Action recognition in ice hockey using a player pose sequence. Method: CNN + LSTM. Results: performance is better on similar classes such as passing and shooting; achieved a 90% parameter reduction and an 80% floating-point reduction on the HARPET dataset. Remarks: as the number of LSTM hidden units increases, the number of parameters also increases, leading to overfitting and low test accuracy.
[178] Human activity recognition in hockey. Method: CNN + LSTM. Results: an F1-score of 67% for action recognition on the multi-labeled imbalanced dataset. Remarks: performance is poor because of the imbalanced dataset.
[179] Player action recognition in an ice hockey game. Method: CNN. Results: 65% accuracy; when similar actions are merged, accuracy rises to 78%. Remarks: pose estimation problems due to severe occlusions, motion blur caused by the speed of the game, and the lack of a proper training dataset all cause low accuracy.
4.7. Badminton
Badminton is one of the most popular racket sports, involving tactics, techniques, and precisely executed movements. Technology plays a key role in optimizing player training: it determines the movements of the player [190] during training and game situations, supporting action recognition [191–193], analysis of player performance [194], and detection and tracking of the shuttlecock [195–197]. Table 8 summarizes various proposed methodologies for resolving challenging tasks in badminton, together with their limitations.
Studies in Badminton. Each entry lists the reference, problem statement, proposed methodology, precision/performance characteristics, and limitations/remarks.
[195] Shuttlecock detection for a badminton robot. Method: Tiny YOLOv2 and YOLOv3. Results: compared with state-of-the-art methods, the proposed networks achieved good accuracy with efficient computation. Remarks: the method fails under different environmental conditions; as it uses a binocular camera to detect a 2D shuttlecock, it cannot detect the 3D shuttlecock trajectory.
[191] Automated badminton player action recognition in badminton games. Method: AlexNet + CNN, GoogleNet + CNN, and SVM. Results: recognition of badminton actions by the linear SVM classifier using local and global extractor methods is 82% for AlexNet and 85.7% for GoogleNet. Remarks: the architecture can be improved by end-to-end fine-tuning with a larger dataset on features extracted at different fully connected layers.
[192] Badminton activity recognition. Method: CNN. Results: nine different activities were distinguished: seven badminton strokes, displacement, and moments of rest; with accelerometer data, the CNN achieved 86% precision, and accuracy rises to 99% when gyroscope data are combined with accelerometer data. Remarks: computer vision techniques could be employed instead of sensors.
[193] Classification of badminton match images to recognize the different actions performed by athletes. Method: AlexNet, GoogleNet, and VGG-19 + CNN. Results: the GoogleNet model has the highest accuracy of the compared models; only two hit actions were falsely classified as non-hit actions. Remarks: the method classifies hit and non-hit actions and can be improved by classifying more actions across various sports.
[196] Tracking shuttlecocks in badminton. Method: an AdaBoost algorithm trained using the OpenCV library. Results: evaluated in terms of precision, achieving an average precision of 94.52% at 10.65 fps. Remarks: tracking accuracy could be enhanced by state-of-the-art AI algorithms.
[190] Tactical movement classification in badminton. Method: k-nearest neighbor. Results: the average accuracy of player position detection is 96.03 and 97.09% on the two halves of a badminton court. Remarks: exploiting application-specific properties, such as the length of frequent trajectories or the dimensions of the vector space, may improve classification performance.
4.8. Miscellaneous
Player detection and tracking are the major requirements in athletic sports such as running, swimming [198,199], and cycling. In sports such as table tennis [200], squash [201,202], and golf [203], ball detection and tracking and player pose detection [204] are challenging tasks. In ball-centric sports such as rugby, American football, handball, and baseball, ball/player detection [205–211] and tracking [212–221], analysis of player actions [23,222–227], event detection and classification [228–232], performance analysis of players [233–235], and referee identification and gesture recognition are the major challenges. Video highlight generation is a subclass of video summarization [236–239], which may itself be viewed as a subclass of sports video analysis. Table 9 summarizes various proposed methodologies for resolving challenging tasks in these sports, together with their limitations.
(Diagram: machine learning road map. Panels: data (training/validation/test splits); supervised learning (regression, Poisson regression); unsupervised learning (clustering and bottom-up lattice traversal of equivalence classes); ensemble learning (bagging, boosting, stacking); Q-learning; evaluation criteria (precision/recall, bias and variance, adjusted R index).)
Figure 8. Block diagram of the road map to machine learning architecture selection and training.
(Diagram: deep learning road map. Panels: network families (feedforward neural network, autoencoder, convolutional neural network, recurrent neural network, long short-term memory, gated recurrent unit, neuroevolution of augmenting topologies); optimizers (SGD, Momentum, Adam, Adagrad, AdaDelta, Nadam, RMSProp); training aids (learning rate schedules, batch normalization, batch size effects, curriculum learning, tools such as TensorBoard); evaluation for detection/classification, tracking, and trajectory tasks (accuracy, precision, recall, F1-score, area under curve, multi-object tracking (MOT) accuracy and MOT precision).)
Figure 9. Block diagram of the road map to deep learning architecture selection and training.
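The evaluation criteria named in these road maps, such as precision, recall, F1-score, and multi-object tracking (MOT) accuracy, reduce to simple arithmetic on error counts. A minimal Python sketch, assuming the per-sequence counts of true positives, false positives, false negatives, and identity switches have already been tallied by a detection-to-ground-truth matching step:

```python
def precision_recall_f1(tp, fp, fn):
    """Detection/classification metrics from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def mota(fn_total, fp_total, id_switches, gt_total):
    """Multi-Object Tracking Accuracy (CLEAR MOT definition):
    1 - (misses + false positives + identity switches) / ground-truth objects."""
    return 1.0 - (fn_total + fp_total + id_switches) / gt_total

# Example counts (illustrative only).
p, r, f = precision_recall_f1(tp=80, fp=10, fn=20)
acc = mota(fn_total=20, fp_total=10, id_switches=5, gt_total=100)
```

Note that MOTA can be negative when the error count exceeds the number of ground-truth objects, which is why papers report it alongside MOT precision.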
(Diagram panels: classification/detection (GoogleNet, YOLO series, Region-CNN, ResNet, Single Shot Detection, Fast-RCNN, DenseNet, RetinaNet, Faster-RCNN, DarkNet, SqueezeDet, Feature Pyramid Network, VGGNet, EfficientDet, Mask-RCNN, YOLOv4); tracking (TransTrack, PointTrack++, DEFT, MOTR, EagerMOT, monocular 3D tracker); trajectory prediction (Social GAN, OpenTraj, spatio-temporal graph transformer framework).)
Figure 10. Overview of deep learning algorithms of classification/detection, tracking and trajectory
prediction.
The parameters which are annotated in the ISSIA dataset relate to the positions of the
ball, player, and referee in each video from each camera. The images shown in Figure 11
are a few sample frames from the ISSIA dataset.
The parameters which are annotated in the TTNet dataset are the ball bouncing
moments, the ball hitting the net, and empty events. The images shown in Figure 12 are a
few sample frames from the TTNet dataset.
For the creation of the APIDIS dataset, videos were captured from seven cameras from
above and around the court. The events which are annotated in this dataset are player
positions, movements of referees, baskets, and the position of the ball. The images shown
in Figure 13 are a few samples from the APIDIS dataset.
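Whichever dataset is used, a common first step is grouping the position annotations by frame before feeding them to a detector or tracker. The sketch below is illustrative only: the CSV layout is an assumption for this example, not the actual release format of the ISSIA, TTNet, or APIDIS datasets, which each ship annotations in their own formats.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout, assumed for illustration:
# columns are frame index, object label (ball/player/referee), and x, y position.
SAMPLE = """frame,label,x,y
0,ball,512,384
0,player,100,200
1,ball,515,380
"""

def annotations_by_frame(csv_text):
    """Group position annotations by frame index into
    {frame: [(label, x, y), ...]}."""
    frames = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        frames[int(row["frame"])].append(
            (row["label"], float(row["x"]), float(row["y"])))
    return dict(frames)

frames = annotations_by_frame(SAMPLE)
```

A per-frame structure of this kind is what tracking evaluation tools typically expect for the ground-truth side of the matching step.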
Jetson TX1: quad-core 64-bit ARM CPU; NVIDIA Maxwell GPU with CUDA cores; 4 GB memory; 16 GB flash storage. Possible DL algorithms to implement: YOLOv2 and v3, Faster R-CNN.
Jetson TX2: dual-core Denver CPU and quad-core ARM A57; NVIDIA Pascal GPU with CUDA cores; 8 GB memory; 32 GB storage. Possible DL algorithms to implement: tiny YOLOv3, SSD, and tracking pipelines such as YOLOv3 + DeepSORT.
Jetson AGX Xavier: 64-bit 8-core ARM CPU; 512-core Volta GPU with tensor cores; 16 GB memory; 32 GB storage. Possible DL algorithms to implement: YOLOv4, YOLOR.
Latte Panda: Intel Cherry Trail quad-core CPU; no discrete GPU; 4 GB memory; 64 GB storage.
Raspberry Pi series: 64-bit quad-core ARM CPU; no discrete GPU; 1 GB memory; supports micro SD card storage.
Odroid XU4: Cortex-A7 octa-core CPU; no discrete GPU; 2 GB stacked memory; supports micro SD card storage.
Possible DL algorithms to implement on the Latte Panda, Raspberry Pi, and Odroid boards: YOLO, YOLOv2, SSD-MobileNet, etc.
A Field Programmable Gate Array (FPGA) has also been used in sports applications involving 3D motion capture, object movement analysis, image recognition, etc. Table 12 describes how different researchers performed various sports studies on hardware platforms such as FPGA and GPU-based devices and lists their results in terms of performance measures.
Modern artificial intelligence systems may not be good sparring partners, but they can be valuable as research tools. One of the most effective ways to improve is to learn from one's failures. The suggested method for improving playing ability is to review games, but how does one detect mistakes, and how does one come up with better alternatives? This challenge is addressed by artificial intelligence analysis tools such as AlphaGo [287–289], which provide a probability distribution over strong moves together with their assessment.
An application that uses AI contains a huge dataset of game performances and training-related information, backed by the knowledge of several coaches and sports scientists. Such applications act as an accumulated source of current knowledge for disseminating the latest techniques and tactics to professional coaches.
With the evolution of knowledge on any tactic or technique, the knowledge base of AI
is updated. The accumulated data can be used for training and educating sports coaches,
scientists, and also athletes, which in turn leads to improved performance.
Milestones in computer chess: in 1950, the first chess program was written; in 1970, the first all-computer championship was held in New York; in 1980, Edward Fredkin created the Fredkin Prize for computer chess; in 1985, a computer first gained an Elo rating greater than 2400 (2530); in 1988, a grandmaster first lost to a computer in a major tournament, and a chess computer first gained a grandmaster rating; in 1998, Deep Thought, the best computer at the time, was defeated by Kasparov.
In computers, Deep Blue’s rating which was 2700+ was surpassed by Deep Mind’s Al-
pha Zero with an estimated Elo 3600, which was developed by Google’s sibling DeepMind.
It was developed by a reinforcement learning technique called self-play. It took just 24 h to
achieve it and proves the capabilities of the machine.
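For reference, the Elo figures quoted above come from a simple update rule: a player's rating moves in proportion to the difference between the actual and expected game outcome. A minimal sketch of the standard Elo computation (the K-factor of 32 is one common choice, not mandated by the rating system):

```python
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32):
    """Return A's new rating after one game; score_a is 1 (win),
    0.5 (draw), or 0 (loss)."""
    return rating_a + k * (score_a - elo_expected(rating_a, rating_b))

# Two equally rated players: expected score 0.5, so a win moves
# the winner's rating up by k/2.
new_rating = elo_update(1500, 1500, 1.0)
```

The 400-point scale means a player rated 400 points higher is expected to score about ten times as often, which is why an estimated 3600 versus 2700+ implies near-total dominance.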
STATS SportVU [293] and ChyronHego TRACAB [294] are commercial technologies for player tracking in sports. Their primary application is tracking players in various sports, analyzing their performance, and assisting coaches with training. Figure 17 shows player position and pose estimation using commercial systems. SportVU is a computer vision technology that provides real-time optical tracking in various sports. It provides in-depth performance data for any team, such as tracking every player from both teams for comprehensive match coverage, collecting data for tactical analysis of the match, and highlighting performance deviations to reduce injuries in the game.
(Diagram: open issues grouped into player/ball/referee detection and tracking, pose estimation, and trajectory prediction, across sports including volleyball and hockey/ice hockey.)
To assess the batter’s caliber, certain aspects of batting need to be considered, i.e.,
position of batsman before playing a shot, and the method of batting shots for a particular
bowling type needs to be modeled [116]. Classification and summarization techniques
can minimize false positives and false negatives to detect and classify umpire poses [133].
Detecting various moments such as whether the ball hit the bat and precise detection of
the player and wicket keeper at the moment of run-outs, as shown in Figure 20a, is still a
major issue in cricket. Predicting the trajectory of balls bowled by spin bowlers as shown in
Figure 20b can be resolved accurately by labeling large datasets and modeling using SOTA
algorithms.
Figure 19. Instances from soccer matches. (a) Detecting body pose and limbs. (b) Handling severe
occlusions among players.
The recognition accuracy of player actions in badminton games [191] can be improved
by SOTA computer vision algorithms and fine-tuning in an end-to-end manner with a larger
dataset on features extracted at different fully connected layers. In the implementation of
an automatic linesman system in badminton games [295], the algorithm is not robust to the
far views of the camera, where illumination conditions heavily impact the system while
the speed of the shuttlecock is also a major factor in the poor accuracy. It is therefore necessary to track the shuttle's path, which makes it simpler for the referee to decide whether a shuttle lands in or out, as shown in Figure 21.
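Once tracking delivers a landing position in court coordinates, the in/out call itself is a simple geometric test. A sketch assuming the tracker outputs metric coordinates aligned to the court; the 6.1 m by 13.4 m rectangle is the doubles badminton court, and a shuttle touching the line counts as in, following badminton rules:

```python
def shuttle_call(landing, court=(0.0, 0.0, 6.1, 13.4)):
    """Decide 'in' or 'out' from an estimated landing position.
    The court is modeled as an axis-aligned rectangle (x1, y1, x2, y2)
    in metres; a landing on the boundary line counts as 'in'."""
    x, y = landing
    x1, y1, x2, y2 = court
    return "in" if x1 <= x <= x2 and y1 <= y <= y2 else "out"
```

The hard part in practice is not this test but producing the landing estimate reliably under far camera views, varying illumination, and shuttle speed, as discussed above.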
Figure 20. Instances from cricket matches. (a) Precise detection at the moment of run outs. (b) Pre-
dicting the trajectory of the ball in-line or out-line, etc.
personalized care monitoring will become a new direction and breakthrough in the
sports industry.
9. Conclusions
Sports video analysis is an emerging and very dynamic field of research. This study comprehensively reviewed sports video analysis for various applications, such as tracking players or balls, predicting the trajectories of players or balls, analyzing players' skills and team strategies, and detecting and classifying objects in sports. In view of the requirements for deploying computer vision techniques in various sports, we listed some of the publicly available datasets for particular sports. A detailed discussion of GPU-based workstations, embedded platforms, and AI applications in sports was presented. We presented various classical and AI techniques employed in sports, along with their performance, pros, cons, and suitability for particular sports. Finally, we listed probable research directions, existing challenges, and current research trends, with a brief discussion of widely used computer vision techniques in various sports.
Individual player tracking in sports is very helpful for coaches and personal trainers. Though sports present particularly challenging conditions, such as similarity between players, blurry video segments in some cases, partial or full occlusions between players, and the invisibility of jersey numbers, computer vision is the best available means of achieving player tracking.
Classification of jersey numbers in sports such as soccer and basketball is relatively simple, as the jerseys are plain; in sports such as hockey and American football, the jerseys are bulky and sharply contoured, which makes jersey number recognition hard. By implementing proper bounding box techniques and digit recognition methods, better jersey number recognition can be achieved in every sport. As the appearance of players varies from sport to sport, an algorithm trained on one sport may not work when tested on another sport. This problem may be addressed by fine-tuning on a dataset that contains a small set of samples from every sport.
In multi-player tracking in real-time sports videos, severe occlusions cause the critical problem of identity switching among players. The continuous movement of players makes it difficult to read jersey numbers, and similar appearances combined with severe occlusions make it difficult to reliably track and identify players, referees, and goalkeepers. Multiple-object tracking is nevertheless a key prerequisite for advanced operations, such as analyzing player movement and position, which give the team manager objective criteria for developing new plans to improve team performance and for evaluating each player accurately.
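The association step at the heart of multiple-object tracking can be illustrated with a greedy intersection-over-union matcher between existing track boxes and new detections; this is a simplified sketch, not any specific published tracker, and real systems add motion models and appearance features precisely to reduce the identity switches discussed above.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union else 0.0

def associate(tracks, detections, threshold=0.3):
    """Greedily match track boxes to new detections by IoU.
    tracks: {track_id: box}; detections: list of boxes.
    Returns {track_id: detection_index}; unmatched detections
    would start new tracks in a full tracker."""
    pairs = sorted(
        ((iou(t_box, d_box), tid, di)
         for tid, t_box in tracks.items()
         for di, d_box in enumerate(detections)),
        reverse=True)
    matches, used_t, used_d = {}, set(), set()
    for score, tid, di in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to match
        if tid not in used_t and di not in used_d:
            matches[tid] = di
            used_t.add(tid)
            used_d.add(di)
    return matches
```

When two players occlude each other, their boxes overlap both candidate tracks almost equally, which is exactly the situation in which a pure-IoU matcher swaps identities.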
Commercially used multi-camera player tracking systems rely on a mixture of manual and automated tracking and player labeling. Optical tracking systems are a good approach for tracking players who occlude each other or have a similar appearance. An algorithm may produce false positives from outside the court, such as fans wearing team uniforms, since their appearance is similar to that of players. This can be eliminated by estimating the play area or the broadcast camera parameters together with extra spatiotemporal locations of player positions.
Action recognition in sports videos is an explicitly non-linear problem. It can be addressed by aligning feature vectors, by providing a massive number of discriminative video representations that capture the temporal structure of the video not present in the static image space, and by analyzing the salient regions of the frames for action recognition.
The algorithms employed so far for detecting and tracking ball movements began with estimating the 3D ball position along a trajectory. Employing these methods is difficult, as they involve many mathematical relations and require reliable reference objects to construct the trajectory path. Kalman filter- and particle filter-based methods are robust with respect to the size, shape, and velocity of the ball; however, they fail to re-establish the track when the ball reappears after occlusion. Trajectory-based methods solve the occlusion problem and are robust in handling missing and merging balls but fail in the case of varying ball size and shape.
Data association methods are best suited for detecting and tracking small size balls in small
courts such as tennis courts but are not suited for challenges in sports such as basketball,
soccer, and volleyball. AI algorithms predict the precise trajectories of the ball from a
knowledge of previous frames and are immune to challenges such as air friction, ball
spin, and other complex ball movements. A precise database that includes different sizes
and shapes of the ball has to be introduced to detect the ball position and enable tracking
algorithms to perform efficiently.
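As a concrete example of the filtering approach discussed above, the sketch below uses an alpha-beta filter, a fixed-gain simplification of the Kalman filter, on one coordinate of a ball track; the gain values are illustrative choices, not tuned constants. Prediction under a constant-velocity model lets the filter coast through short gaps, which is the same predict-then-correct structure a full Kalman tracker uses.

```python
def alpha_beta_track(measurements, dt=1.0, alpha=0.85, beta=0.005):
    """Fixed-gain alpha-beta filter (a simplified Kalman filter) for one
    coordinate of a ball track. Each step predicts the position under a
    constant-velocity model, then corrects with the new measurement.
    A measurement of None (e.g., the ball is occluded) keeps the
    prediction, so short occlusions are bridged."""
    x, v = measurements[0], 0.0   # initial position estimate, zero velocity
    estimates = [x]
    for z in measurements[1:]:
        x_pred = x + v * dt       # predict under constant velocity
        if z is None:             # missing measurement: coast on prediction
            x = x_pred
        else:
            r = z - x_pred        # innovation (measurement residual)
            x = x_pred + alpha * r
            v = v + (beta / dt) * r
        estimates.append(x)
    return estimates
```

With steady measurements the estimate converges toward the true constant-velocity motion; when the ball reappears far from where it vanished, the fixed gains recover only slowly, which mirrors the re-acquisition failure noted for filter-based trackers above.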
Detection and tracking of players, balls, and assistant referees, as well as semantic scene understanding, remain open research areas in computer vision applications for sports due to various challenges: sudden and rapid changes in the movements of players and the ball, similar appearances, extreme aspect ratios (players have extremely small height-to-width ratios when they fall on the field), and frequent occlusions. The future scope of computer vision research in sports, therefore, is handling these limitations more accurately with different AI algorithms.
As the betting process involves financial assets, it is important to decide which team is
likely to win; therefore, bookmakers, fans, and potential bidders are all interested in estimating
the odds of the game in advance. So, provisional tactical analysis of field sports related to
player formation in sports such as soccer, basketball, rugby, American football, and hockey, as
well as pass prediction, shot prediction, and expectations of goals in a given game state or a
possession, or more general game strategies, need to be analyzed in advance.
Tracking algorithms that are used in various sports cannot be compared on a common
scale as experiments, requirements, situations, and infrastructure in every scenario differ.
Determining the performance benchmark of algorithms quantitatively is quite difficult
due to the unavailability of a comparable database with ground truths of different sports
differing in many aspects. In addition to these, there are additional parameters such as
different video capturing devices and their parameter variations which lead to difficulty in
building an object tracking system in the sports field.
Author Contributions: Conceptualization, B.T.N. and M.F.H.; methodology, B.T.N. and M.F.H.; software, B.T.N.; validation, B.T.N., M.F.H. and N.D.B.; formal analysis, B.T.N. and M.F.H.; investigation, B.T.N., M.F.H. and N.D.B.; resources, M.F.H. and N.D.B.; data curation, B.T.N. and M.F.H.; writing—original draft preparation, B.T.N., M.F.H. and N.D.B.; writing—review and editing, B.T.N., M.F.H. and N.D.B.; visualization, B.T.N. and N.D.B.; supervision, M.F.H. and N.D.B.; project administration, M.F.H. and N.D.B.; funding acquisition, M.F.H. and N.D.B. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Tan, D.; Ting, H.; Lau, S. A review on badminton motion analysis. In Proceedings of the International Conference on Robotics,
Automation and Sciences (ICORAS), Melaka, Malaysia, 5–6 November 2016; pp. 1–4.
2. Bonidia, R.P.; Rodrigues, L.A.; Avila-Santos, A.P.; Sanches, D.S.; Brancher, J.D. Computational intelligence in sports: A systematic
literature review. Adv. Hum.-Comput. Interact. 2018, 2018, 3426178. [CrossRef]
3. Rahmad, N.A.; As’ari, M.A.; Ghazali, N.F.; Shahar, N.; Sufri, N.A.J. A survey of video based action recognition in sports. Indones.
J. Electr. Eng. Comput. Sci. 2018, 11, 987–993. [CrossRef]
4. Van der Kruk, E.; Reijne, M.M. Accuracy of human motion capture systems for sport applications; state-of-the-art review. Eur. J.
Sport Sci. 2018, 18, 806–819. [CrossRef] [PubMed]
5. Manafifard, M.; Ebadi, H.; Moghaddam, H.A. A survey on player tracking in soccer videos. Comput. Vis. Image Underst. 2017,
159, 19–46. [CrossRef]
6. Thomas, G.; Gade, R.; Moeslund, T.B.; Carr, P.; Hilton, A. Computer vision for sports: Current applications and research topics.
Comput. Vis. Image Underst. 2017, 159, 3–18. [CrossRef]
7. Cust, E.E.; Sweeting, A.J.; Ball, K.; Robertson, S. Machine and deep learning for sport-specific movement recognition: A systematic
review of model development and performance. J. Sports Sci. 2019, 37, 568–600. [CrossRef]
8. Kamble, P.R.; Keskar, A.G.; Bhurchandi, K.M. Ball tracking in sports: A survey. Artif. Intell. Rev. 2019, 52, 1655–1705. [CrossRef]
9. Shih, H.C. A survey of content-aware video analysis for sports. IEEE Trans. Circuits Syst. Video Technol. 2017, 28, 1212–1231.
[CrossRef]
10. Beal, R.; Norman, T.J.; Ramchurn, S.D. Artificial intelligence for team sports: A survey. Knowl. Eng. Rev. 2019, 34, e28. [CrossRef]
11. Apostolidis, E.; Adamantidou, E.; Metsai, A.I.; Mezaris, V.; Patras, I. Video Summarization Using Deep Neural Networks: A
Survey. Proc. IEEE 2021, 109, 1838–1863. [CrossRef]
12. Adesida, Y.; Papi, E.; McGregor, A.H. Exploring the role of wearable technology in sport kinematics and kinetics: A systematic
review. Sensors 2019, 19, 1597. [CrossRef] [PubMed]
13. Rana, M.; Mittal, V. Wearable sensors for real-time kinematics analysis in sports: A review. IEEE Sens. J. 2020, 21, 1187–1207.
[CrossRef]
14. Kini, S. Real Time Moving Vehicle Congestion Detection and Tracking using OpenCV. Turk. J. Comput. Math. Educ. 2021,
12, 273–279.
15. Davis, M. Investigation into Tracking Football Players from Single Viewpoint Video Sequences. Bachelor’s Thesis, The University
of Bath, Bath, UK, 2008; p. 147.
16. Spagnolo, P.; Mosca, N.; Nitti, M.; Distante, A. An unsupervised approach for segmentation and clustering of soccer players. In
Proceedings of the International Machine Vision and Image Processing Conference (IMVIP 2007), Washington, DC, USA, 5–7
September 2007; pp. 133–142.
17. Le Troter, A.; Mavromatis, S.; Sequeira, J. Soccer field detection in video images using color and spatial coherence. In Proceedings
of the International Conference Image Analysis and Recognition, Porto, Portugal, 29 September–1 October 2004; pp. 265–272.
18. Heydari, M.; Moghadam, A.M.E. An MLP-based player detection and tracking in broadcast soccer video. In Proceedings of the
International Conference of Robotics and Artificial Intelligence, Rawalpindi, Pakistan, 22–23 October 2012; pp. 195–199.
19. Barnard, M.; Odobez, J.M. Robust playfield segmentation using MAP adaptation. In Proceedings of the 17th International
Conference on Pattern Recognition, Cambridge, UK, 26 August 2004; Volume 3, pp. 610–613.
20. Pallavi, V.; Mukherjee, J.; Majumdar, A.K.; Sural, S. Graph-based multiplayer detection and tracking in broadcast soccer videos.
IEEE Trans. Multimed. 2008, 10, 794–805. [CrossRef]
21. Ul Huda, N.; Jensen, K.H.; Gade, R.; Moeslund, T.B. Estimating the number of soccer players using simulation-based occlusion
handling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT,
USA, 18–22 June 2018; pp. 1824–1833.
22. Ohno, Y.; Miura, J.; Shirai, Y. Tracking players and estimation of the 3D position of a ball in soccer games. In Proceedings of the
15th International Conference on Pattern Recognition, Barcelona, Spain, 3–7 September 2000; pp. 145–148.
23. Santiago, C.B.; Sousa, A.; Reis, L.P.; Estriga, M.L. Real time colour based player tracking in indoor sports. In Computational Vision
and Medical Image Processing; Springer: Berlin/Heidelberg, Germany, 2011; pp. 17–35.
24. Ren, J.; Orwell, J.; Jones, G.A.; Xu, M. Tracking the soccer ball using multiple fixed cameras. Comput. Vis. Image Underst. 2009,
113, 633–642. [CrossRef]
25. Kasuya, N.; Kitahara, I.; Kameda, Y.; Ohta, Y. Real-time soccer player tracking method by utilizing shadow regions. In Proceedings of the 18th ACM International Conference on Multimedia, Firenze, Italy, 25–29 October 2010; pp. 1319–1322.
26. Homayounfar, N.; Fidler, S.; Urtasun, R. Sports field localization via deep structured models. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 June 2017; pp. 5212–5220.
27. Leo, M.; Mosca, N.; Spagnolo, P.; Mazzeo, P.L.; D’Orazio, T.; Distante, A. Real-time multiview analysis of soccer matches for
understanding interactions between ball and players. In Proceedings of the 2008 International Conference on Content-Based
Image and Video Retrieval, Niagara Falls, ON, Canada, 7–9 July 2008; pp. 525–534.
28. Conaire, C.O.; Kelly, P.; Connaghan, D.; O’Connor, N.E. Tennissense: A platform for extracting semantic information from
multi-camera tennis data. In Proceedings of the 16th International Conference on Digital Signal Processing, Santorini, Greece,
5–7 July 2009; pp. 1–6.
29. Wu, L.; Yang, Z.; He, J.; Jian, M.; Xu, Y.; Xu, D.; Chen, C.W. Ontology-based global and collective motion patterns for event
classification in basketball videos. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 2178–2190. [CrossRef]
30. Wu, L.; Yang, Z.; Wang, Q.; Jian, M.; Zhao, B.; Yan, J.; Chen, C.W. Fusing motion patterns and key visual information for semantic
event recognition in basketball videos. Neurocomputing 2020, 413, 217–229. [CrossRef]
31. Liu, L. Objects detection toward complicated high remote basketball sports by leveraging deep CNN architecture. Future Gener.
Comput. Syst. 2021, 119, 31–36. [CrossRef]
32. Fu, X.; Zhang, K.; Wang, C.; Fan, C. Multiple player tracking in basketball court videos. J. Real-Time Image Process. 2020,
17, 1811–1828. [CrossRef]
33. Yoon, Y.; Hwang, H.; Choi, Y.; Joo, M.; Oh, H.; Park, I.; Lee, K.H.; Hwang, J.H. Analyzing basketball movements and pass
relationships using realtime object tracking techniques based on deep learning. IEEE Access 2019, 7, 56564–56576. [CrossRef]
34. Ramanathan, V.; Huang, J.; Abu-El-Haija, S.; Gorban, A.; Murphy, K.; Fei-Fei, L. Detecting events and key actors in multi-person
videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June
2016; pp. 3043–3053.
35. Chakraborty, B.; Meher, S. A real-time trajectory-based ball detection-and-tracking framework for basketball video. J. Opt. 2013,
42, 156–170. [CrossRef]
36. Santhosh, P.; Kaarthick, B. An automated player detection and tracking in basketball game. Comput. Mater. Contin. 2019,
58, 625–639.
37. Acuna, D. Towards real-time detection and tracking of basketball players using deep neural networks. In Proceedings of the 31st
Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 4–9.
38. Zhao, Y.; Yang, R.; Chevalier, G.; Shah, R.C.; Romijnders, R. Applying deep bidirectional LSTM and mixture density network for
basketball trajectory prediction. Optik 2018, 158, 266–272. [CrossRef]
39. Shah, R.; Romijnders, R. Applying Deep Learning to Basketball Trajectories. arXiv 2016, arXiv:1608.03793.
40. Žemgulys, J.; Raudonis, V.; Maskeliūnas, R.; Damaševičius, R. Recognition of basketball referee signals from real-time videos. J.
Ambient Intell. Humaniz. Comput. 2020, 11, 979–991. [CrossRef]
41. Liu, W.; Yan, C.C.; Liu, J.; Ma, H. Deep learning based basketball video analysis for intelligent arena application. Multimed. Tools
Appl. 2017, 76, 24983–25001. [CrossRef]
42. Yao, P. Real-Time Analysis of Basketball Sports Data Based on Deep Learning. Complexity 2021, 2021, 9142697. [CrossRef]
43. Chen, L.; Wang, W. Analysis of technical features in basketball video based on deep learning algorithm. Signal Process. Image
Commun. 2020, 83, 115786. [CrossRef]
44. Wang, K.C.; Zemel, R. Classifying NBA offensive plays using neural networks. In Proceedings of the MIT Sloan Sports Analytics Conference, Boston, MA, USA, 11–12 March 2016; Volume 4, pp. 1–9.
45. Tsai, T.Y.; Lin, Y.Y.; Jeng, S.K.; Liao, H.Y.M. End-to-End Key-Player-Based Group Activity Recognition Network Applied to
Basketball Offensive Tactic Identification in Limited Data Scenarios. IEEE Access 2021, 9, 104395–104404. [CrossRef]
46. Lamas, L.; Junior, D.D.R.; Santana, F.; Rostaiser, E.; Negretti, L.; Ugrinowitsch, C. Space creation dynamics in basketball offence:
Validation and evaluation of elite teams. Int. J. Perform. Anal. Sport 2011, 11, 71–84. [CrossRef]
47. Bourbousson, J.; Sève, C.; McGarry, T. Space–time coordination dynamics in basketball: Part 1. Intra-and inter-couplings among
player dyads. J. Sports Sci. 2010, 28, 339–347. [CrossRef] [PubMed]
48. Bourbousson, J.; Seve, C.; McGarry, T. Space–time coordination dynamics in basketball: Part 2. The interaction between the two
teams. J. Sports Sci. 2010, 28, 349–358. [CrossRef] [PubMed]
49. Tian, C.; De Silva, V.; Caine, M.; Swanson, S. Use of machine learning to automate the identification of basketball strategies using
whole team player tracking data. Appl. Sci. 2020, 10, 24. [CrossRef]
50. Hauri, S.; Djuric, N.; Radosavljevic, V.; Vucetic, S. Multi-Modal Trajectory Prediction of NBA Players. In Proceedings of the
IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 1640–1649.
51. Zheng, S.; Yue, Y.; Lucey, P. Generating Long-Term Trajectories Using Deep Hierarchical Networks. In Proceedings of the 30th
International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 1551–1559.
52. Bertugli, A.; Calderara, S.; Coscia, P.; Ballan, L.; Cucchiara, R. AC-VRNN: Attentive Conditional-VRNN for multi-future trajectory
prediction. Comput. Vis. Image Underst. 2021, 210, 103245. [CrossRef]
53. Victor, B.; Nibali, A.; He, Z.; Carey, D.L. Enhancing trajectory prediction using sparse outputs: Application to team sports. Neural
Comput. Appl. 2021, 33, 11951–11962. [CrossRef]
54. Li, H.; Zhang, M. Artificial Intelligence and Neural Network-Based Shooting Accuracy Prediction Analysis in Basketball. Mob.
Inf. Syst. 2021, 2021, 4485589. [CrossRef]
55. Chen, H.T.; Chou, C.L.; Fu, T.S.; Lee, S.Y.; Lin, B.S.P. Recognizing tactic patterns in broadcast basketball video using player
trajectory. J. Vis. Commun. Image Represent. 2012, 23, 932–947. [CrossRef]
56. Chen, H.T.; Tien, M.C.; Chen, Y.W.; Tsai, W.J.; Lee, S.Y. Physics-based ball tracking and 3D trajectory reconstruction with
applications to shooting location estimation in basketball video. J. Vis. Commun. Image Represent. 2009, 20, 204–216. [CrossRef]
57. Hu, M.; Hu, Q. Design of basketball game image acquisition and processing system based on machine vision and image processor.
Microprocess. Microsyst. 2021, 82, 103904. [CrossRef]
58. Yichen, W.; Yamashita, H. Lineup Optimization Model of Basketball Players Based on the Prediction of Recursive Neural
Networks. Int. J. Econ. Manag. Eng. 2021, 15, 283–289.
59. Suda, S.; Makino, Y.; Shinoda, H. Prediction of volleyball trajectory using skeletal motions of setter player. In Proceedings of the
10th Augmented Human International Conference, Reims, France, 11–12 March 2019; pp. 1–8.
60. Gerke, S.; Linnemann, A.; Müller, K. Soccer player recognition using spatial constellation features and jersey number recognition.
Comput. Vis. Image Underst. 2017, 159, 105–115. [CrossRef]
61. Baysal, S.; Duygulu, P. Sentioscope: A soccer player tracking system using model field particles. IEEE Trans. Circuits Syst. Video
Technol. 2015, 26, 1350–1362. [CrossRef]
62. Kamble, P.; Keskar, A.; Bhurchandi, K. A deep learning ball tracking system in soccer videos. Opto-Electron. Rev. 2019, 27, 58–69.
[CrossRef]
63. Choi, K.; Seo, Y. Automatic initialization for 3D soccer player tracking. Pattern Recognit. Lett. 2011, 32, 1274–1282. [CrossRef]
64. Kim, W. Multiple object tracking in soccer videos using topographic surface analysis. J. Vis. Commun. Image Represent. 2019,
65, 102683. [CrossRef]
65. Liu, J.; Tong, X.; Li, W.; Wang, T.; Zhang, Y.; Wang, H. Automatic player detection, labeling and tracking in broadcast soccer video.
Pattern Recognit. Lett. 2009, 30, 103–113. [CrossRef]
66. Komorowski, J.; Kurzejamski, G.; Sarwas, G. BallTrack: Football ball tracking for real-time CCTV systems. In Proceedings of the
16th International Conference on Machine Vision Applications (MVA), Tokyo, Japan, 27–31 May 2019; pp. 1–5.
67. Hurault, S.; Ballester, C.; Haro, G. Self-Supervised Small Soccer Player Detection and Tracking. In Proceedings of the 3rd
International Workshop on Multimedia Content Analysis in Sports, Seattle, WA, USA, 12–16 October 2020; pp. 9–18.
68. Kamble, P.R.; Keskar, A.G.; Bhurchandi, K.M. A convolutional neural network based 3D ball tracking by detection in soccer videos. In Proceedings of the Eleventh International Conference on Machine Vision (ICMV 2018), Munich, Germany, 1–3 November 2018; Volume 11041, p. 110412O.
69. Naidoo, W.C.; Tapamo, J.R. Soccer video analysis by ball, player and referee tracking. In Proceedings of the 2006 Annual Research
Conference of the South African Institute of Computer Scientists and Information Technologists on IT Research in Developing
Countries, Somerset West, South Africa, 9–11 October 2006; pp. 51–60.
70. Liang, D.; Liu, Y.; Huang, Q.; Gao, W. A scheme for ball detection and tracking in broadcast soccer video. In Proceedings of the
Pacific-Rim Conference on Multimedia, Jeju Island, Korea, 13–16 November 2005; pp. 864–875.
71. Naik, B.; Hashmi, M.F. YOLOv3-SORT detection and tracking player-ball in soccer sport. J. Electron. Imaging 2023, 32, 011003.
[CrossRef]
72. Naik, B.; Hashmi, M.F.; Geem, Z.W.; Bokde, N.D. DeepPlayer-Track: Player and Referee Tracking with Jersey Color Recognition
in Soccer. IEEE Access 2022, 10, 32494–32509. [CrossRef]
73. Komorowski, J.; Kurzejamski, G.; Sarwas, G. FootAndBall: Integrated Player and Ball Detector. In Proceedings of the 15th
International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Valletta, Malta,
27–29 February 2020; Volume 5, pp. 47–56. [CrossRef]
74. Pallavi, V.; Mukherjee, J.; Majumdar, A.K.; Sural, S. Ball detection from broadcast soccer videos using static and dynamic features.
J. Vis. Commun. Image Represent. 2008, 19, 426–436. [CrossRef]
75. Leo, M.; Mazzeo, P.L.; Nitti, M.; Spagnolo, P. Accurate ball detection in soccer images using probabilistic analysis of salient
regions. Mach. Vis. Appl. 2013, 24, 1561–1574. [CrossRef]
76. Mazzeo, P.L.; Leo, M.; Spagnolo, P.; Nitti, M. Soccer ball detection by comparing different feature extraction methodologies. Adv.
Artif. Intell. 2012, 2012, 512159. [CrossRef]
77. Garnier, P.; Gregoir, T. Evaluating Soccer Player: From Live Camera to Deep Reinforcement Learning. arXiv 2021, arXiv:2101.05388.
78. Kusmakar, S.; Shelyag, S.; Zhu, Y.; Dwyer, D.; Gastin, P.; Angelova, M. Machine Learning Enabled Team Performance Analysis in
the Dynamical Environment of Soccer. IEEE Access 2020, 8, 90266–90279. [CrossRef]
79. Baccouche, M.; Mamalet, F.; Wolf, C.; Garcia, C.; Baskurt, A. Action classification in soccer videos with long short-term memory
recurrent neural networks. In Proceedings of the International Conference on Artificial Neural Networks, Thessaloniki, Greece,
15–18 September 2010; pp. 154–159.
80. Jackman, S. Football Shot Detection Using Convolutional Neural Networks. Master’s Thesis, Department of Biomedical
Engineering, Linköping University, Linköping, Sweden, 2019.
81. Lucey, P.; Bialkowski, A.; Monfort, M.; Carr, P.; Matthews, I. Quality vs. quantity: Improved shot prediction in soccer using strategic features from spatiotemporal data. In Proceedings of the 8th Annual MIT Sloan Sports Analytics Conference, Boston, MA, USA, 28 February–1 March 2014; pp. 1–9.
82. Cioppa, A.; Deliege, A.; Giancola, S.; Ghanem, B.; Droogenbroeck, M.V.; Gade, R.; Moeslund, T.B. A context-aware loss function
for action spotting in soccer videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
Seattle, WA, USA, 13–19 June 2020; pp. 13126–13136.
83. Beernaerts, J.; De Baets, B.; Lenoir, M.; Van de Weghe, N. Spatial movement pattern recognition in soccer based on relative player
movements. PLoS ONE 2020, 15, e0227746. [CrossRef] [PubMed]
84. Barbon Junior, S.; Pinto, A.; Barroso, J.V.; Caetano, F.G.; Moura, F.A.; Cunha, S.A.; Torres, R.d.S. Sport action mining: Dribbling
recognition in soccer. Multimed. Tools Appl. 2022, 81, 4341–4364. [CrossRef]
85. Kim, Y.; Jung, C.; Kim, C. Motion Recognition of Assistant Referees in Soccer Games via Selective Color Contrast Revelation.
EasyChair Preprint no. 2604, EasyChair, 2020. Available online: https://easychair.org/publications/preprint/z975 (accessed on
2 November 2021).
86. Lindström, P.; Jacobsson, L.; Carlsson, N.; Lambrix, P. Predicting player trajectories in shot situations in soccer. In Proceedings of
the International Workshop on Machine Learning and Data Mining for Sports Analytics, Ghent, Belgium, 14–18 September 2020;
pp. 62–75.
87. Machado, V.; Leite, R.; Moura, F.; Cunha, S.; Sadlo, F.; Comba, J.L. Visual soccer match analysis using spatiotemporal positions of
players. Comput. Graph. 2017, 68, 84–95. [CrossRef]
88. Ganesh, Y.; Teja, A.S.; Munnangi, S.K.; Murthy, G.R. A Novel Framework for Fine Grained Action Recognition in Soccer. In
Proceedings of the International Work-Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019;
pp. 137–150.
89. Chawla, S.; Estephan, J.; Gudmundsson, J.; Horton, M. Classification of passes in football matches using spatiotemporal data.
ACM Trans. Spat. Algorithms Syst. 2017, 3, 1–30. [CrossRef]
90. Gyarmati, L.; Stanojevic, R. QPass: A Merit-based Evaluation of Soccer Passes. arXiv 2016, arXiv:1608.03532.
91. Vercruyssen, V.; De Raedt, L.; Davis, J. Qualitative spatial reasoning for soccer pass prediction. In CEUR Workshop Proceedings;
Springer: Berlin/Heidelberg, Germany, 2016; Volume 1842.
92. Yu, J.; Lei, A.; Hu, Y. Soccer video event detection based on deep learning. In Proceedings of the International Conference on
Multimedia Modeling, Thessaloniki, Greece, 8–11 January 2019; pp. 377–389.
93. Brooks, J.; Kerr, M.; Guttag, J. Using machine learning to draw inferences from pass location data in soccer. Stat. Anal. Data Min.
ASA Data Sci. J. 2016, 9, 338–349. [CrossRef]
94. Cho, H.; Ryu, H.; Song, M. Pass2vec: Analyzing soccer players’ passing style using deep learning. Int. J. Sports Sci. Coach. 2021,
17, 355–365. [CrossRef]
95. Zhang, K.; Wu, J.; Tong, X.; Wang, Y. An automatic multi-camera-based event extraction system for real soccer videos. Pattern
Anal. Appl. 2020, 23, 953–965. [CrossRef]
96. Deliège, A.; Cioppa, A.; Giancola, S.; Seikavandi, M.J.; Dueholm, J.V.; Nasrollahi, K.; Ghanem, B.; Moeslund, T.B.; Droogenbroeck, M.V. SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos. arXiv 2020, arXiv:2011.13367.
97. Penumala, R.; Sivagami, M.; Srinivasan, S. Automated Goal Score Detection in Football Match Using Key Moments. Procedia
Comput. Sci. 2019, 165, 492–501. [CrossRef]
98. Khan, A.; Lazzerini, B.; Calabrese, G.; Serafini, L. Soccer event detection. In Proceedings of the 4th International Conference on
Image Processing and Pattern Recognition (IPPR 2018), Copenhagen, Denmark, 28–29 April 2018; pp. 119–129.
99. Khaustov, V.; Mozgovoy, M. Recognizing Events in Spatiotemporal Soccer Data. Appl. Sci. 2020, 10, 8046. [CrossRef]
100. Saraogi, H.; Sharma, R.A.; Kumar, V. Event recognition in broadcast soccer videos. In Proceedings of the Tenth Indian Conference
on Computer Vision, Graphics and Image Processing, Hyderabad, India, 18–22 December 2016; pp. 1–7.
101. Karimi, A.; Toosi, R.; Akhaee, M.A. Soccer Event Detection Using Deep Learning. arXiv 2021, arXiv:2102.04331.
102. Suzuki, G.; Takahashi, S.; Ogawa, T.; Haseyama, M. Team tactics estimation in soccer videos based on a deep extreme learning
machine and characteristics of the tactics. IEEE Access 2019, 7, 153238–153248. [CrossRef]
103. Suzuki, G.; Takahashi, S.; Ogawa, T.; Haseyama, M. Decision level fusion-based team tactics estimation in soccer videos. In
Proceedings of the IEEE 5th Global Conference on Consumer Electronics, Kyoto, Japan, 11–14 October 2016; pp. 1–2.
104. Ohnuki, S.; Takahashi, S.; Ogawa, T.; Haseyama, M. Soccer video segmentation based on team tactics estimation method. In
Proceedings of the International Workshop on Advanced Image Technology, Nagoya, Japan, 7–8 January 2013; pp. 692–695.
105. Clemente, F.M.; Couceiro, M.S.; Martins, F.M.L.; Mendes, R.S.; Figueiredo, A.J. Soccer team’s tactical behaviour: Measuring
territorial domain. J. Sports Eng. Technol. 2015, 229, 58–66. [CrossRef]
106. Hassan, A.; Akl, A.R.; Hassan, I.; Sunderland, C. Predicting Wins, Losses and Attributes’ Sensitivities in the Soccer World Cup
2018 Using Neural Network Analysis. Sensors 2020, 20, 3213. [CrossRef]
107. Niu, Z.; Gao, X.; Tian, Q. Tactic analysis based on real-world ball trajectory in soccer video. Pattern Recognit. 2012, 45, 1937–1947.
[CrossRef]
108. Wu, Y.; Xie, X.; Wang, J.; Deng, D.; Liang, H.; Zhang, H.; Cheng, S.; Chen, W. Forvizor: Visualizing spatio-temporal team
formations in soccer. IEEE Trans. Vis. Comput. Graph. 2018, 25, 65–75. [CrossRef]
109. Suzuki, G.; Takahashi, S.; Ogawa, T.; Haseyama, M. Team tactics estimation in soccer videos via deep extreme learning machine
based on players formation. In Proceedings of the IEEE 7th Global Conference on Consumer Electronics, Nara, Japan, 9–12
October 2018; pp. 116–117.
110. Wang, B.; Shen, W.; Chen, F.; Zeng, D. Football match intelligent editing system based on deep learning. KSII Trans. Internet Inf.
Syst. 2019, 13, 5130–5143.
111. Zawbaa, H.M.; El-Bendary, N.; Hassanien, A.E.; Kim, T.h. Event detection based approach for soccer video summarization using
machine learning. Int. J. Multimed. Ubiquitous Eng. 2012, 7, 63–80.
112. Kolekar, M.H.; Sengupta, S. Bayesian network-based customized highlight generation for broadcast soccer videos. IEEE Trans.
Broadcast. 2015, 61, 195–209. [CrossRef]
113. Li, J.; Wang, T.; Hu, W.; Sun, M.; Zhang, Y. Soccer highlight detection using two-dependence bayesian network. In Proceedings of
the IEEE International Conference on Multimedia and Expo, Toronto, ON, Canada, 9–12 July 2006; pp. 1625–1628.
114. Foysal, M.F.A.; Islam, M.S.; Karim, A.; Neehal, N. Shot-Net: A convolutional neural network for classifying different cricket shots.
In Proceedings of the International Conference on Recent Trends in Image Processing and Pattern Recognition, Solapur, India,
21–22 December 2018; pp. 111–120.
115. Khan, M.Z.; Hassan, M.A.; Farooq, A.; Khan, M.U.G. Deep CNN based data-driven recognition of cricket batting shots. In
Proceedings of the International Conference on Applied and Engineering Mathematics (ICAEM), Taxila, Pakistan, 4–5 September
2018; pp. 67–71.
116. Khan, A.; Nicholson, J.; Plötz, T. Activity recognition for quality assessment of batting shots in cricket using a hierarchical
representation. In Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies; ACM Digital Library:
New York, NY, USA, 2017; Volume 1, p. 62. [CrossRef]
117. Sen, A.; Deb, K.; Dhar, P.K.; Koshiba, T. CricShotClassify: An Approach to Classifying Batting Shots from Cricket Videos Using a
Convolutional Neural Network and Gated Recurrent Unit. Sensors 2021, 21, 2846. [CrossRef] [PubMed]
118. Gürpınar-Morgan, W.; Dinsdale, D.; Gallagher, J.; Cherukumudi, A.; Lucey, P. You Cannot Do That Ben Stokes: Dynamically
Predicting Shot Type in Cricket Using a Personalized Deep Neural Network. arXiv 2021, arXiv:2102.01952.
119. Bandara, I.; Bačić, B. Strokes Classification in Cricket Batting Videos. In Proceedings of the 2020 5th International Conference on
Innovative Technologies in Intelligent Systems and Industrial Applications (CITISIA), Sydney, Australia, 25–27 November 2020;
pp. 1–6.
120. Moodley, T.; van der Haar, D. Scene Recognition Using AlexNet to Recognize Significant Events Within Cricket Game Footage. In
Proceedings of the International Conference on Computer Vision and Graphics, Valletta, Malta, 27–29 February 2020; pp. 98–109.
121. Gupta, A.; Muthiah, S.B. Viewpoint constrained and unconstrained Cricket stroke localization from untrimmed videos. Image Vis.
Comput. 2020, 100, 103944. [CrossRef]
122. Al Islam, M.N.; Hassan, T.B.; Khan, S.K. A CNN-based approach to classify cricket bowlers based on their bowling actions. In Proceedings of the IEEE International Conference on Signal Processing, Information, Communication & Systems (SPICSCON), Dhaka, Bangladesh, 28–30 November 2019; pp. 130–134.
123. Muthuswamy, S.; Lam, S.S. Bowler performance prediction for one-day international cricket using neural networks. In Proceedings of the IIE Annual Conference, Institute of Industrial and Systems Engineers (IISE), New Orleans, LA, USA, 30 May–2 June 2008; p. 1391.
124. Bhattacharjee, D.; Pahinkar, D.G. Analysis of performance of bowlers using combined bowling rate. Int. J. Sports Sci. Eng. 2012,
6, 1750–9823.
125. Rahman, R.; Rahman, M.A.; Islam, M.S.; Hasan, M. DeepGrip: Cricket Bowling Delivery Detection with Superior CNN
Architectures. In Proceedings of the 6th International Conference on Inventive Computation Technologies (ICICT), Lalitpur,
Nepal, 20–22 July 2021; pp. 630–636.
126. Lemmer, H.H. The combined bowling rate as a measure of bowling performance in cricket. S. Afr. J. Res. Sport Phys. Educ. Recreat.
2002, 24, 37–44. [CrossRef]
127. Mukherjee, S. Quantifying individual performance in Cricket—A network analysis of Batsmen and Bowlers. Phys. A Stat. Mech.
Its Appl. 2014, 393, 624–637. [CrossRef]
128. Velammal, B.; Kumar, P.A. An Efficient Ball Detection Framework for Cricket. Int. J. Comput. Sci. Issues 2010, 7, 30.
129. Nelikanti, A.; Reddy, G.V.R.; Karuna, G. An Optimization Based deep LSTM Predictive Analysis for Decision Making in Cricket.
In Innovative Data Communication Technologies and Application; Springer: Berlin/Heidelberg, Germany, 2021; pp. 721–737.
130. Kumar, R.; Santhadevi, D.; Barnabas, J. Outcome Classification in Cricket Using Deep Learning. In Proceedings of the IEEE
International Conference on Cloud Computing in Emerging Markets (CCEM), Bengaluru, India, 19–20 September 2019; pp. 55–58.
131. Shukla, P.; Sadana, H.; Bansal, A.; Verma, D.; Elmadjian, C.; Raman, B.; Turk, M. Automatic cricket highlight generation using
event-driven and excitement-based features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1800–1808.
132. Kowsher, M.; Alam, M.A.; Uddin, M.J.; Ahmed, F.; Ullah, M.W.; Islam, M.R. Detecting Third Umpire Decisions & Automated
Scoring System of Cricket. In Proceedings of the 2019 International Conference on Computer, Communication, Chemical,
Materials and Electronic Engineering (IC4ME2), Rajshahi, Bangladesh, 11–12 July 2019; pp. 1–8.
133. Ravi, A.; Venugopal, H.; Paul, S.; Tizhoosh, H.R. A dataset and preliminary results for umpire pose detection using SVM
classification of deep features. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI),
Bangalore, India, 18–21 November 2018; pp. 1396–1402.
134. Kapadiya, C.; Shah, A.; Adhvaryu, K.; Barot, P. Intelligent Cricket Team Selection by Predicting Individual Players’ Performance
using Efficient Machine Learning Technique. Int. J. Eng. Adv. Technol. 2020, 9, 3406–3409. [CrossRef]
135. Iyer, S.R.; Sharda, R. Prediction of athletes performance using neural networks: An application in cricket team selection. Expert
Syst. Appl. 2009, 36, 5510–5522. [CrossRef]
136. Jhanwar, M.G.; Pudi, V. Predicting the Outcome of ODI Cricket Matches: A Team Composition Based Approach. In Proceedings of
the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD
2016), Bilbao, Spain, 19–23 September 2016.
137. Pathak, N.; Wadhwa, H. Applications of modern classification techniques to predict the outcome of ODI cricket. Procedia Comput.
Sci. 2016, 87, 55–60. [CrossRef]
138. Alaka, S.; Sreekumar, R.; Shalu, H. Efficient Feature Representations for Cricket Data Analysis using Deep Learning based
Multi-Modal Fusion Model. arXiv 2021, arXiv:2108.07139.
139. Goel, R.; Davis, J.; Bhatia, A.; Malhotra, P.; Bhardwaj, H.; Hooda, V.; Goel, A. Dynamic cricket match outcome prediction. J. Sports
Anal. 2021, 7, 185–196. [CrossRef]
140. Karthik, K.; Krishnan, G.S.; Shetty, S.; Bankapur, S.S.; Kolkar, R.P.; Ashwin, T.; Vanahalli, M.K. Analysis and Prediction of
Fantasy Cricket Contest Winners Using Machine Learning Techniques. In Evolution in Computational Intelligence; Springer:
Berlin/Heidelberg, Germany, 2021; pp. 443–453.
141. Shah, P. New performance measure in Cricket. ISOR J. Sports Phys. Educ. 2017, 4, 28–30. [CrossRef]
142. Shingrakhia, H.; Patel, H. SGRNN-AM and HRF-DBN: A hybrid machine learning model for cricket video summarization. Vis.
Comput. 2021. [CrossRef]
143. Guntuboina, C.; Porwal, A.; Jain, P.; Shingrakhia, H. Deep Learning Based Automated Sports Video Summarization using YOLO.
Electron. Lett. Comput. Vis. Image Anal. 2021, 20, 99–116.
144. Owens, N.; Harris, C.; Stennett, C. Hawk-eye tennis system. In Proceedings of the International Conference on Visual Information
Engineering, Guildford, UK, 7–9 July 2003; pp. 182–185.
145. Wu, G. Monitoring System of Key Technical Features of Male Tennis Players Based on Internet of Things Security Technology.
Wirel. Commun. Mob. Comput. 2021, 2021, 4076863. [CrossRef]
146. Connaghan, D.; Kelly, P.; O’Connor, N.E. Game, shot and match: Event-based indexing of tennis. In Proceedings of the 9th
International Workshop on Content-Based Multimedia Indexing (CBMI), Lille, France, 28–30 June 2011; pp. 97–102.
147. Giles, B.; Kovalchik, S.; Reid, M. A machine learning approach for automatic detection and classification of changes of direction
from player tracking data in professional tennis. J. Sports Sci. 2020, 38, 106–113. [CrossRef]
148. Zhou, X.; Xie, L.; Huang, Q.; Cox, S.J.; Zhang, Y. Tennis ball tracking using a two-layered data association approach. IEEE Trans.
Multimed. 2014, 17, 145–156. [CrossRef]
149. Reno, V.; Mosca, N.; Marani, R.; Nitti, M.; D’Orazio, T.; Stella, E. Convolutional neural networks based ball detection in tennis
games. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA,
18–22 June 2018; pp. 1758–1764.
150. Archana, M.; Geetha, M.K. Object detection and tracking based on trajectory in broadcast tennis video. Procedia Comput. Sci.
2015, 58, 225–232. [CrossRef]
151. Polk, T.; Yang, J.; Hu, Y.; Zhao, Y. Tennivis: Visualization for tennis match analysis. IEEE Trans. Vis. Comput. Graph. 2014,
20, 2339–2348. [CrossRef]
152. Kelly, P.; Diego, J.; Agapito, P.; Conaire, C.; Connaghan, D.; Kuklyte, J.; Connor, N. Performance analysis and visualisation
in tennis using a low-cost camera network. In Proceedings of the 18th ACM Multimedia Conference on Multimedia Grand
Challenge, Beijing, China, 25–29 October 2010; pp. 1–4.
153. Fernando, T.; Denman, S.; Sridharan, S.; Fookes, C. Memory augmented deep generative models for forecasting the next shot
location in tennis. IEEE Trans. Knowl. Data Eng. 2019, 32, 1785–1797. [CrossRef]
154. Pingali, G.; Opalach, A.; Jean, Y.; Carlbom, I. Visualization of sports using motion trajectories: Providing insights into performance,
style, and strategy. In Proceedings of the IEEE Visualization 2001, San Diego, CA, USA, 24–26 October 2001; pp. 75–544.
155. Pingali, G.S.; Opalach, A.; Jean, Y.D.; Carlbom, I.B. Instantly indexed multimedia databases of real world events. IEEE Trans.
Multimed. 2002, 4, 269–282. [CrossRef]
156. Cai, J.; Hu, J.; Tang, X.; Hung, T.Y.; Tan, Y.P. Deep historical long short-term memory network for action recognition. Neurocomputing 2020, 407, 428–438. [CrossRef]
157. Vinyes Mora, S.; Knottenbelt, W.J. Deep learning for domain-specific action recognition in tennis. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 114–122.
158. Ning, B.; Na, L. Deep Spatial/temporal-level feature engineering for Tennis-based action recognition. Future Gener. Comput. Syst.
2021, 125, 188–193. [CrossRef]
159. Polk, T.; Jäckle, D.; Häußler, J.; Yang, J. CourtTime: Generating actionable insights into tennis matches using visual analytics.
IEEE Trans. Vis. Comput. Graph. 2019, 26, 397–406. [CrossRef]
160. Zhu, G.; Huang, Q.; Xu, C.; Xing, L.; Gao, W.; Yao, H. Human behavior analysis for highlight ranking in broadcast racket sports
video. IEEE Trans. Multimed. 2007, 9, 1167–1182.
161. Wei, X.; Lucey, P.; Morgan, S.; Sridharan, S. Forecasting the next shot location in tennis using fine-grained spatiotemporal tracking
data. IEEE Trans. Knowl. Data Eng. 2016, 28, 2988–2997. [CrossRef]
162. Ma, K. A Real Time Artificial Intelligent System for Tennis Swing Classification. In Proceedings of the IEEE 19th World
Symposium on Applied Machine Intelligence and Informatics (SAMI), Herl’any, Slovakia, 21–23 January 2021; pp. 21–26.
Appl. Sci. 2022, 12, 4429 44 of 49
163. Vales-Alonso, J.; Chaves-Diéguez, D.; López-Matencio, P.; Alcaraz, J.J.; Parrado-García, F.J.; González-Castaño, F.J. SAETA: A
smart coaching assistant for professional volleyball training. IEEE Trans. Syst. Man Cybern. Syst. 2015, 45, 1138–1150. [CrossRef]
164. Kautz, T.; Groh, B.H.; Hannink, J.; Jensen, U.; Strubberg, H.; Eskofier, B.M. Activity recognition in beach volleyball using a Deep
Convolutional Neural Network. Data Min. Knowl. Discov. 2017, 31, 1678–1705. [CrossRef]
165. Ibrahim, M.S.; Muralidharan, S.; Deng, Z.; Vahdat, A.; Mori, G. A hierarchical deep temporal model for group activity recognition.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016;
pp. 1971–1980.
166. Van Haaren, J.; Ben Shitrit, H.; Davis, J.; Fua, P. Analyzing volleyball match data from the 2014 World Championships using
machine learning techniques. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 627–634.
167. Wenninger, S.; Link, D.; Lames, M. Performance of machine learning models in application to beach volleyball data. Int. J.
Comput. Sci. Sport 2020, 19, 24–36. [CrossRef]
168. Haider, F.; Salim, F.; Naghashi, V.; Tasdemir, S.B.Y.; Tengiz, I.; Cengiz, K.; Postma, D.; Delden, R.v.; Reidsma, D.; van Beijnum, B.J.;
et al. Evaluation of dominant and non-dominant hand movements for volleyball action modelling. In Proceedings of the Adjunct
of the 2019 International Conference on Multimodal Interaction, Suzhou, China, 14–18 October 2019; pp. 1–6.
169. Salim, F.A.; Haider, F.; Tasdemir, S.B.Y.; Naghashi, V.; Tengiz, I.; Cengiz, K.; Postma, D.; Van Delden, R. Volleyball action modelling
for behavior analysis and interactive multi-modal feedback. In Proceedings of the 15th International Summer Workshop on
Multimodal Interfaces, Ankara, Turkey, 8 July 2019; p. 50.
170. Jiang, W.; Zhao, K.; Jin, X. Diagnosis Model of Volleyball Skills and Tactics Based on Artificial Neural Network. Mob. Inf. Syst.
2021, 2021, 7908897. [CrossRef]
171. Wang, Y.; Zhao, Y.; Chan, R.H.; Li, W.J. Volleyball skill assessment using a single wearable micro inertial measurement unit at
wrist. IEEE Access 2018, 6, 13758–13765. [CrossRef]
172. Zhang, C.; Tang, H.; Duan, Z. WITHDRAWN: Time Series Analysis of Volleyball Spiking Posture Based on Quality-Guided
Cyclic Neural Network. J. Vis. Commun. Image Represent. 2019, 82, 102681. [CrossRef]
173. Thilakarathne, H.; Nibali, A.; He, Z.; Morgan, S. Pose is all you need: The pose only group activity recognition system (POGARS).
arXiv 2021, arXiv:2108.04186.
174. Zhao, K.; Jiang, W.; Jin, X.; Xiao, X. Artificial intelligence system based on the layout effect of both sides in volleyball matches. J.
Intell. Fuzzy Syst. 2021, 40, 3075–3084. [CrossRef]
175. Tian, Y. Optimization of Volleyball Motion Estimation Algorithm Based on Machine Vision and Wearable Devices. Microprocess.
Microsyst. 2020, 81, 103750. [CrossRef]
176. Şah, M.; Direkoğlu, C. Review and evaluation of player detection methods in field sports. Multimed. Tools Appl. 2021. [CrossRef]
177. Rangasamy, K.; As’ari, M.A.; Rahmad, N.A.; Ghazali, N.F. Hockey activity recognition using pre-trained deep learning model.
ICT Express 2020, 6, 170–174. [CrossRef]
178. Sozykin, K.; Protasov, S.; Khan, A.; Hussain, R.; Lee, J. Multi-label class-imbalanced action recognition in hockey videos via
3D convolutional neural networks. In Proceedings of the 19th IEEE/ACIS International Conference on Software Engineering,
Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), Busan, Korea, 27–29 June 2018; pp. 146–151.
179. Fani, M.; Neher, H.; Clausi, D.A.; Wong, A.; Zelek, J. Hockey action recognition via integrated stacked hourglass network. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July
2017; pp. 29–37.
180. Cai, Z.; Neher, H.; Vats, K.; Clausi, D.A.; Zelek, J. Temporal hockey action recognition via pose and optical flows. In Proceedings
of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
181. Chan, A.; Levine, M.D.; Javan, M. Player Identification in Hockey Broadcast Videos. Expert Syst. Appl. 2021, 165, 113891.
[CrossRef]
182. Carbonneau, M.A.; Raymond, A.J.; Granger, E.; Gagnon, G. Real-time visual play-break detection in sport events using a context
descriptor. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Lisbon, Portugal, 24–27 May
2015; pp. 2808–2811.
183. Wang, H.; Ullah, M.M.; Klaser, A.; Laptev, I.; Schmid, C. Evaluation of local spatio-temporal features for action recognition. In
Proceedings of the British Machine Vision Conference, London, UK, 7–10 September 2009.
184. Um, G.M.; Lee, C.; Park, S.; Seo, J. Ice Hockey Player Tracking and Identification System Using Multi-camera video. In
Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Jeju, Korea, 5–7
June 2019; pp. 1–4.
185. Guo, T.; Tao, K.; Hu, Q.; Shen, Y. Detection of Ice Hockey Players and Teams via a Two-Phase Cascaded CNN Model. IEEE Access
2020, 8, 195062–195073. [CrossRef]
186. Liu, G.; Schulte, O. Deep reinforcement learning in ice hockey for context-aware player evaluation. arXiv 2021, arXiv:1805.11088.
187. Vats, K.; Neher, H.; Clausi, D.A.; Zelek, J. Two-stream action recognition in ice hockey using player pose sequences and optical
flows. In Proceedings of the 16th Conference on Computer and Robot Vision (CRV), Kingston, QC, Canada, 29–31 May 2019;
pp. 181–188.
188. Vats, K.; Fani, M.; Clausi, D.A.; Zelek, J. Puck localization and multi-task event recognition in broadcast hockey videos. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021;
pp. 4567–4575.
189. Tora, M.R.; Chen, J.; Little, J.J. Classification of puck possession events in ice hockey. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 147–154.
190. Weeratunga, K.; Dharmaratne, A.; Boon How, K. Application of computer vision and vector space model for tactical movement
classification in badminton. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops,
Honolulu, HI, USA, 21–26 July 2017; pp. 76–82.
191. Rahmad, N.; As’ari, M. The new Convolutional Neural Network (CNN) local feature extractor for automated badminton action
recognition on vision based data. J. Phys. Conf. Ser. 2020, 1529, 022021. [CrossRef]
192. Steels, T.; Van Herbruggen, B.; Fontaine, J.; De Pessemier, T.; Plets, D.; De Poorter, E. Badminton Activity Recognition Using
Accelerometer Data. Sensors 2020, 20, 4685. [CrossRef]
193. binti Rahmad, N.A.; binti Sufri, N.A.J.; bin As’ari, M.A.; binti Azaman, A. Recognition of Badminton Action Using Convolutional
Neural Network. Indones. J. Electr. Eng. Inform. 2019, 7, 750–756.
194. Ghosh, I.; Ramamurthy, S.R.; Roy, N. StanceScorer: A Data Driven Approach to Score Badminton Player. In Proceedings of the
IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Austin, TX,
USA, 13–20 September 2020; pp. 1–6.
195. Cao, Z.; Liao, T.; Song, W.; Chen, Z.; Li, C. Detecting the shuttlecock for a badminton robot: A YOLO based approach. Expert Syst.
Appl. 2021, 164, 113833. [CrossRef]
196. Chen, W.; Liao, T.; Li, Z.; Lin, H.; Xue, H.; Zhang, L.; Guo, J.; Cao, Z. Using FTOC to track shuttlecock for the badminton robot.
Neurocomputing 2019, 334, 182–196. [CrossRef]
197. Rahmad, N.A.; Sufri, N.A.J.; Muzamil, N.H.; As’ari, M.A. Badminton player detection using faster region convolutional neural
network. Indones. J. Electr. Eng. Comput. Sci. 2019, 14, 1330–1335. [CrossRef]
198. Hou, J.; Li, B. Swimming target detection and tracking technology in video image processing. Microprocess. Microsyst. 2021,
80, 103535. [CrossRef]
199. Cao, Y. Fast swimming motion image segmentation method based on symmetric difference algorithm. Microprocess. Microsyst.
2021, 80, 103541. [CrossRef]
200. Hegazy, H.; Abdelsalam, M.; Hussien, M.; Elmosalamy, S.; Hassan, Y.M.; Nabil, A.M.; Atia, A. IPingPong: A Real-time
Performance Analyzer System for Table Tennis Stroke’s Movements. Procedia Comput. Sci. 2020, 175, 80–87. [CrossRef]
201. Baclig, M.M.; Ergezinger, N.; Mei, Q.; Gül, M.; Adeeb, S.; Westover, L. A Deep Learning and Computer Vision Based Multi-Player
Tracker for Squash. Appl. Sci. 2020, 10, 8793. [CrossRef]
202. Brumann, C.; Kukuk, M.; Reinsberger, C. Evaluation of Open-Source and Pre-Trained Deep Convolutional Neural Networks
Suitable for Player Detection and Motion Analysis in Squash. Sensors 2021, 21, 4550. [CrossRef]
203. Wang, S.; Xu, Y.; Zheng, Y.; Zhu, M.; Yao, H.; Xiao, Z. Tracking a golf ball with high-speed stereo vision system. IEEE Trans.
Instrum. Meas. 2018, 68, 2742–2754. [CrossRef]
204. Zhi-chao, C.; Zhang, L. Key pose recognition toward sports scene using deeply-learned model. J. Vis. Commun. Image Represent.
2019, 63, 102571. [CrossRef]
205. Liu, H.; Bhanu, B. Pose-Guided R-CNN for Jersey Number Recognition in Sports. In Proceedings of the 2019 IEEE/CVF Conference on
Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 2457–2466. [CrossRef]
206. Pobar, M.; Ivašić-Kos, M. Detection of the leading player in handball scenes using Mask R-CNN and STIPS. In Proceedings of
the Eleventh International Conference on Machine Vision (ICMV 2018), Munich, Germany, 1–3 November 2018; Volume 11041,
pp. 501–508.
207. Van Zandycke, G.; De Vleeschouwer, C. Real-time CNN-based Segmentation Architecture for Ball Detection in a Single View
Setup. In Proceedings of the 2nd International Workshop on Multimedia Content Analysis in Sports, Nice, France, 25 October
2019; pp. 51–58.
208. Burić, M.; Pobar, M.; Ivašić-Kos, M. Adapting YOLO network for ball and player detection. In Proceedings of the 8th
International Conference on Pattern Recognition Applications and Methods, Prague, Czech Republic, 19–21 February 2019;
Volume 1, pp. 845–851.
209. Pobar, M.; Ivasic-Kos, M. Active Player Detection in Handball Scenes Based on Activity Measures. Sensors 2020, 20, 1475.
[CrossRef]
210. Komorowski, J.; Kurzejamski, G.; Sarwas, G. DeepBall: Deep Neural-Network Ball Detector. In Proceedings of the 14th
International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Valletta, Malta,
27–29 February 2019; Volume 5, pp. 297–304. [CrossRef]
211. Liu, W. Beach sports image detection based on heterogeneous multi-processor and convolutional neural network. Microprocess.
Microsyst. 2021, 82, 103910. [CrossRef]
212. Zhang, R.; Wu, L.; Yang, Y.; Wu, W.; Chen, Y.; Xu, M. Multi-camera multi-player tracking with deep player identification in sports
video. Pattern Recognit. 2020, 102, 107260. [CrossRef]
213. Karungaru, S.; Matsuura, K.; Tanioka, H.; Wada, T.; Gotoda, N. Ground Sports Strategy Formulation and Assistance Technology
Development: Player Data Acquisition from Drone Videos. In Proceedings of the 8th International Conference on Industrial
Technology and Management (ICITM), Cambridge, UK, 2–4 March 2019; pp. 322–325.
214. Hui, Q. Motion video tracking technology in sports training based on Mean-Shift algorithm. J. Supercomput. 2019, 75, 6021–6037.
[CrossRef]
215. Castro, R.L.; Canosa, D.A. Using Artificial Vision Techniques for Individual Player Tracking in Sport Events. Proceedings
2019, 21, 21.
216. Buric, M.; Ivasic-Kos, M.; Pobar, M. Player tracking in sports videos. In Proceedings of the IEEE International Conference on
Cloud Computing Technology and Science (CloudCom), Sydney, Australia, 11–13 December 2019; pp. 334–340.
217. Moon, S.; Lee, J.; Nam, D.; Yoo, W.; Kim, W. A comparative study on preprocessing methods for object tracking in sports events.
In Proceedings of the 20th International Conference on Advanced Communication Technology (ICACT), Chuncheon, Korea,
11–14 February 2018; pp. 460–462.
218. Xing, J.; Ai, H.; Liu, L.; Lao, S. Multiple player tracking in sports video: A dual-mode two-way bayesian inference approach with
progressive observation modeling. IEEE Trans. Image Process. 2010, 20, 1652–1667. [CrossRef]
219. Liang, Q.; Wu, W.; Yang, Y.; Zhang, R.; Peng, Y.; Xu, M. Multi-Player Tracking for Multi-View Sports Videos with Improved
K-Shortest Path Algorithm. Appl. Sci. 2020, 10, 864. [CrossRef]
220. Lu, W.L.; Ting, J.A.; Little, J.J.; Murphy, K.P. Learning to track and identify players from broadcast sports videos. IEEE Trans.
Pattern Anal. Mach. Intell. 2013, 35, 1704–1716.
221. Huang, Y.C.; Liao, I.N.; Chen, C.H.; İk, T.U.; Peng, W.C. Tracknet: A deep learning network for tracking high-speed and tiny
objects in sports applications. In Proceedings of the 16th IEEE International Conference on Advanced Video and Signal Based
Surveillance (AVSS), Taipei, Taiwan, 18–21 September 2019; pp. 1–8.
222. Tan, S.; Yang, R. Learning similarity: Feature-aligning network for few-shot action recognition. In Proceedings of the International
Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–7.
223. Ullah, A.; Ahmad, J.; Muhammad, K.; Sajjad, M.; Baik, S.W. Action recognition in video sequences using deep bi-directional
LSTM with CNN features. IEEE Access 2017, 6, 1155–1166. [CrossRef]
224. Russo, M.A.; Kurnianggoro, L.; Jo, K.H. Classification of sports videos with combination of deep learning models and transfer
learning. In Proceedings of the International Conference on Electrical, Computer and Communication Engineering (ECCE),
Chittagong, Bangladesh, 7–9 February 2019; pp. 1–5.
225. Waltner, G.; Mauthner, T.; Bischof, H. Indoor Activity Detection and Recognition for Sport Games Analysis. arXiv 2021,
arXiv:1404.6413.
226. Soomro, K.; Zamir, A.R. Action recognition in realistic sports videos. In Computer Vision in Sports; Springer: Berlin/Heidelberg,
Germany, 2014; pp. 181–208.
227. Xu, K.; Jiang, X.; Sun, T. Two-stream dictionary learning architecture for action recognition. IEEE Trans. Circuits Syst. Video
Technol. 2017, 27, 567–576. [CrossRef]
228. Chaudhury, S.; Kimura, D.; Vinayavekhin, P.; Munawar, A.; Tachibana, R.; Ito, K.; Inaba, Y.; Matsumoto, M.; Kidokoro, S.; Ozaki,
H. Unsupervised Temporal Feature Aggregation for Event Detection in Unstructured Sports Videos. In Proceedings of the IEEE
International Symposium on Multimedia (ISM), San Diego, CA, USA, 9–11 December 2019; pp. 9–97.
229. Li, Y.; He, H.; Zhang, Z. Human motion quality assessment toward sophisticated sports scenes based on deeply-learned 3D CNN
model. J. Vis. Commun. Image Represent. 2020, 71, 102702. [CrossRef]
230. Chen, H.T.; Chou, C.L.; Tsai, W.C.; Lee, S.Y.; Lin, B.S.P. HMM-based ball hitting event exploration system for broadcast baseball
video. J. Vis. Commun. Image Represent. 2012, 23, 767–781. [CrossRef]
231. Punchihewa, N.G.; Yamako, G.; Fukao, Y.; Chosa, E. Identification of key events in baseball hitting using inertial measurement
units. J. Biomech. 2019, 87, 157–160. [CrossRef] [PubMed]
232. Kapela, R.; Świetlicka, A.; Rybarczyk, A.; Kolanowski, K. Real-time event classification in field sport videos. Signal Process. Image
Commun. 2015, 35, 35–45. [CrossRef]
233. Maksai, A.; Wang, X.; Fua, P. What players do with the ball: A physically constrained interaction modeling. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 972–981.
234. Goud, P.S.H.V.; Roopa, Y.M.; Padmaja, B. Player Performance Analysis in Sports: With Fusion of Machine Learning and Wearable
Technology. In Proceedings of the 3rd International Conference on Computing Methodologies and Communication (ICCMC),
Erode, India, 27–29 March 2019; pp. 600–603.
235. Park, Y.J.; Kim, H.S.; Kim, D.; Lee, H.; Kim, S.B.; Kang, P. A deep learning-based sports player evaluation model based on game
statistics and news articles. Knowl.-Based Syst. 2017, 138, 15–26. [CrossRef]
236. Tejero-de Pablos, A.; Nakashima, Y.; Sato, T.; Yokoya, N.; Linna, M.; Rahtu, E. Summarization of user-generated sports video by
using deep action recognition features. IEEE Trans. Multimed. 2018, 20, 2000–2011. [CrossRef]
237. Javed, A.; Irtaza, A.; Khaliq, Y.; Malik, H.; Mahmood, M.T. Replay and key-events detection for sports video summarization
using confined elliptical local ternary patterns and extreme learning machine. Appl. Intell. 2019, 49, 2899–2917. [CrossRef]
238. Rafiq, M.; Rafiq, G.; Agyeman, R.; Choi, G.S.; Jin, S.I. Scene classification for sports video summarization using transfer learning.
Sensors 2020, 20, 1702. [CrossRef]
239. Khan, A.A.; Shao, J.; Ali, W.; Tumrani, S. Content-Aware summarization of broadcast sports Videos: An Audio–Visual feature
extraction approach. Neural Process. Lett. 2020, 52, 1945–1968. [CrossRef]
240. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM
2017, 60, 84–90. [CrossRef]
241. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the
International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015.
242. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with
convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June
2015; pp. 1–9.
243. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
244. Iandola, F.N.; Moskewicz, M.W.; Ashraf, K.; Han, S.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50x fewer
parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360.
245. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient
convolutional neural networks for mobile vision applications. arXiv 2021, arXiv:1704.04861.
246. Murthy, C.B.; Hashmi, M.F.; Bokde, N.D.; Geem, Z.W. Investigations of object detection in images/videos using various deep
learning techniques and embedded platforms—A comprehensive review. Appl. Sci. 2020, 10, 3280. [CrossRef]
247. Cao, D.; Zeng, K.; Wang, J.; Sharma, P.K.; Ma, X.; Liu, Y.; Zhou, S. BERT-Based Deep Spatial-Temporal Network for Taxi Demand
Prediction. IEEE Trans. Intell. Transp. Syst. 2021, Early Access. [CrossRef]
248. Wang, J.; Zou, Y.; Lei, P.; Sherratt, R.S.; Wang, L. Research on recurrent neural network based crack opening prediction of concrete
dam. J. Internet Technol. 2020, 21, 1161–1169.
249. Chen, C.; Li, K.; Teo, S.G.; Zou, X.; Li, K.; Zeng, Z. Citywide traffic flow prediction based on multiple gated spatio-temporal
convolutional neural networks. ACM Trans. Knowl. Discov. Data 2020, 14, 1–23. [CrossRef]
250. Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent Neural Network Regularization. arXiv 2014, arXiv:1409.2329.
251. Jiang, X.; Yan, T.; Zhu, J.; He, B.; Li, W.; Du, H.; Sun, S. Densely connected deep extreme learning machine algorithm. Cogn.
Comput. 2020, 12, 979–990. [CrossRef]
252. Zhang, Y.; Wang, C.; Wang, X.; Zeng, W.; Liu, W. FairMOT: On the fairness of detection and re-identification in multiple object
tracking. Int. J. Comput. Vis. 2021, 129, 3069–3087. [CrossRef]
253. Wojke, N.; Bewley, A.; Paulus, D. Simple online and realtime tracking with a deep association metric. In Proceedings of the IEEE
International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3645–3649.
254. Hu, H.N.; Yang, Y.H.; Fischer, T.; Darrell, T.; Yu, F.; Sun, M. Monocular Quasi-Dense 3D Object Tracking. arXiv 2021,
arXiv:2103.07351.
255. Kim, A.; Osep, A.; Leal-Taixé, L. EagerMOT: 3D Multi-Object Tracking via Sensor Fusion. In Proceedings of the 2021 IEEE
International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 11315–11321.
256. Chaabane, M.; Zhang, P.; Beveridge, J.R.; O’Hara, S. DEFT: Detection embeddings for tracking. arXiv 2021, arXiv:2102.02267.
257. Zeng, F.; Dong, B.; Wang, T.; Chen, C.; Zhang, X.; Wei, Y. MOTR: End-to-End Multiple-Object Tracking with TRansformer. arXiv
2021, arXiv:2105.03247.
258. Wang, Z.; Zheng, L.; Liu, Y.; Li, Y.; Wang, S. Towards real-time multi-object tracking. In Proceedings of the Computer Vision–ECCV
2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 107–122.
259. Xu, Y.; Osep, A.; Ban, Y.; Horaud, R.; Leal-Taixé, L.; Alameda-Pineda, X. How to train your deep multi-object tracker. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020;
pp. 6787–6796.
260. Sun, P.; Jiang, Y.; Zhang, R.; Xie, E.; Cao, J.; Hu, X.; Kong, T.; Yuan, Z.; Wang, C.; Luo, P. TransTrack: Multiple-object tracking with
transformer. arXiv 2021, arXiv:2012.15460.
261. Xu, Z.; Zhang, W.; Tan, X.; Yang, W.; Su, X.; Yuan, Y.; Zhang, H.; Wen, S.; Ding, E.; Huang, L. PointTrack++ for Effective Online
Multi-Object Tracking and Segmentation. arXiv 2021, arXiv:2007.01549.
262. Gupta, A.; Johnson, J.; Fei-Fei, L.; Savarese, S.; Alahi, A. Social GAN: Socially acceptable trajectories with generative adversarial
networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22
June 2018; pp. 2255–2264.
263. Phan-Minh, T.; Grigore, E.C.; Boulton, F.A.; Beijbom, O.; Wolff, E.M. CoverNet: Multimodal behavior prediction using trajectory
sets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June
2020; pp. 14074–14083.
264. Li, X.; Ying, X.; Chuah, M.C. GRIP: Graph-based interaction-aware trajectory prediction. In Proceedings of the IEEE Intelligent
Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 3960–3966.
265. Salzmann, T.; Ivanovic, B.; Chakravarty, P.; Pavone, M. Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020;
pp. 683–700.
266. Mohamed, A.; Qian, K.; Elhoseiny, M.; Claudel, C. Social-STGCNN: A social spatio-temporal graph convolutional neural network
for human trajectory prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
Seattle, WA, USA, 13–19 June 2020; pp. 14424–14432.
267. Amirian, J.; Zhang, B.; Castro, F.V.; Baldelomar, J.J.; Hayet, J.B.; Pettré, J. OpenTraj: Assessing prediction complexity in human
trajectories datasets. In Proceedings of the Asian Conference on Computer Vision, Kyoto, Japan, 30 November–4 December 2020;
pp. 1–17.
268. Yu, C.; Ma, X.; Ren, J.; Zhao, H.; Yi, S. Spatio-temporal graph transformer networks for pedestrian trajectory prediction. In
Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 507–523.
269. Wang, C.; Wang, Y.; Xu, M.; Crandall, D.J. Stepwise Goal-Driven Networks for Trajectory Prediction. arXiv 2021,
arXiv:2103.14107.
270. Chen, J.; Li, K.; Bilal, K.; Li, K.; Philip, S.Y. A bi-layered parallel training architecture for large-scale convolutional neural networks.
IEEE Trans. Parallel Distrib. Syst. 2018, 30, 965–976. [CrossRef]
271. Gu, X.; Xue, X.; Wang, F. Fine-Grained Action Recognition on a Novel Basketball Dataset. In Proceedings of the IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 2563–2567.
272. Giancola, S.; Amine, M.; Dghaily, T.; Ghanem, B. SoccerNet: A scalable dataset for action spotting in soccer videos. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 1711–1721.
273. Conigliaro, D.; Rota, P.; Setti, F.; Bassetti, C.; Conci, N.; Sebe, N.; Cristani, M. The s-hock dataset: Analyzing crowds at the
stadium. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015;
pp. 2039–2047.
274. Niebles, J.C.; Chen, C.W.; Li, F.-F. Modeling temporal structure of decomposable motion segments for activity classification. In
Proceedings of the European Conference on Computer Vision, Crete, Greece, 5–11 September 2010; pp. 392–405.
275. Voeikov, R.; Falaleev, N.; Baikulov, R. TTNet: Real-time temporal and spatial video analysis of table tennis. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 13–19 June 2020; pp. 884–885.
276. Pettersen, S.A.; Johansen, D.; Johansen, H.; Berg-Johansen, V.; Gaddam, V.R.; Mortensen, A.; Langseth, R.; Griwodz, C.; Stensland,
H.K.; Halvorsen, P. Soccer video and player position dataset. In Proceedings of the 5th ACM Multimedia Systems Conference,
Singapore, 19 March 2014; pp. 18–23.
277. D’Orazio, T.; Leo, M.; Mosca, N.; Spagnolo, P.; Mazzeo, P.L. A semi-automatic system for ground truth generation of soccer video
sequences. In Proceedings of the Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, Genova,
Italy, 2–4 September 2009; pp. 559–564.
278. Feng, N.; Song, Z.; Yu, J.; Chen, Y.P.P.; Zhao, Y.; He, Y.; Guan, T. SSET: A dataset for shot segmentation, event detection, player
tracking in soccer videos. Multimed. Tools Appl. 2020, 79, 28971–28992. [CrossRef]
279. Zhang, W.; Liu, Z.; Zhou, L.; Leung, H.; Chan, A.B. Martial arts, dancing and sports dataset: A challenging stereo and multi-view
dataset for 3D human pose estimation. Image Vis. Comput. 2017, 61, 22–39. [CrossRef]
280. De Vleeschouwer, C.; Chen, F.; Delannay, D.; Parisot, C.; Chaudy, C.; Martrou, E.; Cavallaro, A. Distributed video acquisition
and annotation for sport-event summarization. NEM Summit 2008, 8. Available online: http://hdl.handle.net/2078.1/90154
(accessed on 12 February 2020).
281. Karpathy, A.; Toderici, G.; Shetty, S.; Leung, T.; Sukthankar, R.; Fei-Fei, L. Large-scale video classification with convolutional
neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA,
23–28 June 2014; pp. 1725–1732.
282. Dou, Z. Research on virtual simulation of basketball technology 3D animation based on FPGA and motion capture system.
Microprocess. Microsyst. 2021, 81, 103679. [CrossRef]
283. Yin, L.; He, R. Target state recognition of basketball players based on video image detection and FPGA. Microprocess. Microsyst.
2021, 80, 103340. [CrossRef]
284. Bao, H.; Yao, X. Dynamic 3D image simulation of basketball movement based on embedded system and computer vision.
Microprocess. Microsyst. 2021, 81, 103655. [CrossRef]
285. Junjun, G. Basketball action recognition based on FPGA and particle image. Microprocess. Microsyst. 2021, 80, 103334. [CrossRef]
286. Avaya. Avaya: Connected Sports Fans 2016—Trends on the Evolution of Sports Fans Digital Experience with Live Events.
Available online: https://www.panoramaaudiovisual.com/wp-content/uploads/2016/07/connected-sports-fan-2016-report-
avaya.pdf (accessed on 12 February 2020).
287. Duarte, F.F.; Lau, N.; Pereira, A.; Reis, L.P. A survey of planning and learning in games. Appl. Sci. 2020, 10, 4529. [CrossRef]
288. Lee, H.S.; Lee, J. Applying artificial intelligence in physical education and future perspectives. Sustainability 2021, 13, 351.
[CrossRef]
289. Egri-Nagy, A.; Törmänen, A. The game is not over yet—Go in the post-AlphaGo era. Philosophies 2020, 5, 37. [CrossRef]
290. Innovations, H.E. Hawk-Eye in Cricket. 2017. Available online: https://www.hawkeyeinnovations.com/sports/cricket (accessed
on 12 February 2020).
291. Innovations, H.E. Hawk-Eye Tennis System. 2017. Available online: https://www.hawkeyeinnovations.com/sports/tennis
(accessed on 12 February 2020).
292. Innovations, H.E. Hawk-Eye Goal Line Technology. 2017. Available online: https://www.hawkeyeinnovations.com/products/
ball-tracking/goal-line-technology (accessed on 12 February 2020).
293. SportVU, S. Player Tracking and Predictive Analytics. 2017. Available online: https://www.statsperform.com/team-
performance/football/optical-tracking/ (accessed on 12 February 2020).
294. ChyronHego. Product Information Sheet TRACAB Optical Tracking. 2017. Available online: https://chyronhego.com/wp-
content/uploads/2019/01/TRACAB-PI-sheet.pdf (accessed on 12 February 2020).
295. Leong, L.H.; Zulkifley, M.A.; Hussain, A.B. Computer vision approach to automatic linesman. In Proceedings of the IEEE 10th
International Colloquium on Signal Processing and its Applications, Kuala Lumpur, Malaysia, 9–10 March 2014; pp. 212–215.
296. Zhang, T.; Ghanem, B.; Ahuja, N. Robust multi-object tracking via cross-domain contextual information for sports video analysis.
In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, 25–30
March 2012; pp. 985–988.
297. Xiao, J.; Stolkin, R.; Leonardis, A. Multi-target tracking in team-sports videos via multi-level context-conditioned latent behaviour
models. In Proceedings of the British Machine Vision Conference, Nottingham, UK, 1–5 September 2014.
298. Wang, J.; Yang, Y.; Wang, T.; Sherratt, R.S.; Zhang, J. Big data service architecture: A survey. J. Internet Technol. 2020, 21, 393–405.
299. Zhang, J.; Zhong, S.; Wang, T.; Chao, H.C.; Wang, J. Blockchain-based systems and applications: A survey. J. Internet Technol.
2020, 21, 1–14.
300. Pu, B.; Li, K.; Li, S.; Zhu, N. Automatic fetal ultrasound standard plane recognition based on deep learning and IIoT. IEEE Trans.
Ind. Inform. 2021, 17, 7771–7780. [CrossRef]
301. Messelodi, S.; Modena, C.M.; Ropele, V.; Marcon, S.; Sgrò, M. A Low-Cost Computer Vision System for Real-Time Tennis
Analysis. In Proceedings of the International Conference on Image Analysis and Processing; Springer: Berlin/Heidelberg, Germany,
2019; pp. 106–116.
302. Liu, Y.; Liang, D.; Huang, Q.; Gao, W. Extracting 3D information from broadcast soccer video. Image Vis. Comput. 2006,
24, 1146–1162. [CrossRef]