
Manual Action Estimation

The document describes the pipeline and code for a system that estimates actions in videos using computer vision and deep learning techniques. It involves: 1. Creating custom datasets from videos, using ball/player detection and VGG features to identify frames, actions, and ball/player positions. 2. Training an LSTM model on the custom datasets to predict actions by feeding it sequences of frames. 3. The code files described implement dataset creation using the different approaches, and train and use the LSTM model to estimate actions in test videos.

Action Estimation: User Manual

1. Environment and files:

● To activate the project environment: conda activate test
● To enter the project directory: cd Action_Estimation

The directory contains the necessary Python files, video files and datasets described in
the table below:

File Name                            File Type       Short Description

lfc_not_main_2.py                    Python file     The main part of the project; takes a
                                                     video file and outputs datasets
lstm2.py                             Python file     Takes datasets and trains an LSTM model
vgg_test.py                          Python file     Takes a video file and generates
                                                     datasets using the VGG approach
dataset_merger.py                    Python file     Modifies the VGG dataset by adding the
                                                     corresponding action features from our
                                                     custom dataset
dataset_30_actions_fcb_rma_full.csv  CSV dataset     Dataset gathered by our approach
dataset_vgg_2020.csv                 CSV dataset     Dataset gathered by the VGG approach
dataset_vgg_2020_merged.csv          CSV dataset     Modified VGG dataset with action
                                                     features
matchForLSTM.mp4                     MP4 video file  RMA vs FCB, 45-minute match
ast_aty_forLSTM.mp4                  MP4 video file  Astana vs Atyrau, 90-minute match
model_ex-100_acc-0.829508.h5         H5 model        Scene recognition model (82.9%
                                                     accuracy)
2. Pipeline of the system:

To predict with OUR approach (custom ball and player detection), run the following
sequence:
    lfc_not_main_2.py
    lstm2.py

To predict with the VGG approach, run the following sequence:
    lfc_not_main_2.py
    vgg_test.py
    dataset_merger.py
    lstm2.py
3. “lfc_not_main_2.py” file
Creates a custom dataset containing the frame number, action label, ball coordinates,
team owning the ball, etc., by performing player and ball detection.

Lines of code   Description

1-22            Importing necessary libraries
23-51           Class DominantColors with its function dominantColors: takes an
                image, performs K-Means clustering and returns a list of the
                dominant colors in the image (as integers)
53-58           Loading the scene recognition model
60-73           Mapping our actions to integers
74-99           Initializing variables
106-116         Function "process_countour", which determines to which team the
                players' contours belong
118-134         Function "find_closest_player", which returns the number of the
                team in possession of the ball
136-140         Loading the Custom_Ball_Detector model
143             Start of the while loop that analyzes each frame of the video
153-159         Saving the current frame to a file, then using this file as the
                input for scene recognition
154             Scene recognition is run only on every 15th frame (to keep the
                running time low)
163-182         Applying morphological transformations to the image in order to
                find the borders of the field
186             Condition that checks the current scene
187-195         Finding contours on the field
198-210         Checking the sizes of the contours to ensure they are player
                contours
214-237         Determining to which team each player belongs
239-258         Finding the 2 dominant colors within the field borders, ignoring
                the field color (executed only once, when 16 or more contours
                are detected)
262-279         Ball detection
297-328         Entering the player coordinates into an array
335-342         Condition that checks whether the ball has been detected for
                more than 30 frames (i.e. 1 second)
343-350         Performing DBSCAN clustering on the ball coordinates to remove
                noise and false detections
353-459         Categorizing the 30 actions according to the ball movement
462-493         Saving all data to the dataset
495-498         Clearing the arrays for the next frame's data
501-506         Visual representation of the detection via video output
508-512         Finishing the program
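The DominantColors step (lines 23-51) can be sketched as follows. This is a minimal illustration of the idea, assuming scikit-learn's KMeans rather than the project's exact code:

```python
import numpy as np
from sklearn.cluster import KMeans

def dominant_colors(image, k=2):
    """Cluster an RGB image's pixels with K-Means and return the k
    cluster centers as integer colors."""
    pixels = image.reshape(-1, 3).astype(float)   # one row per pixel
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    return km.cluster_centers_.astype(int)
```

In the project this runs once, when enough player contours are visible, and the field's own color is discarded so that the remaining dominant colors identify the teams' kits.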

4. “lstm2.py” file
Builds the model, predicts action labels from the dataset, and prints the accuracy.

Lines of code   Description

24-35, 50-72    Load and helper functions that became outdated after the
                problem of multiple ball detections was solved
38-48           Finds the maximum number of frames of one action. This is
                needed to give the split data a fixed dimension, which is why
                we must know the maximum action length in the dataset
74-98           load_data function: to feed a dataset to the LSTM network it
                must be 3-dimensional; the LSTM input shape is (number of
                samples, number_of_frames_in_max_sequence, features). The
                remaining lines are matrix manipulation that brings the input
                to these dimensions
100-115         If the dataset should be prepared with a fixed number of frames
                (rather than a fixed action sequence), use load_data_window. It
                takes the fixed frame count as an argument, manipulates the
                matrix accordingly, and builds train and test sets with the
                dimensions above
117-127         Function that returns the model. The lines are
                self-explanatory; the add function appends layers. When
                stacking LSTM layers, every LSTM layer except the last one must
                be created with return_sequences=True
129-137         Function that compares predicted and test values by checking
                whether the most probable action label equals the test label
139-156         Similar to the function above, but counts a success if either
                of the top-2 predicted actions matches the label
158-170         Plots a graph of the action distribution
172-189         main function, where the program begins. Each line is
                self-explanatory and refers to the functions described above.
                Also measures and outputs the running time
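The 3-dimensional input preparation described for load_data can be sketched as follows. This is a hedged illustration, not the file's exact code; max_len corresponds to the maximum action length found in lines 38-48:

```python
import numpy as np

def stack_actions(actions, max_len):
    """Zero-pad variable-length actions (each a (frames, features) array)
    into one array of shape (samples, max_len, features), the fixed
    3-D input shape an LSTM layer expects."""
    n_features = actions[0].shape[1]
    out = np.zeros((len(actions), max_len, n_features))
    for i, seq in enumerate(actions):
        out[i, :len(seq)] = seq    # shorter actions stay zero-padded
    return out
```

load_data_window differs only in that every sample is cut to a fixed window of frames instead of being padded to the longest action.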

5. “vgg_test.py” file
Creates a separate file in the directory containing the 4096 features predicted by the
VGG model's output.

Lines of code   Description

17              Builds the VGG16 model so that it outputs the layer before the
                last fully-connected layer, i.e. 4096 features
20-23           Function that appends an array to the end of a file
25-40           Loops through each frame of the video sequence and reshapes the
                image matrix so it is accepted as input by the model built
                above; at the end, appends the 4096 predicted features to the
                file
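The matrix manipulation in lines 25-40 amounts to shaping each frame into the input VGG16 expects. A minimal sketch, where nearest-neighbour indexing stands in for the real pipeline's resize call (and a real run would also apply keras' preprocess_input before prediction):

```python
import numpy as np

def frame_to_vgg_input(frame, size=224):
    """Resize a frame to size x size (nearest neighbour) and add a batch
    axis, producing the (1, 224, 224, 3) array VGG16 accepts."""
    h, w = frame.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = frame[rows][:, cols].astype("float32")
    return resized[np.newaxis, ...]      # add batch dimension
```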

6. “dataset_merger.py” file
To feed the dataset into lstm2.py and make it complete, we must attach the frame number
and action label to the dataset that contains only the 4096 features from vgg_test.py.

Lines of code   Description

6-10            Function that appends an array to the end of a file
13              Loads the custom dataset with custom features (player numbers,
                team owning the ball, coordinates, etc.)
18              Loads the dataset that contains the VGG features
22-30           Iterates through the first dataset, matches the frame numbers,
                creates a new array containing (frame number, action label,
                4096 features) and appends it to the end of the file
32              Prints the time taken to execute the procedure
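The matching loop (lines 22-30) can be sketched as below. This sketch assumes the VGG feature rows are indexed by frame number, which is an illustrative assumption rather than the file's exact layout:

```python
def merge_by_frame(custom_rows, vgg_rows):
    """For each (frame, action, ...) row of the custom dataset, emit
    [frame, action, *vgg_features] when a VGG row exists for that frame."""
    merged = []
    for frame, action, *_ in custom_rows:
        if frame < len(vgg_rows):                      # VGG row for this frame?
            merged.append([frame, action, *list(vgg_rows[frame])])
    return merged
```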

Additional Neural Network models:

7.1 Custom ball detection model:

To train another model for ball detection, enter the following folder:
/home/lag/ball_detection
The dataset for ball detection is located at:
/home/lag/ball_detection/ball_data
To train a new model, execute the following Python file:
moses_object_detection.py

Lines of code   Description

1               Importing libraries
3-4             Choosing the training approach: YOLOv3
5               Selecting the dataset directory
6               Setting the training configuration
7               Executing the training
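The five steps above map roughly onto ImageAI's custom detection trainer. This is a hedged sketch, not the actual contents of moses_object_detection.py: the class name array and hyperparameter values are illustrative, and the exact method names can vary between ImageAI releases.

```python
# Illustrative training sketch based on ImageAI's DetectionModelTrainer;
# object names and hyperparameters below are assumptions, not the project's.
from imageai.Detection.Custom import DetectionModelTrainer

trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()            # lines 3-4: training approach YOLOv3
trainer.setDataDirectory(data_directory="/home/lag/ball_detection/ball_data")
trainer.setTrainConfig(object_names_array=["ball"],   # class labeled in LabelImg
                       batch_size=4,
                       num_experiments=100)
trainer.trainModel()                      # line 7: execute training
```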

The dataset was acquired using the LabelImg software. To run it, type:

python3 labelimg

This software is used to label objects in images and save the corresponding labels in
XML files.

7.2 Scene recognition model:

To train another scene recognition model on new data, enter the following folder:
/home/lag/imageAI2

The ImageAI library was used to train the scene recognition models.

The dataset and models for scene recognition are located at:
/home/lag/imageAI2/idenprof

To train a new model, execute the following Python file:
scene_recognition_training.py

Lines of code   Description

1               Importing libraries
3-4             Setting the model type: DenseNet
5               Selecting the dataset directory
6               Setting the training configuration and starting the training
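The steps above map roughly onto ImageAI's custom model trainer. A hedged sketch, not the actual contents of scene_recognition_training.py: the class count and hyperparameters are illustrative, and the trainer's module path and method names differ between ImageAI releases (the ImageAI 2.x naming is assumed here).

```python
# Illustrative training sketch based on ImageAI 2.x's custom ModelTraining;
# num_objects and the other hyperparameters below are assumptions.
from imageai.Prediction.Custom import ModelTraining

trainer = ModelTraining()
trainer.setModelTypeAsDenseNet()          # lines 3-4: model type DenseNet
trainer.setDataDirectory("/home/lag/imageAI2/idenprof")
trainer.trainModel(num_objects=10,        # number of scene classes (assumed)
                   num_experiments=100,   # training epochs
                   batch_size=32)
```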
