
Comparative Analysis of 1D CNN GRU and LSTM For Classifying Step Duration in Elderly and Adolescents Using Computer Vision

This study compares the efficacy of 1D-CNN, GRU, and LSTM deep learning models in classifying step duration data from healthy adolescents and elderly individuals to aid in early detection of gait-related disorders. Results indicate that 1D-CNN outperforms both GRU and LSTM in accuracy and processing time, achieving a perfect accuracy of 1.000 in under 60 seconds. The research highlights the potential for using computer vision and deep learning to analyze gait and detect neurodegenerative diseases.



Article in International Journal of Robotics and Control Systems · January 2025


DOI: 10.31763/ijrcs.v5i1.1588


All content following this page was uploaded by Ezreen Farina Shair on 13 January 2025.



International Journal of Robotics and Control Systems
IJRCS Vol. 5, No. 1, 2025, pp. 426-439
ISSN 2775-2658
http://pubs2.ascee.org/index.php/ijrcs

Comparative Analysis of 1D – CNN, GRU, and LSTM for Classifying Step Duration in Elderly and Adolescents Using Computer Vision

Teng Hong Lee a,1, Ezreen Farina Shair a,2,*, Abdul Rahim Abdullah a,3, Kazi Ashikur Rahman a,4, Nursabillilah Mohd Ali a,5, Nur Zawani Saharuddin a,6, Nurhazimah Nazmi b,7
a Rehabilitation and Assistive Technology Research Group, Faculty of Electrical Technology and Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
b Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, Jalan Sultan Yahya Petra, 54100, Kuala Lumpur, Malaysia
1 [email protected]; 2 [email protected]; 3 [email protected]; 4 [email protected]; 5 [email protected]; 6 [email protected]; 7 [email protected]

* Corresponding Author

ARTICLE INFO
Article history: Received August 30, 2024; Revised November 08, 2024; Accepted January 10, 2024
Keywords: Computer Vision; Step Duration; Deep Learning; Feature Extraction; Time Series Data

ABSTRACT
Developing a classification system that can predict the onset of neurodegenerative diseases or gait-related disorders in elders is vital for preventing incidents like falls. Early detection allows a reduction in symptoms and treatment cost for the elderly. In this study, step duration data from five healthy adolescents aged 23 – 29 years and five healthy elderly individuals aged 71 – 77 years were sourced from PhysioNet. To ensure proper training of the deep learning models, synthetic data was generated from the original dataset using a noise jittering technique, with random noise in the range of -0.01 to 0.01 added to the original data. Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU) and 1D Convolutional Neural Network (1D – CNN) models are trained on the data since it is available as time series data. LSTM and GRU are advanced forms of the Recurrent Neural Network (RNN), while 1D – CNN can capture temporal dependencies in sequential data. 1D – CNN has the advantages over GRU and LSTM of being more robust to noise and able to capture complex patterns in the data. The methods are compared in terms of processing time and accuracy. Results show that 1D – CNN outperforms both LSTM and GRU, with an accuracy of 1.000 in less than 60 seconds. The contribution of this research is to show that healthy old people and healthy young people can be classified with deep learning. Future research can explore deep learning for the classification of Parkinson’s disease.

This is an open-access article under the CC–BY-SA license.

1. Introduction
The study of human gait is complex because walking depends on the coordination of multiple body systems and muscles [1], [2]. The musculoskeletal system, in which bones, muscles and joints work together, the nervous system, which coordinates muscle activation and movement patterns, and individual variability all add to this complexity [3], [4]. Gait research therefore spans many disciplines, including biology, physics, physiotherapy, electrical engineering and mechanical engineering [5], [6]. Many digital biomarkers can be used for gait analysis, such as spatiotemporal parameters, kinematic parameters and muscle activity. Step duration is one of the spatiotemporal parameters.
This paper presents a comparative analysis of LSTM, GRU and 1D – CNN for classifying the step duration of healthy young people and healthy old people. Gait deteriorates as a person ages [7]. Early detection of gait deterioration is important because it raises awareness and encourages a healthier lifestyle, which can help prevent the deterioration from advancing into neurodegenerative diseases like Parkinson’s disease [8]. Even though this research is limited to the classification of young and elderly subjects, it can open the door to the classification of Parkinson’s disease, Huntington’s disease and other disorders. A reduction in symptoms and long-term treatment cost is possible if abnormalities in adults’ gait are detected early [9], [10]. Adolescents and elders sit at opposite ends of the gait spectrum: adolescents represent a healthy and more active gait, while elders represent a deteriorating and more passive gait [11]. Elders suffer from neurodegenerative diseases more often than adolescents, and old age is often viewed as a starting point for the progression of these diseases [12]. Hence, classifying the gait of adolescents and elders is important because it allows gait deterioration in adults to be detected and treated early so that a healthier gait can be developed [13], [14].
Researchers have proposed various features for gait analysis, each with its own advantages and disadvantages, and the suitability of a feature depends on the specific case [15], [16]. For example, the Short-Time Fourier Transform (STFT) and the Continuous Wavelet Transform (CWT) can be used to derive features such as the root mean square (RMS) value from the data [17], [18]. STFT and CWT can also be used to form time-frequency images, which can then be fed into a 2D – CNN for classification. Nematallah et al. applied CWT to signals from wearable sensors for human activity recognition [19]. Detrended Fluctuation Analysis (DFA) can be applied to determine whether the data contains long-term correlations [20], [21]. Di Bacco et al. studied the gait variability of treadmill and overground walking by performing DFA on signals captured from a smartphone accelerometer [22]. Furthermore, entropy methods such as Shannon entropy can be used to study the orderliness of the data [23], [24], [25]. Deep learning is easier to apply than conventional machine learning because it does not require human intervention to generate features [26], [27].
Gait analysis covers many signals, ranging from electromyography (EMG) and vertical Ground Reaction Force (vGRF) signals to step duration [28], [29]. This paper focuses on step duration because collecting EMG and vGRF signals requires a system with multiple sensors, which causes patient discomfort and makes it inconvenient to integrate with the Internet of Things (IoT) [30], [31]. Step duration, defined as the time taken between consecutive footsteps, is easier to implement in the IoT because it can be captured with a smartphone camera using computer vision tools such as OpenPose or MoveNet [32]-[34]. Computer vision allows objective and precise measurement and has the potential to replace the conventional analysis of gait-related disorders, which relies on manual assessment by physiotherapists. Computer vision also has the advantage of being non-invasive [35], [36]. Many researchers analyze gait with markerless computer vision methods, which require only a camera of the kind readily available in smartphones. For example, Wei et al. developed a vision-based gait analysis using two smartphone cameras to analyze normal subjects and Parkinson subjects [37]. Blasco-Garcia et al. incorporated an RGB-D camera in a robot to capture human motion [38]. Step duration is time series data, which makes LSTM, GRU and 1D – CNN suitable candidates for training on it. LSTM and GRU can capture long-term dependencies in the data [39], [40], while 1D – CNN can capture the temporal features in the data [41]. 1D – CNN is derived from the conventional CNN used for image classification and extracts features from time series data [42], [43].
The contribution of this paper is an analysis of the efficacy of GRU, LSTM and 1D – CNN in classifying the step duration of young and old people. The aim is to identify the deep learning method best suited for use with a smartphone camera and computer vision in capturing and testing the stride interval.

2. Method
The flowchart of the research is shown in Fig. 1. The step duration data of 5 healthy adolescents (23 – 29 years old) and 5 healthy elders (71 – 77 years old) were collected from PhysioNet. Then, 20 synthetic recordings were generated from each original recording using noise jittering, which adds noise in the range of -0.01 to 0.01 to the data to create a new variant of the original recording. The original and synthetic data were combined and fed into the deep learning models, namely 1D – CNN, LSTM and GRU. The accuracy and processing time of each model were then compared to identify the best method. Combining the synthetic data with the original data gives 210 recordings in total. Of these, 80% (168 recordings) were used for training and the remaining 20% (42 recordings) for testing. The 80% share ensures that most of the data is available for training, while the 20% share leaves enough data for testing. In addition to the train-test split, cross-validation with 5 splits was conducted for comparison with the train-test split.
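The augmentation and split sizes described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the uniform noise distribution, the function names `jitter` and `augment`, and the toy series are all assumptions, since the source only states the noise range and the number of synthetic copies.

```python
import random

def jitter(series, low=-0.01, high=0.01, rng=random):
    """Return a copy of `series` with independent uniform noise added to each value."""
    return [value + rng.uniform(low, high) for value in series]

def augment(originals, copies_per_series=20, seed=0):
    """Keep every original series and append jittered copies of it with the same label."""
    rng = random.Random(seed)
    dataset = []
    for label, series in originals:
        dataset.append((label, list(series)))
        for _ in range(copies_per_series):
            dataset.append((label, jitter(series, rng=rng)))
    return dataset

# Toy stand-ins for the 10 recordings (values are made up, not PhysioNet data).
originals = [(0, [1.10, 1.12, 1.08])] * 5 + [(1, [1.05, 1.20, 0.98])] * 5
dataset = augment(originals)

n_total = len(dataset)          # 10 originals + 10 * 20 jittered copies
n_train = n_total * 8 // 10     # 80% for training (integer arithmetic)
n_test = n_total - n_train      # remaining 20% for testing
print(n_total, n_train, n_test)  # 210 168 42
```

With 10 originals and 20 copies each, the sketch reproduces the 210 / 168 / 42 counts given in the text.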

Fig. 1. The flowchart of the research


Since the step duration recordings have different numbers of timesteps, the maximum number of timesteps, 892, was identified and used as the input length. Recordings with fewer than 892 timesteps were zero-padded to a length of 892. Apart from zero padding, no normalization, scaling or other preprocessing was applied, and the raw data was left intact. It is hypothesized that deep learning methods can classify complex data in complex environments, including noisy ones. No automated hyperparameter tuning was performed in this project; the hyperparameters were selected through repeated manual testing.
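The zero-padding step can be illustrated with a short sketch. The helper `pad_to_length` is hypothetical, and padding at the end of each series is an assumption: the source does not say whether zeros were added before or after the recorded values.

```python
def pad_to_length(series, target_len, pad_value=0.0):
    """Right-pad `series` with `pad_value` until it has `target_len` entries."""
    if len(series) > target_len:
        raise ValueError("series is longer than the target length")
    return list(series) + [pad_value] * (target_len - len(series))

# Toy series of unequal length (values are made up).
recordings = [[1.10, 1.12, 1.08, 1.11], [1.05, 1.20], [0.98, 1.01, 1.03]]

max_len = max(len(r) for r in recordings)   # 892 in the paper; 4 in this toy example
padded = [pad_to_length(r, max_len) for r in recordings]
print(padded[1])  # [1.05, 1.2, 0.0, 0.0]
```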
To train the 1D – CNN, the data is formatted as (number of samples, timesteps), so the shape of the training data is (168, 892). Since GRU and LSTM only accept 3D input of the form (number of samples, timesteps, features), the training data is converted to (168, 892, 1) for these models. The comparative analysis does not include feature importance because deep learning does not require manual feature selection. The models were built in TensorFlow and run on an MSI laptop with 16 GB of Random Access Memory (RAM), an Intel Core i9 CPU and an NVIDIA GeForce RTX 4070 GPU.
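The conversion from (168, 892) to (168, 892, 1) only adds a trailing feature axis. The sketch below shows this with plain nested lists; `shape_of` and `add_feature_axis` are hypothetical helpers and the values are placeholders. Note that Keras `Conv1D` layers generally expect a 3D input as well, so the same trailing axis may have been added for the 1D – CNN; the source does not say.

```python
def shape_of(nested):
    """Return the shape of a regular (non-empty) nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

def add_feature_axis(batch):
    """Turn a (samples, timesteps) batch into (samples, timesteps, 1)."""
    return [[[value] for value in series] for series in batch]

batch_2d = [[0.0] * 892 for _ in range(168)]   # placeholder values only
batch_3d = add_feature_axis(batch_2d)
print(shape_of(batch_2d), shape_of(batch_3d))  # (168, 892) (168, 892, 1)
```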
2.1. Deep Learning Method
2.1.1. 1D–CNN
The conventional CNN is designed for image processing tasks and applies convolutional operations to two-dimensional data to extract features. An image is two-dimensional because it has height and width. 1D – CNN was developed to work with one-dimensional data such as time series and extracts features from sequential data. Like the conventional CNN, 1D – CNN applies a convolutional operation. One advantage of 1D – CNN over the conventional CNN is that it can capture temporal dependencies in sequential data more effectively. In addition, 1D – CNN can train faster than the conventional CNN because its simpler architecture needs fewer parameters. The key principle behind 1D – CNN is the convolution of a kernel with the input data along one dimension [44], [45]. Fig. 2 shows the construction of the 1D – CNN model for classifying the step duration of the elderly and the adolescents. The model consists of 3 Conv1D layers, each with 64 filters and a kernel size of 3, followed by one flattening layer and, finally, one dense layer.
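The one-dimensional convolution at the core of this model can be sketched in a few lines. This is an illustration of the operation only: the kernel values are hypothetical, and the unpadded ("valid") sliding window is an assumption, as the source does not state the padding mode.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` without padding and return the weighted sums.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    k = len(kernel)
    return [
        sum(signal[start + j] * kernel[j] for j in range(k)) + bias
        for start in range(len(signal) - k + 1)
    ]

signal = [1.0, 2.0, 4.0, 7.0, 11.0]
diff_kernel = [-1.0, 0.0, 1.0]   # hypothetical kernel that responds to the local slope
print(conv1d_valid(signal, diff_kernel))  # [3.0, 5.0, 7.0]

# Sequence length after three unpadded layers with kernel size 3 on 892 timesteps.
length = 892
for _ in range(3):
    length = length - 3 + 1
print(length)  # 886
```

Each output value depends only on a window of three neighbouring inputs, which is what is meant by a local pattern.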

Fig. 2. The construction of 1D – CNN model for the classification of the step duration of the elderly and the
adolescents

The model is trained for 300 epochs with a batch size of 8. Fig. 3 shows the architecture of the constructed 1D – CNN model in this research.
2.1.2. LSTM
LSTM is an improved RNN that was created to solve the problems traditional RNNs had in identifying long-term dependencies in sequential data. Conventional RNNs suffer from vanishing and exploding gradients, which can make training difficult. They also lack dedicated memory cells, which makes it hard for them to retain important information over time. LSTMs, on the other hand, have specialized memory cells that are more resistant to vanishing and exploding gradients and can store information for longer periods. Information flow is controlled by the input, forget and output gates of the LSTM architecture [46], [47]. Fig. 4 shows the construction of the LSTM model for classifying the step duration of the elderly and the adolescents. The model stacks five LSTM layers with 256, 128, 128, 64 and 32 cells, followed by a dense layer. Each LSTM layer is followed by dropout(0.2), which is included to prevent overfitting.
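The role of the three gates can be made concrete with a single-unit LSTM step written out by hand. This is a sketch of the standard LSTM equations with scalar weights, not the configuration used in the paper; the parameter dictionary and its values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM; params[name] is (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))         # how much new content is written
    f = sigmoid(pre("forget"))        # how much of the old cell state is kept
    o = sigmoid(pre("output"))        # how much of the cell state is exposed
    g = math.tanh(pre("candidate"))   # candidate content
    c = f * c_prev + i * g            # updated memory cell
    h = o * math.tanh(c)              # updated hidden state
    return h, c

# Hypothetical weights: forget gate saturated open, input gate saturated closed.
params = {
    "input": (0.0, 0.0, -100.0),
    "forget": (0.0, 0.0, 100.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (0.0, 0.0, 0.0),
}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.5, params=params)
print(round(c, 6))  # 0.5
```

With the forget gate open and the input gate closed, the cell state passes through unchanged, which is the mechanism that lets an LSTM store information over long spans.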

Fig. 3. The architecture of the constructed 1D – CNN model

Fig. 4. The construction of LSTM model for the classification of the step duration of the elderly and the
adolescents

The model is trained for 300 epochs with a batch size of 8. Fig. 5 shows the architecture of the constructed LSTM model.

Fig. 5. The architecture of the constructed LSTM model

2.1.3. GRU
Like LSTM, GRU is an RNN variant created to overcome some of the drawbacks of conventional RNNs. With just two gates, a reset gate and an update gate, GRU has a more straightforward architecture than LSTM. The reset gate controls how much of the previous information is discarded, while the update gate controls how much of it is combined with fresh input to update the memory. GRU and LSTM each have their own benefits and drawbacks. Compared to LSTM, GRU is more effective at capturing long-term dependencies in sequential data and is also more computationally economical. However, because of its more straightforward architecture, GRU is less able to capture complex relationships and dependencies in the data and has less control over the flow of information [48], [49]. Fig. 6 shows the construction of the GRU model for classifying the step duration of the elderly and the adolescents. Like the LSTM model, the GRU model stacks five layers with 256, 128, 128, 64 and 32 cells, followed by a dense layer. Each GRU layer is followed by dropout(0.2), which is included to prevent overfitting.
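For comparison, a single-unit GRU step needs only the reset and update gates. The sketch below follows one common formulation, in which the update gate weights the previous state; other texts swap the roles of z and 1 - z. The weights are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, params):
    """One time step of a single-unit GRU; params[name] is (w_x, w_h, bias)."""
    def pre(name, h):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h + b

    z = sigmoid(pre("update", h_prev))                   # update gate
    r = sigmoid(pre("reset", h_prev))                    # reset gate
    h_tilde = math.tanh(pre("candidate", r * h_prev))    # candidate state
    return z * h_prev + (1.0 - z) * h_tilde              # blend old state and candidate

# Hypothetical weights: update gate saturated so the old state is carried over.
params = {
    "update": (0.0, 0.0, 100.0),
    "reset": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h = gru_step(x=1.0, h_prev=0.3, params=params)
print(round(h, 6))  # 0.3
```

There is no separate memory cell and no output gate here, which is why a GRU layer has three weight sets where an LSTM layer has four.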

Fig. 6. The construction of GRU model for the classification of step duration of the elderly and the
adolescents

The model is trained for 300 epochs with a batch size of 8. Fig. 7 shows the architecture of the constructed GRU model.

Fig. 7. The architecture of the constructed GRU model

3. Results and Discussion


Fig. 8 shows the parameters of the constructed 1D – CNN model shown in Fig. 2 and Fig. 3; the total number of parameters is 81665. Fig. 9 shows the parameters of the constructed LSTM model shown in Fig. 4 and Fig. 5; the total number of parameters is 654753. Fig. 10 shows the parameters of the constructed GRU model shown in Fig. 6 and Fig. 7; the total number of parameters is 492897.
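The reported totals can be cross-checked with the standard parameter-count formulas for each layer type. The sketch rests on several assumptions that the source does not state: a single input feature, unpadded convolutions, a dense layer with one output unit, and the GRU bias layout with two bias vectors per gate (the Keras `reset_after` default). Under these assumptions the formulas reproduce all three reported totals.

```python
def conv1d_params(in_channels, filters, kernel_size):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters

def lstm_params(input_dim, units):
    """Four weight sets (input, forget, output gates and candidate), one bias each."""
    return 4 * (units * (input_dim + units) + units)

def gru_params(input_dim, units):
    """Three weight sets (update, reset gates and candidate), two biases each (assumed)."""
    return 3 * (units * (input_dim + units) + 2 * units)

def dense_params(input_dim, units):
    return input_dim * units + units

def recurrent_stack(layer_fn, units_per_layer, input_dim=1):
    """Total parameters of stacked recurrent layers followed by Dense(1)."""
    total = 0
    for units in units_per_layer:
        total += layer_fn(input_dim, units)
        input_dim = units
    return total + dense_params(input_dim, 1)

units = [256, 128, 128, 64, 32]
lstm_total = recurrent_stack(lstm_params, units)
gru_total = recurrent_stack(gru_params, units)

# 1D-CNN: three unpadded Conv1D(64, 3) layers on (892, 1), flatten, Dense(1).
flat_len = (892 - 3 * 2) * 64
cnn_total = conv1d_params(1, 64, 3) + 2 * conv1d_params(64, 64, 3) + dense_params(flat_len, 1)

print(cnn_total, lstm_total, gru_total)  # 81665 654753 492897
```

Most of the 1D – CNN parameters (56705 of 81665) sit in the final dense layer, because the flattened feature map has 886 × 64 entries.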
Based on Fig. 9 and Fig. 10, even though the constructed GRU and LSTM models have the same number of cells per layer (256, 128, 128, 64 and 32) and the same dense layer, GRU has fewer parameters (492897) than LSTM (654753). This is due to the simpler gating structure of GRU, which also limits the intricacy of the relationships and dependencies it can learn. Table 1 shows the test loss, test accuracy and processing time of 1D – CNN, LSTM and GRU. 1D – CNN outperformed both LSTM and GRU in terms of test loss, test accuracy and processing time. Besides, 1D – CNN requires far fewer parameters (81665) than LSTM and GRU. Therefore, the best deep learning method for classifying the step duration of the elderly and the adolescents is 1D – CNN. Table 1 also shows that the test loss and test accuracy of GRU and LSTM are the same, which suggests that the two models behave similarly on this data. According to the literature, LSTM does not work well with highly irregular data, non-stationary data and data with high-frequency noise [50]. This suggests that the step duration data is highly irregular, non-stationary and contains high-frequency noise.

Fig. 8. The parameters that are extracted from the constructed 1D – CNN model

Fig. 9. The parameters that are extracted from the constructed LSTM model


Fig. 10. The parameters that are extracted from the constructed GRU model

Fig. 11 shows a line graph of the stride interval against time (s). On such data, 1D – CNN prevails over LSTM and GRU because it is more robust to noise owing to its ability to extract local patterns. 1D – CNN uses convolutional layers whose filters slide across the input data to detect specific patterns. The filters perform element-wise multiplications and summations, and they suppress random noise by highlighting important features. The kernels operate on a fixed-size window of the input data and detect as well as amplify local features. In contrast, LSTM and GRU are less effective at detecting local patterns because of their wider focus on the entire sequence. The processing time of 1D – CNN is also shorter than that of LSTM and GRU because 1D – CNN processes data in parallel while LSTM and GRU process data sequentially [51]. In 1D – CNN, the convolutional layers apply multiple filters across the input data simultaneously. This parallel processing allows the network to extract features from different parts of the input at the same time, which speeds up the computation and shortens the processing time. Fig. 12 and Fig. 13 show bar charts of the parameter count and the processing time, respectively, for each deep learning method.

Table 1. The test loss, test accuracy and processing time of 1D – CNN, LSTM and GRU
Deep Learning Method    Test Loss         Test Accuracy
1D – CNN                2.6095 × 10^-8    1.0000
LSTM                    9.5488            0.3810
GRU                     9.5488            0.3810

Fig. 14, Fig. 15 and Fig. 16 show the classification reports of 1D – CNN, LSTM and GRU, respectively. The reports are the same whether cross-validation or the 80% training and 20% testing split is used; that is, the train-test split and cross-validation produce the same result for each model. Fig. 14 shows that 1D – CNN scored 100% for precision, recall and F1-score for both the young group, labelled 0, and the elderly group, labelled 1. Fig. 15 and Fig. 16 show that LSTM and GRU produce similar classification results.
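The quantities in a classification report can be computed as in the sketch below. The 42-sample test set with 16 young and 26 elderly labels is purely hypothetical; it is chosen only because 16/42 rounds to the 0.3810 accuracy in Table 1. The source does not report the class counts of the test set or the individual predictions of the recurrent models.

```python
def classification_scores(y_true, y_pred, positive):
    """Precision, recall and F1-score for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical test labels: 0 = young, 1 = elderly.
y_true = [0] * 16 + [1] * 26
perfect = list(y_true)        # every sample classified correctly
always_young = [0] * 42       # a degenerate model that always predicts class 0

print(accuracy(y_true, perfect))                                # 1.0
print(classification_scores(y_true, perfect, positive=1))       # (1.0, 1.0, 1.0)
print(round(accuracy(y_true, always_young), 4))                 # 0.381
print(classification_scores(y_true, always_young, positive=1))  # (0.0, 0.0, 0.0)
```

A model that scores 100% on precision, recall and F1-score for both classes, as reported for 1D – CNN, corresponds to the `perfect` case.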

Fig. 11. The plotting of step duration against time (s) for a healthy old sample 1

Fig. 12. The bar chart of parameter count versus the deep learning methods

The difference between GRU and LSTM is that, owing to its simpler structure, GRU has fewer parameters, as shown in Fig. 12, and a shorter processing time, as shown in Fig. 13. This is because GRU has only two gates, the update gate and the reset gate, while LSTM has three gates, the input gate, the forget gate and the output gate. Even though GRU and LSTM have far more parameters than 1D – CNN, as shown in Fig. 12, 1D – CNN outperformed both with a much shorter processing time, as shown in Fig. 13. Its filters detect and amplify local features while suppressing random noise, whereas GRU and LSTM have a wider focus on the entire sequence. This suggests that the features relevant to classifying the young and the elderly are not spread throughout the sequence but are concentrated locally in distinct parts of the step duration. These distinct parts of the step duration sequence can be explored further to identify the parts of the elderly’s step duration that need rehabilitation to develop a healthier gait.

Fig. 13. The bar chart of processing time versus the deep learning methods

Fig. 14. The classification report of 1D – CNN

Fig. 15. The classification report of LSTM


Fig. 16. The classification report of GRU

4. Conclusion
1D – CNN is the best deep learning technique for classifying the step durations of adolescents and the elderly, which are easily accessible as time series data. 1D – CNN outperformed LSTM and GRU in terms of test loss, test accuracy and processing time. The cross-validation with 5 splits and the train-test split produced the same result for 1D – CNN, LSTM and GRU. Besides, 1D – CNN required fewer parameters than LSTM and GRU. Furthermore, step duration data is highly irregular, non-stationary and contains high-frequency noise. This makes LSTM and GRU, which are suited to capturing long-term dependencies in sequential data, unsuitable for classifying step duration data, whereas 1D – CNN is more robust to noise because of its ability to extract local patterns, which renders it a suitable method. This study demonstrates that, in comparison to LSTM and GRU, 1D – CNN is a more effective way to investigate how people’s step duration declines with age, providing greater insight into individualized treatment for the elderly. Future work can include combining the local pattern extraction abilities of 1D – CNN with the sequential learning capabilities of LSTM and GRU to enhance performance on time series data with similar noisiness and irregularity. Future work can also explore 1D – CNN, LSTM and GRU for classifying neurodegenerative diseases such as Parkinson’s disease and Huntington’s disease at different severity levels.

Author Contribution: All authors contributed equally to this paper. All authors read and approved the final paper.
Funding: Ministry of Higher Education (MOHE) of Malaysia and Universiti Teknikal Malaysia Melaka
(UTeM)
Acknowledgement: The authors acknowledge the support and funding provided by the Ministry of Higher
Education (MOHE) of Malaysia and Universiti Teknikal Malaysia Melaka (UTeM) through the Fundamental
Research Grant Scheme (FRGS), No: FRGS/1/2023/SKK06/UTEM/02/1. The authors are also thankful for the
Kesidang Scholarship bestowed by the UTeM.
Conflicts of Interest: The authors declare no conflict of interest.

References
[1] D. Nohelova, L. Bizovska, N. Vuillerme, and Z. Svoboda, “Gait Variability and Complexity during Single
and Dual-Task Walking on Different Surfaces in Outdoor Environment,” Sensors, vol. 21, no. 14, p. 4792,
2021, https://doi.org/10.3390/s21144792.
[2] F. D. Groote and A. Falisse, “Perspective on musculoskeletal modelling and predictive simulations of
human movement to assess the neuromechanics of gait,” Proceedings of the Royal Society B, vol. 288,
no. 1946, pp. 1-10, 2021, https://doi.org/10.1098/rspb.2020.2432.

Teng Hong Lee (Comparative Analysis of 1D – CNN, GRU, and LSTM for Classifying Step Duration in Elderly and
Adolescents Using Computer Vision)
International Journal of Robotics and Control Systems
ISSN 2775-2658 437
Vol. 5, No. 1, 2025, pp. 426-439

[3] J. S. Lora-Millan, J. C. Moreno, and E. Rocon, “Coordination Between Partial Robotic Exoskeletons and
Human Gait: A Comprehensive Review on Control Strategies,” Frontiers in Bioengineering and
Biotechnology, vol. 10, p. 842294, 2022, https://doi.org/10.3389/fbioe.2022.842294.
[4] M. K. MacLean and D. P. Ferris, “Human muscle activity and lower limb biomechanics of overground
walking at varying levels of simulated reduced gravity and gait speeds,” PLoS One, vol. 16, no. 7, p.
e0253467, 2021, https://doi.org/10.1371/journal.pone.0253467.
[5] S. Qin, J. Yan, X. Chen, W. Li, P. Li and Z. Liu, "Assessing the Stability of Human Gait Based on a
Human Electrostatic Field Detection System," IEEE Sensors Journal, vol. 24, no. 7, pp. 11036-11047,
2024, https://doi.org/10.1109/JSEN.2024.3370301.
[6] S. Piergiovanni, P. Terrier, “Validity of Linear and Nonlinear Measures of Gait Variability to Characterize
Aging Gait with a Single Lower Back Accelerometer,” Sensors, vol. 24, no. 23, p. 7427, 2024,
https://doi.org/10.3390/s24237427.
[7] N. Nedović, F. Eminović, V. Marković, I. Stanković, and S. Radovanović, “Gait Characteristics during
Dual-Task Walking in Elderly Subjects of Different Ages,” Brain Sciences, vol. 14, no. 2, p. 148, 2024,
https://doi.org/10.3390/brainsci14020148.
[8] J. Silva, T. Atalaia, J. Abrantes, and P. Aleixo, “Gait Biomechanical Parameters Related to Falls in the
Elderly: A Systematic Review,” Biomechanics, vol. 4, no. 1, pp. 165–218, 2024,
https://doi.org/10.3390/biomechanics4010011.
[9] H. Kerminen, E. Marzetti, and E. D’Angelo, “Biological and Physical Performance Markers for Early
Detection of Cognitive Impairment in Older Adults,” Journal of Clinical Medicine, vol. 13, no. 3, p. 806,
2024, https://doi.org/10.3390/jcm13030806.
[10] M. Sato, T. Yamashita, D. Okazaki, H. Asada, and K. Yamashita, “Valid Indicators for Predicting Falls
in Community-Dwelling Older Adults Under Ongoing Exercise Intervention to Prevent Care
Requirement,” Sage Open Aging, vol. 10, 2024, https://doi.org/10.1177/23337214241229328.
[11] M. Antonelli, E. Caselli, and L. Gastaldi, “Comparison of Gait Smoothness Metrics in Healthy Elderly
and Young People,” Applied Sciences, vol. 14, no. 2, p. 911, 2024, https://doi.org/10.3390/app14020911.
[12] L. Delbes, N. Mascret, C. Goulon, and G. Montagne, “Differences of gait adaptability behavior between
young and healthy older adults during a locomotor pointing task in virtual reality,” Gait Posture, vol. 109,
pp. 233–239, 2024, https://doi.org/10.1016/j.gaitpost.2024.02.009.
[13] D. Commandeur, M. Klimstra, R. Brodie, and S. Hundza, “A Comparison of Bioelectric and
Biomechanical EMG Normalization Techniques in Healthy Older and Young Adults during Walking
Gait,” Journal of Functional Morphology and Kinesiology, vol. 9, no. 2, p. 90, 2024,
https://doi.org/10.3390/jfmk9020090.
[14] A. Simonet, A. Delafontaine, P. Fourcade, and E. Yiou, “Vertical Center-of-Mass Braking and Motor
Performance during Gait Initiation in Young Healthy Adults, Elderly Healthy Adults, and Patients with
Parkinson’s Disease: A Comparison of Force-Plate and Markerless Motion Capture Systems,” Sensors,
vol. 24, no. 4, p. 1302, 2024, https://doi.org/10.3390/s24041302.
[15] D. Sethi, C. Prakash, and S. Bharti, “Multi-feature gait analysis approach using deep learning in
constraint-free environment,” Expert Systems, vol. 41, no. 7, p. e13274, 2024,
https://doi.org/10.1111/exsy.13274.
[16] D. Łuczak, “Machine Fault Diagnosis through Vibration Analysis: Continuous Wavelet Transform with
Complex Morlet Wavelet and Time–Frequency RGB Image Recognition via Convolutional Neural
Network,” Electronics, vol. 13, no. 2, p. 452, 2024, https://doi.org/10.3390/electronics13020452.
[17] H. Kuduz and F. Kaçar, “A deep learning approach for human gait recognition from time-frequency
analysis images of inertial measurement unit signal,” International Journal of Applied Methods in
Electronics and Computers, vol. 11, no. 3, pp. 165–173, 2023, https://doi.org/10.58190/ijamec.2023.44.
[18] T. A. Mostafa, S. Soltaninejad, T. L. McIsaac, and I. Cheng, “A Comparative Study of Time Frequency
Representation Techniques for Freeze of Gait Detection and Prediction,” Sensors, vol. 21, no. 19, p. 6446,
2021, https://doi.org/10.3390/s21196446.

Teng Hong Lee (Comparative Analysis of 1D – CNN, GRU, and LSTM for Classifying Step Duration in Elderly and
Adolescents Using Computer Vision)
International Journal of Robotics and Control Systems
438 ISSN 2775-2658
Vol. 5, No. 1, 2025, pp. 426-439

[19] H. Nematallah and S. Rajan, “Quantitative Analysis of Mother Wavelet Function Selection for Wearable
Sensors-Based Human Activity Recognition,” Sensors, vol. 24, no. 7, p. 2119, 2024,
https://doi.org/10.3390/s24072119.
[20] J. H. Hollman, W. D. Lee, D. C. Ringquist, C. Taisey, and D. K. Ness, “Comparing adaptive fractal and
detrended fluctuation analyses of stride time variability: Tests of equivalence,” Gait Posture, vol. 94, pp.
9–14, 2022, https://doi.org/10.1016/j.gaitpost.2022.02.019.
[21] D. A. Gadanayak, M. Mishra, and R. C. Bansal, “High impedance fault detection in distribution networks
using randomness of zero-sequence current signal: A detrended fluctuation analysis approach,” Applied
Energy, vol. 368, p. 123452, 2024, https://doi.org/10.1016/j.apenergy.2024.123452.
[22] V. E. D. Bacco and W. H. Gage, “Gait variability, fractal dynamics, and statistical regularity of treadmill
and overground walking recorded with a smartphone,” Gait Posture, vol. 111, pp. 53–58, 2024,
https://doi.org/10.1016/j.gaitpost.2024.04.002.
[23] H. P. Huang, C. F. Hsu, Y. C. Mao, and L. Hsu, “Gait Stability Measurement by Using Average Entropy,”
Entropy, vol. 23, no. 4, p. 412, 2021, https://doi.org/10.3390/e23040412.
[24] Y.-L. Hsieh, M. F. Abbod, G. Saggio, V. Errico, and I. Mazzetta, “Gait Analyses of Parkinson’s Disease
Patients Using Multiscale Entropy,” Electronics, vol. 10, no. 21, p. 2604, 2021,
https://doi.org/10.3390/electronics10212604.
[25] J. M. Yentes and P. C. Raffalt, “Entropy Analysis in Gait Research: Methodological Considerations and
Recommendations,” Annals of Biomedical Engineering, vol. 49, no. 3, pp. 979–990, 2021,
https://doi.org/10.1007/s10439-020-02616-8.
[26] M. M. Taye, “Understanding of Machine Learning with Deep Learning: Architectures, Workflow,
Applications and Future Directions,” Computers, vol. 12, no. 5, p. 91, 2023,
https://doi.org/10.3390/computers12050091.
[27] S. Dargan, M. Kumar, M. R. Ayyagari, and G. Kumar, “A Survey of Deep Learning and Its Applications:
A New Paradigm to Machine Learning,” Archives of Computational Methods in Engineering, vol. 27, no.
4, pp. 1071–1092, 2020, https://doi.org/10.1007/s11831-019-09344-w.
[28] C. De Marchis et al., “Characterizing the Gait of People With Different Types of Amputation and
Prosthetic Components Through Multimodal Measurements: A Methodological Perspective,” Frontiers
in Rehabilitation Sciences, vol. 3, p. 804746, 2022, https://doi.org/10.3389/fresc.2022.804746.
[29] E. F. Shair, N. A. Jamaluddin, and A. R. Abdullah, “Finger Movement Discrimination of EMG Signals
Towards Improved Prosthetic Control using TFD,” International Journal of Advanced Computer Science
and Applications, vol. 11, no. 9, pp. 244–251, 2020, https://doi.org/10.14569/IJACSA.2020.0110928.
[30] M. Wang, W. Lee, L. Shu, Y. S. Kim, and C. H. Park, “Development and Analysis of an Origami-Based
Elastomeric Actuator and Soft Gripper Control with Machine Learning and EMG Sensors,” Sensors, vol.
24, no. 6, p. 1751, 2024, https://doi.org/10.3390/s24061751.
[31] İ. G. Özeloğlu and E. A. Aydin, “Combining features on vertical ground reaction force signal analysis for
multiclass diagnosing neurodegenerative diseases,” International Journal of Medical Informatics, vol.
191, p. 105542, 2024, https://doi.org/10.1016/j.ijmedinf.2024.105542.
[32] J. Stenum, M. M. Hsu, A. Y. Pantelyat, and R. T. Roemmich, “Clinical gait analysis using video-
based pose estimation: Multiple perspectives, clinical populations, and measuring change,” PLOS Digital
Health, vol. 3, no. 3, p. e0000467, 2024, https://doi.org/10.1371/journal.pdig.0000467.
[33] T. Matsuda et al., “Validity Verification of Human Pose-Tracking Algorithms for Gait Analysis
Capability,” Sensors, vol. 24, no. 8, p. 2516, 2024, https://doi.org/10.3390/s24082516.
[34] C. Zhou, D. Feng, S. Chen, N. Ban, and J. Pan, “Portable vision-based gait assessment for post-stroke
rehabilitation using an attention-based lightweight CNN,” Expert Systems with Applications, vol. 238, p.
122074, 2024, https://doi.org/10.1016/j.eswa.2023.122074.
[35] H. Wang et al., “Markerless gait analysis through a single camera and computer vision,” Journal of
Biomechanics, vol. 165, p. 112027, 2024, https://doi.org/10.1016/j.jbiomech.2024.112027.


[36] R. Cedeno-Moreno, D. L. Malagon-Barillas, L. A. Morales-Hernandez, M. P. Gonzalez-Hernandez, and
I. A. Cruz-Albarran, “Computer Vision System Based on the Analysis of Gait Features for Fall Risk
Assessment in Elderly People,” Applied Sciences, vol. 14, no. 9, p. 3867, 2024,
https://doi.org/10.3390/app14093867.
[37] V. W. S. Tan, W. X. Ooi, Y. F. Chan, T. Connie, and M. K. O. Goh, “Vision-Based Gait Analysis for
Neurodegenerative Disorders Detection,” Journal of Informatics and Web Engineering, vol. 3, no. 1, pp.
136–154, 2024, https://doi.org/10.33093/jiwe.2024.3.1.9.
[38] J. D. Blasco-García et al., “A Computer Vision-Based System to Help Health Professionals to Apply
Tests for Fall Risk Assessment,” Sensors, vol. 24, no. 6, p. 2015, 2024,
https://doi.org/10.3390/s24062015.
[39] L. Xiang et al., “Integrating an LSTM framework for predicting ankle joint biomechanics during gait
using inertial sensors,” Computers in Biology and Medicine, vol. 170, p. 108016, 2024,
https://doi.org/10.1016/j.compbiomed.2024.108016.
[40] M. Shayestegan, J. Kohout, L. Verešpejová, M. Chovanec, and J. Mareš, “Comparison of Feature
Selection and Supervised Methods for Classifying Gait Disorders,” IEEE Access, vol. 12, pp. 17876–17894,
2024, https://doi.org/10.1109/ACCESS.2024.3360861.
[41] D. Martínez-Pascual, J. M. Catalán, A. Blanco-Ivorra, M. Sanchís, F. Arán-Ais, and N. García-Aracil,
“Gait Activity Classification With Convolutional Neural Network Using Lower Limb Angle
Measurement From Inertial Sensors,” IEEE Sensors Journal, vol. 24, no. 13, pp. 21479–21489, 2024,
https://doi.org/10.1109/JSEN.2024.3400296.
[42] A. A. Ahmed et al., “Classifying Cardiac Arrhythmia from ECG Signal Using 1D CNN Deep Learning
Model,” Mathematics, vol. 11, no. 3, p. 562, 2023, https://doi.org/10.3390/math11030562.
[43] J. Zhu, H. Chen, and W. Ye, “Classification of Human Activities Based on Radar Signals using 1D-CNN
and LSTM,” 2020 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5, 2020,
https://doi.org/10.1109/ISCAS45731.2020.9181233.
[44] A. K. Ozcanli and M. Baysal, “Islanding detection in microgrid using deep learning based on 1D CNN
and CNN-LSTM networks,” Sustainable Energy, Grids and Networks, vol. 32, p. 100839, 2022,
https://doi.org/10.1016/j.segan.2022.100839.
[45] S. Guessoum et al., “The Short-Term Prediction of Length of Day Using 1D Convolutional Neural
Networks (1D CNN),” Sensors, vol. 22, no. 23, p. 9517, 2022, https://doi.org/10.3390/s22239517.
[46] G. V. Houdt, C. Mosquera, and G. Nápoles, “A review on the long short-term memory model,” Artificial
Intelligence Review, vol. 53, no. 8, pp. 5929–5955, 2020, https://doi.org/10.1007/s10462-020-09838-1.
[47] H. Lin et al., “Time series-based groundwater level forecasting using gated recurrent unit deep neural
networks,” Engineering Applications of Computational Fluid Mechanics, vol. 16, no. 1, pp. 1655–1672,
2022, https://doi.org/10.1080/19942060.2022.2104928.
[48] K. E. ArunKumar, D. V. Kalaga, C. M. S. Kumar, M. Kawaji, and T. M. Brenza, “Forecasting of COVID-
19 using deep layer Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs) and Long
Short-Term Memory (LSTM) cells,” Chaos Solitons Fractals, vol. 146, p. 110861, 2021,
https://doi.org/10.1016/j.chaos.2021.110861.
[49] Y. Zhang et al., “Epileptic Seizure Detection Based on Bidirectional Gated Recurrent Unit Network,”
IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 30, pp. 135–145, 2022,
https://doi.org/10.1109/TNSRE.2022.3143540.
[50] S. Obeta, E. Grisan, and C. V. Kalu, “A Comparative Study of Long Short-Term Memory and Gated
Recurrent Unit,” SSRN Electronic Journal, 2020, https://dx.doi.org/10.2139/ssrn.4442677.
[51] F. M. Shiri, T. Perumal, N. Mustapha, and R. Mohamed, “A Comprehensive Overview and Comparative
Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU,” Machine Learning, 2024,
https://arxiv.org/abs/2305.17473.
