Integrating Multiple Public Datasets for Human Activity Recognition using Machine Learning


Vinicius K. Fukace, Yandre M. G. Costa, and Igor da P. Natal
2023 30th International Conference on Systems, Signals and Image Processing (IWSSIP) | 979-8-3503-3729-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/IWSSIP58668.2023.10180237

Department of Informatics, State University of Maringá, Maringá, Brazil

{ra115672,ymgcosta,ipnatal2}@uem.br

Abstract—This study proposes the integration of multiple publicly available datasets for Human Activity Recognition (HAR) to create a unified dataset, which was used to train and test different machine learning algorithms, including J48, kNN, LR, MLP, NB, RF, and SVM. The performance of these algorithms was evaluated in terms of accuracy and F-score, and the results showed that RF achieved the best performance, with an accuracy and F-score of 0.969. The study also compared the results with previous studies on individual datasets and showed that the proposed approach has promising results for HAR.

Index Terms—Human Activity Recognition, Data Integration, Machine Learning.

I. INTRODUCTION

Human activities play a crucial role in people’s lives, as they provide valuable information about a person’s identity, personality, physical aspects, and mental state. In our daily routines, it is common for us to try to understand the behavior of people around us and, through it, discover which activities are being performed [1].

The ability to recognize activities is natural and straightforward for any average person, but when it comes to a computer, the process requires complex functions for sensing, learning, and inference. The necessary functions for detecting the environment, learning from past experiences, and applying knowledge for activity inference are still major challenges for modern computers [2].

The main goal of Human Activity Recognition (HAR) is to identify actions performed by a person, based on a set of information about the person and the surrounding environment. The information is extracted from data coming from sources such as accelerometers, gyroscopes, Global Positioning System (GPS) receivers, videos, and images, among others. The main devices used for data collection are smartphones and wearable devices [3].

Advanced HAR models have been applied in various areas to improve people’s quality of life, such as health monitoring, robotics, and Ambient Assisted Living (AAL), as well as contributing to enhanced human-computer interaction in various applications. These models can be useful in smart homes and cities, offering services and information to users according to the activities they will perform [4].

HAR models are based on machine learning techniques. The HAR task is typically framed as a supervised classification problem. Machine learning techniques heavily depend on the quantity and quality of data used for model training. Therefore, there are several works [5]–[7] that focus on the data collection and sharing process to be used by other scientists for analysis, model creation, and comparisons.

Some databases have limitations regarding the collected data, such as the number of users, instances, or activities. These limitations can influence the results of machine learning models. User variability is important because different individuals perform the same activity differently, and capturing this variation makes the model more generalizable. The number of instances also helps with generalization, as it increases the chances of the models finding patterns. The number of activities defines the robustness of the created models.

The main objective of this work is to explore the potential of using multiple publicly available datasets to develop a Human Activity Recognition model that is both generalized and robust. By leveraging various datasets, we aim to create a more comprehensive and accurate model that can capture the variability in human activities performed by different individuals in different environments. This approach could potentially improve the performance of HAR models in real-world scenarios, such as smart homes, healthcare monitoring, and assisted living environments, where the ability to accurately recognize and monitor human activities is crucial for enhancing the quality of life and well-being of individuals.

We organized this paper as follows: Section II describes the datasets for HAR; Section III explains our proposed approach; Section IV discusses the results in comparison with related works. Concluding the paper, in Section V, we reflect on the contributions of this research and point out possible future works.

II. DATASETS FOR HAR

In this section we describe the main characteristics of some datasets commonly used by the research community to address the HAR task.

The USC-HAD dataset, introduced by Zhang and Sawchuk [8], is among the datasets most used by the research community devoted to the HAR task. The data was collected from 14 volunteers equipped with an accelerometer and a gyroscope performing 12 types of basic human activities (i.e., walking forward, walking left, walking right, walking upstairs, walking downstairs, running forward, jumping, sitting, standing, sleeping, elevator up, elevator down). For this purpose, a MotionNode© sensor was attached to the hip of the volunteers.

Authorized licensed use limited to: Khulna Univ of Engineering & Technology - KUET. Downloaded on November 14,2024 at 02:29:29 UTC from IEEE Xplore. Restrictions apply.
Shoaib et al. [9] evaluated the HAR task using data obtained from accelerometer, gyroscope, and magnetometer sensors. These three types of sensors were attached to four different points of the body. Four volunteers contributed to the creation of the dataset, and each of them performed six different types of human activity (i.e., walking downstairs, walking upstairs, running, sitting, standing, walking). The HAR task was evaluated using the data from each type of sensor in isolation, and also using data from different sensors combined. For classification, the authors experimented with several algorithms, such as Naive Bayes, kNN, MLP, SVM, and Logistic Regression. The authors concluded that using accelerometers and gyroscopes together is mostly beneficial for the HAR task. However, they pointed out that the contribution of each type of sensor is highly dependent on the position of the body where it is fixed, the specific type of activity, and the classifier chosen.

Shoaib et al. [10] continued the investigations, aiming to identify how much each type of sensor actually contributes to the HAR task. A dataset was composed by collecting data from seven different types of physical activity (i.e., walking, running, sitting, standing, jogging, biking, ‘walking upstairs and walking downstairs’) using five smartphones attached to specific pre-defined positions of the body. A total of ten volunteers contributed to the creation of the dataset. The dataset was called Fusion, and it was composed of data provided by the following smartphone sensors: accelerometer, gyroscope, magnetometer, and linear accelerometer. Nine different classifiers were tested, and the authors concluded that both the accelerometer and the gyroscope may be used in isolation to address the HAR task with reasonable performance. However, they warn that, in general, the combination of both provides even better results.

In [11], the impact of device heterogeneity on HAR task solutions is investigated using 13 different smartphone and smartwatch models. The HHAR dataset was created with data from nine volunteers performing six activities: biking, sitting, standing, walking, stair up, and stair down. The study evaluated four classifiers (SVM, kNN, C4.5 Decision Tree, and Random Forest) and concluded that different devices significantly affect performance.

Sikder and Nahid [12] created the KU-HAR dataset using data collected from five smartphone models and 90 volunteers. The dataset comprises 18 activities, including common activities such as walking and sitting, as well as more specific activities like playing table tennis. The dataset was publicly shared, and the Random Forest classifier was used, achieving an accuracy of 89.67%.

Choudhury et al. [13] presented the HARSENSE dataset, collected from 12 people performing six daily activities, with a focus on individual differences in weight, height, and gender. They used accelerometer and gyroscope data from two smartphone models and evaluated ten classifiers, achieving a promising accuracy of 99.88%.

Logacjov et al. [14] collected data from 22 volunteers performing 12 activities, including physical activities such as Running, Walking, and Standing, to create the HARTH dataset. Two accelerometers, placed at pre-defined points on the volunteers’ bodies, were used to collect the data. The authors also trained seven baseline machine learning models as a contribution to the research.

By analyzing the aforementioned datasets, we can note that there is a lack of standardization among them. In some datasets, six different human activities are considered, while in others 18 activities are taken into account. Moreover, the activities from one dataset to another are not necessarily the same. Other aspects are also not standardized, such as preprocessing, types of sensors, and device models. In this scenario, we introduce in this work a robust dataset, composed using data from other datasets already published, aiming to provide the research community with the most comprehensive dataset to address the HAR task.

III. PROPOSED APPROACH

Our study was composed of four stages. First, we searched for public databases with HAR classification data, and then we selected the databases with similar data to perform data integration. In the second stage, preprocessing techniques were applied for standardization, segmentation, and feature selection in the integrated database. In the third stage, various machine learning techniques were applied to create HAR models. Finally, the generated results were analyzed. Figure 1 illustrates the flowchart of the proposed approach. Each of these steps is detailed below.

To perform the steps of data integration, preprocessing, and segmentation, libraries available in the Python programming language were used, namely the Pandas library for reading and manipulating data and the SciPy library for performing statistical calculations. For the creation and testing of models to generate the results, the Weka tool (https://www.cs.waikato.ac.nz/ml/weka/) was used.

A. Data Integration

Several databases were researched (see Section II) to create a single integrated dataset. The USC-HAD [8], Swell [9], Fusion [10], HHAR [11], KU-HAR [12], HARSENSE [13], and HARTH [14] datasets were selected. The choice of these seven datasets was based on the similarity of recorded activities, similarity of sensors used (accelerometers and gyroscopes), different sampling frequencies, and different sample times. These similarities and differences can be useful in applying the HAR model in real-life situations, where each individual and device introduces specific variations in how activities are performed and recorded.

In the selected datasets, the accelerometer and gyroscope signals of the devices were used for data collection. The signals from each sensor are divided into their directional components x, y, and z, resulting in six data columns. After analyzing the content of the chosen datasets, the following activities were selected to compose the database used in this work: walking, running, upstairs, downstairs, sitting,
and standing. These are the basic activities with the highest frequency among the selected datasets. It should be noted that the HHAR and Fusion datasets do not contain the running activity, and HARTH only has accelerometer data, without gyroscope data.

Fig. 1. Overview of the proposed approach.

For data integration, an individual treatment was required for each dataset. In some cases, the data was divided into separate files for each activity or participant, and it was necessary to combine this data into a single structure (the Pandas dataframe was used). In addition, activities that were not part of the selected set of activities were eliminated. After this process, each dataset had its own dataframe with its own attribute naming pattern and activities. To standardize the data, the following naming convention was adopted for activities: Walking, Running, Upstairs, Downstairs, Sitting, and Standing. Attribute columns were renamed as Ax, Ay, and Az for the x, y, and z components of the accelerometer, and Gx, Gy, and Gz for the components of the gyroscope.

After the standardization process, the dataframes were integrated into a single dataframe. The contribution of each dataset to the integrated dataset, in terms of samples, is as follows: the Fusion dataset has 900,000 samples; the HARSENSE dataset has 91,821 samples; the HARTH dataset has 5,278,164 samples; the HHAR dataset has 9,433,718 samples; the KU-HAR dataset has 2,076,065 samples; the Swell dataset has 161,958 samples; and the USC-HAD dataset has 1,191,100 samples. The total of 19,132,826 samples is not the final number used for training the models. These samples went through the preprocessing and segmentation process to generate the instances used as inputs for the models.

B. Preprocessing and Segmentation

The original works apply preprocessing operations, such as low-pass filters that treat the frequency content of the attributes, and make the datasets available with these operations already applied, so such procedures were not repeated in this step. The attribute values were analyzed, and empty or incorrectly filled values were replaced with the column median.

The data is segmented into sliding windows, where each window contains a subset of consecutive signals over a certain time interval. Window segmentation is done sequentially, and a window may contain data that represents more than one activity. In these cases, the activity associated with the window is defined as the most frequent in the corresponding subset. Additionally, it is possible to create an overlap between adjacent windows, represented by a percentage of the window size, thus allowing for a more continuous analysis of the signal. The goal of segmentation is to obtain signal segments in small temporal portions that can be used for HAR. This is a technique frequently used in signal processing, as sensory events are represented in time series of continuous values, thus generating instances for model training and testing.

The ideal window size for fixed window segmentation in HAR varies from 2 to 5 seconds (s), considering a sampling rate of 20 to 50 Hertz (Hz) [3]. In the datasets used in this work, the sampling frequency varies from 50 to 200 Hz. Therefore, sliding windows with a fixed size of 300 samples and 50% overlap were defined, resulting in windows ranging from 1.5 s (at 200 Hz) to 6 s (at 50 Hz).

From the segmented data, features were extracted from each window. The mean, standard deviation, minimum value, maximum value, interquartile range, and root mean square (energy) of the x, y, and z components of the accelerometer and gyroscope, as well as the signal magnitude area of each sensor, were obtained. The signal magnitude area is added because it is a feature independent of the device orientation and may be useful for activity recognition in different datasets [9]. In total, 38 features were extracted for each window (six statistics for each of the six components, plus one signal magnitude area per sensor). These features are commonly used in activity recognition studies with manual feature extraction [6], [14], [15]. At the end of the process, a total of 127,551 instances were obtained for the training and testing of models, divided among the activity classes as follows: 28,327 Walking; 4,213 Running; 16,831 Upstairs; 15,341 Downstairs; 39,205 Sitting; and 23,634 Standing.

C. Model Creation

Seven traditional machine learning algorithms were employed to create models using the Weka tool, namely: kNN, Random Forest (RF), Multilayer Perceptron (MLP), Decision Tree J48, Logistic Regression (LR), Support Vector Machine (SVM), and Naive Bayes (NB). The techniques were used with the default settings of the tool.

These techniques were chosen due to their wide use in other works [3], [5], [6] and because they have shown good results for classification tasks with the numerical data types used in the dataset.
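
The segmentation and feature extraction steps of Section III-B can be illustrated with a short sketch. The version below is a minimal, library-free approximation and not the authors' code (the paper states only that Pandas and SciPy were used). The function names, the dictionary-of-lists input layout, the population standard deviation, the quartile method, and the particular signal magnitude area formula (mean of |x| + |y| + |z|) are all assumptions made for illustration. Trailing samples that do not fill a complete window are dropped here, which is also an assumption.

```python
import math
import statistics
from collections import Counter

WINDOW = 300   # samples per window, as stated in the text
OVERLAP = 0.5  # 50% overlap between adjacent windows
AXES = ("Ax", "Ay", "Az", "Gx", "Gy", "Gz")  # standardized column names


def sliding_windows(n_samples, size=WINDOW, overlap=OVERLAP):
    """Yield (start, end) index pairs for fixed-size overlapping windows."""
    step = max(1, int(size * (1 - overlap)))
    for start in range(0, n_samples - size + 1, step):
        yield start, start + size


def axis_features(values):
    """Six statistics for one signal component of one window."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return [
        statistics.fmean(values),
        statistics.pstdev(values),   # assumption: population standard deviation
        min(values),
        max(values),
        q3 - q1,                     # interquartile range
        math.sqrt(sum(v * v for v in values) / len(values)),  # root mean square
    ]


def signal_magnitude_area(x, y, z):
    """Mean of |x| + |y| + |z| over the window (one common definition)."""
    return sum(abs(a) + abs(b) + abs(c) for a, b, c in zip(x, y, z)) / len(x)


def extract_instances(columns, labels):
    """columns maps axis name -> list of floats; labels is the per-sample activity."""
    instances = []
    for start, end in sliding_windows(len(labels)):
        win = {axis: columns[axis][start:end] for axis in AXES}
        feats = []
        for axis in AXES:
            feats.extend(axis_features(win[axis]))
        feats.append(signal_magnitude_area(win["Ax"], win["Ay"], win["Az"]))
        feats.append(signal_magnitude_area(win["Gx"], win["Gy"], win["Gz"]))
        # The most frequent activity inside the window becomes its label.
        label = Counter(labels[start:end]).most_common(1)[0][0]
        instances.append((feats, label))
    return instances
```

Each window yields 6 statistics × 6 components + 2 signal magnitude areas = 38 features, matching the count given in the text. With 300-sample windows, a recording sampled at 200 Hz gives 1.5 s windows and one sampled at 50 Hz gives 6 s windows.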

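
The models in this work were built and tested in Weka with its default settings, so no code is given in the paper. As a rough, library-free stand-in, the sketch below shows how a classifier can be evaluated with k-fold cross-validation, the protocol behind the counts reported in Table I. It uses a 1-nearest-neighbour rule as a simple example of the kNN family. The function names, the unstratified shuffled fold assignment, the fixed seed, and the pooling of correct predictions over all folds are illustrative assumptions and may differ from what Weka does internally.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k folds of nearly equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(train_x, train_y, query):
    """Return the label of the closest training vector (Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def cross_validate(xs, ys, k=10, seed=0):
    """Return (correct, errors, accuracy) pooled over the k held-out folds."""
    correct = 0
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        # Train on the other k-1 folds, test on the held-out fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        for i in test_idx:
            correct += predict_1nn(train_x, train_y, xs[i]) == ys[i]
    errors = len(xs) - correct
    return correct, errors, correct / len(xs)
```

Every instance is tested exactly once, so the correct and error counts always add up to the size of the dataset, which is the same property the rows of Table I have.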
TABLE I
RESULTS OF MACHINE LEARNING ALGORITHMS.

Algorithm   Correct predictions   Errors   Accuracy   F-score
J48         119,845               7,706    0.940      0.940
kNN         122,334               5,217    0.959      0.959
LR           99,486              28,065    0.780      0.775
MLP         110,579              16,972    0.867      0.867
NB           66,603              60,948    0.522      0.465
RF          123,553               3,998    0.969      0.969
SVM         113,890              13,661    0.893      0.891

TABLE II
COMPARISON OF ACCURACY / F-SCORE WITH RELATED WORK.

Work           Accuracy or (F-score)
Fusion         N/A
HARSENSE       0.996
HARTH          0.810 (F-score)
HHAR           0.970 (F-score)
KU-HAR         0.896
Swell          0.920
USC-HAD        N/A
Our approach   0.969
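
The accuracy column of Table I follows directly from the correct and error counts, and the F-score is defined in the text as the harmonic mean of precision and recall. The short sketch below recomputes the accuracies from the counts in Table I and gives the F-score formula for a single precision and recall pair. The function and variable names are illustrative. How per-class F-scores were averaged into the single value reported per algorithm is not stated in the paper, so that step is not reproduced here.

```python
# (correct predictions, errors) copied from Table I.
TABLE_I = {
    "J48": (119845, 7706),
    "kNN": (122334, 5217),
    "LR": (99486, 28065),
    "MLP": (110579, 16972),
    "NB": (66603, 60948),
    "RF": (123553, 3998),
    "SVM": (113890, 13661),
}


def accuracy(correct, errors):
    """Fraction of instances that were classified correctly."""
    return correct / (correct + errors)


def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for name, (correct, errors) in TABLE_I.items():
    print(f"{name}: {accuracy(correct, errors):.3f}")
```

Every row sums to the 127,551 instances produced by the segmentation step, and the recomputed accuracies agree with the table to three decimal places.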

D. Result Analysis

The models were evaluated using two commonly used metrics in the field of HAR: accuracy and F-score [3]. Accuracy provides a general indication of the model’s performance, while F-score, which is calculated as the harmonic mean of precision and recall, allows for a more nuanced understanding of the model’s ability to generalize to new data. These metrics are frequently used in HAR research to assess the effectiveness of different machine learning algorithms on a given dataset.

A commonly used technique for evaluating machine learning models is cross-validation [16]. This technique involves dividing the dataset into k subsets (folds), where k is a pre-selected integer. The model is then trained on k-1 folds and tested on the remaining fold, with the process repeated k times, alternating the test fold in each iteration. This allows for the evaluation of the model’s performance on different subsets of data and helps to detect overfitting or underfitting problems. The mean of the evaluation metrics obtained in each fold is used as an estimate of the model’s performance on the complete population of data.

The results in terms of total number of correct predictions, total number of errors, accuracy, and F-score of the J48, kNN, LR, MLP, NB, RF, and SVM algorithms using 10-fold cross-validation are presented in Table I.

The performance of the seven machine learning algorithms, namely J48, kNN, LR, MLP, NB, RF, and SVM, was evaluated in terms of accuracy and F-score, which are commonly used metrics in the field of machine learning. The results, presented in Table I, revealed that the NB algorithm had the worst accuracy and F-score values, 0.522 and 0.465, respectively. This is attributed to its assumption of the independence of the input features, which may not be true in real-world scenarios. The LR, MLP, and SVM algorithms achieved accuracy values between 0.780 and 0.893 and F-score values between 0.775 and 0.891, demonstrating that high complexity does not always translate into better performance. The J48, kNN, and RF algorithms had accuracy values of 0.940 or above, which are remarkable results, and their F-score values suggest good generalization potential.

The RF algorithm stood out with the best results, achieving 0.969 accuracy and 0.969 F-score, making it the chosen model for comparison with the models used in the original works with individual datasets.

IV. COMPARISON WITH RELATED WORKS

In this section, we briefly compare the accuracy and F-score results of the model generated by the RF algorithm with those of the models generated in related works. Table II presents the results.

The best results were obtained with the HARSENSE and HHAR datasets, with an accuracy of 0.996 for HARSENSE and an F-score of 0.970 for HHAR. The other results obtained were at or below 0.920 for accuracy or F-score. This evaluation shows that our model presents the third-best result in this comparison. The comparison was made directly, without considering the particular aspects of each dataset and each work, but the result shows that our approach of integrating multiple public datasets is promising for HAR.

V. CONCLUSION AND FUTURE WORK

In this study, we evaluated the performance of seven machine learning algorithms on an integrated HAR dataset. Our results indicate that the random forest algorithm provided the best performance, with an accuracy and F-score of 0.969. This highlights the importance of choosing an appropriate algorithm for a given task, and also emphasizes the potential of machine learning in the field of activity recognition.

Overall, this study contributes to the growth of research in activity recognition and machine learning, proposing the integration of multiple datasets to compose a unified dataset with greater potential for generalization and robustness. It also highlights the importance of considering different aspects, such as feature engineering and algorithm selection, in the development of models for HAR. The results show that the approach of integrating multiple public datasets is promising for HAR and can lead to better performance compared to individual datasets.

Future work could explore the use of more advanced machine learning techniques, such as deep learning or ensemble methods, to further improve the performance of activity recognition models. Additionally, the application of these models in real-world scenarios could be investigated, as well as the potential for combining them with other sensors or technologies.

ACKNOWLEDGMENT

We thank the National Council for Scientific and Technological Development (CNPq) for their financial support.

REFERENCES

[1] M. Vrigkas, C. Nikou, and I. A. Kakadiaris, “A review of human activity recognition methods,” Frontiers in Robotics and AI, vol. 2, p. 28, 2015.
[2] S. Choi, “Understanding people with human activities and social interactions for human-centered computing,” Human-centric Computing and Information Sciences, vol. 6, no. 1, p. 9, 2016.
[3] W. Sousa Lima, E. Souto, K. El-Khatib, R. Jalali, and J. Gama, “Human activity recognition using inertial sensors in a smartphone: An overview,” Sensors, vol. 19, no. 14, p. 3213, 2019.
[4] E. Sansano, R. Montoliu, and O. Belmonte Fernandez, “A study of deep neural networks for human activity recognition,” Computational Intelligence, vol. 36, no. 3, pp. 1113–1139, 2020.
[5] D. Garcia-Gonzalez, D. Rivero, E. Fernandez-Blanco, and M. R. Luaces, “A public domain dataset for real-life human activity recognition using smartphone sensors,” Sensors, vol. 20, no. 8, p. 2200, 2020.
[6] D. Anguita, A. Ghio, L. Oneto, X. Parra, J. L. Reyes-Ortiz et al., “A public domain dataset for human activity recognition using smartphones,” in Esann, vol. 3, 2013, p. 3.
[7] D. Micucci, M. Mobilio, and P. Napoletano, “Unimib shar: A dataset for human activity recognition using acceleration data from smartphones,” Applied Sciences, vol. 7, no. 10, p. 1101, 2017.
[8] M. Zhang and A. A. Sawchuk, “Usc-had: A daily activity dataset for ubiquitous activity recognition using wearable sensors,” in Proceedings of the 2012 ACM conference on ubiquitous computing, 2012, pp. 1036–1043.
[9] M. Shoaib, H. Scholten, and P. J. Havinga, “Towards physical activity recognition using smartphone sensors,” in 2013 IEEE 10th international conference on ubiquitous intelligence and computing and 2013 IEEE 10th international conference on autonomic and trusted computing. IEEE, 2013, pp. 80–87.
[10] M. Shoaib, S. Bosch, O. D. Incel, H. Scholten, and P. J. Havinga, “Fusion of smartphone motion sensors for physical activity recognition,” Sensors, vol. 14, no. 6, pp. 10146–10176, 2014.
[11] A. Stisen, H. Blunck, S. Bhattacharya, T. S. Prentow, M. B. Kjærgaard, A. Dey, T. Sonne, and M. M. Jensen, “Smart devices are different: Assessing and mitigating mobile sensing heterogeneities for activity recognition,” in Proceedings of the 13th ACM conference on embedded networked sensor systems, 2015, pp. 127–140.
[12] N. Sikder and A.-A. Nahid, “Ku-har: An open dataset for heterogeneous human activity recognition,” Pattern Recognition Letters, vol. 146, pp. 46–54, 2021.
[13] N. A. Choudhury, S. Moulik, and D. S. Roy, “Physique-based human activity recognition using ensemble learning and smartphone sensors,” IEEE Sensors Journal, vol. 21, no. 15, pp. 16852–16860, 2021.
[14] A. Logacjov, K. Bach, A. Kongsvold, H. B. Bårdstu, and P. J. Mork, “Harth: a human activity recognition dataset for machine learning,” Sensors, vol. 21, no. 23, p. 7853, 2021.
[15] Z. Chen, L. Zhang, Z. Cao, and J. Guo, “Distilling the knowledge from handcrafted features for human activity recognition,” IEEE Transactions on Industrial Informatics, vol. 14, no. 10, pp. 4334–4342, 2018.
[16] S. Raschka, “Model evaluation, model selection, and algorithm selection in machine learning,” arXiv preprint arXiv:1811.12808, 2018.
