Article:
Yang, P. orcid.org/0000-0002-8553-7127, Yang, C., Lanfranchi, V. et al. (1 more author)
(2022) Activity graph based convolutional neural network for physical activity recognition
using acceleration and gyroscope data. IEEE Transactions on Industrial Informatics, 18
(10). pp. 6619-6630. ISSN 1551-3203
https://doi.org/10.1109/TII.2022.3142315
© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy.
Activity Graph based Convolutional Neural
Network for Human Activity Recognition using
Acceleration and Gyroscope Data
Activation function: The rectified linear unit (ReLU) is usually selected as the activation function after the convolution operation. The purpose of using ReLU is to introduce non-linearity, because the CNN needs to learn non-negative linear values. In Eq.(3), \varphi is the ReLU function, which is shown in Eq.(4):

\varphi(x) = \max(0, x)   (4)

Finally, all data is fed into a fully connected layer and a final activation function (typically Sigmoid, Eq.(5), or Softmax, Eq.(6)) is applied to obtain the prediction results, where C represents the number of classes in a multi-class classification problem:

\varphi(x) = \frac{1}{1 + e^{-x}}   (5)

\varphi(x_i) = \frac{e^{x_i}}{\sum_{c=1}^{C} e^{x_c}}   (6)

Pooling layer: CNNs use pooling layers to reduce the number of parameters significantly. The most commonly used pooling layers are average pooling and max pooling; here, we use max pooling. The output size (W_{l_n} \times H_{l_n}) of max-pooling layer l_n is given by Eqs.(1)-(2). Specifically, the max-pooling layer preserves the maximum value within each kernel range and discards the other values as the final output.

In order to select an appropriate CNN as our baseline classifier, we compared three advanced CNNs: ResNet, ...

TABLE II CNN PARAMETER SELECTION

Parameter                               Value
Kernel size of convolutional layer 2    7
Type of subsampling layer               Max-pooling
Kernel size of subsampling layer 1      5
Kernel size of subsampling layer 2      3
Learning rate                           0.0001
Optimizer type                          Adam
Batch size                              256
Number of epochs                        1000
Dropout rate                            0.1

TABLE III DATASET DESCRIPTION

Dataset       Sensors         Sampling rate   Position   Subjects   Number of activities
UCI [21]      2 (Acc, Gyro)   50 Hz           Waist      30         6
USCHAD [22]   2 (Acc, Gyro)   100 Hz          Hip        14         12
UTD1 [23]     2 (Acc, Gyro)   50 Hz           Wrist      8          21

TABLE IV PROCESSING RESULTS AND PARAMETER SELECTION

Dataset       Sliding window overlap   Sampling time   Training set samples   Test set samples
UCI [21]      50%                      2.5 s           7352                   2947
USCHAD [22]   50%                      2 s             18557                  7954
UTD1 [23]     50%                      1 s             3014                   1293
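To make these settings concrete, below is a minimal PyTorch sketch of a LeNet-style baseline wired with the hyperparameters in Table II. The 3-channel 360x360 activity-graph input, the feature-map counts, and the kernel size of the first convolutional layer are illustrative assumptions; only the values marked with "(Table II)" come from the table above.

import torch
import torch.nn as nn

# Minimal LeNet-style baseline sketch. Input shape, channel counts and the
# first conv kernel are assumptions; values marked (Table II) are from the paper.
class BaselineCNN(nn.Module):
    def __init__(self, num_classes: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5),    # conv layer 1 (kernel size assumed)
            nn.ReLU(),                          # Eq.(4)
            nn.MaxPool2d(kernel_size=5),        # subsampling layer 1, kernel 5 (Table II)
            nn.Conv2d(16, 32, kernel_size=7),   # conv layer 2, kernel 7 (Table II)
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3),        # subsampling layer 2, kernel 3 (Table II)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p=0.1),                  # dropout rate 0.1 (Table II)
            nn.LazyLinear(num_classes),         # fully connected layer; the Softmax of
        )                                       # Eq.(6) is folded into the loss below

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = BaselineCNN(num_classes=6)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, lr 0.0001 (Table II)
criterion = nn.CrossEntropyLoss()   # applies log-softmax internally; batch size 256
                                    # and 1000 epochs per Table II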
VI. EXPERIMENTAL EVALUATION AND RESULTS
A. Experimental Settings

We used three public datasets to validate the proposed method. The information on these datasets is summarized in Table III. The UCI dataset was collected from 30 volunteers within an age bracket of 19-48 years [21]. Each volunteer wore a Samsung Galaxy S II smartphone on the waist and performed six different activities. The researchers collected data from the accelerometer and gyroscope embedded in the smartphone, and videotaped the experiments so that activity types could be manually labeled. The original sampling frequency is 50 Hz.
In addition, the sensor acceleration signal was separated into gravity and body-acceleration parts using a Butterworth low-pass filter. For the USCHAD dataset [22], a sensing platform called MotionNode was used to collect data; this platform integrates a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis magnetometer, with a sampling frequency of 100 Hz. The researchers selected 14 subjects within an age bracket of 21-49 years to collect data on 12 different activities, and observers manually recorded and labeled these activities. In the UTD dataset [23], the researchers used one Kinect camera and one wearable inertial sensor to collect data. The sampling rate of the wearable sensor is 50 Hz. Eight subjects were required to perform 27 different activities. It is worth noting that for actions 1 through 21 the inertial sensor was placed on the subject's wrist, while for actions 22 through 27 it was placed on the subject's right thigh. In this paper, we only take the first 21 activity classes of the inertial sensor data for our research, and we do not use the camera data because we do not study optical sensor data; we refer to this subset as UTD1.

Figure 7. Comparison of accuracy with varied sliding window overlap rate (a) and sampling time (b) over three datasets.

B. Evaluation Metrics

In order to accurately evaluate our model, Mean Average Precision (mAP) is used as the evaluation metric; it takes the mean of the Average Precision (AP) values over all classes. Given an IoU threshold, the AP value fuses precision and recall together and is defined as the area under the Precision-Recall (PR) curve:

AP(c) = \int PR(c)   (7)

where c denotes the class and the PR curve is calculated from:

\mathrm{Precision}(c) = \frac{\#TP(c)}{\#TP(c) + \#FP(c)}   (8)

\mathrm{Recall}(c) = \frac{\#TP(c)}{\#TP(c) + \#FN(c)}   (9)

in which TP, FP and FN represent True Positive, False Positive and False Negative samples, respectively, so Precision penalizes samples that are incorrectly detected (false positives) while Recall penalizes missed detections (false negatives). The mAP is then obtained by taking the mean over classes:

\mathrm{mAP} = \frac{1}{C} \sum_{c \in C} AP(c)   (10)
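As a minimal illustration of Eqs.(8)-(10), the sketch below computes per-class precision and recall from label arrays and approximates mAP with scikit-learn's average_precision_score (the area under the PR curve of Eq.(7)); all function and variable names are our own, not the paper's.

import numpy as np
from sklearn.metrics import average_precision_score

def per_class_precision_recall(y_true, y_pred, num_classes):
    # y_true, y_pred: integer label arrays of shape (N,)
    precision, recall = [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        precision.append(tp / max(tp + fp, 1))       # Eq.(8)
        recall.append(tp / max(tp + fn, 1))          # Eq.(9)
    return precision, recall

def mean_average_precision(y_true, y_score, num_classes):
    # y_score: (N, C) per-class scores, e.g. softmax outputs.
    # AP(c) approximates the area under the PR curve (Eq.(7));
    # mAP averages it over classes (Eq.(10)).
    aps = [average_precision_score((y_true == c).astype(int), y_score[:, c])
           for c in range(num_classes)]
    return float(np.mean(aps))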
C. Parameter Optimisation

In the data preprocessing and activity graph generation process, we show the relevant parameter optimisation process and the corresponding results for these parts. For all datasets, LeNet is used as the baseline classifier to obtain the classification accuracy, with the single-column method as the baseline. For data preprocessing, Fig.7 shows the results obtained for different selections of sliding window overlap and sampling time. For the sliding window overlap in Fig.7a, a greater overlap means that more samples can ultimately be used, which is of great value when the original sample count of a dataset is small. In the UTD1 dataset, whose original sample count is smaller than that of the other two datasets, a larger overlap value (65%) eventually yields the highest classification accuracy; for the other two datasets, the most commonly used overlap value (50%) already obtains a good effect. The length of the sampling time has a similar effect on the final number of samples. For the UCI and USCHAD datasets, a sampling time of around two seconds achieves a good classification effect, as shown in Fig.7b. However, more than two seconds is too long for the UTD1 dataset, as it results in too few samples being available for training; one second is the relatively optimal choice for UTD1 when both the number of available samples and the length of a single sample are considered. The final data preprocessing result is shown in Table IV.
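As an illustration of the segmentation described above, here is a short sliding-window sketch under the Table IV settings (e.g., a 2.5 s window at 50 Hz with 50% overlap for UCI); the function and example values are ours, for illustration only.

import numpy as np

def sliding_windows(signal: np.ndarray, rate_hz: int,
                    window_s: float, overlap: float) -> np.ndarray:
    # signal: (T, num_axes) array of synchronized accelerometer/gyroscope samples
    window = int(window_s * rate_hz)             # samples per window
    step = max(int(window * (1 - overlap)), 1)   # 50% overlap -> half-window step
    starts = range(0, len(signal) - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

# UCI [21]: 50 Hz, 2.5 s windows, 50% overlap -> 125-sample windows, step 62
segments = sliding_windows(np.zeros((10000, 6)), rate_hz=50,
                           window_s=2.5, overlap=0.50)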
For all datasets, there are two important parameters, aspect ratio and dots per inch (DPI), that determine the final size and quality of the image when generating the activity graphs. Fig.8 shows the results obtained for different selections of aspect ratio and DPI. For the aspect ratio, we find that ratios of 3:3 or 3:4 give the highest accuracy. This is due to compression problems when data sequences are arranged into images. For images that have the same content and format but different proportions of width and height, we can see that when the aspect ratio is 3:2 (Width:Height = 3:2), the data sequence is stacked in rows and the height of the data-sequence graph in each row is compressed because the number of rows is large; this can cause fluctuations in the original data to be partially obscured (i.e., locally lost information). In contrast, if we increase the height so that it suits the arrangement of multiple rows of data sequences, this situation is greatly improved, which is conducive to the subsequent recognition. But blindly increasing the height is not a good choice, because it makes the activity graph file too large; considering that we need to save and read many images later, the ratio of 3:3 was used as our final choice in this paper (the actual pixels used are 360:360). Moreover, when datasets need to generate activity graphs with more rows, we recommend using larger heights such as 3:4 or 3:5 as appropriate. In summary, for HAR, too small an aspect ratio causes loss of picture detail, while too large an aspect ratio causes unnecessary waste of computing resources; the appropriate aspect ratio is 3:3 or 3:4.

Similarly, we also tested different DPI choices; the results are shown in Fig.8b. The DPI value directly affects the picture quality of the activity graphs, and too high a DPI causes unnecessary storage occupancy; we finally chose a DPI of 120 for the subsequent experiments. Too small a DPI causes loss of picture detail, while too large a DPI causes unnecessary waste of computing resources. We recommend a DPI of 120 as the general choice.

Figure 8. Comparison of accuracy with varied aspect ratio (a) and DPI (b) of activity graph generation over three datasets.
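To show how the aspect-ratio and DPI choices translate into pixels, below is a hedged matplotlib sketch that renders stacked waveforms into an image at a 3:3 ratio and 120 DPI, giving the 360x360 pixels mentioned above. The simple row layout is only a schematic stand-in for the paper's multi-column sorting algorithm.

import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

signals = np.random.randn(6, 125)   # 6 axes x one window (illustrative data)
fig, axes = plt.subplots(len(signals), 1,
                         figsize=(3, 3), dpi=120)   # 3:3 inches at 120 DPI = 360x360 px
for ax, row in zip(axes, signals):
    ax.plot(row, linewidth=0.5)
    ax.axis("off")                  # keep only the waveform pixels
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, hspace=0)
fig.savefig("activity_graph.png", dpi=120)   # saved image is 360x360 pixels
plt.close(fig)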
D. Comparison of the Single-column method and the proposed method

The classification performance on the three public datasets using single-column activity graphs and multi-column activity graphs (our proposed method) is shown in Table V. For each dataset, our proposed method achieves better classification accuracy than the single-column activity graphs: the classification accuracy is improved by 3.96%, 4.56% and 9.93%, respectively. This proves the effectiveness of the algorithm we designed; in other words, for different activity types and different numbers of original signal sources, our method can generate activity graphs containing more latent features, thus effectively improving activity recognition accuracy.

TABLE V PERFORMANCE COMPARISON OF OUR PROPOSED METHOD

Activity graph      UCI [21]         USCHAD [22]       UTD1 [23]
generation method   (6 activities)   (12 activities)   (21 activities)
Single-column       86.21%           83.14%            54.19%
Multi-column        90.17%           87.70%            64.12%

Moreover, we can see that as the number of activities in these datasets increases (21 activities > 12 activities > 6 activities), the degree of improvement in classification accuracy also increases (9.93% > 4.56% > 3.96%). These results suggest that our proposed method has greater potential for recognizing data with more activity types. To verify this hypothesis, we randomly selected different subsets of activity types from the UTD1 dataset to generate sub-datasets, used the two feature-graph generation methods to obtain classification accuracy on each, and compared the results with the original UTD1 recognition results.

We found that when the number of activities decreases to 12, the accuracy of the multi-column method increases to 72.02% and the accuracy of the single-column method increases to 64.57%, but the improvement from the multi-column method decreases to 7.45%. A similar result was obtained when the number of activity types was 16 (the improvement from the multi-column method decreased to 8.18%); in contrast, the improvement is 9.93% with all 21 activities of the UTD1 dataset. This verifies our hypothesis that the performance improvement of our proposed method is more significant on datasets with more subject activities.

In comparison with two state-of-the-art deep learning techniques [35][36], we found that our proposed multi-column method does not perform as well as [35][36] on the UCI dataset with 6 activity subjects. This is because these two techniques propose new SE blocks and SK convolution to optimise the kernels of CNNs, which achieves up to 96.60% accuracy on the UCI dataset. However, for the datasets [22][23] with more activity subjects, our proposed algorithm performs better than these two state-of-the-art algorithms; particularly on UTD1 with 21 activities, our proposed method outperforms them with a 3% accuracy gain. This again verifies our hypothesis that the performance improvement of our proposed method is more significant on datasets with more subject activities. Therefore, the above results show that our proposed approach performs better than these state-of-the-art deep learning approaches with selective kernel convolution, thanks to our optimised activity graph generation model.

E. Comparison of other state-of-the-art methods

We also compared the classification performance of the proposed method with that of other state-of-the-art methods on the three datasets. Most of these methods use manual feature extraction and variants or improvements of traditional classifiers (such as SVM, random forest, etc.) for activity recognition. The results are shown in Table VI (RF: random forest; SVM: support vector machine; ABDT: AdaBoost decision tree; J48DT: J48 decision tree; LR: logistic regression).

TABLE VI PERFORMANCE COMPARISON WITH OTHER STATE-OF-THE-ART METHODS

Method                    UCI [21]         USCHAD [22]       UTD1 [23]
                          (6 activities)   (12 activities)   (21 activities)
Our multi-column method   90.17%           87.70%            64.12%
Traditional classifiers   RF 91.31%        LR 76.08%         LR 15.54%
                          SVM 96.47%       J48DT 91.37%      J48DT 48.57%
                          ABDT 91.31%      ABDT 90.21%       ABDT 51.42%
[35]                      94.51%           87.36%            61.53%
[36]                      96.60%           86.70%            60.12%

Since most state-of-the-art methods improve on traditional classifiers, we first evaluated the performance difference between the proposed method and the original traditional classifiers. Using the UCI dataset, we carried out experiments with 561 manually extracted features and four traditional classifiers (Bayesian, Random Forest, GBDT, SVM). The 561 features cover time-domain and frequency-domain characteristics; they were constructed by the original dataset authors using expert knowledge, and their validity has been proven. We compared our method with these traditional feature extraction methods and traditional classifiers. The results show that our proposed method can approach or exceed the performance of some traditional classifiers (our method: 90.17% > Bayesian: 85.00%). At the same time, the traditional manual feature extraction method can obtain the optimal classification result (SVM: 96.47%). Moreover, in comparison with other state-of-the-art methods, the accuracy of our proposed method (90.17%) is also close to that of Casale et al.'s method based on the random forest classifier (91.31%), while Anguita et al.'s method based on the multiclass SVM classifier obtains the best result (96.47%); this indicates that the proposed method is effective but not yet optimal. However, these conclusions are drawn only from the UCI dataset, which contains only six common kinds of activities that are themselves not complicated; not every dataset allows the extraction of as many as 561 well-specified and reasonable features. In more cases, this is limited by the high complexity and larger number of categories of the classification activity itself, and the USCHAD and UTD1 results in Table VI provide more information on this. In addition to accuracy, other important experimental results such as precision and recall are shown in Table VII.

F. Computational cost

All our experiments were conducted on an ordinary computer with a 2.7 GHz CPU and 8 GB of memory. When training the convolutional neural network, we used a Tesla P100 GPU for acceleration. When evaluating the test set to calculate the various metrics and the computational cost, we did not invoke the GPU. Our average computational cost is 0.54 ms per test sample, which is a very low time-resource consumption and helps to enable real-time human physical activity recognition on low-power devices, especially mobile devices. The results are shown in Table VII.

TABLE VII PERFORMANCE COMPARISON OF COMPUTATIONAL COST
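As a minimal sketch of how the 0.54 ms/sample figure can be measured, the snippet below times CPU inference; it assumes the `model` from the baseline sketch earlier and a `test_loader`/`num_test_samples` defined elsewhere, so all names are illustrative.

import time
import torch

model.eval()
with torch.no_grad():
    start = time.perf_counter()
    for inputs, _ in test_loader:   # assumed DataLoader yielding (inputs, labels)
        model(inputs)               # labels unused for timing
    elapsed = time.perf_counter() - start

print(f"{1000.0 * elapsed / num_test_samples:.2f} ms per test sample")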
VII. DISCUSSION

While our proposed activity graph generation approach with multi-column sorting algorithms demonstrates superior performance over the existing state-of-the-art deep learning algorithm [19] on the most complex UTD1 dataset, there are still some further issues requiring discussion and future study.

One main issue is the difference in how signals are transferred into pixel values between our method and [19]. In [19], the method directly maps the original signal into pixel values and then generates the activity graph through the Discrete Fourier Transform (DFT). In contrast, our method, including the single-column and three-column algorithms, generates the activity graph by stacking the waveforms of the original signals. We reproduced the method of [19] with DFT pre-processing in a variety of experiments and also tested the use of the DFT for our proposed method, but the results show that the DFT does not improve (and may even decrease) accuracy. This is probably because [19]'s method is a direct mapping of the original sensor readings to pixels, and the images generated by this method are more densely arranged, which is more suitable for the DFT; our method, however, can better extract the 'correlation information' across different coordinate axes without using the DFT to extract frequency-domain information. The method we proposed thus borrows the idea from [19] that "every two signals must be adjacent once", but differs significantly from their method. The two approaches also have different advantages: for datasets with fewer activity categories and uncomplicated activities, [19]'s method is more effective, while for datasets with more complex activities and more categories, for example UTD1, [19]'s method contains less information than ours. Our method is inspired by [19]'s idea, but targets more complex datasets with multiple subjects' activities.
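For reference, here is a schematic of the [19]-style preprocessing we reproduced, where each axis of a window is transformed with a DFT and its magnitude is normalized into a row of pixel values; this is only an illustration under our assumptions, not the exact pipeline of [19].

import numpy as np

window = np.random.randn(6, 125)               # 6 axes x 125 samples (illustrative)

# [19]-style: per-axis frequency magnitudes mapped to dense 0..255 pixel rows.
spectra = np.abs(np.fft.rfft(window, axis=1))
lo, span = spectra.min(), np.ptp(spectra)
pixel_rows = np.uint8(255 * (spectra - lo) / (span + 1e-8))

# Our waveform-stacking approach instead draws the raw time-domain curves
# (see the rendering sketch in Section VI.C), preserving axis correlations.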
Another notable issue is whether we could further optimise the sorting algorithms, as they are key to the quality of the output activity graphs. The current multi-column sorting algorithm contains duplicated and redundant information in the activity graph. For instance, Figure 2 shows a case of an activity graph with only 6 axis inputs; when the number of axis inputs increases to 9, 12, or even more, the activity graph will contain more duplicated and redundant information, potentially affecting the efficiency of our CNN solution. Also, our sorting algorithm is not the unique solution for an optimal activity graph, as it depends on the initialization of the axis signal inputs.

Lastly, one important issue concerns selecting CNN parameters such as kernel size. We find that the 5x5 convolution kernel in [19] processes frequency-domain data rather than the direct correlation of signals on different axes. Capturing the "right" amount of information with the adopted 10x10 convolution kernel is suitable for the method we proposed, where this kernel size can better extract the representation of correlations between multiple activity subjects and sensor signal alignments. The key contribution of this paper is to address a problem that has been largely ignored in the existing HAR literature: generating an optimal activity image that presents the correlations between multiple human activity subjects and sensor signal alignments, thereby improving accuracy when deep learning techniques are applied to activity images. Notably, recent literature [31-34] on HAR indicates some interesting new research progress, such as ambient sensing technologies [33] for smart home care and Kinect-based human affect recognition technologies [34] for remote healthcare. Their practical applications in free-living environments are still limited, as smartphone data is easier and more accessible to end-users; thus, our proposed solution has importance in this field. To the best of our knowledge, this is the first time in the literature that the issue of activity graph generation using the "latent 'correlation' between human activity subjects and neighboured signals alignments" has been pointed out, and we prove its effectiveness on complex HAR datasets with 21 subjects of activities, with up to 10% accuracy improvement.

VIII. CONCLUSION

This paper designed a novel optimal activity graph generation model incorporating deep learning frameworks for automatic and accurate HAR with multiple subjects, using only acceleration and gyroscope data. Specifically, through a comprehensive comparison, we confirm that the classification performance of our proposed multi-column activity graph is better than that of other deep learning or traditional supervised learning HAR methods. The results showed that our approach improved recognition accuracy by about 5% on average compared with other state-of-the-art HAR methods; particularly in multi-type HAR cases, it achieved up to a 10% accuracy gain over other methods. These improvements show the advantage and potential of our method in dealing with complex HAR problems with multiple subjects using limited sensing data.

REFERENCES
[1] F. Lin, A. Wang, Y. Zhuang, M. R. Tomita and W. Xu, “Smart Insole: A wearable sensor device for unobtrusive gait monitoring in daily life”, IEEE Trans on Industrial Informatics, Vol 12, Issue 6, pp.2281-2291, Jan 2016.
[2] L. Qi, C. Hu, X. Zhang, M. R. Khosravi, S. Sharma, S. Pang and T. Wang, “Privacy-aware data fusion and prediction with spatial-temporal context for smart city industrial environment”, IEEE Trans on Industrial Informatics, Early Access, July 2020.
[3] G. Zhao, Y. Liu and Y. Shi, “Real-time assessment of the cross-task mental workload using physiological measures during anomaly detection”, IEEE Trans on Human-Machine Systems, Vol 48, Issue 2, pp.149-160, Jan 2018.
[4] A. H. Kronbauer, H. C. Da Luz and J. Campos, “Mobile Security Monitor: a wearable computing platform to detect and notify falls”, IEEE Latin America Transactions, Vol 16, Issue 3, pp.957-965, May 2018.
[5] Z. Li, S. Das, J. Codella, T. Hao, K. Lin, C. Maduri and C. H. Chen, “An adaptive, data-driven personalized advisor for increasing physical activity”, IEEE Journal of Biomedical and Health Informatics, Vol 23, Issue 3, pp.999-1010, May 2019.
[6] X. Wang, K. Tieu and E. L. Grimson, “Correspondence-free activity analysis and scene modelling in multiple camera views”, IEEE Trans on Pattern Analysis and Machine Intelligence, Vol 32, Issue 1, pp.56-71, Jan 2010.
[7] A. Kamel, B. Sheng, P. Yang, P. Li, R. Shen and D. D. Feng, “Deep convolutional neural networks for human action recognition using depth maps and postures”, IEEE Trans on Systems, Man, and Cybernetics: Systems, Vol 49, Issue 9, pp.1806-1819, July 2018.
[8] C. T. Chu and J. N. Hwang, “Fully unsupervised learning of camera link models for tracking humans across nonoverlapping cameras”, IEEE Trans on Circuits and Systems for Video Technology, Vol 24, Issue 6, pp.979-994, Jan 2014.
[9] J. Qi, P. Yang, A. Waraich, Z. Deng, Y. Zhao and Y. Yang, “Examining sensor-based physical activity recognition and monitoring for healthcare using internet of things: a systematic review”, Journal of Biomedical Informatics, Vol 87, pp.138-153, Nov 2018.
[10] J. Qi, P. Yang, L. Newcombe, X. Peng, Y. Yang and Z. Zhao, “An overview of data fusion techniques for internet of things enabled physical activity recognition and measure”, Information Fusion, Vol 55, pp.269-280, March 2020.
[11] A. Mannini and S. S. Intille, “Classifier personalisation for activity recognition using wrist accelerometers”, IEEE Journal of Biomedical and Health Informatics, Vol 23, Issue 4, pp.1585-1594, July 2019.
[12] Z. H. Chen, Q. C. Zhu, Y. C. Soh and L. Zhang, “Robust human activity recognition using smartphone sensors via CT-PCA and online SVM”, IEEE Trans on Industrial Informatics, Vol 13, Issue 6, pp.3070-3080, Dec 2017.
[13] N. Hegde, M. Bries, T. Swibas, E. Melanson and E. Sazonov, “Automatic recognition of activities of daily living utilizing insole-based and wrist-worn wearable sensors”, IEEE Journal of Biomedical and Health Informatics, Vol 22, Issue 4, pp.979-988, July 2018.
[14] J. H. Huang, S. S. Lin, N. Wang, G. H. Dai, Y. X. Xie and J. Zhou, “TSE-CNN: A two-stage end-to-end CNN for human activity recognition”, IEEE Journal of Biomedical and Health Informatics, Vol 24, Issue 1, pp.292-299, Jan 2020.
[15] E. Kim, “Interpretable and accurate convolutional neural networks for human activity recognition”, IEEE Trans on Industrial Informatics, Vol 16, Issue 11, pp.7190-7198, Nov 2020.
[16] D. Ravi, C. Wong, B. Lo and G. Z. Yang, “A deep learning approach to on-node sensor data analytics for mobile or wearable devices”, IEEE Journal of Biomedical and Health Informatics, Vol 21, Issue 1, pp.56-64, Jan 2017.
[17] D. F. Silva, V. M. De Souza, G. E. Batista, “Time series classification using compression distance of recurrence plots”, in: 2013 IEEE 13th International Conference on Data Mining, IEEE, 2013, pp. 687-696.
[18] F. J. Ordóñez, D. Roggen, “Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition”, Sensors 16 (1) (2016) 115.
[19] W. Jiang, Z. Yin, “Human activity recognition using wearable sensors by deep convolutional neural networks”, in: Proceedings of the 23rd ACM International Conference on Multimedia, 2015, pp. 1307-1310.
[20] Z. Chen, L. Zhang, Z. Cao, J. Guo, “Distilling the knowledge from handcrafted features for human activity recognition”, IEEE Transactions on Industrial Informatics 14 (10) (2018) 4334-4342.
[21] D. Anguita, A. Ghio, L. Oneto, X. Parra, J. L. Reyes-Ortiz, A public
domain dataset for human activity recognition using smartphones, in:
Esann, Vol. 3, 2013, p. 3.
[22] M. Zhang, A. A. Sawchuk, USC-HAD: a daily activity dataset for
ubiquitous activity recognition using wearable sensors, in: Proceedings
of the 2012 ACM Conference on Ubiquitous Computing, 2012, pp. 1036–
1043.
[23] C. Chen, R. Jafari, N. Kehtarnavaz, UTD-MHAD: A multimodal dataset for
human action recognition utilizing a depth camera and a wearable inertial
sensor, in: 2015 IEEE International conference on image processing
(ICIP), IEEE, 2015, pp. 168–172.
[24] Z. Qin, Y. Zhang, S. Meng, Z. Qin, K.-K. R. Choo, Imaging and fusing
time series for wearable sensor-based human activity recognition,
Information Fusion 53 (2020) 80–87.
[25] C. A. Ronao, S.-B. Cho, Human activity recognition with smartphone
sensors using deep learning neural networks, Expert systems with
applications 59 (2016) 235–244.
[26] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D.
Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions,
in: Proceedings of the IEEE conference on computer vision and pattern
recognition, 2015, pp. 1–9.
[27] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image
recognition, in: Proceedings of the IEEE conference on computer vision
and pattern recognition, 2016, pp. 770–778.
[28] M. D. Zeiler, R. Fergus, Visualizing and understanding convolutional
networks, in: European conference on computer vision, Springer, 2014,
pp. 818–833.
[29] D. Tao, Y. Wen, R. Hong, Multicolumn bidirectional long short-term
memory for mobile devices-based human activity recognition, IEEE
Internet of Things Journal 3 (6) (2016) 1124–1134.
[30] A. Jordao, A. C. Nazare Jr, J. Sena, W. R. Schwartz, Human activity
recognition based on wearable sensor data: A standardization of the
state-of-the-art, arXiv preprint arXiv:1806.05226 (2018).
[31] Z. Ma, L. Yang, M. Lin, Q. Zhang and C. Dai, “Weighted Support Tensor
Machines for Human Activity Recognition with Smartphone Sensors”,
IEEE Trans on Industrial Informatics, early access, 2021.
[32] Z. H. Chen, C. Y. Jiang and L. Xie, “A Novel Ensemble ELM for Human
Activity Recognition Using Smartphone Sensors”, IEEE Trans on
Industrial Informatics, vol 15, issue 5, pp.2691-2699, May 2019.
[33] M. Kaur, G. Kaur, K. Sharma, A. Jolfaei and D. Singh, “Binary cuckoo
search metaheuristic-based supercomputing framework for human
behavior analysis in smart home”, The Journal of Supercomputing, vol
76, pp.2479-2502, 2020
[34] U. Tripathi, R. S. J, V. Chamola, A. Jolfaei and A. Chintanpalli,
“Advancing remote healthcare using humanoid and affective systems”,
IEEE Sensors Journals, early access, Jan. 2021.
[35] R. Abdel-Salam, R. Mostafa, and M. Hadhood, “Human activity
recognition using Wearable Sensors: Review, Challenges, Evaluation
Benchmark”, the 2nd International Workshop on Deep Learning for
Human Activity Recognition, Held in conjunction with IJCAI-PRICAI
2020, Jan. 2021.
[36] W. Gao, L. Zhang, W. Huang, F. Min, J. He and A. Song, “Deep Neural
Networks for Sensor-based Human Activity Recognition using Selective
Kernel Convolution”, IEEE Trans. Instrum. Meas., Vol 70, 2021.