Convolutional and Recurrent Neural Networks for Activity Recognition in Smart Environments
1 Introduction
The advancement of sensing, networking and ambient intelligence technologies
has resulted in the emergence of smart environments and of services aimed at a
better quality of life and well-being for the aging population, providing comfort
and security in their private space. Among these, research on Smart Homes (SH)
has gained considerable interest in the field of Ambient Assisted Living (AAL)
technologies. The motivation behind smart home research is the rapid growth of
the world's aging population: according to the World Health Organization (WHO),
the number of older people (aged 60 years or above) has increased substantially
in the past decade and is expected to reach about 2 billion by 2050 [1].
2 Deepika Singh et al.
The concept of smart homes gained popularity in the early 2000s. Lutolf [2]
defined the smart home concept as the integration of different services within a
home environment by using a common communication system. According to
Satpathy [3], a smart home provides independence and comfort to the residents
by interconnecting all mechanical and digital devices in a network that is able
to communicate with the user and create an interactive space.
Smart homes equipped with simple, easy-to-install and low-cost interconnected
sensors provide a variety of services such as health care, well-being and energy
conservation while ensuring the safety and security of the residents. Activity
recognition in the home environment facilitates remote monitoring for the purpose
of detecting so-called Activities of Daily Living (ADL), residents' behavior and their
interaction with the smart environment. The large amount of data collected from
the installed sensors is analyzed with machine learning models to detect
meaningful features and abnormal behavioral patterns in ADLs. Several models
have been proposed to recognize activities inside smart homes using intrusive
and non-intrusive approaches. Intrusive approaches conflict with ethical
requirements: devices such as video cameras or microphones in a private
environment raise privacy concerns and are therefore unlikely to be accepted
by the residents. Non-intrusive approaches are preferable, as they rely on simple
and ubiquitous sensors to measure the activities of the residents and their
surroundings without compromising their privacy.
In recent years, there has been extensive interest in deep learning in
the fields of image analysis [4], speech recognition [5] and sensor informatics
[6]. Activity recognition using deep learning has several advantages in terms of
system performance and flexibility. It provides an effective tool for extracting
high-level feature hierarchies from high-dimensional sensory data, which is useful
for classification and regression tasks [7]. Deep learning models learn
representations from raw data and contain more than one hidden layer.
The network learns many layers of non-linear information processing for feature
extraction and transformation, where each successive layer uses the output of the
previous layer as input. Well-known deep learning models include Long
Short-Term Memory (LSTM) [8], Convolutional Neural Networks (CNN) [9],
Deep Belief Networks (DBN) [10] and autoencoders [11].
In this work, we explore activity recognition using a convolutional neural
network model on publicly available smart home datasets [12], extending our
previous work on activity recognition with an LSTM model [13]. The
classification of daily human activities such as cooking, bathing and sleeping is
performed using a temporal 1D-CNN model, and the results are evaluated
against LSTM and other machine learning algorithms such as Naive Bayes,
Hidden Markov Models (HMM), Hidden Semi-Markov Models (HSMM) and
Conditional Random Fields (CRF).
The paper is structured as follows: the introduction is followed by
Section 2, which presents an overview of existing work on activity recognition
using various machine learning techniques in the field of AAL. Section 3
introduces the Long Short-Term Memory and Convolutional Neural Network models. Section 4
describes the datasets that were used and explains the results. Finally, Section 5
discusses the outcomes of the experiments and suggestions for future work.
2 Related Work
This section is divided into three parts. The first part gives an overview of
existing smart home projects in the field of AAL. The second lists available
smart home datasets, and the last part presents existing work on activity
recognition using machine learning techniques.
Several smart home projects have been implemented in the past decade that
use sensors for activity recognition inside the home environment. The Gator Tech
Smart House built by the University of Florida contains smart appliances equipped
with sensors, such as smart blinds, a smart refrigerator and a smart stove, which
monitor user activities and provide services to the residents [14]. The Aware Home
developed by the Georgia Institute of Technology uses radio frequency identification
(RFID) tags for the localization of the resident [15]. For the purpose of activity
recognition, the House_n project was developed by the Massachusetts Institute
of Technology [16]: various sensors were installed to detect routine
activities such as toileting, bathing and grooming using supervised learning
algorithms. The Center for Advanced Studies in Adaptive Systems (CASAS)
introduced a "smart home in a box" technology which is easy to install and provides
various services without customization or training [17]. Several other smart
environment efforts have been demonstrated, such as the EasyLiving project of
Microsoft, which implements an intelligent environment to track and identify
multiple residents through an active badge system [18]. In all of these smart home
projects [19], activity recognition plays an important role.
There have been several efforts to collect datasets from sensors installed in
smart homes for human activity recognition. These datasets are important for
the research community, since collecting annotated datasets from real houses is
costly, time consuming and difficult.
Publicly available datasets are useful as they provide a baseline for
the comparison of different machine learning algorithms. Eventually, this baseline
helps in collecting real-house datasets by identifying shortcomings (if
any) and making corresponding improvements. Table 1 summarizes the widely used
publicly available smart home datasets.
3 Deep Learning
Nowadays, activity recognition using deep learning has become one of the most
preferred techniques owing to its ability to jointly learn data representations and
classifiers. The performance of deep models on different activity recognition tasks
has been explored by several researchers [29, 30]. Deep architectures with multiple
layers of Restricted Boltzmann Machines (RBM) handle binary sensory data and use DBN-ANN
and DBN-R algorithms for human behavior prediction [31]. Convolutional neural
networks [32] are a type of deep neural network (DNN) that apply convolutions
over the input to compute the output. Each layer applies different filters to
extract hierarchical features, exploits local dependencies and translation
equivariance in the data, and automates feature learning, which makes CNNs
suitable for raw time-series sensor data. The CNN model has performed well in
extracting features and recognizing activities from raw sensor data [33] and video
frames in comparison to other machine learning approaches on publicly available
datasets [34]. In this paper, we apply a 1D-CNN to publicly available smart home
datasets.
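As a minimal illustration of the temporal convolution described above (a sketch in plain NumPy, not the paper's implementation; the channel count and filter values are invented for the example), a single 1D filter slides over a multichannel binary sensor sequence:

```python
import numpy as np

def conv1d_valid(x, w, b=0.0):
    """Temporal (1D) convolution of a multichannel series.
    x: (T, C) time steps x sensor channels, w: (k, C) filter; returns (T-k+1,)."""
    T, C = x.shape
    k = w.shape[0]
    return np.array([np.sum(x[t:t + k] * w) + b for t in range(T - k + 1)])

# Toy binary sensor sequence: 6 time steps, 2 sensors (values invented).
x = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1],
              [0, 1],
              [0, 0]], dtype=float)

# A filter of width 3 that responds when sensor 1 is active over the window.
w = np.zeros((3, 2))
w[:, 1] = 1.0

print(conv1d_valid(x, w))  # [1. 2. 3. 2.] -- peaks where sensor 1 fires
```

Stacking several such filter banks (with non-linearities in between) yields the hierarchical feature extraction described above.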
Fig. 1: Illustration of an LSTM network, with x being the binary sensor input
vector and y the activity label prediction of the LSTM network.
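For reference, the standard LSTM cell equations [35], using the symbols defined in the following paragraph (the recurrent weight matrices U acting on the previous hidden state are written out explicitly here for completeness):

\begin{align}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align}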
where Wi, Wf, Wo are the weight matrices, xt is the input to the memory
cell layer at time t, σ is the sigmoid and tanh the hyperbolic tangent
activation function. The terms i, f and o denote the input gate, forget gate and
output gate, c represents the memory cell, and bi, bf, bc and bo are
bias vectors.
Figure 1 illustrates a single LSTM cell at time t, where xt, ht and yt
are the input, hidden and output states.
4 Experiments
4.1 Dataset
The data in the experiments are represented in two different forms. The first
is raw sensor data, i.e. the data received directly from the sensors. The
second form is last-fired sensor data: the sensor that fired last continuously
gives 1 and changes to 0 when another sensor changes its state. For each house,
we performed leave-one-day-out cross validation, repeated for every day.
Separate models are trained for each house, since the number of
sensors varies and a different user resides in each house. Sensors are recorded at
one-minute intervals for 24 hours, resulting in an input of length 1440 for each day.
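The last-fired representation described above can be sketched in plain Python (the sensor stream is invented for illustration, and the assumption that sensor 0 fired before the recording starts is ours): at each time step, the feature vector is 1 at the index of the sensor that most recently changed state and 0 elsewhere.

```python
def last_fired(raw):
    """Convert raw binary sensor readings (one tuple per time step) into the
    last-fired representation: only the sensor that most recently changed
    state is 1, all others 0."""
    n = len(raw[0])
    prev = raw[0]
    last = 0  # assume sensor 0 fired before the recording started (assumption)
    out = []
    for row in raw:
        for i in range(n):
            if row[i] != prev[i]:  # this sensor just changed state
                last = i
        out.append(tuple(1 if i == last else 0 for i in range(n)))
        prev = row
    return out

# Toy stream with 3 sensors: sensor 0 active throughout, sensor 2 fires at t=2.
raw = [(1, 0, 0),
       (1, 0, 0),
       (1, 0, 1),
       (1, 0, 1)]
print(last_fired(raw))  # [(1, 0, 0), (1, 0, 0), (0, 0, 1), (0, 0, 1)]
```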
4.2 Results
The results presented in Table 3 show the performance (accuracy) of the 1D-CNN
model together with LSTM on raw sensor data, and Table 4 shows the results on
the last-fired sensor data in comparison with the results of Naive Bayes, HMM,
HSMM and CRF [12]. We calculated the accuracy of each model, which represents
the fraction of correctly classified activities per time slice. For the LSTM
model, a time slice of 70 with a hidden state size of 300 is used. We implemented
a 1D (temporal) convolution with a time slice of 15; 128 filters are used for each
layer, the 1D kernel sizes were 5, 5, 3, 3, 3, 3, and a fully connected layer of
size 128 is used at the end.
Dropout of 0.5 is used in order to reduce overfitting. We also
tested longer time slices, but they tend to overfit considerably. The Adam
method [37] with a learning rate of 0.0004 is used for optimization of the
networks, and the TensorFlow Python library has been used to implement the CNN
and LSTM networks. Training took place on a Titan X GPU; the time required
to train one day for one house is 4 minutes for the CNN and approximately 30
minutes for the LSTM, although training time differs between houses. Since
different houses have different numbers of days of data, we calculated the
average accuracy over all days. Training is performed on a single GPU, but the
trained models can be used for inference without loss of performance when no
GPU is available.
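The accuracy metric described above, the fraction of correctly classified time slices, can be computed as a simple sketch (the labels below are invented for illustration):

```python
import numpy as np

def timeslice_accuracy(y_true, y_pred):
    """Fraction of one-minute time slices whose predicted activity
    matches the ground-truth annotation."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# Toy day of 8 time slices with 3 activity classes (values invented).
truth = [0, 0, 1, 1, 1, 2, 2, 0]
pred  = [0, 0, 1, 2, 1, 2, 0, 0]
print(timeslice_accuracy(truth, pred))  # 6 of 8 slices correct -> 0.75
```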
Each model for each house is trained with a leave-one-day-out strategy: if a
house has k days of data, k-1 days are used for training and 1 day for testing,
and this process is repeated for each day. In order to compare models, the
average accuracy and its variance are calculated. Table 3 shows the average
accuracy, with the variance of the accuracies, of the different models on raw
data from three different houses. Among all models, the LSTM performs best on
all three datasets and the 1D-CNN performs second best. On House B and House C,
LSTM improves the best previous result significantly, especially on House C,
where the improvement is approximately 40% over CRF and 30% over CNN.
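The leave-one-day-out protocol described above can be sketched in plain Python; the `train` and `evaluate` callables below are hypothetical stand-ins for model fitting and per-day scoring, not the paper's implementation.

```python
def leave_one_day_out(days, train, evaluate):
    """For k days, train on k-1 days and test on the held-out day,
    repeating for every day; returns the per-day scores."""
    scores = []
    for i, test_day in enumerate(days):
        train_days = days[:i] + days[i + 1:]
        model = train(train_days)
        scores.append(evaluate(model, test_day))
    return scores

# Toy stand-ins: "training" memorizes the day labels it saw; "evaluation"
# returns 1.0 when the test day was truly held out of training.
days = ["mon", "tue", "wed"]
train = lambda ds: set(ds)
evaluate = lambda model, day: 0.0 if day in model else 1.0
print(leave_one_day_out(days, train, evaluate))  # [1.0, 1.0, 1.0]
```

Averaging the returned scores (and taking their variance) gives the per-house figures reported in Tables 3 and 4.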
Table 4 shows the accuracy on last-fired data from the three houses.
The 1D-CNN matches the best performance, achieved by CRF, in the case of House
A, but drops slightly for Houses B and C. In comparison to LSTM, the 1D-CNN
performs similarly, except for a slight decrease on House B. It is also
important to note the high variance across all models, although the variance is
roughly halved on last-fired sensor data compared to raw sensor data.
5 Discussion and Future Work

In this work, we used deep learning techniques for activity recognition from raw
sensory inputs in a smart home environment. As data preprocessing and feature
engineering are expensive for real-world applications, especially in AAL
environments, prediction from raw input data can eliminate most of the feature
engineering effort performed by humans. The deep learning models (1D-CNN and
LSTM) lead to a significant improvement in performance, especially on raw data,
in comparison to existing probabilistic models such as Naive Bayes, HMM, HSMM
and CRF. On last-fired data, both deep learning models (1D-CNN and LSTM) match
the best performance of the existing models. Although LSTM generally gives
better performance than CNN, the CNN is much faster to train than LSTM-based
models. Choosing CNN over LSTM can thus reduce the time needed to build a
prototype and to check whether there is a temporal dependence between input and
output. In addition, CNN makes it feasible to evaluate the performance of
different architectures in search of the best results, which could be very time
consuming with LSTM.
In general, there are many future research directions for deep learning
approaches in medical applications and, specifically, in ambient assisted living
scenarios. One problem in the medical domain is that deep learning approaches
are so-called black-box approaches and thus lack transparency; however, in
the medical domain, trust and acceptance among end-users are of eminent
importance. Consequently, a big research challenge will emerge from rising legal
and privacy requirements: with the new European General Data Protection
Regulation [38], it will become a necessity to explain why a decision has been
made [39]. An interesting emerging direction, given the ever-increasing
complexity and large number of heterogeneous sensors (in the sensor networks of
ambient assisted living), is to combine deep learning approaches with graphical
and topological approaches [40], which leads to geometric deep learning [41].
This only begins to outline the enormous potential of future research in deep
learning applied to ambient assisted living as part of health systems.
Nevertheless, our immediate future work will focus on combining CNN
with LSTM to obtain fast training together with high accuracy. A more detailed
architecture search for purely CNN-based models is also planned. In order to
avoid overfitting, different methods for better generalization will be
investigated. It would also be interesting to investigate the high variance
observed in some cases.
Acknowledgement
This work has been funded by the European Union Horizon 2020 MSCA ITN
ACROSSING project (GA no. 616757). The authors would like to thank the
members of the project's consortium for their valuable input.
References
1. DESA, U.: United Nations Department of Economic and Social Affairs, Population
Division (2015): World population ageing 2015. (2015)
2. Lutolf, R.: Smart home concept and the integration of energy meters into a home
based system. In: Metering Apparatus and Tariffs for Electricity Supply, 1992.,
Seventh International Conference on, IET (1992) 277–278
3. Satpathy, L.: Smart Housing: Technology to Aid Aging in Place: New Opportuni-
ties and Challenges. PhD thesis, Mississippi State University (2006)
4. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
In: Proceedings of the IEEE conference on computer vision and pattern recognition.
(2016) 770–778
5. Deng, L., Li, J., Huang, J.T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G.,
He, X., Williams, J., et al.: Recent advances in deep learning for speech research
at microsoft. In: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE
International Conference on, IEEE (2013) 8604–8608
6. Längkvist, M., Karlsson, L., Loutfi, A.: A review of unsupervised feature learning
and deep learning for time-series modeling. Pattern Recognition Letters 42 (2014)
11–24
7. Salakhutdinov, R.: Learning deep generative models. Annual Review of Statistics
and Its Application 2 (2015) 361–385
8. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural
networks. In: Advances in neural information processing systems. (2014) 3104–
3112
9. Matsugu, M., Mori, K., Mitari, Y., Kaneda, Y.: Subject independent facial expres-
sion recognition with robust face detection using a convolutional neural network.
Neural Networks 16(5) (2003) 555–559
10. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief
nets. Neural computation 18(7) (2006) 1527–1554
11. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with
neural networks. science 313(5786) (2006) 504–507
12. Kasteren, T.L., Englebienne, G., Kröse, B.J.: Human activity recognition from
wireless sensor network data: Benchmark and software. Activity recognition in
pervasive intelligent environments (2011) 165–186
13. Singh, D., Merdivan, E., Psychoula, I., Kropf, J., Hanke, S., Geist, M., Holzinger,
A.: Human activity recognition using recurrent neural networks. In: Machine
Learning and Knowledge Extraction, Springer (2017)
14. Helal, S., Mann, W., El-Zabadani, H., King, J., Kaddoura, Y., Jansen, E.: The
gator tech smart house: A programmable pervasive space. Computer 38(3) (2005)
50–60
15. Kidd, C., Orr, R., Abowd, G., Atkeson, C., Essa, I., MacIntyre, B., Mynatt, E.,
Starner, T., Newstetter, W.: The aware home: A living laboratory for ubiquitous
computing research. Cooperative buildings. Integrating information, organizations,
and architecture (1999) 191–198
16. Tapia, E.M., Intille, S.S., Larson, K.: Activity recognition in the home using simple
and ubiquitous sensors. In: Pervasive. Volume 4., Springer (2004) 158–175
17. Cook, D.J., Crandall, A.S., Thomas, B.L., Krishnan, N.C.: Casas: A smart home
in a box. Computer 46(7) (2013) 62–69
18. Brumitt, B., Meyers, B., Krumm, J., Kern, A., Shafer, S.: Easyliving: Technologies
for intelligent environments. In: Handheld and ubiquitous computing, Springer
(2000) 97–119
19. Alam, M.R., Reaz, M.B.I., Ali, M.A.M.: A review of smart homes: past, present,
and future. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Ap-
plications and Reviews) 42(6) (2012) 1190–1203
20. Ordónez, F.J., de Toledo, P., Sanchis, A.: Activity recognition using hybrid gener-
ative/discriminative models on home environments using binary sensors. Sensors
13(5) (2013) 5460–5477
21. Alemdar, H., Ertan, H., Incel, O.D., Ersoy, C.: Aras human activity datasets in
multiple homes with multiple residents. In: Proceedings of the 7th International
Conference on Pervasive Computing Technologies for Healthcare, ICST (Institute
for Computer Sciences, Social-Informatics and Telecommunications Engineering)
(2013) 232–235
22. Fleury, A., Noury, N., Vacher, M.: Supervised classification of activities of daily
living in health smart homes using svm. In: Engineering in Medicine and Biology
Society, 2009. EMBC 2009. Annual International Conference of the IEEE, IEEE
(2009) 6099–6102
23. Roggen, D., Calatroni, A., Rossi, M., Holleczek, T., Förster, K., Tröster, G., Lukow-
icz, P., Bannach, D., Pirkl, G., Ferscha, A., et al.: Collecting complex activity
datasets in highly rich networked sensor environments. In: Networked Sensing
Systems (INSS), 2010 Seventh International Conference on, IEEE (2010) 233–240
24. Monekosso, D.N., Remagnino, P.: Anomalous behavior detection: Supporting in-
dependent living. Intelligent Environments (2009) 33–48
25. Chen, L., Hoey, J., Nugent, C.D., Cook, D.J., Yu, Z.: Sensor-based activity recogni-
tion. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications
and Reviews) 42(6) (2012) 790–808
26. Lapalu, J., Bouchard, K., Bouzouane, A., Bouchard, B., Giroux, S.: Unsupervised
mining of activities for smart home prediction. Procedia Computer Science 19
(2013) 503–510
27. Li, C., Biswas, G.: Unsupervised learning with mixed numeric and nominal data.
IEEE Transactions on Knowledge and Data Engineering 14(4) (2002) 673–690
28. Longstaff, B., Reddy, S., Estrin, D.: Improving activity classification for health
applications on mobile devices using active and semi-supervised learning. In: Per-
vasive Computing Technologies for Healthcare (PervasiveHealth), 2010 4th Inter-
national Conference on-NO PERMISSIONS, IEEE (2010) 1–7
29. Hammerla, N.Y., Halloran, S., Ploetz, T.: Deep, convolutional, and recur-
rent models for human activity recognition using wearables. arXiv preprint
arXiv:1604.08880 (2016)
30. Yang, J., Nguyen, M.N., San, P.P., Li, X., Krishnaswamy, S.: Deep convolutional
neural networks on multichannel time series for human activity recognition. In:
IJCAI. (2015) 3995–4001
31. Choi, S., Kim, E., Oh, S.: Human behavior prediction for smart homes using deep
learning. In: RO-MAN, 2013 IEEE, IEEE (2013) 173–179
32. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
http://www.deeplearningbook.org.
33. Chen, Y., Xue, Y.: A deep learning approach to human activity recognition based
on single accelerometer. In: Systems, Man, and Cybernetics (SMC), 2015 IEEE
International Conference on, IEEE (2015) 1488–1492
34. Geng, C., Song, J.: Human action recognition based on convolutional neural net-
works with a convolutional auto-encoder. In: 5th International Conference on
Computer Sciences and Automation Engineering (ICCSAE 2015). (2015)
35. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural computation
9(8) (1997) 1735–1780
36. Zeng, M., Nguyen, L.T., Yu, B., Mengshoel, O.J., Zhu, J., Wu, P., Zhang, J.: Con-
volutional neural networks for human activity recognition using mobile sensors. In:
Mobile Computing, Applications and Services (MobiCASE), 2014 6th International
Conference on, IEEE (2014) 197–205
37. Kingma, D., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint
arXiv:1412.6980 (2014)
38. Barnard-Wills, D.: The technology foresight activities of european union data
protection authorities. Technological Forecasting and Social Change 116 (2017)
142–150
39. Holzinger, A., Plass, M., Holzinger, K., Crisan, G.C., Pintea, C.M., Palade, V.: A
glass-box interactive machine learning approach for solving np-hard problems with
the human-in-the-loop. arXiv:1708.01104 (2017)
40. Holzinger, A.: On topological data mining. In Holzinger, A., Jurisica, I., eds.: In-
teractive Knowledge Discovery and Data Mining in Biomedical Informatics: State-
of-the-Art and Future Challenges. Lecture Notes in Computer Science LNCS 8401.
Springer, Heidelberg, Berlin (2014) 331–356
41. Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P.: Geometric
deep learning: Going beyond euclidean data. IEEE Signal Processing Magazine
34(4) (2017) 18–42