Deep Learning For Iot: Tausif Diwan, Jitendra V. Tembhurne, Tapan Kumar Jain, and Pooja Jain
1 Introduction
With the advent of 5G technology and the evolution of Artificial Intelligence (AI), the
Internet of Things (IoT) has become hugely popular and its scope has widened. IoT
describes the use of interconnected devices and systems to leverage the data generated
by various embedded sensors and devices. The connected devices are not limited to
mobile phones or computer systems but cover a wide spectrum of things such as
connected homes and connected buildings. In short, the objects or things under
discussion may be mobile or static. As per statistics, the number of mobile connections
was projected to grow very rapidly and reach approximately 10 billion by the end of
2020, while the count of connected devices was projected to reach approximately
25 billion by the end of 2020. IoT can benefit a wide range of consumers by delivering
dramatically improved solutions with enhanced security, usability, user satisfaction,
and many other aspects of day-to-day life. Retail, agriculture, industry, manufacturing,
healthcare, and many more sectors benefit from the development of IoT technology. In
addition, Machine to Machine (M2M) solutions are regarded as a subfield of IoT in which
devices are connected through wireless communication over the internet. This
subfield is capable of delivering improved services to a wide range of industries
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
V. K. Gunjan et al. (eds.), Modern Approaches in IoT and Machine Learning
for Cyber Security, Internet of Things, https://doi.org/10.1007/978-3-031-09955-7_8
with minimum human intervention. The major roles and outcomes of IoT-based
devices and applications can be summarized as follows: (i) core sectors that play a
major role in driving the economy of any country benefit from IoT-based services;
(ii) IoT-based global models and services capture the global aspects of consumers;
(iii) adjacent industries jointly develop IoT-based services to capture and fulfil
end-customer needs efficiently; (iv) research is directed towards improving the
value-added services built on IoT services; and (v) the varying needs of IoT devices
and applications are handled and satisfied.
An enormous amount of unstructured data is generated from various sources and is
collectively known as big data. Big data can be regarded as a distributed database
in which big data files are stored, accessed, updated, and analysed by different
big-data technologies to perform various tasks. IoT devices and their applications
are important sources of data and contribute to this big data store. Deep learning
models are generally preferred for processing and analysing such large-sized data.
At the same time, model performance is higher when the model is exposed to a large
amount of data, which is a basic need of various IoT devices and applications.
The rest of the chapter is organized as follows. The second section introduces
deep learning and the deep models used in various IoT devices and applications.
Specifically, we present Convolutional Neural Networks (CNN) and their variants,
RNN, LSTM, and some generative models that are used in the subsequent sections of
the chapter. We then summarize the role and applicability of deep models for IoT in
selected domains, with an emphasis on healthcare. Subsequently, the role and
applicability of deep models in IoT devices and applications for several other
domains, such as smart homes, smart cities, and smart transportation, are covered in
brief. The last section concludes the chapter and presents the future scope in the
context of architectural advances in mainstream deep models for the betterment of
IoT devices and applications. The taxonomy of the entire chapter is summarized in
Fig. 1.
Deep learning was introduced in the early 2000s, after Support Vector Machines
(SVM), the Multilayer Perceptron (MLP), Artificial Neural Networks (ANN), and other
shallow neural networks had become popular. Many researchers regard it as a subset of
Machine Learning (ML), which in turn is considered a subset of Artificial Intelligence
(AI) [1]. In its early stage, deep learning did not draw much attention because of
data scalability issues and several other influential factors. After 2006, it gained
momentum and became far more popular than contemporary ML algorithms for three main
reasons: (i) the availability of abundant data for processing, (ii) the availability
of high-end computational resources, and (iii) the success stories of deep learning
in various other domains.
Multiple layers of neurons are used to minimize a loss or error term for a particular
supervised or unsupervised task, such as prediction or classification. Usually,
conventional ML algorithms spend a huge amount of time on choosing the best method for
feature extraction. To overcome this drawback, researchers, industrialists, and
academicians are actively working on deep learning. Commonly applied deep learning
models include deep neural networks, Convolutional Neural Networks (CNN), Recurrent
Neural Networks (RNN) and their architectural variants such as Long Short-Term Memory
(LSTM) and Gated Recurrent Unit (GRU), Generative Adversarial Networks (GAN), and
different types of auto encoders; the model is chosen according to the nature and
characteristics of the application.
CNN is an important deep learning model that was originally adopted for computer
vision. However, other research areas such as Natural Language Processing (NLP) and
robotics also benefit from this model. Owing to parameter sharing, it needs fewer
parameters than dense deep neural networks and therefore comparatively less time for
model convergence. Moreover, RNN and its architectural variants such as LSTM and GRU
are mainly employed for feature extraction from temporal or sequential data, for
example in time series forecasting. Broadly, neural networks are divided into two
categories, viz. generative and discriminative models. The Boltzmann machine is a
stochastic generative neural network through which one can learn the probability
distribution over the set of inputs provided to the model. Auto encoders are a special
kind of neural network used for learning a compact representation of the input,
specifically employed for dimensionality reduction. The Deep Belief Network (DBN) is
another important category of generative models wherein smaller unsupervised modules
such as auto encoders or Restricted Boltzmann Machines (RBM) are stacked multiple
times such that the output of one layer acts as the input to the next layer for a
generative task. All these deep models are applied heavily for capturing, processing,
and analysing sensory data for better feature extraction and further processing on
various IoT devices and applications. Herein, we briefly elaborate and summarize
these deep models, their architectural aspects, and their applicability in the
context of IoT.
2.1 CNN
Because fully connected deep neural networks scale poorly, CNN is widely adopted to
capture spatial and contextual information with fewer parameters [2, 9]. When
handling high-dimensional inputs, it is almost impractical to connect a neuron of a
particular layer to all the neurons in the previous layer. In the CNN architecture,
one neuron is instead connected to only a local region of the previous layer. This
considerably reduces the model complexity and is considered the backbone of the CNN
design. Three major operations constitute the main pillars of CNN architectures, viz.
convolution, pooling, and a non-linear activation, preferably ReLU. The general
architecture of the CNN is presented in Fig. 3, wherein two convolutional layers are
followed by fully connected or dense layers, and the convolution and pooling
operations are illustrated in Fig. 4. The convolution operation refers to sliding a
filter over the entire input volume and generating a weighted sum at each position; in
the illustrated example, a 3 × 3 filter is applied to an input volume of size 5 × 5
and generates an output of size 3 × 3 [3]. Three hyperparameters, viz. filter size,
stride, and padding, decide the size of the output volume of the convolution
operation. Pooling refers to the down-sampling of an input volume and is specially
used for capturing the dominant features [4]. Max and average pooling are the two
major types of pooling employed in CNN. Pooling layers are optional and can be used
depending upon the architecture and application requirements. No parameters are
associated with pooling layers, whereas convolutional and dense layers are
characterized by the number of parameters associated with them. A non-linear
activation function is used for the activation of the neurons belonging to
convolutional and dense layers.
Fig. 4 Convolutional neural network operations. (a) Conv-layer. (b) Max-pooling layer
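The convolution and pooling operations described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the chapter's implementation: the function names, the all-ones filter, and the single-channel square input are assumptions made for the example. The output size follows the standard relation floor((n + 2p - f) / s) + 1 for input size n, filter size f, stride s, and padding p.

```python
import numpy as np

def conv_output_size(n, f, stride=1, padding=0):
    """Output size along one dimension: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * padding - f) // stride + 1

def conv2d(x, kernel, stride=1, padding=0):
    """Single-channel 2D convolution with a square kernel.

    Computed as cross-correlation (no kernel flip), which is the usual
    convention in deep learning libraries.
    """
    if padding:
        x = np.pad(x, padding)  # zero-pad every side by `padding`
    f = kernel.shape[0]
    out_h = (x.shape[0] - f) // stride + 1
    out_w = (x.shape[1] - f) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + f, j * stride:j * stride + f]
            out[i, j] = np.sum(patch * kernel)  # weighted sum over a local region
    return out

def max_pool2d(x, size=2, stride=2):
    """Max pooling: keeps the dominant value of each window, has no parameters."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

def conv_params(f, c_in, c_out):
    """Learnable parameters of a conv layer: shared f x f filters plus biases."""
    return f * f * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Learnable parameters of a dense layer: one weight per connection plus biases."""
    return n_in * n_out + n_out

x = np.arange(25, dtype=float).reshape(5, 5)  # illustrative 5 x 5 input
kernel = np.ones((3, 3))                      # illustrative 3 x 3 filter
feature_map = np.maximum(conv2d(x, kernel), 0.0)  # convolution followed by ReLU
pooled = max_pool2d(x)                            # 2 x 2 max pooling, stride 2
```

With these definitions, the 5 × 5 input and 3 × 3 filter give a 3 × 3 feature map, matching the example in the text. The two parameter-count helpers illustrate why local connectivity with shared filters needs far fewer parameters than a dense layer over the same input.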
Object detection and its related tasks such as classification, localization, and
segmentation form a very important category in various IoT applications. Object
detection generally performs feature extraction followed by classification and
localization. When these are implemented in two stages, the corresponding
architectures are known as two-stage object detectors. The first stage is responsible
for generating Regions of Interest (RoI) using a Region Proposal Network (RPN), and
the second stage is responsible for predicting the objects and bounding boxes for the
proposed regions. Examples in this category include Region-based Convolutional Neural
Networks (R-CNN), Fast R-CNN, and Faster R-CNN [5]. Faster R-CNN is the successor of
R-CNN and Fast R-CNN and was introduced in 2015 [5]. It has two modules: (1) the RPN,
a CNN that is responsible for generating region proposals; it takes a single image as
input and outputs bounding boxes and objectness scores; and (2) the Fast R-CNN
detector, which performs object detection on the proposed regions. During training,
the RPN is initialized from a network pre-trained on ImageNet, its region proposals
are then used to train the detector, and lastly the layers unique to Fast R-CNN are
fine-tuned. However, with the advent of You Only Look Once (YOLO) and its successors,
much attention has shifted to solving these tasks in one shot/stage, wherein the
localization problem is formulated as a regression problem with the help of deep
neural networks. The authors of YOLO [6] reframed object detection as a regression
problem instead of a classification problem. A single convolutional neural network
predicts the bounding boxes as well as the class probabilities for all the objects
depicted in an image. As the algorithm identifies the objects and their positions, by
means of bounding boxes, by looking at the image only once, the authors named it YOLO.
The architecture of YOLO is inspired by the GoogLeNet architecture. It was implemented
and tested on the PASCAL VOC 2007 and 2012 datasets for object detection, and the
Darknet framework was used for training the model. Herein, the inception modules of
GoogLeNet are replaced by (1 × 1) convolutions followed by (3 × 3) convolutional
filters, and only the first convolutional layer has a (7 × 7) filter. YOLO has 24
convolutional layers followed by two fully connected layers, as shown in Fig. 5.
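To make the "detection as regression" idea concrete, the sketch below shows how one regressed box could be decoded into pixel coordinates. The settings are assumptions taken from the original YOLO paper [6] rather than from this chapter: a 7 × 7 grid (S), 2 boxes per cell (B), 20 PASCAL VOC classes (C), and a 448 × 448 input. The function names are hypothetical.

```python
def yolo_v1_output_shape(S=7, B=2, C=20):
    """Each grid cell regresses B boxes (x, y, w, h, confidence) and C class scores."""
    return (S, S, B * 5 + C)

def decode_box(row, col, x, y, w, h, S=7, img_w=448, img_h=448):
    """Convert one regressed box to pixel corners (x_min, y_min, x_max, y_max).

    Assumes the YOLO v1 convention: (x, y) is the box centre relative to its
    grid cell, and (w, h) is the box size relative to the whole image.
    """
    cx = (col + x) / S * img_w   # centre in pixels
    cy = (row + y) / S * img_h
    bw = w * img_w               # size in pixels
    bh = h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)

shape = yolo_v1_output_shape()
box = decode_box(row=3, col=3, x=0.5, y=0.5, w=0.5, h=0.5)
```

Because every cell of the output tensor is produced in a single forward pass, localization and classification are obtained together, which is what distinguishes this one-stage design from the two-stage detectors above.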
2.2 Sequence Models
RNN and its successors such as LSTM, GRU, and transformers constitute another
important category of deep learning models for extracting features from, and
performing classification on, time series or sequential data. These models perform
remarkably well in a variety of domains ranging from stock prediction to wind
estimation and forecasting [10]. As most IoT devices generate or operate on
sequential data, the role and applicability of these models are explored by the
research community for the betterment of IoT devices and their applications.
Sequential data is characterized by its different time steps. The generic architecture
of the RNN is presented in Fig. 6, wherein the output of an RNN cell is determined not
only by the input at the current time step but also by the output of the previous
cell. Specifically, a cell takes two inputs, namely, the current timestep input and
the hidden output of the previous cell. This is the basic philosophy of a cell in all
the aforementioned sequential models. Note that the parameters, i.e., the weight
matrices, are shared across the different time steps. In short, this family takes care
of the temporal ordering of the input through sequential processing.
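The recurrence described above can be sketched as follows. This is a minimal vanilla RNN forward pass under assumed conventions (tanh activation, hypothetical names `W_xh`, `W_hh`, `b_h`, and randomly generated inputs), intended only to show that each cell combines the current input with the previous hidden output and that the same weight matrices are reused at every time step.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h, h0=None):
    """Run a vanilla RNN over a sequence and return all hidden states.

    xs has shape (T, n_in). The same W_xh, W_hh, and b_h are applied at
    every time step (parameter sharing across time).
    """
    h = np.zeros(W_hh.shape[0]) if h0 is None else h0
    states = []
    for x_t in xs:
        # Two inputs per cell: the current input x_t and the previous hidden output h.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
T, n_in, n_hid = 6, 3, 4                      # illustrative sizes
xs = rng.normal(size=(T, n_in))               # stand-in for a sensor time series
W_xh = rng.normal(scale=0.5, size=(n_hid, n_in))
W_hh = rng.normal(scale=0.5, size=(n_hid, n_hid))
b_h = np.zeros(n_hid)

hs = rnn_forward(xs, W_xh, W_hh, b_h)         # shape (T, n_hid)
hs_reversed = rnn_forward(xs[::-1], W_xh, W_hh, b_h)
```

Feeding the same samples in reverse order yields a different final hidden state, which illustrates that the model is sensitive to temporal ordering.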
RNN cells are simpler than LSTM and GRU cells because they involve fewer tensor
operations. The inability to capture long-term dependencies is one of the major
bottlenecks of the RNN; it is addressed in the other members of the family at the
expense of increased cell complexity. For weight updating and model convergence, the
gradient is back-propagated from the current timestep cell to the initial cell, which
is known as back-propagation through time (BPTT). The gradients at distant cells
vanish as they are multiplied in a chain-rule manner in the course of BPTT. This is
widely known as the vanishing gradient problem. As an example, distant information
sometimes needs more emphasis than neighbouring cells for the current prediction,
and the RNN is unable to handle this situation well because its cells lack a storage
unit.
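A small numerical experiment can illustrate how the chain-rule products in BPTT shrink. The setup below is an assumption chosen for the demonstration, not taken from the chapter: a tanh RNN whose recurrent matrix `W_hh` is rescaled so that its largest singular value is 0.9, with random inputs already projected to the hidden size.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hid, T = 8, 30                                # illustrative sizes

W_hh = rng.normal(size=(n_hid, n_hid))
W_hh *= 0.9 / np.linalg.norm(W_hh, 2)           # largest singular value set to 0.9

# Forward pass, storing the Jacobian dh_t/dh_{t-1} = diag(1 - h_t^2) @ W_hh.
h = np.zeros(n_hid)
jacobians = []
for _ in range(T):
    x_t = rng.normal(size=n_hid)                # input term, assumed pre-projected
    h = np.tanh(W_hh @ h + x_t)
    jacobians.append(np.diag(1.0 - h ** 2) @ W_hh)

# Backward pass: dh_T/dh_{t-1} is the product J_T @ J_{T-1} @ ... @ J_t.
grad = np.eye(n_hid)
norms = []
for J in reversed(jacobians):
    grad = grad @ J
    norms.append(np.linalg.norm(grad, 2))       # gradient magnitude at growing distance
```

Each factor has spectral norm at most 0.9 (the tanh derivative never exceeds 1), so `norms` decreases with every additional step back in time. If the recurrent weights were large instead, the same product could grow, which is the related exploding gradient problem.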
To address the aforementioned problem and to improve the modelling of long-term
dependencies across the cells, LSTM and GRU were proposed [7, 8]. These models
mitigate the effect of the vanishing gradient problem by introducing several gates in
the cells. These gates are responsible for storing the important past information,
transferring it to the next cell, and removing the irrelevant information. A cell
state is maintained in each cell; in the course of model training it learns to retain
the necessary information and drop the unnecessary information. Figure 7 illustrates
a single cell of the LSTM architecture, wherein three different gates, viz. forget,
input, and output gates, are maintained to capture long-term dependencies. The sigmoid
activation plays an important role in distinguishing necessary from unnecessary
information, as it squashes its input to a value between 0 and 1.
The input at the current timestep is concatenated with the hidden output of the
previous cell and passed through various activation functions to perform the different
gating operations. The forget gate decides how important the previous information is
and how much of it is passed on when updating the current cell state. The cell state
thus determines whether information is added, retained, or discarded: it is updated
by taking into account the input gate, the forget gate, and the previous cell state.
The output gate is responsible for generating the hidden state that is used in the
next LSTM cell.
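The gating operations can be written out as a single LSTM step. The sketch below uses the standard LSTM equations with an assumed parameter layout (one stacked weight matrix `W` and bias `b`, both hypothetical names) and random values; it is an illustration rather than the implementation behind Fig. 7.

```python
import numpy as np

def sigmoid(z):
    """Squashes values into (0, 1), so a gate acts as a soft on/off switch."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step.

    W has shape (4 * hidden, hidden + n_in) and b has shape (4 * hidden,);
    the four row blocks hold the forget, input, output, and candidate weights.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b   # previous hidden output joined with input
    f = sigmoid(z[0:hidden])                    # forget gate: how much of c_prev to keep
    i = sigmoid(z[hidden:2 * hidden])           # input gate: how much new content to add
    o = sigmoid(z[2 * hidden:3 * hidden])       # output gate: how much state to expose
    g = np.tanh(z[3 * hidden:])                 # candidate cell content
    c = f * c_prev + i * g                      # updated cell state
    h = o * np.tanh(c)                          # hidden state passed to the next cell
    return h, c

rng = np.random.default_rng(0)
n_in, hidden = 3, 4                             # illustrative sizes
W = rng.normal(scale=0.5, size=(4 * hidden, hidden + n_in))
b = np.zeros(4 * hidden)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(hidden), np.zeros(hidden), W, b)
```

The additive update `c = f * c_prev + i * g` is the key design choice: when the forget gate is close to 1 and the input gate close to 0, the cell state is carried forward almost unchanged, which gives gradients a path that does not shrink at every step.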
2.3 Generative Models
RBMs are generative probabilistic models consisting of a two-layer artificial neural
network. These models are capable of performing dimensionality reduction, standard
supervised tasks such as classification and regression, feature learning, and
collaborative filtering by learning the probability distribution of the input. They
are extremely simple models with an input and an output layer, generally referred to
as the visible and hidden layers. Connections are established between neurons
belonging to different layers but not between neurons of the same layer. In other
words, the neurons of these two disjoint sets form a symmetric bipartite graph. In the
learning phase, the parameters are updated in the direction of an effective and
meaningful reconstruction of the input by minimizing the reconstruction error [11].
Furthermore, auto encoders are neural networks used for learning a compact
representation of the input, specifically employed for dimensionality reduction of
the input data [12]. As IoT devices and applications are typically short of
resources, the reduced computational complexity achieved with such models leads to
better resource utilization. The encoder and the decoder are the two main pillars of
an auto encoder: the encoder network achieves dimensionality reduction by mapping the
input to a latent space representation layer that sits between the two modules, and
the decoder tries to reconstruct the same data from it. Auto encoders and RBMs are
very simple networks and are unable to learn the complex features needed for efficient
reduction. However, these modules can be stacked multiple times to form a DBN, a
complex structure capable of handling the bottlenecks of the aforementioned simpler
models. Nowadays, GAN has earned huge popularity among generative models. It has two
main modules, viz. a generator and a discriminator. New data instances resembling the
input data are generated by the generator module, while the validity of each newly
created instance is verified by the discriminator module [13].
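The encoder/decoder idea can be illustrated with a deliberately small auto encoder. The sketch below is an assumption-laden toy, not a model from the chapter: it uses linear layers without biases, synthetic 8-dimensional data that lies on a 2-dimensional subspace, and plain gradient descent on the mean squared reconstruction error. All names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 8, 2                                     # samples, input dim, latent dim
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, d))   # synthetic low-rank "sensor" data

W_enc = rng.normal(scale=0.1, size=(d, k))              # encoder: input -> latent code
W_dec = rng.normal(scale=0.1, size=(k, d))              # decoder: latent code -> input
lr = 0.05                                               # assumed learning rate
losses = []

for _ in range(1000):
    Z = X @ W_enc                    # compact latent representation (d -> k)
    X_hat = Z @ W_dec                # reconstruction of the input
    R = X_hat - X
    losses.append(np.mean(R ** 2))   # reconstruction error being minimized

    grad_out = 2.0 * R / R.size      # d(loss) / d(X_hat)
    grad_dec = Z.T @ grad_out        # gradient for the decoder weights
    grad_enc = X.T @ (grad_out @ W_dec.T)  # gradient for the encoder weights
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

codes = X @ W_enc                    # 2-D codes that could be stored or transmitted
```

On a resource-constrained device, only the low-dimensional `codes` would need to be stored or transmitted, which is the resource argument made above. Practical auto encoders add non-linear activations and biases, and stacking several such modules leads towards the deeper structures described in the text.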
With the advent of artificial intelligence techniques, especially deep learning
algorithms, e-health has become quite prevalent. During the COVID-19 pandemic, when
people were forced to stay at home, digital health became all the more useful and
important. People prefer to receive medical aid from the comfort of their homes, and
various deep learning methods are used to help them track their medical problems.
Figure 8 illustrates the generic diagram of deep learning for IoT in healthcare.
The clinical workflow around the electrocardiogram (ECG) is largely determined by
ECG interpretation [14]. With proper and accurate interpretation of the ECG, medical
diagnosis can be prioritized. A DNN was developed to classify 12 rhythm classes using
approximately 90 k ECGs collected from 50 k patients with single-lead ambulatory ECG
monitoring devices, in an attempt to reduce the number of misinterpreted ECGs. The
DNN achieves an area under the ROC curve of 0.97 when validated on an independent
annotated dataset, and an F1-score of 0.83 compared with 0.78 recorded by expert
cardiologists.
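For readers unfamiliar with the F1-score used in this comparison, it is the harmonic mean of precision and recall. The helper below is a generic sketch with made-up counts; it does not reproduce the data or the evaluation protocol of the ECG study.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # correct among predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # found among actual positives
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical counts for one rhythm class: 8 correct detections,
# 2 false alarms, and 2 missed cases.
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

Unlike plain accuracy, the F1-score penalizes both false alarms and missed cases, which matters when some rhythm classes are rare.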
Poorly controlled blood sugar may damage the tissue at the back of the eye (the
retina), a condition known as diabetic retinopathy [15]. Referring patients to medical
experts for the detection of this disease is a time- and resource-consuming process.
Diabetic retinopathy detection from retinal fundus images is realized using a DCNN,
and the model demonstrates high sensitivity and specificity compared to the
state-of-the-art models. The DCNN was tested on the EyePACS-1 dataset consisting of
9963 retinal fundus images from 4997 patients. Separately, the WHO reports an
unhealthy lifestyle as a leading cause of mortality [16]. The lack of data related to
unhealthy lifestyles is the main hurdle to approaching this problem efficiently and
intelligently. A CNN-based smart personal health advisor (SPHA) is proposed that
periodically monitors the physiological and psychological activities of the end user
for better health monitoring and to provide proper guidelines. Accurate detection of
pills from a pill image is a challenging task [17]. Blurring, shading, background
colour, resizing, and varying illumination are considered the five main challenges in
the exact detection of a pill from visual content. A multi-CNN-based MobileDeepPill
model is proposed for this task, wherein each CNN takes care of a different challenge
in order to improve the recognition performance.
The cause of various chronic diseases such as diabetes, obesity, and cardiovascular
abnormalities may be inferred from day-to-day ambulatory activities, which in turn
can be identified from energy expenditure [18]. Exact estimation and prediction of
energy expenditure is an important task for identifying various ambulatory activities
such as walking, standing, climbing, etc., and for further assisting the person in
preventing chronic disease. Direct estimation of energy expenditure with the help of
wearable sensors has limited accuracy. Hence, a CNN is employed for the accurate
estimation of energy expenditure from the data collected by different wearable
sensors, specifically triaxial accelerometer and heart rate sensors in this case. The
CNN automatically extracts useful features from the data collected through such
sensors and outperforms other state-of-the-art methods by producing a 30% lower Root
Mean Square Error (RMSE). The data for this experimentation is gathered from the
investigated in this research, and conclusive guidelines were presented for
model-specific performance enhancement. In the context of sequence modelling tasks
such as human activity recognition (HAR), sequential deep models outperform other
deep models. RNNs suffer from the vanishing gradient problem and are unable to
capture long-term dependencies, but they have simple architectures and perform well
for various short activities. LSTMs, however, outperform their predecessor in
recognizing long activities. CNN is preferred for HAR tasks that do not require any
short- or long-term dependencies because of its ability to capture local patterns or
features efficiently.
In ensemble architectures, a combination of CNN and RNN is most popular for
spatiotemporal inputs, wherein the CNN is employed for feature extraction and the
extracted features are provided as input to the sequence model [23]. In one such
study, a recurrent 3D convolutional neural network (R3D) is presented for action
recognition from spatiotemporal input datasets. Three-dimensional convolutional
filters are used for feature extraction, and the extracted features are given as
input to the different cells of an LSTM. The LSTM captures the temporal, long-term
dependencies from the fused features generated by the 3D-CNN. Specifically, the
3D-CNN convolves over the video frames to capture short-term spatiotemporal features
from different time steps, which are further aggregated by the LSTM to extract
long-term spatiotemporal features. This architecture is tested for monitoring various
human activities of patients or older persons and is considered a prominent step in
the direction of intelligent healthcare. Experiments are performed on UCF101, a
large-scale dataset consisting of approximately 13 k video clips for around 100 action
classes. There are two aspects to the results for this category of tasks: (1) how
accurately the model predicts the actions, and (2) how fast the prediction occurs,
measured in frames per second (fps), which generally depends on the underlying
hardware. Moreover, with the advent of the latest multicore architectures and
many-core GPUs, the computation time is being reduced to a large extent. R3D achieves
a benchmark accuracy of 85.7% while processing 427 fps. In many studies, deep models
are employed for feature extraction, followed by a support vector machine (SVM) that
acts as the classifier, on the premise that the SVM performs better as a classifier
than the deep model itself. In this experiment, an improved accuracy of 86.8% is
achieved by applying an SVM to the features extracted by the CNN and LSTM. On the
downside, the computational time increases with the additional linear classifier
layer, and a slight reduction in fps is observed.
In this section, we cover some of the important domains that are rigorously
exploring deep learning models for IoT, such as smart homes, smart cities, and
smart transportation. With the evolution of IoT technologies and deep learning,
considerable automation has been introduced into the smooth functioning of various
tasks related to the aforementioned domains.
4.1 Smart Homes
4.2 Smart Cities
IoT applications related to smart cities, such as self-driving cars, are trained and
tested for deployment in real environments. Accurate run-time detection of various
objects using the high-quality cameras installed in self-driving cars is a
challenging task, as the resulting actions are determined entirely by these
decisions. Sign boards, pedestrian movements, traffic lights, and other vehicles and
their relative movements are important and need to be captured in real time without
any failure. The challenge in these applications is to cover all real-life scenarios,
which is not always possible while training the model or application. A CNN-based
real-time object detection mechanism is implemented for autonomous driving, and the
proposed architecture is computationally efficient, fast, and accurate [30]. An
end-to-end CNN architecture is implemented for real-time object detection in
self-driving cars [31]. Herein, the end-to-end architecture maps the real-time images
captured by a single front-facing camera to car commands through appropriate
processing of the images, and the necessary actions are then executed through these
commands. As driving is a sequential activity that can be learnt from driving videos
covering all possible real-life driving scenarios, an LSTM-based model is introduced
using a large-scale dataset containing car driving videos in crowded areas [32].
4.3 Smart Transportation
objects from the captured images [5, 6]. An incremental approach is adopted to
improve successive versions of YOLO. For example, YOLO v1 is unable to capture
small-sized objects, a limitation that is rectified in later versions of YOLO.
We summarized the role and applicability of deep models for IoT in various domains
with an emphasis on healthcare. The advancements in IoT technologies and deep
learning introduce automation, intelligence, and smooth functioning of various
entities related to these domains. A detailed overview is presented of various
deep models applied to IoT devices and applications in the healthcare domain.
We also briefly covered deep learning for IoT in smart homes and smart
transportation.
Recent advances in CNN such as tiled, dilated, and transposed convolution should be
used for efficient feature extraction in various IoT devices and applications [9].
As most IoT devices have resource constraints and demand time-bounded output,
different optimization techniques such as batch normalization, residual or skip
connections, and different augmentation techniques should be explored to improve the
performance of CNNs for IoT applications. For fast processing of CNNs, techniques
such as weight compression, low-precision arithmetic, and the Fast Fourier Transform
(FFT) should be carefully examined. On the sequence models' side, architectural
variants of RNNs and LSTMs should be employed for better feature learning and
contextual information extraction from time series data in order to improve the
performance of various IoT devices and applications [10].
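As an illustration of why the FFT is relevant for fast convolution, the sketch below checks the convolution theorem on a one-dimensional signal: multiplying the spectra of two zero-padded sequences and transforming back gives the same result as direct convolution. The signal, kernel, and function name are assumptions made for this example; it is not a CNN implementation.

```python
import numpy as np

def fft_convolve(signal, kernel):
    """Full linear convolution computed through the frequency domain."""
    n = len(signal) + len(kernel) - 1          # zero-pad so circular equals linear
    spectrum = np.fft.rfft(signal, n) * np.fft.rfft(kernel, n)
    return np.fft.irfft(spectrum, n)

rng = np.random.default_rng(0)
signal = rng.normal(size=64)                   # stand-in for a sensor trace
kernel = rng.normal(size=9)                    # stand-in for a learned filter

direct = np.convolve(signal, kernel)           # sliding-window computation
via_fft = fft_convolve(signal, kernel)         # same result via the FFT
max_abs_diff = np.max(np.abs(direct - via_fft))
```

Deep learning layers usually compute cross-correlation, so a practical FFT-based layer would flip the kernel first. The FFT route tends to pay off for large filters, whereas the small filters common in CNNs are often faster with direct methods.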
References
1. S.M. Ahmed, B. Kovela, V.K. Gunjan, IoT based automatic plant watering system through soil
moisture sensing—A technique to support farmers’ cultivation in rural India, in Advances in
Cybernetics, Cognition, and Machine Learning for Communication Technologies, (Springer,
Singapore, 2020), pp. 259–268
2. S. Albelwi, A. Mahmood, A framework for designing the architectures of deep convolutional
neural networks. Entropy 19(6), 242 (2017)
3. S. Kumar, M.D. Ansari, V.K. Gunjan, V.K. Solanki, On classification of BMD images
using machine learning (ANN) algorithm, in ICDSMLA 2019, (Springer, Singapore, 2020),
pp. 1590–1599
4. Z. Ibrahim, D. Isa, Z. Idrus, Z. Kasiran, R. Roslan, Evaluation of pooling layers in convolu-
tional neural network for script recognition, in International Conference on Soft Computing in
Data Science, (Springer, Singapore, 2019, August), pp. 121–129
5. S. Ren, K. He, R. Girshick, J. Sun, Faster r-cnn: Towards real-time object detection with region
proposal networks, in Advances in Neural Information Processing Systems, (2015), pp. 91–99
6. J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time
object detection, in Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, (2016), pp. 779–788
7. S. Kumar, V.K. Gunjan, M.D. Ansari, R. Pathak, Credit card fraud detection using support vec-
tor machine, in Proceedings of the 2nd International Conference on Recent Trends in Machine
Learning, IoT, Smart Cities and Applications, (2022), pp. 27–37
8. M. Husein, I.Y. Chung, Day-ahead solar irradiance forecasting for microgrids using a long
short-term memory recurrent neural network: A deep learning approach. Energies 12(10),
1856 (2019)
9. J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, et al., Recent advances in convolu-
tional neural networks. Pattern Recogn. 77, 354–377 (2018)
10. V.K. Gunjan, P.S. Prasad, R. Pathak, A. Kumar, Machine learning methods for extraction and
classification for biometric authentication, in ICDSMLA 2019, (Springer, Singapore, 2020),
pp. 1984–1988
11. T. Jaakkola, D. Haussler, Exploiting generative models in discriminative classifiers, in
Advances in Neural Information Processing Systems, (1999), pp. 487–493
12. E. Rashid, M.D. Ansari, V.K. Gunjan, M. Khan, Enhancement in teaching quality methodol-
ogy by predicting attendance using machine learning technique, in Modern Approaches in
Machine Learning and Cognitive Science: A Walkthrough, (2020), pp. 227–235
13. A.Y. Hannun, P. Rajpurkar, M. Haghpanahi, G.H. Tison, C. Bourn, M.P. Turakhia, A.Y. Ng,
Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms
using a deep neural network. Nat. Med. 25(1), 65 (2019)
14. V. Gulshan, L. Peng, M. Coram, M.C. Stumpe, D. Wu, A. Narayanaswamy, R. Kim,
Development and validation of a deep learning algorithm for detection of diabetic retinopathy
in retinal fundus photographs. JAMA 316(22), 2402–2410 (2016)
15. M. Chen, Y. Zhang, M. Qiu, N. Guizani, Y. Hao, SPHA: Smart personal health advisor based
on deep analytics. IEEE Commun. Mag. 56(3), 164–169 (2018)
16. X. Zeng, K. Cao, M. Zhang, MobileDeepPill: A small-footprint mobile deep learning system
for recognizing unconstrained pill images, in Proceedings of the 15th Annual International
Conference on Mobile Systems, Applications, and Services, (2017, June), pp. 56–67
17. J. Zhu, A. Pande, P. Mohapatra, J.J. Han, Using deep learning for energy expenditure estima-
tion with wearable sensors, in 2015 17th International Conference on E-Health Networking,
Application & Services (HealthCom), (IEEE, 2015, October), pp. 501–506
18. A.R. Lopez, X. Giro-i-Nieto, J. Burdick, O. Marques, Skin lesion classification from dermo-
scopic images using deep learning techniques, in 2017 13th IASTED International Conference
on Biomedical Engineering (BioMed), (IEEE, 2017, February), pp. 49–54
19. W.J. Chang, L.B. Chen, C.H. Hsu, C.P. Lin, T.C. Yang, A deep learning-based intelligent medi-
cine recognition system for chronic patients. IEEE Access 7, 44441–44458 (2019)
20. A. Prasoon, K. Petersen, C. Igel, F. Lauze, E. Dam, M. Nielsen, Deep feature learning for
knee cartilage segmentation using a triplanar convolutional neural network, in International
Conference on Medical Image Computing and Computer-Assisted Intervention, (Springer,
Berlin, Heidelberg, 2013, September), pp. 246–253
21. N.Y. Hammerla, S. Halloran, T. Plötz, Deep, convolutional, and recurrent models for human
activity recognition using wearables. arXiv preprint arXiv:1604.08880 (2016)
22. Y. Gao, X. Xiang, N. Xiong, B. Huang, H.J. Lee, R. Alrifai, Z. Fang, Human action monitoring
for healthcare based on deep learning. IEEE Access 6, 52277–52285 (2018)
23. Y. Gu, Y. Chen, J. Liu, X. Jiang, Semi-supervised deep extreme learning machine for Wi-Fi
based localization. Neurocomputing 166, 282–293 (2015)
24. M. Mohammadi, A. Al-Fuqaha, M. Guizani, J.S. Oh, Semisupervised deep reinforcement
learning in support of IoT and smart city services. IEEE Internet Things J. 5(2), 624–635 (2017)
25. X. Wang, L. Gao, S. Mao, S. Pandey, CSI-based fingerprinting for indoor localization: A deep
learning approach. IEEE Trans. Veh. Technol. 66(1), 763–776 (2016a)
26. J. Wang, X. Zhang, Q. Gao, H. Yue, H. Wang, Device-free wireless localization and activity
recognition: A deep learning approach. IEEE Trans. Veh. Technol. 66(7), 6258–6267 (2016b)
27. B.A. Erol, A. Majumdar, J. Lwowski, P. Benavidez, P. Rad, M. Jamshidi, Improved deep
neural network object tracking system for applications in home robotics, in Computational
Intelligence for Pattern Recognition, (Springer, Cham, 2018), pp. 369–395
28. S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, D. Quillen, Learning hand-eye coordination
for robotic grasping with deep learning and large-scale data collection. Int. J. Robotics Res.
37(4–5), 421–436 (2018)
29. B. Wu, F. Iandola, P.H. Jin, K. Keutzer, Squeezedet: Unified, small, low power fully convolu-
tional neural networks for real-time object detection for autonomous driving, in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, (2017),
pp. 129–137
30. M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, X. Zhang, End to
end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016)
31. H. Xu, Y. Gao, F. Yu, T. Darrell, End-to-end learning of driving models from large-scale video
datasets, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
(2017), pp. 2174–2182
32. W. Huang, G. Song, H. Hong, K. Xie, Deep architecture for traffic flow prediction: Deep belief
networks with multitask learning. IEEE Trans. Intell. Transp. Syst. 15(5), 2191–2201 (2014)
33. Y. Lv, Y. Duan, W. Kang, Z. Li, F.Y. Wang, Traffic flow prediction with big data: A deep learn-
ing approach. IEEE Trans. Intell. Transp. Syst. 16(2), 865–873 (2014)
34. Z. Zhao, W. Chen, X. Wu, P.C. Chen, J. Liu, LSTM network: A deep learning approach for
short-term traffic forecast. IET Intell. Transp. Syst. 11(2), 68–75 (2017)
35. A.K. Goel, R. Chakraborty, M. Agarwal, M.D. Ansari, S.K. Gupta, D. Garg, Profit or loss: A
long short term memory based model for the prediction of share price of DLF group in India,
in 2019 IEEE 9th International Conference on Advanced Computing (IACC), (IEEE, 2019,
December), pp. 120–124
36. J. Zhang, Y. Zheng, D. Qi, Deep spatio-temporal residual networks for citywide crowd flows
prediction. arXiv preprint arXiv:1610.00081 (2016)