0% found this document useful (0 votes)

27 views23 pages

Ai ML

Uploaded by

tien nguyen

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

27 views23 pages

Ai ML

Uploaded by

tien nguyen

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 23

Chapter 1

Neural Network Models for Time Series

Data

Shuochao Yao and Tarek Abdelzaher

Abstract A wide range of mobile sensing and computing applications require

time-series measurements from such sensors as accelerometers, gyroscopes, and
magnetometers to generate inputs for various signal estimation and classiﬁcation
applications (Lane et al., IEEE Commun. Mag., 2010).

1 Introduction

A wide range of mobile sensing and computing applications require time-series

measurements from such sensors as accelerometers, gyroscopes, and magnetome-
ters to generate inputs for various signal estimation and classiﬁcation applica-
tions [1]. Using these sensors, mobile devices are able to infer user activities
and states [2, 3] and recognize surrounding context [4, 5]. These capabilities
serve diverse application areas including health and wellbeing [6–8], tracking and
imaging [9, 10], mobile security [11, 12], and vehicular road sensing [13–15].
Although mobile sensing is becoming increasingly ubiquitous, key challenges
remain in improving the accuracy of sensor exploitation. In this chapter, we consider
the general problem of estimating signals from noisy measurements in mobile
sensing applications. This problem can be categorized into two subtypes: regression
and classiﬁcation, depending on whether prediction results are continuous or
categorical, respectively.
For regression-oriented problems, such as tracking and localization, sensor inputs
are usually processed based on physical models of the phenomena involved. Sensors
on mobile devices generate time-series measurements of physical quantities such
as acceleration and angular velocity. From these measurements, other physical

S. Yao ( )
George Mason University, Fairfax, VA, USA
e-mail: [email protected]
T. Abdelzaher
University of Illinois at Urbana Champaign, Urbana, IL, USA
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 3

M. Srivatsa et al. (eds.), Artiﬁcial Intelligence for Edge Computing,
https://doi.org/10.1007/978-3-031-40787-1_1
4 S. Yao and T. Abdelzaher

quantities can be computed, such as displacement through double integration of

acceleration over time. However, measurements of commodity sensors are noisy.
The noise in measurements is non-linear [16] and correlated over time [17], which
makes it hard to model. This makes it challenging to separate signal from noise,
leading to estimation errors and bias.
For classification-oriented problems, such as activity and context recognition, a
typical approach is to compute appropriate features derived from raw sensor data.
These hand-crafted features are then fed into a classifier for training. This general
workflow for classification face the challenge that designing good hand-crafted
features can be time consuming; it requires extensive experiments to generalize
well to diverse settings such as different sensor noise patterns and heterogeneous
user behaviors [3].
In this chapter, we propose DeepSense, a unified deep learning framework
that directly addresses the aforementioned customization challenges that arise in
mobile sensing applications. The core of DeepSense is the integration of convolu-
tional neural networks (CNN) and recurrent neural networks (RNN). Input sensor
measurements are split into a series of data intervals along time. The frequency
representation of each data intervals is fed into a CNN to learn intra-interval local
interactions within each sensing modality and intra-interval global interactions
among different sensor inputs, hierarchically. The intra-interval representations
along time are then fed into an RNN to learn the inter-interval relationships.
The whole framework can be easily customized to fit specific mobile computing
(regression or classification) tasks by three simple steps, as will be described later.
For the regression-oriented mobile sensing problem, DeepSense learns the
composition of physical system and noise model to yield outputs from noisy sensor
data directly. The neural network acts as an approximate transfer function. The CNN
part approximates the computation of sensing quantities within the time interval,
and the RNN part approximates the computation of sensing quantities across time
intervals. Instead of using a model-based noise analysis method that assumes a noise
model with experience or observations, DeepSense can be regarded as a model-
free noise analysis that learns the non-linear and correlated-over-time noises among
sensor measurements.
For the classification-oriented mobile sensing problem, the neural network acts
as an automatic feature extractor encoding local, global, and temporal information.
The CNN part extracts local features within each sensor modality and merges the
local features of different sensory modalities into global features hierarchically. The
RNN part extracts temporal dependencies.
We demonstrate the effectiveness of our DeepSense framework using three rep-
resentative and challenging mobile sensing problems, which illustrate the potential
of solving different tasks with a single unified modeling methodology:
• Car tracking with motion sensors: In this task, we use dead reckoning to infer
position from acceleration measurements. One of the major contributions of
DeepSense is its ability to withstand nonlinear and time-dependent noise and
bias. We chose the car tracking task because it involves double-integration and
1 Neural Network Models for Time Series Data 5

thus is particularly sensitive to error accumulation, as acceleration errors can

lead to significant deviations in position estimate over time. This task thus
constitutes a worst-case of sorts in terms of emphasizing the effects of noise
on modelling error. Traditionally, external means are needed to reset the error
when possible [13, 18, 19]. We intentionally forgo such means to demostrate the
capability of DeepSense for learning accurate models of target quantities in the
presence of realistic noise.
• Heterogeneous human activity recognition: Although human activity recognition
with motion sensors is a mature problem, Stisen et al. [3] illustrated that state-of-
the-art algorithms do not generalize well across users when a new user is tested
who has not appeared in the training set. This classification-oriented problem
therefore illustrates the capability of DeepSense to extract features that generalize
better across users in mobile sensing tasks.
• User identification with biometric motion analysis: Biometric gait analysis can
be used to identify users when they are walking [2, 20]. We extend walking to
other activities, such as biking and climbing stairs, for user identification. This
classification-oriented problem illustrates the capability of DeepSense to extract
distinct features for different users or classes.
We evaluate these three tasks with collected data or existing datasets. We
compare DeepSense to state-of-the-art algorithms that solve the respective tasks,
as well as to three DeepSense variants, each presenting a simplification of the
algorithm as described in Sect. 4.3. For the regression-oriented problem: car tracking
with motion sensors, DeepSense provides an estimator with far smaller tracking
error. This makes tracking with solely noisy on-device motion sensors practi-
cal and illustrates the capability of DeepSense to perform accurate estimation
of physical quantities from noisy sensor data. For the other two classification-
oriented problems, DeepSense outperforms state-of-the-art algorithms by a large
margin, illustrating its capability to automatically learn robust and distinct features.
DeepSense outperforms all its simpler variants in all three tasks, which shows
the effectiveness of its design components. Despite a general shift towards remote
cloud processing for a range of mobile applications, we argue that it is intrinsically
desirable that heavy sensing tasks be carried out locally on-device, due to the usually
tight latency requirements, and the prohibitively large data transmission requirement
as dictated by the high sensor sampling frequency (e.g. accelerometer, gyroscope).
Therefore, we also demonstrate the feasibility of implementing and deploying
DeepSense on mobile devices by showing its moderate energy consumption and
low overhead for all three tasks on two different types of smart devices.
In summary, the main contribution of this chapter is that we develop a deep learn-
ing framework, DeepSense, that solves both regression-oriented and classification-
oriented mobile computing tasks in a unified manner. By exploiting local inter-
actions within each sensing modality, merging local interactions of different
sensing modalities into global interactions, and extracting temporal relationships,
DeepSense learns the composition of physical laws and noise model in regression-
oriented problems, and automatically extracts robust and distinct features that
6 S. Yao and T. Abdelzaher

contain local, global, and temporal relationships in classiﬁcation-oriented prob-

lems. Importantly, it outperforms the state of the art, while remaining implementable
on mobile devices.

2 DeepSense Framework

Recently, deep learning [21] has become one of the most popular methodologies in
AI-related tasks, such as computer vision [22], speech recognition [23], and natural
language processing [24]. Lots of deep learning architectures have been proposed
to exploit the relationships embedded in different types of inputs. For example,
Residual nets [22] introduce shortcut connections into CNNs, which greatly reduces
the difficulty of training super-deep models. However, since residual nets mainly
focus on visual inputs, they lose the capability to model temporal relationships,
which are of great importance in time-series sensor inputs. LRCNs [25] apply CNNs
to extract features for each video frame and combine video frame sequences with
LSTM [26], which exploits spatio-temporal relationships in video inputs. However,
it does not consider modeling multimodal inputs. This capability is important to
mobile sensing and computing tasks, because most tasks require collaboration
among multiple sensors. Multimodal DBMs [27] merge multimodal inputs, such
as images and text, with Deep Boltzmann Machines (DBMs). However, the work
does not model temporal relationships and does not apply tailored structures, such
as CNNs, to effectively and efficiently exploit local interactions within input data.
To the best of our knowledge, DeepSense is the first architecture that possesses the
capability for both (1) modelling temporal relationships and (2) fusing multimodal
sensor inputs. It also contains specifically designed structures to exploit local
interactions in sensor inputs.
There are several illuminating studies, applying deep neural network models
to different mobile sensing applications. DeepEar [28] uses Deep Boltzmann
Machines to improve the performance of audio sensing tasks in an environment
with background noise. RBM [29] and MultiRBM [30] use Deep Boltzmann
Machines and Multimodal DBMs to improve the performance of heterogeneous
human activity recognition. IDNet [20] applies CNNs to the biometric gait analysis
task. DeepX [31], RedEye [32], and ConvTransfer [33] reduce the energy consump-
tion or training time of deep neural networks, based on software and hardware,
respectively. However, these studies do not capture the temporal relationships in
time-series sensor inputs, and, with the only exception of MultiRBM, lack the
capability of fusing multimodal sensor inputs. In addition, these techniques focus
on classification-oriented tasks only. To the best of our knowledge, DeepSense is the
first framework that directly solves both regression-based and classification-based
problems in a unified manner.
We introduce DeepSense, a unified framework for mobile applications with
sensor data inputs, in this section. We separate our description into three parts.
The first two parts, convolutional layers and recurrent layers, are the main building
1 Neural Network Models for Time Series Data 7

blocks for DeepSense, which are the same for all applications. The third part, the
output layer, is the specific layer for two different types of applications; regression-
oriented and classification-oriented.
For the rest of this chapter, all vectors are denoted by bold lower-case letters
(e.g., .x and .y), while matrices and tensors are represented by bold upper-case letters
(e.g., .X and .Y). For a vector .x, the .j th element is denoted by .x[j ] . For a tensor .X,
the .t th matrix along the third axis is denoted by .X··t , and other slicing denotations
are defined similarly. We use calligraphic letters to denote sets (e.g., .X and .Y). For
any set .X, .|X| denotes the cardinality of .X.
For a particular application, we assume that there are K different types of input
sensors .S = {Sk }, .k ∈ {1, · · · , K}. Take a sensor .Sk as an example. It generates a
series of measurements over time. The measurements can be represented by a .d (k) ×
n(k) matrix .V for measured values and .n(k) -dimensional vector .u for time stamps,
where .d (k) is the dimension for each measurement (e.g., measurements along x, y,
and z axes for motion sensors) and .n(k) is the number of measurements. We split the
input measurements .V and .u along time (i.e., columns for .V) to generate a series of
non-overlapping time intervals with width .τ , .W = {(V(k) (k)
t , ut )}, where .|W| = T .
Note that, .τ can be different for different intervals, but here we assume a fixed time
interval width for succinctness. We then apply Fourier transform to each element
in .W, because the frequency domain contains better local frequency patterns that
are independent of how time-series data is organized in the time domain [34]. We
stack these outputs into a .d (k) × 2f × T tensor .X(k) , where f is the dimension
of frequency domain containing f magnitude and phase pairs. The set of resulting
tensors for each sensor, .X = {X(k) }, is the input of DeepSense.
As shown in Fig. 1.1, DeepSense has three major components; the convolutional
layers, the recurrent layers, and the output layer, stacked from bottom to top. In the
following subsections, we detail these components, respectively.

Single/Multiple Outputs
Output Layer
T time intervals with width

Recurrent Layer 2 GRU ...... ...... ...... ...... GRU

Recurrent Layer 1 ...... ...... ...... ......

GRU GRU
Flatten & Concatenation (c)
x ..t
(6)
Merge Convolutional Layer 3 X..t
(5)
X..t
Merge Convolutional Layer 2
(4)
X..t
Merge Convolutional Layer 1
K (3) K
X..t
...

...
...

...

Flatten & Concatenation

............ (k,3)
X..t ............
Individual Convolutional Layer 3
............ (k,2)
X..t ............
Individual Convolutional Layer 2 ............ (k,1) ............
X..t

Individual Convolutional Layer 1 ............ (k)

X..t ............

K sensor inputs K sensor inputs

Fig. 1.1 Main architecture of the DeepSense framework

8 S. Yao and T. Abdelzaher

2.1 Convolutional Layers

The convolutional layers can be further separated into two parts: an individual con-
volutional subnet for each input sensor tensor .X(k) , and a single merge convolutional
subnet for the output of K individual convolutional subnets’ outputs.
Since the structures of individual convolutional subnet for different sensors are
the same, we focus on one individual convolutional subnet with input tensor .X(k) .
(k)
Recall that .X(k) ∈ Rd ×2f ×T , where .d (k) is the sensor measurement dimension, f
is the dimension of frequency domain, and T is the number of time intervals. For
each time interval t, the matrix .X(k)··t will be fed into a CNN architecture (with three
layers in this chapter). There are two kinds of features/relationships embedded in
(k)
.X··t we want to extract. The relationships within the frequency domain and across
sensor measurement dimension. The frequency domain usually contains lots of
local patterns in some neighbouring frequencies. And the interaction among sensor
measurement usually including all dimensions. Therefore, we first apply 2d filters
(k)
with shape .(d (k) , cov1) to .X··t to learn interaction among sensor measurement
(k,1)
dimensions and local patterns in frequency domain, with the output .X··t . Then
we apply 1d filters with shape .(1, cov2) and .(1, cov3) hierarchically to learn high-
(k,2) (k,3)
level relationships, .X··t and .X··t .
(k,3) (k,3) (k,3)
Then we flatten matrix .X··t into vector .x··t and concat all K vectors .{x··t }
(3)
into a K-row matrix .X··t , which is the input of the merge convolutional subnet.
The architecture of the merge convolutional subnet is similar as the individual
convolutional subnet. We first apply 2d filters with shape .(K, cov4) to learn the
(4)
interactions among all K sensors, with output .X··t , and then apply 1d filters with
(5)
shape .(1, cov5) and .(1, cov6) hierarchically to learn high-level relationships, .X··t
(6)
and .X··t .
For each convolutional layer, DeepSense learns 64 filters, and uses ReLU as the
activation function. In addition, batch normalization [35] is applied at each layer to
reduce internal covariate shift. We do not use residual net structures [22], because we
want to simplify the network architecture for mobile applications. Then we flatten
(6) (f ) (f )
the final output .X··t into vector .x··t ; concatenate .x··t and time interval width, .[τ ],
together into .x(c)
t as inputs of recurrent layers.

2.2 Recurrent Layers

Recurrent neural networks are powerful architectures that can approximate function
and learn meaningful features for sequences. Original RNNs fall short of learning
long-term dependencies. Two extended models are Long Short-Term Memory
(LSTM) [26] and Gated Recurrent Unit (GRU) [36]. In this chapter, we choose
GRU, because GRUs show similar performance as LSTMs on various tasks [36],
1 Neural Network Models for Time Series Data 9

while having a more concise expression, which reduces network complexity for
mobile applications.
DeepSense chooses a stacked GRU structure (with two layers in this chapter).
Compared with standard (single-layer) GRUs, stacked GRUs are a more efficient
way to increase model capacity [21]. Compared to bidirectional GRUs [37], which
contain two time flows from start to end and from end to start, stacked GRUs can
run incrementally, when there is a new time interval, resulting in faster processing
of stream data. In contrast, we cannot run bidirectional GRUs until data from all
time intervals are ready, which is infeasible for applications such as tracking. We
apply dropout to the connections between GRU layers [38] for regularization and
apply recurrent batch normalization [39] to reduce internal covariate shift among
time steps. Inputs .{x(c)
t } for .t = 1, · · · , T from previous convolutional layers are
fed into stacked GRU and generate outputs .{x(r) t } for .t = 1, · · · , T as inputs of the
final output layer.

2.3 Output Layer

The output of recurrent layer is a series of vectors .{x(r) t } for .t = 1, · · · , T . For the
regression-oriented task, since the value of each element in vector .x(r) t is within .±1,
(r)
.xt encodes the output physical quantities at the end of time interval t. In the output
(r)
layer, we want to learn a dictionary .Wout with a bias term .bout to decode .xt into
(r)
.ŷt , such that .ŷt = Wout · xt + bout . Therefore, the output layer is a fully connected
layer on the top of each interval with sharing parameter .Wout and .bout .
(r)
For the classification task, .xt is the feature vector at time interval t. The
(r)
output layer first needs to compose .{xt } into a fixed-length feature vector for
further processing. Averaging features over time is one choice. More sophisticated
methods can also be applied to generate the final feature, such as the attention
model [24], which has illustrated its effectiveness in various learning tasks recently.
The attention model can be viewed as weighted averaging of features over time,
but the weights are learnt by neural networks through context. In this chapter,
we still use averaging features over time to generate the final feature, .x(r) =
( Tt=1 x(r)t )/T . Then we feed .x
(r) into a softmax layer to generate the predicted

category probability .ŷ.

3 Task-Speciﬁc Customization

In this section, we ﬁrst describe how to trivially customize the DeepSense frame-
work to different mobile sensing and computing tasks. Next, we instantiate the
solution with three speciﬁc tasks used in our evaluation.
10 S. Yao and T. Abdelzaher

3.1 General Customization Process

In general, we need to customize a few parameters of the main architecture of

DeepSense, shown in Sect. 2, for specific mobile sensing and computing tasks. Our
general DeepSense customization process is as follows:
1. Identify the number of sensor inputs, K. Pre-process the sensor inputs into a set
of tensors .X = {X(k) } as input.
2. Identify the type of the task. Whether the application is regression or
classification-oriented. Select one of the two types of output layer according
to the type of task.
3. Design a customized cost function or choose the default cost function (namely,
mean square error for regression-oriented tasks and cross-entropy error for
classification-oriented tasks).
Therefore, if opt for the default DeepSense configuration, we need only to set the
number of inputs, K, preprocess the input sensor measurements, and identify the
type of task (i.e., regression-oriented versus classification-oriented).
The pre-processing is simple, as stated at the beginning of Sect. 2. We just need
to align and chunk the sensor measurements, and apply Fourier transform to each
sensor chunk. For each sensor, we stack these frequency domain outputs into .d (k) ×
2f × T tensor .X(k) , where .d (k) is the sensor measurement dimension, f is the
frequency domain dimension, and T is the number of time intervals.
To identify the number of sensor inputs K, we usually set K to be the number of
different sensing modalities available. If there exist two or more sensors of the same
modality (e.g., two accelerometers or three microphones), we just treat them as one
multi-dimensional sensor and set its measurement dimension accordingly.
For the cost function, we can design our own cost function other than the default
one. We denote our DeepSense model as function .F(·), and a single training sample
pair as .(X, y). We can express the cost function as:
.L = (F(X), y) + λj Pj (1.1)
j

where . (·) is the loss function, .Pj is the penalty or regularization function, and .λj
controls the importance of the penalty or regularization term.

3.2 Customize Mobile Sensing Tasks

In this section, we provide three instances of customizing DeepSense for speciﬁc

mobile computing applications used in our evaluation.
Car Tracking with Motion Sensors (CarTrack) In this task, we apply acceler-
ator, gyroscope, and magnetometer to track the trajectory of a car without initial
1 Neural Network Models for Time Series Data 11

speed. Therefore, according to our general customization process, carTrack is

a regression-oriented problem with .K = 3 (i.e. accelerometer, gyroscope, and
magnetometer). Instead of applying default mean square error loss function, we
design our own cost function according to Eq. (1.1).
During the training step, the ground-truth 2D displacement of car in each time
interval, .y, is obtained by GPS signal, where .y[t] denotes the 2D displacement in
time interval t. Yet a problem is that GPS signal also contains noise. Training the
DeepSense model to recover the displacement obtained from by GPS signal will
generate sub-optimal results. We apply Kalman ﬁlter to covert displacement .y[t] into
a 2D Gaussian distribution .Y[t] (·) with mean value .y(t) in time interval t. Therefore,
we use negative log likelihood as loss function . (·) with additional penalty terms:
. L = − log Y[t ] F(X)[t]
T
+ λ · max 0, cos(θ) − Sc F(X)[t] , y(t)
t=1

where .Sc (·, ·) denotes the cosine similarity, the first term is the negative log
likelihood loss function, and the second term is a penalty term controlled by
parameter .λ. If the angle between our predicted displacement .F(X)[t] and .y(t) is
larger than a pre-defined margin .θ ∈ [0, π ), the cost function will get a penalty.
We introduce the penalty, because we find that predicting a correct direction is more
important during the experiment, as described in Sect. 4.4.1.
Heterogeneous Human Activity Recognition (HHAR) In this task, we per-
form leave-one-user-out cross-validation on human activity recognition task with
accelerometer and gyroscope measurements. Therefore, according to our general
customization process, HHAR is a classification-oriented problem with .K = 2
(accelerometer and gyroscope). We use the default cross-entropy cost function as
the training objective.

. L = H (y, F(X))

where .H (·, ·) is the cross entropy for two distributions.

User Identification with Motion Analysis (UserID) In this task, we perform user
identification with biometric motion analysis. We classify users’ identity according
to accelerometer and gyroscope measurements. Similarly, according to our general
customization process, UserID is a classification-oriented problem with .K = 2
(accelerometer and gyroscope). Similarly as above, we use the default cross-entropy
cost function as the training objective.
This chapter focuses on solving different mobile sensing and computing tasks in
a unified framework. DeepSense is our solution. It is a framework that requires only
a few steps to be customized into particular tasks. During the customization steps,
we do not tailor the architecture for different tasks in order to lessen the requirement
12 S. Yao and T. Abdelzaher

of human efforts while using the framework. However, particular changes to the
architecture can bring additional performance gains to speciﬁc tasks.
One possible change is separating noise model and physical laws for regression-
oriented tasks. The original DeepSense directly learns the composition of noise
model and physical laws, providing the capability of automatically understanding
underlying physical process from data. However, if we know exactly the physical
process, we can use DeepSense as a powerful denoising component, and apply
physical laws to the outputs of DeepSense.
The other possible change is removing some design components to trade
accuracy for energy. In our evaluations, we show that some variants take acceptable
degradation on accuracy with less energy consumption. The basic principle of
removing design components is based on their functionalities. Individual convolu-
tional subnets explore relationship within each sensor; merge convolutional subnet
explores relationship among different sensors; and stacked RNN increases the
model capacity for exploring relationship over time. We can choose to omit some
components according to the demands of particular tasks.
At last, for a particular sensing task, if there is drastic change in the physical
environment, DeepSense might need to be re-trained with new data. However, on
one hand, the traditional solution with pre-deﬁned noise model and physical laws
(or hand-crafted features) would also need redesigns anyways. On the other hand, an
existing trained DeepSense framework can serve as a good initialization stage for the
new training process that aids in optimization and reduce generalization error [23].

4 Evaluation

In this section, we evaluate DeepSense on three mobile computing tasks. We

ﬁrst introduce the experimental setup for each, including datasets and baseline
algorithms. We then evaluate the three tasks based on accuracy, energy, and latency.
We use the abbreviations, CarTrack, HHAR, and UserID, as introduced in Sect. 3.2,
to refer to the aforementioned tasks.

4.1 Data Collection and Datasets

For the CarTrack task, we collect 17,500 phone-miles worth of driving data.
Namely, we collect around 500 driving hours in total using three cars ﬁtted with
20 mobile phones in the Urbana-Champaign area. Mobile devices include Nexus
5, Nexus 4, Galaxy Nexus, and Nexus S. Each mobile device collects measures
of accelerometer, gyroscope, magnetometer, and GPS. GPS measurements are
collected roughly every second. Collection rates of other sensors are set to their
highest frequency. After obtaining the raw sensor measurements, we ﬁrst segment
them into data samples. Each data sample is a zero-speed to zero-speed journey,
1 Neural Network Models for Time Series Data 13

where the start and termination are detected when there are at least three consecutive
zero GPS speed readings. Each data sample is then separated into time intervals
according to the GPS measurements. Hence, every GPS measurement is an indicator
of the end of a time interval. In addition, each data sample contains one additional
time interval with zero speed at the beginning. Furthermore, for each time interval,
GPS latitude and longitude are converted into map coordinates, where the origin of
coordinates is the position at the first time interval. Fourier transform is applied to
each sensor measurement in each time interval to obtain the frequency response of
the three sensing axes. The frequency responses of the accelerator, gyroscope, and
magnetometer at each time interval are then composed into the tensors as DeepSense
inputs. At last, for evaluation purposes, we apply a Kalman filter to coordinates
obtained by the GPS signal, and generate the displacement distribution of each time
interval. The results serve as ground truth for training.
For both the HHAR and UserID tasks, we use the dataset collected by Allan et
al. [3]. This dataset contains readings from two motion sensors (accelerometer and
gyroscope). Readings were recorded when users executed activities scripted in no
specific order, while carrying smartwatches and smartphones. The dataset contains 9
users, 6 activities (biking, sitting, standing, walking, climbStair-up, and climbStair-
down), and 6 types of mobile devices. For both tasks, accelerometer and gyroscope
measurements are model inputs. However, for HHAR, activities are used as labels,
and for UserID, users’ unique IDs are used as labels. We segment raw measurements
into 5-second samples. For DeepSense, each sample is further divided into time
intervals of length .τ , as shown in Fig. 1.1. We take .τ = 0.25 s. Then we calculate
the frequency response of sensors for each time interval, and compose results from
different time intervals into tensors as inputs.

4.2 Evaluation Platforms

Our evaluation experiments are conducted on two platforms: Nexus 5 with Qual-
comm Snapdragon 800 SoC [40] and Intel Edison Compute Module [41]. We train
DeepSense on Desktop with GPU. And trained DeepSense models are run solely
on mobile with CPU: quad core .2.3 GHz Krait 400 CPU on Nexus 5 and dual-core
500 MHz Atom processor on Intel Edison. In this chapter, we do not exploit the
additional computation power of mobile GPU and DSP units [31].

4.3 Algorithms in Comparison

We evaluate our DeepSense model and compare it with other competitive algorithms
in three tasks. There are three global baselines, which are the variants of DeepSense
model by removing one design component in the architecture. The other baselines
are speciﬁcally designed for each single task.
14 S. Yao and T. Abdelzaher

DS-singleGRU: This model replaces the 2-layer stacked GRU with a single-layer
GRU with larger dimension, while keeping the number of parameters. This baseline
algorithm is used to verify the efficiency of increasing model capacity by staked
recurrent layer.
DS-noIndvConv: In this mode, there are no individual convolutional subnets for
each sensor input. Instead, we concatenate the input tensors along the first axis (i.e.,
the input measurement dimension). Then, for each time interval, we have a single
matrix as the input to the merge convolutional subnet directly.
DS-noMergeConv: In this variant, there are no merge convolutional subnets at each
time interval. Instead, we flatten the output of each individual convolutional subnet
and concatenate them into a single vector as the input of the recurrent layers.
CarTrack Baseline:
• GPS: This is a baseline measurement that is specific to the CarTrack problem. It
can be viewed as the ground truth for the task, as we do not have other means of
more accurately acquiring cars’ locations. In the following experiments, we use
the GPS module in Qualcomm Snapdragon 800 SoC.
• Sensor-fusion: This is a sensor fusion based algorithm. It combines gyroscope
and accelerometer measurements to obtain the pure acceleration without gravity.
It uses accelerometer, gyroscope, and magnetometer to obtain absolute rotation
calibration. Android phones have proprietary solutions for these two func-
tions [42]. The algorithm then applies double integration on pure acceleration
with absolute rotation calibration to obtain the displacement.
• eNav (w/o GPS): eNav is a map-aided car tracking algorithm [13]. This
algorithm constrains the car movement path according to a digital map, and
computes moving distance along the path using double integration of acceleration
derived using principal component analysis that removes gravity. The original
eNav uses GPS when it believes that dead-reckoning error is high. For fairness,
we modified eNav to disable GPS.
HHAR Baselines:
• HAR-RF: This algorithm [3] selects all popular time-domain and frequency
domain features from [43] and ECDF features from [44], and uses random forest
as classifier.
• HAR-SVM: Feature selection of this model is same as the HAR-RF model. But
this model uses support vector machine as classifier [3].
• HRA-RBM: This model is based on stacked restricted Boltzmann machines with
frequency domain representations as inputs [29].
• HRA-MultiRBM: For each sensor input, the model processes it with a single
stacked restricted Boltzmann machine. Then it uses another stacked restricted
Boltzmann machine to merge the results for activity recognition [30].
UserID Baselines:
• GaitID: This model extracts the gait template and identifies user through
template matching with support vector machine [45].
1 Neural Network Models for Time Series Data 15

• IDNet: This model first extracts the gait template, and extracts template features
with convolutional neural networks. Then this model identifies user through sup-
port vector machine and integrates multiple verifications with Wald’s probability
ratio test [20].

4.4 Effectiveness

In this section, we will discuss the accuracy and other related performance metrics
of the DeepSense model, compared with other baseline algorithms.

4.4.1 CarTrack

We use 253 zero-speed to zero-speed car driving examples to evaluate the CarTrack
task. The histogram of evaluation data driving distance is illustrated in Fig. 1.2.
During the whole evaluation, we regard filtered GPS signal as ground truth.
CarTrack is a regression problem. Therefore, we first evaluate all algorithms with
mean absolute error (MAE) between predicted and true final displacements with
.95% confidence interval except for the eNav (w/o GPS) algorithm, which is a map-

aided algorithm without tracking real trajectories. The results about mean absolute
errors are illustrated in the second column of Table 1.1.

Fig. 1.2 Histogram of

driving distance 60

40
Frequency

0
0 500 1000 1500 2000
Driving Distant (m)

Table 1.1 CarTrack task MAE (meter) Map-aided accuracy

accuracy
DeepSense .40.43 ± 5.24 .93.8%
DS-SingleGRU .44.97 ± 5.80 .90.2%

DS-noIndvConv .52.15 ± 6.24 .88.3%

DS-noMergeConv .53.06 ± 6.59 .87.5%

Sensor-fusion .606.59 ± 56.57

eNav (w/o GPS) .6.7%

Bold values reﬂect the evaluation performance of techniques that

are presented in the current section
16 S. Yao and T. Abdelzaher

Compared with senior-fusion algorithm, DeepSense reduces the tracking error

by an order of magnitude, which is mainly attributed to its capability to learn the
composition of noise model and physical laws. Then, we compare our DeepSense
model with three variants as mentioned before. The results show the effectiveness
of each designing component of our DeepSense model. The individual and merge
convolutional subnets learn the interaction within and among sensor measurements
respectively. The stacked recurrent structure increases the capacity of model more
efficiently. Removing any component will cause performance degradation.
DeepSense model achieves .40.43 ± 5.24 m mean absolute error. This is almost
equivalent to half of traditional city blocks (.80 m × 80 m), which means that, with
the aid of map and the assumption that car is driving on roads, DeepSense model has
a high probability to provide accurate trajectory tracking. Therefore, we propose a
naive map-aided track method here. For each segment of original tracking trajectory,
we assign them to the most probable road segment on map (i.e., the nearest road
segment on map). We then compare the resulted trajectory with ground truth. If
all the trajectory segments are the same as the ground truth, we regard it as a
successful tracking trajectory. Finally, we compute the percentage of successful
tracking trajectories as accuracy. eNav (w/o GPS) is a map-aided algorithm, so
we directly compare the trajectory segments. Sensor-fusion algorithm generates
tracking errors that are comparable to driving distances, so we exclude it from
the comparison. We show the accuracy of map-aided versions of algorithms in the
third column of Table 1.1. DeepSense outperforms eNav (w/o GPS) with a large
margin, because eNav (w/o GPS) intrinsically depends on occasional on-demand
GPS samples to correct tracking error.
We next examine how tracking performance is affected by driving distances. We
first sort all evaluation samples according to driving distance. Then we separate them
into 10 groups with 200 m step size. Finally, we compute mean absolute error and
accuracy of map-aided track for DeepSense algorithm separately for each group. We
illustrate the results in Fig. 1.3. For the mean absolute error metric, driving longer
distance generally results in large error, but the error does not accumulate linearly
over distance. There are mainly two reasons for this phenomenon. On one hand, we
observe that the error of our predicted trajectory usually occurs during the beginning
of the driving, where uncertainty in predicting driving direction is the major cause.
This is also the motivation that we add the penalty term for cost function in Sect. 3.2.
On the other hand, longer-driving cases in our testing samples are more stable,
because we extract the trajectory from zero-speed to zero-speed. For the map-aided
track, longer driving distances even yields slightly better accuracy. This is because
long-distance trajectory usually contains long trajectory segments, which can help
to find the ground truth on the map.
Finally, some our DeepSense tracking results (without the help of map and with
downsampling) are illustrated in Fig. 1.4.
1 Neural Network Models for Time Series Data 17

1
0.95
0.9
0.85

Mean Absolute Error (m)

Map−Aided Track (%)

20
0 500 1000 1500 2000
Driving Distance (m)

Fig. 1.3 Performance over driving distance

4.4.2 HHAR

For HHAR task, we perform leave-one-user-out evaluation (i.e., leaving the whole
data from one user as testing data) on datasets consisting of 9 users, which are
labelled from a to i. We illustrate the result of evaluations according to three metrics:
accuracy, macro .F1 score, and micro .F1 score with .95% conﬁdence interval in
Fig. 1.5.
The DeepSense based algorithms (including DeepSense and three variants)
outperform other baseline algorithms with a large margin (i.e., at least .10%).
Compared with two hand-crafted feature based algorithms HAR-RF and HAR-
SVM, DeepSense model can automatically extract more robust features, which
generalize better to the user who does not appear in the training set. Compared with
a deep model, such as HAR-RBM and HAR-MultiRBM, DeepSense model exploit
local structures within sensor measurements, dependency along time, and relation-
ships among multiple sensors to generate better and more robust features from
data. Compared with three variants, DeepSense still achieves the best performance
(accuracy: .0.942 ± 0.032, macro .F1 : .0.931 ± 0.041, and micro .F1 : .0.942 ± 0.032).
This reinforces the effectiveness of our design components in DeepSense model.
Then we illustrate the confusion matrix of best-performing DeepSense model in
Fig. 1.6. Predicting Sit as Stand is the largest error. It is hard to classify these two,
because two activities should have similar motion sensor measurements by nature,
especially when we have no prior information about testing users. In addition, the
algorithm has a minor error about misclassiﬁcation between ClimbStair-up and
ClimbStair-down.
18 S. Yao and T. Abdelzaher

Fig. 1.4 Examples of tracking trajectory without the help of map: Blue trajectory (DeepSense)
and Red trajectory (GPS)

4.4.3 UserID

This task focuses on user identification with biometric motion analysis. We evaluate
all algorithms with 10-fold cross validation. We illustrate the result of evaluations
according to three metrics: accuracy, macro .F1 score, and micro .F1 score with .95%
confidence interval in Fig. 1.7. Specifically, The figure on the left shows the results
when algorithms observe .1.25 seconds of evaluation data, the figure on the right
shows the results when algorithms observe 5 seconds of evaluation data.
DeepSense and three variants outperform other baseline algorithms with a large
margin again (i.e. at least 20%). Compared with the template extraction and
matching method, GaitID, DeepSense model can automatically extract distinct
1 Neural Network Models for Time Series Data 19

0.9

0.8

DeepSense
0.7 DS−singleGRU
DS−noIndvConv
DS−noMergeConv
HAR−MultiRBM
0.6 HAR−RF
HAR−SVM
HAR−RBM
0.5
Accuracy Macro F1 Micro F1

Fig. 1.5 Performance metrics of HHAR task

Fig. 1.6 Confusion matrix of HHAR task

features from data, which ﬁt well to not only walking but also all other kinds
of activities. Compared with method that ﬁrst extracts templates and then apply
neural network to learn features, IDNet, DeepSense solves the whole task in the
end-to-end fashion. We eliminate the manually processing part and exploit local,
global, and temporal relationships through our architecture, which results better
performance. In this task, although the performance of different variants is similar
when observing data with 5 seconds, DeepSense still achieves the best performance
(accuracy: .0.997 ± 0.001, macro .F1 : .0.997 ± 0.001, and micro .F1 : .0.997 ± 0.001).
We further compare DeepSense with three variants by changing the number of
evaluation time intervals from 5 to 20, which corresponds to around 1 to 5 seconds.
We compute the accuracy for each case. The results illustrated in Fig. 1.8 suggest
20 S. Yao and T. Abdelzaher

1 1

0.9 0.9
DeepSense
DS−singleGRU
0.8 DS−noIndvConv 0.8
DS−noMergeConv
IDNet
0.7 GaitID 0.7

0.6 0.6
DeepSense
DS−singleGRU
0.5 0.5 DS−noIndvConv
DS−noMergeConv
IDNet
GaitID
0.4 0.4
Accuracy Macro F1 Micro F1 Accuracy Macro F1 Micro F1

Fig. 1.7 Performance metrics of UserID task for different time intervals: .1.25 s (left) and 5 s
(right)

0.98
Accuracy

DeepSense
0.96 DS−singleGRU
DS−noIndvConv
DS−noMergeConv
0.94
5 10 15 20
Number of input time intervals

Fig. 1.8 Accuracy over input measurement length of UserID task

that DeepSense performs better than all the other variants with a relatively large
margin when algorithms observe sensing data with shorter time. This indicates the
effectiveness of design components in DeepSense.
Then we illustrate the confusion matrix of best-performing DeepSense model
when observing sensing data with 5 seconds in Fig. 1.9. It shows that the algorithm
gives a pretty good result. On average, only about two misclassiﬁcations appear
during each testing.

4.5 Latency and Energy

Final, we examine the computation latency and energy consumption of DeepSens

stereotypical deep learning models are traditionally power hungry and time consum-
ing. We illustrate, through our careful measurements in all three example application
scenarios, the feasibility of directly implementing and deploying DeepSense on
mobile devices without any additional optimization.
Experiments measure the whole process on smart devices including reading the
raw sensor inputs and are conducted on two kinds of devices: Nexus 5 and Intel
1 Neural Network Models for Time Series Data 21

Fig. 1.9 Confusion matrix of

UserID task

Fig. 1.10 Test platforms:

Nexus5 and Intel Edison

150 4
GPS DeepSense
DeepSense DS−singleGRU
Latency (mS)

DS−singleGRU DS−noIndvConv
Power (mW)

DS−noIndvConv 3 DS−noMergeConv
100 DS−noMergeConv Sensor−fusion
Sensor−fusion eNav (w/o GPS)
eNav (w/o GPS) 2

50
1

0 0

Fig. 1.11 Power and Latency of carTrack solutions on Nexus 5

Edison, as shown in Fig. 1.10. The energy consumption of applications on Nexus 5

is measured by PowerTutor [46], while the energy consumption of Intel Edison is
measured by an external power monitor. The evaluations of energy and latency on
Nexus 5 are shown in Figs. 1.11, 1.12, and 1.13, and Intel Edison Figs. 1.14, 1.15,
and 1.16. Since algorithms for carTrack are designed to report position every second,
22 S. Yao and T. Abdelzaher

250 60
DeepSense
DS−singleGRU
50

Latency (mS)
200 DS−noIndvConv
Energy (mJ) DS−noMergeConv
40 HAR−RF
150 DeepSense HAR−SVM
DS−singleGRU 30 HAR−MltiRBM
DS−noIndvConv HAR−RBM
100 DS−noMergeConv
20
HAR−RF
50 HAR−SVM
10
HAR−MltiRBM
HAR−RBM
0 0

Fig. 1.12 Energy and Latency of HHAR solutions on Nexus 5

600 100
DeepSense DeepSense
DS−singleGRU DS−singleGRU
500

Latency (mS)
DS−noIndvConv 80 DS−noIndvConv
Energy (mJ)

DS−noMergeConv DS−noMergeConv
400 IDNet IDNet
GaitID 60 GaitID
300
40
200

100 20

0 0

Fig. 1.13 Energy and Latency of UserID solutions on Nexus 5

800 40
DeepSense
DS−singleGRU
Latency (mS)

DS−noIndvConv
Power (mW)

600 30 DS−noMergeConv
Sensor−fusion
eNav (w/o GPS)
400 20
DeepSense
DS−singleGRU
DS−noIndvConv
200 DS−noMergeConv
10
Sensor−fusion
eNav (w/o GPS)
0 0

Fig. 1.14 Power and Latency of carTrack solutions on Edison

800 300
DeepSense
DS−singleGRU
250
Latency (mS)

DS−noIndvConv
Energy (mJ)

600 DS−noMergeConv
200 HAR−RF
DeepSense HAR−SVM
400 DS−singleGRU 150 HAR−MltiRBM
DS−noIndvConv HAR−RBM
DS−noMergeConv
100
HAR−RF
200 HAR−SVM
HAR−MltiRBM 50
HAR−RBM
0 0

Fig. 1.15 Energy and Latency of HHAR solutions on Edison

1000 150

800
Latency (mS)
Energy (mJ)

100
600
DeepSense DeepSense
400 DS−singleGRU DS−singleGRU
50
DS−noIndvConv DS−noIndvConv
200 DS−noMergeConv DS−noMergeConv
IDNet IDNet
GaitID GaitID
0 0

Fig. 1.16 Energy and Latency of UserID solutions on Edison

1 Neural Network Models for Time Series Data 23

we show the power consumption in Fig. 1.14. Other two tasks are not periodical
tasks by nature. Therefore, we show the per-inference energy consumption in
Figs. 1.15 and 1.16. For experiments on Intel Edison, notice that we measured total
energy consumption, containing 419 mW idle-mode power consumption.
For the carTrack task, all DeepSense based models consume a bit less energy
compared with 1-Hz GPS samplings on Nexus 5. The running times are measured
in the order of microsecond on both platforms, which meets the requirement of per-
second measurement.
For the HHAR task, all DeepSense based models take moderate energy and low
latency to obtain one classiﬁcation prediction on two platforms. An interesting
observation is that HHAR-RF, a random forest model, has a relatively longer
latency. This is due to the fact that random forest is an ensemble method, which
involves combining a bag of individual decision tree classiﬁers.
For the UserID task, except for the IDNet baseline, all other algorithms show
similar running time and energy consumption on two platforms. IDNet contains
both a multi-stage pre-processing process and a relative large CNN, which takes
longer time and more energy to compute in total.

References

1. N.D. Lane, E. Miluzzo, H. Lu, D. Peebles, T. Choudhury, A.T. Campbell, A survey of mobile
phone sensing. IEEE Commun. Mag. (2010)
2. Y. Ren, Y. Chen, M. C. Chuah, J. Yang, Smartphone based user verification leveraging gait
recognition for mobile healthcare systems, in SECON (2013)
3. A. Stisen, H. Blunck, S. Bhattacharya, T.S. Prentow, M.B. Kjærgaard, A. Dey, T. Sonne, M.M.
Jensen, Smart devices are different: Assessing and mitigatingmobile sensing heterogeneities
for activity recognition, in Sensys (2015)
4. S. Nath, Ace: exploiting correlation for energy-efficient and continuous context sensing, in
MobiSys (2012)
5. C. Xu, S. Li, G. Liu, Y. Zhang, E. Miluzzo, Y.-F. Chen, J. Li, B. Firner, Crowd++: unsupervised
speaker count with smartphones, in UbiComp (2013)
6. J. Ko, C. Lu, M.B. Srivastava, J.A. Stankovic, A. Terzis, M. Welsh, Wireless sensor networks
for healthcare. Proc. IEEE (2010)
7. M. Rabbi, M.H. Aung, M. Zhang, T. Choudhury, Personal sensing: understanding mental health
using ubiquitous sensors and machine learning, in UbiComp (2015)
8. C.-Y. Li, C.-H. Yen, K.-C. Wang, C.-W. You, S.-Y. Lau, C. C.-H. Chen, P. Huang, H.-H. Chu,
Bioscope: an extensible bandage system for facilitating data collection in nursing assessments,
in UbiComp (2014)
9. T. Li, C. An, Z. Tian, A. T. Campbell, X. Zhou, Human sensing using visible light
communication, in MobiCom (2015)
10. Y. Zhu, Y. Zhu, B.Y. Zhao, H. Zheng, Reusing 60ghz radios for mobile radar imaging, in
MobiCom (2015)
11. E. Miluzzo, A. Varshavsky, S. Balakrishnan, R.R. Choudhury, Tapprints: your finger taps have
fingerprints, in MobiSys (2012)
12. C. Wang, X. Guo, Y. Wang, Y. Chen, B. Liu, Friend or foe? Your wearable devices reveal your
personal pin, in AsiaCCS (2016)
24 S. Yao and T. Abdelzaher

13. S. Hu, L. Su, S. Li, S. Wang, C. Pan, S. Gu, M.T. Al Amin, H. Liu, S. Nath, et al., Experiences
with enav: a low-power vehicular navigation system, in UbiComp (2015)
14. L. Kang, B. Qi, D. Janecek, S. Banerjee, Ecodrive: a mobile sensing and control system for
fuel efﬁcient driving, in MobiCom (2015)
15. Y. Zhao, S. Li, S. Hu, L. Su, S. Yao, H. Shao, T. Abdelzaher, Greendrive: a smartphone-based
intelligent speed adaptation system with real-time trafﬁc signal prediction, in ICCPS (2017)
16. W.T. Ang, P.K. Khosla, C.N. Riviere, Nonlinear regression model of a low-g mems accelerom-
eter. IEEE Sensors J. (2007)
17. M. Park, Error Analysis and Stochastic Modeling of MEMS-Based Inertial Sensors for
Land Vehicle Navigation Applications. (Library and Archives Canada, 2005). Bibliothè que et
Archives Canada
18. G. Chandrasekaran, T. Vu, A. Varshavsky, M. Gruteser, R.P. Martin, J. Yang, Y. Chen, Tracking
vehicular speed variations by warping mobile phone signal strengths, in PerCom (2011)
19. K. Lin, A. Kansal, D. Lymberopoulos, F. Zhao, Energy-accuracy aware localization for mobile
devices, in MobiSys (2010)
20. M. Gadaleta, M. Rossi, Idnet: Smartphone-based gait recognition with convolutional neural
networks (2016). arXiv:1606.03238
21. I.G.Y. Bengio, A. Courville, Deep Learning (MIT Press, 2016). Book in preparation
22. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition (2015).
arXiv:1512.03385
23. G.E. Dahl, D. Yu, L. Deng, A. Acero, Context-dependent pre-trained deep neural networks for
large-vocabulary speech recognition. IEEE TASLP (2012)
24. D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and
translate (2014). arXiv:1409.0473
25. J. Donahue, L. Anne Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, T.
Darrell, Long-term recurrent convolutional networks for visual recognition and description, in
CVPR (2015)
26. K. Greff, R.K. Srivastava, J. Koutník, B.R. Steunebrink, J. Schmidhuber, Lstm: a search space
odyssey (2015). arXiv:1503.04069
27. N. Srivastava, R.R. Salakhutdinov, Multimodal learning with deep boltzmann machines, in
NIPS (2012)
28. N.D. Lane, P. Georgiev, L. Qendro, Deepear: robust smartphone audio sensing in unconstrained
acoustic environments using deep learning, in UbiComp (2015)
29. S. Bhattacharya, N.D. Lane, From smart to deep: robust activity recognition on smartwatches
using deep learning, in PerCom Workshops (2016)
30. V. Radu, N.D. Lane, S. Bhattacharya, C. Mascolo, M.K. Marina, F. Kawsar, Towards
multimodal deep learning for activity recognition on mobile devices, in UbiComp: Adjunct
(2016)
31. N.D. Lane, S. Bhattacharya, P. Georgiev, C. Forlivesi, L. Jiao, L. Qendro, F. Kawsar, Deepx: a
software accelerator for low-power deep learning inference on mobile devices, in IPSN (2016)
32. R. LiKamWa, Y. Hou, J. Gao, M. Polansky, L. Zhong, Redeye: analog convnet image sensor
architecture for continuous mobile vision, in ISCA (2016), pp. 255–266
33. F.J.O. Morales, D. Roggen, Deep convolutional feature transfer across mobile activity recog-
nition domains, sensor modalities and locations, in ISWC (2016)
34. O. Rippel, J. Snoek, R.P. Adams, Spectral representations for convolutional neural networks,
in NIPS (2015)
35. S. Ioffe, C. Szegedy, Batch normalization: accelerating deep network training by reducing
internal covariate shift (2015). arXiv:1502.03167
36. J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural
networks on sequence modeling (2014). arXiv:1412.3555
37. M. Schuster, K.K. Paliwal, Bidirectional recurrent neural networks. IEEE Trans Sig. Process.
(1997)
38. W. Zaremba, I. Sutskever, O. Vinyals, Recurrent neural network regularization (2014).
arXiv:1409.2329
1 Neural Network Models for Time Series Data 25

39. T. Cooijmans, N. Ballas, C. Laurent, A. Courville, Recurrent batch normalization (2016).

arXiv:1603.09025
40. Qualcomm Snapdragon 800 Processor, https://www.qualcomm.com/products/snapdragon/
processors/800
41. Intel Edison Compute Module, http://www.intel.com/content/dam/support/us/en/documents/
edison/sb/edison-module_HG_331189.pdf
42. G. Milette, A. Stroud, Professional Android Sensor Programming (John Wiley & Sons,
Hoboken, 2012)
43. D. Figo, P.C. Diniz, D.R. Ferreira, J.M. Cardoso, Preprocessing techniques for context
recognition from accelerometer data. Pers. Ubiquit. Comput. (2010)
44. N.Y. Hammerla, R. Kirkham, P. Andras, T. Ploetz, On preserving statistical characteristics of
accelerometry data using their empirical cumulative distribution, in ISWC (2013)
45. H.M. Thang, V.Q. Viet, N.D. Thuc, D. Choi, Gait identiﬁcation using accelerometer on mobile
phone, in ICCAIS (2012)
46. L. Zhang, B. Tiwana, Z. Qian, Z. Wang, R.P. Dick, Z.M. Mao, L. Yang, Accurate online power
estimation and automatic battery behavior based power model generation for smartphones, in
CODES+ISSS (2010)

The Metamorphica Revised
No ratings yet
The Metamorphica Revised
270 pages
Sylvia WalkThrough Official March2020
No ratings yet
Sylvia WalkThrough Official March2020
10 pages
Deep Learning For Intelligent Wireless Networks: A Comprehensive Survey
No ratings yet
Deep Learning For Intelligent Wireless Networks: A Comprehensive Survey
25 pages
BSIM4 4.8.1 Technical Manual
No ratings yet
BSIM4 4.8.1 Technical Manual
185 pages
Chapter 1
No ratings yet
Chapter 1
23 pages
Deepsense: A Unified Deep Learning Framework For Time-Series Mobile Sensing Data Processing
No ratings yet
Deepsense: A Unified Deep Learning Framework For Time-Series Mobile Sensing Data Processing
9 pages
MobileNetV2 Inverted Residuals and Linear Bottlenecks
No ratings yet
MobileNetV2 Inverted Residuals and Linear Bottlenecks
11 pages
(IJCST-V10I3P2) :yew Kee Wong
No ratings yet
(IJCST-V10I3P2) :yew Kee Wong
11 pages
Mobilenetv2: Inverted Residuals and Linear Bottlenecks
No ratings yet
Mobilenetv2: Inverted Residuals and Linear Bottlenecks
11 pages
Mobilenetv2: Inverted Residuals and Linear Bottlenecks
No ratings yet
Mobilenetv2: Inverted Residuals and Linear Bottlenecks
14 pages
Fulltext01 P
No ratings yet
Fulltext01 P
78 pages
DL Acceleration On The Edge
No ratings yet
DL Acceleration On The Edge
78 pages
A Review On Basic Deep Learning
No ratings yet
A Review On Basic Deep Learning
9 pages
Visual Sensor Network: Exploring the Power of Visual Sensor Networks in Computer Vision
From Everand
Visual Sensor Network: Exploring the Power of Visual Sensor Networks in Computer Vision
Fouad Sabry
No ratings yet
Machine Learning - Advanced Concepts
From Everand
Machine Learning - Advanced Concepts
Derrick Mwiti
No ratings yet
Ali Thesis
No ratings yet
Ali Thesis
125 pages
Deep Learning
No ratings yet
Deep Learning
8 pages
Indoor - Nagivgation - System - by NN
No ratings yet
Indoor - Nagivgation - System - by NN
69 pages
Analog Dialogue, Volume 48, Number 2
From Everand
Analog Dialogue, Volume 48, Number 2
Analog Dialogue
No ratings yet
Neral Introduction
No ratings yet
Neral Introduction
35 pages
Rupam's Master Thesis Presentation
No ratings yet
Rupam's Master Thesis Presentation
34 pages
Activity Recognition: Fundamentals and Applications
From Everand
Activity Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
Rupam's Master Thesis
No ratings yet
Rupam's Master Thesis
58 pages
Ijisav11n1spl 25
No ratings yet
Ijisav11n1spl 25
7 pages
Automatic Target Recognition: Advances in Computer Vision Techniques for Target Recognition
From Everand
Automatic Target Recognition: Advances in Computer Vision Techniques for Target Recognition
Fouad Sabry
No ratings yet
UNITS
No ratings yet
UNITS
14 pages
Deep Learning in Mobile and Wireless Networking: A Survey
No ratings yet
Deep Learning in Mobile and Wireless Networking: A Survey
67 pages
FULLTEXT01
No ratings yet
FULLTEXT01
74 pages
2019 Dac Tutorial Nvidia Part
No ratings yet
2019 Dac Tutorial Nvidia Part
55 pages
A Survey of Deep Learning On Mobile Devices Applications Optimizations Challenges and Research Opportunities
No ratings yet
A Survey of Deep Learning On Mobile Devices Applications Optimizations Challenges and Research Opportunities
21 pages
Deep Learning Using SVM in Matlab
No ratings yet
Deep Learning Using SVM in Matlab
13 pages
Object Detection: Advances, Applications, and Algorithms
From Everand
Object Detection: Advances, Applications, and Algorithms
Fouad Sabry
No ratings yet
Adaptive Intelligence For Ba Eryless Sensors Using So Ware-Accelerated Tsetlin Machines
No ratings yet
Adaptive Intelligence For Ba Eryless Sensors Using So Ware-Accelerated Tsetlin Machines
14 pages
Computer Vision: Fundamentals and Applications
From Everand
Computer Vision: Fundamentals and Applications
Fouad Sabry
No ratings yet
The Modern Mathematics of Deep Learning
No ratings yet
The Modern Mathematics of Deep Learning
78 pages
Automatic Target Recognition: Fundamentals and Applications
From Everand
Automatic Target Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
Efficient Deep Learning Infrastructures For Embedded Computing Systems: A Comprehensive Survey and Future Envision
No ratings yet
Efficient Deep Learning Infrastructures For Embedded Computing Systems: A Comprehensive Survey and Future Envision
101 pages
Computer Vision: Exploring the Depths of Computer Vision
From Everand
Computer Vision: Exploring the Depths of Computer Vision
Fouad Sabry
No ratings yet
Assignment 1 STIBME Eman Naeem
No ratings yet
Assignment 1 STIBME Eman Naeem
6 pages
Scene Classcification Using Deep Learning: by M. Wasif Asrar STUDENT NO:1810140
No ratings yet
Scene Classcification Using Deep Learning: by M. Wasif Asrar STUDENT NO:1810140
13 pages
Resilient H State Estimation For Discrete-Time Stochastic Delayed Memristive Neural Networks A Dynamic Event-Triggered Mechanism
No ratings yet
Resilient H State Estimation For Discrete-Time Stochastic Delayed Memristive Neural Networks A Dynamic Event-Triggered Mechanism
9 pages
Deep Learning For Geometric and Semantic Tasks in Photogrammetry and Remote Sensing
No ratings yet
Deep Learning For Geometric and Semantic Tasks in Photogrammetry and Remote Sensing
11 pages
Deep Learning 2 July 2014
No ratings yet
Deep Learning 2 July 2014
75 pages
Nguyen Duy
No ratings yet
Nguyen Duy
66 pages
Deep Learning Notes (1) 2
No ratings yet
Deep Learning Notes (1) 2
54 pages
Deep Neural Networks Are Lazy: On The Inductive Bias of Deep Learning
No ratings yet
Deep Neural Networks Are Lazy: On The Inductive Bias of Deep Learning
78 pages
(2019) Towards Machine Learning With Zero Real - World Data
No ratings yet
(2019) Towards Machine Learning With Zero Real - World Data
6 pages
Accurate Orientation Estimates For Deep Inertial Odometry: Scott Sun July 2020 CMU-RI-TR-20-29
No ratings yet
Accurate Orientation Estimates For Deep Inertial Odometry: Scott Sun July 2020 CMU-RI-TR-20-29
42 pages
Traffic Sign Classification: Mezzi Houssem
No ratings yet
Traffic Sign Classification: Mezzi Houssem
36 pages
Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks - 2016
No ratings yet
Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks - 2016
7 pages
Deep Learning in Neural Networks An Overview
No ratings yet
Deep Learning in Neural Networks An Overview
89 pages
Binary Neural Networks
No ratings yet
Binary Neural Networks
218 pages
Deep Learning Techniques and Application
No ratings yet
Deep Learning Techniques and Application
20 pages
Machine Learning Algorithms For Wireless Sensor Networksa Survey
100% (1)
Machine Learning Algorithms For Wireless Sensor Networksa Survey
25 pages
Deep Learning Applications
No ratings yet
Deep Learning Applications
309 pages
Ug4 Proj
No ratings yet
Ug4 Proj
44 pages
Federated and Continual Learning For Classification Tasks in A Society of Devices
No ratings yet
Federated and Continual Learning For Classification Tasks in A Society of Devices
16 pages
View Synthesis: Exploring Perspectives in Computer Vision
From Everand
View Synthesis: Exploring Perspectives in Computer Vision
Fouad Sabry
No ratings yet
A Deep Convolutional Neural Network For Time Series Classification With Intermediate Targets
No ratings yet
A Deep Convolutional Neural Network For Time Series Classification With Intermediate Targets
24 pages
A Little Book of Deep Learning - Francois Fleuret
No ratings yet
A Little Book of Deep Learning - Francois Fleuret
149 pages
Lbdlu
No ratings yet
Lbdlu
168 pages
Sensors 23 00062 v2
No ratings yet
Sensors 23 00062 v2
86 pages
Deep Learning For Iot: Tausif Diwan, Jitendra V. Tembhurne, Tapan Kumar Jain, and Pooja Jain
No ratings yet
Deep Learning For Iot: Tausif Diwan, Jitendra V. Tembhurne, Tapan Kumar Jain, and Pooja Jain
17 pages
Agent-Based Cloud Service Composition: J. Octavio Gutierrez-Garcia
No ratings yet
Agent-Based Cloud Service Composition: J. Octavio Gutierrez-Garcia
29 pages
Addressing Self-Management in Cloud Platforms: A Semantic Sensor Web Approach
No ratings yet
Addressing Self-Management in Cloud Platforms: A Semantic Sensor Web Approach
8 pages
Elmroth 2011
No ratings yet
Elmroth 2011
12 pages
Self-Management For Cloud Computing: Mariachiara Puviani Regina Frei
No ratings yet
Self-Management For Cloud Computing: Mariachiara Puviani Regina Frei
7 pages
Self-Managed Services Conceptual Model in Trustworthy Clouds' Infrastructure
No ratings yet
Self-Managed Services Conceptual Model in Trustworthy Clouds' Infrastructure
10 pages
Compsac 2009 126
No ratings yet
Compsac 2009 126
6 pages
Cloudlightning: A Self-Organized Self-Managed Heterogeneous Cloud
No ratings yet
Cloudlightning: A Self-Organized Self-Managed Heterogeneous Cloud
10 pages
1 s2.0 S1877050920311339 Main
No ratings yet
1 s2.0 S1877050920311339 Main
10 pages
DS-RT 2013 37
No ratings yet
DS-RT 2013 37
8 pages
Toffetti 2016
No ratings yet
Toffetti 2016
15 pages
Self-Managed Services Over A P2P-based Network Management Overlay
No ratings yet
Self-Managed Services Over A P2P-based Network Management Overlay
8 pages
Passive Voice EXERCISE 1: Change These Sentences Into Passive
No ratings yet
Passive Voice EXERCISE 1: Change These Sentences Into Passive
6 pages
Optical Communication Systems: Introduce An Overview of Optical Systems and Actual Works
No ratings yet
Optical Communication Systems: Introduce An Overview of Optical Systems and Actual Works
39 pages
Unit 8
No ratings yet
Unit 8
3 pages
C2 MidTest
No ratings yet
C2 MidTest
9 pages
Passive Voice
No ratings yet
Passive Voice
3 pages
Atlas-Of-Stress-Strain-Curves Ocr PDF
100% (1)
Atlas-Of-Stress-Strain-Curves Ocr PDF
822 pages
Public Health Approach To Cardiovascular Disease Prevention Management Dorairaj Prabhakaran PDF Download
No ratings yet
Public Health Approach To Cardiovascular Disease Prevention Management Dorairaj Prabhakaran PDF Download
85 pages
Suaza
No ratings yet
Suaza
3 pages
My Voice....
No ratings yet
My Voice....
3 pages
Assignment I
No ratings yet
Assignment I
1 page
Basic Ecg Interpretation Practice Test V 1
100% (1)
Basic Ecg Interpretation Practice Test V 1
7 pages
One App To Trace Them All Examining App Specifications For Mass Acceptance of Contact Tracing Apps
No ratings yet
One App To Trace Them All Examining App Specifications For Mass Acceptance of Contact Tracing Apps
15 pages
The Dairy Farming Handbook 2017 - by DR CJC Muller
No ratings yet
The Dairy Farming Handbook 2017 - by DR CJC Muller
346 pages
Bonding Practice Test HL
No ratings yet
Bonding Practice Test HL
17 pages
OLED
No ratings yet
OLED
10 pages
Hermite Curves, B-Splines and NURBS: Outline
No ratings yet
Hermite Curves, B-Splines and NURBS: Outline
10 pages
Schimmel Deciphering Signs
No ratings yet
Schimmel Deciphering Signs
287 pages
INC 341 Feedback Control Systems 30 พ.ย.2558 PDF
No ratings yet
INC 341 Feedback Control Systems 30 พ.ย.2558 PDF
6 pages
Abaqus-Modeling of Nonlinear Cyclic Load Behavior of Ishaped
No ratings yet
Abaqus-Modeling of Nonlinear Cyclic Load Behavior of Ishaped
10 pages
AENG 521 - 3: Symmetrical & Unsymmetrical Flight Conditions Far Part 23 & 25
No ratings yet
AENG 521 - 3: Symmetrical & Unsymmetrical Flight Conditions Far Part 23 & 25
11 pages
Festo Motion FHPP Rockwell
No ratings yet
Festo Motion FHPP Rockwell
57 pages
Measurements Lab
No ratings yet
Measurements Lab
5 pages
Crio 9068 User Manaul
No ratings yet
Crio 9068 User Manaul
24 pages
16-04-2025 - SR - Super60, Elite, Target & LIIT BT'S - Jee-Adv (2023-P1) - GTA-27 - Key & Sol's
No ratings yet
16-04-2025 - SR - Super60, Elite, Target & LIIT BT'S - Jee-Adv (2023-P1) - GTA-27 - Key & Sol's
17 pages
OPT5508B Tuner To IP Gateway User's Manual: Directory
No ratings yet
OPT5508B Tuner To IP Gateway User's Manual: Directory
16 pages
MCHONE Stainless Grades Chart Downloadable
No ratings yet
MCHONE Stainless Grades Chart Downloadable
3 pages
Save Electricity: N1447018555 MJD70 - 43 - 1447018555
No ratings yet
Save Electricity: N1447018555 MJD70 - 43 - 1447018555
1 page
(Diabetes No More) Diabetes Destroyer
100% (2)
(Diabetes No More) Diabetes Destroyer
30 pages
Domino'S Nutrition Guide
No ratings yet
Domino'S Nutrition Guide
17 pages
11 Astrological Remedies
100% (3)
11 Astrological Remedies
28 pages
Network Theory 3rd Sem Update
No ratings yet
Network Theory 3rd Sem Update
5 pages
Information Sheet On Ramsar Wetlands (RIS) : TH TH
No ratings yet
Information Sheet On Ramsar Wetlands (RIS) : TH TH
11 pages
BQ - Preliminaries
No ratings yet
BQ - Preliminaries
10 pages

Ai ML

Uploaded by

Ai ML

Uploaded by

Chapter 1

Neural Network Models for Time Series

Shuochao Yao and Tarek Abdelzaher

Abstract A wide range of mobile sensing and computing applications require

A wide range of mobile sensing and computing applications require time-series

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 3

quantities can be computed, such as displacement through double integration of

thus is particularly sensitive to error accumulation, as acceleration errors can

contain local, global, and temporal relationships in classiﬁcation-oriented prob-

Recurrent Layer 2 GRU ...... ...... ...... ...... GRU

Recurrent Layer 1 ...... ...... ...... ......

Flatten & Concatenation

Individual Convolutional Layer 1 ............ (k)

K sensor inputs K sensor inputs

Fig. 1.1 Main architecture of the DeepSense framework

2.1 Convolutional Layers

2.2 Recurrent Layers

2.3 Output Layer

category probability .ŷ.

3.1 General Customization Process

In general, we need to customize a few parameters of the main architecture of

3.2 Customize Mobile Sensing Tasks

In this section, we provide three instances of customizing DeepSense for speciﬁc

speed. Therefore, according to our general customization process, carTrack is

where .H (·, ·) is the cross entropy for two distributions.

In this section, we evaluate DeepSense on three mobile computing tasks. We

4.1 Data Collection and Datasets

4.2 Evaluation Platforms

4.3 Algorithms in Comparison

Fig. 1.2 Histogram of

Table 1.1 CarTrack task MAE (meter) Map-aided accuracy

DS-noIndvConv .52.15 ± 6.24 .88.3%

DS-noMergeConv .53.06 ± 6.59 .87.5%

eNav (w/o GPS) .6.7%

Bold values reﬂect the evaluation performance of techniques that

Compared with senior-fusion algorithm, DeepSense reduces the tracking error

Mean Absolute Error (m)

Map−Aided Track (%)

Fig. 1.3 Performance over driving distance

Fig. 1.5 Performance metrics of HHAR task

Fig. 1.6 Confusion matrix of HHAR task

Fig. 1.8 Accuracy over input measurement length of UserID task

4.5 Latency and Energy

Final, we examine the computation latency and energy consumption of DeepSens

Fig. 1.9 Confusion matrix of

Fig. 1.10 Test platforms:

Fig. 1.11 Power and Latency of carTrack solutions on Nexus 5

Edison, as shown in Fig. 1.10. The energy consumption of applications on Nexus 5

Fig. 1.12 Energy and Latency of HHAR solutions on Nexus 5

Fig. 1.13 Energy and Latency of UserID solutions on Nexus 5

Fig. 1.14 Power and Latency of carTrack solutions on Edison

Fig. 1.15 Energy and Latency of HHAR solutions on Edison

Fig. 1.16 Energy and Latency of UserID solutions on Edison

39. T. Cooijmans, N. Ballas, C. Laurent, A. Courville, Recurrent batch normalization (2016).

You might also like