Encoding Time Series As Images For Visual Inspection and Classification Using Tiled Convolutional Neural Networks
data input is typically represented by concatenating Mel-frequency cepstral coefficients (MFCCs) or perceptual linear predictive coefficients (PLPs) (Hermansky 1990), typical time series data are not likely to benefit from the transformations commonly applied to speech or acoustic data.
In this paper, we present two new representations for encoding time series as images that we call the Gramian Angular Field (GAF) and the Markov Transition Field (MTF). We select the same twelve "hard" time series datasets used by Oates et al., and apply deep Tiled Convolutional Neural Networks (Tiled CNNs) with a pretraining stage that exploits local orthogonality by Topographic ICA (Ngiam et al. 2010) to "visually" represent the time series. We report our classification performance on GAF and MTF separately, and on GAF-MTF, which results from combining the GAF and MTF representations into a single image. By comparing our results with five previous and current state-of-the-art hand-crafted representation and classification methods, we show that our approach in practice achieves competitive performance with the state of the art while exploring a relatively small parameter space. We also find that our Tiled CNN based deep learning method works well with small time series datasets, while the traditional CNN may not work well on such small datasets (Zheng et al. 2014). In addition to exploring the high-level features learned by Tiled CNNs, we provide an in-depth analysis in terms of the duality between time series and images within our frameworks that more precisely identifies the reasons why our approaches work.

Figure 1: Illustration of the proposed encoding map of the Gramian Angular Field. X is a typical time series from the 'SwedishLeaf' dataset. After X is rescaled by eq. (1) and optionally smoothed by PAA, we transform it into the polar coordinate system by eq. (2) and finally calculate its GAF image with eq. (4). In this example, we build the GAF without PAA smoothing, so the GAF has a high resolution of 128 × 128.

Encoding Time Series to Images

We first introduce our two frameworks for encoding time series as images. The first type of image is the Gramian Angular Field (GAF), in which we represent the time series in a polar coordinate system instead of the typical Cartesian coordinates. In the Gramian matrix, each element is actually the cosine of the summation of two angles. Inspired by previous work on the duality between time series and complex networks (Campanharo et al. 2011), the main idea of the second framework, the Markov Transition Field (MTF), is to build the Markov matrix of quantile bins after discretization and encode the dynamic transition probabilities in a quasi-Gramian matrix.

Gramian Angular Field

Given a time series X = {x_1, x_2, ..., x_n} of n real-valued observations, we rescale X so that all values fall in the interval [−1, 1]:

    x̃_i = ((x_i − max(X)) + (x_i − min(X))) / (max(X) − min(X))    (1)

Thus we can represent the rescaled time series X̃ in polar coordinates by encoding the value as the angular cosine and the time stamp as the radius with the equation below:

    φ = arccos(x̃_i),  −1 ≤ x̃_i ≤ 1,  x̃_i ∈ X̃
    r = t_i / N,  t_i ∈ ℕ    (2)

In the equation above, t_i is the time stamp and N is a constant factor that regularizes the span of the polar coordinate system. This polar coordinate based representation is a novel way to understand time series. As time increases, the corresponding values warp among different angular points on the spanning circles, like water rippling. The encoding map of equation (2) has two important properties. First, it is bijective, as cos(φ) is monotonic when φ ∈ [0, π]. Given a time series, the proposed map produces one and only one result in the polar coordinate system, with a unique inverse function. Second, as opposed to Cartesian coordinates, polar coordinates preserve absolute temporal relations. In Cartesian coordinates, if the area is defined by S_{i,j} = ∫_{x(i)}^{x(j)} f(x(t)) dx(t), we have S_{i,i+k} = S_{j,j+k} if f(x(t)) has the same values on [i, i+k] and [j, j+k]. However, in polar coordinates, if the area is defined as S′_{i,j} = ∫_{φ(i)}^{φ(j)} r[φ(t)]² d(φ(t)), then S′_{i,i+k} ≠ S′_{j,j+k}. That is, the corresponding area from time stamp i to time stamp j depends not only on the time interval |i − j|, but also on the absolute values of i and j. We will discuss this in more detail in another work.

After transforming the rescaled time series into the polar coordinate system, we can easily exploit the angular perspective by considering the trigonometric sum between each pair of points to identify the temporal correlation within different time intervals. The GAF is defined as follows:

        ⎡ cos(φ_1 + φ_1)  ···  cos(φ_1 + φ_n) ⎤
    G = ⎢ cos(φ_2 + φ_1)  ···  cos(φ_2 + φ_n) ⎥    (3)
        ⎢       ⋮          ⋱         ⋮        ⎥
        ⎣ cos(φ_n + φ_1)  ···  cos(φ_n + φ_n) ⎦

      = X̃′ · X̃ − (√(I − X̃²))′ · √(I − X̃²)    (4)

I is the unit row vector [1, 1, ..., 1]. After transforming to the polar coordinate system, we take the time series at each time step as a 1-D metric space. By defining the inner product ⟨x, y⟩ = x·y − √(1 − x²)·√(1 − y²), G is a Gramian matrix:
        ⎡ ⟨x̃_1, x̃_1⟩  ···  ⟨x̃_1, x̃_n⟩ ⎤
    G = ⎢ ⟨x̃_2, x̃_1⟩  ···  ⟨x̃_2, x̃_n⟩ ⎥    (5)
        ⎢      ⋮        ⋱        ⋮      ⎥
        ⎣ ⟨x̃_n, x̃_1⟩  ···  ⟨x̃_n, x̃_n⟩ ⎦

The GAF has several advantages. First, it provides a way to preserve the temporal dependency, since time increases as the position moves from top-left to bottom-right.

Figure 2 (panels: "Typical Time Series", "Markov Transition Matrix", "Markov Transition Field"). The Markov transition matrix over quantile bins A–D shown in the figure is:

         A      B      C      D
    A  0.917  0.083  0      0
    B  0.083  0.583  0.334  0
    C  0      0.260  0.522  0.218
    D  0      0.083  0.167  0.75
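The GAF construction in eqs. (1)–(4) is compact enough to sketch directly. The snippet below is an illustrative NumPy reimplementation, not the authors' code; the `np.clip` guard against floating-point values slightly outside [−1, 1] is our own addition:

```python
import numpy as np

def gaf(x):
    """Gramian Angular Field of a 1-D series, following eqs. (1)-(4)."""
    x = np.asarray(x, dtype=float)
    # eq. (1): rescale so all values fall in [-1, 1]
    x_tilde = ((x - x.max()) + (x - x.min())) / (x.max() - x.min())
    # eq. (2): encode each value as an angle via arccos
    # (clip guards against tiny floating-point overshoot outside [-1, 1])
    phi = np.arccos(np.clip(x_tilde, -1.0, 1.0))
    # eqs. (3)-(4): G[i, j] = cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

x = np.sin(np.linspace(0, 2 * np.pi, 128))
G = gaf(x)
print(G.shape)  # (128, 128)
```

Expanding cos(φ_i + φ_j) recovers eq. (4) term by term: G is exactly the Gramian of the inner product ⟨x, y⟩ = x·y − √(1 − x²)·√(1 − y²).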
Above, W ∈ R^{p×q} and V ∈ R^{p×p}, where p is the number of hidden units in a layer and q is the size of the input. V is a logical matrix (V_ij = 1 or 0) that encodes the topographic structure of the hidden units by a contiguous 3 × 3 block. The orthogonality constraint WW^T = I provides diversity among the learned features.

Neither GAF nor MTF images are natural images; they have no natural concepts such as "edges" and "angles". Thus, we propose to exploit the benefits of unsupervised pretraining with TICA to learn many diverse features with local orthogonality. In addition, Ngiam et al. empirically demonstrate that tiled CNNs perform well with limited labeled data, because the partial weight tying requires fewer parameters and reduces the need for a large amount of labeled data. Because our data from the UCR Time Series Repository (Keogh et al. 2011) tends to have few instances (e.g., the "yoga" dataset has 300 labeled instances in the training set and 3,000 unlabeled instances in the test set), tiled CNNs are well suited to our learning task.

Typically, tiled CNNs are trained with two hyperparameters, the tiling size k and the number of feature maps l. In our experiments, we directly fixed the network structure without tuning these hyperparameters, for several reasons. First, our goal is to explore the expressive power of the high-level features learned from GAF and MTF images. We have already achieved competitive results with the default deep network structures that Ngiam et al. used for image classification on the NORB image classification benchmark. Although tuning the parameters would surely enhance performance, doing so may cloud our understanding of the power of the representation. Another consideration is computational efficiency. All of the experiments on the 12 "hard" datasets could be done in one day on a laptop with an Intel i7-3630QM CPU and 8GB of memory (our experimental platform). Thus, the results in this paper are a preliminary lower bound on the potential best performance. Thoroughly exploring the deep network structures and parameters will be addressed in future work. The structure and parameters of the tiled CNN used in this paper are illustrated in Figure 3.

Figure 3: Structure of the tiled convolutional neural network (pipeline: Convolutional I → TICA Pooling I → Convolutional II → TICA Pooling II → Linear SVM). We fix the size of the receptive field to 8 × 8 in the first convolutional layer and 3 × 3 in the second convolutional layer. Each TICA pooling layer pools over a block of 3 × 3 input units in the previous layer, without wrapping around the borders, to optimize for sparsity of the pooling units. The number of pooling units in each map is exactly the same as the number of input units. The last layer is a linear SVM for classification. We construct this network by stacking two Tiled CNNs, each with 6 maps (l = 6) and tiling size k = 2.

Table 1: Training and test error rates of Tiled CNNs on the GAF and MTF images

    DATASET      GAF-TRAIN  GAF-TEST  MTF-TRAIN  MTF-TEST
    beef         0.633      0.4       0.533      0.233
    coffee       0          0         0          0
    ECG200       0.16       0.11      0.15       0.21
    faceall      0.121      0.244     0.102      0.259
    lighting2    0.2        0.18      0.167      0.361
    lighting7    0.329      0.397     0.386      0.411
    oliveoil     0.2        0.2       0.033      0.3
    OSULeaf      0.415      0.463     0.43       0.483
    SwedishLeaf  0.134      0.104     0.206      0.176
    yoga         0.183      0.177     0.193      0.243

Classifying Time Series Using GAF/MTF

We apply Tiled CNNs to classify GAF and MTF representations on twelve tough datasets, on which the classification error rate is above 0.1 with the state-of-the-art SAX-BoP approach (Lin, Khade, and Li 2012; Oates et al. 2012). More detailed statistics are summarized in Table 2. The datasets are pre-split into training and testing sets for experimental comparisons. For each dataset, the table gives its name, the number of classes, the number of training and test instances, and the length of the individual time series.

Experimental Setting

In our experiments, the size of the GAF image is regulated by the number of PAA bins S_GAF. Given a time series X of size n, we divide the time series into S_GAF adjacent, non-overlapping windows along the time axis and extract the mean of each bin. This enables us to construct the smaller GAF matrix G_{S_GAF × S_GAF}. MTF requires the time series to be discretized into Q quantile bins to calculate the Q × Q Markov transition matrix, from which we construct the raw MTF image M_{n×n} afterwards. Before classification, we shrink the MTF image size to S_MTF × S_MTF with the blurring kernel {1/m²}_{m×m}, where m = ⌈n / S_MTF⌉. The Tiled CNN is trained with image sizes {S_GAF, S_MTF} ∈ {16, 24, 32, 40, 48} and quantile size Q ∈ {8, 16, 32, 64}. At the last layer of the Tiled CNN, we use a linear soft margin SVM (Fan et al. 2008) and select C by 5-fold cross-validation over {10⁻⁴, 10⁻³, ..., 10⁴} on the training set.

For each input of image size S_GAF or S_MTF and quantile size Q, we pretrain the Tiled CNN with the full unlabeled dataset (both training and test set) to learn the initial weights W through TICA. Then we train the SVM at the last layer by selecting the penalty factor C with cross-validation.
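The MTF pipeline just described can be sketched in the same style: discretize the series into Q quantile bins, estimate the Q × Q Markov transition matrix from successive points, spread it into the raw n × n field M, and shrink with the averaging kernel {1/m²}_{m×m}, m = ⌈n/S_MTF⌉. This is an illustrative sketch rather than the authors' code, and the edge-padding for lengths not divisible by the target size is our own assumption, since the text does not specify how remainders are handled:

```python
import numpy as np

def mtf(x, Q=8, size=16):
    """Markov Transition Field with average-blur downsampling (a sketch)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # assign each point to one of Q quantile bins
    edges = np.quantile(x, np.linspace(0, 1, Q + 1)[1:-1])
    q = np.searchsorted(edges, x)            # bin index in {0, ..., Q-1}
    # Q x Q Markov transition matrix estimated from successive points
    W = np.zeros((Q, Q))
    for a, b in zip(q[:-1], q[1:]):
        W[a, b] += 1
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1)  # row-normalize
    # raw n x n MTF: M[i, j] is the transition probability q_i -> q_j
    M = W[q[:, None], q[None, :]]
    # shrink to size x size with the averaging kernel {1/m^2}_{m x m}
    m = int(np.ceil(n / size))
    pad = m * size - n                       # assumption: pad by edge values
    Mp = np.pad(M, ((0, pad), (0, pad)), mode="edge")
    return Mp.reshape(size, m, size, m).mean(axis=(1, 3))

x = np.sin(np.linspace(0, 4 * np.pi, 96))
img = mtf(x, Q=8, size=16)
print(img.shape)  # (16, 16)
```

For GAF, by contrast, the size reduction happens before encoding: the series is PAA-averaged down to S_GAF points and then mapped through eqs. (1)–(4).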
Table 2: Summary statistics of the standard datasets and comparative results

DATASET  CLASS  TRAIN  TEST  LENGTH  1NN-EUCLIDEAN  1NN-DTW  FAST-SHAPELET  BOP  SAX-VSM  GAF-MTF
50words 50 450 455 270 0.369 0.242 0.4429 0.466 N/A 0.284
Adiac 37 390 391 176 0.389 0.391 0.514 0.432 0.381 0.307
Beef 5 30 30 470 0.467 0.467 0.447 0.433 0.033 0.3
Coffee 2 28 28 286 0.25 0.18 0.067 0.036 0 0
ECG200 2 100 100 96 0.12 0.23 0.227 0.14 0.14 0.08
FaceAll 14 560 1,690 131 0.286 0.192 0.402 0.219 0.207 0.223
Lightning2 2 60 61 637 0.246 0.131 0.295 0.164 0.196 0.18
Lightning7 7 70 73 319 0.425 0.274 0.403 0.466 0.301 0.397
OliveOil 4 30 30 570 0.133 0.133 0.213 0.133 0.1 0.167
OSULeaf 6 200 242 427 0.483 0.409 0.359 0.236 0.107 0.446
SwedishLeaf 15 500 625 128 0.213 0.21 0.27 0.198 0.251 0.093
Yoga 2 300 3,000 426 0.17 0.164 0.249 0.17 0.164 0.16
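The model selection described in the Experimental Setting reduces to a small grid search over S, Q, and the SVM penalty C. The schematic below uses a dummy `cv_error` as a hypothetical stand-in for pretraining the Tiled CNN and scoring the linear SVM by 5-fold cross-validation; the tie-break toward larger S and Q mirrors the preference stated in the text:

```python
import itertools

# hypothetical stand-in for pretraining the Tiled CNN and scoring an SVM
# with penalty C by 5-fold cross-validation on the training set
def cv_error(S, Q, C):
    return abs(S - 32) / 100 + abs(Q - 16) / 100 + abs(C - 1) / 10  # dummy

sizes = [16, 24, 32, 40, 48]                   # candidate image sizes S
quantiles = [8, 16, 32, 64]                    # candidate quantile counts Q
penalties = [10.0 ** k for k in range(-4, 5)]  # C in {1e-4, ..., 1e4}

best = min(
    ((cv_error(S, Q, C), S, Q, C)
     for S, Q, C in itertools.product(sizes, quantiles, penalties)),
    key=lambda t: (t[0], -t[1], -t[2]),  # break ties toward larger S and Q
)
print(best[1:])  # chosen (S, Q, C)
```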
Finally, we classify the test set using the optimal hyperparameters {S, Q, C} with the lowest error rate on the training set. If two or more models tie, we prefer the larger S and Q, because a larger S helps preserve more information through the PAA procedure and a larger Q encodes the dynamic transition statistics in more detail. Our model selection approach provides generalization without being overly expensive computationally.

Results and Discussion

We use Tiled CNNs to classify the GAF and MTF representations separately on the 12 datasets. The training and test error rates are shown in Table 1. Generally, our approach is not prone to overfitting, as seen by the relatively small difference between training and test set errors. One exception is the Olive Oil dataset with the MTF approach, where the test error is significantly higher.

In addition to the risk of potential overfitting, MTF generally has higher error rates than GAF. This is most likely because of uncertainty in the inverse image of MTF. Note that the encoding functions from time series to GAF and MTF are both surjective: with fixed S and Q, each map produces only one image for a given time series X, but the inverse image of each mapping function is not fixed. As shown in a later section, we can approximately reconstruct the raw time series from GAF, but it is very hard to even roughly recover the signal from MTF. GAF has smaller uncertainty in the inverse image of its mapping function because such randomness comes only from the ambiguity of cos(φ) when φ ∈ [0, 2π]. MTF, on the other hand, has a much larger inverse image space, which results in large variation when we try to recover the signal. Although MTF encodes the transition dynamics, which are important features of time series, such features seem not to be sufficient for recognition/classification tasks.

Note that each pixel G_ij denotes the superposition of the directions at t_i and t_j, while M_ij is the transition probability from the quantile at t_i to the quantile at t_j. GAF encodes static information while MTF depicts information about dynamics. From this point of view, we consider them as two "orthogonal" channels, like different colors in the RGB image space. Thus, we can combine GAF and MTF images of the same size (i.e., S_GAF = S_MTF) to construct a double-channel image (GAF-MTF). Since GAF-MTF combines both the static and dynamic statistics embedded in the raw time series, we posit that it will be able to enhance classification performance. In the next experiment, we pretrain and train the Tiled CNN on the compound GAF-MTF images. Then, we report the classification error rate on the test sets.

Table 2 compares the classification error rate of our approach with previously published performance results of five competing methods: two state-of-the-art 1NN classifiers based on Euclidean distance and DTW, the recently proposed Fast-Shapelets based classifier (Rakthanmanon and Keogh 2013), the classifier based on Bag-of-Patterns (BoP) (Lin, Khade, and Li 2012; Oates et al. 2012), and the most recent SAX-VSM approach (Senin and Malinchik 2013). Our approach outperforms 1NN-Euclidean, Fast-Shapelets, and BoP, and is competitive with 1NN-DTW and SAX-VSM. In addition, by comparing the results in Table 2 with those in Table 1, we verified our assumption that the combined GAF-MTF images have better expressive power than GAF or MTF alone for classification. GAF-MTF achieves the lower test error rate on ten of the twelve datasets (the exceptions being Adiac and Beef). On the Olive Oil dataset, the training error rate is 6.67% and the test error rate is 16.67%. This demonstrates that integrating both types of images into one compound image decreases the risk of overfitting as well as enhancing the overall classification accuracy.

Analysis of Features and Weights Learned through Tiled CNNs

In contrast to the cases in which CNNs are applied to natural image recognition tasks, neither GAF nor MTF has natural interpretations of visual concepts like "edges" or "angles". In this section we analyze the features and weights learned through Tiled CNNs to explain why our approach works.

As mentioned earlier, the mapping function from time series to GAF is surjective and the uncertainty in its inverse image comes from the ambiguity of cos(φ) when φ ∈ [0, 2π]. The main diagonal of GAF, i.e. {G_ii} = {cos(2φ_i)}, allows us to approximately reconstruct the original time series, ignoring the signs, by

    cos(φ) = √((cos(2φ) + 1) / 2)    (8)

MTF has much larger uncertainty in its inverse image, making it hard to reconstruct the raw data from MTF alone. However, the diagonal {M_ij : |i − j| = k} represents the transition probability among the quantiles in temporal order, considering the time interval k. We construct the self-transition probability along the time axis from the main diagonal of MTF, as we do for GAF. Although such reconstructions capture the morphology of the raw time series less accurately, they provide another perspective on how Tiled CNNs capture the transition dynamics embedded in MTF.

Figure 4 illustrates the reconstruction results from the six feature maps learned before the last SVM layer on GAF and MTF. The Tiled CNN extracts the color patch, which is essentially a moving average that enhances several receptive fields within the nonlinear units by different trained weights. It is not a simple moving average but a synthetic integration that considers the 2D temporal dependencies among different time intervals, a benefit from the Gramian matrix structure that helps preserve the temporal information. By observing the rough orthogonal reconstructions from each layer of the feature maps, we can clearly observe that the Tiled CNN extracts multi-frequency dependencies through its convolution and pooling architecture on the GAF and MTF images, preserving the trend while addressing more details in different subphases. As shown in Figures 4(b) and 4(d), the high-level feature maps learned by the Tiled CNN are equivalent to a multi-frequency approximator of the original curve.

Figure 4: (a) Original GAF and its six learned feature maps before the SVM layer in the Tiled CNN (top left); (b) raw time series and approximate reconstructions based on the main diagonals of the six feature maps (top right), on the '50Words' dataset; (c) original MTF and its six learned feature maps before the SVM layer in the Tiled CNN (bottom left); (d) curve of the self-transition probability along the time axis (main diagonal of MTF) and approximate reconstructions based on the main diagonals of the six feature maps (bottom right), on the 'SwedishLeaf' dataset.

Figure 5 demonstrates the learned sparse weight matrix W with the constraint WW^T = I, which makes effective use of local orthogonality. The TICA pretraining provides the built-in advantage that the function over the parameter space is not likely to be ill-conditioned, as WW^T = I. As shown in Figure 5 (right), the weight matrix W is quasi-orthogonal and its entries approach 0 without very large magnitudes. This implies that the condition number of W approaches 1, which helps the system be well-conditioned.

Figure 5: Learned sparse weights W for the last SVM layer in the Tiled CNN (left) and its orthogonality constraint WW^T = I (right).

Conclusions and Future Work

We created a pipeline for converting time series data into novel representations, GAF and MTF images, and extracted high-level features from them using Tiled CNNs. The features were subsequently used for classification. We demonstrated that our approach yields competitive results compared to state-of-the-art methods while searching a relatively small parameter space. We found that GAF-MTF multi-channel images are scalable to larger numbers of quasi-orthogonal features that yield more comprehensive images. Our analysis of the high-level features learned from Tiled CNNs suggested that the Tiled CNN works like a multi-frequency moving average that benefits from the 2D temporal dependency preserved by the Gramian matrix.

Important future work will involve applying our method to massive amounts of data and searching a more complete parameter space to solve real-world problems. We are also quite interested in how different deep learning architectures perform on GAF and MTF images. Another interesting direction is to model time series through GAF and MTF images; we aim to apply learned time series models in regression/imputation and anomaly detection tasks. To extend our methods to streaming data, we plan to design an online learning approach with recurrent network structures.

References

Abdel-Hamid, O.; Mohamed, A.-r.; Jiang, H.; and Penn, G. 2012. Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, 4277–4280. IEEE.

Abdel-Hamid, O.; Deng, L.; and Yu, D. 2013. Exploring convolutional neural network structures and optimization techniques for speech recognition. In INTERSPEECH, 3366–3370.
Campanharo, A. S.; Sirer, M. I.; Malmgren, R. D.; Ramos, F. M.; and Amaral, L. A. N. 2011. Duality between time series and networks. PloS ONE 6(8):e23378.

Deng, L.; Abdel-Hamid, O.; and Yu, D. 2013. A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 6669–6673. IEEE.

Deng, L.; Li, J.; Huang, J.-T.; Yao, K.; Yu, D.; Seide, F.; Seltzer, M.; Zweig, G.; He, X.; Williams, J.; et al. 2013. Recent advances in deep learning for speech research at Microsoft. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 8604–8608. IEEE.

Deng, L.; Hinton, G.; and Kingsbury, B. 2013. New types of deep neural network learning for speech recognition and related applications: An overview. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 8599–8603. IEEE.

Erhan, D.; Bengio, Y.; Courville, A.; Manzagol, P.-A.; Vincent, P.; and Bengio, S. 2010. Why does unsupervised pre-training help deep learning? The Journal of Machine Learning Research 11:625–660.

Fan, R.-E.; Chang, K.-W.; Hsieh, C.-J.; Wang, X.-R.; and Lin, C.-J. 2008. LIBLINEAR: A library for large linear classification. The Journal of Machine Learning Research 9:1871–1874.

Hermansky, H. 1990. Perceptual linear predictive (PLP) analysis of speech. The Journal of the Acoustical Society of America 87(4):1738–1752.

Hinton, G.; Deng, L.; Yu, D.; Dahl, G. E.; Mohamed, A.-r.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T. N.; et al. 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. Signal Processing Magazine, IEEE 29(6):82–97.

Hinton, G.; Osindero, S.; and Teh, Y.-W. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18(7):1527–1554.

Hubel, D. H., and Wiesel, T. N. 1962. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology 160(1):106.

Kavukcuoglu, K.; Sermanet, P.; Boureau, Y.-L.; Gregor, K.; Mathieu, M.; and Cun, Y. L. 2010. Learning convolutional feature hierarchies for visual recognition. In Advances in Neural Information Processing Systems, 1090–1098.

Keogh, E. J., and Pazzani, M. J. 2000. Scaling up dynamic time warping for datamining applications. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 285–289. ACM.

Keogh, E.; Xi, X.; Wei, L.; and Ratanamahatana, C. A. 2011. The UCR time series classification/clustering homepage. URL: http://www.cs.ucr.edu/~eamonn/time_series_data.

Krizhevsky, A.; Sutskever, I.; and Hinton, G. E. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 1097–1105.

Lawrence, S.; Giles, C. L.; Tsoi, A. C.; and Back, A. D. 1997. Face recognition: A convolutional neural-network approach. Neural Networks, IEEE Transactions on 8(1):98–113.

LeCun, Y., and Bengio, Y. 1995. Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks 3361.

LeCun, Y.; Bottou, L.; Bengio, Y.; and Haffner, P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11):2278–2324.

LeCun, Y.; Kavukcuoglu, K.; and Farabet, C. 2010. Convolutional networks and applications in vision. In Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on, 253–256. IEEE.

Leggetter, C. J., and Woodland, P. C. 1995. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech & Language 9(2):171–185.

Lin, J.; Khade, R.; and Li, Y. 2012. Rotation-invariant similarity in time series using bag-of-patterns representation. Journal of Intelligent Information Systems 39(2):287–315.

Mohamed, A.-r.; Dahl, G. E.; and Hinton, G. 2012. Acoustic modeling using deep belief networks. Audio, Speech, and Language Processing, IEEE Transactions on 20(1):14–22.

Ngiam, J.; Chen, Z.; Chia, D.; Koh, P. W.; Le, Q. V.; and Ng, A. Y. 2010. Tiled convolutional neural networks. In Advances in Neural Information Processing Systems, 1279–1287.

Oates, T.; Mackenzie, C. F.; Stein, D. M.; Stansbury, L. G.; Dubose, J.; Aarabi, B.; and Hu, P. F. 2012. Exploiting representational diversity for time series classification. In Machine Learning and Applications (ICMLA), 2012 11th International Conference on, volume 2, 538–544. IEEE.

Rakthanmanon, T., and Keogh, E. 2013. Fast shapelets: A scalable algorithm for discovering time series shapelets. In Proceedings of the Thirteenth SIAM Conference on Data Mining (SDM). SIAM.

Reynolds, D. A., and Rose, R. C. 1995. Robust text-independent speaker identification using Gaussian mixture speaker models. Speech and Audio Processing, IEEE Transactions on 3(1):72–83.

Senin, P., and Malinchik, S. 2013. SAX-VSM: Interpretable time series classification using SAX and vector space model. In Data Mining (ICDM), 2013 IEEE 13th International Conference on, 1175–1180. IEEE.

Zheng, Y.; Liu, Q.; Chen, E.; Ge, Y.; and Zhao, J. L. 2014. Time series classification using multi-channels deep convolutional neural networks. In Web-Age Information Management. Springer. 298–310.