
Pattern Recognition 81 (2018) 268–279

Contents lists available at ScienceDirect

Pattern Recognition
journal homepage: www.elsevier.com/locate/patcog

Time Alignment Measurement for Time Series


Duarte Folgado a,∗, Marília Barandas a, Ricardo Matias b,c, Rodrigo Martins b, Miguel Carvalho d, Hugo Gamboa e

a Associação Fraunhofer Portugal Research, Rua Alfredo Allen 455/461, Porto, Portugal
b Physiotherapy Department, School of Health, Polytechnic Institute of Setúbal, Estefanilha, Edifício da ESCE, 2914-503, Setúbal, Portugal
c Champalimaud Research, Champalimaud Centre for the Unknown, Lisbon, Portugal
d Minho University, Campus de Azurém, 4800-058, Guimarães, Portugal
e Laboratório de Instrumentação, Engenharia Biomédica e Física da Radiação (LIBPhys-UNL), Departamento de Física, Faculdade de Ciências e Tecnologia, FCT, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal

Article info

Article history:
Received 26 July 2017
Revised 3 January 2018
Accepted 2 April 2018
Available online 3 April 2018

Keywords: Time series, Time warping, Similarity, Distance, Signal alignment

Abstract

When a comparison between time series is required, measurement functions provide meaningful scores to characterize similarity between sequences. Quite often, time series appear warped in time, i.e., although they may exhibit amplitude and shape similarity, they appear dephased in time. The most common algorithm to overcome this challenge is Dynamic Time Warping, which aligns each sequence prior to establishing distance measurements. However, Dynamic Time Warping only takes amplitude similarity into account. A distance which characterizes the degree of time warping between two sequences can deliver new insights for applications where the timing factor is essential, such as well-defined movements during sports or rehabilitation exercises. We propose a novel measurement called Time Alignment Measurement, which delivers similarity information in the temporal domain. We demonstrate the potential of our approach in measuring the performance of time series alignment methodologies and in the characterization of synthetic and real time series data acquired during human movement.

© 2018 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license.
(http://creativecommons.org/licenses/by-nc-nd/4.0/)

∗ Corresponding author.
E-mail address: [email protected] (D. Folgado).
https://doi.org/10.1016/j.patcog.2018.04.003
0031-3203/© 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license. (http://creativecommons.org/licenses/by-nc-nd/4.0/)

1. Introduction

The comparison of time series arises in sequence matching, subsequence searching, and motif detection. Those challenges are intrinsically related to time series classification applied in several contexts such as pattern recognition [1–4], signal processing [5], shape detection [6], bioinformatics [7,8], human activity recognition [9] and on-line handwritten signature validation [10].

When two streams of data with implicit or explicit time information are compared, there is the need for a measurement function that provides information on the similarity of the two data streams. Time series comparison may be established using a wide range of available distance measurement functions. Some of the traditional metrics, such as the Euclidean distance or some modification thereof, assume that the discrete signals are equidistant points in time and also aligned on the time axis. In some domains, although time series may present amplitude and shape similarity, they can be considered to be out-of-phase. Therefore, similar regions may appear at different instants in time, leading to different degrees of time distortion, or time warping, among several sequences, since they are not aligned in the temporal domain. In those circumstances, traditional distances fail to measure this distortion since they are very sensitive to small distortions in time and typically unable to directly handle unequal length time series without some sort of preprocessing [11].

In order to overcome these limitations, elastic distances which contemplate temporal elastic shifting have been proposed. Dynamic Time Warping (DTW) and Longest Common Subsequence (LCSS) compensate for non-linear temporal distortions by aligning the discrete sequences before establishing amplitude measurements in the discrete domain [12]. Since those algorithms do not take into account the information between sampling points, [13] proposed the Continuous Dynamic Time Warping (CDTW), which extends the classic methodology by allowing mapping between instants that may not belong to the original time vector of each series. The work from [14] uses an optimization approach to calculate a parametric polynomial warping path reflecting the alignment between both series. Therefore, the last two alternatives
produce an optimal warping path which translates the alignment between two signals in the continuous domain.

Motivated by the fact that off-the-shelf applications of semi-supervised learning algorithms do not typically work well when applied to time series, the authors from [15] proposed a new distance which tries to minimize this behaviour. The proposed distance is called Dynamic Time Warping Delta (DTW-D) and is the ratio between the DTW and Euclidean distances.

Inspired by the well-known edit distance for string comparison, which calculates the minimum number of insertion, deletion, and substitution operations needed to transform one string into another, some authors translated the core idea to time series [16–18]. In order to generalize the concept from strings to time series, two elements of each sequence are matched if the absolute difference between them is below a given tolerance value. The common goal of these approaches consists in identifying the smallest number of operations (additions, deletions and substitutions) needed to transform one sequence into another.

Prior to establishing a similarity measurement between time series, most of the aforementioned examples perform an alignment between the two sequences. The optimal alignment may also be used for summarizing a set of time series, since it allows a more meaningful average to be computed between sequences which may exhibit time warping. The work developed by [19], and more recently by [20], proposes time series averaging methods based on preceding alignments, which demonstrated favourable impacts on clustering performance.

However, whilst a multitude of novel elastic distances have been proposed over the last years, they are mostly centered on measuring similarity accounting for amplitude differences [21,22]. Those facts motivated our work in the development of a novel time distance able to measure similarity between time series in the temporal domain, namely Time Alignment Measurement (TAM). The proposed methodology is able to describe the behaviour in time between two signals by measuring the fraction of time distortion between them. The distortion may comprise periods of temporal advance or periods of delay. When signals are similar in time they can be considered to be in phase with each other. This approach can deliver useful information to domains where information on the temporal misalignment of time series is needed. Examples of such domains include well-defined human movements executed in sports or rehabilitation exercises. The authors from [23] investigated the feasibility of biofeedback training applied to therapeutic exercises, where repetitive movements should follow well-defined timings to be considered successfully executed. The authors calculated the mean error of the distance between anatomic segments executed by the subject and a previously recorded reference. A distance able to truly characterize temporal misalignment between movements should bring new perspectives for the evaluation of the correctness of the exercises through the complete movement execution.

The literature review showed that most of the work developed over the last years on new distance functions mostly takes into account amplitude similarity. The major contribution of this work is to propose a novel distance which measures similarity in the time domain.

The remaining content of this paper is organized as follows: in section 2, a brief overview of the DTW algorithm is presented, since we use DTW to align two time series prior to calculating TAM. Section 3 introduces the TAM distance and presents examples based on synthetic time series to support its potential. In section 4 we present two use cases for the proposed distance based on real time series data. Finally, section 5 contains the conclusions and future work directions.

2. Time series alignment

In this section, we motivate the utility of the DTW algorithm to establish an alignment between two time series in order to calculate TAM. We start with a brief explanation of the DTW algorithm and explore some of the challenges arising while aligning signals that present amplitude fluctuation.

2.1. Dynamic Time Warping

The DTW algorithm allows two time-dependent sequences that are similar, but locally out of phase, to align in time. Its main objective consists of identifying an optimal alignment between sequences by warping the time axis iteratively.

In order to align two time series $X := (x_1, x_2, \ldots, x_N)$ and $Y := (y_1, y_2, \ldots, y_M)$ of length $N$ and $M$ respectively, an $N$-by-$M$ cost matrix is computed. Each $(n, m)$ element of the cost matrix, $C \in \mathbb{R}^{N \times M}$, corresponds to the distance between each pair of elements of the sequences $X$ and $Y$. The Euclidean distance is usually employed as a distance function to define the cost matrix element as:

$$ c(x_n, y_m) = (x_n - y_m)^2 \tag{1} $$

The goal of DTW is to find the optimal warping alignment path between $X$ and $Y$ having minimum overall cost. A warping path, $W$, is a set of matrix elements that define the relationship between $X$ and $Y$. The $k$-th element of $W$ is defined as $w_k = (i, j)_k$, $w_k \in \mathbb{R}^2$:

$$ W = (w_1, w_2, \ldots, w_k, \ldots, w_K), \quad \max(N, M) \le K \le N + M - 1 \tag{2} $$

The resulting path should be composed of a set of matrix elements satisfying the following conditions:

• Boundary condition: Enforces that the first and the last elements of $X$ and $Y$ are aligned to each other ∴ $w_1 = (1, 1)$ and $w_K = (N, M)$.
• Monotonicity condition: Forces the points in the warping path to be monotonically spaced in time ∴ $i_1 \le i_2 \le \ldots \le i_K$ and $j_1 \le j_2 \le \ldots \le j_K$.
• Step size condition: Avoids omissions in elements and replications in the alignment of $X$ and $Y$ ∴ $(w_{k+1} - w_k) \in \{(1, 0), (0, 1), (1, 1)\}$ for $k \in [1 : K - 1]$.

The optimal warping path is the path that has the minimum total cost among all possible warping paths. One could test every candidate warping path and determine the minimum cost candidate, but such a method leads to a computational complexity that is exponential in the lengths $N$ and $M$. Using dynamic programming, an accumulated cost matrix, $D$, is computed in order to find the path that minimizes the warping cost with $O(NM)$ complexity [12]. Each accumulated cost matrix element is defined as the local cost measure in the current cell plus the minimum of the accumulated costs in the adjacent cells:

$$ D(n, m) = \min\{D(n-1, m-1),\ D(n-1, m),\ D(n, m-1)\} + c(x_n, y_m) \tag{3} $$

where $n \in [1 : N]$, $m \in [1 : M]$, $D$ is the accumulated cost matrix, and $c(x_n, y_m)$ is the local cost measure found in the current cell.

Using this accumulated matrix, the optimal warping path, $W^* = (w_1, w_2, \ldots, w_K)$, is computed in reverse order of indices, starting with $w_K = (N, M)$, by the following algorithm:

$$ w_{k-1} = \begin{cases} (1, m-1), & \text{if } n = 1 \\ (n-1, 1), & \text{if } m = 1 \\ \operatorname{argmin}\{D(n-1, m-1),\ D(n-1, m),\ D(n, m-1)\}, & \text{otherwise} \end{cases} \tag{4} $$
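
The recursion of Eq. (3) and the backtracking of Eq. (4) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `dtw`, the use of NumPy, the 0-based indices and the tie-breaking rule (the diagonal step is preferred when costs are equal) are all assumptions made for the example.

```python
import numpy as np


def dtw(x, y):
    """Return the accumulated cost matrix D and a minimum-cost warping path (0-based)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_len, m_len = len(x), len(y)

    # Local cost matrix C, Eq. (1): squared difference of every pair (x_n, y_m).
    cost = (x[:, None] - y[None, :]) ** 2

    # Accumulated cost matrix D, Eq. (3), filled row by row.
    acc = np.full((n_len, m_len), np.inf)
    acc[0, 0] = cost[0, 0]
    for n in range(n_len):
        for m in range(m_len):
            if n == 0 and m == 0:
                continue
            best = min(
                acc[n - 1, m - 1] if n > 0 and m > 0 else np.inf,
                acc[n - 1, m] if n > 0 else np.inf,
                acc[n, m - 1] if m > 0 else np.inf,
            )
            acc[n, m] = cost[n, m] + best

    # Backtracking, Eq. (4): start at the last cell and walk back to the first.
    n, m = n_len - 1, m_len - 1
    path = [(n, m)]
    while (n, m) != (0, 0):
        if n == 0:
            m -= 1
        elif m == 0:
            n -= 1
        else:
            candidates = [(n - 1, m - 1), (n - 1, m), (n, m - 1)]
            # Assumption: on ties the first candidate (the diagonal) wins.
            n, m = min(candidates, key=lambda cell: acc[cell])
        path.append((n, m))
    path.reverse()
    return acc, path


acc, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(path)         # warping path as 0-based (n, m) index pairs
print(acc[-1, -1])  # accumulated cost of the optimal path
```

For this toy input the path is `[(0, 0), (1, 1), (1, 2), (2, 3)]` with an accumulated cost of 0: the repeated value in the second sequence is absorbed by a single vertical step.
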

In order to compensate for the effect of different optimal warping path lengths, the path-normalized distance is given as:

$$ DTW(X, Y) = \frac{1}{K} \sum_{k=1}^{K} w^{*}_{k} \tag{5} $$

Fig. 1 illustrates a typical example of the DTW algorithm. The reference time vector of the bottom signal was artificially modified to result in the time warped signal represented vertically. The resulting optimal path follows the minimum cost regions on the accumulated cost matrix and establishes a pairwise relationship between each point of the discrete series.

Fig. 1. Accumulated cost matrix between two time series using the Euclidean distance as local cost measure. The resulting optimal warping path follows the low cost regions (represented in white) and avoids high cost regions (represented in dark).

The DTW distance yields more intuitive information on signal amplitude by performing an alignment before calculating the distance between two signals. Additionally, the resulting optimal alignment path settles a pairwise relationship between each point of the discrete signals. This discrete temporal matching may be used for signal alignment. In our work, we further explored this feature by creating a distance to measure time distortions between two signals.

2.2. Signal alignment challenges

Although DTW has been successfully used for many years, it still encounters some pairwise alignment challenges. In [24] the authors reported unintuitive alignments when the algorithm tries to express amplitude variability in the Y-axis by improperly warping the X-axis. This behaviour leads to situations defined as "singularities", where a single point of a particular signal maps to a large subsection of another time series. In order to overcome the singularities challenge they presented the Derivative Dynamic Time Warping (DDTW) approach, which uses the square of the differences between the estimated signal derivatives, as shown in Fig. 2. Although this methodology reduces the number of singularities without completely solving the problem, it has been successfully used in many fields, including human activity recognition using accelerometer signals [26] and biosignal segmentation [27]. However, since DDTW uses the first signal derivative it is also quite susceptible to noise and requires signal smoothing before applying the algorithm.

In [28] the authors introduced a penalty-based DTW version called Weighted Dynamic Time Warping (WDTW). In this approach, the cost matrix is modified in order to incorporate a modified logistic weight function that assigns additional weight as a function of the phase difference between the reference and test points. Thus, time instants with higher phase difference will be more penalized than instants near the reference.

More recently, [29] presented an approach that solely accounts for the shape of the time series. The similarity measure is performed by comparing the spatial distribution of the data around each point. This modification tends to reduce singularities and promotes feature alignment that may include peaks and valleys.

Despite the attempts to improve the DTW alignment, they are still dependent on the nature of the data. For instance, time series that do not carry a high degree of information in the first derivative are unlikely to benefit from an alignment solely based on the data's shape. Therefore, an alignment which contemplates a weighting between the amplitude and derivative domains can constitute an added value towards a more versatile application.

2.3. Sliding Window Dynamic Time Warping

In order to overcome the incorrect alignment generated by DTW and its variants, we outlined an alternative alignment that should prevent singularities by reflecting a feature-to-feature similarity. We defined features as notable events in the course of time series, which include local minima, maxima or valleys. Therefore, features should always be aligned with the corresponding features in the other signal. Furthermore, a feature should always correspond to a single point, since by definition they are unique in time. The proposed approach is called Sliding Window Dynamic Time Warping (SW-DTW).

The cost measure used by DTW is an element-to-element distance. In order to use a "contextual" distance, which takes into account the neighbourhood surrounding each point, we modified the cost function to take into account the distance between well-defined windows of the signals. The modified local cost measure can be defined as:

$$ c(x_n, y_m) = w(\delta) \times W(\delta) $$

$$ W(\delta) = \alpha \left( X_{[n - \frac{\delta}{2} : n + \frac{\delta}{2}]} - Y_{[m - \frac{\delta}{2} : m + \frac{\delta}{2}]} \right)^2 + (1 - \alpha) \left( \nabla[X]_{[n - \frac{\delta}{2} : n + \frac{\delta}{2}]} - \nabla[Y]_{[m - \frac{\delta}{2} : m + \frac{\delta}{2}]} \right)^2 \tag{6} $$

where $w(\delta)$ is a window function with width $\delta \in \mathbb{N}$, $\nabla$ is the operator for the first discrete derivative, $\alpha \in \mathbb{R}_0^+ \cap [0 : 1]$ is a constant that defines the weighting between the cost in amplitude and first order derivative, and $n$ and $m$ are the indexes of the values of $X$ and $Y$, respectively. Using a Hanning window function we can ensure that points closer to the center of the window contribute more to the local cost than points located near the window limits. Since the first discrete derivative of the signal is calculated, the last elements of $X$ and $Y$ are discarded to guarantee that the amplitude and derivative series have the same length. Additionally, signals are prepared by introducing reflected copies of each signal (with the window size) at both ends. This procedure aims to minimize boundary errors on the first and last elements of each signal. $\delta$ and $\alpha$ are free parameters and, consequently, must be tuned prior to applying the algorithm. Small windows will tend to give results similar to the point-to-point DTW distance, and excessively large windows will tend to produce improper feature alignment.

Fig. 2. Two time series from the Gun Point dataset [25] aligned with the DTW approach (center) and the DDTW modification (right). Although a slight improvement can be observed, there are still sections of consecutive singularities.

3. Time Alignment Measurement

In this section we start by presenting a further interpretation of the optimal warping path from the DTW algorithm. This analysis allows the TAM formulation to be presented easily. We finish this section with illustrative examples to consolidate the presentation and support the potential of our proposed distance.

3.1. DTW optimal alignment path properties

The optimal warping path establishes a pairwise relationship between the indexes of both series. This relationship allows the transformations on the time axis between them to be characterized. Whilst there are a multitude of step patterns proposed in the literature for the warping path calculation, we started by exploring the basic step pattern which contemplates vertical, horizontal and diagonal segments, discussed in section 2.1.

Let us consider that the eventual time warping is referenced to series $X$, which is plotted on the x-axis. The possible slopes which are contemplated in the warping are outlined in Fig. 3.

Fig. 3. An example of an artificial optimal warping path superimposed on an accumulated cost matrix containing all the available step directions. The horizontal, vertical and diagonal segments represent the advance, delay and phase intervals, respectively.

Horizontal segments.
A horizontal segment is defined as $w_{k+1} - w_k = (1, 0)$. In this case, an index of $Y$ is associated with one or more consecutive indexes of the reference $X$. This situation illustrates a temporal advance: while the same time instant is maintained on the $Y$ sequence, several time instants elapse in the reference series.

Vertical segments.
A vertical segment can be defined as $w_{k+1} - w_k = (0, 1)$. This situation arises when an index of $X$ is associated with one or more consecutive indexes of the series $Y$. A temporal delay is therefore present since sequence $Y$ is progressing in time and the reference maintains the same instant.

Diagonal segments.
A diagonal segment is defined as $w_{k+1} - w_k = (1, 1)$. In this circumstance, there is no time warping and the signals can be considered to be in phase with each other.

3.2. Outline

The idea behind the proposed distance is to measure the cost for a given time series to warp in time relative to the other. Using the optimal alignment path between two time series we can extract information in the time domain that characterizes the intervals when the series are in phase, advance or delay.

Let us consider again two sequences $X$ of length $N \in \mathbb{N}$ and $Y$ of length $M \in \mathbb{N}$. During the complete length of each sequence, the signals may be considered to exhibit in-phase and out-of-phase behaviours. In case the signals are out of phase, one sequence can be considered to be in advance in relation to the other. This characteristic is reciprocal: if $X$ is in advance in relation to $Y$, $Y$ can be considered to be delayed in relation to $X$.

If we assume that $Y$ is delayed in relation to $X$, the total time during which $Y$ is delayed in relation to $X$ is denoted as $\overleftarrow{\theta}_{xy}$ and the time during which it may be advanced is denoted as $\overrightarrow{\theta}_{xy}$. Using this nomenclature, we can write the relation between advance, delay, and length of both signals as:

$$ |\overrightarrow{\theta}_{xy} - \overleftarrow{\theta}_{xy}| = |M - N| \tag{7} $$

The total time when both signals are in phase is represented by $\bar{\theta}_{xy}$. During the complete length of signal $Y$, the fraction of advance ($\overrightarrow{\psi}$), delay ($\overleftarrow{\psi}$), and phase ($\bar{\psi}$) relative to $X$ can be calculated as:

$$ \overrightarrow{\psi} = \frac{\overrightarrow{\theta}_{xy}}{N}, \qquad \overleftarrow{\psi} = \frac{\overleftarrow{\theta}_{xy}}{M}, \qquad \bar{\psi} = \frac{\bar{\theta}_{xy}}{\min\{N, M\}} \tag{8} $$

Finally, the TAM distance can be formally defined as:

$$ \Gamma = \overrightarrow{\psi} + \overleftarrow{\psi} + (1 - \bar{\psi}), \qquad \Gamma \in \{\mathbb{R}_0^+ \mid \Gamma \in [0 : 3]\} \tag{9} $$

This distance penalizes signals where advance or delay is present and benefits series that are in phase with each other. As the distance increases, the dissimilarity between both signals also increases. Thus, in case the signals are constantly in phase, $\bar{\psi} = 1$, $\overrightarrow{\theta}_{xy} = 0$, $\overrightarrow{\psi} = 0$, $\overleftarrow{\theta}_{xy} = 0$ and $\overleftarrow{\psi} = 0$. The TAM distance then has the minimum allowed value ($\Gamma = 0$), and the signals can be considered equal in the temporal domain. The highest dissimilarity value is reached when $\Gamma = 3$, where $\overrightarrow{\theta}_{xy} = N \Rightarrow \overrightarrow{\psi} = 1$, $\overleftarrow{\theta}_{xy} = M \Rightarrow \overleftarrow{\psi} = 1$, and consequently $\bar{\psi} = 0$.

It is important to note some considerations regarding the topology of the proposed measure:
1. The condition of identity of indiscernibles is not satisfied: $\Gamma(X, Y) = 0 \nRightarrow X = Y$. In fact, two signals can be equal in time and possess dissimilarity in amplitude. A trivial application of our distance to two signals that are similar in time but offset in amplitude proves this assumption.
2. Symmetry is observed since the concept of advance and delay between two time series is reciprocal: $\overrightarrow{\theta}_{xy} = \overleftarrow{\theta}_{yx} = \alpha$ and $\overleftarrow{\theta}_{xy} = \overrightarrow{\theta}_{yx} = \beta$. Furthermore, $\bar{\theta}_{xy} = \bar{\theta}_{yx} = \bar{\theta}$. The symmetry proof is trivial and outlined in Eq. 10:

$$ \frac{\overrightarrow{\theta}_{xy}}{N} + \frac{\overleftarrow{\theta}_{xy}}{M} + \left(1 - \frac{\bar{\theta}_{xy}}{\min(N, M)}\right) = \frac{\overrightarrow{\theta}_{yx}}{M} + \frac{\overleftarrow{\theta}_{yx}}{N} + \left(1 - \frac{\bar{\theta}_{yx}}{\min(N, M)}\right) $$
$$ \Leftrightarrow \frac{\alpha}{N} + \frac{\beta}{M} + \left(1 - \frac{\bar{\theta}}{\min(N, M)}\right) = \frac{\beta}{M} + \frac{\alpha}{N} + \left(1 - \frac{\bar{\theta}}{\min(N, M)}\right) $$
$$ \Leftrightarrow \Gamma(X, Y) = \Gamma(Y, X) \tag{10} $$

The aforementioned considerations reveal that TAM cannot be considered a metric since it fails to guarantee the identity of indiscernibles. However, we can state that it is a premetric, since it fully satisfies both the non-negativity and symmetry conditions. It is important to emphasize that in order to calculate the TAM distance, it is only required to establish a pairwise relation between the elements of each time series. This pairwise relation provides the required alignment to compute the delays, advances, and phase periods between signals. Thus, signal alignment methods other than DTW can also be used to compute TAM.

Calculate TAM from optimal warping path.
TAM can be calculated directly from the DTW warping path based on the following assumptions:

Let $\nabla[W^*_k] = w^*_{k+1} - w^*_k$ be the finite difference between two consecutive coordinates of the optimal warping path at point $k$, represented as a bidimensional vector. Since the optimal warping path is restricted to vertical, horizontal and diagonal segments, $\nabla[W^*_k]$ is also restricted to the values $\nabla[W^*_k] \in \{(1, 1), (1, 0), (0, 1)\}$. The vertical segments will have a value of $\nabla[W^*_k] = (0, 1)$, marked by a temporal delay; the horizontal segments will have $\nabla[W^*_k] = (1, 0)$, denoting a temporal advance; and the diagonal segments will present $\nabla[W^*_k] = (1, 1)$, denoting phase among the time series.

The number of instants in advance, delay, and phase can be directly calculated from the optimal path according to Eq. 11.

$$ \overrightarrow{\delta}_k = \begin{cases} 1, & \nabla[W^*_k] = (1, 0) \\ 0, & \text{otherwise} \end{cases} \qquad \overleftarrow{\delta}_k = \begin{cases} 1, & \nabla[W^*_k] = (0, 1) \\ 0, & \text{otherwise} \end{cases} \qquad \bar{\delta}_k = \begin{cases} 1, & \nabla[W^*_k] = (1, 1) \\ 0, & \text{otherwise} \end{cases} \tag{11} $$

Hence, TAM can be calculated according to Eq. 12:

$$ \Gamma = \frac{1}{N} \sum_{k=1}^{K-1} \overrightarrow{\delta}_k + \frac{1}{M} \sum_{k=1}^{K-1} \overleftarrow{\delta}_k + \left(1 - \frac{1}{\min\{N, M\}} \sum_{k=1}^{K-1} \bar{\delta}_k \right) \tag{12} $$

Returning to the analysis of Fig. 3, we can see that the path starts in phase, enters in delay during four segments and finishes in advance during another four segments. We can write the resultant $\overrightarrow{\delta}$, $\overleftarrow{\delta}$, and $\bar{\delta}$ as:

$$ \overrightarrow{\delta} = \{0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1\} $$
$$ \overleftarrow{\delta} = \{0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0\} $$
$$ \bar{\delta} = \{1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0\} \tag{13} $$

3.3. Application

In the following paragraphs we use artificial signals to explain the proposed distance and support its potential to characterize time series.

Equal length time series.
When we establish a comparison between two time series, $X$ and $Y$, with equal lengths $N = M = L$, the durations of the intervals where $X$ is in advance and in delay in comparison to $Y$ must be equal. This property is a consequence of the fact that both signals have equal lengths. Therefore, after an eventual delay, the signal must have a subsequent advance of the same duration to finish at the same instant as the other sequence, and vice-versa.

Since $\overrightarrow{\theta}_{xy} = \overleftarrow{\theta}_{xy} = \theta$, the TAM distance can be directly simplified to:

$$ \Gamma(X, Y) = \frac{3\theta}{L} \tag{14} $$

Fig. 4 represents an example where a set of four artificial signals was generated by distorting the first sequence. Although in this example the Euclidean distance increases with the increase of time distortion, it does not reflect a meaningful measurement. Since it is only sensitive to amplitude similarity, in case a sequence possesses an offset on the plateau values, it will produce a greater distance value, independently of the degree of time warping. DTW aligns each pair of signals prior to computing the distance and thus produces equal scores for all examples. However, TAM produces a meaningful score, which reflects the cost to compress and dilate a specific signal in time. In fact, the distance increases with the degree of time distortion present in each sequence. Signal A is identical to the reference signal. The plateau from signal B has an advance of 10 seconds. In order to finish at the same instant, the signal enters in delay at the last segment for another 10 seconds. Signal C is distorted during half of its time, as $\overrightarrow{\theta}_{xy} = \overleftarrow{\theta}_{xy} = 15 \Rightarrow \Gamma = \frac{3 \cdot 15}{30}$. Signal D represents a significant advance of the plateau, which is represented as a single peak, followed by a significant delay until the end of the signal. Note that in all examples the first two and last two samples are in phase, which prevents the maximum theoretical TAM value from being reached.

Unequal length time series.
Time series with different lengths may present multiple behaviours in the way they evolve in time. Fig. 5 represents a group of four signals distorted in time. The signals were generated from the first sequence by modifying the respective time vectors to simulate delays and advances. Signal A shares the same time representation with the reference signal. Signal B possesses an initial delay of 10 seconds and then enters in phase for the rest of its length. The TAM distance of 0.25 reflects this compression. A higher distance is observed for signal C since a more significant advance is present. Signal D reflects a linear delay lag, since two instants in the signal are related to a single instant in signal A, producing a distance value of 0.5. The Euclidean distance was calculated by linearly interpolating all signals to share the same length as the reference. This procedure ensures that signals have equal lengths in order to compute the Euclidean distance. Although signal D is different from the reference in the time domain, the Euclidean distance is zero.
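
The computation of TAM from a warping path (Eqs. 11 and 12) can be sketched as below. This is an illustrative reading of the formulas rather than the authors' implementation: the function name `tam` is hypothetical, the path is given as 1-based `(i, j)` pairs, and `n_len` and `m_len` are taken to be the number of samples in X and Y. The example path is an assumed reconstruction of the one described for Fig. 3, built from the step sequence of Eq. 13 (four diagonal, four vertical, four horizontal steps).

```python
import numpy as np


def tam(path, n_len, m_len):
    """TAM from a warping path, following Eqs. (11) and (12) literally."""
    # Finite differences w_{k+1} - w_k between consecutive path points.
    steps = np.diff(np.asarray(path), axis=0)
    advance = int(np.sum((steps[:, 0] == 1) & (steps[:, 1] == 0)))  # horizontal
    delay = int(np.sum((steps[:, 0] == 0) & (steps[:, 1] == 1)))    # vertical
    phase = int(np.sum((steps[:, 0] == 1) & (steps[:, 1] == 1)))    # diagonal
    score = advance / n_len + delay / m_len + (1 - phase / min(n_len, m_len))
    return score, advance, delay, phase


# Assumed path for the Fig. 3 description: phase, then delay, then advance.
path = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5),
        (5, 6), (5, 7), (5, 8), (5, 9),
        (6, 9), (7, 9), (8, 9), (9, 9)]
score, advance, delay, phase = tam(path, n_len=9, m_len=9)
print(score, advance, delay, phase)
```

With N = M = 9 taken as sample counts, the sketch counts four steps of each kind and returns 13/9 ≈ 1.44. Because a path over N samples has only N − 1 steps, a purely diagonal path gives 1/N rather than exactly 0 under this literal reading; normalising by the number of steps instead would be an alternative assumption.
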

Fig. 4. A set of four examples with equal lengths. The first sequence is the reference signal (dashed line). All the signals are compared against the reference signal. An annotation is provided with the respective Euclidean, DTW and TAM distances.

Fig. 5. A group of four signals distorted in time. The signals were generated from the upper signal (dashed line) by artificially modifying the time vectors to simulate delays and advances. An annotation is provided to show the Euclidean, DTW and TAM distances. The Euclidean distance was calculated by linearly interpolating all signals to the reference length.

Application notes.
The previous examples help to describe the nature of our main contribution: a novel distance measurement able to characterize the degree of time warping between time series which may be similar in amplitude. The DTW-D, proposed by [15], consists of the ratio between DTW and the Euclidean distance. One might argue that such a ratio could be an approximation to measure warping, as it measures the amount of warping necessary to match a given time series in reference to the Euclidean distance (which requires no warping at all). However,
274 D. Folgado et al. / Pattern Recognition 81 (2018) 268–279

DTW-D will eventually fail in the presented application examples. As the signals are similar in amplitude, DTW will have the value of 0 and, consequently, DTW-D will also fail to provide a meaningful score.
As a final note, a naive approach to comparing time series in the time domain would be to compare only the lengths of the sequences. However, the TAM evaluates the temporal behaviour in terms of delay, advance and phase along the time of each sequence and, therefore, it is not strictly limited to the endpoints of each sequence. Consequently, even for sequences with equal length, temporal information can be extracted which would not be possible if a direct comparison between signal lengths was performed.

4. Experimental Evaluation

In this section two studies are presented to demonstrate the applicability and relevance of our approach to characterize real time series data. As previously mentioned in subsection 3.2, the TAM value is calculated based on a previous time series alignment. Therefore, the value depends on the quality of the preceding alignment. The first study examines the signal alignment quality using well-known DTW variations and our proposed SW-DTW modification. Secondly, we apply the TAM as a local measure to examine human repetitive motion using inertial data.

4.1. Simulated time series alignment

We created a controlled experiment in order to assess the signal alignment performance across several DTW variations. During the course of our research, we did not find a dataset whose main objective is to serve as validation for time series alignment mechanisms. In this sense, we implemented a study based upon a comparison between a given time series X and a modified time series $\hat{X}$ calculated from an amplitude modification of X.
A scale vector, S, was generated using a series of random values from a Gaussian distribution. In order to prevent an excessive modification between consecutive elements, we used a similar approach to [30], where the initial random values were filtered to ensure adjacent scales differ by at most 1: S(t + 1) = S(t) + sin(π × randn). The signal was multiplied element-wise by the scale vector in order to modulate negative and positive fluctuations: $\hat{X} = X \odot S$. This procedure results in two time series which are always in phase during their entire length, since the only modification was implemented in the amplitude domain (taking also into account that no excessive modification was performed in order to prevent significant changes in the shape of the two signals).
When using DTW and its variants, the ideal expected outcome is an optimal warping path which demonstrates that the signals are continuously in phase during their complete length. However, the amplitude fluctuations arising from the multiplication with the scale vector are susceptible to generate singularities as previously discussed in subsection 2.2.
In order to quantify the alignment quality, the TAM was calculated between each pair of time series. Since the signals are aligned throughout their complete length, it is expected that Γ = 0 in circumstances where the alignment was indeed performed correctly. Given that singularities result in advances and delays that do not correspond to correct alignments, the value of the TAM will be incorrectly influenced. Therefore, in the context of this experiment, the TAM value can be used to translate the alignment quality and establish a comparison among different signal alignment techniques.
We used the UCR time series archive [25] to test several DTW variations by randomly selecting 40 signals from 84 datasets, which resulted in 3360 different alignments per algorithm variation. The selected algorithms were the DTW (no warping window), the DTW with a warping window of 5% of the signal's length (DTW_R), the DDTW, and the SW-DTW with empirical values of α = 0.5 and δ = 0.05 × N. The scale vector, S, was normalized prior to the multiplication in order to guarantee that S ∈ [0.5, 1.5]. The results are summarized in Table 1.
The analysis suggests that SW-DTW reduces the number of singularities as it outperforms the other variants in most of the datasets. The improvement of DTW_R in comparison with DTW is explained by the fact that the maximum distance of the warping path to the diagonal is restricted. In the majority of the situations where the SW-DTW is not the best alignment alternative for a given dataset, the lowest Γ is achieved by the DDTW. It is important to emphasize that the value of α = 0.5 was used for all datasets and that no individual adjustment was performed, in order to reduce the complexity of the analysis. Since lower values of α will increase the weight of the first order derivative, we can anticipate that α can be used to increase the alignment quality of SW-DTW in datasets where the DDTW achieved superior performance. This fact also suggests that before applying SW-DTW, proper tuning of the α and δ parameters is required.
The different DTW alignment methodologies resulted in different alignments for the same dataset and express variability in the Γ values. Therefore, we can anticipate that these results support our claim that TAM is sensitive to the alignment quality and that SW-DTW reduces the singularities in comparison with the other evaluated alternatives. It is worth mentioning that although SW-DTW achieved superior performance in this experiment, it is not our main contribution. Since TAM depends on preceding alignments, this experiment, supported by the results of Table 1, increased our confidence that SW-DTW reduces the number of singularities and produces a more correct alignment in comparison with the evaluated alternatives.
A detailed analysis of the UCR dataset also highlighted important points to consider before attempting a time series classification exercise using TAM as a local measurement. The TAM should be used in datasets with significant temporal distortion and similar amplitude between different classes. Additionally, each class must also comprise time series which are similar in the temporal domain. This situation is not present in the majority of the UCR datasets since there are several datasets with minor temporal differences between classes (e.g. Adiac, OliveOil and ProximalPhalanxOutlineCorrect). Therefore, we introduced a new time series dataset that suits the TAM applicability requirements, which is thoroughly discussed in subsection 4.2.

4.2. Repetitive upper limb motion

A base motivation for the development of this new measure was to describe time warping of human movement. The studied paradigm included the assessment of repetitive well-defined movements in different temporal distortion contexts. Repetitive motion is present in several circumstances such as rehabilitation exercises, human gait dynamics, and movements that employees execute during the working day in certain job activities.
In this subsection, we present an experiment based upon time series retrieved using inertial sensors during the execution of repetitive motion. We created a dataset with a total of 240 signals retrieved from six different subjects that executed ten repetitions of a well-defined task under four distinct sets. The movements performed during each task consisted of: grasping a solderless breadboard used to build electronic circuits; placing the board on a defined position and welding a single perforation in each repetition; grasping the welded board and moving it to a defined position. The difference among each class is based on the temporal criterion used by the subjects to perform the task as illustrated in Fig. 6.
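The amplitude-only distortion used in the controlled experiment of subsection 4.1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `make_scale_vector` and `amplitude_modify` are hypothetical, the starting value S(0) = 0 is an assumption (the text does not state it), and min-max rescaling is assumed as the normalization that maps S into [0.5, 1.5].

```python
import numpy as np


def make_scale_vector(n, rng, low=0.5, high=1.5):
    """Random-walk scale vector following S(t+1) = S(t) + sin(pi * randn)."""
    # Every increment lies in [-1, 1], so adjacent scales differ by at most 1.
    steps = np.sin(np.pi * rng.standard_normal(n - 1))
    # Assumption: the walk starts at 0 (S(0) is not given in the text).
    s = np.concatenate(([0.0], np.cumsum(steps)))
    # Assumption: min-max rescaling is the normalization into [low, high].
    span = s.max() - s.min()
    if span == 0.0:
        return np.full(n, 0.5 * (low + high))
    return low + (high - low) * (s - s.min()) / span


def amplitude_modify(x, rng):
    """Return the element-wise product of x with a scale vector, and the vector."""
    x = np.asarray(x, dtype=float)
    s = make_scale_vector(len(x), rng)
    return x * s, s


rng = np.random.default_rng(0)
x = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
x_hat, s = amplitude_modify(x, rng)
```

Because every scale is strictly positive, the modified series keeps the sign and the timing of each sample of X, which is why a correct alignment between the two series is expected to be purely diagonal.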

Table 1
Results of Γ across the UCR archive. The SW-DTW was applied using α = 0.5 and δ = 0.05 × N. In the DTW_R a 5% warping window was used. Higher values indicate improper alignments and lower values indicate correct alignments. The best alignment value for each dataset among the different DTW alignment variations is highlighted in bold font.

Dataset DTW DTW_R DDTW SW-DTW Dataset DTW DTW_R DDTW SW-DTW

Adiac 0.900 0.743 1.528 0.517 Meat 1.640 1.164 0.930 0.521
ArrowHead 1.255 1.042 0.637 0.457 MedicalImages 1.405 0.602 0.694 0.654
Beef 1.484 1.247 0.684 0.294 MiddlePhalanxOutlineAgeGroup 0.821 0.607 0.676 0.246
BeetleFly 0.810 0.804 1.088 0.196 MiddlePhalanxOutlineCorrect 0.889 0.627 0.733 0.307
BirdChicken 1.125 1.084 0.640 0.352 MiddlePhalanxTW 0.817 0.607 0.738 0.278
Car 1.222 1.067 1.211 0.568 MoteStrain 1.007 0.595 0.532 0.421
CBF 0.608 0.552 0.021 0.001 NonInvasiveFatalECGThorax1 1.612 1.435 0.445 0.284
ChlorineConcentration 0.523 0.510 0.103 0.025 NonInvasiveFatalECGThorax2 1.664 1.362 0.544 0.386
CinCECGtorso 1.640 1.445 0.232 0.266 OliveOil 1.630 1.165 0.978 0.378
Coffee 1.459 1.176 0.541 0.225 OSULeaf 0.966 0.937 0.937 0.270
Computers 0.557 0.328 0.000 0.000 PhalangesOutlinesCorrect 0.770 0.586 0.713 0.264
CricketX 1.116 1.033 0.079 0.019 Phoneme 0.707 0.702 0.230 0.006
CricketY 1.206 1.078 0.097 0.032 Plane 0.827 0.776 0.824 0.253
CricketZ 1.162 1.018 0.080 0.026 ProximalPhalanxOutlineAgeGroup 0.816 0.608 0.800 0.316
DiatomSizeReduction 1.000 0.859 1.426 0.551 ProximalPhalanxOutlineCorrect 0.854 0.564 0.779 0.309
DistalPhalanxOutlineAgeGroup 0.698 0.544 0.616 0.213 ProximalPhalanxTW 0.803 0.628 0.855 0.323
DistalPhalanxOutlineCorrect 0.738 0.576 0.602 0.226 RefrigerationDevices 0.588 0.533 0.002 0.000
DistalPhalanxTW 0.757 0.581 0.717 0.235 ScreenType 0.546 0.301 0.001 0.000
Earthquakes 0.016 0.016 0.000 0.000 ShapeletSim 0.084 0.084 0.018 0.000
ECG200 0.893 0.658 0.116 0.162 ShapesAll 1.188 1.088 0.724 0.425
ECG5000 1.235 1.004 0.139 0.117 SmallKitchenAppliances 0.359 0.138 0.000 0.000
ECGFiveDays 0.947 0.758 0.153 0.115 SonyAIBORobotSurface1 0.292 0.242 0.052 0.021
ElectricDevices 0.210 0.096 0.007 0.002 SonyAIBORobotSurface2 0.120 0.109 0.048 0.005
FaceAll 0.382 0.373 0.229 0.021 Strawberry 1.525 1.177 0.757 0.433
FaceFour 0.567 0.559 0.006 0.000 SwedishLeaf 0.809 0.697 0.679 0.325
FacesUCR 0.369 0.365 0.216 0.018 Symbols 1.208 0.847 0.980 0.613
FiftyWords 1.276 0.988 1.081 0.414 SyntheticControl 0.233 0.202 0.030 0.009
Fish 1.171 0.996 1.755 0.619 ToeSegmentation1 1.249 1.107 0.207 0.086
FordA 0.683 0.683 0.625 0.000 ToeSegmentation2 1.399 1.180 0.321 0.152
FordB 0.658 0.658 0.580 0.000 Trace 1.926 1.066 0.127 0.292
GunPoint 1.750 0.643 1.443 0.479 TwoLeadECG 1.037 0.661 0.443 0.484
Ham 1.134 1.066 0.494 0.042 TwoPatterns 0.044 0.044 0.008 0.000
HandOutlines 0.983 0.959 1.214 0.446 UWaveGestureLibraryAll 1.314 1.218 0.517 0.131
Haptics 1.506 1.339 0.363 0.481 UWaveGestureLibraryX 1.197 0.861 0.784 0.429
Herring 1.210 1.119 1.112 0.538 UWaveGestureLibraryY 1.367 0.874 0.805 0.573
InlineSkate 1.786 1.493 0.081 0.575 UWaveGestureLibraryZ 1.330 0.893 0.938 0.563
InsectWingbeatSound 1.473 1.017 0.876 0.290 Wafer 1.479 0.530 0.990 0.169
ItalyPowerDemand 0.629 0.000 0.269 0.111 Wine 1.262 0.978 0.870 0.591
LargeKitchenAppliances 0.834 0.189 0.001 0.018 WordSynonyms 1.249 0.989 0.992 0.416
Lightning2 1.202 1.007 0.096 0.010 Worms 1.428 1.324 0.292 0.135
Lightning7 1.140 0.863 0.068 0.023 WormsTwoClass 1.562 1.478 0.244 0.109
Mallat 1.523 1.349 1.114 0.324 Yoga 1.148 1.053 1.167 0.465
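
The quantity reported in Table 1 can be illustrated with a small sketch that derives a TAM-like score from a DTW warping path. It is a sketch under stated assumptions rather than the authors' code: `dtw_path` is a plain, unconstrained DTW with an absolute-difference local cost, and `tam` counts the path steps where only one index advances (delay or advance) and the diagonal steps (phase). The assignment of each step type to delay or advance, and the normalization by the last index of each series, are assumptions standing in for the definitions of Section 3.

```python
import numpy as np


def dtw_path(x, y):
    """Unconstrained DTW; returns the optimal path as two index arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    cost = np.abs(np.subtract.outer(x, y))
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best
    # Backtrack from the last cell, preferring the diagonal move on ties.
    i, j = n, m
    px, py = [i - 1], [j - 1]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda c: acc[c[0], c[1]])
        px.append(i - 1)
        py.append(j - 1)
    return np.array(px[::-1]), np.array(py[::-1])


def tam(px, py):
    """TAM-like score in [0, 3]; px indexes the reference, py the compared series."""
    if px[-1] < 1 or py[-1] < 1:
        raise ValueError("both series need at least two samples")
    dx, dy = np.diff(px), np.diff(py)
    # Assumed convention: the reference index stalls -> delay,
    # the compared index stalls -> advance, both move -> phase.
    delay = np.count_nonzero(dx == 0)
    advance = np.count_nonzero(dy == 0)
    phase = np.count_nonzero((dx == 1) & (dy == 1))
    len_ref, len_cmp = px[-1], py[-1]
    p_advance = advance / len_ref
    p_delay = delay / len_cmp
    p_phase = phase / min(len_ref, len_cmp)
    return float(p_advance + p_delay + (1.0 - p_phase))


rng = np.random.default_rng(1)
x = rng.standard_normal(50)
gamma_same = tam(*dtw_path(x, x))                # purely diagonal path
gamma_slow = tam(*dtw_path(x, np.repeat(x, 2)))  # every sample held twice
```

Under these assumptions a purely diagonal path yields a score of zero, which matches the expectation stated in subsection 4.1 for two series that are in phase, while any stalled index raises the score.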

Fig. 6. Summary of the proportional timings for task execution. The grasp movement is depicted by “G” and the soldering process is depicted by “S”.

The subjects executed the repetitions under four distinct sets: Normal, where the subjects executed the movements at standard speed; Short, where the soldering process takes approximately half of its Normal duration; Extended, where the soldering process takes twice its standard duration; and OverallSlow, where the subject takes approximately twice the standard time to complete the entire task.
Inertial information was retrieved using a custom IMU developed by Fraunhofer AICOS [31] streaming gyroscope data at 100 Hz. Since the device was placed on the wrist of each subject, it was possible to retrieve data that contained quasi-periodic sequences corresponding to all the task repetitions performed by the subject. The signals were manually segmented based on the beginning and end of each task and the SW-DTW and TAM were applied to each segmented window.
We divided our study in two perspectives: (1) using one of the subjects as a reference to establish a comparison between a reference subject and the group of the remaining subjects. The objective was to produce a representative distance of how similarly the movements were performed compared with the reference; (2) providing a time series classification example, using TAM as a feature.

4.2.1. Reference movement comparison
In this approach we used a reference subject who recorded the reference movements that were compared against the group of remaining subjects. The reference was recorded by a single subject executing movements at standard and predefined timings which are described by the Normal set. Since the reference model must best represent the correct motion with its inherent variability, we first interpolated all the tasks performed by the reference volunteer to the mean task duration. Secondly, we computed the mean signal among all the interpolated signals. Since no prior signal alignment was performed, taking the mean of the interpolated signals as the reference would have introduced artifacts during the mean calculation. In order to overcome this issue, we chose as reference task the one with minimum TAM value in comparison with the previously calculated mean. Therefore, the reference time series consists of the task repetition of the reference subject which potentially best

Fig. 7. Alignments between the reference time series (blue) and a repetition performed by a subject during the OverallSlow set (red). A comparison is provided between the
DTW alignment (left) and the SW-DTW (right) with α = 0.05 and δ = 2 seconds. The interval where the grasp movement occurs is depicted by “G” and the interval where
the soldering process is executed is depicted by “S”. For presentation purposes the alignment lines are not displayed for the entire set of samples.

Fig. 8. Mean and standard deviation SW-DTW and TAM values of all subjects in different set speeds.

corresponds to the minimum temporal misalignment in comparison with its own mean.
Fig. 7 illustrates an example of the alignment established by the SW-DTW, where the reference time series is compared against a signal acquired from another subject executing the OverallSlow set. The signals comprise filtered gyroscope data and the prominent events correspond to the executed movements necessary to achieve the task. The plateau on both series corresponds to the moment where the subject is actually placing the iron tip against the perforation to accomplish the soldering.
We can observe a misadjustment between peaks corresponding to the same event. In the OverallSlow example the peaks occur at different instants and they tend to show a temporal offset to the right. Therefore, we can state that they are temporally delayed relative to the reference time signal. In line with the results from 4.1, the visual comparison potentially suggests that the alignment produced by SW-DTW reduces the singularity issues and allows a more accurate TAM calculation in comparison with the DTW methodology. In fact, even in the segment which is prone to lead to singularities, such as the plateau, the SW-DTW seems to reasonably map the delay between the two series, which is not observed in the DTW, as both an advance and a delay are present since two singularities occur.
After manually segmenting all tasks, the distances between the reference time series and the remaining signals of the dataset acquired in the four contexts were calculated using the SW-DTW and the TAM. Fig. 8 summarizes the results of the mean and standard deviation values for the SW-DTW and TAM distances between the reference and the group of the remaining subjects for each set.
Since the signals are similar in amplitude, the SW-DTW values are similar between the Normal, Short, and Extended sets. The OverallSlow set produced a higher score since angular acceleration may become attenuated when the subject tries to execute the task at a slower pace. The analysis using TAM shows a similar pattern, with the exception of the relatively lower distance of the Extended set. The highest similarity between sets is present between the Short and Extended sets, despite the fact that they still exhibit higher values in comparison to the Normal set. This result can be explained since the TAM measures the overall time warping between series. Since the ratio of advance in the Short set is similar to the ratio of delay in the Extended set, they end up showing the same extent of overall warping.
The advantage of using TAM to complement the analysis lies in the fact that we are still able to retrieve further information if we examine the ratios of delay and advance for each set. Since the Short set comprises an advance it is expected that $\overrightarrow{\psi} > \overleftarrow{\psi}$. On the other hand, as Extended constitutes a delay, one can anticipate $\overrightarrow{\psi} < \overleftarrow{\psi}$. Those assumptions are supported by the results outlined in Fig. 9. The ratio between both parameters in the Normal set suggests that although the subjects try to follow the predefined reference timings for movement execution, there is an inherent variability associated with the movements required to complete the task. On the other hand, the Short set ratio possesses a significantly higher weight for advance, in contrast with the Extended set, which denotes a predominant weight of delay, as expected by the nature of how the task was performed in each respective set. OverallSlow shows the most significant weight increase for delay.
This study demonstrated the potential of TAM to discriminate between different time warping contexts of the same activity. The

Table 2
Accuracy (mean ± standard deviation) after k-fold cross validation in comparing distinct distance functions. A total of 20
folds was evaluated. The best value among the different classifiers is highlighted in bold font.

Measurement T MSE DTW SW-DTW TAM(DTW) TAM(SW-DTW)

Accuracy (μ ± σ ) (0.85 ± 0.05) (0.80 ± 0.04) (0.90 ± 0.03) (0.92 ± 0.03) (0.37 ± 0.04) (0.96 ± 0.03)
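
The classification protocol behind Table 2 can be sketched as a 1-NN classifier with a pluggable distance function. The two distances shown follow the textual definitions of T (absolute difference between lengths) and MSE (Euclidean distance after interpolating both sequences to the length of the longest one); DTW, SW-DTW or TAM distances could be passed in the same way. The helper names and the toy series below are hypothetical and are not the dataset of this study.

```python
import numpy as np


def length_distance(a, b):
    """T: absolute difference between the lengths of the two series."""
    return abs(len(a) - len(b))


def mse_distance(a, b):
    """Euclidean distance after linear interpolation to the longer length."""
    n = max(len(a), len(b))
    grid = np.linspace(0.0, 1.0, n)
    ra = np.interp(grid, np.linspace(0.0, 1.0, len(a)), a)
    rb = np.interp(grid, np.linspace(0.0, 1.0, len(b)), b)
    return float(np.sqrt(np.sum((ra - rb) ** 2)))


def one_nn_predict(train_series, train_labels, query, distance):
    """Label of the training series closest to the query."""
    dists = [distance(query, s) for s in train_series]
    return train_labels[int(np.argmin(dists))]


def one_nn_accuracy(train_series, train_labels, test_series, test_labels, distance):
    """Fraction of test series whose nearest neighbour has the right label."""
    hits = sum(
        one_nn_predict(train_series, train_labels, q, distance) == label
        for q, label in zip(test_series, test_labels)
    )
    return hits / len(test_labels)


def toy_series(n):
    # Hypothetical stand-in for one segmented task repetition.
    return np.sin(np.linspace(0.0, 2.0 * np.pi, n))


train_series = [toy_series(100), toy_series(104), toy_series(200), toy_series(196)]
train_labels = ["Normal", "Normal", "OverallSlow", "OverallSlow"]
test_series = [toy_series(98), toy_series(205)]
test_labels = ["Normal", "OverallSlow"]

acc_t = one_nn_accuracy(train_series, train_labels, test_series, test_labels, length_distance)
acc_mse = one_nn_accuracy(train_series, train_labels, test_series, test_labels, mse_distance)
```

Keeping the distance as a parameter mirrors the comparison in Table 2, where the classifier is fixed and only the distance function changes.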

movements performed in all sets to accomplish the task were exactly the same. However, the timings of the movements executed were different and led to non-linear warping between time series, which was successfully described by our proposed distance. The analysis using TAM is more informative than SW-DTW as it is potentially able to discriminate the results from Short and Extended, which showed similar distance values using the SW-DTW.

Fig. 9. Mean and standard deviation values for $\overrightarrow{\psi}$ and $\overleftarrow{\psi}$ for all subjects in different set activities.

In this approach we were able to demonstrate the ability to produce meaningful scores using TAM in a scenario where a group of subjects was compared against a previously recorded reference. In the next subsection we present a classification example using our proposed measurement.

4.2.2. Time series classification
In this approach we merged the data from all subjects into 4 classes that correspond to each set (Normal, Short, Extended, and OverallSlow). We implemented a 1-NN classifier using the following distance functions: the absolute distance between the lengths of two series, T; Mean Square Error (MSE), which corresponds to the Euclidean distance after interpolating both sequences to the length of the longest sequence; DTW; SW-DTW with α = 0.25 and δ = 4 s; TAM(DTW), which corresponds to the TAM value calculated using the DTW alignment; and TAM(SW-DTW), which corresponds to the TAM value calculated using the SW-DTW alignment.
We divided the subjects into training and testing sets with an equal number of elements and used a k-fold cross validation to cover the entire group of possibilities between the distribution of the subjects in the training and testing sets. The results are presented in Table 2.
The highest accuracy was achieved by the TAM calculated using the SW-DTW alignment. However, a significant difference is present when using the DTW alignment. These results reinforce the fact that the quality of the previous time series alignment is crucial to achieve representative performance when using our proposed measurement. The DTW is prone to produce singularities more often than SW-DTW and they will tend to increase the variability of the distance between time series of the same class, which degrades the classifier's performance. We can also observe a performance increase when using TAM in comparison with T. This fact suggests that for a given set, there is inherent variability among subjects associated with the total time spent in each task. The TAM performs a description during the complete signal length and, consequently, detects the temporal pattern associated with each set and not strictly the difference between the signal lengths.
In the amplitude domain, we were also able to achieve good performance. The similarity between the DTW and SW-DTW accuracies suggests that even with singularities, the DTW was able to distinguish the 4 classes. It is worth mentioning that DTW and TAM measure different realities (amplitude and time, respectively). This fact explains how both DTW and SW-DTW achieved good performance while a different behaviour was present in the TAM calculation based on DTW and SW-DTW alignments, since there was a significant difference between the calculated accuracies. Finally, the MSE achieved the lowest performance among the amplitude-based features, as the linear interpolation between the sequences is not able to model the non-linear warping that is present between different sets. This non-linear warping is correctly modelled using DTW-based algorithms as they produce a more intuitive distance, as discussed in subsection 2.1.

5. Conclusions and Future Work

One of the most important topics in the context of the study of time series is the development of novel measures able to characterize each signal. As most of the distances are based on amplitude, a true time distance able to characterize the degree of time warping between two sequences can be useful in a wide range of domains. We believe that this distance can be applied in several contexts, such as human movement analysis and electrophysiological data.
This paper presents two relevant contributions to the aforementioned domain. The alignment between time series can provide a pairwise relationship between elements which translates information on the time domain. One of the state-of-the-art techniques to perform the alignment is based on the DTW algorithm. There are however some circumstances, arising when the algorithm tries to express amplitude variability in the Y-axis by improperly warping the X-axis, where incorrect alignments may eventually be present. The distance function used in the DTW only uses a point-to-point distance and does not assess the context where a particular time instant is inserted. One of our contributions was the development of a new local cost distance for the DTW algorithm. Using a window instead of an element-to-element approach potentially prevents singularities by looking at a region which takes into account a weighting between the amplitude and the first discrete derivative of both signals. Although the achieved results showed a significant decrease of singularities, our approach is computationally demanding and may require detailed optimization for real-time usage.
The major contribution of our work is a comprehensive technique to characterize the degree of warping between time series. In this paper we started by presenting a detailed analysis of the DTW optimal warping path. The vertical, horizontal and diagonal segments can deliver information related to the delay, advance and phase, respectively, between two given time series. Our approach tries to measure the cost that one sequence must perform to match the temporal requirements of a given reference sequence. A limitation of the TAM is that it relies on the alignment quality between two series in order to use the optimal alignment path. Therefore, an improved alignment was achieved using the SW-DTW approach. The TAM distance was successfully applied to both artificial and real time series data. We demonstrated two examples of applicability of this novel measurement: the TAM can be used as a quality index to establish a comparison between different signal alignment methodologies. Our results show that SW-DTW tends to reduce singularities in comparison with the evaluated alternatives; we also demonstrated the possibility to discriminate time warping differences of human repetitive movement. In this case, although the time series demonstrated amplitude similarity as the movements being executed were equal, they show different degrees of non-linear warping according to the nature of each set.
It is important to emphasize that in this paper we do not intend to achieve a generic higher performance of TAM in comparison with DTW. The two approaches measure different realities: SW-DTW measures distance in the amplitude domain and it will be more suitable for classification in datasets with amplitude variability among classes; TAM expresses distance in the temporal domain and it is more suitable to classify datasets with minor amplitude deviation and high temporal variability. Being two measurements of different nature, they will be applied according to different realities.
The results obtained also suggested that TAM is sensitive to the previous pairwise alignment and, therefore, a correct adjustment of the α and δ parameters is required when using SW-DTW. While we based our study on empirical derivation of the best values for each parameter, in future a detailed analysis of their influence on the alignment quality and ultimately on the TAM calculation will be required. In the present work we analysed the optimal warping path from a discrete perspective. By using the pairwise alignment provided by DTW, the warping path segments follow discrete slopes. The future work of this approach will consist in generalizing the TAM calculation from the optimal warping path to the continuous domain.

Acknowledgement

This work was supported by North Portugal Regional Operational Programme (NORTE 2020), Portugal 2020 and the European Regional Development Fund (ERDF) from European Union through the project Symbiotic technology for societal efficiency gains: Deus ex Machina (DEM) [NORTE-01-0145-FEDER-000026].

References

[1] E. Keogh, C.A. Ratanamahatana, Exact indexing of dynamic time warping, Knowledge and Information Systems 7 (3) (2005) 358–386, doi:10.1007/s10115-004-0154-9.
[2] C. Ratanamahatana, E. Keogh, Making time-series classification more accurate using learned constraints, in: Proc. of Sdm Int’L Conf, 2004, pp. 11–22. 10.1.1.215.1648
[3] C.A. Ratanamahatana, J. Lin, D. Gunopulos, E. Keogh, M. Vlachos, G. Das, Mining time series data, in: Data Mining and Knowledge Discovery Handbook, 2010, pp. 1049–1077, doi:10.1007/978-0-387-09823-4_56.
[4] T. Araújo, N. Nunes, H. Gamboa, A. Fred, Generic Biometry Algorithm Based on Signal Morphology Information: Application in the Electrocardiogram Signal, Springer International Publishing, Cham, 2015, pp. 301–310, doi:10.1007/978-3-319-12610-4_19.
[5] J. Barth, C. Oberndorfer, C. Pasluosta, S. Schlein, H. Gassner, S. Reinfelder, P. Kugler, D. Schuldhaus, J. Winkler, J. Klucken, B.M. Eskofier, Stride segmentation during free walk movements using multi-dimensional subsequence dynamic time warping on inertial sensor data, Sensors 15 (3) (2015) 6419, doi:10.3390/s150306419.
[6] G.E. Batista, X. Wang, E.J. Keogh, A complexity-invariant distance measure for time series, in: SDM, Vol. 11, SIAM, 2011, pp. 699–710, doi:10.1137/1.9781611972818.60.
[7] J. Aach, G.M. Church, Aligning gene expression time series with time warping algorithms, Bioinformatics 17 (6) (2001) 495–508, doi:10.1093/bioinformatics/17.6.495.
[8] D. Clifford, G. Stone, I. Montoliu, S. Rezzi, F.P. Martin, P. Guy, S. Bruce, S. Kochhar, Alignment using variable penalty dynamic time warping, Analytical Chemistry 81 (3) (2009) 1000–1007, doi:10.1021/ac802041e.
[9] I.P. Machado, A.L. Gomes, H. Gamboa, V. Paixo, R.M. Costa, Human activity data discovery from triaxial accelerometer sensor: Non-supervised learning sensitivity to feature extraction parametrization, Information Processing and Management 51 (2) (2015) 204–214, doi:10.1016/j.ipm.2014.07.008.
[10] X. Xia, X. Song, F. Luan, J. Zheng, Z. Chen, X. Ma, Discriminative feature selection for on-line signature verification, Pattern Recognition 74 (2018) 422–433.
[11] S. Chu, E.J. Keogh, D.M. Hart, M.J. Pazzani, Iterative deepening dynamic time warping for time series, in: SDM, SIAM, 2002, pp. 195–212, doi:10.1137/1.9781611972726.12.
[12] M. Müller, Dynamic time warping, Information retrieval for music and motion (2007) 69–84, doi:10.1007/978-3-540-74048-3_4.
[13] M.E. Munich, P. Perona, Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification, in: Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, Vol. 1, IEEE, 1999, pp. 108–115, doi:10.1109/ICCV.1999.791205.
[14] P.H. Eilers, Parametric time warping, Analytical chemistry 76 (2) (2004) 404–411, doi:10.1021/ac034800e.
[15] Y. Chen, B. Hu, E. Keogh, G.E. Batista, Dtw-d: time series semi-supervised learning from a single example, in: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2013, pp. 383–391.
[16] L. Chen, R. Ng, On the marriage of lp-norms and edit distance, in: Proceedings of the Thirtieth International Conference on Very Large Data Bases - Volume 30, VLDB ’04, VLDB Endowment, 2004, pp. 792–803.
[17] L. Chen, M.T. Özsu, V. Oria, Robust and fast similarity search for moving object trajectories, in: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, SIGMOD ’05, ACM, New York, NY, USA, 2005, pp. 491–502, doi:10.1145/1066157.1066213.
[18] P.F. Marteau, Time warp edit distance with stiffness adjustment for time series matching, Pattern Analysis and Machine Intelligence, IEEE Transactions on 31 (2) (2009) 306–318, doi:10.1109/TPAMI.2008.76.
[19] F. Petitjean, A. Ketterlin, P. Gançarski, A global averaging method for dynamic time warping, with applications to clustering, Pattern Recognition 44 (3) (2011) 678–693.
[20] M. Morel, C. Achard, R. Kulpa, S. Dubuisson, Time-series averaging using constrained dynamic time warping with tolerance, Pattern Recognition 74 (2018) 77–89.
[21] J. Lines, A. Bagnall, Time series classification with ensembles of elastic distance measures, Data Mining and Knowledge Discovery 29 (3) (2014) 565–592, doi:10.1007/s10618-014-0361-2.
[22] U. Mori, A. Mendiburu, J. Lozano, TSdist: Distance measures for time series data, in: R Foundation for Statistical Computing, 2016. http://cran.xl-mirror.nl/web/packages/TSdist/index.html.
[23] M. Barandas, H. Gamboa, J. Fonseca, A real time biofeedback system using visual user interface for physical rehabilitation, Procedia Manufacturing 3 (2015) 823–828, doi:10.1016/j.promfg.2015.07.337.
[24] E.J. Keogh, M.J. Pazzani, Derivative dynamic time warping, SIAM, 2001, doi:10.1137/1.9781611972719.1.
[25] Y. Chen, E. Keogh, B. Hu, N. Begum, A. Bagnall, A. Mueen, G. Batista, The UCR time series classification archive (july 2015).
[26] R. Muscillo, S. Conforto, M. Schmid, P. Caselli, T. D’Alessio, Classification of motor activities through derivative dynamic time warping applied on accelerometer data, in: 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2007, pp. 4930–4933, doi:10.1109/IEMBS.2007.4353446.
[27] A. Zifan, S. Saberi, M.H. Moradi, F. Towhidkhah, Automated ECG segmentation using piecewise derivative dynamic time warping, Int. J. Biol. Med. Sci 1 (2006) 181–185.
[28] Y.S. Jeong, M.K. Jeong, O.A. Omitaomu, Weighted dynamic time warping for time series classification, Pattern Recognition 44 (9) (2011) 2231–2240, doi:10.1016/j.patcog.2010.09.022.
[29] Z. Zhang, P. Tang, R. Duan, Dynamic time warping under pointwise shape context, Information Sciences 315 (2015) 88–101, doi:10.1016/j.ins.2015.04.007.
[30] J. Zhao, Z. Xi, L. Itti, metricdtw: local distance metric learning in dynamic time warping, ArXiv preprint arXiv:1606.03628.
[31] Fraunhofer AICOS, White paper: A day with pandlets, tech. rep., Research Center for Assistive Information and Communication Solutions, 2016.

Duarte Folgado received his MSc in Biomedical Engineering from the Faculty of Sciences and Technology of NOVA University of Lisbon. After finishing the MSc he continued
working as a scientist at Fraunhofer AICOS. His main research interests include computer science techniques, signal processing, and embedded systems for Assistive
Environments.

Marília Barandas received her MSc in Biomedical Engineering from the Faculty of Sciences and Technology of NOVA University of Lisbon. After completing her master’s
thesis, Marília was invited to lecture Medical Information Systems at FCT-UNL and to join Centre of Technology Systems, Portugal. Since April 2015, she has been a scientist at
Fraunhofer AICOS, focusing on indoor location solutions based on smartphones’ built-in inertial sensors.

Ricardo Matias is a Researcher at Champalimaud Centre for the Unknown of the Champalimaud Foundation. He has a PhD in Human Kinetics from the University of Lisbon.
His research combines computational biomechanics with machine learning to help uncover the mechanisms that trigger the decline from healthy mobility to movement
pathology.

Rodrigo Martins is an Assistant Professor at the Physiotherapy Department of the Portuguese Red Cross Health School. He is a PhD candidate in Biomechanics at Human
Motricity Faculty, University of Lisbon, Neuromechanics Research Group, Interdisciplinary Centre for the Study of Human Performance (CIPER), currently working on human
motion pattern recognition, mainly gait.

Miguel Carvalho is a scientist, researcher, and professor of textile engineering at University of Minho. He holds a degree in Textile Engineering, an MSc in Design and Marketing, and a PhD in
Textile Engineering – Clothing Technology. His work focuses on clothing and textile design, ergonomics, anthropometrics, development of functional and interactive materials, production
planning and control, work study, and teamwork.

Hugo Gamboa received his PhD in Electrical Engineering from Instituto Superior Técnico, Universidade de Lisboa, and he is an Assistant Professor in the Physics Department
of Faculty of Sciences and Technology of NOVA University of Lisbon, Portugal. He is also a Senior Scientist at Fraunhofer Portugal AICOS. He has authored more than 100
papers in conferences and journals. His research activities focus on biomedical instrumentation and biosignals processing and classification.
