
Time Series Clustering

Andrés M. Alonso
1 Department of Statistics, UC3M
2 Institute Flores de Lemus

ASDM - C02
June 25 – 29, 2018, Boadilla del Monte

Andrés M. Alonso Time series clustering

Outline

1 Introduction

2 Time series clustering by features.

3 Model based time series clustering

4 Time series clustering by dependence

A.M. Alonso, L. Cayuelas and A. Justel Time series clustering

Introduction
Introduction to clustering
Time series clustering by features
The problem
Model based time series clustering
Approaches
Time series clustering by dependence

What is the meaning of clustering?

Definition
Cluster analysis or clustering is the task of grouping a set of
objects in such a way that objects in the same group (called a
cluster) are more similar (in some sense or another) to each
other than to those in other groups (clusters).
Wikipedia

Key elements of the definition

Objects
Group (that can be hard or soft).
Similarity.


Algorithms for clustering

Connectivity-based clustering (hierarchical clustering)

Scotto, M., Alonso, A.M. and Barbosa, S. (2010) Clustering time series of
sea levels: an extreme value approach, Journal of Waterway, Port, Coastal,
and Ocean Engineering, 136, 215–225.
Centroid-based clustering
Maharaj, E.A., Alonso, A.M. and D’Urso, P. (2015) Clustering Seasonal
Time Series Using Extreme Value Analysis: An Application to Spanish
Temperature Time Series, Communications in Statistics - Case Studies and
Data Analysis, 1, 175–191.
(Model) Distribution-based clustering
Alonso, A.M., Berrendero, J.R., Hernández, A. and Justel, A. (2006) Time
series clustering based on forecast densities, Computational Statistics and
Data Analysis, 51, 762–766.
Density-based clustering


Examples of clustering algorithms

Connectivity-based clustering

These algorithms connect "objects" to form "clusters" based on their
distance/similarity.
A cluster can be described by the maximum distance needed to
connect parts of the cluster.
At different distances, different clusters will form, which can be
represented using a dendrogram.

[Figure: dendrogram with leaves BS, PR, BR, ES, FP, AC, NL, CH, NY, LW, KW, NN, WL, NW; distance axis from 0 to 1600.]
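The merging process behind a dendrogram can be sketched in a few lines. The example below is only an illustration: it assumes single linkage (the distance between two clusters is the distance between their closest members), and the function name `single_linkage` and the toy distance matrix are made up for this sketch, not taken from the slides.

```python
def single_linkage(dist):
    """Agglomerative single-linkage clustering on a symmetric distance matrix.

    Returns the merges as (cluster_a, cluster_b, height), where clusters are
    tuples of original object indices. The heights are the levels at which
    the branches of the dendrogram join.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = tuple(sorted(clusters[a] + clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Illustrative distances between four objects (made-up values).
toy = [
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 5.0, 8.0],
    [6.0, 5.0, 0.0, 2.0],
    [7.0, 8.0, 2.0, 0.0],
]
merges = single_linkage(toy)
for left, right, height in merges:
    print(left, right, height)
```

Cutting the sequence of merges at a given height yields the clusters that exist at that distance.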


Examples of clustering algorithms

Centroid-based clustering

Clusters are represented by a central "object", which may not necessarily
be a member of the data set.

k-means
k-medoids or PAM

[Figure: example plot; horizontal axis from 35 to 50, vertical axis from -22 to -2.]


Examples of clustering algorithms

(Model) Distribution-based clustering

The clustering model most closely related to statistics is based on
distribution models.
Clusters can then easily be defined as objects belonging most likely to
the same distribution/model.

[Figure (b): dendrogram-style plot with labels uk, lu, nl, se, cy, ie, es, at, fr, lv, no, be, ro, hu, de, si, fi; axes from 100 to 200 and from 0.0 to 0.4.]


Examples of clustering algorithms

Density-based clustering
Clusters are defined as areas of
higher density than the remainder of
the data set.
Objects in sparse areas are usually
considered to be noise and border
points.


The problem

Time series clustering problems arise when we observe a
sample of time series and we want to group them into different
categories or clusters.

This is a central problem in many application fields and hence
time series clustering is nowadays an active research area in
different disciplines including finance and economics, medicine,
engineering, seismology and meteorology, among others.


Approaches for time series clustering

Time series clustering by features.

Model based time series clustering.
Time series clustering by dependence.

Liao, T.W. (2005) Clustering of time series data - a survey, Pattern Recognition, 38,
1857–1874.
Aghabozorgi, S., Shirkhorshidi, A.S. and Wah, T.Y. (2015) Time-series clustering
– A decade review, Information Systems, 53, 16–38.


Approaches for time series clustering

Time series clustering by features.

Kakizawa, Y., Shumway, R.H. and Taniguchi, M. (1998) Discrimination and
clustering for multivariate time series, J. Am. Stat. Assoc., 93, 328–340.
Vilar, J.A. and Pértega, S. (2004) Discriminant and cluster analysis for
Gaussian stationary processes: Local linear fitting approach, J.
Nonparametr. Stat., 16, 443–462.
Caiado, J., Crato, N. and Peña, D. (2006) A periodogram-based metric for
time series classification, Comput. Statist. Data Anal. 50, 2668-2684.
Scotto, M., Alonso, A.M. and Barbosa, S. (2010) Clustering time series of
sea levels: an extreme value approach, Journal of Waterway, Port, Coastal,
and Ocean Engineering, 136, 215–225.
D’Urso, P., Maharaj, E.A. and Alonso, A.M. (2017) Fuzzy Clustering of Time
Series using Extremes, Fuzzy Sets and Systems, 318, 56–79.


Approaches for time series clustering

Model based time series clustering.

Alonso, A.M., Berrendero, J.R., Hernández, A. and Justel, A. (2006) Time
series clustering based on forecast densities, Computational Statistics and
Data Analysis, 51, 762–766.
Corduas, M., Piccolo, D. (2008) Time series clustering and classification by
the autoregressive metric, Comput. Statist. Data Anal., 52, 1860–1872.
Scotto, M.; Barbosa, S. and Alonso, A.M. (2009) Model-based clustering of
Baltic sea-level, Applied Ocean Research, 31, 4–11.
Vilar-Fernández, J.A., Alonso, A.M. and Vilar-Fernández, J.M. (2010)
Nonlinear time series clustering based on nonparametric forecast
densities, Computational Statistics and Data Analysis, 54, 2850–2865.

Time series clustering by dependence.

Alonso, A.M. and Peña, S. (2017) Clustering time series by dependency.
Preprint.


Packages for time series clustering

TSclust: Package for Time Series Clustering.

Montero, P and Vilar, J.A. (2014) TSclust: An R Package for Time
Series Clustering. Journal of Statistical Software, 62(1), 1-43.

dtwclust: Time Series Clustering Along with
Optimizations for the Dynamic Time Warping (DTW)
Distance.
https://github.com/asardaes/dtwclust


Introduction
Introduction
Raw data clustering
Time series clustering by features
Autocorrelation clustering
Model based time series clustering
Spectral domain clustering
Time series clustering by dependence
Extreme value clustering

Time series clustering by features

We have a set of univariate time series, $\mathcal{X} = \{X_1, X_2, \ldots, X_n\}$,
where $X_i = (X_{i,1}, X_{i,2}, \ldots, X_{i,T})$, and we want to cluster them.

Starting point
To choose a metric to assess the dissimilarity between two time
series.

This metric plays a crucial role in time series clustering.


Time series clustering by features

Conceptually most of the dissimilarity criteria proposed for time

series clustering lead to a notion of similarity relying on:
Proximity between raw series data.
Proximity between features of the time series.
Proximity between underlying generating processes.

Raw series data can be considered as naïve features of the

time series.


Raw data clustering

It consists of measuring the distance, D, between two time
series using an element-wise approach:

$$D(X_i, X_j) = d(X_i - X_j),$$

where $d$ is a distance on $\mathbb{R}^T$.

This approach has a drawback, since it requires the series to
be aligned.

[Figure: two panels, "Word NO" and "Word YES", each showing a series of about 8000 observations.]


Raw data clustering

[Figure: four panels, each showing a series of about 9000 observations.]

Euclidean distance matrix:

$$\begin{pmatrix}
0 & 14.1777 & 12.3613 & 12.7610 \\
14.1777 & 0 & 10.5822 & 11.3088 \\
12.3613 & 10.5822 & 0 & 8.0949 \\
12.7610 & 11.3088 & 8.0949 & 0
\end{pmatrix}$$

Raw data clustering

Dynamic time warping distance matrix:

$$\begin{pmatrix}
0 & 43.4941 & 70.2141 & 70.1087 \\
43.4941 & 0 & 75.1402 & 78.3575 \\
70.2141 & 75.1402 & 0 & 36.7705 \\
70.1087 & 78.3575 & 36.7705 & 0
\end{pmatrix}$$

Datafile <yesnot.xls>
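To make the contrast between the two raw-data distances concrete, here is a minimal sketch of both. It assumes the classical dynamic programming recursion for DTW with absolute difference as local cost; the slides do not state which local cost or step pattern produced the matrix above, and the two toy series are invented for illustration.

```python
import math


def euclidean(x, y):
    """Element-wise Euclidean distance between two aligned series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dtw(x, y):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Same bump, shifted by one time step (made-up data).
a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
b = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
print(euclidean(a, b), dtw(a, b))
```

The Euclidean distance penalizes the shift, while DTW realigns the two series before comparing them, which is why it is less sensitive to misalignment.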

Time series clustering by features

Raw data clustering is an interesting approach when we
expect differences in the level of the series.

Two AR(1) and two MA(1) time series:

[Figure: two panels showing the series, 1000 observations each.]

Euclidean distance matrix:

$$\begin{pmatrix}
0 & 51.206 & 48.735 & 51.184 \\
51.206 & 0 & 51.472 & 50.709 \\
48.735 & 51.472 & 0 & 51.669 \\
51.184 & 50.709 & 51.669 & 0
\end{pmatrix}$$


Autocorrelation clustering
But, in this case, autocorrelation functions are a "good"
clustering criterion:

[Figure: four panels with the sample autocorrelation functions of the series, lags 0 to 20.]


Autocorrelation clustering

Assume that we have two stationary series, X and Y:

$$H_0: \rho_X = (\rho_{X,1}, \rho_{X,2}, \ldots, \rho_{X,m})' = \rho_Y = (\rho_{Y,1}, \rho_{Y,2}, \ldots, \rho_{Y,m})'$$
$$H_1: \rho_X = (\rho_{X,1}, \rho_{X,2}, \ldots, \rho_{X,m})' \neq \rho_Y = (\rho_{Y,1}, \rho_{Y,2}, \ldots, \rho_{Y,m})'$$

where $\rho_{X,k}$ and $\rho_{Y,k}$ are the corresponding autocorrelations.

We can use the following test statistic:

$$T_{n,m} = n \sum_{k=1}^{m} (r_{X,k} - r_{Y,k})^2,$$

where $r_{X,k}$ and $r_{Y,k}$ are the estimated autocorrelations.
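As a rough illustration of the statistic, the sketch below computes sample autocorrelations and the scaled sum of squared differences. It assumes both series have the same length n and uses the usual biased autocorrelation estimator; the function names and the simulated AR(1) and white-noise series are hypothetical.

```python
import random


def sample_acf(x, m):
    """Sample autocorrelations r_1, ..., r_m of the series x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(1, m + 1)
    ]


def acf_statistic(x, y, m):
    """T_{n,m} = n * sum_k (r_{X,k} - r_{Y,k})^2 for two series of equal length."""
    rx, ry = sample_acf(x, m), sample_acf(y, m)
    return len(x) * sum((a - b) ** 2 for a, b in zip(rx, ry))


# Made-up data: an AR(1) series with coefficient 0.8 and a white-noise series.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
ar = [0.0]
for _ in range(199):
    ar.append(0.8 * ar[-1] + rng.gauss(0.0, 1.0))

print(acf_statistic(ar, noise, 5))
```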


Autocorrelation clustering

Tn,m can be used as a distance measure.

It is valid/correct when the series are independent.
But its distribution changes if the series X and Y are
cross-dependent.

So, we need a procedure to derive the distribution of Tn,m in

order to be able to evaluate if a given value tn,m is significantly
different from zero.


Autocorrelation clustering

Subsampling algorithm to obtain the distribution of $T_{n,m}$:

1 Let $\mathbf{X}_j = (X_j, X_{j+1}, \ldots, X_{j+l-1})$ and $\mathbf{Y}_j = (Y_j, Y_{j+1}, \ldots, Y_{j+l-1})$,
with $j = 1, 2, \ldots, n - l + 1$, be the subsamples of $l$ consecutive
observations from X and Y, respectively. We calculate the j-th
subsampling statistic, $T^{(j)}_{l,m}$, by:

$$T^{(j)}_{l,m} = l \sum_{k=1}^{m} (\hat\rho_{\mathbf{X}_j,k} - \hat\rho_{\mathbf{Y}_j,k})^2,$$

where $\hat\rho_{\mathbf{X}_j,k}$ and $\hat\rho_{\mathbf{Y}_j,k}$ are the k-th estimated autocorrelations
using the subsamples $\mathbf{X}_j$ and $\mathbf{Y}_j$, respectively.
2 We calculate $g_{n,l}(1 - \alpha)$, the $1 - \alpha$ quantile of $\hat G_{n,l}(\cdot)$, the empirical
distribution function of the subsampling statistics $T^{(j)}_{l,m}$.
3 We reject $H_0$ if and only if $T_{n,m} > g_{n,l}(1 - \alpha)$.
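The three steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the critical value is taken as the empirical quantile of the subsampling statistics, the block length `l` and the number of lags `m` are chosen arbitrarily, and all names are hypothetical.

```python
import math
import random


def sample_acf(x, m):
    """Sample autocorrelations r_1, ..., r_m of the series x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(1, m + 1)
    ]


def acf_statistic(x, y, m):
    """len(x) * sum_k (r_{X,k} - r_{Y,k})^2; used for full series and subsamples."""
    rx, ry = sample_acf(x, m), sample_acf(y, m)
    return len(x) * sum((a - b) ** 2 for a, b in zip(rx, ry))


def subsampling_test(x, y, m, l, alpha=0.05):
    """Return (T_{n,m}, critical value, reject H0?) using blocks of length l."""
    n = len(x)
    t_full = acf_statistic(x, y, m)
    # Step 1: statistic on every pair of blocks of l consecutive observations.
    subs = sorted(acf_statistic(x[j:j + l], y[j:j + l], m) for j in range(n - l + 1))
    # Step 2: empirical (1 - alpha) quantile of the subsampling statistics.
    index = min(len(subs) - 1, math.ceil((1 - alpha) * len(subs)) - 1)
    critical = subs[index]
    # Step 3: decision rule.
    return t_full, critical, t_full > critical


# Made-up data: two independent white-noise series of length 120.
rng = random.Random(7)
x = [rng.gauss(0.0, 1.0) for _ in range(120)]
y = [rng.gauss(0.0, 1.0) for _ in range(120)]
print(subsampling_test(x, y, m=5, l=30))
```

Because the blocks of X and Y are taken over the same time indices, any cross-dependence between the two series is preserved in the subsamples.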


Autocorrelation clustering - Example

Code and description of interest rate series:

Code            Description
BME08040203001  Reference rates/Banks/Short term prime rate
BME08040203002  Banks lending rates/Current account overdrafts/Effective rate
BME08040203003  Banks lending rates/Exceed in credit card/Effective rate
BME08040203005  Reference rates/Saving banks/Short term prime rate
BME08040203006  Savings banks lending rates/Current account overdrafts/Effective rate
BME08040203007  Savings banks lending rates/Credit account overdrafts/Effective rate


Autocorrelation clustering - Example

It is clear that the series are dependent.

[Figure: two panels, "Interest rates (Banks). January 1985 - December 2001" and "Interest rates (Saving banks). January 1985 - December 2001"; legend: short term prime rate, current account overdrafts, credit account overdrafts; horizontal axis from 0 to 250 months.]


Autocorrelation clustering - Example

Associated p-value for each pair of stationary series:

BME0804020300#    1      2      3      5      6      7
1                 -    0.000  0.000  0.155  0.000  0.000
2                        -    0.442  0.139  0.524  0.598
3                               -    0.065  0.623  0.909
5                                      -    0.008  0.000
6                                             -    0.262
7                                                    -

Alonso, A.M. and Maharaj, E.A. (2006) Comparison of time series using
subsampling, Computational Statistics and Data Analysis, 50,
2589–2599.

Datafile <BME.xls>


Autocorrelation clustering - Example

[Figure: dendrogram of the series BME0804020300#, leaves in the order 3, 7, 2, 6, 1, 5; vertical axis: 1 - p-value, from 0.1 to 0.9.]


Spectral domain clustering

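Assume that we have two stationary series, X and Y, with
spectral densities

$$\lambda_X(\omega) = \sum_{k=-\infty}^{\infty} \gamma_{X,k} \exp(-ik\omega)$$

and

$$\lambda_Y(\omega) = \sum_{k=-\infty}^{\infty} \gamma_{Y,k} \exp(-ik\omega).$$

As before, we are interested in testing:

$$H_0: \lambda_X(\omega) = \lambda_Y(\omega) \quad (0 \le \omega \le \pi)$$
$$H_1: \lambda_X(\omega) \neq \lambda_Y(\omega)$$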


Spectral domain clustering

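Diggle and Fisher (1991) propose to compare the integrated
periodograms:

$$F_X(\omega_j) = \sum_{i=1}^{j} I_X(\omega_i) \Big/ \sum_{i=1}^{m} I_X(\omega_i)$$

and

$$F_Y(\omega_j) = \sum_{i=1}^{j} I_Y(\omega_i) \Big/ \sum_{i=1}^{m} I_Y(\omega_i),$$

where $\omega_i = 2\pi i/n$, $I_X(\cdot)$ is the periodogram, and
$m = \lceil (n - 1)/2 \rceil$.
We can use the following test statistics:

$$D_m = \sup_{\omega} |F_X(\omega) - F_Y(\omega)| \quad \text{or} \quad W_m = \int_0^{\pi} (F_X(\omega) - F_Y(\omega))^2 \, d\bar F(\omega).$$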
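A minimal sketch of the statistic $D_m$ follows. It evaluates the periodogram with a direct (slow) discrete Fourier transform at the frequencies $\omega_i = 2\pi i/n$ and takes the supremum over those frequencies only; the normalizing constant of the periodogram is irrelevant because it cancels in the ratio. The function names and the two sine series are illustrative assumptions.

```python
import cmath
import math


def periodogram(x):
    """Periodogram at the Fourier frequencies w_i = 2*pi*i/n, i = 1, ..., m."""
    n = len(x)
    mean = sum(x) / n
    m = math.ceil((n - 1) / 2)
    values = []
    for i in range(1, m + 1):
        w = 2 * math.pi * i / n
        dft = sum((x[t] - mean) * cmath.exp(-1j * w * t) for t in range(n))
        values.append(abs(dft) ** 2 / n)
    return values


def integrated_periodogram(x):
    """Normalized cumulative periodogram F(w_j), j = 1, ..., m."""
    values = periodogram(x)
    total = sum(values)
    cumulative, running = [], 0.0
    for v in values:
        running += v
        cumulative.append(running / total)
    return cumulative


def d_statistic(x, y):
    """D_m: largest absolute gap between the two integrated periodograms."""
    fx, fy = integrated_periodogram(x), integrated_periodogram(y)
    return max(abs(a - b) for a, b in zip(fx, fy))


# Made-up data: a slow and a fast sine wave, both of length 64.
n = 64
slow = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
fast = [math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
print(d_statistic(slow, fast))
```

The slow series concentrates its spectral mass at a low frequency and the fast series at a high one, so their integrated periodograms differ by almost one over the intermediate frequencies.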


Spectral domain clustering

We return to the word classification problem (boat versus goat):

[Figure: plot for the word classification problem; vertical axis from 0.05 to 0.5, with 0/1 labels along the horizontal axis.]

Alonso, A.M., Casado, D., Lopez-Pintado, S. and Romo, J. (2014)

Robust Functional Classification for Time Series, Journal of
Classification, 31, 325–350.

Extreme value clustering

In some applications, the main interest is the highest (lowest)
level that we can observe in a time series in a given period.

To build dykes you need to know the maximum level of the
sea in the area that you want to protect.
Rising sea levels are of great concern to coastal
communities around the world.
To prevent the effects of temperature on health, you need
information about the highest temperature.
In finance/insurance, the lowest values correspond to
capital losses.


Extreme value clustering

The Generalized Extreme Value (GEV) distribution has the following
form:

$$G(x) = \exp\left\{ -\left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\xi} \right\} \qquad (1)$$

defined on $\{x : 1 + \xi (x - \mu)/\sigma > 0\}$, where $-\infty < \mu < \infty$, $\sigma > 0$,
and $-\infty < \xi < \infty$.
The three parameters $\mu$, $\sigma$ and $\xi$ are the location, scale
and shape parameters, respectively, where $\xi$ determines
the three extreme value types.
When $\xi < 0$, $\xi > 0$ or $\xi = 0$, the GEV distribution is the
negative Weibull, the Fréchet or the Gumbel distribution,
respectively.

Extreme value clustering

GEV distribution fitting

The GEV log-likelihood function presents a difficulty if the
number of extreme events is small.

It is particularly severe when the method of maxima over

fixed intervals is used.

A possible solution is to consider the r -largest values over

fixed intervals (Coles 2001).


Extreme value clustering

GEV distribution fitting
The number of largest values per year, r , should be chosen
carefully.
Small values will produce likelihood estimators with high
variance, whereas large values will produce biased
estimates.

In practice, r is selected as large as possible subject to

adequate model diagnostics.

The validity of the models can be checked through the

application of graphical methods (Reiss and Thomas,
2000).

Extreme value clustering

The implications of a fitted extreme value model are
usually made with reference to extreme quantiles.

By inversion of the GEV distribution function, the quantile
$x_p$ for a specified exceedance probability $p$ is:

for $\xi \neq 0$, $x_p = \mu - \frac{\sigma}{\xi} \left[ 1 - \{-\log(1 - p)\}^{-\xi} \right]$;
for $\xi = 0$, $x_p = \mu - \sigma \log[-\log(1 - p)]$.
$x_p$ is referred to as the return level associated with the return
period $1/p$.

$x_p$ is expected to be exceeded by the annual maximum in
any particular year with probability $p$.
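The return level formula can be checked numerically by plugging $x_p$ back into the distribution function, which should give $1 - p$. The sketch below does this; the parameter values are made up for illustration and are not estimates from the slides.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function G(x); the Gumbel case is used when xi == 0."""
    if xi == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    z = 1.0 + xi * (x - mu) / sigma
    if z <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / xi))


def return_level(p, mu, sigma, xi):
    """Quantile x_p exceeded with probability p (return period 1/p)."""
    y = -math.log(1.0 - p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters for annual maximum temperatures (illustrative only).
mu, sigma, xi = 35.0, 2.0, -0.2
for years in (25, 100):
    level = return_level(1.0 / years, mu, sigma, xi)
    print(years, level, gev_cdf(level, mu, sigma, xi))
```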


Extreme value clustering - Example

We consider 52 time series of daily maximum temperatures (in
degrees Celsius, °C) observed in Spain from 1990 to 2004.

[Figure: daily series of about 5500 observations for (a) Cantabria, (b) Comunidad de Madrid and (c) Región de Murcia.]


Extreme value clustering - Example

Box-plot of the exceedances (1990-2004) above (below) the
95% (5%) percentile during summer (winter) period.

[Figure: panels a) and b), box-plots for Cantabria, Comunidad de Madrid and Región de Murcia; in panel a) the vertical axis runs from 28 to 46.]


Extreme value clustering - Example

(a) Two clusters based on GEV estimates for highest temperatures

(b) Two clusters based on GEV estimates for lowest temperatures


Extreme value clustering - Example

(a) Two clusters based on two sets of GEV estimates

(b) Three clusters based on two sets of GEV estimates


Extreme value clustering - Example

Means of the 25- and 100-year return levels with 95%
confidence intervals for the three clusters based on GEV
estimates:

Cluster  Season  25 yr   95% CI           100 yr  95% CI
1        summer  39.12   (38.33, 39.91)   39.61   (38.63, 40.60)
1        winter  -0.63   (-1.41, 0.15)    -1.31   (-2.40, -0.23)
2        summer  43.08   (42.33, 43.83)   43.67   (42.68, 44.66)
2        winter   4.87   (4.15, 5.59)      4.25   (3.29, 5.20)
3        summer  38.37   (37.30, 39.44)   39.63   (37.75, 41.51)
3        winter   8.76   (8.03, 9.48)      8.10   (7.13, 9.07)

Datafile <SpainTemperature.xls>
GEV estimates <SpainTemperatureEstimates.xls>
Introduction
Introduction
Time series clustering by features
Forecast density clustering
Model based time series clustering
Multivariate models with cluster structure
Time series clustering by dependence

Model based time series clustering

We need to define a distance in the space of the parameters of
the models:
Let us assume that $\{X_t\}_{t \in \mathbb{Z}}$ and $\{Y_t\}_{t \in \mathbb{Z}}$ follow an
ARIMA(p, d, q) model with $\Phi_X(B)(1 - B)^d X_t = \Theta_X(B)\varepsilon_{X,t}$
and $\Phi_Y(B)(1 - B)^d Y_t = \Theta_Y(B)\varepsilon_{Y,t}$. Then, we can use:

$$d(X, Y) = (\Xi_X - \Xi_Y)' \Sigma_{\Xi}^{-1} (\Xi_X - \Xi_Y),$$

where $\Xi_X = (\phi_{X,1}, \phi_{X,2}, \ldots, \phi_{X,p}, \theta_{X,1}, \theta_{X,2}, \ldots, \theta_{X,q})'$ and
$\Xi_Y = (\phi_{Y,1}, \phi_{Y,2}, \ldots, \phi_{Y,p}, \theta_{Y,1}, \theta_{Y,2}, \ldots, \theta_{Y,q})'$.


Model based time series clustering

If the ARIMA(p, d, q) model is invertible, then we can write
it as AR models: $\Pi_X(B)X_t = \varepsilon_{X,t}$ and $\Pi_Y(B)Y_t = \varepsilon_{Y,t}$.
Then the following distance can be used (Piccolo, 1990):

$$d(X, Y) = \left( \sum_{j=1}^{\infty} (\pi_{X,j} - \pi_{Y,j})^2 \right)^{1/2}.$$

For stationary ARMA(p, q) models, we can define a similar
measure using the moving average representation:
$X_t = \Psi_X(B)\varepsilon_{X,t}$ and $Y_t = \Psi_Y(B)\varepsilon_{Y,t}$ (Galeano and Peña,
2000):

$$d(X, Y) = \left( \sum_{j=1}^{\infty} (\psi_{X,j} - \psi_{Y,j})^2 \right)^{1/2}.$$
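The autoregressive distance can be approximated by truncating the infinite sum. The sketch below assumes the sign conventions $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $\Theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$ and $\Pi(B) = 1 - \sum_j \pi_j B^j$, which the slides do not spell out; the truncation point `n_terms` and the function names are also assumptions.

```python
import math


def pi_weights(phi, theta, n_terms=50):
    """First n_terms AR(infinity) weights of an invertible ARMA model.

    Pi(B) = Phi(B) / Theta(B) is obtained by power-series division, and
    pi_j is minus the coefficient of B^j in the quotient.
    """
    phi_poly = [1.0] + [-c for c in phi]
    theta_poly = [1.0] + list(theta)
    quotient = []
    for j in range(n_terms + 1):
        value = phi_poly[j] if j < len(phi_poly) else 0.0
        for i in range(1, min(j, len(theta_poly) - 1) + 1):
            value -= theta_poly[i] * quotient[j - i]
        quotient.append(value)
    return [-c for c in quotient[1:]]


def ar_distance(model_x, model_y, n_terms=50):
    """Truncated Euclidean distance between the AR(infinity) weights."""
    px = pi_weights(*model_x, n_terms=n_terms)
    py = pi_weights(*model_y, n_terms=n_terms)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px, py)))


# Each model is a pair (phi coefficients, theta coefficients); values are illustrative.
ar_a = ([0.5], [])
ar_b = ([0.2], [])
ma_c = ([], [0.5])
white_noise = ([], [])
print(ar_distance(ar_a, ar_b), ar_distance(ma_c, white_noise))
```

For two AR(1) models the distance reduces to the absolute difference of the coefficients, which gives a quick sanity check of the recursion.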


Model based time series clustering

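For stationary and invertible ARMA(p, q) models, Maharaj
(1996) proposes a test that can be used as a distance
among models.

$$Z = \begin{pmatrix} X \\ Y \end{pmatrix} = W\pi + \varepsilon,$$

where $W = \begin{pmatrix} W_X & 0 \\ 0 & W_Y \end{pmatrix}$, $W_X$ and $W_Y$ are $(T - k) \times k$
matrices of lagged observations,
$\pi = [\pi_X' \; \pi_Y']'$, and $\varepsilon = [\varepsilon_X' \; \varepsilon_Y']'$.

$E[\varepsilon] = 0$, $E[\varepsilon\varepsilon'] = V = \Sigma \otimes I_{n-k}$, and $\Sigma = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{yx} & \sigma_y^2 \end{pmatrix}$.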


Model based time series clustering

Under $H_0: \pi_X = \pi_Y$, the following statistic is distributed as $\chi^2_k$
(Maharaj, 2000):

$$D = (R\hat\pi)' \left[ R (W' \hat V^{-1} W)^{-1} R' \right]^{-1} (R\hat\pi),$$

where $\hat V$ is the least squares estimator of $V$, $\hat\pi$ is the least
squares estimator of $\pi$, and $R = [I_k \; \vdots \; -I_k]$.

The statistic D can be used as a distance measure between
X and Y.


Model based time series clustering - Example

We use Maharaj's approach for demographic data in
Alonso, A.M., Peña, D. and Rodríguez, J. (2013) Predicción de clusters
de series temporales demográficas, MedULA, 22 (1), 25-28.

[Figure: log-mortality rates, 1970 to 2000, in two panels: Females and Males; vertical axes from -10 to -2.]


Model based time series clustering - Example

Why do we cluster models?

[Figure: two panels. Mean absolute prediction error (0.08 to 0.085) against the number of considered models (20 to 160), and MAPE reduction (-6 to 1) against the prediction horizon (0 to 25); legend: 20 models, 50 models, 100 models.]


Forecast density clustering

Most of the distances or dissimilarity criteria proposed up to this
point rely on the proximity between raw series data (or their features),
or on the proximity between underlying generating processes.

In both cases, the classification task becomes inherently static,
since similarity searching is governed only by the behavior of
the series over their periods of observation.

In some practical situations, the real interest of clustering lies in the
future behavior and, in particular, in how the forecasts at a
specific horizon can be grouped.


Forecast density clustering

The clusters will be different if we
consider:
the models;
the last available observation;
the future values.

[Figure: series plotted over the time range 0 to 70 with the "Present" marked; vertical axis from 5 to 25.]


Forecast density clustering

Why prediction clustering?
It reduces the high dimensionality of the problem.
The predictions include information about the past
observations and about the data generating model.
In some problems, the interest is in the future behaviour or
in whether the series converge or not to some level:
Sustainable development.
(European) convergence of macroeconomic indicators.
Convergence of β-type (see, Barro and Sala-i-Martin, 1995).
Carvalho and Harvey (2005) analyze the short- and
long-term convergence of the per capita income in the Euro
zone.
Emissions of CO2 (Kyoto Protocol).


Forecast density clustering

Point predictions or prediction densities?
Suppose that we have series whose point predictions
are similar (or equal).
Example: The prediction of financial asset returns is $E[r_t] = 0$.
We want to distinguish among the following situations:

[Figure: two density plots on the range -5 to 5; vertical axes from 0 to 0.45.]


Forecast density clustering

Point predictions or prediction densities?

[Figure: curves for Australia, Finland, Luxembourg and USA, with tick marks on the horizontal axis; horizontal axis from 5 to 45, vertical axis from -0.05 to 0.35.]

Real data example - Kyoto protocol


Forecast density clustering

Steps for clustering procedure

1 Prediction calculation by bootstrap.

2 Dissimilarity matrix calculation by non-parametric kernel
estimators.
For each pair of series, X and Y, we calculate the $L_2$ ($L_1$)
distance between the prediction densities:

$$D_{ij} = \int \left| f_{X_{T+h}}(x) - f_{Y_{T+h}}(x) \right|^p dx,$$

where p = 1, 2.
3 Finally, we use classical clustering procedures that allow
distances as inputs.
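Step 2 can be sketched with a Gaussian kernel density estimate and numerical integration on a grid. This is an illustration only: the fixed bandwidth, the grid size, the trapezoidal rule and the simulated "bootstrap predictions" are assumptions made here, not choices documented in the slides.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate built from the sample."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

    return density


def density_distance(sample_x, sample_y, p=1, bandwidth=0.5, grid_size=400):
    """Approximate the integral of |f_X - f_Y|^p with the trapezoidal rule."""
    lo = min(min(sample_x), min(sample_y)) - 4.0 * bandwidth
    hi = max(max(sample_x), max(sample_y)) + 4.0 * bandwidth
    step = (hi - lo) / grid_size
    f, g = gaussian_kde(sample_x, bandwidth), gaussian_kde(sample_y, bandwidth)
    total = 0.0
    for i in range(grid_size + 1):
        x = lo + i * step
        weight = 0.5 if i in (0, grid_size) else 1.0
        total += weight * abs(f(x) - g(x)) ** p
    return total * step


# Made-up predictions with the same mean but different spread.
rng = random.Random(3)
narrow = [rng.gauss(0.0, 1.0) for _ in range(200)]
wide = [rng.gauss(0.0, 3.0) for _ in range(200)]
d1 = density_distance(narrow, wide, p=1)
d2 = density_distance(narrow, wide, p=2)
print(d1, d2)
```

Both samples are centered at zero, so a clustering based on point predictions would not separate them, whereas the distance between densities is clearly positive.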


Forecast density clustering

Prediction step

A general class of autoregressive processes

Let $\{X_t\}_{t \in \mathbb{Z}}$ be a real-valued stationary process such that

$$X_t = m(\mathbf{X}_{t-1}) + \varepsilon_t,$$

where
$\{\varepsilon_t\}$ is an i.i.d. sequence,
$\mathbf{X}_{t-1}$ is a d-dimensional vector of known lagged variables, and
$m(\cdot)$ is assumed to be a smooth function but it is not restricted to
any pre-specified parametric model.

Of course, other models can be considered.

Forecast density clustering

Prediction step
1 Estimate $m$ using a Nadaraya-Watson estimator $\hat m_{g_1}$.
2 Compute the nonparametric residuals, $\hat\varepsilon_t = X_t - \hat m_{g_1}(\mathbf{X}_{t-1})$.
3 Construct a kernel estimate, $\hat f_{\tilde\varepsilon,h}$, of the density function
associated to the centered residual.
4 Draw a bootstrap-resample $\varepsilon^*_t$ of i.i.d. data from $\hat f_{\tilde\varepsilon,h}$.
5 Define the bootstrap series $X^*_t$ by $X^*_t = \hat m_{g_1}(\mathbf{X}^*_{t-1}) + \varepsilon^*_t$.
6 Obtain the bootstrap autoregressive function, $\hat m^*_{g_2}$, using the
bootstrap sample $(X^*_1, \ldots, X^*_T)$.
7 Compute bootstrap prediction-paths by $X^*_t = \hat m^*_{g_2}(\mathbf{X}^*_{t-1}) + \varepsilon^*_t$, for
$t = T + 1, \ldots, T + H$, and $X^*_t = X_t$, for $t \le T$.
8 Repeat Steps (1)-(7) a large number B of times.
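The steps above can be sketched for the simplest case of a single lag (d = 1). This is a simplified illustration, not the authors' code: it assumes a Gaussian kernel, fixed bandwidths `g1`, `g2` and `h` chosen by hand, the first observation as starting value of each bootstrap series, and it only repeats Steps 4 to 7 inside the loop because Steps 1 to 3 give the same result on every repetition.

```python
import math
import random


def nw_estimator(series, g):
    """Nadaraya-Watson estimate of m(x) = E[X_t | X_{t-1} = x], Gaussian kernel."""
    pairs = list(zip(series[:-1], series[1:]))

    def m_hat(x):
        weights = [math.exp(-0.5 * ((x - past) / g) ** 2) for past, _ in pairs]
        total = sum(weights)
        if total == 0.0:
            # x is far from every observed lag: fall back to the sample mean
            return sum(nxt for _, nxt in pairs) / len(pairs)
        return sum(w * nxt for w, (_, nxt) in zip(weights, pairs)) / total

    return m_hat


def bootstrap_predictions(series, horizon, n_boot, g1, g2, h, rng):
    """Bootstrap replicates of X_{T+horizon} for a lag-1 nonparametric AR model."""
    T = len(series)
    m1 = nw_estimator(series, g1)                                  # Step 1
    resid = [series[t] - m1(series[t - 1]) for t in range(1, T)]   # Step 2
    mean_resid = sum(resid) / len(resid)
    centered = [r - mean_resid for r in resid]

    def draw():
        # Steps 3-4: sampling from a Gaussian kernel density estimate equals
        # picking a centered residual at random and adding N(0, h^2) noise.
        return rng.choice(centered) + h * rng.gauss(0.0, 1.0)

    predictions = []
    for _ in range(n_boot):                                        # Step 8
        boot = [series[0]]
        for _ in range(1, T):                                      # Step 5
            boot.append(m1(boot[-1]) + draw())
        m2 = nw_estimator(boot, g2)                                # Step 6
        path = list(series)                                        # Step 7: X*_t = X_t for t <= T
        for _ in range(horizon):
            path.append(m2(path[-1]) + draw())
        predictions.append(path[-1])
    return predictions


# Made-up AR(1)-type series of length 60.
rng = random.Random(11)
data = [0.0]
for _ in range(59):
    data.append(0.6 * data[-1] + rng.gauss(0.0, 1.0))

preds = bootstrap_predictions(data, horizon=3, n_boot=50, g1=0.5, g2=0.5, h=0.3, rng=rng)
print(len(preds), min(preds), max(preds))
```

The resulting replicates of $X_{T+H}$ are the raw material for the kernel density estimates used in the dissimilarity step.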

Forecast density clustering

Dissimilarity calculation step

In practice, distances $D_{p,XY}$ are consistently approximated by
replacing the unknown $f_{X_{T+b}}$ by kernel-type density estimates
$\hat f_{X^*_{T+b}}$ constructed on the basis of bootstrap predictions, that is

$$\hat D^*_{p,XY} = \int \left| \hat f_{X^*_{T+b}}(x) - \hat f_{Y^*_{T+b}}(x) \right|^p dx, \quad i, j = 1, \ldots, s,$$

for p = 1, 2.


Forecast density clustering

Clustering step

Application of an agglomerative hierarchical cluster algorithm

Once the pairwise dissimilarity matrix $\hat D^*_p = (\hat D^*_{p,XY})$ is
obtained, a standard agglomerative hierarchical clustering
algorithm based on $\hat D^*_p$ is carried out.


Forecast density clustering - Example

Dataset: Emissions of CO2 in 24 industrialized countries.
[Figure: CO2 emission series, 1960 to 1998, for Australia, Austria, Belgium, Canada, China, Cyprus, Denmark, Finland, France, Greece, Hungary, Ireland, Italy, Japan, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Spain, Sweden, United Kingdom and United States; vertical axis from 0 to 45.]

Forecast density clustering - Example

Dendrogram based on the last available observation

[Figure: dendrogram of the 24 countries, with country codes as leaf labels.]

Forecast density clustering - Example

Dendrogram based on the point prediction for 2012

[Figure: dendrogram of the 24 countries, with country codes as leaf labels.]

[Figure: curves for Australia, Finland, Luxembourg and USA (horizontal axis from 5 to 45).]

Forecast density clustering - Example

Dendrogram based on the prediction densities for 2012

[Figure: dendrogram of the 24 countries, with country codes as leaf labels.]

Multivariate models with cluster structure

Dynamic factor models:
When the number of series is large, VARMA models are hard to
build or even infeasible.

Dynamic Factor Models can deal with large sets of time series.
Engle and Watson (1981), Peña and Box (1987), Forni et al.
(2000), Bai and Ng (2002), Peña and Poncela (2006), Hallin
and Liska (2007), Alonso et al. (2011), Lam and Yao (2012),
Forni et al. (2015, 2016, 2017).

For large panels of time series we often find group structure,
with different factors affecting different groups.
Hallin and Liska (2011), Su et al. (2014) and Ando and Bai
(2016, 2017).

Multivariate models with cluster structure

Dynamic factor models with cluster structure:
Let $x_t = (x_{1t}, \ldots, x_{mt})'$ be an m-dimensional vector time series.

$$x_t = P_0 f_{0t} + \sum_{i=1}^{k} P_i f_{it} + n_t,$$

where
$f_{0t} = (f_{01t}, \ldots, f_{0 r_0 t})'$ is an $r_0$-dimensional vector of common
factors, $P_0$ is an $m \times r_0$ factor loading matrix and $k$ is the number
of clusters.
$f_{it} = (f_{i1t}, \ldots, f_{i r_i t})'$ is an $r_i$-dimensional vector of group-specific
factors corresponding to the ith cluster and $P_i$ is the $m \times r_i$ factor
loading matrix of these specific factors. The columns of the matrix $P_i$
are of the form $(0, \ldots, 0, p_{j1}, \ldots, p_{j m_i}, 0, \ldots, 0)'$, for $j = 1, \ldots, r_i$.
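
To make the block structure of the loadings concrete, the following sketch simulates data from a model of this form. It is only an illustration under stated assumptions: the factors and the noise are drawn as independent Gaussian white noise (the slides do not specify their dynamics), the series of each cluster are stored contiguously, and `simulate_clustered_dfm` is a hypothetical name.

```python
import numpy as np


def simulate_clustered_dfm(sizes, r0=1, ri=1, T=200, noise_sd=0.5, seed=0):
    """Simulate x_t = P0 f_0t + sum_i P_i f_it + n_t with block loadings.

    sizes : number of series m_i in each of the k clusters.
    Returns the (T, m) data matrix, P0 and the list of P_i matrices.
    """
    rng = np.random.default_rng(seed)
    m = sum(sizes)
    f0 = rng.standard_normal((T, r0))     # common factors (white noise: assumption)
    P0 = rng.standard_normal((m, r0))     # every series loads on them
    x = f0 @ P0.T
    group_loadings = []
    start = 0
    for m_i in sizes:
        Pi = np.zeros((m, ri))            # zeros outside cluster i
        Pi[start:start + m_i, :] = rng.standard_normal((m_i, ri))
        fi = rng.standard_normal((T, ri)) # group-specific factors
        x += fi @ Pi.T
        group_loadings.append(Pi)
        start += m_i
    x += noise_sd * rng.standard_normal((T, m))
    return x, P0, group_loadings


x, P0, group_loadings = simulate_clustered_dfm(sizes=(3, 4, 2))
print(x.shape)
print(group_loadings[1].ravel())  # nonzero only for series 4 to 7
```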


Multivariate models with cluster structure

Ando, T. and Bai J. (2016) Panel data models with grouped factor
structure under unknown group membership, Journal of Applied
Econometrics, 31, 163–191.
Ando, T. and Bai J. (2017) Clustering huge number of financial
time series: A panel data approach with high-dimensional
predictor and factor structures, Journal of the American
Statistical Association, in press.

Implemented in JAE1.R, JAE2.R and JASA.R


Multivariate models with cluster structure

We must provide the number of clusters, $k$, the
number of common factors, $r$, and the number of
group-specific factors, $r_i$.

Ando and Bai (2017) provide a procedure for
selecting $k$, $r$ and $r_i$, but it is computationally intensive:
$k = 1, 2, \ldots, K$.
$r = 0, 1, \ldots, R$.
$r_i = 1, \ldots, R$.

An information criterion is used to select these parameters.
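
A rough way to see why the search is expensive is to count the candidate configurations. The sketch below enumerates the grid on the slide under the assumption that each cluster may have its own number of group-specific factors; it does not fit any model, and `candidate_configurations` is a hypothetical helper, not part of the R scripts mentioned earlier.

```python
from itertools import product


def candidate_configurations(K, R):
    """Yield (k, r, (r_1, ..., r_k)) over the grid k<=K, r<=R, r_i<=R."""
    for k in range(1, K + 1):
        for r in range(0, R + 1):
            for r_groups in product(range(1, R + 1), repeat=k):
                yield k, r, r_groups


n_small = sum(1 for _ in candidate_configurations(K=3, R=2))
n_large = sum(1 for _ in candidate_configurations(K=5, R=5))
print(n_small, n_large)
```

Each configuration requires estimating the full model before the information criterion can be evaluated, so the count grows quickly with K and R.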


Multivariate models with cluster structure - Example

Dataset: Mortality rates by single age, Spain 1908 - 2015.
[Figure: log-mortality rate by year, 1908 to 2015, with the Spanish influenza pandemic and the Civil War period marked.]

Multivariate models with cluster structure - Example

We use k = 3, r = 1 and ri = 1:
[Figure: log-mortality rates by year, 1908 to 2015.]

Introduction Introduction
Time series clustering by features A dissimilarity measure based on mutual dependency
Model based time series clustering The clustering procedure
Time series clustering by dependence Case-studies with real data

Time series clustering by dependence

Up to this point, the classification task has been inherently
univariate: similarity is governed only by the
behavior of each series and does not take into account the
cross-dependency among the series.
Suppose that we have stationary (standardized) time series.
Define $r_{xx}(h) = E(x_{it} x_{i,t-h})$ and $r_{xy}(h) = E(x_{it} y_{j,t-h})$.
We can build a measure of the dependency as follows:

Let $B(h) = \begin{pmatrix} r_{xx}(h) & r_{xy}(h) \\ r_{yx}(h) & r_{yy}(h) \end{pmatrix}$.

Time series clustering by dependence

Then the matrix

$$B_k = \begin{pmatrix} B(0) & B(1) & \cdots & B(k) \\ B(-1) & B(0) & \cdots & B(k-1) \\ \vdots & \vdots & \ddots & \vdots \\ B(-k) & B(-k+1) & \cdots & B(0) \end{pmatrix}$$

is the covariance matrix of the vector stationary process

$Z_t = (x_t, y_t, x_{t-1}, y_{t-1}, \ldots, x_{t-k}, y_{t-k})^T$.

A dissimilarity measure based on mutual dependency

A convenient measure of dissimilarity based on their joint
dependency is
$$D(X, Y) = |B_k|^{1/(2(k+1))}.$$

Notice that $0 \le |B_k| \le 1$, with equality to one when $B_k$ is
diagonal.
This measure is non-negative and symmetric, and it is
zero if $x = y$.
The dissimilarity reaches its largest value, one, when the
two series are independent and serially uncorrelated (so that $B_k$ is diagonal), and it is zero if they are
identical.
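
As a worked illustration of the measure (a sketch for the simplest case, where the symbol $\rho$ is introduced here and is not in the slides), take $k = 0$, so that only the contemporaneous correlation $\rho = r_{xy}(0)$ of the standardized series enters:

```latex
B_0 = B(0) = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},
\qquad
|B_0| = 1 - \rho^2,
\qquad
D(X, Y) = |B_0|^{1/2} = \sqrt{1 - \rho^2}.
% Examples: rho = 0 gives D = 1; rho = 0.6 gives D = 0.8; rho = +1 or -1 gives D = 0.
```

In this case the dissimilarity depends only on $|\rho|$, so strongly negatively correlated series are treated as close, just like strongly positively correlated ones.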


A dissimilarity measure based on mutual dependency

Note that

$$|B_k| = |R(x)_k| \cdot \left| R(y)_k - R(y, x)_k R(x)_k^{-1} R(x, y)_k \right|.$$

It should be noticed that if $x$ is integrated then $|R(x)_k|$ will be
close to zero and the product will be small whatever the second
term is.
This suggests the alternative measure

$$RD(X, Y) = |B_k|^{1/(2(k+1))} / \left( |R(x)_k| \cdot |R(y)_k| \right)^{1/(2(k+1))},$$

which does not have this limitation.
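
The two measures can be sketched with sample correlations as follows. This is a minimal illustration, assuming that $B_k$ is estimated by the sample correlation matrix of the lagged vector $Z_t$ and that $R(x)_k$ and $R(y)_k$ are read off as its sub-blocks; the function names are hypothetical, and a highly persistent AR(1) stands in for the near-integrated case.

```python
import numpy as np


def lagged_block(x, y, k):
    """Rows are Z_t = (x_t, y_t, x_{t-1}, y_{t-1}, ..., x_{t-k}, y_{t-k})."""
    T = len(x)
    cols = []
    for h in range(k + 1):
        cols += [x[k - h:T - h], y[k - h:T - h]]
    return np.column_stack(cols)


def _logdet(M):
    # Log-determinant, more stable than det() for near-singular matrices.
    return np.linalg.slogdet(M)[1]


def dependence_dissimilarities(x, y, k):
    """Return (D, RD) computed from the sample correlation matrix of Z_t."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    B = np.corrcoef(lagged_block(x, y, k), rowvar=False)
    Rx, Ry = B[0::2, 0::2], B[1::2, 1::2]   # blocks for the x and y lags
    power = 1.0 / (2 * (k + 1))
    D = np.exp(power * _logdet(B))
    RD = np.exp(power * (_logdet(B) - _logdet(Rx) - _logdet(Ry)))
    return float(D), float(RD)


def ar1(phi, T, rng):
    e = rng.standard_normal(T)
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = phi * x[t - 1] + e[t]
    return x


rng = np.random.default_rng(1)
x = ar1(0.95, 3000, rng)   # two independent, highly persistent series
y = ar1(0.95, 3000, rng)
D_xy, RD_xy = dependence_dissimilarities(x, y, k=1)
print(D_xy, RD_xy)
```

In this example the two series are independent, yet D is well below one because each series is strongly autocorrelated; RD removes that effect and stays close to one.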


The clustering procedure:

We use the dissimilarity defined by

$$RD(X, Y) = |B_k|^{1/(2(k+1))} / \left( |R(x)_k| \cdot |R(y)_k| \right)^{1/(2(k+1))}$$

as the input to an agglomerative hierarchical clustering.

The clustering procedure

The nonlinear features of some time series, for instance
volatility and nonlinear behavior, are not captured by
measures such as the simple or partial autocorrelation.

We know that these nonlinear features can be revealed by the
autocorrelation of the absolute values or the squares of the residuals
of a linear fit.

Suppose that we fit an AR(p) model to the series, where p is
chosen by the AIC or BIC criterion, and we obtain:
$$e_t = y_t - \hat{\pi}_1 y_{t-1} - \cdots - \hat{\pi}_p y_{t-p}.$$
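
The residual-based idea can be sketched as follows: fit an AR(p) by least squares, choose p by AIC, and compare the autocorrelations of the residuals with those of their squares. The ARCH-type data-generating process and all function names are assumptions made for this illustration only.

```python
import numpy as np


def ar_residuals(y, p):
    """Residuals e_t of an AR(p) fitted by ordinary least squares."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    target = y[p:]
    if p == 0:
        return target
    lags = np.column_stack([y[p - j:len(y) - j] for j in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return target - lags @ coef


def select_order_aic(y, max_p):
    """Order minimizing AIC = n log(sigma^2) + 2p on a common sample."""
    best_p, best_aic = 0, np.inf
    for p in range(max_p + 1):
        e = ar_residuals(y, p)[max_p - p:]   # same effective sample for all p
        aic = len(e) * np.log(np.mean(e ** 2)) + 2 * p
        if aic < best_aic:
            best_p, best_aic = p, aic
    return best_p


def acf(x, nlags):
    """Sample autocorrelations at lags 1, ..., nlags."""
    x = x - x.mean()
    return np.array([x[h:] @ x[:-h] for h in range(1, nlags + 1)]) / (x @ x)


# AR(1) with ARCH(1) innovations: linear dependence plus volatility clustering.
rng = np.random.default_rng(2)
T = 5000
eps = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    sigma2 = 0.5 + 0.4 * eps[t - 1] ** 2
    eps[t] = np.sqrt(sigma2) * rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + eps[t]

p_hat = select_order_aic(y, max_p=5)
e = ar_residuals(y, p_hat)
acf_e = acf(e, 5)        # close to zero: linear structure removed
acf_e2 = acf(e ** 2, 5)  # clearly positive at lag 1: volatility remains
print(p_hat, acf_e[0], acf_e2[0])
```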


Time series clustering by dependence

Synthetic example

Dependent series
The models for the three populations are:
1 AR(1): $X_t^{(1,i)} = 0.9 X_{t-1}^{(1,i)} + \epsilon_t^{(1,i)}$ with $i = 1, 2, \ldots, 5$.
2 AR(1): $X_t^{(2,i)} = 0.2 X_{t-1}^{(2,i)} + \epsilon_t^{(2,i)}$ with $i = 1, 2, \ldots, 5$.
3 AR(1): $X_t^{(3,i)} = 0.2 X_{t-1}^{(3,i)} + \epsilon_t^{(3,i)}$ with $i = 1, 2, \ldots, 5$.
That is, the second and the third models have the same
autocorrelation structure.
The five scenarios differ in the dependence structure of
the innovations. In the following, we present the
correlation matrices of $(\epsilon_t^{(1,1)}, \epsilon_t^{(1,2)}, \ldots, \epsilon_t^{(3,5)})$.
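
A panel of this kind can be simulated by drawing correlated Gaussian innovations through a Cholesky factor. The sketch below uses the correlation structure of Scenario D.3 (series 1 to 10 mutually correlated at 0.9, series 11 to 15 uncorrelated); the Gaussian assumption, the burn-in length and the function names are choices made for this illustration, and T is set to 2000 only to make the sample correlations stable (the experiments in the slides use T = 100).

```python
import numpy as np


def scenario_d3_correlation():
    """Innovation correlation matrix of Scenario D.3 (15 x 15)."""
    R = np.eye(15)
    R[:10, :10] = 0.9
    np.fill_diagonal(R, 1.0)
    return R


def simulate_ar1_panel(phis, R, T=100, burn=200, seed=0):
    """Simulate AR(1) series whose innovations have correlation matrix R."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(R)
    eps = rng.standard_normal((T + burn, len(phis))) @ L.T
    X = np.zeros_like(eps)
    for t in range(1, T + burn):
        X[t] = phis * X[t - 1] + eps[t]
    return X[burn:]                      # drop the burn-in period


phis = np.repeat([0.9, 0.2, 0.2], 5)     # three populations of five series
X = simulate_ar1_panel(phis, scenario_d3_correlation(), T=2000)
C = np.corrcoef(X, rowvar=False)
print(X.shape, C[0, 1], C[0, 12])
```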


Time series clustering by dependence

Synthetic example
Scenario D.1
$R_{D.1} = (\rho_{ij})_{i,j=1}^{15}$ is symmetric with $\rho_{ii} = 1$, $\rho_{i,i+1} = 0.5$ for $i = 1, \ldots, 9$, and $\rho_{ij} = 0$ otherwise.

[Figure: Scenario D.1, with the 15 series drawn as nodes on a circle.]

Time series clustering by dependence

Synthetic example
Scenario D.2
$R_{D.2} = (\rho_{ij})_{i,j=1}^{15}$ is symmetric with $\rho_{ii} = 1$, $\rho_{i,i+1} = 0.5$ for $i = 1, \ldots, 9$ and $i = 11, \ldots, 14$, and $\rho_{ij} = 0$ otherwise.

Time series clustering by dependence

Synthetic example
Scenario D.3
$R_{D.3} = (\rho_{ij})_{i,j=1}^{15}$ is symmetric with $\rho_{ii} = 1$, $\rho_{ij} = 0.9$ for $i \neq j$ with $i, j \le 10$, and $\rho_{ij} = 0$ otherwise.

Time series clustering by dependence

Synthetic example
Scenario D.4
$R_{D.4} = (\rho_{ij})_{i,j=1}^{15}$ is symmetric with $\rho_{ii} = 1$, $\rho_{ij} = 0.9$ for $i \neq j$ with $i, j \le 10$, $\rho_{i,i+1} = 0.5$ for $i = 11, \ldots, 14$, and $\rho_{ij} = 0$ otherwise.

Time series clustering by dependence

Synthetic example
Scenario D.5
$R_{D.5} = (\rho_{ij})_{i,j=1}^{15}$ is symmetric with $\rho_{ii} = 1$, $\rho_{ij} = 0.9$ for $i \neq j$ with $i, j \le 10$ or with $i, j \ge 11$, and $\rho_{ij} = 0$ otherwise.

Synthetic example: Scenarios D.1 - D.5

[Figure: the five scenarios D.1 to D.5, each with the 15 series drawn as nodes on a circle.]

Synthetic example: Scenarios D.1 - D.5

The following results are the means of the Gavrilov index from 10000
replicates for the sets A and B with T = 100.

The similarity index used in Gavrilov et al. (2000) compares two
different cluster partitions, $C = (C_1, \ldots, C_k)$ and $C' = (C'_1, \ldots, C'_{k'})$,
using the following formulas:

$$Sim(C_i, C'_j) = 2 \, \frac{\#(C_i \cap C'_j)}{\#(C_i) + \#(C'_j)},$$

and

$$Sim(C, C') = k^{-1} \sum_{i=1}^{k} \max_{1 \le j \le k'} Sim(C_i, C'_j).$$

The closer the index is to one, the higher the agreement between the
two partitions.
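
The index is straightforward to compute from two partitions given as lists of sets. The following sketch follows the two formulas above; the function name and the example partitions are illustrative.

```python
def gavrilov_index(reference, other):
    """Similarity Sim(C, C') between a reference partition C and another C'."""
    def sim(a, b):
        return 2 * len(a & b) / (len(a) + len(b))

    C = [set(c) for c in reference]
    C_prime = [set(c) for c in other]
    # For each reference cluster, keep its best match in the other partition.
    return sum(max(sim(ci, cj) for cj in C_prime) for ci in C) / len(C)


truth = [range(1, 6), range(6, 11), range(11, 16)]   # three populations
merged = [range(1, 6), range(6, 16)]                 # last two merged

same = gavrilov_index(truth, truth)
partial = gavrilov_index(truth, merged)
print(same, partial)
```

Note that the index is not symmetric in its arguments, since the average is taken over the clusters of the first partition only.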

Synthetic example: Scenario D.1

[Figure: two dendrograms for Scenario D.1. Leaf order, left panel: 9 10 11 7 8 15 1 2 3 6 4 5 12 13 14; right panel: 9 10 6 7 8 1 2 3 4 5 11 15 14 12 13.]

Synthetic example: Scenario D.2

[Figure: two dendrograms for Scenario D.2. Leaf order, left panel: 3 4 15 6 7 11 12 1 2 8 5 9 10 13 14; right panel: 3 4 1 2 5 6 7 8 9 10 11 12 13 14 15.]

Synthetic example: Scenario D.3

[Figure: two dendrograms for Scenario D.3. Leaf order, left panel: 4 5 3 1 2 6 10 7 9 8 11 14 12 13 15; right panel: 4 5 3 2 1 6 10 7 9 8 13 11 12 15 14.]

Synthetic example: Scenario D.4

[Figure: two dendrograms for Scenario D.4. Leaf order, left panel: 1 4 2 3 5 6 9 8 7 10 11 12 13 14 15; right panel: 1 4 2 3 5 6 7 10 8 9 11 12 13 14 15.]

Synthetic example: Scenario D.5

[Figure: two dendrograms for Scenario D.5. Leaf order, left panel: 11 12 13 15 14 1 3 5 4 2 6 7 8 9 10; right panel: 11 12 13 14 15 1 2 3 5 4 6 7 8 9 10.]

Synthetic example: Scenarios D.1 - D.5

The following results are the means of the Gavrilov index from 10000
replicates for the set D using complete and single linkage (first and second table, respectively).

Method   D.1     D.2     D.3     D.4     D.5
SAC      0.443   0.643   0.717   0.665   0.665
PAC      0.491   0.666   0.814   0.678   0.689
D        0.698   0.664   1.000   0.842   1.000
RD       0.527   0.654   1.000   0.865   1.000

Method   D.1     D.2     D.3     D.4     D.5
SAC      0.478   0.666   0.635   0.667   0.667
PAC      0.474   0.666   0.637   0.667   0.667
D        0.923   0.830   1.000   0.988   1.000
RD       0.934   0.843   1.000   0.993   1.000
ABC      -       0.612   -       0.698   0.840

Synthetic example: Scenarios D.1 - D.5

Main conclusions
The results of the univariate methods are similar and they do not
change much across linkage methods.
Notice that here a Gavrilov index around 0.667 corresponds
approximately to separating the first population from the third one in
scenarios D.2, D.4 and D.5.
For scenarios D.3, D.4 and D.5, where there are some “strong”
clusters, both multivariate measures with complete linkage
improve on the univariate measures.
For all scenarios, single linkage with RD is preferable to the other
alternatives considered.

Case-study with real data- I

Spanish mortality rates

We consider the Spanish mortality rates by age (0 – 90 years)
for both genders taken from the Human Mortality Database
(http://www.mortality.org).
The data are available from 1908 to 2015. We skip the period
1908 – 1949.
This allows us to use the period 1950 – 2000 as a model
adjustment period and 2001 – 2015 as a test period in the
forecasting exercise.

Case-study with real data - I

Spanish mortality rates
[Figure: log(MR) by year, 1950 to 2000.]

Case-study with real data: Data description

Spanish mortality rates
It is clear that these series have a strong negative trend. In fact, they
share a common trend.

[Figure: dendrogram of the age series (horizontal axis in units of 10^{-3}) next to a plot of the same series drawn as nodes on a circle.]

Case-study with real data: Data description

Lee-Carter model
It is a well-known model which looks at the dependence between
mortality time series. It relates the mortality rates by age to a single
unobservable factor:

$$\ln(MR_{x,t}) = a_x + b_x k_t + \varepsilon_{x,t}, \qquad k_t = c + k_{t-1} + \eta_t,$$

where $a_x$ and $b_x$ are parameters which depend on age, $x$; $k_t$ is the
unobservable factor which picks up the general characteristics of
mortality in the year $t$, and $\varepsilon_{x,t}$ are the age-specific factors.

We will cluster the series of age-specific factors, $\varepsilon_{x,t}$.
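
One common way to obtain the age-specific factors is the singular value decomposition fit of the Lee-Carter model, sketched below on simulated data. The slides do not state which estimation method was used, so this is an assumption; the normalization (loadings summing to one, factor summing to zero) is the usual identification convention, and `fit_lee_carter` is a hypothetical name.

```python
import numpy as np


def fit_lee_carter(log_mx):
    """SVD fit of ln(MR_{x,t}) = a_x + b_x k_t + eps_{x,t}.

    log_mx : array of shape (ages, years) with log mortality rates.
    Returns a_x, b_x, k_t and the residual matrix eps_{x,t}.
    """
    a = log_mx.mean(axis=1)                    # age-specific level
    centered = log_mx - a[:, None]
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    b = U[:, 0]
    k = s[0] * Vt[0]
    scale = b.sum()                            # normalize so that sum(b) = 1
    b, k = b / scale, k * scale
    resid = centered - np.outer(b, k)          # age-specific factors
    return a, b, k, resid


# Simulated example: 20 ages, 51 years, random walk with drift for k_t.
rng = np.random.default_rng(3)
ages, years = 20, 51
a_true = np.linspace(-8.0, -2.0, ages)
b_true = np.full(ages, 1.0 / ages)
k_true = np.cumsum(-1.0 + rng.standard_normal(years))
log_mx = (a_true[:, None] + np.outer(b_true, k_true)
          + 0.01 * rng.standard_normal((ages, years)))

a, b, k, resid = fit_lee_carter(log_mx)
drift = np.diff(k).mean()                      # estimate of c in k_t = c + k_{t-1} + eta_t
print(b.sum(), k.sum(), drift)
```

The rows of `resid` are the series that would then be passed to the dependence-based dissimilarity.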


Case-study with real data: Factors & Loadings

Spanish mortality rates
[Figure: first factor and second factor, 1950 to 2000 (top panels); first factor loadings and second factor loadings by age (bottom panels).]

Spanish mortality rates: Clustering results

Spanish mortality rates
For the age-specific factors, we find two clusters and some
“independent” series.

[Figure: dendrogram of the age-specific factor series next to a plot of the same series drawn as nodes on a circle.]

Spanish mortality rates: Clustering results

Here, we will compare the forecasting performance of three
models:
A factor model with a single unobservable factor, as in
Lee and Carter (1992).
A factor model with two unobservable factors, as in
Alonso, Peña and Rodríguez (2005).
A factor model with two unobservable factors where:
the first factor is estimated using all series.
the second factor is estimated using the two obtained
clusters.

Case-study with real data: Factors & Loadings

Spanish mortality rates
[Figure: second factor for Cluster 1 and for Cluster 2, 1950 to 2000 (top panels); second factor loadings by age for each cluster (bottom panels).]

Mean absolute prediction errors

[Figure: mean absolute prediction errors by age (0 to 100) for the one factor model, the two factors model and the factors + cluster model, with Cluster 1 and Cluster 2 marked.]

We observe improvements at almost all ages.

Mean absolute prediction errors

[Figure: mean absolute prediction errors for ages 55 to 95, same models as above.]

We observe improvements at ages where the two-factor model is worse than the one-factor model.

Mean absolute prediction errors

[Figure: mean absolute prediction errors for ages 20 to 40, same models as above.]

But also at ages where the two-factor model is better than the one-factor model.

Case-study with real data- II

Spanish electricity prices
We study the 24 series of hourly prices for the Iberian electricity
market from January 2014 to May 2016.
[Figure: the 24 hourly price series (legend: hours 1 to 24); horizontal axis from 0 to 900 observations.]

Case-study with real data- II

Spanish electricity prices, shifted vertically for better visualization.

[Figure: the 24 hourly price series (legend: hours 1 to 24), each shifted vertically; horizontal axis from 0 to 900 observations.]

Spanish electricity prices: Clustering results

There are three clusters:
Sleeping hours.
Working hours.
Arriving & staying at home.

[Figure: the 24 hourly series drawn as nodes on a circle.]

Mean absolute prediction errors

[Figure: mean absolute prediction errors by hour (0 to 25 on the horizontal axis) for the one factor model, the two factors model and the factors + clusters model.]

We observe improvements at all hours for the one-day-ahead forecast.

Time series clustering by features.

Raw data.
Autocorrelation.
Spectral density.
Extreme value behaviour.

Model based time series clustering.

Forecast based clustering.
Model with cluster structure.

Time series clustering by dependence.
