Intro Stat 153
Joan Bruna
Department of Statistics
UC Berkeley
September 3, 2015
Grading
The lowest grade will be dropped.
Statistical Models of Time Series
trend
seasonality (periodicity)
Excerpt (from Saunier and Sayed):

(1 − α)(Σ′j + (µ′j − µj)(µ′j − µj)T) + α(Σᶜj + (µᶜj − µj)(µᶜj − µj)T)   (4)

where µ′j and Σ′j are the mean and covariance of the Gaussian of the state Sj of the adapted HMM. α is a weighting factor that controls the balance between the original model and the new estimates on the trajectories involved in the training traffic conflicts. Both the original model and the adapted model are kept in the set of HMMs used for detection.

A traffic conflict is an interaction, defined as an observational situation in which two or more vehicles are close enough in space and time, and are nearing each other. So far, the method has built a set of HMMs, among which some are models of conflicting trajectories. But a trajectory is conflicting only with respect to another one. It is their conjunction in time and space that creates a danger of collision. Therefore, the models of the conflicting trajectories (to which the trajectories involved in the training traffic conflict instances are assigned) are memorized by pairs (e.g. pairs of models 4 and 7, 11 and 2, 3 and 7). The traffic conflict detection proceeds as follows:
1) Vehicles are tracked.
2) If two vehicles are close enough (threshold on their distance) and nearing each other (their distance decreases), an interaction is detected.
3) Each interacting vehicle trajectory is assigned to a HMM.
4) If the HMMs of both interacting trajectories were both memorized as conflicting (e.g. any of the pairs of models 4 and 7, 11 and 2, 3 and 7), a traffic conflict between these two vehicles is detected.

V. EXPERIMENTAL EVALUATION
A simple vehicle detection and tracking algorithm is used, based on the implementation of the KLT feature tracking algorithm of [8], used in [9]. The advantages of feature-based algorithms include the abilities to work well under [...]. All K HMMs of a mixture have the same structure parameter values (number of states, simple Gaussian observation [...]). [...] distinguish and track on account of the videotape quality. There is no traffic conflict highlighted in the last sequence. It is sometimes difficult to judge the performance of the method since there is no ground truth, apart from the traffic conflicts used in the program.

Fig. 1. An image of the traffic sequences.
Fig. 2. An example of trajectories involved in a traffic conflict.
2 http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
Example: Transformation of the data
[Plot of the transformed data (y: 7–12, t: 0–90).]
Example: Look at the residuals
[Plot of the residuals (y: −1 to 1.5, t: 0–90).]
Example: Modeling trend and seasonal variation
[Plot: data with fitted trend and seasonality (y: 7–12, t: 0–90).]
Example
Xt = Tt + St + Rt ,
where Tt is the trend, St the seasonal component, and Rt the residual.
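One simple way to fit such a decomposition is least squares on a linear trend plus seasonal dummy variables. A minimal sketch, assuming a known period p = 12 and synthetic data:

```python
# Fit Xt = Tt + St + Rt with a linear trend and seasonal dummies (period p).
import numpy as np

def decompose(x, p=12):
    n = len(x)
    t = np.arange(n)
    season = np.eye(p)[t % p]             # one dummy column per season
    A = np.column_stack([t, season])      # design matrix: trend + seasonal means
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    trend = coef[0] * t
    seasonal = season @ coef[1:]
    return trend, seasonal, x - trend - seasonal  # Tt, St, Rt

rng = np.random.default_rng(0)
t = np.arange(96)
x = 0.05 * t + np.sin(2 * np.pi * t / 12) + 0.2 * rng.standard_normal(96)
trend, seasonal, resid = decompose(x)
print(resid.std())  # residuals are small compared to the signal
```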
Stationarity.
Autocorrelation Function.
Chasing Stationarity.
Spectral Methods
Correlation is a form of (global) smoothness. How to study
smoothness properties?
Spectral Density.
Time-Frequency Representations, Periodogram.
Introduction to Wavelet Analysis
Spectral Estimation.
Applications to speech recognition.
ARMAX models
Hidden Markov models (HMM).
Kalman Filter.
Recurrent Neural Networks (RNN). (time allowing)
White Noise
{Xt } is a white noise if for all t,
1. E (Xt ) = 0,
2. var (Xt ) = σ² ,
3. Xt and Xu are uncorrelated for t ≠ u.
In particular, if {Xt } are i.i.d. with zero mean and variance σ², then {Xt } is a white noise. Also, by independence,
P(X1 ≤ x1 , . . . , Xt ≤ xt ) = ∏i P(Xi ≤ xi ) .
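A minimal simulation checking the three properties on i.i.d. Gaussian draws (a sketch, not part of the slides):

```python
# Simulate Gaussian white noise and check mean, variance, and decorrelation.
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 1.5, 100_000
w = sigma * rng.standard_normal(n)

print(w.mean())                          # ~0          (property 1)
print(w.var())                           # ~sigma^2    (property 2)
print(np.corrcoef(w[:-1], w[1:])[0, 1])  # ~0 at lag 1 (property 3)
```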
Moving Averages
Xt = ∑_{k=−∆}^{∆} λk Wt+k .
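A quick sketch of a moving average of white noise; the uniform weights λk = 1/(2∆ + 1) are an illustrative choice, not fixed by the definition:

```python
# Moving average of white noise: Xt = sum_{k=-D..D} lam_k * W_{t+k}.
# Uniform weights lam_k = 1/(2D+1) are assumed here for illustration.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(500)
D = 2
lam = np.full(2 * D + 1, 1.0 / (2 * D + 1))
x = np.convolve(w, lam, mode="valid")  # each Xt averages 2D+1 neighbours

print(w.var(), x.var())  # smoothing shrinks the variance by ~(2D+1)
```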
Random Walks
Consider {Wt } white noise, and Xt = ∑_{i≤t} Wi .
Recall the S&P data.
The full distribution of a time series is intractable in general. We will resort to low-order statistics only (mostly first and second order).
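A random walk is just a cumulative sum of white noise increments; a short sketch:

```python
# Random walk: Xt = cumulative sum of white noise increments.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)   # white noise {Wt}
x = np.cumsum(w)                # Xt = sum_{i<=t} Wi

# Unlike the increments, the walk's variability grows with t:
print(x[:100].var(), x.var())
```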
Definition
The mean function of a time series {Xt } is
µX (t) = E (Xt ) .
Definition
The autocovariance function of a time series {Xt } is
RX (s, t) = cov(Xs , Xt ) = E [(Xs − µX (s))(Xt − µX (t))] .
Lecture 2
Autocovariance examples
Autocorrelation and Cross-Correlation
Stationarity
Linear Processes
Examples
Examples
{Xt } Random Walk: Xt = ∑_{i≤t} Wi , with {Wt } iid white noise.
µX (t) = E (Xt ) = E ( ∑_{i≤t} Wi ) = ∑_{i≤t} E (Wi ) = 0 .
RX (s, t) = cov( ∑_{i≤s} Wi , ∑_{i′≤t} Wi′ ) = ∑_{i≤s, i′≤t} cov(Wi , Wi′ ) = min(s, t) σ² .
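A Monte Carlo check of RX (s, t) = min(s, t) σ² over many simulated walks (a sketch; the sample sizes are arbitrary):

```python
# Monte Carlo check that RX(s, t) = min(s, t) * sigma^2 for a random walk.
import numpy as np

rng = np.random.default_rng(0)
n_paths, T, sigma = 200_000, 30, 1.0
x = np.cumsum(sigma * rng.standard_normal((n_paths, T)), axis=1)

s, t = 10, 25                              # 1-indexed times -> columns s-1, t-1
emp = np.mean(x[:, s - 1] * x[:, t - 1])   # mean is 0, so this is the covariance
print(emp, min(s, t) * sigma**2)           # both ~10
```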
Examples
Deterministic signal plus noise: Xt = F (t) + Wt , with {Wt } white noise.
RX (s, t) = cov(F (s) + Ws , F (t) + Wt ) = cov(Ws , Wt ) = σ² if s = t, and 0 otherwise.
Moving average process: Xt = ∑_{k=−∆}^{∆} Wt+k , with {Wt } white noise.
RX (s, t) = ?
If |t − s| > 2∆, the two windows share no terms:
RX (s, t) = cov( ∑_{k=−∆}^{∆} Ws+k , ∑_{k=−∆}^{∆} Wt+k ) = 0 .
If |t − s| ≤ 2∆, the windows share 2∆ + 1 − |t − s| terms:
RX (s, t) = cov( ∑_{k=−∆}^{∆} Ws+k , ∑_{k=−∆}^{∆} Wt+k ) = (2∆ + 1 − |t − s|) σ² .
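An empirical check of the overlap count (a sketch; sliding_window_view requires NumPy ≥ 1.20):

```python
# Check RX(s, t) = (2D + 1 - |t - s|) * sigma^2 for |t - s| <= 2D.
import numpy as np

rng = np.random.default_rng(0)
n_paths, T, D, sigma = 100_000, 40, 3, 1.0
w = sigma * rng.standard_normal((n_paths, T))
# Column j of x sums the window w[:, j : j + 2D + 1].
x = np.lib.stride_tricks.sliding_window_view(w, 2 * D + 1, axis=1).sum(axis=-1)

s, t = 5, 9                                      # |t - s| = 4 <= 2D = 6
emp = np.cov(x[:, s], x[:, t])[0, 1]
print(emp, (2 * D + 1 - abs(t - s)) * sigma**2)  # both ~3
```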
Stationarity: Motivation
Strict Stationarity
Definition
A Time Series {Xt } is strictly Stationary if, for every k, every t1 , . . . , tk and every lag h, (Xt1 , . . . , Xtk ) and (Xt1+h , . . . , Xtk+h ) have the same joint distribution.
In particular, µX (t) = constant.
Weak Stationarity
Definition
A Time Series {Xt } is weakly stationary if µX (t) is constant and RX (s, t) depends only on h = s − t. We then write RX (h).
Autoregressive Process
Xt = λXt−1 + Wt , with {Wt } white noise and |λ| < 1; equivalently, Xt = ∑_{k=0}^{∞} λ^k Wt−k .
ACF of {Xt }: ρX (h) = RX (h) / RX (0) , where
µX (t) = E ( ∑_{k=0}^{∞} λ^k Wt−k ) = ∑_k λ^k E (Wt−k ) = 0 , and
E (Xt²) = ∑_{k=0}^{∞} λ^{2k} σ² = σ² / (1 − λ²) .
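A sketch simulating an AR(1) and comparing the sample ACF with the theoretical ρX (h) = λ^h (the standard AR(1) ACF):

```python
# Simulate AR(1): Xt = lam * X_{t-1} + Wt, and compare the sample ACF
# with the theoretical rho(h) = lam^h.
import numpy as np

rng = np.random.default_rng(0)
lam, n = 0.7, 200_000
w = rng.standard_normal(n)
x = np.empty(n)
x[0] = w[0]
for t in range(1, n):
    x[t] = lam * x[t - 1] + w[t]

xc = x - x.mean()
r0 = np.mean(xc * xc)
for h in range(4):
    rho_hat = np.mean(xc[: n - h] * xc[h:]) / r0
    print(h, round(rho_hat, 3), round(lam**h, 3))  # sample vs. lam^h
```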
Today’s menu
Linear Processes
Estimation of Autocovariance and ACF.
Prediction with ACF.
Linear Processes
Xt = µ + ∑_{k=−∞}^{∞} ψk Wt−k ,
with
{Wt } white noise.
∑_k |ψk |² < ∞.
These are called linear processes.
Linear Processes
Proposition
Any linear Process {Xt } is weakly stationary, with
µX (t) = µ ,
RX (h) = σ² ∑_{k=−∞}^{∞} ψk ψk+h .
A slide for EE
We can view a linear process {Xt } as white noise passed through a linear time-invariant filter: Xt = µ + (ψ ∗ W )t , a convolution of {Wt } with the filter coefficients {ψk }.
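In code, the filtering view is literally a call to a convolution routine. A sketch with illustrative weights ψ that also checks the proposition's formula RX (h) = σ² ∑_k ψk ψk+h:

```python
# The linear process as filtering: X = mu + conv(psi, W).
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(100_000)            # white noise, sigma^2 = 1
psi = np.array([1.0, 0.5, 0.25, 0.125])     # illustrative, square-summable
mu = 2.0
x = mu + np.convolve(w, psi, mode="valid")  # Xt = mu + sum_k psi_k W_{t-k}

# Check the proposition: RX(h) = sigma^2 * sum_k psi_k psi_{k+h}
for h in range(1, 3):
    print(np.cov(x[:-h], x[h:])[0, 1], np.sum(psi[:-h] * psi[h:]))
```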
Sample Mean
µ̂ = (1/n) ∑_{i=1}^{n} xi .
Sample Autocovariance
R̂X (h) = (1/n) ∑_{i=1}^{n−h} (xi − µ̂)(xi+h − µ̂) , (for h < n) .
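A direct implementation of these estimators (a sketch; note the 1/n factor even though only n − h terms are summed):

```python
# Sample autocovariance and sample ACF, following the slide's definitions.
import numpy as np

def sample_autocov(x, h):
    n = len(x)
    mu_hat = x.mean()
    # 1/n normalization (not 1/(n - h)): this keeps R_hat positive semi-definite.
    return np.sum((x[: n - h] - mu_hat) * (x[h:] - mu_hat)) / n

def sample_acf(x, h):
    return sample_autocov(x, h) / sample_autocov(x, 0)

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
print([round(sample_acf(x, h), 3) for h in range(4)])  # ~[1, 0, 0, 0]
```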
Sample Autocovariance
Define a linear combination using samples of a stationary process {Xt }:
Y = a1 X1 + a2 X2 + · · · + an Xn .
Then
var (Y ) = cov(Y , Y ) = ∑_{i,i′=1}^{n} ai ai′ RX (i − i′) ≥ 0 , (∀a) .
A function (or matrix) R such that ∑_{i,i′} ai ai′ Ri,i′ ≥ 0 for all a is called positive semi-definite.
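A numerical check that the sample autocovariance matrix [R̂X (i − i′)]_{i,i′} has no negative eigenvalues (a sketch):

```python
# Check positive semi-definiteness of the sample autocovariance via eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
n, H = len(x), 20
mu_hat = x.mean()
r_hat = np.array([np.sum((x[: n - h] - mu_hat) * (x[h:] - mu_hat)) / n
                  for h in range(H)])

# Toeplitz matrix R[i, i'] = r_hat(|i - i'|).
R = r_hat[np.abs(np.subtract.outer(np.arange(H), np.arange(H)))]
print(np.linalg.eigvalsh(R).min() >= -1e-10)  # True: all eigenvalues >= 0
```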
Sample Autocovariance
Proposition
The Autocovariance is positive semi-definite.
Proposition
The Sample Autocovariance is positive semi-definite.
Example
Xt = 2 + Wt − Wt−1 .
Its true ACF: ρX (0) = 1, ρX (±1) = −1/2, and ρX (h) = 0 for |h| > 1.
Example
Samples from {Xt }: [plot]
Sample ACF ρ̂X (h): [plot]
Sample ACF ρ̂X (h) with true ACF (in red): [plot]
Proposition
Let {Wt } be a white noise and fix H > 0. Then, under mild assumptions on the law of Wt , for large n, the sample ACF ρ̂X (h), h = 1, . . . , H, is approximately Normally distributed, with zero mean and variance 1/n.
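A quick simulation of this proposition: for white noise, about 95% of the ρ̂X (h) should fall within ±1.96/√n (a sketch):

```python
# For white noise, rho_hat(h) is approximately N(0, 1/n): about 95% of
# sample ACF values should fall inside +/- 1.96 / sqrt(n).
import numpy as np

rng = np.random.default_rng(0)
n, H, reps = 500, 10, 2000
hits = 0
for _ in range(reps):
    w = rng.standard_normal(n)
    wc = w - w.mean()
    r0 = np.mean(wc * wc)
    for h in range(1, H + 1):
        rho = np.mean(wc[: n - h] * wc[h:]) / r0
        hits += abs(rho) < 1.96 / np.sqrt(n)
print(hits / (reps * H))  # ~0.95
```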
Example
Sample ACF ρ̂X (h) with true ACF (in red) and significance region (in green): [plot]

Sample ACF
ρ̂X (h) = R̂X (h) / R̂X (0) .
Trend
Yt = Xt + βt , with β ≠ 0 .
Xt = 0.4Wt−1 + Wt
Xt = 0.7Xt−1 + Wt
The best linear prediction of Xt+h given Xt = xt is
f (xt ) = µ + ρX (h)(xt − µ) ,
with
MSE = σ² (1 − ρX (h)²) .
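A sketch demonstrating this predictor on an AR(1), where ρX (h) = λ^h and σ² above is the variance of the process, RX (0):

```python
# One-lag prediction with the ACF: f(xt) = mu + rho(h) * (xt - mu),
# demonstrated on an AR(1), where rho(h) = lam^h.
import numpy as np

rng = np.random.default_rng(0)
lam, n, h = 0.7, 100_000, 1
w = rng.standard_normal(n)
x = np.empty(n)
x[0] = w[0]
for t in range(1, n):
    x[t] = lam * x[t - 1] + w[t]

mu, var = x.mean(), x.var()        # here mu ~ 0, var ~ 1 / (1 - lam^2)
rho = lam**h
pred = mu + rho * (x[: n - h] - mu)
mse = np.mean((x[h:] - pred) ** 2)
print(mse, var * (1 - rho**2))     # empirical MSE matches RX(0) * (1 - rho^2)
```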