Lecture 1 - Time Series Fundamentals - Introduction
Lectures (online)
A new lecture recording is generally released every Thursday on FAU.TV
Consultation hours by appointment, write to [email protected]
Exercises (online)
Live Zoom Session starting on November 3rd
Recordings from previous editions are available at https://www.fau.tv/course/id/3178
StudOn 2023-2024:
https://www.studon.fau.de/crs5276833.html
Exams and evaluation
• Please address all your correspondence about the course to Dr. Dario Zanca
Course organizers
Teaching assistants
Responsible for the exercises:
• Richard Dirauf (M.Sc.), [email protected]
• Philipp Schlieper (M.Sc.), [email protected]
References
Deep Learning
by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016)
Time series fundamentals
Motivations
An old history of time series analysis: Babylonian astronomical diaries
(a) https://research.arizona.edu/stories/space-versus-ground-telescopes
(b) https://www.nature.com/articles/d41586-020-02284-7
An old history of time series analysis: The Birth of Epidemiology
https://www.statista.com
Example: Predicting demand of products
Amazon sells 400 million products in over 185 countries(a). Some products are sold depending on the season.
→ Maintaining surplus inventory levels for every product is cost-prohibitive.
→ Predict future demand of products
Methods over time (2007–2020): Statistical Methods → Random Forests → Feedforward Neural Networks → RNN/CNN → Transformers
• First models required manual feature engineering
• New methods are fully data-driven
Example: Duplex makes tedious phone calls
https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
Example: Activity recognition in sports (FAU Erlangen)
Actions:
• Underhand serve
• Overhand serve
• Jump serve
• Underarm set
• Overhead set
• Shot attack
• Spike
• Block
• Dig
• Null class
https://doi.org/10.1007/s10618-017-0495-0
Time series fundamentals
Definitions and basic properties
What is a time series?
A time series can be described as a set of observations, taken sequentially in time,
S = {s_1, …, s_T}
Terminology: Regularly Sampled vs Irregularly Sampled
Discrete time series are regularly sampled if their observations are equally spaced in time:
∀i ∈ {1, …, T − 1},  Δ_i = t_{i+1} − t_i = const.
They are irregularly sampled if their observations are not equally spaced.
• They are generally defined as a collection of pairs
S = {(s_1, t_1), …, (s_T, t_T)}
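To make the distinction concrete, here is a minimal NumPy sketch (the function name and tolerance are our own choices) that tests whether a sequence of timestamps is equally spaced:

```python
import numpy as np

def is_regularly_sampled(timestamps, tol=1e-9):
    """Check whether consecutive timestamps are equally spaced (Delta_i = const)."""
    deltas = np.diff(np.asarray(timestamps, dtype=float))
    return bool(np.all(np.abs(deltas - deltas[0]) <= tol))

regular = [0.0, 0.5, 1.0, 1.5, 2.0]    # Delta_i = 0.5 throughout
irregular = [0.0, 0.4, 1.0, 1.1, 2.0]  # varying gaps

print(is_regularly_sampled(regular))    # True
print(is_regularly_sampled(irregular))  # False
```

For irregularly sampled series, the timestamps themselves must be stored alongside the values, which is why they are defined as a collection of (s_i, t_i) pairs.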
Terminology: Univariate vs Multivariate
Let S = {s_1, …, s_T} be a time series, where s_i ∈ ℝ^d, ∀i ∈ {1, …, T}.
If d = 1, S is said to be univariate.
• Only one variable varies over time.
If d > 1, S is said to be multivariate.
• Multiple variables vary over time.
• E.g., tri-axial accelerometer measurements
Terminology: Discrete vs Continuous
A time series is said to be continuous if observations are made at each instant of time, even
when its measurements consist only of a discrete set of values.
• E.g., the number of people in a room.
A time series is said to be discrete if observations are taken at specific times. Discrete time
series can arise in different ways:
• Sampled (e.g., daily rainfall)
• Aggregated (e.g., monthly reports of daily rainfalls)
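The aggregated case can be illustrated with pandas; the daily rainfall values below are synthetic, generated purely for illustration:

```python
import numpy as np
import pandas as pd

# Synthetic daily rainfall series (illustrative values only)
days = pd.date_range("2023-01-01", "2023-03-31", freq="D")
rng = np.random.default_rng(0)
daily_rainfall = pd.Series(rng.gamma(shape=1.0, scale=2.0, size=len(days)),
                           index=days)

# Aggregating the daily series into monthly totals yields a new,
# coarser discrete time series
monthly_totals = daily_rainfall.resample("MS").sum()
print(monthly_totals)
```

Both series are discrete; aggregation simply changes the sampling granularity from days to months.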
Terminology: Discrete vs Continuous
We will denote as mixed-type a multivariate time series consisting of both continuous and
discrete observations
• E.g., a time series consisting of continuous sensor values and discrete event log
for the monitoring of an industrial machine
Terminology: Periodic
A time series is said to be periodic if there exists a period τ such that
s_i = s_{i+τ}, ∀i ∈ {1, …, T − τ}
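This definition translates directly into a check on the observations; a minimal sketch (function name and tolerance are our own, the sine signal is an illustrative example):

```python
import numpy as np

def is_periodic(s, tau, tol=1e-9):
    """Check s_i == s_{i+tau} for all valid i."""
    s = np.asarray(s, dtype=float)
    return bool(np.all(np.abs(s[:-tau] - s[tau:]) <= tol))

t = np.arange(40)
s = np.sin(2 * np.pi * t / 8)  # period tau = 8 samples

print(is_periodic(s, 8))  # True
print(is_periodic(s, 5))  # False
```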
Even if we were to imagine having observed the process for an infinite period T of time, the infinite sequence
S = (…, s_{t−1}, s_t, s_{t+1}, …) = {s_t}_{t=−∞}^{+∞}
would still represent only a single realization of the underlying stochastic process.
Still, if we had a battery of N computers generating series S^(1), …, S^(N), and considered selecting the observation at time t from each series,
s_t^(1), …, s_t^(N),
we would obtain a sample of N independent realizations of the random variable X_t.
The unconditional mean is the expectation, provided it exists, of the t-th observation, i.e.,
E[X_t] = ∫_{−∞}^{+∞} s_t f_{X_t}(s_t) ds_t = μ_t
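The unconditional mean can be approximated numerically by averaging over such an ensemble of realizations; a minimal sketch, assuming the illustrative process X_t = μ + ε_t with μ = 3 (all parameters are our own choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# A battery of N "computers", each generating one realization of
# X_t = mu + eps_t (illustrative choice: mu = 3, eps_t ~ N(0, 1))
N, T, mu = 10_000, 50, 3.0
realizations = mu + rng.normal(0.0, 1.0, size=(N, T))

# Ensemble average at a fixed time t approximates E[X_t] = mu_t
t = 20
ensemble_mean = realizations[:, t].mean()
print(ensemble_mean)  # close to 3.0
```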
Given any particular realization 𝑆 (#) of a stochastic process (i.e., a time series), we can define
the vector of the 𝑗 + 1 most recent observations
x_t = [s_{t−j}^(i), …, s_t^(i)]
We want to know the probability distribution of this vector x_t across realizations. We can calculate the j-th autocovariance:
γ_{jt} = E[(X_t − μ_t)(X_{t−j} − μ_{t−j})]
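From a single realization, γ_j can be estimated by its sample counterpart; a minimal sketch (the function name is our own, and white noise is used as a test signal because its autocovariances beyond lag 0 are known to vanish):

```python
import numpy as np

def sample_autocovariance(s, j):
    """Sample estimate of the j-th autocovariance gamma_j of a series."""
    s = np.asarray(s, dtype=float)
    m = s.mean()
    if j == 0:
        return s.var()
    return np.mean((s[j:] - m) * (s[:-j] - m))

rng = np.random.default_rng(0)
# White noise with variance 4: gamma_0 = 4 and gamma_j = 0 for j > 0
white_noise = rng.normal(0.0, 2.0, size=100_000)

print(sample_autocovariance(white_noise, 0))  # approx 4
print(sample_autocovariance(white_noise, 1))  # approx 0
```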
Stationarity
If neither the mean μ_t nor the autocovariance γ_{jt} depends on the temporal variable t, then the process is said to be (weakly) stationary.
Consider, for example, the process
X_t = μ + ε_t,  with ε_t zero-mean i.i.d. noise.
Then, its mean is constant: E(X_t) = μ + E(ε_t) = μ
and its j-th autocovariance: E[(X_t − μ)(X_{t−j} − μ)] = γ_j
In other words: A process is said to be stationary if the process statistics do not depend on time.
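As a numerical sanity check, a minimal NumPy sketch (all simulation parameters are illustrative) that estimates μ_t and the lag-1 autocovariance across an ensemble of realizations of X_t = μ + ε_t and confirms that neither drifts with t:

```python
import numpy as np

rng = np.random.default_rng(1)

# Many realizations of the weakly stationary process X_t = mu + eps_t
N, T, mu = 5_000, 100, 2.0
X = mu + rng.normal(0.0, 1.0, size=(N, T))

# Ensemble mean mu_t and lag-1 autocovariance gamma_{1,t} at each t
mu_t = X.mean(axis=0)
gamma_1t = np.mean((X[:, 1:] - mu) * (X[:, :-1] - mu), axis=0)

# Neither statistic depends on t (up to sampling noise)
print(mu_t.std())             # small: mu_t is flat in t
print(np.abs(gamma_1t).max()) # small: eps_t are independent, so gamma_1 = 0
```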
Ergodicity
Given a time series, denoted by S^(i) = {s_1^(i), …, s_T^(i)}, we can compute the sample temporal average as
s̄ = (1/T) Σ_{t=1}^{T} s_t^(i)
The ergodicity of a time series binds the concept of the process mean to that of the temporal sample mean:
• A process is said to be ergodic if s̄ converges to μ_t as T → ∞
In other words: A process is said to be ergodic if its time statistics equals the process statistic,
provided that the process is observed long enough.
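For an ergodic process such as X_t = μ + ε_t, this convergence is easy to observe numerically; a minimal sketch (function name and parameters are our own):

```python
import numpy as np

def temporal_mean(T, mu=5.0, seed=2):
    """Time average of one realization of the ergodic process X_t = mu + eps_t."""
    rng = np.random.default_rng(seed)
    s = mu + rng.normal(0.0, 1.0, size=T)
    return s.mean()

# The temporal average s_bar approaches the process mean mu as T grows
for T in (100, 10_000, 1_000_000):
    print(T, temporal_mean(T))
```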
Example: Stationarity and Ergodicity
To clarify the concept, we give an example of a stationary but not ergodic process. Suppose the mean μ^(i) of the i-th realization of {X_t}_{t=−∞}^{+∞} is sampled from the normal distribution N(0, λ²) and, similarly to the previous example, X_t^(i) = μ^(i) + ε_t.
However, its sample temporal mean converges to a different value than the process mean, i.e.,
s̄ → μ^(i) ≠ 0 = E[X_t] as T → ∞
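This counterexample can be simulated directly; a minimal sketch (the value λ = 2 and all sizes are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0  # assumed std of the per-realization mean
N, T = 5_000, 1_000

# Each realization i draws its own mean mu_i ~ N(0, lam^2) once,
# then X_t^(i) = mu_i + eps_t
mu_i = rng.normal(0.0, lam, size=(N, 1))
X = mu_i + rng.normal(0.0, 1.0, size=(N, T))

ensemble_mean = X[:, 0].mean()   # approx 0 = E[X_t]: the process mean
temporal_means = X.mean(axis=1)  # each converges to its own mu_i, not to 0
print(ensemble_mean)
print(temporal_means.std())      # approx lam: time averages stay spread out
```

The ensemble statistics are time-invariant (stationary), yet no single time average recovers the process mean (not ergodic).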
Time series observations are generally neither independent nor identically distributed. For example:
• The price of a stock today depends on its price yesterday (dependence)
• and the volatility of the stock, i.e., its dispersion of returns, might change over time (change in the underlying distribution)
Time series and i.i.d. data
The structure of this dependence imposes challenges on the statistical data analysis of time
series.
• Many tools for statistical inference are valid only for i.i.d. data
Time series and i.i.d. data
It might be useful to be able to assess the structure of the dependence between random
variables. For this reason we make use of their correlation.
• Generally, we measure the correlation between two variables X_i and X_j with their covariance Cov(X_i, X_j).
• Cov(X_i, X_j) = 0 → uncorrelated
• We measure the dependence of an entire time series with a similar concept, the long-run variance
• σ² = Σ_{k∈ℤ} Cov(X_t, X_{t+k})
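In practice, the infinite sum is truncated at a maximum lag; a minimal sketch (the function name, truncation lag, and the AR(1) test process are our own choices, used because its long-run variance σ²_ε/(1 − φ)² is known in closed form):

```python
import numpy as np

def long_run_variance(s, max_lag):
    """Truncated estimate of sigma^2 = sum_{k in Z} Cov(X_t, X_{t+k})."""
    s = np.asarray(s, dtype=float)
    m = s.mean()
    lrv = s.var()  # k = 0 term
    for k in range(1, max_lag + 1):
        # gamma_k and gamma_{-k} are equal, hence the factor 2
        lrv += 2.0 * np.mean((s[k:] - m) * (s[:-k] - m))
    return lrv

# AR(1) test case: X_t = 0.5 X_{t-1} + eps_t has long-run variance
# sigma_eps^2 / (1 - phi)^2 = 1 / 0.25 = 4
rng = np.random.default_rng(4)
phi, T = 0.5, 200_000
eps = rng.normal(0.0, 1.0, size=T)
x = np.empty(T)
x[0] = eps[0]
for t in range(1, T):
    x[t] = phi * x[t - 1] + eps[t]

print(long_run_variance(x, max_lag=50))  # close to 4
```

Note that the long-run variance (4) exceeds the marginal variance (4/3 here): positive serial dependence inflates the variance of sample averages.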
The Central Limit Theorem
The Central Limit Theorem (CLT) suggests that the sum of random variables converges to a
normal distribution, under precise conditions.
More precisely, for a sequence of i.i.d. random variables {X_t}_{t∈{1,…,T}} with μ = E(X_t) and σ² = E[(X_t − μ)²], by the CLT it holds:
√T ( (1/T) Σ_{t=1}^{T} X_t − μ ) → N(0, σ²)
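The statement can be visualized by simulation; a minimal sketch using an exponential distribution (our own choice, picked because it is clearly non-normal yet has μ = 1 and σ² = 1):

```python
import numpy as np

rng = np.random.default_rng(5)

# Many i.i.d. samples from a non-normal distribution: exponential with
# mu = 1 and sigma^2 = 1
T, n_repeats = 500, 5_000
samples = rng.exponential(scale=1.0, size=(n_repeats, T))

# sqrt(T) * (sample mean - mu) should be approximately N(0, sigma^2)
z = np.sqrt(T) * (samples.mean(axis=1) - 1.0)

print(z.mean())  # approx 0
print(z.var())   # approx sigma^2 = 1
```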
For stationary time series with mean 𝜇 and long-run variance 𝜎 4 the CLT holds as before.
Why is the CLT important?
If the CLT holds for a time series, we can draw from a larger range of methods.
• Statistical inference depends on the possibility to take a broad view of results from a
sample to the population.
• The CLT legitimizes the assumption of normality of the error terms in linear regression.
However,
• Many time series we encounter in the real world do not satisfy the CLT assumptions of independence and stationarity
• But they can often be transformed into stationary time series, e.g., by differencing or other transformations
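Differencing as a stationarizing transformation can be sketched on a random walk (our own illustrative example of a non-stationary process whose increments are stationary):

```python
import numpy as np

rng = np.random.default_rng(6)

# A random walk is non-stationary: its variance grows with t
T = 1_000
walk = np.cumsum(rng.normal(0.0, 1.0, size=(5_000, T)), axis=1)
print(walk[:, 10].var(), walk[:, 900].var())  # variance grows with t

# First differencing recovers the stationary i.i.d. increments
increments = np.diff(walk, axis=1)
print(increments[:, 10].var(), increments[:, 900].var())  # both approx 1
```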
If V_∞ > 0, then
√n (X̄_n − μ) → N(0, V_∞).
(a) A stochastic process {X_t}_{t∈{1,…,n}} is said to be M-dependent if {X_t}_{t≤m} are independent of the stochastic variables {X_t}_{t≥m+M+1}.
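A minimal simulation of this less restrictive CLT, using a 1-dependent moving-average process as our own illustrative example (observations more than one step apart are independent, and its long-run variance V_∞ = 0.5 + 2·0.25 = 1 is known in closed form):

```python
import numpy as np

rng = np.random.default_rng(7)

# 1-dependent process: X_t = (eps_t + eps_{t-1}) / 2
n, n_repeats = 500, 10_000
eps = rng.normal(0.0, 1.0, size=(n_repeats, n + 1))
x = 0.5 * (eps[:, 1:] + eps[:, :-1])

# sqrt(n) * (sample mean - 0) should be approx N(0, V_infty), with
# V_infty = Var(X_t) + 2 Cov(X_t, X_{t+1}) = 0.5 + 2 * 0.25 = 1.0
z = np.sqrt(n) * x.mean(axis=1)
print(z.var())  # close to 1.0
```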
Time series fundamentals
Recap
Time series have long been studied in history
• Recent digitalization increases the importance of time series analysis
Properties of time series
• Regularly vs irregularly sampled
• Univariate vs multivariate
• Discrete vs continuous
• Periodic
• Deterministic vs non-deterministic
• Stationarity
• Ergodicity
The central limit theorem only holds for stationary time series
• Less restrictive CLT versions exist
• Need to properly learn dependences