Diogo Resende - Time Series Forecasting Models in Python
Forecasting Models
in Python
0 Introduction to forecasting
1 Seasonal Decomposition
3 TBATS
6 Facebook Prophet
8 Ensemble
Forecasting
3 Understanding turning points
Visualization
Key ideas
Forecasting
model
• Trend
• Seasonality
• Error
Visualization Trend
Visualization Seasonality
Visualization Error
Start End
Additive vs. Multiplicative
If the seasonality adds absolute values to the trend, it is additive.
If the seasonality scales the trend by a percentage, it is multiplicative.
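The difference between the two forms can be sketched in a few lines. This is only an illustration under assumed numbers: the linear trend, the 12-period cycle, and the 20-unit / 20% seasonal sizes are all hypothetical.

```python
import math

PERIOD = 12  # assumption: monthly data with a yearly cycle


def trend(t):
    # Hypothetical linear trend: starts at 100 and grows 5 units per period.
    return 100 + 5 * t


def seasonal_effect(t, additive):
    # Sine wave standing in for the seasonal pattern.
    wave = math.sin(2 * math.pi * t / PERIOD)
    if additive:
        return 20 * wave               # fixed swing in absolute units
    return 0.2 * trend(t) * wave       # swing proportional to the trend level


def value(t, additive):
    # Observed value = trend plus the seasonal effect.
    return trend(t) + seasonal_effect(t, additive)


# Compare the seasonal peak in year 1 (t=3) and year 3 (t=27).
for t in (3, 27):
    print(t, round(seasonal_effect(t, True), 1), round(seasonal_effect(t, False), 1))
```

In the additive case the peak stays the same size as the trend grows; in the multiplicative case it grows with the trend.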
Key Idea
Trend
• Heavily dependent on the company
Seasonality
• Depends more on the industry, thus it is more predictable.
Description
Model Assessment
Dataset Time
Key Ideas
The dataset is usually split by time into a pre period (training set) and a post period (test set)
The test set should be as long as a real-world forecast horizon
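A chronological split can be sketched as follows. The function name `time_split` and the 36-month series are hypothetical; the only point is that the last observations, not a random sample, form the test set.

```python
def time_split(series, horizon):
    """Split chronologically: the last `horizon` points become the test set."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon], series[-horizon:]


# Hypothetical example: 36 monthly observations, 6-month forecast horizon.
sales = list(range(1, 37))
train, test = time_split(sales, horizon=6)
print(len(train), len(test))
```

Choosing `horizon` equal to the number of periods you would forecast in practice keeps the assessment realistic.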
Key Ideas
Weighted averages of past observations, with the
weights decaying exponentially as the observations get older
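The decaying weights can be sketched with simple exponential smoothing. This is a minimal illustration, not the Holt-Winters implementation used in the tutorials; `alpha` is the smoothing factor, and the input values are made up.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: each level mixes the new value with the old level."""
    level = series[0]
    smoothed = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


def observation_weights(alpha, n):
    """Weight given to the most recent observation, the one before it, and so on."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


print(exponential_smoothing([10, 20, 30], alpha=0.5))
print(observation_weights(alpha=0.5, n=3))
```

Each weight is the previous one multiplied by `1 - alpha`, which is the exponential decay described above.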
Visualization: importance of each observation plotted against time, up to today
• MAE and RMSE are performance indicators for regression models with continuous dependent variables

$MAE = \frac{\sum |y - \hat{y}|}{n}$

$RMSE = \sqrt{\frac{\sum (\hat{y} - y)^2}{n}}$

$MAPE = \frac{1}{n} \sum \left| \frac{y - \hat{y}}{y} \right|$
• A clear downside of MAPE is that errors with the same percentage have the same relevance, regardless of their absolute magnitude
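The three error metrics can be written directly from their definitions. The sketch below uses made-up numbers chosen so that both observations have the same 10% error but very different absolute errors.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((f - y) ** 2 for y, f in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error (as a fraction); actual values must be non-zero."""
    return sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / len(actual)


# Hypothetical data: both forecasts are off by 10%, but by 1 and 100 units.
actual = [10, 1000]
predicted = [11, 1100]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAPE treats both misses identically, while MAE and RMSE are dominated by the 100-unit miss.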
2 Visualize data
Description
Key Idea
Past values, the lags, contain information that helps predict future values
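A small sketch of the idea, assuming a single lag and an ordinary least-squares fit. The helper names and the toy series are hypothetical; the tutorials rely on library functions instead.

```python
def make_lagged(series, n_lags):
    """Pair each value with its n_lags previous values (most recent first)."""
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append((lags, series[t]))
    return rows


def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_(t-1)."""
    pairs = make_lagged(series, 1)
    x = [lags[0] for lags, _ in pairs]
    y = [target for _, target in pairs]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi


# Toy series generated by y_t = 2 + 0.5 * y_(t-1), starting at 10.
series = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(series)
next_value = c + phi * series[-1]  # one-step-ahead forecast
print(c, phi, next_value)
```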
Visualization
Today Time
Methodological Framework
$y_t = c + \alpha_1 \varepsilon_{t-1} + \dots + \alpha_n \varepsilon_{t-n}$
What is it?
Past error lags contain information that helps predict future values
How to do it?
We will do it automatically in the practice tutorials
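To make the formula concrete, here is a minimal sketch that evaluates it for given coefficients. The constant, the coefficients, and the past errors are hypothetical numbers; in practice a library estimates them.

```python
def ma_forecast(c, alphas, past_errors):
    """Evaluate y_t = c + alpha_1 * e_(t-1) + ... + alpha_n * e_(t-n).

    `past_errors` is ordered most recent first, matching `alphas`.
    """
    if len(alphas) != len(past_errors):
        raise ValueError("need one past error per coefficient")
    return c + sum(a * e for a, e in zip(alphas, past_errors))


# Hypothetical values: the last forecast was 5 too low, the one before 2 too high.
forecast = ma_forecast(c=100.0, alphas=[0.6, 0.3], past_errors=[5.0, -2.0])
print(forecast)
```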
Visualization Description
Why do we care?
Normal distribution is a requirement or assumption of many
statistical techniques
Key Idea
• Box Cox is part of the modelling.
• In practical terms, we do not need to do anything
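For reference, the transformation itself is short. The sketch below is the standard one-parameter Box-Cox transform and its inverse, written by hand as an illustration; the models in this course apply it internally.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1) ** (1 / lam)


z = box_cox(9.0, 0.5)          # square-root-like transform
print(z, inverse_box_cox(z, 0.5))
```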
Source: UK Government
ARIMA, SARIMA
& SARIMAX
Acronym Description
ARIMA AutoRegressive Integrated Moving Average
Component Description
Moving Average Instead of using the past values, the MA model uses past forecast errors.
Visualization
Time
Start End
Stationarity
Statistical test: Dickey-Fuller test. If the p-value is less than 0.05, the time series is considered stationary.
Visualization: two example series, Y plotted against t
Making Data Stationary
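The slide does not spell out the method, so the following is an assumption: differencing, which is the usual way to remove a trend or a seasonal pattern and corresponds to the "Integrated" part of ARIMA. The toy series are made up.

```python
def difference(series, lag=1):
    """Subtract from each value the value `lag` periods earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# A linear trend disappears after one regular difference.
trending = [10, 12, 14, 16, 18]
print(difference(trending))

# A repeating 4-period pattern on a rising level disappears after a seasonal difference.
seasonal = [1, 2, 3, 4, 2, 3, 4, 5]
print(difference(seasonal, lag=4))
```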
Examples
External Regressors
• Moving seasonality
Events like Black Friday or seasonal holidays like Easter or Diwali are not on the same dates every year.
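A moving event can be turned into an external regressor by computing its date for each year and flagging it with a dummy variable. The sketch below assumes Black Friday is the day after US Thanksgiving (the fourth Thursday of November); the helper names are hypothetical.

```python
from datetime import date, timedelta


def black_friday(year):
    """Day after US Thanksgiving (fourth Thursday of November)."""
    first = date(year, 11, 1)
    # weekday(): Monday is 0, Thursday is 3.
    days_to_first_thursday = (3 - first.weekday()) % 7
    thanksgiving = first + timedelta(days=days_to_first_thursday + 21)
    return thanksgiving + timedelta(days=1)


def event_dummy(dates, event_dates):
    """1 on event days, 0 otherwise: a column usable as an external regressor."""
    events = set(event_dates)
    return [1 if d in events else 0 for d in dates]


days = [date(2023, 11, 22) + timedelta(days=i) for i in range(5)]
dummy = event_dummy(days, [black_friday(2023)])
print(black_friday(2023), black_friday(2024), dummy)
```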
q: Order of the Moving Average part. Number of unknown terms that multiply your forecast errors at past times
Key Idea
• p, d, and q are non-negative integers.
No extra work, there are functions to optimize the factors automatically
6 factors to optimize in SARIMA
Seasonal Data S P, D, Q
Time Series
Key Idea
• The 3 additional seasonal factors (P, D, Q) mirror the classic ARIMA factors (p, d, q)
• No extra work, there are functions to optimize the factors automatically
• AIC and BIC provide a means to select a model by balancing goodness of fit against simplicity
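Both criteria reward goodness of fit (a higher log-likelihood) and penalize the number of parameters. The sketch below uses the standard definitions; the candidate models and their log-likelihoods are hypothetical numbers, not results from the course data.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates: the bigger model fits slightly better but uses more parameters.
candidates = {
    "ARIMA(1,1,1)": {"loglik": -250.0, "k": 3},
    "ARIMA(3,1,3)": {"loglik": -249.0, "k": 7},
}
best = min(candidates, key=lambda m: aic(candidates[m]["loglik"], candidates[m]["k"]))
print(best)
```

The small gain in fit does not justify four extra parameters here, so the simpler model is selected.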
Easy to Understand
3
2 Visualize data.
Description
• Structural Time Series is the decomposition of the data into at least:
• Trend
• Seasonality
• Exogenous impacts
• Leftovers: noise
Visualization panels: Data, Seasonality, Trend, Exogenous impacts
Methodological framework
$y(t) = c(t) + s(t) + x(t) + \epsilon$
Decomposition
• Trend
• Seasonality - multiple
• Exogenous impacts
• AutoRegressive
• Noise
Seasonality
• Weekly
• Monthly
• Yearly
Autoregressive
• Focus on giving weight to recent information
Intuitive
4
Description
Udemy wikipedia page visits
Description
1 Built by Facebook
2 Stan background - probabilistic programming language for statistical inference
3 Dynamic Holidays
Where:
c(t) Trend +
s(t) Seasonality +
Visualization
Facebook Prophet
You state Valentine's as a key event and specify how many days before/after to quantify
Other models:
You must create dummy variables for each day, if you believe they have different impacts
Visualization: chocolate demand from 11 to 15 February
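The before/after window can be illustrated without Prophet. In Prophet the same idea is expressed through the `lower_window` and `upper_window` columns of its holidays table; the sketch below only mimics that idea in plain Python, and the helper names and the year are hypothetical.

```python
from datetime import date, timedelta


def holiday_window(event_day, days_before, days_after):
    """All dates affected by the event, from days_before to days_after."""
    return [event_day + timedelta(days=offset)
            for offset in range(-days_before, days_after + 1)]


def per_day_dummies(dates, window):
    """One dummy column per day of the window, as other models would need."""
    return {day.isoformat(): [1 if d == day else 0 for d in dates] for day in window}


valentines = date(2024, 2, 14)
window = holiday_window(valentines, days_before=3, days_after=1)  # 11 to 15 February
february = [date(2024, 2, 1) + timedelta(days=i) for i in range(29)]
dummies = per_day_dummies(february, window)
print([d.day for d in window], len(dummies))
```

One event definition with a window replaces the five separate dummy columns built by `per_day_dummies`.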
Facebook Prophet Model
Component Description
holidays_prior_scale Larger values allow the model to fit larger holiday effects
Key Idea
Repeating the assessment of our model reinforces its evaluation
Set Parameters
Run XGBoost
Assess Model
Description
XGBoost is a state-of-the-art Machine Learning algorithm
1 Stands for Extreme Gradient Boosting
2 Can be constructed with a tree-based algorithm or a linear one (worse results)
3 It is an ensemble algorithm
4 Each new model is built upon the preceding one -> continuous improvement
Key Idea
XGBoost only looks at a fraction of the observations at a time
Observations that are more difficult to predict are given a bigger weight
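The "each model builds on the previous one" idea can be sketched with plain gradient boosting on one-split trees (stumps). This is not XGBoost itself: there is no subsampling or regularization, and hard-to-predict observations gain influence only through their larger residuals. The data and function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single split on x by squared error: (threshold, left_mean, right_mean).

    Assumes x contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, n_rounds=50, eta=0.3):
    """Each round fits a stump to the current residuals and adds a fraction eta of it."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps, errors = [], []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + eta * predict_stump(stump, xi)
                       for xi, pi in zip(x, predictions)]
        errors.append(sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y))
    return base, stumps, errors


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 6, 7, 9, 12, 15, 19, 24]
base, stumps, errors = boost(x, y)
print(round(errors[0], 3), round(errors[-1], 3))  # training error after round 1 vs. last round
```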
Visualization: Third Tree, with a table of Error, Outcome, X1, X2, X3 and Weight per observation
Key Idea
Predictors also have different weights
Description
NA:
Unlike other regression models, XGBoost treats NAs as information
Non-linearity:
XGBoost is excellent at dealing with non-linear relationships between the dependent and the independent variables.
Parameter             Description
Minimum Child weight  Relates to the sum of the weights of the observations. Low values can mean that not many observations are in the round
ETA                   Learning rate. How fast do you want the model to learn?
Max depth             How big should the tree be? Bigger trees go into more detail
Colsample by tree     What fraction of the predictors (columns) should be sampled for each tree?
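In the XGBoost library these four parameters are named `min_child_weight`, `eta`, `max_depth` and `colsample_bytree`. The dictionary below is only a starting point using the library defaults, not tuned values, and the `invalid_params` range check is a hypothetical helper.

```python
# Starting values (XGBoost defaults); tune them for your own data.
xgb_params = {
    "min_child_weight": 1,    # minimum sum of observation weights in a child node
    "eta": 0.3,               # learning rate
    "max_depth": 6,           # maximum depth of each tree
    "colsample_bytree": 1.0,  # fraction of predictors sampled per tree
}


def invalid_params(params):
    """Return the names of parameters outside their allowed ranges."""
    problems = []
    if params["min_child_weight"] < 0:
        problems.append("min_child_weight")
    if not 0 <= params["eta"] <= 1:
        problems.append("eta")
    if params["max_depth"] < 0:
        problems.append("max_depth")
    if not 0 < params["colsample_bytree"] <= 1:
        problems.append("colsample_bytree")
    return problems


print(invalid_params(xgb_params))
```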
Source: UK Government
Ensemble
Example
Date  Y   Holt-Winters  SARIMAX  TBATS  TFP   Prophet  XGBoost  Ensemble
t     50  48            49       51     50.5  53       51       50.5
Key Idea
• An ensemble is an average of models. All models have flaws, but if you group them, the errors of individual models tend to average out
To consider:
• Dynamic average. You give more weight to models that have smaller errors and penalize the ones that are not performing as well.
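Both the plain average and a dynamic average can be sketched briefly. Weighting each model by the inverse of its past error is one possible scheme among many, chosen here as an assumption; the model names, forecasts, and errors are hypothetical.

```python
def simple_average(forecasts):
    """Plain ensemble: every model counts the same."""
    return sum(forecasts.values()) / len(forecasts)


def inverse_error_weights(errors):
    """Weights proportional to 1 / error, normalized to sum to 1. Errors must be positive."""
    inverse = {model: 1 / e for model, e in errors.items()}
    total = sum(inverse.values())
    return {model: w / total for model, w in inverse.items()}


def weighted_average(forecasts, weights):
    """Dynamic ensemble: better-performing models count more."""
    return sum(forecasts[model] * weights[model] for model in forecasts)


# Hypothetical forecasts for one date and each model's past MAE.
forecasts = {"model_a": 48.0, "model_b": 52.0, "model_c": 53.0}
past_mae = {"model_a": 1.0, "model_b": 2.0, "model_c": 4.0}

weights = inverse_error_weights(past_mae)
print(simple_average(forecasts), round(weighted_average(forecasts, weights), 2))
```

The weighted result moves toward `model_a`, the model with the smallest past error.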
Deep dives
The research on combining forecasts to achieve better accuracy
is extensive, persuasive, and consistent.
Essam Mahmoud, “Accuracy in Forecasting: A Survey,” Journal of Forecasting, April–June 1984, p. 139.
Spyros Makridakis and Robert L. Winkler, “Averages of Forecasts: Some Empirical Results,” Management Science, September 1983, p. 987.
Victor Zarnowitz, “The Accuracy of Individual and Group Forecasts from Business Outlook Surveys,” Journal of Forecasting, January–March 1984, p. 10.
Preparation
2