Diogo Resende - Time Series Forecasting Models in Python


Time Series

Forecasting Models
in Python

Diogo Resende | Time Series Forecasting Models in Python


Time Series Forecasting Models in Python

0 Introduction to forecasting

1 Seasonal Decomposition

2 Exponential Smoothing and Holt-Winters

3 TBATS

4 Arima, Sarima and Sarimax

5 Tensorflow Structural Time Series

6 Facebook Prophet

7 Facebook Prophet + XGBoost

8 Ensemble



Introduction to
Forecasting



Predictions that were just wrong

Thomas Watson, chairman of IBM (1943): "I think there is a world market for maybe five computers."

John Maynard Keynes: "Three hour shifts or a fifteen-hour work week"

Einstein: "There is not the slightest indication that nuclear energy will ever be obtainable. That would mean that the atom would have to be shattered at will."

Steve Ballmer: "There's no chance that the iPhone is going to get any significant market share."
Analytics is key to drive Forecasting

1 Bringing Science to a sometimes gut-feeling job
2 Barometer for the company -> Quantifies direction
3 Understanding turning points
4 Can uncover opportunities


What is Time Series Data?

Key ideas
• Sequence of data points in time order (oldest to newest)
• Most commonly, it is data recorded in equally distanced time periods
• Type of Panel Data (multidimensional dataset)


Case Study Briefing – Demand Forecasting

Bike Sharing: How many rides are done per day?

1 Holidays and weather KPIs included
2 Time periods: 2011 and 2012
3 Forecast December 2012 to assess each forecasting model

[1] Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3.

Seasonal Decomposition


Seasonal Decomposition: the actual values to be decomposed

Key ideas

A seasonal Time Series can be decomposed into:
• Trend
• Seasonality
• Error

We try to use external regressors to model the remaining error term.


Seasonal Decomposition: Trend
[Figure: trend component]

Seasonal Decomposition: Seasonality
[Figure: seasonality component over one year (Jan, Apr, Jul, Oct)]

Seasonal Decomposition: Error
[Figure: error component from the start to the end of the series]

Additive vs. Multiplicative

Additive: $y(t) = T(t) + S(t) + e(t)$
Multiplicative: $y(t) = T(t) \cdot S(t) \cdot e(t)$

Key ideas
If we talk about seasonality in terms of percentage, then we should consider a multiplicative seasonality.
If seasonality adds absolute values, then it is additive.
If trend is exponential, then it is multiplicative.
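
As a minimal sketch of the two formulas on this slide, the following composes a series from a trend, a seasonal component and an error term in both ways. The function names and the toy numbers are illustrative assumptions, not part of the course material.

```python
def compose_additive(trend, seasonal, error):
    """y(t) = T(t) + S(t) + e(t): seasonality adds absolute amounts."""
    return [t + s + e for t, s, e in zip(trend, seasonal, error)]


def compose_multiplicative(trend, seasonal, error):
    """y(t) = T(t) * S(t) * e(t): seasonality scales the trend by a factor."""
    return [t * s * e for t, s, e in zip(trend, seasonal, error)]


# Hypothetical toy data: the trend doubles between the two periods.
trend = [100.0, 200.0]

# Additive: the seasonal uplift is +10 units regardless of the trend level.
additive = compose_additive(trend, [10.0, 10.0], [0.0, 0.0])

# Multiplicative: the seasonal uplift is +10 percent of the trend level.
multiplicative = compose_multiplicative(trend, [1.1, 1.1], [1.0, 1.0])

# Uplift over the trend in each period, for comparison.
additive_uplift = [y - t for y, t in zip(additive, trend)]
multiplicative_uplift = [y - t for y, t in zip(multiplicative, trend)]
```

The additive uplift stays at 10 units in both periods, while the multiplicative uplift grows with the trend (roughly 10 and 20 units).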

Forecasting is all about error modelling

Description
• The essential part of forecasting
• Understanding what else can explain the Error
• How? Usually in the form of external regressors
• High errors in the beginning of dataset? Consider discarding that part of the data.

Data without patterns: Stocks

Key Idea

• If there is no pattern, you should not use forecasting models


• Forecasting models work best with consistent seasonality and trends

Trend
• Heavily dependent on the company

Seasonality
• Depends more on the industry, thus it is more predictable.



Exponential Smoothing & Holt-Winters


Let's imagine this is our full data set

[Figure: the full data set]


Splitting between training and test enables an unbiased model assessment

[Figure: the data set divided into a Training Set and a Test Set; the Test Set is used for Model Assessment]


Training and Test set in Time series

[Figure: the Dataset along the Time axis, with the Training set first and the Test set last]

Key Ideas
In forecasting, the data is usually split into a pre and post period from a time perspective
The Test Set should be of the size of a real-world forecast
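
The time-based split can be sketched as follows. This is a minimal example that assumes the observations are already sorted from oldest to newest; the function name and the toy series are hypothetical.

```python
def chronological_split(series, test_size):
    """Split a time-ordered sequence into (training, test) without shuffling."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    # The most recent observations form the test set (the "post" period).
    return series[:-test_size], series[-test_size:]


# Hypothetical example: 36 monthly observations and a 12-month forecast horizon.
monthly_values = list(range(1, 37))
training_set, test_set = chronological_split(monthly_values, test_size=12)
```

Unlike a random split, every training observation is older than every test observation, so the assessment mimics a real forecast.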

Diogo Resende | Time Series Forecasting Models in Python


What is Exponential Smoothing?

Key Ideas
Weighted averages of past observations, with the
weights decaying exponentially as the observations get older
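
A minimal sketch of simple exponential smoothing, assuming a smoothing factor `alpha` between 0 and 1 and the first observation as the starting level (both choices are assumptions made for illustration, not taken from the slides):

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumption: initialise the level with the first value
    smoothed = [level]
    for value in series[1:]:
        # New level = weighted average of the newest value and the old level.
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


def observation_weights(alpha, n):
    """Weight of the newest, second newest, ... observation (decays exponentially)."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


smoothed = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
weights = observation_weights(alpha=0.5, n=3)
```

With `alpha=0.5` each older observation carries half the weight of the next newer one, which is the exponential decay described above.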

[Figure: importance (weight) of each observation over time, highest at "Today"]


Holt-Winters is a Triple split Exponential Smoothing

Splits the time series into 3:
• Level
• Trend
• Seasonality

Key Ideas
• Performs Exponential Smoothing in each of the 3 components
• Holt-Winters is also called Triple Exponential Smoothing
• There are 2 variants: Additive and Multiplicative


Mean Absolute Error (MAE) vs Root Mean Squared Error (RMSE)

[Figure: model fit against the actual values, Y over time]

Key ideas
• MAE and RMSE are performance indicators for Regression models with continuous dependent variables

$MAE = \frac{\sum |y - \hat{y}|}{n}$

$RMSE = \sqrt{\frac{\sum (\hat{y} - y)^2}{n}}$

• RMSE penalizes large errors more heavily, which is useful when extremes / outliers matter
• MAE is more interpretable.
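
Both indicators can be computed in a few lines. The sketch below uses hypothetical numbers in which one forecast misses by 4 units, to show how RMSE reacts more strongly to a single large error than MAE does.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


# Hypothetical values: errors of 1, 0, 1 and one larger miss of 4.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 20.0]

mae = mean_absolute_error(actual, predicted)       # (1 + 0 + 1 + 4) / 4
rmse = root_mean_squared_error(actual, predicted)  # sqrt((1 + 0 + 1 + 16) / 4)
```

Here MAE is 1.5 while RMSE is about 2.12, because squaring lets the single miss of 4 dominate.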



Mean Absolute Percentage Error (MAPE)

[Figure: model fit against the actual values, Y over X]

Key ideas
• MAPE represents a very interpretable way of measuring errors

$MAPE = \frac{1}{n} \sum \left| \frac{y - \hat{y}}{y} \right|$

• Clear downside is that all errors have the same relevance, regardless of their magnitude, if the percent error is the same
• There is no universal good accuracy measure. It will depend on your problem and business need!
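
The downside mentioned on this slide can be seen in a short sketch. The numbers are hypothetical: a miss of 10 units on a value of 100 and a miss of 1,000 units on a value of 10,000 produce the same MAPE.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average absolute error relative to the actual value (as a fraction)."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Same percent error, very different absolute error.
small_scale = mean_absolute_percentage_error([100.0], [110.0])
large_scale = mean_absolute_percentage_error([10000.0], [11000.0])
```

Both calls return 0.1 (10 percent), although the second forecast is off by 100 times more units.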

Pros and Cons

Pros
1 Easy to Apply
2 Easy to understand

Cons
1 Does not allow external regressors
2 Low Flexibility
3 Better with low amount of time periods or frequency


Challenge
Use Holt-Winters to predict the amount of airmiles

1 Set Index frequency to Monthly. Use "MS"
2 Visualize data
3 Create Training and Test Set. Test Set should be 12 months
4 Create Holt-Winters Model
5 Predict 12 months and visualize them, together with the training and test set
6 Assess Model based on MAE

Dataset: TSA package


TBATS



Meaning of TBATS

1 Trigonometric seasonality
2 Box-Cox transformation
3 AutoRegressive Moving Average
4 Trend
5 Seasonality

Origin
Created in 2011

Similar to Exponential Smoothing
Why: The math behind has several similarities


AutoRegressive components

Key Idea
Past values, the lags, contain information that helps predict future values

$Y_t = c + \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \dots + \alpha_n Y_{t-n}$

How to determine how many lags
We will do it automatically in the practice tutorials
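
To make the autoregressive equation concrete, here is a minimal one-step prediction with already known coefficients. The constant and the coefficients are hypothetical; in practice they are estimated from the data.

```python
def ar_predict(history, constant, coefficients):
    """One-step AR forecast: c + alpha_1 * Y[t-1] + ... + alpha_n * Y[t-n]."""
    n_lags = len(coefficients)
    if len(history) < n_lags:
        raise ValueError("history must contain at least one value per lag")
    # Reverse the history so that lags[0] is Y[t-1], lags[1] is Y[t-2], ...
    lags = history[::-1][:n_lags]
    return constant + sum(a * y for a, y in zip(coefficients, lags))


# Hypothetical AR(2): c = 1, alpha_1 = 0.5, alpha_2 = 0.25.
next_value = ar_predict([1.0, 2.0, 3.0], constant=1.0, coefficients=[0.5, 0.25])
```

The forecast is 1 + 0.5 * 3 + 0.25 * 2 = 3.0: the most recent value gets the first coefficient, the one before it the second.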



Moving Average components

[Figure: visualization of the errors from the start to the end of the series]

Methodological Framework
$y_t = c + \alpha_1 \varepsilon_{t-1} + \dots + \alpha_n \varepsilon_{t-n}$

What is it?
Past error lags contain information that helps predict future values

How to do it?
We will do it automatically in the practice tutorials


Trigonometric seasonality

Description
• Trigonometry is part of the modelling.
• The seasonality equation contains the Sine and Cosine
• In practical terms, we do not need to do anything


BOX-COX

What is it?
Transforming the dependent variable so that its distribution is closer to a normal distribution

Why do we care?
Normal distribution is a requirement or assumption of many statistical techniques

Key Idea
• Box Cox is part of the modelling.
• In practical terms, we do not need to do anything
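
Although TBATS applies the transformation internally, a short sketch shows what it does. This assumes the standard Box-Cox definition with a given parameter `lam` (the logarithm when `lam` is 0); choosing `lam` is left out, and the function names are illustrative.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform: log(y) if lam == 0, else (y**lam - 1) / lam."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


def inverse_box_cox(values, lam):
    """Map transformed values back to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1) ** (1 / lam) for v in values]


# Hypothetical example with lam = 0.5 (a square-root style transform).
transformed = box_cox([1.0, 4.0], lam=0.5)
restored = inverse_box_cox(transformed, lam=0.5)
```

Forecasts made on the transformed scale need the inverse step to be read in the original units.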



Pros and Cons

Pros
1 Seasonality is allowed to change over time
2 Automated Optimization
3 Easy implementation

Cons
1 Prediction intervals often wide
2 Does not allow external regressors
3 Slow


Challenge
Use TBATS to predict weekly store footfall

1 Transform Index to have weekly frequency. Use "W"
2 Visualize data. Something will be off ;)
3 Create Training and Test Set. Test Set should be 5 weeks
4 Create TBATS Model
5 Predict 5 weeks and visualize them, together with the training and test set
6 Assess Model based on RMSE

Source: UK Government

ARIMA, SARIMA & SARIMAX


What does it all mean?

Acronym | Description
ARIMA | AutoRegressive Integrated Moving Average
SARIMA | Seasonal + ARIMA
SARIMAX | SARIMA + Exogenous variables


What is ARIMA?

Component | Description
AutoRegressive | The output is regressed on its own lagged values
Integrated | Number of times we need to do differencing to make our time series stationary
Moving Average | Instead of using the past values, the MA model uses past forecast errors.


ARMA recap

AutoRegressive
Past values, the lags, contain information that helps predict future values
[Figure: past values along the Time axis]

Moving Average
Past error lags contain information that helps predict future values
[Figure: errors from the start to the end of the series]

Stationarity

[Figure: four panels of Y over t: Stationary Time Series, Time dependent mean, Time dependent variance, Time dependent covariance]

Key idea
Mean, variance and covariance are not time dependent
Stationary Time Series have a clearly defined pattern

Statistical test:
Dickey-Fuller test. If p-value is less than 0.05, time series is considered stationary

Making Data Stationary

Time Series | 1st differencing | 2nd differencing
5 | NA | NA
9 | 4 | NA
1 | -8 | -12
7 | 6 | 14
3 | -4 | -10
7 | 4 | 8
4 | -3 | -7

Key idea
Making data stationary is simple, yet the concept is confusing.
From a practical perspective, it is a check that we need to do
The Auto.arima function does it automatically for us!
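
The differencing table on this slide can be reproduced with a few lines. The helper name is an assumption; the input is the example series from the slide.

```python
def difference(series):
    """Return the change between each observation and the previous one."""
    return [current - previous for previous, current in zip(series, series[1:])]


series = [5, 9, 1, 7, 3, 7, 4]

# First differencing drops one observation (the first row has no predecessor).
first_difference = difference(series)

# Second differencing is the difference of the first differences.
second_difference = difference(first_difference)
```

`first_difference` is `[4, -8, 6, -4, 4, -3]` and `second_difference` is `[-12, 14, -10, 8, -7]`, matching the table once the leading NA entries are left out.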



SARIMAX

External Regressors
• The goal of the regressors is to model the remaining error.
• Information that is not recurrent over time or modifies itself.

Examples
• Moving seasonality
Events like Black Friday or seasonal holidays like Easter or Diwali are not on the same dates every year.
• Events outside the company control
Factors like weather or corona interfere with the usual seasonality or trend, thus you need to model them in your forecast to decrease errors
• Events caused by the company
Major investment or strategy shifts affect the normal development of a KPI. You need to try to find a metric that represents any of these factors


3 factors to optimize in ARIMA(p,d,q)

Order | Description | Explanation
p | Order of the Autoregressive part | Number of unknown terms that multiply your signal at past times
d | Degree of first Differencing involved | Number of differences to make time series stationary
q | Order of the Moving Average part | Number of unknown terms that multiply your forecast errors at past times

Key Idea
• p, d, and q are non-negative integers.
• No extra work, there are functions to optimize the factors automatically

6 factors to optimize in SARIMA

Data Type | Acronym | Factors
Seasonal Data | S | P, D, Q
Non-seasonal Data | ARIMA | p, d, q

Key Idea
• Despite having 3 more factors to optimize, they mirror the classic ARIMA (p, d, q)
• No extra work, there are functions to optimize the factors automatically


Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC)

Key Ideas
• AIC and BIC provide a means to select a model
• Trade-off between simplicity and goodness of fit
• Deal with overfitting and underfitting

[Figure: pseudo-visualization of Goodness of fit against Simplicity]
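
The trade-off can be sketched with the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the number of observations and L the maximized likelihood. The candidate models and their log-likelihoods below are made-up numbers for illustration only.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: penalizes parameters more as n grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates: (log-likelihood, number of parameters).
candidates = {
    "ARIMA(1,1,1)": (-120.0, 3),
    "ARIMA(3,1,3)": (-119.0, 7),
}

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, n_obs=100) for name, (ll, k) in candidates.items()}
best_by_aic = min(aic_scores, key=aic_scores.get)
```

The larger model fits slightly better (higher log-likelihood), but the penalty for its four extra parameters outweighs the gain, so the simpler model is selected.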



Pros and Cons

Pros
1 Easy Implementation
2 Automated Optimization
3 Easy to Understand

Cons
1 Better with low amount of time periods or frequency
2 Low Flexibility


Challenge
Use SARIMAX to predict interest in Churrasco

1 Transform Index to have weekly frequency. Use "W"
2 Visualize data.
3 Create Training and Test Set. Test Set should be 10 weeks
4 Extract Exogenous Variables and Create SARIMAX model
5 Predict 10 weeks and visualize them, together with the training and test set
6 Assess Model based on MAPE

Source: Google Trends

TensorFlow Probability Structural Time Series


Structural Time Series

[Figure: four panels: Data, Seasonality, Trend, Exogenous impacts]

Description
• Structural Time Series is the decomposition of the data in at least:
  • Trend
  • Seasonality
  • Exogenous impacts
  • Leftovers: noise

Methodological framework
$y(t) = c(t) + s(t) + x(t) + \epsilon$

Tensorflow Structural Time series

Decomposition
• Trend
• Seasonality - multiple
• Exogenous impacts
• AutoRegressive
• Noise

Seasonality
• Weekly
• Monthly
• Yearly

Autoregressive
• Focus on giving weight to recent information


Hamiltonian Monte Carlo

Description
Simulation used for Bayesian Inference

Causal inference problem statement
We know what happened, but we do not know what led to it

Bayes Theorem
$$P(buy \mid impression) = \frac{P(impression \mid buy) \cdot P(buy)}{P(impression)} = \frac{P(impression \mid buy) \cdot P(buy)}{\int P(impression \mid buy) \cdot P(buy) \, d(buy)}$$

Problem statement
The integral in the denominator usually cannot be solved analytically, and thus we simulate outcomes
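
A small numeric sketch of Bayes' theorem and of the idea of simulating outcomes follows. It uses a discrete toy case with made-up probabilities and plain Monte Carlo sampling; it is not Hamiltonian Monte Carlo, which is a more efficient sampler built on the same principle.

```python
import random


def posterior_buy_given_impression(p_buy, p_imp_given_buy, p_imp_given_no_buy):
    """Exact Bayes' theorem for a binary buy / no-buy outcome."""
    p_impression = p_imp_given_buy * p_buy + p_imp_given_no_buy * (1 - p_buy)
    return p_imp_given_buy * p_buy / p_impression


def simulated_posterior(p_buy, p_imp_given_buy, p_imp_given_no_buy, n_draws, seed=0):
    """Estimate P(buy | impression) by simulating customers."""
    rng = random.Random(seed)
    impressions = 0
    buys_with_impression = 0
    for _ in range(n_draws):
        buy = rng.random() < p_buy
        p_impression = p_imp_given_buy if buy else p_imp_given_no_buy
        if rng.random() < p_impression:
            impressions += 1
            buys_with_impression += buy
    # Share of buyers among the simulated customers who saw an impression.
    return buys_with_impression / impressions


# Hypothetical probabilities, chosen only for illustration.
exact = posterior_buy_given_impression(0.1, 0.8, 0.2)
estimate = simulated_posterior(0.1, 0.8, 0.2, n_draws=100_000)
```

In this toy case the exact answer is available (0.08 / 0.26, about 0.31), so the simulated estimate can be checked against it. For the continuous models in this chapter only the simulation route is practical.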

Tensorflow Structural Time Series Pros and Cons

Pros
1 Flexible
2 Continuous Regressors
3 Good with short-term dynamics
4 Intuitive

Cons
1 Complex programming
2 Very slow

Challenge
Udemy wikipedia page visits

1 Set as regressors Easter and Christmas variables
2 Split into training and test set and isolate Y
3 Create weekly and monthly seasonality objects
4 Create Trend and Autoregressive components
5 Create Tensorflow model and fit it with Hamiltonian Monte Carlo
6 Predict 30 days and add index to the predictions.
7 Visualize forecast, training and test data

Dataset: TSA package

Facebook Prophet



Facebook Prophet quick facts

1 Built by Facebook
2 Stan background - probabilistic programming language for statistical inference
3 Dynamic Holidays
4 Prophet forecasts are customizable in ways that are intuitive to non-experts
5 Built-in Cross Validation & Hyperparameter Tuning


Prophet Mechanics

Methodological framework
$y(t) = c(t) + s(t) + h(t) + x(t) + \epsilon$

Where:
c(t) Trend
s(t) Seasonality
h(t) Holiday effects
x(t) External regressors
$\epsilon$ Error

Dynamic Holidays – Valentine's example

[Figure: chocolate demand from 11 to 15 February]

Facebook Prophet
You state Valentine's as a key event and specify how many days before/after to quantify

Other models:
You must create dummy variables for each day, if you believe they have different impacts

Facebook Prophet Model

Component | Description
Growth | Linear or Logistic
Holidays | Dataframe that we prepared
Seasonality | Yearly, weekly or daily. True or False
seasonality_mode | Multiplicative or additive
seasonality_prior_scale | Strength of the seasonality. Larger values allow the model to fit larger seasonal fluctuations
holidays_prior_scale | Strength of the holiday effects. Larger values allow the model to fit larger holiday effects
changepoint_prior_scale | Flexibility of the automatic changepoint selection

Cross Validation

[Figure: repeated Training set / Test set splits]

Key Idea
Repeating the assessment of our model on several training/test splits makes its evaluation more reliable
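
The repeated assessment can be sketched as a generic rolling-origin scheme. The parameter names `initial`, `horizon` and `period` are borrowed from the vocabulary of the challenge later in this chapter, but the function is a hypothetical illustration and not Prophet's own cross-validation routine.

```python
def rolling_origin_splits(n_obs, initial, horizon, period):
    """Return (train_indices, test_indices) pairs with a growing training set."""
    if min(initial, horizon, period) <= 0:
        raise ValueError("initial, horizon and period must be positive")
    splits = []
    cutoff = initial
    while cutoff + horizon <= n_obs:
        train_indices = range(0, cutoff)               # everything before the cutoff
        test_indices = range(cutoff, cutoff + horizon)  # the next `horizon` points
        splits.append((train_indices, test_indices))
        cutoff += period                               # move the cutoff forward
    return splits


# Hypothetical example: 10 observations, 4 for the first training set,
# forecasts of 2 steps, and a new fold every 2 steps.
splits = rolling_origin_splits(n_obs=10, initial=4, horizon=2, period=2)
```

Each fold trains on all data up to a cutoff and tests on the observations right after it, so the time order is never violated.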



Parameters to tune

Component | Description
seasonality_prior_scale | Strength of the seasonality. Larger values allow the model to fit larger seasonal fluctuations
holidays_prior_scale | Strength of the holiday effects. Larger values allow the model to fit larger holiday effects
changepoint_prior_scale | Flexibility of the automatic changepoint selection
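
A parameter grid for these three settings can be expanded with the standard library, as in the sketch below. The keys follow Prophet's parameter names (`seasonality_prior_scale`, `holidays_prior_scale`, `changepoint_prior_scale`); the candidate values and the MAE results are hypothetical placeholders, and the model fitting itself is left out.

```python
from itertools import product

# Hypothetical candidate values for each parameter.
param_grid = {
    "seasonality_prior_scale": [1.0, 10.0],
    "holidays_prior_scale": [1.0, 10.0],
    "changepoint_prior_scale": [0.01, 0.05, 0.5],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]


def pick_best(results):
    """Select the combination with the lowest MAE."""
    return min(results, key=lambda row: row["mae"])


combinations = expand_grid(param_grid)

# Placeholder results; in practice each MAE comes from cross validation.
example_results = [
    {"params": combinations[0], "mae": 120.0},
    {"params": combinations[1], "mae": 95.0},
]
best = pick_best(example_results)
```

The grid grows multiplicatively (2 x 2 x 3 = 12 combinations here), which is why tuning can become time-consuming.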



Pros and Cons

Pros
1 Flexible
2 Built-in Cross Validation
3 Dynamic Events
4 Allows regressors

Cons
1 Complex programming
2 Can need intense optimization
3 Not good with short-term dynamics
4 Not good with non-linear regressors

Challenge
Demand for Shelter in New York City

1 Rename Dependent and Time Variable to y and ds
2 Declare Easter and Thanksgiving as holidays. Combine them. Use pd.concat
3 Create Prophet model. Christmas is a regressor
4 Cross Validation. Horizon = 31, initial = 2400. Assess via MAE
5 Create Parameter Grid for Tuning
6 Perform Hyperparameter Tuning. Use MAE as the KPI to optimize. Gather Results

Dataset: Open Data NYC initiative


Facebook Prophet + XGBoost


Prophet and XGBoost step by step

Tuned Prophet Model

Borrow Seasonality, Trend and other Variables

Prepare XGBoost Matrices

Set Parameters

Run XGBoost

Assess Model

XGBoost is a state-of-the-art Machine Learning Algorithm

1 Stands for Extreme Gradient Boosting
2 Can be constructed with a tree based algorithm or linear (worse results)
3 It is an ensemble algorithm
4 Each new model is built upon the preceding one -> continuous improvement
5 Can be used for both Regression and Classification


XGBoost gives different weights depending on how difficult it is to predict

Outcome | Predictor | Weight (First Tree) | Weight (Second Tree) | Weight (Third Tree)
1 | X | 25% | 20% | 23%
0 | X | 25% | 20% | 15%
0 | X | 25% | 30% | 35%
1 | X | 25% | 30% | 27%


XGBoost looks at parts of the observations at a time

Outcome | Predictor | Weight (First Tree) | Weight (Second Tree) | Weight (Third Tree)
1 | X1 | 25% | 20% | 23%
0 | X2 | 25% | 20% | not used
0 | X3 | not used | 30% | 35%
1 | X4 | 25% | not used | 27%

Key Idea
XGBoost only looks at a fraction of the observations at a time
Observations that are more difficult to predict are given a bigger weight


The logic is similar for Regression-based tasks

First Tree
Error | Outcome | Predictor | Weight
-5 | 15 | X1 | 33%
2 | 22 | X2 | 33%
4 | 34 | X4 | 33%

Second Tree
Error | Outcome | Predictor | Weight
-1 | 19 | X1 | 40%
-1 | 25 | X2 | 30%
3 | 35 | X4 | 35%


XGBoost also gives different weights to different predictors

First Tree
Error | Outcome | Weight
-5 | 15 | 33%
2 | 22 | 33%
4 | 34 | 33%

Second Tree
Error | Outcome | Weight
-1 | 19 | 40%
-1 | 25 | 30%
3 | 35 | 35%

Third Tree
Error | Outcome | Weight
1 | 21 | 35%
0 | 24 | 30%
2 | 36 | 40%

[Figure residue: the X1, X2, X3 columns carry predictor weights of 50% / 50% in the first and second trees and 40% / 60% in the third tree]

Key Idea
Predictors also have different weights if they yield different model results

XGBoost quirks

NA:
Unlike other regression models, XGBoost treats NAs as information

Non-linearity:
XGBoost is excellent at dealing with non-linear relationships between the dependent and the independent variables.


Which parameters are there?

Parameter | Description
Minimum Child weight | Minimum sum of observation weights needed in a child node. Low values allow splits that contain only a few observations
ETA | Learning Rate. How fast do you want the model to learn?
Max depth | How big should the tree be? Bigger trees go into more detail
Gamma | Minimum loss reduction required to split a node further. Higher values make the model more conservative
Subsample | Share of observations used in each tree
Colsample by tree | Share of predictors (columns) used in each tree
Number of rounds | How many boosting rounds (trees) do we want to run?


Prophet + XGBoost Pros and Cons

Pros
1 Flexible
2 Great with Regressors
3 Decent with short-term dynamics

Cons
1 Complex programming
2 Can need intense optimization


Challenge
Demand for Shelter in New York City

1 Create future DF with test set length. Add regressor
2 Forecast and create a DF with: trend, weekly, yearly, holidays, multiplicative_terms
3 Concatenate with df. Drop Easter and Thanksgiving
4 Generate Training and Test Set. Isolate X and Y and form XGBoost Matrices
5 Set Parameters and Create XGBoost model
6 Predict. Visualize Test Set and Predictions. Assess model using MAPE

Dataset: Open Data NYC initiative

Ensemble



Ensemble mechanism

Example
Date | Y | Holt-Winters | SARIMAX | TBATS | TFP | Prophet | XGBoost | Ensemble
t | 50 | 48 | 49 | 51 | 50.5 | 53 | 51 | 50.5

Key Idea
• Ensemble is an average of models. All models have flaws, but if you group all of them, the errors of the individual models tend to average out

To consider:
• Dynamic average. You give more weight to models that have fewer errors and punish the ones that are not performing as well.
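
Both the plain and the dynamic average can be sketched in a few lines. The forecasts are the ones from the example row; the past errors used for weighting are hypothetical, and inverse-error weighting is just one possible way to implement a dynamic average.

```python
def simple_ensemble(forecasts):
    """Plain average of all model forecasts."""
    return sum(forecasts.values()) / len(forecasts)


def weighted_ensemble(forecasts, errors):
    """Weight each model by the inverse of its past error (lower error, higher weight)."""
    inverse = {name: 1.0 / errors[name] for name in forecasts}
    total = sum(inverse.values())
    return sum(forecasts[name] * inverse[name] / total for name in forecasts)


# Forecasts for period t from the example table.
forecasts = {
    "Holt-Winters": 48.0, "SARIMAX": 49.0, "TBATS": 51.0,
    "TFP": 50.5, "Prophet": 53.0, "XGBoost": 51.0,
}

# Hypothetical past MAE per model, used only to derive the weights.
hypothetical_mae = {
    "Holt-Winters": 4.0, "SARIMAX": 3.0, "TBATS": 2.0,
    "TFP": 1.0, "Prophet": 5.0, "XGBoost": 2.0,
}

plain_average = simple_ensemble(forecasts)
dynamic_average = weighted_ensemble(forecasts, hypothetical_mae)
```

The plain mean of the six forecasts is about 50.42, close to the 50.5 shown in the example row and closer to the actual value of 50 than most single models.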



Why Ensemble

Deep dives
The research on combining forecasts to achieve better accuracy
is extensive, persuasive, and consistent.
Essam Mahmoud,
“Accuracy in Forecasting: A Survey,” Journal of Forecasting, April–
June 1984, p. 139;
Spyros Makridakis and Robert L. Winkler,
“Averages of Forecasts: Some Empirical Results,” Management
Science, September 1983, p. 987
Victor Zarnowitz,
“The Accuracy of Individual and Group Forecasts from Business
Outlook Surveys,” Journal of Forecasting, January–March 1984, p. 10.



Pros and Cons

Pros
1 Accuracy

Cons
1 Lack of visibility
2 Preparation