ARIMA Models in Python Chapter3

This document provides an introduction to ARIMA modeling in Python. It discusses autocorrelation functions (ACF) and partial autocorrelation functions (PACF), which can be used to identify appropriate AR and MA parameters for ARIMA models. It also covers implementing ARIMA models in Python using the statsmodels library, and evaluating models using metrics like AIC, BIC and examining model residuals. The Box-Jenkins methodology for ARIMA modeling is outlined, including steps for model identification, estimation and diagnostics.

Intro to ACF and PACF
ARIMA MODELS IN PYTHON

James Fulton
Climate informatics researcher
Motivation

ACF and PACF
ACF - Autocorrelation Function

PACF - Partial autocorrelation function

What is the ACF
lag-1 autocorrelation → corr(y_t, y_{t-1})

lag-2 autocorrelation → corr(y_t, y_{t-2})

...

lag-n autocorrelation → corr(y_t, y_{t-n})
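
The definition above can be tried out directly. The following is a minimal sketch, assuming a simulated AR(1) series stands in for real data; the helper name `lag_autocorr` is hypothetical. It uses a plain Pearson correlation between the series and its lagged copy, which is close to, but not exactly, the estimator that statsmodels applies.

```python
import numpy as np

def lag_autocorr(y, k):
    """Sample correlation between y_t and y_{t-k}, for k >= 1."""
    y = np.asarray(y, dtype=float)
    # y[k:] holds y_t and y[:-k] holds y_{t-k} for the same positions
    return np.corrcoef(y[k:], y[:-k])[0, 1]

# Hypothetical demo series: AR(1) process y_t = 0.8 * y_{t-1} + noise
rng = np.random.default_rng(0)
noise = rng.normal(size=2000)
y = np.zeros(2000)
for t in range(1, 2000):
    y[t] = 0.8 * y[t - 1] + noise[t]

# Autocorrelation at lags 1, 2 and 3
acf_values = [lag_autocorr(y, k) for k in (1, 2, 3)]
print(acf_values)
```

For this kind of series the values should decay gradually as the lag grows, rather than dropping to zero at once.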

What is the PACF

Using ACF and PACF to choose model order

AR(2) model → ACF tails off, PACF cuts off after lag 2

MA(2) model → ACF cuts off after lag 2, PACF tails off
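
For an AR(p) process the PACF is expected to fall to roughly zero after lag p. The sketch below checks this on a simulated AR(2) series. It is an illustration only: the helper `pacf_ols` is hypothetical and estimates each partial autocorrelation as the last coefficient of an ordinary least squares regression on the previous k lags. In practice the statsmodels plotting functions would normally be used instead.

```python
import numpy as np

def pacf_ols(y, max_lag):
    """Partial autocorrelations for lags 1..max_lag via successive OLS fits."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)
    values = []
    for k in range(1, max_lag + 1):
        # Column j holds y_{t-j} for j = 1..k; the regression target is y_t
        X = np.column_stack([y[k - j:n - j] for j in range(1, k + 1)])
        coefs, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
        # The coefficient on the longest lag is the lag-k partial autocorrelation
        values.append(coefs[-1])
    return np.array(values)

# Hypothetical AR(2) series: y_t = 0.5*y_{t-1} + 0.3*y_{t-2} + noise
rng = np.random.default_rng(1)
noise = rng.normal(size=5000)
y = np.zeros(5000)
for t in range(2, 5000):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + noise[t]

pacf_values = pacf_ols(y, max_lag=5)
threshold = 2 / np.sqrt(len(y))  # rough 95% significance band
for lag, value in enumerate(pacf_values, start=1):
    print(lag, round(value, 3), abs(value) > threshold)
```

Lags 1 and 2 should stand well clear of the band, while later lags should sit close to zero.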

Implementation in Python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Create figure with two stacked axes
fig, (ax1, ax2) = plt.subplots(2,1, figsize=(8,8))

# Make ACF plot (zero=False leaves out the trivial lag-0 value)
plot_acf(df, lags=10, zero=False, ax=ax1)

# Make PACF plot
plot_pacf(df, lags=10, zero=False, ax=ax2)

plt.show()

Over/under differencing and ACF and PACF

Let's practice!

AIC and BIC

AIC - Akaike information criterion
Lower AIC indicates a better model

AIC likes to choose simple models with lower order

BIC - Bayesian information criterion
Very similar to AIC

Lower BIC indicates a better model

BIC likes to choose simple models with lower order

AIC vs BIC
BIC favors simpler models than AIC

AIC is better at choosing predictive models

BIC is better at choosing a good explanatory model

AIC and BIC in statsmodels
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Create model (order matches the SARIMAX(2, 0, 0) summary shown below)
model = SARIMAX(df, order=(2,0,0))
# Fit model
results = model.fit()
# Print fit summary
print(results.summary())

                           Statespace Model Results
==============================================================================
Dep. Variable:                      y   No. Observations:                 1000
Model:               SARIMAX(2, 0, 0)   Log Likelihood               -1399.704
Date:                Fri, 10 May 2019   AIC                           2805.407
Time:                        01:06:11   BIC                           2820.131
Sample:                    01-01-2013   HQIC                          2811.003
                         - 09-27-2015
Covariance Type:                  opg
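
The AIC and BIC in this summary can be reproduced by hand from the log likelihood, using the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L). The sketch below assumes k = 3 estimated parameters (two AR coefficients plus the noise variance, based on the SARIMAX(2, 0, 0) line in the summary). That parameter count is an assumption about how the scores were computed, though it does agree with the reported numbers up to rounding.

```python
import math

log_likelihood = -1399.704   # "Log Likelihood" from the summary
n_obs = 1000                 # "No. Observations" from the summary
k = 3                        # assumed: 2 AR coefficients + 1 noise variance

# Both criteria reward a high likelihood and penalise extra parameters
aic = 2 * k - 2 * log_likelihood
bic = k * math.log(n_obs) - 2 * log_likelihood

print(round(aic, 3), round(bic, 3))
```

Because ln(1000) is larger than 2, each extra parameter costs more under BIC than under AIC here, which is why BIC tends to favor simpler models.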

AIC and BIC in statsmodels
# Create model
model = SARIMAX(df, order=(1,0,1))

# Fit model
results = model.fit()

# Print AIC and BIC
print('AIC:', results.aic)
print('BIC:', results.bic)

AIC: 2806.36
BIC: 2821.09

Searching over AIC and BIC
# Loop over AR order
for p in range(3):
    # Loop over MA order
    for q in range(3):
        # Fit model
        model = SARIMAX(df, order=(p,0,q))
        results = model.fit()
        # Print the model order and the AIC/BIC values
        print(p, q, results.aic, results.bic)

0 0 2900.13 2905.04
0 1 2828.70 2838.52
0 2 2806.69 2821.42
1 0 2810.25 2820.06
1 1 2806.37 2821.09
1 2 2807.52 2827.15
...

Searching over AIC and BIC
order_aic_bic = []
# Loop over AR order
for p in range(3):
    # Loop over MA order
    for q in range(3):
        # Fit model
        model = SARIMAX(df, order=(p,0,q))
        results = model.fit()
        # Add order and scores to list
        order_aic_bic.append((p, q, results.aic, results.bic))

# Make DataFrame of model order and AIC/BIC scores
order_df = pd.DataFrame(order_aic_bic, columns=['p','q', 'aic', 'bic'])

Searching over AIC and BIC
# Sort by AIC
print(order_df.sort_values('aic'))

   p  q      aic      bic
7  2  1  2804.54  2824.17
6  2  0  2805.41  2820.13
4  1  1  2806.37  2821.09
2  0  2  2806.69  2821.42
...

# Sort by BIC
print(order_df.sort_values('bic'))

   p  q      aic      bic
3  1  0  2810.25  2820.06
6  2  0  2805.41  2820.13
4  1  1  2806.37  2821.09
2  0  2  2806.69  2821.42
...
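
Instead of reading the best order off a sorted printout, it can be picked programmatically. This is a small sketch using `idxmin`, with the scores copied from the rows printed on these slides. Orders whose scores are elided in the slides are left out, so the table here is incomplete by construction.

```python
import pandas as pd

# (p, q, aic, bic) values copied from the printed output on the slides;
# any order not shown there is omitted rather than guessed
order_aic_bic = [
    (0, 0, 2900.13, 2905.04),
    (0, 1, 2828.70, 2838.52),
    (0, 2, 2806.69, 2821.42),
    (1, 0, 2810.25, 2820.06),
    (1, 1, 2806.37, 2821.09),
    (1, 2, 2807.52, 2827.15),
    (2, 0, 2805.41, 2820.13),
    (2, 1, 2804.54, 2824.17),
]
order_df = pd.DataFrame(order_aic_bic, columns=['p', 'q', 'aic', 'bic'])

# Row with the lowest score under each criterion
best_by_aic = order_df.loc[order_df['aic'].idxmin()]
best_by_bic = order_df.loc[order_df['bic'].idxmin()]

best_aic_order = (int(best_by_aic['p']), 0, int(best_by_aic['q']))
best_bic_order = (int(best_by_bic['p']), 0, int(best_by_bic['q']))
print('Lowest AIC order:', best_aic_order)
print('Lowest BIC order:', best_bic_order)
```

Among the rows shown, the two criteria disagree: AIC picks the larger (2, 0, 1) model and BIC the smaller (1, 0, 0) model, in line with BIC favoring simpler models.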

Non-stationary model orders
# Fit model
model = SARIMAX(df, order=(2,0,1))
results = model.fit()

ValueError: Non-stationary starting autoregressive parameters
found with `enforce_stationarity` set to True.

When certain orders don't work
# Loop over AR order
for p in range(3):
    # Loop over MA order
    for q in range(3):

        # Fit model
        model = SARIMAX(df, order=(p,0,q))
        results = model.fit()

        # Print the model order and the AIC/BIC values
        print(p, q, results.aic, results.bic)

When certain orders don't work
# Loop over AR order
for p in range(3):
    # Loop over MA order
    for q in range(3):

        try:
            # Fit model
            model = SARIMAX(df, order=(p,0,q))
            results = model.fit()

            # Print the model order and the AIC/BIC values
            print(p, q, results.aic, results.bic)

        except Exception:
            # Print AIC and BIC as None when the fit fails
            print(p, q, None, None)

Let's practice!

Model diagnostics

Introduction to model diagnostics
How good is the final model?

Residuals
# Fit model
model = SARIMAX(df, order=(p,d,q))
results = model.fit()

# Assign residuals to variable
residuals = results.resid

2013-01-23 1.013129
2013-01-24 0.114055
2013-01-25 0.430698
2013-01-26 -1.247046
2013-01-27 -0.499565
... ...

Mean absolute error
How far are the predictions from the real values?

mae = np.mean(np.abs(residuals))
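
As a worked example, the sketch below applies this formula to the five residual values printed on the Residuals slide. It also computes the plain mean of the residuals for comparison. That comparison is an addition for illustration, and it shows why the absolute value is taken first.

```python
import numpy as np

# The five residual values printed on the Residuals slide
residuals = np.array([1.013129, 0.114055, 0.430698, -1.247046, -0.499565])

# Mean absolute error: average size of the errors, ignoring their sign
mae = np.mean(np.abs(residuals))

# Plain mean: positive and negative errors partly cancel each other
mean_error = np.mean(residuals)

print('MAE:', mae)
print('Mean error:', mean_error)
```

The plain mean comes out much closer to zero than the MAE because over- and under-predictions offset one another.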

Plot diagnostics
If the model fits well, the residuals will be white Gaussian noise

# Create the 4 diagnostics plots
results.plot_diagnostics()
plt.show()

Residuals plot

Histogram plus estimated density

Normal Q-Q

Correlogram

Summary statistics
print(results.summary())

...
===================================================================================
Ljung-Box (Q):                      32.10   Jarque-Bera (JB):                  0.02
Prob(Q):                             0.81   Prob(JB):                          0.99
Heteroskedasticity (H):              1.28   Skew:                             -0.02
Prob(H) (two-sided):                 0.21   Kurtosis:                          2.98
===================================================================================

Prob(Q) - p-value for null hypothesis that residuals are uncorrelated

Prob(JB) - p-value for null hypothesis that residuals are normal
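
To make these two p-values less of a black box, the sketch below computes a Ljung-Box statistic by hand and a Jarque-Bera test with SciPy. It uses simulated residuals, not the model from the slides. The helper `ljung_box` is hypothetical and uses the textbook formula with no adjustment for fitted parameters, so its numbers need not match what `results.summary()` prints. statsmodels has its own implementation, `acorr_ljungbox`, in `statsmodels.stats.diagnostic`.

```python
import numpy as np
from scipy import stats

def ljung_box(residuals, lags):
    """Return the Ljung-Box Q statistic and its chi-squared p-value."""
    x = np.asarray(residuals, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.sum(x ** 2)
    # Sample autocorrelations r_1 .. r_lags
    r = np.array([np.sum(x[k:] * x[:-k]) / denom for k in range(1, lags + 1)])
    q = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, lags + 1)))
    return q, stats.chi2.sf(q, df=lags)

rng = np.random.default_rng(2)
white = rng.normal(size=500)      # hypothetical well-behaved residuals
correlated = np.zeros(500)        # hypothetical residuals with leftover AR(1) structure
for t in range(1, 500):
    correlated[t] = 0.8 * correlated[t - 1] + white[t]

q_white, p_white = ljung_box(white, lags=10)
q_corr, p_corr = ljung_box(correlated, lags=10)
jb_stat, jb_p = stats.jarque_bera(white)

print('Ljung-Box p-value, white noise:', p_white)
print('Ljung-Box p-value, correlated :', p_corr)
print('Jarque-Bera p-value, white noise:', jb_p)
```

A p-value below 0.05 rejects the corresponding null hypothesis, so the correlated series should fail the Ljung-Box test decisively.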

Let's practice!

Box-Jenkins method

The Box-Jenkins method
From raw data → production model

identification

estimation

model diagnostics

Identification
Is the time series stationary?

What differencing will make it stationary?

What transforms will make it stationary?

What values of p and q are most promising?

Identification tools
Plot the time series
df.plot()

Use augmented Dickey-Fuller test
adfuller()

Use transforms and/or differencing
df.diff() , np.log() , np.sqrt()

Plot ACF/PACF
plot_acf() , plot_pacf()

Estimation
Use the data to train the model coefficients

Done for us using model.fit()

Choose between models using AIC and BIC
results.aic , results.bic

Model diagnostics
Are the residuals uncorrelated?

Are the residuals normally distributed?

results.plot_diagnostics()

results.summary()

Decision

Repeat
We go through the process again with more information

Find a better model

Production
Ready to make forecasts
results.get_forecast()

Box-Jenkins

Let's practice!