Week 6 - ARIMA Forecasting Guidance
ARIMA stands for AutoRegressive Integrated Moving Average. There are seasonal
and non-seasonal ARIMA models that can be used for forecasting.
p = the number of periods to lag (the AR term). For example, if p = 3, we use the three
previous periods of our time series in the autoregressive portion of the calculation. p helps
adjust the line that is being fitted to forecast the series.
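As a rough illustration of what "using the three previous periods" means, the sketch below computes a one-step autoregressive forecast as a weighted sum of the last p values. The function name and the coefficient values are assumptions made up for this example; in a real model the coefficients are estimated from the data.

```python
def ar_forecast(history, coefficients, intercept=0.0):
    """One-step AR(p) forecast: intercept + sum of coefficient_i * y[t-i]."""
    p = len(coefficients)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    # Most recent value first, so recent[0] is y[t-1], recent[1] is y[t-2], ...
    recent = history[::-1][:p]
    return intercept + sum(phi * y for phi, y in zip(coefficients, recent))


series = [10.0, 12.0, 11.0, 13.0]
# p = 3: the forecast uses the three previous periods (13, 11 and 12).
# The coefficients below are hypothetical, not fitted values.
forecast = ar_forecast(series, coefficients=[0.5, 0.3, 0.2])
print(round(forecast, 2))
```

With a larger p, more past periods enter the sum, which is how p changes the shape of the fitted line.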
A time series is stationary when its mean and variance are constant over time. A
stationary series is easier to predict.
The first difference is the difference between the value in the current time period and the
value in the previous time period. If the differenced values do not fluctuate around a constant
mean with a constant variance, we take the second difference by differencing the first
differences. We repeat this until we get a stationary series.
The best way to determine whether or not the series is sufficiently differenced is to plot
the differenced series and check to see if there is a constant mean and variance.
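The differencing procedure can be sketched in a few lines. The helper name `difference` and the toy series are assumptions for illustration only. The series follows a quadratic trend, so the first difference still trends upward and only the second difference settles around a constant value.

```python
def difference(series, order=1):
    """Apply first differencing `order` times."""
    result = list(series)
    for _ in range(order):
        # Each pass replaces the series with current value minus previous value.
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


series = [t * t for t in range(8)]      # 0, 1, 4, 9, 16, 25, 36, 49
first = difference(series, order=1)     # still increasing: not stationary
second = difference(series, order=2)    # constant: differencing can stop here
print(first)
print(second)
```

Each round of differencing shortens the series by one observation, which is one reason not to difference more often than needed.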
q = the lag of the error component (the MA term), where the error component is the
part of the time series not explained by trend or seasonality.
Autocorrelation refers to how correlated a time series is with its past values, whereas
the ACF is the plot used to see that correlation at each lag, up to and including a chosen
maximum lag. In the ACF plot, the number of lags is shown on the x-axis and the correlation
coefficient on the y-axis.
The autocorrelation function plot shows how the given time series is correlated with
itself.
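The values drawn in an ACF plot can be computed directly. The sketch below uses the usual sample autocorrelation estimate (covariance at lag k divided by the variance); the function names and the short series are assumptions for this example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` with itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((y - mean) ** 2 for y in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


def acf(series, max_lag):
    """Autocorrelations for lags 0..max_lag (the heights of the ACF bars)."""
    return [autocorrelation(series, k) for k in range(max_lag + 1)]


values = acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
for lag, value in enumerate(values):
    print(lag, round(value, 3))  # lag goes on the x-axis, value on the y-axis
```

The value at lag 0 is always 1, because a series is perfectly correlated with itself.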
Normally in an ARIMA model, we make use of either the AR term or the MA term.
We use both of these terms only on rare occasions. We use the ACF plot to decide which
of these terms to use for our time series.
After plotting the ACF plot we move to Partial Autocorrelation Function plots (PACF).
A partial autocorrelation is a summary of the relationship between an observation in a time
series with observations at prior time steps with the relationships of intervening observations
removed.
The partial autocorrelation at lag k is the correlation that results after removing the
effect of any correlations due to the terms at shorter lags.
If the PACF plot cuts off sharply after lag n, then use an AR(n) model; if the PACF
decays more gradually, then use the MA term.
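To make "removing the effect of shorter lags" concrete, the sketch below derives partial autocorrelations from autocorrelations using the Durbin-Levinson recursion, which is one standard way to do it. The function name is an assumption, and the input is the theoretical ACF of an AR(1) process with coefficient 0.6, chosen because its PACF should cut off after lag 1.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations from autocorrelations rho[0..K] (rho[0] == 1).

    Uses the Durbin-Levinson recursion; returns a list with pacf[0] == 1.0.
    """
    pacf = [1.0]
    prev = []  # prev[j - 1] holds the AR coefficient phi(k-1, j)
    for k in range(1, len(rho)):
        numerator = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
        denominator = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
        phi_kk = numerator / denominator
        # Update the lower-order coefficients, then append the new one.
        current = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf


rho = [0.6 ** k for k in range(5)]  # ACF decays gradually: 1, 0.6, 0.36, ...
pacf = pacf_from_acf(rho)
print([round(v, 3) for v in pacf])  # PACF drops to zero after lag 1
```

Here the ACF decays slowly while the PACF cuts off after lag 1, which is the pattern that points to an AR(1) model.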
• ACF plots show autocorrelation decaying towards zero
Moving Averages: Random jumps in the time series plot whose effect is felt in two or
more consecutive periods. These jumps represent the error calculated in our ARIMA model,
and they are what the MA component lags. A purely MA model would smooth out these sudden
jumps, much like the exponential smoothing method.
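A small sketch of an MA(1) one-step forecast shows how a jump in one period carries into the next. The function name, the mean of 10 and the weight theta = 0.5 are assumptions chosen for illustration, not estimated values.

```python
def ma1_forecasts(series, mean, theta):
    """One-step-ahead MA(1) forecasts: mean + theta * previous forecast error."""
    forecasts = []
    prev_error = 0.0  # assumption: no shock before the first observation
    for y in series:
        y_hat = mean + theta * prev_error
        forecasts.append(y_hat)
        prev_error = y - y_hat  # the error that the next forecast will lag
    return forecasts


series = [10.0, 10.0, 14.0, 10.0]  # a sudden jump of +4 in the third period
forecasts = ma1_forecasts(series, mean=10.0, theta=0.5)
print(forecasts)
```

The jump in the third period is not predicted, but half of that error is carried into the forecast for the fourth period.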
Integrated component: This component comes into action when the time series is not
stationary. The number of times we have to difference the series to make it stationary is the
parameter (the I term, d) for the integrated component.
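When a model is fitted on a differenced series, its forecasts are differences and have to be added back up to return to the original scale. The sketch below shows that step for d = 1; the function name and the numbers are assumptions for this example.

```python
from itertools import accumulate


def undifference(last_value, differenced_forecasts):
    """Turn forecasts of first differences back into forecasts of levels."""
    # Running total that starts from the last observed level of the series.
    levels = accumulate(differenced_forecasts, initial=last_value)
    return list(levels)[1:]  # drop the starting value itself


# Last observed value is 100; the model predicts changes of +2, +3 and -1.
level_forecasts = undifference(100, [2, 3, -1])
print(level_forecasts)
```

For d = 2 the same step would be applied twice, once for each round of differencing.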
Seasonal ARIMA (SARIMA): As the name suggests, this model is used when the time series
exhibits seasonality. This model is similar to ARIMA models; we just have to add in a few
parameters to account for the seasons.
We write SARIMA as
ARIMA(p,d,q)(P,D,Q)m, where:
• p: the number of autoregressive terms
• d: the degree of differencing
• q: the number of moving average terms
• (P, D, Q): the (p,d,q) for the seasonal part of the time series
• m: the number of periods in each season
Seasonal differencing takes the seasons into account by differencing the current value
and its value in the previous season. For example, the seasonal difference for May would be
the value in May 2018 minus the value in May 2017.
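Seasonal differencing can be sketched the same way as ordinary differencing, except that each value is compared with the value one season earlier. The function name and the monthly figures below are made up for illustration; the season length of 12 assumes monthly data with a yearly pattern.

```python
def seasonal_difference(series, m):
    """Subtract from each value the value m periods (one season) earlier."""
    return [series[t] - series[t - m] for t in range(m, len(series))]


# Hypothetical monthly values, January to December, for two years.
values_2017 = [20, 22, 25, 30, 36, 40, 42, 41, 35, 29, 24, 21]
values_2018 = [v + 5 for v in values_2017]  # same pattern, shifted up by 5
series = values_2017 + values_2018

diffs = seasonal_difference(series, m=12)
may_difference = diffs[4]  # May 2018 minus May 2017
print(diffs)
print(may_difference)
```

The seasonal pattern disappears after differencing, leaving only the year-over-year change.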
• In a purely seasonal AR model, the ACF decays slowly while the PACF cuts off to zero
• In a purely seasonal MA model, the ACF cuts off to zero while the PACF decays slowly
Final steps
• Step 3 - Filter out a validation sample: This will be used to validate how
accurate our model is. Hold out the most recent periods of the series for this,
rather than sampling at random, because the order of the observations matters.
• Step 4 - Select AR and MA terms: Use the ACF and PACF to decide whether
to include an AR term(s), MA term(s), or both.
• Step 5 - Build the model: Build the model and set the number of periods to
forecast to N (depends on your needs).
• Step 6 - Validate model: Compare the predicted values to the actuals in the
validation sample.
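Steps 3 and 6 can be sketched as follows. The helper names and the data are assumptions for illustration, and the forecast used here is only a placeholder (it repeats the last training value) standing in for the output of a fitted ARIMA model. Mean absolute error is one possible way to compare predictions to actuals.

```python
def chronological_split(series, n_validation):
    """Hold out the last n_validation periods as the validation sample."""
    if not 0 < n_validation < len(series):
        raise ValueError("n_validation must be between 1 and len(series) - 1")
    return series[:-n_validation], series[-n_validation:]


def mean_absolute_error(actuals, predictions):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


# Hypothetical observations, oldest first.
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
train, validation = chronological_split(series, n_validation=3)

# Placeholder forecast, NOT an ARIMA model: repeat the last training value.
predictions = [train[-1]] * len(validation)
mae = mean_absolute_error(validation, predictions)
print(train, validation, round(mae, 2))
```

In practice the placeholder line would be replaced by N forecasts from the model built in Step 5, with N equal to the size of the validation sample.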
Video guidance link:
https://www.youtube.com/watch?v=8xt4q7KHfBs
2. EXAMPLE CODING:
ARIMA DATA PRACTICES.xlsx