Time Series Forecasting Monograph
[email protected]
Y21IHWS8GO
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited. 1
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited. 2
Figure 20: Plot Actual vs. Forecasted sales using Triple Exponential Smoothing method for 2015-2016 years
Figure 21: Plot Actual vs. Forecasted sales using SARIMA model for 2015-2016 years
Figure 22: PM2.5 Pollution data
Figure 23: PM2.5 Data (split into Training and testing purpose)
Figure 24: PM2.5 Data ACF Plot
Figure 25: Plot of first difference series, ACF and PACF of PM2.5
Figure 26: Residual Diagnostics of the (0,1,1) ARIMA Model
Figure 27: PM2.5 data forecast using ARIMA (0,1,1)
Figure 28: Plot Actual vs. Forecasted PM2.5 data using ARIMA (0,1,1) for 2016-2017 years
Figure 29: Residual Diagnostics of the (0,1,3) ARIMAX Model
Figure 30: PM2.5 data forecast using ARIMAX (0,1,3)
Figure 31: Plot Actual vs. Forecasted PM2.5 data using ARIMAX (0,1,3) for 2016-2017 years
Figure 32: Flow chart of Time Series
Data can be classified into three major groups based on its temporal nature:
I. Cross-sectional data: Data is collected at a single point in time on one or more variables. Here, the data is not sequential and the data points are usually independent of one another. Regression, random forest and neural network methods have been applied widely to such data.
II. Time series data: Univariate or multivariate data is observed across time in a sequential manner
at pre-determined and equally-spaced time intervals (such as yearly, monthly, quarterly or
hourly). Ordering among data points is important and cannot be destroyed.
III. Combination of cross-sectional and time series data: This is a complex study design where information on the same variables is collected at various points in time. Many survey samplings make use of such panel data.
[email protected]
Y21IHWS8GO
In this monograph we have focused only on the forecasting methods appropriate for time series data
observed at regular intervals.
Prediction for cross-sectional data is the topic of other predictive models such as linear regression, logistic regression, CART, random forest (RF), ANN, etc.
Formal definition of time series: A collection of observations that has been observed at regular time
intervals for a certain variable over a given duration is called a time series.
The regular time intervals can be daily (stock market prices), weekly (products manufactured), monthly (unemployment rate), quarterly (sales of a product), annual (GDP), quinquennial, that is, every 5 years (Census of Manufactures), or decennial (Census of Population).
Time series can be applied in various fields such as economic forecasting, sales forecasting, budgetary
analysis, stock market analysis, yield projection, inventory studies, workload projections, utility studies,
census analysis, process and quality control and many more.
Time series data has several characteristics that make it unique:
All observations are dependent: In time series data, each observation is expected to depend on
the past observations.
Missing data must be imputed: Because all data points are sequential in a time series, if any data point is missing, it must be imputed before the actual analysis commences; otherwise the proper ordering is not preserved.
Two different types of intervals cannot be mixed: Time series data is observed on the same
variable over a given period of time with fixed and regular time intervals. Though data can be
collected at various intervals such as yearly, monthly, weekly, daily, hourly (e.g. temperature)
and/or any specific time-interval, the interval must remain the same throughout the entire range;
e.g. yearly series cannot be combined with quarterly or monthly series.
Objective of Time Series Forecasting: Time series forecasting is applied to extract information from
historical series and is used to predict future behaviour of the same series based on past pattern.
Approaches used for Time Series Forecasting: There are two major approaches to time series forecasting. The two approaches are quite different; both are discussed with illustrations.
By no means are these the only two approaches to time series forecasting. A few other methods are referred to in Section 7.
Trend (Tt): When the series increases (or decreases) over the entire length of time. For example, the price of a share may increase or decrease linearly over a period of time, while the sales of a new product may increase exponentially (or non-linearly). Figure 1 shows the increasing linear trend of US GDP growth over a period of time.
[email protected]
Y21IHWS8GO
Seasonality (St): When a series is observed more frequently than once a year (quarterly or monthly, for example), the series may be subject to rhythmic fluctuations which are stable and repeat each year. For example, sales of umbrellas increase in the rainy season, whereas sales of ACs increase in summer and sales of woollen clothes increase in winter. This intra-year fluctuation is known as seasonal fluctuation. Figure 2 contains monthly average temperature that oscillates in a regular pattern over the given period of time.
Trend and seasonal components are part of systematic components of time series.
Additive Model: Yt = Tt + St + It is considered when the resultant series is the sum of the components.
Multiplicative Model: Yt = Tt * St * It is considered when the resultant time series is the product of the
components.
A series may be considered multiplicative series when the seasonal fluctuations increase as trend
increases. A multiplicative time series can be transformed into an additive series by taking log
transformation i.e.
log(Yt) = log(Tt) + log(St) + log(It)
Decomposition of a time series leads to identification and extraction of the individual components.
Primary objective of decomposition is to study the components of the time series, NOT forecasting.
However, forecasting models can be built on top of the decomposed series.
Case Study 1
A company ABC selling tractors has to forecast its sales for the next 24 months. It has 12 years of past sales data on a monthly basis. The data may contain trend, seasonality or both. The objective is to provide a reasonable forecast for future sales.
Solution:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
from pylab import rcParams
from statsmodels.graphics.tsaplots import month_plot, plot_acf, plot_pacf
from statsmodels.tsa.seasonal import seasonal_decompose, STL
from statsmodels.tsa.api import ExponentialSmoothing
from sklearn.metrics import mean_squared_error
# mean_absolute_percentage_error is used later for model evaluation; it is not
# imported in the original listing and is available in scikit-learn 0.24 and later
from sklearn.metrics import mean_absolute_percentage_error
from statsmodels.tsa.stattools import adfuller
# the legacy ARIMA class is imported in the original, although the SARIMAX class
# from statsmodels.api is what is actually used throughout
from statsmodels.tsa.arima_model import ARIMA
import statsmodels.api as sm
df = pd.read_csv('Tractor-Sales.csv')
timestamp = pd.date_range(start='2003-01-01',end='2015-01-01',freq ='M')
df['Time_Stamp'] = timestamp
df.drop(labels='Month-Year',axis=1,inplace=True)
df.set_index(keys='Time_Stamp',drop=True,inplace=True)
rcParams['figure.figsize'] = 15,8
df.plot(grid=True);
[email protected]
Y21IHWS8GO
I. Data values are stored in correct time order and no data is missing.
II. The sales are increasing in numbers, implying presence of trend component.
III. Intra-year stable fluctuations are indicative of a seasonal component. As the trend increases, the fluctuations also increase, which is indicative of multiplicative seasonality.
Note:
All the versions of the libraries used are given in the appendix of the monograph.
# Yearly aggregate of monthly sales; the construction of this object is not
# shown in the original listing, presumably something like:
yearly_sales_across_years = df.resample('Y').sum()
yearly_sales_across_years.plot()
plt.grid()
plt.legend(loc='best');
[email protected]
Y21IHWS8GO
month_plot(df,ylabel='TractorSalesTS')
plt.grid();
Figure 4 shows that the sales of tractors are increasing every year.
Note in Figure 5 that the vertical lines represent monthly sales and the horizontal lines represent the average sales of the given month. Here, it can be observed that average sales are higher in July and August compared to other months.
In all the above plots, the lines representing sales show seasonal fluctuations along with a trend, and the fluctuations grow with the trend. Thus, we can confirm that the seasonality is multiplicative.
Part II: To identify the components of the given Tractor sales data
Now the decomposition method is applied to identify and separate out the three components (i.e. trend, seasonality and irregular components) from the given series, so that their individual properties can be observed.
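The decomposition object used in the next code block is produced with statsmodels' moving-average based seasonal_decompose. The call itself is not reproduced in the listing above; a minimal sketch, assuming the multiplicative model identified earlier:

# Classical (moving average) decomposition of the tractor sales series; the
# multiplicative model follows from the growing seasonal swings observed above
decomposition = seasonal_decompose(df, model='multiplicative')
decomposition.plot()
plt.show()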
Figure 6: Decomposed tractor time series into components using moving average decomposition
# Seasonal indices for the 12 months, taken from the first year of the
# decomposed seasonal component
seasonal_values = round(decomposition.seasonal.head(12), 2).values.flatten()
Seasonal_Ind = pd.DataFrame([seasonal_values],
                            columns=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                                     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
                            index=range(1, 2))
Seasonal_Ind
Figure 6 indicates that the trend is increasing linearly. Since this is monthly data, there are 12 seasonal indices; for a multiplicative decomposition their sum must be 12 (they average to 1). July has the highest value of the seasonal component, so tractor sales are highest in July among all months of the same year, whereas November has the lowest value of the seasonal component, so sales are lowest in November.
[email protected]
3.2.2.
Y21IHWS8GO Using Seasonal and Trend decomposition by Loess
Owing to some limitations of the moving average decomposition, Loess-based decomposition (STL) has been proposed. STL is more versatile but does not admit multiplicative seasonality; hence a log transformation is used to convert multiplicative seasonality into additive seasonality.
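A minimal sketch of an STL decomposition on the log-transformed series follows; the exact call used in the monograph is not shown, and the column name is taken from the later code listings.

# STL needs an additive structure, so the multiplicative tractor series is
# log-transformed first; with a monthly DatetimeIndex the period is inferred
stl_result = STL(np.log10(df['Number of Tractor Sold'])).fit()
stl_result.plot()
plt.show()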
Before a forecast method is proposed, the method needs to be validated. For that purpose, the data has to be split into two sets, i.e. training and testing. Training data helps in identifying and fitting the right model(s), and test data is used to validate them.
In case of time series data, the test data is the most recent part of the series so that the ordering in the
data is preserved.
Part III: To propose the best model for the Tractor sales data
Forecasting accuracy measures compare the predicted values against the observed values to quantify the predictive power of the proposed model. With the forecast error at time $t$ defined as $e_t = Y_t - \hat{Y}_t$ (actual minus forecast), the commonly used measures are:
Mean Absolute Deviation (MAD): $\mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n} |e_t|$

Mean Absolute Percentage Error (MAPE): This measure is used extensively in time series because it is unit-free, so the performance of forecasted values can be compared easily across series. $\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n} \frac{|e_t|}{Y_t} \times 100$. MAPE is usually expressed as a percentage.

Mean Square Error (MSE): $\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} e_t^2$

Root Mean Square Error (RMSE): $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^2}$
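As an illustration (not part of the original listing), these measures can be computed directly from the actual and forecasted values with a small helper; the function below is a sketch and its name is our own.

def forecast_accuracy(actual, forecast):
    """Compute MAD, MAPE (%), MSE and RMSE for two equal-length arrays."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = actual - forecast                      # forecast errors e_t
    mad = np.mean(np.abs(e))                   # Mean Absolute Deviation
    mape = np.mean(np.abs(e) / actual) * 100   # Mean Absolute Percentage Error
    mse = np.mean(e ** 2)                      # Mean Square Error
    rmse = np.sqrt(mse)                        # Root Mean Square Error
    return {'MAD': mad, 'MAPE': mape, 'MSE': mse, 'RMSE': rmse}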
For the Tractor Sales series, the first 10 years of data are used for training and the last 2 years for testing.
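The split code itself is not shown at this point; a minimal sketch, assuming the month-end index constructed earlier (2003 through 2014):

# First 10 years (120 monthly observations) for training,
# last 2 years (24 observations) for testing
TS_Train = df[:'2012-12-31']
TS_Test = df['2013-01-01':]
print(TS_Train.shape, TS_Test.shape)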
This is an extension of the moving (rolling) average method where more recent observations get higher weight. The simple exponential smoothing forecast is
$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\,\hat{Y}_t$, $0 < \alpha < 1$,
where $\alpha$ is the smoothing parameter for the level. The method is suited to series with no trend or seasonality; in reality such a series is hard to find. This is a one-step-ahead forecast where all the forecast values beyond the training period are identical.
Forecast equation: $\hat{Y}_{t+h} = l_t + h\,b_t + s_{t+h-m(k+1)}$, where $m$ is the seasonal period and $k$ is the integer part of $(h-1)/m$
Level equation: $l_t = \alpha\,(Y_t - s_{t-m}) + (1-\alpha)(l_{t-1} + b_{t-1})$, $0 < \alpha < 1$
Trend equation: $b_t = \beta\,(l_t - l_{t-1}) + (1-\beta)\,b_{t-1}$, $0 < \beta < 1$
Seasonal equation: $s_t = \gamma\,(Y_t - l_{t-1} - b_{t-1}) + (1-\gamma)\,s_{t-m}$, $0 < \gamma < 1$
This is also known as three-parameter exponential smoothing or triple exponential smoothing because of the three smoothing parameters $\alpha$, $\beta$ and $\gamma$. It is a general method and a true multi-step-ahead forecast.
TS_Train_HW = ExponentialSmoothing(TS_Train, seasonal='multiplicative',
                                   trend='additive', freq='M')
TS_Train_HW_autofit = TS_Train_HW.fit(optimized=True)
TS_Train_HW_autofit.params_formatted
[email protected]
Y21IHWS8GO
A user may also choose the values of $\alpha$, $\beta$ and $\gamma$ manually and observe the differences in the fitted model.
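For instance, the same model can be refitted with user-specified smoothing parameters. The sketch below uses parameter names from recent statsmodels versions, and the values are purely illustrative:

# Illustrative manual choice of the three smoothing parameters
TS_Train_HW_manual = TS_Train_HW.fit(smoothing_level=0.3,
                                     smoothing_trend=0.1,
                                     smoothing_seasonal=0.2,
                                     optimized=False)
TS_Train_HW_manual.params_formatted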
Figure 10: Plot Actual vs. Forecasted sales using HW’s method for 2013-2014 years
RMSE = mean_squared_error(TS_Test, TES_pred, squared=False)
MAPE = mean_absolute_percentage_error(TS_Test['Number of Tractor Sold'], TES_pred)
The following parameters are important for generating time stamps according to frequency (seasonality) using the date_range function in Pandas.
'start' – This defines the time stamp of the first instance of the data.
'periods' – This specifies the number of date-time observations to generate.
'freq' – This can be used to change the frequency (seasonality) of the time stamps.
'end' – Instead of the 'periods' parameter, 'end' can be used to specify the last instance of the observation.
After the time stamps are generated, they may be used to index data to make it an appropriate time
series.
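For example, a short illustrative call generating 24 month-end time stamps for the forecast horizon used later:

# 24 month-end time stamps starting from January 2015
future_index = pd.date_range(start='2015-01-01', periods=24, freq='M')
print(future_index[:3])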
The following links of the Pandas library are useful for various other custom date ranges.
1. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html
2. https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases
3. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.tseries.offsets.CustomBusinessDay.html#pandas.tseries.offsets.CustomBusinessDay
Auto Regressive Integrated Moving Average (ARIMA) models are applied on time series data
when the current value is assumed to be correlated to past values and past prediction errors.
Therefore, these models are used in defining current value as a linear combination of past
values and past prediction errors.
Here, we have defined a few terms that would be useful in understanding ARIMA models in
detail.
ARMA models can be applied only on stationary time series data.
Mean: $E(Y_t) = \mu$
Variance: $\mathrm{Var}(Y_t) = E(Y_t - \mu)^2 = \sigma^2$
Correlation: $\rho_k = \dfrac{E[(Y_t - \mu)(Y_{t+k} - \mu)]}{\sigma_t\,\sigma_{t+k}}$
[email protected]
Y21IHWS8GO
Where 𝜌𝑘 is the correlation (or auto-correlation) at lag 𝑘 between the values of 𝑌𝑡 and 𝑌𝑡+𝑘
So, if the mean, variance and correlation (auto-correlation at different lags) of a time series are constant no matter at what point of time they are measured, i.e. if they are time invariant, the series is called a stationary time series. A series not possessing these properties is termed a non-stationary time series.
Now the log-transformed data is subjected to the Augmented Dickey-Fuller test to check for stationarity.
TS_Train_log = np.log10(TS_Train)
dftest = adfuller(TS_Train_log,regression='ct',autolag=None,maxlag=24)
print('DF test statistic is %3.3f' %dftest[0])
print('DF test p-value is' ,dftest[1])
print('Number of lags used' ,dftest[2])
[email protected]
Y21IHWS8GO
Neither the original nor the log-transformed series is stationary; hence stationarization is necessary. Often, differencing a non-stationary time series leads to a stationary series.
The first difference of a series is defined as $D_t = Y_t - Y_{t-1}$.
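As a quick sketch (this re-check is not shown explicitly in the original listing), the ADF test can be rerun on the first difference of the log-transformed training series:

# ADF test on the first difference of the log-transformed series;
# the differenced series is expected to be stationary
dftest_diff = adfuller(TS_Train_log.squeeze().diff().dropna(), regression='ct')
print('DF test statistic is %3.3f' % dftest_diff[0])
print('DF test p-value is', dftest_diff[1])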
Autocorrelation Function (ACF): The autocorrelation of order $k$ is the correlation between $Y_t$ and $Y_{t+k}$ for $k = 0, 1, \ldots$; $-1 \le \mathrm{ACF}(k) \le 1$ and $\mathrm{ACF}(0) = 1$. ACF measures the strength of dependency of the current observation on past observations.
Partial Autocorrelation Function (PACF): The PACF of order $k$ is the autocorrelation between $Y_t$ and $Y_{t+k}$ adjusting for all the intervening periods, i.e. it provides the correlation between the current and the $k$-lagged series after removing the influence of all other observations that lie in between.
ACF and PACF are used together to identify the order of an ARMA model. Seasonal ACF and PACF examine correlations for seasonal data.
f, a = plt.subplots(1, 2, sharex=True, sharey=False, squeeze=False)
# ACF and PACF of the log-transformed series (the plotting calls are not shown in the original):
plot_acf(TS_Train_log.squeeze(), ax=a[0][0])
plot_pacf(TS_Train_log.squeeze(), ax=a[0][1], zero=False);
[email protected]
Y21IHWS8GO
Figure 11: ACF and PACF of Tractor Sales after log transformation
$$Y_t = \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \beta_3 Y_{t-3} + \cdots + \beta_p Y_{t-p} + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \alpha_2 \varepsilon_{t-2} + \cdots + \alpha_q \varepsilon_{t-q}$$
## Here we have taken the range of p, q, P and Q to be within 0 to 2. We can
## change this if need be.
import itertools
p = range(0, 3)
q = range(0, 3)
d = range(1, 2)
pdq = list(itertools.product(p, d, q))
model_pdq = [(x[0], x[1], x[2], 12) for x in list(itertools.product(p, d, q))]

## Defining an empty data frame to store the parameter values along with the model AIC
SARIMA_AIC = pd.DataFrame(columns=['param', 'seasonal', 'AIC'])

# The full looping structure is not shown in the original listing; the
# reconstruction below follows the pattern used later for the ARIMA/ARIMAX
# search and fits each model on the log-transformed training series.
for param in pdq:
    for param_seasonal in model_pdq:
        try:
            SARIMA_model = sm.tsa.statespace.SARIMAX(TS_Train_log,
                                                     order=param,
                                                     seasonal_order=param_seasonal)
            results_SARIMA = SARIMA_model.fit()
        except Exception:
            continue
        print('SARIMA{}x{}12 - AIC:{}'.format(param, param_seasonal, results_SARIMA.aic))
        SARIMA_AIC = SARIMA_AIC.append({'param': param, 'seasonal': param_seasonal,
                                        'AIC': results_SARIMA.aic}, ignore_index=True)
SARIMA_AIC.sort_values(by=['AIC']).head()
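The fitted object TS_AutoARIMA, whose residual diagnostics are plotted next, is not constructed in the listing above. A plausible sketch, assuming the lowest-AIC automated specification SARIMA(0,1,1)(0,1,1)[12] fitted on the log-transformed training series:

# Automated (lowest-AIC) SARIMA specification; a sketch, since the original
# fitting code is not shown
TS_AutoARIMA = sm.tsa.statespace.SARIMAX(TS_Train_log,
                                         order=(0, 1, 1),
                                         seasonal_order=(0, 1, 1, 12)).fit()
print(TS_AutoARIMA.summary())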
TS_AutoARIMA.plot_diagnostics();
[email protected]
Y21IHWS8GO
Alternatively, one might investigate other suitable model(s) for a time series using ACF and
PACF for the differenced series.
# Plot the first difference series, ACF and PACF of Log(Tractor Sales)
TS_Train_log.diff().plot()
plt.grid()
# ACF and PACF after taking the differenced logarithmic transformation
f, a = plt.subplots(1, 2, sharex=True, sharey=False, squeeze=False)
# ACF and PACF of the first-differenced log series (the plotting calls are not shown in the original):
plot_acf(TS_Train_log.diff().dropna().squeeze(), ax=a[0][0])
plot_pacf(TS_Train_log.diff().dropna().squeeze(), ax=a[0][1], zero=False);
Figure 13: Plot of first difference series, ACF and PACF of Log(Tractor Sales)
The plots of ACF and PACF indicate possible values for p to be 1 and q to be 0.
## Plot of first difference and seasonal first difference series, ACF and PACF of Log(Tractor Sales)
TS_Train_log.diff(12).diff().plot()
plt.grid()
# ACF and PACF after taking the differenced logarithmic transformation --> Seasonal Series Plot
f, a = plt.subplots(1, 2, sharex=True, sharey=False, squeeze=False)
# Plotting calls not shown in the original; presumably the seasonally and regularly differenced log series:
plot_acf(TS_Train_log.diff(12).diff().dropna().squeeze(), ax=a[0][0])
plot_pacf(TS_Train_log.diff(12).diff().dropna().squeeze(), ax=a[0][1], zero=False);
Figure 14: Plot of first difference and seasonal first difference series, ACF and PACF of Log(Tractor Sales)
# Forecast for the test set duration using the automated SARIMA model with
# 95% confidence intervals
pred_AutoARIMA = TS_AutoARIMA.get_forecast(steps=len(TS_Test))
axis = TS_Train_log.plot()
pred_AutoARIMA.summary_frame(alpha=0.05)['mean'].plot(ax=axis, label='Forecast', alpha=0.7)
axis.fill_between(pred_AutoARIMA.summary_frame(alpha=0.05).index,
                  pred_AutoARIMA.summary_frame(alpha=0.05)['mean_ci_lower'],
                  pred_AutoARIMA.summary_frame(alpha=0.05)['mean_ci_upper'],
                  color='k', alpha=.15)
axis.set_xlabel('Year-Months')
axis.set_ylabel('Number of Tractor Sold')
plt.legend(loc='best')
plt.title('Tractor sales data forecast using SARIMA (0,1,1)(0,1,1)[12]')
plt.grid();
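The manually specified model TS_f used below is SARIMA(0,1,1)(1,1,1)[12]; its fitting code is not reproduced above, but a plausible sketch on the log-transformed training series is:

# Manually specified SARIMA(0,1,1)(1,1,1)[12] on the log-transformed training series
TS_f = sm.tsa.statespace.SARIMAX(TS_Train_log,
                                 order=(0, 1, 1),
                                 seasonal_order=(1, 1, 1, 12)).fit()
print(TS_f.summary())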
# Forecast for the test set duration using the manual SARIMA model with 95%
# confidence intervals
pred_f = TS_f.get_forecast(steps=len(TS_Test))
axis = TS_Train_log.plot()
pred_f.summary_frame(alpha=0.05)['mean'].plot(ax=axis, label='Forecast', alpha=0.7)
axis.fill_between(pred_f.summary_frame(alpha=0.05).index,
                  pred_f.summary_frame(alpha=0.05)['mean_ci_lower'],
                  pred_f.summary_frame(alpha=0.05)['mean_ci_upper'],
                  color='k', alpha=.15)
axis.set_xlabel('Year-Months')
axis.set_ylabel('Number of Tractor Sold')
plt.legend(loc='best')
plt.title('Tractor sales data forecast using SARIMA (0,1,1)(1,1,1)[12]')
plt.grid();
## Plot Actual vs. Forecasted sales using SARIMA(0,1,1)*(0,1,1)[12] for 2013-2014 years
TS_Test.plot()
np.power(10, pred_AutoARIMA.summary_frame(alpha=0.05)['mean']).plot()
plt.legend(['Actual Data', 'Forecasted Data']);
plt.title('Plot of Actual vs. Forecasted sales using SARIMA(0,1,1)*(0,1,1)[12] for 2013-2014 years')
plt.grid();
TS_Test.plot()
np.power(10, pred_f.summary_frame(alpha=0.05)['mean']).plot()
plt.legend(['Actual Data', 'Forecasted Data']);
plt.title('Plot of Actual vs. Forecasted sales using SARIMA(0,1,1)*(1,1,1)[12] for 2013-2014 years')
plt.grid();
RMSE2 = mean_squared_error(TS_Test.values,
                           np.power(10, pred_f.summary_frame()['mean']).values,
                           squared=False)
MAPE2 = mean_absolute_percentage_error(TS_Test.values,
                                       np.power(10, pred_f.summary_frame()['mean']).values)
It appears that SARIMA(0,1,1)(1,1,1)[12] not only has smaller AIC and BIC values than the SARIMA(0,1,1)(0,1,1)[12] recommended by the automated search, it also provides a smaller RMSE, albeit marginally.
A flow chart for understanding ARIMA/SARIMA models is given below.
Final Forecasts
Once a model is chosen and validated, forecasts into the future are to be generated. Tractor sales are to be forecasted for 24 months: Jan 2015 – Dec 2016.
Going strictly by MAPE, the recommended model is Triple Exponential Smoothing (Holt-Winters' model).
# Plot Actual vs. Forecasted sales using HW method for 2015-2016 years
# TS_df_HW_autofit is the Holt-Winters model refitted on the full series; the
# refitting call is not shown in the original, presumably along these lines:
TS_df_HW_autofit = ExponentialSmoothing(df, seasonal='multiplicative',
                                        trend='additive', freq='M').fit(optimized=True)
df.plot()
TS_df_HW_autofit.forecast(steps=24).plot()
plt.legend(['Actual', 'Forecast'])
plt.title('Forecast from the Holt-Winters Multiplicative Method')
plt.grid();
Figure 20: Plot Actual vs. Forecasted sales using Triple Exponential Smoothing method for 2015-2016 years
TS_final_arima = sm.tsa.statespace.SARIMAX(np.log10(df),
                                           order=(0, 1, 1),
                                           seasonal_order=(1, 1, 1, 12))
TS_final_arima = TS_final_arima.fit()
pred_final_arima = TS_final_arima.get_forecast(steps=24)
axis = np.log10(df).plot()
pred_final_arima.summary_frame(alpha=0.05)['mean'].plot(ax=axis, label='Forecast', alpha=0.7)
axis.fill_between(pred_final_arima.summary_frame(alpha=0.05).index,
                  pred_final_arima.summary_frame(alpha=0.05)['mean_ci_lower'],
                  pred_final_arima.summary_frame(alpha=0.05)['mean_ci_upper'],
                  color='k', alpha=.15)
axis.set_xlabel('Year-Months')
axis.set_ylabel('Number of Tractor Sold')
plt.legend(loc='best')
plt.title('Forecast from SARIMA(0,1,1)(1,1,1)[12]')
plt.grid();
[email protected]
Y21IHWS8GO
Figure 21: Plot Actual vs. Forecasted sales using SARIMA Model for 2015-2016 years
Often a time series is influenced by one or more exogenous variables, i.e. a time series may have two independent components. One component is directly influenced by the past observations, while the other component works like a multiple linear regression on the exogenous variables. Such a time series model is known as an ARIMAX model and may be treated as an extension of the ARIMA model. The 'X' in ARIMAX (or SARIMAX) stands for the independent predictors in the model.

$$Y_t = \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \beta_3 Y_{t-3} + \cdots + \beta_p Y_{t-p} + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \alpha_2 \varepsilon_{t-2} + \cdots + \alpha_q \varepsilon_{t-q} + \gamma X_t$$

It is possible that X is a vector, like a set of typical multiple linear regression predictors. X may or may not vary over time. For example, if the price of a commodity is modelled as a time series, X may be a price index, which is a time-dependent variable. Alternatively, X may be a categorical variable, such as geographic location.
[email protected]
There is one major
Y21IHWS8GO difference with multiple linear regression and inclusion of a set of predictors in
an ARIMA model. Interpretation of the regression coefficients are not straight forward. The
estimated coefficient of X is not an estimate of the increase in 𝒀𝒕 for a unit increase in X, because 𝒀𝒕
includes the lag variables.
The dataset Pollution_Data.csv contains the average weekly values of several polluting particles at one pollution monitoring station. The main parameter for monitoring ambient air quality is PM2.5. The data points are weekly averages from 2013 to 2017.
Figure 22 shows the data pattern. Note that the frequency of the time stamps is weekly.
df = pd.read_csv('Pollution_Data.csv')
daterange = pd.date_range(start='2013-03-03',periods=len(df),freq='W')
df['Time_Stamp'] = daterange
df.set_index(keys='Time_Stamp',inplace=True)
rcParams['figure.figsize'] = 15,8
df['PM2.5'].plot(grid=True);
[email protected]
Y21IHWS8GO
Since we have already discussed the theory behind building ARIMA models using the lowest Akaike Information Criterion (AIC), we will directly apply those concepts here. But before that we need to split the data into training and test sets and then check the training data for stationarity.
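The split code is not reproduced here; a plausible sketch, assuming roughly the 2016-2017 portion of the weekly series (matching the forecast horizon in the later figures) is held out for testing:

# Hold out the most recent weekly observations for testing; the exact split date
# is an assumption, since the original split code is not shown
TS_Train = df[:'2015-12-31']
TS_Test = df['2016-01-01':]
TS_Train['PM2.5'].plot(label='Training')
TS_Test['PM2.5'].plot(label='Test')
plt.legend();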
Figure 23: PM2.5 Data (split into Training and testing purpose)
Let us check the ACF plot to understand the exact nature of the seasonality in the data.
plot_acf(TS_Train['PM2.5'],lags=60);
Now, we will check whether the Training data is stationary using the Augmented Dickey-Fuller
Test.
dftest = adfuller(TS_Train['PM2.5'],regression='ct')
print('DF test statistic is %9.9f' %dftest[0])
print('DF test p-value is' ,dftest[1])
print('Number of lags used' ,dftest[2])
The data is not stationary at the 95% confidence level. Now, we will take a first order difference of the data and then check for stationarity again.
dftest = adfuller(TS_Train['PM2.5'].diff().dropna(), regression='ct')
print('DF test statistic is %9.9f' % dftest[0])
print('DF test p-value is', dftest[1])
print('Number of lags used', dftest[2])
After taking a first order differencing we see that the data has indeed become stationary at 95%
confidence level.
Let us plot the differenced Time Series once and check the plots of the ACF and the PACF.
f,a = plt.subplots(1,2,sharex=True,sharey=False,squeeze=False)
plot_0 = plot_acf(TS_Train['PM2.5'].diff(),ax=a[0][0],missing='drop')
plot_1 = plot_pacf(TS_Train['PM2.5'].diff().dropna(),ax=a[0][1],zero=False);
[email protected]
Y21IHWS8GO
Figure 25: Plot of first difference series, ACF and PACF of PM2.5
ARIMA(p, d, q):
## The following loop helps us in getting a combination of different parameters
## of p and q in the range of 0 and 3
## We have kept the value of d as 1 as we need to take a difference of the
## series to make it stationary.
import itertools
p = q = range(0, 4)
d = range(1, 2)
pdq = list(itertools.product(p, d, q))

# Empty data frame to store the parameter values along with the model AIC
# (this initialization is not shown in the original listing)
ARIMA_AIC = pd.DataFrame(columns=['param', 'AIC'])

for param in pdq:  # running a loop within the pdq parameters defined by itertools
    # fitting the ARIMA model using the parameters from the loop
    ARIMA_model = sm.tsa.statespace.SARIMAX(TS_Train['PM2.5'].values, order=param).fit()
    # appending the AIC values and the model parameters to the previously created
    # data frame for easier understanding and sorting of the AIC values
    ARIMA_AIC = ARIMA_AIC.append({'param': param, 'AIC': ARIMA_model.aic},
                                 ignore_index=True)

## Sort the above AIC values in the ascending order to get the parameters
## for the minimum AIC value
ARIMA_AIC.sort_values(by='AIC', ascending=True)
TS_AutoARIMA = sm.tsa.statespace.SARIMAX(endog=TS_Train['PM2.5'], order=(0, 1, 1))
TS_AutoARIMA = TS_AutoARIMA.fit()
print(TS_AutoARIMA.summary())
[email protected]
Y21IHWS8GO
According to the result, ARIMA(0,1,1) is the indicated model for the PM2.5 data, with AIC = 1500.213 and BIC = 1506.194.
TS_AutoARIMA.plot_diagnostics();
# Forecast for the test set duration using the automated ARIMA model with 95%
# confidence intervals
pred_AutoARIMA = TS_AutoARIMA.get_forecast(steps=len(TS_Test))
[email protected]
Y21IHWS8GO
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited. 47
TS_Test['PM2.5'].plot()
pred_AutoARIMA.summary_frame()['mean'].plot()
plt.grid()
plt.title('PM2.5: Actual vs Forecast - SARIMA Model')
plt.xlabel('Time')
plt.legend(['Actual Data','Forecasted Data']);
[email protected]
Y21IHWS8GO
Figure 28: Plot Actual vs. Forecasted PM2.5 data using ARIMA(0,1,1) for 2016-2017 years
Here, we have used the AIC to select the best p and q values for the ARIMA models. We can also decide the p and q values based on the lags at which the ACF and the PACF cut off at a particular confidence level (usually the 95% confidence level is taken).
It is clear that the ARIMA model does not provide a good approximation to the data. An effort will be made to fit an ARIMAX model using a set of covariates (temperature, dew point, rainfall amount and wind speed) to improve the accuracy of the model.
## Here we have taken the range of p and q to be within 0 to 3. We can change
## this if need be. The range of p and q has been defined in the last loop and
## we will be using the same parameter values.

# Empty data frame to store the parameter values along with the model AIC
# (this initialization is not shown in the original listing)
ARIMAX_AIC = pd.DataFrame(columns=['param', 'AIC'])

for param in pdq:  # running a loop within the pdq parameters defined by itertools
    # fitting the ARIMAX model using the parameters from the loop
    ARIMAX_model = sm.tsa.statespace.SARIMAX(endog=TS_Train['PM2.5'].values,
                                             order=param,
                                             exog=TS_Train[['TEMP', 'DEWP', 'RAIN', 'WSPM']]).fit()
    # appending the AIC values and the model parameters to the previously created
    # data frame for easier understanding and sorting of the AIC values
    ARIMAX_AIC = ARIMAX_AIC.append({'param': param, 'AIC': ARIMAX_model.aic},
                                   ignore_index=True)

## Sorting the parameters of the ARIMAX models to get the parameters which give
## us the lowest AIC value
ARIMAX_AIC.sort_values(by=['AIC']).head()
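The construction of the TS_ARIMAX object fitted below is not shown; a plausible sketch, assuming the lowest-AIC specification ARIMAX(0,1,3) with the four exogenous regressors:

# ARIMAX(0,1,3) with temperature, dew point, rainfall and wind speed as exogenous
# regressors (a sketch; the original model construction is not reproduced)
TS_ARIMAX = sm.tsa.statespace.SARIMAX(endog=TS_Train['PM2.5'],
                                      exog=TS_Train[['TEMP', 'DEWP', 'RAIN', 'WSPM']],
                                      order=(0, 1, 3))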
TS_ARIMAX = TS_ARIMAX.fit()
print(TS_ARIMAX.summary())
[email protected]
Y21IHWS8GO
According to the result, ARIMAX(0,1,3) is the indicated model for the PM2.5 data, with AIC = 1443.447 and BIC = 1467.371.
TS_ARIMAX.plot_diagnostics();
pred = TS_ARIMAX.get_forecast(steps=len(TS_Test),
                              exog=TS_Test[['TEMP', 'DEWP', 'RAIN', 'WSPM']])
axis = TS_Train['PM2.5'].plot()
pred.summary_frame(alpha=0.05)['mean'].plot(ax=axis, label='Forecast', alpha=0.7)
axis.fill_between(pred.summary_frame(alpha=0.05).index,
                  pred.summary_frame(alpha=0.05)['mean_ci_lower'],
                  pred.summary_frame(alpha=0.05)['mean_ci_upper'],
                  color='k', alpha=.15)
axis.set_xlabel('Time')
axis.set_ylabel('PM2.5')
plt.legend(loc='best')
plt.title('PM2.5 data forecast using ARIMAX (0,1,3)')
plt.grid();
TS_Test['PM2.5'].plot()
pred.summary_frame()['mean'].plot()
plt.grid()
plt.title('PM2.5: Actual vs Forecast - ARIMAX Model (0,1,3)')
plt.xlabel('Time')
plt.legend(['Actual Data', 'Forecasted Data']);
Figure 31: Plot Actual vs. Forecasted PM2.5 data using ARIMAX (0,1,3) for 2016-2017 years
RMSE = mean_squared_error(TS_Test['PM2.5'],pred.predicted_mean,squared=False)
MAPE = mean_absolute_percentage_error(TS_Test['PM2.5'],pred.predicted_mean)
Here, we have chosen the model parameters (p and q) using the lowest AIC. But we can also go back and investigate the lags at which the ACF and the PACF cut off and take the p and q values accordingly for the ARIMAX model.
[email protected]
Y21IHWS8GO
Note that none of the above models shows a small MAPE. However, it is to be noted that with the addition of the exogenous variables, there is a reduction of 40% in the MAPE.
Since no future data for exogenous variables is available, forecast into the future for ARIMAX
models (beyond the time stamps of the test set) is not possible. The seasonal parameters (with the
appropriate seasonal frequency) may be added to the ARIMAX model to make it a SARIMAX
model.
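A minimal sketch of such a SARIMAX specification is given below; the seasonal order and the 52-week period are illustrative assumptions, not fitted recommendations:

# Illustrative SARIMAX: ARIMAX(0,1,3) plus a seasonal (1,1,1) component with a
# 52-week period (the seasonal order here is an assumption)
TS_SARIMAX = sm.tsa.statespace.SARIMAX(endog=TS_Train['PM2.5'],
                                       exog=TS_Train[['TEMP', 'DEWP', 'RAIN', 'WSPM']],
                                       order=(0, 1, 3),
                                       seasonal_order=(1, 1, 1, 52)).fit()
print(TS_SARIMAX.summary())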
Appendix: Libraries and Versions

Library      Version
Pandas       1.0.5
Numpy        1.19.0
Matplotlib   3.2.1
Seaborn      0.10.1
Statsmodels  0.12.0
Sklearn      0.23.1
[email protected]
Y21IHWS8GO
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited. 57
References

Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and Practice. OTexts. Available at: https://otexts.org/fpp2/
Tsay, R. S. (2005). Analysis of Financial Time Series (Vol. 543). John Wiley & Sons.
Mills, T. C., & Patterson, K. D. (2015). Modelling the trend: the historical origins of some modern methods and ideas. Journal of Economic Surveys, 29(3), 527-548.
Klein, J. L., & Klein, D. (1997). Statistical Visions in Time: A History of Time Series Analysis, 1662-1938. Cambridge University Press.
Nau, R. (2018). Statistical Forecasting: Notes on Regression and Time Series Analysis. Fuqua School of Business, Duke University. Available at: http://people.duke.edu/~rnau/411home.htm
Hyndman, R., Koehler, A. B., Ord, J. K., & Snyder, R. D. (2008). Forecasting with Exponential Smoothing: The State Space Approach. Springer Science & Business Media.
Coghlan, A. (2015). A Little Book of R for Time Series. Available at: https://media.readthedocs.org/pdf/a-little-book-of-r-for-time-series/latest/a-little-book-of-r-for-time-series.pdf
Tsay, R. S. (2014). An Introduction to Analysis of Financial Data with R. John Wiley & Sons.
Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time Series Analysis: Forecasting and Control (5th ed.). Hoboken, New Jersey: John Wiley & Sons.