Group 9 Time Series Data Analysis (ARIMA)

This document discusses time series data analysis using ARIMA models. It introduces time series components like trend, seasonality, cyclicity and irregularity. It explains how to check for stationarity using Dickey-Fuller test and how to make a non-stationary series stationary using differencing. The document also explains AR, MA, ARMA and ARIMA models and their properties.


NATIONAL INSTITUTE OF TECHNOLOGY
TIRUCHIRAPALLI

DEPARTMENT OF PRODUCTION ENGINEERING

DATA ANALYTICS

TOPIC : TIME SERIES DATA ANALYSIS (ARIMA)

Under the guidance of
Dr. Vimal K E K

SANKAR M (214223025)
SATYAM (214223026)
SHIVAM SINGH (214223027)

M.Tech
Industrial Engineering & Management
OVERVIEW

 Introduction to time series
 Components of time series
   1. Trend 2. Seasonal 3. Cyclic 4. Irregular
 Forecasting models
 ARIMA model
 Implementation
   R programming
   Python : 1. pandas 2. matplotlib
INTRODUCTION
 A time series is a sequence of data points recorded at
specific time intervals.
 These data points are analyzed to forecast future values.
 It is time dependent.

Categories and terminologies

 Univariate and multivariate
 Linear or non-linear
 Discrete or continuous
 Stationary and non-stationary
COMPONENTS

 Trend : It defines whether, over a period, the time series increases
or decreases. That is, it has an upward (increasing) trend or a
downward (decreasing) trend.
 Eg : Population growth over the years can be seen as an
upward trend.
COMPONENTS

 Seasonality : It defines a pattern that repeats over a fixed period.
A pattern that repeats periodically in this way is called seasonality.
 It is a short-term variation usually caused by climate, traditional
habits, etc.
 Eg : Sales of ice cream increase during the summer season.
COMPONENTS
 Cyclicity : Cyclicity is also a pattern in the time series data,
but it repeats aperiodically, meaning it does not repeat after
fixed intervals.
 It is a medium-term variation caused by circumstances that
repeat in cycles.
 Eg : 5 years of economic growth followed by 2 years of
economic recession, followed by 7 years of economic growth
followed by 1 year of economic recession.
COMPONENTS

 Irregularity : Irregular or random variations in a time
series are caused by unpredictable influences, which are
not regular. These variations are caused by incidences such
as war, earthquake, strike, flood, etc. There is no defined
statistical technique for measuring random fluctuations in a
time series.
Combination of the four components
Considering the effects of these components, two
different types of models are generally used for a time series.
Additive model
Y(t) = T(t) + S(t) + C(t) + I(t)
Assumption : Components are independent of each
other.
Multiplicative model
Y(t) = T(t) * S(t) * C(t) * I(t)
Assumption : Components are not necessarily
independent, and they can affect each other.
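As a sketch of how such a decomposition looks in code: the snippet below uses seasonal_decompose from statsmodels, which supports both the additive and the multiplicative form (it folds the cyclic component into the trend estimate). The file name and column name are hypothetical placeholders.

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly data with a DatetimeIndex.
series = pd.read_csv("sales.csv", index_col=0, parse_dates=True)["sales"]

# Additive form: Y(t) = T(t) + S(t) + I(t); cyclicity is absorbed into the trend.
additive = seasonal_decompose(series, model="additive")

# Multiplicative form: Y(t) = T(t) * S(t) * I(t).
multiplicative = seasonal_decompose(series, model="multiplicative")

additive.plot()   # panels: observed, trend, seasonal, residual
plt.show()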
STATIONARY AND NON STATIONARY

 Stationary Time Series
 A stationary time series is one where the mean and the
variance are both constant over time. In other words, it is one
whose properties do not depend on the time at which the
series is observed. Thus, the time series is a flat series without
trend, with constant variance over time, a constant mean, a
constant autocorrelation and no seasonality. This makes a
stationary time series easy to predict.
STATIONARY AND NON STATIONARY
 Non-Stationary Time Series
 A non-stationary time series is one where either the mean
or the variance, or both, are not constant over time.
 Most statistical forecasting methods are based on the
assumption that the time series can be rendered
approximately stationary after mathematical
transformations.
STATIONARITY

Statistical properties stay more or less the same over time.

Properties:
1) Constant mean
2) Constant variance
3) No seasonality
Seasonality : repeating trends/patterns over time

How to check stationarity
 Visual inspection
 Dickey-Fuller test
How to check stationarity
What is the Dickey-Fuller test?
Imagine a series where a fraction of the
current value depends on the previous
value of the series.
The DF test builds a regression between the
change in the current value, Δyt, and the
previous value yt-1.
The usual t-statistic is not valid, so Dickey and
Fuller developed appropriate critical values. If the
p-value of the DF test is < 5%, then the series is
stationary.
TESTING STATIONARITY
The p-value has to be less than 0.05 (5%).
If the p-value is greater than 0.05, you fail
to reject the null hypothesis and conclude
that the time series has a unit root.
In that case, you should first difference
the series before proceeding with analysis.
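A minimal sketch of this check using the augmented Dickey-Fuller implementation in statsmodels; `series` is assumed to be a pandas Series holding the time series.

from statsmodels.tsa.stattools import adfuller

# adfuller returns (test statistic, p-value, lags used, n obs, critical values, ...)
adf_stat, p_value = adfuller(series)[:2]
print(f"ADF statistic: {adf_stat:.3f}, p-value: {p_value:.3f}")

if p_value < 0.05:
    print("Reject the null hypothesis: the series is stationary.")
else:
    print("Fail to reject the null: unit root present; difference the series first.")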
Dickey-Fuller Test for stationarity

 Assume an AR(1) model. The model is non-stationary, i.e. a unit root
is present, if ρ = 1:

yt = ρ yt-1 + et
yt − yt-1 = ρ yt-1 − yt-1 + et
Δyt = (ρ − 1) yt-1 + et
Δyt = γ yt-1 + et,   where γ = ρ − 1

 We can estimate the above model and test for the significance of the
coefficient γ.
 If the null hypothesis γ = 0 is not rejected, then yt is not stationary.
Difference the variable and repeat the Dickey-Fuller test to see if the
differenced variable is stationary.
 If the null hypothesis is rejected, γ < 0, then yt is stationary. Use the
variable.

CONVERT NON-STATIONARY TO STATIONARY

 Differencing : transformation of the series into a new time series
where the values are the differences between consecutive values.
 The procedure may be applied consecutively more than once, giving
rise to the "first differences", "second differences", etc.
 Regular differencing (RD)

(1st order) ∇Xt = Xt − Xt-1
(2nd order) ∇²Xt = ∇(Xt − Xt-1) = Xt − 2Xt-1 + Xt-2

 It is unlikely that more than two regular differencings would ever
be needed.
 Sometimes regular differencing by itself is not sufficient and a
prior transformation is also needed.
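A minimal sketch of regular differencing with pandas, reusing the `series` object assumed in the earlier sketches:

# First and second regular differences, plus a seasonal difference.
first_diff = series.diff().dropna()           # Xt - Xt-1         (1st order)
second_diff = series.diff().diff().dropna()   # Xt - 2Xt-1 + Xt-2 (2nd order)
seasonal_diff = series.diff(12).dropna()      # lag-12 difference for monthly seasonality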
Example 1 (figure) : US net electricity generation (billion kWh); the other
panels show the same data after transforming and differencing.

Example 2 (figure) : Logs and seasonal differences of the A10 (antidiabetic)
sales data. The logarithms stabilise the variance, while the seasonal
differences remove the seasonality and trend.
ARIMA
ARIMA is an acronym that stands for Auto Regressive Integrated
Moving Average.

White Noise Processes

White noise is a series that is not predictable, as it is a sequence
of random numbers with a constant mean µ, a constant variance σ²
and zero autocorrelation. If you build a model and its residuals
(the differences between predicted and actual values) look like
white noise, then the model is a good fit. Conversely, if there are
visible patterns in the residuals, a better model exists for your
dataset and you should try another model.
• If µ = 0 then the process is known as Zero-Mean White Noise.
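A minimal sketch generating zero-mean white noise with NumPy, to illustrate what well-behaved residuals should look like:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
white_noise = rng.normal(loc=0.0, scale=1.0, size=200)  # mu = 0: zero-mean white noise

plt.plot(white_noise)
plt.title("Zero-mean white noise: no trend, no seasonality, no autocorrelation")
plt.show()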

Moving Average (MA) Process

 Moving average (MA) models account for the possibility of a
relationship between a variable and the residuals from previous periods.
The qth-order moving average model, denoted MA(q), is:

yt = µ + ut + θ1 ut-1 + θ2 ut-2 + … + θq ut-q

• A moving average model is simply a linear combination of white noise
processes, so that yt depends on the current and previous values of a white
noise disturbance term.
• This equation can also be written with the help of the lag operator, where
Lyt = yt-1 denotes that yt is lagged once.
• To denote the ith lag of yt, the notation is Li yt = yt-i. Using the lag
operator, the above equation can be written as:

yt = µ + θ(L) ut ,   where θ(L) = 1 + θ1L + θ2L² + … + θqLq
Autocorrelation Function
 The ACF is the ratio of the autocovariance of yt and yt-k to the
variance of the dependent variable yt.
 The autocorrelation function ACF(k) gives the gross correlation
between yt and yt-k.
 An important property of MA(q) models in general is that the
autocorrelations are non-zero for the first q lags, and ρk = 0 for all
lags k > q.
 In other words, the ACF provides a considerable amount of
information about the order of dependence q for an MA(q) process.
 Identification of an MA model is often best done with the ACF.
MA Examples with ACF
MA(1) : Yt = ut + 0.8 ut-1
MA(2) : Yt = ut + 0.5 ut-1 + 0.3 ut-2
The generalized ACF for an MA(q) process is:

ρk = ( θk + θ1θk+1 + θ2θk+2 + … + θq-kθq ) / ( 1 + θ1² + θ2² + … + θq² )   for k = 1, …, q
ρk = 0   for k > q
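A minimal sketch simulating these two MA examples with statsmodels' ArmaProcess and plotting their ACFs; the ACF should cut off after lag q (1 and 2 respectively).

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_acf

np.random.seed(0)
# Coefficient arrays include the zero-lag term (1) by convention.
ma1 = ArmaProcess(ar=[1], ma=[1, 0.8]).generate_sample(nsample=500)       # Yt = ut + 0.8 ut-1
ma2 = ArmaProcess(ar=[1], ma=[1, 0.5, 0.3]).generate_sample(nsample=500)  # Yt = ut + 0.5 ut-1 + 0.3 ut-2

fig, axes = plt.subplots(2, 1)
plot_acf(ma1, lags=20, ax=axes[0], title="ACF of MA(1): cuts off after lag 1")
plot_acf(ma2, lags=20, ax=axes[1], title="ACF of MA(2): cuts off after lag 2")
fig.tight_layout()
plt.show()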
Auto Regression (AR)
 An autoregressive model is one where the current value of a
variable yt depends only upon the values that the variable took in
previous periods plus an error term.
 The generalized equation of an AR(p) model of order p is:

yt = µ + φ1 yt-1 + φ2 yt-2 + … + φp yt-p + ut

 where ut is a white noise disturbance term.
 The above equation can be rewritten using the lag operator as:

φ(L) yt = µ + ut ,   where φ(L) = 1 − φ1L − φ2L² − … − φpLp

 An AR(1) process can be converted into an infinite-order MA process, MA(∞).

Partial Auto Correlation Function (PACF)
 Indicates the dependence of Xt on Xt-k when the dependence on all
the intermediate variables Xt-1, Xt-2, …, Xt-k+1 is removed or not
considered.

φ1 is the PAC of order 1
φ2 is the PAC of order 2

 Partial autocorrelations are calculated using the Yule-Walker equations.

 AR(1) Process : Xt = 0.9 Xt-1 + ut
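A minimal sketch simulating this AR(1) process and plotting its PACF, which should cut off after lag p = 1; the "ywm" option uses a Yule-Walker estimate, matching the slide above.

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_pacf

np.random.seed(0)
# AR polynomial is 1 - 0.9L, so the coefficient array is [1, -0.9].
ar1 = ArmaProcess(ar=[1, -0.9], ma=[1]).generate_sample(nsample=500)  # Xt = 0.9 Xt-1 + ut

plot_pacf(ar1, lags=20, method="ywm", title="PACF of AR(1): cuts off after lag 1")
plt.show()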
ARMA Processes
 By combining the AR(p) and MA(q) models, an ARMA(p, q) model
is obtained. Such a model states that the current value of some series
y depends linearly on its own previous values plus a combination of
current and previous values of a white noise error term.
 The general equation of ARMA(p, q) is:

yt = µ + φ1 yt-1 + … + φp yt-p + ut + θ1 ut-1 + … + θq ut-q

 The characteristics of an ARMA process will be a combination of
those from the autoregressive (AR) and moving average (MA)
processes.
 The autocorrelation function will display combinations of
behaviour derived from the AR and MA parts, but for lags beyond
q the ACF will simply be identical to that of the individual AR(p)
model, so the AR part will dominate in the long term.
ARMA TO ARIMA
 In AR, MA and ARMA models there is one assumption or necessity:
the time series has to be stationary.
 If the time series is non-stationary, then the series has to be
transformed into a stationary series.
 In an ARMA model the I (integrated) term is zero, but when
differencing is done to make the series stationary, the I term
becomes non-zero.
 After differencing, the ARMA model becomes an ARIMA model,
which is represented by (p, d, q).
 The general equation of ARIMA is given below:

y't = µ + φ1 y't-1 + … + φp y't-p + ut + θ1 ut-1 + … + θq ut-q

 It is similar to the ARMA equation; it is just that instead of the given
data (y), the differenced data (y') is used in the equation, where y' is
the series after d rounds of differencing.
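A minimal sketch of specifying an ARIMA(p, d, q) in statsmodels; the order (1, 1, 1) is an arbitrary illustration, not a recommendation, and `series` is the assumed pandas Series from the earlier sketches.

from statsmodels.tsa.arima.model import ARIMA

model = ARIMA(series, order=(1, 1, 1))   # p = 1 AR terms, d = 1 difference, q = 1 MA terms
fitted = model.fit()
print(fitted.summary())

forecast = fitted.forecast(steps=12)     # forecast the next 12 periods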
Procedure of Box-Jenkins Approach
 To build an ARIMA model of order (p, d, q) there are four major
steps to be followed:
1. Ensuring Stationarity
2. Identification and Selection of Model Structure
3. Parameter Estimation
4. Model Testing/Validation
Box-Jenkins Approach
1. Identification of Model Structure

Identify whether the series is stationary or not
using the ADF test.
Remove non-stationarity, if any.
Obtain the order of the AR component with the
help of the PACF.
Obtain the order of the MA component with the
help of the ACF.
2. Parameter Estimation

Algorithms are available for parameter
estimation.
One such example is Marquardt's algorithm.
Statistical toolboxes can be used for estimation;
the 'armax' function in MATLAB is one such example.
3. Model Selection

 Model selection is important in time series
analysis as there can be many possible models.
 In general, AR orders up to 6 and MA orders up
to 2 or 3 serve the purpose.
 A model may be selected from several candidate
models using the following two criteria:
1. Maximum Likelihood rule (ML)
2. Mean Square Error (MSE)
1. Maximum Likelihood Rule (ML)

The maximum likelihood value for each of the
candidate models is evaluated.
The model with the highest likelihood value is
chosen.
This method is mostly used for applications where
long-term forecasting is needed.
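A minimal sketch of the ML rule using the log-likelihood (llf) that statsmodels reports for each fitted candidate; the candidate orders are illustrative. In practice, penalized criteria such as AIC or BIC (results.aic, results.bic) are often preferred, since the raw likelihood never penalizes extra parameters.

from statsmodels.tsa.arima.model import ARIMA

candidates = [(1, 1, 0), (0, 1, 1), (1, 1, 1), (2, 1, 1)]   # illustrative orders
fits = {order: ARIMA(series, order=order).fit() for order in candidates}

# Pick the candidate with the highest log-likelihood.
best_order = max(fits, key=lambda order: fits[order].llf)
print("Highest log-likelihood:", best_order, fits[best_order].llf)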
2. Mean Square Error (MSE)

Using a portion of the available data (N/2),
estimate the parameters of the different models.
Forecast the series for the remaining N/2 data
points using the candidate models.
Estimate the MSE corresponding to each model.
The model with the least MSE is selected for
prediction.
This method is mostly used when the model is
required for short-term forecasting.
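A minimal sketch of this N/2 holdout procedure with statsmodels; the candidate orders are illustrative.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

n = len(series)
train, test = series[: n // 2], series[n // 2 :]   # fit on the first half

mse = {}
for order in [(1, 1, 0), (0, 1, 1), (1, 1, 1)]:
    fit = ARIMA(train, order=order).fit()
    pred = fit.forecast(steps=len(test))           # forecast the held-out half
    mse[order] = np.mean((np.asarray(test) - np.asarray(pred)) ** 2)

best = min(mse, key=mse.get)                       # least MSE wins
print("Selected order:", best, "MSE:", mse[best])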
4. Model Testing / Validation

• The first 'T' values are used to build the model (say 50% of
the available data) and the rest of the data is used to validate
the model.
• All the tests are carried out on the residual series.
• The tests are performed to examine whether the following
assumptions used in building the model hold, as checked in
the sketch below:
The residual series has zero mean – significance
test of the residual mean.
The residual series is uncorrelated – Whittle's
white noise test.
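A minimal sketch of these two residual checks. statsmodels does not ship Whittle's test, so the Ljung-Box test is used here as a common substitute for checking that the residuals are uncorrelated; `fitted` is the assumed ARIMA result from the earlier sketch.

from scipy import stats
from statsmodels.stats.diagnostic import acorr_ljungbox

residuals = fitted.resid.dropna()

# 1) Significance of the residual mean (H0: mean = 0).
t_stat, p_mean = stats.ttest_1samp(residuals, popmean=0.0)
print(f"zero-mean test p-value: {p_mean:.3f}")

# 2) Residuals uncorrelated (H0: no autocorrelation up to lag 10).
print(acorr_ljungbox(residuals, lags=[10]))  # small p-values flag leftover correlation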
