Data Collection: Methodology

The document outlines the Box-Jenkins methodology for time series forecasting and model identification. It involves 5 steps: [1] data collection, [2] determining stationarity, [3] model identification and estimation, [4] diagnostic checking, and [5] forecasting and evaluation. The methodology was applied to quarterly electricity consumption data from 2000-2017 in the Philippines to identify an appropriate ARIMA model and test its forecasting ability.

Uploaded by

Theodore Gonzalo

METHODOLOGY

Figure: Conceptual Framework of Box Jenkins. Flowchart of the steps: Data Collection -> Determine the Stationarity of Time Series -> Model Identification and Estimation -> Diagnostic Checking -> White Noise Test (No / Yes) -> Forecasting and Forecast Evaluation (reached on Yes).

Box and Jenkins (1970) were the first to approach the task of estimating an ARMA model in a
systematic manner. Their approach has five steps: identification, estimation, diagnostic
checking, forecasting, and evaluation.
TRAINING SET                                TESTING SET
Year     Qtr1     Qtr2     Qtr3     Qtr4    Year     Qtr1     Qtr2     Qtr3     Qtr4
2000   24,621   28,324   29,482   27,159    2012   75,896   93,300   72,129   65,549
2001   30,391   36,539   36,925   30,719    2013   78,148  103,999   75,266   59,029
2002   35,136   39,014   32,256   32,208    2014   83,832  108,429   78,501   76,877
2003   36,094   40,642   35,343   33,575    2015   87,350  112,880   78,285   77,223
2004   38,145   41,828   39,066   40,972    2016   91,267  118,138   83,029   81,371
2005   44,492   48,846   43,659   42,202    2017   96,130  126,238   89,024   91,843
2006   46,828   54,585   49,141   45,111
2007   50,801   60,980   52,156   46,609
2008   53,087   64,151   51,882   46,236
2009   51,282   68,064   53,448   47,299
2010   63,547   83,087   62,220   56,446
2011   64,848   53,662   63,544   58,542

DATA COLLECTION- The data used in this study were the quarterly electricity consumption
series of the Philippines, in millions of pesos, covering the period from January 2000 to
December 2017. The data were collected from the Philippine Statistics Authority (PSA).

The data are divided into two subsets: a training set (January 2000-December 2011) and a test set
(January 2012-December 2017). The training set was used to fit the forecasting model, and the test
set was used to check how reliable the forecasts from the fitted model are.
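The split described above can be sketched in Python as follows. This is only an illustration: the values are copied from the table above, while the names `consumption`, `TRAIN_END_YEAR`, and `split_by_year` are hypothetical and not taken from the study, which was carried out in R Studio.

```python
# Quarterly electricity consumption copied from the table above,
# ordered Qtr1..Qtr4 within each year (2000 through 2017).
consumption = {
    2000: [24621, 28324, 29482, 27159],
    2001: [30391, 36539, 36925, 30719],
    2002: [35136, 39014, 32256, 32208],
    2003: [36094, 40642, 35343, 33575],
    2004: [38145, 41828, 39066, 40972],
    2005: [44492, 48846, 43659, 42202],
    2006: [46828, 54585, 49141, 45111],
    2007: [50801, 60980, 52156, 46609],
    2008: [53087, 64151, 51882, 46236],
    2009: [51282, 68064, 53448, 47299],
    2010: [63547, 83087, 62220, 56446],
    2011: [64848, 53662, 63544, 58542],
    2012: [75896, 93300, 72129, 65549],
    2013: [78148, 103999, 75266, 59029],
    2014: [83832, 108429, 78501, 76877],
    2015: [87350, 112880, 78285, 77223],
    2016: [91267, 118138, 83029, 81371],
    2017: [96130, 126238, 89024, 91843],
}

TRAIN_END_YEAR = 2011  # last year of the training set, as stated in the text


def split_by_year(data, train_end_year):
    """Return (train, test) lists of (year, quarter, value) in time order."""
    train, test = [], []
    for year in sorted(data):
        for quarter, value in enumerate(data[year], start=1):
            row = (year, quarter, value)
            if year <= train_end_year:
                train.append(row)
            else:
                test.append(row)
    return train, test


train, test = split_by_year(consumption, TRAIN_END_YEAR)
print(len(train), len(test))  # 48 training quarters, 24 test quarters
```

Splitting by year rather than at random keeps the observations in time order, which a time series forecast needs.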

DETERMINE THE STATIONARITY OF TIME SERIES- The researchers used the Augmented
Dickey-Fuller Test and other specific tests in RStudio to determine whether the data are
stationary or not.
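The idea behind the test can be sketched in plain Python. The sketch below is the basic Dickey-Fuller regression with a constant and no lagged differences, so it is a simplification of the augmented test the researchers ran, not a reproduction of it. The function name, the simulated series, and the critical value of about -2.86 (the approximate large-sample 5% value for the constant-only case) are assumptions made for illustration.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic on gamma in: (y[t] - y[t-1]) = alpha + gamma * y[t-1] + error.

    Simplified Dickey-Fuller regression (constant, no lagged differences).
    """
    x = y[:-1]                                    # lagged levels y[t-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]   # first differences
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((v - mean_x) * (d - mean_dy) for v, d in zip(x, dy))
    gamma = sxy / sxx                             # OLS slope
    alpha = mean_dy - gamma * mean_x              # OLS intercept
    rss = sum((d - alpha - gamma * v) ** 2 for v, d in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)     # standard error of the slope
    return gamma / se_gamma


# Assumed approximate 5% critical value (constant, no trend, large sample).
CRITICAL_5PCT = -2.86

# Simulated white noise, which is stationary by construction (illustrative only).
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(200)]
stat = dickey_fuller_stat(noise)
print("reject unit root:", stat < CRITICAL_5PCT)
```

A statistic below the critical value rejects the unit-root hypothesis, which points to a stationary series. Otherwise the series is usually differenced and tested again, which is what determines d in ARIMA (p,d,q).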

MODEL IDENTIFICATION AND ESTIMATION- In order to identify the appropriate model,
the researchers should first plot all the data from the starting point. The researchers then used
the plots of the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) of the
time series as a clue to which autoregressive order (p) and moving average order (q) should be used
in the model, starting with the most prominent spikes. The parameters of the selected
Autoregressive Integrated Moving Average (ARIMA) (p,d,q) model can be estimated by Ordinary Least
Squares (OLS) or by maximum likelihood.
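The ACF and PACF values behind such plots can be computed as in the following Python sketch. It uses the standard sample autocorrelation formula and the Durbin-Levinson recursion for the partial autocorrelations. The function names are hypothetical, the example series is only the first twelve quarters of the table (2000-2002), and the band of 1.96 divided by the square root of n is the usual approximate 95% bound, not a value reported in the study.

```python
import math


def acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(y, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(y, max_lag)
    out = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_kk.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


# First twelve quarters from the table (2000-2002), for illustration only.
series = [24621, 28324, 29482, 27159, 30391, 36539,
          36925, 30719, 35136, 39014, 32256, 32208]

bound = 1.96 / math.sqrt(len(series))  # approximate 95% significance band
for lag, (a, p) in enumerate(zip(acf(series, 4), pacf(series, 4))):
    flag = "spike" if lag > 0 and (abs(a) > bound or abs(p) > bound) else ""
    print(lag, round(a, 3), round(p, 3), flag)
```

Lags whose PACF value falls outside the band are candidates for p, and lags whose ACF value falls outside it are candidates for q.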

DIAGNOSTIC CHECKING- After model identification and estimation, the researchers must
check whether the residuals of the fitted model are White Noise. White Noise must be independent
and identically distributed (IID) and normal. This means that every observation follows the same
distribution with the same parameters.

WHITE NOISE- White Noise must be independent and identically distributed (IID) and normal. This
means that the data share the same distribution parameters. The Shapiro-Wilk Test gives a
p-value of 0.4596, which is above the conventional 0.05 significance level, so the test does not
reject normality for the model derived.

(note: not ours)
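The Shapiro-Wilk Test addresses only the normality part of the White Noise condition. The independence part is often checked with a Ljung-Box statistic on the residual autocorrelations. The Python sketch below shows that complementary check. It is not a test reported in this study. The residuals are hypothetical, and the critical value 9.488 is the 95% chi-square quantile for 4 degrees of freedom, used here without subtracting the number of fitted ARIMA parameters.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [v - mean for v in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# 95% chi-square quantile with 4 degrees of freedom (assumed setting).
CHI2_95_DF4 = 9.488

# Hypothetical residuals that alternate in sign, so they are clearly NOT white noise.
residuals = [1.0, -1.0] * 10
q_stat = ljung_box_q(residuals, max_lag=4)
looks_like_white_noise = q_stat <= CHI2_95_DF4
print(round(q_stat, 2), looks_like_white_noise)
```

If Q exceeds the critical value, the residuals still contain autocorrelation, and the flowchart's "No" branch applies: the model is revised before forecasting.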
