ARIMA
International Journal of Computer Applications (0975 – 8887)
Volume 120 – No.11, June 2015
The ARIMA procedure analyzes and forecasts uniformly spaced univariate time series data, transfer function data, and intervention data using the Autoregressive Integrated Moving-Average (ARIMA) or autoregressive moving-average (ARMA) model. An ARIMA model predicts a value in a response time series as a linear combination of its own past values, past errors, and current and past values of other time series. It is fitted starting from a random walk model, whose equation is given by
$\hat{Y}(t) - Y(t-1) = \mu$ (1)
$\hat{Y}(t) = \mu + Y(t-1)$ (2)
where $\hat{Y}(t)$ is the predicted value at time $t$ and $\mu$ is a constant. The prediction is obtained by adding the constant to the last period's value, so the constant estimates the average change per time interval.
ARIMA(p,d,q) models are used to identify the various seasonal changes. In general, the ARIMA(0,1,0) model, i.e. the random walk of equations (1) and (2), uses one non-seasonal difference and a constant term. In this paper the ARIMA(1,1,0) model is used, since it predicts the weather better by exploiting the autocorrelation of the previous data at lag 1. The general equation used for fitting the model is given in equation (3)
$\hat{Y}(t) = \mu + Y(t-1) + \phi\,(Y(t-1) - Y(t-2))$ (3)
where $\phi$ is the lag-1 autoregressive coefficient.
4. METHODOLOGY
In order to demonstrate the proposed model, a database is generated from the data of the meteorological department of India pertaining to Visakhapatnam district. This weather data set includes several attributes such as minimum temperature, maximum temperature, wind pressure, humidity, precipitation, sunshine, evaporation and category. The category attribute gives the intensity and takes the values normal, cloudy, depression, severe depression and cyclonic storm. This categorization is based on one of the attributes causing the changes in the weather and wind pressure. The relationship between the various attributes of the weather data is considered and their association is generated. The relationship among these associations helps in effective analysis of the weather. If the weather changes are not understood, several impacts may follow, such as coastal erosion and damage to agriculture, human health, infrastructure and land. Therefore, in this article the hidden associations among the attributes are considered based on the time series model. The initial estimates are identified from the autocorrelation, and the intermediate weather changes are estimated using moving averages, i.e. by using the partial autocorrelation.
The brief procedure is presented below:
1. Preprocess the data to remove missing values.
2. Calculate the regression values and autoregression values using the ARIMA model.
3. Consider different time lags to model the data.
4. Using correlation analysis, correlate the data and rank the data according to highest correlation.
5. The data with the highest correlation is considered to be the most likely weather change and is assumed to produce destructive effects.
5. RESULTS AND CONCLUSIONS
In this paper the weather data of Visakhapatnam city is considered for a period of 97 days, with attributes such as wind pressure, humidity, minimum and maximum temperature, forecast and type. The forecasting experiment is carried out to evaluate the weather condition for the next 15 days by using the ARIMA prediction algorithm to produce the forecasts. Initially the ARIMA(1,1,0) model is considered. The models are used to predict wind pressure and humidity for the next 15 days; the comparison between predicted results and real data is shown in Figure 1 and Figure 2.
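The recursion in equation (3) and the 15-day forecasting horizon can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the ordinary least-squares fit on the first differences, and the sample readings are all assumptions made for illustration, and only the Python standard library is used.

```python
def fit_arima_110(y):
    """Estimate (mu, phi) of equation (3) by least squares on first differences.

    Assumption: the differenced series d(t) = Y(t) - Y(t-1) is regressed on
    d(t-1) with an intercept; the paper does not state its fitting method.
    """
    d = [y[t] - y[t - 1] for t in range(1, len(y))]
    x, z = d[:-1], d[1:]               # predictor d(t-1), response d(t)
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    phi = sxz / sxx if sxx else 0.0    # constant differences: fall back to phi = 0
    mu = mean_z - phi * mean_x
    return mu, phi


def forecast_arima_110(y, mu, phi, steps):
    """Apply Y(t) = mu + Y(t-1) + phi * (Y(t-1) - Y(t-2)) recursively."""
    history = list(y)
    forecasts = []
    for _ in range(steps):
        nxt = mu + history[-1] + phi * (history[-1] - history[-2])
        forecasts.append(nxt)
        history.append(nxt)            # later steps build on earlier forecasts
    return forecasts


if __name__ == "__main__":
    # Hypothetical humidity readings, not taken from the paper's data set.
    humidity = [62.0, 64.0, 63.5, 66.0, 67.5, 67.0, 69.0, 70.5]
    mu, phi = fit_arima_110(humidity)
    print(forecast_arima_110(humidity, mu, phi, steps=15))
```

Because each forecast is fed back into the recursion, errors made at early steps carry over to later ones, which is one reason multi-step errors tend to grow with the horizon.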
14 21 32 1 32 6 42 1 td
15 21 33 1 34 7 43 1 td
16 1 10 2 5 240 5 2 five
17 1 10 2 9 50 15 2 zero
18 1 10 2 9 55 15 2 zero
19 1 10 2 8 60 10 2 one
20 1 10 2 8 80 10 3 one
21 1 10 2 8 90 9 3 two
22 1 10 2 8 90 9 3 two
23 1 10 2 8 100 8 4 three
24 1 10 2 8 110 8 4 three
25 1 10 2 6 125 8 5 four
26 1 10 2 6 135 7 5 four
27 1 10 2 6 145 7 5 four
28 1 10 2 6 155 7 5 four
29 31 70 1 62 12.7 33 1 td
30 32 70 1 46 13.8 38 1 td
31 33 70 1 72 10.8 43 1 td
Figure 1. Time series data of the predicted weather
Intuitively, the prediction quality for the two sequences degrades as the prediction step increases. The mean absolute error (MAE) and mean absolute percentage error (MAPE) are used for the analysis, and the results derived are presented in Table 3.
1 .610 .102
2 .264 .102
3 .130 .102
4 .067 .102
5 .016 .102
6 -.027 .102
7 -.026 .102
8 -.023 .102
9 -.009 .102
10 -.010 .102
11 -.002 .102
12 -.008 .102
13 -.373 .102
14 -.049 .102
15 .026 .102
16 .090 .102
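The methodology ranks time lags by their correlation. A minimal sketch of that step, assuming the standard sample autocorrelation estimator, is given here; the function names are hypothetical and the paper does not say which estimator or software it used. For 97 daily observations the large-sample standard error 1/sqrt(n) is about 0.102, which is consistent with the constant last column of the rows above if that column is a standard error (an assumption, since the source does not label the columns).

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (lag 0 gives 1)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def rank_lags(series, max_lag):
    """Return (lag, autocorrelation) pairs, strongest absolute correlation first."""
    pairs = [(lag, autocorrelation(series, lag)) for lag in range(1, max_lag + 1)]
    return sorted(pairs, key=lambda p: abs(p[1]), reverse=True)


def white_noise_std_error(n):
    """Approximate standard error of an autocorrelation estimate: 1 / sqrt(n)."""
    return 1.0 / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical wind-pressure readings, not taken from the paper's data set.
    pressure = [8.0, 8.5, 9.0, 8.0, 7.5, 8.0, 9.5, 10.0, 9.0, 8.5]
    for lag, r in rank_lags(pressure, max_lag=4):
        print(lag, round(r, 3))
    print(round(white_noise_std_error(97), 3))
```

A lag whose autocorrelation exceeds roughly twice the standard error in absolute value is usually treated as significant, which is a common rule of thumb rather than a claim made by the paper.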
Table 3. Error Analysis Table of Wind Pressure
STEP  MAE   MAPE
1     0.21  0.062
2     0.39  0.067
3     0.65  0.107
4     0.87  0.119
5     0.88  0.130
6     1.00  0.145
7     1.09  0.176
8     1.03  0.177
9     1.12  0.181
10    1.29  0.173
11    1.16  0.165
12    1.21  0.208
13    1.30  0.213
14    1.23  0.209
15    1.21  0.212
16    1.20  0.021
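The two error measures reported in Table 3 can be computed as in the following sketch. The function names are hypothetical, and MAPE is returned as a fraction rather than a percentage on the assumption that this matches the magnitude of the tabulated values; the paper does not state its exact convention.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average of |actual - predicted| over all points."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """MAPE as a fraction: average of |actual - predicted| / |actual|.

    Undefined when an actual value is zero, so such inputs are rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical observed and forecast values, not taken from the paper.
    observed = [10.0, 20.0]
    forecast = [11.0, 18.0]
    print(mean_absolute_error(observed, forecast))
    print(mean_absolute_percentage_error(observed, forecast))
```

MAE is expressed in the units of the forecast variable, while MAPE is scale-free, so the two columns of the table are not directly comparable with each other.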
As the MAE and MAPE columns show, the prediction errors of both humidity and wind pressure increase as the prediction step increases.
In this paper, a methodology for weather forecasting is presented using the data mining prediction algorithm ARIMA. The proposal is capable of analyzing weather data and forecasting the weather.
IJCATM : www.ijcaonline.org