
Gold Price Estimation Using A Multi Variable Model


2017 International Conference on Networks & Advances in Computational Technologies (NetACT) |20-22 July 2017| Trivandrum

GOLD PRICE ESTIMATION USING A MULTI VARIABLE MODEL
K.R. Sekar, Manav Srinivasan, K.S. Ravichandran, J. Sethuraman
School of Computing, SASTRA University, India

Abstract---Stock market analysis is a very popular area of research. Achieving good prediction in forecasting the stock markets is a very challenging task. The prediction of future stock markets is done using cascading statistical models. This paper investigates the MCX commodity (Gold), on which the model is applied.

Objective: To predict the trend of the gold commodity, which will reduce future uncertainty for investors.

Prediction techniques employed: Multi Variable Linear regression model, Time series models, Skewness and Kurtosis.

Finding: The extensive analysis of the attributes gives a broader view of gold as a product and of its importance as an investment segment, since it keeps money in step with the real-time drift and fluctuations in price. The data collected consists of commodity prices and volumes over the last 5 years on a monthly basis. The experimental results give us the predicted future values of the commodities.

Inference: Prediction can also be valuable in supporting planning about viable developments, and it provides a clear outlook for the eventual future market.

Keywords: Kurtosis, Skewness, Multi Variable Linear regression model, RMSE, Demonetization.
Keywords-Surveillance, Classification, SVM, CNN, PCA, LDA, HMM, K-NN, Optical Flow.

I. INTRODUCTION

Gold price prediction has become inevitable, as the bullion market has always fluctuated. Investment in gold is essential to people who need to acquire large assets. Gold is also a liquid asset, as exchange is quite easy. Shares, crude oil and gold are the markets most widely followed among business people. Predicting the gold trend needs many models. Mathematical, statistical, random process theory and operational research models give people prior knowledge about unforeseen data. Prediction is difficult because of the heavy change in the attributes and in the strategies followed by business people according to the trend of the market. In foreign countries like the US, UK and Australia, people rely only on real-time news about the markets and accordingly incorporate new short-term strategies for investment [2]. Hidden Markov Models, Similarity Index, Crisp Methodologies and Machine Learning algorithms have been widely appreciated in the field of prediction for the past one and a half decades [1][3][5]. Through similarity measures, it is possible to identify the richness and evenness of the attributes that contribute to the greater idea of prediction. Dead stocks and fast-moving items are predicted through association rules or market basket analysis [4].

Data Mining Methodologies are used to a large extent to analyse portfolios, risk management and trend analysis. Clustering algorithms are the tool used to find the net trading volume, price volatility and return volatility ratios [5]. Nowadays the predictors' attention turns to other segments like market movements, equity, the commodity market and Euro forex rates. For these, large-scale microblog data and search behaviour provide insight into market movements [6].

Intraday stock price forecasting can be done using ANN (Artificial Neural Networks) [7]. Stock price forecasting can also be done using Machine Learning techniques like Support Vector Machines [8].

II. RELATED WORKS

Gold price forecasting using a Multiple Linear Regression model has been done in the past, considering multiple independent variables like Inflation, S&P Index, Treasury Bill Rate, etc. This model provided significant insight into the different variables influencing gold price determination [9].

A study evaluated financial news articles and their influence on stock quotes. The results show that the SVM model built using article terms dominated all three metrics at the time of article release: a measure of closeness of 0.04261, directional accuracy of 57.1% and a simulated trading return of 2.06% [11].

978-1-5090-6590-5/17/$31.00 ©2017 IEEE

Authorized licensed use limited to: UNIVERSITAS GADJAH MADA. Downloaded on May 30,2021 at 07:38:35 UTC from IEEE Xplore. Restrictions apply.
Another study examined the impact of Twitter messages (exogenous inputs) on stock prices using a Linear Regression model. The results showed a correlation between the daily tweets and the stock market indicators. Hence Twitter data can be useful for predicting the stock market [13].

Yet another study combined news mining and time series analysis to forecast inter-day stock prices. The results show that NTF (News Mining and Time Series based Analysis forecasting) forecasts stock prices better than the regular TSA (Time Series Analysis) and forecasts stock price trends better than the classical random walk algorithm. Stock price forecasts can be obtained from news reports, which is an improvement over conventional forecasting techniques [14].

III. GOLD TREND ANALYSIS FOR THE FUTURE MARKET

A number of statistical and mathematical models have evolved over a period of time, and one of them is the Multi-variable Regression model. This method tries to estimate the value of a dependent variable Y that may be influenced by other independent variables, assuming that the relationship between Y and the independent variables is linear. Simple linear regression can be expressed as $Y = a + bX$, where X is the independent variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X is zero). When multiple variables are considered, the formula is as follows:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \varepsilon_i, \qquad i = 1, 2, 3, \dots, n$$

Here $Y_i$ represents the $i$th observation of the dependent variable, $\beta_j$ ($j = 0, 1, \dots, p$) represents the parameters to be estimated, $\varepsilon_i$ represents the $i$th normally distributed error term and $X_{ij}$ represents the $i$th observation of the $j$th variable.

A popular way of fitting this model is Least Squares Regression. This method calculates the best-fit line by minimizing the sum of the squares of the vertical deviations from each data point to the line (if a point lies exactly on the fitted line, its vertical deviation is 0). Because the deviations are squared, positive and negative deviations do not cancel each other. Outliers, influential observations, residuals and lurking variables still need to be addressed.

The other model that has been evolving recently is the Hidden Markov Model (HMM). This method was originally applied in areas like speech recognition, image processing, etc. HMMs are useful for predicting time-series data, and since stock data qualifies as time-series data, the HMM has been attempted by a few studies in the past. There are alternatives to the HMM such as Artificial Neural Networks (ANNs) [16], Fuzzy Logic (FL) [17] and Support Vector Machines (SVM) [18]. The results show that the SVM is the most suitable method for the stock market forecasting problem. The high accuracy observed is attributed to the small number of features and data points.

This paper has chosen the Regression Model for Gold Price Estimation. The Gold Price has been taken as the dependent variable, and a few other variables have been considered as influencing the Gold Price in India. These variables are: NIFTY stock index, Oil Price, Consumer Price Index (CPI), USD to INR exchange rate and international Gold Price in USD. Additional data such as GDP growth was considered initially but was excluded because of the poor correlation between Gold price and GDP.

IV. THE DATA SET (GOLD PRICE & NIFTY)

TABLE-1 (GOLD PRICE, CPI INDEX AND NIFTY)

DATE        Gold Price (INR)   CPI Index   Nifty
31-Jan-12   2,769.25           109.42      5199.25
29-Feb-12   2,835.76           109.98      5385.20
31-Mar-12   2,796.10           111.09      5295.55
30-Apr-12   2,862.23           113.30      5248.15
31-May-12   2,882.34           113.85      4924.25
30-Jun-12   2,991.21           114.96      5278.90
31-Jul-12   2,955.50           117.17      5229.00
31-Aug-12   3,036.67           118.27      5258.50
30-Sep-12   3,170.63           118.82      5703.30
31-Oct-12   3,120.88           119.93      5619.70
30-Nov-12   3,159.29           120.48      5879.85
31-Dec-12   3,113.05           121.03      5905.10
31-Jan-13   3,075.48           122.14      6034.75
28-Feb-13   3,020.58           123.24      5693.05
31-Mar-13   2,959.98           123.79      5682.55
30-Apr-13   2,744.92           124.89      5930.20
31-May-13   2,655.15           125.99      5985.95
30-Jun-13   2,706.67           127.65      5842.20
31-Jul-13   2,689.39           129.86      5742.00
31-Aug-13   3,038.63           130.97      5471.80
30-Sep-13   3,079.85           131.52      5735.30
31-Oct-13   2,959.01           133.17      6299.15
TABLE-1 (CONTINUED)

DATE        Gold Price (INR)   CPI Index   Nifty
30-Nov-13   3,000.05           134.28      6176.10
31-Dec-13   2,889.15           132.06      6304.00
31-Jan-14   2,909.57           130.95      6089.50
28-Feb-14   2,948.29           131.50      6276.95
31-Mar-14   2,967.04           132.06      6704.20
30-Apr-14   2,851.46           133.72      6696.40
31-May-14   2,781.28           134.83      7229.95
30-Jun-14   2,681.32           135.93      7611.35
31-Jul-14   2,786.71           139.25      7721.30
31-Aug-14   2,827.46           139.81      7954.35
30-Sep-14   2,701.94           139.81      7964.80

31-Aug-16   3,133.01           153.66      8786.20
30-Sep-16   3,108.29           153.11      8611.15
31-Oct-16   2,990.57           153.66      8638.00
30-Nov-16   2,966.07           153.11      8224.50
31-Dec-16   2,747.97           152.01      8185.80

A. THE DATA SET (OTHER DATA)

TABLE-2 (BANK RATE, USD TO INR, OIL PRICE & GOLD PRICE IN US)

DATE        Bank Rate   USD to INR   Oil Price in USD   Gold Price in USD
31-Jan-12   6.00        49.5150      110.69             1,652.21
29-Feb-12   6.00        49.1100      119.33             1,742.14
31-Mar-12   9.50        50.8750      125.45             1,673.77
30-Apr-12   9.50        52.6650      119.42             1,649.69
31-May-12   9.00        56.0400      110.34             1,591.19
30-Jun-12   9.00        55.5100      95.16              1,598.76
31-Jul-12   9.00        55.4400      102.62             1,589.90
31-Aug-12   9.00        55.5250      113.36             1,630.31
30-Sep-12   9.00        52.8550      112.86             1,744.81
31-Oct-12   9.00        53.8050      111.71             1,746.58
30-Nov-12   9.00        54.2650      109.06             1,721.64
31-Dec-12   9.00        54.9950      109.49             1,684.76
31-Jan-13   8.75        53.2750      112.96             1,671.85
28-Feb-13   8.75        54.3700      116.05             1,627.57
31-Mar-13   8.50        54.2850      108.47             1,593.09
30-Apr-13   8.50        53.6850      102.25             1,487.86
31-May-13   8.25        56.5800      102.56             1,414.03
30-Jun-13   8.25        59.5330      102.92             1,343.35
31-Jul-13   10.25       60.8550      107.93             1,285.52
31-Aug-13   10.25       65.7050      111.28             1,351.74
30-Sep-13   9.50        62.5900      111.60             1,348.60
31-Oct-13   9.00        61.6240      109.08             1,316.58
30-Nov-13   8.75        62.3990      107.79             1,275.86
31-Dec-13   8.75        61.8100      110.76             1,221.51
31-Jan-14   9.00        62.6850      108.12             1,244.27
28-Feb-14   9.00        61.7950      108.90             1,299.58
31-Mar-14   9.00        60.0150      107.48             1,336.08
30-Apr-14   9.00        60.3450      107.76             1,298.45
31-May-14   9.00        59.1950      109.54             1,288.74
30-Jun-14   9.00        60.0600      111.80             1,279.10
31-Jul-14   9.00        60.5550      106.77             1,310.59
31-Aug-14   9.00        60.5200      101.61             1,295.13
30-Sep-14   9.00        61.9400      97.09              1,236.55

31-Aug-16   7.00        66.9730      45.84              1,340.17
30-Sep-16   7.00        66.5560      46.57              1,326.61
31-Oct-16   6.75        66.6860      49.52              1,266.28
30-Nov-16   6.75        68.5980      44.73              1,238.35
31-Dec-16   6.75        67.9550      53.29              1,157.36

The dataset consists of gold rates (per gm) over the past five years on a monthly basis from Jan 2012 to Dec 2016. The dataset also includes other data such as Nifty prices, Oil prices (Brent Crude), CPI, GDP, Interest Rates and Dollar to Rupee rates.

The data was collected from various sources including World Gold Council, IndexMundi, MCXIndia and RBI.

V. APPLIED METHODOLOGY

The Multi Variable Linear Regression Method has been built using R Programming. The actual data for the different variables covered a period of 60 months. The data for the first forty seven (47) months have been used as the training data set and the data for the remaining thirteen (13) months as the testing data set. The model involved the following stages: establishing a high level of correlation between the variables and building a model with the variables chosen.
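The chronological 47/13 split and the least-squares fit described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the paper, which is not shown in the source. The data are synthetic, and the names `fit_ols`, `predict` and `true_coef` are hypothetical helpers introduced only for this sketch; the fit solves the normal equations directly and is not a substitute for a statistics package.

```python
import random

def fit_ols(X, y):
    """Return [intercept, b1, ..., bp] minimizing the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in X]            # prepend the intercept column
    k = len(rows[0])
    # Augmented normal-equation matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coef, x):
    """Evaluate intercept + sum(b_j * x_j) for one observation."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))

# Synthetic stand-in for 60 monthly observations (NOT the paper's data).
random.seed(0)
true_coef = [100.0, 2.0, -3.0]                     # hypothetical intercept and slopes
X = [[random.uniform(0, 10), random.uniform(0, 10)] for _ in range(60)]
y = [predict(true_coef, x) + random.gauss(0, 0.1) for x in X]

# Chronological split: first 47 months for training, last 13 for testing.
X_train, y_train = X[:47], y[:47]
X_test, y_test = X[47:], y[47:]

coef = fit_ols(X_train, y_train)
preds = [predict(coef, x) for x in X_test]
rmse = (sum((a - p) ** 2 for a, p in zip(y_test, preds)) / len(y_test)) ** 0.5
print(coef, rmse)
```

The split is positional rather than random, so the test months always come after the training months, which matches the forecasting setting described in this section.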
A. HYPOTHESIS

The following hypotheses have been formulated to build the model:

1. H0: Gold prices have no relation with Nifty prices.
   H1: Gold prices have a relation with Nifty prices.

2. I0: Gold prices have no relation with International Gold prices.
   I1: Gold prices have a relation with International Gold prices.

3. M0: Gold prices have no relation with Oil prices.
   M1: Gold prices have a relation with Oil prices.

4. N0: Gold prices have no relation with Inflation (CPI).
   N1: Gold prices have a relation with Inflation (CPI).

5. S0: Gold prices have no relation with Interest Rates.
   S1: Gold prices have a relation with Interest Rates.

6. U0: Gold prices have no relation with the USD to INR rate.
   U1: Gold prices have a relation with the USD to INR rate.

In total, six variables have been considered for the calculation, apart from the historical Gold price data. A high level of correlation is observed between Gold Prices and the other variables.

B. CORRELATION BETWEEN THE VARIABLES

TABLE-3: CORRELATION BETWEEN GOLD PRICES AND OTHER VARIABLES

         INT RATE   CPI    OIL    US2INR   NIFTY   INT PRICE
VALUES   0.325      -0.6   0.68   -0.495   -0.69   0.714

The above table gives the variables having the highest correlation with the rate of gold, with their p-values (probability values) less than 0.05. Hence the null hypotheses (H0, I0, M0, N0, S0, U0) stated in the previous section are rejected, as the p-values are less than 0.05.

C. MODEL FINALISATION

Using the most correlated variables, the multi-variable regression model is built. This model revealed a Kurtosis value of -1.13 and a Skewness value of 0.27, implying that gold is a reliable commodity to invest in. The resultant intercept and coefficients are shown below:

TABLE-4: COEFFICIENTS AND INTERCEPT

Intercept     Oil Price    NIFTY       Interest     USD to INR   CPI         Gold Price in US
-3.501e+03    3.829e+00    1.049e-02   -7.611e+00   5.301e+01    4.103e+00   1.655e+00

Note: e+0x denotes multiplication by 10^x (for example, 3.829e+00 = 3.829 and -3.501e+03 = -3501).

D. SUMMARY

The summary of the model is shown below:

TABLE-5: SUMMARY

Multiple R-Squared Value   Adjusted R-Squared Value   Residual Standard Error            p-Value
0.8487                     0.826                      72.84 on 40 degrees of freedom     6.661e-15

Note: e-x denotes multiplication by 10^-x.

This model shows a Multiple R-Squared value of 0.8487, implying that approximately 85% of the variation in the gold price is explained by changes in the variables (Oil Price, Nifty, Interest Rate, USD to INR Rate, CPI and Gold Price in US).

Thus the regression equation for the derived model is:

$$y = -3.501\times10^{3} + 3.829\,(\text{Oil Price}) + 1.049\times10^{-2}\,(\text{Nifty}) - 7.611\,(\text{Interest}) + 53.01\,(\text{USD to INR}) + 4.103\,(\text{CPI}) + 1.655\,(\text{Gold Price in US})$$

E. RESULTS

The model was then tested on the remaining 13 observations of the dataset. The predicted and observed values are presented below:

TABLE-6: PREDICTION AND OBSERVATION

Obs   Date        Actual     Fit        Lower      Upper
48    31-Dec-15   2,528.41   2,556.57   2,391.37   2,721.77
49    31-Jan-16   2,604.21   2,662.18   2,489.74   2,834.61
50    29-Feb-16   2,874.36   2,842.92   2,657.71   3,028.12
51    31-Mar-16   2,927.31   2,848.15   2,674.15   3,022.15
52    30-Apr-16   2,916.78   2,873.28   2,697.61   3,048.94
53    31-May-16   2,972.89   2,983.59   2,790.35   3,176.83
54    30-Jun-16   3,044.37   3,036.47   2,835.69   3,237.25
55    31-Jul-16   3,129.77   3,089.02   2,877.67   3,300.36
56    31-Aug-16   3,133.01   3,112.15   2,898.14   3,326.16
57    30-Sep-16   3,108.29   3,066.29   2,860.68   3,271.91
58    31-Oct-16   2,990.57   2,989.07   2,789.67   3,188.47
59    30-Nov-16   2,966.07   3,019.24   2,814.86   3,223.62
60    31-Dec-16   2,747.97   2,878.96   2,692.98   3,064.93
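As a cross-check, the Fit column can be recomputed from the rounded coefficients of Table 4, and the error measures discussed in the results can be recomputed from the Actual and Fit columns of Table 6. The sketch below is in Python (the paper's own R code is not shown); the function and variable names are hypothetical, and the variation formula |Actual - Fit| / Fit is an assumption chosen because it reproduces the magnitudes in Table 7.

```python
import math

# Coefficients as printed in Table 4 (rounded to four significant figures).
COEF = {
    "intercept": -3.501e+03,
    "oil": 3.829,
    "nifty": 1.049e-02,
    "interest": -7.611,
    "usd_inr": 53.01,
    "cpi": 4.103,
    "gold_usd": 1.655,
}

def predict_gold_inr(oil, nifty, interest, usd_inr, cpi, gold_usd):
    """Evaluate the derived regression equation for one month."""
    return (COEF["intercept"]
            + COEF["oil"] * oil
            + COEF["nifty"] * nifty
            + COEF["interest"] * interest
            + COEF["usd_inr"] * usd_inr
            + COEF["cpi"] * cpi
            + COEF["gold_usd"] * gold_usd)

# Inputs for 31-Aug-16 taken from Tables 1 and 2.
fit_aug16 = predict_gold_inr(oil=45.84, nifty=8786.20, interest=7.00,
                             usd_inr=66.9730, cpi=153.66, gold_usd=1340.17)

# (Actual, Fit) pairs for observations 48..60, copied from Table 6.
TABLE6 = [
    (2528.41, 2556.57), (2604.21, 2662.18), (2874.36, 2842.92),
    (2927.31, 2848.15), (2916.78, 2873.28), (2972.89, 2983.59),
    (3044.37, 3036.47), (3129.77, 3089.02), (3133.01, 3112.15),
    (3108.29, 3066.29), (2990.57, 2989.07), (2966.07, 3019.24),
    (2747.97, 2878.96),
]

def rmse(pairs):
    """Root mean squared error over (actual, predicted) pairs."""
    return math.sqrt(sum((a - f) ** 2 for a, f in pairs) / len(pairs))

rmse_all = rmse(TABLE6)                  # all 13 test observations
rmse_without_last = rmse(TABLE6[:-1])    # observation 60 (31-Dec-16) dropped
variation_pct = [abs(a - f) / f * 100 for a, f in TABLE6]

print(round(fit_aug16, 2), round(rmse_all, 3), round(rmse_without_last, 2))
```

With these rounded inputs the recomputed fit for 31-Aug-16 agrees with the Table 6 value of 3,112.15 to within rounding, and the RMSE over all 13 rows comes out at about 53.58, matching the reported 53.583. Dropping the last row gives roughly 41 from the tabulated values, which is not the same as the "around 11" figure quoted in the text; the source does not say how that figure was derived.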

The RMSE value was computed for the test set (13 observations) using the predicted values (Fit) and the actual values (Rate).

The prediction on the test set produced an RMSE (Root Mean Squared Error) value of 53.583. The observed value (Actual) of the Gold Price lies between the Lower and Upper values of the prediction interval at the 95% confidence level. Also, the following table shows the deviation between the predicted values and the actuals for the test data.

TABLE-7: VARIATION % FROM FIT (PREDICTED VALUES)

Obs           48     49    50    51    52    53    54    55    56    57    58    59    60
Variation %   1.10   2.2   1.1   2.8   1.5   0.4   0.3   1.3   0.7   1.4   0.1   1.8   4.5

Note: In the original, negative values are shown in red.

Barring observation 60, all other values are within a narrow range (at most 2.8% in magnitude, per the table above). The last observation showed a variation of 4.5%.

VI. CONCLUSION

The above multi-variable linear regression model facilitated a meaningful way to look at the different variables that influence the price of gold. The model reveals that factors like Oil, Nifty Index, CPI (Consumer Price Index), USD to INR Rate, International Gold price and interest rate (Bank Rate) have a high level of influence over the price of gold. As mentioned earlier, the Kurtosis and Skewness values indicate that gold is a reliable commodity to invest in. The multivariable linear regression model built using the above mentioned predictors produces an RMSE of 53.583. This is because the largest deviation from the actual value is seen in the month of December, causing a heavy increase in the RMSE value. This is mainly attributed to demonetization, which took effect from the second week of November, causing a slump in the gold price from November to December. Without demonetization (i.e. without the last observation), the RMSE value is around 11.

REFERENCES

1. Ingle, V., & Deshmukh, S. (2016, August). Hidden Markov Model Implementation for Prediction of Stock Prices with TF-IDF features. In Proceedings of the International Conference on Advances in Information Communication Technology & Computing (p. 9). ACM.
2. Wolff, R. C., Robertson, C. S., & Geva, S. (2006). What Types of Events Provide the Strongest Evidence that the Stock Market is Affected by Company Specific News? (No. 208b). School of Economics and Finance, Queensland University of Technology.
3. Gavrilov, M., Anguelov, D., Indyk, P., & Motwani, R. (2000, August). Mining the stock market (extended abstract): which measure is best?. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 487-496). ACM.
4. Gul, N., Barki, I., & Akhtar, N. (2009, December). MFP: a mechanism for determining associated patterns of stock. In Proceedings of the 7th International Conference on Frontiers of Information Technology (p. 30). ACM.
5. Ta, V. D., & Liu, C. M. (2016, December). Stock market analysis using clustering techniques: the impact of foreign ownership on stock volatility in Vietnam. In Proceedings of the Seventh Symposium on Information and Communication Technology (pp. 99-106). ACM.
6. Rao, T., & Srivastava, S. (2013, May). Modeling movements in oil, gold, forex and market indices using search volume index and Twitter sentiments. In Proceedings of the 5th Annual ACM Web Science Conference (pp. 336-345). ACM.
7. Louwerse, V., & Rothkrantz, L. (2014, June). Intraday stock forecasting. In Proceedings of the 15th International Conference on Computer Systems and Technologies (pp. 202-209). ACM.
8. Upadhyay, V. P., Panwar, S., Merugu, R., & Panchariya, R. (2016, August). Forecasting Stock Market Movements Using Various Kernel Functions in Support Vector Machine. In Proceedings of the International Conference on Advances in Information Communication Technology & Computing (p. 107). ACM.
9. Ismail, Z., Yahaya, A., & Shabri, A. (2009). Forecasting gold prices using multiple linear regression method.
10. Mao, Y., Wei, W., & Wang, B. (2013, August). Twitter volume spikes: analysis and application in stock trading. In Proceedings of the 7th Workshop on Social Network Mining and Analysis (p. 4). ACM.
11. Schumaker, R. P., & Chen, H. (2009). Textual analysis of stock market prediction using breaking financial news: The AZFin text system. ACM Transactions on Information Systems (TOIS), 27(2), 12.
12. Tilakaratne, C. D., Mammadov, M. A., & Morris, S. A. (2007, December). Effectiveness of using quantified intermarket influence for predicting trading signals of stock markets. In Proceedings of the sixth Australasian conference on Data mining and analytics-Volume 70 (pp. 171-179). Australian Computer Society, Inc.
13. Mao, Y., Wei, W., Wang, B., & Liu, B. (2012, August). Correlating S&P 500 stocks with Twitter data. In Proceedings of the first ACM international workshop on hot topics on interdisciplinary social networks research (pp. 69-72). ACM.
14. Tang, X., Yang, C., & Zhou, J. (2009, September). Stock price forecasting by combining news mining and time series analysis. In Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT'09. IEEE/WIC/ACM International Joint Conferences on (Vol. 1, pp. 279-282). IET.
15. Wei, P., & Wang, N. (2016, April). Wikipedia and Stock Return: Wikipedia Usage Pattern Helps to Predict the Individual Stock Movement. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 591-594). International World Wide Web Conferences Steering Committee.
16. J.H. Choi, M.K. Lee, and M.W. Rhee. “Trading s&p500 stock index futures using a neural network”. Proc of the Third Annual International Conference on Artificial Intelligence Applications on Wall Street, pages 63–72, 1995.
17. E. Vercher, J.D. Bermudez, and J.V. Segura. “Fuzzy portfolio optimization under downside risk measures”. Fuzzy Sets and Systems, pages 769–782, 2007.
18. L. Cao and F.E.H. Tay. “Financial forecasting using support vector machines”. Neural Computation and Application, pages 184–192, 2007.
