Stock Price Prediction Using Data Analytics: 978-1-5386-3852-1/17/$31.00 ©2017 IEEE
Abstract: Accurate financial prediction is of great interest to investors. This paper proposes the use of data analytics to help investors make sound financial predictions, so that the right investment decisions can be taken. Two platforms are used: Python and R. In R, various techniques such as ARIMA, Holt-Winters, neural networks (feed forward and multi-layer perceptron), linear regression and time series models are implemented to forecast the opening index price. In Python, a multi-layer perceptron and support vector regression are implemented to forecast the Nifty 50 stock price, and sentiment analysis of the stock is performed using recent tweets on Twitter. The Nifty 50 (^NSEI) stock index is used as the input data for all implemented methods, covering 9 years of data. Accuracy was calculated by comparing 2-3 years of forecast results from R and 2 months of forecast results from Python with the actual prices of the stock. Mean squared error and other error measures were calculated for every prediction system, and the feed forward network was found to produce an error of only 1.81598342% when forecasting the opening price of the stock.
Keywords: Big Data Analytics, Data Analytics, Predictive Analytics, Stock Index Prediction, Time Series Model, Moving Average, Auto Regression, Linear Regression, Artificial Neural Networks, ARIMA, Holt-Winters, Multi-Layer Perceptron, Radial Basis Function (RBF), Twitter Sentiment Analysis, Web Scraping, Support Vector Regression
I. INTRODUCTION
Forecasting stock market indices has recently been gaining attention, because investors are better guided when the direction of the market is predicted successfully. Here, an exhaustive study of various mathematical models and algorithms that can be used for stock index prediction is presented.
The author of Stock Price Prediction and Trend Prediction Using Neural Networks [17] performed stock price prediction using neural networks with 500 nodes and stated that the accuracy of the forecast can be increased by increasing the number of nodes. We therefore increased the number of nodes for more accuracy. The authors of Stock Price Prediction Using the ARIMA Model [18] implemented ARIMA with different (p, d, q) values and calculated the R squared error for each set of values. To make this process less time consuming, we used Auto ARIMA, which is available in R. Auto ARIMA automatically tests the data with different p, d and q options and chooses the values that provide the best forecast.
We have used various models, including time series models, artificial neural networks (ANNs) and regression. To model the system we used the stock data of “Nifty 50” for the last 9 years. This data is used as input to several mathematical models and algorithms: the autoregressive integrated moving average (ARIMA) model, the polynomial model, linear regression, the time series model and, among the artificial neural network (ANN) models, the radial basis function neural network and the multi-layer perceptron neural network. Their results are then compared to find the most accurate and efficient model. Another motivation for research in this field is that it poses many theoretical and experimental challenges.
In the rest of the paper, Section II describes the proposed system and briefly states what we want to do, Section III covers the implementation of the proposed system, Section IV presents the results we obtained, and Section V draws conclusions from those results.
II. PROPOSED SYSTEM
The purpose of this project is to assist stock market traders in making a profit by supporting the “buy low, sell high” [1] strategy, that is, the practice of buying a security when its price is (or is perceived to be) low and selling it when its price is high. The ability to buy low and sell high requires one to determine roughly when the low and high prices for a security occur. Analysts use a number of technical indicators to find these, but critics of the practice contend that it is impossible or at least excessively risky [2]. This project helps avoid that risk by forecasting future prices.
Architecture of artificial neural networks: We used a multi-layer perceptron and a feed forward network. Both use a sequential model (a linear stack of layers). The multi-layer perceptron was implemented using the Keras library in Python [3]. We used 8 hidden layers. The activation function is ReLU (rectified linear unit) [4]; one of its advantages is that it can increase the training speed of a neural network. “mean_squared_error” is used as the model loss function. The loss function, also called the objective function, is the evaluation of the model that the optimizer uses to navigate the weight space. The Adam (Adaptive Moment Estimation) optimizer [5] is used. Adam's biggest advantage is that it uses adaptive learning rates and is well suited to large amounts of data.
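The roles of the loss function and the Adam optimizer described above can be illustrated with a minimal sketch. It assumes a one-feature linear model in place of the 8-hidden-layer network, and the function names and hyperparameters (adam_fit_line, lr, beta1, beta2, eps, epochs) are hypothetical choices made for illustration. It is not the Keras implementation used in this work; it only shows how Adam scales each parameter's step by running estimates of the gradient's first and second moments while minimizing mean squared error.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def adam_fit_line(xs, ys, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, epochs=2000):
    """Fit y = w*x + b by minimizing MSE with Adam updates.

    Illustrative stand-in for a neural network: the "weight space" here
    has only two dimensions, w and b. Hyperparameter values are assumptions.
    """
    params = [0.0, 0.0]  # [w, b]
    m = [0.0, 0.0]       # running mean of gradients (first moment)
    v = [0.0, 0.0]       # running mean of squared gradients (second moment)
    n = len(xs)
    for t in range(1, epochs + 1):
        w, b = params
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of the MSE loss with respect to w and b.
        grads = [
            2.0 / n * sum(r * x for r, x in zip(residuals, xs)),
            2.0 / n * sum(residuals),
        ]
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            # Bias correction compensates for the zero initialization.
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            # Adaptive step: each parameter gets its own effective rate.
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params[0], params[1]


if __name__ == "__main__":
    # Toy, made-up data on a scaled time axis; not taken from Nifty 50.
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    ys = [2.0 * x + 1.0 for x in xs]
    w, b = adam_fit_line(xs, ys)
    print(w, b, mean_squared_error(ys, [w * x + b for x in xs]))
```

In a Keras model the same two choices are passed when the model is compiled, and the optimizer then applies this kind of update to every weight in every layer.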
Number of epochs chosen after taking execution time and
Fig 4: Polynomial Trend
Now that we have one generalized trend, we create another function based on the Loess window for seasonal extraction, called STL (Seasonal and Trend decomposition using Loess), as shown in Fig 5. The overview of both functions is shown in Fig 6.
IV. RESULTS
To measure accuracy, we took the data up to 2014 and forecast the remaining period for which we had data. We then compared the result of each method (the forecast gives an average value for each month) with the actual value of the stock. We predicted the opening price of the stock (Nifty 50), and the best method for forecasting the opening price was the feed forward neural network. We also observed that different methods can give better results for different stocks, and that the same holds for different types of prices (open, high and low). All 21 methods are implemented on the R platform and the results are given below in Tables 1, 2 and 3.
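The evaluation procedure described above (hold out the data after 2014, average by month, then compute the error measures reported in the tables) can be sketched as follows. This is a minimal illustration and not the R code used in this work. The function names are hypothetical, errors are assumed to be defined as actual minus forecast, ACF1 is assumed to be the usual sample autocorrelation of the errors at lag 1, and the numbers in the usage example are made-up values rather than results from this paper.

```python
import math
from collections import defaultdict
from datetime import date


def split_by_year(dated_prices, last_train_year=2014):
    """Split (date, price) pairs into a training part and a hold-out part."""
    train = [(d, p) for d, p in dated_prices if d.year <= last_train_year]
    test = [(d, p) for d, p in dated_prices if d.year > last_train_year]
    return train, test


def monthly_means(dated_prices):
    """Average the prices within each (year, month), in chronological order."""
    buckets = defaultdict(list)
    for day, price in dated_prices:
        buckets[(day.year, day.month)].append(price)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


def accuracy(actual, forecast):
    """Compute ME, RMSE, MAE, MPE, MAPE and ACF1 for paired series."""
    errors = [a - f for a, f in zip(actual, forecast)]  # assumed sign convention
    n = len(errors)
    pct = [100.0 * e / a for e, a in zip(errors, actual)]
    mean_e = sum(errors) / n
    # Lag-1 autocorrelation of the errors around their mean.
    denom = sum((e - mean_e) ** 2 for e in errors)
    num = sum((errors[t] - mean_e) * (errors[t - 1] - mean_e) for t in range(1, n))
    return {
        "ME": mean_e,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MPE": sum(pct) / n,
        "MAPE": sum(abs(p) for p in pct) / n,
        "ACF1": num / denom if denom else float("nan"),
    }


if __name__ == "__main__":
    # Made-up daily opening prices, for illustration only.
    prices = [
        (date(2014, 12, 30), 100.0),
        (date(2015, 1, 5), 110.0),
        (date(2015, 1, 6), 90.0),
        (date(2015, 2, 2), 200.0),
    ]
    train, test = split_by_year(prices)
    actual = list(monthly_means(test).values())
    forecast = [105.0, 190.0]  # hypothetical monthly forecasts
    print(accuracy(actual, forecast))
```

A positive ME under this sign convention means the method forecasts too low on average, while MAPE gives a scale-free percentage that can be compared across methods.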
Table 1: Error Calculation of Results Obtained by Every Method Implemented in R Using polynomial trend
(ME: Mean Error, RMSE: root-mean-square error, MAE: Mean absolute error, MPE: Mean Percent Error,
MAPE: Mean Absolute Percent Error, ACF1: First order Auto-correlation coefficient)
Methods ME RMSE MAE MPE MAPE ACF1
Table 2: Error calculation of Results obtained by every method implemented in R using Time series trend
Methods ME RMSE MAE MPE MAPE ACF1
Table 3: Error calculation of Results obtained by every method implemented in R using Time series trend using actual data
Methods ME RMSE MAE MPE MAPE ACF1