A Comparative Study of Supervised Machine Learning Algorithms For Stock Market Trend Prediction
Abstract- The influence of many factors on stock prices makes stock prediction a difficult and highly complicated task. In this paper, machine learning techniques have been applied to stock price prediction in order to overcome such difficulties. In the implemented work, five models have been developed and their performance in predicting stock market trends is compared. These models are based on five supervised learning techniques, i.e., Support Vector Machine (SVM), Random Forest, K-Nearest Neighbor (KNN), Naive Bayes, and Softmax. The experimental results show that the Random Forest algorithm performs best for large datasets and the Naive Bayes classifier performs best for small datasets. The results also reveal that reducing the number of technical indicators lowers the accuracy of every algorithm.

Keywords- machine learning, classifier, Random Forest, SVM, KNN, Naive Bayes, Softmax

I. INTRODUCTION

The stock market plays a very important role in the fast economic growth of a developing country like India. The growth of our country and of other developing nations may therefore depend on the performance of the stock market. If the stock market rises, the country's economic growth would be high. If the stock market falls, the country's economic growth would be down. In other words, a country's growth is tightly bound to the performance of its stock market. In any country, only 10% of the people engage in stock market investment because of the dynamic nature of the stock market. There is a misconception about the stock market, i.e., that buying or selling shares is an act of gambling.

This misconception can be changed by raising awareness among people. Prediction techniques for the stock market can play a crucial role in bringing new and existing investors together. Among the methods that have been employed, Machine Learning techniques are very popular because of their capacity to identify stock trends from massive amounts of data that capture the underlying stock price dynamics. In this paper, we applied supervised learning methods to stock price trend forecasting.

The paper is structured as follows. Section II reviews related work in this field. Section III describes the research data. Section IV presents the proposed work. Finally, Section V discusses the obtained results and Section VI concludes the paper.

II. RELATED WORK

Correct prediction of stock market trends is of great importance to investors, as it helps in determining whether an investment would pay off or not. Many methods have been deployed for this purpose. An Artificial Neural Network based method was the first technique to be used for stock market trend prediction [1]. Machine learning has been used to predict the sign of the movement of a stock market index. Within the last two decades, Kim [2] applied SVM for the first time to predict stock market prices. Random Forest is another machine learning model used for predicting the trend direction of stocks [3]. Five-day-ahead and ten-day-ahead models have been used. In this research paper, a comparative study of supervised machine learning algorithms using time windows of size 1 to 90 has been proposed. The algorithms have been compared on two parameters: the size of the dataset and the number of technical indicators used. Accuracy and F-measure values have been computed for each algorithm. A long-term model has been used to compute the accuracy and F-measure.

III. RESEARCH DATA

The data used in this research paper has been collected from data sources such as Yahoo Finance, Quandl, NSE-India, and YCharts. The available data has the following attributes: Date, Open, High, Close, and Volume. Twelve technical indicators have been used for the model prediction. The first technical indicator used is the Moving Average (MA10 and MA50). It smooths the stock price signal and makes the identification of trends easier. Moving averages for 10 and 50 days have been used in this paper. The next technical indicator is the Relative Strength Index (RSI), which detects whether the stock is overbought or oversold.

The next indicator is the Rate of Change (RoC), which simply measures the rate of change of the price from one period to another. RoC1 and RoC2 have been used in the paper. The next indicator is Volatility, which measures the dispersion of returns for a given security. Volatility for a period of 10 days has been calculated. The Disparity Index (DI) measures the position of the most recent closing price relative to a selected moving average. The Disparity Index for 10 days has been calculated. The next indicator is the Stochastic Oscillator, which depicts the location of the closing price relative to the high-low range. Williams %R is a momentum indicator which shows the level of the closing price relative to the highest high. The next indicator is the Volume Price Trend, which relates the volume and the price. The Commodity Channel Index (CCI) measures the current price level relative to an average price level over a given period of time.

IV. PROPOSED METHODOLOGY

The proposed architecture for the implemented work mainly consists of four steps: feature extraction from the given dataset, supervised classification of the training dataset, supervised classification of the test dataset, and result evaluation. The flow chart for the proposed methodology is shown in Figure 1.

Dataset -> Feature Extraction -> Supervised Classification (Training Dataset) -> Supervised Classification (Test Dataset) -> Result Evaluation

Figure 1. Flow chart of proposed methodology
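
As a minimal sketch of the feature extraction step, a few of the indicators from Section III could be computed as follows. The function names, the window parameter n, the sample prices, and the use of a daily Low series (which is not in the attribute list given in Section III) are assumptions made for illustration. The formulas are the common textbook definitions and are not necessarily the exact variants used in this work.

```python
from statistics import mean, pstdev


def moving_average(close, n):
    """Simple n-day moving average; None until n closes are available."""
    return [None if i + 1 < n else mean(close[i + 1 - n:i + 1])
            for i in range(len(close))]


def rate_of_change(close, n):
    """Percent change of the close versus the close n days earlier."""
    return [None if i < n else 100.0 * (close[i] - close[i - n]) / close[i - n]
            for i in range(len(close))]


def volatility(close, n):
    """Population standard deviation of the last n daily returns."""
    returns = [None] + [close[i] / close[i - 1] - 1.0
                        for i in range(1, len(close))]
    return [None if i < n else pstdev(returns[i + 1 - n:i + 1])
            for i in range(len(close))]


def disparity_index(close, n):
    """Percent distance of the close from its n-day moving average."""
    return [None if ma is None else 100.0 * (c - ma) / ma
            for c, ma in zip(close, moving_average(close, n))]


def stochastic_k(high, low, close, n):
    """%K: position of the close inside the n-day high-low range (0 to 100)."""
    out = []
    for i in range(len(close)):
        if i + 1 < n:
            out.append(None)
            continue
        hh = max(high[i + 1 - n:i + 1])  # highest high in the window
        ll = min(low[i + 1 - n:i + 1])   # lowest low in the window
        out.append(None if hh == ll else 100.0 * (close[i] - ll) / (hh - ll))
    return out


def williams_r(high, low, close, n):
    """Williams %R: close relative to the n-day highest high (-100 to 0)."""
    # %R = -100 * (HH - C) / (HH - LL), which equals %K - 100.
    return [None if k is None else k - 100.0
            for k in stochastic_k(high, low, close, n)]


# Hypothetical sample prices, used only to exercise the functions.
high = [11.0, 12.0, 13.0, 12.5, 14.0]
low = [9.0, 10.0, 11.0, 11.0, 12.0]
close = [10.0, 11.0, 12.0, 12.0, 13.0]

features = {
    "MA3": moving_average(close, 3),
    "RoC1": rate_of_change(close, 1),
    "VOL2": volatility(close, 2),
    "DI3": disparity_index(close, 3),
    "STO3": stochastic_k(high, low, close, 3),
    "WR3": williams_r(high, low, close, 3),
}
for name, values in features.items():
    print(name, values)
```

Each function returns one value per trading day, with None for the initial days where the window is incomplete. In a full pipeline those rows would typically be dropped before the feature vectors are passed to the classifiers.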