Stock Price Trend Forecasting Using Supervised Learning Methods
Abstract— The aim of this project is to examine a number of forecasting techniques for predicting future stock returns from past returns and numerical news indicators, and to use those predictions to construct a portfolio of multiple stocks that diversifies risk. We do this by applying supervised learning methods to stock price forecasting, interpreting the seemingly chaotic market data.

I. INTRODUCTION

Stock market fluctuations are violent, and there are many complicated financial indicators. However, advances in technology provide an opportunity to gain a steady fortune from the stock market and can also help experts identify the most informative indicators for better prediction. Predicting market value is of paramount importance for maximizing the profit of stock option purchases while keeping risk low.

The next section describes the methodology and explains each process in detail. After that, we present pictorial representations of our analysis and discuss the results achieved. Finally, we define the scope of the project and discuss how to extend this work to achieve better results.

II. METHODOLOGY

This section gives a detailed account of each process involved in the project. Each subsection corresponds to one stage of the project.

A. Data Pre-Processing

The pre-processing stage involves:
• Data discretization: part of data reduction, but of particular importance, especially for numerical data.
• Data transformation: normalization.
• Data cleaning: filling in missing values.
• Data integration: integration of data files.

B. Feature Selection and Feature Generation

We created new features from the base features that provide better insight into the data, such as the 50-day moving average and the previous-day difference.

To prune less useful features, feature selection keeps the features with the k highest scores. Each score comes from a linear model that tests the effect of a single regressor, applied sequentially to many regressors. We used the SelectKBest algorithm with f_regression as the scoring function.

Furthermore, we added Twitter's Daily Sentiment Score as a feature for each company, based on users' tweets about that company and the tweets on that company's page.

III. ANALYSIS

To analyze the efficiency of the system, we used the Root Mean Squared Error (RMSE) and the R² score.

A. Root Mean Squared Error (RMSE)

RMSE is the square root of the mean of the squared errors.

RMSE is very commonly used and makes an excellent general-purpose error metric for numerical predictions. Compared to the similar Mean Absolute Error, RMSE amplifies and severely punishes large errors.

Fig. 1. RMSE Value calculation
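
The evaluation metrics named in the analysis (RMSE and the R² score, with Mean Absolute Error for comparison) can be sketched in plain Python as follows. This is a minimal illustration, not the project's code: the function names and the sample values are assumptions made up for the example.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean of the squared errors."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares).

    Assumes `actual` is not constant; otherwise the total sum of squares is zero.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, chosen only to contrast the two error patterns.
actual = [10.0, 12.0, 11.0, 13.0]
small_errors = [11.0, 11.0, 12.0, 12.0]     # every prediction is off by 1
one_large_error = [10.0, 12.0, 11.0, 9.0]   # a single prediction is off by 4

for name, predicted in [("small errors", small_errors),
                        ("one large error", one_large_error)]:
    print(name,
          "MAE:", mae(actual, predicted),
          "RMSE:", rmse(actual, predicted),
          "R2:", r2_score(actual, predicted))
```

Both prediction sets have the same Mean Absolute Error, but the set with one large miss has twice the RMSE. This is the sense in which RMSE punishes large errors more severely.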