
ABSTRACT

The stock market is a complex and dynamic system, and predicting stock prices is an important task for investors and traders. With the rise of deep learning, especially long short-term memory (LSTM) models, stock price prediction has become more accurate and efficient. This article presents a new method for visualizing and forecasting stock prices using LSTM models. LSTMs are well suited to modeling and forecasting sequences with long-term dependencies, a key feature of financial time series data. Our approach trains an LSTM model on historical stock market data, including open, high, low, and close prices as well as trading volume. After training, the LSTM model is used to predict future stock prices based on current market trends and patterns. To assess the effectiveness of this approach, we compared the forecasting performance of LSTM models with that of traditional time series models. Our results show that LSTM models outperform traditional time series models in predicting stock prices, a finding consistent with previous studies. In addition to predictive performance, we introduce a visualization technique that intuitively explains model predictions: a scatter plot in which the x-axis represents the predicted stock price and the y-axis represents the actual stock price. Each data point on the scatter plot corresponds to a time period, and points are color-coded by the magnitude of the difference between the predicted and actual price. This visualization allows investors and traders to gain insight into a model's predictive performance and identify areas for improvement. We apply the method to several publicly traded stocks, including Apple, Amazon, and Tesla, to demonstrate its effectiveness. Our LSTM model predicts these stocks with greater accuracy than traditional time series models, suggesting that LSTMs can capture more complex patterns in stock market data. In summary, our method provides a new way to visualize and predict stock prices using deep LSTM models. The ability of LSTM models to capture long-term dependencies in time series data makes them a valuable tool for forecasting stock prices, and our visualization techniques provide an intuitive way to interpret model predictions and identify areas for improvement. The approach has the potential to help investors and traders make informed stock market decisions.
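The color-coding step of this scatter-plot technique can be sketched in a few lines of Python. The sample prices and the two error thresholds below are illustrative assumptions, not values from the experiments.

```python
# Sketch of the prediction-vs-actual scatter data described above: each
# point is (predicted, actual, color), with the color bucketed by the
# absolute prediction error. Thresholds and prices are illustrative.

def color_code(predicted, actual, small=1.0, large=5.0):
    """Bucket each point by absolute error: 'green' for small errors,
    'orange' for moderate errors, 'red' for large errors."""
    points = []
    for p, a in zip(predicted, actual):
        err = abs(p - a)
        if err <= small:
            color = "green"
        elif err <= large:
            color = "orange"
        else:
            color = "red"
        points.append((p, a, color))  # (x, y, color) for the scatter plot
    return points

# Hypothetical predicted vs. actual closing prices for five periods.
predicted = [150.2, 151.0, 149.5, 155.0, 160.0]
actual = [150.0, 152.5, 149.4, 148.0, 159.2]
points = color_code(predicted, actual)
```

Feeding these tuples to any plotting library gives the scatter plot described above, with well-predicted periods clustering near the diagonal.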
KEYWORDS: Bitcoin, Deep Learning, GPU, Recurrent Neural Network, Long Short-Term
Memory, ARIMA.
CONTENTS

S.No. CONTENTS Page No.


1 Abstract
2 Contents
3 List of figures
4 Chapter 1 Introduction 1
1.1 General 1
1.2 Problem Formulation 3
1.3 Objectives of the thesis 3
1.4 Goals of the thesis 4
5 Chapter 2 Literature Review 5
6 Chapter 3 Proposed Method 10
3.1 Existing Method 10
3.1.1 KNN and Logistic regression 10
3.1.2 Advantages of KNN 16
3.1.3 Disadvantages of KNN 16
3.2 Proposed Method 19
3.2.1 Introduction about LSTM 20
3.2.2 Exploding and vanishing gradients 21
3.2.3 Architecture 22
3.2.4 Hidden layers of LSTM 22
3.2.5 Applications 23
3.2.6 Drawbacks 24
3.2.7 Technical Analysis 25
3.2.8 Source Code 25
3.3 System Requirements 36
3.3.1 Software Requirements 36
3.3.2 Hardware Requirements 36
3.3.3 Software Installation 37
3.3.4 Technologies Overview 45
3.3.5 System Architecture 47
3.3.6 System Testing 48
7 Chapter 4 Applications 54
4.1 ARIMA Applications 54
4.2 LSTM Applications 55
8 Chapter 5 Experimental Results 57
9 Chapter 6 Conclusions and Future Scope 62
10 References 63
LIST OF FIGURES

Fig No. Name of the Figure Page No.


1 KNN ALGORITHM 11
2 EUCLIDEAN DISTANCE 13
3 EVALUATED DISTANCE 14
4 CATEGORIZATION 15
5 SIGMOID CURVE 17
6 HIDDEN LAYERS OF LSTM 22
7 CODE TO GRAB S&P 500 TICKERS 27
8 OUTPUT OF GRABBED 500 TICKERS 27
9 CODE TO GRAB STOCK DATA 28
10 OUTPUT STOCK DATA OF COMPANIES 29
11 CODE TO COMPILE ALL CLOSE INDEX 30
12 OUTPUT CLOSE INDEX 31
13 CODE TO FIND AND VISUALIZE CORRELATIONS 32
14 OUTPUT OF CORRELATION TABLE 33
15 HEATMAP OF CORRELATIONS 33
16 CODE TO SET TRADING CONDITIONS 34
17 CODE TO EXTRACT FEATURE SETS 35
18 SYSTEM ARCHITECTURE 47
19 HOME PAGE 57
20 STOCK PRICE TEXT FIELD 58
21 OUTPUT OF PREDICTED STOCK 59
22 EXISTING MODEL OUTPUT 59
23 LSTM MODEL ACCURACY 60
24 TWEETS ABOUT STOCK ENTERED 60
25 PIE CHART 61
Visualising and forecasting of stocks using LSTM

CHAPTER-1

INTRODUCTION

1.1 GENERAL

Stock price fluctuations are uncertain, and there are many interconnected reasons behind the scenes for such behavior. Possible causes include global economic data, changes in the unemployment rate, monetary policies of influential countries, immigration policies, natural disasters, public health conditions, and several others. All stock market stakeholders aim to make higher profits and reduce risk through a thorough market evaluation. The major challenge is gathering the multifaceted information, putting it together into one basket, and constructing a reliable model for accurate predictions.

Stock price prediction is a complex and challenging task for companies, investors, and equity traders seeking to predict future returns. Stock markets are naturally
noisy, non-parametric, non-linear, and deterministic chaotic systems (Ahangar,
Yahyazadehfar, & Pournaghshband, 2010). It creates a challenge to effectively
and efficiently predict the future price. Feature selection from the financial data is
another difficult task in the stock prediction for which many approaches have
been suggested (Hoseinzade & Haratizadeh, 2019). There has been a trend in
which some researchers use only technical indicators, whereas others use
historical data (Di Persio and Honchar, 2016, Kara et al., 2011, Nelson et al.,
2017, Patel et al., 2015, Qiu and Song, 2016, Wang and Kim, 2018). The
performance of the predictive model may not be top-notch due to the use of
limited features. On the flip side, if all the available features from the financial
market are included, the model could be complex and difficult to interpret. In
addition, the model performance may be worse due to collinearity among
multiple variables.

A proper model developed with an optimal set of attributes can predict stock price reasonably well and better inform the market situation. A plethora of
research has been published to study how certain variables correlate with stock
price behavior. A varying degree of success is seen concerning the accuracy and
Department of CSE, GPCET, Kurnool Page 1
robustness of the models. One possible reason for not achieving the expected
outcome could be in the variable selection process. There is a greater chance that
the developed model performs reasonably better if a good combination of
features is considered. One of the contributions of this study is selecting the
variables by looking meticulously at multiple aspects of the economy and their
potential impact in the broader markets. Moreover, a detailed justification of why the specific explanatory variables are chosen in the present context is supplied in Section 4.

The field of quantitative analysis in finance has a long history. Several models ranging from naive to complex have been developed so far to find the
solution to financial problems. However, not all quantitative analyses or models
are fully accepted or widely used. One of the first attempts was made in the
seventies by two British statisticians, Box and Jenkins, using mainframe
computers (Hansen, McDonald, & Nelson, 1999). They developed the Auto-
Regressive Integrated Moving Average (ARIMA) model utilizing only the
historical data of price and volume. The ARIMA is used to handle only stationary
time series data by default. Performance can be abysmal if it is used for non-
stationary data. Therefore, it is essential to convert non-stationary time series data
to stationary before implementation, which may lose the original structure and
interpretability of the feature. With very few exceptions, almost all classical
models assume that data has a linear relationship. This assumption vividly raises
the questions about the robustness of the classical time series models as the real-
world time series data are often nonlinear.

Things were getting more interesting from the eighties because of the
development in data analysis tools and techniques. For instance, the spreadsheet
was invented to model financial performance, automated data collection became
a reality, and improvements in computing power helped predictive models to
analyze the data quickly and efficiently. Because of the availability of large-scale
data, advancement in technology, and inherent problem associated with the
classical time series models, researchers started to build models by unlocking the
power of artificial neural networks and deep learning techniques in the area of
sequential data modeling and forecasting. These methods are capable of learning
complex and non-linear relationships compared to traditional methods. They are
more efficient in extracting the most important information from the given input
variables.

1.2 PROBLEM FORMULATION

There are many tools available for predicting the stock market, but it is an exceedingly difficult task for humans to solve with traditional data analysis tools, and only data analytics experts know how to use them. But what about the rest of the people? How can they predict the stock market? This is where our project comes in. It is very simple to use and has a straightforward user interface, and the user does not need any prior knowledge of stock market analysis. The user just types in the stock name and the date for which they want to know the price, and the system returns the predicted price along with basic information about the company and the forecast.

This study considers the computational framework to predict the stock index price using the LSTM model, the improved version of neural networks
architecture for time series data. The bird’s-eye view of the proposed research
framework via the schematic diagram is expressed in Fig. 1. As outlined in the
diagram, the proposed study utilizes the carefully selected features from
fundamental, macroeconomic, and technical data to build the model. After that,
the collected data has been normalized using the min–max normalization
technique. Then input sequence for the LSTM model is created using a specific
time step. The hyperparameters such as number of neurons, epochs, learning rate,
batch size, and time step have been incorporated in the model. The regularization
techniques have been utilized to overcome the over-fitting problems. Once the
hyperparameters are tuned, the input data is fed into the LSTM model to predict
the closing price of the stock market index. The quality of the proposed model is
assessed through RMSE, MAPE, and R.
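As a rough illustration of the preprocessing steps described above, the following sketch applies min-max normalization and then builds input sequences with a fixed time step. The sample closing prices and the time step of 3 are illustrative assumptions, not values from the actual dataset.

```python
# Sketch of the preprocessing described above: min-max normalization
# followed by windowing the series into (input sequence, next value)
# pairs for an LSTM. Sample prices and time step are illustrative.

def min_max_scale(series):
    """Scale values to [0, 1] using min-max normalization."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

def make_windows(series, time_step):
    """Slide a window of `time_step` values over the series; each
    window is an input sequence, and the value that immediately
    follows it is the prediction target."""
    X, y = [], []
    for i in range(len(series) - time_step):
        X.append(series[i:i + time_step])
        y.append(series[i + time_step])
    return X, y

closes = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0]   # hypothetical closing prices
scaled = min_max_scale(closes)                   # all values now in [0, 1]
X, y = make_windows(scaled, time_step=3)         # 3 windows from 6 points
```

The resulting `X` and `y` arrays are what would be fed to the LSTM during training, after reshaping to the layout the chosen framework expects.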

1.3 OBJECTIVES OF THE THESIS

The main objectives of visualizing and forecasting stocks using LSTM (Long Short-Term Memory) are:

1. To gain insights and understanding: Visualizing stock data can provide insights into the patterns, trends, and relationships that exist in the data. This

can help traders and investors to better understand the behavior of the stock
market and make more informed decisions.

2. To predict future trends: LSTM models can be used to forecast future trends in stock prices. By analyzing past trends and patterns, these models can
predict what might happen in the future, which can be valuable information
for traders and investors.

3. To identify trading opportunities: By analyzing the patterns and trends in stock data, LSTM models can identify trading opportunities that may not be
immediately apparent. For example, if a stock is consistently undervalued or
overvalued, an LSTM model may be able to detect this and provide an
opportunity to buy or sell the stock.

1.4 GOALS OF THE THESIS

1. Identify patterns and trends: Visualizing and forecasting stocks using LSTM can help identify patterns and trends in historical data, allowing
investors to make more informed decisions about the market.

2. Predict future performance: LSTM models can be trained on historical data to predict future performance of stocks. This can help investors
anticipate changes in the market and make more profitable investments.

3. Manage risk: By using LSTM models to forecast stock prices, investors can
identify potential market fluctuations and manage risk accordingly. This can
help minimize losses and maximize returns.

4. Provide valuable insights: Visualizing stock data can provide valuable insights into market behavior that may not be immediately apparent. This
can help investors make more informed decisions about their investments
and improve their overall performance in the market.

5. Improve trading strategies: Visualizing and forecasting stocks using LSTM can help investors improve their trading strategies by providing insights into
market behavior and identifying patterns that may not be immediately
apparent.


CHAPTER 2
LITERATURE REVIEW
[1] P. Li, C. Jing, T. Liang, M. Liu, Z. Chen and L. Guo, "Autoregressive moving average modeling in the financial sector," 2015 2nd International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE), Semarang, Indonesia, 2015, doi: 10.1109/ICITACEE.2015.7437772.
Abstract: Time series modelling has long been used to make forecasts in different industries, with a variety of statistical models currently available. Methods for analyzing changing patterns of stock prices have always been based on fixed time series. Considering that these methods have ignored some crucial factors in stock prices, we use the ARIMA model to predict stock prices given the stock-trading volume and exchange rate as independent variables to achieve a more stable and accurate prediction process. In this paper we introduce the modeling process and give the estimated SSE (Shanghai Stock Exchange) Composite Index to evaluate the model's estimation.
keywords: Biological system modeling; Autoregressive processes; Time series analysis; Predictive models; Indexes; Computational modeling; Estimation; Time Series; Statistical Modeling; ARIMA

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7437772&isnumber=7437747

[2] M. Usmani, S. H. Adil, K. Raza and S. S. A. Ali, "Stock market prediction using machine learning techniques," 2016 3rd International Conference on Computer and Information Sciences (ICCOINS), Kuala Lumpur, Malaysia, 2016, doi: 10.1109/ICCOINS.2016.7783235.
Abstract: The main objective of this research is to predict the market
performance of Karachi Stock Exchange (KSE) on day closing using different
machine learning techniques. The prediction model uses different attributes as an
input and predicts market as Positive & Negative. The attributes used in the
model includes Oil rates, Gold & Silver rates, Interest rate, Foreign Exchange
(FEX) rate, NEWS and social media feed. The old statistical techniques including

Simple Moving Average (SMA) and Autoregressive Integrated Moving Average (ARIMA) are also used as input. The machine learning techniques including
Single Layer Perceptron (SLP), Multi-Layer Perceptron (MLP), Radial Basis
Function (RBF) and Support Vector Machine (SVM) are compared. All these
attributes are studied separately also. The algorithm MLP performed best as
compared to other techniques. The oil rate attribute was found to be most
relevant to market performance. The results suggest that performance of KSE-
100 index can be predicted with machine learning techniques.
keywords: Stock markets; Neurons; Support vector machines; Data models; Computational modeling; Computer science; Predictive models; Stock Prediction; KSE-100 Index; Neural Networks; SVM

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7783235&isnumber=7783169

[3] S. H. Park, B. Kim, C. M. Kang, C. C. Chung and J. W. Choi, "Sequence-to-Sequence Prediction of Vehicle Trajectory via LSTM Encoder-Decoder Architecture," 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, China, 2018, doi: 10.1109/IVS.2018.8500658.

Abstract: In this paper, we propose a deep learning based vehicle trajectory prediction technique which can generate the future trajectory sequence of
surrounding vehicles in real time. We employ the encoder-decoder architecture
which analyzes the pattern underlying in the past trajectory using the long short-
term memory (LSTM) based encoder and generates the future trajectory
sequence using the LSTM based decoder. This structure produces the K most
likely trajectory candidates over occupancy grid map by employing the beam
search technique which keeps the K locally best candidates from the decoder
output. The experiments conducted on highway traffic scenarios show that the
prediction accuracy of the proposed method is significantly higher than the
conventional trajectory prediction techniques in this architecture.
keywords: Trajectory; Decoding; Computer architecture; Microprocessors; Task analysis; Machine learning; Real-time systems


URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8500658&isnumber=8500355

[4] R. Fu, Z. Zhang and L. Li, "Using LSTM and GRU neural network
methods for traffic flow prediction," 2016 31st Youth Academic Annual
Conference of Chinese Association of Automation (YAC), Wuhan, China,
2016, pp. 324-328.

Abstract: Accurate and real-time traffic flow prediction is important in Intelligent Transportation System (ITS), especially for traffic control. Existing models such
as ARMA, ARIMA are mainly linear models and cannot describe the stochastic
and nonlinear nature of traffic flow. In recent years, deep-learning-based methods
have been applied as novel alternatives for traffic flow prediction. However,
which kind of deep neural networks is the most appropriate model for traffic flow
prediction remains unsolved. In this paper, we use Long Short Term Memory
(LSTM) and Gated Recurrent Units (GRU) neural network (NN) methods to
predict short-term traffic flow, and experiments demonstrate that Recurrent
Neural Network (RNN) based deep learning methods such as LSTM and GRU
perform better than auto regressive integrated moving average (ARIMA) model.
keywords: Decision support systems; Economic indicators; Predictive models; Integrated circuits; Conferences; traffic flow prediction; LSTM; GRU; ARIMA

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7804912&isnumber=7804853

[5] R. Akita, A. Yoshihara, T. Matsubara and K. Uehara, "Deep learning for stock prediction using numerical and textual information," 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS), Okaya.

Abstract: This paper proposes a novel application of deep learning models, Paragraph Vector, and Long Short-Term Memory (LSTM), to financial time
series forecasting. Investors make decisions according to various factors,

including consumer price index, price-earnings ratio, and miscellaneous events reported in newspapers. In order to assist their decisions in a timely manner,
many automatic ways to analyze those information have been proposed in the last
decade. However, many of them used either numerical or textual information, but
not both for a single company. In this paper, we propose an approach that
converts newspaper articles into their distributed representations via Paragraph
Vector and models the temporal effects of past events on opening prices about
multiple companies with LSTM.

keywords: Companies; Market research; Time series analysis; Numerical models; Predictive models; Neural networks; Informatics

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7550882&isnumber=7550716

[6] S. Selvin, R. Vinayakumar, E. A. Gopalakrishnan, V. K. Menon and K. P. Soman, "Stock price prediction using LSTM, RNN and CNN-sliding window model," 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 2017, pp. 1643-1647.
Abstract: Stock market or equity market have a profound impact in today's
economy. A rise or fall in the share price has an important role in determining the
investor's gain. The existing forecasting methods make use of both linear (AR,
MA, ARIMA) and non-linear algorithms (ARCH, GARCH, Neural Networks),
but they focus on predicting the stock index movement or price forecasting for a
single company using the daily closing price. The proposed method is a model
independent approach. Here we are not fitting the data to a specific model, rather
we are identifying the latent dynamics existing in the data using deep learning
architectures. In this work we use three different deep learning architectures for
the price prediction of NSE listed companies and compares their performance.
We are applying a sliding window approach for predicting future values on a
short term basis. The performance of the models was quantified using percentage error.


keywords: Companies; Time series analysis; Logic gates; Predictive models; Forecasting; Data models; Machine learning; RNN; LSTM; CNN

URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8126078&isnumber=8125802


CHAPTER 3
PROPOSED SYSTEM

3.1 EXISTING METHODS

Stock price prediction is an interesting and challenging research topic. Developed countries' economies are measured according to the power of their economy. Currently, stock markets are considered to be an illustrious trading field because in many cases they give easy profits with a low-risk rate of return. The stock market, with its huge and dynamic information sources, is considered a suitable environment
for data mining and business researchers. In this paper, we applied k-nearest
neighbour algorithm and non-linear regression approach in order to predict stock
prices for a sample of six major companies listed on the Jordanian stock
exchange to assist investors, management, decision makers, and users in making
correct and informed investment decisions. According to the results, the kNN algorithm is robust with a small error ratio; consequently, the results were rational and reasonable. In addition, judging by the actual stock price data, the prediction results were close and almost parallel to actual stock prices.

Limitations:

1. System accuracy is low


2. System is highly volatile
3.1.1 KNN and Logistic Regression

o K-Nearest Neighbour (KNN) Algorithm for Machine Learning

o K-Nearest Neighbour is one of the simplest Machine Learning algorithms, based on the Supervised Learning technique.

o K-NN algorithm assumes the similarity between the new case/data and available cases and puts the new case into the category that is most similar to the available categories.

o K-NN algorithm stores all the available data and classifies a new data point based on the similarity. This means that when new data appears, it can be easily classified into a well-suited category by using the K-NN algorithm.


o K-NN algorithm can be used for Regression as well as for Classification, but mostly it is used for Classification problems.

o K-NN is a non-parametric algorithm, which means it does not make any assumption on underlying data.

o It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and, at the time of classification, performs an action on the dataset.

o At the training phase, the KNN algorithm just stores the dataset, and when it gets new data, it classifies that data into the category most similar to the new data.

o Example: Suppose we have an image of a creature that looks similar to both a cat and a dog, and we want to know whether it is a cat or a dog. For this identification, we can use the KNN algorithm, as it works on a similarity measure. Our KNN model will find the features of the new data set most similar to the cat and dog images, and based on the most similar features it will put the image in either the cat or the dog category.

Fig 3.1.1.1 : KNN Algorithm

The K-NN working can be explained on the basis of the below algorithm:

o Step-1: Select the number K of the neighbours

o Step-2: Calculate the Euclidean distance of K number of neighbours


o Step-3: Take the K nearest neighbours as per the calculated Euclidean distance.

o Step-4: Among these k neighbours, count the number of the data points in
each category.

o Step-5: Assign the new data point to the category for which the number of neighbours is maximum.

o Step-6: Our model is ready.

Suppose we have a new data point and we need to put it in the required category.
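The six steps above can be sketched as a small from-scratch implementation. The 2-D training points, their category labels, and the query point below are illustrative assumptions chosen to mirror the category A / category B example in the figures.

```python
# From-scratch sketch of the K-NN steps listed above: choose k, compute
# Euclidean distances, take the k nearest, and vote by category.
import math
from collections import Counter

def knn_classify(train, new_point, k):
    """train: list of ((x, y), label) pairs. Returns the majority label
    among the k nearest neighbours of new_point."""
    # Step-2: Euclidean distance from new_point to every training point
    dists = [(math.dist(p, new_point), label) for p, label in train]
    # Step-3: take the k nearest neighbours
    dists.sort(key=lambda t: t[0])
    nearest = [label for _, label in dists[:k]]
    # Steps 4-5: count the labels and assign the majority category
    return Counter(nearest).most_common(1)[0][0]

# Hypothetical labelled points: a cluster of category A and one of B.
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 2), "A"),
         ((6, 6), "B"), ((7, 7), "B")]
label = knn_classify(train, (2, 1), k=3)  # nearest 3 are all category A
```

With k=3 the query point near the A cluster is assigned to category A, exactly as in the worked example above.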


Fig 3.1.1.2 : Euclidean distance

o Firstly, we will choose the number of neighbours: we will choose k=5.

o Next, we will calculate the Euclidean distance between the data points. The
Euclidean distance is the distance between two points, which we have
already studied in geometry.


Fig 3.1.1.3 : Evaluated Distance


o By calculating the Euclidean distance, we got the nearest neighbours: three nearest neighbours in category A and two nearest neighbours in category B. Consider the below image:


Fig 3.1.1.4 : Categorization


o As we can see the 3 nearest neighbours are from category A, hence this new data
point must belong to category A.

o How to select the value of K in the K-NN Algorithm?

Below are some points to remember while selecting the value of K in the K-NN
algorithm:

o There is no particular way to determine the best value for "K", so we need to try
some values to find the best out of them. The most preferred value for K is 5.

o A very low value for K, such as K=1 or K=2, can be noisy and lead to the effects of outliers in the model.

o Large values for K are good, but they may lead to some difficulties.

o Since this model provides low accuracy, we opt for LSTM.


3.1.2 Advantages of KNN Algorithm:

o It is simple to implement.

o It is robust to noisy training data.

o It can be more effective if the training data is large.

3.1.3 Disadvantages of KNN Algorithm:

o Always needs to determine the value of K, which may sometimes be complex.

o The computation cost is high because of calculating the distance between the data points for all the training samples.

o Logistic Regression in Machine Learning

o Logistic regression is one of the most popular Machine Learning algorithms, which comes under the Supervised Learning technique. It is used for predicting the categorical dependent variable using a given set of independent variables.

o Logistic regression predicts the output of a categorical dependent variable. Therefore the outcome must be a categorical or discrete value. It can be either Yes or No, 0 or 1, True or False, etc., but instead of giving the exact values 0 and 1, it gives probabilistic values which lie between 0 and 1.

o Logistic Regression is much similar to Linear Regression except in how they are used. Linear Regression is used for solving Regression problems, whereas Logistic Regression is used for solving classification problems.

o In Logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic function, which predicts two maximum values (0 or 1).

o The curve from the logistic function indicates the likelihood of something
such as whether the cells are cancerous or not, a mouse is obese or not
based on its weight, etc.


o Logistic Regression is a significant machine learning algorithm because it has the ability to provide probabilities and classify new data using continuous and discrete datasets.

o Logistic Regression can be used to classify observations using different types of data and can easily determine the most effective variables used for the classification. The below image shows the logistic function:

Fig 3.1.3.1 : Sigmoid curve


o Note: Logistic regression uses the concept of predictive modelling as regression; therefore, it is called logistic regression, but it is used to classify samples; therefore, it falls under the classification algorithms.

o Logistic Function (Sigmoid Function):

o The sigmoid function is a mathematical function used to map the predicted values to probabilities.

o It maps any real value into another value within a range of 0 and 1.

o The value of the logistic regression must be between 0 and 1, which cannot
go beyond this limit, so it forms a curve like the "S" form. The S-form
curve is called the Sigmoid function or the logistic function.


o In logistic regression, we use the concept of the threshold value, which defines the probability of either 0 or 1. Values above the threshold tend to 1, and values below the threshold tend to 0.
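A minimal sketch of the sigmoid function and the threshold rule described above; the input scores and the 0.5 threshold are illustrative assumptions.

```python
# Sketch of the sigmoid (logistic) function and the threshold rule:
# any real value is mapped into (0, 1), and the probability is then
# compared against a threshold to pick class 0 or 1.
import math

def sigmoid(z):
    """Map any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, threshold=0.5):
    """Probabilities above the threshold map to class 1, below to 0."""
    return 1 if sigmoid(z) > threshold else 0

p = sigmoid(0.0)  # 0.5: the midpoint of the S-shaped curve
```

Large positive scores give probabilities near 1 and large negative scores give probabilities near 0, producing the "S" shape shown in the sigmoid curve figure.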

o Assumptions for Logistic Regression:

o The dependent variable must be categorical in nature.

o The independent variable should not have multi-collinearity.

o Logistic Regression Equation:

The Logistic regression equation can be obtained from the Linear Regression equation. We know the equation of the straight line can be written as:

y = b0 + b1*x1 + b2*x2 + ... + bn*xn

o In Logistic Regression y can be between 0 and 1 only, so for this let's divide the above equation by (1-y):

y / (1-y) ; 0 for y = 0, and infinity for y = 1

o But we need a range between -[infinity] and +[infinity], so taking the logarithm of the equation it will become:

log[ y / (1-y) ] = b0 + b1*x1 + b2*x2 + ... + bn*xn

The above equation is the final equation for Logistic Regression.
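The derivation above can be checked numerically: the log-odds transform log(y/(1-y)) is the inverse of the sigmoid, so applying both recovers the original linear score. The coefficients and input below are illustrative assumptions.

```python
# Numerical check of the logistic regression derivation: the final
# equation says log(y / (1-y)) equals the linear score b0 + b1*x, so
# applying the sigmoid and then the log-odds must round-trip exactly.
import math

def linear(x, b0=-1.0, b1=0.5):
    """Straight-line equation with illustrative coefficients."""
    return b0 + b1 * x

def sigmoid(z):
    """Inverse of the log-odds: maps a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(y):
    """The final logistic regression form: log(y / (1-y))."""
    return math.log(y / (1.0 - y))

z = linear(4.0)   # linear score: -1.0 + 0.5*4.0 = 1.0
y = sigmoid(z)    # probability strictly between 0 and 1
recovered = log_odds(y)  # should equal z up to rounding
```

This confirms that restricting y to (0, 1) via the sigmoid and taking log-odds gives back the unbounded linear score, which is why the final equation is linear in the coefficients.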

Type of Logistic Regression:

On the basis of the categories, Logistic Regression can be classified into three types:

o Binomial: In binomial Logistic regression, there can be only two possible


types of the dependent variables, such as 0 or 1, Pass or Fail, etc.

o Multinomial: In multinomial Logistic regression, there can be 3 or more
possible unordered types of the dependent variable, such as "cat", "dogs",
or "sheep".

o Ordinal: In ordinal Logistic regression, there can be 3 or more possible
ordered types of the dependent variable, such as "low", "medium", or
"high".


3.2 PROPOSED SYSTEM

The stock market is a complex and dynamic system that is influenced by


many factors, including economic conditions, political events, and investor
sentiment. Predicting stock market movements is challenging, but it can be
achieved by using various techniques such as fundamental analysis, technical
analysis, and quantitative analysis. In recent years, machine learning techniques
such as ARIMA (Autoregressive Integrated Moving Average) have gained
popularity in predicting stock market movements. ARIMA is a time series
forecasting method that is widely used in finance and economics for modeling
and predicting time series data.

ARIMA is a powerful tool for predicting stock prices because it takes into
account the trend, seasonality, and random fluctuations in the data. The ARIMA
model consists of three components: the autoregressive component (AR), the
integrated component (I), and the moving average component (MA). The AR
component represents the relationship between past values and future values of
the time series. The MA component represents the relationship between past
errors and future values of the time series. The I component represents the
differencing of the time series to remove any trend or seasonality.

ARIMA models are commonly denoted as ARIMA(p, d, q), where p represents the


order of the AR component, d represents the order of differencing, and q
represents the order of the MA component. The order of the AR and MA
components can be determined by examining the autocorrelation and partial
autocorrelation functions of the time series. The order of differencing can be
determined by examining the trend and seasonality of the time series.

The following are the steps involved in building an ARIMA model for
stock market prediction:

1. Data preparation: The first step is to prepare the time series data. This
involves cleaning the data, checking for missing values, and ensuring that the
data is stationary.


2. Stationarity: ARIMA models assume that the time series data is stationary.
Stationarity means that the mean, variance, and autocorrelation of the data are
constant over time. If the data is not stationary, we need to transform the data
to make it stationary. This can be achieved by taking the first or second
difference of the data.

3. Autocorrelation and Partial Autocorrelation: The next step is to examine the


autocorrelation and partial autocorrelation functions of the time series data.
This will help us determine the order of the AR and MA components.

4. Model Selection: Based on the results of the autocorrelation and partial


autocorrelation functions, we can select the order of the AR and MA
components. We can then fit different ARIMA models to the data and select
the best model based on statistical measures such as the Akaike Information
Criterion (AIC) and the Bayesian Information Criterion (BIC).

5. Model Evaluation: Once we have selected the best ARIMA model, we need
to evaluate its performance. We can use statistical measures such as the Mean
Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared
Error (RMSE) to evaluate the accuracy of the model.

6. Forecasting: The final step is to use the ARIMA model to forecast future
values of the time series. We can use the forecasted values to make informed
investment decisions.

In conclusion, ARIMA is a powerful tool for predicting stock prices because


it takes into account the trend, seasonality, and random fluctuations in the data.
By following the steps outlined above, we can build an accurate ARIMA model
for stock market prediction. However, it is important to note that no model is
perfect and that there is always some degree of uncertainty in stock market
predictions. It is important to use ARIMA in conjunction with other techniques
such as fundamental and technical analysis to make informed investment
decisions.

3.2.1 Introduction about LSTM:

LSTM networks extend recurrent neural networks (RNNs) and are mainly designed to
handle situations in which RNNs do not work. An RNN is an algorithm that
processes the current input by taking into account the output of previous steps
(feedback) and storing it in its internal memory for a brief amount of time
(short-term memory). Of its many applications, the most well-known are in the
areas of non-Markovian speech control and music composition. However, RNNs have
some drawbacks. The first is that they fail to retain information over long
periods of time. Sometimes a piece of data stored a considerable time ago is
needed to determine the present output, but RNNs are incapable of managing these
"long-term dependencies." The second issue is that there is no finer control
over which part of the context needs to be carried forward and which part of the
past must be forgotten. Other issues associated with RNNs are the exploding or
vanishing gradients (explained later) that occur when training an RNN through
backpropagation. Therefore, the Long Short-Term Memory (LSTM) network was
introduced. It was designed so that the vanishing gradient problem is eliminated
almost entirely, leaving the training model unaffected. LSTMs solve problems
involving long time lags and also deal with noise, distributed representations,
and continuous values. With LSTMs, there is no need to keep a finite number of
states decided in advance, as required by the hidden Markov model (HMM). LSTMs
offer us an extensive range of parameters such as learning rates and input and
output biases; therefore, no fine adjustments are needed. The effort to update
each weight is reduced to O(1) with LSTMs, similar to Back Propagation Through
Time (BPTT), which is a significant advantage.

3.2.2 Exploding Vanishing gradient problem

In training a network, the primary objective is to reduce the loss (in terms of
cost or error) seen in the output of the network when training data is passed
through it. We determine the gradient, i.e., the loss with respect to a set of
weights, adjust the weights accordingly, and repeat this process until we arrive
at an optimal set of weights for which the loss is as low as possible. This is
the idea behind backpropagation. Sometimes the gradient becomes minimal. It is
important to note that the gradient in one layer depends on certain components
of the following layers; if any component is tiny (less than one), the resulting
gradient will be even smaller. This is known as the "scaling effect." If this
effect is multiplied by the learning rate, which is itself a tiny value ranging
from 0.1 to 0.001, it produces an even lower value. The change in weights is
then minimal and produces nearly the same results as before. Conversely, if the
gradients are large because of large components, the weights are pushed beyond
the ideal value; this is commonly referred to as the exploding gradient problem.
To stop this scaling effect, the neural network unit was rebuilt so that the
scaling factor was fixed to one. The cell was then enhanced by a number of
gating units and was named the LSTM.

3.2.3 Architecture:

The main difference between the structures of RNNs and LSTMs is that the hidden
layer of an LSTM is a gated unit or cell. It has four layers that work with
each other to produce the cell's output and the cell's state, both of which are
passed to the next layer. Unlike RNNs, which have a single neural-net layer of
tanh, LSTMs comprise three logistic sigmoid gates and one tanh layer. The gates
were added to restrict the information that passes through the cell. They
decide which part of the information is required by the next cell and which
parts must be eliminated. The output typically falls in the range 0-1, where
"0" means "reject all" and "1" means "include all."

3.2.4 Hidden layers of LSTM:

Fig 3.2.4.1: Hidden layers of LSTM


Each LSTM cell has three inputs (ht-1, Ct-1 and xt) and two outputs (ht and Ct).
At a specific time t, ht is the hidden state and Ct is the cell state or memory,
while xt is the current data point or input. The first sigmoid layer takes two
inputs, ht-1 and xt, where ht-1 is the hidden state of the previous cell. It is
known as the forget gate, since its output selects how much information from
the previous cell should be kept. Its output is a number in [0, 1] that is
multiplied (pointwise) with the previous cell's state.
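One step of the gated cell described above can be written out with numpy: three sigmoid gates plus a tanh layer. The sizes and the random initialization below are illustrative assumptions:

```python
# Minimal numpy sketch of a single LSTM cell step.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3                                         # input / hidden sizes
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1   # stacked gate weights
b = np.zeros(4 * n_hid)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev):
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0:n_hid])                # forget gate: what to drop from c_prev
    i = sigmoid(z[n_hid:2 * n_hid])        # input gate: what new info to admit
    g = np.tanh(z[2 * n_hid:3 * n_hid])    # candidate cell values
    o = sigmoid(z[3 * n_hid:4 * n_hid])    # output gate
    c_t = f * c_prev + i * g               # new cell state (memory)
    h_t = o * np.tanh(c_t)                 # new hidden state
    return h_t, c_t

h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.standard_normal(n_in), h, c)
print(h.shape, c.shape)
```

Note how the forget gate f multiplies the previous cell state pointwise, exactly as the text describes.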

3.2.5 Applications:

LSTM models have to be trained using a training dataset before being used for
real-world use. The most challenging applications are listed in the following
sections:

1. Text generation or language modelling involves the calculation of words


whenever a sequence of words is supplied as input. Language models can
be used at the level of characters or n-gram level as well as at the sentence
or the level of a paragraph.

2. Image captioning involves analysing a photograph and expressing its
content as a sentence. To do this, we need a dataset consisting of many
photos with corresponding descriptive captions. A trained model is used to
extract the features of the images in the photo dataset, while the text
data is processed to keep only the most suggestive words. Combining these
two types of information, the model's job is to produce a descriptive
phrase for the image, one word at a time, using the previously predicted
words and the image as input.

3. Speech and Handwriting Recognition

4. The process of music generation is identical to text generation, where


LSTMs can predict the musical notes, not text, by studying a mix of notes
fed into the input.

5. Language translation involves converting a sequence in one language into
the corresponding sequence in a different language. As with image
captioning, the dataset containing words and their translations is cleaned
first, and only the relevant portion is used to build the model. An
encoder-decoder LSTM model converts the input sequence into its vector
representation (encoding) and then outputs its translated version.

3.2.6 Drawbacks:

Everything in the world indeed has its advantages and disadvantages. LSTMs are
no exception, and they also come with a few disadvantages that are discussed
below:

1. They became popular because they solved the problem of vanishing
gradients, but they are unable to eliminate it completely. The issue lies
in the fact that data still needs to be moved from cell to cell for
evaluation. Furthermore, the cell has become considerably more complex with
the additional functions (such as the forget gate) now part of the picture.

2. They require lots of time and resources to be trained and prepared for real-
world applications. Technically speaking, they require high memory
bandwidth due to the linear layers present within each cell, which the
system is usually unable to supply. Therefore, in terms of hardware, LSTMs
become pretty inefficient.

3. With the growing technology of data mining, scientists are searching for
a model that can store past data for longer periods of time than LSTMs.
The motivation behind such a model is the human habit of dividing a
particular chunk of information into smaller parts to facilitate
recollection.

4. LSTMs are affected by different random weight initializations and, in
this respect, behave similarly to feed-forward neural networks. They
favour small weight initializations over large ones.

5. LSTMs tend to overfit, and it can be challenging to apply dropout to
curb this problem. Dropout is a regularization method whereby input and
recurrent connections to LSTM units are probabilistically excluded from
activation and weight updates while training a network.
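Point 5 can be illustrated with a short Keras sketch; the layer sizes, dropout rates and input shape below are illustrative assumptions, not values from the text:

```python
# Sketch of applying dropout to an LSTM layer in Keras, the
# regularization method described in point 5.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(60, 1)),                # 60 time steps, 1 feature
    keras.layers.LSTM(50,
                      dropout=0.2,             # drop input connections
                      recurrent_dropout=0.2),  # drop recurrent connections
    keras.layers.Dense(1),                     # one predicted value
])
model.compile(optimizer="adam", loss="mse")
print(model.output_shape)
```

The dropout and recurrent_dropout arguments are what systematically exclude input and recurrent connections during training, as the text describes.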


3.2.6 Technical Analysis

Technical analysis is used to estimate future stock movement based on the
stock's historical movement. Technical indicators do not forecast the stock
price itself; rather, based on historical analysis, they can forecast the
stock's movement under existing market conditions over time. Technical analysis
helps an investor forecast the direction of the stock price (up/down) in a
specific time interval, and it uses a wide variety of charts that show price
over a period.

The company tickers of the S&P 500 list from Wikipedia are saved, and stock
data is extracted for every company ticker.

Then the close index of every company is taken into account and put into one
data frame to find correlations between the companies. After pre-processing the
data, different technical parameters are created based on stock price, volume
and close value; based on the movement of prices, technical indicators are
derived that help set a target percentage to predict buy, sell or hold.
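The target-setting idea can be sketched as follows; the 7-day horizon and the 2% requirement are illustrative assumptions, not values fixed by the text:

```python
# Turn forward price movement into buy / sell / hold labels.
import pandas as pd

def label_moves(close: pd.Series, horizon: int = 7,
                requirement: float = 0.02) -> pd.Series:
    """Return 1 (buy), -1 (sell) or 0 (hold) for each day, based on the
    percentage change over the next `horizon` days."""
    future_pct = close.shift(-horizon) / close - 1.0
    labels = pd.Series(0, index=close.index)   # default: hold
    labels[future_pct > requirement] = 1       # expected rise -> buy
    labels[future_pct < -requirement] = -1     # expected fall -> sell
    return labels

close = pd.Series([100, 101, 99, 103, 104, 102, 106, 108, 95, 94, 93, 92])
print(label_moves(close, horizon=3).tolist())
```

Days whose look-ahead window runs past the end of the series stay at 0 (hold), since the comparison with NaN is false.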

3.2.7 Source Code

To start, a list of companies is needed for which we can build statistical
models and forecast future stock conditions. Every company has its own record
of stock data from 1/1/2000 to 31/12/2017. First, the S&P 500 list is
extracted from Wikipedia, where it is presented in a table format that is easy
to handle.


For each table row, the ticker is the first table data cell; we grab its .text
and append the ticker to the list. The list is saved with pickle, and if the
list changes, it is refreshed after specific periods of time. Saving the list
of tickers avoids hitting Wikipedia every time the script is run.
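A sketch of what the ticker-grabbing code in Fig 3.2.7.1 likely resembles, split so the parsing can be checked without network access; the URL, table class and pickle file name are assumptions:

```python
# Scrape the S&P 500 table from Wikipedia and pickle the ticker list.
import pickle
import bs4
import requests

def parse_tickers(html: str):
    """Extract tickers from the first column of the S&P 500 wikitable."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "wikitable sortable"})
    return [row.find_all("td")[0].text.strip()   # grab the .text of the cell
            for row in table.find_all("tr")[1:]]

def save_sp500_tickers(path="sp500tickers.pickle"):
    resp = requests.get(
        "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies")
    tickers = parse_tickers(resp.text)
    with open(path, "wb") as f:   # pickle the list so Wikipedia is not hit again
        pickle.dump(tickers, f)
    return tickers
```

Pickling the list means subsequent runs can simply load the file instead of re-scraping.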

With the tickers of 500 companies in hand, we need the stock pricing data
of each company. The stock pricing data of the first 15 companies is
extracted; each company has around 6000 entries of stock data. For companies
that were started after 2000 and have empty values, the NaN entries are
replaced by zero.
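A sketch of what the close-compiling code in Fig 3.2.7.5 likely resembles; the per-ticker CSV layout, directory name and column names are assumptions:

```python
# Read each company's saved stock data, keep its close price as one
# column, join everything into a single DataFrame, and replace the NaN
# entries of late-listed companies with zero.
import os
import pandas as pd

def compile_closes(tickers, data_dir="stock_dfs"):
    main_df = pd.DataFrame()
    for ticker in tickers:
        df = pd.read_csv(os.path.join(data_dir, f"{ticker}.csv"),
                         index_col="Date", parse_dates=True)
        df = df[["Close"]].rename(columns={"Close": ticker})
        main_df = df if main_df.empty else main_df.join(df, how="outer")
    return main_df.fillna(0)   # companies started after 2000 get zeros
```

The outer join keeps every trading date seen for any company, so companies listed later show zeros before their first entry.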


Fig 3.2.7.1 : Code to Grab S&P 500tickers

Fig 3.2.7.2 : Output of Grabbed 500 tickers


Fig 3.2.7.3 : Code to Grab stock data from Morningstar


Fig 3.2.7.4 : Output Stock data of companies.


Fig 3.2.7.5 : Code to compile all close indices of companies in one data frame.


Fig 3.2.7.6 : Output close index of all companies together in one data frame.


Fig 3.2.7.7 : Code to find and visualize correlations.


Fig 3.2.7.8 : Output Of the correlation table.

Fig 3.2.7.9: Heatmap of the correlations.


Fig 3.2.7.10 : Code to set trading conditions and data processing for labels.


Fig 3.2.7.11: Code to extract feature sets and map them to labels.


3.3 SYSTEM REQUIREMENTS


3.3.1 Software Requirements:

 Operating system : Windows 10.

 Coding Language : Python 3.8

 Web Framework : Flask

3.3.2 Hardware Requirements

 System : Pentium i3 Processor.

 Hard Disk : 500 GB.

 Monitor : 15’’ LED

 Input Devices : Keyboard, Mouse

 Ram : 4 GB


3.3.3 Software Installation:

Python is a general-purpose interpreted, interactive, object-oriented, and high-level


programming language. It was created by Guido van Rossum during 1985- 1990.
Like Perl, Python source code is also available under the GNU General Public
License (GPL). Python is a high-level, interpreted, interactive and object-oriented
scripting language.

Python is designed to be highly readable. It uses English keywords frequently,
whereas other languages use punctuation, and it has fewer syntactical
constructions than other languages.

 Python is Interpreted − Python is processed at runtime by the interpreter. You
do not need to compile your program before executing it. This is similar to
PERL and PHP.

 Python is Interactive − You can actually sit at a Python prompt and interact
with the interpreter directly to write your programs.

 Python is Object-Oriented − Python supports an object-oriented style or
technique of programming that encapsulates code within objects.

 Python is a Beginner's Language − Python is a great language for
beginner-level programmers and supports the development of a wide range of
applications, from simple text processing to WWW browsers to games.

History Of Python:

Python was developed by Guido van Rossum in the late eighties and early nineties at
the National Research Institute for Mathematics and Computer Science in the
Netherlands.

Python is derived from many other languages, including ABC, Modula-3, C, C++,
Algol-68, Smalltalk, and UNIX shell and other scripting languages.

Python is copyrighted. Like Perl, Python source code is now available under the
GNU General Public License (GPL).


Python is now maintained by a core development team at the institute, although


Guido van Rossum still holds a vital role in directing its progress.
Python Features

Python's features include −


 Easy-to-learn − Python has few keywords, simple structure, and a clearly
defined syntax. This allows the student to pick up the language quickly.
 Easy-to-read − Python code is more clearly defined.
 Easy-to-maintain − Python's source code is fairly easy-to-maintain.

 A broad standard library − Python's bulk of the library is very portable and
crossplatform compatible on UNIX, Windows, and Macintosh.
 Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
 Portable − Python can run on a wide variety of hardware platforms and has the
same interface on all platforms.
 Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more
efficient.
 Databases − Python provides interfaces to all major commercial databases.
 GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and windows systems, such as Windows
MFC, Macintosh, and the X Window system of Unix.
 Scalable − Python provides a better structure and support for large programs
than shell scripting. Apart from the above-mentioned features, Python has a big
list of good features, few are listed below −
 It supports functional and structured programming methods as well as OOP.
 It can be used as a scripting language or can be compiled to byte-code for
building large applications.
 It provides very high-level dynamic data types and supports dynamic type
checking.
 It supports automatic garbage collection.


 It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
Python is available on a wide variety of platforms including Linux and Mac
OS.

Let's understand how to set up our Python environment.

Local Environment Setup:

Open a terminal window and type "python" to find out if it is already installed
and which version is installed. Python is available on a wide variety of
platforms, including the following −
 Unix (Solaris, Linux, FreeBSD, AIX, HP/UX, SunOS, IRIX, etc.)
 Win 9x/NT/2000
 Macintosh (Intel, PPC, 68K)
 OS/2
 DOS (multiple versions)
 PalmOS
 Nokia mobile phones
 Windows CE
 Acorn/RISC OS
 BeOS
 Amiga
 VMS/OpenVMS
 QNX
 VxWorks
 Psion
 Python has also been ported to the Java and .NET virtual machines

Getting Python:

The most up-to-date and current source code, binaries, documentation, news,
etc., is available on the official website of Python https://www.python.org/

You can download Python documentation from https://www.python.org/doc/.


The documentation is available in HTML, PDF, and PostScript formats.


Installing Python:

Python distribution is available for a wide variety of platforms. You need to


download only the binary code applicable for your platform and install Python. If the
binary code for your platform is not available, you need a C compiler to compile the
source code manually. Compiling the source code offers more flexibility in terms of
choice of features that you require in your installation.

Here is a quick overview of installing Python on various platforms − Unix and Linux
Installation

Here are the simple steps to install Python on Unix/Linux machine.


 Open a Web browser and go to https://www.python.org/downloads/.
 Follow the link to download zipped source code available for Unix/Linux.
 Download and extract files.
 Edit the Modules/Setup file if you want to customize some options.
 run ./configure script
 make
 make install

This installs Python at the standard location /usr/local/bin and its libraries
at /usr/local/lib/pythonXX, where XX is the version of Python.

Windows Installation:

Here are the steps to install Python on Windows machine.


 Open a Web browser and go to https://www.python.org/downloads/.
 Follow the link for the Windows installer python-XYZ.msi file where XYZ is
the version you need to install.
 To use this installer python-XYZ.msi, the Windows system must support
Microsoft Installer 2.0. Save the installer file to your local machine and then
run it to find out if your machine supports MSI.


 Run the downloaded file. This brings up the Python install wizard, which is
really easy to use. Just accept the default settings, wait until the install is
finished, and you are done.

Macintosh installation:

Recent Macs come with Python installed, but it may be several years out of
date.

See http://www.python.org/download/mac/ for instructions on getting the current


version along with extra tools to support development on the Mac. For older Mac
OS's before Mac OS X 10.3 (released in 2003), Mac Python is available.

Setting Up Path:

Programs and other executable files can be in many directories, so operating


systems provide a search path that lists the directories that the OS searches for
executables.

The path is stored in an environment variable, which is a named string


maintained by the operating system. This variable contains information available to
the command shell and other programs.
The path variable is named as PATH in Unix or Path in Windows (Unix is case
sensitive; Windows is not).

In Mac OS, the installer handles the path details. To invoke the Python
interpreter from any particular directory, you must add the Python directory to your
path.

Setting Up Path At Unix/Linux:


 To add the Python directory to the path for a particular session in Unix −
 In the csh shell − type setenv PATH "$PATH:/usr/local/bin/python" and
press Enter.
 In the bash shell (Linux) − type export PATH="$PATH:/usr/local/bin/python"
and press Enter.


 In the sh or ksh shell − type PATH="$PATH:/usr/local/bin/python" and press
Enter.
 Note − /usr/local/bin/python is the path of the Python directory

Setting Path At Windows:

To add the Python directory to the path for a particular session in Windows −
At the command prompt − type path %path%; C:\Python and press Enter.

Note − C:\Python is the path of the Python directory

Python Environment Variables

Here are important environment variables, which can be recognized by Python −
 PYTHONPATH − It has a role similar to PATH. This variable tells the Python
interpreter where to locate the module files imported into a program.
 PYTHONSTARTUP − It contains the path of an initialization file containing
Python source code. It is executed every time you start the interpreter.
 PYTHONCASEOK − It is used in Windows to instruct Python to find the first
case-insensitive match in an import statement.
 PYTHONHOME − It is an alternative module search path, usually embedded in
the PYTHONPATH or PYTHONSTARTUP directories to make switching module
libraries easy.

Running Python:

There are three different ways to start Python −

1. Interactive Interpreter

You can start Python from UNIX, DOS, or any other system that provides you
a command line interpreter or shell window.

Enter python at the command line.

Start coding right away in the interactive interpreter.


Here is the list of all the available command line options −

Sr.No. Option & Description

1. -d − It provides debug output.

2. -O − It generates optimized bytecode (resulting in .pyo files).

3. -S − Do not run import site to look for Python paths on startup.

4. -v − Verbose output (detailed trace on import statements).

5. -X − Disable class-based built-in exceptions (just use strings); obsolete
starting with version 1.6.

6. -c cmd − Run the Python script sent in as the cmd string.

7. file − Run the Python script from the given file.

TABLE 6.4.4 : LIST OF ALL THE AVAILABLE COMMAND LINE OPTIONS


2. Script From Command Line

A Python script can be executed at command line by invoking the interpreter on your

application, as in the following −

$python script.py # Unix/Linux

or

python% script.py # Unix/Linux

or

C: > python script.py # Windows/DOS

Note − Be sure the file permission mode allows execution.

Integrated Development Environment

You can run Python from a Graphical User Interface (GUI) environment as well, if
you have a GUI application on your system that supports Python.
 UNIX − IDLE is the very first Unix IDE for Python.
 Windows − PythonWin is the first Windows interface for Python and is an
IDE with a GUI.
 Macintosh − The Macintosh version of Python along with the IDLE IDE is
available from the main website, downloadable as either MacBinary or
BinHex'd files.
 If you are not able to set up the environment properly, then you can take help
from your system admin. Make sure the Python environment is properly set
up and working perfectly fine.


3.3.4 Technologies Overview

Python Programming Language

Python implementation was started in December 1989 by Guido van Rossum. It is
an open-source language, easy to learn and easy to read, with a very friendly
and interactive environment for beginners. Its standard library is made up of
many functions that come with Python when it is installed. It is object-oriented
and functional, and easy to interface with C, Objective-C, Java and FORTRAN.
Python is a very interactive language because it takes less time to provide
results. Python is often used for:

 Web development
 Scientific programming
 Desktop GUIs
 Network programming
 Game programming
Advantages of Python

 Extensive Support Libraries


 Integration Feature
 Improved Programmers Productivity
 Productivity

MySql
MySQL is an open-source DBMS developed, supported and distributed by
Oracle Corporation. MySQL is easy to use, extremely powerful, well supported
and secure. It is an ideal database solution for web sites because of its
small size and speed.

1. MySQL software is open source

2. MySQL database are relational

3. MySQL server works in client/server or embedded system


4. The MySQL database server is very fast, reliable, scalable, and easy to use.

5. A large amount of contributed MySQL software is available

6. MySQL is a database management system


3.3.5 System Architecture

Fig 3.3.5.1 : System Architecture


3.3.6 System Testing

The purpose of testing is to discover errors. Testing is the process of trying
to discover every conceivable fault or weakness in a work product. It provides
a way to check the functionality of components, subassemblies, assemblies
and/or a finished product. It is the process of exercising software with the
intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various
types of tests, each of which addresses a specific testing requirement.

Types Of Tests

Unit Testing

Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly, and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It
is the testing of individual software units of the application. It is done
after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is
invasive. Unit tests perform basic tests at component level and test a specific
business process, application, and/or system configuration. Unit tests ensure
that each unique path of a business process performs accurately to the
documented specifications and contains clearly defined inputs and expected
results.

Integration Testing

Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is
more concerned with the basic outcome of screens or fields. Integration tests
demonstrate that although the components were individually satisfactory, as
shown by successful unit testing, the combination of components is correct and
consistent. Integration testing is specifically aimed at exposing the problems
that arise from the combination of components.


Functional Testing

Functional tests provide systematic demonstrations that functions tested are


available as specified by the business and technical requirements, system
documentation, and user manuals.

Functional testing is centered on the following items:

Valid Input : identified classes of valid input must be accepted.

Invalid Input : identified classes of invalid input must be rejected.

Functions : identified functions must be exercised.

Output : identified classes of application outputs must be exercised.

Systems/Procedures: interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on requirements,
key functions, or special test cases. In addition, systematic coverage pertaining to
identified business process flows, data fields, predefined processes, and successive
processes must be considered for testing. Before functional testing is complete,
additional tests are identified and the effective value of current tests is determined.
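For example, the valid- and invalid-input classes above can be exercised against an input validator; the ticker-format rule below is a simplified assumption, not the project's actual validation logic:

```python
import re

def is_valid_ticker(symbol):
    """Accept 1-5 uppercase letters (a simplified, hypothetical rule)."""
    return bool(re.fullmatch(r"[A-Z]{1,5}", symbol))

# Valid input: identified classes of valid input must be accepted.
assert is_valid_ticker("AAPL")
assert is_valid_ticker("F")
# Invalid input: identified classes of invalid input must be rejected.
assert not is_valid_ticker("aapl")     # lowercase
assert not is_valid_ticker("")         # empty
assert not is_valid_ticker("TOOLONG")  # too long
```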

System Testing

System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.

White Box Testing

White Box Testing is testing in which the software tester has knowledge of
the inner workings, structure, and language of the software, or at least its purpose.
It is used to test areas that cannot be reached from a black box level.


Black Box Testing

Black Box Testing is testing the software without any knowledge of the inner
workings, structure, or language of the module being tested. Black box tests, as most
other kinds of tests, must be written from a definitive source document, such as a
specification or requirements document. It is testing in which the software under
test is treated as a black box: you cannot "see" into it. The test provides inputs and
responds to outputs without considering how the software works.
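A black-box test of the forecasting component would therefore check only properties of the output, never the internals. The simple moving-average stand-in below is hypothetical; the same checks would apply unchanged to the LSTM model:

```python
def forecast_next(prices, window=3):
    """The unit under test, treated as a black box (a moving average stands in
    for the forecasting model here; the test never looks inside)."""
    recent = prices[-window:]
    return sum(recent) / len(recent)

# Black-box checks: only inputs and observable outputs are examined.
prices = [100.0, 102.0, 104.0]
prediction = forecast_next(prices)
assert isinstance(prediction, float)
assert min(prices) <= prediction <= max(prices)  # plausibility property
print(prediction)  # → 102.0
```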

Unit Testing:

Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.

Test strategy and approach

Field testing will be performed manually and functional tests will be written
in detail.

Test objectives

 All field entries must work properly.


 Pages must be activated from the identified link.
 The entry screen, messages and responses must not be delayed.

Features to be tested

 Verify that the entries are of the correct format


 No duplicate entries should be allowed
 All links should take the user to the correct page.
Integration Testing:

Software integration testing is the incremental integration testing of two or
more integrated software components on a single platform to produce failures caused
by interface defects.


The task of the integration test is to check that components or software
applications, e.g. components in a software system or, one step up, software
applications at the company level, interact without error.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

Acceptance Testing :

User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system meets the
functional requirements.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

Unit Testing:

Unit testing focuses verification effort on the smallest unit of software design,
that is, the module. Unit testing exercises specific paths in a module's control
structure to ensure complete coverage and maximum error detection. This test
focuses on each module individually, ensuring that it functions properly as a unit;
hence the name, unit testing.

During this testing, each module is tested individually and the module
interfaces are verified for consistency with the design specification. All important
processing paths are tested for the expected results. All error handling paths are also
tested.

Integration Testing

Integration testing addresses the issues associated with the dual problems of
verification and program construction. After the software has been integrated, a set
of high-order tests is conducted. The main objective of this testing process is to take
unit-tested modules and build a program structure that has been dictated by design.

1. Top Down Integration

This method is an incremental approach to the construction of program
structure. Modules are integrated by moving downward through the control
hierarchy, beginning with the main program module. The modules subordinate to the
main program module are incorporated into the structure in either a depth-first or
breadth-first manner.

In this method, the software is tested from the main module, and individual stubs
are replaced as the test proceeds downwards.
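To make the role of stubs concrete, here is a sketch in which a not-yet-integrated data-fetching module is replaced by a stub so the higher-level module can be tested first (all names are hypothetical):

```python
def fetch_prices_stub(ticker):
    """Stub for the not-yet-integrated data-fetching module: returns canned
    data so the higher-level module can be tested first."""
    return [100.0, 102.0, 104.0]

def average_price(ticker, fetch=fetch_prices_stub):
    """Higher-level module under test; the real fetcher replaces the stub
    as integration proceeds downwards."""
    prices = fetch(ticker)
    return sum(prices) / len(prices)

print(average_price("AAPL"))  # → 102.0 (computed from the stub's canned data)
```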

2. Bottom-up Integration

This method begins the construction and testing with the modules at the lowest
level in the program structure. Since the modules are integrated from the bottom up,
processing required for modules subordinate to a given level is always available and
the need for stubs is eliminated. The bottom-up integration strategy may be
implemented with the following steps:

 The low-level modules are combined into clusters that perform a specific
software sub-function.
 A driver (i.e. the control program for testing) is written to coordinate test
case input and output.
 The cluster is tested.
 Drivers are removed and clusters are combined moving upward in the
program structure.

The bottom-up approach tests each module individually; then each module is
integrated with a main module and tested for functionality.
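The driver described in the steps above can be sketched as follows; the low-level modules `clean_prices` and `daily_returns` are hypothetical examples of a cluster:

```python
def clean_prices(raw):
    """Low-level module: drop missing and non-positive quotes."""
    return [p for p in raw if p is not None and p > 0]

def daily_returns(prices):
    """Low-level module: simple returns between consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def driver():
    """Driver: coordinates test-case input and output for the cluster."""
    raw = [100.0, None, 110.0, 99.0, -1.0]
    cleaned = clean_prices(raw)
    returns = daily_returns(cleaned)
    assert cleaned == [100.0, 110.0, 99.0]
    assert len(returns) == len(cleaned) - 1
    return returns

print(driver())
```

Once the cluster passes, the driver is removed and the cluster is combined with the modules above it.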

Other Testing Methodologies

User Acceptance Testing

User acceptance of a system is the key factor for the success of any system.
The system under consideration is tested for user acceptance by constantly keeping
in touch with the prospective system users during development and making changes
wherever required. The system developed provides a friendly user interface that can
easily be understood even by a person who is new to the system.


Output Testing

After performing the validation testing, the next step is output testing of the
proposed system, since no system could be useful if it does not produce the required
output in the specified format. Asking the users about the format required by them tests
the outputs generated or displayed by the system under consideration. Hence the output
format is considered in 2 ways – one is on screen and another in printed format.


CHAPTER 4
APPLICATIONS
4.1 ARIMA APPLICATIONS

ARIMA (Autoregressive Integrated Moving Average) is a time series
forecasting method that has a wide range of applications in various fields. Here are
some of the applications of ARIMA:
Finance: ARIMA is widely used in finance for predicting stock prices, exchange
rates, and other financial time series data. By analyzing the historical trends and
patterns in the data, ARIMA models can help investors make informed decisions
about buying and selling stocks, currencies, and other financial instruments.

Economics: ARIMA models are also used in economics for forecasting economic
indicators such as GDP, inflation, and unemployment rates. By analyzing the
historical data and identifying the underlying trends and patterns, ARIMA models
can help economists make predictions about the future state of the economy.

Energy: ARIMA models are used in the energy sector for forecasting energy prices,
demand, and consumption. By analyzing the historical data and identifying the
underlying trends and patterns, ARIMA models can help energy companies make
informed decisions about production, pricing, and distribution.

Healthcare: ARIMA models are used in healthcare for forecasting patient volumes,
disease incidence rates, and healthcare costs. By analyzing the historical data and
identifying the underlying trends and patterns, ARIMA models can help healthcare
providers make informed decisions about resource allocation, staffing, and
budgeting.

Marketing: ARIMA models are used in marketing for forecasting sales volumes,
customer behavior, and market trends. By analyzing the historical data and
identifying the underlying trends and patterns, ARIMA models can help businesses
make informed decisions about product development, pricing, and promotion.

Meteorology: ARIMA models are used in meteorology for forecasting weather
patterns and natural disasters. By analyzing the historical data and identifying the
underlying trends and patterns, ARIMA models can help meteorologists make
informed predictions about future weather conditions.


Traffic: ARIMA models are used in transportation for forecasting traffic volumes
and congestion. By analyzing the historical data and identifying the underlying
trends and patterns, ARIMA models can help transportation planners make
informed decisions about infrastructure investments, route planning, and traffic
management.

In conclusion, ARIMA is a powerful tool for forecasting time series data in
various fields. By analyzing the historical data and identifying the underlying trends
and patterns, ARIMA models can help decision-makers make informed predictions
about the future state of the system being analyzed. However, it is important to note
that no model is perfect and that there is always some degree of uncertainty in
predictions made by ARIMA models. Therefore, it is important to use ARIMA in
conjunction with other techniques and to continually evaluate and refine the models
based on new data and changing conditions.
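As a minimal illustration of the "I" (integrated) part of ARIMA, the sketch below differences a series once, models the differences by their mean (a drift term), and integrates the forecast back. This is ARIMA(0,1,0) with drift, not a full ARIMA fit; in practice one would use, e.g., the `ARIMA` class from `statsmodels`:

```python
def arima_010_drift_forecast(series, steps=1):
    """ARIMA(0,1,0)-with-drift sketch: difference once (the 'I'), model the
    differences by their mean (the drift), then integrate forecasts back."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = last + drift  # integrate the constant drift forward
        forecasts.append(last)
    return forecasts

print(arima_010_drift_forecast([1, 2, 3, 4], steps=2))  # → [5.0, 6.0]
```

Adding the AR and MA terms lets the full model also exploit autocorrelation in the differenced series, which this sketch ignores.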

4.2 LSTM APPLICATIONS

LSTM (Long Short-Term Memory) is a type of recurrent neural network that
has shown great success in various applications in recent years. Here are some of the
applications of LSTM:

1. Natural Language Processing (NLP): LSTM models are used in NLP for tasks
such as language translation, sentiment analysis, and text generation. By
processing and analyzing the context of the text, LSTM models can generate more
accurate and meaningful results compared to traditional machine learning models.

2. Speech Recognition: LSTM models are used in speech recognition for tasks
such as voice recognition and speech-to-text conversion. By analyzing the context
and temporal patterns of speech, LSTM models can generate more accurate and
robust results compared to traditional models.

3. Image and Video Recognition: LSTM models are used in image and video
recognition for tasks such as object detection, facial recognition, and gesture
recognition. By analyzing the temporal patterns and context of the images or video
frames, LSTM models can generate more accurate and detailed results compared to
traditional models.


4. Health Care: LSTM models are used in health care for tasks such as medical
image analysis, disease diagnosis, and patient monitoring. By analyzing the
patient's medical history and identifying the temporal patterns and context of their
symptoms, LSTM models can generate more accurate and timely diagnosis and
treatment recommendations.

5. Financial Forecasting: LSTM models are used in finance for tasks such as stock
price prediction, risk management, and fraud detection. By analyzing the historical
patterns and context of financial data, LSTM models can generate more accurate
and reliable predictions and help investors make informed decisions.

6. Autonomous Driving: LSTM models are used in autonomous driving for tasks
such as object detection, lane recognition, and pedestrian detection. By analyzing
the temporal patterns and context of the driving environment, LSTM models can
help self-driving cars make more accurate and safe decisions.

In conclusion, LSTM is a powerful tool for processing and analyzing time
series data, sequential data, and context-rich data in various applications. LSTM
models have shown great success in natural language processing, speech recognition,
image and video recognition, health care, financial forecasting, and autonomous
driving. However, it is important to note that LSTM models require large amounts of
data and computing power, and their performance depends on the quality and
quantity of the data used for training. Therefore, it is important to carefully design
and evaluate the LSTM models and to continually improve them based on new data
and changing conditions.
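The long-term memory mechanism behind these applications can be sketched as a single LSTM cell step; the scalar weights below are illustrative values, not trained parameters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step for scalar input/state, showing the gating that lets
    the cell keep or discard long-term memory."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate memory
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    c = f * c_prev + i * g  # cell state: old memory kept via f, new info via i
    h = o * math.tanh(c)    # hidden state exposed to the next layer
    return h, c

# Run a toy normalized price sequence through the cell.
w = dict(wf=0.5, uf=0.1, bf=0.0, wi=0.5, ui=0.1, bi=0.0,
         wg=0.5, ug=0.1, bg=0.0, wo=0.5, uo=0.1, bo=0.0)
h, c = 0.0, 0.0
for price in [0.10, 0.20, 0.15, 0.30]:
    h, c = lstm_step(price, h, c, w)
print(h, c)  # final hidden and cell state
```

In a real model such as a Keras `LSTM` layer, the same gate equations operate on vectors, and the weights are learned from the training data.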


CHAPTER 5
EXPERIMENTAL ANALYSIS

Fig 5.1 : Home Page


Fig 5.2: Stock price text field


Fig 5.3: Output of the predicted stock entered in the text field (here, AAPL)

Fig 5.4: Existing model outputs, such as KNN and ARIMA model accuracy for stocks


Fig 5.5 : LSTM Model Accuracy

Fig 5.6: Tweets about stock entered


Fig 5.7: Pie Chart


CHAPTER 6
CONCLUSION AND FUTURE SCOPE

Through this study, it can be seen that Deep Learning algorithms have a
significant influence on modern technologies, especially in developing different time
series based prediction models. For stock price prediction, they can generate the
highest level of accuracy compared to any other regression models. Among different
Deep Learning models, both LSTM and BiLSTM can be used for stock price
prediction with proper adjustment of different parameters. To develop any kind of
prediction model, adjustment of these parameters is very important, as the accuracy
of prediction depends significantly upon them. Therefore, LSTM and BiLSTM
models also require this proper tuning of parameters. Using the same parameters for
both models, the BiLSTM model generates a lower RMSE compared to the LSTM
model. Therefore, our proposed prediction model using BiLSTM can be used by
individuals and ventures for stock market forecasting. This can help investors gain
much financial benefit while retaining a sustainable environment in the stock
market. In future, we plan to analyze data from more stock markets of different
categories to investigate the performance of our approach.
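For reference, the RMSE used in this comparison is the square root of the mean squared difference between actual and predicted prices; the series below are illustrative values, not the project's results:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

actual    = [100.0, 102.0, 101.0, 105.0]  # illustrative actual closes
lstm_pred = [101.0, 101.0, 102.0, 104.0]  # illustrative model predictions
print(rmse(actual, lstm_pred))  # → 1.0; lower RMSE means a better fit
```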


