Visualising and Forecasting Stock Using Dashboard Project
Submitted by
Y.ASHOK
Department of Mathematics
Dr. V. S. Krishna Govt. Degree & P.G. College(A),
Visakhapatnam-530013, India
CERTIFICATE
This is to certify that the project report entitled “VISUALISING AND
FORECASTING STOCK USING DASH BOARD”
Submitted By
Y.ASHOK
is a record of work carried out by him in partial fulfilment of the requirements for the
award of the Degree of Bachelor of Science (Computer Science), as prescribed
by Dr. V. S. Krishna Govt. Degree & P.G. College(A) in the Academic
Year 2019–2020.
OBJECTIVE:
You will be creating a single-page web application using Dash (a Python
framework) and some machine learning models. The application will show company
information (logo, registered name, and description) and stock plots based on the
stock code given by the user, and the ML model will let the user obtain
predicted stock prices for a date entered by the user.
ACKNOWLEDGMENT:
This report could not have been completed without the support of many
people, some of whom helped directly and some indirectly, such as
by publishing their research papers online, which helped us understand this
concept easily.
We express our deepest gratitude and appreciation to our project mentor,
Citronade Choubey Ma’am (Assistant Professor, CSE); her availability
throughout the semester, important guidance, opinions, encouragement, and
support greatly enhanced this project work. We are grateful to the college for
providing the necessary infrastructure and technical support for the completion
of this project. Lastly, we would like to thank our family and friends for their
unconditional support.
Table of Contents
ABSTRACT
OBJECTIVE
ACKNOWLEDGEMENT
CHAPTER 1 INTRODUCTION
1.1. Background of Study
1.2. Problem Statement
1.3. Scope of Study
CHAPTER 2 LITERATURE REVIEW
2.1. Stock Market Analysis
2.2. Stock Market Prediction
2.3. Time Series
2.3.1. Autoregressive Integrated Moving Average (ARIMA)
2.4. Existing Financial System
CHAPTER 3 RESEARCH METHODOLOGY
3.1. Phase 1: Problem Identification & Knowledge Discovery
3.1.1. Interview
3.1.2. Initial Forecasting Design
3.1.3. Tools for Development
3.1.4. Equipment and Materials
3.2. Phase 2: Prototyping Methodology
3.2.1. Requirement Gathering and Planning
3.2.2. Prototyping Phase
3.2.3. Testing Phase
3.2.4. Deployment Phase
3.3. Phase 3: Proposed Framework
3.3.1. Flow Chart
3.3.2. System Architecture
CHAPTER 4 RESULT AND DISCUSSION
4.1. Introduction
CHAPTER 7
7.1. Existing System
7.2. Data Source for Market Prediction
7.3. Market Mimicry
7.4. Time Series aspect Structuring
CHAPTER 8
8.1. Proposed System
8.1.1. Downloading The Data
8.2. Splitting Data into a Training set and a test set
8.3. Standard Average
8.4. Exponential Moving Average
8.5. Data Generator
8.6. Data Augmentation
CHAPTER 9
9.1. Code
CHAPTER 10
10.1. Conclusion
10.2. References
CHAPTER-1
INTRODUCTION
Investment companies and individual investors have been utilizing financial
models to gain a better understanding of the market and make profitable
investments. Plenty of data regarding stock price fluctuations is at hand for analysis
and processing. Investors make calculated guesses by analysing data: they browse
the news, study company history and trade trends, and weigh the numerous variables
that go into making a prediction. The prevailing theories hold that stock prices are
largely random and unpredictable. This raises the question of why top firms
like Morgan Stanley and Citigroup hire quantitative analysts to build predictive
models.
This report seeks to utilize Deep Learning models, specifically LSTM Neural
Networks, to predict stock prices. For data with time frames, RNNs come in
handy, and recent research has shown that LSTM networks are the
most sought-after and useful variants of RNNs.
A business can become prone to market fluctuations beyond its control,
including market sentiment, economic conditions, or developments in its sector.
Trading stocks on financial markets is one of the major investment activities.
Previously, researchers developed various stock analysis techniques that could
help anticipate the direction of stock price movement. Predicting and
forecasting future prices, based on current financial information and news, is of
enormous use to investors, who need to know whether a stock will rise or fall
over a particular time period. To obtain accurate output, the approach
implemented here uses machine learning with supervised learning
algorithms, and the results are tested using different types of supervised learning
algorithms with different sets of features.
The stock market is characterized as dynamic, unpredictable, and nonlinear.
Predicting stock prices is a very difficult task, as a price depends on many
factors, including the global economy, the company's financial reports, its
performance, and so on.
Traditionally, two main approaches have been used to predict the
stock price of a company.
The technical analysis method uses the recorded prices of a stock, such as closing and
opening prices, trading volume, adjusted closing values, and so on, to
predict the future price of the stock. The second type of analysis is qualitative
analysis, which is performed on the basis of external factors such as the company's
profile, market conditions, economic and political factors, and descriptive information
in the form of financial news articles, social media, and even blogs by economic
analysts. Stock market datasets change every day, so we need
the latest or real-time data to support the analysis. There are many APIs
available in the market to extract such datasets, and some of them charge
a fee for providing real-time data. Here, I will use a
Python third-party module to extract the data from Yahoo Finance. Pandas will
be used for data processing, while Power BI will be used for data visualization. In this
report, I will show how to use Python together with Power BI to build stock market
analysis and data visualization in a faster way.
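As a rough illustration, the extraction step might look like the following minimal sketch, which assumes the yfinance package; the text does not name the exact third-party module, so this choice, the ticker, and the date range are illustrative assumptions.

import yfinance as yf  # assumed third-party Yahoo Finance module
import pandas as pd

ticker = "AAPL"  # hypothetical stock code entered by a user
df = yf.download(ticker, start="2015-01-01", end="2020-01-01")

# Pandas then handles the processing step, e.g. computing a mid price column
df["Mid"] = (df["High"] + df["Low"]) / 2.0
print(df[["Open", "High", "Low", "Close", "Mid"]].head())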
The Measures for the Suitability Management of Securities and Futures
Investors (hereinafter referred to as the Measures) of China came into force on
July 27, 2017. As the first special regulation on investor protection in China's
securities and futures market, this act is an important foundational law of the capital
market. The act regulates the behaviour of brokers and also provides
evidence for the settlement of disputes in the future. The Measures require that
when a staff member of a securities or insurance company sells financial products and
signs cooperation documents, the company should verify that the sale complies
with certain suitability rules.
CHAPTER-2
LITERATURE REVIEW
▪ C – Constant term
▪ AR – Nonseasonal autoregressive coefficients (ϕ1, …, ϕp)
▪ MA – Nonseasonal moving average coefficients (θ1, …, θq)
▪ AR Lags – Lags corresponding to nonzero, nonseasonal AR coefficients
▪ MA Lags – Lags corresponding to nonzero, nonseasonal MA coefficients
▪ D – Degree of nonseasonal differencing (D = 0 means no nonseasonal
integration)
▪ Variance – Scalar variance of the innovation process (σε²)
▪ Distribution – Distribution of the innovation process
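The report's experiments (Chapter 3) fit ARIMA in R, but a minimal Python sketch with the statsmodels package shows how these terms appear in practice; the file name and the order (p, d, q) = (5, 1, 0) below are illustrative assumptions, not values from this report.

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# close_prices: a pandas Series of daily closing prices (file name is hypothetical)
close_prices = pd.read_csv("stock_market_data.csv")["Close"]

# order = (AR lags p, degree of differencing d, MA lags q)
model = ARIMA(close_prices, order=(5, 1, 0))
fitted = model.fit()
print(fitted.summary())               # reports the constant, AR coefficients and innovation variance
forecast = fitted.forecast(steps=30)  # 30-step-ahead price forecast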
CHAPTER-3
RESEARCH METHODOLOGY
In conducting this project, various processes and activities are required before
the complete proposed application can be delivered. Figure 3-1
illustrates the phases involved and their deliverables.
3.1.1. Interview:
An interview with an expert was conducted to gather related findings and analyse the
need for the proposed application. The expert selected for this study is Ms.
Nor Abiah Zainal Abidin, who has extensive experience in stock price forecasting and
data mining techniques, gained during her time at one of the Malaysian
Government Linked Companies.
4.1. Introduction:
This chapter explains the process and results of the research performed to
uphold the project objectives. The research methods used are an expert
interview and experiments. The interview was conducted with an expert who
has a background in this field and works on a day-to-day basis with statistics
and R programming. The experimental method consisted of a series of stock price
forecasting experiments with the chosen models, to investigate and decide the best
implementation for the project deliverables.
4.3. Experimental:
Several attempts were made to find the most suitable model for this project.
Overall, most of the attempts used the ARIMA model together with the necessary
packages, which are freely distributed for the application. From the perspective of
forecasting ability, the author successfully implemented the ARIMA model to forecast
the price.
Figure 4-1 Stock Price Forecasting with ARIMA
The figure shows the proposed design of the dashboard to be developed
during the development phase. The proposed design was compared with several
examples of existing dashboards in the market. The dashboard will be a full
HTML web page combining results from R programming with the Twitter and
Google News APIs as objects.
The Twitter API will be integrated using Bootstrap, a technique that is available
free of cost to developers, especially on the HTML platform.
These user interfaces are presented on the dashboard, which is served live on
localhost.
The application was developed to forecast the stock price of an index or a company
on Bursa Malaysia.
The main idea of this dashboard is to become an alternative for the early
or ready-to-enter investor to analyse information and forecast the stock price
before entering the market.
The dashboard consists mainly of 4
activities: “Dashboard” or homepage, “Forecast”, “Technical
Analysis”, and “Companies Profile”.
Figure 4-2 Homepage Page
The Home Page or Dashboard, as in Figure 3, consists of the options “Forecast!”,
“Technical Analysis”, “Stock Profile”, and “About Us”. For the “Forecast!” button, the
system proceeds to the forecasting tool page. “Technical Analysis” opens
the technical analysis page. The “Stock Profile” button displays the
stock profile through tree-view options. Lastly, “About Us” proceeds to the
dashboard disclaimer and profile. The dashboard or homepage
displays a live Yahoo stock notifier from Yahoo Finance of the top 30 largest
companies on Bursa Malaysia by market capitalization, known as the Kuala
Lumpur Composite Index (KLCI) (Bursa Malaysia, n.d.), together with Twitter
and Google newsfeeds portraying live tweets and news quotes on the KLCI and
market subjects.
METHODOLOGY:
We saw that new users were afraid to invest in the share market because
they did not have the knowledge, nor any tool with the help of which
they could do this work.
So we created a tool with the help of machine learning and deep
learning that can indicate quite precisely how the market will move by analysing
the data.
We did this work by taking data from Yahoo Finance. We imported this data
through a library and analysed it with a machine learning model, because such
a model does this very precisely.
We used a deep learning LSTM model and trained it on our data,
because it is very advanced, performs this task very accurately, and gives
more accurate results. With this method we can train on the data, so the model
can work very accurately and get close to the real result.
Machine learning models are implemented for predicting the stock price
for the dates requested by the user.
One drawback is that the model works only the way it was trained: it cannot
make any changes to itself, so it behaves exactly as the training data dictates,
which made training on the data quite difficult.
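As a compact illustration of this idea (the full TensorFlow implementation appears in Chapter 9), a minimal sketch using Keras might look as follows; the window length, layer sizes, and the synthetic placeholder series are assumptions for illustration, not values from this project.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Placeholder series; in practice use the scaled real mid prices
train_prices = np.sin(np.linspace(0, 100, 1200))

def make_windows(prices, window=60):
    # Each sample is `window` past prices; the label is the next price
    X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    return X.reshape(-1, window, 1), y

X_train, y_train = make_windows(train_prices)

model = Sequential([
    LSTM(50, input_shape=(X_train.shape[1], 1)),  # learns from the price window
    Dense(1),                                     # predicts the next price
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=10, batch_size=32)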
RESULT:
Create Basic Website Layout, expected outcome: by now you should have
the basic web page set up, as shown below in the second image; it can be viewed
by starting the server locally, as shown below in the first image.
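For reference, a minimal sketch of such a basic Dash layout is shown below; the component names, IDs, and title are illustrative assumptions, not the project's actual code.

import dash
from dash import dcc, html  # Dash 2.x import style

app = dash.Dash(__name__)

app.layout = html.Div([
    html.H1("Stock Visualising and Forecasting Dashboard"),
    dcc.Input(id="stock-code", type="text", placeholder="Enter stock code"),
    html.Button("Submit", id="submit-button"),
    dcc.Graph(id="stock-plot"),  # stock plots are filled in via a callback
])

if __name__ == "__main__":
    app.run_server(debug=True)  # serves the page on localhost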
Deploying the project on Heroku: finally, our web app is deployed and
can be accessed by anyone in the world.
CHAPTER-6
Stock Market
6.1. What is the Stock Market?
The stock market refers to the collection of markets and exchanges where
regular activities of buying, selling, and issuance of shares of publicly-held
companies take place. Such financial activities are conducted through
institutionalized formal exchanges or over-the-counter (OTC) marketplaces
which operate under a defined set of regulations. There can be multiple stock
trading venues in a country or a region which allow transactions in stocks and
other forms of securities.
While both terms - stock market and stock exchange - are used
interchangeably, the latter term is generally a subset of the former. If one says
that she trades in the stock market, it means that she buys and sells
shares/equities on one (or more) of the stock exchange(s) that are part of the
overall stock market. The leading stock exchanges in the U.S. include the New
York Stock Exchange (NYSE), Nasdaq, and the Chicago Board Options
Exchange (CBOE). These leading national exchanges, along with several other
exchanges operating in the country, form the stock market of the U.S.
Though it is called a stock market or equity market and is primarily known
for trading stocks/equities, other financial securities - like exchange traded funds
(ETF), corporate bonds and derivatives based on stocks, commodities,
currencies, and bonds - are also traded in the stock markets.
# Imports used throughout the code in this chapter
# (the LSTM parts below use the TensorFlow 1.x API)
import os
import json
import urllib.request
import datetime as dt
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler

data_source = 'alphavantage'  # 'alphavantage' or 'kaggle' (implied by the two branches below)

if data_source == 'alphavantage':
    # ====================== Loading Data from Alpha Vantage ==================================
    api_key = '<your API key>'
    ticker = 'AAL'  # the stock code to pull

    url_string = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=%s&outputsize=full&apikey=%s" % (ticker, api_key)

    # Save data to this file
    file_to_save = 'stock_market_data-%s.csv' % ticker

    # If you haven't already saved data, grab the data from the URL
    # and store date, low, high, close, open values to a pandas DataFrame
    if not os.path.exists(file_to_save):
        with urllib.request.urlopen(url_string) as url:
            data = json.loads(url.read().decode())
            # Extract stock market data
            data = data['Time Series (Daily)']
            df = pd.DataFrame(columns=['Date', 'Low', 'High', 'Close', 'Open'])
            for k, v in data.items():
                date = dt.datetime.strptime(k, '%Y-%m-%d')
                data_row = [date.date(), float(v['3. low']), float(v['2. high']),
                            float(v['4. close']), float(v['1. open'])]
                df.loc[-1, :] = data_row
                df.index = df.index + 1
        print('Data saved to : %s' % file_to_save)
        df.to_csv(file_to_save)
    # If the data is already there, just load it from the CSV
    else:
        print('File already exists. Loading data from CSV')
        df = pd.read_csv(file_to_save)
else:
    # ====================== Loading Data from Kaggle ==================================
    # You will be using HP's data. Feel free to experiment with other data.
    # But while doing so, be careful to have a large enough dataset and also
    # pay attention to the data normalization
    df = pd.read_csv(os.path.join('Stocks', 'hpq.us.txt'), delimiter=',',
                     usecols=['Date', 'Open', 'High', 'Low', 'Close'])
    print('Loaded data from the Kaggle repository')
# Sort DataFrame by date
df = df.sort_values('Date')
# Double check the result
df.head()

plt.figure(figsize=(18, 9))
plt.plot(range(df.shape[0]), (df['Low'] + df['High']) / 2.0)
plt.xticks(range(0, df.shape[0], 500), df['Date'].loc[::500], rotation=45)
plt.xlabel('Date', fontsize=18)
plt.ylabel('Mid Price', fontsize=18)
plt.show()

# First calculate the mid prices from the highest and lowest
# (.values replaces the deprecated .as_matrix())
high_prices = df.loc[:, 'High'].values
low_prices = df.loc[:, 'Low'].values
mid_prices = (high_prices + low_prices) / 2.0

train_data = mid_prices[:11000]
test_data = mid_prices[11000:]
# Scale the data to be between 0 and 1
# When scaling, remember: you normalize both test and train data with respect
# to training data, because you are not supposed to have access to test data
scaler = MinMaxScaler()
train_data = train_data.reshape(-1, 1)
test_data = test_data.reshape(-1, 1)

# Train the scaler with training data and smooth data window by window
smoothing_window_size = 2500
for di in range(0, 10000, smoothing_window_size):
    scaler.fit(train_data[di:di + smoothing_window_size, :])
    train_data[di:di + smoothing_window_size, :] = scaler.transform(
        train_data[di:di + smoothing_window_size, :])

# You normalize the last bit of remaining data
scaler.fit(train_data[di + smoothing_window_size:, :])
train_data[di + smoothing_window_size:, :] = scaler.transform(
    train_data[di + smoothing_window_size:, :])

# Reshape both train and test data back to 1-D
train_data = train_data.reshape(-1)

# Normalize test data
test_data = scaler.transform(test_data).reshape(-1)

# Now perform exponential moving average smoothing
# so the data will have a smoother curve than the original ragged data
EMA = 0.0
gamma = 0.1
for ti in range(11000):
    EMA = gamma * train_data[ti] + (1 - gamma) * EMA
    train_data[ti] = EMA

# Used for visualization and test purposes
all_mid_data = np.concatenate([train_data, test_data], axis=0)
window_size = 100
N = train_data.size
std_avg_predictions = []
std_avg_x = []
mse_errors = []

for pred_idx in range(window_size, N):
    date = df.loc[pred_idx, 'Date']
    # Predict the next value as the mean of the last `window_size` values
    std_avg_predictions.append(np.mean(train_data[pred_idx - window_size:pred_idx]))
    mse_errors.append((std_avg_predictions[-1] - train_data[pred_idx]) ** 2)
    std_avg_x.append(date)

print('MSE error for standard averaging: %.5f' % (0.5 * np.mean(mse_errors)))

plt.figure(figsize=(18, 9))
plt.plot(range(df.shape[0]), all_mid_data, color='b', label='True')
plt.plot(range(window_size, N), std_avg_predictions, color='orange', label='Prediction')
# plt.xticks(range(0, df.shape[0], 50), df['Date'].loc[::50], rotation=45)
plt.xlabel('Date')
plt.ylabel('Mid Price')
plt.legend(fontsize=18)
plt.show()
window_size = 100
N = train_data.size

run_avg_predictions = []
run_avg_x = []
mse_errors = []

running_mean = 0.0
run_avg_predictions.append(running_mean)

decay = 0.5
for pred_idx in range(1, N):
    running_mean = running_mean * decay + (1.0 - decay) * train_data[pred_idx - 1]
    run_avg_predictions.append(running_mean)
    mse_errors.append((run_avg_predictions[-1] - train_data[pred_idx]) ** 2)
    run_avg_x.append(date)  # note: reuses the last `date` from the block above

print('MSE error for EMA averaging: %.5f' % (0.5 * np.mean(mse_errors)))

plt.figure(figsize=(18, 9))
plt.plot(range(df.shape[0]), all_mid_data, color='b', label='True')
plt.plot(range(0, N), run_avg_predictions, color='orange', label='Prediction')
# plt.xticks(range(0, df.shape[0], 50), df['Date'].loc[::50], rotation=45)
plt.xlabel('Date')
plt.ylabel('Mid Price')
plt.legend(fontsize=18)
plt.show()

class DataGeneratorSeq(object):

    def __init__(self, prices, batch_size, num_unroll):
        self._prices = prices
        self._prices_length = len(self._prices) - num_unroll
        self._batch_size = batch_size
        self._num_unroll = num_unroll
        self._segments = self._prices_length // self._batch_size
        self._cursor = [offset * self._segments for offset in range(self._batch_size)]

    def next_batch(self):
        # One batch of (input price, label price a few steps ahead) pairs
        batch_data = np.zeros((self._batch_size), dtype=np.float32)
        batch_labels = np.zeros((self._batch_size), dtype=np.float32)

        for b in range(self._batch_size):
            if self._cursor[b] + 1 >= self._prices_length:
                # self._cursor[b] = b * self._segments
                self._cursor[b] = np.random.randint(0, (b + 1) * self._segments)

            batch_data[b] = self._prices[self._cursor[b]]
            batch_labels[b] = self._prices[self._cursor[b] + np.random.randint(0, 5)]
            self._cursor[b] = (self._cursor[b] + 1) % self._prices_length

        return batch_data, batch_labels

    def unroll_batches(self):
        unroll_data, unroll_labels = [], []
        init_data, init_label = None, None
        for ui in range(self._num_unroll):
            data, labels = self.next_batch()
            unroll_data.append(data)
            unroll_labels.append(labels)
        return unroll_data, unroll_labels

    def reset_indices(self):
        for b in range(self._batch_size):
            self._cursor[b] = np.random.randint(0, min((b + 1) * self._segments, self._prices_length - 1))

dg = DataGeneratorSeq(train_data, 5, 5)
u_data, u_labels = dg.unroll_batches()

for ui, (dat, lbl) in enumerate(zip(u_data, u_labels)):
    print('\n\nUnrolled index %d' % ui)
    dat_ind = dat
    lbl_ind = lbl
    print('\tInputs: ', dat)
    print('\n\tOutput:', lbl)
D = 1  # Dimensionality of the data. Since the data is 1-D this would be 1
num_unrollings = 50  # Number of time steps you look into the future
batch_size = 500  # Number of samples in a batch
num_nodes = [200, 200, 150]  # Number of hidden nodes in each layer of the deep LSTM stack
n_layers = len(num_nodes)  # number of layers
dropout = 0.2  # dropout amount

tf.reset_default_graph()  # This is important in case you run this multiple times

# Input data.
train_inputs, train_outputs = [], []

# You unroll the input over time, defining placeholders for each time step
for ui in range(num_unrollings):
    train_inputs.append(tf.placeholder(tf.float32, shape=[batch_size, D],
                                       name='train_inputs_%d' % ui))
    train_outputs.append(tf.placeholder(tf.float32, shape=[batch_size, 1],
                                        name='train_outputs_%d' % ui))

lstm_cells = [
    tf.contrib.rnn.LSTMCell(num_units=num_nodes[li],
                            state_is_tuple=True,
                            initializer=tf.contrib.layers.xavier_initializer())
    for li in range(n_layers)]

drop_lstm_cells = [tf.contrib.rnn.DropoutWrapper(
    lstm, input_keep_prob=1.0,
    output_keep_prob=1.0 - dropout,
    state_keep_prob=1.0 - dropout) for lstm in lstm_cells]

drop_multi_cell = tf.contrib.rnn.MultiRNNCell(drop_lstm_cells)
multi_cell = tf.contrib.rnn.MultiRNNCell(lstm_cells)

w = tf.get_variable('w', shape=[num_nodes[-1], 1],
                    initializer=tf.contrib.layers.xavier_initializer())
b = tf.get_variable('b', initializer=tf.random_uniform([1], -0.1, 0.1))
# Create cell state and hidden state variables to maintain the state of the LSTM
c, h = [], []
initial_state = []
for li in range(n_layers):
    c.append(tf.Variable(tf.zeros([batch_size, num_nodes[li]]), trainable=False))
    h.append(tf.Variable(tf.zeros([batch_size, num_nodes[li]]), trainable=False))
    initial_state.append(tf.contrib.rnn.LSTMStateTuple(c[li], h[li]))

# Do several tensor transformations, because the function dynamic_rnn requires
# the output to be of a specific format. Read more at:
# https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn
all_inputs = tf.concat([tf.expand_dims(t, 0) for t in train_inputs], axis=0)

# all_outputs is [seq_length, batch_size, num_nodes]
all_lstm_outputs, state = tf.nn.dynamic_rnn(
    drop_multi_cell, all_inputs,
    initial_state=tuple(initial_state),
    time_major=True, dtype=tf.float32)

all_lstm_outputs = tf.reshape(all_lstm_outputs, [batch_size * num_unrollings, num_nodes[-1]])
all_outputs = tf.nn.xw_plus_b(all_lstm_outputs, w, b)
split_outputs = tf.split(all_outputs, num_unrollings, axis=0)
# When calculating the loss you need to be careful about the exact form,
# because you calculate the loss of all the unrolled steps at the same time.
# Therefore, take the mean error of each batch and get the sum of that
# over all the unrolled steps.
print('Defining training Loss')
loss = 0.0
with tf.control_dependencies([tf.assign(c[li], state[li][0]) for li in range(n_layers)] +
                             [tf.assign(h[li], state[li][1]) for li in range(n_layers)]):
    for ui in range(num_unrollings):
        loss += tf.reduce_mean(0.5 * (split_outputs[ui] - train_outputs[ui]) ** 2)

print('Learning rate decay operations')
global_step = tf.Variable(0, trainable=False)
inc_gstep = tf.assign(global_step, global_step + 1)
tf_learning_rate = tf.placeholder(shape=None, dtype=tf.float32)
tf_min_learning_rate = tf.placeholder(shape=None, dtype=tf.float32)

learning_rate = tf.maximum(
    tf.train.exponential_decay(tf_learning_rate, global_step,
                               decay_steps=1, decay_rate=0.5, staircase=True),
    tf_min_learning_rate)

# Optimizer.
print('TF Optimization operations')
optimizer = tf.train.AdamOptimizer(learning_rate)
gradients, v = zip(*optimizer.compute_gradients(loss))
gradients, _ = tf.clip_by_global_norm(gradients, 5.0)
optimizer = optimizer.apply_gradients(zip(gradients, v))
print('\tAll done')

print('Defining prediction related TF functions')
sample_inputs = tf.placeholder(tf.float32, shape=[1, D])

# Maintaining LSTM state for the prediction stage
sample_c, sample_h, initial_sample_state = [], [], []
for li in range(n_layers):
    sample_c.append(tf.Variable(tf.zeros([1, num_nodes[li]]), trainable=False))
    sample_h.append(tf.Variable(tf.zeros([1, num_nodes[li]]), trainable=False))
    initial_sample_state.append(tf.contrib.rnn.LSTMStateTuple(sample_c[li], sample_h[li]))

reset_sample_states = tf.group(
    *[tf.assign(sample_c[li], tf.zeros([1, num_nodes[li]])) for li in range(n_layers)],
    *[tf.assign(sample_h[li], tf.zeros([1, num_nodes[li]])) for li in range(n_layers)])

sample_outputs, sample_state = tf.nn.dynamic_rnn(
    multi_cell, tf.expand_dims(sample_inputs, 0),
    initial_state=tuple(initial_sample_state),
    time_major=True, dtype=tf.float32)

with tf.control_dependencies([tf.assign(sample_c[li], sample_state[li][0]) for li in range(n_layers)] +
                             [tf.assign(sample_h[li], sample_state[li][1]) for li in range(n_layers)]):
    sample_prediction = tf.nn.xw_plus_b(tf.reshape(sample_outputs, [1, -1]), w, b)
print('\tAll done')
epochs = 30
valid_summary = 1  # Interval at which you make test predictions
n_predict_once = 50  # Number of steps you continuously predict for

train_seq_length = train_data.size  # Full length of the training data

train_mse_ot = []  # Accumulate train losses
test_mse_ot = []  # Accumulate test losses
predictions_over_time = []  # Accumulate predictions

session = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Used for decaying learning rate
loss_nondecrease_count = 0
loss_nondecrease_threshold = 2  # If the test error hasn't decreased in this many steps, decrease learning rate

print('Initialized')
average_loss = 0

# Define data generator
data_gen = DataGeneratorSeq(train_data, batch_size, num_unrollings)

x_axis_seq = []

# Points you start your test predictions from
test_points_seq = np.arange(11000, 12000, 50).tolist()
for ep in range(epochs):

    # ========================= Training =====================================
    for step in range(train_seq_length // batch_size):

        u_data, u_labels = data_gen.unroll_batches()

        feed_dict = {}
        for ui, (dat, lbl) in enumerate(zip(u_data, u_labels)):
            feed_dict[train_inputs[ui]] = dat.reshape(-1, 1)
            feed_dict[train_outputs[ui]] = lbl.reshape(-1, 1)

        feed_dict.update({tf_learning_rate: 0.0001, tf_min_learning_rate: 0.000001})

        _, l = session.run([optimizer, loss], feed_dict=feed_dict)
        average_loss += l

    # ============================ Validation ==============================
    if (ep + 1) % valid_summary == 0:

        average_loss = average_loss / (valid_summary * (train_seq_length // batch_size))

        # The average loss
        print('Average loss at step %d: %f' % (ep + 1, average_loss))

        train_mse_ot.append(average_loss)
        average_loss = 0  # reset loss

        predictions_seq = []
        mse_test_loss_seq = []

        # ===================== Updating State and Making Predictions ========================
        for w_i in test_points_seq:
            mse_test_loss = 0.0
            our_predictions = []

            if (ep + 1) - valid_summary == 0:
                # Only calculate x_axis values in the first validation epoch
                x_axis = []

            # Feed in the recent past behaviour of stock prices
            # to make predictions from that point onwards
            for tr_i in range(w_i - num_unrollings + 1, w_i - 1):
                current_price = all_mid_data[tr_i]
                feed_dict[sample_inputs] = np.array(current_price).reshape(1, 1)
                _ = session.run(sample_prediction, feed_dict=feed_dict)

            feed_dict = {}
            current_price = all_mid_data[w_i - 1]
            feed_dict[sample_inputs] = np.array(current_price).reshape(1, 1)

            # Make predictions for this many steps;
            # each prediction uses the previous prediction as its current input
            for pred_i in range(n_predict_once):
                pred = session.run(sample_prediction, feed_dict=feed_dict)
                our_predictions.append(np.asscalar(pred))
                feed_dict[sample_inputs] = np.asarray(pred).reshape(-1, 1)

                if (ep + 1) - valid_summary == 0:
                    # Only calculate x_axis values in the first validation epoch
                    x_axis.append(w_i + pred_i)

                mse_test_loss += 0.5 * (pred - all_mid_data[w_i + pred_i]) ** 2

            session.run(reset_sample_states)

            predictions_seq.append(np.array(our_predictions))

            mse_test_loss /= n_predict_once
            mse_test_loss_seq.append(mse_test_loss)

            if (ep + 1) - valid_summary == 0:
                x_axis_seq.append(x_axis)

        current_test_mse = np.mean(mse_test_loss_seq)

        # Learning rate decay logic
        if len(test_mse_ot) > 0 and current_test_mse > min(test_mse_ot):
            loss_nondecrease_count += 1
        else:
            loss_nondecrease_count = 0

        if loss_nondecrease_count > loss_nondecrease_threshold:
            session.run(inc_gstep)
            loss_nondecrease_count = 0
            print('\tDecreasing learning rate by 0.5')

        test_mse_ot.append(current_test_mse)
        print('\tTest MSE: %.5f' % np.mean(mse_test_loss_seq))
        predictions_over_time.append(predictions_seq)
        print('\tFinished Predictions')
best_prediction_epoch = 28  # replace this with the epoch that gave you the best results when running the plotting code

plt.figure(figsize=(18, 18))
plt.subplot(2, 1, 1)
plt.plot(range(df.shape[0]), all_mid_data, color='b')

# Plotting how the predictions change over time:
# plot older predictions with low alpha and newer predictions with high alpha
start_alpha = 0.25
alpha = np.arange(start_alpha, 1.1, (1.0 - start_alpha) / len(predictions_over_time[::3]))
for p_i, p in enumerate(predictions_over_time[::3]):
    for xval, yval in zip(x_axis_seq, p):
        plt.plot(xval, yval, color='r', alpha=alpha[p_i])

plt.title('Evolution of Test Predictions Over Time', fontsize=18)
plt.xlabel('Date', fontsize=18)
plt.ylabel('Mid Price', fontsize=18)
plt.xlim(11000, 12500)

plt.subplot(2, 1, 2)

# Plotting the best test prediction you got
plt.plot(range(df.shape[0]), all_mid_data, color='b')
for xval, yval in zip(x_axis_seq, predictions_over_time[best_prediction_epoch]):
    plt.plot(xval, yval, color='r')

plt.title('Best Test Predictions Over Time', fontsize=18)
plt.xlabel('Date', fontsize=18)
plt.ylabel('Mid Price', fontsize=18)
plt.xlim(11000, 12500)
plt.show()
CHAPTER-10
Conclusion
10.1. CONCLUSION:
In this project, we learnt how difficult it can be to devise a model that
correctly predicts stock price movements. We started with a motivation for why
we need to model stock prices. This was followed by an explanation and code
for downloading the data. Then we looked at two averaging techniques that allow
predictions one step into the future; we next saw that these methods
are futile when you need to predict more than one step ahead. Thereafter
we discussed how LSTMs can be used to make predictions many steps into the
future. Finally, we visualized the results and saw that the model (though not
perfect) is quite good at correctly predicting stock price movements.
Here, I'm stating several takeaways:
1. Stock price/movement prediction is an extremely difficult task.
Personally, I don't think any of the stock prediction models out there
should be taken for granted and relied on blindly. Models
might predict stock price movements correctly most of the time,
but not always.
2. Do not be fooled by articles that show prediction curves
perfectly overlapping the true stock prices. This can be replicated with a
simple averaging technique, and in practice it is useless. A more sensible
thing to do is to predict stock price movements.
3. The results you obtain are extremely sensitive to the model's
hyperparameters, so a very good thing to do is to run a
hyperparameter optimization technique (for example, grid search or
random search). The most critical hyperparameters include the
learning rate of the optimizer; the number of layers and the number of
hidden units in each layer; the optimizer itself (I found Adam to perform
best); and the type of model (you can try GRU, standard LSTM, or
LSTM with peepholes and evaluate the performance difference).
4. Here we did something faulty (due to the small size of the data): we
used the test loss to decay the learning rate. This indirectly leaks
information about the test set into the training procedure. A better way of
handling this is to have a separate validation set (apart from the test set)
and to decay the learning rate with respect to the performance of the
validation set.
10.2. REFERENCES:
• file:///C:/Users/Visitor/Desktop/Paper_109.pdf
• https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781119197652.oth1
• http://thestockmarketwatch.com/learn/stocks-basicsconclusion/
• https://www.datacamp.com/community/tutorials/lstm-python-stock-market
• https://link.springer.com/content/pdf/bbm%3A978-81-322-1590-5%2F1.pdf
• https://www.scribd.com/document/110661247/Conclusion
• https://www.pantechsolutions.net/stock-market-prediction-using-machinelearning
• https://www.ijeat.org/wp-content/uploads/papers/v8i4/D6321048419.pdf
• https://markdunne.github.io/public/mark-dunne-stock-market-prediction.pdf
• https://www.investopedia.com/terms/s/stock-analysis.asp