Project Note
1. Data Preparation: The code begins by fetching historical stock data for a specified ticker
symbol from Yahoo Finance via the yfinance library. The data is then preprocessed:
unnecessary columns are dropped, and the series is split into training and testing sets.
2. Normalization: Before feeding the data into the neural network, it's normalized using
Min-Max scaling to a range between 0 and 1. This step is crucial for improving the
convergence speed and performance of the neural network.
3. Model Architecture:
The LSTM model is defined using the Sequential API provided by Keras.
Four LSTM layers are stacked sequentially, each followed by a Dropout layer to prevent
overfitting:
The first LSTM layer has 50 units with a 'relu' activation and returns sequences.
The second LSTM layer has 60 units with a 'relu' activation and returns sequences.
The third LSTM layer has 80 units with a 'relu' activation and returns sequences.
The fourth LSTM layer has 120 units with a 'relu' activation and does not return sequences.
Finally, a Dense layer with one unit is added to produce the output.
4. Model Compilation: The model is compiled using the Adam optimizer and the mean
squared error loss function, a common choice for regression problems.
5. Model Training: The model is trained on the training data for a specified number of
epochs.
6. Model Evaluation: After training, the model is evaluated on the testing data to assess its
performance.
7. Prediction and Scaling Back: Once the model is trained and evaluated, predictions are
made on the test data. Before plotting the predictions against the actual values, the
predicted and actual values are scaled back to their original scales using the inverse of
the Min-Max scaling factor.
8. Plotting: Finally, the actual and predicted stock prices are plotted over time to visualize
the performance of the model.
This model aims to predict stock prices based on historical data using LSTM, a type of recurrent
neural network (RNN) architecture well-suited for sequential data like time series. By training on
past stock prices, the model learns patterns and relationships in the data to make predictions
about future stock prices.
Data Preparation:
● Historical stock data is fetched and preprocessed.
● The 'Close' price is selected as the target variable for prediction.
● The data is split into training and testing sets.
Normalization: The data is normalized using Min-Max scaling to a range between 0 and 1. This
is essential for neural networks to perform effectively, as it helps stabilize and speed up the
training process.
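Concretely, Min-Max scaling maps each value x to x' = (x - x_min) / (x_max - x_min), so the
smallest value in the fitted data becomes 0 and the largest becomes 1.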
Model Architecture:
● The LSTM (Long Short-Term Memory) model architecture is defined using the Keras
library.
● The model consists of four stacked LSTM layers, each followed by a Dropout layer to
prevent overfitting:
● The first LSTM layer has 50 units and returns sequences.
● The second LSTM layer has 60 units and returns sequences.
● The third LSTM layer has 80 units and returns sequences.
● The fourth LSTM layer has 120 units and doesn't return sequences.
● Finally, a Dense layer with one unit is added to produce the output, which predicts the
next day's closing price.
Model Compilation:
● The model is compiled using the Adam optimizer, a popular choice for optimization, and
the mean squared error loss function, which is suitable for regression problems.
Model Training:
● The compiled model is trained on the training data for a specified number of epochs.
● During training, the model learns the patterns and relationships in the historical stock
data to make predictions about future stock prices.
Model Evaluation:
● After training, the model's performance is evaluated on the testing data to assess its
accuracy and generalization capability.
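The later code assesses performance visually; an explicit metric could also be computed. A
minimal sketch (assuming x_test and y_test are built as shown further below):
test_mse = model.evaluate(x_test, y_test)  # mean squared error on the scaled test sequences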
Visualization:
● The actual and predicted stock prices are plotted over time to visually compare the
model's predictions with the true values.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pandas_datareader as data
import yfinance as yf
-Here, necessary libraries are imported:
● numpy and pandas for numerical and data manipulation tasks.
● matplotlib.pyplot for plotting graphs.
● pandas_datareader and yfinance for fetching stock data from Yahoo Finance.
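The download call itself is not shown in the note; a minimal sketch using yfinance (the ticker
symbol and date range here are placeholders) would be:
# Fetch daily price history for a placeholder ticker and date range
stock_data = yf.download('AAPL', start='2010-01-01', end='2023-12-31')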
print(stock_data.head())
-Prints the first few rows of the downloaded stock data to inspect its structure.
stock_data.reset_index(inplace=True)
-Resets the index of the DataFrame, turning the 'Date' column into a regular column.
plt.figure(figsize=(6, 3))
plt.plot(stock_data.index, stock_data['Close'], color='blue', linewidth=2)
plt.title('Stock Close Price')
plt.xlabel('Date')
plt.ylabel('Close Price')
plt.grid(True)
plt.show()
-Plots the closing price of the stock over time using matplotlib.
Calculation of Moving Averages: The moving average is a commonly used technical analysis
tool to smooth out price data by creating a constantly updated average price.
The 100-day moving average (MA100) and 200-day moving average (MA200) are calculated for
the closing prices of the stock data.
These moving averages provide insights into the longer-term trends in the stock price by
averaging out short-term fluctuations.
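The moving-average code is not included in the note; a minimal sketch using pandas rolling
windows would be:
# 100-day and 200-day simple moving averages of the closing price
ma100 = stock_data['Close'].rolling(window=100).mean()
ma200 = stock_data['Close'].rolling(window=200).mean()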
data_training = pd.DataFrame(stock_data['Close'][0:int(len(stock_data)*0.70)])
data_testing = pd.DataFrame(stock_data['Close'][int(len(stock_data)*0.70): int(len(stock_data))])
-Splits the data into training and testing sets, where 70% of the data is for training and the
remaining 30% is for testing.
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
data_training_array = scaler.fit_transform(data_training)
-Creates the Min-Max scaler (scikit-learn's MinMaxScaler, scaled to the range 0 to 1) and
applies it to the training data.
import os
os.environ['KERAS_BACKEND'] = 'tensorflow'  # must be set before Keras is imported
import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
-Imports necessary libraries from TensorFlow and Keras for building the LSTM model.
Keras:
● Keras is used to define the architecture of the LSTM (Long Short-Term Memory) model.
● Specifically, the Sequential API from Keras is utilized to stack layers sequentially and
build the model.
● The LSTM layers, dropout layers, and dense layers are all defined using the Keras
Sequential API.
● Additionally, Keras is used to compile the model with an optimizer and loss function and
to fit the model to the training data.
TensorFlow:
● TensorFlow serves as the backend for executing the computations defined in the Keras
model.
● While Keras provides a high-level interface for building neural networks, TensorFlow
handles the low-level operations and computations required to train the model.
● TensorFlow efficiently executes the forward and backward passes during training,
updates the model parameters (weights and biases), and computes the loss function
and gradients.
● Essentially, Keras serves as the user-friendly interface for defining the model
architecture and training process, while TensorFlow handles the backend computations
to execute the model efficiently.
In summary, Keras is used to define the LSTM model architecture and compile the model, while
TensorFlow handles the execution of the computations defined in the model. Together, they
provide a powerful framework for building, training, and deploying deep learning models, such
as the LSTM model used for stock price prediction.
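The construction of x_train and y_train is not shown in the note; by analogy with the test-set
loop further below, a sketch would be:
# Build sliding windows of 100 scaled closing prices, each paired with the next day's value
x_train = []
y_train = []
for i in range(100, data_training_array.shape[0]):
    x_train.append(data_training_array[i-100:i])
    y_train.append(data_training_array[i, 0])
x_train, y_train = np.array(x_train), np.array(y_train)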
model = Sequential()
model.add(LSTM(units=50, activation='relu', return_sequences=True,
               input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=60, activation='relu', return_sequences=True))
model.add(Dropout(0.3))  # dropout rates after the first layer are not stated in the note; assumed
model.add(LSTM(units=80, activation='relu', return_sequences=True))
model.add(Dropout(0.4))
model.add(LSTM(units=120, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1))
model.summary()
-Defines the LSTM model architecture using the Sequential API from Keras. The architecture
includes four LSTM layers with dropout regularization.
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, epochs=50)
-Compiles the model using the Adam optimizer and mean squared error loss function, then
trains the model on the training data for 50 epochs.
model.save('keras_model.h5')
-The file keras_model.h5 is a serialized representation of the trained LSTM model saved in the
Hierarchical Data Format version 5 (HDF5) format.
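The saved model can later be restored without retraining, for example:
from keras.models import load_model
# Reload the trained model from disk
model = load_model('keras_model.h5')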
past_100_days = data_training.tail(100)
final_df = pd.concat([past_100_days, data_testing], ignore_index=True)
-Selects the last 100 days of the training data and concatenates them with the testing data to
create a new DataFrame for prediction.
input_data = scaler.transform(final_df)
-Normalizes the concatenated DataFrame with the Min-Max scaler. Using transform (rather
than fit_transform) reuses the scaling fitted on the training data; re-fitting here would leak
information from the test set into the scaling.
x_test = []
y_test = []
for i in range(100, input_data.shape[0]):
    x_test.append(input_data[i-100:i])
    y_test.append(input_data[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)
-Prepares the testing data by creating sequences of 100 days with their corresponding target
values, then converts both lists to NumPy arrays so they can be fed to the model.
y_predicted = model.predict(x_test)
-Makes predictions on the testing data using the trained LSTM model.
scale_factor = 1 / scaler.scale_[0]
y_predicted = y_predicted * scale_factor
y_test = y_test * scale_factor
-Scales the predicted and actual values back to their original range. scaler.scale_ holds the
factor the Min-Max scaler applied, so dividing by it inverts the transformation
(scaler.inverse_transform would accomplish the same).
plt.figure(figsize=(12, 6))
plt.plot(y_test, 'b', label='Original Price')
plt.plot(y_predicted, 'r', label='Predicted Price')
plt.xlabel('Time')
plt.ylabel('Price')
plt.legend()
plt.show()
-Plots the actual and predicted stock prices over time for visualization.
Streamlit is an open-source Python library used to create web applications for machine learning
and data science projects. It allows developers to build interactive and customizable web
interfaces directly from Python scripts, without requiring knowledge of web development
languages such as HTML, CSS, or JavaScript.
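The app code itself is not included in the note; a minimal sketch of such a front end (widget
labels and defaults are illustrative) might look like:
import streamlit as st
from keras.models import load_model

st.title('Stock Price Prediction')
ticker = st.text_input('Enter a ticker symbol', 'AAPL')  # illustrative default
model = load_model('keras_model.h5')  # the model saved earlier in the note
# ...fetch data for `ticker`, build 100-day windows, predict, and plot as above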
Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture,
specifically designed to address the limitations of traditional RNNs in capturing and learning
long-term dependencies in sequential data. LSTMs are particularly well-suited for tasks
involving time series data, natural language processing, speech recognition, and other
sequential data domains.
The architecture of a Long Short-Term Memory (LSTM) network consists of multiple memory
cells organized in layers, with each cell containing specialized gates to control the flow of
information. Here's a detailed explanation of the LSTM architecture:
1. **Input Layer**:
- The input layer receives sequential input data, such as time series data, text, or audio
samples.
- Each input sequence is represented as a sequence of feature vectors, which are fed into the
LSTM network.
2. **Memory Cells**:
- The core components of the LSTM architecture are memory cells, which maintain a memory
state over time and selectively update or forget information.
- Each memory cell contains a cell state (also known as the memory cell state) and various
gates to control the flow of information.
3. **Gates**:
- LSTMs use specialized gates to regulate the flow of information into and out of the memory
cells; the standard update equations are given after this list. The three main gates are:
- **Input Gate**: Controls the flow of new input information into the cell state.
- **Forget Gate**: Controls the flow of information from the previous cell state, determining
what information to discard.
- **Output Gate**: Controls the flow of information from the current cell state to the output.
4. **Input Gate**:
- The input gate determines how much new information from the current input should be
stored in the cell state.
- It takes the current input and the previous hidden state as input and outputs a value between
0 and 1, indicating how much of the new information to keep.
5. **Forget Gate**:
- The forget gate decides which information from the previous cell state should be retained or
discarded.
- It takes the current input and the previous hidden state as input and outputs a value between
0 and 1 for each element in the cell state, indicating how much to forget.
6. **Output Gate**:
- The output gate controls how much of the current cell state should be used to compute the
output of the LSTM.
- It takes the current input and the previous hidden state as input and outputs a value between
0 and 1, determining the amount of information to pass to the output.
7. **Hidden State**:
- In addition to the cell state, each LSTM cell also maintains a hidden state, which is used to
carry information across time steps and to compute the output of the network.
- The hidden state serves as the short-term memory of the network, capturing relevant
information for the current time step.
8. **Output Layer**:
- The output layer of the LSTM network processes the final hidden states to produce the
output of the network, which could be a single prediction value, a sequence of predictions, or a
probability distribution over classes, depending on the task.
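For reference, the gate operations above can be written as the standard LSTM update
equations, where $\sigma$ is the sigmoid function, $\odot$ is element-wise multiplication,
$x_t$ is the current input, $h_{t-1}$ the previous hidden state, and $c_{t-1}$ the previous
cell state:
$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$ (forget gate)
$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$ (input gate)
$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$ (candidate cell state)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$ (cell state update)
$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$ (output gate)
$h_t = o_t \odot \tanh(c_t)$ (hidden state / output)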
Overall, the LSTM architecture enables the network to capture long-term dependencies in
sequential data by selectively updating and maintaining information over time. This makes
LSTMs particularly effective for tasks such as time series forecasting, natural language
processing, and speech recognition.