Project Note

This code implements a Long Short-Term Memory (LSTM) neural network model using the Keras library in Python.

1. Data Preparation: The code begins by fetching historical stock data for a specified ticker
symbol using Yahoo Finance API. The data is then preprocessed, including dropping
unnecessary columns and splitting it into training and testing sets.

2. Normalization: Before feeding the data into the neural network, it's normalized using
Min-Max scaling to a range between 0 and 1. This step is crucial for improving the
convergence speed and performance of the neural network.

3. Model Architecture:
The LSTM model is defined using the Sequential API provided by Keras.
Four LSTM layers are stacked sequentially, each followed by a Dropout layer to prevent
overfitting:
The first LSTM layer has 50 units with 'relu' activation function and returns sequences.
The second LSTM layer has 60 units with 'relu' activation function and returns
sequences.
The third LSTM layer has 80 units with 'relu' activation function and returns sequences.
The fourth LSTM layer has 120 units with 'relu' activation function and doesn't return
sequences.
Finally, a Dense layer with one unit is added to produce the output.
Model Compilation: The model is compiled using the Adam optimizer and mean squared
error loss function, which is a common choice for regression problems.

4. Model Training: The model is trained on the training data with a specified number of
epochs.

5. Model Evaluation: After training, the model is evaluated on the testing data to assess its
performance.

6. Prediction and Scaling Back: Once the model is trained and evaluated, predictions are
made on the test data. Before plotting the predictions against the actual values, the
predicted and actual values are scaled back to their original scales using the inverse of
the Min-Max scaling factor.

7. Plotting: Finally, the actual and predicted stock prices are plotted over time to visualize
the performance of the model.

This model aims to predict stock prices based on historical data using LSTM, a type of recurrent
neural network (RNN) architecture well-suited for sequential data like time series. By training on
past stock prices, the model learns patterns and relationships in the data to make predictions
about future stock prices.
Data Preparation:
● Historical stock data is fetched and preprocessed.
● The 'Close' price is selected as the target variable for prediction.
● The data is split into training and testing sets.

Normalization: The data is normalized using Min-Max scaling to a range between 0 and 1. This
is essential for neural networks to perform effectively, as it helps stabilize and speed up the
training process.

Model Architecture:
● The LSTM (Long Short-Term Memory) model architecture is defined using the Keras
library.
● The model consists of four stacked LSTM layers, each followed by a Dropout layer to
prevent overfitting:
● The first LSTM layer has 50 units and returns sequences.
● The second LSTM layer has 60 units and returns sequences.
● The third LSTM layer has 80 units and returns sequences.
● The fourth LSTM layer has 120 units and doesn't return sequences.
● Finally, a Dense layer with one unit is added to produce the output, which predicts the
next day's closing price.

Model Compilation:
● The model is compiled using the Adam optimizer, a popular choice for optimization, and
the mean squared error loss function, which is suitable for regression problems.

Model Training:
● The compiled model is trained on the training data for a specified number of epochs.
● During training, the model learns the patterns and relationships in the historical stock
data to make predictions about future stock prices.

Model Evaluation:
● After training, the model's performance is evaluated on the testing data to assess its
accuracy and generalization capability.

Prediction and Scaling Back:


● Once trained, the model is used to make predictions on the test data.
● Predictions are made one day ahead based on the previous 100 days of data.
● The predicted and actual values are scaled back to their original scales using the inverse
of the Min-Max scaling factor for visualization.

Visualization: The actual and predicted stock prices are plotted over time to visually compare the model's predictions with the true values.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pandas_datareader as data
import yfinance as yf
-Here, necessary libraries are imported:
● numpy and pandas for numerical and data manipulation tasks.
● matplotlib.pyplot for plotting graphs.
● pandas_datareader and yfinance for fetching stock data from Yahoo Finance (only yfinance is actually used in the code below).

ticker_symbol = 'AAPL' # Example: Apple Inc.


stock_data = yf.download(ticker_symbol, start='2015-01-01', end='2024-04-10')
-The ticker symbol for the stock to be analyzed is defined (here, 'AAPL' for Apple Inc.).
Historical stock data for the specified ticker symbol is downloaded from Yahoo Finance,
covering the time period from January 1, 2015, to April 10, 2024.

print(stock_data.head())
-Prints the first few rows of the downloaded stock data to inspect its structure.

stock_data.reset_index(inplace=True)
-Resets the index of the DataFrame, turning the 'Date' column into a regular column.

plt.figure(figsize=(6, 3))
plt.plot(stock_data['Date'], stock_data['Close'], color='blue', linewidth=2)
plt.title('Stock Close Price')
plt.xlabel('Date')
plt.ylabel('Close Price')
plt.grid(True)
plt.show()
-Plots the closing price of the stock over time. Since reset_index turned the dates into a regular 'Date' column, that column (rather than the integer index) is used for the x-axis so that it matches the 'Date' label.

Calculation of Moving Averages: The moving average is a commonly used technical-analysis tool that smooths price data by maintaining a constantly updated average price.
The 100-day moving average (MA100) and 200-day moving average (MA200) are calculated on the closing prices of the stock data (a code sketch follows the bullet points below).
These moving averages provide insight into longer-term trends in the stock price by averaging out short-term fluctuations.

100-day Moving Average (MA100):


● The 100-day moving average is calculated by taking the average of the closing prices
over the previous 100 days.
● This moving average helps smooth out short-term fluctuations in the stock price and
highlights the underlying trend over a medium-term period.
● A rising MA100 indicates that the stock price is generally increasing over the past 100
days, while a falling MA100 suggests a downward trend.

200-day Moving Average (MA200):


● Similarly, the 200-day moving average is calculated by averaging the closing prices over
the previous 200 days.
● This moving average provides a longer-term perspective on the stock price trend and is
often used by investors to identify major shifts in the market sentiment.
● A crossover between the MA100 and MA200, where the MA100 crosses above the
MA200, is often considered a bullish signal indicating a potential uptrend, while a
crossover in the opposite direction may signal a downtrend.
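The note does not show the moving-average code itself; a minimal sketch using pandas' rolling-window mean (the standard way to compute a simple moving average) would be:

# 100-day and 200-day simple moving averages of the closing price
ma100 = stock_data['Close'].rolling(window=100).mean()
ma200 = stock_data['Close'].rolling(window=200).mean()

plt.figure(figsize=(12, 6))
plt.plot(stock_data['Date'], stock_data['Close'], label='Close')
plt.plot(stock_data['Date'], ma100, 'g', label='MA100')
plt.plot(stock_data['Date'], ma200, 'r', label='MA200')
plt.legend()
plt.show()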

data_training = pd.DataFrame(stock_data['Close'][0:int(len(stock_data)*0.70)])
data_testing = pd.DataFrame(stock_data['Close'][int(len(stock_data)*0.70):])
-Splits the data into training and testing sets, where 70% of the data is for training and the
remaining 30% is for testing.

from sklearn.preprocessing import MinMaxScaler


scaler = MinMaxScaler(feature_range=(0,1))
-Imports MinMaxScaler from sklearn.preprocessing to normalize the data between 0 and 1.

MinMaxScaler is a preprocessing technique used in machine learning to scale and normalize features within a specific range, typically between 0 and 1. It is used in this code to normalize the stock price data before feeding it into the LSTM model. Normalization ensures that all input features contribute equally to the training process and prevents features with larger scales from dominating the learning process.
Scaling Stock Prices: Stock prices can vary significantly in magnitude over time, depending on factors such as the company's market capitalization, trading volume, and market sentiment. By using MinMaxScaler, the stock prices are scaled to a range between 0 and 1, ensuring that they are on a consistent scale regardless of the magnitude of the prices.
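For reference, Min-Max scaling maps each value x to (x - min) / (max - min). A hand-rolled equivalent of what MinMaxScaler computes on the training prices (for illustration only) is:

prices = data_training.values
scaled_by_hand = (prices - prices.min()) / (prices.max() - prices.min())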

data_training_array = scaler.fit_transform(data_training)
-Applies Min-Max scaling to the training data.
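The x_train and y_train arrays used when fitting the model below are not shown in the note; they are presumably built as 100-day sliding windows over the scaled training data, mirroring the test-set loop that appears later:

x_train = []
y_train = []
for i in range(100, data_training_array.shape[0]):
    x_train.append(data_training_array[i-100:i])  # previous 100 days as input
    y_train.append(data_training_array[i, 0])     # day i's price as target
x_train, y_train = np.array(x_train), np.array(y_train)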

import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
-Imports necessary libraries from TensorFlow and Keras for building the LSTM model.
Keras:
● Keras is used to define the architecture of the LSTM (Long Short-Term Memory) model.
● Specifically, the Sequential API from Keras is utilized to stack layers sequentially and
build the model.
● The LSTM layers, dropout layers, and dense layers are all defined using the Keras
Sequential API.
● Additionally, Keras is used to compile the model with an optimizer and loss function and
to fit the model to the training data.

TensorFlow:
● TensorFlow serves as the backend for executing the computations defined in the Keras
model.
● While Keras provides a high-level interface for building neural networks, TensorFlow
handles the low-level operations and computations required to train the model.
● TensorFlow efficiently executes the forward and backward passes during training,
updates the model parameters (weights and biases), and computes the loss function
and gradients.
● Essentially, Keras serves as the user-friendly interface for defining the model
architecture and training process, while TensorFlow handles the backend computations
to execute the model efficiently.

In summary, Keras is used to define the LSTM model architecture and compile the model, while
TensorFlow handles the execution of the computations defined in the model. Together, they
provide a powerful framework for building, training, and deploying deep learning models, such
as the LSTM model used for stock price prediction.

model = Sequential()
model.add(LSTM(units=50, activation='relu', return_sequences=True,
input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
# The remaining layers follow the four-layer architecture described above;
# the dropout rates after the first layer are representative choices.
model.add(LSTM(units=60, activation='relu', return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(units=80, activation='relu', return_sequences=True))
model.add(Dropout(0.4))
model.add(LSTM(units=120, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1))
model.summary()
-Defines the LSTM model architecture using the Sequential API from Keras: four stacked LSTM layers (50, 60, 80, and 120 units) with dropout regularization after each, followed by a single-unit Dense output layer.

model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, epochs=50)
-Compiles the model using the Adam optimizer and mean squared error loss function, then
trains the model on the training data for 50 epochs.

model.save('keras_model.h5')
-The file keras_model.h5 is a serialized representation of the trained LSTM model saved in the
Hierarchical Data Format version 5 (HDF5) format.
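When the model is needed again later (for instance in the Streamlit app discussed below), it can be restored with Keras' load_model, the counterpart to model.save:

from keras.models import load_model

model = load_model('keras_model.h5')  # restores architecture, weights, and optimizer state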
past_100_days = data_training.tail(100)
final_df = pd.concat([past_100_days, data_testing], ignore_index=True)
-Selects the last 100 days of the training data and concatenates them with the testing data to
create a new DataFrame for prediction.

input_data = scaler.transform(final_df)
-Normalizes the concatenated DataFrame with the scaler already fitted on the training data. Using transform rather than fit_transform keeps the test data on the training scale and avoids leaking test-set statistics into the preprocessing.

x_test = []
y_test = []
for i in range(100, input_data.shape[0]):
    x_test.append(input_data[i-100:i])
    y_test.append(input_data[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)
-Prepares the testing data by creating sequences of 100 days with their corresponding target values, then converts the lists to NumPy arrays so they can be fed to the model.

y_predicted = model.predict(x_test)
-Makes predictions on the testing data using the trained LSTM model.

scale_factor = 1 / scaler.scale_[0]  # scaler.scale_[0] is the factor MinMaxScaler applied (about 0.00721 here)
y_predicted = y_predicted * scale_factor
y_test = y_test * scale_factor
-Scales the predicted and actual values back to their original price range using the inverse of the Min-Max scaling factor, which the scaler exposes as scaler.scale_.
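As a side note, the same rescaling can be done by letting the scaler invert its own transformation (an alternative to the three lines above, not an addition to them):

y_predicted = scaler.inverse_transform(y_predicted).ravel()
y_test = scaler.inverse_transform(y_test.reshape(-1, 1)).ravel()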

plt.figure(figsize=(12, 6))
plt.plot(y_test, 'b', label='Original Price')
plt.plot(y_predicted, 'r', label='Predicted Price')
plt.xlabel('Time')
plt.ylabel('Price')
plt.legend()
plt.show()
-Plots the actual and predicted stock prices over time for visualization.

Streamlit is an open-source Python library used to create web applications for machine learning
and data science projects. It allows developers to build interactive and customizable web
interfaces directly from Python scripts, without requiring knowledge of web development
languages such as HTML, CSS, or JavaScript.
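The note does not include the app code itself, but a minimal Streamlit sketch wrapping this project might look like the following (the widget choices and layout are assumptions):

import streamlit as st
import yfinance as yf
from keras.models import load_model

st.title('Stock Price Prediction')               # page heading
ticker = st.text_input('Ticker symbol', 'AAPL')  # let the user pick a stock
df = yf.download(ticker, start='2015-01-01', end='2024-04-10')
st.subheader('Closing price history')
st.line_chart(df['Close'])                       # interactive price chart
model = load_model('keras_model.h5')             # the LSTM saved earlier

Running streamlit run app.py then serves this script as an interactive web page.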

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture,
specifically designed to address the limitations of traditional RNNs in capturing and learning
long-term dependencies in sequential data. LSTMs are particularly well-suited for tasks
involving time series data, natural language processing, speech recognition, and other
sequential data domains.

The architecture of a Long Short-Term Memory (LSTM) network consists of multiple memory
cells organized in layers, with each cell containing specialized gates to control the flow of
information. Here's a detailed explanation of the LSTM architecture:

1. **Input Layer**:
- The input layer receives sequential input data, such as time series data, text, or audio
samples.
- Each input sequence is represented as a sequence of feature vectors, which are fed into the
LSTM network.

2. **Memory Cells**:
- The core components of the LSTM architecture are memory cells, which maintain a memory
state over time and selectively update or forget information.
- Each memory cell contains a cell state (also known as the memory cell state) and various
gates to control the flow of information.

3. **Cell State (Memory Cell State)**:


- The cell state is the internal memory of the LSTM cell, which stores information over multiple
time steps.
- It can be thought of as the long-term memory of the network, as it can retain information over
long periods of time.

4. **Gates**:
- LSTMs use specialized gates to regulate the flow of information into and out of the memory cells (the standard update equations are collected after this list). The three main gates are:
- **Input Gate**: Controls the flow of new input information into the cell state.
- **Forget Gate**: Controls the flow of information from the previous cell state, determining
what information to discard.
- **Output Gate**: Controls the flow of information from the current cell state to the output.

5. **Input Gate**:
- The input gate determines how much new information from the current input should be
stored in the cell state.
- It takes the current input and the previous hidden state as input and outputs a value between
0 and 1, indicating how much of the new information to keep.

6. **Forget Gate**:
- The forget gate decides which information from the previous cell state should be retained or
discarded.
- It takes the current input and the previous hidden state as input and outputs a value between
0 and 1 for each element in the cell state, indicating how much to forget.
7. **Output Gate**:
- The output gate controls how much of the current cell state should be used to compute the
output of the LSTM.
- It takes the current input and the previous hidden state as input and outputs a value between
0 and 1, determining the amount of information to pass to the output.

8. **Hidden State**:
- In addition to the cell state, each LSTM cell also maintains a hidden state, which is used to
carry information across time steps and to compute the output of the network.
- The hidden state serves as the short-term memory of the network, capturing relevant
information for the current time step.

9. **Output Layer**:
- The output layer of the LSTM network processes the final hidden states to produce the
output of the network, which could be a single prediction value, a sequence of predictions, or a
probability distribution over classes, depending on the task.
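
The gate descriptions above can be written compactly with the standard LSTM update equations, where σ is the sigmoid function, x_t the input at time t, h_{t-1} the previous hidden state, c_t the cell state, and * element-wise multiplication:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)        (forget gate)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)        (input gate)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)     (candidate cell state)
c_t = f_t * c_{t-1} + i_t * c̃_t            (cell state update)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)        (output gate)
h_t = o_t * tanh(c_t)                      (hidden state)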

Overall, the LSTM architecture enables the network to capture long-term dependencies in
sequential data by selectively updating and maintaining information over time. This makes
LSTMs particularly effective for tasks such as time series forecasting, natural language
processing, and speech recognition.
