Final Rev
neural networks
Bachelor of Technology
in
Programme
by
Sashank Rijal
19BCE2484
May, 2023
DECLARATION
Place : Vellore
Date : 19/05/2023
Signature of the Candidate
CERTIFICATE
The contents of this report have not been submitted and will not be submitted
either in part or in full, for the award of any other degree or diploma in this institute or
any other institute or university. The thesis fulfills the requirements and regulations of
the University and in my opinion meets the necessary standards for submission.
Place : Vellore
Acknowledgement
Executive Summary
List of Figures
Abbreviations
1. INTRODUCTION
1.2 Motivation
2. Literature Survey
2.1. Survey of the Existing Models/Work
2.2. Summary/Gaps identified in the Survey
3. Overview of the Proposed System
3.1. Introduction and Related Concepts
3.2. Framework, Architecture or Module for the Proposed System
3.3. Proposed System Model (ER Diagram/UML Diagram/Mathematical Modeling)
4. Proposed System Analysis and Design
4.1. Introduction
4.2. Requirement Analysis
4.2.1. Functional Requirements
4.2.1.1. Product Perspective
4.2.1.2. Product Features
4.2.1.3. User Characteristics
4.2.1.4. Assumptions & Dependencies
4.2.1.5. Domain Requirements
4.2.1.6. User Requirements
4.2.2. Non-Functional Requirements
4.2.2.1. Product Requirements
4.2.2.1.1. Efficiency (in terms of Time and Space)
4.2.2.1.2. Reliability
4.2.2.1.3. Portability
4.2.2.1.4. Usability
4.2.2.2. Organizational Requirements
4.2.2.2.1. Implementation Requirements (in terms of deployment)
4.2.2.2.2. Engineering Standard Requirements
4.2.2.3. Operational Requirements
Economic
Environmental
Social
Political
Ethical
Health and Safety
Sustainability
Legality
Inspectability
4.2.3. System Requirements
4.2.3.1. H/W Requirements
4.2.3.2. S/W Requirements
5. Results and Discussion
6. References
APPENDIX A
List of Figures
1. INTRODUCTION
1.1 Theoretical Background
Digital currencies have become a prominent asset class in today's financial markets. Predictions
made through traditional means such as technical analysis are less reliable for them than for
conventional financial markets.
The project's theoretical background rests on two key fields: neural networks and market
analysis of cryptocurrencies. A cryptocurrency is a digital or virtual currency that uses
cryptography for security. It is not controlled by a central bank and can be moved directly between
people or companies. Their volatility makes cryptocurrencies an intriguing topic for market study
and forecasting. Neural networks, on the other hand, are a group of machine learning algorithms
loosely based on how the human brain operates and is structured. They are especially appropriate
for tasks requiring prediction since they are capable of learning intricate patterns and
relationships from data.
Neural networks that can retain information over long periods of time, such as long short-term
memory (LSTM) networks, gated recurrent units (GRU), or custom variants based on these
architectures, can help to better predict and visualize the future of digital markets.
1.2 Motivation
The project's motivation is to create and analyze several neural network models, which are then
loaded into a web application that can assist traders and investors in the cryptocurrency market
in making informed choices. The application helps users decide when to buy or sell a specific
cryptocurrency by analyzing previous market data and forecasting future price patterns with
these models. This may give users greater insight into the cryptocurrency market, which could in
turn lead to improved investment results.
1.3 Aim of the Proposed Work
The project's goal is to create several deep learning models that predict the prices of
cryptocurrencies such as Bitcoin and Ethereum from their historical market metrics, and to load
them into a web application that uses these neural networks to predict cryptocurrency prices in
real time. The software must be simple to use and accessible to both new and seasoned traders and
investors. The objectives are to:
- Develop various deep learning neural networks that can accurately predict the prices
of cryptocurrencies.
- Compare and analyze the models against each other and reach a definitive
conclusion on which kind of model is most suitable for this type of prediction in
general.
- Develop a web app that can display real-time cryptocurrency prices and predictions.
- Integrate the neural network model into the web app and test its performance.
Overall, the project aims to leverage the power of neural networks to provide users with valuable
insights into the cryptocurrency market.
2. Literature Survey
2.1 Survey of existing models
There have been several studies on using neural networks for predicting the prices of
cryptocurrencies. Here is a literature survey of some existing models:
1. "Bitcoin Price Prediction Using Deep Learning" by N. Atakan and S. Utku, 2019
The authors proposed a deep learning model based on a convolutional neural network (CNN) to
predict Bitcoin prices. The model achieved an error of approximately 10% and was able to
predict the trend of Bitcoin prices.
The authors used a deep learning model based on long short-term memory (LSTM) networks to
predict the prices of five cryptocurrencies. The model achieved an error of approximately 10% for
all five cryptocurrencies.
The authors used a stacked autoencoder and LSTM neural network to predict the prices of
Bitcoin, Ethereum, and Litecoin. The model achieved an error of 12% for Bitcoin, 13% for
Ethereum, and 11% for Litecoin.
The authors used a machine learning approach based on the XGBoost algorithm to predict the
prices of Bitcoin. The model achieved an error of 11% for predicting the direction of price
movements.
5. "A LSTM-Method for Bitcoin Price Prediction: A Case Study Finance Stock Market" by
Ferdiansyah, Siti Hajar Othman, 2019
LSTM is a type of module for recurrent neural networks (RNNs) that was later developed further
and popularized by many researchers. Like a standard RNN, an LSTM network consists of modules
connected recurrently. The goal of this study is to build a prediction model for the Bitcoin
market using LSTM. Using Bitcoin price data from Yahoo Finance, the described methodology
forecasts a price above USD 12,600 for the days following the forecast date. The time series model
is trained and tested on split data and can forecast the price for the upcoming days.
6. "Research on Stock Price Time Series Prediction Based on Deep Learning and
Autoregressive Integrated Moving Average" by Daiyou Xiao and Jinxia Su, 2022
In this paper, the researchers used conventional models for the linear part of the forecasting
problem and machine learning models for the non-linear part. First, stock samples from the
New York Stock Exchange from 2010 to 2019 were gathered. Next, the stock price and stock
price sub-correlation were trained and predicted using the ARIMA model and the LSTM neural
network model. The analysis concluded that the LSTM model predicted better than ARIMA, and
that the ARIMA-LSTM ensemble model greatly outperformed the other benchmark techniques.
The authors present the approach as theoretical backing and a method reference for investors
interested in stock trading on the Chinese stock market.
Researchers have used a variety of deep learning and machine learning methods, including
long short-term memory (LSTM), neural networks, and gated recurrent units (GRU), to forecast
cryptocurrency prices and examine the factors influencing them. This paper proposed a hybrid
cryptocurrency prediction system based on LSTM and GRU that considers only Litecoin and
Monero. The findings show that the suggested method predicts prices with a high degree of
accuracy, indicating that it may be applicable to price forecasts for a variety of
cryptocurrencies. The prediction errors show that the suggested system performs better than
a plain LSTM network.
Here, the researchers use deep learning for a time series analysis in order to investigate the
volatility of cryptocurrency prices and understand this behavior. A long short-term memory model
is applied to identify patterns in cryptocurrency close prices and forecast future prices. The
suggested model learns from neighboring values. Its performance is assessed using the
root-mean-squared error and a comparison with an ARIMA model. The managerial implications of
the findings include the potential for developing a product for investors, who could extend the
model by including more hyperparameters to produce an even more precise forecast of
cryptocurrency prices.
9. "Cryptocurrency Price Prediction and Trading Strategies Using Support Vector
Machines" by David Zhao, Alessandro Rinaldo, 2019
Here, a wide variety of technical indicators was constructed to capture patterns in the
cryptocurrency market using historical data from July 2015 to November 2019. Different
classification techniques were then tested to predict short-term price fluctuations from these
data. The classifiers predicted upward and downward market movements over the ensuing hour
well in terms of both positive predictive value (PPV) and negative predictive value (NPV). The
authors go beyond analyzing classification accuracy by developing a system for converting
1-hour-ahead class predictions into trading choices and a backtester that simulates trading in a
real-world setting. Over the 22 months from January 2018, trading techniques that use support
vector machines outperformed the market on average for Bitcoin, Ethereum, and Litecoin.
This study used a deep learning model based on a Long Short-Term Memory (LSTM) network to
predict the price of Bitcoin. The model was trained on historical Bitcoin price data and achieved
a mean squared error (MSE) of 990 on the test data.
11. "Bitcoin price prediction using machine learning: An approach to sample dimension
engineering" by G. Pandey et al.
This study used a Support Vector Machine (SVM) and a Random Forest (RF) algorithm to
predict the price of Bitcoin. The authors also proposed a new approach to sample dimension
engineering to improve the performance of the models. The results showed that the RF model
performed better than the SVM model.
12. "Bitcoin price prediction using neural networks" by V. T. H. Nguyen and S. J. Kim
This study used a neural network model based on a Radial Basis Function (RBF) network to
predict the price of Bitcoin. The model was trained on historical Bitcoin price data and achieved
a mean absolute error (MAE) of 53.09 on the test data.
This study used a deep learning model based on a Convolutional Neural Network (CNN) to
predict the price of various cryptocurrencies. The authors also proposed a new loss function to
improve the performance of the model. The results showed that the model performed well in
predicting the prices of cryptocurrencies.
This study used a deep learning model based on a Multilayer Perceptron (MLP) network to
predict the prices of various cryptocurrencies. The authors also proposed a new method for data
normalization to improve the performance of the model. The results showed that the model
performed well in predicting the prices of cryptocurrencies.
2.2 Summary/ Gaps identified in the Survey
Several studies have used neural networks for cryptocurrency price prediction, with LSTM-based
models being a popular choice. Hybrid models, deep reinforcement learning, ensemble models,
and other neural networks have also been explored. The prediction error varies depending on
the approach used and the quality of the data. The gaps identified in the survey include:
- While many studies have explored the use of neural networks for cryptocurrency price
prediction, there is still room for improvement in reducing the error percentage and in
generalizing to different cryptocurrencies.
- Some studies do not provide enough information on the evaluation metrics used to
assess the performance of their models, making it difficult to compare their results with
other studies and with this project.
- There is a need for more research on the use of ensemble models and Bayesian neural
networks for cryptocurrency price prediction.
- Most studies do not consider the impact of market dynamics, such as news and social
media sentiment, on cryptocurrency prices.
- Some models are not readily accessible to the public or require a significant amount of
technical expertise to use.
Overall, there are many different approaches for predicting cryptocurrency prices using neural
networks, and each approach has its strengths and weaknesses. The choice of model will depend
on the specific requirements of the web app and the available data.
Addressing these gaps in the literature can improve the accuracy and usability of cryptocurrency
price prediction models, making them more valuable for investors and traders. The proposed
project can address some of these gaps by providing a web app that can predict the prices of
various cryptocurrencies and by comparing the various models that have been considered for this
project.
3. Overview of the Proposed System
The proposed system is a web application that uses neural networks to predict the prices of
various cryptocurrencies in real-time. The system aims to assist investors and traders in making
informed decisions by providing them with accurate and timely price predictions.
The system will be designed to collect and preprocess historical data for various cryptocurrencies
from reliable sources. This data will be used to train and optimize a neural network model that
can accurately predict the prices of cryptocurrencies. The web application will provide users
with a user-friendly interface to access cryptocurrency predictions. Users will be able to select
the cryptocurrency they are interested in and view the prediction of its future price produced by
the neural network model.
To ensure the accuracy and reliability of the system, it will be continuously monitored and
evaluated. The performance of the neural network model will be assessed using appropriate
evaluation metrics, and the system will be updated as necessary to ensure its effectiveness.
3.2 Architecture for the Proposed System:
Description:
• The necessary libraries, such as NumPy, pandas, matplotlib, scikit-learn, and TensorFlow,
are imported.
• The data is restructured by converting the closing price into windows and horizons and
removing unnecessary columns.
• The data is visualized using various plots, such as line and scatter plots, and the
dataframe is constructed accordingly.
• Different machine learning models, such as LSTM, Conv1D, a custom N-BEATS model, and
ensemble models, are trained and evaluated on the restructured data, using data pipelining
in some cases.
• Callbacks and early stopping are used to train the models efficiently and avoid excessive
training time.
• Hyperparameter tuning and learning rate optimization with the Adam optimizer are carried
out by comparing the error metrics of various instances of the same model.
• The predictions are plotted against the test data to analyze the error of each model.
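The restructuring of the closing price into windows and horizons can be sketched as follows. This is a minimal illustration and not the project's actual preprocessing code: the function name `make_windows`, the window size of 7, the horizon of 1, and the toy series are all assumptions chosen for the example.

```python
import numpy as np

def make_windows(prices, window_size=7, horizon=1):
    """Split a 1-D price series into (window, horizon) pairs.

    Each window holds `window_size` consecutive past prices and each horizon
    holds the `horizon` prices that immediately follow that window.
    """
    prices = np.asarray(prices, dtype=float)
    n_samples = len(prices) - window_size - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window and horizon")
    # Row j holds the indices j, j+1, ..., j + window_size + horizon - 1.
    idx = np.arange(window_size + horizon)[None, :] + np.arange(n_samples)[:, None]
    full = prices[idx]
    return full[:, :window_size], full[:, window_size:]

# Toy stand-in for a closing-price column (hypothetical data).
closing = np.arange(10.0)
windows, horizons = make_windows(closing, window_size=7, horizon=1)
print(windows.shape, horizons.shape)
```

With ten prices, a window of 7, and a horizon of 1, three training pairs are produced: the first window covers prices 0 to 6 and its target is price 7.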
Long Short-Term Memory (LSTM):
Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture that
is widely used for sequential data analysis and prediction. It is particularly effective at
capturing long-term dependencies and at mitigating the vanishing gradient problem that can occur
in traditional RNNs.
The mathematical modeling of an LSTM network involves several components and equations.
Consider a single LSTM cell at time step t, with current input x_t, previous hidden state
h_{t-1}, and previous cell state C_{t-1}:
The input gate determines how much of the incoming information should be stored in the cell
state. It takes into account the current input (x_t) and the previous hidden state (h_{t-1}). The
input gate value is calculated using the sigmoid activation function:
i_t = sigmoid(W_{xi} * x_t + W_{hi} * h_{t-1} + b_i)
The forget gate determines how much of the previous cell state should be forgotten. It considers
the current input (x_t) and the previous hidden state (h_{t-1}). The forget gate value is
calculated using the sigmoid activation function:
f_t = sigmoid(W_{xf} * x_t + W_{hf} * h_{t-1} + b_f)
The cell state update computes a new candidate value for the cell state. It takes into account the
current input (x_t) and the previous hidden state (h_{t-1}). The candidate value is calculated
using the hyperbolic tangent (tanh) activation function:
\tilde{C}_t = tanh(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c)
The cell state (C_t) is updated by combining the forget gate and the cell state update using
element-wise multiplication (written ⊙) and addition:
C_t = f_t ⊙ C_{t-1} + i_t ⊙ \tilde{C}_t
The output gate determines the amount of information to be outputted from the cell state. It
considers the current input (x_t) and the previous hidden state (h_{t-1}). The output gate value
is calculated using the sigmoid activation function:
o_t = sigmoid(W_{xo} * x_t + W_{ho} * h_{t-1} + b_o)
The hidden state (h_t) is computed by applying the hyperbolic tangent activation function to the
updated cell state (C_t) and multiplying it element-wise by the output gate:
h_t = o_t ⊙ tanh(C_t)
These equations govern the flow of information through the LSTM cell, allowing it to retain and
update information over long sequences. The network can be trained by adjusting the weights
(W) and biases (b) through backpropagation and gradient descent methods, minimizing a loss
function between the predicted outputs and the target outputs.
By stacking multiple LSTM cells in a recurrent manner, deeper LSTM architectures can be
constructed, allowing for more complex and accurate predictions in various applications,
including cryptocurrency price prediction.
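The gate computations described for a single LSTM cell can be sketched in NumPy as shown below. This is an illustrative forward pass only, under stated assumptions: the parameter layout (dictionaries keyed by gate name), the hidden size of 4, the random untrained weights, and the toy inputs are all hypothetical, and in practice a framework layer such as the TensorFlow LSTM layer would be used instead.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a single LSTM cell.

    params["W_x"][g], params["W_h"][g] and params["b"][g] hold the input
    weights, recurrent weights and bias of gate g, where g is one of
    "i" (input), "f" (forget), "c" (candidate) and "o" (output).
    """
    W_x, W_h, b = params["W_x"], params["W_h"], params["b"]
    i_t = sigmoid(W_x["i"] @ x_t + W_h["i"] @ h_prev + b["i"])    # input gate
    f_t = sigmoid(W_x["f"] @ x_t + W_h["f"] @ h_prev + b["f"])    # forget gate
    c_hat = np.tanh(W_x["c"] @ x_t + W_h["c"] @ h_prev + b["c"])  # candidate value
    o_t = sigmoid(W_x["o"] @ x_t + W_h["o"] @ h_prev + b["o"])    # output gate
    c_t = f_t * c_prev + i_t * c_hat   # element-wise cell state update
    h_t = o_t * np.tanh(c_t)           # new hidden state
    return h_t, c_t

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
params = {
    "W_x": {g: rng.normal(scale=0.1, size=(n_hidden, n_in)) for g in "ifco"},
    "W_h": {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for g in "ifco"},
    "b": {g: np.zeros(n_hidden) for g in "ifco"},
}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for price in [0.1, 0.2, 0.3]:          # toy scaled prices
    h, c = lstm_step(np.array([price]), h, c, params)
print(h.shape, c.shape)
```

Because the output gate lies strictly between 0 and 1 and tanh lies strictly between -1 and 1, every component of the hidden state stays inside (-1, 1), which is one reason price inputs are usually scaled before training.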
Convolutional 1D:
Convolutional 1D is a type of neural network layer commonly used for analyzing sequential
data, such as time series or one-dimensional signals. It applies one-dimensional convolutional
filters to extract local patterns and capture relevant features from the input data. The
mathematical modeling of Conv1D involves the following steps:
- Input:
Let's assume we have a one-dimensional input signal with N data points represented as x = [x[0],
x[1], ..., x[N-1]].
- Convolutional Filters:
Conv1D uses multiple filters (kernels) to extract different features from the input. Each filter has
a width (kernel_size) and a set of learnable weights (kernel), which are convolved with the input
signal.
- Convolution Operation:
The convolution operation involves sliding each filter across the input signal and performing
element-wise multiplication followed by summation. This process generates a new feature map.
- Output:
The outputs of the convolutional layer are the feature maps produced by each filter. These feature
maps capture local patterns and relevant information from the input signal. The feature map of
the i-th filter is calculated as:
z[i] = x * kernel[i] + bias[i]
where * denotes the convolution operation, kernel[i] represents the weights of the i-th filter,
bias[i] is the bias term for the i-th filter, and z[i] is the output feature map for the i-th filter.
An activation function, such as ReLU, is typically applied to each feature map. After the
convolution and activation steps, the outputs of all filters are concatenated or pooled to
form a feature vector that can be fed into subsequent layers for further processing or prediction
tasks. Conv1D layers can be stacked, and the entire network can be trained using
backpropagation and optimization algorithms to learn the optimal filter weights and biases that
capture relevant features for the specific task, in this case cryptocurrency price
prediction.
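The sliding-filter operation described above can be sketched with plain NumPy. The example is only a sketch under assumptions: the function name `conv1d`, the two hand-picked kernels, and the ReLU activation are chosen for illustration, no padding is used, and the kernel is not flipped, which is the convention deep learning frameworks generally follow for their convolution layers.

```python
import numpy as np

def conv1d(x, kernels, biases):
    """Apply several 1-D filters to a signal, then a ReLU activation.

    x:       input signal of shape (N,)
    kernels: filter weights of shape (n_filters, kernel_size)
    biases:  bias terms of shape (n_filters,)
    Returns feature maps of shape (n_filters, N - kernel_size + 1).
    """
    x = np.asarray(x, dtype=float)
    n_filters, kernel_size = kernels.shape
    n_out = len(x) - kernel_size + 1
    z = np.empty((n_filters, n_out))
    for i in range(n_filters):
        for t in range(n_out):
            # Element-wise multiplication followed by summation at position t.
            z[i, t] = np.dot(x[t:t + kernel_size], kernels[i]) + biases[i]
    return np.maximum(z, 0.0)  # ReLU activation

x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])   # toy price series
kernels = np.array([[-1.0, 1.0],           # responds to price increases
                    [0.5, 0.5]])           # two-point moving average
biases = np.zeros(2)
features = conv1d(x, kernels, biases)
print(features)
```

The first filter outputs the step-to-step differences (1, 2, 3, 4) and the second outputs the two-point averages (1.5, 3, 5.5, 9), showing how different kernels extract different local features from the same signal.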
N-BEATS:
The N-BEATS (Neural Basis Expansion Analysis for Interpretable Time Series Forecasting)
algorithm is a deep learning-based approach to time series forecasting. It introduces a flexible
framework that decomposes the forecasting task into a set of basis functions and combines them
to generate predictions. The mathematical modeling of the N-BEATS algorithm
involves the following steps:
- Input and basis functions:
Let's assume we have a time series dataset with N data points represented as x = [x[0], x[1], ...,
x[N-1]]. The N-BEATS algorithm decomposes the forecasting task into a set of basis functions.
Each basis function represents a distinct pattern or component of the time series. These basis
functions capture different aspects of the data, such as trend, seasonality, or other patterns.
- Forecasting Stack:
The algorithm utilizes a stack of fully connected layers, referred to as the forecasting stack. Each
layer in the stack applies a linear transformation to the input data followed by a non-linear
activation function, such as ReLU or sigmoid.
- Stacking Layers:
The basis functions are stacked using skip connections between layers in the forecasting stack.
This allows the model to learn complex patterns by combining the information from different
basis functions at multiple resolutions.
- Training and Optimization:
The N-BEATS algorithm is trained using historical data and optimized using backpropagation
and gradient descent algorithms. The model learns the parameters (weights and biases) of the
forecasting stack to minimize the prediction error between the model's output and the actual
values.
- Prediction:
Once the model is trained, it can be used for forecasting future values of the time series. The
model takes a window of past values as input and generates a prediction for the next time step.
Mathematically, the N-BEATS algorithm can be represented as a series of matrix multiplications
and non-linear transformations. The output of the forecasting stack is calculated as:
y[t] = sum_{k=1}^{K} g_k(h_{k-1}[t])
where y[t] is the output at time step t, K is the number of basis functions, g_k(h_{k-1}[t]) is
the output of the k-th fully connected layer, and h_{k-1}[t] is the output of the previous layer
at time step t. By combining different basis functions and stacking layers, the N-BEATS
algorithm can capture the underlying patterns and dynamics of the time series, enabling
forecasting of cryptocurrency prices.
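The idea of stacking fully connected blocks and summing their contributions can be sketched as follows. This sketch follows the generic block structure of the published N-BEATS architecture, in which each block emits a backcast that is subtracted from its input and a forecast that is added to the running total. Whether the project's custom model matches this structure is an assumption, and the layer sizes, block count, and random untrained weights are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def init_block(rng, window, horizon, hidden=16, n_layers=4):
    """Create random (untrained) parameters for one generic block."""
    sizes = [window] + [hidden] * n_layers
    layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]
    return {
        "layers": layers,
        "W_back": rng.normal(scale=0.1, size=(hidden, window)),   # backcast head
        "W_fore": rng.normal(scale=0.1, size=(hidden, horizon)),  # forecast head
    }

def block_forward(block, x):
    """Fully connected stack with ReLU, then linear backcast and forecast heads."""
    h = x
    for W, b in block["layers"]:
        h = relu(h @ W + b)
    return h @ block["W_back"], h @ block["W_fore"]

def nbeats_forward(blocks, x):
    """Each block explains part of the input; the block forecasts are summed."""
    residual = np.asarray(x, dtype=float)
    forecast = 0.0
    for block in blocks:
        backcast, block_forecast = block_forward(block, residual)
        residual = residual - backcast        # remove what this block explained
        forecast = forecast + block_forecast  # accumulate partial forecasts
    return forecast

rng = np.random.default_rng(0)
window, horizon = 7, 1
blocks = [init_block(rng, window, horizon) for _ in range(3)]
prediction = nbeats_forward(blocks, np.linspace(0.0, 1.0, window))
print(prediction.shape)
```

Since the weights are random, the numeric output is meaningless. The sketch only shows the data flow: a window of 7 past values goes in and a forecast for 1 future step comes out.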
Ensemble Model:
An ensemble model for price prediction combines the predictions of multiple individual models
to improve the overall accuracy and robustness of the predictions. The mathematical modeling of
an ensemble model involves the following steps:
- Individual Models:
An ensemble model consists of a collection of individual models, each trained on a subset of the
data or using a different algorithm. These individual models can be diverse in terms of their
architecture, parameters, or training techniques.
- Training:
Each individual model in the ensemble is trained on a training dataset using a specific algorithm
or approach. This can involve various machine learning techniques such as regression, neural
networks, support vector machines, or decision trees. The training process aims to find the best
parameters or weights for each model that minimize the prediction error.
Once the individual models are trained, they can be used to make predictions on new or unseen
data. Each model independently generates its own prediction based on the input features and its
learned parameters. The predictions generated by the individual models are combined or
aggregated to form the final prediction of the ensemble model. There are different methods for
combining the predictions, such as averaging, weighted averaging, voting, stacking, or boosting.
The combination process aims to leverage the strengths of each individual model and reduce the
impact of their weaknesses.
In some ensemble models, the individual models' predictions may be assigned different weights
or importance levels based on their performance or confidence. The optimization process aims to
find the optimal combination of weights that maximizes the overall accuracy of the ensemble
predictions.
Mathematically, the ensemble model combines the predictions of the individual models
according to the chosen combination method. For example, in the case of weighted averaging,
the final prediction (P_final) can be calculated as:
P_final = w_1 * P_1 + w_2 * P_2 + ... + w_n * P_n
where P_k is the prediction of the k-th individual model, w_k is its weight, n is the number of
models, and the weights sum to 1.
The ensemble model's strength lies in its ability to capture different perspectives and exploit the
diversity of individual models, which often results in more accurate and robust predictions than
any single model alone.
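Weighted averaging of model predictions can be sketched as follows. The function name `weighted_ensemble`, the three sets of predictions, and the weights are hypothetical values chosen for illustration. In practice the weights might be derived from each model's validation error.

```python
import numpy as np

def weighted_ensemble(predictions, weights):
    """Combine per-model predictions into one forecast by weighted averaging.

    predictions: array-like of shape (n_models, n_steps)
    weights:     array-like of shape (n_models,), normalized here to sum to 1
    """
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape[0] != predictions.shape[0]:
        raise ValueError("one weight is required per model")
    weights = weights / weights.sum()
    return weights @ predictions

# Hypothetical price predictions from three models over three time steps.
model_preds = [[100.0, 102.0, 104.0],
               [98.0, 101.0, 107.0],
               [101.0, 103.0, 105.0]]
p_final = weighted_ensemble(model_preds, weights=[0.5, 0.25, 0.25])
print(p_final)
```

For the first time step the result is 0.5 * 100 + 0.25 * 98 + 0.25 * 101 = 99.75. Setting all weights equal reduces the method to simple averaging.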
References