Comparative Analysis of Software Reliability Prediction Using Machine Learning and Deep Learning
Abstract— Software reliability is integral to determining software quality: software is considered to be of high quality if its reliability is high. Many statistical models can help predict software reliability, but it is very difficult for them to account for all real-world factors, which makes reliability prediction hard and makes it challenging for the IT industry to predict whether a piece of software is dependable. Machine Learning and Deep Learning can be used to predict software reliability by building a model that assesses reliability through fault prediction in a more meticulous manner. In this study we use established Artificial Intelligence algorithms, mainly Artificial Neural Network (ANN), Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), to predict software reliability on a time series software failure dataset and compare them on the basis of selected performance metrics. Each algorithm, trained on the software failure dataset, is used to predict the software failure time after a certain number of corrective modifications have been performed on the software. The results show that LSTM produces superior outcomes in predicting the software failure trend, as it can capture both long-term and short-term trends in the software failure dataset.

Keywords— Software Reliability, Deep Learning, Time Series data, comparative analysis

I. INTRODUCTION

The term "software reliability" refers to operational dependability. Software reliability may alternatively be defined as the probability that a software system will perform its assigned task in a given environment for a certain number of input cases, assuming that the hardware and input are both error-free. Evaluating software reliability is crucial for determining system dependability. However, reliability is difficult to achieve given the increasing complexity of software requirements.

Machine Learning and Deep Learning techniques can predict the fault rate of a given software system more precisely by learning from past input data without human judgment, leaving less room for errors and assumptions than statistical methods (Malhotra and Negi 2013) [1]. System behavior can be anticipated using Machine Learning, which learns from past and current software failure data and serves as a tool to automate data processing. Deep learning techniques include Artificial Neural Networks (ANNs), Recurrent Neural Networks (RNNs), and others that use a collection of algorithms to replicate the brain's actions. A neural network is made up of four primary components: inputs, weights, a bias or threshold, and an output. However, a deep learning model requires more data points to enhance its accuracy, whereas a machine learning model requires less data due to its fundamental data structure.

For this experiment, we propose a heuristic examination of several Machine Learning and Deep Learning techniques on univariate software failure time series data to investigate which approach can be used extensively for predicting software reliability. We then use metrics such as Mean Absolute Error, Mean Squared Error, Median Absolute Error, and Maximum Error to determine their accuracy. These metrics were chosen because they capture which technique most closely represents the actual software failure dataset. In this paper we chose to emphasize data-driven approaches over hardware/architecture-based approaches, in which other factors such as the environment and time play a significant role and leave a major probabilistic factor.

II. RELATED WORKS

The most prevalent Machine Learning algorithms used for Software Reliability Prediction and Modeling include Genetic Programming, Decision Trees, Support Vector Machines, and Particle Swarm Optimization. ML techniques have been shown to work better than stochastic ones because the models learn from previous errors, leading to closer precision and fewer errors (Malhotra and Negi 2013) [1].

To facilitate the acceptance of connectionist models and their usage in software reliability models, Cai et al. (1991) [2] and Karunanithi et al. (1992) [3] undertook considerable research. G Krishan et al. (2018) [4] compared mainly Artificial Neural Networks and SVM and concluded that geometric understanding leads to better results for SVM than for NNs.

Pai and Hong (2006) [5] used SVM algorithms in prediction tests of software reliability. Loui et al. (2016) [6] employed a relevance vector machine for the prediction of software dependability. Machine learning approaches such as fuzzy inference systems, cascade correlation neural networks, and decision trees are used by Kumar and Singh (2012) [7] to predict outcomes. Jaiswal and Malhotra (2016) [8] discuss predicting software reliability using ANFIS. Other methods such as Bagging, GRNN, SVM, MLP, M5P, FFBPNN, CFBPNN, Lin Reg, and RepTree are studied by Xingguo and Yanhua (2007) [9].

In papers like (Gokhale 1998) [10], an application was run against some software and fixed test cases to derive a scientific model (the architecture of the application) in terms of criteria such as branching probabilities and the failure model of its components, issues that could later be solved for better reliability. Later work has sought a systematic way of predicting software reliability by incorporating debugging functionalities (S Trivedi 2006) [11].

Table 1. Dataset description

Attribute    Description
t            The number of modifications made to the software
Yt           Failure time after t modifications are made
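The four error metrics named in the Introduction can be computed directly from the observed and predicted failure times. The following is a minimal sketch, assuming NumPy is available; the function name `reliability_errors` and the sample values are hypothetical and are not taken from the dataset used in this study.

```python
import numpy as np

def reliability_errors(y_true, y_pred):
    """Return MAE, MSE, median absolute error and maximum error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)  # per-observation absolute error
    return {
        "mae": float(abs_err.mean()),            # Mean Absolute Error
        "mse": float((abs_err ** 2).mean()),     # Mean Squared Error
        "median_ae": float(np.median(abs_err)),  # Median Absolute Error
        "max_error": float(abs_err.max()),       # Maximum Error
    }

# Hypothetical failure times Yt, for illustration only.
actual = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 12.0, 13.0, 24.0]
scores = reliability_errors(actual, predicted)
```

The squared and maximum errors are more sensitive to a few large misses, whereas the median absolute error is robust to them, which is one reason to report the four values together.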
C. ALGORITHMS USED
1. ANN (Artificial Neural Network)

As Fig 3 shows, the network consists of three layers: an input layer, an arbitrary number of hidden layers, and an output layer.

Fig. 3 ANN Model Summary

Reason for selecting ANN: To get a baseline prediction performance.

Training epochs: 100, Batch Size: 5

2. RNN (Recurrent Neural Network)

The RNN is a type of neural network that maintains a hidden state, which takes into account the information stored in a sequence, as represented in Fig 4. In an RNN, independent activations are converted into dependent activations by using the same weights and biases for all the layers.

Reason for selecting RNN: Since the hidden state of an RNN is used for remembering information about a sequence, it can be used for a time series prediction problem.

3. GRU (Gated Recurrent Unit)

● Update Gate: This gate determines how much past information is passed along to future time steps.

● Reset Gate: It determines how much past knowledge is no longer useful and can be forgotten.

● Current Memory Gate: This gate is incorporated into the reset gate and introduces non-linearity into the input.
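The recurrent hidden state of the RNN and the GRU gates described in this section can be sketched as single-step update functions. This is only an illustration under stated assumptions: it uses NumPy and the commonly published update equations rather than the exact models trained in this study, and the helper names (`rnn_step`, `gru_step`, `init_gru`), the hidden size of 3, and the input values are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(x, h_prev, W, U, b):
    # The same W, U and b are reused at every time step, so each new
    # activation depends on the previous hidden state.
    return np.tanh(W @ x + U @ h_prev + b)

def init_gru(n_hidden, n_in, rng):
    # Small random weights and zero biases for the three GRU components.
    p = {}
    for name in ("z", "r", "h"):
        p["W" + name] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p["U" + name] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p["b" + name] = np.zeros(n_hidden)
    return p

def gru_step(x, h_prev, p):
    # Update gate: how much of the past state is carried forward.
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])
    # Reset gate: how much of the past state is ignored for the candidate.
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])
    # Current memory: non-linear candidate built from the reset-scaled past.
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    # Convention assumed here: z close to 1 keeps the old state
    # (some references swap the roles of z and 1 - z).
    return z * h_prev + (1.0 - z) * h_cand

rng = np.random.default_rng(0)
gru_params = init_gru(3, 1, rng)
series = np.array([0.1, 0.3, 0.2, 0.5])  # hypothetical scaled failure times
h_rnn = np.zeros(3)
h_gru = np.zeros(3)
for y in series:
    x = np.array([y])
    h_rnn = rnn_step(x, h_rnn, gru_params["Wh"], gru_params["Uh"], gru_params["bh"])
    h_gru = gru_step(x, h_gru, gru_params)
```

In a full model, the final hidden state would typically be passed through an output layer to produce the predicted failure time; that layer is omitted here.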
Fig. 5 GRU Model Summary

Reason for selecting GRU: Since GRU is a variation of RNN, it can be effective in a time series prediction problem by virtue of its capability to memorize some information about a sequence.

Training epochs: 100, Batch Size: 5

4. LSTM (Long Short-Term Memory)

Fig. 6 LSTM Model Summary

Reason for selecting LSTM: Since LSTM was specially developed to solve the problem of long-term dependencies in a sequence, it can be very effective in a time series prediction problem.

Training epochs: 100, Batch Size: 5
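The way an LSTM handles long-term dependencies can be illustrated with a single-step update. This is a sketch only: it assumes the commonly published LSTM equations and NumPy, not the configuration summarized in Fig 6, and the names `lstm_step` and `init_lstm`, the hidden size of 4, and the input values are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(n_hidden, n_in, rng):
    # One weight set per component: forget (f), input (i), output (o),
    # and candidate (g).
    p = {}
    for name in ("f", "i", "o", "g"):
        p["W" + name] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p["U" + name] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p["b" + name] = np.zeros(n_hidden)
    return p

def lstm_step(x, h_prev, c_prev, p):
    f = sigmoid(p["Wf"] @ x + p["Uf"] @ h_prev + p["bf"])  # what to keep
    i = sigmoid(p["Wi"] @ x + p["Ui"] @ h_prev + p["bi"])  # what to write
    o = sigmoid(p["Wo"] @ x + p["Uo"] @ h_prev + p["bo"])  # what to expose
    g = np.tanh(p["Wg"] @ x + p["Ug"] @ h_prev + p["bg"])  # candidate values
    c = f * c_prev + i * g   # cell state carries long-term information
    h = o * np.tanh(c)       # hidden state carries short-term output
    return h, c

rng = np.random.default_rng(0)
lstm_params = init_lstm(4, 1, rng)
series = np.array([0.1, 0.3, 0.2, 0.5])  # hypothetical scaled failure times
h = np.zeros(4)
c = np.zeros(4)
for y in series:
    h, c = lstm_step(np.array([y]), h, c, lstm_params)
```

Because the cell state is updated additively, a forget gate near 1 and an input gate near 0 let a value persist across many steps, which is the property the selection rationale relies on.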