Software Bug Prediction Using Machine Learning App

This document summarizes a research paper that uses machine learning algorithms to predict software bugs. It compares the performance of three machine learning classifiers - Naive Bayes, Decision Tree, and Artificial Neural Networks - on three public datasets. The evaluation shows the machine learning approaches can effectively predict bugs with high accuracy. The paper also compares the performance of the classifiers using various evaluation metrics like accuracy, precision, recall, and ROC curves.

Software Bug Prediction using Machine Learning Approach

Article  in  International Journal of Advanced Computer Science and Applications · January 2018


DOI: 10.14569/IJACSA.2018.090212


All content following this page was uploaded by Mustafa Hammad on 17 December 2018.



(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 9, No. 2, 2018

Software Bug Prediction using Machine Learning Approach
Awni Hammouri, Mustafa Hammad, Mohammad Alnabhan, Fatima Alsarayrah
Information Technology Department
Mutah University, Al Karak, Jordan

Abstract—Software Bug Prediction (SBP) is an important issue in software development and maintenance processes, and it bears on the overall success of software. Predicting software faults in an early phase improves software quality, reliability and efficiency, and reduces software cost. However, developing a robust bug prediction model is a challenging task, and many techniques have been proposed in the literature. This paper presents a software bug prediction model based on machine learning (ML) algorithms. Three supervised ML algorithms are used to predict future software faults based on historical data: Naïve Bayes (NB), Decision Tree (DT) and Artificial Neural Networks (ANNs). The evaluation showed that ML algorithms can be used effectively, with a high accuracy rate. Furthermore, a comparison measure is applied to compare the proposed prediction model with other approaches. The collected results showed that the ML approach performs better.

Keywords—Software bug prediction; faults prediction; prediction model; machine learning; Naïve Bayes (NB); Decision Tree (DT); Artificial Neural Networks (ANNs)

I. INTRODUCTION
The existence of software bugs dramatically affects software reliability, quality and maintenance cost. Achieving bug-free software is hard work even when the software is developed carefully, because hidden bugs are usually present. In addition, developing a software bug prediction model that can predict the faulty modules in an early phase is a real challenge in software engineering.

Software bug prediction is an essential activity in software development, because predicting the buggy modules prior to software deployment improves user satisfaction and overall software performance. Moreover, predicting software bugs early improves software adaptation to different environments and increases resource utilization.

Various techniques have been proposed to tackle the Software Bug Prediction (SBP) problem. The best known are Machine Learning (ML) techniques, which are used extensively in SBP to predict the buggy modules based on historical fault data, essential metrics and different software computing techniques.

In this paper, three supervised ML classifiers are used to evaluate the ML capabilities in SBP. The study discusses the Naïve Bayes (NB) classifier, the Decision Tree (DT) classifier and the Artificial Neural Networks (ANNs) classifier. These classifiers are applied to three different datasets obtained from [1] and [2].

In addition, the paper compares the NB, DT and ANNs classifiers. The comparison is based on different evaluation measures, such as accuracy, precision, recall, F-measure and the ROC curves of the classifiers.

The rest of this paper is organized as follows. Section 2 discusses the related work in SBP. Section 3 gives an overview of the selected ML algorithms. Section 4 describes the datasets and the evaluation methodology. Experimental results are shown in Section 5, followed by conclusions and future work.

II. RELATED WORK
There are many studies on software bug prediction using machine learning techniques. For example, the study in [2] proposed a linear Auto-Regression (AR) approach to predict the faulty modules. The study predicts future software faults from the historical data of the accumulated software faults. It also evaluated the AR model and compared it with the known power model (POWM) using the Root Mean Square Error (RMSE) measure. In addition, the study used three datasets for evaluation, and the results were promising.

The studies in [3], [4] analyzed the applicability of various ML methods for fault prediction. Sharma and Chandra [3] added to their study the most important previous research on each ML technique and the current trends in software bug prediction using machine learning. This study can be used as a foundation for future work in software bug prediction.

R. Malhotra [5] presented a systematic review of software bug prediction techniques that use Machine Learning (ML). The paper reviewed the studies published between 1991 and 2013, analyzed the ML techniques used for software bug prediction models and assessed their performance, compared ML techniques with statistical techniques, compared different ML techniques with each other, and summarized the strengths and weaknesses of the ML techniques.

The paper in [6] provided a benchmark that allows a common and useful comparison between different bug prediction approaches. The study presented a comprehensive comparison between well-known bug prediction approaches. It also introduced a new approach and evaluated its performance by comparing it with other approaches using the presented benchmark.


D. L. Gupta and K. Saxena [7] developed a model for an object-oriented Software Bug Prediction System (SBPS). The study combined similar types of defect datasets available at the Promise Software Engineering Repository and evaluated the proposed model using accuracy as the performance measure. The results showed that the average accuracy of the proposed model is 76.27%.

Rosli et al. [8] presented an application that uses a genetic algorithm for fault proneness prediction. The application obtains its values, such as the object-oriented metrics and count metrics values, from an open source software project. The genetic algorithm uses the application's values as inputs to generate rules, which are employed to categorize the software modules into defective and non-defective modules. Finally, the outputs are visualized using a genetic algorithm applet.

The study in [9] assessed various object-oriented metrics using machine learning techniques (decision tree and neural networks) and statistical techniques (logistic and linear regression). The results showed that the Coupling Between Objects (CBO) metric is the best metric for predicting bugs in a class and that Lines Of Code (LOC) performs fairly well, but that Depth of Inheritance Tree (DIT) and Number Of Children (NOC) are untrustworthy metrics.

Singh and Chug [10] discussed five popular ML algorithms used for software defect prediction: Artificial Neural Networks (ANNs), Particle Swarm Optimization (PSO), Decision Tree (DT), Naïve Bayes (NB) and Linear Classifiers (LC). The study presented several important results. ANN has the lowest error rate, followed by DT, but the linear classifier is better than the other algorithms in terms of defect prediction accuracy. The most popular methods used in software defect prediction are DT, BL, ANN, SVM, RBL and EA. The common metrics used in software defect prediction studies are Line Of Code (LOC) metrics, object-oriented metrics such as cohesion, coupling and inheritance, and hybrid metrics, which use both object-oriented and procedural metrics. Furthermore, most software defect prediction studies used the NASA and PROMISE datasets.

Moreover, the studies in [11], [12] discussed various ML techniques and described the ML capabilities in software defect prediction. These studies help developers choose useful software metrics and a suitable data mining technique in order to enhance software quality. The study in [12] determined the most effective metrics for defect prediction, such as Response For Class (RFC), Lines Of Code (LOC) and Lack Of Coding Quality (LOCQ).

Bavisi et al. [13] presented the most popular data mining techniques (k-Nearest Neighbors, Naïve Bayes, C-4.5 and Decision trees). The study analyzed and compared the four algorithms and discussed the advantages and disadvantages of each. The results showed that different factors affect the accuracy of each technique, such as the nature of the problem, the dataset used and its performance matrix.

The studies in [14], [15] presented the relationship between object-oriented metrics and the fault-proneness of a class. Singh et al. [14] showed that CBO, WMC, LOC, and RFC are effective in predicting defects, while Malhotra and Singh [15] showed that the AUC is an effective metric that can be used to predict faulty modules in the early phases of software development and to improve the accuracy of ML techniques.

This paper discusses three well-known machine learning techniques: DT, NB and ANNs. The paper also evaluates the ML classifiers using various performance measurements (i.e. accuracy, precision, recall, F-measure and ROC curve). Three public datasets are used to evaluate the three ML classifiers.

In contrast, most of the related works mentioned above discussed more ML techniques and different datasets. Some of the previous studies focused mainly on the metrics that make SBP as efficient as possible, while others proposed methods other than ML techniques to predict software bugs.

III. USED MACHINE LEARNING ALGORITHMS
The study aims to analyze and assess three supervised Machine Learning algorithms: Naïve Bayes (NB), Artificial Neural Network (ANN) and Decision Tree (DT). The study shows the performance accuracy and capability of the ML algorithms in software bug prediction and provides a comparative analysis of the selected ML algorithms.

Supervised machine learning algorithms try to develop an inferring function by concluding relationships and dependencies between the known inputs and outputs of the labeled training data, so that the output values for new input data can be predicted from the derived inferring function. The selected supervised ML algorithms are summarized below:

• Naïve Bayes (NB): NB is an efficient and simple probabilistic classifier based on Bayes' theorem with an independence assumption between the features. NB is not a single algorithm but a family of algorithms based on a common principle, which assumes that the presence or absence of a particular feature of a class is not related to the presence or absence of any other feature [16], [17].

• Artificial Neural Networks (ANNs): ANNs are networks inspired by biological neural networks. Neural networks are non-linear classifiers that can model complex relationships between the inputs and the outputs. A neural network consists of a collection of processing units, called neurons, that work together in parallel to produce output [16]. Each connection between neurons can transmit a signal to other neurons, and each neuron calculates its output using a nonlinear function of the sum of all of the neuron's inputs.

• Decision Tree (DT): DT is a common learning method used in data mining. DT refers to a hierarchical, predictive model that uses the observations about an item as branches to reach the item's target value in a leaf. DT is a tree with decision nodes, which have more than one branch, and leaf nodes, which represent the decision.
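To make the NB description concrete, the following is a minimal Python sketch of a categorical Naïve Bayes classifier. It is an illustration only and not the implementation used in this study: the function names train_nb and predict_nb, the add-one (Laplace) smoothing, and the toy module features (size, churn) with their labels are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math


def train_nb(rows, labels):
    """Count class and per-feature value frequencies for categorical Naive Bayes."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, class)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    # values[feature_index] -> set of distinct values seen for that feature
    values = defaultdict(set)
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values


def predict_nb(model, row):
    """Return the class with the highest log posterior under the independence assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for y, n_y in class_counts.items():
        # log prior of the class
        score = math.log(n_y / total)
        for j, v in enumerate(row):
            # add-one smoothing keeps unseen feature values from zeroing the product
            count = feature_counts[(j, y)][v]
            score += math.log((count + 1) / (n_y + len(values[j])))
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Hypothetical toy data: (module size, code churn) -> label
rows = [("small", "low"), ("small", "low"), ("large", "high"),
        ("large", "high"), ("large", "low")]
labels = ["clean", "clean", "buggy", "buggy", "clean"]
model = train_nb(rows, labels)
print(predict_nb(model, ("large", "high")))
print(predict_nb(model, ("small", "low")))
```

Each feature contributes an independent factor to the class score, which is the independence assumption named in the NB description; the log form only avoids numerical underflow.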


IV. DATASETS AND EVALUATION METHODOLOGY
Three different datasets are used in this study, namely DS1, DS2 and DS3. All datasets consist of two measures: the number of faults (Fi) and the number of test workers (Ti) for each day (Di) in a part of a software project's lifetime. The DS1 dataset has 46 measurements taken during the testing process presented in [1]. DS2, also taken from [1], records the faults measured during 109 successive days of testing a software system that consists of 200 modules, each with one kilo line of Fortran code. DS2 has 111 measurements. DS3 was developed in [2] and contains real measured data for a test/debug program of a real-time control application presented in [18]. Tables II to IV present DS1, DS2 and DS3, respectively.

TABLE II. DS1 - THE FIRST SOFTWARE FAULTS DATASET

Di   Fi   Ti      Di   Fi   Ti
1    2    75      24   2    8
2    0    31      25   1    15
3    30   63      26   7    31
4    13   128     27   0    1
5    13   122     28   22   57
6    3    27      29   2    27
7    17   136     30   5    35
8    2    49      31   12   26
9    2    26      32   14   36
10   20   102     33   5    28
11   13   53      34   2    22
12   3    26      35   0    4
13   3    78      36   7    8
14   4    48      37   3    5
15   4    75      38   0    27
16   0    14      39   0    6
17   0    4       40   0    6
18   0    14      41   0    4
19   0    22      42   5    0
20   0    5       43   2    6
21   0    9       44   3    5
22   30   33      45   0    8
23   15   118     46   0    2

TABLE III. DS2 - THE SECOND SOFTWARE FAULTS DATASET

Di   Fi   Ti      Di   Fi   Ti      Di   Fi   Ti
1    5    4       38   15   8       75   0    4
2    5    4       39   7    8       76   0    4
3    5    4       40   15   8       77   1    4
4    5    4       41   21   8       78   2    2
5    6    4       42   8    8       79   0    2
6    8    5       43   6    8       80   1    2
7    2    5       44   20   8       81   0    2
8    7    5       45   10   8       82   0    2
9    4    5       46   3    8       83   0    2
10   2    5       47   3    8       84   0    2
11   31   5       48   8    4       85   0    2
12   4    5       49   5    4       86   0    2
13   24   5       50   1    4       87   2    2
14   49   5       51   2    4       88   0    2
15   14   5       52   2    4       89   0    2
16   12   5       53   2    4       90   0    2
17   8    5       54   7    4       91   0    2
18   9    5       55   2    4       92   0    2
19   4    5       56   0    4       93   0    2
20   7    5       57   2    4       94   0    2
21   6    5       58   3    4       95   0    2
22   9    5       59   2    4       96   1    2
23   4    5       60   7    4       97   0    2
24   4    5       61   3    4       98   0    2
25   2    5       62   0    4       99   0    2
26   4    5       63   1    4       100  1    2
27   3    5       64   0    4       101  0    1
28   9    6       65   1    4       102  0    1
29   2    6       66   0    4       103  1    1
30   5    6       67   0    4       104  2    1
31   4    6       68   1    3       105  0    1
32   1    6       69   1    3       106  1    2
33   4    6       70   0    3       107  0    2
34   3    6       71   0    3       108  0    1
35   6    6       72   1    3       109  1    1
36   13   6       73   1    4       110  0    1
37   19   8       74   0    4       111  1    1

The datasets were preprocessed by a proposed clustering technique, which marks the data with class labels. These labels classify the number of faults into five different classes: A, B, C, D, and E. Table V shows the range of each class and the number of instances that belong to it in each dataset.

In order to evaluate the performance of ML algorithms in software bug prediction, we used a set of well-known measures [19] based on the generated confusion matrices. The following subsections describe the confusion matrix and the evaluation measures used.

A. Confusion Matrix
The confusion matrix is a specific table that is used to measure the performance of ML algorithms. Table VI shows an example of a generic confusion matrix. Each row of the matrix represents the instances in an actual class, while each column represents the instances in a predicted class, or vice versa. The confusion matrix summarizes the results of the tested algorithm and reports the number of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN).
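As a rough illustration of the class labeling and the confusion-matrix counts described in this section, the sketch below maps daily fault counts to the classes A to E and tallies TP, FP, FN and TN for one class treated as positive. The function names fault_class and confusion_counts are hypothetical, class E is assumed to start at 20 faults, and the predicted labels are invented for the example; the paper's own clustering technique and classifier outputs are not reproduced here. The fault counts are the Fi column of DS1.

```python
from collections import Counter


def fault_class(faults):
    """Map a daily fault count to a class label A-E.

    Assumption: class E starts at 20 faults (the ranges otherwise follow
    0-4, 5-9, 10-14 and 15-19).
    """
    if faults <= 4:
        return "A"
    if faults <= 9:
        return "B"
    if faults <= 14:
        return "C"
    if faults <= 19:
        return "D"
    return "E"


def confusion_counts(actual, predicted, positive):
    """Tally TP, FP, FN, TN for one class treated as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Fi values of DS1 for days 1 to 46
ds1_faults = [2, 0, 30, 13, 13, 3, 17, 2, 2, 20, 13, 3, 3, 4, 4, 0,
              0, 0, 0, 0, 0, 30, 15, 2, 1, 7, 0, 22, 2, 5, 12, 14,
              5, 2, 0, 7, 3, 0, 0, 0, 0, 5, 2, 3, 0, 0]
actual = [fault_class(f) for f in ds1_faults]
print(Counter(actual))

# Hypothetical predictions: identical to the actual labels except for two days
predicted = list(actual)
predicted[3] = "B"    # day 4 is actually class C
predicted[25] = "A"   # day 26 is actually class B
print(confusion_counts(actual, predicted, "B"))
```

For a multi-class problem such as this one, the four counts are obtained per class in a one-versus-rest fashion, which is why the positive class is passed in explicitly.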


TABLE IV. DS3 - THE THIRD SOFTWARE FAULTS DATASET

Di   Fi   Ti      Di   Fi   Ti      Di   Fi   Ti
1    4    1       38   9    2       75   1    2
2    0    1       39   7    2       76   11   2
3    7    1       40   12   2       77   1    2
4    10   1       41   12   2       78   0    2
5    13   1       42   15   2       79   2    2
6    8    1       43   14   2       80   2    2
7    13   1       44   7    2       81   4    2
8    4    1       45   9    2       82   1    2
9    7    1       46   11   2       83   0    2
10   8    1       47   5    2       84   4    2
11   1    1       48   7    2       85   1    1
12   6    1       49   7    2       86   1    1
13   13   1       50   14   2       87   0    1
14   7    1       51   13   2       88   2    3
15   9    1       52   14   2       89   0    1
16   8    2       53   11   2       90   0    2
17   5    2       54   2    1       91   1    1
18   10   2       55   4    1       92   1    1
19   7    2       56   4    2       93   0    1
20   11   2       57   3    2       94   0    2
21   5    2       58   6    2       95   0    1
22   8    2       59   6    2       96   0    1
23   13   2       60   2    2       97   1    2
24   9    2       61   0    1       98   0    1
25   7    2       62   0    1       99   1    1
26   7    2       63   3    1       100  0    1
27   5    2       64   0    1       101  0    1
28   7    2       65   4    1       102  0    2
29   6    1       66   0    1       103  0    1
30   6    1       67   1    1       104  2    1
31   4    1       68   2    1       105  0    1
32   12   2       69   0    2       106  1    2
33   6    2       70   1    2       107  0    2
34   7    2       71   2    2       108  2    2
35   8    2       72   5    2       109  0    2
36   11   2       73   3    2
37   6    2       74   2    2

TABLE V. NUMBER OF FAULTS CLASSIFICATION

                                     Number of Instances
Faults Class   Number of Faults      DS1    DS2    DS3
A              0-4                   30     76     57
B              5-9                   5      23     33
C              10-14                 5      4      18
D              15-19                 2      3      1
E              More than 20          4      5      0

TABLE VI. THE CONFUSION MATRIX

                   Actual
Predicted      Class X    Class Y
Class X        TP         FP
Class Y        FN         TN

B. Accuracy
Accuracy (ACC) is the proportion of true results (both TP and TN) among the total number of examined instances. The best accuracy is 1, whereas the worst accuracy is 0. ACC can be computed using the following formula:

ACC = (TP + TN) / (TP + TN + FP + FN) (1)

C. Precision (Positive Predictive Value)
Precision is calculated as the number of correct positive predictions divided by the total number of positive predictions. The best precision is 1, whereas the worst is 0. It can be calculated as:

Precision = TP / (TP + FP) (2)

D. Recall (True Positive Rate or Sensitivity)
Recall is calculated as the number of correct positive predictions divided by the total number of actual positives. The best recall is 1, whereas the worst is 0. Recall is calculated by the following formula:

Recall = TP / (TP + FN) (3)

E. F-measure
F-measure is defined as the weighted harmonic mean of precision and recall. It is usually used to combine the Recall and Precision measures into one measure in order to compare different ML algorithms with each other. The F-measure formula is given by:

F-measure = (2 * Recall * Precision) / (Recall + Precision) (4)

F. Root-Mean-Square Error (RMSE)
RMSE is a measure for evaluating the performance of a prediction model. The idea is to measure the difference between the predicted and the actual values. If the actual value is X and the predicted value is XP, then RMSE over n observations is calculated as follows:

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - XP_i)^2} (5)

V. EXPERIMENTAL RESULTS
This study used WEKA 3.6.9, a machine learning tool, to evaluate three ML algorithms (NB, DT and ANNs) on the software bug prediction problem. 10-fold cross validation is used for each dataset.

The accuracy of the NB, DT and ANNs classifiers on the three datasets is shown in Table VII. As the table shows, the three ML algorithms achieved a high accuracy rate. The average accuracy rate across all datasets is over 93% for each of the three classifiers. The lowest value appears for the NB algorithm on the DS1 dataset. We believe this is because the dataset is small and the NB algorithm needs a bigger dataset in order to achieve a higher accuracy value. Accordingly, NB achieved a higher accuracy rate on the DS2 and DS3 datasets, which are relatively bigger than the DS1 dataset.

TABLE VII. ACCURACY MEASURE FOR THE THREE ML ALGORITHMS OVER DATASETS

Datasets    NB       DT       ANNs
DS1         0.898    0.951    0.938
DS2         0.950    0.972    0.954
DS3         0.954    0.990    0.963
Average     0.934    0.971    0.951
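The evaluation measures of this section can be sketched in a few lines of Python. This is an illustrative sketch rather than the WEKA computation used in the study: the function names are hypothetical, returning 0.0 when a denominator is zero is an assumption, RMSE is taken as the square root of the mean squared difference between actual and predicted values, and the counts and values in the usage lines are made up.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Eq. (1): share of correct results among all examined instances."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Eq. (2): correct positive predictions over all positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0  # assumption: 0.0 when undefined


def recall(tp, fn):
    """Eq. (3): correct positive predictions over all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0  # assumption: 0.0 when undefined


def f_measure(p, r):
    """Eq. (4): harmonic mean of precision and recall."""
    return 2 * r * p / (r + p) if (r + p) else 0.0


def rmse(actual, predicted):
    """Root of the mean squared difference between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((x - xp) ** 2 for x, xp in zip(actual, predicted)) / n)


# Hypothetical confusion-matrix counts for one class
tp, fp, fn, tn = 4, 1, 1, 40
p, r = precision(tp, fp), recall(tp, fn)
print(accuracy(tp, tn, fp, fn), p, r, f_measure(p, r))

# Hypothetical actual and predicted fault counts
print(rmse([2, 0, 30], [2, 1, 28]))
```

Because precision and recall share the TP numerator, the F-measure only reaches 1 when both FP and FN are zero, which is why it is convenient for ranking classifiers with a single number.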


The precision measures for applying the NB, DT and ANNs classifiers to the DS1, DS2 and DS3 datasets are shown in Table VIII. The results show that the three ML algorithms can be used effectively for bug prediction with a good precision rate. The average precision value of every classifier over the three datasets is more than 97%.

TABLE VIII. PRECISION MEASURE FOR THE THREE ML ALGORITHMS OVER DATASETS

Datasets    NB       DT       ANNs
DS1         0.956    1        1
DS2         0.989    0.990    0.981
DS3         0.990    1        0.990
Average     0.978    0.996    0.990

The third evaluation measure is the recall measure. Table IX shows the recall values for the three classifiers on the three datasets. Here, too, the ML algorithms achieved good recall values. The best recall value was achieved by the DT classifier, which reached 100% on all datasets. The average recall values for the ANNs and NB algorithms are 99% and 96%, respectively.

TABLE IX. RECALL MEASURE FOR THE THREE ML ALGORITHMS OVER DATASETS

Datasets    NB       DT       ANNs
DS1         1        1        1
DS2         0.905    1        0.990
DS3         0.972    1        0.981
Average     0.959    1        0.990

In order to compare the three classifiers with respect to the recall and precision measures, we used the F-measure value. Fig. 1 shows the F-measure values for the ML algorithms on the three datasets. As the figure shows, DT has the highest F-measure value on all datasets, followed by ANNs and then NB.

[Fig. 1. F-measure values for the used ML algorithms in the three datasets. Chart with series NB, DT and ANNs over DS1, DS2 and DS3; vertical axis from 0.91 to 1.]

Finally, to compare the ML algorithms with other approaches, we calculated the RMSE value. The work in [2] proposed a linear Auto Regression (AR) model to predict the accumulative number of software faults using historical measured faults. The authors compared their approach with the POWM model [20] based on the RMSE measure. The evaluation was done on the same datasets that we use in this study.

TABLE X. RMSE VALUES FOR THE THREE ML ALGORITHMS, AR MODEL, AND POWM MODEL

            Machine Learning Algorithms      Approaches Presented in [2]
Datasets    NB       DT       ANNs           AR Model    POWM Model
DS1         0.163    0.082    0.151          4.096       14.060
DS2         0.199    0.104    0.130          0.687       150.075
DS3         0.120    0.062    0.162          3.567       152.969

Table X presents the RMSE measure for the ML algorithms, as well as for the AR and POWM models, over the three datasets. The results show that the NB, DT, and ANNs classifiers have better values than the AR and POWM models. The average RMSE value for all ML classifiers over the three datasets is 0.130, while the average RMSE values for the AR and POWM models are 2.783 and 105.701, respectively.

VI. CONCLUSIONS AND FUTURE WORK
Software bug prediction is a technique in which a prediction model is created in order to predict future software faults based on historical data. Various approaches have been proposed using different datasets, different metrics and different performance measures. This paper evaluated the use of machine learning algorithms for the software bug prediction problem. Three machine learning techniques were used: NB, DT and ANNs.

The evaluation was carried out using three real testing/debugging datasets. Experimental results were collected based on the accuracy, precision, recall, F-measure, and RMSE measures. The results reveal that ML techniques are efficient approaches for predicting future software bugs. The comparison showed that the DT classifier has the best results among the three. Moreover, the experimental results showed that the ML approach provides better performance for the prediction model than other approaches, such as the linear AR and POWM models.

As future work, we may involve other ML techniques and provide an extensive comparison among them. Furthermore, adding more software metrics to the learning process is one possible approach to increase the accuracy of the prediction model.

REFERENCES
[1] Y. Tohma, K. Tokunaga, S. Nagase, and M. Y., “Structural approach to the estimation of the number of residual software faults based on the hyper-geometric distribution model,” IEEE Trans. on Software Engineering, pp. 345–355, 1989.
[2] A. Sheta and D. Rine, “Modeling Incremental Faults of Software Testing Process Using AR Models,” Proceedings of the 4th International Multi-Conferences on Computer Science and Information Technology (CSIT 2006), Amman, Jordan, Vol. 3, 2006.
[3] D. Sharma and P. Chandra, “Software Fault Prediction Using Machine-Learning Techniques,” Smart Computing and Informatics, Springer, Singapore, 2018, pp. 541-549.
[4] R. Malhotra, “Comparative analysis of statistical and machine learning methods for predicting faulty modules,” Applied Soft Computing 21 (2014): 286-297.
[5] R. Malhotra, “A systematic review of machine learning techniques for software fault prediction,” Applied Soft Computing 27 (2015): 504-518.
[6] M. D'Ambros, M. Lanza, and R. Robbes, “An extensive comparison of bug prediction approaches,” Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on, IEEE, 2010.


[7] D. L. Gupta and K. Saxena, “Software bug prediction using object-oriented metrics,” Sādhanā (2017): 1-15.
[8] M. M. Rosli, N. H. I. Teo, N. S. M. Yusop and N. S. Moham, “The Design of a Software Fault Prone Application Using Evolutionary Algorithm,” IEEE Conference on Open Systems, 2011.
[9] T. Gyimothy, R. Ferenc and I. Siket, “Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction,” IEEE Transactions on Software Engineering, 2005.
[10] P. D. Singh and A. Chug, “Software defect prediction analysis using machine learning algorithms,” 7th International Conference on Cloud Computing, Data Science & Engineering - Confluence, IEEE, 2017.
[11] M. C. Prasad, L. Florence and A. Arya, “A Study on Software Metrics based Software Defect Prediction using Data Mining and Machine Learning Techniques,” International Journal of Database Theory and Application, pp. 179-190, 2015.
[12] A. Okutan and O. T. Yıldız, “Software defect prediction using Bayesian networks,” Empirical Software Engineering 19.1 (2014): 154-181.
[13] S. Bavisi, J. Mehta, and L. Lopes, “A Comparative Study of Different Data Mining Algorithms,” International Journal of Current Engineering and Technology 4.5 (2014).
[14] Y. Singh, A. Kaur and R. Malhotra, “Empirical validation of object-oriented metrics for predicting fault proneness models,” Software Qual J, p. 3–35, 2010.
[15] R. Malhotra and Y. Singh, “On the applicability of machine learning techniques for object oriented software fault prediction,” Software Engineering: An International Journal 1.1 (2011): 24-37.
[16] A. Tosun Misirli and A. Başar Bener, “A Mapping Study on Bayesian Networks for Software Quality Prediction,” Proceedings of the 3rd International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, 2014.
[17] T. Angel Thankachan and K. Raimond, “A Survey on Classification and Rule Extraction Techniques for Data mining,” IOSR Journal of Computer Engineering, vol. 8, no. 5, pp. 75-78, 2013.
[18] T. Minohara and Y. Tohma, “Parameter estimation of hyper-geometric distribution software reliability growth model by genetic algorithms,” in Proceedings of the 6th International Symposium on Software Reliability Engineering, pp. 324–329, 1995.
[19] Olsen, David L. and Delen, “Advanced Data Mining Techniques,” Springer, 1st edition, page 138, ISBN 3-540-76016-1, Feb 2008.
[20] L. H. Crow, “Reliability for complex repairable systems,” Reliability and Biometry, SIAM, pp. 379–410, 1974.
