LCNN Lightweight CNN Architecture For Software Defect Feature Identification Using Explainable AI
Corresponding authors: Md. Abdus Samad ([email protected]) and Jia Uddin ([email protected])
This research is funded by Woosong University Academic Research 2024.
ABSTRACT Software defect identification (SDI) is a key part of improving the quality of software projects and lowering the risks that accompany maintenance. Existing work, however, has not yet identified the root causes of software defects well enough to obtain sufficient results. On the other hand, many researchers have recently developed several models, including NN, ML, DL, advanced CNN, and LSTM models, to enhance the effectiveness of defect prediction. Due to insufficient dataset sizes, repeated investigations, and inappropriate baseline selection, research on CNN models has been unable to produce reliable results. In addition, XAI, a well-known explainability approach, makes deep models in computer vision, as well as software defect prediction, easy for humans to understand. To address these issues, we first applied SMOTE to preprocess the categorical and numerical data collected from the NASA repository. Secondly, we experimented with software defect prediction using a 1D-CNN and a 2D-CNN, together named the lightweight CNN (LCNN). Subsequently, for evaluation, we employed 100-repetition holdout validation. For the cross-validation setup, the input shape for the 1D-CNN model was 20×1, and for the 2D-CNN model it was 4×5×1. After that, the results of the experiment were compared and assessed in terms of accuracy, MSE, and AUC. The results show that the 2D-CNN performs 1.36% better than the 1D-CNN. Thirdly, we conducted research on the identification of software defect features via LIME and SHAP, which stand as state-of-the-art XAI techniques. However, we cannot use the 2D-CNN for this purpose because it involves more complex relationships, making it challenging to create transparent explanations. That is why we concluded that the 1D-CNN yields superior results for explaining the root causes of software defect features. Finally, LIME provides more accurate visualization of software defect features than SHAP, and it helps stakeholders in the software industry easily find the actual root causes of software defects.
INDEX TERMS Software defect identification (SDI), explainability, 1D-CNN, 2D-CNN, CNN model, deep
learning, SHAP, LIME.
remediation, and with the advent of artificial intelligence (AI), significant strides have been made in automating fault detection and analysis. Software engineers and quality assurance (QA) managers face numerous challenges in their respective positions. The following challenges are often encountered: evolving needs, the intricacy of software systems, guaranteeing quality across many platforms and devices, upholding code quality, automating testing, addressing integration and compatibility problems, addressing security concerns, and managing limited resources.

In the last four decades, different researchers have worked on software defect identification, but their results have not been fruitful. In this regard, several ANN architectures have been proposed for software fault prediction and analysis by Begum and Dohi [1]. They presented a software fault prediction model that utilizes Box-Cox power transformation methods and compared existing software reliability growth models. A multilayer perceptron (MLP) is a sequentially linked artificial neural network with several layers of connections. In order to solve and foresee software errors, the authors of [1] developed an MLP with a one-stage look-ahead prediction utilizing the Box-Cox power transformation as an experiment. Traditional ANN models' prediction performance was improved using a mix of ML and evolutionary computing approaches [2]. However, academics' main focus has shifted to the correct identification of a software defect, which has a significant impact on software release time. Several ANN methods for determining the best time to release software have been proposed by Begum and Dohi [3], [4]. Using a multi-stage look-ahead prediction approach, where the ideal number of hidden neurons and transformation values were discovered, has limitations.

In another work, Liu et al. [5] investigated the impact of combining different sampling techniques and ML classifiers on defect prediction performance. While they found no single optimal combination, they identified support vector machines and deep learning as the most consistently performing classifiers. In [6], the authors contributed to the field of software defect prediction by utilizing various ML approaches to create multiple categorization or classification models, aiming to improve software quality and reduce testing costs. They also discussed the application of ensembling techniques and feature selection methods, such as principal component analysis (PCA), to further improve the accuracy of defect prediction models. They focused only on accuracy.

The authors of [7] employed long short-term memory (LSTM) networks to forecast software faults. They also calculated the data dispersion from the observed independent RMSE data points for each model. The quantified data dispersion value of the second model was found to be lower than that of the first one. The authors applied LSTM to predict the faults of multiple time stamps using a recursive approach. In addition, they compared LSTM and traditional software reliability growth models (SRGMs) based on their prediction accuracy evaluations.

The CNN model is well-suited for software fault prediction because it can effectively capture local and global patterns within source code structures, aiding in identifying potential defects. The authors of [8] improved a CNN model for within-project defect prediction (WPDP) and examined it against the existing CNN with empirical results. This experiment uses 30-repetition holdout validation and 10 × 10 cross-validation. In WPDP experiments, the improved CNN model was equivalent to the existing CNN model and outperformed state-of-the-art machine learning models. One limitation of this research is the need for more data, specifically C/C++ open-source projects, to build robust and generalizable datasets for deep learning-based defect prediction.

Scope: This research introduces an LCNN for software defect prediction, which utilizes CNN models. The proposed LCNN model outperforms conventional ML models in terms of performance. Subsequently, we converted the executable strings into numerical values and included them in the CNN model, which comprises a word embedding layer, a pair of convolutional layers, two max-pooling layers, and one dropout layer. Then, we assessed the proposed model's prediction ability by examining its accuracy, MSE, and AUC using the CM1 dataset, taken from the NASA repository [9]. There were two CNN methods, 1D and 2D, which we used for this evaluation. The 2D-CNN has almost the same outcomes as the 1D-CNN. For root cause analysis, we use a method called "explainability," along with LIME and SHAP, to figure out what happened with a deep learning model that was trained on tabular data. On the other hand, the 2D-CNN changes the input data shape; for this reason, we are unable to use its tabular data with XAI. Therefore, we use the 1D-CNN for the recognition of the root causes of software faults. As a result, LIME is a good way to explain features of software defects in XAI, where we used it along with SHAP.

The following are some of our study's contributions:
• We have introduced an LCNN model that aims to enhance generalization and enable the identification of software defect features via the use of XAI.
• We used two CNN techniques, 1D and 2D, for experiments.
• We divided our research into two sections. In the first section, we use the CNN model for prediction and choose the 1D-CNN for further experiments. The subsequent experiment explained the underlying reasons for software issues.
• In our empirical discoveries, we carried out a hyperparameter search, during which we took into account the number of dense layers, the kernel size, and the stride step.
• LIME is a method that accurately approximates the predictions of any classifier or regressor using a locally interpretable model, allowing for truthful explanations.
• We introduce a SHAP-based methodology, enhancing the interpretability of ML models and offering clearer insights into feature importance.
• Finally, we concluded that LIME and SHAP in XAI provide us with an accurate understanding of the root cause of software defects. However, when comparing them, we found that LIME gave better results than SHAP.

We introduce our research questions, which are answered in the analysis section using LIME and SHAP.
• RQ1: How can an LCNN architecture be designed for effective feature identification in software defect analysis using XAI?
• RQ2: What role does explainable AI play in enhancing the transparency and interpretability of the LCNN model for software defect feature identification?
• RQ3: How does the LCNN architecture perform compared to traditional methods in terms of accuracy and efficiency for identifying software defect features?
• RQ4: What impact does the choice of hyperparameters have on the CNN model's performance in software defect feature identification, and how can these be optimized for better results?
• RQ5: How transferable is the proposed LCNN architecture to different software domains, and what factors influence its generalizability?

The remaining parts of this paper are organized as follows: Section II presents a comprehensive review of previous research conducted in the areas of software defect identification and explainable AI. In Section III, we delve into the methodology, detailing the XAI techniques and feature engineering strategies employed. In addition, Section IV delivers the experimental results and examines the practical implications of this research. Finally, in Section V, we draw conclusions and outline future directions for this intersection of software engineering and AI-driven interpretability.

II. RELATED WORK
The realm of software defect identification has seen a surge in research focusing on leveraging CNNs. In particular, the pursuit of developing lightweight and tailored CNN architectures has become a pivotal area of interest, aiming to enhance the efficiency and accuracy of defect identification within software systems. Integrating XAI techniques into this domain further augments the interpretability of these models, providing insights into the decision-making process of the network.

In addition, several researchers have delved into designing customized CNN architectures, considering the intricacies of software code while ensuring transparency in the identification process. Tong et al. [10] suggested a new way to solve the class imbalance problem in SDP that uses deep representations along with a two-stage ensemble, and conducted an experiment on 12 NASA datasets. They have not worked on cross-project defect prediction, and there is no clear identification of root causes.

In [11], Zhu et al. introduced a new defect prediction model called DAECNN-JDP. This model utilizes a combination of denoising autoencoder and CNN techniques to provide just-in-time defect prediction. The evaluation of the model was conducted using six extensive open-source projects and compared to 11 baseline models. The experimental findings demonstrated that the suggested model surpasses these baseline models. However, they have not evaluated both open-source and commercial projects. In addition, they have not used parameter optimization techniques to adjust the parameters. Subsequently, Qiu et al. [12] introduced a new method that utilizes a transfer CNN model to extract transferable semantic features for cross-project defect prediction (CPDP) tasks. The studies were carried out using 10 benchmark projects and 90 pairs of CPDP tasks.

Deep representation and ensemble learning were discussed in [13] for SDP to resolve the class imbalance problem. The experimental findings demonstrated that the proposed method outperformed existing cutting-edge techniques. In addition, to optimize defect prediction models, the authors in [14] proposed an ANN model with automated parameter tuning techniques. The results indicated that the performance of their proposed model improved after parameter settings were optimized. In [15], the authors improved the recurrent artificial neural network (RANN) method used to predict long-term software defects based on the number of software faults and proposed a simulation-based method (PI simulation) for calculating prediction intervals (PIs). In the end, they compared it to the conventional delta method in terms of the mean prediction interval width and PI coverage rate. Still, they have not validated software metrics, including McCabe, Halstead, and OO metrics.

Currently, the majority of SDP approaches have given little consideration to the expense associated with misclassifying faulty and non-faulty modules, with just a few instances where this has been considered [16], [17]. Nevertheless, the misclassification cost for the majority class is much lower compared to the minority class in the context of software testing. Cost-sensitive learning has shown its effectiveness in incorporating various misclassification costs into the SDP process [18]. Faruk Arar and Ayan [16] tried to build cost-sensitive neural networks using cost-sensitive learning methods. They did this to address the problem of the unequal distribution of classes by taking costs into account.

Zhao et al. [19] have introduced a new SDP model named Siamese parallel fully connected networks (SPFCNN), which combines the benefits of Siamese networks with DL. The experimental findings showed that the suggested model exhibits considerably superior performance compared to the benchmarked SDP techniques. In [20], a hybrid model that combines bidirectional long short-term memory with CNN for SDP demonstrated the efficacy of the suggested methodology in accurately forecasting software problems. On the other hand, the authors of [21] introduced a software defect prediction (SDP) framework using an RNN that incorporates attention mechanisms.
B. SECOND PHASE (1D-CNN)
The one-dimensional convolutional neural network (1D-CNN) is a special kind of ANN that works with sequential data, using convolutional operations to extract features that are related to each other. It uses convolutional layers to automatically extract hierarchical features from the input data. Additionally, it plays a crucial role in enhancing the accuracy and efficiency of predicting software defects by leveraging its ability to capture intricate dependencies within sequential code structures. Fig. 2 depicts the composition of a typical 1D-CNN, which consists of three primary layers: 1D convolutional layers, pooling layers, and fully connected layers [29]. Table 2 provides particular details on the hyperparameters in the LCNN architecture for the 1D-CNN. At first, the input data shape is (6992, 1), with a kernel size of (3 × 1) and stride 1. For data compression, we applied the convolution and max-pooling operations three times, and finally, we obtained the data shape (387, 64). After that, we used the sigmoid activation function to classify software faults. There are two more important factors besides these three layers: the dropout layer and the activation function.

TABLE 2. Architecture of proposed 1D-CNN for tabular data.

1) ONE-DIMENSIONAL CONVOLUTIONAL LAYER
The one-dimensional convolutional layer [30] applies convolutional operations along a single axis, extracting features and patterns from sequential data. The function takes the one-dimensional input (vector) x[n] as its input, where n ranges from 0 to N − 1 and N is the total number of instances. The following parameters are used for making the layer.

1. Kernels: The kernels slide along the input sequence, capturing local patterns and producing output representations that highlight relevant features for further analysis. Let S represent the input sequence and K the one-dimensional kernel; the resulting convolution output ζ[i] may be found by solving (2):

ζ[i] = (S ∗ K)[i] = Σ_{k=0}^{K−1} S[i + k] · K[k]        (2)

where S ∗ K denotes the convolution operation and i is the index of the output sequence.

2. Activation function [30]: An activation function in a 1D-CNN introduces non-linearity, which is essential for learning complex patterns in sequential data. There are various types of functions, but the most popular are ReLU, Sigmoid, and Tanh, which are used for intricate relationships between features. We have used the exponential linear unit (ELU) as the activation function, which is a variant of the rectified linear unit (ReLU). The ELU incorporates an additional alpha constant (α) to determine the smoothness of the function when the input values are negative. Here, σ represents the input to the activation function; the ELU exhibits a higher rate of convergence towards zero cost and yields more precise outcomes. The formula, for α > 0, is

ELU(σ) = σ if σ > 0; α · (exp(σ) − 1) if σ ≤ 0        (3)

2) STRIDE [31]
Stride refers to the step size of a 1D-CNN. It determines the amount the kernel moves along the input signal. A larger stride reduces the output size by taking larger steps and extracting less information, while a smaller stride captures more detail but may increase computational complexity. Its adjustment allows CNNs to control the amount of information processed and influences the network's receptive field, impacting feature extraction in 1D sequences such as time series or signals.

C. POOLING LAYER
The pooling layer [30] in a 1D-CNN condenses feature maps, reducing dimensionality and computational load. It is common to use max pooling, which selects the maximum value within a window, capturing the most prominent features. There are multiple forms of pooling procedures, including max pooling, sum pooling, and average pooling [31]. In the present investigation, we applied 1D max pooling, which entails analyzing the input data by applying a predefined pool size and stride and picking the maximum value from the examined area. The functioning of this is illustrated in (4).
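To make these building blocks concrete, the following is a minimal NumPy sketch (illustrative only, not the authors' implementation) of the convolution in (2), the ELU in (3), and the 1D max pooling described above; the toy input and kernel values are assumptions.

```python
# Minimal NumPy sketch of the three 1D-CNN building blocks described above.
import numpy as np

def conv1d(S, K, stride=1):
    """Valid 1D convolution: zeta[i] = sum_k S[i*stride + k] * K[k]  -- Eq. (2)."""
    n_out = (len(S) - len(K)) // stride + 1
    return np.array([np.sum(S[i * stride: i * stride + len(K)] * K)
                     for i in range(n_out)])

def elu(x, alpha=1.0):
    """ELU(x) = x if x > 0 else alpha * (exp(x) - 1)  -- Eq. (3)."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def max_pool1d(x, pool_size=2, stride=2):
    """1D max pooling: keep the maximum value of each window."""
    n_out = (len(x) - pool_size) // stride + 1
    return np.array([x[i * stride: i * stride + pool_size].max()
                     for i in range(n_out)])

# Toy input (e.g., one module's scaled software metrics) and a kernel of size 3.
S = np.array([0.2, -0.5, 1.3, 0.7, -0.1, 0.9, 0.4, -0.8])
K = np.array([0.5, -1.0, 0.25])
print(max_pool1d(elu(conv1d(S, K, stride=1))))
```

Stacking three such convolution-plus-pooling stages, followed by dropout, flattening, and a sigmoid unit, corresponds to the layer sequence summarized in Table 2.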
TABLE 3. Architecture of proposed 2D-CNN for tabular data.

applied to the preceding convolutional layer output. It reduces the spatial dimensions by 2, and each of the 32 feature maps is treated separately. No parameters are trainable in this layer (0, output shape: None, 2, 2, 32). flatten (None, 64): this flattening layer turns the preceding layer's 3D output into a 1D vector. The output shape is (None, 64), and this layer has no trainable parameters (0). The output layer we have used is activated using the sigmoid function.

The methodology we used for the 2D-CNN is as follows:
• Transformation of tabular data into a 2D format akin to an image-like structure, representing relationships between software metrics.
• Normalization and feature engineering to enhance the network's ability to discern patterns.
• Designing a 2D-CNN architecture with convolutional and pooling layers to capture local and global feature dependencies.
• Incorporating multiple convolutional layers to learn hierarchical representations from the tabular data.
• Splitting the dataset into training, validation, and testing sets.
• Training the 2D-CNN model on the transformed tabular data, adjusting hyperparameters for optimal performance.
• Validating the model's performance using various evaluation metrics such as precision, recall, F1 score, and accuracy.
• Visualization techniques to interpret the learned features and understand the significance of various software metrics in defect prediction.
• Analyzing the performance metrics to highlight the superiority and efficiency of the 2D-CNN for software defect prediction.

E. FOURTH PHASE
In the context of comparing the performance of the 1D-CNN and 2D-CNN for software defect prediction, several metrics have been used, such as accuracy, AUC, and MSE [32], [33].

1) ACCURACY
Accuracy is a fundamental metric for evaluating the overall performance of the proposed LCNN architecture. It is defined as the ratio of correctly predicted instances to the total number of instances in the dataset. Higher accuracy refers to the model's capacity to accurately categorize instances as either faulty or non-defective. It is computed as follows:

Accuracy = (Correct Predictions / Total Number of Predictions) × 100%        (5)

2) MEAN SQUARED ERROR (MSE)
MSE is a statistic for calculating the average squared difference between what was expected and what happened. A smaller MSE in software defect predictions indicates more accurate and reliable results, highlighting improved precision in forecasting potential defects within the software system.

MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²        (6)

where n is the number of instances, y_i is the actual value of the i-th instance, and ŷ_i is the predicted value of the i-th instance.

3) AREA UNDER THE CURVE (AUC)
The AUC is a key measure of how well the model can tell the difference between defective and non-defective cases at different decision thresholds. In addition, the ROC curve shows how the true positive rate and false positive rate change as the threshold is raised or lowered. In combination, the AUC turns the ROC curve into a single number from 0 to 1, where a higher AUC means better discrimination.

F. FIFTH PHASE (XAI)
A new study method is suggested to look into how XAI techniques can be used to find and understand the causes of software errors. XAI provides the fields of software engineering and artificial intelligence with useful information that can be used to make software more reliable and easier to maintain. Therefore, we have focused on creating and using a brand-new XAI-based model to analyze software bugs. We used advanced XAI methods, namely LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), to give clear and understandable information about how complicated software systems work.

1) LIME
LIME (Local Interpretable Model-agnostic Explanations) plays a crucial role in advancing software defect identification by providing transparent and interpretable insights into the decision-making process of complex ML models. This methodology empowers researchers and practitioners to enhance model trustworthiness and pinpoint potential vulnerabilities, contributing to more effective and reliable software defect detection strategies. LIME offers several benefits for software fault root cause analysis, such as interpretability, local explanations, and model agnosticism. This investigation holds promise for advancing the field of software engineering by providing a nuanced understanding of the root causes behind software faults through the lens of LIME's interpretability.
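As a quick reference, the tabular-to-2D reshaping listed in the methodology above and the three metrics of Section E map onto a few standard calls. The sketch below is illustrative only: scikit-learn is assumed, and `model_2d`, `X_test`, and `y_test` are hypothetical handles to the artifacts of the earlier phases.

```python
# Illustrative evaluation sketch: reshape the 20 scaled metric columns into the
# 4x5x1 image-like layout used by the 2D-CNN, obtain predicted probabilities,
# and compute the three metrics defined in Section E.
import numpy as np
from sklearn.metrics import accuracy_score, mean_squared_error, roc_auc_score

def to_image_like(X_tabular):
    """(n_samples, 20) scaled software metrics -> (n_samples, 4, 5, 1)."""
    return np.asarray(X_tabular).reshape(-1, 4, 5, 1)

def evaluate(model_2d, X_test, y_test):
    proba = model_2d.predict(to_image_like(X_test)).ravel()   # P(defect)
    pred = (proba >= 0.5).astype(int)
    return {
        "accuracy": accuracy_score(y_test, pred) * 100.0,      # Eq. (5)
        "mse": mean_squared_error(y_test, proba),              # Eq. (6)
        "auc": roc_auc_score(y_test, proba),                   # area under the ROC curve
    }
```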
TABLE 4. Performance evaluation for 1D-CNN and 2D-CNN.

2) PSEUDOCODE FOR LIME
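The authors' original pseudocode is not reproduced in this extract. As a placeholder, the following is a minimal Python sketch of how LIME is typically applied to a trained 1D-CNN on tabular defect data; the `lime` package is assumed, and `model`, `X_train`, `x_row`, and `feature_names` are hypothetical handles to artifacts from the earlier phases.

```python
# Minimal sketch of LIME applied to a trained 1D-CNN on tabular defect data.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

def explain_module_with_lime(model, X_train, x_row, feature_names, num_features=5):
    """Return the top feature contributions for a single module's prediction."""

    def predict_proba(X):
        # Adapt the 1D-CNN (which expects a channel axis) to LIME's
        # (n_samples, n_features) interface; return [P(no defect), P(defect)].
        p = model.predict(X[..., np.newaxis]).ravel()
        return np.column_stack([1.0 - p, p])

    explainer = LimeTabularExplainer(
        training_data=np.asarray(X_train),        # scaled software metrics
        feature_names=feature_names,              # e.g., McCabe/Halstead metric names
        class_names=["no_defect", "defect"],
        mode="classification",
    )
    explanation = explainer.explain_instance(x_row, predict_proba,
                                             num_features=num_features)
    return explanation.as_list()                  # (feature condition, weight) pairs
```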
TABLE 6. LIME agnostic explanation predictions. Here, (a) indicates no faults and (b) indicates faults.
that the accuracy of defect prediction has shown a substantial improvement. Specifically, for the CM1 dataset using the 1D-CNN, the prediction accuracy has increased from 91% to 9%.

B. EXPLAINABILITY AND INTERPRETABILITY
In the context of software defect prediction using the proposed LCNN architecture, LIME and SHAP can be employed to identify the most influential features for each defect prediction.

1) VISUALIZATION OF LIME
LIME explanations revealed that the LCNN architecture consistently identified relevant features associated with software defects. In Table 6, we show the top five most effective features out of twenty-one features for an explanation of the root cause of software defects. Table 6(a) suggests that there will be no faults. The e (30% of the total feature ratio) and v features had the most significant effect on the model's ability to estimate. It was found that the t and total_OP features are the best indicators for a software fault to happen. We conclude that these two features have become very important for finding defects in software when using LIME. Additionally, Table 6(b) illustrates the important features used to analyze the software defects in the dataset. The characteristics t, n, and total_Op of the software faults in this dataset have exerted a favorable influence.

2) VISUALIZATION OF SHAP
SHAP explanations provided further insights into the relative importance of different features for each defect prediction. The base value in Table 7 is set at 1.00, which is referred to as the prediction value. Red-colored features have a positive impact, causing the predicted value to move closer to 0. If the t feature is removed, the forecast will change from 0.9897 to 1.00. In contrast, characteristics that are colored blue have a negative impact, meaning they pull the forecast value closer to 1. Removing the v feature will increase the prediction rate from 0.90 to 1.

3) ANALYSIS OF LIME AND SHAP
Both LIME and SHAP are valuable explainability techniques that provide insights into the decision-making process of the LCNN architecture. However, they differ in their approach and provide complementary information. We demonstrated the visualization of LIME and SHAP to identify the root causes of software defects. For the comparison, LIME performs better than SHAP because LIME provides a clearer picture of root causes; from it, we can easily find the unfavorable features that are the main cause of software defects. In Table 8, we have discussed the research questions and answers explaining our methods.

V. CONCLUSION
As part of this study, we set out to build a lightweight CNN design to identify the root causes of software defect features. To make software better, it is important to know why bugs occur in the testing phase. In this research, we have tried to build an advanced CNN model for finding the reasons for software bugs quickly and easily. The dataset went through extensive preparation to make sure it would work with CNNs. We addressed issues such as label encoding, standard scaling, and reshaping to meet the input requirements of both the 1D-CNN and 2D-CNN models. In addition, we experimented with both 1D-CNN and 2D-CNN models to evaluate their performance in identifying software defect features. These models were then trained and tested on relevant datasets to assess their effectiveness. Subsequently, we selected the 1D-CNN architecture due to its superior performance in capturing spatial relationships within defect features, aligning well with the nature of software defect identification. To enhance interpretability, we employed LIME and SHAP techniques specifically for the 1D-CNN model. These explainability tools provided valuable insights into feature importance, aiding in understanding the decision-making process of the model in the domain of software defect identification.

VI. STRENGTHS, LIMITATIONS, AND FUTURE PERSPECTIVES
Our research has exhibited several strengths. The LCNN architecture showcases efficiency in identifying software defects, providing a lightweight solution for practical deployment. By integrating XAI, particularly the SHAP method, the interpretability of the model is improved, thereby promoting confidence and comprehension in the decision-making process. The inclusion of the PC1 Promise Repository dataset increases the practical applicability and thereby strengthens this research.

Despite its strengths, the research has limitations. The effectiveness of the proposed solution may be context-dependent, and its generalizability to diverse software environments needs validation. The proposed approach relies on a specific dataset, which in practical applications may limit the model's adaptability to various software development practices.

Future research could focus on expanding the model's applicability by testing it on a broader range of datasets. In addition, LCNN development can proceed in collaboration with industry practitioners to provide valuable insights, address specific software development challenges, and ensure its effectiveness.

ACKNOWLEDGMENT
(Momotaz Begum, Mehedi Hasan Shuvo, and Imran Ashraf contributed equally to this work.)

REFERENCES
[1] M. Begum and T. Dohi, "A neuro-based software fault prediction with box-cox power transformation," J. Softw. Eng. Appl., vol. 10, no. 3, pp. 288–309, 2017, doi: 10.4236/jsea.2017.103017.
[2] S. Noekhah, A. A. Hozhabri, and H. S. Rizi, "Software reliability prediction model based on ICA algorithm and MLP neural network," in Proc. 7th Int. Conf. e-Commerce Developing Countries, Apr. 2013, pp. 1–15.
[3] M. Begum and T. Dohi, "Optimal release time estimation of software system using box-cox transformation and neural network," Int. J. Math., Eng. Manage. Sci., vol. 3, no. 2, pp. 177–194, Jun. 2018.
[4] M. Begum and T. Dohi, "Optimal stopping time of software system test via artificial neural network with fault count data," J. Quality Maintenance Eng., vol. 24, no. 1, pp. 22–36, Mar. 2018. [Online]. Available: https://www.emerald.com/insight/content/doi/10.1108/JQME-12-2016-0082/full/html
[5] Y. Liu, W. Zhang, G. Qin, and J. Zhao, "A comparative study on the effect of data imbalance on software defect prediction," Proc. Comput. Sci., vol. 214, pp. 1603–1616, Jan. 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1877050922020610
[6] A. Khalid, G. Badshah, N. Ayub, M. Shiraz, and M. Ghouse, "Software defect prediction analysis using machine learning techniques," Sustainability, vol. 15, no. 6, p. 5517, Mar. 2023. [Online]. Available: https://www.mdpi.com/2071-1050/15/6/5517
[7] M. R. Islam, M. Begum, and M. N. Akhtar, "Recursive approach for multiple step-ahead software fault prediction through long short-term memory (LSTM)," J. Discrete Math. Sci. Cryptography, vol. 25, no. 7, pp. 2129–2138, Oct. 2022. [Online]. Available: https://www.tandfonline.com/doi/full/10.1080/09720529.2022.2133251
[8] C. Pan, M. Lu, B. Xu, and H. Gao, "An improved CNN model for within-project software defect prediction," Appl. Sci., vol. 9, no. 10, p. 2138, May 2019. [Online]. Available: https://www.mdpi.com/2076-3417/9/10/2138
[9] J. S. Shirabad and T. Menzies, "The PROMISE repository of software engineering databases," School Inf. Technol. Eng., Univ. Ottawa, Ottawa, ON, Canada, Tech. Rep., 2005. [Online]. Available: http://promise.site.uottawa.ca/SERepository/datasets/cm1.arff
[10] H. Tong, B. Liu, and S. Wang, "Software defect prediction using stacked denoising autoencoders and two-stage ensemble learning," Inf. Softw. Technol., vol. 96, pp. 94–111, Apr. 2018. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0950584917300113
[11] K. Zhu, N. Zhang, S. Ying, and D. Zhu, "Within-project and cross-project just-in-time defect prediction based on denoising autoencoder and convolutional neural network," IET Softw., vol. 14, no. 3, pp. 185–195, Jun. 2020. [Online]. Available: https://onlinelibrary.wiley.com/doi/pdf/10.1049/iet-sen.2019.0278
[12] S. Qiu, H. Xu, J. Deng, S. Jiang, and L. Lu, "Transfer convolutional neural network for cross-project defect prediction," Appl. Sci., vol. 9, no. 13, p. 2660, Jun. 2019. [Online]. Available: https://www.mdpi.com/2076-3417/9/13/2660
[13] S. K. Pandey, R. B. Mishra, and A. K. Tripathi, "BPDET: An effective software bug prediction model using deep representation and ensemble learning techniques," Expert Syst. Appl., vol. 144, Apr. 2020, Art. no. 113085. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0957417419308024
[14] Z. Yang and H. Qian, "Automated parameter tuning of artificial neural networks for software defect prediction," in Proc. 2nd Int. Conf. Adv. Image Process. New York, NY, USA: Association for Computing Machinery, Jun. 2018, pp. 203–209, doi: 10.1145/3239576.3239622.
[15] M. Begum, M. S. Hafiz, M. J. Islam, and M. J. Hossain, "Long-term software fault prediction with robust prediction interval analysis via refined artificial neural network (RANN) approach," Eng. Lett., vol. 29, p. 44, Aug. 2021.
[16] Ö. F. Arar and K. Ayan, "Software defect prediction using cost-sensitive neural network," Appl. Soft Comput., vol. 33, pp. 263–277, Aug. 2015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1568494615002720
[17] P. Kumudha and R. Venkatesan, "Cost-sensitive radial basis function neural network classifier for software defect prediction," Sci. World J., vol. 2016, pp. 1–20, Sep. 2016, doi: 10.1155/2016/2401496.
[18] S. Viaene and G. Dedene, "Cost-sensitive learning and decision making revisited," Eur. J. Oper. Res., vol. 166, no. 1, pp. 212–220, Oct. 2005. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0377221704002978
[19] L. Zhao, Z. Shang, L. Zhao, T. Zhang, and Y. Y. Tang, "Software defect prediction via cost-sensitive Siamese parallel fully-connected neural networks," Neurocomputing, vol. 352, pp. 64–74, Aug. 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0925231219305004
[20] A. B. Farid, E. M. Fathy, A. S. Eldin, and L. A. Abd-Elmegid, "Software defect prediction using hybrid model (CBIL) of convolutional neural network (CNN) and bidirectional long short-term memory (Bi-LSTM)," PeerJ Comput. Sci., vol. 7, p. e739, Nov. 2021.
[21] G. Fan, X. Diao, H. Yu, K. Yang, and L. Chen, "Software defect prediction via attention-based recurrent neural network," Sci. Program., vol. 2019, pp. 1–14, Apr. 2019, doi: 10.1155/2019/6230953.
[22] Ł. Chmielowski, M. Kucharzak, and R. Burduk, "Application of explainable artificial intelligence in software bug classification," Informatyka, Automatyka, Pomiary Gospodarce Ochronie Środowiska, vol. 13, no. 1, pp. 14–17, Mar. 2023, doi: 10.35784/iapgos.3396.
[23] A. Khan, A. Ali, J. Khan, F. Ullah, and M. A. Khan, "A systematic literature review of explainable artificial intelligence (XAI) in software engineering (SE)," Researchsquare, pp. 1–28, Sep. 2023. [Online]. Available: https://www.researchsquare.com/article/rs-3209115/v1
[24] A. Hammouri, M. Hammad, M. Alnabhan, and F. Alsarayrah, "Software bug prediction using machine learning approach," Int. J. Adv. Comput. Sci. Appl., vol. 9, no. 2, pp. 1–6, 2018, doi: 10.14569/ijacsa.2018.090212.
[25] G. dos Santos, E. Figueiredo, A. Veloso, M. Viggiato, and N. Ziviani, Predicting Software Defects With Explainable Machine Learning. New York, NY, USA: Association for Computing Machinery, Dec. 2020, pp. 1–10.
[26] B. Mahbooba, M. Timilsina, R. Sahal, and M. Serrano, "Explainable artificial intelligence (XAI) to enhance trust management in intrusion detection systems using decision tree model," Complexity, vol. 2021, pp. 1–11, Jan. 2021, doi: 10.1155/2021/6634811.
[27] M. Begum, M. H. Shuvo, I. Ashraf, A. A. Mamun, J. Uddin, and M. A. Samad, "Software defects identification: Results using machine learning and explainable artificial intelligence techniques," IEEE Access, vol. 11, pp. 132750–132765, 2023.
[28] J. T. Hancock and T. M. Khoshgoftaar, "Survey on categorical data for neural networks," J. Big Data, vol. 7, no. 1, p. 28, Apr. 2020, doi: 10.1186/s40537-020-00305-w.
[29] E. Chaerun Nisa and Y.-D. Kuan, "Comparative assessment to predict and forecast water-cooled chiller power consumption using machine learning and deep learning algorithms," Sustainability, vol. 13, no. 2, p. 744, Jan. 2021. [Online]. Available: https://www.mdpi.com/2071-1050/13/2/744
[30] M. Saini, U. Satija, and M. D. Upadhayay, "One-dimensional convolutional neural network architecture for classification of mental tasks from electroencephalogram," Biomed. Signal Process. Control, vol. 74, Apr. 2022, Art. no. 103494, doi: 10.1016/j.bspc.2022.103494.
[31] J. Rala Cordeiro, A. Raimundo, O. Postolache, and P. Sebastião, "Neural architecture search for 1D CNNs—Different approaches tests and measurements," Sensors, vol. 21, no. 23, p. 7990, Nov. 2021. [Online]. Available: https://www.mdpi.com/1424-8220/21/23/7990
[32] S. Wang, T. Liu, and L. Tan, "Automatically learning semantic features for defect prediction," in Proc. IEEE/ACM 38th Int. Conf. Softw. Eng. (ICSE). New York, NY, USA: Association for Computing Machinery, May 2016, pp. 297–308, doi: 10.1145/2884781.2884804.
[33] K. Wongpheng and P. Visutsak, "Software defect prediction using convolutional neural network," in Proc. 35th Int. Tech. Conf. Circuits/Systems, Comput. Commun. (ITC-CSCC), Jul. 2020, pp. 240–243.

MOMOTAZ BEGUM received the M.Sc. degree in CSE from DUET, Gazipur, Bangladesh, in 2013, and the Ph.D. degree in information engineering from Hiroshima University, Japan. Her Ph.D. study was fully funded by the Japanese Government, specifically the Ministry of Education, Culture, Sports, Science and Technology [Monbukagakusho (MEXT)], from October 2014 to September 2017. She is currently a Distinguished Professor with the Department of Computer Science and Engineering, DUET. Her specialization lies in software rejuvenation, software aging, and human pose recognition. She has published several journal and conference papers in the world's reputed journals and conferences on different topics of computer science, such as the IoT, machine learning, and neural networks. Her research interests include software reliability, software engineering, artificial neural networks, data mining, data clustering, information systems, and analysis.

MEHEDI HASAN SHUVO (Member, IEEE) is currently pursuing the B.Sc. degree in engineering with the Department of Computer Science and Engineering, Dhaka University of Engineering & Technology, Gazipur, Bangladesh. He is a very courteous and dedicated person. He possesses strong technical skills and loves challenging work to create engaging and informative content. He has three years of industrial job experience in the field of mobile application development (Android and iOS). He worked full-time in three professors' laboratories at his university, which improved his skills. His research interests include federated learning, explainable artificial intelligence, computer vision, the IoT, machine learning, deep learning, object detection, and water and environmental sustainability. He is an ACM Member.

MOSTOFA KAMAL NASIR received the B.Sc. degree in computer science and engineering from Jahangirnagar University, Dhaka, Bangladesh, in 2000, and the Ph.D. degree in mobile ad-hoc technology from the University of Malaya, Kuala Lumpur, Malaysia, in 2016. He is currently a Professor of computer science and engineering with Mawlana Bhashani Science & Technology University, Tangail, Bangladesh. His current research interests include VANET, the IoT, SDN, and WSN.

IMRAN ASHRAF received the M.S. degree (Hons.) in computer science from Blekinge Institute of Technology, Karlskrona, Sweden, in 2010, and the Ph.D. degree in information and communication engineering from Yeungnam University, Gyeongsan, South Korea, in 2018. He was a Postdoctoral Fellow with Yeungnam University, where he is currently an Assistant Professor with the Information and Communication Engineering Department. His research interests include positioning using next-generation networks, communication in 5G and beyond, location-based services in wireless communication, smart sensors (LIDAR) for smart cars, and data analytics.