Artificial Intelligence in Software Testing: A Systematic Review
31 Oct - 3 Nov 2023. Chiang Mai, Thailand
Abstract—Software testing is a crucial component of software development. With the increasing complexity of software systems, traditional manual testing methods are becoming less feasible. Artificial Intelligence (AI) has emerged as a promising approach to software testing in recent years. This review paper aims to provide an in-depth understanding of the current state of software testing using AI. The review examines the various approaches, techniques, and tools used in this area and assesses their effectiveness. The selected articles for this study were extracted from different research databases using an advanced search string strategy. Initially, 40 articles were extracted from different research libraries. After gradual filtering, 20 articles were selected for the study. After studying all the selected papers, we find that various testing tasks can be automated successfully using AI (Machine Learning and Deep Learning), such as Test Case Generation, Defect Prediction, Test Case Prioritization, Metamorphic Testing, Android Testing, Test Case Validation, and White Box Testing. This study also finds that the integration of AI in software testing makes software testing activities easier and improves performance. This literature review paper provides a thorough analysis of the impact AI can have on the software testing process.

Index Terms—Software Testing, Artificial Intelligence, Test Automation, Systematic Literature Review

I. INTRODUCTION

Software testing has a crucial role in software engineering, as it is essential for ensuring the quality, performance, security, and reliability of software systems. By conducting testing, developers can identify and rectify bugs or defects in the software, improving its overall functionality and making sure that the software satisfies customer needs and expectations. AI is a vast area, so in this paper we mainly investigate two of its subareas, Machine Learning (ML) and Deep Learning (DL), as applied to software testing.

The field of software testing currently faces a number of challenges. As software systems grow increasingly complex, it becomes more challenging to manually test all possible scenarios. Also, traditional test automation approaches are time-consuming and complex to implement. Apart from that, keeping pace with agile development is also a challenge, as it requires rapid testing. AI has the potential to address these challenges by offering optimized and effective testing strategies. The aim of this study is to gain a thorough understanding of the current state of the field of software testing automation through the use of AI. This review will examine the various methods, techniques, and tools utilized in this domain and evaluate their efficiency. The motivation for this systematic literature review stems from the potential benefits that AI can offer in the field of software testing. AI has the potential to automate the testing process and optimize testing strategies, making software testing more efficient, effective, and accessible. Moreover, AI can address the shortage of skilled testers and help keep pace with the rapid development cycles of agile development methodologies. There are several challenges in software testing that can be solved using AI. These include manual test case generation, test optimization, and test result analysis.

The following research questions have been investigated in this research study.
RQ1: Does manual testing have drawbacks?
RQ2: Can integration of AI (ML or DL techniques) in software testing help to overcome the drawbacks of manual testing?
RQ3: What software testing tasks can be automated by AI (ML or DL)?
RQ4: What techniques do researchers use to assess AI (ML or DL) when used in software testing?

In this research study, 40 articles were screened from different research libraries, but after a gradual filtering process only 20 articles were found suitable for the study. The paper is structured as follows. Related works are discussed in Section 2, while the background of software testing and AI is presented in Section 3. The systematic review and the results are presented in Sections 4 and 5, respectively. Finally, the conclusion is presented in Section 6.

II. RELATED WORKS

The authors of [1] proposed a deep learning model to rank test cases. Their model ranks test cases based on historical records of test case executions. The authors of [2] conducted an empirical study on continuous integration testing. They found that the reward function strategy of reinforcement learning improves existing test case prioritization practices. The authors of [3] developed a deep reinforcement learning technique for performing black
TABLE I
INCLUSION AND EXCLUSION CRITERIA

TABLE II
SELECTED RESEARCH STUDIES ACCORDING TO THE PUBLISHER

| SL | Paper Id | Year | Publisher Name | Findings |
|----|----------|------|----------------|----------|
| 1 | P1 [27] | 2022 | ACM | The authors proposed an approach utilizing Deep Reinforcement Learning (RL) for automating the exploration of Android apps. They developed a tool called ARES, along with FATE, which integrates with ARES. |
| 2 | P2 [28] | 2022 | MDPI | This paper analyzed ML frameworks in the context of software automation and evaluated the performance of testing tools considering various factors. Accuracy or error rate and scope are important factors in determining the effectiveness of frameworks. |
| 3 | P3 [29] | 2022 | Science Direct | This study investigates the efficacy of machine learning, data mining, and deep learning methodologies in predicting software faults. The investigation reveals that data mining and machine learning techniques are utilized more than deep learning techniques. |
| 4 | P4 [30] | 2022 | ACM | This paper introduces Keeper, a novel testing tool. Keeper adopts a unique approach in which it creates pseudo-inverse functions for ML APIs. Keeper significantly enhances branch coverage. |
| 5 | P5 [31] | 2021 | IEEE | This study presents DeepOrder, a regression machine learning model based on deep learning techniques. DeepOrder can prioritize test cases and identify failed test cases by considering various factors such as test case duration and execution status. |
| 6 | P6 [32] | 2021 | Science Direct | This study investigated the reward function and reward strategy within the context of continuous integration (CI) testing. The authors proposed three reward strategies, which showed promising results. |
| 7 | P7 [5] | 2021 | IEEE | This paper introduces Deep GUI. Deep GUI utilizes deep learning techniques to create a model of valid GUI interactions, based solely on screenshots of applications. |
| 8 | P8 [33] | 2021 | IEEE | This study finds that most ML libraries lack a high-quality unit test suite. Moreover, the study also discovers recurring trends in the unexamined code throughout the five assessed ML libraries. |
| 9 | P9 [34] | 2021 | IEEE | This study presents a deep learning approach to predict the validity of test inputs for RESTful APIs. The proposed network achieved 97% accuracy for the new APIs. |
| 10 | P10 [35] | 2019 | IEEE | This paper introduces Humanoid, a deep learning approach for generating GUI test inputs by leveraging knowledge gained from human interactions. It learns from traces of interactions generated by humans, enabling the automatic prioritization of test inputs based on their perceived importance to users. |
| 11 | P11 [36] | 2019 | ACM | This study finds that equivalent mutants are effective for augmenting data and improving the detection rate of metamorphic relations. |
| 12 | P12 [37] | 2019 | MDPI | This study introduces an enhanced CNN model specifically designed to improve the learning of semantic representations from source code. The study also showed enhancements in the global pattern capture capability of the models, which improve the models' generalization performance. |
| 13 | P13 [38] | 2019 | IEEE | This study used three supervised machine learning algorithms for predicting software bugs. To enhance the accuracy of the models, random forest ensemble classifiers were used. The developed models work effectively in various scenarios. |
| 14 | P14 [39] | 2019 | IEEE | This study finds that ML algorithms have predominantly been employed in different areas of software testing. Test case generation, evaluation, test oracle construction, and cost prediction for testing activities can be performed using ML. |
| 15 | P15 [40] | 2018 | ACM | This study presents an approach for automating the test oracle mechanism in software using machine learning (ML). By incorporating a captured component into the application, historical usage data were gathered. These data are later used to generate an appropriate oracle. |
| 16 | P16 [41] | 2018 | SCITEPRESS | This paper describes a tool that generates test data for programs. The tool operates by clustering input data from a corpus folder and creating generative models for each cluster. These models are recurrent neural networks. |
| 17 | P17 [42] | 2018 | ACM | This paper introduces a methodology called DaOBML, which offers tool support to enhance the quality of environmental models that generate complex artifacts like images or plots. In this study, among six ML algorithms, ANN shows the best performance. |
| 18 | P18 [43] | 2017 | ACM | This study introduces DeepXplore, an innovative whitebox system designed to systematically test DL systems and detect faulty behaviors. DeepXplore can solve joint optimization problems. |
| 19 | P19 [44] | 2016 | Wiley Online Library | This study proposed an ML approach that can predict metamorphic relations in software programs. To achieve this, the authors utilized a graph-based representation of the program. |
| 20 | P20 [45] | 2016 | IEEE | This study proposed an approach for prioritizing test cases in manual testing. The proposed approach considers black-box metadata, including test case history. The SVM Rank ML algorithm is used in this study. |
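
Several of the studies listed above (for example P5 and P20) prioritize test cases using their execution history, such as past pass/fail status and duration. As a rough illustration of that general idea only, and not the method of any reviewed paper, a simple history-based heuristic could be sketched as follows. The names `TestRecord`, `priority_score`, and `prioritize`, the `decay` parameter, and the scoring formula itself are all hypothetical choices made for this sketch; the reviewed approaches learn their ranking with ML or DL models instead of using a fixed formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TestRecord:
    """Hypothetical execution history for one test case."""
    name: str
    duration_s: float      # average run time in seconds
    outcomes: List[bool]   # past results, oldest first; True means the test failed


def priority_score(record: TestRecord, decay: float = 0.5) -> float:
    """Recency-weighted failure rate divided by duration.

    Recent failures count more than old ones, and cheaper tests rank
    higher among tests with the same failure history. This weighting is
    an illustrative assumption, not taken from any reviewed paper.
    """
    weighted_failures = 0.0
    total_weight = 0.0
    weight = 1.0
    for failed in reversed(record.outcomes):  # newest outcome first
        weighted_failures += weight * (1.0 if failed else 0.0)
        total_weight += weight
        weight *= decay
    if total_weight == 0.0:
        return 0.0  # no history: lowest priority in this sketch
    failure_rate = weighted_failures / total_weight
    return failure_rate / max(record.duration_s, 1e-6)


def prioritize(records: List[TestRecord]) -> List[str]:
    """Return test names ordered from highest to lowest priority."""
    ranked = sorted(records, key=priority_score, reverse=True)
    return [r.name for r in ranked]
```

In this sketch, a test that failed in its most recent run is scheduled before one that failed only long ago, and a test that has never failed is scheduled last. A learned model, as used in the reviewed studies, would replace `priority_score` with a prediction trained on the same kind of history.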