Utilizing Machine Learning For Predicting Software Faults Through Selenium Testing Tool
Ghada Alsuwailem1, Ohoud Alharbi1
1 King Saud University, Riyadh, Saudi Arabia
ARTICLE INFO
Keywords: Machine Learning, Software Faults, Selenium Testing Tool.
Received: Nov, 19, 2023
Accepted: Nov, 30, 2023
Published: Dec, 22, 2023

ABSTRACT
Software quality assurance, especially in the context of the testing phase, plays a pivotal role in ensuring the reliability and functionality of software systems. Automation testing is recognized as a valuable technique to enhance test coverage and accuracy. However, challenges such as diverse automation tools and unrealistic expectations can hold up its effectiveness. This research explores the integration of machine learning into the Selenium automation testing tool to predict faults based on UI and historical scenarios. The study aims to investigate the impact of machine learning on perceived task difficulty and time required for fault prediction during software testing. The literature review emphasizes the importance of software testing, automation testing, and the Selenium tool. The research methodology employs a mixed-methods approach, combining quantitative and qualitative analyses. The results show positive perceptions regarding the clarity of implementing machine learning-based Selenium but mixed opinions on the ease of implementation. The ML-based Selenium tool demonstrates increased effectiveness, reliability, and reduced testing duration. Interviews highlight the complementary roles of manual and automated testing. The discussion addresses improved test effectiveness, reliability, challenges, and future considerations, affirming the viability and advantages of incorporating machine learning into the Selenium framework for automation testing.

1. INTRODUCTION
Software quality assurance represents one of the most critical facets within the software development lifecycle. Within this lifecycle, the testing phase involves exercising a software system using a variety of inputs with the intention of validating its behavior and discovering faults. These faults, also known as bugs or defects, can cause failures in software systems (Chen, et al., 1998). The testing phase verifies and validates that the software meets its aim. To achieve a high level of software quality, we should concentrate on the techniques and methods used in the testing process. Automation testing is one of the useful techniques that help to increase test coverage and accuracy. Because software is comprehensive, feature-rich, and integrated, appropriate techniques are needed to guarantee a high level of software quality. The software engineer faces several challenges in achieving a high level of software quality, including:
• The range of test automation tools in use can pose a challenge for testers, as each tool necessitates a unique learning curve. For instance, the usage methodology of Selenium differs from that of Appium. Furthermore, each tool exhibits distinct
G. Alsuwailem & O. Alharbi International Journal on Computations, Information and Manufacturing (IJCIM) 3(2) -2023- 14
variations and applications; for example, Selenium encompasses three types (Selenium WebDriver, Selenium IDE, and Selenium Grid), each demanding dedicated learning and expertise.
• In alignment with the 7 Testing Principles (Florea, R. and Stray, V. 2019), the concept of 'Absence-of-errors' is recognized as a fallacious notion. However, organizations set unrealistic expectations of automation by assuming that it will solve all issues, which is not true and leads to heightened demands on, and expectations of, testers.
Software invariably exhibits defects due to the inherent fallibility of human programmers. The genesis of such imperfections lies in several factors, including the potential for ambiguous or erroneous software requirements, misinterpretation of these requirements, misuse of software components, human errors during the coding process, and the susceptibility of previously functional code to discrepancies arising from changes. In light of these challenges, the practice of software testing emerges as an intuitive and indispensable approach to mitigate the impact of these imperfections (Gamido, H. V. and Gamido, M. V. 2019). One software testing practice is software fault prediction (SFP), which involves constructing predictive models intended for deployment by software professionals during the nascent stages of the software development life cycle, with the primary aim of identifying defective elements, such as modules or classes. Historically, a spectrum of machine learning methodologies has been harnessed for the task of fault prediction (Honest, N. 2019). Machine learning can be harnessed to streamline the automation of test scripts, consequently enhancing the level of software quality and concurrently alleviating the testing team's workload.

1.1. Objective
This research paper aims to find an appropriate solution to avoid these challenges and achieve a high level of software quality. This is pursued through the implementation of a machine learning-based Selenium automation testing tool, which aids in script generation based on the UI and historical scenarios to predict faults. To accomplish this objective, the following detailed goals are considered:
• Conducting a thorough investigation to become familiar with the machine learning-based Selenium automation testing tool, gaining insight into both its advantages and shortcomings.
• Gaining a profound understanding of machine learning utilization techniques within the domain of software quality assurance.
• Developing a system to implement the machine learning-based Selenium technique in the testing phase. The system would be designed to bring together a community of testers, offering a variety of features, with the primary focus being the collaborative exchange of software testing expertise.

1.2. Research Questions
To verify the effectiveness of applying the machine learning-based Selenium technique, we designed the following research question.
RQ1: How does the use of machine learning technology in a Selenium automation testing tool impact the perceived task difficulty and time required for fault prediction during software testing?
The following hypothesis is associated with the research question:
H1a: Fault prediction effort, as measured by the perceived task difficulty and time to complete the task, will be significantly lower for software testing tools supported with machine learning technology.
In this paper, we will employ a machine learning-based Selenium automation testing tool.

2. LITERATURE REVIEW
This section provides an overview of the literature review conducted in the study, encompassing the significance of software testing in achieving software quality. It delves into a brief exploration of automation testing, focusing on the application of one prominent tool, the Selenium Automation Testing Tool. Additionally, the section provides an insight into the broader context of machine learning and its application in predicting software faults.

2.1. Importance of Software Testing to Achieve Software Quality
Much research has been conducted to emphasize the importance of software quality, including the effectiveness of automation testing that utilizes machine learning for software fault prediction. A group of researchers has explained the concept of software testing. One of their explanations is that software testing is among the umbrella activities performed at any organization to provide value and quality, ensuring the longevity of software products in the market (Kaur, M. and Kumari, R. 2011). Other research asserts the importance of the testing phase, which consumes an average of 40% to 70% of the software development process. Furthermore, software testing plays a crucial role in evaluating and ensuring the quality of software. It is essential in confirming that software functions as intended and does not perform unintended actions (Kufel, J. et al. 2023). Correspondingly, the optimal level of testing efficiency is characterized by its capacity to achieve the desired software quality standard while demanding a reduced level of effort (Li, Z. et al., 2018). Notably, testing accounts for a substantial portion of software development costs, and its importance in ensuring software quality has grown because modern software systems have become increasingly complex, making conventional testing techniques less scalable. This complexity has driven the adoption of machine learning-based techniques in testing.

2.2. Automation Testing
In automation testing, automated tests address the difficulties arising from manual testing, and testers put more focus on automated tests than on manual tests (Wardhan, H. and Madan, S). The researchers mention the following strengths of automation testing: it executes test cases significantly faster than manual testing; fewer testers are required, so less investment is needed in human resources; it is programmable, so sophisticated tests can be written to bring out hidden information; and it is more reliable and less error-prone than manual testing (Seralina, N. 2021). Consequently, more focus is placed on automated tests than on manual tests.

2.3. Selenium Automation Testing Tool
According to (Li, Z. et al., 2018), Selenium is one of the preeminent automation frameworks, encompassing a multitude of tools and extensions designed for testing web applications. It is also recognized for its powerful capability in performance testing and maintains a prominent presence in the realm of open-source test automation (Jaganeshwari, K.). Various automation tools have been discussed in the literature, including those utilizing artificial intelligence as a means to address challenges in testing, and Selenium is one of them. Selenium has commonly cited advantages and drawbacks. The advantages are: it is open source, with no licensing or maintenance fees; it is open to integration with other tools and frameworks; it is easy to use and flexible; it can debug and set breakpoints in test cases; and it allows tests to be written in different programming languages. On the other hand, the drawbacks are that writing test cases with Selenium requires a certain level of programming skill, and that Selenium has no built-in error handling capabilities, which can make it challenging to handle and report errors effectively (Marijan, D. and Gotlieb, A. 2020). Selenium simplifies the work of automation testers, leading to improved testing efficiency and cost-effectiveness. Selenium's open-source nature, flexibility in scripting languages, compatibility across operating systems and browsers, and seamless integration with other tools contribute to its popularity. Testers can write scripts in various languages and perform testing on Windows, macOS, and Linux, across different browsers, ensuring cross-browser compatibility (Mobaraya, F. and Ali, S. 2019). Selenium is a versatile automation testing framework that comprises three key components. Firstly, Selenium IDE facilitates the recording, editing, debugging, and replaying of functional tests. Testers can effortlessly record browser interactions and export tests in multiple programming languages for enhanced flexibility. Secondly, Selenium Grid empowers parallel automated testing across multiple machines and browsers, optimizing time and overall performance. This component allows testers to conduct tests simultaneously in various browsers and operating systems. Lastly, Selenium WebDriver stands out as the cornerstone of the Selenium Suite, providing a programming interface for the creation and execution of automation
scripts. Testers can choose their preferred programming language to identify web elements on pages and perform actions, thus tailoring the automation process to their specific needs. Selenium WebDriver's compatibility with popular browsers further contributes to an efficient and comprehensive testing experience (Nyamathulla, S. et al. 2021).
Implementing the machine learning-based Selenium tool consists of eight main steps: Collect Training Data, Feature Extraction, Labeling, Train Machine Learning Model, Monitoring and Inference, Dynamic Test Case Generation, Execute Test Cases, Feedback Loop, Iterate and Improve.

2.4. Application of Software Faults Prediction Utilizing Machine Learning
Machine learning is a subset of AI that involves building computer models capable of learning and making independent predictions or decisions. It is a field at the intersection of AI, computer science, and statistics, and it is used to automate and streamline software testing. Some software testing problems can thus be framed as learning problems, making machine learning a suitable approach (Nyamathulla, S. et al. 2021). In the past, extensive research has focused on the application of machine learning in software testing. A Systematic Literature Review (SLR) covers 154 studies from 1990 to June 2019, providing guidelines for software practitioners and researchers (Wardhan, H. and Madan, S. 2023). Furthermore, various studies have demonstrated the significance of software fault prediction (SFP) by conducting comprehensive reviews. One study systematically reviewed 74 research articles from 11 different journals up until 2007. This review categorized machine learning-based approaches and commonly used software metrics in the SFP field. Another study conducted a systematic literature review covering the period from 1991 to 2013. This review analyzed a range of machine learning methods, software metrics, datasets, and performance measures in the context of fault prediction (Pandey, S.K. et al. 2021; Spicer, J. and Sanborn, A.N. 2019). Moreover, one study indicated that AI support for quality testing is extracted in technical batches, which are used to solve multiple problems when testing information systems. The study focuses on machine intelligence applied to automated software testing, with an exploration of artificial intelligence and its importance in the software development and testing process. Additionally, it highlights AI's ability to generate quick and efficient tests, ultimately saving time and resources (Mobaraya, F. and Ali, S. 2019). Various machine learning techniques are categorized into eight groups: Bayesian learners, Decision Trees, Evolutionary Algorithms, Ensemble Learners, Neural Networks, Support Vector Machines, Rule-Based Learning, and Miscellaneous. Notable methods such as Bayesian learners, Decision Trees, and Miscellaneous are frequently used in SFP, and some articles use a combination of these methods for optimal results (Wardhan, H. and Madan, S. 2023). In contrast, other research categorizes machine learning algorithms into two main learning categories: supervised and unsupervised. Supervised learning involves mapping input variables to corresponding output variables for prediction or understanding. Unsupervised learning deals with input data only and focuses on clustering problems (Wardhan, H. and Madan, S. 2023). Similarly, some studies fall into a semi-supervised category, in which only a subset of the input data has associated output data (Malhotra, R. 2015). As well, the effectiveness of various machine learning techniques, such as SVM and RF, can vary depending on the dataset and problem (Sugali, K. et al. 2021). Furthermore, a novel methodology was introduced to enhance the process of GUI testing through the utilization of machine learning techniques for the recognition and categorization of GUI widgets. Machine learning can be applied to predict the effectiveness of test cases by learning from data about what constitutes an effective test case. Machine learning algorithms can also identify patterns and structures in the data to create models for making predictions about test case effectiveness. Controlled AI systems are credited with achieving high test case coverage while effectively addressing scalability and error-related challenges (Noorian, M. et al. 2018). The machine learning models achieved an average AUC between 0.72 and 0.84 and an accuracy ranging from 75.7% to 85.01% in SFP. Besides that, machine learning faces challenges related to imbalanced datasets, overfitting, data quality, and variations in fault information across different projects. As a result, the study recommended the availability of benchmark datasets from
various industries as crucial for improving SFP models and encouraged industries to provide freely available datasets with more features and test cases to support deep learning (DL) applications without overfitting (Phuc Nguyen, D. and Maag, S. 2020). One observation is the absence of comprehensive guidelines for selecting suitable machine learning methods for software testing (Sugali, K. et al. 2021). Correspondingly, another study suggested more rigorous empirical evaluations to support the proposed solutions, given the increasing interest in applying machine learning algorithms to automate software testing (Okezie, F. et al. 2019).
In conclusion, our observations have identified several knowledge gaps that motivate further exploration into the application of machine learning in software testing, particularly when coupled with automation testing tools. These gaps include the scarcity of empirical studies focused on machine learning-based automation testing and a lack of research comparing appropriate machine learning techniques tailored to the field of software testing with manual testing.

3. RESEARCH METHODOLOGY
This section outlines the research methodology employed in the study, detailing the procedures and techniques used to collect and analyze data. The research is based on applying the machine learning-based Selenium tool to the Tester Community System (TCS) to measure the number of closed test cases, the duration of the test, tester satisfaction, and the overall software quality of TCS when using the tool. The research follows a mixed-methods approach to gain a comprehensive understanding of the research problem. This approach combines quantitative and qualitative analyses, allowing us to obtain reliable results regarding test cases and the number of closed test cases in a shorter time frame while maintaining a high level of quality (Spicer, J. and Sanborn, A.N. 2019).

3.1. Research Design
This research is an experimental study aimed at investigating the impact on software quality in the Tester Community System (TCS) of utilizing a machine learning-based Selenium tool. The experimental design will employ a within-subject approach with two conditions. This design will enable a group of four experienced testers to assess the effectiveness of using the machine learning-based Selenium tool on TCS, which is a system of moderate complexity. Consequently, we will conduct two sets of experiments: one utilizing the machine learning-based Selenium tool and the other employing manual testing methods.

3.2. Data Collection Methods
We collected data through interviews with testers who have experience in both automation and manual testing. To select participants, we considered their expertise and involvement in the software quality domain. Additionally, we gathered post-experimental data using a survey containing five key questions that focused on aspects such as the clarity, ease, effectiveness, and reliability of the tool. The interviews were audio-recorded and later transcribed for analysis (see Appendix A for usability testing script and Appendix B for survey questions).

3.3. Sampling Strategy
The target population for this research comprised four experienced testers, each with a minimum of five years of experience. We selected testers as participants because the tool is specifically designed for testing purposes. These testers possess extensive experience in software quality control, encompassing both manual testing and automation testing, and they are familiar with the use of the Selenium tool. The decision to include four testers in the study was driven by the tool's learning curve, which requires a significant amount of time for proficiency. For ethical considerations, we obtained informed consent from all selected participants, explaining the study's purpose, voluntary participation, and response confidentiality (see Appendix C for consent form).

3.4. Data Analysis Techniques
We analyzed the transcribed interviews to understand the testers' opinions on the quality improvement facilitated by the machine learning-based Selenium tool. Additionally, we used a paired-samples t-test in a within-subject design to assess the significance of differences between observations when using the machine learning-based Selenium tool compared to manual testing on our system.
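A paired-samples comparison of this kind can be sketched in a few lines of Python. This is a minimal illustration and not the analysis script used in the study: the completion times are invented placeholder values for four testers, and the function name `paired_t_statistic` is hypothetical. The sketch computes the t statistic from the per-tester differences and compares it with the two-tailed 5% critical value for 3 degrees of freedom (3.182, taken from standard t tables).

```python
import math
import statistics


def paired_t_statistic(condition_a, condition_b):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(condition_a) != len(condition_b):
        raise ValueError("both conditions need one observation per participant")
    if len(condition_a) < 2:
        raise ValueError("at least two participants are required")
    # Each participant contributes one difference between the two conditions.
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    n = len(diffs)
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# HYPOTHETICAL completion times in minutes, one value per tester.
# These are placeholders for illustration, not data from the study.
manual_minutes = [30, 28, 35, 32]
ml_selenium_minutes = [20, 22, 24, 21]

t_value, dof = paired_t_statistic(manual_minutes, ml_selenium_minutes)

# Two-tailed critical value of Student's t for alpha = 0.05 and 3 degrees of freedom.
T_CRITICAL_DF3 = 3.182
significant = abs(t_value) > T_CRITICAL_DF3
print(f"t({dof}) = {t_value:.2f}, significant at 5%: {significant}")
```

With only four participants the test has 3 degrees of freedom, so the critical value is large and only substantial, consistent differences reach significance. A statistics package that reports an exact p-value could be used in place of the table lookup.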
In summary, Selenium's open-source nature, language flexibility, cross-browser compatibility, and parallel testing capabilities have made it an essential tool for software testers and developers. It has significantly contributed to streamlining the testing process, improving test coverage, and ensuring the reliability of web applications.

4. DATA ANALYSIS
In this section, we present the results of the research focused on the utilization of a Machine Learning (ML)-based Selenium tool for automation testing. The study aimed to assess the clarity, ease, time savings, effectiveness, and reliability of incorporating ML algorithms into the traditional Selenium framework. We asked the testers to begin the experiment by preparing the necessary prerequisites for implementing both types of testing: automation testing using ML-based Selenium and manual testing.

4.1. Implementing Automation and Manual Testing
A group of four experienced testers was asked to implement automation testing using ML-based Selenium, which integrates machine learning techniques with the Selenium automation testing framework. This integration aims to enhance Selenium's capabilities by leveraging ML algorithms to optimize test automation processes. After establishing all the necessary processes for utilizing ML-based Selenium, the team of four testers initiated the implementation phase. We asked them to test one test case by crafting a Selenium script for the login use case, as illustrated in Figure 1.
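A login check of this kind might look like the following minimal Python sketch. It is not the script from the study: the URL, the credentials, the element IDs (`username`, `password`, `login-button`), and the `/dashboard` success path are all hypothetical, since the TCS login page is not described here. The check is written as a function that accepts any WebDriver-like object, so the logic can be exercised without a browser. `main()` shows how it could be run against a real Chrome session and is deliberately not called.

```python
def run_login_test(driver, base_url, username, password):
    """Drive the login form and report whether the dashboard was reached.

    `driver` is any object exposing the WebDriver members used below
    (get, find_element, current_url). All locators are ASSUMED values.
    """
    driver.get(base_url + "/login")
    # "id" is the locator strategy string behind Selenium's By.ID constant.
    driver.find_element("id", "username").send_keys(username)
    driver.find_element("id", "password").send_keys(password)
    driver.find_element("id", "login-button").click()
    # Assumed success criterion: the application redirects to /dashboard.
    return "/dashboard" in driver.current_url


def main():
    """Run the check in a real browser (needs selenium and Chrome installed)."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        # Hypothetical local deployment and test account.
        passed = run_login_test(driver, "http://localhost:8000", "tester", "secret")
        print("login test passed" if passed else "login test failed")
    finally:
        driver.quit()  # always release the browser session
```

If the application logs in through JavaScript rather than a normal form submission, the URL may not have changed by the time `click()` returns. In that case an explicit wait (Selenium's `WebDriverWait` with an expected condition) would be a more robust success check than reading `current_url` immediately.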
6. CONCLUSION
In conclusion, this research delves into the intersection of machine learning and automation testing, particularly within the Selenium framework, to enhance the quality assurance process in software development. The integration of machine learning introduces a promising avenue for addressing challenges in fault prediction during software testing. The study focused on understanding the impact of machine learning on task difficulty, time requirements, and overall software quality within the Tester Community System (TCS). The literature review underscores the critical role of software testing in ensuring

REFERENCES
Chen, T.Y. et al. 1998. Metamorphic Testing: A New Approach for Generating Next Test Cases.
Florea, R. and Stray, V. 2019. The skills that employers look for in software testers. Software Quality Journal. 27, 4 (Dec. 2019), 1449–1479. DOI:https://doi.org/10.1007/s11219-019-09462-5.
Gamido, H. V. and Gamido, M. V. 2019. Comparative review of the features of automated software testing tools. International Journal of Electrical and Computer
you wish to be informed of the results of the research, please indicate this on the signature page below.

Withdrawing from the study
Your participation is completely voluntary. You are under no obligation to participate and are free to withdraw at any time without consequence. Your decision to withdraw will not influence your relationship with the researcher in any way. If we have begun reporting results, we will not be able to remove your data.

Signature Page
Project title: Utilizing Machine Learning for Predicting Software Faults Through the Selenium Testing Tools.
Researcher: Ghada Alsuwailem

Statement of consent
By signing this form, I agree that:
• The study has been explained to me
• All my questions have been answered
• Possible harm and discomforts and possible benefits (if any) of this study have been explained to me
• I have been told that my personal information will be kept confidential
• No information that would identify me will be released or printed without asking me first
• I will receive a signed copy of this consent form

You can still participate in the research if you select no:
I consent to being contacted in the future for participation in research studies □Yes □No

Email address:

Signature of the person obtaining consent
By signing this form, I attest that:
• I have explained the study to the prospective participant
• I answered all of their questions
• I provided a copy of this consent form to the participant
• The participant seemed to understand the consent form and agreed to participate

Name Signature Date