
Software Requirements Prioritisation Using Machine Learning

Arooj Fatima, Anthony Fernandes, David Egan and Cristina Luca
School of Computing and Information Science, Anglia Ruskin University, Cambridge, U.K.
ORCID: 0000-0001-6129-9032, 0000-0003-3456-2522, 0000-0002-4706-324X

Keywords: Software Requirement Prioritisation, Machine Learning, Classification, Requirements Analysis.

Abstract: Prioritisation of requirements for a software release can be a difficult and time-consuming task, especially when the number of requested features far outweighs the capacity of the software development team and difficult decisions have to be made. The task becomes more difficult when there are multiple software product lines supported by a software release, and yet more challenging when there are multiple business lines orthogonal to the product lines, creating a complex set of stakeholders for the release, including product line managers and business line managers. This research focuses on software release planning and aims to use Machine Learning models to understand the dynamics of the various parameters which affect whether software requirements are included in a software release plan. Five Machine Learning models were implemented and their performance evaluated in terms of accuracy, F1 score and K-Fold Cross Validation (mean).

1 INTRODUCTION

Software can be found in very diverse applications (Sommerville, 2016) and embedded software now makes up a large proportion of that software (Pohl et al., 2005, p14). With the recent development of applications for the Internet of Things (IoT) (Ashton, 2009), software is used to unlock the ability of basic hardware to service multiple applications. There are different sets of challenges associated with software depending on the business and technical environment. This paper explores one of these challenges: the prioritisation of software requirements for software releases, with particular focus on a complex use case of a company that produces wireless microchips for the IoT. When conflicting demands for additional features arise and there are insufficient resources for development, prioritisation of software requirements becomes very important.

This study reviews a prioritisation strategy that takes into account the requirements of a particular class of software system and setting, software product lines (SPL) (Devroey et al., 2017) along with multiple business lines (MBL) (Pronk, 2002), and applies machine learning capabilities to that strategy. The study examines requirements data related to a software release cycle in an IoT semiconductor company. This data is chosen as it is a good example of the scale and complexity of SPL/MBL. We seek to determine how the various inputs to the requirements prioritisation (RP) and planning process impact the result of the process: a set of requirements chosen to be implemented in the release.

The rest of this paper is organised as follows: Section 2 investigates the literature and state-of-the-art studies on the topic; Section 3 describes the proposed method; Section 4 outlines the results; Section 5 provides discussion about the results and experiments; finally, the conclusions are drawn in Section 6.

2 LITERATURE REVIEW

Prioritisation of software requirements becomes necessary when there are competing requests for new functionality with limited development resources (Wiegers and Beatty, 2013). In this paper we analyse requirements data from a specific type of software system and context: software product lines (Metzger and Pohl, 2014) with multiple business lines (Pronk, 2002) (SPL/MBL). SPL engineering enables a family of products to be developed by re-using shared assets (Metzger and Pohl, 2014), (Devroey et al., 2017), (Montalvillo and Diaz, 2016), which in the case of IoT may include common utilities, libraries and pieces of source code that are re-used in multiple software products, ensuring an efficient and effective use of engineering time.


SPL engineering is primarily an engineering solution to enable tailored software variants and to manage software product variability, customisation and complexity (Grüner et al., 2020), (Abbas et al., 2020). From an engineering point of view, product line requirements are handled in the domain engineering process, while business line requirements are managed in the application engineering process.

The problem of prioritisation must be looked at from the point of view of business owners and product managers. Thus the focus is on making business decisions rather than optimising operational efficiency, in order to establish business priorities in a complex software product environment. When planning a software release to address SPL/MBL, the challenges come from the absolute number of requirements to be prioritised as well as the complexity of the software release in terms of the number of product lines and the number of stakeholders. When multiple product lines are included in a single software release, one inevitable challenge is scale: the number of requirements increases as more products are included in the release. A second challenge is complexity, as product lines become dependent on shared assets and therefore on shared requirements for those assets. There is also potential for dependencies between requirements for different product lines, which adds further to complexity. Additional challenges arise when multiple business lines are involved in the process (Pronk, 2002). Building a robust product line platform while also creating customer or target market specific applications (Metzger and Pohl, 2014) means satisfying a matrix of stakeholders with inconsistent or even opposing views on priority, based on their specific product line or market segment interest. These three challenges of scale, complexity and inconsistency of stakeholders must be considered by any prioritisation method that is to be used with SPL/MBL.

Simple prioritisation methods work best when there are small numbers of requirements to prioritise. For instance, a simple pair-wise comparison (Sadiq et al., 2021), which requires that each requirement is assessed against all other requirements, takes about 12 hours to execute with just 40 requirements (Carlshamre et al., 2001). More advanced prioritisation and decision-making methods employ simple prioritisation methods as a foundation; for example, the Analytic Hierarchy Process (Saaty, 1977) uses pair-wise comparison.
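A back-of-the-envelope calculation (derived here from the figures just cited, not reported in the original study) illustrates how quickly this cost grows:

\[
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad \binom{40}{2} = \frac{40 \times 39}{2} = 780
\]

At the reported 12 hours for 40 requirements, this amounts to roughly 55 seconds per judgement; doubling the set to 80 requirements would already demand 3,160 comparisons, roughly quadrupling the effort.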
The topic of requirements prioritisation contains the analysis of the role software release planning plays in software development processes, suggestions for various RP strategies, and an expanding area of empirical research focused on comparisons that take benefits and disadvantages into consideration. (Perini et al., 2013) differentiate RP techniques into basic ranking techniques, which typically permit prioritisation along a single evaluation criterion, and RP methods, which incorporate ranking techniques inside a requirements engineering process. Relevant project stakeholders, such as customers, users and system architects, conduct rank elicitation, which can be done in a variety of ways. A fundamental strategy is ranking each need in a group of candidates in accordance with a predetermined standard (e.g., development cost, value for the customer). A requirement's rank can be stated either as an absolute measure of the assessment criterion for the requirement, as in Cumulative Voting (Avesani et al., 2015), or as a relative position with regard to the other requirements in the collection, as in bubble sort or binary search methods. A prioritising technique's usefulness depends on the kind of rank elicitation. For example, pair-wise evaluation reduces cognitive effort when there are just a few dozen criteria to be assessed, but with a high number of needs it becomes expensive (or perhaps impracticable) due to the quadratic growth in the number of pairings that must be elicited. The rankings produced by the various methods include requirements listed on an ordinal scale (Bubble Sort, Binary Search), requirements listed on a ratio scale (Analytical Hierarchy Process (AHP), 100 Points), and requirements assigned to an ordinal scale of groups or classes, as in Numerical Assignment (Perini et al., 2013).

The scalability of these strategies is directly linked with the proportional increase of the human effort. The computational complexity also depends on the number of criteria (n) to be prioritised, ranging from a linear function in n for Numerical Assignment or Cumulative Voting to a quadratic function for AHP. In order to handle numerous priority criteria, more organised software requirements prioritisation approaches employ ranking mechanisms (Perini et al., 2013).

The systematic review in (Svahnberg et al., 2010) investigated 28 papers that dealt with strategic RP models. 24 out of the 28 proposed models of strategic release planning, whereas the remaining investigations are concerned with validating some of the offered models. The EVOLVE family of release planning models makes up sixteen of these. Most techniques place a heavy emphasis on strict limitations and a small number of requirements selection variables. In around 58% of the models, soft variables have also been included. The reviewed work lacks validation on large-scale industrial projects.


Machine Learning (ML) based data analysis, estimation and prediction techniques have grown in popularity in recent years as a result of improvements in algorithms, computing power and availability of data. Traditional methods of requirement prioritisation are cumbersome, since there can be too many patterns to understand and program. Machine Learning has been used in many areas to analyse large datasets and identify patterns. Once it is trained to identify patterns in the data, it can construct an estimation or a classification model. The trained model can then detect, predict or recognise similar patterns or probabilities.

Duan et al. (Duan et al., 2009) propose partial automation of software requirements prioritisation using data mining and machine learning techniques. They used feature set clustering based on unsupervised learning and prioritised requirements mainly on the basis of business goals and stakeholders' concerns.

Perini et al. (Perini et al., 2013) compared the Case-Based Ranking (CBRank) requirements prioritization method (combined with machine learning techniques) with the Analytic Hierarchy Process (AHP) and concluded that their approach provided better results than AHP in terms of accuracy.

Tonella et al. (Tonella et al., 2013) proposed an Interactive Genetic Algorithm (IGA) for requirements prioritization and compared it with the Incomplete Analytic Hierarchy Process (IAHP). They used IAHP to avoid scalability issues with AHP and concluded that IGA outperforms IAHP in terms of effectiveness, efficiency, and robustness to user errors.

A number of other researchers have also explored clustering techniques combined with existing prioritization methods, i.e. case-based ranking (Avesani et al., 2015), (Qayyum and Qureshi, 2018), (Ali et al., 2021).

Most of the machine learning based techniques reviewed in this study build on some existing prioritisation technique and partially automate the process using different clustering methods. A requirements prioritization technique that fully automates the requirements prioritization process for large-scale systems with sufficient accuracy is lacking.

3 PROPOSED APPROACH

We have followed the simple methodology introduced by (Kuhn and Johnson, 2013) for their research on predictive modelling. The methodology is a standard process for most machine learning projects. It includes data analysis, pre-processing of data including feature selection, model selection including train/test split, fitting various models and tuning parameters, and evaluation to find the model which generalises better than others.

The performance of the algorithms has been evaluated using accuracy (the percentage of correctly classified data), speed (the amount of time needed for computation) and comprehensibility (how difficult an algorithm is to understand).
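A minimal sketch of this staged workflow, assuming scikit-learn and pandas (the paper does not name its tooling), is given below; the file name and column names are illustrative placeholders, not the actual dataset schema.

```python
# Minimal predictive-modelling skeleton: load, split, pre-process, fit, evaluate.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, f1_score

df = pd.read_csv("requirements.csv")             # hypothetical data export
X = df.drop(columns=["Release Commitment"])      # candidate features
y = df["Release Commitment"]                     # 1 = included, 0 = not

# Hold out a test set, then chain pre-processing and a first model so the
# same transformations are applied to training and test data alike.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # fill remaining numeric gaps
    ("model", DecisionTreeClassifier(random_state=0)),
])
pipe.fit(X_train, y_train)
pred = pipe.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred), "F1:", f1_score(y_test, pred))
```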
3.1 Dataset

This project uses real data produced by a company in the semiconductor business producing IoT wireless microchips. The data relates to the software requirements for bi-annual software release cycles in calendar year 2020 (20Q2 and 20Q4). The data has 283 samples, each representing a software requirement requested to be included in the software release. Each sample has various feature values, some of which were inputs to the original software release planning cycle, some were outputs of that cycle, and others were calculated or derived during the release planning process. During the original release planning cycle, these values were considered and discussed with stakeholders before the actual software release was finalised.

A key element of the original planning process was the use of themes to abstract and collate requirements into cohesive business initiatives. This served two purposes: a) to reduce the number of items to be discussed by business stakeholders; and b) to provide business stakeholders with something that they could comprehend.

Out of three available subsets of requirements, the most recent and focused data was selected in an attempt to get the best results.

3.2 Exploratory Data Analysis

In the exploratory analysis, detailed information about the main characteristics of the dataset is provided. The dataset has 40+ features that were carefully analysed. Table 1 presents a description of the key features.

Various statistical analyses were carried out to evaluate feature quality and predictability in relation to the target value. They provided us with a more thorough knowledge of the data.

The raw dataset had some inconsistencies, i.e. redundant features, zero values and missing values. Most of the features have multiple values for each sample, which require further processing. With respect to both zeros and missing values, the data is inevitably incomplete for a number of reasons, including that the process does not insist on complete data before starting the planning cycle, and that secondary versions of a field may not be used for many requirements.


Table 1: Exploratory Data Analysis.

Issue Key: Unique identifier for each requirement in the Jira database.
Release Commitment: Output of the prioritisation process; it has three categories, i.e. Q2 (requirement was included) and Complete (included and completed), with any other value indicating not included.
Estimate (wks): The total estimated time in weeks to complete the task. This feature was added to the data after the original prioritisation process.
(New) [MoSCoW]: Stakeholder assessment of the dependency of the theme on this requirement: Must, Should, Could or Won't.
(New) MoSCoW multiplier: Multiplier associated with the MoSCoW value.
Theme Category Divisor: Themes are categorised to indicate the type of strategic or tactical initiative. The highest ranked categories have a divisor of 1, whereas the lower ranked categories have higher divisors.
AOP/LTR Theme Rank: A ranking for themes based on the lifetime revenue (LTR) linked to that theme.
Cost: Cost of the requirement.

3.3 Data Pre-Processing

A number of steps were taken to transform sample features to make the data machine processable.

Data Transformation: The numerical features were extracted from the main dataset, special characters were removed from numerical data, and categorical values (such as Release Commitment) were mapped to numerical values.

Missing Values: After the initial transformation of the data, the next step was to handle missing and null values. All rows where data was missing or null were reviewed carefully. Rows were removed where it was not practical to perform feature engineering to fill in the missing values. Other missing values (where the data was a numerical spread and suitable for feature engineering) were filled in with the mean value of the given feature.
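A sketch of these two pre-processing steps, assuming pandas, is shown below; the column names and the row-removal rule are illustrative, not the exact ones used in the study.

```python
# Pre-processing sketch: clean numeric fields, binarise the target,
# drop unusable rows, mean-fill the remaining numeric gaps.
import pandas as pd

df = pd.read_csv("requirements.csv")  # hypothetical export of the Jira data

# Data transformation: strip special characters from a numeric field and
# map the three-valued target to two classes (1 = included in the release).
df["Cost"] = (df["Cost"].astype(str)
                        .str.replace(r"[^0-9.]", "", regex=True)
                        .replace("", None)
                        .astype(float))
df["Release Commitment"] = df["Release Commitment"].apply(
    lambda v: 1 if v in ("Q2", "Complete") else 0)

# Missing values: remove rows where imputation would not be meaningful,
# then fill remaining numeric gaps with the per-feature mean.
df = df.dropna(subset=["Theme Category Divisor"])  # illustrative rule
numeric = df.select_dtypes("number").columns
df[numeric] = df[numeric].fillna(df[numeric].mean())
```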
Calculating Feature Importance: We have used the Decision Tree classifier to learn the feature importance in our dataset. To calculate the importance of a feature, the Decision Tree model computes the impurity metric of the node (feature) and subtracts the impurity metric of any child nodes. The mean decrease in the impurity of a feature across all trees gives the score of how important that feature is (Scornet, 2020). Table 2 presents the importance ranking for the features produced by the model.

Table 2: Feature Importance Score by Decision Tree.

Theme Category Divisor: 0.483655
AOP LTR$ Theme Rank: 0.191668
Cost: 0.121553
Theme Value: 0.054635
(New) MoSCoW Multiplier: 0.046509
Reqs per Theme: 0.034841
Estimate (wks): 0.034282
Dependent on: 0.021122
Category Theme Rank: 0.011735
(New) MoSCoW 2 Multiplier: 0.000000

Based on the feature importance results, the dataset was tuned. We tested our models on the full dataset as well as on the tuned dataset.
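A minimal sketch, assuming scikit-learn, of how a Table-2-style ranking can be read off a fitted tree is shown below; `X_train`/`y_train` are the training split from the earlier sketch.

```python
# Derive an impurity-based feature importance ranking from a Decision Tree.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# feature_importances_ holds the normalised total impurity decrease
# contributed by each feature across the tree's split nodes.
ranking = (pd.Series(tree.feature_importances_, index=X_train.columns)
             .sort_values(ascending=False))
print(ranking.head(10))
```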


3.4 Visual Analysis

Various statistical and visual analysis methods were used to learn patterns in the data and to understand the relation of features to other features and to the target value.

The target variable Release Commitment has categorical values, which were converted to numeric data to make two classes, i.e. 1 (requirement included in the release) and 0 (requirement not included).

An analysis of the class distribution (see Figure 1) showed that the dataset has a moderate degree of imbalance. Since the degree of imbalance was not too high and our aim was to learn patterns for both classes, we chose to train our models on the true distribution.

Figure 1: Class Distribution.

The Correlation Matrix has been built to identify how features are correlated to each other. It can be seen from Figure 2 that Cost and Estimate are highly correlated; Theme Category Divisor is heavily linked with Category Theme Rank. Applicable to, (New) MoSCoW 2 multiplier and Theme Value2 are heavily correlated to Applicable to2. Theme Value 2 is also heavily correlated to Applicable to and (New) MoSCoW 2 multiplier. Theme Value seems to be inversely correlated to AOP/LTR$ Theme rank, LTR$ Theme rank and Category Theme Rank. Based on these observations, the features Issue Key, Release Commitment, First Requested Version, (New) MoSCoW 2 Multiplier, Dependent on2, Applicable to2, Tactical Value, Applicable to and Category Theme Rank were dropped for the experiments using the tuned dataset with the different models.
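A sketch of this step, continuing the earlier pandas sketch, is shown below; the drop list simply repeats the features named above.

```python
# Correlation analysis and construction of the tuned dataset.
corr = df.corr(numeric_only=True)   # pairwise correlations between features
# e.g. seaborn's heatmap(corr) would render a Figure-2-style view

to_drop = ["Issue Key", "Release Commitment", "First Requested Version",
           "(New) MoSCoW 2 Multiplier", "Dependent on2", "Applicable to2",
           "Tactical Value", "Applicable to", "Category Theme Rank"]
tuned = df.drop(columns=to_drop, errors="ignore")  # the tuned dataset
```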
4 EXPERIMENTS AND RESULTS

The goal of this research was to experiment with the application of Machine Learning models to the problem of software requirements prioritisation, to understand the dynamics of the various parameters included in a software release plan and to evaluate the results obtained. The models considered for the experiment were put through rigorous testing using the baseline dataset acquired from the pre-processing techniques.

The dataset was split into 80% training and 20% testing data. Experiments were done in a series of iterations, aiming to tune the dataset and improve the results.

Five different ML models have been used for this research: Decision Tree Classifier, K-Nearest Neighbours (KNN), Random Forest, Logistic Regression and Support Vector Machine. Five metrics have been used to evaluate the ML models implemented: accuracy, F1 score, precision, recall and K-Fold Cross Validation (mean). For an overall comparison of the results, we only considered accuracy, F1 score and k-fold cross validation mean. All the models have been trained on the full as well as the tuned datasets.
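A sketch of this experiment loop, assuming scikit-learn and continuing the earlier sketches, is given below: the five classifiers named above, evaluated by accuracy, F1 and k-fold cross validation mean (the fold count is not stated in the paper; 5 is an assumption here).

```python
# Fit and score the five candidate classifiers on the 80/20 split.
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbours": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}
for name, model in models.items():
    model.fit(X_train, y_train)        # 80% training split
    pred = model.predict(X_test)       # 20% held-out test split
    cv = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: acc={accuracy_score(y_test, pred):.2f}, "
          f"f1={f1_score(y_test, pred):.2f}, cv_mean={cv:.2f}")
```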
In this section we present the results for each implemented model.

4.1 Decision Tree Classifier

Table 3 presents the results on the full and tuned datasets using the Decision Tree model. The accuracy and F1 score dropped after tuning the dataset; however, the K-Fold Cross Validation score improved for the tuned dataset.

Table 3: Decision Tree - full and tuned datasets.

Performance Metric (full dataset / tuned dataset):
Accuracy: 0.96 / 0.94
F1 score: 0.96 / 0.94
Precision: 0.97 / 0.95
Recall: 0.96 / 0.94
K-fold cross validation mean: 0.89 / 0.92

Cross validation is an important metric since it can flag problems like selection bias and over-fitting. Despite a drop in accuracy, tuning the dataset has a visible impact on the cross validation score.

4.2 K-Nearest Neighbours (KNN)

Table 4 presents the results of KNN for the full and tuned datasets. The accuracy, precision, recall and F1 score dropped after tuning the dataset; however, the k-fold cross validation (mean) increased.

Table 4: K-Nearest Neighbours - full and tuned datasets.

Performance Metric (full dataset / tuned dataset):
Accuracy: 0.94 / 0.92
F1 score: 0.94 / 0.92
Precision: 0.95 / 0.92
Recall: 0.94 / 0.92
K-fold cross validation mean: 0.80 / 0.82

4.3 Random Forest

Table 5 presents the results of Random Forest on the full and tuned datasets. The accuracy and F1 score were the same after tuning the dataset; however, the k-fold cross validation (mean) increased. The precision and recall scores also remained the same, indicating that the removal of features has limited impact on the scores of Random Forest.

The Random Forest model generalised very well to the data. We did some further experiments with this model, which are detailed in Section 5.


Figure 2: Heatmap of correlation matrix.

Table 5: Random Forest - full and tuned datasets.

Performance Metric (full dataset / tuned dataset):
Accuracy: 0.94 / 0.94
F1 score: 0.94 / 0.94
Precision: 0.95 / 0.95
Recall: 0.94 / 0.94
K-fold cross validation mean: 0.89 / 0.90

4.4 Logistic Regression

Table 6 presents the results of Logistic Regression for the full and tuned datasets. The accuracy, precision, recall and F1 scores improved after tuning the dataset. However, the k-fold cross validation (mean) remained the same.

Table 6: Logistic Regression - full and tuned datasets.

Performance Metric (full dataset / tuned dataset):
Accuracy: 0.86 / 0.88
F1 score: 0.86 / 0.87
Precision: 0.87 / 0.90
Recall: 0.87 / 0.88
K-fold cross validation mean: 0.76 / 0.76

4.5 Support Vector Machine

Table 7 presents the results of the Support Vector Machine (SVM) for the full and tuned datasets. It can be seen that there was improvement in accuracy, F1 score, precision, recall and k-fold cross validation mean after tuning the dataset.


Table 7: SVM - full and tuned datasets.

Performance Metric (full dataset / tuned dataset):
Accuracy: 0.87 / 0.88
F1 score: 0.86 / 0.88
Precision: 0.86 / 0.88
Recall: 0.87 / 0.88
K-fold cross validation mean: 0.87 / 0.89

5 DISCUSSIONS

Accuracy, F1 score and K-Fold Cross Validation (mean) have been used to evaluate the ML models implemented, with the results shown in Table 9. All models have performed well, with an accuracy score above 80% for both the tuned and full datasets. The F1 score, which is a better indicator of a model's performance, shows that Logistic Regression and Support Vector Machine performed slightly worse than the other models. K-Nearest Neighbours, the Decision Tree classifier and Random Forest have consistently high results across all three evaluation metrics. The F1 score for the Decision Tree Classifier is the highest; however, this model is prone to overfitting, and this is evident in the decrease in its F1 score for the tuned dataset.

As Random Forest had promising results, we did some further experiments with hyperparameter tuning. After implementing and testing different imputers, such as simple and iterative, it was concluded that the simple imputer provided the best results with the mean and median strategies. After tuning the hyperparameters for Random Forest, the results were substantially higher (see Table 8). The only drawback is the execution time (203.0259862 seconds) for Random Forest while its hyperparameters are tuned.
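A sketch of this tuning step, assuming scikit-learn, is shown below; the grid values are illustrative, as the paper does not report the exact search space (and the iterative imputer it also tested is omitted for brevity).

```python
# Grid search over imputation strategy and Random Forest hyperparameters.
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.ensemble import RandomForestClassifier

pipe = Pipeline([
    ("impute", SimpleImputer()),                  # the simple imputer above
    ("rf", RandomForestClassifier(random_state=0)),
])
grid = {
    "impute__strategy": ["mean", "median"],       # strategies compared above
    "rf__n_estimators": [100, 300, 500],
    "rf__max_depth": [None, 5, 10],
}
# Exhaustive search with cross validation: accurate but slow, which is
# consistent with the long execution time reported above.
search = GridSearchCV(pipe, grid, cv=5, scoring="f1")
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```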
Table 8: Random Forest - results after tuning hyper-parameters.

Accuracy: 1.0
F1 score: 1.0
Precision: 1.0
Recall: 1.0
K-fold cross validation mean: 0.91

To meet the project goal of understanding the impact of certain parameters on the inclusion of a software requirement in a release, there are several notable outcomes:

• Overall, the level of accuracy in predicting requirements priority using various machine learning models is positive and indicates that there may be value in extending this research to develop this concept further;

• Estimate (wks) and Cost were identified as parameters that were essential for the data modelling. They ranked 7th and 3rd respectively in terms of their importance. However, the original software requirement prioritisation process was completed before either the estimate or the cost was derived/calculated, and they were added later in the cycle. This could indicate that even when estimate or cost information was not available, stakeholders had an intuitive understanding of the size of the requirement when providing their inputs to the prioritisation process;

• Theme Category Divisor was found to be the most important parameter. This parameter is an indicator of the type of theme that a requirement is associated with, identifying the strategic/tactical nature of the theme. This could indicate that: a) the use of themes had a large impact on prioritisation; and b) strategic themes and requirements were more likely to be included in the release.

6 CONCLUSION

The literature review of prioritisation of requirements in software releases for Software Product Lines with Multiple Business Lines (SPL/MBL) highlighted many strategies and methods; however, the existing strategies would not fit best for the present use case. Investigation into Machine Learning models led to the implementation of five of them and a successful comparison of their performance. The dataset was tuned and features were carefully selected. All selected models were trained and tested to get the predictions. Most models were able to achieve 80% accuracy, and further investigation and testing yielded better results. The best results were achieved with the Decision Tree classifier, Random Forest and K-Nearest Neighbours. The Decision Tree Classifier is known to be prone to overfitting at times, and Random Forest can overcome the overfitting problem. Hence, hyperparameter tuning was performed for Random Forest, which gave 100% accuracy in many performance metrics and 91% at k-fold cross validation. However, the computational effort was considerably higher after hyperparameter tuning. In future, hyperparameter tuning may be performed for the other models to explore and evaluate the results and derive further conclusions.


Table 9: Results for the Full and Tuned Datasets (each cell: full dataset / tuned dataset).

Model | Accuracy | F1 Score | K-Fold Cross Validation (Mean) | Execution time (s)
Decision Tree | 0.96 / 0.94 | 0.96 / 0.94 | 0.89 / 0.90 | 0.0860982 / 0.085749
Random Forest | 0.94 / 0.94 | 0.94 / 0.86 | 0.89 / 0.92 | 2.0259862 / 2.2580502
Logistic Regression | 0.86 / 0.88 | 0.86 / 0.87 | 0.76 / 0.76 | 2.6350644 / 3.1730906
K-Nearest Neighbour | 0.94 / 0.92 | 0.94 / 0.92 | 0.80 / 0.82 | 4.106245 / 3.1730906
SVM | 0.87 / 0.88 | 0.86 / 0.88 | 0.87 / 0.89 | 3.8039046 / 3.282107

REFERENCES

Abbas, M., Jongeling, R., Lindskog, C., Enoiu, E. P., Saadatmand, M., and Sundmark, D. (2020). Product line adoption in industry: An experience report from the railway domain. In Proceedings of the 24th ACM Conference on Systems and Software Product Line. Association for Computing Machinery.

Ali, S., Hafeez, Y., Hussain, S., Yang, S., and Jamal, M. (2021). Requirement prioritization framework using case-based reasoning: A mining-based approach. Expert Systems, 38(8):e12770.

Ashton, K. (2009). The 'internet of things' thing.

Avesani, P., Perini, A., Siena, A., and Susi, A. (2015). Goals at risk? Machine learning at support of early assessment. In 2015 IEEE 23rd International Requirements Engineering Conference (RE), pages 252–255.

Carlshamre, P., Sandahl, K., Lindvall, M., Regnell, B., and Natt och Dag, J. (2001). An industrial survey of requirements interdependencies in software product release planning. In Proceedings 5th IEEE International Symposium on Requirements Engineering, pages 84–91.

Devroey, X., Perrouin, G., Cordy, M., Samih, H., Legay, A., Schobbens, P.-Y., and Heymans, P. (2017). Statistical prioritization for software product line testing: an experience report. Software & Systems Modeling, 16(1):153–171.

Duan, C., Laurent, P., Cleland-Huang, J., and Kwiatkowski, C. (2009). Towards automated requirements prioritization and triage. Requirements Engineering, 14(2):73–89.

Grüner, S., Burger, A., Kantonen, T., and Rückert, J. (2020). Incremental migration to software product line engineering. In Proceedings of the 24th ACM Conference on Systems and Software Product Line, pages 1–11.

Kuhn, M. and Johnson, K. (2013). Applied Predictive Modeling. Springer, London.

Metzger, A. and Pohl, K. (2014). Software product line engineering and variability management: Achievements and challenges. In Future of Software Engineering Proceedings.

Montalvillo, L. and Diaz, O. (2016). Requirement-driven evolution in software product lines: A systematic mapping study. Journal of Systems and Software, 122:110–143.

Perini, A., Susi, A., and Avesani, P. (2013). A machine learning approach to software requirements prioritization. IEEE Transactions on Software Engineering, 39(4):445–461.

Pohl, K., Böckle, G., and Van Der Linden, F. (2005). Software Product Line Engineering: Foundations, Principles, and Techniques, volume 1. Springer.

Pronk, B. J. (2002). Product line introduction in a multi-business line context. International Workshop on Product Line Engineering: The Early Steps: Planning, Modelling and Managing.

Qayyum, S. and Qureshi, A. (2018). A survey on machine learning based requirement prioritization techniques. In Proceedings of the 2018 International Conference on Computational Intelligence and Intelligent Systems, pages 51–55.

Saaty, T. L. (1977). A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology, 15(3):234–281.

Sadiq, M., Sadim, M., and Parveen, A. (2021). Applying statistical approach to check the consistency of pair-wise comparison matrices during software requirements prioritization process. International Journal of System Assurance Engineering and Management, pages 1–10.

Scornet, E. (2020). Trees, forests, and impurity-based variable importance. arXiv preprint arXiv:2001.04295.

Sommerville, I. (2016). Software Engineering. Pearson Education, Boston, 10th edition.

Svahnberg, M., Gorschek, T., Feldt, R., Torkar, R., Saleem, S. B., and Shafique, M. U. (2010). A systematic review on strategic release planning models. Information and Software Technology, 52(3):237–248.

Tonella, P., Susi, A., and Palma, F. (2013). Interactive requirements prioritization using a genetic algorithm. Information and Software Technology, 55(1):173–187.

Wiegers, K. and Beatty, J. (2013). Software Requirements. Microsoft Press, Redmond, Washington, 3rd edition.
