Bayesian Network Model For Task Effort Estimation in Agile Software Development


The Journal of Systems and Software 127 (2017) 109–119

Contents lists available at ScienceDirect

The Journal of Systems and Software


journal homepage: www.elsevier.com/locate/jss

Bayesian network model for task effort estimation in agile software development

Srdjana Dragicevic a, Stipe Celar b,∗, Mili Turic c

a Split Airport, Cesta dr. Franje Tudmana 1270, 21217 Kastel Stafilic, Croatia
b Department of Electronics, FESB, University of Split, R. Boskovica 32, 21000 Split, Croatia
c Venio indicium d.o.o., Doverska 19, 21000 Split, Croatia

Article history: Received 11 December 2015; Revised 24 January 2017; Accepted 30 January 2017; Available online 31 January 2017.

Keywords: Bayesian network; Effort prediction; Agile software development

Abstract

Even though the use of agile methods in software development is increasing, the problem of effort estimation remains quite a challenge, mostly due to the lack of many of the standard metrics used for effort prediction in plan-driven software development. The Bayesian network model presented in this paper is suitable for effort prediction in any agile method. Simple and small, with inputs that can be easily gathered, the suggested model has no practical impact on agility. This model can be used as early as possible, during the planning stage. The structure of the proposed model is defined by the authors, while the parameter estimation is automatically learned from a dataset. The data are elicited from completed agile projects of a single software company. This paper describes various statistics used to assess the precision of the model: mean magnitude of relative error, prediction at level m, accuracy (the percentage of successfully predicted instances over the total number of instances), mean absolute error, root mean squared error, relative absolute error and root relative squared error. The obtained results indicate very good prediction accuracy.
© 2017 Elsevier Inc. All rights reserved.

1. Introduction

In recent years, agile methodologies have been widely accepted in software development. According to a survey (Version One, 2007), only 3.4% of the surveyed companies have never used an agile methodology in software development projects.

The term "agile methods" refers to a number of methods that share the same goals and values (Beck et al., 2001). They are all based on development iterations, incremental improvements and continuous feedback, and they share a lack of formal documentation and specification. Requirements are elicited and specified at the beginning of each iteration. That ensures quick responses to requirements changes and minimal waste of time, but it results in (Helmy et al., 2012; Nawrocki et al., 2002; Sillitti and Succi, 2005):

• A lack of a bigger picture; connections between requirements may be missing.
• Implicit elicitation of non-functional requirements; eliciting and managing techniques are not provided.
• Problems in product maintenance; documentation is limited.
• Lack of engineering activities that help create a specification of the expected requirements.

In addition, agile development shares the same problems regarding communication between stakeholders with plan-driven software development, because it uses the same elicitation techniques (Sillitti and Succi, 2005).

To resolve the above mentioned problems, we have created the Method for Elicitation, Documentation and Validation of Software User Requirements (MEDoV) (Dragicevic et al., 2011; Dragicevic and Celar, 2013). MEDoV was successfully applied in an agile software development project (Dragicevic et al., 2014). The project was completed on time, within the estimated budget, and no unnecessary features were developed. Of course, the success of the application of the method in a single project is not sufficient for its final validation. Before applying the method to other projects, it is necessary to define metrics by which the efficiency of the method will be measured in an agile environment.

Incorrect and incomplete requirements are the major causes of failure of software projects (Standish Group, 2009). Therefore, a good method for eliciting and documenting user requirements will certainly contribute to the success of the entire project. Moreover, in agile software development the requirements prioritization is the client's responsibility (Sillitti and Succi, 2005). These are the reasons why this metric should be able to measure the project success. The success of a software project depends on project effort, cost and the quality of the final product. When the project is completed, it is easy to determine its success. What we would like to confirm is that, before the final validation of a project success, one can use this metric to predict the success.

∗ Corresponding author. E-mail addresses: [email protected] (S. Dragicevic), [email protected] (S. Celar), [email protected] (M. Turic).

http://dx.doi.org/10.1016/j.jss.2017.01.027
0164-1212/© 2017 Elsevier Inc. All rights reserved.

Traditional software project prediction models have proven either to be unreliable, or to require sophisticated metrics to be rendered reliable (Borade and Khalkar, 2013), both representing a problem in agile development. Many metrics used in traditional software development project planning simply cannot be used in agile development project planning.

Agile development teams usually use story points to measure the effort needed to implement a user story. Story points are useful to compare the technical complexity, effort and uncertainty of different user stories, as well as to measure project velocity. Sometimes, project managers assign x hours for y story points and estimate the hours required to complete the task. But this technique is not appropriate to define effort in hours or days absolutely, because story points correspond to time only by distribution and equivalence, e.g., 1 story point = 5 h is not valid for all times, all teams and all projects. Instead of that, a project manager can use velocity to estimate how many story points can be implemented in a sprint. But velocity needs three or four sprints to stabilize, and it is not suitable for use at the beginning of a project.

Consequently, agile teams use a technique called ideal days/hours (Cohn, 2005). The developers estimate how many days/hours are required to finish the task if they are focused exclusively on that task, without interruptions such as meetings, drop-ins, phone calls, etc. A task that requires eight ideal hours can take two or three days because of interruptions. It can take even more days if a developer works on several tasks at the same time, not to mention that developers usually underestimate the needed effort.

Moreover, Jorgensen (2013) shows that effort estimation depends on the direction of comparison. When developers compare story A to story B, the effort estimation is different from that when they compare story B to story A. The results are particularly inaccurate when comparing a large user story with a much smaller one.

Therefore, the main objective of this research is to find a technique that will facilitate the assessment of the required effort. This technique should be suitable for use even in the planning stage, and help project managers in further agile software development. It should not affect the agility and it should be suitable for any agile method.

Typical problems of traditional effort, cost and quality prediction models can be overcome by using BN models (Fenton and Neil, 1999; Fenton et al., 2008) due to the:

• Flexibility of the BN building process (based purely on expert judgment, empirical data, or the combination of both).
• Ability to reflect causal relationships.
• Explicit incorporation of uncertainty as a probability distribution for each variable.
• Graphical representation that makes the model clear.
• Ability of both forward and backward inferences.
• Ability to run the model with missing data.

It has been shown (Celar et al., 2012; Jorgensen, 2010, 2014) that relevant empirical data can significantly increase the accuracy of predictions. Consequently, data from real agile projects are used for building this BN model. The model is intended for the prediction of smaller parts of projects (project tasks) and not for their scheduling. For that reason, the terms "effort" and "duration" are used interchangeably in this paper.

Various statistics are used to determine the accuracy of prediction in software estimation. The most commonly used metrics are the Magnitude of Relative Error (MRE), the Mean Magnitude of Relative Error (MMRE), and the Prediction at Level m (Pred. (m)), although some authors suggest that other statistics represent more appropriate metrics (Foss et al., 2003; Kitchenham et al., 2001; Korte and Port, 2008).

The remainder of this paper is structured as follows: the investigation of current usages of Bayesian network (BN) models for software effort prediction is described in Section 2, the BN is explained in Section 3, the conditions which the proposed BN model should meet are given in Section 4, and the building process is described in detail in Section 5. The conclusions as well as the outlines of future work are presented in Section 6.

2. Related works

Effort estimation is inherently inaccurate. There are many reasons that cause imprecision: the lack of relevant information, the subjective nature of some metrics, the complex interaction between metrics, and the amount of effort required to gather metrics. The Bayesian network has been proposed to reduce these uncertainties, since the BN automatically deals with the uncertainty and risk due to its own statistical nature.

Mendes et al. (2012) depict the successful use of BN models for web development effort estimation and mention previous papers which describe how hybrid Bayesian network models (structure expert-driven and probabilities data-driven) have outperformed the mean and median-based effort, multivariate regression, case-based reasoning, and classification and regression trees. This is the reason why the BN is used in software development projects for effort estimation (Bibi and Stamelos, 2004; Mendes et al., 2012), reliability evaluation (Si et al., 2014), quality prediction (Jeet et al., 2011; Schulz et al., 2010), risk assessment (Chatzipoulidis et al., 2015; Lee et al., 2009), testing-effort estimation (Wooff et al., 2002), and so on. But only a few uses of BN have been reported in agile software development projects. A Systematic Literature Review (Usman et al., 2014) confirms these results. The most commonly used assessment methods in agile software development are expert judgment, planning poker and use case points, even though these methods do not result in good prediction accuracy.

Hearty et al. (2009) describe the use of a Dynamic Bayesian Network causal model for the Extreme Programming (XP) methods. A Dynamic Bayesian Network (DBN) is a BN expanded by a temporal dimension, so that changes over time can be modelled. The changes of XP's key Project Velocity metric are modelled to make effort estimation and risk assessments. The model is validated against a real-world XP project.

Abouelela and Benedicenti (2010) also use a BN for modelling the XP software development process. This model consists of two models: one for the estimation of the project duration, and the other for the estimation of the expected defect rate. The estimations are based on the use of three XP practices: Pair Programming, Test Driven Development and Onsite Customer. The model is validated against two XP projects.

Perkusich et al. (2013) use a BN model for Scrum project modelling to provide information to the Scrum Master for problem detection. It is validated with data from ten different scenarios.

Nagy et al. (2010) create another BN model to assist a project manager in decision making. This BN model evaluates several key factors which influence the development of software in order to detect problems as early as possible. The model is not validated.

Due to the small number of papers on the use of the BN model in agile software development projects, we also examine the use of BN in iterative development. Torkar et al. (2010) depict the use of a DBN for test effort estimation. The DBN model is based on process and resource measurements. Process measurement is related to software activities such as development and support. Resource measurement is related to assets such as people, tools and equipment. The DBN model consists of two sub-models: the test process overall effectiveness model and the test effort model. The authors use data collected from two industrial projects to validate the DBN model.

The Systematic Literature Review of Usman et al. (2014) shows that only Extreme Programming (XP) and Scrum methods have been investigated in the estimation studies. However, development teams usually do not strictly stick to all the practices of the selected methodology. They mostly choose the subset of agile practices which is suitable for the specific project (Williams, 2010). Consequently, the estimation model should not depend on the selected agile practices and methods.

We are not aware of any research that attempts to use a BN model to predict agile project effort, regardless of the selected agile method and without impact on agility. In all the above mentioned papers, the validation processes (if performed) have one common characteristic: all the BN models are validated against a small number of projects (only one or two). This is the reason we take a project task as the smallest estimation and validation record – there are 160 project tasks in our proposed model. Moreover, our BN model is also validated against a small number of project tasks.

Some of the mentioned BN models are relatively big (Perkusich et al., 2013). We want to create a model which will have a satisfactory accuracy with a minimal set of input data. DBN models are usually smaller, but they are unable to predict effort in the first iteration (Hearty et al., 2009; Torkar et al., 2010). It is extremely important that the BN model can be used as early as possible.

To summarize, the existing BN models do not estimate task effort. Besides, none of the above described models meets all the following requirements:

• Suitability for agile development, regardless of the used agile methods and/or practices.
• Minimal set of input parameters, provided that the method predicts with at least 75% accuracy.
• Possibility of using the BN model at the start of the project.
• Validation based on a larger sample size (not just a few samples).

3. Bayesian network

A Bayesian network (BN) is a graphical model that describes probabilistic relationships between causally related variables.

The BN is formally determined by the pair BN = (G, P), where G is a directed acyclic graph (DAG), and P is a set of local probability distributions for all the variables in the network. A directed acyclic graph G = (V(G), E(G)) consists of a finite, nonempty set of tuples V(G) = {(s_1, V_1), (s_2, V_2), ..., (s_n, V_n)} and a finite set of directed edges E(G) ⊆ V(G) × V(G) (see Fig. 1). The nodes V_1, V_2, ..., V_n correspond to random variables X = (X_1, ..., X_n) that can take on a certain set of values s_i (depending on the problem being modelled). The terms variable and node will be used interchangeably in this paper. The edges E(G) = {e_{i,j}} represent dependencies among variables. A directed edge e_{i,j} from V_i to V_j, for V_i, V_j ∈ V(G), shows that V_i (parent node) is a direct cause of V_j (child node).

Each variable X_i has a probability distribution P(X_i | parent(X_i)) which shows the impact of a parent on a child. If X_i has no parents, its probability distribution is unconditional; otherwise, it is conditional. The probability distribution of the variables in a BN must satisfy the Markov condition, which states that each variable X_i is independent of its nondescendants, given its parents in G (Charniak, 1991). The BN decomposes the joint probability distribution P(X_1, ..., X_n) into a product of conditional probability distributions for each variable given its parents:

$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \pi(x_i)),$$

where $\pi(x_i)$ stands for the set of parents of $x_i$, or, in other words, the set of nodes that are directly connected to $x_i$ via a single edge.

Fig. 1. The BN model and the associated Node Probability Tables (NPTs).

An example of a simple BN is shown in Fig. 1, together with the associated Node Probability Tables (NPTs). Each row in the NPT represents a conditional probability distribution and, therefore, its values sum up to 1. The nodes "New Feature" and "Report Complexity" are root nodes (nodes without parents), so a priori probabilities are defined for them. Conditional probabilities for all possible combinations of outcomes of its parents are defined for the node "Requirements Complexity". For example, the conditional probability of "Requirements Complexity" being 'High', given "New Feature" being 'Yes' and "Report Complexity" being 'Medium', is 0.3.

The BN is mainly used for presenting ambiguities in various domains¹ in a simple and easily understandable way, for making predictions as well as diagnostics, for computing the probabilities of occurrence of an event, and for updating the calculations according to evidence (Fenton and Neil, 1999; Fenton et al., 2008).

¹ A query "Bayesian" in the Scopus database (all areas, all time periods) results in 123,139 papers (Scopus, 2016a). Another query "Bayesian" in the Scopus database that excludes computer science, engineering and mathematics papers (all time periods) results in 48,602 papers (Scopus, 2016b).

4. The proposed BN model

4.1. The BN model requirements

Agile methods avoid the formalisms of traditional specification and design techniques. The downside of this is a lack of specification metrics for project planning. At the same time, agile project managers have to plan their projects as any other traditional project manager. The main purpose of this paper is to build a BN which can help agile project managers predict project effort. The proposed BN model should meet the following conditions:

• Applicability for any agile method.
• Minimization of the impact on agility:
  ◦ Input data must be simple to collect.
  ◦ The model must be as small and simple as possible.
• Predictability of effort so that (a lack of) success can be evaluated.
• Learnability from newly entered data.
• Possibility of use as early as possible, during the planning phase, before the actual start.
• Ability to process different data types (Boolean, rank, integer, etc.).

When building a BN model, it is essential to define the problem which is to be solved. Initially, the intention was to build a BN model for sprint (iteration) effort prediction. The model was planned to use the existing database of software projects for model validation. These are the projects of a micro software company which has been using agile methods for several years now. The project manager (scrum master) has two databases:

• The list of software entities (with their complexity classifications) extracted for each task from the project log (Celar et al., 2012).
• The list of knowledge and skills calculated for each developer, including motivation and experience, classified into 5 levels (from 1, very low level, to 5, very high level), and updated twice a year (Celar et al., 2014).

When planning new tasks, the project developers and the project manager try to find the best solution for both sides: developers' ideas and project productivity. When working on a task, the developers record their working hours and the types of activities (that takes only 1–2 minutes at the end of a workday). So, they always know the real task and project status. Thanks to this short developers' activity, the project manager receives very valuable information. Consequently, the authors realized very early in the process that it was convenient to build a BN model for task effort prediction because:

• It is easier to collect input data.
• Input data are less complex.
• It is easier for a project manager to estimate a node value (e.g., whether the complexity of a report is "low", "medium" or "high").
• It is easy to predict the iteration duration based on each developer's predicted task effort.

4.2. Measures to assess the accuracy of BN models

Some uncertainty is always included in effort prediction because each project is unique – there are no two projects with the same requirements, priorities, technology or developers. Uncertainty decreases significantly as new knowledge is obtained during the project. Anyhow, at the beginning, when a lot of information is unknown, effort estimation is especially difficult. As the project progresses, estimations become increasingly accurate (see Fig. 2), but in the early phase of the project the cone of uncertainty shows that the actual value varies from 40 to 250% (McConnell, 2006) or from 60 to 160% (Kan, 2002) compared to the estimation. Models using BN for effort prediction declare an accuracy range from 14 to 100% (Radlinski, 2010). A prediction accuracy of 80% (significantly over all predictions) or more is usually satisfactory.

Fig. 2. The cone of uncertainty (Kan, 2002).

The Mean Magnitude of Relative Error (MMRE) and the Prediction at Level m (Pred. (m)) are the two most commonly used metrics to assess the accuracy of prediction in software estimation. These measures are also used in this research.

But, in our research, there is only one prediction per task. Therefore, other statistical measures are also used to assess the accuracy of this model: Accuracy, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE). These measures are chosen due to their simplicity of use and because of their application areas (Hernandez-Orallo et al., 2012; Kim et al., 2014; Sarwar et al., 2001).

MMRE is the average of the Magnitude of Relative Errors (MREs) calculated over all the reference tasks:

$$\mathrm{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MRE}_i.$$

MRE represents the normalized measure of deviation between the actual and the estimated values:

$$\mathrm{MRE}_i = \frac{|y_i - f(x_i)|}{y_i}.$$

Although MMRE is the most frequently used prediction accuracy measure, there are criticisms of its applicability. Kitchenham et al. (2001) suggest that MMRE is more suitable as a goodness-of-fit statistic than for the evaluation of prediction. MMRE is a measure of the standard deviation (spread) of a variable x_i (x_i = predicted value / actual value), while Pred. (m) is a measure of the peakedness (kurtosis) of the variable x. MMRE prefers models that predict estimates below the mean, and it is also sensitive to outliers (Foss et al., 2003; Korte and Port, 2008). The sensitivity to outliers allows MMRE to detect whether the model occasionally tends to be very inaccurate.

Pred. (m) measures the percentage of estimates that are within m percent of the actual values. It is usually set to m = 25. Pred. (25) detects what percentage of estimates is within a tolerance of 25%.

Accuracy is the percentage of the correctly classified instances over the total number of instances. The accuracy can range from 0 to 100%.

MAE is the average of the absolute values of the prediction errors, given by:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |f(x_i) - y_i|,$$

where n is the number of predictions, f(x_i) is a predicted value, and y_i is an observed value. All the errors are weighted equally due to the linear score.

RMSE is another measure of the deviation between the predicted value f(x_i) and the real value y_i:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2}.$$

Large errors are weighted more heavily because the errors are averaged after they are squared. The error variance can be detected if RMSE and MAE are used together. The variation is greater if the difference between them is larger. If all the errors have the same magnitude, MAE = RMSE; otherwise, RMSE > MAE.

Both MAE and RMSE are useful for the comparison of the prediction errors of different models for a particular variable and not for the comparison between variables. They show errors in the same unit and scale as the parameter itself, so they are scale-dependent.

We also include measures that can be used for the comparison of models whose errors are measured in different units. Such measures include:

• Relative Absolute Error (RAE):

$$\mathrm{RAE} = \frac{\sum_{i=1}^{n} |f(x_i) - y_i|}{\sum_{i=1}^{n} |\bar{y} - y_i|}.$$

• Root Relative Squared Error (RRSE):

$$\mathrm{RRSE} = \sqrt{\frac{\sum_{i=1}^{n} (f(x_i) - y_i)^2}{\sum_{i=1}^{n} (\bar{y} - y_i)^2}},$$

where $\bar{y}$ is the mean value of the observed values $y_i$.
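As a reference for the results reported later (Table 4), the Python sketch below transcribes the accuracy statistics defined in this section into plain functions. It is a direct reading of the formulas, not the WEKA implementation the authors use (WEKA additionally normalizes its error statistics over class probability distributions, as noted in Section 5.5); the example actual efforts are taken from Table 2 and the predictions are made up.

```python
import math

def mmre(actual, predicted):
    """Mean Magnitude of Relative Error: mean of |y - f| / y."""
    return sum(abs(y - f) / y for y, f in zip(actual, predicted)) / len(actual)

def pred(actual, predicted, m=25):
    """Pred.(m): share of estimates within m percent of the actual value."""
    hits = sum(1 for y, f in zip(actual, predicted) if abs(y - f) / y <= m / 100.0)
    return hits / len(actual)

def mae(actual, predicted):
    return sum(abs(f - y) for y, f in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(sum((f - y) ** 2 for y, f in zip(actual, predicted)) / len(actual))

def rae(actual, predicted):
    y_bar = sum(actual) / len(actual)
    return (sum(abs(f - y) for y, f in zip(actual, predicted))
            / sum(abs(y_bar - y) for y in actual))

def rrse(actual, predicted):
    y_bar = sum(actual) / len(actual)
    return math.sqrt(sum((f - y) ** 2 for y, f in zip(actual, predicted))
                     / sum((y_bar - y) ** 2 for y in actual))

# Actual efforts (hours) borrowed from Table 2; the predictions are invented.
actual = [16.5, 9.0, 8.5, 28.0, 46.5]
predicted = [15.0, 10.0, 8.0, 30.0, 44.0]
print(mmre(actual, predicted), pred(actual, predicted, 25), mae(actual, predicted))
```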

5. The BN building process

The BN building process can be divided into two steps: a definition of the elements of the set G (the DAG structure definition) and a definition of the elements of the set P (parameter estimation). Both the DAG structure definition and the parameter estimation can be either purely expert based, or learned purely from the data. Some combination of expert knowledge and empirical data can also be used. The topology of a BN for effort prediction is mostly expert defined (Radlinski, 2010), but the final structure of the BN is mainly built based on the combination of expert knowledge and inference algorithms (Misirli and Bener, 2014). The BN model presented in this paper is based on quantitative historical data from a software project database and on the authors' expert knowledge. All the authors are experienced project managers, and two of them also have experience in the construction of BNs.

In this research, the definition of set G is separated into three sub-tasks:

• Identification of the nodes V_1, V_2, ..., V_n in the network (variables in the problem).
• Identification of all the elements of set s_i, i.e., the definition of all the possible outcomes for each node (values that the variables can take).
• Identification of set E = {e_{i,j} | V_i, V_j ∈ V(G)}, i.e., a definition of all the network edges which show the dependencies of the variables.

The set P(V_1, V_2, ..., V_n) is defined when:

• A priori probabilities for the nodes without parents (root nodes) are defined.
• Conditional probabilities for all nodes with parents, and for all possible combinations of outcomes of their parents, are defined. A priori probabilities for nodes with parents are defined through an associated table of joint probability distributions, and through the a priori expectations of their parents. Therefore, it is superfluous to define explicitly a priori probabilities for nodes with parents.

For the proposed BN model, these probabilities are obtained automatically from the data. The BN building process is depicted in Fig. 3. It took three iterations through the process to build the final version of the BN model.

Fig. 3. The BN building process.

5.1. Criteria collection (Node definition)

The authors strive to establish a minimum set of criteria (cost drivers) needed for evaluation. There is a great diversity in the use of cost drivers and no consensus exists regarding which criteria should be used in a specific context (Usman et al., 2014). The most widely used cost drivers in agile development are task size and the skills and experience of developers. These cost drivers are also used in the proposed model, but they are not sufficient for an accurate assessment.

This step focuses on gathering the criteria specific for software effort estimation from the existing literature (Abouelela and Benedicenti, 2010; Hearty et al., 2009; Mendes et al., 2012; Misirli and Bener, 2014; Nagy et al., 2010; Perkusich et al., 2013; Radlinski, 2010; Torkar et al., 2010; Usman et al., 2014).

The survey of BN models for software effort prediction (Radlinski, 2010) shows that the criteria can be grouped into four categories based on the measured characteristics:

• Project scope – determined by the use of various measures, e.g., use cases or user stories, function points, lines of code, requirements, etc.
• Other project characteristics – considered in relation to the type, complexity and stability of the project.
• Staff factors – which include, among others, developer skills, experience and motivation.
• Process characteristics – considered in relation to the organization and the maturity of the development process.

The elements of set V (BN nodes) are selected by applying a Goal Question Metric (GQM) approach to the collected criteria (Basili et al., 1994; Differding et al., 1996). The GQM plan consists of a goal and a set of questions and measures. The plan describes precisely why the measures are defined and how they are going to be used. The asked questions help to identify the information required to fulfil the goal. The measures define the data to be collected to answer the questions.

The most important goal of the task effort prediction is to determine the time needed for task completion. Hence, the first element of set V is defined: Working Hours. Task effort depends on the complexity of the requirements and on the developer skills (including motivation and experience). So, the next two elements of V are defined: Requirements Complexity and Developer Skills. The effort of the task also depends largely on whether the programmer is familiar with this type of task or has to use new technologies and new knowledge. Thus, the next element is New Task Type. The complexity of the requirements depends on the number and the complexity of the reports, user interfaces (forms) and functions that should be created in a task, as well as on the quality of the requirements specifications. In the first iteration, set V is completed by the elements Form Complexity, Report Complexity, Function Complexity and Specification Quality.

The GQM approach ensures that all the relevant domain variables are included. The authors also checked and assured themselves that the variables were named conveniently.

Table 1. Nodes description.

Node name                      Description
Form Low_No                    Number of simple user interfaces (forms)
Form Medium_No                 Number of moderate complexity user interfaces (forms)
Form High_No                   Number of complex user interfaces (forms)
Function Low_No                Number of simple functions
Function Medium_No             Number of moderate complexity functions
Function High_No               Number of complex functions
Report Low_No                  Number of simple reports
Report Medium_No               Number of moderate complexity reports
Report High_No                 Number of complex reports
Form Complexity                Total rating of user interface (form) complexity
Function Complexity            Total rating of function complexity
Report Complexity              Total rating of report complexity
Specification Quality          Quality of specification
New Task Type                  Type of task (new or familiar one)
Requirements Complexity        Overall rating of requirements complexity
Developer Skills               Overall rating of developer experience, motivation and skills
Working Hours                  Number of hours spent on the task
Working Hours Classification   Intervals of spent working hours: 0–2 h – very simple task (56 instances); 2.1–10 h – simple task (70 instances); 10.1–25 h – moderate task (22 instances); 25.1–40 h – complex task (6 instances); >40 h – very complex task (6 instances)

5.2. Node values

To fully define the set of tuples V(G) = {(s_1, V_1), ..., (s_n, V_n)}, it is necessary to define s_i, the set of all possible values for each V_i.

To define the node values, the authors first checked the past data. As mentioned before, the authors have access to a database of 160 tasks, but agile software development projects are famous for very limited documentation. The existing data are usually unstructured, meaning that they are not suitable for direct use in the BN. Even the structured data should be ranked. For example, the variable Working Hours expresses the number of hours spent on an individual task. The range of values is too large in terms of too many potential outcomes. To simplify the possibilities, the outcome values should be intervals instead of point values. Therefore, a new node Working Hours Classification is added, with possible outcomes ranked in five intervals (Table 1). Instead of adding a new node, the values of Working Hours could have been split into five intervals directly. However, the authors hope that it will be possible to refine the values into more intervals after the model is used and more data become available. In that case, it will be easier to change only the Working Hours Classification node.

The values of the nodes are defined in two steps:

• The first step defines the types of the selected variables and identifies the values for each variable. Although a BN allows the use of both discrete and continuous variables, in this paper we use discrete values, because the experimental data are discrete, and because the available BN tools require the discretization of continuous variables.
• The second step is extremely time-consuming. An experienced project manager in agile development checks all the projects, evaluates the unstructured data and creates a database suitable for the BN. As task evaluation is time-consuming, tasks are processed in batches: first 40, then 50, and finally 70 tasks. All the values in the newly created database are checked for rank and accuracy. In some cases, it is necessary to go back to the first step and refine the values of the nodes.

5.3. DAG structure construction (Node connections)

The GQM approach used for the definition of the elements of set V is also used for the definition of set E. The causal relationships between the nodes are built based on the variables and measures selected by using GQM. The building process includes d-separation (d-separation dependencies are used to identify variables influenced by evidence coming from other variables in the BN), as well as new node definition.

For example, the variable Report Complexity represents the complexity of the reports that should be created in a task. The variable can take one of three states (low, medium or high), and its value can be estimated by the project manager. Instead of that, three new parent nodes (variables) are added: Report Low_No (number of simple reports), Report Medium_No (number of medium complexity reports) and Report High_No (number of complex reports). This is important because with parent nodes the state of the variable Report Complexity is defined automatically, thus avoiding a situation where one project manager estimates the same 10 simple reports once as low, and at other times as moderately complex. Consistency significantly affects the accuracy of prediction of the BN model.

Fig. 4. The BN model.

Table 2
Empirical data prepared for use in the BN model (Part).

Task New Task Specification Form Form Form Function Function Function Report Report Report Developer Working
ID Type Quality Low_No Medium_No High_No Low_No Medium_No High_No Low_No Medium_No High_No Skills Hours

1 Yes 2 2 0 2 0 10 0 0 0 0 2 16,5
2 Yes 1 3 2 0 0 2 0 0 0 2 4 9
3 Yes 4 0 2 2 3 3 3 0 0 0 3 8,5
4 No 4 5 0 0 5 0 0 0 0 0 2 28
5 Yes 3 0 0 5 0 0 0 0 0 5 4 46,5
6 No 3 0 1 0 0 0 0 1 0 0 2 1
7 No 2 0 1 0 0 1 0 1 0 0 3 1,5
8 Yes 3 2 2 5 0 3 3 1 0 0 3 9
9 Yes 3 0 0 0 0 0 0 0 4 0 2 3,75
10 No 4 1 0 0 0 2 0 0 0 1 2 10
11 No 2 0 1 1 0 1 0 0 0 0 3 0,5
12 Yes 5 1 1 1 0 3 0 0 0 0 2 15,5

For the same reason, parent nodes are also added to the nodes Form Complexity (which defines the complexity of user interfaces) and Function Complexity (which defines the complexity of functions). The values of both nodes can be expressed as high, medium or low. Consequently, Form Complexity is a child node of Form Low_No (number of simple forms), Form Medium_No (number of medium complexity forms) and Form High_No (number of complex forms). On the other side, Function Low_No (number of simple functions), Function Medium_No (number of medium complexity functions) and Function High_No (number of complex functions) are parents of the Function Complexity node.

The complexity of the reports, as well as the complexity of the forms and functions, is defined on the basis of the elements to be constructed, their number, and their comparison with historical data on similar elements (analogy). The report evaluation is also influenced by the complexity of the database query used to obtain the result. The assessment of the function complexity also depends on the complexity of the processing algorithm.

The specification quality is determined by the level of requirements decomposition: the definition of technical demands and business clarity.

The estimation of the skills and knowledge as well as the experience and motivation of each developer is rated by the Personal Capability Assessment Method (Celar et al., 2014) and then ranked into one of 5 grades. The evaluation of a developer is performed once or twice a year.

A new iteration of the model building process starts each time a new node is added. A list of all the nodes with explanations of their meaning is given in Table 1. The final topology is shown in Fig. 4.

5.4. Parameter estimation

Conditional and a priori probabilities are learned from the data using the WEKA² machine learning suite.

As already mentioned, the data used in this research originate from agile projects of a small software company. These data are not suitable for direct use in the BN model. They must be prepared, but, as the process of preparation is time-consuming, the data are separated into three datasets. The first dataset consists of 40 tasks, the second of 50 tasks, and the third of 70 tasks.

² Waikato Environment for Knowledge Analysis (WEKA) 3.6.11, http://www.cs.waikato.ac.nz/ml/weka/.

The tasks are grouped chronologically on the basis of their creation time. Grouping is made neither by size, nor by complexity, nor according to the developer who performs the task. All the datasets include tasks of different duration and complexity, created by different developers.

A set of empirical data for twelve tasks is shown in Table 2. Empirical data are not available for all the nodes. The nodes Form Complexity, Function Complexity, Report Complexity, Requirements Complexity and Working Hours Classification are added to simplify the possible outcomes, as well as to provide better model accuracy. The values of Working Hours Classification are ranked in five non-linear intervals based on the authors' experience. The manual definition of NPTs can be a lengthy and error prone process. Consequently, the values of these nodes are evaluated on the basis of the empirical values of their parents. The probabilities are automatically learned both for the empirical and the added nodes.

An example of a table with complete data for parameter estimation in the BN model is shown in Table 3. It consists of the empirical data from Table 2, completed with the data estimated by the authors.

Table 3. Complete data used for parameter learning (12 task example). [The table is not reproduced here; it extends the Table 2 rows with the derived states of the Form, Function, Report and Requirements Complexity nodes and the Working Hours Classification for each of the twelve tasks.]

5.5. BN model validation

The model validation is performed using empirical data. WEKA provides k-fold cross-validation and summary statistics (prediction accuracy, MAE, RMSE) which are used to verify the accuracy of the generated model. The WEKA error statistics are normalized. The predicted distribution for each class is matched against the expected distribution for that class. All the mentioned WEKA errors are computed by summing over all classes of an instance, not just the true class (WEKA, 2007a; WEKA, 2007b).

In this case, a 10-fold cross-validation is used. The dataset is randomly divided into 10 equally sized subsets. Of these 10 subsets, one is taken as the validation dataset, and the other nine sets are used as training data. Each of the nine training datasets is compared with a validation dataset to calculate the percentage of the model accuracy. The cross-validation process is repeated ten times, each of the ten subsets being used exactly once as a validation set. The results from all the 10 trials are averaged.

The prediction accuracy for all the datasets is shown in Table 4. The accuracy is above 90% for all datasets. For the dataset of 160 instances, only the effort of one task is wrongly classified. The MAE values indicate that the expected effort will be within 2.6% of the true effort for the last set of data. Small differences between the MAE and the RMSE values indicate that the error variance is relatively small.

The Pred. (m) and the MMRE metrics are also applied to this dataset. Since the BN model estimates effort as a set of probability distributions over all possible classes, a conversion method (Pendharkar et al., 2005; Mendes, 2008; Tierno, 2013) is used to obtain the estimated effort as a discrete value. The class probabilities should be normalized, so that their sum equals one. The estimated effort is then given by:

$$\mathit{Effort} = \sum_{i=1}^{n} \rho_{class_i} \, \mu_{class_i},$$

where $\mu_{class_i}$ is the mean of class i, and $\rho_{class_i}$ is its respective class probability.

The MMRE values suggest that the prediction error is relatively constant, with no occasional large deviations.

The predictions for all the sets are within 25% of the actual values. Even in the case of the stricter criterion (m = 10), 90% of the estimates for all the sets remain within a 10% tolerance. Moreover, we should note that this is the result of the set with the minimum number of instances, and the BN prediction accuracy grows with the growth of the number of instances.
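A minimal sketch of that conversion step, assuming representative mean values for the five Working Hours Classification intervals: the paper does not state which class means it uses (in particular for the open-ended '>40 h' class), so the numbers below are illustrative only.

```python
# Assumed representative (mean) effort in hours for each Table 1 interval;
# the '>40 h' value in particular is a guess for illustration only.
CLASS_MEANS = {
    "very simple": 1.0,    # 0-2 h
    "simple": 6.0,         # 2.1-10 h
    "moderate": 17.5,      # 10.1-25 h
    "complex": 32.5,       # 25.1-40 h
    "very complex": 50.0,  # >40 h
}

def effort_from_distribution(class_probs: dict) -> float:
    """Effort = sum_i rho_class_i * mu_class_i over the normalized distribution."""
    total = sum(class_probs.values())
    return sum((p / total) * CLASS_MEANS[c] for c, p in class_probs.items())

# Example: a prediction that is mostly 'moderate' with some mass on 'simple'.
print(effort_from_distribution({"simple": 0.2, "moderate": 0.7, "complex": 0.1}))
# -> 0.2*6 + 0.7*17.5 + 0.1*32.5 = 16.7 hours
```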



Table 4. Results.

Number of tasks                              40 (first)   50 (next)   70 (last)   90 (40+50)   160
Accuracy (correctly classified instances)    90%          96%         97.14%      96.67%       99.375%
MAE                                          0.1065       0.0533      0.0531      0.0469       0.026
RMSE                                         0.199        0.1301      0.1127      0.117        0.065
RAE                                          35.28%       24.23%      19.89%      17.69%       9.71%
RRSE                                         51.27%       40.09%      31.03%      32.31%       17.81%
Pred. (25)%                                  100%         100%        100%        100%         100%
Pred. (10)%                                  90%          100%        100%        100%         100%
MMRE                                         12.80        3.29        4.30        7.69         6.21

These good results (Table 4) make this simple model applicable in practice. According to the cone of uncertainty, the results are better than expected. The high accuracy of the model is confirmed by comparison with the results listed in the literature (Radlinski, 2010).

Such a good prediction accuracy is mainly based on the following:

• The BN model outcomes are probability distributions over only five intervals. This decreases the prediction precision because all the values in an interval are treated equally. For example, values 45 and 61 from the interval '>40 hours' have the same probabilities.
• A priori and conditional probabilities are automatically attained from the experimental data, which is more reliable than elicitation from scratch. The estimation accuracy of these probabilities has a notable effect on the outcome quality, and either a pessimistic or an optimistic approach can spoil the results.
• The consistency in assessment significantly increases the accuracy of the prediction. Most experimental data were not suitable for direct application to the BN model, e.g., values that should be ranked. Two of the authors have independently evaluated all the data to make sure that the values are consistently assessed.

6. Conclusions and future work

This paper develops a BN model for effort prediction in agile software development projects.

The proposed model is relatively small and simple and all the input data are easily elicited, so that the impact on agility is minimal. The model predicts task effort, and it is independent of the agile methods used. It is also suitable for use in the early project phase.

The model is validated using a database of 160 tasks from real agile projects. The prediction accuracy is measured by the percentage of correct predictions over all predictions. The model results in very good accuracy: only one misclassified value. Pred. (m = 25) equals 100% – all predictions are classified within a 25% tolerance. The MMRE values show that there are no occasional large estimation errors. All the other statistical metrics used in this research support these results.

This BN model is presently used in one software company, and the project manager considers it very useful.

The proposed BN model is currently being expanded with a new subnet regarding the existing node Developer Skills and with a new outcome variable/node Product Quality. We plan to use this model for quality prediction of software products in the early software project phase.

Acknowledgments

This work has been supported in part by the PIVIS project (1904-10), a technological project at FESB funded by the enterprise MIB PIVAC, Vrgorac, Croatia; by the Croatian Science Foundation under the project INSENT Innovative Smart Enterprise (1353), Zagreb, Croatia; and by the Program of Technological Development, Research and Application of Innovations of Split-Dalmatia County (1012-14), Split, Croatia. Special thanks belong to Prof. Sanda Halas for her language advice.

References

Abouelela, M., Benedicenti, L., 2010. Bayesian network based XP process modelling. Int. J. Softw. Eng. Appl. 1 (3), 1–15.
Basili, V.R., Caldiera, G., Rombach, H.D., 1994. The goal question metric approach. In: The Encyclopedia of Software Engineering, 1. John Wiley & Sons, New York, USA, pp. 469–476.
Beck, K., Beedle, M., van Bennekum, A., Cockburn, A., Cunningham, W., Fowler, M., et al., 2001. The Agile Manifesto, http://www.agileAlliance.org/.
Bibi, S., Stamelos, I., 2004. Software process modeling with Bayesian belief networks. In: Proceedings of the 10th International Software Metrics Symposium (Metrics 2004). Chicago, USA.
Borade, J.G., Khalkar, V.R., 2013. Software project effort and cost estimation techniques. International Journal of Advanced Research in Computer Science and Software Engineering 3 (8), 730–739. ISSN: 2277 128X.
Celar, S., Turic, M., Vickovic, L., 2014. Method for personal capability assessment in agile teams using personal points. In: Proceedings of the 22nd Telecommunications Forum. Beograd, Serbia. IEEE, pp. 1134–1137.
Celar, S., Vickovic, L., Mudnic, E., 2012. Evolutionary measurement-estimation method for micro, small and medium-sized enterprises based on estimation objects. Adv. Prod. Eng. Manag. 7 (2), 81–92.
Charniak, E., 1991. Bayesian networks without tears: making Bayesian networks more accessible to the probabilistically unsophisticated. AI Mag. 12 (4), 50–63.
Chatzipoulidis, A., Michalopoulos, D., Mavridis, I., 2015. Information infrastructure risk prediction through platform vulnerability analysis. J. Syst. Softw. 106, 28–41.
Cohn, M., 2005. Agile Estimating and Planning, 3–4. Prentice Hall, Upper Saddle River, USA, pp. 43–47.
Differding, C., Joisl, B., Lott, C.M., 1996. Technology Package for the Goal Question Metric Paradigm. Technical Report 281/96, University of Kaiserslautern, Germany.
Dragicevic, S., Celar, S., 2013. Method for elicitation, documentation and validation of software user requirements (MEDoV). In: Proceedings of the 18th IEEE International Symposium on Computers and Communications (ISCC). Split, Croatia.
Dragicevic, S., Celar, S., Novak, L., 2011. Roadmap for requirements engineering process improvement using BPM and UML. Adv. Prod. Eng. Manag. 6 (3), 221–231.
Dragicevic, S., Celar, S., Novak, L., 2014. Use of method for elicitation, documentation and validation of software user requirements (MEDoV) in agile methods. In: Proceedings of the 6th International Conference on Computational Intelligence, Communication Systems and Networks (CICSyN). Tetovo, Macedonia. IEEE.
Fenton, N., Hearty, P., Neil, M., Radliński, Ł., 2008. Software project and quality modelling using Bayesian networks. In: Meziane, F., Vadera, S. (Eds.), Artificial Intelligence Applications for Improved Software Engineering Development: New Prospects. Information Science Reference, New York, USA, pp. 1–25.
Fenton, N., Neil, M., 1999. A critique of software defect prediction models. IEEE Trans. Softw. Eng. 25 (5), 675–689.
Foss, T., Stensrud, E., Kitchenham, B., Myrtveit, I., 2003. A simulation study of the model evaluation criterion MMRE. IEEE Trans. Softw. Eng. 29 (11), 985–995. doi:10.1109/TSE.2003.1245300.
Hearty, P., Fenton, N., Marquez, D., Neil, M., 2009. Predicting project velocity in XP using a learning dynamic Bayesian network model. IEEE Trans. Softw. Eng. 35 (1), 124–137.
Helmy, W., Kamel, A., Hegazy, O., 2012. Requirements engineering methodology in agile environment. IJCSI 9 (5), No. 3, 293–300. ISSN (Online): 1694–0814.
Hernandez-Orallo, J., Flach, P., Ferri, C., 2012. A unified view of performance metrics: translating threshold choice into expected classification loss. J. Mach. Learn. Res. 13 (1), 2813–2869.
Jeet, K., Bhatia, N., Minhas, R.S., 2011. A Bayesian network based approach for software defects prediction. ACM SIGSOFT Softw. Eng. Notes 36 (4), 1–5.
Jorgensen, M., 2014. What we do and don't know about software development effort estimation. IEEE Softw. 31 (2), 37–40.
Jørgensen, M., 2013. Relative estimation of software development effort: it matters with what and how you compare. IEEE Softw. 30 (2), 74–79. doi:10.1109/MS.2012.70.
Jorgensen, M., 2010. Selection of strategies in judgment-based effort estimation. J. Syst. Softw. 83, 1039–1050. doi:10.1016/j.jss.2009.12.028.
Kan, S.H., 2002. Metrics and Models in Software Quality Engineering, 2nd ed. Addison-Wesley Longman Publishing Co., Inc., Boston, USA. ISBN: 0201729156.
Kim, S.T., Hong, S.R., Kim, C.O., 2014. Product attribute design using an agent-based simulation of an artificial market. Int. J. Simul. Model. 13 (3), 288–299.
Kitchenham, B.A., Pickard, L.M., MacDonell, S.G., Shepperd, M.J., 2001. What accuracy statistics really measure. IEEE Proc. Softw. 148 (3), 81–85. doi:10.1049/ip-sen:20010506.
Korte, M., Port, D., 2008. Confidence in software cost estimation results based on MMRE and PRED. In: Proceedings of the 4th International Workshop on Predictor Models in Software Engineering, pp. 63–70. doi:10.1145/1370788.1370804.

Lee, E., Park, Y., Shin, J.G., 2009. Large engineering project risk management using a Bayesian belief network. Expert Syst. Appl. Int. J. 36 (3), 5880–5887.
McConnell, S., 2006. Software Estimation: Demystifying the Black Art. Microsoft Press, WA, USA.
Mendes, E., 2008. The use of Bayesian networks for web effort estimation: further investigation. In: Proceedings of the Eighth International Conference on Web Engineering, ICWE '08, pp. 203–216.
Mendes, E., Abu Talib, M., Counsell, S., 2012. Applying knowledge elicitation to improve web effort estimation: a case study. In: Proceedings of the 2012 IEEE 36th Annual Computer Software and Applications Conference, COMPSAC '12. Izmir, Turkey, pp. 461–469.
Misirli, A.T., Bener, A.B., 2014. A Mapping Study on Bayesian Networks for Software Quality Prediction. RAISE '14 Program, Hyderabad, India.
Nagy, A., Njima, M., Mkrtchyan, L., 2010. A Bayesian based method for agile method software development release planning and project health monitoring. In: Proceedings of the 2010 IEEE International Conference on Intelligent Networking and Collaborative Systems. Thessaloniki, Greece.
Nawrocki, J., Jasiński, M., Walter, B., Wojciechowski, A., 2002. Extreme programming modified: embrace requirements engineering practices. In: Proceedings of the IEEE Joint International Requirements Engineering Conference. IEEE CS Press, pp. 303–310.
Pendharkar, P.C., Subramanian, G.H., Rodger, J.A., 2005. A probabilistic model for predicting software development effort. IEEE Trans. Softw. Eng. 31 (7), 615–624.
Perkusich, M., Oliveira de Almeida, H., Perkusich, A., 2013. A model to detect problems on scrum-based software development projects. In: Proceedings of the 28th Annual ACM Symposium on Applied Computing, SAC '13, pp. 1037–1042.
Radlinski, L., 2010. A survey of Bayesian net models for software development effort prediction. Int. J. Softw. Eng. Comput. 2 (2). ISSN: 2229-7413.
Sarwar, B., Karypis, G., Konstan, J., Riedl, J., 2001. Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th International Conference on World Wide Web, WWW10. Hong Kong, pp. 285–295.
Schulz, T., Radliński, Ł., Gorges, T., Rosenstiel, W., 2010. Defect Cost Flow Model – A Bayesian Network for Predicting Defect Correction Effort. PROMISE 2010, Timisoara, Romania.
Scopus, 2016a. https://www.scopus.com/results/results.uri?... (viewed 13 December 2016).
Scopus, 2016b. https://www.scopus.com/results/results.uri?... (viewed 13 December 2016).
Si, G., Xu, J., Yang, J., Wen, S., 2014. An evaluation model for dependability of internet-scale software on basis of Bayesian networks and trustworthiness. J. Syst. Softw. 89, 63–75.
Sillitti, A., Succi, G., 2005. Requirements engineering for agile methods. In: Engineering and Managing Software Requirements, Part 2. Springer Berlin Heidelberg, pp. 309–326. doi:10.1007/3-540-28244-0_14.
Standish Group, 2009. Project Smart, http://www.projectsmart.co.uk/the-curious-case-of-the-chaos-report-2009.html.
Tierno, I.A.P., 2013. Assessment of Data-driven Bayesian Networks in Software Effort Prediction, http://hdl.handle.net/10183/71952.
Torkar, R., Awan, N.M., Alvi, A.K., Afzal, W., 2010. Predicting software test effort in iterative development using a dynamic Bayesian network. In: Proceedings of the 21st IEEE International Symposium on Software Reliability Engineering. San Jose, USA.
Usman, M., Mendes, E., Weidt, F., Britto, R., 2014. Effort estimation in agile software development: a systematic literature review. In: Proceedings of the 10th International Conference on Predictive Models in Software Engineering, PROMISE '14. Torino, Italy, pp. 82–91.
Version One, 2007. 2nd Annual Survey "The State of Agile Development", http://www.versionone.com/pdf/StateOfAgileDevelopmet2_FullDataReport.pdf.
WEKA, 2007a. Mean Absolute Error in Classification, http://weka.8497.n7.nabble.com/Mean-absolute-error-in-classification-td9440.html.
WEKA, 2007b. Root Mean Squared Error Calculation, http://weka.8497.n7.nabble.com/root-mean-squared-error-calculation-td19651.html.
Williams, L., 2010. Agile software development methodologies and practices. Adv. Comput. 80, 1–44.
Wooff, D.A., Goldstein, M., Coolen, F.P.A., 2002. Bayesian graphical models for software testing. IEEE Trans. Softw. Eng. 28 (5), 510–525. doi:10.1109/TSE.2002.1000453.

Srdjana Dragicevic received her M.Sc. degree in electrical engineering from the Faculty of Electrical Engineering and Computing, University of Zagreb, Croatia. She is currently
an Honorary Assistant at the Department of Electronics and Computing of the Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture (FESB),
University of Split, Croatia. She is enrolled in the PhD program at FESB, University of Split, Croatia. Her research interests include requirements engineering, decision support,
uncertain reasoning, business processes and project management.

Stipe Celar received his B.Sc. degree in electrical engineering from the Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture (FESB), University of
Split, Croatia and his Ph.D. degree in technical sciences from TU Wien, Austria. After years of professional work in IT companies and honorary lecturing he is currently an
Associate Professor at the Department of Electronics and Computing of FESB, University of Split, Croatia. His research interests include software engineering, software metrics
and business information systems.

Mili Turic received his B.Sc. degree in computer science from the Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture (FESB), University of Split,
Croatia. He is currently an Honorary Assistant at the Department of Electronics and Computing of FESB, University of Split, Croatia. He is enrolled in the PhD program at
FESB, University of Split, Croatia. His research interests include application of software engineering to cost estimation of software projects, and to software project planning
in general.
