Bayesian Network Based XP Process Modelling
Bayesian Network Based XP Process Modelling
3, July 2010
ABSTRACT
A Bayesian Network based mathematical model has been used for modelling Extreme Programming
software development process. The model is capable of predicting the expected finish time and the
expected defect rate for each XP release. Therefore, it can be used to determine the success/failure of any
XP Project. The model takes into account the effect of three XP practices, namely: Pair Programming,
Test Driven Development and Onsite Customer practices. The model’s predictions were validated against
two case studies. Results show the precision of our model especially in predicting the project finish time.
KEYWORDS
Bayesian Networks, Extreme Programming, Process Modelling, Software Process
1. INTRODUCTION
Extreme Programming (XP) is a lightweight software development methodology. XP is one of
the iterative informal development methodologies known as Agile methods. XP comprises a
number of values, practices and principles. There is no large requirements and design
documents. XP uses what is called User Stories instead of requirements. The XP project
comprises of a number of User Stories. Each user stories contains a number of Story Points. The
development process constructed from iterative small releases. In each release, User Stories are
selected to be developed in this release according to their importance.
Managers of XP projects suffer from lack of prediction systems capable of estimating the
expected effort and quality of the software development process. Managers need to know the
probability of success or failure of XP project. Models capable of predicting the project finish
time are very helpful to the project managers. Those models should also be capable of
predicting the product quality in terms of the expected number of defects. These requirements
should be covered in strong mathematical model.
In this paper, a Bayesian Network based mathematical model for XP process is presented. The
proposed model satisfies the following features:
- The model able to predict the expected finish time, and therefore it could determine the
success/failure of the project.
- The prediction can be done in the project planning phase before starting the actual
development using very simple input data.
- The model tracks the developer velocity (measured in number of Story Points per day)
as function of the developer experience. It also models the increase in the developer velocity as
the project goes on.
DOI : 10.5121/ijsea.2010.1301 1
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
- The model considers the effect of the Pair Programming and Test Driven Development
practices on the Team velocity.
- The model predicts the process quality by measuring the defect rate in each release.
- It considers the effect of the Onsite Customer and Test Driven Development practices
on the defect rate.
The proposed model was implemented using AgenaRisk toolset [1]; a toolset for modelling risk
and making predictions based on Bayesian Network. Two case studies were used for the
validation of our model. Results show the precision of our model especially in predicting the
project finish time.
This paper is organized as follows: in the next section, a survey of the related work and an
overview of the Bayesian Network will be provided. Model Design is illustrated in section 3,
while the validation is provided in section 4. Finally, conclusions are offered in the last section.
2. BACKGROUND
In the next section, a literature survey of related work will be provided. The survey
covers a number of the most important XP process models existing in the literature.
Then, the Impact of XP Practices on Software Productivity and Quality is illustrated in
the following section. Finally, an overview of the Bayesian Network will be provided in
the last section.
Although the widespread usage of XP in both academic and industry, only few attempts
for modelling XP exists. In this section, we will provide a survey of the most important
XP Process models.
In [2], a simulation model was developed to analyse the effects of almost all XP
practices on the software development effort. The developed model was applied for a
typical XP project and the effects of all the individual practices were calculated. The
results showed a reduction in software development cost as increasing the usage levels
of all XP practices.
In [3], the authors built a software effort prediction model for XP based on Bayesian
networks. The proposed model can learn from project data in order to make quantitative
effort predictions and risk assessments. The model has been validated by applying to a
real industrial project. Collecting data from the early part of the project enabled the
model to update its parameters and improve its predictions. The model could
successfully achieve extremely accurate predictions about the level of functionality
delivered over time.
In [4], the authors introduced an XP process model to evaluate the effectiveness of XP
key practices (Pair Programming, Test-First Programming), and to investigate how the
practices influence the evolution of a certain project. To achieve this, software process
simulation has been chosen. A process model has been developed and a simulation
executive has been implemented to enable simulation of XP software development
activities to simulate how the modeled project entities evolve as a result.
Williams and Erdogmus [5] developed a Net Present Value (NPV) model of Pair
Programming (PP). The model combines the productivity rates, code production rates,
2
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
defect insertion rates, and defect removal rates. The authors conclude that PP is a
“viable alternative to individual programming.”
In [6], the authors model, simulate and analyze the pair programming and pair switching
practices. The model explores many variables affecting pair programming efficiency.
The results showed that XP efficiency increases with both psychological compatibility
and pair adaptation speed between the members of the pair. In addition, the XP process
appears to have an advantage over the traditional approach when pair switches are not
too frequent.
A number of Studies were conducted to assess the impact of XP practices such as Pair
Programming, Test Driven Development and Onsite Customer practices in the software
Productivity and quality. Those contributions study the impact of such practices on the
Project Velocity and the product defect rates.
A number of quantitative studies conducted to assess the validity and efficiency of the
Pair Programming practice. Some of these contributions study the impact of the Pair
Programming practice on the Project velocity. Generally speaking, According to those
studies, Pair Programming increases the Project Velocity by a factor starting from 0 (no
change) to 45%.
In [7], the authors conducted found that pair programming is 40-50% faster than solo
programming. In addition, the pairs implement the same functionality with 20% fewer
line of codes than the solo students. In [8], The results showed that there is no
significant difference in the average development time between XP with pair
programming and XP without Pair Programming. In [9], the authors found that the PP
group spends 19% less time than individuals to complete the same project. In [10], the
authors found that Pair Programming does not improve neither the quality nor the
productivity.
A number of quantitative studies conducted to assess efficiency of the Test Driven
Development practice. Some of these contributions study the impact of this practice on
the Project velocity and the product defect rate. Generally speaking, According to those
studies, Test Driven Development results in increasing the project time by a factor
starting from 0 (no increase) in some studies to 80% in others. On the other hand, Test
Driven Development results in reducing the defect rate by a factor of 40% in some
contributions.
In [11], the authors reported an improvement in quality achieved by a team following
test-during-coding process ranged from 38% to 267% fewer defects. Unfortunately, test-
during-coding process increase the development time by a factor from 60% to 100%. In
[12], the authors reported that the TDD developers took more time (16%) than those not
using this practice, but produced higher code quality by a factor of 18%.
A number of studies consider the impact of the Onsite Customer Practice. In [13], the
authors reported a reduction in the development effort by a percentage of 5.48%. In [14]
four case studies with different degrees of customer interaction were considered. The
authors measured the effort spent to fixing defects in the four cases. In the very high
customer involvement case, only 6 percent of the effort was spent to fixing defects.
However, in the low level of customer involvement, the time spent fixing defects
reached about 40 percent.
3
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
Bayesian Network (BN) [15] is a probabilistic graphical model that represents a set of
random variables and their conditional probabilities via a directed acyclic graph (DAG).
In these graphical structures, each node in the graph represents a random variable, while
the edges between the nodes represent probabilistic dependencies among the
corresponding random variables. In addition, in a BN each node has an associated
probability table, called the Node Probability Table (NPT). These conditional
dependencies in the graph are often estimated by using known statistical and
computational methods.
Bayesian Networks are used for reasoning and decision-making in problems that
involve uncertainty and probabilistic reasoning. The first working applications of BNs
(during the period 1988-1995) tended to focus on classical diagnostic problems,
primarily in medicine and fault diagnosis [16]. Nevertheless, the medical and
biological/DNA decision-support domain has continued to be the most preferred area
for published BN applications. On the other hand, companies such as Microsoft and
Hewlett-Packard have used BNs for fault diagnosis [16]. The ever-increasing need for
improved decision support in critical systems has resulted in a range of BN-based
systems. These include BN models for air traffic management, railway safety
assessment, and terrorist threat assessment [16]. There have also been numerous uses of
BNs in military applications.
The structure of a DAG is defined by two sets: the set of nodes (vertices) and the set of
directed edges. The nodes represent random variables, while the edges represent direct
dependence among the variables and are drawn by arrows between nodes. In particular,
an edge from node Xi to node Xj represents a statistical dependence between the
corresponding variables. Thus, the arrow indicates that a value taken by variable Xj
depends on the value taken by variable Xi, or roughly speaking that variable Xi
“influences” Xj . Node Xi is then referred to as a parent of Xj and, similarly, Xj is referred
to as the child of Xi. If Xi has no parents, its local probability distribution is said to be
unconditional, otherwise it is conditional.
Figure 1 shows an example of BN. This graph is part of the proposed XP process
model. The number of defected story points depends on both the test driven
development practice usage and the defect rate random variables, while the team
velocity is affected by only the test driven development practice usage.
4
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
3. MODEL DESIGN
The proposed model complies XP process iterative structure. The model considers the
iterative nature of XP by modelling the process as a number of sequential releases. An
example of releases connectivity is shown in Figure 2. Each release has eight inputs and
two outputs. The output of one release is considered as an input for the following
release.
In the next section, the model details explaining different inputs and outputs are
provided. Then, the details of the Team Velocity Model and Project Defects Model will
be provides in the following sections.
Figure 3 shows the based components of the proposed model. The model comprises
eight input parameters, two estimated output parameters, and two internal models.
According to the input parameters values, with the aid of the two internal models, the
values of the output parameters (Estimated Release Time and Estimated defected User
Stories) can be predicted. In addition, the model considers three XP basic activities:
Release Planning, Development Session and Acceptance Test Activities (shown in the
elliptical shape). For simplicity, the model assumes the following assumptions:
- Defects are only found at the acceptance test phase.
- The defects are modelled as story points to be treated in the next release.
- The number of defects is affected by Test Driven Development and On site
customer Practices.
- The developer velocity increases as the project goes on.
- The estimated release time is calculated as the time of the development session
activity only ignoring the release plan and acceptance test times.
5
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
- The team velocity is affected by the team size, team experience, Pair
Programming Practice, and Test Driven Development Practice.
According to the input parameters values, with the aid of team Velocity and defected
story points Models, the values of the output parameters (Estimated Release Time and
Estimated defected User Stories) can be predicted. Those values are feed as an input to
the next release. The details of the team Velocity and defected story points Models will
be provided in the next section.
The three basic activities of XP release: Release Planning, Development Session and
Acceptance Test are shown in the elliptical shape. In the Release Planning phase, the
user stories are sorted according to the importance and the release user stories are
selected among them. The development session includes the basic development
activities: simple design, coding and unit testing activities. Finally, the release testing
activity is done in the Acceptance Test phase.
Team velocity, measured in number of user stories per day, is a good representation for
the productivity of the team. The model is shown in Figure 4. Three main factors affect
the team velocity in XP, namely: Team XP skills, Pair Programming usage and Test
Driven Development usage.
The main components of the model are:
- Developer initial skills: random number uniformly distributed over the range
[1,10] [4]. This number represents the developer ability to develop software.
- Developer skills: Dev. Skills depend on two main factor, namely:
dev_initial_Skills and project_working_days. This value represents increasing
developer skills with time. It can be calculated as described in Equation (1)[4],
where LC is the learning coefficient, typically 0.009 [4].
Dev_skills = log(dev_initial_Skills + project_working_days*LC) Equation
(1)
7
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
- Developer Initial Velocity: represents the number of story points the user can
develop per day. According to empirical data [4], a random variable following
the normal distribution with mean 4 and standard deviation 1 was considered to
represent this value.
- Dev. Velocity: this value represents the impact of the developer skills on his
velocity. It is calculated as the summation of the Dev. Initial Velocity and the
dev. Skills factor.
- Team Initial Velocity: represents the number of story points the team can
develop per day. It can be calculated as the product of Team Size times Dev.
Velocity.
- Pair Programming Impact Factor (pp_Impactfactor): This factor represents the
impact of adopting Pair Programming practices in the team velocity. This value
was set to a normal distribution random variable with mean 23 and standard
deviation 20. This value was set according to a number of studies. Those studies
8
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
show that Pair Programming reduces the Project time by a factor starting from 0
(no change) in some cases to 45% in others. More data regarding those studies is
available in the Background section.
- Test Driven development Impact Factor (TDD_Impactfactor): This factor
represents the impact of adopting Test Driven Development practices in the
team velocity. This value was set to a normal distribution random variable with
mean -32 and standard deviation 42. This value was calculated according to
number of studies [4]. Those studies show that Test Driven Development results
in increasing the project time by a factor starting from 0 (no increase) in some
studies to 80% in others. A review of those studies is available in the
Background section.
- Team Velocity: This value represents the number of story points the team can
develop per day taking into account all the previous factors. Equation 2
represents this value.
Team Velocity = Team_initial_velocity* (1+PP_usage*PP_ImpactFactor/100)*
(1+TDD_usage*TDD_impactfactor/100) Equation (2)
This model calculates an estimate number for the defected story points to be re-
developed in the next release. This number is affected by two XP practices: Test Driven
development and Onsite Customer practices. Different components of the model are
described as follows:
- Dev. Productivity: The developer productivity measured as the number Line Of
Code (LOC) per day. According to the literature [4], a normal distribution with
mean 40 and Standard Deviation of 20 represents this value.
- Estimated Release KLOC: represents the number of KLOC produced from this
release. This value is calculated as the product of multiplying Dev. Productivity
times Team size times Estimated Release Days.
- Defect Injection Ratio: represents the number of defects per KLOC. This value
was set to a normal distribution with mean 20 and standard deviation 5 [4].
- Defect Rate: represents the number of defects in this release. It is calculated as
the multiplication of the Estimated Release KLOC times Defect Injection Ratio.
- Defected Story Points: This value represents the number of defected story points
to be re-developed in the next release taking into account the impact of two XP
9
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
practices: Test Driven development and Onsite Customer practices (Equation 3).
OSC_Impact_Factor and TDD_Impact_Factor represent the impact of the
Onsite Customer and Test Driven development practices on reducing the defect
rate. According to the literature, there values were set to 0.8 and 0.4 respectively
[3],[4]. More details regarding the impact of these practices in the defect rate
are available in the Background section.
Defected_Story_Points = Defect_Rate*(1- OSC_Impact_Factor *
onsitecustomer_usage )*(1 TDD_Impact_Factor *tddusage) Equation (3)
4. MODEL VALIDATION
The proposed model was implemented using AgenaRisk toolset [1]. AgenaRisk is a
powerful tool for modelling risk and making predictions based on Bayesian Network.
AgenaRisk has the following features:
- It integrates the advantages of Bayesian Networks, statistical simulation and
spreadsheet analysis.
- A wide range of built-in conditional probability functions are available.
- It has the ability to build dynamic models.
10
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
Collecting data from real projects to validate our model was a difficult task due to
several reasons. Due to XP simplicity value, it is difficult to find company collecting
information regarding their activities and practices. Moreover, most real XP projects are
developed by private companies having restrictions on publishing their internal
development process. In addition, there is no guarantee that the available data is
sufficient for model validation.
Two XP projects provided enough data to test our model. The first one is the Repo
Margining System project [4]. The second one is a controlled case study reported by
Pekka Abrahamsson [17]. We will refer to this case study in the rest of this paper by
Abrahamsson Case Study. The model input data for the two projects are shown in tables
2 and 3. The model internal parameters are summarized in table 4.
11
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
Table 4 Model internal parameters (U(a,b) refers to uniform distribution from a to b, while
N(µ,σ) refers to normal distribution with mean µ and standard deviation σ)
Random Variable Value
Developer Initial Skills U(1,10)
Developer Initial Velocity N (4,1)
PP_Impact_Factor N(23,20)
TDD_Impact_Factor N(-32,42)
Dev. Productivity
N(40,20)
(LOC/Day)
Defect Injection Ratio
N(20,5)
(defects/KLOC)
Table 5 shows a comparison between the Estimated and the Real values for the two
projects. Regarding the number of days, the estimated number is so close to the Real
project. This indicates the acceptable accuracy of the proposed Team Velocity Model.
On the other hand, the Defected Story Points model was not that accurate. For
Abrahamsson Case Study, the estimated number of defected story points was close to
the actual one, while for the Repo Margining System, the accuracy of the prediction
system was not that good. The inaccuracy also appears in estimated the produced
number of Line of codes for Abrahamsson Case Study.
The imprecision in some of the results, especially in the Defected Story Points model, is
due to fixing some variables that should not be fixed. For example, the Defect Injection
Ratio was fixed for the two projects to follow the normal distribution with mean 20
defects per KLOC. This value differs from project to another and should not be fixed.
The same for Developer Productivity random variable that set to follow the normal
distribution with mean 40 lines/day. This also depends on the nature of the project and
should vary from project to another.
12
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
One solution of such imprecision is to adopt the model for self-learning, by which the
model can learn from the first iterations and adjust different parameters and variables.
This increases the confidence of the prediction and can correct the model’s prior
assumptions. This learning capability is a good extension for the proposed model.
Figures 6 and 7 show the estimated project status as time passes. Those curves gave
accurate estimated values for the project finish time. Those curves can be obtained in
the project planning phase before starting the actual development using very simple
input data. Using such curves, the success or the failure of the project can be detected in
early stage.
35
Estimated Number of Completed User Stories
30
25
20
15
10
0
0 10 20 30 40 50 60 70
Days
13
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
14
10
0
0 2 4 6 8 10 12
Days
5. CONCLUSIONS
In this paper, a Bayesian Network based mathematical model for XP process is
presented. The model can be used to predict the success/ failure of any XP project by
estimating the expected finish time and the expected defect rate for each XP release.
The proposed model comprises two internal models: Team velocity and Defected Story
Points models. The model takes into account the impact of XP practices.
Two case studies were used for the validation of our model, namely: Repo Margining
System and Abrahamsson Case Study. The results show that the model can be used
successfully to predict the project finish time with a reasonable accuracy in the project
planning phase using very simple input data. In addition, the results show that the
accuracy of the Team Velocity Model is acceptable, while the Defected Story Points
model was not that accurate.
Adopting the model to have a self-learning capability is a good extension of this work
and can solve the imprecision in some of the results, especially in the Defected Story
Points model, by which the model can learn from the first iterations and adjust different
parameters and variables. This increases the confidence of the prediction and can
correct the model’s prior assumptions. This learning capability is a good extension for
the proposed model.
REFERENCES
[1] AgenaRisk User Manual, Agena, www.agenarisk.com, 2008.
[2] S. Kuppuswami, K. Vivekanandan, Prakash Ramaswamy, and Paul Rodrigues. “The effects of
individual xp practices on software development effort”. SIGSOFT Softw. Eng. Notes,
28(6):6–6, 2003.
[3] P. Hearty, N. Fenton, D. Marquez, and M. Neil. “Predicting Project Velocity in XP Using a Learning
Dynamic Bayesian Network Model”. Ieee Transactions On Software Engineering, Vol. 35, No.
1, January/February 2009.
[4] M. Melis “A Software Process Simulation Model of Extreme Programming” PhD thesis. February 7,
2006.
[5] N. E. Fenton and S. Pfleeger. “Software Metrics: a rigorous and pratical approach”. PWS Publishing
Company, 1996.
14
International Journal of Software Engineering & Applications (IJSEA), Vol.1, No.3, July 2010
[6] V. B. Misic, H. Gevaert, and M. Rennie. “Extreme dynamics: Modeling the extreme programming
software development process.” In Proceedings of ProSim04 workshop on Software Process
Simulation and Modeling, pages 237–242, 2004.
[7] A. Cockburn and L. Williams. “The costs and benefits of pair programming.” In Proceedings of the
First International Conference onExtreme Programming and Flexible Processes in Software
Engineering (XP2000), Cagliari, Sardinia, Italy, June 2000.
[8] Jerzy Nawrocki and Adam Wojciechowski. “Experimental evaluation of pair programming.” In 12th
European Software Control and Metrics Conference (ESCOM 2001), 2001.
[9] K. Vivekanandan. “The Effects of Extreme Programming on Productivity, Cost of Change and
Learning Efficiency.” PhD thesis, Department of Computer Science, Ramanaujam School of
Mathematics and Computer Sciences, Pondicherry University, India., 2004.
[10] Hanna Hulkko and Pekka Abrahamsson.”A multiple case study on the impact of pair programming
on product quality.” In ICSE ’05: Proceedings of the 27th international conference on Software
engineering, pages 495–504, New York, NY, USA, 2005. ACM Press.
[11] Randy A. Ynchausti. “Integrating unit testing into a software development teams process.” Published
on: http://www.agilealliance.org/articles/ynchaustirandyaintegr/file, 2001.
[12] Boby George and Laurie Williams. “An initial investigation of test driven development in industry.”
In Proceedings of the 2003 ACM symposium on Applied computing, pages 1135–1139. ACM
Press, 2003.
[13] S. Kuppuswami , K. Vivekanandan , Prakash Ramaswamy , Paul Rodrigues, “The effects of
individual XP practices on software development effort”, ACM SIGSOFT Software Engineering
Notes, v.28 n.6, November 2003.
[14] M. Korkala, P. Abrahamsson, and P. Kyllo¨nen, “A Case Study on the Impact of Customer
Communication on Defects in Agile Software Development,” Proc. AGILE Conf., 2006.
[15] Ben-Gal I., Ruggeri F., Faltin F. & Kenett R., Bayesian Networks, in Encyclopedia of Statistics in
Quality & Reliability, Wiley & Sons, 2007
[16] Norman Fenton and Martin Neil, “Managing Risk in the Modern World: Applications of Bayesian
Networks” Knowledge transfer report, London Mathematical Society, 2007.
[17] P. Abrahamsson, "Extreme Programming: First Results from a Controlled Case Study," presented at
29th. IEEE EUROMICRO Conference, Belek, Turkey, 2003.
Authors
Mohamed Abouelela received his M.Sc. degree in Engineering
Mathematic from Cairo University, Cairo, Egypt in 2007. He is
now a research assistant and Ph.D. candidate affiliated with the
Software System Engineering Department at University of
Regina, Regina, Canada. His main interests include Software
Process modelling, XP assessment, Data management in grid
systems, and optical grid networks.
15