
Master Thesis

Computer Science
Thesis no: MCS-2008:5
January 2008

Software Process Simulation Modelling:


A Multi Agent-Based Simulation Approach

Redha Cherif

Department of
Systems and Software Engineering
School of Engineering
Blekinge Institute of Technology
Box 520
SE – 372 25 Ronneby
Sweden
This thesis is submitted to the Department of Systems and Software Engineering, School of
Engineering at Blekinge Institute of Technology in partial fulfilment of the requirements for the
degree of Master of Science in Computer Science (Intelligent Software Systems). The thesis is
equivalent to 20 weeks of full time studies.

Contact Information:
Author: Redha Cherif
E-mail : [email protected]

University advisor:
Professor Paul Davidsson
Department of Systems and Software Engineering

Department of Systems and Software Engineering
School of Engineering
Blekinge Institute of Technology
Box 520
SE – 372 25 Ronneby
Sweden

Internet : www.bth.se/tek/aps
Phone : + 46 457 38 50 00
Fax : + 46 457 271 25
ABSTRACT

In this thesis we present one of the first actual applications of Multi Agent-Based Simulation (MABS)
to the field of software process simulation modelling
(SPSM). Although a few previous applications were
attempted, we explain in our literature review how these
failed to take full advantage of the agency paradigm.
Our research resulted in a model of the software
development process that integrates performance,
cognition and artefact quality models, for which we
built a common simulation framework to implement
and run MABS and System Dynamics (SD) simulators
upon the same integrated models.
Although it is not possible to fully verify and
validate implementations and models like ours, we used
a number of verification and validation techniques to
increase our confidence in these.
Our work is also novel in that it compares MABS to SD in the context of SPSM. Here we uncovered interesting properties of each simulation approach, for example that MABS is risk-averse compared to SD.
In our discussion section we also present a number of lessons learned regarding the two simulation paradigms, as well as shortcomings both in the models we adopted and in our own.
This research draws on both qualitative and
quantitative methods.

Keywords: Software Process Simulation Modelling, Multi Agent-Based Simulation, System Dynamics.

ACKNOWLEDGEMENTS
I would like to take this opportunity to express my gratitude to my advisor
Professor Paul Davidsson for his help and encouragement since the start of this
research. His guidance was essential in focusing on the “relevant” and “significant”
individual characteristics to cover in this work. In this context, Professor Claes Wohlin
also deserves credit and thanks for advising me on the nature of these characteristics.
Also, I would like to thank Thomas Birath from UIQ technologies AB in Ronneby,
for his precious time and for giving me the opportunity to investigate the software
development process at UIQ. Even though this did not lead to a full-scale case study, the added knowledge was truly beneficial to me.

CONTENTS

ABSTRACT ............................................................................................................................1

1 INTRODUCTION ..........................................................................................................7
1.1 BACKGROUND ...........................................................................................................7
1.2 AIMS AND OBJECTIVES..............................................................................................8
1.3 CONTRIBUTION .........................................................................................................8
1.4 SCIENTIFIC RELEVANCE ............................................................................................8
1.5 RESEARCH QUESTIONS ..............................................................................................8
1.6 EXPECTED OUTCOMES ..............................................................................................8
1.7 RESEARCH METHODOLOGY ......................................................................................9
1.7.1 Literature review (RQ 1,2 & 3) ........................................................................9
1.7.2 Simulations (Model validation) ........................................................................9
1.7.3 Quantitative methods (RQ 4) ............................................................................9
1.7.4 Qualitative analysis (RQ 5 and 6) ....................................................................9
2 RELATED WORK.......................................................................................................10
2.1 SOFTWARE PROCESS SIMULATION MODELLING.....................................................10
2.2 SYSTEM DYNAMICS .................................................................................................10
2.3 SYSTEM DYNAMICS APPLIED TO SPSM ..................................................................11
2.4 SOCIAL CONSIDERATIONS .......................................................................................11
2.5 AGENT-BASED SIMULATION MODELLING ...............................................................12
2.5.1 The idea ..........................................................................................................12
2.5.2 The attempts....................................................................................................12
2.6 MODELLING SOFTWARE DEVELOPERS’ PERFORMANCE AND COGNITIVE
CHARACTERISTICS ..............................................................................................................13
2.6.1 Performance characteristics...........................................................................13
2.6.2 Cognitive characteristics ................................................................................13
3 SOFTWARE PROCESS SIMULATION MODELLING.........................................15
3.1 SOFTWARE DEVELOPMENT AS A PROCESS ..............................................................15
3.1.1 The “Water fall” model ..................................................................................15
3.1.2 The Incremental model ...................................................................................15
3.2 MODELLING THE SOFTWARE DEVELOPMENT PROCESS ...........................................16
3.2.1 Effort Performance Model (EPM) ..................................................................16
3.2.2 Knowledge Model (HKM)...............................................................................18
3.2.3 Artefact Quality Model (AQM) .......................................................................20
3.2.4 Integrating the models ....................................................................................21
3.2.5 Developer/artefact interaction model.............................................................21
3.3 SIMULATION FRAMEWORK .....................................................................................22
3.3.1 Framework overview ......................................................................................22
3.3.2 Framework’s model variables manipulation..................................................23
4 MULTI AGENT-BASED SIMULATION MODEL..................................................26
4.1 AGENTS AND THEIR ENVIRONMENT ........................................................................26
4.1.1 Situation (environment) ..................................................................................26
4.1.2 Characteristics of the simulation environment...............................................26
4.1.3 Autonomy ........................................................................................................27
4.1.4 Intelligence .....................................................................................................27
4.2 IMPLEMENTATION ...................................................................................................28
4.2.1 Agents .............................................................................................................28
4.2.2 Decision making .............................................................................................28

5 SYSTEM DYNAMICS SIMULATOR .......................................................................29
5.1 MODEL PREREQUISITES ..........................................................................................29
5.2 A SYSTEM DYNAMICS MODEL OF A DEVELOPMENT PROCESS ...............................29
5.2.1 Feedback dynamics structure .........................................................................29
5.2.2 Levels ..............................................................................................................30
5.2.3 Flows ..............................................................................................................30
5.2.4 Auxiliary variables..........................................................................................30
5.2.5 Difficulties.......................................................................................................30
5.3 IMPLEMENTATION ...................................................................................................31
6 VERIFICATION AND VALIDATION......................................................................32
6.1 VERIFICATION .........................................................................................................32
6.2 VALIDATION ...........................................................................................................32
6.2.1 Face validity ...................................................................................................32
6.2.2 Internal validity ..............................................................................................32
6.2.3 Tracing............................................................................................................33
6.2.4 Model-to-Model validation.............................................................................33
6.2.5 Predictive validation.......................................................................................33
6.2.6 Preliminary validity conclusions ....................................................................33
7 COMPARING MABS TO SD .....................................................................................34
7.1 COMPARING OUTCOMES .........................................................................................34
7.1.1 Experiment overview ......................................................................................34
7.1.2 Analysis...........................................................................................................35
7.2 COMPARING MODELLING ISSUES ............................................................................37
7.2.1 Model elicitation.............................................................................................37
7.2.2 Model configuration and initialisation ...........................................................37
8 DISCUSSION................................................................................................................38
8.1 RESULTS..................................................................................................................38
8.1.1 Modelling the individual-based view..............................................................38
8.1.2 Comparing MABS to SD.................................................................................38
8.1.3 Lessons learned from MABS and SD modelling.............................................38
8.1.4 The cost of MABS............................................................................................39
8.2 SHORTCOMINGS ......................................................................................................39
8.2.1 The EPM model and its Locus of Control scale .............................................39
8.2.2 The Knowledge Model ....................................................................................39
8.2.3 Relating requirement scope to effort ..............................................................40
9 CONCLUSIONS...........................................................................................................41
9.1 SUMMARY OF RESULTS...........................................................................................41
9.1.1 Accomplishments ............................................................................................41
9.1.2 Contributions ..................................................................................................41
9.1.3 Lessons learned ..............................................................................................41
9.2 FUTURE WORKS ......................................................................................................42
9.2.1 Improvements..................................................................................................42
9.2.2 Experimentation..............................................................................................42
9.2.3 Application......................................................................................................42
9.2.4 MAS and SPSM...............................................................................................42
9.2.5 Optimisation features......................................................................................42
REFERENCES .....................................................................................................................43

APPENDIX A VALIDATION OF SIMULATION FRAMEWORK ..........................1


A.1 PRELIMINARY FACE VALIDITY TESTING OF TIME DIMENSION ...................................1

A.1.1 Single high performing developer working round-the-clock ............................1
A.1.2 Effect of weekend breaks on progress...............................................................1
A.1.3 Restricting work hours to the interval [8 – 17[ ................................................2
A.1.4 Accounting for lunch breaks .............................................................................2
A.1.5 Doubling the human resources .........................................................................3
A.2 PRELIMINARY FACE VALIDITY TESTING OF PERFORMANCE ......................................4
A.2.1 An “ideal” developer........................................................................................4
A.2.2 Two equally “ideal” developers .......................................................................5
A.2.3 Two quite “normal” developers with different performance levels .................5
A.3 MODEL-TO-MODEL COMPARISON WITH HKM .........................................................6
A.3.1 Case 1-1............................................................................................................6
A.3.2 Case 1-2............................................................................................................7
A.3.3 Case 1-3............................................................................................................8
APPENDIX B QUESTIONNAIRES ..............................................................................1
B.1 THE LOCUS OF CONTROL SCALE ..............................................................................1
B.2 SELF-ESTEEM QUESTIONNAIRE .................................................................................3
B.3 ACHIEVEMENT NEEDS QUESTIONNAIRE ....................................................................4
LIST OF FIGURES
Figure 1 The Empirical Performance Model representing the parameters affecting effort
and performance as empirically determined by Rasch and Tosi [25].............................18
Figure 2 A somewhat simplified UML class diagram of the simulation platform............23
Figure 3 Overview of the various catalogues and their files used to set-up a simulation .24
Figure 4 Example of a phases catalogue containing individual phase definition files......24
Figure 5 A system dynamics model of a software development phase.............................29
Figure 6 The progress of 1000 different MABS (in red) and SD (in blue) simulation runs
as a function of duration. ................................................................................................35
Figure 7 A single developer working round-the-clock at 100% efficiency completes a
task of 100 hours in exactly 100 hours. ............................................................................1
Figure 8 A single developer working round-the-clock at 100% efficiency except on
weekends (48 hours delay) completes a task of 100 hours in 148 hours..........................2
Figure 9 A single developer working regular hours only on weekdays at 100% efficiency
completes a task of 100 hours in 361 hours......................................................................2
Figure 10 A single developer working regular hours on regular weekdays with lunch
breaks between 12:00 to 13:00, performing at 100% takes 388 hours to complete a 100-
hour task. 3
Figure 11 Two identical developers collaborating on the same 100-hour task. They
terminate after 194 hours, which is exactly half the time it takes a single developer to
perform that task. ..............................................................................................................3
Figure 12 Single developer with an “ideal” IC = {1.0, 1.0, 1.0} and knowledge level
greater than required knowledge level..............................................................................4
Figure 13 Two developers with IC = {1.0, 1.0, 1.0} and knowledge level greater than
required knowledge level, complete a task worth 100 hours in 53 hours (actually 52.3). 5
Figure 14 Two developers. D1 = {0.6, 0.7, 0.2} and D2 = {0.6, 0.7, 0.7} both with a
knowledge level greater than required, complete a task worth 100 hours in 67 hours
(actually 66.6)...................................................................................................................6
Figure 15 Single developer with IC = {0.5, 0.6, 0.5} and KC = {100, 100, 30}
corresponding to case 1-1 of Hanakawa et al. [13]...........................................................7
Figure 16 Single developer with IC = {0.7, 0.6, 0.5} and KC = {0, 100, 30}
corresponding to case 1-2 of Hanakawa et al. [13]...........................................................8
Figure 17 Single developer with IC = {0.7, 0.6, 0.5} and KC = {70, 300, 10}
corresponding to case 1-3 of Hanakawa et al. [13]...........................................................9

LIST OF TABLES
Table 1 Rasch and Tosi [25]’s definitions of the various individual characteristics they
considered.......................................................................................................................17
Table 2 Empirical Performance Effects; as reported by Rasch and Tosi [25].................18
Table 3 Example of a roles definition file .......................................................................24
Table 4 A simplistic example of a phase definition file pertaining to the design phase..25
Table 5 An example of a project’s phases definition file. Among other things it specifies
that the software design phase will complete once quality has reached at least 96%.....25
Table 6 The currently supported termination criteria, their syntax and description ........25
Table 7 Situation action rules describing the agent’s decision-making process. Text in
bold face represents primitive actions. Clock in and out helps the simulator in
bookkeeping a developer’s effort. ..................................................................................28
Table 8 Result of the statistical analysis of 200 simulation pairs of MABS and SD based
on random project scopes, in range 10 to 60 hours. .......................................................33
Table 9 Individual and knowledge characteristics of the five participants......................34
Table 10 Result of the statistical analysis of 1000 simulation pairs of MABS and SD runs
with varying project scopes drawn at random in the range 100 to 1000 hours...............36

1 INTRODUCTION
1.1 Background
Software process simulation modelling (SPSM) is an approach to analysing,
representing and monitoring a software process phenomenon. It addresses a variety of
such phenomena, from strategic software management to software project
management training [16]. Simulation is a means of experimentation, and so is SPSM.
Such experimentation attempts to predict outcomes and improve our understanding of
a given software process. While controlled experiments are too costly and time-consuming [23], SPSM carries the hope of providing researchers and software managers with “laboratory-like” conditions for experimenting with software processes.
There are numerous techniques for proceeding with SPSM. Kellner et al. [16]
enumerate a number of these, such as: state-based process models, discrete event
simulations and system dynamics (SD) [10]. The former two are discrete in nature, while the latter is continuous.
A number of SD models have been quite successful in matching real life
quantitative data [7]; most notable are those of Abdel-Hamid [1],[2], Abdel-Hamid and
Madnick [3], Madachy [20], Glickman and Kopcho [11]. However, these represent a
centralistic activity-based view that does not capture the interactions at the individual
level [31]. When an activity-based view is applied to SPSM, the various characteristics of the developers, which are individual in nature, are represented by group averages such as average productivity, average assimilation delay and average transfer delay, as in [2]. Models based on such views in effect assume homogeneity among the developers [31], which may result in the model not being able to account for or explain certain facts observed in real life, as noted by Burke [7].
Since software development is a human-intensive activity, an interest in incorporating social considerations into SPSM models has emerged [31]. Christie and
Staley [8] were among the first to introduce social issues into such models. They
attempted to study how the effectiveness of human interactions affected the quality
and timeliness of a JAD1 requirement process. For this purpose, they used a discrete
event-based approach to model the organisational process, while continuous
simulation was used for their social model. Integrating the continuous social model
into the discrete organisational one proved problematic [8] due to the fundamental
difference between these temporal paradigms. Burke [7] followed up by integrating
social considerations in the modelling of a high-maturity software organisation, GSFC2
at NASA. Here too, system dynamics was used.
However, as noted above, equation-based models such as system dynamics often embody assumptions of homogeneity, yielding less accurate results than those excluding such assumptions. Parunak et al. [24] illustrate this in a case study
comparing agent-based modelling (ABM) to equation-based modelling (EBM). Their
findings are that ABMs are “most appropriate” for modelling domains characterised by
being highly distributed and dominated by discrete decisions, while EBMs are more
appropriate for domains that can be modelled centrally, “and in which the dynamics
are dominated by physical laws rather than information processing”.
Finally, Wickenberg and Davidsson [31] build the case for applying multi agent-based simulation (MABS) to software development process (SDP) simulation. They base their arguments on a review of the field and list most of the shortcomings described above: activity-based views, homogeneity assumptions and the human-intensive (thus individual) nature of software processes. Despite all these arguments in favour of MABS [31], consolidated by the information-processing dynamics [24] of SPSM, hardly any research can be found on integrating the two.

1 Joint Application Development
2 Goddard Space Flight Centre

1.2 Aims and objectives


The aim of this research is to establish the appropriateness of MABS to the field of
SPSM, as advocated by Wickenberg and Davidsson [31], by, among other means, comparing it to SD, a well-established SPSM methodology.
To reach our goal, the following objectives need to be achieved:
- Derivation of an SDP model that takes an individual-based view of the process
- Implementation of this SDP model as a common simulator framework
providing a fair comparison ground for both MABS and SD simulators.
- Quantitatively and qualitatively compare MABS to SD highlighting
advantages and weaknesses of each.

1.3 Contribution
Based on our review of the literature, there appears to be grounds for claiming that
MABS is an appropriate alternative for SPSM, probably more appropriate than SD in
simulating the software process from an individual-based view. However, we are not
aware of any evidence of this, as there seem to be no serious attempts to apply MABS to SPSM, and even fewer to compare it to SD (in an SPSM context).
Our primary contribution would be in establishing the appropriateness of MABS to SPSM, and perhaps even in providing first evidence that MABS is actually more appropriate than SD in this context (and under certain conditions).

1.4 Scientific relevance


As explained earlier, software process simulation modelling is a means of
experimenting under “laboratory-like” conditions with aspects of software
development that are too costly and/or time-consuming to assess otherwise. Such simulations give scientists and managers alike the ability to further their knowledge and understanding of software development processes. A more appropriate and accurate method of simulating such processes is therefore likely to enhance this ability, resulting in more accurate results and improved understanding.

1.5 Research questions


Our research shall attempt to answer the following questions:
A. In order to derive an individual-based view:
1. How do we model the individual characteristics of a software developer?
2. How do we model a software artefact (specification, documentation, code, etc.)?
3. How is the interaction between developers and artefacts modelled?

B. When comparing MABS and SD:


4. Do MABS and SD actually present any significant differences in projections?
5. What are the advantages and disadvantages of MABS with regards to SD?
6. For which aspects of the software process is MABS or SD more appropriate?

1.6 Expected outcomes


1. A common simulation model framework of an individual-based view of SDP
2. A MABS and an SD software development process simulator.
3. A comparison of both approaches

1.7 Research Methodology
For the purpose of our research a number of qualitative and quantitative research
methods were used to answer our various questions and to validate our models.

1.7.1 Literature review (RQ 1,2 & 3)


Our research led us to investigate a number of questions related to modelling, namely how to model an individual developer, how to model artefacts, and how to model the interaction between them. Our answers to these questions are mainly based
on our research and analysis of the literature and related works. These provided us
with a performance and cognitive model, which we adapted and completed with a
number of smaller models – based on our interpretation of the remainder of the
literature review as well as our own experience.
One could argue that we could have used a more established and more formal
methodology to derive our various models, such as MAS-CommonKADS, as used in
Henesey et al. [15]. However, we saw two main obstacles to using this approach: (i) our purpose was not so much to derive as accurate a model as possible as it was to study the application of MABS, compared to SD, in software process simulation modelling, and from that perspective to develop or even select existing models that can support that purpose; and (ii) MAS-CommonKADS is built around a number of models that need to be elicited through an extensive series of, among other things, interviews. There was simply no room for this with regard to our main objectives.

1.7.2 Simulations (Model validation)


To verify and validate our simulators and models we proceeded with a number of
activities, as described in section 6. Among these we can name face validity tests and
model-to-model validation [33]. These activities required actual simulations, the result
of which enhanced our confidence in the simulator’s prognoses in general.

1.7.3 Quantitative methods (RQ 4)


Having developed both MABS and SD simulators we wanted to answer the
question of whether there were any significant differences in their projections. For this, an extensive experiment was set up with a sufficiently large series of simulations. The projections were then collected and statistical characteristics of the samples were derived in order to establish or reject the significance of the observed differences.
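To make this kind of analysis concrete, the sketch below shows one way such paired projections could be compared, assuming MABS and SD are run in pairs on the same randomly drawn project scopes. It is our illustration only: the function and variable names are hypothetical, and the actual analysis is summarised in section 7 and in Tables 8 and 10.

# Minimal sketch (ours, not the thesis's analysis code) of a paired comparison
# of MABS and SD projections. Each pair is assumed to have been simulated on
# the same randomly drawn project scope, so a paired t-test is appropriate.
from statistics import mean, stdev
from scipy import stats

def compare_projections(mabs_durations, sd_durations, alpha=0.05):
    """Summarise two paired samples of projected durations and test whether
    their mean difference is statistically significant."""
    diffs = [m - s for m, s in zip(mabs_durations, sd_durations)]
    t_stat, p_value = stats.ttest_rel(mabs_durations, sd_durations)
    return {
        "mean_mabs": mean(mabs_durations),
        "mean_sd": mean(sd_durations),
        "mean_diff": mean(diffs),
        "stdev_diff": stdev(diffs),
        "t": t_stat,
        "p": p_value,
        "significant": p_value < alpha,
    }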

1.7.4 Qualitative analysis (RQ 5 and 6)


The qualitative analysis of the results of the various tests, comparisons and
difficulties encountered during this research, see section 8, allowed us to discuss
advantages and disadvantages of both MABS and SD and the cases we found most
appropriate for each.

2 RELATED WORK
In this section we discuss the various studies and articles that pertain to our work. Given that our research questions lie at the junction of a number of fields, namely simulation modelling (SPSM, SD and ABM), software engineering (software processes and developer characteristics), psychology (various theories of motivation and human performance) and computer science (the agency paradigm), we chose to organise our review under the following headings.

2.1 Software Process Simulation Modelling


Kellner et al. [16] present an important paper that summarises the work being
carried out at the time of the first “International Workshop on Software Process
Simulation Modeling” (ProSim'98).
The main contribution of this paper is being a structured introduction to the field,
as it identifies and describes in detail the various purposes (‘why’), scope (‘what’) and
approaches (‘how’) of software process simulation modelling. Also it presents a useful
framework that helps categorise simulation modelling. The main purpose of
simulation, as they put it, is in aiding decision-making. Such decisions may relate to:
strategic management, planning, operational management, process improvement,
training and understanding.
The authors stress that although a single approach could be used to solve most modelling issues, given sufficient experience on the part of the modeller, no single approach is a natural fit for all such issues. The article concludes with a discussion of continuous-time simulation (e.g. system dynamics) versus discrete-event and state-based simulations. According to Kellner et al. [16], the former is more appropriate for
macro-level and/or long-term analysis, while the latter is better suited for lower-level
analysis such as analysing details of the process and/or resource utilisation at some
given stages.

2.2 System dynamics


System Dynamics (SD), a field developed by J. Forrester in the 1950s, combines theory, methods and philosophy [10] in an attempt to understand the “behavioural implications” of complex systems. It draws extensively on the dynamics of feedback systems.
The SD philosophy considers that it is not enough to understand the individual parts of a system if these are not put in the context of the feedback structure that governs the behaviour of the system as a whole, and therefore the final behaviour of the individual parts. The complexity of such systems and their interactions cannot be grasped intuitively; SD therefore advocates simulating these systems.
Feedback structure is defined as the setting in which conditions influence
decisions, which in turn affect the conditions that influence future decisions [10]. SD
recognises, and indeed emphasises, that feedback structures dominate agents’ decision-making (far more than the agents themselves realise) [10].
As to simulation, it provides the tool for managing the high complexity of the
model and its feedback structure [1].
We find the philosophy and theory behind SD to explain a lot about system
complexity and its non-linear character, yet its application is quite restrictive. For all of
SD’s acknowledgement of the dynamic behaviour of systems, SD’s application, in our
opinion, fails to account for the dynamics in the very structure of the system. Models
for example cannot be instantiated, which makes it difficult to simulate dynamic
changes to the structure. This is not all that surprising, as SD is older than the object
orientation paradigm. However, we believe that SD’s widespread use in the research community should have resulted in more attention to such limits.
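To make the level/flow vocabulary concrete before turning to SD's application to SPSM, the toy example below (ours, not taken from [10]) simulates a single level, the work remaining in a phase, drained by a flow whose rate depends on the level itself; that dependence is the feedback structure. The constants and the shape of the feedback are purely illustrative.

# Toy system dynamics loop, for illustration only: one level ("remaining"
# work), one flow ("progress rate"), integrated with simple Euler steps.
# The dependence of the rate on the level is the feedback structure.
def run_sd(remaining=100.0, nominal_rate=1.0, dt=0.25, horizon=1000.0):
    t, history = 0.0, []
    while remaining > 0.0 and t < horizon:
        # Feedback: progress slows as remaining work approaches zero,
        # a purely assumed shape standing in for e.g. integration overhead.
        pressure = min(1.0, remaining / 20.0)
        rate = nominal_rate * pressure                 # flow (hours of work per hour)
        remaining = max(0.0, remaining - rate * dt)    # level update (Euler step)
        t += dt
        history.append((t, remaining))
    return history

A full SD model of a development phase (see section 5) contains many such levels and flows coupled through auxiliary variables, but the underlying integration loop is essentially the same.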

2.3 System dynamics applied to SPSM
Abdel-Hamid’s work [1],[2] (including Abdel-Hamid and Madnick [3]) from the 1980s is among the most notable, and very likely the first, applications of system dynamics to the field of software process simulation. The original work [1], a PhD dissertation, was carried out in light of the then-termed “software crisis” (Pressman 1982, as cited in [1]), which refers to software engineering’s difficulties in terms of cost and schedule overruns as well as failure to satisfy customer expectations. The concern in [1] was originally from a managerial perspective; however, the derived model was an “integrative” one that included:
…both the management-type functions (e.g., planning, controlling, staffing) as well as
the software production-type activities (e.g., designing, coding, reviewing, testing).
The model is based on a battery of 27 interviews of software project managers in
five different well-established software organisations, supported by an extensive
review of the literature that provided for a large amount of empirical findings. The
purpose of the model was to improve the then understanding of the software
development process. According to Abdel-Hamid [1], SD’s “powerful formalization
and simulation tools” helped manage the complexity of the model and its hundreds of
variables. This model was then used in a case study of a sixth software-producing
organisation, in which it proved highly accurate in replicating the history of a selected project, especially regarding workforce level, schedule and cost variables [1].

2.4 Social considerations


Christie and Staley [8] attempted to model a requirement development process,
namely JAD3, from both an organisational and social perspective. The model helped
analyse how the effectiveness of interaction among participants affects the quality and
duration of the JAD process. The authors propose an empirical model of the various
participants. These are categorised as facilitators and technical experts. The experts are
modelled by assigning a numerical value, ranging from zero to one, to three key
characteristics: technical competence, ability to influence others, and openness to
other’s ideas. The model evaluates the understanding of each participant by applying a
numerical analogy of a flow in and out of a container, with the quantity of fluid
remaining in the container representing therefore the “understanding” of the given
participant. This, in our opinion, is an over simplification of “understanding” partially
due to the system dynamics modelling approach. In addition, facilitators are modelled
to guide the JAD sessions by affecting the values of the key characteristics of each
expert participant. Although an equation of how this change occurs is presented, the
validity of such an equation is not established. The authors seem to be satisfied with how their model corresponds to what “one would expect” when applying the extreme values zero and one to the experts’ key characteristic values. Although this may be true,
it says little, if any, about the validity of the model for values ranging in between these
extremes.
Despite these limitations, the article has the benefit of being one of the early
attempts to model human interaction in the field of requirement development for the
purpose of simulation. It underlines the need for realistic models development and
explains that simulation is subject to more stringent barriers than traditional methods,
because of the validity, or lack of, of its underlying models. Additionally the authors
insist on that the software simulation community needs to address these issues if it is to
make a significant contribution in process improvement.

3 Joint Application Development

2.5 Agent-based simulation modelling
2.5.1 The idea
Wickenberg and Davidsson [31], writing on simulating the software development process, note that despite such processes being performed by a set of cooperating individuals, they are usually modelled using a centralistic activity-based view rather than an individual-based view. They were the first to suggest that multi agent-based simulation
was a feasible alternative to activity-based approaches for simulating the software
development process. Their investigation was about the applicability of MABS to SDP
in general. Therefore they treated the problem from several abstraction levels and
found MABS to be a feasible alternative. Simulations concerned with the very minutiae of the interactions between developers can benefit from MABS, as it could “(in principle)” model and capture “conversations” between the various developers. Simulations designed for higher levels of abstraction, such as studying employee turnover, can also benefit from a MABS approach. Wickenberg and Davidsson [31]
note that at an even higher level of abstraction, agents need not necessarily represent
only the individuals within an organisation. They could be used to model departments
within organisations, the organisations themselves, or other SDP stakeholders and the
interaction among these. According to them, MABS applied to SDP suffers from a serious limitation: in many cases there is only limited knowledge or information available about individual behaviour beyond the role defined by the process. Another hindrance to using MABS within an SDP setting is the lack of statistical data documenting the individual behaviour of the participants during such a
process. There is therefore little to gain from using MABS if the behaviour is known only in terms of collective measures (averages) of performance and if the role played by the developer maps directly to the process, i.e. is clearly predictable.
Our work uses this document ([31]) as its starting point. During our study of the
problem, we came to experience the lack of statistical data regarding the individual
behaviour of developers. We partially remedied this problem by using an integrative model, as explained in section 3.2, which shifts the problem from collecting behavioural data, which is difficult to evaluate and measure, to collecting data on the variables that affect this behaviour, which are better defined and therefore easier to identify and measure.

2.5.2 The attempts


Yilmaz and Phillips [34] present an agent-based simulation model that they use to
understand the effects of team behaviour on the effectiveness and efficiency of a
software organisation pursuing an incremental software process such as RUP4. Their
research relies on organisation theory to help construct the simulation framework. This
framework is then used to compare and identify efficient team archetypes as well as
examine the impact of turbulence, which they describe as requirement change and
employee turnover, on the effectiveness of such archetypes. They validated their
model by comparing its output with established facts observed in empirical studies.
While the authors use agents to represent teams of developers, their focus is
articulated at the team level, not the individual one. Currently each team is modelled as
a single agent. They do recognise however the need to investigate further models of
individual developer agents.
Although they view teams as autonomous entities, it is our opinion that they draw only limited advantage from the agency paradigm, because they are forced to rely on group averages to represent developer performance, which, as explained earlier, introduces homogeneity assumptions that may result in such simulations not being able to account for or explain certain facts observed in real life, as noted by Burke [7].

4 Rational Unified Process

In another study, Smith and Capilupp [29] attempted to apply agent-based
simulation modelling to open source software (OSS) to study the relation between size,
complexity and effort. They present a model in which complexity is considered a
hindering factor to productivity, fitness to requirement and developer motivation. They
highlight the fundamental difference between “traditional” proprietary software
development and that of the more cooperative and distributed OSS in terms of both
process and evolution. To validate their model they compared its results to four large
OSS projects. This model could not, so far, account for the evolution of size in an OSS
project.
We find this model to be rather simplistic as both developers and requirements are
“indiscriminately” modelled as agents and implemented as patches on a grid5. This
grid introduces a spatial metaphor that we find inappropriate. The authors for example
use the notion of physical vicinity to model the “chances” of a given requirement to
attract a developer “passing through cyberspace”. Although they speak of cyberspace,
vicinity actually implies physical space.
One of the strengths of the agency paradigm is that it allows systems to be designed using metaphors close to the problem domain, especially in the presence of distributed and autonomous individuals or systems. Therefore, using the physical vicinity of a requirement or task to a passing individual as a measure of the probability of that individual taking an interest in the task is a metaphor that, in our opinion, does not map well to reality, suggesting an inappropriate use of agents.

2.6 Modelling software developers’ performance and cognitive characteristics
2.6.1 Performance characteristics
Rasch and Tosi [25] present a theoretical model over the factors that affect a
developer’s effort and performance. They call their model an integrated one because it
attempts to integrate expectancy theory and goal-setting theory and research on
individual characteristics such as self-esteem, achievement needs and locus of control.
Supporting their model is an empirical study in which they collected around 230
useable answers from three major software firms in the United States of America.
Their study qualifies and quantifies the relation of various factors to individual effort
and performance.
We find this empirical study to be built upon quite solid theoretical grounds and
validated using rigorous statistical methods. Counter-intuitively, however, the model does not explicitly include experience or, for example, developers’ social “allegiances”. It could be that these factors have only limited influence, if any, after all, but their ‘falsely’ intuitive character would, in our opinion, warrant a mention.

2.6.2 Cognitive characteristics


Hanakawa et al. [12] take yet another approach to software process simulation by
presenting a model that accounts for a developer’s level of knowledge and its dynamic
fluctuations as the project progresses. The fluctuation in what they call the knowledge structure is directly correlated with the developer’s productivity.
Importantly, this model acknowledges that a developer may gain more knowledge
as he or she carries out project activities. When a task requires less knowledge than is available, the task is considered easy; the developer performs highly, but no knowledge is gained. If the required knowledge is “slightly” above a developer’s knowledge, his or her performance may be lower, but there is a gain in knowledge that will benefit performance at a later stage. Finally, if the required knowledge is beyond a given threshold, the task is considered too difficult and only very little knowledge may be gained.

5 Smith and Capilupp [29] used the freely available agent software NetLogo, which represents agents as cells in a grid, each of which is responsible for maintaining its own state information.

3 SOFTWARE PROCESS SIMULATION MODELLING
In this section, we deal with the modelling issues related to the simulation of the
software development process. The first subsection is a succinct overview of software engineering from a process or sequence model perspective, which provides a background for the following subsection, in which we attempt to model the process; finally, we present the simulation framework that implements this model.

3.1 Software development as a process


A software development process can be viewed as a succession of phases, each of which comprises a number of activities that rely on input artefacts, such as specifications, produced in previous phases to generate the artefacts of the current phase. A process must include a termination criterion for the project and for each of its phases, defined in terms of deadline, quality achieved, completion level, cost or project cancellation.
A process can also be seen as a sequence or chronology model for the activities it covers, specifying in what order the various phases occur, what feedback, if any, there is from one phase to any previous one, and in which sequence such feedback propagates. From a sequence model point of view, most processes can be described as following either the “water fall” model, one of its derivatives, or a tentative improvement on it [4], most of which can be described as incremental or iterative.

3.1.1 The “Water fall” model


The “water fall” model expects the various phases of a project to follow each other with very little back-propagation (of errors and/or feedback). In a sense, the model implicitly assumes that each phase ends neatly, providing the following phase with sufficiently high-quality artefacts and therefore requiring no significant redoing of any
previous phase. Such a process works fine for projects where requirements are well
defined, remain stable throughout the project and where most risks, delays and costs
are predictable already in the early stages of the process. It usually starts with a
feasibility study and terminates with a review [4] that may unveil errors, shortcomings
and potential for improvement. If a decision is made to remedy any such findings,
then the process is started anew from feasibility study to review. Although there are
many engineering disciplines that may draw benefit from such a simple yet systematic model, software engineering cannot be said to be one of them. Yet large software companies do tend to adopt this model despite its shortcomings. One explanation we have is that the simplicity of the model, its main “sales argument”, seems so important that management and/or software engineers believe it compensates for the above-mentioned drawbacks and are prepared, in their turn, to compensate for it by “managing”
the problem and balancing its benefits and limits (which is actually an engineering
competence).

3.1.2 The Incremental model


An alternative process sequence model is known as incremental or iterative. Such
a model takes a “divide and conquer” approach in each phase. That is to say, instead of
each phase running until completion (as in the water fall model), it carries out only a limited portion of its assigned activities, then temporarily suspends to allow a succeeding phase to proceed in a similar piecemeal fashion. Some processes that follow such sequencing incorporate quality control or error-detection feedback to preceding phases within each phase, while others wait until all phases have partially run before including the required improvements. The advantages of such an approach are: (i) it acknowledges the possibility of low-quality output (artefacts) from any given phase and therefore incorporates feedback, and (ii) it detects incorrectness, incompleteness, inconsistency and/or conflicting requirements early in the project. Early remedy of such shortcomings, thanks to their early detection, is obviously much more economical than late remedy.
The main difference between the two sequence models from a simulation perspective is the management of remaining phase activities that need to be completed in a coming iteration, including pending improvements or corrections. This bookkeeping applies only to the incremental sequence model, as the sketch below illustrates.
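The following sketch is purely illustrative of that bookkeeping; the class and function names are ours and do not correspond to the simulation framework described in section 3.3. Each phase carries its remaining work and any rework fed back from later phases into the next iteration; a waterfall run is the special case of a single iteration covering all of the work.

# Hypothetical illustration of the per-phase bookkeeping an incremental
# sequence model requires (names are ours, not the framework's).
class Phase:
    def __init__(self, name, total_work):
        self.name = name
        self.total = total_work       # hours of activity assigned to the phase
        self.remaining = total_work   # hours not yet carried out
        self.rework = 0.0             # hours fed back from later phases

def run_incremental(phases, fraction_per_iteration=0.25):
    """Advance every phase by a slice of its work per iteration until no
    work (including fed-back rework) is left; returns the iteration count."""
    iteration = 0
    while any(p.remaining > 0.0 or p.rework > 0.0 for p in phases):
        iteration += 1
        for p in phases:
            slice_ = fraction_per_iteration * p.total
            p.remaining = max(0.0, p.remaining - slice_)
            p.rework = max(0.0, p.rework - slice_)   # catch up on corrections too
        # After reviews, a real simulator would add new rework to earlier phases.
    return iteration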

3.2 Modelling the software development process


For the purpose of this work, and to derive the individual-based view (model), we first needed to define which of the many possible characteristics and variables that influence the performance of software developers and the quality of their artefacts were most relevant.
Based on our review of the literature, we believe that there are two major factors
that greatly contribute to a developer’s achievements and the quality of the artefacts he
or she produces. The first is performance, which is based on individual characteristics
representing the effort level a developer is willing to achieve and at what rate. The
second factor, as we see it, is the knowledge competence of the developer. Without adequate knowledge of the domain, the task at hand, related technologies or any prerequisites, it is unlikely that a developer will achieve high performance, and even less likely that he or she will achieve high quality.
According to our review, Rasch and Tosi [25] presented a convincing model of
individual performance by empirically integrating expectancy theory, goal setting
theory and research on individual characteristics such as locus of control, self-esteem
and achievement needs. The scope of their study and the statistical rigour used for its
validation led us to consider their model as a basis for modelling our developers and
even, as we see it, the quality of various artefacts submitted to a developer. In the
remainder of this document we name this model: The Effort and Performance Model
(EPM).
Figure 1 shows that one of the inputs to the EPM is goal difficulty. Based on our
review of Hanakawa et al. [12], which we name herein: The Hanakawa et al.
Knowledge Model (HKM), we understand that the difficulty of a goal or task is related
to the level of knowledge of the person performing it. In other words the same task
may present different values of difficulty depending on who performs it. In addition
HKM takes into consideration the fact that ones knowledge of a task improves the
more one works on it. In a sense HKM considers knowledge to improve with
experience. Given these facts, we chose to adopt the HKM model and use it to derive,
among other variables, goal difficulty that is then fed to the EPM model.
Another input to the EPM, as shown in Figure 1, is goal clarity. It is our
understanding that this factor is expressed through both verbal instructions from
customers or management and written specifications and artefacts. The former
expressions are too complex to include in our model at present; however, the latter two are related to artefact quality, which our proposed model supports.
In the following sections we shall present the relevant aspects of EPM, HKM, and
what we term the Artefact Quality Model (AQM), and how we expect to integrate
these.

3.2.1 Effort Performance Model (EPM)


As mentioned earlier, Rasch and Tosi base their work on a conceptual
framework that attempts to integrate concepts from Expectancy theory and Goal
Setting theory with individual characteristic research.

3.2.1.1 Expectancy theory
Expectancy theory has been widely used in studying motivational issues (Baker et
al., 1989; Brownell and McInnes, 1986; Butler and Womer, 1985; Harrell and Stahl,
1984; Kaplan, 1985; Nickerson and McClelland, 1989) as cited in [25]. This theory is
based on the belief that highly motivated individuals will exert higher level of effort
resulting in higher performance as compared to less motivated ones [25].
According to [25], expectancy theory relates performance to an individual’s effort-
level, ability and role perceptions.

3.2.1.2 Goal-setting theory


Goal-setting theory relates goal success to its difficulty and clarity levels. Lack of
clarity negatively impacts performance as it introduces anxiety and hesitation in the
decision-making process. As to goal difficulty, it affects both effort and performance: a higher level of difficulty leads one to exert more effort, and as long as the goal is attainable, the increased effort results in enhanced performance (Locke and Latham, 1990, as cited in [25]).

3.2.1.3 Individual Characteristics


Rasch and Tosi [25] introduce a third perspective for analysing the effort and performance of developers, namely individual characteristics. In this perspective they consider need for achievement, locus of control and self-esteem.
Individual characteristic: Definition
Need for achievement: The extent to which an individual values success (McClelland, 1961, as cited in [25]).
Locus of control: The perception an individual has of how much control that person exerts over his or her own destiny.
Self-esteem: An individual’s notion of self-worth; this factor is found to be positively correlated with both effort and performance.
Table 1 Rasch and Tosi [25]’s definitions of the various individual characteristics they
considered

3.2.1.4 EPM Results


The results of the empirical study showed that ability was the single most
important factor affecting performance with a correlation of 0.54 as shown in Figure 1.
Table 2 quantifies the direct and indirect effects of the various factors.

[Figure 1 is a path diagram: self-esteem, achievement need, locus of control, goal clarity, goal difficulty and ability feed into effort and performance; the path coefficients (e.g. ability -> performance 0.54, effort -> performance 0.21) are summarised in Table 2.]
Figure 1 The Empirical Performance Model representing the parameters affecting effort
and performance as empirically determined by Rasch and Tosi [25].

Relation to performance Direct effect Indirect effect Total effect


Achievement needs 0.18 0.39x0.21 = 0.08 0.18 + 0.08 = 0.26
Self esteem 0.15 _ 0.15
Locus of control 0.11 _ 0.11
Goal clarity _ 0.19x0.21 = 0.04 0.04
Goal difficulty -0.11 0.19x0.21 = 0.04 -0.11 + 0.04 = -0.07
Effort 0.21 _ 0.21
Ability 0.54 _ 0.54
Table 2 Empirical Performance Effects; as reported by Rasch and Tosi [25].
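As a small illustration of how the coefficients in Figure 1 and Table 2 compose (our arithmetic sketch, not the thesis's simulator code), an indirect effect is the product of a factor's path into effort and the effort-to-performance path, and the total effect is the sum of the direct and indirect effects:

# Our arithmetic sketch of how Table 2's path coefficients compose; not the
# thesis's simulator code. An indirect effect is the path into effort times
# the effort -> performance path; the total effect is direct + indirect.
EFFORT_TO_PERFORMANCE = 0.21

def total_effect(direct, path_to_effort=0.0):
    indirect = path_to_effort * EFFORT_TO_PERFORMANCE
    return direct + indirect

print(round(total_effect(0.18, 0.39), 2))   # achievement needs -> 0.26
print(round(total_effect(-0.11, 0.19), 2))  # goal difficulty   -> -0.07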

3.2.2 Knowledge Model (HKM)


Hanakawa et al. [14] first presented a learning-curve-based simulation model that takes into consideration the fluctuation of a developer’s knowledge of a task with the time spent working on that task. After that, a description of how to apply the model in an industrial setting (providing hints as to how to elicit the values of most of the model’s variables) was published [13], followed by an updated model that takes into consideration prerequisite knowledge of a given knowledge type [12]. In this latest version, they base their model on an individual knowledge structure. This structure is
represented as a cognitive map, in the form of a graph, in which the nodes represent
knowledge elements such as Relational Database (RDB) and Structured Query
Language (SQL), while the links between such nodes represent the prerequisite
relation between these knowledge elements.
Hanakawa et al. [12], complete the cognitive map with two more parameters,
namely: adequacy of knowledge and workload (requiring that particular knowledge).
Adequacy of knowledge represents the percentage of individual achievement on the
particular knowledge element. Workload is an estimate measure of the activity
requiring the knowledge element. The former can be quantified either by submitting a
developer to an examination, testing his or her knowledge of the knowledge element in question, or, if this is not feasible, one can rely on the experience of the developer to quantify the adequacy of his or her knowledge [13]. As to the workload, it is quantified by analysing the volume of artefacts produced previously on similar
projects. From such documents two types of information are extracted (i) size of a
given activity and (ii) the type of knowledge applied. Hanakawa et al. [12] present the example of a previous design document from a project similar to the one being estimated. From that document they count the type of knowledge required in the making of each page. In their example the design document was 500 pages, of which 50 concerned RDB issues while 100 pages addressed SQL matters. The RDB workload therefore accounts for 10% and SQL for 20% of the total workload. Knowing the requirement size, in function points, of the new project makes extrapolation simple. If the new project’s requirements represent half the function points of the previous project, then Hanakawa et al. [12]
estimate that the new design document would result in 250 pages of which 25 require
knowledge of RDB while 50 pages require SQL knowledge.
We find these latest extrapolations quite hazardous especially if the new project
had less function points precisely because no database was required, in which case
both RDB and SQL workload would amount to zero. However, Hanakawa et al.[12]
were careful in saying that the comparison should be applied to a previous yet similar
project.
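
As a rough illustration of the extrapolation described above, the sketch below recomputes Hanakawa et al.'s [12] page counts; the numbers are the ones from their example, while the class and variable names are ours.

// Workload extrapolation in the style of Hanakawa et al. [12]: the share of pages
// touching a knowledge element in an old document is scaled by the ratio of the
// new project's function points to the old project's function points.
public final class WorkloadExtrapolation {

    public static void main(String[] args) {
        double oldPages = 500;       // size of the previous design document
        double oldRdbPages = 50;     // pages concerned with RDB (10% workload)
        double oldSqlPages = 100;    // pages concerned with SQL (20% workload)
        double fpRatio = 0.5;        // new project has half the function points

        double newPages = oldPages * fpRatio;                      // 250 pages expected
        double newRdbPages = (oldRdbPages / oldPages) * newPages;  // 25 pages
        double newSqlPages = (oldSqlPages / oldPages) * newPages;  // 50 pages

        System.out.printf("new document: %.0f pages, RDB: %.0f, SQL: %.0f%n",
                newPages, newRdbPages, newSqlPages);
    }
}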
Hanakawa et al.'s [12] original model consists of three sub-models: an activity model,
a productivity model and a knowledge model. Of these only the last is relevant to our research.

3.2.2.1 Knowledge gain and its model


The knowledge model derives the gain to a developer in terms of added knowledge
by performing a given task. If θ is the level of knowledge required to perform a given
task j and bij is the level of knowledge of developer i about that task, then Hanakawa
et al. [12] present the following conclusions:
(i) There is no knowledge gain to developer i if θ < bij.
(ii) There is significant knowledge gain to developer i in performing task j if θ
is only somewhat greater than bij. If however θ is significantly larger than
bij then only little knowledge can be gained, as the task j is getting too
difficult.
Below we present equation (2) of Hanakawa et al. [12].
Lij(θ) = Wj × Kij × e^(−Eij(θ − bij)),  if bij ≤ θ
Lij(θ) = 0,                              if bij > θ
Where:
Lij(θ): Quantity of knowledge gain to developer i by executing activity j requiring a
level of knowledge θ.
Kij: Maximum knowledge gain to developer i when executing task j
bij: Developer i’s knowledge about activity j
Eij: Developer i’s downward rate of knowledge gain when executing activity j
θ: Required level of knowledge to execute activity j
Wj: Total size (amount) of activity j

3.2.2.2 Updating the knowledge level of a developer.


The above equation helps us derive the knowledge gain Lij(θ) to developer i in
performing task j requiring a knowledge level θ. At each time step t, the current
knowledge level bij(t) is augmented by Lij(θ)(t):
bij(t+1) = bij(t) + Lij(θ)(t)   (3-1)
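
Read procedurally, equations (2) and (3-1) amount to the following update loop. This is a minimal sketch; the constants are illustrative and not calibrated values from the thesis.

// Minimal sketch of the HKM knowledge-gain computation (equation (2)) and the
// per-time-step update of a developer's knowledge level (equation (3-1)).
// All constants below are illustrative, not calibrated values.
public final class KnowledgeUpdate {

    // L_ij(theta): knowledge gained by developer i when executing activity j.
    static double gain(double theta, double b, double k, double e, double w) {
        if (b > theta) {
            return 0.0;                           // developer already knows more than required
        }
        return w * k * Math.exp(-e * (theta - b));
    }

    public static void main(String[] args) {
        double b = 0.35;       // b_ij: current knowledge level of developer i on activity j
        double theta = 0.80;   // required level of knowledge for the activity
        double k = 0.05;       // K_ij: maximum knowledge gain
        double e = 2.0;        // E_ij: downward rate of knowledge gain
        double w = 1.0;        // W_j: total size of activity j

        for (int t = 0; t < 5; t++) {
            double l = gain(theta, b, k, e, w);
            b += l;                               // equation (3-1): b_ij(t+1) = b_ij(t) + L_ij(theta)(t)
            System.out.printf("t=%d  gain=%.4f  b=%.4f%n", t, l, b);
        }
    }
}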

3.2.3 Artefact Quality Model (AQM)
Quality, as its name suggests, is hard to quantify. In our attempt, we first identify a
causal relation between quality, knowledge and experience.
Knowledge provides a developer with the abstract and theoretical foundations for
accomplishing a given task. Experience enhances these foundations with a practical
perspective allowing one to gain awareness of the limits of certain theories or practices
and the crucial importance of others. It is our opinion that experience is a balancing
utility for weighing decision factors, of inherently different nature, against each other
in order to optimise decision outcomes. The EPM abstracts knowledge and experience
in the factor ability. For modelling purposes we shall rely on this abstraction, i.e. use
ability in lieu of knowledge and experience, to relate to artefact quality.
An artefact is the synthesis of several sub-activities, probably carried out by more
than one person. The size s of an artefact a, at any time, depends on the
performance of the individual(s) working on it. Similarly, its quality q depends on the
ability of the individuals contributing to it.

3.2.3.1 Artefact size


The size of an artefact is simply the sum of all contributions. We denote by cij the
individual contribution of developer i to activity j, such that:
cij = performanceij × durationij   (3-2)
The size sj of an activity j depends on the total contribution of its participant(s) i:

sj = Σ(i=1..d) cij,  i.e.  sj = Σ(i=1..d) performanceij × durationij   (3-3)

The total size of the artefact is therefore:

s = Σ(j=1..n) sj,  i.e.  s = Σ(j=1..n) Σ(i=1..d) performanceij × durationij   (3-4)

3.2.3.2 Artefact quality


As noted earlier, we relate quality to ability. An artefact being the synthesis of
possibly several activities, we can define an average quality measure qj of a
sub-activity j based on the ability of its contributors in the following terms:
qj = ( Σ(i=1..d) abilityij × cij ) ÷ sj   (3-5)
Quality being a subjective matter, it is very probable that the quality of certain
aspects, herein modelled as activities, is more important than that of others,
depending on whose perspective is being considered. We therefore introduce a
weighted-sum measure of artefact quality q.
q = ( Σ(j=1..n) wj × qj ) ÷ Σ(j=1..n) wj   (3-6)
where wj is a weight factor expressing the relative importance, to the user of the
simulation, of the quality of sub-activity j of the artefact.
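
The sketch below works through equations (3-2) to (3-6) on a tiny example; the input arrays are illustrative and not taken from any thesis data.

// Illustration of the artefact size and quality measures (equations (3-2)-(3-6)).
public final class ArtefactQuality {

    public static void main(String[] args) {
        // contribution[i][j] = performance_ij * duration_ij  (equation 3-2)
        double[][] contribution = { {4.0, 2.0}, {3.0, 5.0} }; // 2 developers, 2 sub-activities
        double[][] ability      = { {0.6, 0.5}, {0.8, 0.7} }; // ability_ij per developer and activity
        double[] weight         = { 1.0, 2.0 };               // w_j: relative importance of each sub-activity

        int developers = contribution.length;
        int activities = weight.length;

        double totalSize = 0.0, weightedQuality = 0.0, weightSum = 0.0;
        for (int j = 0; j < activities; j++) {
            double sizeJ = 0.0, qualityJ = 0.0;
            for (int i = 0; i < developers; i++) {
                sizeJ += contribution[i][j];                  // s_j: sum of contributions (3-3)
                qualityJ += ability[i][j] * contribution[i][j];
            }
            qualityJ /= sizeJ;                                // q_j (3-5)
            totalSize += sizeJ;                               // s (3-4)
            weightedQuality += weight[j] * qualityJ;
            weightSum += weight[j];
        }
        double q = weightedQuality / weightSum;               // q (3-6)
        System.out.printf("size=%.1f  quality=%.3f%n", totalSize, q);
    }
}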

3.2.4 Integrating the models
In order to provide us with a performance value, the EPM model requires as input,
among other variables, ability, task difficulty and task clarity. The HKM model, on the
other hand, provides us with a measure of knowledge adequacy relative to a required
level of knowledge, together with a model that captures the dynamic fluctuation of the former.

3.2.4.1 Ability
In the EPM model, ability is defined by measuring native intellectual capacity and
the quality of one's formal studies. This leads us to conclude that we can use the level
of knowledge as provided by the HKM model to represent ability.
Abilityi,j = bi,j (3-7)

3.2.4.2 Task difficulty


A way of representing difficulty of a task is to consider the intellectual challenge it
represents. In a sense difficulty could be perceived as the difference between actual
level of knowledge and required level of knowledge of a given task.
Difficultyi,j = θj − bij,  if bij < θj   (3-8)
Difficultyi,j = 0,          otherwise

3.2.4.3 Task clarity


The tasks carried out by a developer in our model are based on the artefacts
produced during prior phases of the process. The quality of these artefacts determines,
in our opinion, the clarity of the specified task. For example, a requirement
specification of good quality is one that is less ambiguous, more precise and more
complete than one considered to be of lower quality. In other words, a requirement
specification of good quality is one where the requirements are made clear, and hence
the analysis task that follows is based on clear input artefacts.
Clarityj = qualityartefact (3-9)

3.2.4.4 Artefact quality


In section 3.2.3.2 we presented our artefact quality model and how it is used to
derive the quality measure qj of a developer’s contribution on task j based on his or
her ability. In our integrated model, however, the current level of knowledge bi,j represents
Abilityi,j; therefore, by substituting (3-7) into equation (3-5), we obtain the integrated
equation:
qj = ( Σ(i=1..d) bi,j × ci,j ) ÷ sj   (3-10)
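
Taken together, equations (3-7) to (3-10) wire the HKM knowledge level into the EPM inputs and the quality measure. A minimal sketch of that wiring is shown below; the method names are ours and the surrounding simulator machinery is omitted.

// Sketch of the model integration (equations (3-7) to (3-10)): the HKM knowledge
// level b_ij is reused as ability, as the basis for task difficulty, and as the
// quality of a developer's contribution. Names are illustrative.
public final class ModelIntegration {

    // (3-7): ability of developer i on task j is the current knowledge level.
    static double ability(double b) {
        return b;
    }

    // (3-8): difficulty is the knowledge shortfall, never negative.
    static double difficulty(double theta, double b) {
        return (b < theta) ? (theta - b) : 0.0;
    }

    // (3-9): clarity of a task equals the quality of the input artefact.
    static double clarity(double inputArtefactQuality) {
        return inputArtefactQuality;
    }

    // (3-10): quality of sub-activity j, with knowledge levels replacing ability.
    static double quality(double[] b, double[] contribution) {
        double weighted = 0.0, size = 0.0;
        for (int i = 0; i < b.length; i++) {
            weighted += b[i] * contribution[i];
            size += contribution[i];
        }
        return weighted / size;
    }
}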

3.2.5 Developer/artefact interaction model


In each phase of a process, there are a number of key activities to carry out. The
developer applies these activities to the input artefacts, resulting in a contribution to the
output artefact. Each phase activity is defined in a phase template specifying which
milestones or steps (activities) are carried out and the type of role/competence required.
The input artefact, on the other hand, represents the actual "object" to which the
activity is applied. For example, we could have a phase named "design" that
defines the following activities A = {"SW design", "Test cases design"}, which are to
be applied to an artefact "analysis specification" that includes the items I = {GUI,
RDB, ...}. This means that our developer will apply the activity "SW design" to the
item "GUI" and then to "RDB"; thereafter he, or some tester, will apply the activity
"Test cases design" first to "GUI" and then to "RDB".
Our (simulated) project manager decides on which activity to allocate to which
agent using this template.
For an output artefact to be complete, all input artefact items need be “subjected”
to every type of activity defined in the phase (actually in its template).
For example, let us define the following activities A = { a1, a2,…, an} which are to
be applied to an artefact “analysis specification” that includes items I ={i1, i2,…, im}.
Algorithm 1 shows how the input artefact is converted to an output artefact

For each activity a ∈ A
    For each item i ∈ I
        OutputArtefact.add(contribution = a * i)
    End for
End for
Where * is a conversion operator from A × I to I.
Algorithm 1 Converting input artefact items into an output artefact contribution

However, the above algorithm does not show how developers are allocated
activities according to their role or competence. In the example above, the phase
template specifies that, for example, a tester and not a software developer should carry
out ”Test case design”. In our implementation, the manager compares the required role
type for an activity specified in a process phase template with the role(s) that an agent
represents. If the activity is not adequate the manager looks up the next activity in his
list until some adequate activity is found else “null” is returned, whereby the developer
has nothing more to do during this phase.
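
The sketch below illustrates this role-based look-up; the class and method names are assumptions made for the example and do not reproduce the framework's actual implementation.

// Illustrative sketch of the manager's role-based activity allocation described above.
import java.util.Iterator;
import java.util.List;

final class ActivityAllocator {

    static final class Activity {
        final String name;
        final String requiredRole;
        Activity(String name, String requiredRole) {
            this.name = name;
            this.requiredRole = requiredRole;
        }
    }

    // Returns the next pending activity matching one of the agent's roles, or null if none remains.
    static Activity nextActivityFor(List<String> agentRoles, List<Activity> pending) {
        for (Iterator<Activity> it = pending.iterator(); it.hasNext(); ) {
            Activity a = it.next();
            if (agentRoles.contains(a.requiredRole)) {
                it.remove();     // the activity is now allocated
                return a;
            }
        }
        return null;             // the agent has nothing more to do this phase
    }
}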

3.3 Simulation Framework


Since we were to develop two different simulators that use the same models (EPM,
HKM and AQM), we first started with a generic simulator framework that supports
both MABS and SD simulators.

3.3.1 Framework overview


Figure 2 provides a simplified overview of the simulation framework and how it
extends into MABS and SD simulators.
At the heart of the framework are a number of knowledge-related classes (such as
knowledge primitive, ability and task); these are then extended into a number of artefact
classes, which are shared with the software-process-related classes. The Individual
class holds both individual characteristics, such as achievement needs, self-esteem and
locus of control, and knowledge abilities, such as knowledge level, potential gain and
maximum difficulty, defined for each type of knowledge task known to the individual.
This class then extends into an actual developer used for MABS simulations. In the
case of SD, however, this developer is further extended into a single "average"
developer whose individual characteristics reflect the average of all developers, yet
whose actual effort is proportional to that of the team. This single average developer
deactivates and "reduces" the inherited agent to a "simple" thread that applies the
update rules defined by our system dynamics model to the EPM, HKM and AQM,
reusing the inheritance hierarchy for this purpose.

[The class diagram is rendered as an image in the original. Its principal classes are KnowledgePrimitive, KnowledgeAbility, KnowledgeTask, Agent, Individual, Developer, AverageDeveloper, Manager, Role, ArtefactPrimitive, Artefact, TaskPrimitive, PhaseTaskDescriptor, Phase, Project, Calendar and Simulator, with SD and MABS as specialisations of Simulator.]
Figure 2 A somewhat simplified UML class diagram of the simulation platform

3.3.2 Framework’s model variables manipulation


To allow for a configurable platform, definition files are used to describe the
various model variables. These files act like tables in a database; for our
experimental purposes, however, flat files were more than sufficient to calibrate the
system efficiently. Figure 3 illustrates the catalogue structure in which our variables and
definition files are organised.

resources
|--- database
| |--- project <name>
| | |--- phases.txt
| | |--- requirements specification.txt
| | |--- settings.txt
| |
| |--- developers.txt
|
|--- knowledgebase
| |--- knowledgeabilities.txt
| |--- knowledgeprimitives.txt
|
|--- process
| |--- <name>
| | |--- phases
| | | |--- <phase 1>.txt
| | | |--- <phase 2>.txt
| | | | .
| | | | .
| | | |--- <phase p>.txt
| | |
| | |--- roles.txt
| |
|--- settings
|--- config.txt

Figure 3 Overview of the various catalogues and their files used to set-up a simulation

3.3.2.1 Process definition


In our simulation platform, a software development process is defined through a
number of files contained in a catalogue named after the process in question. These
files define the various phases and the roles that exist in the process.
An example of a role file is illustrated in Table 3.
Role id Acronym Description
1 PM Project manager
2 SE Software engineer
3 TE Test engineer
Table 3 Example of a roles definition file

A phase catalogue, in turn, would look something like the example presented in
Figure 4.
phases
|--- analysis.txt
|--- design.txt
|--- implementation and unit testing.txt
|--- integration testing.txt
|--- validation testing.txt
Figure 4 Example of a phases catalogue containing individual phase definition files

3.3.2.2 Phase definition
Each phase of a process is defined in a phase file named after that phase, which
enumerates the activities that need to be carried out, and the qualification/role of the
agent performing that activity. Note that an agent may play several roles.
Number   Activity           Required knowledge level (%)   Participant(s) by role
1        Software design    80                             SE, TE
2        Test case design   85                             TE
Table 4 A simplistic example of a phase definition file pertaining to the design phase

3.3.2.3 Project definition


While the previous process and phase definition files define the process and its
phase activities in generic terms, our project catalogue describes the specific
"instantiation" of the process for a given project. It includes the project's requirement
specification, a simulation settings file (including such parameters as the start date
and clock increment frequency) and, most importantly, a detailed phases description
file, such as the example in Table 5, specifying which phases are included, in what
sequence, who participates in which phase and, finally, which termination criterion is
used to judge fulfilment of each phase.
Sequence   Phase (by name)    Participants (id:role)   Termination criterion
1          Software design    01:SE; 02:SE; 03:TE      Q=96% //quality
2          Test case design   01:TE; 03:TE             F=97% //functionality
Table 5 An example of a project's phases definition file. Among other things it specifies
         that the software design phase will complete once quality has reached at least 96%

In its current implementation the framework supports the termination criteria
specified in Table 6.
Criterion                                          Syntax                Description
Quality achieved                                   Q=X                   Terminate when the quality of the output artefact reaches X%
Completion level (functionality)                   F=X                   Terminate when X% of the output artefact is implemented
Deadline (as a specific date)                      D=YYYYMMDD(:HHMM)     Terminate on the specified date (time optional)
Deadline (as hours since start of project)         T=X                   Terminate after X hours since project start
Deadline (as hours since start of current phase)   H=X                   Terminate after X hours since start of phase
Table 6 The currently supported termination criteria, their syntax and description
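
A sketch of how such criterion strings could be interpreted is shown below; it mirrors the syntax in Table 6, but the class and method names are ours and not the framework's actual implementation.

// Illustrative parser for the termination criterion syntax of Table 6
// (Q=X, F=X, D=YYYYMMDD(:HHMM), T=X, H=X).
public final class TerminationCriterion {

    final char type;     // 'Q', 'F', 'D', 'T' or 'H'
    final String value;  // percentage, hours or date, depending on the type

    private TerminationCriterion(char type, String value) {
        this.type = type;
        this.value = value;
    }

    static TerminationCriterion parse(String spec) {
        // Strip an optional trailing comment such as "//quality".
        int comment = spec.indexOf("//");
        if (comment >= 0) {
            spec = spec.substring(0, comment);
        }
        String[] parts = spec.trim().split("=", 2);
        if (parts.length != 2 || parts[0].length() != 1) {
            throw new IllegalArgumentException("Malformed criterion: " + spec);
        }
        return new TerminationCriterion(parts[0].charAt(0), parts[1].trim());
    }

    public static void main(String[] args) {
        TerminationCriterion c = parse("Q=96%//quality");
        System.out.println(c.type + " -> " + c.value);   // prints: Q -> 96%
    }
}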

4 MULTI AGENT-BASED SIMULATION MODEL
4.1 Agents and their environment
Wooldridge [32] summarises what most AI readers know: that there is probably
more controversy than agreement over what an agent actually is. He provides, however,
a tentative definition that we choose to adhere to in so far as it suits our needs. "An
agent is a computer system that is situated in some environment, and that is capable of
autonomous action in this environment in order to meet its design objectives.” [32]. In
addition, Wooldridge [32] finds an agent-based solution to be appropriate when the
problem has one or more of the following characteristics:
- The Environment is open, highly dynamic, uncertain or complex.
- Agents provide for a natural metaphor in solving the problem.
- Data, control or expertise is distributed.
We find the above to characterise well the attributes of software development
processes, which further indicates the appropriateness of MABS for SPSM.
As to the actual agent definition, our opinion is that from a practical perspective
the exact definition does not matter. From this perspective, agents or their use is not
and should not be an objective by itself; what matters is the advantage one can take out
of the agency paradigm, its “utility” value. In our case the autonomy of agents
provides us with the correct metaphor for representing “autonomous” developers. In
our simulations, a software developer is autonomous only to a certain extent; he or she
cannot, for instance, simply refuse a task that management has decided to allocate to
him or her. He or she will, however, refuse to work during weekends and holidays. So
our agent is sufficiently autonomous, but not more, "to meet its design objectives." [32].

4.1.1 Situation (environment)


The environment in which our agents “evolve” is a “world” of artefacts with which
they interact. The development process and a time dimension structure this ”world”.

4.1.1.1 Artefacts
Artefacts are the various requirements, specifications and code that are submitted
to the developer at the beginning of each phase for him to use as a platform to develop
the artefacts of the next phase.

4.1.1.2 Process
The process defines the set of input that the developer (agent) will use, at a given
phase, and what artefacts are expected as output of this phase.

4.1.1.3 Time
Time affects both the process and the developer. As to the process, the termination
criterion for any phase can be defined in terms of time, such as a deadline, whereas the
developer's environment (artefacts and process state) changes with time as a result
of the actions of other agents too. Yet the main impact of time on the developer is in terms
of working hours. A developer is not required to work irregular hours (unless specified
of working hours. A developer is not required to work irregular hours (unless specified
by management) and neither is he or she expected to work on weekends and holidays.

4.1.2 Characteristics of the simulation environment


Based on Russell and Norvig [28], our simulation environment can be characterised
as being partially observable, stochastic, episodic and dynamic.

4.1.2.1 Partially observable
The environment is partially observable for the developer because, at any given time,
he or she is only aware of a subset of the activities to carry out, namely those he or she
has been allocated. This image of the "world" is incomplete in that the developer
cannot say whether or not the phase is completed, as he or she has little insight into the
contributions of the other agents.

4.1.2.2 Stochastic
The environment may appear to be stochastic because it is only partially
observable [28]; however, it is also stochastic because, from an agent's perspective, the
next state does not solely depend on the actions taken by that particular agent on the
current state. There are other agents acting on the environment concurrently.

4.1.2.3 Episodic
The environment is episodic because the agent's choice of an action is uniquely
dependent on the current state of the environment, irrespective of the actions of other
agents. The decision is mostly based on whether the date and time correspond to a
working day and hour, and on whether the developer has anything to do, in which case
he will carry out the remaining activity or request a new set from the project manager
instead.

4.1.2.4 Dynamic
The environment is dynamic because it changes with time: while the agent is
performing some activity or choosing the next action (deliberating), other agents may
concurrently alter the state of the environment. In other words, the output artefact is
incremented with new contributions from some agents while others are still
deliberating.

4.1.3 Autonomy
Our agent is autonomous in so far as his or her design objectives are met. For
example, the developer may refuse to work on weekends, but he or she may not refuse
to take on a task he or she has been assigned.

4.1.4 Intelligence
Intelligence, or the lack thereof, is probably the most controversial aspect in the
debate around the agency paradigm. For the purpose of our work, Brooks' [6] rejection
of symbolic AI is justified. In addition, his subsumption architecture [5] inspired our
agents' decision-making model; however, this was later abandoned for a simpler suite
of situation-action rules. Although the difference is subtle it is worth explaining.
Strictly speaking Brooks’ subsumption architecture [5] divides the decision-making
process into a hierarchy of layers of competence. Higher priority layers can subsume
the roles of lower ones and even interfere with their data flows. In addition all layers
are assumed to be running concurrently. Although this is practical for the purpose of
building robots or physical agents, it is too sophisticated for our case. Instead we
implement a simple hierarchy of “if” statements, testing conditions about the
environment to make high level decisions and delegating lower level decisions to
lower level “if” statements (embedded within). Here ends the resemblance, as our “if”
statements need not and do not run concurrently. Seen from this perspective we can
characterise our agents as simple reactive ones.

4.2 Implementation
4.2.1 Agents
Our multi-agent system is implemented as Java threads6 (the agents) interacting
with Java objects representing artefacts, performance and knowledge models.

4.2.1.1 Developer agent


In our implementation, one agent represents an individual developer (or tester).
The agent is simply a Java thread that behaves as a reactive agent, as discussed in 4.1.4.
Each such agent has a set of individual characteristics (Achievement needs, Self-
esteem and Locus of Control) as well as a set of knowledge characteristics (bij, Kij and
Eij, see 3.2.2.1) for every type of knowledge task defined in the system. The values of
these characteristics are then used to operate the two models EPM and HKM. bij,
developer i’s current knowledge level of activity j is the only model input variable that
changes during the course of the simulation.

4.2.1.2 Manager agent


The simulation uses a project manager whose purpose is to prepare a work
breakdown structure (WBS) based on the initial requirement specification of the
initial phase. In the following phases he or she converts the output artefact of the
previous phase into a set of activities to carry out in the current one. The manager is
also in charge of allocating the correct type of activity to the right competence. For
example a process may require that testers and not software developers prepare test
cases in a particular phase. In this case the manager ensures that only agents holding
the role “tester” will be allocated “prepare test case” activities. It is acceptable that the
same person and hence the same agent play the role of developer and tester, as long as
this is specified in advance of the simulation.
It was actually not necessary to include a manager. Developer agents could by
themselves have interacted directly with the artefact environment and decided for
themselves which type of activity to carry out, resulting in virtually the same outcomes.
However, we did not want to break the metaphor, as in real-world projects managers
are involved in the WBS definition and in resource and task allocation, albeit not to the
point of micro-managing the developers.

4.2.2 Decision making


Our agents implement a quite simple decision-making mechanism using
situation-action rules, as shown in Table 7.
N# Condition Action (condition is true) Else
1 If Date ∉ Holiday Evaluate condition 2 Clock out
2 If Date ∉ Weekend Evaluate condition 3 Clock out
3 If Time ∈ Regular hours Evaluate condition 4 Evaluate condition 5
4 If Time ∈ Work hours Evaluate condition 6 Evaluate condition 5
5 If on extra hours Evaluate condition 6 Clock out
6 If Time ∉ Lunch Evaluate condition 7 Clock out
7 If Not Clocked in Clock in N/A
8 If Clocked in Evaluate condition 9 Evaluate condition 7
9 If no activity allocated Request activity Perform activity
Table 7 Situation action rules describing the agent’s decision-making process. Text in
bold face represents primitive actions. Clock in and out helps the simulator in
bookkeeping a developer’s effort.
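
Expressed as code, the rule cascade of Table 7 amounts to a nested sequence of condition checks. The sketch below is a simplified rendering; the boolean parameters stand in for the framework's Calendar queries and agent state, and their names are assumptions made for this sketch.

// Simplified rendering of the situation-action rules in Table 7.
public final class DecisionRules {

    enum Action { CLOCK_OUT, CLOCK_IN, PERFORM_ACTIVITY, REQUEST_ACTIVITY }

    static Action decide(boolean holiday, boolean weekend, boolean regularHours,
                         boolean workHours, boolean onExtraHours, boolean lunch,
                         boolean clockedIn, boolean hasActivity) {
        if (holiday || weekend) {                                      // rules 1-2
            return Action.CLOCK_OUT;
        }
        boolean working = (regularHours && workHours) || onExtraHours; // rules 3-5
        if (!working || lunch) {                                       // rules 5-6
            return Action.CLOCK_OUT;
        }
        if (!clockedIn) {                                              // rules 7-8
            return Action.CLOCK_IN;
        }
        return hasActivity ? Action.PERFORM_ACTIVITY                   // rule 9
                           : Action.REQUEST_ACTIVITY;
    }
}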

6
All of our work is implemented in Java based on SE 1.5.0_08

5 SYSTEM DYNAMICS SIMULATOR
System dynamics modelling is mainly a task of defining a system in terms of
levels (stocks), rates (flows) and possible feedback loops. Given the purpose of our
model, which is to estimate required effort and output quality, we found it most
appropriate to use the artefact as our main level, with performance and quality being
two independent rates.

5.1 Model prerequisites


If we are to objectively compare SD to MABS we need to integrate our SD model
with our performance (EPM) and knowledge models (HKM) just like we did for the
MABS simulator.
The performance rate will therefore be calculated using the EPM model; however, it
will represent an average over all participants, as our SD model only supports average
developer performance and quality, in contrast to the MABS model, which is designed
around the individual developer and his or her individual characteristics. Similarly, the
quality of contributions will be based on an average of the quality abilities of all participants.
A refinement we introduce is to calculate the averages based on the actual
individuals who will participate in the actual phase, as opposed to using some
"company-wide" average. This provides for more accurate average results, even
though we could be criticised for taking an "individual-based view" here too. In our
opinion it would not be "fair" to knowingly "weaken" the predictive power of our SD
model when SD itself does not formally prevent us from providing it with "accurate"
information.

5.2 A System Dynamics Model of a development process


Figure 5 illustrates our system dynamics model of a phase of the software
development process and how its output artefact relates to that of the previous phase.

[Figure 5 is rendered as a diagram in the original. Its elements are the levels Artefact (phase p) and Artefact (phase p+1) with their respective artefact qualities and the knowledge of the participants; the flows Performance rate and Knowledge gain; and the auxiliaries Task clarity, Impact of knowledge on quality, and the knowledge and individual characteristics of all participants.]
Figure 5 A system dynamics model of a software development phase

5.2.1 Feedback dynamics structure


Phase p+1 starts with an artefact outputted from some previous phase p. This
artefact has a corresponding quality level (named artefact quality of phase p), which

determines the clarity of the tasks it implies, which in turn impacts on the performance
of the team. The initial performance is determined by the individual characteristics of
the developers. As the team performs, their knowledge of the task at hand increases
and so does the clarity of the tasks, for them, again increasing their performance. As
the knowledge of the team increases, so does the quality of their artefact p+1.

5.2.2 Levels7
In our SD model, artefact, knowledge and quality are three quantities that fluctuate
with time, and so we chose to represent these as levels. Knowledge, however, is a level
that can only increase with time. Although inactivity could negatively impact
knowledge, the HKM does not provide support for this and it has therefore been left out
of the model.

5.2.3 Flows8
There are two rates of fluctuation in our SD model that we represent as flows:
knowledge gain determined by our HKM using the knowledge characteristics of the
team and the size of the performed task at each increment; and performance which is
determined by our EPM model based on individual characteristics and task clarity.
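
In discrete form, the level-and-flow structure above boils down to two accumulation equations per time step, sketched below. The step size, variable names and rate expressions are ours; in practice the rates would be supplied by the EPM and HKM models.

// Discrete sketch of the SD level/flow structure: two levels (artefact size and
// knowledge) accumulate their respective flows (performance rate and knowledge gain)
// at each time step. The rate expressions are placeholders for the EPM and HKM models.
public final class SdStep {

    public static void main(String[] args) {
        double artefact = 0.0;     // level: completed size of the output artefact (hours of effort)
        double knowledge = 0.35;   // level: average knowledge of the participants
        double scope = 100.0;      // total size to produce
        double dt = 1.0;           // time step (one simulated hour)

        double t = 0.0;
        while (artefact < scope) {
            double performanceRate = 0.5 + 0.4 * knowledge;    // placeholder for the EPM
            double knowledgeGain = 0.001 * (1.0 - knowledge);  // placeholder for the HKM
            artefact += performanceRate * dt;                  // level += inflow
            knowledge = Math.min(1.0, knowledge + knowledgeGain * dt);
            t += dt;
        }
        System.out.printf("finished after %.0f hours, knowledge=%.3f%n", t, knowledge);
    }
}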

5.2.4 Auxiliary variables


Other important variables are modelled as auxiliaries as they affect levels and rates
based on the value of a connecting auxiliary or level, but without causing a drain in
that level.

5.2.5 Difficulties
A first problem when designing the model was to account for the individual
developers. Although much can be done at implementation, we did not find an obvious
way to account for individual developers and their individual knowledge and
performance characteristics. The actual problem resides in the static character of the
model. Since knowledge and performance are inherently individual we would like
these to be modelled as part of each participant. However, although it is possible to
model as many participants as we want, it is not possible to “instantiate” them out of a
single model, which means that participants are “hard coded” into the main SD model
and therefore neither their number nor their characteristics may change from one
simulation to another, without changing the SD model itself.
In a sense, SD can be seen to take a structural/procedural approach to modelling.
Although SD views system behaviour as dynamic, the structure of the system itself
is made static and cannot be expected to change dynamically. We think this is the
primary reason SD is better adapted to activity-based views than to individual ones.
On this particular point, comparing SD to MABS resembles comparing a
procedural programming language with an object-oriented one. Although it is possible,
in principle, to coerce the procedural language into object orientation, at least
somewhat, its underlying ideas belong to another paradigm.
Another problem was in using averages to represent the various characteristics of
the developers. The only practical averages we could provide were those constituting
IK and IC. However, average IK and IC values do not result in the average
performance of the persons in question. An alternative that is likely to solve this
problem is to instantiate all the required developers with their respective IK and IC
values, query our EPM model for the performance and our HKM for the knowledge
levels of each developer, derive the average of these, pass them over to SD, update the
abilities of each developer, query … average out … pass over … update each developer.
This would be possible were it not for one problem: the paradigm! By relying on a multitude of

7
In some references these are termed Stocks
8
Also known as Rates

developers for which we query individual characteristics and then update each developer
individually, we are actually running a MABS simulation and not SD.

5.3 Implementation
To provide an objective comparison ground and ensure an exactly similar use of the
models and of the various necessary utilities, such as calendars and clock functions, we
implemented the system dynamics model as a single "average" developer, running on
our simulator framework, whose individual characteristics represent an average of all
participants within a phase and whose performance is proportional to the number of
developers considered for the simulation.
These characteristics, being averages, are no longer individual; instead they provide
the variables and constants that our SD model needs (directly or derived).
A reason why this platform "reuse", provided by our simulator framework, is
possible is that, although it could be recommended, there need not be a one-to-one
mapping between a model and its implementation if there is another possible
implementation that remains true to the model. What is important in our case is that
the single-average-developer implementation preserves all the characteristics of our SD
model. Indeed this is true because, in our simulation platform, artefacts accumulate
contributions from the developer(s), which are quantified and qualified. And as
contributions are added, the knowledge gains of the developer(s) are updated in separate
levels as defined in our SD model. This results in an integral over contributions,
which is what our SD levels mathematically represent.
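
The construction of the single average developer can be summarised as in the sketch below: individual characteristics are averaged over the phase participants, while the applied effort is scaled by the team size. The field names follow the Individual class; the helper itself is ours and purely illustrative.

// Sketch of how the SD simulator's single "average" developer is parameterised.
public final class AverageDeveloperFactory {

    static final class Characteristics {
        double achievementNeeds, selfEsteem, locusOfControl, knowledgeLevel;
    }

    static Characteristics average(Characteristics[] participants) {
        Characteristics avg = new Characteristics();
        for (Characteristics c : participants) {
            avg.achievementNeeds += c.achievementNeeds;
            avg.selfEsteem += c.selfEsteem;
            avg.locusOfControl += c.locusOfControl;
            avg.knowledgeLevel += c.knowledgeLevel;
        }
        int n = participants.length;
        avg.achievementNeeds /= n;
        avg.selfEsteem /= n;
        avg.locusOfControl /= n;
        avg.knowledgeLevel /= n;
        return avg;
    }

    // The average developer's effort per time step is proportional to the team size.
    static double teamEffort(double averageIndividualEffort, int teamSize) {
        return averageIndividualEffort * teamSize;
    }
}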

6 VERIFICATION AND VALIDATION
Verification is the process of ensuring that the problem is correctly solved, while
validation is the process of ensuring that the right problem is being solved. In our case
the former amounts to controlling the correctness of the implementation, while the
latter concerns the correctness of our model and how "true" it is to the reality it
attempts to abstract.

6.1 Verification
A number of steps were taken to ensure the correctness of the code. Such activities
do not lend themselves well to documentation, so we can only mention those we used.
Among these were trace analysis, input/output testing, code debugging and
calculation verification, as mentioned in [33]. In a complex system, such verification
techniques cannot be exhaustive, and therefore, despite all of these precautions, the
corrections they involved and the errors they uncovered, it is not possible to guarantee
the correctness of the implementation. At best, we have only minimised the number of
defects in the code and the risk of errors occurring because of these.

6.2 Validation
Establishing the validity of the model, however, is probably the most difficult
aspect of simulation modelling, and in all likelihood full validity cannot be established.
As in the case of verification, the validation methods available can only help us
improve our confidence in the model (by reducing the risk of error).
Xian et al. [33] describe a number of model validation techniques such as: face
validity, tracing, internal validity, historical data, sensitivity analysis and predictive
validation. Another technique they mention, model-to-model comparison ("also
known as back-to-back testing or docking") [33], is quite practical when another
"validated" model (or implementation) of the same abstracted reality exists. In this
case quite large automated testing scenarios can be run to compare the output of both
based on the same input.
However, not all of these methods can be applied at all times. For example, it is not
always possible to get hold of historical data, for practical, confidentiality or other
reasons, and it is not always the case that there exists an equivalent, established model
to compare with. We therefore present the methods we used to improve our confidence
in the model.

6.2.1 Face validity9


Face validity normally involves presenting the model’s output to domain experts,
either in the form of an animation or statistics presented in graphs and tables. These
experts then comment on the reasonableness of the output [33], providing a qualitative,
yet subjective, opinion. Based on our experience we tried ourselves to verify the face
validity of the model in the more trivial cases. After a number of adjustments we were
satisfied with the output of the simulations.

6.2.2 Internal validity


Our agent-based implementation in Java (SE 1.5.0_08) uses threads to provide
runtime “autonomy” to the individual agent. Since these threads sleep and wakeup in a
“soft” real-time fashion we needed to perform internal validity [33] tests to ensure that
the ensuing stochastic behaviour did not introduce significant variance in the
projections. We therefore analysed the results of several runs of the same simulations

9
For a more detailed discussion see A.1 & A.2 in Appendix A

and adjusted our implementation until no significant variance between the runs could
be observed.

6.2.3 Tracing
According to [33], tracing is a technique similar to animation, during which the
behaviour of the individual entities and variables of the model is monitored closely
and followed to ensure the correctness of the logic. We used the two for this very
purpose. Tracing and animation, however, are dynamic activities by nature; it is
therefore difficult to document their unfolding in this "static" document.

6.2.4 Model-to-Model validation10


As mentioned previously it can be very useful to have an existing and validated
model or simulator to compare the “new” one to. This validation approach is termed
Model-to-Model validation [33]. In our case Hanakawa et al. [13] documented the
outcome of three test cases (1-1, 1-2 and 1-3) using their model. Since our platform
could project the same type of output variables as in [13], i.e. duration and knowledge
gains, we proceeded with a series of simulations based on input data as described in
[13] and then compared the results. In two of the three cases, 1-1 and 1-3, our simulator
obtained results very close to theirs; in case 1-2, however, a very large difference was
observed. We therefore analysed a number of related documents by Hanakawa et al.
[12], [13] and [14], as well as our own models, to understand the reason for the
discrepancy. From this analysis we found a number of shortcomings in their
publications, which we document in our discussion section. Therefore no changes were
made to accommodate case 1-2.

6.2.5 Predictive validation


Predictive validation is to test something we know about the system and see if the
model can predict it correctly. We used this technique to compare the MABS and SD
simulators. In this case, for example, we know that there should be no significant
difference in prediction between them when simulating a single developer. The reason
is that the average of the characteristics of a single individual, as used by SD, is
exactly the same as the individual's original characteristic values, as used by MABS.
Variable                       MABS M    MABS s    SD M     SD s     z-value   Significant (p<=5%)
Duration (Hours)               269.41    126.89    274.74   131.06   -0.41     False
Performance (Effort/Hour, %)   13.75     2.25      13.57    2.76     0.70      False
Cost (K-SEK)                   34.12     16.45     34.04    16.09    0.07      False
Quality (%)                    35.97     0.44      35.98    0.44     -0.26     False
Table 8 Result of the statistical analysis of 200 simulation pairs of MABS and SD based
         on random project scopes, in the range 10 to 60 hours
Table 8 summarises the results of 200 MABS and SD simulations where only the
scope of the project varied, randomly between 10 and 60 hours. From this table we see
that no significant difference between MABS and SD exists, not even at p = 5%, which
confirms the correctness of the model on this point too.

6.2.6 Preliminary validity conclusions


Given the verification and validation steps we documented above, and the
corrections and adjustments they inspired, and given the initial face validity testing
results we are satisfied with the overall validity of the model so far. However, our
subjective satisfaction is no guarantee that the model is not flawed; it only says that, to
the best of our efforts, it seems as if the model behaves in accordance with our
understanding of the real-world problem. This also suggests that there is room for
improvement on our model by comparing its projections to a number of real world
case studies, which could not be completed within the scope of this work.

10
For a more detailed discussion see A.3 in Appendix A

7 COMPARING MABS TO SD
Once the simulation platform was in place and its preliminary validity was
established, we wanted to study the differences, if any, between MABS and SD.

7.1 Comparing outcomes


Here we wanted to identify whether any significant difference in projections existed
between MABS and SD simulations. We already know that there is none for a single-
developer project; therefore the following experiment focuses on a multi-developer
software project.

7.1.1 Experiment overview


To ensure the reliability of the results of the coming statistical analysis we decided
to run 1000 pairs of simulations. Each pair consists of one SD run and one MABS run,
in this order. Each time a simulation is finished projections are collected for the
variables: duration, performance, cost and quality.

7.1.1.1 Project description


The project's scope, i.e. the predicted effort described in the requirement
specification, is the only input variable that changes for each simulation pair. In each
pair the scope had a single value for both MABS and SD. This value was chosen at
random in the range 100 to 1000 hours.

7.1.1.2 Project participants


As mentioned earlier, we already know the outcome of a single developer project.
Here, we wanted to compare MABS and SD on a multi-developer project. The
advantage of this set-up is that it allows both MABS and SD to demonstrate their
characteristics fully. For SD, the fact that several participants are involved lets us see
how its activity-based view and use of averages impact on the result. For MABS, the
use of several developers, allows it to work in a true multi-agent environment, where it
is expected to make full use of its individual-based view.
For this purpose, five developers are engaged in all the simulations. The
characteristics of these participants are summarised in Table 9.
Developer   Achievement needs (%)   Self-esteem (%)   Locus of Control (%)   bij (%)   Kij   Eij (%)
1           70                      60                50                     35        1     60.0
2           65                      50                55                     60        4     44.8
3           70                      60                50                     60        5     45.0
4           50                      50                50                     30        1     60.0
5           65                      60                55                     60        5     45.0
Table 9 Individual (IC) and knowledge (KC) characteristics of the five participants

7.1.2 Analysis
7.1.2.1 Graphical analysis

Figure 6 The progress of 1000 different MABS (in red) and SD (in blue) simulation runs
as a function of duration.

Figure 6 illustrates the output of the simulators in terms of duration, after 1000
MABS and SD runs. The Y-axis represents the effort completed or size of artefact (in
hours), while the X-axis represents the time spent (in hours). In this figure, red curves
represent MABS simulations, while the blue ones represent SD simulations. It is worth
mentioning that for each random scope value, the SD simulation runs first, followed by
MABS. Therefore, when a MABS point is plotted on an already existing SD
point, that point is likely to turn red, which explains the predominantly red colour of
most points and curves in the figure.
Although there are blue and red curves on both sides of a virtually ascending
diagonal, one can see how a blue trend seems to dominate the upper left side, while a
red one dominates its lower right side. This seems to suggest that SD projects shorter
durations than does MABS. The following statistical analysis will help us evaluate the
significance of this observation.

7.1.2.2 Statistical analysis


We base the following analysis on the assumption that the output variables
considered are normally distributed with regard to the input. Based on this assumption
we designed our experiment around two samples: MABS and SD. For each sample we
determined the statistical parameters mean (M) and standard deviation (s), and then
performed a z-analysis to see if any significant difference exists at both the p <= 5% and
p <= 1% significance levels.
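
For reference, a two-sample z-analysis of this kind can be computed as sketched below; the formula is the standard two-sample z statistic, and the numbers passed in main are the duration figures from Table 10.

// Two-sample z statistic: z = (M1 - M2) / sqrt(s1^2/n1 + s2^2/n2).
// The sample values in main are the duration means and standard deviations from Table 10.
public final class ZAnalysis {

    static double z(double m1, double s1, int n1, double m2, double s2, int n2) {
        return (m1 - m2) / Math.sqrt((s1 * s1) / n1 + (s2 * s2) / n2);
    }

    public static void main(String[] args) {
        double zDuration = z(772.24, 343.41, 1000, 731.41, 325.91, 1000);
        // |z| > 2.58 corresponds to significance at the two-sided p <= 1% level.
        System.out.printf("z = %.2f, significant at 1%%: %b%n",
                zDuration, Math.abs(zDuration) > 2.58);
    }
}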
Table 10 summarises the results of our statistical analysis. From its z-values it is
quite clear that the difference between the MABS and SD samples is significant even
at the p <= 1% significance level.

Variable                       MABS M    MABS s    SD M     SD s     z-value   Significant (p<=1%)
Duration (Hours)               772.24    343.41    731.41   325.91   2.73      True
Performance (Effort/Hour, %)   72.98     8.66      77.04    6.26     -12.01    True
Cost (K-SEK)                   519.20    240.99    478.08   217.54   4.01      True
Quality (%)                    59.82     5.35      61.49    6.82     -6.09     True
Table 10 Result of the statistical analysis of 1000 simulation pairs of MABS and SD runs
          with varying project scopes drawn at random in the range 100 to 1000 hours

7.1.2.3 Duration
With regards to duration, MABS presents a mean M =772.24 hours for this five
developer-strong project which is greater than SD’s M = 731.41 hours. This suggests
that either MABS overestimates or SD underestimates the duration of the project.
A good explanation for this is that a project, with a quite small team, terminates
only once the slowest developer is done. This is quite true in real-life too.
Given MABS’s individual-based view, it easily reproduces this phenomenon in its
projection, while SD is unable to account for it. The reason being that for SD, there is
no such thing as the slowest developer, or the fastest for that matter. All developers are
averaged out to become equally productive. So the slowest is as productive as the
average that in turn is as productive as the fastest developer. Therefore, we are inclined
to believe that SD underestimates the true duration.

7.1.2.4 Performance
For performance however, the opposite happens, i.e. it suggests that either MABS
underestimates performance (at M=72.98 for the cumulated effort of the five
developers) or SD overestimates that team’s performance (at M = 77.04). This is
consistent with the previous explanation, as performance is inversely proportional to
duration (i.e. the higher the performance, the lower the duration).

7.1.2.5 Cost
MABS presents a mean cost M_MABS greater than M_SD. Again we believe that
it is SD that underestimates this number and not MABS that overestimates it, as
the former cannot account for the extra delay induced by the individual
characteristics of the slowest developer while the latter can. This explains why
MABS presents us with a significantly higher cost than SD.

7.1.2.6 Quality
The last output variable is quality. As we explain in our AQM model, we consider
the quality of the output artefact to be based on the knowledge level of the various
contributors at the moment they delivered their contributions. This means that earlier
contributions lower the quality of the artefact while later ones raise this value,
mainly due to the knowledge gain achieved during the process. MABS takes an
individual approach in calculating these gains, as each developer improves
independently of the others, while SD computes an average gain for the entire team.
From Table 10 we see that MABS predicts a lower quality than SD. Note that this
outcome is independent of the three prior ones. However, given that MABS uses a
more realistic approach, calculating the knowledge gain of each developer separately,
while SD only approximates this value, we are more inclined to rely on the
MABS result for this variable.

7.2 Comparing modelling issues
7.2.1 Model elicitation
Typically, in an SPSM context, one first starts by eliciting a model of the software
process to abstract. This task can be quite tedious, as it requires unveiling not only the
apparent structure of the "reality" but also its less "intuitive" and interdependent
feedback loops.
When eliciting a model that will later on be used for MABS, there are many
possible approaches that one can choose from. Quite formal methodologies such as
MAS-CommonKADS can be applied, as in Henesey et al. [15], or when appropriate
less formal ones can be used too. However, this freedom of approach can be perceived
as a lack of methodology.
In the case of SD, however, the modeller is not as free, as SD is quite strongly
built around the concepts of levels and flows, to the point where, as we experienced it,
one starts to look mainly for the flows and levels exhibited by the system, maybe at the
expense of other crucial aspects of the model. Although a "free" model prepared for
MABS usage and an SD one may capture the same reality, it is felt that in SD model
elicitation and notation are so intertwined that the modeller barely discriminates
between them.
We are encouraged in this opinion by examining Abdel-Hamid's [1] model
development chapter. Here he explains how he first started out his modelling effort by
carrying out a series of interviews that led to an initial SD model.
The information collected in this phase, complimented with our own software
development experience, were the basis for formulating a "skeleton" system dynamics
model of software project management. [1]
It is interesting to note that he mentions a system dynamics model of software
project management and not just a model of software project management. A system
dynamics model, in our opinion, is not a declarative one, i.e. it does not just state
("declare") relational facts (the what); it goes much further in expressing the
proportions and equations governing these relations (the how). Therefore we feel that
SD abstracts away some modelling phases or activities. In other words, SD goes "too fast"
from domain problem formulation to "seeing" levels and flows all over the place.

7.2.2 Model configuration and initialisation


Once the model is in place, and after its simulator is implemented, the model needs
to be configured and its variables initialised. In our case for example we needed to
define the development process and the initial values of its various IC and IK
variables.
For an SD simulation, only group averages need be provided for the various actors
(and their variables). Although this is not always the case, such information may exist
in some companies (with sufficient resources and process maturity) in the form of
average performance measures derived from project histories.
For personal integrity reasons, however, one is less likely to find available the
individual performance measures required for MABS. Even if no hindrances exist to
collecting such information, it remains a tedious task, as it involves individually
interviewing or testing the persons to be simulated. Appendix B, which presents three
of the many questionnaires needed to configure our various models, gives an indication
of the effort required. The problem worsens when the number of persons to simulate is
large, which imposes an upper limit. From this perspective, MABS is at a disadvantage
when compared to SD.

8 DISCUSSION
8.1 Results
8.1.1 Modelling the individual-based view
Based on our literature review in section 2, Related Work, we identified, analysed
and integrated an individual model of performance (EPM), an individual cognitive
model (HKM) and an artefact quality model (AQM) that we derived ourselves, resulting
in an integrated model that made it possible to design a simulation framework to compare
MABS and SD on an equal basis.
After verification and validation, which provided us with relative confidence in
our integrated model, we can say that we now have an individual-based model that
accounts for what we believe to be the most important individual factors for a software
developer, namely effort/performance abilities, experience (knowledge gain) and the
resulting quality of the developer's contributions.

8.1.2 Comparing MABS to SD


8.1.2.1 Predictive power
From the comparison of MABS and SD presented in section 7, and its analysis, we
conclude that, when compared to MABS, SD underestimates duration and cost while it
overestimates performance and quality.
Not knowing which of the models provides the best prediction, we can at least say
that MABS minimises the risk around the project, which is good in many cases where
forecasting the evolution of a project is difficult (quite characteristic of software
development projects).
The downside of this risk-aversion characteristic of MABS, when compared to SD,
is that if MABS overestimates the risk while SD is more accurate in its estimate, then it is
not impossible that projects for which SD presented an acceptable outcome would
appear too expensive or too lengthy when simulated with MABS, and therefore could
be wrongfully abandoned.
However, given the approximate nature of SD’s calculations (reliance on averages)
when compared to MABS’s more accurate and realistic methods, we are inclined to
rely on MABS when this choice is made possible.

8.1.2.2 Modelling issues


From a modelling perspective we found that SD influences the modeller during
model elicitation, unlike MABS; however, SD requires less data collection effort, as it
can make do with averages, which are likely to be more available than individual measures.

8.1.3 Lessons learned from MABS and SD modelling


Having had to derive models for MABS and SD, as well as implement their
respective simulators atop our simulation framework, presented us with a number of
insights into the difficulties and possibilities of each.
The main difficulties encountered with SD were already discussed in section 5.2.5.
The first difficulty was deriving adequate average values for SD, which proved to
be non-trivial.
Yet most importantly we believe SD needs modernisation to allow for the
instantiation of sub-models, such as that of a developer. In our opinion this
“improvement” would make SD more MABS-like. We see no problem with this in
that, as we believe, both share a common philosophy of investigating complexity and
its emerging behaviours. Additionally, both acknowledge in one way or another the
existence of feedback; MABS reproduces the feedback structure of a system, even
though the feedback need not be explicitly present in its model.
For SD, such a modernisation, in our opinion, requires little more than an update to its
notation to allow for a dynamic "class diagram" instead of just a static "object
diagram", as is the case currently. However, even though the notation is just a technicality
to be resolved, we expect it to be quite difficult to manage "politically" within a well-
established research community such as that of SD.

8.1.4 The cost of MABS


Although we believe MABS to be quite accurate with regard to SD, it remains an
expensive technology. From a modelling perspective MABS requires a lot more time,
as individual characteristics need to be elicited "individually". The larger the number of
participants, the longer and harder it becomes to configure and initialise the model.
From an implementation and simulation perspective, MABS can be highly CPU-
intensive, especially if each agent is represented as a thread. This problem becomes
quite evident once the number of agents is large enough. From this perspective, SD
can run any simulation on a single thread, regardless of the number of participants.
MABS, on the other hand, has an upper limit due to the number of threads it can run,
unless implemented differently.
Therefore, starting from a given threshold (number of agents), MABS starts to
become less appealing than SD, and the modeller will need to weigh MABS's accuracy
against its cost and decide, based on his or her simulation requirements, whether to use
the one or the other.

8.2 Shortcomings
8.2.1 The EPM model and its Locus of Control scale11
Our work heavily relies on Rasch and Tosi’s [25] empirical effort performance
model, which in turn integrates a number of theoretical models. Besides being
relatively old, the model relies on theoretical instruments that have been challenged in
a number of publications. Rasch and Tosi [25] measure Locus of Control (LoC) using
Rotter’s 15-item Internal-External (I-E) scale [26]. Although Rotter’s 1966 publication
[26] is considered a cornerstone on the matter of LoC, the scale he developed has since
been criticised [30], [18], [9], mainly for being mono-dimensional and therefore
excluding important nuances. Rotter did, however, reply to some of his critics in a
journal publication [27]. Yet another critical issue is the susceptibility of the scale's
questionnaire to social desirability bias [9], [17]. The latter problem is in fact quite
apparent from just reading the questions, which significantly reduces the reliability of
the collected responses.

8.2.2 The Knowledge Model


8.2.2.1 Modelling issues
Lij(θ) = Wj × Kij × e^(−Eij(θ − bij)),  if bij ≤ θ      Equation (2) of Hanakawa et al. [12]
Lij(θ) = 0,                              if bij > θ
A problem we see with the HKM model and how it calculates the knowledge gain
Lij(θ) is that it considers Wj to represent the size of the entire activity, i.e. the
amount of work performed so far does not matter; bij is therefore the only
variable that changes with time. A more "intuitive" use of the model would be
to replace Wj by the amount of activity performed since the last update; this

11
See B.1 in Appendix B for the Locus of Control questionnaire

would reflect the “experience” of the developer with the task and provide a
dynamic update of the knowledge levels.

8.2.2.2 Time scale issues


Another problem with Hanakawa et al. [12], [13] and [14] is that it remains
unclear whether their results, given in months or hours, refer to calendar time or to
business time. Our simulations were compared to theirs on the assumption that a
month represents on average 172 (work) hours and not 720 hours!
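The choice matters: a reported duration of 6.71 months (case 1-1 in Appendix A.3.1), for
example, corresponds to 6.71 * 172 ≈ 1154 hours under the work-hour interpretation, but to
6.71 * 720 ≈ 4831 hours under a calendar interpretation, a more than four-fold difference.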

8.2.3 Relating requirement scope to effort


We experienced difficulty in quantifying the scope, or size, of a given requirement
as input to the simulator. The main problem is that, since the simulator is to present
time-based projections, a conversion or relation needs to exist between a given scope
measure and the time it takes to deliver that much scope. For example, a task of size X
(in some scale) represents an effort of Y hours. To identify Y, a conversion function F
from scope to time needs to exist so that Y = F(X). Our difficulty was to identify the
scale in question, because the appropriate scale must also be convertible into hours.
One alternative was to use Function Point Analysis (FPA) to represent such scope.
FPA is an implementation independent method of quantifying scope and complexity of
a software system in terms of the services (functions) it provides the user [21]. It is
known to provide a more objective and user-driven quantification of scope than, for
instance, the Lines of Code (LOC or KLOC) metric [19]. Another advantage of FPA is
that there are studies providing formulas for converting function points (FPs) into
effort hours. Matson et al. [21], for instance, present a cost estimation model using
function points based on a case study covering 104 software development projects.
One of the formulas they propose, based on their empirical study, relates FP to effort
in the following way:
E = 585.7 + 15.12 * FP. (8-1)
where E represents effort in work hours and FP the function point measure of the
task or project.
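As a minimal illustration of how formula (8-1) would be applied (the function name and the
sample FP count are ours, not Matson et al.'s):

    def effort_hours_from_fp(fp):
        """Effort in work hours from a function point count, formula (8-1) [21]."""
        return 585.7 + 15.12 * fp

    print(effort_hours_from_fp(200))   # a hypothetical 200-FP task -> 3609.7 work hours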
Indeed, we could have solved our problem by requesting scope information
expressed in FP and then using the above formula to convert the scope into effort. An
objection, however, to the use of FP in our case is the fact that FPA was designed for
quantifying information system projects or business-related applications and is not
appropriate for technical or scientific applications [21]. It is not even feasible to
"force" its use in a scientific project, as it counts software components, such as inputs,
outputs, master files, interfaces and enquiries, that are largely irrelevant for that
purpose; the count of each such component is then multiplied by a factor based on the
component's complexity. Yet another difficulty is the fact that FPA is usually performed
in two steps. The first is a simple weighted sum of the components and their respective
complexities, resulting in unadjusted function points (UFP). The final FP count is then
derived by multiplying the UFP by an adjustment factor
determined by examining 14 organisation- and process-specific factors.
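For completeness, the two-step procedure could be sketched as follows; note that the
0.65 + 0.01 * (sum of the 14 ratings) adjustment is the commonly cited IFPUG form and is
not taken from [21], so it is shown here purely as an illustrative assumption.

    def adjusted_fp(ufp, gsc_ratings):
        """Step 2 of FPA: scale the unadjusted function points (UFP) by a value
        adjustment factor derived from the 14 general system characteristics,
        each rated 0-5. The 0.65 + 0.01 * sum(...) form is assumed here."""
        assert len(gsc_ratings) == 14
        vaf = 0.65 + 0.01 * sum(gsc_ratings)
        return ufp * vaf

    fp = adjusted_fp(ufp=180, gsc_ratings=[3] * 14)   # VAF = 1.07 -> 192.6 FP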
Our choice was to "skip" the problem altogether by requiring that scope be
provided in hours by the user, due to (i) the difficulty of finding an adequate
time-convertible scope metric and (ii) the fact that, for those cases where FPA is
applicable, there already exist tools [22] that assist the user in performing the FP count
and provide effort estimates in hours, which can then be fed to our simulator. For all
other cases a qualified "guess" is unfortunately expected, which we believe is within
the competence of a seasoned project manager. We acknowledge, however, that this
situation reduces the usefulness of our simulator. Then again, the purpose of this work
is not to propose a commercial product as much as it is to compare MABS to SD.

9 CONCLUSIONS
Based on our background review of the literature, we concluded that there were
grounds to claim that Multi Agent-Based Simulation is an appropriate approach for the
Software Process Simulation Modelling field, probably even better suited than System
Dynamics, a well-established methodology in this field. However, as we indicated
earlier, we found no evidence for these claims in prior research.
The purpose of our thesis was to establish such evidence of the appropriateness of
MABS for the SPSM field, and to provide a first indication of how MABS is more
appropriate than SD for certain SPSM applications.

9.1 Summary of Results


9.1.1 Accomplishments
In order to fulfil our purpose and answer our research questions, the following
objectives were completed:
- An elaborate software development process model integrating individual
performance, cognition and an artefact quality model was derived, verified
and, to a certain extent, validated.
- A simulation framework was designed to allow a MABS simulator and an
SD simulator to be compared on equal grounds.
- A comparison of MABS and SD was performed, and a discussion of the
advantages and weaknesses of each was presented.
- Finally, a discussion of the various shortcomings we encountered, both in
the models we adopted and in the ones we developed, including
implementation issues, was furnished.

9.1.2 Contributions
The main contribution of our thesis is to have provided evidence that (i) MABS is
feasible for SPSM and (ii) MABS is appropriate for SPSM, in that it lends itself quite
neatly to the metaphors of the individual-based view of SPSM.
The next contribution is having compared MABS to SD and uncovered how the
former tends to be risk averse compared to the latter. Our experiment shows that,
compared to MABS, SD tends to underestimate duration and cost while overestimating
both performance and quality.
From a modelling perspective, we showed how MABS is friendlier to model
elicitation, while SD heavily influences the modeller through its concepts and notation.
On the other hand, SD requires less data collection effort for the initialisation of its
model variables, which puts it at an advantage over MABS, which requires an
individual-based data collection effort.

9.1.3 Lessons learned


From our research and comparison of MABS and SD, we realised that the accuracy
of MABS comes at an expense, for example when agents are implemented as threads,
which puts an upper limit on its utility relative to SD unless other technologies are
used to make agent autonomy less CPU intensive.
Additionally, we found that although MABS and SD belong to different paradigms,
they share a common philosophy with regards to system complexity, emergent
behaviours and feedback mechanisms. The differences lie more at the tool and
implementation levels of these paradigms.
Finally, based on this lesson, we suggest that SD could do with improvements to
its notation, allowing for dynamic model instantiation and, in turn, dynamic changes
to the system's structure (in SD jargon) at runtime.

9.2 Future works
9.2.1 Improvements
A first direction for future work can be found in the shortcomings discussed in
8.2. A major improvement would be to define how a requirement specification could
give an indication of the estimated effort, without the user needing to provide it. It
could be that the entire model needs review to eliminate the need for time estimates
altogether, but there is the risk that the simulator would then no longer be able to
provide durations expressed on a time scale.

9.2.2 Experimentation
A possible future application would be to complete the simulator in order to see
whether it can reproduce well-established patterns of software engineering projects.
This would improve the validity of the model as well as its experimentation features.
Another experimentation possibility would be to reproduce Abdel-Hamid's [1]
case studies and experimentation scenarios, originally simulated using SD, this time
using our MABS simulator, to further investigate the differences, similarities,
advantages and disadvantages of both.

9.2.3 Application
Carrying out a number of case studies would give a better indication of the
usefulness and shortcomings of our simulation environment. These studies would,
among other things, have to collect individual-related data and probably calibrate the
simulator while comparing its output with either historical data or an ongoing effort
that could be simulated.

9.2.4 MAS and SPSM


The agents used in our simulator are quite "simple" in that their intelligence is
restricted to reacting to their environment. From a Multi Agent Systems perspective,
it could be more interesting to experiment with agents endowed with higher levels of
"intelligence" and perhaps with communication abilities.
There is in fact a dual benefit here for both MAS and SPSM, in that the latter, for
example, can benefit from an extension of our model to include developer
communication and collaboration, as is the case in the real world.

9.2.5 Optimisation features


The simulator framework currently only provides a projection of duration,
performance, cost and quality, based on a requirement specification and a team of
developers as input. However, it could be developed further to present an optimised
team-effort allocation scheme so as to improve duration, performance, cost or quality.
For this purpose, one could either use a pure agency-paradigm approach to
optimise the solution, using concepts such as ContractNet [32] or other competitive
and collaborative schemes, or resort to operational research methods or AI algorithms
to search the space of possible parameter combinations, using the simulator as a sort
of "utility function" that provides feedback to the algorithm as required.
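A rough sketch of the second option is given below; run_simulation is a placeholder for our
simulator (here replaced by a toy projection), and the parameterisation of an "allocation" is
purely illustrative.

    import itertools

    def run_simulation(allocation):
        """Placeholder for the simulator used as a 'utility function'. A real
        implementation would run a full MABS simulation for the allocation and
        return, e.g., the projected duration in hours."""
        team_size, avg_performance = allocation
        return 100.0 / (team_size * avg_performance)   # toy projection

    def best_allocation(team_sizes, performances):
        """Exhaustive search over parameter combinations; a heuristic search or an
        agent-level scheme such as ContractNet [32] could replace the enumeration
        when the space of allocations becomes too large."""
        candidates = itertools.product(team_sizes, performances)
        return min(candidates, key=run_simulation)

    print(best_allocation(team_sizes=[1, 2, 3], performances=[0.75, 0.95]))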

REFERENCES
[1] Abdel-Hamid T.: The Dynamics of Software Development Project Management: An
Integrative System Dynamics Perspective. PhD diss., MIT, 1984.

[2] Abdel-Hamid T.: The Dynamics of Software Project Staffing: A System Dynamics Based
Simulation Approach, IEEE Transactions on Software Engineering 15(2), pp109-120,
1989.

[3] Abdel-Hamid, T. and S. Madnick: Software Project Dynamics. Englewood Cliffs, NJ,
Prentice Hall, 1991.

[4] Avison D. E., Fitzgerald G.: Information Systems Development: Methodologies,
Techniques and Tools. 2nd Edition, McGraw-Hill International (UK), London, 1995.

[5] Brooks, R. A.: A Robust Layered Control System For A Mobile Robot. IEEE Journal of
Robotics and Automation, pp 14–23, 1986.

[6] Brooks, R. A.: Intelligence without reason. In Proceedings of the 12th International Joint
Conference on Artificial Intelligence (IJCAI-91), Sydney, Australia, pp. 569-595, 1991.

[7] Burke S.: Radical Improvements Require Radical Actions: Simulating a High Maturity
Software Organization. Technical Report, CMU/SEI-96-TR-024 ESC-TR-96-024,
Carnegie Mellon University, Pittsburgh, Pennsylvania US 1997.

[8] Christie A.M. and Staley J.M.: Organizational and Social Simulation of a Software
Requirements Development Process. Proceedings of the Software Process Simulation
Modeling Workshop (ProSim 99), Silver Falls, Oregon, 1999.

[9] Duttweiler, P.C.: The Internal Control Index: A Newly Developed Measure of Locus of
Control. Educational and Psychological Measurement, 44, pp 209-221, 1984.

[10] Forrester J.: System Dynamics and the Lessons of 35 Years. In K. B. D. Greene, editor,
Systems-Based Approach to Policymaking. Kluwer Academic Publishers, 1993.

[11] Glickman S. and J. Kopcho: Bellcore's Experiences Using Abdel-Hamid's Systems
Dynamics Model. 1995 COCOMO Conference, Pittsburgh, PA, Software Engineering
Institute, Carnegie Mellon University, 1995.

[12] Hanakawa N., Matsumoto K. and Torii K.: A Knowledge-Based Software Process
Simulation Model. Annals of Software Engineering 14, pp383–406, 2002.

[13] Hanakawa, N., Matsumoto K. and Torii K.: Application of Learning Curve Based
Simulation Model for Software Development to Industry. In Proceedings of the 11th
International Conference on Software Engineering and Knowledge, Kaiserslautern,
Germany, World Scientific Publishing, pp. 283–289, 1999.

[14] Hanakawa, N., Morisaki S., and Matsumoto K.: A Learning Curve Based Simulation
Model for Software Development. In Proceedings of the 20th International Conference
on Software Engineering, Kyoto, Japan, IEEE Computer Society Press, pp. 350–359, 1998.

[15] Henesey L., Notteboom T., and Davidsson P.: Agent-based simulation of stakeholders
relations: An approach to sustainable port and terminal management. In Proceedings of
the International Association of Maritime Economists Annual Conference, Busan,
KOREA, 2003.

[16] Kellner, M.I. Madachy, R.J. and Raffo, D.M.: Software process simulation modeling:
Why? What? How? Journal of Systems and Software 46(2-3), pp91–105, 1999.

[17] Kestenbaum J.M.: Social Desirability Scale Values of Locus of Control Scale Items.
Journal of Personality assessment, 40(3), pp 306-309, 1976.

[18] Levenson, H.: Multidimensional locus of control in psychiatric patients. Journal of
Consulting and Clinical Psychology, 41, pp 397-404, 1973.

[19] Low, G.C.; Jeffery, D.R.: Function points in the estimation and evaluation of the software
process. IEEE Transactions on Software Engineering 16(1), pp 64–71, 1990.

[20] Madachy, R.: Process Modeling with Systems Dynamics. 1996 SEPG Conference,
Atlantic City, NJ, Software Engineering Institute, Carnegie Mellon University, 1996.

[21] Matson, J.E.; Barrett, B.E.; Mellichamp, J.M.: Software development cost estimation
using function points. Transactions on Software Engineering 20(4), pp 275–287, 1994.

[22] Matson, J.E.; Mellichamp, J.M.: An Object Oriented tool for Function Point Analysis.
Expert Syst. 10, pp 3–14, 1993.

[23] Myers, G.: Software Reliability: Principles and Practices. John Wiley & Sons, Inc., New
York, 1976.

[24] Parunak V.D., Savit R., and Riolo R.: Agent-Based Modeling vs. Equation-Based
Modeling: A Case Study and Users Guide. Multi-Agent Systems and Agent-Based
Simulation, LNAI, Vol. 1534, Springer Verlag, Berlin Germany, 1998.

[25] Rasch R. H. and Tosi H.: Factors affecting software developers’ performance: An
integrated approach, MIS quarterly 16(3), pp 395, 1992.

[26] Rotter, J.B: Generalized expectancies of internal versus external control of
reinforcements. Psychological Monographs, 80 (whole no. 609), 1966.

[27] Rotter, J. B: Some problems and misconceptions related to the construct of internal
versus external control of reinforcement. Journal of Consulting and Clinical Psychology,
43, pp 56-67, 1975.

[28] Russell S. J., Norvig P.: Artificial Intelligence: A Modern Approach, second edition,
Pearson Education, Inc., Upper Saddle River, New Jersey, 2003.

[29] Smith N., Capiluppi A. and Fernández-Ramil J.: Agent-Based Simulation of Open Source
Evolution. Software Process: Improvement and Practice 11(4), pp423-434, 2006.

[30] Weiner, B.: Achievement Motivation and Attribution Theory. General Learning Press,
New Jersey, 1974.

[31] Wickenberg T. and Davidsson P.: On Multi Agent Based Simulation of Software
Development Process, Multi Agent Based Simulation II, LNAI, Vol. 2581, pp171-180,
Springer, 2003.

[32] Wooldridge M.: MultiAgent Systems. John Wiley & Sons Chichester, England, 2002.

[33] Xiang, X., Kennedy, R., and Madey, G.: Verification and Validation of Agent-based
Scientific Simulation Models. In Proceedings of the 2005 Agent-Directed Simulation
Symposium (ADS'05), San Diego, CA, pp 47-55, 2005.

[34] Yilmaz L. and Phillips J.: The Impact of Turbulence on the Effectiveness and Efficiency
of Software Development Teams in Small Organizations. Software Process:
Improvement and Practice 12(3), pp 247-265, 2007.

Appendix A VALIDATION OF SIMULATION
FRAMEWORK
The properties tested here are specific to the model of the simulation framework,
not to MABS or SD, which produce quite similar results on these tests whichever
simulator is used, except as explained in the discussion section. For this reason, and
for reasons of space, we document in this appendix only the validation steps conducted
using MABS.
A.1 Preliminary face validity testing of time dimension
To evaluate the face validity of our simulator from a time perspective, we ran a
number of simple tests whose outcomes are easily predictable or deducible without the
simulator, so that we could check the simulation framework's reliability (at least on
these "easy" cases).
Please note that in the graphs presented in this section A.1, the X-axis represents
time spent working on the task while the Y-axis represents the size of the artefact
produced.
Please also note that the project (a single phase) starts on 2007-11-20 at 08:00,
which is a Tuesday.
A.1.1 Single high performing developer working round-the-clock
In Figure 7 a single developer performs an activity defined to require 100 hours of
effort. The developer's performance was set to 100%, and, as expected, the developer
presented a linear progress curve that terminated with x and y values of 100.

Figure 7 A single developer working round-the-clock at 100% efficiency completes a task
of 100 hours in exactly 100 hours.
A.1.2 Effect of weekend breaks on progress
In Figure 8 we observe the impact of introducing weekend breaks. As expected the
simulator projects the task to last 48 hours longer than in the previous case.

Figure 8 A single developer working round-the-clock at 100% efficiency except on
weekends (48 hours delay) completes a task of 100 hours in 148 hours.

A.1.3 Restricting work hours to the interval [8 – 17[


The next step was to observe the impact on project progress of limiting work to
regular hours, i.e. the interval [8 – 17[, as opposed to round-the-clock. Figure 9
illustrates how the 100-hour task now requires 361 hours to complete. The figure
accounts for two weekends (48*2 = 96), 11 work days (at 9 hours each, there being no
lunch breaks yet, = 99), 1 last hour and 11 after-work intervals (11*15 = 165). The
total 96 + 99 + 1 + 165 = 361 means the project ends on Wednesday, 5th of December 2007.

Figure 9 A single developer working regular hours only on weekdays at 100% efficiency
completes a task of 100 hours in 361 hours.
This result is consistent with reality. We can demonstrate it in the following way.
We know that a single business day, with no lunch break, at 100% productivity would
result in an effort of nine hours, over a period of 24 hours and that a five-day week
period would therefore result in 45 hours of effort, over a period of 168 hours. Yet we
must keep in mind that our project starts on a Tuesday, therefore the very first week
will only produce 36 hours of effort over a duration of 144 hours.
Let wx denote week x. Hence, for our single developer to produce an effort of 100
hours, he or she can only progress as follows: w1 = 36 h. effort over 144 h. duration;
w2 = 45 h. effort over 168 h. duration and finally w3 = 9*2+1=19 h. effort over
24*2+1 = 49 h. in duration. i.e. 36+45+19 = 100 h. effort requires a duration of:
144+168+49 = 361 h.

A.1.4 Accounting for lunch breaks


In the next test we introduced lunch breaks. The developer is still working at 100%;
however, termination did not occur before h = 388. The graph in Figure 10 accounts for
12 work days (12 x 8 = 96), 12 after-work intervals (12 x 15 = 180), 2 weekends (2 x 48 =
96), 1 half day (4 hours), as well as 12 lunch breaks (12 hours).
This too is consistent with real life and can be demonstrated as before, except that
a single day now includes only eight productive hours, a week therefore only 40, and
the very first week, which starts on a Tuesday, only 32 productive hours.
The progress this time is w1 = 32 h. over 144 h.; w2 = 40 h. over 168 h.; w3 = 3*8+4 =
28 h. over 24*3+4 = 76 h., i.e. 32+40+28 = 100 h. of effort requires a duration of
144+168+76 = 388 h.
One would at first expect the difference between this scenario and the previous one
to be just the 12 extra (lunch) hours, i.e. this scenario should normally complete at
361 + 12 = 373 hours, leaving us with 15 unexpected hours. These are due to the fact
that the developer needs to go home for the day and come back to complete the last
four hours on Thursday, 6th of December 2007.

Figure 10 A single developer working regular hours on regular weekdays with lunch
breaks between 12:00 to 13:00, performing at 100% takes 388 hours to complete
a 100-hour task.
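Both the 361-hour and the 388-hour projections can be cross-checked independently of the
simulator with a few lines of calendar stepping. The sketch below is only such a check under
the stated assumptions (work hours [8 – 17[, Monday to Friday, start on Tuesday 2007-11-20
at 08:00); it is not the framework's scheduling code.

    from datetime import datetime, timedelta

    def elapsed_hours(effort_hours, lunch_hour=None,
                      start=datetime(2007, 11, 20, 8)):
        """Step the calendar one hour at a time and count the elapsed hours
        until the requested amount of effort has been accumulated."""
        t, elapsed, done = start, 0, 0
        while done < effort_hours:
            workday = t.weekday() < 5                      # Monday..Friday
            workhour = 8 <= t.hour < 17 and t.hour != lunch_hour
            if workday and workhour:
                done += 1                                  # one productive hour
            t += timedelta(hours=1)
            elapsed += 1
        return elapsed

    print(elapsed_hours(100))                  # -> 361 (no lunch breaks)
    print(elapsed_hours(100, lunch_hour=12))   # -> 388 (12:00-13:00 lunch break)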
A.1.5 Doubling the human resources
In Figure 11, we introduced a second "high performing" developer with exactly the
same characteristics as the first and observed how the project progressed. This time the
project took 194 hours to complete, which is exactly half of the 388 hours required by
a single developer.

Figure 11 Two identical developers collaborating on the same 100-hour task. They
terminate after 194 hours, which is exactly half the time it takes a single
developer to perform that task.

A.2 Preliminary face validity testing of performance
In the previous tests, performance was set to 100%. In the following tests we
integrate the EPM model and observe how it affects progress in the most intuitive
ways, again to see whether the simulator predicts these "obvious" scenarios correctly.
In the following tests, the individual characteristics (IC) considered are: achievement
needs (AN), self-esteem (SE) and Locus of Control (LC). An IC will be noted as IC =
{AN, SE, LC}, or just {AN, SE, LC}.
A.2.1 An “ideal” developer
We start by setting a developer's individual attributes all to their maximum value,
i.e. {1.0, 1.0, 1.0}. For now we "deactivate" the knowledge model by setting the
developer's knowledge level for the current task higher than that required. This means
that there is no knowledge to gain, resulting in a constant level of knowledge and hence
of performance. Figure 12 shows how the above values result in a performance near
95.6%. With such a performance level, 100 hours should normally take 100/0.956 =
104.6 hours. However, our calendar uses only full-hour increments, hence the 105
hours displayed.

Figure 12 Single developer with an “ideal” IC = {1.0, 1.0, 1.0} and knowledge level greater
than required knowledge level.

Another interesting point with this simulation is that it shows that, according to the
EPM model, a developer can expect to be productive – at best – "only" 96% of the time.
One explanation is that, in the EPM model, task difficulty is negatively correlated with
performance yet positively correlated with effort, which in turn is positively correlated
with performance. Using our interpretation of task difficulty as the difference between
the current knowledge level and the required level, it is possible to face tasks of zero
difficulty, which improves performance but reduces effort, which in turn reduces the
resulting performance. Thus 100% performance is unachievable using EPM. Although
this might be seen as a limitation of the model, it is quite realistic in a real-life
environment.

A.2.2 Two equally “ideal” developers
The next step was to test a multi-developer project and observe the impact on the
results. Figure 13 shows that our simulator behaves predictably, in that two developers
with the highest individual attribute values, working round the clock, finish in almost
exactly half the time it took the single developer. In Figure 12, the lone developer
required 105 hours (actually 104.6) to complete a task worth 100 hours. Our two
developers required 52.3 hours; however, the simulator clock ticks in full-hour
increments, because it was not designed for minute precision, hence the 53 hours
displayed.

Figure 13 Two developers with IC = {1.0, 1.0, 1.0} and knowledge level greater than
required knowledge level, complete a task worth 100 hours in 53 hours (actually
52.3).
A.2.3 Two quite “normal” developers with different performance levels
In this test we set the individual attributes of the EPM model to less extreme
values: Developer 1 (id 1) = {0.6, 0.7, 0.2} and Developer 2 (id 2) = {0.6, 0.7, 0.7}.
Figure 14 shows that the performance of developer 1 is 0.75, while developer 2,
who has a more internal locus of control, reaches 0.79. What is interesting here is that
the simulator stops after 66.6 hours (67 as shown in the figure). As expected, this
corresponds to the time it takes the slower developer, i.e. developer 1, to complete
50% of the total effort (50 hrs / 75% = 66.6 hrs).

Figure 14 Two developers. D1 = {0.6, 0.7, 0.2} and D2 = {0.6, 0.7, 0.7} both with a
knowledge level greater than required, complete a task worth 100 hours in 67
hours (actually 66.6).
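This stopping time can also be read directly from the performance figures, assuming (as a
simplification of the simulator's actual scheduling) that the scope is split evenly between
the two developers:

    def projected_duration(total_effort_hours, performances):
        """With an even split of scope, the slowest developer's share of the work
        determines the overall duration."""
        share = total_effort_hours / len(performances)
        return max(share / p for p in performances)

    print(projected_duration(100, [0.75, 0.79]))   # -> ~66.7 hours, rounded up to 67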

A.3 Model-to-Model comparison with HKM


Please note that in the graphs presented in this section A.3 the X-axis represents
time spent working on the task while the Y-axis is specified in each figure.
Please also note that the project (a single phase) starts on 2007-11-20 at 08:00,
which is a Tuesday.
The following tests demonstrate how our HKM model accounts for knowledge
fluctuation over time. In the previous preliminary validation sections we "deactivated"
HKM by setting all involved developers' knowledge to a level higher than that required
for completing the task; we shall therefore skip that particular test in this section. The
knowledge characteristics (KC) of a developer considered in our model are: the current
level of knowledge (bij), the time required to acquire the required level of knowledge
(Kij-equivalent) and the difficulty level that would lead a developer to abandon the task,
i.e. the maximum acceptable difficulty level (Eij-equivalent). We write
KC = {bij, Kij-equivalent, Eij-equivalent}. An equivalent value is a simplified scale
introduced by Hanakawa et al. [13], used to calculate the less intuitive final value of
the concerned variable. They proposed formulas for converting answers to simple
questions, e.g. the Kij-equivalent, into actual model variable values such as Kij.
In the following subsections we compare the results of cases 1-1, 1-2 and 1-3 of
Hanakawa et al. [13] with our own simulation, using similar KC values as Hanakawa
et al. [13].

A.3.1 Case 1-1


Case 1-1 involves an extremely well adapted developer whose knowledge of C
programming is estimated at 100%. In this scenario no knowledge gain is to be
expected, and therefore performance will remain constant. Hanakawa et al. [13]
project that such a situation would take the developer 6.71 months (6.71 * 172 = 1154
hours).
Figure 15 shows the result of our simulator under similar conditions. Our
simulator projects a result of 1089 hours. The difference d = 65 hours, or 5.7%.

Figure 15 Single developer with IC = {0.5, 0.6, 0.5} and KC = {100, 100, 30} corresponding
to case 1-1 of Hanakawa et al. [13]

A.3.2 Case 1-2


Case 1-2 is one where the developer is totally unacquainted with the task
(knowledge level bij = 0). However, the KC (equivalent) values used by Hanakawa et
al. [13] for this case were too extreme for our simulator, as in our simulation no
significant knowledge gain was achieved. As a result the simulations differ drastically:
Hanakawa et al. [13] project a duration of 9.51 months, while Figure 16 shows that our
simulator projects a duration of 2912 hours (i.e. 17 months). In section 8 (Discussion)
we explain our reservations as to how Hanakawa et al. [13] calculate the knowledge
gain: they use the total size of the task as a multiplication factor at each increment,
while we use only the size of the effort performed since the last increment, to reflect
how actual experience affects the gain to the developer. This explains why their
developer gains knowledge, despite the "drastic" values used in the simulation, while
ours remains by and large as "ignorant" at the end of the project as he or she was at
its start.

Figure 16 Single developer with IC = {0.7, 0.6, 0.5} and KC = {0, 100, 30} corresponding to
case 1-2 of Hanakawa et al. [13]

A.3.3 Case 1-3


In the following test, although the developer has a quite high level of knowledge,
he has difficulty acquiring more knowledge during his work; therefore the performance
remains constant. With IC = {0.7, 0.6, 0.5} and KC = {70, 300, 10}, our simulator
projects that the developer would need 1324 hours. According to the simulation of
Hanakawa et al. [13], the developer is done after 7.68 months, which represents 1321
hours, a difference of barely 4 hours, or 0.3%.
Please refer to section 8.2.2 for a discussion of the uncertainties about the HKM
model and its time scales.

Figure 17 Single developer with IC = {0.7, 0.6, 0.5} and KC = {70, 300, 10} corresponding to
case 1-3 of Hanakawa et al. [13]

Appendix B QUESTIONNAIRES
In this appendix we present the various questionnaires required to collect the
individual characteristics needed for our EPM model, as used in [25].

B.1 The Locus Of Control Scale


The Locus of Control Scale, developed by Rotter [26], measures generalised
expectancies for internal versus external control of reinforcement. An internal locus
suggests that the person believes that she or he has control over the rewards or
outcomes of his or her actions. An external locus suggests that the person believes that
external forces, events, groups or persons, and not the individual in question, have
control over such outcomes.
Below we present Rotter's 29-item Locus of Control questionnaire, in which an
individual is requested to choose, for each of the 29 questions, one alternative, (a) or
(b), according to their belief, but not both. Please see section 8 for a discussion of the
criticism this questionnaire has faced.

1 a. Children get into trouble because their parents punish them too much.
b. The trouble with most children nowadays is that their parents are too easy with them.

2 a. Many of the unhappy things in people's lives are partly due to bad luck.
b. People's misfortunes result from the mistakes they make.

3 a. One of the major reasons why we have wars is because people don't take enough
interest in politics.
b. There will always be wars, no matter how hard people try to prevent them.

4 a. In the long run people get the respect they deserve in this world.
b. Unfortunately, an individual's worth often passes unrecognized no matter how hard
he tries.

5 a. The idea that teachers are unfair to students is nonsense.


b. Most students don't realize the extent to which their grades are influenced by
accidental happenings.

6 a. Without the right breaks one cannot be an effective leader.


b. Capable people who fail to become leaders have not taken advantage of their
opportunities.

7 a. No matter how hard you try some people just don't like you.
b. People who can't get others to like them don't understand how to get along with
others.

8 a. Heredity plays the major role in determining one's personality.


b. It is one's experiences in life which determine what they're like.

9 a. I have often found that what is going to happen will happen.


b. Trusting to fate has never turned out as well for me as making a decision to take a
definite course of action.

10 a. In the case of the well prepared student there is rarely if ever such a thing as an unfair
test.
b. Many times exam questions tend to be so unrelated to course work that studying is
really useless.

11 a. Becoming a success is a matter of hard work, luck has little or nothing to do with it.
b. Getting a good job depends mainly on being in the right place at the right time.

12 a. The average citizen can have an influence in government decisions.


b. This world is run by the few people in power, and there is not much the little guy can
do about it.

13 a. When I make plans, I am almost certain that I can make them work.
b. It is not always wise to plan too far ahead because many things turn out to be a matter
of good or bad fortune anyhow.

14 a. There are certain people who are just no good.


b. There is some good in everybody.

15 a. In my case getting what I want has little or nothing to do with luck.


b. Many times we might just as well decide what to do by flipping a coin.

16 a. Who gets to be the boss often depends on who was lucky enough to be in the right
place first.
b. Getting people to do the right thing depends upon ability, luck has little or nothing to
do with it.

17 a. As far as world affairs are concerned, most of us are the victims of forces we can
neither understand, nor control.
b. By taking an active part in political and social affairs the people can control world
events.

18 a. Most people don't realize the extent to which their lives are controlled by accidental
happenings.
b. There really is no such thing as "luck."

19 a. One should always be willing to admit mistakes.


b. It is usually best to cover up one's mistakes.

20 a. It is hard to know whether or not a person really likes you.


b. How many friends you have depends upon how nice a person you are.

21 a. In the long run the bad things that happen to us are balanced by the good ones.
b. Most misfortunes are the result of lack of ability, ignorance, laziness, or all three.

22 a. With enough effort we can wipe out political corruption.


b. It is difficult for people to have much control over the things politicians do in office.

23 a. Sometimes I can't understand how teachers arrive at the grades they give.
b. There is a direct connection between how hard I study and the grades I get.

24 a. A good leader expects people to decide for themselves what they should do.
b. A good leader makes it clear to everybody what their jobs are.

25 a. Many times I feel that I have little influence over the things that happen to me.
b. It is impossible for me to believe that chance or luck plays an important role in my
life.

26 a. People are lonely because they don't try to be friendly.

b. There's not much use in trying too hard to please people, if they like you, they like
you.

27 a. There is too much emphasis on athletics in high school.


b. Team sports are an excellent way to build character.

28 a. What happens to me is my own doing.


b. Sometimes I feel that I don't have enough control over the direction my life is taking.

29 a. Most of the time I can't understand why politicians behave the way they do.
b. In the long run the people are responsible for bad government on a national as well as
on a local level.

B.2 Self-esteem questionnaire


Questionnaire based on Coopersmith's Self-Esteem Inventory12
Respond to each item with "Like me" or "Not like me"; score 4 points per positive answer.
01. Things usually don't bother me.
02. I find it very hard to talk in front of a group.
03. There are lots of things about myself I’d change if I could.
04. I can make up my mind without too much trouble.
05. I’m a lot of fun to be with.
06. I get upset easily at home.
07. It takes me a long time to get used to anything new.
08. I’m popular with persons of my own age.
09. My family usually considers my feelings.
10. I give in very easily.
11. My family expects too much of me.
12. It’s pretty tough to be me.
13. Things are all mixed up in my life.
14. People usually follow my ideas.
15. I have a low opinion of myself.
16. There are many times when I would like to leave home.
17. I often feel upset with my work.
18. I’m not as nice looking as most people.
19. If I have something to say, I usually say it.
20. My family understands me.
21. Most people are better liked than me.
22. I usually feel as if my family is pushing me.
23. I often get discouraged with what I’m doing.
24. I often wish I were someone else.
25. I can’t be depended on.

12
Coopersmith S.: The Antecedents of Self-Esteem, Freeman, San Francisco, CA, 1967.

B.3 Achievement needs questionnaire
Manifest Needs Questionnaire (MNQ)13 (*Only the achievement-needs items are included here.)

Reply to each of the following five items on a seven-point Likert scale: always, almost
always, usually, sometimes, seldom, almost never, never.
01. I do my best work when my job assignments are fairly difficult.
02. I try very hard to improve on my past performance at work.
03. I take moderate risks and stick my neck out to get ahead at work.
04. I try to avoid any added responsibilities on my job (reversed scale!).
05. I try to perform better than my co-workers.

13
Steers R. M., Braunstein D. N.: A Behaviorally-Based Measure of Manifest Needs in Work settings.
Journal of Vocational Behaviour 9, pp 251 – 266, 1976.
