PERGAMON International Journal of Project Management 18 (2000) 13-21

www.elsevier.com/locate/ijproman

Experts' estimates of task durations in software development projects

J. Hill (a), L.C. Thomas (a, *), D.E. Allen (b)
(a) Edinburgh University Management School, University of Edinburgh, Edinburgh, U.K.
(b) The School of Finance and Business Economics, Edith Cowan University, Perth, Australia

Abstract

This paper reports a case study of how accurate experts' subjective estimates of the durations of tasks in a software project were. The data available included the estimated task durations given by experts and the subsequent actual duration times. By looking at the results of the case study, the paper shows that although the majority of tasks are overestimated, the mean error is an underestimate of about 1%. The experts, however, could do even better by taking more cognisance of the number of subtasks that make up a task and hence using the work breakdown structure (WBS) at a lower level when they are estimating durations. © 1999 Elsevier Science Ltd and IPMA. All rights reserved.

* Corresponding author. Tel.: +44-131-650-3798; fax: +44-131-668-3053.

0263-7863/99/$20.00 © 1999 Elsevier Science Ltd and IPMA. All rights reserved.
PII: S0263-7863(98)00062-3

1. Introduction

The effectiveness of project management techniques depends heavily on the accuracy of the task duration estimates. Most of the project management tools (identifying critical activities, the baseline schedules, milestone determination, resource schedules, cost/time charts) depend on accurate duration estimates. These can be particularly difficult to make accurately in research and development situations such as software development. Although the Program Evaluation and Review Technique (PERT) methodology allows the inclusion of errors in the duration estimation, it is also heavily affected by what are the most likely duration times of the tasks.

Although project management has been in use for more than 30 years there are few published case studies reporting on the over and under estimates that occurred in task duration estimates in actual projects. Even in software development, although there is a large literature on methods of estimating the total cost or effort of the whole development (Kemerer [1], Jeffrey and Low [2], Finnie et al. [3]), there is little work reported on the estimates of the individual tasks that make up the project. It is these estimates that are required if project management techniques are to be used to plan and control the project, as opposed to an initial pricing or scoping estimate.

This paper reports the results of the task duration estimation procedure used by the software development experts of an international organisation on all its software development projects over a three-year period. The experts estimated the duration of the various tasks making up each project and these are compared with the actual time those tasks took. The results suggest that there may be biases that appear in such estimates and that one might be able to develop an automatic correction which allows for such biases. This could improve project management of software development considerably.

In a survey of information systems managers and software developers in over 100 organisations, Lederer and Prasad [4] found that only 25% of such projects were completed within estimated time and budgets. The most frequent cause of overrun was users changing the scope of the project. However the issues that were most highly correlated with overruns were individuals' performance reviews not considering if their
estimates were met, lack of ways of setting standard durations for use in estimating common tasks and lack of project reviews to compare estimates with actuals. It is hoped that the results presented hereafter will alert other companies to the need to look at the relationship between their estimates and the actual duration times.

The next section outlines some of the ways in which task, total project duration and effort are estimated in software development. Section 3 describes the case study and how the data on estimates of task duration and the subsequent actual duration times were obtained. Section 4 analyses the errors in estimation found in the data while the final section draws some conclusions from this case study.

2. Methods of task estimation

Developing accurate estimates of the duration and effort of the project overall, and its separate tasks, is critical to the usefulness of project management ideas, both in the planning and monitoring of projects. Thus project managers have used a number of different ideas to aid their estimation of the duration and effort of the project and its tasks, particularly in the case of software engineering and software development projects. This is partly due to the development of this industry at the same time as project management techniques were coming to the fore, and partly that the skills needed by engineers in this industry mean they find it easy to use project management ideas. It is also the nature of software development, which is essentially a research and development activity with all the uncertainty involved in that, but where many of the tasks are not dissimilar to tasks that have been previously performed.

The techniques that have been used to estimate project effort and task duration include expert judgement, analogy, parametric models (the most widely used of which were COCOMO and Bang), Function Point analysis and recently neural nets and case based reasoning.

Perhaps the most common approach to estimating effort is to consider the opinions of experts. This does not require the existence of historic data and is particularly useful at the start of system development when requirements are vague and changing, and it is ballpark figures that are required. Experiments suggest expert judgement can be very accurate but it fails to provide an objective and quantitative analysis of what are the factors that affect effort and duration, and it is hard to separate real experience from the expert's subjective view [5]. The reliability of the estimate depends on how closely the project correlates with past experience and the ability of the expert to recall all the facets of historic projects. Boehm [6] suggests it is easier to remember the functions developed in a project than the interfaces involved in the development, while it is true that reasons for large overestimates of tasks tend to be recalled more easily than reasons why tasks were completed more quickly than expected. To overcome these failings of the expert some authors [7, 8] suggested using the Delphi technique that allows a group of experts to arrive at a consensus. This approach could also be used to set the parameters for a PERT analysis, where for each task one wants estimates of the fastest possible duration, the slowest possible duration, and the most likely duration of a task [9].

Estimating by analogy is a powerful technique if there is a stable technological environment with some degree of historical data available. One identifies which was the most similar previous project or task and takes its actual time as the estimate of the new project or task. The advantages of analogy include low cost, relative simplicity and reasonable accuracy. The problem is to find suitable analogies [7] and, as Yeates [10] emphasises, it is unlikely that a previous case will exactly match the new project and hence expert judgement will be needed to adjust for these differences.

A great deal of work and not a little hype has gone into developing algorithmic or parametric estimation methods. These all identify what are the effort or cost drivers in a project and then seek to use data on past projects to fit the parameters of a model based on these cost drivers. The cost drivers tend to be measures of system size and complexity plus perhaps personnel capabilities and experience, hardware constraints and the availability of software development tools. The measures of size vary from likely number of lines of executable code, to number of functions, modules or program features required. The models themselves either use arithmetic formulae or regression approaches, and even the latter tend to split into those where the dependent variable is effort (additive ones), or the log of effort or time (multiplicative ones), or some other function of effort (analytic ones). Boehm [6] adds to this set, in his classification of such models, composite models which are combinations of the previous models. Jorgensen [5] gives a review and a critique of these approaches.

Some of these parametric approaches have become established methodologies in their own right. These include COCOMO and SLIM. The COCOMO model, proposed by Boehm [6], can be considered a composite model since it provides a combination of functional forms made accessible to the user in a structured manner. Its purpose is to predict the effort and duration of the total project, but not an estimate of size since the major factors in the model are the estimated number of delivered source instructions and the environment. The latter recognises three types of development mode,
each with its own equation, where the mode depends on the experience of the team and the innovative nature of the project. SLIM is another composite model outlined by Putnam [11] and also based on lines of code, but using Rayleigh curves to modify the estimates.

Function Point Analysis (FPA) was developed by Albrecht at IBM [12, 13] for quantifying the size of a software system, particularly in business applications. Function points are an alternative to source lines of code in measuring the size of a system, by capturing things like the number of input transaction types and the number of different types of reports to be available. Thus one first counts the number of user functions (external input, external output, internal files, external interface files, external inquiries) and then adjusts to allow for processing complexity. The original Albrecht approach [12] was a two-step approach where function points were used to estimate lines of code which were then used to estimate effort needed. Kemerer [1] tries to estimate project duration directly from the function point count. Neither approach was particularly accurate, and so Symons [13] revised the approach by making the method compatible with structured systems analysis techniques, which then made it easier to count logical transactions rather than user functions, and then by recognising that the model must be calibrated at the level of the organisation which is building the system, and not a general industry level. This made it closer to the approach adopted by DeMarco [14] in his BANG system. He had also recognised that at the design stage of a software system, lines of code is not an appropriate metric as one only has information on the business requirement. He therefore sought to develop metrics of the level of complexity of the business requirement by considering the latter as a network of functional primitives, such as calculating the interest on a loan. The review by MacDonell [15] examines how difficult it is to find such suitable metrics and comments on why reports of successful implementation of this method are so few. He felt the main reason was that it was hard to calibrate weightings at an organisation level.

The idea of using Artificial Intelligence techniques to aid estimation has only been tried since 1990. There have been two approaches. Case-based reasoning follows the analogy approach by trying to compare the proposed project with similar ones that have been completed and then identify what are the differences and what implications these have for the effort levels, see Mukhopadhyay et al. [16]. The second approach is to use neural networks where the inputs are the measures of size and complexity together with descriptions of the programming environment which are used in the algorithmic approaches [3].

There have been some comparisons of these methods of estimating total project effort even if there has been little reported work on task duration estimates. Kemerer [1] performed an empirical validation of four algorithmic models (SLIM, COCOMO, FPA and ESTIMACS, which is a proprietary system with similar features to FPA) using data on completed projects to construct estimates of the completion times and then comparing these with the actual times. Jeffrey and Low [2] conducted a similar investigation but allowed for the models to be calibrated at both the industry and the organisation level. Mukhopadhyay et al. [16] compared a case-based reasoning approach with COCOMO, FPA and expert judgement, and found that the other methods did not outperform expert judgement. Finnie et al. [3] compared FPA with case-based reasoning and neural network methods on 299 projects and found the AI models superior. His main point however was that good practical estimation demands good record keeping of estimates and actual outcomes on previous projects.

Heemstra [17] surveyed 364 organisations and found less than 15% used models to estimate software development effort and also that model-users made no better estimates than the non-model users. He also found that firms did not recalibrate their models in the light of their results.

It should be noted though that all these comparisons are of the total project effort or cost or duration needed for initially scoping and pricing the project. None of them consider the task estimates needed to make the project management approach successful.

3. Data on task durations and their estimates

Data on task durations was obtained from the information systems development department of a major international financial organisation. The organisation uses no formal estimation models but uses the expert opinion of project supervisors and managers to estimate the duration of the tasks that make up each project. They in turn probably use some form of analogy, and though estimates may be discussed informally between project team members there is no formal use of Delphi techniques.

Minor projects are planned at the supervisory level. In this planning exercise, the tasks and the subtasks involved are identified and effort needed for each task is estimated, and the tasks allocated to individuals. For more major projects senior management set guidelines regarding project estimates following agreement with the user steering group. The effort required for each task making up the project is then estimated by project managers as before, and the total effort of all tasks must be within the management guidelines. If it
is not, the project is referred back to the user steering group either to scale down the scope of the project or to increase these guidelines.

Once a project is agreed, the data on each task is loaded on the computer. This data includes the initial task effort estimate (recorded in workdays with a minimum input of 0.1 day), the project manager who made the estimate, the type of task, the number of subtasks that make up the task, the number of staff allocated for the task in total and whether the task is being tracked or not (tracking means the project's progress is being controlled by a project management package). During development the actual effort needed to complete the task is also recorded.

The database on tasks has been in existence since 1989 and includes details of over 16 600 tasks. A subset of these was chosen using the following filters. Only tasks which were part of enhancement or development projects were included, which removed projects coded as fixes and training, while tasks which were part of the administration of the project (meetings, walkthroughs, etc.) were also excluded. Only tasks undertaken in the three-year period 1993-1995 and estimated by one of the six project managers employed throughout that period were included. Lastly only tasks with at least 0.5 days of estimated effort were included. A sample of over 500 tasks meeting these criteria was then analysed. There were some extreme values. For example a task that was estimated to take 126 days only took 1.6 days, while another was estimated to take 1 day and took 150.9 days. These and other outliers were examined by the senior managers to see if they were true records of the estimates and actual times. Twenty tasks attracted comment, but only the two alluded to were felt to be spurious by the senior managers, and these were excluded from the sample leaving 506 tasks. The comments were themselves interesting, in that in some cases it was felt that very low duration estimates had been initially allocated for strategic reasons.

Lastly the subtasks making up each task were examined to determine the type of the task. These types were classified as:

T1: setting objectives and requirements;
T2: external design;
T3: internal design;
T4: construction (programme development and implementation);
T5: analysis;
T6: combinations of analysis and construction.

The year the project was planned and undertaken was also recorded in case there was a learning effect by the estimators.

4. Analysis of data

The descriptive statistics for all the tasks are outlined in Tables 1 and 2 and show that although 60% of the tasks were overestimated, on average the estimated durations were slightly less than the actual durations, i.e., the error is less than 1%. This implies that the errors when tasks take longer than estimated are significantly higher than if they take less time than estimated. This is reasonable since there is a limit on how much quicker than estimated time the task can be completed, but no limit on how much longer.

Table 2 shows that for almost all the different ways of grouping the tasks the majority of tasks are overestimated. There are two groups in which the majority of tasks are underestimated. The first group is tasks involving analysis and construction and the second group is more complex tasks, including ones with more than three staff or with more than five subtasks. The underestimating of joint construction and analysis tasks suggests that perhaps the requirements were not fully understood before the work started and it might be better to split the tasks into separate analysis and construction tasks. The underestimating of tasks with large teams and large numbers of subtasks suggests a common cause. It may be that estimators tend not to understand the true complexity of larger systems development tasks.

The distribution of the error (estimated time - actual time), as displayed in Fig. 1, is unimodal. There are far too many errors (25% of the sample) in the region [-1.0, 1.0] for the distribution of errors to be normal (i.e., the tails are too thin) and 10% of the cases have no error at all. The standard tests for normality confirm this, and the extremely high kurtosis of this distribution.

Looking in more detail at the errors in the estimates for different types of tasks confirms the results of Table 2. Table 3 gives the mean, median and standard deviation for the errors (estimate - actual) and actual times for each of the task types. Again the major differences occur with tasks of large size (be it by the number of subtasks or staff allocated) and the life cycle option.

Table 1
Summary statistics of estimates and actual durations

All tasks          Estimated days (E)   Actual days (A)   Est-Act (E-A)
Mean               13.51                13.63             -0.12
Stand. deviation   27.83                27.2              20.43
Minimum            0.5                  0.1               -214.4
Maximum            200                  255.4             140.1
Median             5.0                  3.9               0.3
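The summary measures used in this section (the error E - A, the shares of over, on-target and under estimates, and the mean error relative to the mean actual duration) can be computed from raw (estimate, actual) pairs in a few lines. The sketch below is illustrative only: the function name `summarise` and the five sample tasks are assumptions made for the example and are not data from the case study. Only the final check uses published figures, namely the two means from Table 1.

```python
from statistics import mean, median


def summarise(tasks):
    """Summarise estimation errors for a list of (estimated_days, actual_days).

    The error follows the paper's sign convention, E - A, so a positive
    error is an overestimate and a negative error is an underestimate.
    """
    errors = [e - a for e, a in tasks]
    n = len(tasks)
    over = sum(1 for err in errors if err > 0)
    under = sum(1 for err in errors if err < 0)
    on_target = n - over - under
    mean_actual = mean(a for _, a in tasks)
    return {
        "share_over": over / n,
        "share_on_target": on_target / n,
        "share_under": under / n,
        "mean_error": mean(errors),
        "median_error": median(errors),
        # Mean error as a fraction of the mean actual duration.
        "relative_mean_error": mean(errors) / mean_actual,
    }


# Hypothetical tasks (NOT from the case study): three small overestimates,
# one exact estimate and one large underestimate.
sample = [(5.0, 4.0), (3.0, 2.5), (2.0, 2.0), (8.0, 7.0), (4.0, 9.0)]
stats = summarise(sample)

# Check against the published means in Table 1: a mean error of -0.12 days
# on a mean actual duration of 13.63 days is an underestimate of under 1%.
table1_relative_error = -0.12 / 13.63
```

In the hypothetical sample most tasks are overestimated, yet the mean error is negative because the single underestimate is large. This is the same asymmetry the text describes: a task cannot finish much faster than estimated, but there is no limit on how much longer it can take.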

Table 2
Proportion of tasks which are over and under estimated

Description of group        Overestimate   On target   Underestimate
All tasks                   59.5%          8.1%        32.4%
Project manager
  Manager 1                 56.3%          16.7%       27.0%
  Manager 2                 61.3%          2.7%        36.0%
  Manager 3                 60.9%          4.7%        34.4%
  Manager 4                 54.3%          7.4%        38.3%
  Manager 5                 68.5%          5.6%        25.9%
  Manager 6                 60.4%          5.7%        33.9%
Life cycle option
  T1-objectives             75.0%          0%          25.0%
  T2-external design        75.0%          0%          25.0%
  T3-internal design        64.1%          20.8%       15.1%
  T4-construction           65.6%          9.3%        25.1%
  T5-analysis               46.2%          7.8%        46.2%
  T6-analysis/construction  44.1%          1.5%        54.4%
Completion date
  1993                      61.5%          7.4%        31.1%
  1994                      57.6%          8.7%        33.7%
  1995                      57.8%          8.9%        33.3%
Tracking
  Tracking                  56.8%          8.2%        35.0%
  No tracking               64.0%          7.9%        28.1%
Staff allocation
  3 or less                 64.5%          10.2%       25.3%
  More than 3               42.1%          0.9%        57.0%
No. of subtasks
  1-5                       66.4%          11.4%       22.2%
  6-19                      42.9%          1.0%        56.1%
  20+                       45.6%          0%          54.4%
Effort estimation
  Less than 2 days          54.2%          15.3%       30.5%
  2-5 days                  59.0%          9.5%        31.5%
  5-10 days                 53.7%          3.0%        43.3%
  10-25 days                67.9%          0%          32.1%
  More than 25 days         72.1%          0%          27.9%

Fig. 1. Distribution of E - A.

Table 3
Means and medians of errors and actual times

Description of group        Est-Act mean   Est-Act st. dev.   Est-Act median   Actual mean   Actual st. dev.   Actual median
Life cycle option
  T1-objectives             6.62           8.96               3.9              1.12          0.89              1.15
  T2-external design        2.16           7.41               1.2              11.46         15.7              4.15
  T3-internal design        1.06           4.8                0.4              4.45          7.0               2.9
  T4-construction           1.27           7.75               0.4              8.12          18.2              2.9
  T5-analysis               4.04           23.7               0.2              30.81         48.3              11.05
  T6-analysis/construction  -4.36          36.58              -0.5             27.74         38.25             12.8
No. of subtasks
  1-5                       1.32           6.44               0.4              3.44          3.88              2.3
  6-19                      -0.31          19.05              -0.9             18.65         16.6              13.95
  20+                       -8.78          52.79              -3.9             67.84         49.2              74.5
Staff allocation
  3 or less                 1.26           8.28               0.4              5.49          9.87              2.75
  More than 3               -4.93          39.97              -2.4             41.66         44.14             28.65
Effort estimation
  Less than 2 days          -1.78          7.44               0.1              3.16          7.53              1.1
  2-5 days                  -1.46          8.65               0.55             5.53          8.84              3.3
  5-10 days                 -7.3           27.24              0.7              15.93         27.5              8.6
  10-25 days                2.7            9.34               3.3              15.0          9.56              13.15
  More than 25 days         12.97          44.87              12.2             58.27         49.6              41.9
Tracking
  Tracking                  -0.02          23.56              0.2              17.3          31.87             5.0
  No tracking               -0.3           13.7               0.4              7.49          14.82             2.6
Completion date
  1993                      1.69           22.38              0.5              15.62         30.33             3.95
  1994                      -1.57          13.92              0.25             12.01         20.27             4.1
  1995                      -2.3           24.7               0.2              11.36         29.6              3.15

As is expected, the greater the number of subtasks the larger the estimated and actual durations, but what is significant is that for tasks with 5 or fewer subtasks the estimated durations were more than 40% on average greater than the actual durations. This compares with the tasks with more than 19 subtasks where the estimated average durations are 13% on average less than the average actual durations. There are considerable differences in tasks that employ 3 or fewer staff compared with those that need more staff than that. The former have an average actual duration of five and a half days while the latter have an average of over 41 days. Again the estimation procedure overestimates the tasks with few staff and underestimates those with large numbers of staff. In the first category, the average overestimate is 20% and 65% of the tasks were overestimated. In the second group, the average duration underestimate is over 10% and 57% of the tasks were underestimated. There seems no doubt that experts tend to overestimate small tasks and underestimate complex tasks whether one defines complexity by number of staff used or by number of subtasks. One should not confuse complexity with long duration.

When the tasks are split according to the estimated duration, although the majority of tasks of all estimated duration categories are overestimated, the average error is an underestimate of 1.78 days for tasks estimated at 2 days or less, an underestimate of 1.46 days for those estimated between 2 and 5 days and an underestimate of 7.3 days for those between 5 and 10 days. On the other hand the average error on tasks estimated between 10 and 25 days is an overestimate of 2.7 days and for even larger tasks, an average overestimate of nearly 13 days. Thus tasks with long estimated durations are overestimated and those with low estimated durations are underestimated. This makes sense, but is the opposite result to the one concerning complexity of the tasks.

Looking at the task types, the combined tasks T6 are on average underestimated by about 15% while all the other types are overestimated by an average of 15%. This suggests the lack of clarity in type T6 tasks, which shows up in their mix of subtasks and means that their complexity is underestimated. The other surprising results are negative ones, in that project management tracking seems to make little difference to the errors in the estimates. Similarly when the projects are split into the projects undertaken and estimated in 1993, 1994 and 1995 to see if there is a learning effect, one finds there is a slight drop in number of tasks overestimated between 1993 and the next two years. Similarly the estimated error changed from an average of 1.69 (over) to -1.57 (under) and then to -2.3 (under). This suggests that the estimators are becoming meaner if not more accurate over time. However this trend may be related to the fact that the average task duration time was considerably greater in 1993 than in subsequent years and as was suggested above there is a tendency to overestimate tasks with long estimated durations.

So far we have looked at the effect on discrepancies between the estimates and the real durations in terms of each facet of a task separately. In the rest of this section we will use regression to see what are the effects of all the factors combined. This is done by applying regression analysis to the data, using White's correction to allow for an unknown form of heteroskedasticity. One finds that regressing the actual time (A) on the estimated time (E) only leads to an equation

$$A = 0.708\,E + 4.07 \qquad (1)$$

which has an $R^2$ of 0.53, suggesting that the estimate only explains about half the variation. Obviously one way of improving the experts' estimates is to include all the other factors that describe the tasks. Doing
this leads to the equation

$$A = 0.177\,E + 0.07\,\mathrm{STAFF} + 1.33\,\mathrm{SUBTASKS} - 1.68\,D_{\mathrm{tracking}} + 0.21\,D_{M1} - 2.13\,D_{M2} - 3.30\,D_{M3} + 0.23\,D_{M4} - 6.15\,D_{M5} - 0.70\,D_{T1} + 2.31\,D_{T2} - 0.55\,D_{T4} + 2.65\,D_{T5} + 2.04\,D_{T6} + 1.79 \qquad (2)$$

where STAFF is the number of staff used, SUBTASKS the number of subtasks, $D_{\mathrm{tracking}} = 1$ if tracking was used, $D_{Mi}$ is the effect of manager i (compared with manager 6) and $D_{Tj}$ is the effect of task type Tj (compared with type T3). This equation has an $R^2$ value of 0.845 which means that the right hand side of Eq. (2) is quite a good estimate of the actual durations. Two of the most significant variables are SUBTASKS with a t-ratio of 6.21 and E with a t-ratio of 2.05. The coefficient of E is 0.177 which means that one takes less than 20% of the estimated time into the equation, whereas the number of subtasks is multiplied up by 1.33 when combining all the factors to get the best estimate of the duration. The details are found in Table 4 which shows that other significant factors are whether the estimates are made by managers 3 or 5.

Table 4
Results of regression in Eq. (2). Ordinary least squares regression applying White's corrections for heteroskedasticity.
Actualtime = a + b1 estimate + b2 staff + b3 subtasks + b4 tracking + b5 M1 + b6 M2 + b7 M3 + b8 M4 + b9 M5 + b10 T1 + b11 T2 + b12 T4 + b13 T5 + b14 T6 + e

Variable name   Estimated coefficient   Standard error   T-ratio (491 DF)   P-value   Partial corr.   Standardized coefficient   Elasticity at means
Estimate        0.17686                 0.8605E-01       2.055              0.040     0.092           0.1810                     0.1752
Staff           0.0717                  0.4944           0.1452             0.885     0.007           0.0060                     0.0134
Subtasks        1.3306                  0.2142           6.211              0.000     0.270           0.7765                     0.8276
Tracking        -1.6774                 1.408            -1.192             0.234     -0.054          -0.0299                    0.0770
M1              0.20860                 1.081            0.1929             0.847     0.009           0.0033                     0.0038
M2              -2.1308                 2.045            -1.042             0.298     -0.047          -0.0279                    -0.0231
M3              -3.3011                 1.546            -2.135             0.033     -0.096          -0.0404                    -0.0306
M4              0.22777                 1.958            0.1163             0.907     0.005           0.0031                     0.0027
M5              -6.1461                 1.850            -3.322             0.001     -0.148          -0.0698                    -0.0481
T1              -0.69838                1.839            -0.3798            0.704     -0.017          -0.0023                    -0.0004
T2              2.3073                  1.899            1.215              0.225     0.055           0.0106                     0.0027
T4              -0.54811                0.801            -0.6842            0.494     -0.031          -0.0100                    -0.0231
T5              2.6525                  4.019            0.6600             0.510     0.030           0.0160                     0.0054
T6              2.0388                  1.369            1.489              0.137     0.067           0.0333                     0.0402
Constant        1.7928                  1.690            1.061              0.289     0.048           0.0000                     0.1314

Durbin-Watson = 1.8597.
506 observations. Dependent variable = ACTUALTIME.
Using White's heteroskedasticity-consistent covariance matrix, R-square = 0.8453, R-square adjusted = 0.8408.
Sum of squared errors (SSE) = 57850.
Mean of dependent variable = 13.643. 491 DF.
Log of the likelihood function = -1916.97.

Since the subtasks turn out to be so important it is worth looking at what happens if one uses only those to estimate the actual durations. The regression equation in this case is

$$A = 1.55\,\mathrm{SUBTASKS} + 0.47 \qquad (3)$$

and the $R^2 = 0.82$. This high $R^2$ for Eq. (3) means the actual times are almost as well explained by the number of subtasks alone as when all the variables are put together, and the fit is better than just using estimated time.

This suggests that if the experts counted subtasks and incorporated this extensively into their estimate they would get better results. So there is the question: do the experts consciously or subconsciously already do this? One way of checking is to do a regression of the estimated times against the task variables. If one does this one finds that the regression has an $R^2$ of 0.49 and the equation is

$$E = 1.18\,\mathrm{SUBTASKS} - 0.11\,\mathrm{STAFF} + 2.92\,D_{\mathrm{tracking}} \;\text{[sign illegible]}\; 22.93\,D_{M1} + 1.18\,D_{M2} + 0.94\,D_{M3} + 1.92\,D_{M4} - 3.375\,D_{M5} + 3.87\,D_{A} + 3.81\,D_{B} - 0.37\,D_{D} + 12.42\,D_{E} + 2.19 \qquad (4)$$

Again, if one only looks at the relationship between estimated times and number of subtasks one gets the regression equation

$$E = 1.22\,\mathrm{SUBTASKS} + 3.18 \qquad (5)$$

with an $R^2 = 0.48$. What these results seem to say is
that of all the factors described, it is the number of subtasks that has the biggest impact on the time estimated. However, since R² is only 0.48, there is a lot of variation in the estimated times that cannot be explained by the number of subtasks. It is not possible to say from the data what it is that affects the experts' estimates. All that can be said, after examining Eq. (4) and noting that it is no better an estimate than Eq. (5), is that it has nothing to do with the type of task or the manager who is doing the estimating.

Although almost all the explanation of the estimated times lies in their relationship with subtasks, these estimates still do not put enough emphasis on subtasks, since the relationship of subtasks with actual times is even stronger than that of subtasks with estimated times.

Since the real error is A − E, regressions that estimate real errors in terms of estimated times and the other factors lead to an equation which is just a rearrangement of Eq. (2), though of course the R² will be larger because the values of the dependent variable are so much smaller. Looking at the relative error, so that the dependent variable is (E − A)/A or (E − A)/E, does not give very good fits (R² of 0.13 and 0.26 respectively), and although the effect of subtasks remains significant in these regressions it is not so great as in the straight-line error regressions. Other variables, like the number of staff, the type of task and which manager does the estimating, become significant at the 5% level in these regressions, as does the estimated time.

5. Conclusions

The results suggest that in this case study the expert project managers all tended to overestimate the majority of the tasks (around 60%), but when they underestimated, the errors tended to be greater, so the mean error was an underestimate of about 1%. The distribution of errors was a mixture, with some tasks being estimated completely accurately, others taking one day less than estimated and the rest having almost a normal distribution. One should be cautious about giving explanations for this, but it could be that some tasks are routine and there is an understanding in the organisation of the length of time they are to take, which is known by project managers and analysts. Hence such tasks tend to come in on time or a little early. The others, where the development is less clear, are the ones that can be approximated by a normal distribution, with a slightly longer underestimating tail than overestimating one.

Looking at the estimates in more detail, it seemed that estimators made the largest average underestimating errors on tasks that were not clearly defined (category T6) or were complex, involving a number of staff and a large number of subtasks. The major overestimates came in tasks involving setting requirements (T1) or analysis (T5). Tracking of the tasks seemed to have little effect, and the learning effect over time was to tighten up the estimates so that the average estimated task times, instead of being 1.7 days more than the actual times, became 2.3 days less than the actual times. This effect might have as much to do with pressure within the organisation for projects to be done as efficiently as possible as with any learning effect on what really happens in projects. One way this pressure exerted itself was in a drive to squeeze 'more value for money' as a way of improving the returns to the clients.

The main conclusion, though, is the strong relation between task time and the number of subtasks involved in the task. This was by far the best estimate of the likely time of the task, better even than the estimated time. What this suggests is that a careful use of work breakdown structure approaches to identify the packets of work at the lowest level is not just useful for smooth management of the project, but is also one of the most useful things experts can do to estimate the times of the tasks better. It is clear that the managers in this case study understated the full effect that the number of subtasks has on the time of a task. This may be because such tasks were poorly thought through and the dependencies between the subtasks not really understood; or it could be that there is poor project management at the subtask level.

Acknowledgements

The paper was written while one of us (LT) was a visiting professor at Edith Cowan University. We wish to acknowledge the University's support in funding this visit.

References

[1] Kemerer CF. An empirical validation of software cost estimation models. Commun. ACM 1987;30:416–29.
[2] Jeffery DR, Low GC. Calibrating estimation tools for software development. Software Engineering Journal 1990;5:215–21.
[3] Finnie GR, Wittig GE, Desharnais JM. A comparison of software effort estimation techniques: using function points with neural networks, case-based reasoning and regression models. J. Systems and Software 1997;39:281–9.
[4] Lederer AL, Prasad J. Causes of inaccurate software-development cost estimates. J. of Systems and Software 1995;31:125–34.
[5] Jorgensen M. Experience with the accuracy of software maintenance task effort prediction models. IEEE Transactions on Software Engineering 1995;21:674–81.

[6] Boehm BW. Software engineering economics. Englewood Cliffs: Prentice Hall, 1981.
[7] Goodman PA. Application of cost estimation techniques: industrial perspective. Information and Software Technology 1999;34:379–82.
[8] Shepperd M. Foundations of software measurement. Englewood Cliffs, New Jersey: Prentice-Hall, 1995.
[9] Kerzner H. Project Management: a systems approach to planning scheduling and control. New York: van Nostrand Reinhold, 1995.
[10] Yeates D. Systems project management. London: Pitman, 1986.
[11] Putnam LH. A general empirical solution to the macro-software sizing and estimation problem. IEEE Trans. on Software Engineering 1978;4.
[12] Albrecht AJ, Gaffney JE. Software function, source lines of code and development effort prediction: a software science validation. IEEE Trans. on Software Engineering 1983;9:639–48.
[13] Symons CR. Software sizing and estimating MkII FPA. Chichester: Wiley, 1991.
[14] DeMarco T. Controlling software projects: management, measurement and estimation. New York: Yourdon, 1982.
[15] MacDonell SG. Comparative review of functional complexity assessment methods for effort estimation. Software Engineering Journal 1994;9:107–16.
[16] Mukhopadhyay T, Vicinanza SS, Prietula MJ. Examining the feasibility of a case-based reasoning model for software effort estimation. MIS Quarterly 1992;16(2):155–71.
[17] Heemstra FJ. Software cost estimation. Information and Software Technology 1992;34:627–39.

J. Hill obtained a Masters degree in Business Administration from the University of Edinburgh. He is a senior manager in the research and development department of an international financial organisation, and has been overseeing software development projects for a number of years.

L. C. Thomas is Professor of Management Science at the University of Edinburgh. His undergraduate degree and his D.Phil were in Mathematics from the University of Oxford. He has authored and edited seven books and over 100 papers in the Management Science area.

D. E. Allen is the foundation Professor of Finance at Edith Cowan University, having previously been at Curtin University, the University of Western Australia and the University of Edinburgh. He received a degree in economics from the University of St. Andrews, an M.Phil from Leicester University and his Ph.D. in finance from the University of Western Australia. His research interests include a number of areas of business economics and finance, portfolio analysis, estimation of risk and the statistical estimation of financial and other data.
