5-Algorithmic Models For Software Cost Estimation-02-02-2024

Algorithmic Models for Software Cost Estimation


Cost estimation objectives
Budget
• To know what you will spend
Controls
• A lever to control the project
Differential analysis
• Monitor progress by comparing planned with estimated costs
Cost database
• Make future estimation better
Cost estimation and planning/scheduling are closely related activities
Software cost components
• Hardware and software costs
• Travel and training costs
• Effort costs (the dominant factor in most projects)
  • Salaries of engineers involved in the project
  • Costs of building, heating and lighting
  • Costs of networking and communications
  • Costs of shared facilities (e.g. library, staff restaurant)
  • Costs of pensions, health insurance, etc.
Costing and pricing
• Estimating Cost
• Costs for developer, not buyer
• We need our costs to manage and assess

• Estimating Price
• There is not a simple relationship between the development cost and the price charged
to the customer.
• Broader organizational, economic, political and business considerations influence the
price charged.
Productivity Measures

• Size-related measures
• Must be based on some output from the software process
• Delivered source code
• Object code instructions

• Function-related measures
• Based on an estimate of the functionality of the delivered software.
• Function-points are the best known of this type of measure
Lines of code
LOC = NCLOC + CLOC
LOC: total lines of code
NCLOC: non-comment lines of code
CLOC: comment lines of code
KLOC: one thousand lines of code
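The relation LOC = NCLOC + CLOC can be illustrated with a small counter. This is only a sketch under stated assumptions: it treats a line as a comment line when it starts with `#` (Python-style source), ignores blank lines, and counts a code line with a trailing comment as code. The function name `count_loc` is hypothetical.

```python
def count_loc(source: str) -> dict:
    """Count comment and non-comment lines in Python-style source text."""
    ncloc = 0  # non-comment lines of code
    cloc = 0   # comment lines
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines are counted in neither bucket
        if stripped.startswith("#"):
            cloc += 1
        else:
            ncloc += 1  # code lines with trailing comments count as code
    loc = ncloc + cloc
    return {"NCLOC": ncloc, "CLOC": cloc, "LOC": loc, "KLOC": loc / 1000}


if __name__ == "__main__":
    sample = "# header comment\nx = 1\n\ny = 2  # trailing comment\n"
    print(count_loc(sample))
```

Real counting tools apply more detailed rules (block comments, continuation lines), so counts from different tools are not directly comparable.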
Function points
• Based on a combination of program characteristics
• external inputs and outputs
• user interactions
• external interfaces
• files used by the system

• A weight is associated with each of these

• The function point count is computed by multiplying each raw count by the
weight and summing all values
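The weighted sum described above can be sketched as follows. The weights are not given in these slides; the values below are illustrative assumptions (average-complexity weights often quoted for Albrecht-style counting), and the mapping of the slide's categories to those weights, the sample counts and the function name are hypothetical.

```python
# Assumed, illustrative weights per characteristic (not taken from the slides).
WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "user_interactions": 4,
    "external_interfaces": 7,
    "files": 10,
}


def unadjusted_function_points(counts: dict, weights: dict = WEIGHTS) -> int:
    """Multiply each raw count by its weight and sum the results."""
    return sum(counts.get(name, 0) * weight for name, weight in weights.items())


if __name__ == "__main__":
    # Hypothetical raw counts for a small system.
    counts = {
        "external_inputs": 10,
        "external_outputs": 8,
        "user_interactions": 6,
        "external_interfaces": 2,
        "files": 4,
    }
    # 10*4 + 8*5 + 6*4 + 2*7 + 4*10 = 158
    print(unadjusted_function_points(counts))
```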
Function points (FP)
• A function point (FP) is a unit of measurement that expresses the amount of business functionality an information system (as a product) provides to a user.
• The function point count is modified by the complexity of the project
• FPs can be used to estimate LOC depending on the average number of LOC per
FP for a given language
• FPs are very subjective
• Depend on the estimator
• FP cannot generally be counted automatically
Factors affecting productivity
Factor: Description
• Application domain experience: Knowledge of the application domain is essential for effective software development. Engineers who already understand a domain are likely to be the most productive.
• Process quality: The development process used can have a significant effect on productivity.
• Project size: The larger a project, the more time required for team communications. Less time is available for development so individual productivity is reduced.
• Technology support: Good support technology such as CASE tools, supportive configuration management systems, etc. can improve productivity.
• Working environment: A quiet working environment with private work areas contributes to improved productivity.
Estimation techniques
• Expert judgement
• Estimation by analogy
• Parkinson’s Law
• Pricing to win
• Top-down estimation
• Bottom-up estimation
• Algorithmic cost modelling
Expert judgement
• One or more experts in both software development and the application domain use their experience to predict software costs. The process iterates until some consensus is reached.

• Advantages: Relatively cheap estimation method. Can be accurate if experts have direct experience of similar systems.

• Disadvantages: Very inaccurate if no experts with relevant experience are available
Estimation by analogy
• The cost of a project is computed by comparing the project to a similar project in the same application domain

• Advantages: Accurate if project data is available

• Disadvantages: Impossible if no comparable project has been tackled. Needs a systematically maintained cost database
Parkinson's Law

• The project costs whatever resources are available

• Advantages: No overspending

• Disadvantages: System is usually unfinished


Pricing to win

• The project costs whatever the customer has to spend on it. The estimated
effort depends on the customer's budget and not on the software
functionality.

• Advantages: You get the contract

• Disadvantages: The probability that the customer gets the system he or she wants is small. Costs do not accurately reflect the work required
Top-down estimation
• Start at the system level and work out how the system functionality is provided

• Takes into account costs such as integration, configuration management and documentation

• Can underestimate the cost of solving difficult low-level technical problems
Bottom-up estimation
• Start at the lowest system level. The cost of each component is estimated individually. These costs are summed to give the final cost estimate

• Accurate method if the system has been designed in detail

• May underestimate costs of system-level activities such as integration and documentation
Estimation methods
• Each method has strengths and weaknesses
• Estimation should be based on several methods
• If these do not return approximately the same result, there is
insufficient information available
• Some action should be taken to find out more in order to make more
accurate estimates
• Pricing to win is sometimes the only applicable method
Algorithmic cost modelling
• Cost is estimated as a mathematical function of product, project and process attributes whose values are estimated by project managers

• The function is derived from a study of historical costing data

• The most commonly used product attribute for cost estimation is LOC (code size)

• Most models are basically similar but with different attribute values
Examples of cost models
• General form: $E = A + B \times S^{C}$
• E: effort; S: size; A, B, C: constants
Examples:
$E = 5.2 \times (\mathrm{KLOC})^{0.91}$ Walston-Felix Model
$E = 5.5 + 0.73 \times (\mathrm{KLOC})^{1.16}$ Bailey-Basili Model
$E = 3.2 \times (\mathrm{KLOC})^{1.05}$ COCOMO Basic Model
$E = 5.288 \times (\mathrm{KLOC})^{1.047}$ Doty Model for KLOC > 9
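To get a feel for how far these size-based models can diverge, the four equations above can be evaluated side by side. This is a minimal sketch: the coefficients are copied from the list above, while the dictionary layout, the function name `compare_models` and the 50 KLOC sample size are assumptions made for illustration.

```python
# Each model maps size in KLOC to estimated effort (coefficients as listed above).
MODELS = {
    "Walston-Felix": lambda kloc: 5.2 * kloc ** 0.91,
    "Bailey-Basili": lambda kloc: 5.5 + 0.73 * kloc ** 1.16,
    "COCOMO Basic": lambda kloc: 3.2 * kloc ** 1.05,
    "Doty (KLOC > 9)": lambda kloc: 5.288 * kloc ** 1.047,
}


def compare_models(kloc: float) -> dict:
    """Evaluate every model at the same size so the estimates can be compared."""
    return {name: model(kloc) for name, model in MODELS.items()}


if __name__ == "__main__":
    for name, effort in compare_models(50).items():  # 50 KLOC: arbitrary example
        print(f"{name:16s} {effort:8.1f}")
```

The spread between the results is a reminder that each model was fitted to a different project database, which is why calibration to local data matters.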
Examples of cost models
Cost models using FP as a primary input include (Pressman, 1997):
Albrecht and Gaffney Model
$E = -13.39 + 0.0545\,FP$
Kemerer Model
$E = 60.62 \times 7.728 \times 10^{-8}\,FP^{3}$
Matson, Barnett, and Mellichamp Model
$E = 585.7 + 15.12\,FP$
The Constructive Cost Model (COCOMO)

• COCOMO is one of the most widely used software estimation models in the
world

• It was developed by Barry Boehm in 1981

• COCOMO predicts the effort and schedule for developing a software product based on inputs relating to the size of the software and a number of cost drivers that affect productivity
COCOMO Models

• COCOMO has three different models that reflect the complexity:


• Basic Model

• Intermediate Model

• Detailed Model
Basic Model

• Applicable to small to medium-sized software projects

• Used for quick, rough estimates

• Three modes of software development are considered:
  • Organic
  • Semi-detached
  • Embedded
Organic Mode
• A small team of experienced programmers develops software in a very familiar environment
• Requires little innovation
• Size range: 0-50 KLOC
Semi-detached mode
• An intermediate mode between the organic mode and embedded mode
• Depending on the problem at hand, the team includes a mixture of experienced and less experienced people
• Requires medium innovation
• Development environment is of medium complexity
• Size range: 50-300 KLOC
Embedded mode
• Project has tight constraints
• Experienced people are hard to find
• Requires significant innovation
• Development environment is complex
• Size range: over 300 KLOC
COCOMO: Some Assumptions
• The primary cost driver is the number of Delivered Source Instructions (DSI), i.e. delivered lines of code, developed by the project
• COCOMO estimates assume that the project will enjoy good management by both the developer and the customer
Basic COCOMO Model: Formula
The basic COCOMO equations:
• $E = a_b\,(\mathrm{KLOC})^{b_b}$
• $D = c_b\,E^{d_b}$
• $P = E / D$
where
• E is the effort applied in person-months,
• D is the development time in months,
• KLOC (or KDSI) is the estimated number of delivered lines of code for the project, expressed in thousands,
• P is the number of people required, and
• $a_b$, $b_b$, $c_b$ and $d_b$ are coefficients given on the next slide.
Contd…
Software project   a_b   b_b    c_b   d_b
Organic            2.4   1.05   2.5   0.38
Semi-detached      3.0   1.12   2.5   0.35
Embedded           3.6   1.20   2.5   0.32
Basic COCOMO Model: Equation

Mode           Effort                              Schedule
Organic        $E = 2.4\,(\mathrm{KDSI})^{1.05}$   $TDEV = 2.5\,E^{0.38}$
Semidetached   $E = 3.0\,(\mathrm{KDSI})^{1.12}$   $TDEV = 2.5\,E^{0.35}$
Embedded       $E = 3.6\,(\mathrm{KDSI})^{1.20}$   $TDEV = 2.5\,E^{0.32}$

Basic COCOMO Model: Example
We have determined our project fits the characteristics of Semi-Detached mode. We estimate
our project will have 32,000 Delivered Source Instructions. Using the formulas, we can
estimate: Effort, Schedule, productivity, Average staffing
• Effort = $3.0 \times 32^{1.12} \approx 146$ man-months (MM)
• Schedule = $2.5 \times 146^{0.35} \approx 14$ months
• Productivity = 32,000 DSI / 146 MM ≈ 219 DSI/MM
• Average staffing = 146 MM / 14 months ≈ 10 FSP (full-time-equivalent software personnel)
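The calculation above can be reproduced with a short script. This is a sketch, not a reference implementation: the coefficients are those of the Basic COCOMO table in these slides, while the function name `basic_cocomo` and the result keys are hypothetical.

```python
# (a, b, c, d) per development mode, from the Basic COCOMO coefficient table.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc: float, mode: str) -> dict:
    """Estimate effort, schedule, staffing and productivity for a given size."""
    a, b, c, d = COEFFICIENTS[mode]
    effort = a * kloc ** b        # person-months
    schedule = c * effort ** d    # months
    return {
        "effort_pm": effort,
        "schedule_months": schedule,
        "average_staff": effort / schedule,
        "productivity_loc_per_pm": kloc * 1000 / effort,
    }


if __name__ == "__main__":
    for key, value in basic_cocomo(32, "semi-detached").items():
        print(f"{key}: {value:.2f}")
```

For the 32 KDSI semi-detached example this gives roughly 146 person-months and a little over 14 months. Productivity and staffing come out marginally different from the slide (about 220 DSI/MM and about 10.2 people) because the slide rounds effort and schedule before dividing.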
Basic COCOMO Model: Example
Suppose that a project was estimated to be 400 KLOC. Calculate the effort and development time for each of the three modes, i.e. organic, semidetached and embedded.
The basic COCOMO equations take the form:
$E = a_b\,(\mathrm{KLOC})^{b_b}$
$D = c_b\,E^{d_b}$
Basic COCOMO Model: Example cont…
$E = a_b\,(\mathrm{KLOC})^{b_b}$
$D = c_b\,E^{d_b}$

Organic mode: $E = 2.4 \times 400^{1.05} = 1295.31$ PM
$D = 2.5 \times 1295.31^{0.38} = 38.07$ M

Semidetached mode: $E = 3.0 \times 400^{1.12} = 2462.79$ PM
$D = 2.5 \times 2462.79^{0.35} = 38.45$ M

Embedded mode: $E = 3.6 \times 400^{1.20} = 4772.81$ PM
$D = 2.5 \times 4772.81^{0.32} \approx 37.60$ M
• A project of size 200 KLOC is to be developed. The software development team has average experience with similar types of projects. The project schedule is not very tight. Calculate the effort, development time, average staff size and productivity of the project.
Solution
• The semi-detached mode is the most appropriate mode, keeping in view the size, schedule and experience of the development team.
• Effort: $E = 3.0 \times 200^{1.12} = 1133.12$ PM
• Development time: $D = 2.5 \times 1133.12^{0.35} \approx 29.3$ M
• Average staff size: $E / D = 1133.12 / 29.3 \approx 38.67$ persons
• Productivity = KLOC / E = 200 / 1133.12 = 0.1765 KLOC/PM ≈ 176 LOC/PM
The COCOMO model
• Developed at TRW, a US defense contractor
• Based on a cost database of more than 60 different projects
• Exists in three stages:
  • Basic: gives a 'ball-park' estimate based on product attributes
  • Intermediate: modifies the basic estimate using project and process attributes
  • Advanced: estimates project phases and parts separately
The COCOMO model
Three modes:
• Organic mode: relatively simple projects in which small teams work to a
set of informal requirements
• Semidetached mode: an intermediate project in which mixed teams must
work to a set of rigid and less than rigid requirements
• Embedded mode: a project that must operate within a tight set of constraints (e.g. flight control software for aircraft).
BASIC COCOMO Formula
$E = a\,(\mathrm{KLOC})^{b}$
$TDEV = c\,E^{d}$

Mode            a     b      c     d
Organic         2.4   1.05   2.5   0.38
Semi-detached   3.0   1.12   2.5   0.35
Embedded        3.6   1.20   2.5   0.32

Table: the constant values a, b, c and d of the Basic Model for the different categories of system
Intermediate COCOMO
• Takes basic COCOMO as starting point
• Identifies personnel, product, computer and project attributes which
affect cost
• Multiplies basic cost by attribute multipliers which may increase or
decrease costs
Effort Multipliers

Cost Driver   Description                     Very Low   Low    Nominal   High   Very High   Extra High
Product
RELY          Required software reliability   0.75       0.88   1.00      1.15   1.40        -
DATA          Database size                   -          0.94   1.00      1.08   1.16        -
CPLX          Product complexity              0.70       0.85   1.00      1.15   1.30        1.65
Computer
TIME          Execution time constraint       -          -      1.00      1.11   1.30        1.66
STOR          Main storage constraint         -          -      1.00      1.06   1.21        1.56
VIRT          Virtual machine volatility      -          0.87   1.00      1.15   1.30        -
TURN          Computer turnaround time        -          0.87   1.00      1.07   1.15        -
Personnel
ACAP          Analyst capability              1.46       1.19   1.00      0.86   0.71        -
AEXP          Applications experience         1.29       1.13   1.00      0.91   0.82        -
PCAP          Programmer capability           1.42       1.17   1.00      0.86   0.70        -
VEXP          Virtual machine experience      1.21       1.10   1.00      0.90   -           -
LEXP          Language experience             1.14       1.07   1.00      0.95   -           -
Project
MODP          Modern programming practices    1.24       1.10   1.00      0.91   0.82        -
TOOL          Software tools                  1.24       1.10   1.00      0.91   0.83        -
SCED          Development schedule            1.23       1.08   1.00      1.04   1.10        -
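The multipliers in the table combine by multiplication: the product of the selected ratings (often called the effort adjustment factor) scales the nominal effort. The sketch below copies four rows from the table. The function names, the choice of drivers and the sample ratings are hypothetical, and drivers left out of the ratings are assumed to be nominal (1.00). Boehm's Intermediate model also uses its own nominal coefficients, which these slides do not list, so the nominal effort is taken here as an input.

```python
from math import prod

# Subset of the effort multiplier table above (rating -> multiplier).
MULTIPLIERS = {
    "RELY": {"very low": 0.75, "low": 0.88, "nominal": 1.00, "high": 1.15, "very high": 1.40},
    "CPLX": {"very low": 0.70, "low": 0.85, "nominal": 1.00, "high": 1.15,
             "very high": 1.30, "extra high": 1.65},
    "ACAP": {"very low": 1.46, "low": 1.19, "nominal": 1.00, "high": 0.86, "very high": 0.71},
    "TOOL": {"very low": 1.24, "low": 1.10, "nominal": 1.00, "high": 0.91, "very high": 0.83},
}


def effort_adjustment_factor(ratings: dict) -> float:
    """Multiply the multipliers of all rated cost drivers."""
    return prod(MULTIPLIERS[driver][rating] for driver, rating in ratings.items())


def adjusted_effort(nominal_effort_pm: float, ratings: dict) -> float:
    """Scale a nominal effort estimate by the effort adjustment factor."""
    return nominal_effort_pm * effort_adjustment_factor(ratings)


if __name__ == "__main__":
    # Hypothetical project: high reliability, very complex, capable analysts, weak tooling.
    ratings = {"RELY": "high", "CPLX": "very high", "ACAP": "high", "TOOL": "low"}
    print(effort_adjustment_factor(ratings))   # 1.15 * 1.30 * 0.86 * 1.10
    print(adjusted_effort(100.0, ratings))
```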
Advanced COCOMO model
• The Advanced COCOMO model computes effort as a function of
program size and a set of cost drivers weighted according to each
phase of the software lifecycle. The Advanced model applies the
Intermediate model at the component level, and then a phase-based
approach is used to consolidate the estimate (Fenton, 1997).
• The 4 phases used in the detailed COCOMO model are: requirements
planning and product design (RPD), detailed design (DD), code and
unit test (CUT), and integration and test (IT). Each cost driver is
broken down by phase as in the example shown in Table (Boehm,
1981).
Cost Driver   Rating      RPD    DD     CUT    IT
ACAP          Very Low    1.80   1.35   1.35   1.50
              Low         0.85   0.85   0.85   1.20
              Nominal     1.00   1.00   1.00   1.00
              High        0.75   0.90   0.90   0.85
              Very High   0.55   0.75   0.75   0.70

Model tuning
• All numbers in cost model are organization specific. The parameters
of the model must be modified to adapt it to local needs
• A statistically significant database of detailed cost information is
necessary
Staffing requirements
• Staff required can't be computed simply by dividing the development effort by the required schedule
• The number of people working on a project varies depending on the
phase of the project
• The more people who work on the project, the more total effort is
usually required
• Very rapid build-up of people often correlates with schedule slippage
Software Equation
• Putnam used some empirical observations about productivity levels to derive the software equation from the basic Rayleigh curve formula (Fenton, 1997). The software equation is expressed as:

$\text{Size} = C\,E^{1/3}\,t^{4/3}$

• where C is a technology factor, E is the total project effort in person-years, and t is the elapsed time to delivery in years.
Technology Factor
• The technology factor is a composite cost driver involving 14 components. It primarily reflects:
  • Overall process maturity and management practices
  • The extent to which good software engineering practices are used
  • The level of programming languages used
  • The state of the software environment
  • The skills and experience of the software team
  • The complexity of the application
• The software equation includes a fourth power and therefore has strong implications for resource allocation on large projects. Relatively small extensions in delivery date can result in substantial reductions in effort (Pressman, 1997).
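To see where the fourth power comes from, the software equation $\text{Size} = C\,E^{1/3}\,t^{4/3}$ can be solved for effort. The derivation below is a short sketch; the 10% schedule extension in the last line is an illustrative figure, not taken from the slides.

```latex
\begin{align*}
\text{Size} &= C\,E^{1/3}\,t^{4/3} \\
E^{1/3} &= \frac{\text{Size}}{C\,t^{4/3}} \\
E &= \frac{\text{Size}^{3}}{C^{3}\,t^{4}} \\
\frac{E(1.1\,t)}{E(t)} &= \frac{1}{1.1^{4}} \approx 0.68
\end{align*}
```

Under this model, and holding size and the technology factor fixed, a 10% longer schedule therefore corresponds to roughly a third less effort.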
Putnam SLIM Model
• The Putnam model is an empirical software effort estimation model.

• The Putnam model describes the time and effort required to finish a
software project of specified size.

• SLIM (Software LIfecycle Management) is the name given by Putnam to the proprietary suite of tools his company QSM, Inc. has developed based on his model.

• It is one of the earliest of these types of models developed, and is among the most widely used.
• While managing R&D projects for the Army and later at GE, Putnam noticed software staffing profiles
followed the well-known Rayleigh distribution.

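• Putnam used his observations about productivity levels to derive the software equation:

$\dfrac{B^{1/3} \cdot \text{Size}}{\text{Productivity}} = \text{Effort}^{1/3} \cdot \text{Time}^{4/3}$

Where: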

• Size is the product size.

• B is a scaling factor and is a function of the project size.

• Productivity is the Process Productivity, the ability of a particular software organization to produce software
of a given size at a particular defect rate.

• Effort is the total effort applied to the project in person-years.

• Time is the total schedule of the project in years.


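• In practical use, when making an estimate for a software task the software equation is solved for effort:

$\text{Effort} = \left[\dfrac{\text{Size}}{\text{Productivity} \cdot \text{Time}^{4/3}}\right]^{3} \cdot B$

• An estimated software size at project completion and the organizational process productivity are used.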
• Plotting effort as a function of time yields the Time-Effort Curve.
• The points along the curve represent the estimated total effort to complete
the project at some time.
• One of the distinguishing features of the Putnam model is that total effort
decreases as the time to complete the project is extended.
• This is normally represented in other parametric models with a schedule
relaxation parameter.
• This estimating method is fairly sensitive to uncertainty in both size and process productivity. Putnam advocates obtaining process productivity by calibration:

$\text{Process Productivity} = \dfrac{\text{Size}}{(\text{Effort}/B)^{1/3} \cdot \text{Time}^{4/3}}$

• Putnam makes a sharp distinction between 'conventional productivity' (size / effort) and process productivity.
• One of the key advantages to this model is the simplicity with which it is
calibrated.
• Most software organizations, regardless of maturity level can easily collect
size, effort and duration (time) for past projects.
• Process Productivity, being exponential in nature is typically converted to a
linear productivity index an organization can use to track their own
changes in productivity and apply in future effort estimates.
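The two uses of the model described above, estimating effort and calibrating process productivity from past projects, can be sketched in a few lines. The sketch assumes the commonly published form of Putnam's equation, $B^{1/3} \cdot \text{Size} / \text{Productivity} = \text{Effort}^{1/3} \cdot \text{Time}^{4/3}$. The function names and all numeric inputs (size, schedule, B, productivity) are hypothetical values chosen only to show the mechanics.

```python
def putnam_effort(size: float, productivity: float, time_years: float, b: float) -> float:
    """Software equation solved for effort (person-years)."""
    return b * (size / (productivity * time_years ** (4 / 3))) ** 3


def process_productivity(size: float, effort_py: float, time_years: float, b: float) -> float:
    """Calibrate process productivity from a completed project's size, effort and time."""
    return size / ((effort_py / b) ** (1 / 3) * time_years ** (4 / 3))


if __name__ == "__main__":
    size, b = 100_000, 0.39   # hypothetical size in LOC and scaling factor
    productivity = 10_000     # hypothetical process productivity

    # Time-effort trade-off: total effort falls as the schedule is extended.
    for years in (2.0, 2.5, 3.0):
        print(years, round(putnam_effort(size, productivity, years, b), 1))

    # Calibration recovers the productivity used to generate an estimate.
    effort = putnam_effort(size, productivity, 2.0, b)
    print(process_productivity(size, effort, 2.0, b))
```

Because effort scales with the inverse fourth power of time in this form, small errors in the schedule or productivity inputs change the estimate substantially, which matches the sensitivity noted above.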
Putnam Slim Model
RCA PRICE Model
• The PRICE-S model (Programming Review of Information Costing and Evaluation--Software) is developed and supported by RCA, Inc.

• PRICE S is a commercially available macro cost-estimation model developed primarily for embedded system applications.

• It has improved steadily with experience.

Contd…
• PRICE S has extended a number of cost estimating relationships developed
in the early 1970's such as the hardware constraint function shown in Fig.

Contd…
• It was primarily developed to handle military software projects,
but now also includes rating levels to cover business
applications.
• PRICE S also provides a wide range of useful outputs on gross
phase and activity distributions analyses, and monthly project
cost-schedule-expected progress forecasts.
• PRICE S uses a two-parameter beta distribution rather than a
Rayleigh curve to calculate development effort distribution
versus calendar time.
Contd…

• PRICE S has recently added a software life-cycle support cost estimation capability called PRICE SL.

• It involves the definition of three categories of support activities (Growth, Enhancement & Maintenance).

Growth

• The estimator specifies the amount of code to be added to the product.

• PRICE SL then uses its standard techniques to estimate the resulting life-cycle effort distribution.

Enhancement

• PRICE SL estimates the fraction of the existing product which will be modified (the estimator may provide his own fraction), and uses its standard techniques to estimate the resulting life-cycle effort distribution.

Maintenance

• The estimator provides a parameter indicating the quality level of the developed code.

• PRICE SL uses this to estimate the effort required to eliminate remaining errors.
Contd…
• An important disadvantage compared with COCOMO is that the underlying concepts and ideas are not publicly defined, so users are presented with the model as a black box.

• The user of PRICE sends the input to a time-sharing computer in the USA, UK, or France and gets the estimates back immediately.

Doty model
• This model is the result of an extensive data analysis activity,
including many of the data points from the SDC sample.

• A number of models of similar form were developed for different application areas.

• Effort is estimated from the number of lines of code using the following equations:

$\text{Effort} = 5.288\,(\mathrm{KLOC})^{1.047}$, for KLOC ≥ 10

$\text{Effort} = 2.060\,(\mathrm{KLOC})^{1.047} \left(\prod_{j=1}^{14} f_j\right)$, for KLOC < 10

The effort multipliers $f_j$ are shown in Table.

• This model has a much more appropriate functional form than the SDC
model, but it has some problems with stability, as it exhibits a
discontinuity at KLOC = 10, and produces widely varying estimates via
the f factors (answering "yes" to "first software developed on CPU" adds
92 percent to the estimated cost).
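The discontinuity at KLOC = 10 is easy to demonstrate numerically. The sketch below implements the two Doty equations. The table of effort multipliers is not reproduced in these slides, so by default every $f_j$ is assumed to be 1.0; the function name and the sample sizes are hypothetical.

```python
from math import prod


def doty_effort(kloc: float, factors=None) -> float:
    """Doty model: one equation for KLOC >= 10, another (with multipliers) below 10."""
    if kloc >= 10:
        return 5.288 * kloc ** 1.047
    # Assumption: all 14 multipliers default to 1.0 because the table is not available here.
    f = factors if factors is not None else [1.0] * 14
    return 2.060 * kloc ** 1.047 * prod(f)


if __name__ == "__main__":
    just_below = doty_effort(9.999)
    at_ten = doty_effort(10.0)
    # With neutral multipliers the estimate jumps by about 5.288 / 2.060 at the boundary.
    print(just_below, at_ten, at_ten / just_below)

    # A single multiplier of 1.92 (the "adds 92 percent" case) scales the small-project estimate.
    with_new_cpu = doty_effort(5.0, [1.92] + [1.0] * 13)
    print(with_new_cpu / doty_effort(5.0))
```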

IBM Federal Systems Division (FSD) Model / Walston and Felix
• This model was developed by Walston and Felix in 1977 from a database of 60 projects at the IBM Federal Systems Division (FSD), analyzing a range of project features.

• Size is measured in delivered lines of source code.

• The functions for estimating effort and duration are

$\text{Effort} = 5.2\,(\mathrm{KLOC})^{0.91}$
$\text{Duration } d = 4.1\,(\mathrm{KLOC})^{0.36}$
• IBM FSD is able to systematically compare a large number of projects
on a number of factors regarding cost, delays and methods used.
• This enables them to spot methods or environments which are more
or less productive, and to take management action to weed out the
bad and to nurture the good.
SEL model

• The Software Engineering Laboratory established a model, called the SEL model, for estimating its software production. This model is an example of a static, single-variable model.
• $E = 1.4\,L^{0.93}$
  $DOC = 30.4\,L^{0.90}$
  $D = 4.6\,L^{0.26}$
• where E = effort (person-months)
  DOC = documentation (number of pages)
  D = duration (months)
  L = size in thousands of lines of code (KLOC)
Benefits of Software Cost Estimation Technology

The major benefit of a good software cost estimation model is that it provides
a clear and consistent universe of discourse within which to address a good
many of the software engineering issues which arise throughout the software
life cycle
• Which and how many features should we put into the software product?
• Which features should we put in first?
• How much hardware should we acquire to support the software product's
development, operation, and maintenance?
• How much money and how much calendar time should we allow for
software development?
Benefits of Software Cost Estimation Technology
• How much of the product should we adapt from existing software?
• How much should we invest in tools and training?
• A software cost estimation model can help avoid misinterpretations, underestimates and overestimates
• In a good cost-estimation model, there is no way of reducing the estimated software cost
without changing some objectively verifiable property of the software project
• A related benefit of software cost estimation technology is that it provides a powerful set
of insights on how a software organization can improve its productivity
• Many of a software cost model's cost-driver attributes are management controllable: use of software tools and modern programming practices, personnel capability and experience, available computer speed, memory, and turnaround time, software reuse.
• Software cost estimation technology provides an absolutely essential foundation for
software project planning and control.
Compare the Walston-Felix Model with the SEL model on a software
development expected to involve 8 person-years of effort.
a) Calculate the number of lines of source code that can be produced.
b) Calculate the duration of the development.
c) Calculate the productivity in LOC/PY
d) Calculate the average manning
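• The amount of manpower involved = 8 PY = 96 person-months

(a) The number of lines of source code can be obtained by inverting the effort equation $E = a\,L^{b}$ to give:

$L = (E / a)^{1/b}$

Then
L (SEL) = $(96/1.4)^{1/0.93}$ = 94.264 KLOC = 94264 LOC
L (W-F) = $(96/5.2)^{1/0.91}$ = 24.632 KLOC = 24632 LOC

(b) Duration in months can be calculated from the duration equations:

D (SEL) = $4.6\,L^{0.26} = 4.6 \times 94.264^{0.26} \approx 15$ months
D (W-F) = $4.1\,L^{0.36} = 4.1 \times 24.632^{0.36} \approx 13$ months

(c) Productivity is the lines of code produced per person-year:

P (SEL) = 94264 / 8 = 11783 LOC/PY
P (W-F) = 24632 / 8 = 3079 LOC/PY

(d) Average manning is the average number of persons required per month in the project:

M (SEL) = 96 PM / 15 months = 6.4 persons
M (W-F) = 96 PM / 13 months ≈ 7.4 persons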
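The whole comparison can be checked with a few lines of code. This sketch uses the effort and duration coefficients of the SEL and Walston-Felix models as given in these slides, with size in KLOC; the function name `compare` and the result keys are hypothetical.

```python
# Effort E = a * L**b (person-months) and duration D = c * L**d (months), L in KLOC.
MODELS = {
    "SEL": {"a": 1.4, "b": 0.93, "c": 4.6, "d": 0.26},
    "Walston-Felix": {"a": 5.2, "b": 0.91, "c": 4.1, "d": 0.36},
}


def compare(effort_person_years: float) -> dict:
    """Invert each effort equation for size, then derive duration, productivity, manning."""
    effort_pm = effort_person_years * 12
    results = {}
    for name, m in MODELS.items():
        kloc = (effort_pm / m["a"]) ** (1 / m["b"])   # (a) size from effort
        duration = m["c"] * kloc ** m["d"]             # (b) duration in months
        results[name] = {
            "loc": kloc * 1000,
            "duration_months": duration,
            "productivity_loc_per_py": kloc * 1000 / effort_person_years,  # (c)
            "average_manning": effort_pm / duration,                        # (d)
        }
    return results


if __name__ == "__main__":
    for name, values in compare(8).items():
        print(name, {key: round(value, 1) for key, value in values.items()})
```

Unrounded durations are used here, so the manning figures can differ slightly from a hand calculation that rounds the duration to whole months first.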