Software Cost Estimation
function selectionSort(int x[], int n) {
    int i, j, min, temp;
    for (i = 0; i < n - 1; i++) {
        min = i;
        // find the index of the smallest remaining element
        for (j = i + 1; j < n; j++)
            if (x[j] < x[min])
                min = j;
        temp = x[i];
        x[i] = x[min];
        x[min] = temp;
    }
}
If LOC is simply a count of the number of lines, then the function shown above contains 13 lines of code (LOC). When comments and blank lines are ignored, it contains 12 lines of code (LOC).
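The counting rules above can be sketched in a short script. `count_loc` is a hypothetical helper (not a standard tool) that returns both the physical line count and the count that ignores comments and blank lines:

```python
def count_loc(source: str) -> tuple[int, int]:
    """Count (physical_lines, effective_loc) for C-style source text.

    Effective LOC ignores blank lines and lines that are only a
    '//' comment or a one-line '/* ... */' comment (for simplicity,
    multi-line block comments are not handled).
    """
    lines = source.splitlines()
    effective = 0
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped.startswith("//"):
            continue  # single-line comment
        if stripped.startswith("/*") and stripped.endswith("*/"):
            continue  # one-line block comment
        effective += 1
    return len(lines), effective

snippet = """int add(int a, int b)
{
    // return the sum
    return a + b;
}"""
print(count_loc(snippet))  # (5, 4): 5 physical lines, 4 effective LOC
```

Applied to the selection-sort function above, the same rules give 13 physical lines and 12 effective LOC.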
2. Function Point (FP):
In the function point metric, the number and types of functions supported by the software are used to find the FPC (function point count).
Both function point and LOC are measurement units for the size of the software. An estimate of software size is needed early in development to come up with accurate estimates of the effort, cost, and duration of a project. Most parametric estimation models, such as the Constructive Cost Model (COCOMO), accept size expressed in either FP or LOC as an input.
Difference between LOC and Function Point:
- Function Point is used for data processing systems, whereas LOC is used for calculating the size of a computer program.
- Function Point can be used to portray the project time, whereas LOC is used for calculating and comparing the productivity of programmers.
In general, people prefer the functional size of software expressed as Function Points for one very important reason: the size expressed using the function point metric stays constant regardless of which language or languages are used.
COCOMO Model
Boehm proposed COCOMO (Constructive Cost Model) in 1981. COCOMO is one of the most widely used software estimation models in the world. COCOMO predicts the effort and schedule of a software product based on the size of the software.
The initial estimate (also called nominal estimate) is determined by an equation of the
form used in the static single variable models, using KDLOC as the measure of the size.
To determine the initial effort Ei in person-months, the equation used is of the type shown below:
Ei = a * (KDLOC)^b
The values of the constants a and b depend on the project type.
1. Organic: A development project can be treated as being of the organic type if the project deals with developing a well-understood application program, the size of the development team is reasonably small, and the team members are experienced in developing similar projects. Examples of this type of project are simple business systems, simple inventory management systems, and data processing systems.
For three product categories, Boehm provides a different set of expressions to predict effort (in units of person-months) and development time from the size estimate in KLOC (kilo lines of code). Effort estimation takes into account the productivity loss due to holidays, weekly offs, coffee breaks, etc.
According to Boehm, software cost estimation should be done through three stages:
1. Basic Model
2. Intermediate Model
3. Detailed Model
1. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of the project parameters. The following expressions give the basic COCOMO estimation model:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 months
Where
KLOC is the estimated size of the software product, expressed in kilo lines of code,
a1, a2, b1, b2 are constants for each category of software product,
Tdev is the estimated time to develop the software, expressed in months,
Effort is the total effort required to develop the software product, expressed in person-months (PMs).
For the three classes of software products, the formulas for estimating the effort based on the code size are shown below:
Organic: Effort = 2.4(KLOC)^1.05 PM
Semi-detached: Effort = 3.0(KLOC)^1.12 PM
Embedded: Effort = 3.6(KLOC)^1.20 PM
For the three classes of software products, the formulas for estimating the development time based on the effort are given below:
Organic: Tdev = 2.5(Effort)^0.38 months
Semi-detached: Tdev = 2.5(Effort)^0.35 months
Embedded: Tdev = 2.5(Effort)^0.32 months
Some insight into the basic COCOMO model can be obtained by plotting the estimated characteristics for different software sizes. A plot of estimated effort versus product size shows that the effort is somewhat superlinear in the size of the software product. Thus, the effort required to develop a product increases very rapidly with project size.
When the development time is plotted against the product size in KLOC, it can be observed that the development time is a sublinear function of the size of the product; i.e., when the size of the product increases by two times, the time to develop the product does not double but rises moderately. This can be explained by the fact that for larger products, a larger number of activities that can be carried out concurrently can be identified. The parallel activities can be carried out simultaneously by the engineers, which reduces the time to complete the project. Further, it can be observed that the development time is roughly the same for all three categories of products. For example, a 60 KLOC program can be developed in approximately 18 months, regardless of whether it is of the organic, semidetached, or embedded type.
From the effort estimation, the project cost can be obtained by multiplying the required
effort by the manpower cost per month. But, implicit in this project cost computation is
the assumption that the entire project cost is incurred on account of the manpower cost
alone. In addition to manpower cost, a project would incur costs due to hardware and
software required for the project and the company overheads for administration, office
space, etc.
It is important to note that the effort and the duration estimations obtained using the
COCOMO model are called a nominal effort estimate and nominal duration estimate.
The term nominal implies that if anyone tries to complete the project in a time shorter
than the estimated duration, then the cost will increase drastically. But, if anyone
completes the project over a longer period of time than the estimated, then there is
almost no decrease in the estimated cost value.
Example 1: Suppose a project is estimated at 400 KLOC. Calculate the effort and development time for each of the three modes, i.e., organic, semi-detached, and embedded.
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 months
Estimated size of project = 400 KLOC
(i) Organic Mode
E = 2.4 * (400)^1.05 = 1295.31 PM
D = 2.5 * (1295.31)^0.38 = 38.07 months
(ii) Semi-detached Mode
E = 3.0 * (400)^1.12 = 2462.79 PM
D = 2.5 * (2462.79)^0.35 = 38.45 months
(iii) Embedded Mode
E = 3.6 * (400)^1.20 = 4772.81 PM
D = 2.5 * (4772.81)^0.32 = 37.60 months
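The three calculations above can be reproduced programmatically. The sketch below hard-codes the basic COCOMO constants implied by this example:

```python
# Basic COCOMO constants per mode: (a, b) for effort, (c, d) for Tdev
MODES = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc: float, mode: str) -> tuple[float, float]:
    """Return (effort in person-months, development time in months)."""
    a, b, c, d = MODES[mode]
    effort = a * kloc ** b   # Effort = a * (KLOC)^b
    tdev = c * effort ** d   # Tdev   = c * (Effort)^d
    return effort, tdev

for mode in MODES:
    e, t = basic_cocomo(400, mode)
    print(f"{mode}: effort = {e:.2f} PM, tdev = {t:.2f} months")
```

For a 400 KLOC project this prints effort values of roughly 1295, 2463, and 4773 PM, with a development time of about 38 months in all three modes, matching the worked example.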
The Software Equation
The software equation is a dynamic multivariable model that assumes a specific distribution of effort
over the life of a software development project. The model has been derived from productivity data
collected for over 4000 contemporary software projects. Based on these data, an estimation model of
the form
E = [LOC × B^0.333 / P]^3 × (1 / t^4)
has been developed, where
E = effort in person-months or person-years
t = project duration in months or years
B = “special skills factor”
P = “productivity parameter,” which reflects:
• Overall process maturity and management practices
• The extent to which good software engineering practices are used
• The level of programming languages used
• The state of the software environment
• The skills and experience of the software team
• The complexity of the application
Typical values might be P = 2,000 for development of real-time embedded software; P = 10,000 for
telecommunication and systems software; P = 28,000 for business systems applications. The
productivity parameter can be derived for local conditions using historical data collected from past
development efforts. It is important to note that the software equation has two independent
parameters:
(1) an estimate of size (in LOC) and (2) an indication of project duration in calendar months or years.
To simplify the estimation process and use a more common form for their estimation model, Putnam
and Myers suggest a set of equations derived from the software equation. Minimum development time is defined as
tmin = 8.14 (LOC/P)^0.43 (in months, for tmin > 6 months)
and the corresponding effort as
E = 180 × B × t^3 (in person-months, for E ≥ 20 person-months, with t in years)
For example, for an estimated size of 33,200 LOC with P = 12,000 and B = 0.28:
tmin = 8.14 (33200/12000)^0.43 = 12.6 calendar months
E = 180 × 0.28 × (1.05)^3 = 58 person-months
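The Putnam–Myers calculation above can be checked with a small script (`putnam_min_time` and `putnam_effort` are illustrative names, not a library API):

```python
def putnam_min_time(loc: float, p: float) -> float:
    """Minimum development time in months: tmin = 8.14 * (LOC/P)^0.43."""
    return 8.14 * (loc / p) ** 0.43

def putnam_effort(b: float, t_years: float) -> float:
    """Effort in person-months: E = 180 * B * t^3, with t in years."""
    return 180 * b * t_years ** 3

tmin = putnam_min_time(33200, 12000)     # ~12.6 calendar months
effort = putnam_effort(0.28, tmin / 12)  # ~58 person-months
print(round(tmin, 1), round(effort))
```

Note that tmin is produced in months, so it is divided by 12 before being fed into the effort equation, which expects years.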
Estimation Technique in Agile
Agile estimation gauges the effort needed to complete a prioritized task in the product backlog. Usually measured in terms of the time needed for task completion, this effort leads to accurate sprint planning. Agile estimates are also made with reference to story points: a number that enables evaluation of the difficulty of completing a user story. However, it must be realized that despite accurately estimating the effort, one must keep room for impediments and not strive for perfect accuracy. Changing requirements, Agile anti-patterns, and other realities alter the course of development.
Agile estimation techniques share some common characteristics:
They are collaborative: Involving everyone on the Agile development team is one of the best practices because collaborative efforts lead to better estimates. Collaborative techniques also put an end to the blame game for an incorrect estimate.
They are designed to be fast: Faster than any traditional techniques, Agile estimation is not about
predicting the future. Rather it recognizes that estimations are a non-value-added activity and tries to
minimize them.
Use of relative units: Estimation is not in terms of dollars or days; instead, “points” or qualitative
labels are employed for estimating or comparing tasks.
A bottom-up approach has been a part of traditional project management methods. An individual
or a team spends time upfront to formulate a schedule, plan out the tasks and deliverables, and
break down each to estimate costs and hours starting from the end.
Once things get sorted, a project manager dedicates their time to keeping teams on track, abiding by the set deadlines and allocated hours for each deliverable.
Contrastingly, in Agile project management, the order gets flipped, and you use gross-level
estimation. It begins with a broad estimate for various project parts and narrows down to
specifics, continually refining as more information becomes available. Without much practice,
this estimation approach can be challenging for many.
As stated, determining and implementing an Agile estimation technique should not be limited to the product owner's or scrum master's list of job duties. Rather, involving the entire Agile development team can lead to better estimates because:
Quicker assignment: Anyone can be chosen from the team to complete a task that appears at the top
of the product backlog list. When everybody is involved, they have a better idea of the user story
demands and the corresponding estimation.
Helps avoid overestimation/underestimation: The retrospective sessions and the team's prior experience help in understanding the exact details of an item and the effort it would entail. Collaborative efforts help progress with better clarity, reducing the chance of inaccurate estimates.
The following are some of the most widely used Agile estimation techniques:
1. Three-point Estimate
When faced with inaccuracy problems due to commitment to a particular estimate before any
work began, the three-point estimate technique was introduced.
No matter if the same team is working on the same type of work, estimates can turn unrealistic
when you fall into the trap of thinking that because you have done well before, you will be able
to do it again.
Method: The three-point estimation technique requires the creation of three values and finding their mean:
Optimistic estimate
Most likely estimate
Pessimistic estimate
For example, for Activity A the three estimated times are 4 hours (optimistic), 8 hours (most likely), and 16 hours (pessimistic):
E = (4 + 8 + 16) / 3
E = 28 / 3 = 9.3 hours
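The three-point mean used above is straightforward to compute:

```python
def three_point_estimate(optimistic: float, most_likely: float,
                         pessimistic: float) -> float:
    """Simple three-point estimate: the mean of the three values."""
    return (optimistic + most_likely + pessimistic) / 3

# Activity A from the example: 4, 8, and 16 hours
e = three_point_estimate(4, 8, 16)
print(round(e, 1))  # 9.3 hours
```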
2. Planning Poker
One of the most commonly adopted Agile estimation techniques is planning poker.
Method: The team members vote for an estimate of an item using specially-numbered playing
cards. Anonymous voting takes place, and discussions are held regarding large differences. The
process is repeated until the entire team reaches a consensus about the accurate estimation.
The technique is suitable for estimating a relatively small number of items (maximum 10) in a
team of 5-8 people.
3. Dot Voting
Another simple Agile estimation technique, dot voting is ideal for a relatively small set of items. It originates from decision-making processes and suits both small and large teams.
Method: Every participating member gets a small number of dot stickers to vote for the
individual items. More dots correspond to more effort and time requirements. On the other hand,
fewer dots indicate that the task is fairly straightforward and quick to achieve.
The Relative Mass Valuation and the Challenge, Estimate, and Override methods are variations
of this technique.
4. Ordering Method
Method: This technique sets items in order from low to high. The members participating in the estimation process take turns to either move an item up or down one spot, hold a discussion, or pass. The process is complete once everyone passes on their turn.
5. T-Shirt Sizes
One of the simplest ways is to categorize items into t-shirt sizes: XS, S, M, L, and XL. It is an informal technique that can be used with many items for a quick estimation.
Method: Extra Small (XS), Small (S), Medium (M), Large (L), and Extra Large (XL) act as units for this Agile estimation technique. The sizes can be assigned numerical values post-estimation. The decisions are based on open and collaborative discussions. An occasional vote can be taken to break a stalemate.
6. Buckets
This technique is similar to planning poker but aims for consensus via discussion and assigning
values to every task.
Method: The facilitator begins by placing a task in the middle, and they continue to read and add
tasks into the buckets relative to the first one. Team members can divide tasks, bucket them, and
come back together for review. Discussions are held before finalizing the estimates.
7. Large, Small, Uncertain
Method: There are only three possible values to assign. You begin by discussing and then divide
and conquer to add up tasks to the large, small, or uncertain groups.
8. Affinity Grouping
Method: The team members group similar items. Items that are related in scope and effort are placed together to get a clear set of groups.
As in other techniques, you can use the same values (e.g., the Fibonacci sequence). You can even make the groups broader to bring this method closer to the large, small, and uncertain method.
Lines of code and functional point metrics can be used for estimating object-oriented
software projects. However, these metrics are not appropriate in the case of
incremental software development as they do not provide adequate details for effort
and schedule estimation. Thus, for object-oriented projects, different sets of metrics have been proposed. These are listed below:
Number of scenario scripts
Number of key classes
Number of support classes
Average number of support classes per key class
Number of subsystems
The afore-mentioned metrics are collected along with other project metrics like effort
used, errors and defects detected, and so on. After an organization completes a
number of projects, a database is developed, which shows the relationship between
object-oriented measure and project measure. This relationship provides metrics that
help in project estimation.
1. Decision Table: A decision table is a tabular representation of all conditions and actions. Decision tables are used whenever the processing logic is very complicated and involves multiple conditions. The main components used for the formation of a decision table are condition stubs, action stubs, and rules.
Types of decision tables:
Extended entry table
Limited entry table
Benefits:
Visualization of Cause and effect relationships in the table.
Easy to understand
In the case of a complex table, it can be readily broken down into simpler tables.
Tables are formatted consistently.
Possible actions to be taken can be suggested from the summarized outcomes of a situation.
In these tables, semi-standardized language might be used.
Table users do not necessarily need to know how to use a computer.
Drawbacks:
Decision tables are not well suited to large-scale applications. There is a requirement
of splitting huge tables into smaller ones to eliminate redundancy.
The complete sequence of actions is not reflected in the decision tables.
A partial solution is presented.
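A limited-entry decision table maps each combination of condition entries to an action, which translates naturally into a lookup structure. The rules below are hypothetical, for illustration only:

```python
# Limited-entry decision table for a toy order-processing example.
# Condition stubs: (in_stock, payment_ok); action stub: a single action.
DECISION_TABLE = {
    (True,  True):  "ship order",
    (True,  False): "request payment",
    (False, True):  "back-order item",
    (False, False): "reject order",
}

def decide(in_stock: bool, payment_ok: bool) -> str:
    """Look up the action for the given combination of conditions."""
    return DECISION_TABLE[(in_stock, payment_ok)]

print(decide(True, False))  # request payment
```

Because every rule is an explicit key, the table makes it easy to verify that all condition combinations are covered, which is the main appeal of decision tables for complex logic.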
2. Decision Tree: A decision tree is a graph that uses a branching method to demonstrate all the possible outcomes of a decision. Decision trees are graphical and give a better representation of decision outcomes. A decision tree consists of three types of nodes, namely decision nodes, chance nodes, and terminal nodes.
Types of the decision tree:
Categorical variable decision tree
Continuous variable decision tree
Benefits:
A decision tree is simple to comprehend and use.
New scenarios are simple to add.
Can be combined with other decision-making methods.
Handling of both numerical and categorical variables.
The classification does not require many computations.
Useful in analyzing and solving various business problems.
Drawbacks:
They are inherently unstable, which means that a slight change in the data can result in a change in the structure of the optimal decision tree, and they are frequently wrong.
These are less suitable for estimation tasks where the required outcome is the value of a continuous variable.
Alternative options may perform better with the same data. A random forest of decision trees can be used as a replacement, but it is not as straightforward to comprehend as a single decision tree.
Calculations can become quite complicated, especially when several values are
uncertain and/or multiple outcomes are related.
Difference between Decision Table and Decision Tree:
- A decision table can be derived from a decision tree, but a decision tree cannot be derived from a decision table.
- In decision tables, we can include more than one ‘or’ condition; in decision trees, we cannot.
- A decision table is used when there is a small number of properties; a decision tree is used when there is a larger number of properties.
- A decision table is used for simple logic only; a decision tree can be used for complex logic as well.
- In a decision table, the logic is derived on the basis of the data entered in the table; in a decision tree, it is derived from a decision’s available possibilities and the range of possible outcomes.
Project size estimation is a crucial aspect of software engineering, as it helps in
planning and allocating resources for the project. Here are some of the popular
project size estimation techniques used in software engineering:
Expert Judgment: In this technique, a group of experts in the relevant field estimates the
project size based on their experience and expertise. This technique is often used when
there is limited information available about the project.
Analogous Estimation: This technique involves estimating the project size based on the
similarities between the current project and previously completed projects. This
technique is useful when historical data is available for similar projects.
Bottom-up Estimation: In this technique, the project is divided into smaller modules or
tasks, and each task is estimated separately. The estimates are then aggregated to arrive at
the overall project estimate.
Three-point Estimation: This technique involves estimating the project size using three
values: optimistic, pessimistic, and most likely. These values are then used to calculate
the expected project size using a formula such as the PERT formula.
Function Points: This technique involves estimating the project size based on the
functionality provided by the software. Function points consider factors such as inputs,
outputs, inquiries, and files to arrive at the project size estimate.
Use Case Points: This technique involves estimating the project size based on the
number of use cases that the software must support. Use case points consider factors such
as the complexity of each use case, the number of actors involved, and the number of use
cases.
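The PERT formula mentioned under three-point estimation weights the most likely value four times as heavily as the two extremes, unlike the simple mean. A minimal sketch:

```python
def pert_estimate(optimistic: float, most_likely: float,
                  pessimistic: float) -> float:
    """PERT (beta distribution) estimate: E = (O + 4M + P) / 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6

# Same values as the earlier Activity A example: 4, 8, 16 hours
print(round(pert_estimate(4, 8, 16), 2))  # 8.67
```

Compared with the simple mean (9.3 hours for the same inputs), the PERT estimate pulls the result toward the most likely value.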
Each of these techniques has its strengths and weaknesses, and the choice of technique
depends on various factors such as the project’s complexity, available data, and the
expertise of the team.
Estimation of the size of the software is an essential part of Software Project
Management. It helps the project manager to further predict the effort and time which
will be needed to build the project. Various measures are used in project size estimation.
Some of these are:
Lines of Code
Number of entities in ER diagram
Total number of processes in detailed data flow diagram
Function points
1. Lines of Code (LOC): As the name suggests, LOC counts the total number of lines of
source code in a project. The units of LOC are:
KLOC- Thousand lines of code
NLOC- Non-comment lines of code
KDSI- Thousands of delivered source instruction
The size is estimated by comparing it with the existing systems of the same kind. The
experts use it to predict the required size of various components of software and then add
them to get the total size.
It’s tough to estimate LOC by analyzing the problem definition. Only after the whole
code has been developed can accurate LOC be estimated. This statistic is of little utility
to project managers because project planning must be completed before development
activity can begin.
Two separate source files having a similar number of lines may not require the same
effort. A file with complicated logic would take longer to create than one with simple
logic. Proper estimation may not be attainable based on LOC.
LOC measures the size of a solution rather than of the problem being solved, and this statistic will differ greatly from one programmer to the next: a seasoned programmer can write the same logic in fewer lines than a novice coder.
Advantages:
Universally accepted and used in many models like COCOMO.
Estimation is closer to the developer's perspective.
It is utilized and accepted by people throughout the world.
At project completion, LOC is easily quantified.
It has a specific connection to the result.
Simple to use.
Disadvantages:
Different programming languages contain a different number of lines.
No proper industry standard exists for this technique.
It is difficult to estimate the size using this technique in the early stages of the project.
When platforms and languages are different, LOC cannot be used to normalize.
2. Number of entities in ER diagram: The ER model provides a static view of the project. It describes the entities and their relationships. The number of entities in the ER model can be used to estimate the size of the project. The number of entities depends on the size of the project: more entities need more classes/structures, thus leading to more code.
Advantages:
Size estimation can be done during the initial stages of planning.
The number of entities is independent of the programming technologies used.
Disadvantages:
No fixed standards exist. Some entities contribute more to project size than others.
Just like FPA, it is less used in cost estimation models; hence, it must be converted to LOC.
3. Total number of processes in detailed data flow diagram: A data flow diagram (DFD) represents the functional view of software. The model depicts the main processes/functions involved in the software and the flow of data between them. The number of processes in the DFD can be used to predict software size: already existing processes of a similar type are studied and used to estimate the size of each process, and the sum of the estimated sizes gives the final estimated size.
Advantages:
It is independent of the programming language.
Each major process can be decomposed into smaller processes. This will increase the
accuracy of the estimation
Disadvantages:
Studying similar kinds of processes to estimate size takes additional time and effort.
Construction of a DFD is not required for all software projects.
4. Function Point Analysis: In this method, the number and type of functions supported
by the software are utilized to find FPC(function point count). The steps in function point
analysis are:
Count the number of functions of each proposed type.
Compute the Unadjusted Function Points(UFP).
Find the Total Degree of Influence(TDI).
Compute Value Adjustment Factor(VAF).
Find the Function Point Count(FPC).
The explanation of the above points is given below:
Count the number of functions of each proposed type: Find the number of
functions belonging to the following types:
External Inputs: Functions related to data entering the system.
External outputs: Functions related to data exiting the system.
External Inquiries: They lead to data retrieval from the system but don’t
change the system.
Internal Files: Logical files maintained within the system. Log files are not
included here.
External interface Files: These are logical files for other applications which
are used by our system.
Compute the Unadjusted Function Points (UFP): Categorize each of the five function types as simple, average, or complex based on their complexity. Multiply the count of each function type by its weighting factor and find the weighted sum.
The weighting factors for each type based on their complexity are as follows:
Function type              Simple  Average  Complex
External Inputs               3       4        6
External Outputs              4       5        7
External Inquiries            3       4        6
Internal Files                7      10       15
External Interface Files      5       7       10
Find Total Degree of Influence: Use the ’14 general characteristics’ of a system to
find the degree of influence of each of them. The sum of all 14 degrees of influence
will give the TDI. The range of TDI is 0 to 70. The 14 general characteristics are:
Data Communications, Distributed Data Processing, Performance, Heavily Used Configuration, Transaction Rate, On-Line Data Entry, End-User Efficiency, On-Line Update, Complex Processing, Reusability, Installation Ease, Operational Ease, Multiple Sites, and Facilitate Change.
Each of the above characteristics is evaluated on a scale of 0-5.
Compute Value Adjustment Factor(VAF): Use the following formula to calculate
VAF
VAF = (TDI * 0.01) + 0.65
Find the Function Point Count: Use the following formula to calculate FPC
FPC = UFP * VAF
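The five steps above can be tied together in a short sketch. The weighting factors are the standard simple/average/complex values from the table; the function counts and TDI below are hypothetical, for illustration only:

```python
# Standard weighting factors: {function type: (simple, average, complex)}
WEIGHTS = {
    "external_inputs":          (3, 4, 6),
    "external_outputs":         (4, 5, 7),
    "external_inquiries":       (3, 4, 6),
    "internal_files":           (7, 10, 15),
    "external_interface_files": (5, 7, 10),
}

def function_point_count(counts: dict, tdi: int) -> float:
    """counts maps each function type to (n_simple, n_average, n_complex);
    tdi is the Total Degree of Influence (0..70)."""
    ufp = sum(
        n * w
        for ftype, ns in counts.items()
        for n, w in zip(ns, WEIGHTS[ftype])
    )
    vaf = (tdi * 0.01) + 0.65  # Value Adjustment Factor
    return ufp * vaf           # FPC = UFP * VAF

# Hypothetical counts of (simple, average, complex) functions per type
counts = {
    "external_inputs":          (2, 1, 0),
    "external_outputs":         (1, 1, 0),
    "external_inquiries":       (1, 0, 0),
    "internal_files":           (0, 1, 0),
    "external_interface_files": (1, 0, 0),
}
print(function_point_count(counts, tdi=30))  # UFP = 37, VAF = 0.95
```

Here UFP works out to 37 and VAF to 0.95, giving an FPC of 35.15 function points.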
Advantages:
It can be easily used in the early stages of project planning.
It is independent of the programming language.
It can be used to compare different projects even if they use different
technologies(database, language, etc).
Disadvantages:
It is not good for real-time systems and embedded systems.
Many cost estimation models like COCOMO use LOC and hence FPC must be
converted to LOC.