Lecture Note (Software Metrics)
Software metrics are quantitative measures used to assess various aspects of software development,
maintenance, and quality. These metrics provide insights into the efficiency, effectiveness, and overall
health of the software development process and the resulting software product. Software metrics are vital
for making informed decisions, improving processes, and enhancing software quality. Here's an overview
of some common categories of software metrics:
Size Metrics:
Lines of Code (LOC): Measures the total number of lines of code in a software program. It's a simple way
to estimate the size and complexity of a project.
Function Points (FP): Measures software size based on its functionality, considering inputs, outputs,
inquiries, internal files, and external interfaces.
Effort and Cost Metrics:
Person-Month (PM): Represents the amount of effort required to complete a project, expressed as the
number of team members working for one month.
Cost Per Defect (CPD): Calculates the cost associated with finding, fixing, and verifying defects in the
software.
Time Metrics:
Lead Time: The time taken to complete a task from the moment it's initiated.
Cycle Time: The time taken to complete a task from the moment work begins until it's finished.
Productivity Metrics:
Defect Density: Measures the number of defects in the software per unit of size (e.g., defects per KLOC
(thousands of lines of code)).
Lines of Code per Developer Day: Indicates the productivity of developers in terms of lines of code
produced per day.
Quality Metrics:
Code Coverage: Measures the percentage of code that is executed during testing, indicating how thoroughly
the codebase has been tested.
Defect Removal Efficiency (DRE): Calculates the percentage of defects removed before release.
Failure Rate: Measures the frequency of system failures or defects in a given time period.
Complexity Metrics:
Cyclomatic Complexity: Quantifies the complexity of a program by counting the number of independent
paths through its code.
Maintainability Index: A composite metric that considers factors like cyclomatic complexity, lines of code,
and code duplication to assess code maintainability.
Agile Metrics:
Velocity: Measures the amount of work a team completes during an iteration in Agile development.
Burn-Down Chart: Tracks the remaining work versus time during a project to visualize progress.
Process Metrics:
Defect Arrival Rate: Measures the rate at which defects are identified after a release.
Change Request Rate: Tracks the number of requested changes to the software over time.
Risk Metrics:
Risk Exposure Ratio: Compares the potential impact of a risk with the efforts taken to mitigate it.
Risk Priority Number (RPN): Assigns a numerical value to each identified risk based on factors like
probability, impact, and detectability.
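Several of the metrics listed above are simple ratios that are easy to compute once the raw counts are collected. As an illustration, here is a minimal sketch (the project counts are hypothetical) computing defect density and defect removal efficiency (DRE):

```python
def defect_density(defects, kloc):
    """Defects per thousand lines of code (KLOC)."""
    return defects / kloc

def defect_removal_efficiency(found_before_release, found_after_release):
    """Percentage of total defects removed before release."""
    total = found_before_release + found_after_release
    return 100.0 * found_before_release / total

# Hypothetical project: 45 defects total in 30 KLOC, 40 caught before release.
print(defect_density(45, 30))            # 1.5 defects/KLOC
print(defect_removal_efficiency(40, 5))  # ~88.9%
```

The same pattern applies to velocity, failure rate, and the other ratio-style metrics: define the numerator and denominator precisely, then the computation is trivial.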
Using software metrics offers numerous benefits across various stages of the software development
lifecycle and within an organization as a whole:
Objective Decision-Making: Metrics provide quantifiable data that supports informed decision-making.
Project managers and stakeholders can make decisions based on facts rather than subjective judgments.
Process Improvement: Metrics help identify bottlenecks, inefficiencies, and areas for improvement in the
software development process. By analyzing these metrics, teams can streamline processes and enhance
productivity.
Early Issue Detection: Metrics like defect density and code coverage aid in early detection of issues. This
enables teams to address problems before they escalate, reducing the cost and effort required for fixes.
Quality Assurance: Quality metrics, such as defect removal efficiency and failure rate, offer insights into
the quality of the software. This information guides teams in maintaining or improving software quality.
Resource Allocation: Metrics like effort estimation and productivity metrics help allocate resources
effectively. Teams can identify areas where additional resources are needed or where resources are being
underutilized.
Performance Evaluation: Metrics provide a basis for evaluating individual and team performance. This
allows organizations to recognize and reward high-performing teams and individuals.
Risk Management: Risk metrics help organizations assess and manage project risks. This enables proactive
risk mitigation and better decision-making in risk-prone areas.
Customer Satisfaction: Metrics related to defects and user-reported issues allow teams to address customer
concerns promptly. This contributes to higher customer satisfaction and loyalty.
Predictive Analysis: Historical metrics can be used for predictive analysis. By understanding trends and
patterns, organizations can anticipate potential issues and plan accordingly.
Benchmarking: Metrics enable organizations to benchmark their performance against industry standards
and best practices. This helps identify areas where the organization is excelling or lagging behind.
Transparency: Metrics promote transparency by providing visibility into the software development
process. This transparency can foster trust between development teams and stakeholders.
Continuous Improvement: Metrics are central to the concept of continuous improvement. Regularly
tracking and analyzing metrics allows teams to iteratively enhance their processes and outcomes.
Communication and Collaboration: Metrics provide a common language for teams and stakeholders to
communicate and collaborate effectively. They offer a shared understanding of project status and progress.
ROI Assessment: Metrics help assess the return on investment (ROI) for software development projects.
This is essential for evaluating project success and making strategic decisions for future endeavors.
Efficient Prioritization: Metrics assist in prioritizing tasks and features based on their impact and value.
This helps teams focus on activities that contribute most to project goals.
Management Visibility: Metrics provide management with insights into project status, allowing them to
make well-informed decisions and allocate resources appropriately.
Documentation and Accountability: Metrics serve as documented evidence of progress, issues, and
achievements. They hold teams accountable for their work and outcomes.
Overall, using software metrics fosters a data-driven culture that leads to better outcomes, improved
collaboration, and enhanced software quality. It enables organizations to continuously learn from their
experiences and evolve their practices to achieve higher levels of success.
Chapter-1: Measurement- What is it and why do it?
Software Measurement
Software measurement is the process of quantifying various attributes, characteristics, and properties of
software products, processes, and resources using standardized methods and metrics. The goal of software
measurement is to obtain objective, quantitative data that can be analyzed and used to make informed
decisions, improve processes, and assess the quality and progress of software-related activities.
Software measurement is an essential component of good software engineering. Many of the best
developers measure characteristics of their software to get some sense whether the requirements are
consistent and complete, whether the design is of high quality, and whether the code is ready to be released.
Effective project managers measure attributes of processes and products to be able to tell when software
will be ready for delivery and whether a budget will be exceeded. Organizations use process evaluation
measurements to select suppliers. Informed customers measure the aspects of the final product to determine
if it meets the requirements and is of sufficient quality. Also, maintainers must be able to assess the current
product to see what should be upgraded and improved.
Even when a project is not in trouble, measurement is not only useful but also necessary. After all, how can
you tell if your project is healthy if you have no measures of its health? So, measurement is needed at least
for assessing the status of your projects, products, processes and resources.
The measurement objectives must be specific, tied to what the managers, developers, and users need to
know.
Managers' viewpoint:
➢ What does each process cost?
➢ How productive is the staff?
➢ How good is the code being developed?
➢ Will the user be satisfied with the product?
➢ How can we improve?
Developers' viewpoint:
➢ Are the requirements testable?
➢ Have we found all the faults?
➢ Have we met our product or process goals?
➢ What will happen in the future?
COCOMO Model
Boehm proposed COCOMO (COnstructive COst MOdel) in 1981. COCOMO is one of the most
widely used software estimation models in the world. COCOMO predicts the effort and schedule
of a software product based on the size of the software.
Reference: https://www.geeksforgeeks.org/software-engineering-cocomo-model/
1. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of the project
parameters. The following expressions give the basic COCOMO estimation model:
Effort = a1*(KLOC)^a2 PM
Tdev = b1*(Effort)^b2 Months
Where,
KLOC is the estimated size of the software product expressed in Kilo Lines of Code,
a1, a2, b1, b2 are constants for each category of software product,
Tdev is the estimated time to develop the software, expressed in months,
Effort is the total effort required to develop the software product, expressed in person-months
(PMs).
Example1: Suppose a project was estimated to be 400 KLOC. Calculate the effort and
development time for each of the three model i.e., organic, semi-detached & embedded.
Example2: A project size of 200 KLOC is to be developed. Software development team has
average experience on similar type of projects. The project schedule is not very tight. Calculate
the Effort, development time, average staff size, and productivity of the project.
Solution: The semidetached mode is the most appropriate mode, keeping in view the size,
schedule, and experience of the development team.
Hence E = 3.0(200)^1.12 = 1133.12 PM
D = 2.5(1133.12)^0.35 = 29.3 Months
Average staff size = E/D = 1133.12/29.3 = 38.67 persons
P = (200 × 1000)/1133.12 = 176 LOC/PM
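The worked example above can be checked with a short script. This is a minimal sketch using the standard basic-COCOMO coefficient table from Boehm (organic: a=2.4, b=1.05, d=0.38; semidetached: a=3.0, b=1.12, d=0.35; embedded: a=3.6, b=1.20, d=0.32; c=2.5 throughout):

```python
# Basic COCOMO: Effort = a*KLOC^b (person-months), Tdev = c*Effort^d (months)
MODES = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, mode):
    a, b, c, d = MODES[mode]
    effort = a * kloc ** b                 # person-months
    tdev = c * effort ** d                 # months
    staff = effort / tdev                  # average staff size
    productivity = kloc * 1000 / effort    # LOC per person-month
    return effort, tdev, staff, productivity

# Example 2: 200 KLOC, semidetached mode
e, t, s, p = basic_cocomo(200, "semidetached")
print(f"E = {e:.2f} PM, D = {t:.1f} months, staff = {s:.1f}, P = {p:.0f} LOC/PM")
```

Calling `basic_cocomo(400, mode)` for each of the three modes answers Example 1 as well.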
2. Intermediate Model: The basic Cocomo model considers that the effort is only a function of
the number of lines of code and some constants calculated according to the various software
systems. The intermediate COCOMO model recognizes these facts and refines the initial estimates
obtained through the basic COCOMO model by using a set of 15 cost drivers based on various
attributes of software engineering.
Classification of Cost Drivers and their attributes:
Product attributes -
o Required software reliability extent
o Size of the application database
o The complexity of the product
Hardware attributes -
o Run-time performance constraints
o Memory constraints
o The volatility of the virtual machine environment
o Required turnaround time
Personnel attributes -
o Analyst capability
o Software engineering capability
o Applications experience
o Virtual machine experience
o Programming language experience
Project attributes -
o Use of software tools
o Application of software engineering methods
o Required development schedule
3) Detailed COCOMO Model: Detailed COCOMO incorporates all characteristics of the intermediate
version with an assessment of the cost driver's effect on each step of the software engineering
process. The detailed model uses different effort multipliers for each cost driver attribute. In detailed
COCOMO, the whole software is divided into multiple modules; COCOMO is then applied to each
module to estimate its effort, and the module efforts are summed.
Measurement Theory
Measurement theory gives us the rules that lay the groundwork for developing and reasoning about all
kinds of measurement. Measurement itself is a mapping from the empirical world to the formal, relational
world. Consequently, a measure is the number or symbol assigned to an entity by this mapping in order to
characterize an attribute of that entity.
Empirical Relations
In the real world, we understand things by comparing them, not by assigning numbers to them.
For example, to compare height, we use the terms 'taller than' and 'higher than'. Thus, 'taller than' and
'higher than' are empirical relations for height.
We can define more than one empirical relation on the same set.
For example, X is taller than Y. X, Y are much taller than Z.
Empirical relations can be unary, binary etc.
X is tall, Y is not tall are unary relations.
X is taller than Y is a binary relation.
Empirical relations in the real world can be mapped to a formal mathematical world. Mostly, these
relations reflect personal preferences.
Some of the mapping or rating techniques used to map these empirical relations to the mathematical
world are as follows –
Likert Scale
Here, the users will be given a statement upon which they have to agree or disagree.
Forced Ranking
Order the given alternatives from 1 (best) to n (worst).
For example: Rank the following 5 software modules according to their performance.
Name of Module Rank
Module A
Module B
Module C
Module D
Module E
Ordinal Scale
Here, the users will be given a list of alternatives and they have to select one.
For example − How often does this program fail?
• Hourly
• Daily
• Weekly
• Monthly
• Several times a year
• Once or twice a year
• Never
Comparative Scale
Here, the user has to give a number by comparing the different options.
Very superior About the same Very inferior
1 2 3 4 5 6 7 8 9 10
Numerical Scale
Here, the user has to give a number according to its importance.
Unimportant Important
1 2 3 4 5 6 7 8 9 10
Direct Measurement
These are the measurements that can be measured without the involvement of any other entity or attribute.
The following direct measures are commonly used in software engineering.
❖ Size of source code (measured by LOC)
❖ Schedule of the testing process (measured by elapsed time in hours)
❖ Number of defects discovered (measured by counting defects)
❖ Time a programmer spends on a project (measured by months worked)
Derived Measurement
These are measurements that can only be measured in terms of other entities or attributes.
The following indirect measures are commonly used in software engineering:
❖ Programmer productivity (LOC produced per person-month of effort)
❖ Defect density (number of defects divided by product size)
❖ Defect removal efficiency (percentage of defects removed before release)
Measurement Scales
Measurement scales are the mappings used for representing the empirical relation system (an empirical
relationship or phenomenological relationship is a relationship or correlation that is supported by experiment
and observation but not necessarily supported by theory). There are mainly 5 types −
• Nominal Scale: qualitative (discrete)
• Ordinal Scale
• Interval Scale: quantitative (continuous)
• Ratio Scale
• Absolute Scale
Nominal Scale
It places the elements in a classification scheme. The classes will not be ordered. Each and every entity
should be placed in a particular class or category based on the value of the attribute.
➢ The empirical relation system consists only of different classes; there is no notion of ordering
among the classes.
➢ Any distinct numbering or symbolic representation of the classes is an acceptable measure, but
there is no notion of magnitude associated with the numbers or symbols.
Ordinal Scale
It places the elements in an ordered classification scheme. It has the following characteristics −
➢ The empirical relation system consists of classes that are ordered with respect to the attribute.
➢ Any mapping that preserves the ordering is acceptable.
➢ The numbers represent ranking only. Hence, addition, subtraction, and other arithmetic operations
have no meaning.
Interval Scale
This scale captures the information about the size of the intervals that separate the classification. Hence, it
is more powerful than the nominal scale and the ordinal scale.
If an attribute is measurable on an interval scale, and M and M' are mappings that satisfy the representation
condition, then we can always find two numbers a > 0 and b such that
M = aM' + b
For example, temperature measured in Fahrenheit and Celsius is related in this way: F = 1.8C + 32.
Ratio Scale
This is the most useful scale of measurement. Here, an empirical relation exists to capture ratios. It has the
following characteristics −
➢ It is a measurement mapping that preserves ordering, the size of intervals between the entities and
the ratio between the entities.
➢ There is a zero element, representing total lack of the attributes.
➢ The measurement mapping must start at zero and increase at equal intervals, known as units.
➢ All arithmetic operations can be applied.
M = aM', where a > 0.
Absolute Scale
On this scale, there will be only one possible measure for an attribute. Hence, the only possible
transformation will be the identity transformation.
➢ The measurement is made by counting the number of elements in the entity set.
➢ The attribute always takes the form “number of occurrences of x in the entity”.
➢ There is only one possible measurement mapping, namely the actual count.
➢ All arithmetic operations can be performed on the resulting count.
The representative value of a data set, generally the central value or the most frequently occurring value,
which gives a general idea of the whole data set, is called a measure of central tendency.
Within each class of entity, we distinguish between internal and external attributes of a product, process, or
resource:
• Internal attributes: Attributes that can be measured purely in terms of the product, process, or resource
itself. An internal attribute can be measured by examining the product, process, or resource on its own,
without considering its behavior.
• External attributes: Attributes that can be measured only with respect to how the product, process, or
resource relates to its environment. Here, the behavior of the process, product, or resource is important,
rather than the entity itself.
To understand the difference between internal and external attributes, consider a set of software modules.
Without actually executing the code, we can determine several important internal attributes: its size
(perhaps in terms of lines of code, LOC or number of operands), its complexity (perhaps in terms of the
number of decision points in the code), and the dependencies among modules. We may even find faults in
the code as we read it: misplaced commas, improper use of a command, or failure to consider a particular
case. However, there are other attributes of the code that can be measured only when the code is executed:
the number of failures experienced by the user, the difficulty that the user has in navigating among the
screens provided, or the length of time it takes to search the database and retrieve requested information,
for instance. It is easy to see that these attributes depend on the behavior of the code, making them external
attributes rather than internal. Table 3.1 provides additional examples of types of entities and attributes.
Processes
We want to know how long it takes for a process to complete, how much it will cost, whether it is effective
or efficient, and how it compares with other processes that we could have selected. However, only a limited
number of internal process attributes can be measured directly. These measures include:
• The duration of the process or one of its activities.
• The effort associated with the process or one of its activities.
• The number of incidents of a specified type arising during the process or one of its activities.
For example, we may be reviewing our requirements to ensure their quality before turning them over to the
designers. To measure the effectiveness of the review process, we can measure the number of requirements
errors found during the review as well as the number of errors found during later activities.
Products
Products are not restricted to the items that management is committed to deliver as the final software
product. Any artifact or document produced during the software life cycle can be measured and assessed.
For example, developers often model the requirements and design using various diagrams defined in the
Unified Modeling Language, and they build prototypes. The purpose of models and prototypes is to help
developers to understand the requirements or evaluate possible designs; these models and prototypes may
be measured in some way.
- External Product Attributes (the product's behavior and environment)
- Internal Product Attributes (e.g., the size of the product)
The main internal product attributes include size and structure. Size can be measured statically, without
executing the product. The size of the product tells us about the effort needed to create it. Similarly, the
structure of the product plays an important role in designing the maintenance of the product. For example,
Halstead's program level, L = V*/V, relates a program's potential (minimal) volume V* to its actual volume
V and is computed purely from the program text, making it an internal attribute.
Goal-Question-Metric Paradigm
Goal Question Metric approach is a method used to identify important and meaningful metrics in the
following way:
➢ List the goals or objectives for the process
➢ Trace the goals with data or metrics
➢ Use a framework to interpret the data with respect to listed goals for the process
GQM approach has three levels – Conceptual level, Operational level, and Quantitative level. Each level
is significant for understanding metrics.
1. Conceptual level – Goal:
This level represents a goal or objective. A goal is an object or entity.
Objects of measurement include:
• Products –
Software Requirement Specification (SRS), Designs, Program or code
• Processes –
Testing (Verification and Validation), Designing
• Resources –
Hardware and Software
2. Operational level – Question:
This level represents questions. A set of questions are used to assess a goal.
Example of questions could be:
• Is the current process performance satisfactory from the team’s viewpoint?
• Is performance improving?
• Is the improvement satisfactory?
3. Quantitative level – Metric:
This level represents metrics. With every question added in the scenario, a set of data is used to answer
the question in a quantitative manner. This set of data is called metrics.
Data can be of 2 types:
• Objective –
LOC (Lines of code), size of module, size of program, etc.
• Subjective –
Level of user satisfaction on a scale of 1 to 10
In general, typical goals are expressed in terms of productivity, quality, risk, and customer satisfaction and
the like, coupled with verbs expressing the need to assess, evaluate, improve, or understand. It is important
that the goals and questions be understood in terms of their audience: a productivity goal for a project
manager may be different from that for a department manager or corporate director. To aid in generating
the goals, questions, and metrics, Basili and Rombach provided a series of templates:
Templates for goal definition:
• Purpose: To (characterize, evaluate, predict, motivate, etc.) the (process, product, model, metric, etc.) in
order to (understand, assess, manage, engineer, learn, improve, etc.) it.
Example: To evaluate the maintenance process in order to improve it.
• Perspective: Examine the (cost, effectiveness, correctness, defects, changes, product measures, etc.) from
the viewpoint of the (developer, manager, customer, etc.)
Example: Examine the cost from the viewpoint of the manager.
• Environment: The environment consists of the following: process factors, people factors, problem factors,
methods, tools, constraints, etc.
Example: The maintenance staff consists of poorly motivated programmers who have limited access to
tools.
Measures or measurement systems are used to assess an existing entity by numerically characterizing one or
more of its attributes. A measure is valid if it accurately characterizes the attribute it claims to measure.
Validating a software measurement system is the process of ensuring that the measure is a proper numerical
characterization of the claimed attribute by showing that the representation condition is satisfied.
For validating a measurement system, we need both a formal model that describes entities and a numerical
mapping that preserves the attribute that we are measuring. For example, if there are two programs P1 and
P2, and we concatenate those programs, then we expect any measure m of length to satisfy
m(P1; P2) = m(P1) + m(P2)
If a program P1 has more length than program P2, then any measure m should also satisfy
m(P1) > m(P2)
The length of a program can be measured by counting its lines of code. If this count satisfies the above
relationships, we can say that lines of code is a valid measure of length.
CHAPTER 8: MEASURING INTERNAL PRODUCT ATTRIBUTES (Size)
Size measures only indicate how much of an entity we have. Size alone cannot directly indicate
external attributes such as effort, productivity, and cost.
There is a major problem with the lines-of-code measure: it is not consistent because some lines
are more difficult to code than others.
Although size measures do not indicate external attributes like “difficulty of coding,” they are very
useful. Clearly, when all other attributes are similar, the size of a software entity really matters. In
general, a 100,000-line program will be more difficult to test and maintain than a 10,000-line
program. A large program is more likely to contain faults than a small program. Problem size is a
good attribute to use to predict software development time and resources. Size is commonly used
as a component to compute indirect attributes such as productivity:
Productivity = Size/Effort
Another example is defect density:
Defect density = Defect count/Size
Also, size is commonly used in many cost estimation models, which are often used for project
planning.
Briand, Morasco, and Basili define the following three properties for any valid measure of software
size:
1. Nonnegativity: All systems have nonnegative size.
2. Null value: The size of a system with no elements is zero.
3. Additivity: The size of the union of two modules is the sum of sizes of the two modules after
subtracting the size of the intersection.
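These three properties can be illustrated by modeling a module as the set of its elements; set cardinality then satisfies all three. A toy sketch, with hypothetical statement identifiers:

```python
# Model a module as the set of its elements (e.g., statement ids).
def size(module):
    return len(module)

m1 = {"s1", "s2", "s3"}
m2 = {"s3", "s4"}          # shares element s3 with m1

assert size(set()) == 0    # null value: a system with no elements has size zero
assert size(m1) >= 0       # nonnegativity
# additivity: size of the union equals the sum of sizes minus the intersection
assert size(m1 | m2) == size(m1) + size(m2) - size(m1 & m2)
print(size(m1 | m2))       # 4
```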
The most commonly used measure of source code program size is the number of lines of code
(LOCs). But some LOCs are different from others. For example, many programmers use spacing
and blank lines to make their programs easier to read. If LOCs are being used to estimate
programming effort, then a blank line does not contribute the same amount of effort as a line
implementing a difficult algorithm. Similarly, comment lines improve a program’s
understandability, and they certainly require some effort to write. But they may not require as
much effort as the code itself. To use LOC consistently, we must explain how each of the following is handled:
• Blank lines
• Comment lines
• Data declarations
• Lines that contain several separate instructions
Conte et al. define an LOC as any line of program text that is not a comment or blank line,
regardless of the number of statements or fragments of statements on the line. This definition
specifically includes all lines containing program headers, declarations, and executable and
nonexecutable statements (Conte et al. 1986). Grady and Caswell report that Hewlett-Packard
defines an LOC as a noncommented source statement: any statement in the program except for
comments and blank lines (Grady and Caswell 1987).
To stress the fact that an LOC according to this definition is actually a noncommented line, we use
the abbreviation NCLOC, sometimes also called effective lines of code. The model associated with
this definition views a program as a simple file listing, with comments and blank lines removed,
giving an indication of the extent to which it is self-documented.
As a compromise, we recommend that the number of comment lines of program text (CLOC) be
measured and recorded separately. Then we can define
Total size (LOC) = NCLOC + CLOC
and some useful indirect measures follow. For example, the ratio
CLOC/LOC
measures the density of comments in a program.
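A minimal sketch of these counts for Python-style source follows; the `#` comment convention is an assumption here, and a real counter must handle each language's comment syntax (including block comments):

```python
def line_counts(source: str):
    """Return (NCLOC, CLOC, blank) for a source string.

    A line counts as a comment line (CLOC) if its first non-blank
    character starts a comment; any other non-blank line is NCLOC.
    """
    ncloc = cloc = blank = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            blank += 1
        elif stripped.startswith("#"):
            cloc += 1
        else:
            ncloc += 1
    return ncloc, cloc, blank

src = """# compute a square
def square(x):
    return x * x

# end of file
"""
ncloc, cloc, blank = line_counts(src)
loc = ncloc + cloc                     # total size as defined above
print(ncloc, cloc, blank, cloc / loc)  # last value is comment density CLOC/LOC
```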
Halstead Approach
The Halstead method is a software metrics system developed by Maurice Halstead in 1977. It measures
the complexity of a software program by analyzing the source code, focusing on the number of distinct and
total operators and operands. Let me break it down for you!
Key Terms in Halstead's Metrics:
• Operators: symbols or keywords that represent actions (e.g., +, -, if, while).
• Operands: variables or constants used in the program (e.g., x, 5, myVar).
• Unique operators (n₁): the count of distinct operators.
• Unique operands (n₂): the count of distinct operands.
• Total operators (N₁): the total number of operator occurrences.
• Total operands (N₂): the total number of operand occurrences.
From these counts, Halstead defines the following measures:
Vocabulary: n = n₁ + n₂
Length: N = N₁ + N₂
Volume: V = N × log2(n)
Potential (minimal) volume: V* = (2 + n₂*) × log2(2 + n₂*), where n₂* is the number of input/output
parameters
Program level: L = V*/V
Difficulty: D = 1/L = (n₁/2) × (N₂/n₂)
Effort: E = D × V
Language level: λ = L × V* = L² × V
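Halstead's basic measures can be computed directly from the four counts above. A minimal sketch (the sample counts are hypothetical, not taken from a specific program):

```python
import math

def halstead(n1, n2, N1, N2):
    """Compute basic Halstead measures from the unique/total operator
    and operand counts."""
    n = n1 + n2                 # vocabulary
    N = N1 + N2                 # length
    V = N * math.log2(n)        # volume
    D = (n1 / 2) * (N2 / n2)    # difficulty
    E = D * V                   # effort
    return {"vocabulary": n, "length": N, "volume": V,
            "difficulty": D, "effort": E}

# Hypothetical counts: 5 unique operators, 4 unique operands,
# 10 operator occurrences, 8 operand occurrences.
m = halstead(5, 4, 10, 8)
print(m)
```

In practice the hard part is tokenizing the source and deciding which tokens count as operators versus operands; the arithmetic itself is this simple.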
To define size differently, we have two other alternatives to explore, both of which are acceptable
on measurement theory grounds as ratio measures:
1. We can measure size in terms of the number of bytes of computer storage required for the
program text. This approach has the advantage of being on the same scale as the normal measure
of size for object code. It is at least as well understood as LOC, and it is very easy to collect.
2. We can measure size in terms of the number of characters (CHAR) in the program text, which
is another easily collected measure. For example, most modern word processors compute this
count routinely for any text file. (Both the UNIX and Linux operating systems have the command
wc <filename> to compute it.)
Design Size
To measure the size of a procedural design, you can count the number of procedures and functions
at the lowest level of abstraction. You can also measure the size of the procedure and function
interfaces in terms of the number of arguments. Such measurements can be taken without code,
for example, by analyzing the APIs of a system. At higher levels of abstraction, you can count the
number of packages and subsystems. You can measure the size of a package or subsystem in terms
of the number of functions and procedures in the package.
Object-oriented designs add new abstraction mechanisms: objects, classes, interfaces, operations,
methods, associations, inheritance, etc. Object-oriented design can also include realizations of
design patterns (Gamma et al. 1994). When quantifying size, our focus is generally on the static
entities rather than the links between entities, or runtime entities.
Thus, we will measure size in terms of packages, design patterns, classes, interfaces, abstract
classes, operations, and methods.
• Packages: Number of subpackages, number of classes, interfaces (Java), or abstract classes (C++)
• Design patterns:
➢ Number of different design patterns used in a design
➢ Number of design pattern realizations for each pattern type
➢ Number of classes, interfaces, or abstract classes that play roles in each pattern realization
• Classes, interfaces, or abstract classes: Number of public methods or operations, number of
attributes
• Methods or operations: Number of parameters, number of overloaded versions of a method or
operation.
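Counts like these can be collected automatically from code or APIs. A minimal sketch using Python's `inspect` module to count public methods and their parameters for a toy class (the `Stack` class is hypothetical, standing in for a real design element):

```python
import inspect

class Stack:
    """Toy class standing in for a real design element."""
    def push(self, item): ...
    def pop(self): ...
    def peek(self): ...
    def _resize(self, capacity): ...  # private: excluded from the size count

def public_methods(cls):
    """Names of public methods, one component of class size."""
    return [name for name, member in inspect.getmembers(cls, inspect.isfunction)
            if not name.startswith("_")]

def parameter_count(cls, method_name):
    """Number of parameters of a method, excluding 'self'."""
    sig = inspect.signature(getattr(cls, method_name))
    return len(sig.parameters) - 1

print(public_methods(Stack))           # ['peek', 'pop', 'push']
print(parameter_count(Stack, "push"))  # 1
```

Tools for other languages (e.g., parsers over Java or C++ sources) gather the same counts at the package and subsystem levels.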
There are obvious atomic elements in a variety of requirements and specification model types that
can be counted:
i. Use case diagrams: Number of use cases, actors, and relationships of various types
ii. Use case: Number of scenarios, size of scenarios in terms of steps, or activity diagram model
elements
iii. Domain model (expressed as a UML class diagram): Number of classes, abstract classes,
interfaces, roles, operations, and attributes
iv. UML OCL (Object Constraint Language) specifications: Number of OCL expressions, OCL
clauses
v. Alloy models: Number of alloy statements—signatures, facts, predicates, functions, and
assertions (Jackson 2002)
vi. Data-flow diagrams used in structured analysis and design: Processes (bubbles nodes), external
entities (box nodes), data-stores (line nodes) and data-flows (arcs)
vii. Algebraic specifications: Sorts, functions, operations, and axioms
viii. Z specifications: The various lines appearing in the specification, which form part of either a
type declaration or a (nonconjunctive) predicate.
Allan J. Albrecht initially developed Function Point Analysis (FPA) at IBM in 1979, and it has
since been further refined and maintained by the International Function Point Users Group
(IFPUG). FPA is a widely used technique for measuring the size and complexity of a software
system, including its testing, in terms of its functionality (function size). It was developed to
estimate software development effort and cost more accurately.
What is Function Point Analysis (FPA)?
FPA measures software size in terms of function points — a unit representing the software's functional
requirements as seen from the user’s perspective. It doesn’t consider technical details but focuses on
what the software does.
Difference between External Inputs (EI) and Internal Logical Files (ILF):
• Example components — EI: forms, data-entry screens, API requests, file uploads; ILF: database
tables, master files, reference data.
• Example (Library System) — EI: a "New Book Entry" form where librarians input book details;
ILF: the Books table storing all the book information.
Objectives of FPA
The primary purpose of function point analysis is to measure and report the functional size of a
software application to clients, customers, and stakeholders. It is also used to measure
development and maintenance work consistently throughout a project, irrespective of the tools
and technologies used.
1. The FP count of an application is found by counting the number and types of functions used in
the application. The functions used in an application fall into five types, as shown in the
table:
Types of FP Attributes
2. FP characterizes the complexity of the software system and hence can be used to depict the
project time and the manpower requirement.
3. The effort required to develop the project depends on what the software does.
4. FP is programming language independent.
5. FP method is used for data processing systems, business systems like information systems.
6. The five parameters mentioned above are also known as information domain characteristics.
7. All the parameters mentioned above are assigned weights that have been determined
experimentally.
The functional complexities are multiplied with the corresponding weights against each function,
and the values are added up to determine the UFP (Unadjusted Function Point) of the subsystem.
Each Complexity Adjustment Factor (CAF) characteristic is rated on a scale from 0 to 5, as shown
below:
0 - No Influence
1 - Incidental
2 - Moderate
3 - Average
4 - Significant
5 - Essential
Here the weighting factor for each measurement parameter type is rated as simple, average, or
complex. The Function Point (FP) count is then calculated with the following formula:
FP = UFP × [0.65 + 0.01 × Σ(Fi)]
where Σ(Fi) is the sum of the 14 complexity adjustment values (the bracketed term is the Value
Adjustment Factor described below). The FP count can then be used to derive normalized metrics
such as:
➢ Errors/FP
➢ $/FP.
➢ Defects/FP
➢ Pages of documentation/FP
➢ Errors/PM.
➢ Productivity = FP/PM (effort is measured in person-months).
➢ $/Page of Documentation.
8. LOCs of an application can be estimated from FPs. That is, they are interconvertible. This
process is known as backfiring. For example, 1 FP is equal to about 100 lines of COBOL code.
9. The FP metric is mostly used for measuring the size of Management Information System (MIS)
software.
10. But the function points obtained above are unadjusted function points (UFPs). These (UFPs)
of a subsystem are further adjusted by considering some more General System Characteristics
(GSCs). It is a set of 14 GSCs that need to be considered. The procedure for adjusting UFPs is as
follows:
➢ The Degree of Influence (DI) for each of the 14 GSCs is assessed on a scale of 0 to 5:
if a particular GSC has no influence, its DI is taken as 0, and if it has a strong
influence, its DI is 5.
➢ The score of all 14 GSCs is totaled to determine Total Degree of Influence (TDI).
➢ Then Value Adjustment Factor (VAF) is computed from TDI by using the formula: VAF
= (TDI * 0.01) + 0.65
Remember that the value of VAF lies within 0.65 to 1.35 because
a. When TDI = 0, VAF = 0.65
b. When TDI = 70, VAF = 1.35
c. VAF is then multiplied with the UFP to get the final FP count: FP = VAF * UFP
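The UFP-adjustment procedure above can be sketched in Python; the GSC scores used below are illustrative values, not taken from the text:

```python
# Sketch of the UFP -> VAF -> FP adjustment procedure described above.
def value_adjustment_factor(gsc_scores):
    """VAF = (TDI * 0.01) + 0.65, where TDI sums the 14 GSC scores (each 0-5)."""
    assert len(gsc_scores) == 14 and all(0 <= s <= 5 for s in gsc_scores)
    tdi = sum(gsc_scores)            # Total Degree of Influence: 0..70
    return tdi * 0.01 + 0.65         # always between 0.65 and 1.35

def adjusted_fp(ufp, gsc_scores):
    """FP = VAF * UFP."""
    return ufp * value_adjustment_factor(gsc_scores)

gsc = [3] * 14                                   # illustrative: every GSC rated "Average"
print(round(value_adjustment_factor(gsc), 2))    # 1.07
print(round(adjusted_fp(628, gsc), 2))           # 671.96
```

Note how the extremes behave: all-zero scores give VAF = 0.65 and all-five scores give VAF = 1.35, matching the bounds stated above.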
Example: Compute the function point, productivity, documentation, cost per function for the
following data:
Solution:
Example:
Given the following values, compute function point when all complexity adjustment factor (CAF)
and weighting factors are average.
User Input = 50
User Output = 40
User Inquiries = 35
User Files = 6
External Interface = 4
Explanation:
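A worked solution, assuming the standard IFPUG average weights (EI = 4, EO = 5, EQ = 4, ILF = 10, EIF = 7) and a score of 3 ("Average") for each of the 14 complexity adjustment factors:

```python
# Worked FP example with average weighting factors and average CAFs.
avg_weights = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}    # IFPUG average weights
counts      = {"EI": 50, "EO": 40, "EQ": 35, "ILF": 6, "EIF": 4}  # values from the example

ufp = sum(counts[k] * avg_weights[k] for k in counts)  # 200 + 200 + 140 + 60 + 28
caf = 0.65 + 0.01 * (14 * 3)                           # all 14 factors scored 3 ("Average")
fp  = ufp * caf

print(ufp)            # 628
print(round(caf, 2))  # 1.07
print(round(fp, 2))   # 671.96
```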
Function points are a widely used technique in software engineering for estimating the size and complexity
of software projects. However, they have several limitations, and it's important to be aware of these when
using function points for project estimation and management. Here are some of the key limitations of
function points:
Subjectivity: Function points rely on human judgment to assess the complexity of various software
components. Different individuals may assess the same component differently, leading to subjectivity in
the measurement.
Lack of precision: Function points provide only a rough estimate of software size and complexity. They
don't account for the nuances of specific technologies or implementation details, making them imprecise
for detailed project planning.
Limited to size measurement: Function points primarily focus on the size of software, which is just one
aspect of a project's complexity. They do not consider other critical factors like performance,
maintainability, and security.
Difficulty in learning: It can be challenging to learn how to use function points effectively, as it requires a
good understanding of the method and its rules. This complexity can deter some teams from using them.
Time-consuming: Function point analysis can be a time-consuming process, especially for larger and more
complex projects. This can be a limitation when quick estimates are needed.
Not suitable for small projects: Function points are most useful for medium to large-scale software projects.
For small projects, the effort required to perform function point analysis may not be justified.
Doesn't account for changes over time: Function points provide a one-time estimate and don't account for
changes that may occur during the software development lifecycle. As the project evolves, the original
function point estimate may become less accurate.
May not work well for new or unique technologies: Function points are based on established software
development practices. They may not work well for projects that involve cutting-edge technologies or
highly unique software solutions.
Can be misused: There is a risk that function points may be misused or manipulated to fit a particular
agenda. For example, stakeholders might try to inflate or deflate function point counts to influence project
estimates.
Requires experienced practitioners: To use function points effectively, it's essential to have experienced
practitioners who can make accurate assessments of software components. Inexperienced assessors can lead
to inaccurate estimates.
COCOMO II Approach
COCOMO II (Constructive Cost Model) is a software cost estimation model that helps project managers
and developers estimate the effort and cost of developing a software project. It is based on various factors,
including the project's size, the software's complexity, and the development environment.
System Implementation
List of metrics with definition, the process of measurement, and corresponding values:
Step 1: Take as input the software project to be estimated.
Step 2: Compute the value of EAF.
Step 3: Find the value of E and SE.
Step 4: Find the size of the Software (KSLOC).
Step 5: Calculate the Effort Equation by:
Effort = 2.94 * EAF * (KSLOC)^E
EAF is the Effort Adjustment Factor derived from the Cost Drivers.
E is an exponent derived from the five Scale Drivers.
Step 6: Calculate the Duration Equation by:
Duration = 3.67 * (Effort)^SE
SE is the schedule equation exponent derived from the five Scale Drivers.
Step 7: Calculate the Average staffing by:
Average staffing=Effort/Duration
Step 8: Analyze the result; the output represents the volume of effort required to complete the
software.
Step 9: End.
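Steps 5 through 7 can be sketched as follows; the EAF, exponent, and size values used here are illustrative assumptions, not fixed by the model:

```python
# Sketch of the COCOMO II effort, duration, and staffing calculation (Steps 5-7).
def cocomo2(ksloc, eaf, e, se):
    effort = 2.94 * eaf * ksloc ** e    # Effort Equation, in person-months
    duration = 3.67 * effort ** se      # Duration Equation, in months
    staffing = effort / duration        # Average staffing (people)
    return effort, duration, staffing

# Illustrative project: 100 KSLOC, nominal EAF = 1.0, E = 1.0997, SE = 0.3179.
effort, duration, staffing = cocomo2(100, 1.0, 1.0997, 0.3179)
print(round(effort, 1), round(duration, 1), round(staffing, 1))  # ≈ 465.3 PM, 25.9 months, 18 people
```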
Scale Drivers:
1. Precedentedness (precedentedness refers to how much experience and familiarity a development
team has with the type of project they’re working on.)
➢ Completely unprecedented: 0.70
➢ Somewhat similar: 1.00
➢ Very Similar: 1.33
➢ Identical to previous project: 1.66
2. Development Flexibility (is another scale factor that affects the effort required to complete a
software project. It measures how flexible the project requirements are and how much the team
can adjust the design and implementation based on discoveries made during development.)
➢ Very tightly constrained: 0.70
➢ Somewhat flexible: 1.00
➢ Very flexible: 1.33
➢ Completely flexible: 1.66
3. Architecture / Risk Resolution (is a scale factor that measures how well the project's
architecture is defined and how effectively risks are identified and managed. It reflects how much
effort is needed to resolve uncertainties and make key design decisions before full-scale
development starts.)
➢ Very low risk: 0.70
➢ Moderate risk: 1.00
➢ High risk: 1.33
➢ Extremely high risk: 1.66
4. Team Cohesion (is a scale factor that measures how well the development team works together.
It reflects the team's ability to communicate, collaborate, and resolve conflicts — all of which
impact productivity and effort.)
➢ Inexperienced and unfamiliar team: 0.70
➢ Moderately experienced team: 1.00
➢ Highly experienced: 1.33
➢ Exceptional team: 1.66
5. Process Maturity (is a scale factor that reflects how well the software development process is
defined, documented, and followed. It’s based on the Capability Maturity Model (CMM), which
measures process discipline and effectiveness. A mature process reduces project risks and makes
effort more predictable.)
➢ Informal and undefined process: 0.70
➢ Somewhat established process: 1.00
➢ Mature process: 1.33
➢ Optimally mature process: 1.66
❖ Reuse Metrics: Size measures are essential for quantifying the extent of code or component
reuse in software development. Metrics like "percentage of reused code" or "number of
reusable components" are calculated based on the size of reusable elements relative to the
overall size of the software.
❖ Repository Management: When maintaining a repository of reusable software components,
size measures help in categorizing and cataloging these components. It allows developers
to search for and identify reusable items based on their size and functionality.
❖ Test Planning and Coverage: Software size is used in test planning to determine the scope
and coverage of testing efforts. It helps identify the number of test cases, scenarios, and
test data required to adequately test a software system.
❖ Test Cost Estimation: Size measures are also employed in estimating the cost and effort
associated with testing. Larger software systems typically require more extensive testing,
leading to higher testing costs.
❖ Defect Density: Defect density, a common quality metric, is often calculated as the number
of defects per unit of software size (e.g., defects per thousand lines of code). This metric
helps in assessing the quality of the software and identifying areas with a higher density of
defects.
The size of a development product tells us a lot about the effort that went into creating it. All other
things being equal, we would like to assume that a large module takes longer to specify, design,
code, and test than a small one. But experience shows us that such an assumption is not valid; the
structure of the product plays a part, not only in requiring development effort but also in how the
product is maintained. Thus, we must investigate the characteristics of product structure, and
determine how they affect the outcomes we seek.
A software module or design can be viewed from several perspectives. The perspective that we
use depends on
1. The level of abstraction—program unit (function, method, class), package, subsystem, and
system
2. The way the module or design is described—syntax and semantics
3. The specific attribute to be measured
To be sure that we are measuring the desired attribute, we generally represent the relevant aspects
of a module or design using a model containing only the information relevant to the attribute. We
can think of structure from at least two perspectives:
1. Control flow structure
2. Data flow structure
The control flow addresses the sequence in which instructions are executed in a program. This
aspect of structure reflects the iterative and looping nature of programs. Whereas size counts an
instruction or program element just once, control flow makes more visible the fact that an
instruction or program element may be executed many times as the program is actually run.
Data flow follows the trail of a data item as it is created or handled by a program. Many times, the
transactions applied to data are more complex than the instructions that implement them; data flow
measures depict the behavior of the data as it interacts with the program.
In software engineering, data is a critical component of any software project. During the
development of software, various data are processed by the system, program, or module. These
data can be input into the system, used internally within the software, or output from the system.
In order to measure and analyze the data input, processing, and output, a set of metrics are used,
which is known as Data Structure Metrics. These metrics provide a fundamental framework to
evaluate and quantify the amount of data in the software system.
Data Structure Metrics are primarily concerned with counting the data that is processed by software
systems. While some metrics only focus on variables and constants within each module and ignore
input-output dependencies, others take into account input/output situations. The main goal of Data
Structure Metrics is to provide insight into the amount of data that is input to, internally processed
in, and output from a software system or module.
To estimate the effort and time required for completing a software project, several Data Structure
metrics are available. These metrics measure different aspects of how data is processed within the
system, including:
• Amount of Data
• Usage of Data within a Module
• Program Weakness
• Sharing of Data among Modules
Amount of Data
The measurement of the Amount of Data involves various metrics, such as the number of variables
(VARS), number of operands (η2), and total number of occurrences of variables (N2). The VARS
metric counts the number of variables used in the program. The η2 metric counts the number of
operands used in the program, including variables, constants, and labels.
The formula for η2 is η2 = VARS + Constants + Labels. The N2 metric computes the total
number of occurrences of variables in the program.
For a program consisting of n spans, the average span size (SP) can be computed as
SP = (Σ SPi) / n
where SPi is the size of the ith span; the average live variable metric (LV) is computed
analogously by averaging over the program's modules.
Program Weakness
The weakness of a program is closely linked to the weakness of its modules. In cases where the
modules lack cohesiveness and are deemed weak, completing the project will require additional
effort and time.
Since a program is usually comprised of multiple modules, assessing program weaknesses can
provide valuable insight. Therefore, program weakness is defined as:
WP = (Σ WMi) / m
where WMi is the weakness of the ith module and m is the number of modules in the program.
Information flow metrics are an essential aspect of software engineering that measures the flow of
information in a software system. Information flow metrics assess how information is exchanged
between the different components and modules of the software system, and how it affects system
performance, reliability, and maintainability. In today’s ever-evolving technological landscape,
software systems are becoming increasingly complex, making it vital to develop tools and
techniques to analyze and evaluate information flow metrics. By using these metrics, software
engineers can identify weaknesses and bottlenecks in a system and make necessary improvements,
leading to better software quality and efficient development processes.
This metric is given by Henry and Kafura, hence commonly referred to as Henry and Kafura’s
Metric. These metrics are based on the fundamental concept that a system’s complexity is
determined by its components, and how they are organized and interrelated. The metrics developed
by Henry and Kafura measure the work performed by the system’s components and how they are
integrated to determine system complexity.
Cyclomatic Complexity
Cyclomatic complexity is a software metric used to measure the complexity of a program. It was
developed by Thomas McCabe in 1976 to help measure the difficulty of testing software.
Cyclomatic complexity determines the number of independent paths through a program’s source
code by analyzing its control flow. The higher the cyclomatic complexity, the more complex the
code is and the more difficult it is to understand, test, and maintain. Understanding and managing
cyclomatic complexity is important in software development to improve code quality and reduce
the risk of bugs and errors.
In software development, cyclomatic complexity is a useful metric for measuring the complexity
of a program’s control flow. If a program’s source code has no control flow statements, its
cyclomatic complexity is 1, as there is only one possible path.
For a program with the program flow graph G, the cyclomatic complexity v(G) is measured as:
v(G) = e -n + 2p
➢ e: number of edges
o Representing branches and cycles
➢ n: number of nodes
o Representing block of sequential code
➢ p: number of connected components
o For a single component, p=1
For a program with the program flow graph G, the cyclomatic complexity v(G) can also be measured
as:
v(G) = 1 + d
➢ d: number of predicate nodes (i.e., decision nodes with out-degree greater than 1)
o d represents number of loops in the graph;
o or number of decision points in the program
i.e., The complexity of primes depends only on the predicates (decision points or BCSs) in them.
v(G) = e -n + 2p
v(G) = 7 -6 + 2 x 1
v(G) = 3
Or
v(G) = 1 + d
v(G) = 1 + 2 = 3
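Both formulas can be checked against the worked example above (a flow graph with 7 edges, 6 nodes, one connected component, and 2 predicate nodes):

```python
# Cyclomatic complexity computed both ways for the flow graph above.
def cc_edges_nodes(e, n, p=1):
    """v(G) = e - n + 2p"""
    return e - n + 2 * p

def cc_predicates(d):
    """v(G) = 1 + d, where d counts the predicate (decision) nodes."""
    return 1 + d

print(cc_edges_nodes(7, 6))  # 3
print(cc_predicates(2))      # 3 -- both formulas agree
```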
Design-Level Attributes
Tree Impurity
The more a system deviates from being a pure tree structure towards being a graph structure, the
worse the design is … it is one of the few system design metrics to have been validated on a real
project. (INCE AND HEKMATPOUR 1988)
Thus, we seek to create a measure, called tree impurity, to tell us how far a given graph deviates
from being a tree. In what follows, we restrict our attention to undirected graphs.
To define tree impurity, we first describe several properties of graphs and trees. A tree with n
nodes always has n − 1 edges. For every connected graph G, we can find at least one subgraph that
is a tree built on exactly the same nodes as G; such a tree is called a spanning subtree. A spanning
subgraph G′ of a graph G is built on the same nodes of G, but with a minimum subset of edges so
that any two nodes of G′ are connected by a path. (Thus, a graph may have more than one spanning
subgraph.) Intuitively, the tree impurity of G increases as the difference between G and G′
increases. We want to make our definition formal, ensuring that it is consistent with the principles
of measurement theory.
Property 3: For i = 1 and 2, let Ai denote the number of edges in Gi and Ni the number of nodes
in Gi. Then if N1 > N2 and A1 − N1 + 1 = A2 − N2 + 1 (i.e., the spanning subtree of G1 has more
edges than the spanning subtree of G2, but in both cases the number of edges additional to the
spanning tree is the same), then m(G1) < m(G2).
This property formalizes our intuitive understanding that we must take account of size in
measuring deviation from a tree. Consider the two graphs G2 and G4 in Figure 9.21. Each has a
single edge additional to its spanning subtree. However, since the spanning subtree of G4 is
smaller, we have an intuitive feeling that its tree impurity should be greater—a single deviation
represents a greater proportional increase in impurity.
Property 4: For all graphs G, m(G) ≤ m(KN) = 1, where N = number of nodes of G and KN is the
complete graph of N nodes.
This property says that, of all the graphs with n nodes, the complete graph has maximal tree
impurity. Since it is reasonable to assume that tree impurity can be measured on a ratio scale, we
can consider our measure to map to some number between 0 and 1, with the complete graph
measuring 1, the worst impurity.
Internal Reuse
We call internal reuse the extent to which modules within a product are used multiple times within
the same product.
This informal definition leads to a more formally defined measure. Consider the graph in Figure
9.22, where each node represents a module, and two nodes are connected by an edge if one module
calls the other.
The graph shows that module M5 is reused, in the sense that it is called by both M2 and M3.
However, the graph model does not give us information about how many times a particular module
calls another module. Thus, M3 is not shown to be reused at all, but it may in fact be called several
times by module M1. If we think of module reuse solely as whether one module is called by
another, then this graph model is sufficient.
Suppose we want to define a measure of this type of reuse. The first two properties for a tree
impurity measure should also apply to a reuse measure. Similarly, property 4 seems reasonable, but
we must drop the provision that m(KN) = 1. However, property 3 is not applicable, although its
converse may be a desirable property.
Yin and Winchester have proposed a simple measure of internal reuse, r, which satisfies properties
1, 2, and 4, plus the converse of 3. Called the system design measure, it is defined by
r(G) = e − n + 1
where G has e edges and n nodes (Yin and Winchester 1978). Thus, the design measure is equal
to the number of edges additional to the spanning subtree of G.
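A sketch of the measure on a call graph like Figure 9.22; the exact edges below are assumed for illustration:

```python
# Yin and Winchester's internal-reuse measure r(G) = e - n + 1:
# the number of call-graph edges beyond a spanning subtree.
def internal_reuse(edges, nodes):
    return len(edges) - len(nodes) + 1

# Assumed call graph: M5 is called by both M2 and M3, so it is reused once.
nodes = {"M1", "M2", "M3", "M4", "M5"}
edges = [("M1", "M2"), ("M1", "M3"), ("M2", "M4"), ("M2", "M5"), ("M3", "M5")]
print(internal_reuse(edges, nodes))  # 1
```

A pure tree yields r(G) = 0, matching the intuition that it contains no internal reuse.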
Information Flow
Let us examine Henry and Kafura’s information flow measure, a well-known approach to
measuring the total level of information flow between individual modules and the rest of a system
(Henry and Kafura 1981).
To understand the measurement, consider the way in which data move through a system. We say
a local direct flow exists if either
1. A module invokes a second module and passes information to it, or
2. The invoked module returns a result to the caller
Similarly, we say that a local indirect flow exists if the invoked module returns information that is
subsequently passed to a second invoked module. A global flow exists if information flows from
one module to another via a global data structure.
Using these notions, we can describe two particular attributes of the information flow. The fan-in
of a module M is the number of local flows that terminate at M, plus the number of data structures
from which information is retrieved by M. Similarly, the fan-out of a module M is the number of
local flows that emanate from M, plus the number of data structures that are updated by M.
Based on these concepts, Henry and Kafura measure the information flow “complexity” of a module
M as
complexity(M) = length(M) × (fan-in(M) × fan-out(M))²
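Henry and Kafura's measure for a module M is usually given as length(M) × (fan-in(M) × fan-out(M))²; a minimal sketch, with illustrative module data:

```python
# Henry and Kafura information-flow complexity:
# complexity(M) = length(M) * (fan_in(M) * fan_out(M)) ** 2
def hk_complexity(length, fan_in, fan_out):
    return length * (fan_in * fan_out) ** 2

# Illustrative module: 100 lines long, 3 flows terminating at it, 2 emanating from it.
print(hk_complexity(100, 3, 2))  # 100 * (3*2)**2 = 3600
```

Note how the squared term makes modules with both high fan-in and high fan-out dominate the system total.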
Coupling in object-oriented systems refers to the degree of dependence between different modules
or classes. It measures how closely connected or intertwined the components of a system are. Low
coupling is generally desirable as it promotes better maintainability, reusability, and flexibility.
High coupling can lead to a lack of modularity and make the system more difficult to understand
and modify.
1. Static Analysis:
• Counting Dependencies: Analyze the source code to count the number of
dependencies between classes or modules. This includes method calls, attribute
references, and other forms of interactions. Tools like static code analyzers can help
in automating this process.
• Dependency Graphs: Create a dependency graph that visually represents the
relationships between classes or modules. Tools like Dependency Structure Matrix
(DSM) or various UML (Unified Modeling Language) tools can assist in creating
such visualizations.
2. Dynamic Analysis:
• Runtime Dependencies: Monitor and analyze dependencies during runtime.
Tools like profilers can help in identifying the actual dependencies that occur while
the program is executing.
• Code Coverage Analysis: Analyze code coverage during testing to identify how
different parts of the system are interconnected. A high degree of code coverage
might suggest a higher level of coupling.
3. Metrics:
• Coupling Metrics: Use quantitative metrics to measure coupling. Some common
metrics include:
• Afferent Coupling (Ca): The number of classes that depend on a
particular class.
• Efferent Coupling (Ce): The number of classes a particular class depends
on.
• Coupling Between Objects (CBO): The total number of dependencies a
class has (Ca + Ce).
• Lack of Cohesion in Methods (LCOM): Measures the lack of cohesion within a
class. A higher LCOM value may indicate poor class design and potential high
coupling.
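The coupling metrics above can be computed from a simple dependency list; the class names and dependencies here are an illustrative assumption:

```python
# Afferent coupling (Ca), efferent coupling (Ce), and CBO = Ca + Ce
# (following the definitions in the list above) from "A depends on B" pairs.
deps = [("Order", "Customer"), ("Order", "Product"),
        ("Invoice", "Order"), ("Report", "Order")]

def ca(cls):   # afferent: classes that depend on cls
    return sum(1 for a, b in deps if b == cls)

def ce(cls):   # efferent: classes that cls depends on
    return sum(1 for a, b in deps if a == cls)

def cbo(cls):
    return ca(cls) + ce(cls)

print(ca("Order"), ce("Order"), cbo("Order"))  # 2 2 4
```

Note that Chidamber and Kemerer's original CBO counts the set of coupled classes rather than summing Ca and Ce; the sum form here follows the definition given in the list above.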
Cohesion refers to the degree to which elements within a module work together to fulfill a single,
well-defined purpose. High cohesion means that elements are closely related and focused on a
single purpose, while low cohesion means that elements are loosely related and serve multiple
purposes.
Lack of cohesion metric (LCOM) is metric 6 in the Chidamber and Kemerer suite of metrics
(Chidamber and Kemerer 1994). Here, the cohesion of a class is characterized by how closely the
local methods are related to the local instance variables in the class. LCOM is defined as the
number of disjoint (i.e., nonintersecting) sets of local methods. Two methods in a class intersect if
they reference or modify common local instance variables. LCOM is an inverse cohesion measure;
higher values imply lower cohesion. Briand, Daly, and Wüst found that LCOM violates property
1 of Section 9.1.4, as it is not normalized (Briand et al. 1998). Since LCOM indicates inverse
cohesion, properties 2 through 4 are also not satisfied.
Tight class cohesion (TCC) and loose class cohesion (LCC) are based on connections between
methods through instance variables (Bieman and Kang 1995). Two or more methods have a direct
connection if they read or write to the same instance variable. Methods may also have an indirect
connection if one method uses one or more instance variables directly and the other uses the
instance variable indirectly by calling another method that uses the same instance variable. TCC
is based on the relative number of direct connections:
TCC(C) = NDC(C)/NP(C)
where NDC(C) is the number of direct connections in class C and NP(C) is the maximum number
of possible connections. LCC is based on the relative number of direct and indirect connections:
LCC(C) = (NDC(C) + NIC(C))/NP(C)
where NIC(C) is the number of indirect connections. The measures do not include constructor and
destructor methods in the computation, since they tend to initialize and free all instance variables
and will thus artificially increase the measured cohesion.
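A sketch of TCC and LCC for a hypothetical class whose method-to-instance-variable usage is given below; indirect connections are taken as the transitive closure of direct ones, and connections through method calls are omitted for simplicity:

```python
from itertools import combinations

# TCC/LCC sketch: two methods are "directly connected" if they use a common
# instance variable; LCC also counts transitively connected pairs.
# The class layout (method -> variables used) is an illustrative assumption.
uses = {"m1": {"a"}, "m2": {"a", "b"}, "m3": {"b"}, "m4": {"c"}}

methods = sorted(uses)
pairs = list(combinations(methods, 2))
np_c = len(pairs)                                  # NP(C) = N(N-1)/2 = 6

direct = {(p, q) for p, q in pairs if uses[p] & uses[q]}   # NDC(C) pairs

# Transitive closure of the direct-connection relation, for LCC.
connected = set(direct)
changed = True
while changed:
    changed = False
    for p, q in pairs:
        if (p, q) in connected:
            continue
        for r in methods:
            pr = (min(p, r), max(p, r))
            rq = (min(r, q), max(r, q))
            if r not in (p, q) and pr in connected and rq in connected:
                connected.add((p, q))
                changed = True

tcc = len(direct) / np_c
lcc = len(connected) / np_c
print(round(tcc, 2), round(lcc, 2))  # 0.33 0.5
```

Here m1 and m3 are only indirectly connected (through m2), and m4 is unconnected, so LCC exceeds TCC.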
EXAMPLE
The depth of inheritance tree (DIT) is metric 2 in the Chidamber and Kemerer suite of object-oriented
metrics (Chidamber and Kemerer 1994). Inheritance in a class diagram is represented as
a hierarchy or tree of classes. The nodes in the tree represent classes, and for each such class, the
DIT metric is the length of the maximum path from the node to the root of the tree. Chidamber
and Kemerer claim that DIT is a measure of how many ancestor classes can potentially affect this
class. This claim could only pertain to effects due to inheritance relations, and would not be
accurate due to multiple inheritance in C++ or the use of hierarchies of interfaces in Java (which
were unknown in 1994).
One of the key benefits of object-oriented development is its support for reuse through data
abstraction, inheritance, encapsulation, etc. This support helps developers to reuse existing
software components in several ways. Depending on the development language, they can reuse
existing packages and classes in a verbatim fashion without changes. Developers can also reuse
existing packages and classes as well as interfaces, types, generics, and templates in a leveraged
fashion by overriding and overloading inherited methods, by implementing interfaces,
instantiating generic classes or templates. Measurement of reuse involves an analysis of the
structures and models used to design and implement an object-oriented system. In addition, you
can measure reuse from one or both of two perspectives: (1) client perspective: the perspective of
a new system or system component that can potentially reuse existing components, and (2) server
perspective: the perspective of the existing components that may potentially be reused, for
example, a component library or package (Bieman and Karunanithi 1995).
From the client perspective, the potential reuse measures include the number of direct and indirect
server classes and interfaces reused. An indirect server class would include the classes that direct
servers use either directly or indirectly—this is essentially the number of ancestor classes in an
inheritance hierarchy. Another aspect of reuse involves the structure of the connections between
clients and servers. To quantify this structure, we can measure the number and length of paths
through a UML diagram that connects a client to indirect servers.
From the server perspective, we are concerned with the way a particular entity is being reused by
clients.
EXAMPLE
The number of children (NOC) is metric 3 in the Chidamber and Kemerer suite of object-oriented
metrics (Chidamber and Kemerer 1994). This metric relates to a node (class or interface) in an
inheritance tree or UML class diagram. NOC is computed by counting the number of immediate
successors (subclasses or subinterfaces) of a class or interface. NOC is a direct server reuse
measure.
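DIT and NOC can be read directly off a parent map; the hierarchy below is an illustrative assumption (single inheritance only):

```python
# DIT (depth of inheritance tree) and NOC (number of children)
# from a child -> parent map. "A" is the root class.
parent = {"B": "A", "C": "A", "D": "B"}

def dit(cls):
    """Length of the path from cls up to the root of the inheritance tree."""
    depth = 0
    while cls in parent:
        cls = parent[cls]
        depth += 1
    return depth

def noc(cls):
    """Number of immediate subclasses of cls."""
    return sum(1 for p in parent.values() if p == cls)

print(dit("D"), noc("A"), noc("B"))  # 2 2 1
```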
High DRE means the team catches most defects before release.
Low DRE indicates that many defects slip through to production, which may signal gaps in
testing or quality assurance.
• If 80 defects are found during testing, but 20 more are discovered by users after release,
those 20 are the latent defects that testing missed.
In DRE terms:
DRE = 80 / (80 + 20) × 100% = 80%
Since the number of latent defects in a software product is unknown at any point in time, it is
approximated by adding the number of defects removed during the phase to the number of defects
found later (but that existed during that phase).
For example, assume that the following table reflects the defects detected during the specified
phases and the phase where those defects were introduced.
The Defect Removal Effectiveness for each of the phases would be as follows:
Requirements DRE = 10 / (10+3+0+2+1) x 100% = 63%
Design DRE = (3+18) / (3+0+2+1+18+4+5+2) x 100% = 60%
Coding DRE = (0+4+26) / (0+2+1+4+5+2+26+8+7) x 100% = 55%
Testing DRE = (2+5+8) / (2+1+5+2+8+7) x 100% = 60%
Defect Removal Effectiveness can also be calculated for the entire development cycle to examine
defect detection efforts before the product is released to the field. According to Capers Jones,
world class organizations have Development DRE greater than 95%.
Development DRE = (Pre-release Defect) / (Total Defects) x 100%
= (10+3+2+18+4+5+26+8) / (10+3+2+1+18+4+5+2+26+8+7) x 100 = 88%
The longer a defect exists in a product before it is detected, the more expensive it is to fix. Knowing
the DRE for each phase can help an organization target its process improvement efforts to improve
defect detection methods where they can be most effective. Future DRE measures can then be used
to monitor the impact of those improvement efforts.
Chapter-11: Software Reliability Measurement and Prediction
Software reliability theory is a branch of reliability engineering that focuses on the study and
measurement of the reliability of software systems. Reliability refers to the ability of a system to
perform its intended function without failure over a specified period and under specified
conditions. In the context of software, reliability is crucial to ensure that a program behaves
correctly and consistently.
Here are some key concepts and basics of software reliability theory:
Failure:
A failure in software occurs when the system deviates from its specified behavior, producing
incorrect or unexpected results.
Fault:
A fault, also known as a bug or defect, is the root cause of a failure. It is a mistake in the code or
design that can lead to the system malfunctioning.
Error:
An error is a discrepancy between the actual and the intended behavior of the software. Errors are
typically caused by faults.
Reliability:
Software reliability is a measure of the probability that a software system will perform its intended
function without failure over a specified period and under specified conditions.
Failure Rate:
The failure rate is a measure of how frequently a system fails over time. It is often expressed as
the number of failures per unit of time.
Availability:
Availability is the probability that a system is operational at a given point in time. It is related to
reliability and is often expressed as a percentage.
Mean Time Between Failures (MTBF):
MTBF is the average time between consecutive failures of a system. It is a common metric used
to express reliability.
Mean Time to Failure (MTTF):
MTTF is similar to MTBF but is often used for non-repairable systems. It represents the average
time until the first failure.
Fault Tolerance:
Fault tolerance is the ability of a system to continue operating properly in the presence of faults.
This can involve the detection and correction of faults during runtime.
Reliability Modeling:
Various models, such as exponential models, Weibull models, and reliability block diagrams, are
used to mathematically represent the reliability characteristics of software systems.
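For instance, the exponential model assumes a constant failure rate λ, giving reliability R(t) = e^(-λt), the probability of surviving to time t without failure. A minimal sketch, where the failure-rate value is purely illustrative:

```python
import math

def reliability(failure_rate, t):
    """Exponential reliability model: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * t)

# Illustrative: 0.001 failures per hour over a 1000-hour mission
lam = 0.001
print(reliability(lam, 1000))  # e^-1, roughly 0.368

# For the exponential model, MTTF is simply 1 / lambda
print(1 / lam)                 # 1000.0 hours
```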
Testing and Validation:
Testing is a crucial part of ensuring software reliability. It involves the execution of a program
with the intent of finding errors or verifying that it meets specified requirements.
Fault Injection:
Fault injection is a technique used to intentionally introduce faults into a system to observe its
behavior and assess its reliability under different conditions.
Software Maintenance:
Changes and updates to software can introduce new faults or affect the reliability of the system.
Software maintenance practices are important for preserving and improving reliability.
Metrics and Measurement:
Various metrics, such as failure rate, mean time to failure, and reliability growth models, are used
to quantify and measure software reliability.
MTTR stands for Mean Time to Repair, and it is a crucial metric in the context of reliability engineering.
MTTR measures the average time it takes to restore a failed system to normal operation after a failure has
occurred. It is an essential component in the calculation of availability, which is the measure of a system's
operational performance over time.
• If a server crashes and it takes 3 hours to identify the issue, replace a faulty component, and reboot
the system, the repair time is 3 hours.
• If the server fails 5 times in a month, and the total repair time adds up to 15 hours, the MTTR
would be:
MTTR = 15 hours / 5 failures = 3 hours
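A sketch of this arithmetic, together with the standard steady-state availability formula A = MTBF / (MTBF + MTTR). The MTBF value used below is illustrative, not from the lecture:

```python
def mttr(total_repair_hours, failures):
    """Mean Time to Repair: total downtime divided by failure count."""
    return total_repair_hours / failures

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# The server example: 5 failures, 15 hours of total repair time
print(mttr(15, 5))             # -> 3.0 hours

# Illustrative: if the server runs 141 hours between failures on average
print(availability(141, 3.0))  # about 0.979, i.e. roughly 97.9% available
```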