Software Reliability and Maintenance
26/11/24
Organization
Introduction
Reliability metrics
Reliability growth modelling
Software Maintenance and Support
Summary
Introduction
Reliability of a software product:
Denotes its trustworthiness or
dependability
Can be defined as the probability of the
product working correctly over a given
period of time.
Users not only want highly reliable
products:
they also want a quantitative estimate of
reliability before making a buying
decision.
Introduction
Accurate measurement of software
reliability:
a very difficult problem
Several factors contribute to making
measurement of software reliability
difficult.
Major Problems in
Reliability Measurements
Errors do not cause failures
at the same frequency and
severity.
Measuring latent errors alone
is not enough.
The failure rate is observer-dependent.
Software Reliability
Intuitively:
a software product having a
large number of defects is
unreliable.
It is also clear:
reliability of a system
improves if the number of
defects is reduced.
Difficulties in Software
Reliability Measurement
(1)
No simple relationship between:
observed system reliability
and the number of latent
software defects.
Removing errors from parts of
software which are rarely used:
makes little difference to the
perceived reliability.
The 90-10 Rule
Experimental analysis of the
behavior of a large number of
programs shows:
90% of the total execution time is
spent in executing only 10% of
the instructions in the program.
The most used 10%
instructions:
called the core of the program.
Effect of 90-10 Rule on
Software Reliability
Least used 90% of statements:
called non-core, are executed only
during 10% of the total execution
time.
It may not be very surprising then:
removing 60% of the defects from the least
used parts would lead to only about
a 3% improvement in product
reliability.
Difficulty in Software
Reliability Measurement
Reliability improvement
from correction of a single
error:
depends on whether the error
belongs to the core or the
non-core part of the program.
Difficulty in Software
Reliability Measurement
(2)
The perceived reliability
depends to a large extent
upon:
how the product is used,
in technical terms, on its
operational profile.
Effect of Operational Profile on
Software Reliability
Measurement
If we select input data such that:
only “correctly”
implemented functions
are executed,
none of the errors will be
exposed and the
perceived reliability of the
product will be high.
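The effect of input selection on perceived reliability can be sketched as a usage-weighted failure probability. This is only an illustration under stated assumptions: the function names, the per-invocation failure probabilities, and the two user profiles below are all hypothetical and do not come from the slides.

```python
# Hypothetical product: three functions with assumed per-invocation
# failure probabilities (0.0 means "correctly" implemented).
FAILURE_PROB = {"open_file": 0.0, "save_file": 0.0, "export_pdf": 0.2}


def perceived_failure_probability(profile):
    """Usage-weighted failure probability for one operational profile.

    `profile` maps each function name to the fraction of invocations
    that go to it; the fractions are expected to sum to 1.
    """
    if abs(sum(profile.values()) - 1.0) > 1e-9:
        raise ValueError("profile fractions must sum to 1")
    return sum(share * FAILURE_PROB[name] for name, share in profile.items())


# A user who only touches the correctly implemented functions.
careful_user = {"open_file": 0.5, "save_file": 0.5, "export_pdf": 0.0}
# A user who only invokes the faulty function.
unlucky_user = {"open_file": 0.0, "save_file": 0.0, "export_pdf": 1.0}

print(perceived_failure_probability(careful_user))  # 0.0
print(perceived_failure_probability(unlucky_user))  # 0.2
```

The product is the same in both runs; only the operational profile changes, and with it the failure probability each user observes.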
Effect of Operational Profile on
Software Reliability
Measurement
On the other hand, if we
select the input data:
such that only functions
containing errors are invoked,
perceived reliability of the product will be low.
MTTF = \sum_{i=1}^{n-1} (t_{i+1} - t_i) / (n - 1)
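A minimal sketch of averaging the gaps between successive failure instants, (t_{i+1} - t_i) / (n - 1), assuming the n failure instants are recorded in increasing order. The function name and the sample timestamps are hypothetical.

```python
def mean_time_to_failure(failure_times):
    """Average gap between successive failure instants t_1 < t_2 < ... < t_n."""
    n = len(failure_times)
    if n < 2:
        raise ValueError("need at least two failure instants")
    # Pair each instant with its successor and take the difference.
    gaps = [later - earlier
            for earlier, later in zip(failure_times, failure_times[1:])]
    return sum(gaps) / (n - 1)


# Hypothetical failure instants, in hours since the start of observation.
observed = [10.0, 35.0, 80.0, 150.0]
print(mean_time_to_failure(observed))  # (25 + 45 + 70) / 3
```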
Mean Time to Repair
(MTTR)
Once failure occurs:
additional time is lost to fix
faults
MTTR:
measures average time it
takes to fix faults.
Mean Time Between
Failures (MTBF)
We can combine MTTF and
MTTR:
to get an availability metric:
MTBF=MTTF+MTTR
Reliability Metrics – contd.
Mean Time to Failure (MTTF)
average time between observed
failures (sometimes loosely called MTBF)
Availability = MTTF /
(MTTF + MTTR) = MTTF / MTBF
MTBF = Mean Time Between Failures = MTTF + MTTR
MTTR = Mean Time to Repair
Reliability = MTTF / (1 + MTTF)
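Taking MTBF = MTTF + MTTR as defined earlier, availability can be read as the fraction of each failure-to-failure cycle during which the system is up. The sketch below works through that arithmetic; the function names and the figures (98 hours of operation, 2 hours of repair) are illustrative assumptions.

```python
def mtbf(mttf, mttr):
    """Mean time between failures: time to fail plus time to repair."""
    return mttf + mttr


def availability(mttf, mttr):
    """Fraction of a failure-to-failure cycle spent operating."""
    return mttf / (mttf + mttr)


# Assumed figures, both in hours.
mttf_hours = 98.0
mttr_hours = 2.0

print(mtbf(mttf_hours, mttr_hours))          # 100.0
print(availability(mttf_hours, mttr_hours))  # 0.98
```

Both inputs must be expressed in the same time unit; the availability itself is unitless.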
Reliability metrics
All the reliability metrics we
discussed:
are centered around the probability
of system failures,
take no account of the
consequences of failures.
The severity of failures may be very
different.
Failure Classes
More severe types of failures:
may render the system totally
unusable.
To accurately estimate reliability of
a software product:
it is necessary to classify different
types of failures.
Failure Classes
Transient:
Transient failures occur only for
certain inputs.
Permanent:
Permanent failures occur for all input
values.
Recoverable:
When recoverable failures occur:
the system recovers with or without
operator intervention.
Failure Classes
Unrecoverable:
the system may have to be restarted.
Cosmetic:
These failures just cause minor
irritations,
and do not lead to incorrect results.
An example of a cosmetic failure:
mouse button has to be clicked twice
instead of once to invoke a GUI function.
Examples
Failure Class Example Metric
[Figure: ROCOF plotted against time]
Step Function Model
Assumes:
all errors contribute equally to
reliability growth
highly unrealistic:
we already know that different
errors contribute differently to
reliability growth.
Jelinski and Moranda Model
Recognizes that each time an error is
repaired:
reliability does not increase by a
constant amount.
Reliability improvement due to fixing
of an error:
assumed to be proportional to the
number of errors present in the system
at that time.
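A minimal sketch of the model as it is commonly formulated, which is an assumption here since the slides give no equations: the failure rate between repairs is taken as a constant phi times the number of errors still in the system. The initial error count and the value of phi are hypothetical, chosen only to make the numbers easy to follow.

```python
def jm_failure_rates(total_errors, phi):
    """Failure rate in force before each successive failure.

    Before the i-th failure (i = 1..N), N - (i - 1) errors remain, and the
    rate is taken as phi times that count.
    """
    return [phi * (total_errors - i) for i in range(total_errors)]


def jm_expected_gaps(total_errors, phi):
    """Expected time to the next failure (1 / rate) at each stage."""
    return [1.0 / rate for rate in jm_failure_rates(total_errors, phi)]


# Assumed values for illustration only: 5 initial errors, phi = 0.01 per hour.
gaps = jm_expected_gaps(total_errors=5, phi=0.01)
for stage, gap in enumerate(gaps, start=1):
    print(f"before failure {stage}: expected gap {gap:.1f} h")
```

With these assumed values the expected gap grows from 20 hours to 100 hours, and each repair adds more than the previous one, so the growth is not a fixed step per repair.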
Jelinski and Moranda Model
Realistic for many applications,
but still suffers from several
shortcomings.
Most probable failures (failure
our discussion.
Applicability of Reliability Growth
Models
There is no universally
applicable reliability growth
model.
Reliability growth is not
independent of application.
Applicability of Reliability Growth
Models
Fit observed data to several
growth models.
Take the one that best fits the
data.
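The "fit several models, keep the best" step can be sketched as follows. The two candidate curves (a straight-line and an exponential decline in ROCOF) are stand-ins chosen for brevity, not models named in the slides, and the weekly failure counts are made up.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_linear_model(ts, rocof):
    a, b = linear_fit(ts, rocof)
    return lambda t: a + b * t


def fit_exponential_model(ts, rocof):
    # Simplification: fit log(rocof) = log(A) + b*t by least squares,
    # which requires every rocof value to be positive.
    log_a, b = linear_fit(ts, [math.log(r) for r in rocof])
    return lambda t: math.exp(log_a + b * t)


def sum_squared_error(model, ts, rocof):
    return sum((model(t) - r) ** 2 for t, r in zip(ts, rocof))


def best_model(ts, rocof, fitters):
    """Fit every candidate and return (name, sse) for the lowest error."""
    scores = {name: sum_squared_error(fit(ts, rocof), ts, rocof)
              for name, fit in fitters.items()}
    name = min(scores, key=scores.get)
    return name, scores[name]


# Made-up observations: failures per week over six weeks of testing.
weeks = [1, 2, 3, 4, 5, 6]
failures_per_week = [16.0, 8.0, 4.0, 2.0, 1.0, 0.5]
candidates = {"linear": fit_linear_model, "exponential": fit_exponential_model}
print(best_model(weeks, failures_per_week, candidates)[0])  # exponential
```

Sum of squared errors is only one possible goodness-of-fit measure; a different criterion could rank the candidates differently on noisier data.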
Software Maintenance
Types:
Corrective: It is necessary to