
Software Reliability and Maintenance

26/11/24
Organization
 Introduction
 Reliability metrics
 Reliability growth modelling
 Software Maintenance and Support
 Summary
Introduction
 Reliability of a software product:
 Denotes its trustworthiness or
dependability
 Can be defined as the probability of the
product working correctly over a given
period of time.
 Users not only want highly reliable products:
  they also want a quantitative estimate of reliability before making a buying decision.
Introduction
 Accurate measurement of software
reliability:
 a very difficult problem
 Several factors contribute to making
measurement of software reliability
difficult.
Major Problems in
Reliability Measurements
 Errors do not cause failures at the same frequency and severity:
  measuring latent errors alone is not enough.
 The failure rate is observer-dependent.
Software Reliability
 Intuitively:
  a software product having a large number of defects is unreliable.
 It is also clear:
  reliability of a system improves if the number of defects is reduced.
Difficulties in Software Reliability Measurement (1)
 No simple relationship between:
  observed system reliability and the number of latent software defects.
 Removing errors from parts of the software which are rarely used:
  makes little difference to the perceived reliability.
The 90-10 Rule
 Experiments from analysis of the behavior of a large number of programs show:
  90% of the total execution time is spent in executing only 10% of the instructions in the program.
 The most used 10% of instructions:
  called the core of the program.
Effect of 90-10 Rule on
Software Reliability
 Least used 90% of statements:
  called the non-core, are executed only during 10% of the total execution time.
 It may not be very surprising then:
  removing 60% of defects from the least used parts would lead to only about 3% improvement in product reliability.
Difficulty in Software
Reliability Measurement
 Reliability improvement from correction of a single error:
  depends on whether the error belongs to the core or the non-core part of the program.
Difficulty in Software
Reliability Measurement
(2)
 The perceived reliability depends to a large extent upon:
  how the product is used,
  i.e. in technical terms, on its operational profile.
Effect of Operational Profile on
Software Reliability
Measurement
 If we select input data such that:
  only “correctly” implemented functions are executed,
  none of the errors will be exposed,
  perceived reliability of the product will be high.
Effect of Operational Profile on
Software Reliability
Measurement
 On the other hand, if we select the input data:
  such that only functions containing errors are invoked,
  perceived reliability of the system will be low.
Software Reliability
 Different users use a software product in different ways:
  defects which show up for one user may not show up for another.
 Reliability of a software product:
  clearly observer-dependent,
  cannot be determined absolutely.
Difficulty in Software
Reliability Measurement
(3)
 Software reliability keeps changing throughout the life of the product:
  each time an error is detected and corrected.
Hardware vs. Software
Reliability
 Hardware failures:
 inherently different from software
failures.
 Most hardware failures are due
to component wear and tear:
 some component no longer
functions as specified.
Hardware vs. Software
Reliability
 A logic gate can be stuck at 1 or 0,
  or a resistor might short circuit.
 To fix hardware faults:
  replace or repair the failed part.
Hardware vs. Software
Reliability
 Software faults are latent:
 system will continue to fail:

unless changes are made to
the software design and
code.
Hardware vs. Software
Reliability
 Because of this difference in the effect of faults:
  many metrics that are appropriate for hardware reliability measurement
  are not good software reliability metrics.
Hardware vs. Software
Reliability
 When hardware is repaired:
  its reliability is maintained.
 When software is repaired:
  its reliability may increase or decrease.
Hardware vs. Software
Reliability
 Goal of hardware reliability study:
  stability (i.e. interfailure times remain constant)
 Goal of software reliability study:
  reliability growth (i.e. interfailure times increase)
Bathtub Curve for Hardware Reliability
[figure: hardware failure rate over time]
Bathtub Curve for Software Reliability
[figure: software failure rate over time]
Reliability Metrics
 Different categories of software
products have different
reliability requirements:
 level of reliability required for a
software product should be
specified in the SRS document.
Reliability Metrics
 A good reliability measure should be observer-independent,
  so that different people can agree on the reliability.
Rate of occurrence of
failure (ROCOF):
 ROCOF measures:
  the frequency of occurrence of failures.
 To measure ROCOF:
  observe the behavior of a software product in operation over a specified time interval,
  count the total number of failures during the interval.
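The procedure above can be sketched in a few lines of Python; the failure log and observation interval are hypothetical values, not from the slides:

```python
def rocof(failure_times, interval_start, interval_end):
    """Rate of occurrence of failure: number of failures observed
    per unit time over a specified observation interval."""
    count = sum(1 for t in failure_times if interval_start <= t <= interval_end)
    return count / (interval_end - interval_start)

# Hypothetical failure log (hours); observe over a 100-hour interval:
print(rocof([5, 27, 51, 80], 0, 100))  # → 0.04
```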
Mean Time To Failure
(MTTF)
 Average time between two
successive failures:
 observed over a large number
of failures.
Mean Time To Failure
(MTTF)
 MTTF is not as appropriate for software as for hardware:
  hardware fails due to a component’s wear and tear,
  so MTTF indicates how frequently the component fails;
  when a software error is detected and repaired,
  the same error never appears again.
Mean Time To Failure
(MTTF)
 We can record failure data for n failures:
  let the failure times be t1, t2, …, tn
  calculate the interfailure times (ti+1 - ti)
 The average value is MTTF:
  MTTF = Σ(ti+1 - ti) / (n - 1)
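The MTTF calculation above can be sketched as follows; the recorded failure times are hypothetical:

```python
def mttf(failure_times):
    """Mean Time To Failure: average of the interfailure times
    (ti+1 - ti) over n recorded failures."""
    gaps = [t2 - t1 for t1, t2 in zip(failure_times, failure_times[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical failure times t1..t5 (hours since start of observation):
print(mttf([10, 32, 66, 112, 170]))  # → 40.0
```

Since the sum of interfailure times telescopes, this is the same as (tn - t1) / (n - 1).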
Mean Time to Repair
(MTTR)
 Once failure occurs:
 additional time is lost to fix
faults
 MTTR:
 measures average time it
takes to fix faults.
Mean Time Between
Failures (MTBF)
 We can combine MTTF and MTTR:
  to get an availability metric:
  MTBF = MTTF + MTTR
 An MTBF of 100 hours would indicate:
  once a failure occurs, the next failure is expected after 100 hours of clock time (not running time).
Probability of Failure on
Demand (POFOD)
 Unlike other metrics
 This metric does not explicitly involve
time.
 Measures the likelihood of the
system failing:
 when a service request is made.
 POFOD of 0.001 means:
  1 out of 1000 service requests may result in a failure.
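POFOD can be estimated from a request log as the fraction of requests that fail; a minimal sketch with hypothetical counts:

```python
def pofod(num_failed_requests, num_requests):
    """Probability of Failure on Demand: fraction of service
    requests that result in a failure."""
    return num_failed_requests / num_requests

# Hypothetical: 3 failures observed across 3000 service requests.
print(pofod(3, 3000))  # → 0.001
```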
Availability
 Measures how likely the system will be available for use over a period of time:
  considers the number of failures occurring during a time interval,
  also takes into account the repair time (down time) of the system.
Availability
 This metric is important for systems like:
  telecommunication systems, operating systems, etc., which are supposed to be never down,
  where repair and restart times are significant and loss of service during that time is important.
Reliability metrics
 Failures which are transient and whose consequences are not serious:
  of little practical importance in the use of a software product,
  such failures can at best be minor irritants.
Reliability Metrics – contd.
 Mean Time to Failure (MTTF):
  average time between observed failures (also loosely called MTBF)
 Availability = MTBF / (MTBF + MTTR)
  MTBF = Mean Time Between Failures
  MTTR = Mean Time To Repair
 Reliability = MTBF / (1 + MTBF)
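The summary formulas above, coded directly as given on the slide (the numeric values are hypothetical):

```python
def availability(mtbf, mttr):
    """Availability = MTBF / (MTBF + MTTR), per the slide's formula."""
    return mtbf / (mtbf + mttr)

def reliability(mtbf):
    """Reliability = MTBF / (1 + MTBF), per the slide's formula."""
    return mtbf / (1 + mtbf)

# Hypothetical values: MTBF = 98 hours, MTTR = 2 hours.
print(availability(98, 2))  # → 0.98
print(reliability(99))      # → 0.99
```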
Reliability metrics
 All reliability metrics we discussed:
  centered around the probability of system failures,
  take no account of the consequences of failures,
  yet the severity of failures may be very different.
Failure Classes
 More severe types of failures:
 may render the system totally
unusable.
 To accurately estimate reliability of
a software product:
 it is necessary to classify different
types of failures.
Failure Classes
 Transient:
  transient failures occur only for certain inputs.
 Permanent:
  permanent failures occur for all input values.
 Recoverable:
  when recoverable failures occur, the system recovers with or without operator intervention.
Failure Classes
 Unrecoverable:
  the system may have to be restarted.
 Cosmetic:
  these failures just cause minor irritations,
  do not lead to incorrect results.
 An example of a cosmetic failure:
  a mouse button has to be clicked twice instead of once to invoke a GUI function.
Examples
 Permanent, non-corrupting:
  Example: ATM fails to operate with any card; must restart to correct.
  Metric: ROCOF = .0001 (time unit: days)
 Transient, non-corrupting:
  Example: magnetic stripe can't be read on an undamaged card.
  Metric: POFOD = .0001 (time unit: transactions)
Reliability Growth
Modelling
 A reliability growth model:
  a model of how software reliability grows as errors are detected and repaired.
 A reliability growth model can be used to predict:
  when (or if at all) a particular level of reliability is likely to be attained,
  i.e. how long to test the system.
Reliability Growth
Modelling
 There are two main types of uncertainty in modelling reliability growth, which render any reliability measurement inaccurate:
 Type 1 uncertainty:
  our lack of knowledge about how the system will be used, i.e. its operational profile.
Reliability Growth
Modelling
 Type 2 uncertainty:
  reflects our lack of knowledge about the effect of fault removal.
 When we fix a fault:
  we are not sure the corrections are complete and successful, and that no other faults are introduced.
 Even if the faults are fixed properly:
  we do not know how much the interfailure time will improve.
Step Function Model
 The simplest reliability growth
model:
 a step function model
 The basic assumption:
 reliability increases by a constant
amount each time an error is
detected and repaired.
Step Function Model
[figure: ROCOF vs. time, dropping by a constant step at each repair]
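The step function model can be sketched by letting ROCOF drop by a fixed amount at each repair; the initial ROCOF and step size below are assumed illustrative values:

```python
def step_rocof(initial_rocof, step, num_repairs):
    """ROCOF after each repair under the step function model:
    each repair improves reliability by the same constant amount."""
    return [max(initial_rocof - i * step, 0.0) for i in range(num_repairs + 1)]

# Hypothetical: start at 0.05 failures/hour, each repair removes 0.01.
for r in step_rocof(0.05, 0.01, 5):
    print(round(r, 4))  # prints 0.05, 0.04, 0.03, 0.02, 0.01, 0.0
```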
Step Function Model
 Assumes:
  all errors contribute equally to reliability growth.
 Highly unrealistic:
  we already know that different errors contribute differently to reliability growth.
Jelinski and Moranda Model
 Realizes each time an error is
repaired:
 reliability does not increase by a
constant amount.
 Reliability improvement due to fixing
of an error:
 assumed to be proportional to the
number of errors present in the system
at that time.
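A minimal sketch of the Jelinski and Moranda idea: the failure rate before the i-th failure is proportional to the number of faults still present. N (initial fault count) and phi (per-fault failure rate) are hypothetical parameters, not values from the slides:

```python
def jm_failure_rate(N, phi, i):
    """Jelinski-Moranda: failure rate before the i-th failure is
    proportional to the (N - i + 1) faults remaining in the system."""
    return phi * (N - i + 1)

# Hypothetical: N = 100 initial faults, phi = 0.002 failures/hour per fault.
# The rate falls as faults are repaired:
print(jm_failure_rate(100, 0.002, 1))    # before the 1st failure
print(jm_failure_rate(100, 0.002, 50))   # before the 50th failure
print(jm_failure_rate(100, 0.002, 100))  # before the last failure
```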
Jelinski and Moranda Model
 Realistic for many applications,
  but still suffers from several shortcomings.
 Most probable failures (failure types which occur frequently):
  discovered early during the testing process.
Jelinski and Moranda Model
 Repairing faults discovered early:
  contributes the most to reliability growth.
 Rate of reliability growth should be large initially:
  and slow down later on,
  contrary to the assumption of the model.
Littlewood and Verrall’s Model
 Allows for negative reliability growth:
  when software repair introduces further errors.
 Models the fact that as errors are repaired:
  average improvement in reliability per repair decreases.
Littlewood and Verrall’s Model
 Treats a corrected bug’s contribution to reliability improvement as:
  an independent random variable having a Gamma distribution.
 Bugs with large contributions to reliability are removed:
  earlier than bugs with smaller contributions,
  representing diminishing returns as testing continues.
Reliability growth models
 There are more complex reliability growth models,
  giving more accurate approximations to the reliability growth,
  but these models are out of the scope of our discussion.
Applicability of Reliability Growth
Models
 There is no universally
applicable reliability growth
model.
 Reliability growth is not
independent of application.
Applicability of Reliability Growth
Models
 Fit observed data to several
growth models.
 Take the one that best fits the
data.
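The fit-and-select procedure above can be sketched as a least-squares comparison; the observed data and candidate models below are hypothetical illustrations, not a real fitting library:

```python
def best_fit(interfailure_times, models):
    """Return the name of the model whose predicted interfailure times
    have the smallest sum of squared errors against the observed data."""
    def sse(predict):
        return sum((predict(i) - t) ** 2
                   for i, t in enumerate(interfailure_times, start=1))
    return min(models, key=lambda name: sse(models[name]))

# Hypothetical observed interfailure times, growing roughly linearly:
observed = [10, 21, 29, 41, 52]
models = {
    "linear-growth": lambda i: 10 * i,  # interfailure time grows each repair
    "no-growth": lambda i: 30,          # reliability does not improve
}
print(best_fit(observed, models))  # prints "linear-growth"
```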
Software Maintenance
 Types:
  Corrective: necessary to overcome the failures observed while the system is in use.
  Adaptive: enables the system to run on a new platform/OS/hardware interface, etc.
  Perfective: changes functionalities as per customer demand / supports new features.
Characteristics of Software Evolution
 Lehman and Belady studied the characteristics of several software products and developed the following laws:
  1st law: a software product must change continually or become progressively less useful.
  2nd law: the structure of a program tends to degrade as more and more maintenance is carried out.
  3rd law: the rate at which code is written or modified is approximately the same during development and maintenance.
Software Maintenance
Model 1
Software Maintenance
Model 2
Empirical Estimation of Maintenance Cost
 Annual Change Traffic (ACT) [Boehm, 1981]:
  the fraction of a software product’s source instructions which undergo change during a typical year, either through addition or deletion.
  ACT = (KLOCadded + KLOCdeleted) / KLOCtotal
 Total maintenance cost = ACT × development cost
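Boehm's ACT estimate above, as a small sketch; the KLOC figures and development cost are hypothetical:

```python
def annual_change_traffic(kloc_added, kloc_deleted, kloc_total):
    """ACT: fraction of a product's source instructions that change
    (through addition or deletion) during a typical year."""
    return (kloc_added + kloc_deleted) / kloc_total

def maintenance_cost(act, development_cost):
    """Estimated annual maintenance cost = ACT x development cost."""
    return act * development_cost

# Hypothetical: 5 KLOC added, 3 KLOC deleted in a 100 KLOC product
# that cost $1,000,000 to develop.
act = annual_change_traffic(5, 3, 100)
print(act)                               # → 0.08
print(maintenance_cost(act, 1_000_000))  # → 80000.0
```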
Software Maintenance and
Support
 Both follow ITIL (Information Technology Infrastructure Library).
 ITIL: a set of practices for information technology service management (ITSM).
 It aligns IT services with the business needs of the organization.
Software Maintenance and
Support
 Maintenance is a part of the SDLC:
  maintenance originates from a Change Request,
  directly linked to changes to the application.
 Support is directly linked to serving (supporting) customers to do their jobs:
  Level 1: interface points
  Level 2: quick fixes / minor enhancements
  focus on corrective modifications and changes.
Reference
 Software Engineering, Dr. R. Mall
