
CHAPTER FIVE

Computer Reliability

Dr. Sherif M. Tawfik

1
Learning Objectives
1. Introduction
2. What is Software Reliability?
3. Software reliability and Hardware reliability
4. Need for software reliability measurement
5. Increasing reliability
6. Software Metrics for Reliability
7. Two Kinds of Data-related Failure

2
Introduction
• Computer systems are sometimes unreliable
– Erroneous information in databases
– Misinterpretation of database information
– Malfunction of embedded systems
• Effects of computer errors
– Inconvenience
– Bad business decisions
– Injuries or Fatalities

3
What is Software Reliability?
• According to ANSI, “Software Reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment”.

• The IEEE defines reliability as “The ability of a system or component to perform its required functions under stated conditions for a specified period of time.”
4
Software reliability and
Hardware reliability
• Software reliability: Software reliability is not measured on the basis of time, because software never wears out; there is no problem of rust as there is with hardware.
• Hardware parts: Electronic and mechanical parts may become old and wear out with time and usage. In hardware reliability, time is used to define reliability: how long the hardware keeps working without any defect.

5
Hardware reliability
• In hardware reliability, the first phase of manufacturing (burn-in) may show a high number of faults.
• After these faults are discovered and removed, the number decreases, and in the second phase (useful life) only a few faults remain.
• After this phase comes the wear-out phase, in which the physical components wear out due to time and usage and the number of faults increases again.
Figure: phases of hardware when considering reliability — Burn-in, Useful Life, Wear-out.


6
Software reliability
• In software reliability, the first phase, i.e. integration and testing, shows a high number of faults; after the faults are removed, only a few remain, and this fault-removal process continues at a slower rate.
• Software products do not wear out with time and usage, but may become outmoded at a later stage.

Figure: phases of software when considering reliability — Integration and Testing, Useful Life, Obsolescence.


7
Distinct Characteristics of
Software and Hardware
• Fault: Software faults are mainly design faults, whereas hardware faults are mostly physical.
• Wear out: This is an important point; software remains reliable over time instead of wearing out like hardware. It becomes obsolete (out of fashion) if the environment for which it was developed changes. Hence software may be retired due to environmental changes, new requirements, new expectations, etc.

8
Distinct Characteristics of
Software and Hardware
• Software is not manufactured: Software is developed; it is not manufactured like hardware. It depends upon the individual skills and creative abilities of the developers, which are very difficult to specify, even more difficult to quantify, and virtually impossible to standardize.
• Time dependency and life cycle: Software reliability is not a function of operational time, whereas hardware reliability is.
• Environmental factors: Environmental factors do not affect software reliability, but they do affect hardware reliability.

9
Need for software reliability
measurement
• In any software industry, system quality plays an important role.
• We know that hardware quality is consistently high, so if system quality changes, it is because of variation in software quality only.
• Software quality can be measured in many ways; reliability is a user-oriented measure of software quality.
10
Need for software reliability
measurement
• As an example, assume that there are three programs executing to solve a problem.
• By finding the reliability of each program, we can determine which program has the lowest reliability and put more effort into modifying that program to improve the overall reliability of the system.
• So there is always a need to measure reliability.

11
Increasing reliability
• Reliability can be increased by preventing the errors described above and developing quality software through all stages of the software life cycle. To do this,
– We have to ensure that the requirements clearly specify the functionality of the final product. (Requirements phase)

– Among the phases of software reliability, the second one, i.e. useful life, is the most important, so the software product must be maintained carefully. We have to ensure that the generated code supports maintainability, to avoid introducing additional errors. (Coding phase)

12
Increasing reliability

– Next, we have to verify that all the requirements specified in the requirements phase are satisfied. (Testing phase)

• As reliability is an attribute of quality, we can say that reliability depends on software quality.
• So to build highly reliable software, there is a need to measure the attributes of quality that apply at each development cycle.
• Software metrics are used to measure these attributes. The following slides show the different types of metrics that are applied to improve the reliability of the system.

13
Software Metrics for Reliability

• Metrics are used to improve the reliability of the system by identifying problems in:
– the requirements (for specification),
– the coding (for errors), and
– the testing (for verification) phases.

14
Requirements Reliability Metrics
• Requirements indicate what features the software must contain.
• For this requirements document, a clear understanding between client and developer should exist; otherwise it is difficult to write these requirements correctly.
• The requirements must have a valid structure to avoid the loss of valuable information.
• Next, the requirements should be thorough and detailed, so that the design phase is easier.
• The requirements should not contain inadequate information.

15
Requirements Reliability Metrics
• The next factor is ease of communication. There should not be any ambiguous data in the requirements; if there is, it is difficult for the developer to implement the specification.
• Requirements reliability metrics evaluate the above quality factors of the requirements document.

16
Design and Code Reliability Metrics
• The quality factors that exist in the design and coding phase are complexity, size and modularity.
• More complex modules are difficult to understand and have a higher probability of errors, so the complexity of the modules should be kept low.
• Size depends upon factors such as total lines, comments, executable statements, etc.
• According to SATC, the most effective evaluation is the combination of size and complexity.

17
Design and Code Reliability Metrics
• Reliability will decrease if modules have a combination of high complexity and large size, or of high complexity and small size. In the latter combination reliability also decreases, because the small size results in short code which is difficult to alter.
• These metrics are also applicable to object-oriented code, but in that case additional metrics are required to evaluate the quality.
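A minimal sketch of how size and complexity might be combined per module, assuming Python source code and the standard ast module; the choice of branch constructs and the report format are illustrative assumptions, not the SATC definition:

import ast

# Node types counted as decision points: a naive complexity proxy (an assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def size_and_complexity(source: str) -> dict:
    """Return {function name: (size in lines, naive complexity)} for one module."""
    tree = ast.parse(source)
    report = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            size = node.end_lineno - node.lineno + 1              # lines in the function
            complexity = 1 + sum(isinstance(n, BRANCH_NODES)      # decision points + 1
                                 for n in ast.walk(node))
            report[node.name] = (size, complexity)
    return report

Functions that score high on both size and complexity would then be the first candidates for review, in line with the combination described above.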

18
Testing Reliability Metrics
• Testing reliability metrics use two approaches to evaluate reliability.
• First, ensure that the system provides all the functions specified in the requirements; this reduces errors due to missing functionality.
• The second approach is evaluating the code, finding the errors and fixing them.

19
Basic Reliability Metrics
• Some reliability metrics which can be used to quantify the reliability of a software product are discussed below:
• MEAN TIME TO FAILURE (MTTF): The first metric we should understand is the time that a system is not failed, or is available. Often referred to as “uptime” in the IT industry, the length of time that a system is online between outages or failures can be thought of as the “time to failure” for that system.

20
Basic Reliability Metrics
• For example, if I bring my RAID array online on Monday at
noon and the system functions normally until a disk failure
Friday at noon, it was “available” for exactly 96 hours.
• If this happens every week, with repairs lasting from Friday
noon until Monday noon, I could average these numbers to
reach a “mean time to failure” or “MTTF” of 96 hours.
• I would probably also call my system vendor and demand that
they replace this horribly unreliable device.
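As a worked version of the RAID example above, here is a minimal Python sketch (Python and the list of identical weekly uptimes are illustrative assumptions; the 96-hour figure comes from the slide):

# Observed failure-free operating periods (hours), one per week, as in the example above.
uptimes = [96, 96, 96, 96]

mttf = sum(uptimes) / len(uptimes)   # average uptime before a failure
print(f"MTTF = {mttf} hours")        # -> MTTF = 96.0 hours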

21
Basic Reliability Metrics
• MEAN TIME BETWEEN FAILURES (MTBF): We can combine the MTTF and MTTR metrics to get the MTBF metric.
• MTBF = MTTF + MTTR
• Thus, an MTBF of 300 hours indicates that once a failure occurs, the next failure is expected to occur only after 300 hours.
• In this case the time measurements are real time, not the execution time as in MTTF.

22
Basic Reliability Metrics
• MEAN TIME TO REPAIR (MTTR)
• Once a failure occurs, some time is required to fix the error.
• MTTR measures the average time it takes to track down the errors causing the failure and to fix them.
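Putting the MTBF and MTTR slides together, a minimal Python sketch of MTBF = MTTF + MTTR; the 96-hour MTTF and the 72-hour repair window (Friday noon to Monday noon) are carried over from the earlier RAID example as assumptions, not figures stated on these slides:

mttf_hours = 96    # mean time to failure, from the RAID example
mttr_hours = 72    # mean time to repair: Friday noon to Monday noon

mtbf_hours = mttf_hours + mttr_hours   # MTBF = MTTF + MTTR
print(f"MTBF = {mtbf_hours} hours")    # -> 168 hours, i.e. roughly one failure per week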

23
24
Basic Reliability Metrics
• RATE OF OCCURRENCE OF FAILURE (ROCOF)
• It is the number of failures occurring per unit time interval, i.e. the number of unexpected events over a particular period of operation.
• ROCOF is the frequency with which unexpected behaviour is likely to occur.
• A ROCOF of 0.02 means that two failures are likely to occur in each 100 operational time units. It is also called the failure intensity metric.
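A minimal Python sketch of the ROCOF calculation behind the 0.02 example; the failure count and observation window are simply the numbers implied by this slide:

failures = 2                      # failures observed
operational_time_units = 100      # operational time units observed

rocof = failures / operational_time_units
print(f"ROCOF = {rocof}")         # -> 0.02: two failures per 100 operational time units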

25
Basic Reliability Metrics
• PROBABILITY OF FAILURE ON DEMAND (POFOD)
• POFOD is defined as the probability that the system will fail when a service is requested: the number of system failures divided by the number of system inputs (demands).
• POFOD is the likelihood that the system will fail when a service request is made.
• A POFOD of 0.1 means that one out of ten service requests may result in failure. POFOD is an important measure for safety-critical systems, and is appropriate for protection systems where services are demanded only occasionally.
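A minimal Python sketch of POFOD as the fraction of demands that end in failure; the counts below are invented purely to reproduce the 0.1 example on this slide:

demands = 1000            # service requests made to the system
failed_demands = 100      # requests that resulted in a system failure

pofod = failed_demands / demands
print(f"POFOD = {pofod}")   # -> 0.1: roughly one in ten requests may fail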

26
Basic Reliability Metrics
• AVAILABILITY (AVAIL)
• Availability is the probability that the system is available for use at a given time. It takes into account the repair time and the restart time for the system.
• An availability of 0.995 means that in every 1000 time units, the system is likely to be available for 995 of these.
• It is the percentage of time that a system is available for use, taking into account planned and unplanned downtime. If a system is down an average of four hours out of every 100 hours of operation, its AVAIL is 96%.
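A minimal Python sketch of the availability calculation, using the four-hours-of-downtime-per-100-hours figure from this slide:

total_hours = 100          # hours of scheduled operation
downtime_hours = 4         # planned + unplanned downtime in that period

avail = (total_hours - downtime_hours) / total_hours
print(f"AVAIL = {avail:.0%}")   # -> 96%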

27
Two Kinds of Data-related Failure

• A computerized system may fail because wrong data were entered into it.
• A computerized system may fail because people incorrectly interpret the data they retrieve.

28
Disfranchised Voters
• November 2000 general election
• Florida disqualified thousands of voters
• Reason: People identified as felons
• Cause: Incorrect records in voter database
• Consequence: May have affected outcome
of national presidential election

29
False Arrests
• Sheila Jackson: arrested and spent five days in detention after being mistaken for Shirley Jackson.

• Terry Dean Rogan: arrested after someone stole his identity.
– Arrested five times.

30
Accuracy of NCIC Records
• March 2003: Justice Dept. announces FBI not
responsible for accuracy of National Crime
Information Center (NCIC) information

• Should government take responsibility for data correctness?

31
Dept. of Justice Position
• Impractical for FBI to be responsible for data’s
accuracy
• Much information provided by other law
enforcement and intelligence agencies
• Agents should be able to use discretion
• If Privacy Act strictly followed, much less
information would be in NCIC
• Result: fewer arrests

32
Position of Privacy Advocates
• Number of records is increasing
• More erroneous records  more false
arrests
• Accuracy of NCIC records more important
than ever

33
Errors When Data Are Correct
• Assume data correctly fed into
computerized system
• System may still fail if there is an error in
its programming

34
Errors Leading to System
Malfunctions
• Qwest sent incorrect bills to cell phone customers
• Faulty United States Department of Agriculture (USDA) beef price reports
• U.S. Postal Service returned mail addressed to Patent
and Trademark Office
• New York City Housing Authority overcharged renters
• About 450 California prison inmates mistakenly
released

35
Errors Leading to System
Failures
• Ambulance dispatch system in London
• Japan’s air traffic control system
• Comair’s Christmas Day shutdown (The 2004 crash
of a critical legacy system at Comair is a classic risk
management mistake that cost the airline $20 million
and badly damaged its reputation)
• NASDAQ stock exchange shut down
• Insulin pump demo at Black Hat conference
36
Comair Cancelled All Flights on
Christmas Day, 2004

AP Photo/Al Behrman, File

37
Analysis: E-Retailer Posts Wrong
Price, Refuses to Deliver
• Amazon.com in Britain offered the iPAQ for £7 instead of £275
• Orders flooded in
• Amazon.com shut down site, refused to deliver
unless customers paid true price
• Was Amazon.com wrong to refuse to fill the
orders?

38
