Availability and Reliability
Slide 1
Principal dependability
properties
Slide 2
Reliability
The probability of failure-free
system operation over a specified
time in a given environment for a
given purpose
Slide 3
Availability
The probability that a system, at a
point in time, will be operational and
able to deliver the requested services
Slide 4
Availability specification
Both reliability and availability
attributes can be expressed as
numbers:
Availability of 0.999 means that the
system is up and running for 99.9% of
the time.
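As a rough illustration of what an availability figure implies, the sketch below converts it into expected downtime. The function name and the assumption of a continuously running service over a non-leap year are illustrative and not part of the slides.

```python
def downtime_per_period(availability: float, period_hours: float) -> float:
    """Return the expected number of unavailable hours within a period."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours


# Assumption: the service is expected to run continuously for a non-leap year.
HOURS_PER_YEAR = 365 * 24

yearly_downtime = downtime_per_period(0.999, HOURS_PER_YEAR)
print(f"{yearly_downtime:.2f} hours of downtime per year")  # prints 8.76 hours ...
```

Under these assumptions, an availability of 0.999 still allows close to nine hours of outage per year.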
Availability and reliability, 2013
Slide 5
Reliability specification
Probability of failure on demand
(POFOD) of 0.0001 means that on
average 1 in 10,000 demands for
service from a system will fail in
some way
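To make the POFOD figure concrete, the sketch below computes the expected number of failed demands and the chance of seeing at least one failure. It assumes that demands fail independently of each other, which is a simplification introduced here; the function names are hypothetical.

```python
def expected_failures(pofod: float, demands: int) -> float:
    """Average number of failed demands out of `demands` requests."""
    return pofod * demands


def prob_at_least_one_failure(pofod: float, demands: int) -> float:
    """Chance of one or more failures, assuming independent demands."""
    return 1.0 - (1.0 - pofod) ** demands


POFOD = 0.0001
print(expected_failures(POFOD, 10_000))          # about 1 failure on average
print(prob_at_least_one_failure(POFOD, 10_000))  # roughly 0.63
```

The second figure shows that "1 in 10,000 on average" does not mean a failure is certain within 10,000 demands.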
Slide 6
Slide 7
Slide 8
Slide 9
Availability perception
Availability is usually expressed as
a percentage of the time that the
system is available to deliver
services, e.g. 99.9%.
Slide 10
Slide 11
Subjective availability
The number of users affected by
the service outage.
Loss of service in the middle of the
night is less important for many
systems than loss of service during
peak usage periods.
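One possible way to capture this effect is to weight each outage by the number of users it affects. The sketch below is illustrative only: the `Outage` record, the user counts and the "user-minutes lost" measure are assumptions, not a metric defined in the slides.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    label: str
    minutes: float
    active_users: int  # hypothetical number of users online during the outage


def user_minutes_lost(outage: Outage) -> float:
    """Weight the outage length by the number of users who experienced it."""
    return outage.minutes * outage.active_users


# Two outages of equal length count the same in a percentage availability figure.
night = Outage("night-time outage", minutes=30, active_users=50)
peak = Outage("peak-time outage", minutes=30, active_users=5000)

for outage in (night, peak):
    print(outage.label, user_minutes_lost(outage))
```

Both outages reduce the percentage availability by the same amount, yet with these made-up numbers the peak-time outage costs 100 times as many user-minutes.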
Slide 12
Slide 13
Reliability metrics
Probability of failure on demand
(POFOD)
Probability that a system will not
deliver a service correctly when
requested
Used for systems where demands are
infrequent and intermittent
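A simple way to estimate POFOD from observations is to divide the number of failed demands by the total number of demands. The sketch below assumes a hypothetical log in which each demand is recorded as served correctly (True) or not (False).

```python
from typing import Sequence


def estimate_pofod(outcomes: Sequence[bool]) -> float:
    """Estimate POFOD as failed demands divided by total demands.

    Each entry in `outcomes` is True if the demand was served correctly.
    """
    if not outcomes:
        raise ValueError("at least one observed demand is required")
    failures = sum(1 for served_correctly in outcomes if not served_correctly)
    return failures / len(outcomes)


# Hypothetical test log: 2000 demands, one of which failed.
log = [True] * 1999 + [False]
print(estimate_pofod(log))  # 0.0005
```

With few observed demands the estimate is coarse, so a low POFOD target needs a correspondingly large number of test demands to demonstrate.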
Slide 14
Slide 15
[Diagram: fault, error, failure]
Slide 16
Faults-errors-failures
Fault
Error
Failure
Slide 17
Slide 18
Slide 19
Slide 20
Reliability achievement
Fault avoidance
Development techniques are used
that either minimise the
possibility of mistakes or trap
mistakes before they result in the
introduction of system faults.
Slide 21
Slide 22
Fault tolerance
Run-time techniques are used to
ensure that system faults do not
result in system errors and/or
that system errors do not lead to
system failures.
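The idea can be sketched with a deliberately faulty function and a run-time wrapper that detects the resulting error and recovers before the caller sees a failure. The sensor-reading scenario, the function names and the fallback to the last good value are assumptions chosen for illustration; they are one of many possible recovery strategies.

```python
def mean_reading(readings):
    # Deliberate fault: there is no guard against an empty list, so the
    # function raises ZeroDivisionError when no readings have arrived.
    return sum(readings) / len(readings)


def tolerant_mean_reading(readings, last_good_value):
    """Detect the error at run time and recover instead of failing."""
    try:
        return mean_reading(readings)
    except ZeroDivisionError:
        # Recovery: fall back to the last value known to be good so the
        # service still delivers a usable result to the caller.
        return last_good_value


print(tolerant_mean_reading([20.0, 22.0], last_good_value=21.5))  # 21.0
print(tolerant_mean_reading([], last_good_value=21.5))            # 21.5
```

The fault is still present in `mean_reading`; the wrapper only stops it from turning into a visible failure.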
Slide 23
Summary
Availability is the probability that a
system will be available when a
service request is made
Reliability is the probability that a
system will deliver a service as
expected by users
Slide 24
Summary
Software faults lead to state errors,
which in turn lead to operational failures
Fault avoidance, detection and
tolerance are strategies for
achieving reliability
Slide 25