Introduction - To - Reliability Analysis

Introduction to Reliability Analysis - Safety Critical Systems' Course, University of Hull, Dr Koorosh Aslansefat

Koorosh Aslansefat - University of Hull

Introduction to Reliability Analysis


Focus of the Session
To learn the basic definitions and calculations.

• Understanding the terminology and mathematical basis of reliable systems

• Becoming familiar with probabilities and their use in reliability


Introduction



Definitions
Dependability
Ability [of an entity] to perform as and when required [IEC 60050-192]

Factors of dependability
Reliability, Availability, Maintainability, Safety (RAMS)

Notes:

• Dependability is sometimes considered the "science of failures."
• "RAMS" (or "RAM") is more commonly used than "dependability."
• "Reliability" is often mistakenly used as a general term for "dependability"; however,
reliability is only one factor and is not sufficient to characterise dependability.

Avizienis, A., Laprie, J. C., Randell, B., & Landwehr, C. (2004). Basic concepts and taxonomy of dependable and secure
computing. IEEE Transactions on Dependable and Secure Computing, 1(1), 11-33.


Definitions
Reliability

1 – The ability of an item to perform a required function under stated conditions for a
specified period of time. [Oxford Dictionary 2022].

2 – The probability that a component part, equipment, or system will satisfactorily
perform its intended function under given circumstances, such as environmental
conditions, limitations as to operating time, and frequency and thoroughness of
maintenance for a specified period of time. [McGraw-Hill Dictionary 2003].


Definitions
Reliability Indices

Failure Rate (λ) - A reliability index that represents the rate at which a product
fails.

Mean Time To Failure (MTTF) – The reliability index for non-repairable units: the
mean operating time until a unit fails.

Mean Time Between Failures (MTBF) – The reliability index for repairable units: the
mean operating time between consecutive failures.

Notes:

• The failure rate can also be expressed per operating hour, per km, per cycle, or per demand.
• For systems equipped with sensors for condition monitoring, there is a parameter
called "Remaining Useful Life" that can be considered a live version of MTTF.

Definitions
Examples of Repairable Safety-Critical Systems

• Aircraft Control Systems

• Nuclear Reactor Cooling Systems

• Hospital Ventilation Systems

• Railroad Signal Systems

• Chemical Plant Safety Systems

• Submarine Life Support Systems


Definitions
Examples of Non-repairable Safety-Critical Systems

• Airbag Systems in Vehicles

• Fire Suppression Systems

• Single-Use Medical Devices

• Personal Protective Equipment (PPE)


Definitions
Space Shuttle Challenger Disaster

https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster

Definitions
Chernobyl Disaster

https://en.wikipedia.org/wiki/Chernobyl_disaster

Definitions

[Figure: failure rate curve] https://www.collidu.com/presentation-failure-rate-curve

Definitions
Failure Rate

Failure Rate (λ) - A Reliability index that represents the rate at which your product
fails.

EXAMPLE 1: 30 laptops are put on test and run at their normal operating
condition for 1,000 hours. If 6 of those laptops fail during the operating time,
what is the failure rate of the product?

https://www.youtube.com/watch?v=BQXnKpP2lrI
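
Example 1 gives no individual failure times, so any calculation has to assume something about the exposure time. The sketch below assumes, as a simplification that the slides do not state, that every unit is credited with the full 1,000 test hours. The function name is illustrative.

```python
def failure_rate(failures: int, units: int, hours_per_unit: float) -> float:
    """Failures per unit-hour, assuming every unit accrues the full test time."""
    total_unit_hours = units * hours_per_unit
    return failures / total_unit_hours


# Example 1: 30 laptops, 1,000 hours each, 6 failures.
rate = failure_rate(failures=6, units=30, hours_per_unit=1000)
print(f"{rate:.4f} failures/hour")
```

Crediting failed units with the full test time overstates the exposure, so this estimate is on the low side. When the failure times are known, as in Example 2, the actual operating hours should be summed instead.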
Definitions
Failure Rate

Failure Rate (λ) - A Reliability index that represents the rate at which your product
fails.

EXAMPLE 2: 20 laptops are put on test and run at their normal operating
condition for 1,000 hours. If 6 of those laptops fail at the following times (550h,
20d, 680h, 790h, 860h, 620h), what is the failure rate (per hour) of the product?

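Six units failed and the remaining 14 survived the full 1,000 hours, so (with 20 d = 480 h):

$$\lambda = \frac{6}{550 + 480 + 680 + 790 + 860 + 620 + (14 \times 1000)} = \frac{6}{17980} = 0.00033370411 \text{ failures/hour}$$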


Definitions
Mean Time To Failure

Mean Time To Failure (MTTF) – The reliability index for non-repairable units
represents the mean time to failure.

EXAMPLE 2: 20 laptops are put on test and run at their normal operating
condition for 1,000 hours. If 6 of those laptops fail at the following times (550h,
20d, 680h, 790h, 860h, 620h), what is the MTTF of the product?

$$MTTF = \frac{550 + 480 + 680 + 790 + 860 + 620 + (14 \times 1000)}{6} = \frac{17980}{6} \approx 2996.7 \text{ hours}$$
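
The Example 2 arithmetic can be checked with a short sketch. It uses the figures from the slide; the variable names are illustrative, and the 14 units that did not fail are assumed to have run for the full 1,000 hours.

```python
# Failure times in hours; the slide's "20d" is converted to 20 * 24 = 480 h.
failure_times_h = [550, 20 * 24, 680, 790, 860, 620]
units_on_test = 20
test_duration_h = 1000

# Units that did not fail contribute the full test duration each.
survivors = units_on_test - len(failure_times_h)
total_unit_hours = sum(failure_times_h) + survivors * test_duration_h

failure_rate = len(failure_times_h) / total_unit_hours  # failures per hour
mttf = total_unit_hours / len(failure_times_h)          # hours per failure

print(f"total unit-hours: {total_unit_hours}")
print(f"failure rate: {failure_rate:.11f} per hour")
print(f"MTTF: {mttf:.1f} hours")
```

Both indices come from the same total of unit-hours, which is why MTTF is the reciprocal of λ here.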


Definitions
Mean Time To Failure vs. Reliability

$$R(t) = e^{-\lambda t} = e^{-\left(\frac{1}{MTTF}\right) t}$$

Deif, D., & Gadallah, Y. (2017). A comprehensive wireless sensor network reliability metric for critical Internet of Things applications. EURASIP Journal on
Wireless Communications and Networking, 2017, 1-18.


Definitions
Mean Time To Failure vs. Reliability

Suppose we have tested 20 units and found that our MTTF is 4000 hours. What is the
reliability of our product at 5000 hours of operation?

$$R(5000) = e^{-\left(\frac{1}{4000}\right) 5000} = e^{-1.25} = 0.2865$$

The probability that our product will perform successfully past the 5000-hour mark is
approximately 28.65%.
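
The relation between MTTF and reliability can be written as a small function. This is a minimal sketch that assumes the constant-failure-rate (exponential) model used on the slide; the function name is illustrative.

```python
import math


def reliability(t: float, mttf: float) -> float:
    """R(t) = exp(-t / MTTF) under a constant failure rate."""
    return math.exp(-t / mttf)


# Worked example from the slide: MTTF = 4000 h, evaluated at t = 5000 h.
print(round(reliability(5000, 4000), 4))
```

The same function answers any "probability of surviving t hours" question in this deck, as long as the exponential assumption holds.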
Questions
1- An industrial machine compresses natural gas into an interstate gas pipeline. The
compressor is on line 24 hours a day. (If the machine is down, a gas field has to be
shut down until the natural gas can be compressed, so down time is very expensive.)
The vendor knows that the compressor has a constant failure rate of 0.0000001
failures/hr. What is the operational reliability after 2500 hours of continuous service?

The compressor has a constant failure rate, so its time to failure is exponentially distributed and its
reliability is given by $R(t) = e^{-\lambda t}$, where the failure rate λ = 0.0000001 failures/hr and the operational time t = 2500 hours.
Reliability = $e^{-0.0000001 \times 2500} = e^{-0.00025} \approx 0.99975$


Questions
2- What is the highest failure rate for a product if it is to have a reliability (or probability
of survival) of 98 percent at 5000 hours? Assume that the time to failure follows an
exponential distribution.

The reliability of the product is given to be 0.98. The reliability of an exponential distribution
is given by $R(t) = e^{-\lambda t}$, i.e., $0.98 = e^{-\lambda \times 5000}$.

Taking natural logarithms on both sides (see Appendix), we get −0.02020 = −λ × 5000.

Therefore $\lambda = 0.02020 / 5000 = 4.04 \times 10^{-6}$ failures/hr.
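
Inverting the exponential model gives the highest failure rate that still meets a reliability target. The sketch below assumes exponentially distributed lifetimes, as the question states; the function name is illustrative.

```python
import math


def max_failure_rate(target_reliability: float, t: float) -> float:
    """Solve R(t) = exp(-lambda * t) for lambda."""
    return -math.log(target_reliability) / t


# Question 2: 98% reliability at 5000 hours.
lam = max_failure_rate(0.98, 5000)
print(f"{lam:.3e} failures/hr")
```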


Questions
3- Suppose that a component we wish to model has a constant failure rate with a
mean time to failure of 25 hours. Find:
(a) The reliability function.
(b) The reliability of the item at 30 hours.

(a) Since the failure rate is constant, we use the exponential distribution. The MTTF is
25 hours and, for an exponential distribution, MTTF = 1/λ. Therefore λ = 1/25 = 0.04 failures/hour,
and the reliability function is $R(t) = e^{-0.04 t}$.

(b) The reliability at 30 hours is $R(30) = e^{-0.04 \times 30} = e^{-1.2} \approx 0.301$.


Questions
4- A certain type of engine seal (non-repairable) is known to have an exponentially
distributed life with a constant failure rate $\lambda = 0.03 \times 10^{-4}$ failures/hour.

(a) What is the MTTF of the seal?
(b) What is the reliability at MTTF?

a) $\lambda = 0.03 \times 10^{-4}$ failures/hour
MTTF = 1/λ = 333,333 hours, i.e., the average life of these seals is about 333,333 hours.

b) MTTF = 333,333 hours
Therefore: $R(MTTF) = e^{-\lambda \cdot MTTF} = e^{-1} \approx 0.368$
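
Under the exponential model, the reliability at t = MTTF does not depend on the failure rate, because λ × (1/λ) = 1. The sketch below checks this for a few rates; apart from the seal's rate, the values are illustrative and the function name is hypothetical.

```python
import math


def reliability_at_mttf(failure_rate: float) -> float:
    """Evaluate R(t) = exp(-lambda * t) at t = MTTF = 1 / lambda."""
    mttf = 1.0 / failure_rate
    return math.exp(-failure_rate * mttf)


# 3e-6 is the seal from Question 4; the other two rates are illustrative.
for lam in (3e-6, 0.001, 0.04):
    print(f"lambda = {lam:g}: R(MTTF) = {reliability_at_mttf(lam):.3f}")
```

Only about 37% of units are expected to survive to the MTTF, so the MTTF should not be read as a typical guaranteed life.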


Questions
5- The equipment in a packaging plant has a MTTF of 1000 hours. What is the
probability that the equipment will operate for a period of 500 hours without failure?

Assuming the exponential model, the hazard rate is 1/MTTF = 0.001 failures/hour. So
$R(500) = e^{-500 \times 0.001} = e^{-0.5} \approx 0.61$.


Questions
6- During World War II, the hazard rate for bomber aircraft flying over Europe was
believed to be a 4% chance of non-return from each mission, however experienced
the pilot was. Calculate the probability that a crew member will survive 25 missions.
How many missions would it take to reduce a crew member's probability of survival to
10%?

As the hazard rate is constant, we can use the exponential model with a hazard rate
of 0.04 per mission: $R(25) = e^{-0.04 \times 25} = e^{-1} \approx 0.37$

The crew have a 37% chance of surviving 25 missions.

To reduce this to 10% we need $e^{-0.04 t} = 0.1$.

Trial and error (or natural logarithms) shows that $e^{-2.3} \approx 0.1$.
So 0.04t = 2.3, and t = 2.3/0.04 ≈ 57.5, i.e. about 57 missions. Note that "time" is measured in
missions.
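
The slide treats the 4% per-mission loss as a continuous hazard rate. A different reading, which is an assumption and not taken from the slides, is that each mission is an independent trial with a 96% chance of return. The sketch below compares the two; the variable names are illustrative.

```python
import math

p_loss = 0.04   # chance of non-return per mission (from the slide)
missions = 25

# Continuous exponential model, as used on the slide.
survival_exponential = math.exp(-p_loss * missions)

# Discrete alternative (assumption): independent 96% survival per mission.
survival_per_mission = (1 - p_loss) ** missions

# Number of missions at which survival falls to 10% under each model.
missions_exponential = math.log(0.1) / -p_loss
missions_per_mission = math.log(0.1) / math.log(1 - p_loss)

print(f"exponential: {survival_exponential:.3f}, 10% at {missions_exponential:.1f} missions")
print(f"per-mission: {survival_per_mission:.3f}, 10% at {missions_per_mission:.1f} missions")
```

The two models agree closely because the per-mission loss is small, which is why the exponential approximation is acceptable here.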
Questions
7- TALCO manufactures microwave ovens. To develop warranty guidelines, TALCO
randomly tested 10 microwave ovens continuously to failure. The failure information of
the 10 ovens is shown below.

Microwave    Hours to failure
1            2300
2            2150
3            2800
4            1890
5            2790
6            1890
7            2450
8            2630
9            2100
10           2120

What is the mean time to failure of the microwave ovens? (Note that the mean life of the
microwave is defined in terms of their mean time to failure because no maintenance is
performed on the ovens).
The MTTF is 2312 hours (simply the average of the ten failure times).
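
Because every oven was run to failure, the MTTF is the arithmetic mean of the observed lifetimes. A minimal check with the ten values from the table (the variable names are illustrative):

```python
from statistics import mean

# Hours to failure for ovens 1 to 10, taken from the table.
hours_to_failure = [2300, 2150, 2800, 1890, 2790, 1890, 2450, 2630, 2100, 2120]

mttf = mean(hours_to_failure)
print(f"MTTF = {mttf} hours")
```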
Definitions
Availability

The probability that a system is operational and functioning correctly at any point in
time. It refers to the ability of the system to be in a state to perform its designated
function under stated conditions whenever required [IEC 60050-192].

Notes:
• The state of an item of being able to perform as required is the "up state" (also
called "working").
• The state of an item of being unable to perform as required is the "down state" (also
called "faulty" or "in maintenance").
• An available item is not necessarily operating (e.g. it may be on "stand-by"): being able
to perform is not the same as performing.
• In more complex systems we might have "degraded states".


$$A_{avg} = \frac{\text{Total Operating Time}}{\text{Total Operating Time} + \text{Total Downtime}}$$
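
The average availability formula translates directly into code. The figures in this sketch are hypothetical (a year of 8,760 hours with 60 hours of downtime) and the function name is illustrative.

```python
def average_availability(operating_time: float, downtime: float) -> float:
    """Fraction of the observed period during which the item was in the up state."""
    return operating_time / (operating_time + downtime)


# Hypothetical year: 8,700 h operating and 60 h down (8,760 h in total).
print(f"{average_availability(8700, 60):.4f}")
```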


[Figure: performance over time, alternating between up states and down states; each failure moves the item from an up state to a down state.]

[Figure: failure metrics] https://www.toucantoco.com/en/blog/understanding-failure-metrics

[Figure: maintenance delay model. Before maintenance can start, the required vessels must be available, the required crew must be available, there must be no shift limit, spare parts must be ready, and there must be an appropriate weather window; the actual maintenance procedure time follows. Stage labels in the figure: V, C, SH, L, W, R. Each unmet condition adds a delay: delay for vessel availability, delay for required crew availability, delay for required multiple shifts, delay for required spare parts, and delay for a good weather window.]

(Athanasios, et al. 2019)


Definitions
Availability Indices

Repair Rate (μ) - An availability index that represents the rate at which a product can be
repaired.

Mean Time To Repair (MTTR) – The availability index for repairable units: the mean time
needed to repair a failed unit.


Example
A manufacturing company tracks MTTR for both equipment repairs and software
maintenance to understand how continuous software updates support equipment
operation. The plant supervisor uses the formula to calculate the MTTR for each task.
The supervisor assumes that software updates to the manufacturing equipment can
improve efficiency and reduce technical and operational issues. For the machinery,
the supervisor determines the total repair time at 26.7 hours. If personnel complete
six maintenance jobs on the equipment, what would be the MTTR value?

MTTR = 26.7 hours / 6 repairs = 4.45 hours

Tracking the software updates, the supervisor finds that the IT department applies
modifications and bug repairs a total of 10 times during the same year. If the total
repair time for yearly updates is 41.5 hours, the supervisor determines the MTTR for
this task as:

MTTR = 41.5 hours / 10 updates = 4.15 hours

https://www.indeed.com/career-advice/career-development/how-to-calculate-mttr
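
Both MTTR values in the example follow the same pattern, total repair time divided by the number of repairs. A minimal sketch, with an illustrative function name:

```python
def mttr(total_repair_hours: float, repairs: int) -> float:
    """Mean time to repair: total repair time divided by the number of repairs."""
    return total_repair_hours / repairs


print(f"machinery: {mttr(26.7, 6):.2f} hours")
print(f"software:  {mttr(41.5, 10):.2f} hours")
```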


Definitions
Maintainability
Ability [of an item] to be retained in, or restored to a state to perform as required,
under given conditions of use and maintenance [IEC 60050-192].
Preventive maintenance:
Maintenance carried out to mitigate degradation and reduce the probability of failure
[IEC 60050-192].
Corrective maintenance:
Maintenance carried out after fault detection to effect restoration [IEC 60050-192].

Notes:
• Preventive maintenance acts on reliability (and, indirectly, on availability), while
corrective maintenance acts only on availability.
• Preventive maintenance is "scheduled" when it is carried out in accordance with a
specified timetable and "condition-based" when it is performed following an
assessment of physical condition.
The Costs Associated with Operation and Maintenance (O&M) of Wind Turbines

01 Onshore wind farms [1]: 10-15%

02 Offshore wind farms [2]: 25-30%

03 Wind farms approaching the end of life [3]: up to 35%


78.12%


Corrective vs. Preventive Maintenance



Questions
1- What is the difference between reliability and availability?

The availability takes the restoration to “up state” (i.e. repairs) into account,
unlike reliability.

Questions
2- Under what condition is the reliability of an item equal to its availability?

When no restoration to the "up state" is considered (i.e. the item is never repaired), the
reliability of an item is equal to its availability.


Questions
3- Which factor of dependability is considered for availability but not for
reliability?
Maintainability (with regard to corrective maintenance) is a factor of
dependability that is considered for availability but not for reliability.


Questions
4- Can a poorly reliable item be very available?

A poorly reliable item can be very available if the restoration to “up state” is
very fast after each failure.
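
A small numeric comparison makes this concrete. The sketch below assumes the exponential reliability model and takes long-run availability as mean up time divided by mean up time plus mean repair time. The two items and all of their figures are hypothetical.

```python
import math


def reliability(t: float, mtbf: float) -> float:
    """Probability of running t hours without any failure (exponential model)."""
    return math.exp(-t / mtbf)


def availability(mtbf: float, mttr: float) -> float:
    """Long-run fraction of time in the up state."""
    return mtbf / (mtbf + mttr)


# Hypothetical item A: fails often (every 10 h on average) but is restored in 0.01 h.
# Hypothetical item B: fails rarely (every 1000 h on average) but takes 100 h to repair.
for name, mtbf, mttr in (("A", 10, 0.01), ("B", 1000, 100)):
    print(f"item {name}: R(100 h) = {reliability(100, mtbf):.5f}, "
          f"availability = {availability(mtbf, mttr):.4f}")
```

Item A almost never survives 100 hours without a failure, yet it is in the up state about 99.9% of the time, whereas the far more reliable item B is up only about 91% of the time.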



Questions
5- How does reliability change with time (from t0)?

Reliability can only decrease (or, at best, stay constant) as time passes; it never increases.


Questions
6- How does availability change with time (from t0)?

Availability can increase, decrease, and/or stay constant over time.


Questions
7- How can the reliability of an item be improved?

The reliability of an item can be improved by improving the quality/properties of
the item, or by improving its environmental conditions.


Questions
8- How can the availability of an item be improved?

The availability of an item can be improved by improving the reliability of the
item, or its maintainability.


Definitions
Safety
Safety – "risk" point of view
Freedom from risk which is not tolerable [ISO/IEC Guide 51:2014],
i.e. "in a given context based on the current values of society".

Safety – "system" point of view
Ability [of an item] to prevent hazardous events (i.e. events that may result in physical
injury or damage to the health of people or damage to property or the environment
[IEC 61508]), or to reduce the consequences of such events on people, property or
the environment.

Note:

Safety refers to accidental events while security refers to intentional events.

Summary

• Introduction to basic reliability analysis concepts and calculations.

• Definitions and importance of reliability, availability, maintainability, safety (RAMS).

• Overview of reliability indices: failure rate (λ), mean time to failure (MTTF), mean time
between failures (MTBF).

• Distinction between repairable and non-repairable systems.

• Calculations of failure rates and their impact on system performance.

• Discussion on availability and maintainability indices.


Challenges

The Fukushima Daiichi Nuclear Disaster (2011)

The Fukushima Daiichi nuclear disaster, initiated by the
tsunami following an earthquake on 11 March 2011, is an
example of a case where the designers of the nuclear
facility failed to foresee the environmental circumstances
that could cause the system to fail. Tsuneo Futami, a
former director of the Fukushima Daiichi plant, stated:
"We can only work on precedent, and
there was no precedent. When I headed the plant, the
thought of a tsunami never crossed my mind".


Challenges
Uber Self-driving Car Kills a Bicyclist

2018 in Review: 10 AI Failures, https://medium.com/syncedreview/2018-in-review-10-ai-failures-c18faadf5983

Challenges

Pei, K., Cao, Y., Yang, J., & Jana, S. (2017). DeepXplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th
Symposium on Operating Systems Principles (pp. 1-18).


References

Trivedi, K. S., & Bobbio, A. (2017). Reliability and availability engineering:
modelling, analysis, and applications. Cambridge University Press.


References
• Avizienis, A., Laprie, J. C., Randell, B., & Landwehr, C. (2004). Basic concepts and taxonomy of
dependable and secure computing. IEEE Transactions on Dependable and Secure Computing, 1(1), 11-33.

• Brissaud, F. (2018). Introduction to Reliability Theories.
https://www.slideshare.net/FlorentBrissaud/introduction-to-reliability-theories

• Deif, D., & Gadallah, Y. (2017). A comprehensive wireless sensor network reliability metric for critical
Internet of Things applications. EURASIP Journal on Wireless Communications and Networking, 2017,
1-18.


Thanks for Your Attention
If you have any questions, please feel free to ask.