Chapter 1-1
RELIABILITY AND MAINTAINABILITY
Course Outline
• Reliability:
• failure probability and density functions
• component reliability
• measures of reliability
• reliability in the system life cycle
• reliability analysis methods
• design review and evaluation of reliability
• Maintainability:
• Maintenance management organization and scheduling
• measures of maintainability
• maintainability in the system life cycle
• maintainability analysis methods
• Design review and evaluation of maintainability
INTRODUCTION TO RELIABILITY
Definition of Reliability:
“The ability of an item to perform a required function under stated
conditions for a stated period of time”
• “quality over time”
There are “Real World” conflicts with this definition that we need to
keep in mind…
• Probability – Customers expect a probability of 1, “It Works”
• Intended Function – The product may be used in unintended ways and still be
expected to work
• Under Stated Conditions – The product may be operated outside of the stated
conditions and still be expected to work
• Prescribed Procedures – Customers may not have the required tools or skill level
and may not follow procedures and still expect the product to work
WHY RELIABILITY ENGINEERING?
When Should Reliability Be Applied?
“From the cradle to the grave”, i.e., over the entire life cycle of the product.
Failure
“any event or collection of events that causes the system to lose
its functionability”
Functionability
• The inherent characteristic of a product related to its ability to
perform a specified function according to the specified
requirements under the specified operating conditions
• Transition from reliability to failure can be instantaneous (tyre
burst, transformer explosion, transistor blowing)
• Can also be gradual (cracks in insulation, bearing wears, cable
corrodes)
• Health monitoring can prevent failure
Definition of Concepts
• Failure - A failure is an event when an item is not available
to perform its function at specified conditions when
scheduled or is not capable of performing functions to
specification.
• Failure Rate - The number of failures per unit of gross
operating period, measured in time, events or cycles.
• MTBF - Mean Time Between Failures - The average time
between failure occurrences: the total operating time of all
items divided by the total number of failures.
For repairable items.
• MTTF - Mean Time To Failure - The average time to
failure occurrence: the total operating time of all items
divided by the total number of failures.
For repairable and non-repairable items.
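The two ratio definitions above can be sketched in a few lines of Python. The function names and the fleet figures are hypothetical, chosen only to illustrate the arithmetic.

```python
def failure_rate(total_failures, total_operating_time):
    """Failures per unit of operating time (hours, cycles, ...)."""
    if total_operating_time <= 0:
        raise ValueError("total operating time must be positive")
    return total_failures / total_operating_time


def mean_time_between_failures(total_failures, total_operating_time):
    """Total operating time of all items divided by the number of failures."""
    if total_failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return total_operating_time / total_failures


# Hypothetical fleet: 5 items, each operated for 1000 hours, 4 failures in total.
operating_time = 5 * 1000.0
failures = 4
rate = failure_rate(failures, operating_time)
mtbf = mean_time_between_failures(failures, operating_time)
print(rate, mtbf)
```

With these definitions the MTBF estimate is simply the reciprocal of the failure rate estimate.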
Definition of Concepts
• Hazard - The potential to cause harm. Harm includes ill health and
injury, damage to property, plant, products or the environment,
production losses or increased liabilities.
• Risk - The likelihood that a specified undesired event will occur due
to the realisation of a hazard by, or during, work activities or by the
products and services created by work activities.
Quality, Reliability and Safety
Reliability can be considered as “quality over time”.
Customers frequently use the terms “quality” and “reliability”.
We need to understand what they expect.
Quality, Reliability and Safety
Both quality defects and failures can adversely affect the safety
of users, bystanders and equipment.
However, not all failures are safety issues, and not all safety
issues are due to failures.
As Reliability Engineering is concerned with analyzing
failures and providing feedback to design and production to
prevent future failures, it is only natural that a rigorous
classification of failure types must be agreed upon.
How Do Products Really Fail
BASIC FUNCTIONS IN RELIABILITY AND THEIR RELATIONSHIP
They include
Reliability function,
Cumulative failure distribution function,
Failure density function, and
Hazard rate function.
In this section, we also obtain expressions for the mean and median of the random variable
called the time to failure or failure time.
i. Reliability Function
If reliability is a probability, then a random variable has to be associated with it because
probability is always associated with a random variable.
The random variable in the case of reliability is generally time – the time in which a
component/system fails.
Let this random variable, called time to failure or failure time of the component/system,
be denoted by T. Then, by definition, the reliability R(t) of the component/system at time t
is the probability that it survives beyond time t:
$$R(t) = P(T > t), \quad t \ge 0$$
The probability of survival of a component decreases as the life time of the component
increases. Ultimately this probability will approach zero, since no component can perform
its intended function forever. A typical shape of the reliability function is shown in Fig.
13.1.
From the figure, you can see that the reliability function is a non-increasing function in
time (t).
ii. Cumulative Failure Distribution Function
If we want to calculate the probability of failure of a component by time t (known as the
unreliability of the component), we simply evaluate the function F(t) = P(T ≤ t).
Therefore, if we denote the reliability and unreliability of a component by R and Q,
respectively, then they satisfy the relation:
$$R(t) + Q(t) = 1, \quad Q(t) = F(t) = 1 - R(t)$$
Note that the probability of failure of a component increases after its useful life
period is over. Ultimately this probability will approach 1. A typical shape of
cumulative failure distribution is shown in Fig. 13.2.
iii. Failure Density Function
Recall from probability theory that the derivative of the cumulative distribution function of a
random variable is known as the probability density function (pdf) of the random
variable.
In reliability terminology, the probability density function is known as the failure
density function (fdf).
As explained above, the cumulative distribution function in reliability terminology is the
cumulative failure distribution function. So if f(t) denotes the failure density function (life
time density function), then
$$f(t) = \frac{dF(t)}{dt} = -\frac{dR(t)}{dt}$$
Recall that the following relationship exists between f(t) and F(t):
$$F(t) = \int_0^t f(u)\,du$$
iv. Hazard Rate
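The instantaneous rate of failure at time t is known as the hazard rate and is generally
denoted by λ(t) (written h(t) in the examples).
It is defined through the conditional probability of failure in the interval (t, t + dt), given
that no failure has occurred by time t:
$$\lambda(t)\,dt = P(t < T \le t + dt \mid T > t), \quad \lambda(t) = \frac{f(t)}{R(t)}$$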
v. Relationship between the Functions R(t), F(t), f(t) and λ(t)
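The standard identities linking the four functions follow directly from the definitions given so far. The summary below is a compact sketch, assuming R(0) = 1 and writing λ(t) for the hazard rate:

```latex
\begin{align*}
F(t) &= 1 - R(t) = \int_0^t f(u)\,du \\
f(t) &= \frac{dF(t)}{dt} = -\frac{dR(t)}{dt} \\
\lambda(t) &= \frac{f(t)}{R(t)} = -\frac{d}{dt}\ln R(t) \\
R(t) &= \exp\!\left(-\int_0^t \lambda(u)\,du\right) \\
f(t) &= \lambda(t)\,\exp\!\left(-\int_0^t \lambda(u)\,du\right)
\end{align*}
```

Knowing any one of the four functions is therefore enough to recover the other three.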
vi. Mean Time to Failure and Median of the Random Variable T
In reliability terminology, the mean of the random variable T in the absence of
repair and replacement is known as the mean time to failure (MTTF).
Usually the users of a product/component are interested in knowing the average life of
the product/component rather than the complete failure details.
MTTF is a measure which simply gives a number (in units of time in which life of the
product/ component is measured) that tells us on an average how long the product/
component performs its intended function successfully.
It is calculated by taking the mean of the lifetimes obtained on the basis of results of a
sample of such identical products/components tested under stated conditions.
For example, suppose we put 10 identical components to test under the stated
conditions.
Let t_i (i = 1, 2, ..., 10) denote the time for which the i-th component performs its intended
function successfully. The results of the sample are shown in Fig. 13.4. Thus, the mean
time to failure (MTTF) of such components on the basis of the results of this
sample is given by
$$\mathrm{MTTF} = \frac{1}{10}\sum_{i=1}^{10} t_i$$
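As a small illustration of the sample calculation, the sketch below averages ten lifetimes. The values are hypothetical and are not the data of Fig. 13.4.

```python
from statistics import mean

# Hypothetical lifetimes t_1..t_10 in hours (illustrative values only).
lifetimes_hours = [120.0, 95.0, 143.0, 110.0, 88.0,
                   132.0, 101.0, 127.0, 99.0, 115.0]

# Sample MTTF: the arithmetic mean of the observed times to failure.
mttf_estimate = mean(lifetimes_hours)
print(mttf_estimate)
```

The estimate carries the same unit as the recorded lifetimes, here hours.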
However, if we are given the basic functions of a component, then by definition MTTF is
given by
$$\mathrm{MTTF} = E(T) = \int_0^\infty t\,f(t)\,dt = \int_0^\infty R(t)\,dt$$
The median is the value of the variate that divides the distribution of the variate into two
equal parts. Therefore, if t_md denotes the median of the random variable T, then
$$F(t_{md}) = R(t_{md}) = 0.5$$
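Both quantities can be approximated numerically once R(t) is known. The sketch below assumes, purely for illustration, an exponential lifetime with a rate of 0.01 per hour. It integrates R(t) with the trapezoidal rule to approximate the MTTF and uses bisection to locate the median. The helper names and the integration limit are assumptions of this example.

```python
import math

RATE = 0.01  # assumed constant failure rate per hour (illustrative)


def reliability(t):
    """R(t) for an exponential lifetime with the assumed rate."""
    return math.exp(-RATE * t)


def mttf_by_integration(upper=3000.0, steps=300_000):
    """Trapezoidal approximation of the integral of R(t) from 0 to `upper`."""
    h = upper / steps
    total = 0.5 * (reliability(0.0) + reliability(upper))
    for i in range(1, steps):
        total += reliability(i * h)
    return total * h


def median_by_bisection(lo=0.0, hi=3000.0, tol=1e-9):
    """Find t_md with R(t_md) = 0.5, using the fact that R is non-increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reliability(mid) > 0.5:
            lo = mid  # still more than half survive: the median lies later
        else:
            hi = mid
    return 0.5 * (lo + hi)


mttf = mttf_by_integration()
median = median_by_bisection()
print(mttf, median)
```

For this assumed distribution the closed forms are MTTF = 1/0.01 and t_md = ln(2)/0.01, so the median is smaller than the mean, as expected for a right-skewed lifetime.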
Example 4
[Figure: pdf f(t), cdf F(t) and reliability function R(t) = 1 − F(t) of a time to failure T on 0 ≤ T ≤ 100]
Example 5
Given the probability density function
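$$f(t) = \frac{1}{5000}\,t, \quad 0 \le t \le 100$$
where t is time-to-failure in hours and the pdf is shown below:
[Figure: f(t) plotted against t for 0 ≤ t ≤ 100, rising linearly from 0 to 0.02]
Solution
$$F(t) = \int_0^t f(u)\,du = \int_0^t \frac{1}{5000}\,u\,du = \frac{t^2}{10000}$$
$$R(t) = 1 - F(t) = 1 - \frac{t^2}{10000}$$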
Example 6
For the reliability function
$$R(t) = e^{-(t/800)^2}, \quad t \ge 0$$
$$R(200) = e^{-(200/800)^2}$$
$$R(500) = e^{-(500/800)^2}$$
$$R(500 \mid 200) = \frac{P(T > 500)}{P(T > 200)} = \frac{R(500)}{R(200)}$$
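The numerical values can be checked with a short script. This is a sketch that assumes the reliability function of this example reads R(t) = exp(−(t/800)²). The function names are illustrative.

```python
import math


def reliability(t, scale=800.0):
    """R(t) = exp(-(t/scale)^2), the reliability function assumed for this example."""
    return math.exp(-((t / scale) ** 2))


def conditional_reliability(t, t0):
    """P(T > t | T > t0) = R(t) / R(t0) for t >= t0."""
    if t < t0:
        raise ValueError("t must be at least t0")
    return reliability(t) / reliability(t0)


r200 = reliability(200.0)
r500 = reliability(500.0)
r500_given_200 = conditional_reliability(500.0, 200.0)
print(round(r200, 4), round(r500, 4), round(r500_given_200, 4))
```

The conditional value is larger than R(500) because a unit that has already survived 200 hours only needs to survive the remaining interval.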
Example 4
Given the following time to failure probability density function (pdf):
$$f(t) = 0.01\,e^{-0.01t}, \quad t \ge 0$$
where t is time-to-failure in hours. What is the reliability function?
Solution
$$R(t) = \int_t^\infty 0.01\,e^{-0.01u}\,du = e^{-0.01t}$$
Example 7
Given the cumulative distribution function (cdf):
$$F(t) = 1 - e^{-(t/800)^3}, \quad t \ge 0$$
Example 8
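Consider the pdf used in Example 5, given by
$$f(t) = \frac{1}{5000}\,t, \quad 0 \le t \le 100$$
Calculate the hazard function.
Solution
$$h(t) = \frac{f(t)}{R(t)} = \frac{\frac{1}{5000}\,t}{1 - \frac{t^2}{10000}} = \frac{2t}{10000 - t^2}, \quad 0 \le t < 100$$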
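A quick way to see how this hazard function behaves is to evaluate it at a few points. The sketch below assumes the pdf f(t) = t/5000 on 0 ≤ t ≤ 100 and the corresponding R(t) = 1 − t²/10000. The function names are illustrative.

```python
def pdf(t):
    """f(t) = t / 5000 on 0 <= t <= 100."""
    return t / 5000.0


def reliability(t):
    """R(t) = 1 - t^2 / 10000 on 0 <= t <= 100."""
    return 1.0 - t * t / 10000.0


def hazard(t):
    """h(t) = f(t) / R(t); undefined at t = 100, where R(t) reaches zero."""
    if not 0.0 <= t < 100.0:
        raise ValueError("t must satisfy 0 <= t < 100")
    return pdf(t) / reliability(t)


for t in (10.0, 50.0, 90.0):
    print(t, hazard(t))
```

The hazard rate grows without bound as t approaches 100 hours, which is the behaviour of an increasing failure rate.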
Examples
Example 9
Given h(t)=18t, find R(t), F(t), and f(t).
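Solution
$$R(t) = \exp\left(-\int_0^t 18u\,du\right) = e^{-9t^2}$$
$$F(t) = 1 - R(t) = 1 - e^{-9t^2}$$
$$f(t) = \frac{dF(t)}{dt} = 18t\,e^{-9t^2}$$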
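The same chain of steps, from hazard rate to R(t), F(t) and f(t), can be carried out numerically. The sketch below integrates h(t) = 18t with the trapezoidal rule and compares the result with the closed form e^{-9t²}. The function names and the evaluation point t = 0.2 are assumptions of this example.

```python
import math


def hazard(t):
    """h(t) = 18 t, as given in the example."""
    return 18.0 * t


def cumulative_hazard(t, steps=10_000):
    """Trapezoidal approximation of the integral of h(u) from 0 to t."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return total * h


def reliability(t):
    """R(t) = exp(-integral of h from 0 to t)."""
    return math.exp(-cumulative_hazard(t))


def cdf(t):
    """F(t) = 1 - R(t)."""
    return 1.0 - reliability(t)


def pdf(t):
    """f(t) = h(t) R(t)."""
    return hazard(t) * reliability(t)


t = 0.2
print(reliability(t), math.exp(-9.0 * t * t))
print(cdf(t), pdf(t))
```

Numerical integration is unnecessary for a linear hazard rate, but the same functions work unchanged when h(t) has no convenient antiderivative.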
Example 10
Example 11
BATHTUB CURVE
Consider a population of new identical components and suppose that all of them enter the
field at some point in time, say, t = 0.
In general, the curve of relative failure rate of the entire population of the components
without replacement has the typical shape of a bathtub. Therefore, it is known as the
bathtub curve as shown in the figure below.
The bathtub curve is simply a graphical representation of the failure rate of a
population of identical components versus time.
I. Early Failure or Infant Mortality
When a population of new identical components starts to perform its intended function, a
high failure rate is observed in initial stages due to many factors such as manufacturing
errors, poor quality or substandard items, incorrect adjustment or positioning, bad
assembly, human factors, improper design, etc.
This phase of the bathtub curve is known as the period of early failure or infant
mortality. It is also known as the burn-in or decreasing failure rate, or debugging period.
Most failures in this period are due to manufacturing errors, design problems or poor
quality, for which the manufacturer may be responsible. So this period is generally covered
by the manufacturer's warranty. The duration of this period varies from
component to component; it typically lasts until the failure rate stops
decreasing and levels off.
The hazard rate in this phase can be reduced by increasing quality control at the
production level.
However, even increased quality control at the production stage cannot completely
eliminate infant mortality. Therefore, components should be tested at the factory before
they are delivered to customers, which is why good companies generally test
components before supplying them to users.
Bathtub Curve: Summary Table
Phase | Failure Rate | Possible Causes | Possible Improvement Actions
Burn-in (A-B) | Decreasing (DFR) | Manufacturing defects, welding, soldering, assembly errors, part defects, poor QC, poor workmanship, etc. | Better QC, acceptance testing, burn-in testing, screening, Highly Accelerated Stress Screening, etc.
Useful Life (B-C) | Constant (CFR) | Environment, random loads, human errors, chance events, 'Acts of God', etc. | Excess strength, redundancy, robust design, etc.
Wear-out (C-D) | Increasing (IFR) | Fatigue, corrosion, aging, friction, etc. | Derating, preventive maintenance, parts replacement, better material, improved designs, technology, etc.
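One common way to illustrate the three failure-rate regimes, which is not taken from the slides, is the Weibull hazard rate h(t) = (β/η)(t/η)^(β−1). A shape parameter β below 1 gives a decreasing rate, β = 1 a constant rate and β above 1 an increasing rate. The sketch below assumes this model and uses illustrative parameter values. The function names are hypothetical.

```python
def weibull_hazard(t, shape, scale=1.0):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def classify_trend(values, tol=1e-12):
    """Label a sequence of hazard values as DFR, CFR, IFR or mixed."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(abs(d) <= tol for d in diffs):
        return "CFR"
    if all(d < 0 for d in diffs):
        return "DFR"
    if all(d > 0 for d in diffs):
        return "IFR"
    return "mixed"


times = (0.5, 1.0, 2.0)
# Illustrative shape values for the burn-in, useful-life and wear-out phases.
for label, shape in (("burn-in", 0.5), ("useful life", 1.0), ("wear-out", 3.0)):
    rates = [weibull_hazard(t, shape) for t in times]
    print(label, classify_trend(rates), rates)
```

A real bathtub curve combines all three behaviours over the life of one population, so a single Weibull model describes only one phase at a time.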
ESTIMATION OF RELIABILITY FUNCTIONS FROM FAILURE DATA
It is a real challenge for the people working in the field of reliability to estimate the basic
reliability functions (reliability, cumulative failure distribution, failure density and hazard
rate) from the failure data of a sample of identical components obtained either from test
generated failures or collected from actual field of operation of these components.
Estimate and plot the reliability, cumulative failure distribution, failure density and failure rate
functions.
Solution:
In this example, the number of components in the sample is large. So instead of recording the
time to failure of each component individually, it is a common practice to record the number of
failures in some suitable intervals of time.
In this example, the interval of time is taken as 10 hours. The estimate of reliability,
cumulative failure distribution, failure density and failure/hazard rate functions are given
in columns 5, 6, 7 and 8 of Table 13.3. The calculations for different columns are
explained below.
Calculations of Columns 1 and 2
In column 1, we simply enter time in hours starting from the lower bound of first class
interval up to the upper bound of last class interval for the data given in Table 13.2.
Entries in the second column represent number of failures observed during test in each
interval. The number of failures (frequencies) listed against each class interval of time in
Table 13.2 is the sum of the numbers of all components that fail during that time interval.
In the second column of Table 13.3, this frequency is entered against the upper bound of
the class interval during which it is observed.
Calculation of Column 3
Entries in this column simply represent the cumulative frequencies of the frequencies
written in column 2.
If N_f(t) denotes the number of components that have failed by time t, then each entry of
the column is the sum of the column 2 entries up to and including time t.
Calculation of Column 4
Entries of column 4 represent the number of components that are still performing their
intended function adequately at time t. Therefore, if N_s(t) denotes the number of
components that are performing their intended function at time t and N is the total
number of components in the sample, then
$$N_s(t) = N - N_f(t)$$
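The column-by-column procedure can be sketched in Python. The estimators used here are the usual ones for grouped data and are an assumption of this example: R(t) = N_s(t)/N, F(t) = N_f(t)/N, f(t) as failures in the next interval divided by N Δt, and the hazard rate as failures in the next interval divided by N_s(t) Δt. The failure counts are made up and are not the data of Table 13.2.

```python
def estimate_from_grouped_failures(n_total, failures_per_interval, interval_width):
    """Return one row per time point with N_f, N_s and the estimated R, F, f and h."""
    rows = []
    failed = 0  # N_f(t): cumulative failures up to the current time point
    for k, n_k in enumerate(failures_per_interval):
        survivors = n_total - failed  # N_s(t) at the start of the interval
        rows.append({
            "t": k * interval_width,
            "N_f": failed,
            "N_s": survivors,
            "R": survivors / n_total,
            "F": failed / n_total,
            # failures in the next interval per component on test and per unit time
            "f": n_k / (n_total * interval_width),
            # failures in the next interval per surviving component and per unit time
            "h": n_k / (survivors * interval_width) if survivors else None,
        })
        failed += n_k
    # Closing row at the upper bound of the last interval: no interval follows it.
    rows.append({
        "t": len(failures_per_interval) * interval_width,
        "N_f": failed,
        "N_s": n_total - failed,
        "R": (n_total - failed) / n_total,
        "F": failed / n_total,
        "f": None,
        "h": None,
    })
    return rows


# Made-up data: 100 components, failures counted in five 10-hour intervals.
table = estimate_from_grouped_failures(100, [30, 20, 15, 10, 25], 10.0)
for row in table:
    print(row)
```

Each row satisfies R + F = 1, and the hazard estimate exceeds the density estimate whenever some components have already failed, because it divides by the survivors rather than by the full sample.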