
Module-VI Reliability & Failure Data Analysis

Reliability measures a product's ability to perform its intended function, expressed as a probability, with higher reliability indicating lower failure rates. It is crucial for product quality and involves reliability management, which includes testing and estimating reliability through statistical methods. Additionally, metrics like Mean Time Between Failures (MTBF) and Mean Time To Repair (MTTR) are essential for assessing and improving system uptime and maintenance efficiency.

Uploaded by

luluparida358

Reliability

Reliability is a measure of the ability of a product or part to perform its intended
function under a prescribed set of conditions. In effect, reliability is expressed as a
probability.
Suppose that an item has a reliability of .90. This means that it has a 90 percent
probability of functioning as specified. The probability that it will fail, i.e., its
failure rate, is 1 - .90 = .10, or 10 percent. Hence, it is expected that, on average, 1
out of every 10 such items will fail or, equivalently, that the item will fail, on
average, once in every 10 trials. Similarly, a reliability of .985 implies 15 failures
per 1,000 parts or trials.
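The arithmetic above can be sketched in a few lines of Python. The function name `expected_failures` is made up for this illustration and does not come from the original notes.

```python
def expected_failures(reliability: float, trials: int) -> float:
    """Expected number of failures among `trials` items or trials.

    Each item is assumed to fail with probability 1 - reliability.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return (1.0 - reliability) * trials


print(round(expected_failures(0.90, 10), 3))     # 1.0
print(round(expected_failures(0.985, 1000), 3))  # 15.0
```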
The reliability of a product or part is used in two ways:
1. Reliability when activated: This focuses on one point in time and is
often used when a product or part must operate only once, such as a missile or
an air bag in a car.
2. Reliability for a given length of time: This focuses on the length of service, as
with most other products, e.g., a car.
Reliability is an important dimension of product quality. Reliability management
involves establishing, achieving, and maintaining reliability objectives for
products, e.g., the expected life of a particular make of light bulb may be specified
to be 5,000 hours. Achieving reliability usually falls on the shoulders of reliability
engineers, who use a variety of techniques to build reliability into products (e.g., by
using reliable key components), test their performance, and estimate their
reliability. If the reliability is inadequate, the types of failure and their effect on the
product should be determined, their root cause(s) identified, and potential failures
prevented. We will mainly focus on reliability measurement, which involves
statistics and probability theory. The average reliability of a part is measured by
testing several units over time until some or all fail. However, this time may be
very long (several years). To accelerate this, the items are stressed by using
extreme environmental conditions such as high temperature, temperature cycles
(e.g., hot–cold), high humidity, high vibration, high voltage, surges in power, etc.
The resulting life estimate is then adjusted appropriately. Reliability of a product is
determined from the reliability of its parts.
What follows are examples illustrating the use of two probability rules to
determine whether a given product will operate successfully. Let Pi = probability
that event i occurs, i = 1, 2, 3, . . .
Rule 1. If two or more events are independent and “success” is defined as the
occurrence of all of the events, then the probability of success Ps is equal to the
product of the probabilities of the events occurring, i.e., Ps = P1 × P2 × . . .
Example. Suppose a room has two lamps, but to have adequate light, both lamps must work
(success) when turned on. Here the product is the lighting system, which has two
component lamps. One lamp has a probability of working of .90, and the other has
a probability of working of .80. The probability that both will work is .90 × .80
= .72. This lighting system can be represented as two components connected in series:
[Diagram: Lamp 1 (.90) in series with Lamp 2 (.80)]
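Rule 1 can be sketched in Python as a product over the component probabilities. The function name `series_reliability` is an illustrative choice, and the sketch assumes the components fail independently, as the rule requires.

```python
from math import prod


def series_reliability(probabilities):
    """Probability that every independent component in a series works."""
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must be between 0 and 1")
    return prod(probabilities)


# Two lamps that must both work: .90 and .80
print(round(series_reliability([0.90, 0.80]), 2))  # 0.72
```

Because every factor is at most 1, adding a component to a series system can never raise the system's reliability.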

Failure Data Analysis

A series of tests was conducted under certain conditions on 1000 electronic
components. The total duration of the test was 19 hours. The number of components
that failed in each hourly interval was noted, and the results were tabulated
(Table 1). The time interval Δt was taken as 1 hour. The number of failures in an
interval is represented by ʄ, and the cumulative number of failures to the end of
the interval by F.
Since the number of components that failed during an interval is noted at the
end of that interval (or at the beginning of the next), these values are entered
between two successive values of Δt in column (2), as marked by (*).
Based on the failure data or survival test we can study:
i. Failure Density,
ii. Failure Rate,
iii. Reliability,
iv. Probability of Failure.
Reliability: This is the ratio of the survivors at any given time to the initial
population.
The reliability at the end of the first hour will be R(1) = 870/1000 = 0.870
The reliability at the end of the second hour will be R(2) = 787/1000 = 0.787
The reliability at the end of the twelfth hour will be R(12) = 286/1000 = 0.286
Table 1

(1)   (2)       (3)         (4)        (5)      (6)      (7)
Time  Number of Cumulative  Number of  Failure  Failure  Reliability
      Failures  Failures    Survivors  Density  Rate
Δt    ʄ         F           S          FD       Z        R
0               0           1000                         1
(*)   130                              0.130    0.139
1               130         870                          0.870
      83                               0.083    0.101
2               213         787                          0.787
      75                               0.075    0.100
3               288         712                          0.712
      68                               0.068    0.100
4               356         644                          0.644
      62                               0.062    0.101
5               418         582                          0.582
      56                               0.056    0.101
6               474         526                          0.526
      51                               0.051    0.101
7               525         475                          0.475
      46                               0.046    0.101
8               571         429                          0.429
      41                               0.041    0.100
9               612         388                          0.388
      37                               0.037    0.100
10              649         351                          0.351
      34                               0.034    0.101
11              683         317                          0.317
      31                               0.031    0.103
12              714         286                          0.286
      28                               0.028    0.103
13              742         258                          0.258
      64                               0.064    0.283
14              806         194                          0.194
      76                               0.076    0.486
15              882         118                          0.118
      62                               0.062    0.714
16              944         56                           0.056
      40                               0.040    1.110
17              984         16                           0.016
      12                               0.012    1.200
18              996         4                            0.004
      4                                0.004    2.000
19              1000        0                            0
Sum of FD = 1.00; mean of Z = 0.376
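The columns of the table can be rebuilt from the hourly failure counts alone. The Python sketch below does this under two assumptions that the notes do not state explicitly: failure density is taken as ʄ / (N × Δt), with N the initial population, and failure rate as ʄ divided by Δt times the average number of survivors over the interval. These definitions reproduce the first row (0.130 and 0.139), although a few tabulated failure rates differ from the computed values in the third decimal place. The function name `failure_table` is made up for this illustration.

```python
def failure_table(failures, population, dt=1.0):
    """Build one row per interval from per-interval failure counts."""
    rows = []
    survivors = population
    cumulative = 0
    for f in failures:
        start = survivors              # survivors at the start of the interval
        if f > start:
            raise ValueError("more failures than surviving components")
        survivors -= f                 # survivors at the end of the interval
        cumulative += f
        density = f / (population * dt)        # assumed definition of FD
        average = (start + survivors) / 2      # mean population in interval
        rate = f / (dt * average)              # assumed definition of Z
        rows.append({
            "f": f, "F": cumulative, "S": survivors,
            "FD": density, "Z": rate, "R": survivors / population,
        })
    return rows


# Hourly failure counts from column (2) of the table
failures = [130, 83, 75, 68, 62, 56, 51, 46, 41, 37,
            34, 31, 28, 64, 76, 62, 40, 12, 4]
rows = failure_table(failures, population=1000)

first = rows[0]
print(f"{first['FD']:.3f} {first['Z']:.3f} {first['R']:.3f}")  # 0.130 0.139 0.870
print(f"{rows[11]['R']:.3f}")                                   # 0.286
```

The failure rate stays close to 0.10 per hour through the thirteenth hour and then climbs steeply, which is what the final rows of the table show.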

MTBF/MTTR Calculation
I. How to calculate MTBF
To calculate MTBF, divide the total operation time by the number of
failures.

The total operation time is the difference between the total working time and the
total breakdown time, so the MTBF formula is:
MTBF = (Total working time - Total breakdown time) / Number of breakdowns

In the MTBF formula:
total working time is the number of hours the machine would have
been operating had it not failed;
total breakdown time is the unplanned downtime (thus excluding scheduled
maintenance, i.e., inspections, periodic revisions or preventive replacements);
number of breakdowns is the number of failures.

MTBF Calculation Example


Let’s imagine that an asset which is expected to work 24 hours a day has three
outages. One lasts for an hour, another for 2 hours and the final breakdown lasts
30 minutes.

total working time = 24 hours
total breakdown time = 3.5 hours (1 + 2 + 0.5)
number of breakdowns = 3

Then, as per the MTBF formula, the Mean Time Between Failures calculation
will be:
MTBF = (24 - 3.5) / 3 = 20.5 / 3 ≈ 6.83 hours
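As a rough check, the same calculation can be written as a short Python function. The name `mtbf` and its parameters are illustrative only. The sketch takes the total operation time to be working time minus the summed breakdown durations, all in hours.

```python
def mtbf(total_working_time, breakdown_durations):
    """Mean time between failures: operating time per breakdown."""
    breakdowns = len(breakdown_durations)
    if breakdowns == 0:
        raise ValueError("MTBF is undefined when there are no breakdowns")
    operating_time = total_working_time - sum(breakdown_durations)
    return operating_time / breakdowns


# 24-hour day with outages of 1 h, 2 h and 0.5 h
print(round(mtbf(24, [1, 2, 0.5]), 2))  # 6.83
```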
Why is MTBF important?
MTBF is useful for estimating how likely an asset is to fail and how often certain
failures occur. This makes it extremely important for reliability engineering,
and it is also an indicator of the asset’s availability.

While there are many more maintenance KPIs to keep an eye on, the
MTBF is a guideline for preventive maintenance scheduling. Plus, an
accurate estimate helps with inventory planning and prevents stockouts.

However, remember that any KPI is only as good as the data behind it. To make sure you
have accurate equipment data for calculating MTBF, you need the right
CMMS (computerized maintenance management system).

II. MTTR Calculation


The MTTR formula is the total unplanned maintenance
time divided by the total number of repairs (failures). MTTR is most commonly
expressed in hours. Keep in mind, MTTR assumes tasks are performed
sequentially and by trained maintenance personnel.
Total unplanned maintenance time / Total number of repairs = MTTR
A simple example of MTTR might look like this: if you have a pump that fails four
times in one workday and you spend a total of one hour repairing those four
failures, your MTTR would be 15 minutes (60 minutes / 4 = 15 minutes).
Another example could involve an asset that experiences five outages in a 90-day
period. The outage times (time of detection to time the asset is back in
production) are 24, 51, 79, 56 and 12 minutes. The MTTR for this 90-day period is
about 44 minutes (222 minutes / 5 = 44.4 minutes). That is the average time from
the detection of the issue to the recovery of the asset.
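A minimal Python sketch of the MTTR formula follows. The function name `mttr` is an assumption made for illustration. The second call uses the five outage durations listed in the example, in minutes.

```python
def mttr(total_unplanned_maintenance_time, number_of_repairs):
    """Mean time to repair: unplanned maintenance time per repair."""
    if number_of_repairs <= 0:
        raise ValueError("number_of_repairs must be positive")
    return total_unplanned_maintenance_time / number_of_repairs


# Pump: 60 minutes of repair work spread over four failures
print(mttr(60, 4))  # 15.0

# Outage durations in minutes, detection to recovery
outages = [24, 51, 79, 56, 12]
print(mttr(sum(outages), len(outages)))  # 44.4
```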
There are two assumptions to keep in mind when calculating MTTR:
1. Usually, every instance of failure varies in severity, so while some breakdowns
require days to repair, others might only take minutes. Therefore, MTTR gives you
an average of what to expect.
2. It's important that every instance of failure is attended to by competent and
properly trained maintenance personnel who follow standardized procedures.
This ensures reliable results.
It's been said that some of the best maintenance teams in the world have an
MTTR of less than five hours, but it's almost impossible to benchmark your
facility's MTTR with another's metrics due to the number of variables. MTTR
depends on multiple factors like the type of asset you're analyzing, its age,
criticality, maintenance team training, etc.

MTTR vs. MTBF: What's the Difference?


When dealing with systems or equipment that can be repaired, MTTR and MTBF
are two metrics often analyzed and compared when looking into failures that can
result in costly downtime. So, what's the difference between the two? Mean time
between failures (MTBF) is a prediction of the time between the innate failures of
a piece of machinery during normal operating hours or how long a piece of
equipment operates without interruption. It's calculated by taking the total time
an asset is running (uptime) and dividing it by the number of breakdowns that
happened over that same period of time.
MTBF = Total uptime / # of Breakdowns
MTBF analysis helps maintenance departments strategize on how to increase the
time between failures. Together, MTBF and MTTR determine uptime. To calculate
a system's uptime with these two metrics, use the following formula:
Uptime = MTBF / (MTBF + MTTR)
Consider the following scenario: Your system is supposed to be up and running 40
hours, but it wasn't working for 28 of those hours. It's only been available for 12
hours, and a total of five failures occurred. Using our uptime formula, we'll first
calculate MTBF by taking (40 - 28) / 5 = 2.4. Next, we'll calculate MTTR by taking 28 /
5 = 5.6. So, to calculate uptime, our formula would look like this:
2.4 / (2.4 + 5.6) = 0.30 (30%)
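The uptime formula can be sketched in Python for a system scheduled to run 40 hours that is down for 28 of them across five failures. The function name `uptime_metrics` is hypothetical. As a sanity check, the resulting fraction should equal operating time divided by scheduled time, here 12 / 40.

```python
def uptime_metrics(scheduled_hours, downtime_hours, failures):
    """Return (MTBF, MTTR, uptime fraction) for one observation window."""
    if failures <= 0:
        raise ValueError("failures must be positive")
    if not 0 <= downtime_hours <= scheduled_hours:
        raise ValueError("downtime must lie within the scheduled hours")
    mtbf = (scheduled_hours - downtime_hours) / failures  # uptime per failure
    mttr = downtime_hours / failures                      # downtime per failure
    uptime = mtbf / (mtbf + mttr)
    return mtbf, mttr, uptime


mtbf_hours, mttr_hours, uptime = uptime_metrics(40, 28, 5)
print(round(mtbf_hours, 2), round(mttr_hours, 2), round(uptime, 2))  # 2.4 5.6 0.3
```

Note the parentheses around the subtraction: the downtime must be subtracted from the scheduled hours before dividing by the number of failures.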

How to Improve MTTR


MTTR is seen as a key performance indicator (KPI). Therefore, maintenance teams
should always strive to improve it. The benefits of reducing MTTR are fairly
obvious – less downtime means stable production, happy customers and reduced
maintenance costs. So, what are some steps you can take to help improve your
organization's MTTR? The best place to start is understanding the four stages of
MTTR and taking steps to reduce each of them.
