Simple Guide To MTBF
MTBF – What it is
and when to use it
Mean Time Between Failures (MTBF) is one of the most widely recognised and yet least
understood indicators in the maintenance and reliability world. Manufacturers quote
it as a rating of their products and industry uses it as a measure of success. But there
is so much misunderstanding associated with MTBF that there is even an online
movement to abandon it. In this article, I will explain in simple terms what MTBF
is, what it’s not, when to use it and when not to.
What is MTBF?
It is said that the great Greek philosopher Socrates argued that “the beginning of
wisdom is the definition of terms.”
Socrates would have been unimpressed with our use of MTBF and would have
challenged our collective wisdom about it.
Sure, there are clear definitions for MTBF. But, unfortunately, there is a lack of
common understanding of what MTBF really means.
MTBF stands for Mean Time Between Failures and represents the
average time between two consecutive failures of a repairable system.
For example, three identical pieces of equipment are put into service and run
until they fail. The first system fails after 200 hours, the second after 250 hours
and the third after 400 hours. The MTBF of the systems is the average of the
three failure times, which is 283.33 hours.
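The arithmetic in this example can be checked with a minimal Python sketch. The variable names are illustrative only and are not part of the original article.

```python
from statistics import mean

# Running hours until failure for the three units in the example above.
hours_to_failure = [200, 250, 400]

# MTBF here is simply the arithmetic mean of the observed failure times.
mtbf_hours = mean(hours_to_failure)

print(f"MTBF = {mtbf_hours:.2f} hours")  # MTBF = 283.33 hours
```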
Simple guide to MTBF – What it is and when to use it
MTBF is related to failure rate. It assumes a constant random failure rate during
the useful life of a piece of equipment.
We can learn more about MTBF by exploring its origin and the reasons why it
came into use. It also helps to compare MTBF with other indicators to avoid
confusion about terms. This article covers all these aspects along with some
clear guidance about where to use and not to use MTBF.
Failure rate
The failure rate is the number of failures in a component or piece of equipment
over a specified period. It is important to note that the measurement excludes
maintenance-related outages. These outages are not deemed to be failures and
therefore, do not form
part of this calculation. A failure rate does not correlate with online time or
availability for operation – it only reflects the rate of failure.
A product typically shows a high rate of infancy failures at the beginning of its
life and a high rate of wear-out failures at the end of its life. But in between,
during the product’s useful life, its rate of failure is assumed to be roughly constant.
Page 2 of 14 www.roadtoreliability.com
Reliability
Before World War II, the term reliability described how repeatable a test was.
The more repeatable the results, the more reliable the test, whether it be in the
field of mechanics, psychology or any other scientific endeavour. However, the
challenges of World War II caused new developments in the definitions and
engineering associated with reliability.
Electronic equipment was highly problematic during the war. Up to half of the
electronic equipment on a naval vessel could be out of service at any time –
leading to a renewed focus on understanding and improving equipment
reliability. Working groups developed strategies like setting quality and reliability
standards for electronic equipment suppliers.
Around this same time, studies showed that up to 60% of failures in army missile
systems were related to component reliability. Military and commercial aviation
continued to drive improvements in reliability engineering throughout the
twentieth century.
MTBF
We calculate MTBF by dividing the total running time by the number of failures
during a defined period. As such, it is the inverse of the failure rate.
Assume a situation where 1,000 cars run for one year. If one car
fails in that time, the MTBF would be:
MTBF = (1,000 cars × 1 year) / 1 failure = 1,000 years
In a population of 500 ANSI pumps in water service across multiple sites, 600
failures occur in a period of three years. The MTBF would be:
MTBF = (500 pumps × 3 years) / 600 failures = 2.5 years
On their own, these numbers provide some information about reliability but not
enough to fully understand the reliability performance of the equipment.
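As a rough sketch, the car and pump calculations can be expressed in Python using the definition given earlier (total running time divided by the number of failures). The function name `mtbf` is hypothetical, and the sketch assumes every unit runs for the whole period, so downtime is ignored.

```python
def mtbf(total_running_time: float, failures: int) -> float:
    """Total running time divided by the number of failures."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return total_running_time / failures


# Assumption: each unit runs for the full period (no downtime subtracted).
cars_mtbf_years = mtbf(total_running_time=1_000 * 1, failures=1)
pumps_mtbf_years = mtbf(total_running_time=500 * 3, failures=600)

# The failure rate is the inverse of MTBF (failures per unit-year here).
pumps_failure_rate = 1 / pumps_mtbf_years

print(cars_mtbf_years)     # 1000.0
print(pumps_mtbf_years)    # 2.5
print(pumps_failure_rate)  # 0.4
```

Read this way, an MTBF of 1,000 years does not mean a car lasts 1,000 years. It means roughly one failure per 1,000 car-years of operation across the population.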
Service life
Service life refers to the entire duration of a piece of equipment’s use. We measure it
from the time of commissioning to its final failure or decommissioning.
Engineers also predict service life based on the design specifications. A service
life prediction would typically be used in calculations to justify the capital
expense of a new asset. Actual service life can be compared with the design
service life of a piece of equipment to determine whether it met the
expectations of engineers when it was first purchased.
One unique example is that of a missile. By nature, we expect a very high MTBF
for a missile, indicating a very low probability of failure. But the service life of a
missile is very short. It can be as little as a few minutes from the time a missile is
fired to the time it explodes.
Mission life
Mission life is the duration used for reliability calculations and analysis. For
example, we base the failure rate calculation on the number of failures in a
specific time. This time is known as the mission life.
Engineers use reliability indicators to predict failures and make decisions about
the future mission life of their equipment. This includes making decisions about
spares holding or maintenance strategies for a mission life of the next five years.
Useful life
Useful life refers to the flat part of the bathtub failure curve. It leaves out the
time associated with infancy failures at the beginning as well as the time
associated with wear out failures at the end of a product’s life. Useful life is,
therefore, the operational life of any piece of equipment.
A fan belt in a motor is a typical example. Fan belts should have an MTTF that is
higher than the MTBF of the equipment into which they fit. Otherwise, the whole
piece of equipment may fail when the fan belt fails. This correlation provides a key for
improving an engineering design. The way to improve the MTBF of a complex
system may be to purchase better quality parts that have a higher MTTF.
Nevertheless, one must always bear in mind that MTTF and MTBF
are probabilistic measures and do not guarantee that a piece of equipment will
last that long.
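To illustrate why MTBF is not a guaranteed life, here is a minimal sketch that assumes the constant failure rate mentioned earlier in this article. Under that assumption (and only under it), the standard exponential model gives the probability of running for a time t without failure as exp(-t / MTBF). The function name and the sample values are illustrative assumptions, not figures from the article.

```python
import math


def survival_probability(t: float, mtbf: float) -> float:
    """Probability of no failure up to time t, assuming a constant failure rate."""
    return math.exp(-t / mtbf)


mtbf_hours = 283.33  # illustrative value only

for t in (50, mtbf_hours, 500):
    p = survival_probability(t, mtbf_hours)
    print(f"{t:7.2f} h: {p:.1%} chance of running without failure")
```

At t equal to the MTBF the result is exp(-1), or about 36.8%. In other words, under this model most units would be expected to fail before they reach their MTBF.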
The acronym MTTR could also describe Mean Time To Recovery, which is slightly
different. When using recovery as the basis, the time added must include the
notification time of maintenance tasks. In other words, besides the repair time,
there is additional time to diagnose the fault and plan the repair. Using recovery
as the basis for the calculation gives a higher result than using repair time alone.
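The difference between the two readings of MTTR can be sketched as follows. The outage records and field names are hypothetical and exist only to show the arithmetic.

```python
from statistics import mean

# Hypothetical outage records in hours; not data from the article.
outages = [
    {"notify_and_diagnose": 1.5, "repair": 4.0},
    {"notify_and_diagnose": 0.5, "repair": 2.0},
    {"notify_and_diagnose": 2.0, "repair": 6.0},
]

# Repair basis: hands-on repair time only.
mean_time_to_repair = mean(o["repair"] for o in outages)

# Recovery basis: notification, diagnosis and planning time plus repair time.
mean_time_to_recovery = mean(
    o["notify_and_diagnose"] + o["repair"] for o in outages
)

print(f"Mean time to repair:   {mean_time_to_repair:.2f} h")
print(f"Mean time to recovery: {mean_time_to_recovery:.2f} h")
```

Because recovery time includes everything in repair time plus the lead-up, the recovery figure can never be lower than the repair figure for the same set of outages.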
MTTR does not give enough information on its own to improve maintenance
performance. Reasons for the duration must be investigated to determine
whether the time to repair can be reduced. Strategies to reduce repair times
Lengthy repairs have the potential to cause a loss in production. Where this is
the case, the losses are usually much more significant than the cost of the repair
itself. Loss of production adds a significant economic incentive to minimise the
MTTR of mission-critical equipment.
MTTR is different to MTBF. Having both results available gives more information
to engineers than either one gives on its own. Equipment that fails regularly but
is quick to repair needs a different reliability solution to equipment that hardly
ever fails but takes a long time to repair.
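One common way to combine the two indicators is the steady-state availability ratio MTBF / (MTBF + MTTR). This formula is a standard reliability relationship rather than something stated in this article, and the two assets below are hypothetical. The sketch shows why neither number is sufficient on its own.

```python
import math


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: average uptime over average uptime plus downtime."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical assets, not figures from the article.
frequent_quick = availability(mtbf_hours=100, mttr_hours=1)    # fails often, fixed fast
rare_slow = availability(mtbf_hours=5_000, mttr_hours=50)      # fails rarely, fixed slowly

print(f"{frequent_quick:.4f} vs {rare_slow:.4f}")  # 0.9901 vs 0.9901
print(math.isclose(frequent_quick, rare_slow))     # True
```

Both assets show the same availability, yet one suffers fifty times as many failures. That difference only becomes visible when MTBF and MTTR are reported side by side.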
But the limitations of using the handbooks and their assumptions must be taken
into account when using predicted reliability information. Predicted reliability is
most useful for comparative purposes. For example, a manufacturer could
compare the predicted MTBF of different components to help them choose the
most appropriate component for their product.
There are two main methods of reliability prediction, with one variation
included:
• The parts count method uses the failure rate of the various components
as well as the count of components to calculate a failure rate for the
product itself. It is a theoretical exercise and can only be verified once the
product is in service, and an actual failure history is established.
• The parts stress method uses actual field information from large numbers
of the component operating within its rated conditions. Engineers use this
historical data as a base for predicting the failure rate of products sold in
the present. Of course, field information is not available when a new
component comes onto the market. Therefore, some manufacturers use a
modified version of the parts stress method known as the accelerated life
testing method.
• The accelerated life testing method seeks to establish failure statistics for
a product by placing it under high stress, for example, operating a
component at a temperature higher than its rating. These extreme
operating conditions cause premature component failure. Engineers use
this failure information to back-calculate predicted reliability under
normal operating conditions.
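The parts count method described above can be sketched in its simplest form. The sketch assumes that any single part failure fails the product (a series system), that each part has a constant failure rate, and that no quality or environmental adjustment factors apply. The bill of materials and its failure rates are hypothetical, not handbook values.

```python
# Hypothetical bill of materials:
# (part name, quantity, failures per million operating hours per part).
parts = [
    ("capacitor", 10, 0.02),
    ("resistor", 25, 0.005),
    ("controller IC", 1, 0.5),
]

# Series-system assumption: the product failure rate is the sum of
# quantity x failure rate over all parts.
product_failure_rate = sum(qty * rate for _, qty, rate in parts)

# MTBF is the inverse of the failure rate; 1e6 converts back to hours.
predicted_mtbf_hours = 1e6 / product_failure_rate

print(f"Failure rate:   {product_failure_rate:.3f} per million hours")
print(f"Predicted MTBF: {predicted_mtbf_hours:,.0f} hours")
```

In this made-up example the single controller IC contributes more to the product failure rate than the other 35 parts combined, which is the kind of design weakness such a prediction is meant to expose.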
Misunderstanding MTBF can lead to poor business decisions that are costly to
organisations. Using MTBF without additional information about the causes of
failures and how to predict failures fails to take advantage of the multiple tools
for maintenance and reliability available to engineers. Rather than build a
maintenance strategy on a theoretical constant rate of failure, maintenance
practitioners can build their strategy around current condition monitoring
results and predictions of failure.
The best approach for deciding whether to use MTBF is to first establish the
reasons behind the need for this information. For example, if the need is to set
spares holding requirements, then there may be a better approach or more
information required to make that decision. If the need is to estimate the
expected mission or service life of a piece of equipment, then MTBF is not the
right tool for that task.
Maintenance and reliability practitioners can use this information to evaluate the
performance of their equipment. If their ANSI pump falls into an acceptable
range, they may turn their attention to other equipment that could benefit from
more direct intervention. But if their pump is performing poorly, it gives them
the motivation to investigate the reasons why and come up with corrective
measures.
Over time, equipment should become more reliable, and therefore, MTBF
should increase. If there is no noticeable change in MTBF, then the reliability
program is not achieving its objectives. A positive trend of MTBF over time for
equipment on site gives maintenance and reliability practitioners confidence
that their programs are achieving the desired results. However, reliability
initiatives may take some time to reflect in the lagging indicators like MTBF.
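Tracking the trend can be as simple as calculating MTBF per period and comparing consecutive values. The yearly records below are hypothetical, and the sketch assumes at least one failure in every year.

```python
# Hypothetical yearly records for one asset: (running hours, number of failures).
yearly_records = {2021: (8_000, 8), 2022: (8_200, 6), 2023: (8_100, 4)}

# MTBF per year; a year with zero failures would need separate handling.
mtbf_by_year = {
    year: hours / failures
    for year, (hours, failures) in sorted(yearly_records.items())
}

# The trend is positive if every year improves on the one before it.
values = list(mtbf_by_year.values())
improving = all(later > earlier for earlier, later in zip(values, values[1:]))

for year, value in mtbf_by_year.items():
    print(f"{year}: MTBF = {value:,.0f} hours")
print("Positive trend" if improving else "No consistent improvement")
```

Since MTBF is a lagging indicator, a short run of periods like this one should be read with caution before drawing conclusions about a reliability program.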
MTBF is also useful for engineering design. Engineers use MTBF in electronic
manufacture to compare the effect of using different components in an
electronic product. It also helps identify design weaknesses. There may be one
component that lowers the MTBF of the product as a whole, and a single change
could make a significant impact on design reliability. Electronic manufacturers
choose components that meet their overall MTBF objective. Over-specifying
components adds to the cost of the product, but under-specifying could lead to
premature failures and customer dissatisfaction.
Conclusion
In this article, we have explored the idea of MTBF – its origins, the
misunderstandings people have about its meaning and the ways it is used and
abused.
References
1. History of Reliability Engineering, James McLinn, American Society for
Quality – Reliability and Risk Division,
https://www.asqrd.org/home/history-of-reliability/
2. Reliability Engineering Principles for the Plant Engineer, Drew Troyer,
Reliable Plant,
https://www.reliableplant.com/Read/18693/reliability-engineering-plant
Erik Hupjé is the founder of the Road to Reliability™ and has over two decades of experience in
the areas of maintenance, reliability, and asset management. Erik has a passion for continuous
improvement and keeping things simple. Through the Road to Reliability™, he helps
Maintenance & Reliability professionals around the globe improve their plant’s reliability and
their organisation’s bottom line.
Erik worked and lived in the Netherlands, the United Kingdom, the Philippines, and the Sultanate
of Oman for multinationals like Shell and ConocoPhillips. He is now based in Brisbane, Australia
where he lives with his wife Olga and their three children. Personal interests include traveling,
cooking, history, and beach fishing.