Operator Driven Reliability Transcript PDF
Module 1: Introduction
Welcome
Welcome to the interactive course about Operator Driven Reliability. This course comprises about 90 slides
with audio and text narration. The estimated study time is approximately two hours. We hope you will enjoy
the course and find its contents useful in your daily work.
This course comprises five lessons. In the pages that follow we will outline the learning objectives for each
lesson.
Module 2:
Lesson 1: Terminology
What is maintenance?
We conduct maintenance because we think there is greater “risk” associated with not doing so. Costs arise
from the consequences of equipment failure, so maintenance is conducted because it is thought that it would
cost more if no maintenance were conducted. Maintenance consists of actions taken to reduce or eliminate the
consequences of equipment failure. It is, and always has been, a risk control measure. As with all risk
measures, the basics of risk management must be applied if the measure is to be effective.
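The risk argument above can be made concrete with the usual definition of risk as probability multiplied by consequence. The sketch below compares the yearly cost of a maintenance task with the expected failure cost it avoids. The function name and all figures are hypothetical and are not taken from the course.

```python
def expected_failure_cost(prob_per_year: float, consequence_cost: float) -> float:
    """Risk expressed as expected cost per year: probability x consequence."""
    if not 0.0 <= prob_per_year <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return prob_per_year * consequence_cost


# Hypothetical figures for one asset (assumptions for illustration only).
risk_without = expected_failure_cost(0.5, 80_000.0)    # no maintenance
risk_with = expected_failure_cost(0.125, 80_000.0)     # with maintenance
maintenance_cost = 12_000.0                            # yearly cost of the task

# The task is worth doing only if the risk it removes exceeds its own cost.
net_benefit = (risk_without - risk_with) - maintenance_cost
print(f"risk reduction {risk_without - risk_with:.0f}, net benefit {net_benefit:.0f}")
```

A negative net benefit would suggest that the task costs more than the risk it controls.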
Let’s now look at how this compares with the traditional approach to maintenance.
“Traditional” maintenance
The maintenance program in place in most organisations has been typically derived as a result of a
combination of factors, which may include:
• Relevant legislation and standards
• Machinery vendor recommendations, often applied without question
• Reactions to past plant problems
• Application of new maintenance technologies for their own sake, without questioning whether this
technology is really needed.
Therefore, traditional maintenance is largely defined by judgment and experience, and without any clearly
documented justification. Such a “traditional” approach to maintenance often exhibits some serious
shortcomings, which may include a failure to focus maintenance on critical equipment combined with failure to
take account of failure characteristics. In consequence, potential failures of critical machinery are often not
addressed. The lack of any clearly documented justification means that prescribed maintenance activities may
not be effective, and may (in fact) not even be worth doing at all!
Maintainability
Maintainability is the likelihood that an item or asset can be restored to operating condition within a stated
period of time under stated maintenance support conditions. A simple measure of this is the Mean Time to
Repair, or MTTR.
Reliability
Reliability is the likelihood that an item will operate without failure for a stated period of time under stated
operating conditions. It’s often expressed as a probability.
A key measure of reliability is the Mean Time Between Failures (MTBF).
Availability
Availability is the likelihood that an item will be available when required, given a stated operational use and
stated support conditions. Under steady state conditions, this becomes the proportion of time that an item is
available.
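Under steady-state conditions, the three measures above are commonly linked by the standard relation availability = MTBF / (MTBF + MTTR). The sketch below applies it. The function name and the figures for the pump are illustrative assumptions, not course data.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Proportion of time an item is available: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical pump: fails every 400 h on average and takes 8 h to repair.
baseline = steady_state_availability(400.0, 8.0)

# Two routes to higher availability:
faster_repair = steady_state_availability(400.0, 4.0)    # better maintainability
fewer_failures = steady_state_availability(800.0, 8.0)   # better reliability

print(f"{baseline:.4f} {faster_repair:.4f} {fewer_failures:.4f}")
```

Because only the ratio of MTBF to MTTR matters in this relation, halving the repair time and doubling the time between failures give the same availability.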
Being profitable
All organisations need to be profitable and this is governed by maintaining the required output and controlling
operating costs, a specific goal being increasing plant availability. This goal is achieved through specific
actions, looking at reliability issues and at maintainability. These actions result in specific tasks such as
procedural improvements, reduced operator errors, or training and tool availability.
Reliability vs maintenance
Comparing maintenance with reliability: maintenance ensures that an asset continues to fulfill its
intended function. Maintenance is therefore a process of keeping constant what that asset is designed to
deliver. Reliability, on the other hand, is a process that involves analysis, monitoring, and optimization. It is
continuous and dynamic, and must have linkage to the required operating context, required equipment
performance, and operational requirements. Reliability is therefore a process of continuous improvement
focusing on the operational requirements of the asset instead of simply maintaining a function.
• ODR impacts equipment effectiveness, contributing to optimal production and financial return on
investment.
• ODR incorporates operational, technical, and financial metrics, balanced to best meet the business plan of
the industrial enterprise.
Reliability-focused companies recognize the critically important role of the equipment or process operator.
The contributions of the operator are necessary because they are the first to notice deviations from a normal
operating condition. And the operators are best equipped to understand the interactions between process and
equipment behavior. Therefore, the Best-in-Class companies are poised to pursue Operator Driven Reliability
initiatives.
If we look at the operator involvement in plant reliability, many modern process plants train their operating
technicians to have a general understanding of the manufacturing processes, process safety, basic asset
preservation, and even interpersonal skills. Operator involvement in reliability efforts must, at a minimum,
ensure the preservation of a plant’s assets. Operators are the first line of defense, the first ones to spot
deviations from normal operation. For optimum effectiveness, they should be used in that capacity, i.e.,
certain data collection and some first line analysis and action should be assigned to operators.
The experience gained by SKF Reliability Systems in real world conditions in a wide range of industries shows
that achieving competitive productivity and increased profitability through asset management depends on a
balance of key factors.
• There must be strong leadership dedicated to developing proactive people that work as a team and are
willing to embrace expanded roles. People are the first and most important aspect of a successful program,
although they are not always treated as such. It is important that the people understand why they are
performing the duties requested in order to ensure ownership and proactive involvement.
• Processes are the second most important aspect. This is all about how to properly conduct and manage
maintenance. Standardised routines add structure to the work that gets done. As processes become more
effective, people become more productive.
• Systems and technology represent the tools used by the people implementing the processes chosen.
These systems and technology tools are the enablers, and they typically get most of the attention in
maintenance management. Some organizations focus their energy on the people and processes, with only
basic tools and rudimentary technology, but still often achieve high performance levels. Generally, however,
emphasizing technology without excellence in managing processes and people will bring only limited
success.
Lesson 2: Reliability
Classic failure profiles
• Before we discuss specific maintenance practices that assure plant reliability, let’s first consider reliability,
or how things fail, in terms of equipment failure patterns. The traditional view of maintenance typically assumes that
equipment will ultimately wear out. Maintenance routines are thus scheduled to be conducted before this
wear-out period, with the objective of restoring the equipment to its original operability. Referring to the
picture here – preventative maintenance, or PM (interval based), often takes the form of periodically
overhauling, repairing, or otherwise taking equipment apart, replacing certain parts, and reassembling.
This assumes that a given set of equipment will experience a few random, constant failures, but after a
time will enter a period where the conditional probability of failure rises sharply (the wear-out zone).
Therefore, the invasive PM (overhaul, repair, etc.) should be done just prior to entering the wear-out zone.
The drawback is that it assumes that you know the equipment life and that equipment truly fails in this way!
• Another common view of age related failure is what’s known as the biological model or the “bathtub curve.”
This thinking contends that equipment has a greater chance of failure when very young (infant mortality)
followed by a stable period, and then becomes unreliable when very old. However, documented research into
equipment failure probability and advanced age has shown that such a view of equipment life is
oversimplistic and not typical of most machinery. Selecting the correct maintenance tactic is difficult.
Which action and schedule is most appropriate when considering costs, plant downtime, and risks? From a
technical viewpoint, you need to understand how the failure happened and if there was any way you could
have prevented it. Three major studies were conducted by United Airlines, Broberg, and the US Navy,
the results of which have had a major impact on the way in which maintenance is now regarded.
The key points that emerged from these studies are that failure isn’t usually related directly to age or
use, and that failure isn’t easily predicted. Restorative or replacement maintenance based on time or use
therefore won’t normally help to improve the failure odds.
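The failure patterns discussed above can be illustrated with a Weibull hazard function. This is a common reliability model that the course itself does not name, so it is used here purely as an assumption for illustration. A shape parameter below 1 gives a falling failure rate (infant mortality), a shape of 1 gives a constant rate (random failures), and a shape above 1 gives a rising rate (wear-out). The shape and scale values are hypothetical.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Instantaneous failure rate h(t) = (shape/scale) * (t/scale)**(shape-1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical shape values for the three basic behaviours; ages in days.
patterns = {"infant mortality": 0.5, "constant (random)": 1.0, "wear-out": 3.0}
ages = [30, 365, 1000]

for name, shape in patterns.items():
    rates = [weibull_hazard(t, shape, scale=1000.0) for t in ages]
    print(name, [f"{r:.5f}" for r in rates])
```

Only in the wear-out case does the rate rise with age, which is the only case where a time-based overhaul can be expected to reduce the chance of failure.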
The implications of infant mortality mean that major overhauls can be a bad idea: in the most dominant
failure patterns, an overhaul returns the equipment to the higher failure probability of early life.
Unless the equipment comes into direct contact with the product or processed material (for example raw
material in pipes), or unless it is a simple device, age probably will have little impact on whether it fails.
Therefore, condition based maintenance techniques are going to be more effective.
Even where the failure pattern is known, it doesn’t necessarily tell you which maintenance tactic to use.
Economic studies may dictate that you run to failure despite there being good prospects of predicting
replacement because the failure pattern was age related.
How many plants have data of a quality and standard to be able to allow the determination of the conditional
probability of failure by type?
Most people have come to accept intuitively the bathtub failure pattern. However, data indicates this only
applies to some 3-4% of equipment.
From the curves and depending on which of the three studies you choose, infant mortality or early life failures
represent some 29-68% of failures. Early life typically means within 60-90 days of start-up.
This then has a profound implication relative to a given plant’s maintenance strategy.
Planned maintenance schedules typically make no allowance for these infant mortality failures. Indeed these
failures are more properly eliminated through better design, procurement, installation, commissioning,
start-up, and operational practices, not through “planned maintenance”. Now consider the role of operators via a
structured, integrated operator driven reliability (ODR) program.
The next most common failure pattern is the constant (random) failure rate which, again depending on the
study, makes up 14-42% of failures.
This also has a profound implication on your maintenance strategy.
If your conditional probability of failure is constant, so that failures occur as random events, then the best
strategy is to ensure that you have good condition monitoring in place to detect the onset of failure and
developing failures long before they become serious, allowing for planning and scheduling.
Conclusions
Combined, the infant mortality and constant failure rate patterns make up some 71-82% of all failures.
So developing and applying a maintenance strategy that specifically addresses these failure patterns will have
a profoundly positive effect on your maintenance and operational performance. Ensuring that these failures
are eliminated or mitigated is essential if ODR is to be a “contributor” to “asset management.”
ODR essentials
Maintenance work undertaken has to be appropriate to the technical characteristics of equipment failure and it
must be worth doing by being effective in avoiding or reducing the consequences of failure.
This results in a commercially acceptable risk and must be cost effective.
ODR – a living, dynamic program
ODR is a subset of Asset Management and recognizes that reliability does not belong to any one plant
function alone; rather, “reliability is a cross functional responsibility.” ODR is a “continuous, dynamic (i.e.,
living) program and not the arbitrary deployment of available hardware/software.” ODR is based on sound
principles and process flow that are “honed” to the individual plant’s functional requirements and specific
business conditions. It must be a) workable and b) flexible enough (in range/scope) to accommodate differing
degrees of operator driven reliability. It can be a stand-alone program, but is ideally part of an overall asset
efficiency optimization program.
ODR as a nucleus
ODR can be the nucleus of an overall asset management plan by bringing the respective programs into
mutual focus and avoiding any areas of duplication and/or omission between all the parallel programs at the
site. In Operator Driven Reliability the plant operator becomes a key player in all of these functions. He has
and uses his unique knowledge regarding such areas as function, failure, loss, consequence, performance,
efficiency and reliability. Typically the operator in an ODR program becomes the owner of the
Lubing/Greasing Program. He is also an important member of the Maintenance Strategy Review (MSR) team,
and also provides vital input to the Root Cause Failure Analysis (RCFA) process. In operator driven reliability
the operator assumes a high degree of “ownership” of his plant.
• What can go wrong, and what can I do when it does?
• How does it matter? (for example, what are the likely costs, losses, or risks)
TPM failed for some
Many companies have implemented TPM but not all have achieved the expected improvement in
performance. Why then should ODR work? True ODR is structured, systematic, dynamic, and has a clear
technical and financial basis for content.
ODR can be implemented manageably. Total Productive Maintenance (TPM) programs, on the other hand,
often fail due to being all encompassing, and attempting to achieve too much, too soon.
Functional interaction
Operator Performed Maintenance (OPM) does not mean that operators accept responsibility for all maintenance
activities. It does mean that they are responsible for knowing when they need to carry out simple preventive
tasks themselves, and when they should call in maintenance experts to repair problems that they (the
operators) have identified. Operator Involved Maintenance (OIM) is the process that facilitates the necessary
dialogue.
Successful implementation of OPM activities will inevitably impact upon the OIM dialogue, and thus by
definition ODR comes into being. Embracing an ODR approach frees up maintenance time allowing new and
more powerful monitoring and diagnostic technologies to be taken on board (as appropriate), without running
the risk of basic maintenance tasks being neglected.
ODR essentials
An ODR program recognizes that there is an implicit advantage to having operatives involved in the
maintenance of their equipment. Therefore, for it to be considered effective, it must:
• Work: ODR tasks must be appropriate to the technical characteristics of equipment failure.
• Be worth doing: in other words, be effective in avoiding or reducing the consequences of failure (including
safety and environmental considerations), resulting in acceptable commercial risk.
Consequently, operator driven reliability (ODR) tasks must have both a technical rationale through a robust
technical process of FMEA / RCM / RBM and a commercial rationale through business based targets and a
criticality analysis process. Otherwise ODR breaks down into groups of people trying to do “good things”.
ODR requires a sound technical basis. We must know which ODR technology to deploy, and on which
equipment.
There must also be a clear indication of how often it must be done and there must be an action plan in place
to respond to the results obtained through such measurements and observations. In addition, it must be clear
who will be assigned the tasks indicated by these results and when the tasks must be completed.
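The questions above (which technology, on which equipment, how often, what response, who, and by when) can be captured as one record per task. The sketch below is one hypothetical way to do so. Every field name and example value is an assumption made for illustration and does not come from the course or from any particular ODR tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class OdrTask:
    """One operator inspection task; all field names are hypothetical."""
    equipment: str               # which equipment
    technology: str              # which ODR technique to deploy
    interval_days: int           # how often it must be done
    action_plan: str             # response when results are out of limits
    assigned_to: str             # who carries out the follow-up
    complete_within_days: int    # when the follow-up must be completed

    def next_due(self, last_done: date) -> date:
        """Date of the next inspection, given the date of the last one."""
        return last_done + timedelta(days=self.interval_days)

    def follow_up_deadline(self, found_on: date) -> date:
        """Latest completion date for follow-up work raised on found_on."""
        return found_on + timedelta(days=self.complete_within_days)


# Example values are invented for illustration.
task = OdrTask(
    equipment="Cooling water pump P-101",
    technology="Bearing temperature check",
    interval_days=7,
    action_plan="Raise a maintenance work request",
    assigned_to="Maintenance planner",
    complete_within_days=3,
)
print(task.next_due(date(2024, 1, 1)), task.follow_up_deadline(date(2024, 1, 8)))
```

Keeping all six answers in one record makes it easy to spot a task that has a frequency but no owner, or a measurement with no agreed response.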
An important element of the program is that we learn and adapt continuously.
In order to satisfy these conditions ODR should result from a structured technical process such as Failure
Modes and Effects Analysis, Reliability Centred Maintenance, or Risk Based Maintenance.
ODR Content
Reliability Centred Maintenance (or as a minimum Failure Modes and Effects Analysis) should ideally be the
“driver” for asset management task content and frequency and hence should provide the technical rationale
for ODR content initially.
Barriers to ODR
Increasingly, maintenance managers seek to involve operations personnel in these inspection activities.
These people already patrol the plant regularly as part of their operations duties. Indeed, their familiarity and
experience with the plant can be a great asset to the inspection program as they are usually very quick to spot
abnormal features such as strange noises, raised temperatures, etc. Unfortunately, it isn’t always easy to
harness this experience. There may exist very real cultural and organizational barriers to the introduction of
such a program. The situation is compounded even further when the operator is expected to undertake more
than just inspection. This type of maintenance activity is frequently viewed as being for the benefit of the
maintenance department rather than a benefit for the good of the company. Gaining the interdepartmental co-
operation required for success may require addressing a number of political issues within the company.
The competence of operations staff to undertake such activities may be called into question. Sometimes, an
attempt to introduce such activities may bring old prejudices to the surface. Some operations staff may be
mistrustful of management’s true motivation for introducing the program, and there is always, of course, some
degree of resistance to change.
Operations attitudes to ODR
Similarly, the process operator may not have trouble finding excuses for his non-participation. He is, of
course, a busy man. He simply doesn’t have time to do maintenance jobs. Anyway, why should he? Will he be
paid any more for his trouble? His pay may well be based (at least in part) on achievement of production
targets, so any operational problem will inevitably take precedence over these “non-productive” activities. He
needs to be assured that his immediate supervisors support the program, lest their instructions and directives
conflict. The operator may also be suspicious of management’s true reason for instigating such a program. Is
this program really a covert version of the watchman’s clock? Is management really trying to verify the
operator walks the plant regularly? If management is suggesting that the operator has time to do this
additional activity are they then suggesting that he’s currently under-employed? Is the operator’s job security
at risk?
Culture
All too often senior management views maintenance as a function that adds cost to the organization, whilst
production adds value.
• Maintenance is often viewed as a necessary evil. Unfortunately, there does exist within many companies a
culture of blame. Poor quality maintenance and consequent disruption is often cited by production
management as the prime reason for non-achievement of production targets.
• Similarly, maintenance management often complains that production management doesn’t allow access to
the plant for routine maintenance, hence the high level of unscheduled downtime.
In many organizations maintenance & operations function virtually independently of each other, each with
separate objectives and structures.
Lesson 5: ODR involvement
Prerequisites to ODR
The main prerequisite to a successful ODR program is a well-defined plan, ideally resulting from a failure
modes and effects analysis. The plan needs to be clearly communicated to all involved, and then
implemented faithfully, resisting any temptation to deviate. Everyone involved must be willing to approach the
program with an open mind, and be receptive to the changes and improvements in work practices and
relationships that will result.
Automated tools
The advantage of using automated tools is that the operator can easily compare ODR data to predefined
set-points and alarm conditions. Furthermore, these tools can also automatically provide guidance for actions
that should be taken and tasks to be performed in the event that specific out of limit conditions occur.
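As a rough sketch of the comparison described above, the code below checks a reading against predefined set-points and returns an alarm state with guidance for the operator. The threshold values, state names, and guidance text are all hypothetical. A real tool would take them from the site's own action plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetPoints:
    """Hypothetical alert and danger limits for one measurement."""
    alert: float
    danger: float


# Hypothetical guidance text for each alarm state.
GUIDANCE = {
    "normal": "No action; record the reading.",
    "alert": "Re-check at the next round and notify the supervisor.",
    "danger": "Raise a maintenance work request immediately.",
}


def evaluate(reading: float, limits: SetPoints) -> tuple[str, str]:
    """Return (alarm state, guidance) for a single reading."""
    if reading >= limits.danger:
        state = "danger"
    elif reading >= limits.alert:
        state = "alert"
    else:
        state = "normal"
    return state, GUIDANCE[state]


bearing_temp_limits = SetPoints(alert=75.0, danger=90.0)  # degrees C, assumed
for temp in (62.0, 78.5, 93.0):
    state, advice = evaluate(temp, bearing_temp_limits)
    print(f"{temp:.1f} C -> {state}: {advice}")
```

Checking the danger limit first ensures that a reading above both limits is reported at the more serious level.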
Action plan
It is important to have in place an action plan so that the operator knows what to do when out-of-limit
situations arise. The operator should be equipped and trained to deal with remedial actions that are within his
area of competence. Equally important, the operator should understand when maintenance staff need to be
involved and the process for instigating their timely intervention. It is also important that maintenance staff
react in a proper and timely manner to maintenance work requests arising from ODR.
Interfaces to other systems
If ODR is being implemented to work with other systems or programs, sign-off procedures must be
established. If other metrics or observations are being collected or calculated by the ODR program, formal or
automated sign-offs of the data need to be passed on and operating standards established. Examples would be
automated (or formalized) sign-offs to:
• Planning
• Maintenance (Computerized Maintenance Management System)
• Reliability (Root cause failure analysis and condition monitoring data)
• Operations (Performance, Safety, and Environmental)
ODR benefits
Let us summarize.
The involvement of focused, trained and more knowledgeable operators means that there are more eyes
watching critical process systems. This increased monitoring results in earlier failure detection.
When plant failures do occur ODR promotes proactive root cause failure analysis, defining opportunities for
continuous improvement and faster execution of operational changes. The combination of benefits can have a
significant effect on the bottom line.
In addition to direct performance improvements, successful implementation of ODR can have a significant
beneficial effect on the working environment. Enhanced cooperation between departments and individuals
reduces friction and unites the team with a common focus and objectives. The improved two-way dialogue
between maintenance and production often has a knock-on effect on other areas of maintenance. This
facilitates earlier identification of reliability issues, more informed and timely discussion regarding possible
solutions, and easier implementation of corrective activities.
Remember
To be successful, an Operator Driven Reliability program requires:
• A corporate culture willing to embrace the need for change.
• Commitment to implementing new technologies with requisite financial, training and personnel resources.
• Willingness to support processes for implementing cultural and technology changes.
Culture change
A structured change management training program at the operator level, as well as for supervisors and
managers, along with coaching and mentoring, must be part of the process in order to succeed in changing
the culture.
And finally…
Regularly scheduled reviews of successes and gaps in the program will ensure that KPIs are met. Everyone
can then celebrate the positive financial impacts of a successful ODR program and ensure a continuing
improvement of maintenance programs.