
IEEE TRANSACTIONS ON RELIABILITY, VOL. 38, NO. 2, 1989 JUNE 253

Practical Considerations in Developing an Instrument-Maintenance Plan

Michael A. S. Guth
Consultant, Oak Ridge

Key Words - Preventive maintenance (PM), Programmed maintenance plan, Nuclear reactor, Risk analysis.

Reader Aids -
Purpose: Present a case study
Special math needed for derivations: None
Special math needed to use results: Boolean algebra, Probability theory
Results useful to: Instrument-maintenance managers

Summary & Conclusions - This article develops a general set of considerations to explain how a consistent, well organized, prioritized, and adequate time-allowance program plan for routine maintenance can be constructed. The analysis is supplemented with experience from the High Flux Isotope Reactor (HFIR) at US Oak Ridge National Laboratory (ORNL).

After defining the preventive maintenance (PM) problem, the instruments on the schedule were selected based on manufacturer’s design specifications, quality assurance requirements, prior classifications, experiences with the incidence of breakdowns or calibration, and dependencies among instruments. The effects of repair error in PM should also be studied. The HFIR requires 3 full-time technicians to perform both PM and unscheduled maintenance.

Some techniques from risk and fault-tree analyses can be borrowed in studying cause-consequence relations between instruments and maintenance. Examples of false-positive and false-negative signals on the HFIR are given as well as some suggestions for how to model the breakdown incidence. An alternative approach uses approximate statistical distributions and the mean value of probabilities for repair needs. These distributions can vary from knife-edged to hyper-exponential.

Searching for congestion periods will assist in the allocation of resources to meet both PM and unscheduled maintenance needs. This article reviews some concepts from queuing theory to determine anticipated breakdown patterns. In practice, the pneumatic instruments have a much longer lifetime than the electric/electronic instruments on various reactors at ORNL. This article concludes with a discussion of some special considerations and of risk aversion in choosing a maintenance schedule.

1. INTRODUCTION

Most large engineering systems incorporate a programmed plan for preventive maintenance (PM) on the sensors and instruments in the system. Over time, some of the sensors tend to drift out of calibration or otherwise require routine maintenance. Development of a plan for scheduling PM helps to assure the operator that the sensor signals are still accurate. (In many places throughout this paper, the words sensor and instrument are used somewhat interchangeably.)

In first implementing a PM plan, a field engineer or repair technician may find more instruments to service than can be accommodated at a given time. The engineer must set up a priority for which instruments need to be serviced first, which are most critical for operations, and which allow some time-flexibility for maintenance. Once a balanced plan for PM is developed, it should be possible to project PM needs into the future and eliminate potential periods of congestion.

This article develops general considerations to explain how a consistent, well-organized, prioritized, and adequate time-allowance program plan for routine PM can be constructed. Unanticipated, and therefore unscheduled, maintenance requirements can still disrupt the program schedule; however, with sufficient planning the disruption should be resolved within the day-to-day operation of the PM schedule. The considerations integrate some general theories on reliability and maintenance with the particular experiences of the High Flux Isotope Reactor (HFIR) at Oak Ridge National Laboratory (ORNL).

Many safety trips or alarms are tied to sensor readings and not directly to the position or state of the components. Thus a faulty sensor can trigger system shutdowns or alarms even when the actual state of the system is normal. This article also contrasts the operation of a system under a “bare bones” maintenance philosophy vs PM standards.

A second objective is to merge techniques from risk and reliability analysis with the design of PM and decision-support tools. This paper discusses how to incorporate cause-consequence relations as well as view statistical data on maintenance or aging of instruments from a risk-analysis perspective. The reader can then decide whether these risk-analysis techniques should be incorporated into his own PM plan.

A third objective is to examine the effects of redundancy in sensors to see what gains are received from having multiple copies of the same sensor. Section 5 also examines how redundancy can reduce or eliminate situations in which total reliance must be placed on a particular sensor while other action is being taken. The paper reviews the ranking of alternative maintenance schemes from a safety objective to determine if servicing instruments in a particular order increases the risk of an accident.

Practical illustrations and insights explain how engineers and managers set realistic reliability and maintenance requirements for an engineering system. We examine use of reliability data from the field as compared with manufacturer’s reliability tests. Some of the obstacles to achieving worthwhile reliability requirements are explained. This practical paper minimizes the use of detailed mathematical models, statistical data, and theoretical work; it maximizes the use of field experiences and case-history observations.

0018-9529/89/0600-0253$01.00 © 1989 IEEE

Section 2 identifies the maintenance time constraint problem, provides some sources of information for deciding which instruments should be on the PM schedule, contains some sample calculations on time spent for PM, and looks at the HFIR staffing specifically to determine if and how long the maintenance goals will fall behind schedule with the existing repair staff. Section 3 broadly discusses sensor validation and comprises such issues as cause-consequence relations, detection of faulty sensors, use of smart sensors, redundancy, and in-place calibration. Congestion periods, stability, and selected concepts from queuing theory are reviewed in section 4. Section 5 discusses some special maintenance considerations including instrument dependency on the same power sources, humans interacting with and observing instruments, and signal use in control systems. Section 6 presents alternative approaches to PM that are based on likelihood vs severity of accidents.

2. IDENTIFICATION OF THE PROBLEM

The first step in setting up a preventive maintenance (PM) schedule is to develop a short and succinct statement of the problem. For the High Flux Isotope Reactor (HFIR), this statement might be:

    Routine PM currently requires servicing of 850 instruments on a programmed PM schedule. How can these instruments be calibrated and serviced by the repair staff and still allow the staff sufficient flexibility to handle unanticipated breakdowns?

2.1 Selecting Instruments for PM Schedule

The list of sensors in the engineering system should be divided into a list of those sensors that should and should not be on the scheduled PM list. Permanent or non-calibrated sensors should not appear on the list because they will not affect the routine PM schedule. The permanent sensors can influence the number of unanticipated breakdowns. The non-serviced sensors will probably be assumed to be accurate over some reasonable life for the instrument.

To determine which sensors should be on a routine PM schedule, we found 5 sources of information to be helpful.

1. The manufacturer’s design specifications often include suggestions for calibration needs based on time and amount of use.
2. Many large engineering systems - particularly those subject to Government regulation - have quality assurance documentation summarizing the PM requirements to keep the system in operation.
3. The operations department for the system might have classified the instruments or sensors into categories based on their anticipated maintenance needs. Such was the case with the HFIR.
4. The field engineer or maintenance repairman will have first-hand experience with servicing the instruments; he can give insight into which instruments currently on the PM schedule show little need for servicing and which sensors, not presently on the list, should be added.
5. Instruments which are relied upon by other instruments might need to be on the PM list. These instruments can affect the failure rates of systems that depend on their readings [1].

The HFIR PM schedule was developed through a joint meeting of a representative of the HFIR Operations Division, the HFIR Field Engineer, and the Instrument Foreman for the Instrumentation & Controls (I&C) Division. The Operations Division representative provided input about the instruments most critical to keep the reactor operating. The Field Engineer advised as to which areas of the HFIR had been most in need of maintenance, and the Instrument Foreman provided general knowledge on the repair rates of instruments in both reactor and non-reactor plants. Where manufacturer’s specification called for recalibration (eg, annually), this figure was used as a lower bound, so that such an instrument might be on a 6- or 12-month PM schedule. Appendix A explains what instruments in the HFIR primary pressure system are on the PM schedule.

2.2 Repair Error in Preventive Maintenance

The issue of sensors being routinely serviced or calibrated - even when the service is not needed - raises the question of whether the sensors or the system might be inadvertently damaged in the process of this PM. Concern over PM error (eg, forgetting to realign a dial after PM on an instrument) would seem more likely in situations where the instrument was working properly prior to the PM. Thus, PM error is more likely to avoid detection when the component was working properly than if the component needed servicing and the technician subsequently repaired the device to its proper state of operation.

At present on the HFIR, following routine calibration or PM on an instrument, the technician performs an operational check of the component after placing it back in service to determine that it is working properly. Repair error has not been a problem on the HFIR; hence, experience suggests that no additional guidelines or requirements are necessary. If managers were concerned about technician repair errors, they could formulate a checklist to be completed by the technician or a supervisor to ensure that the component was restored to proper function after servicing. However, such a checklist task would likely become tedious and cumbersome.

2.3 HFIR PM Hours

To illustrate the estimation of PM hours, one can examine the maintenance requirements for the HFIR. Of the 1192 instruments on the I&C Division inventory list for the HFIR, approximately 950 instruments are on a routine PM schedule. Of those instruments on the PM schedule, approximately 650 are on a schedule of 12 months service or less. A calculation for the month of 1987 July revealed that 61 instruments were calibrated and/or serviced for a total of 69 hours of work, viz, an average service of 1.1 hours/instrument. However, the technician lacked time to complete the PM tasks on some of the redundant units kept on the shelf during July, and these tasks
were subsequently completed in 1987 August. The best time estimate for PM at the HFIR is approximately 90-110 hours/month.

Experience with the HFIR has shown that most of the instrument breakdown reports were written at night while the reactor was operating. The I&C Division personnel received 3-4 breakdown reports per night with an average time of 3-4 hours for unscheduled maintenance on each instrument report. Thus unscheduled maintenance averaged about 12 hours/night. Many of the breakdown reports were false alarms in that the reactor operators were concerned about a reading they were receiving from some instrument and wanted it checked, but upon examination the instrument was in working order.

Based on experience with the HFIR and other reactors and treatment plants operated by ORNL, the instrument foreman uses a rule-of-thumb of assigning 250 instruments/technician. This figure is not written in any operating procedures or manual but is based solely on manpower experience. Using some of these average figures as background, it is useful to turn to the specific HFIR unscheduled maintenance and PM schedules from both a supply and demand perspective.

2.3.1 Demand for Maintenance Services

If unscheduled maintenance requires an average of 12 hours/day and the HFIR is kept operating 25 days/month, then approximately 300 working hours would be needed for unscheduled maintenance on a monthly basis. However, the objective of initiating a PM plan is to reduce the number of unanticipated breakdowns. Suppose the PM plan meets this objective, and the unscheduled maintenance requirement is cut in half to 150 working hours/month, viz, an average of 6 hours/day for the 25 days of operation.

If an average of 1.1 hours is spent on the 950 instruments on the HFIR PM schedule then approximately 1045 hours are needed for this work. Distributed over a 12-month period, the 1045 hours amounts to approximately 90 hours/month. If we consider special requests for equipment verification prior to experiments or other unique circumstances then we might want to add a cushion of 10-20 hours/month to handle special PM. Thus we arrive at a figure of approximately 100-110 hours/month for PM which fits the current best estimates of HFIR PM needs. Adding the 150 hours/month for unscheduled maintenance, 90 hours/month for routine PM, and 20 hours/month for special PM yields approximately 260 hours/month on the demand side.

2.3.2 Supply of Maintenance Services

Prior to 1988, the HFIR had one instrument technician to handle all the maintenance; however, from 1985-1988 the reactor was shut down. The HFIR now has 3 instrument technicians and an engineering technologist assigned to maintenance tasks for the HFIR. Assuming 22 work-days/month and 8 work-hours/workday then each technician is paid for approximately 176 hours/month. If all 3 technicians worked on HFIR maintenance directly then the work would total approximately 528 hours/month.

The combination of vacation time, holidays, sick leave, and attendance at safety meetings takes up between 30-35% of the technicians’ available work-time. If the technicians spend only 70% of their time on maintenance activities, the combined work time for the 3 is approximately 370 hours/month (528 × 0.7 ≈ 370).

Each technician spends approximately 40% of the work-time on unscheduled maintenance and 60% on PM. Unscheduled maintenance takes 40% of the 370 hours/month, viz, 148 hours/month. That leaves 222 hours/month for PM under the instrument foreman’s rule-of-thumb. Table 1 summarizes these results.

TABLE 1
Demand and Supply of Maintenance Services for HFIR (hours/month)

                             Demand    Supply
Unscheduled maintenance        150       148
Routine PM                      90       222
Special PM                      20       (figured in above)

These calculations show that the unscheduled maintenance hours have a nearly perfect equilibrium between demand and supply, but the supply of PM hours appears to exceed the combined routine and special PM demand by about 112 hours/month (222 - 110) if all 3 technicians perform routine PM. In fact, one of the technicians is expected to handle changes in equipment required for various experiments at HFIR.

If the HFIR repair staff were limited to 2 technicians then they would be paid for approximately 352 hours/month and have 70% of that time (approximately 250 hours/month) to spend on actual instrument maintenance. Adding column 1 of table 1 yields 260 hours/month, so that there might be a close correlation between the HFIR maintenance demand and the supply services of 2 technicians.

Five other considerations also affect the manpower decisions for implementing a successful PM plan.

1. The 90-100 hours/month PM requirement might have left out the extra maintenance amenities that could be afforded with some extra time. For example, the technician might not have time to clean up oil or other materials used in his maintenance/calibration work. A third technician could be justified to allow for extra time to do a more thorough job with each instrument.

2. The instrument foreman’s rule-of-thumb of 250 instruments/technician suggests that the HFIR staff would be slightly overworked with only 2 technicians. If the figures used in these calculations have omitted relevant work requirements that are considered in the foreman’s rule-of-thumb then HFIR maintenance requirements might need to include the part-time services of the engineering technologist.

3. Limiting the maintenance to 250 instruments/technician on the HFIR might be too generous: The entire HFIR safety system with approximately 240 instruments can be serviced in 3 days. Under the PM procedures written by the field engineer, some of these instruments can be tested/calibrated simultaneously. Thus one large group of instruments on the PM plan actually requires much less time for servicing than the average for the remaining HFIR instruments.

4. Timing is important. The restarted HFIR will run for 25 days and then require a 4-day shutdown for maintenance. Some of the PM activities can be executed only during reactor shutdowns, and it is possible that only 2 technicians, even if they are working overtime, might not complete their scheduled PM tasks during the 4-day shutdown. To analyze this constraint further, the list of PM tasks must be subdivided into: a) tasks that can be completed during reactor operations, and b) tasks that require a shutdown. The I&C Division inventory list of instruments is being updated to include this information.

5. Having additional personnel allows more time for filling out maintenance paperwork. A PM plan is only as good as the input it receives [2]. If a technician completes work on an instrument and fails to report the work then the PM system acts as if that work has not been completed and omits the work from the cumulative totals. To cut down on manpower spent filing paperwork, a computer terminal has been installed on-site at the HFIR to record maintenance/service and to make the reporting requirements less tedious.

In the past, the HFIR shutdown period could last from 14 hours to 3 days. It generally overlapped with nights or weekends, and the Operations Division could not afford to pay overtime for PM. Because the instrument technician had little time to perform PM, a shutdown was required. The chance that I&C personnel might have 2 working days between the hours of 8:00 am and 4:30 pm was slim. Now the HFIR shutdown is anticipated to last a minimum of 4 days, which is announced in advance and thus possible to plan around.

3. INSTRUMENT VALIDATION

The second major task in developing a preventive maintenance (PM) schedule is to look at alternative instrument validation techniques for allocating time. The most reasonable method of validating instruments involves local testing with monitors. For example, the field engineer or instrument technician tests display devices in the High Flux Isotope Reactor (HFIR) by disconnecting the devices from the system, connecting his own test equipment, and then applying a specified signal to see that the devices register the correct value.

Another method of instrument validation comes from determining the cause-consequence relations for the associated component failure. Knowledge of the consequences of component failures assists the planner in distinguishing actual events from faulty-sensor signals. If failure of a given component is known to cause an observable event, then failure to witness this event could indicate that the problem rests with the given-sensor signal rather than being a component failure.

It is often helpful to depict graphically the cause-consequence relations, either in the form of a semantic network or a fault tree. Fault trees impose a more rigid form on the relations by: 1) passing events through Boolean logic, 2) requiring the implications to hold in both directions so that one can move up or down the fault tree, and 3) incorporating restrictions on circular or overlapping branches of the tree. Appendix B contains a partial listing of cause-consequence relations that can be easily depicted in fault trees and are contained in the HFIR quality assurance documentation.

Once the cause-consequence relations of the components have been identified, it should be possible to isolate particular events stemming from abnormal conditions in the reactor. One method of detecting the events is direct observation; another is to rely on alarms or annunciators. These events should then be compared to distinguish the difference in appearance between an abnormal condition and a seemingly abnormal condition caused by sensor failure.

In the HFIR, the operator relies mainly on the instrument readings available in the control room. Only during his once-per-shift equipment inspections will he walk around the plant to check on other instrument readings. Thus, since operators rely more on sensor signals rather than actual observations to determine abnormal conditions during reactor operations, it is helpful to discuss more rigorously how sensor failure can be incorporated into conventional risk-assessment studies.

Faulty sensors can lead to 2 patterns of observed failures: false positives and false negatives. In a false-positive pattern, sensor failure can trigger an alarm of some safety system even when the true state of the system is normal. In a false-negative pattern, the sensor can fail to register some abnormal system-condition and give instead the appearance that all components are working properly.

False positives from sensor failure can be incorporated into traditional fault-trees by including another parameter for sensor failure at each step where it can have an impact.

Example 1

The 2 initiating events, both A and B together cause some consequence, C1. A Boolean equation for this relationship is -

$A \wedge B \rightarrow C1$ (3-1)

Notation

$\wedge$    Boolean AND operator
$\vee$    Boolean OR operator
A, B  initiating events
C1    a consequence event: An alarm sounds (alarm-trip)
C2    a consequence event: A separate alarm sounds
S1    event: Sensor s1 does not calibrate correctly
S2    event: Sensor s2 calibrates correctly and is working

The terms calibrate and working are a matter of degree.

Some faulty sensor can independently cause the consequence, C1, which could be an alarm-trip. By joining an event S1 to the l.h.s. of (3-1) with a Boolean OR gate as shown in
(3-2), an alternative source can be introduced to explain observations of C1 when the true component initiating events, A and B, have not both occurred.

$(A \wedge B) \vee S1 \rightarrow C1$ (3-2)

Eq. (3-2) can be loosely translated as: “if (both A and B are failures) and/or (sensor S1 fails) then an alarm is tripped, C1”.

Example 2

False negatives can be modeled in a similar fashion with the Boolean AND operator and a sensor-failure parameter. Begin with a causal relation similar to (3-1), except that now we are interested in showing how another consequence, C2, might not be observed even when both A and B have occurred, as shown in (3-3).

$(A \wedge B) \wedge S2 \rightarrow C2$ (3-3)

That is, (3-3) shows that occurrence of both A and B is necessary but not sufficient to cause C2 to occur. With the addition of S2, the associated sensor must be both calibrated and working properly for the anticipated consequence C2 to occur. Otherwise even if both A and B occur, the consequence C2 does not occur.

Discussion of Examples

The numerical values related to S1 & S2 derive from estimates of instrument failure-rates. Any instrument that remained in perfect calibration/repair would have: Pr(S2) = 1 and Pr(S1) = 0. However, S1 & S2 refer to 2 distinct relationships. The S1 value might come from one particular instrument and the S2 from another. If S1 & S2 are both based on the same instrument then that sensor might be sufficiently far from calibration to cause C1 to occur. In terms of probabilities derived from relative frequencies, an analyst could assign (occur, NOT occur) values of (0.1, 0.9) to S1 and (0.8, 0.2) to S2. For additional explanations, see [3].

On the HFIR one example of a false-positive signal is the fairly frequent (monthly) spurious trip of an annunciator due to electrical noise. We might know that the intended cause for the annunciator to go off is the joint event (A AND B). When the consequence (annunciator going off) is observed, but not the causes, then a preliminary hypothesis for the observation is sensor failure. The sensor might be picking up electrical noise or there might be a fault in the electrical system. If a particular annunciator, on the average, goes off 13 times a year, and 2 of the 13 times are spurious then the probabilities assigned to S1 are (2/13, 11/13).

An example of a false negative on the HFIR has occurred on the resistance-bulb calibration for the cooling tower. On several occasions, 1 of the 4 resistance bulbs has gone out of calibration, usually indicating a low temperature. If the fans are not on, or are not working properly, then information about rising water-temperatures is not correctly conveyed through the uncalibrated resistance-bulb. If, on the average, out of 280 days of operation the resistance bulb was working properly on 275 days then the probabilities assigned to S2 are (275/280, 5/280). This logic implies that in 5/280 trials, the consequence (a signal that the fans are not properly controlled) would not appear even if the true reactor component state were abnormal.

Few, if any, instruments on the HFIR have failure rates with known or relevant frequencies. Most instrument breakdowns on the HFIR are unique and hence do not lend themselves to probabilistic calculations on failure rates. On the other hand, many instruments on the HFIR need a 2-year calibration cycle. If the instrument is calibrated annually then the operator can rest assured of the accuracy of the signals coming into the control room. If the routine calibration is delayed to 2 years then the sensor signals become more questionable towards the latter part of the cycle.

Only one type of instrument on the HFIR, an early design of an operational amplifier using an electro-mechanical chopper, has breakdowns that approach a pattern. At one time the HFIR had about 100 such amplifiers in service. After the breakdown pattern was observed, it was found that they could be replaced by then state-of-the-art integrated circuits for less money than the repair/upkeep on the old amplifiers. Therefore, advances in microelectronics led to cost savings by substitution of another type of instrument rather than repair of the existing type.

Using risk analysis for instrument validation requires collecting data on individual sensor failure-rates or aging processes. In general, the failure rate and aging process depend on the routine PM plan, so that some simultaneity-bias enters these figures. The lifetime of an instrument can be extended through routine maintenance/repair - even beyond that lifetime guaranteed by the manufacturer.

One source of statistics on failure or aging rates of instruments can be obtained from the log-books of repairmen, eg, service hours, nature of the problem, time for repair, frequency of repairs. For HFIR sensors, the data on failure rates are not very complete. The HFIR has been in operation since ca 1965. The first system of collecting and recording information on system repairs (MAINS) went into effect about 1976. The instrument system was changed to the MAJIC ca 1986 January. MAJIC had some debugging problems with the entry of data; it did not get back into satisfactory operation until 1986 October.

Thus 10 years of data are missing from the first years of HFIR operations. The data on MAINS are available, at some effort, on hard copy. The data on MAJIC for the first 8 months apparently contain some gaps. Thus there exists no consistent data set on HFIR maintenance, and what data do exist might not be readily accessible or reliable, because certain information on repair incidence was not recorded.

What emerges from applying risk analysis to the PM scheduling problem is a perspective of ranking various accidents/events that could occur if the sensors are not properly serviced/maintained. Risk assessment combines probabilities-of-events with their seriousness, to arrive at a risk factor. By focusing on these 2 variables, it is possible to re-examine the design of a PM plan with a view toward eliminating all
important risks, either through increased PM on these components or coupling additional safety systems to prevent the effects of an accident or a failure from spreading.
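
The risk-factor idea just described, combining the probability of an event with its seriousness, can be illustrated with a minimal sketch. The multiplication rule, the 1-to-10 severity scale, and the two severity scores are assumptions made only for illustration and do not come from HFIR documentation; the probabilities reuse the relative frequencies quoted for the spurious annunciator trip (2/13) and the uncalibrated resistance bulb (5/280).

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SensorEvent:
    name: str
    probability: Fraction  # relative frequency of the event
    severity: int          # hypothetical seriousness score, 1 (minor) to 10 (severe)

    @property
    def risk_factor(self) -> Fraction:
        # Assumed combination rule: risk factor = probability x severity.
        return self.probability * self.severity


def rank_by_risk(events):
    """Return the events ordered from highest to lowest risk factor."""
    return sorted(events, key=lambda e: e.risk_factor, reverse=True)


events = [
    # Severity scores below are invented for the example.
    SensorEvent("spurious annunciator trip", Fraction(2, 13), severity=2),
    SensorEvent("uncalibrated resistance bulb", Fraction(5, 280), severity=8),
]

for event in rank_by_risk(events):
    print(f"{event.name}: risk factor {float(event.risk_factor):.3f}")
```

With other severity scores the ranking can reverse, which is the point of weighing both variables rather than frequency alone.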

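The Boolean relations of Examples 1 and 2 earlier in this section, and the relative-frequency probabilities attached to S1 and S2, can also be written as a short sketch. The function names are hypothetical, and each relation is treated as an equivalence (the consequence occurs exactly when the left-hand side is true), which is an assumption in line with the remark that fault-tree implications hold in both directions.

```python
from fractions import Fraction


def alarm_c1(a: bool, b: bool, s1: bool) -> bool:
    """Relation (3-2): (A AND B) OR S1 gives C1; S1 is the faulty-sensor event."""
    return (a and b) or s1


def alarm_c2(a: bool, b: bool, s2: bool) -> bool:
    """Relation (3-3): (A AND B) AND S2 gives C2; S2 means the sensor works."""
    return (a and b) and s2


def occur_not_occur(occurrences: int, trials: int):
    """Relative-frequency estimate as an (occur, NOT occur) pair."""
    p = Fraction(occurrences, trials)
    return p, 1 - p


# False positive: sensor s1 is faulty, so C1 trips although A and B did not both occur.
false_positive = alarm_c1(a=False, b=False, s1=True)

# False negative: A and B both occurred, but sensor s2 is not working, so C2 is absent.
false_negative = not alarm_c2(a=True, b=True, s2=False)

# Annunciator example: 2 spurious trips out of 13 trips per year.
pr_s1 = occur_not_occur(2, 13)

# Resistance-bulb example: working properly on 275 of 280 days.
pr_s2 = occur_not_occur(275, 280)

print(false_positive, false_negative, pr_s1, pr_s2)
```

Exact fractions are used so that the pairs match the (2/13, 11/13) and (275/280, 5/280) values in the text without rounding.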
3.1 Alternative Logic on Anticipated Running Time

Morse [4] discusses the conventional logic applied to instrument PM based on deriving a statistical distribution of breakdowns, then calculating lifetimes based on that distribution. In general, calculation of these breakdown distributions requires knowing the performance of the same type of instrument across various environments as well as time-series knowledge about the history of failure rates in each environment. The data on failure rates for the HFIR are not very good, but even assuming that, for example, the incidence of repairs on some instrument amounted to 6 jobs over the 20-year period, the paucity of data precludes useful statistical inference.

Because of the lack of individual data-collection, managers of engineering systems must often trust the manufacturer’s design specifications on breakdown/recalibration rates. In the HFIR, the manufacturer’s specifications are taken as a lower bound on required PM, and, when necessary, the PM on particular instruments is increased over manufacturer’s specifications - based on first-hand experience with the instrument performance on the HFIR.

Accordingly, many of the theoretical papers on maintenance, in this or similar journals, assume distributions for instrument breakdowns to analyze the maintenance problem more “rigorously”. For example, the theoretical literature often assumes that the breakdown follows a Weibull or lognormal distribution. Without going into detail on the properties of each distribution, it is helpful to maintenance planners to see figure 1 which shows survivor functions for 3 distributions (schedules).

[Fig. 1. Survivor Functions for 3 Types of Breakdown Distributions. Horizontal axis: time, with mean service time Tm marked.]

Schedule A shows a knife-edged distribution along some mean service time, Tm. The pattern implies that the overwhelming majority of individual instruments (of that type) require service after or near time, Tm. This distribution corresponds to an instrument with a well-known breakdown time, and little deviation from that time.

Schedule B shows an exponential distribution which might apply to an instrument with a variety of moving parts that can malfunction. Or, the instrument might depend on many adjustments that can deviate from proper working order.

Schedule C shows a hyper-exponential distribution which is more convex than the exponential distribution. It could be the distribution for an instrument that, when perfectly calibrated, works properly for a long time. However, if not perfectly calibrated, the instrument soon needs repair. This distribution lends itself to a bifurcation of the PM/calibration activity on the instrument, whether it is perfect or not.

Khandelwal et al. [5] describe a PM decision problem for machines subject to deterioration and uncertain breakdowns. The bang-bang control solution is derived from optimal-control theory. Reliability calculations that use mean repair times, as well as a discussion of the caveats with mean times, are in [6,7].

3.2 Other Conceptual Issues

Several other conceptual issues for sensor validation include use of smart sensors, which can be either calibrated by remote control or compensated for internal error. For example, a new generation of pressure transducers contains a valve manifold that allows calibration to be adjusted, once the sensor has been placed in operation. Figure 2 shows that valves can be opened during operation to create a static pressure of 1000 psi on both sides of the transducer. In the past it was necessary to calibrate the transducer in the shop at 0 psi static. The engineer then had to reset the zero point on the signal from the transducer at a static pressure of 1000 psi and hope that the span calibration remained intact at operational pressures. With the development of smart sensors the calibration can be set to the correct value when the 1000 psi is placed on both ends of the transducer.

[Fig. 2. Pressure Transducer with 1000 psi Static Pressure Hammer.]

Similarly, a new generation of transmitters can adjust for internal errors. This enhanced sensory capability leads to the question of whether the benefit of increased calibration with smart sensors is worth the up-front costs.

For the HFIR it is difficult to conceive of situations or experiments where remote calibration of sensors is helpful. The need for remote calibration depends on instrument location (eg, if the instrument is under-ground or behind a lead shield). The

instruments requiring recalibration after being placed in operation on the HFIR are generally accessible, and they have local adjustment capabilities.

The potential for sensors or instruments with built-in compensation capabilities could help to correct sensors that are subject to fluctuations in electrical current, or air pressure for pneumatic instruments. A potential application at the HFIR is for signals based on other signals, such as heat power, for which a 2% deviation in the temperature and flow probes can lead to an 8% deviation in the heat-power calculation. The related issue is how much one is willing to pay to reduce the maximum variation in the heat-power signal from 8% to 1%.

Another conceptual issue for PM deals with the availability of spare components in inventory and the ability to service the instrument during normal operations. One-of-a-kind instruments generally have no inventory spares, and they often require a reactor shutdown before any maintenance can take place. However, many of the instruments on the safety, servo, and counting channels of the HFIR are tracked in triplicate. As a result, one panel of instruments can be removed from service during normal operations and serviced while the reactor is operating.

Finally, tradeoffs between in-place calibration/repair and removing the instrument and taking it to a shop should be included in calculations for sensor PM requirements. Moreover, some sensors require a system shutdown or the removal of various obstacles before they can be serviced. Thus while actual repair time on a particular instrument might take only 2 hours, it might take a day to remove obstacles, and up to a week before the shop has time to work on the instrument.

4. SEARCH FOR CONGESTION PERIODS

The objective of designing a routine preventive maintenance (PM) plan is to avoid situations in which an engineer or repairman has too many instruments or sensors to service/maintain at a given time. The incidence of congestion periods generally depends on the repair frequency of the sensors under study, the number of instruments or sensors in the study, and the priority for working on the sensors.

PM priority should be given to those sensors and instruments whose failure can cause the most serious, as well as the most frequent, consequence. Safety considerations must prevail over convenience or cost. Consequently, development of a PM plan must consider the effects of various accident-related scenarios stemming from instrument failure. The PM plan might pose questions such as: What is the worst accident that can happen if the instruments are serviced in the current priority ranking? What is the worst event that can occur if the priority ranking is changed? After several iterations the most important sensors should be identifiable.

Time flexibility in repairs and redundancy of sensors are 2 important factors in eliminating congestion periods. Where the sensors are redundant or a sensor can be serviced without affecting operations, congestion periods can usually be alleviated if not eliminated. The redundancy can take the form of either at least 2 identically functioning instruments on-line at a given time, or an inventory of spare parts.

On the High Flux Isotope Reactor (HFIR) the instruments and sensors that form the safety system are tracked in triplicate, and a safety-trip requires 2-out-of-3 to activate. Thus when one of the sensors needs PM or repair, it can be taken out of service while the HFIR is operating. Scheduling-time constraints pose the greatest difficulty on the HFIR for those sensors and instruments that have no redundancy. The non-redundant instruments include such parts as the chemical treatment and de-aerator. PM on the sensors associated with the primary pressure-system also requires scheduling during a shutdown, since the pressure-system sensors have no built-in redundancy.

In looking at congestion problems it is helpful to take some techniques from queuing theory. If an instrument needs calibration or service but no repairman is available then it is added to the waiting queue. The service times for the instruments are random variables. Three important parameters to consider from queuing theory are:

1. The waiting time for repair on each instrument
2. The busy period during which one or more repairmen are busy
3. The queue size (number of instruments in the queue).

Two aspects of PM on equipment distinguish it from other queuing processes:

1. The possibility of PM introduces a simultaneity that means the PM required to keep the system working is a function of the amount of PM. This characteristic further implies some control over the unanticipated nature of breakdowns, so that these instances can be controlled or reduced [8].
2. There is a finite population that can potentially break down. Once all of these have broken down and are in the queue for repair, no more can enter the system. For most other queuing applications the effective population is infinite.

When developing a program to ensure some PM objective (eg, all the sensors related to the primary pressure on the HFIR are properly calibrated and in working order), the PM schedule must be integrated with the service requirements for the rest of the instruments. Viewed as a separate plan to achieve some special objective, the PM plan may not reveal congestion periods that would be evident when it is viewed as part of an overall schedule.

Once a preliminary PM schedule is developed, the stability of the plan should be tested by adding some unanticipated failures of instruments. These additional exogenous shocks can show what points in the PM schedule have sufficient flexibility to accommodate unscheduled maintenance. For real PM plans, the number of unanticipated breakdowns in instruments is inversely related to the time spent on PM.

Borrowing some concepts from perturbation theory, the planner could adjust the parameters of the model - mean time for routine PM, mean time for unscheduled maintenance, number of instruments needing special PM, number of instruments needing routine PM, number of instruments in the system, etc. - to determine the sensitivity of the PM schedule to these parameters.
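The three queuing parameters and the finite-population property discussed in this section can be explored with a small simulation. The following Python sketch is illustrative only: the function name, its arguments, and the choice of exponential distributions for time-between-service-needs and service time are assumptions made here, not part of the HFIR program or its data.

```python
import heapq
import random


def simulate_pm_queue(n_instruments, n_repairmen, mean_uptime,
                      mean_service, horizon, seed=0):
    """Simulate a finite-population repair queue (hypothetical model).

    Assumption: uptimes and service times are exponentially distributed.
    Returns the mean waiting time, the fraction of time at least one
    repairman is busy, and the time-averaged queue size.
    """
    rng = random.Random(seed)
    # Event heap of (time, kind, instrument id); all instruments start in operation.
    events = [(rng.expovariate(1.0 / mean_uptime), "needs_service", i)
              for i in range(n_instruments)]
    heapq.heapify(events)
    waiting = []   # FIFO list of (instrument id, time it joined the queue)
    free = n_repairmen
    waits = []
    now = queue_area = busy_time = 0.0

    while events and events[0][0] <= horizon:
        t, kind, inst = heapq.heappop(events)
        # Accumulate statistics for the interval that just ended.
        queue_area += len(waiting) * (t - now)
        if free < n_repairmen:
            busy_time += t - now
        now = t
        if kind == "needs_service":
            waiting.append((inst, t))
        else:
            # Service finished: release the repairman and return the
            # instrument to operation. Only serviced instruments can fail
            # again, which is what makes the population finite.
            free += 1
            heapq.heappush(
                events,
                (t + rng.expovariate(1.0 / mean_uptime), "needs_service", inst))
        # Start service on queued instruments while repairmen are free.
        while free > 0 and waiting:
            next_inst, joined = waiting.pop(0)
            waits.append(t - joined)
            free -= 1
            heapq.heappush(
                events,
                (t + rng.expovariate(1.0 / mean_service), "done", next_inst))

    # Account for the interval between the last event and the horizon.
    queue_area += len(waiting) * (horizon - now)
    if free < n_repairmen:
        busy_time += horizon - now
    return {
        "mean_wait": sum(waits) / len(waits) if waits else 0.0,
        "busy_fraction": busy_time / horizon,
        "mean_queue_size": queue_area / horizon,
    }


if __name__ == "__main__":
    # Perturb one parameter (number of repairmen) and compare the results.
    for repairmen in (1, 2, 3):
        print(repairmen, simulate_pm_queue(20, repairmen, 100.0, 8.0, 10000.0))
```

Re-running the sketch while varying one argument at a time (mean service time, number of instruments, number of repairmen) is one simple way to carry out the sensitivity test described above.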
In simple queuing theory, any maintenance is assumed to restore the item to good-as-new. This assumption violates the usual capital theory in which capital equipment wears out over time. Maintenance, in capital theory, can prolong the life of the instrument but cannot extend it indefinitely.

To bridge the gap between queuing theory and capital theory perspectives, at least for the HFIR, it is helpful to distinguish between calibration and some forms of repair. Recalibration generally does restore instruments to good-as-new in terms of calibration; but some forms of repair follow the capital theory view that a restored instrument is merely bad-as-old.

Figure 3 shows the bathtub curve for failure rate of an instrument; failure rate is on the vertical axis and time is on the horizontal axis. The early period is called infant mortality, the final period is called wearout. In between, the failure rate is more or less constant at Pi and represents the "normal" working period.

Fig. 3. Instrument Failure Rate (vertical axis: probability of failure; horizontal axis: time).

From a capital theory perspective, the upward sloping part of the graph is the period in which maintenance can change the magnitude but not the direction of the slope; ie, beyond a certain time, repairs are only palliative, not restorative. Put yet another way, once the instrument lifetime is used up, then repairs are only a temporary fix.

Figure 4 illustrates the queuing theory concept of complete restoration. Suppose the instrument is serviced at time tx. Rather than following the ascending dashed line, the failure rate for the instrument then jumps to the new descending schedule, followed by another long horizontal trend at failure rate Pi. The time between tx and ty represents the period in which the instrument, put back into operation, has a relatively high but decreasing failure rate due to maintenance error.

Fig. 4. Instrument Failure Rates with Maintenance (vertical axis: probability of failure; horizontal axis: time, with tx and ty marked).

Moving from theoretical to practical illustration, the likelihood of instruments wearing out before the engineering system itself is mothballed depends on each application. For the HFIR, most of the instruments will probably outlast the reactor; this list of instruments includes the permanent as well as the routinely maintained sensors. Therefore, it is reasonable to view most of the newly serviced instruments on the HFIR as good-as-new. However, for other engineering applications, use of the sensor may wear it out, and the capital theory approach should be used.

One interesting point that emerged from history with reactors at Oak Ridge National Laboratory (ORNL) is that one would anticipate the pneumatic instruments to wear out before the electric instruments because the pneumatic instruments have parts that rub against one another. Experience has shown the opposite to be true: The pneumatic instruments last longer than electric instruments. The pneumatic instruments have soft conversions; the electric instruments have more rigid conversions. Many of the pneumatic instruments from the 1950s are still in fine working order and online at reactors in 1989. The electric instruments are less reliable. Digital electronic instruments break down from static electric charges and glitches; the analog electronic instruments are sensitive to fluctuation in ac voltage levels and temperature.

The choice of times at which to engage in PM depends on whether the repair work is viewed from the good-as-new vs bad-as-old perspective. One can envision a control-theory problem with the objective function defined in terms of keeping the instrument in good working order, and the control variable being the amount of maintenance put into the instrument to keep it working. If the maintenance restores the instrument to good-as-new then the solution probably is bang-bang control where the planner uses an all-or-nothing approach to maximize the instrument lifetime. In contrast, if the instrument is likely to wear out in any case then the cost of PM probably outweighs the benefits.

5. SPECIAL CONSIDERATIONS

There are five special considerations for preventive maintenance (PM).

1. Common causes for degradation or failure that defeat redundancy. Instruments share a common power source, shelf/location, circuits, cooling source, etc. The High Flux Isotope Reactor (HFIR) has been checked for common reliance by instruments on power source or other attributes. As a general rule, similar instruments or sensors have been designed to rely upon different electrical sources so that a failure in one area does not affect the sensors in another. For example, the instruments in the safety system are tracked in triplicate. Thus core-inlet temperature is measured on panels A, B, C, with power source and wiring for panel A physically independent of panels B, C.

A power failure to one of the panels could affect all the instruments on that panel. Each of the panels is connected to a separate battery bank that supplies electrical power to the panel in the event of a utility power failure. The HFIR has backup generators that require a failure-to-start before relying solely on the reserve electricity in the battery banks.

The HFIR has one important exception to the separate power source & circuitry rule: The process systems, which include the cooling-tower temperature-control and the pH treatment of primary water, do share a common power source. Moreover, in order to service a particular instrument in one of the process systems, it might be necessary to open a circuit breaker. Although drawings exist that show the relation between a particular breaker and its associated instruments, no one has studied the total impact on the HFIR facility when a family of instruments is taken out of service, for example, when a circuit breaker is opened.

Tylee [9] evaluates the functional redundancy approach to detecting instrument failures in nuclear power plant instrumentation. His real-time method uses a bank of Kalman filters for each instrument to generate optimal estimates of the plant state. By performing consistency checks among the outputs of appropriate filters, Tylee can identify failed instruments.

2. The number of instruments to be serviced. This number partially determines the manner in which PM is undertaken. The HFIR has 1132 instruments on the I&C Division inventory list, and of this total, approximately 850 are on the programmed PM schedule. The kind of individual attention paid to instruments as well as the variety of problems that can be checked may be limited by the vast number of instruments in a reactor. Given a long queue of instruments waiting for PM, a reactor instrument technician would likely have less time to spend on individual instruments and might feel pressured to complete his PM tasks. An analogy is preparing meals: The way a person serves a meal to one person differs from the way he would serve meals to an entire family. The method of serving a family, in turn, differs from the method of serving over 1000 employees.

3. The number of instruments on the PM list. This number, more so than just the time constraint, affects: a) decisions about purchasing tools for on-site repairs vs shop repairs, as well as b) the number of employees in the PM program.

4. The extent of human interaction with the instruments. A sensor can continue giving bad readings until it is observed by some operator or technician. In one scenario, an observer notices a particular instrument or sensor malfunction, yet fails to take corrective action immediately. The observer might conclude that the instrument is unimportant and need not be calibrated/repaired immediately. Thus the instrument is allowed to remain unrepaired until such time as: a) a reading from that instrument is actually needed, or b) some accident occurs in which the instrument is needed to correct the system state. In developing a PM plan, it is helpful to determine the impact of allowing a bad sensor to remain unrepaired.

The spare instruments kept in inventory at the HFIR can fall into this category of disuse and disrepair. The value of redundant or spare parts is questionable if they are not known to be in proper working condition. To resolve this issue, the inventory of spare instruments has been added to the PM plan. Now the I&C Division maintains closer control over the working condition of the spares.

5. The manner in which signals from a sensor are used by the system. Some sensor signals feed directly to: a) a control system, b) a recorder, or c) a local display only. Some sensors that serve only as local gauges or instruments, and which were placed on the reactor only for convenience, have been left off the PM plan. From a risk-analysis perspective, studying various accident-related scenarios for each instrument left off the PM schedule could help predict whether such an instrument would be needed in a crisis. However, all instruments on the safety system that are part of any control system and that give output signals to the control room have been placed on the PM plan.

6. ATTITUDES TOWARD RISK

Preventive maintenance (PM) is not free. In general, the anticipated benefits of increased PM must be weighed against the costs. Risk assessment commonly assumes that some forms of risk, no matter how intolerable, cannot be completely eliminated. Risk assessment often delivers a list of alternatives that can reduce the probability of some accident/event so that its risk factor is acceptable. Risk factor is:

\[ F_i = P_i \cdot S_i \qquad \text{(6-1)} \]

Notation

Fi  risk factor of an event
Pi  probability of the event
Si  severity of the event

The planner for the PM schedule has an objective function that uses both the severity and the probability of an accident to influence which instruments are on the PM schedule and what their priorities are. The assumptions about risk attitudes influence why some managers use more PM than others do.

Consider the operation of an engineering system with no PM plan and only one type of accident. The system components are repaired on a bare-bones approach. The costs are measured in terms of the severity of the accident and quantified in dollars, the same as benefits.
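As a small illustration of how a risk factor might feed the priority ranking for PM, the Python sketch below ranks instruments by the product of event probability and severity. Treating the risk factor as this product is an assumption of the sketch, and the instrument names and numbers are hypothetical, chosen only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass
class Instrument:
    name: str
    probability: float  # assumed probability of the failure event (Pi)
    severity: float     # assumed severity of the event in dollars (Si)

    @property
    def risk_factor(self) -> float:
        # Assumption of this sketch: risk factor = probability x severity.
        return self.probability * self.severity


def rank_by_risk(instruments):
    """Return instruments ordered from highest to lowest risk factor."""
    return sorted(instruments, key=lambda inst: inst.risk_factor, reverse=True)


if __name__ == "__main__":
    # Hypothetical figures, not HFIR data.
    candidates = [
        Instrument("local gauge", 0.20, 1_000),
        Instrument("safety-channel sensor", 0.01, 5_000_000),
        Instrument("process recorder", 0.05, 40_000),
    ]
    for inst in rank_by_risk(candidates):
        print(f"{inst.name}: {inst.risk_factor:.0f}")
```

A ranking of this kind captures only the product of probability and severity; a planner with a particular attitude toward risk may weight the two inputs differently.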

Notation

U  planner's objective function, viz, utility; U is a function of benefit
B  benefit from an engineering system when running with no PM, and no accident
C  cost to operations from an accident
p  probability of an accident

The s-expected (average) utility from running an engineering system without a PM plan is:

\[ E\{U\} = (1 - p)\,U(B) + p\,U(B - C) \qquad \text{(6-2)} \]

When E{U} exceeds the s-expected (average) utility from implementing a PM plan, then the engineering-system operators undertake a bare-bones approach to PM. Therefore, any change that increases E{U} increases the incidence of operations without any PM. Similarly, decreases in E{U} increase the benefits from adopting a PM plan.

The statement, "The likelihood of an accident has proportionally more impact than the severity of an accident on the decision to run without a PM plan." can be expressed in terms of derivatives:

\[ \left| \frac{\partial E\{U\}}{\partial p} \right| \frac{p}{E\{U\}} > \left| \frac{\partial E\{U\}}{\partial C} \right| \frac{C}{E\{U\}} \qquad \text{(6-3a)} \]

\[ \frac{[U(B) - U(B - C)]\,p}{E\{U\}} > \frac{p\,U'(B - C)\,C}{E\{U\}} \qquad \text{(6-3b)} \]

Hence (6-3) holds if

\[ \frac{U(B) - U(B - C)}{C} > U'(B - C). \qquad \text{(6-4)} \]

Inequality (6-4) simply requires U to be convex; ie, the planner prefers risk over the certainty equivalent. For example, if given choices between a 60/40 lottery of receiving $100 or $0, and a second choice of a certain $60 ($60 = 0.6 x $100 + 0.4 x $0), a risk averse person will, by definition, choose the $60, while a person who prefers risk will, by definition, prefer the fair lottery. If the odds remain the same but the certain payoff is lowered to an unfair $50 then a marginally risk averse person might accept the lottery instead.

Are operators of engineering-systems risk averse, or do they prefer risk? The answer most likely varies across industry because risk-bearing can be a source of profits in the private sector. If the question is limited to nuclear plants then all personnel associated with managing the plant are risk averse. In addition, regulatory constraints added to the objective function essentially eliminate all gains from operating without a working PM plan. Even if the reactor operated without incident, the administrators would be penalized for failing to take adequate precautions.

Additional inferences can be drawn from this model. For example, increasing the severity of the accident by raising the cost thereof, or increasing the probability of an accident will lower the s-expected (average) utility from operating without a PM plan. To the extent that managers have other regulatory constraints to satisfy, a PM plan might be implemented in any case. If the plant managers are risk averse then the model in this section suggests that an increase in the probability of accidents has a more important impact than an increase in the severity of accidents to induce more managers to incorporate PM plans. If the managers prefer risk then the severity becomes more important than the probability of an accident in determining who implements PM.

ACKNOWLEDGMENT

Funding for this research was provided by appointment to the US Department of Energy Laboratory Cooperative Postgraduate Research Training Program administered by Oak Ridge Associated Universities. Don Asquith, Field Engineer for the High Flux Isotope Reactor (HFIR), provided invaluable assistance and most of the information on experiences with the HFIR reported herein. I also thank Bill Zabriske, Charlie Allen, and two anonymous referees for their helpful suggestions.

APPENDIX A: Preventive Maintenance (PM) for the Primary Pressure System on the HFIR

The primary pressure system on the HFIR comprises the following instruments:

1. Channel A Flux - measured 0 to 150%
2. Channel B Flux - measured 0 to 150%
3. Channel C Flux - measured 0 to 150%
4. FM258 Letdown Cleanup Flow - measured 0 to 200 gpm
5. HICM377 Secondary Flow Control Valve - a demand signal showing 0 to 100% closed for the 36 inch valve
6. HICM377A Secondary Flow Control Valve - a demand signal showing 0 to 100% closed for the 10 inch valve (attached to the inlet temperature controller)
7. FM216 Pressurizer Pump Flow - measured 0 to 200 gpm; this flow is measured after the letdown flow has passed through chemical processing and is returning to the primary system.
8. PM127 Primary Pressure - measured in 0 - 1500 psi. This sensor actually measures from 3 - 15 lbs, which it then translates to the 0 - 1500 psi scale.
9. PM127A Pressure Control Valve Position - measured 0 to 100% open; a demand signal for the valve to open, not a feedback signal from the valve itself.
10. #1 Primary Flow - measured 0 - 20 000 gpm
11. #2 Primary Flow - measured 0 - 20 000 gpm
12. #3 Primary Flow - measured 0 - 20 000 gpm
13. #1 Inlet Temperature - measured 75 - 200 degrees F
14. #2 Inlet Temperature - measured 75 - 200 degrees F
15. #3 Inlet Temperature - measured 75 - 200 degrees F

16. #1 Outlet Temperature - measured 75 - 200 degrees F
17. #2 Outlet Temperature - measured 75 - 200 degrees F
18. #3 Outlet Temperature - measured 75 - 200 degrees F
19. FM300 Secondary Flow - measured 0 - 25 000 gpm
20. TM310B1 Cooling Tower Inlet Temperature - measured 20 - 120 degrees F; both this signal and the cooling tower outlet temperature are measured by a resistance bulb.
21. TM310A1 Cooling Tower Outlet Temperature - measured 20 - 120 degrees F
22. #1 Rod Position - measured 0 - 27 inches
23. #2 Rod Position - measured 0 - 27 inches
24. #3 Rod Position - measured 0 - 27 inches
25. #4 Rod Position - measured 0 - 27 inches
26. #5 Rod Position - measured 0 - 27 inches

Two other sensors, not presently linked to any other system, are:

FM128 Low Pressure - a local meter, provides no signal
FM104 Backup Pressure Sensor to FM127 - appears as digital LED in control room

Notation

psi  pounds per square inch, gauge pressure
gpm  gallons per minute
F    Fahrenheit
lbs  pounds

Summary of Which Sensors are Routinely Calibrated and How Often

PM on the 3 flux channels can be separated into PM on the ion chamber and PM on the instrument itself. Each of the three ion chambers is serviced on a 3 year basis, and staggered so that one chamber is serviced a year. The PM check takes only about 2 hours; the chambers are readily accessible. The instrument itself is serviced every 6 months, and this service effort requires between 2 - 3 hours. The HFIR has a total of 9 flux sensors: 3 on the safety, 3 on the servo, and 3 on the counting channels.

The letdown cleanup flow, pressurizer pump flow, and three primary flow sensors are not calibrated. The manufacturer's specifications are taken as true and accurate. The devices are installed and not generally serviced. This practice is particularly true of the three primary flow sensors, which are on the venturi (an hour-glass shaped tube with an orifice for measuring flow), are permanent, and are not calibrated. The secondary flow is measured by a Dall tube (a funnel-shaped tube with an orifice), and is not calibrated.

The core-inlet temperature sensors are routinely PM'd only on the safety side. Each of the three transmitters is serviced annually and requires about 30 minutes for calibration. The resistance bulbs are on a 3 year plan and staggered so that only one bulb is serviced in a given year. It is quite an ordeal to check the calibration of these bulbs. First the primary water must be drained from the system, which requires a minimum of 8 hours. Once the bulb is removed it is sent to the Standards Office, which is located in a building about 1.5 miles from the HFIR, to be immersed in a bath. The resistance bulbs rarely if ever show signs of drifting out of calibration. But since these bulbs are part of the safety system, they are serviced just to be sure that they have stayed in calibration.

The core outlet temperature sensors are not part of the safety system; hence, they are not routinely serviced like the core inlet temperature sensors. Neither the HICM377 36 inch controller valve nor the HICM377A 10 inch controller valve requires routine PM. Repairs on the valves, if ever needed, are split between Instrumentation & Controls personnel for the top part of the valve (the controller box) and Plant & Equipment personnel for the valve itself.

Primary pressure PM127 is serviced annually. The calibration check takes about 1 hour, but it may take all day before the sensor can be removed and taken to the shop. The PM127A pressure controller valve is serviced annually and takes about one hour to complete the PM check; the valve is easily accessed in the control room.

The 5 rod-position sensors are not on a routine PM schedule. Because of the way they are locked into place by bolts, they do not drift out of calibration over time. The only part that requires servicing is the pointer, which is visually compared to a yardstick in the subpile room during each restart.

The cooling tower inlet temperature TM310B1 sensor contains one transmitter and one resistance bulb. The cooling tower outlet temperature TM310A1 sensor contains 4 transmitters and 4 resistance bulbs with sensor reading being the average of the four. The transmitters are serviced annually. Because of the location of the resistance bulbs, it is economically not feasible to service them on a routine basis.
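The staggered servicing described in this appendix (three ion chambers or three resistance bulbs on a 3 year plan, one serviced per year) can be sketched in Python as a simple rotation. The function name, the item labels, and the start year below are hypothetical and serve only to illustrate the pattern.

```python
def staggered_schedule(items, start_year, n_years):
    """Assign one item per year in rotation.

    With len(items) items, each item comes up for service once every
    len(items) years, and no year carries more than one of them.
    """
    return {start_year + k: items[k % len(items)] for k in range(n_years)}


if __name__ == "__main__":
    # Hypothetical labels and start year.
    chambers = ["ion chamber A", "ion chamber B", "ion chamber C"]
    for year, chamber in staggered_schedule(chambers, 1989, 6).items():
        print(year, chamber)
```

Spreading the work this way keeps the yearly load constant, which is the same congestion-avoidance idea applied to a single family of sensors.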

APPENDIX B. Illustration of Cause-Consequence Relations Contained in HFIR Quality Assurance Documentation

Each event below is followed by its causes, which are combined through an OR gate.

Event: primary cleanup pump fails to start on request
Causes:
1. basic pump failure
2. basic motor failure
3. overload relay tripped
4. control room switch turned off
5. local switch turned off
6. diesel engine #1 fails AND normal power outage
7. auxiliary contact on transfer switch fails AND normal power outage
8. flow switch 217 fails AND primary cleanup flow drops below 75 gpm

Event: reduction in primary cleanup flow
Causes:
1. primary cleanup pump fails to start on request

Event: primary recirculating pump seals wear
Causes:
1. reduction in primary cleanup flow

Event: primary cleanup pump fails off during operation
Causes:
1. basic motor failure
2. overload relay tripped
3. control room switch turned off
4. local switch turned off
5. timer relay TR-2 fails after flow switch 217 clears
6. timer relay TR-2 fails after auto transfer switch #1 goes back to normal

Event: pump bowl leak
Causes:
1. basic pump failure
2. mechanical seal failure
3. pump vent open
4. pump drain valve open

Event: bearings and seals fail after extended use
Causes:
1. loss of cooling water

Event: fuel-cladding failure
Causes:
1. sufficient flow blocked or diverted away from fuel region
2. power transients occur

Event: reactivity-control lost or hindered
Causes:
1. control plates are jammed
2. extension tubes are jammed
3. shock tubes are jammed
4. tracks are jammed
5. moderator shifts to alter the core flux distribution
6. reflector material shifts to alter the core flux distribution

Event: power transients occur
Causes:
1. reactivity-control lost or hindered
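The OR/AND structure of these cause-consequence relations lends itself to a direct Boolean evaluation. The Python sketch below encodes three of the events from the table; the data layout (each cause as a set of conditions that must all hold) and the function name are choices made here for illustration, not part of the HFIR documentation.

```python
# Each event maps to a list of causes (OR gate). Each cause is a set of
# conditions that must all be true (AND gate); single conditions are
# one-element sets. Only a subset of the table is encoded.
CAUSES = {
    "primary cleanup pump fails to start on request": [
        {"basic pump failure"},
        {"basic motor failure"},
        {"overload relay tripped"},
        {"control room switch turned off"},
        {"local switch turned off"},
        {"diesel engine #1 fails", "normal power outage"},
        {"auxiliary contact on transfer switch fails", "normal power outage"},
        {"flow switch 217 fails", "primary cleanup flow drops below 75 gpm"},
    ],
    "reduction in primary cleanup flow": [
        {"primary cleanup pump fails to start on request"},
    ],
    "primary recirculating pump seals wear": [
        {"reduction in primary cleanup flow"},
    ],
}


def occurs(event, observed):
    """Return True if `event` is observed directly or implied by its causes.

    `observed` is the set of basic events known to be true. The encoded
    subset has no cycles, so the recursion terminates.
    """
    if event in observed:
        return True
    return any(all(occurs(condition, observed) for condition in cause)
               for cause in CAUSES.get(event, []))


if __name__ == "__main__":
    observed = {"diesel engine #1 fails", "normal power outage"}
    print(occurs("primary recirculating pump seals wear", observed))
```

Chaining the relations in this way shows how a pair of basic failures can propagate to a consequence several steps removed, which is the kind of scenario a priority ranking for PM needs to consider.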

REFERENCES

[1] Winfrid G. Schneeweiss, "The failure of systems with dependent control", IEEE Trans. Reliability, vol R-35, 1986 Dec, pp 512-517.
[2] J. B. Fussell, J. S. Arendt, "System reliability engineering methodology: A discussion on the state of the art", Nuclear Safety, vol 20, Sep-Oct 1979, pp 541-550.
[3] M. A. S. Guth, "A probabilistic foundation for vagueness and imprecision in fault tree analysis", revision submitted to IEEE Trans. Reliability, 1988, (TR87-042/1).
[4] P. M. Morse, Queues, Inventories and Maintenance, John Wiley & Sons, 1958.
[5] D. N. Khandelwal, Jaydev Sharma, L. M. Ray, "Optimal periodic maintenance policy for machines subject to deterioration and random breakdown", IEEE Trans. Reliability, vol R-28, 1979 Oct, pp 328-330.
[6] S. E. Emoto, R. E. Schafer, "On the specification of repair time requirements", IEEE Trans. Reliability, vol R-29, 1980 Apr, pp 13-16.
[7] John M. Sheppard, "Discussion of 'On the specification of repair time requirements'", IEEE Trans. Reliability, vol R-30, 1981 Apr, pp 36-37.
[8] L. Takacs, Introduction to the Theory of Queues, Oxford University Press, 1962.
[9] J. Louis Tylee, "On-line failure detection in nuclear power plant instrumentation", IEEE Trans. Automatic Control, vol AC-28, 1983 Mar, pp 406-415.

AUTHOR

Dr. A. S. Guth; RJO Enterprises; 116 Oklahoma Avenue; Oak Ridge, Tennessee 37830-8604 USA.

Michael Anthony Stephen Guth was born in Oak Ridge, Tennessee on 1962 August 1. He completed his BA (Economics) from Rice University in 1982, his MS (Economics) from California Institute of Technology in 1984, and his PhD (Economics) from the University of Tennessee in 1988. He worked as a system analyst and economist at the NASA Jet Propulsion Laboratory from 1982 - 1984, as an economist and postgraduate research fellow at Oak Ridge National Laboratory from 1985 - 1988, and since 1988 July as a Senior Technical Specialist with RJO Enterprises. His research interests include uncertainty theory, risk evaluation, and mathematical modeling of decision processes. He is a member of the American Economics Association and the Operations Research Society of America.

Manuscript TR87-704 received 1987 October 8; revised 1988 September 1.
IEEE Log Number 24512
