Availability Prediction Methods - NASA


Availability Prediction and Analysis, Page 1 of 5

Technique AT-3

Technique: Estimate or predict the future availability of a system,
function, or unit, where availability is defined as the probability that
the system, function, or unit will be in an operable state at a random
time. Availability may be assessed for a single component, a repairable
unit, a replaceable unit, a system of many replaceable units, or a function
performed by multiple systems.

AVAILABILITY PREDICTION AND ANALYSIS
Availability analysis provides a measure which can be used to
optimize system readiness within cost and schedule constraints.

Benefits: Availability prediction and assessment methods can provide
quantitative performance measures that may be used in assessing a given
design or to compare system alternatives to reduce life cycle costs. This
technique increases the probability of mission success by ensuring
operational readiness. Analyses based on availability predictions will
help assess design options and can lead to definition of maintenance
support concepts that will increase future system availability, anticipate
logistics and maintenance resource needs, and provide long term
savings in operations and maintenance costs based on optimization of
logistics support.

Key Words: Availability, Achieved Availability, Inherent Availability,
Operational Availability, Stochastic Simulation, Maintainability, RMAT,
Markov Model

Application Experience: International Space Station Program

Technical Rationale: Availability estimation is a valuable design aid and
assessment tool for any system whose operating profile allows for repair
of failed units or components. These systems include those that operate on
earth such as control centers, system test facilities, or flight simulation
systems/facilities. Applying availability prediction and analysis
techniques is also an extremely valuable process for guiding the
development of maintenance concepts and requirements.

Contact Center: Johnson Space Center (JSC)

Availability Prediction and Analysis Technique AT-3

Availability can be predicted or estimated using various methods and
measures. Availability is a characteristic of repairable or restorable
items or systems, and assumes that a failed item can be restored to
operation through maintenance, reconfiguration, or reset. It is a function
of how often a unit fails (reliability) and how fast the unit can be
restored after failure (maintainability). Availability prediction and
analysis provide a foundation for establishing reliability and
maintainability (R&M) parameters and for making trade-offs between these
parameters. Availability can be estimated for components, items, or units,
but overall spacecraft system or ground system availability estimation is
based on the combinations and connectivity of the units within the system
that perform the functions, i.e., the series and redundant operations
paths.

Availability Measures
One basic measure of availability, called inherent availability, is useful
during the design process to assess design characteristics. The measure
involves only the as-designed reliability and maintainability
characteristics and can be calculated using the estimated
mean-time-between-failure (MTBF) and mean-time-to-repair (MTTR)
parameters. The predicted or estimated measure of inherent availability is
calculated as:

$$A_i = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}} \tag{1}$$

The MTTR time in the inherent availability calculation does not include
such times as administrative or logistic delay time, which generally are
beyond the control of the designer, and does not include preventive
maintenance time. However, effective trade-offs using the basic times and
parameters are possible. Trade-off techniques and some sample uses are
included in Reference 1, Section 5.5.

Another measure of availability, achieved availability or Aa, can be
expressed as:

$$A_a = \frac{OT}{OT + TCM + TPM} \tag{2}$$

where OT is the total time spent in an operating state; TCM is the total
corrective maintenance time, which does not include before-and-after
maintenance checks, supply, or administrative waiting periods; and TPM is
the total time spent performing preventive maintenance. Aa is more
specifically directed toward the hardware characteristics than the
operational availability measure, which considers the operating and
logistics policies.

A third basic measure of availability, operational availability, considers
all repair time: corrective and preventive maintenance time,
administrative delay time, and logistic support time. This is a more
realistic definition of availability in terms of providing a measure to
assess alternative maintenance and logistics support concepts associated
with the operation of a system or function. It is usually defined by the
equation:

$$A_o = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}} = \frac{\text{Uptime}}{\text{Total Time}} \tag{3}$$

where Uptime is the total time a system is in an operable state, and
Downtime is the total time the system is in an inoperable state. The sum
of Uptime and Downtime, or Total Time, is usually known, specified as a
requisite
operating time, or is a given time to perform a critical function.
Downtime often is broken down into a variety of subcategories such as
detection and diagnosis time, time waiting for repair parts, actual unit
repair or replacement time, test and checkout time, etc. Table 1 shows the
basic difference between the availability measures defined above.

Table 1: Commonly Used Availability Measures

Availability      Function of:                    Excludes:
Measure
----------------  ------------------------------  -------------------------
Inherent (Ai)     Hardware design                 Ready time, preventive
                                                  maintenance downtime, and
                                                  administrative downtime

Achieved (Aa)     Hardware design, but also       Logistics time and
                  includes active, preventive,    administrative downtime
                  and corrective maintenance
                  downtime

Operational (Ao)  Product of actual operational   All inclusive
                  environment, including ready
                  time, logistics time, and
                  administrative downtime

System or Function Availability Estimation
System/function availability estimates may be derived in a limited fashion
by algebraically combining mean value estimates of the system units, or
more rigorously by using computer-aided simulation methods.

Mean Value Estimation
Mean value estimation of system availability is usually performed by
algebraically combining component, LRU (line replaceable unit), and ORU
(orbital replacement unit) availabilities calculated using equation (1).
When the system is composed of a number of components, LRUs, or ORUs, the
failure of any one of which results in the system being down, the system
availability is calculated as the product of these units' availabilities.
When the system involves item redundancy, redundant block availability
estimates can be calculated using simple Boolean mathematical
decomposition procedures similar to reliability block diagram solution
methods. See Reference 1, Section 10.4.

Computer-Aided Simulation
Availability prediction using computer-aided simulation modeling may use
either a stochastic simulation or a Markov model approach. Stochastic
simulation modeling uses statistical distributions for the system's
reliability, maintainability, and other maintenance and delay time
parameters. These distributions are used as mathematical models for
estimating individual failure and restoration times and can include
failure effects and other operational conditions. A computer program
generates random draws from these distributions to simulate when the
system is up and down, maintains tables of failures, repairs, failure
effects, etc., and tracks system or function capability over time. These
data may then be used to calculate and output system operational
availability estimates using equation (3).

Stochastic Simulation Methods
Discrete event stochastic simulation programs are recommended to perform
operational availability predictions and analyses for large, repairable
systems such as the space station or large ground systems and facilities.
These methods simulate and monitor the availability status of defined
systems or functions that are composed of a collection of Replaceable
Units (RUs). The following process is generally used:

(1) Generate simulated future failure times for each designated RU based
    on predicted RU reliability distributions and parameters.

(2) Step through simulated operating time, and when failure events are
    encountered, evaluate the failure impact or function status given the
    specific failures encountered.

(3) Repair or replace the failed RU using a maintenance policy and
    procedure based on the availability of required maintenance resources,
    priority or criticality of the failure, or the current system or
    function status. Once an RU is repaired or replaced, the system or
    function status is reset appropriately, and a future failure time for
    the RU is again generated.

Generation of simulated failures and maintenance actions for RUs requires
as input the estimated RU time-to-failure distribution model parameters
and factors that define the frequency of other scheduled or unscheduled
maintenance. The maintenance actions can include equipment failures,
preventive maintenance tasks, and environmentally or human-induced
failures.

To evaluate the effect of a simulated failure on the function's
operational capability at a particular point in time, minimal cut sets of
failure events that define the system or function failure conditions can
be used. Minimal cut sets of failure events can be generated from
reliability block diagrams or fault tree analysis of the functions, and
then used during a simulation run to dynamically determine queuing
priorities based upon functional criticality and the current level of
remaining redundancy after the simulated failure occurs.

Maintenance is simulated by allocating available maintenance resources and
spare parts to the awaiting maintenance action (or waiting for resources
to become available). Groups of maintenance actions may also be packaged
into shifts of work. If the system under consideration is in a space
environment, both external (extravehicular activity, or EVA) and internal
(intravehicular activity, or IVA) maintenance can be considered.

When the stochastic simulation method is used, each run of the simulation
model (called an iteration) will yield a single value of the availability
measure that depends on the chance component or unit failures and repairs
that happened during that iteration. Therefore, many iterations are
required to cover as many potential failure situations as possible, and to
give the analyst a better understanding of the variation in the resulting
availability as a function of the variations in the random failure and
repair process. The number of iterations required for accurate
availability measure results will depend on the iteration-to-iteration
variation in the output measure. Experience has shown that in system
availability simulations with a large iteration-to-iteration variation,
200 to 1000 iterations or more may be required to obtain a statistically
accurate estimate of the average system availability.

For example, the Reliability and Maintainability Assessment Tool (RMAT) is
a stochastic computer-aided simulation method of the kind described here
that has been used at Johnson Space Center for assessing the
maintainability and availability characteristics of the Space Station. The
output of the RMAT includes the percent of total (or specified mission)
time each defined space station function spends in a "down" state as well
as the percent of time each defined function is one failure away from
functional outage (is zero failure tolerant). Using
RMAT, analysts at JSC have been able to perform trade studies that
quantify the differences between alternative Space Station configurations
in terms of their respective operational availability and maintainability
measure estimates.

The same simulation methods (such as RMAT) that provide for operational
availability measures will also provide maintenance resource usage
measures such as maintenance manpower needs and spare part requirements.
With this capability, JSC has been able to estimate the maintenance
manpower needs, including EVA requirements, of various Space Station
alternative configurations.

Markov Model Approach
Markov process analysis, or state-space analysis, is a mathematical tool
particularly well suited to computer simulation of the availability of
complex systems when the necessary assumptions are valid. This analysis
technique also is well adapted to use in conjunction with Fault Tree
Analysis or Reliability Block Diagram Analysis (RBDA). Examples of the use
of Markov process analysis may be found in Reference 1 or in such standard
reliability textbooks as Reference 2.

Failure to use availability predictions and analysis during the design
process may lead to costly sub-optimization of the as-designed system
reliability and maintainability characteristics. Where operations and
support costs are a major portion of the life cycle costs, availability
prediction and analysis are critical to understanding the impact of
insufficiently defined maintenance resources (personnel, spare parts, test
equipment, facilities, etc.) and maintenance concepts on overall system
operational availability and mission success probabilities. These analyses
can therefore greatly reduce the life cycle costs associated with
deploying and supporting a space or ground system.

References

1. MIL-HDBK-338, Electronic Reliability Design Handbook, Reliability
   Analysis Center, Rome, NY, 1989.

2. O'Connor, P.D.T., Practical Reliability Engineering, John Wiley & Sons
   Ltd., Chichester, 1991.
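
As a rough numerical illustration of equations (1) through (3) and of the
mean value estimation rules, the Python sketch below computes the three
availability measures and combines unit availabilities for a series
arrangement and for a redundant block. The function names and sample values
are hypothetical. The redundant-block formula assumes independent units,
any one of which is sufficient to keep the block up; that assumption is
introduced here and does not come from the references.

```python
def inherent_availability(mtbf, mttr):
    """Equation (1): Ai = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def achieved_availability(ot, tcm, tpm):
    """Equation (2): Aa = OT / (OT + TCM + TPM)."""
    return ot / (ot + tcm + tpm)


def operational_availability(uptime, downtime):
    """Equation (3): Ao = Uptime / (Uptime + Downtime)."""
    return uptime / (uptime + downtime)


def series_availability(unit_availabilities):
    """System is down if any one unit is down: product of availabilities."""
    result = 1.0
    for a in unit_availabilities:
        result *= a
    return result


def redundant_availability(unit_availabilities):
    """Block is up if at least one unit is up (assumes independent units)."""
    all_down = 1.0
    for a in unit_availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down


if __name__ == "__main__":
    # Hypothetical units: times in hours.
    pump = inherent_availability(mtbf=1000.0, mttr=10.0)
    controller = inherent_availability(mtbf=500.0, mttr=5.0)
    # Two redundant pumps feeding one controller in series.
    system = series_availability(
        [redundant_availability([pump, pump]), controller]
    )
    print("pump Ai:", pump)
    print("controller Ai:", controller)
    print("system availability:", system)
```

Because these are mean value estimates, they say nothing about maintenance
resource limits or delay times; those effects are what the simulation
methods are intended to capture.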

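
The three-step stochastic simulation process (generate failure times, step
through operating time, repair and regenerate) can be sketched in Python as
follows. This is a minimal illustration, not RMAT: it assumes a series
system, exponentially distributed times to failure and repair, unlimited
maintenance resources, and RUs that keep operating while the system is
down. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_iteration(units, mission_time, rng):
    """Run one iteration; return the fraction of mission_time the system is up.

    units: list of (mtbf, mttr) tuples in hours, one per replaceable unit.
    """
    # Step 1: draw a first failure time for every RU.
    next_event = [rng.expovariate(1.0 / mtbf) for mtbf, _ in units]
    is_up = [True] * len(units)
    clock = 0.0
    uptime = 0.0
    while clock < mission_time:
        # Step 2: advance to the earliest failure or repair completion.
        i = min(range(len(units)), key=lambda k: next_event[k])
        t = min(next_event[i], mission_time)
        if all(is_up):  # series system: up only when every RU is up
            uptime += t - clock
        clock = t
        if clock >= mission_time:
            break
        # Step 3: change the RU state and schedule its next event.
        mtbf, mttr = units[i]
        if is_up[i]:
            is_up[i] = False  # failure: schedule repair completion
            next_event[i] = clock + rng.expovariate(1.0 / mttr)
        else:
            is_up[i] = True  # repaired: schedule the next failure
            next_event[i] = clock + rng.expovariate(1.0 / mtbf)
    return uptime / mission_time


def estimate_availability(units, mission_time, iterations=500, seed=1):
    """Average many iterations; also return the standard error of the mean."""
    rng = random.Random(seed)
    results = [
        simulate_iteration(units, mission_time, rng) for _ in range(iterations)
    ]
    mean = statistics.mean(results)
    std_err = statistics.stdev(results) / iterations ** 0.5
    return mean, std_err


if __name__ == "__main__":
    units = [(1000.0, 10.0), (500.0, 5.0)]  # hypothetical (MTBF, MTTR) pairs
    mean, std_err = estimate_availability(units, mission_time=10000.0)
    print("estimated availability:", mean, "standard error:", std_err)
```

The standard error gives one way to judge whether the number of iterations
is adequate: if it is too large for the decision at hand, more iterations
are needed, which is consistent with the 200 to 1000 iterations cited for
simulations with large iteration-to-iteration variation.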

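
For a single repairable unit, the Markov model approach can be illustrated
with a two-state (up/down) model. The derivation below is a sketch under
the assumption of a constant failure rate \lambda = 1/MTBF and a constant
repair rate \mu = 1/MTTR. The symbols \lambda, \mu, P_up, and P_down are
introduced here for illustration and do not appear in the references.

```latex
\begin{aligned}
\frac{dP_{\mathrm{up}}}{dt} &= -\lambda P_{\mathrm{up}}(t) + \mu P_{\mathrm{down}}(t),
  \qquad P_{\mathrm{up}}(t) + P_{\mathrm{down}}(t) = 1 \\
0 &= -\lambda P_{\mathrm{up}} + \mu \left(1 - P_{\mathrm{up}}\right)
  \quad \text{(steady state)} \\
P_{\mathrm{up}} &= \frac{\mu}{\lambda + \mu}
  = \frac{1/\mathrm{MTTR}}{1/\mathrm{MTBF} + 1/\mathrm{MTTR}}
  = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
\end{aligned}
```

Under these assumptions the steady-state probability of the up state
reduces to the inherent availability of equation (1).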
