Availability Prediction Methods - NASA
Technique AT-3
Technical Rationale
Availability estimation is a valuable design aid and assessment tool for
any system whose operating profile allows for repair of failed units or
components. These systems include those that operate on earth such as
control centers, system test facilities, or flight simulation
systems/facilities. Applying availability prediction and analysis
techniques is also an extremely valuable process for guiding the
development of maintenance concepts and requirements.
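
The discussion that follows refers to equation (1) for unit availability
and equation (2) for operational availability. As a reading aid, the
conventional forms of these two measures are sketched below. This is an
assumption based on standard reliability engineering definitions, not a
reproduction of the technique's own equations; in particular, MTBF (mean
time between failures) is an assumed symbol, and the equation numbers are
assigned only to match the later references.

```latex
% Assumed standard forms (illustrative, not quoted from the technique).
% MTBF = mean time between failures (assumed symbol)
% MTTR = mean time to repair, active repair time only
A_i = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}} \qquad (1)

A_o = \frac{\mathrm{Uptime}}{\mathrm{Uptime} + \mathrm{Downtime}}
    = \frac{\mathrm{Uptime}}{\mathrm{Total\ Time}} \qquad (2)
```

For instance, a hypothetical unit with an MTBF of 500 hours and an MTTR of
5 hours would have an inherent availability of 500/505, or about 0.990.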
Availability Prediction and Analysis, Page 2 of 5
The MTTR time in the inherent availability calculation does not include
such times as administrative or logistic delay time, which generally are
beyond the control of the designer, and does not include preventive
maintenance time.

In the operational availability calculation, Uptime is the total time a
system is in an operable state, and Downtime is the total time the system
is in an inoperable state. The sum of Uptime and Downtime, or Total Time,
is usually known, specified as a requisite
operating time, or is a given time to perform a critical function.
Downtime often is broken down into a variety of subcategories such as
detection and diagnosis time, time waiting for repair parts, actual unit
repair or replacement time, test and checkout time, etc. Table 1 shows
the basic difference between the availability measures defined above.

Table 1: Commonly Used Availability Measures

Availability       Function of:                     Excludes:
Measure

Inherent (Ai)      Hardware design                  Ready time, preventive
                                                    maintenance downtime, and
                                                    administrative downtime

Achieved (Aa)      Hardware design, but also        Logistics time and
                   includes active, preventive,     administrative downtime
                   and corrective maintenance
                   downtime

Operational (Ao)   Product of actual operational    All inclusive
                   environment including ready
                   time, logistics time, and
                   administrative downtime

System or Function Availability Estimation
System/function availability estimates may be derived in a limited
fashion by algebraically combining mean value estimates of the system
units, or more rigorously by using computer-aided simulation methods.

Mean Value Estimation
Mean value estimation of system availability is usually performed by
algebraically combining component, LRU, and ORU availabilities
calculated using equation (1). When the system is composed of a number
of components, LRUs, or ORUs, the failure of any one of which results in
the system being down, the system availability is calculated from the
product of these units' availabilities. When the system involves item
redundancy, redundant block availability estimates can be calculated
using simple Boolean mathematical decomposition procedures similar to
reliability block diagram solution methods. See Reference 1, Section
10.4.

Computer-Aided Simulation
Availability prediction using computer-aided simulation modeling may use
either a stochastic simulation or a Markov model approach. Stochastic
simulation modeling uses statistical distributions for the system's
reliability, maintainability, and other maintenance and delay time
parameters.

These distributions are used as mathematical models for estimating
individual failure and restoration times and can include failure effects
and other operational conditions. A computer program generates random
draws from these distributions to simulate when the system is up and
down, maintains tables of failures, repairs, failure effects, etc., and
tracks system or function capability over time. These data may then be
used to calculate and output system operational availability estimates
using equation (2).

Stochastic Simulation Methods
Discrete event stochastic simulation programs are recommended to perform
operational availability predictions and analyses for large, repairable
systems such as the Space Station or large ground systems and
facilities. These methods simulate and monitor the availability status
of defined systems or functions that are composed of a collection of
Replaceable Units (RUs). The following process is generally used:
(3) Repair or replace the failed RU using a maintenance policy and
    procedure based on the availability of required maintenance
    resources, priority or criticality of the failure, or the current
    system or function status. Once an RU is repaired or replaced, the
    system or function status is reset appropriately, and a future
    failure time for the RU is again generated.

Generation of simulated failures and maintenance actions for RUs
requires as input the estimated RU time-to-failure distribution model
parameters and factors that define the frequency of other scheduled or
unscheduled maintenance. The maintenance actions can stem from equipment
failures, preventive maintenance tasks, and environmentally induced or
human-induced failures.

To evaluate the effect of a simulated failure on the function's
operational capability at a particular point in time, minimal cut sets
of failure events that define the system or function failure conditions
can be used. Minimal cut sets of failure events can be generated from
reliability block diagrams or fault tree analysis of the functions, and
then used during a simulation run to dynamically determine queuing
priorities based upon functional criticality and the current level of
remaining redundancy after the simulated failure occurs.

When the stochastic simulation method is used, each run of the
simulation model (called an iteration) will yield a single value of the
availability measure that depends on the chance component or unit
failures and repairs that happened during that iteration. Therefore,
many iterations are required to cover as many potential failure
situations as possible, and to give the analyst a better understanding
of the variation in the resulting availability as a function of the
variations in the random failure and repair process. The number of
iterations required for accurate availability measure results will
depend on the iteration-to-iteration variation in the output measure.
Experience has shown that in system availability simulations with a
large iteration-to-iteration variation, 200 to 1000 iterations or more
may be required to obtain a statistically accurate estimate of the
average system availability.

For example, the Reliability and Maintainability Assessment Tool (RMAT)
is a stochastic computer-aided simulation method of the kind described
above. It has been used at Johnson Space Center (JSC) for assessing the
maintainability and availability characteristics of the Space Station.
The output of the RMAT includes the percent of total (or specified
mission) time each defined Space Station function spends in a "down"
state as well as the percent of time each defined function is one
failure away from functional outage (is zero failure tolerant). Using
RMAT, analysts at JSC have been able to perform trade studies that
quantify the differences between alternative Space Station
configurations in terms of their respective operational availability and
maintainability measure estimates.

The same simulation methods (such as RMAT) that provide for operational
availability measures will also provide maintenance resource usage
measures such as maintenance manpower needs and spare part requirements.
With this capability, JSC has been able to estimate the maintenance
manpower needs, including EVA requirements, of various Space Station
alternative configurations.

supporting a space or ground system.

References

1. MIL-HDBK-338, Electronic Reliability Design Handbook, Reliability
   Analysis Center, Rome, NY, 1989.

2. O'Connor, P.D.T., Practical Reliability Engineering, John Wiley &
   Sons Ltd., Chichester, 1991.
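
As an illustration of the mean value and stochastic simulation approaches
described in this technique, the following Python sketch compares the
two for a small series system. It is a minimal example under stated
assumptions, not a representation of RMAT or of any NASA tool: all
function names and unit values are hypothetical, failure and repair
times are assumed to be exponentially distributed, every unit is assumed
to be repaired independently with no waiting for maintenance resources,
and the system is assumed to be down whenever any one unit is down.

```python
import random
import statistics
from math import prod


def series_availability(avails):
    """Mean value estimate when any single unit failure downs the system."""
    return prod(avails)


def parallel_availability(avails):
    """Mean value estimate for a redundant block that is up if any unit is up."""
    return 1.0 - prod(1.0 - a for a in avails)


def simulate_iteration(units, total_time, rng):
    """Run one iteration and return Uptime / Total Time for a series system.

    units is a list of (mean_time_to_failure, mean_time_to_repair) pairs.
    Exponential distributions are an assumption made for this sketch.
    """
    up = [True] * len(units)
    # Time of each unit's next event (a failure, since all units start up).
    next_event = [rng.expovariate(1.0 / mtbf) for mtbf, _ in units]
    now = 0.0
    uptime = 0.0
    while True:
        i = min(range(len(units)), key=next_event.__getitem__)
        t = min(next_event[i], total_time)
        if all(up):
            uptime += t - now  # the system was up during this interval
        now = t
        if now >= total_time:
            break
        mtbf, mttr = units[i]
        up[i] = not up[i]  # the unit fails or its repair completes
        mean = mtbf if up[i] else mttr
        next_event[i] = now + rng.expovariate(1.0 / mean)
    return uptime / total_time


def estimate_availability(units, total_time, iterations, seed=0):
    """Average many iterations; return (mean, standard error of the mean)."""
    rng = random.Random(seed)
    results = [simulate_iteration(units, total_time, rng)
               for _ in range(iterations)]
    mean = statistics.fmean(results)
    stderr = statistics.stdev(results) / iterations ** 0.5
    return mean, stderr


if __name__ == "__main__":
    # Hypothetical units: (mean time to failure, mean time to repair), hours.
    units = [(500.0, 5.0), (1000.0, 10.0), (200.0, 4.0)]
    mean_value = series_availability([f / (f + r) for f, r in units])
    simulated, stderr = estimate_availability(units, 10000.0, 500)
    print(f"mean value estimate: {mean_value:.4f}")
    print(f"simulated estimate:  {simulated:.4f} (std. error {stderr:.4f})")
```

Under these simplifying assumptions the two estimates should agree
closely. The simulation approach becomes more useful when repair
resources are limited, distributions are not exponential, or functions
have redundancy, which are the conditions the technique describes. The
standard error shrinks as the number of iterations grows, which is one
way to judge whether enough iterations have been run.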