Fault Tree Analysis
Mission Success Starts With Safety
Figure: Linking OR and AND Gates
Q OCCURS (G001)
  A OCCURS AND THEN B OCCURS (G002)
    A OCCURS (B001)
    B OCCURS GIVEN THE OCCURRENCE OF A (B002)
  B OCCURS AND THEN A OCCURS (G003)
    B OCCURS (B003)
    A OCCURS GIVEN THE OCCURRENCE OF B (B004)
Terminating Events in a Fault Tree
The terminating events of a fault tree identify where
the FTA stops
Two fundamental terminating events are the basic
event and the undeveloped event
The basic event represents the lowest level event
(cause) resolved in the fault tree
The undeveloped event represents an event which
is not further developed for causes
Expanded Types of Terminating Events
Basic Causal Event: treated as a primary
cause with no further resolution
Condition Event: defines a condition
that needs to exist
Undeveloped Event: not further developed
House Event: an event expected to
occur; sometimes used as a True or
False switch
Transfer Symbol: transfer out of a
gate or into a gate
Extended Gate Symbols
PRIMARY EVENT SYMBOLS
BASIC EVENT - A basic initiating fault requiring no further development
CONDITIONING EVENT - Specific conditions or restrictions that apply to
any logic gate (used primarily with PRIORITY AND and INHIBIT gates)
UNDEVELOPED EVENT - An event which is not further developed either
because it is of insufficient consequence or because information is
unavailable
HOUSE EVENT - An event which is normally expected to occur
GATE SYMBOLS
AND - Output fault occurs if all of the input faults occur
OR - Output fault occurs if at least one of the input faults occurs
COMBINATION - Output fault occurs if n of the input faults occur
EXCLUSIVE OR - Output fault occurs if exactly one of the input faults
occurs
PRIORITY AND - Output fault occurs if all of the input faults occur in a
specific sequence (the sequence is represented by a CONDITIONING
EVENT drawn to the right of the gate)
INHIBIT - Output fault occurs if the (single) input fault occurs in the
presence of an enabling condition (the enabling condition is represented
by a CONDITIONING EVENT drawn to the right of the gate)
TRANSFER SYMBOLS
TRANSFER IN - Indicates that the tree is developed further at the
occurrence of the corresponding TRANSFER OUT (e.g., on another page)
TRANSFER OUT - Indicates that this portion of the tree must be attached
at the corresponding TRANSFER IN
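The gate definitions can be read as Boolean functions of the input faults. The following is a minimal Python sketch of that reading, not part of the original slides. The function names are hypothetical, the COMBINATION gate is assumed to mean "at least n inputs", and the PRIORITY AND gate is assumed to receive occurrence times listed in the required order.

```python
from typing import Optional, Sequence


def gate_output(kind: str, inputs: Sequence[bool], n: Optional[int] = None) -> bool:
    """Return True if the output fault occurs for the given input faults."""
    count = sum(bool(x) for x in inputs)
    if kind == "AND":
        return count == len(inputs)
    if kind == "OR":
        return count >= 1
    if kind == "EXCLUSIVE OR":
        return count == 1
    if kind == "COMBINATION":
        # Assumption: "n of the input faults" is read as "at least n".
        if n is None:
            raise ValueError("COMBINATION gate needs n")
        return count >= n
    raise ValueError(f"unknown gate kind: {kind}")


def inhibit_output(input_fault: bool, enabling_condition: bool) -> bool:
    """INHIBIT: the single input fault passes only if the condition holds."""
    return input_fault and enabling_condition


def priority_and_output(occurrence_times: Sequence[Optional[float]]) -> bool:
    """PRIORITY AND: all inputs occurred, in the listed order.

    Each entry is the time at which an input fault occurred, or None if it
    has not occurred (an assumed encoding for this sketch).
    """
    if any(t is None for t in occurrence_times):
        return False
    return all(a < b for a, b in zip(occurrence_times, occurrence_times[1:]))
```

For example, `gate_output("COMBINATION", [True, True, False], n=2)` evaluates a 2-of-3 gate under the "at least n" assumption.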
Figure: Illustration of the Inhibit Gate
O-RING FAILURE (G002)
  EXISTENCE OF TEMPERATURE T (B004)
  T < T(critical) (B003)
Figure: Transfer Gates (TRANSFER IN and TRANSFER OUT symbols)
Review Questions
1. What is a FT constructed as part of the resolution
process?
2. What is the basic paradigm of FTA?
3. Can the top event be a system success?
4. Can any relation be expressed by AND and OR gates?
5. Can the FT be terminated at events more general than
basic component failures?
6. Can a FT be developed to a level below a basic
component level, e.g. to a piecepart level?
7. Can an intermediate or basic event in the fault tree
consist of non-failure of a component?
Developing the Fault Tree
1. Define the top event as a rectangle
2. Determine the immediate necessary and sufficient
events which result in the top event
3. Draw the appropriate gate to describe the logic for
the intermediate events resulting in the top event
4. Treat each intermediate event as an intermediate
level top event
5. Determine the immediate, necessary and sufficient
causes for each intermediate event
6. Determine the appropriate gate and continue the
process
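The stepwise procedure can be mirrored in a small data structure. The sketch below is illustrative only: the `Gate` and `BasicEvent` classes and the `basic_events` helper are hypothetical names, and the gate kinds chosen for the example tree (OR at the top, AND below) are assumptions rather than something stated in the text.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BasicEvent:
    name: str          # e.g. "B001"
    description: str   # the fault statement for the basic cause


@dataclass
class Gate:
    name: str          # e.g. "G001"
    description: str   # the fault statement written in the event box
    kind: str          # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def basic_events(node: Union[Gate, BasicEvent]) -> List[str]:
    """Collect the basic event names below a node, in first-seen order."""
    if isinstance(node, BasicEvent):
        return [node.name]
    names: List[str] = []
    for child in node.inputs:
        for name in basic_events(child):
            if name not in names:
                names.append(name)
    return names


# Step 1: define the top event (gate kind is an assumption).
top = Gate("G001", "Q occurs", "OR")

# Steps 2-3: immediate causes of the top event and the gate linking them.
a_then_b = Gate("G002", "A occurs and then B occurs", "AND")
b_then_a = Gate("G003", "B occurs and then A occurs", "AND")
top.inputs = [a_then_b, b_then_a]

# Steps 4-6: treat each intermediate event as a new top event and repeat.
a_then_b.inputs = [
    BasicEvent("B001", "A occurs"),
    BasicEvent("B002", "B occurs given the occurrence of A"),
]
b_then_a.inputs = [
    BasicEvent("B003", "B occurs"),
    BasicEvent("B004", "A occurs given the occurrence of B"),
]
```

Calling `basic_events(top)` then lists the events at which development stopped.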
Advice in Developing the Fault Tree
The system being analyzed for the undesired event
needs to be studied and understood before the fault
tree is constructed
If an electrical or hydraulic system is being
analyzed, the fault tree is constructed by tracing the
causes upstream in the circuit to the basic causes
For a generalized network or flow, the fault tree is
similarly constructed by upstream tracing of the
causes
Remember the Four Key Attributes of a
Fault Tree
Top Event- What specific event is being analyzed?
Boundary- What is inside and outside the analysis?
Resolution- What are the primary causes to be
resolved to?
Initial State- What is assumed for the initial conditions
and states?
Defining the Boundary and Resolution of
the Fault Tree
The boundary defines what is inside the analysis
and what is outside the analysis
The resolution defines the basic causes to be
resolved
The boundary defines the interfaces to be included
or excluded
The resolution defines what types of events are
modeled
Examples of Boundary Definitions
All components shown in a system schematic with
detailed system specifications
All major systems identified to comprise an enterprise
with detailed system descriptions and their interfaces
The individual steps defined in a process with the
detailed process description
The individual processes involved in transforming given
inputs into a finished product with detailed descriptions
A software description including coding, flow charts,
and detailed descriptions
Examples of Resolution Definitions
Resolve basic causes to major components in the system
with descriptions of the included components
Resolve basic causes to individual tasks in a process with
specific listing of the tasks to be included
Resolve basic causes to major system components,
including interfaces among the systems, with detailed
descriptions of the components and interfaces
Resolve the basic causes of software failure to the
individual statements in the software program
Resolve basic causes to major components in the system
but do not include interfaces to the system
The Initial State for the Fault Tree
The initial state for the FTA defines the initial states of
components, initial conditions, and initial inputs assumed
The initial states for the components involve what
components are assumed to be initially operational
The initial state can also involve the past history
description of the component
Initial conditions include assumed environments and
operational conditions
Initial inputs include assumed initial commands, assumed
failures existing, and assumed events that have occurred
A Fault Tree Distinguishes Faults
Versus Failures
The intermediate events in a fault tree are called faults
The basic events, or primary events, are called failures
if they represent failures of components
It is important to clearly define each event as a fault
or failure so it can be further resolved or be identified
as a basic cause
Write the statements that are entered in the event boxes as faults; state
precisely what the fault is and the conditions under which it occurs. Do not
mix successes with faults.
A Fault Tree Distinguishes a Component
Fault From System Fault
For each event, ask the question whether the
fault is a state of component fault or a state
of system fault.
The answer determines the type of gate to
construct
If the answer to the question "Is this fault a component failure?" is Yes,
classify the event as a state of component fault. If the answer is No,
classify the event as a state of system fault.
Component Fault Versus System Fault
(Continued)
For a state of component fault the component
has received the proper command
For a state of system fault the proper
command may have not been received or an
improper command may have been received
The event description needs to clearly define
the conditions to differentiate these different
faults
Gates for Component Versus System
Faults
For a state of component fault use an OR gate if the
fault is not a failure (basic event)
For a state of system fault the gate depends on the
event description
If the fault event is classified as state of component, add an OR-gate below the event
and look for primary, secondary and command failure modes. If the fault event is
classified as state of system, look for the minimum necessary and sufficient immediate
cause or causes. A state of system fault event may require an AND-gate, an OR-gate,
an INHIBIT-gate, or possibly no gate at all. As a general rule, when energy originates
from a point outside the component, the event may be classified as state of system.
Example of Component Versus System Faults
OPERATING STATE FAULT                                         CLASSIFICATION
Switch fails to close when thumb pressure is applied.         State of component
Switch inadvertently opens when thumb pressure is applied.    State of component
Motor fails to start when power is applied to its terminals.  State of component
Motor ceases to run with power applied to terminals.          State of component
STANDBY STATE FAULT                                           CLASSIFICATION
Switch inadvertently closes with no thumb pressure applied.   State of component
Motor inadvertently starts.                                   State of system
Primary Failure Versus Secondary Failure
A failure can be further resolved into a primary
failure OR secondary failure
A primary failure is a failure within design
environments
A secondary failure is a failure outside design
environments
Usually secondary failures are not included
unless abnormal conditions are modeled
If secondary failures are included then the
secondary failure is resolved into the abnormal
condition existing AND the failure occurring
Figure: A Primary-Secondary Failure Gate
Boxes shown: Primary failure under normal environment; Secondary
failure under abnormal environment; Abnormal condition exists
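The primary-secondary structure (a failure is a primary failure OR a secondary failure, and a secondary failure is the abnormal condition existing AND the failure occurring) can be turned into a small probability calculation. This is a rough sketch under an added assumption of independent inputs; the function name and all parameter names are hypothetical.

```python
def failure_probability(p_primary: float,
                        p_abnormal: float,
                        p_failure_given_abnormal: float) -> float:
    """Probability of failure from a primary-secondary failure gate.

    Assumes the primary failure and the secondary failure are independent
    (an assumption made for this sketch, not stated in the source).
    """
    # AND gate: abnormal condition exists AND the failure occurs under it.
    p_secondary = p_abnormal * p_failure_given_abnormal
    # OR gate: primary failure OR secondary failure.
    return 1.0 - (1.0 - p_primary) * (1.0 - p_secondary)
```

With `p_abnormal` set to zero the result reduces to the primary failure probability, which matches the guideline of leaving secondary failures out when abnormal conditions are not modeled.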
Secondary Failure Modeling Guidelines
Include a secondary failure when an abnormal
environment is of specific focus
Include a secondary failure when an abnormal
environment can have a non-negligible probability
of existing
Otherwise, as a general rule, do not include
secondary failures in the fault tree since they can
greatly compound the complexity of the fault tree
The No Miracle Rule
Do not assume abnormal conditions will occur to
prevent a fault from propagating
In particular, do not assume a failure of another
component will occur to prevent a fault from
propagating
If the normal functioning of a component propagates a fault sequence, then
it is assumed that the component functions normally.
Naming Schemes For the Fault Tree
Each Gate and Event on the Fault Tree needs to
be named
The Name should ideally identify the Event Fault
and the What and When Conditions
Software packages have default names that can
be used but are not descriptive
Basic events should in particular be named to
identify the failure mode
What is important is that the same event be given
the same name if it appears at different locations
Example of Simple Naming Scheme
Component  Component
Type       Failure Mode  Description
HX         F             Heat Exchanger Cooling Capability Fails
HX         J             Heat Exchanger Tube Rupture
HX         P             Heat Exchanger Plugs
IN         F             Inverter No Output
IR         F             Regulating Rectifier No Output
IV         F             Static Voltage Regulator No Output
LC         D             Logic Circuit Fails to Generate Signal
LS         D             Level Switch Fails to Respond
LS         H             Level Switch Fails High
LS         L             Level Switch Fails Low
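A simple naming scheme of this kind can be enforced with a lookup table, so that the same failure mode always receives the same name. The sketch below is illustrative: the `event_name` helper, the component identifier argument, and the hyphen-separated name format are all hypothetical and are not taken from the slides.

```python
# (component type, failure mode) -> description, copied from the table above.
FAILURE_MODES = {
    ("HX", "F"): "Heat Exchanger Cooling Capability Fails",
    ("HX", "J"): "Heat Exchanger Tube Rupture",
    ("HX", "P"): "Heat Exchanger Plugs",
    ("IN", "F"): "Inverter No Output",
    ("IR", "F"): "Regulating Rectifier No Output",
    ("IV", "F"): "Static Voltage Regulator No Output",
    ("LC", "D"): "Logic Circuit Fails to Generate Signal",
    ("LS", "D"): "Level Switch Fails to Respond",
    ("LS", "H"): "Level Switch Fails High",
    ("LS", "L"): "Level Switch Fails Low",
}


def event_name(component_type: str, component_id: str, mode: str) -> str:
    """Build a basic event name; the TYPE-ID-MODE format is hypothetical."""
    if (component_type, mode) not in FAILURE_MODES:
        raise KeyError(f"unknown failure mode {mode!r} for type {component_type!r}")
    return f"{component_type}-{component_id}-{mode}"


def describe(name: str) -> str:
    """Recover the failure mode description from a name built by event_name."""
    component_type, _component_id, mode = name.split("-")
    return FAILURE_MODES[(component_type, mode)]
```

Because the name is derived from the component and its failure mode, an event that appears at several locations in the tree gets an identical name each time.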
More Complex Naming Schemes
Labels recovered from the figure: Subsystem, Component ID, Component
Type, Mode, PRA Failure Data
LH2: Valve, 17" Disconnect fails to close
  A_O_LH2_DISCVL_FTCM
  A_O_LH2_DISCVL_FTCE
  A_O_LO2_DISCVL_FTCE
LDS: Actuator, hyd uplock jams
  E_O_LDS_ACTLUL_JAM
  E_O_LDS_ACTRUL_JAM
  E_O_LDS_ACTNUL_JAM
Advice in Defining Ground Rules for an FTA
1. For FT quantification, model to the highest level for
which data exists and for which there are no
common hardware interfaces
2. Do not generally model wire faults because of their
low failure rates
3. Do not generally model piping faults because of
their low failure rates
4. Do not further develop an AND gate with three
independent inputs if there are lower order
contributing combinations
5. Do not further develop an event to an OR gate if
there are higher probability input events
The Fault Tree Versus the Ishikawa
Fishbone
A fault tree is sometimes erroneously thought to be an
example of an Ishikawa Fishbone Model
The fishbone is a loosely-structured, brain-storming
tool for listing potential causes of an undesired event
Fault tree analysis is a stepwise formal process for
resolving an undesired event into its immediate causes
The fault tree displays the stepwise cause resolution
using formal logic symbols
Figure: The Ishikawa Fishbone Diagram
The Undesired Event sits at the head of the diagram. The branches are
labeled Methodology, Environment, Management, Machines, Material, and
Personnel, and carry Fault and Attribute entries.
Review Questions
1. What is the basic paradigm of FTA?
2. How is FTA different from a Fishbone Model?
3. Can all relations be expressed by AND and OR gates?
4. What are the four key attributes of an FTA?
5. What is the difference between a fault and a failure as
defined in FTA? Is this distinction used in other areas?
6. How is a state of component fault modeled?
7. Why can't there be more definite rules for modeling a
state of system fault?
Mono-propellant Propulsion System
A mono-propellant propulsion system provides an
example for FTA
The system is pressure fed and provides thrust for a
vehicle while in orbit
Additional support systems are not considered
Different fault trees can be constructed depending
on the failure to be modeled
Defining the FT Key Attributes for the
Monopropellant System Fault Tree
Top Event: Defined based on the specific system
failure mode to be analyzed.
Boundary: Extracted from the system logic diagrams.
Resolution: Include the major components in the
system diagram. Do not include wiring faults.
Initial State: Dependent on the system failure mode to
be analyzed.
Figure: Monopropellant Propulsion System, System Schematic and Boundaries
Labeled components: inert gas pressurization tank TK1; inert gas
isolation valve IV1; inert gas pressure regulator RG1; inert gas check
valve CV1; propellant tank with bladder PT1; thruster isolation valve
IV2; thrust chamber inlet valve IV3; catalyst; relief valves RV1, RV2,
RV3, RV4; relays K1, K2, K3, K4, K5; timer relay K6; switches S1, S2, S3
System Components for the FTA
TK1  Propellant Storage Tank
PT1  Propellant Tank 1
RV1  Relief Valve 1
RV2  Relief Valve 2
RV3  Relief Valve 3
RV4  Relief Valve 4
IV1  Isolation Valve 1
IV2  Isolation Valve 2
IV3  Isolation Valve 3
RG1  Regulator 1
CV1  Check Valve 1
K1   Arming Relay K1
K2   Firing Protection Relay
K3   Arming Relay
K4   Firing Relay
K5   Firing Relay
K6   Timing Relay
S1   Arming Switch
S2   Firing Switch
S3   Emergency Cutoff Switch
System Description: Basic Operation
The system consists of a reservoir TK1 of inert gas that is fed through an isolation valve IV1 to a
pressure regulator RG1. The pressure regulator RG1 senses pressure downstream and opens or
closes to hold the pressure at a constant level. A check valve, CV1, allows passage of the inert
gas to the Propellant Tank PT1. A bladder that collapses as propellant is depleted separates the
inert gas from the propellant. Propellant is forced through a feed line to the Thruster
Isolation Valve IV2 and then to the Thrust Chamber Inlet Valve IV3. For the thruster to fire, the
system must first be armed by opening IV1 and IV2. After the system is armed, a command is
sent to IV3 to open, allowing H2O2 into the thrust chamber. As the propellant passes over the
catalyst, it decomposes, producing byproducts, heat, and the expanding gas that creates the
thrust. The relief valves RV1-4 are available to dump propellant overboard should an
overpressure condition occur in any part of the system.
System Description: Arming and Thrust
The electrical command system controls the arming and thrusting of the propellant
system. To arm, switch S1 is momentarily depressed, allowing electromotive force
(emf) to activate relay switches K1, K2 and K3, and open valves IV1 and IV2. K1
closes and sustains the emf through the arming circuit. K2 momentarily opens to
preclude the inadvertent firing of the system during the transition to the armed mode,
and closes when S1 is released. K3 closes in the firing circuit. The system is now
armed with power supplied to sustain IV1 and IV2 in the open position. When firing
switch S2 is momentarily depressed, K4 closes sustaining the firing circuit. K5 closes
completing the circuit for K6, which begins timing to a predetermined time for the
thruster to fire. The completed circuit opens IV3 and thrusting begins.
System Description: Termination of
Thrusting
When K6 times out, it momentarily opens breaking the arming circuit and opening K1.
Power is removed from the IV1 and IV2 relays and both valves are spring-loaded
closed. K3 opens breaking the firing circuit, which opens K4 and K5. IV3 is spring-loaded
closed, and the system is now in the dormant mode. Should K6 fail and
remain closed after timing out, the system can be shut down manually by depressing S3,
which breaks the arming circuit, opening K1 and closing IV1 and IV2. The firing
circuit relay switch K3 will open breaking the firing circuit, which causes K4 and K5 to
open. When K5 opens, IV3 will be spring-loaded closed, and the system will be in the
dormant mode.
Summary of System Operation
1. Depress Arming Switch S1. Relays K1 and K3 are
energized and close. This results in Isolation Valves
IV1 and IV2 opening. Propellant is consequently
supplied up to Isolation Valve IV3. Relay K2 briefly
opens to preclude inadvertent firing and closes when
S1 is released.
2. Depress Firing Switch S2. Relays K4 and K5 are
energized and close. Isolation Valve IV3 opens and
thrusting begins. The closure of K5 initiates the Timing
Relay K6, which times out after a given period, opening
the relay. The arming circuit is de-energized, closing
the spring-loaded Isolation Valves IV1 and IV2.
Propellant supply stops and the thrusting
stops. Manual Switch S3 is a backup for emergency shutdown.
State transition diagram
Component     Dormant  Armed       Thrust           Emergency Shutdown
RV1, 2, 3, 4  Closed   Closed      Closed           Closed
IV1           Closed   Open        Open             Closed
RG1           As is    Regulating  Regulating       As is
CV1           Closed   Closed      Open             Closed
IV2           Closed   Open        Open             Closed
IV3           Closed   Closed      Open             Closed
S1            Open     Open        Open             Open
S2            Open     Open        Open             Open
S3            Closed   Closed      Closed           Closed
K1            Open     Closed      Closed           Open
K2            Closed   Closed      Closed           Closed
K3            Open     Closed      Closed           Open
K4            Open     Open        Closed           Open
K5            Open     Open        Closed           Open
K6            Closed   Closed      Closed (timing)  Closed
Transition to Armed Mode: S1 momentarily closed; K1 closes; K2
momentarily opens; K3 closes; IV1 opens; IV2 opens
Transition to Thrust Mode: CV1 opens; IV3 opens; K4 closes; K5 closes;
K6 timing
Transition to Dormant Mode: K6 opens momentarily (times out); IV1
closes; CV1 closes; IV2 closes; IV3 closes; K1 opens; K3 opens; K4
opens; K5 opens
Transition to Emergency Shutdown Mode: S3 momentarily opened; IV1
closes; CV1 closes; IV2 closes; IV3 closes; K1 opens; K3 opens; K4
opens; K5 opens
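The mode definitions in the state transition diagram can be checked mechanically by recording each mode as a table of component states and comparing modes. This is a minimal sketch: the dictionary layout and the `changed` helper are hypothetical, and momentary actions (S1, S2, S3, K2, and the K6 time-out) are not modeled because they return to their resting state.

```python
# Component states per mode, transcribed from the state transition diagram.
DORMANT = {
    "RV1-4": "closed", "IV1": "closed", "RG1": "as is", "CV1": "closed",
    "IV2": "closed", "IV3": "closed", "S1": "open", "S2": "open",
    "S3": "closed", "K1": "open", "K2": "closed", "K3": "open",
    "K4": "open", "K5": "open", "K6": "closed",
}
ARMED = {**DORMANT, "IV1": "open", "RG1": "regulating", "IV2": "open",
         "K1": "closed", "K3": "closed"}
THRUST = {**ARMED, "CV1": "open", "IV3": "open", "K4": "closed",
          "K5": "closed"}  # K6 stays closed while timing
EMERGENCY_SHUTDOWN = dict(DORMANT)

MODES = {
    "dormant": DORMANT,
    "armed": ARMED,
    "thrust": THRUST,
    "emergency_shutdown": EMERGENCY_SHUTDOWN,
}


def changed(before: str, after: str) -> dict:
    """Return {component: (old_state, new_state)} for states that differ."""
    old, new = MODES[before], MODES[after]
    return {c: (old[c], new[c]) for c in old if old[c] != new[c]}
```

For instance, `changed("armed", "thrust")` lists CV1, IV3, K4, and K5, which are the components named in the transition to thrust mode apart from the K6 timer.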
Fault Tree Construction Step 1
Top Event Structure for Thruster Supplied with
Propellant After Thrust Cutoff
THRUSTER IS SUPPLIED WITH PROPELLANT AFTER THRUST CUTOFF (G1)
  ISOLATION VALVE IV3 REMAINS OPEN AFTER CUTOFF
  ISOLATION VALVE IV2 REMAINS OPEN AFTER CUTOFF
Fault Tree Construction Step 2
THRUSTER SUPPLIED WITH PROPELLANT AFTER THRUST CUTOFF (G1)
  ISOLATION VALVE IV3 REMAINS OPEN AFTER CUTOFF (G2)
    EMF CONTINUES TO BE SUPPLIED TO IV3 AFTER CUTOFF
    PRIMARY FAILURE OF IV3 TO CLOSE AFTER CUTOFF (E2)
  ISOLATION VALVE IV2 REMAINS OPEN AFTER CUTOFF
Continue Development of the Fault Tree
for the Top Event Thruster Supplied with
Propellant after Thrust Cutoff
Figure: Example of Fishbone for the Monopropellant Example
Undesired event: Thruster Supplied with Propellant after Time Out
Branches: Machines, Material, Personnel, Methodology, Environment,
Management
Entries shown on the branches: Isolation Valves; Relays and Switches;
Valve Material; Switch Metal; Valve Internals; Switch Contacts; Training;
Skill Level; Propellant Volume; Time out time; Lack of focus on safety;
Failure, Inadequate Strength; Failure, Inadequate Fatigue; Lack of
Specification, Clogged; Lack of Specification, Corroded
Treatment of Human Errors in FTA
Human errors are classified into two basic types-
errors of omission and errors of commission
An error of omission is not doing a correct action
An error of commission is doing an incorrect action
Human errors are modeled as basic events in a FT,
similarly to component failures
Human errors need to be considered whenever a
human interfaces with the component or system
The failure modes need to be expanded to include
failure induced by the human
Test and maintenance related errors
= A + B Intersection Complementation
Figure: Sample Fault Tree for Boolean Analysis
Q OCCURS (G001)
  B1 OR B2 OCCURS (G002)
    B1 OCCURS (B001)
    B2 OCCURS (B002)
  B2 OR B3 OCCURS (G003)
    B2 OCCURS (B002)
    B3 OCCURS (B003)
Problem: Determine the Minimal
Cutsets of the Sample Fault Tree
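One way to work the problem is a top-down expansion with absorption, sketched below in Python. The gate kinds for G002 and G003 follow from their descriptions; the kind of the top gate G001 is not recoverable from the extracted figure, so it is assumed here to be an AND gate. The function names and the dictionary encoding of the tree are hypothetical.

```python
from itertools import product

# Sample tree; G001 = AND is an assumption, see the note above.
GATES = {
    "G001": ("AND", ["G002", "G003"]),
    "G002": ("OR", ["B001", "B002"]),
    "G003": ("OR", ["B002", "B003"]),
}


def minimize(sets):
    """Absorption law: drop any cut set that strictly contains another."""
    sets = set(sets)
    return {s for s in sets if not any(other < s for other in sets)}


def cut_sets(node, gates):
    """Return the minimal cut sets of a node as a set of frozensets."""
    if node not in gates:              # basic event
        return {frozenset([node])}
    kind, inputs = gates[node]
    child_sets = [cut_sets(i, gates) for i in inputs]
    if kind == "OR":                   # union of the inputs' cut sets
        combined = set().union(*child_sets)
    elif kind == "AND":                # distributive law: one set per input
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"unsupported gate kind: {kind}")
    return minimize(combined)


result = cut_sets("G001", GATES)
# Under the AND assumption: {B002} and {B001, B003}
```

The repeated event B002 is what makes absorption matter here: the products B001·B002 and B002·B003 are absorbed by the single-event cut set B002.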
The Minimal Cut Set Equation (Sum of
Products) for the Monopropellant Tree
Applying the Distributive Law and Laws of
Absorption to the Top Event Equation in
terms of the Basic Events:
G1 = E6E7 + E6E8 + E5E7 + E5E8 + E1E3 +
E1E4 + E1E2
Description of the Minimal Cutsets of the
Monopropellant Tree
E6 E7 = Primary Time Out Failure of K6 AND Operational Fail to Open of S3
E6 E8 = Primary Time Out Failure of K6 AND Primary Fail to Open of S3
E5 E7 = Primary Fail to Open of K6 AND Operational Fail to Open of S3
E5 E8 = Primary Fail to Open of K6 AND Primary Fail to Open of S3
E1 E3 = Primary Fail to Close of IV2 AND Primary Fail to Open of K5
E1 E4 = Primary Fail to Close of IV2 AND Primary Fail to Open of K3
E1 E2 = Primary Fail to Close of IV2 AND Primary Fail to Close of IV3
Review Questions
1. Why are the minimal cutsets important?
2. How can the minimal cutsets be obtained for any of
the intermediate faults of the fault tree?
3. Why are the minimal cutsets ordered by their size?
4. How can the minimal cutsets be used to check
given design criteria, such as having no single
failure cause?
5. What can be concluded from the minimal cutsets of
the monopropellant fault tree?
Minimal Cutset Quantification Formulas
$P(T) = P(M_1 + M_2 + \cdots + M_N)$, where + = Logical OR
$P(T) \approx \sum_k P(M_k)$: Sum of Minimal Cutset Probabilities (Rare
Event Approximation)
$P(M) = P(E_1) P(E_2) \cdots P(E_M)$: Product of Independent Basic Event
Probabilities
T = top event
M = minimal cutset
E = basic event
Basic Formulas for Primary Event Probabilities
(P(E))
Failure probability for a non-repairable component (or event)
$P = 1 - \exp(-\lambda T) \approx \lambda T$
$\lambda$ = component failure rate
$T$ = exposure time
Failure probability for a repairable component
$P = \lambda\tau / (1 + \lambda\tau) \approx \lambda\tau$
$\lambda$ = component failure rate or event rate
$\tau$ = repair time
Constant failure probability for a component
$P = c$
$c$ = constant probability (e.g., per demand)
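The three basic formulas can be written as short Python functions, which also makes it easy to compare the exact expressions with their approximations. In this sketch `lam` stands for the failure rate, `T` for the exposure time, and `tau` for the repair time. The function names are hypothetical, and the multi-demand case assumes independent demands, which the slides do not state.

```python
import math


def p_nonrepairable(lam: float, T: float) -> float:
    """Exact failure probability 1 - exp(-lam * T) for exposure time T."""
    return 1.0 - math.exp(-lam * T)


def p_repairable(lam: float, tau: float) -> float:
    """Steady-state probability lam*tau / (1 + lam*tau) for repair time tau."""
    return lam * tau / (1.0 + lam * tau)


def p_demand(c: float, n_demands: int = 1) -> float:
    """Probability of at least one failure in n demands.

    Assumes independent demands with the same per-demand probability c
    (an assumption of this sketch).
    """
    return 1.0 - (1.0 - c) ** n_demands


# Compare exact value and first-order approximation lam * T
lam, T = 1.0e-6, 100.0          # hypothetical values: per hour, hours
exact = p_nonrepairable(lam, T)
approx = lam * T
```

For small products such as `lam * T = 1e-4`, the exact value and the approximation agree closely, which is the regime the approximations are intended for.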
Details of Formulas: $P = 1 - \exp(-\lambda T) \approx \lambda T$
$\lambda$ is the constant component failure rate, e.g., no aging,
which is used as a first order approximation. For
extreme time dependency, Weibull, etc., can be used
$\lambda$ depends on the failure mode and environment
For an operating (standby) component $\lambda$ is the operating
(standby) failure rate
The approximation shown above is valid to two
significant figures for failure probabilities less than 0.1
The failure exposure time T is the time during which the
failure can occur and result in a higher fault
Software packages compute the exact formula
Details of Formulas: $P = \lambda\tau / (1 + \lambda\tau) \approx \lambda\tau$
$\tau$ is the average detection plus repair time for the
failure
$\tau$ depends on the detection and repair process
The above formula is a steady state formula which is
generally applicable for times significantly greater
than $\tau$
Since $\lambda\tau$ is generally much smaller than one, the above
approximation is generally valid to two significant
figures
Software packages calculate the exact formula
Details of Formulas: P = c
The constant probability model is used when
applicable probabilities are available
The constant probability model is used when c is the
probability per demand, which is called a demand
failure rate
Demand failure rates apply to components starting or
changing state, e.g., relays, circuit breakers,
engines starting
Human error rates are expressed as a probability c of
human error per action
Examples of Component Failure Rates
                        PRA Failure         Generic Failure     Failure Rate  Generic Failure Rate
Component Type          Mode                Mode                Units         Median    Mean
Assembly, Gyro Rate     no output           no output           per hour      4.E-06    1.E-05
Assembly, star tracker  no output           no output           per hour      1.E-06    3.E-06
Body Flap               sticking            sticks              per hour      1.E-06    3.E-06
Body Flap               structural failure  structural failure  per hour      1.E-07    3.E-07
Brake                   fails to close      fail to close       per demand    1.E-05    3.E-05
Additional Examples of Component Failure Rates
                     PRA Failure     Generic Failure  Failure Rate  Generic Failure Rate
Component Type       Mode            Mode             Units         Median    Mean
Sensor, temp         fails hi        fail high        per hour      4.E-07    1.E-06
Sensor, temp         fails low       fail low         per hour      4.E-07    1.E-06
Pump, hyd            fails to run    fail to operate  per hour      1.E-06    3.E-06
Pump, hyd            fails to start  fail to start    per demand    2.E-04    3.E-04
Valve, bypass pneu   fails to open   fail to open     per demand    2.E-04    3.E-04
Valve, bypass pneu   transfers open  transfer open    per hour      1.E-06    3.E-06
Steps in Quantifying Component Failure
Probabilities
1. Identify the specific component failure mode
2. Determine whether the failure is time-related or demand-
related
3. Determine the environment e.g., ground or air
4. Select the appropriate failure rate value
5. For a time-related failure determine the exposure time
6. For a time-related failure, if the failure is repairable
determine the repair time
7. For a demand-related failure, determine the number of
demands if greater than 1
8. Input into the software package or if a manual
evaluation use the appropriate formula to quantify
Monopropellant Component Failure Data
                                  Basic Event
                                  Fault Tree                                           Failure
Component  Component Type         Symbols      Failure Mode                             Probability
IV         Isolation Valve        E1 E2        Failure to close when EMF is removed     2 E-04
K          Relay Switch Contacts  E3 E4 E5     Failure to return when EMF is removed    3 E-03
K6         Timer Relay            E6           Failure to time out                      2 E-02
S          Manual Switch          E7           Operational failure to open Switch       1 E-02
S          Manual Switch          E8           Failure of Switch to open when operated  5 E-05
Quantification of the Minimal Cutsets for
the Monopropellant Tree
E6 E7 = Primary Time Out Failure of K6 AND Operational Fail to Open of S3 = 2E-02 * 1E-02 = 2E-04
E6 E8 = Primary Time Out Failure of K6 AND Primary Fail to Open of S3 = 2E-02 * 5E-05 = 1E-06
E5 E7 = Primary Fail to Open of K6 AND Operational Fail to Open of S3 = 3E-03 * 1E-02 = 3E-05
E5 E8 = Primary Fail to Open of K6 AND Primary Fail to Open of S3 = 3E-03 * 5E-05 = 1.5E-07
E1 E3 = Primary Fail to Close of IV2 AND Primary Fail to Open of K5 = 2E-04 * 3E-03 = 6E-07
E1 E4 = Primary Fail to Close of IV2 AND Primary Fail to Open of K3 = 2E-04 * 3E-03 = 6E-07
E1 E2 = Primary Fail to Close of IV2 AND Primary Fail to Close of IV3 = 2E-04 * 2E-04 = 4E-08
G1 = 2E-04 + 3E-05 + 1E-06 + 6E-07 + 6E-07 + 1.5E-07 + 4E-08 = 2.3E-04
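The hand calculation can be reproduced with a few lines of Python. The sketch below uses the basic event probabilities from the monopropellant failure data and the seven minimal cut sets, and applies the rare event approximation (sum of cut set products). The variable and function names are hypothetical.

```python
from math import prod

# Basic event probabilities from the monopropellant component failure data.
P = {
    "E1": 2e-4, "E2": 2e-4,               # isolation valves
    "E3": 3e-3, "E4": 3e-3, "E5": 3e-3,   # relay switch contacts
    "E6": 2e-2,                           # timer relay
    "E7": 1e-2,                           # operational failure of switch
    "E8": 5e-5,                           # primary failure of switch
}

# Minimal cut sets of the top event G1.
CUT_SETS = [
    ("E6", "E7"), ("E6", "E8"), ("E5", "E7"), ("E5", "E8"),
    ("E1", "E3"), ("E1", "E4"), ("E1", "E2"),
]


def cut_set_probability(cut_set, p):
    """Product of independent basic event probabilities."""
    return prod(p[event] for event in cut_set)


def top_event_probability(cut_sets, p):
    """Rare event approximation: sum of minimal cut set probabilities."""
    return sum(cut_set_probability(cs, p) for cs in cut_sets)


g1 = top_event_probability(CUT_SETS, P)
print(f"G1 = {g1:.1e}")  # rounds to 2.3e-04
```

Sorting the cut sets by `cut_set_probability` shows directly that the E6 E7 pair dominates the total.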
Interpretations of Quantitative Results
Basic event probabilities used for quantification
generally have large uncertainties
Thus, the quantified probability for the top event and
other results generally have large uncertainties
Quantitative results should therefore generally be
interpreted as showing the general range of the
value, e.g., the order of magnitude
Uncertainty evaluations are carried out to explicitly
show the associated uncertainty ranges
Relative contributions and importances obtained
from the fault tree generally have smaller
uncertainties
Using Generic Failure Data
Data bases provide generic failure data collected
from a variety of sources
This generic data needs to be screened for the
applicable failure mode and environment
Operational factors or environmental factors are
given to scale reference failure data
The generic data can also be updated using mission
specific data
Bayesian statistical approaches are used in this
updating to appropriately handle the information
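The slides do not say which Bayesian model is used. One common choice for updating a failure rate is the conjugate gamma-Poisson model, sketched below as an illustration only. The prior shape `alpha`, the generic mean rate, and the mission data (failures and exposure hours) are hypothetical values chosen for the example.

```python
def gamma_poisson_update(alpha: float, beta: float,
                         failures: int, exposure_hours: float):
    """Update a Gamma(alpha, beta) prior on a failure rate.

    alpha is the shape and beta the rate parameter (in hours), so the prior
    mean is alpha / beta. With Poisson-distributed failures observed over
    the exposure time, the posterior is Gamma(alpha + failures,
    beta + exposure_hours).
    """
    return alpha + failures, beta + exposure_hours


# Hypothetical generic data: mean failure rate of 3e-6 per hour.
prior_mean = 3.0e-6
alpha = 0.5                      # assumed prior shape (diffuse prior)
beta = alpha / prior_mean        # rate parameter that gives the prior mean

# Hypothetical mission-specific data: no failures in 100,000 hours.
post_alpha, post_beta = gamma_poisson_update(alpha, beta,
                                             failures=0,
                                             exposure_hours=1.0e5)
posterior_mean = post_alpha / post_beta
```

With no failures observed, the posterior mean moves below the generic mean; observed failures would pull it upward instead.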
Using Expert Opinion
For a variety of basic events, applicable data are not
available
Expert opinion and engineering judgment need thus
to be used to estimate the basic event data
The basis for the estimates needs to be documented
A sufficient range needs to be included with each
estimate to cover uncertainties
Sensitivity studies can be carried out to check the
impact of the estimates
Structured expert-elicitation approaches can be
used to increase the fidelity of the estimates
Review Questions
1. Can the sum of products quantification rule for the
top event be used for intermediate faults?
2. How is the failure exposure time changed for a
component tested or not tested before a launch?
3. How can a constant failure rate model be used to
approximate phases or time-dependencies?
4. How can quantification rules for a fault tree be
codified to obtain consistent results?
5. How can the quantitative results be used to check
the fault tree?
Three Basic Importance Measures Used
for Prioritization in FTA
FV Importance (Contribution Importance)- the relative
contribution to the top event probability from an event.
Risk Achievement Worth RAW (Increase Sensitivity,
Birnbaum Importance)- the increase in the top event
probability when an event is given to occur (probability set
to 1).
Risk Reduction Worth RRW(Reduction Sensitivity)- the
reduction in the probability of the top event when an event
is given to not occur (probability set to 0).
Calculation of the Importance
Measures
FV Importance = (Sum of min cut sets containing the event) / (Sum of all min cut sets)
RAW = (Top event probability with event probability set to unity) - (Top event probability)
RRW = (Top event probability) - (Top event probability with event probability set to zero)
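The three definitions above can be sketched in a few lines. This is a minimal illustration, assuming the top event probability is approximated by the sum of the minimal cut set probabilities; the tree, event names, and probabilities are hypothetical and are not the monopropellant example.

```python
from math import prod


def top_probability(cut_sets, p):
    """Sum of minimal cut set probabilities (sum-of-products approximation)."""
    return sum(prod(p[e] for e in cs) for cs in cut_sets)


def importance(cut_sets, p, event):
    """Return (FV, RAW, RRW) for one basic event, following the definitions above."""
    top = top_probability(cut_sets, p)
    containing = sum(prod(p[e] for e in cs) for cs in cut_sets if event in cs)
    fv = containing / top
    raw = top_probability(cut_sets, {**p, event: 1.0}) - top
    rrw = top - top_probability(cut_sets, {**p, event: 0.0})
    return fv, raw, rrw


# Hypothetical minimal cut sets and basic event probabilities.
cut_sets = [{"A", "B"}, {"A", "C"}, {"D"}]
p = {"A": 0.01, "B": 0.1, "C": 0.05, "D": 1e-4}

for event in sorted(p):
    fv, raw, rrw = importance(cut_sets, p, event)
    print(f"{event}: FV={fv:.4f} RAW={raw:.4g} RRW={rrw:.4g}")
```

In this sketch RAW and RRW are differences in probability, as in the definitions above; some tools report them as ratios instead.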
Basic Event Importance Measures for
the Monopropellant Example
Basic Event                      FV Importance   RRW (Reduction)   RAW (Increase)
Operational Fail to Open S3      0.993           0.993             0.023
Primary Time Out Failure of K6   0.867           0.867             0.01
Primary Fail to Open of K6       0.13            0.13              0.01
Primary Fail to Open of S3       0.005           0.005             0.023
Primary Fail to Close of IV2     0.003           0.003             0.003
Primary Fail to Open of K3       0.003           0.003             0.0002
Primary Fail to Close of IV3     0.0001          0.0001            0.0002
Questions on the Monopropellant
Illustration
1. Why is the importance of the Operational Failure of S3 so high?
2. Why is the importance of the Primary Failure of K6 so high?
3. Why is the importance of IV2 higher than that of IV3?
4. What components should be a focus of upgrades?
5. What is the potential improvement from such upgrades?
6. What components can be the focus for relaxations?
7. If the system fails, where should diagnosis be focused?
8. What possible changes can reduce the failure
probability?
9. What are other system failures (top events) that can be
analyzed?
Types of Uncertainty in FTA
Two types of uncertainty
Modeling uncertainty
Parameter uncertainty
Modeling uncertainty
Success and failure criteria assumed
Contributions excluded
Independence assumptions
Parameter uncertainty
Uncertainties in data values
Uncertainty Analyses in FTA
Modeling uncertainties are handled by listing
them and carrying out sensitivity analyses
Parameter uncertainties are handled by using a
probability distribution for each data value
Median value
Mean value
5% and 95% Bounds
Type of Distribution (e.g., Beta, Gamma, Lognormal)
FT Uncertainty Propagation
Probability distributions are assigned for each basic
event data value
Data values having the same estimate are identified
as being coupled
The probability distributions are then propagated
using Monte Carlo simulations
The probability distribution and associated
characteristics are determined for the top event
Median value
Mean value
5% and 95% Bounds
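The propagation steps above can be sketched with a small Monte Carlo loop. This is a minimal illustration, not a definitive implementation: it assumes lognormal distributions specified by a median and an error factor (95th percentile divided by median), treats the point values 2E-04 (valves) and 3E-03 (relays) as medians, uses a hypothetical error factor of 3, and couples events that share the same data value by giving them the same draw in each trial.

```python
import math
import random
import statistics


def lognormal_sampler(median, error_factor, rng):
    """Sampler for a lognormal given its median and error factor (95th / median)."""
    mu = math.log(median)
    sigma = math.log(error_factor) / 1.645  # 1.645 = standard normal 95th percentile
    return lambda: rng.lognormvariate(mu, sigma)


def propagate(cut_sets, samplers, event_to_parameter, trials=20000):
    """Sample the top event probability (sum of cut set products) `trials` times."""
    results = []
    for _ in range(trials):
        # One draw per data value; coupled events share the same draw.
        draw = {name: sample() for name, sample in samplers.items()}
        p = {event: draw[param] for event, param in event_to_parameter.items()}
        results.append(sum(math.prod(p[e] for e in cs) for cs in cut_sets))
    return results


rng = random.Random(12345)  # fixed seed so the run is repeatable
samplers = {
    "valve": lognormal_sampler(2e-4, 3.0, rng),
    "relay": lognormal_sampler(3e-3, 3.0, rng),
}
# E1, E2 share the valve data value; E3, E4 share the relay data value.
event_to_parameter = {"E1": "valve", "E2": "valve", "E3": "relay", "E4": "relay"}
cut_sets = [("E1", "E2"), ("E1", "E3"), ("E1", "E4")]

top = sorted(propagate(cut_sets, samplers, event_to_parameter))
pcts = statistics.quantiles(top, n=20)  # cut points at 5%, 10%, ..., 95%
print(f"median={statistics.median(top):.2e} mean={statistics.mean(top):.2e}")
print(f"5%={pcts[0]:.2e} 95%={pcts[-1]:.2e}")
```

Sampling the coupled values once per trial, rather than independently for each event, matters most for cut sets such as E1 E2 where the same data value is multiplied by itself.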
Validating an FTA
1. Select lower order minimal cutsets and validate that
they are failure paths
2. Obtain the minimal cutsets for an intermediate fault
and validate selections as failure paths
3. Obtain the success paths and validate selections as
true success paths
4. Review failure records and hazard reports to check
the coverage of the fault tree
5. Carry out sanity checks on the importance results
and probability results
Termination of a Fault Tree Revisited
Basic events that are resolved
AND gates with multiple, diverse independent
inputs (e.g. 4) when there are smaller failure
combinations and with no CCF contribution
Input events to an OR gate of low probability
compared to other inputs
Intermediate events with upper bound
screening probabilities that are determined to
have small contributions
Dynamic Fault Tree Analysis (DFTA)
DFTA is a term used to refer to analysis of a
system which dynamically responds to a
failure or a stimulus
A cold standby component activated by
another failure
A system configuration change due to a failure
A system configuration change responding to a
signal
Failures that occur in a particular sequence
Failure criteria that change for a new mission
phase
Example of a Dynamic System
[Figure: a Primary unit, a Standby unit, and a Switch. After Primary failure, switch to Standby.]
Outline of the FT for the Dynamic Example
Top event: Primary and Standby Fail
Input event: Primary fails and Switch fails before Primary
Input event: Primary fails and secondary fails and Switch does not fail before Primary
Basic events are as defined by the above events
Dynamic Events Can Be Handled by FTA
Each event is clearly described to include the
dynamic conditions
The basic events are defined including the dynamic
conditions
Standard AND and OR gates are used to describe
the general relational logic
The difference is that more complex quantification
formulas are used to incorporate the dynamic
conditions
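As an illustration of such a quantification formula, the sketch below evaluates the probability that the switch fails first and the primary then fails within the mission time. It assumes independent, exponentially distributed failure times, which is a modeling assumption made here for illustration; the rates and mission time are hypothetical. The closed form is checked against a simple numerical integration.

```python
import math


def prob_switch_fails_before_primary(rate_primary, rate_switch, mission_time):
    """P(switch fails first AND primary then fails within the mission time).

    Assumes independent, exponentially distributed failure times.
    Closed form of the integral over [0, T] of
        rate_p * exp(-rate_p * t) * (1 - exp(-rate_s * t)) dt.
    """
    total = rate_primary + rate_switch
    return (1.0 - math.exp(-rate_primary * mission_time)) - (
        rate_primary / total
    ) * (1.0 - math.exp(-total * mission_time))


def numeric_check(rate_primary, rate_switch, mission_time, steps=100000):
    """Midpoint-rule integration of the same integrand, as a sanity check."""
    dt = mission_time / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        acc += rate_primary * math.exp(-rate_primary * t) * (
            1.0 - math.exp(-rate_switch * t)
        ) * dt
    return acc


# Hypothetical rates (per hour) and mission time (hours).
p_exact = prob_switch_fails_before_primary(1e-3, 1e-4, 100.0)
p_numeric = numeric_check(1e-3, 1e-4, 100.0)
print(f"closed form = {p_exact:.4e}, numeric = {p_numeric:.4e}")
```

Note that the result is roughly half of the simple product of the two failure probabilities, because the sequence condition excludes the cases where the primary fails first.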
Special DFTA Software Can Be Used to
Expedite the FTA
When there are numerous or complex dynamics,
special DFTA software can be used
The DFTA software incorporates special gates to
show standby relations, a common supply, sequential
relations, or re-configurations
Markov analysis is used to quantify the dynamic
events
DFTA Exercise
Assume two processors share a common cold
spare
Develop the fault tree logic structure for the top
event : No Processing Capability
Determine the resulting minimal cutsets
Discuss how the minimal cutsets would be
quantified
Applications of FTA Revisited
Understanding of System Failure and Contributors
Identification of Design Features and Weaknesses
Evaluation of Tradeoffs
Prioritization of Contributors to Focus Actions
Comparison with a Goal
Minimization of Failure Probability
Diagnosing Causes of a Failure or an Incident
The Use of FTA to Understand System Failure
and its Contributors
The FTA logically traces a system failure to its
immediate causes
These immediate causes are traced to their
immediate causes, etc., until the basic component
failure causes are identified
This tracing of causes lays out the failure logic of
the system in terms of causal failures
A complete system failure mapping is thus obtained
FTA: Understanding/Communicating
Formal documentation of the system failure analysis
A structured tool for what-if analysis
A pictorial of failure progression paths to system
failure
A failure diagram of the system to be maintained
with the system drawings
A tool to extract information to communicate with
engineers, managers, and safety assessors
The Use of FTA to Identify Design Features and
Weaknesses
A single component minimal cutset identifies a
single event or single failure that can cause the top
event
A minimal cutset containing events which are of all
the same type has susceptibility to a single common
cause triggering the events
Minimal cutsets of significantly different size show
potential system imbalances
Minimal cutsets grouped according to given
features show corresponding design features
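These screening checks are easy to automate once the minimal cutsets are available. The following is a minimal sketch under stated assumptions: the cutsets, event names, and component-type tags are hypothetical, and "same type" is taken to mean an identical type tag.

```python
from collections import defaultdict

# Hypothetical basic events tagged with a component type.
component_type = {"V1": "valve", "V2": "valve", "K1": "relay", "S1": "switch", "P1": "pump"}
# Hypothetical minimal cutsets.
cut_sets = [{"P1"}, {"V1", "V2"}, {"V1", "K1"}, {"K1", "S1", "V2"}]

# Single-component cutsets: single failures that cause the top event.
single_failures = [cs for cs in cut_sets if len(cs) == 1]

# Multi-event cutsets whose events are all of one type: common cause candidates.
common_cause_candidates = [
    cs for cs in cut_sets
    if len(cs) > 1 and len({component_type[e] for e in cs}) == 1
]

# Group by size to expose imbalances between small and large cutsets.
by_size = defaultdict(list)
for cs in cut_sets:
    by_size[len(cs)].append(sorted(cs))

print("single failures:", [sorted(cs) for cs in single_failures])
print("common cause candidates:", [sorted(cs) for cs in common_cause_candidates])
print("sizes:", {size: len(group) for size, group in sorted(by_size.items())})
```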
The Fault Tree as a Master Logic Diagram
The Master Logic Diagram (MLD) is a fault tree identifying
all the hazards affecting a system or mission
The Master Logic Diagram can also be called a Master
Hazards Diagram (MHD)
The MLD or MHD is developed using fault tree logic
The basic events of a system MHD are the hazards that can
initiate component failures or increase their likelihood
The basic events of a mission MLD are the hazards that are
the initiating events of potential accident scenarios
The MLD Identified the Initiating Events in the
Space Shuttle PRA
The top event was Loss of Crew and Vehicle (LOCV)
LOCV was resolved into mission phase contributions
Each mission phase contribution was resolved into system
contributors
Each system contributor was resolved into initiating event
contributors
Dispositioning of the Initiating Events in the
PRA
The initiating events were labeled
Each initiating event was cross-referenced to hazards
identified in Hazard Analyses
Events were modified to be consistent with the Hazard
Analyses
Each event was dispositioned as to where it is modeled
or if not modeled then why
Structure of the MLD for the Space Shuttle PRA
Undesired Event: Probability of LOCV
Phase: Ascent, Orbit, Entry
Function: Loss of Structure, Loss of Flight Control, Loss of Habitat
System: SSME, SRB, ET, MPS, OMS, FCP, APU, ECLSS
Failure Types: Hazardous Events, Component Failures, Human Errors
Basic Events: Individual Hazardous Events, Individual Component Failures, Individual Human Errors
The Space Shuttle MLD Continued
[Figure: MLD with top event LOCV. Recoverable node labels, by group:]
LOCV MISSION-based; LOCV ABORT-based
LOCV During Pre-Launch; LOCV During Ascent; LOCV During Orbit; LOCV During Descent/Landing
Loss of Structure; Loss of Flight Control; Loss of Habitat Env
Fire/Explosion; Systems Events; External Events
LOCV During Ascent ABORT; LOCV During Orbital ABORT; LOCV During Landing ABORT
Further Development of the IE-MLD for Fires and Explosions on Ascent
[Figure: fault tree under the event "LOCV Due to Loss of Structural Integrity Caused by Fire/Explosion during Ascent" (LOCV-Ascent-LS-FirExp). Recoverable node labels:]
APU caused Fire/Explosion
APU exhaust leak damage
APU fuel leak damage
Fire/Explosion of STS during Separation
SRB caused Fire/Explosion
RSRM failures causing element Fire/Explosion
SRB System failure causing STS element Fire/Explosion
Foreign object damage
RSRM fails to maintain safe STS attitude/performance due to Thrust failure
Structure breakup of RSRM resulting in Fire/Exp of STS vehicle
RSRM structural failure causing Fire/Explosion in other STS elements
Structure failure of RSRM components
Fire/Explosion of other STS elements
RSS destruct command of STS due to element failure
MPS caused Fire/Explosion
Overpressurization due to MPS failure
MPS fuel leak
MPS H2 leak
MPS O2 leak
OMS/RCS caused Fire/Explosion
Overpressurization due to OMS failure
Overpressurization due to RCS failure
ET Fire/Explosion
ET failure causing element Fire/Explosion
SSME Fire/Explosion
FCP fuel leak
Orbiter I/F leakage
PRSD caused Fire/Explosion
Orbiter failure causing element Fire/Explosion
Cross-Reference of Hazard Reports with MLD Events
USA Hazard  MLD initial  Mission  System  PRA          Threatened  F/P  Type  Sev  Like
Number      event        Phase            Consequence  Function
ORBI 275    184          PAOD     ECLSS   LOCV         FC HE       P    EE    A    b
ORBI 339    221          D        ECLSS   LOCV         HE          P    SE    A    c
ORBI 511    231          AOD      ECLSS   LOCV         HE          -    SE    A    c
ORBI 117    135          PAOD     ECLSS   LOCV         FC HE       P    SE    A    d
ORBI 241    170          PAOD     ECLSS   LOCV         HE          P    SE    A    d
ORBI 321    208          D        ECLSS   LOCV         HE          P    SE    A    d
ORBI 254    176          O        ECLSS   Abort        HE          P    SE    B    d
ORBI 276    185          PAOD     ECLSS   Abort        HE          F    EE    B    d
ORBI 323    210          O        ECLSS   Abort        HE          P    SE    B    d
(The F/P, Type, Sev, and Like columns fall under the original Hazard and Probability headings.)
List of Accident Initiating Events Identified in the IMLD (MPS Related Initiators)
USA Hazard  MLD initial  Mission  System  PRA          Threatened  F/P  Type  Sev  Like  Individual Hazard Description
Number      event        Phase            Consequence  Function
INTG 006    4            PA       MPS     LOCV         SI          P    FE    A    c     Ignition of Flammable Atmosphere at the ET/Orbiter LH2 Umbilical Disconnect Assembly
INTG 009    6            P        MPS     LOCV         SI FC HE    F    FE    A    c     Isolation of the ET from the Orbiter MPS or SSMEs (17 inch valve bursts open under pressure from ET)
INTG 016    12           PA       MPS     LOCV         SI FC       P    FE    A    c     Ignition Sources Igniting Flammable Fluids in the Aft Compartment
INTG 019    390          A        MPS     LOCV         FC          F    SE    A    c     Premature shutdown of one or more SSMEs
INTG 020    18           A        MPS     LOCV         SI FC       P    FE    A    c     Hydrogen Accumulation in the Aft Compartment During Ascent
INTG 023    20           A        MPS     LOCV         SI FC       P    FE    A    c     Contamination in the Integrated Main Propulsion System (which clogs the system)
INTG 034    24           PA       MPS     LOCV         SI FC       P    FE    A    c     Autoignition in High Pressure Oxygen Environment (in MPS)
INTG 041    392          PA       MPS     LOCV         FC          F    FE    A    c     Loss of MPS/SSME He supply pressure
INTG 042    32           PA       MPS     LOCV         SI          P    SE    A    c     Turbopump Fragmentation During Engine Operation
INTG 112    48           AD       MPS     LOCV         SI FC       P    FE    A    c     H2/O2 Component Leakage During Ascent/Entry
INTG 112    49           AD       MPS     LOCV         SI FC       P    FE    A    c     H2/O2 Component Leakage During Ascent/Entry
INTG 168    81           PA       MPS     LOCV         SI FC       -    EE    A    c     Flammable Atmosphere in the ET Intertank (see 238)
ORBI 035    102          AD       MPS     LOCV         SI FC       P    FE    A    c     Hydrogen Accumulation in the Orbiter Compartments During RTLS/TAL Abort
ORBI 045    107          PAOD     MPS     LOCV         SI FC HE    P    FE    A    c     Ignition of Orbiter Fluids Entrapped in the TCS Materials (aft compartment)
ORBI 108    133          PAOD     MPS     LOCV         SI          P    SE    A    c     Overpressurization of the Orbiter Aft Fuselage Caused by the Failure of an MPS Helium Regulator or Relief Valve
ORBI 278    187          PAOD     MPS     LOCV         SI          P    SE    A    c     Loss of Structural Integrity Due to Overpressurization of the Mid and/or Aft Fuselage
ORBI 306    205          PA       MPS     LOCV         SI FC       P    FE    A    c     Fire/Explosion in the Orbiter Aft Compartment Caused by MPS Propellant Leakage/Component Rupture
ORBI 338    219          PA       MPS     LOCV         SI FC       P    FE    A    c     GO2 External Tank Pressurization Line as MPS/APU Ignition Source
ORBI 343    224          PA       MPS     LOCV         SI FC       P    FE    A    c     Fire/Explosion in the Orbiter Aft Compartment Caused by Contamination in the Main Propulsion System Feed System
INTG 085    44           P        MPS     LOCV         SI          P    FE    A    d     Ignition of Flammable Atmosphere at T-0 Umbilicals
INTG 089    45           PA       MPS     LOCV         SI          F    SE    A    d     Malfunction of the LH2 and LO2 T-0 Umbilical Carrier Plate Resulting in Damage to Shuttle Vehicle
INTG 153    71           P        MPS     LOCV         SI          P    EE    A    d     Potential Geysering in the LO2 Feed Line (Tsat = boiling point)
INTG 166    79           P        MPS     LOCV         SI FC       P    SE    A    d     Premature Separation of Orbiter T-0 Umbilical Carrier Plate
INTG 167    80           P        MPS     LOCV         SI FC       P    SE    A    d     Overpressurization of LO2 Orbiter Bleed System or LH2 Recirculation System
ME-FG3P,    346          PA       MPS     LOCV         SI          P    SE    A    d     geysering of LOX (MPS) (see 71)
ME-FG6S,    354          P        MPS     LOCV         SI          P    SE    A    d     abnormal thrust loads
ME-FG8M     356          A        MPS     LOCV         SI          P    SE    A    d     thrust oscillations leading to pogo (see 3)
ORBI 248    172          PAOD     MPS     LOCV         SI FC       P    FE    A    d     Fire/Explosion in GOX Pressurization System
ME-FA1S     310          P        MPS     -            SI FC       -    FE    C    c     hydrogen fire/explosion external to aft compartment (see 21)
(F/P, Type, and Sev fall under the original Hazard Category heading and Like under Prob Category. The original table also has FT/ET, Justification, Reference ESD Names, and Analyst Remarks columns; no separate entries for them are recoverable.)
The Shuttle PRA Process
[Figure: Initiating Event Logic Diagram linked to the Over-arching Event Trees]
ET-SEP/MPS Shutdown Accident Sequences
[Figure: event tree with headings PROPL-OK, MECO, ET-SEP, MPS-DUMP, END-STATES, and Freq., with Success and Failure branches. End states: OK; LOCV-DMP (Seq. 1); LOCV-ETSEP (Seq. 2); LOCV-MECO (Seq. 3). The three LOCV sequences are inputs to the fault tree "LOCV due to ETSEP/Shutdown Sequence failure" (ETSEP-SHUTDW-LOCV), which is the FT for Top Event #5 identified in the Over-Arching Mission Model. Side label: System/Element Level Model Integration.]
Extending a System Fault Tree to a
Master Hazard Diagram (MHD)
The top event is defined as a system failure event
The fault tree is developed to the basic component
level
Each component failure is further resolved into
hazards and conditions that can cause failure or
increase its likelihood
The resulting system MHD identifies the hazards
affecting the system and their consequences
Of particular importance are single failures and
hazards affecting multiple redundant components
Ranking the Criticality of Hazards Using
FTA
Each hazard is linked to a basic event or events on the
fault tree
Equivalently each hazard is linked to the basic events in
the minimal cutsets
The criticality of the hazard is the likelihood of the
hazard times the importance of the basic event
The component importance is determined from the FTA
The likelihood is determined from the hazard analysis
Hazard Criticality=Likelihood x Importance of
Components Affected
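The criticality relation above can be sketched as follows. This is only an illustration: the hazard names, likelihoods, and importance values are hypothetical, and summing the importances when a hazard affects several basic events is an assumption made here, not a rule stated above.

```python
def hazard_criticality(likelihood, affected_events, importance):
    """Likelihood of the hazard times the summed importance of the basic events it affects."""
    return likelihood * sum(importance[e] for e in affected_events)


# Hypothetical basic event importances (from the FTA).
importance = {"E1": 0.003, "E2": 0.0001, "E3": 0.867}
# Hypothetical hazards: (likelihood from the hazard analysis, basic events affected).
hazards = {
    "vibration": (0.1, ["E1", "E2"]),
    "contamination": (0.01, ["E3"]),
    "overheating": (0.05, ["E2", "E3"]),
}

# Rank hazards from most to least critical.
ranking = sorted(
    hazards,
    key=lambda name: hazard_criticality(*hazards[name], importance),
    reverse=True,
)
for name in ranking:
    print(f"{name}: {hazard_criticality(*hazards[name], importance):.5f}")
```

A less likely hazard can still rank high when it affects a basic event of high importance, as the hypothetical "contamination" entry shows.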
The Role of FTA in Mishap Analysis
The accident scenario is constructed for the mishap
System failures (pivotal events) are identified which
resulted in the mishap
A fault tree is constructed for each system failure to
resolve the basic events involved
For further root cause analysis a basic event is resolved
into the possible causes
The basic events (or root causes) are dispositioned
according to their plausibility or likelihood
FTA Applied for Software Assurance
FTA can be applied to a software program to
analyze the logic flow
FTA can be applied to software coding to analyze
detailed command and data transmittal
The same FT process as applied to hardware is
applied to software
A top event defines a particular software undesired
output or lack of output
The top event is resolved into immediate, necessary
and sufficient events for the top event
The resolution is traced back to software failures or
input failures
The Equivalent Monopropellant Software
Diagram
[Figure: block diagram with blocks labeled, in order, Command, Software Module, Enable/Disable Command, Software Module, Enable/Disable Command, and Initiation of Thrust]
FTA in Design
Top level fault trees are developed
Functional level
System level
Subsystem level
Tradeoffs are carried out
Alternative functional capabilities
Alternative redundancies
Allocations are performed
System requirement into subsystem requirements
Subsystem requirements into component
requirements
The Use of FTA to Evaluate Tradeoffs
Tradeoffs involve alternatives to design or operation
FTA evaluates alternatives by appropriately
modifying the FT
Changes in the top event results show the impact of
the alternatives
The changes can be qualitatively or quantitatively
evaluated
Monopropellant Design Tradeoff FTA
What would the benefit be of adding an additional,
redundant isolation valve in the fuel supply line?
What is the effect of replacing the manual
emergency switch S3 with an automatic timer relay?
What is the effect of removing the automatic timer
relay K6 circuit and having the relay K5 connect to
S3 which now becomes an automatic timer?
What is the effect of adding an additional timer relay
as a redundancy to K6?
The Use of FTA to Prioritize Contributors
Each basic event in the fault tree can be prioritized
for its importance to the top event
Different importance measures can be obtained for
different applications
Basic events are generally significantly different in
their importance providing effective prioritization
In addition to the basic events, every intermediate
event in the FT can be evaluated for its importance
Use of FTA to Compare with a Goal
FTA can be used to calculate a top event
probability that can be compared with a goal
Uncertainty analysis can be incorporated by
assigning each basic event an uncertainty
distribution
If the FTA is carried out according to defined
ground rules and meaningful data are available,
then this comparison can be meaningful
Use of FTA in Minimizing Failure Probability
The fault tree equations can be programmed to
handle different values for the failure probabilities,
failure rates, and repair times
Cost equations or resource equations can be
included to handle these constraints
The probability of system failure (represented as the
top event) can be optimized using available software
packages
Reducing the Probability of the
Monopropellant Failure to Terminate Thrust
What are the options for reducing the probability of
failure to terminate thrust in the monopropellant
example?
How do these options affect the probability of no
thrust for the other monopropellant example?
Are there options which reduce both probabilities?
What criteria can be used to determine whether
such reductions are needed or are effective?
Use of FTA to Diagnose Causes of a Failure
FTA can also be used as a reactive tool to assess
the causes of a failure
The observed failure is the top event
The FT is developed to identify the possible basic
causes
The basic causes can be prioritized for their
likelihood using FT importance measures
Diagnostic FTA
The observed failure (end state) is the top event
Observed successes and failures of subsystems and
components are documented
The top event is developed to the immediate possible
causes
Failures which cannot occur because of the
observations are truncated and not further developed
Tests are identified to resolve whether additional
failures have occurred or have not occurred
The FT is developed in this manner to resolve the
plausible causes of the top event
Monopropellant Diagnostic FTA
Observed System Failure: Thruster Supplied with
Propellant after Thrust Cutoff
Additional Observed Events: No continued EMF
measured in any of the circuits
Diagnostic FT: All continued EMF events deleted
from the original FT
The basic causes identified are Isolation Valve IV3
and Isolation Valve IV2 failures
If the diagnostic FT were developed after the
observed event, the EMF events would not be
developed further and would be nullified
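The same truncation can be illustrated on the minimal cut sets of the monopropellant fault tree (T = E6E7 + E6E8 + E5E7 + E5E8 + E1E3 + E1E4 + E1E2). This is a minimal sketch that assumes E1 and E2 are the primary failures of IV2 and IV3 and that E3 through E8 are the relay and switch failures that would show up as continued EMF; that mapping of E5 through E8 is an assumption made for the illustration.

```python
# Minimal cut sets of the monopropellant fault tree.
cut_sets = [
    {"E6", "E7"}, {"E6", "E8"}, {"E5", "E7"}, {"E5", "E8"},
    {"E1", "E3"}, {"E1", "E4"}, {"E1", "E2"},
]

# Assumed: events that would produce continued EMF, ruled out by the observation
# "no continued EMF measured in any of the circuits".
ruled_out = {"E3", "E4", "E5", "E6", "E7", "E8"}

# Keep only the cut sets that contain no ruled-out event.
plausible = [cs for cs in cut_sets if not (cs & ruled_out)]

print([sorted(cs) for cs in plausible])  # prints [['E1', 'E2']]
```

Only the cut set E1 E2 survives, which corresponds to the Isolation Valve IV2 and IV3 failures identified above.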
The Mirror Success Tree (ST)
A Success Tree (ST) identifies all the ways in which
the top event cannot occur
The ST is the complement of the FT
The ST is the mirror of the FT
The ST is useful in showing the explicit ways to
prevent the occurrence of the FT top event
The ST is the success space twin of the FT
The ST does not as clearly differentiate importances
and priorities for preventing the top event
Determining the ST from the FT
Complement the top event to a NOT event
Complement all intermediate events to NOT
events
Complement all basic events to NOT events
Change all AND gates to OR gates
Change all OR gates to AND gates
The tree is now the ST
The minimal cut sets of the ST are now
called the minimal path sets
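The conversion steps above can be sketched as a small recursive function. This is a minimal illustration that assumes a hypothetical tree representation: a basic event is a string, and a gate is a tuple of the gate type and a list of inputs. Intermediate event names are left out for brevity, so only the gates and basic events are complemented.

```python
def to_success_tree(node):
    """Return the success tree of a fault tree.

    A node is either a basic event name (str) or a tuple
    (gate, [children]) with gate "AND" or "OR".
    """
    if isinstance(node, str):
        return "NOT " + node  # complement the basic event
    gate, children = node
    if gate not in ("AND", "OR"):
        raise ValueError(f"unsupported gate: {gate}")
    swapped = "OR" if gate == "AND" else "AND"  # swap AND and OR gates
    return (swapped, [to_success_tree(child) for child in children])


# Hypothetical fault tree: top = (A OR B) AND C.
fault_tree = ("AND", [("OR", ["A", "B"]), "C"])
success_tree = to_success_tree(fault_tree)
print(success_tree)
```

For this example the success tree reads (NOT A AND NOT B) OR NOT C, which is the complement of the original top event.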
Minimal Path Sets
A minimal path set is a smallest set of
events such that, if none of them occurs, the
top event will not occur
If all the events in one path set are prevented
from occurring, then the top event is guaranteed
not to occur
The minimal path sets are the totality of
ways to prevent the top event based on the
fault tree
Success Tree Construction Step 1: Top Part of Monopropellant Success Tree
G1 (OR gate): THRUSTER IS NOT SUPPLIED WITH PROPELLANT AFTER THRUST CUTOFF
Input: ISOLATION VALVE IV3 DOES NOT REMAIN OPEN AFTER CUTOFF
Input: ISOLATION VALVE IV2 DOES NOT REMAIN OPEN AFTER CUTOFF
Success Tree Construction Step 2
G1 (OR gate): THRUSTER IS NOT SUPPLIED WITH PROPELLANT AFTER THRUST CUTOFF
Input: G2 (AND gate): ISOLATION VALVE IV3 DOES NOT REMAIN OPEN AFTER CUTOFF
Input to G2: EMF IS NOT CONTINUED TO BE SUPPLIED TO IV3 AFTER CUTOFF
Input to G2: NO PRIMARY FAILURE OF IV3 TO CLOSE AFTER CUTOFF (NO E2)
Input: ISOLATION VALVE IV2 DOES NOT REMAIN OPEN AFTER CUTOFF
Minimal Path Sets from the Minimal
Cut Sets
Take the complement of the union of the minimal
cut sets (mcs)
Carry out Boolean manipulation to obtain a union of
intersections
The intersections, or combinations of events, are
the minimal path sets (mps)
The set of minimal path sets is the totality of
combinations of preventions stopping the top event
from occurring
Monopropellant FT: MPS from MCS
T = E6E7 + E6E8 + E5E7 + E5E8 + E1E3 + E1E4 + E1E2
Take the complement (denoted by a superscript prime):
T' = (E6E7 + E6E8 + E5E7 + E5E8 + E1E3 + E1E4 + E1E2)'
Apply the Union Complementation Law
T' = (E6E7)'(E6E8)'(E5E7)'(E5E8)'(E1E3)'(E1E4)'(E1E2)'
T' = (E6'+E7')(E6'+E8')(E5'+E7')(E5'+E8')(E1'+E3')(E1'+E4')(E1'+E2')
T' = E6'E5'E1' + E7'E8'E1' + E6'E5'E3'E4'E2' + E7'E8'E3'E4'E2'
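The result of the Boolean manipulation can be cross-checked by brute force: a minimal path set is a minimal set of events that shares at least one event with every minimal cut set, so preventing all events in it blocks every cut set. The sketch below is a hypothetical helper written for this check, and is practical only for small trees; it enumerates candidate sets by increasing size.

```python
from itertools import combinations


def minimal_path_sets(cut_sets):
    """Minimal sets of events that intersect every minimal cut set.

    Brute-force search by increasing size; suitable for small trees only.
    """
    events = sorted(set().union(*cut_sets))
    found = []
    for size in range(1, len(events) + 1):
        for candidate in combinations(events, size):
            cand = set(candidate)
            if any(ps <= cand for ps in found):
                continue  # contains a smaller path set, so it is not minimal
            if all(cand & cs for cs in cut_sets):
                found.append(cand)
    return found


# Minimal cut sets of the monopropellant fault tree, from the expression for T.
cut_sets = [
    {"E6", "E7"}, {"E6", "E8"}, {"E5", "E7"}, {"E5", "E8"},
    {"E1", "E3"}, {"E1", "E4"}, {"E1", "E2"},
]
for ps in minimal_path_sets(cut_sets):
    print(sorted(ps))
```

The four sets found correspond to the four terms of the final complemented expression, with each listed event read as "does not occur".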
FTA Interface with Reliability Analysis
For quantification, the basic component inputs to FTA
are component failure rates and repair rates
For a first order calculation, the failure rates and repair
rates are treated as being constant
For more detailed quantifications, the failure rates and
repair rates can be modeled as being age or time
dependent
Weibull distributions are often used for the failure times
Lognormals or threshold exponential can be used for
the repair times
FTA can be linked to failure and repair data records
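To make the difference between the constant-rate and the time-dependent treatment concrete, the following minimal sketch compares the failure probability from a constant failure rate with that from a Weibull failure time distribution. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def failure_probability_constant_rate(rate, time):
    """P(failure by `time`) for a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-rate * time)


def failure_probability_weibull(shape, scale, time):
    """P(failure by `time`) for Weibull failure times: 1 - exp(-(t/scale)^shape)."""
    return 1.0 - math.exp(-((time / scale) ** shape))


rate = 1e-4      # hypothetical constant failure rate, per hour
scale = 1 / rate  # Weibull scale chosen so that shape = 1 reproduces the constant rate
for t in (100.0, 1000.0, 10000.0):
    constant = failure_probability_constant_rate(rate, t)
    wearout = failure_probability_weibull(2.0, scale, t)  # shape > 1: increasing rate
    print(f"t={t:>7.0f} h  constant={constant:.3e}  weibull(shape=2)={wearout:.3e}")
```

With shape equal to 1 the Weibull model reduces to the constant-rate model; with shape greater than 1 the early failure probability is lower and rises faster with age.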
FTA Project Management Tasks (1)
Define the FTA
Top Event
Scope
Resolution
Assemble the project Team
FT analyst
System engineering support
Data support
Software support
Define the FTA Operational Framework
Assemble the as built drawings
FT naming scheme
Interfaces/Support to be modeled
Software to be used
FTA Project Management Tasks (2)
Assemble the data
Generically applicable data
Specifically applicable data
Prepare the software package
Familiarization
Test problems
Keep a log on the FTA work
Operational and design assumptions
Events not modeled and why
Success and failure definitions
Special models and quantifications used
FTA Project Management Tasks (3)
Review the work at stages
FT construction
Qualitative evaluations
Quantitative evaluations
Check and validate the results
Engineering logic checks
Consistency checks with experience
Prepare and disseminate the draft report
Conclusions/findings
FTA results
FTs
Software inputs/outputs
Obtain feedback, then modify and finalize the report
Disseminate the report
Present findings
Reference
Fault Tree Handbook with Aerospace Applications,
Version 1.1, NASA Publication, August 2002.