ME 5702 Module 3 System Reliability

Uploaded by

Ignacio Ceballos

ME 5702 – PHYSICAL ASSET MANAGEMENT

SYSTEM RELIABILITY ANALYSIS


Enrique López Droguett
Professor (Profesor Titular)
Mechanical Engineering Department
University of Chile
e-mail: [email protected]
SYSTEM RELIABILITY ANALYSIS

[Figure: pumping system with water source, valves V-1 to V-5, pumps P-1 and P-2, tank T1, sensing and control system (S), and AC power source (AC)]
SYSTEM RELIABILITY ANALYSIS

We will talk about: Rs = f (Rc )

Reliability Block Diagrams


1) Reliability block diagrams (Parallel-Series)
2) Standby and shared-load systems
3) Complex systems

Logic-Based Diagrams
1) Fault trees
2) Success Trees
3) Event Trees
4) Master Logic Diagrams
5) Failure Mode and Effect Analysis (and Failure Mode and Effect
Criticality Analysis)
SYSTEM RELIABILITY ANALYSIS (cont)
(A) Series

[Block diagram: blocks A, B, C, ..., Z connected in series]

(i) When all of the blocks work, the system works, that is,

$$R_S(t) = \Pr(A \cap B \cap C \cap \cdots \cap Z)$$
$$R_S(t) = \Pr(A) \cdot \Pr(B) \cdot \Pr(C) \cdots \Pr(Z)$$
$$R_S(t) = R_A(t) \cdot R_B(t) \cdot R_C(t) \cdots R_Z(t)$$

where
A, B, C, ... = events that the corresponding blocks work.

$$R_S(t) = \prod_{i=1}^{n} R_i(t)$$
SYSTEM RELIABILITY ANALYSIS (cont)

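MTTF of the System:

$$R_S(t) = \prod_{i=1}^{n} R_i(t), \quad \text{for } R_i(t) = e^{-\lambda_i t}$$

$$R_S(t) = \prod_{i=1}^{n} e^{-\lambda_i t} = e^{-\sum_{i=1}^{n} \lambda_i t}$$

$$\lambda_S = \sum_{i=1}^{n} \lambda_i$$

therefore,

$$MTTF_S = \frac{1}{\lambda_S} = \frac{1}{\sum_{i=1}^{n} \lambda_i}$$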
SYSTEM RELIABILITY ANALYSIS (cont)

For identical units (assuming exponential distribution):

$$R_S(t) = e^{-n \lambda t}$$
$$\lambda_S = n \lambda$$
$$MTTF_S = \frac{1}{n \lambda}$$
EXAMPLE 1

Four-component series system where the components are IID with CFR.
If Rs(100hrs) is 0.95, find the individual component MTTF.
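
One way to work Example 1 is to invert the identical-unit series formula R_S(t) = e^{-nλt} for λ. The sketch below does that; the function name `component_mttf_series` is hypothetical and chosen only for illustration.

```python
import math

def component_mttf_series(n, t, r_sys):
    """Component MTTF for n identical constant-failure-rate units in series,
    given the required system reliability r_sys at time t."""
    # R_S(t) = exp(-n * lam * t)  =>  lam = -ln(R_S) / (n * t)
    lam = -math.log(r_sys) / (n * t)
    return 1.0 / lam

mttf_c = component_mttf_series(n=4, t=100.0, r_sys=0.95)
print(f"component MTTF is about {mttf_c:.0f} hours")
```

Under these assumptions the component MTTF comes out close to 7,800 hours, roughly four times the system MTTF, as expected for four identical units in series.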
SYSTEM RELIABILITY ANALYSIS (cont)

(B) Parallel System

[Block diagram: n blocks connected in parallel between in and out]
SYSTEM RELIABILITY ANALYSIS (cont)
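System fails when all the blocks fail, i.e.,

$$F_S = \Pr(\bar{A} \cap \bar{B} \cap \bar{C} \cap \cdots)$$
$$F_S = F_A(t) \cdot F_B(t) \cdot F_C(t) \cdots$$

where F_i(t) = unreliability of one component.

$$1 - R_S(t) = \prod_{i=1}^{n} F_i(t) = \prod_{i=1}^{n} [1 - R_i(t)]$$

$$R_S(t) = 1 - \prod_{i=1}^{n} [1 - R_i(t)], \quad \text{for } R_i(t) = e^{-\lambda_i t}$$

$$\lambda_S = \frac{1}{\left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2} + \cdots\right) - \left(\frac{1}{\lambda_1 + \lambda_2} + \frac{1}{\lambda_2 + \lambda_3} + \cdots\right) + \left(\frac{1}{\lambda_1 + \lambda_2 + \lambda_3} + \cdots\right) - \cdots}$$

$$MTTF_S = \frac{1}{\lambda_S}$$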
SYSTEM RELIABILITY ANALYSIS (cont)

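Case I: When all units are identical with constant failure rate

$$R_S(t) = 1 - \left(1 - e^{-\lambda t}\right)^n$$

$$\lambda_C = \lambda$$

$$\lambda_S = \frac{\lambda}{1 + \frac{1}{2} + \cdots + \frac{1}{n}}$$

$$MTTF_C = \frac{1}{\lambda_C}$$

$$MTTF_S = MTTF_C \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right)$$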
EXAMPLE 2

Two identical and independent components in parallel have CFR. If it is
desired that Rs(1000 hrs) = 0.95, find the component and system MTTF.
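
A possible way to work Example 2 is to invert R_S(t) = 1 - (1 - e^{-λt})^n for λ and then apply the harmonic-sum MTTF formula from Case I. The function name `parallel_identical` is an assumption made for this sketch.

```python
import math

def parallel_identical(n, t, r_sys):
    """Return (component MTTF, system MTTF) for n identical CFR units
    in active parallel, given required system reliability r_sys at time t."""
    q = (1.0 - r_sys) ** (1.0 / n)        # per-unit unreliability at time t
    lam = -math.log(1.0 - q) / t          # from 1 - exp(-lam * t) = q
    mttf_c = 1.0 / lam
    # MTTF_S = MTTF_C * (1 + 1/2 + ... + 1/n)
    mttf_s = mttf_c * sum(1.0 / k for k in range(1, n + 1))
    return mttf_c, mttf_s

mttf_c, mttf_s = parallel_identical(n=2, t=1000.0, r_sys=0.95)
print(f"component MTTF is about {mttf_c:.0f} h, system MTTF about {mttf_s:.0f} h")
```

For two units the system MTTF is 1.5 times the component MTTF, which is the harmonic-sum factor 1 + 1/2.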
SYSTEM RELIABILITY ANALYSIS (cont)

(C) k-out-of-n system


If at least k out of N independent and identical blocks must work for the
system to work, then

$$R_S(t) = \sum_{r=k}^{N} \binom{N}{r} [R(t)]^{r} [1 - R(t)]^{N-r}$$

For the 2-out-of-3 case:

$$R_S = \frac{3!}{2! \times 1!} R^2 (1 - R) + \frac{3!}{3! \times 0!} R^3 (1 - R)^0$$

$$= 3 R^2 (1 - R) + R^3$$

$$= 3R^2 - 2R^3$$
EXAMPLE 3

How many components should be used in an active redundancy design
to achieve a reliability of 0.999 such that, for successful system
operation, a minimum of two components is required?
Assume mission time = 720 hrs for a set of IID components with CFR
of 0.00015/hr.
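
Example 3 can be approached by evaluating the k-out-of-n formula for increasing n until the target is met. The following is a minimal sketch under that approach; the helper names `k_out_of_n_reliability` and `min_units` are hypothetical.

```python
import math

def k_out_of_n_reliability(k, n, r):
    """Reliability of a k-out-of-n system of IID units with unit reliability r."""
    return sum(math.comb(n, j) * r**j * (1 - r) ** (n - j)
               for j in range(k, n + 1))

def min_units(k, r, target, n_max=100):
    """Smallest n for which the k-out-of-n reliability reaches the target."""
    for n in range(k, n_max + 1):
        if k_out_of_n_reliability(k, n, r) >= target:
            return n
    raise ValueError("target not reachable within n_max units")

r_unit = math.exp(-0.00015 * 720)          # unit reliability over the mission
n_required = min_units(k=2, r=r_unit, target=0.999)
print(n_required, k_out_of_n_reliability(2, n_required, r_unit))
```

With these inputs the search stops at five components: four units give roughly 0.996, which falls short of the 0.999 target.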
SYSTEM RELIABILITY ANALYSIS (cont)

(D) Standby Redundant System

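For simplicity, consider the two-unit standby system.

[Figure: two-unit standby system with Block 1 in operation, Block 2 on standby, and a sensing and switching device (SS)]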
SYSTEM RELIABILITY ANALYSIS (cont)

Operation can be categorized as:

i) Block 1 operates until it fails
ii) Sensing/switch recognizes Block 1 failure and switches to Block 2
iii) Before switching, the standby unit must not have failed while on standby
iv) Block 2 starts and continues to operate
v) Block 2 ultimately fails

After Steps (i) – (v) → System Fails


SYSTEM RELIABILITY ANALYSIS (cont)
Case I:
Perfect switch, no standby failure and identical units. The model can be
reduced to the so-called "shock model". In this case "shocks" occur at a
constant rate (thus they follow a Poisson distribution), each shock causes
one block to fail, and the system fails with the nth shock, once all n
blocks have failed.

$$R_S(t) = 1 - \int_0^t \frac{\lambda^n}{\Gamma(n)} x^{n-1} e^{-\lambda x} \, dx \quad \text{[gamma distribution]}$$

$$R_S(t) = e^{-\lambda t} \left[ 1 + \lambda t + \frac{(\lambda t)^2}{2!} + \frac{(\lambda t)^3}{3!} + \cdots + \frac{(\lambda t)^{n-1}}{(n-1)!} \right]$$

$$MTTF_S = \frac{n}{\lambda}$$
SYSTEM RELIABILITY ANALYSIS (cont)

Example:

In a two-block standby system with perfect switch, if the failure rate of
each block is 𝜆 = 0.001/hr, which is the rate of occurrence of shocks, then
the system MTTF is

$$MTTF_S = \frac{2}{0.001} = 2000 \text{ hours.}$$
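
The shock model for an n-unit standby system with perfect switching can be evaluated directly from the Poisson sum (fewer than n shocks by time t). The sketch below is illustrative only; the function names are hypothetical.

```python
import math

def standby_reliability(n, lam, t):
    """R_S(t) for n identical units in standby with perfect switching:
    probability that fewer than n Poisson shocks occur by time t."""
    x = lam * t
    return math.exp(-x) * sum(x**i / math.factorial(i) for i in range(n))

def standby_mttf(n, lam):
    """MTTF of the n-unit standby system (mean of the gamma distribution)."""
    return n / lam

print(standby_mttf(2, 0.001))               # two-block example
print(standby_reliability(2, 0.001, 1000))  # reliability at t = 1000 h
```

For n = 1 the expression reduces to the single-unit exponential reliability, which is a quick sanity check on the sum.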
SYSTEM RELIABILITY ANALYSIS (cont)

Case II:

Two-unit system
Imperfect switching, the standby unit may fail while on standby, and the
block in operation and the block on standby are not identical.

$$R_S(t) = R_1(t) + \int_0^t \Big\{ \underbrace{f_1(t_1)\,dt_1}_{A} \cdot \underbrace{R_{SS}(t_1)}_{B} \cdot \underbrace{R_2'(t_1)}_{C} \cdot \underbrace{R_2(t - t_1)}_{D} \Big\}$$

R′ → in standby → 𝜆′
R → in operation → 𝜆
SYSTEM RELIABILITY ANALYSIS (cont)

Probability that
A → Block 1 fails at a time t1 < t
B → Reliability of sensing and switching device (probability that it
works at time t1)
C → Probability that Block 2 does not fail while in standby mode
(reliability of Block 2 in standby at time t1)
D → Probability that Block 2 works until mission time t (reliability
of Block 2 from time t1 to t).

For the exponential distribution with 𝜆1, 𝜆2, 𝜆'2, 𝜆ss:


SYSTEM RELIABILITY ANALYSIS (cont)
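$$R_{sys}(t) = R_1(t) + \int_0^t f_1(t_1) \, R_{SS}(t_1) \, R_2'(t_1) \, R_2(t - t_1) \, dt_1$$

$$R_{sys}(t) = e^{-\lambda_1 t} + \int_0^t \lambda_1 e^{-\lambda_1 t_1} \, e^{-\lambda_{SS} t_1} \, e^{-\lambda_2' t_1} \, e^{-\lambda_2 (t - t_1)} \, dt_1$$

$$R_{sys}(t) = e^{-\lambda_1 t} + \lambda_1 e^{-\lambda_2 t} \int_0^t e^{-\lambda_1 t_1} \, e^{-\lambda_{SS} t_1} \, e^{-\lambda_2' t_1} \, e^{\lambda_2 t_1} \, dt_1$$

$$R_{sys}(t) = e^{-\lambda_1 t} + \lambda_1 e^{-\lambda_2 t} \left[ \frac{-e^{-(\lambda_1 + \lambda_{SS} + \lambda_2' - \lambda_2) t_1}}{\lambda_1 + \lambda_{SS} + \lambda_2' - \lambda_2} \right]_0^t$$

$$R_{sys}(t) = e^{-\lambda_1 t} + \frac{\lambda_1 e^{-\lambda_2 t}}{\lambda_1 + \lambda_{SS} + \lambda_2' - \lambda_2} \left[ 1 - e^{-(\lambda_1 + \lambda_{SS} + \lambda_2' - \lambda_2) t} \right]$$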
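
As a cross-check of the closed-form result for the two-unit standby system, the integral can also be evaluated numerically. The sketch below compares the two; the function names and the failure-rate values are assumptions chosen only for illustration, and the trapezoid rule stands in for any quadrature method.

```python
import math

def standby_two_unit(t, lam1, lam2, lam2_sb, lam_ss):
    """Closed-form R_sys(t) for the two-unit standby system with imperfect
    switching (lam_ss) and standby failures (lam2_sb)."""
    a = lam1 + lam_ss + lam2_sb - lam2
    if abs(a) < 1e-12:
        integral = t                          # limit of (1 - exp(-a t)) / a as a -> 0
    else:
        integral = (1.0 - math.exp(-a * t)) / a
    return math.exp(-lam1 * t) + lam1 * math.exp(-lam2 * t) * integral

def standby_two_unit_numeric(t, lam1, lam2, lam2_sb, lam_ss, steps=20000):
    """Same quantity by trapezoid integration of f1 * R_SS * R2' * R2."""
    def g(t1):
        return (lam1 * math.exp(-lam1 * t1) * math.exp(-lam_ss * t1)
                * math.exp(-lam2_sb * t1) * math.exp(-lam2 * (t - t1)))
    h = t / steps
    area = h * (0.5 * (g(0.0) + g(t)) + sum(g(i * h) for i in range(1, steps)))
    return math.exp(-lam1 * t) + area

# Hypothetical rates (per hour), for illustration only.
args = dict(lam1=0.001, lam2=0.002, lam2_sb=0.0001, lam_ss=0.00005)
closed = standby_two_unit(1000.0, **args)
numeric = standby_two_unit_numeric(1000.0, **args)
print(closed, numeric)
```

With a perfect switch, no standby failures and identical units, the exponent coefficient becomes zero and the expression reduces to e^{-λt}(1 + λt), the two-unit shock-model result of Case I.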
SYSTEM RELIABILITY ANALYSIS (cont)

Shared Load System:

Initially both units share the load, each with time-to-failure distribution
fh(t). When one unit fails, the other unit operates at a higher stress
(i.e., full load) and thus at an increased failure rate, with time-to-failure
distribution ff(t).

$$R_S(t) = \underbrace{[R_h(t)]^2}_{A} + 2 \int_0^t \Big\{ \underbrace{f_h(t_1)\,dt_1}_{B} \cdot \underbrace{R_h(t_1)}_{C} \cdot \underbrace{R_f(t - t_1)}_{D} \Big\}$$

Probability that
A → Both blocks remain operational at half load until mission time t
B → Block 1 fails at t1, t1 < t
C → Block 2 works at half load until t1, t1 < t
D → Block 2 works at full load from t1 until mission time t.
SYSTEM RELIABILITY ANALYSIS (cont)
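Example of Shared Load System
We assume an exponential time-to-failure model for both units, with failure rates
𝜆h → half load failure rate
𝜆f → full load failure rate
𝜆f > 𝜆h since full load is the more stressful case.

$$R_{sys}(t) = e^{-2\lambda_h t} + 2 \int_0^t \lambda_h e^{-\lambda_h t_1} \, e^{-\lambda_h t_1} \, e^{-\lambda_f (t - t_1)} \, dt_1$$

$$R_{sys}(t) = e^{-2\lambda_h t} + 2\lambda_h e^{-\lambda_f t} \int_0^t e^{-(2\lambda_h - \lambda_f) t_1} \, dt_1$$

$$R_{sys}(t) = e^{-2\lambda_h t} + 2\lambda_h e^{-\lambda_f t} \left[ \frac{-e^{-(2\lambda_h - \lambda_f) t_1}}{2\lambda_h - \lambda_f} \right]_0^t$$

$$R_{sys}(t) = e^{-2\lambda_h t} + \frac{2\lambda_h e^{-\lambda_f t}}{2\lambda_h - \lambda_f} \left[ 1 - e^{-(2\lambda_h - \lambda_f) t} \right]$$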
SYSTEM RELIABILITY ANALYSIS (cont)

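$$R_h(t) = e^{-\lambda_h t}$$
$$R_f(t) = e^{-\lambda_f t}$$

$$R_{sys}(t) = e^{-2\lambda_h t} + \frac{2\lambda_h e^{-\lambda_f t}}{2\lambda_h - \lambda_f} \left[ 1 - e^{-(2\lambda_h - \lambda_f) t} \right]$$

if 2𝜆h - 𝜆f > 0 then

$$MTTF = \int_0^\infty R_S(t) \, dt$$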
SYSTEM RELIABILITY ANALYSIS (cont)
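$$R_S(t) = e^{-2\lambda_h t} + \frac{2\lambda_h e^{-\lambda_f t}}{2\lambda_h - \lambda_f} \left[ 1 - e^{-(2\lambda_h - \lambda_f) t} \right]$$

$$MTTF = \int_0^\infty R_S(t) \, dt$$

$$MTTF = \int_0^\infty \left\{ e^{-2\lambda_h t} + \frac{2\lambda_h e^{-\lambda_f t}}{2\lambda_h - \lambda_f} \left[ 1 - e^{-(2\lambda_h - \lambda_f) t} \right] \right\} dt$$

since $e^{-\lambda_f t} \left( e^{-(2\lambda_h - \lambda_f) t} \right) = e^{-2\lambda_h t}$,

$$= \left[ \frac{-1}{2\lambda_h} e^{-2\lambda_h t} - \frac{2\lambda_h e^{-\lambda_f t}}{\lambda_f (2\lambda_h - \lambda_f)} + \frac{2\lambda_h e^{-2\lambda_h t}}{2\lambda_h (2\lambda_h - \lambda_f)} \right]_0^\infty$$

$$= \frac{1}{2\lambda_h} + \frac{2\lambda_h}{\lambda_f (2\lambda_h - \lambda_f)} - \frac{1}{2\lambda_h - \lambda_f}$$

Let λh = 0.002, λf = 0.003:

$$MTTF = \frac{1}{2(0.002)} + \frac{2(0.002)}{0.003\,(2(0.002) - 0.003)} - \frac{1}{2(0.002) - 0.003}$$

$$= 250 + 1{,}333.33 - 1000 = 583.33 \text{ hrs}$$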
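
The shared-load MTTF expression can be checked numerically. The sketch below evaluates the three-term closed form and compares it with the algebraically equivalent 1/(2λh) + 1/λf (mean time to the first failure at combined rate 2λh, plus the mean remaining life of the survivor at full load); that simplification is an observation added here, not a step taken from the derivation. Function names are hypothetical.

```python
import math

def shared_load_reliability(t, lam_h, lam_f):
    """R_sys(t) for two units sharing a load (requires 2*lam_h != lam_f)."""
    a = 2.0 * lam_h - lam_f
    return (math.exp(-2.0 * lam_h * t)
            + 2.0 * lam_h * math.exp(-lam_f * t) * (1.0 - math.exp(-a * t)) / a)

def shared_load_mttf(lam_h, lam_f):
    """Three-term closed-form MTTF obtained by integrating R_sys(t)."""
    a = 2.0 * lam_h - lam_f
    return 1.0 / (2.0 * lam_h) + 2.0 * lam_h / (lam_f * a) - 1.0 / a

mttf = shared_load_mttf(0.002, 0.003)
print(f"MTTF is about {mttf:.2f} hours")
```

Both forms give about 583.33 hours for λh = 0.002 and λf = 0.003.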
SYSTEM RELIABILITY ANALYSIS (cont)

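Example: Find Reliability of Parallel-Series System

[Figure: parallel-series system between in and out. Branch X: unit 1 in series with the parallel pair T (units 3 and 4). Branch Y: unit 2 in series with the parallel pair V (units 5 and 6). Branches X and Y are in parallel.]

if $R_i(t) = e^{-\lambda_i t}$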
SYSTEM RELIABILITY ANALYSIS (cont)

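Series, parallel reduction

$$R_S = \Pr(X \cup Y) = 1 - (1 - R_X)(1 - R_Y) = R_X + R_Y - R_X R_Y$$
$$R_X = R_1 (R_3 + R_4 - R_3 R_4) = R_1 R_T$$
$$R_Y = R_2 (R_5 + R_6 - R_5 R_6) = R_2 R_V$$

$$R_S(t) = e^{-\lambda_1 t} \left( e^{-\lambda_3 t} + e^{-\lambda_4 t} - e^{-(\lambda_3 + \lambda_4) t} \right) + e^{-\lambda_2 t} \left( e^{-\lambda_5 t} + e^{-\lambda_6 t} - e^{-(\lambda_5 + \lambda_6) t} \right)$$
$$\quad - e^{-(\lambda_1 + \lambda_2) t} \left[ e^{-\lambda_3 t} + e^{-\lambda_4 t} - e^{-(\lambda_3 + \lambda_4) t} \right] \left[ e^{-\lambda_5 t} + e^{-\lambda_6 t} - e^{-(\lambda_5 + \lambda_6) t} \right]$$

$$MTTF_S = \int_0^\infty R_S(t) \, dt$$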
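
The series-parallel reduction above maps naturally onto two small helper functions. This is a minimal sketch; the helper names `series`, `parallel` and `system_reliability` are hypothetical, and the failure rates used at the end are illustrative values only.

```python
import math

def series(*rs):
    """Reliability of independent blocks in series."""
    return math.prod(rs)

def parallel(*rs):
    """Reliability of independent blocks in active parallel."""
    return 1.0 - math.prod(1.0 - r for r in rs)

def system_reliability(t, lam):
    """R_S(t) for the six-unit parallel-series example; lam maps unit -> rate."""
    r = {i: math.exp(-rate * t) for i, rate in lam.items()}
    r_x = series(r[1], parallel(r[3], r[4]))   # R_X = R_1 * R_T
    r_y = series(r[2], parallel(r[5], r[6]))   # R_Y = R_2 * R_V
    return parallel(r_x, r_y)

# Illustrative rates chosen so that every unit has R_i = 0.9 at t = 1.
lam = {i: -math.log(0.9) for i in range(1, 7)}
print(system_reliability(1.0, lam))
```

Composing the helpers bottom-up mirrors the hand reduction: innermost parallel pairs first, then each series branch, then the outer parallel combination.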
SYSTEM RELIABILITY ANALYSIS (cont)

Complex Systems

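Some systems are neither series nor parallel systems, but some hybrid
combination of the two. They can be analyzed using path sets or cut sets.

[Figure: bridge network with units 1 and 4 on the upper path, units 2 and 5 on the lower path, and unit 3 connecting the two paths]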
SYSTEM RELIABILITY ANALYSIS (cont)

Path set method

For the system the path sets are
p1 = (1, 4), p2 = (2, 5), p3 = (1, 3, 5), p4 = (2, 3, 4)
therefore, treating the path sets as independent,

$$R_S = \Pr{}_{s}(p_1 \cup p_2 \cup p_3 \cup p_4)$$
$$R_S = 1 - \Pr{}_{f}(p_1 \cap p_2 \cap p_3 \cap p_4)$$
$$R_S \approx 1 - \Pr{}_{f}(p_1) \Pr{}_{f}(p_2) \Pr{}_{f}(p_3) \Pr{}_{f}(p_4)$$
$$R_S \approx 1 - [1 - \Pr{}_{s}(p_1)][1 - \Pr{}_{s}(p_2)][1 - \Pr{}_{s}(p_3)][1 - \Pr{}_{s}(p_4)]$$

Prf(Pi) = Probability that path i (Pi) is not successful (not available)
Prs(Pi) = Probability that path i (Pi) is successful (available)

$$\Pr{}_{s}(P_1) = \Pr{}_{s}(1) \cdot \Pr{}_{s}(4) = R_1(t) \cdot R_4(t)$$


SYSTEM RELIABILITY ANALYSIS (cont)

System fails when none of the path sets works.

$$R_S(t) \approx 1 - [1 - R_1(t) R_4(t)][1 - R_2(t) R_5(t)][1 - R_1(t) R_3(t) R_5(t)][1 - R_2(t) R_3(t) R_4(t)]$$

Cut-set method

c1 = {1, 2}, c2 = {4, 5}, c3 = {2, 3, 4}, c4 = {1, 3, 5}

If any one of the cut sets occurs, the system fails, therefore,

$$F_S(t) = \Pr(c_1 \cup c_2 \cup c_3 \cup c_4)$$

$$R_S(t) = 1 - \Pr(c_1 \cup c_2 \cup c_3 \cup c_4)$$
SYSTEM RELIABILITY ANALYSIS (cont)
$$\Pr{}_{f}(c_1) = F_1 \cdot F_2 = (1 - R_1)(1 - R_2)$$

$$\Pr(c_1 \cup c_2 \cup c_3 \cup c_4) \approx 1 - [1 - \Pr{}_{f}(c_1)][1 - \Pr{}_{f}(c_2)][1 - \Pr{}_{f}(c_3)][1 - \Pr{}_{f}(c_4)]$$

with
Prf(c1) = (1 - R1)(1 - R2)
Prf(c2) = (1 - R4)(1 - R5)
Prf(c3) = (1 - R2)(1 - R3)(1 - R4)
Prf(c4) = (1 - R1)(1 - R3)(1 - R5)

Note that, for independent events A and B,

$$\Pr(A \cup B) = 1 - [1 - \Pr(A)][1 - \Pr(B)] = \Pr(A) + \Pr(B) - \Pr(A) \cdot \Pr(B)$$

If the product term Pr(A) Pr(B) is very small, it can be eliminated:

$$\Pr(A \cup B) \le \Pr(A) + \Pr(B) \quad \text{[rare event approximation]}$$
SYSTEM RELIABILITY ANALYSIS (cont)
Therefore, when rare event approximation is applied

Rsc < R s < R sf

$$F_S(t) = \Pr(c_1 \cup c_2 \cup c_3 \cup c_4)$$

$$R_{SC}(t) \cong 1 - [\Pr(c_1) + \Pr(c_2) + \Pr(c_3) + \Pr(c_4)]$$

$$R_{SC}(t) \cong 1 - \Big[ \underbrace{(1 - R_1)(1 - R_2)}_{x_1} + \underbrace{(1 - R_4)(1 - R_5)}_{x_2} + \underbrace{(1 - R_1)(1 - R_3)(1 - R_5)}_{x_3} + \underbrace{(1 - R_2)(1 - R_3)(1 - R_4)}_{x_4} \Big]$$

Therefore, cut sets give the lower bound of reliability, whereas path sets give the
upper bound of reliability.
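
The bounding behaviour can be illustrated on the five-unit bridge network by comparing the path-set and cut-set estimates with an exact value. In the sketch below the exact value comes from enumerating all 2^5 unit states, which is a technique added here for comparison and is not part of the derivation above; the function names and the common unit reliability of 0.9 are assumptions for illustration.

```python
from itertools import product
from math import prod

PATH_SETS = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]
CUT_SETS = [(1, 2), (4, 5), (2, 3, 4), (1, 3, 5)]

def exact_reliability(r):
    """Exact R_S by summing the probability of every state with a working path."""
    units = sorted(r)
    total = 0.0
    for state in product([True, False], repeat=len(units)):
        up = {u for u, works in zip(units, state) if works}
        if any(all(u in up for u in path) for path in PATH_SETS):
            total += prod(r[u] if works else 1.0 - r[u]
                          for u, works in zip(units, state))
    return total

def path_set_estimate(r):
    """Upper bound: treats the path sets as independent."""
    return 1.0 - prod(1.0 - prod(r[u] for u in path) for path in PATH_SETS)

def cut_set_estimate(r):
    """Lower bound: rare event approximation over the cut sets."""
    return 1.0 - sum(prod(1.0 - r[u] for u in cut) for cut in CUT_SETS)

r = {u: 0.9 for u in range(1, 6)}   # illustrative unit reliabilities
lower, exact, upper = cut_set_estimate(r), exact_reliability(r), path_set_estimate(r)
print(lower, exact, upper)
```

For highly reliable units the cut-set estimate is typically much closer to the exact value than the path-set estimate, which is one reason cut sets are preferred for quantification.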
SYSTEM RELIABILITY ANALYSIS (cont)
Non-Series Parallel System (Complex System)

[Figure: non-series-parallel system with units 1 to 8]

The system is analyzed by reducing it into basic parallel-series systems.

Conditional probability is used to modify the system to a parallel-series
system:

$$R_S(t) = \underbrace{R_S(t \mid \text{unit 3 good}) \cdot R_3(t)}_{A} + \underbrace{R_S(t \mid \text{unit 3 bad}) \cdot [1 - R_3(t)]}_{B}$$

Unit 3 good = functional, successful or reliable
Unit 3 bad = non-functional, failed or unreliable
SYSTEM RELIABILITY ANALYSIS (cont)

A represents:

[Figure: with unit 3 good, the network reduces to units 1 and 2 in parallel between in and out (parallel system)]

$$R_S(t \mid \text{unit 3 good}) = R_1 + R_2 - R_1 R_2$$


SYSTEM RELIABILITY ANALYSIS (cont)
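B represents:

[Figure: the network with unit 3 removed (units 1, 2, 4, 5, 6, 7, 8)]

We can once again decompose B into parallel-series systems. So we have

$$R_S(t \mid \text{unit 3 bad}) = R_S(t \mid \text{unit 3 bad} \cap \text{unit 6 good}) \times R_6(t) + R_S(t \mid \text{unit 3 bad} \cap \text{unit 6 bad}) \times [1 - R_6(t)]$$

[Figure: the two reduced networks, one for "unit 6 works" and one for "unit 6 fails"]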


SYSTEM RELIABILITY ANALYSIS (cont)

$$R_S(t \mid \text{unit 3 bad} \cap \text{unit 6 good}) = R_2 + R_1 (R_4 + R_5 - R_4 R_5)(R_7 + R_8 - R_7 R_8) - R_1 R_2 (R_4 + R_5 - R_4 R_5)(R_7 + R_8 - R_7 R_8)$$

$$R_S(t \mid \text{unit 3 bad} \cap \text{unit 6 bad}) = R_2 + R_1 (R_4 R_7 + R_5 R_8 - R_4 R_7 R_5 R_8) - R_1 R_2 (R_4 R_7 + R_5 R_8 - R_4 R_7 R_5 R_8)$$
EXAMPLE
Determine the reliability of the following system using the decomposition
method:

[Figure: system diagram with R1 = 0.9, R2 = 0.9, R3 = 0.95, R4 = 0.99]
DEFINE LOGIC TREE EVENT SYMBOLS

Basic event

Undeveloped event - An event which is not further developed either because
it is of insufficient consequence or because information is unavailable

External event - An event which is normally expected to occur

Intermediate event - Explanation event only

LOGIC TREE ANALYSIS

Top-down deductive decomposition of a failure into basic causes of failure
using Boolean logic

Gates (Logic Representation) in Logic Trees

Union operation (inputs A and B, output C)

OR gate: $C = A \cup B = A + B$
NOR (Not OR) gate: $C = \overline{A \cup B} = \bar{A} \cap \bar{B} = \bar{A} \cdot \bar{B}$

Intersection operation (inputs A and B, output C)

AND gate: $C = A \cap B = A \cdot B$
NAND (Not AND) gate: $C = \overline{A \cap B} = \bar{A} \cup \bar{B} = \bar{A} + \bar{B}$
LOGIC TREE ANALYSIS (cont)

$$C = (A \cap \bar{B}) \cup (\bar{A} \cap B)$$
$$C = (A \cdot \bar{B}) + (\bar{A} \cdot B)$$

$$D = (A \cap B) \cup (A \cap C) \cup (B \cap C)$$
$$D = (A \cdot B) + (A \cdot C) + (B \cdot C)$$
FAULT TREE ANALYSIS
Example:

[Figure: circuit with battery B, switch SW, and motor M]

The system can be represented as a series model in block diagram form:

In → B → SW → M → Out

The fault tree for the system will be:

TOP EVENT: No output from M (OR gate)
    M fails
    No DC current to M (OR gate)
        SW fails
        Battery (B) fails
FAULT TREE ANALYSIS (cont)
Now we know that T = M ∪ (SW ∪ B) = M ∪ SW ∪ B
So the above fault tree can be simplified as

T: No output from M (OR gate)
    M fails
    SW fails
    B fails

therefore,

$$T = B \cup M \cup SW$$
$$\Pr(T) = \Pr(B \cup M \cup SW)$$
$$\Pr(T) = 1 - [1 - \Pr(B)][1 - \Pr(SW)][1 - \Pr(M)]$$


FAULT TREE ANALYSIS (cont)
Example:

[Figure: 400 V circuit breaker (CB) control circuit with contacts Co1 and Co2, controllers C1 and C2, sensor S, and coil W]

where
C1 and C2 = Controllers
S = Sensor
Co1 and Co2 = Contacts
W = Coil
CB = Circuit Breaker
FAULT TREE ANALYSIS (cont)
Fault tree:

T: CB does not open when it should (OR gate)
    CB fails
    G1: Wiring remains energized (AND gate)
        G2: Contact 1 does not open (OR gate)
            Co1 fails
            No command from Controller 1 (OR gate)
                S (sensor) fails
                C1 fails
        G3: Contact 2 does not open (OR gate)
            Co2 fails
            No command from Controller 2 (OR gate)
                S (sensor) fails
                C2 fails

W failure opens the circuit, so it is "fail safe" and does not play any role in the fault tree.
FAULT TREE ANALYSIS (cont)
Boolean Reduction Process of a Fault Tree (Substitute and Reduce)

$$T = CB \cup G_1 = CB + G_1$$
$$G_1 = G_2 \cdot G_3$$
$$G_2 = S + Co_1 + C_1$$
$$G_3 = S + Co_2 + C_2$$
$$G_1 = (S + Co_1 + C_1) \cdot (S + Co_2 + C_2)$$
$$G_1 = \underbrace{S \cdot S}_{S} + S \cdot Co_2 + S \cdot C_2 + Co_1 \cdot S + Co_1 \cdot Co_2 + Co_1 \cdot C_2 + C_1 \cdot S + C_1 \cdot Co_2 + C_1 \cdot C_2$$

Minimal cut sets

$$T = CB + G_1 = CB + S + Co_1 \cdot Co_2 + Co_1 \cdot C_2 + C_1 \cdot Co_2 + C_1 \cdot C_2$$

$$\Pr(T) = F(t) \approx \Pr(CB) + \Pr(S) + \Pr(Co_1) \Pr(Co_2) + \Pr(Co_1) \Pr(C_2) + \Pr(C_1) \Pr(Co_2) + \Pr(C_1) \Pr(C_2)$$
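
The substitute-and-reduce procedure can be mechanized by representing each event as a set of cut sets, expanding AND gates as products and OR gates as unions, and then dropping every cut set that contains a smaller one (absorption). The sketch below is one possible way to do this; the helper names and the basic-event probabilities of 0.01 are assumptions for illustration only.

```python
from itertools import product

def basic(name):
    """A basic event is a single cut set containing only itself."""
    return {frozenset([name])}

def minimize(cut_sets):
    """Absorption: drop any cut set that strictly contains another one."""
    return {c for c in cut_sets if not any(other < c for other in cut_sets)}

def or_gate(*inputs):
    """OR gate: union of the input cut sets."""
    return minimize(set().union(*inputs))

def and_gate(*inputs):
    """AND gate: every combination of one cut set per input, merged."""
    return minimize({frozenset().union(*combo) for combo in product(*inputs)})

def rare_event_probability(cut_sets, p):
    """Sum over the cut sets of the product of basic event probabilities."""
    total = 0.0
    for cut in cut_sets:
        term = 1.0
        for event in cut:
            term *= p[event]
        total += term
    return total

G2 = or_gate(basic("S"), basic("Co1"), basic("C1"))
G3 = or_gate(basic("S"), basic("Co2"), basic("C2"))
T = or_gate(basic("CB"), and_gate(G2, G3))

p = {e: 0.01 for e in ("CB", "S", "Co1", "Co2", "C1", "C2")}  # hypothetical values
print(sorted(sorted(c) for c in T))
print(rare_event_probability(T, p))
```

The sensor S appears as a single-event cut set because it feeds both controllers, so the redundancy of the two contact paths does not protect against its failure.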
SUCCESS TREE
Consider the same example:

T: CB opens when it should (AND gate)
    CB works
    G1: Power to W removed (OR gate)
        G2: Contact 1 opens (AND gate)
            Co1 opens
            C1 works
            S works
        G3: Contact 2 opens (AND gate)
            Co2 works
            C2 works
            S works

While failure of W is a success, it is not a normal success mode of operation.
Therefore, it should not be included.
SUCCESS TREE (cont)

Minimal path sets:

T = CB × G1
G1 = G 2 + G 3
G 2 = Co1 × C1 × S
G 3 = Co 2 × C2 × S
G1 = Co1 × C1 × S + Co 2 × C2 × S
T = CB × Co1 × C1 × S + CB × Co 2 × C2 × S

So generally speaking fault tree and success tree are logically complementary.
COMPARISON OF
SUCCESS TREE AND FAULT TREE
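Fault tree: T is an AND gate with inputs A and G1; G1 is an OR gate with inputs B and C.
Success tree: $\bar{T}$ is an OR gate with inputs $\bar{A}$ and G1; G1 is an AND gate with inputs $\bar{B}$ and $\bar{C}$.

For the fault tree, the cut sets are

$$G_1 = B + C, \quad T = A \cdot G_1$$
$$T = A \cdot (B + C) = A \cdot B + A \cdot C \quad \text{(i)}$$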
COMPARISON OF
SUCCESS TREE AND FAULT TREE (cont)
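For the success tree, the path sets are

$$G_1 = \bar{B} \cdot \bar{C}, \quad \bar{T} = \bar{A} + G_1, \quad \bar{T} = \bar{A} + \bar{B} \cdot \bar{C} \quad \text{(ii)}$$

Finding "Path Sets" of the Success Tree from Cut Sets of the Fault Tree:

$$\bar{T} = \overline{A \cdot B + A \cdot C} = (\overline{A \cdot B}) \cdot (\overline{A \cdot C})$$
$$= (\bar{A} + \bar{B}) \cdot (\bar{A} + \bar{C}) = \underbrace{\bar{A} \cdot \bar{A}}_{\bar{A}} + \bar{A} \cdot \bar{C} + \bar{B} \cdot \bar{A} + \bar{B} \cdot \bar{C}$$
$$\bar{T} = \bar{A} + \bar{B} \cdot \bar{C} \quad \text{same as (ii)}$$

So success trees are the complement (logical inverse) of fault trees (in most
cases).

fault tree → success tree
OR gate → AND gate
AND gate → OR gate
$A \cdot \bar{B} + B \cdot \bar{A}$ → $A \cdot B + \bar{A} \cdot \bar{B}$ (exclusive or)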
EXAMPLE*
Consider the following steam boiler system which supplies steam to a
process system at a specified pressure:

*Source: System Reliability Theory: Models, Statistical Methods, and Applications, Second Edition, Wiley, 2004, M. Rausand, A. Hoyland
EXAMPLE
Water is led to the boiler through a pipeline with a regulator valve, a level
indicator controller valve (LICV). Fuel (oil) is led to the burner chamber
through a pipeline with a regulator valve, a pressure controller valve (PCV).
The valve PCV is installed in parallel with a bypass valve V-1 together with
two isolation valves to facilitate inspection and maintenance of the PCV
during normal operation.

The level of the water in the boiler is surveyed by a level emitter (LE). The
water level is maintained in an interval between a specified low level and a
specified high level by a pneumatic control circuit connected to the water
regulator valve LICV. The level indicator controller (LIC) translates the
pneumatic "signal" from LE to a pneumatic "signal" controlling the valve
LICV.
EXAMPLE
It is very important that the water level does not come below the specified
low level. When the water level approaches the low level, a pneumatic
"signal" is passed from the level indicator controller LIC to the level
transmitter (LT). The LT translates the pneumatic "signal" to an electrical
"signal" which is sent to the solenoid valve (SV). The solenoid valve again
controls the valve PCV on the fuel inlet pipeline. This circuit is thus installed
to cut off the fuel supply in case the water level comes below the specified
low level.
EXAMPLE
The pressure in the boiler and in the steam outlet pipeline is surveyed by a
pressure controller PC which is connected to the solenoid valve SV, and
thereby to the valve PCV on the fuel inlet pipeline. This circuit is thus
installed to cut off the fuel supply in case the pressure in the boiler increases
above a specified high pressure.

A critical situation occurs if the boiler is boiled dry. In this case the pressure
in the vessel will increase very rapidly and the vessel may explode:

(a) Construct a fault tree where the TOP event is the critical situation
mentioned above. Secondary failure causes shall not be included
(b) Determine all minimal cut sets in the fault tree
ANOTHER EXAMPLE*
Consider the following fire detector system:

*Source: System Reliability Theory: Models, Statistical Methods, and Applications, Second Edition, Wiley, 2004, M. Rausand, A. Hoyland
ANOTHER EXAMPLE
The fire detector system is divided into two parts, heat detection and smoke
detection. In addition, there is an alarm button that can be operated manually.

Heat Detection:
In the production room there is a closed, pneumatic pipe circuit with four
identical fuse plugs, FP1, FP2, FP3, and FP4. These plugs let air out of the
circuit if they are exposed to temperatures higher than 72 °C. The pneumatic
system has a pressure of 3 bars and is connected to a pressure switch PS. If
one or more of the plugs are activated, the switch will be activated and give
an electrical signal to the start relay for the alarm and shutdown system. In
order to have an electrical signal, the direct current (DC) source must be
intact.
ANOTHER EXAMPLE
Smoke Detection:
The smoke detection system consists of three optical smoke detectors, SD1,
SD2, and SD3; all are independent and have their own batteries. These
detectors are very sensitive and can give warning of fire at an early stage. In
order to avoid false alarms, the three smoke detectors are connected via a
logical 2-out-of-3 voting unit, VU.

This means that at least two detectors must give fire signal before the fire
alarm is activated. If at least two of the three detectors are activated, the 2-
out-of-3 voting unit will give an electric signal to the start relay, SR, for the
alarm and shutdown system. Again the DC voltage source must be intact to
obtain an electrical signal.
ANOTHER EXAMPLE
Manual Activation:
Together with the pneumatic pipe circuit with the four fuse plugs, there is
also a manual switch, MS, that can be turned to relieve the pressure in the
pipe circuit. If the operator, OP, who should be continually present, notices a
fire, he can activate this switch. When the switch is activated, the pressure in
the pipe circuit is relieved and the pressure switch, PS, is activated and gives
an electric signal to the start relay, SR. Again the DC source must be intact.

The Start Relay:


When the start relay, SR, receives an electrical signal from the detection
systems, it is activated and gives a signal to (i) Shut down the process, and
(ii) Activate the alarm and the fire extinguishers.
ANOTHER EXAMPLE
• Assume now that a fire starts

• The fire detector system should detect and give warning about the fire

• Let the TOP event be: no signals from the start relay SR when a fire
condition is present

• Develop the fault tree and find the cut sets


EVENT TREE METHOD
If successful operation of a system depends on an approximately chronological but
discrete operation of some of its units or subsystems, then an event tree is a useful
logical model for the system.

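Example:

[Figure: pump P transfers fluid from the source to the sink and is powered by the ac supply]

ac = failure of power
P = failure of the pump

Event tree (S = success branch, F = failure branch; time runs from left to right):

Initiating event I (sink low level)   AC Power (ac)   Pump (P)   Sequence                          End State
I                                     S               S          $I \cdot \overline{ac} \cdot \bar{P}$   System functions
I                                     S               F          $I \cdot \overline{ac} \cdot P$         System fails
I                                     F                          $I \cdot ac$                            System fails

I = frequency of "sink low" event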


EVENT TREE METHOD (cont)
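Now each of the ac and P events in the event tree can be represented by
a fault tree showing how they could fail:

No ac supply (ac): OR gate with inputs G1 and G2
    G1: OR gate with inputs A and B
    G2: AND gate with inputs C and D

Pump fails (P): AND gate with inputs D and F

Note: D is the same component in both fault trees.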
EVENT TREE METHOD (cont)
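So, event $I \cdot \overline{ac} \cdot P$ will be

$$ac = G_1 + G_2$$
$$G_1 = A + B, \quad G_2 = C \cdot D$$
$$ac = A + B + C \cdot D$$
$$\overline{ac} = \overline{A + B + C \cdot D} = \bar{A} \cdot \bar{B} \cdot (\overline{C \cdot D})$$
$$= \bar{A} \cdot \bar{B} \cdot (\bar{C} + \bar{D}) = \bar{A} \cdot \bar{B} \cdot \bar{C} + \bar{A} \cdot \bar{B} \cdot \bar{D}$$
$$P = D \cdot F, \quad \bar{P} = \bar{D} + \bar{F}$$
$$I \cdot \overline{ac} \cdot P = I \cdot (\bar{A} \cdot \bar{B} \cdot \bar{C} + \bar{A} \cdot \bar{B} \cdot \bar{D}) \cdot P$$
$$= I \cdot (\bar{A} \cdot \bar{B} \cdot \bar{C} + \bar{A} \cdot \bar{B} \cdot \bar{D}) \cdot D \cdot F$$
$$= I \cdot \bar{A} \cdot \bar{B} \cdot \bar{C} \cdot D \cdot F + \underbrace{I \cdot \bar{A} \cdot \bar{B} \cdot \bar{D} \cdot D \cdot F}_{\varphi}$$

where $\bar{D} \cdot D$ is a null set, so

$$I \cdot \overline{ac} \cdot P = I \cdot \bar{A} \cdot \bar{B} \cdot \bar{C} \cdot D \cdot F$$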
EVENT TREE METHOD (cont)
In the same way, we can have results for all the events and thus get an estimate of
the overall system reliability.

$$I \cdot ac = I \cdot A + I \cdot B + I \cdot C \cdot D$$

Thus, since the two failure sequences are mutually exclusive,

$$\Pr(\text{system failure}) = \Pr(I \cdot ac + I \cdot \overline{ac} \cdot P) = \Pr(I \cdot ac) + \Pr(I \cdot \overline{ac} \cdot P)$$

$$= \underbrace{f(I)\Pr(A) + f(I)\Pr(B) + f(I)\Pr(C)\Pr(D)}_{\text{rare event approximation; items not mutually exclusive}} + f(I)\Pr(\bar{A})\Pr(\bar{B})\Pr(\bar{C})\Pr(D)\Pr(F)$$

Assume the following probabilities Pr(•) and frequency f(I):

Item   Failure Probability or Frequency   Success Probability
I      10 per month
A      0.01                               0.99
B      0.01                               0.99
C      0.02                               0.98
D      0.05                               0.95
F      0.01                               0.99

Using the rare event approximation:


EVENT TREE METHOD (cont)
Frequency of System Failure = 10 (0.01) + 10 (0.01) + 10 (0.02) (0.05) +
10 (0.99) (0.99) (0.98) (0.05) (0.01)
= 0.1 + 0.1 + 0.01 + 0.0048 = 0.2148 /month

Frequency without the rare event approximation between the two sequences:

$$f = f(I) \left\{ P(A + B + CD) + P(\bar{A}\bar{B}\bar{C}DF) - P(A + B + CD) \, P(\bar{A}\bar{B}\bar{C}DF) \right\}$$

= 10 {(0.01 + 0.01 + 0.02*0.05) + (0.98*0.99*0.99*0.05*0.01) -
(0.01 + 0.01 + 0.02*0.05) * (0.98*0.99*0.99*0.05*0.01)}
= 0.2147/month
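
The rare-event figure can be reproduced, and compared with an exact value, by enumerating all states of the five basic events A, B, C, D, F and summing the probabilities of those in which the system fails. The enumeration is an addition made here for comparison (it assumes the basic events are independent) and is not part of the calculation above; all names in the sketch are hypothetical.

```python
from itertools import product

f_I = 10.0                                   # initiating event frequency, per month
p_fail = {"A": 0.01, "B": 0.01, "C": 0.02, "D": 0.05, "F": 0.01}

def system_fails(failed):
    """failed[name] is True when that basic event has occurred."""
    ac = failed["A"] or failed["B"] or (failed["C"] and failed["D"])
    pump = failed["D"] and failed["F"]
    return ac or pump

def exact_failure_probability():
    """Sum the probability of every basic-event state that fails the system."""
    names = list(p_fail)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        failed = dict(zip(names, states))
        if system_fails(failed):
            prob = 1.0
            for n in names:
                prob *= p_fail[n] if failed[n] else 1.0 - p_fail[n]
            total += prob
    return total

p = p_fail
rare_event = f_I * (p["A"] + p["B"] + p["C"] * p["D"]
                    + (1 - p["A"]) * (1 - p["B"]) * (1 - p["C"]) * p["D"] * p["F"])
exact = f_I * exact_failure_probability()
print(f"rare event: {rare_event:.4f} /month, exact: {exact:.4f} /month")
```

The rare event approximation slightly overestimates the failure frequency, which is the conservative direction for a risk estimate.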

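$$I \cdot \overline{ac} \cdot \bar{P} = I \cdot (\bar{A} \cdot \bar{B} \cdot \bar{C} + \bar{A} \cdot \bar{B} \cdot \bar{D}) \cdot (\bar{D} + \bar{F})$$
$$= I \cdot \bar{A} \cdot \bar{B} \cdot \bar{C} \cdot \bar{D} + I \cdot \bar{A} \cdot \bar{B} \cdot \bar{C} \cdot \bar{F} + I \cdot \bar{A} \cdot \bar{B} \cdot \bar{D} + I \cdot \bar{A} \cdot \bar{B} \cdot \bar{D} \cdot \bar{F}$$
$$= I \cdot \bar{A} \cdot \bar{B} \cdot \bar{C} \cdot \bar{F} + I \cdot \bar{A} \cdot \bar{B} \cdot \bar{D}$$

$$10 \left\{ 1 - \left[ (1 - 0.99 \times 0.99 \times 0.98 \times 0.99)(1 - 0.99 \times 0.99 \times 0.95) \right] \right\} \cong 9.9662$$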
EXAMPLE: EXPLOSION*

A B C D

*Source: System Reliability Theory: Models, Statistical Methods, and Applications, Second Edition, Wiley, 2004, M. Rausand, A. Hoyland
AN EXAMPLE OF A PUMPING SYSTEM

[Figure: pumping system with water source, valves V-1 to V-5, pumps P-1 and P-2, tank T1, sensing and control system (S), and AC power source (AC)]
FMEA PROCEDURE

➢ Identify all potential item failure modes and define their effects on the
immediate function or item, on the system, and on the mission to be
performed.
➢ Evaluate each failure mode in terms of the worst potential consequences
which may result, and assign a severity classification category.
➢ Identify failure detection methods and compensating provisions for each
failure mode.
➢ Identify corrective design or other actions required to eliminate the
failure or control the risk.
➢ Document the analysis and identify the problems which could not be
corrected by design.
GENERAL PROCEDURE FOR FMEA/FMECA

➢ Define the system to be analyzed
• Internal and interface functions
• System boundaries
• Failure definitions

➢ Construct a block diagram of the system
• Structural (hardware)
• Functional
• Reliability block diagram (RBD)
FMEA + CA = FMECA
➢ FMEA (MIL-STD 1629, Task 101)
• Standard: Severity classification of Failure Mode Occurrence on a
1 to 4 scale
• Variation: Risk Priority Number (RPN) = Occurrence x Severity x
Detection
All of the above estimates are evaluated on a relative 1 to 5 scale
➢ FMECA (MIL-STD 1629, Task 102)
• Qualitative: Probability of Failure Mode Occurrence on a 1 to 5
relative scale
• Quantitative: Criticality Number = Probability of Function Loss x
Failure Mode Ratio x Part Failure Rate x Operating Time
All of the above estimates are obtained through a respective statistical
estimation procedure from generic (field) failure data
WORKSHEET FORMAT

MIL-STD 1629, Task 101

Header fields: System, Date, Indenture Level, Sheet Number, Reference Drawing, Compiled by, Mission, Approved by

Worksheet columns:
1. ID
2. Item/Functional Identification
3. Function
4. Failure Modes and Causes
5. Mission Phase / Operational Mode
6. Failure Probability and Data Source
7. Failure Effects: Local, Next Higher, End
8. Failure Detection Method
9. Compensating Provisions
10. Severity Class
11. Remarks

MIL-STD 1629, Task 102

Header fields: System, Date, Indenture Level, Sheet Number, Reference Drawing, Compiled by, Mission, Approved by

Worksheet columns:
1. ID
2. Item/Functional Identification
3. Function
4. Failure Modes and Causes
5. Mission Phase / Operational Mode
6. Severity Class
7. Failure Probability and Data Source
8. Failure Effect Probability, 𝛽
9. Failure Mode Ratio, 𝛼
10. Failure Rate, 𝜆 (1/hour)
11. Mission Duration, T (hour)
12. Failure Mode Criticality, Cm = 𝛽 𝛼 𝜆 T
13. Item Criticality, C = Σ Cm
14. Remarks
PROBABILITY OF OCCURRENCE
Description
Probability that an identified potential failure mode will occur over the item operating
time.

Level: Criteria
A - Frequent: A single failure mode probability of occurrence is greater than 20%
of the overall component probability of failure
B - Reasonably Probable: A single failure mode probability of occurrence is greater than 10%
but less than 20% of the overall component probability of failure
C - Occasional: A single failure mode probability of occurrence is greater than 1%
but less than 10% of the overall component probability of failure
D - Remote: A single failure mode probability of occurrence is greater than 0.1%
but less than 1% of the overall component probability of failure
E - Extremely Unlikely: A single failure mode probability of occurrence is less than 0.1% of
the overall component probability of failure

Tips
Use only when specific failure data are not available, that is, primarily with a
functional block diagram approach.
SEVERITY CLASSIFICATION

Description
A qualitative measure of the worst potential consequences resulting from the
item/function failure.

Effect (Severity Rating): Criteria
Catastrophic (1): A failure mode that may cause death, complete system loss
or complete mission loss.
Critical (2): A failure mode that may cause severe injury or major system
degradation, damage, or reduction in mission performance.
Marginal (3): A failure that may cause minor injury or degradation in
system or mission performance.
Minor (4): A failure that does not cause injury or system degradation
but may result in system failure and unscheduled
maintenance or repair.
None (5): no criteria given

Tips
Where it may not be possible to evaluate the effect of the failure mode according to
the four categories above, similar categories may be developed to better evaluate the
effect.
FAILURE MODE CRITICALITY NUMBER

Description
A numerical value used to rank each potential failure mode based on its
occurrence and the consequence of its effect.

$$C_m = \beta \alpha \lambda t$$

𝛽 - Failure Effect Probability

Failure Effect    𝛽
Actual loss       1.0
Probable loss     0.1 - 1.0
Possible loss     0.0 - 0.1
No loss           0.0

𝛼 - Failure Mode Ratio, the fraction of the part failure rate related to a
particular failure mode, estimated by an appropriate failure data analysis or
calculated from MIL-HDBK-217
FAILURE MODE CRITICALITY NUMBER (cont)

𝜆 - Part Failure Rate estimated by an appropriate failure data analysis or
calculated from MIL-HDBK-217
t - the established (mission) operating time of the item (system)

The item criticality number is the sum of the individual failure mode criticality
numbers.

Tips:
Use the reliability block diagrams to evaluate the failure effect probability 𝛽
for non-series systems.

The notion of the failure mode ratio assumes independence of the individual
failure modes ($\sum_i \alpha_i = 1$).
FAILURE MODE CRITICALITY NUMBER (cont)

Example:
The "short-circuit" failure mode of a 1 kV varistor results in the probable
loss of the HV protection circuit with 𝛽 = 0.01. The generic failure rate of
the varistor is 𝜆 = 10^-6 1/hour and the "short-circuit" failure mode ratio is
𝛼 = 0.8. Then the criticality number for the 10 month (7200 hour) period
of the system operating time is

Cm = 0.01 (0.8) (1.0E-6) (7200) = 5.76x10^-5

The remaining 20% covers all other failure modes → 𝛽 = 0.001

Cm = 0.001 (0.2) (1x10^-6) (7200) = 1.44x10^-6

$$C = \sum C_m = 5.904 \times 10^{-5}$$
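
The varistor calculation can be scripted so that the failure mode criticality numbers and their sum are computed together. This is a minimal sketch; the function name `failure_mode_criticality` and the list layout are assumptions made for illustration.

```python
def failure_mode_criticality(beta, alpha, lam, t):
    """C_m = beta * alpha * lambda * t for one failure mode."""
    return beta * alpha * lam * t

lam = 1.0e-6       # part failure rate, 1/hour
t = 7200.0         # operating time, hours

# (failure effect probability beta, failure mode ratio alpha) per failure mode
modes = [(0.01, 0.8),    # short-circuit
         (0.001, 0.2)]   # all other failure modes

c_m = [failure_mode_criticality(beta, alpha, lam, t) for beta, alpha in modes]
item_criticality = sum(c_m)      # C = sum of the C_m values
print(c_m, item_criticality)
```

The failure mode ratios in `modes` sum to 1, consistent with the assumption stated for 𝛼.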
CRITICALITY MATRIX
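Description
A graphical method for comparing (prioritizing) the failure modes with respect to
their severity and criticality.

[Figure: criticality matrix. Vertical axis: criticality, 0.00 to 0.16. Horizontal axis: severity classification (IV, III, II, I). High-priority failure modes lie toward high criticality and severity class I; low-priority failure modes lie toward low criticality and severity class IV. Plotted values include 0.15, 0.132, 0.062, 0.057, 0.023, and 0.015.]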
CRITICALITY MATRIX (cont)
[Figure: criticality matrix example. Vertical axis: criticality, 0.0000 to 0.0045. Horizontal axis: severity classification (IV, III, II, I). Plotted items: VR, AR, DR, RI, CB, TH, CT.]
