Module 9 - Data Estimation
Introduction
• IEC 61511 Clause 11.9 requires verification of the SIS
performance through calculation of the probability of
failure on demand (PFD) or average frequency of failure
(PFH)
Performance Verification
• Integrity
– Equipment must be user approved
– Subsystems must provide minimum fault tolerance
– Systems must achieve probability of failure on
demand
• (or hazard rate if continuous mode)
• Reliability
– Spurious trip rate
– Understand Secondary Consequences
[Figure: failure types. Random: natural degradation mechanisms under design stresses. Systematic: deterministic under given conditions.]
Random Failures
• Early: dominated by systematic failures, typically due to
manufacturing defects, assembly errors, installation
errors or implementation mistakes
Failure Modes
• Complete failures
– Loss of device’s ability to operate as specified,
resulting in either a safe or dangerous failure
– Device will fail the proof test
Failure Cause
• Basic reason for failure, or initiator of the physical
process by which deterioration proceeds to
failure
– Physical environment
– Chemical environment
– Design defect
– Device misapplication
– Quality defect
Failure Mechanisms
• Age (Wear Out)
• Heat
• Humidity
• Electrical surge
• Electro-static discharge
• Shock
• Vibration
Most of these are outside of the manufacturer analysis
Failure Classifications
• Dangerous
– Failure affecting equipment within a system, which
causes the process to be put in a hazardous state or
puts the system in a condition where it may fail-to-
operate when required
• Safe
– Failure affecting equipment within a system, which
causes, or places the equipment in condition where it
can potentially cause, the process to achieve or
maintain a safe state
Complete Failure
Signal Output Saturated High, i.e. > 100 %; Electronic failure; Corrosion, Ageing, Thermal stress; Application Dependent
Signal Output Frozen; Isolation Valve Closed; Human error; Dangerous
(TR84.00.02)
Failure Distribution
[Figure: failures divided into SAFE (S) and DANGEROUS (D). ©SIS-TECH]
NOTE: THE RELATIONSHIPS BETWEEN THE CATEGORIES ARE NOT RELATED TO ANY
KNOWN DEVICE. IT IS AN ILLUSTRATION ONLY
Failure Detection
[Figure: failure detection decision tree. Decision nodes: diagnostic test (yes/no), periodic test (yes/no), workbench dismantling (yes/no). Outcomes: immediately revealed, almost immediately revealed, revealed after a delay, never revealed. Example labels on the chart: dd, du.]
DANGEROUS DISTRIBUTION
[Figure: dangerous failure distribution. Dangerous detected (DD): revealed by diagnostics. Dangerous undetected (DU): revealed by proof test or revealed by incident. D = DU + DD. ©SIS-TECH]
SAFE DISTRIBUTION
[Figure: safe failure distribution. Safe detected (SD): revealed by diagnostics or revealed by trip. Safe undetected (SU): revealed by proof test. Dangerous (D) shown for reference.]
• Published Data
– Actual (some predictive)
• Manufacturer Data
– Almost all predictive now
• Actual
– Calculated based on an observed sample of similar equipment
– Most useful since it is based on equipment installed in the
operating environment
– Usually collected and analyzed by the user
Data Sources
• Trips
– Process demands
– Spurious trips
• Device failures
– Detected failures (work orders)
– Proof test records
– Bypass log
– Out-of-service times (time to repair and test)
Measures
• Failures on demand
– Process demands
– Proof tests
• Failure frequency
– Failures per year
– Failures per hour
– Failures per 10^6 hours (per million hours)
– Failures per 10^9 hours (per billion hours)
• Also known as Failures in Time (FITs)
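The frequency units above differ only by scale factors, so a small helper can keep conversions consistent. The sketch below is illustrative: the function names and the 8760 hours-per-year constant (a 365-day year) are assumptions, not part of the course material.

```python
HOURS_PER_YEAR = 8760.0  # assumption: 365-day year, continuous operation


def per_year_to_per_hour(rate_per_year: float) -> float:
    """Convert failures per year to failures per hour."""
    return rate_per_year / HOURS_PER_YEAR


def per_hour_to_per_million_hours(rate_per_hour: float) -> float:
    """Convert failures per hour to failures per 10^6 hours."""
    return rate_per_hour * 1e6


def per_hour_to_fits(rate_per_hour: float) -> float:
    """Convert failures per hour to FITs (failures per 10^9 hours)."""
    return rate_per_hour * 1e9


def failures_on_demand(failures: int, demands: int) -> float:
    """Fraction of demands (process demands or proof tests) that found a failure."""
    if demands <= 0:
        raise ValueError("demands must be positive")
    return failures / demands


if __name__ == "__main__":
    # Hypothetical device: one failure per 100 years of service.
    rate_h = per_year_to_per_hour(0.01)
    print(f"{rate_h:.3e} per hour, {per_hour_to_fits(rate_h):.0f} FITs")
```

Keeping every rate in one base unit (per hour here) and converting only for display avoids mixing per-year and per-hour values in the same calculation.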
Understand precision
• The calculation of risk reduction is an order of
magnitude estimate
– Do not get lost in searching for or calculating multiple
significant digits
• Not much difference between brands of field devices if each
brand has met user approval
– Focus on estimating order-of-magnitude values for
each device technology considering the expected
operating environment
• Most data is selected based on qualitative
judgment of the historical evidence
Internal Data
• Data requirements
– Last test date and pass/fail result
– Current test date and pass/fail result
– Device technology
– Equipment classification
– Operating environment
– Failure mode (e.g., actuator stuck)
– Failure cause (e.g., rust seen on shaft)
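The data requirements above map naturally onto one record per proof test. The following is a minimal sketch, not a prescribed schema: the class name, field names, the 365-day year, and the simple estimate (failed tests divided by cumulative time between tests) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional

DAYS_PER_YEAR = 365.0  # assumption: leap days ignored


@dataclass
class ProofTestRecord:
    """One proof test result for one device (hypothetical schema)."""
    last_test_date: date
    current_test_date: date
    passed: bool
    device_technology: str
    equipment_classification: str
    operating_environment: str
    failure_mode: Optional[str] = None   # e.g., "actuator stuck"
    failure_cause: Optional[str] = None  # e.g., "rust seen on shaft"


def failures_per_year(records: Iterable[ProofTestRecord]) -> float:
    """Point estimate: failed tests divided by cumulative time in service."""
    failures = 0
    service_years = 0.0
    for r in records:
        interval_days = (r.current_test_date - r.last_test_date).days
        service_years += interval_days / DAYS_PER_YEAR
        if not r.passed:
            failures += 1
    if service_years <= 0:
        raise ValueError("no service time recorded")
    return failures / service_years
```

Grouping records by device technology and operating environment before calling `failures_per_year` keeps the estimate specific to similar equipment in similar service.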
Internal Data
• Need enough devices and service time to
establish reasonable confidence
– Remember the bathtub curve
• Need detailed consistent recording
– testing and maintenance
• Consider contributing to the Instrument Reliability
Network
– established 2012 at the Mary Kay O’Connor Process
Safety Center - Texas A&M University
Manufacturer Data
• Sources:
– Manufacturer
– Third-party
• Data quality is highly variable
– Limited boundary
– Assumes specific configuration
– May include
• Very high diagnostic coverage
• No effect failures
(IEC 61508-2010 limits both. Look for revised claims.)
– Disclaims operating environment and support system
contributions
Description                                 MTTFD (years)    MTTFSP (years)
Relays                                      100 - 1000       100 - 500
Non-Safety Configured Single Channel PES    10 - 30          10 - 30
Safety-Configured Single Channel PES        100 - 250        5 - 15
IEC 61508 Compliant SIL 3 PES               2500 - 50,000    10 - 1000
Trip Amplifiers (programmable)              300 - 600        150 - 275
Trip Amplifiers (non-programmable)          500 - 850        150 - 250
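To connect the table to the PFD calculation, the sketch below converts an MTTF in years to a failure rate and applies the widely used simplified single-channel approximation PFDavg ≈ λD × TI / 2. Several things here are assumptions rather than course content: MTTFD and MTTFSP are read as mean time to dangerous failure and mean time to spurious trip, all dangerous failures are treated as undetected until the proof test, repair time and common cause are ignored, and λ × TI is assumed to be much smaller than 1. The function names are hypothetical.

```python
def rate_per_year(mttf_years: float) -> float:
    """Constant failure rate implied by an MTTF (reciprocal relationship)."""
    if mttf_years <= 0:
        raise ValueError("MTTF must be positive")
    return 1.0 / mttf_years


def pfd_avg_single_channel(mttf_d_years: float, test_interval_years: float) -> float:
    """Simplified PFDavg for one channel: lambda_D * TI / 2."""
    return rate_per_year(mttf_d_years) * test_interval_years / 2.0


def spurious_trips_per_year(mttf_sp_years: float) -> float:
    """Spurious trip rate implied by an MTTFSP."""
    return rate_per_year(mttf_sp_years)


if __name__ == "__main__":
    # Lower end of the relay range in the table, with a hypothetical
    # one-year proof test interval.
    print(pfd_avg_single_channel(100.0, 1.0))
    print(spurious_trips_per_year(100.0))
```

Because the table gives ranges spanning an order of magnitude, running both ends of a range through the same function shows how wide the resulting PFDavg band is before any further refinement.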
Relays and trip amplifiers: SIL 2 in simplex, SIL 3 / HFT=1
Logic solver performance is dependent on architecture. Can vary > 10
IEC 61511
Achieving SIL
• PFD/PFH is not the only criterion limiting the achieved SIL
– 70% upper bound confidence limit on reliability parameters
– Failure rates supported by field performance feedback
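One common way to obtain a 70% upper confidence bound from field data is the Poisson (equivalently chi-square) bound for a constant failure rate observed over a fixed cumulative service time. The sketch below uses that approach with only the standard library; it is an illustration under that assumption, not a statement of what the standard prescribes, and the function names are hypothetical.

```python
import math


def poisson_cdf(n: int, mu: float) -> float:
    """P(N <= n) for a Poisson variable with mean mu."""
    term = math.exp(-mu)
    total = term
    for i in range(1, n + 1):
        term *= mu / i
        total += term
    return total


def upper_bound_failure_rate(failures: int, service_time: float,
                             confidence: float = 0.70) -> float:
    """Upper confidence bound on a constant failure rate.

    Finds the mean count mu for which seeing `failures` or fewer events
    has probability 1 - confidence, then divides by the service time.
    """
    if service_time <= 0:
        raise ValueError("service_time must be positive")
    target = 1.0 - confidence
    lo, hi = 0.0, 10.0 * (failures + 1)
    while poisson_cdf(failures, hi) > target:  # widen bracket if needed
        hi *= 2.0
    for _ in range(200):  # bisection; the CDF decreases as mu grows
        mid = (lo + hi) / 2.0
        if poisson_cdf(failures, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi / service_time
```

With zero observed failures the bound is still non-zero, which is why a short service history cannot by itself support a very low failure rate claim.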
IEC 61508
SIL Claim Limit
• Two routes to determine hardware fault
tolerance limit
– Route 1H
• Hardware fault tolerance and safe failure fraction concepts.
– Route 2H
• Component reliability data from feedback from end users,
increased confidence levels and hardware fault tolerance for
specified safety integrity levels.
% Safe Failures
$$\%\,\text{Safe Failures} = \frac{\text{Safe Failures}}{\text{Total Failures}} = \frac{\lambda_S}{\lambda_S + \lambda_D}$$
Ratio indicates inherent tendency to fail safe (i.e., to the
trip condition or state)
$$\text{SFF} = \frac{\lambda_S + \lambda_{DD}}{\lambda_S + \lambda_D}$$
• Numerator: safe failures plus detected dangerous failures
• Higher SFF is achieved by increasing diagnostic sensitivity
• Higher SFF = is it reliable? What are total failures?
• More total failures (includes more parts)
CAUTION:
Detected Failure
• Which direction is safe: High/Low, Open/Closed?
• How do you maintain safety when a device has a detected
failure?
– Fail channel to trip state
• Inherently safer choice
– Willful choice to stay on-line with known failure
• Must achieve functional safety
• Must justify continued operation with known failure
• Requires compensating measures equivalent to loss of SIF integrity
$$\text{Total Failures} = \lambda_S + \lambda_D + \text{No Effect Failures}$$
$$\text{SFF} = \frac{\lambda_S + \lambda_{DD} + \text{No Effect Failures}}{\lambda_S + \lambda_D + \text{No Effect Failures}}$$
No effect and no part failures are explicitly not allowed in the SFF
in IEC 61508 2nd edition!
0 Safe Failures – All failures are detected and reported as Fail Dangerous Detected
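Failure rates: 0 S + 37 DU + 356 DD, plus 138 and 5 in the No Effect and Annunciation Undetected categories
$$\text{SFF} = \frac{0 + 356 + 138 + 5}{0 + 37 + 356 + 138 + 5} = \frac{499}{536} \approx 93\%$$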
Without No Effect (0 S + 37 DU + 356 DD):
$$\text{SFF} = \frac{0 + 356}{0 + 37 + 356} = \frac{356}{393} \approx 91\%$$
Internal Diagnostics Only (0 safe, 264 DD, against the same total of 0 S + 37 DU + 356 DD):
$$\text{SFF} = \frac{0 + 264}{0 + 37 + 356} = \frac{264}{393} \approx 67\%$$
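The three SFF figures in this example can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and parameter names are assumptions, the rates are the values shown on the slides (0 S, 37 DU, 356 DD, with 138 and 5 in the No Effect and Annunciation Undetected categories), and for the internal-diagnostics-only case the dangerous total of 393 is assumed unchanged, so the undetected portion becomes 393 - 264 = 129.

```python
def safe_failure_fraction(safe: float, dd: float, du: float,
                          no_effect: float = 0.0,
                          annunciation_undetected: float = 0.0) -> float:
    """SFF = credited failures / total failures.

    Only dangerous undetected (du) failures are excluded from the numerator.
    The two optional categories reproduce the older practice of counting
    no effect and annunciation failures; leave them at zero to exclude them.
    """
    extras = no_effect + annunciation_undetected
    total = safe + dd + du + extras
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return (safe + dd + extras) / total


if __name__ == "__main__":
    with_no_effect = safe_failure_fraction(0, 356, 37, 138, 5)
    without_no_effect = safe_failure_fraction(0, 356, 37)
    internal_only = safe_failure_fraction(0, 264, 129)
    for label, value in [("with no effect", with_no_effect),
                         ("without no effect", without_no_effect),
                         ("internal diagnostics only", internal_only)]:
        print(f"{label}: {value:.1%}")
```

The same device moves from 93% to 67% purely through which failures are credited, which is the point the slides make about reading SFF claims carefully.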
Determining
Test or Diagnostic Coverage
• Determine which failure modes are detected
• Determine the effect (failure classification) of the
detected failure modes
• Calculate the percentage of the dangerous failures that can be
detected
– Test Coverage – partial test interval
– Diagnostic Coverage – diagnostic interval
• Remaining failures are detected at complete proof test
Diagnostics that require an operator to provide a
compensating measure should be limited to
90%, like any alarm response
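The three steps above (which modes are detected, how each is classified, what fraction of the dangerous rate is detected) can be sketched as a small calculation. Everything in this example is hypothetical: the class, the failure mode names, and the rates are made up for illustration and do not describe any real device.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FailureMode:
    name: str
    rate: float       # any consistent unit, e.g. FITs
    dangerous: bool   # failure classification
    detected: bool    # detected by the test or diagnostic being assessed


def dangerous_coverage(modes: Iterable[FailureMode]) -> float:
    """Fraction of the dangerous failure rate that is detected."""
    dangerous = [m for m in modes if m.dangerous]
    total = sum(m.rate for m in dangerous)
    if total <= 0:
        raise ValueError("no dangerous failure modes listed")
    detected = sum(m.rate for m in dangerous if m.detected)
    return detected / total


# Hypothetical failure modes for illustration only.
EXAMPLE_MODES = [
    FailureMode("output saturated high", 60.0, dangerous=True, detected=True),
    FailureMode("output saturated low", 30.0, dangerous=True, detected=True),
    FailureMode("output frozen", 10.0, dangerous=True, detected=False),
    FailureMode("spurious trip", 50.0, dangerous=False, detected=True),
]

if __name__ == "__main__":
    print(f"coverage = {dangerous_coverage(EXAMPLE_MODES):.0%}")
```

The same function serves for test coverage or diagnostic coverage; only the meaning of the `detected` flag and the associated interval change.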
Summary
• Failure modes and effects analysis is a
qualitative analysis method:
– identifies equipment failure modes,
– determines their impact on the equipment
operation, and
– classifies the impact severity
• MTTF (the reciprocal of the failure rate) and service
life (length of the useful life period) are not the
same
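A short calculation shows why MTTF should not be read as a lifetime. Assuming a constant failure rate (the exponential model, an assumption of this sketch), the probability that a device survives to time t is exp(-t / MTTF). The function name is hypothetical.

```python
import math


def survival_probability(t_years: float, mttf_years: float) -> float:
    """Probability of no failure by time t under a constant failure rate."""
    if mttf_years <= 0:
        raise ValueError("MTTF must be positive")
    return math.exp(-t_years / mttf_years)


if __name__ == "__main__":
    # Hypothetical device with an MTTF of 100 years.
    for t in (10.0, 100.0):
        print(f"t = {t:g} years: {survival_probability(t, 100.0):.3f}")
```

Under this model only about 37% of devices would survive to an age equal to the MTTF, and the model itself stops applying once wear-out begins at the end of the useful life period.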
Summary
• Random hardware failures are considered in the
verification calculations
• A constant failure rate is assumed for the
device’s useful life
– Consider operating environment
– Proper specification, installation and commissioning
practices
– Rigorous mechanical integrity practices