Safety Integrity Levels Short Course - SIS
SAFETY INTEGRITY LEVELS: Do you understand the odds?
Dr. Angela E. Summers, P.E., President SIS-TECH Solutions, LLC
Published Control Engineering website, February 2000
The perception of safety integrity levels (SIL), as related to ANSI/ISA S84.01-1996 and IEC 61508, currently exists somewhere between science fiction and marketing. The science fiction is bounded by the belief that the safety integrity level describes the absolute performance of the safety instrumented system (SIS) in terms of potential incidents. The marketing perception is controlled by vendors and service providers, who make claims concerning product performance. Neither perception is true.
SIL is a measure of the SIS performance related only to the devices that comprise the SIS. This measure is limited to device integrity, architecture, testing, diagnostics, and common mode faults inherent to the specific SIS design. It is not explicitly related to the cause and effect matrix, but it is instead related to the devices used to prevent a specific incident. Further, SIL is not a property of a specific device. It is a system property: input devices through logic solver to output devices. Finally, SIL is not a measure of incident frequency. It is defined as the probability (of the safety instrumented system) to fail on demand (PFD). A demand occurs whenever the process reaches the trip condition and causes the SIS to take action.
A simple explanation of the relationship between incident frequency and SIL is to consider a roulette wheel. A roulette wheel consists of a horizontal wheel containing numbered slots. The wheel is spun and a ball is tossed into the wheel. In a gaming establishment, bets are placed on a specific numbered slot. If the ball lands in the slot that the player has selected, the house pays the player.
PMB-295, 2323 Clear Lake City Blvd Houston, Texas 77062-8032 281-922-8324 (phone)· 281-922-4362 (fax) www.SIS-TECH.com
On an SIL roulette wheel, the SIL represents the chance of landing in a specific slot on the wheel. SIL is therefore defined by probability, and the standards have created four probability categories:
SIL 1    1 in 10 to 1 in 100
SIL 2    1 in 100 to 1 in 1,000
SIL 3    1 in 1,000 to 1 in 10,000
SIL 4    1 in 10,000 to 1 in 100,000
On an "SIL 1" roulette wheel, let's assume that there are 10 slots (the minimum required for SIL 1). One is painted red; the other nine are painted black. The roulette wheel is spun when a process demand occurs, e.g. the level in a tank reaches the high level trip set point. The roulette wheel spins; the ball is tossed. If the ball lands in any of the black slots, the safety function works, e.g. the dump valve opens, lowering the level. If the ball lands in the red slot, the safety function does not work and whatever the safety function was designed to prevent occurs, e.g. the tank overflows. How often the tank overflows is the product of the number of spins (process demands) and the fraction of slots that are red (the PFD, which determines the SIL). Therefore, in this game, the player can control the probability of success by controlling the number of slots (SIL). The player can also reduce the incident frequency by reducing the number of spins (process demands).
How many slots are required and what actions should be taken to reduce the number of process demands is based on the perceived risk and tolerable incident frequency. The risk, as identified during the process hazards analysis, is essentially the "bet" placed on the red slot. The bet may consist of injuries, fatalities, environmental releases, property/equipment damage, permit violations, and the plant's "license to operate." If the bet is small, e.g. high level in a tank occurring 10 times per year with the potential consequence of overflowing water into a dike,
maybe 10 slots are acceptable with a resultant incident frequency of once per year. If the bet is large, e.g. high pressure in a process vessel with the potential for rupture, release of flammable gas, subsequent ignition, and multiple fatalities and catastrophic damage occurring once in 10 years, maybe 1000 slots are required with a resultant incident frequency of 1 in 10,000 years.
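The roulette arithmetic above can be sketched in a few lines of Python. This is only an illustration of the relationship described in the text (incident frequency equals demand frequency times PFD); the function name `incident_frequency` and the one-red-slot wheel model are assumptions made for the example.

```python
def incident_frequency(demands_per_year: float, slots: int) -> float:
    """Expected incidents per year for a wheel with exactly one red slot.

    One red slot out of `slots` gives PFD = 1 / slots; the incident
    frequency is the demand frequency multiplied by that PFD.
    """
    pfd = 1.0 / slots
    return demands_per_year * pfd


# Small bet: tank high level 10 times per year, wheel with 10 slots.
small_bet = incident_frequency(demands_per_year=10.0, slots=10)

# Large bet: vessel overpressure once in 10 years, wheel with 1000 slots.
large_bet = incident_frequency(demands_per_year=0.1, slots=1000)

print(f"small bet: {small_bet:.1f} incidents per year")
print(f"large bet: one incident in {1 / large_bet:,.0f} years")
```

The two calls reproduce the two cases in the paragraph: roughly one overflow per year for the small bet, and roughly one rupture in 10,000 years for the large bet.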
Unfortunately, while it is easy from a risk standpoint to understand the penalty behind the failure of a safety function to work, sometimes it is more difficult to acknowledge that the true payout is when the safety function does what it is supposed to do. After all, how many times do plant engineers get a pat on the back because a safety function worked? The plant engineers don't get a hefty check related to the successful prevention of the incident. No small bets or large bets are actually paid to anyone. Therefore, this game is difficult to play, because the game only issues penalties (the incident) for incorrect design choices.
Making matters worse, the drive toward more production may result in the desire to ride out upsets by temporarily disabling or bypassing trip outputs. This action results in the wheel being reduced to one slot with the operator making the ultimate bet. Will the wheel spin before he can get the process back into control?
In most of the literature, SIL is referred to as a performance criterion - the capability of the safety function to perform at the time it is needed. As explained above, the choice of the SIL is more often related to the cost of non-performance - a blurry, sometimes difficult-to-sell concept, especially at project budget meetings. However, no matter how SIL is viewed, the concept represents an important shift in industry's attitude toward safety system design. The SIL must be chosen to reduce the incident frequency to a tolerable level. The SIL is the design basis for all engineering decisions related to the safety function. When the design is complete, it must be validated against the SIL. Therefore, SIL closes the design cycle - risk identified, requirements quantified, and design validated.
PARTIAL-STROKE TESTING OF BLOCK VALVES
Angela E. Summers, Ph.D., P.E., SIS-TECH Solutions, LLC PMB-295, 2323 Clear Lake City Blvd, Houston, TX 77072
Successful implementation of the safety lifecycle model, associated with ANSI/ISA S84.01-1996 (S84) and IEC 61508, hinges on one design constraint: the safety integrity level (SIL). The SIL is a numerical benchmark, related to the probability of failure on demand (PFD). SIL is affected by the design robustness, e.g., device integrity, voting, and common cause faults. It is also affected by the operation and maintenance strategy, e.g., diagnostics and testing interval.
For many operating companies, the most difficult part of SIL compliance is the testing of final elements, especially emergency block valves. Traditionally, emergency block valves have been tested at unit turnaround, using a full-stroke test to demonstrate performance. Thirty years ago, turnarounds were every two to three years. Due to mechanical reliability and preventive maintenance programs, companies are now extending unit turnarounds to every five to six years. Extended turnarounds yield great economic returns through increased production. Extended turnaround intervals also mean that emergency block valves are expected to go longer between function tests, yet still achieve the same performance. This is not possible.
Partial-stroke testing can be used to supplement full-stroke testing to reduce the block valve PFD. The amount of the reduction is dependent on the valve and its application environment. This paper will discuss how to determine the actual impact of the partial-stroke test on PFD. It will also present a discussion of the three partial-stroke testing methodologies that are currently being evaluated and used by industry.
PMB-295, 2323 Clear Lake City Blvd Houston, Texas 77062-8032 713-320-4777 (phone)· 281-461-8109 (fax) www.SIS-TECH.com
SIL VERIFICATION FOR BLOCK VALVES
The probability to fail on demand (PFD) can be calculated using the dangerous failure rate ($\lambda_D$) and the testing interval (TI). The mathematical relationship, assuming that systematic failures are minimized through design practice, is as follows:
$$\mathrm{PFD} = \lambda_D \times \frac{TI}{2}$$
The equation shows that the relationship between PFD and TI is linear. Longer test intervals yield larger PFDs. The OREDA database has data for various valve types and sizes. For the purposes of illustration, a dangerous failure rate of 3.03E-06 failures per hour will be used. The valve failure rate varies with type, size, and operating environment (e.g., process chemicals, deposition, polymerization, etc.). The reader should determine the appropriate failure rate for the specific application.
The PFD, based on the 3.03E-06 per hour failure rate, is shown in Table 1 for various testing frequencies. As expected, the valve performance at a 5-year testing interval is not the same as the valve performance at a 2-year testing interval. Reliability data for operating equipment provided justification to extend the turnaround period, in many cases by a factor of three or more. However, the impact of longer testing intervals on standby devices, such as block valves, was not evaluated. Longer turnaround intervals result in improved financial performance. The side effect is increased risk of an incident due to lower performance of safety critical devices, such as the SIS final elements.
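As a quick check of the linear relationship, the Table 1 values can be recomputed from the stated failure rate. The sketch below assumes 8760 hours per year (the figure the paper uses for a 1-year interval); the names `pfd_avg`, `LAMBDA_D`, and `HOURS_PER_YEAR` are chosen only for this example.

```python
HOURS_PER_YEAR = 8760   # hours in one year, as used in the paper
LAMBDA_D = 3.03e-6      # illustrative dangerous failure rate, per hour


def pfd_avg(lambda_d: float, test_interval_hours: float) -> float:
    """Average probability to fail on demand: lambda_D * TI / 2."""
    return lambda_d * test_interval_hours / 2


# Recompute the PFD for full-stroke testing intervals of 1 to 6 years.
for years in range(1, 7):
    pfd = pfd_avg(LAMBDA_D, years * HOURS_PER_YEAR)
    print(f"{years} year(s): PFDavg = {pfd:.2E}")
```

Because the relationship is linear, doubling the testing interval doubles the PFD, which is why a 5-year interval cannot match the performance of a 2-year interval.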
Table 1. PFDavg for a Typical Block Valve as a Function of Testing Interval (dangerous failure rate = 3.03E-06 per hour)
Testing Interval | PFDavg
1 year | 1.33E-02
2 years | 2.65E-02
3 years | 3.98E-02
4 years | 5.31E-02
5 years | 6.64E-02
6 years | 7.96E-02
Due to the degraded performance at longer testing intervals, many companies have found that they must test the block valves on-line. Once facilities for on-line testing are installed, full-stroke testing can easily be performed. However, since a full-stroke test involves full contact of the valve seating members, frequent stroking can cause excessive wear to the block valve seat. This is a serious concern for soft-seated valves. Increased testing may achieve a higher integrity, but cause damage to the valve seat, leading to earlier valve failure.
Another major concern is that the plant is unprotected while the block valve is in bypass for testing. The fraction of the time that the valve is in bypass must be considered in the PFD calculation. If the valve is bypassed every six months for testing and the test takes 1 hour, the PFD is increased by 2.3E-04. For longer bypass periods or more frequent testing, the impact on the PFD is even more significant. To maintain safety, operating procedures must include a list of the
actions to be taken when the valve is in bypass, such as reducing production rates, monitoring certain process variables, or executing shutdown.
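The bypass contribution quoted above follows from the fraction of time the valve spends in bypass. A minimal sketch, assuming the contribution is simply the test duration divided by the interval between tests (the function name `bypass_pfd_contribution` is hypothetical):

```python
def bypass_pfd_contribution(test_duration_hours: float,
                            test_interval_hours: float) -> float:
    """Fraction of time the valve is bypassed, added directly to the PFD."""
    return test_duration_hours / test_interval_hours


SIX_MONTHS_HOURS = 8760 / 2  # 4380 hours

# One-hour bypass every six months, as in the example in the text.
increase = bypass_pfd_contribution(1.0, SIX_MONTHS_HOURS)
print(f"PFD increase from bypass: {increase:.1E}")
```

Halving the interval or doubling the bypass time doubles this term, which is the sense in which longer bypass periods or more frequent testing make the impact more significant.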
An alternative to a full-stroke test is a partial-stroke test. The test involves moving the valve a minimum of 15 percent, which tests a portion of the valve failure modes. The remainder of the failure modes are tested using a full-stroke test. The main purpose of the partial-stroke test is to reduce the required full-stroke testing frequency.
Partial-stroke testing does not eliminate the need for a full flow bypass. If the valve is partial-stroke tested and determined to be non-functional, maintenance will need a bypass or the process will have to be shut down for valve repair. Since the bypass is not required for testing, there is no loss of safety integrity. The bypass is only used during valve maintenance.
How does partial-stroke testing affect the PFD? A complete functional test of the valve can be viewed as consisting of two parts: the partial-stroke (PS) and the full-stroke (FS). For the calculation, the dangerous failure rate, $\lambda_D$, must be divided into what can be tested at the partial-stroke ($\lambda_{D,PS}$) and what can only be tested with a full-stroke ($\lambda_{D,FS}$). The resulting equation for the PFD is as follows:
$$\mathrm{PFD} = \lambda_{D,PS} \times \frac{TI_{PS}}{2} + \lambda_{D,FS} \times \frac{TI_{FS}}{2}$$
The division of $\lambda_D$ into parts requires an evaluation of the failure modes of the valve. Table 2 provides a listing of typical dangerous failure modes for block valves and the corresponding effect of these failure modes. The test strategy indicates whether the failure mode can be detected by partial-stroke testing or only by full-stroke testing.
Table 2. Dangerous Failure Modes and Effects with Associated Test Strategy
Failure Modes | Effects | Test Strategy
Actuator sizing is insufficient to actuate valve in emergency conditions | Valve fails to close (or open) | Not tested
Valve packing is seized | Valve fails to close (or open) | Test valve - Partial or full-stroke
Valve packing is tight | Valve is slow to move to closed or open position | Not tested unless speed of closure is monitored
Air line to actuator crimped | Valve is slow to move to closed or open position | Not tested unless speed of closure is monitored; physical inspection
Air line to actuator blocked | Valve fails to move to closed or open position | Test valve - Partial or full-stroke
Valve stem sticks | Valve fails to close (or open) | Test valve - Partial or full-stroke
Valve seat is scarred | Valve fails to seal off | Full-stroke test with leak test
Valve seat contains debris | Valve fails to seal off | Full-stroke test
Valve seat plugged due to deposition or polymerization | Valve fails to seal off | Full-stroke test
The failure modes listed in Table 2 can be compared to the failure mode distributions presented in the Offshore Reliability Data Handbook (OREDA) for various valve types and sizes. Based on the OREDA data, the maximum percentage of the failures that can be detected by a partial-stroke test is 70%. The remaining 30% of the failures can only be detected using a full-stroke test.
The reader is cautioned that this breakdown is based on average valve performance in off-shore installations and may not represent the breakdown for the reader's application. This evaluation should be done for each valve type, based on the application environment and the shutoff requirements. If the service is erosive, corrosive, or plugging, the failure rate and failure mode breakdown will be different from that shown in this paper. If the valve is specified as tight-shutoff, the
contribution of minor seat deformation or scarring will be more significant than shown in this paper. For these reasons, it is recommended that partial-stroke testing not be used as a substitute for full-stroke testing for a single block valve application when:
a. the valve has been shown to fail in the service due to process deposition or plugging,
b. the valve is specified as tight-shutoff, or
c. valve leakage can generate a hazardous incident.
Using 70% as the breakdown of the dangerous failure rate, $\lambda_D$, the equation for the PFD can be written as follows:
$$\mathrm{PFD} = 0.7\,\lambda_D \times \frac{TI_{PS}}{2} + 0.3\,\lambda_D \times \frac{TI_{FS}}{2}$$
Using a dangerous failure rate of 3.03E-06 per hour, Figure 1 shows the PFD when the test procedure requires removing the valve from service during the test. As expected, the partial-stroke testing does improve the PFD performance of the valve. The star illustrates the point where the partial-stroke testing interval and full-stroke testing interval are both at 8760 hours (1 year). This corresponds to the results for a 1 year full-stroke test, as shown in Table 1.
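The 70/30 equation can be evaluated directly. The following sketch uses the paper's illustrative failure rate and ignores any test downtime; the function name `pfd_partial_stroke` and the `ps_coverage` parameter are assumptions introduced for the example.

```python
LAMBDA_D = 3.03e-6      # illustrative dangerous failure rate, per hour
HOURS_PER_YEAR = 8760


def pfd_partial_stroke(lambda_d: float, ti_ps_hours: float,
                       ti_fs_hours: float, ps_coverage: float = 0.7) -> float:
    """PFD when `ps_coverage` of dangerous failures is found by partial-stroke
    tests and the remainder only by full-stroke tests."""
    tested_by_ps = ps_coverage * lambda_d * ti_ps_hours / 2
    tested_by_fs = (1.0 - ps_coverage) * lambda_d * ti_fs_hours / 2
    return tested_by_ps + tested_by_fs


# The "star" point: both intervals at 1 year reduces to lambda_D * TI / 2.
star = pfd_partial_stroke(LAMBDA_D, HOURS_PER_YEAR, HOURS_PER_YEAR)
print(f"PS 1 yr, FS 1 yr: {star:.2E}")

# Six-month partial-stroke test with a 5-year full-stroke test.
combined = pfd_partial_stroke(LAMBDA_D, HOURS_PER_YEAR / 2, 5 * HOURS_PER_YEAR)
print(f"PS 6 mo, FS 5 yr: {combined:.2E}")
```

When both intervals are equal, the two terms recombine into the single-interval equation, which is why the star coincides with the 1-year entry in Table 1.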
Figure 1. Relationship Between Partial-stroke Testing Interval and PFD - Valve is Unavailable During the Test
[Plot: "Partial Stroke Testing Impact on PFD - Valve is Unavailable During Partial Stroke Test." PFD is plotted against the partial-stroke testing interval (hours, 1000 to 10000), with one curve for each full-stroke testing interval of 1, 2, 3, 4, and 5 years. The calculation assumes that 70% of the valve failures are tested at the partial-stroke testing interval and that 30% of the valve failures are tested at the full-stroke testing interval. The valve is unavailable for shutdown during the partial-stroke test.]
The downward trend of the curves for very frequent partial-stroke testing is due to the valve being removed from service during the test. This removal results in the valve not being available for the fraction of time that the valve is being tested. The calculation assumes that the total test time is 30 minutes. If the actual test time is longer, the effect will be more pronounced.
Figure 2 shows the PFD when the test procedure allows the valve to remain in service during the test. Very frequent partial-stroke tests improve the PFD substantially, because there is no loss of functionality during the test. Again, the star illustrates the point where the partial-stroke testing interval and full-stroke testing interval are both at 8760 hours (1 year).
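The difference between the two figures can be illustrated numerically. The sketch below is not the calculation behind the published curves; it assumes that removing the valve from service adds the fraction of time under test (test duration divided by partial-stroke interval) to the PFD, in the same way as the bypass example earlier. The function name and constants are chosen for the example, and the 30-minute test time is the value stated for Figure 1.

```python
LAMBDA_D = 3.03e-6       # illustrative dangerous failure rate, per hour
HOURS_PER_YEAR = 8760
TEST_DURATION_H = 0.5    # 30-minute test, as stated for Figure 1


def pfd_with_partial_stroke(ti_ps_h: float, ti_fs_h: float,
                            valve_available_during_test: bool) -> float:
    """PFD for a 70/30 partial/full-stroke split, with optional test downtime."""
    pfd = 0.7 * LAMBDA_D * ti_ps_h / 2 + 0.3 * LAMBDA_D * ti_fs_h / 2
    if not valve_available_during_test:
        # Assumption: downtime adds the fraction of time spent under test.
        pfd += TEST_DURATION_H / ti_ps_h
    return pfd


# Compare both test procedures for a 1-year full-stroke interval.
for ti_ps in (100, 700, 4380, 8760):
    unavailable = pfd_with_partial_stroke(ti_ps, HOURS_PER_YEAR, False)
    available = pfd_with_partial_stroke(ti_ps, HOURS_PER_YEAR, True)
    print(f"TI_PS = {ti_ps:>4} h: unavailable {unavailable:.2E}, "
          f"available {available:.2E}")
```

Under this assumption, very short partial-stroke intervals make the PFD worse when the valve is out of service during the test, while the PFD keeps improving when the valve stays available.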
Figure 2. Relationship Between Partial-stroke Testing Interval and PFD - Valve is Available During the Test
[Plot: "Partial Stroke Testing Impact on PFD - Valve is Available During Test." PFD is plotted against the partial-stroke testing interval (hours, 1000 to 10000), with one curve for each full-stroke testing interval of 1, 2, 3, 4, and 5 years. The calculation assumes that 70% of the valve failures are tested at the partial-stroke testing interval and that 30% of the valve failures are tested at the full-stroke testing interval. The valve is available for shutdown during the partial-stroke test.]
For both test procedures, partial-stroke testing does improve the valve performance. For example, 5-year full-stroke testing achieved a PFD of 6.64E-02 (Table 1). A 5-year full-stroke test supplemented with a 6-month (4380 hours) partial-stroke test achieved a PFD of 2.46E-02, which is a 63% reduction in PFD (the PFD falls to 37% of its original value). In the cases of 1-year and 2-year full-stroke testing, a single block valve can potentially achieve SIL 2 performance when supplemented with frequent partial-stroke tests. For longer full-stroke testing intervals, the valve performance can increase from low SIL 1 to high SIL 1, depending on the partial-stroke testing interval. From the graphs, it is easy to see that no amount of partial-stroke testing is going to allow a
single valve to achieve high SIL 2 performance, let alone SIL 3 performance, at full-stroke testing intervals of 1 year or more.
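The SIL statements in this discussion can be checked by mapping each calculated PFD onto the SIL bands. The sketch below assumes the bands are decades of PFD with the boundary value assigned to the lower SIL; the helper names and the monthly (730-hour) partial-stroke interval are illustrative choices, not values from the paper.

```python
LAMBDA_D = 3.03e-6      # illustrative dangerous failure rate, per hour
HOURS_PER_YEAR = 8760


def sil_band(pfd: float) -> int:
    """Return the SIL whose PFD band contains `pfd` (0 means below SIL 1)."""
    if pfd >= 1e-1:
        return 0
    if pfd >= 1e-2:
        return 1
    if pfd >= 1e-3:
        return 2
    if pfd >= 1e-4:
        return 3
    return 4


def pfd_partial_stroke(ti_ps_h: float, ti_fs_h: float) -> float:
    """70% of failures found at the partial-stroke interval, 30% at full-stroke."""
    return 0.7 * LAMBDA_D * ti_ps_h / 2 + 0.3 * LAMBDA_D * ti_fs_h / 2


full_only = LAMBDA_D * 5 * HOURS_PER_YEAR / 2              # 5-year full-stroke only
with_ps = pfd_partial_stroke(4380, 5 * HOURS_PER_YEAR)     # plus 6-month partial-stroke
reduction = 1 - with_ps / full_only
frequent = pfd_partial_stroke(730, HOURS_PER_YEAR)         # hypothetical monthly PS, 1-year FS

print(f"5-yr FS only: {full_only:.2E} (SIL {sil_band(full_only)})")
print(f"5-yr FS + 6-mo PS: {with_ps:.2E} (SIL {sil_band(with_ps)}), "
      f"reduction {reduction:.0%}")
print(f"1-yr FS + monthly PS: {frequent:.2E} (SIL {sil_band(frequent)})")
```

With a 5-year full-stroke interval the valve stays in the SIL 1 band even with six-month partial-stroke tests, while the 1-year full-stroke case with frequent partial-stroke tests lands in the SIL 2 band, consistent with the discussion above.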
PARTIAL-STROKE TEST METHODOLOGY
There are three basic types of partial-stroke test equipment: mechanical limiting, position control, and solenoids. Each type involves different levels of sophistication and risk. Consequently, each type will be discussed separately.
Mechanical Limiting
Mechanical limiting methods involve the installation of a mechanical device to limit the degree of valve travel. When mechanical limiting methods are used, the valve is not available for process shutdown (see Figure 1).
The mechanical devices used for partial-stroke testing include collars, valve jacks, and jammers.
✓ Valve collars are slotted pipes that are placed around the valve stem of a linear valve. The collar prevents the valve from traveling any farther than the top of the collar. Any fabrication shop can build a valve collar, suitable for test use.
✓ A valve jack is a screw that is turned until it reaches a set position. The valve jack limits the actuator movement to the screw set position. The valve jack is ordered from the valve manufacturer when the valve is purchased. Valve jacks work with both linear valves and rotary valves.
✓ Mechanical jammers are integrated into the rotary valve design. They are essentially slotted rods that limit valve rotation when placed in position using an external key switch. Since the jammer is integrated into the rotary valve, the jammer must be purchased from a valve manufacturer. A contact can be provided for the key switch to allow annunciation in the control room whenever the key is used.
Mechanical limiting methods are inexpensive in terms of capital and installation costs. These methods are manually initiated in the field and are manpower intensive.
A limit switch or visual inspection is used to confirm valve movement. Successful implementation and return to normal operational status are completely procedure driven. For valve collars and jacks, bypass notification to the control room is entirely procedural. For the jammer, automatic notification using the key switch contact can be provided.
One of the biggest drawbacks to these methods is the lack of assurance that the valve is in or has been returned to normal status. There is no way to know for certain that the jack or jammer has been completely retracted without actuating the valve. Furthermore, unauthorized use of the valve jack or jammer cannot be determined by casual inspection. This means that the valve could potentially be out of service with operations personnel unaware of the situation.
These methods do not add to the normal operating spurious trip rate. However, there is the potential for a spurious trip during the partial-stroke test. For valve
collars, the main culprit of spurious trips is improper installation, causing the collar to pop off the stem when the valve begins to move. Jacks and jammers must be placed in service by the technician, so procedural mistakes can result in the valve closing completely rather than just partially. Therefore, these methods are really only as good as the written procedures and technician training.
Position Control
Position control uses a positioner to move the valve to a pre-determined point. This method can be used on linear and rotary valves. Since most emergency block valves are not installed with a positioner, this method does require installation of additional hardware. Positioner operation also requires an analog output, which is typically not installed in SIS applications. Consequently, cost is a major drawback for the position control method.
A limit switch or position transmitter can be used to determine and document the successful completion of the tests. If a smart positioner is used for the position control, a HART maintenance station can collect the test information and generate test documentation. Of course, the use of a smart positioner and maintenance station further increases the capital cost.
Some vendors have promoted the use of the positioner in lieu of a solenoid for valve actuation. However, most positioners do not have a large enough vent port (Cv) for rapid valve closure. Consequently, a solenoid should still be used for valve actuation. This solenoid valve must be installed between the positioner and the actuator.
The positioner does contribute to the spurious trip rate during normal operation, since the positioner can fail and vent the air from the valve. When a solenoid is installed between the positioner and the actuator, the safety functionality is never lost during the partial-stroke test (see Figure 2). De-energizing the solenoid will shut the valve regardless of the positioner action.
Solenoid
A partial-stroke test can be accomplished by pulsing a solenoid valve. The solenoid can be the same solenoid used for valve actuation, resulting in low capital and installation costs for the method. If the actuation solenoid valve is used, this method will also test the solenoid valve functionality.
The time of the pulse must be adjusted for each valve and solenoid pair to achieve the desired valve travel. Valve travel confirmation is accomplished by a limit switch or position transmitter, allowing automatic documentation of test status. Since a serious failure of the valve may result in more movement of the valve during the pulse than desired, the pulse timer should be voted with the limit switch or position transmitter. If the valve reaches its desired travel point before the pulse timer is finished, the solenoid valve should be reset. The test can be programmed in the SIS logic solver with the test being implemented automatically based on a programmed cycle time or initiated by the operator on a maintenance schedule.
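The voting of the pulse timer with the position feedback can be sketched as a small simulation. This is a hypothetical illustration, not logic solver code: the function name `run_pulse_test`, the constant travel rate, the 0.1-second scan time, and the 15% travel limit are all assumptions made for the example.

```python
def run_pulse_test(travel_rate_pct_per_s: float, pulse_time_s: float,
                   travel_limit_pct: float = 15.0, dt: float = 0.1):
    """Simulate one partial-stroke pulse on a valve that moves at a constant rate.

    The solenoid is de-energized at t = 0 and reset (re-energized) as soon as
    either the pulse timer expires or the position feedback reaches the limit.
    Returns (reset_reason, travel_pct_at_reset).
    """
    steps = int(round(pulse_time_s / dt))
    travel = 0.0
    for _ in range(steps):
        travel += travel_rate_pct_per_s * dt
        if travel >= travel_limit_pct:
            # Position feedback wins the vote: reset before the timer expires.
            return "position", travel
    # Timer expired first: normal end of the pulse.
    return "timer", travel


print(run_pulse_test(travel_rate_pct_per_s=4.0, pulse_time_s=3.0))   # valve moving as tuned
print(run_pulse_test(travel_rate_pct_per_s=10.0, pulse_time_s=3.0))  # valve moving too freely
print(run_pulse_test(travel_rate_pct_per_s=0.0, pulse_time_s=3.0))   # stuck valve
```

In this sketch a valve that moves faster than expected is stopped by the position feedback rather than the timer, and a stuck valve shows up as a timer reset with no recorded travel, which the test logic would report as a failed test.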
Since the valve is never bypassed or disabled, the valve remains available for shutdown during the test (see Figure 2). As with the other partial-stroke testing methods, a maintenance bypass is required to allow maintenance to be performed on-line without a process shutdown.
Spurious trips during testing can be a problem if non-redundant solenoids are used for valve actuation. After all, the solenoid is being de-energized for the test and re-energized to stop the test. If the solenoid valve does not reset, the test becomes a trip. The use of redundant solenoids can significantly reduce the probability of the spurious trip.
CONCLUSION
Partial-stroke testing does provide measurable improvement of the PFD over full-stroke testing alone. The amount of improvement is dependent on the specification, configuration, and application environment. The three partial-stroke testing methodologies offer choices between manual and automated testing.
Whichever method is selected, procedures must be written to ensure that the block valve is not tripped during testing, the test is properly carried out, incorrect valve performance is documented, and maintenance is performed to return the valve to fully functional status. This means that the documentation requirements for the partial-stroke test are the same as for the full-stroke test. Since a bypass is still required for maintenance, facilities and procedures must be in place to ensure that use of the bypass valve is restricted. The main benefit is that partial-stroke testing can extend the full-stroke testing interval at which the SIL can still be achieved.
REFERENCES
1. "Application of Safety Instrumented Systems for the Process Industries," ANSI/ISA-S84.01-1996, ISA, Research Triangle Park, NC (1996).
2. "Functional safety of electrical/electronic/programmable electronic safety related systems," IEC 61508, International Electrotechnical Commission, Geneva, Switzerland (1999).
3. Summers, A.E., "Understanding Safety Integrity Levels," Control Engineering website (February 2000).
4. "OREDA: Offshore Reliability Data Handbook," 3rd Edition, Det Norske Veritas Industri Norge as DNV Technica, Norway, 1997.
Techniques for Assigning A Target Safety Integrity Level
Angela E Summers, Ph.D.
This paper was published in ISA Transactions 37 (1998) 95-104.
Abstract
The new ANSI/ISA S84.01-1996 standard (1), "Application of safety instrumented systems for the process industries," requires that companies assign a target safety integrity level (SIL) for all safety instrumented system (SIS) applications. The assignment of the target SIL is a decision requiring the extension of the process hazards analysis (PHA). The assignment is based on the amount of risk reduction that is necessary to mitigate the risk associated with the process to an acceptable level. All of the SIS design, operation, and maintenance choices must then be verified against the target SIL. This paper examines the six most common techniques currently utilized throughout the process industries: Consequence Only, Modified HAZOP, Risk Matrix, Risk Graph, Quantitative Assessment, and Corporate Mandated SIL.
Introduction
The OSHA process safety management (PSM) and EPA risk management program (RMP) dictate that a process hazards analysis be used to determine the protective measures necessary to protect workers, the community, and the environment. A compliant program will incorporate "good engineering practice," which means that the program follows the codes and standards published by such organizations as the American Society of Mechanical Engineers, American Petroleum Institute, American National Standards Institute, National Fire Protection Association, and American Society for Testing and Materials.
In February 1996, the Instrument Society of America published the standard ANSI/ISA S84.01-1996, "Application of safety instrumented systems for the process industries" (1). This standard became an American National Standards Institute (ANSI) standard in March 1997. With its acceptance as an ANSI standard, it will be enforceable under OSHA PSM and EPA RMP.
The new ANSI/ISA S84.01-1996 standard and the draft IEC 61508 (2) standard require that a target safety integrity level (SIL) be assigned for any new or retrofitted safety instrumented systems (SIS). The SIS consists of the instrumentation or controls that are installed for the purpose of mitigating the hazard or bringing the process to a safe state in the event of a process upset. A SIS is used for any process in which the process hazards analysis (PHA) has determined that the mechanical integrity of the process equipment, the process control, and other protective equipment are insufficient to mitigate the potential hazard.
The safety integrity level designations, provided in ANSI/ISA S84.01-1996 and IEC 61508 (draft), can be correlated to SIS availability requirements. As shown in Table 1, IEC 61508 (draft) recognizes SIL 4, which the U.S. domestic standard ANSI/ISA S84.01-1996 does not consider.
Table 1. Safety Integrity Level Correlation with Availability and Probability to Fail on Demand (PFD)
Safety Integrity Level | Availability Required | Probability to Fail on Demand | 1/PFD
4 (IEC 61508 only) | >99.99% | E-005 to E-004 | 100,000 to 10,000
3 | 99.90 - 99.99% | E-004 to E-003 | 10,000 to 1,000
2 | 99.00 - 99.90% | E-003 to E-002 | 1,000 to 100
1 | 90.00 - 99.00% | E-002 to E-001 | 100 to 10
What does SIL mean?
It should be understood that SIL and availability are simply statistical representations of the integrity of the SIS when a process demand occurs. The acceptance of a SIL 1 SIS means that the level of hazard or economic risk is sufficiently low and that a SIS with an availability of 90% (or 10% chance of failure) is acceptable. For example, consider the installation of a SIL 1 SIS for a high level trip in a liquid tank. The availability of 90% would mean that, out of every 10 times that the level reached the high level trip point, there would be one predicted failure of the SIS and subsequent overflow of the tank. Is this an acceptable risk?
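The three numeric columns of Table 1 are different views of the same quantity. A minimal sketch, assuming each SIL band spans one decade of PFD and that availability is simply 1 minus PFD (the function name `sil_table_row` and the dictionary keys are hypothetical):

```python
def sil_table_row(sil: int) -> dict:
    """Availability, PFD, and 1/PFD ranges for one SIL band."""
    pfd_high = 10.0 ** (-sil)        # largest PFD in the band
    pfd_low = 10.0 ** (-(sil + 1))   # smallest PFD in the band
    return {
        "sil": sil,
        "availability_pct": (100 * (1 - pfd_high), 100 * (1 - pfd_low)),
        "pfd": (pfd_low, pfd_high),
        "inverse_pfd": (1 / pfd_high, 1 / pfd_low),
    }


# Print the bands from SIL 4 down to SIL 1, in the order used in Table 1.
for sil in (4, 3, 2, 1):
    row = sil_table_row(sil)
    lo_av, hi_av = row["availability_pct"]
    lo_inv, hi_inv = row["inverse_pfd"]
    print(f"SIL {sil}: availability {lo_av:.3f}% to {hi_av:.4f}%, "
          f"1/PFD {lo_inv:,.0f} to {hi_inv:,.0f}")
```

For SIL 1 this gives an availability between 90% and 99%, which is the basis for the tank example: at the 90% end of the band, about one demand in ten is expected to find the SIS failed.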
A qualitative view of SIL has slowly developed over the last few years as the concept of SIL has been adopted at many chemical and petrochemical plants. As shown in Table 2, this qualitative view can be expressed in terms of the consequence of the SIS failure: facility damage, personnel injury, and public or community exposure.
Table 2. Qualitative view of SIL

| SIL | Generalized View |
|---|---|
| 4 | Catastrophic Community Impact |
| 3 | Employee and Community Impact |
| 2 | Major Property and Production Protection. Possible Injury to employee |
| 1 | Minor Property and Production Protection |
The above qualitative view leaves much open for discussion. What is minor? What is major? At what point will a theoretical injury or fatality occur? There are no regulations that assign or assist in the assignment of a SIL to particular processes or chemical operations. Further, there are no regulations or standards to follow that recommend specific SILs for certain process hazards. The assignment of SIL is a corporate or company decision based on risk management and risk tolerance philosophy. The caveat is that ANSI/ISA S84.01-1996 does mandate that companies should design their safety instrumented systems (SIS) to be consistent with similar operating process units within their own companies and at other companies. Likewise, in the US, OSHA PSM and EPA RMP require that industry standards and good engineering practice be used in the design and operation of process facilities. This means that the assignment of safety integrity levels must be carefully performed and thoroughly documented.
Methodologies
Safety integrity levels are assigned after the process hazards analysis (PHA) has concluded that a safety instrumented system is required. A PHA is performed to identify potential hazards in the operation of a refining, chemical, or petrochemical process. PHAs range from the very simple screening analysis to the complex Hazard and Operability Study (HAZOP). The HAZOP (3) is a systematic, methodical examination of the process design that utilizes a multi-disciplinary team to identify hazards or operability problems that could result in an accident. The HAZOP provides a prioritized basis for the implementation of risk mitigation strategies, such as safety instrumented systems (SIS) or emergency shutdown systems (ESD).
When the HAZOP is completed, the risk associated with the process, in terms of severity and likelihood, should be understood. The event severity is established based on some measure of the anticipated impact or consequence. This can include:
• On-site consequences
o worker injury or death
o equipment damage
• Off-site consequences
o community exposure, including injury and death
o property damage
• Environmental impact
o emission of hazardous chemicals
o contamination of air, soil, and water supplies
o damage to environmentally sensitive areas
The risk likelihood is determined by estimating the probability of expected occurrence. The likelihood is classified as high, medium or low rate of occurrence. This is often determined based on company operating experience or industry wide operation history.
The choice of the SIL assignment method is dependent on the existing corporate risk assessment methodology. There are several methods of converting HAZOP data into safety integrity levels (SIL), including:
• modified HAZOP,
• consequence only,
• risk matrix,
• risk graph,
• quantitative assessment, and
• Corporate mandated SIL.
It is necessary for the user to develop procedures and guidelines to ensure that any of the methods are used effectively and consistently. These methods will be discussed below, along with some criteria for choosing the method.
Modified HAZOP
The Modified HAZOP is an extension of the existing HAZOP process. It is a subjective assignment of the SIL based on the team's qualitative understanding of the incident severity and likelihood. This method relies heavily on the experience and knowledge of the team members. The required experience and knowledge extends beyond simple understanding of the process operation. It must include an understanding of the process risk and the acceptable risk tolerance of the company. The SIL is assigned by qualitatively examining the risk potential and selecting a SIL that seems appropriate by the team's estimation of the risk. Since the assignment is very subjective, there needs to be some consistency between the personnel on the SIL assignment teams from project to project.
Consequence Only
The most conservative technique, Consequence only, uses an estimation of the potential consequence of the incident. The incident frequency is not considered. Consequently, all incidents resulting in possible fatalities would have the same SIL, no matter how remote or frequent the incident might be. A Consequence only decision table may appear as shown in Table 3.
Table 3. Consequence only decision table

| SIL | Generalized View |
|---|---|
| 4 | Potential for fatalities in the community |
| 3 | Potential for multiple fatalities |
| 2 | Potential for major serious injuries or one fatality |
| 1 | Potential for minor injuries |
This method, while conservative, is the simplest tool to utilize, because the team does not need to estimate the likelihood of the incident, which is often the most difficult estimation for the team to make. This method is especially appropriate when the process history is very limited, which contributes substantially to the difficulty in defining the likelihood.
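Because the Consequence only method ignores likelihood, it reduces to a table lookup. A minimal sketch follows, using the rows of Table 3; the category keys and the function name are hypothetical labels introduced only for this illustration.

```python
# Table 3 of this paper, keyed by hypothetical category names.
CONSEQUENCE_ONLY_SIL = {
    "minor_injuries": 1,
    "serious_injuries_or_one_fatality": 2,
    "multiple_fatalities": 3,
    "community_fatalities": 4,
}


def consequence_only_sil(consequences):
    """Return the highest SIL called for by any identified consequence.

    Likelihood is deliberately not an input: the method assigns the same
    SIL however remote or frequent the incident is.
    """
    return max(CONSEQUENCE_ONLY_SIL[c] for c in consequences)


# An incident with both minor injuries and multiple fatalities identified.
print(consequence_only_sil(["minor_injuries", "multiple_fatalities"]))
```

Taking the maximum over all identified consequences is an assumption about how a team would apply the table when an incident has more than one outcome.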
Risk matrix
One of the most common techniques among refining, chemical and petrochemical companies uses a risk matrix, which provides a correlation of risk severity and risk likelihood to SIL. Where the Consequence only technique results in a fixed response to a perceived hazard, the Risk matrix method allows the probability of the potential event to be considered during the assignment of SIL.
A corporate risk matrix provides control of the SIL assigned for a particular severity and likelihood. During the assessment of the incident severity and likelihood, the available layers of protection must be evaluated and their effect on the incident severity and likelihood must be determined. For risk reduction consideration, the layers of protection must be independent, verifiable, dependable, and designed for the mitigation of the specific risk. An example of a two dimensional Risk matrix is in Fig. 1.
Figure 1. Two dimensional risk matrix
[Figure: matrix of event severity against event likelihood (Low, Moderate, High). Cells range from NR through SIL 3, and the highest-risk region is marked "Not acceptable risk". Numbers correspond to SIL levels from ANSI/ISA S84.01 and IEC 1508/IEC 1511 (draft).]
When it is desired that the method provide the capability to formally consider the independent protection layers, a three-dimensional Risk matrix may be used (Fig. 2). The assessment of likelihood and severity is done without considering any additional protection layers. The amount of credit taken for the risk reduction inherent in each layer is controlled by the SIL values assigned in the three dimensional matrix. This provides better control in the amount of risk reduction that is assumed with each applied protection layer.
Figure 2. Three dimensional risk matrix

IPL = High

| Event Severity | Event Likelihood: Low | Med. | High |
|---|---|---|---|
| High | 1 | 1 | 1 |
| Med. | NR | NR | 1 |
| Low | NR | NR | NR |

IPL = Medium

| Event Severity | Event Likelihood: Low | Med. | High |
|---|---|---|---|
| High | 2 | 2 | 2 |
| Med. | 1 | 1 | 2 |
| Low | NR | NR | 1 |

IPL = Low

| Event Severity | Event Likelihood: Low | Med. | High |
|---|---|---|---|
| High | 3 | 3 | 3 |
| Med. | 2 | 2 | 3 |
| Low | 1 | 1 | 2 |

Notes:
1. Event likelihood and severity are evaluated without consideration for the SIS under consideration.
2. NR = Not Required
For this method to be successfully used, the process and its associated risk must be well understood so that the qualitative estimation of the likelihood and severity can be made. The assessment of the likelihood is the most difficult for the assignment team to make, so there should be some general understanding among the assignment team as to frequency of past incidents in the facility or in the general industrial group.
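A corporate three-dimensional matrix lends itself to a nested lookup. The sketch below encodes the values of Fig. 2 as best they can be read from the figure; the dictionary layout, key spellings, and function name are assumptions made for illustration.

```python
# SIL values read from Figure 2; None stands for "NR" (not required).
# Outer key: credit available from independent protection layers (IPL).
# Inner key: event severity. Tuple order: likelihood low, medium, high.
NR = None
RISK_MATRIX_3D = {
    "high": {"high": (1, 1, 1), "medium": (NR, NR, 1), "low": (NR, NR, NR)},
    "medium": {"high": (2, 2, 2), "medium": (1, 1, 2), "low": (NR, NR, 1)},
    "low": {"high": (3, 3, 3), "medium": (2, 2, 3), "low": (1, 1, 2)},
}
LIKELIHOOD_INDEX = {"low": 0, "medium": 1, "high": 2}


def required_sil(ipl, severity, likelihood):
    """Look up the SIL for one IPL level, severity, and likelihood.

    Severity and likelihood are rated without credit for the SIS itself,
    as note 1 of the figure requires.
    """
    return RISK_MATRIX_3D[ipl][severity][LIKELIHOOD_INDEX[likelihood]]


# No credible protection layers, high severity, high likelihood.
print(required_sil("low", "high", "high"))
```

Keeping the matrix as data rather than as branching logic makes it straightforward for a company to substitute its own risk tolerance values.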
Risk Graph
The international standard IEC 61508 (draft) provides an alternative method to the Risk matrix. It is called a Risk graph and provides a SIL correlation based on four factors:
1) consequence (C),
2) frequency and exposure time (F),
3) possibility of avoiding the hazardous event (P), and
4) probability of the unwanted occurrence (W).
This method is a qualitative technique that requires tools to be developed to ensure that the four parameters listed above are properly chosen. It focuses most of the evaluation on an individual person's risk. The four factors are evaluated from the point of view of a theoretical person being in the incident impact zone. This method is consequence driven, but allows credit for controlling access to the facility. For this method, the likelihood and consequence are determined by considering the independent protection layers during the assessment.
Once these factors are determined, the risk graph is utilized to determine the minimum risk reduction level and associated SIL. As with the Risk matrix, a corporate risk graph should be developed. An example Risk graph is shown in Figure 3.
Figure 3. Example risk graph
[Figure: example graph depicting the relationship between DIN V 19250, IEC 1508, ISA 84.01, and Namu]
Legend: Safety Integrity Level (SIL) as per IEC SC 65A or ISA 84.01, e.g., SIL 3; requirement categories, e.g., 6.
Notes:
1) An E/E/PES Safety Related System (SRS) is not sufficient for SIL = h.
2) ISA 84.01 does not have a SIL = 4.
3) SIL = a means no special safety systems are required.
4) SIL = h: an E/E/PES SRS might not be sufficient (see References [14] and [26]).
The Risk graph method uses the four parameters: Consequence-C, Frequency of exposure-F, Possibility of escape-P, and Likelihood of event-W. The analysis proceeds with a determination of each of the parameters, in terms of levels shown as subscripted numbers. The Risk graph shown in Fig. 3 has four levels for consequence, two levels for frequency, two levels for possibility of escape, and three levels for likelihood. As the subscripted numbers increase, the perceived hazard is higher. Each of these levels must be carefully defined on a corporate basis for the methodology to be useful. The consequence, C, is not simply defining the incident in terms of loss of containment, fires or chemical releases, as defined in the PHA process. It is examining the incident from the exposed person's perspective in terms of an injury or fatality. For the example Risk graph shown in Fig. 3, the consequence levels are as follows:
C1 = Minor injury
C2 = Serious permanent injury to one or more persons
C3 = Death to several people
C4 = Very many people killed
In assessing the consequence, the following questions should be evaluated for the incident:
=> Is there a potential for injury or fatality?
=> Can the exposed person recover?
=> Can the exposed person return to normal activities?
=> Are the effects acute or chronic?
=> Has consequence assessment been performed?
The answers to these questions enable a determination of which of the consequence levels should be chosen.
For the exposure frequency, F, the process unit must be evaluated in terms of the personnel presence and activity in the unit. For the example Risk graph, F1 is chosen for rare to more often exposure in the hazardous zone and F2 is chosen for frequent to permanent exposure in the hazardous zone. The questions for this parameter should address the following:
=> Is the process unit remote or in the main personnel traffic area?
=> How close are operation and maintenance stations?
=> How often is operation's staff in the vicinity?
=> What about support staff, such as maintenance crews or engineering personnel?
=> Is this a main travel area for access to other process units?
Possibility of escape, P, can be difficult for the hazards evaluation team to agree upon, because, as engineers and risk assessment people, there is a tendency to want to believe that people can always escape if there are alarms. However, time becomes an important factor in the escape. The example Risk graph uses P1 for possible under certain conditions and P2 for almost impossible. To determine whether it is truly possible or not, the question that should be asked is, "How easy is it to escape from the hazardous area?" Typical issues that should be addressed are as follows:
=> Are the escape routes well marked?
=> Can personnel in the exposure area readily recognize that a hazardous situation exists?
=> Are there alarm sirens?
=> Is there time to escape?
=> What is the available escape time between alarm and incident?
=> Have personnel been through accident scenario training?
=> Do the personnel have historical experience with this scenario?
The probability of occurrence, W, is based on the likelihood of the event, which should be evaluated without taking into account any existing safety instrumented systems. The likelihood parameter in the Risk graph is the same as that determined for the Risk matrix. For the example Risk graph, the probability for occurrence is based on the following:
W1 = A slight probability
W2 = A medium probability
W3 = A high probability
The likelihood can be evaluated qualitatively or quantitatively. If a qualitative measure is used, the methodology must define the terms low, medium, and high.
Quantitative Analysis
The quantitative approach to SIL assignment is the most rigorous technique to utilize. The SIL is assigned by determining the process demand or incident likelihood quantitatively. The potential causes of the incident are modeled using a quantitative risk assessment technique (4), such as the fault tree shown in Fig. 4. The quantitative technique is often used when there is very limited historical information about the process, so that the qualitative determination of likelihood is extremely difficult. The method does require a thorough understanding of the potential causes of the event and an estimated probability of each potential cause. Fig. 4 shows some of the potential failures that should be considered.
Figure 4. Quantitative Calculation of Process Demand
To determine the required SIL, the accepted or tolerable risk frequency is divided by the calculated process demand as follows:
$$\text{Probability to Fail on Demand} = \frac{\text{Tolerable risk frequency}}{\text{Process demand}}$$
The inverse of this equation has also been used to determine the risk reduction factor (RRF).
$$\text{RRF} = \frac{\text{Process demand}}{\text{Tolerable risk frequency}}$$
Whichever equation is used, the calculated risk reduction equates to the required safety integrity level.
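The two ratios above can be sketched as follows. The function names are hypothetical, the band edges are the 1/PFD column of Table 1, and assigning the higher SIL to a value exactly on a band edge is an assumption of this sketch. The inputs are the figures used in the demonstration later in this paper (a process demand of 0.01 per year and a tolerable risk frequency of 0.00001 per year).

```python
from fractions import Fraction

# (SIL, lowest RRF, highest RRF), taken from the 1/PFD column of Table 1.
RRF_BANDS = [
    (4, 10_000, 100_000),
    (3, 1_000, 10_000),
    (2, 100, 1_000),
    (1, 10, 100),
]


def required_pfd(tolerable_frequency, process_demand):
    """PFD = tolerable risk frequency / process demand (both per year)."""
    return Fraction(tolerable_frequency) / Fraction(process_demand)


def risk_reduction_factor(tolerable_frequency, process_demand):
    """RRF = process demand / tolerable risk frequency, the inverse of PFD."""
    return Fraction(process_demand) / Fraction(tolerable_frequency)


def sil_from_rrf(rrf):
    """Return the Table 1 SIL for an RRF, or None if outside SIL 1 to 4.

    Assumption: bands are checked from SIL 4 downward, so an RRF exactly
    on a boundary is assigned the higher SIL.
    """
    for sil, low, high in RRF_BANDS:
        if low <= rrf <= high:
            return sil
    return None


pfd = required_pfd("0.00001", "0.01")
rrf = risk_reduction_factor("0.00001", "0.01")
print(pfd, rrf, sil_from_rrf(rrf))
```

Passing the frequencies as decimal strings keeps the division exact, which matters when the result lands on a band edge, as it does here.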
Corporate mandated choice
The final technique is the least time consuming method, which is one being adopted by many small, specialty chemical plants that do not wish to devote extensive manpower to SIL assignment methodologies. This method recognizes that the greatest increase in cost occurs when the decision is made that the SIL must be higher than SIL 1(6). The selection of SIL 2 or SIL 3 forces the SIS design toward device redundancy and diversity. With this recognition, many small companies are taking the approach that "a safety system is a safety system and therefore should be SIL 3". This eliminates the arguments about whether escape is possible, someone will be injured or killed or the impact will be on-site and/or off-site. It saves time in the PHA process, reduces documentation in justifying the SIL choice, and ensures consistency across process units.
Demonstration of methodologies
To demonstrate the methodologies described in this paper, a simple example will be provided. The reactor shown in Fig. 5 is utilized in the production of chemical C. Chemical A and chemical B are reacted to produce chemical C. Chemicals A, B, and C are flammable and, under certain conditions, explosive.
The reaction is exothermic, so the reactor temperature must be controlled using cooling water. The flow rates of chemical A and chemical B are controlled, because the rate of reactant addition and the ratio of the reactant addition influence the reaction path. A process hazards analysis has documented that, if the flow rate of either chemical A or chemical B exceeds certain levels, the reaction will run away. In addition, the process hazards analysis has shown that if the reaction temperature is not controlled, the reaction path can shift, resulting in a runaway reaction. Both runaway reactions result in volatilization of the reactants and overpressure of the vessel.
Consequence analysis was performed for the various reaction scenarios. It was shown that ignition of the released contents of the vessel would create a pressure wave that would damage a large portion of the facility including the control room.
Figure 5. Simplified P&ID for exothermic reactor example.
[Figure: P&ID of the reactor producing Chemical C, including temperature transmitter TT-101.]
Modified HAZOP
The modified HAZOP would involve the discussion of the cause, consequence and safeguards for each potential incident. The keyword, More flow, would result in a discussion of the potential for runaway reaction, resulting in the potential overpressure of the vessel and loss of life. The required safeguard would be the installation of a SIS to shut down the reactor on high reactant flow and on high pressure. The discussion of the likelihood and consequence would result in the team determining that SIL 3 is the best choice.
A similar discussion would occur when the keyword, High temperature, was used, resulting in a high temperature and high pressure initiated SIS. For this example, an action item is shown for the high temperature, "consider providing redundant reactor temperature transmitters." Since the control of the reaction temperature is key for the prevention of overpressure, the integrity of the process control layer should be improved by using redundant transmitters. Table 4 provides an example of the documentation that might be created for the Modified HAZOP.
Table 4. Example modified HAZOP

| Deviation | Cause | Consequence | Safeguards | Action Item | SIL |
|---|---|---|---|---|---|
| More Flow | FV-101 fails open | Potential for runaway reaction. Potential to overpressure the reactor with release of flammable/explosive contents. Potential for multiple on-site injuries or fatalities | High flow and High Pressure initiate SIS | | 3 |
| High Temperature | TV-103 fails closed or loss of cooling water supply | Same as above | Reactor High Temperature and High Pressure initiate SIS | Consider providing redundant reactor temperature transmitters. | 3 |
Consequence Only
The process hazards analysis identified that the consequence of any ignited release was damage to the control room with multiple injuries and fatalities. Table 5 shows that this consequence would result in the selection of a SIL 3.
Table 5. Consequence Only Example Table

| SIL | Consequence |
|---|---|
| 3 | Potential for multiple fatalities |
| 2 | Potential for major serious injuries or one fatality |
| 1 | Potential for minor injuries |

Risk matrix
The information developed during process hazards analysis would be used as the basis for determining the likelihood and severity of the potential incident. Since the high flow rate scenario is caused by a simple loss of process control, the likelihood of this event is high. The documentation has shown that the runaway reaction would result in an overpressure of the vessel, resulting in the potential for severe damage if the released contents are ignited. The severity would be rated as extensive. The two-dimensional matrix shown in Fig. 1 shows that a high likelihood and extensive severity event requires SIL 3.
If the three dimensional matrix is used, the other layers of protection would need to be determined. For the runaway reactions involved in this process, the overpressure is developed too quickly to be relieved using a pressure relief valve. Therefore, the presence of the pressure relief valve cannot be used as a mitigating device in the SIL assignment. No acceptable layers of protection were identified during this analysis. Examination of Fig. 2 shows that, at IPL=low and at high severity/high likelihood, the assigned SIL would be SIL 3.
Risk graph
The process hazards analysis indicated the potential for multiple injuries and fatalities, so the consequence is C3. The frequency of exposure is high, F2, since the potential explosion will impact the control room. The Risk graph does not allow the use of possibility of escape at this consequence level (fatalities). The likelihood was determined to be high or W3. From the Risk graph shown in Fig. 3, the required SIL is SIL 4.
Quantitative assessment
A fault tree, such as the one shown in Fig. 6, could be drawn to model the process demand frequency or likelihood for the high temperature incident. This fault tree does not include all of the potential sequences associated with the production of high temperature. For the sake of simplicity, it has been limited to the temperature control loop, cooling water flow, and procedural errors. For completeness, the fault tree would need to be extended to include the effect of the reactant flow on the production of temperature, as well as other direct and indirect causes of high temperature.
Figure 6: Fault Tree Analysis Example
[Figure: fault tree for the high temperature incident. Legible event labels include operator does not take action, operator takes incorrect action, RTD/transmitter failure, DCS failure, alarm failure, manual shutdown button fails, feed transmitter fails, CV fails closed, controller fails, and controller or control valve in manual.]
Data is collected from historical evidence and published data sources in order to quantify the fault tree. For this example, the fault tree yielded a process demand frequency of 0.01 per year. The corporate risk tolerance is 0.00001 per year. When the corporate risk tolerance is divided by the process demand frequency, the calculated probability to fail on demand is 0.001, equivalent to a risk reduction factor of 1,000, which corresponds to SIL 3.
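The gate arithmetic used when quantifying a fault tree can be sketched as below. This is a generic illustration under the usual assumption of independent events, not the tree of Fig. 6: the event names and every probability in the example are hypothetical placeholders and are not data from this paper.

```python
from math import prod


def or_gate(probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1 - prod(1 - p for p in probabilities)


def and_gate(probabilities):
    """Probability that all of several independent events occur together."""
    return prod(probabilities)


# Hypothetical annual probabilities, for illustration only.
temperature_loop_fails = or_gate([0.02, 0.01, 0.005])  # transmitter, controller, valve
operator_fails_to_respond = 0.1

# High temperature develops only if the loop fails AND the operator
# does not correct it.
top_event = and_gate([temperature_loop_fails, operator_fails_to_respond])
print(top_event)
```

In practice the inputs would come from the historical evidence and published data sources mentioned above, and a real tree would also have to account for common cause failures that break the independence assumption.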
Conclusion
Unfortunately, there is no easy answer when it comes to assigning SILs. The choice involves examining safety, community, environmental, and economic risks. Most importantly, tools must be developed at the corporate level to ensure that the choice of SIL is consistent with a company's risk management philosophy and that the assignment method is congruent with the existing characteristics of the corporate risk assessment methodologies. The methods presented (Modified HAZOP, Consequence only, Risk matrix, Risk graph, Quantitative assessment, and Corporate mandated SIL) are all equally useful in converting PHA data into safety integrity levels (SIL). When choosing a method, there are a number of factors that should be considered:
1. What type of method is currently used for corporate risk analysis?
2. How complex is the process?
3. Is the process well-understood?
4. What is the operating experience and knowledge of process dynamics?
5. Will the SIL assignment team be consistent from project to project?
Whichever method is chosen, it is necessary for the user to develop procedures and guidelines to ensure that the method is used effectively and consistently.
References
1) ANSI/ISA-S84.01-1996 "Application of Safety Instrumented Systems for the Process Industries," Instrument Society of America S84.01 Standard, Research Triangle Park, NC 27709, February 1996.
2) IEC 61508, "Functional safety of electrical/electronic/programmable electronic safety related systems," International Electrotechnical Commission, Draft, 1997.
3) Guidelines for Hazard Evaluation Procedures, Center for Chemical Process Safety, American Institute of Chemical Engineers, New York, New York, 1992.
4) Guidelines for Chemical Process Quantitative Risk Analysis, Center for Chemical Process Safety, American Institute of Chemical Engineers, New York, New York, 1989.
5) Adamski, Robert S., "Design Critical Control or Emergency Shutdown Systems for Safety AND Reliability," Automatizacion 96, Panamerican Automation Conference, Caracas, Venezuela, May 1996.
6) Windhorst, J.C.A., Strategic Initiative. Nova Chemical, Red Deer, AB, Canada.
Modification
Is the modification of any part of the SIS covered under management of change procedures? [ ] Yes [ ] No [ ] NA
Do MOC procedures include the evaluation of how the change could affect the SIL? [ ] Yes [ ] No [ ] NA
Is the MOC process audited to determine whether the effect of the change could impact safety? [ ] Yes [ ] No [ ] NA
LPS-4A
USING INSTRUMENTED SYSTEMS FOR OVERPRESSURE PROTECTION
Dr. Angela E. Summers, P.E., President, SIS-TECH SOLUTIONS, LLC PMB-295, 2323 Clear Lake City Blvd, Houston, TX 77062-8032 713-320-4777 (phone) 281-461-8109 (fax) [email protected]
ABSTRACT
Industry is moving towards the use of high integrity protection systems (HIPS) to reduce flare loading and alleviate the need to upgrade existing flare systems when expanding facilities. The use of HIPS can minimize capital project costs, while meeting an evolving array of standards and regulations. This paper will discuss API and ASME standards and how these relate to ANSI/ISA S84.01-1996 and IEC 61508. It will focus on the process that should be followed in implementing the engineering design of HIPS.
INTRODUCTION
In the process industry, a key safety consideration is the control and response to overpressure situations. Industry standards from the American Petroleum Institute (API) and American Society of Mechanical Engineers (ASME) provide criteria for the design of vessels and the protection of these vessels from over-pressure. Traditionally, pressure relief valves and flares were used to handle the relieving of vessels in the worst credible scenario. Flare loading calculations gave no credit for operator intervention, fail safe equipment operation or trip systems.
But times have changed. In many communities and countries around the world, the belt is tightening on the venting and combustion of gases. It is simply not acceptable to flare large volumes of gas. In addition, the cost of designing and installing large flare systems has continued to rise. API 521 (1) and Case 2211 of ASME Section VIII, Division 1 and 2 (2), provide alternatives in the design of overpressure protection systems. These alternatives revolve around the use of an instrumented system that exceeds the protection provided by a pressure relief valve and flare system.
These instrumented systems are safety-related systems, since their failure can result in the vessel rupture or in overloading the flare. As safety-related systems, they must be designed according to either the United States domestic ANSI/ISA S84.01-1996 (3) or the international standard draft IEC 61508 (4,5). The risk typically involved with overpressure protection results in the need for high safety system availability; therefore, these systems are often called "high integrity protection systems" or HIPS.
REGULATIONS AND STANDARDS CONCERNING HIPS
API and ASME provide design standards for pressure vessels. These design standards are used worldwide by insurers to determine the appropriateness of pressure vessel design. Because API and ASME are industry-recognized institutions, many of their standards are enforceable in the United States under OSHA PSM (7) and EPA RMP (8). In many other countries worldwide, these standards are enforceable under local and/or national regulations.
ANSI/ISA S84.01-1996 and draft IEC 61508 are standards for SIS design. As a US industrial standard, ANSI/ISA S84.01-1996 is also enforceable as good engineering practice under OSHA PSM (6) and EPA RMP (7). When finalized, draft IEC 61508 will be accepted in many countries as an enforceable national standard, whether associated with a national regulation or independently mandated.
American Petroleum Institute (API)
API has recommended practices that address pressure relieving and depressuring systems in the petroleum production industry. API 521 describes flare system design methods. These methods basically require sizing the relief valve for each vessel for the worst credible scenario and require sizing the main flare header for the worst case relieving scenario, involving the simultaneous venting of all affected vessels. The fourth edition of API 521 allows credit to be taken for a favorable response of some of the instrument systems. While this design alternative is provided, API 521 Part 2.2 recommends the use of high integrity protective systems (HIPS) only when the use of pressure relief devices is impractical.
American Society of Mechanical Engineers (ASME)
ASME Code Case 2211, approved in 1996, sets the conditions under which over-pressure protection may be provided by an instrumented system instead of a pressure relief valve (PRV). This ruling is intended to enhance the overall safety and environmental performance of a facility by utilizing the most appropriate engineered option for pressure protection. While there are no specific performance criteria in the Code Case, the substitution of the HIPS for the PRV should provide a safer installation. Consequently, the substitution is generally intended for limited services where the PRV may not work properly due to process conditions, e.g., plugging, multiple phases, etc. The overpressure protection can be provided by a SIS in lieu of a pressure relieving device under the following conditions:
a) The vessel is not exclusively in air, water, or steam service.
b) The decision to utilize overpressure protection of a vessel by system design is the responsibility of the User.
c) The User must ensure the MAWP of the vessel is higher than the highest pressure that can reasonably be expected to be encountered by the system.
d) A quantitative or qualitative risk analysis of the proposed system must be made addressing all credible overpressure scenarios.
e) The analysis in (c) and (d) must be documented.
International Society for Measurement and Control (ISA) and International Electrotechnical Commission (IEC)
ANSI/ISA S84.01-1996 and draft IEC 61508 are intended to address the application of safety instrumented systems (SIS) for the process industries. The objective of these standards is to define the design and documentation requirements for SIS. While these design standards are not prescriptive in nature, the design processes mandated in these standards cover all aspects of design including: risk assessment, conceptual design, detailed design, operation, maintenance, and testing (8). To ensure compliant implementation, the requirements of these standards, as pertaining to a specific HIPS application, must be investigated thoroughly.
One of the most important criteria for SIS design is the requirement that the User assign and verify the safety integrity level (SIL) for the SIS (9). The assignment of SIL is a corporate decision based on risk management philosophy and risk tolerance. Safety instrumented systems (SIS) should be designed to meet a safety integrity level that is appropriate for the degree of hazard associated with the process upset. Safety integrity levels per draft IEC 61508 and ANSI/ISA S84.01 are designated in the following table.
Table 1: Safety Integrity Levels

Safety Integrity Level    Availability Required    Probability to Fail on Demand    1/PFD
4 (IEC 61508 only)        > 99.99%                 1E-005 to 1E-004                 100,000 to 10,000
3                         99.90 - 99.99%           1E-004 to 1E-003                 10,000 to 1,000
2                         99.00 - 99.90%           1E-003 to 1E-002                 1,000 to 100
1                         90.00 - 99.00%           1E-002 to 1E-001                 100 to 10

From the point of SIL selection, the entire lifecycle of the SIS is evaluated for agreement with the SIL. Thus, the SIL is the cornerstone of the SIS design.
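As a rough illustration of how Table 1 is read, the sketch below maps an average probability to fail on demand onto a SIL band and reports the matching availability and 1/PFD. The function names are hypothetical, and treating the lower bound of each band as inclusive is an assumption, since the table does not say which band owns a boundary value.

```python
# SIL bands from Table 1 as (SIL, lower PFD bound, upper PFD bound).
# Assumption: the lower bound is inclusive and the upper bound exclusive.
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_from_pfd(pfd_avg: float) -> int:
    """Return the SIL band for a PFDavg, or 0 if it does not reach SIL 1."""
    if not 0.0 < pfd_avg < 1.0:
        raise ValueError("PFDavg must be a probability between 0 and 1")
    for sil, lower, upper in SIL_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    # Below the SIL 4 band no higher level is defined, so report 4;
    # at or above 1E-001 the design does not reach SIL 1, reported as 0.
    return 4 if pfd_avg < 1e-5 else 0


def availability(pfd_avg: float) -> float:
    """Availability column of Table 1, as a fraction (0.995 means 99.5%)."""
    return 1.0 - pfd_avg


def inverse_pfd(pfd_avg: float) -> float:
    """1/PFD column of Table 1."""
    return 1.0 / pfd_avg


if __name__ == "__main__":
    pfd = 5e-3  # illustrative value only
    print(sil_from_pfd(pfd), availability(pfd), inverse_pfd(pfd))
```

A PFDavg of 5E-003 falls in the SIL 2 row: 99.5% availability and a 1/PFD of 200.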
ADVANTAGES AND DISADVANTAGES OF USING HIPS
Industry is increasingly moving towards utilizing HIPS to reduce flare loading. They are becoming the option of choice to help alleviate the need to replace major portions of the flare system in existing facilities when adding new equipment or units. If the header and flare system must be enlarged, significant downtime is incurred for all of the units that discharge to that header. The relatively low capital cost of HIPS compared to flare system piping upgrades, and the ability to install HIPS without incurring significant additional downtime during a turnaround, make these systems an extremely attractive option. Another benefit is that the process unit will not flare as much as a process unit designed for full flare loading. In some areas of the world, this is becoming important as regulatory agencies place greater restrictions on flaring.
The main disadvantage of HIPS is that these systems are more complex and require that many different components work as designed. The effectiveness of the system is highly dependent on the field design, device testing, and maintenance program. The ability of the HIPS to adequately address overpressure is limited by the knowledge and skill applied in the identification and definition of overpressure scenarios. When a PRV is not installed, the HIPS becomes the "last line of defense," whose failure potentially results in rupture of the vessel or pipeline.
MAKING THE DECISION TO USE HIPS
A decision tree can be utilized to facilitate the use of HIPS in the process industry. Figure 1 is a highly simplified decision tree showing only the key steps in assessing and designing a HIPS.
[Figure 1: Simplified Decision Tree. Decision: Can HIPS be used in lieu of PRV or to reduce flare loading capacity? (Yes/No). Steps: determine local code, standards, or regulatory/enforcement authority requirements; perform hazard assessment to determine credible overpressure scenarios; Safety Requirement Specification (for each scenario, assess and document how the scenario will be mitigated and the availability of the overall safety system); Conceptual Design; Detailed Design; Implementation & Commissioning; operate, maintain, and functional test installed devices for life of plant.]
The first question that must be asked revolves around regulatory and standards issues. Some local codes mandate the use of PRVs, regardless of the industry standards, so make sure local jurisdictional issues are understood. From ASME Code Case 2211, the vessel cannot be exclusively in air, water, or steam service. This requirement is intended to prevent building utility systems (e.g. residential boilers) from being installed without PRVs.
Once the local regulations and standards are understood, a hazard assessment must be performed to determine the credible overpressure scenarios. During the hazard assessment, analyze each scenario thoroughly. If any scenario is determined to be noncredible during the assessment, make sure the documentation provides adequate justification. Remember that the flare system most likely will not be able to handle your noncredible event if it turns out to be credible and happens.
A safety requirement specification (SRS) should be developed to address the various overpressure scenarios. The SRS will describe the specific actions required to mitigate each scenario. When assessing the performance of HIPS, examine the process dynamics carefully to make sure that the instrumented system can respond fast enough to the event to prevent the overpressure of the vessel. In addition to the safety functional requirements, the SRS also includes the documentation of the safety integrity requirements, including the safety integrity level (SIL) and anticipated testing frequency.
Typically, the high availability requirements for HIPS drive the choices made concerning component integrity, component redundancy, common cause concerns, diagnostic requirements, and testing frequency. The conceptual design or basis of design document must specify exactly how the HIPS will be configured to achieve the necessary availability.
To document the "as safe or safer" case and compliance with the target SIL, the design of any HIPS should be quantitatively verified to ensure it meets the required availability. Quantitative verification of SIL for HIPS is the generally accepted approach for most companies utilizing HIPS. This is because the quantitative technique is the most defensible from a legal standpoint. A draft guidance report by ISA, ISA dTR84.02 (10, 11, 12, 13, 14), recommends use of one of the following methods for SIL verification:
1. Markov Models
2. Fault Tree Analysis (FTA)
3. Simplified Methods
Any of these techniques can be utilized to determine whether the design meets the required SIL. If it does not meet the required SIL, the design must be modified until it does.
Detailed design and implementation/commissioning activities must be performed within the bounds of the safety requirements specification and the conceptual design. Any deviations from these documents must be evaluated for impact on the safety integrity level and on any assumptions made with regard to performance.
Finally, the HIPS must be operated, maintained and tested throughout the life of the plant. The high integrity of HIPS is often achieved through the use of frequent testing. Once the required testing frequency is documented in the SRS, it must be done. If the SRS says that the testing occurs at a 6 month interval, it must be done at 6 months, not one year.
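To see why a slipped test interval matters, the sketch below uses the common single-device approximation PFDavg = (undetected dangerous failure rate) x (testing interval) / 2. That approximation, the function name, and the failure rate are illustrative assumptions and are not taken from this paper.

```python
HOURS_PER_MONTH = 730.0  # 8760 hours per year / 12


def pfd_avg_single_device(lambda_du_per_hour: float, test_interval_hours: float) -> float:
    """Approximate PFDavg of one device tested every test_interval_hours.

    Assumes a constant undetected dangerous failure rate and that the
    product of rate and interval is small (rare event approximation).
    """
    return lambda_du_per_hour * test_interval_hours / 2.0


if __name__ == "__main__":
    lambda_du = 4e-6  # hypothetical failure rate, failures per hour
    for months in (6, 12):
        pfd = pfd_avg_single_device(lambda_du, months * HOURS_PER_MONTH)
        print(f"test every {months:2d} months -> PFDavg = {pfd:.3e}")
```

With this hypothetical failure rate, testing at 6 months keeps the PFDavg below 1E-002, while letting the interval slip to 12 months doubles it and pushes it above 1E-002, i.e. out of the SIL 2 range.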
CONCLUSIONS
Care must be taken in any decision to implement HIPS. The use of HIPS should be generally restricted to the reduction of relief and flare loading in existing facilities. The use of an instrumented system should not be the only justification for reducing the pressure relieving requirements on individual pieces of equipment. Any justification should be thoroughly documented through a hazard analysis, which identifies all potential overpressure scenarios and the consequences of those scenarios. A SIL appropriate to the risk should be selected and the design should be validated for adherence to this SIL.
All of the regulatory and standards issues boil down to a few simple rules:
• Specific regulatory and enforcement jurisdiction requirements must be determined. In some instances, approval of local authorities is required.
• Regulatory and standards requirements must be understood by all parties, including management, I&E, operations, and maintenance.
• Detailed hazard assessment must be performed to demonstrate that the HIPS solution can adequately address all credible overpressure scenarios.
• The User must verify that HIPS will work from a process standpoint (i.e., can the valves shut in time to prevent pressure wave propagation?).
• The availability of the HIPS must be as good as or better than the availability of the "passive" mechanical device it replaces.
• The User must understand the importance of application-specific design aspects, as well as the associated costs of the intensive testing and maintenance program whenever a HIPS is utilized.
• Finally, there is no "approved" rubber stamp in any regulation or standard for the use of HIPS for replacement of relief devices on pressure vessels or pipelines. Substantial cautionary statements are made in all of the regulations and standards, concerning the use of HIPS. No matter what documentation is created, the User still has the responsibility to provide a safe and environmentally friendly operation.
REFERENCES
1. "Guide for Pressure-Relieving and Depressurizing Systems," API 521, Fourth Edition, American Petroleum Institute, March 1997.
2. "Pressure Vessels with Overpressure Protection by System Design," Section VIII, Divisions 1 and 2, ASME Code Case 2211, The 1995 Boiler Pressure Vessel Code, American Society of Mechanical Engineers, 1995.
3. "Application of Safety Instrumented Systems for the Process Industries," ANSI/ISA-S84.01-1996, ISA, Research Triangle Park, NC, 1996.
4. IEC 61508, 65A/255/CDV, "Functional safety of electrical/electronic/programmable electronic safety related systems," Parts 1, 3, 4, and 5, International Electrotechnical Commission, Final Standard, December 1998.
5. IEC 61508, 65A/255/CDV, "Functional safety of electrical/electronic/programmable electronic safety related systems," Parts 2, 6, and 7, International Electrotechnical Commission, Final Draft International Standard, January 1999.
6. "Process Safety Management of Highly Hazardous Chemicals; Explosives and Blasting Agents," 29 CFR Part 1910, OSHA, Washington, 1992.
7. "Risk Management Programs for Chemical Accidental Release Prevention," 40 CFR Part 68, EPA, Washington, 1996.
8. Ford, K.A. and Summers, A.E., "Are Your Instrumented Safety Systems up to Standard?," Chemical Engineering Progress, 94, pp. 55-58, November, 1998.
9. Summers, A.E., "Techniques for assigning a target safety integrity level," ISA Transactions, 37, pp. 95-104, 1998.
10. "Safety Instrumented Systems (SIS)-Safety Integrity Level (SIL) Evaluation Techniques, Part 1: Introduction," TR84.0.02, Draft, Version 4, March 1998.
11. "Safety Instrumented Systems (SIS)-Safety Integrity Level (SIL) Evaluation Techniques, Part 2: Determining the SIL of a SIS via Simplified Equations," TR84.0.02, Draft, Version 4, March 1998.
12. "Safety Instrumented Systems (SIS)-Safety Integrity Level (SIL) Evaluation Techniques, Part 3: Determining the SIL of a SIS via Fault Tree Analysis," TR84.0.02, Draft, Version 3, March 1998.
13. "Safety Instrumented Systems (SIS)-Safety Integrity Level (SIL) Evaluation Techniques, Part 4: Determining the SIL of a SIS via Markov Analysis," TR84.0.02, Draft, Version 4, March 1998.
14. "Safety Instrumented Systems (SIS)-Safety Integrity Level (SIL) Evaluation Techniques, Part 5: Determining the PFD of SIS Logic Solvers via Markov Analysis," TR84.0.02, Draft, Version 4, April 1998.
Viewpoint on ISA TR84.0.02 - Simplified Methods and Fault Tree Analysis
Angela E. Summers, Ph.D., P.E. President, SIS-TECH Solutions, LLC
PMB-295, 2323 Clear Lake City Blvd Houston, Texas 77062-8032 713-320-4777 (phone) · 281-461-8109 (fax) www.SIS-TECH.com
Accepted for publication in ISA Transactions
KEYWORDS
Safety Integrity Level, SIL, Safety Instrumented System, SIS, ANSI/ISA-S84.01-1996, IEC 61508, ISA-dTR84.0.02
ABSTRACT
ANSI/ISA-S84.01-1996 and IEC 61508 require the establishment of a safety integrity level for any safety instrumented system or safety related system used to mitigate risk. Each stage of design, operation, maintenance, and testing is judged against this safety integrity level. Quantitative techniques can be used to verify whether the safety integrity level is met. ISA-dTR84.0.02 is a technical report under development by ISA, which discusses how to apply quantitative analysis techniques to safety instrumented systems. This paper discusses two of those techniques: 1) simplified equations and 2) fault tree analysis.
INTRODUCTION
In 1996, ISA, the international society for measurement and control, voted unanimously for the approval of ISA-S84.01. In 1997, the standard was accepted by the American National Standards Institute (ANSI) and is now known as ANSI/ISA-S84.01-1996 (1). This standard is considered by the U.S. Environmental Protection Agency (EPA) and Occupational Safety and Health Administration (OSHA) as a generally accepted good industry practice (2,3). Any U.S. based instrumented systems specified after March 1997 should be designed in compliance with this standard.
Internationally, IEC 61508, "Functional Safety of Electrical/Electronic/Programmable Electronic (E/E/PES) Safety-Related Systems," (4,5) is getting very close to being released as a final standard. The standard consists of seven parts, four of which have already been issued as final and three of which are awaiting the final vote on the final draft international standard. The intent is to release the entire standard as final in early 2000. Instrumented systems designed in the next millennium must comply with this standard, with the exception of U.S. installations, which must follow ANSI/ISA-S84.01-1996.
Both standards are performance-based and contain very few prescriptive requirements. The "performance" of the safety instrumented system (SIS) is based on a target safety integrity level (SIL) that is defined during the safety requirements specification development (6). According to the standards, the ability of the SIS to achieve a specific SIL must be validated at each stage of design and prior to any change made to the design after commissioning. The entire operation, testing, and maintenance procedures and practices are also judged for agreement with the target SIL. Thus, the successful implementation of a validation process for SIL is very important for compliance with either standard.
The SP84 committee is working to complete a technical report, ISA-dTR84.0.02, which will discuss three techniques for the quantification of SIL. These methods are Simplified Equations (8), Fault Tree Analysis (9), and Markov Modeling (11). The technical report introductory material states that the purpose of dTR84.0.02 is to provide supplemental information that would assist the User in evaluating the capability of any given SIS design to achieve its required SIL and to reinforce the concept of the performance based evaluation of SIS. The technical report further states that the quantification of the SIL is performed to ensure that the SIS meets the SIL required for each safety function, to understand the interactions of all the safety functions, and to understand the impact of failure of each component in the SIS. Therefore, the technical report emphasizes the importance of evaluating the SIS design (7).
The technical report also acknowledges the importance of spurious trip rate to the operation of the facility. Spurious trips are often not without incident. There is a process disruption; alarms sound; and PRVs lift, causing flares many meters high. Consequently, the technical report presents the mathematics involved in determining the spurious trip rate. When viewing the calculations presented and interpreting the results, it is important to understand that the spurious trip rate is a frequency with units of failures per unit of time, and the SIL is a probability, i.e., a dimensionless number.
ISA-dTR84.0.02 presents three quantitative methods: 1) Simplified Equations, 2) Fault Tree Analysis, and 3) Markov modeling. The technical report is not a comprehensive textbook or treatise on any of the methods. All of the parts assume that the User of the technical report has a basic understanding of probabilistic theory and the method being presented. It also assumes that the User knows how to obtain and evaluate the appropriateness of the data for a specific application. The intent of the technical report is to provide guidance on how to apply this knowledge to safety instrumented systems.
Many Users will choose to use Simplified Equations for an initial estimation of the PFDavg for various design options. It may also be used to evaluate SIL 1 and SIL 2 systems where the architecture is sufficiently simple for the hand calculations. For SIL 3 systems, the complexity of the design often makes the Simplified Equations not so simple to use. Therefore, the technical report recommends the use of Simplified Equations for "simple SISs."
For more complex SISs, Fault Tree Analysis or Markov modeling is recommended. Fault Tree Analysis is widely used by the general risk assessment industry for defining the frequency or probability of particular incident scenarios. The calculations can be done by hand, but since computer software models are readily available, most Fault Tree Analysis is performed using a computer program.
Many risk analysts are not familiar with Markov modeling, and the fundamental math behind the method will be a rude awakening to those Users who have forgotten how to do matrix math or how to solve Laplace Transforms. However, Markov modeling should be used for the evaluation of any programmable logic solver (11), since Markov modeling can take into account time dependent failures and variable repair rates found in most TUV Class 5 and 6 certified logic solvers. It is best to leave the Markov modeling to the Vendor and ask the Vendor for the PFDavg at the anticipated logic solver testing frequency. Users should focus instead on learning how to apply Simplified Equations and Fault Tree Analysis to evaluate the field design, including the input and output devices and support systems.
DETERMINING SIL OF A SIS VIA SIMPLIFIED EQUATIONS (8)
The Simplified Equation technique involves determining the PFDavg for the field sensors (FS), logic solver (LS), final elements (FE), and support systems (SS). The field sensors are the inputs required to detect the hazardous condition. The logic solver accepts these inputs and generates correct outputs that change the state of the final elements in order to mitigate the hazardous condition. The support systems are those systems that are required for successful functioning of the SIS. If the valves are air-to-move, the instrument air supply must be analyzed. If the SIS is energize-to-trip, the power supply must be considered as part of the SIS. Once the individual PFDs for each input, logic solver, output, and support system are known, these PFDs are summed to obtain the PFD of the SIS:

$$PFD_{SIS} = \sum PFD_{FS} + \sum PFD_{LS} + \sum PFD_{FE} + \sum PFD_{SS}$$
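A minimal sketch of this summation is given below. The function name and every numeric value are hypothetical and serve only to show how the subsystem contributions add up.

```python
from typing import Iterable


def pfd_sis(
    field_sensors: Iterable[float],
    logic_solvers: Iterable[float],
    final_elements: Iterable[float],
    support_systems: Iterable[float],
) -> float:
    """Sum the PFDavg of every subsystem that must work for the SIS to act."""
    return (
        sum(field_sensors)
        + sum(logic_solvers)
        + sum(final_elements)
        + sum(support_systems)
    )


if __name__ == "__main__":
    total = pfd_sis(
        field_sensors=[1.0e-3],    # e.g. a voted pressure transmitter group
        logic_solvers=[5.0e-5],    # value obtained from the Vendor
        final_elements=[4.0e-3],   # e.g. a block valve group
        support_systems=[1.0e-4],  # e.g. instrument air for air-to-move valves
    )
    print(f"PFD of the SIS = {total:.2e}")
```

In this made-up case the final elements dominate the total, which is a common reason to examine the output devices and their testing first when a design misses its target.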
The Simplified Equations used for calculating the PFDavg were initially derived from Markov models; however, the simplification of the models resulted in some limitations. Unlike Markov models, this method does not handle time dependent failures or sequence dependent failures. Due to these limitations, this method should not be used to analyze programmable logic solvers.
Part 2 includes equations for 1oo1, 1oo2, 1oo3, 2oo2, 2oo3, and 2oo4 architectures. These equations have been derived from Markov models, assuming the rare event approximation. The rare event approximation can only be used when the failure rate ($\lambda$) multiplied by the testing interval (TI) is much smaller than 0.1. This can be stated mathematically as $\lambda \cdot TI \ll 0.1$. The Simplified Equations result in the calculation of the PFDavg for each voting configuration. The extended equations do include some variables for which published data is not available. These variables must be estimated from experience. Consequently, an experienced risk analyst and/or engineer is required for correct estimation of these variables. For instance, the equation for the 1oo2 architecture is as follows:
The first term is the undetected dangerous failure of the SIS. It shows the effect that the device undetected dangerous failure rate ($\lambda^{DU}$) and testing interval (TI) have on the PFDavg. This term is the most important part of the equation in determining the unavailability of the SIS. This term is actually simplified from the full Markov solution.
In explanation, the beta ($\beta$) factor method is a technique that can be used to estimate common cause failure effects on the SIS design. The $\beta$ factor is estimated as a percentage of the failure rate of one of the devices in a redundant configuration, assuming both devices have the same failure rate (note third term above). Therefore, the common cause failure rate or dependent failure rate would be $\beta \cdot \lambda^{DU}$ and the device failure rate or independent failure rate would be $(1-\beta) \cdot \lambda^{DU}$. For the purposes of Part 2, $(1-\beta)$ was considered to be equal to 1, yielding conservative results. For large $\beta$ factors, $(1-\beta)$ should be considered, which would yield the following equation for a 1oo2 architecture:
The published data in OREDA (12), CCPS (13), and RAC (14) sometimes provide the undetected dangerous failure rate; however, many times, only a total dangerous failure rate is published. If only the total dangerous failures are known, the User must make an assumption concerning the percentage of the total dangerous failures that can be detected with diagnostics. If the percentage is not known, the total dangerous failures can be used to obtain a conservative estimate of the PFDavg.
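The split described above can be sketched as follows. The function name and the parameter `detected_fraction` (the assumed share of dangerous failures that diagnostics reveal) are hypothetical labels introduced here. Leaving the fraction at zero reproduces the conservative case in which the total dangerous failure rate is used as the undetected rate.

```python
from typing import Tuple


def split_dangerous_rate(
    lambda_d: float, detected_fraction: float = 0.0
) -> Tuple[float, float]:
    """Split a total dangerous failure rate into (detected, undetected) parts.

    detected_fraction is the User's assumption about how much of the
    dangerous failure rate is caught by diagnostics; 0.0 is conservative.
    """
    if not 0.0 <= detected_fraction <= 1.0:
        raise ValueError("detected_fraction must lie between 0 and 1")
    lambda_dd = detected_fraction * lambda_d
    lambda_du = lambda_d - lambda_dd
    return lambda_dd, lambda_du


if __name__ == "__main__":
    # Hypothetical database value: 1.0E-5 dangerous failures per hour.
    print(split_dangerous_rate(1.0e-5, detected_fraction=0.6))
    print(split_dangerous_rate(1.0e-5))  # conservative: all undetected
```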
The second term is the probability of having a second undetected failure ($\lambda^{DU}$) during the repair of a detected failure ($\lambda^{DD}$). The numerical value of this term is generally very small, since the repair time (MTTR) is typically less than 24 hours. Consequently, this term often can be considered negligible.
The third term represents the probability of common cause failure based on the beta factor method.
The beta factor must be estimated by the User, since there is almost no published data available for current technology. The technical report states that the value is somewhere between 0 and 20%. Many Users have determined that, with proper design practices (15), a beta factor in the range of 0.1 to 2% can be used. The beta factor has a profound effect on the PFDavg obtained for redundant architectures, so it must be selected carefully. For initial comparisons of architecture and testing frequency, it is best to assume that this term is negligible. Effective design can minimize common cause failure. However, if an analysis of the design indicates that common cause failures can occur, such as shared process taps or a shared orifice plate, a beta factor should be selected and included in the final calculation.
The fourth term is the probability of systematic failure. Systematic failures are those failures that result due to design and implementation errors. Systematic failures are not related to the hardware failure. Examples of systematic failures are as follows:
1) SIS design errors
2) Hardware implementation errors
3) Software errors
4) Human interaction errors
5) Hardware design errors
6) Modification errors
The systematic failure rate ($\lambda^{D}_{F}$) is extremely difficult to estimate. Also, many of the listed systematic failures will affect all of the architectures equally. If software design is poor, it does not matter whether there is one, two or three transmitters. This term also assumes that the systematic failures can be diagnosed through testing. Therefore, effective design, independent reviews, and thorough testing processes must be implemented to minimize the probability of systematic failures. When good engineering design practices are utilized, these failures can be considered negligible.
Based on the repair time being short and on the common cause and systematic failures being minimized through good design practices, these terms can be neglected yielding the following equation:
Similar reduced equations are provided for 1oo1, 1oo2, 1oo3, 2oo2, 2oo3, and 2oo4 architectures.
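The four terms discussed above for the 1oo2 (one-out-of-two) architecture can be collected into one function, as sketched below. The coefficients used (TI squared over 3 for the first term, TI over 2 for the common cause and systematic terms) are the forms commonly quoted for the Part 2 equation and should be read as assumptions here, because the equation itself is not reproduced in this text. Function names and all numeric inputs are illustrative.

```python
from typing import Dict


def pfd_avg_1oo2_terms(
    lambda_du: float,   # undetected dangerous failure rate, per hour
    lambda_dd: float,   # detected dangerous failure rate, per hour
    ti: float,          # testing interval, hours
    mttr: float,        # mean time to repair, hours
    beta: float = 0.0,  # common cause (beta) factor, as a fraction
    lambda_df: float = 0.0,  # dangerous systematic failure rate, per hour
) -> Dict[str, float]:
    """Return each assumed term of the 1oo2 equation and their total."""
    if lambda_du * ti >= 0.1:
        # The rare event approximation requires rate x interval << 0.1.
        raise ValueError("rare event approximation does not hold")
    terms = {
        "undetected": (lambda_du ** 2) * (ti ** 2) / 3.0,
        "repair": lambda_du * lambda_dd * mttr * ti,
        "common_cause": beta * lambda_du * ti / 2.0,
        "systematic": lambda_df * ti / 2.0,
    }
    terms["total"] = sum(terms.values())
    return terms


def pfd_avg_1oo2_reduced(lambda_du: float, ti: float) -> float:
    """Reduced form: repair, common cause and systematic terms neglected."""
    return (lambda_du ** 2) * (ti ** 2) / 3.0


if __name__ == "__main__":
    result = pfd_avg_1oo2_terms(
        lambda_du=2e-6, lambda_dd=3e-6, ti=8760.0, mttr=24.0, beta=0.02
    )
    for name, value in result.items():
        print(f"{name:>12}: {value:.3e}")
```

With these illustrative inputs the common cause term (about 1.75E-4) is larger than the independent undetected term (about 1.02E-4), even at a beta factor of only 2%, while the repair term (about 1.3E-6) is small by comparison.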
DETERMINING SPURIOUS TRIP RATE VIA SIMPLIFIED EQUATIONS
For the spurious trip rate, the full equation for 1oo2 is as follows:

$$STR = \left[2\left(\lambda^{S} + \lambda^{DD}\right)\right] + \left[\beta\left(\lambda^{S} + \lambda^{DD}\right)\right] + \lambda^{S}_{F}$$

The first term contains the failures associated with a device experiencing either a dangerous detected failure, which forces the logic to the trip state, or a safe failure. Due to spurious trip concerns, many Users choose to fail a detected device failure "away" from the trip. This converts the logic to 1oo1 for the remaining device until repair is initiated. If this type of logic is utilized, the dangerous detected failure rate contribution to the spurious failure rate can be assumed to be zero.
The second term is the common cause term and the third term is the systematic failure rate. Effective design and good engineering techniques should minimize both of these terms. The equation can then be reduced to the following:

$$STR = 2\lambda^{S}$$
Similar reduced equations can be derived for the other architectures.
Once the STR is known for each combination of field sensors, logic solver, final elements, and support systems, the overall STR is calculated by summing the individual STRs. The final answer is the frequency at which the SIS is expected to experience a spurious trip.
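The spurious trip calculation for a 1oo2 (one-out-of-two) group can be sketched as below, following the three terms described above: device failures, common cause, and systematic failures. The function names, the flag `detected_failures_trip`, and all failure rates are hypothetical. Setting the flag to False models logic that fails a detected failure "away" from the trip, so the dangerous detected rate drops out.

```python
from typing import Iterable

HOURS_PER_YEAR = 8760.0


def str_1oo2(
    lambda_s: float,          # safe (spurious) failure rate, per hour
    lambda_dd: float = 0.0,   # dangerous detected failure rate, per hour
    beta: float = 0.0,        # common cause (beta) factor, as a fraction
    lambda_sf: float = 0.0,   # safe systematic failure rate, per hour
    detected_failures_trip: bool = True,
) -> float:
    """Spurious trip rate of a 1oo2 voted group, in trips per hour."""
    dd = lambda_dd if detected_failures_trip else 0.0
    device_term = 2.0 * (lambda_s + dd)
    common_cause_term = beta * (lambda_s + dd)
    return device_term + common_cause_term + lambda_sf


def overall_str(subsystem_strs: Iterable[float]) -> float:
    """Overall STR is the sum of the subsystem spurious trip rates."""
    return sum(subsystem_strs)


if __name__ == "__main__":
    sensors = str_1oo2(lambda_s=1e-6)   # reduced form, 2 x safe failure rate
    valves = str_1oo2(lambda_s=2e-6)
    total = overall_str([sensors, valves])
    print(f"{total * HOURS_PER_YEAR:.4f} spurious trips per year")
```

Because the result is a frequency, it is usually easier to read after converting to trips per year, as done in the last line.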
LIMITATIONS OF THE SIMPLIFIED EQUATIONS METHODOLOGY
The published equations in ISA-dTR84.0.02 do not allow the modeling of diverse technologies. The sensors or final elements used in each voting strategy must have the same failure rate. Consequently, this method does not allow the modeling of a switch and a transmitter or a control valve and a block valve. During the derivation of the equations in Part 2 and those shown in Part 5, it was assumed that the failure rates of voted devices were the same. It must be emphasized that this is a limitation of the equations presented in these parts. It is not a limitation of the mathematics of the methodology.
However, a significant limitation of the mathematics is the requirement that the testing frequency be the same for all voted devices. To perform the Markov model derivation, the integration is performed over the range of time 0 to time "testing frequency." Consequently, all devices in a voted set must be tested at the same interval.
The method also does not allow the modeling of any SIS device interactions or complex failure logic, such as 1oo2 temperature sensors detecting the same potential event as 2oo3 pressure sensors. The actual failure logic may be that the event will not occur unless both temperature sensors and 2oo3 pressure sensors fail. This method will only look at the sensor failures as separate issues. Consequently, this method is used to model simple SISs only. However, the math is easy and all this method requires for execution is a pad of paper and a pen (or computer).
DETERMINING SIL OF A SIS VIA FAULT TREE ANALYSIS (9)
Part 3 discusses the use of fault tree analysis for modeling the SIS. Fault tree symbols are used to show the failure logic of the SIS. The graphical technique of Fault Tree Analysis allows easy visualization of failure paths. Since the actual failure logic is modeled, diverse technologies, complex voting strategies, and interdependent relationships can be evaluated. However, Fault Tree Analysis is not readily adaptable to SISs that have time dependent failures. As with Simplified Equations, Fault Tree Analysis is not recommended for modeling programmable logic solvers. The User should obtain the PFDavg for the logic solver from the Vendor at the anticipated logic solver testing frequency.
Fault Tree Analysis is one of the most common techniques applied for quantifying risk in the process industry. Computer programs, books, and courses are available to the User to learn how to apply Fault Tree Analysis. The technical report recommends the use of Fault Tree Analysis in SIL 2 and SIL 3 SIS applications. It does require more training and experience than the Simplified Equations, but will yield more precise results.
The mathematical approach for Fault Tree Analysis is different from Markov model analysis. Fault Tree Analysis assumes that the failures of redundant devices are independent and unconditional. In Fault Tree Analysis, the PFDavg is calculated for each device and then Boolean algebra is used to account for the architecture and voting. Consequently, the equations used for some architectures will be different when Simplified Equations are used rather than Fault Tree Analysis. When the equations are different, of course, the PFDavg value will differ. However, both methods provide acceptable approximations of the PFDavg for the SIS.
A Fault Tree Analysis begins with a graphical representation of the SIS failure. For example, in the 1oo2 voting of two identical devices, the fault tree would look as shown in Figure 1. The failure of the SIS would only occur if both device 1 and device 2 failed. The AND gate is used to illustrate this logic.
Figure 1. Fault Tree for PFDavg for 1oo2 Voting Devices
The data would be collected and used to calculate the PFDavg of each device:

$$PFD_{avg} = \frac{\lambda^{DU} \cdot TI}{2}$$

Boolean algebra, also known as cut-set math, is used to calculate the AND gate. For two identical devices, this yields:

$$PFD_{avg,SIS} = PFD_{avg,1} \times PFD_{avg,2} = \left(\frac{\lambda^{DU} \cdot TI}{2}\right)^{2}$$

Since these calculations are based on the PFDavg for a single device, it is easy to examine cases where the failure rates and testing frequencies of the two devices are not the same. The PFDavg for each event is simply calculated based on its failure rate and testing frequency. These PFDavg values are combined using the cut-set math.
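The sketch below applies the same cut-set arithmetic to two devices that differ in both failure rate and testing interval, which is the case the Simplified Equations cannot represent. The device descriptions, function names, and numbers are hypothetical.

```python
import math


def pfd_avg_device(lambda_du: float, ti: float) -> float:
    """PFDavg of a single device: failure rate x testing interval / 2."""
    return lambda_du * ti / 2.0


def and_gate(*probabilities: float) -> float:
    """AND gate for independent events: the product of the probabilities."""
    return math.prod(probabilities)


if __name__ == "__main__":
    # Hypothetical 1oo2 pair built from diverse technologies.
    switch = pfd_avg_device(lambda_du=5e-6, ti=4380.0)       # tested every 6 months
    transmitter = pfd_avg_device(lambda_du=1e-6, ti=8760.0)  # tested yearly
    print(f"switch      PFDavg = {switch:.3e}")
    print(f"transmitter PFDavg = {transmitter:.3e}")
    print(f"1oo2 pair   PFDavg = {and_gate(switch, transmitter):.3e}")
```

Each basic event carries its own failure rate and testing interval, and the gate combines them without requiring the two devices to match.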
Any of the terms discussed in the Simplified Equations overview can be included in the fault tree as events, such as systematic failure and common cause failure. The 1oo2 voting devices, including common cause, would appear as shown in Figure 2.
Figure 2. PFDavg for 1oo2 Voting Devices With Common Cause Consideration
The independent failure rate contribution would be calculated as follows:
The common cause contribution to the PFDavg would be calculated as follows:
The common cause failure contribution can then be added to the independent failure rate contribution using cut-set math. For rare events, the PFDavg calculations would be as follows:
The systematic failure contribution to the PFDavg can be added in a similar fashion.
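A sketch of the calculation for two identical 1oo2 (one-out-of-two) devices with common cause is given below. It is a reconstruction under stated assumptions rather than the report's own equations: the independent rate is taken as (1 - beta) times the undetected dangerous failure rate, the common cause rate as beta times that rate, each event's PFDavg as rate x TI / 2, and the OR gate is shown both in its rare event form (a simple sum) and in its exact form for independent events. All names and numbers are hypothetical.

```python
def pfd_avg_event(rate: float, ti: float) -> float:
    """PFDavg of a single basic event: rate x testing interval / 2."""
    return rate * ti / 2.0


def or_gate_rare(*probabilities: float) -> float:
    """Rare event approximation of an OR gate: sum of the probabilities."""
    return sum(probabilities)


def or_gate_exact(*probabilities: float) -> float:
    """Exact OR gate for independent events: 1 - product of (1 - p)."""
    survive = 1.0
    for p in probabilities:
        survive *= 1.0 - p
    return 1.0 - survive


def pfd_avg_1oo2_with_common_cause(lambda_du: float, ti: float, beta: float) -> float:
    """Independent AND-gate contribution OR common cause contribution."""
    independent = pfd_avg_event((1.0 - beta) * lambda_du, ti) ** 2
    common_cause = pfd_avg_event(beta * lambda_du, ti)
    return or_gate_rare(independent, common_cause)


if __name__ == "__main__":
    lam, ti, beta = 2e-6, 8760.0, 0.02  # illustrative values only
    independent = pfd_avg_event((1.0 - beta) * lam, ti) ** 2
    common_cause = pfd_avg_event(beta * lam, ti)
    print(f"independent  = {independent:.3e}")
    print(f"common cause = {common_cause:.3e}")
    print(f"rare event   = {or_gate_rare(independent, common_cause):.6e}")
    print(f"exact OR     = {or_gate_exact(independent, common_cause):.6e}")
```

A systematic failure event could be passed to the same OR gate as a third probability. For small probabilities the rare event sum and the exact OR gate agree closely, with the sum always slightly on the conservative side.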
DETERMINING THE SPURIOUS TRIP RATE VIA FAULT TREE ANALYSIS
For the spurious trip rate calculation, the same graphical technique is used, as well as the same cut-set mathematics. However, the equations used to describe the individual events are based on frequencies, not probabilities. For the 1oo2 voting devices, the fault tree is drawn as shown in Figure 3.
Figure 3: Fault tree for Spurious Trip for 1oo2 Voting Devices
The spurious trip rate is calculated as follows:

$$STR = STR_{\text{device 1}} + STR_{\text{device 2}}$$
LIMITATIONS OF THE METHODOLOGY
The derivation methodology for fault tree analysis is different from the Markov derivation methodology used in the other parts of TR84. While not truly a limitation of the methodology, the difference in the PFDavg values for some architectures has resulted in disagreement among TR84 members about the true definition of PFDavg. The difference in the overall results is seldom significant, but the reader is warned that there will be instances where Simplified Equations and Fault Tree Analysis will not yield identical results.
There are three principal benefits associated with using Fault Tree Analysis for SIL verification. First, the graphical representation of the failure logic is easily understood by risk analysts, engineers, and project managers. Second, the method has been used by the process industry for risk assessment for many years, so there is already a resource base within many User companies, as well as among outside consultants. Finally, the availability of software tools to facilitate the calculations improves the quality and precision of the calculation.
CONCLUSIONS
ISA-dTR84.0.02 is intended to provide guidance on how to calculate the SIL of a SIS. Since ISA-dTR84.0.02 is a guidance document, there are no mandatory requirements. The document was not developed to be a comprehensive treatise on any of the methodologies, but was intended to provide assistance on how to apply the techniques to the evaluation of SISs. Each Part expects the User to be familiar with the methodology and suggests that the User obtain additional information and resources beyond that contained in the technical report. The technical report was issued in draft in 1998 and should be released as final in 2000.
Simplified Equations and Fault Tree Analysis are two excellent techniques that can be used together to cost-effectively evaluate SIS designs for SIL. Initial assessment of proposed options for input and output architectures can be performed quickly at various testing frequencies using Simplified Equations. When the overall SIS needs to be evaluated, Fault Tree Analysis is a proven technique that can model even the most complex logic relationships.
ACKNOWLEDGEMENTS
This paper was presented at Interkama, Dusseldorf, Germany, October 1999.
LPS-4A
USING INSTRUMENTED SYSTEMS FOR OVERPRESSURE PROTECTION
By
Dr. Angela E. Summers, PE
SIS-TECH Solutions, LLC Houston, TX
Prepared for Presentation at the 34th Annual Loss Prevention Symposium, March 6-8, 2000
Overpressure Protection Alternative Session
Copyright © SIS-TECH Solutions, LLC
December 1999
Accepted for publication in Chemical Engineering Progress
AICHE shall not be responsible for statements or opinions contained in papers or printed publications.
Common Cause and Common Sense Designing Failure Out of Your SIS
Angela E. Summers, Ph.D. and Glenn Raney PCE, Triconex Corporation
Published in ISA Transactions, Issue 3, 1999.
Abstract
The ANSI/ISA S84.01-1996 and IEC 61508 (draft) standards provide guidelines for the design, installation, operation, maintenance and testing of safety instrumented systems (SIS). As part of the SIS lifecycle design process, the SIS should be evaluated not only for its safety integrity level (SIL), but also for its potential for common cause failure (CCF).
A CCF occurs when a single fault results in the corresponding failure of multiple components, such as a miscalibration error on a bank of redundant transmitters. The frequency of common cause faults is difficult to estimate. The modeling techniques and available failure rate data make the predictive calculations of these failures cumbersome and the results questionable. Therefore, a more meaningful approach for most SIS designers is to eliminate the potential sources of CCF in the SIS design, installation, operation, and maintenance.
The paper will focus on how to identify potential common cause events through the application of industry standards or internal design standards or through the use of qualitative assessment techniques. The identification of these events is extremely important, because it is only after identification that strategies can be developed for eliminating or reducing their likelihood. Fortunately, many of these strategies are as simple as applying a little common sense with some good engineering practice to the SIS design.
Introduction
The new ANSI/ISA S84.01-1996(1) and draft IEC 61508(2) standards establish the concept of the safety lifecycle model for designing safety instrumented systems (SIS). The SIS consists of the instrumentation or controls that are installed for the purpose of mitigating a hazard or bringing the process to a safe state in the event of a process upset. A SIS is used for any process in which the process hazards analysis (PHA) has determined that the mechanical integrity of the process equipment, the process control, and other protective equipment are insufficient to mitigate the potential hazard.
The SIS should be designed to meet the required safety integrity level as defined in the safety requirement specification(1) (safety requirement allocation(2)). Moreover, the SIS design should be performed in a way that minimizes the potential for common mode or common cause failures (CCF). A CCF occurs when a single fault results in the corresponding failure of multiple components. Thus, CCFs can result in the SIS failing to function when there is a process demand. Consequently, CCFs must be identified during the design process and the potential impact on the SIS functionality must be understood.
Unfortunately, there is a great deal of disagreement among the experts on how to define CCF and what specific events comprise a CCF. The following are often cited(2) as examples of common cause faults:
• Miscalibration or no calibration of sensors
• Pluggage of common process taps for redundant sensors
• Incorrect maintenance or no maintenance
• Improper bypassing
• Environmental stress on the field device
• Process fluid or contaminant plugs valve
But the examination of these faults, in light of any SIS design, will indicate that any of these six examples can disable single I/O systems, as well as redundant I/O systems. However, many of the proposed methodologies(2,4) for assessing CCF ignore this fact and only penalize redundant sensors and final elements.
For example, miscalibration of redundant sensors is often cited as an important CCF to consider. By that logic, the use of a single sensor would eliminate the common cause problem, but that doesn't make sense. The miscalibration of a single sensor will cause the SIS to fail just as seriously as the miscalibration of redundant sensors. If the miscalibration is examined from a failure rate standpoint, the following issues would need to be addressed:
• Is the miscalibration a common cause failure? If so, the proposed techniques explicitly account for it only in the case of redundant devices.
• Is this type of failure included in the failure rate data provided in the published databases or in User databases? Miscalibration is already included in the covert, as well as catastrophic, failure rate provided in some published databases.
• Is this a failure that is independent of the device and should be discussed as a separate procedural failure? There are many procedural errors that could be listed, including bypassing, poor maintenance practice, poor testing, etc. The explicit consideration of all of these failures is time-consuming and the failure rate data for these is generally non-existent.
Miscalibration, in its broadest sense, can be anything from a poor calibration procedure (procedural problem) to bad calibration equipment (mechanical problem) to incorrect execution of the calibration procedure (human error). Because of this, the elimination of common cause failure must involve the examination of everything and everyone that interacts with the device.
Of course, the ultimate failure of all is that the safety requirement specification (SRS) is incorrect at the beginning of the design process and the transmitter cannot detect the potential incident. This is the most disastrous common cause failure that directly leads to the hazardous incident that the designer is seeking to prevent. A bad SRS is the ultimate "trump card" for the entire SIS and is one failure that most of the proposed methodologies ignore.
In an effort to ensure that CCFs had been properly addressed in the standard, the IEC 61508 (draft) committee requested an independent evaluation of the current theories on common cause modeling and the availability of failure rate data. This evaluation was performed by Dr. A.M. Wray of the Health and Safety Laboratory, an agency of the Health and Safety Executive. Dr. Wray concluded in a 1996 report(3) to the IEC 61508 committee that "Although IEC 1508 already has mechanisms in place which deal with common-cause failures, it is considered that the current approach is insufficient on its own. It is considered that a more-rigorous qualitative approach, possibly in the form of checklists will make a more viable alternative to modeling."
The IEC 61508 (draft) committee has taken a quantitative approach to Dr. Wray's checklist recommendation by developing, with Dr. Wray's assistance, a methodology for relating specific measures used to reduce potential common cause faults to X or Y factors. These X or Y factors are used to determine the overall beta factor for each component. The beta factor method is the most commonly cited numerical technique for assessing CCF. The beta factor is used in conjunction with the random hardware failure rate to calculate a CCF rate for a redundant set of devices. This checklist methodology has been incorporated into draft ISA TR84.0.02 Part 1 Annex A, where it is referenced to IEC 61508. The proposed methodology is still under development and numerous changes are expected prior to final issuance.
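The arithmetic behind the beta factor can be sketched as follows. This is a minimal illustration of the standard beta-factor model only: it does not implement the X or Y factor scoring of the draft standard, and the function name and example numbers are assumptions chosen for illustration.

```python
from typing import Tuple


def split_failure_rate(failure_rate: float, beta: float) -> Tuple[float, float]:
    """Split a device failure rate into common cause and independent parts.

    Beta-factor model: a fraction `beta` of the random hardware failure
    rate is assumed to fail all devices in the redundant set at once.
    """
    if failure_rate < 0:
        raise ValueError("failure rate must be non-negative")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    common_cause_rate = beta * failure_rate
    independent_rate = (1.0 - beta) * failure_rate
    return common_cause_rate, independent_rate


# Hypothetical device: 0.02 failures per year, with an assumed beta of 5%.
ccf_rate, independent_rate = split_failure_rate(0.02, 0.05)
print(f"CCF rate: {ccf_rate:.4f} per year")
print(f"Independent rate: {independent_rate:.4f} per year")
```

Because the common cause portion is applied only to redundant sets, a calculation of this kind penalizes redundant architectures and leaves single devices untouched.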
While the experts on the standards committee are working to craft a quantitative technique for assessing CCF, the SIS designers of today need a methodology that does not depend upon the definition of various types of failures. Further, the SIS designer needs techniques that can be readily applied at various stages of the SIS design. The numerical techniques cannot be applied until after most design details have been finalized. Qualitative techniques should be established within a corporation, facility, or design team to ensure that a rigorous, comprehensive review is performed on the SIS design. This review can be performed on a proposed SIS design or on an operational, installed SIS. The primary goal of the review is to ensure that adequate measures have been employed to reduce the potential for failure of the SIS, including systematic and common cause failure.
Techniques for Evaluating SISs for Common Cause Failure (CCF)
The choice of the evaluation technique is typically dependent on experience of the User with the particular SIS design. This would include documented historical performance of instruments, installation details, and design engineering teams. Experience with the specific application environment is required, because a device or installation detail that works well in one type of application may not work well in another. For example, standard taps into a vessel for mounting a transmitter may work extremely well in clean service, but may plug very quickly in a service where solids can deposit. Three qualitative techniques are often used to assess SIS designs:
1) Industrial Standards
2) Engineering Guidelines and Standards
3) Qualitative Assessment
An overview of each of the techniques is provided below.
Industrial Standards
ANSI/ISA S84.01-1996(1) provides specific SIS design requirements in the mandatory portion of the document. It also provides guidance in the informative Annexes in the non-mandatory portion of the document. In addition, draft IEC 1508(6) provides specific design requirements for safety related systems. The draft standard provides specific measures and techniques that must be applied. Proposed or installed SIS designs can be assessed for agreement with these specific requirements.
While these standards represent a major step forward for the process industry, no general, broad industry standard can incorporate all of the potential caveats in a specific application. The comparison with standards is important, but it is often insufficiently rigorous to ensure that all potential failures in the SIS design are addressed.
Engineering Guidelines and Standards
To assist the design engineer, many Users develop engineering guidelines and standards (EGS) for the SIS design. The level of detail involved in an EGS is entirely dependent on the commonality involved in the various processes within the User company. The EGS may include approved architectures, device types, vendors, testing frequencies, and installation details. The EGS should address what is considered good engineering and design practice within the User company.
For Users who have many of the same types of process units, the EGS may be extended to include application standards that list specific architectures, voting, devices, and installation details for each safety function. For example, for a process furnace in a refinery, the trip for low fuel gas pressure may be completely specified in the standard, from the use of 2oo3 voting pressure transmitters to a double block and bleed. The architecture description could be enhanced with installation details showing accepted practice for transmitter installation and provisions for maintenance bypassing and testing.
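To make the voting term concrete, the following sketch shows one way a two-out-of-three low-pressure vote might be expressed: the trip is demanded when at least two of the three transmitters read below the trip set point. The function name, the set point, and the pressure values are hypothetical and are not taken from any standard.

```python
from typing import Sequence


def vote_2oo3_low_pressure(readings: Sequence[float], trip_setpoint: float) -> bool:
    """Return True when at least two of three transmitters vote to trip.

    A transmitter votes to trip when its reading is below the set point.
    """
    if len(readings) != 3:
        raise ValueError("2oo3 voting requires exactly three readings")
    votes = sum(1 for reading in readings if reading < trip_setpoint)
    return votes >= 2


# Hypothetical fuel gas pressures (psig) against an assumed 5.0 psig set point.
print(vote_2oo3_low_pressure([4.0, 4.1, 9.0], 5.0))  # two low readings
print(vote_2oo3_low_pressure([4.0, 9.0, 9.2], 5.0))  # one low reading
```

With this arrangement a single transmitter reading low does not trip the furnace, and a single transmitter stuck high does not prevent the trip.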
The proposed or installed SIS design can be compared to these internal standards. Deviation from the internal standard can be corrected through revised design or justified through documentation that addresses why this specific application has different requirements. Generally, internal standards are an excellent way to address SIS design, since the User can account for its particular application environment and risk tolerance. The reality is that many Users find it difficult to get agreement within their own company as to what is an acceptable design. After all, someone always seems to have a way to improve on the previous design. There must be a strong internal champion for the EGS to be developed. There must also be a strong ally in upper management to support the auditing process that will be required to ensure that the EGS are used.
Qualitative Techniques
Qualitative techniques have been used for many years to assess risk in process units. These techniques require experts who have extensive experience with the process. Typical hazard identification techniques include checklists, what-if analysis, hazard and operability studies (HAZOP), and failure mode and effect analysis. Each of these techniques has distinct advantages and disadvantages.
Any of the techniques can be modified for use in assessing the SIS. Of the qualitative assessment techniques, the checklist is the most easily adaptable to SIS design evaluation. In fact, checklists are incorporated into many international standards, such as API 14C for the design of safety systems on off-shore platforms.
Checklists
Checklists are simply a list of questions that are answered with "yes," "no," or "not applicable" responses. A checklist analysis will identify specific hazards, deviations from standards, design deficiencies and potential incidents through comparison of the design to known expectations, which have been expressed in the checklist questions.
Checklists have, historically, been used to improve human reliability with respect to design and to ensure compliance with various regulations and engineering standards. Where the quantitative analysis is done after the P&IDs are mostly complete, the checklist technique can be applied at any stage of design, e.g. conceptual design, detailed design, or field construction. Checklists can be established for SIS evaluation in general or can be developed for specific applications. Checklists provide the simplest method for the identification of design inadequacies.
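A checklist of this kind maps naturally onto a small data structure. The sketch below is one possible representation, assuming every question is phrased so that a "no" answer indicates a deviation to follow up; the class and function names are hypothetical, and the answers shown are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

VALID_ANSWERS = {"yes", "no", "na"}


@dataclass
class ChecklistItem:
    area: str      # checklist section, e.g. "Sensors"
    question: str  # the checklist question as written
    answer: str    # "yes", "no", or "na"


def open_findings(items: List[ChecklistItem]) -> List[ChecklistItem]:
    """Return the items answered "no", in checklist order."""
    findings = []
    for item in items:
        answer = item.answer.strip().lower()
        if answer not in VALID_ANSWERS:
            raise ValueError(f"unexpected answer {item.answer!r} for: {item.question}")
        if answer == "no":
            findings.append(item)
    return findings


# Illustrative questions with made-up answers.
items = [
    ChecklistItem("Sensors", "Is sensor redundancy employed?", "yes"),
    ChecklistItem("Sensors", "Does each sensor have a dedicated process tap?", "no"),
    ChecklistItem("Logic Solver", "Is the logic solver a fault-tolerant device?", "na"),
]
for finding in open_findings(items):
    print(f"[{finding.area}] {finding.question}")
```

The sketch treats every "no" answer alike and applies no weighting, which mirrors the qualitative nature of the technique.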
Description of Checklist Development
To generate an effective checklist for SIS, the checklist should address all phases of the SIS lifecycle. It should go beyond simple evaluation of the SIS design and should incorporate, but not be limited to, the following:
• The design process as applied to defining the SIS.
• The actual design of the SIS, including instrument selection, electrical connections and conduit runs, process connections, communications, etc.
• The construction process, such as quality installation, loop verification, and conformance with installation standards.
• The operating environment for the SIS, such as electrical, communications, mechanical, and civil-structural.
• The start-up of the process, such as permissive verification (valve positioning) or sequence of events (purging or evacuation).
• The operation of the SIS, such as operating according to established procedures or performing on-line activities, including set point changes, calibration, or testing, per adequate administrative (security) controls.
• The maintenance of the SIS, such as developing and using instrument specific maintenance procedures, using bypasses appropriately and performing on-line maintenance activities without process risk.
• The testing of the SIS, such as development of testing procedures, verification that testing was performed correctly, auditing to ensure that any required maintenance was performed and that the device was returned to a fully operational state.
The development of a good checklist requires knowledgeable, experienced engineers and technical personnel. Since the checklist follows the complete lifecycle model, checklists should be developed by a team that has substantial experience at all phases of the lifecycle model. Once developed, checklists should be periodically evaluated for completeness and for compliance with the latest, emerging industry standards.
Checklists are qualitative in nature and do not typically provide any numerical measure of the potential for common cause faults due to a particular question. Moreover, the checklist method does not provide a ranking of the importance of one question to another, so all issues are treated equally. The most significant disadvantage of the checklist is its inability to look at interdependencies that could result in SIS failure, for example, an Operator attempting to start a purge sequence on a batch reactor while Maintenance has the valves in bypass.
While all of these areas cannot be addressed in complete detail in this paper, an example of a checklist is provided at the end of the paper. This checklist was developed based on the following key areas:
• Engineering Design
• Safety Requirement Specification(1) (Safety Requirement Allocation(2))
• Conceptual Design(1) (Safety Requirement Realisation(2))
• Detail Design
• Application Software Design
• SIS Components
• Logic Solver
• User Interface
• Sensors
• Actuators
• Final Elements
• Process Connections
• Electrical Connections/Conduit/Wire-tray/Junction Boxes
• Electrical Power
• Pneumatic Supply
• Hydraulic Supply
• Environmental
• Manufacturer's specifications or tolerances
• Operating specifications
• Operation
• Installation/Maintenance
• Installation
• Inspection
• Testing
• Maintenance
• Training
• Modification
Conclusions
The new SIS design standards, ANSI/ISA S84.01 and draft IEC 1508, have changed the rules for the design, operation, maintenance, and testing of safety instrumented systems. Today's SIS designer must be prepared to defend the selected SIS design. The Operation and Maintenance staff of today must justify how the SIS is operated, when it is bypassed, when it is tested, and when maintenance is performed. These new rules require a change of focus for everyone associated with these systems. The change of focus will result in a significant learning curve for many engineers. In order to make this transition as smooth as possible, rigorous qualitative techniques should be employed to provide a method for assessing these systems for adequacy. Clear communication of the requirements for effective SIS design is essential. All procedures, internal guidelines, and checklists must be evaluated by all personnel involved to ensure a compliant and cost-effective SIS.
References
1) ANSI/ISA-S84.01-1996 "Application of Safety Instrumented Systems for the Process Industries," Instrument Society of America S84.01 Standard, Research Triangle Park, NC 27709, February 1996.
2) IEC 1508, 65A/255/CDV, "Functional safety of electrical/electronic/programmable electronic safety related systems - Part 6: Guidelines to the application of parts 2 and 3," International Electrotechnical Commission, Draft, April 13, 1998.
3) "Common-Cause Failures in Relation To Programmable Electronic Systems Used for Protection," A. M. Wray, Health and Safety Laboratory, Report to IEC 1508 Committee, August 1996.
4) ISA TR84.0.02, "Safety Instrumented Systems (SIS)-Safety Integrity Level (SIL) Evaluation Techniques, Part 1: Introduction," Version 4, Draft, March 1998.
Example Checklist
Engineering/Design
Safety Requirement Specification (SRS)
Have individuals involved in developing the SRS been trained to understand the consequences of common-cause failures? ☐ Yes ☐ No ☐ NA
Was the SRS reviewed by members of the PHA or SIL assignment team? ☐ Yes ☐ No ☐ NA
Was the SRS checked against known standards? (Corporate, domestic and/or international) ☐ Yes ☐ No ☐ NA
Has the safety integrity level been assigned qualitatively or quantitatively for each safety function? ☐ Yes ☐ No ☐ NA
Was the SRS reviewed by an independent assessor? ☐ Yes ☐ No ☐ NA
Conceptual Design
Have individuals involved in developing the conceptual design been trained to understand the consequences of common-cause failures? ☐ Yes ☐ No ☐ NA
Was the conceptual design verified for compliance with the SRS? ☐ Yes ☐ No ☐ NA
Was the conceptual design checked against known standards? ☐ Yes ☐ No ☐ NA
Has the safety integrity level been verified qualitatively or quantitatively for each safety function? ☐ Yes ☐ No ☐ NA
Was the conceptual design reviewed by an independent assessor? ☐ Yes ☐ No ☐ NA
Detail Design
Have individuals involved in developing the detail design been trained to understand the consequences of common-cause failures? ☐ Yes ☐ No ☐ NA
Was the detail design developed in accordance with the SRS? ☐ Yes ☐ No ☐ NA
Was the detail design checked against known standards? ☐ Yes ☐ No ☐ NA
Are design reviews carried out which include the identification and elimination of common-cause failures? ☐ Yes ☐ No ☐ NA
Has the safety integrity level been verified qualitatively or quantitatively for each safety function? ☐ Yes ☐ No ☐ NA
Was the detail design reviewed by an independent assessor? ☐ Yes ☐ No ☐ NA
Application Software
Have individuals involved in developing the application software been trained to understand the consequences of common-cause failures? ☐ Yes ☐ No ☐ NA
Is the final program checked against the SRS? ☐ Yes ☐ No ☐ NA
Is the final program verified through factory acceptance testing that includes fault simulation? ☐ Yes ☐ No ☐ NA
Is the final program verified through complete site acceptance testing that includes verification of startup, operation, and testing algorithms? ☐ Yes ☐ No ☐ NA
Safety Instrumented System Components
Logic Solver
Does the logic solver have methods to protect against fail-dangerous faults? ☐ Yes ☐ No ☐ NA
Is the logic solver a fault-tolerant device? ☐ Yes ☐ No ☐ NA
Is the logic solver separated from the Basic Process Control System? ☐ Yes ☐ No ☐ NA
Are all SIS functions combined in a single logic solver? ☐ Yes ☐ No ☐ NA
Is the logic solver TÜV certified for the application? ☐ Yes ☐ No ☐ NA
Is the application software protected from unauthorized changes? ☐ Yes ☐ No ☐ NA
Operator Interface
Is the SIS operation consistent with existing systems and operator experience? ☐ Yes ☐ No ☐ NA
Is adequate information about normal and upset conditions displayed? ☐ Yes ☐ No ☐ NA
Do separate displays present consistent information? ☐ Yes ☐ No ☐ NA
Are critical alarms obvious to an operator? ☐ Yes ☐ No ☐ NA
Are related displays and alarms grouped together? ☐ Yes ☐ No ☐ NA
Sensors
Have instrument specification sheets been verified by another party? ☐ Yes ☐ No ☐ NA
Is sensor redundancy employed? ☐ Yes ☐ No ☐ NA
If identical redundancy is employed, has the potential for CCF been adequately addressed? ☐ Yes ☐ No ☐ NA
Are redundant sensors adequately physically separated? ☐ Yes ☐ No ☐ NA
Does each sensor have dedicated wiring to the SIS I/O modules? ☐ Yes ☐ No ☐ NA
Does each sensor have a dedicated process tap? ☐ Yes ☐ No ☐ NA
Does the configuration allow each sensor to be independently proof tested? ☐ Yes ☐ No ☐ NA
Can redundant sensors be tested or maintained without reducing the integrity of the SIS? ☐ Yes ☐ No ☐ NA
Is diversity used? ☐ Yes ☐ No ☐ NA
Are diverse parameters measured? ☐ Yes ☐ No ☐ NA
Are diverse means of processing specified? ☐ Yes ☐ No ☐ NA
Is there sufficient independence of hardware manufacturer? ☐ Yes ☐ No ☐ NA
Is there sufficient independence of hardware test methods? ☐ Yes ☐ No ☐ NA
Are sensor sensing lines adequately purged or heat traced to prevent plugging? ☐ Yes ☐ No ☐ NA
Are SIS sensors clearly identified by some means (tagging, pain