Kathmandu University
Chemical Process Safety
Risk Assessment
Dr. Ziaul Haque Ansari
Department of
Chemical Science and Engineering
Table of Contents
1 Review of Probability Theory
Interactions between Process Units
Revealed and Unrevealed Failures
Probability of Coincidence
Redundancy
Common Mode Failures
2 Event Trees
3 Fault Trees
4 QRA and LOPA
Quantitative Risk Analysis
Layer of Protection Analysis
Consequence
Frequency
Introduction
Risk assessment includes incident identification and consequence
analysis. Incident identification describes how an accident occurs.
It frequently includes an analysis of the probabilities.
Consequence analysis describes the expected damage.
This includes loss of life, damage to the environment or capital
equipment, and days of outage.
In this chapter we will
review probability mathematics, including the mathematics of
equipment failure,
show how the failure probabilities of individual hardware
components contribute to the failure of a process,
describe two probabilistic methods (event trees and fault trees),
describe the concepts of layer of protection analysis (LOPA),
and
describe the relationship between quantitative risk analysis
(QRA) and LOPA.
1 Review of Probability Theory
Equipment failures or faults in a process occur as a result of a
complex interaction of the individual components.
The overall probability of a failure in a process depends highly on
the nature of this interaction.
In this section we define the various types of interactions and
describe how to perform failure probability computations.
Data are collected on the failure rate of a particular hardware
component.
With adequate data it can be shown that, on average, the component
fails after a certain period of time.
This is called the average failure rate and is represented by $\mu$, with
units of faults/time.
The probability that the component will not fail during the time
interval (0, t) is given by a Poisson distribution:
$$R(t) = e^{-\mu t} \tag{11-1}$$
where R is the reliability. Equation 11-1 assumes a constant failure rate $\mu$.
As $t \to \infty$, the reliability goes to 0.
The speed at which this occurs depends on the value of the failure
rate $\mu$: the higher the failure rate, the faster the reliability decreases.
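The complement of the reliability is called the failure probability (or
sometimes the unreliability), P, and it is given by
$$P(t) = 1 - R(t) = 1 - e^{-\mu t} \tag{11-2}$$
The failure density function f is the derivative of the failure probability:
$$f(t) = \frac{dP(t)}{dt} = \mu e^{-\mu t}$$
The time interval between two failures of the component is called the
mean time between failures (MTBF) and is given by the first moment
of the failure density function:
$$\text{MTBF} = \int_0^\infty t \, f(t) \, dt = \frac{1}{\mu}$$
This result is valid only for a constant failure rate $\mu$.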
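The constant-failure-rate relations can be checked numerically. The following Python sketch evaluates the reliability R(t) = exp(-mu t), the failure probability P(t) = 1 - R(t), the failure density f(t) = mu exp(-mu t), and MTBF = 1/mu. The function names and the failure rate of 0.5 faults/yr are illustrative assumptions, not values from the text.

```python
import math

def reliability(mu, t):
    """R(t) = exp(-mu * t): probability of no failure in (0, t)."""
    return math.exp(-mu * t)

def failure_probability(mu, t):
    """P(t) = 1 - R(t): probability of at least one failure in (0, t)."""
    return 1.0 - reliability(mu, t)

def failure_density(mu, t):
    """f(t) = dP/dt = mu * exp(-mu * t)."""
    return mu * math.exp(-mu * t)

def mtbf(mu):
    """Mean time between failures for a constant failure rate mu."""
    return 1.0 / mu

# Hypothetical failure rate in faults/yr, chosen only for illustration.
mu = 0.5
for t in (0.0, 1.0, 5.0, 20.0):
    print(f"t = {t:5.1f} yr  R = {reliability(mu, t):.4f}  "
          f"P = {failure_probability(mu, t):.4f}  f = {failure_density(mu, t):.4f}")
print(f"MTBF = {mtbf(mu):.2f} yr")
```

As t grows, R falls toward 0 and P rises toward 1, and a larger mu makes this happen sooner.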
Typical plots of the functions $\mu$, f, P, and R are shown in Figure 11-1.
Many components exhibit a typical bathtub failure rate, shown in Figure 11-2.
The failure rate is highest when the component is new (infant mortality) and
when it is old (old age).
Between these two periods (denoted by the lines in Figure 11-2), the failure
rate is reasonably constant.
Interactions between Process Units
Accidents in chemical plants are usually the result of a complicated
interaction of a number of process components.
The overall process failure probability is computed from the
individual component probabilities.
Process components interact in two different fashions.
In some cases a process failure requires the simultaneous failure of a
number of components in parallel.
This parallel structure is represented by the logical AND function.
This means that the failure probabilities for the individual
components must be multiplied:
$$P = \prod_{i=1}^{n} P_i$$
where n is the total number of components and $P_i$ is the failure
probability of each component.
The total reliability for parallel units is given by
$$R = 1 - \prod_{i=1}^{n} (1 - R_i)$$
where $R_i$ is the reliability of an individual process component.
Process components also interact in series.
This means that a failure of any single component in the series of
components will result in failure of the process.
The logical OR function represents this case.
For series components the overall process reliability is found by
multiplying the reliabilities for the individual components:
$$R = \prod_{i=1}^{n} R_i$$
The overall failure probability is computed from
$$P = 1 - \prod_{i=1}^{n} (1 - P_i)$$
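The AND (parallel) and OR (series) rules can be sketched as four small Python functions. The function names and the sample probabilities are assumptions made for illustration. The sketch also assumes the component failures are independent, which is what multiplying the probabilities implies.

```python
from math import prod

def parallel_failure(p):
    """AND gate: all components must fail, so failure probabilities multiply."""
    return prod(p)

def parallel_reliability(r):
    """Parallel units work unless every unit fails."""
    return 1.0 - prod(1.0 - ri for ri in r)

def series_reliability(r):
    """Series units work only if every unit works, so reliabilities multiply."""
    return prod(r)

def series_failure(p):
    """OR gate: the failure of any single component fails the process."""
    return 1.0 - prod(1.0 - pi for pi in p)

# Hypothetical failure probabilities for two components.
p = [0.10, 0.20]
r = [1.0 - pi for pi in p]
print("parallel: P =", parallel_failure(p), " R =", parallel_reliability(r))
print("series:   P =", series_failure(p), " R =", series_reliability(r))
```

For the same components, the parallel arrangement gives a much lower failure probability than the series arrangement.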
Example 11-1
The water flow to a chemical reactor cooling coil is controlled by the
system shown in Figure 11-4. The flow is measured by a differential
pressure (DP) device, the controller decides on an appropriate control
strategy, and the control valve manipulates the flow of coolant.
Determine the overall failure rate, the unreliability, the reliability, and
the MTBF for this system. Assume a 1-yr period of operation.
Solution
These process components are related in series. Thus, if any one of the
components fails, the entire system fails. The reliability and failure
probability are computed for each component using Equations 11-1
and 11-2. The results are shown in the following table. The failure
rates are from Table 11-1.
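The series calculation for this example can be sketched in Python as follows. The failure rates in the dictionary are assumed placeholder values in faults/yr, because the entries of Table 11-1 are not reproduced in this text; substitute the tabulated values before relying on the numbers. The function name `system_in_series` is hypothetical.

```python
import math

# ASSUMED failure rates in faults/yr (placeholders for the Table 11-1 entries).
failure_rates = {
    "control valve": 0.60,
    "controller": 0.29,
    "DP cell": 1.41,
}

def system_in_series(rates, period):
    """Return per-component R, overall R, overall P, overall mu, and MTBF."""
    component_R = {name: math.exp(-mu * period) for name, mu in rates.items()}
    R = math.prod(component_R.values())   # series: reliabilities multiply
    P = 1.0 - R                           # overall failure probability
    mu_overall = -math.log(R) / period    # from R = exp(-mu * t)
    mtbf = 1.0 / mu_overall
    return component_R, R, P, mu_overall, mtbf

component_R, R, P, mu_overall, mtbf = system_in_series(failure_rates, period=1.0)
for name, Ri in component_R.items():
    print(f"{name:14s} R = {Ri:.2f}  P = {1.0 - Ri:.2f}")
print(f"overall R = {R:.2f}, P = {P:.2f}, mu = {mu_overall:.2f} faults/yr, MTBF = {mtbf:.2f} yr")
```

For components in series with constant failure rates, the overall failure rate equals the sum of the individual rates, so the logarithm step simply recovers that sum.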
Revealed and Unrevealed Failures
Emergency alarms and shutdown systems are used only when a
dangerous situation occurs.
It is possible for the equipment to fail without the operator being
aware of the situation. This is called an unrevealed failure.
Without regular and reliable equipment testing, alarm and emergency
systems can fail without notice. Failures that are immediately obvious
are called revealed failures.
Figure 11-6 shows the nomenclature for revealed failures.
The time that the component is operational is called the period of
operation and is denoted by $\tau_0$.
After a failure occurs, a period of time, called the period of inactivity
or downtime ($\tau_1$), is required to repair the component.
The MTBF is the sum of the period of operation and the downtime, as
shown.
For unrevealed failures the failure becomes obvious only after regular
inspection. This situation is shown in Figure 11-7.
It is convenient to define an availability and unavailability.
The availability A is simply the probability that the component or
process is found functioning.
The unavailability U is the probability that the component or process
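is found not functioning. It is obvious that
$$A + U = 1$$
For unrevealed failures with failure rate $\mu$ and inspection interval
$\tau_i$, and provided that $\mu \tau_i \ll 1$, the unavailability and
availability are approximately
$$U = \tfrac{1}{2} \mu \tau_i \tag{11-25}$$
$$A = 1 - \tfrac{1}{2} \mu \tau_i$$
Equation 11-25 demonstrates that, on average, for unrevealed failures the
process or component is unavailable during a period equal to half the
inspection interval.
Decreasing the inspection interval therefore increases the availability
of a component subject to unrevealed failures.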
Example 11-3
Compute the availability and the unavailability for both the alarm and
the shutdown systems of Example 11-2. Assume that a maintenance
inspection occurs once every month and that the repair time is
negligible.
Solution
Both systems demonstrate unrevealed failures. For the alarm system
the failure rate is $\mu$ = 0.18 faults/yr. The inspection period is
$\tau_i$ = 1/12 = 0.083 yr. The unavailability for unrevealed failures is
computed using Equation 11-25:
$$U = \tfrac{1}{2} \mu \tau_i = \tfrac{1}{2}(0.18)(0.083) = 0.0075$$
$$A = 1 - U = 0.992$$
The alarm system is available 99.2% of the time. For the shutdown
system $\mu$ = 0.55 faults/yr. Thus
$$U = \tfrac{1}{2} \mu \tau_i = \tfrac{1}{2}(0.55)(0.083) = 0.023$$
$$A = 1 - U = 0.977$$
The shutdown system is available 97.7% of the time.
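The same calculation can be sketched in Python, using the failure rates and monthly inspection interval stated in the example. The sketch assumes the half-interval approximation U = (1/2) mu tau_i, which holds only when mu tau_i is much smaller than 1; the function name is hypothetical.

```python
def unrevealed_unavailability(mu, tau_i):
    """Approximate unavailability U = 0.5 * mu * tau_i (valid for mu * tau_i << 1)."""
    return 0.5 * mu * tau_i

tau_i = 1.0 / 12.0  # monthly inspection, expressed in years

# Failure rates in faults/yr as stated in the example.
for name, mu in (("alarm", 0.18), ("shutdown", 0.55)):
    U = unrevealed_unavailability(mu, tau_i)
    A = 1.0 - U
    print(f"{name:9s} U = {U:.4f}  A = {A:.4f}")
```

Halving `tau_i` halves U, which shows directly why more frequent inspection raises availability.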
Probability of Coincidence
All process components demonstrate unavailability as a result of a
failure.
For alarms and emergency systems it is unlikely that these systems
will be unavailable when a dangerous process episode occurs.
The danger results only when a process upset occurs and the
emergency system is unavailable.
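This requires a coincidence of events. Assume that a dangerous
process episode occurs $p_d$ times in a time interval $T_i$. The frequency
of this episode is
$$\lambda = \frac{p_d}{T_i} \tag{11-26}$$
For an emergency system with failure rate $\mu$, inspection interval
$\tau_i$, and unavailability $U = \tfrac{1}{2} \mu \tau_i$, the average
frequency of dangerous coincidences is
$$\lambda_d = \lambda U = \tfrac{1}{2} \lambda \mu \tau_i$$
The mean time between coincidences (MTBC) is the reciprocal of the
average frequency of dangerous coincidences:
$$\text{MTBC} = \frac{1}{\lambda_d} = \frac{2}{\lambda \mu \tau_i}$$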
Example 11-4
For the reactor of Example 11-3 a high-pressure incident is expected
once every 14 months. Compute the MTBC for a high-pressure
excursion and a failure in the emergency shutdown device. Assume
that a maintenance inspection occurs every month.
Solution
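The frequency of process episodes is given by Equation 11-26:
$$\lambda = \frac{1 \text{ episode}}{(14 \text{ months})(1 \text{ yr}/12 \text{ months})} = 0.857/\text{yr}$$
The shutdown system of Example 11-3 is available 97.7% of the time, so
its unavailability is U = 0.023. The average frequency of dangerous
coincidences is
$$\lambda_d = \lambda U = (0.857)(0.023) = 0.020/\text{yr}$$
The MTBC is
$$\text{MTBC} = \frac{1}{\lambda_d} = \frac{1}{0.020} = 50 \text{ yr}$$
It is expected that a simultaneous high-pressure incident and failure of
the emergency shutdown device will occur once every 50 yr.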
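A short Python sketch of this coincidence calculation follows. It uses the episode interval of 14 months from this example together with the shutdown failure rate of 0.55 faults/yr and the monthly inspection from Example 11-3, and it assumes the approximation U = (1/2) mu tau_i. The function name is hypothetical. Because the sketch carries unrounded intermediate values, it gives about 51 yr rather than the rounded 50 yr.

```python
def mean_time_between_coincidences(episode_frequency, mu, tau_i):
    """MTBC = 1 / (lambda * U), with U = 0.5 * mu * tau_i for unrevealed failures."""
    unavailability = 0.5 * mu * tau_i
    coincidence_frequency = episode_frequency * unavailability
    return 1.0 / coincidence_frequency

episode_frequency = 1.0 / (14.0 / 12.0)  # one episode per 14 months, in 1/yr
mu_shutdown = 0.55                       # faults/yr, from Example 11-3
tau_i = 1.0 / 12.0                       # monthly inspection, in yr

mtbc = mean_time_between_coincidences(episode_frequency, mu_shutdown, tau_i)
print(f"lambda = {episode_frequency:.3f}/yr, MTBC = {mtbc:.1f} yr")
```

Because MTBC is inversely proportional to `tau_i`, halving the inspection interval doubles the expected time between dangerous coincidences.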
Redundancy
Systems are designed to function normally even when a single
instrument or control function fails.
This is achieved with redundant controls, including two or more
measurements, processing paths, and actuators that ensure that the
system operates safely and reliably.
The degree of redundancy depends on the hazards of the process and
on the potential for economic losses.
An example of a redundant temperature measurement is an additional
temperature probe.
An example of a redundant temperature control loop is an additional
temperature probe, controller, and actuator (for example, cooling
water control valve).
2 Event Trees
Event trees begin with an initiating event and work toward a final
result.
This approach is inductive.
The method provides information on how a failure can occur and the
probability of occurrence.
When an accident occurs in a plant, various safety systems come into
play to prevent the accident from propagating.
These safety systems either fail or succeed.
The event tree approach includes the effects of an event initiation
followed by the impact of the safety systems.
The typical steps in an event tree analysis are:
i. identify an initiating event of interest,
ii. identify the safety functions designed to deal with the initiating
event,
iii. construct the event tree, and
iv. describe the resulting accident event sequences.
If appropriate data are available, the procedure is used to assign
numerical values to the various events.
This is used effectively to determine the probability of a certain
sequence of events and to decide what improvements are required.
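The quantification step can be illustrated with a minimal Python sketch. It assumes each safety function either succeeds or fails independently with a fixed failure probability on demand, and it enumerates every success/failure combination; a real event tree often prunes branches that are not demanded once an earlier safety function succeeds. The initiating frequency, the safety function names, and their failure probabilities are hypothetical values chosen for illustration.

```python
from itertools import product

def event_tree_sequences(initiating_frequency, safety_functions):
    """Enumerate all branch combinations and the frequency of each sequence.

    safety_functions is a list of (name, failure probability on demand).
    """
    sequences = []
    for outcome in product((True, False), repeat=len(safety_functions)):
        frequency = initiating_frequency
        for (_, p_fail), succeeded in zip(safety_functions, outcome):
            # Multiply by the probability of the branch actually taken.
            frequency *= (1.0 - p_fail) if succeeded else p_fail
        path = tuple(
            f"{name}: {'success' if ok else 'failure'}"
            for (name, _), ok in zip(safety_functions, outcome)
        )
        sequences.append((path, frequency))
    return sequences

# Hypothetical inputs: one initiating event per year and two safety functions.
initiating_frequency = 1.0
safety_functions = [("high-temperature alarm", 0.01), ("operator response", 0.25)]

for path, frequency in event_tree_sequences(initiating_frequency, safety_functions):
    print(" -> ".join(path), f"| {frequency:.4f} per yr")
```

The sequence frequencies sum to the initiating event frequency, which is a useful check that no branch has been omitted.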
Consider the chemical reactor system shown in Figure 11-8.
A high-temperature alarm has been installed to warn the operator of a
high temperature within the reactor.