Accepted Manuscript: 10.1016/j.jlp.2017.09.011
Developing a framework for dynamic risk assessment using Bayesian networks and
reliability data
PII: S0950-4230(17)30357-1
DOI: 10.1016/j.jlp.2017.09.011
Reference: JLPP 3593
Please cite this article as: Kanes, R., Ramirez-Marengo, C., Abdel-Moati, H., Cranefield, J., Véchot,
L., Developing a framework for dynamic risk assessment using Bayesian networks and reliability data,
Journal of Loss Prevention in the Process Industries (2017), doi: 10.1016/j.jlp.2017.09.011.
ACCEPTED MANUSCRIPT
b
Mary Kay O’Connor Process Safety Center at Qatar, Texas A&M University at Qatar, PO Box
27874, Doha, Qatar
*Corresponding author: [email protected]
Abstract
Process Safety in the oil and gas industry is managed through a robust Process Safety
Management (PSM) system that involves the assessment of the risks associated with a facility in
all steps of its life cycle. Risk levels tend to fluctuate throughout the life cycle of many processes
due to several time varying risk factors (performances of the safety barriers, equipment
conditions, staff competence, incident history, etc.). While current practices for quantitative risk
assessments (e.g. Bow-tie analysis (BT), Layer of protection analysis (LOPA), etc.) have brought
significant improvements in the management of major hazards, they are static in nature and do
not fully take into account the dynamic nature of risk or how accounting for it could improve
risk-based decision making.
In an attempt to continually enhance the risk management in process facilities, the oil and gas
industry has put in very significant efforts over the last decade toward the development of
process safety key performance indicators (KPI or parameters to be observed) to continuously
measure or gauge the efficiency of safety management systems and reduce the risks of major
incidents. This has increased the sources of information that are used to assess risks in real-time.
The use of such KPIs has proved to be a major step forward in the improvement of process safety
in major hazards facilities. Looking toward the future, there appears to be an opportunity to use
the multiple KPIs measured at a process plant to assess the quantitative measure of risk levels at
the facility on a time-variant basis.
ExxonMobil Research Qatar (EMRQ) has partnered with the Mary Kay O’Connor Process
Safety Center – Qatar (MKOPSC-Q) to develop a tool that monitors, in real time, the potential
increases in risk levels as a result of pre-identified risk factors and process safety related data,
using Bayesian Belief Networks (BN). The development of the tool involved two phases: 1)
development of a methodology that establishes the framework for the tool and 2) development
of the tool itself using the Java programming language. The overall tool is to be called
1 Introduction
Industrial chemical processes are complex networks that involve various equipment, rigorous
control loops, and skilled operators all working together to harmonize production throughout the
process lifecycle. In order to ensure smooth operation, it is necessary to understand the hazards
and risks associated, as well as any possible accident scenarios and their mitigation. It follows
that the dynamic nature of these chemical processes extends to their accompanying risks as seen
in Figure 1. Risk levels in a process facility tend to fluctuate due to several time-varying factors,
which include, but are not limited to: variations in the integrity and vulnerability of safety
barriers, equipment aging, planned activities and maintenance, shutdown, start-up, simultaneous
operations, changes in the safety culture of the company, health and efficiency of the
management system, commitment to safety of the leadership and process safety incidents (or
near misses).
It is important to quantify these time-dependent factors and their relationships using engineering
and mathematical techniques. The relationships are then used to develop quantitative estimates
of risk that are compared against pre-defined risk criteria, which reflect an
organization’s risk tolerability. Traditional quantitative risk assessment (QRA) methods include
Hazard and Operability Study (HAZOP), What-if Analysis, Bow-Tie Analysis (BT), Fault Trees
(FT) and Event Trees (ET), Layer of Protection Analysis (LOPA). However, these methods are
limited in that they tend to convey static values of risk at a given point in time; they are simply
not designed to capture the dynamic nature of risk. It is therefore important to focus research on
the area of dynamic risk and its measure, as a way forward for improving current risk assessment
methodologies.
More recently, Bayesian Belief Networks (BN), which are probabilistic graphical models that
allow for the quantification of complex relational dependencies using Bayes’ theorem, have
gained traction in various engineering fields, including the area of process safety. BN represent a
set of random variables and their relationships through a directed acyclic graph (DAG). Each
node of the DAG represents a random variable. The directed arcs connect pairs of nodes which
follow a cause-effect relationship [1]. A node is called a “parent node” if there is a directed arc
connecting it to another node called the “child node”. Nodes which have no parent are known as
“root nodes” as seen in Figure 2. All nodes in the BN are assigned an initial corresponding
probability.
The main objective of BN is to use initial node probabilities and their relational dependencies to
estimate and update the probability distributions of the random variables, based on given
evidence and prior knowledge. The calculation of the probabilities of the “child” nodes is based
on a combination of the probabilities of the “parent” nodes and conditional probability tables
(CPT). This is in accordance with the well-known Bayes’ theorem of conditional probability
[2]. The analysis of the probabilities associated with the nodes can follow two paths:
• Predictive analysis or forward approach: Probability values are defined a priori for root
nodes and calculated by inference for the other nodes.
• Diagnostic analysis or backward approach: Probability values of the nodes are calculated
a posteriori when observations become available.
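The two paths can be illustrated with a minimal sketch in Python on a two-node network (one root node and one child). The node names and all probability values below are illustrative assumptions, not values from this work.

```python
# Two-node BN: root "barrier" (parent) -> "top_event" (child).
# All numbers are illustrative assumptions.
p_barrier_fail = 0.05  # prior P(barrier = fail) for the root node

# CPT for the child: P(top_event = yes | barrier state)
p_top_given_barrier = {"fail": 0.60, "ok": 0.01}

def predictive(p_fail, cpt):
    """Forward approach: marginal P(top_event = yes)."""
    return cpt["fail"] * p_fail + cpt["ok"] * (1.0 - p_fail)

def diagnostic(p_fail, cpt):
    """Backward approach: P(barrier = fail | top_event = yes) by Bayes' theorem."""
    return cpt["fail"] * p_fail / predictive(p_fail, cpt)

p_top = predictive(p_barrier_fail, p_top_given_barrier)
p_fail_post = diagnostic(p_barrier_fail, p_top_given_barrier)
print(f"P(top event) = {p_top:.4f}")
print(f"P(barrier failed | top event observed) = {p_fail_post:.4f}")
```

In this toy case, observing the top event raises the belief that the barrier has failed well above its prior value, which is the kind of diagnostic update a static fault tree cannot provide.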
Perhaps what makes BN a particularly suitable method for risk analysis, decision making and risk
management applications within process facilities is their ability to handle probability nodes,
represent dependencies between variables and update probability values over time. These are all
advantages that BN have over classical risk analysis methods such as fault trees, event trees and
bow-ties, which are limited due to their static nature. Weber et al. (2012) reviewed the literature on
applications of Bayesian Networks for dependability, risk analysis, risk management and
maintenance, and concluded that there is indeed a rising trend in BN use within these areas [3].
Thus, BN are becoming one of the preferred tools for risk management applications [4].
To date, significant work has been carried out on the use of BN for predicting the
probability of occurrence of undesired consequences. This involves the mapping of classical risk
analysis techniques such as the FT, LOPA and BT into a BN [5].
Meel and Seider (2006) developed an algorithm for performing dynamic failure assessment on a
process unit. First, an event tree, which includes the safety systems put in place to respond to
abnormal events, is created. The failure probabilities of safety systems and consequences
(end-states) were modeled using copulas (multivariate probability distributions) and Bayesian analysis.
Accident precursor data (ASP) were used to dynamically alter initial estimates of failure
probabilities, to obtain posterior failure probabilities of the safety systems. The means of the
posterior probability distributions were obtained with Monte Carlo integration, and Bayesian
techniques were used to predict the occurrences of abnormal events for the next time interval. This
methodology was applied to a CSTR scenario. It was concluded that BN analysis and
ASP data were effective for predicting the occurrence rate of abnormal events,
quantifying and predicting both equipment and human related reliabilities, and identifying
near-misses and incidents, as well as probabilities of accidents. Also, this method involves
repetitive risk analysis after abnormal events occur, and is thus advantageous for operations with
complex nonlinearities and multi-component interactions [6]. Meel and Seider (2008) extended
their work to develop a forecasting analyzer, a reliability analyzer, and an accident
closeness analyzer. These analyzers respectively function to predict: frequencies of occurrence of
abnormal events; failure probabilities of safety systems involving equipment and human actions,
while accounting for the correlations between safety systems using copulas; and the closeness of the
current plant state to a failure or disaster through the use of accident precursor data. In this work, they
highlight the importance of using plant-specific estimates of failure probabilities, instead of
of safety systems [8]. Beta distributions are assumed for the prior failure probabilities of safety
systems. Similar to the work of Meel and Seider (2006), accident precursor data such as near misses
and incidents are used to generate the likelihood function that updates the prior probabilities.
More work needs to be done on refining the posterior information and looking into other
probability distributions for both the prior and likelihood functions. In 2010, Kalantarnia et al.
further developed their approach to learn from near misses, incidents and past accidents and to predict
event occurrence likelihood with every time interval. They applied a dynamic risk assessment
(DRA) method to model the BP Texas City refinery accident scenario. Prior failure probabilities
were obtained from design stage data. Then, near misses and incident data were used to generate
the likelihood distribution to update the prior probabilities. Finally, Bayesian theory was used to
estimate the posterior failure function of the ISOM unit of the BP Texas City refinery. Their
results indicate that the accident could have been predicted had a DRA method been applied. However,
they also identified that the degree of accuracy of the tool is heavily dependent on the precision of
the data, and thus robust systems for monitoring and for incident recording/reporting should
exist. Linkages between events and fault trees, for a more integrated dynamic risk simulation, are
yet to be considered [9].
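The prior-to-posterior updating described in these studies can be sketched with the conjugate beta-binomial model. The snippet below is an illustration only: the prior parameters and the precursor counts are assumed values, not data from the cited works.

```python
# Conjugate Bayesian update of a safety-barrier failure probability.
# Prior: Beta(a, b). Likelihood: binomial (k failures observed in n demands).
# Posterior: Beta(a + k, b + n - k). All numbers are illustrative assumptions.

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

def update(a, b, failures, demands):
    """Return posterior Beta parameters after observing precursor data."""
    return a + failures, b + (demands - failures)

a0, b0 = 1.0, 99.0  # assumed prior with mean 0.01
a1, b1 = update(a0, b0, failures=3, demands=20)

prior_mean = beta_mean(a0, b0)
posterior_mean = beta_mean(a1, b1)
print(f"prior mean     = {prior_mean:.4f}")
print(f"posterior mean = {posterior_mean:.4f}")
```

Because the beta prior is conjugate to the binomial likelihood, the update needs no numerical integration; other prior or likelihood families would generally require sampling or discretization.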
In their work, Khakzad et al. (2011) presented an algorithm to map a FT into a BN. The resulting
BN was used to assess the performance of a feeding control system scenario transferring propane
from an evaporator to a scrubbing column [10]. In addition to the calculation of the top event, the
developed BN was used to estimate the contribution of each event leading to the top event in the
FT. Discrete probability values were used for their scenario, to specify prior probabilities and
conditional probability tables of the BN nodes. Later, Khakzad et al. (2013) presented a dynamic
approach where a BT was mapped into a BN. The BN was used to perform a probability
adapting analysis where the cumulative occurrence of an accident, over a given time interval,
was used as evidence. Discrete probability distributions were used in the BN model and
the number of loss of containment events and near misses registered during 4 years were used as
evidence [11].
Similarly, Cai et al. (2013) used BN to perform a risk assessment that takes into account human
factors in an offshore blowout scenario. Individual, organizational and group factors were
represented in the BT that was mapped into a BN [12]. On the other hand, Abimbola et al.
(2014) proposed a bow-tie model, which identifies logical relationships between component
systems as well as consequences, for real time risk analysis of drilling operations. This method
has a predictive probabilistic model that computes the probabilities of failure of equipment such
as BOPs and valves by using reliability models of constant stress and random stress of
exponential distribution. Then with the use of accident precursor data, the failure probabilities of
algorithm for dynamic risk assessment where the prior failure distributions of the safety barriers
are represented by a beta distribution [17]. From these distributions the mean values are
estimated and used as discrete values. The probabilities of the accident outcomes are computed
by the product of the top event, the probabilities of failure of safety barriers and the events
leading to the different accident consequences. The likelihood function is obtained from ASPs;
the resulting function is a binomial distribution. Using Bayesian inference, the posterior failure
function is obtained from the prior and likelihood distributions. Their approach allowed for
collection of a number of risk notions related to the plant, equipment and materials used.
However, similar to Kalantarnia et al. (2010), the effectiveness of the approach is tied to
Dynamic Bayesian Networks (DBN) are an extension of BN where variables are correlated to
each other over a sequence of time steps. For each time slice, a static BN represents the variables
at a discrete time step. The variables are then connected through each time slice by temporal
links. DBN often consist of two different time steps and are often called two-time-slice BN. In a
two-time-slice BN, the value of a variable at any point in time “t” can be calculated from the
immediately preceding value (time t-1). Lately, DBN have been applied to risk assessment for estimating
probabilities across different time periods.
Khakzad (2015) developed a DBN to estimate the propagation of heat radiation in a scenario
where one of 3 atmospheric storage tanks containing acetone and benzene is on fire. The DBN
allowed for the estimation of the most probable sequence of events, which would result in a
domino effect within the facility. Time-dependent conditional probabilities with constant failure
rates (exponential functions) were used [19]. Barua et al. (2016) proposed a BN model for dynamic
operational risk assessment. The model was developed from a dynamic fault tree, where Boolean
states (failure and success) were used to indicate the probability of failure of specific process
equipment. Exponential conditional probabilities were used for each time step [20]. Later on, Wu
et al. (2016) developed a DBN model based on a BT model for predicting the change in the
probability of potentially hazardous scenarios over time. Boolean states (Yes and No) were used
to represent the occurrence of an event or equipment failure [21].
Current work in the literature demonstrates the capability of BN for updating the values of risk
based on observed evidence. Primarily, the random variables have discrete states (i.e. failure or
success of safety barriers). The type of evidence used for updating was either an observation of
equipment failure or a consequence, over a certain period of time. However, in a process plant
many other parameters or process safety indicators (PSIs) can be used as evidence. These
indicators could be of different natures, and may be reliability related, operational indicators,
human and organizational related, demands on safety systems, etc. Incorporating real-time
indicators into the risk assessment represents the path forward for improving current risk
assessment methods. However, one of the current limitations of existing BN is that the majority
of them can only deal with discrete variables as shown throughout the literature. For real world
applications hybrid models with both discrete and continuous variables are necessary for
accurate representation of systems [3], [22], [23].
Thus, the first stage of this work focuses on modeling reliability indicators and aims at
developing a framework for a hybrid DBN model that calculates the dynamic risk profile over
time, based on reliability data (failure rate, probability of failure on demand, time horizon until
the next scheduled maintenance, and the characteristic life of equipment) and updates risk based
on insertion of evidence such as equipment failures and time to failure (TTF) for continuous
operation. The model will also be used to demonstrate the effect of maintenance on the overall
probability of consequences. This study incorporates both Boolean and
continuous random variables in the DBN, as well as the Boolean and continuous causal
relationships between them. This is one of the main differences between this work and others
within this field. Commercially available Bayesian network software was used for the
estimation of updated probabilities.
The work presented in this paper is a proposed framework for developing a dynamic risk
assessment tool to determine the current risk level of a given process area or facility. This is a
collaborative effort between ExxonMobil Research Qatar and the Mary Kay O’Connor Process
Safety Center at Texas A&M University in Qatar.
creation of a BN that can handle discrete, continuous (e.g. Weibull) or both variables (Figure 3).
Given that reliability applications require BN models that are hybrid, this paper presents the
methodology for hybrid BNs. The proposed methodology comprises different steps towards
the construction of a BN model, which includes both discrete and continuous variables (hybrid).
The resulting BN model is called a hybrid dynamic Bayesian Network. The hybrid DBN can be
constructed by mapping an existing BT.
2.1 Development of the Skeletal BN
Starting out from a BT, a mapping process is required to create the BN. This BN is a skeletal
BN, as the nodes are not yet populated with any probability values. An existing algorithm is
used to map a fault tree into a BN [1], including graphical and numerical translation of the fault
tree. Graphically, the structure of the BN is developed from the fault tree such that the top event
and the causes shown in the fault tree are represented by nodes, while their relationships are
represented through arcs in the BN. The relationships between the causes and the top event in the
fault tree are modeled with two types of gates: OR Gate and AND Gate. According to the type of
gate, conditional probabilities are assigned to each one of the nodes as seen in Figure 4.
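As an illustration of the numerical translation, the sketch below builds the deterministic CPT that an OR gate or an AND gate implies for a child node with Boolean parents, and computes the child's marginal failure probability assuming independent parents. The function names and the parent probabilities are assumptions for illustration.

```python
from itertools import product

def gate_cpt(gate, n_parents):
    """Deterministic CPT: maps each tuple of parent states (True = failed)
    to P(child = failed), which is 0 or 1 for a logic gate."""
    rule = any if gate == "OR" else all
    return {states: float(rule(states))
            for states in product([True, False], repeat=n_parents)}

def child_failure_probability(cpt, parent_probs):
    """Marginalise the CPT over independent parents."""
    total = 0.0
    for states, p_child in cpt.items():
        weight = 1.0
        for failed, p in zip(states, parent_probs):
            weight *= p if failed else (1.0 - p)
        total += p_child * weight
    return total

parents = [0.1, 0.2]  # assumed P(failure) of two basic events
p_or = child_failure_probability(gate_cpt("OR", 2), parents)
p_and = child_failure_probability(gate_cpt("AND", 2), parents)
print(f"OR gate:  {p_or:.3f}")   # equals 1 - (1 - 0.1) * (1 - 0.2)
print(f"AND gate: {p_and:.3f}")  # equals 0.1 * 0.2
```

The same CPT structure can later hold non-deterministic values (for example, a noisy gate), which is one reason the BN form is more flexible than the fault tree it was mapped from.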
nodal representation of the consequences (from ET) was used as well. The consequence node has
many states, each representing one of the possible consequences shown in the BT.
inspection/preventive maintenance) and the time to failure (TTF) for equipment under
continuous operation. The time to failure (TTF) is the time elapsing from when a piece of
equipment or a safety barrier is put into operation until it fails for the first time [24]. TTF is a
random variable that can be represented through probability distributions (e.g. Exponential,
Weibull or Gamma). The failure rate function is related to the time to failure probability
address the characteristic stage of the lifecycle of these barriers. In system reliability, the Weibull
distribution is widely used for modelling the different lifecycle stages of equipment, and is a
function of two different parameters: shape and scale. The shape parameter (β) indicates the
stage of the lifecycle of equipment, where β < 1 represents the wear-in stage, β = 1 represents the
useful life, 1 < β < 4 represents the wear-out stage, and β ≥ 4 represents the rapid wear-out stage.
On the other hand, the scale parameter λ is an indicator of the failure rate per hour of
equipment/safety barriers [24].
For continuously operating equipment, continuous distributions are used to obtain the
TTF. For example, if node A represents the TTF distribution of equipment A, a child node, B, of
parent A may be used in the model to estimate the probability of failure from the TTF
distribution. The CPT for node B, which represents the estimated probability of failure at a
given time t, is represented by the following function: P(A = Fail) = P(TTF_A ≤ t) (Figure 6).
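A minimal sketch of this function for a Weibull TTF is given below. It assumes the rate-form parameterization F(t) = 1 − exp(−(λt)^β), with λ in failures per hour and t in hours. The parameterization, the 730 hours-per-month conversion and the numerical values are assumptions for illustration and are not taken from this work.

```python
import math

HOURS_PER_MONTH = 730.0  # assumed conversion (8760 h / 12)

def weibull_failure_probability(t_hours, rate_per_hour, shape):
    """P(TTF <= t) for a Weibull TTF in rate form: 1 - exp(-(rate * t) ** shape)."""
    return 1.0 - math.exp(-((rate_per_hour * t_hours) ** shape))

rate = 1.0e-5   # assumed failure rate per hour
shape = 1.0     # shape = 1 reduces the Weibull to an exponential (useful life)

for months in (1, 3, 6, 9, 12, 15, 18):
    t = months * HOURS_PER_MONTH
    p_fail = weibull_failure_probability(t, rate, shape)
    print(f"{months:2d} months: P(Fail) = {p_fail:.4e}")
```

Evaluating the function at several horizons shows how the failure probability of a single node grows when no inspection or maintenance resets the clock.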
2.3 Creation of the hybrid DBN
To transform the above mentioned hybrid BN into a hybrid DBN, wherein changes of the model
over time are now considered, the structure needs to be modified for discrete nodes as follows:
• Identification of the root nodes in the BN
• Creation of prior and posterior nodes for each parent node.
Connected prior and posterior nodes associated with a given parent node are used to capture the
change in probability over time. For each time step, evidence is first set and then the posterior
probability values of each node are calculated, given the inserted evidence and the prior
probability value of the node.
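The prior/posterior node pair can be read as a two-state transition between consecutive time steps. The sketch below uses an assumed transition CPT (not the values of this work) to propagate the failure probability of one discrete node, and then shows the effect of entering evidence of a failure followed by a repair that resets the node to its initial probability. The reset rule is a simplifying assumption of this sketch.

```python
# Transition CPT linking a prior node (time t-1) to its posterior node (time t).
# Values are illustrative assumptions.
P_FAIL_GIVEN_OK = 0.02     # P(fail at t | working at t-1)
P_FAIL_GIVEN_FAIL = 1.0    # a failed item stays failed unless repaired

def step(p_fail_prev):
    """Posterior-node probability: P(fail at t) from P(fail at t-1)."""
    return (P_FAIL_GIVEN_FAIL * p_fail_prev
            + P_FAIL_GIVEN_OK * (1.0 - p_fail_prev))

def risk_profile(p_initial, n_steps, failure_observed_at=None):
    """Propagate over n_steps. If a failure is observed at a step, the
    probability is set to 1 for that step and then reset to the initial
    value (assumed repair before the next step)."""
    profile, p = [], p_initial
    for t in range(1, n_steps + 1):
        p = step(p)
        if t == failure_observed_at:
            profile.append(1.0)   # evidence: failure observed
            p = p_initial         # assumed repair before the next step
        else:
            profile.append(p)
    return profile

print(risk_profile(0.01, 4))
print(risk_profile(0.01, 4, failure_observed_at=2))
```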
Continuous probability distributions are used to represent the TTF of equipment under
continuous operation, and can be included in the BN. The probability of the consequences is
calculated for different specified time horizons t (i.e. the probability of the consequences if no
inspection/maintenance is performed over a period of 3 months, 6 months, etc.). The different specified
time periods are 1, 3, 6, 9, 12, 15 and 18 months.
continuous operation, and can also be included in the BN. The probability of the consequences is
calculated for different specified time periods t (monthly time intervals can be used). In addition,
evidence of failure of equipment is inserted every month for the sake of simplicity (other time
units can be used). Thus, the risk profile is generated taking into account the effect of time and
the observed evidence.
3 Case Study
This section presents a case study that applies the proposed framework for a dynamic risk
assessment using Bayesian Networks. The scenario of choice is an extension of the case study
presented by M. Tweeddale in Managing Risk and Reliability of Process Plants (Gulf Professional
Publishing, 2003) [25], which focused on the heating oil section of a plant delivering hot oil to
heating coils of bitumen tanks. The oil is returned to the bitumen tanks by a circulating pump
through a gas-fired heater.
The flow should be controlled through the heater, or the heater coils may overheat and rupture,
resulting in a large fire. The flow control system is composed of a transducer (FE), a flow
controller (FC) and a flow control valve (FCV). A manual bypass valve (MBV), which normally
remains in a closed position, is used if FCV is down for routine maintenance. If the flow of the
heating oil is below the desired level, the solenoid valve (SV) will be activated by the signal of
the transducer FE and the temperature control valve (TCV) will be closed. The flow switch
activates the low-flow alarm (FAL), to alert the operator to open the MBV or close the
manual gas isolation valve (GIV). In addition to the flow control system, a high-temperature
switch (TSH) is placed in the oil delivery line. In case an increased temperature is measured by
the TSH, it will activate the solenoid valve SV and close the temperature control valve [25].
The corresponding bowtie diagram was constructed with the Heater Coils Burn Out as the Top
Event, and is shown in Figure 7. In this example, the possible consequences, as seen in Table 1,
depend on different factors: (i) whether the leaking oil finds an ignition source, (ii) whether the
leak is isolated very quickly and (iii) whether the oil contacts an operator. The probability of
failure values listed in Table 2, which were used for constructing the BN, were taken from the
original case study [25]. The BN obtained from mapping the bowtie is shown in Figure 8, while the
estimated values of the probabilities of the consequences are summarized in Table 3. These results
are the initial values of risk obtained by mapping the traditional bowtie into a BN and do not
include any updates on probability values.
3.1 Modification of the BN to a Hybrid BN to Incorporate Reliability Data
Therefore, it is necessary to build a hybrid DBN that incorporates both discrete and continuous
nodes. The hybrid BN was built based on the approach proposed by Fenton and Neil
(2012). In the proposed framework, a Weibull distribution with shape and scale parameters is
used for representing the TTF distribution of equipment/safety barriers [2]. In this case study, the
TTFs of the pump, transducer (FE) and flow switch (FS) were each represented using the
Weibull distribution. The parameters for these distributions were taken from available literature
data of equipment failure rates in the useful life as seen in Table 4 [26].
Next, to transition to a hybrid DBN, the prior time-to-failure distributions were defined for the
root nodes. The probability of failure of the system at a given time, t, can be obtained from the
TTF distribution, e.g. P(Pump = Fail) = P(TTF_Pump ≤ t). When using continuous TTF
distributions, the conditional probability tables (CPTs) were obtained as deterministic functions
of the continuous input time-to-failure nodes. The deterministic functions used in this work were
based on the approach presented by Marquez et al. (2010) and Fenton and Neil (2012). The
resulting hybrid DBN is shown in Figure 9. The continuous nodes in the hybrid BN are the pump
(L.S3E3.Pump), FE (L.S1E1.FE) and FS (L.S2E2.FS). To solve the hybrid DBN,
commercially available software, which utilizes a dynamic discretization algorithm for
performing inference in the hybrid BN, was used [2], [22].
3.2.1 Evaluation of the risk increment as a function of time
Continuous TTF distributions for the pump, FE and FS were included in the BN. The probability
of the consequences, given that no inspection/maintenance was performed, was calculated for
specified intervals of time, t: 1, 3, 6, 9, 12, 15 and 18 months.
3.2.2 Evaluation of the risk increment as a function of time and the insertion of evidence
Continuous TTF distributions for the pump, FE and FS were included in the BN. The probability
of the consequences was calculated for these specified intervals of time, t: 1, 2 and 3 months. In
addition, evidence of equipment failure was inserted as described in Table 5. The DBN with the
insertion of evidence can be seen in Figure 10.
To obtain the dynamic risk profile, the calculation of updated risk values, based on observation
or evidence, over time is necessary. For equipment with discrete nodes, this requires a two-step
process which includes:
• Insertion of evidence in the BN and the recalculation of the probability of the nodes
(variables) of the BN given equipment failure evidence;
• Calculation of the probability of each node of the BN at a given time step, given the
probability of the node at the previous time step. The conditional probability values for
posterior nodes are presented in Table 6 and were obtained from expert judgment. For the
sake of simplicity, the same conditional probability table (CPT) was used for each one of
the parent nodes. CPTs can be assigned based on expert opinion or literature data if
available.
Figure 11 shows how prior and posterior nodes are connected to capture the change of
probability with time. For each time step, evidence is first set and the posterior probability values
of each node are calculated given the evidence and the prior probability value of the node. For
equipment, it was assumed that once a failure occurred, the equipment was repaired and
returned to an “as good as new” condition.
4 Results
For comparison purposes, the initial risk values were estimated using two modeling approaches:
a BN with only discrete variables and a hybrid BN, as shown in Figure 12. The obtained results
show that when continuous TTF distributions are included in the BN, the initial predicted risk
values with the hybrid BN are lower by at least one order of magnitude than those obtained with
a BN with only discrete nodes. This difference can be explained by the use of continuous TTF
distributions in the BN. In the hybrid BN the probability of failure is estimated from the TTF
distribution at a given time. The estimated probability values will depend on the failure rate of
the equipment/safety barriers and the specified period of time.
When a BN with only discrete nodes is specified, only fixed values in the CPTs are used and the
continuous nature of operating variables is not captured. The calculated values are summarized
in Table 7.
In addition, by incorporating TTF continuous distributions, risk can be calculated as a function of
time for the case where no inspection or maintenance is provided as shown in Figure 13. The
calculated values are summarized and presented monthly in Table 8. For example, if no
maintenance is scheduled in 3 months, the value of the probability of the consequences increases
by almost three orders of magnitude for each one of the possible consequences. Thus, predicting
the risk increase becomes highly important as this information can be used in decision-making to
optimize the time intervals between inspection and preventive maintenance.
When equipment failure information is inserted as evidence in addition to the effect of time, the
predicted values of risk increase considerably, by several orders of magnitude. For example, the
probability of the most severe consequence, “C3-Operator burnt, major plant damage”, went from
5.54×10⁻⁶ to 4.60×10⁻² after the failure of both the pump and the flow control components. The
obtained dynamic profile is shown in Figure 14. The obtained values for the probability of the
consequences are presented in weeks and summarized in Table 9 and Table 10.
The creation of the BN in the case study was performed using commercially available software.
However, the dynamic modeling, when using this software, was a long and manual process. For
every change in time, or registered evidence, the user must input the information and run
the sequence of calculations manually. Currently there is no straightforward methodology for
automating this sequence with available BN software. To overcome this limitation, the authors
developed a tool that aims to monitor, in real-time, potential increases in risk levels as a result of
pre-identified risk factors. The developed tool is called PULSE, and stands for Process Unit Life
Safety Evaluation.
PULSE uses BN math to continuously update the values of risk. Performing dynamic (time
dependent) risk modeling with BN is not a straightforward procedure, as it involves the
• Insertion of evidence in the BN and the recalculation of the probability of the nodes in the
BN given the evidence
To automate the dynamic calculations of risk, PULSE has an engine (called the solving engine)
that reads the input data necessary to create the BN, builds the BN and continuously runs the
calculation of the BN. PULSE was developed using the Java programming language in unison
with a BN software application interface that provides a set of routines for building the BN and
executing the calculations.
The output of PULSE was a series of BN files, one generated each time the probability of the
consequences was updated as a function of time or insertion of evidence. In addition, the
probability values of the consequences were extracted from the BN files and compiled in a file
format used to store tabular data. These data were later used to plot the dynamic risk profile.
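As an illustration of this last step, the sketch below compiles a series of probabilities into comma-separated text. The CSV format is an assumption, since the text does not name the tabular format used by PULSE, and the function name is hypothetical. The values are the week 7 to 12 probabilities of C3 from Table 10.

```python
import csv
import io

# Probability of consequence C3 for weeks 7 to 12, copied from Table 10.
C3_BY_WEEK = {7: 5.06e-3, 8: 4.93e-2, 9: 6.10e-3, 10: 6.84e-3, 11: 7.59e-3, 12: 4.60e-2}

def to_csv(series, label):
    """Return the series as CSV text with one row per week."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["week", label])
    for week in sorted(series):
        writer.writerow([week, f"{series[week]:.2e}"])
    return buffer.getvalue()

csv_text = to_csv(C3_BY_WEEK, "C3")
print(csv_text)
```

A file of this kind can be read directly by a spreadsheet or plotting tool to draw the risk profile.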
5 Conclusions
The application of Bayesian Networks provides several advantages for the development of a
tool for real-time risk assessment, because of their ability to update probability values given
evidence or observations. This property of BN helps to overcome the limitations of current QRA
techniques, which are static in nature and thus do not provide a real-time estimation of risk.
Reliability and risk are linked: increasing reliability can reduce the risk of undesired events in
process plants. To that effect, measuring meaningful reliability indicators helps improve the
performance of equipment and active safety barriers. However, there still seems to be no clear
link between the reliability data measured at a process plant and the quantitative measure of risk.
Reliability assessment models use both discrete and continuous variables. This work presents a
framework that incorporates reliability-related data, such as failure rate, time to failure and
life cycle stage of equipment, into a hybrid Dynamic Bayesian Network to estimate the risk at
different time horizons. As expected, the results show that the probability of failure increases
as a function of time if maintenance is not provided, resulting in higher values of risk. In fact,
risk can increase by two or three orders of magnitude if a good maintenance program is not put in
place.
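A simple worked example shows why the failure probability grows with time when no maintenance is performed. It assumes a constant failure rate λ (the flat region of the bathtub curve in Figure 5), which is an illustrative simplification and not necessarily the distribution used in the case study. The value of λ is hypothetical.

```latex
% Assumed exponential time-to-failure model with constant failure rate \lambda
F(t) = 1 - e^{-\lambda t}

% Hypothetical rate \lambda = 0.01 failures per week
F(1)  = 1 - e^{-0.01} \approx 9.95 \times 10^{-3}
F(12) = 1 - e^{-0.12} \approx 1.13 \times 10^{-1}
```

Under this assumption the probability of failure of a single component grows by roughly one order of magnitude between week 1 and week 12.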
However, performing dynamic hybrid BN modeling with available software is a complex,
time-consuming and manual process. To overcome this limitation and automate the dynamic
calculations, PULSE was developed: a tool that aims to monitor, in real time, potential increases
in risk levels as a result of pre-identified risk factors. The tool can be used for updating the
values of risk as a function of time and as evidence is inserted. In addition, it may be applied
to support decision making, since it can be used to estimate risk at
level.
The results obtained in this work show the change in the values of risk when evidence of
equipment failure is inserted, and demonstrate the importance of updating the estimated risk
values given real-time observed risk factors (equipment failure and effect of time). Capturing the
effect of time and equipment failure in the estimated risk values is key, since the probabilities
of the consequences can be several orders of magnitude higher than the values predicted at the
early stages of a process unit or facility (obtained from the static analysis).
The proposed BN methodology was applied to establish the framework for the development of
PULSE. PULSE provides the means to automate the calculations for updating the values of the
probability of the consequences as a function of time and insertion of evidence. The values
obtained with PULSE can be used for plotting the dynamic risk profile in real time.
Future research work includes extending this approach to repairable systems so that the time to
repair (TTR) is taken into account. In addition, the approach must be extended to include human
factors KPIs.
ACCEPTED MANUSCRIPT
6 References
[1] A. Bobbio, L. Portinale, M. Minichino, and E. Ciancamerla, “Improving the analysis of
dependable systems by mapping Fault Trees into Bayesian Networks,” Reliab. Eng. Syst.
Saf., vol. 71, no. 3, pp. 249–260, 2001.
[2] N. Fenton and Martin Neil, Risk Assessment and Decision Analysis with Bayesian
Networks. CRC Press, 2013.
PT
[3] P. Weber, G. Medina-Oliva, C. Simon, and B. Iung, “Overview on Bayesian networks
applications for dependability, risk analysis and maintenance areas,” Eng. Appl. Artif.
RI
Intell., vol. 25, no. 4, pp. 671–682, Jun. 2012.
[4] H. Pasman and W. Rogers, “The bumpy road to better risk control: A Tour d’Horizon of
new concepts and ideas,” J. Loss Prev. Process Ind., vol. 35, pp. 366–376, 2015.
SC
[5] H. J. Pasman and W. Rogers, “Bayesian networks make LOPA more effective, QRA more
transparent and flexible, and thus safety more definable!,” J. Loss Prev. Process Ind., vol.
26, no. 3, pp. 434–442, May 2013.
[6]
U
A. Meel and W. D. Seider, “Plant-specific dynamic failure assessment using Bayesian
AN
theory,” Chem. Eng. Sci., vol. 61, pp. 7036–7056, 2006.
[7] A. Meel and W. D. Seider, “Real-time risk analysis of safety systems,” Comput. Chem.
Eng., vol. 32, pp. 827–840, 2008.
M
[8] M. Kalantarnia, F. Khan, and K. Hawboldt, “Dynamic risk assessment using failure
assessment and Bayesian theory,” J. Loss Prev. Process Ind., vol. 22, no. 5, pp. 600–606,
2009.
D
accident using dynamic risk assessment approach,” Process Saf. Environ. Prot., vol. 88,
no. 3, pp. 191–199, 2010.
[10] N. Khakzad, F. Khan, and P. Amyotte, “Safety analysis in process facilities: Comparison
EP
of fault tree and Bayesian network approaches,” Reliab. Eng. Syst. Saf., vol. 96, no. 8, pp.
925–932, 2011.
[11] N. Khakzad, F. Khan, and P. Amyotte, “Dynamic safety analysis of process systems by
C
mapping bow-tie into Bayesian network,” Process Saf. Environ. Prot., vol. 91, no. 1–2,
pp. 46–53, 2013.
AC
[12] B. Cai, Y. Liu, Y. Zhang, Q. Fan, Z. Liu, and X. Tian, “A dynamic Bayesian networks
modeling of human factors on offshore blowouts,” J. Loss Prev. Process Ind., vol. 26, no.
4, pp. 639–649, 2013.
[13] F. Ayello, N. Sridhar, G. Koch, V. Khare, A. W. Al-methen, and S. Safri, “Internal
Corrosion Threat Assessment of Pipelines Using Bayesian Networks,” in Corrosion 2014,
2014.
[14] F. Ayello, S. Jain, N. Sridhar, and G. H. Koch, “Quantitive Assessment of Corrosion
Probability — A Bayesian Network Approach,” Corrosion, vol. 70, no. 11, pp. 1128–
ACCEPTED MANUSCRIPT
1147, 2014.
[15] S. Jain, F. Ayello, V. Khane, and N. Sridhar, “Probabilistic Assessment of Stress
Corrosion Cracking of Pipelins,” in Corrosion 2014, 2014.
[16] M. Abimbola, F. Khan, and N. Khakzad, “Dynamic safety risk analysis of offshore
drilling,” J. Loss Prev. Process Ind., vol. 30, pp. 74–85, 2014.
PT
[17] N. Paltrinieri, F. Khan, P. Amyotte, and V. Cozzani, “Dynamic approach to risk
management: Application to the Hoeganaes metal dust accidents,” Process Saf. Environ.
Prot., vol. 92, no. 6, pp. 669–679, 2014.
RI
[18] C. Yeo, J. Bhandari, R. Abbassi, V. Garaniya, and S. Chai, “Dynamic risk analysis of
offloading process in floating liquefied natural gas (FLNG) platform using Bayesian
Network,” J. Loss Prev. Process Ind., vol. 41, pp. 259–269, 2016.
SC
[19] N. Khakzad, “Application of dynamic Bayesian network to risk analysis of domino effects
in chemical infrastructures,” Reliab. Eng. Syst. Saf., vol. 138, pp. 263–272, 2015.
U
[20] S. Barua, X. Gao, H. Pasman, and M. S. Mannan, “Journal of Loss Prevention in the
Process Industries Bayesian network based dynamic operational risk assessment,” J. Loss
AN
Prev. Process Ind., vol. 41, pp. 399–410, 2016.
[21] S. Wu, L. Zhang, W. Zheng, Y. Liu, and M. Ann, “Journal of Natural Gas Science and
Engineering A DBN-based risk assessment model for prediction and diagnosis of offshore
M
drilling incidents,” J. Nat. Gas Sci. Eng., vol. 34, pp. 139–158, 2016.
[22] D. Marquez, M. Neil, and N. Fenton, “Improving reliability modeling using Bayesian
Networks and dynamic discretization.”
D
[24] R. Manzini, A. Regattieri, H. Pham, and E. Ferrari, Maintenance for Industrial Systems.
Springer, 2010.
EP
[25] M. Tweeddale, Managing Risk and Reliability of Process Plants. Gulf Professional
Publishing, 2003.
[26] SINTEF Industrial Management, OREDA: Offshore Reliability Data Handbook, 4th ed.
C
2002.
AC
ACCEPTED MANUSCRIPT
Consequences   Description
C1             Operator burnt, minor plant damage
C2 and C6      Minor plant damage
C3             Operator burnt, major plant damage
C4 and C8      Major plant damage
C5             Operator inconvenience, minor plant damage
C7             Operator inconvenience, major plant damage
C9             No consequences
Table 2. Probability of failure of equipment and safety barriers

Node       State     Probability
SV         Failure   1.25×10⁻²
           Success   9.87×10⁻¹
TCV        Failure   6.26×10⁻³
           Success   9.94×10⁻¹
FAL        Failure   6.25×10⁻³
           Success   9.94×10⁻¹
Operator   Failure   1.00×10⁻¹
           Success   9.00×10⁻¹
C2   3.96×10⁻³
C3   2.69×10⁻⁵
C4   2.66×10⁻³
C5   2.38×10⁻⁶
C6   4.52×10⁻⁵
C7   1.01×10⁻⁶
C8   1.94×10⁻⁵
C9   9.93×10⁻¹
Table 6. CPT for posterior nodes calculation

Posterior node
Prior Node State   Failure   Success
Failure            0.4       0.1
Success            0.6       0.9
Table 7. Initial probability of the Consequences for the Discrete and Hybrid BNs.

Consequence   Discrete BN   Hybrid BN
C1            8.08×10⁻⁵     1.68×10⁻⁷
C2            3.96×10⁻³     8.23×10⁻⁶
C3            2.69×10⁻⁵     5.54×10⁻⁶
C4            2.66×10⁻³     5.60×10⁻⁸
C5            2.38×10⁻⁶     4.95×10⁻⁹
     Months
     0           3           6           9           12          15          18
C4
C5   4.95×10⁻⁹   8.10×10⁻⁶   1.76×10⁻⁵   2.81×10⁻⁵   3.92×10⁻⁵   5.06×10⁻⁵   6.17×10⁻⁵
C6   9.40×10⁻⁸   1.54×10⁻⁴   3.34×10⁻⁴   5.34×10⁻⁴   7.45×10⁻⁴   9.62×10⁻⁴   1.17×10⁻³
C7   2.12×10⁻⁹   3.47×10⁻⁶   7.53×10⁻⁶   1.21×10⁻⁵   1.68×10⁻⁵   2.17×10⁻⁵   2.65×10⁻⁵
C8   4.03×10⁻⁸   6.59×10⁻⁵   1.43×10⁻⁴   2.29×10⁻⁴   3.19×10⁻⁴   4.12×10⁻⁴   5.03×10⁻⁴
Table 9. Probability of the consequences as a function of time and evidence (Week 0-6)

     Week
     0           1           2           3           4           5           6
C1   1.68×10⁻⁷   1.20×10⁻⁶   2.58×10⁻⁵   5.52×10⁻⁵   8.13×10⁻⁵   1.06×10⁻⁴   1.30×10⁻⁴
Table 10. Probability of the consequences as a function of time and evidence (Week 7-12)

     Week
     7           8           9           10          11          12
C1   1.53×10⁻⁴   1.49×10⁻³   1.85×10⁻⁴   2.07×10⁻⁴   2.30×10⁻⁴   1.39×10⁻³
C2   7.51×10⁻³   7.32×10⁻²   9.06×10⁻³   1.01×10⁻²   1.12×10⁻²   6.84×10⁻²
C3   5.06×10⁻³   4.93×10⁻²   6.10×10⁻³   6.84×10⁻³   7.59×10⁻³   4.60×10⁻²
C4   5.11×10⁻⁵   4.98×10⁻⁴   6.17×10⁻⁵   6.91×10⁻⁵   7.67×10⁻⁵   4.65×10⁻⁴
C5   4.52×10⁻⁶   4.40×10⁻⁵   5.45×10⁻⁶   6.11×10⁻⁶   6.78×10⁻⁶   4.11×10⁻⁵
C6   8.58×10⁻⁵   8.37×10⁻⁴   1.04×10⁻⁴   1.16×10⁻⁴   1.29×10⁻⁴   7.82×10⁻⁴
C7   1.94×10⁻⁶   1.89×10⁻⁵   2.34×10⁻⁶   2.62×10⁻⁶   2.91×10⁻⁶   1.76×10⁻⁵
C8   3.68×10⁻⁵   3.59×10⁻⁴   4.44×10⁻⁵   4.98×10⁻⁵   5.52×10⁻⁵   3.35×10⁻⁴
Figure 1. Dynamic behavior of risk
Figure 3. Types of BN that can be created with PULSE
Figure 5. The bathtub curve
Figure 8. Resulting BN from mapping the Bowtie
Figure 11. Prior and posterior nodes associated to a given parent node
Figure 12. Probability of the consequences for discrete and hybrid BNs
Figure 14. Risk profile as a function of time and insertion of evidence