Systemic Approach To Improve Learning From Incidents
A Thesis
by
MASTER OF SCIENCE
May 2017
Learning from incidents has always been very challenging for industry and
spite of the efforts and resources spent on incident investigation processes and
Laboratories explosion, and West Fertilizer explosion continue to occur. It is evident that
as an industry, organizations are continually failing to learn from past incidents. The
question is: if organizations are ready to learn, are they following the right learning
approach? Organizations should adopt a systemic learning approach where the collected
systems and consequently prevent future incidents. The objective of this research is
to enhance our understanding of how a company and the industry as a whole learn from
past incidents and to define the key elements to improve systemic learning. This study
was divided into four main phases: identification of the learning system, development of
a learning process and a proposed incident investigation process and finally the
First, the learning system for the chemical and oil and gas industry, the different
types of learning, and the entities involved in it have been characterized. Based on the
results, some limitations of the system were identified and discussed. Secondly, a
systemic process for improving learning from incidents has been developed based on the
identified limitations of the learning system. The proposed process provides a holistic
view of the learning process and explores the concept of knowledge management into
executed within the organization and how to support the implementation of safety
knowledge inside of it. Third, an incident investigation process has been proposed to
provide additional sources of information into the analysis, in order to support the
identification of root causes and the required changes in the management systems of the
organization. The process that has been developed enhances understanding of how to get
their processes. Finally, the proposed process has been explained through a case study.
The obtained results provide a clear picture of how an incident investigation report can
DEDICATION
To
ACKNOWLEDGEMENTS
mentor, Dr. Sam Mannan. His guidance and continuous support gave me the confidence
also want to thank him for giving me the opportunity of being part of the Mary Kay
O’Connor Process Safety Center at which I had the opportunity to meet inspiring and
wonderful people who taught me and supported me during this journey. I also would like
to thank my committee members, Dr. Mahmoud El-Halwagi and Dr. Peres, for taking
their time to serve on my committee. Especially, I would like to thank Dr. Peres for her
guidance, support, and the time we spent together trying to make this real.
I want to thank Dr. Noor Quddus for his support in all stages of my research. His
also thankful to Valerie Green for all her administrative support, patience, and personal
I also want to thank my friends at Texas A&M University for all the special
moments we spent together. I am grateful to my boyfriend for all his support and the
happiness he brings to my life. Finally, I would like to thank my parents who made my
dream real, for motivating me to be a better person, and their unconditional love,
CONTRIBUTORS AND FUNDING SOURCES
Sam Mannan [advisor and chair] of the Department of Chemical Engineering and
Professor S. Camille Peres of the Department of Public Health. All work for the
Graduate study was supported by the Harry West fellowship from Texas A&M
TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
1. INTRODUCTION
3.5. Incident investigation process
3.5.1. Reporting
3.5.2. Collecting evidence
3.5.3. Analyzing the evidence
3.5.4. Developing recommendations
3.5.5. Report and lessons learned
3.6. Incident investigation methodologies
3.6.1. Fault tree analysis
3.6.2. STEP (Sequential Timed Events Plotting)
3.6.3. FRAM (Functional Resonance Analysis Method)
3.6.4. Tripod BETA
3.6.5. IPICA (Integrated Procedure of Incident Cause Analysis)
3.6.6. TapRooT® System
3.6.7. MORT (Management Oversight and Risk Tree)
3.6.8. STAMP (Systems-Theoretic Accident Model and Processes)
3.7. Developing quality incident investigation reports
LIST OF FIGURES
Figure 20 US Gulf of Mexico offshore oil and gas pipeline types of failures over the
period from 1970 to 1999, adapted from [116]
LIST OF TABLES
1. INTRODUCTION
1.1. Motivation
In the chemical and oil and gas industry, organizations and regulators have spent
techniques in order to improve continually, avoid the same mistakes, and consequently
prevent incidents. In spite of these efforts, the outcome has not shown a significant
improvement and process safety incidents continue to occur. According to the Bureau of
Labor Statistics [1], over the five-year period from 2010 to 2014, there were 758
fatalities in the chemical and oil and gas industry (the statistics cover the following
sectors: manufacturing, oil and gas extraction, drilling oil and gas wells, support
activities for oil and gas operations, and petroleum refineries). Likewise, in 2014, an
estimated 46,200 nonfatal injuries and illnesses were reported [1]. These statistics
suggest that the industry as a whole is failing to learn from past incidents. Learning
management systems and prevent future incidents, by understanding what went wrong in
past incidents, analyzing their lessons learned and implementing the necessary changes
During the last three decades, the industry has seen significant process safety
incidents that have been precursors of changes in regulations and standards in the United
States and the rest of the world. However, there is a failure to successfully implement
and enforce those changes and make them endure over the years. Thus, process safety
incidents become part of history and the industry is doomed to repeat the same
mistakes. Without doubt, no process safety incident has been as harmful as the Bhopal
disaster in 1984, due to its enormous impact in terms of fatalities, financial losses,
environmental damage, and industry reputation. This incident made organizations realize
the importance of learning from incidents not only inside the organization, but also
from external incidents that may be applicable to their operations. While no
other incident has been as devastating as the Bhopal incident, there have been several
disasters over the last decades that have had a significant impact on the industry and in
which the same mistakes can be identified over and over again.
During the same year as the Bhopal disaster, in San Juan, Mexico City, a catastrophic
fire and series of explosions at an LPG terminal killed over 500 people, injured 7,000,
and forced the evacuation of more than 200,000 people [2]. On the day of the incident,
LPG leaked from a pipeline rupture and formed a vapor cloud that dispersed into the
surroundings for about 10 minutes. The vapor cloud ignited, and the resulting flash fire
caused a violent shock that led to a series of BLEVEs and minor explosions in nearby
houses and facilities in the area [3]. Some of the causes and contributing factors of this
incident were that no hazard identification process was carried out for the unit and there
was a lack of awareness regarding the associated hazard not only within the organization
but also among the community and the emergency responders. Likewise, there was a lack of
effective land use planning due to the lack of regulations to control the construction of
residential houses around the facility. In addition, the incident revealed an inadequate
mechanical integrity program as one of its root causes. Moreover, there
was a lack of emergency response planning and training regarding hazard identification.
Due to the substantial impact of this incident, the same causes and contributing factors
organization reveals a failure to learn from its own history. The safety performance of
the organization over the last ten years shows an estimated 197 incidents and 21
fatalities per year [4, 5]. In these process safety incidents, inadequate mechanical
integrity, inadequate land use planning, and inadequate emergency response were
identified as contributing factors.
emergency responders, injured more than 260 people and severely damaged the nearby
explosions in a warehouse killed 173 people, 110 of them first responders, injured more
than 797 people and damaged around 17,000 houses [7]. In both incidents, the same
contributing factors as in the Mexico City disaster, such as poor hazard awareness,
inadequate land use planning, and inadequate emergency planning, were identified. Even
though these incidents do not have the same causes as the Mexico City incident, it is
evident that the same contributing factors increased their severity. Both examples illustrate a lack of
learned.
As can be seen, organizations have failed to recognize and learn effectively from
past process safety incidents. In addition, the knowledge that is embedded in those
incidents has been lost over the years, making it difficult for new generations to
understand the reasons behind the implementation of those lessons learned. The
system, which enables organizations to learn, adapt and grow [8]. This suggests having
relevant information, the analysis of it and the use of this information to improve and
prevent process safety incidents. The question is: if organizations have the required
information, are they following the right learning approach? Organizations should adopt
a systemic learning approach where the collected safety knowledge is leveraged in order
to enhance safety management systems and make the lessons learned part of their
culture.
1.2. Background
The study of how organizations learn from incidents and manage safety to
prevent process safety incidents is not a new topic within the process safety research
area. However, this field has gained more attention during the last sixteen years and
more disciplines such as psychology, sociology, and engineering have been studying it.
As a result, learning from incidents has become a fragmented field in which different
approaches have been proposed for specific scenarios or situations such as a certain part
of the learning process, the group of interest or the industry in which it is applied. In
the 1970s, the theory of organizational learning proposed by Argyris and Schön
highlighted the importance of learning to detect and respond to undesirable events [9].
The theory suggested two modes of learning: single-loop and double-loop learning. Even
though both modes of learning are necessary, the authors emphasized the importance of
This theory is still accepted and has been widely recognized in this field despite current
advances in organizational learning theories [11]. Lukic’s research supports this theory by
the analysis of the type of learning that is adopted in organizations, in which incidents
usually are a combination of technical, human, and organizational factors, making single
learning [13].
Over the years, learning from incidents has been categorized into several areas of
study such as lessons learned, incident investigation and analysis, learning from incident
process, and conditions for learning [11]. Kletz has been one of the pioneers in introducing
lessons learned from process safety incidents into organizational systems and
in presenting real examples of how organizations failed and how to overcome these types
of failures. Furthermore, the author discussed how organizations have no memory and
the need to share and implement lessons learned [14, 15]. Aligned with this author,
several studies have also suggested that significant improvements have been made in the
lessons learned inhibit the complete learning process [16-18]. Moreover, the literature
provides guidelines on how to develop lessons learned, how to enhance the dissemination
process and how lessons learned can be embedded into the organization [17].
Incident investigation and analysis represents one of the most critical steps in the
learning process due to the challenges that may arise from the identification of root
causes in process safety incidents. Lukic stated the importance of identifying appropriate
solutions within the proper domain [12]. Similarly, research by Lindberg, Hansson and
Rollenhagen explored the literature with respect to the incident investigation process and
proposed the CHAIN model based on six basic criteria for incident investigation. The
model argued the importance of studying the effectiveness and effects of different
incident investigation methods in order to apply appropriate techniques and improve the
experience feedback process [19]. In the same context, Fahlbruch and Schöbel presented
identified five elements that organizations should analyze to select the appropriate
approach for learning: participants, learning process, type of incident, type of knowledge
and learning context [21]. Although the framework proposes valuable initiatives, it does
not identify how its initiatives are linked to a system that can be implemented in
approaches from which learning has been analyzed. The framework provided a bigger
picture of this field and highlighted the main elements that influence learning such as
actors, countries, steps, industry, disciplines and intensity of the event [22]. Conversely,
learning from incidents within an organization over time. Cooke proposed a model to
show the dynamics of the system, how each element operates and serves as a continuous
improvement process. The model also suggested how incident-learning systems serve
as a bridge between accident causation theory and high reliability theory [8]. Likewise,
Avnet analyzed the expected behavior of shared knowledge over time by the validation
of shared knowledge networks within the offshore oil and gas industry [23].
necessary conditions to increase reporting and enhance learning [11, 24]. In this context,
studies explored the potential impact of learning at multiple levels and the need for better
tools for incident investigation analysis [25]. Likewise, the literature emphasizes the
need to create resilient organizations that are able to adapt to and absorb
changes once an incident has occurred [26]. In the same way, studies suggested the need
for the involvement of new groups of experts to incorporate new insights into the field of
Jacobson presented a method for evaluating the level of learning in terms of how broadly
the lessons learned are implemented, how much the organization is involved in this
process and for how long the lessons learned last within the organization [28]. Similarly,
studies suggested different learning criteria to analyze and identify potential areas of
opportunity within the learning process [29, 30]. Finally, Naot proposed a method to
determine the quality of the organizational learning system within the organization [31].
1.3. Objectives
stages of the learning process and enhance incident investigation processes. This is done
with the objective of providing a holistic and systemic view of the learning from
incidents process and exploring the concept of knowledge management into safety
order to provide an overall analysis of learning systems in the chemical and oil and gas
systems within the organization. In this context, the specific objectives of this research
are:
Identify the limitations in the learning system for the chemical and oil and gas
Enhance understanding of how to learn from process safety incidents and how to
internal learning. Additionally, to identify the key elements organizations should
improve learning from incidents and will serve as a guideline for organizations on
how corporate learning systems can be executed within the organization and how to
This thesis involves four main phases: identification of the learning system,
process and finally the validation of the proposed incident investigation process through
a case study.
The first phase reviews the organizational learning theory and relevant concepts
for the development of the following sections. It also discusses the barriers that inhibit
organizational learning, factors that influence learning, and the different types of
learning methods that organizations can apply. Moreover, it reviews the incident
relevant incident investigation methodologies. Finally, this phase discusses the learning
The second phase of this research introduces the proposed learning process and
examines each element of it. It also shows how a corporate learning system should
operate and the key elements that organizations need to take into account during its
implementation.
The third phase of this thesis introduces a more detailed analysis of the first step
in the proposed learning process: enhance internal information. It examines the incident
presents the proposed incident investigation process and provides a description of each
Finally, the last phase analyzes an incident that occurred in the offshore industry
in North America. The incident has been examined through the proposed incident
investigation process shown in the previous section. The analysis covers
two external incidents with similar causes and three internal incidents that
occurred in the same organization. It should be noted that the incident investigation was
performed by the organization at the time the incident occurred. Therefore,
the objective of this analysis is to identify opportunities to improve the final report
through the analysis of similar incidents in the industry. The explained methodology is
presented in Figure 1.
Figure 1 Thesis methodology
2. ORGANIZATIONAL LEARNING
perspective, since it emerges from the internal and external interactions among
individuals, technology and cultures. Thus, learning is not limited to the acquisition of
technical knowledge; it also requires the analysis and interpretation of the context where
the knowledge is going to be applied, and the understanding of human behavior and
social interactions that emerge in an organization [32]. As a result, this field has been
very challenging for academia and industry, since the outcome will always differ
from organization to organization and, even within the same organization, among
generations. The real challenge, then, is to move from individual to
organizational learning, in order to ensure that the knowledge has become part of the
culture of the organization regardless of the people that are part of it. Individual learning
refers to the enhancement of individual mental models, through the acquisition of new or
[33]. Once the individual learning is achieved, this knowledge has to be stored,
organizational learning.
routine actions and values on behalf of the organization [33]. Therefore, individual
learning sets the foundation of this theory, and then the organizational components are
added to gain a better understanding of the expected outcome. As stated by Fiol and
knowledge that occurs as a function of experience” [34]. Likewise, in the technical view,
response to, information both inside and outside the organization” [32]. Finally, as
asserted by Huber, “an entity learns if, through its processing of information, the range of
its potential behaviors is changed… an organization learns if any of its units acquires
knowledge that it recognizes as potentially useful to the organization” [32, 35]. Based
process in which the knowledge is acquired from both outside and inside the organization,
that this learning changes human behaviors and gains experience for the organization.
Learning is a continuous process and varies depending on the size of the ‘unit’
Figure 2. Individual and organizational learning have already been explained in the previous
paragraphs. However, between them, the group learning is present, which refers to a
group that gains knowledge together through the interaction and experience with another
individual [37]. As an example of group learning, we can identify a group of people from
different departments sharing the lessons learned from a case study. On the other hand,
organizational learning is not the ultimate goal to understand the learning system as a
whole. The final level of learning is the inter-organizational learning, in which different
organizations share, communicate and learn together from experience. For the purpose
of this research this type of learning refers to the industry learning, more specifically, the
and double-loop. This model is based upon “theory of action” and “theory of use”,
developed by Argyris and Schön [38]. The model stated how people act based on their
mental maps and therefore how they plan, execute and evaluate their actions [38]. These
Figure 3.
Governing variables describe why people do what they do. These are the main
values, beliefs and conceptual frameworks that are rooted in their cultural
background [39].
Actions describe what people do. Actions are the plans and strategies used by
Results describe the consequences or what people obtain from their actions. These
Single-loop learning refers to the learning process in which the actions are modified based
on the expected results or the consequences obtained. This type of learning involves the
recognition of an undesired outcome and the modification of the actions that lead to it.
It can also be defined as the correction of an error by changing the strategy of actions. In
single-loop learning the governing variables remain unchanged. Conversely, double-loop
learning requires an additional step in which the governing variables are questioned
and evaluated in order to develop changes in the values, policies and objectives of the
organization [39].
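The contrast between the two modes of learning can be illustrated with a minimal Python sketch. It is not part of the model by Argyris and Schön: the inspection-interval scenario, the 6-month threshold, and every name in it (`Organization`, `SAFE_INTERVAL`, `single_loop`, `double_loop`) are hypothetical and chosen only for illustration. The governing variable is a policy value, the action is what is actually done, and the result is whether an undesired outcome is still present.

```python
from dataclasses import dataclass

# Hypothetical threshold: inspections less frequent than this (in months)
# are assumed to let equipment degradation go unnoticed.
SAFE_INTERVAL = 6


@dataclass
class Organization:
    policy_interval: int  # governing variable: interval required by policy
    actual_interval: int  # action: interval actually applied in the field


def undesired_outcome(org: Organization) -> bool:
    """Result: True while the applied interval is still unsafe."""
    return org.actual_interval > SAFE_INTERVAL


def single_loop(org: Organization) -> None:
    """Correct the action only; the governing variable is not questioned."""
    if undesired_outcome(org):
        # Bring the action back in line with the existing policy.
        org.actual_interval = min(org.actual_interval, org.policy_interval)


def double_loop(org: Organization) -> None:
    """Question the governing variable first, then realign the action."""
    if undesired_outcome(org):
        if org.policy_interval > SAFE_INTERVAL:
            org.policy_interval = SAFE_INTERVAL  # change the policy itself
        org.actual_interval = org.policy_interval


a = Organization(policy_interval=12, actual_interval=18)
single_loop(a)
print(a, undesired_outcome(a))

b = Organization(policy_interval=12, actual_interval=18)
double_loop(b)
print(b, undesired_outcome(b))
```

In this sketch the single-loop correction restores compliance with the policy but leaves the undesired outcome in place, because the policy itself was inadequate. Only the double-loop correction, which revises the governing variable, removes it.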
The literature suggests that most organizations act according to single-loop
learning [40], in which operators spend their time focusing on how to correct
immediate actions to avoid the same mistake, but no actual analysis is performed to
detect the root causes of the incident and, consequently, determine if a change in the
develop critical thinking, in which subject matter experts should analyze the situation
and challenge the existing rules governing the organization. In conclusion, double-loop
learning allows organizations to identify root causes instead of intermediary causes and
learning can limit organizations to only solve the symptoms of the actual organizational
causes.
2.3.1. Knowledge
transformation of information, in which expert opinion, skills, and experience have been
in order to understand patterns and support decision-making [42, 43]. Thus, the
employees and make sure that this knowledge is successfully stored and transferred
Knowledge can be categorized into two forms: explicit and tacit knowledge.
Explicit knowledge refers to what people know and can be explained by individuals.
This type of knowledge is relatively easy to store, identify, and retrieve [44, 45]. Explicit
knowledge transfer can be verbal or written. Examples of explicit knowledge are reports,
procedures or case studies. The challenge with this type of knowledge is to ensure the
know-how [45, 47]. This type of knowledge is personal in nature; this implies that the
knowledge remains in the heads of individuals that are part of the organization.
Therefore, tacit knowledge is hard to extract and is mostly experience based. Tacit
Examples of tacit knowledge are the skills and expertise acquired by an employee or the
experiences from one entity to another, within the organization and among them [49].
Those entities can vary depending on the number of people involved, such as groups,
from individual level to higher levels, in order to ensure that knowledge has been
organizations are transferring knowledge is a difficult task, due to the complexity of
interconnection among three elements: members, tools, and tasks. By combining those
elements through the process, organizations can create a knowledge management system,
The process of transferring knowledge starts with externalization, in which the
person or team must find the best method or approach to deliver the knowledge they
want to transfer. In this stage, the tacit knowledge has to be converted into explicit
which the knowledge is adapted and analyzed based on the context where it is going to
incorporated into the system [51, 52]. However, this process does not ensure that the
Additionally, there are barriers involved in the process that make it more
complex and therefore make the expected outcome unpredictable in some cases. Some
who the intended receivers are, how familiar they are with the knowledge they will
get, and the context where it is applied. The similarity of task and context between
the transferring entity and the receiver entity determines how complex the process
understand the background of the receiver, in terms of the experience and technical
knowledge that they have, in order to define the best approach to deliver the
knowledge [52].
Nature of the task: another barrier involved in the process is the nature of the
task, referring to how frequently a task is performed and whether the task is a routine
activity or not. The nature of the task enables us to understand how familiar the
employees are with it and the type of system they are approaching [52].
of the type of knowledge that will be transferred. The knowledge can be tacit or
knowledge and is then analyzed, stored, and retrieved as a form of experience gained by
can increase productivity, decrease costs, and generally improve the know-how of the
organization.
Organizational memory is defined as the knowledge from the past that is exerted
upon present organizational tasks and routines [41, 53]. Likewise, organizational
memory is seen as the stored knowledge that an organization possesses [41]. This
concept has been evolving over the years, changing the approach and techniques applied
to gain and retain knowledge within the organization. The main change resides in the
evolution from individual memory to a corporate memory, with respect of the place
where the knowledge is stored. Nowadays, the objective is to enhance the mechanisms in
which knowledge is stored so it can be easily extracted through the years and
the management system, and the physical structure of the workplace [41]. The
As Walsh and Ungson state, the organizational memory process involves three
stages: acquisition, retention, and retrieval [54]. The acquisition occurs when new
information is gained based on decisions and the evaluation of the consequences of those
organizational memory that is acquired by the organization [54]. Then, the information
is retained by the different storing mechanisms that the organization possesses, in order
employees can access the stored knowledge throughout the whole organization.
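The three stages named by Walsh and Ungson can be sketched as a small Python class. This is only an illustrative assumption of how the stages relate, not an implementation taken from the cited work: the class `OrganizationalMemory`, its method names, the topic keys, and the example lesson are all hypothetical.

```python
from collections import defaultdict


class OrganizationalMemory:
    """Toy model of the acquisition, retention, and retrieval stages."""

    def __init__(self):
        # Retention store: lessons filed by topic, independent of any
        # individual employee.
        self._store = defaultdict(list)

    def acquire(self, decision, consequence):
        """Acquisition: a decision and its evaluated consequence form a lesson."""
        return {"decision": decision, "consequence": consequence}

    def retain(self, topic, lesson):
        """Retention: file the lesson in the shared store under a topic."""
        self._store[topic].append(lesson)

    def retrieve(self, topic):
        """Retrieval: return a copy of every lesson stored under a topic."""
        return list(self._store.get(topic, []))


memory = OrganizationalMemory()
lesson = memory.acquire("deferred pipeline inspection", "undetected leak")
memory.retain("mechanical integrity", lesson)
print(memory.retrieve("mechanical integrity"))
```

Keeping the store separate from the people who fill it reflects the shift from individual memory to corporate memory described above: the lesson remains retrievable after the employee who recorded it has left.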
2.3.4. Organizational forgetting
Do organizations retain all the knowledge they get over the years? Does
knowledge change over time? Employees leave, technologies and best
practices change, and not all the information can be retained. It is important that
organizations take into account the depreciation of knowledge over time and the amount
of knowledge that is forgotten by the organization once employees are no longer part of
any level within an organization [55]. Intentional forgetting refers to the process by
knowledge. By doing that, organizations are able to adapt to new changes in the
unintentional forgetting refers to the degradation of knowledge over time, due to lack of
forgetting can be seen as positive and negative at the same time, considering unlearning
as the process that helps organizations to adapt and stay competitive in the market. In
contrast, it can be seen as negative when forgetting occurs for no reason and prevents
organizations from remembering what needs to be done to avoid the same mistakes [33, 55].
The learning process becomes more complex for organizations than for individuals
due to a number of internal and external difficulties that prevent organizations from
learning appropriately. Consequently, organizations have to focus their attention not only
on the elements that affect individual learning; they also have to understand the complexity of
learning interactions among individuals and the external elements that limit the learning
process. This research focuses on understanding eight barriers that
Ambiguity about incident causation: refers to the tendency of the top management
level to select one interpretation instead of another. That means that leaders tend
to learn only from what is familiar to them and aligned with
their preconceptions, actions, and goals [56]. This type of barrier limits the
within the organizations. Likewise, in some cases, this type of barrier can lead
exposes the self-interests of some people. In this sense, the recommendations are
going to blame someone instead of pointing out failures in the management system
reduced to a limited amount and low quality of lessons learned and partial learning
Competitive organizations: refers to the restrictions and limited knowledge that is
shared between and within organizations. Learning is limited to the knowledge
and experience that is created inside the facility or organization. People from one
facility may not know what another facility is doing in terms of safety in order to
prevent the same mistakes. Competitive organizations are one of the major barriers
that the chemical and oil and gas industry is facing nowadays. Sharing lessons
learned is restricted by lawyers inside and outside organizations, due to the exposure of
sensitive information that can reveal the know-how of the organization. The chemical
and oil and gas industry needs to learn from industries such as nuclear and aviation, in
which the information is shared among the whole industry to improve together and
Lack of leadership: the role of leadership is key for organizations in order to learn
and have an open environment that helps employees to raise their concerns and
suggestions about how things are running. Leaders need to be involved in the
learning process, to motivate and support people throughout the process and give
them the required resources to achieve the expected outcome. Thus, leaders are
responsible for giving employees the spaces to learn and high quality resources to
changed, making this process challenging for organizations because some people
are resistant to changes, act on the defensive, and have their own prejudices.
process [57].
journey in which all employees need to be engaged with the organization and
have to believe in the values and goals of the organization. A good organizational
culture allows companies to build trust among employees, motivate them to receive
organization has and how much of their knowledge they are willing to share. This
knowledge, they can also receive knowledge from others and therefore, avoid the
same mistakes. Likewise, organizations have to ensure open spaces in which the
internal knowledge can be transferred from one facility to another. Finally, leaders
with them to provide suggestions and receive feedback from their job.
Short-term focus: refers to the tendency of some organizations to focus their effort
on solving short-term problems instead of looking at the big picture. Most of the
Thus, the management system failures remain hidden and the real problems are not
analyzed. This happens because root causes require more time to be identified and
addressed, and doing so requires a higher level of commitment from top
management. Similarly, this type of barrier refers to the tendency to focus on
single-loop learning, in which only actions are modified and no analysis of the root causes
is performed.
depending on the depth, frequency, and level of detail of the task that needs to be
learned. Edgar Dale developed the “Cone of Experience” model, which refers to the
different levels of experience, from abstract to concrete [58]. The model is depicted as
a pyramid whose top holds the highest level of abstraction: words.
Conversely, the bottom of the pyramid holds the lowest level of abstraction: real
life experiences [58]. Based on this model, a whole new theory was developed in order to
explain the levels of retention with respect to the learning method that is applied.
Consequently, the pyramid was adapted and percentages were incorporated to show the
information is presented [59]. Some researchers claim that this new “learning pyramid”
lacks evidence and proper validation with respect to the percentages that are shown
[58]. However, the learning pyramid can be seen as a guideline for understanding
the different levels of retention, rather than a quantified fact about individual retention. This
means that teachers and organizations can guide their methods considering this
approach. However, they also have to take into account the background required to perform each
method, the expected outcome, and the limitations of the audience that is going to be taught.
Figure 4 presents the different average rates of retention developed by the National
Training Laboratory [60]. The pyramid can be divided into two levels: passive and
techniques in which the individual is not participating or is not playing an active role
individual plays an active role and is involved during the session, such as group
discussions, practices, and teaching others [60]. Similarly, passive and participatory
Figure 4 Training methods, adapted from [27]
Based on the pyramid, one might conclude that learning should focus only
on the bottom of the pyramid and abandon traditional ways of learning such as lectures
and reading. However, each method offers specific benefits that cannot be achieved
by focusing on the bottom of the pyramid alone. Thus, organizations have to provide training
using a combination of two or more methods, in order to ensure that the knowledge is
knowledge, by generating relationships among the existing and new knowledge [61].
Therefore, these types of techniques focus on teaching the rules of how to do
something and providing the basic concepts and theories behind them [62]. In contrast,
the behavioral approach is associated with skill development (learn by doing) and a
change in behavior [61]. This type of technique focuses on providing practical training
and allows participants to behave and think as they would in real life. Generally,
training objectives aim to achieve learning in knowledge, skills, and attitudes. No single
technique can achieve all three objectives at the same time. For that reason,
a successful training program has to combine more than one method
[63].
The advantages and disadvantages of each method of the learning pyramid are
described below:
interaction between trainer and trainee is limited to the questions and answers that
introduction, body of the lecture, conclusions, and a summary. This type of method
is useful when a large number of people must be reached. Moreover, lectures are
less expensive than other techniques and serve as a basis for other techniques
[63]. However, the trainer should strike the right balance between the
amount of material to be taught and the time available, due to the
same objectives as the lecture method, but there is no interaction between trainer and
trainee. This type of technique allows trainees to go back and check the material as
many times as they need [65]. Commonly, the reading method is used in organizations
Audio-visual method: is designed to give trainees training through audio and visual
at the same time. Therefore, trainees can be more focused on what is being taught
[65]. This type of technique allows reaching a large number of people at the same
to do a certain task and therefore try to imitate what they see. Audio-visual
techniques such as videos are widely used in organizations to show past incidents
and their lessons learned. Videos have been shown to improve the quality of training
because of the advantages that they offer: trainees have flexibility in the pace at which they
learn, they can see events that are not easy to demonstrate, and videos give them
job task and the importance of each step involved in the process [63]. Moreover,
demonstrations are useful to make the training more meaningful and realistic.
some cases the trainer allows the trainees to perform the same task, followed by
questions and discussion [66]. This type of method is usually applied to teach how
Group discussion: is designed to allow trainees to share their opinions and
group discussion methods are case study and role-play [66]. Case study is designed
difficult situation, and they have to analyze and identify the main causes of the
analysis, synthesis, and evaluation skills [63]. Similarly, in the role-play technique,
trainees have to play a character in a specific situation that recreates real life.
The objective is to compare the performance with real life conditions and improve
Practice: refers to the type of training in which the trainee has the opportunity to
get fully involved in the activity that is being taught. Thus, the trainees are
expected to carry out the activity and demonstrate that they are able to do it
correctly. Usually, a complete training program ends with some practical training
in which the trainee is expected to develop specific skills regarding a specific job.
Moreover, the trainee is able to experience how the learned skills and behaviors are
transferred to the job, and how to deal with daily issues that arise during the job
[63]. Some of the most common techniques for practical training are Job
of an explanation of the task, instructional plan, a demonstration, try out, and
follow up. In this method, the trainee has to be able to explain how to perform the
task before the execution of it. Similarly, simulators are used to imitate real life
situations. This type of technique is widely used in the military and aerospace
industry.
Teaching others: the ultimate level of retention refers to teaching others, which
someone else.
The learning process can be affected by several factors that can lead to
unsuccessful learning outcomes. Those factors can be represented as physical and mental
processes that occur at the individual level while trainees are being trained.
Likewise, the different strategies implemented to transfer knowledge and the level of
characteristics of the "adult learning" and the appropriate approach to reach them. Thus,
Trainees need to know why they should learn: they have to understand the
Meaningful training: they are more willing to learn when the information is linked
Opportunities to practice: they need spaces to rehearse and show the learned
capability [61].
Commit training to memory: they need to move the information from the short-
term memory to the long-term memory. That means they need to receive detail
Feedback: they need to receive feedback regarding how well they are meeting the
Spacing: teaching in various sessions instead of teaching all the material in short
Test: testing increases learning and helps trainees to increase their long-term
memory [67].
Second, the age of the trainees has been shown to influence the learning process, as
certain mental capacities decrease over time. However, age comes with experience.
Thus, in some cases experience can compensate for age. Third, the mental precondition that
the trainee brings to the training. This refers to the motivation and basic skills of the
trainee. Fourth, the perception that the trainee gets from the information that they are
receiving. This refers to the ability to organize and process the information. Fifth, the
ability of the trainee to relate new information to the existing knowledge and use
the information to influence behaviors. Finally, the ability of the trainee to adapt the
3. INCIDENT INVESTIGATION
3.1. Introduction
about their behavior and operational practices, in order to identify flaws in the system
and overcome the deficiencies to prevent similar incidents and avoid catastrophes. High-hazard
industries, such as the nuclear and chemical industries, are more vulnerable to
incidents are a combination of multiple failures that take place in complex systems. Even
if this type of incident has a very low probability of occurrence, once it happens it is often
catastrophic, with devastating consequences that can involve people, infrastructure, and
the environment [71]. Thus, organizations have to focus their attention on improving the
incident investigation program to detect latent failures that can lead to this type of
mitigate future incidents [72]. Moreover, this process helps organizations to identify
failures in the management system that can reduce the risk of having another incident
with similar root causes at the same facility or other facilities within the organization
[72].
The classical accident causation theory, or “Swiss Cheese Model”, developed by
Reason in 1997 gives a clear representation of the dynamics and path by which an incident
happens [71, 73]. The theory suggests that high-consequence incidents can occur
once latent and active failures are aligned, as depicted graphically in Figure 5. Latent
or preconditions for unsafe acts. Active failures, in contrast, refer to unsafe acts that act as the
The interaction between latent and active failures that lead to a catastrophic
incident is hard to detect and understand, due to the complexity and limited number of
such incidents. Thus, learning only from high-consequence incidents limits the amount of
knowledge that can be gained and shows only a partial picture of the organization.
Research has shown that in order to uncover latent failures, it is necessary to investigate
all types of incidents that have significance and can therefore help to prevent disasters [75].
H.W. Heinrich was the first person to introduce the safety pyramid in his book
Industrial Accident Prevention in 1931. The safety pyramid refers to the ratio of the
number of minor incidents and near misses that occur before a major incident takes place
[76]. The pyramid has been validated and slightly modified by Frank E. Bird Jr. in 1996
and later on by ConocoPhillips Marine in 2003 [75]. The study made by ConocoPhillips
Marine suggests that for one fatality incident, an organization has 30 severe incidents,
300 minor incidents, 3,000 near misses, and 300,000 at-risk behaviors or unsafe acts
[75]. As a result, organizations have to focus their attention on the bottom of the pyramid,
where they can identify trends and find useful information to take the necessary
preventive actions and improve safety management systems. Figure 6 presents a graphic
incidents that an organization can experience depending on their severity. There is
no general agreement regarding incident classification. Even so, organizations recognize
the importance of classifying the nature of the incident in order to determine the
78]. The terms incident and accident are commonly mixed and used
interchangeably. However, the definitions differ with respect to the capability to be
impact [77]. Some examples of catastrophic incidents are Bhopal in 1984 and
Severe and minor incident: this type of classification varies depending on the risk
matrix applied in the company. Usually, a severe incident involves major injuries,
Near miss: in terms of process safety, near miss is defined by the Mary Kay
O’Connor Process Safety Center as the event in which the loss of containment is
prevented by the last layer of protection in the process [79]. Likewise, near misses
The terminology used in incident investigations has been challenging over the
years because there is no general understanding of the concepts and the
differences among concepts such as root causes, contributing factors, and direct causes,
or of the importance of recognizing and identifying each of them in order to develop
effective investigations and prevent incidents from happening again. For the purpose of this
incident [81], i.e., that if eliminated would have prevented the incident or reduced
the consequences of it [72]. Causal factors can be classified into root cause,
Direct cause: refers to the immediate events that lead to the incident [82].
Root cause: refers to the underlying causes and the most basic causes of an
incident, that if removed, the incident would not have happened. Root causes are
associated with failures in the management system, which can be categorized into
company but it is usually associated with the severity of the incident. Thus, high
potential incidents such as near misses with high significance could be excluded from
detailed analysis and the company may lose valuable findings that can lead to the
identification of failures in the management system. For that reason, companies should
take into account variables such as the nature, complexity, and the actual or potential
severity of the incident, in order to determine the appropriate team and the level of
It should be noted that although not all incidents will be investigated in detail or
identify trends and keep track of all incidents within the organization. In this context,
Minor incidents with significance, i.e., incidents with high potential consequences.
Near misses with significance, i.e., incidents in which the outcome could have been
Minor recurring incidents or recurring near misses identified by trend analysis.
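The selection criteria above can be illustrated with a minimal sketch. The severity scale, the field names, and the function `needs_detailed_investigation` are hypothetical and introduced only for illustration; the rule that incidents of severe or higher actual severity are always investigated is an assumption, and an organization would substitute the categories of its own risk matrix.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; a real one would come from the company risk matrix."""
    NEAR_MISS = 0
    MINOR = 1
    SEVERE = 2
    CATASTROPHIC = 3


@dataclass
class Incident:
    description: str
    actual: Severity          # what actually happened
    potential: Severity       # what could have happened (the "significance")
    recurring: bool = False   # flagged by trend analysis


def needs_detailed_investigation(incident: Incident) -> bool:
    """Screen an incident on actual severity, potential severity, and recurrence."""
    # Assumption: severe and catastrophic incidents are always investigated.
    if incident.actual >= Severity.SEVERE:
        return True
    # Minor incidents and near misses with high potential consequences.
    if incident.potential >= Severity.SEVERE:
        return True
    # Minor recurring incidents or recurring near misses found by trend analysis.
    return incident.recurring


small_leak = Incident("small flange leak", Severity.MINOR, Severity.MINOR)
relief_lift = Incident("relief valve lifted", Severity.NEAR_MISS, Severity.CATASTROPHIC)
repeat_trip = Incident("pump trip, fourth this month", Severity.NEAR_MISS,
                       Severity.MINOR, recurring=True)
```

In this sketch a near miss is screened on what could have happened rather than on its actual outcome, so it is not excluded from detailed analysis only because nobody was hurt.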
Although incident investigation theory has been changing over the years and
new methodologies have been developed to address the complexity of new systems and
technologies in the industry, incident investigation still retains most of the basis and
considered the pioneer of accident causation theory, introducing the domino theory of
accident causation back in 1931 [82]. The domino theory described the occurrence of an
sequence. Thus, an incident (called an accident by the author) can be prevented if the
series of events is interrupted [84]. The events are defined as dominos and are classified
into five categories: social environment and ancestry, fault of the person, unsafe act or
condition, incident, and injury. The theory is aligned with two assumptions of incident
prevention: a focus on people, who are responsible for the incidents, and on management,
where the incident can be prevented [82]. Moreover, Heinrich developed and set the
basis of the safety pyramid and created a general understanding about the importance of
accident theory.
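The sequential logic of the domino theory can be shown with a short sketch. The five categories are taken from the description above; the function name `fallen_dominos` and the idea of "removing" a domino by name are illustrative assumptions, not part of Heinrich's original formulation.

```python
from typing import List, Optional

# The five dominos, in the order in which they are assumed to fall.
DOMINOS = [
    "social environment and ancestry",
    "fault of the person",
    "unsafe act or condition",
    "incident",
    "injury",
]


def fallen_dominos(removed: Optional[str] = None) -> List[str]:
    """Return the dominos that fall when one domino (if any) is removed.

    In a simple linear model each domino falls only if the previous one
    fell, so removing a domino stops the sequence at that point.
    """
    fallen = []
    for domino in DOMINOS:
        if domino == removed:
            break  # the chain is interrupted here
        fallen.append(domino)
    return fallen


uninterrupted = fallen_dominos()
interrupted = fallen_dominos(removed="unsafe act or condition")
```

With no domino removed, the whole sequence ends in injury. Removing the unsafe act or condition stops the chain before the incident, which is the preventive argument the theory makes.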
The work done by Herbert W. Heinrich sets the starting point for the accident
causation theory, which over the years has evolved based on the level of complexity that
arises for an incident to occur. In this sense, the accident causation field has been divided
into three main ways of thinking. These three categories are: simple linear or sequential
model, complex linear or epidemiological model, and complex nonlinear model. Those
types of theories have focused on the identification of single causes, multiple
causes, and complex outcomes, respectively [83]. It should be noted that the
combination of those theories and level of understanding of each way of thinking have
shaped our current understanding about incident investigation and risk analysis.
one of the factors or dominos [83]. Similarly, Bird and Germain developed the Loss
Causation Model in 1985, in which a modified domino theory was established by the
of root causes and the identification of the required barriers or controls in order to
prevent incidents. Moreover, these theories incorporated the interactions between the
individual and the system leading to unsafe conditions. Complex linear models can be
model, epidemiological model, and systemic theories [85]. Complex linear theory is
best known through the work of James Reason [74] and Jens Rasmussen [57, 86],
which has had a significant impact on accident causation theory. Those theories are
based on a system-oriented approach and were able to change the perspective from the
Finally, complex non-linear models refer to the newest way of thinking, in
which non-linear incident causation is recognized. Thus, incidents are the result of a
generation of thinking highlights the research done by Hollnagel and Leveson. Both of
them are recognized for the development of non-linear accident models in the 2000s
[78]. The models are: the Systems-Theoretic Accident Model and Processes (STAMP)
and analyzes the dysfunctional interactions among components that are part of the
system. Moreover, the model classifies the different types of flaws that can lead to an
incident and analyzes the role of constraints in safety management systems [87].
Alternatively, the FRAM model considers that system alterations lead to incidents when the
system cannot hold up such alterations. Thus, the model identifies the different variables
within an organization and how to manage those alterations that can arise in order to
These three generations of thinking overlap over the years and combine theories
and models in order to suggest better approaches to understanding the behavior of real systems
[83]. Meanwhile, a considerable number of methods have been developed for incident
investigation, and those methods have focused on the identification of root causes.
An older methodology, the Management Oversight and Risk Tree (MORT), developed
by Johnson in 1973, provides a detailed and comprehensive understanding for root cause
identification [89]. However, its application has been limited due to its complexity
and the time required to execute it. Even so, later methodologies have been developed
successful incident investigations processes, because their expertise and teamwork will
shape the quality of the recommendations and the degree of analysis of the identified
causes. Usually, the top management along with the safety department are in charge of
selecting the incident investigation team based on the severity and the nature of the
incident [80]. The team is selected based on their experience, skills, and competencies
associated with the process where the incident occurred, as well as the competencies and
skills related to incident investigation methodologies. Since organizations usually are
in charge of multiple types of processes and products, it is not practical to have only one
team trained to perform incident investigations. Thus, organizations should have a pool
of trained employees across the company who are familiar with the incident
investigation process and the methodologies that need to be applied [77]. The team
composition will vary depending on the specific knowledge required for the
investigation. Generally, an investigation team is composed of, but not limited to [77, 79, 80]:
Process operators, who will bring expertise of the process where the event
happened
Process engineers and process safety specialists
Contractor representatives
Legal representatives
involve employees from sister units or plants, who are familiar with the process and can
common practice to include someone who can help during the investigation with
specific technical inquiries [80]. Furthermore, some organizations try to avoid including
managers or people from top-level management in the team, because it can inhibit
open communication among the members of the team [77]. Finally, it is also a common
practice to provide training to the team prior to the investigation to provide an appropriate
It should be noted that the responsibilities assigned to the team are temporary
and require their full time in order to gather and analyze the information. Therefore, it
is important that managers understand their role and the temporary suspension of their
After the team has been selected, the team members will have to go through the
whole incident investigation process, from gathering the information to the
identification and collection of evidence, since evidence can be easily disturbed and
manipulated by external sources. Moreover, based on the nature of the incident, there
may be more than one team investigating the incident, making it even harder to perform
the job. Thus, the team has to act fast but at the same time be careful: identifying
potential hazards to the team, trying to look at the big picture of the scene, noting what is
missing and what is there that should not be there, and using all senses to get as much
evidence as possible. At the same time, they have to develop a list of preliminary
potential scenarios, which are going to be the main source for identifying the required
accordingly, in order to arrange the required resources and determine the best strategies
to approach the following investigation. This section addresses the different phases or
steps required to perform incident investigations, from the reporting of the incident until
proper incident investigation management system has been developed and implemented.
The incident investigation management system sets the basis and criteria of how an
incident is going to be handled and the procedures that need to be followed during this
process [72]. A common management system needs to consider at least, but not limited to,
Organization’s responsibilities
Team composition
Required documentation
Required training
Reports
Resources
Interview forms
3.5.1. Reporting
reporting culture, where the employees feel confident to report unsafe behaviors, near
misses and minor incidents. Likewise, a culture of no blame will allow employees to talk
and report without fear of potential consequences. Reporting must be made as quickly as
possible, in order to provide immediate response and preserve the evidence without
external alterations. There are two types of reporting to be taken into account once the
incident occurred. The first one is the internal reporting, which refers to reporting from
employees to the incident investigation management system and also the corresponding
report within the organization, to alert the departments and people that need to be
informed about the event. This type of reporting can be via paper, online, or verbal. The
second type of reporting depends on the severity of the incident, in which case external
entities such as regulators and firefighters must be informed about the incident [91].
After the incident has been reported, the next step is to classify it, record it, and
determine how it is going to be approached. Then, the incident investigation team must
Gathering evidence is one of the most challenging and time consuming stages
requires experience and a rigorous plan that must be followed [81]. In this step, it is
important that, before starting to collect data, the team leader and the team members available
at that time develop a plan to guide the evidence collection. It is likely that the
The evidence or data can be classified into three types: human, physical or
observations. Physical evidence refers to any physical source relevant to the incident
paper and electronic documentation, such as procedures, logs and reports [81].
Besides the information that can be collected inside the organization, there may be
additional sources that will support the investigation. Some of these sources are:
equipment manufacturers, university research centers, external databases, government
records, and companies with similar processes. As a result, not all evidence is collected
during the site visit; there may be data or information that will result from additional
The severity of the incident will determine how easily and quickly the incident
investigation team will have access to the evidence. In the case of major process safety
incidents, there may be external entities in place such as OSHA, EPA or insurance
companies that will limit the access to the site and the availability of information. At the
same time, the physical damage to the site can affect the collection of data, and most of
An objective investigation must consider all realistic scenarios that could possibly lead
to an incident. Then, based on the collected evidence, each scenario is tested and
reevaluated accordingly if new evidence is gathered [80]. The objective of this stage is to
identify the direct causes, root causes, and contributing factors in the incident, in order to
be able to answer the questions of what happened and why it happened [81]. As
is going to be applied to identify the root causes of the incident. Generally, root cause
understanding of the incident, main events, and conditions [72]. The next steps vary
depending on the methodology that is going to be used. Some of the most relevant
methodologies for analysis of evidence and identification of root causes are further
generated based on the root causes and contributing factors identified through the
investigation. During this process, each cause has to be analyzed individually and a
objective and risk reduction action should be easily understood. Since the
approved [72]. Moreover, it is also a common practice that the recommendations are
be evaluated and approved by the management and, depending on the nature of the
incident, the legal department may need to review and approve the recommendations as
well. Once the recommendations are approved, the people in charge of the execution of
each recommendation are assigned and a timeline is defined. Finally, the management
recommendations are implemented, organizations are reducing the risk and consequently
3.5.5. Report and lessons learned
The final stage of the incident investigation process involves the development of
a written report, which contains all documentation involved during the process. The
report covers all the detailed information used during the investigation, from evidence
and causes to recommendations. Generally, an incident report format includes at least the
communicated with all interested parties in order to ensure learning across the industry
[80].
“why” an incident happened [72]. The methodology uses logical reasoning to determine
the possible combinations or pathways that lead to an incident. Those combinations can
factors [92]. Fault tree analysis uses the top event as its starting point, which refers to the
outcome of the incident. Then, the process consists of working backwards in order to
identify the preceding causes or events, until the root causes are finally identified. The
level of detail of this technique will depend on the incident investigation team. An FTA
can be used as a qualitative method, a quantitative method, or both [81]. The methodology
uses symbols to guide readers to understand the different pathways and logic of the
diagram. Figure 7 presents the most relevant symbols for developing fault trees.
The symbols AND and OR represent the gates that connect one event to one or
multiple events. The gate AND indicates that the outcome event occurs only if all the
input events occur. Conversely, the gate OR indicates that one or more input events have
to occur to produce the outcome event [72]. Figure 8 shows a basic representation of an
FTA.
Figure 8 Basic representation of a FTA, adapted from [81]
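The logic of the AND and OR gates can be illustrated with a minimal qualitative sketch. The class names `BasicEvent` and `Gate`, the function `occurs`, and the tank overflow scenario are hypothetical and used only for illustration; they are not taken from the cited references.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class BasicEvent:
    """Leaf of the tree: a cause that either occurred or did not."""
    name: str


@dataclass
class Gate:
    """Top or intermediate event whose occurrence depends on its inputs."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def occurs(node: Union[Gate, BasicEvent], observed: Dict[str, bool]) -> bool:
    """Return True if the event at this node occurs, given the observed basic events."""
    if isinstance(node, BasicEvent):
        # Basic events not mentioned in the evidence are treated as not occurred.
        return observed.get(node.name, False)
    results = [occurs(child, observed) for child in node.inputs]
    if node.kind == "AND":
        return all(results)  # every input event must occur
    if node.kind == "OR":
        return any(results)  # one or more input events must occur
    raise ValueError(f"unknown gate type: {node.kind}")


# Hypothetical tree: the top event needs the open valve AND a failed safeguard.
top_event = Gate("tank overflow", "AND", [
    BasicEvent("inlet valve left open"),
    Gate("no automatic shutdown", "OR", [
        BasicEvent("level sensor failed"),
        BasicEvent("high-level alarm bypassed"),
    ]),
])

evidence = {"inlet valve left open": True, "level sensor failed": True}
overflow_explained = occurs(top_event, evidence)
```

Evaluating the tree from the leaves upward mirrors how a candidate scenario is tested against the collected evidence. A quantitative analysis would replace the Boolean values with probabilities, which this sketch does not attempt.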
Benner (1987) [93]. STEP is a multi-linear event sequencing method, which provides a detailed
description of the incident process. The methodology focuses on the identification of the
actors and the sequence of events or actions that lead to the incident. Thus, STEP
considers that multiple actions can take place at the same time by different actors. The
actors are considered to be the people or things directly involved in the incident [93].
The incident needs to be analyzed by looking at the big picture and then breaking it
down into actors and actions. In order to accomplish that, Hendrick and Benner introduced
the term “making mental movies”, which refers to the visualization of each action
executed by each actor from the time the incident was expected to begin, until the top
which helps to visualize and link all the events together. The STEP worksheet is
described as a matrix, in which the actors are identified in the rows and the events in
the columns, as shown in Figure 9. To ensure a logical sequence, the links between events are verified with
three different types of tests: row test, column test, and necessary and sufficient test. The
row test verifies that the actions and actors are broken down sufficiently. The column
test verifies that the sequence of events is consistent and coherent. Finally, the necessary
and sufficient test verifies that the previous event is sufficient to produce the outcome
event [82]. The STEP worksheet is then analyzed in order to identify the event or events
that generate safety problems. Each safety problem is then analyzed as candidate for
recommendation [82].
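The structure of a STEP worksheet can be sketched as a small data structure. The names `Event`, `build_worksheet`, and `links_out_of_order`, the time unit, and the example events are hypothetical. The ordering check is only a simplified stand-in for the sequence verification described above, not the full set of STEP tests.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """One building block of the worksheet: an actor performing an action."""
    label: str
    actor: str
    action: str
    time: float  # minutes from an arbitrary reference point (assumed unit)


def build_worksheet(events: List[Event]) -> Dict[str, List[Event]]:
    """Group events into one row per actor, each row ordered along the time axis."""
    rows: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        rows[event.actor].append(event)
    return {actor: sorted(row, key=lambda e: e.time) for actor, row in rows.items()}


def links_out_of_order(events: List[Event],
                       links: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the links whose preceding event does not come before the outcome event."""
    by_label = {e.label: e for e in events}
    return [(a, b) for a, b in links if by_label[a].time >= by_label[b].time]


# Hypothetical incident: events may be recorded in any order.
events = [
    Event("E1", "operator", "opens inlet valve", 0.0),
    Event("E3", "tank", "overflows", 12.0),
    Event("E2", "level sensor", "fails to signal high level", 5.0),
    Event("E4", "operator", "closes inlet valve", 15.0),
]
links = [("E1", "E3"), ("E2", "E3"), ("E3", "E4")]

worksheet = build_worksheet(events)
bad_links = links_out_of_order(events, links)
```

Keeping the links separate from the events lets several actors act at the same time, which is the multi-linear aspect of the method.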
order to analyze and understand the functions of the system. The objective of the method
is to determine what should have happened that did not happen. To do that, FRAM
analyzes the potential variability and functional resonance in the system [94]. FRAM is
First, the identification of daily functions required in the system. Functions refer
to the activities required to produce a specific outcome. Thus, the first step requires
determining all possible functions involved in the system as well as identifying
the attributes of each function. The attributes refer to the characterization of the
functions, which are described by six aspects as shown in Figure 10 [94].
Second, the characterization of the variability of each function. This
elements take into account human, technological, and organizational factors. The
variability is determined considering both the potential and actual variability of the
system [94]. Third, the identification of coupling functions and their interrelation in
order to identify unexpected outcomes. This step requires linking all the identified
functions and understanding the potential variability that can be expected from them. Finally,
the last step refers to the identification of barriers for variability and performance
monitoring. Barriers can be divided into system and function barriers. System barriers
refer to organizational aspects and physical barriers. Conversely, function barriers refer
to the way barriers can achieve their purpose. At the end, recommendations are going
method is based on the accident causation theory or Swiss Cheese Model. This
influences of human behavior and organizational factors in the system. Tripod theory
humans work, rather than human conditions that lead to the incident. Thus, the ultimate
culture. Likewise, the methodology provides three main objectives: development of the
chain of events that lead to the incident, identification of the barriers that should have
prevented the incident, and the identification of the underlying or root causes [95].
The first step is the development of a “core diagram”, which refers to the
graphical representation of the events. The core diagram is developed based on the
identification of Tripod Beta trios: agent, object, and event. Then, the trios are
connected to one another to provide a chain and sequence of events. The second step refers
to the identification of the immediate causes that allowed the barriers to fail. The third step
refers to the detailed understanding of why it happened. Thus, the preconditions and
underlying causes of the incident are identified. Finally, the previous steps are repeated
(BRFs) to categorize and guide the identification of underlying causes. Those BRFs are:
[82].
based on the traditional Root Cause Analysis (RCA) method. The RCA is a complex
linear method, whose goal is the identification of causal factors and root causes of an
incident. The method mainly involves the following steps: development of incident
sequence, identification of causal factors, development of causal factor chart, and the
identification of root causes. The last step can be achieved by two different approaches:
using logic trees or using predefined trees. IPICA follows the second approach using
three main areas: assumptions of the nature of incident causes, limited definition of root
causes and the methods applied for their identification and the complex non-linear effect
management’s attitudes, and societal. The first level is considered as the direct causes of
the incident, which are associated with the meta-components, i.e., interaction of
hardware, software and personnel. The second level identifies failures in the
management system and safety culture of the organization. In order to identify this, a
root cause map is developed based on the CCPS guidelines to construct process safety
management systems [72]. The third and fourth levels are required when the causal
factors are associated with factors beyond the organizational boundaries. To identify
and develop corrective actions. The system goes beyond the identification of the causes
of the problem; it supports investigators through the whole incident
investigation process, from the collection of data to the presentation of results and
software, which uses and combines multiple techniques in order to achieve detailed and
The system consists of seven steps. The first step refers to the
collection of the required information. The second step is the development of a sequence
of events. In the third step the causal factors are identified. The second and third steps
are supported by techniques such as SnapCharT® and Equifactor®. The fourth step
refers to the identification of root causes, in which tools such as Root Cause Tree® and
dictionary are used to help investigators in the identification of all the root causes. The
root cause tree is divided into seven main categories, which are then broken down in more
detail. The seven categories are: procedures, communication, work direction, training,
management system, quality control, and human engineering. In addition, the root cause
tree includes a set of fifteen questions to help investigators analyze human factors in
more detail. Once the main root causes have been identified, the system guides the
investigator in the identification of more complex causes such as culture, systemic, and
organizational factors. The following steps refer to the analysis of the root causes,
recommendations [98].
developed by William Johnson. MORT consists of a predefined tree based on fault tree
branch in the tree and analyzes whether or not the associated causes are applicable to the
event. The branches end with the identification of failures in the management system
[92]. Thus, the methodology uses a schematic representation of a dynamic and ideal
The logic diagram identifies 98 generic problems and over 1500 basic causes or
root causes associated with failures in the management system. The diagram starts with
the top event followed by an OR gate, which leads to the first important analyses that
need to be done. The OR gate breaks down into two main branches: Management
oversight and omissions or assumed risk. Assumed risk refers to the risks that have been
analyzed and accepted by the management level. Conversely, unknown and unanalyzed
risks are considered in management oversight and omissions. The next step requires
answering the questions ‘why’ and ‘what happened’. The ‘why’ refers to the management
system factors, while the ‘what happened’ refers to the specific controls that should be in
place [99].
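The split at the top OR gate can be sketched as a simple classification. This is a minimal illustration and not part of the MORT manual: the `Risk` record and the function `top_level_branch` are hypothetical names, and the sketch assumes that any risk that was not both analyzed and accepted by management falls under management oversight and omissions.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """A hypothetical record of a risk related to the top event."""
    description: str
    analyzed: bool                 # was the risk analyzed?
    accepted_by_management: bool   # was it accepted at the management level?


def top_level_branch(risk):
    """Assign a risk to one of the two branches below the top OR gate."""
    if risk.analyzed and risk.accepted_by_management:
        return "assumed risk"
    # Assumption: unknown, unanalyzed, or unaccepted risks go to this branch.
    return "management oversight and omissions"


print(top_level_branch(Risk("known corrosion rate", True, True)))
print(top_level_branch(Risk("unexpected reaction", False, False)))
```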
In order to perform the analyses, the diagram is supported with a manual that
helps investigators to ask the right questions in each level of the analysis. Likewise, the
methodology uses a color-coding system to help investigators visualize the progress and
identify the areas that need additional information or analysis. The events that are not
applicable to the incident should be colored in black. For the remaining events, a
not. If the event is considered Less Than Adequate (LTA), it should be colored in red.
On the contrary, if the event is considered adequate, it is colored in green. In some
cases, events cannot be classified into those two categories because of lack of
information or uncertainty about this specific event. Those events are colored in blue and
need to be evaluated in more detail, or more data need to be collected. The MORT
analysis finishes when all the blue events have been evaluated and consequently decided
failures. Thus, STAMP focuses on understanding why the control structure was
inadequate and which feedback loops failed in the system. Likewise, STAMP considers
safety as part of an adaptive socio-technical system, in which all the components are
constraints, control loops and process models, and levels of control. These concepts
factors involved in the incident [100]. First, accidents are conceived as the identification
of constraints rather than events. Thus, STAMP identifies all the required constraints in
the system, including the social and organizational factors and not just system design
level imposes constraints to the lower level in order to control the system. In this sense,
safety has two basic hierarchical levels of control: system development and system
operation. Those levels of control must interact and develop effective channels of
communication. Each level in the control structure must define the different channels of
enforcement and communication. Thus, each hierarchical level will impose constraints
downward and will receive feedback upward to verify the effectiveness of those
constraints. Finally, the concept of control loops and process models refers to the
consistency that must be achieved between the model of the process used by the
controllers (human or automated) and the actual process state. In this sense, the
controllers are able to supervise the actual state of the system [87].
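The consistency between a controller's process model and the actual process state can be illustrated with a short sketch. It is a simplified illustration under stated assumptions, not an implementation of STAMP: the `Controller` class, its methods, and the state variables are hypothetical, and feedback is reduced to a dictionary of measurements passed upward from the controlled process.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """A human or automated controller holding a model of the process."""
    name: str
    process_model: dict = field(default_factory=dict)

    def receive_feedback(self, measurements):
        """Update the process model with feedback from the level below."""
        self.process_model.update(measurements)

    def inconsistencies(self, actual_state):
        """Return variables where the model differs from the actual state.

        Each value is a pair (believed value, actual value).
        """
        return {
            key: (self.process_model.get(key), value)
            for key, value in actual_state.items()
            if self.process_model.get(key) != value
        }


# Illustrative data only.
actual = {"valve": "open", "level_high": True}
operator = Controller("board operator", {"valve": "closed", "level_high": False})

print(operator.inconsistencies(actual))   # model and process disagree
operator.receive_feedback({"valve": "open", "level_high": True})
print(operator.inconsistencies(actual))   # empty once feedback is received
```

In this reading, a missing or inadequate feedback channel corresponds to `receive_feedback` never being called, so the mismatch persists and the controller acts on an outdated model.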
inadequate execution of constraints and inadequate feedback. Each branch is then divided
In incident investigations, all the effort and resources spent shape
the quality of the final product of the process: the incident investigation report.
making sure the recommendations are implemented are key elements for success in this
process and for reducing the risk in order to prevent similar incidents in the future. The
incident investigation process allows organizations to analyze their system behavior and
uncover hidden organizational, cultural, and technical flaws that are not visible in their
day-to-day work. The process has to be performed with rigor and sufficient time, making
sure the organization understands the importance and valuable output they can get from
it. Otherwise, the process is going to be seen as useless secondary work that needs to
be done merely to comply. Having a good incident report and an overall good incident
ensuring a reliable process safety management system and high standard operational
more competitive, and save money. As Kletz stated, if organizations think safety is
Quality incident investigation reports are based on recommendations that are
clearly defined and are intended to solve the identified causes in a feasible and
measurable way. This means that the documentation and analysis are not going to be
limited to generating information, but also to making sure that knowledge is created
upon that. Similarly, it is important to make sure that the created knowledge becomes
part of their management systems and operations. Once the organization creates and
accumulates knowledge within it, the next step is to collectively acquire and implement
this knowledge across the organization and industry, to make sure no one is doomed to
make the same mistakes. This can only be achieved by sharing the lessons learned and
ensuring those lessons are not forgotten. Kletz argued that achieving the first step
(spreading the message) is relatively easy. However, ensuring that this message
becomes part of the organization is the real challenge [14]. Likewise, he asserted that
“organizations have no memory, only people have memories and they move on” [14].
Therefore, it is important to share lessons learned, create knowledge and make sure it
4. LEARNING PROCESS
knowledge through experience, education, or training [101]. Humans have the ability to
learn and this process is based on their previous knowledge. Thus, the results from the
same experience or training can be different for each person. Moreover, learning is not a
linear process. It follows a learning curve, which decreases over time if not adequate
expected.
the interactions and relationships among them. Thus, the learning process is no longer
process, in which, people try to acquire knowledge together and preserve it over time.
The challenge is to ensure that everyone within the organization manages, interprets, and
knowledge will remain available even if there are changes in personnel. As stated by
Kletz, “organizations have no memory, only people have memory and they move on”
[15]. In this context, organizations have been focusing on how to manage knowledge and
preserve it over time. This is done with the purpose of improving continually, avoiding the same
4.1. Learning system
In the learning system for the chemical and oil and gas industry, three entities can
be identified as components that have the ability to learn: individual, organization, and
industry. For the purpose of this research, the learning is based on previous incidents, the
implementation of lessons learned, and best practices. Learning occurs from the smallest
entity to the biggest one. Thus, once an incident occurs, the individuals involved in it are
going to challenge their mental models and identify the potential causes of the incident
and consequently learn from experience. However, due to the complexity of human
behavior, it can be expected that a small portion of the people who experience an
the learning system. In the case of near misses, the safety culture of the organization is
challenged and the individuals involved in the near miss have to decide whether or not to
report the event. On the contrary, in the case of a major incident, the process safety team
is usually part of the immediate executive actions; therefore, reporting is not required
from the individuals who experienced the incident to the safety team. Moreover,
organizations need to notify the corresponding regulatory agencies and determine their
determined. In this process, the safety team has to evaluate the impact of the incident and
based on those results, the safety team would determine whether or not the incident
in which the root causes, lessons learned, and recommendations are determined.
Organizational learning occurs once each individual who is part of the organization has
acquired and processed the new knowledge that is being taught and the lessons learned
are collectively implemented. The ultimate level of learning is industry learning, in
which the organizations share the lessons learned and transfer the existing knowledge
across the industry. This is done with the objective of having updated knowledge and
4.1.1. Limitations in the learning system
The learning system presented in the previous section gives an idea of how the
system works, the entities involved and their role in the learning process. Additionally, it
gives a bigger picture of the main steps and sequence of the learning process. Based on
this diagram, together with the available literature review in this field, it is evident that
organizations still see knowledge as a temporal factor that is inherent to the people
addition, they fail to provide the required tools to enhance the individual learning
process in order to ensure higher levels of retention and sense of ownership. Finally, the
chemical and oil and gas industry fails to understand the need of putting all resources
together to improve as a whole. In this context, several limitations that inhibit learning
Identification of root causes: the incident investigation management system sets the
basis to ensure learning from incidents, because it ensures that the fundamental
causes of the incidents have been identified and managed properly. Thus, incident
recommendations. This would ensure that the collected data is reliable, adequate,
and complete. Moreover, that the analysis has been performed with sufficient detail
and the root causes have been identified. Organizations tend to identify direct
superficial problems that do not prevent similar incidents in the future. Direct
causes such as human error are usually identified as the root cause, and the investigation
process ends by placing the blame on the operators instead of identifying failures in the
management system that contributed to the execution of the unsafe acts. This
identify root causes. First, they have to provide adequate tools to perform the
with standardized methodologies and procedures to guide them through the whole
incident investigation to ensure a pool of trained people who are familiar with the
nature of the incident to guarantee the required technical knowledge during the
analysis.
Promote reporting of near misses: learning should be promoted from every valuable
incident happened to improve their system. Acquiring data and information and
analyzing them gives the opportunity to learn and consequently prevent the occurrence
of major incidents and avoid losses. In this context, near misses give a significant
It can also help identify failures in the management system and consequently
crucial element for organizations to analyze more frequent incidents and extract
is avoided. This means creating an open culture, where employees feel confident to
report incidents and everyone feels responsible for their own and their co-workers'
safety. Moreover, promoting the reporting of near misses requires the engagement of the
this effort decreases drastically once the recommendations have been developed and
the final report has been presented. Organizations fail to support the implementation
process and therefore, no actual learning is performed. In the majority of the cases,
learning is limited to the development of the report but actions are not executed.
has been transferred to the organization. Moreover, investigators play an important role
in the development of those recommendations, in order to ensure that they are clear,
actions can easily understand what is intended and set the right timeline and
resources.
Ensure sharing knowledge within the organization: the ultimate goal of incident
the extracted learning from those investigations should be spread across the
organization in order to ensure that different facilities all around the world can get
lessons learned, they usually fail to give the message in an appropriate way. This
means to understand the different types of targets across the organization and
unpack those lessons learned in a way that people can understand how those lessons
would impact them. Furthermore, this sharing is limited in most of the cases to
that the data and information are stored and easily extracted when people need them.
Thus, databases play an important role in keeping information available across the
organization. Additionally, academic resources for large organizations that have the
budget or external databases can help organizations to extract new information and
learn from external resources that handle similar processes. The main objective of
industry, instead of getting partial learning in few organizations. In this sense, the
industry fails in two different ways: first, the industry faces a competitive and
politicized environment [56], in which organizations are not willing to share their
lessons learned and best practices. Therefore, there is a lack of external information
from which other organizations could learn. Secondly, even in the cases where the
external tools and therefore people are either not aware of their existence or are not
lessons learned from big incidents, but people see those types of incidents as events that
could not happen in their own processes and therefore little attention is put into
analyzing how those lessons learned are applicable to their processes and how that
the actual state of a particular process or activity. It helps organizations to get the
right signals before an incident materializes and therefore, take appropriate actions to
get the process back to normal. When an incident occurs, people involved in the
process spend little or no time analyzing why they failed to get the right signals. Was
the process giving the right signals? Were the indicators well defined? Why did
answered after each incident investigation, in order to reevaluate the process, where
the incident occurred and make sure operators can get all the required information
This research has focused on the analysis of four of the limitations mentioned
and academic resources, ensure sharing knowledge within the organization, and
identification of root causes. The first three limitations have been addressed in this
and knowledge is effectively transferred and maintained through generations. The Data-
has been widely used by many organizations to manage knowledge. The model
describes the relationship among data, information, knowledge, and wisdom, and the
transformation of each element to a higher level [102, 103]. Thus, the model represents
how organizations can increase meaning and value as they go higher in the pyramid, as
shown in Figure 12. An extensive literature about this model and knowledge
element is provided to understand the structure of the model and its link with the
learning process.
Information can be defined as data that has been processed for a purpose. It is
data that has been organized so that it has meaning for the user [102].
opinion, skills, and experience have been added. Knowledge is the synthesis of
multiple resources of information over time, in order to understand patterns and
This research has focused on only one type of knowledge: safety knowledge.
It brings together technical, human, and organizational aspects that influence people's
behavior and values. Thus, the goal of the learning process is to enhance, capture, refine, and
implement safety knowledge so it can become part of the daily activities and culture of
the organization. The DIKW theory has been used as a basis to construct a framework to
4.3. Learning process
oil and gas organizations, consulting companies and subject matter experts, a framework
for the learning process inside an organization has been developed. The framework
improve learning from experience. The steps included in the framework are intended to
serve as a guideline for organizations that want to get a better understanding of how to
obtain information and make it valuable. That is, transforming information into
establish a corporate learning system, which should be owned by all employees in the
organization. The system should be easily understood and accessed when required. The
corporate learning system has two main objectives. First, make new information and
lessons learned part of the organization. This means making those lessons learned part
of the existing resources in the organization such as procedures and guidelines. In this
context, people can have access to lessons learned related to the information that they
are looking for at the time they want it. This would serve as a reminder of what
can go wrong with the specific activity they intend to perform. Second, the
alternatives with the objective of making their facilities safer. To ensure that the
information and knowledge that is being added will remain in the organization, it is
important to document the whole process, make sure a proper management of change
process is performed prior to any change, and link the reasons that support the
alternatives. The last point is intended to ensure that people will understand
why the alternative was implemented at that time and the analysis behind it.
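The idea of linking lessons learned to existing procedures, together with the reasons behind each change, can be sketched as a simple record structure. This is a minimal illustration and not a description of any particular document management system: the `LessonLearned` fields, the document identifiers, and the function `lessons_for` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LessonLearned:
    """A hypothetical lesson-learned record in a corporate learning system."""
    title: str
    rationale: str                  # why the alternative was implemented
    linked_documents: list = field(default_factory=list)  # procedure IDs
    moc_reference: str = ""         # management of change record, if any


def lessons_for(document_id, lessons):
    """Return the lessons linked to the procedure a person is consulting."""
    return [item for item in lessons if document_id in item.linked_documents]


# Illustrative records only; identifiers are made up for the example.
records = [
    LessonLearned(
        "Verify drain valve position before filling",
        "Overfill incident traced to a drain valve left open",
        linked_documents=["PROC-017"],
        moc_reference="MOC-2016-04",
    ),
    LessonLearned(
        "Confirm level alarm set points after maintenance",
        "Alarm did not activate during a near miss",
        linked_documents=["PROC-017", "PROC-032"],
    ),
]

for item in lessons_for("PROC-032", records):
    print(item.title, "|", item.rationale)
```

Storing the rationale and the management of change reference alongside each lesson is one way to keep the reasons for a change retrievable after the people involved have moved on.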
Most people in the organization are not even aware of where to
find information and the potential benefits they can get from it. In contrast, some
people are aware of the available sources of information, but they do not know how
to take advantage of this information. People inside the organization have access to a
high volume of information but in many cases they do not have the time to identify
transformed into knowledge, and which information is not that relevant for them. As a
consequence, people lose relevant information that can be applied to their facilities.
People act based on what is important for them and what they consider relevant for
their work and would have a potential benefit for them. Therefore, transferring
knowledge is a complex task, in which identifying the target and having a clear
In this context, a lesson learned from an incident could be explained with technical detail
involved in it. However, the same lesson learned has to be delivered in a different way
to an operator, so that the operator can understand and process the same information as the supervisor,
Some organizations have spent a lot of effort developing sophisticated systems to
store lessons learned and relevant information. Nevertheless, they fail to achieve the
complete cycle, in which people within the organization own the system and understand
it. The corporate learning system should be incorporated as part of the safety
management system of the organization, the top management should support it, and all
levels in the organization should be trained on it. The training would explain the benefits
of it, the role of the corporate learning team and the people involved in it. Moreover,
the training would give guidelines on how to use the Document Management System
(DMS), when to use it and the type of information that can be extracted from it.
system. The team has to be a group of subject matter experts with relevant knowledge in
the processes, products and types of facilities inside the organization. The people in
charge of the corporate learning system serve as a source of knowledge for the different
facilities and people in the organization. In this context, they would be the bridge
between the information and knowledge that need to be digested and implemented.
Moreover, the team has the responsibility of filtering the information in order to ensure
that the information that is being added is valid and reliable. It should be noted that the
team is not intended to impose or enforce the implementation of lessons learned in the
facilities, rather, the team is intended to guide facilities to get the right information and
direct them in the right direction. The objective of the team is to support decision-
making and provide the necessary resources to implement new alternatives. In this sense,
the corporate learning team would be responsible for acquiring new information, validating
and analyzing its applicability, disseminating the information to the intended targets, ensuring
what information is relevant for the organization. In addition, it is not possible to have
the level of expertise in all areas of process safety. Thus, a pool of experts in different
areas within the organization should be linked with the corporate learning team, in order
to provide support in the analysis and selection of potential knowledge that should be
acquired by the organization. In this sense, the pool of experts would serve as a guideline
safety system. The system has to ensure the availability and accuracy of up-to-date
knowledge, in order to support the learning process and decision-making. Inside the
system two different types of information can be identified: internal and external
information. Internal information refers to the information that is created inside the
obtained from agencies, databases, academic resources and other organizations. This
practices, etc.
The corporate learning system has the challenge to effectively manage all
channels and types of information at the right time and to the right target. Therefore, the
the information would be uploaded easily and extracted as needed. In addition, the DMS
needs to keep the information up to date and link the new information with the existing
In this context, the corporate learning system would acquire internal and external
information, which is going to be analyzed and validated by the corporate learning team.
The relevant information would then be incorporated into the DMS and linked to the
existing documentation as required. Then, the team, together with representatives of
each facility to which the knowledge is intended to be transferred, would work to
determine the best approach to implement that knowledge in the facility. After
implementation, the facility would provide feedback to the corporate learning team
to ensure continuous improvement and verify the effectiveness of the process. At the
same time, each facility is responsible for providing internal information to the system to
assure that the information is analyzed for different facilities across the organization and
alternative, and monitor performance. The process describes how the corporate learning
system would work and the key elements that need to be considered in each step. The
first two steps refer to the acquisition of relevant data and information into the system
and the last five steps refer to the transformation of that information into knowledge. A
Figure 14 Learning process
steps in the process. Its goal is to provide a more detailed analysis of the incident and the
safety management system of the area where the incident occurred. The process is
enhanced through the investigation of all incidents with significance, the review of
management system. Additionally, this step supports organizations in effectively identifying
organizational, design, and cultural failures instead of human errors as the root cause of
Once the organization can rely on the internal information that is created within
it, the next step is to acquire additional information from different sources. In this
sense, it is important to identify the potential sources from which relevant information
can be extracted. As explained previously, there are two types of sources: internal and
external. Internal information would be acquired from the different facilities within the
organization. Likewise, internal information would be gained from all branch offices
around the world that are part of the same organization. This information would be
best practices and trend analysis reports can be considered as well. External information
opens a wide window of opportunities from where new information can be acquired. The
single database available, which can provide full access to a large number of incidents in
the chemical and oil and gas industry and, at the same time, have all the desirable
features such as easily searchable, public, with multiple filter options, and with detailed
Regulatory agencies such as the Environmental Protection Agency (EPA) have their own
database, from which information regarding toxic releases can be found. The Chemical
Safety Board [6] provides detailed analyses of process safety incidents. However, the
number of incidents investigated is limited. Some public databases such as eMARS,
developed by the Major Accident Hazards Bureau (MAHB), provide incident reports
Safety Incident Database (PSID) created by CCPS, provides incident reports for
associations like the American Petroleum Institute (API) and the National Fire
Protection Association (NFPA), which provide guidelines for standards and best
practices. Finally, large organizations that have the budget also have the opportunity to
establish partnerships with academic entities in order to be involved in and have
first-hand access to trending investigations. This type of partnership can be established with
universities like Texas A&M, which houses the Mary Kay O’Connor Process Safety Center.
The range of sources from which safety information can be extracted is evidently wide. Thus, it is
important that the corporate learning team sets the basis and defines the scope and
objectives of it. Moreover, those objectives have to be aligned with the organization’s
Once the team has prioritized the alternatives that could be valuable for the
organization and fit its needs, they have to identify the potential processes,
facilities or business areas where the alternative could be applied and would bring
benefits. Those relevant installations are the potential target, where the
their processes. During this process the corporate learning team has the responsibility of
determining and classifying the potential targets and the channels of communication that
are going to be used during the process. The information can be communicated through two
different approaches: sharing the information or transferring the information. Sharing the
facilities that might be interested in the information but no action is required from them.
receiver, who in this case would be facilities in which the new or alternative information
different facilities that have been selected as potential targets. From this point, a member
or members of the corporate learning team would meet with some representatives of the
potential target and would work together in the analysis of the alternative. Through
meetings the objective is to communicate the alternative and the assumptions that have
been made in order to determine why it is applicable for them. Then, they would
evaluate together the applicability of this alternative to their process, by analyzing the
potential benefits, costs and risks associated with implementing or not implementing the
alternative. This process has to be documented and the final decision has to be
supported. It is important to mention that the objective of those meetings is to take the
time to really analyze the alternative, rather than enforce its implementation. At the
end, the final decision would be made by each installation. Not all alternatives have to be
implemented because in some cases the alternative would not be feasible to implement
in that specific process, or the installation is already handling the risk with different alternatives that
Some alternatives can be applied across the organization and do not require the
identification of potential targets. Therefore, for those alternatives the corporate learning
team should develop an action plan and execute it as part of their job. One example for
facilities. In this context, the team together with a subject matter expert would be
responsible for making the required changes, documenting the changes, incorporating
This step requires asking how they can validate what they want to achieve and
how they can know the alternative is operating as expected. Defining
performance indicators allows people to verify the system behavior and measure how
much it has changed with respect to the initial set point. Indicators should be
determined prior to implementation in order to have a clear picture of what they want to
achieve with it. Likewise, it would help to verify the results from the initial phases.
Indicators would change depending on the actual phase of the alternative and would give
well, to evaluate the performance of the system and identify improvement opportunities for
it. In this sense, indicators to analyze how the system operates, personnel performance,
feedback and results of the system should be defined and evaluated periodically.
Indicators such as the number of identified alternatives vs. the number of alternatives
and how to take advantage of it, percentage of alternatives that are rejected by the
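As an illustration of how indicators of this kind could be tracked, a minimal sketch in Python is given below. The record fields, the status labels ("identified", "implemented", "rejected"), and the function name are assumptions made for this example and are not part of the proposed process. Only the idea of comparing identified, implemented, and rejected alternatives comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alternative:
    """One alternative tracked by the corporate learning team (hypothetical record)."""
    name: str
    status: str  # assumed labels: "identified", "implemented", "rejected"


def learning_indicators(alternatives: List[Alternative]) -> Dict[str, float]:
    """Count identified vs. implemented alternatives and the rejected share."""
    total = len(alternatives)
    implemented = sum(1 for a in alternatives if a.status == "implemented")
    rejected = sum(1 for a in alternatives if a.status == "rejected")
    return {
        "identified": total,
        "implemented": implemented,
        # Avoid division by zero when nothing has been identified yet.
        "rejected_pct": 100.0 * rejected / total if total else 0.0,
    }
```

Evaluating such a function periodically, for example once per review cycle, would give the team a simple trend of how many identified alternatives actually reach implementation.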
Defining metrics and refining the alternative should be performed at the same
time, because once the facility is fitting the alternative to their own processes, some
metrics have to be determined to set the general goal of the alternative. Refining the
alternative refers to the process in which the facility has to define the limitations,
assumptions, boundaries and required resources based on their own needs and capacity.
in order to determine the potential impact on the process and people and the additional
considerations that need to be made prior to implementation. Finally, an action plan has to
evidenced.
Implementing the alternative refers to the actual execution of the action plan that has
been proposed by the facility. Each facility would be responsible for its execution and
verification. However, the corporate learning team should periodically verify its
current status. Likewise, the corporate learning team would support the whole process
based on what has been agreed during the evaluation phase of the alternative.
implemented alternatives and corporate learning system. At this point, a detailed review
knowledge was appropriate, was analyzed in detail, accurate criteria were used, and the
expected benefits were achieved. Moreover, the team has to analyze if the
implementation was performed as expected and what can be improved from it. Based on
the improvement opportunities that have been found during this analysis, the alternative
can be improved and refined as needed or can be modified if the expected results were
validated in order to determine if the process has been successfully implemented across
the organization. In this analysis the corporate learning team should evaluate if the
objectives have been achieved, if people are fully committed to the system and feel
part of it, if the organization is getting the right knowledge, if the level of competence in
the team is adequate, and how the cycle and each phase of it can be improved.
In the learning system presented in the first part of this section, three types of
achieved through the small entity to a bigger one. In this case, industry learning can only
be achieved once organizational learning has been achieved. Similarly,
organizational learning is achieved once individual learning has been achieved for
people involved in the organization. Even though this research has been focused on the
learning. Therefore, organizations need to provide the necessary tools and resources in
order to support and enhance individual learning for all employees. As a result,
corporate learning systems in which the knowledge is being managed within the
apply it into their daily jobs. Some of these elements are briefly discussed below.
Developing and delivering effective training programs ensures the basis for a
understanding of the main goals, structure, and expectations of the organization. Thus,
A detailed training needs matrix for every employee position in the organization.
Specific qualification criteria for trainer selection. These criteria should include
necessary soft skills such as good communication, presentation skills, and the ability to open
Up-to-date training material, which covers all relevant technical knowledge and
courses.
techniques in their training programs based on the predefined expectations of the course
and the current needs of the organization. Although lecture methods are the most
commonly applied technique in most organizations, this method is not able to support
the development of skills and attitudes. Thus, it is important to recognize the use of multiple
enables employees to acquire and develop knowledge, skills, and attitudes. Similarly,
higher levels of retention’ techniques such as group discussions and practices cannot be
fully achieved without the understanding and analysis of the theory behind it. For
instance, organizations can implement the use of simulators to teach operators how to
perform a specific task and achieve the necessary skills to perform it. However, if the
lectures, readings, and discussions, it is likely that the operator is not going to be able to
make decisions on his own once he experiences an abnormal situation during his real
job. The operator would not have the required technical knowledge and analysis to
Once the training techniques have been defined, it is also important to structure
the development of the course and the elements that trainers must take into consideration
during the preparation, execution, and evaluation of the course. First, trainers have to
determine the objectives of the course; this requires the understanding of the target and
their motivation. In this sense, the challenge is to identify why this new knowledge is
relevant to them, instead of thinking why it is important for the trainer or the
organization. The objective is to answer the questions: do they want to know
that, or do they need to know that? Based on the answer, trainers can take the best approach
for creating ownership of the knowledge that is being taught. Secondly, the
development of the course should be problem-solving oriented, so that trainees are able to
analyze situations and make decisions about them. In addition, it should encourage trainees
to seek more information and create spaces where they have to explain to others what
they are learning. This would allow trainees to digest the information and take the time
to consolidate this knowledge and fit it into their existing knowledge. Thirdly, each
lesson has to clearly identify the benefits trainees can get from it in order to maintain their
motivation. Likewise, teaching with examples is always a good approach because this
enables them to become familiar with the situation and apply this new knowledge to their
own situations, and at the same time, it increases the trainer’s credibility. Finally, trainees
should receive constant feedback during the development of the course and an
assessment should be incorporated as part of the process to help trainees refresh concepts
and identify the key elements of the course. This assessment should be problem-
learnt in the past to give space for possible new learning” [110]. In this sense,
organizations should support employees in the transition from the old knowledge to the
new one and provide the required tools and resources to make the transition easier at
both the individual and organizational levels. In the process safety field, more attention has
that risks are controlled once the organization makes changes in their facilities,
operations or personnel. However, less attention has been given to the cultural and
organizational factors, as well as the individual impact that may be influenced by those
changes. Therefore, management of change programs should incorporate individual,
organizational, and cultural aspects that may also be influenced by the changes that are
made in the organization. The unlearning process can help organizations to identify the
potential impact of those changes and provide the corresponding plan to overcome the
Some of the elements that the people responsible for transmitting the new
knowledge should take into account are briefly discussed below. First, the acquisition of
new knowledge is not always an easy process for some employees, and it is even harder
when they have been doing their job in a particular way for many years. Therefore, it is
important to identify the potential impact of this change (new knowledge) and develop
there is going to be resistance to change because it is part of human behavior and it cannot
be prevented. But at the same time, it is important to note that the resistance to change can
be temporary if proper actions are taken to manage it. Thus, instead of having a negative
attitude toward what employees are going to say or how they are going to complain about it,
corrective actions for those attitudes. Thirdly, once the person in charge of transmitting
the new knowledge is in contact with the target group, he or she has to explain to them the
requirements of the process. This means explaining why this change is necessary for the
organization and why this new knowledge is important for them, instead of starting by
explaining how to perform the new task or going into detail on the technical part. This step
gives them the opportunity to analyze how these new requirements fit into their existing
knowledge and potentially identify some gaps during this process. Finally, the new
attitudes and the implementation of corrective actions. This process should be performed
over and over again until the new knowledge has been consolidated in all employees
5. ENHANCE INTERNAL INFORMATION
In this section, the first step of the learning process enhance internal
relationship among the people, the task, and the environment for a specific task. The
corresponding results were then used to enhance the existing incident investigation
process by the incorporation of additional steps into the process. Finally, a description
of how the people, environment, and task work together and how the system is
supporting this process [111-113]. A hierarchical task analysis has been developed for
the incident investigation process to analyze the task’s objectives in this process, the
people involved in it, the complexity of the task and the expected goals for each stage in
the process. The general structure of an incident investigation process involves the
more comprehensive description of the incident investigation process has been presented
in section 3. The incident investigation process has been divided into subtasks and the
most challenging tasks have been selected for further analysis. Each category has been
broken down into more detailed tasks to understand the specific goals of each step of the
process. Nine subtasks have been selected for detailed analysis in which the common
errors were determined. Finally, the general characteristics of the people and
Figures 16-18 present the hierarchical task decomposition for the incident
investigation process. The steps highlighted in gray represent the subtasks that have been
considered the most complex and challenging in the process. Thus, a further analysis is
presented in the next section. The following subtasks were selected:
Report incident
Visit scene:
- Interview witnesses
Analyze evidence
Develop recommendations
Tables 1-7 present the results obtained from the hierarchical task analysis.
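To make the idea of a hierarchical task decomposition more concrete, the following minimal sketch in Python represents the process as a tree of tasks and collects the subtasks flagged for detailed analysis. The class name, the `selected` flag, and the helper function are hypothetical. The tree only contains the subtasks that are legible in the list above, so it is an incomplete illustration and not a reproduction of Figures 16-18. Treating "Visit scene" as an unselected parent of "Interview witnesses" is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A node in a hierarchical task decomposition (hypothetical structure)."""
    name: str
    selected: bool = False  # True if flagged for detailed analysis
    subtasks: List["Task"] = field(default_factory=list)


def selected_subtasks(task: Task) -> List[str]:
    """Return the names of all flagged tasks in depth-first order."""
    found = [task.name] if task.selected else []
    for sub in task.subtasks:
        found.extend(selected_subtasks(sub))
    return found


# Partial tree: only the subtasks listed in the text are included.
investigation = Task(
    "Incident investigation process",
    subtasks=[
        Task("Report incident", selected=True),
        Task("Visit scene", subtasks=[Task("Interview witnesses", selected=True)]),
        Task("Analyze evidence", selected=True),
        Task("Develop recommendations", selected=True),
    ],
)
```

A tree of this kind keeps the decomposition and the selection of challenging subtasks in one place, so that the attributes analyzed for each subtask (people, environment, common errors) could later be attached to the same nodes.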
Figure 16 Task analysis decomposition part 1
Figure 17 Task analysis decomposition part 2
Figure 18 Task analysis decomposition part 3
Table 1 Hierarchical task analysis part 1
Table 2 Hierarchical task analysis part 2
Working with new people to make decisions; memory recalls change over time; tendency to blame employees
Table 3 Hierarchical task analysis part 3
Columns: Task; People; Environment; Why is it challenging?; Common errors
Table 4 Hierarchical task analysis part 4
Table 5 Hierarchical task analysis part 5
Table 6 Hierarchical task analysis part 6
Table 7 Hierarchical task analysis part 7

Task: Define responsible and resources
People: Safety Department; Management
Environment: Employees’ availability; pressure against time and resources; level of engagement of the top management; working with others to make decisions
Why is it challenging?: Budget is limited; there is no time for extra work; top management has to be engaged; requires a strong safety culture within the organization
Common errors: Inadequate risk ranking of recommendations; inadequate top management engagement; inadequate distribution of resources

Task: Monitor status of recommendations
People: Safety Department; Management
Environment: Pressure against time and resources; level of engagement of the top management; working with others to make decisions
Why is it challenging?: Recommendations’ status is easily forgotten; top management and employees have to be engaged; requires a strong safety culture within the organization
Common errors: Recommendations are forgotten and therefore are never implemented; recommendations’ status is only the responsibility of the safety department
5.1.3. Subtasks analysis
Since each incident investigation is a new process, the people involved and the
environment are always changing depending on the severity of the incident and the unit
where the incident occurred. In addition, the number of people in charge of the
investigations would vary as well. Furthermore, the frequency of this task cannot be
determined, since it would depend on the number of reported incidents and the type of
incident. In the incident investigation process three groups of people were identified:
employees who perform the task of reporting incidents, safety department and
management, and the incident investigation team. Each of these groups of people executes
different stages during the incident investigation process and has different attributes.
- People that work within the organization and perform a specific task in it.
are expected to have the same training regarding policies, values and goals of the
- Employees spend eight or more hours per day in their jobs. Reporting incidents is
- Employees need to be motivated and understand the benefits of it in order to report
an incident.
- Employees need to have general knowledge regarding safety and how to report an
Tasks: Select investigation team, define responsible and resources, monitor status of
recommendations
- The tasks mentioned above are specific tasks that are performed in conjunction with
two different groups of people: the safety department and the top management. The
- The safety department should have expertise in process safety and incident
investigation in order to determine the severity of the incident and resources needed.
- The top management should have knowledge in business and safety priorities, as
- Both groups of people should have strong communication and leadership skills to
coordinate different groups of people and make decisions within a short period of
time.
- The people that are part of the incident investigation team are usually employees
who have to suspend their routine job and perform the investigation as a temporary
job. Thus, people have to perform the task within a short period of time and under a
lot of pressure.
- The team is composed of people with different backgrounds and experience, which enables the
nature of the incident the incident investigation team would vary and different types
operators, contractors, lawyers, etc. Therefore, the people involved usually have
- People within the team should have communication and leadership skills in order to
- The team has to have a strong technical knowledge about the process where the
- At least one of the members of the team should have a strong background in incident
5.1.3.4. Environment
within the organization in which employees and the top management are engaged with it
the level of engagement of management and employees, the right amount of resources
The environment where the task is performed is the facility where the incident
occurred. These are chemical and oil and gas facilities such as refineries, platforms,
onshore facilities, etc. The investigation is executed at the same time that employees are
performing their job or some of them may be recovering from the consequences of the
incident. Furthermore, the unit where the incident occurred may be destroyed or heavily
Reporting the incident is executed depending on the available tools that the
organization has for reporting incidents. This can be verbal reporting, written forms or
available tools and training within the organization. These tools refer to the incident
severity of the incident. Additional tools are the available procedures and guidelines for
the protocol prior to the investigation, the development of the investigation, and required
post-activities.
5.1.4. Task analysis remarks
The task analysis developed for the incident investigation process provides
relevant insights into the process. First, the quality of the investigation is proportional to
the level of training, previous experience performing the same task, and level of
expertise of the incident investigation team. Thus, organizations should have a pool of
based on the type of incident and its significance. Secondly, the safety culture of the
determines the number of incidents that are going to be reported, the resources
availability, time spent, and engagement of the top management, as well as employees
who are in charge of the execution of recommendations. Similarly, the safety culture will
significant contributing factors in the process. These skills enable the team to organize
relevant information, interview people, analyze the incident and work together in the
recommendations was identified as a critical task in the process because it brings together
all the effort and knowledge invested in the analysis. Moreover, recommendations are the
Additionally, the task decomposition and its analysis identified some of the
limitations of the process and opportunities to improve it. Since the environment and
people involved in the investigation are always changing, the organization misses valuable
investigation is a new process in which the learning curve has to start from zero each
time. This gives the opportunity to enhance the process by incorporating
knowledge from previous incidents and extracting this knowledge for the incident that is
being analyzed. Similarly, the current incident investigation process does not support the
increase the risk in the organization. Furthermore, the training material is not updated in
order to ensure a complete learning cycle of the incident. Moreover, the re-evaluation of
leading indicators of the process is not considered in the incident investigation process.
investigation process is conducted and the final product of it have been analyzed and
additional steps have been incorporated into the process. The objective is to provide a
more detailed analysis of the incident and the safety management system, where the
incident occurred. Since incident investigations are one of the most powerful practices to
learn from experience, organizations need to focus their attention on improving their
and ensure that organizational and cultural failures are identified [77, 91]. Incident
happens. Starting from the report of the incident, followed by the collection of evidence,
lessons learned, and follow-up of recommendations [80, 81]. Regardless of the incident
investigation method used, the main structure remains constant. Thus, organizations miss
valuable information that should be analyzed during this process such as previous
incidents, leading indicators of the process, how the identified gaps fit into the current
training material, and how to incorporate this learning into the management systems of
additional steps have been incorporated into the traditional process with the objective of
providing additional sources of information into the investigation. The grey boxes in the
figure represent the steps that have been included, while the white boxes refer to the
After an incident occurs and is reported, the significance of the incident has to be
determined. This refers to the categorization of the incident in which the potential impact
of the incident is evaluated. Based on this result, the level of detail in the analysis and
the complexity of it are determined. This means if simple methodologies and/or trend
information and evidence has to be performed. This stage would determine the quality of
the identified causes and recommendations [80]. Once the information has been
investigation methodologies are implemented for the reconstruction of events, the
identification of critical and causal events and the identification of root causes of the
incident.
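The categorization step described above can be sketched as a simple decision rule. The following Python example is a minimal sketch only: the four-level significance scale, the threshold, and the returned labels are assumptions introduced for illustration, since the text does not prescribe a particular scale or cut-off.

```python
from enum import IntEnum


class Significance(IntEnum):
    """Hypothetical significance scale; the source does not define one."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


def investigation_depth(
    significance: Significance,
    threshold: Significance = Significance.HIGH,  # assumed cut-off
) -> str:
    """Map the significance of an incident to the level of analysis."""
    if significance >= threshold:
        return "detailed investigation"
    return "simple methodology and/or trend analysis"
```

In practice each organization would define its own scale and threshold, for example from its risk matrix, and the rule would only select the depth of the analysis, not replace the judgment of the safety department.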
The following steps suggest a detailed analysis of previous internal and external
incidents from which valuable information can be extracted. The analysis of external
incidents gives the opportunity to retrieve lessons that the organization
previously failed to learn. Thus, the applicability of those lessons can be evaluated for the
process. The objective of reviewing external incidents is not only to analyze the
analyze the current practices that other organizations are implementing, compare their
own technology with respect to other organizations, and extend their existing
knowledge regarding national and international standards and best practices that may be
explain the causes of the incident to the interested parties. This means helping build trust
and credibility with regard to the results that the incident investigation team is
presenting.
incidents from the same organization, the process where the incident occurred and the
safety performance of it. The analysis is based on the following criteria: are there any
unit? The answer to the first question should be followed by a trend analysis and the
being investigated. Conversely, the answer to the second question should be followed by
a revalidation of the process hazard analysis for the process where the incident occurred.
Then, the root causes that have been previously determined in this process have to
organizational and cultural flaws can be identified. Subsequently, the required changes
in the management system should be identified. This stage of the investigation aims to
provide more insight into the identification of root causes, to uncover
hidden faults of the system, and to avoid identifying human error as the root cause of
the incident. The following step addresses the technical gaps that
have been identified during the investigation. These gaps should be compared with
the current training material, and recommendations should be provided to ensure that the
training material is up to date and covers all relevant knowledge for each specific role in
the organization. The last additional step incorporated into the process is the re-evaluation
of the leading indicators of the process to determine if the process is showing
the right signals and the indicators are well defined, and to ensure that operators understand
the data and information the system is giving. The main objective of the additional steps
is to ensure that root causes are identified and to enhance the quality of the
6. CASE STUDY
This section analyzes an incident that happened in the offshore industry in North
America. The incident has been examined from the perspective of the proposed incident
investigation flow chart presented in the previous section. The analysis comprises the
analysis of two external incidents with similar causes and two internal incidents that
occurred in the same organization. The data and information presented are real and
verified. However, data have been anonymized in order to protect the identity of the
organization. It should be noted that the incident investigation was performed by the
organization at the time when the incident occurred. Therefore, the objective of this
analysis is to identify opportunities to improve the final report through the analysis of
high-pressure gas, which detonated within the next 5 seconds. The explosion and resulting
fire killed 7 people and destroyed a significant part of the platform. The line was
designed as a fuel gas separator bypass in order to provide operational flexibility. Thus,
Prior to the incident, the line had been inspected twice, once two years earlier and
once eight years earlier. On both occasions, visual inspection, thickness measurement,
and materials characterization were performed. For those inspections, the results were
satisfactory and no reduction of the internal thickness was detected, with the exception
of a “weldolet” that was changed due to severe external corrosion. The investigation
found that the line rupture was due to a severe localized reduction of the internal
thickness of the line. The laboratory analysis determined that the reduction of the
corrosion (MIC) and corrosion associated with high levels of hydrogen sulfide (H2S).
The identified causes of the incident, associated with prevention and detection of
the incident, determined by the organization in the final report are briefly summarized
below:
The fuel gas line was designed assuming that the potential presence of corrosion
Bacteria corrosion mechanisms were not considered during the process hazard
The fuel gas line was designed assuming that there would be insufficient quantities
There was no system in place for real-time monitoring to identify high levels of
The fuel gas line was not identified as a bypass line in the mechanical integrity
Personnel on the platform were unaware of their responsibility to inform
the inspectors about bypass lines and dead legs so that these receive special attention during
the inspection
corrosion associated with high levels of hydrogen sulfide (H2S) was not considered
The supervisor in charge did not have the required knowledge to analyze the
Based on the identified causes of the incident some of the recommendations are
1. Identify and monitor the type, origin, and effect of the bacteria present on the platform
and define a tolerance level in order to establish control methods in the fuel gas
system
2. Update material selection criteria for fuel gas lines based on the types of damages
implement the use of corrosion inhibitors in the fuel gas system for the corrosion
personnel and inspectors. Additionally, identify all sporadic fuel gas lines in order to
update the inspection program and communicate the required planning and
inspection
7. Implement corrosion coupons in the fuel gas system within the offshore platforms
strategies for gas processing plants to ensure that permissible levels of H2S are not
exceeded
9. Redesign the corrosion inspection program to ensure that special conditions in the
process such as bypass lines and dead legs and combination of corrosion
10. Ensure that all personnel associated with planning, supervision, and execution of
certified and trained based on the applicable national and international regulations
(including, but not limited to, API 570, 572, 510, 574, 580, 581, 653)
11. Develop and/or update static equipment inspection procedures to ensure that they are
Additionally, detection and control mechanisms for sporadic lines are included.
12. Implement additional inspection techniques for the fuel gas lines that allow the
6.2. Incident analysis
The first step incorporated into the incident investigation process refers to the
preliminary analysis in which the significance of the incident is determined. This means
categorizing the incident based on the potential impact that the incident may have, with
the objective of determining the level of detail and complexity of the investigation. In
this study, the significance of the incident was determined as very high due to the
number of fatalities, injuries, and property damage. Thus, the incident was subject to
further analysis during the investigation. The following seven steps were developed
during the investigation performed by the organization: report incident, determine the
critical events, identify causal factors, and identify root causes. The next step is the
The review of previous incidents takes into account two different types of
incidents: internal and external incidents. Two incidents were considered for the external
incident review process: the first incident is the BP North Slope oil spill in 2006 and the
second incident is the natural gas pipeline rupture and fire in Carlsbad, New Mexico in
2000.
BP North Slope oil spill, 2006: On March 2, 2006, a BP operator discovered a
leak in a 34-inch transit pipeline. The leak turned out to be the largest spill ever
experienced on the Alaskan North Slope. The leak was caused by internal
corrosion that produced a hole at the bottom of the pipeline. The leak was associated
Natural gas pipeline rupture and fire in Carlsbad, New Mexico, 2000: On August 19,
2000, a 30-inch natural gas transmission pipeline ruptured. The released gas
then ignited and burned for almost an hour. The incident killed twelve
determined that the major safety issues were associated with pipeline design, the
internal corrosion control program, the lack of federal safety regulations for natural
analyze the current practices that other organizations are implementing, compare their
technology with respect to other organizations, and extend their existing knowledge
regarding national and international standards and best practices that may be applicable
to their facilities. Moreover, reviewing external incidents serves as support to explain the
causes of the incident to the stakeholders. This means helping build trust and credibility
with regard to the results that the incident investigation team is presenting.
hydrogen sulfide corrosion (H2S), supported by laboratory analysis and subject matter
experts. However, due to the nature of the combined corrosion mechanisms, the
assimilation process for the operators, the supervisor, and managers presented a
challenge. Even though the evidence was there, it was hard for them to believe “such
unexpected event”. Therefore, some representative statistics from the offshore industry have
been presented in order to highlight the frequency of this type of incident in the
almost 3000 offshore pipeline incidents were reported from 1970 to 1999, of which 51%
Corrosion represents the leading cause of failure in the Gulf of Mexico offshore
pipelines, accounting for more than 1483 offshore corrosion incidents over the past years.
Of these incidents, 35% were associated with internal corrosion and the remaining 65%
with external corrosion [116]. Over the period from 1989 to 1999, 75% of the total
number of reported corrosion incidents were associated with internal corrosion failures,
showing a direct connection between internal corrosion and pipeline aging. Statistics
indicate that additional factors such as pipeline infrastructure growth and changes in
operating conditions also increase the likelihood of internal corrosion failures [116].
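The percentages quoted above can be turned into approximate incident counts with a short calculation. The sketch below uses only the figures stated in the text (1483 corrosion incidents, 35% internal, 65% external). It additionally assumes that the 51% share mentioned earlier refers to corrosion as a fraction of all reported offshore pipeline incidents, which the surviving text does not state explicitly. The variable names are illustrative.

```python
# Figures quoted in the text.
TOTAL_CORROSION = 1483   # reported offshore corrosion incidents
INTERNAL_SHARE = 0.35    # share associated with internal corrosion
EXTERNAL_SHARE = 0.65    # share associated with external corrosion

# Assumption: 51% is the corrosion share of all reported incidents.
CORROSION_SHARE_OF_ALL = 0.51

internal = round(TOTAL_CORROSION * INTERNAL_SHARE)               # about 519
external = round(TOTAL_CORROSION * EXTERNAL_SHARE)               # about 964
implied_total = round(TOTAL_CORROSION / CORROSION_SHARE_OF_ALL)  # about 2908

print(internal, external, implied_total)
```

Under that assumption the implied total of about 2908 incidents is consistent with the "almost 3000" offshore pipeline incidents reported for 1970 to 1999.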
Likewise, statistics show that over the period from 1970 to 1999, the majority of internal
corrosion failures occurred in natural gas pipelines, representing 67% (518 incidents) of
the reported incidents [116]. Moreover, according to the U.S. Department of
Transportation Pipeline and Hazardous Materials Safety Administration, even though the
number of reported corrosion incidents in gas gathering lines is not that high, more than
[Pie chart: US Gulf of Mexico offshore oil and gas pipeline types of failures. Legend categories: corrosion, natural hazard, impact, structural, unknown, other, material, weld defect, anchoring, erosion, construction. Slice values shown: 51%, 21%, 8%, 6%, 4%, 3%, 3%.]
Figure 20 US Gulf of Mexico offshore oil and gas pipeline types of failures over the
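As a quick consistency check on the figures quoted above, the following sketch recomputes the counts they imply. It assumes that the 51% share refers to corrosion (as the figure suggests) and that the 35%/65% split applies to the 1483 corrosion incidents; the variable names are illustrative only.

```python
# Figures quoted in the text (attributed there to [116]).
CORROSION_INCIDENTS = 1483   # reported offshore corrosion incidents
CORROSION_SHARE = 0.51       # assumption: the 51% share refers to corrosion
INTERNAL_SHARE = 0.35        # corrosion incidents linked to internal corrosion
EXTERNAL_SHARE = 0.65        # corrosion incidents linked to external corrosion

# Total number of pipeline incidents implied by the corrosion count and share.
implied_total = CORROSION_INCIDENTS / CORROSION_SHARE

# Approximate counts behind the internal/external percentages.
internal_count = CORROSION_INCIDENTS * INTERNAL_SHARE
external_count = CORROSION_INCIDENTS * EXTERNAL_SHARE

print(f"Implied total incidents: {implied_total:.0f}")
print(f"Internal corrosion incidents: {internal_count:.0f}")
print(f"External corrosion incidents: {external_count:.0f}")
```

Under these assumptions the implied total is roughly 2900 incidents, which is consistent with the "almost 3000" incidents reported from 1970 to 1999.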
The BP North Slope spill incident report briefly outlined two key elements that all pipeline
corrosion management programs in the oil and gas industry should have: a
corrosion monitoring program and a pipeline leak detection system, which help prevent a
pipeline from developing leaks and ensure prompt response once a leak occurs.
At the time of the incident that is being analyzed, the organization did not have
an effective internal corrosion-monitoring program for gas lines in place. Likewise, the
platform did not have a pipeline leak detection system. The corrosion monitoring
program that the organization had at that time involved visual inspection, thickness
and the implementation of corrosion coupons in the fuel gas system. Additional
corrosion monitoring systems can be considered based on an analysis of the current
technology applied by different organizations. For example, the BP North Slope spill incident
report highlights the use of “smart pigs” in order to provide more coverage and the
Natural Gas Company highlights the implementation of ultrasonic testing on the non-
piggable portions of the pipeline and gas quality monitoring to ensure that predefined
limits are not exceeded. Furthermore, the report briefly outlined some of the state-of-the-
incident report also emphasizes the technology that was implemented after the incident
with respect to their pipeline leak detection system. The organization increased the
number of field inspections and implemented infrared heat detectors to improve leak
detection.
Based on the results of the previous review of the two external incidents with
regard to the internal corrosion monitoring program and pipeline leak detection system,
Aligned with the previous recommendations and some of the recommendations
into their existing safety management system. Thus, the required resources, expertise,
and management commitment were not sufficient to fully prevent and mitigate this type
of incident. In this particular case, the organization failed to provide the required training
to the personnel in charge of analyzing and providing corrective actions during the
inspections that were part of the corrosion-monitoring program. In this sense, even if the
inspections were performed according to the schedule, there was a lack of expertise to
conclude what was a good or bad result and to provide the technical knowledge needed to
determine the subsequent corrective and preventive actions. Moreover, during the
inspections performed in the unit, the replaced pieces were never subjected to any type of
further analysis in order to examine the identified corrosion mechanism and the potential
presence of additional corrosion mechanisms. Thus, bacterial activity and high levels of
H2S were not identified during monitoring activities. The El Paso Natural Gas Company
incident report discussed some of the post-inspection activities that the organization
fractions of pipeline that were changed during inspection activities and performed
Recommendation 3: Enforce the management of change program for the
implementation of any new technology or system in which the required training,
responsibilities, procedures and channels of communication must be specified.
that must be incorporated into employee and contractor training. Likewise, the report
recommends the inclusion of those standards in their existing inspection procedures.
The standards mentioned in the report include, but are not limited to, API 510, 570,
572, 574, 580, 581, and 653. However, additional standards can be taken into
consideration.
Gas Company incident report provides a more comprehensive description of the applicable
standards regarding corrosion issues. This can help the organization to expand their
Pipelines
only for training and procedures purposes. It is important to verify that the current
corrosion management program and internal policies and guidelines are aligned with
Both external incidents emphasize the main elements involved in any corrosion
management program and the procedures and guidelines that should be in place in
order to ensure that effective actions are performed with respect to prevention,
mitigation and control. A corrosion management program should clearly address the
following elements: definition of policies and objectives, organizational structure and
Based on the incident report, the investigation revealed that the organization
previous incidents on the same platform, it can be seen that the platform presents failures
in its mechanical integrity program in general, not only with respect to corrosion.
with the schedule (implementation phase of the corrosion monitoring program). In their
program, some flaws can be identified in the planning and measuring system
performance phases. The incident investigation reveals that during the planning phase,
not all corrosion threats were identified and the required actions in case of the
identification of any failure were not addressed effectively in the program. Likewise, the
organization failed to effectively review and analyze the system performance in order to
validate the effectiveness of the current corrosion prevention and monitoring methods.
Based on this analysis, the following recommendation can be incorporated into the final
report.
For the internal incident review, two incidents were considered. Both incidents
occurred on the same platform, three months and one year before the incident that is being
analyzed. The incidents were selected based on the criteria of cause similarity and
indicators, reliability report, and the mechanical integrity manual, were considered in
the analysis.
Incident 1: A sour water leak was detected in a 6” line from the low-pressure gas
rectifier FA-3104 to the pressurized system, due to material loss in the line. The line
was out of service at the time of the incident. The investigation determined that the
leak was caused by severe corrosion in the line. This was due to the following causes:
the presence of trapped water in the line, the absence of an action plan for
removing off-line interconnections, and a failure to properly analyze and
Incident 2: During maintenance work on a mechanical valve, a minor gas leak was
detected in the stem of the valve. The leak was controlled half an hour later. There
determined that the cause of the incident was deterioration of the valve stem.
Figure 21. First, analyze whether there is a significant number of incidents in the same unit
where the incident occurred. The objective of this analysis is to determine whether the
process hazard analysis of the process needs to be re-evaluated. Second, analyze similar
incidents that have occurred in the same organization, to find trends and assess the applicability
of their recommendations to the incident that is being investigated. Finally, the incident
investigation team should go back to the identified root causes and re-evaluate them
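The first two parts of this review can be sketched as simple filters over incident records. This is a minimal illustration rather than the organization's actual tooling: the `Incident` record, the unit labels, and the cause tags are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    name: str
    unit: str            # hypothetical unit label
    causes: frozenset    # hypothetical cause tags


def incidents_in_unit(history, unit):
    """Part 1: previous incidents recorded in the same unit."""
    return [i for i in history if i.unit == unit]


def similar_incidents(history, reference, min_shared=1):
    """Part 2: previous incidents sharing at least `min_shared` cause tags."""
    return [i for i in history
            if len(i.causes & reference.causes) >= min_shared]


# Illustrative records; units and tags are assumptions, not report data.
case = Incident(
    "Case study", "fuel gas system",
    frozenset({"internal corrosion", "inspection results not analyzed"}))
history = [
    Incident("Incident 1", "low-pressure gas line",
             frozenset({"corrosion", "trapped water",
                        "inspection results not analyzed"})),
    Incident("Incident 2", "valve maintenance",
             frozenset({"valve stem deterioration"})),
]

same_unit = incidents_in_unit(history, case.unit)
similar = similar_incidents(history, case)
print(len(same_unit), [i.name for i in similar])
```

The third part, re-evaluating the root causes in light of what the filters return, remains a judgment for the investigation team.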
Even though there was no record of a significant number of incidents in the same unit,
the reported incidents have been classified with high significance. Moreover, the
investigation report determined that the Process Hazard Analysis (PHA) performed for that
process did not consider all the different types of corrosion threats that can be present in fuel
hazard analysis, analyze those results and provide mitigation actions as required.
methods.
In the second part of the analysis, similar incidents with similar causes were
considered, along with additional process safety documentation that helped to provide
more evidence for the analysis. The following improvement opportunities for the existing
Recommendation 7: Re-evaluate the process hazard analysis performed in the fuel gas
system of the platform, ensuring that it is executed by qualified personnel with relevant
experience in offshore operations, corrosion mechanisms, and hazard identification
methods.
A third party hired by the organization was in charge of developing the
investigation report for the incident that is being analyzed, while both internal
incident reports selected for this analysis were developed by people inside
the organization. Although the severity of both internal incidents is relatively low
compared with the incident in the case study, some inconsistencies were identified with
respect to the level of detail of the analysis, the actual identification of root causes, and
the quality of the recommendations. The analysis exposed that the organization had an
inadequate incident investigation program, which inhibits the organization from learning from
previous incidents. In this sense, it was unlikely that the organization would prevent a
catastrophe like the one in this case study, because the identified causes in previous
incidents were limited to the resolution of direct causes such as physical or human
temporary suspension). Aligned with this, a culture of blame was identified in the
previous reports, in which some recommendations tended to punish the operator instead
of analyzing the causes behind that behavior. Moreover, even if human and
organizational causes were identified as root causes of those incidents, the resulting
recommendations were focused just on immediate and physical causes. Hence, flaws
were reflected in the incident investigation program and the provided training in this
reports developed in-house and incident investigation reports developed by a third party.
properly analyze and take action on the inspection results developed by third parties.
The incident used for this case study exposes the same cause, in which the supervisor in
charge was not able to analyze inspection results due to a lack of knowledge and
training regarding this type of data. This is an illustration of how a lack of commitment
management programs. Thus, even if activities were performed based on the program,
no actual actions were taken with respect to the analysis and recommendations of
those results, making it hard for the organization to improve and make their processes
safer.
The recommendation was ranked with high priority. Despite the associated risk, the
recommendation was never implemented and the same recommendation was identified
in this case study, it can be concluded that there is a lack of proper channels of
communication among the different groups of people that interact in the same process.
This means that the people in charge of developing the process hazard analysis and the
inspections, and the daily operators, are not aligned with respect to the identified hazards
in the process and how they have to work together in order to define the best approach to
take for the process. All things considered, it was concluded that the different
elements of the safety management system of the organization are not properly aligned
and do not interact with each other. Thus, the mechanical integrity, training or
management of change programs are not fully aligned with the process hazard analysis
and compliance audit elements of the safety management system on the platform.
Recommendation 12: Ensure that safety management programs of the organization are
aligned with the process hazard analysis and audit programs performed in the facility.
Ensure appropriate channels of communication, training and qualified personnel to
define effective prevention and mitigation measures.
training material
Following the review of previous incidents, the next step in the proposed process
is the identification of the changes needed in their management systems, which are
exposed during the development of the incident investigation. This stage of the
investigation aims to provide more insights into the identification of root causes and
successfully uncover hidden faults of the system. Through the analysis of the identified
causes of the incident, review of previous internal and external incidents, and an analysis
of the current state of their management system, relevant changes in the management
The analysis suggested that changes in the current training needs matrix are necessary
to ensure that personnel have the required knowledge and experience to perform
their responsibilities on the platform. This means validating that the required
training for a specific role is up to date and the identified gaps (corrosion
management program, applicable corrosion standards, incident investigation
technologies or processes take into account the potential impact on training needs,
The mechanical integrity program should be validated to assure that the planning
phase of the program is developed by a group of qualified personnel with relevant
experience with the process and products on the platform. Likewise, the analysis
suggested validating the current personnel selection criteria for the development of
organization in which the level of engagement and awareness of the top management
investigation program.
Within the suggested changes in the safety culture of the organization, the
procedures, standards and policies. Additionally, the audit program must be validated
to ensure that the auditors are qualified to perform the job and that audit procedures
Together with the two previous arguments, the incident investigation program should
be re-evaluated to ensure that appropriate methodologies are applied and that the
incident investigation process, procedures, and roles and requirements are well
Aligned with the analysis performed in this step, it is also important to evaluate
which training material should be updated based on the technical gaps that were
identified during the investigation. For instance, during the development of this case
study, technical and organizational gaps such as internal corrosion monitoring program,
identification of corrosion threats, and applicable corrosion standards and best practices
for root cause identification are implemented within the organization and the basis of an
The next step in the analysis is to re-evaluate the leading indicators of the process
where the incident occurred. The objective is to identify whether the
process is showing the right signals and to make sure that operators understand the data
and information the system is giving. The case study identified that appropriate
presented without any analysis behind it, making it difficult to understand the actual state
of the process. The presentation of these indicators was limited to the comparison of the
result with the predefined goal for the year. In this sense, the results were
presented at a very high level and no analysis of the data was performed. Thus, the
learning process cannot be achieved because no actual transformation of the data into
information and knowledge is performed. Given these arguments, the following
Recommendation 13: Re-evaluate the objective of the existing leading indicators in the
process, in order to verify how the results can help to understand the process behavior
and what actions can be taken with respect to these results. Similarly, ensure that
performance indicators reports are followed by the corresponding analysis of the
obtained results.
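To illustrate the difference between comparing an indicator with its yearly goal and analyzing the data behind it, the sketch below computes both a goal check and a simple least-squares trend. The indicator (percentage of overdue inspections per month), the values, and the goal are hypothetical and are not taken from the case study.

```python
from statistics import mean


def meets_goal(values, goal):
    """Goal comparison only: is the yearly average at or below the goal?"""
    return mean(values) <= goal


def trend_slope(values):
    """Least-squares slope of the values against their index (per period)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


# Hypothetical data: percentage of overdue inspections per month
# (lower is better), with a hypothetical yearly goal of 5%.
overdue_pct = [2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
goal = 5.0

goal_ok = meets_goal(overdue_pct, goal)
slope = trend_slope(overdue_pct)
print(f"Meets yearly goal: {goal_ok}; monthly trend: {slope:+.2f} points")
```

With these made-up values the yearly average meets the goal while the slope is positive, meaning the indicator has been deteriorating month after month; a goal-only report would hide that signal.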
investigation report. Additionally, six changes in the existing management system and
the required changes in their current training material were suggested. The
recommendations first need to be validated by people inside the organization, and
management's approval is required. The next steps in the process are to assign the
recommendations and finally share the lessons learned. Those steps are out of the scope
of this case study. However, the results are going to be presented to the organization and
The first step in the learning process has been explained to illustrate how incident
investigations can be enhanced and quality information can be incorporated into the
existing corporate learning system. The following steps in the process require the
implementation of a pilot test inside the organization in order to determine the actual
performance of the potential alternative and the system. Therefore, a hypothetical case
has been developed for one of the recommendations proposed in the previous analysis.
Enhancing internal information also gives the opportunity to identify additional
sources of information that can be used in the future for different facilities or groups of
people. In this sense, the selected external incidents (BP North Slope oil spill, 2006 and
Natural gas pipeline rupture and fire in Carlsbad, New Mexico, 2000) can be grouped
together based on the type of incident or the associated causes (Internal corrosion). Then,
this information can be linked to the document management system, so people would
have access in case similar information is needed for future investigations or operational
inquiries.
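The grouping described above can be sketched as a small tag index that a document management system could expose for lookup. The function name and the incident-type tags are assumptions made for illustration; only the two incident titles and the shared "internal corrosion" cause come from the text.

```python
from collections import defaultdict


def index_by_tag(records):
    """Map each tag (incident type or cause) to the incident titles carrying it."""
    index = defaultdict(list)
    for title, tags in records:
        for tag in tags:
            index[tag].append(title)
    return dict(index)


# External incidents from the case study; type tags are hypothetical labels.
external_incidents = [
    ("BP North Slope oil spill, 2006",
     ["oil spill", "internal corrosion"]),
    ("Natural gas pipeline rupture and fire in Carlsbad, New Mexico, 2000",
     ["pipeline rupture and fire", "internal corrosion"]),
]

index = index_by_tag(external_incidents)
for title in index["internal corrosion"]:
    print(title)
```

A future investigation or operational inquiry could then retrieve both reports by looking up the shared cause tag.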
During this stage, the corporate learning team identifies the potential scope of the
recommendation, the benefits, potential cost, and the associated risk. Based on this
- Scope: the recommendation that is being analyzed involves the review of the
mechanical integrity program and corrosion guidelines of the platform where the
incident occurred. Since some of these documents are transversal for the
consistent interpretation and implementation of the corrosion management
- Cost: the cost would be determined in terms of the number of people required in
the initial identification of gaps with respect to the applicable standards and the
current corrosion program and guideline, and the cost of a subject matter expert
who would serve as an advisor during this phase. Similarly, the time and
resources needed to update the documentation, the training required, and finally,
the time and resources needed for implementation of specific activities at each
facility level.
- Risk: the likelihood of having another similar incident associated with internal
guidelines are in place, people are trained, and the documentation is consistently
For the initial identification, a work team is required. In this team the corporate
learning team would work together with three representatives of the process safety
department (who are part of the mechanical integrity team) and an external subject matter
A preliminary plan has been developed to determine how the recommendation
would be implemented and the people in charge of the execution of each task. The plan
was divided into three phases: identification phase, implementation at a corporate level,
and implementation at a facility level. Tables 8 to 10 present the preliminary plan for the
revision and updating of the corrosion management program and guidelines based on the
A set of indicators has been proposed to track the performance and progress of
the alternative. The indicators were defined based on the three phases of the action plan.
Identification phase:
- Number of facilities that identified the required changes and developed an action
plan
- Number of incidents associated with corrosion per year
implementation phase
Table 8 Action plan - Identification phase
Item  Activity  Responsible  Timeline (M1 M2 M3 M4 M5 M6)
Identification phase
systems
Table 9 Action plan - Implementation at a corporate level
Item  Activity  Responsible  Timeline (M1 M2 M3 M4 M5 M6 M7 M8)
Table 10 Action plan - Implementation at a facility level
Item  Activity  Responsible  Timeline (M9 M10 M11 M12 M13 M14 M15 M16 M17 M18 M19 M20 M21 M22 M23 M24)
Fifth step: refine alternative
Since the alternative has been designed to apply across the whole organization, the
execution of the first two phases is required first in order to determine how the
determine the specific subtasks required for each of them. For the execution of the third
phase, the work team would define the expectations for each facility. This would
facilitate the level of detail in auditing and documentation for each facility.
This step involves the execution of each of the tasks defined in the action plan.
During the first phase of the plan, gaps were identified with respect to the integrity
management program elements. More specifically, gaps associated with the integrity
threat classification and the identification of the potential pipeline impact by threat.
Thus, the standard “ASME B31.8S Managing System Integrity of Gas Pipelines” was
the primary source of information for the following phases. During the second phase of
the plan, the corrosion management program and inspection and maintenance guidelines
were updated and communicated, and the mechanical integrity personnel were trained. For
the last phase, each facility had to develop a sub-plan in which activities such as the
reevaluation of the process hazard analysis needed to be performed based on the results of
the management of change process. Moreover, each facility was in charge of defining
For the last step of the process, review meetings were scheduled to verify the
progress of each facility, identify common difficulties, and analyze preliminary
results of the alternative. Similarly, results of the proposed indicators for the alternative
were analyzed, refresher training was provided and actions were taken based on these
results. Finally, the work team participated in some of the audits performed in the
facilities in order to verify that auditors were qualified and used the appropriate standards.
This case study was developed with the objective of explaining the proposed
how investigations can be enhanced through a more detailed analysis to identify the root
the case study fits together in the complete learning process through a hypothetical
7. CONCLUSIONS AND FUTURE WORK
7.1. Conclusions
In this thesis, a systemic process for improving learning from incidents has been
developed for the chemical and oil and gas industry based on the identified limitations of
the learning system. It consists of seven steps that enable organizations to acquire
relevant safety information, absorb it and transform it into valuable safety knowledge
that can be incorporated into the organizational management systems. The framework is
intended to provide a holistic view of the learning from incident process and explore the
Additionally, the limitations of the learning system and the main elements that
organizations should take into account to improve learning have been discussed in order
process was explained through the development of a case study in which an incident
This work provides a holistic view of the learning from incident process, which
learning systems can be executed within the organization and how to support the
intended to serve as a tool to disseminate and analyze safety knowledge across the
converges a detailed analysis of the underlying causes of incidents, provides the
opportunity for analysis from different perspectives and examines potentially valuable
information for the organization. Moreover, the proposed learning system gives some
empirical and theoretical insights into the implementation of learning from incidents in real
organizations in the chemical and oil and gas industry. In this context, the framework
psychological and engineering inputs into the analysis. Finally, this work gives a bigger
picture of the learning from incidents process and overcomes some of the limitations that
Knowledge management theory has been applied for the purpose of defining the
incorporated technical and human elements into the system [104]. The study of this
the scope of this thesis was limited to organizational learning inside the organization.
The identified limitations in the learning system have been addressed as follows:
organization have been the focus of this study. Even though the available literature
highlights the importance of sharing and implementing lessons learned [15, 18], the
all facilities where it may be applicable, and not just in the facility where the incident
organization aims to ensure that the lessons learned are not just communicated and
disseminated, but also analyzed and documented for all the facilities. Similarly,
system by the flows of information in and out that are established from and to all
The use and understanding of academic resources and databases are encouraged in
the second step of the proposed learning process: acquire information. The
identification of external resources has been highlighted as the key feature of this
process in order to get valuable safety information that may be applicable to the
into the traditional incident investigation process. These steps provide some
during the analysis, by the incorporation of additional sources of information that can
give more clarity and detail with respect to the organizational, technical and cultural
flaws of the organization. Similarly, the proposed process gives a holistic view by
the identification of the required changes in the management system and current
order to ensure that operators and supervisors are getting the right signals to identify
The applicability of the proposed process has been explained through the analysis
provide a clear picture of how the process can be implemented within existing incident
through a more detailed analysis in the identification of root causes and how high-quality
Based on the limitations and the scope of this research, the performed study can
The DIKW (Data, Information, Knowledge, Wisdom) theory has been used in this
and information into knowledge. However, the concept of wisdom has not yet been
clearly defined in this research. Thus, the analysis of why organizations are not
attaining wisdom, which elements are necessary to achieve it, and analyzing the
The concept of individual learning has been discussed in this research at a high
characteristics such as age, motivation, and perception can be further analyzed to
understand how these factors can influence the level of retention in individuals.
The concept of sharing knowledge within the organization has been discussed and
methods for delivering this knowledge have not been covered in this research.
Evaluating the benefits and effectiveness of different types of tools would help
A case study was developed to explain how the proposed incident investigation
presented of how the case study fits into the complete learning process. The
This would give the opportunity to validate the complete process and refine it.
REFERENCES
1. Bureau of Labor Statistics, Nonfatal and fatal occupational injuries and illnesses
by industry. United States Department of Labor, 2016, September 7 [Data file].
Retrieved from: http://www.bls.gov/iif/oshsum1.htm [Accessed 20 November
2016].
3. Johnson, K., State and community during the aftermath of Mexico City's
November 19, 1984 Gas Explosion. Florida Mental Health Institute (FMHI),
1985. 58: pp. 1-44.
5. Gonzalez, S., Pemex reporta promedio de 153 accidentes con 21 muertes por
año. La Jornada, 2015. Retrieved from
http://www.jornada.unam.mx/ultimas/2015/04/01/pemex-reporta-promedio-de-
153-accidentes-con-21-muertes-por-ano-7763.html [Accessed 2 March 2016].
6. CSB, West fertilizer final investigation report in west fertilizer explosion and
Fire. U.S. Chemical Safety and Hazard Investigation Board, 2016. Retrieved
from http://www.csb.gov/west-fertilizer-explosion-and-fire-/ [Accessed 5 March
2016].
7. BBC, China explosions: what we know about what happened in Tianjin. BBC
News, 2015, August 17. Retrieved from http://www.bbc.com/news/world-asia-
china-33844084 [Accessed 5 March 2016].
8. Cooke, D.L. and T.R. Rohleder, Learning from incidents: from normal accidents
to high reliability. System Dynamics Review, 2006. 22(3): pp. 213-239.
10. Argyris, C. and D.A. Schön, Organizational learning II: theory, method, and
practice. 1996. Addison-Wesley, Reading, MA.
11. Drupsteen, L. and F.W. Guldenmund, What is learning? A review of the safety
literature to define learning from incidents, accidents and disasters. Journal of
Contingencies and Crisis Management, 2014. 22(2): pp. 81-96.
12. Lukic, D., A. Margaryan, and A. Littlejohn, How organisations learn from safety
incidents: a multifaceted problem. Journal of Workplace Learning, 2010. 22(7):
pp. 428-450.
14. Kletz, T.A., Lessons from disaster: How organizations have no memory and
accidents recur. 1993. IChemE, UK.
15. Kletz, T.A., Learning from experience. Journal of Hazardous Materials, 2004.
115(1): pp. 1-8.
16. Keegan, A. and J.R. Turner, Quantity versus quality in project-based learning
practices. Management Learning, 2001. 32(1): pp. 77-98.
17. Duffield, S. and S.J. Whitty, Developing a systemic lessons learned knowledge
model for organisational learning through projects. International Journal of
Project Management, 2015. 33(2): pp. 311-324.
19. Lindberg, A.-K., S.O. Hansson, and C. Rollenhagen, Learning from accidents–
what more do we need to know?. Safety Science, 2010. 48(6): pp. 714-721.
21. Lukic, D., A. Littlejohn, and A. Margaryan, A framework for learning from
incidents in the workplace. Safety Science, 2012. 50(4): pp. 950-957.
22. Le Coze, J.C., What have we learned about learning from accidents? Post-
disasters reflections. Safety Science, 2013. 51(1): pp. 441-453.
25. Carroll, J.S. and B. Fahlbruch, “The gift of failure: new approaches to analyzing
and learning from events and near-misses.” honoring the contributions of
Bernhard Wilpert. Safety Science, 2011. 49(1): pp. 1-4.
26. Huber, S., et al., Learning from organizational incidents: resilience engineering
for high risk process environments. Process Safety Progress, 2009. 28(1): pp. 90-
95.
27. Sanne, J.M., Learning from adverse events in the nuclear power industry:
organizational learning, policy making and normalization. Technology in
Society, 2012. 34(3): pp. 239-250.
28. Jacobsson, A., Å. Ek, and R. Akselsson, Method for evaluating learning from
incidents using the idea of “level of learning”. Journal of Loss Prevention in the
Process Industries, 2011. 24(4): pp. 333-343.
29. Drupsteen, L., J. Groeneweg, and G.I. Zwetsloot, Critical steps in learning from
incidents: using learning potential in the process from reporting an incident to
accident prevention. International Journal of Occupational Safety and
Ergonomics, 2013. 19(1): pp. 63-77.
30. Hovden, J., F. Størseth, and R.K. Tinmannsvik, Multilevel learning from
accidents–case studies in transport. Safety Science, 2011. 49(1): pp. 98-105.
33. Easterby-Smith, M. and M.A. Lyles, Handbook of organizational learning and
knowledge management. 2011. John Wiley & Sons Ltd Publications, United
Kingdom.
34. Fiol, C.M. and M.A. Lyles, Organizational learning. The Academy of
Management Review, 1985. 10(4): pp. 803-813.
35. Huber, G.P., Organizational learning: The contributing processes and the
literatures. Organization Science, 1991. 2(1): pp. 88-115.
36. Baum, J., The Blackwell companion to organizations. 2002. Blackwell, Malden,
MA.
37. Turner, M.E., Groups at work: Theory and research. 2001. Lawrence Erlbaum
Associate, Mahwah, NJ.
40. McNamara, C., Field guide to consulting and organizational development with
nonprofits: A collaborative and systems approach to performance, change and
learning. 2005. Authenticity Consulting, Minneapolis, MN.
42. Grant, R.M., The development of knowledge management in the oil and gas
industry/El desarrollo de la dirección del conocimiento en la industria del
petroleo y gas. Universia Business Review, 2013. (40): p. 92.
43. Rowley, J., The wisdom hierarchy: representations of the DIKW hierarchy.
Journal of Information Science, 2007. 33(2): pp. 163-180.
44. Wellman, J., Organizational learning: how companies and institutions manage
and apply knowledge. 2009. Palgrave Macmillan, New York, NY.
45. Frost, A., The different types of knowledge. Knowledge Management Site, 2017.
Retrieved from http://www.knowledge-management-tools.net/different-types-of-
knowledge.html [Accessed 20 February 2017].
50. Argote, L. and P. Ingram, Knowledge transfer: A basis for competitive advantage
in firms. Organizational Behavior and Human Decision Processes, 2000. 82(1):
pp. 150-169.
51. Malhotra, Y., Knowledge Management and Business Model Innovation. 2001.
IGI Global, Hershey, PA.
52. Dixon, N.M., Common knowledge: How companies thrive by sharing what they
know. 2000. Harvard Business School Press, Boston, MA.
53. Stein, E.W. and V. Zwass, Actualizing organizational memory with information
systems. Information Systems Research, 1995. 6(2): pp. 85-117.
54. Walsh, J.P. and G.R. Ungson, Organizational memory. Academy of Management
Review, 1991. 16(1): pp. 57-91.
56. Mohun, A. and S.D. Sagan, The Limits of Safety: organizations, accidents, and
nuclear weapons. 1995. American Association for the Advancement of Science,
Washington, DC.
58. Lalley, J.P. and R.H. Miller, The learning pyramid: Does it point teachers in the
right direction. Education, 2007. 128(1): p. 64.
60. Magennis, S. and A. Farrell, Teaching and learning activities: expanding the
repertoire to support student learning. Emerging Issues in the Practice of
University Learning and Teaching, 2005. 1: pp. 45-54.
62. Basu, P., Training methods for making a training course more lively. LinkedIn,
2015. Retrieved from
https://www.linkedin.com/pulse/training-methods-making-course-more-lively-pabitra-basu
[Accessed 20 January 2016].
63. Noe, R.A., Employee training and development. Seventh Edition, 2002. McGraw
Hill Education, New York, NY.
64. Blanchard, P.N., Training delivery methods. Reference for Business, 2016.
Retrieved from
http://www.referenceforbusiness.com/management/Tr-Z/Training-Delivery-Methods.html#ixzz4E7CQAxeU
[Accessed 20 January 2016].
65. Amit, S., Choosing the right eLearning methods: factors and elements. eLearning
Industry, 2015. Retrieved from
https://elearningindustry.com/choosing-right-elearning-methods-factors-elements
[Accessed 02 November 2016].
66. UNESCO, Training guide and training techniques. 2004. UNESCO Asia and
Pacific Regional Bureau for Education, Bangkok, Thailand.
67. Pichee, D., 6 ways to use science to improve your employee training program.
Training Industry, 2016. Retrieved from
https://www.trainingindustry.com/webinars/6-ways-to-use-science-to-improve-your-employee-training-program.aspx
[Accessed 01 December 2016].
68. Taylor, G.S., G.F. Templeton, and L.T. Baker, Factors influencing the success of
organizational learning implementation: A policy facet perspective. International
Journal of Management Reviews, 2010. 12(4): pp. 353-364.
69. Park, J.-H. and H.J. Choi, Factors influencing adult learners' decision to drop
out or persist in online learning. Educational Technology & Society, 2009.
12(4): pp. 207-217.
70. Bos, C.S. and S. Vaughn, Strategies for teaching students with learning and
behavior problems. 2002. Allyn & Bacon, A Pearson Education Company, 75
Arlington Street, Boston, MA.
71. Reason, J., Achieving a safe culture: theory and practice. Work & Stress, 1998.
12(3): pp. 293-306.
72. CCPS, Guidelines for risk based process safety. 2007. Center for Chemical
Process Safety of the American Institute of Chemical Engineers and John Wiley
& Sons, Inc, Hoboken, New Jersey.
73. Reason, J., Managing the risks of organizational accidents. 1997. Ashgate,
Hampshire, England.
74. Reason, J., Human error. 1990. Cambridge University Press, Cambridge, United
Kingdom.
77. CCPS, Guidelines for investigating chemical process incidents. Second edition.
2003. Center for Chemical Process Safety of the American Institute of Chemical
Engineers, New York, NY.
78. OSHA, Incident [Accident] Investigations: A Guide for Employees. United States
Department of Labor Occupational Safety and Health Administration, 2015.
Retrieved from https://www.osha.gov/dte/IncInvGuide4Empl_Dec2015.pdf
[Accessed 10 October 2016].
80. Mannan, S., Lees' Loss prevention in the process industries, hazard
identification, assessment and control. 2005. Butterworth–Heinemann,
Burlington, MA.
81. Sklet, S., Methods for accident investigation. 2002. Norwegian University of
Science and Technology, Trondheim, Norway.
82. Hosseinian, S.S. and Z.J. Torghabeh, Major theories of construction accident
causation models: a literature review. International Journal of Advances in
Engineering & Technology, 2012. 4(2): pp. 53-66.
83. HaSPA (Health and Safety Professionals Alliance), The core body of knowledge
for generalist OHS professionals. 2012. Safety Institute of Australia Ltd,
Tullamarine, Victoria, Australia.
84. Bird, F.E. and G.L. Germain, Practical loss control leadership. 1986. Institute
Publishing, Loganville, GA.
87. Leveson, N., A new accident model for engineering safer systems. Safety
Science, 2004. 42(4): pp. 237-270.
88. Hollnagel, E. and O. Goteman, The functional resonance accident model.
Proceedings of Cognitive System Engineering in Process Plant. 2004. pp. 155-
161.
89. Johnson, W.G., Management Oversight and Risk Tree-MORT. 1973. Aerojet
Nuclear Co., Scoville, ID.
90. Pasman, H.J., Risk analysis and control for industrial processes-gas, oil and
chemicals: a system perspective for assessing and avoiding low-probability,
high-consequence events. 2015. Butterworth-Heinemann, Waltham, MA.
92. Livingstone, A., G. Jackson, and K. Priestley, Root causes analysis: literature
review. Health & Safety Executive, Contract Research Report 325. 2001. HSE
Books, Norwich, UK.
94. Hollnagel, E., FRAM, the functional resonance analysis method: modelling
complex socio-technical systems. 2012. Ashgate Publishing, Ltd., Farnham, UK.
95. Energy Institute, Guidance on using Tripod Beta in the investigation and analysis
of incidents, accidents and business losses. 2015. Energy Institute, London, UK.
96. Ferjencik, M., An integrated approach to the analysis of incident causes. Safety
Science, 2011. 49(6): pp. 886-905.
97. Paradies, M. and L. Unger, TapRooT: The system for root cause analysis,
problem investigation, and proactive improvement. 2000. System Improvements,
Knoxville, TN.
98. TapRooT, Using the TapRooT® System for Chemical Industry Incident
Investigation. How Does the TapRooT® System Work? TapRooT® Website,
2015. Retrieved from http://www.taproot.com [Accessed 10 October 2016].
99. Knox, N.W. and R.W. Eicher, MORT user's manual for use with the management
oversight and risk tree analytical logic diagram. 1983. EG and G Idaho, Inc.,
Idaho Falls, ID.
100. Leveson, N.G., M. Daouk, N. Dulac, and K. Marais, Applying STAMP in
accident analysis. 2003. Massachusetts Institute of Technology, Cambridge,
MA.
101. Oxford, Learning definition. Oxford Dictionary Website, 2017. Retrieved from
https://en.oxforddictionaries.com/definition/learning [Accessed 12 October 2016].
103. Rowley, J.E., The wisdom hierarchy: representations of the DIKW hierarchy.
Journal of Information Science, 2007. 33(2): pp. 163-180.
104. Davenport, T.H. and L. Prusak, Working knowledge: how organizations manage
what they know. 1998. Harvard Business Press, Boston, MA.
105. Grant, K.A. and C.T. Grant, Developing a model of next generation knowledge
management. Issues in Informing Science and Information Technology, 2008.
5(2): pp. 571-590.
107. TPUB, Instructional Methods and Techniques. Military requirements for Petty
Officers Third and Second Class. 2013. Retrieved from
http://navyadvancement.tpub.com/14504/css/Instructional-Methods-And-Techniques-25.htm
[Accessed 18 November 2016].
108. Wood, E., Problem-based learning: exploiting knowledge of how people learn to
promote effective learning. Bioscience Education, 2004. 3(1): pp. 1-12.
109. Noe, R.A., Employee Training and Development. Fifth Edition, 2010. McGraw-
Hill, New York, NY.
110. Pighin, M. and A. Marzona, Unlearning/relearning in processes of business
information systems innovation. Journal of Information and Organizational
Sciences, 2011. 35(1): pp. 59-72.
111. Annett, J., Hierarchical task analysis. Handbook of Cognitive Task Design,
2003. 2: pp. 17-35.
113. Salmon, P., D. Jenkins, N. Stanton, and G. Walker, Hierarchical task analysis vs.
cognitive work analysis: comparison of theory, methodology and contribution to
system design. Theoretical Issues in Ergonomics Science, 2010. 11(6): pp. 504-
531.
114. Bailey, A., BP: Learning from oil spill lessons. Petroleum News, 2006. 11(20).
Retrieved from http://www.petroleumnews.com/pntruncate/573947058.shtml
[Accessed 15 November 2016].
115. National Transportation Safety Board, Natural gas pipeline rupture and fire near
Carlsbad, New Mexico, August 19, 2000. Pipeline Accident Report NTSB/PAR-03/01,
2003. National Transportation Safety Board, Washington, D.C.
116. Garber, J.D., A. Alvarado, and R.H. Winters, Study tracks internal-corrosion
trends in aging gulf pipelines. Oil and Gas Journal, 2000. 98(13): pp. 68-73.
117. Baker, M. and R.R. Fessler, Pipeline corrosion, Final report. 2008. US
Department of Transportation Pipeline and Hazardous Materials Safety
Administration, Washington, D.C.