Factors: Root Cause Analysis
Briefing Note No. 15
For background information on this series of publications, please see Briefing Note 1 - Introduction.
Root cause: accidents happen at the end of a chain of events. Very often, the immediate cause, just before the accident, is a human error. But before that, there will be other actions, decisions or events influenced by various conditions that are part of the overall cause of the accident. By finding the root causes, it may be possible to prevent future similar accidents.
[Figure: Management decisions, Unsafe acts, Accidents]
Case study and analysis
Sparks from a welding torch fell onto solvent and paint-soaked rags left in the work area, causing a small fire. This was quickly extinguished by the welder. Paintwork and cable ducting were damaged, with repair costs of around £2500.
Many companies would review this information and look no further than the immediate causes: litter was left in the workplace, and the welder did not remove this or put adequate fire blanketing in place. The result would be to discipline the welder and painting crew and perhaps to issue a notice to all workers to pay more attention to housekeeping.
Such a basic analysis does not explore the true underlying causes of this event: root cause analysis methods do. This briefing note provides examples of three root cause analysis tools applied to the above accident.
Why root cause analysis?
The Health and Safety Commission issued a consultative document on the subject of accident investigation, but rather than introduce a legal duty, it plans to develop guidance to help employers to conduct investigations.
Many accidents are blamed on the actions or omissions of an individual who was directly involved in operational or maintenance work. This typical but short-sighted response ignores the fundamental failures which led to the accident. These are usually rooted deeper in the organisation's design, management and decision-making functions.
Source: Reducing error and influencing behaviour, HSG48, HSE Books (1999), ISBN 0 7176 2452 8
HSG65 approach
HSE's Successful health and safety management (reference 5) provides good practical advice on accident investigation. It suggests that the investigation team should:
1. Collect evidence
Visit the site and directly observe the conditions where the accident occurred, noting the layout (in this case, the location of the welder, flammable materials, safety equipment, etc.)
Review documents: procedures for the painting and welding work; permits; policy documents; risk assessments, etc.
Interview: those involved; witnesses to the accident or its outcome; and those involved before the accident (e.g. supervisors, inspectors, maintenance crew).
2. Assemble and consider the evidence
Using HSE's model of human factors (see Briefing Note 8) identify the:
- Immediate causes of an accident, i.e. those concerned with personal factors and job factors, e.g. behaviour of the painting crew, the welder and the supervisor who signed off the permit to work, work conditions, adequacy of guards, separation distances, etc.
- Underlying causes, i.e. those concerned with management and organisational factors, e.g. pressure on paint crew to complete too many jobs in a shift, failures in safety policy and risk assessment, etc.
3. Compare findings with legal and company standards
Were the standards applied (e.g. is there a standard for housekeeping; is it enforced)?
Were the standards themselves adequate?
4. Draw conclusions based on the evidence
5. Make improvements and track progress against these by regular monitoring and checking (e.g. are they in place; are they working?)
HSG65 provides further guidance on how to conduct the investigation logically. Starting with premises, consider if workplace problems were significant in the accident. Then consider plant and substances: was there sufficient guarding of equipment or containment of substances? Were procedures adequate and used correctly? Consider people and their behaviour, etc.
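The split between immediate and underlying causes in step 2 can be illustrated with a minimal sketch. The four factor categories come from the text above; the function name, the data layout and the assignment of each case-study finding to a single factor type are assumptions made purely for illustration, not part of HSG65.

```python
# Factor categories named in step 2 of the HSG65 approach.
IMMEDIATE_FACTORS = {"personal", "job"}
UNDERLYING_FACTORS = {"management", "organisational"}


def classify(factor_type):
    """Map a factor type to the kind of cause it indicates (hypothetical helper)."""
    if factor_type in IMMEDIATE_FACTORS:
        return "immediate"
    if factor_type in UNDERLYING_FACTORS:
        return "underlying"
    raise ValueError(f"unknown factor type: {factor_type}")


# Case-study findings; the factor type given to each one is an assumption.
findings = [
    ("rags left in the work area by the painting crew", "personal"),
    ("fire blanketing did not protect the area", "job"),
    ("pressure on paint crew to complete too many jobs in a shift", "management"),
    ("failures in safety policy and risk assessment", "organisational"),
]

# Group the findings by the kind of cause they point to.
grouped = {"immediate": [], "underlying": []}
for text, factor_type in findings:
    grouped[classify(factor_type)].append(text)

for kind, items in grouped.items():
    print(kind, "causes:")
    for item in items:
        print("  -", item)
```

Grouping findings this way keeps the underlying causes visible alongside the immediate ones, which is the point of steps 2 and 3.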
Copyright 2003 by The Institute of Petroleum, London: A charitable company limited by guarantee. Registered No. 135273, England
SCAT (the Systematic Cause Analysis Technique) is based on the International Safety Rating System (ISRS), which is extensively used in the petroleum industry. ISRS measures the safety management system in a plant against a number of control elements; SCAT helps the user to interpret accidents and incidents in terms of which of the elements failed. The software version (E-scat) allows the user to enter:
A description of the accident (date, time, what happened)
Details of the loss potential which comprises the following factors:
- loss severity potential - i.e. how bad could it have been? In the example - serious - the fire could have been worse
- probability of reoccurrence - moderate - a similar problem could arise until measures are taken to remove the causes
- frequency of exposure - moderate - welding is a common task but not a daily occurrence
Type of contact - SCAT provides various choices. In the example, the contact is with heat (other choices include contact with cold, radiation, caustics, noise, electricity, etc.)
From lists, the user then selects the immediate causes (ICs): substandard acts or substandard conditions. These include failure to follow procedure/policy/practice and failure to check/monitor, both of which were causes of the fire in the example.
The user then selects the basic underlying causes (BCs), which are split into personal factors and job factors. The user can choose to see only the most probable BCs that apply to the ICs chosen in the last step. BCs include abuse or misuse, which can be described as improper conduct that is not condoned, and improper motivation, which leads to an improper attempt to save time or effort.
From the ICs and BCs identified, SCAT will suggest a number of control actions needed (CANs), based on ISRS principles, that will help to remove or reduce the impact of the underlying causes of the accident. For this example, CANs include: task observation - the need for a scheme to carry out spot checks on tasks; rules and work permits - review of how compliance with rules is achieved; and general promotion (of safety) - promotion of critical task safety and promotion of housekeeping systems.
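The chain from immediate causes to basic causes to control actions can be sketched as a small data record. This is only an illustration of the flow described above: the class name, field names and, in particular, the table linking each BC to CANs are hypothetical, and the real SCAT/ISRS lookup tables are not reproduced here.

```python
from dataclasses import dataclass, field

# Hypothetical links from basic causes (BCs) to control actions needed (CANs).
# The cause and action names come from the example; the pairing is assumed.
CANS_BY_BASIC_CAUSE = {
    "abuse or misuse": ["rules and work permits", "task observation"],
    "improper motivation": ["task observation", "general promotion"],
}


@dataclass
class ScatRecord:
    """One accident entry, mirroring the items the user is asked to enter."""
    description: str
    loss_severity_potential: str
    probability_of_reoccurrence: str
    frequency_of_exposure: str
    type_of_contact: str
    immediate_causes: list = field(default_factory=list)
    basic_causes: list = field(default_factory=list)

    def suggested_cans(self):
        """Collect CANs for the selected BCs, without duplicates, in first-seen order."""
        suggestions = []
        for cause in self.basic_causes:
            for action in CANS_BY_BASIC_CAUSE.get(cause, []):
                if action not in suggestions:
                    suggestions.append(action)
        return suggestions


fire = ScatRecord(
    description="Welding sparks ignited solvent and paint-soaked rags",
    loss_severity_potential="serious",
    probability_of_reoccurrence="moderate",
    frequency_of_exposure="moderate",
    type_of_contact="heat",
    immediate_causes=[
        "failure to follow procedure/policy/practice",
        "failure to check/monitor",
    ],
    basic_causes=["abuse or misuse", "improper motivation"],
)

print(fire.suggested_cans())
```

In this sketch, the same control action can be reached from more than one basic cause, so the suggestions are de-duplicated before being reported.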
The following is a simplified description of MORT (Management Oversight and Risk Tree), as this technique can be quite complex.
In general, MORT views an accident as occurring because an unwanted energy flow reached a target. So, in the example,
sparks (a form of heat energy) from welding fell onto flammable materials (the target) leading to a fire. The fire, i.e. more
heat energy, then burned paintwork and cable ducting (another target) causing damage. Energy flows can only be stopped
by some form of barrier and barrier analysis is a key part of MORT. In the example, there should have been a barrier to
contain the flammable material (e.g. put the rags in containers and take them away). A second barrier, fire blankets, should
have covered a larger area.
MORT provides a tree structure that guides the analyst to consider whether:
similar (worse) events could happen in future
the accident had been assumed and already considered as inevitable given the type of operations carried out (MORT calls this an assumed risk), or
the accident is unacceptable, and the underlying causes need to be investigated
The investigation then proceeds to identify what specifically went wrong and why. The analysis considers:
what happened
what elements of the protection system were less than adequate
- recovery mechanisms, where they failed to reduce the consequences of the event (in the example, they did work, as the fire was extinguished)
- barriers that failed to stop the event occurring (flammable materials were not contained; protective blankets were badly positioned)
- management factors that did not protect against the event (permit system and supervision/checking not carried out effectively).
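The "less than adequate" check on the protection system can be sketched as a short list of elements, each marked adequate or not. This is a minimal illustration using the case-study findings above; the class, field and function names are hypothetical and do not represent the MORT tree itself.

```python
from dataclasses import dataclass


@dataclass
class ProtectionElement:
    """One element of the protection system examined in the analysis."""
    name: str
    kind: str        # "recovery", "barrier" or "management"
    adequate: bool
    note: str = ""


def less_than_adequate(elements):
    """Return the names of elements judged less than adequate."""
    return [element.name for element in elements if not element.adequate]


# Findings from the welding fire example.
example = [
    ProtectionElement("fire extinguished by the welder", "recovery", True),
    ProtectionElement("containment of flammable materials", "barrier", False,
                      "rags were left in the work area"),
    ProtectionElement("protective blankets", "barrier", False,
                      "badly positioned"),
    ProtectionElement("permit system and supervision/checking", "management", False,
                      "not carried out effectively"),
]

for name in less_than_adequate(example):
    print("less than adequate:", name)
```

Only the recovery mechanism passes in this example, which matches the analysis above: the fire was put out, but the barriers and management controls that should have prevented it did not work.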
According to reference 1, there are three key components of root cause analysis systems:
1. A method of describing and representing the incident sequence and its contributing conditions (e.g. a tree diagram)
2. A method of identifying the critical events and conditions in the incident sequence (often a checklist to stimulate ideas)
3. Based on the identification of the critical events or active failures, a method for systematically investigating the management
and organisational factors that allowed the active failures to occur
The three example methods described here show how these components are used in practice.