Overview of IEC 61508 - Design of Electrical / Electronic / Programmable Electronic Safety-Related Systems
The author is with the Health & Safety Executive, Magdalen House, Bootle,
Merseyside, L20 3QZ, UK
This paper was originally published in the Computing & Control Engineering Journal,
vol. 11, no.11, February 2000 (Institution of Electrical Engineers, London)
This article reviews the principal requirements of IEC 61508 relating to the
specification and design of hardware and software in programmable electronic
systems intended for use in safety-related applications.
Introduction
The aim of the international standard IEC 61508 [1] is to provide a route whereby
safety-related systems can be implemented using electrical or electronic or
programmable electronic technology in such a way that an acceptable level of
functional safety is achieved. The strategy of the standard is first to derive the safety
requirements of the safety-related system from a hazard & risk analysis and then to
design the safety-related system to meet those safety requirements taking into
account all possible causes of failure including random hardware faults, systematic
faults in both hardware and software and human factors.
This article reviews the concept of the safety lifecycle and the way in which the
safety requirements specification for an electrical / electronic / programmable
electronic system is developed. The methodology of IEC 61508 for the design of
hardware and software is described. In particular, the requirements of the standard
relating to quantified failure probability, hardware fault tolerance and avoidance /
control of systematic faults are explained.
The application of IEC 61508 will influence the requirements for subsystems (such
as sensors, programmable logic controllers or actuators) used in any part of a safety-
related system. The way in which such subsystems will need to be characterised, so
that compliance with IEC 61508 can be claimed, is discussed.
23/07/01
IEESBrown.DOC
• Process plant emergency shut-down system
• Fire & gas detection system
• Machinery guard / access interlocking system
• Machinery emergency stop
• Crane automatic safe load indicator
• Railway signalling
• Steam boiler controls
• Fairground roller-coaster control system
IEC 61508 can be applied both to systems which operate ‘on demand’ (usually due
to some fault) and to those which are required to operate continuously to
maintain a safe state. An example of a demand mode system would be an
emergency shut-down system on a chemical process plant which operates valves on
the plant to move the process to a safe state in the event of the pressure in a vessel
exceeding some limit. Demand mode systems are sometimes referred to as
‘protection systems’ because they act to protect against hazardous situations.
The aim is to address all the possible causes of dangerous failures. Such failures
could arise from faults in the hardware or software of any part of the safety-related
system, or from human error. Further, faults can be introduced at any stage of the
lifecycle of a system, from its initial concept, through design, installation and
operation to eventual decommissioning.
The scope and boundary of the system to which the standard is applied are entirely
within the hands of those who wish to claim compliance with the standard. Therefore,
a very important first activity is to clearly define the system boundaries. This leads to
a clear view as to which hazards should be considered during the later stages of the
safety lifecycle.
It should be noted that whilst IEC 61508 recognises that it is of primary importance to
eliminate hazards at source, the principles of inherent safety are outside the scope of
IEC 61508.
Safety lifecycle
IEC 61508 uses the ‘safety lifecycle’ as a framework to structure its own
requirements and it is a basic requirement of the standard that a similar (though not
necessarily identical) lifecycle is used to structure the activities relating to the
specification, design, integration, operation, maintenance and eventual
decommissioning of an E/E/PE safety-related system. The essence is that all
activities relating to functional safety are managed in a planned and methodical way,
with each phase having defined inputs and outputs. This enables a process of
verification whereby a check is made at the conclusion of each phase to confirm that
the required outputs have in fact been produced as planned. The ability to check (or
validate) that verification has been properly implemented throughout the safety
lifecycle is one of the foundations of functional safety. The premise is that such a
structured approach will minimise the number of systematic faults which are ‘built-in’
to the safety-related system. This is particularly important for programmable
systems because it cannot be assumed that testing alone will reveal potentially
dangerous faults.
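The idea of verifying each phase against its planned outputs can be sketched as follows. This is only an illustration: the phase names, the output names and the `verify_phase` helper are hypothetical and are not taken from IEC 61508.

```python
from dataclasses import dataclass, field


@dataclass
class Phase:
    """One lifecycle phase with its planned outputs (hypothetical names)."""
    name: str
    required_outputs: set
    produced_outputs: set = field(default_factory=set)


def verify_phase(phase: Phase) -> set:
    """Return the planned outputs still missing (an empty set means verified)."""
    return phase.required_outputs - phase.produced_outputs


phases = [
    Phase("Overall safety requirements",
          {"safety functions requirements", "safety integrity requirements"},
          {"safety functions requirements", "safety integrity requirements"}),
    Phase("Safety requirements allocation",
          {"allocation record"}),  # nothing produced yet
]

for p in phases:
    missing = verify_phase(p)
    status = "verified" if not missing else f"missing: {sorted(missing)}"
    print(f"{p.name}: {status}")
```

A check of this kind only confirms that outputs exist; judging whether they are adequate remains a matter for review and assessment.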
Figure 1 shows the Overall Safety Lifecycle. The use of the term ‘overall’ reflects the
fact that it is necessary to develop the safety requirements for the E/E/PE safety-
related systems taking into account the contributions to safety which may result from
the use of other technology safety-related systems (such as pressure relief valves or
mechanical interlocks) as well as from external risk reduction facilities (such as fire
walls and bunds).
The design and integration of all the necessary safety-related systems and risk
reduction facilities comes within the realisation phase of the Overall Safety
Lifecycle. However, IEC 61508 only addresses in detail the realisation of safety-
related systems based on E/E/PE technology. It is during the realisation phase that
the hardware & software of the E/E/PE safety-related system(s) is designed and
integrated to meet the safety requirements.
Essentially, a safety function is an action which is required to ensure that the risk
associated with a particular hazard is tolerable. A safety function is specified in
terms of its functionality (the action required) and its safety integrity (the required
probability that the specified action will be carried out in order to achieve the required
risk reduction). An accurate specification of the safety functions in terms of
functionality and safety integrity is a corner-stone of IEC 61508. The specification for
a safety function is derived taking into account the nature of the hazard, and the risks
(in terms of likelihood and consequence) which the hazard presents in the absence
of the safety function. It is also necessary to form a view as to what is the tolerable
risk associated with each hazard. In the UK, in order to meet safety legislation, the
need for, and the required extent of, risk reduction will need to be assessed taking
into account the “ALARP” principle [2].
This assessment is undertaken for each hazard which falls within the defined system
boundaries. The result is a set of safety functions which together is called the
“Overall Safety Requirements Specification”. This process is illustrated in Figure 2
for an example where there are 3 hazards (H1,H2,H3) within the system boundary,
with each hazard having an associated unacceptable risk (R1,R2,R3) which is
reduced to a tolerable level by the action of a safety function (SF1,SF2,SF3). The
Overall Safety Requirements Specification in this example consists of the Safety
Functions Requirements and Safety Integrity Requirements for each of the safety
functions, SF1, SF2 and SF3.
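As a rough numerical illustration of how a safety integrity requirement follows from the unprotected and tolerable risk, the sketch below treats risk purely as an event frequency for a fixed consequence and assumes a demand mode safety function. The function name `required_pfd` and all figures are invented for illustration.

```python
def required_pfd(unprotected_per_year: float, tolerable_per_year: float) -> float:
    """Largest average probability of failure on demand that still reduces
    the hazardous event frequency to the tolerable level."""
    if unprotected_per_year <= 0 or tolerable_per_year <= 0:
        raise ValueError("frequencies must be positive")
    # No risk reduction is needed if the risk is already tolerable.
    return min(1.0, tolerable_per_year / unprotected_per_year)


# Illustrative figures only: (unprotected frequency, tolerable frequency) per year.
hazards = {"H1": (0.1, 1e-4), "H2": (0.5, 1e-3), "H3": (0.01, 1e-3)}

for name, (unprotected, tolerable) in hazards.items():
    print(f"{name}: required PFDavg <= {required_pfd(unprotected, tolerable):.0e}")
```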
The next stage is to decide how each of the safety functions is going to be
implemented, in terms of the type of safety-related system technology or external
risk reduction facility. This is the ‘Safety Requirements Allocation’ phase of the
Overall Safety Lifecycle. Each safety function is allocated to one or more safety-
related systems or risk reduction facilities in such a way as to meet the safety
functions requirements and safety integrity requirements for that function. The result
of the allocation process is, for each safety-related system or risk reduction facility, a
set of safety functions and associated safety integrity requirements.
In the example shown in Fig. 4, safety function SF1 is allocated to both an E/E/PE
safety-related system and to an ‘other technology’ safety-related system. In this
case a single PES (PES 1) is used to perform all the safety functions allocated to
E/E/PE safety-related systems. Safety function SF2 is also allocated to E/E/PE
technology and hence will also be performed by PES 1. Consequently, PES 1 is
required to perform 2 safety functions, SF1a and SF2, having safety integrity
requirements SIR1a and SIR2 respectively. Note that SIR1a will differ from the
safety integrity requirement of the safety function SF1, because SF1 has been
allocated to 2 different safety-related systems, each of which will take a share of the
integrity requirement.
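The sharing of an integrity requirement between two systems can be illustrated with a simple calculation. The sketch assumes that the two systems are fully independent and that either can perform the safety function alone, so that their probabilities of failure on demand multiply. The function name and the figures are hypothetical.

```python
def eepe_share(overall_pfd: float, other_technology_pfd: float) -> float:
    """PFD the E/E/PE system must achieve (SIR1a) so that, together with an
    independent 'other technology' system, the overall target is met."""
    if not 0 < other_technology_pfd <= 1:
        raise ValueError("other_technology_pfd must be in (0, 1]")
    if overall_pfd <= 0:
        raise ValueError("overall_pfd must be positive")
    return min(1.0, overall_pfd / other_technology_pfd)


# Illustrative: SF1 needs PFDavg <= 1e-3 and a relief valve is credited with 1e-1.
print(f"SIR1a: PFDavg <= {eepe_share(1e-3, 1e-1):.0e}")
```

Where the two systems share a common cause of failure, simple multiplication overstates the benefit and the allocation needs to take that into account.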
The final stage in the development of the E/E/PES safety requirements is to translate
the safety integrity requirements of those safety functions implemented in E/E/PE
safety-related systems into safety integrity levels (SILs). If the safety integrity
requirements have been developed on a quantitative basis, then the SIL for a safety
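function is determined simply by reference to Tables 2 & 3 according to whether the
safety integrity requirement is expressed in terms of:
a) the average probability of failure to perform its design function on demand (low
demand mode of operation, Table 2), or
b) the probability of a dangerous failure per hour (high demand / continuous mode
of operation, Table 3).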
The SIL forms the basis for the qualitative grading of the techniques and measures
used for the avoidance and control of systematic faults in both hardware and
software whilst the quantitative target failure measure provides the upper limit for the
quantified estimate of failure probability.
Safety integrity level   Target failure measure
                         (average probability of failure to perform its design
                         function on demand)
4                        ≥ 10^-5 to < 10^-4
3                        ≥ 10^-4 to < 10^-3
2                        ≥ 10^-3 to < 10^-2
1                        ≥ 10^-2 to < 10^-1
Table 2 Safety integrity levels for safety functions operating in the low demand mode
of operation
Table 3 Safety integrity levels for safety functions operating in the high demand / continuous
mode of operation
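The table lookup can be sketched in a few lines. The low demand bands are those listed above. The high demand / continuous bands (dangerous failures per hour) do not appear in this text; the values used here are recalled from IEC 61508-1 and should be treated as an assumption to be checked against the standard. The names `LOW_DEMAND`, `HIGH_DEMAND` and `sil_for` are hypothetical.

```python
# Low demand bands (average probability of failure on demand), from the table above.
LOW_DEMAND = {4: (1e-5, 1e-4), 3: (1e-4, 1e-3), 2: (1e-3, 1e-2), 1: (1e-2, 1e-1)}

# High demand / continuous bands (dangerous failures per hour).
# Assumption: recalled from IEC 61508-1, not stated in this article.
HIGH_DEMAND = {4: (1e-9, 1e-8), 3: (1e-8, 1e-7), 2: (1e-7, 1e-6), 1: (1e-6, 1e-5)}


def sil_for(target: float, bands: dict):
    """Return the SIL whose band contains the target failure measure,
    or None if the target lies outside all four bands."""
    for sil, (lower, upper) in bands.items():
        if lower <= target < upper:
            return sil
    return None


print(sil_for(5e-4, LOW_DEMAND))   # low demand target of 5e-4
print(sil_for(3e-7, HIGH_DEMAND))  # high demand target of 3e-7 per hour
```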
IEC 61508 accepts that a quantitative approach towards the determination of the SIL
of a safety function is not always appropriate. In such situations, a qualitative
approach, such as a risk graph, can be used. Such an approach however requires
great care to ensure that adequate risk reduction is achieved. IEC 61508-5 provides
general guidance on the use of such techniques. If this approach is used there is still
a need to adopt a quantitative target failure measure for the purpose of failure
probability modelling. In this case, the quantitative target failure measure is taken to
be the highest probability of failure associated with the SIL, according to Table 2 or
3.
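For a SIL obtained qualitatively, the rule above amounts to taking the upper limit of the corresponding band as the target. A minimal sketch for the low demand mode, using the band limits from the low demand table and hypothetical function names:

```python
# Upper limit of each low demand band (average probability of failure on demand).
SIL_UPPER_LIMIT_LOW_DEMAND = {1: 1e-1, 2: 1e-2, 3: 1e-3, 4: 1e-4}


def target_from_sil(sil: int) -> float:
    """Quantitative target failure measure for a qualitatively derived SIL."""
    return SIL_UPPER_LIMIT_LOW_DEMAND[sil]


def meets_target(estimated_pfd: float, sil: int) -> bool:
    """True if the quantified estimate lies below the target for the SIL."""
    return estimated_pfd < target_from_sil(sil)


print(target_from_sil(2))
print(meets_target(4.5e-3, 2))
```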
Having specified the safety requirements for the E/E/PE safety-related system(s), the
task is then to design and integrate the hardware and software to meet those
requirements. IEC 61508 has requirements in 3 key areas, each of which must be
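met in order for compliance with the standard to be claimed. These are quantified
failure probability, hardware fault tolerance, and the avoidance and control of
systematic faults.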
Quantified failure probability
The effect of random hardware failures can be modelled using traditional reliability
and availability analysis techniques. For example, IEC 61508-6 gives guidance on
the use of reliability block diagrams, and other techniques such as Markov analysis
may also be used. The analysis should take into account the use of any automatic
diagnostics, and any periodic proof testing to reveal failures not detected by
diagnostics.
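To show how diagnostics and proof testing enter such a calculation, the sketch below uses a widely quoted simplified approximation for a single channel: failures found by diagnostics contribute for the repair time only, while undetected failures contribute for half the proof test interval plus the repair time. The symbols, the default repair time and the example figures are assumptions for illustration, not values from the standard.

```python
def pfd_avg_1oo1(lambda_d: float, diagnostic_coverage: float,
                 proof_test_interval_h: float, mttr_h: float = 8.0) -> float:
    """Approximate average probability of failure on demand of one channel.

    lambda_d             dangerous failure rate (per hour)
    diagnostic_coverage  fraction of dangerous failures found by diagnostics
    """
    if not 0.0 <= diagnostic_coverage <= 1.0:
        raise ValueError("diagnostic_coverage must be between 0 and 1")
    lambda_dd = diagnostic_coverage * lambda_d          # detected by diagnostics
    lambda_du = (1.0 - diagnostic_coverage) * lambda_d  # revealed only by proof test
    return lambda_du * (proof_test_interval_h / 2.0 + mttr_h) + lambda_dd * mttr_h


# Illustrative: 1e-6 dangerous failures per hour, 90 % coverage, yearly proof test.
print(f"{pfd_avg_1oo1(1e-6, 0.9, 8760.0):.2e}")
```

The approximation holds only while the failure rate multiplied by the proof test interval is small; a full analysis would follow the guidance in IEC 61508-6.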
The requirement to take into account common cause failures is included because
redundancy is often used within an E/E/PE safety-related system to reduce the
probability of failure due to random hardware faults. In practice, the benefit to be
gained from the use of redundancy as a technique to enhance reliability will be limited
by the likelihood of faults occurring simultaneously due to a common cause (e.g.
over-heating). IEC 61508-6 gives an example of one methodology which may be
used to take account of common cause failures, but other methodologies may be
equally acceptable.
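The limiting effect of common cause failures can be illustrated with the beta-factor model, one commonly used approach in which a fraction beta of the failures is assumed to affect both channels at once. The sketch uses a simplified approximation for a dual redundant (1-out-of-2) arrangement that ignores detected failures and repair times. The function name and figures are hypothetical.

```python
def pfd_avg_1oo2(lambda_du: float, proof_test_interval_h: float, beta: float) -> float:
    """Approximate PFDavg of two redundant channels with a beta-factor
    common cause contribution (undetected dangerous failures only)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    lt = lambda_du * proof_test_interval_h
    independent = ((1.0 - beta) * lt) ** 2 / 3.0  # both channels fail independently
    common_cause = beta * lt / 2.0                # one cause fails both channels
    return independent + common_cause


# Illustrative: 1e-6 per hour per channel, yearly proof test.
for beta in (0.0, 0.02, 0.1):
    print(f"beta={beta:.2f}: PFDavg ~ {pfd_avg_1oo2(1e-6, 8760.0, beta):.2e}")
```

Even a small beta makes the common cause term dominate, which is why redundancy alone gives a limited improvement.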
When any form of data communication system (e.g. field bus) is used to support
a safety function it is necessary to ensure that the likelihood of safety-related
information being corrupted, lost or excessively delayed by the communication
process is less than the target failure measure.
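The three failure modes named above (corruption, loss and excessive delay) are typically detected with a checksum, a sequence number and a time stamp. The sketch below shows the idea with a hypothetical frame layout; it is not a field bus protocol, and detection measures of this kind do not by themselves demonstrate that the residual error rate is below the target failure measure.

```python
import struct
import zlib

# Hypothetical header: sequence number (uint32) and send time in seconds (float64).
HEADER = struct.Struct(">Id")


def encode(seq: int, sent_at: float, payload: bytes) -> bytes:
    """Build a frame: header + payload + CRC-32 over both."""
    body = HEADER.pack(seq, sent_at) + payload
    return body + struct.pack(">I", zlib.crc32(body))


def check(frame: bytes, expected_seq: int, now: float, max_age_s: float) -> str:
    """Classify a received frame as ok, corrupted, lost/out of sequence or delayed."""
    if len(frame) < HEADER.size + 4:
        return "corrupted"
    body, (crc,) = frame[:-4], struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        return "corrupted"
    seq, sent_at = HEADER.unpack(body[:HEADER.size])
    if seq != expected_seq:
        return "lost or out of sequence"
    if now - sent_at > max_age_s:
        return "delayed"
    return "ok"


frame = encode(7, 100.0, b"\x01")
print(check(frame, expected_seq=7, now=100.05, max_age_s=0.1))
```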
The concept of hardware fault tolerance can also be applied to subsystems within
the E/E/PE safety-related system. For example, 2 sensors arranged in a redundant
configuration can be thought of as a single sensor subsystem having a hardware
fault tolerance of 1. This is sometimes referred to as a ‘single redundant’
architecture.
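For simple voting arrangements the hardware fault tolerance can be read directly from the architecture. The sketch below uses the common ‘M out of N’ (MooN) notation, which is an assumption here since the article does not use it: at least M of the N channels must work for the function to be performed.

```python
def hardware_fault_tolerance(m: int, n: int) -> int:
    """Number of channel faults an M-out-of-N arrangement can tolerate
    without losing the safety function."""
    if not 1 <= m <= n:
        raise ValueError("require 1 <= m <= n")
    return n - m


print(hardware_fault_tolerance(1, 1))  # single channel
print(hardware_fault_tolerance(1, 2))  # two redundant sensors
print(hardware_fault_tolerance(2, 3))  # two-out-of-three voting
```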
IEC 61508-2 places an upper limit on the SIL which can be claimed for any safety
function on the basis of the fault tolerance of the subsystems which are used by the
safety function. These limits are referred to as ‘architectural constraints’ because
they are principally a function of the architecture of the subsystem. The limit which
applies to any particular subsystem is a function of:
a) the hardware fault tolerance of the subsystem,
b) the fraction of failures of the subsystem which can be regarded as ‘safe’ because
they are either in a mode which does not cause a loss of the safety function, or are
detected by automatic diagnostic tests (the so called safe-failure fraction), and
c) the degree of confidence in the behaviour of the subsystem under fault conditions.
Reference should be made to IEC 61508-2 for a full description of how to derive the
limit taking into account the above factors. However, Table 4 shows the limits which
apply to ‘worst case’ and ‘best case’ in terms of the above parameters. In the
absence of sufficient information it would be necessary to assume ‘worst case’ and it
would not be allowed to use a single-channel subsystem, having zero hardware fault
tolerance, to support a safety function. However, provided that the specified criteria
can be fulfilled, in the ‘best case’ it would be possible to claim up to SIL 3 for a
single-channel subsystem.
Note: ‘Worst case’ is for a complex programmable subsystem, with low (or unknown) safe failure
fraction. ‘Best case’ is for a low complexity subsystem with a high safe failure fraction.
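The way the safe failure fraction and the hardware fault tolerance combine into a SIL limit can be sketched as a table lookup. The limits below are recalled from the architectural constraint tables of IEC 61508-2 for low complexity and complex subsystems and are an assumption to be checked against the standard. The function names and the use of 0 for ‘not allowed’ are hypothetical.

```python
def safe_failure_fraction(lambda_safe: float, lambda_dd: float, lambda_du: float) -> float:
    """Fraction of failures that are safe or are dangerous but detected."""
    total = lambda_safe + lambda_dd + lambda_du
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return (lambda_safe + lambda_dd) / total


# (lower bound of SFF band, maximum SIL for fault tolerance 0, 1, 2); 0 = not allowed.
# Assumption: values recalled from IEC 61508-2, not given in this article.
LOW_COMPLEXITY = [(0.99, (3, 4, 4)), (0.90, (3, 4, 4)), (0.60, (2, 3, 4)), (0.0, (1, 2, 3))]
COMPLEX = [(0.99, (3, 4, 4)), (0.90, (2, 3, 4)), (0.60, (1, 2, 3)), (0.0, (0, 1, 2))]


def max_sil(sff: float, fault_tolerance: int, complex_subsystem: bool) -> int:
    """Highest SIL that can be claimed for a subsystem (0 means not allowed)."""
    if fault_tolerance not in (0, 1, 2) or not 0.0 <= sff <= 1.0:
        raise ValueError("fault_tolerance must be 0, 1 or 2 and sff within [0, 1]")
    table = COMPLEX if complex_subsystem else LOW_COMPLEXITY
    for lower, limits in table:
        if sff >= lower:
            return limits[fault_tolerance]
    return 0


print(max_sil(0.50, 0, complex_subsystem=True))    # worst case, single channel
print(max_sil(0.995, 0, complex_subsystem=False))  # best case, single channel
```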
It is a central point of IEC 61508 that these architectural constraints impose a limit on
the SIL of a safety function which cannot be exceeded, even if the quantified
estimate of failure probability aligns with a higher SIL in terms of Table 2 or 3.
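In other words, the SIL that can be claimed for a safety function is the lowest of the SIL implied by the quantified failure probability and the architectural limits of the subsystems it uses. A minimal sketch with hypothetical names and figures:

```python
def claimable_sil(sil_from_failure_probability: int, subsystem_limits: dict) -> int:
    """SIL that can be claimed once architectural constraints are applied."""
    return min([sil_from_failure_probability, *subsystem_limits.values()])


# Illustrative: the quantified estimate aligns with SIL 3, but the sensor
# subsystem is limited to SIL 2 by its architecture.
limits = {"sensors": 2, "logic solver": 3, "actuators": 3}
print(claimable_sil(3, limits))
```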
Systematic faults are those faults, in either hardware or software, which will always
result in a failure when a particular combination of circumstances (e.g. environmental
conditions or input signal states) arises. Such faults are often introduced during the
specification and design phases, but can also result from errors introduced during
integration, operation and maintenance.
Technique/measure                                            SIL1  SIL2  SIL3  SIL4
Observance of guidelines and standards                       HR    HR    HR    HR
Project management                                           HR    HR    HR    HR
Documentation                                                HR    HR    HR    HR
Structured design                                            HR    HR    HR    HR
Modularisation                                               HR    HR    HR    HR
Use of well-tried components                                 R     R     R     R
Semi-formal methods                                          R     R     HR    HR
Checklists                                                   -     R     R     R
Computer-aided design tools                                  -     R     R     R
Simulation                                                   -     R     R     R
Inspection of the hardware or walk-through of the hardware   -     R     R     R
Formal methods                                               -     -     R     R
R = recommended
HR = highly recommended
All the techniques marked ‘R’ in the grey shaded group are replaceable, but at least one of these is
required.
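A table of this kind lends itself to a simple lookup when planning which techniques to apply for a given SIL. The sketch below encodes the rows exactly as listed above; the names `HARDWARE_TECHNIQUES` and `techniques` are hypothetical.

```python
# Recommendation grades for SIL1..SIL4, copied from the table above.
HARDWARE_TECHNIQUES = {
    "Observance of guidelines and standards": ("HR", "HR", "HR", "HR"),
    "Project management": ("HR", "HR", "HR", "HR"),
    "Documentation": ("HR", "HR", "HR", "HR"),
    "Structured design": ("HR", "HR", "HR", "HR"),
    "Modularisation": ("HR", "HR", "HR", "HR"),
    "Use of well-tried components": ("R", "R", "R", "R"),
    "Semi-formal methods": ("R", "R", "HR", "HR"),
    "Checklists": ("-", "R", "R", "R"),
    "Computer-aided design tools": ("-", "R", "R", "R"),
    "Simulation": ("-", "R", "R", "R"),
    "Inspection of the hardware or walk-through of the hardware": ("-", "R", "R", "R"),
    "Formal methods": ("-", "-", "R", "R"),
}


def techniques(sil: int, grade: str) -> list:
    """List the techniques carrying the given grade ('HR', 'R' or '-') at a SIL."""
    if sil not in (1, 2, 3, 4):
        raise ValueError("sil must be 1 to 4")
    return [name for name, grades in HARDWARE_TECHNIQUES.items()
            if grades[sil - 1] == grade]


print(techniques(3, "HR"))
```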
Technique/measure                        SIL1  SIL2  SIL3  SIL4
1 Fault detection and diagnosis          -     R     HR    HR
2 Error detecting and correcting codes   R     R     R     HR
Appropriate techniques/measures should be selected according to the safety integrity level. Alternate
or equivalent techniques/measures are indicated by a letter following the number. Only one of the
alternate or equivalent techniques/measures has to be satisfied.
Table 7 Examples of recommendations to avoid the introduction of faults during software design &
development (IEC 61508-3)
Proven-in-Use
The measures and techniques for the avoidance & control of systematic faults as
recommended by IEC 61508 generally have to be incorporated as the system
progresses through the various phases of the safety lifecycle. It therefore may not
be possible to claim that an existing item of equipment, not designed according to
IEC 61508, is compliant with the standard in this regard. Nevertheless, there may
be a high degree of confidence, resulting from the previous use of the equipment in a
similar application, that the performance of the equipment, with regard to both
random hardware failures and systematic failures is such that the target failure
measure for the E/E/PE safety-related system can be achieved. In fact, the
previous use of both hardware and software can be a very effective way of proving
the suitability of equipment for use in a safety-related application.
However, this route should be used with extreme caution, especially in relation to
programmable electronic systems, because even minor differences between a
previous application and the intended one can be the cause of unrevealed systematic
faults. IEC 61508-2
defines the criteria which allow the use of such a ‘proven-in-use’ subsystem (which
might comprise both hardware & software). Key factors are the adequacy of the
records of past failures and the match between the previous conditions of use and
those which will be experienced in the intended application. Where there is any
mismatch it will be necessary to undertake analysis and/or testing to demonstrate
that the likelihood of unrevealed systematic faults is low enough.
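One way of seeing why the adequacy of the failure records matters is to ask how much failure-free operating experience is needed to support a given failure rate claim. The sketch below uses a standard statistical bound for zero observed failures under an assumed constant failure rate. It is an illustration only and is not the set of criteria defined in IEC 61508-2; the function name and figures are hypothetical.

```python
import math


def failure_rate_upper_bound(operating_hours: float, confidence: float = 0.95) -> float:
    """Upper confidence bound on a constant failure rate (per hour) when no
    failures were recorded in the given cumulative operating time."""
    if operating_hours <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need positive hours and a confidence between 0 and 1")
    return -math.log(1.0 - confidence) / operating_hours


# Illustrative: 100 units in service for about 10 000 hours each, no recorded failures.
print(f"{failure_rate_upper_bound(100 * 10_000):.2e} per hour")
```

The bound is only as good as the records behind it: unreported failures, or operating conditions that differ from the intended application, invalidate the claim.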
Requirements for subsystems
Conclusion
IEC 61508 provides a methodology for the determination of the safety requirements
specification of safety-related systems based on electrical / electronic /
programmable electronic technology. The aim is to ensure that the design and
performance of such systems is adequate to meet tolerable risk targets, taking into
account all sources of failure including random hardware faults and systematic faults in
both hardware and software.
The standard is set to influence the requirements for subsystems such as sensors,
logic controllers, signal processing electronics and actuators used in safety-related
applications.
Acknowledgment
This article is based on the work of the many experts within the IEC working groups
(IEC SC65A WG9, WG10) responsible for the development of IEC 61508. The
author gratefully acknowledges that work as the basis for this article.
Further reading
References
[Figure 1 Overall Safety Lifecycle. Phases recoverable from the figure: Concept; Overall scope
definition; Overall safety requirements; Safety requirements allocation; Overall installation and
commissioning; Decommissioning or disposal.]
NOTE Activities relating to verification, management of functional safety and functional safety
assessment are not shown but are relevant to all the lifecycle phases.
[Figure 2 System boundary containing hazards H1, H2, H3 with associated risks R1, R2, R3.]