MANAGING THE RISKS

OF BLACKOUT
For passenger ship owners and operators

GUIDANCE PAPER
MARITIME Managing the risks of blackout

Content

Editorial
Reasons for concern
Time for a step change in safety

Step 1: Increase understanding of blackout
1.1 Investigate the underlying causes of blackout
1.2 Understand the regulatory framework
1.3 Apply a barrier approach
1.4 Establish a holistic risk picture
1.5 Recommendations and best practices

Step 2: Define the organization’s safety ambition and manage conflicting goals
2.1 Set your safety ambition
2.2 Manage conflicting goals
2.3 Operationalize your commitment to change
2.4 Recommendations and best practices

Step 3: Identify measures to ensure safe and reliable vessel operations
3.1 Implement robust operating modes
3.2 Ensure safe and reliable closed-bus operations
3.3 Ensure correct maintenance and operation of machinery
3.4 Manage software and networks
3.5 Provide training and decision support for crew
3.6 Implement enhanced blackout testing
3.7 Implement dynamic barrier monitoring
3.8 Recommendations and best practices

Step 4: Identify measures to ensure safe and reliable newbuilds
4.1 Apply the principles of human-centred design
4.2 Ensure robust design for closed-bus operations
4.3 Improved integration, testing and verification
4.4 Design effective blackout-recovery systems
4.5 Utilize battery systems
4.6 Recommendations and best practices

Step 5: Prioritize and implement cost-efficient prevention and mitigation measures
5.1 Cost-benefit evaluations
5.2 Recommendations and best practices

Conclusion
References
Abbreviations and definitions
Appendix A: Blackout preparedness – self assessment
Appendix B: Guidance for FMEA analysis
Appendix C: Enhanced protection measures for closed-bus operations and blackout recovery
Appendix D: Enhanced system integration and verification for newbuilds
Appendix E: Enhanced blackout prevention test
Appendix F: Enhanced blackout recovery test

Disclaimer: This document is not meant to replace any rules, regulations or guidelines that are in existence. It is a compilation of experiences, practices and information gathered from various sources in industry. It is expected that compliance with applicable class rules and statutory requirements will be ensured.


Editorial
Most operators of passenger ships occasionally experience blackout with subsequent temporary loss of propulsion. Fortunately, most incidents do not have significant consequences, as they usually occur while in transit in open sea. Still, more can be done to reduce the likelihood that such events occur, so that they do not happen in more high-risk situations. There is also a need to ensure efficient restoration of essential systems once a blackout and/or loss of propulsion has occurred.

The underlying causes of blackouts can often be traced back to the operation of complex integrated systems. In order to reduce the carbon footprint and utilize new technology in a cost-efficient way, the systems tend to become more complex at an ever-increasing level of integration. Today, the complexity and level of system integration challenge our ability to understand in depth how these systems work. This has become an increasing concern for the whole industry.

To support owners and operators in ensuring the safe and reliable operation of their fleet, DNV developed a stepwise approach for managing the risks of blackout and resulting loss of propulsion. This guidance paper provides recommendations and best practices for fleets in operation as well as newbuilds.

We invite you to compare these best practices against your own operations. We want to offer inspiration on how to ensure more robust and fault-tolerant operations of your ships.

We look forward to engaging in discussions and receiving your feedback. Together, we can drive the safety in your business forward.

Hans Eivind Siewers

Segment Director Passenger Ships & RoRo

DNV


Reasons for concern


Blackouts and resulting loss of propulsion have long been considered a major accident
hazard for the passenger industry. Depending on the operational situation, loss of
propulsion may pose an imminent threat to the ship and its passengers and crew.

Blackout
The focus of this guidance paper is on blackout that results in loss of propulsion. Blackout occurs when there is a sudden total loss of electric power in the ship’s main power distribution system. This could be caused by various mechanical or electrical failures in the power generation, distribution or propulsion systems, coupled with an ineffective operational response to the failure.

In ships with diesel or gas-electrical propulsion systems, a blackout will cause immediate loss of propulsion and steering. Propulsion is then lost until the standby generator(s) are started, the main source of power feeds the power distribution system, and propulsion units are re-connected.

Depending on the type of failure causing blackout, the system design and operational configuration, the blackout recovery process may be completed within a minute in the best-case scenario, but in the worst case, the recovery process may not happen in time to prevent a disaster.

Research from DNV found that in 2019, the media reported 12 power loss events on cruise ships that resulted in full or partial blackout while in transit or manoeuvring. This was a significant increase from four events in the previous year. These incidents are a driver for stakeholders in the passenger ship industry to stop and reflect on what can be done to reduce the risk of blackout and consequential loss of propulsion, in order to ensure safe operations.

Damage potential
Whether loss of propulsion poses an imminent threat to the ship and its passengers and crew depends on the operational situation. Incidents that occur during operations in confined waterways or during port manoeuvring, transit close to shore, combined with severe weather conditions, have a higher severity potential than incidents occurring while the vessel is in open sea. The time it takes to recover from blackout in these situations is critical, because it may be too late to restore propulsion in time to avoid accidents.

Major incidents may also negatively affect company reputation through global media coverage. Today’s incidents almost instantly spread through social media. This may have a major impact on earnings, profit and shareholder value.


Time for a step change in safety


Blackouts should no longer be considered unfortunate or rare events. By implementing the best practices and recommendations in this guidance paper, the industry can significantly reduce the risks of blackout and loss of propulsion, and thereby achieve a step change in safety.

Purpose of the paper
This guidance paper offers the passenger ship industry best practices to:

• Improve general understanding of the risks associated with blackout and loss of propulsion.
• Reduce the risk of blackout (e.g. by ensuring that power systems are redundant and fault tolerant).
• Ensure fast and reliable means of system recovery.

As such, this guidance paper offers support to improve how we approach and control blackout risk through prevention and recovery mechanisms. Through implementing the best practices and recommendations from this guidance paper, the industry should succeed in reducing the risk.

Scope
This guidance paper is written predominantly to spark discussions with passenger ship operators and owners, such as cruise, RoPax, and expedition/exploration ships. Many of the principles, however, may be extrapolated to other segments, to other safety-related issues, and to other industry stakeholders who play a role in the design, building, procurement and operation of ships, including designers, yards, vessel managers, class and flag.

The primary focus of the guidance paper is on ships with electrical propulsion (i.e. ships where a blackout immediately causes loss of propulsion). The main hazard of concern is loss of propulsion caused by blackout. The scope excludes loss of propulsion that is caused by mechanical failures related to shaft, stern tube bearing, propellers and pods.

A stepwise approach to prevent blackouts
This guidance paper builds on a study that was based on input received from the passenger ship industry and DNV’s expert resources. During this study, DNV analysed incident statistics, performed literature reviews, conducted workshops with key industry operators and collaborated with expert resources to gain insight into:

a) The likelihood that blackouts occur
b) The underlying interrelated causes of failures that lead to blackout and consequential loss of propulsion
c) The factors influencing successful blackout recovery

The outcome of the study is a stepwise approach to managing the risks of blackout as illustrated in Figure 1. This guidance paper is structured around each of the steps and concludes with best practices and recommendations to help the reader prevent a blackout and mitigate its consequences.

FIGURE 1
Stepwise approach to managing the risks of blackout

STEP 1: Increase understanding of blackout.
STEP 2: Define safety ambitions and manage conflicting goals.
STEP 3: Identify measures to ensure safe and reliable vessel operations.
STEP 4: Identify measures to ensure safe and reliable newbuilds.
STEP 5: Prioritize and implement cost-efficient prevention and mitigation measures.


Step 1:
Increase understanding of blackout
In order to achieve a step change in safety for loss of propulsion, it is necessary to gain an
overall understanding of causes of blackouts and the regulatory framework. A barrier-based
and holistic approach to managing risk offers practical tools and a helpful mindset.

1.1 Investigate the underlying causes of blackout
The purpose of an investigation should always be to maximize the lessons learned from unexpected events to prevent re-occurrence. However, while many organizations invest time and money in performing investigations, they often lack a feedback loop that allows the sharing of lessons learned and that helps the organization to learn from the outcome of an investigation. It is worth reflecting on how many major and minor blackout events are reported and/or investigated, and how many of these investigations have been shared to support organizational learning.

Establishing a feedback loop requires management commitment and a level of risk awareness that acknowledges the importance of incident investigations. This calls for a standardized, systematic and traceable investigation methodology that is embedded in the safety management system (SMS) and that allows organizations to identify the root causes of incidents and to derive the necessary cultural changes.

1.2 Understand the regulatory framework
To understand blackout, one needs to understand how the regulatory framework influences ship design and operation. The main class rules of a classification society are the mandatory requirements that generally provide a minimum technical standard (Figure 2). Therefore, vessel owners may be inclined to exceed minimum requirements if they operate under functional requirements and have an ambition to achieve greater redundancy, reliability, operability and maintainability.

As the default requirement for shipyards is to fulfil mandatory class rules and statutory regulations, any improvements to the blackout prevention and recovery system need to be explicitly agreed in the shipyard contract for the vessel.

While main class rules focus mostly on system reliability and response to failures through standards for design, construction, commissioning and compliance inspections, a blackout may be caused by numerous technical or operational failures that may not conflict with the rules.

FIGURE 2
Level of safety based on layers of requirements

Layers, in order of increasing safety level ambition:
• Mandatory requirements: main class; statutory requirements (e.g. SOLAS, ISM)
• Voluntary requirements: class notations (e.g. OR, DYNPOS, Cybersecure, ISDS)
• Owner/operator’s requirements: performance requirements


In certain market segments, a blackout and even a temporary loss of electrical power, propulsion or manoeuvring capabilities may impose an immediate and high risk, for instance in connection with diving operations, anchor handling or seismic streamlining.

The range of additional DYNPOS and RP class notations extends the requirements of main class and is intended to increase the fault tolerance and minimize the risk of functional loss in exposed operations.

The following sections will summarize the rules and requirements concerning blackout prevention and recovery.

Main class requirements on blackout prevention and recovery
The most essential requirements in DNV’s main class rules on blackout prevention are:

• The power system shall be arranged with automatic load shedding, or load reduction, to prevent overloading of the running generator(s).
• When several generators are running in parallel, tripping of one power unit shall not result in overload or tripping of the remaining unit(s).
• There shall be interlocks to ensure that enough generators are connected before large motors are started.
• Essential consumers serving the same service shall be distributed between the two sections of the main switchboard.
• There shall be discrimination in the electrical protection system to ensure that only the switching device nearest to the fault is activated.

The most essential requirements in DNV’s main class rules on blackout recovery are:

• There shall be at least two main generator sets arranged for blackout starting, and these generator sets shall be connected to separate busbar sections of the main switchboard.
• Stored energy for blackout recovery:
  • At least two sources of stored energy shall be arranged for blackout recovery. The generator sets shall be divided between the power sources. The capacity shall be enough for three starting attempts on each engine.
  • If power supply to auxiliary systems, such as governors, voltage regulators, switchboard control, fuel supply, etc., is needed for the blackout start, the power supplies to these systems shall be arranged as the energy for starting. The capacity of these power sources shall correspond to the required number of starting attempts and/or last for at least 30 minutes.
  • Engines in standby mode will usually be arranged with heating and/or lubrication oil priming. These systems do not have to be supplied during a blackout situation, provided start blocking is not activated within 30 minutes after the blackout.
• Automatic start and connection of the standby generator is required in case of blackout. The standby power source shall be started and connected to the main switchboard within 45 seconds. Essential auxiliaries shall then be automatically re-started.

The 45-second requirement is the maximum time for regaining power on the main switchboard. Still, additional time is required to connect propulsion units back on the grid.

If the vessel has an E0 notation (unmanned machinery), the propulsion plant shall be automatically re-started, or it shall be possible to be manually started from the navigation bridge. The starting arrangement shall be simple to operate.
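
The timing and stored-energy figures in the main class requirements on blackout recovery lend themselves to a simple automated check of recovery test records. The following is a minimal Python sketch, not a class-approved tool: the `RecoveryTest` record, its field names and the sample values are hypothetical, and only the 45-second and three-start-attempt limits come from the requirements summarized in this section. The time needed to reconnect propulsion is reported but not judged, since no limit for it is quoted here.

```python
from dataclasses import dataclass

# Limits taken from the main class summary in this section.
MAX_SWITCHBOARD_RESTORE_S = 45.0  # standby source connected to main switchboard
MIN_START_ATTEMPTS = 3            # stored energy: starting attempts per engine


@dataclass
class RecoveryTest:
    """One blackout-recovery test record (hypothetical field names)."""
    switchboard_restored_s: float   # blackout -> power back on main switchboard
    propulsion_restored_s: float    # blackout -> propulsion units reconnected
    start_attempts_per_engine: int  # attempts the stored energy can deliver


def check_recovery(test: RecoveryTest) -> list[str]:
    """Return a list of findings; an empty list means both limits were met."""
    findings = []
    if test.switchboard_restored_s > MAX_SWITCHBOARD_RESTORE_S:
        findings.append(
            f"switchboard restored after {test.switchboard_restored_s:.0f} s "
            f"(limit {MAX_SWITCHBOARD_RESTORE_S:.0f} s)"
        )
    if test.start_attempts_per_engine < MIN_START_ATTEMPTS:
        findings.append(
            f"stored energy covers {test.start_attempts_per_engine} start "
            f"attempts per engine (minimum {MIN_START_ATTEMPTS})"
        )
    return findings


def propulsion_delay_s(test: RecoveryTest) -> float:
    """Extra time between power on the switchboard and propulsion reconnected."""
    return test.propulsion_restored_s - test.switchboard_restored_s


# Illustrative values only, not measured data.
ok_test = RecoveryTest(38.0, 95.0, 3)
late_test = RecoveryTest(52.0, 140.0, 2)

for name, test in (("ok_test", ok_test), ("late_test", late_test)):
    print(name, check_recovery(test), f"propulsion delay {propulsion_delay_s(test):.0f} s")
```

Tracking the propulsion delay separately reflects the point made above: meeting the 45-second limit on the switchboard does not by itself say how quickly propulsion returns.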


SOLAS requirements on emergency power systems
The SOLAS requirements state that the main and emergency power systems shall be mutually independent, also with respect to blackout recovery. In case of blackout, the interconnecting feeder between the main and emergency switchboards shall be automatically disconnected, and the two systems shall recover from the blackout independent of each other. If the emergency power source is a generator, it shall be automatically started and supply the required services within 45 seconds.

Blackout recovery of both the main and emergency power systems is tested on board both during the newbuilding phase and annually when in service. The tests shall ensure that blackout recovery of the two systems is mutually independent.

SOLAS requirements for Safe Return to Port (SRtP)
The SRtP regulations apply to passenger ships above a certain size, and the overall intention is to increase the safety level and reduce the likelihood of evacuation. This is achieved through more redundant and segregated system arrangements, providing increased robustness and fault tolerance after incidents of fire or flooding.

Although SRtP does not specifically address blackout events, the SRtP regulations ensure redundancy and segregated machinery arrangements that, depending on the operational configuration, increase the reliability of the propulsion and steering function.

New class notation – Operational Reliability (OR)
A new additional class notation, OR, specifically targeting operational reliability, blackout prevention and system recovery in passenger ships was launched in 2021. The notation builds upon the general principles of the SRtP scheme and extends the requirements with key elements and practices from the dynamic positioning and redundant propulsion class notations. The OR notation addresses three main areas covered by different qualifiers:

ER – enhanced reliability of propulsion, steering and electrical power; minimizing the risk of functional loss and enabling quick restoration
EMR – enhanced manoeuvring reliability, targeting the reliability of the manoeuvring thrusters and the DP system
OP – operational flexibility and predictability during machinery damage or maintenance

Voluntary notations – Redundant propulsion (RP and RP+)
The range of RP notations gives additional requirements to ensure that the propulsion and steering systems are redundant and arranged so that after a single failure as specified in the rules, propulsion and steering can be recovered within a specified time. For the RP(2,x) notation, the failure modes include component failure, while for the higher notation RP(3,x), the systems shall be arranged with segregation to also cover incidents of fire or flooding. For both RP(2,x) and RP(3,x), an additional qualifier, +, can be included to further reduce the risk of functional loss; the systems shall be designed for continuous availability.

Voluntary notations – Dynamic positioning (DYNPOS and DPS)
The range of class notations for dynamic positioning covers all types of vessels engaged in any dynamic positioning operation. The requirements for availability, fault tolerance and robustness in the dynamic positioning capabilities escalate with the higher level of the notations. For the highest level, DYNPOS (AUTRO) and DPS(3), the DP systems shall be designed with redundancy and arranged with segregation to provide continuous availability also in the event of component failure or incidents of fire or flooding.

Always be prepared for the unexpected
All owners and operators must have contingency planning for shipboard emergencies (as part of the ISM Code) in place that to some degree can manage the unexpected. Always being prepared for the unexpected is applicable to all operations and to all types of ships. We cannot rule out the unexpected, but this guidance paper can help to manage the expected.


1.3 Apply a barrier approach

The management of major accident risk requires good systems that capture the complexity and reduce the uncertainty associated with major accidents. Barrier management is an approach that enables stakeholders to have a comprehensive and common understanding – from design and throughout operation – of which barriers should be implemented to protect from hazards, and how these barriers should be verified, monitored and maintained.

For the barriers to be successful in preventing hazards from developing into a major accident and in mitigating the consequences of a major accident, barriers need to be managed so that they perform as expected.

Simplified bow tie for blackout
Bow tie is one of many barrier visualizations of risk models that are available to assist in the identification and management of risks. The benefit of using bow ties is that they visualize the risk you are dealing with in just one, easy-to-understand diagram. The diagram is shaped like a bow tie, creating a clear differentiation between preventive measures (reducing frequency/probability) and mitigating measures (reducing consequences).

Figure 3 shows a simplified bow-tie barrier diagram to present the threats and barriers that contribute to increasing/decreasing the likelihood of blackout and the mitigating barriers to improve recovery. The bow tie is a generic aggregation of multiple Swiss Cheese models [13], each presenting a single event trajectory.

The purpose of this generic bow tie is to be able to apply it to any blackout incident to a) retrospectively understand what may have gone wrong during an incident, and b) proactively plan to improve the management of the relevant barriers.

In Figure 3, power generation, power distribution and electrical consumer failures (e.g. failures in pods) are threats that may result in blackout (and loss of propulsion). In general, there are barriers to prevent electrical and mechanical failure, and barriers to prevent fault escalation, in case the first barrier fails.

If the preventive safety barriers fail, it will lead to a blackout and ultimately loss of propulsion. Mitigation barriers are then intended to ensure automatic or manual recovery. The objective of the barriers is to avoid sustained loss of propulsion, with potential consequences such as drift grounding, an allision, collision or heavy rolling.

FIGURE 3
Simplified barrier model (bow tie) for blackout / loss of propulsion

Threats: power generation failure; power distribution failure; electrical consumer failure
Preventive barriers: prevent mechanical and electrical faults; prevent fault escalation (“fault tolerance”)
Event: blackout (loss of propulsion)
Mitigating barriers: auto recovery; manual recovery
If recovery fails: sustained blackout (loss of propulsion)
Consequences: drift grounding; collision; heavy rolling
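
For readers who want to track barrier status in software, the bow tie of Figure 3 can be captured in a small data structure. The Python sketch below is one possible representation under stated assumptions: the `Barrier` and `BowTie` classes, the `healthy` flag and the `degraded` helper are hypothetical names introduced for illustration, while the threats, barriers and consequences are those named in Figure 3.

```python
from dataclasses import dataclass, field


@dataclass
class Barrier:
    name: str
    kind: str             # "preventive" or "mitigating"
    healthy: bool = True  # hypothetical status flag, e.g. set after a test


@dataclass
class BowTie:
    event: str
    threats: list[str]
    consequences: list[str]
    barriers: list[Barrier] = field(default_factory=list)

    def degraded(self, kind: str) -> list[str]:
        """Names of barriers of the given kind that are not performing as expected."""
        return [b.name for b in self.barriers if b.kind == kind and not b.healthy]


# Content taken from Figure 3.
blackout_bow_tie = BowTie(
    event="Blackout (loss of propulsion)",
    threats=[
        "Power generation failure",
        "Power distribution failure",
        "Electrical consumer failure",
    ],
    consequences=["Drift grounding", "Collision", "Heavy rolling"],
    barriers=[
        Barrier("Prevent mechanical and electrical faults", "preventive"),
        Barrier("Prevent fault escalation (fault tolerance)", "preventive"),
        Barrier("Auto recovery", "mitigating"),
        Barrier("Manual recovery", "mitigating"),
    ],
)

# Illustration only: mark one barrier as degraded, e.g. after a failed recovery test.
blackout_bow_tie.barriers[2].healthy = False
print(blackout_bow_tie.degraded("mitigating"))
```

Keeping preventive and mitigating barriers in one structure mirrors the two sides of the diagram and supports both uses named above: reviewing an incident afterwards and planning barrier improvements in advance.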


1.4 Establish a holistic risk picture


Barriers have a function (a barrier function) that contributes to managing risk. Barriers can be characterized as technical or operational barrier elements. The nature of operational barrier elements can be organizational or individual. Risks can only be controlled if all factors influencing the human (H), organizational (O) and technical (T) barrier elements are identified and managed. A good understanding of how these three components of a system interact is also necessary. This interaction (HOT) is what ultimately manifests itself in human behaviour, indicating how well the system may be functioning.

A HOT approach to safety
DNV always challenges the industry to analyse each element in HOT as part of a larger system: technical findings should be questioned in light of organizational or human influencing factors, human factors should be studied in light of technical and organizational factors, and organizational factors should be discussed in light of technical and human factors.

As an example, a company’s expectation that the ship reaches a port according to schedule can influence whether crew challenges the decision to sail in treacherous weather. Incomplete verification and/or testing of systems can lead to unsafe situations, while insufficient training and experience in combination with poorly designed systems can reduce the likelihood that system failures are detected before they escalate. Taking a HOT approach is relevant across the lifetime of a ship: from the design of a ship, to the operation and maintenance of a ship, as well as to the continuous process of learning from events.

FIGURE 4
Answering three key questions relating to the human (H), organizational (O) and technological (T) elements helps companies optimize their safety management performance and foster a mature safety culture.


1.5 Recommendations and best practices


Recommendations and best practices for vessel managers and crew on board are listed below by topic.

Investigate the underlying causes of blackout
• Systematically monitor trends in blackouts and propulsion losses, and report KPIs (e.g. recovery time during a drill) to senior management.
• Conduct follow-up meetings with equipment suppliers after incidents.
• Implement a standardized, systematic and traceable incident investigation process. Ensure that the interdependencies between the human, organizational and technological (HOT) factors are addressed in the incident investigation process.
• Share data and knowledge on causes of blackouts (e.g. anonymized data and lessons learned from incident investigations) with fleet, and possibly through joint industry collaborations.
• Report all incidents, near misses and accidents.

Understand the regulatory framework
• Be familiar with the limitations of main class requirements regarding blackout and loss of propulsion.
• Understand the main differences between the various voluntary notations.
• Be familiar with the vessel-specific systems and their limitations which can prevent blackout and support recovery of propulsion.

Apply a barrier approach
• Use barrier models as a tool to communicate with employees and suppliers about blackout risk and the importance of preventive and mitigating safety barriers.

Establish a holistic risk picture
• Ensure that the interdependencies between the HOT elements are addressed in strategies and operational plans for blackout prevention and recovery.
• Communicate an aligned approach that accounts for each of the HOT elements in preventing and mitigating loss of propulsion.
• Create a low-hurdle infrastructure for all employees to communicate feedback on the strategies and operational goals back to the organization.
• Provide feedback to the organization to improve alignment between HOT elements. Examples: suggestions for changes to training and system design to improve crew performance and meet company expectations.
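
The recommendation to monitor trends and report KPIs such as recovery time during a drill can be illustrated with a short calculation. The Python sketch below is only one way to do this: the log format, the field names (`year`, `kind`, `recovery_s`) and the sample entries are invented for illustration and do not come from any real fleet data.

```python
from statistics import median


def recovery_kpis(events):
    """Summarize recovery times (in seconds) for reporting to management."""
    times = [e["recovery_s"] for e in events]
    if not times:
        return {"count": 0, "median_recovery_s": None, "worst_recovery_s": None}
    return {
        "count": len(times),
        "median_recovery_s": median(times),
        "worst_recovery_s": max(times),
    }


def events_per_year(events):
    """Count logged events per year so that trends become visible."""
    counts = {}
    for e in events:
        counts[e["year"]] = counts.get(e["year"], 0) + 1
    return dict(sorted(counts.items()))


# Invented sample log covering both drills and real incidents.
log = [
    {"year": 2022, "kind": "drill", "recovery_s": 40},
    {"year": 2022, "kind": "incident", "recovery_s": 70},
    {"year": 2023, "kind": "drill", "recovery_s": 55},
]

print(recovery_kpis(log))
print(events_per_year(log))
```

Reporting the worst recovery time alongside the median keeps attention on the slow recoveries, which matter most when a blackout occurs close to shore or in confined waters.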


Step 2:
Define the organization’s safety ambition and
manage conflicting goals

Setting an organization-level ambition for minimizing the risk of loss of propulsion and mitigating its consequences is the first step to ensuring safe and effective operations. Owners and operators need to agree internally on their ambition, so that they do not run the risk of prioritizing other organizational goals at the expense of safety.

2.1 Set your safety ambition

Compliance with existing safety requirements is often not enough to meet the organization’s ambition on blackout and loss of propulsion. Historically, updates to rules and regulations have come only after a major accident. Therefore, major passenger ship owners and operators cannot afford to wait for regulations to raise the bar when it comes to safety.

Goal or vision?
An ambition is important because it drives how and to what degree the organization’s vision, values and goals with respect to loss of propulsion are supported and implemented by the organization [3]. The organization needs to be clear on whether their ambition is a goal or a vision; the goals are the specific targets that move the organization towards the vision. Although a Zero Vision (e.g. zero blackouts, zero loss of propulsion, zero accidents) is tempting, it may be unrealistic. Setting unrealistic goals means that the goals will be unachievable, which is demotivating to the employees. A more realistic vision would be to aim for as few blackouts as possible, and the goals to reach this vision can differ from ship to ship, as each ship has its own context to relate to.

The ambition-setting process
The process of establishing the safety ambition is valuable because it offers the opportunity for senior management to iron out any conflicting goals (see chapter 2.2) that may exist at the top level of the organization. The ambition-setting process should involve all stakeholders, including employees closest to operations, to ensure that all perspectives and relevant experience are made available and taken into consideration.

Once the ambition is set, the organization should prioritize effective communication of the ambition to all managers and employees. Employees do not need to know as many details of the ambition as managers do, but they should know and understand its major intentions. This is to encourage involvement and to ensure ownership of the ambition.

Examples of an organization’s safety ambition

• Reduce number of loss of propulsion incidents in critical operations to X events per year.
• Reduce number of blackouts / loss of propulsion incidents to X events per year.
• Zero blackouts / loss of propulsion incidents in critical operations.
• Recovery of propulsion within X minutes/seconds.
• Recover propulsion before losing steering speed.
• No single failure of a component shall have a greater effect on the vessel’s ability to maintain propulsion and steering
than the loss of X generators/thrusters on the same bus section. Such a failure represents loss of X% of power
capability.
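
The last example ambition ties a number of lost generators on one bus section to a percentage of power capability. A short worked calculation shows how such a figure could be derived for a given plant. The Python sketch below makes several assumptions: the function names, the two-section layout and the 12 MW generator ratings are invented for illustration, and the worst case is taken as losing the largest generators on the affected section.

```python
def loss_percent(bus_sections, section, generators_lost):
    """Percentage of total generating capacity lost when the given number of
    generators (largest first) is lost on one bus section."""
    total = sum(sum(ratings) for ratings in bus_sections.values())
    if total <= 0:
        raise ValueError("total installed capacity must be positive")
    ratings = sorted(bus_sections[section], reverse=True)
    if generators_lost > len(ratings):
        raise ValueError("more generators lost than installed on this section")
    return 100.0 * sum(ratings[:generators_lost]) / total


def meets_ambition(bus_sections, generators_lost, max_loss_percent):
    """True if losing that many generators on any one section stays within the target."""
    return all(
        loss_percent(bus_sections, section, generators_lost) <= max_loss_percent
        for section in bus_sections
    )


# Invented plant: two bus sections with two 12 MW generators each.
plant = {"A": [12.0, 12.0], "B": [12.0, 12.0]}

print(loss_percent(plant, "A", 1))     # one generator lost on section A
print(loss_percent(plant, "A", 2))     # whole of section A lost
print(meets_ambition(plant, 1, 25.0))  # ambition: lose at most 25% of capability
```

For this invented plant, losing one generator removes a quarter of the installed capacity and losing a whole bus section removes half, which shows why the choice of X in the ambition depends on how generators are distributed across the switchboard.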


2.2 Manage conflicting goals

Officers and crew often find themselves amid dilemmas that require management backing and appropriate policies. They may want to choose a more robust machinery mode when entering a critical operation but refrain from doing so because of increased fuel cost. They may want to test a function but cannot because of the itinerary or discomfort to passengers. During the newbuilding phase, sea trial testing is often reduced to the very minimum to save cost at the expense of system reliability.

From a management perspective, the rapid transformations in the industry associated with decarbonization, connectivity and digitalization make it more important than ever to pull the focus back to safety and to establish a safety ambition that lays the foundation for ways of working, for design of technology on board, and for regulatory requirements.

Some transformations and conflicts that may influence how management and crew operate ships today are:

• The focus on lowering costs (both CAPEX and OPEX)
• Stricter rules, regulations and company policies for minimizing the carbon footprint
• The expectation of increased connectivity
• Inter-organizational goals
• Commercial pressures
• Bonus scheme incentives

The focus on lowering costs
Operating in a cyclical market, stakeholders in shipping seek to lower costs as far as practically possible. Over the years, capital expenditures (CAPEX) have been reduced to a minimum, while more is expected to be saved on the part of operating expenses (OPEX). The more negotiations push down prices, the more measures to reduce cost are implemented at the expense of quality and safety. This will influence both the reliability and operability of assets as well as the competence and experience of the people working with the assets. This push and pull between lowering cost and freeing up enough financial resources to fund investments in assets and people can create conflicting goals that stand in the way of the overarching ambition to operate safely and effectively.

Stricter rules, regulations and company policies for minimizing the carbon footprint
The shipping industry is expected to act upon the Paris Agreement and reduce greenhouse gas emissions. In April 2018, the IMO adopted a greenhouse-gas reduction strategy with a vision to decarbonize shipping as soon as possible within this century. The aim is to reduce total greenhouse gas emissions from shipping by at least 50% by 2050 compared with 2008 levels.

This strategy will likely call for widespread uptake of zero-carbon fuels, in addition to other energy efficiency measures and new technologies. A natural way to save fuel and reduce emissions is to minimize the number of running engines on board and operate with closed bus tie, which may have an impact on system reliability and operational risk, as explained in chapter 3.2. Other examples are SECA regulations that set limits to SOx levels. If the fuel switchover procedure is performed incorrectly, engines may be affected and shut down.


Expectation towards increased connectivity
Connectivity and digitalization are other significant technological changes in shipping. Organizational goals related to digital business transformations are emerging. This concerns how data is being generated, shared, stored and analysed, at an increasing speed. Increased connectivity between vessels and shore may lead to an increased exposure to cyber threats, and security measures should be implemented as an inherent part of the change management process.

Inter-organizational goals
Departments of many organizations tend to work in silos. This practice is rooted in how organizations historically developed to focus attention first on productivity, followed in time by quality, safety and reliability. As such, each department has goals to meet (higher revenue, lower cost, higher efficiency, highest reliability) which can be overshadowed by risks that threaten the prosperity or survival of the business.

Commercial pressures
The challenge for the workforce is that organizational goals may conflict with each other. To generate higher revenue, the ship must arrive in port on time and turn around as soon as possible to reach the next destination as per customer expectations. A demanding itinerary contributes to crew fatigue, which can affect quality and safety of operations, especially if the ship has low par levels and there are difficulties in recruiting competent workforce. This creates a catch-22 situation where, despite maximum effort, crew cannot meet all expectations and receive negative feedback (e.g. audit findings, negative appraisals) from stakeholders in the organization whose requirements have not been met.

Bonus scheme incentives
Organizational goals like speed and production are often reinforced by performance agreements or bonuses. However, bonuses can have contradictory effects on the performance of a vessel in different situations. If a port call is to be made, senior on-board officers can feel pressured to do the call despite challenging circumstances, if they are incentivized by guest satisfaction comments which tend to be unfavourable for missed port calls. Similarly, if the ship must enter or depart from a port under challenging environmental conditions, then senior officers who are incentivized to minimize fuel consumption could be pressured into running fewer engines and compromising safety during the operation.

These incentives should be reconsidered, because they can impede the organization’s ability to maintain safe operations and meet its safety goals [4]. The organization will be better prepared to prevent and mitigate critical events, such as loss of propulsion, if incentives are connected to leading indicators such as how many corrective actions are reported.


2.3 Operationalize your commitment to change


Expressing the wish to make a step change in safety, setting a safety ambition and managing conflicting goals are essential foundations for making a difference. However, once this foundation is set, many organizations come to a point where they struggle to convert theory into practice. Often, a department or person is appointed as a dedicated resource to implement the change, but then these persons are left without management backing. This illustrates that managers spend a lot of time on input and output, but less on the throughput.

Management commitment is not only necessary to establish the organization’s direction to prevent blackout and loss of propulsion; it is equally important to set aside time and resources to follow through on the organization’s ambition, vision and goals. This means that the person who is put in charge of changing organizational practice should get time to work on the task and resources to help perform the task and to share knowledge and insight into what steps should be taken to complete the task successfully.

First, the overall ambition, vision and goals need to be broken down into concrete tasks that can be delegated to responsible entities in the organization. For each task, it should be made clear who is responsible and/or accountable for completing the task, who needs to be consulted, and who needs to be informed about progress. The work that the task entails should be planned along a timeline with mid-term goals, and criteria should be set for how success from mid-term goal to mid-term goal will be measured. This creates a structured and systematic approach that can guide the organization through the process of operationalizing its commitment to reducing the risk of blackout and loss of propulsion.


2.4 Recommendations and best practices


Recommendations and best practices for vessel managers and crew on board are provided in the
table below.

Topic: Establish and implement an ambition
• Engage all stakeholders in establishing an ambition on blackout / loss of propulsion that fits the vision, values and goals of the organization with respect to minimizing the risk of blackout and mitigating its consequences.
• Implement the ambition in the organization’s strategy and procedures and ensure continuity throughout the safety management system. Ensure that the organization’s ambition is embedded in newbuild specifications.

Topic: Communicate the ambition
• Establish a plan for communicating the ambition from one layer of the organization to the next to ensure that a unified view is shared with all employees.
• Create a low-hurdle infrastructure for:
  (a) employees to communicate any feedback on the organization’s ambition and any conflicting goals back to the organization,
  (b) the organization to convert recommendations into actions and demonstrate continuous learning.

Topic: Give continuous feedback to the organization
• Provide feedback on the organization’s safety ambition and on any misalignments between the organization’s ambitions and governance documentation, rules, regulations and/or regular practice on board (e.g. company ambition to prioritize safety versus unclear procedures, pressure to arrive on time, lack of relevant training, distracting alarm management systems, and/or missing protective equipment on board).
• Operationalize the safety ambition by identifying how the ambition influences one’s daily operations.

Topic: Establish compatible goals within the organization
• Establish a verification and validation process in which the organization’s safety ambition and different department goals are regularly revisited and, if deemed necessary, adjusted to meet industry and/or organizational goals.
• Revisit the organization’s goals and department goals to identify any conflicts that need to be resolved. The organizational and departmental goals should span business areas rather than conflict with each other.
• Review key performance indicators, bonus schemes and communicated messages to ensure that they reflect the organization’s common safety ambition and how the ambition is translated into compatible goals.


Step 3:
Identify measures to ensure safe and
reliable vessel operations
To meet both the expectations of stakeholders and the organization’s safety ambition,
it may be necessary to improve safety and reliability of the existing fleet. The challenge is
to establish cost-efficient measures to avoid blackout and loss of propulsion and to ensure
quick and reliable recovery. Step 3 points to operational and technical measures that can be
implemented by the organization.

The following seven themes are covered in Step 3. Each theme matches different parts (A–D) of the bow tie as illustrated in Figure 5.

1. Implement robust operating modes [B]
2. Ensure safe and reliable closed-bus operations [B]
3. Ensure correct maintenance and operation of machinery [A, B]
4. Manage software and networks [B, C]
5. Provide decision support for crew [A, B, C, D]
6. Implement enhanced blackout testing [C, D]
7. Implement dynamic-barrier monitoring [A, B, C, D]

FIGURE 5

The themes in Step 3 relate to different parts of the bow tie.

[Bow-tie diagram. Foundation: understanding blackout and operating in accordance with a safety ambition that helps to manage conflicting goals. Threats: power generation failure, power distribution failure, electrical consumer failure. Preventive barriers: (A) prevent mechanical and electrical defaults, (B) prevent fault escalation (fault tolerance). Top event: blackout (loss of propulsion). Mitigating barriers: (C) auto recovery, (D) manual recovery. Consequence: sustained blackout (loss of propulsion).]


3.1 Implement robust operating modes


Some passenger ship operators set minimum requirements to machinery arrangements and manning levels based on risk, such as green, yellow and red operating modes. However, it is vital that the criteria for going into these modes are clearly defined, to provide the master with decision support for deciding to go from one operating mode to another. Different operators use different names for the different modes. A green condition typically refers to open waters, yellow for higher traffic density and distance to grounding line, and red for high traffic density, close distance to grounding line or port manoeuvre.

Procedures offering decision support
While it is the master who is responsible for vessel safety, procedures should function as a decision support tool during voyage planning and voyages. As such, procedures should to a larger extent show which conditions should qualify for green/yellow/red operations, depending on:

• Weather criteria: e.g. Beaufort level X should lead to operation red.
• Distance to shore/grounding line: e.g. distance X nautical mile should lead to status yellow.
• Traffic: e.g. high-density ship traffic should lead to status red.
• The condition and state of the vessel, its equipment and any operational limitations.

Likewise, the operating mode instructions should define the machinery arrangements in manoeuvring and transit modes with respect to:

• Power generation:
  • Number of generators online
  • Number of generators in standby
  • Ensuring number of remaining generator(s) after a failure has the capacity to maintain the navigational safety of the ship
  • Configuration of auxiliaries and cross-over lines / cross-feeders (i.e. common or separated auxiliaries for the machinery systems)
• Power distribution:
  • Closed or open bus-tie configuration
• Propulsion units (manoeuvring machinery / steering gear):
  • Number of units online
  • Number of units in standby

Risk-based approach
The procedures and decisions to enter green/yellow/red operations should be risk-based, in accordance with the company’s risk acceptance criteria. For example, sailing with only one generator online in calm weather on open sea may not lead to severe post-blackout consequences. The risk acceptance criteria should reflect the company’s safety ambition with respect to blackout.
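
The criteria above can be captured as an explicit decision rule. The following is a minimal sketch only: the threshold values, the field names and the rule that degraded equipment leads to yellow are illustrative assumptions, not values from this paper, and each operator would substitute its own criteria. The second function illustrates the requirement that the remaining generators can carry the load after a failure.

```python
from dataclasses import dataclass

# Hypothetical thresholds; each operator sets its own values in line
# with its risk acceptance criteria.
RED_BEAUFORT = 8
YELLOW_BEAUFORT = 6
RED_DISTANCE_NM = 1.0
YELLOW_DISTANCE_NM = 5.0


@dataclass
class Conditions:
    beaufort: int
    distance_to_grounding_nm: float
    high_traffic: bool
    equipment_degraded: bool = False


def operating_mode(c):
    """Return 'green', 'yellow' or 'red' for the given conditions."""
    if (c.beaufort >= RED_BEAUFORT
            or c.distance_to_grounding_nm <= RED_DISTANCE_NM
            or c.high_traffic):
        return "red"
    if (c.beaufort >= YELLOW_BEAUFORT
            or c.distance_to_grounding_nm <= YELLOW_DISTANCE_NM
            or c.equipment_degraded):
        return "yellow"
    return "green"


def survives_single_generator_loss(online_kw, demand_kw):
    """True if the remaining generators can carry the demand after the
    largest online generator is lost."""
    if not online_kw:
        return False
    return sum(online_kw) - max(online_kw) >= demand_kw
```

Such a rule does not replace the master's judgement; it only makes the documented criteria unambiguous and easy to review.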


3.2 Ensure safe and reliable closed-bus operations


It is common practice in the industry that vessels operate with a closed bus tie on the main switchboard (hereinafter referred to as closed-bus operations), meaning that redundant power systems are configured as one common system.

There are several benefits of this configuration. However, with a standard protection strategy used on passenger ships today, certain failures in a closed-bus configuration will create a failure propagation path leading to blackout, even with multiple gensets online. Unless additional technical measures are implemented, and the systems are tested and verified accordingly, blackouts may occur. This may be caused by failures such as short circuit, earth fault, excitation control fault or fuel control (speed governor) fault.

The following sections will focus on the risks associated with closed-bus operation, present typical failure modes and define possible barriers against failure propagation. Note that the only way to identify all failure modes is to run case-by-case analysis and define failures which are applicable for specific design solutions.

Figure 6 shows a simplified single-line diagram of a power generation and distribution system, showing the location of the bus tie. In this set-up, four gensets (diesel generators) feed a sectionalized bus (bus A and bus B) that is divided by two bus-tie breakers (P5 and P6). The power systems in most passenger ships can be isolated by means of bus-tie breaker(s), and each power system is then fed by at least one generator.

FIGURE 6

Simplified single-line diagram of a power generation and distribution system

[Single-line diagram: generators G1 to G4, each with governor (Gov), AVR and safety control, connect through protection relays P1 to P4 to switchboards SB-A (11 kV bus A) and SB-B (11 kV bus B), which are joined by the bus tie (P5, P6). Transformers, drives A and B, and the 690 V and 230 V distribution levels are shown below the main bus, together with a power management computer and a thruster control computer.]

Legend
P: Protection relay
D: Diesel engine
G: Generator
M: Motor
T: Transformer
SB: Switchboard
Gov: Governor
AVR: Automatic Voltage Regulator


Advantages and disadvantages of operating with closed and open bus
Common practice in the industry is for vessels to operate with P5 and P6 closed. There are several benefits of this configuration, such as:

• Fewer running generating sets, less total fuel consumption, less consumption of lube oil, improved maintenance intervals, fewer engine hours, less wear and tear on the engine.
• It is more likely that gensets run on optimal load, giving lower fuel consumption and emissions and a reduced environmental footprint.
• Decreased risk of partial blackouts caused by loss of a single generating set.
• Greater flexibility for preventive and corrective maintenance activities (depending on the power system arrangement).
• Increased grid frequency and voltage stability, because more generating sets are connected to the common bus.

However, as pointed out in the previous section, certain failures in a closed-bus configuration will lead to blackout, even with multiple gensets online, unless additional technical measures are implemented.

When operating with open bus, in other words when redundant power systems are configured as independent systems (P5 and P6 open), the likelihood of full blackout is significantly reduced, as no electrical failures in bus A may propagate via the bus tie to bus B. However, this does not eliminate the risk for blackout completely, as there may be faults that can affect the expected independence. Examples of such faults are:

• Common mode failures resulting in loss of critical redundant equipment in both bus A and bus B.
• Combination of single failure in bus A, followed by hidden failure in bus B.
• When fully redundant systems are operated in a non-redundant manner; this is especially relevant for all auxiliaries (e.g. lubrication, cooling, and ventilation) when auxiliaries that belong to bus A are powered by bus B.

The benefit with this configuration is that it maintains availability of propulsion/thrusters during most failure modes, maintaining at least partial propulsion. However, the risk is that many failures can cause partial blackout incidents (i.e. loss of one busbar), with consequential reduction in propulsion capability.

The typical failure modes in closed-bus configurations for diesel/gas-electric power plants are listed in Table 1.

Several protection systems and functionalities are distributed throughout the power plant that are designed to handle one specific failure mode, such as load reduction, overspeed of a generator and reverse power. If these functionalities are not coordinated, they may work against each other and escalate the failures. For example, if an engine produces too much power due to a governor failure, it might trip the load reduction functionality. The faulty generator will force the healthy generators into reverse power, and they will be tripped by the reverse power protection. When the faulty generator is the only generator remaining at the switchboard, it will go into overspeed, be tripped and create a blackout.

The switchboard-breaker protections also need to be coordinated to handle a short circuit ride through. The propulsion drives may trip on low voltage before the short circuit protection in the generator breakers and the bus ties operates. This may, in turn, result in loss of propulsion and essential systems.

Generator set failures
Failure modes that can propagate through systems (i.e. in closed-bus operations) are mostly associated with faulty fuel control systems on the engine or excitation control systems on the alternator. These faults are not easily detected by the protection relay of the faulty generator. They can lead to a disconnection of any healthy unit which becomes overloaded or starts to absorb power to maintain correct system frequency and voltage.

Therefore, good practice for the power systems operating in closed-bus modes is to equip the protection scheme with an additional safety barrier that supervises the generating set’s behaviour. This functionality should be realized by independent control systems that have a dedicated set of interfaces, or it should be executed via power management systems with functionalities that extend to generator supervision modules.

The supervising systems should be independent from the fuel control system and excitation control system so that there are no common mode failures which would influence the fuel/excitation control system and simultaneously disable or influence the supervising system functionality. In this guidance paper, such a system is referred to as generator protection (GP).

GP should detect the faulty generating set and issue a start command to standby generators. Sometimes, it is enough to increase the number of generators to stabilize the power system. If this does not help, and the failure deteriorates (or simply develops too fast), the GP should trip the generator associated with the faulty control system. Usually, one more protective barrier is implemented as part of the algorithm, which causes a trip (opening) of the bus-tie breaker(s).
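
The staged GP response described above can be illustrated as a small decision function. This is a sketch under stated assumptions, not an implementation of any real protection product: the function name, its inputs and the choice to place the bus-tie opening after the generator trip are hypothetical, since the paper does not specify where in the sequence the bus-tie barrier acts.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    START_STANDBY = "start standby generator"
    TRIP_FAULTY_GENERATOR = "trip faulty generator"
    OPEN_BUS_TIE = "open bus-tie breaker(s)"


def gp_next_action(fault_detected, standby_started, generator_tripped,
                   still_deteriorating, fast_fault=False):
    """Return the next protective step for a supervised generating set."""
    if not fault_detected:
        return Action.NONE
    if generator_tripped:
        # Assumed last barrier: split the plant if the trip did not help.
        return Action.OPEN_BUS_TIE if still_deteriorating else Action.NONE
    if fast_fault or (standby_started and still_deteriorating):
        # Adding capacity did not help, or the fault develops too fast.
        return Action.TRIP_FAULTY_GENERATOR
    if not standby_started:
        # First response: add generating capacity to stabilize the system.
        return Action.START_STANDBY
    return Action.NONE
```

Writing the sequence down in this form makes it easier to check that each protective step is reached only when the milder step before it has failed.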


As the GP system has a tripping functionality to all generating-set breakers and bus-tie breakers, it will also represent a potential source of failure. Like all other systems, the GP shall be reviewed with respect to autonomy, architecture and analysis of consequences against spurious trip commands.

Switchboard and associated feeder line failures
The integrity of closed-bus systems depends on the ability of the power distribution grid to detect and isolate the failure and to ensure that the various essential consumers are capable of “riding through” the voltage transients without tripping. Therefore, power systems should be analysed to define what applicable faults might occur in the specific design and discuss the transient nature of the failure.

Busbar protection should primarily be addressed during the newbuild process (see chapter 4.2). However, operators should ensure regular checks of critical circuits which control opening and tripping functions in the breaker’s relays and verification of barriers which prevent hidden failures.

System synchronization failures
Power system synchronization (power generation to switchboard and switchboard to switchboard) is a regular activity in all power grids. A faulty synchronization process might result in severe disturbance in both power systems.

Power management system (PMS)
The PMS is a system that automatically controls the power generation and distribution system in accordance with the power demand. It is also a barrier that prevents load variations from causing blackout. However, the increased number of functions, the ability to open or trip all feeders and bus-tie breakers, and the interface with propulsion systems and other vessel-specific functions create potential sources of failures. Since it is a centralized control unit with measurements and command signals to both power systems A and B, failures in the control loops or communication links between the redundant PMS programmable logic controllers (PLCs) might lead to blackout, even in open-bus operation. Hence, from a safety perspective, the number of centralized functionalities and connections should be minimized.

TABLE 1
Categorization of failure modes in closed-bus operations

Origin system: Generator set
Example failures leading to blackout:
• Sudden trip of single generator set without prior warning, together with degraded performance of PMS (i.e. not enough power limitation from preferential trip or load limitation on drives), may potentially cause overload and underfrequency of remaining generator sets in power plant, forming a common electrical system.
• Internal failures in speed control (e.g. governor, actuator, speed pick-ups, load sharing lines) leading to active power imbalance in a common electrical system. This may trip healthy generator sets on reverse power protection.
• Mechanical blockage of fuel rack following a load reduction demand resulting in inability to reduce fuel to the engine. This may cause other generator sets to be offloaded and consequently trip on their reverse power protection.
• Loss of voltage sensing to automatic voltage regulator. This may lead to overexcitation and significant reactive current in the power system. If not detected and isolated fast enough, it may consequently result in tripping breakers on other healthy generator sets due to over/undervoltage.

Origin system: Switchboard and associated feeder line
Example failures leading to blackout:
• Earth fault in outgoing feeder causing trip of generator sets. This may be caused by protection scheme against earth faults that has not been properly coordinated across breakers.
• Short circuit in single outgoing feeder which has not been cleared out by dedicated breaker due to mechanical failure. This may lead to trip of all generator sets from both power systems.

Origin system: System synchronization
Example failures leading to blackout:
• Faulty synchronization device or mechanical fault of generator breaker may lead to unintentional connection of unsynchronized generator set (crash synchronization event) to common electrical system.

Origin system: Power management system
Example failures leading to blackout:
• Calculation of power available signal by PMS is not fast enough to activate load limitation in propulsion drives and consequently mitigate underfrequency effects in case of sudden shutdown of on-line generating set.

Origin system: Transient states in the power system
Example failures leading to blackout:
• Short circuit followed by transient voltage dip in common electrical power system. This may cause under-voltage trip of auxiliary machinery and consequently resulting in shutdown of running generator sets or propulsion.


Examples of typical features and failures in the PMS that must be considered are:

• Failures in communication links
• Barriers against unintended operations
• Barriers against unintended automatic actions (e.g. actions which could result in unnecessary blackout, partial blackout or unintentional power reduction)
• Signal validation, faulty signal, loss of signal

One of the essential barriers in this regard is to implement a mechanism for the validation of feedback signals to the PMS to prevent:

• Generator (or bus-tie) connection without synchronization
• Unintended load reduction of thrusters
• A decrease in generator frequencies to a level that increases the risk of automatic load reduction of drives and/or tripping of drives
• An increase in frequency to a level that causes systems to trip

Disturbance in power systems operating in closed-bus modes is seen throughout the entire power system. The set points and protective functions in the PMS should be aligned with possible power oscillations to avoid spurious activation of protective functions or spurious blackout detection. Also, all systems activating trip or load reduction of thrusters must be identified.

Transient states in the power system
Failure modes that could cause spurious tripping of running machinery or the spurious opening of circuit breakers cannot be eliminated. Thus, power systems shall be optimized, operated and tuned to be stabilized after a sudden loss of power generation. Severe failures, which cannot be tested, might be analysed by transient state simulations.
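
The validation of feedback signals to the PMS mentioned in this section can be sketched as a plausibility check on redundant measurements. The example below is illustrative only: the function name, the valid range and the allowed spread are hypothetical values, and a real PMS would apply the vendor's own validation logic.

```python
from statistics import median


def validate_frequency(readings_hz, valid_range=(45.0, 65.0),
                       max_spread_hz=0.5):
    """Validate redundant frequency readings before the PMS acts on them.

    Returns (value, ok). Readings that are missing (None) or outside
    `valid_range` are discarded. The signal is accepted only if at least
    two usable readings remain and they agree within `max_spread_hz`.
    """
    low, high = valid_range
    usable = [r for r in readings_hz if r is not None and low <= r <= high]
    if len(usable) < 2:
        return None, False  # loss of signal: do not act automatically
    if max(usable) - min(usable) > max_spread_hz:
        return None, False  # sensors disagree: treat the signal as faulty
    return median(usable), True
```

Rejecting a doubtful signal instead of acting on it is one way to avoid unintended load reduction or tripping caused by a single faulty sensor.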


3.3 Ensure correct maintenance and operation of machinery


Even though redundancy may be incorporated into the design, there are still failure modes that can contribute to a failure on multiple units within a short period of time. Such incidents are referred to as common mode failures. This chapter will mainly focus on mechanical common mode failures, and the following categories in particular:

• Auxiliaries and sub-system failures
• Maintenance failures
• Operational failures

The failures covered in this chapter are not exhaustive but rather meant as practical examples for what could lead to blackout and loss of propulsion.

EXAMPLE EVENTS

• Clogged fuel filters: Fuel tanks can experience accumulation of sludge, water and deposits. In rough weather, the
accumulations can swirl up in all tanks simultaneously due to vessel motion and subsequently clog fuel filters.

• Loss of lube oil suction: The engine lube-oil system may also be subject to unexpected behaviour during rough
vessel motions, either by means of loss of oil suction or triggering of low-level alarm due to sloshing in the lube-oil
tanks. As these tanks might be of identical design on all engines, and at the same time be subject to identical motion,
it is possible that they will simultaneously experience the same kind of problem with the lube-oil system.

• Lack of fuel management: The quality of newly filled fuel can cause severe problems. This may particularly be the
case with compatibility with new, compliant fuels. New regulations introduce the need for frequent fuel changeovers
which increases these risks. Several blackouts have been caused by two different fuels that coagulated, where the
viscous fuel blocked the filters to the generators.

• Failure in common auxiliary systems: Redundant machinery systems arranged in separate engine rooms are normally
provided with separate auxiliary systems (cooling water, fuel-oil, lube-oil, ventilation, etc.). However, these auxiliaries are
normally arranged with cross-over pipes/ducts to provide operational flexibility. Operating with common auxiliaries
may reduce the operational cost but will also expose the redundant machinery to common mode failures in the auxiliary
systems, potentially causing blackout.


Auxiliaries and sub-system failures
Common mode failures are typically more likely in systems that are connected to the same sub-system, or in systems that are designed and built similarly for the redundant systems and hence will behave similarly. Some of these systems are not necessarily operated every day, making hidden faults in the system and any issues with maintenance and/or watchkeeping routines less apparent.

Maintenance failures
Poorly maintained equipment and mistakes made during maintenance operations can lead to blackout and loss of propulsion.

EXAMPLE EVENTS

• Maintenance on multiple gensets: A failure is particularly critical when all DGs are subject to the same maintenance operation. This may be the case when the wrong type of lube oil is filled in all DGs, when a big end lower half is not tightened to sufficient torque, when a control valve is left in the wrong position for each engine after regular maintenance, or when a replaced part is not fit for purpose.

• Using grease that is not compatible: Some DGs have manual greasing intervals where a grease gun is used to press
new grease into the roller bearing. If grease is used that is not compatible with what is already used, a sudden loss of
lubricity with seizure as consequence may occur. If the greasing of the DG is done on all units at the same time as part
of a regular maintenance program, then the failure of the bearings can occur for all DGs in a short period of time.

• Fuel rack free movement: Fuel rack free movement and links to the governor actuators need frequent inspections
to ensure that they are in order and that the fuel racks are free to move. Similarly, fuel pump barrel and plunger
interaction should be checked frequently because they may influence the DG operation, especially when the need for
large load change appears.

• Fuel pump plunger-barrel: During operation, the clearance between the fuel pump plunger and the barrel increases due to wear. If the fuel is changed to one of lower viscosity, this clearance might be too large for stable operation of the engine at low speed, even though no indications were seen with the higher-viscosity fuel.

• Maintenance of equipment during critical/high risk operations: Maintenance of equipment during critical operations
could reduce the system’s ability to handle peak loads and unforeseen situations.

Operational failures
Crew is responsible for optimizing the operation of the ship systems. This includes starting and stopping different sub-systems and switching valves to have the best flow in fuel, air and cooling-water lines. Mistakes in these operations may create situations where the system is not capable of handling the demand for power, and where an operation with reduced redundancy, for instance, might develop to a critical situation. It is essential that the risks involved in these operations are understood and that there is sufficient competence development, mentoring and supervision available to oversee the planning and performance of critical operational tasks.

EXAMPLE EVENTS

• Fuel switchover: For the vessels where a fuel switchover is required to meet local regulations, the procedure for
ensuring a correct switchover is crucial. The switchover procedure is usually slow to avoid thermal shock and should
be done at low engine load. Failure to follow this procedure may result in seizure of the fuel pumps or other thermal
shock-related issues, affecting all DGs.

• Valve operations: If a valve that should be opened is not opened fully, it could restrict the flow of fuel to one or several
gensets. If, perhaps through an operational mistake, the load demand then increases, the flow could be insufficient and
eventually create a shutdown.


3.4 Manage software and networks


Power generation and distribution systems contain numerous programmable controllers that are coupled together via network connections to a highly integrated system with multiple sub-systems. In most sub-systems like control, monitoring, alarm and safety functions, their functionality is programmed as software functions. In some cases, faults in individual software modules or in the interaction between the sub-systems can cause a blackout.

In general, software-related failures can be divided into the following three main categories:

1. Software defects present at the time of handover of the system/vessel to the owner
2. Software defects introduced during the operation phase when the software or related hardware is updated
3. Erroneous parameterization and configuration of the software

Categories 2 and 3 will be addressed in this section, as they relate mostly to ships in operation, while Category 1 will be addressed in Step 4 (for newbuilds).

Software management
Software differs from hardware components regarding defects and failures, because it does not change characteristics over time. No software failure appears as a result of wear and tear of the software; software failures are a matter of the software behaving unexpectedly to a given set of input parameters.

This means that all software defects are systematic in nature and can be managed with proper quality assurance and verification activities. This includes verifying and validating that the system behaves exactly as expected before it is taken into operation.

The only known way to control the unruly nature of software is to apply a structured process of verification activities throughout the whole software life cycle.

Different versions of software can be verified and deployed in a controlled way while the vessel is in operation by (a) applying a strict version-control regime, and by (b) exchanging relevant meta-data about the software versions between the system supplier and the vessel operator. If critical functions shall be tested without interfering with the vessel operation, then it will be necessary to use a replica system or a simulator of the target system.

Software can also easily be made flexible: a single piece of software can be programmed to take into account a number of different parameter settings and different hardware configurations. This makes it even more important to strictly control the parameters, ensuring that the relevant parameterization is indeed verified, and that no unintended changes are made to the parameter values after they have been verified.

Typical software failures that may cause a blackout
There is nothing special about the software involved in power generation and distribution compared to the software involved in any other control system on board the vessel. Yet, because of the potentially severe consequences of blackout, software manufacturers must pay special attention to the design, construction, verification and change management of the software in the involved sub-systems.

EXAMPLE EVENTS

• Inadequate integration or cooperation between multiple programmable systems (e.g. gas-fuel mode / diesel fuel mode, damping of control responses, latency of signal communication).

• Running on an old operating system version (no longer updated).

• Faulty uploads of new software.

• Old parameters (which are tuned to fit the vessel’s operation) are overwritten and/or reset, resulting in a blackout caused by a usual load scenario.

• Presence of “dead code” in the control software.

• Incorrect configuration of protection relays.

• Poor tuning of the PMS parameters that regulate load reduction, leading to fluctuations, etc., which may escalate and result in blackout.

• Functionality errors/poor logic, leading to unexpected system behaviour.
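
The requirement that verified parameter values must not change unnoticed can be illustrated with a short sketch. It assumes, purely for illustration, that controller parameters can be exported as a name-to-value mapping. The parameter names, the values and the helper functions `fingerprint` and `changed_parameters` are hypothetical and not taken from any supplier system.

```python
import hashlib
import json


def fingerprint(params: dict) -> str:
    """Return a SHA-256 digest of a parameter set, independent of key order."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_parameters(verified: dict, current: dict) -> dict:
    """Map each differing parameter name to its (verified, current) values."""
    names = set(verified) | set(current)
    return {
        name: (verified.get(name), current.get(name))
        for name in sorted(names)
        if verified.get(name) != current.get(name)
    }


# Hypothetical PMS parameters recorded after verification.
verified = {"load_reduction_limit_pct": 85, "start_next_dg_load_pct": 70}
# Hypothetical export after a software update that reset one value.
after_update = {"load_reduction_limit_pct": 100, "start_next_dg_load_pct": 70}

# A differing fingerprint signals that the verified state no longer holds.
if fingerprint(after_update) != fingerprint(verified):
    for name, (old, new) in changed_parameters(verified, after_update).items():
        print(f"{name}: verified={old}, current={new}")
```

Storing the fingerprint together with the software version meta-data would give the operator and the supplier a single value to compare after each update. Which parameters are safety-critical remains a vessel-specific judgement.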


3.5 Provide training and decision support for crew


It can be argued that the extent to which ships can diagnose their situation and determine the severity of the consequences of loss of propulsion depends on whether crew on board are appropriately trained and have the necessary experience. But they also need to know and feel that they can get the support they need from the technical systems and from shore. This section will elaborate on training and decision support that is necessary to increase the likelihood of successful human performance.

Enabling successful human intervention
As emphasized by the IMO, the role of the human element is “a complex multi-dimensional issue that affects maritime safety, security and marine environmental protection” [8]. Indeed, the human element is increasingly being recognized as an essential safeguard to maritime safety rather than the main cause of accidents [9].

For the vessel to recover as quickly as possible from loss of propulsion, operators need to be able to act swiftly and appropriately. The probability that a person will correctly perform some system-required activity during a given time period (assuming time is a limiting factor) greatly depends on the combined effects of factors that influence performance [10]. Examples of factors that directly influence operators are access to appropriate information in a user-friendly interface, local communication and collaboration practices, and operator’s skills and levels of experience. More latent factors include work processes in the company, company culture, as well as quality and accessibility of procedures and training.

Mature company safety cultures promote safety rather than short-term profit objectives, encourage reporting as a timely way to uncover problems, have standards, rules and procedures in place to prevent non-compliance, and have clear processes in place for communicating critical design and operational factors [10]. In companies with a mature safety culture, operators are more inclined to raise a red flag before starting or during an operation that they are not comfortable with. These operators respond strongly to weak signals, which is a prerequisite for detecting and acting on a critical situation such as loss of propulsion.

Ensure support from shore organization
Adequate shore support is a manifestation of management commitment to minimizing risk and optimizing performance. Adequate shore support means the shoreside organization has identified who is responsible for addressing ship questions about regular operations, for helping the ship during troubleshooting, and for offering practical and technical support in case of an emergency. This also includes offering management support to make decisions that come with a cost.


Perceived support from shore is an important factor that can reduce crew workload during an emergency, which in turn helps them in their ability to make decisions and act appropriately. A common criticism from ships to their shoreside organization is that crew perceive employees in the office as lacking the maritime knowledge and/or updated experience that is necessary to provide ships with the support they need in their day-to-day and exceptional operations.

For ships to quickly recover from a loss of propulsion situation, they need to get prompt access to the required support. The best-in-class operators support their fleet in areas of:

• Ship nautical operations: voyage planning, weather routing, port calls, etc.
• Technical operations: equipment and system malfunction
• Emergency operations: casualty/damage assessment, damage stability and residual strength calculations, contingency plans, 3rd party emergency services, etc.

Training for competence and experience development
Crews should regularly be trained and mentored on the operation of systems and handling of emergency cases such as local operation of the essential functions in the power system (e.g. manual synchronization and load control). The objective of the training should be for crew to be able to recognize and demonstrate their understanding of situations where damage to or maintenance on redundant components can result in reduced fault tolerance.

Crews are essential barriers for preventing the escalation of situations where power and propulsion systems do not recover automatically. Therefore, it is essential that crews know exactly what to do when such a situation arises. This requires crews to be familiar with the vessel-specific systems, and equally important, the limitations of these systems. The expected response to a blackout situation should also be part of the familiarization and handover procedures.

3.6 Implement enhanced blackout testing


Company policies must instruct and allow for adequate blackout testing, explaining requirements with respect to, for example, frequency, responsibilities, and timing of testing. Blackout tests must be arranged regularly in order to verify the system responses to different blackout failures and to contribute to enhancing crew competence on blackout scenarios.

Crew can then:
• Test what manual actions may be required for blackout restoration.
• Learn to identify blackout conditions, observe power system automated actions and troubleshoot problems should the sequence of blackout recovery fail.
• Identify areas for improvement in the technical and operational barriers.
• Become more confident with emergency response procedures and checklists in the event of power system failures.

A proper blackout test involves both the main power system and the emergency generator start-up. This is because a blackout recovery sequence consists of these two parallel processes which start up independently without any operational delays. As the blackout test is a logistic challenge, it should be prepared well in advance.

Blackout tests should be initiated under different conditions (i.e. by different failures), to verify the system response triggered by different circumstances and to prepare crew for various scenarios.

It is important that tests verify the functions of the blackout prevention measures and the blackout recovery measures (i.e. testing full blackout).

Recommendations and best practices for blackout prevention and recovery tests are provided in Appendix E (blackout prevention) and Appendix F (blackout recovery).


KEY ELEMENTS OF BLACKOUT TESTING PROCEDURES

• Objective describing need, motivation and targets.

• Prerequisites describing activities which shall be performed prior to the test.

• Set-up for power system describing operating mode during the test, how many DGs are running, how the switchboards are assigned to redundancy groups, and which equipment is running prior to the test, etc.
  - Typically, the power system shall be configured as during regular operation.
  - This part might also describe the specific loading condition for the power plant.

• Reference to other documents, procedures, and vessel maintenance schemes.

• Test method describing how the test shall be performed.
  - The procedure shall be detailed enough to give an overview of what shall be tested and how.
  - Good practice is to include breaker tags and other information which makes the entire process short and straightforward.
  - The method shall include a list of manual actions with specific information on what should be operated where and how, so that the procedure can be understood irrespective of crew rotation.

• Expected results describing how the power system shall prevent and/or recover from blackout, how the power system shall split and what the expected time is for:
  - Power generation start-up
  - Power generation connection to main switchboards and synchronization with system (if necessary)
  - Propulsion recovery

• Results found describing the real results. If the results found deviate from results expected, this shall be described,
explained and concluded for acceptance or rejection.

• Comments to note, such as additional information, drawings or sketches.
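
The comparison of expected results with results found can be sketched as a small record structure. This is an illustration only: the class `TimedStep`, the function `deviations` and all time values are hypothetical, and actual expected times must come from the vessel-specific test procedure.

```python
from dataclasses import dataclass


@dataclass
class TimedStep:
    """One timed recovery milestone, e.g. 'Propulsion recovery'."""
    name: str
    expected_max_s: float  # expected time stated in the test procedure
    measured_s: float      # time observed during the test


def deviations(steps: list[TimedStep]) -> list[str]:
    """List the steps whose measured time exceeds the expected time."""
    return [
        f"{s.name}: measured {s.measured_s:.0f} s, expected <= {s.expected_max_s:.0f} s"
        for s in steps
        if s.measured_s > s.expected_max_s
    ]


# Hypothetical figures for illustration only.
steps = [
    TimedStep("Power generation start-up", 30, 25),
    TimedStep("Connection to main switchboards", 45, 40),
    TimedStep("Propulsion recovery", 120, 150),
]

# Each deviation must be described, explained and accepted or rejected.
for line in deviations(steps):
    print(line)
```

Keeping the expected and measured values side by side makes it harder for a deviation to pass without the explanation and acceptance decision that the procedure asks for.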


3.7 Implement dynamic-barrier monitoring


The barrier approach described in chapter 1.3 created a theoretical foundation for the project, which resulted in a simplified bow tie that facilitates communication about loss of propulsion. However, in order to use barrier models as decision support tools in daily operations, the principles of dynamic barrier monitoring should be applied.

Monitoring the health status of safety barriers
Barrier performance is not static, meaning that the integrity of a barrier (its status) may degrade over time. This makes establishing the status of the barrier difficult. Dynamic-barrier monitoring ensures continuous insight into the barrier health status and enables real-time decision support. A live status dashboard alerts crews and management teams about degraded barriers and how the risk levels may potentially increase unless mitigating actions are implemented.

Dynamic-barrier monitoring uses quantitative data (e.g. sensor feeds) or qualitative input (e.g. through assessments). The figure below illustrates the process of building a barrier model, followed by applying dynamic checklists (i.e. reporting on barrier status) via applications to generate an overall risk status.

FIGURE 7

Example of a dynamic-barrier approach
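
How reported barrier statuses might be combined into an overall risk status can be sketched as follows. The three status levels, the barrier names and the rule that the overall status equals the worst individual barrier are assumptions made for this example; the guidance does not prescribe an aggregation rule.

```python
from enum import IntEnum


class Status(IntEnum):
    """Barrier health, ordered so that a higher value is worse."""
    HEALTHY = 0
    DEGRADED = 1
    FAILED = 2


def overall_status(barriers: dict[str, Status]) -> Status:
    """Assumed aggregation rule: the overall status equals the worst barrier."""
    return max(barriers.values(), default=Status.HEALTHY)


def needs_attention(barriers: dict[str, Status]) -> list[str]:
    """Names of barriers that are not healthy, e.g. for a dashboard alert."""
    return sorted(name for name, s in barriers.items() if s is not Status.HEALTHY)


# Hypothetical checklist input.
reported = {
    "Generator protection": Status.HEALTHY,
    "Blackout recovery sequence": Status.DEGRADED,
    "Manual recovery competence": Status.HEALTHY,
}

print(overall_status(reported).name, needs_attention(reported))
```

A worst-case rule is deliberately conservative. A real barrier model would more likely weight barriers by their position in the bow tie, which this sketch does not attempt.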


3.8 Recommendations and best practices


Recommendations and best practices for vessel managers and crew on board are provided below, grouped by topic.

Implement robust operating modes
• Ensure that procedures for power system and propulsion arrangement (e.g. green, yellow, red modes) are based on operational exposure, e.g. weather states (Beaufort level), distance to shoreline, traffic density and operational status of the vessel.
• As part of the procedures, define the vessel’s critical/high risk operations and corresponding ‘safest mode of operation’.
• Clarify what is expected from the crew in different operation modes.
• Provide feedback on procedures and malfunctioning systems, and report discrepancies between the procedures and any alternative practices on board.

Ensure safe and reliable closed-bus operations
• Implement more advanced protection measures to ensure fuel and voltage control of gensets (e.g. generator protection [GP]).
• Ensure maintenance/overhaul of speed governors and correct automatic voltage regulator (AVR) settings.
• Maintain the integrity of switchboard and associated feeder lines:
  - Implement checks of critical circuits that control opening and tripping functions in breaker’s relays.
  - Ensure that bus-tie cables are mechanically protected and insulated, and that busbars in switchboards are insulated.
• Maintain the integrity of system synchronization by implementing synchronization operations in the operational procedures. Avoid switchboard-to-switchboard synchronization during critical operating modes.
• Maintain the integrity of PMS systems by:
  - Monitoring hidden failures in trip circuits for the generator and tie-breaker.
  - Implementing high-integrity serial communication or direct HW open command signals to each generator and bus-tie breaker.
  - Implementing redundant open command signals to each generator and tie-breaker.
  - Providing clear indications of local/remote status of the tie-breakers, making autonomy and distribution of functionality available.
  - Implementing dual action functionality for preventing unintended acts of operation and ensuring validation of feedback signals to PMS.
• Ensure that desktop studies (e.g. FMEA) are supported by dynamic computer simulations. Simulations should address failures that cannot be tested and cannot be concluded on during a regular desktop exercise such as in FMEAs (e.g. transient states and “ride through” verification).

Ensure correct maintenance and operation of machinery
• Ensure that procedures address common failure modes and maintenance operations that could potentially result in reduced fault tolerance.
• Ensure that no simultaneous maintenance and upgrade of similar equipment is performed and identify where equipment maintenance should be avoided.
• Ensure that overhauled or upgraded equipment is thoroughly tested before sailing.
• Ensure that newly filled fuel is not used and mixed with other fuel before the test results confirm compatibility.

Manage software and networks
• Appoint an accountable/responsible person to follow up on software updates that are conducted on board. This OT position (engine department) should be separated from an IT position, as it is more focused on network machinery and automation as opposed to software and network issues (hotel department).
• Ensure comprehensive verification and validation processes before new software versions are put in operation (i.e. implement proper change management).
• Actual tests should be performed on board and in simulators (if possible). Special attention should be put on system safety-critical parameters.
• Refrain from making changes to software during critical operations.
• The operator should also ask for records that show that the suppliers have indeed performed sufficient verification activities on the software.

Provide training and decision support for crew
• Establish a competence development plan for crew to demonstrate their understanding of and ability to operate systems, including emergency cases, e.g. local operation of the essential functions in the power system (e.g. manual synchronization and load control) and cases where power and propulsion systems do not recover automatically from blackout.
• Ensure that the crew understands and recognizes situations where damage to or maintenance on redundant components can result in reduced fault tolerance.
• Perform a human-reliability analysis (HRA) to verify that the system provides the necessary support for users to timely act on threats to and escalations following loss of propulsion.
• Include progress of the continuous and iterative improvement process of the alert management system in the safety management system (SMS).
• Ensure that roles, responsibilities and training/competence requirements for the shore-support team are defined (incl. designated person ashore [DPA] responsibilities).
• Study the organizational structure of dedicated resources who can assist during troubleshooting and emergency situations.

Implement enhanced blackout testing
• Perform a job safe assessment prior to testing.
• Perform a partial blackout test prior to the full blackout test. This will highlight whether the power system is free from any items that could fail the blackout recovery sequence, and whether a healthy, energized redundancy group has no unintentional crossovers with the redundant group that failed.
• An extension of this test is to leave one redundancy group in failed condition for a longer period of time (typically for 30 minutes, which is equivalent to the discharge time of typical UPS units) to verify that all essential vessel capabilities are maintained. Alternatively, disconnect batteries in UPS units prior to the test, to verify that the healthy side is fully operational during the partial blackout. This simulates worst case failure design.
• Consider recommendations and best practices for blackout prevention test and blackout recovery test provided in Appendices E and F, respectively.

Apply dynamic barrier reporting and monitoring
• Use dynamic-barrier models as decision support tools in daily operations and include barrier condition reporting in vessel manager inspection reporting.

Step 4:
Identify measures to ensure safe and
reliable newbuilds
To meet the expectations of stakeholders and the organization’s safety ambition, it may be
necessary to improve safety and reliability of newbuilds. The challenge is to establish cost-
efficient measures to avoid loss of propulsion and to ensure quick and reliable recovery.
Step 4 points to technical measures that can be implemented by the organization.

The following five themes are covered in Step 4. Each of the themes matches different parts of the bow tie as illustrated in Figure 8.

1. Human-centred design [A, B, C, D]
2. Ensure robust design for closed-bus operations [B, C]
3. Improved integration, testing and verification [A, B, C, D]
4. Design effective blackout recovery systems [C, D]
5. Utilize battery systems [B, C]

FIGURE 8

The themes in Step 4 relate to different parts of the bow tie.

[Bow-tie diagram. Threats: power generation failure, power distribution failure, electrical consumer failure. Preventive barriers: A, prevent mechanical and electrical defaults; B, prevent fault escalation (“fault tolerance”). Top event: blackout (loss of propulsion). Mitigating barriers: C, auto recovery; D, manual recovery. Consequence: sustained blackout (loss of propulsion). Overarching frame: understanding blackout and operating in accordance with a safety ambition that helps to manage conflicting goals.]


4.1 Apply the principles of human-centred design


Historically, there has been a tendency to blame the operator and over-rely on training rather than creating systematic improvements in human performance through improved system design [10]. Although training is important, people will still make mistakes if the system is not designed to meet human capabilities and limitations. Also, a system that is well-designed and consistent with users’ needs is easier to operate and therefore easier to train, potentially reducing training requirements as well as improving human performance. As such, the focus should not lie on operator errors alone, but rather on how operator errors are symptoms of sub-optimally designed systems. Lack of human-centred design manifests itself in issues with user-friendly design, maintenance, testing and verification. These issues are covered in detail in the remaining parts of Step 4.

Improving human performance through improved system design
Operators work in a context that shapes their perceptions, decisions and actions. This is particularly the case in such an acute situation as during loss of propulsion, where operators are under pressure and promptly need to make sense of their situation and act accordingly. Therefore, to understand why operators make mistakes, one must put oneself in the shoes of the operator. This requires close cooperation between designer and end-user, testing and revising design in an iterative process (ISO 9241-210:2010). This helps to uncover any discrepancies between how the tasks are expected to be performed (“work as imagined”) compared to how the tasks are actually performed (“work as done”). The company should then act to understand why there is a misalignment between work as imagined versus work as done and how the gap can be closed.

Design of alarm systems
An alarm management system should alert, inform and guide operators, allowing them to diagnose problems and keep the vessel operating within its “safe envelope” [5]. Yet, despite the abundance of guidelines, best practices, international and national requirements, and class- and company-specific rules, alarms are often said to be least useful when they are most necessary. Operators are distracted by nuisance alarms, experience unnecessarily high workloads from redundant alerts, struggle with alarm texts that are difficult to understand, and are overwhelmed by the amount of non-critical information that is presented to them [6].

This shows that current practice in the maritime industry on alarm management does not take into account the strengths and limitations of human perception and performance. The result is that the weight of the responsibility for safe and efficient operations is placed mostly on the shoulders of the operator.

There are many reasons why the design of alarm management systems has become so complicated that it has lost sight of the needs of the end-user. Some of them are:

• The lack of system integration where each sub-system adheres to its own alerts.
• The design of systems is sometimes focused on normal state operations rather than on emergency situations, resulting in the triggering of unnecessarily many alerts.
• The link between system modifications and/or changes and the introduction of new alerts may be missing. There may be an imbalance between what is considered a hazard and what is therefore alerted, compared to the risk of the system as a whole.
• Alarms are often put in place because it is difficult to automate a process and because they are designed to protect the machinery. This creates a control system that puts the responsibility to act on the operator, and it thereby relies on the operator’s perception of safety.
• There may not be a clear enough link between a company’s alert system philosophy (if it even exists), redundancy and causality underlying major accident hazards such as loss of propulsion.


Presenting what is most relevant to the end-user
The maritime industry should expand its view on an alarm management system from a traditional view of a system for logging events (mostly of interest to an engineer) to a more user-centered definition (i.e. presenting what is most relevant to the end-user). This requires improved system integration, meaning that the roles and responsibilities of the stakeholders involved in alarm management system design are defined, that the number of alarms that reach the end-user is reduced, and that the presentation of alarms is improved [7] (see also chapter 4.3).

KEY PRINCIPLES OF ALARM DESIGN AND MANAGEMENT

• Alarms should direct the operator’s attention towards vessel conditions requiring timely assessment or action.

• Alarms should inform and guide required operator action.

• Every alarm should be useful and relevant to the operator and have a defined response.

• Alarm levels should be set such that the operators have enough time to carry out their defined response before the
situation escalates.

• The alarm system is to accommodate human capabilities and limitations.
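
Two of the principles above lend themselves to a simple automated review: every alarm should have a defined response, and the alarm level should leave enough time to carry it out. The sketch below is a minimal illustration. The `Alarm` fields, the alarm tags and all time values are hypothetical and would in practice come from an alarm rationalization exercise.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    tag: str
    response: str                 # defined operator response ("" if none)
    time_to_escalation_s: float   # time from alarm until the situation escalates
    response_time_s: float        # time the defined response is expected to take


def review(alarm: Alarm) -> list[str]:
    """Return the findings for one alarm; an empty list means no finding."""
    findings = []
    if not alarm.response.strip():
        findings.append("no defined response")
    if alarm.response_time_s > alarm.time_to_escalation_s:
        findings.append("alarm level leaves too little time to respond")
    return findings


# Hypothetical alarm list for illustration only.
alarms = [
    Alarm("LO pressure low, DG2", "Start stand-by LO pump", 60, 20),
    Alarm("Cabinet door open", "", 600, 0),
    Alarm("FO pressure low, DG1", "Change over FO filter", 30, 90),
]

for a in alarms:
    print(a.tag, review(a) or "ok")
```

A check of this kind covers only what can be stated as data. Whether an alarm text is understandable under stress still has to be tested with end-users.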

4.2 Ensure robust design for closed-bus operations


In the passenger and cruise segment, it is common practice to operate the power plant with closed bus-tie breakers. It is therefore essential to consider robust design for closed-bus operations in the design phase. This includes implementing more advanced generator protection capable of handling various voltage-related (AVR) or fuel-related (governor) failures, which basic generator protection will fail to handle. It involves using high-integrity switchboards and implementing measures to reduce the risk of connection of non-synchronized power systems. Finally, it requires measures to reduce the risk of PMS failures and to ensure power system stabilization.

Detailed recommendations and best practices for enhanced protection measures for safe and reliable closed-bus operations are provided in Appendix C. The configuration and operations with closed-bus in operation (chapter 3.2) are also relevant.


4.3 Improved integration, testing and verification


The industry is facing increasing demands towards complexity and efficiency during a newbuilding process. Yards need to meet delivery deadlines and integrate increasingly complex systems into the newbuilds. That combination of complexity of integrated systems and time pressure often introduces risks related to software failures that are still present when the vessel is handed over to the owner.

Testing and verification of the robustness and functionality of integrated systems is essential for the shipowner to rule out failures during operation. Too often, issues come up after vessel delivery. Thorough testing and verification of safety-critical systems during the early newbuild phase will lead both to cost efficiencies for yards during commissioning and sea trials and to more robust and reliable systems during operation for the ship operators.

Detailed recommendations and best practices for enhanced integration and verification during newbuilding processes are provided in Appendix D.

Maintaining integrity in computer networks
The main challenge with computer networks in a power generation and distribution system is that it may not be clear who is responsible for the totality of the network and its performance. Even if the network design from the individual suppliers has been through testing as a part of the class approval process, there may be a challenge to get all parts working together as a whole. Not all companies have dedicated OT operators; some rely on their IT department to cover all issues concerning network machinery and automation.

Even if cabled Ethernet-based networks are supposedly “plug and play”, there may be some details in the different systems’ communication mechanisms that impact the interaction between the systems. This is especially true in the cases where the network has a high load, either because of high normal traffic, or because of a defect leading to a network storm.

The PLCs in the different redundancy groups may be connected. This might lead to the spreading of a failure to several redundancy groups, such as in a network storm.

The topology (design) of the network also determines its robustness in case of failures and network storms. Ring, bus, mesh and star topologies are common topology variants that show different tolerances for and behaviours during individual failure scenarios. It is also common to apply a double network (e.g. double ring) for critical systems to increase robustness towards failures. However, this requires more cables and configuration and may lead to unexpected behaviours.

It is important that the consequences of high-load and failure scenarios for the actual network design are examined and understood. Even in the simple example shown in Figure 9, at least three different computer networks are present. These may again be connected to an engineering station or shore to allow for maintenance and troubleshooting. The networks may also contain different kinds of network switches and routers.
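
The different failure tolerance of network topologies can be made concrete with a small connectivity check. The sketch below removes each link in turn and tests whether all nodes can still reach each other. The node names and the four-node ring and star layouts are made up for the example and do not describe any particular vessel network.

```python
from collections import deque


def is_connected(nodes, links):
    """Breadth-first search: True if every node is reachable from the first."""
    neighbours = {n: set() for n in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, queue = {nodes[0]}, deque([nodes[0]])
    while queue:
        # The set difference is evaluated before the loop body modifies `seen`.
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == len(nodes)


def survives_any_single_link_failure(nodes, links):
    """True if the network stays connected whichever single link is removed."""
    return all(
        is_connected(nodes, links[:i] + links[i + 1:]) for i in range(len(links))
    )


nodes = ["P1", "P2", "P3", "P4"]
ring = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P1")]
star = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]  # P1 acts as the hub

print("ring:", survives_any_single_link_failure(nodes, ring))
print("star:", survives_any_single_link_failure(nodes, star))
```

The check covers loss of a single cable only. It says nothing about behaviour under high load or a network storm, which depends on the switches and protocols in use and has to be examined for the actual design.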

FIGURE 9

Simplified view of the network connections with different topologies

[Diagram showing a power management computer; four diesel generators, each with governor, AVR, control and safety units; protection relays P1 to P6; and a thruster control computer with drives A and B for two thrusters. Star, ring and point-to-point topologies are labelled. Key: P = protection relay; lines = network connections.]


4.4 Design effective blackout-recovery systems


During most blackouts, the power-generation system starts up quickly without significant delays in the restoration process. However, there are several conditions that can hinder successful recovery from a blackout even if the power management system is operating correctly.

Therefore, it is essential to design the blackout-recovery system in a simple way, and to test the blackout recovery on a regular basis throughout the lifetime of the vessel. Recommendations and best practices for enhanced blackout recovery reliability are provided in Appendix C.

TYPICAL REASONS FOR FAILURE IN AUTOMATIC BLACKOUT RECOVERY

• Interlocks that may not have been properly evaluated and tested may delay or fail blackout recovery on every level
of the power system.

• The complexity in interfaces and the high number of permissions and blocking signals increase the risk of failure.

• Failure mechanisms which led to the blackout incident may trigger safety functions that disable machinery start-
up or set HV breakers to trip and block a position. The system needs to be intelligent enough to move to the next
separate system or start this system up at the same time.

• Automated blackout recovery requires detailed tuning to coordinate signal exchange between the power
management system, HV system, drives, and other control systems. Even minor changes in logic during
maintenance or service activities can disable the recovery process. Any change creates a recovery situation which
requires appropriate procedures and permissions.
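The effect of interlocks and blocking signals on automatic recovery can be sketched as a simple permissive check. This is a minimal illustration, not a representation of any actual power management system: all signal names are hypothetical, and a real system would evaluate many more conditions with timing and latching logic.

```python
def blocking_signals(permissives):
    """Return the names of permissives that currently block a generator start.

    `permissives` maps a signal name to True (start permitted) or
    False (start blocked). All signal names here are hypothetical.
    """
    return sorted(name for name, ok in permissives.items() if not ok)

def can_auto_start(permissives):
    """Automatic start is allowed only if no permissive is blocking."""
    return not blocking_signals(permissives)

# Illustrative state after a blackout caused by a safety shutdown.
dg2 = {
    "lube_oil_pressure_ok": True,
    "start_air_pressure_ok": True,
    "safety_shutdown_reset": False,  # latched by the event that caused the blackout
    "breaker_not_blocked": False,    # HV breaker tripped and blocked
    "remote_control_selected": True,
}

print(can_auto_start(dg2))    # False
print(blocking_signals(dg2))  # ['breaker_not_blocked', 'safety_shutdown_reset']
```

Listing the active blocks explicitly, rather than only reporting a failed start, is one way to make it visible which interlock needs to be evaluated or reset during recovery.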

4.5 Utilize battery systems


More and more energy storage systems based on batteries are being installed on smaller RoPax and expedition vessels with smaller propulsion engines. Although this trend is mainly driven by environmental regulations and the ambitions to save fuel, batteries can also be utilized to reduce the risk of blackout.

Installing battery systems is neither straightforward nor cheap. It increases the complexity of the electrical systems and requires more advanced controllers for the power flow. Further, batteries increase the investment cost, they need to be replaced after approximately ten years, and battery installation requires maintenance and extra space on board. Nevertheless, there are several important advantages of having battery systems on board to prevent blackout and increase safety.

Bridging the power gap
Batteries can store energy for a limited amount of time, but for many failure modes, a battery can bridge the power gap during a power outage until a standby generator is online. Such a transitional source of power is a requirement in passenger ships. Batteries that are large enough can therefore be used to prevent loss of propulsion in passenger ships. A local battery system can also support power to the steering gear, thereby preventing loss of steering during blackout.

Large dynamic load changes
Batteries can also be used to slow down large load changes by taking energy from the batteries while the engines are ramping up. A gas engine that may have problems with delivering power during large dynamic load changes could greatly benefit from battery assistance, as this prevents the shutdown of generators and, ultimately, blackout.

The batteries could also be used for preventing power fluctuations. The batteries can deliver their dynamic power much faster than combustion engines. This is typically an issue on installations with high loads and smaller power-generating sources.

Load levelling
Battery system installations can be used for load levelling, where they allow the engines to run on optimum load (i.e. enhancing fuel efficiency) while the batteries take the dynamic load variations. Large enough batteries can support zero-emission operations, meaning that the vessel can operate on batteries only, for a limited load during a limited period of time.
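As a rough illustration of what "bridging the power gap" implies for battery sizing, the sketch below estimates the installed energy needed to carry a load until a standby generator is online. The load, the start-up time and the usable fraction are assumed figures for illustration only, not values from this guidance paper or from any manufacturer.

```python
def bridging_energy_kwh(load_kw, gap_seconds, usable_fraction=0.8):
    """Installed battery energy needed to carry `load_kw` for `gap_seconds`.

    `usable_fraction` is an assumed share of installed capacity that may be
    discharged; it is a placeholder, not a manufacturer figure.
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    delivered_kwh = load_kw * gap_seconds / 3600.0
    return delivered_kwh / usable_fraction

def power_to_energy_ratio(load_kw, installed_kwh):
    """Ratio of required power to installed energy, in 1/h."""
    return load_kw / installed_kwh

# Hypothetical case: 2000 kW essential load, standby generator online after 45 s.
energy = bridging_energy_kwh(2000, 45)
print(round(energy, 2))                               # 31.25
print(round(power_to_energy_ratio(2000, energy), 1))  # 64.0
```

In this hypothetical case the energy requirement is small, but the ratio of power to energy is very high. This suggests that the power the battery can deliver, rather than its stored energy, may govern the sizing for short bridging periods.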


4.6 Recommendations and best practices


Recommendations and best practices for vessel managers and crew on board are provided in the table below. For the
full list of details, see Appendix C (Enhanced protection measures for closed-bus operations) and Appendix D (Enhanced
system integration and verification for newbuilds).

Topic: Improve human performance through human-centred system design
• Ensure that the design process of setting requirements to technical functionalities and creating human-machine interfaces adheres to the principles of human-centred design and that the result is compatible with basic human capabilities.
• Ensure close cooperation between designers and employees with recent operational experience (the end-user).
• Actively be a part of defining the system’s design criteria and apply the principles of user-centred design in the procurement process.
• Cooperate with competence and experience in operations to set the requirements for the technical functionality and interface of equipment.
• Rationalize the alarms and improve the quality of alarm texts through a process of human-centred design.

Topic: Continuous feedback to the organization
• Provide feedback to the company about improving the alarm management system (e.g. alert texts, alert priorities).
• The company should explicitly encourage crew to provide feedback on improving the alarm management system (e.g. alert texts, alert priorities).

Topic: Improved integration, testing and verification
• Engage a system integrator that takes a central role in the design process from the earliest stages of the project.
• Perform early-phase assessments by reviewing documentation of the vessel and by performing a Hazard Identification (HAZID) study of automation integration.
• Get insight into the suppliers’ tests.
• Perform integration testing on the total integrated system.
• Implement an Integrated FAT (IFAT).
• Perform Failure Mode and Effect Analysis (FMEA) and FMEA sea trials. See guidance on FMEA analysis in Appendix B.
• Use a well-defined and transparent software-development and delivery process by knowing what to expect and by ensuring process adherence.
• Apply a change management procedure for key parameters and system configurations.
• Consider relevant voluntary class notations and guidance (e.g. RP, RP+, HIL and ISDS).
• Perform a network-failure analysis, network tests and manage the network configurations as key system parameters.

Topic: Robust design for closed-bus operations
• Ensure robust design for closed-bus operations. See list of recommendations in Appendix C.
• Design effective blackout recovery systems. See list of recommendations in Appendix C.
• Consider using batteries as effective barriers to prevent blackouts. Consider the best practices and recommendations in Appendix D, to mitigate the increased complexity of systems and integration that batteries can contribute to.

Relevant for (legend): Vessel manager, Crew, Newbuilding department, System supplier


Step 5: Prioritize and implement cost-efficient prevention and mitigation measures
The implementation of preventive and/or mitigating measures should be based on cost-benefit evaluations that compare the monetary value of benefits against cost. The challenge, however, is how to assess and monetize the impact of different measures on safety.

5.1 Cost-benefit evaluations


How safe is safe enough?
After setting a safety ambition, as described in chapter 2.1, the shipowner and operators will face typical questions such as:

• Should we invest in additional safety measures for our existing fleet? What measures should be implemented?
• What types of safety features should be specified for our newbuilds?
• What class notations should be selected to support our ambition?
• Should a type of ship which has suffered many accidents be modified and, if so, to what standard? Should the whole fleet then be modified?

To answer such questions, the decision-maker must have criteria at hand to be able to decide when the newbuilds and existing fleet can be considered safe enough. This requires the decision-maker to look at the organization’s safety ambition. If the ambition goes beyond the minimum requirements set by class and statutory requirements, then the focus will be on deciding what measures should be implemented to progress towards the ambition.

The previous chapters of this guidance paper provide owners and operators with recommendations for how to reduce the risk of blackout based on best practice. As such, they cover measures related to updating procedures, change management, safety and failure mode assessments, installing equipment and systems, verification and testing. Before implementing new measures, you need to consider the impact they will have on safety and the associated implementation costs.

Cost-benefit evaluation
Cost-benefit evaluations help to assess the benefit of the proposed safety measure, in terms of the risk that would be averted, against the cost of implementing the measure. The evaluation has two main objectives:


• To determine if an investment in an additional safety measure should be initiated and assess by how much its benefits outweigh its costs.
• To provide a basis for comparing safety measures and comparing the total expected cost of each measure against its total expected benefits.

Cost-benefit analyses may have different outcomes for different shipping companies, at different times and for different vessels. This is because the operational and technical context of each vessel will determine what may be considered too high cost and how much a vessel will benefit from one measure compared to another. Vessel managers should therefore start a cost-benefit evaluation by setting criteria for determining cost-benefits that are relevant to the vessel’s and company’s situation.

Investments do not necessarily need to be significant. Updating procedures and crew training may have a significant impact on safety, while the associated cost may be less than an investment in system retrofits. Testing is also a low-cost measure, provided it does not impact operating schedule (e.g. testing in-between operations) and that the test is properly planned to avoid surprises and system damages during testing. Often, it is not the test itself that may be time consuming or costly; it is the afterwork that may be needed if things do not go according to plan. Again, planning, competence and contingency measures are essential for relatively low cost compared to benefit.

When 1+1=3: adding value through a combination of measures
It is the combined effect of measures that will have the greatest impact on safety. For example, setting up robust modes of operation in combination with more sophisticated protective functions in software and hardware. Combining this with regular testing and verification will undoubtedly have significant positive impacts on vessel safety and reliability.

Passenger ship owners and operators should also ensure that their strategies and additional measures for blackout prevention and recovery address the interdependencies between human (H), organizational (O) and technical (T) elements that influence the risk of blackout. This HOT approach should be an integral part of the risk management process, supporting the identification of effective recommendations and measures to improve safety and system reliability.
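The two objectives of a cost-benefit evaluation can be illustrated with a small calculation. The sketch below is one possible way to structure it, not a method prescribed by this guidance paper: the measures, event frequencies, consequence cost and period are invented for illustration, and discounting and non-monetary effects are ignored.

```python
from dataclasses import dataclass

@dataclass
class Measure:
    name: str
    cost: float                    # total implementation cost over the period
    events_per_year_before: float  # assumed blackout frequency without the measure
    events_per_year_after: float   # assumed blackout frequency with the measure

def averted_risk(m, consequence_cost, years):
    """Expected loss avoided: frequency reduction x consequence x period."""
    return (m.events_per_year_before - m.events_per_year_after) * consequence_cost * years

def evaluate(measures, consequence_cost, years):
    """Return (name, net benefit, benefit/cost ratio), best ratio first."""
    rows = []
    for m in measures:
        benefit = averted_risk(m, consequence_cost, years)
        rows.append((m.name, benefit - m.cost, benefit / m.cost))
    return sorted(rows, key=lambda r: r[2], reverse=True)

# Hypothetical figures for illustration only.
measures = [
    Measure("Crew training and procedures", 50_000, 0.20, 0.15),
    Measure("Protection system retrofit", 400_000, 0.20, 0.08),
]
for name, net, ratio in evaluate(measures, consequence_cost=1_000_000, years=10):
    print(f"{name}: net benefit {net:,.0f}, ratio {ratio:.1f}")
```

With these assumed figures the low-cost measure has the higher benefit-to-cost ratio while the retrofit has the higher net benefit. Which criterion should drive the decision is part of the criteria the vessel manager sets before the evaluation.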

5.2 Recommendations and best practices


Recommendations and best practices for vessel managers are provided in the table below.

Topic: Perform cost-benefit analyses
• Decide which measures to implement based on cost-benefit criteria.
• Use insight from internal and external blackout statistics and root cause analyses to identify which measures will have greatest impact.
• Ensure measures address the interdependencies between the human (H), organizational (O) and technical (T) elements (the HOT approach).
• Consider a combination of measures to ensure greatest effect on vessel safety and reliability.
• Introduce discussions about cost for preventive and mitigating measures early in the procurement process with vendors.
• Measure the return of investment by establishing key safety performance indicators that cover both leading and lagging indicators.


Conclusion
To support owners and operators in ensuring the safe and reliable operation of their fleet, DNV developed a stepwise
approach for managing the risks of blackout and resulting loss of propulsion. Through implementing the best practices and
recommendations from this guidance paper, the industry should succeed in reducing the risk.

The five steps and the key elements in each step are summarized below.

To challenge the status quo within organizations and to initiate a discussion on blackout prevention and recovery, owners
and operators are encouraged to use the “Blackout Preparedness – Self Assessment” in Appendix A. This assessment is a
set of questions that is intended to raise awareness about blackout and what can trigger escalation after a blackout.

STEP 1: Increase understanding of blackout
In order to achieve a step change in safety for loss of propulsion, it is necessary to gain an overall understanding of causes of blackouts and the regulatory framework. Increasing understanding of blackout requires that organizations investigate the underlying causes of blackout and that they understand the regulatory framework. A barrier-based and holistic approach to managing risk offers practical tools and a helpful mindset.

STEP 2: Define safety ambitions and manage conflicting goals
Setting an ambition for minimizing the risk for and mitigating the consequences of loss of propulsion at an organizational level is the first progression towards ensuring safe and effective operations. Owner and operators need to agree internally on their ambition, so that they don’t run the risk of prioritizing other organizational goals at the expense of safety.

Managing conflicting goals implies also that organizations are ready to set aside time and resources to operationalize their commitment to change.

STEP 3: Identify measures to ensure safe and reliable vessel operations
To meet the expectations of stakeholders and the organization’s safety ambition, it may be necessary to improve reliability on the existing fleet of vessels. Step 3 points to operational and technical measures that can be implemented by the organization. These include:

• Implementing robust operating modes based on sound procedures that offer decision support
• Taking measures to ensure fault tolerant operations through safe and reliable closed-bus operations
• Maintenance and operation of machinery to tackle common mode failures
• Managing software and networks
• Providing training and decision support for crew
• Implementing enhanced blackout testing
• Implementing dynamic-barrier monitoring


STEP 4: Identify measures to ensure safe and reliable newbuilds
Step 4 addresses technical measures for newbuilds that can be implemented by the organization to avoid blackout and loss of propulsion and to ensure quick and reliable recovery. These include:

• Applying the principles of human-centred design
• Ensuring robust design for closed-bus operations
• Improved integration, testing and verification
• Designing effective blackout recovery systems
• Utilizing battery systems

STEP 5: Prioritize and implement cost-efficient prevention and mitigation measures
The implementation of preventive and/or mitigating measures should be based on cost-benefit evaluations that compare the monetary value of benefits against cost.

The challenge, however, is how to assess and monetize the impact of different measures on safety.

It addresses the most prominent HOT elements and their interactions in relationship to risk for blackout, and it offers recommendations and best practices for each step in the five-step approach to preventing and mitigating blackout.


References

[1] ISO 9241-210:2010. 2010. Ergonomics of human-system interaction - Part 210: Human-centred design for interactive systems. Geneva: International Organization for Standardization.

[2] Norwegian Shipowners’ Association and DNV (2014). Barrier management in Operation for the Rig Industry – Good Practices. Report 2013-1622. Rev1.

[3] DNV (2019). ISRS Book of Knowledge. Available on www.isrs.net with login credentials.

[4] European Safety, Reliability & Data Association (2015). Barriers to Learning from Incidents and Accidents.

[5] EEMUA 191 (2013). Alarm Systems: A Guide to Design, Management and Procurement. Edition 2. The Engineering Equipment and Materials Users Association.

[6] Sherwood Jones, B., Earthy, J. V., Gould, D. (2006). Improving the design and management of alarm systems. Paper presented to the World Maritime Technology Conference, 18.03.2006.

[7] DNV (2016). Human-centred design of alert management systems on the bridge. Report 2016-1147.

[8] IMO (2003). Resolution A. 947 (23) Human element vision, principles and goals for the organization. London: IMO.

[9] Eurocontrol (2013). From Safety-I to Safety-II. A white paper.

[10] Endsley, M. R. (2019). Human Factors & Aviation Safety. Testimony to the United States House of Representatives. Hearing on Boeing 737-Max8 Crashes on December 11, 2019. Human Factors and Ergonomics Society.

[11] DNV (2015). Recommended Practice – Dynamic Positioning Vessel Design Philosophy Guideline. DNV-RP-E306. Edition July 2015.

[12] DNV (2012). Recommended Practice D102 - Failure Mode and Effect Analysis (FMEA) of Redundant Systems.

[13] J. T. Reason (1997). Managing the risks of organizational accidents.

Abbreviations and definitions

AC      Alternating current
AVR     Automatic Voltage Regulator
BMS     Battery Management system
BF      Beaufort (scale)
CAPEX   Capital Expenditures
DC      Direct Current
DG      Diesel generator
DP      Dynamic Positioning
DPA     Designated Person Ashore
ER      Engine Room
FAT     Factory Acceptance Test
FMEA    Failure Modes and Effects Analysis
GP      Generator Protection
HAZID   Hazard Identification
HIL     Hardware in the Loop, independent simulator-based testing
HOT     Human, Organization and Technology
HSEQ    Health, Safety, Environmental & Quality
HMI     Human Machine Interface
HRA     Human Reliability Analysis
HV      High Voltage
I/O     Input/output
IAS     Integrated Automation System
IEC     International Electrotechnical Commission
ISDS    Integrated Software Dependent Systems
ISO     International Organization for Standardization
IT      Information Technologies
JIP     Joint Industry Project
LV      Low Voltage
OPEX    Operating Expenses
OT      Operational Technologies
PLC     Programmable logic controller
PMS     Power Management System
RP      Redundant Propulsion
SMS     Safety Management System
SOTF    Switch-on-to-fault
SRtP    Safe Return to Port
TQ      Technology Qualification
UPS     Uninterruptible Power Supply


Availability: Ability to be in a state to perform as required (ISO 14224).

Blackout: Blackout situation occurs when there is a sudden loss of electric power in the main distribution
system and remains until the main source of power feeds the system. All means of starting by stored
energy are available (DNV Rules for Ships, Part 4, Chapter 8, January 2018).

Busbar: Low-impedance conductor to which several electric circuits can be separately connected
(IEC 61439-1).

Bus-tie breaker: Circuit breaker to sectionalize the busbar.

Circuit breaker: Mechanical switching device, capable of making, carrying and breaking currents under normal
circuit conditions and also making, carrying for a specific time and breaking currents under
specified abnormal conditions such as those of short circuit (IEC 60947).

Common cause/mode failure: Failures of multiple items, which would otherwise be considered independent of one another,
resulting from a single cause. Common cause failures can also be common mode failures.
Components that fail due to a shared cause normally fail in the same functional mode. The term
common mode is therefore sometimes used. It is, however, not considered to be a precise term for
communicating the characteristics that describe a common cause failure (ISO 14224).

Failure (of an item): Loss of ability to perform as required. A failure of an item is an event, as distinct from a fault of an
item, which is a state (ISO 14224).

Failure mode: The effect by which a failure is observed on the failed item [12].

(Single) Fault tolerance: (Single) fault tolerance is the ability of a system to function without interruption after a
single failure [11].

Hidden failure: A failure that is not immediately evident to operations and maintenance personnel.

Modes: The vessel operational mode specifies the high-level system set-up and redundancy design
intention for a specified set of vessel operations. Examples of vessel operations are transit,
position keeping, manoeuvring, etc.

Reliability: The probability that an item can perform a required function under given conditions for a given time
interval [11].

Redundancy: The existence of more than one means of performing a required function [11].

Separation: With reference to systems or equipment intended to provide redundancy. Reduce the number
of connections between systems to reduce the risk that failure effects may propagate from one
redundant system to the other [11].

Switchboard: A main switchboard is a switchboard directly supplied by the main source of electrical power or
power transformer and intended to distribute electrical energy to the vessel’s services (DNV Rules
for Ships, Part 4, Chapter 8, January 2018). An emergency switchboard is a switchboard, which in the
event of failure of the main electrical power supply system, is directly supplied by the emergency
source of electrical power and/or the transitional source of emergency power and is intended to
distribute electrical energy to the emergency power consumers (DNV Rules for Ships, Part 4, Chapter 8,
January 2018).


Appendix A:
Self-Assessment for blackout preparedness

1. UNDERSTANDING OF BLACKOUT

a. Are you familiar with what failures may cause blackout?
b. Are you familiar with the minimum regulatory requirements for blackout prevention and recovery (e.g. class, statutory)?
c. Are you familiar with how additional class notations may help to prevent blackout and ensure efficient recovery?
d. Do you know what the typical duration of blackout is before full propulsion is restored?

2. DEFINE THE ORGANIZATION’S SAFETY AMBITION AND MANAGE CONFLICTING GOALS

a. Has your organization defined a safety ambition for blackout or loss of propulsion?
b. Do you manage your organization’s conflicting goals?

3. ENSURE SAFE AND RELIABLE OPERATIONS OF FLEET

a. Where have you identified your critical operations?
b. Do your procedures define set-up of machinery/power system for critical operations?
c. Do you have additional protection measures implemented for closed bus-tie operations in critical operations?
d. Do your organization’s procedures support Master’s decision regarding critical operations (e.g. severe weather, close to shore)?
e. Is your crew familiar with the limitations of their systems and what to do in case manual blackout recovery is needed?
f. Do you have sufficient onshore technical expertise and support to assist in emergency situations on board?

4. ENSURE SAFE AND RELIABLE NEWBUILDS

a. Is integration testing of automation and software systems done during newbuilding process?
b. Is closed-bus operation considered in your newbuild specification?
c. Do you consider blackout recovery system functionality during newbuild specifications?
d. Do you apply principles of human-centred design for the design of man-machine interfaces and alarm management systems?


Appendix B:
Guidance for FMEA/FMECA analysis

Each topic below lists the potential hazards that should at least be adequately addressed, followed by the technical barriers.

Topic: Active and reactive load sharing
Potential hazards:
• Active power load sharing failure (e.g. caused by governor failure, insufficient, excess or unstable active power, fuel-rack failure, active-power or frequency sensor failures, signal failures, load-sharing line failures)
• Reactive power-load sharing failure (e.g. caused by AVR failure, insufficient, excess or unstable reactive power, reactive power-sensor failures, voltage-sensor failures, signal failures)
• Detection methods and actions to bring the system to a safe state with conditions and time responses
Technical barrier:
• Generator protection
• PMS upgrade

Topic: Consequences of voltage transients
Potential hazards:
• Reference to analysis of worst-case voltage dip (depth and duration) on healthy bus after short circuit on other bus (in closed tie-breaker operation)
• Document adequate voltage dip “ride-through” capability of necessary systems to remain in position: thruster drives, computer systems, networks, contactors, pumps, ventilation, and other auxiliaries.
Technical barrier:
• System optimization and tuning for entire protection strategy (HV, LV, GP, PMS, load reduction functionality inside converters)

Topic: Risk for simultaneous trip or load reduction of all thrusters
Potential hazards:
• Are there built-in protections in thruster variable speed drives that cause trip or load reduction? If yes, how is it ensured that not all thrusters are lost at the same time by the same trigger? Examples of such protection can be high/low voltage and/or frequency.
• Are there situations where all thrusters will reduce their power simultaneously to such a level that position cannot be maintained? Such as built-in load reduction functionality in drives that may reduce power to zero if one diesel engine fails to full speed.
Technical barrier:
• System optimization and tuning for entire protection strategy (HV, LV, GP, PMS, load reduction functionality inside converters)
• All protective functions included in the coordination study and mapped with the computer model used for transient state simulation

Topic: Ensure that no hidden failure renders it impossible to open tie-breaker from PMS or other protection devices
Potential hazards:
• Does the PMS have direct HW open command signals to both tie-breakers?
• Is it sufficiently ensured that tie-breaker is not in local mode during operation (e.g. clear indication of local/remote status on PMS GUI)?
• Include check of tie-breaker operability in procedures
Technical barrier:
• Redundant open command signals
• Fail safe system that trips breaker on wire break on open command signal
• Signal monitoring

Topic: Fault tolerance in PMS system
Potential hazards:
• How is it ensured that a single feedback failure to PMS does not cause the PMS to carry out actions that result in loss of position?
• Can, for instance, a single failure on feedback signal to PMS cause:
  • PMS to connect generator (or bus-tie) without synchronization?
  • Force full load reduction to all running thrusters simultaneously?
  • PMS to decrease generator frequencies to a level that causes risk of automatic load reduction of drives / tripping of drives?
  • PMS to increase frequency to a level that causes systems to trip?
  • PMS to jump to manual mode?
• Can single PMS operator failure cause blackout?
• Can one single PMS unit trip all generator breakers?
• Failure to start and connect
• Crash synchronization on connect
• Connection of a stopped generator
Technical barrier:
• I/O mapping fitted to nodes / field stations to define possible common mode failures
• PMS response during ride through, e.g. short circuits
• Protective function which results in feeder trip / bus-tie breaker trip
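The question of whether all thrusters can be lost by the same trigger can be screened with a simple comparison of drive protection settings. The sketch below is a hypothetical illustration: the drive names, protective functions and settings are invented, and identical settings are only flagged as candidates for review in the coordination study, not as findings.

```python
def common_trip_exposure(drive_settings):
    """Find protective functions on which every drive trips at the same setting.

    `drive_settings` maps a drive name to {function_name: (threshold, delay_s)}.
    A function shared by all drives with identical threshold and delay is a
    candidate for simultaneous loss of all thrusters.
    """
    drives = list(drive_settings.values())
    if not drives:
        return []
    shared = set(drives[0])
    for settings in drives[1:]:
        shared &= set(settings)
    return sorted(f for f in shared
                  if len({settings[f] for settings in drives}) == 1)

# Hypothetical settings: (threshold in % or Hz, delay in seconds).
drives = {
    "drive_A": {"undervoltage": (85, 0.5), "underfrequency": (47.0, 1.0)},
    "drive_B": {"undervoltage": (85, 0.5), "underfrequency": (46.5, 2.0)},
}
print(common_trip_exposure(drives))  # ['undervoltage']
```

A screening like this does not replace transient simulation, since drives with different settings may still trip together during a sufficiently severe event.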


Documentation, analysis and simulation

Topic: Documentation and verification of protection settings
• Is there protection functionality in PMS that can trip generator breakers and thus need to be included in discrimination analysis?
• Requires tables with settings of all protection equipment both in relays on breaker and in PMS
• As part of FMEA: verify all protection settings on breakers, not only short circuit by on board inspection.
• Discrimination analysis which includes protective functions fitted to HV system, LV system, PMS, BMS, converters and GP.
• Power system shall be discussed to identify applicable failure modes (FMEA).
• Failure mode shall be analyzed to identify effects on the overall system, including consideration of an entire protection strategy scheme which typically includes protection relays at the HV system, protective functions implemented to PMS / IAS, ability of speed governors and AVRs to stabilize the power parameters.
• Computer simulations which demonstrate ride through capability

Topic: Short circuit selectivity between bus-tie and generator breakers
• Is selectivity documented also for highest maximum short circuit current?
• Zero delay in bus-tie short-circuit protection?

Topic: Mode monitoring in PMS/IAS system
• Is there a warning/alarm if power system set-up is in conflict with defined prerequisite for the operational profile?

Topic: Loop monitoring (or similar)
• Loop monitoring needs to provide feedback to PMS, etc.

Topic: Bus-tie breaker shunt-trip, can this be used?
• Need to be able to open in case of voltage dip
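Mode monitoring, as asked about above, amounts to comparing the actual power system set-up with the prerequisites defined for the operational profile. The following is a minimal sketch under assumed names: the settings and values are hypothetical and do not come from any actual PMS or IAS.

```python
def mode_conflicts(required, actual):
    """Compare the actual power-system set-up with the prerequisites of a mode.

    Both arguments map a setting name to its value. Returns a list of
    alarm texts, one per deviation from the required set-up.
    """
    alarms = []
    for setting, wanted in required.items():
        found = actual.get(setting)
        if found != wanted:
            alarms.append(f"{setting}: required {wanted!r}, found {found!r}")
    return alarms

# Hypothetical prerequisites for a critical operation.
required = {
    "bus_tie": "open",
    "standby_generator_ready": True,
    "tie_breaker_control": "remote",
}
actual = {
    "bus_tie": "closed",
    "standby_generator_ready": True,
    "tie_breaker_control": "remote",
}
print(mode_conflicts(required, actual))
# ["bus_tie: required 'open', found 'closed'"]
```

Stating both the required and the found value in the alarm text is in line with the recommendation elsewhere in this paper to improve the quality of alarm texts.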

Appendix C:
Enhanced protection measures for closed-bus operations and
blackout recovery

Recommended protection measures for enhanced fault tolerance in closed-bus operations


1. Implement advanced generator protection (GP) to protect the power system against faulty fuel and voltage-control systems
a. Each generating set should be equipped with dedicated GP.
b. GP should give trip commands via hard-wired interface to the generator breaker and to the tie-breakers.
c. GP should be fitted with a dedicated set of interfaces and reference signals.
d. Signals from different secondary cores of feeder current transformers and independent transducers for GP and PMS should be used.

2. Use high-integrity switchboards, high-voltage switchboards
a. The main switchboard should be designed and prepared for possible short-circuit and earth-fault testing.
b. Depending on design, various measures may be implemented to increase the robustness of critical systems or components:
• Implementing additional physical protection of equipment
• Selecting high-end components with good performance and low failure rate
• Using single core bus-tie cables
• Implementing mechanical protection of bus-tie cables and insulated busbars in switchboards
• Implementing additional fire and flood monitoring systems


3. Reduce the risk of connection of non-synchronized power systems
a. Arrange two sync-check barriers in series. Such arrangements will imply a small residual risk for a non-synchronous closing to happen (e.g. in case of mechanical defects in breakers).
b. Establish a protection strategy against crash-synchronization failure through:
• Switch-on-to-fault (SOTF) protection, understanding that this function does not protect the system against severe consequences of crash synchronization.
• Analysis of typical function in the protection strategy scheme. Typically, crash synchronization results in high current and might be cleared on the short circuit protection fitted to DG feeder or bus-tie breakers.
• A transient state analysis to verify system behaviour during crash synchronization failures.

4. Reduce the risk of power management system (PMS) failures


a.  Monitor to prevent hidden failures in trip circuits for the generator and tie-breaker. 
b.  Implement high-integrity serial communication or direct HW open command signals to each generator and bus-tie breaker.
c.  Implement redundant open command signals to each generator and tie-breaker.
d.  Introduce clear indications of local/remote status of the tie-breakers.
e. Ensure autonomy and distribution of functionality.
f. Implement dual action functionality to prevent unintended acts of operation.
g.  Implement a mechanism for validation of feedback signals to PMS to prevent:
• Generator (or bus-tie) connection without synchronization
• Unintended load reduction of thrusters
• Decrease of generator frequencies to a level that increases the risk of automatic load reduction of drives and/or tripping of drives
• Increase in frequency to a level that causes systems to trip
h.  Minimize centralized control functions and signal and communication link connections across the redundancy groups.
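As an illustration of the feedback validation described in item g, the sketch below treats a breaker status as valid only when two complementary contacts agree, and treats a frequency feedback as valid only when two independent signals agree and lie in a plausible band. This is one possible approach under stated assumptions, not a prescribed method. All names, the 60 Hz nominal value and the tolerances are hypothetical.

```python
from enum import Enum


class BreakerState(Enum):
    OPEN = "open"
    CLOSED = "closed"
    INVALID = "invalid"


def breaker_state(closed_contact: bool, open_contact: bool) -> BreakerState:
    """Complementary contacts: exactly one of the two should be active."""
    if closed_contact and not open_contact:
        return BreakerState.CLOSED
    if open_contact and not closed_contact:
        return BreakerState.OPEN
    # Both active or both inactive indicates a wiring or I/O failure.
    return BreakerState.INVALID


def frequency_is_valid(primary_hz: float, secondary_hz: float,
                       nominal_hz: float = 60.0,
                       max_disagreement_hz: float = 0.5,
                       plausible_band_hz: float = 10.0) -> bool:
    """Accept the feedback only if both signals agree and are plausible.

    The tolerances are assumed values chosen for illustration.
    """
    if abs(primary_hz - secondary_hz) > max_disagreement_hz:
        return False
    return abs(primary_hz - nominal_hz) <= plausible_band_hz
```

A PMS that receives an invalid status could, for example, raise an alarm and withhold automatic actions such as load reduction until the signal is confirmed, rather than acting on a single failed input.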

5. Ensure power system stabilization


a.  Discuss the power system to identify applicable failure modes. 
b.  Analyze the failure modes to identify effects on the overall system and establish an encompassing protection strategy scheme.
c.  Discuss the phenomena that may interrupt the LV system (including auxiliary systems). 
d.  Examine the propulsion drives, including their safety functions. Since propulsion drives are typically each fitted with the same protective functions, an event that affects the entire power distribution system may cause the drives to trip on the same parameter.
e. Perform a coordination study that covers all essential systems (HV, LV, DC distributions and GP). 
f. Ensure that desktop studies are supported by computer simulations.  


Recommended protection measures for enhanced blackout recovery reliability


1.  Ensure switchboard sectioning and that each section has an autonomous blackout-recovery functionality
a.  Ensure that main switchboards are split into sections between each generator by means of bus couplers. A bus coupler can limit the effects of failures and possibly provide simplified and faster blackout recovery as well as more autonomous systems.
b.  Ensure that each section has an autonomous blackout-recovery functionality. This will limit the logic to only one section, which speeds up the recovery and reduces the risk of failure of recovery for the entire system.

2. Implement barriers against unintended automatic blackout recovery actions


a.  Ensure that no action can result in unnecessary blackout or partial blackout, e.g. in scenarios where the power system recovers from severe failures, which are followed by active and/or reactive power stabilization.
b.  Failure modes related to voltage and frequency shall be considered, including signal failures (I/O failures) and situations where the actual voltage and/or frequency deviates from normal values.
c.  The setpoints and protective functions in the PMS shall be coordinated with possible power oscillations (power oscillation and stabilization, and the expected stabilization time) to avoid spurious activation of protective functions or blackout detection.
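The coordination in item c can be sketched as a time-delayed blackout detection: the undervoltage condition must persist for longer than the expected stabilization time before a blackout is declared. The sketch below is a simplified illustration. The function names, the per-unit threshold, the margin and the sample format are assumptions.

```python
def detection_delay_is_coordinated(detection_delay_s: float,
                                   expected_stabilization_s: float,
                                   margin_s: float = 0.5) -> bool:
    """Check that the detection delay exceeds the stabilization time plus a margin."""
    return detection_delay_s >= expected_stabilization_s + margin_s


def blackout_detected(samples, threshold_pu: float, delay_s: float) -> bool:
    """Declare a blackout only if voltage stays below the threshold for delay_s.

    samples: iterable of (time_s, voltage_pu) tuples in time order.
    """
    below_since = None
    for time_s, voltage_pu in samples:
        if voltage_pu < threshold_pu:
            if below_since is None:
                below_since = time_s
            if time_s - below_since >= delay_s:
                return True
        else:
            # Voltage recovered: restart the timer.
            below_since = None
    return False
```

With a delay of 2.0 s, a dip that recovers within about one second is ignored, while a sustained loss of voltage is still detected.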

3. Ensure integrity of blackout recovery sequence


a.  Explicitly plan and discuss the blackout recovery process. Such an exercise should involve all vendors to plan and optimize the recovery sequence from blackout detection up to full recovery, including automatic start of propulsion. A blackout recovery sequence that requires a high level of permission signals exchanged between the control systems prolongs the process and increases the risk of sequence failure.

b.  Evaluate severe failures which could be considered as initial conditions prior to the blackout incidents, in terms of the protective functions that are implemented in the entire protection strategy system.
c.  This typically means that safety functions which are implemented in HV relays, control and safety systems in power generation sets, propulsion drives and other drives in the system should be evaluated to conclude whether they might set the system to a “trip and block” position and be a source of recovery failure. For power systems operating in closed-bus modes, such a condition would disable the blackout recovery sequence throughout the redundant groups.
d.  Ensure that systems that are blocked upon consecutive starts are not used for critical equipment.
e.  Implement an override functionality (preferably external) that disables the interlocks that prevent blackout recovery for the scenarios where power systems cannot promptly be recovered.
f. Reduce the need for manual actions that could delay the recovery process. 
g. See checklist in Appendix F for full blackout recovery test.

Appendix D:
Enhanced system integration and verification for newbuilds
Recommendations and best practices for improving human performance through human-centred system design
Ensure human-centred system design
a.  Ensure that the design process of setting requirements to technical functionalities and creating human-machine interfaces adheres to the principles of human-centred design and that the result is compatible with basic human capabilities (ref ISO 9241-210:2010).

b.  Ensure close cooperation between designers and employees with recent operational experience. 

c.  Take an active part in defining the system’s design criteria and apply the principles of user-centred design in the procurement process. This includes updating the safety management system (SMS) with the continuous and iterative improvement process of the alert management system.
d.  Cooperate with competence and experience in operations (HSEQ and Masters) to set the requirements for the technical functionality and interface of equipment.

e. Rationalize the alarms and improve the quality of alarm texts through a process of human-centred design. 


Recommendations and best practices for improved integration, testing and verification
1. Engage a system integrator
a.  Ensure that the system integrator is responsible for integrating all components of the system, applying and advocating the principles of human-centred design, being a driver for reducing the number of alerts and being responsible for managing the improvement process of the alarm management system during operations.
b.  The equipment manufacturer should deliver equipment in accordance with the requirements that are set by the system integrator and the system logic.

2. Perform early-phase assessments


a.  Review documentation of the vessel in order to mitigate any design flaws. 
b.  Perform a Hazard Identification (HAZID) study of automation integration to enable the vendors to agree on who is responsible for what functionality, especially in the interface between complex safety-critical systems.

3. Perform comprehensive verification and validation of systems


a.  Perform failure mode and effect analysis (FMEA or FMECA).
b.  The objective of failure mode and effects analysis of power and propulsion systems is to provide objective evidence of required redundancy and fault tolerance.
c.  The FMEA should address all operational modes of the vessel for which it is intended to be valid.
d.  For each of the vessel’s operational modes, the technical system configuration shall be assessed and prerequisites for achieving the required failure tolerance and redundancy shall be included.
e.  Get insight into the suppliers’ tests.
f.  Multiple tests (e.g. software-module tests, performance tests, FMEA tests, etc.) are normally performed by the supplier before the system is brought on board the vessel. However, the yard and owner have little insight into the extent and results of these tests. It is recommended that the yard or owner asks to get insight into the tests that have been performed, and it is also possible to ask a 3rd party to verify the sufficiency of the performed tests.
g. Perform integration testing.

4. Use a well-defined and transparent software-development and delivery process


a.  Know what to expect:
By utilizing a development and delivery process that encompasses all stakeholders, such as the supplier, yard and owner, all parties know what to expect and how to control their part. It also makes it easier for the different stakeholders to understand what to expect from the others. There are several relevant standard processes available, and DNV has also published one in DNVGL-OS-D203 (Integrated Software Dependent Systems [ISDS]).
b.  Ensure process adherence:
A 3rd party may follow up whether the defined and agreed processes are followed by all parties. This ensures that the different organizations keep focus on the agreed way of working.

5. Apply a change management procedure for key parameters and system configurations
a.  Identify the key parameters:
The key parameters of the system should be identified and agreed.
b.  Analyse key parameters before changes are made:
Some parameters may affect the performance of the whole system and should not be changed until the change has been agreed between the owner/operator and the supplier in question.
c.  Verify key parameters after software changes:
After a software update has been performed, the key parameters should be verified before the system is brought back into operation. If the update introduces or removes parameters, the list of key parameters should be revised.
d.  Implement changes to software between FAT and vessel delivery under strict change management:
After FAT, the software should be under version control. Both supplier and system integrator should have full transparency into the changes being made.
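The verification in item c can be supported by a simple comparison of the agreed key-parameter list against the values read back after a software update. The following is a minimal sketch. The function name and the parameter names used in the usage note are hypothetical, and how parameters are exported from a real system is not covered here.

```python
def compare_key_parameters(baseline: dict, current: dict) -> dict:
    """Compare agreed key parameters with values read back after an update.

    Returns the parameters whose value changed, plus parameters that the
    update removed or introduced (which means the key-parameter list
    itself should be revised).
    """
    shared = baseline.keys() & current.keys()
    changed = {
        name: (baseline[name], current[name])
        for name in shared
        if baseline[name] != current[name]
    }
    return {
        "changed": changed,
        "removed": sorted(baseline.keys() - current.keys()),
        "added": sorted(current.keys() - baseline.keys()),
    }
```

For example, comparing a baseline of `{"standby_start_load_pct": 85, "uv_trip_delay_s": 2.0}` with a read-back of `{"standby_start_load_pct": 90, "blackout_detect_delay_s": 3.0}` reports one changed value, one removed parameter and one added parameter. Under the change procedure above, none of these should go into operation without agreement.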

6. Consider relevant rules and guidance


a.  Approval of manufacturers regarding system and software engineering (described in DNVGL-CP-0507), and DNV’s class notations: Integrated Software Dependent Systems (ISDS) (described in DNVGL-OS-D203), Redundant Propulsion (RP), Cyber Secure and Enhanced System Verification (ESV).


Appendix E:
Enhanced blackout prevention test

Test of load management – power management system (PMS) functions 

Power system disturbance caused by loss of one diesel generator (DG):
1. Set up the power system with two DGs, for instance, operating with typical and realistic load. Power system should be set up according to operating profile, e.g. closed bus.
2. Set remaining DGs to standby start.
3. Trip one DG and verify safety functions like load shedding, load reduction, phase back system.
4. Verify that remaining DG can withstand load increase with no spurious trip of tie-breakers or loss of essential and important consumers.

Power system disturbance followed by loss of big consumer:
1. Set up the power system with two DGs, for instance, operating with highest possible load (i.e. slightly below stand-by start setpoint). Power system should be set up according to operating profile, e.g. closed bus.
2. Load the power system as much as possible and trip a large consumer (e.g. propulsion).
3. Verify that speed and frequency increase does not cause spurious trip of DGs.
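Before the DG-trip test, it can be useful to estimate how much load the safety functions (load shedding, load reduction, phase back) must remove for the remaining DG to cope. The sketch below is a simple steady-state calculation with hypothetical ratings. It ignores transient behaviour, which is what the test itself is meant to reveal.

```python
def required_load_reduction_kw(total_load_kw: float,
                               online_ratings_kw: list,
                               tripped_index: int,
                               max_loading: float = 1.0) -> float:
    """Load that must be shed or reduced after one online DG trips.

    max_loading is the assumed permissible loading of the remaining DGs
    as a fraction of their rating.
    """
    remaining_kw = sum(
        rating for i, rating in enumerate(online_ratings_kw)
        if i != tripped_index
    )
    return max(0.0, total_load_kw - remaining_kw * max_loading)
```

For instance, with two assumed 5000 kW DGs sharing 7000 kW, tripping one leaves 5000 kW of capacity, so 2000 kW must be removed quickly enough to avoid overloading the remaining DG. At 4000 kW total load, no reduction is needed.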

Test of voltage ride through capabilities (i.e. power system response to voltage dip caused by short circuit) 

Note: Most systems will have equipment that has difficulty riding through a short period of reduced voltage, e.g. frequency converters, motor starters, circuit breakers with undervoltage protection, power supplies, any PLC system without battery backup, changeover systems, etc.

Note that quick opening and closing of feeder or tie-breakers might be interlocked and not easily accessible in the HV systems. 

Consequence of a short circuit at a high level in the power system will be a voltage dip.
Expected voltage dip time applicable in the system shall be verified prior to the test.
Test can be conducted in different ways, such as:
• Quickly opening and closing feeder breakers to an essential consumer (e.g. propulsion thruster)
• Opening and closing bus ties to switchboard sections without generators connected
• Opening one generator breaker and quickly closing another
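As preparation for the ride-through test, the expected voltage dip time can be compared against the ride-through capability of each piece of equipment, to predict which items are likely to drop out. The sketch below assumes that a ride-through time is known for each item. The function name, equipment names and figures in the usage note are hypothetical.

```python
def equipment_at_risk(expected_dip_s: float, ride_through_s: dict) -> list:
    """Return equipment whose ride-through time is shorter than the expected dip."""
    return sorted(
        name for name, capability_s in ride_through_s.items()
        if capability_s < expected_dip_s
    )
```

With an assumed dip of 0.5 s and capabilities of `{"frequency converter": 0.2, "motor starter": 1.0, "PLC power supply": 0.05}`, the frequency converter and the PLC power supply would be flagged for attention before the test.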

Test of power system response to active and reactive load oscillations  

• The tests below should be arranged in closed-bus mode. This is to verify the impact of power imbalance on redundant systems.
• Tests should be arranged with the minimum operating set-up, which is typically two DGs online (connected to redundant systems).
• Power system set-up shall be agreed prior to the test. The test shall be documented by plots, records and any other means that allow the results to be verified and the failure mechanism to be reproduced (if the test fails).

Speed-governor failures:
a. Disconnect communication between fuel regulator or load-control system. Do a significant load change.
b. Apply a speed increase to the diesel by pulling fuel rack or connecting a laptop to the governor.
c. Apply a speed decrease to the diesel by pulling fuel rack or connecting a laptop to the governor.

Automatic voltage regulator (AVR) failures:
a. Disconnect communication between AVRs if it exists. Do a significant load change.
b. Disconnect the voltage sensing to the AVR (usually done by shorting the current transformer [CT]).
c. Apply a voltage increase to the generator by connecting a simulator to the AVR.
d. Apply a voltage decrease to the generator by disconnecting the AVR or by connecting a simulator/laptop.


Appendix F:
Enhanced blackout recovery test

1. Key principles of blackout-recovery testing procedures


a.  The main and emergency power systems can be tested once all the prerequisites are checked and the vessel is ready for the full blackout test.
b.  A test is considered successful when all the equipment is recovered and healthy. Any recovery failures shall be investigated and once the fault is diagnosed and rectified, the test shall be performed again.
c.  A test should not be repeated before the root cause of a failed test is rectified, as this will only show the random response to failure.
d.  Implement regular blackout tests, where the blackout incident is initiated by different conditions to verify system response triggered by different circumstances and to expose crew to various scenarios.
e.  Regular blackout testing can be combined with Redundant Propulsion – Failure mode and effect analysis (RP FMEA) tests or Safe Return to Port (SRtP) casualty threshold testing.
f.  Create trends to observe the power plant condition, timing between the interlocks and permission signals, and the exact time that the specific components need to recover and get ready for operation.
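The trending in item f can start from a time-stamped event log recorded during each blackout test. The sketch below computes the time from blackout detection to each subsequent milestone, so that results can be compared from one test to the next. The event names and the log format are assumptions, not a defined interface.

```python
def recovery_times(events):
    """Seconds from blackout detection to each later event.

    events: list of (timestamp_s, name) tuples. The log must contain one
    'blackout_detected' event; otherwise StopIteration is raised.
    """
    t0 = next(t for t, name in events if name == "blackout_detected")
    return {
        name: t - t0
        for t, name in events
        if name != "blackout_detected" and t >= t0
    }


def slower_than_baseline(baseline: dict, latest: dict, tolerance_s: float = 0.0) -> list:
    """Milestones that took longer in the latest test than in the baseline."""
    return sorted(
        name for name in baseline.keys() & latest.keys()
        if latest[name] > baseline[name] + tolerance_s
    )
```

Comparing the latest test with an earlier baseline highlights components whose recovery is getting slower, which may point to degraded equipment or changed interlock timing.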

2. Example of a simplified stepwise description of full blackout test


1.  Close the switchboard into one common system. All the thrusters (pods, conventional propulsion) shall be running.
2.  Set all generating sets to remote control and ensure they are enabled for automatic start. One diesel generator (DG) should be supplying all switchboards connected via bus-tie breakers.
3.  Make the DG trip. Good practice is to cause the trip by a different condition year by year, and to use a different generating set each year.
4.  Verify that the power system splits into independent power systems (typically on the undervoltage protection). Verify that the power system detects the blackout.
5a. Verify that the main power system starts the recovery process. In the fully automated systems, the power generation shall start up automatically and connect to the main switchboards (power systems that recover on smaller sections are more efficient and quicker and are less likely to fail during recovery).
5b. At the same time, verify that the emergency system detects the blackout and recovers in parallel to the main system. The emergency system shall be free of any time delays that allow the main system to take over the emergency system (e.g. due to operational reasons) and not reduce the time that is needed to energize the emergency services.
6.  Once the main switchboard is energized, re-energize all the auxiliary systems and allow propulsion drives to recover.
7.  If manual actions are needed for main system restoration, ensure that they are part of the on-board procedure. Ensure that on-board procedures are detailed and unambiguous. They should be supported with sketches and tags and location for circuit breakers or valves.

ABOUT DNV
We are the independent expert in risk management and quality assurance.
Driven by our purpose, to safeguard life, property and the environment,
we empower our customers and their stakeholders with facts and reliable
insights so that critical decisions can be made with confidence. As a trusted
voice for many of the world’s most successful organizations, we use our
knowledge to advance safety and performance, set industry benchmarks,
and inspire and invent solutions to tackle global transformations.

Regional Maritime Offices

Americas
1400 Ravello Drive
Katy, TX 77449
USA
Phone +1 281 3961000
houston.maritime@dnv.com

Greater China
1591 Hong Qiao Road
House No. 9
200336 Shanghai
China
Phone +86 21 3279 9000
marketing.rgc@dnv.com

North Europe
Thormøhlensgt. 49A, 5006 Bergen
Postbox 7400
5020 Bergen
Norway
Phone +47 55943600
north-europe.maritime@dnv.com

South East Europe, Middle East & Africa
5, Aitolikou Street
18545 Piraeus
Greece
Phone +30 210 4100200
piraeus@dnv.com

West Europe
Brooktorkai 18
20457 Hamburg
Germany
Phone +49 40 361495609
region.west-europe@dnv.com

Korea & Japan
8th Floor, Haeundae I-Park C1 Unit, 38, Marine city 2-ro, Haeundae-Gu
48120 Busan
Republic of Korea
Phone +82 51 6107700
busan.maritime.region@dnv.com

South East Asia, Pacific & India
16 Science Park Drive
118227 Singapore
Singapore
Phone +65 65 083750
singapore.maritime.fis@dnv.com

Disclaimer
All information is correct to the best of our knowledge. Contributions by external authors do not necessarily reflect the views of the editors and DNV AS.

DNV
Brooktorkai 18
20457 Hamburg
Germany
Phone +49 40 361400
www.dnv.com

DNV AS
NO-1322 Høvik
Norway
Phone +47 67 57 99 00
www.dnv.com
