A Review On Software Defined Networking As A Solution To Link Failures

Scientific African
journal homepage: www.elsevier.com/locate/sciaf
Editor name: Benjamin Gyampoh

Keywords: Software Defined Networking; OpenFlow; Link failure; Link recovery; SDN; Link failure recovery

Abstract

The rapid increase in internet users has resulted in significant growth of online traffic, hence putting more demand on the network. This has led to a high prevalence of link failures on the network. Upon failure, a network should be able to return to normal operational status within a considerable amount of time to avoid unavailability. This is done by employing link failure recovery strategies or mechanisms. A desirable recovery mechanism is one that enhances the reliability of the network, maintains low communication latency, and reduces the memory utilization of switching devices. In the common traditional network, it is a challenging task to determine the backup paths since the topology is static. Software Defined Networking (SDN) architecture separates the control plane from the data plane, thus enabling dynamic network configuration and programmability. Hence, SDN enables network engineers to create both proactive and reactive backup paths in the network controller. Link failure is considered one of the most current research challenges in computer networks. This review paper analyses and compares the literature on recent SDN link failure recovery strategies. The existing link failure recovery techniques, their challenges, limitations, performance criteria, and the technologies/tools mostly adopted are discussed, together with an outline of how they can be improved.
Introduction
Software-defined networking (SDN) offers an abstraction of the network, thus adding programmability to the network [1–4]. This
programmability of the SDN technology allows for more flexibility in the network since network administrators/engineers can now
develop network applications and incorporate them with ease [5–9]. Furthermore, the architecture of SDN enables a global view of
the network through the SDN controller [10–14]. The controller is the network’s brain and the most critical component [15–17]. The
SDN Controller is the intelligence and decision-making section of the network [18–20]. The capability of viewing the entire network
through a central point provides for easy design and implementation of essential solutions such as link failure, traffic engineering,
and load-balancing [21–26].
One of the key elements of a smooth network operation is the ability to manage link failures by responding accordingly as soon
as there is a link problem [27,28]. Link failures occur when components of the network are unable to correctly deliver services. A
failure may be a consequence of device malfunction, bottleneck, cable cut, or connection breakdown due to some malicious activities
∗ Corresponding author.
E-mail addresses: [email protected] (T. Semong), [email protected] (A.M. Zungeru).
https://doi.org/10.1016/j.sciaf.2023.e01865
Received 9 February 2023; Received in revised form 14 August 2023; Accepted 15 August 2023
Available online 19 August 2023
2468-2276/© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Table 1
Comparison of this paper and other SDN link failure-related review papers.

| Review paper | Year | Publisher | SDN/OpenFlow overview | Metrics used to gauge performance | New literature | Failure location |
|---|---|---|---|---|---|---|
| [3] | 2020 | Elsevier | × | ✓ | × | ✓ |
| [12] | 2020 | MDPI | ✓ | × | × | × |
| [25] | 2021 | Elsevier | ✓ | × | ✓ | × |
| [29] | 2017 | IEEE | ✓ | × | × | ✓ |
| [30] | 2015 | IEEE | ✓ | × | × | × |
| [34] | 2015 | Springer | ✓ | × | × | × |
| [44] | 2019 | IEEE | ✓ | × | × | ✓ |
| [45] | 2018 | Elsevier | ✓ | ✓ | × | × |
| [46] | 2019 | IEEE | × | ✓ | × | × |
| [47] | 2021 | Elsevier | × | ✓ | ✓ | × |
| [48] | 2019 | IEEE | × | ✓ | × | × |
| [49] | 2016 | Hindawi | ✓ | × | × | × |
| Proposed paper | – | – | ✓ | ✓ | ✓ | ✓ |
on the network. Such failures can be detrimental to the overall performance of the network because they result in serious disruptions in
communication [29,30]. There are several studies on link failure including legacy networks [31–34], network function virtualization
(NFV) [35–37], and SDN networks, which is the focus of this paper.
A good link failure recovery strategy should aim to achieve faster detection, recovery, and good management of network
congestion. Reasonable mitigation of a link failure should take no more than 50 ms [38]. Generally, the strategies employed in the literature to mitigate against link failures can be divided into two types: reactive and proactive recovery. A recovery technique is considered proactive if it is capable of predicting a failure and activating a recovery plan before the failure occurs; otherwise, it is considered reactive.
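To make the distinction concrete, the following minimal Python sketch (with hypothetical class, topology, and flow interfaces that are not drawn from any cited scheme) contrasts the two recovery modes:

```python
# Hedged sketch: hypothetical interfaces, illustrating reactive vs. proactive recovery.

class ReactiveRecovery:
    """Compute a backup path only after a failure is reported."""
    def __init__(self, topology):
        self.topology = topology

    def on_link_failure(self, failed_link, flow):
        # Path computation happens at failure time, so it is counted
        # in the recovery delay.
        path = self.topology.shortest_path(flow.src, flow.dst, exclude=failed_link)
        flow.install(path)


class ProactiveRecovery:
    """Pre-compute (and typically pre-install) backup paths before any failure."""
    def __init__(self, topology):
        self.topology = topology
        self.backup = {}  # (link, flow) -> pre-computed backup path

    def precompute(self, flow, link):
        self.backup[(link, flow)] = self.topology.shortest_path(
            flow.src, flow.dst, exclude=link)

    def on_link_failure(self, failed_link, flow):
        # Only a table lookup at failure time: fast recovery, at the
        # cost of switch memory for the pre-installed backup rules.
        flow.install(self.backup[(failed_link, flow)])
```

The trade-off visible in the sketch (computation time at failure versus memory for pre-installed rules) recurs throughout the schemes reviewed below.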
SDN technology in conjunction with OpenFlow protocol [39,40] offers a good platform for link failure recovery solutions with
the aforementioned characteristics. This is because flow entries in SDN can be installed in either reactive or proactive mode, unlike
those in traditional IP networks. Since the SDN controller is positioned in such a way that it has a global view of the network,
it can intelligently allocate the network resources and manage the network more efficiently [41,42]. Furthermore, because the
traffic is routed using the flow entries installed into the switches in SDN, it becomes easy for network traffic to be divided and
rescheduled [26,43].
Table 1 illustrates the comparison between this paper and other closely related review papers in SDN failure
recovery. The table columns indicate the survey papers, year of publication, the publisher of the article, whether the survey indicates
the metrics used to measure performance on the articles, whether the survey citation includes recently published articles, and finally,
whether the survey paper indicates the failure locations on the SDN planes. In contrast to other similar review articles [12,30,34,44–
48], this work discusses fairly new research work in literature and states the location of the SDN failure on the proposed techniques,
as well as outlining the metrics used to gauge performance. This paper also discusses the tools/technologies that are commonly
adopted by researchers.
To demonstrate the effectiveness of SDN technology in link failure recovery and management, the authors review and analyse
the recent literature. The contributions of this review paper are outlined as follows:
• This paper carries out a comprehensive survey on link failure recovery schemes and algorithms for SDN. We shall classify the
schemes as either reactive or proactive.
• This paper is compared to other related reviews and states how it differs from the existing ones.
• This paper discusses a multi-objective optimization model to formalize the link failure recovery problem in SDNs.
• This paper summarizes the mathematical tools, emulators, performance metrics, and technologies used to implement the
proposed solutions.
• Finally, this paper suggests further improvements in the existing techniques and future research directions.
The rest of this paper is organized as follows. In Section ‘‘SDN and OpenFlow architectures’’, the SDN and OpenFlow architectures are discussed, and the paper shows how the two architectures contribute to effective link failure management. Section ‘‘SDN link failure recovery taxonomy’’ reviews the literature on link-failure recovery; the techniques are summarized based on research challenges, and future research directions are provided. The performance metrics and implementation tools are discussed in Section ‘‘Performance metrics and implementation tools’’. Finally, conclusions and future work directions are given in Section ‘‘Conclusions’’.
SDN and OpenFlow architectures

The SDN architecture consists of three (3) distinct hierarchical planes, which are the application, control and data planes, as shown in Fig. 1. The data plane comprises all the network hardware infrastructure such as routers, switches, etc. On top of the
data plane resides the network control plane, which could be considered the brain of the SDN architecture. The control plane is
made up of a centralized controller that computes and communicates network policies to all devices in the data plane. A controller
could be a single server machine or a farm of servers. Lastly, the application plane comprises all individual applications such as
load balancers, traffic engineering, link failure recovery, voice over IP, network monitoring utilities that are designed to fine-tune
the network operations, and any 3rd party applications [50,51].
Communication between the SDN controller and the application plane is through the northbound application programming interfaces (APIs) [49,52,53], whereas communication between devices in the data plane and the SDN controller is done via the southbound APIs. The most commonly adopted southbound API is the OpenFlow protocol. OpenFlow was first introduced and made
a standard by the Open Networking Foundation (ONF) [40], and has since been widely adopted, for example by companies like
Cisco, Huawei, HP, Facebook, and Google [54–56], just to mention a few.
Although widely adopted, OpenFlow is not the only protocol option that can be used to manage the communication between
the SDN controller and southbound APIs. There are other available southbound API protocols like Forwarding and Control Element
Separation (ForCES) [57–59], that can be utilized. OpenFlow is mostly adopted because it takes advantage of the fact that most
modern Ethernet routers and switches contain flow tables for essential networking functions, such as statistical analysis of data
streams, routing, firewall protection, and subnetting. The main concept of SDN is separating the control of the network to ease management, laying the foundation for logically centralized control and network programmability. ForCES is not widely adopted because of the lack of open-source support and a business model considered disruptive when compared to OpenFlow's. More
comprehensive comparisons between ForCES and OpenFlow protocols have been discussed in [53]. The other protocols that
can be used for the exchange of information between control and data plane, although not very common, include SoftRouter
architecture [60,61] and Path Computation Element (PCE) protocol [62–65].
Fig. 2. A link failure example with 9 OpenFlow switches and 3 end-devices in software-defined networking.
In traditional IP networks, devices like switches do not have programmability features; they come with fixed firmware and functions developed by the manufacturers in a closed manner. Network engineers therefore cannot experiment with these devices, as network innovations are mostly restricted; they can only configure the devices according to the manufacturer's handbooks. SDN
brings programmability that creates space for creativity and innovation in the network. The network programmability aspect of
SDN allows for the development of applications that automate and manage the connectivity, reliability, and flexibility of the network.
This enables network engineers to develop and manage networks with a huge number of devices, different topologies, link failures,
quality of service (QoS), and different traffic flow policies using high-level programming languages [66].
In the SDN architecture, the switches are just forwarding devices connected to an SDN controller. The controller manages the actions of the switches and the entire network. This simply means that a switch cannot act on its own; it must contact the controller to perform any action [67–70]. Therefore, it is the SDN controller that manages the switch and dictates which action to execute in case of any link or network failures. Since the SDN controller manages the entire network, it enables the virtualization of the network, and that ensures that better decisions are made with respect to link failure recoveries, as illustrated by the controller sketch and the network model example below.
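As a concrete illustration of this controller-driven behaviour, the following is a minimal sketch of a Ryu application (Ryu is one of the controllers surveyed later in this paper) that reacts to OpenFlow port-status events signalling a link going down; the actual recomputation of backup paths is left as a placeholder:

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class LinkFailureMonitor(app_manager.RyuApp):
    """Minimal sketch: log link-down events reported by the switches."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPortStatus, MAIN_DISPATCHER)
    def port_status_handler(self, ev):
        msg = ev.msg
        dp = msg.datapath
        ofp = dp.ofproto
        if msg.reason == ofp.OFPPR_MODIFY and (msg.desc.state & ofp.OFPPS_LINK_DOWN):
            self.logger.info("link down on switch %s, port %d",
                             dp.id, msg.desc.port_no)
            # A real application would now recompute the affected paths
            # and push updated flow entries to the switches.
```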
Network model
Let an SDN be modeled as a graph G = (V, E), as illustrated in Fig. 2, where V and E denote the sets of nodes (switches) and links, respectively. The
network consists of an SDN Controller, 9 switches, and 3 end devices.
In this case, switches 4 and 6 are regarded as core switches as they are responsible for forwarding network flows. The other
switches are edge switches as they can directly connect to end devices. In the same manner, the links are also differentiated according
to edge links and core links. Links like (switch 1, switch 4) or (switch 4, switch 6) are classified as core links because they connect to the core switches. Examples of edge links are (server, switch 9) or (switch 1, host 2), which are basically the
connections between the switches and the end nodes.
Let the preferred path from the server to host 2 be [(server, switch 9)-(switch 9, switch 6)-(switch 6, switch 4)-(switch 4, switch
1)-(switch 1, host 2)]. We can analogously define the path from the server to host 1 via the link (switch 6, switch 4). Assuming that
there is a failure on the link (switch 4, switch 6), as depicted in Fig. 2, the paths from the server to both host 1 and host 2 will be inaccessible.
Table 2
Summary of notation and definitions.

| Notation | Definition |
|---|---|
| $D_{(max)}$ | Upper limit of the delay requirement |
| $B_{(min)}$ | Lower limit of the bandwidth requirement |
| $F_{(b,w)}$ | Available bandwidth capacity of the backup path |
| $T_{(max)}$ | Upper limit of the recovery time |
| $T_{(r,t)}$ | Total time from link failure to recovery |
| $F_{(t,d)}$ | Time delay of the backup path |
| $n_{(u)}$ | Node $u$ |
| $n_{(u,v)}$ | A link from node $u$ to node $v$ |
| $d_{(u,v)}$ | The delay of the link from node $u$ to node $v$ |
| $b_{(u,v)}$ | The available bandwidth of the link from node $u$ to node $v$ |
| $M_{(r,b)}$ | Total memory of backup forwarding rules for recovery |
| $B_{(s,d)}$ | The calculated backup path from a source node $s$ to a destination node $d$ |
| $t_{ru}$ | The time taken by the SDN controller to install the backup rules into node $u$ |
| $p_{bu}$ | The number of backup entries installed into node $u$ |
| $k_m$ | A unit memory space for the forwarding rules |
The SDN controller in G knows the whole network topology. The controller must be able to intelligently reroute
the traffic to other available paths, in order for the network to recover from the link failure. The SDN controller can utilize link
discovery protocols like LLDP [69,71–73] to learn and get real-time updates about all the dynamics of the network topology. The
controller continuously updates the flow paths in the network and places forwarding rules into the flow tables.
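For illustration, the controller-side recomputation can be sketched with the networkx library; the edge set below only approximates the topology of Fig. 2 and is illustrative:

```python
import networkx as nx

# Approximate topology of Fig. 2 (illustrative edge set).
G = nx.Graph()
G.add_edges_from([
    ("s1", "s4"), ("s4", "s6"), ("s6", "s9"),               # core links
    ("s1", "s2"), ("s2", "s5"), ("s5", "s8"), ("s8", "s9"),
    ("host2", "s1"), ("server", "s9"),                      # edge links
])

# The primary path from the server to host 2 traverses the core link (s4, s6).
primary = nx.shortest_path(G, "server", "host2")

# On failure of (s4, s6), recompute on the residual graph.
G.remove_edge("s4", "s6")
backup = nx.shortest_path(G, "server", "host2")
print("primary:", primary)
print("backup: ", backup)
```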
To explain some of the factors that a controller must take into consideration when computing backup paths, let $N_v$ denote the set of all neighbour nodes of node $v$ in G, with $u \in N_v$ if $(u, v)$ is a link from $u$ to $v$. Assume that there is a link failure on link $n_{(u,v)} \in E$ in G. To compute the time taken to calculate the backup paths and install them, consider the notation in Table 2.
The total link recovery time $T_{(r,t)}$ is computed by Eq. (1):

$$T_{(r,t)} = \sum_{n_{(u)} \in B_{(s,d)}} t_{ru} \tag{1}$$
After the detection of the failure, the controller calculates the backup paths and then updates the forwarding rules for the switches. The controller must also pay attention to the links' bandwidth, communication delay, and the memory space used for quality of
service (QoS), as it calculates the backup paths. Different applications may have different delay and bandwidth requirements and
the controller must be able to manage these resources accordingly. The delay 𝐹(𝑡,𝑑) of 𝐵(𝑠,𝑑) during the failure recovery process can
be computed by Eq. (2).
$$F_{(t,d)} = \sum_{n_{(u,v)} \in B_{(s,d)}} d_{(u,v)} \tag{2}$$
where $d_{(u,v)}$, $n_{(u,v)}$, and $B_{(s,d)}$ are as defined in Table 2. The available bandwidth of the backup path $B_{(s,d)}$, denoted by $F_{(b,w)}$, can be calculated as follows.
$$F_{(b,w)} = \min\left\{\, b_{(u,v)} \;\middle|\; \forall\, n_{(u,v)} \in B_{(s,d)} \,\right\} \tag{3}$$
where $b_{(u,v)}$ represents the bandwidth available on link $n_{(u,v)}$, which is part of the calculated backup path $B_{(s,d)}$. The memory space needed for installing the backup traffic flows, denoted by $M_{(r,b)}$, is given by Eq. (4).
$$M_{(r,b)} = \sum_{n_{(u)} \in B_{(s,d)}} k_m \, p_{bu} \tag{4}$$
with $k_m$ and $p_{bu}$ as defined in Table 2. The calculations in Eqs. (1) to (4) are subject to the following constraints.
Constraints:

$$F_{(t,d)} \le D_{(max)} \tag{5}$$
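A small worked example may help fix the notation. The Python sketch below evaluates Eqs. (1) to (4) for a hypothetical backup path with illustrative numbers, and then checks the delay constraint of Eq. (5) together with the analogous bandwidth and recovery-time limits implied by $B_{(min)}$ and $T_{(max)}$ in Table 2:

```python
# Hypothetical backup path B(s,d): s9 -> s8 -> s5 -> s2 -> s1 (all numbers illustrative).

# Per-link delay d(u,v) in ms and available bandwidth b(u,v) in Mbps.
links = {("s9", "s8"): (2.0, 80), ("s8", "s5"): (1.5, 100),
         ("s5", "s2"): (2.5, 60), ("s2", "s1"): (1.0, 90)}
t_ru = {"s9": 3.0, "s8": 2.0, "s5": 2.0, "s2": 2.5, "s1": 3.0}  # rule-install time per node (ms)
p_bu = {"s9": 2, "s8": 1, "s5": 1, "s2": 1, "s1": 2}            # backup entries per node
k_m = 0.3                                                       # memory per rule (KB)

T_rt = sum(t_ru.values())                 # Eq. (1): total recovery time  -> 12.5 ms
F_td = sum(d for d, _ in links.values())  # Eq. (2): backup-path delay    -> 7.0 ms
F_bw = min(b for _, b in links.values())  # Eq. (3): bottleneck bandwidth -> 60 Mbps
M_rb = k_m * sum(p_bu.values())           # Eq. (4): backup-rule memory   -> 2.1 KB

D_max, B_min, T_max = 10.0, 50, 20.0      # illustrative QoS requirements
assert F_td <= D_max and F_bw >= B_min and T_rt <= T_max  # Eq. (5) and analogues
```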
Fig. 3. The system architecture for the controller’s link failure recovery manager module.
Link failure recovery system architecture

The link failure recovery architecture based on the OpenFlow protocol, presented in Fig. 3, is the most widely implemented in the literature. The architecture is designed in such a way that the three bottom modules reside in the switches while the link failure recovery manager module resides in the controller. Based on this system architecture, the controller's responsibilities include calculating backup paths in advance and installing them in the switches. The switches are responsible for creating renewal packets that ensure that flows always reach their destination and that existing backup paths are kept alive. When the switches detect a failed link, they remove all the entries that are associated with that link. When a failed link becomes available again, the switches inform the controller so that the best path can be recalculated.
Packet renewal
According to the OpenFlow protocol specifications, every flow entry has hard timeout and idle (inactivity) timeout properties. Let the idle timeout be n milliseconds: if there is no traffic matching a flow entry for n milliseconds, that entry is removed from the switch's flow table. Consequently, the entries for the alternative paths would also be deleted after n milliseconds if there is no link failure. To mitigate against this, renewal packets with a specific format are generated and periodically forwarded by the switches to keep alive the alternative paths that diverge from the current working path. A switch identifies a renewal packet by a specific field, which ensures that the packet is not delivered to its apparent destination.
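A minimal sketch of this renewal mechanism is shown below. The send_packet callback, the choice of an IEEE local-experimental EtherType as the marking field, and the renewal period are assumptions made for illustration; they are not prescribed by the OpenFlow specification:

```python
import threading

# Marker used as the "specific field": an IEEE local-experimental EtherType,
# so switches can recognise and drop renewal packets at the last hop.
RENEWAL_ETHERTYPE = 0x88B5

def run_renewal(send_packet, backup_ports, idle_timeout_ms, stop):
    """Periodically emit marker packets on each backup port so the
    corresponding flow entries are refreshed before they idle out."""
    period_s = idle_timeout_ms / 2000.0  # renew twice per timeout window
    while not stop.is_set():
        for port in backup_ports:
            send_packet(port, ethertype=RENEWAL_ETHERTYPE, payload=b"renew")
        stop.wait(period_s)

# Usage sketch: stop = threading.Event(); run_renewal(tx, [2, 3], 5000, stop)
```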
Reinstatement of a path
After the restoration of a failed link, flow traffic can be forwarded using the original link again, provided it is the shortest available path. Without intervention, however, the switches would continue to use the alternative path entries without consulting the controller for a working-path re-computation. This situation is avoided by ensuring that when switches become connected through a recently failed link, they forward reinstatement packets for each flow. The controller will then calculate the working path and the alternative path and install them in the switches' flow tables.
SDN link failure recovery taxonomy

In this section, a summary of the literature on SDN link failure recovery techniques is provided. The available techniques are classified into two (2) categories: those that are reactive and those that are proactive [24]. The section ends with a summary of the advantages and disadvantages of link failure recovery in Table 3. Table 3 also classifies the various link failure strategies as reactive or proactive approaches. The reactive approach usually incurs a longer recovery time because a solution is sought only after the failure has occurred, while the proactive scheme typically under-utilizes resources [43]. In Table 4, the paper points out some future research directions that should be considered to improve the existing solutions.
In dealing with link failure and mitigation for wireless mesh networks (WMN) [74], the authors present a two-fold approach.
The approach is such that in the first instance they can predict link failure in both the control and data planes. It is worth noting
that link failure in WMN could be a result of, but not limited to, radio fading, link noise, node mobility, node deafness and traffic
congestion. The signal-to-noise ratio (SNR) is used to measure the link quality and predict node mobility. The signal used is the
received signal strength (RSS). Since RSS measurements are normally accompanied by background noise, the length of the monitored period impacts the prediction; hence, the authors advocate for longer monitoring of the SNR to improve prediction accuracy, at the cost of higher delays. To smooth out the fluctuations of the measured SNR and obtain precise state estimations, the authors use Kalman filters; we refer the reader to [75] for a thorough exposition on Kalman filters. The filtered SNR signal has an approximately
linear relationship with sender–receiver distance, hence an increase (decrease) in the SNR values represents the increase (decrease)
in the two nodes’ distance. For link failure prediction in the data plane, patterns of measured SNR values from the link failure node
due to mobility are compared with patterns of the measured SNR data of the current link. If the patterns are similar then the link
is likely to fail.
To improve the link mobility failure prediction, when a link failure is detected in the data plane, the data is passed on to the controller, where the SNR values of neighbouring nodes are used to confirm mobility. Since the controller has access to multiple sets of
SNR measurements, the prediction is more accurate and is done using support vector machines (SVM) [76]. Having predicted the
link node mobility, the second part then focuses on finding an alternative routing path. The path is computed with the following
considerations: the shortest path distance, lowest control overheads and least impact on other already scheduled traffic. As soon
as the link failure is predicted at the control plane, an alternative route(s) is computed and a re-routing command is sent to the
corresponding nodes. All these procedures are done before the link actually fails. The proposed two-layer scheme is tested on 𝑛𝑠 − 3
discrete event simulator, using a network with 50 nodes deployed in an area of 450, 000𝑚2 . Finally, since the most computationally
expensive processes, e.g. SVM and linear programming (used when calculating re-routing paths), are carried out in the controller,
which has a high computational capacity, their complexity has little impact on the efficiency of the SDN as a whole and on routing
delays.
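To make the smoothing step concrete, the following is a minimal one-dimensional Kalman filter over noisy SNR samples; the process-noise and measurement-noise variances are illustrative and not taken from [74]:

```python
import numpy as np

def kalman_smooth_snr(snr_meas, q=1e-3, r=4.0):
    """Scalar Kalman filter: q is the process-noise variance (how fast the
    true SNR may drift), r the measurement-noise variance (background noise)."""
    x, p = snr_meas[0], 1.0          # initial state estimate and its variance
    estimates = []
    for z in snr_meas:
        p = p + q                    # predict step
        k = p / (p + r)              # Kalman gain
        x = x + k * (z - x)          # update with measurement z
        p = (1.0 - k) * p
        estimates.append(x)
    return np.array(estimates)

# A falling SNR trend buried in noise, as when two nodes move apart.
rng = np.random.default_rng(0)
true_snr = np.linspace(25.0, 10.0, 200)
measured = true_snr + rng.normal(0.0, 2.0, 200)
smoothed = kalman_smooth_snr(measured)
```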
A programmable, fault-resilient SDN pipeline design that allows for failure detection and recovery in the switches is proposed
in [77] and called SPIDER, short for Stateful Programmable faIlure DEtection and Recovery. As the name suggests, SPIDER is based
on stateful data planes and guarantees short failure detection and recovery delays. SPIDER resembles an OpenFlow pipeline and is designed to detect link failures in the switches by periodic link probing. Once a failure is detected, re-routing paths are computed quickly, regardless of the status or availability of the controller. SPIDER also allows re-routing to occur even before a failure happens; moreover, SPIDER has a provision for re-establishing the original packet forwarding once a failure is resolved.
In [78], the authors discuss how to use performance and resource sharing to recover from a single-link failure. The approach
is based on utilizing a proactive failure recovery scheme that considers the performance of the backup paths and backup resource
consumption. This ring-based single-link failure recovery strategy is able to minimize the consumption of backup flow resources because it manages to enhance the utilization of backup resources.
A novel recovery strategy for smart grids is proposed in [79]. The framework is integrated with a link failure detection feature that
makes it possible to achieve real-time self-recovery. The link failure detection uses a proactive OpenFlow fast-failover group strategy, which is supported by all OpenFlow protocol versions from 1.1.0 onwards. Their strategy ensures that the switches are able to use the active ports with the highest priority to update their flow rules. The authors specifically focused on analysing the switching delay, packet flow delay, packet loss, and the number of packets missing the delay requirement of teleprotection as metrics affected by link failures.
Their results indicated that the SDN-based recovery strategy can effectively meet the deadline of 4 ms for teleprotection applications,
for both WiFi and LTE connections.
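The fast-failover group mechanism described above can be sketched with the Ryu controller roughly as follows; the port numbers, match fields, and priority are illustrative:

```python
def install_fast_failover(dp, group_id, primary_port, backup_port):
    """Sketch: an OFPGT_FF group forwards out primary_port while it is live
    and falls back to backup_port without contacting the controller."""
    ofp, parser = dp.ofproto, dp.ofproto_parser
    buckets = [
        parser.OFPBucket(watch_port=primary_port,
                         actions=[parser.OFPActionOutput(primary_port)]),
        parser.OFPBucket(watch_port=backup_port,
                         actions=[parser.OFPActionOutput(backup_port)]),
    ]
    dp.send_msg(parser.OFPGroupMod(dp, ofp.OFPGC_ADD,
                                   ofp.OFPGT_FF, group_id, buckets))

    # Point a flow entry at the group (illustrative match on the in-port).
    match = parser.OFPMatch(in_port=1)
    inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                         [parser.OFPActionGroup(group_id)])]
    dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                  match=match, instructions=inst))
```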
According to the authors of [80], although the decoupling of the control and data plane in SDN makes it flexible to control the
network behaviour, it may also be a disadvantage when dealing with link failure recovery, as there are delays between the failure point and the controller. This is because the network depends on the remote controller to handle the failure, which may lead to packet loss and
network interruptions. The authors indicated that, in order to avoid delay and packet loss when a failure occurs, pre-defined backup paths should be used to re-route the disrupted flows immediately to the destination without the controller's involvement. They further demonstrate that this process of rerouting the flows can introduce its own inconveniences: it may result in a large overhead when building backup paths, and once the backup paths are constructed they can be hard to modify, so their maintenance must adaptively adjust to the network status in order to avoid congestion. The authors proposed a proactive method that installs multiple backup paths for every link through source routing and per-hop tags, and spreads flows onto different paths to mitigate against congestion and link failure.
research work demonstrated the ability to achieve congestion-aware link failure recovery in SDN with low overhead.
In [81], a scheme for detecting switch failures is proposed. The main objective is to use detection logic to identify switch failures before they happen. The authors argue that when a switch fails, it can lead to multiple link failures, and addressing only the link failures may be ineffective. Hence, it is essential to detect switch failures before they occur.
In a recent paper [82], a hybrid solution called TFLink is presented. TFLink is hybrid in the sense that it mitigates challenges such as poor Ternary Content Addressable Memory (TCAM) utilization, bandwidth management, and slow recovery times, which result from the proactive and reactive link failure solutions available in the literature. TFLink is a three-fold approach: in the first instance, a backup of the current network flow rules and links is created. To ensure efficient link recovery when creating the backup, the authors optimize the combined cost of TCAM and link bandwidth. The backup links and the additional flow rules created are then stored in global mapping tables in the controller instead of in the switches. Finally, the network is monitored for faults, and once a link failure is detected, the stored backup is used to load alternative links and new flow rules onto the switches.
The authors of [83] covered restoration and protection techniques for traffic control, utilizing existing restoration and protection strategies of the out-of-band network for data traffic. The researchers also argued that to achieve carrier-grade quality, the network should be able to recover from a failure within 50 ms. In [84], an SDN-based link failure recovery scheme that uses shortest-path fast rerouting is discussed. The proposed solution is such that when a link failure occurs, an alternative route is found as quickly as possible to avoid network congestion. High-priority packets must be rerouted without any delays, and the traffic must be distributed fairly as well; hence, the authors had to find a technique for computing the shortest paths. The proposed scheme takes a short time to recover after a link failure. A fast failover technique for dealing with the problem of controller failure
is presented in [85]. The researchers are of the view that inter-domain controller synchronization can lead to high network flow
overhead and should be kept to a minimum. They effectively utilize a forwarding information table that quickly replays inputs to
the controller after failure recovery as a way of avoiding packet loss. Experiments carried out demonstrated that the average latency
between the controllers was about twice that of synchronization. The results also indicated that the proposed scheme achieved about
75% flow_mod reduction during a single link failure and about 50% reduction in the service interruption period over the traditional
SDN baseline approach.
A comparison of different restoration options on SDN after failure is done in [86]. The authors propose an alternative restoration
strategy for SDN networks. Packet loss and switch-over time were used as metrics to measure and evaluate the performance of the
proposed strategy against the other existing restoration strategies.
Other proposed recovery schemes enhance network reliability, and generally these schemes fall under the mechanisms of failure restoration (computing a new path for the affected packets after detecting a failure) and failure recovery (pre-computing and pre-configuring the backup paths before a failure even occurs), as discussed in [87]. The researchers in this work proposed a flexible link failure recovery scheme called BOND. BOND first calculates the backup paths in advance and then allocates the corresponding backup rules to the forwarding switches, taking the flows' requirements into consideration. To accelerate the failure recovery and effectively avoid potential congestion, a global hash table is adopted for selecting the backup paths. In testing the efficiency and efficacy of BOND, the authors conducted comprehensive experiments with real-world network topologies. Their evaluation showed that, against any single-link failure, BOND was able to calculate all the backup paths and was also able to allocate the backup rules.
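A hedged sketch, loosely in the spirit of BOND's global hash table (the exact data structure and hashing in [87] may differ), is shown below: each failed link maps to several pre-computed candidate backup paths, and flows are hashed onto them so that rerouted traffic is spread out instead of piling onto a single detour:

```python
import hashlib

# Pre-computed candidate backup paths per protected link (illustrative).
backup_paths = {
    ("s4", "s6"): [["s4", "s1", "s2", "s5", "s8", "s9"],
                   ["s4", "s7", "s8", "s9"]],
}

def select_backup(failed_link, flow_id):
    """Hash the flow identifier onto one of the candidate backup paths."""
    candidates = backup_paths[failed_link]
    digest = hashlib.sha256(flow_id.encode()).digest()
    return candidates[digest[0] % len(candidates)]

path = select_backup(("s4", "s6"), "10.0.0.1:5001->10.0.0.9:80/tcp")
```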
In Table 3, the paper provides a brief summary of the literature, stating the year of publication, highlighting the failure location,
indicating whether the technique is proactive or reactive, and outlining the advantages and disadvantages of the proposed schemes.
The general observations from the 14 articles summarized in Table 3 are that they are fairly new articles from 2017 to 2022
and the majority of the failures are taking place in the data-plane layer. This suggests that most of the failures are due to a link
being down between the switches or one of the switches being down. Only 2 of the cited articles discussed a link failure in the
control-plane layer. In terms of the techniques that have been proposed, one can conclude that most of them are proactive. In fact, some of the proposed techniques in the literature are both reactive and proactive by design.
In Table 4, the paper points out the research challenges in the reviewed articles and the solutions proposed in the literature, as well as possible future improvements on the existing literature. The suggested improvements come in the form of research directions that the authors believe can lead to enhanced solutions.
Table 3
A summary of various link failure recovery techniques in SDN.

| Ref. | Year | Failure location | Proactive/Reactive | Pros | Cons |
|---|---|---|---|---|---|
| [88] | 2020 | Links, Nodes | Reactive | Maximum bandwidth utilization; path and bandwidth protection | Some complexities |
| [14] | 2021 | Links, Nodes | Proactive | Good packet delivery ratio; low latency | Energy not considered |
| [89] | 2018 | Data plane | Proactive | Better throughput; lower loss of flows | Packet loss; lower resilience |
| [90] | 2019 | Switches, Hosts | Proactive | More scalable; lower cost delay and faster reroute; moderate resilience | Memory overhead; some complexities |
| [91] | 2018 | Switches, Links | Proactive | Better backup resource utilization; lower recovery delay; moderate resilience | No load balancing; memory overhead |
| [67] | 2022 | Links, Nodes | Proactive | Low latency rate; increased reliability; decreased traffic overhead | Memory overhead; some loss of packets |
| [92] | 2018 | General | Proactive | Faster recovery and lower link load; higher link utilization; path protection and resilience | Additional rules on each switch |
| [93] | 2020 | Controller | Reactive | Faster recovery time and resilience; improved latency and performance | High complexities |
| [94] | 2017 | Links, Nodes | Proactive | More security and resilience; lower packet loss rate; prevents false positives | Memory overhead |
| [95] | 2022 | Controller | Proactive | Reduced latency; lower recovery time | Security issues; high complexities |
| [21] | 2021 | Links, Nodes | Proactive | Lower packet loss and delays; low flow interruption rate | More resource consumption |
| [96] | 2019 | General | Both | High capacity utilization; less traffic loss and high resilience | Uses more forwarding rules |
| [97] | 2018 | Nodes, Links | Reactive | Lower total cost; less hop count | Single link failure only; lower failure protection |
Performance metrics and implementation tools

This section focuses on the analysis and discussion of how different metrics and mathematical tools or emulators were used in the sampled articles. In Table 5, the paper points out the performance metrics that are used; the table also contains the mathematical tools and emulators. A total of 17 research articles were sampled, and each row contains the article reference, the adopted mathematical tool or emulator, and the performance metrics.
Fig. 4 illustrates the qualitative metrics used to measure the effectiveness of the proposed link failure recovery techniques from
the 17 selected articles. Metrics like recovery time, packet loss, and delay are used by most of the reviewed articles because they
are attributes of any good link failure recovery technique. An effective link failure recovery strategy should be able to restore a
network back to normality within a short time, with minimum delay and fewer packet losses.
From Fig. 4, the recovery time metric was used by 15 articles (88.2%), 16 articles tested for the delay metric (94.1%), and 12 reviewed articles used the packet loss metric (70.6%). There are 9 reviewed articles that tested for congested links (52.9%), and 13 articles (76.5%) used the throughput metric. The link utilization metric was used by 7
reviewed articles, or 41.2%. Other metrics used in the literature, not covered under this subsection, include the total number of traffic flows, latency, run time, and the performance of backup paths. It is also worth noting that the metrics adopted depend on the objective, as well as on the techniques used to test the effectiveness of the proposed solution.
The illustration in Fig. 5 indicates that 54% of the 17 selected articles used Mininet as the emulator for testing. Mininet is widely
adopted as a powerful emulator in literature because it allows researchers to create SDN elements, customize them, share them
with other networks, and perform interactions [114]. These elements could be switches, end devices, links and SDN controllers.
SDN-enabled switches emulated by Mininet provide the same packet delivery semantics as those of a hardware switch. iPerf, which is a traffic-generating software tool, is used by 13% of the reviewed articles; from Table 5, iPerf is used to generate traffic within Mininet. Some researchers utilized ns-3 together with Mininet to implement and evaluate the performance of their proposed solutions.
In [74], the simulation was done in ns-3, but all the wireless nodes in the data plane were managed by an SDN controller. In
total, 9% of the reviewed articles utilized the ns-3 simulator. Other important tools, like g++, OMNET++, Matlab, Python, CPLEX, and C++, were also used for the implementation of some of the proposed solutions, as illustrated by Table 5. These tools were used 20% of the time, with each tool contributing 4%.
Table 4
A summary of the current research challenges, the proposed solutions and the future research directions.

| Ref. | Research challenge | Proposed solution | Research direction |
|---|---|---|---|
| [43] | • Cooperative link failure recovery in SDN • Multi-objective optimization for link failure recovery | • Scheme for balancing between resource utilization and recovery time • OpenFlow-based fast failover and VLAN features for enhanced performance | • Effective network resource reallocation during link failure problems • Bandwidth- and delay-requirement focused scheme in SDN |
| [98] | • Delay in large-scale video surveillance systems • Dealing with link failures in video | • Algorithms that are able to reroute packets at the ingress node switch | • Problem of packet loss on video traffic during link failure • Securing video traffic flows |
| [99] | • The problem of distributed depth-first search in SDN • SDN fast fail-over scheme | • Link failure recovery techniques based on shortest paths | • AI schemes to avoid the traffic bottleneck during link failure • Backup path calculation scheme for link failure |
| [100] | • Fast rerouting of traffic after link failure • Mission-critical apps for wireless sensor networks | • Schemes with pre-computed backup paths • Restricted repair and flow thinning for path recovery | • IoT intelligent link restorations • Intelligent recovery in wireless networks |
| [101] | • Failure management using SDN • Time-consuming and complex reduction processes on link failures | • Packet tagging using VLAN and MPLS technologies to mitigate against failures • Effective local detouring techniques using SDN | • Congestion management SDN-based failure recovery • Intelligent load balancing after failure schemes |
| [102] | • Fault-tolerant SDN-based optical network | • Link failure recovery on optical SDN • Protection strategy for OpenFlow failures | • Resilience strategies for software-defined optical networks |
| [103] | • SDN controller capacity during link failure • Controller-to-switch latency and placement problem | • Min-cut strategy for controller placement • Flow path diversity and placement scheme for effective switch-to-controller connection | • Constraints for scalable and effective controller placement scheme • AI for controller placement strategy |
| [104] | • Green SDN link failure recovery • Link recovery with minimum complexity and low latency | • Backup route switching for failure recovery | • Proactive routing schemes for Green SDN • Secure and scalable Green SDN |
Table 5
Qualitative comparison between different link-failure recovery SDN schemes based on the mathematical/emulation tools [105–107] and the performance metrics adopted.

| Ref. | Math. tools/Emulators | Throughput | Congested links | Packet loss | Delay | Recovery time (ms) | Link utilization |
|---|---|---|---|---|---|---|---|
| [82] | Mininet [108,109] | ✓ | × | × | ✓ | × | ✓ |
| [80] | OMNET++ 4.5 | ✓ | × | ✓ | ✓ | ✓ | ✓ |
| [77] | Mininet | ✓ | ✓ | ✓ | ✓ | × | ✓ |
| [90] | Mininet, C++, g++ | ✓ | ✓ | ✓ | ✓ | ✓ | × |
| [92] | Mininet | ✓ | × | ✓ | ✓ | ✓ | ✓ |
| [105] | Mininet | ✓ | × | ✓ | × | ✓ | × |
| [110] | Mininet, iPerf | ✓ | ✓ | × | ✓ | ✓ | × |
| [111] | Mininet | ✓ | × | ✓ | ✓ | ✓ | × |
| [112] | Mininet | × | ✓ | × | ✓ | ✓ | × |
| [88] | Matlab | × | ✓ | × | ✓ | ✓ | × |
| [93] | Mininet, iPerf | ✓ | × | ✓ | ✓ | ✓ | ✓ |
| [113] | Python, CPLEX [106] | × | ✓ | ✓ | ✓ | ✓ | × |
| [91] | Mininet | ✓ | × | × | ✓ | ✓ | × |
| [107] | Mininet, iPerf | ✓ | × | ✓ | ✓ | ✓ | ✓ |
| [43] | Mininet | ✓ | ✓ | ✓ | ✓ | ✓ | × |
| [79] | Mininet, ns-3 | × | × | ✓ | ✓ | ✓ | × |
| [74] | ns-3 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
The adoption of various tools for the implementation of link failure recovery strategies demonstrates that SDN technology gives network administrators the flexibility of developing and testing networks using different methods that are not limited by equipment vendors.
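A typical evaluation setup of this kind, combining Mininet, a remote SDN controller, and iPerf traffic while a link is taken down to trigger recovery, can be sketched as follows; the topology, link bandwidths, and controller address are illustrative:

```python
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.link import TCLink

net = Mininet(controller=None, link=TCLink)
net.addController("c0", controller=RemoteController, ip="127.0.0.1", port=6653)
h1, h2 = net.addHost("h1"), net.addHost("h2")
s1, s2, s3 = (net.addSwitch(name) for name in ("s1", "s2", "s3"))
# A triangle of switches gives the controller a backup path (s1-s3).
for a, b in [(h1, s1), (s1, s2), (s2, s3), (s1, s3), (s3, h2)]:
    net.addLink(a, b, bw=10)                  # 10 Mbps links via TCLink
net.start()
h2.cmd("iperf -s &")                          # iPerf server in background
h1.cmd("iperf -c %s -t 30 &" % h2.IP())       # 30 s of client traffic
net.configLinkStatus("s1", "s2", "down")      # induce the link failure
# ... measure recovery time / packet loss here ...
net.stop()
```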
For completeness, Table 6 provides details on the mathematical tools and emulators, their versions, and short descriptions. The paper discusses 4 categories of tools/technologies that have been highly utilized by researchers in the literature: SDN controllers, simulators, traffic generators, and development tools.
Fig. 4. Number and percentage of link-failure recovery metrics from the sampled articles as illustrated by Table 5.
Fig. 5. Distribution of the selected articles by the Mathematical tool/ Emulator adopted as shown in Table 5.
Ryu and Floodlight are by far the 2 most utilized SDN controllers by researchers. According to [136,137], Ryu is a very flexible controller in terms of the southbound API, and it
also provides a very basic web-based GUI. Floodlight [136] has enabled many researchers to obtain good results when it comes to handling a large number of requests. Both Floodlight and Ryu enable the use of multi-threading to enhance performance under heavy
loads. POX and OpenDaylight are the other 2 most adopted SDN controllers from the literature. Most researchers preferred to use
the Mininet simulator as already discussed. Other researchers utilized the iPerf application to generate traffic flows within the
simulator. Python and Matlab are the 2 common development tools that some researchers preferred to use for the implementation
of their proposed solutions. Some researchers utilize Matlab because it is very consistent and more robust, although it is more
expensive. On the other hand, Python is free and open-source software that many researchers can utilize.
Conclusions
In this review paper, the authors brought together the newly published literature on link failure recovery in software-defined
networks (SDN). This paper has demonstrated how the architecture of SDN and the OpenFlow technology enable efficient link
failure detection and recovery. A comparison was performed between the OpenFlow protocol and other protocols that can be used
to manage communication between the control plane and the data plane. Similarly, this review paper was also compared to other
existing related reviews in the literature. Failure location and the techniques used to identify the failure in the reviewed literature
have also been discussed. For the completeness of the review, the paper covered common tools used to test and implement the
proposed solutions. Finally, a summary of the metrics used to evaluate the proposed solutions was presented. From the literature,
most researchers concluded that since the SDN controller has a global view of the entire network, it enables intelligent traffic flow
decisions during link failures. SDN technology also provides a platform for easy implementation of modules like the link failure
recovery manager module, which can be utilized to compute backup flow paths. The link failure recovery system architecture utilizing the OpenFlow protocol comprises four modules: the link failure recovery manager, which is located in the control plane; and the link detection and removal, packet renewal, and path reinstatement modules, which reside in the data-plane switches.
Table 6
Software mostly used in the implementation of link failure and recovery in SDN.

| Tools/Technologies | Name and references | Version | Description |
|---|---|---|---|
| SDN controllers | POX [84,115–117] | 2.0.2 | POX is an open-source SDN controller based on the Python programming language. |
| SDN controllers | Ryu [82,99,111,118,119] | 3.1.16 | Ryu is a common SDN controller adopted by many researchers. It provides tools and libraries for conveniently assembling SDN networks and is compatible with various OpenFlow versions. |
| SDN controllers | Floodlight [79,85,93,96,120,121] | v1.1, v1.2 | This controller utilizes the OpenFlow protocol to organize traffic flows in an SDN environment. It enables easy adaptation of software applications, as indicated by the many researchers who use it. |
| SDN controllers | NOX [122,123] | — | NOX is the first SDN/OpenFlow controller and is normally regarded as the basis for other implementations that came after it. |
| SDN controllers | OpenDaylight [124,125] | — | OpenDaylight supports southbound protocols like OpenFlow for communication between the application layer and the control plane. |
| Simulator | Mininet [43,78,92,108,109,126] | 2.2.1 | Mininet is the main SDN-enabled emulator; it allows the deployment of large networks using a single computer or virtual machines. Mininet was designed to enable researchers to simulate and perform research tests on SDN and the OpenFlow protocol. Most of the proposed SDN link failure recovery techniques in the literature were simulated using Mininet. |
| Traffic generator | iPerf [43,75,110,112,122,127,128] | 3.1.2 | iPerf is a traffic-generating software tool that enables active measurement of the maximum achievable bandwidth on IP networks. It supports tuning of different metrics related to timing, buffers, and protocols (TCP, UDP, SCTP with IPv4 and IPv6). In most of the literature, iPerf was used to generate traffic within Mininet. |
| Development tools | Python [113,129,130] | 3.7.2 | Python is currently the fastest-growing programming language globally. It is an object-oriented programming language with dynamic semantics. Python was used by some researchers, as cited, to develop simulators for their proposed solutions. |
| Development tools | Matlab [88,131–135] | R2018a | Matlab is a very good language for technical computing and high performance. It enables the integration of visualization, computation, and programming in a simple manner. Some of the proposed techniques in the literature have been implemented and tested using Matlab. |
This ideal architecture has enabled researchers to come up with innovative solutions for
link failure recoveries in SDN, as summarized in this paper.
In conclusion, this review paper has discussed some SDN link failure-recovery issues and techniques. The following are some grey areas that need to be explored and properly addressed as future research directions.
Synchronization of failed link state: There are special cases whereby a link can be down in one direction but still up in the opposite direction. This kind of scenario is common on fibre-optic links. The failed direction of the link will be detected first, after the configured timeout, and the sending of traffic on that port will be stopped. The other link direction, which is still up, would only trigger the down state after that, hence the inconsistency in the link state. There should be a mechanism for a synchronized session state for the link that enables the first endpoint that detects the failure to share that information with the other side. Different techniques, like packet tagging, could be explored to force the other side of the link to trigger a fail state.
More programmability in SDN: The concept of SDN focuses mostly on programmability at the SDN controller plane. Enabling more programmability in the data plane could be a game changer for network administrators. More research work is needed on the implications of a more programmable data plane and how it could enable network engineers to enforce more security, compression,
etc. on the packets. Most link failures take place at the data plane, hence programmability at this level could reduce latency in terms
of decision-making. A programmable data plane would also reduce the dependence on hardware (switches) tied to a special family of hardware or running a certain protocol (OpenFlow-compatible).
Intelligent load balancing: Intelligent load balancing is an integral part of link failure recovery. After link failure, new possible
routes should be established intelligently to avoid bottlenecks. A link failure recovery technique that ignores congestion avoidance
should be discouraged at all costs.
Hybrid SDN: Most literature on SDN link failure assumes full deployment of the technology, but in practice it would be a challenging task to overhaul a network to SDN at once; a gradual migration to SDN is ideal. Deciding which existing network components should be converted to SDN-enabled devices first for maximum impact in the hybrid network is not an easy task. Detecting link failures on the hybrid SDN network can also be a challenging experience. More research work needs to be conducted as more organizations envision the deployment of SDN in an incremental fashion.
End-user focused control: Currently, the SDN control plane is geared more towards network administrators. It would be ideal to have some APIs that are user-focused. These APIs could then be used by the user for on-demand services, like filtering traffic from certain sources or bandwidth guarantees over a certain link, for better performance. As much as this may raise issues of security and trust, the authors strongly believe a user-focused API can exist between the client application and the control plane.
CRediT authorship contribution statement

Thabo Semong: Conceived idea, Designed and performed the simulation work, Writing – original draft. Thabiso Maupong:
Conceived the idea, Writing – original draft, Provided critical feedback, Research analysis. Adamu Murtala Zungeru: Conceived the
idea, Supervision, Writing – original draft. Oteng Tabona: Conceived the idea, Writing – original draft, Provided critical feedback,
Research analysis. Setso Dimakatso: Provided critical feedback, Research analysis, Writing – original draft. Gabanthone Boipelo:
Provided critical feedback, Research analysis, Writing – original draft. Mesiah Phuthego: Provided critical feedback, Research
analysis, Writing – original draft.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared
to influence the work reported in this paper.
References
[1] Leonardo Ochoa-Aday, Cristina Cervelló-Pastor, Adriana Fernández-Fernández, Self-healing and SDN: bridging the gap, Digit. Commun. Netw. (2019).
[2] Hamid Farhady, HyunYong Lee, Akihiro Nakao, Software-defined networking: A survey, Comput. Netw. 81 (2015) 79–95.
[3] Jagdeep Singh, Sunny Behal, Detection and mitigation of DDoS attacks in SDN: A comprehensive review, research challenges and future directions, Comp. Sci. Rev. 37 (2020) 100279.
[4] Kamal Benzekki, Abdeslam El Fergougui, Abdelbaki Elbelrhiti Elalaoui, Software-defined networking (SDN): a survey, Secur. Commun. Netw. 9 (18)
(2016) 5803–5833.
[5] N.M. Sahri, Koji Okamura, Fast failover mechanism for software defined networking: Openflow based, in: Proceedings of the Ninth International Conference
on Future Internet Technologies, 2014, pp. 1–2.
[6] Jacob H. Cox, Joaquin Chung, Sean Donovan, Jared Ivey, Russell J. Clark, George Riley, Henry L. Owen, Advancing software-defined networks: A survey,
IEEE Access 5 (2017) 25487–25526.
[7] Kilho Lee, Minsu Kim, Hayeon Kim, Hoon Sung Chwa, Jinkyu Lee, Insik Shin, Fault-resilient real-time communication using software-defined networking,
in: 2019 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), IEEE, 2019, pp. 204–215.
[8] Ziyong Li, Yuxiang Hu, Jiangxing Wu, Jie Lu, P4resilience: Scalable resilience for multi-failure recovery in SDN with programmable data plane, Comput.
Netw. 208 (2022) 108896.
[9] Suchismita Rout, Kshira Sagar Sahoo, Sudhansu Sekhar Patra, Bibhudatta Sahoo, Deepak Puthal, Energy efficiency in software defined networking: a
survey, SN Comput. Sci. 2 (4) (2021) 1–15.
[10] Jehad Ali, Byeong-hee Roh, Seungwoon Lee, QoS improvement with an optimum controller selection for software-defined networks, PLoS One 14 (5) (2019) e0217631.
[11] Thabo Semong, Kun Xie, Xuhui Zhou, Hemant Kumar Singh, Zhetao Li, Delay bounded multi-source multicast in software-defined networking, Electronics
7 (1) (2018) 10.
[12] Jehad Ali, Gyu-min Lee, Byeong-hee Roh, Dong Kuk Ryu, Gyudong Park, Software-defined networking approaches for link failure recovery: A survey,
Sustainability 12 (10) (2020) 4255.
[13] Tong Duan, Venkata Dinavahi, Fast path recovery for single link failure in SDN-Enabled Wide Area measurement system, IEEE Trans. Smart Grid 13 (2)
(2021) 1645–1653.
[14] Nurzaman Ahmed, Arijit Roy, Ayan Mondal, Sudip Misra, SDN-based link recovery scheme for large-scale internet of things, in: 2021 IEEE 22nd
International Conference on High Performance Switching and Routing (HPSR), IEEE, 2021, pp. 1–6.
[15] Jehad Ali, Byungkyu Lee, Jimyung Oh, Jungtae Lee, Byeong-hee Roh, A novel features prioritization mechanism for controllers in software-defined
networking, Comput. Mater. Contin. 69 (2021) 267–282.
[16] Ahmadreza Montazerolghaem, Mohammad Hossein Yaghmaee, Alberto Leon-Garcia, Green cloud multimedia networking: NFV/SDN based energy-efficient
resource allocation, IEEE Trans. Green Commun. Netw. 4 (3) (2020) 873–889.
[17] Soheil Hassas Yeganeh, Amin Tootoonchian, Yashar Ganjali, On scalability of software-defined networking, IEEE Commun. Mag. 51 (2) (2013) 136–141.
[18] Sahar Abdollahi, Arash Deldari, Hamid Asadi, AhmadReza Montazerolghaem, Sayyed Majid Mazinani, Flow-aware forwarding in SDN datacenters using
a knapsack-PSO-based solution, IEEE Trans. Netw. Serv. Manag. 18 (3) (2021) 2902–2914.
[19] Saeed A. Astaneh, Shahram Shah Heydari, Sara Taghavi Motlagh, Alireza Izaddoost, Trade-offs between risk and operational cost in SDN failure recovery
plan, Future Internet 14 (9) (2022) 263.
[20] Shrinivas Petale, Jaisingh Thangaraj, Link failure recovery mechanism in software defined networks, IEEE J. Sel. Areas Commun. 38 (7) (2020) 1285–1292.
[21] Dong Liang, Qinrang Liu, Binghao Yan, Yanbin Hu, Bo Zhao, Tao Hu, Low interruption ratio link fault recovery scheme for data plane in software-defined
networks, Peer-to-Peer Netw. Appl. 14 (6) (2021) 3806–3819.
[22] Zhaogang Shu, Jiafu Wan, Jiaxiang Lin, Shiyong Wang, Di Li, Seungmin Rho, Changcai Yang, Traffic engineering in software-defined networking:
Measurement and management, IEEE Access 4 (2016) 3246–3256.
[23] Rahim Masoudi, Ali Ghaffari, Software defined networks: A survey, J. Netw. Comput. Appl. 67 (2016) 1–25.
[24] Ian F. Akyildiz, Ahyoung Lee, Pu Wang, Min Luo, Wu Chou, A roadmap for traffic engineering in SDN-OpenFlow networks, Comput. Netw. 71 (2014)
1–30.
[25] Binghao Yan, Qinrang Liu, JianLiang Shen, Dong Liang, Bo Zhao, Ling Ouyang, A survey of low-latency transmission strategies in software defined
networking, Comp. Sci. Rev. 40 (2021) 100386.
[26] Muhammad Yunis Daha, Mohd Soperi Mohd Zahid, Babangida Isyaku, Abdussalam Ahmed Alashhab, Cdra: A community detection based routing algorithm
for link failure recovery in software defined networks, Int. J. Adv. Comput. Sci. Appl. 12 (11) (2021).
[27] Ali Malik, Benjamin Aziz, Mo Adda, Chih-Heng Ke, Optimisation methods for fast restoration of software-defined networks, IEEE Access 5 (2017)
16111–16123.
[28] Tao Hu, Peng Yi, Julong Lan, Yuxiang Hu, Penghao Sun, FTLink: Efficient and flexible link fault tolerance scheme for data plane in Software-Defined
Networking, Future Gener. Comput. Syst. 111 (2020) 381–400.
[29] Paulo Cesar da Rocha Fonseca, Edjard Souza Mota, A survey on fault management in software-defined networks, IEEE Commun. Surv. Tutor. 19 (4)
(2017) 2284–2321.
[30] V. Muthumanikandan, C. Valliyammai, A survey on link failures in software defined networks, in: 2015 Seventh International Conference on Advanced
Computing (ICoAC), IEEE, 2015, pp. 1–5.
[31] Stewart Bryant, Stefano Previdi, Mike Shand, A framework for IP and MPLS fast reroute using not-via addresses, RFC 6981, 2013.
[32] Srihari Nelakuditi, Sanghwan Lee, Yinzhe Yu, Zhi-Li Zhang, Chen-Nee Chuah, Fast local rerouting for handling transient link failures, IEEE/ACM Trans.
Netw. 15 (2) (2007) 359–372.
[33] Alia Atlas, Alex Zinin, Basic specification for IP fast reroute: Loop-free alternates, Technical report, RFC 5286, September, 2008.
[34] Jue Chen, Jinbang Chen, Fei Xu, Min Yin, Wei Zhang, When software defined networks meet fault tolerance: A survey, in: International Conference on
Algorithms and Architectures for Parallel Processing, Springer, 2015, pp. 351–368.
[35] Yong Li, Min Chen, Software-defined network function virtualization: A survey, IEEE Access 3 (2015) 2542–2553.
[36] Rashid Mijumbi, Joan Serrat, Juan-Luis Gorricho, Steven Latré, Marinos Charalambides, Diego Lopez, Management and orchestration challenges in network
functions virtualization, IEEE Commun. Mag. 54 (1) (2016) 98–105.
[37] Shiming He, Kun Xie, Xuhui Zhou, Thabo Semong, Jin Wang, Multi-source reliable multicast routing with QoS constraints of NFV in edge computing,
Electronics 8 (10) (2019) 1106.
[38] Andrea Sgambelluri, Alessio Giorgetti, Filippo Cugini, Francesco Paolucci, Piero Castoldi, OpenFlow-based segment protection in ethernet networks, J.
Opt. Commun. Netw. 5 (9) (2013) 1066–1075.
[39] Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry Peterson, Jennifer Rexford, Scott Shenker, Jonathan Turner, OpenFlow: enabling
innovation in campus networks, ACM SIGCOMM Comput. Commun. Rev. 38 (2) (2008) 69–74.
[40] Open Networking Foundation, ONF, [Online]. Available: https://www.opennetworking.org/, (Accessed: 04.07.2020).
[41] Ali Akbar Neghabi, Nima Jafari Navimipour, Mehdi Hosseinzadeh, Ali Rezaee, Load balancing mechanisms in the software defined networks: a systematic
and comprehensive review of the literature, IEEE Access 6 (2018) 14159–14178.
[42] Andrew Goodney, Saurabh Kumar, Akshay Ravi, Young H. Cho, Efficient PMU networking with software defined networks, in: 2013 IEEE International
Conference on Smart Grid Communications (SmartGridComm), IEEE, 2013, pp. 378–383.
[43] Likun Wang, Lin Yao, Zichuan Xu, Guowei Wu, Mohammad S. Obaidat, CFR: A cooperative link failure recovery scheme in software-defined networks,
Int. J. Commun. Syst. 31 (10) (2018) e3560.
[44] A.U. Rehman, Rui L. Aguiar, João Paulo Barraca, Fault-tolerance in the scope of software-defined networking (SDN), IEEE Access 7 (2019) 124474–124490.
[45] Yuan Zhang, Lin Cui, Wei Wang, Yuxiang Zhang, A survey on software defined networking with multiple controllers, J. Netw. Comput. Appl. 103 (2018)
101–118.
[46] Tamal Das, Vignesh Sridharan, Mohan Gurusamy, A survey on controller placement in SDN, IEEE Commun. Surv. Tutor. 22 (1) (2019) 472–503.
[47] Mosab Hamdan, Entisar Hassan, Ahmed Abdelaziz, Abdallah Elhigazi, Bushra Mohammed, Suleman Khan, Athanasios V. Vasilakos, Muhammad Nadzir
Marsono, A comprehensive survey of load balancing techniques in software-defined network, J. Netw. Comput. Appl. 174 (2021) 102856.
[48] Jie Lu, Zhen Zhang, Tao Hu, Peng Yi, Julong Lan, A survey of controller placement problem in software-defined networking, IEEE Access 7 (2019)
24290–24307.
[49] Othmane Blial, Mouad Ben Mamoun, Redouane Benaini, An overview on SDN architectures with multiple controllers, J. Comput. Netw. Commun. 2016
(2016).
[50] Dan Levin, Andreas Wundsam, Brandon Heller, Nikhil Handigol, Anja Feldmann, Logically centralized? State distribution trade-offs in software defined
networks, in: Proceedings of the First Workshop on Hot Topics in Software Defined Networks, 2012, pp. 1–6.
[51] Jonathan Vestin, Andreas Kassler, Johan Akerberg, Resilient software defined networking for industrial control networks, in: 2015 10th International
Conference on Information, Communications and Signal Processing (ICICS), IEEE, 2015, pp. 1–5.
[52] Diego Kreutz, Fernando M.V. Ramos, Paulo Esteves Verissimo, Christian Esteve Rothenberg, Siamak Azodolmolky, Steve Uhlig, Software-defined
networking: A comprehensive survey, Proc. IEEE 103 (1) (2014) 14–76.
[53] Wenfeng Xia, Yonggang Wen, Chuan Heng Foh, Dusit Niyato, Haiyong Xie, A survey on software-defined networking, IEEE Commun. Surv. Tutor. 17 (1)
(2014) 27–51.
[54] Sushant Jain, Alok Kumar, Subhasree Mandal, Joon Ong, Leon Poutievski, Arjun Singh, Subbaiah Venkata, Jim Wanderer, Junlan Zhou, Min Zhu, et al.,
B4: Experience with a globally-deployed software defined WAN, ACM SIGCOMM Comput. Commun. Rev. 43 (4) (2013) 3–14.
[55] Juliano Araujo Wickboldt, Wanderson Paim De Jesus, Pedro Heleno Isolani, Cristiano Bonato Both, Juergen Rochol, Lisandro Zambenedetti Granville,
Software-defined networking: management requirements and challenges, IEEE Commun. Mag. 53 (1) (2015) 278–285.
[56] Huawei Technologies, Enabling Agile Service Chaining with Service Based Routing, [Online]. Available: http://www.huawei.com/ilink/en/download/HW_308622, (Accessed: 03.08.2020).
[57] Avri Doria, J. Hadi Salim, Robert Haas, Horzmud Khosravi, Weiming Wang, Ligang Dong, Ram Gopal, Joel Halpern, Forwarding and control element
separation (forces) protocol specification, Technical report, 2010.
[58] Lily Yang, Ram Dantu, Terry Anderson, Ram Gopal, Forwarding and control element separation (ForCES) framework, Technical report, 2004.
[59] Weiming Wang, Ligang Dong, Bin Zhuge, Ming Gao, Fenggen Jia, Rong Jin, Jin Yu, Xiaochun Wu, Design and implementation of an open programmable
router compliant to IETF ForCES specifications, in: Sixth International Conference on Networking (ICN’07), IEEE, 2007, p. 82.
[60] Nick Feamster, Jennifer Rexford, Ellen Zegura, The road to SDN: an intellectual history of programmable networks, ACM SIGCOMM Comput. Commun.
Rev. 44 (2) (2014) 87–98.
[61] T.V. Lakshman, T. Nandagopal, Ramachandran Ramjee, K. Sabnani, T. Woo, The SoftRouter architecture, in: Proc. ACM SIGCOMM Workshop on Hot
Topics in Networking, 2004.
[62] Ramon Casellas, Raül Muñoz, Ricardo Martínez, Ricard Vilalta, Lei Liu, Takehiro Tsuritani, Itsuro Morita, Víctor López, Oscar González de Dios, Juan Pedro
Fernández-Palacios, SDN orchestration of OpenFlow and GMPLS flexi-grid networks with a stateful hierarchical PCE, J. Opt. Commun. Netw. 7 (1) (2015)
A106–A117.
[63] Raül Muñoz, Ricard Vilalta, Ramon Casellas, Ricardo Martínez, Frederic Francois, Mayur Channegowda, Ali Hammad, Shuping Peng, Reza Nejabati,
Dimitra Simeonidou, et al., Transport network orchestration for end-to-end multilayer provisioning across heterogeneous SDN/OpenFlow and GMPLS/PCE
control domains, J. Lightwave Technol. 33 (8) (2015) 1540–1548.
[64] A. Sgambelluri, F. Paolucci, A. Giorgetti, F. Cugini, P. Castoldi, SDN and PCE implementations for segment routing, in: 2015 20th European Conference
on Networks and Optical Communications (NOC), IEEE, 2015, pp. 1–4.
[65] Francesco Paolucci, Filippo Cugini, Alessio Giorgetti, Nicola Sambo, Piero Castoldi, A survey on the path computation element (PCE) architecture, IEEE
Commun. Surv. Tutor. 15 (4) (2013) 1819–1841.
[66] Sanjeev Singh, Rakesh Kumar Jha, A survey on software defined networking: Architecture for next generation network, J. Netw. Syst. Manage. 25 (2)
(2017) 321–374.
[67] Qadeer Yasin, Zeshan Iqbal, Muhammad Attique Khan, Seifedine Kadry, Yunyoung Nam, Reliable multipath flow for link failure recovery in 5G networks
using SDN paradigm, Inf. Technol. Control 51 (1) (2022) 5–17.
[68] M. Betts, S. Fratini, N. Davis, D. Hoods, R. Dolin, M. Joshi, Z. Dacheng, SDN Architecture, Issue 1, Open Networking Foundation, Technical report ONF
TR-502, 2014.
[69] Thabo Semong, Kun Xie, Efficient load balancing and multicasting for uncertain-source SDN: Real-time link-cost monitoring, in: Computer Science On-line
Conference, Springer, 2018, pp. 178–187.
[70] Arsalan Tavakoli, Martin Casado, Teemu Koponen, Scott Shenker, Applying NOX to the datacenter, in: HotNets, 2009.
[71] Ulas C. Kozat, Guanfeng Liang, Koray Kokten, Janos Tapolcai, On optimal topology verification and failure localization for software defined networks,
IEEE/ACM Trans. Netw. 24 (5) (2015) 2899–2912.
[72] Lingxia Liao, Victor C.M. Leung, LLDP based link latency monitoring in software defined networks, in: 2016 12th International Conference on Network
and Service Management (CNSM), IEEE, 2016, pp. 330–335.
[73] Farzaneh Pakzad, Marius Portmann, Wee Lum Tan, Jadwiga Indulska, Efficient topology discovery in software defined networks, in: 2014 8th International
Conference on Signal Processing and Communication Systems (ICSPCS), IEEE, 2014, pp. 1–8.
[74] Ke Bao, John D. Matyjas, Fei Hu, Sunil Kumar, Intelligent software-defined mesh networks with link-failure adaptive traffic balancing, IEEE Trans. Cogn.
Commun. Netw. 4 (2) (2018) 266–276.
[75] Sigurd I. Aanonsen, Geir Nævdal, Dean S. Oliver, Albert C. Reynolds, Brice Vallès, et al., The ensemble Kalman filter in reservoir engineering–a review,
SPE J. 14 (3) (2009) 393–412.
[76] Steve R. Gunn, Support vector machines for classification and regression, ISIS Technical Report 14 (1), 1998, pp. 5–16.
[77] Carmelo Cascone, Luca Pollini, Davide Sanvito, Antonio Capone, Brunilde Sanso, SPIDER: Fault resilient SDN pipeline with recovery delay guarantees,
in: 2016 IEEE NetSoft Conference and Workshops (NetSoft), IEEE, 2016, pp. 296–302.
[78] Ying Wang, Sixiang Feng, Hantao Guo, Xuesong Qiu, Hengbin An, A single-link failure recovery approach based on resource sharing and performance
prediction in SDN, IEEE Access 7 (2019) 174750–174763.
[79] Abdullah Aydeger, Nico Saputro, Kemal Akkaya, Selcuk Uluagac, SDN-enabled recovery for smart grid teleprotection applications in post-disaster scenarios,
J. Netw. Comput. Appl. 138 (2019) 39–50.
[80] Liaoruo Huang, Qingguo Shen, Wenjuan Shao, Congestion aware fast link failure recovery of SDN network based on source routing, TIIS 11 (11) (2017)
5200–5222.
[81] V. Muthumanikandan, C. Valliyammai, B. Swarna Deepa, Switch failure detection in software-defined networks, in: Advances in Big Data and Cloud
Computing, Springer, 2019, pp. 155–162.
[82] Tao Hu, Peng Yi, Julong Lan, Yuxiang Hu, Penghao Sun, FTLink: Efficient and flexible link fault tolerance scheme for data plane in Software-Defined
Networking, Future Gener. Comput. Syst. (2019).
[83] Sachin Sharma, Dimitri Staessens, Didier Colle, Mario Pickavet, Piet Demeester, Fast failure recovery for in-band OpenFlow networks, in: 2013 9th
International Conference on the Design of Reliable Communication Networks (DRCN), IEEE, 2013, pp. 52–59.
[84] V. Muthumanikandan, C. Valliyammai, Link failure recovery using shortest path fast rerouting technique in SDN, Wirel. Pers. Commun. 97 (2) (2017)
2475–2495.
[85] Oluwatobi A. Akanbi, Amer Aljaedi, Xiaobo Zhou, Adel R. Alharbi, Fast fail-over technique for distributed controller architecture in software-defined
networks, IEEE Access 7 (2019) 160718–160737.
[86] Sachin Sharma, Dimitri Staessens, Didier Colle, Mario Pickavet, Piet Demeester, Enabling fast failure recovery in OpenFlow networks, in: 2011 8th
International Workshop on the Design of Reliable Communication Networks (DRCN), IEEE, 2011, pp. 164–171.
[87] Qing Li, Yang Liu, Zhijie Zhu, Hengtong Li, Yong Jiang, BOND: Flexible failure recovery in software defined networks, Comput. Netw. 149 (2019) 1–12.
[88] Oleksandr Lemeshko, Oleksandra Yeremenko, Batoul Sleiman, Maryna Yevdokymenko, Fast ReRoute model with realization of path and bandwidth
protection scheme in SDN, Adv. Electr. Electron. Eng. 18 (1) (2020) 23–30.
[89] Wang Xin-gang, A link performance-based failure recovery approach in SDN data plane, in: Proceedings of the 3rd International Conference on Multimedia
and Image Processing, 2018, pp. 46–51.
[90] Kun Qiu, Jin Zhao, Xin Wang, Xiaoming Fu, Stefano Secci, Efficient recovery path computation for fast reroute in large-scale software-defined networks,
IEEE J. Sel. Areas Commun. 37 (8) (2019) 1755–1768.
[91] Sixiang Feng, Ying Wang, Xuxia Zhong, Junran Zong, Xuesong Qiu, Shaoyong Guo, A ring-based single-link failure recovery approach in SDN data plane,
in: NOMS 2018-2018 IEEE/IFIP Network Operations and Management Symposium, IEEE, 2018, pp. 1–7.
[92] Zhijie Zhu, Qing Li, Shutao Xia, Mingwei Xu, Caffe: Congestion-aware fast failure recovery in software defined networks, in: 2018 27th International
Conference on Computer Communication and Networks (ICCCN), IEEE, 2018, pp. 1–9.
[93] Shadi Moazzeni, Mohammad Reza Khayyambashi, Naser Movahhedinia, Franco Callegati, Improving the reliability of Byzantine fault-tolerant distributed
software-defined networks, Int. J. Commun. Syst. 33 (9) (2020) e4372.
[94] Carmelo Cascone, Davide Sanvito, Luca Pollini, Antonio Capone, Brunilde Sanso, Fast failure detection and recovery in SDN with stateful data plane, Int.
J. Netw. Manag. 27 (2) (2017) e1957.
[95] Tong Duan, Venkata Dinavahi, Fast path recovery for single link failure in SDN-enabled wide area measurement system, IEEE Trans. Smart Grid 13 (2)
(2022) 1645.
[96] Jiaqi Zheng, Hong Xu, Xiaojun Zhu, Guihai Chen, Yanhui Geng, Sentinel: Failure recovery in centralized traffic engineering, IEEE/ACM Trans. Netw. 27
(5) (2019) 1859–1872.
[97] Zijing Cheng, Xiaoning Zhang, Shaohui Shen, Shui Yu, Jing Ren, Rongping Lin, T-trail: link failure monitoring in software-defined optical networks, J.
Opt. Commun. Netw. 10 (4) (2018) 344–352.
[98] Mustafa Ismael Salman, et al., Link failure recovery for a large-scale video surveillance system using a software-defined network, J. Eng. 26 (1) (2020)
104–120.
[99] Young-Zhe Liao, Shi-Chun Tsai, Fast failover with hierarchical disjoint paths in SDN, in: 2018 IEEE Global Communications Conference (GLOBECOM),
IEEE, 2018, pp. 1–7.
[100] Shehroz Riaz, Maaz Rehan, Tariq Umer, Muhammad Khalil Afzal, Waqas Rehan, Ehsan Ullah Munir, Tassawar Iqbal, FRP: A novel fast rerouting protocol
with multi-link-failure recovery for mission-critical WSN, Future Gener. Comput. Syst. 89 (2018) 148–165.
[101] Pankaj Thorat, Seil Jeon, Hyunseung Choo, Enhanced local detouring mechanisms for rapid and lightweight failure recovery in OpenFlow networks,
Comput. Commun. 108 (2017) 78–93.
[102] Xu Zhang, Lei Guo, Weigang Hou, Qihan Zhang, Siqi Wang, Failure recovery solutions using cognitive mechanisms based on software-defined optical
network platform, Opt. Eng. 56 (1) (2017) 016107.
[103] Bala Prakasa Rao Killi, Seela Veerabhadreswara Rao, Link failure aware capacitated controller placement in software defined networks, in: 2018
International Conference on Information Networking (ICOIN), IEEE, 2018, pp. 292–297.
[104] Chia-Wei Huang, Chung-An Shen, Tai-Lin Chin, Shan-Hsiang Shen, A real-time and memory-saving link recovery mechanism for green software-defined
networking, in: 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), IEEE, 2018, pp. 853–857.
[105] V. Padma, P. Yogesh, Proactive failure recovery in OpenFlow based software defined networks, in: 2015 3rd International Conference on Signal Processing,
Communication and Networking (ICSCN), IEEE, 2015, pp. 1–6.
[106] CPLEX Studio, CPLEX Optimization Studio, [Online]. Available: http://www-03.ibm.com/software/products/en/ibmilogcpleoptistud, (Accessed:
01.07.2020).
[107] Israat Haque, M.A. Moyeen, Revive: A reliable software defined data plane failure recovery scheme, in: 2018 14th International Conference on Network
and Service Management (CNSM), IEEE, 2018, pp. 268–274.
[108] Team-Mininet, Mininet overview, [Online]. Available: http://mininet.org/overview, (Accessed: 01.08.2020).
[109] Bob Lantz, Brandon Heller, Nick McKeown, A network in a laptop: rapid prototyping for software-defined networks, in: Proceedings of the 9th ACM
SIGCOMM Workshop on Hot Topics in Networks, 2010, pp. 1–6.
[110] Gokay Saldamli, Himanshu Mishra, Naveen Ravi, Rahul Rao Kodati, Sahithi A. Kuntamukkala, Loai Tawalbeh, Improving link failure recovery and
congestion control in SDNs, in: 2019 10th International Conference on Information and Communication Systems (ICICS), IEEE, 2019, pp. 30–35.
[111] Jue Chen, Jinbang Chen, Junchen Ling, Wei Zhang, Failure recovery using vlan-tag in SDN: High speed with low memory requirement, in: 2016 IEEE
35th International Performance Computing and Communications Conference (IPCCC), IEEE, 2016, pp. 1–9.
[112] Selcuk Cevher, Ali Mumcu, Abdulsamet Caglan, Eda Kurt, Mehmet Kerim Peker, Ibrahim Hokelek, Sedat Altun, A fault tolerant software defined networking
architecture for integrated modular avionics, in: 2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC), IEEE, 2018, pp. 1–9.
[113] S. Tomovic, I. Radusinovic, A new traffic engineering approach for QoS provisioning and failure recovery in SDN-based ISP networks, in: 2018 23rd
International Scientific-Professional Conference on Information Technology (IT), IEEE, 2018, pp. 1–4.
[114] Rogério Leão Santos De Oliveira, Christiane Marie Schweitzer, Ailton Akira Shinoda, Ligia Rodrigues Prete, Using mininet for emulation and prototyping
software-defined networks, in: 2014 IEEE Colombian Conference on Communications and Computing (COLCOM), IEEE, 2014, pp. 1–6.
[115] Daniel Gyllstrom, Nicholas Braga, Jim Kurose, Recovery from link failures in a smart grid communication network using openflow, in: 2014 IEEE
International Conference on Smart Grid Communications (SmartGridComm), IEEE, 2014, pp. 254–259.
[116] Sukhveer Kaur, Japinder Singh, Navtej Singh Ghumman, Network programmability using POX controller, in: ICCCS International Conference on
Communication, Computing & Systems, IEEE, Vol. 138, 2014, p. 70.
[117] J. McCauley, POX: A Python-based OpenFlow controller (2014), 2015.
[118] Han Xu, Lianshan Yan, Huanlai Xing, Yunhe Cui, Saifei Li, Link failure detection in software defined networks: an active feedback mechanism, Electron.
Lett. 53 (11) (2017) 722–724.
[119] Ryu SDN Framework Community, Ryu SDN framework, 2019, [Online]. Available: https://github.com/osrg/ryu, (Accessed: 20.07.2020).
[120] Zijing Cheng, Xiaoning Zhang, Yichao Li, Shui Yu, Rongping Lin, Lei He, Congestion-aware local reroute for fast failure recovery in software-defined
networks, J. Opt. Commun. Netw. 9 (11) (2017) 934–944.
[121] Floodlight, Project floodlight, 2012, [Online]. Available: http://floodlight.openflowhub.org/, (Accessed: 06.07.2020).
[122] Pankaj Thorat, Rajesh Challa, Syed M. Raza, Dongsoo S. Kim, Hyunseung Choo, Proactive failure recovery scheme for data traffic in software defined
networks, in: 2016 IEEE NetSoft Conference and Workshops (NetSoft), IEEE, 2016, pp. 219–225.
[123] Natasha Gude, Teemu Koponen, Justin Pettit, Ben Pfaff, Martín Casado, Nick McKeown, Scott Shenker, NOX: towards an operating system for networks,
ACM SIGCOMM Comput. Commun. Rev. 38 (3) (2008) 105–110.
[124] Deepa B. Swarna, V. Muthumanikandan, Nested failure detection and recovery in software defined networks, in: 2019 IEEE International Conference on
Electrical, Computer and Communication Technologies (ICECCT), IEEE, 2019, pp. 1–6.
[125] OpenDaylight, Linux foundation collaborative project, 2013, [Online]. Available: http://www.opendaylight.org/, (Accessed: 01.07.2020).
[126] Karamjeet Kaur, Japinder Singh, Navtej Singh Ghumman, Mininet as software defined networking testing platform, in: International Conference on
Communication, Computing & Systems (ICCCS), 2014, pp. 139–142.
[127] Tram Truong-Huu, Prarthana Prathap, Purnima Murali Mohan, Mohan Gurusamy, Fast and adaptive failure recovery using machine learning in software
defined networks, in: 2019 IEEE International Conference on Communications Workshops (ICC Workshops), IEEE, 2019, pp. 1–6.
[128] Sikandar Ejaz, Zeshan Iqbal, Peer Azmat Shah, Bilal Haider Bukhari, Armughan Ali, Farhan Aadil, Traffic load balancing using software defined networking
(SDN) controller as virtualized network function, IEEE Access 7 (2019) 46646–46658.
[129] Naga Katta, Haoyu Zhang, Michael Freedman, Jennifer Rexford, Ravana: Controller fault-tolerance in software-defined networking, in: Proceedings of the
1st ACM SIGCOMM Symposium on Software Defined Networking Research, 2015, pp. 1–12.
[130] Dave Kuhlman, A Python Book: Beginning Python, Advanced Python, and Python Exercises, 2009.
[131] Xin Cui, Xiaohong Huang, Yan Ma, Qingke Meng, A load balancing routing mechanism based on SDWSN in smart city, Electronics 8 (3) (2019) 273.
[132] Federico Cimorelli, Francesco Delli Priscoli, Antonio Pietrabissa, Lorenzo Ricciardi Celsi, Vincenzo Suraci, Letterio Zuccaro, A distributed load balancing
algorithm for the control plane in software defined networking, in: 2016 24th Mediterranean Conference on Control and Automation (MED), IEEE, 2016,
pp. 1033–1040.
[133] Xiaoyu Duan, Auon Muhammad Akhtar, Xianbin Wang, Software-defined networking-based resource management: data offloading with load balancing
in 5G HetNet, EURASIP J. Wireless Commun. Networking 2015 (1) (2015) 181.
[134] Faroq Al-Tam, Noélia Correia, On load balancing via switch migration in software-defined networking, IEEE Access 7 (2019) 95998–96010.
[135] MATLAB, Version 9.4.0.813654 (R2018a), MathWorks, Natick, MA, USA, 2018.
[136] Silvio E. Quincozes, Arthur A.Z. Soares, Wilker Oliveira, Eduardo B. Cordeiro, Robson A. Lima, Débora C. Muchaluat-Saade, Vinicius C. Ferreira, Yona
Lopes, Juan Lucas Vieira, Luana M. Uchôa, et al., Survey and comparison of SDN controllers for teleprotection and control power systems, in: LANOMS,
2019.
[137] Mohammad Nowsin Amin Sheikh, SDN-based approach to evaluate the best controller: Internal controller NOX and external controllers POX, ONOS, RYU,
Global J. Comput. Sci. Technol. (2019).