Controllers in SDN: A Review Report
ABSTRACT Software-defined networking (SDN) is a networking paradigm that changes the traditional network architecture by bringing all control functionality to a single location and making decisions centrally. Controllers are the brain of the SDN architecture: they perform the control decision tasks involved in routing packets, and this centralized decision capability enhances network performance. In this paper, we present a review of the various available SDN controllers. Along with an introduction to SDN, we discuss the prior work in the field. The review shows how the centralized decision capability of the controller changes the network architecture, bringing flexibility and programmability to the network. We also discuss the two categories of controllers along with some popular available controllers. For each controller, we discuss the architectural overview, design aspects, and so on. We also evaluate performance characteristics using various metrics, such as throughput and response time. This paper covers the major state-of-the-art controllers used in industry and academia in the SDN paradigm.
INDEX TERMS OpenFlow, software defined networks (SDN), topology abstraction, pending raw-packet
threshold (PRT), model-driven service abstraction layer (MD-SAL).
is eliminated to some extent. Now the network administrator can operate different vendor-specific devices from a single software console. The controller is designed in such a way that it can view the whole network globally. This design makes it easy to introduce new functionality or programs, as they only need to be placed in the centralized controller [3].

From the start of SDN, many industrial communities have worked on open-source standardization of the technique, producing solutions like OpenDaylight, OpenStack, etc. In March 2011, IT and networking giants such as Cisco, Facebook, Google, Verizon, and Microsoft collaborated to form a working group on the widespread and open-source adoption of the SDN architecture. This working group is called the Open Networking Foundation (ONF). Each member of the ONF is responsible for a specific activity for the promotion of SDN. For example, the Architecture & Framework group deals with the architectural aspects of SDN and defines its various components, while the Configuration & Management group deals with the Operation, Administration, and Management of the OpenFlow protocol [3].

There are several review articles in the literature which address design issues and key challenges in different fields of SDN. Sood et al. [4] reviewed the challenges and opportunities of software-defined wireless networking in the IoT. Kobo et al. [5] addressed the challenges and design requirements for SD-WSN. A review of various DDoS attacks on the SDN controller was carried out by Zubaydi et al. [6]. Neghabi et al. [7] surveyed various load balancing approaches in SDN. Kreutz et al. [8] presented a comprehensive survey of overall design aspects along with challenges and opportunities in SDN. While working with the SDN architecture, a major point of concern is which controller should be selected for deployment: every controller has its own pros and cons along with its working domain. This paper is motivated by this need for controller selection. This article differs from earlier surveys in that, instead of briefly discussing the overall architecture and design aspects of SDN, it focuses specifically on the control plane. It brings all the controller design aspects together at a single point, so that a network administrator can select the right controller as per the requirements of his SDN.

We organized our survey into six sections. In section 1, the Introduction, we start with a formal introduction of the technology along with the major areas where it has the greatest effect. Section 2 discusses the historical background of SDN. Section 3 introduces the basic building blocks of the SDN architecture. Section 4 describes the main idea of the article by presenting the different controller strategies along with their suitable classification. The performance analysis is carried out in section 5. Finally, we conclude our review in section 6 by giving a summary of the article.

II. PRERUNNERS OF SDN
The idea of breaking apart the control and data planes was not introduced for the first time with SDN; rather, it is the result of several earlier efforts at plane separation, such as Network Control Point [9], ForCES [10], Ethane [11], and Active Networking [1]. Fig. 1 illustrates the evolution of the different technologies that led to the development of SDN, on a time scale from 1995 to 2015. Active Networking was the first attempt in this direction, suggesting that network elements should have the capability to perform computation on, and modification of, packets. Programmable switches and capsules are the two distinct approaches suggested by active networks. However, Active Networking did not provide a clear picture of plane separation. Soon NCP came to light, which defined a clean image of separation. It was initially meant for the telephone network and was introduced by AT&T. The ideas behind NCP led to several innovations in the field.

ForCES was another major effort, introduced in 2003. ForCES separated the control logic of individual data plane devices and made it available at a centralized location. However, this centralization was not the complete one we would expect, because each control element interacted with its own corresponding data plane element; so we can call it partial centralization. In the ForCES architecture, the Forwarding Element is typically implemented in hardware and is responsible for filtering and forwarding. The Control Element, on the other hand, coordinates the individual devices in the network by communicating the forwarding and routing information of all devices. Fig. 2 illustrates the ForCES architecture.

FIGURE 2. ForCES architecture.
processes the packet according to that particular flow entry. The last field, Statistics, stores various counters, such as how many packets have passed through a specific port, how many packets for a given destination address have been processed, etc. This information can be grouped on a per-flow, per-table, or per-port basis [13], [15].

B. SOUTHBOUND INTERFACE
The southbound interface provides a means of communication between the controller and the switching devices. It installs in the switch forwarding table the flow rules decided by the controller. OpenFlow is the most widely deployed southbound standard from the open source community. OpenFlow provides various kinds of information to the controller. It generates event-based messages in case of port or link changes. The protocol also generates flow-based statistics for the forwarding device and passes them to the controller. A Packet-In message is sent when a switch does not know how to handle a new incoming packet. However, OpenFlow is not the only choice for the southbound interface. Other interfaces include the Open vSwitch Database (OVSDB) [17], ForCES [10], OpFlex [18], etc.

OVSDB is a southbound API designed to provide additional management capabilities such as networking functions. With OVSDB we can create virtual switch instances, set up their interfaces, and connect them to the switches. We can also apply Quality of Service (QoS) policies to the interfaces. The OpFlex southbound API contrasts with the OpenFlow API to some extent: OpFlex allows forwarding devices to take over some of the management functionality. Initially, it abstracts the policies from the underlying plane and decides which functionality is to be placed where.
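To make the reactive flow setup over the southbound interface concrete, the following minimal Java sketch shows a controller-side handler for a Packet-In-style event that learns host locations, computes a forwarding decision, and pushes a matching flow rule back to the switch. The types and names (PacketIn, FlowRule, Switch) are hypothetical simplifications for illustration only, not the API of any specific controller or OpenFlow library.

import java.util.HashMap;
import java.util.Map;

public class ReactiveForwarder {
    record PacketIn(long switchId, int inPort, String srcMac, String dstMac) {}
    record FlowRule(String matchDstMac, int outPort, int idleTimeoutSec) {}

    interface Switch {
        void install(FlowRule rule);              // push a rule into the flow table
        void forward(PacketIn pkt, int outPort);  // release the buffered packet
    }

    static final int FLOOD = -1; // pseudo-port meaning "all ports"

    private final Map<String, Integer> macToPort = new HashMap<>();

    // Called for every packet the switch could not match on its own.
    public void onPacketIn(Switch sw, PacketIn pkt) {
        macToPort.put(pkt.srcMac(), pkt.inPort());      // learn source location
        Integer outPort = macToPort.get(pkt.dstMac());
        if (outPort == null) {
            sw.forward(pkt, FLOOD);                     // unknown destination: flood
            return;
        }
        // Known destination: install a flow entry so that subsequent
        // packets are forwarded entirely in the data plane.
        sw.install(new FlowRule(pkt.dstMac(), outPort, 30));
        sw.forward(pkt, outPort);
    }
}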
Regarding the northbound interface, controllers offer a wide range of API support, such as ad-hoc APIs, RESTful APIs, etc. We can hope for a common standard for the northbound API soon, as SDN is growing day by day. The selection of the northbound interface depends on the programming language used in application development. NOSIX [24] was the first approach towards a northbound interface implementation that was independent of programming language and controller aspects. The emergence of a common northbound API is a critical task, as the requirements of each networking application can vary. For example, a security application can have requirements different from those of routing applications. The Northbound Interface working group of the ONF community is already working on a common standardization of the northbound API.

E. NETWORKING APPLICATIONS
Network applications available at the management layer are responsible for implementing the control logic, which produces the appropriate commands to be installed in the data plane. The network applications are broadly divided into five categories: traffic engineering; mobility and wireless; measurement and monitoring; security and dependability; and data center networking. Various types of networking applications can be implemented at the management level, e.g., load balancing, traffic optimization, QoS enforcement, prediction of application workloads, fine-grained access control, etc. [20].

IV. CONTROLLER CATEGORIES
Software-defined networking makes use of two types of controllers: centralized and distributed. Fig. 5 describes the classification of the various controllers into these two categories.
From the programming language perspective, C#, Java, and Python were the candidates for the Beacon implementation. The lack of official support across different operating system platforms eliminated C# as a choice. Similarly, Python was eliminated because of its lack of true multi-threading support. This left only Java, which provides effective memory management and proper handling of segmentation faults and memory leaks.

From the developer productivity point of view, Beacon provides a rich set of libraries for application development. For example, for routing purposes it provides the IRoutingEngine interface, which is helpful for designing different routing modules; shortest path routing [28] is one well-known example of a routing module that can be implemented through this interface. For network topology, it provides the ITopology interface, which contains a set of operations to retrieve information related to link discovery and link registration/deregistration. To get the benefit of the OpenFlow protocol, it uses the OpenFlowJ API, a Java-based implementation of the OpenFlow 1.0 specification, so Beacon can interact with OpenFlow switches through the IBeaconProvider interface. Code reusability is an important property of object-oriented languages, and Beacon achieves it through a library called Spring, which lets it create multiple instances of an object and bind them together into a single entity. Beacon also provides other facilities, such as device management and web application development, through various available interfaces.

To provide the runtime modularity feature, Beacon uses the Open Services Gateway initiative (OSGi) specifications [29]. Following the OSGi specification, Equinox provides the facility to create a new instance of an application at runtime. Not only can existing applications be started and stopped at runtime, completely new ones can also be created. One more specification is the OSGi service registry, which allows new services to register themselves so that a user can pick any service from the pool based on their requirements.

Performance is a major challenge in any networking design. From the controller perspective, we define performance based on two parameters: the time required to process a single request and the number of input packet requests the controller can handle per unit time. Proper event handling can lead to effective performance in any architecture. In Beacon, a pipelining of messages is provided. A shared queue is implemented which contains the packets from all switching elements; any worker thread in the controller can pick an available message request and execute it. For that purpose, an IOFMessageListener should register with the IBeaconProvider [25].
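The shared-queue pipeline described above can be sketched with standard Java concurrency utilities. The sketch below is illustrative only and assumes a simplified OFMessage type; it mirrors the pattern of worker threads draining one queue fed by all switch connections, rather than Beacon's actual internals.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MessagePipeline {
    // Simplified stand-in for an OpenFlow message plus its source switch.
    record OFMessage(long switchId, byte[] payload) {}

    private final BlockingQueue<OFMessage> sharedQueue = new LinkedBlockingQueue<>();

    // Switch I/O threads enqueue every incoming message here.
    public void enqueue(OFMessage msg) {
        sharedQueue.add(msg);
    }

    // Start n worker threads; each repeatedly picks whatever message is
    // available, so the load balances itself across the cores.
    public void startWorkers(int n) {
        for (int i = 0; i < n; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        OFMessage msg = sharedQueue.take(); // blocks when idle
                        handle(msg);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();     // clean shutdown
                }
            }, "worker-" + i);
            worker.start();
        }
    }

    private void handle(OFMessage msg) {
        // Application logic (routing, topology update, ...) goes here.
    }
}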
2) ROSEMARY
Sometimes a network application interferes with the controller program and causes it to malfunction. Rosemary [30] offers controller resiliency against such interference by third-party applications: even if they behave maliciously themselves, they cannot harm the controller's functioning. In Rosemary, the NOS is designed in such a way that it provides security with respect to OpenFlow applications, and it builds a sandbox-like structure around each application. The NOS architecture is specially designed as a micro-NOS architecture [31]. The three design pillars of this architecture are:
i) Schedule each network application separately, in a different address space from the controller.
ii) Implement a resource monitoring system, so that we can track the resource consumption pattern of each application to find out its behavior.
iii) Introduce a permission structure for each micro-NOS instance, so that we can place constraints on each instance regarding libraries, resources, etc.
During the design of Rosemary, the two important issues were robustness and security. The researchers found that network applications do not support the robustness property because
they work in the same privilege zone in which the network OS applications are running. The NOS also lacked a resource management facility, as each application could demand any number of resources for task execution, and the architecture was monolithic. These two factors motivated the enforcement of a separate process context for network applications. From a security point of view, they found that network applications did not need any authentication procedure to initiate task execution, and access control mechanisms were not properly adopted, so resources could be accessed in an unsolicited way. These two factors motivated the NOS to incorporate a security paradigm in the design architecture [31].

Based on these issues, the design of Rosemary separates the network application's context from the controller context. A compartment is made for each shared module of the NOS applications. Each time an application needs resources, it first contacts the NOS kernel, which performs a scheduling activity, such as fair-share scheduling, to grant resources to the application [30].

While performing its tasks, the NOS should ensure that it creates a clear view of the network application, e.g., shared resources, functions, libraries, etc. The abstraction should be minimal so that maximum performance can be achieved. The design also ensures a proper balance between robustness and performance: sometimes the enforcement of too many constraints leads to latency overhead, and the system cannot accept a large number of requests. The NOS takes care of this by implementing a lightweight architecture.

3) MAESTRO
The main factor in the design of any system is high performance with parallelism. Consider the case of datacenter network design [32]. A datacenter network should be capable of accepting multiple requests at the same time. It should have proper scheduling strategies [33] to schedule the requests from multiple users. The design should also account for fault tolerance [34], so that it can continue working even in the case of partial failure.

The same issues are important while designing any controller, and Maestro [35] deals with them. It is a Java-based multithreaded controller from Rice University. It explores additional throughput optimization techniques to achieve maximum performance by exploiting parallelism. If we observe the basic working of any controller, we find that the first packet of a flow is sent to the controller each time. After performing a security check, the controller carries out path calculation and pushes appropriate flow entries into the data plane. This scenario performs well when the number of packet requests is small, but in a datacenter-like scenario, where around 10 million requests arrive per second, it is not a good choice.

To deal with this, Maestro introduces the concept of batching: multiple requests from users are grouped into a single batch. Once a thread is free, it can pick any available pending request from the batch and start executing it. The available number of threads depends upon the number of cores. Maestro allows multiple flow requests to be executed by different worker threads; through this we achieve parallelism, which is the first design issue in Maestro. There are some design considerations regarding threads. First, how can we divide the available requests among the threads? One solution is to distribute the requests fairly among the available cores/threads by introducing a dedicated task queue for each thread. However, this has a drawback: an idle thread cannot take care of requests belonging to other threads. Also, the demand of each request can differ, requiring a varying number of CPU cycles. The second design issue is core binding. It often happens that when we move actively running code from one CPU to another, the processing fails if both cores do not share a common cache; in such a case, manual synchronization of core state and cache is needed to continue execution in the new environment. This is called core binding, and it is an overhead for the system. An alternative is to perform all execution of one task on a single core, irrespective of execution time. Thread binding is also a solution, in which a thread first checks its own dedicated task queue for work and, once that is empty, accesses the shared queue to pick any available pending request. Each queue should specify some threshold so that it does not grow too large. Maestro also provides for task priority: different tasks have different priorities, e.g., the output stage has the highest priority and the input stage the lowest. The reason for assigning the lowest priority to the input stage is that its tasks are shared in the raw-packet task queue; tasks in the flow-processing stage possess medium priority.

Maestro introduces a threshold on the raw-packet task queue which is called the Pending Raw-packet Threshold (PRT) [35]. The PRT governs the quantity of pending tasks in the queue: when the pending queue holds more requests than the PRT, incoming tasks are paused to maintain the queue size; similarly, if the number of tasks becomes small, input requests are resumed. The question then is how to decide the size of the PRT. The PRT should be large enough that the raw-packet task queue can never become completely empty.
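The PRT mechanism described above amounts to a bounded queue with hysteresis on the producer side. The following Java sketch illustrates that behaviour under simplified assumptions: a single pause/resume flag stands in for Maestro's actual flow-control machinery, and the low-water mark is an arbitrary choice.

import java.util.ArrayDeque;
import java.util.Deque;

public class RawPacketQueue {
    private final int prt;                  // Pending Raw-packet Threshold
    private final int resumeMark;           // low-water mark to resume input
    private final Deque<byte[]> pending = new ArrayDeque<>();
    private boolean inputPaused = false;

    public RawPacketQueue(int prt) {
        this.prt = prt;
        this.resumeMark = Math.max(1, prt / 2); // assumed low-water mark
    }

    // Called by the input stage for every raw packet received.
    public synchronized boolean offer(byte[] rawPacket) {
        if (pending.size() >= prt) {
            inputPaused = true;             // too many pending tasks: pause input
            return false;                   // caller stops reading from sockets
        }
        pending.addLast(rawPacket);
        return true;
    }

    // Called by worker threads in the flow-processing stage.
    public synchronized byte[] poll() {
        byte[] pkt = pending.pollFirst();
        if (inputPaused && pending.size() <= resumeMark) {
            inputPaused = false;            // queue drained enough: resume input
        }
        return pkt;
    }

    public synchronized boolean isInputPaused() {
        return inputPaused;
    }
}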
Maestro also provides a rich set of interfaces and libraries. Discovery is one application, which continuously sends probing messages to the switches to find the status of a newly joined switch. When the discovery routine sees the returned LLDP probing message, it can decide where the message is coming from and thereby determine the topology of the system. The IntradomainRouting application updates the routing table once the topology changes. The Authentication application takes care of the security checks; once a request passes, the RouteFlow application can determine the appropriate path for it. How does one ensure that the RouteFlow application is using a correct routing table during an update procedure? For this, Maestro executes path selection for the current request against the older routing table, to keep the result consistent; once the update takes place, it applies to upcoming requests [35].
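The consistency trick just described, serving in-flight requests from the old routing table while an update is prepared, is essentially copy-on-write snapshotting. A minimal sketch, with invented types and not Maestro's actual code, follows.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class VersionedRoutingTable {
    // Immutable snapshot: next hop per destination.
    record Snapshot(Map<String, String> nextHop) {
        String route(String dst) { return nextHop.get(dst); }
    }

    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(Map.of()));

    // In-flight requests keep using whatever snapshot they grabbed,
    // so a concurrent update never yields a half-updated table.
    public Snapshot snapshot() {
        return current.get();
    }

    // Topology changed: build a fresh table and publish it atomically.
    // Only requests arriving after this point see the new routes.
    public void publish(Map<String, String> newRoutes) {
        current.set(new Snapshot(new HashMap<>(newRoutes)));
    }
}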
5) MERIDIAN
The Meridian controller was originally designed for the applicability of the SDN architecture in a cloud environment. The idea was to build a service-level network that can support features like policy abstraction and high connectivity in the cloud. SDN fits the cloud architecture perfectly, whether as Infrastructure as a Service (IaaS) or Platform as a Service (PaaS).

The Meridian cloud network architecture, which is an SDN architecture for cloud networking, is mainly organized into three different layers. The first layer, the Abstracted API layer, is responsible for exposing the required abstracted details of the network model. For example, topology abstraction is one kind of function, in which an abstracted view of the topology of the underlying network is presented to the networking applications. The level of abstraction can differ from application to application: a cloud orchestrator requires detailed and complete topology information, as it needs to decide where a virtual machine should be placed, while a control application requires the topology along with different path information for controlling the networking devices. The second layer, the Network Orchestration layer, collaborates with the abstraction layer to convert logical commands into their corresponding physical commands.

6) OPENDAYLIGHT
The OpenDaylight project started with the concept of the model-driven software engineering (MDSE) approach. Its architecture is inspired by Beacon and makes use of the Open Service Gateway Interface (OSGi). MDSE consists of a framework which defines models and the relationships among them. The different models communicate with each other through a data modelling language. These models are platform independent, to support different business policy needs. NETCONF and RESTCONF are used as model-driven network management protocols. The basic operations supported by NETCONF are Create, Retrieve, Update, and Delete; besides these, it also supports Remote Procedure Call (RPC) operations. The data encoding technique used in NETCONF is based on XML, to support data configuration and operation. Another configuration protocol is RESTCONF, which is similar in some respects to a typical REST-like protocol; it is responsible for providing a programmatic interface over HTTP [38]. YANG is used as the modelling language so that the models can communicate with each other. Initially, YANG was used to configure the models, but later it was also used to describe other network constructs, i.e., services, policies, protocols, etc.
The data structure used by YANG is a tree. The internal structure of the tree can be more complex, i.e., lists and unions [39].

While developing the OpenDaylight controller, certain considerations were taken. The controller should be flexible. It should provide a common configuration platform for the development of different applications. The system should support runtime modularity for the addition of models at runtime, and this modularity should meet the performance and scalability requirements [38].

Fig. 6 describes the network view of the OpenDaylight controller architecture. The OpenDaylight architecture consists of a set of northbound and southbound plugins which are separated by the Service Adaption Layer (SAL). At the southbound side, the plugins are OpenFlow, the NETCONF client, PCEP, etc. Similarly, the northbound plugins consist of the Topology Exporter, the Forwarding Rule Manager, the Statistics Manager, etc. To meet the objectives of OpenDaylight, the SAL is modified using the model-driven software engineering concept and termed the Model-Driven Service Adaption Layer (MDSAL). The following are the main points regarding the MDSAL:
i) An RPC is a call from a consumer to a provider which is processed either locally or remotely. The call connection is of one-to-one type.
ii) A Notification is a reply expected by the consumer from the provider side.
iii) The Data store is a tree-like logical structure described by the YANG schemas.
iv) A Path is the location of a specific leaf in the tree.
Fig. 7 describes the architecture of the MDSAL. It consists of two different brokers for data handling. The DOM broker deals with the runtime activity of the architecture, while the binding-aware broker deals with the Java APIs for plugins. The BA-BI connector works as a mediator between the DOM broker and the binding-aware broker. To implement dynamic late binding, the BA-BI connector works along with the Codec Registry and the Codec Generator.
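To illustrate points iii) and iv) above, the sketch below models a data store as a tree addressed by paths, in the spirit of the YANG-described MDSAL data store. It is a deliberately minimal illustration; the real MDSAL data store, schemas, and path types are far richer.

import java.util.HashMap;
import java.util.Map;

// A toy tree-shaped data store: every node is either a container with
// children or a leaf with a value, and a path names one leaf.
public class TreeDataStore {
    private final Map<String, Object> root = new HashMap<>();

    // Write a value at a slash-separated path, e.g. "topology/node1/ip".
    @SuppressWarnings("unchecked")
    public void put(String path, String value) {
        String[] parts = path.split("/");
        Map<String, Object> node = root;
        for (int i = 0; i < parts.length - 1; i++) {
            node = (Map<String, Object>) node
                    .computeIfAbsent(parts[i], k -> new HashMap<String, Object>());
        }
        node.put(parts[parts.length - 1], value);
    }

    // Read the leaf a path points to, or null if absent.
    @SuppressWarnings("unchecked")
    public String get(String path) {
        String[] parts = path.split("/");
        Map<String, Object> node = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = node.get(parts[i]);
            if (!(child instanceof Map)) return null;
            node = (Map<String, Object>) child;
        }
        Object leaf = node.get(parts[parts.length - 1]);
        return leaf instanceof String s ? s : null;
    }
}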
B. DISTRIBUTED CONTROLLER
In comparison to centralized controllers, distributed controllers have advantages in scalability and in sustaining high performance when the demand of requests increases.

1) HYPERFLOW
HyperFlow [40] was the first distributed control plane designed for OpenFlow. The original design of HyperFlow is inspired by NOX [26]. The design is distributed in the sense that different controllers are physically available, but they form a logically centralized environment. Certain issues were pointed out during the design: when the switches increase in quantity, the traffic towards a centralized controller increases, and in such a condition centralized control becomes a bottleneck. To handle this scenario we need multiple controller replicas, physically distributed over a geographical area. A large network also gives rise to long flow setup latency for switches, and the processing power of an individual controller is also a significant issue.

FlowVisor [41] has a design similar to HyperFlow, but it allows resource slicing, so that each slice is taken care of by a corresponding controller instance. HyperFlow uses a publish/subscribe message system to send event messages towards the other controllers. During message passing, it is
necessary to have persistent storage of events, because there may be a chance of reordering of the event list for a controller while the network is partitioned into slices. To support this, HyperFlow makes use of WheelFS [42], a distributed file system for distributed applications. When the network gets partitioned, WheelFS continues functioning in each partition; controllers in one partition do not receive any messages from controllers in other partitions.

In the HyperFlow design, each controller can directly program only the switches it controls. To control others, it publishes a message which contains the source controller identifier, the target switch identifier, and the local command identifier. Each controller keeps sending periodic messages to show its presence in the network. If a controller fails to send a message within three advertisement intervals, it is assumed to have failed. In this condition, the switches associated with the failed controller need to migrate to some other controller to continue operation. Control applications should not depend on the temporal ordering of events, unless the events belong to the same switch or link, because different controllers see different orderings of events. HyperFlow also supports an authoritative controller to ensure correct operation in each case [40].
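The liveness rule described above (a controller is presumed dead after three missed advertisement intervals) can be sketched as follows. This is a simplified illustration of the idea, not HyperFlow's implementation; the interval value and class names are assumptions.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ControllerLivenessMonitor {
    private static final long ADVERT_INTERVAL_MS = 1000; // assumed value
    private static final int MISSED_LIMIT = 3;           // per the design rule

    // controller id -> timestamp of the last advertisement seen
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    // Called whenever an advertisement from a peer controller arrives
    // (e.g., via the publish/subscribe event channel).
    public void onAdvertisement(String controllerId) {
        lastSeen.put(controllerId, System.currentTimeMillis());
    }

    // A controller is presumed failed once three intervals pass silently;
    // its switches must then be migrated to a surviving controller.
    public boolean isFailed(String controllerId) {
        Long ts = lastSeen.get(controllerId);
        if (ts == null) return true; // never heard from it
        return System.currentTimeMillis() - ts > MISSED_LIMIT * ADVERT_INTERVAL_MS;
    }
}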
2) SMARTLIGHT
SMaRtLight [43] is designed to address the fault tolerance issue in the network. Three aspects of failure are discussed: switch or link failures in the data plane, switch-controller connection failures in the control plane, and failure of the controller itself. The controller is designed by extending the Floodlight [44] controller. A lease management application, a one-to-one mapping between switches and the data store connection, and caching support for the data store are the major changes carried out in Floodlight for the SMaRtLight design.

To achieve fault tolerance, the job of overall network management is assigned to a single controller, called the primary controller, while the other controllers act as backups. In the case of failure of the primary, a smooth transition is performed to choose one of the backups as the new primary. The controller stores all application-related data in a shared data store implemented through a Replicated State Machine (RSM) [45]. The shared data store keeps all the state information related to the application, such as the Network Information Base (NIB). During the transition from the failed controller, the new controller gets a complete update from the data store.

The design also keeps a cache for fast access to state information, so the data store does not need to be accessed again and again for simple read operations. The cache is also free from synchronization concerns, because at any time only one controller is using it.

The system is arranged in such a way that switches can directly connect to a controller but not to the data stores; the controllers sit between the data store and the switches, so they can communicate with both. In terms of the message-passing paradigm, we can say that a process p is connected to a process q if a request sent by p can be answered by q within a predefined time interval [43]. The system design gives three main rules to detect the failure of a component:
i) If a switch is connected to all correct controllers and it has not crashed, then it is working correctly.
ii) If a controller is connected to all data servers and it has not crashed, then it is working correctly.
iii) If a data store is well connected to all correct data servers and is running the recovery protocols, then we can say that it is working correctly.
Initially, all the switches see the controllers' role as EQUAL. Once a primary gets selected, the primary replica changes its
status in all switches to MASTER. Eventually, this leads to a change of the role of the other controllers to SLAVE in all switches. A few points are to be noted about how the controller replicas interact, since at any point in time there can be only one primary replica in the system. Once the primary replica fails, some correct controller replica can claim to become the primary. Each replica must generate an acquireLease(id, L) message, which is sent to the data server. The parameter id defines the id of the replica, and L defines the lease time required. The reply message from the data store contains the id of the primary replica. If a controller receives its own id from the data server, it becomes eligible to be MASTER; otherwise, nothing happens and the lease time is renewed for the current primary. The data store is designed to work over a key-value store interface which supports basic operations like put, get, remove, list, etc. It also supports a cache for fast access to state information [43].
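The lease-based election described above can be sketched as follows. The sketch assumes a trivial synchronous stand-in for the replicated data store holding the current lease and a wall-clock expiry; the method name mirrors the acquireLease(id, L) message from the text, but everything else is an illustrative simplification.

public class LeaseElection {
    // A trivial stand-in for the replicated data store's lease record.
    static class LeaseStore {
        private String holderId;
        private long expiresAtMs;

        // Grant or renew the lease atomically; return the current holder.
        synchronized String acquireLease(String candidateId, long leaseMs) {
            long now = System.currentTimeMillis();
            if (holderId == null || now >= expiresAtMs || holderId.equals(candidateId)) {
                holderId = candidateId;          // lease free, or a renewal: grant it
            }
            expiresAtMs = now + leaseMs;         // any request extends the lease
            return holderId;                     // reply carries the primary's id
        }
    }

    private final LeaseStore store;
    private final String myId;

    LeaseElection(LeaseStore store, String myId) {
        this.store = store;
        this.myId = myId;
    }

    // A replica is primary iff the store replies with its own id.
    boolean tryBecomePrimary(long leaseMs) {
        return myId.equals(store.acquireLease(myId, leaseMs));
    }
}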
3) ONOS
The Open Network Operating System (ONOS) [46] is an open source distributed controller designed for the SDN environment. It is mainly designed to address scalability, availability, and performance issues. The major challenges identified are:
i) Achieve high throughput: about 1M requests per second.
ii) Keep latency in the 10-100 ms range for event processing.
iii) Handle a network state size on the order of 1 TB.
iv) Achieve high availability: 99.99% for services.
Similar to HyperFlow, it supports a logically centralized but physically distributed controller architecture. Two prototype specifications are defined for ONOS. Prototype 1 focuses on building a network architecture which provides a global network view with fault tolerance and scalability features; it was originally based on the open source single-instance controller Floodlight [44]. Prototype 2 focuses on improving the performance of the overall system. To achieve this, the designers paid attention to the number of remote operations and the time required to process them, both of which should be kept as small as possible. To achieve high performance in the system, the RAMCloud data store [47] is used, which provides latencies in the range of 15-30 µs per read/write operation. Topology cache support is provided to reduce the time of the most frequent read operations; faster lookup is possible through the in-memory topology view. The polling issue is addressed by implementing a publish-subscribe event notification and communication system based on Hazelcast [48]. The Network View API is also simplified and contains three major areas: topology abstraction, the path installation system, and events. Table 1 discusses the various characteristics of the controllers in the centralized and distributed domains.

4) FLEET
Fleet [21] is one of the first controllers which addressed the malicious administrator problem. The idea is to prevent the controller from malfunctioning because of malicious administrator configuration. An administrator can damage the routing, forwarding, and network ability of the controller. It has been observed that human errors are responsible for 50% to 80% of network outages [49], and a malicious network administrator can easily degrade the performance of the system by misconfiguring the controller.

The objective of the Fleet design is to prevent the k malicious administrators among n administrators from further affecting routing, forwarding, and availability in the system. It is assumed that the number of network administrators for a network is restricted to at most 10. Switches in the system are installed with an authentication scheme so that they can verify the controller. The non-malicious administrators are grouped, and they follow a single routing policy among themselves. Besides this, all administrators have proper communication with each switch for message exchange. The probability of compromise, the protocol overhead, and the recovery time are the metrics considered for measurement. The first metric is used to find the probability that, among the given controllers, a group of k controllers will have a network configuration different from that of the non-malicious controllers [21]. Protocol overhead specifies the computational overhead carried by the system from the time a failure is first introduced into the network until a fix is performed for it; the total time this takes denotes the third parameter, recovery time.

The basic building blocks of the Fleet architecture are the Administrator Layer and the Switch Intelligence Layer [21]. These layers are arranged in a logically centralized manner, but physically they are distributed across various controller instances. The switch intelligence layer interacts with its corresponding switch and operates on each switch. Two versions of the Fleet design are suggested by the researchers: the single-configuration approach and the multi-configuration approach. In the single-configuration approach, all the administrators agree upon a single threshold value for making a high-level routing decision, which is installed in the corresponding switches. On the other hand, the multi-configuration approach allows a set of n different routing configuration decisions from the different administrators and selects one of them for a particular switch flow based on metrics.

5) ONIX
SDN requires a common control platform which allows implementing various control functions like routing, access control, traffic engineering, etc. Onix [19] was introduced as a distributed controller offering such a common control platform, written in C++. There are certain challenges in designing a common control platform. The control platform should ensure that it provides the various functionality needed by management applications in various contexts. Scalability is a need of any network architecture, so the control paradigm should meet this requirement.
i) It should be reliable, to handle failures in the system.
ii) It should provide a simplified structure for building management applications.
iii) Control plane functionality should not place an additional burden on the overall functioning of the system with respect to performance and latency.
Onix allows its instances to be written in multiple languages, which subsequently run in different processes; currently, Onix supports C++, Python, and Java for implementation. Some of the management applications built on top of Onix instances are multi-tenant virtualized data centers, a scale-out carrier-grade IP router, a distributed virtual switch, etc. The NIB stores all the state information of the switches and is the central function supported by the Onix API. Query, create, destroy, access attributes, notifications, synchronize, configuration, and pull are functions supported by the Onix API [19].
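The NIB operations listed above can be pictured as an interface along the following lines. This is a hypothetical Java rendering for illustration (Onix exposes its own API in C++, Python, and Java); the method names simply mirror the operations named in the text.

import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Illustrative only: a NIB-style API mirroring the operations the text
// lists (query, create, destroy, attributes, notifications, sync, pull).
public interface NetworkInformationBase {
    List<String> query(String entityType, Map<String, String> filter);

    String create(String entityType, Map<String, String> attributes);

    void destroy(String entityId);

    Map<String, String> accessAttributes(String entityId);

    // Register for change notifications about one entity.
    void onChange(String entityId, Consumer<Map<String, String>> listener);

    // Push local NIB state to (or reconcile it with) the network elements.
    void synchronize();

    void configure(String entityId, Map<String, String> config);

    // Pull fresh state from the network into the NIB on demand.
    Map<String, String> pull(String entityId);
}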
6) PANE
The idea of the PANE [50] controller is that there should be a configuration API between the user and the control plane. In a networking scenario, it often happens that a required condition cannot be fulfilled instantly, but it can be reserved for the future; the configuration API does exactly this. It applies greater visibility and control over the network to make a required reservation. PANE deals with two problems: the decomposition of control and visibility in the network, and conflict resolution among the users and their requests. Decomposition of control and visibility is resolved with the use of privileges; similarly, request conflicts are resolved by making use of a conflict resolution operator and the Hierarchical Flow Table.

Principals in PANE are end users, or more specifically applications running on their behalf. A principal can interact through three types of messages: requests, queries, and hints. A request message is used to take control over resources, e.g., bandwidth or access control. Query messages are used to gain information about the network state.
Hint messages indicate the future demands on the system or its possible future behaviour. Principals should be limited in their authority. For this, PANE introduces the concept of a share, which is a combination of a principal, privileges, and a flow group. A share indicates which principal can issue which message for which flow. Based on the different shares, PANE prepares a share tree. The share tree does not itself introduce any new policy into the system; rather, it applies constraints to the existing policies. Policies and the share tree combine to form a policy tree. It is possible for two policies to conflict with each other over some criterion; to handle such conditions, policy trees are organized as Hierarchical Flow Tables [50].

A request in PANE is processed step by step. First, a principal generates a request, for a resource or something else, and passes it to the controller; note that only an authenticated principal can send a request message to the controller. PANE first checks the integrity of the message and whether it follows the specified criteria and is compatible with the network state. If the request passes, it is added to the tree, and the controller installs the appropriate policy in the network.

Conflict resolution in PANE takes place through the conflict resolution operator used in the Hierarchical Flow Table. For any two conflicting requests, three types of operators can be applied: +D, +P, and +S. The conflicting requests can have different types of relationships to each other: they can be siblings, they can be parent and child, or they can belong to the same share. Based on this relationship, PANE applies the appropriate operator. The +D operator is used to resolve a conflict when both requests belong to the same share; the +P operator is used when the requests have a parent-child relationship; and the +S operator applies when both requests are siblings. Based on the operator, PANE follows very simple procedures: in the case of a parent-child relationship, the child request overrides the parent request; similarly, in the case of siblings, a Deny request will be overridden by an Allow request. The +D and +S operators in PANE have a similar meaning in the conflict resolution procedure [50].
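A toy version of this resolution logic, following the rules exactly as stated above (the child overrides the parent; among siblings, an Allow overrides a Deny, with +D behaving like +S), might look as follows. The types are invented for illustration and do not model PANE's full Hierarchical Flow Tables.

public class ConflictResolver {
    enum Decision { ALLOW, DENY }
    enum Relation { SAME_SHARE, PARENT_CHILD, SIBLINGS }

    record Request(String principal, Decision decision) {}

    // Resolve two conflicting requests using the +D / +P / +S rules
    // described in the text; 'first' is the parent when PARENT_CHILD.
    static Decision resolve(Request first, Request second, Relation rel) {
        switch (rel) {
            case PARENT_CHILD:
                return second.decision();             // +P: child overrides parent
            case SIBLINGS:
            case SAME_SHARE:                          // +D behaves like +S here
                if (first.decision() == Decision.ALLOW
                        || second.decision() == Decision.ALLOW) {
                    return Decision.ALLOW;            // Allow overrides Deny
                }
                return Decision.DENY;
            default:
                throw new IllegalArgumentException("unknown relation");
        }
    }
}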
Each request is processed in either strict mode or partial mode. In strict mode, the required condition must hold exactly for each packet; no relaxation is allowed. For example, if an application is demanding 50 Mbps of bandwidth, then the result of the HFT must allocate 50 Mbps of bandwidth to it. In partial mode, on the other hand, the required condition can be relaxed: for the same example, the required bandwidth of 50 Mbps might be relaxed to 40 Mbps. Each of these two modes has its own advantages, and the choice depends purely on the network application's behaviour. Another point of interest in the PANE design is the Network Information Base, which stores the network elements, like switches, ports, queues, etc., and their corresponding capabilities. The NIB translates logical actions into their corresponding physical actions and holds the necessary information related to the switch characteristics: vendor, version, and statistics details [50].

Just like the other controllers, PANE also contains fault tolerance and resilience procedures. Two types of failure are possible in a network: one is the failure of networking elements, i.e., links, ports, switches, etc.; the second is the failure of the controller itself. In the case of a link failure or link modification, the PANE controller recompiles the policy tree. Since the link has been updated, the outcome of the recompilation is not necessarily available; if it is not, the controller processes the requests again to recreate a new policy tree, and each principal is informed about this. To handle controller failure, PANE stores a database instance into a log by periodically checkpointing the database. If the controller restarts because of a failure, the instance details are copied from the log record so that the controller can continue functioning as before [50].

V. PERFORMANCE ANALYSIS
This section discusses a performance analysis of the various controllers discussed in the previous section. Two metrics, namely throughput and response time, are taken for the analysis. Throughput defines the number of input requests which a controller can handle per second. Response time defines latency, which is the time period required by the controller to process a request.

Cbench [51] provides a platform to evaluate these parameters. The features offered by Cbench include the measurement of maximum and minimum response times for a controller irrespective of the number of connected switches, throughput measurement in a bounded environment (i.e., a bounded number of packets in flight), calculation of maximum throughput, etc. It operates in two modes, latency mode and throughput mode. In latency mode, each switch sends a single request at a time, waits until its processing is over, and only then proceeds to the next request; generally, a low-load condition is considered in this case. Throughput mode, on the other hand, computes the maximum number of flows processed by the controller per unit time.
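The difference between the two measurement modes can be illustrated with a small harness. The sketch below is schematic only: a Controller interface with a synchronous handle() call stands in for a real switch connection, and it is not Cbench itself (real throughput mode keeps many requests outstanding at once).

public class BenchmarkModes {
    interface Controller {
        void handle(byte[] packetIn);   // process one request synchronously
    }

    // Latency mode: one outstanding request at a time; report mean latency.
    static double latencyModeNs(Controller c, byte[] probe, int rounds) {
        long total = 0;
        for (int i = 0; i < rounds; i++) {
            long t0 = System.nanoTime();
            c.handle(probe);            // wait for the reply before continuing
            total += System.nanoTime() - t0;
        }
        return (double) total / rounds;
    }

    // Throughput mode: keep the controller busy for a fixed window
    // and report completed requests per second.
    static double throughputPerSec(Controller c, byte[] probe, long windowMs) {
        long done = 0;
        long end = System.currentTimeMillis() + windowMs;
        while (System.currentTimeMillis() < end) {
            c.handle(probe);            // back-to-back requests, no pauses
            done++;
        }
        return done * 1000.0 / windowMs;
    }
}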
The controllers' throughput performance is evaluated in both multithreaded and single-threaded environments. For the comparative analysis of the controllers, the system configuration is assumed to be an Intel Xeon E5-2870 processor with 32-64 GB of RAM and Ubuntu 11.10 or a later VM. Fig. 8 shows the responses of the controllers in a single-threaded environment, where Onix leads with 2.2M requests per second. Among the discussed controllers, Onix and Beacon are the only ones which show throughput over 1M requests per second; Onix handles double the responses per unit time compared to controllers like NOX and ONOS. Ryu [52], HyperFlow, and POX [53] are among the controllers with the fewest responses per unit time. For application development which requires higher throughput, Onix can be a preferred choice.

On the other hand, Fig. 9 shows the throughput performance in a multithreaded environment, where Beacon again shows throughput above 1M requests per second.
FIGURE 8. Throughput in single thread environment.
FIGURE 10. Latency in single thread environment.
VI. CONCLUSION
Bringing the routing control functionality to a centralized location relaxes the work of the forwarding devices. All the intelligence of SDN comes from the controller, which acts as the brain of the system. The capability of a controller can be characterized by the number of requests it can handle from the switches. Various modules inside the controller take care of network discovery, path discovery, flow-pushing functionality, etc.

The centralized controller provides a simplified architecture and efficient handling of request messages, but it fails to address the scalability issue. Distributed controllers, on the other hand, perform well on scalability and give maximum throughput with high availability, but they require a proper message exchange procedure within the cluster.

Both categories contain controllers from open source communities as well as dedicated vendors. Open source controllers like ONOS, Beacon, and OpenDaylight provide rich community support for getting acquainted with the SDN concept. We can also categorize the controllers based on their use in industry and academia. The selection of a controller depends upon various criteria, like single-threaded versus multithreaded operation, and the choice of controller for academia can differ from that for industry.
REFERENCES
[1] D. L. Tennenhouse, J. M. Smith, W. D. Sincoskie, D. J. Wetherall, and G. J. Minden, "A survey of active network research," IEEE Commun. Mag., vol. 35, no. 1, pp. 80-86, Jan. 1997.
[2] B. Pfaff, J. Pettit, K. Amidon, M. Casado, T. Koponen, and S. Shenker, "Extending networking into the virtualization layer," in Proc. Workshop Hot Topics Netw., 2009, pp. 1-6.
[3] N. McKeown et al., "OpenFlow: Enabling innovation in campus networks," ACM SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp. 69-74, Apr. 2008.
[4] K. Sood, S. Yu, and Y. Xiang, "Software-defined wireless networking opportunities and challenges for Internet-of-Things: A review," IEEE Internet Things J., vol. 3, no. 4, pp. 453-463, Aug. 2016.
[5] H. Kobo, A. Abu-Mahfouz, and G. Hancke, "A survey on software-defined wireless sensor network: Challenges and design requirements," IEEE Access, vol. 5, pp. 1872-1899, 2017.
[6] H. Zubaydi, M. Anbar, and C. Wey, "Review on detection techniques against DDoS attacks on a software-defined networking controller," in Proc. PICICT, 2017, pp. 10-16.
[7] A. Neghabi, N. Navimipour, M. Hosseinzadeh, and A. Rezaee, "Load balancing mechanisms in the software defined networks: A systematic and comprehensive review of the literature," IEEE Access, vol. 6, pp. 14159-14178, 2018.
[8] D. Kreutz, F. M. V. Ramos, P. E. Veríssimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, "Software-defined networking: A comprehensive survey," Proc. IEEE, vol. 103, no. 1, pp. 14-76, Jan. 2015.
[9] D. Sheinbein and R. Weber, "800 service using SPC network capability," Bell Syst. Tech. J., vol. 61, no. 7, pp. 1737-1744, 1982.
[10] A. Doria et al., Forwarding and Control Element Separation (ForCES) Protocol Specification, document RFC 5810, Internet Engineering Task Force, 2010.
[11] M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and S. Shenker, "Ethane: Taking control of the enterprise," in Proc. Conf. Appl. Technol. Architect. Protocols Comput. Commun., 2007, pp. 1-12.
[12] (2013). Open vSwitch. [Online]. Available: http://vswitch.org/
[13] E. Fernandes and C. Rothenberg, "OpenFlow 1.3 software switch," in Proc. Brazilian Symp. Comput. Netw. Distrib. Syst., 2014, pp. 1021-1028.
[14] (2013). Pica8 3920. [Online]. Available: http://www.pica8.org/documents/pica8-datasheet-64x10gbe-p3780-p3920.pdf
[15] Juniper Networks, Inc. (2013). Contrail Virtual Router. [Online]. Available: https://github.com/Juniper/contrail-vrouter
[16] IBM System Networking. (2013). RackSwitch G8264. [Online]. Available: http://www-03.ibm.com/systems/networking/switches/rack/g8264/
[17] B. Pfaff and B. Davie, The Open vSwitch Database Management Protocol, document RFC 7047, Internet Engineering Task Force, 2013.
[18] M. Smith, OpFlex Control Protocol, Internet Draft, Internet Engineering Task Force, 2014.
[19] T. Koponen et al., "Onix: A distributed control platform for large-scale production networks," in Proc. 9th USENIX Conf. Oper. Syst. Design Implement., 2010, pp. 1-6.
[20] A. Ferguson, A. Guha, C. Liang, R. Fonseca, and S. Krishnamurthi, "Participatory networking: An API for application control of SDNs," in Proc. ACM SIGCOMM Conf., 2013, pp. 327-338.
[21] S. Matsumoto, S. Hitz, and A. Perrig, "Fleet: Defending SDNs from malicious administrators," in Proc. 3rd Workshop Hot Topics Softw. Defined Netw., 2014, pp. 103-108.
[22] U. Krishnaswamy, ONOS: An Open Source Distributed SDN OS, 2013.
[23] M. Banikazemi, D. Olshefski, A. Shaikh, J. Tracey, and G. Wang, "Meridian: An SDN platform for cloud network services," IEEE Commun. Mag., vol. 51, no. 2, pp. 120-127, Feb. 2013.
[24] M. Raju, A. Wundsam, and M. Yu, "NOSIX: A lightweight portability layer for the SDN OS," SIGCOMM Comput. Commun. Rev., vol. 44, no. 2, pp. 28-35, 2014.
[25] D. Erickson, "The Beacon OpenFlow controller," in Proc. 2nd ACM SIGCOMM Workshop Hot Topics Softw. Defined Netw., 2013, pp. 13-18.
[26] N. Gude et al., "NOX: Towards an operating system for networks," ACM SIGCOMM Comput. Commun. Rev., vol. 38, no. 3, pp. 105-110, 2008.
[27] A. Tootoonchian, S. Gorbunov, Y. Ganjali, M. Casado, and R. Sherwood, "On controller performance in software-defined networks," in Proc. 2nd USENIX Conf. Hot Topics Manage. Internet Cloud Enterprise Netw. Services, 2012, p. 10.
[28] C. Demetrescu and G. F. Italiano, "A new approach to dynamic all pairs shortest paths," J. ACM, vol. 51, no. 6, pp. 968-992, 2004.
[29] Open Service Gateway Initiative. Accessed: Nov. 11, 2017. [Online]. Available: https://www.osgi.org/
[30] S. Shin et al., "Rosemary: A robust, secure, and high-performance network operating system," in Proc. 21st ACM Conf. Comput. Commun. Secur., Scottsdale, AZ, USA, 2014, pp. 78-89.
[31] M. Accetta et al., "Mach: A new kernel foundation for UNIX development," in Proc. USENIX Conf., 1986, pp. 93-112.
[32] A. Tavakoli, M. Casado, T. Koponen, and S. Shenker, "Applying NOX to the datacenter," in Proc. 8th ACM Workshop Hot Topics Netw. (HotNets-VIII), 2009, pp. 1-6.
[33] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat, "Hedera: Dynamic flow scheduling for data center networks," in Proc. USENIX NSDI, 2010, p. 19.
[34] R. N. Mysore et al., "PortLand: A scalable fault-tolerant layer 2 data center network fabric," ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 39-50, 2009.
[35] Z. Cai and A. Cox, "Maestro: A system for scalable OpenFlow control," Rice Univ., Houston, TX, USA, Tech. Rep., 2011.
[36] S. Kandula, S. Sengupta, A. Greenberg, P. Patel, and R. Chaiken, "The nature of data center traffic: Measurements & analysis," in Proc. IMC, 2009, pp. 202-208.
[37] T. Benson, A. Akella, and D. Maltz, "Network traffic characteristics of data centers in the wild," in Proc. IMC, 2010, pp. 267-280.
[38] J. Medved, R. Varga, A. Tkacik, and K. Gray, "OpenDaylight: Towards a model-driven SDN controller architecture," in Proc. IEEE 15th Int. Symp. World Wireless, Mobile Multimedia Netw., Jun. 2014, pp. 1-6.
[39] M. Bjorklund, YANG—A Data Modeling Language for the Network Configuration Protocol (NETCONF), document RFC 6020, Internet Engineering Task Force, 2010.
[40] A. Tootoonchian and Y. Ganjali, "HyperFlow: A distributed control plane for OpenFlow," in Proc. Internet Netw. Manage. Conf. Res. Enterprise Netw., 2010, p. 3.
[41] R. Sherwood et al., "FlowVisor: A network virtualization layer," Deutsche Telekom Inc. R&D Lab, Nicira Netw., and Stanford Univ., Stanford, CA, USA, Tech. Rep. 1, 2009.
[42] J. Stribling et al., "Flexible, wide-area storage for distributed systems with WheelFS," in Proc. 6th USENIX Symp. Netw. Syst. Design Implement. (NSDI), Boston, MA, USA, 2009, pp. 43-58.
[43] F. Botelho, A. Bessani, F. Ramos, and P. Ferreira, "SMaRtLight: A practical fault-tolerant SDN controller," Cornell Univ. Library, Netw. Internet Archit., Ithaca, NY, USA, Tech. Rep., 2014.
[44] Project Floodlight. (2012). Floodlight. [Online]. Available: http://floodlight.openflowhub.org/
[45] F. Botelho, F. M. V. Ramos, D. Kreutz, and A. Bessani, "On the feasibility of a consistent and fault-tolerant data store for SDNs," in Proc. EWSDN, 2013, pp. 38-43.
[46] P. Berde et al., "ONOS: Towards an open, distributed SDN OS," in Proc. 3rd Workshop Hot Topics Softw. Defined Netw., 2014, pp. 1-6.
[47] J. Ousterhout et al., "The case for RAMClouds: Scalable high-performance storage entirely in DRAM," SIGOPS Oper. Syst. Rev., vol. 43, no. 4, pp. 92-105, 2010.
[48] Hazelcast Project. Accessed: Oct. 22, 2017. [Online]. Available: http://www.hazelcast.org/
[49] What's Behind Network Downtime? Sunnyvale, CA, USA, 2008.
[50] A. Ferguson, A. Guha, C. Liang, R. Fonseca, and S. Krishnamurthi, "Participatory networking: An API for application control of SDNs," in Proc. ACM SIGCOMM Conf., 2013, pp. 327-338.
[51] R. Sherwood and K. Yap. Cbench: An OpenFlow Controller Benchmarker. Accessed: Dec. 2, 2017. [Online]. Available: http://www.openflow.org/wk/index.php/Oflops
[52] Nippon Telegraph and Telephone Corporation. (2012). RYU Network Operating System. [Online]. Available: http://osrg.github.com/ryu/
[53] POX. (2012). [Online]. Available: http://noxrepo.org/

DEEPTI SHRIMANKAR received the B.Tech. degree in computer technology and the M.Tech. degree in image processing from Nagpur University in 1997 and 2007, respectively, and the Ph.D. degree in parallel computing from the Visvesvaraya National Institute of Technology, Nagpur, India. She is currently an Assistant Professor with the Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology.
MANISH PALIWAL received the B.E. degree in computer science and engineering from UIT-RGPV, Bhopal, in 2012, and the M.Tech. degree from the Government College of Engineering, Amravati, in 2015. He is currently pursuing the Ph.D. degree with the Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India.

OMPRAKASH TEMBHURNE received the B.E. degree in computer engineering and the M.Tech. degree from Rastrasant Tukadoji Maharaj Nagpur University in 2009 and 2012, respectively. He is currently pursuing the Ph.D. degree with the Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur.