
Attack-Resilient Software Defined Networks

Senior Year Project, NET 4901


Carleton University
April 2, 2020

Alexandre Lacasse 101001105


Caleb Zavits 101039101
Kyler Manseau 101003961
Shane Shuster 100966132
Table of Contents
Table of Contents 2
1 - Introduction 3
1.1 - Initial Proposal 3

2 - Background 4
2.1 - Software Defined Networking 4
2.2 - SDN Controllers and Dataplane 5
2.2.1 - Opendaylight 5
2.2.2 - Open Network Operating System 6
2.2.3 - Mininet 8
2.3 - Controller Clustering 8
2.4 - Controllerless Functionality 9
2.5 - Vulnerability Assessment 10
2.6 - Containerization and Kubernetes 10

3 - Planning and timeline 12

4 - Work done 13
4.1 - Hypervisors 13
4.2 - Controller Selection and Clustering 14
4.2.1 - Controller Setups 14
4.2.2 - Controller testing 15
4.3 - Controllerless Functionality 16
4.4 - Vulnerability assessment 16
4.5 - Containerization 18

5 - Analysis 19
5.1 - Final results 19
5.2 - Lessons learned 20
5.3 - Potential applications 20
5.4 - Future development 21

6 - Conclusion 21

7 - Appendices 23
7.1 - Appendix A: References 23
7.2 - Appendix B: Demonstration Video 24

1 - Introduction
Traditional networks, used by organizations and service providers, are inflexible
and are prone to making inefficient use of resources. Software Defined Networking
(SDN) brings a completely new paradigm to network architectures by being able to
dynamically alter data paths in real-time based on any software-defined criteria to make
the most efficient use of the network infrastructure. These criteria can be based on
collected metadata such as network congestion, user access control or Quality of
Service (QoS) policies, which can be taken into consideration to produce an optimal
network. The SDN architecture functions around a centralized controller making the
decisions for the entire network. This is, in essence, an abstraction of the control plane
typically present on each legacy network device, into a logically centralized point. This creates a central source of authority, but consequently also a single attack surface and point of failure. Compared with decentralized systems, a central controller makes the network particularly susceptible to targeted attacks such as Denial of Service (DoS), and the strong reliance network devices have on the controller means that a loss of reachability may cause downtime. While SDN may be a tremendously
beneficial solution, its downsides must first be addressed. This final year project
focused on finding ways to bolster a software defined network’s overall resilience to
attacks. This paper will discuss the goals the team set, some background on SDN and
the technologies used, the progress achieved, and takeaways from the results of the
experiments.

1.1 - Initial Proposal


In September, the team wrote a proposal that laid out the groundwork for the
project. In it, the team listed three main goals with a number of additional goals that
would be attempted if time permitted. The first main goal was to configure controller-less
functionality to allow the SDN-managed switches to continue forwarding traffic for a
period of time in the absence of a controller. The second main goal was to configure the
SDN controllers in a fully-active cluster. This allows the cluster to lose one or more
controllers while still maintaining full network function. The third goal set was to create a
vulnerability assessment tool to assess the security of the SDN controller solution.
There were a number of auxiliary goals that would be attempted, granted sufficient time and resources. These additional goals included the possibility of
implementing Segment Routing through IPv6 as a fallback mechanism, autonomous
dataplane node reconnection, random controller migrations across hosts, and DDoS
detection and prevention. Additionally, the team had plans to investigate the benefits of
implementing the controller in a containerized environment, thereby accessing the high
availability (HA) benefits that Kubernetes can provide. Although not all auxiliary goals
were achieved, those that were will be discussed further on.

2 - Background
2.1 - Software Defined Networking
Software defined networking is a technology in which all control plane decisions
are made by a central device, typically referred to as a controller. These decisions are
then parsed into forwarding tables, which are then pushed to all the network devices
functioning as the data plane in the network. There are two major applications for SDN: Software-Defined Datacenter (SD-DC) and Software-Defined Wide-Area Network (SD-WAN). The former is an implementation for high-capacity LANs, with a heavy focus on capacity and hypervisor integration. The latter is used to interconnect geographically separated sites. SDN is an excellent technology for centralizing and simplifying network configuration, in exchange for having a central point of failure: if the controller is taken down, data transmission can be disrupted across the entire network. The nature of SDN is such that all routing calculations are done on the controller, so if contact with the controller is lost, the data plane devices no longer have a method of updating their forwarding information.
One of the key facets of the research that was done is the concept of open
source, wherein the team would rely as much as possible on Free and Open Source
Software (FOSS). Open source SDN products are designed with the goal of being
flexible, accessible, and modular. Using open source software is a strength as it allows
for development on top of the existing code with collaboration from the community.
An emerging technology that is of interest for the project is containerization. This
is a style of virtualization that has a lower consumption of host resources compared to a
virtual machine. This is due to the fact that containers typically only virtualize a specific
application or service while relying on the external, shared kernel of the host. This
design allows containers to be much more lightweight than a virtual machine and also
makes it easier to quickly deploy the containerized applications. Containers on their own
are useful but the real benefit of containerization comes from using a container
orchestration engine like Kubernetes. Kubernetes is a container orchestration platform
that brings high availability to containers and reduces network administration workloads.
It can automatically spin up and take down containers depending on demand. It also
has features to restart and/or remove containers that are in erroneous or failure states.
This type of container orchestration engine could add additional high availability
features to the ones built into SDN controllers along with greatly reducing the
maintenance performed by network engineers.
In recent years there has been research done on improving the security features
of SDN. Of particular interest was a paper from an IEEE conference in 2017 [3] which
looked into the possibilities of isolating SDN controllers from their hosts through the use
of containerization. The research also looked into using live migration of controllers across hosts as an attack prevention mechanism without a major performance impact [2]. These implementations could be used to harden an SDN solution against the
countless threats that can be faced in the networking world.

2.2 - SDN Controllers and Dataplane
There are a few open source SDN controllers that network engineers can choose from, and OpenDaylight (ODL) is by far the most commonly used. However, a lesser-known but very powerful alternative exists: Open Network Operating System (ONOS). Both controllers are open source projects hosted by the Linux Foundation. Open source is a factor that many organizations are starting to consider: if the people designing the network have to maintain the solution anyway, saving money by using an open source product can be much more appealing than paying large amounts for proprietary products and support.
Along with being open source, both controllers are Java-based and utilize the Open Service Gateway Initiative (OSGi) framework. This framework is designed to reduce complexity by providing a modular foundation for distributed systems. The controllers themselves take advantage of this framework, and developers can also use it when writing their own applications for these SDN controllers.

2.2.1 - Opendaylight
OpenDaylight (ODL) is one of the most widely deployed open source SDN controllers and as such was the team's initial choice for an SDN solution. OpenDaylight is designed as a modular platform. A key feature of OpenDaylight is its focus on network programmability: it has been successfully deployed in many large enterprises to provide network resource optimization, network visibility and control, and native cloud support. Not only is ODL a Java-based controller, but users can also write their own applications in Java to add any SDN feature they require for their network.

Fig. 1: OpenDaylight three-tier architecture (source: https://www.howtoforge.com/tutorial/software-defined-networking-sdn-architecture-and-role-of-openflow/)

OpenDaylight is architected with a three layer design. Observing Figure 1, the light blue tier illustrates one of those layers, the southbound API layer. This layer is responsible for using APIs and protocol plugins such as OpenFlow to communicate with the dataplane devices (dark blue layer) in the software defined network. The second layer is the core services layer; this is the layer that allows a user to add applications into the controller while everything is still running. The third layer is the northbound API. This is the layer that a network operator can use to push high-level policies and then let OpenDaylight handle the fine details of pushing specific configurations to the network devices through the southbound API.

2.2.2 - Open Network Operating System


Open Network Operating System (ONOS) is another major open source SDN
controller that the team considered using. It was developed with carrier grade solutions
in mind and has a large focus on network programmability and network function
virtualization (NFV). Like OpenDaylight, ONOS is a part of the Linux Foundation but
according to their GitHub repository, “ONOS is the only SDN controller platform that supports the transition from legacy ‘brown field’ networks to SDN ‘green field’ networks” [12]. Brownfield environments are ones that are already established and cannot be redesigned from the ground up. This means that solutions need to be able to integrate with the current infrastructure rather than planning the infrastructure to support the solution. Greenfield networks, on the other hand, are ones that are just being deployed and have the flexibility to be modified to meet certain solution deployment requirements.
ONOS can be thought of as the operating system of a network. An important benefit of an operating system is that “it provides a useful and usable platform for software designed for a particular application or use case” [13]. The idea is that with this stable platform, network administrators can either install or create a network application to suit the needs of their SDN networks. On its own, ONOS has a useful API and functional CLI and GUI interfaces, but out of the box it is not much more than that; there is no logic built in. This is why the network applications are so crucial to the architecture. Any network functionality required, including L2 learning, forwarding, and traffic monitoring, is handled by the applications installed on top of the ONOS network OS. ONOS and all of its applications are Java-based, with a large community of developers.
Fig. 2: ONOS architecture (source: https://aptira.com/comparison-of-software-defined-networking-sdn-controllers-part-2-open-network-operating-system-onos/)
Similarly to OpenDaylight, ONOS can be broken down into three layers. Taking a look at Figure 2, it may seem like there are more than three layers; however, they can be broken down into the module layer, the core layer, and the application layer. The module layer allows the southbound APIs to speak different protocols and languages depending on the installed applications (the southbound layer in the figure). The core layer is responsible for keeping the state of the network operational. The third layer is the application layer.
As mentioned previously, ONOS is heavily dependent on applications to support the
large variety of network functions required in a software defined network.
In the initial research comparing OpenDaylight and ONOS, the features discussed above made ONOS stand out, but it was not until later that the team decided to continue exclusively with ONOS.

2.2.3 - Mininet
Mininet is a network emulator used to facilitate the development, testing, and demonstration of network technologies, including but not limited to OpenFlow-based SDN solutions [8]. Developers define topologies in a script that creates Open vSwitch switches, with links connecting them to each other and to emulated hosts. These switches can be connected to one or more external SDN controllers to begin forwarding traffic. Since the topology is emulated, it is still possible to interact with it using pings and packet-sniffing tools like Wireshark.
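As a simple illustration of such a script (a minimal sketch only, not part of the project deliverables; the controller address and port are placeholders), a topology with one switch and two hosts attached to an external controller could be written as follows:

#!/usr/bin/env python
# Minimal Mininet sketch: one Open vSwitch switch, two hosts, and a remote
# (external) SDN controller. The controller IP and port are placeholders.
from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.cli import CLI

def run():
    net = Mininet(controller=None, switch=OVSSwitch)
    # Point the emulated data plane at an external controller such as ONOS.
    net.addController('c0', controller=RemoteController,
                      ip='192.168.0.10', port=6633)
    s1 = net.addSwitch('s1')
    h1 = net.addHost('h1')
    h2 = net.addHost('h2')
    net.addLink(h1, s1)
    net.addLink(h2, s1)
    net.start()
    net.pingAll()   # quick reachability check through the controller
    CLI(net)        # interactive Mininet CLI for pings and inspection
    net.stop()

if __name__ == '__main__':
    run()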

2.3 - Controller Clustering


One of the most straightforward and powerful ways to mitigate the singular failure
point of a centralized design is to decentralize the weak point. Having multiple devices
acting as one logically centralized controller, with the ability to be physically and
geographically dispersed is a boon in defending against attacks. A functioning cluster of
controllers enables network traffic to continue to flow even if one or a few of the clustered controllers lose function.
When ONOS is clustered properly, the network is able to remain operational as
long as one controller remains functional. Controller clustering has certain weaknesses,
as it can be difficult to ensure all controllers maintain the same active network state.
Some other solutions operate in clusters where a single controller is actively managing
the network while the others are waiting in stand-by in case the primary fails. This isn’t
entirely desirable as when a controller transitions from standby to active it has to
connect and authenticate with all of the data plane devices as well as learn the
topology, meaning the process might not be hitless. For this reason, it is desirable to
configure a cluster of all-active controllers where any individual controller can manage
the entire network if needed at any point in time. Fully-active clustering is difficult
because each controller must have access to the same information at all times so there
are no conflicts. Maintaining consistency of network states between multiple active
controllers is one of the main challenges of controller clustering that had to be
addressed in this project.

Fig. 3: ONOS Cluster configuration (source: https://wiki.onosproject.org/pages/viewpage.action?pageId=28836788)
In order for all controllers to operate in an active state, ONOS architecturally
offloads the cluster state management to a separate entity. As of ONOS 1.14, the state
of the cluster is maintained externally by Atomix. Atomix is a Java framework that helps maintain fault tolerance and consistency for distributed resources [14]. The Atomix cluster is independent of ONOS. Figure 3, above, illustrates what happens when an ONOS
node comes online. First the ONOS node will connect to the Atomix cluster, Atomix will
notify the other ONOS nodes and then they will connect to the new node forming the
cluster. There are some appealing security benefits of using an architecture like this. It
is possible to completely isolate the Atomix cluster from all traffic except that from the
ONOS nodes. With isolated state management the attack surface of this solution can be
greatly reduced. Since Atomix is built using the RAFT consensus algorithm, with a 3
node Atomix cluster, there can be one node that fails with no effect on the state being
maintained. Independently of the Atomix cluster, the ONOS cluster can have all nodes
except one fail. This is possible because the state of the topology is being maintained
externally by Atomix and is ready to be accessed by any ONOS node.
The external state management provided by Atomix is the key to achieving the second primary goal, clustering an SDN controller in a fully-active state. It allowed the team to set up ONOS in an active cluster that can tolerate all but one of its nodes going down.
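For reference, the failure tolerance described above follows directly from the majority-quorum rule used by Raft-based systems; the following small sketch shows the arithmetic behind the claim that a three-node Atomix cluster tolerates a single node failure:

# Majority-quorum arithmetic for a Raft-based cluster of n nodes:
# a majority must stay up, so floor((n - 1) / 2) members may fail.
def tolerable_failures(n):
    return (n - 1) // 2

print(tolerable_failures(3))   # 1 -> a 3-node Atomix cluster survives one failed node
print(tolerable_failures(5))   # 2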

2.4 - Controllerless Functionality


Controllerless SDN functionality is a concept where a data plane device is capable of maintaining its forwarding information for a period of time while disconnected from its controller, allowing traffic to keep flowing while the network engineers work on addressing the issue. This is implemented in various ways, but typically entails having the data plane device continue forwarding with its current information for a duration of time. While this is occurring, it gives the administrator time to restore the functionality of the controller and its connectivity to the data plane without suffering downtime. Controllerless functionality is not without flaws: when data plane nodes are functioning without a controller, they are limited in what they can do on their own. The default behaviour of an Open vSwitch node in standalone mode is that of a standard Layer 2 switch, conducting MAC address learning. While this allows the switches to continue forwarding, they lack security, and provisioning and management can no longer be done, which can lead to disjointed networks. For this reason, controllerless functionality is not a long-term solution but rather a temporary fix to prevent downtime in urgent cases, which increases the network's resiliency to attacks.

2.5 - Vulnerability Assessment


Assessing the security vulnerabilities of a solution, system, or network is a
complicated task. The challenge grows even further when considering all the complex
components comprising a software defined network. Furthermore, automating this is
nearly impossible as there is no standard formula or set of tools involved in performing a
vulnerability assessment (VA). For the scope of this project, the focus was on
automating one of the first steps of a VA, remote enumeration.
In remote enumeration, the security analyst is gathering all outwardly-disclosed
information from networked applications from which they determine an initial list of
vulnerabilities. This frequently involves things such as port scanning, web fuzzing and
directory busting - brute force techniques used on web servers, as well as manual
reconnaissance and the use of open-source intelligence (OSINT). The information
gathered from this step is typically used to determine vulnerabilities and find attack
vectors that can be exploited, identifying issues that need to be resolved.

2.6 - Containerization and Kubernetes


Containers, popularized by Docker, are applications running on a host that are
completely isolated in virtualized environments. Containers stay isolated by making use
of Linux Namespaces to ensure each container only has access to its own filesystem,
processes and dependencies. Control Groups (cgroups) are also used to limit the
amount of hardware resources each container can consume [7]. Unlike VMs, containers
do not need their own OS which makes them much more lightweight and efficient. They
need their own libraries and dependencies which might be derived from an OS, but they
share the kernel of the host. Properties of containers give them three key advantages
over VMs: portability, security, and lightweight workloads.
Container images are compressed directories containing all the code, libraries,
and dependencies that an application needs to run [4]. Because of this, a container can
theoretically run on any host running a container runtime engine such as Docker and
always behave in a consistent manner.
Typically, each container contains a single application. Since each container is
completely isolated, any compromised or defective application will have no effect on the
host or any other container, increasing the overall security of the system.
Since containers share the kernel of the host, applications in containers require
significantly less CPU, RAM, and storage resources to operate. In fact, research shows
that when comparing a large number of instances of containers to a similar number of
VMs, the time taken for all VMs to execute a task increases exponentially but for
containers the time increases mostly linearly [10]. Figure 4 shows a comparison between execution times of a given task in multiple VMs compared to multiple containers.
Fig. 4: Time comparison in process execution when scaling VMs and containers (source: [10])

Kubernetes enables an administrator to take this concept of containers and apply it to production-grade distributed systems, and does so by abstracting away the
hardware implementation of each machine from the application. Kubernetes
deployments are organized in clusters with one or more master nodes and one or more
worker nodes. The master nodes orchestrate where Kubernetes resources are created, stored, and organized among the worker nodes. The administrator can interact with
these resources by communicating through the kube-api server on the master nodes.
The worker nodes watch for scheduling decisions made on the master nodes in case there is any change to the worker's assigned workload. They are also responsible for monitoring and maintaining the health of the pods (groups of containers) scheduled on them.
Kubernetes has a wide range of High Availability (HA) features for production
environments. Firstly, applications can be deployed with any number of replicas. When
replicas are used, the workload will automatically be load balanced across all the
instances of the application in round-robin fashion. Secondly, the kube-scheduler
process running on the master nodes automatically spreads these replicas onto as
many worker nodes as possible, ensuring maximal HA. Thirdly, the kubelet process on
each worker node constantly monitors the health of all the pods running on that node. It
will automatically delete and recreate any pod that is malfunctioning. Finally, with an
application such as Rook Ceph, the nodes in the cluster can merge their internal storage together to create a distributed storage pool. This means that workloads can
dynamically move from one worker node to another. This is possible because of the
data replication across the storage pool on the cluster nodes.
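As a small illustration of the replica mechanism described above, the sketch below uses the Kubernetes Python client to request three replicas of a containerized controller; the names and image are placeholders, and this simple Deployment is only illustrative (the team's own manifests, described in section 4.5, used stateful sets and other objects):

# Sketch: ask Kubernetes for three replicas of a containerized controller.
# Names and image are illustrative; a real ONOS deployment would also need
# config maps, services, and (as in section 4.5) stateful sets.
from kubernetes import client, config

config.load_kube_config()   # use the local kubeconfig credentials

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="onos"),
    spec=client.V1DeploymentSpec(
        replicas=3,   # kube-scheduler spreads these across the worker nodes
        selector=client.V1LabelSelector(match_labels={"app": "onos"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "onos"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(name="onos", image="onosproject/onos")]))))

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)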

3 - Planning and timeline


In the project proposal, the team presented a project plan with a table of
deliverables along with sharp deadlines to ensure continuous progress towards the
goals of the project. Figure 5 shows the timeline presented in the project proposal.
Progress had a slow start due to difficulties related to hardware and the hypervisors used, but in retrospect the team achieved the majority of the tasks set out on schedule, though some tasks were modified. For instance, the team had the opportunity to present and demonstrate its progress at General Dynamics' office, which was a welcome surprise not listed in the timeline. In terms
of objectives, the focus on creating an attack tooling suite was redirected towards
vulnerability assessment, and research on augmenting controller-less mode showed
that while feasible, time would not permit its completion. The following sections will
explain what the team accomplished and the changes made in more detail.

Fig. 5: Project timeline taken from the project proposal

4 - Work done
4.1 - Hypervisors
The original plan involved using KVM/QEMU with oVirt due to its reputation for
being a high quality open-source hypervisor platform, but the team encountered several
issues in the setup process. Because the project was focused on building an SDN solution, the hypervisor layer was less relevant, which led the team to investigate alternative hypervisors that would allow more time to be spent on the project goals rather than on the hypervisor.
Proxmox, which offers an open-source version of its product, was the second hypervisor used. While the initial configuration was much easier, issues emerged while managing VMs. Users connect to VMs using built-in open source terminal emulators, and their performance varied widely from platform to platform. Research into the supported terminal emulators showed that some performed better with various tweaks, but no solution was found that worked well on Mac, Linux, and Windows.
Finally, the team was able to procure licenses for vSphere and vCenter server
through a partnership between Algonquin College and VMware, allowing any college
student in IT to use the software for up to a year. The configuration of the vSphere
ecosystem was much quicker and VMs were much easier to manage subsequently.
Though oVirt and Proxmox are both quality hypervisors and could have adequately met the team's needs, time constraints meant that the team could not justify spending any more time troubleshooting hypervisor issues unrelated to the main goals.

4.2 - Controller Selection and Clustering

4.2.1 - Controller Setups


OpenDaylight was the team's initial choice for an open source SDN controller. Based on the initial research alone, the team had a hard time determining which controller was better suited to the primary goals, so it decided to implement both OpenDaylight and ONOS in parallel, in standalone setups, for further analysis. Following the OpenDaylight documentation, the team was able to get a standalone node communicating with a simulated mininet network. One thing noticed at this point was that not all versions of OpenDaylight have a supported Layer 2 learning application. At this stage this was not a major concern because it was something the team could implement down the line. The next step was to cluster the controller and test out the CLI and GUI interfaces. The team faced more difficulty getting a cluster set up with OpenDaylight than with ONOS. Once clustered, the team discovered that the CLI interfaces of both controllers were nearly identical and offered similar functionality; however, ONOS had a much nicer and more functional GUI. Due to the difficulty with setup and troubleshooting, the lack of L2 learning functionality out of the box, and the lackluster GUI, the team decided to continue the project with ONOS.
ONOS was initially the second-choice controller, but following the parallel setup alongside ODL there were many features the team grew to prefer. The initial setup of a standalone controller was simpler. Because ONOS is also an open source project, the documentation is fragmented across different releases. Once the exact versions of
ONOS and the packages that were needed were determined, upgrading the standalone node to a cluster was also relatively straightforward. Three VMs were initially used for Atomix,
and three separate VMs were used for each ONOS controller. It was very simple to set
up the Atomix cluster and connect them to the ONOS controllers. With the Atomix and
ONOS cluster setup, the GUI was very nice to use in troubleshooting as it updates in
real-time and gives the ability to rebalance network nodes amongst the controllers. At
the point of having Atomix and ONOS clusters up, it was decided that the team
preferred the functionality and ease of use of ONOS compared to opendaylight. The
project from this point on only utilizes ONOS.

4.2.2 - Controller testing


Fig. 6: Mininet topology

A mininet topology script was created with eight OpenFlow-enabled switches and six emulated hosts, linked as shown in Figure 6, above. Furthermore, Figure 7 shows the script used to create the topology. To run the topology while connecting it to the SDN controllers being tested, the following command was used:

sudo mn --custom topology.py --topo topology --controller=remote,ip=<controller-ip>,port=6633

Fig. 7: Mininet topology script
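The script in Figure 7 is not reproduced here; the following is a minimal sketch of what such a topology file (topology.py) could look like, assuming a simple chain of the eight switches with one host attached to each of the first six (the actual link layout follows Figure 6 and may differ):

#!/usr/bin/env python
# Sketch of a custom Mininet topology file (topology.py) with eight switches
# and six hosts. The chain layout below is illustrative; the project's actual
# links are those shown in Figure 6.
from mininet.topo import Topo

class ProjectTopo(Topo):
    def build(self):
        switches = [self.addSwitch('s%d' % (i + 1)) for i in range(8)]
        hosts = [self.addHost('h%d' % (i + 1)) for i in range(6)]
        # Chain the switches together.
        for a, b in zip(switches, switches[1:]):
            self.addLink(a, b)
        # Attach each host to one of the switches.
        for i, host in enumerate(hosts):
            self.addLink(host, switches[i])

# The key used with "--topo topology" on the mn command line above.
topos = {'topology': (lambda: ProjectTopo())}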

4.3 - Controllerless Functionality


Controllerless functionality is simply the ability for a data plane device to continue
forwarding traffic despite being unable to reach its controller. Sometimes it is referred to
as a headless mode or controllerless mode. It was achieved by modifying the behaviour
of the emulated Open vSwitch switches. This was performed by modifying the Open
vSwitch failure mode from the mininet default of “Secure” to “Standalone”. When in
secure failure mode, a data plane node will drop all flow tables and forwarding
information when losing connection to the controller. This results in no traffic being
forwarded by the device. The standalone mode allows the forwarding devices to
maintain their present knowledge, and continue to function as standard Layer 2
switches. At first this configuration change was done manually by running an OVS
command on each data plane device, which eventually evolved into embedding this
command into the mininet script that launched the topology, seen above in Figure 7.
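As an illustration of those two approaches (a sketch only; the switch names are placeholders), the fail mode can be changed either with an ovs-vsctl call per device or directly when the switches are created in the topology script:

# Sketch: two ways to put Open vSwitch bridges into standalone fail mode.
import subprocess

# 1) Manually, per data plane device (equivalent to running
#    "ovs-vsctl set-fail-mode <bridge> standalone" on each switch):
for bridge in ['s1', 's2', 's3']:
    subprocess.run(['ovs-vsctl', 'set-fail-mode', bridge, 'standalone'], check=True)

# 2) Embedded in the Mininet topology script, passing the fail mode when each
#    switch is created (Mininet forwards failMode to Open vSwitch):
#        self.addSwitch('s1', failMode='standalone')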

4.4 - Vulnerability assessment


The objectives listed in the project proposal mentioned creating an attack tooling
suite to simulate an attack on the SDN controllers, which would most likely result in
some form of DoS attack. Due to the limited resources available to the team and the
use of a virtualized environment, it was concluded that creating a successful DoS attack would be unfeasible. In its most basic form, a denial of service attack is an attrition war
between two devices, wherein the one with superior resources will defeat the other. In
this virtual environment, that is nearly impossible without shifting to a distributed denial
of service (DDoS) attack through the use of a botnet. This is exacerbated by the fact that the first component to suffer would be the vSwitch used to interconnect the devices rather than the actual SDN controller virtual machine. For this
reason, the team shifted their focus towards vulnerability assessment.
A Python script was written, functioning as a command line interface (CLI) utility, which has been codenamed, and will henceforth be referred to as, pscan. Pscan is an automated and multithreaded stager for the initial steps of remote enumeration and
vulnerability assessment. The script is broken into two key stages, service discovery
and service assessment.
Service discovery is a multithreaded port scanner which builds a queue with all of
the desired transmission control protocol (TCP) ports the user selects. The user may
select a range of ports, the most common ports, or a full scan of all 65,535 TCP ports. Worker
threads then take the port number off the queue, formulate a TCP synchronize (SYN)
packet using the indicated target’s IP address and this port and wait for the response.
The specific function used listens for a synchronize acknowledgement (SYN/ACK)
packet in response which indicates a service is listening for traffic on that port. Pscan
then responds with a TCP reset (RST) packet to prevent a TCP session from forming
and appends the port number to a list of open ports. This process is repeated by all
worker threads simultaneously to quickly progress through all of the requested ports
and assemble a list of all open ports on the target host. Finally, if selected the script will
access a python library to translate the port number to commonly-used or registered
services that run on said port. It then takes this list of open ports and forwards it for use
in service assessment.
Service assessment makes use of a powerful network mapping tool called nmap
and the nmap scripting engine (NSE). Pscan converts the previously formed list of open
ports into a comma-separated string, which is then forwarded into an nmap scan
conducting operating system (OS) scanning, version scanning, a traceroute, and a
script scan, all of which are enabled by the -A option. The results of the OS scan and
version scan, being the operating system running on the target host as well as the
specific services and their versions running on each scanned port are sent to the NSE
to run a specific script called vulscan. Vulscan is an open-source nmap script which
takes the output of a service versioning scan and checks it against a variety of Common
Vulnerability and Exposure (CVE) databases. For each running service, it returns a list
of all known vulnerabilities associated with the version of the service that’s running.
Pscan then takes this output and collects it all into a text file for easier reading.
Pscan provides the foundation for a thorough vulnerability assessment or
penetration test. By implementing a staged design, the first half of which is multithreaded, pscan enables a very quick delivery of results and simplifies the
execution of a multi-step process.
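To make the staged design more concrete, the following is a minimal sketch of how such a scanner can be structured; it is not the team's pscan source. It assumes the scapy library for crafting SYN probes, root privileges for raw sockets, an nmap installation with vulscan available at the assumed path vulscan/vulscan.nse, and a placeholder target address.

# Sketch of a pscan-style two-stage scanner (not the team's actual source).
# Requires root privileges for raw packets; assumes scapy is installed and an
# nmap install has the vulscan script at the assumed path "vulscan/vulscan.nse".
import queue
import socket
import subprocess
import threading

from scapy.all import IP, TCP, sr1, send

def syn_probe(target, port):
    """Send a TCP SYN; return True if a SYN/ACK comes back, then send a RST."""
    resp = sr1(IP(dst=target) / TCP(dport=port, flags="S"), timeout=1, verbose=0)
    if resp is not None and resp.haslayer(TCP) and resp[TCP].flags == 0x12:
        # Tear down the half-open connection so no TCP session forms.
        send(IP(dst=target) / TCP(dport=port, flags="R"), verbose=0)
        return True
    return False

def worker(target, ports, open_ports, lock):
    while True:
        try:
            port = ports.get_nowait()
        except queue.Empty:
            return
        if syn_probe(target, port):
            with lock:
                open_ports.append(port)

def discover(target, port_range, threads=50):
    """Stage 1: multithreaded service discovery over the requested ports."""
    ports = queue.Queue()
    for p in port_range:
        ports.put(p)
    open_ports, lock = [], threading.Lock()
    pool = [threading.Thread(target=worker, args=(target, ports, open_ports, lock))
            for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sorted(open_ports)

def assess(target, open_ports, outfile="pscan_report.txt"):
    """Stage 2: hand the open ports to nmap -A plus the vulscan NSE script."""
    port_str = ",".join(str(p) for p in open_ports)
    scan = subprocess.run(["nmap", "-A", "-p", port_str,
                           "--script", "vulscan/vulscan.nse", target],
                          capture_output=True, text=True)
    with open(outfile, "w") as fh:
        fh.write(scan.stdout)

if __name__ == "__main__":
    target = "192.168.56.10"                   # placeholder lab address
    found = discover(target, range(1, 1025))
    for p in found:
        try:
            service = socket.getservbyport(p)  # map port number to a service name
        except OSError:
            service = "unknown"
        print("%d/tcp open (%s)" % (p, service))
    if found:
        assess(target, found)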

4.5 - Containerization
The first step taken towards our goal of containerizing our setup was to configure
a single ONOS node in a docker container and linking it with the mininet simulated
topology. The team behind ONOS created a docker image for the ONOS controller,
making this part of our setup much easier. Similar to the setup in a VM, all that is
required for the standalone setup is to have an IP address that is reachable from the
mininet topology. Things became more complicated when setting up the Atomix and ONOS clusters. Doing this in Docker involved some research into Docker networks. When containers are created, they are assigned a random IP address in the default Docker network, which is not ideal when some containers (ONOS) need to be told how to contact other containers (Atomix). The team ended up creating a new Docker network and assigning all of the containers static IPs. This allowed the configuration files to remain relatively unchanged from the VM setup. However, since the containers are all isolated in a Docker network inside the VM running Docker, the mininet topology is unable to reach the ONOS cluster by default. Enabling IP forwarding on the Docker host, essentially turning it into a Linux router, allowed the team to define a route on the mininet box pointing all traffic destined for the custom Docker network towards the host running Docker, which then forwards it between the external network and the isolated Docker network. The team was pleased with how well ONOS functioned in a containerized environment and continued by attempting to implement the solution in Kubernetes.
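As a rough illustration of this setup (a sketch using the docker Python SDK rather than the CLI; the network name, subnet, addresses, and single-container scope are assumptions), the custom network and a statically-addressed ONOS container could be created as follows:

# Sketch of the Docker setup described above, using the docker Python SDK.
# Names, subnet, and addresses are illustrative, not the team's actual values.
import docker

client = docker.from_env()

# Custom bridge network with a known subnet so each ONOS/Atomix container
# can be pinned to a static IP referenced in the cluster config files.
ipam = docker.types.IPAMConfig(
    pool_configs=[docker.types.IPAMPool(subnet="172.28.0.0/24")])
sdn_net = client.networks.create("sdn-net", driver="bridge", ipam=ipam)

# One ONOS controller container; the full deployment ran three, plus three Atomix.
onos1 = client.containers.create("onosproject/onos", name="onos1")
sdn_net.connect(onos1, ipv4_address="172.28.0.11")   # static address inside sdn-net
onos1.start()

# The Mininet VM still needs a static route for 172.28.0.0/24 via the Docker
# host, and the host needs IP forwarding enabled, as described above.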
Pleased with how ONOS was behaving in a Docker container, the team continued to evolve the project by attempting one of the auxiliary goals: building the SDN solution in Kubernetes (commonly referred to as k8s). This began by creating four VMs to be used as k8s nodes, one master and three workers. Issues began when implementing Ceph as the distributed storage for the cluster. This distributed storage was required in order to allow the migration of ONOS controllers across k8s nodes, and there were many difficulties setting it up as it was a technology that no one on the team had prior experience with. After overcoming the challenges of distributed storage, the team began migrating the docker-compose files that had been used for the Docker setup into k8s manifests. These manifests defined the k8s objects required to host ONOS and Atomix, including stateful sets, config maps, services, and deployments. Setting up ONOS and Atomix in k8s was very similar to the Docker container setup. The fatal issue the team encountered was the inability to troubleshoot why the Atomix pods were unable to communicate with each other. The ONOS pods could communicate with each other and even with the Atomix pods, but Atomix itself never communicated properly. Since Atomix is the backbone of an ONOS cluster, the team had to halt the k8s deployment to focus on other deliverables of the project.

5 - Analysis
5.1 - Final results
In the end, three major products were produced in order to showcase all of the
improvements made through the work done in the project. The first product is the
simple, unenhanced version in which there is a standalone ONOS controller, no
controllerless functionality, and no containerization meaning all components are simply
virtual machines. If any single component of this network were to be attacked, the entire
network’s forwarding functionality would be disabled, rendering this topology very
vulnerable to attacks and failures.
The second creation is the proof-of-concept driving the entire project and
incorporates the results of all the research done. The ONOS controllers are configured
in a cluster of three, utilizing three clustered Atomix nodes to maintain network state
consistency. This configuration allows for two of the three controllers to be attacked and
removed from functionality without impacting traffic forwarding of the data plane
devices. Additionally, one of the three Atomix nodes can be eliminated without causing
any complications for the controllers. Furthermore, both of these eliminations can occur
at the same time, meaning 1 Atomix node and up to 2 controllers can fail or be attacked
without affecting network functionality and both clusters can be easily restored. The
second major upgrade in the proof-of-concept topology is controllerless mode. This was
achieved by modifying the Open vSwitch fail-mode to standalone, which when
disconnected from the controller functions as a standard Layer 2 switch. The key
advantage of this is reducing downtime, keeping the network forwarding while the
controllers are being worked on or restored. However, a disadvantage is that the dataplane devices are more vulnerable to malicious hosts entering the network during this time, until the controllers are reestablished. In essence, the second product
achieved what the team had set as primary goals at the onset of the project.
The third product created, pscan, is a vulnerability assessment script, outlined
and described before. This script automates the execution and staging of the initial
steps of remote enumeration and reconnaissance. In essence, it is a multithreaded port
scanner that then feeds into an OS and service scan which feeds those results into a
CVE database parser. The result of running pscan is a text file containing all of the
known vulnerabilities related to the specific versions of software running on the remote
host.

In the end, the team was able to achieve all of the major goals set, producing a
functioning and usable proof-of-concept, demonstrating the ideas of attack resilience
discussed in the proposal. Additionally, significant progress was made towards the
auxiliary goals set, achieving some such as containerization of the platform, and nearly
achieving others such as orchestrating containerization through Kubernetes. The team
considers the final result to generally be a success, with areas for continued growth and
expansion.

5.2 - Lessons learned


In this project, the team encountered several roadblocks and surprises, from
which many lessons were to be gleaned. Many of the tasks that were expected to be
difficult came easily, while many of the tasks that were anticipated to be simple were
very much not. From logistical complications to technical difficulties, there was growth
and education for each member of the team.
Logistical lessons were learned about the challenges and benefits of working
with and coordinating between many contributing parties. This is both applicable within
the team and with those working with the team. Further lessons were learned in balancing and meeting the expectations of the various parties involved, including grading professors and financially supporting companies.
A great many technical lessons were learned across the various components
involved in this project. Technical complications arose at every step, from setting up the
hypervisors, to implementing controllers, building the topology, all before developing the
project itself. These roadblocks allowed the team to learn about the technologies
involved and grow as both individuals and as a collective.

5.3 - Potential applications


Software defined networking has many applications in the real world, being one
of the fastest-growing methods for small businesses to access a Wide-Area Network
(WAN). While some companies, such as Nokia Nuage and Cisco Meraki, offer a
complete and serviced proprietary SDN solution, many small businesses will opt for a
do-it-yourself (DIY) SDN option, such as OpenDayLight or ONOS. In these cases, the
concept demonstrated in this project would provide utility in increasing network
resiliency to attack, thereby increasing overall availability of network resources.
For businesses like General Dynamics, which provide networks that must function in unreliable conditions, a semi-decentralized SDN could offer great utility. For disconnected, intermittent, and limited-bandwidth (DIL) environments, the combination of clustered controllers and controllerless functionality on the data plane nodes offers a
great many benefits. The logically centralized but geographically separated controllers
offer high availability and resiliency while maintaining the simplicity of configuration and management, while the controllerless functionality of the data plane nodes helps ensure
availability and function even when disconnected from the controller or in
uncompromising environments.

5.4 - Future development


If given more time, the team would continue development on a few features that
would help bolster the resilience and usability of the product. First, implementing a
container orchestration system for the systems underlying the controllers and clustering
nodes, such as Kubernetes. Second, creating a Java application to integrate changing the
Open vSwitch failure mode to standalone as an option within the ONOS graphical user
interface (GUI) found on the controller’s webpage. Finally, the team would implement
TLS-encryption for the communication between the controllers and the dataplane
nodes, as well as for the communication between the controllers and the Atomix nodes.
Doing so would help bolster the security of the entire application both against
eavesdroppers as well as malicious or rogue switches.
In addition to the intended areas of development, there are a plethora of features
that could be developed in order to continue the growth of this concept. Some examples include enhancing the standalone Open vSwitch functionality to improve security and hardening the controllers to make them less susceptible to attack. Overall, there are a great number of ways that this proof-of-concept can
be expanded to allow it to closer approach a finalized product.

6 - Conclusion
While the solutions proposed by the team are by no means silver bullets, the
team believes the results of this project demonstrate that these solutions should be an
integral part of a high-availability and security-centered design for SDN architectures.
The natural single point of failure of centralized designs can be overcome with the
versatility of SDN. Utilizing clustering to bolster controller resiliency, controllerless
functionality to bolster data plane resiliency, and running it all over a containerized
environment for further versatility, many of the weaknesses traditionally associated with
SDN can be mitigated. The team discovered that attacking the controllers in such a way as to disconnect them from the data plane was infeasible due to the fully virtualized environment and the resources available. Instead, the focus was placed on automating the initial steps of a vulnerability assessment through Python scripting.
Implementing the enhancements to this SDN solution does come with some negative aspects that must be considered, the primary one being overhead, both in resources and in protocol traffic. The resource overhead for clustering controllers
increases nearly sixfold when compared to that of a standalone controller as you need
at least three controllers to create a cluster, and those controllers then communicate
with at least three Atomix nodes, all of which need processing power, storage and RAM.
In addition to the resource usage increase, there is a protocol overhead added, as now
each controller communicates with every other controller and with every Atomix node.
This means that more protocols need to be run on each of these nodes and more traffic
is generated between them to maintain a synchronized network state.
The team would like to extend their sincerest gratitude to Dr. Richard Yu and to
General Dynamics for their support throughout the process of research and
development of this project.

7 - Appendices
7.1 - Appendix A: References
[1] A. Al-Shabibi, B. OConnor and T. Vachuska, "Building Highly-Available Distributed
SDN Applications with ONOS," 2016 46th Annual IEEE/IFIP International Conference
on Dependable Systems and Networks Workshop (DSN-W), Toulouse, 2016, pp.
266-266.
[2] A. Koshibe, “Open Network Operating System - Documentation,” Wiki, 07-Nov-2016.
[Online]. Available: https://wiki.onosproject.org/display/ONOS/ONOS+Documentation
[3] M. Azab and J. A. B. Fortes, "Towards Proactive SDN-Controller Attack and Failure Resilience," 2017 International Conference on Computing, Networking and Communications (ICNC), 2017, doi: 10.1109/iccnc.2017.7876169.
[4] C. Anderson, “Domosapiens: Kubernetes - Carson” , Vimeo, 2018. [Online].
Available: https://vimeo.com/245778144/4d1d597c5e
[5] F. Yamei, L. Qing and H. Qi, "Research and comparative analysis of performance
test on SDN controller," 2016 First IEEE International Conference on Computer
Communication and the Internet (ICCCI), Wuhan, 2016, pp. 207-210.
[6]J. M. S. Vilchez and D. E. Sarmiento, "Fault Tolerance Comparison of ONOS and
OpenDaylight SDN Controllers," 2018 4th IEEE Conference on Network
Softwarization and Workshops (NetSoft), Montreal, QC, 2018, pp. 277-282.
[7] M. Lukša, Kubernetes in Action. Shelter Island, NY: Manning Publications Co., 2018.
[8] M. Core Team and B. Lantz, “mininet/mininet - Readme.md,” GitHub, 01-Dec-2019.
[Online]. Available: https://github.com/mininet/mininet
[9] “Open vSwitch Documentation,” Open vSwitch Documentation - Open vSwitch
2.13.90 documentation, 2016. [Online]. Available:
http://docs.openvswitch.org/en/latest/
[10] Q. Zhang, L. Liu, C. Pu, Q. Dou, L. Wu, and W. Zhou, “A Comparative Study of
Containers and Virtual Machines in Big Data Environment,” 2018 IEEE 11th
International Conference on Cloud Computing (CLOUD), 2018.
[11] T. Kim, S. Choi, J. Myung and C. Lim, "Load balancing on distributed datastore in
opendaylight SDN controller cluster," 2017 IEEE Conference on Network
Softwarization (NetSoft), Bologna, 2017, pp. 1-3.
[12] Open Networking Lab, "opennetworkinglab/onos - README.md," GitHub. [Online]. Available: https://github.com/opennetworkinglab/onos/blob/master/README.md
[13] "ONOS," ONOS Project Wiki. [Online]. Available: https://wiki.onosproject.org/display/ONOS/ONOS
[14] Baeldung, "Introduction to Atomix," 12-Feb-2020. [Online]. Available: https://www.baeldung.com/atomix

7.2 - Appendix B: Demonstration Video
A video was prepared to demonstrate the results of this project, and is accessible at the
following address: https://youtu.be/QUYkeWNPUgo

