Elsarticle Template
Ahmadreza Montazerolghaem∗
Department of Computer Engineering, Quchan University of Technology, Quchan,
Khorasan Razavi, Iran
Abstract
Data centers are growing denser and provide various services to millions of
users through a limited collection of servers. As a result, large-scale data center
servers are threatened by the overload phenomenon. In this paper, we propose
a framework for data centers that is based on software-defined networking
(SDN) technology and, taking advantage of this technology, seeks to balance the
load between servers and prevent overloading of any given server. In addition, this
framework provides the required services quickly and with low computational
complexity. The proposed framework is implemented in a real testbed,
and a wide variety of experiments are carried out in comprehensive scenarios
to evaluate its performance. Furthermore, the framework is evaluated
with four data center architectures: Three-layer, Fat-Tree, BCube, and Dcell.
In the testbed, Open vSwitch v2.4.1 and Floodlight v1.2 are used to
implement the switches and OpenFlow controllers. The results show that in all four
SDN-based architectures, the load balance between the servers is well maintained,
and a significant improvement is achieved in parameters such as
throughput, delay, and resource consumption.
Keywords: Software-defined networking (SDN), Data center design, Load
balancing, PID controller, Throughput, Delay
∗ Corresponding author
Email address: [email protected] (Ahmadreza Montazerolghaem)
saturation of server resources, overload, and loss of service quality.
introduces the dynamic timeout for SDN-based data centers, which can assign
an appropriate timeout to various flows according to their characteristics. In [23], Hwang
et al. introduce an approach for fast failover of both the control and data planes
in SDN-based data centers. A dynamic load management method based
on SDN technology for optimizing data center link utilization by flow priority
is proposed in [24].
While much prior research has suggested the potential benefits of applying
SDN in computer networks to facilitate network management, there
have been only a few studies on practical approaches to applying SDN in
data centers. Additionally, the whole concept of SD-DC is in its
infancy, and standardization efforts in terms of frameworks, protocols, applications,
and assessment tools are still underway. Also, as discussed earlier, the
proposed ideas and related work are mostly preliminary proposals about the softwarization
of WSNs, or they focus on security and big data challenges of IoT.
Here, we pay particular attention to the management of resources as well as the QoS of
data centers. To the best of our knowledge, there are no studies concerning a
comprehensive approach for combining data center server load balancing with
a QoS mechanism from an operational point of view. Therefore, the exploration of such an
approach is timely and crucial, especially considering the rapid development of
data center applications and the emergence of SDN. In this paper, we propose an
SDN-based architecture for data center applications, so that both path and
server selection can be managed together to improve QoS for users and to
balance traffic between servers simultaneously.
According to the above studies, there are no comprehensive studies on
SDN-based data center load balancing. Therefore, in this paper, we propose an SDN-based
architecture for the data center to balance the load between servers and
prevent overloading. The proposed framework is implemented in a real
testbed, and a wide variety of experiments are carried out under various
scenarios.
1.2. Motivation
1.3. Contributions
Our study mainly aims at managing traffic through the concept of
SDN. In this regard, by applying SDN in the data center, we provide a
resource- and QoS-aware framework. In this framework, we seek to choose
the best path among the existing paths for each of the data center traffic classes
in such a way that the load balance of the data center servers is maintained and the QoS
of the traffic class is satisfied. To this end, we design a modular controller that
uses a PID system. In this regard, we propose a PID-based system for resource
management; in other words, PID is used to decide how to balance the load
between servers.
Therefore, the main innovation is the architectural design of a modular controller
based on SDN technology, built on the Proportional-Integral-Derivative (PID),
Least Connection, and Least Response Time algorithms, to achieve scalable
data centers with a higher quality of service. In this regard, the main contributions
of this paper can be summarized as follows:
- Theoretical aspect
• Designing a novel SDN-based control and management framework for the
data center (to avoid the occurrence of overload on data center servers
while increasing the QoS),
- Implementation aspect
1.4. Organization
[Figure: SDN-based data center architecture — control plane with the SDN controller and data plane with OpenFlow switches S1–S7 and servers P1–P8]
As can be seen in this architecture, there are three layers of edge, aggregation,
and core, each layer having a desired number of switches with different
capacities. In this example, there are 8 edge switches, 4 aggregation
switches, and 2 core switches.
2.2. Fat-Tree architecture
The Fat-Tree architecture is illustrated in Fig. 3. The topology of this architecture
is tree-like.
Figure 3: SDN-based data center tree architecture
The traditional architecture of this model of the data centers is shown in
Fig. 4. As can be seen, the data center consists of 4 pods, each consisting
of 8 aggregation and edge switches. A total of 4 core switches establish the
communication between the pods.
[Figure: control plane (SDN controller) and data plane (OpenFlow switches S1–S8)]
only way to make an adjustment to the policy is via changes to the configuration
of the network equipment. This has proven restrictive for network operators
who are keen to scale their networks in response to changing traffic demands
and the increasing use of mobile devices.
Figure 8: Traditional Dcell data center architecture
u(t) = K_P \, e(t) + K_I \int_{0}^{t} e(\tau)\, d\tau + K_D \frac{de(t)}{dt}    (1)
In this equation, increasing the P component (K_P) leads to a faster response
but also to overshooting and oscillation problems. Increasing the I component (K_I)
reduces stationary errors but at the expense of larger oscillations. Finally, the D
component (K_D) reduces the oscillations but may lead to a slower response.
Setting these values alone turned out to be insufficient to achieve good performance
(a stable and fast response to the final value of P + I + D); a limiting filter is therefore
applied to the output. This filter keeps the threshold load level between certain values, i.e.,
between 0.2 and 0.8 instead of the full [0, 1] range. Limiting the spectrum of feasible values
for the load threshold reduces fluctuations caused by fast switching between accepting too many
and too few services.
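To make the role of Equation 1 and the limiting filter more concrete, the following minimal Java sketch shows one way a discrete PID controller could update the load threshold and clamp it to the [0.2, 0.8] band. It is only an illustration under the assumptions stated in the comments; the gains, sampling period, and initial threshold are placeholders, not the values tuned in the testbed.

/**
 * Minimal sketch of a discrete PID controller that regulates the server load
 * threshold, following Equation (1). Gains, sampling period and the initial
 * threshold are illustrative assumptions, not the paper's tuned values.
 */
public class PidThresholdController {
    private final double kP, kI, kD;   // PID gains
    private final double dt;           // sampling period in seconds (assumed)
    private double integral = 0.0;     // accumulated integral of the error
    private double previousError = 0.0;
    private double threshold = 0.5;    // current load threshold (illustrative start value)

    public PidThresholdController(double kP, double kI, double kD, double dt) {
        this.kP = kP;
        this.kI = kI;
        this.kD = kD;
        this.dt = dt;
    }

    /**
     * @param targetLoad   desired server utilization r
     * @param measuredLoad current server utilization y
     * @return the updated load threshold, clamped to the [0.2, 0.8] band
     */
    public double update(double targetLoad, double measuredLoad) {
        double error = targetLoad - measuredLoad;           // e(t) = r - y
        integral += error * dt;                             // discrete approximation of the integral term
        double derivative = (error - previousError) / dt;   // discrete approximation of the derivative term
        previousError = error;

        // u(t) = Kp*e(t) + Ki*integral(e) + Kd*de/dt : increment/decrement of the threshold
        double u = kP * error + kI * integral + kD * derivative;

        // The limiting filter: keep the threshold inside the [0.2, 0.8] band
        threshold = Math.min(0.8, Math.max(0.2, threshold + u));
        return threshold;
    }
}

In the proposed framework, logic of this kind would sit inside the controller's PID module and would be driven by the measurements collected by the Servers Load Monitoring module described below.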
Figure 9: The proposed architecture of the SDN controller
Finally, if the server load tolerance is lower than the threshold (δ), the server
with the lowest response time is selected; otherwise, the server with the
fewest connections is selected.
The least response time method uses the response information from the Network
Statistics module to determine the server that is responding fastest at a particular
time. In the least connection method, the current request goes to the server that
is servicing the smallest number of active sessions at the current time. The reason
is that when the load tolerance is high and the servers' load is more imbalanced,
the least connection method can better detect the less loaded servers and rebalance
the servers by directing load to them, compared with the least response time
method. When the server load tolerance is low, the least response time method keeps
the server load status balanced better and faster.
In the end, depending on the selected server, the Flow Manager module
adjusts the appropriate path of flows to that server by installing the appropriate
rules on the OpenFlow switches. Algorithm 1 below shows the details of the
proposed load balancing approach.
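As a rough illustration of the selection rule just described, the following Java sketch combines the PID output with the least-connection and least-response-time policies. The Server type, its fields, and the constant DELTA (δ) are hypothetical names introduced only for this example, not the paper's implementation.

import java.util.List;

/**
 * Illustrative sketch of the server-selection step of the load balancing
 * approach. Server, its fields, and DELTA are hypothetical names used only
 * for this example.
 */
public class ServerSelector {

    /** Per-server statistics as they might be kept by the Network Statistics module. */
    public static class Server {
        final String name;
        int activeConnections;     // sessions currently being serviced
        double responseTimeMs;     // most recent measured response time

        Server(String name, int activeConnections, double responseTimeMs) {
            this.name = name;
            this.activeConnections = activeConnections;
            this.responseTimeMs = responseTimeMs;
        }
    }

    private static final double DELTA = 0.5;   // threshold δ (illustrative value)

    /**
     * @param servers       candidate servers
     * @param loadTolerance PID output u from Equation (1)
     * @return the server the next flow should be steered to
     */
    public Server select(List<Server> servers, double loadTolerance) {
        Server best = servers.get(0);
        for (Server s : servers) {
            if (loadTolerance > DELTA) {
                // high load tolerance: least-connection policy
                if (s.activeConnections < best.activeConnections) {
                    best = s;
                }
            } else {
                // low load tolerance: least-response-time policy
                if (s.responseTimeMs < best.responseTimeMs) {
                    best = s;
                }
            }
        }
        return best;
    }
}

Depending on the returned server, the Flow Manager module would then install the corresponding forwarding rules on the OpenFlow switches, as described above.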
The Flow Manager module is in charge of sending LLDP (Link Layer Discovery
Protocol) packets to all the connected switches through Packet-Out messages.
These messages instruct the switches to send LLDP packets out of all their ports.
Once a switch receives the Packet-Out message, the LLDP packets are sent out
on all the ports. If the neighbor device is an OpenFlow switch, it will perform
a flow lookup. Since the switch does not have a flow entry for this LLDP
message, it will send this packet to the controller by means of a Packet-In message.
When the controller receives the Packet-In, it analyses the packet and
creates a connection in its discovery table for the two switches. All remaining
switches in the network similarly send such packets to the controller, which
thereby builds a complete network topology. LLDP messages are periodically
exchanged, and events are reported to the controller when links go up or down,
or when links are added or removed. Information on switches and links is maintained
in the Network Statistics module. Server load information and the load level
of the servers (y) are also recorded by the Servers Load Monitoring module. Load
monitoring is defined in one place for the entire network, in the control plane
module, and is easy to scale by replicating the control plane. As mentioned, the PID
module is used to decide how to balance the load between servers. The PID
controller therefore seeks to regulate the server load (y) toward the target load (r). The
PID controller output is u; u is the increment or decrement in the actual load
threshold needed to achieve the target utilization. The u value at time t can be
obtained from Equation 1. Finally, if the server load tolerance (u) is higher than
the threshold (δ), the server with the least connections is selected; otherwise,
the server with the least response time is selected. The least connection load
balancing algorithm selects the server with the fewest active connections. Neither
round-robin nor random takes the current server load into consideration when
distributing messages, whereas the least connection algorithm does. In turn, the
least response time algorithm uses the response data from a server to determine
the server that is responding fastest at a given time.
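As a further illustration of the discovery step described in this paragraph, the hypothetical sketch below shows how an LLDP Packet-In could be turned into an entry of the controller's discovery table. None of the types are real Floodlight classes; they are placeholders introduced only for this example.

import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of the topology-discovery step. Link, LldpPacketIn and
 * the discovery table are placeholder types, not Floodlight classes.
 */
public class TopologyDiscovery {

    /** A directed link between two switch ports, as stored in the discovery table. */
    public record Link(String srcSwitch, int srcPort, String dstSwitch, int dstPort) {}

    /** Minimal stand-in for the fields the controller reads from an LLDP Packet-In. */
    public record LldpPacketIn(String senderSwitch, int senderPort,
                               String receiverSwitch, int receiverPort) {}

    private final Set<Link> discoveryTable = new HashSet<>();

    /** Called whenever a switch forwards an LLDP packet to the controller via Packet-In. */
    public void onLldpPacketIn(LldpPacketIn pktIn) {
        // The LLDP payload identifies the switch/port that emitted the packet (set by the
        // earlier Packet-Out); the Packet-In metadata identifies the switch/port that received it.
        discoveryTable.add(new Link(pktIn.senderSwitch(), pktIn.senderPort(),
                                    pktIn.receiverSwitch(), pktIn.receiverPort()));
    }

    /** Snapshot of all links learned so far. */
    public Set<Link> links() {
        return discoveryTable;
    }
}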
3. Testing and evaluating the proposed schemes
In this section, we first describe the implementation details. We then evaluate
the performance and present the results of the proposed approach in the
following subsections.
In the testbed, we employ Open vSwitch v2.4.1 and Floodlight v1.2 to implement
the OpenFlow switches and controller and modify them as described in
the previous section. Floodlight is a Java-based controller; the PID controller
is coded and tuned as a module in Floodlight in Java. Open vSwitch is
a virtual, software-based switch that supports the OpenFlow protocol. We
use SIPp to inject traffic, and OProfile to observe resource consumption.
When we need to inject background traffic between the servers, we use iperf to send
packets at a fixed rate. We run each experiment three times and report the
mean as the result.
• HP G9 DL580 server
2 http://voip-lab.um.ac.ir/index.php?lang=en
Figure 10: Implementation platform in the IP-PBX type approval laboratory of Ferdowsi
University of Mashhad
by the server), and the resources consumed by the servers are among the evaluation
criteria. The goal is to achieve the maximum throughput and the minimum
response time without overloading and with respect to the available resources.
The proposed method achieves better results than the Round-Robin and
Random methods in both scenarios (Fig. 10). In addition, the proposed method's
results are similar in both scenarios, whereas the results of Round-Robin and
Random in Scenario 2 are worse than in Scenario 1. The reason is that the
different background traffic of the servers in Scenario 2 makes the blind load
distribution of these two methods worse over time. As can be seen from the
comparison of Fig. 11a and 11b, the SDN-based method in both scenarios is
able to achieve a throughput close to the offered load, as it can estimate
the servers' load well using both the response time and the number of connections. Not
only do the Round-Robin and Random methods consume more resources
than the SDN-based method, but their average response time is also longer. The
servers' resource utilization rates in the SDN-based approach are approximately
equal, indicating a conscious and fair load distribution by this method
(Fig. 10e to 10p).
In the Round-Robin and Random methods, the unequal distribution of load over
time causes unanswered messages to accumulate in the server queue. The resulting
redundant retransmissions and manipulation of the timers increase CPU and
memory occupation and worsen the server overload. In other words, failure to
complete one task in due time affects subsequent tasks, causing an overload
of CPU and memory, and the same is true for the tasks that follow. As a result, the
server processor and memory are always busy with a large number of previous
or retransmitted messages, all because of the lack of a fair distribution.
Besides that, the flow of new requests leads to queue overflow and packet loss.
In this regard, an SDN-based framework is proposed
that uses the global view to distribute the load fairly and according to server capacity.
On the contrary, in the Random and Round-Robin approaches, the capacity of
the servers is neglected; over time, this causes the queues to become full and
the servers to become overloaded.
[Figure: Data center servers' performance and resource usage over time in the two scenarios — throughput and average response time (Offered load, SDN-based, Round-robin, Random), followed by: (e) servers' average CPU usage in Scenario 1 (SDN-based); (f) Scenario 1 (Round-robin); (g) Scenario 1 (Random); (h) servers' average CPU usage in Scenario 2 (SDN-based); (i) Scenario 2 (Round-robin); (j) Scenario 2 (Random); (k) servers' average memory utilization in Scenario 1 (SDN-based); (l) Scenario 1 (Round-robin); (m) Scenario 1 (Random); (n) servers' average memory utilization in Scenario 2 (SDN-based)]
Fig. 11 shows the performance of the proposed controller. The controller
throughput is measured as the number of serviced flows per unit time, and the average
controller response time is measured as the time between the switch sending a Packet-In
message and receiving the corresponding Flow-Mod from the controller.
As shown in Fig. 11a and 11b, the controller performance is scenario-independent,
with an average throughput of approximately 1450 fps and an
average response time of approximately 7 ms. This indicates that the controller
is able to achieve high throughput with very low delay, while its modules
do not impose additional overhead on resources. The controller consumes fewer
resources than the servers because a server is responsible for establishing and
terminating all requests, while the controller only manages a limited number of
switches. This can also be deduced from Fig. 12, which shows that the number
of packets processed per unit time by the servers is approximately 7 times that of
the controller. At least 7 messages are involved in establishing a connection, while
the process of managing the rules on the switches by the controller is initiated
with the Packet-In message and terminated with the Flow-Mod message. Therefore,
overload on the controller is much less likely than on the servers. Moreover, thanks
to the fair distribution of the load by the controller, the servers are not exposed to overload.
Another situation in which the servers may get into trouble is the sudden failure of network
components and the resulting sudden loss of capacity; such a failure may impose the load of
the failed server on the other servers. In order to test this situation, we let P1 to
P4 fail at second 80 and recover at second 160 in the first scenario, and
fail at second 240 and recover at second 320 in the second scenario.
Fig. 13 shows the performance under these conditions. In Scenario 1, the controller
is able to transfer the entire load, from second 80 to 160, to P5 to P8,
indicating the speed of the controller's action: despite the sudden failure
of P1 to P4, it still maintains the throughput of the entire system close to the offered load.
The resource consumption of P5 to P8 has also increased during this period,
and the controller's resource consumption has increased somewhat as well. Under
normal circumstances, server resource usage is almost equal and controller resource
consumption is very low.
[Figure 11: The proposed controller's performance over time in Scenarios 1 and 2 — throughput, average response time (ms), (c) the average controller CPU usage, and (d) the average controller memory usage]
In Scenario 2, the same process is repeated in seconds 240 to 320, except that the load on P1 to P4 is not
equal to that on P5 to P8; nevertheless, the controller is able to transfer the entire load to the set P5 to P8
at second 240 and to redistribute the load across all servers at second 320.
[Figure 12: Packets processed per second (PPS) by the controller and by the servers in Scenarios 1 and 2]
3.2.2. Experiment 2: Variable load
In the previous section, a constant load of 1500 requests per second (1500 rps) is injected
into the system; in this section, we evaluate the performance under a variable load. In the
previous section, the servers are not exposed to overload because their capacity exceeds
1500 rps (and thus they do not suffer a resource shortage). In this section, as the load
increases, we introduce the overload phenomenon and test the performance of the servers
and controller. The results are shown in Fig. 14. The load starts at 1500 and increases to
6000 requests per second in four steps (up to second 400). Then, as a result of a sudden
drop at second 400, it returns to 1500 requests per second. At second 500, it jumps back to
6000 requests per second, and it settles at 3000 requests per second in the last 100 seconds.
Before second 200, the server and controller throughputs are very close to the offered load.
At second 200, the servers become overloaded and remain so until second 400. During these
200 seconds, the average throughput of the servers is approximately 3000 rps, and the rest of
the offered load is rejected by the servers. The servers' overload occurs due to a lack of
resources, especially the processor (Fig. 14c and 14d).
[Figure 13: Performance under sudden server failure — throughput (cps) of servers 1 to 4 and servers 5 to 8 versus the offered load in Scenarios 1 and 2, and average CPU usage of servers 1 to 4, servers 5 to 8, and the controller in both scenarios]
The SDN-based architecture has again been able to maximize the servers' throughput. A flash crowd occurs when a very large number of UAs issue requests simultaneously; the servers' high throughput during the sudden rises and falls of the offered load indicates the stability of the system. Finally, in the last one hundred seconds, the call rejection rate is negligible, the throughput is at its maximum, and the response time is approximately 10 ms. Note that in the intervals from second 200 to 400 and from second 500 to 600, where the incoming load exceeds the network resources, a throughput close to the offered load could be achieved by increasing the servers' resources and removing their hardware limitations.
[Figure 14: Performance over time with different offered loads — (a) throughput (offered load, controller throughput, servers' throughput, rejection rate); (b) average response time (ms); (c) average CPU usage; (d) average memory usage]

3.2.3. Experiment 3: Comparison with the traditional Three-layer architecture
Table 1 compares the traditional Three-layer architecture with the SDN-based one in terms of throughput, delay, and resource consumption. As can be seen, SDN technology has been able to effectively improve the quality of service of the requests between the servers, including throughput, delay, and resource consumption.
Table 1: Comparison of traditional Three-layer and SDN architectures
3.3. Discussion
4 We want the control of the network to be centralized rather than having each device be
its own island, which greatly simplifies the network discovery, connectivity, and control issues
that have resulted from the current system. Having this overarching control actually makes
the whole network programmable instead of having to individually configure each device every
time an application is added or something moves.
controller and APIs (such as OpenFlow) are capable of L2/3/4-based policy
enforcement.
So overall, by implementing a new orchestration level, SDN can tackle the
inflexibility and complexity of the traditional network. SDN provides enterprises
with the ability to control their networks programmatically and to scale them
without affecting performance, reliability, or user experience. The data and
control-plane abstractions constitute the immense worth of SDN. By eliminating
the complexity of the infrastructure layer and adding visibility for applications
and services, SDN simplifies network management and brings virtualization to
the network. It abstracts flow control from individual devices to the network
level. Network-wide data-flow control gives administrators the power to define
network flows that meet connectivity requirements and address the specific needs
of discrete user communities.
4. Conclusion
architecture. Observations and experiments show that all four SDN-based architectures
have been able to distribute the load well using algorithms such as
the proposed method or even Round-Robin. They also score better on the evaluation
criteria than the traditional architectures, with significant improvements in
parameters such as throughput, delay, and resource consumption. One of the
future directions of this work is to present a framework that combines SDN and NFV
(Network Functions Virtualization) technologies, providing the virtual
servers in data centers as VNFs (Virtual Network Functions) and
optimizing their network communications through the SDN controller. In this case, the
energy consumption of data centers can also be optimized. Mathematical modeling
of this framework will likewise be pursued in our future work.
Acknowledgment
References
[3] Y. Zhang, L. Cui, W. Wang, Y. Zhang, A survey on software defined networking
with multiple controllers, Journal of Network and Computer Applications
103 (2018) 101–118.
Research issues and challenges, IEEE Communications Surveys & Tutorials
21 (1) (2018) 393–430.
[10] E. W. Rozier, P. Zhou, D. Divine, Building intelligence for software defined
data centers: modeling usage patterns, in: Proceedings of the 6th
International Systems and Storage Conference, ACM, 2013, p. 20.
[13] Y. Hu, C. Li, L. Liu, T. Li, Hope: Enabling efficient service orchestration
in software-defined data centers, in: Proceedings of the 2016 International
Conference on Supercomputing, ACM, 2016, p. 10.
[14] R. Touihri, S. Alwan, A. Dandoush, N. Aitsaadi, C. Veillon, Novel opti-
mized sdn routing scheme in camcube server only data center networks, in:
2019 16th IEEE Annual Consumer Communications Networking Confer-
ence (CCNC), 2019, pp. 1–2. doi:10.1109/CCNC.2019.8651677.
[17] Wei Hou, L. Shi, Yingzhe Wang, Fan Wang, Hui Lyu, M. St-Hilaire, An
improved sdn-based fabric for flexible data center networks, in: 2017 International
Conference on Computing, Networking and Communications
(ICNC), 2017, pp. 432–436. doi:10.1109/ICCNC.2017.7876167.
[18] H. Yao, W. Muqing, L. Shen, An sdn-based slow start algorithm for data
center networks, in: 2017 IEEE 2nd Information Technology, Networking,
Electronic and Automation Control Conference (ITNEC), 2017, pp. 687–691.
doi:10.1109/ITNEC.2017.8284820.
[19] Jian Di, Quanquan Ma, Design and implementation of sdn-based qos traffic
control method for electric power data center network, in: 2016 2nd IEEE
International Conference on Computer and Communications (ICCC), 2016,
pp. 2669–2672. doi:10.1109/CompComm.2016.7925182.
[21] K. Xie, X. Huang, S. Hao, M. Ma, P. Zhang, D. Hu, E3MC: Improving
energy efficiency via elastic multi-controller sdn in data center networks,
IEEE Access 4 (2016) 6780–6791. doi:10.1109/ACCESS.2016.2617871.
[23] R. Hwang, Y. Tang, Fast failover mechanism for sdn-enabled data centers,
in: 2016 International Computer Symposium (ICS), 2016, pp. 171–176.
doi:10.1109/ICS.2016.0042.
[24] U. Zakia, H. Ben Yedder, Dynamic load balancing in sdn-based data center
networks, in: 2017 8th IEEE Annual Information Technology, Electronics
and Mobile Communication Conference (IEMCON), 2017, pp. 242–247.
doi:10.1109/IEMCON.2017.8117206.
To test this architecture, in accordance with the architecture of Fig. 3, we use 10 OpenFlow switches and 8 servers and repeat the experiments of the previous section.
Experiment 1: Constant load
[Figure 15: Comparison of the data center servers' performance in the two scenarios — (a) servers' throughput in Scenario 1; (b) servers' throughput in Scenario 2; (c) servers' average response time in Scenario 1; (d) servers' average response time in Scenario 2]
Table 2: Comparison of traditional Fat-Tree architecture with SDN
As can be seen from Figs. 15 and 16, the throughput and average delay in both scenarios
have improved somewhat compared to the Three-layer architecture. The reason is that the
Fat-Tree architecture (compared to the Three-layer architecture) is divided into two general
sections: the OpenFlow switches are divided into two subsystems in this architecture, which
results in better load balancing between the 8 servers. As in the previous sections, the results
of Scenario 2 are slightly worse than those of Scenario 1 due to the unequal background
traffic of the servers.
[Figure 16: Comparison of data center servers' resource usage in the two scenarios — (a) servers' average CPU usage in Scenario 1; (b) servers' average CPU usage in Scenario 2; (c) servers' average memory usage in Scenario 1; (d) servers' average memory usage in Scenario 2]
[Figure: Fat-Tree architecture performance over time under a variable offered load — (a) throughput (offered load, controller throughput, servers' throughput, rejection rate); (b) servers' average response time; (c) average CPU usage of the controller and servers; (d) average memory usage of the controller and servers]

Table 2 compares the traditional and SDN-based Fat-Tree architectures in terms of throughput, delay, and resource consumption; as can be seen, SDN technology has been able to clearly improve the quality of service of the requests between the servers.
5.2.1. Experiment 1: Constant load
The first experiment, as described in the previous sections, consists of two
scenarios with different background traffic. In Scenario 1, each server's background
traffic is equal to 500 packets per second. However, in the second scenario,
the background traffic of the servers is not equal. The results are shown
in Fig. 18.
As shown in Fig. 18 and the figures of the previous sections, the servers are
less efficient in the BCube architecture than in the Fat-Tree architecture, but
relatively better than in the Three-layer architecture. This is also illustrated by
the servers' resource usage in Fig. 19.
As can be seen, the servers' resource usage in this architecture is slightly higher
than in the Fat-Tree architecture. This is because, in the Fat-Tree architecture,
all the components are divided into two parts and each server can be reached
with fewer hops (links), while in the BCube architecture, reaching the destination
server from the source server requires more hops.
It is worth noting that all three examined architectures distribute the load
correctly, and the behavior of the three algorithms (SDN-based, Round-Robin, and
Random) is consistent across all three architectures, with no significant
difference. For example, Scenario 2 consumes more resources than Scenario 1
(Fig. 19).
[Figure 18: Comparison of the performance of the data center servers in the two scenarios — (a) servers' throughput in Scenario 1; (b) servers' throughput in Scenario 2; (c) servers' average delay in Scenario 1; (d) servers' average delay in Scenario 2]
[Figure 19: Comparison of resource usage by the data center servers in the two scenarios — (a) average CPU usage in Scenario 1; (b) average CPU usage in Scenario 2]

5.2.2. Experiment 2: Variable load
In the previous section, a constant load of 1500 requests per second (1500 rps) is injected into the system; in this section, we evaluate the performance under a variable load. The results are shown in Fig. 20 and are approximately similar to the results obtained for the previous architectures.

[Figure 20: Performance over time and with different offered loads — (a) throughput (offered load, controller throughput, servers' throughput, rejection rate); (b) average delay; (c) average CPU usage of the controller and servers; (d) average memory usage of the controller and servers]

As is clear from Figs. 20c and 20d, the amount of resources consumed by the controller is not comparable to that of the servers, and for this reason the probability of the controller becoming a bottleneck is very low.

5.2.3. Experiment 3: Comparison with the traditional BCube architecture
Table 3 provides a comparison of the traditional BCube and SDN-based architectures in terms of throughput, delay, and resource consumption. As can be seen, SDN technology has been able to significantly improve the quality of service of server requests, including throughput, delay, and resource consumption. Up to here, in all three studied architectures, SDN technology has had a great impact on server performance.

Table 3: Comparison of traditional BCube architecture with SDN
5.3. Experimental results of Dcell architecture
To test the Dcell architecture, in accordance with the architecture of Fig. 7, we use 5 OpenFlow switches and 20 servers. This experiment is again a repetition of Experiment 1 of the previous sections and consists of two scenarios with different background traffic. In Scenario 1, the background traffic of each server is equal to 500 packets per second. In Scenario 2, the background traffic of servers 1 to 10 (P1 to P10) is 1000 packets per second, and that of servers 11 to 20 (P11 to P20) is 500 packets per second. Fig. 21 shows the performance of the SIP servers in this architecture.
[Figure 21: Server performance over time — (a) servers' throughput in Scenario 1; (b) servers' throughput in Scenario 2; (c) servers' delay in Scenario 1; (d) servers' delay in Scenario 2]
Table 4: Comparison of traditional Dcell architecture with SDN