Minimizing End-to-End Delay in High-Speed Networks With A Simple Coordinated Schedule
I. INTRODUCTION
The provision of end-to-end delay guarantees in high-speed
networks remains one of the most important and widely studied Quality-of-Service (QoS) issues. Many real-time audio and
video applications rely on the ability of the network to provide small delays. One key mechanism for achieving this aim
is scheduling at the outputs of the switches. In this paper, we
attempt to minimize end-to-end delay using a novel scheduling
scheme.
Before we introduce our scheme we first recall the delay
bounds for the much studied Weighted Fair Queueing (WFQ)
scheduling discipline, also known as Packet-by-Packet Generalized Processor-Sharing (PGPS). In their seminal papers [1], [2],
Parekh and Gallager showed that WFQ achieves the following
session-i delay bound for Rate Proportional Processor Sharing
(RPPS).1
(1)
[...] and the single-hop delay is essentially $L_i/\rho_i$. Moreover, it is possible to construct an example in which this bound is achieved,
since a packet can wait for time $L_i/\rho_i$ at every hop. This illustrates
our earlier observation.
In this paper, we demonstrate with both analysis and simulation that even for small burst sizes, a bound of $\frac{L_i}{\rho_i} \times K_i$ is not
necessary, i.e. the $K$-hop delay does not have to be $K$ times the
1-hop delay. Indeed, in the case of uniform packet sizes, uniform service rates and small burst sizes, [3] showed that each
session $i$ can achieve a delay bound,2
$$O\left(\frac{1}{\rho_i} + K_i\right).$$
[...]
Here, $n$ is the number of servers in the network and $\rho_{\min}$ is the
minimum session rate.
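To make the gap between the multiplicative and additive forms concrete, the following is a minimal sketch that evaluates both shapes for a few hop counts. It assumes unit packet size and takes the constant hidden in the $O(\cdot)$ notation to be 1; the function names are illustrative and do not come from the paper.

```python
def multiplicative_bound(rate: float, hops: int, packet_size: float = 1.0) -> float:
    """Shape of the WFQ-style bound: a wait of packet_size / rate at every hop."""
    return (packet_size / rate) * hops


def additive_bound(rate: float, hops: int) -> float:
    """Shape of the additive bound 1/rate + hops (hidden constant assumed to be 1)."""
    return 1.0 / rate + hops


if __name__ == "__main__":
    rate = 0.25  # hypothetical session rate, in packets per unit time
    for hops in (1, 5, 20, 50):
        print(hops, multiplicative_bound(rate, hops), additive_bound(rate, hops))
```

The two expressions coincide only for very short sessions; the difference grows linearly in the hop count.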
Our Results. In Section III we generalize the above simple protocol to accommodate arbitrary packet sizes and arbitrary server
2 The bound $O\left(\frac{1}{\rho_i} + K_i\right)$ is best possible up to a constant factor. To see
this, under non-cut-through service all sessions must suffer delay $K_i$. Moreover,
examples can be constructed in which some sessions must suffer delay $1/\rho_i$.
[Figure: plot against session length (0 to 50); caption not recoverable.]
(2)
The parameter $\varepsilon$ is the server utilization factor defined later. The
logarithmic term, although small, is somewhat involved. We
give the full definition later. In Section IV we provide simulation
results to compare the actual performance of our protocol and
WFQ.
The basic ideas of our protocol are an earliest-deadline-first
approach coupled with randomization and coordination. We assign a deadline for every server through which a packet passes.
By introducing some randomness, the deadlines can be sufficiently spread out so that all the packets can meet all their
deadlines. By introducing simple coordination among the deadlines, we can ensure that once a packet has passed through
its first server, it can pass through all its subsequent servers
quickly. We refer to our protocol as COORDINATED-EARLIEST-DEADLINE-FIRST
(CEDF).
The Earliest-Deadline-First (EDF) scheduling
discipline when applied to a single server has received much attention. For example, Ferrari and Verma [4] and Verma, Zhang
and Ferrari [5] showed that it can provide delay bounds and
delay-jitter bounds.
Georgiadis, Guérin and Parekh [6] and
Liebeherr, Wrege and Ferrari [7] proved that EDF is delay-optimal in the sense that if a set of delay bounds is achievable
then it can be achieved by EDF. Necessary and sufficient conditions for a set of delay bounds to be achievable were given.
Liebeherr et al. also presented schemes with low implementation complexity that approximate EDF [7], [8]. For networks,
Georgiadis, Guérin, Peris and Sivarajan [9] showed that EDF
can be sub-optimal. Nevertheless they proved that if the traffic
is correctly reshaped after each node then EDF can outperform
Weighted Fair Queueing. However, the best explicit bound on
end-to-end delay given in [9] is the same as Equation (1). General techniques for calculating end-to-end delay bounds were
obtained by Goyal, Lam and Vin [10] and Goyal and Vin [11].
A number of papers have simulated end-to-end delay performance.
Simulation results for EDF are presented in [4],
[5]. Clark, Shenker and Zhang [12] used simulation to compare WFQ with variants of FIFO. Yates, Kurose, Towsley and
Hluchyj [13] examined end-to-end delay distributions for WFQ,
FIFO and Golestani's Stop-and-Go Fair Queueing [14], [15].
They found that the analytic delay bounds can be too pessimistic.
Grossglauser and Keshav [16] showed that FIFO
can outperform the Weighted Round Robin (WRR) and Round
Robin (RR) disciplines for CBR traffic.
A. Overview
The basic idea of COORDINATED-EDF is very simple. For
each packet $p$, we assign deadlines $D_1, D_2, \ldots, D_K$ for every
server, $m_1, m_2, \ldots, m_K$, through which $p$ passes. The deadlines
at a server $m$ are defined using a parameter $G_m$, where $G_m$ is
essentially $\frac{L_{\max}}{r_m}\log(\cdot)$. (We define the logarithmic term in $G_m$
later.) In particular, $D_1$ is rand $+\ G_{m_1}$ time after $p$'s injection,
where rand is a random number chosen from an appropriate
range. Each subsequent deadline $D_{k+1}$ is $D_k + G_{m_{k+1}}$. CEDF
gives priority to the packet with the earliest deadline if more
than one packet is waiting for a server. Ties are broken arbitrarily.
Note that randomness is only added to the first deadline
of each packet. This randomness has the important effect of
spreading out the deadlines. If rand is chosen from a large
enough range, i.e. proportional to $L_i/\rho_i$ for session $i$, then deadlines from different sessions do not cluster together. In this way,
packets do not compete for the same server simultaneously, and
hence all packets are able to meet all their deadlines.
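The deadline rule described above can be sketched as follows. This is only an illustration: the text says rand is drawn from "an appropriate range", so the uniform draw on `[0, rand_range]` and the parameter names `gaps` and `rand_range` are assumptions made here, with `gaps[k]` standing in for $G_m$ at the packet's $(k+1)$-th server.

```python
import random
from typing import List, Optional


def cedf_deadlines(injection_time: float,
                   gaps: List[float],
                   rand_range: float,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return the deadlines D_1..D_K of one packet.

    Randomness enters only the first deadline; every later deadline is the
    previous one plus the gap of the next server on the path.
    """
    if not gaps:
        raise ValueError("a packet must pass through at least one server")
    rng = rng or random.Random()
    offset = rng.uniform(0.0, rand_range)            # assumed distribution of rand
    deadlines = [injection_time + offset + gaps[0]]  # D_1
    for gap in gaps[1:]:
        deadlines.append(deadlines[-1] + gap)        # D_{k+1} = D_k + G at next server
    return deadlines
```

For example, `cedf_deadlines(10.0, [0.5, 0.25, 0.25], rand_range=4.0)` returns three deadlines whose spacing after the first is fixed by the gaps alone.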
The $G_m$'s provide coordination among the deadlines. We
point out that the values of the $G_m$'s are usually small, especially in high-speed networks where the server rates $r_m$ are
large. This means that once a packet passes through its first
server, it passes through all its subsequent servers quickly. As
an analogy to our strategy, consider the traffic lights on an avenue in Manhattan. If a car is stopped at a red light then once
that light turns green, many of the subsequent lights turn green
also. In other words, the coordination of the lights means that
once the car has passed through one light, it can quickly travel
through many lights in succession.
We emphasize that the $G_m$'s are independent of the session
rates. Under CEDF, session-$i$ packets do not accumulate a delay
of $L_i/\rho_i$ for each server that they pass through. Hence, CEDF does
not have a multiplicative
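Parameters
$G_m$ = [...] $\log_e(\cdot)$
Tokens for session $i$:
$\tau_1,\ \tau_2,\ \ldots,\ \tau_{M/T_i}$
$\tau_1 + M,\ \tau_2 + M,\ \ldots,\ \tau_{M/T_i} + M$
$\tau_1 + 2M,\ \tau_2 + 2M,\ \ldots,\ \tau_{M/T_i} + 2M$
$\tau_1 + 3M,\ \tau_2 + 3M,\ \ldots,\ \tau_{M/T_i} + 3M$
...
Deadlines
$D_1$ = [...] $+\ G_{m_1}$
$D_j = D_{j-1} + G_{m_j}$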
Now that all deadlines are defined, each server gives priority
to the packet that has the earliest deadline.
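The earliest-deadline-first rule at a single server can be sketched with a binary heap. The class name `EDFServer` and its methods are hypothetical; the insertion counter is just one way of breaking ties, which the text leaves arbitrary.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class EDFServer:
    """Queue that always serves the waiting packet with the earliest deadline."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, Any]] = []
        # Insertion counter: breaks deadline ties without comparing packets.
        self._counter = itertools.count()

    def enqueue(self, deadline: float, packet: Any) -> None:
        heapq.heappush(self._heap, (deadline, next(self._counter), packet))

    def serve_next(self) -> Optional[Any]:
        """Remove and return the earliest-deadline packet, or None if idle."""
        if not self._heap:
            return None
        _, _, packet = heapq.heappop(self._heap)
        return packet
```

A heap keeps both operations logarithmic in the queue length, which matters little for a sketch but mirrors how a priority queue would normally be kept.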
[...] $/\rho_i$ $+ \sum_{k=1}^{K_i}$ [...] $\log_e(\cdot)$
Remarks
1. The only coordination required comes from the above iterative definition of the deadlines.
This coordination can be
achieved simply by stamping each packet with its current deadline.3 Each server can then update the deadlines of its pending
3 This can be done using techniques similar to the protocols of [26].
We note that the bound for CEDF does not contain Ki.
IV. SIMULATION RESULTS
Our experiments simulate a simple situation with uniform
packet sizes and uniform server rates. Since CEDF involves
many parameters, we simulate a simplified version, SIMPLE-CEDF (S-CEDF), which nevertheless contains the essence of CEDF. Under S-CEDF, the deadline for the first server is chosen randomly
(without reference to periodic tokens). Every subsequent deadline is the deadline for the previous server incremented by one
packet service time. (See Figure 11.) As we shall see, the performance of S-CEDF corresponds to the analytical bounds of
Section III.
[Figure: WFQ; curves labelled by session rate, plotted against session length.]
$p$: a session-$i$ packet
[...]: injection time of $p$
$D_k$: deadline of $p$ at its $k$th hop
Fig. 12. Session 0 is the long session with 5 hops. Sessions 1 through 5 are the
1-hop sessions.
We first use a deterministic injection model that conforms
to the $(\sigma, \rho)$ traffic model with $\sigma = 1$ for each session. Figures 3, 4, 5 and 6 illustrate the end-to-end delay experienced by
the long session. We note the striking resemblance between the
curves for these actual delays and the curves for the analytical bounds.
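As an illustration of the injection model, the sketch below checks whether a list of injection times conforms to a burst-plus-rate constraint. It assumes the usual reading of that model, namely that the traffic injected in any time window of length $t$ is at most $\sigma + \rho t$, and it assumes unit-size packets by default. The function name `conforms` and the numerical tolerance are choices made here, not taken from the paper.

```python
from typing import Sequence


def conforms(injection_times: Sequence[float], sigma: float, rho: float,
             packet_size: float = 1.0) -> bool:
    """Check that every window of injections respects sigma + rho * length.

    It is enough to test windows that start and end at injection times,
    since any other window holds no more packets than such a window that
    is no longer.
    """
    times = sorted(injection_times)
    for i in range(len(times)):
        for j in range(i, len(times)):
            injected = (j - i + 1) * packet_size
            allowed = sigma + rho * (times[j] - times[i])
            if injected > allowed + 1e-9:  # small tolerance for float rounding
                return False
    return True
```

With a burst of 1 and unit packets, the check reduces to requiring consecutive injections to be at least 1/rate apart, which is what a deterministic periodic source produces.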
[Figure: Simple-CEDF; curves labelled by session rate, plotted against session length.]
[Table: row and column headers garbled in source; one row label reads S-CEDF.]
1.0    1.0    1.0    1.0    1.4    1.5    1.8    2.2
1.06   1.16   1.26   1.3    1.42   1.53   1.79   2.15
[...] similar experiments using the FIFO discipline [...] the delays produced by FIFO are close [...]
i.e. the delays can be approximated by a [...]
(We omit the plots here.)
[Figure: Simple-CEDF and WFQ; curves for rates 0.03, 0.05, 0.1 and 0.15, plotted against length of network.]
[...] scheduling discipline with end-to-end delay bound,
[...] $+ \sum_{k=1}^{K_i}$ [...]
[Figure: panels including Simple-CEDF; curves labelled by session rate, plotted against session length.]
Weighted Fair Queueing discipline. We have also presented simulation results to show that the performance of CEDF and WFQ
can be comparable to the analytical bounds.
The major open problem is to reduce the delay bound still further. The ultimate goal is a simple protocol with a delay bound,
ACKNOWLEDGMENTS
[1] A. K. Parekh and R. G. Gallager. A generalized processor sharing approach to flow control in integrated services networks: The single-node case. IEEE/ACM Transactions on Networking, 1(3):344-357, 1993.
[2] A. K. Parekh and R. G. Gallager. A generalized processor sharing approach to flow control in integrated services networks: The multiple-node case. IEEE/ACM Transactions on Networking, 2(2):137-150, 1994.
[3] M. Andrews, A. Fernández, M. Harchol-Balter, T. Leighton, and L. Zhang. Dynamic packet routing with per-packet delay guarantees of O(distance + 1/session rate). In Proceedings of the 38th Annual Symposium on Foundations of Computer Science, pages 294-302, Miami Beach, FL, October 1997.
[4] D. Ferrari and D. Verma. A scheme for real-time channel establishment in wide-area networks. IEEE Journal on Selected Areas in Communications, 8(3):368-379, April 1990.
[...] Workshop on Network and Operating System Support for Digital Audio and Video, pages 287-298, Durham, NH, April 1995.
Let
186, 1993.
APPENDIX
PROOF OF THEOREM
loge
Recall
-s3(l-c)rw
that
Gm
G_/(48.L~.X)
(:2)(
(n:z)
<
1 Psuc.
($:.)3(1-)48
of the protocol, to
H
c/2)rmGm.
l~.uc
May 1997.
[21] S. Keshav. An Engineering Approach to Computer Networking. Addison-Wesley, Reading, MA, 1997.
[22] H. Zhang. Service disciplines for guaranteed performance service in packet-switching networks. In Proceedings of the IEEE, October 1995.
[23] R. L. Cruz. A calculus for network delay, Part I: Network elements in isolation. IEEE Transactions on Information Theory, pages 114-131, 1991.
[24] R. L. Cruz. A calculus for network delay, Part II: Network analysis. IEEE Transactions on Information Theory, pages 132-141, 1991.
[25] A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing algorithm. Journal of Internetworking: Research and Experience, 1:3-26, 1990.
[26] A. Banerjea, D. Ferrari, B. Mah, M. Moran, D. Verma, and H. Zhang. The Tenet real-time protocol suite: Design, implementation, and experiences. IEEE/ACM Transactions on Networking, 4(1):1-11, February 1996.
[27] D. Stiliadis. Traffic scheduling in packet-switched networks: analysis, design and implementation. PhD thesis, UCSC, 1996.
to)~i.
i(Si
- .Li).
Hence,
Since the token placement is periodic with period $M$, we only
need to consider a fixed time period of length $M$. For each
server $m$, only $M/T_i$ intervals $I = [t - G_m, t]$ can have $t$ as
a deadline for a session-$i$ packet in that time period. There are $n$
servers in the network. Hence, the total number of such intervals
$I$ is,
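Reading the count off the preceding sentences gives the following sketch. It assumes the total is taken over every session $i$ and every one of the $n$ servers, so it is an upper bound whenever a session does not visit all servers: each server contributes at most $M/T_i$ intervals per session, hence at most

$$n \sum_{i} \frac{M}{T_i}$$

intervals in total.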