Understanding TCP Vegas: Theory and Practice

Steven Low, University of Melbourne
Larry Peterson and Limin Wang, Princeton University

TR 616-00, February 8, 2000
Abstract
This paper presents a model of the TCP Vegas congestion control mechanism as a distributed optimization algorithm. Doing so has three important benefits. First, it helps us gain a fundamental understanding of why TCP Vegas works, and an appreciation of its limitations. Second, it allows us to prove that Vegas stabilizes at a weighted proportionally fair allocation of network capacity when there is sufficient buffering in the network. Third, it suggests how we might use explicit feedback to allow each Vegas source to determine the optimal sending rate when there is insufficient buffering in the network. In addition to presenting the model and exploring these three issues, the paper presents simulation results that validate our conclusions.
1 Introduction
TCP Vegas was introduced in 1994 as an alternative source-based congestion control mechanism for the Internet [10]. In contrast to the TCP Reno algorithm, which induces congestion to learn the available network capacity, a Vegas source anticipates the onset of congestion by monitoring the difference between the rate it is expecting to see and the rate it is actually realizing. Vegas' strategy is to adjust the source's sending rate (congestion window) in an attempt to keep a small number of packets buffered in the routers along the transmission path. Although experimental results presented in [6] and [1] show that TCP Vegas achieves better throughput and fewer losses than TCP Reno under many scenarios, at least two concerns remained: is Vegas stable, and if so, does it stabilize to a fair distribution of resources; and does Vegas result in persistent congestion. These concerns are particularly significant in view of evidence that Reno's linear-increase, multiplicative-decrease algorithm stabilizes around a fair allocation to all connections [23, 12, 13]. In short, Vegas has lacked a theoretical explanation of why it works. This paper addresses this shortcoming by presenting a model of Vegas as a distributed optimization algorithm. Specifically, we show that the global objective of Vegas is to maximize the aggregate utility of all sources (subject to the capacity constraints of the network's resources), and that the sources solve the dual of this maximization problem by implementing an approximate gradient projection algorithm. This model
implies that Vegas stabilizes at a weighted proportionally fair allocation of network capacity when there is sufficient buffering in the network, that is, when the network has enough buffers to accommodate the extra packet(s) the algorithm strives to keep in the network. If sufficient buffers are not available, equilibrium cannot be reached, and Vegas reverts to Reno. Our analysis shows that Vegas does have the potential to induce persistent queues (up to the point that Reno-like behavior kicks in), but that by augmenting Vegas with explicit feedback, for example in the form of the recently proposed ECN bit [22], it is possible to avoid this problem. Explicit feedback serves to decouple the buffer process from the feedback required by each Vegas source to determine its optimal sending rate. The paper concludes by presenting simulation results that both serve to validate the model and to illustrate the impact of this explicit feedback mechanism. Models of Vegas are also analyzed in [5, 18] using a different framework.
2 A Model of Vegas
This section presents a model of Vegas and shows that 1) the objective of Vegas is to maximize aggregate source utility subject to capacity constraints of network resources, and 2) the Vegas algorithm is a dual method to solve this maximization problem. The primary goal of this effort is to better understand Vegas' stability, loss and fairness properties, which we discuss in Section 3.
2.1 Preliminaries
A network of routers is modeled by a set $L$ of unidirectional links of capacity $c_l$, $l \in L$. It is shared by a set $S$ of sources. A source $s$ traverses a subset $L(s) \subseteq L$ of links to the destination, and attains a utility $U_s(x_s)$ when it transmits at rate $x_s$ (e.g., in packets per second). Let $d_s$ be the round trip propagation delay for source $s$. For each link $l$ let $S(l)$ be the set of sources that use link $l$. By definition $l \in L(s)$ if and only if $s \in S(l)$.

According to one interpretation of Vegas, a source monitors the difference between its expected rate and its actual rate, and increments or decrements its window by one in the next round trip time according to whether the difference is less or greater than a parameter $\alpha_s$.¹ If the difference is zero, the window size is unchanged. We model this by a synchronous discrete time system. Let $w_s(t)$ be the window of source $s$ at time $t$ and let $D_s(t)$ be the associated round trip time (propagation plus queueing delay). Note that $D_s(t)$ depends not only on source $s$'s own window but also on those of all other sources, possibly even those sources that do not share a link with $s$. We model the change in window size by one packet per round trip time in the actual implementation with a change of $1/D_s(t)$ per discrete time. Thus, source $s$ adjusts its window according to:
¹The actual algorithm in [6] tries to keep this difference between two thresholds $\alpha_s$ and $\beta_s$, with $\alpha_s < \beta_s$, to reduce oscillation. Our model assumes $\alpha_s = \beta_s$. It is simpler and captures the essence of Vegas.
Vegas Algorithm:

$$
w_s(t+1) \;=\;
\begin{cases}
w_s(t) + \dfrac{1}{D_s(t)} & \text{if } \dfrac{w_s(t)}{d_s} - \dfrac{w_s(t)}{D_s(t)} < \alpha_s \\[4pt]
w_s(t) - \dfrac{1}{D_s(t)} & \text{if } \dfrac{w_s(t)}{d_s} - \dfrac{w_s(t)}{D_s(t)} > \alpha_s \\[4pt]
w_s(t) & \text{else}
\end{cases}
\tag{1}
$$
In the original paper [6], $w_s(t)/d_s$ is referred to as the Expected rate, $w_s(t)/D_s(t)$ as the Actual rate, and their difference as DIFF. The actual implementation estimates the round trip propagation delay $d_s$ by the minimum round trip time observed so far. The unit of $\alpha_s$ is, say, KB/s. We will explain the significance of $\alpha_s$ on fairness in Section 3. When the algorithm converges, the equilibrium windows $w_s^*$ and the equilibrium round trip times $D_s^*$ satisfy

$$\frac{w_s^*}{d_s} - \frac{w_s^*}{D_s^*} \;=\; \alpha_s \quad \text{for all } s \tag{2}$$
Let $x_s(t) = w_s(t)/D_s(t)$ denote the bandwidth realized by source $s$ at time $t$. The window size minus the bandwidth-delay product, $w_s(t) - x_s(t) d_s$, equals the total backlog buffered in the path of $s$. Hence, multiplying the conditional in (1) by $d_s$, we see that a source increments or decrements its window according to whether the total backlog is smaller or larger than $\alpha_s d_s$. This is a second interpretation of Vegas.
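To make the update rule concrete, the following Python sketch implements one step of (1). It is our own illustration under the model's assumptions (a single threshold $\alpha_s = \beta_s$); the variable names are ours, and the actual Vegas code [6] works in bytes and compares DIFF against two thresholds.

```python
def vegas_update(w, base_rtt, rtt, alpha):
    """One discrete-time Vegas window update, per equation (1).

    w        -- current window w_s(t), in packets
    base_rtt -- propagation delay estimate d_s, in seconds
    rtt      -- measured round trip time D_s(t), in seconds
    alpha    -- target alpha_s, in packets per second
    """
    expected = w / base_rtt   # Expected rate w_s(t)/d_s
    actual = w / rtt          # Actual rate w_s(t)/D_s(t) = x_s(t)
    diff = expected - actual  # DIFF
    if diff < alpha:          # backlog below alpha_s*d_s: speed up
        return w + 1.0 / rtt  # +1/D_s(t) per discrete time step
    if diff > alpha:          # backlog above alpha_s*d_s: slow down
        return w - 1.0 / rtt
    return w
```

Multiplying `diff` by `base_rtt` gives the estimated backlog $w_s(t) - x_s(t) d_s$, so the same test equivalently compares the packets queued in the path against $\alpha_s d_s$, which is exactly the second interpretation above.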
This interpretation suggests that Vegas sources have

$$U_s(x_s) \;=\; \alpha_s d_s \log x_s \tag{3}$$

as their utility functions. Moreover the objective of Vegas is to choose source rates $x = (x_s, s \in S)$ so as to

$$\max_{x \ge 0} \; \sum_s U_s(x_s) \tag{4}$$

subject to

$$\sum_{s \in S(l)} x_s \;\le\; c_l \quad \text{for all } l \in L \tag{5}$$
Constraint (5) says that the aggregate source rate at any link $l$ does not exceed the capacity. We will refer to (4)-(5) as the primal problem. A rate vector $x$ that satisfies the constraints is called feasible, and a feasible $x$ that maximizes (4) is called primal optimal (or socially optimal, or simply optimal). A unique optimal rate vector exists since the objective function is strictly concave, and hence continuous, and the feasible solution set is compact. The following theorem clarifies the objective of Vegas. It was first proved in [19].
Theorem 1 Let $w^* = (w_s^*, s \in S)$ be the equilibrium windows of Vegas and $D^* = (D_s^*, s \in S)$ the associated equilibrium round trip times. Then the equilibrium source rates $x^* = (x_s^*, s \in S)$ defined by $x_s^* = w_s^*/D_s^*$ is the unique optimal solution of (4)-(5).
Proof. By the Karush-Kuhn-Tucker theorem, a feasible source rate vector $x^* \ge 0$ is optimal if and only if there exists a vector $p^* = (p_l^*, l \in L) \ge 0$ such that, for all $s$,

$$U_s'(x_s^*) \;=\; \frac{\alpha_s d_s}{x_s^*} \;=\; \sum_{l \in L(s)} p_l^* \tag{6}$$
and, for all $l$, $p_l^* = 0$ if the aggregate source rate at link $l$ is strictly less than the capacity (complementary slackness). We now prove that the equilibrium backlogs at the links provide such a vector $p^*$, and hence the equilibrium rates $x^*$ are optimal.
Let $b_l^*$ be the equilibrium backlog at link $l$. The fraction of $b_l^*$ that belongs to source $s$ under the first-in first-out service discipline is $(x_s^*/c_l)\, b_l^*$, where $c_l$ is the link capacity. Hence source $s$ maintains a backlog of $\sum_{l \in L(s)} (x_s^*/c_l)\, b_l^*$ in its path in equilibrium. Since the window size equals the bandwidth-delay product plus the total backlog in the path, we have

$$w_s^* \;=\; x_s^* d_s + \sum_{l \in L(s)} \frac{x_s^*}{c_l}\, b_l^* \tag{7}$$

Combining (2) and (7) we obtain

$$\alpha_s d_s \;=\; w_s^* - x_s^* d_s \;=\; \sum_{l \in L(s)} \frac{x_s^*}{c_l}\, b_l^*$$

where the last equality follows from (7). This yields (6) upon identifying $p_l^* = b_l^*/c_l$ and rearranging terms. Clearly, $x^*$ must be feasible since otherwise the backlog would grow without bound, contradicting (7). Since the equilibrium backlog $b_l^* = 0$ at a link $l$ if the aggregate source rate there is strictly less than the capacity, the complementary slackness condition is also satisfied.
Associated with each link $l$ is a dual variable $p_l$. The dual problem of (4)-(5) is to choose the dual vector $p = (p_l, l \in L)$ so as to [4, 16]:

$$\min_{p \ge 0} \; D(p) \;=\; \sum_l p_l c_l + \sum_s B_s(p^s) \tag{8}$$

where

$$B_s(p^s) \;=\; \max_{x_s \ge 0} \; U_s(x_s) - x_s p^s \tag{9}$$

$$p^s \;=\; \sum_{l \in L(s)} p_l \tag{10}$$
If we interpret the dual variable $p_l$ as the price per unit bandwidth at link $l$, then $p^s$ in (10) is the price per unit bandwidth in the path of $s$. Hence $x_s p^s$ in (9) represents the bandwidth cost to source $s$ when it transmits at rate $x_s$, $U_s(x_s) - x_s p^s$ is the net benefit of transmitting at rate $x_s$, and $B_s(p^s)$ represents the maximum benefit $s$ can achieve at the given (scalar) price $p^s$. A vector $p$ that minimizes the dual problem (8) is called dual optimal. Given a vector price $p$ or a scalar price $p^s$, we will abuse notation and denote the unique maximizer in (9) by $x_s(p)$ or by $x_s(p^s)$. A feasible rate vector is called individually optimal (with respect to $p$) when each individual rate $x_s(p^s)$ maximizes (9).

There are two important points to note. First, given scalar prices $p^s$, each source $s$ can easily solve (9) to obtain its individually optimal source rate without having to coordinate with any other sources; see (12) below. Second, by duality theory, there exists a dual optimal price $p^*$ such that these individually optimal rates are also socially optimal, that is, solve (4)-(5) as well. Furthermore, as we will see below, solution of the dual problem can be distributed to individual links and sources. Hence a better alternative to solving the primal problem (4)-(5) directly is to solve its dual (8) instead.

In the rest of the paper we will refer to $p_l$ as the link price, $p^s$ as the path price (of source $s$), and the vector $p$ simply as the price. It can be interpreted in two ways. First, the price is a congestion measure at the links: the larger the link price $p_l$, the more severe the congestion at link $l$. The path price $p^s$ is thus a congestion measure of source $s$'s path. Indeed, in the special case of Vegas with its particular utility function, the link price turns out to be the queueing delay at link $l$; see Section 3. Second, an optimal $p^*$ is a shadow price (Lagrange multiplier) associated with the constrained maximization (4)-(5); i.e., $p_l^*$ is the marginal increment in aggregate utility for a marginal increment in link $l$'s capacity $c_l$. We emphasize however that $p$ may be unrelated to the actual charge users pay. If sources are indeed charged according to these prices, then $p^*$ aligns individual optimality with social optimality, thus providing the right incentive for sources to choose the optimal rates.

A scaled gradient projection algorithm to solve the dual problem takes the following form [16]. In each iteration $t$, each link $l$ individually updates its own price $p_l(t)$ based on the aggregate rate at link $l$, and each source $s$ individually adjusts its rate based on its path price $p^s(t)$. Specifically, let $x_s(t) = x_s(p^s(t))$ denote the unique source rate that maximizes (9)-(10) with $p^s$ replaced by $p^s(t)$, and $x^l(t) = \sum_{s \in S(l)} x_s(t)$ denote the aggregate source rate at link $l$. Then link $l$ computes $p_l(t+1)$ according to:
$$p_l(t+1) \;=\; \left[\, p_l(t) + \gamma \kappa_l \left( x^l(t) - c_l \right) \right]^+ \tag{11}$$

where $\gamma > 0$ and $\kappa_l > 0$ are constants and $[z]^+ = \max\{z, 0\}$. Here $x^l(t)$ represents the demand for bandwidth at link $l$ and $c_l$ represents the supply. The price is adjusted according to the law of demand and supply: if demand exceeds the supply, raise the price; otherwise reduce it.
Let $p^s(t) = \sum_{l \in L(s)} p_l(t)$ denote the path price at time $t$. Then source $s$ sets its rate to the unique maximizer of (9)-(10), given by setting the derivative of $U_s(x_s) - x_s p^s(t)$ to zero:

$$x_s(t) \;=\; \frac{\alpha_s d_s}{p^s(t)} \tag{12}$$
This is referred to as the demand function in economics: the higher the path price (i.e., the more congested the path), the lower the source rate. The following result says that the scaled gradient projection algorithm defined by (11)-(12) converges to yield the unique optimal source rates. It is a minor modification of Theorem 1 of [16]; indeed the convergence proof in [2] for a (different) scaled gradient projection algorithm applies directly here.

Theorem 2 Provided that the stepsize $\gamma$ is sufficiently small, then starting from any initial rates $x(0)$ and prices $p(0)$, every limit point of the sequence $(x(t), p(t))$ generated by algorithm (11)-(12) is primal-dual optimal.
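As an illustration of Theorem 2, the following self-contained Python sketch iterates (11) and (12) on a small, hypothetical two-link, three-source network. The topology, capacities, weights and stepsize are our own choices for illustration, not values from the paper.

```python
import numpy as np

# Routing matrix: R[l][s] = 1 if source s uses link l (hypothetical).
R = np.array([[1, 1, 0],
              [1, 0, 1]])
c = np.array([100.0, 80.0])          # capacities c_l, packets/s
alpha_d = np.array([2.0, 2.0, 2.0])  # weights alpha_s * d_s, packets

gamma = 0.01       # stepsize gamma; small, per Theorem 2
kappa = 1.0 / c    # scaling factors kappa_l
p = np.ones(2)     # initial link prices

for t in range(20000):
    ps = R.T @ p                    # path prices p^s, eq. (10)
    x = alpha_d / ps                # demand function, eq. (12)
    xl = R @ x                      # aggregate rate x^l at each link
    p = np.maximum(p + gamma * kappa * (xl - c), 0.0)  # eq. (11)

print("rates:", x)    # approaches the primal optimum of (4)-(5)
print("prices:", p)   # approaches the dual optimum of (8)
```

With the log utility (3), the maximizer of (9) is exactly (12). Running the loop, the prices settle where both links are fully utilized, and source 0, which consumes capacity at two links, receives less than sources 1 and 2, anticipating the remark about long connections in Section 3.2.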
We now show that Vegas approximates this dual algorithm. The buffer process at link $l$ evolves according to

$$b_l(t+1) \;=\; \left[\, b_l(t) + x^l(t) - c_l \right]^+$$

That is, the backlog in the next period is either zero or equals the current backlog plus the total input less the total output in the current period. Dividing both sides by $c_l$ we have

$$\frac{b_l(t+1)}{c_l} \;=\; \left[\, \frac{b_l(t)}{c_l} + \frac{1}{c_l}\left( x^l(t) - c_l \right) \right]^+ \tag{13}$$

Identifying $p_l(t) = b_l(t)/c_l$ and the scaling factor $\kappa_l = 1/c_l$, we see that (13) is the same as (11) with stepsize $\gamma = 1$, except that the source rates in Vegas are updated slightly differently from (12). Recall from (1) that the Vegas algorithm updates the window based on whether
$$\frac{w_s(t)}{d_s} - x_s(t) \;<\; \alpha_s \quad \text{or} \quad \frac{w_s(t)}{d_s} - x_s(t) \;>\; \alpha_s \tag{14}$$
²This is an approximation which holds in equilibrium when the buffer stabilizes; see [15] for a more accurate model of the buffer process.
As for (7), this quantity is related to the backlog, and hence the prices, in the path:

$$\frac{w_s(t)}{d_s} - x_s(t) \;=\; \frac{x_s(t)}{d_s}\left( D_s(t) - d_s \right) \;=\; \frac{x_s(t)}{d_s} \sum_{l \in L(s)} \frac{b_l(t)}{c_l} \;=\; \frac{x_s(t)\, p^s(t)}{d_s} \tag{15}$$

Hence the comparison in (14) becomes

$$x_s(t) \;<\; \frac{\alpha_s d_s}{p^s(t)} \quad \text{or} \quad x_s(t) \;>\; \frac{\alpha_s d_s}{p^s(t)}$$
Hence, a Vegas source compares the current source rate $x_s(t)$ with the target rate $\alpha_s d_s / p^s(t)$. The window is incremented or decremented by $1/D_s(t)$ in the next period according as the current source rate is smaller or greater than the target rate. In contrast, the algorithm (12) sets the rate directly to the target rate.

The sufficient condition in Theorem 2 requires that the stepsize $\gamma$ be sufficiently small to guarantee convergence. Vegas assumes that $\gamma = 1$; see (13). We now describe a way to reintroduce $\gamma$ into the Vegas algorithm, which can then be adjusted to ensure convergence. Multiplying both sides of (13) by $\gamma$ and identifying $p_l(t) = \gamma\, b_l(t)/c_l$, we obtain

$$p_l(t+1) \;=\; \left[\, p_l(t) + \frac{\gamma}{c_l}\left( x^l(t) - c_l \right) \right]^+$$
that is, the prices are updated with a stepsize $\gamma$ that is not necessarily one. This implies a multiplication of both sides of the first equality of (15) by $\gamma$, and hence the comparison in (14) becomes:

$$\frac{w_s(t)}{d_s} - x_s(t) \;<\; \frac{\alpha_s}{\gamma} \quad \text{or} \quad \frac{w_s(t)}{d_s} - x_s(t) \;>\; \frac{\alpha_s}{\gamma}$$
This amounts to using an $\alpha_s$ that is $1/\gamma$ times larger, i.e., using a unit of 10 KB/s (say) instead of 1 KB/s for $\alpha_s$.³ Note that $\gamma$ (or the unit of $\alpha_s$) should be the same at all sources. A smaller $\gamma$ ensures convergence of source rates, albeit more slowly, but it leads to a larger backlog, since the backlog at link $l$ is $b_l = c_l p_l / \gamma$ for a given equilibrium price $p_l$. This dilemma can be overcome by introducing marking to decouple the buffer process from price computation; see Section 5. Finally, we mention in passing that the Vegas algorithm can also be regarded as a Lagrangian method [4, Chapter 4] where the primal variable $x(t)$ and dual variable $p(t)$ are iterated together to solve the Karush-Kuhn-Tucker condition and the feasibility condition.
³Using a smaller link capacity, say, 1 Mbps instead of 10 Mbps, has the same effect.
3 Implications

3.1 Delay

The Vegas algorithm admits three interpretations. The first, from Section 2.1, is that a source monitors the difference between its expected rate and its actual rate: a difference that is too small indicates that spare capacity is available, and one that is too far indicates congestion. The second interpretation is that a Vegas source adjusts its rate so as to maintain between $\alpha_s$ (typically 1) and $\beta_s$ (typically 3) packets buffered in its path, so as to take advantage of extra capacity when it becomes available. The optimization model suggests a third interpretation: Vegas measures congestion at a link by its queueing delay, and that of a path by the end-to-end queueing delay (without propagation delay). A Vegas source computes the queueing delay from the round trip time and the estimated propagation delay, and attempts to set its rate to be proportional to the ratio of propagation to queueing delay, the proportionality constant being between $\alpha_s$ and $\beta_s$.

We now elaborate on this third interpretation. The dynamics of the buffer process at link $l$ implies the important relation (comparing (11) and (13)):

$$p_l(t) \;=\; \frac{b_l(t)}{c_l}$$
It says that the link price $p_l(t)$ is just the queueing delay at link $l$ faced by a packet arriving at time $t$. Moreover, the difference $D_s(t) - d_s$ between the round trip time and the propagation delay is the path price $p^s(t)$, the congestion signal a source needs to adjust its rate. Let $q_l = b_l^*/c_l$ denote the equilibrium queueing delay at link $l$ and $q^s = \sum_{l \in L(s)} q_l = D_s^* - d_s$ be the end-to-end queueing delay in source $s$'s path. Then (12) implies that, since $p^s = q^s$ in equilibrium, a Vegas source sets its target rate to be proportional to the ratio of propagation to queueing delay:

$$x_s^* \;=\; \alpha_s\, \frac{d_s}{q^s} \tag{16}$$
As the number of sources increases, individual source rates necessarily decrease. The relation (16) then implies that queueing delay must increase with the number of sources. This is just a restatement that every source attempts to keep some extra packets buffered in its path. It also follows from (16) that in equilibrium the bandwidth-queueing-delay product of a source equals the extra packets buffered in its path:

$$x_s^*\, q^s \;=\; \alpha_s d_s \tag{17}$$

This is just Little's Law in queueing theory when propagation delay is ignored.
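For a concrete instance of (16) and (17) (the numbers here are ours, purely illustrative), consider a single bottleneck link, roughly a 10 Mbps link with 1 KB packets, shared by two sources with a common $\alpha_s$:

```python
# Worked example of (16)-(17): single bottleneck of c = 1250 pkts/s,
# two sources with alpha = 2 pkts/s and delays d1 = 10 ms, d2 = 30 ms.
c = 1250.0
alpha, d1, d2 = 2.0, 0.01, 0.03

# Both sources see the same queueing delay q (the path price), and (16)
# gives x_s = alpha * d_s / q; the capacity constraint x1 + x2 = c then
# pins down q:
q = alpha * (d1 + d2) / c      # 6.4e-5 s of queueing delay
x1 = alpha * d1 / q            # 312.5 pkts/s
x2 = alpha * d2 / q            # 937.5 pkts/s

# (17): each source keeps alpha*d_s extra packets queued in its path.
assert abs(x1 * q - alpha * d1) < 1e-9
assert abs(x2 * q - alpha * d2) < 1e-9
print(q, x1, x2, x1 / x2)      # rates proportional to propagation delays
```

Note that the rates come out proportional to the propagation delays, which previews the WPF behavior discussed next.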
3.2 Fairness
Although we did not recognize it at the time, there are two equally valid implementations of Vegas, each springing from a different interpretation of an ambiguity in the algorithm. The first, which corresponds to the actual code, defines the $\alpha$ and $\beta$ parameters in terms of bytes (or packets) per round trip time, while the second, which corresponds to the prose in [6], defines $\alpha$ and $\beta$ in terms of bytes (or packets) per second. These two implementations have an obvious impact on fairness: the first penalizes sources with a large propagation delay, while the second favors such sources. In terms of our model, Theorem 1 implies that the equilibrium rates $x^*$ are weighted proportionally fair [11, 12]: for any other feasible rate vector $x$, we have

$$\sum_s \alpha_s d_s\, \frac{x_s - x_s^*}{x_s^*} \;\le\; 0$$
The first implementation makes $\alpha_s$ inversely proportional to the source's propagation delay, so that the products $\alpha_s d_s$ are identical across sources; the second makes $\alpha_s$ identical, $\alpha_s = \alpha$ for all sources and some constant $\alpha$. These two implementations lead to different fairness in equilibrium.

When the $\alpha_s d_s$ (in units of, say, packets) are the same for all sources, the utility functions (3) are identical for all sources, and the equilibrium rates are proportionally fair and independent of propagation delays. All sources with the same path price receive the same rate, for example, in a single-link network. In a network with multiple congested links, however, a source that traverses more links, not merely one having a higher propagation delay, will be discriminated against. This is because, for each marginal increment in aggregate utility (the objective of the primal problem (4)-(5)), such a long connection consumes more resources than a short one that uses fewer links; see [16, Section V]. We call this implementation proportionally fair (PF).

When the $\alpha_s$ are identical, sources have different utility functions, and the equilibrium rates are weighted proportionally fair, with weights proportional to sources' propagation delays. (17) implies that if two sources $r$ and $s$ face the same path price (or equivalently, the same end-to-end queueing delay), then their equilibrium rates are proportional to their propagation delays:

$$\frac{x_r^*}{x_s^*} \;=\; \frac{d_r}{d_s}$$
In particular, if there is only a single congested link in the network, then a source that has twice the propagation delay will receive twice the bandwidth. In a network with multiple congested links, weighting the utility by propagation delay has a balancing effect against the discrimination of long connections, if the propagation delay is proportional to the number of congested links in a source's path. We call the second implementation weighted proportionally fair (WPF).

It is argued in [13, Remark 2] that TCP Reno can be roughly modeled as maximizing problem (4)-(5) with utility functions (ignoring random loss) $U_s(x_s) = -1/(d_s^2 x_s)$. Hence in equilibrium the source rates satisfy $x_s^* = 1/(d_s \sqrt{p^s})$. If two sources $r$ and $s$ see the same path price (e.g., in a single-link network), then their rates are inversely proportional to their propagation delays:

$$\frac{x_r^*}{x_s^*} \;=\; \frac{d_s}{d_r}$$
That is, a source with twice the propagation delay receives half as much bandwidth. This discrimination against connections with large propagation delays is well known in the literature, e.g., [7, 9, 14, 17, 5].
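A quick single-link comparison of the three fairness outcomes just described (the capacity and delays are our own illustrative numbers): PF splits capacity equally, WPF splits it in proportion to $d_s$, and the Reno model splits it in inverse proportion to $d_s$.

```python
# Single congested link of capacity c shared by two sources with
# propagation delays d1, d2 (illustrative numbers).
c, d1, d2 = 1000.0, 0.01, 0.02

# PF (alpha_s*d_s identical): rates independent of propagation delay.
pf = (c / 2, c / 2)

# WPF (alpha_s identical): rates proportional to propagation delay.
wpf = (c * d1 / (d1 + d2), c * d2 / (d1 + d2))

# Reno model U_s = -1/(d_s^2 x_s): x_s = 1/(d_s*sqrt(p)), so rates are
# inversely proportional to propagation delay.
reno = (c * d2 / (d1 + d2), c * d1 / (d1 + d2))

print(pf, wpf, reno)   # (500, 500), (333, 667), (667, 333) pkts/s
```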
3.3 Loss
Provided that the buffers at links are large enough to accommodate the equilibrium backlog $b_l^*$, a Vegas source will not suffer any loss in equilibrium, since the aggregate source rate is no more than the link capacity anywhere in the network (feasibility condition (5)). This is in contrast to TCP Reno, which constantly probes the network for spare capacity by linearly increasing its window until packets are lost, upon which the window is multiplicatively decreased. Thus, by carefully extracting congestion information from observed round trip times and intelligently reacting to it, Vegas avoids the perpetual cycle of sinking into congestion and recovering from it. This is confirmed by the experimental results of [6] and [1]. As observed in [6] and [5], if the buffers are not sufficiently large, equilibrium cannot be reached, loss cannot be avoided, and Vegas reverts to Reno. This is because, in attempting to reach equilibrium, Vegas sources all attempt to place $\alpha_s d_s$ packets in their paths, overflowing the buffers in the network. The minimum buffering needed in the entire network for equilibrium to exist is $\sum_s \alpha_s d_s$.
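A back-of-the-envelope sketch of this minimum buffering requirement, with numbers of our own choosing:

```python
# Each source contributes alpha_s * d_s packets of equilibrium backlog
# (eq. (17)), so the network needs at least sum_s alpha_s*d_s packets of
# buffering to reach equilibrium without loss. Illustrative numbers:
alpha = 200.0                    # packets/s, same for all sources (WPF)
delays = [0.015, 0.020, 0.030]   # propagation delays d_s, seconds
backlogs = [alpha * d for d in delays]
print(backlogs, sum(backlogs))   # [3.0, 4.0, 6.0] -> 13 packets total
```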
4 Persistent Congestion
This section examines the phenomenon of persistent congestion, a consequence both of Vegas' exploitation of the buffer process for price computation and of its need to estimate propagation delay. The next section explains how this can be overcome by Random Early Marking (REM), in the form of the recently proposed ECN bit [8, 22].
The following result says that the estimation error effectively changes the utility function: source $s$ appears to have the utility (cf. (3))

$$\tilde U_s(x_s) \;=\; \alpha_s \hat d_s \log x_s + (\hat d_s - d_s)\, x_s \tag{18}$$

where $\hat d_s$ denotes source $s$'s estimate of its round trip propagation delay $d_s$. The objective of Vegas accordingly becomes to choose source rates so as to

$$\max_{x \ge 0} \; \sum_s \tilde U_s(x_s) \tag{19}$$

subject to

$$\sum_{s \in S(l)} x_s \;\le\; c_l \quad \text{for all } l \in L \tag{20}$$
Theorem 3 Let $w^* = (w_s^*, s \in S)$ be the equilibrium windows of Vegas and $D^* = (D_s^*, s \in S)$ the associated equilibrium round trip times. Then the equilibrium source rates $x^* = (x_s^*, s \in S)$ defined by $x_s^* = w_s^*/D_s^*$ is the unique optimal solution of (19)-(20).

Proof. The argument follows the proof of Theorem 1, except that (6) is replaced by

$$\tilde U_s'(x_s^*) \;=\; \frac{\alpha_s \hat d_s}{x_s^*} + (\hat d_s - d_s) \;=\; \sum_{l \in L(s)} p_l^* \tag{21}$$

To show that the equilibrium backlogs at the links provide such a vector $p^*$, and hence that the equilibrium rates are optimal, substitute the estimated propagation delay $\hat d_s$ for the true value $d_s$ in (2) to get

$$\frac{w_s^*}{\hat d_s} - \frac{w_s^*}{D_s^*} \;=\; \alpha_s$$

Using $w_s^* = x_s^* D_s^*$ and $D_s^* = d_s + \sum_{l \in L(s)} b_l^*/c_l$, we thus have

$$\alpha_s \hat d_s \;=\; x_s^* \left( D_s^* - \hat d_s \right) \;=\; x_s^* (d_s - \hat d_s) + x_s^* \sum_{l \in L(s)} \frac{b_l^*}{c_l}$$

This yields (21) upon identifying $p_l^* = b_l^*/c_l$ and rearranging terms. As in the proof of Theorem 1, $x^*$ must be feasible and the complementary slackness condition must be satisfied. Hence the proof is complete.
The significance of Theorem 3 is twofold. First, it implies that incorrect propagation delay estimates do not upset the stability of the Vegas algorithm: the rates simply converge to a different equilibrium that optimizes (19)-(20). Second, it allows us to compute the new equilibrium rates, and hence assess the fairness, when we know the relative error in propagation delay estimation; it provides a qualitative assessment of the effect of estimation error when such knowledge is not available. For example, suppose sources $r$ and $s$ see the same path price $p$. From (21), their equilibrium rates satisfy

$$\frac{x_r^*}{x_s^*} \;=\; \frac{\alpha_r \hat d_r}{\alpha_s \hat d_s} \cdot \frac{p - (\hat d_s - d_s)}{p - (\hat d_r - d_r)} \tag{22}$$

If there is zero estimation error, then their equilibrium rates are proportional to their weights $\alpha_r d_r$ and $\alpha_s d_s$.
Hence, a large positive error $\hat d_s - d_s$ generally leads to a higher equilibrium rate for source $s$, to the detriment of other sources. For the PF implementation, where the $\alpha_s d_s$ are identical, if sources have identical absolute error $\hat d_s - d_s = e$, then source rates are proportional to $\hat d_s / d_s$. Although Vegas can be stable in the presence of error in propagation delay estimation, the error may cause two problems. First, overestimation increases the equilibrium source rate. This pushes up prices and hence buffer backlogs, leading to persistent congestion. Second, the error distorts the utility function of the source, leading to an unfair network equilibrium in favor of newer sources.
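To see the size of the bias, here is a small numeric instance of (21)-(22); the numbers are our own. Two sources share the same path price and have the same $\alpha$ and true delay, but the second source overestimates its propagation delay (as a late starter whose baseRTT includes queueing delay would):

```python
# Two sources, same path price p, same alpha and true delay d; source 2
# overestimates its propagation delay by e. Illustrative numbers.
alpha, d, e, p = 2.0, 0.02, 0.005, 0.01

d1_hat, d2_hat = d, d + e      # source 1 estimates correctly

# From (21): alpha*d_hat/x + (d_hat - d) = p
#        =>  x = alpha*d_hat / (p - (d_hat - d))
x1 = alpha * d1_hat / (p - (d1_hat - d))   # 4.0
x2 = alpha * d2_hat / (p - (d2_hat - d))   # 10.0

print(x1, x2, x2 / x1)   # the overestimating source gets 2.5x the rate
```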
4.3 Remarks
Note that we did not see persistent congestion in our original simulations of Vegas. This is most likely due to three factors. One is that Vegas reverts to Reno-like behavior when there is insufficient buffer capacity in the network. The second is that our simulations did not take the possibility of route changes into consideration, but on the other hand, evidence suggests that route changes are not likely to be a problem in practice [21]. The third is that the situation of connections starting up serially is pathological. In practice, connections continually come and go, meaning that all sources are likely to measure a baseRTT that represents the propagation delay plus the average queueing delay. Indeed, if two sources $r$ and $s$ see the same price, then they have the same end-to-end queueing delay (because the path price equals the end-to-end queueing delay). If the error in round trip time estimation is entirely due to the (average) queueing delay $\theta$, then $\hat d_s = d_s + \theta$ for both sources. For the PF implementation, (22) then implies that their rates are proportional to $(d_s + \theta)/d_s$; i.e., instead of equally sharing the bandwidth, the source with a smaller propagation delay will be favored. In a high speed network where $\theta$ is small, this distortion is small.
5 Random Early Marking

Persistent congestion can be avoided if the computation of prices is decoupled from the buffer process: each link explicitly feeds back its congestion measure by marking packets, and each source estimates its path price from the fraction of its packets that are marked and adjusts its rate accordingly. We now summarize REM in the context of Vegas; see [3] for its derivation and an evaluation of its stability, fairness and robustness through extensive simulations.

Each link $l$ updates a link price $p_l(t)$ in period $t$ based on the aggregate input rate $x^l(t)$ and the buffer occupancy $b_l(t)$ at link $l$:
$$p_l(t+1) \;=\; \left[\, p_l(t) + \gamma \left( \mu_l\, b_l(t) + x^l(t) - c_l \right) \right]^+ \tag{23}$$

where $\gamma > 0$ is a small constant and $\mu_l > 0$. The parameter $\gamma$ controls the rate of convergence, and $\mu_l$ trades off link utilization and average backlog. Hence $p_l(t)$ is increased when the backlog or the aggregate input rate at link $l$ is large compared with its capacity $c_l$, and is reduced otherwise. Note that the algorithm does not require per-flow information. Link $l$ marks each packet arriving in period $t$ that is not already marked at an upstream link, with a probability $m_l(t)$ that is exponentially increasing in the congestion measure:

$$m_l(t) \;=\; 1 - \phi^{-p_l(t)} \tag{24}$$
where $\phi > 1$ is a constant. Once a packet is marked, its mark is carried to the destination and then conveyed back to the source via acknowledgement. The exponential form is critical for a multilink network, because the end-to-end probability that a packet of source $s$ is marked after traversing its set $L(s)$ of links is then

$$m^s(t) \;=\; 1 - \prod_{l \in L(s)} \left( 1 - m_l(t) \right) \;=\; 1 - \phi^{-p^s(t)} \tag{25}$$
where $p^s(t) = \sum_{l \in L(s)} p_l(t)$ is the path price. The end-to-end marking probability is high when $p^s(t)$ is large. Source $s$ estimates this end-to-end marking probability by the fraction $\hat m^s(t)$ of its packets marked in period $t$, and estimates the path price $p^s(t)$ by inverting (25):

$$\hat p^s(t) \;=\; -\log_\phi \left( 1 - \hat m^s(t) \right)$$

where $\log_\phi$ is the logarithm to base $\phi$. It then adjusts its rate using the marginal utility (cf. (12)):
$$x_s(t+1) \;=\; \frac{\alpha_s d_s}{\hat p^s(t)} \tag{26}$$
Hence the source algorithm (26) says: if the path is congested (the fraction of marked packets is large), transmit at a small rate, and vice versa. In practice a source may adjust its rate more gradually by incrementing it slightly if the current rate is less than the target (the right hand side of (26)), and decrementing it slightly otherwise, in the spirit of the original Vegas algorithm (1):

Vegas with REM:

$$
w_s(t+1) \;=\;
\begin{cases}
w_s(t) + \dfrac{1}{D_s(t)} & \text{if } x_s(t) < \dfrac{\alpha_s d_s}{\hat p^s(t)} \\[4pt]
w_s(t) - \dfrac{1}{D_s(t)} & \text{if } x_s(t) > \dfrac{\alpha_s d_s}{\hat p^s(t)} \\[4pt]
w_s(t) & \text{else}
\end{cases}
$$
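A minimal Python sketch of the REM feedback loop (23)-(26) follows; the parameter values are our own choices for illustration, not the settings evaluated in [3].

```python
import math
import random

# Illustrative parameter choices (ours, not from [3]):
phi = 1.1      # base of the marking curve in (24)
gamma = 0.005  # price stepsize in (23)
mu = 0.1       # backlog weight mu_l in (23)

def link_price_update(p, b, xl, c):
    """Eq. (23): raise the price when backlog b or input rate xl is high."""
    return max(p + gamma * (mu * b + xl - c), 0.0)

def mark(p):
    """Eq. (24): mark an unmarked packet with probability 1 - phi**(-p)."""
    return random.random() < 1.0 - phi ** (-p)

def estimate_path_price(marked_fraction):
    """Invert (25): p_hat = -log_phi(1 - fraction of marked packets)."""
    return -math.log(max(1.0 - marked_fraction, 1e-12)) / math.log(phi)

def target_rate(alpha, d, p_hat):
    """Eq. (26): the source's target rate alpha_s * d_s / p_hat."""
    return alpha * d / max(p_hat, 1e-12)
```

Because per-link marking composes multiplicatively, $1 - \prod_l (1 - m_l(t)) = 1 - \phi^{-\sum_l p_l(t)}$, the fraction of marks a source observes depends only on its path price; that is what makes the inversion in `estimate_path_price` well defined over multiple links.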
The choice of parameters is discussed in [3], which also shows that REM is very robust to parameter settings.
As argued in [3], the price adjustment (23) leads to a small backlog ($b_l^* \approx 0$) and high utilization ($x^{l*} \approx c_l$) in equilibrium at bottleneck links $l$, regardless of the equilibrium price $p_l^*$. Hence high utilization is achieved not by maintaining a large backlog, but by feeding back accurate congestion information for sources to set their rates. This is confirmed by the simulation results in the next section.
6 Evaluation
This section presents three sets of simulation results. The first set shows that the source rates converge quickly under Vegas to the theoretical equilibrium, thus validating our model. The second set illustrates the phenomenon of persistent congestion discussed in Section 4. The third set shows that the source rates (windows) under Vegas+REM behave similarly to those under plain Vegas, but the buffer occupancy stays low. We use the ns-2 network simulator [20] configured with the topology shown in Figure 1. Each host on the left runs an FTP application that transfers a large file to its counterpart on the right. We use a packet size of 1 KB. The various simulations presented in this section use different latency and bandwidth parameters, as described below.
[Figure 1: Simulation topology. Hosts 1a-5a connect through Router1 and Router2 to hosts 1b-5b.]
The reported baseRTT includes both the round trip latency and transmit time.
6.1 Stability

[Figure 2: five panels plotting the theoretical optimal rate against the actual sending rate (KB/s) over time for Hosts 1a-5a under the PF implementation.]
Figure 2: Stability (PF): sending rates of the five connections, showing how well the model predicts the simulation.
Host                          1a            2a            3a            4a            5a
                              M      S      M      S      M      S      M      S      M      S
baseRTT (ms)                  15.34  15.34  15.34  15.34  20.34  20.34  30.34  30.34  40.34  40.34
RTT w/ queueing (ms)          17     17.1   17     17.1   22     21.9   32     31.9   42     41.9
Sending rate (KB/s)           1200   1205   1200   1183   1200   1228   1200   1247   1200   1161
Congestion window (pkts)      20.4   20.5   20.4   20.2   26.4   27     38.4   39.9   50.4   49.8
Buffer occupancy at Router1 (pkts): Model 10, Simulation 9.8

Table 1: Stability (PF): comparison of theoretical and simulation results. M stands for Model and S stands for Simulation. All simulation numbers are averaged at the equilibrium point.

For weighted proportional fairness, we set $\alpha_s$ to 2 packets per 10 ms, which means each source will keep a different number of extra packets in the pipe and the optimal sending rate will be proportional to the propagation delay. The results for two (of the five) connections are shown in Figure 3, except this time we show the congestion windows instead of the sending rates. The other performance numbers are in Table 2, which again shows that the simulations closely follow the model's predictions. Both the sending rates (Figure 2) and the congestion windows (Figure 3) oscillate around the equilibrium. This is an artifact of setting $\alpha_s = \beta_s$ in our simulations, which we have assumed in the model for simplicity.
[Figure 3: two panels plotting the theoretical equilibrium window and cwnd (KB) over time for Host2a and Host4a under the WPF implementation.]
Figure 3: Stability (WPF): congestion window sizes for two (of the five) connections.
Host                          1a            2a            3a            4a            5a
                              M      S      M      S      M      S      M      S      M      S
baseRTT (ms)                  15.34  15.34  15.34  15.34  20.34  20.34  30.34  30.34  40.34  40.34
RTT w/ queueing (ms)          19.4   19.55  19.4   19.58  24.4   24.4   34.4   34.3   44.4   44.3
Sending rate (KB/s)           756.3  781    756.3  774    1003   994    1496   1495   1990   1975
Congestion window (pkts)      14.7   15.1   14.7   14.9   24.5   24.6   51.5   51.7   88.4   88.6
Buffer occupancy at Router1 (pkts): Model 24.34, Simulation 24.24

Table 2: Stability (WPF): comparison of theoretical and simulation results. M stands for Model and S stands for Simulation. All simulation numbers are averaged at the equilibrium point.
Vegas adjusts the congestion window by one packet in each round trip time. The adjustment is large relative to $\alpha_s = \beta_s = 2$ packets, rendering the window prone to oscillation. We have repeated the simulation using an $\alpha_s$ that is 10 times as large (corresponding to a stepsize $\gamma$ that is 10 times as small). This reduces the impact of adjusting the window by one packet, and the curves smooth out.
6.2 Persistent congestion

[Figure 4: buffer occupancy (packets) at the router over time.]
Figure 4: Persistent congestion: buffer occupancy at the router.

Next, we take propagation delay estimation error into account by letting the Vegas sources discover the propagation delay for themselves. Since each source perceives a larger round trip delay due to queueing at the router, it takes longer for the sources to reach equilibrium. Therefore, sessions are now staggered 20 seconds apart. As shown in Figure 4(b), buffer occupancy grows much faster than linearly in the number of sources. We have also applied Theorem 3 to calculate the queue size, RTT with queueing, and equilibrium rates. The measured numbers match the predictions of Theorem 3 very well for the first half of the simulation, up to a queue size of 272 packets, further verifying our Vegas model. At very large buffer sizes, the Karush-Kuhn-Tucker equation describing the equilibrium becomes very ill-conditioned, and the system can easily be jolted into a different operating point, as apparently happened here. The distortion in utility functions not only leads to excess backlog, it also strongly favors new sources. Without estimation error, the sources should share the bandwidth equally. With error, the measured (equilibrium) source rates are no longer equal: their ratio deviates from one when two sources are active, and again when three sources are active, in agreement with the ratios calculated using Theorem 3.
During slow-start, Vegas keeps updating the marked fraction $\hat m^s$, but does not use it in window adjustment.
6.3 Vegas with REM

We start three connections with an inter-start interval of 5 seconds in order to test our claim that REM reduces the estimation error in Vegas' propagation delay. Figure 5 plots the congestion window sizes of the three connections and the buffer occupancy at Router1. As expected, each of the three connections converges to its appropriate share of the link bandwidth over time. When the link is not congested, the source rate oscillates more severely, as seen for Host1a during time 0-5 s. This is a consequence of the log utility function; see [3]. As more sources become active (5-16 s), the oscillation becomes smaller and convergence faster. Without REM, each connection would have maintained its $\alpha_s d_s$ extra packets buffered in the router, amounting to 120 packets in equilibrium. With REM, buffer occupancy is much smaller in equilibrium, even though the link utilization remains high (varying from 92% to 96%). Setting $\mu_l$ large keeps buffer occupancy small but decreases link utilization. This tradeoff could be decided by each router separately based on its own resources, such as buffer space, and other policies. Small buffer occupancy reduces the estimation error and eliminates the superlinear growth in queue length demonstrated in Figure 4(b) of Section 6.2. This is confirmed by the measurements shown in Table 3.

Host             1a              2a              3a
                 Model   Sim.    Model   Sim.    Model   Sim.
baseRTT (ms)     10.34   10.34   20.34   20.34   30.34   30.34

Table 3: Vegas+REM: comparison of theoretical and simulation results.
7 Conclusions
We have shown that TCP Vegas can be regarded as a distributed optimization algorithm that maximizes aggregate source utility over the transmission rates. The optimization model has four implications. First, it implies that Vegas measures the congestion in a path by its end-to-end queueing delay. A source extracts this information from round trip time measurements and uses it to optimally set its rate. The equilibrium is characterized by Little's Law in queueing theory. Second, it implies that the equilibrium rates are weighted proportionally fair. Third, it clarifies the mechanism, and consequences, of potential persistent congestion due to error in the estimation of propagation delay. Finally, it suggests a way to eliminate persistent congestion through binary feedback that decouples the computation and the feedback of the congestion measure. We have presented simulation results that validate our conclusions. Extensive simulations to compare Vegas+REM and Reno+RED will be reported in a future paper.

Acknowledgement. We gratefully acknowledge helpful discussions with Sanjeewa Athuraliya. This work was supported by the Australian Research Council through grants S499705, A49930405 and S4005343, and by the National Science Foundation through grant ANI-9906704.
[Figure 5: Vegas+REM (WPF): congestion window sizes (with theoretical equilibrium windows) for the three connections, and queue size at Router1, over time.]
References
[1] J. S. Ahn, P. B. Danzig, Z. Liu, and L. Yan. Evaluation of TCP Vegas: emulation and experiment. In Proceedings of SIGCOMM'95, 1995.
[2] Sanjeewa Athuraliya and Steven Low. Optimization flow control with Newton-like algorithm. In Proceedings of IEEE Globecom'99, December 1999.
[3] Omitted for anonymity. January 2000.
[4] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1995.
[5] Thomas Bonald. Comparison of TCP Reno and TCP Vegas via fluid approximation. In Workshop on the Modeling of TCP, December 1998. Available at http://www.dmi.ens.fr/%7Emistral/tcpworkshop.html.
[6] Lawrence S. Brakmo and Larry L. Peterson. TCP Vegas: end to end congestion avoidance on a global Internet. IEEE Journal on Selected Areas in Communications, 13(8), October 1995.
[7] S. Floyd. Connections with multiple congested gateways in packet-switched networks, Part I: one-way traffic. Computer Communications Review, 21(5), October 1991.
[8] S. Floyd. TCP and Explicit Congestion Notification. ACM Computer Communication Review, 24(5), October 1994.
[9] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking, 1(4):397-413, August 1993.
[10] V. Jacobson. Congestion avoidance and control. In Proceedings of SIGCOMM'88, ACM, August 1988. An updated version is available via ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.
[11] F. P. Kelly. Charging and rate control for elastic traffic. European Transactions on Telecommunications, 8:33-37, 1997. http://www.statslab.cam.ac.uk/~frank/elastic.html.
[12] Frank P. Kelly, Aman Maulloo, and David Tan. Rate control for communication networks: shadow prices, proportional fairness and stability. Journal of the Operational Research Society, 49(3):237-252, March 1998.
[13] Srisankar Kunniyur and R. Srikant. End-to-end congestion control schemes: utility functions, random losses and ECN marks. In Proceedings of IEEE Infocom, March 2000.
[14] T. V. Lakshman and Upamanyu Madhow. The performance of TCP/IP for networks with high bandwidth-delay products and random loss. IEEE/ACM Transactions on Networking, 5(3):336-350, June 1997.
[15] Steven H. Low. Optimization flow control with on-line measurement. In Proceedings of the ITC, volume 16, June 1999.
[16] Steven H. Low and David E. Lapsley. Optimization flow control, I: basic algorithm and convergence. IEEE/ACM Transactions on Networking, 7(6), December 1999.
[17] Matthew Mathis, Jeffrey Semke, Jamshid Mahdavi, and Teunis Ott. The macroscopic behavior of the TCP congestion avoidance algorithm. ACM Computer Communication Review, 27(3), July 1997.
[18] J. Mo, R. La, V. Anantharam, and J. Walrand. Analysis and comparison of TCP Reno and Vegas. In Proceedings of IEEE Infocom, March 1999.
[19] Jeonghoon Mo and Jean Walrand. Fair end-to-end window-based congestion control. Preprint, 1999.
[20] Ns network simulator. Available via http://www-nrg.ee.lbl.gov/ns/.
[21] V. Paxson. End-to-end routing behavior in the Internet. In Proceedings of SIGCOMM'96, ACM, August 1996.
[22] K. K. Ramakrishnan and S. Floyd. A proposal to add Explicit Congestion Notification (ECN) to IP. Internet draft draft-kksjf-ecn-01.txt, July 1998.
[23] K. K. Ramakrishnan and Raj Jain. A binary feedback scheme for congestion avoidance in computer networks. ACM Transactions on Computer Systems, 8(2):158-181, May 1990.