Computer Science Department: Technical Report NWU-CS-04-34 April 19, 2004
Technical Report
NWU-CS-04-34
April 19, 2004
Abstract
The packet pair mechanism has been shown to be a reliable method to measure the
bottleneck link capacity and available bandwidth on a network path, and has been widely
deployed in tools such as nettimer, IGI, and PTR. However, the available bandwidth is
different from the TCP throughput that an application can achieve and the difference can
be huge. TCP throughput benchmarking techniques are widely used to probe the TCP
throughput for applications, for example in the Network Weather Service (NWS).
Unfortunately recent research shows that these techniques often cannot predict TCP
throughput well for large transfers. This paper addresses this issue. We begin by
statistically characterizing the TCP throughput on the Internet, exploring the strong
correlation between TCP flow size and throughput, and the transient end-to-end
throughput distribution. We then analyze why benchmarking fails to predict large
transfers, and propose a novel yet simple prediction model based on our observations.
Our prototype, dualPats, is an application level TCP throughput prediction framework
that combines our model with simple time series models and a dynamic probing rate
adjustment algorithm that relates intrusiveness to path dynamics. Our analysis and
evaluation are based on large-scale Internet-based measurements and experiments
involving many sites distributed all over the world.
Effort sponsored by the National Science Foundation under Grants ANI-0093221, ACI-
0112891, ANI-0301108, EIA-0130869, and EIA-0224449. Any opinions, findings and
conclusions or recommendations expressed in this material are those of the author and do not
necessarily reflect the views of the National Science Foundation (NSF).
Keywords: Statistical TCP throughput characterization; TCP throughput prediction; TCP throughput
monitoring;
Characterizing and Predicting TCP Throughput on the Wide Area Network
1 Introduction

The concept of available bandwidth has been of central importance throughout the history of packet networks, and researchers have been trying to create end-to-end measurement algorithms for a long time. From Keshav's packet pair [16], to Crovella's cprobe [6], to the latest work such as IGI [12], the purpose is to measure the end-to-end available bandwidth accurately, quickly, and non-intrusively. Today's definition of available bandwidth is "the maximum rate that the path can provide to a flow, without reducing the rate of the rest of the traffic" [12, 13]. Other tools that measure either the bottleneck link capacity or the available bandwidth include nettimer [17], pathchar and pchar [11], pathload [13, 14], NCS and pipechar [15], pathrate [10], spruce [26], pathchirp [24], and Remos [18]. Most of these tools use packet pair or packet train techniques to conduct the measurements.

The available bandwidth is different from the TCP throughput that an application can achieve, and that difference can be huge. Lai's nettimer paper [17] showed many cases where the TCP throughput is much lower than the available bandwidth, while Jain's pathload paper [13] showed that the bulk transfer capacity [19] of a path can even be higher than the measured available bandwidth. Additionally, most of these tools take a long time to run, which makes them unsuitable for real-time use by applications and services.

The most widely used TCP throughput prediction tool is the Network Weather Service (NWS) [31]. NWS applies benchmarking techniques and time series models to measure TCP throughput and provide predictions to applications in real time. NWS has been broadly applied: Allen, et al [3] applied NWS to address the so-called Livny and Plank-Beck problems, and Swany, et al [28] applied NWS in the Grid information service.

Unfortunately, recent work [30, 29] has argued that NWS, and by implication TCP benchmarking techniques in general, are not good at predicting large file transfers on the high speed Internet. Sudharshan, et al [30] proposed and implemented predicting large file transfers from a log of previous transfers and showed that this can produce reasonable results. However, a pure log-based predictor is updated at application-chosen times and thus neglects the dynamic nature of the Internet. Hence, when a path changes dramatically, the predictor will be unaware of it until after the application begins to use the path. To take the dynamic changes
of the Internet into consideration, Sudharshan, et al [29] and Swany, et al [27] separately proposed regression and CDF techniques to combine the log-based predictor with small NWS probes, using the probes to estimate the current load on the path and adjust the log-based predictor. These techniques enhanced the accuracy of log-based predictors.

These combined techniques are limited to those host pairs that have logs of past transfers between them, and, due to the dynamic nature of the Internet, which only shows certain statistical stabilities, the logs can become invalid after some time. Furthermore, due to the strong correlation between TCP flow size and throughput [35], logs for certain ranges of TCP flow (file) size are not useful for the prediction of different TCP flow sizes.

The questions we address here are:

- How can we explain the strong correlation between TCP flow size and throughput, and what are its implications for predicting TCP throughput?
- How can we characterize the statistical stability of the Internet and TCP throughput, and what are its implications for predicting TCP throughput?
- How can we predict the TCP throughput with different TCP flow sizes without being intrusive?

The main contributions of this paper are:

- We explored reasons for the observed strong correlation between TCP flow size and throughput [36].
- We characterized end-to-end TCP throughput stability and distribution.
- We proposed a novel yet simple TCP benchmark mechanism.
- We proposed a dynamic sampling rate adjustment algorithm to lower active probing overhead.
- We described and evaluated dualPats, a TCP throughput prediction service based on the preceding contributions.

We define TCP throughput as BW = S / T, where S is the amount of data transferred and T is the transfer time, which starts at TCP connection initialization and ends when the data transfer finishes. In some of our experiments using GridFTP [2] and scp [33], we treat S as equivalent to file size, neglecting the small messages exchanged for authentication. T is equivalent to file transfer time in our experiments with GridFTP and scp. TCP throughput is directly experienced by applications, and thus accurate predictions are very important for the design and implementation of distributed applications.

We begin by describing our experimental setup and measurements (Section 2). In Section 3, we use our measurements to address the strong correlation between TCP flow size and throughput, explaining the phenomenon and how it can cause benchmarking to err, and develop a new predictive model that incorporates it. Next, we consider the statistical stability of the Internet and how it affects the lifetime of measurements and predictions (Section 4). Finally, we incorporate our results into dualPats and evaluate its performance (Section 5).

2 Experiments

Our experimental testbed includes PlanetLab and several additional machines located at Northwestern University and Argonne National Laboratory (ANL). PlanetLab [1] is an open platform for developing, deploying, and accessing planetary-scale services. It currently consists of 359 computers located at 147 sites around the world.

We conducted S1 and S2 mainly to characterize TCP throughput on the Internet; for these we implemented a simple C client-server program. S2 was also used to verify the new TCP benchmarking mechanism we propose. We conducted S3 using GridFTP and scp to strengthen S1 and S2 with big TCP flows and with applications that require authentication before transferring effective data. S3 was also used to further verify the benchmarking mechanism. S4 was conducted to evaluate dualPats, our TCP throughput prediction framework.

Our experiments are summarized in Figure 1.

3 The strong correlation between TCP flow size and throughput

A surprising finding in recent TCP connection characterization is that TCP flow size and throughput are strongly correlated. This section explains the phenomenon, provides new additional explanations for it, explains why it can lead to inaccurate TCP throughput predictions, and outlines a new prediction approach.

Yin Zhang, et al [35] analyzed the correlations between the TCP flow characteristics of interest, including flow duration and throughput, flow duration and size, and flow size and throughput. They pointed out that these correlations are fairly consistent across all their traces, and show a slight negative correlation between duration and throughput, a slight positive correlation between size and duration, and a strong correlation between throughput and flow size. They pointed out that the strong correlation between flow size and throughput is the most interesting one and explained it in two ways:
Figure 1. Summary of experiments.

S1. Statistics: 1,620,000 TCP transfers. Main purpose: to evaluate TCP throughput stability and transient distributions. Hosts, paths, repetitions: 40 PlanetLab nodes in North America, Europe, Asia, and Australia; random pairing repeated 3 times, 60 distinctive paths total. Messages, software, procedure: client/server messages of 100 KB, 200 KB, 400 KB, 600 KB, 800 KB, 1 MB, 2 MB, 4 MB, and 10 MB; the server sends a file of a specific size to the client continuously 3,000 times and then starts to send a file of a different size.

S2. Statistics: 2,430,000 TCP transfers; 270,000 runs. Main purpose: to study the correlation between TCP throughput and flow size, and to evaluate the proposed TCP benchmark mechanism. Hosts, paths, repetitions: 40 PlanetLab nodes in North America, Europe, Asia, and Australia; random pairing repeated 3 times, 60 distinctive paths total. Messages, software, procedure: client/server messages of 100 KB, 200 KB, 400 KB, 600 KB, 800 KB, 1 MB, 2 MB, 4 MB, and 10 MB; the server sends a sequence of files with increasing sizes in order, starting over after each run.

S3. Statistics: 4,800 TCP transfers; 300 runs. Main purpose: to test the proposed TCP throughput benchmark mechanism, and to strengthen S1 and S2 with large TCP flow sizes and different applications. Hosts, paths, repetitions: 20 PlanetLab nodes in North America, Europe, Asia, and Australia, one node at Northwestern, and one node at ANL; 30 distinctive paths total. Messages, software, procedure: GridFTP and scp, file sizes from 5 KB to 1 GB; the server sends a sequence of files with increasing sizes in order, starting over after each run.

S4. Statistics: 2,400 test cases. Main purpose: to evaluate the dualPats TCP throughput prediction service. Hosts, paths, repetitions: 20 PlanetLab nodes in North America, Europe, Asia, and Australia, one node at Northwestern, and one node at ANL; 20 distinctive paths total. Messages, software, procedure: GridFTP and scp randomly send a file of size 40 MB or 160 MB; about 48 hours long.
Slow start: TCP slow start could cause some correlation between flow size and flow rate [35]. Hari Balakrishnan, et al [4] showed that 85% of web-related TCP packets were transferred during slow start. This implies that most web-related flows ended in slow start, before TCP had fully opened its congestion window, leading to throughput much lower than would be possible with a fully open window.

User effect: The users are estimating the underlying bandwidth, and thus transfer big files only when the estimated bandwidth is correspondingly large.

However, these explanations alone do not appear to account for the strength of the correlation, which suggests that there must be some other mechanisms at work.

Let's consider the correlation between flow size and throughput in our experiments. Figure 2 gives the cumulative distribution functions (CDFs) of the correlation coefficient (Pearson's r, which is widely used to measure the strength of a linear relationship; we use it to show how strongly two random variables are linearly correlated), where each individual value is calculated from one run of S2 or S3. It is clear from the graph that, for the simple C server results in S2, over 80% of all runs demonstrate strong or medium correlations between flow sizes and flow rates. Further, 64% of all runs show strong correlations.
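For reference, the correlation coefficient used above can be computed as in the short C routine below; this is a textbook implementation we include only for clarity, not code from our measurement tools. In our setting, x[] would hold the flow sizes and y[] the corresponding flow rates from one run.

  #include <math.h>
  #include <stddef.h>

  /* Pearson's correlation coefficient r between x[] and y[] (n samples). */
  double pearson_r(const double *x, const double *y, size_t n) {
      double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
      for (size_t i = 0; i < n; i++) {
          sx  += x[i];         sy  += y[i];
          sxx += x[i] * x[i];  syy += y[i] * y[i];
          sxy += x[i] * y[i];
      }
      double cov = sxy - sx * sy / n;   /* n * covariance                  */
      double vx  = sxx - sx * sx / n;   /* n * variance of x               */
      double vy  = syy - sy * sy / n;   /* n * variance of y               */
      return cov / sqrt(vx * vy);       /* scale factors cancel in the ratio */
  }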
[Figure 2: CDF of correlation coefficients (Pearson's r) between flow size and flow rate, for the simple program and for GridFTP+SCP.]

3.2 Explanations

Now we consider additional explanations for the surprising correlation between flow size and transfer time.
Most applications have an initial message exchange. For example, GridFTP and scp require certificate or public key authentication before starting to send or receive data.

[Figure 3: TCP throughput versus TCP flow size (KB), for GridFTP transfers between Northwestern and ANL.]

Figure 3 shows the TCP throughput as a function of TCP flow size, for transfers using GridFTP between Northwestern University and ANL. The dotted line is the asymptotic TCP throughput. We tried linear, logarithmic, order-2 polynomial, power, and exponential curve fitting, but none of them fit well.

Figure 4. Transfer time versus TCP flow size with GridFTP. Transfers are between Northwestern University and Argonne National Lab. Single TCP flow with TCP buffer set. [Plot details: linear trend y = 9E-05x + 0.7246 with R² = 0.9992, time in seconds versus file size in KB; a "noise area" is marked at small file sizes.]

We next considered the relationship between TCP flow duration (transfer time) and flow size (file size). Figure 4 shows that this relationship can be well modeled with a simple linear model. A closer look at Figure 4 shows that the total TCP flow duration or file transfer time can be divided into two parts: the startup overhead and the effective data transfer time. In this case, the startup overhead is about 0.72 seconds. We represent this as

  T = α + β·S        (1)

where T is the TCP flow duration (transfer time), S is the TCP flow size, α is the startup overhead, and β is the additional transfer time per unit of data (so 1/β is the steady-state TCP throughput). Dividing the flow size by the duration gives

  BW = S / T = S / (α + β·S)        (2)

where BW is the TCP throughput, and α, β, and S are the same as in Equation 1.
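To make Equations 1 and 2 concrete, the short C program below (an illustration we add here, not part of our measurement software) plugs in the constants read off the trend line in Figure 4, α ≈ 0.7246 s and β ≈ 9e-5 s/KB, and prints the predicted transfer time and throughput for a few file sizes.

  #include <stdio.h>

  /* Constants from the Figure 4 trend line: time (s) = beta * size (KB) + alpha */
  static const double alpha = 0.7246;   /* startup overhead, seconds */
  static const double beta  = 9e-5;     /* seconds per KB of payload */

  /* Equation 1: predicted transfer time for a flow of size_kb KB */
  static double predict_time(double size_kb) {
      return alpha + beta * size_kb;
  }

  /* Equation 2: predicted TCP throughput (KB/s) for the same flow */
  static double predict_bw(double size_kb) {
      return size_kb / predict_time(size_kb);
  }

  int main(void) {
      double sizes_kb[] = { 2000.0, 30000.0, 1000000.0 };
      for (int i = 0; i < 3; i++) {
          printf("%10.0f KB: T = %6.2f s, BW = %8.1f KB/s\n",
                 sizes_kb[i], predict_time(sizes_kb[i]), predict_bw(sizes_kb[i]));
      }
      /* The asymptotic throughput is 1/beta (about 11111 KB/s here); the
         predicted throughput approaches it as the file size grows. */
      return 0;
  }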
Residual slow start effect: Mathis, et al [20] pointed out that it takes TCP some time before its throughput reaches equilibrium. Assuming selective acknowledgments (SACK), the number of packets TCP sends in this unstable phase is inversely related to the loss rate p (scaled by a constant), and it can be significant given a low loss rate. This happens because with SACK, slow start will overshoot and drive up the loss rate.
Figure 5. CDF of α, the startup overhead, for the simple TCP program and for GridFTP+SCP. Even for the simple client/server there is startup overhead, likely caused by the residual slow start effect. The startup overheads of scp and GridFTP are much larger.

Because the linear model of Equation 1 fits the measured transfer times so well, this residual slow start effect can be treated as another kind of startup overhead, incorporated into α as above. This can also explain why, in Figure 2, the correlation coefficients for the scp and GridFTP traces are much stronger than those of the simple program.

To verify that this is the case in general for simple applications without other startup overheads, we used the data collected in experiment S2. We did least square linear curve fitting and calculated α for each set of data. Figure 5 shows the CDF of these α values. The effect of residual slow start is visible even for the simple program. For comparison purposes, we also plot in the same figure the CDF of α for applications that require authentication, namely GridFTP and scp. As the CDF indicates, a typical α for such applications is much larger than that of the simple application.

Why simple TCP benchmarking fails

There are several reasons why simple TCP benchmarking, as used by NWS, cannot predict the throughput of large transfers well, as observed in earlier GridFTP tests [29]:

- The default probe used by NWS is too small. It will likely end up in the noise area shown in Figure 4.

- The TCP throughput that the probe measures is only useful for TCP flows of similar size, because of the strong correlation between throughput and flow size. Given Equation 2, it is clear that BW(2000 KB) is the TCP throughput for a file size of 2000 KB, BW(30000 KB) is the TCP throughput for a file size of 30000 KB, and 1/β is the steady-state TCP throughput. As the file size increases, the influence of the startup overhead decreases, and as the file size approaches infinity, the throughput approaches 1/β.

- The TCP buffer is not set for NWS probes, while the GridFTP tests were done with an adjusted TCP buffer size.

Based on these observations, we propose a novel yet simple TCP benchmark mechanism. Instead of using probes with the same size, we use two probes with different sizes, chosen to be beyond the noise area. We then fit a line between the two measurements, as shown in Figure 4. Using Equations 1 and 2, we can then calculate the TCP throughput for other flow sizes (file sizes).
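The sketch below shows the probe-pair calculation end to end. The probe sizes and measured times in main() are hypothetical placeholders, and the code is our illustration of the mechanism rather than the dualPats implementation.

  #include <stdio.h>

  /* Fit the linear model of Equation 1 through two probe measurements
     (size in KB, measured transfer time in seconds). Both probes must be
     larger than the noise area of the path. */
  static void fit_probe_pair(double s1_kb, double t1_s,
                             double s2_kb, double t2_s,
                             double *alpha, double *beta) {
      *beta  = (t2_s - t1_s) / (s2_kb - s1_kb);  /* slope: seconds per KB      */
      *alpha = t1_s - (*beta) * s1_kb;           /* intercept: startup overhead */
  }

  /* Equation 2: predicted throughput (KB/s) for a flow of size_kb KB */
  static double predict_bw(double alpha, double beta, double size_kb) {
      return size_kb / (alpha + beta * size_kb);
  }

  int main(void) {
      /* Hypothetical probe results: 400 KB in 0.76 s, 800 KB in 0.80 s */
      double alpha, beta;
      fit_probe_pair(400.0, 0.76, 800.0, 0.80, &alpha, &beta);

      printf("alpha = %.3f s, beta = %.6f s/KB\n", alpha, beta);
      printf("predicted BW for a 10240 KB transfer: %.1f KB/s\n",
             predict_bw(alpha, beta, 10240.0));
      printf("steady-state throughput (1/beta): %.1f KB/s\n", 1.0 / beta);
      return 0;
  }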
To verify that the new TCP benchmark works, we used the trace data from experiment S2. We chose a small probe with size 400 KB and a bigger probe with size 800 KB, and predicted the throughput of the other TCP transfers in the trace. Figure 7 shows the CDF of the relative prediction error for the different flow sizes. The CDFs look normal, so we used quantile-quantile plots to test this. Figure 8 shows an example plot. In almost all cases, we can fit a straight line to these plots with a high R², which tells us that our relative error is almost always normal.
Figure 6. CDF of R² for the linear model of Figure 4. Each R² is from an independent test. Both the simple client/server program and the applications that require authentication show a strong linear property. Note that the Y axis is in log scale to show detail. Over 99% of the runs had a very high R².

Figure 7. CDF of relative prediction error for TCP throughput with different flow sizes.

[Figure 8: Q-Q plot of relative prediction error.]
Figure 9. CDF of the standard deviation of transfer time over all Internet paths, for 5 different flow sizes (standard deviation plotted on a log scale).

[Figure 10: CDF of the coefficient of variation (COV) of file transfer time.]

4 Statistical stability of the Internet

Understanding the statistical stability of TCP throughput on a path can help us to make decisions about prediction strategies, such as the frequency of active probing, and therefore to lower the intrusiveness of the predictors.

Routing stability: Paxson [23, 22] proposed two metrics for route stability, prevalence and persistency. Prevalence, which is of particular interest to us here, is the probability of observing a given route over time. If a route is prevalent, then observing it allows us to predict that it will be used again. Persistency is the frequency of route changes. The two metrics are not closely correlated. Paxson's conclusions are that Internet paths are heavily dominated by a single route, but that the time periods over which routes persist show wide variation, ranging from seconds to days. However, 2/3 of the Internet paths Paxson studied had routes that persisted for days to weeks. Chinoy found that route changes tend to concentrate at the edges of the network, not in its "backbone" [7].

Spatial locality and temporal locality: Balakrishnan, et al analyzed statistical models for the observed end-to-end network performance based on extensive packet-level traces collected from the primary web site for the Atlanta Summer Olympic Games in 1996. They concluded that nearby Internet hosts often have almost identical distributions of observed throughput, although the size of the clusters for which the performance is identical varies. Zhang, et al [36] studied the constancy of Internet path properties and characterized the period over which a path's throughput remains operationally steady, which they call an operational constancy region (OCR). Instead of using OCR, we define a Statistically Stable Region (SSR) as the length of the period during which the ratio of the largest to the smallest estimated steady-state throughput of a path stays below a constant factor.
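The C routine below is one concrete reading of this definition and is our illustration only, not dualPats code: given time-stamped estimates of a path's steady-state throughput and a constant factor, it reports the lengths of the maximal periods during which the ratio of the largest to the smallest estimate stays below the factor.

  #include <stdio.h>
  #include <stddef.h>

  /* One throughput estimate: when it was taken and its value. */
  struct sample { double t_sec; double bw; };

  /* Report the length of each maximal period ("SSR") during which
     max(bw)/min(bw) stays below 'factor'. */
  void ssr_lengths(const struct sample *s, size_t n, double factor) {
      if (n == 0) return;
      size_t start = 0;
      double lo = s[0].bw, hi = s[0].bw;

      for (size_t i = 1; i < n; i++) {
          double new_lo = s[i].bw < lo ? s[i].bw : lo;
          double new_hi = s[i].bw > hi ? s[i].bw : hi;
          if (new_hi / new_lo <= factor) {   /* still within the factor       */
              lo = new_lo; hi = new_hi;
          } else {                           /* region ends at sample i - 1   */
              printf("SSR length: %.0f seconds\n",
                     s[i - 1].t_sec - s[start].t_sec);
              start = i;
              lo = hi = s[i].bw;
          }
      }
      printf("SSR length: %.0f seconds\n", s[n - 1].t_sec - s[start].t_sec);
  }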
Figure 11. CDF of statistically stable region (SSR) length for steady-state TCP throughput, for different values of the constant factor (1.2, 1.5, 2, 3, 5, and 10).

Whereas OCR is defined with respect to flows of a particular size, SSR characterizes the steady state throughput for all flows with different sizes. We used traces from experiment S2 to characterize the SSR with steady-state TCP throughput. That is, instead of looking at the TCP throughput of a specific flow size, we applied least square linear fitting to get Equation 1, and therefore the estimated steady-state TCP throughput of the path.

Figure 11 gives the CDF of the length of all SSRs modeled by steady-state TCP throughput from experiment S2. Each curve in the plot corresponds to a particular value of the constant factor. Under all the different values of the factor, some degree of temporal locality is exhibited. Moreover, the larger the factor is, the longer the SSRs tend to be.

For comparison purposes, we also calculated the CDF of OCR with data from S1. The comparison between our results and Zhang's [36] suggests that the temporal locality in our test environment is much weaker. For instance, Zhang found that a large fraction of OCRs are longer than 1 hour, and a substantial fraction exceed 3 hours, at the corresponding factors; in our results, the two corresponding numbers drop to 2% and 10%, respectively. TCP throughput in our testbed appears to be less stable. We suspect that this difference may be largely due to the fact that PlanetLab nodes often become CPU or bandwidth saturated, causing great fluctuations in TCP throughput. It is challenging to predict TCP throughput in such a highly dynamic environment.

End-to-end TCP throughput distribution: An important question an application often poses is how the TCP throughput varies and, beyond that, whether an analytical distribution model can be applied to characterize its distribution. Balakrishnan, et al [5] studied the aggregated TCP throughput distribution across all different flow sizes between each pair of Internet hosts. Their statistical analysis suggests that end-to-end TCP throughput can be well modeled as a log-normal distribution.

Since we have already seen that there is a strong correlation between TCP throughput and flow size, we are more interested in studying the TCP throughput distribution of a particular flow size than in getting an aggregated throughput distribution across all different flow sizes. The data from experiment S1 lets us do this analysis.

Recall that in S1, for each client/server pair, we repeated the transfer of each file 3,000 times. We histogrammed the throughput data for each flow size/path tuple. In almost every case, the throughput histogram demonstrates a multimodal distribution. This suggests that it is probably not feasible to model TCP throughput using simple distributions.

Because the collection of data for each client/server pair lasted several hours or even longer, we suspect that the multimodal feature may be partially due to changes in network conditions during the measurement period. To verify this hypothesis, we studied the throughput distribution using subsets of each dataset. A subset contains much less data and covers a shorter measurement length. In other words, we hoped to find "subregions" in each dataset in which the network conditions are relatively stable and the throughput data can be better modelled unimodally.

It is very hard to predefine an optimal length or data size for such "subregions" in the throughput data; in fact, the appropriate length may vary from time to time. Therefore, we believe it is necessary to adaptively change the subregion length over time as we acquire data (or walk the dataset offline). The purpose is to segment the whole dataset into multiple subregions (or identify segment boundaries online). For each segment, we fit the data with several analytical distributions and evaluate the goodness of fit using the values of R².

Our offline distribution fitting algorithm for TCP throughput has the following steps (a code sketch follows the list):

1. Select a trace of TCP throughput (a sequence of measurements for a particular flow size on a particular Internet path).

2. Initialize the subregion length, and set the start and end points of the subregion to 1 and 100, respectively.

3. Fit the subregion data with an analytical distribution, and calculate the value of R².

4. Increase the subregion length by 100; that is, keep the start point from the previous step, but increase the end point by 100. For this new subregion, fit the data with the analytical distribution model again and get a new value of R². Note that the granularity here, 100, can also be changed.
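The C sketch below illustrates steps 1-4. The rule for closing a subregion once the fit degrades (the r2_min test) is our assumption about how the walk continues rather than a rule stated above, and the fitting routine itself is supplied by the caller.

  #include <stddef.h>

  #define STEP 100  /* growth granularity from step 4; can be changed */

  /* Caller-supplied routine: fit one analytical distribution (e.g., normal)
     to trace[start..end) and return the goodness of fit as R^2. */
  typedef double (*fit_fn)(const double *trace, size_t start, size_t end);

  /* Caller-supplied routine invoked for each detected subregion. */
  typedef void (*report_fn)(size_t start, size_t end, double r2);

  /* Walk one throughput trace offline (steps 1-4). Closing a subregion when
     R^2 falls below r2_min is our assumed continuation of the algorithm. */
  void segment_trace(const double *trace, size_t n, double r2_min,
                     fit_fn fit, report_fn report) {
      size_t start = 0;                        /* step 2: start point        */
      size_t end   = STEP < n ? STEP : n;      /* step 2: initial end point  */
      double r2    = fit(trace, start, end);   /* step 3                     */

      while (end < n) {
          size_t grown  = end + STEP < n ? end + STEP : n;   /* step 4 */
          double new_r2 = fit(trace, start, grown);
          if (new_r2 >= r2_min) {              /* fit still good: keep growing  */
              end = grown;
              r2  = new_r2;
          } else {                             /* assumed: close this subregion */
              report(start, end, r2);
              start = end;
              end   = start + STEP < n ? start + STEP : n;
              r2    = fit(trace, start, end);
          }
      }
      report(start, end, r2);
  }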
[Figure: CDF of R² for different distributions.]

On the other hand, none of the analytical distributions is overwhelmingly successful. Even for the normal distribution, the R² values are low about half of the time.
5.2 Dynamic sampling rate adjustment algorithm

There are two ways to decrease the overhead caused by the probe pairs of dualPats: decrease the size of the probe pair or decrease the sampling rate.

As we discussed in Section 4, each Internet path shows statistical stability in TCP throughput. However, each path is different in the length of its SSR. Therefore, we designed a simple algorithm to dynamically adjust the sampling rate to the path's SSR. The algorithm is as follows (a sketch in C follows the list):

1. Set an upper bound and a lower bound for the sampling interval; in our tests these were 1200 and 20 seconds, respectively.

2. Set two relative change bounds, expressed as percentages. After sending each probe pair, estimate the current steady-state TCP throughput. If it has changed by less than the smaller bound, increase the sampling interval by a fixed step (in seconds); if it has changed by an amount between the two bounds, keep the current interval; otherwise, decrease the interval. In experiment S4, the two bounds were set to 5% and 15%.

3. The interval must always remain between the lower and upper bounds.
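A minimal C sketch of this adjustment rule follows. The bounds and thresholds (20 s, 1200 s, 5%, 15%) are those given above; the size of the additive step, and the choice to decrease by the same step, are our assumptions, since they are not pinned down in the text.

  #include <math.h>

  #define MIN_INTERVAL   20.0   /* lower bound on sampling interval, seconds */
  #define MAX_INTERVAL 1200.0   /* upper bound on sampling interval, seconds */
  #define DELTA1         0.05   /* below 5% change: probe less often         */
  #define DELTA2         0.15   /* above 15% change: probe more often        */
  #define STEP_SECONDS   20.0   /* additive step (assumed, not given)        */

  /* Given the previous and the newly estimated steady-state throughput,
     return the next sampling interval. */
  double next_interval(double interval, double prev_bw, double new_bw) {
      double change = fabs(new_bw - prev_bw) / prev_bw;   /* relative change */

      if (change < DELTA1)
          interval += STEP_SECONDS;      /* path looks stable: back off       */
      else if (change > DELTA2)
          interval -= STEP_SECONDS;      /* path is changing: probe sooner    */
      /* between DELTA1 and DELTA2: keep the current interval */

      if (interval < MIN_INTERVAL) interval = MIN_INTERVAL;   /* step 3 */
      if (interval > MAX_INTERVAL) interval = MAX_INTERVAL;
      return interval;
  }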
We also want to minimize the size of the probe pairs, subject to the condition that neither of them falls into the noise area shown in Figure 4. However, the noise area is different for each Internet path, as discussed in Section 3. It is a function of the loss rate and the underlying bandwidth. We need an algorithm that can detect it automatically. For now, the algorithm uses feedback from the application about its prediction error.

dualPats ran 2,400 predictions on the 20 paths of the 48-hour-long S4 experiments. Test cases were randomly chosen 40 MB or 160 MB files.

The prediction results are shown in Figure 14. The mean error is calculated by averaging all of the relative errors. For an unbiased predictor, this value should be close to zero given enough test cases. We can see that in our evaluation it is quite small in most cases, and we see an equal proportion of positive and negative mean errors.

The mean absolute error is the average of the absolute values of all of the relative errors. We consider it the most important metric in evaluating the predictions. Figure 14 shows that all 20 paths have a mean absolute error below 30%, 17 out of 20 paths are below 20%, and 13 out of 20 paths are below 15%. As we commented in Section 4, PlanetLab is much more heavily loaded and dynamic than the current Internet, and thus it is likely to be much harder to predict than the current Internet.

We studied the correlation between the error and several known attributes. The results are shown in Figure 15. Clearly, the mean error is not related to any of the attributes, which further suggests that the predictions given by dualPats are unbiased. However, if the path is very dynamic it is hard to predict. Figure 15 shows that the correlation between the mean absolute error and the sampling interval length (and, indirectly, the SSR) is negative and fairly strong. This implies that our algorithm captured the path dynamics and tried to adjust to its changes.

Our conclusion is that dualPats does an effective job of predicting TCP throughput.
Path  Router Hops  Mean RTT (ms)  Mean err  Mean stderr  Mean abs(err)  Mean abs(stderr)  Mean Interval (s)
1 20 55 -0.0073 0.11 0.069 0.13 641.97
2 18 60 0.10 0.17 0.17 0.18 29.44
3 17 33 -0.21 0.23 0.25 0.51 132.2
4 11 27.5 -0.03 0.19 0.13 0.25 71.56
5 13 31 -0.04 0.20 0.16 0.28 48.76
6 16 138 -0.079 0.19 0.14 0.29 58.18
7 16 120 0.048 0.355 0.28 0.42 21.87
8 14 51 0.021 0.12 0.095 0.168 512.64
9 18 207 -0.14 0.17 0.18 0.36 51.50
10 14 29 -0.11 0.19 0.14 0.31 180.17
11 19 110 -0.036 0.18 0.11 0.24 28.57
12 15 36 -0.038 0.14 0.078 0.18 258.16
13 17 59 0.035 0.208 0.16 0.24 32.23
14 12 23.5 -0.012 0.060 0.042 0.082 320.97
15 13 28 -0.095 0.186 0.14 0.31 511.33
16 18 100 -0.028 0.16 0.11 0.21 543.75
17 19 70 -0.083 0.030 0.083 0.17 543.63
18 14 81 -0.076 0.025 0.076 0.154 522.20
19 19 72 0.21 0.38 0.29 0.39 48.39
20 17 50 0.11 0.12 0.14 0.12 97.25
Figure 14. Prediction error statistics for experiment S4. RTT is the round trip time between the two sites in milliseconds, and Mean Interval is the average interval between probe pairs in seconds. Mean err is the average relative error, while Mean abs(err) is the average of the absolute relative errors.
                 Router Hops   Mean RTT   Mean Interval
  Mean abs(err)     0.112        0.257       -0.62
  Mean err          0.076        0.13        -0.13

Figure 15. Correlation coefficients between prediction error and known attributes.

6 Conclusions and future work

We have characterized the behavior of TCP throughput in the wide area environment, providing additional explanations for the correlation of throughput and flow size and demonstrating how this correlation causes erroneous predictions to be made when simple TCP benchmarking is used to characterize a path. In response, we proposed and evaluated a new benchmarking approach, the probe pair, from which TCP throughput for different message sizes can be derived. We described and evaluated the performance of a new predictor, dualPats, that implements this approach.

In this work, we do not consider parallel TCP flows, which are a current subject of study for us. We also acknowledge that, like all benchmarking-based systems, our approach has scalability problems. We have addressed this to some extent with our dynamic sample rate adjustment algorithm. However, we are also considering whether our ideas can work within a passive measurement model such as that in Wren [34], and the use of hierarchical decomposition as in Remos [18] and NWS Clique [32]. We have assumed that the network path is the bottleneck for file transfer. In some cases, especially in high speed optical networks, this may not be true, and transfer time prediction would also have to take into account processor and memory system performance.

References

[1] http://www.planet-lab.org.
[2] Allcock, W., Bester, J., Bresnahan, J., Cervenak, A., Liming, L., and Tuecke, S. GridFTP: Protocol extensions to FTP for the Grid. Tech. rep., Argonne National Laboratory, August 2001.
[3] Allen, M., and Wolski, R. The Livny and Plank-Beck problems: Studies in data movement on the computational grid. In Supercomputing 2003 (November 2003).
[4] Balakrishnan, H., Padmanabhan, V. N., Seshan, S., Stemm, M., and Katz, R. H. TCP behavior of a busy Internet server: Analysis and improvements. In INFOCOM (1) (1998), pp. 252-262.
[5] Balakrishnan, H., Seshan, S., Stemm, M., and Katz, R. H. Analyzing stability in wide-area network performance. In ACM SIGMETRICS (June 1997).
[6] Carter, R., and Crovella, M. Measuring bottleneck link speed in packet-switched networks. Performance Evaluation 28 (1996), 297-318.
[7] Chinoy, B. Dynamics of Internet routing information. In SIGCOMM (1993), pp. 45-52.
[8] Dinda, P. A. The statistical properties of host load. Scientific Programming 7, 3-4 (1999). A version of this paper is also available as CMU Technical Report CMU-CS-TR-98-175. A much earlier version appears in LCR '98 and as CMU-CS-TR-98-143.
[9] Dinda, P. A. Online prediction of the running time of tasks. Cluster Computing 5, 3 (2002). Earlier version in HPDC 2001, summary in SIGMETRICS 2001.
[10] Dovrolis, C., Ramanathan, P., and Moore, D. What do packet dispersion techniques measure? In INFOCOM (2001), pp. 905-914.
[11] Downey, A. B. Using pathchar to estimate Internet link characteristics. In Measurement and Modeling of Computer Systems (1999), pp. 222-223.
[12] Hu, N., and Steenkiste, P. Evaluation and characterization of available bandwidth probing techniques. IEEE JSAC Special Issue in Internet and WWW Measurement, Mapping, and Modeling 21, 6 (August 2003).
[13] Jain, M., and Dovrolis, C. End-to-end available bandwidth: Measurement methodology, dynamics, and relation with TCP throughput. In ACM SIGCOMM (2002).
[14] Jain, M., and Dovrolis, C. Pathload: A measurement tool for end-to-end available bandwidth. In Passive and Active Measurement Workshop (2002).
[15] Jin, G., Yang, G., Crowley, B., and Agarwal, D. Network characterization service (NCS). In 10th IEEE Symposium on High Performance Distributed Computing (August 2001).
[16] Keshav, S. A control-theoretic approach to flow control. In Proceedings of the Conference on Communications Architecture and Protocols (1993), pp. 3-15.
[17] Lai, K., and Baker, M. Nettimer: A tool for measuring bottleneck link bandwidth. In USENIX Symposium on Internet Technologies and Systems (2001), pp. 123-134.
[18] Lowekamp, B., Miller, N., Sutherland, D., Gross, T., Steenkiste, P., and Subhlok, J. A resource monitoring system for network-aware applications. In Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing (HPDC) (July 1998), IEEE, pp. 189-196.
[19] Mathis, M., and Allman, M. A framework for defining empirical bulk transfer capacity metrics. RFC 3148, July 2001.
[20] Mathis, M., Semke, J., and Mahdavi, J. The macroscopic behavior of the TCP congestion avoidance algorithm. Computer Communication Review 27, 3 (1997).
[21] Myers, A., Dinda, P. A., and Zhang, H. Performance characteristics of mirror servers on the Internet. In INFOCOM (1) (1999), pp. 304-312.
[22] Paxson, V. End-to-end routing behavior in the Internet. In Proceedings of the ACM SIGCOMM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (New York, August 1996), vol. 26, 4 of ACM SIGCOMM Computer Communication Review, ACM Press, pp. 25-38.
[23] Paxson, V. End-to-end routing behavior in the Internet. IEEE/ACM Transactions on Networking 5, 5 (1997), 601-615.
[24] Ribeiro, V., Riedi, R., Baraniuk, R., Navratil, J., and Cottrell, L. pathChirp: Efficient available bandwidth estimation for network paths. In Passive and Active Measurement Workshop (2003).
[25] Seshan, S., Stemm, M., and Katz, R. H. SPAND: Shared passive network performance discovery. In USENIX Symposium on Internet Technologies and Systems (1997).
[26] Strauss, J., Katabi, D., and Kaashoek, F. A measurement study of available bandwidth estimation tools. In Internet Measurement Conference (2003).
[27] Swany, M., and Wolski, R. Multivariate resource performance forecasting in the Network Weather Service. In ACM/IEEE Conference on Supercomputing (2002).
[28] Swany, M., and Wolski, R. Representing dynamic performance information in grid environments with the Network Weather Service. In 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02) (2002).
[29] Vazhkudai, S., and Schopf, J. Predicting sporadic grid data transfers. In 12th IEEE International Symposium on High Performance Distributed Computing (HPDC-12) (2002).
[30] Vazhkudai, S., Schopf, J., and Foster, I. Predicting the performance of wide area data transfers. In The 16th International Parallel and Distributed Processing Symposium (IPDPS 2002) (2002).
[31] Wolski, R. Dynamically forecasting network performance using the Network Weather Service. Cluster Computing 1, 1 (1998), 119-132.
[32] Wolski, R., Spring, N., and Hayes, J. The Network Weather Service: A distributed resource performance forecasting service for metacomputing. Journal of Future Generation Computing Systems 15, 5-6 (1999), 757-768.
[33] Ylonen, T. SSH — secure login connections over the Internet. In Proceedings of the 6th USENIX Security Symposium (1996), pp. 37-42.
[34] Zangrilli, M., and Lowekamp, B. B. Comparing passive network monitoring of grid application traffic with active probes. In Fourth International Workshop on Grid Computing (2003).
[35] Zhang, Y., Breslau, L., Paxson, V., and Shenker, S. On the characteristics and origins of Internet flow rates. In ACM SIGCOMM (2002).
[36] Zhang, Y., Duffield, N., Paxson, V., and Shenker, S. On the constancy of Internet path properties. In ACM SIGCOMM Internet Measurement Workshop (November 2001).