
WPI-CS-TR-20-06 August 2020

Comparison of TCP Congestion Control Performance over a Satellite Network

by
Saahil Claypool, Jae Chung and Mark Claypool

Computer Science
Technical Report
Series
WORCESTER POLYTECHNIC INSTITUTE

Computer Science Department


100 Institute Road, Worcester, Massachusetts 01609-2280
Comparison of TCP Congestion Control Performance over a Satellite Network

Saahil Claypool (Worcester Polytechnic Institute, Worcester, MA, USA), Jae Chung (Viasat, Marlborough, MA, USA), and Mark Claypool (Worcester Polytechnic Institute, Worcester, MA, USA)

Abstract—Satellite connections are critical for continuous network connectivity when disasters strike and for remote hosts that cannot use traditional wired, WiFi or mobile networking. While Internet satellite bitrates have increased, latency can still degrade TCP performance. Assessment of TCP over satellites is lacking, typically done by simulation or emulation only, if at all. This paper presents experiments comparing four TCP congestion control algorithms – BBR, Cubic, Hybla and PCC – on a commercial satellite network. Analysis of the results shows similar steady-state bitrates for all, but with significant differences in start-up throughputs and round-trip times caused by queuing of packets in flight. Power analysis combining throughput and latency shows that overall, PCC is the most powerful, due to relatively high throughputs and low, steady round-trip times, while for small downloads Hybla is the most powerful, due to fast throughput ramp-ups. BBR generally fares similarly to Cubic in all cases.

I. INTRODUCTION

Satellite networks are an essential part of modern society, providing ubiquitous network connectivity even in times of disaster. The number of satellites in orbit is over 2100, a 67% increase from 2014 to 2019 [1]. As few as three geosynchronous (GSO) satellites can provide global coverage, interconnecting widely distributed networks and providing "last mile" connectivity to remote hosts. The idea of "always-on" connectivity is particularly useful for redundancy, especially in an emergency when traditional (i.e., wired) connections may not be available. Recent research in satellite technology has produced spot beam frequency reuse that increases transmission capacity more than 20x, and the total capacity of planned GEO satellites is over 5 Tb/s.

While throughput gains for satellite Internet are promising, satellite latencies can be a challenge. GSO satellites orbit about 22k miles above the earth, which means about 300 milliseconds of latency to bounce a signal up and down, a hurdle for TCP-based protocols that rely upon round-trip time communication to advance their data windows. The physics involved for round-trip time Internet communication between terrestrial hosts using a satellite accounts for about 550 milliseconds of latency at a minimum [2]. To combat these latencies that can limit TCP throughput, many satellite operators use middle-boxes (also known as "performance enhancing proxies" or PEPs) to short-circuit the round-trip time communication to the satellite (see Figure 1). Unfortunately, encrypted connections that use TCP (such as TLS) or alternative protocols such as QUIC can render these satellite PEPs ineffective.

Fig. 1. TCP over a satellite with a performance enhancing proxy [2].

TCP congestion control algorithms play a critical role in enhancing or restricting throughput in the presence of network loss and latency. TCP Cubic [3] is the default TCP congestion control algorithm in Linux and Microsoft Windows, but BBR [4] has been widely deployed by Google on Linux servers and is a congestion control option available in the QUIC transport protocol as well [5]. A better understanding of TCP congestion control algorithm performance over satellite networks without PEPs is needed in order to assess the challenges and opportunities that satellites have to better support TCP and QUIC moving forward.

However, while TCP and BBR measurements have been done over wireless networks [6], there are few published studies measuring network performance over actual satellite networks [7], with most studies either using just simulation [8] or emulation with satellite parameters [9], [10], [11], [12].

This paper presents results from experiments that measure the performance of TCP over a commercial Internet satellite network. We compare four algorithms with different approaches to congestion control: default AIMD-based (Cubic [3]), bandwidth estimation-based (BBR [4]), utility function-based (PCC [12]), and satellite-optimized for startup (Hybla [13]). The network testbed and experiments are done over the Internet, but designed to be as comparable across runs as possible by interlacing runs of each protocol serially to minimize temporal differences and by doing 80 bulk downloads for each protocol to provide for a large sample. In addition, a custom ping application provides several days worth of round-trip time and lost packet data to get a baseline on a "quiet" satellite network.

Analysis of the results shows the satellite link has consistent baseline round-trip times of about 600 milliseconds, but infrequently has round-trip times of several seconds. Loss events are similarly infrequent (less than 0.05%) and short-lived. For the TCP algorithms, all four congestion control algorithms have similar overall median throughputs. BBR achieves the maximum link capacity more often than Cubic, Hybla or PCC. However, during the start-up phase, Hybla has the highest throughput followed by PCC, BBR and Cubic, in that order – faster start-up means faster completion for short-lived downloads, such as Web pages. Overall, Hybla has the highest average throughput, owing to its higher throughput during start-up. PCC has the lowest overall round-trip time, and Hybla the highest, consistently 50% higher than PCC. BBR and Cubic round-trip times are similar and between those of PCC and Hybla. However, BBR, Cubic and PCC (to a lesser extent) have periods of high retransmission rates owing to their over-saturation of the bottleneck queue, while Hybla mostly avoids this. Power analysis that combines throughput and delay shows PCC is generally the most powerful, followed by Hybla, with Cubic and BBR equally the least powerful.

The rest of this report is organized as follows: Section II describes research related to this work, Section III describes our testbed and experimental methodology, Section IV analyzes our experiment data, and Section V summarizes our conclusions and suggests possible future work.
II. RELATED WORK

This section describes work related to our paper, including TCP congestion control (Section II-A), comparisons of TCP congestion control algorithms (Section II-B), and TCP performance over satellite networks (Section II-C).

A. TCP Congestion Control (CC)

There have been numerous proposals for improvements to TCP's congestion control algorithm since its inception. This section highlights a few of the papers most relevant to our work, presented in chronological order.

Caini and Firrincieli [13] propose TCP Hybla to overcome the limitations traditional TCP flows have when running over high-latency links (e.g., a satellite). TCP Hybla modifies the standard congestion window increase with: a) an extension to the "additive increase"; b) adoption of the SACK option and timestamps, which help in the presence of loss; and c) packet spacing to reduce transmission burstiness. The slow start (SS) and congestion avoidance (CA) window updates for Hybla are:

    SS:  cwnd = cwnd + 2^ρ − 1        (1)
    CA:  cwnd = cwnd + ρ^2 / cwnd     (2)
    ρ = RTT / RTT_0                   (3)

where RTT_0 is typically fixed at a "wired" round-trip time of 0.025 seconds. Hybla is available for Linux as of kernel 2.6.11 (in 2005).
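As a concrete illustration of Equations (1)-(3), the following minimal Python sketch (ours, not the paper's or the kernel's implementation) applies the per-ACK window updates for a flow whose round-trip time is the roughly 600 millisecond satellite baseline measured later in this report:

    # Illustration of Hybla's window updates (Equations 1-3); the numbers and
    # function names are ours, not from the Hybla implementation.
    RTT = 0.600    # measured round-trip time (s), roughly this satellite link
    RTT0 = 0.025   # reference "wired" round-trip time (s) used by Hybla
    rho = RTT / RTT0                       # Equation (3): about 24 here

    def hybla_ss(cwnd):
        # Slow start, Equation (1), applied once per ACK received.  The
        # 2**rho - 1 term is what makes Hybla's ramp-up so aggressive on
        # long-RTT paths (packet spacing keeps the resulting bursts in check).
        return cwnd + 2 ** rho - 1

    def hybla_ca(cwnd):
        # Congestion avoidance, Equation (2), applied once per ACK received.
        return cwnd + (rho ** 2) / cwnd

    cwnd = 50.0                            # segments, arbitrary example value
    print("Hybla CA increment per ACK:", hybla_ca(cwnd) - cwnd)   # ~11.5
    print("Standard CA increment per ACK:", 1.0 / cwnd)           # 0.02

In effect, one Hybla flow over this link grows its window per unit time as fast as a reference flow with a 25 millisecond round-trip time, which is the intent of the ρ scaling.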
Ha et al. [3] develop TCP Cubic as an incremental improvement to earlier congestion control algorithms. Cubic is less aggressive than previous TCP congestion control algorithms in most steady-state cases, but can probe for more bandwidth quickly when needed. Cubic's window size is dependent only on the last congestion event, providing for more fairness to flows that share a bottleneck but have different round-trip times. TCP Cubic has been the default in Linux as of kernel 2.6.19 (in 2007), Windows 10.1709 Fall Creators Update (in 2017), and Windows Server 2016 1709 update (in 2017).

Cardwell et al. [4] develop TCP Bottleneck Bandwidth and Round-trip time (BBR) as a Cubic alternative. BBR uses the maximum bandwidth and minimum round-trip time observed over a recent time window to build a model of the network and set the congestion window size (up to twice the bandwidth-delay product). BBR has been deployed by Google servers since at least 2017 and is available for Linux TCP since Linux kernel 4.9 (end of 2016).

Dong et al. [12] propose TCP PCC, which continuously observes performance based on measurements in the form of mini "experiments". Different actions taken in these experiments are compared using a utility function, adopting the rate that has the best performance. The authors compare PCC against other TCP congestion control algorithms, including Cubic and Hybla, and emulate a satellite network based on parameters from a satellite Internet system. PCC is not generally available for Linux, but we were able to obtain a Linux-based implementation from Compira Labs (https://www.compiralabs.com/).

B. Comparison of CC Algorithms

Cao et al. [14] analyze measurement results of BBR and Cubic over a range of different network conditions. They produce heat maps and a decision tree that identify conditions which show performance benefits from BBR over using Cubic. They find it is the relative difference between the bottleneck buffer size and bandwidth-delay product that dictates when BBR performs well. Our work extends this work by providing detailed evaluation of Cubic and BBR in a satellite configuration, with round-trip times significantly beyond those tested by Cao et al.

Ware et al. [15] model how BBR interacts with loss-based congestion control protocols (e.g., TCP Cubic). Their validated model shows BBR becomes window-limited by its in-flight cap, which then determines BBR's bandwidth consumption. Their models allow for prediction of BBR's throughput when competing with Cubic with less than a 10% error.

Turkovic et al. [16] study the interactions between congestion control algorithms. They measure performance in a network testbed using a "representative" algorithm from three main groups of TCP congestion control – loss-based (TCP Cubic), delay-based (TCP Vegas [17]) and hybrid (TCP BBR)
– using 2 flows with combinations of protocols competing
with each other. They also do some evaluation of QUIC [18]
as an alternative transport protocol to TCP. They observed
bandwidth fairness issues, except for Vegas and BBR, and
found BBR is sensitive to even small changes in round-trip
time.
C. TCP over Satellite Networks
Obata et al. [7] evaluate TCP performance over actual (not
emulated, as is typical) satellite networks. Specifically, they
compare a satellite-oriented TCP congestion control algorithm
(STAR) with TCP NewReno and TCP Hybla. Experiments
with the Wideband InterNetworking Engineering test and
Demonstration Satellite (WINDS) system show throughputs around 26 Mb/s and round-trip times around 860 milliseconds. Both TCP STAR and TCP Hybla had better throughputs over the satellite links than TCP NewReno.

Wang et al. [10] provide a preliminary performance evaluation of QUIC with BBR on a network testbed that emulates a satellite network (capacities 1 Mb/s and 10 Mb/s, RTTs 200, 400 and 1000 ms, and packet loss rates up to 20%). Their results confirm QUIC with BBR has some throughput improvements compared with TCP Cubic for their emulated satellite network.

Utsumi et al. [9] develop an analytic model for TCP Hybla for steady-state throughput and latency over satellite links. They verify the accuracy of their model with simulated and emulated satellite links (capacity 8 Mb/s, RTT 550 ms, and packet loss rates up to 2%). Their analysis shows substantial improvements to throughput over that of TCP Reno for loss rates above 0.0001%.

Our work extends the above with comparative performance for four TCP congestion control algorithms on an actual, commercial satellite Internet network.

III. METHODOLOGY

In order to evaluate TCP congestion control over satellite links, we use the following methodology: setup a testbed (Section III-A), measure network baseline loss and round-trip times (Section III-B), bulk-download data using each congestion control algorithm serially (Section III-C), and analyze the results (Section IV).

A. Testbed

We setup a satellite link and configure our clients and servers so as to allow for repeated tests. Our setup is designed to enable comparative performance measurements by keeping all conditions the same across runs as much as possible, except for the change in TCP congestion control algorithm.

Our testbed is depicted in Figure 2. The client is a Linux PC with an Intel i7-1065G7 CPU @ 1.30 GHz and 32 GB RAM. There are four servers, each with a different TCP congestion control algorithm: Cubic, BBR, Hybla and PCC. Each server has an Intel Xeon E312xx CPU @ 2.5 GHz and 32 GB RAM. The servers and client all run Ubuntu 18.04.4 LTS, Linux kernel version 4.15.0. The servers connect to the WPI LAN via Gb/s Ethernet. The WPI campus network is connected to the Internet via several 10 Gb/s links, all throttled to 1 Gb/s. The client connects to a Viasat satellite terminal (with a modem and router) via a Gb/s Ethernet connection.

Fig. 2. Satellite measurement testbed.

The terminal communicates through a Ka-band outdoor antenna (RF amplifier, up/down converter, reflector and feed) through the ViaSat-2 satellite (https://en.wikipedia.org/wiki/ViaSat-2) to the larger Ka-band gateway antenna. The terminal supports adaptive coding and modulation using 16-APSK, 8-PSK and QPSK (forward) at 10 to 52 MSym/s and 8-PSK, QPSK and BPSK (return) at 0.625 to 20 MSym/s.

The connected Viasat service plan provides a peak data rate of 144 Mb/s.

The gateway does per-client queue management for traffic destined for the client, where the queue can grow up to 36 MBytes, allowing a maximum queuing delay of about 2 seconds at the peak data rate. Queue lengths are controlled by Active Queue Management (AQM) that starts to randomly drop incoming packets when the queue grows over half of the limit (i.e., 18 MBytes).

Wireshark captures all packet header data on each server and the client.

The performance enhancing proxy (PEP) that Viasat deploys by default is disabled for all experiments.
B. Baseline

For the network baseline, we run UDP Ping (http://perform.wpi.edu/downloads/#udp) consecutively for 1 week from a server to the client. UDP Ping sends one 20-byte packet every 200 milliseconds (5 packets/s) round-trip from the server to the client and back, recording the round-trip time for each packet returned and the number of packets lost.
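For readers who want to reproduce a similar baseline, the Python sketch below approximates the same idea: one small probe every 200 ms against a UDP echo endpoint, recording per-probe round-trip times and counting unanswered probes as losses. It is only an approximation of UDP Ping, and the host name and port are placeholders.

    # Rough stand-in for the UDP Ping baseline measurement (not the WPI tool).
    import socket, struct, time

    SERVER = ("echo.example.net", 9000)     # hypothetical UDP echo server
    INTERVAL, TIMEOUT, COUNT = 0.200, 2.0, 25

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(TIMEOUT)
    rtts, lost = [], 0
    for seq in range(COUNT):
        payload = struct.pack("!Id", seq, time.time())   # sequence + send time
        sock.sendto(payload.ljust(20, b"\0"), SERVER)    # pad to a 20-byte probe
        try:
            data, _ = sock.recvfrom(64)
            sent = struct.unpack("!Id", data[:12])[1]
            rtts.append((time.time() - sent) * 1000.0)   # round-trip time in ms
        except socket.timeout:
            lost += 1                                    # unanswered probe
        time.sleep(INTERVAL)
    rtts.sort()
    print("lost:", lost, "of", COUNT,
          "median RTT (ms):", rtts[len(rtts) // 2] if rtts else None)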
C. Downloads

We compare the performance of the four congestion control algorithms: Cubic, BBR (version 1), Hybla and PCC [19]. The four servers are configured to provide for bulk-downloads via iperf3 (v3.3.1, https://software.es.net/iperf/), each server hosting one of our four congestion control algorithms.

Cubic, BBR and Hybla are used without further configuration. PCC is configured to use the Vivace-Latency utility function [20].

For all hosts, the default TCP buffer settings are changed on both the server and client so that flows are not flow-controlled and instead are governed by TCP's congestion window. This includes setting tcp_mem, tcp_wmem and tcp_rmem to 60 MBytes.
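For context, on Linux the congestion control algorithm can also be selected per connection with the TCP_CONGESTION socket option, provided the corresponding module (e.g., tcp_hybla) is loaded and allowed. The sketch below is illustrative only; the experiments instead configure each server system-wide, and the sysctl triples in the comment are our example of the 60 MByte buffer setting described above, not the exact values used.

    # Illustrative per-socket congestion control selection on Linux (not the
    # servers' actual configuration, which sets the system-wide default).
    #
    # Example buffer sysctls (min, default, max in bytes), mirroring the
    # 60 MByte setting described above:
    #   sysctl -w net.ipv4.tcp_wmem="4096 87380 62914560"
    #   sysctl -w net.ipv4.tcp_rmem="4096 87380 62914560"
    import socket

    def connect_with_cc(host, port, algorithm=b"hybla"):
        """Open a TCP connection using the named congestion control module."""
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, algorithm)
        s.connect((host, port))
        in_use = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
        print("congestion control in use:", in_use.strip(b"\0").decode())
        return s

Note that the congestion control choice matters on the sending side, which for these downloads is the server.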
The client initiates a connection to one server via iperf3, downloading 1 GByte of data, then proceeding to the next server. After cycling through each server, the client pauses for 1 minute. The process repeats a total of 80 times, thus providing 80 network traces of a 1 GByte download for each protocol over the satellite link. Since each cycle takes about 15 minutes, the throughput tests run for about a day total. We analyze results from a weekday in July 2020.
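A minimal sketch of that schedule is shown below, using iperf3 in reverse mode so that the remote server (the sender, and hence the endpoint whose congestion control matters) transmits 1 GByte to the client. The host names and the exact iperf3 invocation are our assumptions, not the authors' scripts.

    # Sketch of the measurement loop: 80 cycles, one 1 GByte download from each
    # of the four servers per cycle, then a 1 minute pause between cycles.
    import subprocess, time

    SERVERS = {                      # hypothetical host names, one per algorithm
        "cubic": "cubic.example.edu",
        "bbr":   "bbr.example.edu",
        "hybla": "hybla.example.edu",
        "pcc":   "pcc.example.edu",
    }

    for cycle in range(80):
        for name, host in SERVERS.items():
            log = f"{name}-{cycle:02d}.json"
            subprocess.run(["iperf3", "-c", host, "-R", "-n", "1G",
                            "--json", "--logfile", log],
                           check=False)   # keep cycling even if one run fails
        time.sleep(60)                    # 1 minute pause after each cycle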
IV. ANALYSIS

This section first presents network baseline metrics without any active TCP flows, followed by TCP download performance: overall, during steady state, and at start-up. To compare the TCP congestion control performance, we consider the metrics of throughput, delay (round-trip time) and loss (retransmissions) [21].

A. Network Baseline

We start by analyzing the network baseline loss and round-trip times, obtained on a "quiet" satellite link to our client, i.e., without any of our active bulk-downloads.

Figure 3 depicts analysis of about 2.5 days of round-trip times for the satellite network, obtained from UDP Ping measurements from our server to the client and back. Figure 3a shows the round-trip times over that time period, with one data point every 200 milliseconds, and Figure 3b shows the complementary cumulative distribution of the same data. Table I provides summary statistics.

Fig. 3. Baseline round-trip times: (a) delay (milliseconds) versus time; (b) distribution (1 − cumulative distribution versus delay in milliseconds).

TABLE I
BASELINE ROUND-TRIP TIME SUMMARY STATISTICS.
  mean      597.5 ms
  std dev   16.9 ms
  median    597 ms
  min       564 ms
  max       2174 ms

From the graphs and table, the vast majority (99%) of the round-trip times are between 560 and 625 milliseconds. However, the round-trip times have a heavy-tailed tendency, evidenced by the round-trip times extending from 625 ms to 1500 ms in Figure 3b and again from 1700 ms to 2200 ms on the bottom right. These high values show multi-second round-trip times can be observed on a satellite network even without any self-induced queuing. From the graph, there are no visual time-of-day patterns to the round-trip times.

In the same time period, only 604 packets are lost, or about 0.05%. Most of these (77%) are single-packet losses, with 44 multi-packet loss events, the largest 11 packets (about 2.2 seconds). There is no apparent correlation between these losses and the round-trip times (i.e., the losses do not seem to occur during the highest round-trip times observed). Note, these loss rates are considerably lower than the WINDS satellite loss of 0.7% reported by Obata et al. [7].

B. Representative Behavior

To compare the TCP congestion control protocols, we begin by examining the performance over time of a single flow that is representative of typical behavior for each protocol for our satellite connection. We analyze the throughput, round-trip time and retransmission rate, depicted in Figure 4, where each value is computed per second from Wireshark traces on the server and client.

TCP Cubic illustrates typical exponential growth in rates during start up, but exits slow start relatively early, about 15 seconds in, where throughput is far lower than the expected 100 Mb/s or more. Thus, it takes Cubic about 45 seconds to reach a more expected steady state throughput of about 100 Mb/s. During steady state (post 45 seconds) the AQM drops enough packets to keep Cubic from persistently saturating the queue, resulting in round-trip times of about 1 second. However, several spikes in transmission rates yield corresponding spikes in round-trip time above 3 seconds and retransmission rates above 20 percent.
Fig. 4. Stacked graph comparison for (a) Cubic, (b) BBR, (c) Hybla and (d) PCC. From top to bottom, the graphs are: throughput (Mb/s), round-trip time (milliseconds), and retransmission rate (percent). For all graphs, the x-axis is time (in seconds) since the flow started.

TCP BBR ramps up to higher throughput more quickly than Cubic, but this also causes high round-trip times and loss rates around 20 seconds in as it over-saturates the bottleneck queue. At steady state, BBR operates at a fairly steady 140 Mb/s, with relatively low loss and RTTs about 750 milliseconds as BBR keeps the queue below the AQM limit. However, there are noticeable dips in throughput every 10 seconds when BBR enters its PROBE_RTT state. In addition, there are intermittent round-trip time spikes and accompanying loss which occur when BBR enters PROBE_BW and increases its transmission rate for 1 round-trip time.

TCP Hybla ramps up quickly, faster than does Cubic, causing queuing at the bottleneck, evidenced by the high early round-trip times. However, there is little loss. At steady state Hybla achieves consistently high throughput, with a slight growth in the round-trip time upon reaching about 140 Mb/s. Thereupon, there is a slight upward trend to the round-trip time until the queue limit is reached, accompanied by some retransmissions.

TCP PCC ramps up somewhat slower than Hybla but faster than Cubic, causing some queuing and some loss, albeit less than BBR. At steady state, throughput and round-trip times are quite consistent, near the minimum round-trip time (around 600 milliseconds) and the expected maximum throughput (about 140 Mb/s).

C. Overall

We next evaluate overall throughput, computed per second over the entirety of each flow's download. Figure 5 depicts throughput boxplot distributions at different percentiles taken across all flows and grouped by protocol. The top left is the tenth percentile, the top right the 50% (or median), the bottom left the ninetieth percentile and the bottom right the mean. Each box depicts quartiles and median for the distribution.
Points higher or lower than 1.4× the inter-quartile range are outliers, depicted by the circles. The whiskers span from the minimum non-outlier to the maximum non-outlier. Table II provides the overall throughput summary statistics.

Fig. 5. Overall throughput distributions for 10%, 50%, 90% and mean.

TABLE II
OVERALL THROUGHPUT SUMMARY STATISTICS.
  Protocol   Mean (Mb/s)   Std Dev
  BBR        95.8          6.7
  Cubic      91.0          9.5
  Hybla      112.1         11.2
  PCC        91.8          10.2

From the figure and table, Cubic, BBR and PCC all suffer from low throughput distributions at the tenth percentile. This is attributed to a slower start-up phase for both BBR and Cubic, and the RTT probing phase by BBR. In contrast, Hybla has throughputs near 100 Mb/s even at the tenth percentile.

BBR and Hybla have similar throughput distributions at the median, but Cubic and PCC are still lower.

BBR has the highest throughput distributions at the ninetieth percentile, Cubic and Hybla are similar, and PCC trails by a bit. Both BBR and PCC have more variation in ninetieth percentile throughputs, evidenced by the larger boxes.

For the overall mean, Cubic and PCC are the lowest, with BBR a bit higher and Hybla the highest.

Since Cubic is the default TCP congestion control protocol for Linux and Windows servers, we compare the mean throughput for an alternate protocol choice (BBR, Hybla or PCC) to the mean of Cubic by doing independent, 2-tailed t tests (α = 0.05) with a Bonferroni correction, as well as compute the effect sizes. The effect size provides a quantitative measure of the magnitude of difference, in our case the difference of the means for two protocols. The Cohen's d effect size quantifies the difference in means in relation to the standard deviation. Generally, small effect sizes are anything under 0.2, medium is 0.2 to 0.5, large 0.5 to 0.8, and very large is above 0.8. The t test and effect size results are shown in Table III, with statistically significant differences marked with an asterisk.
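As an illustration of this comparison (not the authors' analysis scripts), the sketch below runs an equal-variance, two-tailed t test and computes Cohen's d for Hybla versus Cubic. The placeholder samples are drawn from normal distributions with the means and standard deviations of Table II; in the actual analysis the inputs would be the 80 per-download mean throughputs for each protocol.

    # Independent two-tailed t test (alpha = 0.05, Bonferroni-corrected for the
    # three comparisons against Cubic) plus Cohen's d, on placeholder samples.
    import numpy as np
    from scipy import stats

    def cohens_d(a, b):
        # Difference of means scaled by the pooled standard deviation.
        na, nb = len(a), len(b)
        pooled = np.sqrt(((na - 1) * np.var(a, ddof=1) +
                          (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2))
        return (np.mean(a) - np.mean(b)) / pooled

    rng = np.random.default_rng(0)
    cubic = rng.normal(91.0, 9.5, 80)     # placeholder samples (Mb/s), Table II
    hybla = rng.normal(112.1, 11.2, 80)   # placeholder samples (Mb/s), Table II

    t, p = stats.ttest_ind(hybla, cubic)  # equal-variance t test, df = 158
    alpha = 0.05 / 3                      # Bonferroni correction, 3 comparisons
    print(f"t(158)={t:.2f} p={p:.2g} significant={p < alpha} "
          f"d={cohens_d(hybla, cubic):.2f}")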
From the table, the mean overall throughputs for BBR and Hybla are statistically different than Cubic's, whereas PCC's is not. The effect size for BBR versus Cubic is moderate, and the effect size for Hybla versus Cubic is very large.

TABLE III
OVERALL THROUGHPUT EFFECT SIZE (VERSUS CUBIC).
  Protocol   t(158)   p        Effect Size
  BBR*       3.69     0.0003   0.58
  Hybla*     12.85    <.0001   2.03
  PCC        0.5133   0.6084   0.08

Figure 6 shows the cumulative distributions of the round-trip times taken over the entire download for each flow. The x-axis is the round-trip time in seconds computed from the TCP acknowledgments in the Wireshark traces, and the y-axis is the cumulative distribution. There is one trendline for each protocol. Table IV provides the overall round-trip time summary statistics.

Fig. 6. Overall round-trip time distributions.

TABLE IV
OVERALL ROUND-TRIP TIME SUMMARY STATISTICS.
  Protocol   Mean (ms)   Std Dev
  BBR        827         85.4
  Cubic      806         163.1
  Hybla      906         129.3
  PCC        722         49.6

From the table and figure, Hybla has a slightly higher distribution of round-trip times, with BBR and Cubic next, and PCC with the lowest round-trip time distribution. Conversely, Cubic and BBR have the heaviest tail distribution of RTTs, owing to the times they saturate the bottleneck queue. While BBR and Cubic have similar mean round-trip times, BBR's has considerably less variance. PCC has both a low round-trip time and a steady round-trip time, with the smallest variance.
Cubic does have RTT values over 10 seconds, trimmed off to the right of the graph.

Figure 7 shows the cumulative distributions of the retransmissions. The axes and data groups are as for Figure 6, but the y-axis is the percentage of retransmitted packets computed over the entire flow.

From the figure, BBR and Cubic have the highest distributions of retransmissions, caused by saturating the bottleneck queue and having packets lost. Hybla has the lowest distribution of retransmissions, almost always under 1%. PCC is in between, consistently having a 0.5% retransmission rate, although it has a maximum near 4%.

Fig. 7. Overall retransmission distributions.

D. Steady State

TCP's overall performance includes both start-up and congestion avoidance phases; the latter we call "steady state" in this paper. We analyze steady state behavior based on the last half (in terms of bytes) of each trace.

Figure 8 depicts throughput comparisons for the steady state of all downloads for each protocol. The graphs are as in Figure 5, but only include the last half of all downloads. Table V shows the corresponding summary statistics.

Fig. 8. Steady state throughput distributions for 10%, 50%, 90% and mean.

TABLE V
STEADY STATE THROUGHPUT SUMMARY STATISTICS.
  Protocol   Mean (Mb/s)   Std Dev
  BBR        112.9         12.2
  Cubic      123.3         17.0
  Hybla      130.1         17.2
  PCC        112.6         17.9

From the graphs, BBR has the lowest distribution of steady state throughput at the tenth percentile. This is attributed to the round-trip time probing phase by BBR, which, if there is no change to the minimum round-trip time, triggers every 10 seconds, whereupon throughput is minimal for about 1 second. PCC's throughput at the tenth percentile is also a bit lower than Cubic's or Hybla's.

BBR, Cubic and Hybla all have similar median steady state throughputs, while PCC's is a bit lower.

BBR has the highest distribution of throughput at the ninetieth percentile, followed by Cubic, Hybla and PCC. Hybla's distribution here is the most consistent (as seen by the small box), while PCC's is the least.

From the table, Hybla has the highest mean steady state throughput, followed by Cubic, and then BBR and PCC are about the same. BBR steady state throughput varies the least.

Table VI is like Table III, but for steady state. From the table, the mean steady state throughputs are all statistically significantly different than Cubic's. BBR and PCC have lower steady state throughput than Cubic with a large effect size. Hybla has a higher throughput than Cubic with a moderate effect size.

TABLE VI
STEADY STATE THROUGHPUT EFFECT SIZE (VERSUS CUBIC).
  Protocol   t(158)   p        Effect Size
  BBR        4.44     <.0001   0.7
  Hybla      2.51     0.0129   0.4
  PCC        3.88     0.0002   0.6

Figure 9 shows the cumulative distributions of the round-trip times during steady state. The axes and data groups are as in Figure 6, but taken only for the second half of each flow. Table VII shows the summary statistics.

TABLE VII
STEADY STATE ROUND-TRIP TIME SUMMARY STATISTICS.
  Protocol   Mean (ms)   Std Dev
  BBR        780         125.1
  Cubic      821         206.4
  Hybla      958         142.1
  PCC        685         73.1

The steady state trends are similar in many ways to the overall trends, with Hybla typically having round-trip times about 200 milliseconds higher than any other protocol.
PCC has the lowest and steadiest round-trip times, near the link minimum. BBR and Cubic are in between, with BBR being somewhat lower than Cubic and a bit steadier. Cubic, in particular, has a few cases with extremely high round-trip times. Across all flows, about 5% of the round-trip times are 2 seconds or higher.

Fig. 9. Steady state round-trip time distributions.

Figure 10 shows the cumulative distributions of the retransmissions during steady state. The axes and data groups are as for Figure 7, but the y-axis is the percentage of retransmitted packets computed over just the second half of each flow.

From the figure, Cubic has the highest retransmission distribution and Hybla the least. BBR and PCC are in between, with BBR moderately higher but PCC having a much heavier tail. Hybla and PCC are consistently low (0% loss) for about 3/4ths of all runs, compared to only about 20% for BBR and Cubic.

Fig. 10. Steady state retransmission distributions.

E. Start-Up

We compare the start-up behavior for each protocol by analyzing the first 30 seconds of each trace, approximately long enough to download 50 MBytes on our satellite link. This is indicative of protocol performance for some short-lived flows, such as for a large Web page.

The average Web page size for the top 1000 sites worldwide was around 2 MBytes as of 2018, a steady increase over previous years [22]. This includes the HTML payloads, as well as all linked resources (e.g., CSS files and images). The distribution's 95th percentile was about 6 MBytes and the maximum was about 29 MBytes (a shared-images Website). Today's average total Web page size is probably about 5 MBytes [23], dominated by images and video. Note, these sizes are upper bounds for the average download sizes from a Web server since the individual components of a page are obtained from different servers [24], hence different TCP flows.

Many long-lived TCP flows carry video content and these may be capped by the streaming video rate, which itself depends upon the video encoding. However, assuming videos are downloaded completely, about 90% of YouTube videos are less than 30 MBytes [25].

Traditionally, the initial congestion window for TCP is one or two maximum segment sizes (MSS) (subsequently four [26]), and during slow-start, the window increases by the MSS for each non-duplicate ACK received [27]. While there are advantages to throughputs from a larger initial window, there are risks to overshooting the slow start threshold and overflowing router queues [8].

The initial congestion window settings can vary from server to server, ranging from 10 to 50 MSS for major CDN providers [28]. The default initial window in Linux since kernel version 2.6 in 2011 is 10. Hybla multiplies the initial congestion window by ρ from Equation 3 based on the TCP handshake round-trip time measurement; for our testbed, this is an increase of about 25x over the default, so a starting window of 50. Our servers use an initial congestion window of 10 for both BBR and Cubic, with Hybla adjusting its initial congestion window size as above. PCC's initial congestion window over our satellite connection is about 25.

Figure 11 depicts the time that would have been required to download an object of the size on the y-axis (in MBytes), with the required time on the x-axis (in seconds). The object size increment is 1 MByte. Each point is the average time required by a protocol run to download an object of the indicated size, shown with a 95% confidence interval.

Fig. 11. Download time versus download object size.
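A sketch of how such download times can be read off a trace is shown below: given cumulative byte counts over time for a flow (as computed from the Wireshark captures), the time to deliver an object of size S is the first time the cumulative count reaches S. The trace format here is our assumption, not the authors' analysis scripts.

    # Time to "download" an object of a given size, derived from one flow's
    # trace of (elapsed_seconds, cumulative_bytes) samples (e.g., per second).
    def download_time(trace, object_bytes):
        prev_t, prev_b = 0.0, 0
        for t, b in trace:
            if b >= object_bytes:
                # Linear interpolation within the last sample interval.
                frac = (object_bytes - prev_b) / max(b - prev_b, 1)
                return prev_t + frac * (t - prev_t)
            prev_t, prev_b = t, b
        return None   # the flow never delivered that many bytes

    # Example with a made-up ramp of 1 MByte per second for 30 seconds.
    trace = [(t, t * 1_000_000) for t in range(1, 31)]
    print(download_time(trace, 5_000_000))   # 5.0 seconds for a 5 MByte object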
From the figure, for the smallest objects (1 MByte), Hybla and PCC download the fastest, about 4 seconds, owing to the larger initial congestion windows they both have; both PCC and Hybla have initial congestion windows about 2.5x to 5x larger than either BBR or Cubic. In general, Hybla downloads small objects fastest, followed by PCC up to about 20 MBytes, then BBR and Cubic. After 20 MBytes, BBR downloads objects faster than PCC. For an average Web page download (5 MBytes), Hybla takes an average of 4 seconds, PCC 7 seconds, BBR 10 seconds and Cubic 13 seconds. For 90% of all videos and the largest Web pages (30 MBytes), Hybla takes about 8 seconds, BBR and PCC about twice that and Cubic about thrice.

Table VIII presents the summary statistics for the first 30 seconds of each flow for each protocol. During startup, Cubic has a low round-trip time, mostly because it takes a long time to ramp up its data rate, hence a low throughput. BBR has the highest round-trip time despite not having the highest throughput; the highest throughput is Hybla's, despite Hybla having a lower round-trip time than BBR. PCC has average throughputs and round-trip times, but the steadiest round-trip times.

TABLE VIII
START-UP SUMMARY STATISTICS.
             Tput (Mb/s)        RTT (ms)
  Protocol   Mean    Std Dev    Mean   Std Dev
  BBR        23.1    1.8        917    42.9
  Cubic      16.6    0.3        757    22.3
  Hybla      40.8    2.9        799    130.8
  PCC        20.3    1.6        806    15.1

Table IX is like Table III, but for start-up (the first 30 seconds). From the table, the start-up throughputs are all statistically significantly different than Cubic's. The effect sizes for throughput for PCC, BBR and Hybla are all very large compared with Cubic.

TABLE IX
START-UP THROUGHPUT EFFECT SIZE (VERSUS CUBIC).
  Protocol   t(158)   p        Effect Size
  BBR        31.9     <.0001   5
  Hybla      74.2     <.0001   12
  PCC        20.3     <.0001   3.2

F. Power

In addition to examining throughput and round-trip time separately, it has been suggested that throughput and delay can be combined into a single "power" metric by dividing throughput by delay [21], the idea being that the benefit of higher throughput is offset by the cost of higher delay, and vice-versa.
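For example, using the overall means from Tables II and IV, power can be computed directly (a small sketch; only the table values come from the measurements):

    # Power = mean throughput (Mb/s) divided by mean round-trip time (seconds),
    # computed here from the overall means in Tables II and IV.
    throughput_mbps = {"BBR": 95.8, "Cubic": 91.0, "Hybla": 112.1, "PCC": 91.8}
    rtt_seconds     = {"BBR": 0.827, "Cubic": 0.806, "Hybla": 0.906, "PCC": 0.722}

    for proto in throughput_mbps:
        power = throughput_mbps[proto] / rtt_seconds[proto]
        print(f"{proto:5s} overall power = {power:5.1f}")   # PCC is highest, ~127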
Doing power analysis using the mean throughput (in Mb/s) and delay (in seconds) for each protocol for each phase (start-up, steady state and overall) yields the numbers in Table X. The protocol with the most power in each phase is marked with an asterisk.

TABLE X
TCP POWER – THROUGHPUT ÷ DELAY.
             Power (Mb/s ÷ s)
  Protocol   Overall   Steady   Start-up
  BBR        115       144      35
  Cubic      112       150      22
  Hybla      123       136      51*
  PCC        127*      165*     25

From the table, overall, Cubic has the least power owing to its low throughput and moderate round-trip times. PCC has the most power, with moderate throughputs but relatively low round-trip times. BBR is only slightly better than Cubic, whereas Hybla is almost as good as PCC.

During steady state, PCC remains the most powerful based on high throughput with the lowest round-trip times. Cubic is more powerful than BBR or Hybla since it has good throughput and round-trip times, whereas BBR is deficient in throughput and Hybla in round-trip times.

At start-up, Hybla has the most power by far, primarily due to its high throughput. BBR has moderate power, while Cubic and PCC are similar at about half the power of Hybla.

V. CONCLUSION

Satellite Internet connections are important for providing reliable, ubiquitous network connectivity, especially for hard to reach geographic regions and when conventional networks fail (e.g., in times of natural or human-caused disaster). While research in satellites has increased satellite network throughputs, the inherent latencies satellites bring are a challenge to TCP connections. Moreover, conventional approaches that split a satellite's TCP connection via a middle-box Performance Enhancing Proxy (PEP) are ineffective for encrypted TCP payloads and for the increasingly popular QUIC protocol. Alternate TCP congestion control algorithms, such as BBR or Hybla instead of the default Cubic, can play a key role in determining performance in latency-limited conditions with congestion. However, to date, there are few published research papers detailing TCP congestion control performance over actual satellite networks.

This paper presents results from experiments on a production satellite network comparing four TCP congestion control algorithms: the two dominant algorithms, Cubic and BBR, a commercial implementation of PCC, as well as the satellite-tuned algorithm Hybla. These algorithms together represent different approaches to congestion control: default AIMD-based (Cubic), bandwidth estimation-based (BBR), utility function-based (PCC), and satellite-optimized for startup (Hybla). Results from 80 downloads for each protocol, interlaced so as to minimize temporal differences, are analyzed for overall, steady state and start-up performance. Baseline satellite network results are obtained by long-term round-trip analysis in the absence of any other traffic.
Overall, the production satellite link has consistent baseline round-trip times near the theoretical minimum (about 600 milliseconds) and very low (about a twentieth of a percent) loss rates. For TCP downloads, during steady state, the four protocols evaluated (Cubic, BBR, Hybla and PCC) all have similar median throughputs, but Hybla and Cubic have slightly higher mean throughputs owing to BBR's bitrate reduction when probing for minimal round-trip times (probing lasts about a second and happens about once every 10 seconds). During start-up, Hybla's higher throughputs allow it to complete small downloads (e.g., Web pages) about twice as fast as BBR (∼5 seconds versus ∼10), while BBR is about 50% faster (10 seconds versus 15 seconds) than Cubic. Hybla is able to avoid some of the high retransmission rates brought on when Cubic and BBR, and to a lesser extent PCC, saturate the bottleneck queue. However, as a cost, Hybla has a consistently higher round-trip time, an artifact of more packets in the bottleneck queue, while PCC has the least. Combining throughput and round-trip time into one "power" metric shows PCC is the most powerful, owing to high throughputs and steady, low round-trip times.

There are several areas we are keen to pursue as future work. Settings to TCP, such as the initial congestion window, may have a significant impact on performance, especially for small object downloads. Since prior work has shown TCP BBR does not always share a bottleneck network connection equitably with TCP Cubic [29], future work is to run multiple-flow combinations with homogeneous and heterogeneous congestion control protocols over the satellite link. Lastly, experiments with the increasingly popular QUIC protocol are warranted since QUIC can use BBR and has encrypted payloads, making it difficult to use with satellite PEPs.

ACKNOWLEDGMENTS

Thanks to Amit Cohen, Lev Gloukhenki and Michael Schapira of Compira Labs for providing the implementation of PCC.

REFERENCES

[1] S. I. Association, "Introduction to the Satellite Industry," Online presentation: https://tinyurl.com/y5m7z77e, 2020.
[2] Cisco, Interface and Hardware Component Configuration Guide, Cisco IOS Release 15M&T, Cisco Systems, Inc., 2015, chapter: Rate Based Satellite Control Protocol.
[3] S. Ha, I. Rhee, and L. Xu, "CUBIC: A New TCP-Friendly High-Speed TCP Variant," ACM SIGOPS Operating Systems Review, vol. 42, no. 5, 2008.
[4] N. Cardwell, Y. Cheng, C. S. Gunn, S. H. Yeganeh, and V. Jacobson, "BBR: Congestion-based Congestion Control," Communications of the ACM, no. 2, pp. 58-66, Jan. 2017.
[5] N. Cardwell, Y. Cheng, S. H. Yeganeh, and V. Jacobson, "BBR Congestion Control," IETF Draft draft-cardwell-iccrg-bbr-congestion-control-00, Jul. 2017.
[6] F. Li, J. W. Chung, X. Jiang, and M. Claypool, "TCP CUBIC versus BBR on the Highway," in Proceedings of the Passive and Active Measurement Conference (PAM), Berlin, Germany, Mar. 2018.
[7] H. Obata, K. Tamehiro, and K. Ishida, "Experimental Evaluation of TCP-STAR for Satellite Internet over WINDS," in Proceedings of the International Symposium on Autonomous Decentralized Systems, Tokyo, Japan, Mar. 2011.
[8] C. Barakat, N. Chaher, W. Dabbous, and E. Altman, "Improving TCP/IP over Geostationary Satellite Links," in Proceedings of GLOBECOM, Rio de Janeiro, Brazil, Dec. 1999.
[9] S. Utsumi, S. Muhammad, S. Zabir, Y. Usuki, S. Takeda, N. Shiratori, Y. Kato, and J. Kim, "A New Analytical Model of TCP Hybla for Satellite IP Networks," Journal of Network and Computer Applications, vol. 124, Dec. 2018.
[10] Y. Wang, K. Zhao, W. Li, J. Fraire, Z. Sun, and Y. Fang, "Performance Evaluation of QUIC with BBR in Satellite Internet," in Proceedings of the 6th IEEE International Conference on Wireless for Space and Extreme Environments (WiSEE), Huntsville, AL, USA, Dec. 2018.
[11] V. Arun and H. Balakrishnan, "Copa: Practical Delay-Based Congestion Control for the Internet," in Proceedings of the Applied Networking Research Workshop, Montreal, QC, Canada, Jul. 2018.
[12] M. Dong, Q. Li, D. Zarchy, P. B. Godfrey, and M. Schapira, "PCC: Re-architecting Congestion Control for Consistent High Performance," in Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Oakland, CA, USA, 2015.
[13] C. Caini and R. Firrincieli, "TCP Hybla: a TCP Enhancement for Heterogeneous Networks," International Journal of Satellite Communications and Networking, vol. 22, no. 5, pp. 547-566, Sep. 2004.
[14] Y. Cao, A. Jain, K. Sharma, A. Balasubramanian, and A. Gandhi, "When to Use and When not to Use BBR: An Empirical Analysis and Evaluation Study," in Proceedings of the Internet Measurement Conference (IMC), Amsterdam, Netherlands, Oct. 2019.
[15] R. Ware, M. K. Mukerjee, S. Seshan, and J. Sherry, "Modeling BBR's Interactions with Loss-Based Congestion Control," in Proceedings of the Internet Measurement Conference (IMC), Amsterdam, Netherlands, Oct. 2019.
[16] B. Turkovic, F. A. Kuipers, and S. Uhlig, "Interactions Between Congestion Control Algorithms," in Proceedings of the Network Traffic Measurement and Analysis Conference (TMA), Paris, France, 2019.
[17] L. Brakmo, S. O'Malley, and L. Peterson, "TCP Vegas: New Techniques for Congestion Detection and Avoidance," ACM SIGCOMM Computer Communication Review, vol. 24, no. 4, 1994.
[18] A. Riddoch, A. Wilk, A. Vicente, C. Krasic, D. Zhang, and F. Yang, "The QUIC Transport Protocol: Design and Internet-Scale Deployment," in Proceedings of the ACM SIGCOMM Conference, Los Angeles, CA, USA, Aug. 2017.
[19] A. Cohen, "How Compira Solves the Last Mile for Streaming Media," Online: https://tinyurl.com/yymxtubj, Jan. 2020.
[20] M. Dong, T. Meng, D. Zarchy, E. Arslan, Y. Gilad, B. Godfrey, and M. Schapira, "PCC Vivace: Online-Learning Congestion Control," in Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Renton, WA, USA, Apr. 2018.
[21] S. Floyd, "Metrics for the Evaluation of Congestion Control Mechanisms," Internet Requests for Comments, RFC 5166, March 2008.
[22] Data and Analysis, "Webpages Are Getting Larger Every Year, and Here's Why it Matters," SolarWinds Pingdom. Online at: https://tinyurl.com/y4pjrvhl, November 15, 2018.
[23] T. Everts, "The Average Web Page is 3 MB. How Much Should We Care?" Speed Matters Blog. Online at: https://speedcurve.com/blog/web-performance-page-bloat/, August 9, 2017.
[24] M. Kaplan, M. Claypool, and C. Wills, "How's My Network? - Predicting Performance from within a Web Browser Sandbox," in Proceedings of the 37th IEEE Conference Workshop on Local Computer Networks (LCN), Clearwater, FL, USA, Oct. 2012.
[25] X. Che, B. Ip, and L. Lin, "A Survey of Current YouTube Video Characteristics," IEEE Multimedia, vol. 22, no. 2, April-June 2015.
[26] M. Allman, S. Floyd, and C. Partridge, "Increasing TCP's Initial Window," Internet Requests for Comments, RFC 2414, September 1998.
[27] M. Allman, V. Paxson, and W. Stevens, "TCP Congestion Control," Internet Requests for Comments, RFC 2581, April 1999.
[28] CDN Planet, "Initcwnd Settings of Major CDN Providers," Online: https://www.cdnplanet.com/blog/initcwnd-settings-major-cdn-providers/, Feb. 2017.
[29] S. Claypool, M. Claypool, J. Chung, and F. Li, "Sharing but not Caring - Performance of TCP BBR and TCP CUBIC at the Network Bottleneck," in Proceedings of the 15th IARIA Advanced International Conference on Telecommunications (AICT), Nice, France, Aug. 2019.
