Delphi: A Software Controller For Mobile Network Selection

This document summarizes a technical report about Delphi, a software controller for mobile devices that helps applications select the best available network. Delphi considers objectives like transfer completion time, energy efficiency, and data costs. It uses a traffic profiler, network monitor, performance predictor, and network selector. Evaluation shows Delphi can reduce transfer times by 46-49% compared to always using Wi-Fi, and achieve 1.9x higher throughput while using only 6% more energy.


Computer Science and Artificial Intelligence Laboratory

Technical Report

MIT-CSAIL-TR-2016-004 February 25, 2016

Delphi: A Software Controller for Mobile Network Selection
Shuo Deng, Anirudh Sivaraman, and Hari Balakrishnan

Massachusetts Institute of Technology, Cambridge, MA 02139 USA, www.csail.mit.edu
MIT Computer Science and Artificial Intelligence Lab, Cambridge, Massachusetts, U.S.A.
{shuodeng, anirudh, hari}@csail.mit.edu

ABSTRACT

This paper presents Delphi, a mobile software controller that helps applications select the best network among available choices for their data transfers. Delphi optimizes a specified objective such as transfer completion time, or energy per byte transferred, or the monetary cost of a transfer. It has four components: a performance predictor that uses features gathered by a network monitor, and a traffic profiler to estimate transfer sizes near the start of a transfer, all fed into a network selector that uses the prediction and transfer size estimate to optimize an objective.

For each transfer, Delphi either recommends the “best” single network to use, or recommends Multi-Path TCP (MPTCP), but crucially selects the network for MPTCP’s “primary subflow”. The choice of primary subflow has a strong impact on the transfer completion time, especially for short transfers.

We designed and implemented Delphi in Linux. It requires no application modifications. Our evaluation shows that Delphi reduces application network transfer time by 46% for Web browsing and by 49% for video streaming, compared with Android’s default policy of always using Wi-Fi when it is available. Delphi can also be configured to achieve high throughput while being battery-efficient: in this configuration, it achieves 1.9x the throughput of Android’s default policy while only consuming 6% more energy.

1. INTRODUCTION

With the proliferation of Wi-Fi and cellular networks, mobile users and applications often have multiple wireless networks at their disposal. Until recently, deciding which network to use was easy because of the wide disparity in link rates between Wi-Fi and wide-area cellular networks like EDGE or 3G, as well as the difference in the monetary cost: Wi-Fi is usually free, but cellular data plans were usage-based.

The economics of cellular data plans are changing. After being offered in 2007, “unlimited” plans were halted in 2011 by several carriers (although pre-existing users could hold on to them). Since 2013, however, unlimited plans have made a resurgence especially in “Tier-2” operators, where 45% of the users have such plans today [36]. In addition, an increasing number of major app providers like Facebook, Google, and WhatsApp, have proposed and are deploying “zero rating” plans so that mobile device users will not be charged when these apps generate cellular traffic [43].

These trends indicate that for many users and applications, the choice of which network to use will not depend solely on monetary cost. Performance also matters: LTE and various “4G” standards being deployed around the world now exhibit rates competitive with, or even far exceeding, Wi-Fi networks. In any given situation, it is no longer easy to tell which network will perform the best for a given application, because latency and throughput vary with time and location. This state of affairs is likely to continue in the future as both local-area and wide-area technologies will continue to exhibit higher rates and significant variability.

Today’s mobile operating systems typically hard-code the decision of which network to use when confronted with multiple choices. If the user has previously associated with an available Wi-Fi network, they use that over a cellular option. This choice often leads to frustrating results. For example, when walking outdoors, users often find their device connecting to a Wi-Fi access point inside a building and experiencing poor performance when the right answer is to use the cellular network. Even inside homes and buildings, a static choice is not always the best: there are rooms where the Wi-Fi network might be much slower than the cellular network, depending on other users, time-of-day, and other factors.

A recent paper presented the results of empirical measurements of Wi-Fi and LTE/4G networks [13]. The conclusions from this study were that 73% of the time, the throughput of one of the networks was higher than the other by at least 1 Mbit/s, with each network dominating the other almost equally. The study also found that Multi-Path TCP (MPTCP) [39], which uses multiple interfaces whenever possible, did not always out-perform single-path TCP.

These conclusions left a key question open: how do we design a practical solution for mobile devices to select the best network for applications? This question is important for both single-path and Multi-Path TCP transfers.

In this paper, we present Delphi, a software controller that solves this problem. Our starting point is from the perspective of users and applications, rather than the transport layer or the network. Depending on the objectives of interest, Delphi makes different decisions about which network to use and in what order. The way we formulate the objective function takes into consideration 1) transfer completion time, 2) energy efficiency and 3) monetary cost, which is still a major concern for limited data plan users.

Delphi has four components:
1. A Traffic Profiler (Section 4) that provides an estimate of the length of transfers.
2. The Network Monitor (Section 5) uses passive observations of wireless network properties such as the RSSI and channel quality, lightweight active probes, and adaptive active probing triggered when passive indicators suggest a significant change in conditions.
3. A Network Performance Predictor (Section 6), which estimates the latency and throughput of different networks by running a machine-learning algorithm using features obtained from the Network Monitor.
4. A Network Selector (Section 7) that uses information from the above components to select the best network.

We have implemented Delphi for Linux; using it requires no changes to applications. We evaluated Delphi using trace-driven simulations (Section 7.3) and real-world experiments (Section 9.2). For trace-driven simulation, we conducted measurements in 22 locations on the east and west coasts of the United States, collecting both single-path TCP and MPTCP performance results for different transfer sizes over Wi-Fi and LTE (Verizon), as well as other measurement data such as Wi-Fi RSSI, LTE signal strength, ping RTT, etc. In our real-world experiments, we run Delphi with unmodified applications. We also test Delphi when the user is moving.

Our results are as follows:
1. In the trace-driven simulations (Section 7.3), Delphi improved the median throughput by 2.1x compared with always using Wi-Fi, the default policy on Android devices today. Delphi can also achieve high throughput while being energy efficient: in this mode, it achieved 1.9x higher throughput while consuming only 6% more energy compared with Android’s default policy.
2. When running with unmodified applications, Delphi reduces the network transfer time by 46% for web browsing and by 49% for video streaming (Section 9.1), compared with Android’s default policy. For file downloading (Section 9.2), Delphi increases average throughput by between 1.25x and 4x.
3. Delphi is also proactive in switching networks (Section 9.3) when the device is moving. It can detect that the network currently in use is performing worse than the alternatives, and can switch before the connection breaks. In our experiments, Delphi switches networks 30 seconds earlier than the MPTCP handover mode proposed in [27].

2. RELATED WORK

We discuss related work on mobile network selection policies, MPTCP, scheduling algorithms that generalize processor sharing [28] to multiple interfaces, roaming mechanisms to seamlessly migrate between interfaces, and systems and APIs that allow applications to benefit from multiple interfaces.

Mobile network selection. Zhao et al. [44] present a system that picks from one of three choices for every flow: regular IP, Mobile IP [31] with triangle routing, and Mobile IP with bidirectional tunneling. Instead of selecting an entire path within the Internet, as Zhao et al. do, Delphi picks either an LTE link or a Wi-Fi link for the last hop alone. CoolSpots [30] and SwitchR [1] address the question of network selection between Wi-Fi and Bluetooth networks available on the phone. In contrast, Delphi chooses between Wi-Fi and LTE on the last hop using different techniques. MultiNets [24] proposes a mechanism to allow smartphones to use multiple networks based on certain policies, such as energy saving, data offloading, and performance. However, MultiNets explicitly assumes that Wi-Fi is faster than the cellular link, which no longer holds [13]. ATOM [17] is a traffic management system allocating mobile devices’ traffic between LTE and Wi-Fi networks operated by the same service provider. ATOM’s selection decision was made at the service provider side in a centralized way. However, Delphi is able to make selections across different Wi-Fi and LTE providers, in a distributed approach, where the mobile devices make the decision.

Theoretical work on this problem includes multi-attribute decision making [7], game theory and reinforcement learning [25], and analytic hierarchy processes [34]. These are primarily evaluated in simulation using simplified models of the network and workloads. In contrast, our evaluation consists of trace-driven simulations and real-world experiments with traffic from unmodified applications.

Multi-Path TCP. Multi-Path TCP (MPTCP) [39], and its recent implementation in iOS 7 [20] allow a single TCP connection to use multiple paths. MPTCP does not specify if interfaces should be used simultaneously, or in master-backup mode. The iOS implementation operates in master-backup mode using Wi-Fi as the primary path, falling back to a cellular path only if Wi-Fi is unavailable. Other implementations, such as the default mode in Linux, use all available interfaces in “striped” mode.¹ Delphi can be viewed as specifying an MPTCP network-selection policy when operating on mobile networks. The choice between a cellular link and Wi-Fi is necessarily dynamic in such cases and a static policy such as the one in Android (use Wi-Fi if it’s available) does not suffice.

Processor sharing for multiple interfaces. Recent work [42] extends generalized processor sharing [28] to multiple interfaces. In follow-up work [41], the authors also propose scheduling packets over multiple interfaces while respecting relative preferences (e.g. Dropbox should get twice the throughput of Netflix) and absolute preferences (e.g. give YouTube at least 5 Mbps). These algorithms operate on every packet, while Delphi is invoked only when a flow is created.

Roaming mechanisms. Mobile IP [31] and end-to-end alternatives [33, 38] allow a mobile device to freely roam between networks without disconnecting connections. Multi-Path TCP [39] supports break-before-make semantics as well: an MPTCP connection can have no active subflows for a short duration before a new subflow is created and attached to the connection. These mechanisms are complementary to Delphi, and Delphi can determine the network-selection policy while retaining the roaming mechanisms of the underlying transport protocol.

¹ “Striped” mode denotes that packets are striped across both interfaces with one being a primary interface and is the mode in which we use MPTCP for the rest of the paper.
Systems and APIs to exploit multiple interfaces. The idea of using multiple networks for increased capacity and fault-tolerance has attracted significant attention from researchers over the past decade. Early work [6] shows the benefits of combining multiple networks and shows use cases where selecting the right network can reduce energy consumption, enhance network capacity, and manage mobility.

FatVAP [16] and MultiNet [10] improve throughput by allowing a single Wi-Fi card to connect to multiple APs. COMBINE [4] improves individual device throughput by leveraging the wireless wide area network of neighboring devices. Blue-Fi [5] is a system that uses Bluetooth and cellular tower information to predict whether Wi-Fi is available to reduce Wi-Fi duty cycle. Airdrop [2], a feature of Apple OS X, allows users to share files over both Wi-Fi and Bluetooth. However, it is designed explicitly for the purpose of file sharing (a long-running flow), while our system focuses on the more common case of both short and long flows on mobile devices today.

Contact Networking [8] provides localized network communication between devices with multiple networks and focuses on designing mechanisms that enable lightweight neighbor discovery, name resolution, and routing. Intentional Networking [14] provides APIs that allow apps to label their network flows. The labels include background or foreground to specify whether the flow is delay-tolerant, and large or small to specify the amount of data to be transmitted. Intentional Networking uses a connection scout that probes network conditions periodically, an overhead Delphi can avoid by using passive measurement only. We consider these abstractions orthogonal because Delphi’s network selection policy is agnostic to the API exposed by the system to applications. Delphi can be used as a decision-making module to select network interfaces within these systems.

3. OVERVIEW

Delphi uses three pieces of information to select the network(s) to use for a data transfer:
1. App traffic profile: how much data does a transfer send or receive, for example, as part of an HTTP transaction.
2. Network conditions for the wireless interfaces, for example, channel quality, current load in the network, end-to-end delay, etc. This information allows us to estimate higher-layer network performance metrics, such as flow completion time, average burst throughput, etc.
3. The objective function to be optimized, such as the flow completion time, energy per byte, or monetary cost for the transfer.

Figure 1 shows the four components in Delphi, which is implemented as a software controller between the application and transport layers. The Traffic Profiler (Section 4) estimates transfer sizes, and the Network Monitor (Section 5) collects data needed for the Predictor to predict current network performance. The Predictor (Section 6) feeds the prediction to the Network Selector (Section 7), which selects the network(s) to optimize the specified objective. Implementing this design requires no modification to applications (Section 8).

[Figure 1: Delphi design. Diagram labels: Mobile Device; applications (App 1, ...); Delphi, containing the Traffic Profiler, Network Monitor, Predictor, and Network Selector; Network Interfaces (Wi-Fi, LTE, ...).]

4. TRAFFIC PROFILER

A recent study [11] analyzed 90,000 Android apps and found that 70,000 of them used HTTP or HTTPS, and that among the 70,000, 70% of them used HTTP and not HTTPS. Thus, we design the Traffic Profiler by first focusing on HTTP.

When a mobile user downloads a file using an HTTP GET, the HTTP GET response header usually contains a “Content-Length” field specifying the length of the response. During uploads, the mobile device issues an HTTP POST whose “Content-Length” field specifies the transfer size. In both cases, the Content-Length field provides the relevant transfer size information to the Traffic Profiler readily.

                                    Count    Percentage
HTTP Transactions                   59679    100%
Transactions with Content-Length    50865    85%
Predictable Transactions            50613    84%
Chunked-Encoding Transactions       3559     6%

Table 1: HTTP transaction data lengths for the Alexa top-500 sites. 84% of the transfer lengths are predictable by the Traffic Profiler.

However, the “Content-Length” field is not mandatory for HTTP headers; for instance, HTTP transactions often use chunked encoding when the length of data to be transmitted is dynamic. To determine how many HTTP transactions contain the “Content-Length” header, we use a record-and-replay tool, Mahimahi [23], to record the HTTP requests and responses when loading the homepage of the Alexa top-500 websites [3]. The results are listed in Table 1. When each site is loaded once, the total number of HTTP transactions is 59679. We note that 84% of the transactions are predictable by the Traffic Profiler. Here, predictable means that the relative difference between the “Content-Length” value and the actual amount of data transmitted is less than 10%. Thus, using the “Content-Length” field to predict the size of the data transfer works for most (84%) HTTP transactions. For HTTPS, the header is encrypted. To be able to look into the headers, we could set up an SSL proxy on the mobile device and make all traffic go through it [35].

Delphi also provides an API for the application to let the Traffic Profiler know how much data it is going to transfer. Compared with the Traffic Profiler monitoring data transmissions on its own, this API allows the Traffic Profiler to unambiguously determine the amount of data the application is going to transfer. As shown in Section 6.2.1, providing an accurate transfer length helps Delphi to make a better network selection. This benefit can incentivize application developers to adopt the API for better performance, as well as better security guarantees (for applications using HTTPS).

The Traffic Profiler notifies the Predictor by sending it
(TCP_CONNECTION_ID, data_length, direction)
(direction means whether the device is downloading or uploading data) when any of the following events occur:
1. an API call occurs from the app,
2. a request to initiate a new TCP connection is observed,
3. an HTTP request/response is observed.

The Traffic Profiler may not be able to tell how much data is going to be transmitted in Case 2, and sometimes in Case 3 (e.g., chunked encoding). In such cases, the Traffic Profiler will simply return a data_length of 3 KB. Once chunked encoding is observed, the profiler updates the predicted transfer size to 100 KB. We chose these numbers because they are the median values of data transmission length observed in Alexa top-500 sites. Section 6.2.1 analyzes how these default data length values affect network selection results.

5. NETWORK MONITOR

The Network Monitor tracks a set of network-condition indicators for both Wi-Fi and LTE, and notifies the Predictor whenever an indicator value changes.

Category              Wi-Fi Indicators                    LTE Indicators
Passive Indicators    RSSI, Link Speed, Wi-Fi AP Count    Signal Strength, DBM, RSSNR, CQI, RSRP, RSRQ, Wi-Fi AP Count
Active Probing        (both networks) Max/Min/Mdev Ping RTT, DNS Lookup Time, UDP throughput, UDP loss rate, UDP packet inter-arrival time mean/median/90th percentile

Table 2: Network Indicators monitored by Delphi.

5.1 Network Indicators

Table 2 lists the indicators used by Delphi. These indicators are categorized into 2 sub-groups: passive measurement and active probing. The passive measurements capture channel quality and contention for the last-hop wireless link, which is often the bottleneck along both the forward and reverse paths between the mobile device and the Internet. However, this last-hop information does not always reflect network conditions along the whole path. For example, for LTE networks, the delay introduced by packet buffering at the cell tower side can be significant [15], but is not captured by last-hop passive measurements. Wi-Fi access in public areas such as shopping malls, airports, etc. may be subject to bandwidth limits introduced at the gateway to the Internet, and these are not captured by last-hop measurements. To capture these non-last-hop, or end-to-end network performance factors, Delphi also runs active probes between the mobile device and an Internet server (see Section 8).

To quantify how each indicator affects TCP throughput, we analyze data collected from 22 locations. Those locations included shopping malls, Wi-Fi-covered downtown areas, and university campuses, where both Wi-Fi and LTE were available. At each location, the total measurement time is at least 1 hour. We compute the Pearson Correlation [37] between the throughput and each indicator. The correlation is a number between -1 and 1, inclusive. A value close to 1/-1 means strong positive/negative correlation. A value close to 0 means weak correlation.

Figure 2 shows the absolute value of the correlation between throughput and each indicator. The bars in each sub-figure are sorted from strongest to weakest correlation. Figures 2a and 2b show the correlation over the entire dataset. For both Wi-Fi and LTE, among the most correlated indicators, we see both active probing indicators (such as Wi-Fi UDP throughput and LTE Average Ping RTT) and passive indicators. Figures 2c and 2d show the correlation values calculated using data collected at only one location. Compared to Figure 2a, at each location, the order of the correlation strength changes. Similar results can be seen in LTE and MPTCP analysis.

5.2 Adaptive Probing

As shown in Figure 2, actively probing the network can provide important information to estimate network performance. However, active probing can be expensive in terms of energy, bandwidth and delay. Table 3 summarizes the overhead in terms of delay, amount of data transferred, and energy consumption.

Probe Type                     DNS Query    10 Pings    UDP
Data Transferred (Bytes)       271          1K          200K
Wi-Fi Median Delay (Sec)       0.64         9.02        0.65
Wi-Fi Energy (mJ)              331          4730        366
Cellular Median Delay (Sec)    0.63         9.01        0.58
Cellular Energy (mJ)           1378         19697       1613

Table 3: Overhead for one occurrence of active probing. The delay values are the median value across all measurement data collected at 22 locations. The energy values are measured in an indoor setting. The cellular energy values do not contain tail-energy [12].

To reduce the probing overhead, the Network Monitor probes the network adaptively, only if there is a significant change in the passive indicators. Otherwise, it will reuse active probing information collected previously. To further reduce probing overhead, Delphi adaptively probes only when the mobile device’s screen is on, which suggests that the
user is currently interacting with the device, and is more sensitive to network delays. There are, of course, times when background applications also need low delays (e.g., a cloud-based navigation app), but in the common case, delay is less of a concern in such situations. For background transmissions, Delphi will only probe the network when both conditions occur: 1) there is a large change in the passive measurements; and 2) there is a data transmission request.

[Figure 2: Correlation between Wi-Fi/LTE single-path TCP throughput and each indicator. Panels: (a) Wi-Fi: All Data; (b) LTE: All Data; (c) Wi-Fi: Location 1; (d) Wi-Fi: Location 2. x-axis: Indicator Name; y-axis: abs(Correlation), from 0 to 1.0.]

Adaptive probing has two benefits:
1. It is energy-efficient compared with fixed-rate probing.
2. It is proactive compared with probing only on a transmission request, which would delay the request.

To evaluate adaptive probing, we simulate it using our collected data as follows. We take the first run’s passive and active measurement as input. Then, for the second run, we compare the passive measurement values with the first run. If the passive measurement difference d is less than a certain threshold Th, we keep the first run’s active probing values as our measured number, and use the second run’s active probing values as the ground truth to calculate an error e. If d is greater than Th, we do active probing again.

The definitions of d and e for a run r are:

$$d_r = \sum_{i=1}^{m} \frac{|p_{i,r} - p_{i,r-1}|}{p_{i,\max} - p_{i,\min}} \qquad (1)$$

$$e_r = \sum_{j=1}^{n} \frac{|\hat{a}_{j,r} - a_{j,r}|}{a_{j,r}} \qquad (2)$$

Here, $m$ is the total number of passive indicators. $p_i$ is the value of the passive indicator $i$. $p_{i,\max}$ and $p_{i,\min}$ are the max and min values for these indicators. $n$ is the total number of active indicators. $\hat{a}_{j,r}$ is the active probing value for indicator $j$ and $a_{j,r}$ is the ground truth value, in run $r$.

Figure 3 shows that as the probing threshold increases, fewer probes are triggered. Also, as the probing threshold increases, the error of reusing the previous active probing value increases. In Section 6, we will analyze the extent to which the choice of network is affected by this adaptive probing error.

6. PREDICTOR
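Looking back at Section 5.2, the adaptive-probing simulation and Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Delphi's implementation: the function names, the dictionary-based data layout, and the indicator names (`rssi`, `rtt`) are hypothetical, and the text does not say what happens when d equals Th exactly, so the sketch probes again in that case.

```python
from typing import Dict, List, Sequence, Tuple

Indicators = Dict[str, float]


def passive_difference(prev: Indicators, curr: Indicators,
                       ranges: Dict[str, Tuple[float, float]]) -> float:
    """Equation (1): range-normalized absolute change of each passive
    indicator between run r-1 and run r, summed over all indicators."""
    return sum(abs(curr[name] - prev[name]) / (hi - lo)
               for name, (lo, hi) in ranges.items())


def probing_error(reused: Indicators, truth: Indicators) -> float:
    """Equation (2): relative error of the reused active-probe values
    against the ground-truth values of the current run, summed."""
    return sum(abs(reused[name] - truth[name]) / truth[name]
               for name in truth)


def simulate(runs: Sequence[Tuple[Indicators, Indicators]],
             ranges: Dict[str, Tuple[float, float]],
             threshold: float) -> Tuple[int, List[float]]:
    """Replay (passive, active) measurements run by run.

    Returns the number of active probes issued and the list of errors
    incurred whenever an earlier probe result was reused.
    """
    probes = 1                      # the first run is always probed
    cached_active = runs[0][1]
    errors: List[float] = []
    for (prev_passive, _), (passive, active) in zip(runs, runs[1:]):
        d = passive_difference(prev_passive, passive, ranges)
        if d < threshold:
            # Passive indicators barely moved: reuse the cached probe.
            errors.append(probing_error(cached_active, active))
        else:
            # Assumption: d >= Th triggers a fresh probe.
            cached_active = active
            probes += 1
    return probes, errors


# Hypothetical data: one passive indicator (RSSI) and one active one (RTT).
example_ranges = {"rssi": (-90.0, -30.0)}
example_runs = [
    ({"rssi": -60.0}, {"rtt": 50.0}),
    ({"rssi": -63.0}, {"rtt": 40.0}),   # d = 0.05 -> reuse, e = 0.25
    ({"rssi": -45.0}, {"rtt": 20.0}),   # d = 0.30 -> probe again
]
print(simulate(example_runs, example_ranges, threshold=0.1))  # (2, [0.25])
```

Raising `threshold` in this sketch lowers the probe count and lets more stale values through, which is the trade-off Figure 3 reports.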
1.0 1.0
1 1.6
Wi-Fi Error
0.75 0.75
LTE Error 1.4
Active Probing Frequency

Probing Frequency

CDF

CDF
0.8 0.5 0.5

Active Probing Error


1.2
0.25 0.25
Tree Tree
0.6 1 SVR SVR
0 0
0 0.1 0.2 0.3 0.4 0.5 0 0.1 0.2 0.3 0.4 0.5
0.8
Relative Error Relative Error
0.4 0.6 (a) TCP over Wi-Fi (b) TCP over LTE
0.4
0.2 Figure 4: Relative error when using regression tree and sup-
0.2
port vector regression to learn flow-completion time. The
0 0 line marked with “Tree” shows the relative error of regression
0.5 1 2 4 8 16 32 64 128 256
tree model. The line with “SVR” shows the relative error of
Adaptive Probing Threshold (10-4)
support vector regression model.

Figure 3: As the adaptive probing threshold (x-axis) increases,


the number of probing decreases and the probing error in- time estimated for single-path TCP, because previous mea-
creases. Here the left y-axis shows the probing frequency, surement study [13] shows that throughputs of single-path
which is the average number of proving in every five minutes. TCP over two networks do not add up to the throughput of
MPTCP. Besides, there are cases where MPTCP over both
networks gives lower throughput than single-path TCP over a
Delphi’s Predictor takes the traffic profile and network sta- single network.
tus as input to estimate network performance. In this section, The prediction accuracy is affected by two factors:
we focus on how it estimates TCP flow completion time, 1. the predictive power of the machine learning model, i.e.,
because flow completion time is used in all our objectives whether regression trees are a good model for predicting
mentioned in Section 7. However, similar techniques can be flow completion time, and
used to estimate other metrics such as average throughput 2. the measurement accuracy of the inputs to the machine-
(for steaming applications) or average RTT (for interactive learning methods.
applications).
To predict TCP flow completion time, or related metrics 6.1 Regression Tree Prediction Accuracy
such as the end-to-end throughput for Wi-Fi and LTE net- Here, we use our dataset collected from 22 locations to
works, previous work either uses historical data from the same flow to predict current throughput [40], or outputs binary results such as high/low throughput [9]. However, for Delphi, the prediction is more challenging: first, it needs to make its decision just before the connection transfers data, and recent historical data may not always be available. Second, it needs to return numerical values instead of binary results to be fed into the Selector.

We use a machine learning model, Regression Tree [26], to estimate TCP flow-completion time. This learning method matches well with our problem definition, as it takes multi-dimensional vectors as input and produces a real-valued result. In our case, the multi-dimensional input includes data size and network condition indicators. Another advantage of using a regression tree is that by assigning different weights to different indicators when traversing through different branches, per-location differences (Figure 2) are captured naturally. Regression trees have been used to solve other network performance estimation problems [40] due to their low memory and computation overhead.

Delphi constructs four regression trees to predict the flow completion time for single-path TCP over Wi-Fi or LTE, and for MPTCP using Wi-Fi or LTE for the primary subflow. We estimate the flow completion time of MPTCP using separate trees, instead of deriving it from the flow completion

analyze Delphi’s prediction accuracy. To train the regression tree model, we randomly select 10% of the total samples from our dataset. We then use the remaining 90% as our testing data. As a comparison, we also trained support vector regression (SVR) models with a radial basis function (RBF) kernel [29], which also take a multi-dimensional vector as input and output numerical results. The SVR models also consist of four separate models, one for each network configuration. To train each SVR model, we first sort all the indicators from the most correlated to the least correlated with the flow completion time. Here, we compute the correlation across all the data, not just the 10% in the training set, so that the sorting would not be biased by the training set. Then, we train the SVR model using the N most correlated features. As N increases, the testing errors first decrease (because the model improves in predictive power) and then increase (because the model overfits to the training set). For each SVR model, we choose the N that gives the smallest error. Thus, the resulting models are the best that can be achieved using the SVR method given the features that we measure.

We test both models using the same testing set. Figure 4 shows the CDF of relative error between the learned result and the ground truth when predicting Wi-Fi and LTE. MPTCP predictions give similar results. Here, the regression-tree model predicts the flow completion time with smaller error than the

SVR model. Regression tree is more powerful because it is able to traverse through different paths in the tree when predicting different locations.

The median testing error ranges from 2% to 10% across the four regression tree models. Given that the training set only consists of 10% of the data, this shows that TCP flow completion time is predictable using regression trees.

6.2 Input Error Analysis

Section 6.1 shows the prediction error when the input feature vectors are accurate. As described earlier, the Traffic Profiler sometimes needs to guess the transfer size, and the Network Monitor may reduce active probing frequency to reduce energy and traffic overhead. Thus, the input to the regression tree is not always perfect. In this section, we investigate what impact these imperfections have.

6.2.1 Traffic Profiler Error

As mentioned in Section 4, when there is no “Content-Length” field specified in a burst of transmission, the Traffic Profiler returns an empirical number of 3 KB. Figure 5 shows the relative error using an inaccurate transfer size as a regression tree’s input (in Figure 5, we only show the results for TCP over LTE for clarity, but similar results hold for Wi-Fi and MPTCP). Here, we split the testing dataset into smaller subsets; each subset contains measurements done for a certain transfer size. As the difference between the actual transfer size and 3 KB increases, the relative error increases. The prediction error is significantly higher than the previous 2%-10%. Fortunately, however, fewer than 15% of transfers are affected by this error, as noted in Section 4.

[Figure 5 plot: CDF (y-axis) versus relative error (x-axis), one curve per actual flow size: 5 KB, 10 KB, 100 KB, 1 MB.]
Figure 5: Relative error when using an empirical flow size number (3 KB) to predict flow completion time. The legends are the actual flow sizes.

6.2.2 Network Monitor Error

Another source of error is adaptive probing. We run the regression tree testing over different values of $T_h$, which is an adaptive probing parameter. Figure 6 shows the CDF curves of relative errors for LTE. Here, as $T_h$ increases (i.e., probing less frequently), the prediction error increases. Wi-Fi and MPTCP predictions also give similar results.

[Figure 6 plot: CDF (y-axis) versus relative error (x-axis), one curve per $T_h$ value: 0, 0.0001, 0.0002, 0.0004, 0.0008, 0.0016, 0.0032.]
Figure 6: Relative error when using adaptive probing to predict flow completion time.

7. NETWORK SELECTOR

Delphi’s Selector uses network performance predictions, transfer lengths, and the specified objectives for application transfers to determine which network to use. The choice of network depends on three factors: throughput, energy efficiency, and monetary cost.

7.1 Objective Functions

Throughput, $S_i$: The Selector can estimate the average throughput of the current transfer using the transfer size $f$ that is provided by the Traffic Profiler, and the flow completion time $t$ that is provided by the Predictor. Using the subscript $i$ to refer to the choice of the network (either Wi-Fi or LTE or MPTCP with a specified primary subflow), we have:

$$S_i = \frac{f}{t_i} \tag{3}$$

Energy per byte, $E_i$: Knowing the power level $p_i$ of each network choice $i$, together with the above $t_i$ and $f$, the energy to transmit one byte is:

$$E_i = \frac{p_i \cdot t_i}{f} \tag{4}$$

The energy per byte is a metric that captures both battery consumption and transfer rate. A faster transfer over a network that has a higher power consumption may incur a lower energy per byte. As a result, minimizing energy per byte is not always the same as picking the network with lowest power consumption.

Monetary cost per byte, $M_i$: The “dollar” cost incurred when transferring each byte of the transfer on network choice $i$.

By knowing $S_i$, $E_i$ and $M_i$, the Selector chooses the network that maximizes the following objective function:

$$O_i = \frac{S_i^{\alpha}}{E_i^{\beta} \cdot M_i^{\gamma}} \tag{5}$$

This objective function prefers networks with high $S_i$ values, meaning in a given second, it wants the device to transfer more bytes. It also prefers networks with low $E_i$ values, meaning when consuming one Joule of energy, it wants the device to transfer more bytes. Similarly, it prefers networks with low

$M_i$ values, meaning when consuming one dollar, it wants the device to transfer more bytes.

The exponents α, β, γ ∈ [0, ∞) determine the relative importance of throughput, energy efficiency, and monetary cost respectively. For example, if α = 1, β = 0 and γ = 0, then $O_i = S_i$, and the Selector will select the network with the highest average throughput. This optimization is realistic if the device has an unlimited data plan and is connected to an AC power source. As another example, if α = 0, β = 1 and γ = 0, then $O_i = 1/E_i$, and the Selector will select the network that consumes the least amount of energy. This optimization is preferable when the device is about to run out of battery.

In our experiments, we set α, β, γ to different values to experiment with different scenarios. In a more realistic implementation, α, β and γ can be pre-defined by users, or decided by Delphi dynamically according to the phone’s current status. For example, β can increase as the battery level decreases. If the mobile device user has a limited monthly data plan for a cellular network, γ can increase as the amount of data plan consumed approaches the budget, and be fixed at a large number once the cellular usage exceeds the budget. Picking different values of α, β and γ makes the objective function expressive enough to handle a range of preferences.

[Figure 7 plot: throughput (Mbits/sec, y-axis) versus energy efficiency (Mbits/J, x-axis) for Max(S), Max(S/E), MPTCP(Wi-Fi), MPTCP(LTE), LTE and Wi-Fi, together with the Oracle Frontier line.]
Figure 7: Median megabits per Joule and throughput values for different schemes. The black line shows a frontier of the best that can be achieved when setting α = 1, γ = 0 and changing β from 0 to 5.

Objective                     Max(S)   Max(S/E)   Wi-Fi
Throughput (Mbits/sec)        3.0      2.6        1.4
Energy Efficiency (Mbits/J)   2.8      4.9        5.2

Table 5: Median values for Max(S), Max(S/E) and Wi-Fi as shown in Figure 7.
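
The selection rule of Section 7.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Delphi's actual code: the names `NetworkChoice`, `objective` and `select_network` are hypothetical, the completion times are made-up inputs standing in for the Predictor's output, the power levels are the receive values from Table 4, and the LTE tail energy of Section 7.2 is ignored.

```python
from dataclasses import dataclass


@dataclass
class NetworkChoice:
    name: str                   # e.g. "Wi-Fi", "LTE", "MPTCP(Wi-Fi)"
    completion_time_s: float    # t_i, predicted flow completion time (seconds)
    power_w: float              # p_i, power level (watts)
    cost_per_byte: float = 1.0  # M_i; placeholder, has no effect when gamma = 0


def objective(choice, size_bytes, alpha, beta, gamma):
    """Evaluate O_i = S_i^alpha / (E_i^beta * M_i^gamma) for one choice."""
    # Eq. (3): average throughput in bytes per second.
    throughput = size_bytes / choice.completion_time_s
    # Eq. (4): energy per byte in joules per byte.
    energy_per_byte = choice.power_w * choice.completion_time_s / size_bytes
    # Eq. (5): cost_per_byte must be positive whenever gamma > 0.
    return throughput ** alpha / (
        energy_per_byte ** beta * choice.cost_per_byte ** gamma
    )


def select_network(choices, size_bytes, alpha=1.0, beta=0.0, gamma=0.0):
    """Return the choice that maximizes the objective function."""
    return max(choices, key=lambda c: objective(c, size_bytes, alpha, beta, gamma))


# Hypothetical inputs: only the power levels come from Table 4.
choices = [
    NetworkChoice("Wi-Fi", completion_time_s=2.0, power_w=0.428),
    NetworkChoice("LTE", completion_time_s=0.8, power_w=1.674),
]
size = 1_000_000  # transfer size f in bytes

print(select_network(choices, size, alpha=1, beta=0).name)  # Max(S)
print(select_network(choices, size, alpha=0, beta=1).name)  # Max(1/E)
```

With these illustrative inputs, Max(S) picks LTE because it finishes sooner, while Max(1/E) picks Wi-Fi because its energy per byte is lower even though the transfer takes longer.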
7.2 Energy Model

An energy model is required to estimate $E_i$. We measured the power level of both Wi-Fi and LTE by connecting the phone to a power monitor [18]. We measured the power level during both TCP uploads and downloads using Wi-Fi and LTE. We also measured the power level across different transmission rates for both Wi-Fi and LTE, by changing the underlying channel quality. To do so, for Wi-Fi, we change the distance between the phone and the access point. For LTE, we change the channel quality by moving into and out of buildings.

Interface           LTE          Wi-Fi
Send Tput (Mbps)    2.3 to 4.4   7.4 to 9.4
Send Power (mW)     2778 ± 56    536 ± 23
Recv Tput (Mbps)    0.1 to 1.5   7.3 to 9.5
Recv Power (mW)     1674 ± 95    428 ± 58

Table 4: Power level measurements for Wi-Fi and LTE.

Table 4 shows the results. For each network, both while uploading and downloading data, the power level stays the same regardless of the data rate. Hence, we model the power $p_i$ for each network using two constant numbers (one for upload and one for download). However, for LTE, there is an additional “tail energy” [12] component that is consumed after a transfer finishes, which we treat as a constant number. As a result, when estimating the energy consumed for LTE, Delphi adds this tail energy to the total energy consumed.

7.3 Selector Performance

To quantify the Selector’s performance, we run simulations using data that we collected from 22 locations. We wrote a custom-built simulator that operates on this data set as described below. The dataset collected earlier maps a feature vector (made up of all the features listed in Table 2) to a tcpdump trace captured when running standard TCP on Wi-Fi and LTE and a tcpdump trace for MPTCP in striped mode using Wi-Fi as the primary subflow and using LTE as the primary subflow.

When evaluating any scheme, we assume all the features required by Delphi are available at the beginning of each run. We run Delphi’s selection algorithms described earlier using the features collected at the beginning of the run along with flow size as input. We then pick the network interface that maximizes the objective function described above and look at the previously collected tcpdump trace to determine the duration from the beginning of the trace until a transfer size worth of bytes are transferred. We repeat the same procedure, without any prediction, for the other policies for every run at each location.

In these simulations, we set γ = 0, meaning we do not consider monetary cost. Figure 7 shows the simulation results. The x-axis of the figure shows the number of bits that can be transmitted when consuming 1 Joule. The y-axis shows the throughput. We first draw a frontier line by changing β from 0 to 5. When computing the frontier, we also use the ground-truth value of each flow’s completion time, so that the frontier represents the best that can be achieved if we have a perfect predictor. We call this line “Oracle Frontier”.

In Figure 7, Wi-Fi, LTE, MPTCP(Wi-Fi) and MPTCP(LTE) are four fixed schemes, where the same network is used across all locations. Here Wi-Fi and LTE refer to transmitting data using single-path TCP over Wi-Fi or LTE. MPTCP(Wi-Fi) and MPTCP(LTE) refer to transmitting data over MPTCP,

while using Wi-Fi or LTE for the primary subflow. “Max(S)” and “Max(S/E)” are two schemes run by the Selector. In Figure 7, the two Delphi schemes, “Max(S)” and “Max(S/E)”, fall close to the frontier line. In “Max(S)”, the Selector sets β = 0, meaning it is simply trying to maximize throughput. In “Max(S/E)”, β = 1, meaning it tries to select networks with high throughput but without consuming too much energy. We can see that “Max(S)” and “Max(S/E)” are much closer to the Oracle Frontier line than any other scheme.

Table 5 lists the x- and y-axis values for “Max(S)” and “Max(S/E)”, together with the median value for the fixed scheme that always uses Wi-Fi for comparison. Here, “Max(S)” gives the highest median throughput of 3.0 Mbits/sec, which is a 2.1x gain over Wi-Fi’s 1.4 Mbits/sec. “Max(S/E)” tries to achieve high throughput with high energy efficiency. Its median energy efficiency is 4.9 Mbits/J, 6% worse than always using Wi-Fi, but it achieves a median throughput of 2.6 Mbits/sec, a 1.9x gain over Wi-Fi.

We now take a closer look at each scheme: when an objective function is defined, how many times is each available choice selected? Figure 8 shows the percentage of time that each network choice is selected when trying to optimize different objectives. First, we can see that Delphi makes very similar decisions compared with an omniscient oracle. However, for each specific objective function, Delphi chooses different networks. For example, when maximizing throughput, Delphi selects LTE more often than Wi-Fi or any other MPTCP-based scheme because LTE provides the highest throughput in most cases in our dataset. For “Max(S/E)” (maximizing throughput over energy), or for “Max(1/E)” (minimizing energy consumption), Delphi tends to choose Wi-Fi much more often, since Wi-Fi is generally more energy-efficient. However, both the oracle and Delphi still select other network(s) because there are cases where Wi-Fi gives really low throughput, lengthening the transfer and consuming significant energy in the process.

[Figure 8 plot: selected ratio of LTE, Wi-Fi, MPTCP(LTE) and MPTCP(Wi-Fi) for the oracle and for Delphi under Max(S), Max(S/E) and Max(1/E).]
Figure 8: Percentage of time each network choice is selected when using different objective functions.

To understand these errors in more detail, we compare the decisions made by the oracle and by Delphi. We say that Delphi makes a successful decision if it selects the same network(s) as the oracle. Figure 9 shows the ratio of successful decisions affected by different error sources. Here, “Active” refers to using active probing information for flow-completion time prediction. “Adaptive” refers to our adaptive probing technique described earlier. “3 KB” refers to using active probing, where the Traffic Profiler outputs “3 KB” in 16% of the total outputs. We select 16% because we found that the data size of 16% of the total network transactions is not predictable (Table 1 in Section 4).

We can see that “Active” gives the highest success ratio (93% for Max(S/E) and 85% for Max(S)). As we add error to the network measurement input, i.e., “Adaptive”, the success ratio is lower (90% and 81% respectively). Finally, inaccurate transfer size information gives us the lowest success rate (86% and 78% respectively). These results show that we can achieve high success rates by adding active probing overhead. If the overhead is a concern, removing active probing can still give reasonable results. However, being able to predict the size of the burst of traffic is more important.

[Figure 9 plot: success rate (%) of Active, Adaptive and 3 KB under Max(S/E) and Max(S).]
Figure 9: Success rate for different schemes.

7.4 System Generalization

Model Type   Obj.       Throughput (Mbits/sec)   Energy Efficiency (Mbits/J)
SVR          Max(S)     2.3                      1.6
Reg. Tree    Max(S)     2.1                      2.0
Wi-Fi Only   -          1.4                      5.6
LTE Only     -          1.7                      1.5
SVR          Max(S/E)   1.4                      5.6
Reg. Tree    Max(S/E)   1.9                      2.7

Table 6: Median values for throughput and energy efficiency when testing different models at new locations.

In our above analysis, both our training and testing datasets are generated from the 22 locations we measured. These results show the improvement when we have prior knowledge of the network condition of each location. In this section, we show the result in a more challenging condition: we test Delphi’s performance when there is no prior knowledge. This corresponds to the real use case where a smartphone user enters a new location that he/she has never been to.

We train Delphi using data collected from 21 out of the 22 locations, and test it on the last location. We repeat this

process 22 times by using each location as the testing set. Table 6 shows that when maximizing throughput, Delphi achieves up to 1.6x improvement over Wi-Fi (when using the SVR model). When maximizing throughput over energy consumption, Delphi performs as well as Wi-Fi and better than using LTE only. These results show that: 1) when there is no prior knowledge, the improvement decreases; 2) the SVR model achieves higher improvement than the Regression Tree. This is because the Regression Tree is powerful when characterizing data in the training set, which means it tends to overfit and is less powerful when predicting new data. Thus, Delphi can use the SVR model to make decisions when entering new locations, or crowd-sourcing measurement techniques [32] can be used to feed the smartphone with prior knowledge of new locations, to further improve performance. The details of model generalization are part of our future work.

8. IMPLEMENTATION

We implement Delphi on a laptop (2.4 GHz dual-core with 4 GB RAM, comparable to a smartphone) running an MPTCP-enabled kernel (Ubuntu Linux 13.10 with kernel version 3.11.0 and the MPTCP kernel implementation v0.88 [22]). We tethered two smartphones to the laptop, one in “airplane” mode with Wi-Fi enabled, and the other with Wi-Fi disabled but connected to the Verizon LTE network. The reason we implemented Delphi on a laptop instead of Android phones is that we want to utilize existing machine learning libraries. Importing the machine learning algorithms into the Android platform will be our future work. We also enabled MPTCP on a Galaxy Nexus running Android 4.1 [19] to validate that all the following functionality is feasible on smartphones. All the measurement data used for simulation in the previous sections were also collected under the same setting.

The current implementation of Delphi is a user-level application implemented in Python. One thread of Delphi serves as the Network Monitor; it continuously polls passive indicator values from both phones over the USB interface every 500 milliseconds.

The other thread serves as the Traffic Profiler, Predictor and Selector. It runs tcpdump to monitor packet transmissions in real time. Once it sees that a network transfer has been initiated, it looks for an HTTP request or response header and reads the “Content-Length” information from the header. It then calls the prediction function to predict the transfer completion time, and then calls the Selector function to select a single network by changing iptables rules [21]. This procedure of turning an interface off allows MPTCP to migrate to the new connection because it supports break-before-make semantics. When the network is idle, this thread also reads the passive indicator values periodically (every 5 seconds), and uses a default value of “3KB” as transfer length when calling the Predictor and Selector functions, so that the network can be pre-selected before a new TCP connection starts. When a connection is actively transmitting, Delphi also periodically (every 1 second) reads the passive indicator values, so that it can detect significant network environment changes, in case the mobile device is moving. This allows Delphi to dynamically select networks during long-running transfers. Once the Selector decides an interface switch is required, it achieves the switch by changing iptables entries for that specific TCP connection. Table 7 shows the interface switching delay measured in an indoor setting. Each number is an average across ten measurements. The switching delay is defined as the time between when an iptables rule-changing command is issued and when a packet transfer occurs on the newly brought up interface. In our current implementation, the outgoing delay is 500 ms. During this 500 ms, the transfer does not pause, but continues on the pre-selected network. Note that not all connections will experience this delay because Delphi can pre-select the network configurations as it observes a network condition change. This switch delay will only happen when 1) the network condition changes during a transfer, or 2) the network selection result based on the actual transfer size is different from the result based on the default “3KB” value. In our experiments (Section 9.2), 18% of HTTP transactions need a network switch during data transfer.

Switch Type       Send Delay (ms)   Recv Delay (ms)
LTE switch on     494 ± 1           507 ± 13
Wi-Fi switch on   495 ± 2           782 ± 47

Table 7: Switching delay, averaged across ten measurements. The switching delay is defined as the time between when an iptables rule-changing command is issued and when a packet transfer occurs on the newly brought up interface. The send delay is the delay until the first outgoing packet shows up, and the receive delay is the delay until the first ACK packet arrives from the server.

We configured the laptop to pass all its HTTP traffic through an MPTCP-enabled proxy server. Because current app servers do not always support MPTCP, the proxy server allows our client to run apps over MPTCP, while talking to the apps’ original server. Also, since MPTCP is enabled on both the client and the proxy server, we can migrate connections from Wi-Fi to LTE easily using ip link set dev [interface] multipath on/off, without breaking them.

9. EVALUATION

In previous sections, we analyze the performance of each module using a trace-driven approach. This serves as a micro-benchmark evaluation of Delphi. In this section, we focus on macro-benchmark evaluation done by emulation and real-world experiments.

9.1 Delphi over Emulated Networks

To understand Delphi’s performance under real application workloads, we use Mahimahi [23], a record-and-replay tool that can record and replay client-server interactions over HTTP. In our experiment, we record client-server interactions when the client runs two applications: Web browsing and video streaming. During replay, Mahimahi replays the recorded interactions on top of an emulated network.

[Figure 10 plots: normalized objective value for Wi-Fi, LTE, MPTCP(Wi-Fi), MPTCP(LTE) and Delphi under Max(S) and Max(S/E). Panels: (a) Web Browsing, (b) Video Streaming.]
Figure 10: Objective function value normalized by oracle. The histogram shows the median value, the error bar shows one standard deviation.

[Figure 11 plots: normalized throughput for Wi-Fi, LTE and Delphi. Panels: (a) Break down by locations (Loc-1 to Loc-5), (b) Break down by transfer sizes (1KB, 3KB, 10KB, 100KB, 1MB).]
Figure 11: Objective function value normalized by oracle. The histogram shows the median value, the error bar shows one standard deviation.
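
The normalization behind Figure 10 can be sketched as follows: at each location, the highest objective value achieved by any scheme (the optimal ground truth of Section 9.1) becomes the denominator, and the median and standard deviation are then taken across locations. This is a hedged illustration only. The function names and the sample numbers are hypothetical, and the use of the population standard deviation is an assumption, since the paper does not say which estimator the error bars use.

```python
from statistics import median, pstdev


def normalize_by_oracle(results):
    """Normalize achieved objective values by the per-location optimum.

    results maps location -> {scheme: achieved value of S or S/E}.
    Returns scheme -> list of normalized values, one per location.
    """
    normalized = {}
    for per_scheme in results.values():
        # Optimal ground truth: best value any scheme achieved at this location.
        best = max(per_scheme.values())
        for scheme, value in per_scheme.items():
            normalized.setdefault(scheme, []).append(value / best)
    return normalized


def summarize(normalized):
    """Return scheme -> (median, population standard deviation)."""
    return {s: (median(v), pstdev(v)) for s, v in normalized.items()}


# Made-up throughput values (Mbits/sec) at two hypothetical locations.
results = {
    "loc-1": {"Wi-Fi": 1.0, "LTE": 2.0, "Delphi": 2.0},
    "loc-2": {"Wi-Fi": 3.0, "LTE": 1.5, "Delphi": 3.0},
}
summary = summarize(normalize_by_oracle(results))
for scheme, (med, dev) in summary.items():
    print(scheme, med, dev)
```

A scheme that always matches the best choice gets a normalized value of 1.0 at every location, which is why values close to 1.0 in Figure 10 indicate decisions close to the oracle's.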

Mahimahi can emulate network delays and variable-rate links using packet-delivery traces. In our experiment, we used the tcpdump traces captured during our measurement at 22 locations as packet-delivery traces for network emulation. During the tcpdump measurements, we also measured passive indicator values, which are fed into Delphi during our emulation as inputs to the Network Monitor. This emulation setup enables us to compare different network selection schemes when running exactly the same application traffic, and under the same network conditions.

We use Delphi to optimize two different objective functions: 1) Max(S): maximizing average throughput, i.e., minimizing transfer completion time. 2) Max(S/E): maximizing average throughput over energy per byte. In each experiment, we record the actual value of S and S/E achieved by Delphi and by using different fixed choices. After running Delphi and the fixed network(s) schemes at one location, we can determine which network choice gives the highest value of S and S/E. We call this highest value the optimal ground truth. We use the optimal ground truth to normalize the predictions of all schemes that we consider.

Figure 10 shows the median normalized value for each scheme across all locations. Figure 10a shows the results for Web browsing. For Max(S), Delphi gives the highest throughput over all the other fixed schemes. When compared with Wi-Fi, which is the default network selection on most mobile devices, our throughput improvement is 83%, which corresponds to a 46% reduction in transfer time. For Max(S/E), Delphi improves the median normalized throughput over energy per byte by 0% (over Wi-Fi, since Wi-Fi tends to be much more energy efficient than other schemes) to 6x (over MPTCP(Wi-Fi)). In Figure 10b, for video-streaming applications, for Max(S), Delphi improves the throughput by 93% over Wi-Fi, corresponding to a 49% reduction in transfer time. For Max(S/E), Delphi’s improvement ranges from 41% (over Wi-Fi) to 3.9x (over MPTCP(LTE)).

9.2 Experiments

To understand how Delphi behaves in the real world, we

first train Delphi’s predictor using data we collected at the 22 locations. The model training was done using a desktop (Intel Xeon(R) CPU, 3.30 GHz quad-core, 16 GB RAM). The total training process takes 5 minutes. Then, among the 22 locations, we visited 5 close to our campus. On our Delphi-enabled laptop, we ran wget to download files of various sizes (1 KB, 3 KB, 10 KB, 100 KB and 1000 KB) from a remote server. For comparison, we also ran the same wget with only Wi-Fi or LTE enabled before or after we ran Delphi. We randomized the sequence of configurations (file sizes, network measured). We ran each configuration five times.

Figure 11 shows the average improvement of Delphi over Wi-Fi and LTE, when Delphi’s objective is set to maximizing throughput. Due to the high variation of the actual throughput across different configurations, here we show the throughput normalized by Wi-Fi’s throughput. Figure 11a shows Delphi’s performance at each location. We can see that Delphi does not perform perfectly well. At Location 3, LTE has a higher average throughput than Wi-Fi, but Delphi still selects Wi-Fi. However, at other locations, Delphi performs better than always using Wi-Fi or LTE. At Locations 2, 4 and 5, Delphi performs better than both Wi-Fi and LTE, because it uses MPTCP running over both networks. In Figure 11b, we can see that Delphi achieves improvement for small (1 KB, 3 KB and 10 KB) and large (1 MB) transfers. A deeper investigation reveals that the middle-sized (100 KB) transfers are affected by the switching delay the most: most of the transfers go through the sub-optimal network(s). However, for large transfers, although Delphi transfers on sub-optimal network(s) for some time, most of the transfer happens on the optimal network. Thus, Delphi performs well for large transfers. In summary, Delphi increases average throughput by between 1.25x and 4x compared with Android’s default policy, which always uses Wi-Fi when it is available.

9.3 Handling User Mobility

Another benefit of using Delphi is that by continuously monitoring the network conditions using the Network Monitor, Delphi can tell whether the network performance is getting worse, and trigger handover proactively. This is best demonstrated when the mobile device is moving. In this experiment, we keep Delphi running on the laptop while moving it from inside a building to outside a building. The tethered Wi-Fi phone was initially connected to the Wi-Fi AP inside the building. As we walk outside the building, the Wi-Fi signal keeps decreasing until the phone cannot associate with the AP. We run wget on the laptop to download a large file from our proxy server. (In this experiment, we configured Delphi to Max(S), and to select only between Wi-Fi and LTE, not the MPTCP choices, to study the handover behavior.) We first run wget without running Delphi, and with only one interface at a time, to measure the throughput of Wi-Fi and LTE as the laptop moves, shown as “Wi-Fi” and “LTE” in Figure 12. Then we run Delphi, while moving the laptop along the same path. In Figure 12, we can see that at time 10.5 seconds, Delphi predicted that Wi-Fi is worse than LTE, and consequently triggered a switch; the throughput drops but soon recovers to LTE’s throughput. In Figure 12, we also marked the time when the Wi-Fi phone loses its IP address; this is when a handover will happen according to the Multi-Path TCP Handover-Mode proposed in [27]. However, we can see that in this case, Wi-Fi throughput has already dropped to zero before it loses its IP address. Compared with this scheme, Delphi triggers the LTE/Wi-Fi handover earlier, so that the application sees constantly high throughput.

[Figure 12 plot: throughput (Mbits/sec) over 0 to 50 seconds for Delphi, Wi-Fi and LTE, with markers for Delphi Cell Predict, Delphi WiFi Predict, Delphi Switch and Lose IP.]
Figure 12: When the mobile device is moving away from an access point, Delphi predicts that Wi-Fi can be worse than LTE, then it switches to LTE at time 10.5 seconds. MPTCP handover mode switches to LTE when it sees Wi-Fi lose its IP address, which happens at time 41 seconds.

10. CONCLUSION AND FUTURE WORK

We have presented Delphi, a mobile software controller to help applications select the best network among multiple choices for their data transfers. Delphi’s selection schemes are able to handle trade-offs between high throughput and energy efficiency. Thus it outperforms static schemes such as using Wi-Fi by default (the policy on Android today), or using LTE by default, or always using both, since neither Wi-Fi nor LTE is unequivocally better than the other in terms of average throughput, and using both networks consumes an excessive amount of energy.

Applications could care about other metrics such as average per-packet delay and tail per-packet delay, or more app-centric metrics such as page-load time for web pages or minimizing the risk of a stall for streaming video. We view Delphi as a first step in answering these more involved questions. One direction of our future work is to provide expressive APIs for applications to express their specific needs to Delphi.

In this paper, we use machine learning as a tool to make decisions where a static policy does not suffice. Another direction that we plan to explore is to further enhance Delphi’s learning capability by using online learning or crowd-sourced learning mechanisms. This would allow mobile devices to make better network selection decisions when they enter locations for the first time.

REFERENCES
[1] Y. Agarwal, T. Pering, R. Want, and R. Gupta. SwitchR: Reducing System Power Consumption in a Multi-Client, Multi-Radio Environment. In Wearable Computers, 2008.
[2] ios: Using airdrop. http://support.apple.com/kb/HT5887.
[3] Top Sites in United States. http://www.alexa.com/topsites/countries/US.
[4] G. Ananthanarayanan, V. N. Padmanabhan, L. Ravindranath, and C. A. Thekkath. Combine: Leveraging the Power of Wireless Peers through Collaborative Downloading. In MobiSys, 2007.
[5] G. Ananthanarayanan and I. Stoica. Blue-Fi: Enhancing Wi-Fi Performance using Bluetooth Signals. In Proc. MobiSys, 2009.
[6] P. Bahl, A. Adya, J. Padhye, and A. Walman. Reconsidering Wireless Systems with Multiple Radios. SIGCOMM CCR, 2004.
[7] F. Bari and V. Leung. Automated Network Selection in a Heterogeneous Wireless Network Environment. Network, IEEE, 21(1):34–40, Jan 2007.
[8] C. Carter, R. Kravets, and J. Tourrilhes. Contact Networking: a Localized Mobility System. In MobiSys, 2003.
[9] A. Chakraborty, V. Navda, V. N. Padmanabhan, and R. Ramjee. Coordinating Cellular Background Transfers Using LoadSense. In MobiCom, 2013.
[10] R. Chandra and P. Bahl. MultiNet: Connecting to Multiple IEEE 802.11 Networks Using a Single Wireless Card. In INFOCOM, 2004.
[11] S. Dai, A. Tongaonkar, X. Wang, A. Nucci, and D. Song. Networkprofiler: Towards Automatic Fingerprinting of Android Apps. In INFOCOM, 2013.
[12] S. Deng and H. Balakrishnan. Traffic-Aware Techniques to Reduce 3G/LTE Wireless Energy Consumption. In CoNEXT, 2012.
[13] S. Deng, R. Netravali, A. Sivaraman, and H. Balakrishnan. WiFi, LTE, or Both? Measuring Multi-Homed Wireless Internet Performance. In IMC, 2014.
[14] B. D. Higgins, A. Reda, T. Alperovich, J. Flinn, T. J. Giuli, B. Noble, and D. Watson. Intentional Networking: Opportunistic Exploitation of Mobile Network Diversity. In MobiCom, 2010.
[15] H. Jiang, Z. Liu, Y. Wang, K. Lee, and I. Rhee. Understanding Bufferbloat in Cellular Networks. In CellNet, 2012.
[16] S. Kandula, K. C.-J. Lin, T. Badirkhanli, and D. Katabi. FatVAP: Aggregating AP Backhaul Capacity to Maximize Throughput. In NSDI, volume 8, 2008.
[17] R. Mahindra, H. Viswanathan, K. Sundaresan, M. Y. Arslan, and S. Rangarajan. A Practical Traffic Management System for Integrated LTE-WiFi Networks. In MobiCom, 2014.
//www.msoon.com/LabEquipment/PowerMonitor/.
[19] MPTCP for Android. http://multipath-tcp.org/pmwiki.php/Users/Android.
[20] Apple iOS 7 surprises as first with new multipath TCP connections. http://www.networkworld.com/news/2013/091913-ios7-multipath-273995.html.
[21] Configure MPTCP Routing. http://multipath-tcp.org/pmwiki.php/Users/ConfigureRouting.
[22] MPTCP v0.88 Release. http://multipath-tcp.org/pmwiki.php?n=Main.Release88.
[23] R. Netravali, A. Sivaraman, K. Winstein, S. Das, A. Goyal, and H. Balakrishnan. Mahimahi: A Lightweight Toolkit for Reproducible Web Measurement (Demo). In SIGCOMM, 2014.
[24] S. Nirjon, A. Nicoara, C.-H. Hsu, J. Singh, and J. Stankovic. Multinets: Policy oriented real-time switching of wireless interfaces on mobile devices. In RTAS, 2012.
[25] D. Niyato and E. Hossain. Dynamics of Network Selection in Heterogeneous Wireless Networks: An Evolutionary Game Approach. IEEE Transactions on Vehicular Technology, 58(4):2008–2017, May 2009.
[26] L. Olshen and C. J. Stone. Classification and Regression Trees. Wadsworth International Group, 1984.
[27] C. Paasch, G. Detal, F. Duchene, C. Raiciu, and O. Bonaventure. Exploring Mobile/WiFi Handover with Multipath TCP. In CellNet, 2012.
[28] A. Parekh and R. Gallager. A generalized processor sharing approach to flow control in integrated services networks: the single-node case. Networking, IEEE/ACM Transactions on, 1(3):344–357, Jun 1993.
[29] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 2011.
[30] T. Pering, Y. Agarwal, R. Gupta, and R. Want. Coolspots: Reducing the Power Consumption of Wireless Mobile Devices with Multiple Radio Interfaces. In MobiSys, 2006.
[31] C. E. Perkins. Mobile IP. Communications Magazine, IEEE, 1997.
[32] S. Rosen, S.-J. Lee, J. Lee, P. Congdon, Z. Morley Mao, and K. Burden. MCNet: Crowdsourcing Wireless Performance Measurements through the Eyes of Mobile Devices. Communications Magazine, IEEE, 2014.
[33] A. C. Snoeren and H. Balakrishnan. An End-to-end Approach to Host Mobility. In MobiCom, 2000.
[34] Q. Song and A. Jamalipour. Network Selection in an Integrated Wireless LAN and UMTS Environment
[18] Monsoon Power Monitor. http: Using Mathematical Modeling and Computing Tech-
niques. IEEE Wireless Communications, 12(3):42–48,

13
June 2005. Control for Multipath TCP. In NSDI, 2011.
[35] SSL Proxy: Man-in-the-middle. http: [40] Q. Xu, S. Mehrotra, Z. Mao, and J. Li. PROTEUS: Net-
//crypto.stanford.edu/ssl-mitm/. work Performance Forecast for Real-Time, Interactive
[36] Cisco visual networking index: Global mobile Mobile Applications. In MobiSys, 2013.
data traffic forecast update, 2014-2019. http: [41] K.-K. Yap, T.-Y. Huang, Y. Yiakoumis, S. Chinchali,
//www.cisco.com/c/en/us/solutions/ N. McKeown, and S. Katti. Scheduling Packets over
collateral/service-provider/visual- Multiple Interfaces While Respecting User Preferences.
networking-index-vni/white_paper_ In CoNEXT, 2013.
c11-520862.html. [42] K.-K. Yap, N. McKeown, and S. Katti. Multi-server
[37] WiFi direct description. http:// Generalized Processor Sharing. In ITC, 2012.
en.wikipedia.org/wiki/Pearson_product- [43] Google’s next bid to lower mobile data costs: Zero
moment_correlation_coefficient. rating. https://www.theinformation.com/
[38] K. Winstein and H. Balakrishnan. Mosh: An interactive Google-s-Next-Bid-to-Lower-Mobile-
remote shell for mobile clients. In USENIX ATC, 2012. Data-Costs-Zero-Rating.
[39] D. Wischik, C. Raiciu, A. Greenhalgh, and M. Handley. [44] X. Zhao, C. Castelluccia, and M. Baker. Flexible Net-
Design, Implementation and Evaluation of Congestion work Support for Mobility. In MobiCom, 1998.
