ECO: Edge-Cloud Optimization of 5G Applications

Kunal Rao, Giuseppe Coviello, Wang-Pin Hsiung, Srimat Chakradhar
NEC Laboratories America, Princeton, USA
Abstract

Centralized cloud computing with 100+ milliseconds network latencies cannot meet the tens of milliseconds to sub-millisecond response times required for emerging 5G applications like autonomous driving, smart manufacturing, tactile internet, and augmented or virtual reality. We describe a new programming model and run-time that enables such applications to make effective use of a 5G network, computing at the edge of this network, and resources in the centralized cloud. Our new programming model captures the internal knowledge about the application's microservices, their interconnections, and the critical microservice pipelines that determine the response latency of the application. Then, our run-time continuously monitors the interaction among the microservices, estimates the data produced and exchanged among the microservices, and uses a novel graph min-cut algorithm to dynamically map the microservices to the edge or the cloud to satisfy application-specific response times. Our run-time also handles temporary network partitions, and maintains data consistency across the distributed fabric by using microservice proxies to reduce WAN bandwidth by an order of magnitude, all in an application-specific manner by leveraging knowledge about the application's functions, latency-critical pipelines and intermediate data. We illustrate the use of our programming model and runtime by successfully mapping two representative video analytics applications to the AWS/Verizon Wavelength edge-cloud architecture, and improving application response times by 2x when compared with a static edge-cloud implementation.

1 Introduction

Cloud services are everywhere. From individual users watching over-the-top video content to enterprises deploying software-as-a-service, cloud services are increasingly how the world consumes content and data. Although centralized cloud computing is ubiquitous and economically efficient, an exponential growth in internet-connected machines and devices is giving rise to new applications, services, and workloads for which the centralized cloud quickly becomes computationally inefficient [28]. Emerging applications like autonomous driving, smart manufacturing, tactile internet, remote surgeries, real-time closed-loop control as in Industry 4.0, and augmented or virtual reality require tens of milliseconds to sub-millisecond response times. For these applications, processing all data in the cloud and returning the results to the end user is not an option because it takes too long, uses excessive power, creates privacy and security vulnerabilities, and causes scalability problems. These new applications demand a different kind of computing fabric, one that is distributed and built to support low-latency and high-bandwidth service delivery, which centralized cloud implementations with 100+ milliseconds (ms) network latencies are not well-suited for.

1.1 Edge-cloud

The speed of light places a fundamental limit on network latencies: the farther the distance between the data source and the processing destination, the more time it takes to transmit the data. So, edge computing places computing resources at the edges of the Internet in close proximity to devices, information sources and end-users, where content is created and consumed. Much like a cache on a CPU, this increases bandwidth and reduces latency between the end-users or data sources and the data processing. Today, the centralized cloud has more than 10,000 data centers [19] scattered across the globe, but within the next five years, driven by a need to get data and applications closer to end-users (both humans and machines), orders of magnitude more heavily scaled-down data centers are expected to sprout up at the edge of the Internet to form the edge-cloud.

A tiered system, as shown in Figure 1, with the cloud, and additional, heterogeneous computing and storage resources placed inside or in close proximity to the sensor, is emerging as a computing reference architecture for edge-cloud applications. These additional compute or storage resources can be mobile as in a vehicle or smartphone, they can be static as in a manufacturing plant or offshore oil rig, they can be a mixture of the two, as in hospitals, or they can be in a telecommunication provider's data centers at the edges of the cellular network. In all cases, edge resources are expected to be used solely to meet application-specific needs like very short response times, or to do some analysis locally on large sensor data sets that are impractical to send to the cloud due to their high dimensionality or data-rate. The cloud is expected to be used for fully centralized application delivery, management and execution of select application functions that may require a global perspective.

1.1.1 Advantages. A tiered reference architecture is attractive for several reasons. First, wireless data rates have increased by four orders of magnitude over the past twenty years, and the push towards new networks like 5G, which
promises cellular communications at least an order of magnitude beyond today's LTE networks, can deliver radio access links operating at 1 Gbps or higher, access network latencies reduced from tens of milliseconds to 1 ms, and device densities as high as a million internet-connected devices in one square kilometer. Clearly, 5G coupled with computing capability at the edge of the cellular network can enable fundamentally new applications that require high data-rate instantaneous communications, low latency, and massive connectivity. Applications can use an appropriate 5G network slice, which is a logical or virtual network over a shared, physical communication network, to address their distinct characteristics and service requirements. For example, a 5G slice that supports robot automation would differ in terms of throughput, latency and reliability requirements from a 5G slice for a typical voice phone call. Since individual slices share a finite physical network, ensuring service quality and end-to-end network performance of every slice means that the 5G network may sometimes have to turn down requests for another slice.

Second, by extending the cloud paradigm to the heavily scaled-down edge data centers, it is possible for edge providers to quickly develop, install, deliver and manage applications using the same tools and techniques that are used in the cloud. For example, customers of AWS Wavelength can deploy parts of their application that require ultra-low latency at the edge of 5G networks using the same AWS APIs, tools, and functionality they use today in the cloud, while seamlessly connecting back to the rest of their application and the full range of cloud services running in an AWS Region.

Third, the cloud can be used for fully centralized application delivery and management, in addition to providing computing resources for execution of application functions that require a global perspective.

1.1.2 Challenges. Despite its promise and obvious advantages, the tiered reference architecture also poses several fundamental challenges for applications.

First, the complex, tiered distributed architecture entails very high programming complexity. Mapping and executing applications on a complex, geo-spatially distributed edge-cloud infrastructure with heterogeneous resources (different types of networks and computing resources), and at different network hierarchies, to meet low-latency response times is a major challenge. The execution of an edge-cloud application often requires its functions to span mobile devices, edges, and the distant central cloud, with several stages of computation where data flows from one stage to another in a pipeline. Understanding the concurrency and latency-sensitive pipelines in the application, and then dynamically distributing and executing these functions in parallel in a dynamic, heterogeneous environment, are necessary to achieve low-latency application response. These concerns are non-trivial and daunting for most application developers to address, and they are almost impossible for any application-agnostic underlying distributed network and computing platforms to handle.

Second, edge resources (compute, storage and network bandwidth) are severely limited and far more expensive than cloud resources, and many applications from different users want to use them. So, it is important to use the edge resources very efficiently, and application-specific optimization strategies beyond the efficiencies provided by the underlying compute and network platforms are necessary to realize economically viable low-latency applications.

Third, temporary network disruptions are unavoidable. Traditional application-agnostic methods like database synchronization of data across different tiers of the distributed infrastructure are too slow (and too resource-intensive for the resource-constrained edge) to achieve low-latency application response.

Lastly, typical cloud fault-tolerance solutions like active state machine replication, or check-pointing and restarting, are not applicable to failures in the edge-cloud. For real-time applications such as closed-loop industrial control, restarting from past checkpoints may not be appropriate; instead, lightweight restarts need to be performed from a currently valid, application-specific operating point.

1.1.3 Our contribution. In this paper, we focus on the design and development of edge-cloud applications, and describe a programming model and a run-time that enable applications to make effective use of the large-scale distributed platform consisting of a 5G network, computing and storage resources across the cloud, different tiers of the edge-cloud, and the devices. Our programming model captures internal knowledge about the application's microservices, their interconnections, and the critical pipelines of microservices that determine the latency response of the application. Our run-time continuously monitors data produced and exchanged among the microservices, dynamically maps the microservices to different tiers of computing and storage resources to achieve application latency goals, maintains data consistency across the distributed storage by using microservice proxies to reduce WAN bandwidth by an order of magnitude, and handles temporary network disconnections, all in an application-specific manner by leveraging our knowledge about the application's functions, latency-critical pipelines and intermediate data.

We illustrate the use of the proposed programming model and the new run-time by successfully mapping two different types of video analytics applications to the AWS/Verizon Wavelength edge-cloud architecture. The proposed approach is just as easily applicable to any application that will be packaged in a 5G network slice, whose definition is expected to be standardized in a few months.
Instance Types | vCPUs | Memory (GiB) | GPU | Storage | Network Perf. (Gigabit) | Price (per hour)
t3.medium | 2 | 4 | no | EBS only | up to 5 | $0.056
t3.xlarge | 4 | 16 | no | EBS only | up to 5 | $0.224
r5.2xlarge | 8 | 64 | no | EBS only | up to 10 | $0.68
g4dn.2xlarge | 8 | 32 | yes | 1x225 (SSD) | up to 25 | $1.317
Figure 9. Hybrid deployment of real-time monitoring and access control application (best option, but region-specific)

$$w(v)^{cloud} = T_v^{cloud} \times P_v^{cloud} \qquad (2)$$

where $T_v^{edge}$ is the execution time of microservice $v$ on the edge, $P_v^{edge}$ is the price (AWS cost) of running the microservice on the edge, $T_v^{cloud}$ is the execution time of the microservice on the cloud, and $P_v^{cloud}$ is the price (AWS cost) of running the microservice in the cloud. Note that some microservices cannot be offloaded to the cloud and have to remain on the edge, e.g., microservices that receive input from devices in the carrier network and those that deliver output to devices in the carrier network. Such microservices are fixed to the edge and they only have an edge cost.

Each vertex receives one of the two weights depending on where it is scheduled to run, i.e., it gets weight $w(v)^{edge}$ if it is scheduled to run on the edge or $w(v)^{cloud}$ if it is scheduled to run on the cloud. Each edge $e(v_i, v_j) \in E$ represents the communication between $v_i$ and $v_j$, where $v_i$ is on the edge and $v_j$ is on the cloud (or vice versa), and this edge is assigned a weight given by Equation 3:

$$w(e(v_i, v_j)) = \frac{data\_in_{i,j}}{bw_{upload}} + \frac{data\_out_{i,j}}{bw_{download}} \qquad (3)$$

where $data\_in_{i,j}$ is the amount of data transferred (uploaded) from $v_i$ to $v_j$, $data\_out_{i,j}$ is the amount of data received (downloaded) from $v_j$ to $v_i$, and $bw_{upload}$ and $bw_{download}$ are the upload and download bandwidths between the edge and the cloud.

The total latency per unit of work is given by Equation 4:

$$L_{total} = \underbrace{\sum_{v \in V} F_v \, T_v^{edge}}_{\text{edge latency}} + \underbrace{\sum_{v \in V} (1 - F_v) \, T_v^{cloud}}_{\text{cloud latency}} + \underbrace{\sum_{e(v_i, v_j) \in E} F_e \, w(e(v_i, v_j))}_{\text{comm. latency}} \qquad (4)$$

where the total latency is the sum of the edge latency, i.e., the processing time taken by microservices on the edge for a unit of work, the cloud latency, i.e., the processing time taken by microservices on the cloud for a unit of work, and the communication latency, i.e., the time taken for data transfer between the edge and the cloud. The flags $F_v$ and $F_e$ in Equation 4 are defined as follows:

$$F_v = \begin{cases} 1, & \text{if } v \in V^{edge} \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad F_e = \begin{cases} 1, & \text{if } e \in E_{cut} \\ 0, & \text{if } e \notin E_{cut} \end{cases} \qquad (5)$$

where $V^{edge}$ is the set of vertices (microservices) scheduled to run on the edge, and $E_{cut}$ is the set of edges $e(v_i, v_j)$ in which $v_i$ and $v_j$ are scheduled on the edge and the cloud, or vice versa.

We formulate the total cost as given by Equation 6:

$$Cost_{total} = \underbrace{c_{edge} \sum_{v \in V} F_v \, w(v)^{edge}}_{\text{edge cost}} + \underbrace{c_{cloud} \sum_{v \in V} (1 - F_v) \, w(v)^{cloud}}_{\text{cloud cost}} \qquad (6)$$

where the total cost is the sum of the edge computation cost and the cloud computation cost, and the weight parameters $c_{edge}$ and $c_{cloud}$ adjust the relative importance between them.
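Read operationally, Equations 2 and 3 assign each microservice and each communication edge a weight. A direct transcription is sketched below; the function and variable names are ours, and we assume the edge-side vertex weight $w(v)^{edge} = T_v^{edge} \times P_v^{edge}$ mirrors Equation 2.

def vertex_weights(t_edge, p_edge, t_cloud, p_cloud):
    # Vertex weights: execution time x price on each side (Equation 2 and its
    # assumed edge-side counterpart). A pinned vertex has no usable cloud weight.
    return {"w_edge": t_edge * p_edge, "w_cloud": t_cloud * p_cloud}

def edge_weight(data_in, data_out, bw_upload, bw_download):
    # Equation 3: time to move a unit of work's data across the edge-cloud link.
    return data_in / bw_upload + data_out / bw_download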
The goal of the partitioning is to find a cut in the graph $G = (V, E)$ with minimum total cost under a given total latency constraint per unit of work. This latency constraint is provided to the system, and ES adheres to it while determining the cut in the graph. The cut separates the graph into two disjoint sets, where one side of the cut is scheduled on the edge while the other side is scheduled on the cloud, such that the overall cost of application execution is reduced while the end-to-end latency stays within the provided total latency constraint. Since the cost of a VM in the cloud is lower than that on the edge (Wavelength in this case) for the same or better VMs, scheduling microservices on the cloud will certainly help reduce the overall cost. However, if the time for data transfer between microservices scheduled on the cloud and the edge is high, then the end-to-end latency will go up, which is not desirable. The above formulation helps in obtaining a desired partition which reduces the overall cost, while keeping the end-to-end latency within the acceptable limit.
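To make this concrete, the sketch below (ours, not the ES implementation) enumerates edge/cloud assignments for a small microservice graph and keeps the cheapest one that meets the latency constraint, scoring each assignment with Equations 4-6. The dictionary-based graph encoding and the exhaustive search are illustrative assumptions; ES itself uses a graph min-cut algorithm to optimize the same objective efficiently.

import itertools

# nodes = {"decode": {"t_edge": .04, "t_cloud": .01, "w_edge": .9,
#                     "w_cloud": .2, "pinned": True}, ...}
# edges = {("decode", "detect"): 0.12, ...}   # Equation-3 weights
def choose_partition(nodes, edges, latency_limit, c_edge=1.0, c_cloud=1.0):
    names = list(nodes)
    best = None
    for bits in itertools.product((0, 1), repeat=len(names)):  # 1 = run on edge
        on_edge = dict(zip(names, bits))
        if any(n["pinned"] and not on_edge[k] for k, n in nodes.items()):
            continue  # carrier-network I/O microservices must stay on the edge
        # Equation 4: edge + cloud processing latency plus cut-edge communication
        latency = sum(n["t_edge"] if on_edge[k] else n["t_cloud"]
                      for k, n in nodes.items())
        latency += sum(w for (u, v), w in edges.items() if on_edge[u] != on_edge[v])
        # Equation 6: weighted sum of edge and cloud computation costs
        cost = sum(c_edge * n["w_edge"] if on_edge[k] else c_cloud * n["w_cloud"]
                   for k, n in nodes.items())
        if latency <= latency_limit and (best is None or cost < best[0]):
            best = (cost, on_edge)
    return best  # (cost, assignment), or None if no cut meets the constraint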
In scenarios where there are multiple layers of computing infrastructure available, such that the cost reduces as you go to upper layers at the expense of increased latency, the same method can be applied iteratively across layers to identify the appropriate allocation and scheduling of microservices. For example, let's say there are three computing layers A, B and C, with A at the top, B in the middle and C at the bottom. The cost of computing goes down as you go up from C to A, while the latency goes up. In this scenario, the above partitioning scheme will first consider A as the cloud, with B and C together considered as the edge. Once this partition is determined, certain microservices will be scheduled to run on A, while others will be scheduled to run on B and C. Then, for only those microservices scheduled to run on B and C, the partitioning scheme is applied again, this time with B considered as the cloud and C as the edge, splitting this set of microservices between B and C. In this way, the various microservices are allocated and scheduled to run on layers A, B and C. This iterative process can be extended to any number of computing layers, and an appropriate partitioning of microservices can be determined across these various layers.

Before the execution of the application starts, based on application and network parameters, i.e., the current network condition and a priori knowledge about execution times and communication across the various microservices of the application, the above partitioning scheme is used by ES to decide where to schedule the various microservices. After the application starts running, ES continuously receives application-level and network-level performance data from EM. This is used to periodically check if the previously selected partition is still good or needs to be adjusted dynamically based on the changing environmental (application-level and/or network-level) conditions. Algorithm 1 shows the scheduling algorithm used by ES to schedule applications between the edge and the cloud. At some pre-defined periodic interval, say 10 seconds, all applications are checked for making scheduling decisions. If an application is not already scheduled, then an appropriate application partitioning is determined and the application is scheduled to run as per the selected partition. For the applications that are already running, the environmental conditions, including application-level parameters (processing speed, input and output data exchange rate of microservices, etc.) and network-level parameters (latency, upload and download bandwidth between edge and cloud), are checked, and if the change is over a pre-defined threshold for any of the parameters, then the same partitioning function is used with the latest updated parameters and the application is re-scheduled as per the newly determined partition scheme.

Algorithm 1 Application scheduling
1: while true do
2:   for app ∈ apps do
3:     if !isAppScheduled(app) OR
4:        conditionsChanged(app, a_p, n_p) then
5:       partition ← getPartition(app, a_p, n_p)
6:       scheduleApp(app, partition)
7:     end if
8:   end for
9:   sleep(interval)
10: end while
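As a rough illustration of the conditionsChanged(app, a_p, n_p) check in Algorithm 1, where a_p and n_p are the application-level and network-level parameters listed above, the sketch below compares each monitored parameter against the snapshot taken at the last partitioning decision; the relative-threshold policy and the 20% value are our assumptions, not the paper's.

THRESHOLD = 0.20  # assumed relative drift that triggers re-partitioning

def conditions_changed(app, app_params, net_params):
    current = {**app_params, **net_params}   # fps, data rates, latency, bandwidths
    previous = app.params_at_last_partition  # assumed snapshot stored at scheduling
    for key, old in previous.items():
        new = current.get(key, old)
        if old and abs(new - old) / old > THRESHOLD:
            return True                      # drift too large: recompute partition
    return False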
5.0.3 Alerts-Manager at Edge (AM-E). For low-latency response applications, alerts are typically generated at the edge, and persisted in the long-term store in the cloud. In order to make the alerts available quickly (in real time) to other applications, we introduce a proxy microservice for AM in the cloud. This new microservice, called Alerts-Manager at Edge (AM-E), receives alerts from application microservices, either locally on the edge or from the cloud, and publishes them over a ZeroMQ channel immediately for other applications to consume. This proxy also maintains the received alerts in a temporary buffer, which is periodically synchronized with the AM microservice's persistent storage in the cloud. AM-E is also useful if a network partition happens. After the network connectivity is restored, the proxy synchronizes with the AM microservice in the cloud by using lightweight, application-specific alerts. This is in stark contrast to an approach where databases on the edge in the Wavelength zone are synchronized with the persistent store in the cloud by using an order of magnitude more resources on the edge. AM-E is transparently added by the runtime in the application pipeline to ensure quick delivery of alerts from the edge to other applications or to devices in the carrier network. This is possible since the runtime has intrinsic knowledge of the application pipeline.
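A minimal sketch of the AM-E idea, assuming the pyzmq library; sync_to_cloud_AM is a hypothetical stand-in for the synchronization with AM's persistent store, whose actual interface the paper does not describe.

import threading, zmq

class AlertsManagerAtEdge:
    """Buffers alerts on the edge and publishes them immediately over ZeroMQ."""
    def __init__(self, pub_endpoint="tcp://*:5556", sync_period_s=30):
        self.sync_period_s = sync_period_s
        self.socket = zmq.Context().socket(zmq.PUB)
        self.socket.bind(pub_endpoint)
        self.buffer, self.lock = [], threading.Lock()
        threading.Timer(sync_period_s, self._sync).start()

    def on_alert(self, alert_bytes):
        self.socket.send(alert_bytes)          # real-time fan-out to subscribers
        with self.lock:
            self.buffer.append(alert_bytes)    # keep until synced with cloud AM

    def _sync(self):
        with self.lock:
            pending, self.buffer = self.buffer, []
        try:
            sync_to_cloud_AM(pending)          # hypothetical call to AM's cloud store
        except ConnectionError:
            with self.lock:                    # network partition: retry next period
                self.buffer = pending + self.buffer
        threading.Timer(self.sync_period_s, self._sync).start()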
5.0.4 File Transfer (FT). For applications that work with files, when the files reside completely on the edge or on the cloud, there
is no need for an explicit component that manages and coordinates file transfer, since all microservices are co-located. However, when microservices are split between the edge and the cloud, it becomes necessary to coordinate file transfer across the network and to handle situations when there is a network partition. FT is a new component that handles such scenarios, and similar to AM-E, FT is transparently added by the runtime in the application pipeline to coordinate file transfer between microservices over the network. Again, this is only possible because the runtime has intrinsic knowledge of the entire application pipeline and can introduce the FT component appropriately. Measurements from FT regarding the progress of file transfer, speed, etc. are used by EM to keep track of application and network performance, which is used by ES to make dynamic application placement decisions.

Figure 10. Hybrid deployment of forensics application

Figure 10 shows the hybrid deployment of the investigation and forensics application, where components are distributed between the edge and the cloud. While the processing at the edge continues, resources from the cloud are also leveraged to process parts of the files in parallel. For this, the file transfer coordination is handled by the FT component at the edge. FT also keeps track of which parts of the files are transferred, which ones are in progress, which ones are complete, etc. In case of a network partition, the parts that were in progress on the cloud are re-initiated on the edge by FT.
6 Centralized delivery and management

Figure 12 shows a distributed architecture to handle deployments across different Wavelength zones, all from a centralized cloud. Users access a central site and request a set of services to be deployed on a collection of sensors. For example, in Figure 12, Alice from the San Jose area contacts a VM in the Ohio Region zone, expressing an interest to start a particular application on camera A (step 1). In response to the request, we create a VPC (as described later in Section 13) with appropriate Region and Wavelength zones for customers in the San Jose area (steps 2 and 3). Then, the carrier gateway IP address, as well as the public IP address of the VM in the Oregon Region zone, is sent back to Alice as a React program, which points camera A to the carrier gateway IP address, and the service is now available on camera A. Similarly, when Bob from New York wants to start a particular application on his camera B, he visits the central site in the Ohio Region zone and makes a request for a particular application on camera B (step 4). In response, we create another VPC with resources in the North Virginia Region zone and a Wavelength zone in New York (steps 5 and 6). Bob's camera B connects to the carrier gateway IP address for the Wavelength zone in New York, and avails the requested application. Our system can quickly set up such a global application service, without any human involvement.

6.0.1 Cloud Monitor (CM). Edge Monitor, described in Section 5.0.1, monitors application-level and network-level performance metrics at an individual edge, while Cloud Monitor (CM) monitors and maintains this information from multiple edges, at a central location in the cloud. CM thus has a global view, while EM has a local view of the state of deployment of the various applications. CM works in conjunction with EMs, maintains key edge-level metrics from individual edges, and makes them available for Cloud Scheduler (CS) to make decisions about the placement of applications at appropriate edges. Each EM, at a periodic interval, reports edge-level metrics like the health of the edge, the performance of applications running on the edge, the network performance between the edge and the cloud, etc., to CM, which then has the most up-to-date information from all the connected edges. If any edge loses its connection due to network partitioning, then the information at CM from that edge is stale, and CS subsequently avoids scheduling applications at that particular edge until the network connection is restored. Likewise, if the edge health seems to be unstable, then CM notes this and CS subsequently avoids scheduling applications on that edge until the edge health is back to normal.
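For illustration only, an EM report to CM and the staleness rule CS relies on might look like the sketch below; the JSON-over-HTTP transport and the field names are our assumptions, as the paper does not specify a wire format.

import json, time, urllib.request

def report_to_cm(cm_url, edge_id, metrics):
    # Periodic edge-level report: health, per-app performance, edge<->cloud network stats.
    payload = {"edge_id": edge_id, "ts": time.time(), **metrics}
    req = urllib.request.Request(cm_url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req, timeout=5)

def usable_edges(last_report_ts, now, max_age_s=30):
    # An edge whose last report is too old is treated as partitioned or unhealthy,
    # and CS skips it when placing new applications.
    return [e for e, ts in last_report_ts.items() if now - ts <= max_age_s]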
6.0.2 Cloud Scheduler (CS). Cloud Scheduler (CS) is the component that enables deployment of applications across different Wavelength zones. Figure 11 shows the flowchart of the procedure followed by CS to serve a request for application deployment. Whenever a request to deploy a new application in a particular geographical region is received, CS first checks if there already exists an availability and Wavelength zone setup for that region. If not, then a new VPC is created, and an availability and Wavelength zone for the corresponding geographical region is set up. If a Wavelength zone is already set up, and is reachable (checked using data from CM), then it is selected for deployment of the application. After the Wavelength zone, next is the selection or creation of VMs for application processing. If existing VMs can accommodate the microservices of the application (checked using data from CM), then they are selected, and if more VMs are needed for processing the application, then new VMs are created within the availability and Wavelength zone. The health of these VMs is then checked using edge-level metrics from CM, and if they look good, then the application is deployed on these VMs and the request processing ends. If either the existing
Wavelength zone is not reachable or the VMs are not in good health, an error is reported and the request processing ends.

Figure 11. Cloud Scheduler flowchart

7 5G and WAN network latencies

In this section we present some raw network performance numbers for the test beds that we created (one on the west coast and another one on the east coast). For all our experiments, we used a Verizon 5G MIFI 2100 hotspot and the t3.xlarge instance type in the Amazon AWS infrastructure for the Wavelength and Availability Zones. Although Verizon claims to have 5G speed in the San Francisco Bay area, the location where we measured did not get the full expected 5G speed, and therefore we measured at two different locations within the SF Bay area. We report the numbers we observed using this 5G hotspot at these two locations, namely location-1 (GPS coordinates: 37.351368, -121.994782) and location-2 (GPS coordinates: 37.349002, -121.993945). Table 2 summarizes the latency, upload bandwidth and download bandwidth that we observed between (a) a device in the Wavelength zone and a VM in the Wavelength zone, (b) a device in a non-wavelength zone (since a user can connect from anywhere) and a VM in the Availability zone, and (c) a VM in the Availability zone and a VM in the Wavelength zone. We used ping to measure the latency and iperf3 [4] to measure the upload and download bandwidth. We repeat the experiments multiple times and report the minimum, average, maximum and standard deviation values for all measurements.

From Table 2, we can see that the latency between the device and the VM in the Wavelength zone is much lower than that between the device and the VM in the Availability zone, indicating a faster turn-around time for responses from the Wavelength zone compared to those from the Availability zone (where the standard deviation is also quite high). Also, the latency between the VM in the Availability zone and the VM in the Wavelength zone is very low, indicating that we can offload processing to the Availability zone and get the response back to the Wavelength zone quickly. With respect to upload bandwidth, we see that from the device to the VM in the Wavelength zone, the upload bandwidth is quite high compared to that to the VM in the Availability zone. This shows that high-resolution video can be streamed to the VM in the Wavelength zone at a high FPS, whereas it will be slow to stream to the VM in the Availability zone. The download bandwidth from the VM in the Wavelength zone to the device is quite high, while from the VM in the Availability zone to the device it is relatively low. This indicates that it will not be possible for the device to receive high network traffic from the VM in the Availability zone, but it is possible from the VM in the Wavelength zone. Looking at the upload and download bandwidth between the VM in the Availability zone and the VM in the Wavelength zone, we can see that it is high enough to offload processing to the Availability zone VM and receive back responses on the Wavelength zone VM.
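The measurement procedure can be scripted; below is a minimal sketch assuming the standard ping and iperf3 [4] command-line tools, with an iperf3 server running on the target VM (the host name is a placeholder).

import json, re, statistics, subprocess

def latency_stats(host, count=20):
    # min/avg/max/stdev of RTTs, parsed from ping's "time=..." fields
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    rtts = [float(m) for m in re.findall(r"time=([\d.]+)", out)]
    return min(rtts), statistics.mean(rtts), max(rtts), statistics.stdev(rtts)

def bandwidth_mbits(host, reverse=False):
    # reverse=False measures upload; reverse=True (iperf3 -R) measures download
    cmd = ["iperf3", "-c", host, "-J"] + (["-R"] if reverse else [])
    result = json.loads(subprocess.run(cmd, capture_output=True, text=True).stdout)
    return result["end"]["sum_received"]["bits_per_second"] / 1e6

print(latency_stats("wavelength-vm.example.com"))  # placeholder host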
Test bed network performance
Metrics | Minimum | Average | Maximum | Standard deviation
Latency (milli-seconds), (a) Device to Wavelength | 24.6 | 30.9 | 50.9 | 4.5
Upload Bandwidth (Mbits/s), (a) Device to Wavelength | 40 | 42.8 | 46 | 2.4
Download Bandwidth (Mbits/s), (a) Wavelength to Device | 255 | 292 | 318 | 18.25
Table 3. Network performance at location-2 in SF Bay area

Table 3 summarizes the latency, upload and download bandwidth measured between the device in the Wavelength zone and the VM in the Wavelength zone at location-2. We are not reporting (b) and (c) from Table 2 in this table, since they are not directly affected by the 5G hotspot network performance at different locations. We notice that the latency is slightly lower, the upload bandwidth is slightly higher, and the download bandwidth is significantly higher at location-2 than at location-1.

In addition to the network performance measurements in the SF Bay area, we also conducted measurements from a device in a non-wavelength zone in NJ on our east coast test bed. This location is not in the recommended 5G coverage area, but we still wanted to see what kind of network performance we observe from the 5G hotspot to the Wavelength zone in NYC. Table 4 summarizes the latency, upload and download bandwidth between the device in the non-wavelength zone in NJ and the VM in the Wavelength zone in NYC. We observe that the network performance in such a setup is extremely poor and not suitable for real-time video analytics applications. Therefore, although VMs in a Wavelength zone are reachable from an area outside the Wavelength zone, being in the recommended 5G coverage area certainly makes a difference w.r.t. network performance while using the Wavelength zone VMs.

Test bed network performance
Metrics | Minimum | Average | Maximum | Standard deviation
Latency (milli-seconds), (a) Device to Wavelength | 39.47 | 77.46 | 117.43 | 22.59
Upload Bandwidth (Mbits/s), (a) Device to Wavelength | 2.01 | 3.1 | 4.25 | 1.03
Download Bandwidth (Mbits/s), (a) Wavelength to Device | 13.5 | 17.8 | 24 | 5.03
Table 4. Network performance at non-wavelength zone in NJ

Along with the measurements between devices in the carrier network and VMs in the Wavelength zone in different geographical areas, we also performed an experiment to measure the stability of the connection between the VM in the Wavelength zone and the VM in the Availability zone. We send a sample frame from the Wavelength zone VM to the Availability zone VM, and then receive a sample alert from the Availability zone VM back at the Wavelength zone VM. Figure 13 shows the latency profile for continuously doing this over a period of 24 hours. We can see that over a period of 24 hours, the latency remains quite stable, except
VM in wavelength zone in NYC. We observe that the network period of 24 hours, the latency remains quite stable, except
NEC Laboratories America, Princeton, USA,
Chakradhar.S, et al.
Deployment scenario | Number of cameras | Number of Availability zone instances | Number of Wavelength zone instances | Price (per hour) | Compute price (per month) | Storage price (per month) | Total cost (per month)
Availability Zone | 20 | 10 | 0 | $1.67 | $1202 | $195 | $1397
Wavelength Zone | 20 | 1 | 10 | $2.40 | $1733.04 | $310 | $2043
Hybrid | 20 | 5 | 5 | $1.95 | $1407 | $243 | $1650
Availability Zone | 100 | 50 | 0 | $8.35 | $6012 | $975 | $6987
Wavelength Zone | 100 | 1 | 50 | $11.37 | $8184 | $1474 | $9658
Hybrid | 100 | 25 | 25 | $9.78 | $7038 | $1215 | $8253
then the total cost (compute and storage) would be around $1397.4 per month, while deploying it on the Wavelength zone, i.e., all application processing happens in the Wavelength zone, will cost around $2043.54, where we use 1 EC2 instance in the Availability zone and 10 in the Wavelength zone. In the case of deployment in the Availability zone, the cameras connect directly to the VMs in the Availability zone, while in the case of the Wavelength zone, the cameras connect directly to the VMs in the Wavelength zone through the carrier network. Along with these two, there is also a hybrid deployment, where cameras connect directly to VMs in the Wavelength zone through the carrier network, but the processing does not entirely happen there. Instead, the processing is split between VMs in the Availability zone and VMs in the Wavelength zone. The cost of such a hybrid deployment for a 20-camera setup is around $1650.6 per month, which uses 5 EC2 instances in the Wavelength zone and 5 instances in the Availability zone. The total cost for a 100-camera deployment purely on the Availability zone is around $6987 per month, while for deployment on the Wavelength zone it is around $9658.74. In both cases, 50 t3.xlarge instances are required for application processing, either in the Availability zone or the Wavelength zone. The cost for a 100-camera hybrid deployment, with 25 instances in the Availability zone and 25 instances in the Wavelength zone, is around $8253. We note that the cost of the hybrid deployment is lower than that of the edge-only deployment, i.e., on the Wavelength zone, at the expense of a slightly increased total time to action.

An alternative is to receive the video streams on VMs in the Wavelength zone and from there stream them into the cloud, where the entire application processing happens (reducing the cost of processing, since VMs in the cloud are cheaper), and then get back the results through the Wavelength zone VMs to the devices. For such a deployment, the overall network time is 260 milliseconds vs 134 milliseconds when the hybrid deployment (see Figure 9) determined by our runtime optimization is used. Also, for such a deployment with 100 cameras, the number of VMs in the Wavelength zone to stream the video to the Availability zone would be around 15 (the required bitrate per camera to stream full HD video at 30 FPS is about 4 to 5 Mbits/s and the total upload bandwidth between the Wavelength zone and the Availability zone is around 35 Mbits/s, so 1 VM can support 7 cameras; to support 100 cameras, around 15 VMs would be required), and the number of VMs for processing in the Availability zone would be 50 (1 VM can support 2 cameras, so for 100 cameras, 50 VMs would be needed), which brings the total cost to around $9842.7 per month, whereas the total cost for the hybrid deployment (see Figure 9) determined by our runtime optimization would be around $8253 per month. Our runtime optimization thus improves network latency by around 2x, while bringing the cost down by around 16% for a 100-camera deployment.
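The VM-count and savings arithmetic above, restated as a quick calculation (all figures are the measurements and prices reported in the text):

import math

bitrate_per_camera = 5      # Mbits/s, full-HD video at 30 FPS (upper bound)
wl_to_az_uplink    = 35     # Mbits/s between Wavelength and Availability zones
cameras            = 100

cams_per_stream_vm = wl_to_az_uplink // bitrate_per_camera     # 7 cameras per VM
stream_vms  = math.ceil(cameras / cams_per_stream_vm)          # 15 streaming VMs
process_vms = math.ceil(cameras / 2)                           # 50 processing VMs (2 cameras/VM)

cloud_only, hybrid = 9842.7, 8253.0                            # $/month, from the text
print(stream_vms, process_vms,
      f"{(cloud_only - hybrid) / cloud_only:.0%}")             # 15 50 16%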
[5] [n.d.]. Transformational Performance with 5G and Edge Computing. https://d1.awsstatic.com/Wavelength2020/AWS-5G-edge-Infographic-FINAL-Aug2020-2.pdf Last accessed 29 October 2020.
[6] Dalia Adib. [n.d.]. The edge computing latency promise. https://stlpartners.com/edge-computing/edge-computing-architecture-impact-latency/ Last accessed 29 October 2020.
[7] Hind Bangui, Said Rakrak, Said Raghay, and Barbora Buhnova. 2018. Moving to the Edge-Cloud-of-Things: Recent Advances and Future Research Directions. 7 (11 2018), 309. https://doi.org/10.3390/electronics7110309
[8] Bjorn Butzin, Frank Golatowski, and Dirk Timmermann. 2016. Microservices approach for the internet of things. 1-6. https://doi.org/10.1109/ETFA.2016.7733707
[9] Keyan Cao, Yefan Liu, Gongjie Meng, and Qimeng Sun. 2020. An Overview on Edge Computing Research. IEEE Access PP (01 2020), 1-1. https://doi.org/10.1109/ACCESS.2020.2991734
[10] Ann Castelfranco and Daniel Hartline. 2016. Evolution of rapid nerve conduction. Brain Research 1641 (02 2016). https://doi.org/10.1016/j.brainres.2016.02.015
[11] Byung-Gon Chun, Sunghwan Ihm, Petros Maniatis, Mayur Naik, and Ashwin Patti. 2011. CloneCloud: Elastic execution between mobile device and cloud. In EuroSys'11 - Proceedings of the EuroSys 2011 Conference, 301-314. https://doi.org/10.1145/1966445.1966473
[12] Eduardo Cuervo, Aruna Balasubramanian, Dae-ki Cho, Alec Wolman, Stefan Saroiu, Ranveer Chandra, and Paramvir Bahl. 2010. MAUI: Making smartphones last longer with code offload. In Proc. ACM MOBISYS 2010, 49-62. https://doi.org/10.1145/1814433.1814441
[13] Beth Daley. 2015. It feels instantaneous, but how long does it really take to think a thought? https://theconversation.com/it-feels-instantaneous-but-how-long-does-it-really-take-to-think-a-thought-42392 Last accessed 30 October 2020.
[14] Nicola Dragoni, Saverio Giallorenzo, Alberto Lluch-Lafuente, Manuel Mazzara, Fabrizio Montesi, Ruslan Mustafin, and Larisa Safina. 2016. Microservices: yesterday, today, and tomorrow. (06 2016).
[15] Martin Fowler and James Lewis. 2014. Microservices. https://martinfowler.com/articles/microservices.html Last accessed 29 October 2020.
[16] Tejas P. Ghuntla, Pradnya A. Gokhale, Hemant Mehta, and Chinmay Shah. 2014. A comparison and importance of auditory and visual reaction time in basketball players. Saudi Journal of Sports Medicine 14 (01 2014), 35. https://doi.org/10.4103/1319-6308.131616
[17] Najmul Hassan, Kok-Lim Yau, and Celimuge Wu. 2019. Edge Computing in 5G: A Review. IEEE Access PP (08 2019), 1-1. https://doi.org/10.1109/ACCESS.2019.2938534
[18] Teemu Leppänen, Claudio Savaglio, Lauri Lovén, Tommi Järvenpää, Rouhollah Ehsani, Ella Peltonen, Giancarlo Fortino, and Jukka Riekki. 2019. Edge-Based Microservices Architecture for Internet of Things: Mobility Analysis Case Study. https://doi.org/10.1109/GLOBECOM38437.2019.9014273
[19] Augus Loten. 2019. Data-Center Market Is Booming Amid Shift to Cloud. https://www.wsj.com/articles/data-center-market-is-booming-amid-shift-to-cloud-11566252481 Last accessed 30 October 2020.
[20] C. MacKenzie, Kenneth Laskey, Francis Mccabe, Peter Brown, and Rebekah Metz. 2006. Reference model for service oriented architecture 1.0. Public Rev. Draft 2 (08 2006).
[21] Sumit Maheshwari, Dipankar Raychaudhuri, Ivan Seskar, and Francesco Bronzino. 2018. Scalability and Performance Evaluation of Edge Cloud Systems for Latency Constrained Applications. https://doi.org/10.1109/SEC.2018.00028
[22] Muthucumaru Maheswaran, Robert Wenger, Richard Olaniyan, Salman Memon, Olamilekan Fadahunsi, and Richboy Echomgbe. 2019. A Language for Programming Edge Clouds for Next Generation IoT Applications.
[23] Jinke Ren, Guanding Yu, Yinghui He, and Geoffrey Li. 2019. Collaborative Cloud and Edge Computing for Latency Minimization. IEEE Transactions on Vehicular Technology PP (03 2019), 1-1. https://doi.org/10.1109/TVT.2019.2904244
[24] David K. Rensin. 2015. Kubernetes - Scheduling the Future at Cloud Scale. O'Reilly Media, Sebastopol, CA. http://www.oreilly.com/webops-perf/free/kubernetes.csp
[25] Dario Sabella, Vadim Sukhomlinov, Linh Trang, Sami Kekki, Pietro Paglierani, Ralf Rossbach, Xinhui Li, Yonggang Fang, Dan Druta, Fabio Giust, Luca Cominardi, Walter Featherstone, Bob Pike, and Shlomi Hadad. 2019. Developing Software for Multi-Access Edge Computing.
[26] Espen Tønnessen, Thomas Haugen, and Shaher Shalfawi. 2013. Reaction Time Aspects of Elite Sprinters in Athletics World Championships. Journal of Strength and Conditioning Research 27 (04 2013), 885-892. https://doi.org/10.1519/JSC.0b013e31826520c3
[27] Prateeksha Varshney and Yogesh Simmhan. 2020. Characterizing application scheduling on edge, fog, and cloud computing resources. Software: Practice and Experience 50, 5 (May 2020), 558-595. https://doi.org/10.1002/spe.2699
[28] S. Vidas, R. Lakemond, S. Denman, C. Fookes, S. Sridharan, and T. Wark. 2012. A Mask-Based Approach for the Geometric Calibration of Thermal-Infrared Cameras. IEEE Transactions on Instrumentation and Measurement 61 (06 2012), 1625-1635. https://doi.org/10.1109/TIM.2012.2182851
[29] Jianyu Wang, Jianli Pan, Flavio Esposito, Prasad Calyam, Zhicheng Yang, and Prasant Mohapatra. 2018. Edge Cloud Offloading Algorithms: Issues, Methods, and Perspectives.
[30] Huaming Wu, William Knottenbelt, and Katinka Wolter. 2019. An Efficient Application Partitioning Algorithm in Mobile Environments. IEEE Transactions on Parallel and Distributed Systems 30 (01 2019), 1464-1480. https://doi.org/10.1109/TPDS.2019.2891695
13 Appendix

13.1 Setup and creation of a Wavelength zone

We set up 2 independent Wavelength zones, one in the San Francisco Bay area, and the other in New York City. Figure 17 shows the steps needed to set up the edge-cloud infrastructure, and we describe the details of each step.

1. Select the region based on the Wavelength zone, and the image id to spawn the EC2 nodes:

$ export REGION="us-west-2"
$ export WL_ZONE="us-west-2-wl1-sfo-wlz-1"
$ export IMAGE_ID="ami-003ba08113592046f"
$ export KEY_NAME=<your key name>

2. Create the VPC and set VPC_ID:

$ export VPC_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    create-vpc \
    --cidr-block 10.0.0.0/16 \
    --query 'Vpc.VpcId') \
    && echo "VPC_ID=$VPC_ID"
3. Create a subnet for the availability zone:

$ export CLOUD_SUBNET_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    create-subnet \
    --cidr-block 10.0.1.0/24 \
    --vpc-id $VPC_ID \
    --query 'Subnet.SubnetId') \
    && echo "CLOUD_SUBNET_ID=$CLOUD_SUBNET_ID"

4. Create an internet gateway and attach it to the VPC, then create the carrier gateway:

$ export IGW_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    create-internet-gateway \
    --query 'InternetGateway.InternetGatewayId') \
    && echo "IGW_ID=$IGW_ID"

$ aws ec2 --region $REGION \
    attach-internet-gateway \
    --vpc-id $VPC_ID \
    --internet-gateway-id $IGW_ID

$ export CAGW_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    create-carrier-gateway \
    --vpc-id $VPC_ID \
    --query 'CarrierGateway.CarrierGatewayId') \
    && echo "CAGW_ID=$CAGW_ID"

5. Create the routing table to route the traffic to the internet gateway:

$ export CLOUD_RT_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    create-route-table \
    --vpc-id $VPC_ID \
    --query 'RouteTable.RouteTableId') \
    && echo "CLOUD_RT_ID=$CLOUD_RT_ID"
$ aws ec2 --region $REGION \
    associate-route-table \
    --route-table-id $WL_RT_ID \
    --subnet-id $WL_SUBNET_ID

$ aws ec2 --region $REGION \
    create-route \
    --route-table-id $WL_RT_ID \
    --destination-cidr-block 0.0.0.0/0 \
    --carrier-gateway-id $CAGW_ID

11. Create the carrier IP and the elastic network interface (ENI), and associate the IP to the ENI:

$ export WL_CIP_ALLOC_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    allocate-address \
    --domain vpc \
    --network-border-group $WL_ZONE \
    --query 'AllocationId') \
    && echo "WL_CIP_ALLOC_ID=$WL_CIP_ALLOC_ID"

$ export WL_ENI_ID=$(aws ec2 \
    --region $REGION \
    --output text \
    create-network-interface \
    --subnet-id $WL_SUBNET_ID \
    --groups $WL_SG_ID \
    --query 'NetworkInterface.NetworkInterfaceId') \
    && echo "WL_ENI_ID=$WL_ENI_ID"

$ aws ec2 --region $REGION \
    associate-address \
    --allocation-id $WL_CIP_ALLOC_ID \
    --network-interface-id $WL_ENI_ID

12. Create the security group to establish rules: use the same security group created in step 6.

13. Spawn an EC2 instance in the wavelength subnet:

$ aws ec2 --region $REGION \
    run-instances \
    --instance-type t3.xlarge \
    --network-interfaces '[{"DeviceIndex":0,"NetworkInterfaceId":"'$WL_ENI_ID'"}]' \
    --image-id $IMAGE_ID \
    --key-name $KEY_NAME \
    --block-device-mappings file://mapping.json