doi:10.1145/1461928.1461944

Improving Performance on the Internet
anything, networks want to minimize traffic coming into their networks that they don’t get paid for. As a result, peering points are often overburdened, causing packet loss and service degradation.

[…] cut. According to TeleGeography, the cuts reduced bandwidth connectivity between Europe and the Middle East by 75%.8

Internet protocols such as BGP (Border Gateway Protocol, the Inter- […]

[…] mand. Broadband adoption continues to rise, in terms of both penetration and speed, as ISPs invest in last-mile infrastructure. AT&T just spent approximately $6.5 billion to roll out its U-verse service, while Verizon is
spending $23 billion to wire 18 million homes with FiOS (Fiber-optic Service) by 2010.6,7 Comcast also recently announced it plans to offer speeds of up to 100Mbps within a year.3

Demand drives this last-mile boom: Pew Internet’s 2008 report shows that one-third of U.S. broadband users have chosen to pay more for faster connections.4 Akamai Technologies’ data, shown in Figure 1, reveals that 59% of its global users have broadband connections (with speeds greater than 2Mbps), and 19% of global users have “high broadband” connections greater than 5Mbps—fast enough to support DVD-quality content.2 The high-broadband numbers represent a 19% increase in just three months.

A Question of Scale

Along with the greater demand and availability of broadband comes a rise in user expectations for faster sites, richer media, and highly interactive applications. The increased traffic loads and performance requirements in turn put greater pressure on the Internet’s internal infrastructure—the middle mile. In fact, the fast-rising popularity of video has sparked debate about whether the Internet can scale to meet the demand.

Consider, for example, delivering a TV-quality stream (2Mbps) to an audience of 50 million viewers, approximately the audience size of a popular TV show. The scenario produces aggregate bandwidth requirements of 100Tbps. This is a reasonable vision for the near term—the next two to five years—but it is orders of magnitude larger than the biggest online events today, leading to skepticism about the Internet’s ability to handle such demand. Moreover, these numbers are just for a single TV-quality show. If hundreds of millions of end users were to download Blu-ray-quality movies regularly over the Internet, the resulting traffic load would go up by an additional one or two orders of magnitude.
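The arithmetic behind these figures is easy to check. The following Python sketch uses only the numbers quoted above; the 10x to 100x scaling factor for higher-quality video is an assumption for illustration, not a measurement.

```python
# Back-of-the-envelope check of the aggregate bandwidth figures quoted above.

def aggregate_tbps(stream_mbps: float, concurrent_viewers: float) -> float:
    """Aggregate demand, in terabits per second, for a given per-stream
    rate (in Mbps) and number of concurrent viewers."""
    return stream_mbps * 1e6 * concurrent_viewers / 1e12

# A 2Mbps TV-quality stream to 50 million concurrent viewers:
print(aggregate_tbps(2, 50e6))     # 100.0 Tbps, as stated in the text

# Scaling the per-stream rate by 10x-100x (toward higher-quality video)
# scales the aggregate linearly:
print(aggregate_tbps(20, 50e6))    # 1000.0 Tbps
print(aggregate_tbps(200, 50e6))   # 10000.0 Tbps
```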
Figure 1: Broadband penetration by country.

Broadband
Ranking  Country        % > 2Mbps
—        Global         59%
1        South Korea    90%
2        Belgium        90%
3        Japan          87%
4        Hong Kong      87%
5        Switzerland    85%
6        Slovakia       83%
7        Norway         82%
8        Denmark        79%
9        Netherlands    77%
10       Sweden         75%
…
20       United States  71%

Fast Broadband
Ranking  Country        % > 5Mbps
—        Global         19%
1        South Korea    64%
2        Japan          52%
3        Hong Kong      37%
4        Sweden         32%
5        Belgium        26%
6        United States  26%
7        Romania        22%
8        Netherlands    22%
9        Canada         18%
10       Denmark        18%

Source: Akamai’s State of the Internet Report, Q2 2008

Another interesting side effect of the growth in video and rich media file sizes is that the distance between server and end user becomes critical to end-user performance. This is the result of a somewhat counterintuitive phenomenon that we call the Fat File Paradox: given that data packets can traverse networks at close to the speed of light, why does it take so long for a “fat file” to cross the country, even if the network is not congested?

It turns out that because of the way the underlying network protocols work, latency and throughput are directly coupled. TCP, for example, allows only small amounts of data to be sent at a time (that is, the TCP window) before having to pause and wait for acknowledgments from the receiving end. This means that throughput is effectively throttled by network round-trip time (latency), which can become the bottleneck for file download speeds and video viewing quality.

Packet loss further complicates the problem, since these protocols back off and send even less data before waiting for acknowledgment if packet loss is detected. Longer distances increase the chance of congestion and packet loss to the further detriment of throughput.

Figure 2 illustrates the effect of distance (between server and end user) on throughput and download times. Five or 10 years ago, dial-up modem speeds would have been the bottleneck on these files, but as we look at the Internet today and into the future, middle-mile distance becomes the bottleneck.
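The coupling between latency, loss, and throughput can be made concrete with a rough model. The Python sketch below is illustrative only: the window size, round-trip times, and loss rate are assumed values, and the loss-limited bound is the widely cited Mathis et al. approximation rather than anything specific to the systems described here.

```python
from math import sqrt

def window_limited_throughput_mbps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound when a sender can keep at most `window_bytes` in flight:
    at best one full window is delivered per round trip."""
    return window_bytes * 8 / rtt_s / 1e6

def loss_limited_throughput_mbps(mss_bytes: float, rtt_s: float, loss: float) -> float:
    """Mathis et al. approximation for steady-state TCP throughput under a
    random loss rate `loss` (a rule of thumb, not an exact result)."""
    return (mss_bytes * 8 / rtt_s) * (1.22 / sqrt(loss)) / 1e6

# Assumed numbers: a 64KB window over a 5 ms metro path vs. a 100 ms
# cross-country path.
print(window_limited_throughput_mbps(64 * 1024, 0.005))  # ~104.9 Mbps
print(window_limited_throughput_mbps(64 * 1024, 0.100))  # ~5.2 Mbps

# With 0.1% packet loss and a 1,460-byte segment, distance hurts even more:
print(loss_limited_throughput_mbps(1460, 0.005, 0.001))  # ~90 Mbps
print(loss_limited_throughput_mbps(1460, 0.100, 0.001))  # ~4.5 Mbps
```

Either way, the longer the round trip, the lower the achievable throughput, which is exactly the Fat File Paradox described above.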
Figure 2: Effect of distance on throughput and download times.

Four Approaches to Content Delivery

Given these bottlenecks and scalability challenges, how does one achieve the levels of performance and reliability required for effective delivery of content and applications over the Internet? There are four main approaches to distributing content servers in a content-delivery architecture: centralized hosting, “big data center” CDNs (content-delivery networks), highly distributed CDNs, and peer-to-peer networks.

Centralized Hosting. Traditionally architected Web sites use one or a small number of collocation sites to host content. Commercial-scale sites generally have at least two geographically dispersed mirror locations to provide additional performance (by being closer to different groups of end users), reliability (by providing redundancy), and scalability (through greater capacity).

This approach is a good start, and for small sites catering to a localized audience it may be enough. The performance and reliability fall short of expectations for commercial-grade sites and applications, however, as the end-user experience is at the mercy of the unreliable Internet and its middle-mile bottlenecks.

There are other challenges as well: site mirroring is complex and costly, as is managing capacity. Traffic levels fluctuate tremendously, so the need to provision for peak traffic levels means that expensive infrastructure will sit underutilized most of the time. In addition, accurately predicting traffic demand is extremely difficult, and a centralized hosting model does not provide the flexibility to handle unexpected surges.

“Big Data Center” CDNs. Content-delivery networks offer improved […] away from most users and still deliver content from the wrong side of the middle-mile bottlenecks.

It may seem counterintuitive that having a presence in a couple dozen major backbones isn’t enough to achieve commercial-grade performance. In fact, even the largest of those networks controls very little end-user access traffic. For example, the top 30 networks combined deliver only 50% of end-user traffic, and it drops off quickly from there, with a very long tail distribution over the Internet’s 13,000 networks. Even with connectivity to all the biggest backbones, data must travel through the morass of the middle mile to reach most of the Internet’s 1.4 billion users.

A quick back-of-the-envelope calculation shows that this type of architecture hits a wall in terms of scalability as we move toward a video world. Consider a generous forward projection on such an architecture—say, 50 high-capacity data centers, each with 30 outbound connections, 10Gbps each. This gives an upper bound of 15Tbps total capacity for this type of network, far short of the 100Tbps needed to support video in the near term.

Highly Distributed CDNs. Another approach to content delivery is to leverage a very highly distributed network—one with servers in thousands of networks, rather than dozens. On the surface, this architecture may appear quite similar to the “big data center” CDN. In reality, however, it is a fundamentally different approach to content-server placement, with a difference of two orders of magnitude in the degree of distribution.

By putting servers within end-user ISPs, for example, a highly distributed CDN delivers content from the right side of the middle-mile bottlenecks, eliminating peering, connectivity, routing, and distance problems, and reducing the number of Internet components depended on for success. Moreover, this architecture scales. It can achieve a capacity of 100Tbps, for example, with deployments of 20 servers, each capable of delivering 1Gbps, in 5,000 edge locations.

On the other hand, deploying a highly distributed CDN is costly and time consuming, and comes with its own set of challenges. Fundamentally, the network must be designed to scale efficiently from a deployment and management perspective. This necessitates development of a number of technologies, including:

- Sophisticated global-scheduling, mapping, and load-balancing algorithms
- Distributed control protocols and reliable automated monitoring and alerting systems
- Intelligent and automated failover and recovery methods
- Colossal-scale data aggregation and distribution technologies (designed to handle different trade-offs between timeliness and accuracy or completeness)
- Robust global software-deployment mechanisms
- Distributed content freshness, integrity, and management systems
- Sophisticated cache-management protocols to ensure high cache-hit ratios

These are nontrivial challenges, and we present some of our approaches later on in this article.
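One classic building block for the mapping, load-balancing, and cache-management items listed above is consistent hashing, which keeps the content-to-server assignment stable as servers join and leave. The sketch below is a minimal, generic illustration of that technique, not a description of Akamai's actual algorithms; the server names and replica count are invented for the example.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring mapping content URLs to edge servers.

    Each server is placed at several points ("virtual nodes") on a hash
    ring; a URL maps to the first server clockwise from its own hash.
    Adding or removing one server remaps only a small fraction of URLs,
    which keeps cache-hit ratios high as the deployment changes.
    """

    def __init__(self, servers, replicas=100):
        self._ring = []                      # sorted list of (hash, server)
        for server in servers:
            for i in range(replicas):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, url: str) -> str:
        idx = bisect.bisect(self._keys, self._hash(url)) % len(self._ring)
        return self._ring[idx][1]

# Hypothetical edge servers deployed inside an end-user ISP:
ring = ConsistentHashRing(["edge-a.example", "edge-b.example", "edge-c.example"])
print(ring.server_for("http://www.example.com/video/segment-42.ts"))
```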
Using the DIMES Project data sets that describe the structure of the Internet, Chris Harrison of Carnegie Mellon University created this visualization illustrating how cities across the globe are interconnected (by router configuration and not physical backbone). In total, there are 89,344 connections.
Peer-to-Peer Networks. Because a highly distributed architecture is critical to achieving scalability and performance in video distribution, it is natural to consider a P2P (peer-to-peer) architecture. P2P can be thought of as taking the distributed architecture to its logical extreme, theoretically providing nearly infinite scalability. Moreover, P2P offers attractive economics under current network pricing structures.

In reality, however, P2P faces some serious limitations, most notably because the total download capacity of a P2P network is throttled by its total uplink capacity. Unfortunately, for consumer broadband connections, uplink speeds tend to be much lower than downlink speeds: Comcast’s standard high-speed Internet package, for example, offers 6Mbps for download but only 384Kbps for upload (one-sixteenth of download throughput).

This means that in situations such as live streaming where the number of uploaders (peers sharing content) is limited by the number of downloaders (peers requesting content), average download throughput is equivalent to the average uplink throughput and thus cannot support even mediocre Web-quality streams. Similarly, P2P fails in “flash crowd” scenarios where there is a sudden, sharp increase in demand, and the number of downloaders greatly outstrips the capacity of uploaders in the network.

Somewhat better results can be achieved with a hybrid approach, leveraging P2P as an extension of a distributed delivery network. In particular, P2P can help reduce overall distribution costs in certain situations. Because the capacity of the P2P network is limited, however, the architecture of the non-P2P portion of the network still governs overall performance and scalability.

Each of these four network architectures has its trade-offs, but ultimately, for delivering rich media to a global Web audience, a highly distributed architecture provides the only robust solution for delivering commercial-grade performance, reliability, and scale.

Application Acceleration

Historically, content-delivery solutions have focused on the offloading and delivery of static content, and thus far we have focused our conversation on the same. As Web sites become increasingly dynamic, personalized, and application-driven, however, the ability to accelerate uncacheable content becomes equally critical to delivering a strong end-user experience.

Ajax, Flash, and other RIA (rich Internet application) technologies work to enhance Web application responsiveness on the browser side, but ultimately, these types of applications all still require significant numbers of round-trips back to the origin server. This makes them highly susceptible to all the bottlenecks I’ve mentioned before: peering-point congestion, network latency, poor routing, and Internet outages.

Speeding up these round-trips is a complex problem, but many optimizations are made possible by using a highly distributed infrastructure.

Optimization 1: Reduce transport-layer overhead. Architected for reliability over efficiency, protocols such as TCP have substantial overhead. They require multiple round-trips (between the two communicating parties) to set
up connections, use a slow initial rate of data exchange, and recover slowly from packet loss. In contrast, a network that uses persistent connections and optimizes parameters for efficiency (given knowledge of current network conditions) can significantly improve performance by reducing the number of round-trips needed to deliver the same set of data.

Optimization 2: Find better routes. In addition to reducing the number of round-trips needed, we would also like to reduce the time needed for each round-trip—each journey across the Internet. At first blush, this does not seem possible. All Internet data must be routed by BGP and must travel over numerous autonomous networks. BGP is simple and scalable but not very efficient or robust. By leveraging a highly distributed network—one that offers potential intermediary servers on many different networks—you can actually speed up uncacheable communications by 30% to 50% or more, by using routes that are faster and much less congested. You can also achieve much greater communications reliability by finding alternate routes when the default routes break.

Optimization 3: Prefetch embedded content. You can do a number of additional things at the application layer to improve Web application responsiveness for end users. One is to prefetch embedded content: while an edge server is delivering an HTML page to an end user, it can also parse the HTML and retrieve all embedded content before it is requested by the end user’s browser.

The effectiveness of this optimization relies on having servers near end users, so that users perceive a level of application responsiveness akin to that of an application being delivered directly from a nearby server, even though, in fact, some of the embedded content is being fetched from the origin server across the long-haul Internet. Prefetching by forward caches, for example, does not provide this performance benefit because the prefetched content must still travel over the middle mile before reaching the end user. Also, note that unlike link prefetching (which can also be done), embedded content prefetching does not expend extra bandwidth resources and does not request extraneous objects that may not be requested by the end user.

With current trends toward highly personalized applications and user-generated content, there’s been growth in either uncacheable or long-tail (that is, not likely to be in cache) embedded content. In these situations, prefetching makes a huge difference in the user-perceived responsiveness of a Web application.

Optimization 4: Assemble pages at the edge. The next three optimizations involve reducing the amount of content that needs to travel over the middle mile. One approach is to cache page fragments at edge servers and dynamically assemble them at the edge in response to end-user requests. Pages can be personalized (at the edge) based on characteristics including the end user’s location, connection speed, cookie values, and so forth. Assembling the page at the edge not only offloads the origin server, but also results in much lower latency to the end user, as the middle mile is avoided.
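As a concrete illustration of edge assembly, the sketch below stitches cached fragments into a personalized page so that only the uncacheable fragment ever needs to cross the middle mile. It is a simplified, generic sketch: the fragment names, template, and personalization rule are invented, and real deployments typically use a fragment-markup standard rather than hand-written templates.

```python
# Simplified sketch of edge-side page assembly: cached fragments are
# combined with one per-user fragment at the edge server.

FRAGMENT_CACHE = {
    "header": "<header>Example Store</header>",
    "catalog": "<main>...product listing...</main>",
    "footer": "<footer>(c) Example, Inc.</footer>",
}

PAGE_TEMPLATE = "{header}{greeting}{catalog}{footer}"

def fetch_from_origin(fragment: str, user: dict) -> str:
    # Placeholder for the one personalized, uncacheable fragment;
    # in practice this is the only piece fetched across the long haul.
    return f"<p>Hello, {user['name']} in {user['country']}!</p>"

def assemble_page(user: dict) -> str:
    """Assemble a page at the edge from cached fragments plus one
    per-user fragment generated (or fetched) on demand."""
    return PAGE_TEMPLATE.format(
        header=FRAGMENT_CACHE["header"],
        greeting=fetch_from_origin("greeting", user),
        catalog=FRAGMENT_CACHE["catalog"],
        footer=FRAGMENT_CACHE["footer"],
    )

print(assemble_page({"name": "Alice", "country": "US"}))
```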
Optimization 5: Use compression and delta encoding. Compression of HTML and other text-based components can reduce the amount of content traveling over the middle mile to one-tenth of the original size. The use of delta encoding, where a server sends only the difference between a cached HTML page and a dynamically generated version, can also greatly cut down on the amount of content that must travel over the long-haul Internet.

While these techniques are part of the HTTP/1.1 specification, browser support is unreliable. By using a highly distributed network that controls both endpoints of the middle mile, compression and delta encoding can be successfully employed regardless of the browser. In this case, performance is improved because very little data travels over the middle mile. The edge server then decompresses the content or applies the delta encoding and delivers the complete, correct content to the end user.

Optimization 6: Offload computations to the edge. The ability to distribute applications to edge servers provides the ultimate in application performance and scalability. Akamai’s network enables distribution of J2EE applications to edge servers that create virtual application instances on demand, as needed. As with edge page assembly, edge computation enables complete origin server offloading, resulting in tremendous scalability and extremely low application latency for the end user.

While not every type of application is an ideal candidate for edge computation, large classes of popular applications—such as contests, product catalogs, store locators, surveys, product configurators, games, and the like—are well suited for edge computation.

Putting It All Together

Many of these techniques require a highly distributed network. Route optimization, as mentioned, depends on the availability of a vast overlay network that includes machines on many different networks. Other optimizations such as prefetching and page assembly are most effective if the delivering server is near the end user. Finally, many transport and application-layer optimizations require bi-nodal connections within the network (that is, you control both endpoints). To maximize the effect of this optimized connection, the endpoints should be as close as possible to the origin server and the end user.

Note also that these optimizations work in synergy. TCP overhead is in large part a result of a conservative approach that guarantees reliability in the face of unknown network conditions. Because route optimization gives us high-performance, congestion-free paths, it allows for a much more aggressive and efficient approach to transport-layer optimizations.

Highly Distributed Network Design

It was briefly mentioned earlier that building and managing a robust, highly distributed network is not trivial. At Akamai, we sought to build a system with extremely high reliability—no downtime, ever—and yet scalable enough to be managed by a relatively small operations staff, despite operating in a highly heterogeneous and unreliable environment. Here are some insights into the design methodology.

The fundamental assumption behind Akamai’s design philosophy is that a significant number of component or other failures are occurring at all times in the network. Internet systems present numerous failure modes, such as machine failure, data-center failure, connectivity failure, software failure, and network failure—all occurring with greater frequency than one might think. As mentioned earlier, for example, there are many causes of large-scale network outages—including peering problems, transoceanic cable cuts, and major virus attacks.

Designing a scalable system that works under these conditions means embracing the failures as natural and expected events. The network should continue to work seamlessly despite these occurrences. We have identified some practical design principles that result from this philosophy, which we share here.1

Principle 1: Ensure significant redundancy in all systems to facilitate failover. Although this may seem obvious and simple in theory, it can be challenging in practice. Having a highly distributed network enables a great deal of redundancy, with multiple backup possibilities ready to take over if a component fails. To ensure robustness of all systems, however, you will likely need to work around the constraints of existing protocols and interactions with third-party software, as well as balance trade-offs involving cost.

For example, the Akamai network relies heavily on DNS (Domain Name System), which has some built-in constraints that affect reliability. One example is DNS’s restriction on the size of responses, which limits the number of IP addresses that we can return to a relatively static set of 13. The Generic Top Level Domain servers, which supply the critical answers to akamai.net queries, required more reliability, so we took several steps, including the use of IP Anycast.

We also designed our system to take into account DNS’s use of TTLs (time to live) to fix resolutions for a period of time. Though the efficiency gained through TTL use is important, we need to make sure users aren’t being sent to servers based on stale data. Our approach is to use a two-tier DNS—employing longer TTLs at a global level and shorter TTLs at a local level—allowing less of a trade-off between DNS efficiency and responsiveness to changing conditions. In addition, we have built in appropriate failover mechanisms at each level.

Principle 2: Use software logic to provide message reliability. This design principle speaks directly to scalability. Rather than building dedicated links between data centers, we use the public Internet to distribute data—including control messages, configurations, monitoring information, and customer content—throughout our network. We improve on the performance of existing Internet protocols—for example, by using multirouting and limited retransmissions with UDP (User Datagram Protocol) to achieve reliability without sacrificing latency. We also use software to route data through intermediary servers to ensure communications (as described in Optimization 2), even when major disruptions (such as cable cuts) occur.
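A minimal sketch of the "limited retransmissions over UDP" idea in Principle 2 appears below. It is illustrative only: the relay addresses, message format, acknowledgment convention, and retry budget are assumptions for the example, not Akamai's protocol.

```python
import socket

def send_with_limited_retries(message: bytes, relays, retries=2, timeout=0.2):
    """Send `message` over UDP to intermediary relays, retrying a bounded
    number of times while waiting briefly for a short acknowledgment.
    Returns the address of the first relay that acknowledged, or None.

    Unlike TCP, this bounds the added latency: once the retry budget for
    one path is spent, we simply move on to the next relay.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for relay in relays:
            for _ in range(1 + retries):
                sock.sendto(message, relay)
                try:
                    ack, sender = sock.recvfrom(64)
                    if sender == relay and ack == b"ACK":
                        return relay
                except socket.timeout:
                    continue  # timed out; retry, then fall through to next relay
    return None

# Hypothetical intermediary servers on different networks:
relays = [("203.0.113.10", 4000), ("198.51.100.20", 4000)]
print(send_with_limited_retries(b"config-update-17", relays))
```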
Principle 3: Use distributed control for coordination. Again, this principle is important both for fault tolerance and scalability. One practical example is the use of leader election, where leadership evaluation can depend on many factors including machine status, connectivity to other machines in the network, and monitoring capabilities. When connectivity of a local lead server degrades, for example, a new server is automatically elected to assume the role of leader.

Principle 4: Fail cleanly and restart. Based on the previous principles, the network has already been architected to handle server failures quickly and seamlessly, so we are able to take a more aggressive approach to failing problematic servers and restarting them from a last known good state. This sharply reduces the risk of operating in a potentially corrupted state. If a given machine continues to require restarting, we simply put it into a “long sleep” mode to minimize impact to the overall network.

Principle 5: Phase software releases. After passing the quality assurance (QA) process, software is released to the live network in phases. It is first deployed to a single machine. Then, after performing the appropriate checks, it is deployed to a single region, then possibly to additional subsets of the network, and finally to the entire network. The nature of the release dictates how many phases and how long each one lasts. The previous principles, particularly use of redundancy, distributed control, and aggressive restarts, make it possible to deploy software releases frequently and safely using this phased approach.

Principle 6: Notice and proactively quarantine faults. The ability to isolate faults, particularly in a recovery-oriented computing system, is perhaps one of the most challenging problems and an area of important ongoing research. Here is one example. Consider a hypothetical situation where requests for a certain piece of content with a rare set of configuration parameters trigger a latent bug. Automatically failing the servers affected is not enough, as requests for this content will then be directed to other machines, spreading the problem. To solve this problem, our caching algorithms constrain each set of content to certain servers so as to limit the spread of fatal requests. In general, no single customer’s content footprint should dominate any other customer’s footprint among available servers. These constraints are dynamically determined based on current levels of demand for the content, while keeping the network safe.

Practical Results and Benefits

Besides the inherent fault-tolerance benefits, a system designed around these principles offers numerous other benefits.

Faster software rollouts. Because the network absorbs machine and regional failures without impact, Akamai is able to safely but aggressively roll out new software using the phased rollout approach. As a benchmark, we have historically implemented approximately 22 software releases and 1,000 customer configuration releases per month to our worldwide network, without disrupting our always-on services.

Minimal operations overhead. A large, highly distributed, Internet-based network can be very difficult to maintain, given its sheer size, number of network partners, heterogeneous nature, and diversity of geographies, time zones, and languages. Because the Akamai network design is based on the assumption that components will fail, however, our operations team does not need to be concerned about most failures. In addition, the team can aggressively suspend machines or data centers if it sees any slightly worrisome behavior. There is no need to rush to get components back online right away, as the network absorbs the component failures without impact to overall service.

This means that at any given time, it takes only eight to 12 operations staff members, on average, to manage our network of approximately 40,000 devices (consisting of more than 35,000 servers plus switches and other networking hardware). Even at peak times, we successfully manage this global, highly distributed network with fewer than 20 staff members.

Lower costs, easier to scale. In addition to the minimal operational staff needed to manage such a large network, this design philosophy has had several implications that have led to reduced costs and improved scalability. For example, we use commodity hardware instead of more expensive, more reliable servers. We deploy in third-party data centers instead of having our own. We use the public Internet instead of having dedicated links. We deploy in greater numbers of smaller regions—many of which host our servers for free—rather than in fewer, larger, more “reliable” data centers where congestion can be greatest.

Conclusion

Even though we’ve seen dramatic advances in the ubiquity and usefulness of the Internet over the past decade, the real growth in bandwidth-intensive Web content, rich media, and Web- and IP-based applications is just beginning. The challenges presented by this growth are many: as businesses move more of their critical functions online, and as consumer entertainment (games, movies, sports) shifts to the Internet from other broadcast media, the stresses placed on the Internet’s middle mile will become increasingly apparent and detrimental. As such, we believe the issues raised in this article and the benefits of a highly distributed approach to content delivery will only grow in importance as we collectively work to enable the Internet to scale to the requirements of the next generation of users.

References
1. Afergan, M., Wein, J., LaMeyer, A. Experience with some principles for building an Internet-scale reliable system. In Proceedings of the 2nd Conference on Real, Large Distributed Systems 2. (These principles are laid out in more detail in this 2005 research paper.)
2. Akamai Report: The State of the Internet, 2nd quarter, 2008; http://www.akamai.com/stateoftheinternet/. (These and other recent Internet reliability events are discussed in Akamai’s quarterly report.)
3. Anderson, N. Comcast at CES: 100 Mbps connections coming this year. ars technica (Jan. 8, 2008); http://arstechnica.com/news.ars/post/20080108-comcast-100mbps-connections-coming-this-year.html.
4. Horrigan, J.B. Home Broadband Adoption 2008. Pew Internet and American Life Project; http://www.pewinternet.org/pdfs/PIP_Broadband_2008.pdf.
5. Internet World Statistics. Broadband Internet Statistics: Top World Countries with Highest Internet Broadband Subscribers in 2007; http://www.internetworldstats.com/dsl.htm.
6. Mehta, S. Verizon’s big bet on fiber optics. Fortune (Feb. 22, 2007); http://money.cnn.com/magazines/fortune/fortune_archive/2007/03/05/8401289/.
7. Spangler, T. AT&T: U-verse TV spending to increase. Multichannel News (May 8, 2007); http://www.multichannel.com/article/CA6440129.html.
8. TeleGeography. Cable cuts disrupt Internet in Middle East and India. CommsUpdate (Jan. 31, 2008); http://www.telegeography.com/cu/article.php?article_id=21528.

Tom Leighton co-founded Akamai Technologies in August 1998. Serving as chief scientist and as a director to the board, he is Akamai’s technology visionary, as well as a key member of the executive committee setting the company’s direction. He is an authority on algorithms for network applications. Leighton is a Fellow of the American Academy of Arts and Sciences, the National Academy of Science, and the National Academy of Engineering.

A previous version of this article appeared in the October 2008 issue of ACM Queue magazine.

© 2009 ACM 0001-0782/09/0200 $5.00