The Shortcut Guide to Optimized WAN Application Delivery
Ed Tittel
Introduction
For several years now, Realtime has produced dozens and dozens of high-quality books that just
happen to be delivered in electronic format—at no cost to you, the reader. We’ve made this
unique publishing model work through the generous support and cooperation of our sponsors,
who agree to bear each book’s production expenses for the benefit of our readers.
Although we’ve always offered our publications to you for free, don’t think for a moment that
quality is anything less than our top priority. My job is to make sure that our books are as good
as—and in most cases better than—any printed book that would cost you $40 or more. Our
electronic publishing model offers several advantages over printed books: You receive chapters
literally as fast as our authors produce them (hence the “realtime” aspect of our model), and we
can update chapters to reflect the latest changes in technology.
I want to point out that our books are by no means paid advertisements or white papers. We’re an
independent publishing company, and an important aspect of my job is to make sure that our
authors are free to voice their expertise and opinions without reservation or restriction. We
maintain complete editorial control of our publications, and I’m proud that we’ve produced so
many quality books over the past years.
I want to extend an invitation to visit us at http://nexus.realtimepublishers.com, especially if
you’ve received this publication from a friend or colleague. We have a wide variety of additional
books on a range of topics, and you’re sure to find something that’s of interest to you—and it
won’t cost you a thing. We hope you’ll continue to come to Realtime for your educational needs
far into the future.
Until then, enjoy.
Don Jones
Copyright Statement
© 2008 Realtime Publishers, Inc. All rights reserved. This site contains materials that
have been created, developed, or commissioned by, and published with the permission
of, Realtime Publishers, Inc. (the “Materials”) and this site and any such Materials are
protected by international copyright and trademark laws.
THE MATERIALS ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice
and do not represent a commitment on the part of Realtime Publishers, Inc. or its web site
sponsors. In no event shall Realtime Publishers, Inc. or its web site sponsors be held
liable for technical or editorial errors or omissions contained in the Materials, including
without limitation, for any direct, indirect, incidental, special, exemplary or consequential
damages whatsoever resulting from the use of any information contained in the Materials.
The Materials (including but not limited to the text, images, audio, and/or video) may not
be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any
way, in whole or in part, except that one copy may be downloaded for your personal, non-
commercial use on a single computer. In connection with such use, you may not modify
or obscure any copyright or other proprietary notice.
The Materials may contain trademarks, service marks, and logos that are the property of
third parties. You are not permitted to use these trademarks, service marks, or logos
without prior written consent of such third parties.
Realtime Publishers and the Realtime Publishers logo are registered in the US Patent &
Trademark Office. All other product or service names are the property of their respective
owners.
If you have any questions about these terms, or if you would like information about
licensing materials from Realtime Publishers, please contact us via e-mail at
[email protected].
Chapter 1
[Editor's Note: This eBook was downloaded from Realtime Nexus—The Digital Library. All
leading technology guides from Realtime Publishers can be found at
http://nexus.realtimepublishers.com.]
We often refer to the speed of light—the absolute maximum transmission speed—as imposing a
fundamental latency. For example, even with a “perfect” network link operating under ideal conditions
between the United States and India, there is roughly 60ms of one-way latency just to account for the
speed of light across the total distance and media traversed from “here” to “there.” This becomes glaringly obvious
in satellite communications, where the round trip to a single satellite typically adds a half-second to
latency, and where multiple satellite hops can add as much as 2 to 3 seconds to overall delays from
sender to receiver (and back again).
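As a back-of-the-envelope check on these figures, consider the following Python sketch. The route distances and signal speeds are rough illustrative assumptions (light travels at roughly 200,000 km/s in glass fiber and roughly 300,000 km/s in free space), not measured values:

    SPEED_IN_FIBER_KM_S = 200_000    # light in glass fiber, roughly 2/3 of c
    SPEED_IN_VACUUM_KM_S = 300_000   # free-space (satellite) links

    def one_way_delay_ms(distance_km, speed_km_s):
        # Propagation delay only; ignores serialization, queuing, and routing.
        return distance_km / speed_km_s * 1000

    # Terrestrial fiber path, US to India (assumed ~12,000 km route).
    print(f"US-India one way: {one_way_delay_ms(12_000, SPEED_IN_FIBER_KM_S):.0f} ms")

    # Geosynchronous satellite: ~35,786 km up plus ~35,786 km down, each direction.
    one_way_km = 2 * 35_786
    rtt_ms = 2 * one_way_delay_ms(one_way_km, SPEED_IN_VACUUM_KM_S)
    print(f"Satellite round trip: {rtt_ms:.0f} ms")   # roughly half a second

No amount of added bandwidth changes these numbers; they are the floor beneath every other source of delay discussed in this chapter.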
• Serialization delay refers to the amount of time it takes to convert an n-bit wide signal
into a corresponding series of individual bit values for transmission across a network
medium. Thus, for example, if data shows up in 8-bit bytes at an interface, serialization
involves stripping all bits from each byte in some specific order, then emitting each one
in that order onto the network medium on the sending side (the process works in reverse on the
receiving end). Short of increasing the signal clock used to synchronize transmissions
from sender to receiver (which again involves a hardware upgrade), serialization delay
remains a constant source of latency in networked communications. A worked example
follows this list.
• Queue delay refers to how long a message element must “stand in line” to wait its turn
for media access, and generally applies when transmissions must traverse a router or a
switch of some kind. Here again, latency depends on the type of hardware used to store
and forward network transmissions, as well as its queuing capacity. When link
oversubscription occurs, in fact, sufficient congestion can arise to make users think that
they have “run out of bandwidth.”
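The serialization arithmetic mentioned above is simple enough to verify directly. In this Python sketch, the frame size and link rates are illustrative assumptions:

    def serialization_delay_ms(frame_bytes, link_bps):
        # Time to clock every bit of one frame onto the medium.
        return frame_bytes * 8 / link_bps * 1000

    frame = 1500    # bytes; a full Ethernet payload
    for name, bps in [("T1 (1.544 Mbps)", 1_544_000),
                      ("100 Mbps LAN", 100_000_000),
                      ("1 Gbps LAN", 1_000_000_000)]:
        print(f"{name}: {serialization_delay_ms(frame, bps):.3f} ms per frame")

At any given clock rate the per-frame cost is constant, which is why only a hardware upgrade reduces it.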
Generically, latency is a measure of the time that any one part of a system or network spends
waiting for another portion to catch up or respond to communication activity. Latency describes any
appreciable delay between stimulus and response. Such delays occur throughout virtually all
operational aspects of any given computing environment, but not all forms of latency are perceptible
in human terms. Once latency is introduced into any computing environment—within a system or
network—its cause must be removed, mitigated, or reduced to improve performance.
Latency is measured in a number of ways, including one-way transit time from sender to receiver
as well as round-trip time (often the most useful measure of latency because a complete
transaction from sender to receiver invariably involves transmission of a request of some kind
from sender to receiver, followed by delivery of a response to the request back from the receiver
to the sender). Round-trip latency also offers the advantage that it can be measured at any single
point on a network. On a complex, far-flung network, in fact, variations in round-trip latency
between minimum and maximum values may be more interesting from a quality control standpoint
than average round-trip latency, because those users subject to maximum round-trip latency will
be those whose experience of a system’s responsiveness and efficiency is worst.
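A crude way to sample round-trip latency from a single point is to time a TCP three-way handshake, as in the following Python sketch. The target host and port are assumptions; substitute any reachable service. This is no substitute for ICMP ping or a dedicated probe, but it needs no special privileges:

    import socket
    import time

    def tcp_rtt_ms(host, port=80, samples=5):
        times = []
        for _ in range(samples):
            start = time.perf_counter()
            with socket.create_connection((host, port), timeout=5):
                pass                   # connect() returns after roughly one round trip
            times.append((time.perf_counter() - start) * 1000)
        return min(times), max(times), sum(times) / len(times)

    lo, hi, avg = tcp_rtt_ms("example.com")
    print(f"RTT min/max/avg: {lo:.1f} / {hi:.1f} / {avg:.1f} ms")

The min-to-max spread is often more telling than the average: users subject to the maximum are the ones whose experience suffers most.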
But how do you accommodate latency incurred by the needs of users who are accessing
resources cross-country or across the globe? Or to further compound the problem, how can you
accommodate protocols that may require multiple round-trips to satisfy a single service request?
What if a given protocol format specification doesn’t directly provide any means for protocol
acceleration or traffic prioritization?
Consequently, network designers and implementers have had to consider performance and
latency from a radically different perspective, as the networking landscape has shifted to include
more services and applications, each with its own unique operational parameters and specific
properties. Mobile workers, remote offices, and distant partnerships are an important aspect of
this brave new networking world, and they demand acceptable performance for a diverse set of
applications, platforms, and users. And when the end-user experience suffers or degrades,
network designers must shoulder the inevitable blame that follows in its wake. For remote users
and remote applications, the Internet is your WAN; therefore, Internet latency reduction should
be a key factor when choosing any WAN optimization solution.
We’ll use the term end-user experience throughout this guide. End users serve as our barometer
for the overall status and well-being of any business network, because they drive business operations and
experience the most severe penalties whenever performance lags or falters. Thus, the “end-user
experience” encompasses all aspects of their interactions with a company, its services, and its
products. To deliver a truly superior user experience requires seamless integration among multi-
disciplinary platforms and services, and a holistic, end-to-end view of networks that sets and
manages user expectations and carefully monitors and manages network behavior and performance.
Traditional network protocols favor short, bursty communications and chatty information
exchanges along short, transitory paths—and include no formal concept of traffic shaping or
prioritization. This includes well-known application support protocols such as NetBIOS as well
as various distributed file services—most notably Microsoft’s Common Internet File System
(CIFS). Because of the erratic nature of so-called “best effort” delivery mechanisms and limited
consideration for careful consumption of WAN resources, it becomes extremely difficult to
predict the demands of the network and how much operational capacity may be necessary at any
given moment. When sufficient resources are unavailable, only buffering can help offset
demand—but this is only a stopgap measure and not a true resolution. Ultimately, enough bursty
traffic at any given moment produces severe contention for network resources and introduces
difficulties for modern globally distributed network designs, no matter how deliberately they
might have been over-provisioned when initially specified and introduced.
Existing network services and applications operate primarily in terms of simple data exchanges
with generally short message lengths and durations. HTTP is a prime example: it is notoriously chatty
and exchanges numerous small bits of data (text files, graphics, style sheets, and so forth) to
accommodate the client-server request-response cycle. Consequently, any good WAN
optimization strategy seeks to address this issue, often by batching multiple requests into a single
transmission, and doing likewise for all the responses produced to answer those requests. Lots of
other client-server applications likewise employ protocols that utilize “chatty” response
mechanisms and produce large amounts of traffic on a per-request basis. This works perfectly
well on a local network in most cases, but remote Web-based applications and high-latency
protocols typically suffer from performance degradation when employed across long-haul
networks, particularly when large numbers of such traffic streams must share the same WAN
links. Typically, these applications and services utilize parameters within the Transmission
Control Protocol/Internet Protocol (TCP/IP) framework for session initiation, management, and
tear-down.
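The round-trip cost of chattiness is easy to demonstrate. The Python sketch below issues the same requests twice, once over a fresh TCP connection per request and once over a single persistent connection, as HTTP keep-alive does; the hostname is an assumption, so point it at a server you control. On a high-latency WAN link, the gap between the two timings grows with every added round trip:

    import http.client
    import time

    HOST, N = "example.com", 10

    start = time.perf_counter()
    for _ in range(N):                 # naive: a new connection per request
        conn = http.client.HTTPConnection(HOST, timeout=10)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()
    naive = time.perf_counter() - start

    start = time.perf_counter()
    conn = http.client.HTTPConnection(HOST, timeout=10)
    for _ in range(N):                 # reuse: one handshake for all requests
        conn.request("GET", "/")
        conn.getresponse().read()
    conn.close()
    reused = time.perf_counter() - start

    print(f"{N} requests, new connection each time: {naive:.2f} s")
    print(f"{N} requests, one persistent connection: {reused:.2f} s")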
Then, too, it is not uncommon for enterprises to recognize that as the level of WAN traffic
increases, it becomes ever more necessary to regulate which protocols and applications may
access the WAN and to what degree. Detailed inspection of protocol distributions for such traffic
may, for example, reveal the presence of unauthorized and unwanted peer-to-peer (P2P)
protocols such as BitTorrent, FreeNet, or KaZaA, which typically do not play an official role on
enterprise networks and can be blocked at the gateway without affecting critical services or
applications.
However, many of the protocols for important services and applications built atop TCP/IP lack
native traffic prioritization schemes (or fail to exploit any such properties that TCP/IP may offer
to developers) to alleviate some of the traffic burden so typical of streaming protocols and short-
duration bursts of activity. This leaves the network medium exposed to saturation issues because
both short- and long-term protocol sessions coexist in the same resource space with no real
differentiation by importance or priority.
TCP/IP is the protocol framework that defines the most prominent types of network interaction but is
by no means the only format. Many Internet-based transactions occur via TCP/IP with some foreign
or little-used protocols encapsulated as payloads. Our focus throughout this guide is primarily geared
toward higher prioritization and enhanced performance in the existing TCP/IP framework. It’s crucial
to understand the operational parameters and properties of TCP/IP to properly design, implement,
and utilize performance-enhancing programs, platforms, and procedures. One of the very best books
to help guide you into this subject is Geoff Huston’s Internet Performance Survival Guide (Wiley
Computer Publishing, 2000, ISBN: 0471378089); despite its publication date, it offers the best in-
depth analysis of WAN application and protocol behaviors we know of in print.
Throughput efficiency spirals downward as more applications, services, and end users
share and increasingly occupy the same medium. Additional in-line network appliances and
routing devices only increase the congestion burden because inherent performance issues are not
directly addressed but compounded instead. And as the distance between end users and
applications increases, some network designers optimistically assume they can create a
“one-size-fits-all” network solution for most scenarios. That assumption fails when it comes
to serious WAN optimization, where an understanding of the various factors that come into play
is needed, and where different situations dictate different optimization approaches.
Research firm Gartner uses the terminology of application and vendor silos to explain that
networking professionals are responsible for delivering more than just the bits and bytes
involved in networked communications, and must also be sensitive to the quality of the end-user
experience. Typically, an application or service is viewed as a standalone silo, which implies a
severely limited reach or range of capability. The goal then becomes to find a
common language, both in the real world and in the world of technology, so that network architects
and application architects can exchange information about overall performance and behavior. By
way of explanation, an information silo is any management system incapable of reciprocal
interaction with other related management systems. This, too, directly impacts and influences the
end-user experience because it means that the systems used to monitor and assess that experience
cannot provide a single, coherent, end-to-end view of that experience. This works against
developing a broader understanding of network conditions and behavior, and often has to be
forcibly offset by creating mechanisms to deliver measurements and analysis of the real end-user
experience and of end-to-end activity and response times. In many cases, enterprises find
themselves forced to add a set of probes or agents to deliberately create (or simulate) end-user
activity, just so it can be measured and monitored. WAN optimization solutions can also offer
such information because measuring and managing response time is such a key component in
making such technology function properly.
A system is considered a silo if it cannot exchange information with other related systems within its
own organization, or with the management systems of its customers, vendors, or business partners.
The term silo is a pejorative expression used to describe the absence of operational reciprocity.
The original networking model plays well for LANs, where latency does not typically play a
significant role in communication delays. Local transmissions often happen without much
perceptual lag and usually have no substantial impact on throughput, because distances are kept to
a minimum and technologies may operate at or near maximum speeds. Scale this into much
larger, more modern networking environments and these seemingly insignificant factors soon
impose significant hindrances on site-to-site communications.
This institutionalized chaos also puts forward a strong case for introducing quality or class of
service mechanisms into enterprise networks. These mechanisms assign protocols, applications,
and services to specific well-defined categories and offer priority access to WAN bandwidth
for mission-critical or time-sensitive applications according to such category assignments.
In turn, a quality of service scheme can downgrade less time-sensitive data transfers so that
they fit themselves in around such higher-priority traffic. It is important to recognize when using
these methods that overall bandwidth and throughput for the entire network usually degrades
slightly because of the processing overhead involved in classifying and organizing traffic by
class or type of service. But the end-user experience for important or time-sensitive
services and applications should indeed improve: why else would one abandon best-effort delivery in
favor of priority mechanisms? At the same time, performance for less time-sensitive, lower-
priority traffic will actually degrade; but if the right protocols, services, and applications are
relegated to this category, the overall end-user experience should not change for them, or at
least not too noticeably.
There is another, arguably more important, issue around iterative improvements to application
delivery: going up the protocol stack. The bit-pushing aspects of the network infrastructure have
seen endless improvement at Layers 1 through 3, and even some at Layer 4 for UDP and TCP.
Thus, though there is little left to optimize below Layer 4, there is ample opportunity to resolve
issues at the application layer. The reason this hasn’t already happened is the complexity of
managing application layer issues at line speeds. However, such work is crucial in situations
where performance remains a constant problem and no further iterative improvements are
available at Layers 4 and below.
The distance between points A and B now spans regions, territories, and continents. What was
once a network environment easily managed by a few on-site personnel has expanded to
long-haul linkages between distant end-points. Many of these connections involve the use of
satellite communications, making the problem of network management increasingly difficult,
thanks to delays that can stretch to several seconds as soon as one or more geosynchronous
satellites enter the latency equation.
A Wide Area Network (WAN) is any computer-based communications network that covers a broad area, such
as links across major metropolitan, regional, or national territorial boundaries. Informally, a WAN is a
network that uses routers and public Internet links (or, in some cases, private and expensive leased
lines). WANs are the new-age baling wire that ties separate LANs together, so that
users and computers in one location can communicate with those in another location, and so that all
can share in the use of common resources, services, and applications.
Services and applications include complex transactions that occur among many multi-tiered
applications, and employ multiple server and service hierarchies. “One size fits all” approaches
and “end-all be-all/mother-of-all” methodologies simply cannot apply. Network designers are
forced to re-examine their approaches to design and implementation best practices and
reconsider what they know works best, as a shifting business landscape dictates new and
different operational parameters and workplace requirements to meet changing or increasing
business needs. “Add more bandwidth” is neither practical nor the panacea that it once was,
because not all forms of delay are amenable to cure by increasing WAN throughput.
Figure 1.1: Typical enterprise architectures involve long-haul links between data and operations centers.
Here, New York provides the servers and the computing facilities, while LA and Tokyo drive business
activity.
As Figure 1.1 is meant to illustrate, many typical enterprise network architectures connect distant
users (in Tokyo and Los Angeles, in this case) with data centers and other centralized resources
(in NYC as depicted). Without careful attention to how the WAN links between the user centers
and the datacenter get used, enterprises can easily find themselves woefully short on bandwidth,
and dolefully long on response time. The end user experience can’t help but suffer in such
situations, without careful forethought, intelligent design, and judicious use of WAN
optimization tools and technologies.
Multiple carriers and operators control various portions of the intervening infrastructure between
sites (and also between computing resources and mobile users as well), which itself introduces
trust and control issues. The diverse nature of hardware and software operating platforms and the
introduction of dispersed and diverse employee bases—and all related permissions, parameters,
and properties—create a complex management nightmare. It’s all too easy to inherit problems
associated with other platforms through newly-formed partnerships, or via mergers and
acquisitions. As operational wants, needs and concerns are addressed for some given platform in
a particular environment, they may still prove insufficient when applied to the much larger
context in a business computing environment that encompasses many platforms, technologies,
and computing strategies.
Anyone can control operational behavior in site-specific and localized contexts, but no single
entity or organization can expect to completely control behavior from end to end. The
management nightmare only worsens when the public Internet becomes involved rather than
leased-line, site-to-site connections. As the old saying goes, “Jack of all trades is master of none,” and this
holds truest in a global network context. Personnel may adapt and acclimate, and even become
adept at handling site-to-site procedures for a specific set of operational conditions and criteria.
Users and network staff may even get used to poor performance because that is the only kind
they’ve ever experienced for certain applications. But when you introduce a global computing
context for business networking, such expectations and specific adaptive behaviors developed in
response to particular conditions will soon go by the wayside.
In essence, this phenomenon explains why the one-size/method-fits-all approach falls so
drastically short of meeting business goals. Local and specific solutions targeting only packet
delivery issues cannot and never will cover WAN application and management needs; instead,
they create only dysfunctional infrastructures that require remediation or redesign.
Interestingly, it’s also sometimes the case that SOA-based applications load significant amounts
of data that particular users may not need at any given moment. The nature of Web delivery is
such that a page typically won’t finish loading until all its data is available (transferred to the
client), which can impose increasingly onerous delays when that data is unnecessary or irrelevant
to the task at hand. This is another situation where the caching that WAN optimization devices
provide can reduce delays, because as long as a local cached copy is current, it can be supplied at
LAN speeds to users rather than forcing them to wait for transfer of that information across the
WAN.
The notion of service contracts within an SOA context is similar to though distinctly different
from the kinds of service level agreements, or SLAs, with which network professionals are
already familiar—namely, packet delivery (latency, bandwidth, and loss). SOA service contracts
involve application uptime, data coherency, and schema compliance in addition to response time.
Network SLAs, in contrast, deal with link uptimes, packet losses, and average sustained
throughput levels. An application that compiles data from multiple sources throughout a data
center and across a distributed enterprise or global network may fail to meet higher-level
business service requirements. This occurs when a network imposes delays in getting access to
the distributed data that SOA is supposed to compile and present in new and silo-busting ways.
These kinds of applications must, however, be understood and prioritized within the overall
application and services context, lest they rob precious bandwidth needed to process orders,
synchronize databases, ferry payroll and payment information, and all the other critical business
transactions that go straight to the bottom line (or not, as the case may be).
A Service Oriented Architecture (SOA) is a computer systems architectural style for creating and
using business processes, packaged as services, throughout their life cycle. SOA also defines and
provisions the IT infrastructure to allow different applications to exchange data and participate in
business processes.
A Service Level Agreement (SLA) is the part of a service contract in which the level of service is
formally defined and negotiated between two participating parties. This contract exists between
customers and a service provider, or between separate service providers, and documents a common
understanding about services, priorities, responsibilities, and guarantees (collectively, the level of
service).
Related access issues can be logically divided into three separate elements: internal users
accessing distant internal applications (internal to internal); internal users accessing distant
external applications (internal to external); and external users accessing distant internal
applications (external to internal). We omit coverage of the final case: external users accessing
distant external applications, because those types of communications are not normally addressed
with WAN optimization and are outside the scope of this guide. Figure 1.2 diagrams what we’ve
just described; afterward, we take an individualized look at each of these different operational
perspectives to see how they apply to and affect business processes related to WAN optimization.
Figure 1.2: Different ways to involve the WAN between users and applications.
Figure 1.3: When an outside supplier has parts ready for delivery to the fulfillment center, it interacts with an
automated ordering process to inform the center of pending delivery.
In general, scaling applications and services to make most effective use of WAN links means
choosing protocols and services that behave more reasonably when using such links where
possible. Alternatively, it means using technologies that act as local proxies for chatty protocols
and services, while implementing more efficient, less chatty replacement protocols and services
across wide area links in the background.
When making WAN-friendly protocol or service choices isn’t possible, it becomes necessary to
use local proxies to accommodate chatty, bursty services and applications. Then organizations
can repackage and reformat WAN communications to behave more responsibly, to communicate
less frequently, and to make best possible use of bandwidth when data must actually traverse a
WAN link. This is also a case where shared cache data (identical information elements
maintained in mirrored sets of storage at both ends of a WAN link) can speed communications
significantly, because pairs of devices with such caches can exchange cache references (which
may require only hundreds of bytes of data to be exchanged) rather than shuttling the actual data
between sender and receiver (which may require exchanges at megabyte to gigabyte ranges).
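The following Python sketch illustrates that cache-reference idea under stated assumptions: both ends keep chunk stores keyed by SHA-256 digest, so a sender ships a 32-byte reference for any chunk the receiver already holds and full bytes only for new ones. Real appliances use rolling checksums and disk-backed dictionaries; the fixed chunk size and in-memory stores here are purely illustrative:

    import hashlib

    CHUNK = 4096

    def to_chunks(data):
        return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

    def encode(data, store):
        # Produce a wire stream of ('ref', digest) or ('raw', bytes) records.
        out = []
        for chunk in to_chunks(data):
            digest = hashlib.sha256(chunk).digest()    # 32-byte reference
            if digest in store:
                out.append(("ref", digest))            # ~32 bytes on the wire
            else:
                store[digest] = chunk
                out.append(("raw", chunk))             # full chunk on the wire
        return out

    def decode(stream, store):
        parts = []
        for kind, value in stream:
            if kind == "raw":
                store[hashlib.sha256(value).digest()] = value
                parts.append(value)
            else:
                parts.append(store[value])             # rebuild from local cache
        return b"".join(parts)

    sender, receiver = {}, {}
    payload = b"design-data " * 10_000

    first = encode(payload, sender)                    # cold caches: mostly raw
    assert decode(first, receiver) == payload          # warms the receiver too

    second = encode(payload, sender)                   # warm caches: all refs
    wire = sum(len(v) if k == "raw" else 32 for k, v in second)
    print(f"repeat transfer: {wire} bytes on the wire vs {len(payload)} raw")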
Most organizations consider private WANs to present internal-to-internal application delivery issues,
though it remains a subject of debate whether this view strictly holds for semi-public
MPLS. It is definitely true for point-to-point links such as frame relay. Internal-to-internal
acceleration is the basis for the WAN optimization market as it currently stands, and today’s
CIFS/MAPI issues will become the intranet/portal issues of tomorrow.
These same observations hold true for private networks: expect on the order of 120ms of delay for
internal-to-internal configurations where long-distance transmission is involved.
Figure 1.4: When the California HQ operation needs to exchange design information with a Hong Kong
partner, it uses a shared network link to ferry that data across the Pacific.
In practice, this means that applications and services should be designed to minimize back-and-
forth communications, and to stuff as much data as possible into messages whenever they must
move between sender and receiver. Thus, as shown in Figure 1.4, when the HQ operation needs
to share design plans with its Hong Kong partner, the mechanisms employed to manage and
ensure their delivery must work as quickly and efficiently as possible, so that individual file
transfers proceed rapidly, and so that transmission errors or failures can neither abort nor
severely slow down or damage key data files and information. Here again, this requires judicious
selection and use of protocols and services optimized for WAN situations.
Figure 1.5: A client in Tokyo accesses the corporate Web site in the New York office to access messaging
and financial services applications
Soon someone somewhere will notice that the existing state of affairs on this Internet-based
WAN is insufficient in its operational capacity, limitations, and speed. The
ability to provide a fast and safe connection to all users and applications efficiently and
effectively, regardless of workstation and location, will also prove problematic. This is
particularly evident where no strategies are yet in place to integrate the Internet into the existing
WAN topology beyond a basic VPN-based approach.
CIFS/Server Message Block (SMB), Real-Time Streaming Protocol (RTSP), VoIP, HTTPS, and
various other protocols all present significant challenges for application monitoring,
measurement, and optimization. Additionally, Secure Sockets Layer (SSL) and HTTPS
acceleration is necessary to enhance speed and security, especially when traffic must traverse
Internet and WAN links.
• Instant, predetermined compression and encryption of data before distribution across the
WAN. WAN optimization devices employ sophisticated hardware compression and
encryption to make sure that the communications that actually traverse WAN
links are both compact and as indecipherable to unauthorized third parties as modern
technology will allow (a compression sketch follows this list).
• Data caching. Byte caching is challenging when you consider that a majority of objects,
by count and by size, are too small to benefit from it; unless the same bytes happen to
appear in exactly the same order, which is highly unlikely on a contended network, byte
caching won’t improve performance. The only improvement for large (that is, video) and
small (that is, Web page) object performance comes through an object cache. Byte caching
is designed for CIFS and MAPI optimizations, where it continues to perform best.
Object caching, however, often delivers the most dramatic improvements in performance
when WAN optimization techniques are properly employed.
• Establishment of data delivery priorities based on users, applications, and processes.
WAN optimization technology lets enterprises determine which kinds of traffic get to
jump to the front of the queue and obtain the best quality of service or service level
guarantees. This not only helps to make effective use of WAN links and bandwidth, it
also helps to ensure that end-user experiences are as positive as their priority ranking and
assigned importance will allow.
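To make the compression point concrete, here is a minimal Python sketch using zlib as a stand-in for whatever hardware compression an appliance applies (encryption, such as TLS, would wrap the result in a real deployment); the repetitive sample payload is an illustrative assumption:

    import zlib

    payload = b"INVOICE,2008-01-15,ACME Corp,net-30,USD\n" * 2000
    compressed = zlib.compress(payload, level=9)

    ratio = len(compressed) / len(payload)
    print(f"raw: {len(payload)} bytes, compressed: {len(compressed)} bytes "
          f"({ratio:.1%} of original)")
    # Repetitive business traffic often compresses dramatically; encrypted or
    # already-compressed payloads (JPEG, ZIP) will not, which is why appliances
    # compress before they encrypt, never after.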
If you don’t measure and model your network infrastructure through a well-constructed service
commitment, SLA breaches may go undetected. Reasonable expectations cannot be stated or met
in terms of the service commitment when informal or ad-hoc service architectures are in place.
Enforcement of said commitments becomes an infeasible and impractical proposition.
If you don’t monitor the end-user experience, end-user perception and end-to-end response time,
unpleasant surprises lie in wait on the bumpy network path ahead. Expectations can neither be
defined nor met without a formal understanding of these performance properties. Ultimately, it’s
the end user who suffers the most with an indirect but significant impact on business flow.
Protocol optimization requires in-depth protocol knowledge to accelerate end-user response time
and enhance serially oriented network requests. Optimization strategies can better anticipate user
requests by understanding the intricacies of how certain protocols function natively on
the LAN, and how they can function better across the WAN. Applications that use serialized
requests (HTTP and CIFS, for example) and traditionally “chatty” applications (RPC, RTSP) or
those designed for LAN environments (CIFS, MAPI) achieve considerable performance
gains by bundling or short-circuiting transactions, or by using pre-fetch techniques to
anticipate upcoming requests and data transfers. Essentially this translates into batching up
groups of related requests on one side of the WAN link, then doing likewise for related responses
on the other side of the WAN link. It also involves use of proxies to carry on conversations
locally for chatty protocols, then switching to bulk transfer and communication mechanisms
across the WAN to lower the amount of back-and-forth traffic required across such links.
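A minimal Python sketch of the pre-fetch idea follows. Assuming sequential block reads, as file-transfer protocols such as CIFS often issue, a proxy that speculatively fetches a batch of upcoming blocks turns many WAN round trips into a few; the block numbering and fetch function are hypothetical:

    READ_AHEAD = 4

    class PrefetchProxy:
        def __init__(self, fetch_batch, read_ahead=READ_AHEAD):
            self.fetch_batch = fetch_batch    # callable that crosses the WAN once per call
            self.cache = {}
            self.read_ahead = read_ahead

        def read(self, block):
            if block not in self.cache:       # miss: fetch this block plus a batch ahead
                self.cache.update(self.fetch_batch(block, 1 + self.read_ahead))
            return self.cache.pop(block)

    wan_round_trips = 0

    def fetch_batch(start, count):
        global wan_round_trips
        wan_round_trips += 1                  # one WAN round trip per batch
        return {n: f"block-{n}".encode() for n in range(start, start + count)}

    proxy = PrefetchProxy(fetch_batch)
    for n in range(20):                       # sequential reads, as CIFS often issues
        proxy.read(n)
    print(f"20 sequential reads satisfied with {wan_round_trips} WAN round trips")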
Networking professionals ultimately inherit the responsibility of promoting service and
performance levels because IT and information management systems are inherently
problematic. Remote Windows branch office servers have proven unmanageable, and IT
governance doesn’t mean the same thing to all people. Many organizations use spur-of-the-
moment processes that are either too loosely or too rigidly adhered to, often concentrating efforts
on the wrong aspects and failing to focus on key operational factors that make the IT process
work efficiently. Oftentimes, there’s no surefire direction or method of approach to ensure the
right aspects of performance are maintained at reasonable levels. Sometimes this results in the
end user pointing the accusative finger of blame directly at those hard-working network
professionals.
Shortly thereafter follows all kinds of server proliferation as an interim solution that proves
equally unmanageable. Many of these so-called solutions still require manual intervention to
operate and maintain, which is neither a model of efficiency nor an environment where innovation
can thrive. Router blades for Domain Name Services (DNS), Dynamic Host Configuration Protocol
(DHCP), and Remote Authentication Dial-In User Service (RADIUS) largely rely on the data
professional delivering these goods over time. Print, file, and services delivery are also an
integral component of this unmanageably complex nightmare.
Moreover, these services are not integrated into routers because it’s the optimal architectural
place for them—the performance issues inherent in hosting high-level services in a store-and-
forward appliance are obvious. Such services are integrated into routers because there is a
profound organizational desire to have network administrators manage them, and for that
purpose, there is no better obvious placement.
Get involved in the application and protocol format: deconstruct the entire application and
analyze its format to perform protocol optimization and manipulation. It requires a keen
programmer’s insight—well, almost—and fundamental understanding of protocol topics to
design, implement and deliver the appropriate optimization solution.
Summary
This chapter lays the foundation for WAN concepts and components, with an emphasis on
enhancing and optimizing their operation. Layering WAN topologies over LAN technologies
decreases performance in a dramatic and discernible way. There are
methods of monitoring, measuring, and modifying operational aspects of WAN technologies to
improve the end-user experience and alleviate strain on potentially overworked networking
professionals. In the next chapter, we’ll adopt a more focused perspective on the types of routing
protocols, processes, and procedures used to address these performance issues.
Chapter 2
This chapter uses the term routing in a broadly defined, generally applicable way. This usage is
entirely different from the more specific term router, which is effectively a TCP/IP Layer 3 device.
Request for Comment (RFC) 1983 defines routing as “The process of selecting the correct interface
and next hop for a packet being forwarded.” That’s really what this guide is all about—finding and
using the next best hop to ensure secure, timely, and/or qualitative delivery of network data, and
optimizing traffic across hops that involve wide area network (WAN) links.
The science of routing is the process of identifying connective pathways along which to deliver
data between subnets or external network sources, using a variety of logical and algorithmic
techniques. It is the directional flow of datagram or packet traffic from source to destination
according to some defined passageway that is typically specified through administratively
managed memory-resident routing tables. A router selects the correct interface from its available
routing table and determines the next hop along which to forward a packet. Similar network
address structures (closely related numeric values) imply proximity within a network, even for
WAN-spanning connections. The process of accessing the Internet through a WAN connection is
depicted in Figure 2.1.
Figure 2.1: When moving a packet across a WAN link, the router picks it up from some internal interface,
then forwards it out an external interface, which typically delivers the packet into an “Internet cloud.” At the
destination side, the same packet eventually arrives at the router’s external interface for forwarding into an
internal LAN.
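The forwarding decision sketched in Figure 2.1 boils down to a longest-prefix match of the destination address against a routing table, followed by hand-off to the chosen next hop. The following Python sketch shows the mechanics; the prefixes and next-hop interface names are illustrative assumptions:

    import ipaddress

    ROUTES = [                                # (prefix, next-hop interface)
        (ipaddress.ip_network("10.0.0.0/8"),    "wan0"),
        (ipaddress.ip_network("10.20.0.0/16"),  "eth1"),
        (ipaddress.ip_network("10.20.30.0/24"), "eth2"),
        (ipaddress.ip_network("0.0.0.0/0"),     "wan0"),   # default route
    ]

    def next_hop(dest):
        addr = ipaddress.ip_address(dest)
        matches = [(net, hop) for net, hop in ROUTES if addr in net]
        net, hop = max(matches, key=lambda m: m[0].prefixlen)  # longest prefix wins
        return hop

    for dest in ("10.20.30.40", "10.20.99.1", "192.0.2.7"):
        print(dest, "->", next_hop(dest))
    # 10.20.30.40 -> eth2, 10.20.99.1 -> eth1, 192.0.2.7 -> wan0 (default)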
A packet is the basic unit on any TCP/IP-based packet-switched pathway. It is a formatted block of
information that includes protocol fields, headers, trailers, and optional payloads. Protocol properties
parameterize how a packet is to be handled and delivered. They also include putative identities (in
the form of IP addresses) for both sender and recipient stations, error-control information, message
payload, and optional routing characteristics, all of which we discuss in more detail later in this
chapter.
A packet can be a complete unit in itself or part of some larger ongoing communication between
endpoints. Computer communications links that do not support packets, such as traditional point-
to-point telecommunications links, simply transmit data as a series or stream of bytes,
characters, or bits. TCP/IP networks handle such links with relative ease by providing reversible
encodings to enable them to be transited using native formats, then retransformed back into
packet-based traffic on the other side of such links. Also, TCP/IP networks chop up large data
sequences into smaller packets for transmission and logically group data according to the DoD
network reference model, which creates four layers populated with various protocol definitions.
Imagine a router as the mail room where a busy postal clerk constantly rushes deliverables
between senders and recipients. Envision each packet as an envelope full of mail circulating
around the globe; for many fleeting moments throughout the day, this busy mail clerk processes such
items. Now consider that some mail has higher priority than the rest and is marked accordingly to
reflect its status. That mail will be processed with more attention to delivery timeframes than
other pieces of mail, so it may very well “jump the line” or receive other special handling along
its way.
Also consider that certain pieces of mail are too big (in size, shape, or weight) to fit into a single
envelope or reasonably large box, so their contents are broken into a larger number of smaller,
simpler packages and elements, and sent in multiple bits and pieces. Perhaps some of these items
are marked “fragile” or “one of many,” indicating other special handling or delivery
considerations. In essence, these parcels specify special handling characteristics that are
dealt with by other post office personnel who may handle them at some point during their trip
from sender to receiver. From a simplified perspective, this model is analogous to packet routing.
Alas, this is where the router-as-a-mailman analogy ends and a more accurate definition of
routing prevails. The analogy breaks down because network routing is far more complex than
what mail courier services encounter and endure. Packets possess a vast variety of protocol
properties and parameters that influence their handling and delivery throughout the routing
process, enough to swamp mere human minds but well within the primitive (but more
calculating) capabilities of the kinds of computer “brains” present in modern high-speed
networking gear (switches, routers, and so forth).
A router can itself be a computer or some functional equivalent that is used to interconnect two
or more network segments. It operates at Layer 3 of the OSI reference model, routing traffic
through network segments so as to move it toward the final destination to which it is addressed.
A router accomplishes this task by interpreting the network (Layer 3) address of every packet it
receives to make an algorithm-based decision about the next interface to which that packet must
be delivered.
The pathways along which packets travel may be static or dynamic. Static routes use pathways
that are explored, negotiated, and established in advance, before traffic proceeds across them,
whereas dynamic routes are made and used as needed, in keeping with parameters presented
within packets in motion, or based on data included in connection requests that last only as long
as they’re needed. Either way, a router must keep up with changes to network topology, route
availability, traffic conditions, and other factors that can influence if, when, and how quickly
traffic moves across pathways accessible through specific interfaces. Figure 2.2 shows a
simplified routing grid, with cost factors applied for paths to networks A through E.
Figure 2.2: Routers must track and keep up with path cost factors to understand how to forward specific
types of packets for transmission across the WAN, symbolized by the light blue cylinder at the picture’s
center.
Outside the border is also where big delays kick in (WAN links are invariably far slower than
LAN links, and public pathways likewise slower than private ones, if only because of higher
utilization and traffic volumes) and where traffic gets more expensive to move. This
phenomenon helps to explain much of the appeal inherent to WAN optimization, an appeal that
stems from reductions in traffic achieved through all sorts of clever techniques, including
protocol proxies, caching, shared symbol and data dictionaries, and more.
Throughout the remainder of this chapter, several references will be made to the concept of an
autonomous system (AS)—a collection or collective group of IP networks and routers under control of
a common administration with common routing policies. An official definition can be found in RFC
1930 at http://tools.ietf.org/html/rfc1930. An AS may reside inside the network boundary and operate
within its borders. Anything outside the border is usually under somebody else’s control, though that
routing domain is probably also an AS. But exterior routing requires consensus to operate and
adherence to common rules and requirements to use.
IGP is a routing protocol used within an AS to determine reachability between endpoints within that
system. In its distance-vector form, IGP identifies available pathways through advertisement of
routing locations (in relation to other locations that likewise advertise themselves). When IGP uses a
link-state-oriented protocol, each node possesses complete network topology information for all
available pathways. Both distance-vector and link-state concepts are described shortly.
However, RIP quickly shows its crippling limitations within any sizable network environment.
Chief among its inadequacies is a non-negotiable 15-hop limitation, which severely restricts
the operational capacity and logical expanse of WAN topologies. RIP also cannot handle
variable-length subnet masks (VLSM), a problem for an ever-shrinking IP address space. RIP
routers also periodically advertise full routing tables, a major and unnecessary consumer of
available bandwidth—another major blemish for WAN topologies. Convergence on RIP
networks occurs slowly, with routers enduring a period of hold-down and garbage
collection before expiring information that has not been recently refreshed—also inappropriate
for large-scale networks, particularly slow links and WAN clouds.
From a network management perspective, RIP possesses no concept of network delays and link
costs, and therefore provides no resolution for these issues. Routing decisions are entirely hop
count-based, even when alternative paths offer better aggregate link bandwidth or lower latency.
Also problematic is the fact that RIP network topologies are uncharacteristically flat, with no
concept of containment boundaries or logically divided areas. RIP networks fall drastically
behind without Classless Inter-Domain Routing (CIDR) capability or the use of link aggregation
and route summarization.
A second version, RIPv2, seeks to address several shortcomings and glaring omissions of its
predecessor but still possesses the 15-hop limitation and slow convergence, both of which
disqualify it for modern large-scale network environments. As is usually the
case, technological innovation designed by human inspiration has a way of besting the most
difficult of challenges. RIP also describes the most basic kind of operation that involves WAN
optimization, in that it is most often applied between pairs of devices across a single, specific
link, where the parties on each side of a WAN connection have thorough or exhaustive
knowledge of everything they need to know about what’s on the “other side” of that WAN link.
This might be viewed as a paragon of static routing, in that much of what WAN optimization
devices can do depends on knowing the ins and outs of operations and characteristics on both
sides of the WAN link, and of taking steps based on that knowledge to limit the use of that WAN
link as much as such knowledge will permit.
What is meant by link-state? Consider a link as any interface on the WAN router. The state of that link
describes the interface and its relationship to nearby routers; this state description includes its IP
address, subnet mask, network connection type, interconnected routers, and so forth. Collectively,
this information forms a link-state database, described later.
RIP is a distance-vector protocol, which means that it uses hop count to select the shortest route
to a destination network. RIP always uses the lowest hop count, regardless of the speed or
reliability of the underlying network links. OSPF is a link-state protocol, meaning it can
algorithmically consider a variety of link-related conditions when determining the best path to a
network destination, including speed and reliability properties. Furthermore, OSPF has no hop
limitation, and routers can be added to the network as necessary, making it well suited to large,
scalable enterprise WAN environments.
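The difference between the two metrics is easy to see on a toy topology. The following Python sketch runs the same shortest-path search twice over one assumed graph, once counting hops (as RIP does) and once summing link costs (as OSPF does), and the two choose different paths. Dijkstra’s algorithm is used here purely to compare metrics; RIP actually converges via a distributed distance-vector exchange, and the graph and costs are illustrative assumptions:

    import heapq

    # node -> {neighbor: OSPF link cost}; a slow direct link vs a fast two-hop path
    GRAPH = {
        "A": {"B": 100, "C": 1},      # A-B is a slow serial link (high cost)
        "B": {"A": 100, "C": 1},
        "C": {"A": 1, "B": 1},        # C connects A and B with fast links
    }

    def shortest(graph, src, dst, metric):
        # Dijkstra over either metric: constant 1 per hop, or the link cost.
        heap, seen = [(0, src, [src])], set()
        while heap:
            dist, node, path = heapq.heappop(heap)
            if node == dst:
                return dist, path
            if node in seen:
                continue
            seen.add(node)
            for nbr, cost in graph[node].items():
                heapq.heappush(heap, (dist + metric(cost), nbr, path + [nbr]))

    print("RIP  (hops):", shortest(GRAPH, "A", "B", metric=lambda c: 1))
    print("OSPF (cost):", shortest(GRAPH, "A", "B", metric=lambda c: c))
    # RIP picks the one-hop A-B path regardless of its speed;
    # OSPF picks A-C-B because its summed cost (2) beats the direct link (100).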
OSPF also provides several other enhancements missing from its predecessors, RIP
versions 1 and 2. OSPF has unlimited hop count, VLSM capability, and uses IP multicast to send
link-state updates as they occur, which reduces network noise. Routing changes are propagated
almost immediately, so OSPF converges faster than RIP. OSPF allows for better load
balancing, enables the logical definition of networks (with routers divided into areas), and limits
network-wide delivery of link-state updates. Password-secured route authentication, external
route tagging for an AS, and aggregate routing are also advantages OSPF has over RIP.
What is meant by convergence? From a network routing perspective, convergence is essentially the
combination and merging of advertised routes and route updates from all available sources of such
information (other routers). When we say that RIP converges more slowly than OSPF, we mean that it
takes longer to propagate updates through a collection of RIP routers because each update goes
through hold-off and garbage collection periods that time out and delete stale information more slowly
than is the case in OSPF.
The OSPF protocol format is specified in RFC 2328, which you can find by pointing your favorite
online browser to http://www.ietf.org/rfc/rfc2328.txt.
BGP and OSPF interaction is all spelled out in RFC 1403—BGP OSPF Interaction. You can read up
on this subject at http://www.ietf.org/rfc/rfc1403.txt.
BGP exchanges routing information for the Internet and acts as the adhesive protocol between
Internet Service Providers (ISPs). Customer and client networks (such as university or corporate
enterprise networks) will usually employ an IGP (such as RIP or OSPF, where the former
suffices for small, simple networks and the latter becomes necessary for larger, more complex
ones) for internal routing exchanges. These customers and client networks then connect to ISPs
that use BGP to exchange customer/client and ISP routes. When BGP is utilized between ASs,
the protocol is referred to as External BGP (EBGP). When an ISP uses BGP to exchange
routes within a single AS, it’s called Interior BGP (IBGP).
BGP is a robust, reliable, and scalable routing protocol capable of handling tens of thousands of
routes via numerous route parameters called attributes that define routing policies and maintain a
stable routing environment. Classless Inter-Domain Routing (CIDR) and route aggregation (to
reduce routing table size) are two prominent features of BGP version 4, as widely used on the
Internet. Route aggregation is a technique used to conserve address space and limit the amount of
routing information that must be advertised to other routers. From a conceptual viewpoint, CIDR
takes a block of contiguous class C addresses and represents them in an abbreviated and
concatenated numerical form.
BGP offers capabilities and scale that go well beyond current WAN optimization technology,
which seldom scales to embrace systems by the thousands, let alone in larger numbers.
Nevertheless, BGP’s facilities for aggregating traffic, managing complex routes, and reducing
addressing and traffic complexity have provided important models for WAN optimization
techniques, albeit at a smaller scale.
Figure 2.3: In general, queuing priority works by inspecting incoming packets for internal CoS or QoS
identifiers, then depositing those packets into any of a number of priority queues. The ways in which queues
are serviced, and how long each queue gains exclusive access to the attached network interface, determine
how each of the preceding queuing disciplines is implemented.
Data packets are scheduled on the network through a series of queue service disciplines used to
determine service priority, delay bounds, jitter bounds, and bandwidth allocation. Each queue is
assigned a certain weight indicative of the amount of its guaranteed capacity. Among these
disciplines, the Weighted Round Robin (WRR) technique can be shown mathematically to provide
the most reasonable performance both in guaranteeing bandwidth and in achieving fairness
requirements. WRR, however, fails to accommodate some end-to-end delay requirements and
jitter bounds, and thus may not be suitable for time-sensitive streaming traffic such as video or
voice.
When discussing QoS, the terms service priority, delay bounds, jitter bounds, and bandwidth
allocation all describe properties of queue service disciplines. Service priority is the precedence value
given to specific application, service, or protocol traffic. Delay bounds specify predetermined
operational latency values, whereas jitter bounds specify a predefined range of transmission signal
variance. Bandwidth allocation is the amount of traffic (or range of signal frequency) provisioned on a
given transmission medium.
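The following Python sketch shows WRR servicing in miniature: each pass drains up to weight packets from every queue, so guaranteed shares hold over time, but nothing bounds how long an individual packet waits, which is why WRR can fall short for jitter-sensitive voice or video. Queue contents and weights are illustrative assumptions:

    from collections import deque

    queues = {
        "voice": (deque(f"v{i}" for i in range(6)), 3),   # (packets, weight)
        "web":   (deque(f"w{i}" for i in range(6)), 2),
        "bulk":  (deque(f"b{i}" for i in range(6)), 1),
    }

    def wrr_schedule(queues):
        sent = []
        while any(q for q, _ in queues.values()):
            for name, (q, weight) in queues.items():
                for _ in range(min(weight, len(q))):      # serve up to weight per round
                    sent.append(q.popleft())
        return sent

    print(" ".join(wrr_schedule(queues)))
    # v0 v1 v2 w0 w1 b0 v3 v4 v5 w2 w3 b1 ... voice gets half of each full round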
QoS does confer an ability to apply priority levels for various applications, users, or data flows,
and to guarantee a certain level of performance for a specific data flow. Individual requirements
can be guaranteed for bit rates, delay, jitter, packet drop probability, and error rate. Such
guarantees may be necessary where network capacity is insufficient to accommodate any and all
traffic using best-effort delivery (no QoS, no priority) particularly for real-time streaming
multimedia applications such as VoIP, IP-TV, or other fixed bit-rate, time-sensitive protocols.
QoS mechanisms can be instrumental to improving performance anywhere network capacity is
limited and multiple protocols are in use, particularly when some of that traffic takes precedence
over the rest or where exceeding certain delay thresholds may make such traffic unusable or the
user experience unacceptable.
Many branch office routers support various forms of QoS and will allow network administrators
to apply traffic-shaping policies to network flows both inbound and outbound. This can help to
ensure that business-critical applications perform acceptably as long as sufficient bandwidth is
available to them.
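At the host edge, marking traffic so that such policies can classify it is nearly a one-line affair. This Python sketch sets the IP TOS byte to DSCP EF (Expedited Forwarding, decimal 46), the conventional marking for voice; whether routers honor the mark depends entirely on configured policy. Linux is assumed (some platforms restrict this socket option), and the destination address is hypothetical:

    import socket

    DSCP_EF = 46                        # Expedited Forwarding code point
    TOS_EF = DSCP_EF << 2               # DSCP occupies the upper 6 bits of TOS

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_EF)
    sock.sendto(b"rtp-voice-frame", ("192.0.2.10", 5004))   # hypothetical endpoint
    sock.close()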
Available tools to establish QoS between a service provider and a subscriber may include a
contractual Service Level Agreement (SLA) that specifies guarantees for network or protocol
performance, throughput, or latency values. These guarantees are typically based on mutually
agreed upon measures and enforced through traffic prioritization.
SLAs are discussed briefly in Chapter 1. For more information about SLAs in general, please visit the
SLA Information Zone at http://www.sla-zone.co.uk/.
At this point, we’ve coursed through the evolution of network topology-aware protocols that
work within defined parameters (or perimeters, if you prefer) of network boundaries. These
protocols use their existing knowledge of network topology to make instantaneous decisions
about how to handle packet transmissions, including when and where to make delivery. Such
protocols can be encapsulated one within another, in fact, wrapped in layers of enveloping
protocol data much like a set of nested Russian Matryoshka dolls. Ultimately, however, some
outer-layer tag, label, or value helps to set priority and instructs a router how to handle the
contents whenever it encounters a non-empty queue for some network interface.
WAN Optimization techniques often prove surprisingly helpful as organizations seek to
implement class or quality of service mechanisms for their network traffic. Of course, because
such priorities weigh most heavily on traffic that crosses WAN links, there’s a definite and
beneficial synergy between QoS mechanisms and WAN optimization. On the one hand, QoS
seeks to make sure that the most important and deserving traffic gets an appropriate share of
WAN bandwidth and is subject to the lowest possible latencies. On the other hand, WAN
optimization seeks to compress, compact, and reduce the amount of data that actually has to
traverse WAN links between specific pairs of senders and receivers. Thus, WAN optimization
often helps to impose and enforce all kinds of traffic policy, including class or quality of service,
while also providing the means for companies and organizations to make the best use of
whatever WAN bandwidth is available to them.
We’ve covered the many traditional, time-honored protocols introduced to enhance routing
performance through a variety of techniques, tactics, and technological approaches. Let’s now
turn to more modern protocols that significantly raise the stakes for high-performance
routing.
Figure 2.4: When installed, WAN optimization devices typically sit between the LAN and the boundary router
(or most properly, on the stream of traffic destined for any WAN links inside the boundary router).
Circuit switching is an early communications technology, designed for analog phone networks and
later modified to use digital switching, that establishes dedicated connections between sender and
receiver. Packet switching is its successor: it carries all forms of communication in digital packets
and serves as the primary transmission method for the Internet and other digital networks. A
datagram-driven service is one where the individual packets that comprise entire messages are sent
individually across the transmission medium.
The original motivation for MPLS was to support construction of simple but extremely fast
network switches so that IP packets could be forwarded as quickly as available high-speed
technologies will permit. This approach keeps traffic continually on the move and requires little
or no intermediate storage in slower queues where traffic must pause and wait for its allotted
service interval. MPLS also supports multiple service models and can perform traffic
management on the fly.
Figure 2.5: WAN optimization devices provide ingress and egress services for Layer 1 through Layer 3
protocols.
Figure 2.5 shows how service elements (which might include boundary routers and WAN
optimization devices) can provide ingress and egress for services at Layers 1 through 3 of the
ISO/OSI model, along with access for streaming or time-sensitive services such as voice, video,
and so forth. In an MPLS environment, traffic essentially flows from an ingress service element
to some corresponding egress service element through an IP/MPLS core architecture where only
MPLS labels need to be inspected and managed as traffic flows through a core network cloud.
Here, the cloud analogy is a good one because IT professionals lose substantial visibility into and
access to what is going on in the IP/MPLS core, but in exchange obtain better traffic
management and much faster transit through that core.
MPLS prefixes packets that enter the cloud at any ingress point with an MPLS header, which
contains one or more 32-bit MPLS label fields (because multiple labels may be affixed, this data
structure is called a label stack). Each label is constructed as follows:
• 20-bit label value
• 3-bit QoS field (actually better described as a prioritized CoS scheme, though this
field is still commonly called QoS)
• 1-bit bottom of stack flag (if set, indicates the current label is the bottom of the stack)
• 8-bit time to live (TTL) field
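For illustration, here is a minimal Python sketch that packs and unpacks one such 32-bit label entry using exactly the field layout just listed; the sample label value, CoS value, and TTL are hypothetical.

import struct

def pack_label(label, qos, bottom, ttl):
    """Pack the four MPLS label fields into one 32-bit entry:
    20-bit label | 3-bit QoS/CoS | 1-bit bottom-of-stack | 8-bit TTL."""
    word = ((label & 0xFFFFF) << 12 |
            (qos & 0x7) << 9 |
            (bottom & 0x1) << 8 |
            (ttl & 0xFF))
    return struct.pack("!I", word)  # network byte order

def unpack_label(data):
    (word,) = struct.unpack("!I", data)
    return {
        "label":  (word >> 12) & 0xFFFFF,
        "qos":    (word >> 9) & 0x7,
        "bottom": (word >> 8) & 0x1,
        "ttl":    word & 0xFF,
    }

# Hypothetical label value 18001, CoS 5, bottom of stack set, TTL 64.
entry = pack_label(18001, 5, 1, 64)
assert unpack_label(entry)["label"] == 18001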
MPLS-labeled packets can be switched from an incoming to an outgoing port based on a simple
label lookup rather than requiring a lookup into a routing table for IP addresses (a more complex,
compute-intensive operation). Such lookups can be performed while the packet is moving
through a switch fabric rather than requiring the attention of a separate CPU. Entry and exit
points for MPLS networks are called Label Edge Routers (LERs). These devices push MPLS
labels onto packets as they enter the cloud, then strip them off when they leave the cloud. In the
core, routers that forward traffic purely on the basis of the MPLS label are called Label Switch
Routers (LSRs), though an LSR may push a second (or additional) label onto a packet with an
MPLS label from an LER already affixed.
Labels are distributed among LERs and LSRs using a special Label Distribution Protocol (LDP).
LSRs in MPLS networks periodically exchange label and reachability data according to standard
algorithms, permitting them to maintain a complete map of the network paths they may use to
forward packets according to their labels. When a labeled MPLS packet hops from one MPLS
router to another, it is said to be traversing an MPLS tunnel. Label Switch Paths (LSPs) may also
be configured in an MPLS network to support network-based IP virtual private networks (IP
VPNs) or to move traffic across specific paths in the network. In many ways, LSPs resemble
permanent virtual circuits (PVCs) in Frame Relay or ATM networks, although they do not
require specific Layer 2 technologies to be at their disposal.
When an unlabeled packet enters an LER to transit the MPLS cloud, the LER determines that
packet’s forwarding equivalence class (FEC) and pushes one or more labels onto the packet’s
freshly created label stack. This is also where QoS/CoS regimes may be applied so as to expedite
high-priority traffic. Once the label stack is complete, the LER passes the packet on to the next-
hop router. When an MPLS router receives a labeled packet, the topmost label in the stack is
examined. Depending on its contents, one of the following operations is performed:
• Swap—The topmost label is switched out for a new label, and the packet gets forwarded
along the associated path for that label.
• Push—A new label is pushed on top of the stack, on top of the existing label, thereby
encapsulating that packet inside another layer of MPLS information. This technique
supports hierarchical routing for MPLS packets, and explains how MPLS VPNs operate
within the MPLS cloud (the core sees only relevant path information, and only VPN
service routers deal with private traffic data).
• Pop—The topmost label is removed from the label stack, which may reveal another label
beneath it (when this occurs, it is called decapsulation). If it is the bottom label in the
stack, the packet will no longer be traversing an MPLS tunnel on its next hop and is
leaving the MPLS cloud behind. Perforce this step is usually handled at an egress router
(LER). Sometimes, when an LER handles many MPLS tunnels, the MPLS router one hop
before the LER (the penultimate hop) may pop the final label(s) to relieve the LER of the
processing involved in cleaning up the label stack.
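These three operations are simple enough to model directly. The following Python sketch treats a label stack as a list (top of stack last) and applies swap, push, and pop just as described; all label values are hypothetical.

# Minimal model of MPLS label stack operations; top of stack is the
# end of the list. All label values below are hypothetical.
def swap(stack, new_label):
    stack[-1] = new_label      # replace topmost label; forward on its path

def push(stack, new_label):
    stack.append(new_label)    # encapsulate: one more layer of MPLS info

def pop(stack):
    label = stack.pop()        # decapsulate: may reveal a label beneath
    leaving_cloud = not stack  # empty stack: packet exits the MPLS tunnel
    return label, leaving_cloud

stack = []
push(stack, 18001)             # ingress LER pushes the outer transport label
push(stack, 24)                # an LSR pushes a second (e.g., VPN) label
swap(stack, 31)                # a core LSR swaps the topmost label en route
label, done = pop(stack)       # pops 31; 18001 remains, so still in a tunnel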
While MPLS traffic remains in the cloud, the contents of such packets are completely ignored,
except for the contents of the MPLS label stack. Even then, transit routers (LSRs) typically
work only with the label at the top of the stack, and forwarding occurs based on label content only.
This explains how MPLS operates independently of other routing protocols and the routing
tables they require as well as the well-known IP longest prefix match performed at each hop in a
conventional IP router.
Successful implementation of QoS/CoS for MPLS depends on its ability to handle multiple
services and to manage traffic priority and flow thanks to extremely quick label inspection and
label stack operations. MPLS can be especially helpful when service providers or enterprises
want to impose service level requirements for specific classes of service so that low-latency
applications such as voice over IP (VoIP) or video teleconferencing can count on acceptable
levels of latency and jitter for traffic on the move.
That said, MPLS carriers differ in the number of classes of service they offer (up to a maximum
of 8 different classes, as dictated by the QoS field size). Specific features, service guarantees,
and pricing for classes of service also differ from carrier to carrier.
Where such traffic isn’t blocked completely, it may prove useful to limit bandwidth to some ridiculously
small value—for example, 1 Kbps. Doing so will keep connections active long enough for
administrators to document them and, if necessary, drop in on offenders to remind them about
acceptable use policy requirements and possible repercussions for violating them.
Figure 3.1: Blue links indicate “fat WAN pipes” between HQ and Regional hubs, red links “skinny WAN
pipes” between Regional hubs and branch offices. Not shown: remote access links into all hubs!
But as the underpinnings of WAN technology have continued to evolve, a hierarchical, tunnel-
based approach can come to be seen as an impairment rather than an improvement. Given the
flexibility, scalability, and performance available from more modern cloud architectures (which don’t
need implicit or explicit hierarchy to function), the hub-and-spoke model can pose problems when
changes in relationships, traffic patterns, or even work assignments overload WAN links at the
periphery. Organizations and enterprises have found themselves scrapping hub-and-spoke
architectures in favor of MPLS clouds, because these allow them faster access between arbitrary
pairs of endpoints, and because additional carrying capacity can be laid on (or taken off) as
changing traffic patterns and needs dictate.
The concept of screen scraping is still utilized to harvest information in useful ways. Web
scraping, a modern-age variant, generically describes any of several methods to extract content
from Web sites to reformat or transform content into another context. Example scraper
applications may scour retail sites—all coded in various languages and differently formatted—in
search of books, cookware, and electronics categorized and indexed for online bargain hunters.
Figure 3.2 shows a screen scraper at work, harvesting text from a Web browser and depositing
it in a database.
Figure 3.2: A screen scraper operates a browser window just so it can harvest text on display there.
Screen scraping applications make excellent candidates for WAN optimization because they can
fall prey to inefficiencies that WAN optimization tools address quite readily. First, they produce
regular streams of character data that inevitably benefit from compression but also may benefit
from dictionary and string caching capabilities. Second, screen scraping applications may utilize
inefficient protocols, involve frequent communications, and be subject to “chatty” behavior.
When such traffic is properly repackaged through proxy agents, WAN optimization tools can help with all these
shortcomings. But most important, the sheer doggedness of screen scraping as a technique for
grabbing data when no other means is available shows us that clever programming techniques
can also be applied when seeking to optimize WAN traffic, even if only at the level of brute
force via compression or protocol streamlining.
Data-bearing servers operate on a simple principle of supply and demand: clients make requests,
and servers respond to them. But Internet-facing servers that service client requests are especially
prone to overload during peak operating hours and heavy network loads. Such
peak load periods create data congestion or bottlenecking across the connection that can cause
server instability and eventual system failure, resulting in downtime. Bandwidth throttling is
used as a preventive method to control the server’s response level to any surges in client requests
throughout peak hours of the day.
In February of 2008, members of the Federal Communications Commission announced they
might consider establishing regulations to discourage Internet providers from selectively
throttling bandwidth from sites and services that would otherwise consume large amounts. In late
2007, Comcast actively interfered with some of its high-speed Internet subscribers using file-
sharing clients and protocols by throttling such connections during peak hours (and only for
uploads). This sparked a controversy that continues to this day.
Organizations can (and should) use bandwidth throttling or firewall filters to limit or block traffic
that explicitly violates Acceptable Use Policy. But otherwise, bandwidth-throttling is best
applied in the form of Class of Service or Quality of Service (CoS/QoS) markers applied to
various types or specific instances of network traffic. CoS and QoS represent classification
schemes for network traffic that give priority to time-sensitive and mission-critical traffic rather
than limiting a specific type of traffic explicitly. Many experts recommend that unauthorized
or unwanted protocols be throttled to extremely low levels of bandwidth (under 10 Kbps) rather
than blocked completely, so as to give network administrators an opportunity to ferret out and
deal with users or programs involved. Thus, for example, by limiting bandwidth available to
peer-to-peer protocols such as BitTorrent (used for video and other personal media downloads)
or FastTrack (the Kazaa protocol) to only 5K or 10K bits per second, administrators may have
time to identify the workstations or servers acting as endpoints for related peer-to-peer activities,
and identify the individuals involved in their use. They can then counsel or discipline users as
per prevailing acceptable use policies (AUP).
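Such throttling is commonly implemented with a token bucket: tokens accrue at the permitted rate, and a packet may be forwarded only when enough tokens are on hand. Here is a minimal Python sketch that caps a flow at 10 Kbps in the spirit of the recommendation above; the rate and burst values are illustrative.

import time

class TokenBucket:
    """Throttle a flow to rate_bps bits per second (sketch only)."""
    def __init__(self, rate_bps, burst_bits):
        self.rate = rate_bps
        self.capacity = burst_bits
        self.tokens = burst_bits
        self.last = time.monotonic()

    def consume(self, n_bits):
        now = time.monotonic()
        # Refill tokens at the configured rate since the last check.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n_bits:
            self.tokens -= n_bits
            return True   # packet may be forwarded now
        return False      # caller should delay or drop the packet

# Limit a hypothetical peer-to-peer flow to 10 Kbps, 1,500-byte burst.
bucket = TokenBucket(rate_bps=10_000, burst_bits=1500 * 8)
if bucket.consume(1200 * 8):
    pass  # forward the 1,200-byte packet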
In the same vein, encryption and security protocol acceleration tools become resource-intensive
but utterly necessary burdens, especially when sensitive traffic must traverse Internet links. Even
the most widely used protocol on the Internet—namely, HTTP—may be described as both chatty
(involving frequent communications) and bursty (involving numerous periods during which tens
to hundreds of resource requests may be in flight on the network at any given moment). The
protocol trace shown in Figure 3.3 indicates that the display of a single Web page involves back-
and-forth exchange of information about a great many elements over a short period of time (12
showing on the sample trace, with more out of sight below).
Figure 3.3: A single Web page fetch can spawn tens to hundreds of HTTP “Get” requests and associated
data-bearing replies.
Fixing these “broken” aspects of the network environment becomes a traffic engineering
proposition that takes into account not just the applications themselves but application
programming in general. Knowing how an application operates, its protocol formats and
parameters, and observable run-time behaviors is crucial to understanding how it fits with other
applications, services, and protocols on the network. It’s not just a patchwork proposition that
involves mending individual parts, but instead requires accommodating best practices for
efficient WAN communications: send and receive infrequently, in bulk, and in the form of
complete transactions whenever possible.
TCP was originally designed and engineered to operate reliably over unreliable
transmission media irrespective of transmission rates, inherent delays, data corruption,
duplication, and segment reordering. Because of this, TCP is indeed a robust and reliable
mechanism for delivering applications, services, and protocols. But that same design strength
also exposes an inherent weakness in TCP delivery when it is deployed across modern,
higher-speed media that far exceed the conditions for which TCP was originally designed.
Re-engineering these and other “broken” network protocols occurs in WAN optimization
solutions, usually through some form of proxy. Such a proxy leaves protocol behavior
unfettered on the LAN, where protocols can behave normally. But the same proxy
also translates and typically repackages LAN-oriented transmissions to reduce or eliminate
“chattiness” across WAN links, while also batching up individual transmissions to limit the
number and maximize the payloads for such WAN transmissions as do occur. This approach
maximizes use of available bandwidth when transferring request and reply traffic across WAN
links.
IP blindly sends packets without checking on their arrival; TCP maintains ongoing end-to-end
connections through setup and tear-down phases and requires periodic
acknowledgements for receipt of data. Unacknowledged data triggers an exponential back-off
algorithm that retries transmission until the data is received and acknowledged, or
times out to signal connection failure. The sliding TCP window size, which denotes the amount
of data that can be sent before an acknowledgement must be received, directly influences
performance: larger values yield greater throughput (but also much longer potential
delays). TCP employs a well-defined “slow start” algorithm that initiates communications with a
small window size, then scales the window toward optimal proportions as connections are
established and while they remain active. Each of these and other such procedures in
the TCP/IP stack introduces network delay, which WAN optimization solutions address through
connection optimization techniques and aggressive windowing methods.
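End hosts cannot rewrite TCP’s algorithms, but they can influence window behavior by enlarging socket buffers, which bound the window a stack will advertise. The Python sketch below sizes buffers to a rough bandwidth-delay product; the 10 Mbps and 100 ms figures are illustrative assumptions for a high-latency WAN path.

import socket

# Rough bandwidth-delay product: a 10 Mbps link with a 100 ms round-trip
# time holds 10_000_000 * 0.100 / 8 = 125,000 bytes in flight, so the
# window should be at least that large to keep the pipe full.
BDP_BYTES = int(10_000_000 * 0.100 / 8)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BDP_BYTES)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BDP_BYTES)
# (The OS may round these values; verify with getsockopt after setting.)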
For an outstanding discussion on TCP window size, the slow start algorithm, and other TCP
congestion management techniques, please consult Charles Kozierok’s excellent book The TCP/IP
Guide. This book is available in its entirety online; the section on TCP Reliability and Flow Control
Features and Protocol Modifications includes detailed discussion of TCP window management,
window size adjustment, congestion handling, and congestion avoidance mechanisms.
Figure 3.4: Given the ability to decrypt encrypted data streams, WAN optimization devices can enforce
policy, impose throttling, and even apply various compression and dictionary schemes.
Application Traffic
An inordinate number of applications and application protocols exist that can be controlled and
monitored consistently and cohesively. Each obtains its own priority assignment and poses its own
unique value in the network management equation. Not all applications are created equal,
though many are designed equally badly (or comparatively worse) when it comes to WAN
deployment. The abilities that WAN optimization solutions confer to tame these sometimes
savage beasts remain among their most potent value propositions.
Some vendors offer Web acceleration appliances that optimize only certain types of traffic by
off-loading certain servers. Other products optimize all TCP traffic equally regardless of
differences in their application-layer behaviors. A complete and comprehensive WAN
optimization solution must be able to selectively prioritize traffic, especially in situations where
WAN links are heavily utilized or operating at (or above) their rated capacity.
User identity tracking also facilitates better end-to-end network visibility. It can allow network
engineers and planners to streamline security and prioritize delivery of certain traffic from
certain sources. Identity may be used to block or allow certain types of traffic, or to apply
varying levels of priority to the same kinds of traffic (CEO and customer support email goes
ahead of all other email, for example). In the same vein, a salesperson located in some remote
branch office may be granted higher priority than a marketing staff employee when accessing the
company’s centralized CRM application, because of the perceived difference in importance for
such access (servicing an existing customer in the former case, prospecting for new customers or
expanding on an existing relationship in the latter case).
Caching
Caching is an excellent strategy in any aspect of computing. Router hardware caches MAC
address tables and maintains lists of IP assignments; application proxies cache application layer
data to conserve bandwidth against repeat requests; and WAN optimization technologies cache
sequences of traffic data to avoid duplicate replay of protocol patterns on the network. And the
process can be entirely application-independent for general-purpose usage. Single-purpose
caches work only with specific applications or repeat requests for the same resource irrespective
of all other network traffic (Web-only, email-only, backup-only, ERP and so forth). WAN
optimization devices have a global view of all traffic that passes over the links they manage, so
their caches can handle data for all the applications whose traffic traverses those links (making
them a “links-only” rather than “application-only” type of cache).
Data reduction is an efficient means for WAN application and bandwidth optimization. The trick
is to send as little data as possible or, at the least, never to send the same data more than
once. Acceleration appliances examine data in real time prior to its transmission across the
WAN and store objects and items locally. Any duplicate detected triggers the appropriate
appliance to serve that data locally instead of moving the same data (unnecessarily) across a WAN
link.
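In spirit, the mechanism reduces to a few lines: fingerprint each outbound block and, when the far-end appliance is known to hold that fingerprint already, send the short reference instead of the data. The Python sketch below illustrates the idea only; real products use far more sophisticated fingerprinting, chunking, and synchronization schemes.

import hashlib

seen = set()  # fingerprints the far-end appliance is known to hold

def reduce_block(block: bytes):
    """Return ('ref', digest) or ('data', block) for WAN transmission."""
    digest = hashlib.sha256(block).digest()
    if digest in seen:
        return ("ref", digest)   # 32-byte reference replaces the block
    seen.add(digest)
    return ("data", block)       # first sighting: send the data itself

# Duplicate payloads collapse to references on every send after the first.
first = reduce_block(b"quarterly-report-contents" * 100)  # ('data', ...)
again = reduce_block(b"quarterly-report-contents" * 100)  # ('ref', ...)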
Wherever data is stored by intermediary devices, it should also be handled in a secure, policy-
driven manner. Caching copies of repeatedly issued data across a network is a great strategy for
network performance but a potential hazard for application security along the data path. Any
information obtained from such a cache must therefore be secured so that it’s not only accurate
and timely upon delivery to the requesting source but also safe from unwarranted inspection or
alteration by unauthorized third parties.
Ideally, a cache should also be free of deployment constraints. Transparency plays a crucial role
in the peaceful coexistence of intermediary device and end-user, so having to expose caching
servers and services through end-user configurations can be a labor-intensive hands-on process.
Zero-configuration is the objective in many of today’s application platforms, and this area is no
exception. Any and all network optimizations should be readily accessible and completely
transparent to the end-user.
Bandwidth Control
Bandwidth control and bandwidth management are two names for the same thing: the
process of measuring and controlling packet-based network communications to avoid overusing
capacity, which results in network congestion and poor performance. The channel capacity of
partitioned, multiple-user internetwork links is administratively limited; once this threshold is
reached, performance degrades in a highly noticeable way as the network congests.
Controlling and managing network traffic reduces capacity use to maintain smooth, continuous
service between endpoints. The art and science of controlling and managing traffic is a deeply
faceted practice of its own, with myriad solutions at virtually every layer of the network protocol
stack: ISPs typically retain control over queue management and QoS for subscribers, window
shaping reduces traffic flows in high-end enterprise products, and other such solutions
increase the usable capacity of network links and resources.
The most widely used WAN protocols today include Integrated Services Digital Network
(ISDN), Frame Relay, Multi-Protocol Label Switching (MPLS), Asynchronous Transfer Mode
(ATM), and Point-to-Point Protocol (PPP) over Synchronous Optical Network (SONET).
Harmonizing and orchestrating optimal performance among this heterogeneous mix requires
handling a series of deeply complex tasks.
Figure 3.5: A key benefit of the “Managed WAN Cloud” is its ability to accommodate different kinds of WAN
links for ingress and egress.
Point-to-Point Links
An established, individual communications path from subscriber to provider is referred to as a
point-to-point link. In this arrangement, a carrier network (such as a local telephone company)
provides a direct connection via leased lines (that may include copper wiring and other necessary
hardware such as CSU/DSU units) to the customer’s premises. Accordingly, both endpoints of the
link generally rely on the same service provider’s network arrangements.
Circuits are normally priced according to bandwidth requirements and the distance between the
two connection points. Point-to-point links are typically priced higher than Frame Relay links but
also provide permanently established, exclusive connectivity between provider and subscriber
regardless of the extent to which allocated bandwidth may be utilized. Another common term for
such a link is leased line (which refers to the ongoing reservation of the connection between the
two endpoints).
Circuit Switching
Using circuit-switching communications, data paths are formed as needed and terminated when
such use ceases. This setup operates much like a typical telephone network in that
“conversations” are arbitrarily created and terminated, existing only for the duration of the “call”
(which is actually an active data connection between at least two parties).
ISDN is a primary example of this kind of technology: a switched circuit is initiated whenever a
router possesses data for a remote site, which essentially places a direct-dial call into the remote
site’s circuit. Once the two parties are authenticated and connected, they begin the transfer of
data from source to destination. Upon completion, the call terminates.
Packet Switching
WAN packet-switching technology uses a shared carrier infrastructure unlike the private, one-on-
one pairings used in a circuit-switched network arrangement. This scenario enables the carrier to
make more efficient use of its infrastructure, often resulting in lower subscriber costs for similar
levels of service. In a packet-switched environment, a shared WAN medium is distributed and
utilized among a broad subscriber base that creates virtual connections between sites for packet
delivery.
Such a topology is called a cloud and includes protocols such as Asynchronous Transfer Mode
(ATM), Frame Relay, Switched Multimegabit Data Services (SMDS), and—less commonly in
the US—X.25. Packet-switched connectivity is ideal for organizations whose WAN traffic is
“bursty” or variable in nature and does not require strictly dedicated bandwidth or always-on
WAN links.
WAN Devices
A typical WAN comprises numerous networking devices, most of which are not unique to the
WAN environment itself. Modems, switches, and servers are non-specific, general-purpose
elements in every business computing landscape. These devices bridge network connectivity
among LAN and WAN segments, where each type provides different advantages and benefits,
along with individually applicable disadvantages and drawbacks. Let’s examine each
representative category in turn.
WAN Switches
Typical LAN-based Ethernet switches are multiport networking devices used in localized
environments. Similarly, WAN switches perform identical functions for distributed networking
contexts. They operate at the data-link layer (OSI Layer 2) and switch traffic for technologies
such as Frame Relay and SMDS.
Access Servers
Central dial-in/dial-out gateways for dial-up connections are called access servers. These devices
provide LAN and WAN networking equipment access to asynchronous devices. Network access
servers function as control points for roaming and remote users so that they may access internal
resources (or connect to an ISP) from external locations.
Analog Modems
An analog modem translates between analog and digital signaling. This enables data-bearing
communications to transmit via voice-based telephony. Digital signals are converted into an
analog format suitable for transmission through analog carriers and then restored to digital
format on the receiving end.
See Chapter 2 for more information about data substitution, caching, and compression.
Actual data reduction implementations and methods vary widely among vendors and product
platforms. For the purposes of this chapter, it suffices to distinguish between data caching
and data reduction approaches (data compression is a separate technique, independent of
both).
• Application breadth—Data reduction solutions operate at the network layer of the TCP/IP
network stack to support any transport protocol including UDP. Solutions that
specifically target TCP flows are designed to footprint and store bulk TCP application
data (such as file transfers and email messages). Support for UDP streams expands the
breadth of supported applications (including VoIP as used for IP Telephony and related
services, and the Real Time Streaming Protocol—RTSP, as used for streaming media
playback over the Internet, primarily for entertainment videos).
• Data protection—Data reduction solutions take protective measures, usually involving
encryption mechanisms, to safeguard end-user information. Compression
and reduction strategies work well on repetitive data elements, but effective encryption
randomizes such data and renders those strategies ineffective. SSL acceleration
originating and terminating on the WAN optimizer expedites overall traffic by permitting
optimization mechanisms to operate even within encrypted (therefore unintelligible)
transmission streams (essentially, this involves sharing keys or certificates, decrypting
data streams in the device to seek out repetition, applying data reduction and caching
techniques, then re-encrypting the reduced output for transmission across the WAN. The
benefits of WAN optimization usually outweigh the associated overhead involved,
making this approach entirely cost effective).
• Granular matching—Each solution also differs in how it seeks matching data patterns
both in the granularity of the search employed and the resulting long-term database
fingerprints. Some solutions work well for duplicate data strings or streams sent in rapid
succession but may be ineffective when working with derived data or duplicates sent
after older data ages out of the cache.
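Granularity is largely a question of how data is carved into chunks before fingerprinting. One approach used to cope with shifted or derived data is content-defined chunking: cut wherever a rolling checksum over the last few bytes matches a chosen pattern, so an insertion early in a stream moves only nearby chunk boundaries rather than all of them. The Python sketch below uses a deliberately naive rolling sum; the window and divisor values are illustrative, and no particular vendor’s method is implied.

WINDOW = 16      # bytes in the rolling window (illustrative)
DIVISOR = 256    # average chunk size target, in bytes (illustrative)

def chunk(data: bytes):
    """Content-defined chunking with a naive rolling sum (sketch only)."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # slide the window forward
        # Cut where the window's checksum matches the chosen pattern,
        # provided the chunk has reached a minimum size.
        if i - start + 1 >= WINDOW and rolling % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

# Identical regions of two similar payloads yield identical chunks, so
# their fingerprints still match even when surrounding bytes change.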
Finally, data compression seeks to reduce traffic traversing the WAN topology. Simple
algorithms identify repetitive byte sequences within a single packet, whereas more sophisticated
implementations go beyond the packet level to match packet sequences and entire protocol
streams. Header compression provides further bandwidth gains through specialized algorithms
designed for protocol-specific properties. Payload compression algorithms identify relatively
short byte-pattern sequences in data-bearing protocols that recur over a measured duration,
which are replaced with shorter references. Compression across various flows of traffic is called
crossflow compression and works even on UDP-based traffic.
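The difference between packet-level and stream-level compression is easy to demonstrate with Python’s standard zlib module: reusing a single compressor across many packets lets later packets refer back to byte patterns seen in earlier ones, much as crossflow techniques do. The repetitive HTTP-style payloads below are hypothetical.

import zlib

packets = [b"GET /catalog/item?id=%d HTTP/1.1\r\nHost: example.com\r\n\r\n" % i
           for i in range(100)]

# Per-packet: each packet is compressed in isolation.
per_packet = sum(len(zlib.compress(p)) for p in packets)

# Cross-stream: one compressor retains history across packets, so
# repeated byte patterns in later packets shrink to short references.
comp = zlib.compressobj()
cross_stream = sum(len(comp.compress(p)) +
                   len(comp.flush(zlib.Z_SYNC_FLUSH)) for p in packets)

print(per_packet, cross_stream)   # cross-stream is markedly smaller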
In each of these strategies, a centralized analysis and control point is required to monitor and
modify entire network transactions through individual conversations. The proxy appliance or
proxy server fills this role dutifully and proves a highly effective tool for enhancing
performance and security capabilities for given client-server needs.
Figure 4.1: What WAN optimization devices do for WAN links, client software does for individual remote
access.
Security is an issue primarily in two situations: data at rest and data in motion. Data at rest is any
information stored on the WAN accelerator and must therefore comply with any applicable
organizational and federal regulations governing the storage of private and confidential data. At
the same time, data in motion—anything sent across the wire—must also be securely encrypted
where applicable. For these reasons, encryption must necessarily occur for drive partitions and
data pathways whenever sensitive information is involved. This data security must also be
backed up by proper access control and user authentication systems.
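For data at rest, that typically means encrypting cached content before it ever touches disk. As one possible sketch, the fragment below uses the symmetric Fernet recipe from the third-party Python cryptography package; key generation is shown inline only for brevity, whereas production keys belong in a managed key store or HSM.

from cryptography.fernet import Fernet

key = Fernet.generate_key()    # in practice, fetched from a key store
cipher = Fernet(key)

cached_segment = b"sensitive customer record fetched across the WAN"
on_disk = cipher.encrypt(cached_segment)   # ciphertext safe to persist
restored = cipher.decrypt(on_disk)         # decrypted only when served
assert restored == cached_segment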
Figure 4.2: Moving symbol dictionary references instead of the data referenced thereby can achieve data
reductions of three orders of magnitude or better.
Packet striping overcomes per-stream bandwidth quotas placed on TCP streams and
enforced by firewalls or routers; the intent of such quotas is to prevent overutilization of available
bandwidth. Packet striping divides the aggregate throughput for any given data stream among
multiple flows. In this way, multiple smaller streams can still play by the rules without a large
payload being subject to checkpoint restrictions in passage. A single, bulky 100 Mbps stream
facing a router restriction of 10 Mbps per flow easily divides into 10 separate streams for optimal
delivery.
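Striping itself amounts to round-robin distribution of a stream’s segments across several flows, with sequence numbers attached so the receiver can reassemble them in order. A toy Python sketch follows; the flow count and segment size are arbitrary.

def stripe(data: bytes, flows: int = 10, seg: int = 1460):
    """Split one bulky stream into `flows` lists of (seq, segment)."""
    lanes = [[] for _ in range(flows)]
    for seq in range(0, len(data), seg):
        lanes[(seq // seg) % flows].append((seq, data[seq:seq + seg]))
    return lanes

def reassemble(lanes):
    # Merge all lanes and restore original order by sequence number.
    segments = sorted(s for lane in lanes for s in lane)
    return b"".join(chunk for _, chunk in segments)

payload = b"x" * 1_000_000
assert reassemble(stripe(payload)) == payload

Dividing a stream among 10 lanes mirrors the 100 Mbps-into-10-flows example above.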
Though beneficial, these per-flow restrictions may unintentionally inhibit important
traffic (such as scheduled online backups) with equal prejudice alongside several competing but
less significant flows (for example, client HTTP traffic, Simple Network Management
Protocol (SNMP) interactions, and routine network status checks). Striping breaks high-
bandwidth traffic into several discrete flows for optimal transmission and later reassembly at the
receiving end.

It may be prudent to synthesize your approximate WAN conditions using a
simulator or emulator as part of your WAN optimization evaluation process. Switching or
upgrading existing network infrastructure to MPLS and VPN technology necessitates this
discovery process and benefits greatly from its results. Good WAN emulators effectively
reproduce real-world conditions specific to your network’s properties, including effective
bandwidth, inherent latency, and out-of-order packet delivery.
Data pre-fetching builds a cache repository based on read and requested file segments for maximal
efficiency. Read requests are served from partially cached files if requested elements are present.
Data read-ahead takes a predictive approach to accelerating WAN traffic by preemptively requesting
file data ahead of the current cached portion to increase cache hits and performance.
Data write-behind techniques accelerate file transfers by deferring write requests until sufficient data
accumulates to warrant issuing an all-at-once write.
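Write-behind behavior is easy to picture in code: buffer small writes locally and issue one bulk WAN write once a threshold is crossed. The Python sketch below is illustrative only; the 64 KB threshold is an assumption, not a product default.

class WriteBehindBuffer:
    """Defer small writes; issue one bulk write at a threshold (sketch)."""
    def __init__(self, send_fn, threshold=64 * 1024):
        self.send_fn = send_fn      # callable that performs the WAN write
        self.threshold = threshold
        self.pending = bytearray()

    def write(self, data: bytes):
        self.pending += data
        if len(self.pending) >= self.threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.send_fn(bytes(self.pending))  # one all-at-once WAN write
            self.pending.clear()

# Hundreds of 1 KB writes become a handful of 64 KB WAN transmissions.
buf = WriteBehindBuffer(send_fn=lambda chunk: None)
for _ in range(300):
    buf.write(b"k" * 1024)
buf.flush()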
For example, previous MAPI issues associated with Outlook 2000 are addressed in Microsoft
Exchange 2003, which includes a new cached mode of operation to improve WAN performance.
Although CIFS is designed for remote file-sharing access across the Internet and other IP-based
networks, it’s a fairly chatty protocol, issuing hundreds to thousands of round-trip packets for a
single file transfer (see Figure 4.3). CIFS performance is strictly LAN-bound, and its chatty
nature directly impinges on WAN performance. Across the WAN, file shares accessed from a
centralized data center undergo bandwidth and latency constraints that negatively impact
performance.
Figure 4.3: Replacing NFS or CIFS with WAN-capable file services and caching enables greatly improved
communication efficiencies.