A Practical Guide To (Correctly) Troubleshooting With Traceroute

This document provides an overview of traceroute and how to correctly interpret traceroute results. It discusses how traceroute works, including how it uses TTL values and ICMP messages to trace the path to a destination. It also covers important topics like understanding latency, asymmetric routing paths, load balancing, and interpreting DNS names and locations in traceroute output to gain useful insights about the network topology and troubleshoot issues. The document emphasizes that simply looking at where packets time out or latency increases is often not enough, and that skill is required to correctly analyze complex traceroute results.

A Practical Guide to (Correctly) Troubleshooting with Traceroute

Richard A Steenbergen <[email protected]>

nLayer Communications, Inc.

Introduction
Troubleshooting problems on the Internet?
    The number one go-to tool is traceroute
        Every OS comes with a traceroute tool of some kind.
        There are thousands of websites which can run a traceroute.
        There are dozens of visual traceroute tools available, both commercially and free.

And it seems like such a simple tool to use
    Type in an IP address and it shows you a list of router hops.
    And where the traceroute stops, drops packets, or where the latency goes up a lot, that's where the problem is, right?
    How could this possibly go wrong?

Unfortunately, it almost never works out this way.


By Richard Steenbergen, nLayer Communications, Inc. 2

Problem Statement
So what's wrong with traceroute?
    Most modern commercial networks are actually well run
        Simple issues like congestion or routing loops are becoming a smaller percentage of the total issues encountered.
        More commonly, issues are complex enough that a naive traceroute interpretation is utterly useless.

Few people are actually skilled at interpreting traceroute
    Most ISP NOCs and even most mid-level engineering staff are not able to correctly interpret a complex traceroute.
    Leads to a significant number of misdiagnosed issues, false reports, etc., which flood the NOCs of networks world-wide.
    False report rate is so high that it is almost impossible to report a real traceroute-based issue through all the noise.

Traceroute Topics
Topics to discuss
    How traceroute works
    Interpreting DNS in traceroute
    Understanding network latency
    ICMP prioritization and rate-limiting
    Asymmetric forwarding paths
    Load balancing across multiple paths
    Traceroute and MPLS

Random Traceroute Factoid
    The default starting destination probe port in the UNIX traceroute implementation is 33434. This comes from 32768 (2^15) + 666 (the mark of Satan). Coincidence?

How Traceroute Works

Traceroute - The 10,000 Ft Overview

1. Launch a probe packet towards DST, with a TTL of 1.
2. Every router hop decrements the IP TTL of the packet by 1.
3. When the TTL hits 0, the packet is dropped and the router sends an ICMP TTL Exceed packet to SRC, with the original probe packet as payload.
4. SRC receives this ICMP message and displays a traceroute hop.
5. Repeat from step 1, with the TTL incremented by 1 each time, until...
6. The DST host receives the probe and returns ICMP Dest Unreachable.
7. SRC stops the traceroute upon receipt of the ICMP Dest Unreachable.

[Diagram: SRC -> Router 1 -> Router 2 -> Router 3 -> Router 4 -> DST. Probes with TTL=1 through TTL=4 expire at Routers 1 through 4, each of which returns an ICMP TTL Exceed to SRC. The TTL=5 probe reaches DST, which returns an ICMP Dest Unreach.]
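The seven steps above can be sketched as a small simulation. This is only an illustration, not a real traceroute: no packets are sent, the path is a hypothetical list of node names, and the names `IcmpReply`, `send_probe`, and `traceroute` are made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IcmpReply:
    kind: str    # "ttl_exceed" or "dest_unreach"
    source: str  # node that generated the ICMP message


def send_probe(path: List[str], ttl: int) -> Optional[IcmpReply]:
    """Simulate one probe; `path` lists the routers in order, ending with DST."""
    for index, node in enumerate(path):
        if index == len(path) - 1:
            # The probe reached DST, which answers with Dest Unreachable.
            return IcmpReply("dest_unreach", node)
        ttl -= 1  # every router hop decrements the TTL by 1
        if ttl == 0:
            # TTL hit 0: the router drops the probe and reports back to SRC.
            return IcmpReply("ttl_exceed", node)
    return None  # empty path: nothing answers


def traceroute(path: List[str], max_ttl: int = 30) -> List[Tuple[int, str]]:
    """Repeat the probe with TTL 1, 2, 3, ... until DST answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = send_probe(path, ttl)
        hops.append((ttl, reply.source if reply else "*"))
        if reply is not None and reply.kind == "dest_unreach":
            break
    return hops


print(traceroute(["Router 1", "Router 2", "Router 3", "Router 4", "DST"]))
```

Each TTL value yields exactly one displayed hop, which is why a five-node path needs five rounds of probes.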

Traceroute Implementation Details


Traceroute can use many protocols for probe packets
    Classic UNIX traceroute uses UDP probes
        With a starting destination port of 33434, incrementing once per probe.
        Cannot detect the end of the traceroute if the DST does not return an ICMP Dest Unreachable.
        This can happen as the result of firewalls, configuration settings, or a real application listening on the dest port.
    Other implementations use ICMP Echo Request probes
        Windows tracert.exe and MTR are the two biggest examples.
        These also cannot detect the end of the traceroute if the DST does not return an ICMP Echo Reply.
        This may or may not be more frequently firewalled than the UDP->ICMP Dest Unreachable response.
    Many modern traceroute implementations can do all
        Configurable UDP, TCP, or ICMP probe packets via CLI flags.
        TCP is a poor choice for general use (frequently filtered), typically only seen as a method to work around specific firewalls.

Traceroute Implementation Details


Most implementations send multiple probes per router hop.
    The default for classic traceroute is 3 probes per hop.
        Giving the 3 latency results, or 3 *s if there is no response.
    One specific implementation (MTR) sends an endless loop of probes.

Each probe has a unique code embedded in it
    So the original traceroute implementation can map the responses.
    UDP/TCP use incrementing layer 4 ports, ICMP uses the seq #.

Layer 4 hashing can send each probe down a different path
    This may or may not be visible to traceroute
        Yes in the case of ECMP (Layer 3 Equal-Cost Multi-Path) load-balancing.
        No in the case of LAG (Layer 2 802.3ad/Port-channel) load-balancing.
    But the result is the same, each probe can behave in different ways, leading to different results for the same TTL hop.
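A rough sketch of how an incrementing UDP destination port can double as the per-probe identifier. It assumes the port starts at 33434 for the first probe of TTL 1 and increments once per probe with 3 probes per hop; the function names are hypothetical, and real implementations may number probes differently.

```python
BASE_PORT = 33434    # classic UNIX traceroute starting destination port
PROBES_PER_HOP = 3   # classic default number of probes per hop


def probe_port(ttl: int, probe_index: int) -> int:
    """Destination port for probe `probe_index` (0-based) sent with `ttl`."""
    sequence = (ttl - 1) * PROBES_PER_HOP + probe_index
    return BASE_PORT + sequence


def identify_probe(dest_port: int) -> tuple:
    """Map the port quoted in a returned ICMP message back to (ttl, probe_index)."""
    sequence = dest_port - BASE_PORT
    if sequence < 0:
        raise ValueError("port is below the traceroute base port")
    ttl_minus_one, probe_index = divmod(sequence, PROBES_PER_HOP)
    return ttl_minus_one + 1, probe_index


print(probe_port(2, 1), identify_probe(probe_port(2, 1)))
```

Because the port changes on every probe, each probe also looks like a distinct layer 4 flow, which is what exposes traceroute to per-flow load balancing.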

Traceroute Latency Calculation


How is traceroute latency calculated?
    Timestamp when the probe packet is launched.
    Timestamp when the ICMP response is received.
    Calculate the difference to determine round-trip time.
    Routers along the path do not do any time processing
        They simply reflect the original packet's data back to the SRC.
        Many implementations encode the original launch timestamp into the probe packet, to increase accuracy and reduce state.

Most importantly: only the ROUND TRIP is measured
    Traceroute is showing you the hops on the forward path.
    But showing you latency based on the forward PLUS reverse path.
    Any delays on the reverse path will affect your results!
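A minimal sketch of the timestamp-in-the-probe idea. It assumes the router quotes enough of the original probe in its ICMP response for the payload to come back (not every router does), and the 8-byte payload layout and function names are made up for illustration.

```python
import struct
import time


def build_payload(launch_time: float) -> bytes:
    """Encode the launch timestamp (seconds) into the probe payload."""
    return struct.pack("!d", launch_time)


def rtt_ms(quoted_payload: bytes, receive_time: float) -> float:
    """Round-trip time from the timestamp reflected back in the ICMP response."""
    (launch_time,) = struct.unpack("!d", quoted_payload[:8])
    return (receive_time - launch_time) * 1000.0


# The sender keeps no per-probe state: the timestamp travels with the probe.
payload = build_payload(time.time())
print(f"{rtt_ms(payload, time.time()):.3f} ms")
```

Note that the only quantity available is receive time minus launch time, so the result always covers the forward path plus the reverse path.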

Traceroute - What Hops Are You Seeing?

[Diagram: SRC -> Router 1 -> Router 2. Labels: TTL=1, Router 1 ingress interface 172.16.2.1/30, ICMP TTL Exceed sent from ICMP return interface 192.168.2.1/30. TTL=2, Router 1 egress interface 10.3.2.1/30, Router 2 ingress interface 10.3.2.2/30, ICMP TTL Exceed sent from ICMP return interface 192.168.3.1/30.]

Traceroute packet with TTL of 1 enters router via the ingress interface.
Router decrements TTL to 0, drops packet, generates ICMP TTL Exceed
    ICMP packet dst address is set to the original traceroute probe source (SRC).
    ICMP packet src address is set to the IP of the ingress router interface.
    Traceroute shows a result based on the src address of the ICMP packet.
The above traceroute will read: 172.16.2.1 10.3.2.2
You have NO visibility into the return path or the egress interface used.

Random factoid: This behavior is actually non-standard. RFC1812 says the ICMP source MUST be from the egress iface. If obeyed, this would completely change traceroute results.

Interpreting DNS in a Traceroute

Interpreting DNS is one of the most important aspects of correctly using traceroute.
Information you can uncover includes:
    Physical Router Locations
    Interface Types and Capacities
    Router Type and Roles
    Network Boundaries and Relationships
Deductions made from this information can be absolutely essential to troubleshooting.

Interpreting Traceroute - Location


Why do you need to know geographical locations?
    To identify incorrect/suboptimal routing.
        Going from Atlanta to Miami via New York? Probably not good.
    To know when high latency is justified and when it isn't.
        100ms across an ocean is normal, 100ms across town isn't.
    To help you understand network interconnection points.

The most commonly used location identifiers are:
    IATA Airport Codes
    CLLI Codes
    Non-standard abbreviations based on a city name.
    Of course sometimes you just have to take a guess.

Location Identifiers - IATA Airport Codes

Good international coverage of most large cities.
    Typically seen in networks with a small number of large POPs, or heavy focus in well-developed areas.
    Examples:
        Dallas Texas = DFW
        San Jose California = SJC

Sometimes represented by pseudo-airport codes
    Especially where multiple airports serve a region
    Or where the airport code is non-intuitive for the city name
    New York, NY is served by JFK, LGA, and EWR airports.
        But may be represented by simply NYC.
    Northern VA is served by IAD, Washington DC by DCA.
        But both may be written as WDC.

Location Identifiers - CLLI Codes

Common Language Location Identifier
    Full codes maintained (and sold) by Telcordia.
    Most commonly used by Telephone Companies
        Example: HSTNTXMOCG0
    For non-telco uses, only city/state codes are important
        Examples:
            HSTNTX = Houston Texas
            ASBNVA = Ashburn Virginia

Well defined standard covering almost all NA cities
    Commonly seen in networks with a larger number of POPs.
    Not an actual standard outside of North America
        Some providers fudge these, e.g. AMSTNL = Amsterdam NL

Location Identifiers - Arbitrary Values

And then sometimes people just make stuff up
    Chicago IL
        Airport Code: ORD (O'Hare) or MDW (Midway)
        CLLI Code: CHCGIL
        Example Arbitrary Code: CHI
    Toronto ON
        Airport Code: YYZ (Pearson) or YTZ (City Center)
        CLLI Code: TOROON
        Example Arbitrary Code: TOR
    Frequently based on the good intentions of making things readable in plain English, even though these may not follow any standards.

Common Locations - US Major IP Cities

Location Name     Airport Codes   CLLI Code   Other Codes
Ashburn VA        IAD             ASBNVA      WDC, DCA, ASH
Atlanta GA        ATL             ATLNGA
Chicago IL        ORD, MDW        CHCGIL      CHI
Dallas TX         DFW             DLLSTX      DAL
Houston TX        IAH             HSTNTX      HOU
Los Angeles CA    LAX             LSANCA      LA
Miami FL          MIA             MIAMFL
Newark NJ         EWR             NWRKNJ      NEW, NWK
New York NY       JFK, LGA        NYCMNY      NYC, NYM
San Jose CA       SJC             SNJSCA      SJO, SV, SF
Palo Alto CA      PAO             PLALCA      PAIX, PA
Seattle WA        SEA             STTLWA

Common Locations - Non-US Major Cities

Location Name     Airport Codes   CLLI Code (*)   Other Codes
Amsterdam NL      AMS             AMSTNL
Frankfurt GE      FRA             FRNKGE
Hong Kong HK      HKG             NEWTHK
London UK         LHR             LONDEN          LON
Madrid SP         MAD             MDRDSP          MDR
Montreal CA       YUL             MTRLPQ          MTL
Paris FR          CDG             PARSFR          PAR
Singapore SG      SIN             SNGPSI
Seoul KR          GMP, ICN        SEOLKO          SEL
Sydney AU         SYD             SYDNAU
Tokyo JP          NRT             TOKYJP          TYO, TKO, TOK
Toronto CA        YYZ, YTZ        TOROON          TOR
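One way to apply these location codes mechanically is a small lookup over the labels of a router hostname. The sketch below is only a starting point: the table holds a hand-picked subset of the codes listed above, short ambiguous codes (LA, PA, SF, NEW) are deliberately left out to limit false matches, and `guess_location` is a hypothetical helper name.

```python
import re
from typing import Optional

# Subset of the airport, CLLI, and arbitrary codes from the tables above.
LOCATION_CODES = {
    "iad": "Ashburn VA", "asbnva": "Ashburn VA", "ash": "Ashburn VA",
    "ord": "Chicago IL", "chcgil": "Chicago IL", "chi": "Chicago IL",
    "dfw": "Dallas TX", "dllstx": "Dallas TX",
    "nyc": "New York NY", "nycmny": "New York NY",
    "sjc": "San Jose CA", "snjsca": "San Jose CA",
    "sea": "Seattle WA", "sttlwa": "Seattle WA",
    "lhr": "London UK", "lon": "London UK",
    "ams": "Amsterdam NL", "amstnl": "Amsterdam NL",
    "tor": "Toronto CA", "toroon": "Toronto CA",
}


def guess_location(hostname: str) -> Optional[str]:
    """Return the first location whose code appears as a hostname token."""
    for token in re.split(r"[.\-]", hostname.lower()):
        code = token.rstrip("0123456789")  # "lhr1" -> "lhr", "asbnva01" -> "asbnva"
        if code in LOCATION_CODES:
            return LOCATION_CODES[code]
    return None  # no known code: time to take a guess by hand


print(guess_location("xe-0-0-0.cr1.lhr1.uk.nlayer.net"))
```

Matching whole tokens rather than substrings avoids hits such as "atl" inside "atlas", but any automated guess should still be checked against the latency.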

Interpreting DNS - Interface Types

Most networks will try to put interface info into DNS
    Often to help them troubleshoot their own networks.
    Though this may not always be up to date.
        Many large networks use automatically generated DNS.
        Others can be surprisingly sloppy.

Can potentially help you identify the type of interface
    As well as capacity, and maybe even the make/model of router.
    Examples:
        xe-11-1-0.edge1.NewYork1.Level3.net
            XE-#/#/# is Juniper 10GE port. The device has at least 12 slots.
            It's at least a 40G/slot router since it has a 10GE PIC in slot 1.
            It must be Juniper MX960, no other device could fit this profile.

Common Interface Naming Conventions


Interface Type        Cisco IOS              Cisco IOS XR     Juniper
Fast Ethernet         Fa#/#                                   fe-#/#/#
Gigabit Ethernet      Gi#/#                  Gi#/#/#/#        ge-#/#/#
10 Gigabit Ethernet   Te#/#                  Te#/#/#/#        xe-#/#/# (*)
SONET                 Pos#/#                 POS#/#/#/#       so-#/#/#
T1                    Se#/#                                   t1-#/#/#
T3                                                            t3-#/#/#
Ethernet Bundle       Po# / Port-channel#    BE####           ae#
SONET Bundle          PosCh#                 BS####           as#
Tunnel                Tu#                    TT# or TI#       ip-#/#/# or gr-#/#/#
ATM                   ATM#/#                 AT#/#/#/#        at-#/#/#
Vlan                  Vl###                  Gi#/#/#/#.###    ge-#-#-#.###

(*) Some early Juniper 10GE interfaces on some platforms are named GE
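The naming conventions above lend themselves to a simple prefix match on the first label of a hostname. The following is a sketch under stated assumptions: it covers only some of the prefixes from the table, it assumes the interface name is the first DNS label, and `interface_type` is a hypothetical helper name. Networks with their own schemes will simply return no match.

```python
import re
from typing import Optional

# Subset of the interface prefixes from the table above.
PREFIX_TYPES = {
    "fa": "Fast Ethernet", "fe": "Fast Ethernet",
    "gi": "Gigabit Ethernet", "ge": "Gigabit Ethernet",
    "te": "10 Gigabit Ethernet", "xe": "10 Gigabit Ethernet",
    "pos": "SONET", "so": "SONET",
    "t1": "T1", "t3": "T3",
    "po": "Ethernet Bundle", "be": "Ethernet Bundle", "ae": "Ethernet Bundle",
    "tu": "Tunnel", "gr": "Tunnel",
    "atm": "ATM", "at": "ATM",
    "vl": "Vlan",
}

# A prefix only counts if it is followed by a digit, "-", "/", or the end of
# the label, so "po2" matches "po" but "pos1" does not.
_PREFIX_RE = re.compile(
    r"^(" + "|".join(sorted(PREFIX_TYPES, key=len, reverse=True)) + r")(?=[\d/-]|$)",
    re.IGNORECASE,
)


def interface_type(hostname: str) -> Optional[str]:
    """Guess the interface type from the first label of a router hostname."""
    first_label = hostname.split(".", 1)[0]
    match = _PREFIX_RE.match(first_label)
    return PREFIX_TYPES[match.group(1).lower()] if match else None


print(interface_type("xe-11-1-0.edge1.NewYork1.Level3.net"))
```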

Interpreting DNS - Router Types/Roles

Knowing the role of a router can be useful
    But every network is different, and uses different naming conventions.
    And just to be extra confusing, they don't always follow their own naming rules.

Generally speaking, you can guess the context and get a basic understanding of the roles.
    Core routers - CR, Core, GBR, BB, CCR, EBR
    Peering routers - BR, Border, Edge, IR, IGR, Peer
    Customer routers - AR, Aggr, Cust, CAR, HSA, GW

Network Boundaries and Relationships


Identifying Network Boundaries is important
    These tend to be where routing policy changes occur
        E.g. different return paths based on Local Preference.
    These tend to be areas where capacity and routing are the most difficult, and thus likely to be problems.
    It also helps to know who to blame.

Identifying the relationship can be helpful too
    Typically: a) Transit Provider, b) Peer, or c) Customer.
    Many networks will try to indicate demarcs in their DNS
        Examples:
            Clear names like networkname.customer.alter.net
            Or always landing customers on routers named gw

Network Boundaries and Relationships


Sometimes it's easy to spot where the DNS changes:
    4 te1-2-10g.ar3.DCA3.gblx.net (67.17.108.146)
    5 sl-st21-ash-8-0-0.sprintlink.net (144.232.18.65)

Alternatively, look for the other party name in the DNS:
    4 po2-20G.ar5.DCA3.gblx.net (67.16.133.90)
    5 cogent-1.ar5.DCA3.gblx.net (64.212.107.90)

Sometimes there will be no useful DNS info at all:
    2 po2-20G.ar4.DCA3.gblx.net (67.16.133.82)
    3 192.205.34.109 (192.205.34.109)
    4 cr2.wswdc.ip.att.net (12.122.84.46)
    Is hop 3 the GBLX/AT&T border here, or is hop 4?
    Whois says 192.205.34.109 is owned by AT&T.

Network Boundaries and Relationships


For more info, look at the other side of the /30
    4 po2-20G.ar5.DCA3.gblx.net (67.16.133.90)
    5 cogent-1.ar5.DCA3.gblx.net (64.212.107.90)
    > nslookup 64.212.107.89
    = te2-3-10GE.ar5.DCA3.gblx.net
    The multiple ar5.DCA3 hops are a clear indicator that hop 5 is NOT a gblx router, even without the Cogent hint in DNS.
    Common with private peering. One side will provide the /30 but not collect/maintain DNS info from the other side, so the data gets filled in with info from their side rather than left blank.

[Diagram: SRC -> ar5.DCA3 -> Cogent Router. ar5.DCA3: ingress interface Port-channel2, 67.16.133.90/30; egress interface TenGigabitEthernet2/3, 64.212.107.89/30. Cogent Router: ingress interface unknown, 64.212.107.90/30.]
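Finding the other side of the /30 is simple address arithmetic, sketched here with Python's standard `ipaddress` module. The function name `other_side_of_p2p` is hypothetical, and the sketch assumes the hop address really is numbered out of a /30 (or /31) point-to-point subnet, which traceroute output alone cannot confirm.

```python
import ipaddress


def other_side_of_p2p(address: str, prefix_len: int = 30) -> str:
    """Return the peer address on the same point-to-point subnet."""
    interface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    peers = [host for host in interface.network.hosts() if host != interface.ip]
    if len(peers) != 1:
        # The address was the network or broadcast address of the subnet.
        raise ValueError(f"{address} is not a usable host in a /{prefix_len}")
    return str(peers[0])


# Hop 5 from the example above; the result is the address to reverse-resolve.
print(other_side_of_p2p("64.212.107.90"))
```

A reverse DNS lookup on the returned address then shows which network named the far end of the link.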

Understanding Network Latency

Three main types of network induced latency
    Serialization Delay
        The delay caused by having to transmit data through routers/switches in packet sized chunks.
    Queuing Delay
        The time spent in a router's queues waiting for transmission.
        This is mostly related to line contention (full interfaces), since without congestion there is very little need for a measurable queue.
    Propagation Delay
        The time spent in flight, in which the signal is traveling over the transmission medium.
        This is primarily a limitation based on the speed of light, or other electromagnetic propagation delays.

Latency - Serialization Delay

The delay caused by packet-based forwarding
    A packet moves through a network as a discrete unit.
    Can't transmit the next packet until the last one is finished.

Not much of an issue in high-speed networks
    Speeds have increased by orders of magnitude over the years, while packet sizes have stayed the same.
        1500 bytes over a 56k link (56Kbps) = 214.3ms delay
        1500 bytes over a T1 (1.536Mbps) = 7.8ms delay
        1500 bytes over a FastE (100Mbps) = 0.12ms delay
        1500 bytes over a GigE (1Gbps) = 0.012ms delay
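The figures above follow from dividing the packet size in bits by the link rate. A short sketch of that arithmetic, with a hypothetical function name and decimal link rates (56 Kbps taken as 56,000 bits/sec):

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


LINKS = [
    ("56k", 56_000),
    ("T1", 1_536_000),
    ("FastE", 100_000_000),
    ("GigE", 1_000_000_000),
]

for name, bps in LINKS:
    print(f"1500 bytes over {name}: {serialization_delay_ms(1500, bps):.3f} ms")
```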

Latency - Queuing Delay

First we must understand Utilization
    A 1GE port doing 500Mbps is said to be 50% utilized.
        But in reality, an interface can only be transmitting (100% utilized) or not transmitting (0% utilized) at any given instant.
        The above is actually used 50% of the time, over a period of 1 second.

Some queuing is a natural function of networking
    When an interface is in use, the next packet must be queued.
        The odds that an interface will be in use (transmitting) at any given instant depends on how much traffic is being sent across it.
        90% utilization = 90% chance that the packet will have to be queued.
    Transitions between interface speeds also require queuing.
    As an interface reaches saturation, the time spent in queue rises rapidly.
    When an interface is extremely full, a packet may be queued for many hundreds or thousands of milliseconds (depending on the router).
    Thus queuing delays are often associated with congestion (full interfaces).

Latency - Propagation Delay

Delay caused by signal propagation over distance.
    Light travels through a vacuum at around 300,000 km/sec
        But fiber is made of glass, not a vacuum, so it travels slower.
        Fiber cores have a refractive index of 1.48, 1/1.48 = ~0.67c
        Light travels through fiber at around 200,000 km/sec.
    200,000 km/sec = 200 km (or 125 miles) per millisecond.
        Or, 100 km (or 62.5 miles) per 1 ms of round-trip delay.

Example:
    A round-trip around the world at the equator, via a perfectly straight fiber route, would take ~400ms due solely to speed-of-light propagation delays.
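The propagation numbers can be reproduced in a few lines. This sketch uses the exact vacuum speed of light and the 1.48 refractive index quoted above; the equatorial circumference of roughly 40,075 km is an added assumption, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.48      # value quoted on the slide
EQUATOR_KM = 40_075                # assumed circumference of the Earth at the equator


def fiber_speed_km_s(refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """Speed of light inside the fiber core."""
    return SPEED_OF_LIGHT_KM_S / refractive_index


def min_rtt_ms(one_way_km: float) -> float:
    """Best-case round-trip time over a fiber route of the given one-way length."""
    return 2 * one_way_km / fiber_speed_km_s() * 1000.0


print(f"fiber speed: {fiber_speed_km_s():,.0f} km/sec")
print(f"100 km route: {min_rtt_ms(100):.2f} ms RTT")
print(f"around the equator and back: {min_rtt_ms(EQUATOR_KM):.0f} ms RTT")
```

Real fiber routes are never straight, so measured latency sits above this floor; a value far below it usually means the assumed location is wrong.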

Identifying the Latency Affecting You


So, how do you determine if latency is normal?
    Use location identifiers to determine geographical data.
    See if the latency fits with propagation delay.
    For example:
        3 xe-3-0-0.cr1.nyc3.us.nlayer.net (69.22.142.74) 6.570ms
        4 xe-0-0-0.cr1.lhr1.uk.nlayer.net (69.22.142.10) 74.144ms
        New York NY to London UK in 67.6ms? 4200 miles? Normal.

Another example:
    5 cr2.wswdc.ip.att.net (12.122.3.38) [MPLS: Label 17221 Exp 0] 8 msec 8 msec 8 msec
    6 tbr2.wswdc.ip.att.net (12.122.16.102) [MPLS: Label 32760 Exp 0] 8 msec 8 msec 8 msec
    7 ggr3.wswdc.ip.att.net (12.122.80.69) 8 msec 8 msec 8 msec
    8 192.205.34.106 [AS 7018] 228 msec 228 msec 228 msec
    9 te1-4.mpd01.iad01.atlas.cogentco.com (154.54.3.222) [AS 174] 228 msec 228 msec 228 msec
    Washington DC to Washington DC in 220ms? Not normal.
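The two examples above amount to comparing a hop-to-hop latency increase against the distance between the two locations. A hedged sketch of that check follows; it uses the rule of thumb of 1 ms of round-trip delay per 100 km of fiber, and the 20 ms allowance for indirect routing and equipment is an arbitrary assumption, as are the function names.

```python
KM_PER_MILE = 1.609344
RTT_MS_PER_KM = 0.01  # roughly 1 ms of round-trip delay per 100 km of fiber


def expected_min_rtt_ms(distance_miles: float) -> float:
    """Propagation-only round-trip time for the given distance."""
    return distance_miles * KM_PER_MILE * RTT_MS_PER_KM


def latency_verdict(prev_hop_ms: float, hop_ms: float,
                    distance_miles: float, slack_ms: float = 20.0) -> str:
    """Classify a latency increase between two hops as plausible or suspicious."""
    increase = hop_ms - prev_hop_ms
    if increase > expected_min_rtt_ms(distance_miles) + slack_ms:
        return "suspicious"
    return "plausible"


# New York -> London, about 4200 miles of path
print(latency_verdict(6.570, 74.144, 4200))
# Washington DC -> Washington DC, assumed here to be about 10 miles apart
print(latency_verdict(8, 228, 10))
```

A "suspicious" verdict only says propagation cannot explain the increase; whether the cause is congestion, an asymmetric return path, or ICMP handling still has to be worked out from the rest of the trace.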

Prioritization and Rate Limiting

Cosmetic Delays Affecting Traceroute

The latency value measured by traceroute is based on:
    1. The time taken for the probe packet to reach a specific router, plus
    2. The time taken for the router to generate the ICMP TTL Exceed, plus
    3. The time taken for the ICMP TTL Exceed to return to the SRC.

Items #1 and #3 are based on actual network conditions. But item #2 is not.
    It is by definition impossible for item #2 to cause impact to real traffic.
    Only the traceroute probes and responses themselves are affected.
    This results in cosmetic issues which are mistaken for real issues.

Routing To It vs. Through It

Architecture of a modern router:
    Packets forwarded through the router (data-plane)
        Fast Path: hardware based forwarding of ordinary packets
            Examples: Almost every packet in normal Internet traffic.
        Slow Path: software based handling of exception packets
            Examples: IP Options, ICMP generation
            Traceroute happens here
    Packets being forwarded to the router (control-plane)
        Examples: BGP, IGP, SNMP, CLI access (telnet/ssh), ping, or any other packets sent directly to a local IP address on the router.

Router CPUs tend to be relatively underpowered
    A 320-640+ Gbps router may only have a single 600MHz MIPS CPU
    Which is usually busy enough doing things other than traceroute
        ICMP Generation is NOT a priority for the router.
        And in most cases is specifically rate-limited and de-prioritized.

The Infamous BGP Scanner


On some popular router platforms the slow-path data plane and the control-plane share the same resources.
    And often don't have the best software schedulers either.
    As a result, control-plane activity such as BGP churn, CLI use, and periodic software processes can consume enough CPU to slow the generation of ICMP TTL Exceed packets.
    This results in spikes in traceroute reported latency.

The most infamous process is Cisco IOS BGP Scanner
    Runs every 60 seconds on all BGP speaking IOS routers.
    Does periodic removal of routes with invalid next-hops, etc.
    Impact significantly reduced with Next-Hop Tracking feature.

Rate-Limited ICMP Generation

Most routers also rate limit their ICMP generation
    Often with arbitrary values which can't be changed.
    These may be insufficient under heavy traceroute load.
        Especially with more and more users running MTR.

Juniper M/T/MX-series
    Distributed ICMP generation, runs on FPC CPU, doesn't touch RE.
    Hard-coded limit of 50pps per FPC for type 1/2, 250pps on FPC3s.
    FPC3 hard-coded limit bumped to 500pps per FPC in JUNOS 8.3+.

Foundry MLX/XMR
    Hard-coded limit of 400pps per interface.

Force10 E-series
    Hard-coded limit of 200pps or 600pps per interface.

Rate-Limited ICMP Generation

Cisco 6500/7600 Routers (SUP720+)
    Configurable rate-limit for TTL expiring packets
        mls rate-limit all ttl-failure 1000 255
        Affects all ICMP TTL Exceed generations for the entire chassis.
    Centralized processing of ICMP generation
        Runs on MSFC for the entire chassis, along-side control-plane operations.

Cisco GSR
    Hard-coded rate-limit per line-card, ICMP done on LC CPU.

Cisco IOS-XR Platforms (CRS-1)
    Hard-coded rate-limit prior to IOS XR 3.6
    3.6+ rate-limit somewhat configurable via LPTS
        hw-module forwarding mpls ttl-expiry police rate 1000

Spotting The Fake Latency Spikes


The most important rule for troubleshooting latency
    If there is an actual issue, the latency will continue or increase for all future hops afterwards.
    Example (Not a real issue in hop 2):
        1 ae3.cr2.iad1.us.nlayer.net 0.275 ms 0.264 ms 0.137 ms
        2 xe-1-2-0.cr1.ord1.us.nlayer.net 18.271 ms 68.257 ms 18.001 ms
        3 tge2-1.ar1.slc1.us.nlayer.net 53.373 ms 53.213 ms 53.227 ms

Latency spikes in the middle of a traceroute mean absolutely nothing if they do not continue forward.
    At worst it could be the result of an asymmetric path.
    But it is probably an artificial rate-limit or prioritization issue.
    By definition, if the regularly forwarded packets are being affected you should see the issue persist on all future hops.
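The rule above can be turned into a rough filter: any sample that is higher than the best latency seen at a later hop did not carry forward, so it is a candidate cosmetic spike. The sketch below is one possible way to express that, with a hypothetical function name; it cannot tell a rate-limit artifact from an asymmetric return path.

```python
from typing import List, Tuple


def cosmetic_spikes(hops: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (hop_number, rtt_ms) samples that exceed the best RTT of a later hop.

    `hops` holds the RTT samples in ms for each hop, in order.
    """
    flagged = []
    for index, samples in enumerate(hops):
        later_best = [min(later) for later in hops[index + 1:] if later]
        if not later_best:
            continue  # last hop: nothing further along to compare against
        floor_after = min(later_best)
        for rtt in samples:
            if rtt > floor_after:
                flagged.append((index + 1, rtt))
    return flagged


trace = [
    [0.275, 0.264, 0.137],     # hop 1
    [18.271, 68.257, 18.001],  # hop 2
    [53.373, 53.213, 53.227],  # hop 3
]
print(cosmetic_spikes(trace))
```

Samples that are not flagged are not automatically real problems; they are simply consistent with latency that persists down the path.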

Asymmetric Forwarding Paths

Asymmetric Paths

Routing on the Internet has no guarantee of symmetry
    In fact, it is almost always going to be asymmetric.

Traceroute shows you only the forward path
    Even though the latency is based on the round-trip time.

The reverse path is completely invisible to traceroute
    It can be completely different at every hop along the path.
    The only practical solution is to look at both forward and reverse path traceroutes to try and spot reverse path issues.
    And even that won't catch asymmetric paths in the middle.

Asymmetric Paths and Network Boundaries


Asymmetric paths often start at network boundaries
    Why? Because that is where administrative policies change
        te1-1.ar2.DCA3.gblx.net (69.31.31.209) 0.719 ms 0.560 ms 0.428 ms
        te1-2-10g.ar3.DCA3.gblx.net (67.17.108.146) 0.574 ms 0.557 ms 0.576 ms
        sl-st21-ash-8-0-0.sprintlink.net (144.232.18.65) 100.280 ms 100.265 ms 100.282 ms
        144.232.20.149 (144.232.20.149) 102.037 ms 101.876 ms 101.892 ms
        sl-bb20-dc-15-0-0.sprintlink.net (144.232.15.0) 101.888 ms 101.876 ms 101.890 ms

What's wrong in the path above?
    It COULD be congestion between GBLX and Sprint.
    But it could also be an asymmetric reverse path.
        At this GBLX/Sprint boundary, the reverse path policy changes.
        This is often seen in multi-homed networks with multiple paths.
        In the example above, Sprint's reverse route goes via a circuit that is congested, but that circuit is NOT shown in this traceroute.

Multiple Interconnection Points


Asymmetric paths can potentially happen at every router hop.
    Especially where networks connect in multiple locations.
        The forward path of all hops goes via the Washington DC interconnection.
        Hop 1 (purple) returns via the Washington DC interconnection.
        Hop 2 (red) returns via the Chicago interconnection.
        Hop 3 (green) returns via the San Jose interconnection.
    Congestion at the Chicago interconnection would disappear by hop 3.

[Diagram: two networks interconnecting in Chicago IL, San Jose CA, and Washington DC.]

Using Source Address in your Traceroute


How can you test around asymmetric paths?
    Consider the previous example of a problem with Sprint
        You are multi-homed to GX and Level 3.
        Traceroute shows You -> GX -> Sprint and latency starting at Sprint

[Diagram labels: Your Network, Global Crossing, Level 3, Sprint.]

How can you prove the issue isn't between GX and Sprint?
    Run a traceroute using your side of the GX /30 as your src address.
    This /30 comes from your provider (GX)'s larger aggregate block.
    The reverse path will be guaranteed to go Sprint->GBLX
    If the latency doesn't persist, you know the issue is on the reverse.

Using Source Address in your Traceroute


But what if the /30 is numbered out of my space?
    As in the case of a customer or potentially a peer.

You can still see some benefits from setting SRCs
    Consider trying to examine the reverse path of a peer who you have multiple interconnection points with.
        A traceroute sourced from your IP space (such as a loopback) may come back via any of multiple interconnection points.
        But if the remote network carries the /30s of your interconnection in their IGP, setting the traceroute source to that /30 would force the return path to come back via that interconnection.
    Trying both options can give you different viewpoints.

Tracerouting From a Router


Default Source Address
    Most routers default to using the source address of the egress interface that the probe leaves from.
        This may or may not be what you want to see.
    Some platforms can be configured to default to a loopback address rather than the egress interface.
        For example, Juniper using system default-address-selection.

Clock granularity
    Some platforms may be less accurate than others.
        For example, Cisco IOS has a 4ms latency granularity.

Load Balancing Across Multiple Paths

Equal Cost Multi-Path Routing

[Diagram: SRC -> Router A, which splits across two parallel paths, Router B1 -> Router C1 and Router B2 -> Router C2, rejoining at Router D -> DST.]

Flow hashing keeps a single TCP/UDP flow mapped to a single path.
UDP/TCP traceroute probes with incrementing layer 4 ports look like unique flows, which may cause them to go down different parallel paths.
Example:
    6 ldn-bb2-link.telia.net (80.91.251.14) 74.139 ms 74.126 ms
      ldn-bb1-link.telia.net (80.91.249.77) 74.144 ms
    7 hbg-bb1-link.telia.net (80.91.249.11) 89.773 ms
      hbg-bb2-link.telia.net (80.91.250.150) 88.459 ms 88.456 ms
    8 s-bb2-link.telia.net (80.91.249.13) 105.002 ms
      s-bb2-link.telia.net (80.239.147.169) 102.647 ms 102.501 ms
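A toy model of flow hashing shows why incrementing ports scatter probes across parallel paths. The XOR-based hash below is purely illustrative: real routers use vendor-specific hash functions and field selections, and `pick_path`, the addresses, and the path names are all made up for this sketch.

```python
import ipaddress
from typing import List


def pick_path(src_ip: str, dst_ip: str, protocol: int,
              src_port: int, dst_port: int, paths: List[str]) -> str:
    """Choose one of several equal-cost paths from a hash of the 5-tuple."""
    key = (
        int(ipaddress.ip_address(src_ip))
        ^ int(ipaddress.ip_address(dst_ip))
        ^ protocol
        ^ src_port
        ^ dst_port
    )
    return paths[key % len(paths)]


paths = ["B1 -> C1", "B2 -> C2"]

# A single flow (fixed 5-tuple) always hashes to the same path.
flow = ("192.0.2.1", "198.51.100.7", 17, 50000, 33434)
print(pick_path(*flow, paths), "|", pick_path(*flow, paths))

# Traceroute probes increment the destination port, so each probe is a new flow.
for port in range(33434, 33440):
    print(port, pick_path("192.0.2.1", "198.51.100.7", 17, 50000, port, paths))
```

Keeping the 5-tuple constant across probes is the reason a real application flow stays on one path while its traceroute appears to wander.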

Multiple Paths - Examples


A slightly more complex example
    4 p16-1-0-0.r21.asbnva01.us.bb.verio.net (129.250.5.21) 0.571 ms 0.604 ms 0.594 ms
    5 p16-1-2-2.r21.nycmny01.us.bb.verio.net (129.250.4.26) 7.279 ms 7.260 ms
      p16-4-0-0.r00.chcgil06.us.bb.verio.net (129.250.5.102) 25.981 ms
    6 p16-2-0-0.r21.sttlwa01.us.bb.verio.net (129.250.2.180) 71.027 ms
      p16-1-1-3.r20.sttlwa01.us.bb.verio.net (129.250.2.6) 66.730 ms 66.535 ms

ECMP between two somewhat parallel paths
    Ashburn VA -> New York NY -> Seattle WA
    Ashburn VA -> Chicago IL -> Seattle WA

Completely harmless, flow hashing protects against reordering, but the resulting traceroute is potentially confusing.

Multiple Unequal-Length Paths

[Diagram: SRC -> Router A, which splits across two paths of different length, Router B1 -> Router C and Router B2 -> Router X -> Router C, continuing to DST.]

A far more confusing scenario is equal-cost unequal-length paths.
This makes the traceroute appear to jump back and forth between hops.
It can be extremely confusing to end users and very difficult to parse.
An example traceroute would end up looking something like this:
    1   A    A    A
    2   B1   B2   B1
    3   C    X    C
    4   D    C    D
    5   E    D    E

Coping With Multiple Paths


When in doubt, only look at a single path
    Set your traceroute client to only send a single probe.
    But be aware that this may not be the path which your actual traffic forwards over.
    One way to try out different paths manually is to increment the source or destination IP by 1 across multiple complete traceroutes.

MPLS and Traceroute


MPLS ICMP Tunneling


Many large networks operate an MPLS based core
Some devices don't even carry an IP routing table
    This is fine for switching MPLS labeled packets
    But presents a problem when an ICMP is generated
    How does the MPLS-only router deliver an ICMP msg?

One solution is called ICMP Tunneling
    If generating an ICMP about a packet inside an LSP
    Then put the generated ICMP back into the same LSP
    Works for delivering the message, but...
    It can make traceroutes look really WEIRD!

MPLS ICMP Tunneling Diagram


[Diagram, shown twice: SRC -> Router 1 -> Router 2 -> Router 3 -> Router 4 -> DST. Probes with TTL=1 through TTL=4 trigger an ICMP TTL Exceed at Routers 1 through 4; the TTL=5 probe triggers an ICMP Dest Unreach at DST.]

All returned ICMP packets must travel to the end of the LSP before going back to the sender.
This makes every hop in the LSP appear to have the same RTT as the final hop.

MPLS ICMP Tunneling Example


 1. te2-4.ar5.PAO2.gblx.net (69.22.153.209) 1.160 ms 1.060 ms 1.029 ms
 2. 192.205.34.245 (192.205.34.245) 3.984 ms 3.810 ms 3.786 ms
 3. tbr1.sffca.ip.att.net (12.123.12.25) 74.848 ms 74.859 ms 74.936 ms
 4. cr1.sffca.ip.att.net (12.122.19.1) 74.344 ms 74.612 ms 74.072 ms
 5. cr1.cgcil.ip.att.net (12.122.4.122) 74.827 ms 75.061 ms 74.640 ms
 6. cr2.cgcil.ip.att.net (12.122.2.54) 75.279 ms 74.839 ms 75.238 ms
 7. cr1.n54ny.ip.att.net (12.122.1.1) 74.667 ms 74.501 ms 77.266 ms
 8. gbr7.n54ny.ip.att.net (12.122.4.133) 74.443 ms 74.357 ms 75.397 ms
 9. ar3.n54ny.ip.att.net (12.123.0.77) 74.648 ms 74.369 ms 74.415 ms
10. 12.126.0.29 (12.126.0.29) 76.104 ms 76.283 ms 76.174 ms
11. route-server.cbbtier3.att.net (12.0.1.28) 74.360 ms 74.303 ms 74.272 ms
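The flat latency in a trace like this can be reproduced with a small model of ICMP tunneling. This is a simplified sketch, not a description of any vendor's behavior: it assumes a symmetric return path, takes made-up cumulative one-way delays as input, and `observed_rtts` is a hypothetical function name.

```python
from typing import List


def observed_rtts(one_way_ms: List[float], lsp_start: int, lsp_end: int) -> List[float]:
    """RTT reported for each hop when hops lsp_start..lsp_end sit on one LSP.

    one_way_ms[i] is the cumulative one-way delay from SRC to hop i+1.
    Hop numbers are 1-based; lsp_end is the hop where the LSP terminates.
    """
    egress_delay = one_way_ms[lsp_end - 1]
    rtts = []
    for hop, delay in enumerate(one_way_ms, start=1):
        if lsp_start <= hop < lsp_end:
            # Probe travels to the hop, the ICMP rides the LSP on to its end,
            # and only then turns around and heads back to SRC.
            rtts.append(delay + (egress_delay - delay) + egress_delay)
        else:
            rtts.append(2 * delay)
    return rtts


# Hops 3 through 6 are inside the LSP; delays are illustrative values in ms.
print(observed_rtts([1, 2, 10, 20, 30, 37, 38], lsp_start=3, lsp_end=6))
```

In this model every hop inside the LSP reports twice the one-way delay to the LSP's end, which is why a cross-country jump can appear to happen at the first MPLS hop.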

Send questions, complaints, to:

Richard A Steenbergen <[email protected]>
